Commit Graph

18263 Commits

Author SHA1 Message Date
123a187c23 [regression test](schema change) add case for modify partition info (#30338) 2024-01-30 15:30:39 +08:00
57a8c75ddc [regression test](schema change) add case for column type change (#30472) 2024-01-30 15:30:39 +08:00
f7e01ceffa [bug](node) add dependency for set operation node (#30203)
These sinks must complete one by one, in order; e.g. child(1) must wait for child(0) to finish building.
2024-01-30 15:30:39 +08:00
f0a35f6e2d [regression test](schema change) add some case for agg col (#30479) 2024-01-30 15:30:39 +08:00
49d17f2be2 [fix](move-memtable) fix potential duplicate of TabletStream profile (#30397) 2024-01-30 15:30:14 +08:00
6eba030897 [fix](chore) path gc should consider tablet migration (#30095) (#30548)
Background:

Migration creates a new tablet in a different DataDir; the old tablet is moved to TabletManager::_shutdown_tablets.
The migration task does not copy data in stale rowsets to the new tablet, so after migration the new tablet does not contain the stale rowsets of the old tablet.
The path GC process checks every path to determine whether it belongs to a useless tablet or a useless rowset. If it does, the data of that tablet or rowset is removed.
The issue:

When path GC gets a stale rowset path from the data dir of the old tablet, it extracts the tablet id and rowset id.
It then checks whether the tablet id exists in TabletManager, and the answer is YES.
It gets the tablet instance, which is the new tablet, then checks whether the stale rowset id from the old tablet path exists in the new tablet instance, and the answer is NO.
The path GC process treats the rowset as useless, since it cannot find anything holding a reference to it, and deletes the data of this stale rowset.
But some queries may still hold a reference to this stale rowset, so the deletion causes query failures.
Solution:

The lifecycle of all rowsets in a shutdown tablet should be tied to the lifecycle of that tablet.
Path GC needs to differentiate the old tablet from the new one created by the migration task.
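
The failure mode and the fix described above can be sketched as follows. This is a minimal illustration, not the Doris implementation: the `Tablet` and `TabletManager` classes, both `is_garbage_*` functions, and the rule that a path is matched to a tablet by comparing data dirs are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Tablet:
    tablet_id: int
    data_dir: str
    rowset_ids: set = field(default_factory=set)


@dataclass
class TabletManager:
    # Hypothetical stand-ins: `live` maps tablet id to the current tablet,
    # `shutdown` plays the role of TabletManager::_shutdown_tablets.
    live: dict = field(default_factory=dict)
    shutdown: list = field(default_factory=list)


def is_garbage_naive(mgr, data_dir, tablet_id, rowset_id):
    """Looks up the tablet by id only, so it finds the NEW tablet."""
    tablet = mgr.live.get(tablet_id)
    if tablet is None:
        return True
    return rowset_id not in tablet.rowset_ids


def is_garbage_migration_aware(mgr, data_dir, tablet_id, rowset_id):
    """Assumed fix: a path under a shutdown tablet lives as long as that tablet."""
    for old in mgr.shutdown:
        if old.tablet_id == tablet_id and old.data_dir == data_dir:
            return False
    tablet = mgr.live.get(tablet_id)
    if tablet is None or tablet.data_dir != data_dir:
        return True
    return rowset_id not in tablet.rowset_ids


# Tablet 10 migrated from /disk1 to /disk2; the stale rowset stayed behind.
mgr = TabletManager()
mgr.shutdown.append(Tablet(10, "/disk1", {"rs_1", "rs_stale"}))
mgr.live[10] = Tablet(10, "/disk2", {"rs_1"})

naive = is_garbage_naive(mgr, "/disk1", 10, "rs_stale")
aware = is_garbage_migration_aware(mgr, "/disk1", 10, "rs_stale")
```

In this sketch the naive check marks the stale rowset as garbage even though the old tablet still exists, while the migration-aware check keeps it until the shutdown tablet itself is cleaned up.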
2024-01-30 12:03:21 +08:00
cc3c6d1479 [improvement](create tablet) backend create tablet round robin among … (#30530)
* [improvement](create tablet) backend create tablet round robin among … (#29818)

* [improvement](create tablet) be choose disk tolerate with little skew (#30354)

---------

Co-authored-by: yujun <yu.jun.reach@gmail.com>
2024-01-30 10:20:35 +08:00
f17d29090e [feat](Nereids): drop foreign key after dropping primary key that is referenced by the foreign key (#30417) 2024-01-29 19:03:48 +08:00
a0100ce29f Reduce AlterJobV2/TruncateTable binlog size (#30505)
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
2024-01-29 19:03:48 +08:00
15c625dcbc [fix](Nereids) should not generate same exprId for diff column when sink (#30501) 2024-01-29 19:03:48 +08:00
0f81d2d533 [FIX](complextype)fix complex type nested version type but not hide version (#30419) 2024-01-29 19:03:47 +08:00
6231300e9e [Fix](Rf) fix in_or_bloom filter merge error in broadcast join remote target tpcds q78 (#30492) 2024-01-29 19:03:47 +08:00
afab713048 [fix](Nereids) query mv column directly (#30444) 2024-01-29 19:03:47 +08:00
a0d2aa3619 [test](nereids) add SimplifyArithmeticRule test (#27081) 2024-01-29 19:03:47 +08:00
dce6c8bd65 [Improvement](Nereids) Support aggregate rewrite by materialized view with complex expression (#30440)
The materialized view definition is:

>            select
>            sum(o_totalprice) as sum_total,
>            max(o_totalprice) as max_total,
>            min(o_totalprice) as min_total,
>            count(*) as count_all,
>            bitmap_union(to_bitmap(case when o_shippriority > 1 and o_orderkey IN (1, 3) then o_custkey else null end)) cnt_1,
>            bitmap_union(to_bitmap(case when o_shippriority > 2 and o_orderkey IN (2) then o_custkey else null end)) as cnt_2
>            from lineitem
>            left join orders on l_orderkey = o_orderkey and l_shipdate = o_orderdate;
   

The following query can be rewritten using the materialized view above.
It uses arithmetic over aggregate functions in the select list:

>            select
>            count(distinct case when O_SHIPPRIORITY > 2 and o_orderkey IN (2) then o_custkey else null end) as cnt_2,
>            (sum(o_totalprice) + min(o_totalprice)) * count(*),
>            min(o_totalprice) + count(distinct case when O_SHIPPRIORITY > 2 and o_orderkey IN (2) then o_custkey else null end)
>            from lineitem
>            left join orders on l_orderkey = o_orderkey and l_shipdate = o_orderdate;
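
Why such a rewrite is possible can be shown with a small sketch: every aggregate in the query can be derived from a value the materialized view already stores. The sample rows and the use of a Python `set` in place of a bitmap are assumptions for illustration only; this does not reflect how Nereids performs the rewrite.

```python
# Hypothetical joined rows: (o_orderkey, o_custkey, o_shippriority, o_totalprice)
rows = [
    (1, 101, 3, 10.0),
    (2, 102, 3, 20.0),
    (2, 103, 3, 30.0),
    (2, 103, 1, 40.0),
]


def cnt_2_key(row):
    """Mirrors: case when o_shippriority > 2 and o_orderkey IN (2) then o_custkey else null end"""
    orderkey, custkey, shippriority, _ = row
    return custkey if shippriority > 2 and orderkey in (2,) else None


prices = [row[3] for row in rows]

# What the materialized view would hold (a set stands in for bitmap_union).
mv = {
    "sum_total": sum(prices),
    "min_total": min(prices),
    "count_all": len(rows),
    "cnt_2": {k for k in map(cnt_2_key, rows) if k is not None},
}

# The query evaluated directly against the base rows.
distinct_cnt = len({cnt_2_key(r) for r in rows} - {None})
direct = (
    distinct_cnt,
    (sum(prices) + min(prices)) * len(rows),
    min(prices) + distinct_cnt,
)

# The same query evaluated only from the materialized view's columns.
rewritten = (
    len(mv["cnt_2"]),
    (mv["sum_total"] + mv["min_total"]) * mv["count_all"],
    mv["min_total"] + len(mv["cnt_2"]),
)
```

Both tuples are equal, which is the property the rewrite relies on: `count(distinct ...)` maps to the cardinality of the stored bitmap, and the arithmetic is applied on top of the stored aggregates.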
2024-01-29 19:03:47 +08:00
edeec320d3 [test](nereids) add SimplifyCastRule test case (#26708) 2024-01-29 19:03:47 +08:00
081cdc6ecd [test](nereids)add more case for SimplifyRange rule (#27314) 2024-01-29 19:03:47 +08:00
e36f390134 [fix](nereids)window expression's window frame may lost in NormalizeToSlot (#30378) 2024-01-29 19:03:47 +08:00
036e17dcb0 [test](nereids)add fe ut for SimplifyArithmeticComparisonRule (#27644) 2024-01-29 19:03:47 +08:00
3b85e3de1b [fix](planner)avg function may use wrong decimal precision and scale (#30364) 2024-01-29 19:03:47 +08:00
90c0806178 [fix](query state) Print correct DML state (#30489) 2024-01-29 19:03:47 +08:00
b91a338bce [enhance](auto-partition) Constrain dynamic & auto partition use same interval unit if both enable (#30426) 2024-01-29 19:03:47 +08:00
3354ac48f7 [enhance](mtmv)add version and version time for table (#30437)
Add version to record data changes in the table

Scope of impact: 

- Transaction related operations
- drop partition
- replace partition
2024-01-29 19:03:47 +08:00
db094da081 [fix](common) If the properties in DDL is not provided, the mysql client will lost connection (#30256) 2024-01-29 19:03:47 +08:00
11f1b129c0 [optimize](invert index) avoid redundant checks for exist. (#30191) 2024-01-29 19:03:47 +08:00
779a9a1fbb [opt](planner) use string for varchar in ctas if original table is not olap (#30323) 2024-01-29 19:03:47 +08:00
930e3bb701 [feature](Nereids): double eager support mix function (#30468) 2024-01-29 19:03:47 +08:00
cc963b0f71 [Refact](inverted index) use boost regex to resolve stack overflow issues (#30477) 2024-01-29 19:02:46 +08:00
dcfccde3d1 [fix](Nereids) create table should check column name format (#30421) 2024-01-29 19:02:46 +08:00
5a13c7596a [fix](nereids)should normalize mv column's name before matching prefix keys (#27464) 2024-01-29 19:02:46 +08:00
ae38f28280 [feature](invert index) does not create an inverted index to support the match_phrase_prefix feature. (#30414) 2024-01-29 19:02:46 +08:00
7667fe8570 [Improve)(Variant) do not allow fall back to legacy planner (#30430) 2024-01-29 19:02:46 +08:00
658c869aac [improvement](mtmv)mtmv support partition by hms table (#29989) 2024-01-29 19:02:46 +08:00
a4ccf92fec [fix](thirdparty) patch brpc 1.4.0 to fix stream rpc (#30476) 2024-01-29 19:02:46 +08:00
15a68924f5 Add note for workload group when upgrade Doris (#30457) 2024-01-29 19:02:45 +08:00
bfdc41d37b [fix](ccr) handle large binlog (#30435) 2024-01-28 18:25:31 +08:00
92dc395f9a [fix](nereids)should always call visitBoundFunction first when binding ElementAt function (#30469) 2024-01-28 18:25:31 +08:00
7e19224a6c [fix](function) fix ipv4 funcs get failed error, improve an ipv6 func and exception message (#30269) 2024-01-28 18:25:31 +08:00
0433b8730d [Feature](profile)add shuffle send rows/bytes #30456 2024-01-28 18:25:08 +08:00
b1a9370004 [fix](glue)support access glue iceberg with credential list (#30473)
merge from #30292
2024-01-28 18:23:07 +08:00
f988686708 2.1.0-rc07 2024-01-27 10:55:03 +08:00
96c4fcfb20 [improve](node) refactor partition sort node to reduce memory use
pipelineX
2024-01-27 10:54:29 +08:00
e9218861ec fix code format 2024-01-27 10:40:27 +08:00
4f915129a9 [pipelineX](localexchange) Add local exchange before TabletFunction (#30446)
* [pipelineX](localexchange) Add local exchange before TabletFunction

* update
2024-01-27 10:29:41 +08:00
5d7543b30b [feature](ranger) Support Apache ranger for Doris (#27864)
For usage, see:
5d340ce24f/docs/zh-CN/docs/admin-manual/privilege-ldap/ranger.md

For ranger-doris-plugin, see:
https://github.com/morningman/ranger/tree/doris-plugin

To support Ranger, there are several other modifications:

1. Support `show resources like "pattern"`
2. Support `show workload group like "pattern"`
3. Support `show schemas like "pattern"`
2024-01-27 10:29:38 +08:00
2284575afa [opt](nereids)set flag to indicate if bloom filter size is calculated by ndv (#30278)
set flag to indicate if bloom filter size is calculated by ndv
2024-01-27 10:07:41 +08:00
5986d5415e [opt](Nereids) make runtime filter target support expression (#30131)
The target expression should either:
1. contain only one numeric slot, or
2. be a cast, for any data type

example:
select * from T1 join T2 on abs(T1.a) = T2.a
RF T2.a->abs(T1.a)
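
One possible reading of the rule above, sketched as a small expression check. The `Slot`, `Cast`, and `Func` classes, the `NUMERIC_TYPES` set, and the interpretation that a cast must also wrap exactly one slot are all assumptions for illustration; none of them are Nereids classes or confirmed behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    name: str
    dtype: str


@dataclass(frozen=True)
class Cast:
    child: object
    to: str


@dataclass(frozen=True)
class Func:
    name: str
    args: tuple


# Assumed set of numeric types for this sketch.
NUMERIC_TYPES = {"tinyint", "smallint", "int", "bigint", "float", "double", "decimal"}


def collect_slots(expr):
    """Return every Slot referenced anywhere inside the expression."""
    if isinstance(expr, Slot):
        return [expr]
    if isinstance(expr, Cast):
        return collect_slots(expr.child)
    if isinstance(expr, Func):
        return [s for arg in expr.args for s in collect_slots(arg)]
    return []


def is_supported_rf_target(expr):
    """Accept a cast over one slot of any type, or any expression over one numeric slot."""
    slots = collect_slots(expr)
    if len(slots) != 1:
        return False
    if isinstance(expr, Cast):
        return True
    return slots[0].dtype in NUMERIC_TYPES


t1_a = Slot("T1.a", "int")
t1_b = Slot("T1.b", "int")
t1_s = Slot("T1.s", "varchar")

abs_ok = is_supported_rf_target(Func("abs", (t1_a,)))
two_slots = is_supported_rf_target(Func("add", (t1_a, t1_b)))
cast_ok = is_supported_rf_target(Cast(t1_s, "int"))
```

Under this reading, `abs(T1.a)` from the example qualifies as a runtime filter target, an expression over two slots does not, and a cast qualifies regardless of the source type.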
2024-01-27 10:07:10 +08:00
823c469c5c [fix](rowsetreader) determine merge iterator considering segment num (#29269) 2024-01-27 09:13:21 +08:00
e576412a56 fix iceberg table get split fail when with date type conjuct (#30162) 2024-01-27 09:13:21 +08:00
bedad15f03 [enhancement](scanner) add a lower bound for bytes in scanner queue (#29624) 2024-01-27 09:13:21 +08:00