080c07ad87
[bug](random distribution) fix data loss and incorrect in random distribution table #33962
2024-04-24 17:13:50 +08:00
8d98c71079
[FIX]fix cidr func with const param ( #33968 )
2024-04-24 17:13:50 +08:00
2f60dcf890
[test](hll) fix unstable case without order by clause ( #33947 )
2024-04-24 17:13:50 +08:00
6531e4c540
[improve](regression test)Add test for time series compact empty rowset ( #29509 )
2024-04-24 17:13:49 +08:00
5a5063be20
[bug](fix) heap use after free when json parse failed ( #33955 )
2024-04-22 22:33:24 +08:00
299d069da9
Fix alter policy failed ( #33910 )
2024-04-22 22:33:24 +08:00
98e90dd47e
[fix](auth)fix missing authentication ( #33347 ) ( #33956 )
...
bp #33347
Co-authored-by: zhangdong <493738387@qq.com >
2024-04-22 13:52:36 +08:00
8096753367
[improvement](mtmv) Support union rewrite when the materialized view is not enough to provide all the data for the query ( #33800 )
...
When the materialized view cannot provide all the data for a query, and the materialized view is refreshed incrementally by partition, we can union the materialized view with the original query to answer the query.
This depends on https://github.com/apache/doris/pull/33362
For example, given the following materialized view definition:
> CREATE MATERIALIZED VIEW mv_10086
> BUILD IMMEDIATE REFRESH AUTO ON MANUAL
> partition by(l_shipdate)
> DISTRIBUTED BY RANDOM BUCKETS 2
> PROPERTIES ('replication_num' = '1')
> AS
> select l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
> from lineitem
> left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
> group by
> l_shipdate,
> o_orderdate,
> l_partkey,
> l_suppkey;
The materialized view data is as follows:
+------------+-------------+-----------+-----------+-----------+
| l_shipdate | o_orderdate | l_partkey | l_suppkey | sum_total |
+------------+-------------+-----------+-----------+-----------+
| 2023-10-18 | 2023-10-18 | 2 | 3 | 109.20 |
| 2023-10-17 | 2023-10-17 | 2 | 3 | 99.50 |
| 2023-10-19 | 2023-10-19 | 2 | 3 | 99.50 |
+------------+-------------+-----------+-----------+-----------+
After inserting data into partition `2023-10-17`, run the following query:
```
select l_shipdate, o_orderdate, l_partkey, l_suppkey, sum(o_totalprice) as sum_total
from lineitem
left join orders on lineitem.l_orderkey = orders.o_orderkey and l_shipdate = o_orderdate
group by
l_shipdate,
o_orderdate,
l_partkey,
l_suppkey;
```
Query rewrite by the materialized view fails with the message `Check partition query used validation fail`.
If the switch `SET enable_materialized_view_union_rewrite = true;` is on (it defaults to true),
running the query above again succeeds: it uses a union all of the materialized view and the original query to answer the query correctly. The plan is as follows:
```
| Explain String(Nereids Planner) |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0 |
| OUTPUT EXPRS: |
| l_shipdate[#52 ] |
| o_orderdate[#53 ] |
| l_partkey[#54 ] |
| l_suppkey[#55 ] |
| sum_total[#56 ] |
| PARTITION: UNPARTITIONED |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| VRESULT SINK |
| MYSQL_PROTOCAL |
| |
| 11:VEXCHANGE |
| offset: 0 |
| distribute expr lists: |
| |
| PLAN FRAGMENT 1 |
| |
| PARTITION: HASH_PARTITIONED: l_shipdate[#42 ], o_orderdate[#43 ], l_partkey[#44 ], l_suppkey[#45 ] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 11 |
| UNPARTITIONED |
| |
| 10:VUNION(756) |
| | |
| |----9:VAGGREGATE (merge finalize)(753) |
| | | output: sum(partial_sum(o_totalprice)[#46 ])[#51 ] |
| | | group by: l_shipdate[#42 ], o_orderdate[#43 ], l_partkey[#44 ], l_suppkey[#45 ] |
| | | cardinality=2 |
| | | distribute expr lists: l_shipdate[#42 ], o_orderdate[#43 ], l_partkey[#44 ], l_suppkey[#45 ] |
| | | |
| | 8:VEXCHANGE |
| | offset: 0 |
| | distribute expr lists: l_shipdate[#42 ] |
| | |
| 1:VEXCHANGE |
| offset: 0 |
| distribute expr lists: |
| |
| PLAN FRAGMENT 2 |
| |
| PARTITION: HASH_PARTITIONED: o_orderkey[#21 ], o_orderdate[#25 ] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 08 |
| HASH_PARTITIONED: l_shipdate[#42 ], o_orderdate[#43 ], l_partkey[#44 ], l_suppkey[#45 ] |
| |
| 7:VAGGREGATE (update serialize)(747) |
| | STREAMING |
| | output: partial_sum(o_totalprice[#41 ])[#46 ] |
| | group by: l_shipdate[#37 ], o_orderdate[#38 ], l_partkey[#39 ], l_suppkey[#40 ] |
| | cardinality=2 |
| | distribute expr lists: l_shipdate[#37 ] |
| | |
| 6:VHASH JOIN(741) |
| | join op: RIGHT OUTER JOIN(PARTITIONED)[] |
| | equal join conjunct: (o_orderkey[#21 ] = l_orderkey[#5 ]) |
| | equal join conjunct: (o_orderdate[#25 ] = l_shipdate[#15 ]) |
| | runtime filters: RF000[min_max] <- l_orderkey[#5 ](2/2/2048), RF001[bloom] <- l_orderkey[#5 ](2/2/2048), RF002[min_max] <- l_shipdate[#15 ](1/1/2048), RF003[bloom] <- l_shipdate[#15 ](1/1/2048) |
| | cardinality=2 |
| | vec output tuple id: 4 |
| | output tuple id: 4 |
| | vIntermediate tuple ids: 3 |
| | hash output slot ids: 6 7 24 25 15 |
| | final projections: l_shipdate[#36 ], o_orderdate[#32 ], l_partkey[#34 ], l_suppkey[#35 ], o_totalprice[#31 ] |
| | final project output tuple id: 4 |
| | distribute expr lists: o_orderkey[#21 ], o_orderdate[#25 ] |
| | distribute expr lists: l_orderkey[#5 ], l_shipdate[#15 ] |
| | |
| |----3:VEXCHANGE |
| | offset: 0 |
| | distribute expr lists: l_orderkey[#5 ] |
| | |
| 5:VEXCHANGE |
| offset: 0 |
| distribute expr lists: |
| |
| PLAN FRAGMENT 3 |
| |
| PARTITION: RANDOM |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 05 |
| HASH_PARTITIONED: o_orderkey[#21 ], o_orderdate[#25 ] |
| |
| 4:VOlapScanNode(722) |
| TABLE: union_db.orders(orders), PREAGGREGATION: ON |
| runtime filters: RF000[min_max] -> o_orderkey[#21 ], RF001[bloom] -> o_orderkey[#21 ], RF002[min_max] -> o_orderdate[#25 ], RF003[bloom] -> o_orderdate[#25 ] |
| partitions=3/3 (p_20231017,p_20231018,p_20231019), tablets=9/9, tabletList=161188,161190,161192 ... |
| cardinality=3, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| |
| PLAN FRAGMENT 4 |
| |
| PARTITION: HASH_PARTITIONED: l_orderkey[#5 ] |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 03 |
| HASH_PARTITIONED: l_orderkey[#5 ], l_shipdate[#15 ] |
| |
| 2:VOlapScanNode(729) |
| TABLE: union_db.lineitem(lineitem), PREAGGREGATION: ON |
| PREDICATES: (l_shipdate[#15 ] >= '2023-10-17') AND (l_shipdate[#15 ] < '2023-10-18') |
| partitions=1/3 (p_20231017), tablets=3/3, tabletList=161223,161225,161227 |
| cardinality=2, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| |
| PLAN FRAGMENT 5 |
| |
| PARTITION: RANDOM |
| |
| HAS_COLO_PLAN_NODE: false |
| |
| STREAM DATA SINK |
| EXCHANGE ID: 01 |
| RANDOM |
| |
| 0:VOlapScanNode(718) |
| TABLE: union_db.mv_10086(mv_10086), PREAGGREGATION: ON |
| partitions=2/3 (p_20231018_20231019,p_20231019_20231020), tablets=4/4, tabletList=161251,161253,161265 ... |
| cardinality=2, avgRowSize=0.0, numNodes=1 |
| pushAggOp=NONE |
| |
| MaterializedView |
| MaterializedViewRewriteSuccessAndChose: |
| Names: mv_10086 |
| MaterializedViewRewriteSuccessButNotChose: |
| |
| MaterializedViewRewriteFail: |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
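The union rewrite described above can be modeled in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the Nereids implementation: the function names `aggregate_base` and `union_rewrite`, the dict-based tables, the simplified group key (without `o_orderdate`), and the extra base row for `2023-10-17` are all hypothetical. Partitions that are still valid are read from the materialized view, and the stale partition is recomputed from the base rows, mirroring the plan where `mv_10086` scans 2 of 3 partitions and `lineitem` is filtered to `2023-10-17`.

```python
from collections import defaultdict

def aggregate_base(base_rows, partitions):
    """Recompute sum(price) per group key from base rows, restricted to the given partitions."""
    totals = defaultdict(float)
    for shipdate, partkey, suppkey, price in base_rows:
        if shipdate in partitions:
            totals[(shipdate, partkey, suppkey)] += price
    return dict(totals)

def union_rewrite(mv_rows, base_rows, stale_partitions):
    """Union the valid MV partitions with freshly aggregated stale partitions."""
    # Keep only MV rows whose partition (l_shipdate) is still in sync.
    result = {key: total for key, total in mv_rows.items()
              if key[0] not in stale_partitions}
    # Recompute the stale partitions from the base table and add them.
    result.update(aggregate_base(base_rows, stale_partitions))
    return result

# Hypothetical data: the MV was built before a new row landed in 2023-10-17.
mv_rows = {
    ("2023-10-17", 2, 3): 99.5,
    ("2023-10-18", 2, 3): 109.2,
    ("2023-10-19", 2, 3): 99.5,
}
base_rows = [
    ("2023-10-17", 2, 3, 99.5),
    ("2023-10-17", 2, 3, 10.0),   # newly inserted row, not yet in the MV
    ("2023-10-18", 2, 3, 109.2),
    ("2023-10-19", 2, 3, 99.5),
]
result = union_rewrite(mv_rows, base_rows, stale_partitions={"2023-10-17"})
```

In this model the stale partition never comes from the materialized view, so the new row is reflected in the result while the other two partitions avoid the join and aggregation entirely.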
2024-04-21 13:22:26 +08:00
e6a6b82201
[nereids](mtmv) Support rewrite by mv nested materialized view ( #33362 )
...
Support query rewriting by nested materialized views.
For example, `inner_mv` is defined as follows:
> select
> l_linenumber,
> o_custkey,
> o_orderkey,
> o_orderstatus,
> l_partkey,
> l_suppkey,
> l_orderkey
> from lineitem
> inner join orders on lineitem.l_orderkey = orders.o_orderkey;
`mv1_0` is defined as follows:
> select
> l_linenumber,
> o_custkey,
> o_orderkey,
> o_orderstatus,
> l_partkey,
> l_suppkey,
> l_orderkey,
> ps_availqty
> from inner_mv
> inner join partsupp on l_partkey = ps_partkey AND l_suppkey = ps_suppkey;
For the following query, rewriting by materialized view succeeds with both `inner_mv` and `mv1_0`, and the CBO finally chooses `mv1_0`.
> select lineitem.l_linenumber
> from lineitem
> inner join orders on l_orderkey = o_orderkey
> inner join partsupp on l_partkey = ps_partkey AND l_suppkey = ps_suppkey
> where o_orderstatus = 'o' AND l_linenumber in (1, 2, 3, 4, 5)
2024-04-21 09:55:34 +08:00
60253c827c
[fix](nereids) do not push RF into nested cte ( #33769 )
2024-04-20 20:08:00 +08:00
36a70ba1e7
[Fix](Csv-Reader)Fix the issue of BE core dump caused by improper configuration of column_seperator and line_delimiter. ( #33693 )
2024-04-20 20:06:48 +08:00
03c3419265
[Refactor](executor)Add workload schedule policy table ( #33729 )
2024-04-20 20:06:34 +08:00
0e3ad5cd9d
[fix](parquet) fix time zone error(isAdjustedToUTC=true) in parquet reader ( #33675 ) ( #33924 )
...
bp (#33675 )
Co-authored-by: Ashin Gau <AshinGau@users.noreply.github.com >
2024-04-20 19:06:54 +08:00
80307354b2
[fix](Nereids): add whole tree rewriter when root is not CTEAnchor ( #33591 ) ( #33906 )
2024-04-20 01:05:57 +08:00
ee687a43fd
[fix](plsql) Fix regression test for routine select ( #33860 )
...
fix #33608 , more comprehensive test
2024-04-19 23:41:46 +08:00
175e85d616
[Bug](runtime-filter) fix coredump on no null string type rf ( #33869 )
...
fix coredump on no null string type rf
2024-04-19 15:03:06 +08:00
659900040f
[Fix](inverted index) fix wrong need read data opt when encounters columnA > columnB predicate ( #33855 )
2024-04-19 15:03:06 +08:00
6776a3ad1b
[Fix](planner) fix create view star except and modify cast to sql ( #33726 )
2024-04-19 15:02:49 +08:00
a8ba933947
[Fix](nereids) fix bind order by expression logic ( #33843 )
2024-04-19 15:02:49 +08:00
2675e94a93
[feature](variable) add read_only and super_read_only ( #33795 )
2024-04-19 15:02:21 +08:00
5abc84af71
[fix](txn insert) Fix txn insert commit failed when schema change ( #33706 )
2024-04-19 15:01:57 +08:00
8f6f4cf0eb
[Pick](Variant) pick #33734 #33766 #33707 to branch-2.1 ( #33848 )
...
* [Fix](Variant Type) forbid distribution info contains variant columns (#33707 )
* [Fix](Variant) VariantRootColumnIterator::read_by_rowids with wrong null map size (#33734 )
insert_range_from should start from `size` with `count` elements for null map
* [Fix](Variant) check column index validation for extracted columns (#33766 )
2024-04-18 19:42:44 +08:00
34a97d5e8b
[fix](Nereids)fix unstable plan shape in limit_push_down case
2024-04-18 19:05:24 +08:00
e106d34190
[enhancement](plsql) regression for routine select and show create procedure ( #33608 )
...
add regression for routines and show create procedure
Issue Number: close #31297
2024-04-18 19:04:03 +08:00
04e30c91a0
[Fix](Variant) VariantRootColumnIterator::read_by_rowids with wrong null map size ( #33734 )
...
insert_range_from should start from `size` with `count` elements for null map
2024-04-18 19:02:58 +08:00
8c535c51b5
[Improvement](materialized-view) support multiple agg function have same base table slot ( #33774 )
...
support multiple agg function have same base table slot
2024-04-18 19:02:49 +08:00
a57e0d3500
[Pick](nerids) pick #33010 #32982 #33531 to branch 2.1 ( #33829 )
2024-04-18 18:40:36 +08:00
d72d5c9b5d
[fix](inverted index) normal process query for null condition when index is missing ( #33663 )
2024-04-17 23:42:14 +08:00
2648a92594
[FIX](load)fix load with split-by-string ( #33713 )
2024-04-17 23:42:14 +08:00
bb33375dba
[test](xor) add test for xor #33731
2024-04-17 23:42:13 +08:00
81f7c53bad
[fix](Nereids) could not query variant that not from table ( #33704 )
2024-04-17 23:42:13 +08:00
1c025c0488
[docker](hive) add hive3 docker compose and modify scripts ( #33115 )
...
add hive3 docker compose from:
big-data-europe/docker-hive#56
2024-04-17 23:42:13 +08:00
22a6b1d3f5
[feature](function) support hll functions hll_from_base64, hll_to_base64 ( #32089 )
...
Issue Number: #31320
Support two hll functions:
- hll_from_base64
Convert a base64 string (the result of function hll_to_base64) into an hll.
- hll_to_base64
Convert an input hll to a base64 string.
2024-04-17 23:42:13 +08:00
3096150d1b
[feature](agg) support aggregate function group_array_intersect ( #33265 )
2024-04-17 23:42:13 +08:00
b07e0a2f06
[FIX](cast)fix full/right out join for cast array ( #33475 )
...
In some cases we have code like:
```
if (_join_op == TJoinOp::RIGHT_OUTER_JOIN || _join_op == TJoinOp::FULL_OUTER_JOIN) {
    _probe_column_convert_to_null = _convert_block_to_null(*input_block);
}
```
and then call a subsequent function such as cast. However, the cast function assumes the block column's type is the same as from_type, so the conversion to nullable leads to an error status.
2024-04-17 23:42:13 +08:00
d15981abd2
[Enhencement](Nereids) add rule of agg(case when) to agg(filter) ( #33598 )
2024-04-17 23:42:13 +08:00
8e38549a92
[fix](nereids) Use correct PREAGGREGATION in agg(filter(scan)) ( #33454 )
...
1. Set `PreAggStatus` to `ON` when a key column is aggregated by max or min;
2. #28747 may change the `PreAggStatus` of a scan; inherit it from the previous one.
2024-04-17 23:42:13 +08:00
7b16cb5a4c
[feature](inverted index) add slop functionality to match_phrase ( #33225 )
...
https://github.com/apache/doris-website/pull/553 doc
2024-04-17 23:42:12 +08:00
06a155abb0
[branch-2.1](cherry-pick) Pick some partial-update PR from master ( #33639 )
...
* [Fix](partial-update) Fix partial update fail when the datetime default value is 'current_time' (#32926 )
* Problem: When importing data that includes datetime with a default value of current time for partial column updates, the import fails.
Reason: Partial column updates do not handle the logic for datetime default values.
Solution: During partial column updates, when the default value is set to current time, read the current time from the runtime state and write it into the data.
* [Enhancement](partial update)Add timezone case for partial update timestamp #33177
* [fix](partial update) Support partial update when the date default value is 'current_date'. This PR is an extension of PR #32926. (#33394)
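The solution described for #32926 (read the current time from the runtime state and write it into the data) can be sketched as below. This is a hypothetical Python model, not the Doris BE code: `CURRENT_TIMESTAMP`, `fill_missing_columns`, and `load_start_time` are names invented for illustration, and the sketch only covers a newly inserted row whose omitted columns must take their defaults.

```python
from datetime import datetime

# Sentinel marking a column whose default is "current time" (hypothetical).
CURRENT_TIMESTAMP = object()

def fill_missing_columns(row, schema_defaults, load_start_time):
    """Fill columns omitted by a partial update with their default values.

    A "current time" default is resolved from a timestamp captured once in
    the load's runtime state, so every row of the load sees the same value.
    """
    filled = dict(row)
    for column, default in schema_defaults.items():
        if column in filled:
            continue  # the partial update supplied this column
        if default is CURRENT_TIMESTAMP:
            filled[column] = load_start_time
        else:
            filled[column] = default
    return filled

schema_defaults = {"v1": 0, "create_time": CURRENT_TIMESTAMP}
load_start_time = datetime(2024, 4, 17, 12, 0, 0)  # captured once per load
filled = fill_missing_columns({"k": 1, "v1": 7}, schema_defaults, load_start_time)
```

Capturing the timestamp once, rather than calling the clock per row, keeps all rows written by one load consistent with each other.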
2024-04-17 23:42:12 +08:00
4740b22481
[fix](test) fix some p2 external table test cases ( #33624 )
...
bp #33621
Also fix a merge bug from #33245
2024-04-17 23:42:12 +08:00
1be753ed75
[enhancement](mysql compatible) add user and procs_priv tables to mysql db in all catalogs ( #33058 )
...
Issue Number: close #xxx
This PR improves compatibility for BI tools (such as DBeaver and DataGrip) that connect to Doris through the MySQL connector, because some BI tools query tables in the mysql database. In our tests, the user and procs_priv tables were the ones mainly queried. This PR adds these two tables and populates the user table with actual data. Note, however, that most fields in the user table are in Doris's own format rather than MySQL's, so this only ensures that BI tools get no error when querying these tables; it does not guarantee that the data is displayed completely. Tables under Doris's mysql database do not support data modification.
Thanks to @liujiwen-up for assisting in testing
2024-04-17 23:42:12 +08:00
48880c3e1a
[Fix](timezone) fix miss of expected rounding of Date type with timezone #33553
2024-04-17 23:42:11 +08:00
1f5116f3c1
[testcases](auto-partition) Add and fix testcases in P0 #33588
2024-04-17 23:42:11 +08:00
379c2e0762
[Fix](Nereids) fix leading hint should have all tables in one query block ( #33517 )
2024-04-17 23:42:00 +08:00
ae4a7c93ac
[feat](nereids) support create view in nereids ( #32743 )
2024-04-17 23:42:00 +08:00
249a9c9875
[Feature](Variant) support aggregation model for Variant type ( #33493 )
...
Refactor: use `insert_from` instead of `replace_column_data` for variable-length columns.
2024-04-17 23:42:00 +08:00
6bcf24b1f6
[bug](not in) if not in (null) could eos early ( #33482 )
...
* [bug](not in) if not in (null) could eos early
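Why a `NOT IN` list containing NULL allows an early end of stream follows from SQL three-valued logic: `x NOT IN (..., NULL)` is either FALSE or NULL, never TRUE, so no row can pass the filter. The Python sketch below models this with `None` standing in for NULL. The function names are hypothetical, and which operator the actual fix touches is not stated in the commit message.

```python
def sql_not_in(value, candidates):
    """Three-valued NOT IN: returns True, False, or None (SQL NULL)."""
    if not candidates:
        return True
    if value is None:
        return None
    if any(c is not None and c == value for c in candidates):
        return False
    if any(c is None for c in candidates):
        return None  # unknown: value might equal the NULL element
    return True

def can_finish_early(candidates):
    """With a NULL in the list, NOT IN is never TRUE, so the output is empty."""
    return any(c is None for c in candidates)

rows = [1, 2, 3, None]
# A filter keeps a row only when the predicate is exactly TRUE.
kept_with_null = [r for r in rows if sql_not_in(r, [2, None]) is True]
kept_without_null = [r for r in rows if sql_not_in(r, [2]) is True]
```

Here `kept_with_null` is empty regardless of the input rows, which is the condition under which a reader can stop without scanning further.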
2024-04-17 23:41:59 +08:00
9dfc9665fb
[Fix](inverted index) fix build index error status when batch_next ( #33532 )
2024-04-17 23:41:59 +08:00
09fb30c989
(Chore)[regression-test] fix unstable output variant case ( #33520 )
2024-04-17 23:41:59 +08:00
82d2bde3c7
[fix](nereids) do not transpose semi join agg when mark join ( #32475 )
2024-04-17 23:41:59 +08:00