Before:
mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2));
+-----------------------------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) |
+-----------------------------------------------------------+
| 0.00 |
+-----------------------------------------------------------+
mysql [test]>select * from divtest;
+------+------+
| id | val |
+------+------+
| 3 | 5.00 |
| 2 | 4.00 |
| 1 | 3.00 |
+------+------+
mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest;
+-------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / `val` |
+-------------------------------------+
| 0 |
| 0 |
| 0 |
+-------------------------------------+
After:
mysql [test]>select cast(1 as DECIMALV3(16, 2)) / cast(3 as DECIMALV3(16, 2));
+-----------------------------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / CAST(3 AS DECIMALV3(16, 2)) |
+-----------------------------------------------------------+
| 0.33 |
+-----------------------------------------------------------+
mysql [test]>select cast(1 as decimalv3(16,2)) / val from divtest;
+-------------------------------------+
| CAST(1 AS DECIMALV3(16, 2)) / `val` |
+-------------------------------------+
| 0.250000 |
| 0.200000 |
| 0.333333 |
+-------------------------------------+
This is because, in the previous code, the constant 1.000 was transformed into 1.
remove "ReduceType"
* [Improve](performance) introduce SchemaCache to cache TabletSchema & Schema
1. When the system is under a high-concurrency load of wide-table point queries, the frequent memory allocation and deallocation of Schema becomes an evident system bottleneck, and the initialization of TabletSchema and Schema also becomes a CPU hotspot. Therefore, a SchemaCache is introduced to cache these resources for reuse (a sketch of the caching idea follows this list).
2. Wrap some variables in std::unique_ptr
Performance:
| Setting | QPS | Avg response time | P99 response time |
|--------------------|-----|-------------------|-------------------|
| SchemaCache enabled | 501 | 20ms | 34ms |
| SchemaCache disabled | 321 | 31ms | 61ms |
* handle schema change with schema version
* remove useless header
* rebase
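The actual SchemaCache is implemented in the BE in C++; the following Java-style sketch (all names hypothetical) only illustrates the caching idea referenced above: key the cached TabletSchema/Schema by tablet id and schema version so that concurrent point queries reuse one instance instead of rebuilding it per request.
```
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// The real cache lives in the BE (C++); this sketch only shows the idea:
// key the cached schema by tablet id and schema version so that concurrent
// point queries reuse one instance instead of rebuilding it per request.
class SchemaCacheSketch<S> {

    private final Map<String, S> cache = new ConcurrentHashMap<>();

    // Keying by schema version also covers the "handle schema change with
    // schema version" case: a new version simply gets a new cache entry.
    static String cacheKey(long tabletId, int schemaVersion) {
        return tabletId + "-" + schemaVersion;
    }

    S getOrCreate(long tabletId, int schemaVersion, Supplier<S> builder) {
        // Build the schema at most once per key and reuse it afterwards.
        return cache.computeIfAbsent(cacheKey(tabletId, schemaVersion), key -> builder.get());
    }
}
```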
1. Delete the stats for data size, since collecting it costs too much time and is not useful.
2. Make the task timeout configurable, since it is common to analyze a very large table for which the default 10 min is not suitable.
This PR fixes a bug introduced by PR #19909.
The bug failed to set the column unique id for Iceberg tables, which caused all query results for Iceberg tables to be NULL.
```
mysql> select * from iceberg_partition_lower_case_parquet limit 1;
+------+------+------+---------+
| k1 | k2 | k3 | city |
+------+------+------+---------+
| NULL | NULL | NULL | Beijing |
+------+------+------+---------+
1 row in set (0.60 sec)
```
After fix:
```
mysql> select * from iceberg_partition_lower_case_parquet limit 1;
+------+------+------+---------+
| k1 | k2 | k3 | city |
+------+------+------+---------+
| 1 | k2_1 | k3_1 | Beijing |
+------+------+------+---------+
1 row in set (0.35 sec)
```
When an OLAP table has dynamic partition enabled and the table is dropped and then recovered, the table should be added to the DynamicPartitionScheduler again.
---------
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
Refactoring the filtering conditions in the current ExecNode from an expression tree to an array can simplify the process of adding runtime filters. It eliminates the need for complex merge operations and removes the requirement for the frontend to combine expressions into a single entity.
By representing the filtering conditions as an array, each condition can be treated individually, making it easier to add runtime filters without the need for complex merging logic. The array can store the individual conditions, and the runtime filter logic can iterate through the array to apply the filters as needed.
This refactoring simplifies the codebase, improves readability, and reduces the complexity associated with handling filtering conditions and adding runtime filters. It separates the conditions into discrete entities, enabling more straightforward manipulation and management within the execution node.
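A minimal sketch of the idea, using hypothetical types that stand in for the real planner classes: instead of folding every filter condition into one nested AND expression that has to be merged whenever a runtime filter arrives, the conditions live in a flat list, so adding a runtime filter is a plain append and evaluation is a loop over the list.
```
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the planner's expression types.
interface Expr {}

class RuntimeFilterExpr implements Expr {}

class CompoundAnd implements Expr {
    final Expr left;
    final Expr right;
    CompoundAnd(Expr left, Expr right) { this.left = left; this.right = right; }
}

class ConjunctsSketch {

    // Old shape: one expression tree, so attaching a runtime filter means
    // rebuilding a compound AND node around the existing tree.
    static Expr mergeIntoTree(Expr existingTree, Expr runtimeFilter) {
        return new CompoundAnd(existingTree, runtimeFilter);
    }

    // New shape: a flat list of conjuncts, so attaching a runtime filter is
    // a plain append and evaluation is a loop over the list.
    static List<Expr> appendRuntimeFilter(List<Expr> conjuncts, Expr runtimeFilter) {
        List<Expr> result = new ArrayList<>(conjuncts);
        result.add(runtimeFilter);
        return result;
    }
}
```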
Extend the functionality of advanced materialized views.
This feature is already supported by the legacy planner via PR #19650.
This PR implements it in Nereids, covering the features below:
1. Support multiple columns in an aggregate function, e.g. select sum(c1 + c2) from t1;
2. Support complex expressions, e.g. select abs(c1), sum(abs(c1+1) + 1) from t1;
TODO:
1. Support adding a where clause in the materialized view
* [Bug](point query) checkAndSetPointQuery before checkEnableTwoPhaseRead
1. checkEnableTwoPhaseRead relies on the short circuit flag
2. Add more metrics to display the lookup profile
* fix rebase
Previously, a string could not be used as the variable key in a hint.
Before this PR
mysql> SET enable_nereids_planner=true;
Query OK, 0 rows affected (0.01 sec)
mysql> set enable_fallback_to_original_planner=false;
Query OK, 0 rows affected (0.10 sec)
mysql> explain select /*+ SET_var("enable_nereids_planner" = "false") */ 1;
ERROR 1105 (HY000): Exception, msg: Nereids cannot parse the SQL, and fallback disabled. caused by:
no viable alternative at input 'select /*+ SET_var("enable_nereids_planner"'(line 1, pos 27)
After this PR
mysql> SET enable_nereids_planner=true;
Query OK, 0 rows affected (0.01 sec)
mysql> set enable_fallback_to_original_planner=false;
Query OK, 0 rows affected (0.10 sec)
mysql> select /*+ SET_var("enable_nereids_planner" = "false") */ 1;
+------+
| 1 |
+------+
| 1 |
+------+
1 row in set (0.00 sec)
Support the string for the hint key in the Parser.
Before this change, when optimizing with the Nereids planner, the input was always serialized to memory first, and when a bug happened it was dumped to a minidump file while catching the exception.
We found that this serialization hurts performance when the statistics message is too large or when the optimization time is small enough for the overhead to matter.
So minidump is now used ONLY when the minidump switch is explicitly enabled (set enable_minidump=true;).
1. Before this PR, if a rowset did not contain a column that should be read for the related SlotDescriptor, `insert_default` was called on the column, but that is not the real default value. The real default value information should be provided by the frontend side.
2. Support fetch when light schema change is not enabled, but disable it for the AGG and UNIQUE MOR models.
When the PostgreSQL bit type size is 1, it is read as a java.lang.Boolean via JDBC, and if we match it against string,
it will display true or false. But the normal display should be a number,
so when the size of the bit type is detected to be 1, it is matched with boolean instead.
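A small sketch of the type mapping described above, with hypothetical names rather than the real Doris JDBC catalog code:
```
// Hypothetical names. PostgreSQL bit(1) comes back from the JDBC driver as
// java.lang.Boolean, so treating it as a string would print "true"/"false";
// mapping it to a boolean type keeps the numeric 0/1 display.
class PgTypeMappingSketch {

    enum TargetType { BOOLEAN, STRING }

    static TargetType mapPostgresBitType(int columnSize) {
        // bit(1) is surfaced as Boolean -> map it to BOOLEAN;
        // bit(n) with n > 1 keeps the string mapping.
        return columnSize == 1 ? TargetType.BOOLEAN : TargetType.STRING;
    }
}
```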
1. Fix duplicate '/' in front-end request URIs.
2. When the FileSystemSeparator is '\\', replace '\\' with '/'.
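A minimal sketch of both fixes, with a hypothetical helper name:
```
// Hypothetical helper illustrating both fixes.
class PathNormalizeSketch {

    static String normalize(String rawPath) {
        // Treat the Windows file-system separator '\' as '/'.
        String path = rawPath.replace('\\', '/');
        // Collapse any run of consecutive '/' into a single one.
        return path.replaceAll("/+", "/");
    }

    public static void main(String[] args) {
        // "\\" in the Java literal is a single backslash.
        System.out.println(normalize("/api//tablet\\schema")); // prints /api/tablet/schema
    }
}
```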
Co-authored-by: labuladuo <labuladuo@douyu.tv>
Support the operator `PartitionTopN`, which partitions first and then performs the topn operation in each partition. It is used in the following cases:
```
-- Support push the filter down to the window and generate the PartitionTopN.
-- The plan change from `window -> filter` to `partitionTopN -> window -> filter`.
explain select * from (select * , row_number() over(partition by b order by a) as num from t ) tt where num <= 10;
-- Support push the limit down to the window and generate the PartitionTopN.
-- The plan change from `window -> limit` to `partitionTopN -> window -> limit `.
explain select row_number() over(partition by b order by a) as num from t limit 10;
-- Support push the topn down to the window and generate the PartitionTopN.
-- The plan change from `window -> topn` to `partitionTopN -> window -> topn `.
explain select row_number() over(partition by b order by a) as num from t order by num limit 10;
```
The detailed design of the FE part:
1. Add the following rewrite rules:
- PUSHDOWN_FILTER_THROUGH_WINDOW
- PUSH_LIMIT_THROUGH_PROJECT_WINDOW
- PUSH_LIMIT_THROUGH_WINDOW
- PUSHDOWN_TOP_N_THROUGH_PROJECTION_WINDOW
- PUSHDOWN_TOP_N_THROUGH_WINDOW
2. Add the PartitionTopN node(LogicalPlan/ PhysicalPlan/ TranslatorPlan)
3. For the rewritten plan, there are several requirements that need to be met:
- For the `Filter` part, only `<`/`<=`/`=` conditions are considered, and the filter conditions will be stored.
- For the `Window` part, only one window function is supported, and it must be `row_number`, `rank`, or `dense_rank`. The `partition by` key and the `order by` key cannot both be empty, and the `Window Frame` should be `UNBOUNDED to CURRENT`.
4. For the `PhysicalPartitionTopN`, the requested property is `Any` and the output property is its child's property.
These are the most important details; for the other parts, you can check the code directly.
Issue Number #18646
BE Part #19708
If the user manually removed a Hive partition (removing the partition dir through HDFS), Doris would fail to query the Hive
table with the error message `get file split failed for table`, because the Hive metadata still contains the removed partition.
This PR fixes the bug by skipping the non-existent dirs.
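A sketch of the "skip the non-existent dirs" idea using the Hadoop FileSystem API; the class and method names here are hypothetical:
```
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical names. Before building file splits, drop partition locations
// that the Hive metastore still lists but that were removed from HDFS by hand.
class PartitionPruneSketch {

    static List<Path> keepExistingPartitions(FileSystem fs, List<Path> partitionDirs)
            throws IOException {
        List<Path> existing = new ArrayList<>();
        for (Path dir : partitionDirs) {
            if (fs.exists(dir)) {
                existing.add(dir);
            }
            // else: the partition dir was removed manually; skip it instead of
            // failing the whole query with "get file split failed for table".
        }
        return existing;
    }
}
```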
The variable logType in ExternalCatalog is not persisted to disk; after a refresh it becomes NULL and causes an NPE. This PR fixes that bug.
Also, the old type variable in ExternalCatalog is removed; logType is used instead.
Support collecting Hive external table statistics by running SQL against the Hive table.
By running SQL, we can collect all the statistics that are collected for OLAP tables, including the min and max values of string columns.
With 3 BEs (16 cores, 64 GB), it takes less than 2 minutes to collect TPCH 100GB statistics for all columns of all tables,
and also less than 2 minutes to collect all-column statistics for the SSB 100GB tables.
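An illustrative sketch of the kind of per-column aggregation that can be run against the Hive table; the exact set of statistics and the SQL shape used by Doris may differ:
```
// Hypothetical sketch of the SQL that could be generated per column; the exact
// statistics and query shape used by Doris may differ.
class HiveStatsSqlSketch {

    static String buildColumnStatsSql(String catalog, String db, String table, String column) {
        return "SELECT COUNT(1) AS row_count, "
                + "COUNT(DISTINCT " + column + ") AS ndv, "
                + "SUM(CASE WHEN " + column + " IS NULL THEN 1 ELSE 0 END) AS null_count, "
                + "MIN(" + column + ") AS min_value, "
                + "MAX(" + column + ") AS max_value "
                + "FROM " + catalog + "." + db + "." + table;
    }
}
```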
Support reading Hudi MOR tables by using the JNI connector.
Note:
the FE part in this PR is not fully complete, and the BE part will be supplemented in the next PR.