doris

Author	SHA1	Message	Date
Pxl	8249441335	[Bug](planner) add conjunct slotref id to table function node to avoid incorrect results (#18063 )	2023-03-24 14:48:03 +08:00
zhangstar333	2a35adbba8	[vectorized](udaf) fix unstable java-udaf P0 case (#18054 ) The udaf case was unstable because, when enable_pipeline_engine=true, the agg function runs with only 1 instance and so does not merge the default value; with more than 1 instance, the default value is merged.	2023-03-24 09:10:58 +08:00
924060929	d3e7f12ada	[refactor](Nereids) refactor column pruning (#17579 ) This PR refactors column pruning to use a visitor. The benefits: 1. It is easy to add column pruning for a new plan: implement the interface `OutputPrunable` if the plan contains an output field, or do nothing if it does not. There is no need to add a new rule like `PruneXxxChildColumns`; only a few scenarios, such as pruning the LogicalSetOperation and Aggregate, need to override the visit function with special logic. 2. It supports shrinking the output field in some plans, which skips some useless operations and so improves performance. Example: ```sql select id from ( select id, sum(age) from student group by id )a ``` We should prune the useless `sum(age)` in the aggregate. Before refactor: ``` LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalSubQueryAlias ( qualifier=[a] ) +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0, sum(age#2) AS `sum(age)`#4], hasRepeat=false ) +--LogicalProject ( distinct=false, projects=[id#0, age#2], excepts=[], canEliminate=true ) +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON ) ``` After refactor: ``` LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalSubQueryAlias ( qualifier=[a] ) +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0], hasRepeat=false ) +--LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON ) ```	2023-03-24 09:00:48 +08:00
slothever	678314d657	[fix](regression)fix glue regression (#17952 )	2023-03-24 00:10:20 +08:00
morrySnow	c1bd5b26a8	[refactor](Nereids) expression translation no longer relies on legacy planner code (#17671 )	2023-03-23 23:05:15 +08:00
Gabriel	5445a86570	[Bug](array_product) Fix array_product for ARRAY<DECIMAL> (#18014 )	2023-03-23 20:29:50 +08:00
Xin Liao	bdff9a7a7b	[regression-test](merge-on-write) Optimize merge-on-write case (#18038 )	2023-03-23 17:59:49 +08:00
lihangyu	4c5ba4bb01	[Improve](point query) optimize sendFields since `writeField` is heav… (#18000 ) Saves about 20% of FE CPU cost for point queries with prepared statements on a table that contains 100 columns.	2023-03-23 17:45:56 +08:00
Pxl	f43d2ded0a	[Chore](case) add order by to testIncorrectMVRewriteInSubquery (#18017 ) add order by to testIncorrectMVRewriteInSubquery	2023-03-23 16:39:46 +08:00
morrySnow	20d26397aa	[fix](planner) forbid inline view but not the subquery resolve from parent tuples (#18032 ) In PR #17813 , we wanted to forbid binding slots on a brother's column; however, that fix was not done in the correct way. The correct way is to forbid a subquery from registering itself in its parent's analyzer. This reverts commit b91a3b5a72520105638dad1079b71a05f02c10a0.	2023-03-23 16:11:04 +08:00
qiye	cedd36c786	[improvement](compaction)Support segcompaction for inverted index (#17874 ) Since Doris supports segcompaction #12866 during loading, inverted index support is also needed.	2023-03-23 14:41:30 +08:00
ZhangYu0123	089a91ecd5	[vectorized](function) support array_exists lambda function (#17931 ) Co-authored-by: zhangyu209 <zhangyu209@meituan.com>	2023-03-23 11:11:39 +08:00
Xinyi Zou	ebef0c038d	Revert "[fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420 )" (#17887 ) This reverts commit 397cc011c4f1ba5a25c770258c13f1cd3f28b47d.	2023-03-22 13:28:25 +08:00
mch_ucchi	0e7f0abe61	[test] (Nereids) add regression-test of arithmetic expressions of decimalv3 for nereids (#17549 ) Add regression tests of decimalv3 for Nereids and refactor some suites. Too many suites would be changed, so this PR only adds the arithmetic tests. 1. Some tests are disabled because of unfixed results and precision; in detail, multiplying or dividing a big integer by a float causes the latter, and bit operations cause the former. 2. The disabled tests tagged original planner are caused by unfixed results.	2023-03-22 11:25:49 +08:00
Pxl	40ca250678	[Feature](materialized-view) support where clause on create materialized view (#17534 ) support where clause on create materialized view	2023-03-22 11:25:13 +08:00
Pxl	401836f523	[Bug](planner) fix core dump when lateral view above union node and have predicate (#17912 ) fix core dump when lateral view above union node and have predicate	2023-03-22 11:24:45 +08:00
starocean999	17a1ce5ed3	[fix](nereids) add a project node above sort node to eliminate unused order by keys (#17913 ) If the order by keys in a sort node are not simple slots, the order by exprs have to be added to the sort node's output tuple. In that case, we need to add a project node above the sort node to eliminate the unused order by exprs. For example: ```sql WITH t0 AS (SELECT DATE_FORMAT(date, '%Y%m%d') AS date FROM cir_1756_t1 ), t3 AS (SELECT date_format(date, '%Y%m%d') AS `date` FROM `cir_1756_t2` GROUP BY date_format(date, '%Y%m%d') ORDER BY date_format(date, '%Y%m%d') ) SELECT t0.date FROM t0 LEFT JOIN t3 ON t0.date = t3.date; ``` before: ``` +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalProject[159] ( distinct=false, projects=[date#1], excepts=[], canEliminate=true ) \| \| +--LogicalJoin[158] ( type=LEFT_OUTER_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(date#1 = date#3)], otherJoinConjuncts=[] ) \| \| \|--LogicalProject[151] ( distinct=false, projects=[date_format(date#0, '%Y%m%d') AS `date`#1], excepts=[], canEliminate=true ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t1, indexName=cir_1756_t1, selectedIndexId=412339, preAgg=ON ) \| \| +--LogicalSort[157] ( orderKeys=[date_format(cast(date#3 as DATETIME), '%Y%m%d') asc null first] ) \| \| +--LogicalAggregate[156] ( groupByExpr=[date#3], outputExpr=[date#3], hasRepeat=false ) \| \| +--LogicalProject[155] ( distinct=false, projects=[date_format(date#2, '%Y%m%d') AS `date`#3], excepts=[], canEliminate=true ) \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t2, indexName=cir_1756_t2, selectedIndexId=412352, preAgg=ON ) \| 
+--------------------------------------------------------------------------------------------------------------------------------------------------+ ``` after: ``` +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalProject[171] ( distinct=false, projects=[date#2], excepts=[], canEliminate=true ) \| \| +--LogicalJoin[170] ( type=LEFT_OUTER_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(date#2 = date#4)], otherJoinConjuncts=[] ) \| \| \|--LogicalProject[162] ( distinct=false, projects=[date_format(date#0, '%Y%m%d') AS `date`#2], excepts=[], canEliminate=true ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t1, indexName=cir_1756_t1, selectedIndexId=1049812, preAgg=ON ) \| \| +--LogicalProject[169] ( distinct=false, projects=[date#4], excepts=[], canEliminate=false ) \| \| +--LogicalSort[168] ( orderKeys=[date_format(cast(date#4 as DATETIME), '%Y%m%d') asc null first] ) \| \| +--LogicalAggregate[167] ( groupByExpr=[date#4], outputExpr=[date#4], hasRepeat=false ) \| \| +--LogicalProject[166] ( distinct=false, projects=[date_format(date#3, '%Y%m%d') AS `date`#4], excepts=[], canEliminate=true ) \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t2, indexName=cir_1756_t2, selectedIndexId=1049825, preAgg=ON ) \| +--------------------------------------------------------------------------------------------------------------------------------------------------+ ```	2023-03-22 11:19:32 +08:00
morrySnow	173d68409c	[enhancement](planner) update and delete support use alias for target table (#17914 )	2023-03-22 11:07:39 +08:00
morrySnow	b91a3b5a72	[fix](planner) should not bind slot on brother's tuple in subquery (#17813 ) Consider a query like this: ```sql SELECT k3, k4 FROM test WHERE EXISTS( SELECT d.* FROM (SELECT k1 AS _1234, SUM(k2) FROM `test` d GROUP BY _1234) d LEFT JOIN (SELECT k1 AS _1234, SUM(k2) FROM `test` GROUP BY _1234) temp ON d._1234 = temp._1234) ORDER BY k3, k4 ``` When we analyze the group by exprs in the `temp` inline view, we bind `_1234` to `d._1234` by mistake. That is because, when we analyze a SUB-QUERY, we resolve a SlotRef against both its own tuple AND the parent's tuple; meanwhile, we register the child's tuple in the parent's analyzer. So, in a SUB-QUERY, a brother's tuple affects the resolve result of the current inline view's slot. This PR: 1. adds a flag on the function `resolveColumnRef` in `Analyzer` ```java private TupleDescriptor resolveColumnRef(String colName, boolean requestFromChild); private TupleDescriptor resolveColumnRef(TableName tblName, String colName, boolean requestByChild); ``` 2. adds a flag to specify whether the tuple is from a child. ```java // alias name -> <from child, tupleDesc> private final Multimap<String, Pair<Boolean, TupleDescriptor>> tupleByAlias; ``` When `requestByChild == true`, we SKIP the tuples from other children to avoid a resolve error.	2023-03-22 11:00:55 +08:00
huangzhaowei	8df4a94826	[fix](MTMV) Tasks leak when dropping job (#17984 ) 1. Divide MTMV regression tests into 4 suites 2. Try to remove tasks which were killed by dropping job actions in running map.	2023-03-21 23:22:17 +08:00
zhengshiJ	82716ec99d	[fix](Nereids) type coercion for subquery (#17661 ) Complete the type coercion of the subquery in the function Binder process. Expressions generated when subqueries are nested are uniformly converted to implicit types in the analyze stage. Method: Add a typeCoercionExpr field to the subquery expression to store the generated cast information. Fix scenario where scalarSubQuery handles arithmetic expressions when implicitly converting types	2023-03-21 20:38:06 +08:00
Mellorsssss	4193884a32	[feature](array_zip) Support array_zip function (#17696 )	2023-03-21 18:44:30 +08:00
Xin Liao	61366b21aa	[regression-test](merge-on-write) Optimize merge-on-write case execution time (#17956 )	2023-03-21 12:49:42 +08:00
huangzhaowei	4023670f35	[BugFix](DOE) Add http prefix when it's not set in hosts properties. (#17745 )	2023-03-21 10:08:20 +08:00
HappenLee	7b93c17364	[Bug][Fix] regexp function core dump DCHECK failed and error result (#17953 ) CREATE TABLE `test` ( `name` varchar(64) NULL, `age` int(11) NULL ) ENGINE=OLAP DUPLICATE KEY(`name`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`name`) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); insert into `test` values ("lemon",1),("tom",2); select a.name regexp concat('^', a.name) from test a;	2023-03-21 08:56:19 +08:00
huangzhaowei	bae9d8d7f2	[Feature-Wip](MySQL LOAD)Add trim quotes property for mysql load (#17775 ) Add trim quotes property for mysql load to trim double quotes in the load files.	2023-03-21 00:32:58 +08:00
zhangstar333	dc284b62d9	[vectorized](function) support array_filter function (#17832 )	2023-03-20 23:18:10 +08:00
starocean999	8ffc85b6ff	[fix](planner)project should be done inside inlineview (#17831 ) * [fix](planner)project should be done inside inlineview * add src column for slots in scan node's output tuple	2023-03-20 21:12:45 +08:00
Pxl	a92115f709	[Bug](materialized-view) fix select mv rollback fail on left join (#17850 ) fix select mv rollback fail on left join	2023-03-20 19:14:17 +08:00
Pxl	45232d65a6	[Chore](case) move load big lateral view from p1 to p2 (#17851 ) Move load big lateral view from p1 to p2; this case takes a long time to execute.	2023-03-20 13:10:12 +08:00
AKIRA	5c990fb737	[fix](nereids) Analyze failed for SQL that has count distinct with same col (#17928 ) This problem occurred because slots with the same hash codes were put into a hash set, which resulted in the wrong rules being selected. Use a list instead of a set as the return type of the getDistinctArguments method.	2023-03-19 21:31:47 +08:00
TengJianPing	dfa2528b5e	[fix](bitmap) fix wrong result of bitmap count functions for null values (#17849 ) The result of bitmap count functions was null when there were null values, which is not right.	2023-03-19 11:49:58 +08:00
morrySnow	5f2b68df24	[fix](regression-test) fix unstable regression test cases found in p0 (#17900 )	2023-03-19 10:11:57 +08:00
lihangyu	043f77200f	[Bug](dynamic-table) Fix column alignment logic and support filtering null values when slot is not null (#17842 ) Before this PR, when null values were encountered in columns specified as `NOT NULL`, the null values were not filtered; this behavior does not match the original load behavior. Second, the column alignment logic has a bug: ``` template <typename ColumnInserterFn> void align_variant_by_name_and_type(ColumnObject& dst, const ColumnObject& src, size_t row_cnt, ColumnInserterFn inserter) { CHECK(dst.is_finalized() && src.is_finalized()); // Use rows() here instead of size(), since size() will check_consistency // but we could not check_consistency since num_rows will be upgraded even // if src and dst is empty, we just increase the num_rows of dst and fill // num_rows of default values when meet new data size_t num_rows = dst.rows(); ```	2023-03-17 16:53:30 +08:00
WenYao	bd44cc3f73	[fix](regression-test) move some case in test_query_sys_tables to p2 #17859	2023-03-17 11:26:06 +08:00
Kang	5d3de05976	[feature](map) basic functions for map datatype (#16916 ) basic functions for map datatype: - MAP<K, V> map(K k1, V v1, ...) - BIGINT map_size(MAP<K, V> m) - BOOL map_contains_key(MAP<K, V> m, K k1) - BOOL map_contains_value(MAP<K, V> m, V v1) - ARRAY< K> map_keys(MAP<K, V> m) - ARRAY< V> map_values(MAP<K, V> m)	2023-03-17 10:28:17 +08:00
morrySnow	ffda858f01	[fix](regression) fix unstable test cases and remove redundant cases (#17845 ) 1. aggregate_strategies executes too slowly; use a smaller table valued function to speed it up. 2. Add a p2 case nereids_syntax_p2/aggregate_strategies that uses a larger table valued function to ensure correctness. 3. Remove case nereids_syntax_p0/test_join_nereids since it is redundant with nereids_p0/join/test_join. 4. Remove the unstable case in query_p0/aggregate/aggregate.	2023-03-16 15:59:26 +08:00
amory	ee7226348d	[FIX](Map) fix map compaction error (#17795 ) During compaction, in-memory map offsets arrive at the same olap convertor running from 0 to 0+size, but they should continue across different pages within one segment writer. E.g., last block with map offsets: [3, 6, 8, ... 100]; this block with map offsets: [5, 10, 15 ..., 100]. The same convertor should record the last offset so that later offsets follow it. After the convertor, the current offsets should be [105, 110, 115, ... 200]; the column writer then just calls append_data() to append the correct offset data to the pages.	2023-03-16 13:54:01 +08:00
morrySnow	0086fdbbdb	[enhancement](planner) support delete from using syntax (#17787 ) Support the DELETE ... USING syntax. This syntax only supports the UNIQUE KEY model. Use the result of `t2` join `t3` to remove rows from `t1`: ```sql -- create t1, t2, t3 tables CREATE TABLE t1 (id INT, c1 BIGINT, c2 STRING, c3 DOUBLE, c4 DATE) UNIQUE KEY (id) DISTRIBUTED BY HASH (id) PROPERTIES('replication_num'='1', "function_column.sequence_col" = "c4"); CREATE TABLE t2 (id INT, c1 BIGINT, c2 STRING, c3 DOUBLE, c4 DATE) DISTRIBUTED BY HASH (id) PROPERTIES('replication_num'='1'); CREATE TABLE t3 (id INT) DISTRIBUTED BY HASH (id) PROPERTIES('replication_num'='1'); -- insert data INSERT INTO t1 VALUES (1, 1, '1', 1.0, '2000-01-01'), (2, 2, '2', 2.0, '2000-01-02'), (3, 3, '3', 3.0, '2000-01-03'); INSERT INTO t2 VALUES (1, 10, '10', 10.0, '2000-01-10'), (2, 20, '20', 20.0, '2000-01-20'), (3, 30, '30', 30.0, '2000-01-30'), (4, 4, '4', 4.0, '2000-01-04'), (5, 5, '5', 5.0, '2000-01-05'); INSERT INTO t3 VALUES (1), (4), (5); -- remove rows from t1 DELETE FROM t1 USING t2 INNER JOIN t3 ON t2.id = t3.id WHERE t1.id = t2.id; ``` The expected result is that only the row where id = 1 is removed from table t1: ``` +----+----+----+--------+------------+ \| id \| c1 \| c2 \| c3 \| c4 \| +----+----+----+--------+------------+ \| 2 \| 2 \| 2 \| 2.0 \| 2000-01-02 \| \| 3 \| 3 \| 3 \| 3.0 \| 2000-01-03 \| +----+----+----+--------+------------+ ```	2023-03-16 13:12:00 +08:00
meiyi	1da3e7596e	[fix](point query) Fix NegativeArraySizeException when prepared statement contains a long string (#17651 )	2023-03-16 10:24:33 +08:00
ZhangYu0123	a53d46e317	[Fix](array function) fix array_pushfront function with DecimalV3 #17760 Support array_pushfront function with DecimalV3	2023-03-16 09:03:52 +08:00
Gabriel	079e6a3e12	[regression-test](vectorized) remove unused vectorization flag (#17662 )	2023-03-15 17:59:22 +08:00
morrySnow	049b70b957	[test](Nereids) add yandex metrica p2 regression case (#17082 )	2023-03-15 11:50:00 +08:00
ZhaoChangle	66f3ef568e	(functions) optimize const_column to full convert	2023-03-15 10:57:03 +08:00
zhangstar333	85080ee3c3	[vectorized](function) support array_map function (#17581 )	2023-03-15 10:51:29 +08:00
morrySnow	5ab758674e	[fix](planner) nested loop join with left semi generate repeat result (#17767 )	2023-03-15 09:56:44 +08:00
TengJianPing	64c2437be5	[fix](coalesce) support coalesce function for bitmap (#17798 )	2023-03-15 09:34:44 +08:00
morrySnow	699159698e	[enhancement](planner) support update from syntax (#17639 ) support update from syntax note: enable_concurrent_update is not supported now ``` UPDATE <target_table> SET <col_name> = <value> [ , <col_name> = <value> , ... ] [ FROM <additional_tables> ] [ WHERE <condition> ] ``` for example: t1 ``` +----+----+----+-----+------------+ \| id \| c1 \| c2 \| c3 \| c4 \| +----+----+----+-----+------------+ \| 3 \| 3 \| 3 \| 3.0 \| 2000-01-03 \| \| 2 \| 2 \| 2 \| 2.0 \| 2000-01-02 \| \| 1 \| 1 \| 1 \| 1.0 \| 2000-01-01 \| +----+----+----+-----+------------+ ``` t2 ``` +----+----+----+------+------------+ \| id \| c1 \| c2 \| c3 \| c4 \| +----+----+----+------+------------+ \| 4 \| 4 \| 4 \| 4.0 \| 2000-01-04 \| \| 2 \| 20 \| 20 \| 20.0 \| 2000-01-20 \| \| 5 \| 5 \| 5 \| 5.0 \| 2000-01-05 \| \| 1 \| 10 \| 10 \| 10.0 \| 2000-01-10 \| \| 3 \| 30 \| 30 \| 30.0 \| 2000-01-30 \| +----+----+----+------+------------+ ``` t3 ``` +----+ \| id \| +----+ \| 1 \| \| 5 \| \| 4 \| +----+ ``` do update ```sql update t1 set t1.c1 = t2.c1, t1.c3 = t2.c3 * 100 from t2 inner join t3 on t2.id = t3.id where t1.id = t2.id; ``` the result ``` +----+----+----+--------+------------+ \| id \| c1 \| c2 \| c3 \| c4 \| +----+----+----+--------+------------+ \| 3 \| 3 \| 3 \| 3.0 \| 2000-01-03 \| \| 2 \| 2 \| 2 \| 2.0 \| 2000-01-02 \| \| 1 \| 10 \| 1 \| 1000.0 \| 2000-01-01 \| +----+----+----+--------+------------+ ```	2023-03-14 19:26:30 +08:00
spaces-x	5b39fa9843	[Feature](vec)(quantile_state): support quantile state in vectorized engine (#16562 ) * [Feature](vectorized)(quantile_state): support vectorized quantile state functions 1. now quantile column only support not nullable 2. add up some regression test cases 3. set default enable_quantile_state_type = true --------- Co-authored-by: spaces-x <weixiang06@meituan.com>	2023-03-14 10:54:04 +08:00
weij	ba0f5a2355	[test](mv) Add mv case from fe ut (#17204 ) add some mv case from fe ut MaterializedViewFunctionTest	2023-03-14 10:29:43 +08:00

962 Commits