doris

Author	SHA1	Message	Date
morrySnow	99bd5ec022	[fix](Nereids) fix some bugs in Subquery to window rule (#18233 ) We introduced this rule in PR #17968, but some corner cases were not processed correctly. This PR fixes these bugs: 1. fix the window function generation method by replacing the inner slot with the equivalent outer slot 2. forbid the following scenarios: a. inner has a mapping project b. inner has an unexpected filter c. outer has a mapping project d. outer has an unexpected filter e. outer has an additional table f. outer has the same table g. outer and inner have different join conditions h. outer and inner have the same table with different join conditions	2023-03-30 16:09:16 +08:00
amory	ea41d94582	[Improve](complex-type) Support Count(complexType) (#17868 ) Support count function for ARRAY/MAP/STRUCT type	2023-03-30 15:43:32 +08:00
huanghaibin	e3bd812887	[fix](stream-load) find line delimiter in csv should start with no offset (#18161 ) When loading a big file with a multi-byte line delimiter, some line records may be incomplete because of _output_buf_limit, so the incomplete data is moved to the beginning of the output buf and more data is read into the output buf. In this case, the search for the line delimiter should start with no offset, to avoid a bug that treats two lines as one line.	2023-03-30 14:42:34 +08:00
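The buffering behaviour described in the commit above can be illustrated with a minimal Python sketch. It is not the Doris reader code: `split_lines` and its arguments are hypothetical names, and the sketch only shows why the delimiter search restarts at the beginning of the buffer once leftover bytes have been carried over, so that a multi-byte delimiter straddling two reads is still found.

```python
def split_lines(chunks, delimiter):
    """Split a stream of byte chunks into records on a multi-byte delimiter.

    Hypothetical helper for illustration; not taken from the Doris code base.
    """
    buf = b""
    lines = []
    for chunk in chunks:
        # The incomplete tail of the previous read sits at the front of buf.
        buf += chunk
        start = 0  # search from the beginning of the buffer, with no offset
        while True:
            pos = buf.find(delimiter, start)
            if pos < 0:
                break
            lines.append(buf[start:pos])
            start = pos + len(delimiter)
        # Carry the incomplete record over to the next read.
        buf = buf[start:]
    if buf:
        lines.append(buf)
    return lines


# The delimiter "||" is split across the two chunks.
records = split_lines([b"a,b|", b"|c,d||e"], b"||")
```

Had the search skipped the carried-over bytes and started at the first newly read byte, the first `||` would have been missed and `a,b` and `c,d` would have been returned as a single record.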
Pxl	c8ad62a3cd	[Enhancement](materialized-view) enhance materialized view where clause match (#18179 ) enhance materialized view where clause match	2023-03-30 13:02:21 +08:00
zhangstar333	525f15dddf	[vectorized](function) support array_sortby function (#18071 )	2023-03-30 11:07:49 +08:00
TengJianPing	9877143210	[fix](like) fix wrong result of like pattern with backslash (#18039 ) Result is empty for query select * from person where address like '%\\\\%';, but MySQL returns one row. CREATE TABLE `person` ( `id` int(11) NULL, `name` text NULL, `age` int(11) NULL, `class` int(11) NULL, `address` text NULL ) ENGINE=OLAP UNIQUE KEY(`id`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`id`) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); insert into person values (10001,'test1',30,2,'test\\\\,xxx'); Adding logs: select * from person where address like '%\\\\%'; I0323 10:26:15.907760 2387043 like.cpp:558] arg str: %\\%, size: 4, pattern LIKE_ENDS_WITH_RE: (?:%+)(((\\%)\|(\\_)\|([^%_]))+), size: 30 I0323 10:26:15.907789 2387043 like.cpp:562] match 0: \\%, size: 3 I0323 10:26:15.907801 2387043 like.cpp:562] match 1: \%, size: 2 I0323 10:26:15.907811 2387043 like.cpp:562] match 2: \%, size: 2 I0323 10:26:15.907821 2387043 like.cpp:562] match 3: , size: 0 I0323 10:26:15.907830 2387043 like.cpp:562] match 4: \, size: 1 I0323 10:26:15.907842 2387043 like.cpp:615] search_string : \\% I0323 10:26:15.907855 2387043 like.cpp:619] search_string escape removed: \% The pattern is matched against LIKE_ENDS_WITH_RE, which is wrong; the SQL should mean: match strings that have one backslash in any place.	2023-03-30 11:05:09 +08:00
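The intended semantics in the commit above (an escaped backslash in a LIKE pattern matches one literal backslash) can be sketched in Python. This is an illustrative translation of a LIKE pattern into a regular expression, assuming backslash is the escape character as in MySQL's default; `like_to_regex` and `like` are hypothetical names, not functions from like.cpp.

```python
import re


def like_to_regex(pattern, escape="\\"):
    """Translate a SQL LIKE pattern into a compiled regex (illustrative only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # An escaped character (including an escaped backslash) is literal.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            out.append(".*")   # any run of characters
        elif ch == "_":
            out.append(".")    # exactly one character
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out), re.DOTALL)


def like(value, pattern):
    return like_to_regex(pattern).fullmatch(value) is not None


# The pattern seen by the LIKE function is the 4 characters  % \ \ %
pattern = "%\\\\%"
has_backslash = like("test\\,xxx", pattern)
no_backslash = like("test,xxx", pattern)
```

Under this reading the pattern is "anything, one literal backslash, anything", so it is not an ends-with pattern and a row whose address contains a backslash should match.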
TengJianPing	3b04d42779	[fix](bitmap) fix bug: orthogonal_bitmap_union_count coredump when arg is nullable (#18182 ) Query that causes a BE core dump: select ORTHOGONAL_BITMAP_UNION_COUNT( cast(null as bitmap)) from t;	2023-03-30 09:31:58 +08:00
Xinyi Zou	6964d9f99c	[fix](function) resubmit-fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17907 ) * Revert "[fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420)" This reverts commit 397cc011c4f1ba5a25c770258c13f1cd3f28b47d. * [fix-resubmit](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420) ECB algorithm: block_encryption_mode does not take effect; it only takes effect when an init vector is provided. Solved: 192/256 supports calculation without an init vector. For other algorithms, an error should be reported when there is no init vector. Initialization Vector. The default value for the block_encryption_mode system variable is aes-128-ecb, or ECB mode, which does not require an initialization vector. The alternative permitted block encryption modes CBC, CFB1, CFB8, CFB128, and OFB all require an initialization vector. Reference: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_aes-decrypt Note: This fix does not support smooth upgrades. During the upgrade process, queries may report the error: function not found	2023-03-29 21:13:01 +08:00
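The rule stated in the commit above (ECB needs no initialization vector; CBC, CFB1, CFB8, CFB128 and OFB do) can be written as a small check. The sketch below is not Doris code: `requires_iv` and `validate_iv` are hypothetical names, and it assumes mode strings of the form `aes-128-ecb`, as used for the `block_encryption_mode` variable.

```python
# Block modes that need an initialization vector, per the commit message.
MODES_REQUIRING_IV = {"cbc", "cfb1", "cfb8", "cfb128", "ofb"}


def requires_iv(block_encryption_mode):
    """Return True if the mode suffix (e.g. 'cbc' in 'aes-256-cbc') needs an IV."""
    mode = block_encryption_mode.lower().rsplit("-", 1)[-1]
    return mode in MODES_REQUIRING_IV


def validate_iv(block_encryption_mode, iv=None):
    """Raise ValueError when a mode that needs an IV is used without one."""
    if requires_iv(block_encryption_mode) and not iv:
        raise ValueError(
            f"{block_encryption_mode} requires an initialization vector"
        )


validate_iv("aes-128-ecb")                         # fine: ECB needs no IV
validate_iv("aes-256-cbc", iv=b"0123456789abcdef")  # fine: IV supplied
```

Calling `validate_iv("aes-256-cbc")` without an IV raises `ValueError`, which corresponds to the "an error should be reported" case in the message.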
zhengshiJ	b92087dee8	[Fix](Nereids) ReorderJoin rule cannot process MarkJoin correctly (#18159 ) Fix two problems: 1. A logical join containing the MarkJoinSlotReference column generates a plan->MarkJoinSlotReference map when reorderJoin is executed, and the MarkJoinSlotReference column is restored after the reorder is completed. But when filter+crossJoin exists, it is transformed into innerJoin by the rules, so the map lookup fails, the corresponding plan cannot be found, and the MarkJoinSlotReference column is lost. 2. Originally, the MarkJoinSlotReference column was used as the NonUserVisibleOutput of logicalJoin. At the same time, when logicalApply was generated, the added logicalProject did not include the MarkJoinSlotReference column, and the invalid logicalProject was deleted by other rules, so as to ensure that LogicalApply was under the logicalFilter and could recognize the MarkJoinSlotReference column. But there are problems if the logicalProject cannot be deleted. Repair method: 1. For a logicalJoin containing MarkJoinSlotReference, the rules of reorderJoin are not executed. 2. Use MarkJoinSlotReference as the output of logicalJoin and also as the output of LogicalApply. 3. When generating LogicalApply, if MarkJoinSlotReference is included, add an additional logicalProject above the logicalFilter to remove the MarkJoinSlotReference column. 
eg ``` logicalFilter(subquery with disconjunct) after SubqueryToApply logicalProject(without markJoinSlotReference) +-- logicalFilter(markJoinSlotReference) +-- logicalProject(with markJoinSlotReference) +-- logicalApply() ``` ``` SELECT * FROM sub_query_correlated_subquery1 WHERE k1 IN (SELECT k1 FROM sub_query_correlated_subquery3) OR k1 < 10; +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalProject[60] ( distinct=false, projects=[k1#0, k2#1], excepts=[], canEliminate=true ) \| \| +--LogicalProject[59] ( distinct=false, projects=[k1#0, k2#1], excepts=[], canEliminate=true ) \| \| +--LogicalFilter[58] ( predicates=($c$1#7#false OR (k1#0 < 10)) ) \| \| +--LogicalProject[57] ( distinct=false, projects=[k1#0, k2#1, $c$1#7#false], excepts=[], canEliminate=true ) \| \| +--LogicalApply ( correlationSlot=[], correlationFilter=Optional.empty, isMarkJoin=true, MarkJoinSlotReference=$c$1#7#false, scalarSubCorrespondingSlot=empty ) \| \| \|--LogicalOlapScan ( qualified=default_cluster:regression_test_nereids_syntax_p0.sub_query_correlated_subquery1, indexName=<index_not_selected>, selectedIndexId=63105, preAgg=ON ) \| \| +--LogicalProject[34] ( distinct=false, projects=[k1#2], excepts=[], canEliminate=true ) \| \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_nereids_syntax_p0.sub_query_correlated_subquery3, indexName=<index_not_selected>, selectedIndexId=63115, preAgg=ON ) \| +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ ```	
2023-03-29 16:12:42 +08:00
Gabriel	c0797c4be3	[test](decimal) Update output data for P1 regression (#18199 )	2023-03-29 15:13:12 +08:00
Pxl	0c01df6bb2	[Bug](view) fix AES_ENCRYPT have wrong result on view (#18034 )	2023-03-29 10:49:39 +08:00
Liqf	012f7bd031	[feature](function)Add ST_Area function (#18138 )	2023-03-28 19:36:09 +08:00
Jerry Hu	d27201f331	[fix](nested_loop_join)got incorrect result from nested loop join without condition (#18139 )	2023-03-28 16:20:05 +08:00
zhengyu	ba1b159ad2	[fix](regression) deal with output order and timeout for segcompaction p1 (#18162 ) 1. Add `order by` to regulate the output order to avoid false-negative mismatch for dup table. 2. Increase load timeout. Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2023-03-28 16:00:27 +08:00
Tiewei Fang	d7dcdfcba9	[Fix](Create View) support create view from tvf (#18087 ) Support create view as select * from tvf()	2023-03-28 15:07:32 +08:00
xueweizhang	1956f04aa2	[feature](multi-catalog) add specified_database_list PROPERTY for jdbc/hms/iceberg catalog (#17803 ) add specified_database_list PROPERTY for jdbc catalog, user can use many database specified by jdbc catalog	2023-03-28 14:04:41 +08:00
Pxl	d2839eb41f	[Chore](Materialized-View) add some mv regression test case (#18095 ) add some mv regression test case	2023-03-28 10:31:37 +08:00
herry2038	09e346e47c	[fix](type) Data precision is lost when converting DOUBLE type data to DECIMAL (#17191 ) (#17562 ) 1. Fix bug when converting DOUBLE to DECIMAL; 2. Fix bug when converting DOUBLE to DECIMALV3;	2023-03-28 09:46:43 +08:00
TengJianPing	c95b81f950	[fix](order by) fix bug of order by desc when rowsets are non-overlapping (#18100 ) When rowsets are non-overlapping and sorting is desc, the logic of VCollectIterator::Level0Iterator::init_for_union is followed. In this function, the row ref pos of the first level0 iterator is set to 0, and the row pos of the other level0 iterators are all set to -1. But in the level1iterator, when rowsets are non-overlapping and the ordering is desc, the list of rowset iterators is reversed, so the row ref pos of the first level0 iterator in the list is -1, causing the block reader to think that the entire tablet has no data.	2023-03-28 09:31:37 +08:00
zhangstar333	99427d409d	[vectorized](udaf) fix java-udaf case is unstable with fuzzy mode #18146 The udaf case is unstable for this reason: when fuzzy mode sets enable_pipeline_engine=true, the agg function has only 1 instance, so the default value is not merged; but if instance>1, the default value is merged	2023-03-28 09:30:49 +08:00
ZhangYu0123	115e52c16c	[Opt](array) optimize_array_sort (#18123 )	2023-03-27 22:01:24 +08:00
gitccl	ee80c12815	[feature](json) add json_extract function (#17808 )	2023-03-27 21:19:47 +08:00
Liqf	bcf95cd920	[feature](function)Add ST_Angle_Sphere function (#17919 )	2023-03-27 10:14:46 +08:00
TengJianPing	78abb40fdc	[improvement](string) throw exception instead of log fatal if string column exceeds total size limit (#17989 ) Throw an exception instead of logging fatal if a string column exceeds the total size limit, so that we can catch it and let the query fail instead of causing BE to exit.	2023-03-27 08:55:26 +08:00
ZashJie	2a0890d803	[feature](datatype) add show data types stmt (#18111 )	2023-03-26 12:37:06 +08:00
Yisong Han	df0eca4003	[improvement] (schema change) Lightweight schema change of modify column with varchar length (#17207 ) Signed-off-by: Yisong Han <yisong8686@gmail.com>	2023-03-25 22:38:19 +08:00
ZhangYu0123	360d3050bc	[Feature](array-function) Support array_reverse_sort function (#17754 ) Co-authored-by: zhangyu209 <zhangyu209@meituan.com>	2023-03-25 21:58:11 +08:00
xueweizhang	50eeb2d9a4	[fix](json) change int to bigint for json function (#17769 )	2023-03-25 21:57:29 +08:00
Pxl	a8753faeb1	[Bug](function) fix column complex not resize after filter (#18043 )	2023-03-25 21:48:13 +08:00
Jerry Hu	f84481886b	[feature](string_functions) The 'split_part' function supports non-constant parameters (#18029 )	2023-03-25 12:03:11 +08:00
Gabriel	2408ca5da8	[Bug](DECIMALV3) Fix wrong precision for plus/minus (#18052 ) Result type for DECIMAL(x, y) plus/minus DECIMAL(m, n) should be DECIMAL(max(x - y, m - n) + max(y, n) + 1, max(y, n))	2023-03-25 09:42:39 +08:00
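The result-type rule in the commit above can be checked with a few lines of Python. This is a sketch of the arithmetic only, taking the result scale as the larger of the two input scales; the function name is hypothetical, and any maximum-precision cap the engine may apply is deliberately ignored.

```python
def decimal_add_result_type(x, y, m, n):
    """Return (precision, scale) for DECIMAL(x, y) +/- DECIMAL(m, n).

    Illustrative helper, not part of Doris.
    """
    scale = max(y, n)                    # keep every fractional digit
    integer_digits = max(x - y, m - n)   # widest integer part of the inputs
    # One extra digit absorbs a possible carry out of the integer part.
    return integer_digits + scale + 1, scale


result = decimal_add_result_type(10, 2, 8, 4)
```

For DECIMAL(10, 2) plus DECIMAL(8, 4) this gives 8 integer digits, a scale of 4, and one carry digit, that is DECIMAL(13, 4).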
HappenLee	473f0c45ff	[Bug](delete) Fix bug of delete partition prune error (#18057 )	2023-03-24 20:22:12 +08:00
Pxl	8249441335	[Bug](planner) add conjunct slotref id to table function node to avoid result incorrect (#18063 ) add conjunct slotref id to table function node to avoid result incorrect	2023-03-24 14:48:03 +08:00
zhangstar333	2a35adbba8	[vectorized](udaf) fix java-udaf case of P0 is unstable (#18054 ) The udaf case is unstable for this reason: when enable_pipeline_engine=true, the agg function has only 1 instance, so the default value is not merged; but if instance>1, the default value is merged	2023-03-24 09:10:58 +08:00
924060929	d3e7f12ada	[refactor](Nereids) refactor column pruning (#17579 ) This PR refactors column pruning to use a visitor. Benefits: 1. It is easy to add column pruning for a new plan: implement the interface `OutputPrunable` if the plan contains an output field, or do nothing if it does not. There is no need to add a new rule like `PruneXxxChildColumns`, and only a few scenarios need to override the visit function with special logic, such as pruning LogicalSetOperation and Aggregate. 2. It supports shrinking the output field in some plans, which skips useless operations and so improves performance. Example: ```sql select id from ( select id, sum(age) from student group by id )a ``` We should prune the useless `sum (age)` in the aggregate. Before refactor: ``` LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalSubQueryAlias ( qualifier=[a] ) +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0, sum(age#2) AS `sum(age)`#4], hasRepeat=false ) +--LogicalProject ( distinct=false, projects=[id#0, age#2], excepts=[], canEliminate=true ) +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON ) ``` After refactor: ``` LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalSubQueryAlias ( qualifier=[a] ) +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0], hasRepeat=false ) +--LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON ) ```	2023-03-24 09:00:48 +08:00
slothever	678314d657	[fix](regression)fix glue regression (#17952 )	2023-03-24 00:10:20 +08:00
morrySnow	c1bd5b26a8	[refactor](Nereids) expression translate no longer relies on legacy planner code (#17671 )	2023-03-23 23:05:15 +08:00
Gabriel	5445a86570	[Bug](array_product) Fix array_product for ARRAY<DECIMAL> (#18014 )	2023-03-23 20:29:50 +08:00
Xin Liao	bdff9a7a7b	[regression-test](merge-on-write) Optimize merge-on-write case (#18038 )	2023-03-23 17:59:49 +08:00
lihangyu	4c5ba4bb01	[Improve](point query) optimize sendFields since `writeField` is heav… (#18000 ) Saves about 20% of FE CPU cost for point queries with prepared statements on a table with 100 columns	2023-03-23 17:45:56 +08:00
Pxl	f43d2ded0a	[Chore](case) add order by to testIncorrectMVRewriteInSubquery (#18017 ) add order by to testIncorrectMVRewriteInSubquery	2023-03-23 16:39:46 +08:00
morrySnow	20d26397aa	[fix](planner) forbid inline view but not the subquery resolve from parent tuples (#18032 ) In PR #17813, we wanted to forbid binding a slot to a sibling's column; however, the fix was not done the correct way. The correct way is to forbid the subquery from registering itself in the parent's analyzer. This reverts commit b91a3b5a72520105638dad1079b71a05f02c10a0.	2023-03-23 16:11:04 +08:00
qiye	cedd36c786	[improvement](compaction)Support segcompaction for inverted index (#17874 ) Since Doris supports segcompaction #12866 during loading, inverted index support is also needed.	2023-03-23 14:41:30 +08:00
ZhangYu0123	089a91ecd5	[vectorized](function) support array_exists lambda function (#17931 ) Co-authored-by: zhangyu209 <zhangyu209@meituan.com>	2023-03-23 11:11:39 +08:00
Xinyi Zou	ebef0c038d	Revert "[fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420 )" (#17887 ) This reverts commit 397cc011c4f1ba5a25c770258c13f1cd3f28b47d.	2023-03-22 13:28:25 +08:00
mch_ucchi	0e7f0abe61	[test] (Nereids) add regression-test of arithmetic expressions of decimalv3 for nereids (#17549 ) Add regression tests of decimalv3 for nereids and refactor some suites. Too many suites would be changed, so in this PR we add only the arithmetic tests. 1. Some tests are disabled because of unfixed results and precision; in detail, multiplying or dividing a big integer by a float causes the latter, and bit operations cause the former. 2. The disabled tests tagged original planner are caused by unfixed results.	2023-03-22 11:25:49 +08:00
Pxl	40ca250678	[Feature](materialized-view) support where clause on create materialized view (#17534 ) support where clause on create materialized view	2023-03-22 11:25:13 +08:00
Pxl	401836f523	[Bug](planner) fix core dump when lateral view above union node and have predicate (#17912 ) fix core dump when lateral view above union node and have predicate	2023-03-22 11:24:45 +08:00
starocean999	17a1ce5ed3	[fix](nereids) add a project node above sort node to eliminate unused order by keys (#17913 ) If the order by keys are not simple slots in the sort node, the order by exprs have to be added to the sort node's output tuple. In that case, we need to add a project node above the sort node to eliminate the unused order by exprs. For example: ```sql WITH t0 AS (SELECT DATE_FORMAT(date, '%Y%m%d') AS date FROM cir_1756_t1 ), t3 AS (SELECT date_format(date, '%Y%m%d') AS `date` FROM `cir_1756_t2` GROUP BY date_format(date, '%Y%m%d') ORDER BY date_format(date, '%Y%m%d') ) SELECT t0.date FROM t0 LEFT JOIN t3 ON t0.date = t3.date; ``` before: ``` +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalProject[159] ( distinct=false, projects=[date#1], excepts=[], canEliminate=true ) \| \| +--LogicalJoin[158] ( type=LEFT_OUTER_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(date#1 = date#3)], otherJoinConjuncts=[] ) \| \| \|--LogicalProject[151] ( distinct=false, projects=[date_format(date#0, '%Y%m%d') AS `date`#1], excepts=[], canEliminate=true ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t1, indexName=cir_1756_t1, selectedIndexId=412339, preAgg=ON ) \| \| +--LogicalSort[157] ( orderKeys=[date_format(cast(date#3 as DATETIME), '%Y%m%d') asc null first] ) \| \| +--LogicalAggregate[156] ( groupByExpr=[date#3], outputExpr=[date#3], hasRepeat=false ) \| \| +--LogicalProject[155] ( distinct=false, projects=[date_format(date#2, '%Y%m%d') AS `date`#3], excepts=[], canEliminate=true ) \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t2, indexName=cir_1756_t2, selectedIndexId=412352, preAgg=ON ) \| 
+--------------------------------------------------------------------------------------------------------------------------------------------------+ ``` after: ``` +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +--------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalProject[171] ( distinct=false, projects=[date#2], excepts=[], canEliminate=true ) \| \| +--LogicalJoin[170] ( type=LEFT_OUTER_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(date#2 = date#4)], otherJoinConjuncts=[] ) \| \| \|--LogicalProject[162] ( distinct=false, projects=[date_format(date#0, '%Y%m%d') AS `date`#2], excepts=[], canEliminate=true ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t1, indexName=cir_1756_t1, selectedIndexId=1049812, preAgg=ON ) \| \| +--LogicalProject[169] ( distinct=false, projects=[date#4], excepts=[], canEliminate=false ) \| \| +--LogicalSort[168] ( orderKeys=[date_format(cast(date#4 as DATETIME), '%Y%m%d') asc null first] ) \| \| +--LogicalAggregate[167] ( groupByExpr=[date#4], outputExpr=[date#4], hasRepeat=false ) \| \| +--LogicalProject[166] ( distinct=false, projects=[date_format(date#3, '%Y%m%d') AS `date`#4], excepts=[], canEliminate=true ) \| \| +--LogicalOlapScan ( qualified=default_cluster:bugfix.cir_1756_t2, indexName=cir_1756_t2, selectedIndexId=1049825, preAgg=ON ) \| +--------------------------------------------------------------------------------------------------------------------------------------------------+ ```	2023-03-22 11:19:32 +08:00
morrySnow	173d68409c	[enhancement](planner) update and delete support use alias for target table (#17914 )	2023-03-22 11:07:39 +08:00

994 Commits