doris

Author	SHA1	Message	Date
starocean999	cd70c37402	[fix](nereids) filter and project node should be pushed down through cte (#20508 ) 1.move PushdownFilterThroughCTEAnchor and PushdownProjectThroughCTEAnchor into PUSH_DOWN_FILTERS rule set 2.move PushdownFilterThroughProject before MergeProjectPostProcessor	2023-06-07 10:36:32 +08:00
AKIRA	cd0379df4e	[fix](nereids) select with specified partition name does not work as expected (#20269 ) This PR fixes the select-specific-partition issue; certain code related to this feature was accidentally deleted.	2023-06-05 12:48:54 +08:00
Mryange	519f01133a	[feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811 )	2023-06-01 13:09:58 +08:00
Gabriel	55ccddb62c	[Conf](decimalv3) enable decimalv3 by default	2023-05-29 15:38:31 +08:00
zhengshiJ	970efdc1cb	[Feature](Nereids) support advanced materialized view (#19650 ) Increase the functionality of advanced materialized view. This feature is already supported by the legacy planner with PR #19650. This PR implements it in Nereids. This PR implements the features below: 1. Support multiple columns in aggregate function. eg: select sum(c1 + c2) from t1; 2. Support complex expressions. eg: select abs(c1), sum(abc(c1+1) + 1) from t1; TODO: 1. Support adding where in materialized view	2023-05-29 10:37:44 +08:00
morrySnow	f1b949ad59	[fix](Nereids) local sort should not translate to unpartitioned partition (#20031 ) 1. local sort should not update current fragment partition to UNPARTITIONED 2. should set input fragment dest exchange node after create dest fragment	2023-05-26 10:18:56 +08:00
starocean999	0dce725120	[fix](nereids)fix decimalv3 type error of mod operator (#20039 )	2023-05-25 17:25:11 +08:00
starocean999	c41b486e7e	[fix](nereids) LogicalProject should always have non-empty project list (#18863 )	2023-04-21 14:28:07 +08:00
Gabriel	5300b21db7	[Bug](DECIMALV3) report failure if a decimal value is overflow (#18336 )	2023-04-17 13:18:14 +08:00
starocean999	a9f9366736	[fix](nereids) the data type of compareExpr and listQuery should be the same when creating InSubquery (#18539 ) Consider sql select table_B_alias.b from table_B_alias where table_B_alias.b in ( select a from table_A_alias ); if table_B_alias.b is int and table_A_alias.a is bigint, we should cast(b as bigint) to make the data type the same as the InSubquery.	2023-04-12 20:02:37 +08:00
starocean999	735cd15a3d	[fix](nereids) PushdownAliasThroughJoin should handle same column with different alias in project list (#18470 )	2023-04-10 11:50:37 +08:00
zhengshiJ	b92087dee8	[Fix](Nereids) ReorderJoin rule cannot process MarkJoin correctly (#18159 ) Fix two problems: 1. When reorderJoin is executed, a logical join containing the MarkJoinSlotReference column generates a plan->MarkJoinSlotReference structure, and the MarkJoinSlotReference column is restored after the reorder is completed. But when filter+crossJoin exists, it is transformed into innerJoin in the rules, causing the map lookup to fail: the corresponding plan cannot be found, and the MarkJoinSlotReference column is lost. 2. Originally, the MarkJoinSlotReference column was used as the NonUserVisibleOutput of logicalJoin. At the same time, when logicalApply was generated, the added logicalProject did not include the MarkJoinSlotReference column, and the invalid logicalProject was deleted by other rules, so as to ensure that LogicalApply was under the logicalFilter and could recognize the MarkJoinSlotReference column. But there are problems if the logicalProject cannot be deleted. Repair method: 1. For a logicalJoin containing MarkJoinSlotReference, the rules of reorderJoin are not executed. 2. Use MarkJoinSlotReference as the output of logicalJoin and also as the output of LogicalApply. 3. When generating LogicalApply, if MarkJoinSlotReference is included, add an additional logicalProject above the logicalFilter to remove the MarkJoinSlotReference column. 
eg ``` logicalFilter(subquery with disconjunct) after SubqueryToApply logicalProject(without markJoinSlotReference) +-- logicalFilter(markJoinSlotReference) +-- logicalProject(with markJoinSlotReference) +-- logicalApply() ``` ``` SELECT * FROM sub_query_correlated_subquery1 WHERE k1 IN (SELECT k1 FROM sub_query_correlated_subquery3) OR k1 < 10; +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String \| +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ \| LogicalProject[60] ( distinct=false, projects=[k1#0, k2#1], excepts=[], canEliminate=true ) \| \| +--LogicalProject[59] ( distinct=false, projects=[k1#0, k2#1], excepts=[], canEliminate=true ) \| \| +--LogicalFilter[58] ( predicates=($c$1#7#false OR (k1#0 < 10)) ) \| \| +--LogicalProject[57] ( distinct=false, projects=[k1#0, k2#1, $c$1#7#false], excepts=[], canEliminate=true ) \| \| +--LogicalApply ( correlationSlot=[], correlationFilter=Optional.empty, isMarkJoin=true, MarkJoinSlotReference=$c$1#7#false, scalarSubCorrespondingSlot=empty ) \| \| \|--LogicalOlapScan ( qualified=default_cluster:regression_test_nereids_syntax_p0.sub_query_correlated_subquery1, indexName=<index_not_selected>, selectedIndexId=63105, preAgg=ON ) \| \| +--LogicalProject[34] ( distinct=false, projects=[k1#2], excepts=[], canEliminate=true ) \| \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_nereids_syntax_p0.sub_query_correlated_subquery3, indexName=<index_not_selected>, selectedIndexId=63115, preAgg=ON ) \| +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ ```	
2023-03-29 16:12:42 +08:00
924060929	d3e7f12ada	[refactor](Nereids) refactor column pruning (#17579 ) This PR refactors column pruning to use a visitor. The benefits: 1. It is easy to provide column pruning for a new plan: implement the interface `OutputPrunable` if the plan contains an output field, or do nothing if it does not. There is no need to add a new rule like `PruneXxxChildColumns`; only a few scenarios need to override the visit function to write special logic, such as pruning the LogicalSetOperation and Aggregate. 2. It supports shrinking the output field in some plans, which can skip some useless operations. example: ```sql select id from ( select id, sum(age) from student group by id )a ``` we should prune the useless `sum (age)` in the aggregate. before refactor: ``` LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalSubQueryAlias ( qualifier=[a] ) +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0, sum(age#2) AS `sum(age)`#4], hasRepeat=false ) +--LogicalProject ( distinct=false, projects=[id#0, age#2], excepts=[], canEliminate=true ) +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON ) ``` after refactor: ``` LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalSubQueryAlias ( qualifier=[a] ) +--LogicalAggregate ( groupByExpr=[id#0], outputExpr=[id#0], hasRepeat=false ) +--LogicalProject ( distinct=false, projects=[id#0], excepts=[], canEliminate=true ) +--LogicalOlapScan ( qualified=default_cluster:test.student, indexName=<index_not_selected>, selectedIndexId=10007, preAgg=ON ) ```	2023-03-24 09:00:48 +08:00
zhengshiJ	82716ec99d	[fix](Nereids) type coercion for subquery (#17661 ) Complete the type coercion of the subquery in the function Binder process. Expressions generated when subqueries are nested are uniformly converted to implicit types in the analyze stage. Method: Add a typeCoercionExpr field to the subquery expression to store the generated cast information. Fix scenario where scalarSubQuery handles arithmetic expressions when implicitly converting types	2023-03-21 20:38:06 +08:00
AKIRA	5c990fb737	[fix](nereids) Analyze failed for SQL that has count distinct with same col (#17928 ) This problem occurs because slots with the same hash code were put in a hash set, which resulted in the wrong rule being selected. Use a list instead of a set as the return type of the getDistinctArguments method	2023-03-19 21:31:47 +08:00
TengJianPing	dfa2528b5e	[fix](bitmap) fix wrong result of bitmap count functions for null values (#17849 ) bitmap count functions result is null when there are null values, which is not right:	2023-03-19 11:49:58 +08:00
morrySnow	5f2b68df24	[fix](regression-test) fix unstable regression test cases found in p0 (#17900 )	2023-03-19 10:11:57 +08:00
morrySnow	ffda858f01	[fix](regression) fix unstable test cases and remove redundant cases (#17845 ) 1. aggregate_strategies executes too slowly; use a smaller table valued function to speed it up 2. add a p2 case nereids_syntax_p2/aggregate_strategies that uses a larger table valued function to ensure correctness 3. remove case nereids_syntax_p0/test_join_nereids since it is redundant with nereids_p0/join/test_join 4. remove unstable case in query_p0/aggregate/aggregate	2023-03-16 15:59:26 +08:00
HappenLee	39b5682d59	[Pipeline](shared_scan_opt) Support shared scan opt in pipeline exec engine	2023-03-13 10:33:57 +08:00
morrySnow	6c894be007	[enhancement](Nereids) support decimalv3 and precision derive (#17393 )	2023-03-09 14:12:10 +08:00
TengJianPing	eea6d770d7	[fix](bitmap) fix wrong result of bitmap_or for null (#17456 ) Result of select bitmap_to_string(bitmap_or(to_bitmap(1), null)) should be 1 instead of null. This PR fixes the logic of bitmap_or and bitmap_or_count. Other count-related functions should also be checked and fixed; they will be fixed in another PR.	2023-03-08 16:29:01 +08:00
zhengshiJ	aab14922af	[Feature](Nereids) support MarkJoin (#16616 ) # Proposed changes 1. The new optimizer supports the combination of subquery and disjunction. By way of MarkJoin, it behaves the same as the old optimizer. For design details see: https://emmymiao87.github.io/jekyll/update/2021/07/25/Mark-Join.html. 2. Implicit type conversion is performed when conjuncts are generated after subquery parsing. 3. Convert the unnesting of scalarSubquery in filter from filter+join to join + Conjuncts.	2023-03-08 14:26:24 +08:00
minghong	fd8adb492d	[fix](nereids) fix bugs in nereids window function (#17284 ) fix two problems: 1. push agg-fun in windowExpression down to AggregateNode for example, sql: select sum(sum(a)) over (order by b) Plan: windowExpression( sum(y) over (order by b)) +--- Agg(sum(a) as y, b) 2. push other expr to upper proj for example, sql: select sum(a+1) over () Plan: windowExpression(sum(y) over ()) +--- Project(a + 1 as y,...) +--- Agg(a,...)	2023-03-07 16:35:37 +08:00
morrySnow	3eeeff09fd	[enhancement](nereids) convert string literal to common type in in-expr and case-when-expr (#17200 )	2023-03-02 22:05:35 +08:00
morrySnow	469b6b8466	[enhancement](Nereids) datetime v2 type precision derive (#17079 )	2023-02-26 22:33:55 +08:00
YangShaw	c53b6a9532	[fix](Nereids) fix nullable() of lead/lag (#17014 ) fix bug when we use NULL as default value for window function lead() and lag()	2023-02-24 21:27:44 +08:00
morrySnow	7956800df7	[refactor](Nereids) let type coercion same with legacy planner (#16844 ) - change for Nereids 1. add a variable length parameter to the ctor of Count for a good error reporting of Count(a, b) 2. refactor StringRegexPredicate, let it inherit from ScalarFunction 3. remove useless class TypeCollection 4. use catalog.Type.Collection to check expression arguments type 5. change type coercion for TimestampArithmetic, divide, integral divide, comparison predicate, case when and in predicate. Let them same as legacy planner. - change for legacy planner 1. change the common type of floating and Decimal from Decimal to Double	2023-02-22 17:29:37 +08:00
YangShaw	77a3288ce7	[feature](Nereids) support window function (#14397 )	2023-02-13 21:20:56 +08:00
minghong	4f778c38a1	[feature](nereids) support explore 4 phase aggregation (#16298 ) support 4 phase Aggregation. example: `select count(distinct k1), sum(k2) from t` suppose t.k0 is distribute key. we have plan ``` Agg(DISTINCT_GLOBAL) \| Exchange(Gather) \| Agg(DISTINCT_LOCAL) \| Agg(GLOBAL) \| Exchange(hash distribute by k1) \| Agg(LOCAL) \| scan ``` limitations: 1. only support sql with one distinct. not support:`select count(distinct k1), count(distinct k2) from t` 2. only support sql with distinct one column not support: `select count(distinct k1, k2) from t`	2023-02-03 21:51:10 +08:00
zhengshiJ	929b31bd3c	[Feature](Nereids) Support CaseWhen with subquery (#16385 ) Co-authored-by: jianghaochen <jianghaochen@meituan.com>	2023-02-03 18:20:47 +08:00
zhengshiJ	e31913faca	[Feature](Nereids) Support order and limit in subquery (#15971 ) 1.Compatible with the old optimizer, the sort and limit in the subquery will not take effect, just delete it directly. ``` select * from sub_query_correlated_subquery1 where sub_query_correlated_subquery1.k1 > (select sum(sub_query_correlated_subquery3.k3) a from sub_query_correlated_subquery3 where sub_query_correlated_subquery3.v2 = sub_query_correlated_subquery1.k2 order by a limit 1); ``` 2.Adjust the unnesting position of the subquery to ensure that the conjunct in the filter has been optimized, and then unnesting Support: ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count() FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k1 = i1.k1) AND (k2 = 1)) ) > 0); ``` The reason why the above can be supported is that conjunction will be performed, which can be converted into the following ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count() FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2 or k2 = 1)) ) > 0); ``` Not Support: ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k2 = i1.k1) AND (k2 = 1)) ) > 0); ```	2023-02-02 18:17:30 +08:00
谢健	09abd32957	[fix](test) result order in group-by-constant case is not stable (#16323 )	2023-02-02 16:54:01 +08:00
starocean999	1ec88cbff6	[fix](nereids) AggregationNode process null as key column in wrong way (#16125 ) in AggregationNode, _merge_with_serialized_key_helper method should convert the key column to full column if the key column is null literal.	2023-01-29 20:12:07 +08:00
starocean999	cbb203efd2	[fix](nereids) fix test_join regression test for nereids (#16094 ) 1. add TypeCoercion for (string, decimal) and (date, decimal) 2. The equality of LogicalProject node should consider children in some cases 3. don't push down join condition like "t1 join t2 on true/false" 4. add PUSH_DOWN_FILTERS after FindHashConditionForJoin 5. nest loop join should support all kinds of join 6. the intermediate tuple should contain slots from both children of nest loop join.	2023-01-20 14:02:29 +08:00
minghong	dd869077f8	[fix](nereids) do not generate compare between Date to Date (#16061 ) The BE storage engine has some bugs in Date comparison, so if we push down predicates like Date'x' < Date 'y', we get wrong results. This PR converts expressions like Date'x' < Date 'y' to DateTime'x' < DateTime 'y'. TODO: does the storage engine support comparing a date slot with datetime? If it does, we could avoid adding a cast on the slot, and then this expression could be pushed down to the storage engine.	2023-01-19 15:56:51 +08:00
AKIRA	21b78cb820	[fix](nereids) Fix bind failed of the slots in the group by clause (#16077 ) A child's slot with the same name as a slot in the output expressions would be discarded, which would cause the bind to fail, since the slots in the group by expressions could not find the corresponding bound slots in the child's output	2023-01-19 15:36:13 +08:00
minghong	0144c51ddb	[fix](nereids) fix bug in CaseWhen.getDataType and add some missing case for findTightestCommonType (#15776 )	2023-01-19 15:30:25 +08:00
谢健	d8f598eeab	[enhancement](Nereids) add timestampadd, timestampdiff functions (#16072 )	2023-01-19 01:05:25 +08:00
mch_ucchi	baf62b4418	[test](Nereids) add regression-test for running_difference and regexp_extract_all (#16049 )	2023-01-18 22:24:52 +08:00
AKIRA	0916cbcb10	[enhancement](nereids) Made the parse for named expression more complete (#16010 ) After this PR, we can support the following grammar. SELECT SUBSTRING("dddd编", 0, 3) AS "测试"; SELECT SUBSTRING("dddd编", 0, 3) "测试";	2023-01-18 19:44:51 +08:00
谢健	1fa2b662cf	[opt](Nereids) add date_add/sub function (#16048 ) 1. add week_add week_diff function 2. register all date_add/date_diff function	2023-01-18 17:11:44 +08:00
starocean999	96b9115286	[fix](nereids) fix bug of invalid column in olap scan node when a materialized view is selected (#15976 ) If a materialized view is selected, the olap scan node's NonUserVisibleOutput property may contain columns from other materialized views. This PR removes the invalid columns	2023-01-18 01:02:12 +08:00
starocean999	0c8255d9b8	[fix](nereids)nest loop join should support filter conjuncts like hash join (#15979 )	2023-01-17 20:38:38 +08:00
morrySnow	7e4bc1fee6	[fix](Nereids) add a rule to adjust nullable of all expressions (#15791 ) we have some rules that change output's nullable in rewrite step. So we need a rule to adjust nullable at the end of rewrite step. TODO - remove the output slot map - add nullable compare into slot reference - use exprid to compare two slot if do not need to compare nullable - merge all rules into one to adjust all type plans	2023-01-17 15:51:25 +08:00
morrySnow	d98abb12f9	[fix](Nereids) set operation type coercion differs from legacy planner (#15982 )	2023-01-17 11:41:41 +08:00
morrySnow	ce1d19b373	[fix](Nereids) lateral view cannot bind function nested in generators (#15960 )	2023-01-17 11:37:56 +08:00
minghong	8d25b156aa	[fix](nereids) bind slot using exactly match (#15950 ) example: unbound slot k, bound slots [k, t.k]. In the previous binding algorithm, there are 2 candidate bindings. Since bound k exactly matches unbound slot k, it now has higher priority than t.k	2023-01-17 11:25:08 +08:00
minghong	fa03c8a241	[feature](nereids) const folding for in-predicate with null literal (#15880 ) select 1 in (2 , null) => null select 1 in (1 , null) => true select 1 not in (2 , null) => null select 1 not in (1 , null) => false	2023-01-16 13:48:45 +08:00
minghong	67378a2dc3	[fix](nereids) fix bug in SequenceFunction legality check (#15812 ) 1. fix bug in sequence_match function 2. do type promotion instead of explicit cast for - varcharLiteral -> stringLiteral - charLiteral->stringLiteral	2023-01-13 12:09:53 +08:00
谢健	39697bb83e	[fix](Nereids) make the type of the first parameter in window_funnel integerLike (#15810 )	2023-01-12 11:53:28 +08:00


126 Commits
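
Note on commit fa03c8a241 (const folding for in-predicate with null literal): the four cases listed in that message follow SQL three-valued logic. The sketch below is an illustration only, not the Nereids implementation. The function names `fold_in` and `fold_not_in` are hypothetical, and Python's `None` stands in for the SQL NULL literal.

```python
from typing import Optional, Sequence


def fold_in(value: Optional[int], options: Sequence[Optional[int]]) -> Optional[bool]:
    """Evaluate `value IN (options...)` under SQL three-valued logic.

    Hypothetical helper for illustration; None models SQL NULL.
    """
    if value is None:
        # NULL compared with anything is unknown.
        return None
    saw_null = False
    for option in options:
        if option is None:
            saw_null = True
        elif option == value:
            # A definite match wins even if a NULL is also in the list.
            return True
    # No match: the result is unknown if a NULL was present, else false.
    return None if saw_null else False


def fold_not_in(value: Optional[int], options: Sequence[Optional[int]]) -> Optional[bool]:
    """Evaluate `value NOT IN (options...)`: the negation, with unknown staying unknown."""
    result = fold_in(value, options)
    return None if result is None else not result


if __name__ == "__main__":
    print(fold_in(1, [2, None]))      # select 1 in (2, null)
    print(fold_in(1, [1, None]))      # select 1 in (1, null)
    print(fold_not_in(1, [2, None]))  # select 1 not in (2, null)
    print(fold_not_in(1, [1, None]))  # select 1 not in (1, null)
```

The key design point is that a NULL in the list only matters when no definite match is found, which is why `1 in (1, null)` folds to true while `1 in (2, null)` folds to null.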