1. Add IntegralDivide operator to support `DIV` semantics
2. Add more operator rewriter to keep expression type consistent between operators
3. Support the convertion between float type and decimal type.
After this PR, below cases could be executed normaly like the legacy optimizer:
use test_query_db;
select k1, k5,100000*k5 from test order by k1, k2, k3, k4;
select avg(k9) as a from test group by k1 having a < 100.0 order by a;
Original: group by is bound to the outputExpression of the current node.
Problem: When the name of the new reference of outputExpression is the same as the child's output column, the child's output column should be used for group by, but at this time, the new reference of the node's outputExpression will be used for group by, resulting in an error
Now: Give priority to the child's output for group by binding. If the child does not have a corresponding column, use the outputExpression of this node for binding
When light schema change is enabled by default (#15344), regression tests that run SQL by selecting data from the materialized index will fail.
This PR disabled those failed queries in the regression test. Those tests would be added back when nereids planner could give the correct plan when light schema change is enabled.
SELECT 2 FROM tbl GROUP BY 1
it should produce 2 would the table is not empty when table is not empty. Before this PR, the execution of nereids generated plan would produce empty result set
The join node need project operation to remove unnecessary columns from the output tuples.
For SetOperationNode output tuple and input tuple is consistent and do not need project,
but the children of SetOperationNode may be join nodes, so the children of the SetOperationNode
need to do the project operation.
1. Fix 1 bug:
Throw null pointer exception when reading data after the reader reaches the end of file, so should return directly when `_do_lazy_read` read no data.
2. Optimize code:
Remove unused parameters.
3. Fix regression test
Add a new config "jdbc_drivers_dir" for both FE and BE.
User can put jdbc drivers' jar file in this dir, and only specify file name in "driver_url" properties
when creating jdbc resource.
And Doris will find jar files in this dir.
Also modify the logic so that when the jdbc resource is modified, the corresponding jdbc table
will get the latest properties.
the union node will make children pass through in wrong condition. If the children's materialized slots are different from union node, children can't be passed through.
Add a new rule 'ProjectWithDistinctToAggregate' to support "select distinct xx from table".
This rule check's the logicalProject node's isDisinct property and replace the logicalProject node with a LogicalAggregate node.
So any rule before this, if createing a new logicalProject node, should make sure isDisinct property is correctly passed around.
please see rule BindSlotReference or BindFunction for example.
Optimize for key topn query like `SELECT * FROM store_sales ORDER BY ss_sold_date_sk, ss_sold_time_sk LIMIT 100`
(ss_sold_date_sk, ss_sold_time_sk is prefix of table sort key).
Check per scanner limit and set eof true to reduce the data need to be read.
* [fix](Inbitmap) fix in bitmap result error when left expr is constant
1. When left expr of the in predicate is a constant, instead of generating a bitmap filter, rewrite sql to use `bitmap_contains`.
For example,"select k1, k2 from (select 2 k1, 11 k2) t where k1 in (select bitmap_col from bitmap_tbl)"
=> "select k1, k2 from (select 2 k1, 11 k2) t left semi join bitmap_tbl b on bitmap_contains(b.bitmap_col, t.k1)"
* add regression test
when we process not in subquery. if the subquery return column is nullable, we need a NULL AWARE ANTI JOIN instead of ANTI JOIN.
Doris already support NULL AWARE ANTI JOIN in PR #13871
Nereids need to do that so.
in previous pr(#14876) we compact equals like "a=1 or a=2 or a = 3 " in to "a in (1, 2, 3)"
this pr set a lower bound for the number of equals COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD (default is 2)
for performance reason, we create a hashSet to collect literals, like {1,2,3}. and hence, the literals in "in-predicates" are in random order.
for regression test, if we need stable output of explain string, set COMPACT_EQUAL_TO_IN_PREDICATE_THRESHOLD to a large number to avoid compact rule.
The pr implements the SetOperation.
- Adapt to the EliminateUnnecessaryProject rule to ensure that the project under SetOperation is not deleted.
- Add predicate pushdown of SetOperation
- Optimization: Merge multiple SetOperations with the same type and the same qualifier
- Optimization: merge oneRowRelation and union