Support the following grammar:
select * from t_p temporary partition(tp1);
select * from t_p temporary partitions(tp1);
select * from t_p temporary partition tp1;
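The three accepted shapes can be sketched with a small matcher. This is only an illustration of the surface syntax, not the Doris parser: the regex, the function name `temporary_partitions`, and support for a comma-separated list inside the parentheses are all assumptions that go beyond the single-name statements above.

```python
import re

# Hypothetical matcher for the forms listed above (not the real grammar).
_TEMP_PARTITION = re.compile(
    r"\btemporary\s+(?:"
    # temporary partition(...) / temporary partitions(...)
    r"partitions?\s*\(\s*(?P<listed>\w+(?:\s*,\s*\w+)*)\s*\)"
    # temporary partition tp1 (no parentheses)
    r"|partition\s+(?P<single>\w+)"
    r")",
    re.IGNORECASE,
)


def temporary_partitions(sql: str) -> list:
    """Return the temporary partition names referenced in `sql`, or []."""
    match = _TEMP_PARTITION.search(sql)
    if match is None:
        return []
    if match.group("single") is not None:
        return [match.group("single")]
    # Assumption: several names may be separated by commas.
    return [name.strip() for name in match.group("listed").split(",")]


if __name__ == "__main__":
    for stmt in (
        "select * from t_p temporary partition(tp1);",
        "select * from t_p temporary partitions(tp1);",
        "select * from t_p temporary partition tp1;",
    ):
        print(stmt, "->", temporary_partitions(stmt))
```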
When light schema change is enabled by default, a column in an OLAP scan is retrieved by its column unique id instead of its column name. Columns with the same name can have different unique ids in different materialized indexes.
This PR ensures that each column in the OLAP scan node uses the correct column unique id.
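A minimal model of why the lookup has to go through the selected index. The index names, column names, and id values are made up for illustration, and `column_unique_id` is a hypothetical helper, not a nereids or catalog API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    unique_id: int


# Hypothetical catalog: the same column name has a different unique id
# in each materialized index.
INDEX_SCHEMAS = {
    "base_index": [Column("k1", 0), Column("v1", 1)],
    "rollup_index": [Column("k1", 2), Column("v1", 3)],
}


def column_unique_id(selected_index: str, column_name: str) -> int:
    """Resolve the unique id from the schema of the index the scan selected."""
    for column in INDEX_SCHEMAS[selected_index]:
        if column.name == column_name:
            return column.unique_id
    raise KeyError(f"{column_name} not found in {selected_index}")


if __name__ == "__main__":
    # Same name, different ids depending on which index is scanned.
    print(column_unique_id("base_index", "k1"))
    print(column_unique_id("rollup_index", "k1"))
```

Resolving by name alone would be ambiguous here; the id only becomes well defined once the scan's index is known.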
This PR defines order_key and having_key binding priority.
1. order key priority
```
select
col1 * -1 as col1 # inner_col1 * -1 as alias_col1
from
t
order by col1; # order by order_col1
```
To bind `order_col1`, `alias_col1` has higher priority than `inner_col1`.
2. having key priority
```
select (a-1) as a # inner_a - 1 as alias_a
from bind_priority_tbl
group by a
having a=1;
```
To bind the having key, `inner_a` has higher priority than `alias_a`.
3. group by key binding priority
```
SELECT date_format(b.k10,
'%Y%m%d') AS k10
FROM test a
LEFT JOIN
(SELECT k10
FROM baseall) b
ON a.k10 = b.k10
GROUP BY k10;
```
group_by_key (k10) binding priority:
- agg.child.output
- agg.output
If binding with agg.child.output fails (the slot is not found, or more than one candidate slot is found in agg.child.output), nereids tries to bind group_by_key with agg.output.
In the example above, nereids finds 2 candidate slots (a.k10, b.k10) in agg.child.output for group_by_key (k10), so binding with agg.child.output fails. Nereids then tries to bind group_by_key with agg.output, that is, `date_format(b.k10, '%Y%m%d') AS k10`. Finally, group_by_key is bound to the alias `k10`.
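The group_by_key fallback described above can be sketched as follows. `Slot`, `find_unique`, and `bind_group_by_key` are hypothetical names for this illustration only; the real binding rule lives in the nereids analyzer and is not reproduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Slot:
    qualifier: Optional[str]  # table alias, or None for a select-list alias
    name: str


def find_unique(name: str, slots: List[Slot]) -> Optional[Slot]:
    """Return the slot named `name` only if exactly one candidate exists."""
    matches = [slot for slot in slots if slot.name == name]
    return matches[0] if len(matches) == 1 else None


def bind_group_by_key(name: str, child_output: List[Slot], agg_output: List[Slot]) -> Slot:
    # First priority: agg.child.output.
    slot = find_unique(name, child_output)
    if slot is not None:
        return slot
    # Not found or ambiguous in the child: fall back to agg.output.
    slot = find_unique(name, agg_output)
    if slot is None:
        raise ValueError(f"cannot bind group by key {name!r}")
    return slot


if __name__ == "__main__":
    # Mirrors the example: a.k10 and b.k10 are both candidates in the child,
    # so the key binds to the select-list alias k10.
    child = [Slot("a", "k10"), Slot("b", "k10")]
    agg = [Slot(None, "k10")]
    print(bind_group_by_key("k10", child, agg))
```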
When StatsCalculator computed statistics for except and intersect, the slotId of the corresponding column was not replaced with the slotId of the output, resulting in an NPE.
When nereids translates a PhysicalHashAggregate node to the original plan, if the input fragment root is an exchange node, nereids assumes that this exchange node was generated from a PhysicalDistribute node.
But this assumption does not always hold. For example, a sort node can be translated to exchange (merge phase) + sort (local phase).
1. the agg function without the distinct keyword should be a "merge" function in threePhaseAggregateWithDistinct
2. use aggregateParam.aggMode.consumeAggregateBuffer instead of aggregateParam.aggPhase.isGlobal() to indicate whether an agg function is a "merge" function
3. add an AvgDistinctToSumDivCount rule to support avg(distinct xxx) in some cases
4. AggregateExpression's nullable method should call inner function's nullable method.
5. add a bind slot rule to bind pattern "logicalSort(logicalHaving(logicalProject()))"
6. don't remove project node in PhysicalPlanTranslator
7. add a cast-to-bigint expr when the argument of count(distinct ...) has a date-like type
8. fall back to the old optimizer if bitmap runtime filter is enabled.
9. fix exchange node mem leak
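The identity that the rule name AvgDistinctToSumDivCount suggests, avg(distinct x) = sum(distinct x) / count(distinct x), can be checked with a quick sketch. It uses SQLite from the Python standard library as a stand-in engine, not Doris, and the table `t` and its rows are made up; the exact expression the nereids rule emits is not shown in this description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (1,), (2,), (3,), (3,)])

# Original form.
avg_distinct = conn.execute("SELECT avg(DISTINCT x) FROM t").fetchone()[0]

# Rewritten form: sum(distinct) divided by count(distinct).
# The CAST keeps SQLite from doing integer division.
rewritten = conn.execute(
    "SELECT CAST(sum(DISTINCT x) AS REAL) / count(DISTINCT x) FROM t"
).fetchone()[0]

print(avg_distinct, rewritten)
```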
1. Add IntegralDivide operator to support `DIV` semantics
2. Add more operator rewriters to keep expression types consistent between operators
3. Support the conversion between float type and decimal type.
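For reference, a sketch of integer-division semantics as `DIV` is defined in MySQL, where the fractional part of the quotient is discarded. That Doris follows the same truncation rule, and that division by zero yields NULL, are assumptions here; `integral_divide` is a hypothetical helper, not the IntegralDivide operator itself.

```python
from typing import Optional


def integral_divide(left: int, right: int) -> Optional[int]:
    """Integer division that truncates toward zero (MySQL-style DIV)."""
    if right == 0:
        # Assumption: division by zero produces NULL rather than an error.
        return None
    quotient = abs(left) // abs(right)
    # Python's // floors, so the sign is restored separately to truncate.
    return quotient if (left < 0) == (right < 0) else -quotient


if __name__ == "__main__":
    print(integral_divide(7, 2))
    print(integral_divide(-7, 2))
    print(integral_divide(7, 0))
```

Note the difference from Python's own `//`, which floors: `-7 // 2` is `-4`, while truncating division gives `-3`.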
After this PR, the cases below can be executed normally, as in the legacy optimizer:
use test_query_db;
select k1, k5,100000*k5 from test order by k1, k2, k3, k4;
select avg(k9) as a from test group by k1 having a < 100.0 order by a;
Original: group by is bound to the outputExpression of the current node.
Problem: When the name of a new reference in outputExpression is the same as a child's output column, the child's output column should be used for group by. Instead, the new reference from the node's outputExpression was used, resulting in an error.
Now: Give priority to the child's output for group by binding. If the child does not have a corresponding column, use the outputExpression of this node for binding.
When light schema change is enabled by default (#15344), regression tests that run SQL selecting data from a materialized index will fail.
This PR disables those failing queries in the regression test. The tests will be added back once the nereids planner can give the correct plan when light schema change is enabled.
SELECT 2 FROM tbl GROUP BY 1
It should produce 2 when the table is not empty. Before this PR, executing the plan generated by nereids produced an empty result set.
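The expected behavior can be reproduced with SQLite from the Python standard library. This is a stand-in engine used only to illustrate the semantics, not Doris; the column `c` and the sample rows are made up.

```python
import sqlite3


def select_const_group_by(rows):
    """Run the query above against a fresh table holding `rows`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (c INTEGER)")
    conn.executemany("INSERT INTO tbl VALUES (?)", [(r,) for r in rows])
    return conn.execute("SELECT 2 FROM tbl GROUP BY 1").fetchall()


if __name__ == "__main__":
    # All rows fall into a single group, so one row containing 2 comes back.
    print(select_const_group_by([10, 20, 30]))
    # An empty table has no groups, so the result set is empty.
    print(select_const_group_by([]))
```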
The join node needs a project operation to remove unnecessary columns from the output tuples.
For SetOperationNode, the output tuple and input tuple are consistent, so it does not need a project.
However, the children of a SetOperationNode may be join nodes, so the children of the SetOperationNode
need to do the project operation.
1. Fix 1 bug:
A null pointer exception was thrown when reading data after the reader reached the end of the file, so return directly when `_do_lazy_read` reads no data.
2. Optimize code:
Remove unused parameters.
3. Fix regression test
Add a new config "jdbc_drivers_dir" for both FE and BE.
Users can put JDBC driver jar files in this dir and specify only the file name in the "driver_url" property
when creating a jdbc resource.
Doris will then find the jar files in this dir.
Also modify the logic so that when a jdbc resource is modified, the corresponding jdbc table
gets the latest properties.
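A rough sketch of how a bare file name in "driver_url" could be resolved against "jdbc_drivers_dir". The helper `resolve_driver_url`, the `"://"` check used to recognize a full URL, the `file://` result format, and the example paths are all assumptions for illustration; the actual FE/BE logic is not shown in this description.

```python
import posixpath


def resolve_driver_url(driver_url: str, jdbc_drivers_dir: str) -> str:
    """Turn a bare jar file name into a URL under jdbc_drivers_dir."""
    # Assumption: anything with a scheme is already a full URL and is kept.
    if "://" in driver_url:
        return driver_url
    # Otherwise treat the value as a file name inside the drivers dir.
    return "file://" + posixpath.join(jdbc_drivers_dir, driver_url)


if __name__ == "__main__":
    # Hypothetical directory and jar names.
    print(resolve_driver_url("mysql-connector-java.jar", "/opt/doris/jdbc_drivers"))
    print(resolve_driver_url("https://example.com/driver.jar", "/opt/doris/jdbc_drivers"))
```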