Modify the binding logic so that ORDER BY behaves the same as in MySQL when the sort's child is an aggregate.
When an ORDER BY expression contains an aggregate function, all slots in that expression should bind to the LogicalAggregate's non-AggFunction outputs first, and then to the LogicalAggregate's child.
e.g.
select 2*abs(sum(c1)) as c1, c1,sum(c1)+c1 from t_order_by_bind_priority group by c1 order by sum(c1)+c1 asc;
In this SQL, both c1 references in the ORDER BY bind to c1 in t_order_by_bind_priority.
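The binding priority described above can be sketched as a lookup over ordered scopes. This is a simplified illustration, not Doris code: `bind_slot`, the scope labels, and the dictionaries are hypothetical names introduced only for this example.

```python
# Hypothetical sketch: names and data structures are illustrative, not Doris APIs.
def bind_slot(name, scopes):
    """Return (scope_label, column) from the first scope that defines `name`."""
    for label, columns in scopes:
        if name in columns:
            return label, columns[name]
    raise LookupError(f"cannot bind slot: {name}")

# Scopes for the example query, in the priority order described above.
# The alias c1 for 2*abs(sum(c1)) is an AggFunction output, so it is not a candidate.
agg_non_agg_outputs = {"c1": "t_order_by_bind_priority.c1"}  # the plain group-by key c1
agg_child_outputs = {"c1": "t_order_by_bind_priority.c1"}    # columns of the table

scopes = [
    ("aggregate non-agg outputs", agg_non_agg_outputs),
    ("aggregate child", agg_child_outputs),
]

bound = bind_slot("c1", scopes)
```

In this sketch the first matching scope wins, which is why the c1 inside `sum(c1)+c1` resolves to the table column rather than to the alias.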
The SQL below fails because:
1. The 2 in the GROUP BY is bound to 1 as col2 in BindExpression.
2. ResolveOrdinalInOrderByAndGroupBy replaces 1 with MIN (LENGTH (cast(age as varchar))).
3. CheckAnalysis throws an exception because GROUP BY cannot contain an aggregate function.
select MIN (LENGTH (cast(age as varchar))), 1 AS col2
from test_bind_groupby_slots
group by 2
To fix this, ResolveOrdinalInOrderByAndGroupBy should be moved into BindExpression.
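A simplified model of the failure, assuming expressions are plain strings and ordinals are integer literals. `resolve_ordinal` and `select_list` are hypothetical names for this sketch and do not reflect the real rule implementation.

```python
# Hypothetical model: expressions are strings, ordinals are integer literals.
select_list = ["MIN(LENGTH(CAST(age AS VARCHAR)))", "1"]  # second item is `1 AS col2`

def resolve_ordinal(expr, select_list):
    """Replace an integer-literal ordinal with the matching select-list item."""
    if expr.isdigit() and 1 <= int(expr) <= len(select_list):
        return select_list[int(expr) - 1]
    return expr

# Two separate passes: the result of the first pass ("1") is treated as an ordinal again.
two_pass = resolve_ordinal(resolve_ordinal("2", select_list), select_list)

# Single pass: the ordinal is resolved once, so the literal 1 stays a literal.
single_pass = resolve_ordinal("2", select_list)
```

With two passes the GROUP BY key ends up being the aggregate expression, which is what the analysis check rejects. Resolving once keeps the key as the literal.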
(cherry picked from commit 3fab4496c3fefe95b4db01f300bf747080bfc3d8)
For the SQL
```
t1, t2 join t3
```
we should generate a plan like:
```
t1 join (t2 join t3)
```
but we generate:
```
(t1 join t2) join t3
```
to follow the legacy planner.
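The two plan shapes can be written as nested tuples to make the difference concrete. This is only an illustration: `left_deep` and `comma_then_join` are hypothetical helpers, not planner code.

```python
# Hypothetical helpers that build join trees as nested tuples.
def left_deep(tables):
    """Fold tables left to right: ((t1 join t2) join t3)."""
    plan = tables[0]
    for table in tables[1:]:
        plan = (plan, "join", table)
    return plan

def comma_then_join(first, joined):
    """Treat the explicit JOIN as binding tighter than the comma."""
    return (first, "join", left_deep(joined))

legacy_shape = left_deep(["t1", "t2", "t3"])
precedence_shape = comma_then_join("t1", ["t2", "t3"])
```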
q1: "select * from ut_p temporary partitions(tp1) where val > 0"
In q1, the temporary partition tp1 is scanned.
q2: "select * from ut_p where val > 0"
In q2, the temporary partition tp1 is not scanned.
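A minimal sketch of the expected partition selection. It assumes that naming temporary partitions restricts the scan to exactly those partitions, and that a query without the clause sees only regular partitions. `partitions_to_scan` and the partition name `p1` are hypothetical.

```python
# Hypothetical sketch of partition selection; not the actual pruning code.
def partitions_to_scan(partitions, requested_temp=()):
    """partitions: list of (name, is_temporary) pairs."""
    requested = set(requested_temp)
    if requested:
        # Assumption: only the explicitly named temporary partitions are scanned.
        return [name for name, is_temp in partitions if is_temp and name in requested]
    # Without the clause, temporary partitions are never scanned.
    return [name for name, is_temp in partitions if not is_temp]

ut_p = [("p1", False), ("tp1", True)]
q1_scan = partitions_to_scan(ut_p, ["tp1"])
q2_scan = partitions_to_scan(ut_p)
```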
Problem:
When a table is created with the datatype varchar(), the column is treated as having the maximum length by default. But DESC does not show
the real length; it shows varchar().
Reason:
When upgrading from 2.0.1 to 2.0.2, we added support for creating varchar(), and it is displayed the same way as in the
DDL schema. As a result, users may be confused about the length of the varchar.
Solution:
Change the display of varchar() to varchar(65533), which is compatible with Hive.
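The display change amounts to filling in the default length when none was given. A minimal sketch, where `display_varchar` and `MAX_VARCHAR_LENGTH` are hypothetical names and only the value 65533 comes from the description above:

```python
# Hypothetical sketch; only the value 65533 is taken from the description above.
MAX_VARCHAR_LENGTH = 65533

def display_varchar(length=None):
    """Render a varchar type, showing the maximum length when none was declared."""
    if length is None:
        length = MAX_VARCHAR_LENGTH
    return f"varchar({length})"
```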
The SQL `select * from t1 a join t1 b on b.id in (select 1) and a.id = b.id;` reports an error.
This PR supports uncorrelated subqueries in join conditions to fix it.
Consider SQL with an IN-subquery:
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part containing the aggregate is uncorrelated, so it can be unnested.
On the other hand:
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0 and sub_query_correlated_subquery6.k1 > 2
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part containing the aggregate is correlated, so it can't be unnested.
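The distinction between the two queries can be sketched as a check for references to tables outside the subquery. This is a simplification that only looks at table-qualified column names. `is_correlated` is a hypothetical helper, not the analyzer's actual logic.

```python
# Hypothetical sketch: only table-qualified columns are checked.
def is_correlated(referenced_columns, subquery_tables):
    """True if any qualified column refers to a table outside the subquery."""
    for col in referenced_columns:
        if "." in col:
            table = col.split(".", 1)[0]
            if table not in subquery_tables:
                return True
    return False

inner_tables = {"sub_query_correlated_subquery7"}

first_query = is_correlated(["k1", "k3", "k2"], inner_tables)
second_query = is_correlated(
    ["k1", "k3", "k2", "sub_query_correlated_subquery6.k1"], inner_tables
)
```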
This PR makes three changes to the display of complex types:
1. A NULL value in a complex type is displayed as `null`, not `NULL`.
2. A struct type is displayed as `"column_name": column_value`.
3. Time types such as `datetime` and `date` are displayed with double quotes in complex types, like `{1, "2023-10-26 12:12:12"}`.
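The three display rules can be sketched with a small recursive renderer. This is an illustration only: `render` is a hypothetical function, Python dicts and lists stand in for struct and array values, and the exact separators and spacing are assumptions rather than the real output format.

```python
import datetime

# Hypothetical renderer; separators and spacing are assumptions.
def render(value):
    if value is None:
        return "null"  # rule 1: lowercase null inside complex types
    # datetime must be checked before date because datetime subclasses date.
    if isinstance(value, datetime.datetime):
        return '"' + value.strftime("%Y-%m-%d %H:%M:%S") + '"'  # rule 3
    if isinstance(value, datetime.date):
        return '"' + value.isoformat() + '"'  # rule 3
    if isinstance(value, dict):  # stands in for a struct (rule 2)
        return "{" + ", ".join(f'"{k}": {render(v)}' for k, v in value.items()) + "}"
    if isinstance(value, list):  # stands in for an array
        return "[" + ", ".join(render(v) for v in value) + "]"
    return str(value)

struct_text = render({"id": 1, "ts": datetime.datetime(2023, 10, 26, 12, 12, 12)})
array_text = render([1, None, datetime.date(2023, 10, 26)])
```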
This PR also does a code refactor:
1. nesting_level is now a member variable of `DataTypeSerDe` rather than a parameter of its methods.
In addition, this PR fixes a bug where fileSize was incorrect, introduced by #25854.
create table t1(c1 int, c2 int);
create table t2(c1 int, c2 int);
insert into t1 values (1,1);
insert into t2 values (1,1);
select * from t1 where exists (select * from t2 where t1.c1 = t2.c1 limit 0);
The result should be an empty set.
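The expected result follows from EXISTS being evaluated over the rows that remain after the LIMIT. A minimal sketch over in-memory rows, where `exists` is a hypothetical helper:

```python
# Hypothetical evaluation of EXISTS over an in-memory subquery result.
def exists(rows, limit=None):
    """EXISTS is true only if at least one row survives the LIMIT."""
    if limit is not None:
        rows = rows[:limit]
    return len(rows) > 0

t1 = [(1, 1)]
t2 = [(1, 1)]

# select * from t1 where exists (select * from t2 where t1.c1 = t2.c1 limit 0)
result = [
    r1 for r1 in t1
    if exists([r2 for r2 in t2 if r1[0] == r2[0]], limit=0)
]
```

With `limit=0` no row can survive, so the predicate is false for every row of t1.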
We generate a project for every child of a set operation to ensure that the
column order of the children does not change. However, some rules, such as
PushDownProjectThroughLimit, can remove these projects unintentionally.
When that happens, the column order is wrong and leads to a BE core dump.
This PR uses a new variable in SetOperation to save the output order of
the set operation's children. The children's output order can then
change without affecting the SetOperation.
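The idea of keeping the children's output order on the set operation itself can be sketched as follows. This is a simplified model with rows as dictionaries. `union_all` and `children_output_order` are hypothetical names, not the actual SetOperation fields.

```python
# Hypothetical model: the set operation reads each child by its saved column order.
def union_all(children, children_output_order):
    """children: list of row lists, each row a dict keyed by column name."""
    rows = []
    for child_rows, order in zip(children, children_output_order):
        for row in child_rows:
            rows.append(tuple(row[name] for name in order))
    return rows

left = [{"a": 1, "b": 2}]
right = [{"y": 20, "x": 10}]  # child whose own column order was changed by a rewrite

out = union_all([left, right], [["a", "b"], ["x", "y"]])
```

Because the order is looked up by name on the set operation side, a child that reorders its own outputs still contributes columns in the right positions.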
select rank() over (partition by A, B) as r, sum(x) over (partition by A, C) as s from T;
A is a common partition key for all window expressions; that is, A is the intersection of {A, B} and {A, C}.
We can push the filter A=1 through this window, since A is a common partition key:
select * from (select a, row_number() over (partition by a) from win) T where a=1;
Original plan:
----filter((T.a = 1))
----------PhysicalWindow
------------PhysicalQuickSort
--------------PhysicalProject
------------------PhysicalOlapScan[win]
Transformed to:
----PhysicalWindow
------PhysicalQuickSort
--------PhysicalProject
----------filter((T.a = 1))
------------PhysicalOlapScan[win]
But C=1 cannot be pushed through the window, because C is not a partition key of every window expression.
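The pushdown condition can be sketched as a set intersection over the partition keys of all window expressions. `common_partition_keys` and `can_push_filter` are hypothetical helpers, and the sketch assumes at least one window expression is present.

```python
# Hypothetical sketch; assumes window_partitions is non-empty.
def common_partition_keys(window_partitions):
    """Intersect the partition keys of all window expressions."""
    keys = set(window_partitions[0])
    for partition in window_partitions[1:]:
        keys &= set(partition)
    return keys

def can_push_filter(filter_columns, window_partitions):
    """A filter can pass the window only if it uses common partition keys alone."""
    return set(filter_columns) <= common_partition_keys(window_partitions)

windows = [["A", "B"], ["A", "C"]]
common = common_partition_keys(windows)
push_a = can_push_filter(["A"], windows)
push_c = can_push_filter(["C"], windows)
```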
physicalPlan.resetLogicalProperties() does not change the original plan; it creates a new plan with no logical properties. The plan should therefore be updated with the return value of resetLogicalProperties().
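The pitfall is the usual one with immutable objects: a method that returns a modified copy has no effect if its return value is dropped. A minimal sketch with a frozen dataclass, where `Plan` and `reset_logical_properties` are hypothetical stand-ins for the real plan class and method:

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical immutable plan; stands in for the real physical plan class.
@dataclass(frozen=True)
class Plan:
    name: str
    logical_properties: Optional[str] = None

    def reset_logical_properties(self):
        """Return a copy without logical properties; self is left untouched."""
        return replace(self, logical_properties=None)

plan = Plan("scan", logical_properties="props")

plan.reset_logical_properties()        # return value dropped: plan is unchanged
before_fix = plan.logical_properties

plan = plan.reset_logical_properties()  # return value kept: plan is updated
after_fix = plan.logical_properties
```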