for sql
```
t1, t2 join t3
```
we should generate plan like:
```
t1 join (t2 join t3)
```
but we generate:
```
(t1 join t2) join t3
```
to follow legancy planner.
q1: "select * from ut_p temporary partitions(tp1) where val > 0"
in q1, temporary partition tp1 is scaned
q2: "select * from ut_p where val > 0"
in q2, temporary partition tp1 is not scaned.
Problem:
when we create table with datatype varchar(), we regard it to be max length by default. But when we desc, it does not show
real length but show varchar()
Reason:
when we upgrade version from 2.0.1 to 2.0.2, we support new feature of creating varchar(), and it shows the same way with
ddl schema. So user would confuse of the length of varchar
Solved:
change the showing of varchar() to varchar(65533), which in compatible with hive
sql select * from t1 a join t1 b on b.id in (select 1) and a.id = b.id; will report an error.
This pr support uncorrelated subquery in join condition to fix it
consider sql having in-subquery
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part having agg is un-correlated, which can be unnested.
on the other side:
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0 and sub_query_correlated_subquery6.k1 > 2
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part having agg is correlated, which can't be unnested.
This pr makes three changes to the display of complex types:
1. NULL value in complex types refers to being displayed as `null`, not `NULL`
2. struct type is displayed as "column_name": column_value
3. Time types such as `datetime` and `date`, are displayed with double quotes in complex types. like
`{1, "2023-10-26 12:12:12"}`
This pr also do a code refactor:
1. nesting_level is set to a member variable of the `DataTypeSerDe`, rather than a parameter in methods.
What's more, this pr fix a bug that fileSize is not correct, introduced by this pr: #25854
create table t1(c1 int, c2 int);
create table t2(c1 int, c2 int);
insert into t1 values (1,1);
insert into t2 values (1,1);
select * from t1 where exists (select * from t2 where t1.c1 = t2.c1 limit 0);
the result should be empty set.
we generate project for all set operation's children to ensure the order
of all children are not changed. However, some rules, such as
PushDownProjectThroughLimit could remove these projects involuntarily.
When it happen, the column order is wrong and lead to BE core dump.
This PR use a new variable in SetOperation to save the output order of
children of set operation. Then the children's output order could be
changed and never affect to SetOperation at all.
select rank() over (partition by A, B) as r, sum(x) over(A, C) as s from T;
A is a common partition key for all windowExpressions, that is A is intersection of {A,B} and {A, C}
we could push filter A=1 through this window, since A is a common Partition key:
select * from (select a, row_number() over (partition by a) from win) T where a=1;
origin plan:
----filter((T.a = 1))
----------PhysicalWindow
------------PhysicalQuickSort
--------------PhysicalProject
------------------PhysicalOlapScan[win]
transformed to
----PhysicalWindow
------PhysicalQuickSort
--------PhysicalProject
----------filter((T.a = 1))
------------PhysicalOlapScan[win]
But C=1 can not be pushed through window.
physicalPlan.resetLogicalProperties(); will not change the origin plan but create a new plan with no logical property. So should update the plan using resetLogicalProperties()'s return value.
join commute rule will swap the left and right child. This cause the change of logical properties. So we need recompute the logical properties in plan post process to get the correct result
consider sql:
```
SELECT *
FROM sub_query_correlated_subquery1 t1
WHERE coalesce(bitand(
cast(
(SELECT sum(k1)
FROM sub_query_correlated_subquery3 ) AS int),
cast(t1.k1 AS int)),
coalesce(t1.k1, t1.k2)) is NULL
ORDER BY t1.k1, t1.k2;
```
is Null conjunct is lost in SubqueryToApply rule. This pr fix it