Parquet INT96 timestamp values were compared incorrectly for the purposes of producing statistics
by older parquet writers, so PARQUET-1065 deprecated them. The result is that any writer that produced
stats was producing unusable incorrect values, except the special case where min == max and an incorrect
ordering would not be material to the result. PARQUET-1026 made binary stats available and valid in that special case.
1. Fix the issue with tvf reading empty compressed files.
2. move two test cases (`test_local_tvf_compression` and `test_s3_tvf_compression`) from p2 to p0
pick from master #35076
intro by PR #34933
This PR attempts to address the issue of losing conjuncts
when performing a deep copy of the outer structure.
However, the timing of copying the conjuncts is incorrect,
resulting in the inability to map slots within the conjuncts
to the output of the outer structure.
* [Fix](auto inc) Fix multiple replica partial update auto inc data inconsistency problem (#34788)
* **Problem:** For tables with auto-increment columns, updating partial columns can cause data inconsistency among replicas.
**Cause:** Previously, the implementation for updating partial columns in tables with auto-increment columns was done independently on each BE (Backend), leading to potential inconsistencies in the auto-increment column values generated by each BE.
**Solution:** Before distributing blocks, determine if the update involves partial columns of a table with an auto-increment column. If so, add the auto-increment column to the last column of the block. After distributing to each BE, each BE will check if the data key for the partial column update exists. If it exists, the previous auto-increment column value is used; if not, the auto-increment column value from the last column of the block is used. This ensures that the auto-increment column values are consistent across different BEs.
* 2
* [Fix](regression-test) Fix auto inc partial update unstable regression test (#34940)
* [chore](femetaversion) add a check in fe code to avoid fe meta version changed during pick PR
* f
* f
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
fix leading with cte and same subqueryalias name
Example:
with tbl1 as select t1.c1 from t1
select tbl2.c2 from (select / * + leading(t2 tbl1) * / tbl1.c1, t2.c2 from tbl1 join t2) as tbl2 join t3;
Reason:
in this case, before getting analyzed preprocess would change subquery tbl2 to cte plan, and this cte plan should be in upper level cte plan, but not in logical result sink plan
Problem:
when using leading like leading(tbl1 tbl2) in
"select * from (select tbl1.c1 from t1 as tbl1 join t2 as tbl2) join t3 as tbl2 on tbl2.c3 != 101;",
in which tbl2.c3 means t3.c3 but not t2.c3
Causes and solved:
when finding columns in condition, leading hint would find tbl2.c3's RelationId, and when we collect RelationId and aliasName
we should update it if aliasName is repeat
fix leading with multi level of brace pairs
example:
leading(t1 {{t2 t3} {t4 t5}} t6) can be reduced to leading(t1 {t2 t3 {t4 t5}} t6)
also update cases which remove project node from explain shape plan
Before, when executing `create table hive.db.table as select` to create table in hive catalog,
if current catalog is not hive catalog, the default engine name will be filled with `olap`, which is wrong.
This PR will fill the default engine name base on specified catalog.
before fix, join node will retain some slots, which are not materialized and unrequired.
join node need remove these slots and not make them be output slots.
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
partition id will change when insert overwrite
When the materialized view runs a task, if the base table is in insert overwrite, the materialized view task may report an error: partition not found by partitionId
Upgrade compatibility: Hive currently does not support automatic refresh, so it has no impact