1. Parquet with page v2 is parsed error when using other codec except snappy. Because `compressed_page_size` has the same meaning in page v1 and v2, it always contains the bytes of definition level, repetition level and compressed data.
2. Add regression test for `fix_length_byte_array` stored decimal type, and dictionary encoded date/datetime type.
close#26882
We should not use the singleNodePlan to generate the rootPlanFragment if the query is inside a insert statement or distributedPlanner will be null.
introduced in #15491
Concurrent schema change and txn may cause dead lock. An example:
Txn T commit but not publish;
Run schema change or rollup on T's related partition, add alter replica R;
sc/rollup add a sched txn watermark M;
Restart fe;
After fe restart, T's loadedTblIndexes will clear because it's not save to disk;
T will publish version to all tablet, including sc/rollup's new alter replica R;
Since R not contains txn data, so the T will fail. It will then always waitting for R's data;
sc/rollup wait for txn before M to finish, only after that it will let R copy history data;
Since T's not finished, so sc/rollup will always wait, so R will nerver copy history data;
Txn T and sc/rollup will wait each other forever, cause dead lock;
Fix: because sc/rollup will ensure double write after the sched watermark M, so for finish transaction, when checking a alter replica:
if txn id is bigger than M, check it just like a normal replica;
otherwise skip check this replica, the BE will modify history data later.
This commit overhauls the JDBC connector logic within our project, transitioning from the previous mechanism of fetching data through JNI calls for individual ResultSet items to a more efficient and unified approach using the VectorTable data structure.
sql select * from t1 a join t1 b on b.id in (select 1) and a.id = b.id; will report an error.
This pr support uncorrelated subquery in join condition to fix it
1 enable the cases about str_to_date, which have been muted because some parallel config influence.
2 serialise some cases which called admin set config
fix bug that #24059 .
Added some information_schema scanner tests.
files
schema_privileges
table_privileges
partitions
rowsets
statistics
table_constraints
Based on infodb_support_ext_catalog=false, it currently includes tests for all tables under the information_schema database.
forbid one phase agg for pattern: agg-unionAll
one phase agg plan: agg-union-hashDistribute-children
two phase agg plan: agg(global) - hashDistribute-agg(local)-union-randomDistribute
the key point is the cost of randomDistribute is much lower than the hashDistribute, and hence two-phase agg wins.
if left bucket has no data, we do not generate left bucket instance.
These join should reserve all right side data. But because left instance
is not exists. So right data will be discard since no dest be set.
We ban these join temporarily until we could generate all instance
for left side in Coordinator.
consider sql having in-subquery
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part having agg is un-correlated, which can be unnested.
on the other side:
SELECT count(*)
FROM sub_query_correlated_subquery6
WHERE k1 IN
(SELECT k1
FROM
(**SELECT k1,
sum(k3) AS bbb,
count(k2) AS aaa
FROM sub_query_correlated_subquery7
WHERE k1 > 0
AND k3 > 0 and sub_query_correlated_subquery6.k1 > 2
GROUP BY k1** ) y
WHERE y.aaa>0
AND k1>1);
The subquery part having agg is correlated, which can't be unnested.