Commit Graph

18686 Commits

Author SHA1 Message Date
037de3dedd [Fix](executor)Fix workload policy test #35146 2024-05-21 19:04:04 +08:00
0599cb2efd fix replica's remote data size set to data size (#35098)
2024-05-21 16:48:08 +08:00
367603a6c9 [security] fix fastjson security issues. (#35120)
cherry pick from #34627
Co-authored-by: derenli <derenli@tencent.com>
2024-05-21 16:35:42 +08:00
b0ecf76131 [fix][build](audit-loader) Fix a build error for AuditLoaderPlugin. (#35119) 2024-05-21 16:34:48 +08:00
428a6fd6ab fix test_decommission_with_replica_num_fail (#35123) 2024-05-21 15:49:31 +08:00
706c9c473b [fix](autobucket) calc bucket num exclude today's partition #34304 #35129 2024-05-21 15:49:16 +08:00
5019aa03e9 [enhancement](be-meta) disable sync rocksdb by default for better performance (#32714) (#35122) 2024-05-21 15:30:49 +08:00
44bb2bb639 [opt](routine-load) do not schedule invalid task (#34918) 2024-05-21 13:02:42 +08:00
fb28d0b185 [BUG] fix incorrect scan range boundary handling (#34832)
Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>
2024-05-21 13:00:50 +08:00
74d66e9650 [Fix](parquet-reader) Disable Timestamp INT96 min-max statistics, which are incorrect when written by some old parquet writers. (#35041)
Parquet INT96 timestamp values were compared incorrectly for the purposes of producing statistics
by older parquet writers, so PARQUET-1065 deprecated them. The result is that any writer that produced
stats was producing unusable incorrect values, except the special case where min == max and an incorrect
ordering would not be material to the result. PARQUET-1026 made binary stats available and valid in that special case.
2024-05-21 13:00:22 +08:00
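The ordering problem described in the commit above can be illustrated with a minimal sketch. It assumes the usual INT96 timestamp layout (8 bytes of nanoseconds within the day followed by 4 bytes of Julian day, both little-endian) and uses plain byte comparison as a stand-in for the comparison an old writer might have applied. The helper names `encode_int96` and `decode_int96` are hypothetical and do not come from the Doris or Parquet code.

```python
import struct
from typing import Tuple


def encode_int96(julian_day: int, nanos_of_day: int) -> bytes:
    """Pack a timestamp as 12 bytes: nanoseconds of day, then Julian day (little-endian)."""
    return struct.pack("<qi", nanos_of_day, julian_day)


def decode_int96(raw: bytes) -> Tuple[int, int]:
    """Unpack into (julian_day, nanos_of_day), which sorts chronologically as a tuple."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    return (julian_day, nanos_of_day)


# Two timestamps one day apart; `earlier` really is earlier.
earlier = encode_int96(julian_day=2459999, nanos_of_day=2)
later = encode_int96(julian_day=2460000, nanos_of_day=1)

# Raw byte comparison looks at the nanoseconds field first, so it gets the order wrong.
byte_order_says_earlier_first = earlier < later
# Comparing the decoded (day, nanos) pairs gives the chronological order.
chronological_says_earlier_first = decode_int96(earlier) < decode_int96(later)
```

Min-max statistics computed with the first comparison cannot be trusted for pruning, which is consistent with the reader ignoring them unless min == max.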
c0fd98abe5 [Fix](tvf) Fix tvf reading empty files in compressed formats. (#34926)
1. Fix the issue with tvf reading empty compressed files.
2. Move two test cases (`test_local_tvf_compression` and `test_s3_tvf_compression`) from p2 to p0
2024-05-21 12:59:31 +08:00
944d9bd4bd [exec](performance) opt the topn nullable column order performance in Heap Sort (#35042) 2024-05-21 12:58:58 +08:00
b4a798240a [fix](inverted_index) do not use int32_t for index id to avoid overflow (#35062) 2024-05-21 12:58:38 +08:00
f3762322c8 [opt](nereids)new way to set pre-agg status (#34738) 2024-05-21 12:54:49 +08:00
5872173901 [improve](function) add limit check for large length arguments to lpad/rpad functions (#34810) 2024-05-21 12:54:25 +08:00
518b143caa [feat](Nereids)choose agg mv in cbo #35020 2024-05-21 12:54:10 +08:00
26d5c50339 [Bug](Variant) fix incorrect use of column index in TabletSchema (#35019) 2024-05-21 12:52:26 +08:00
8ca399ab92 [exec](pipeline) runtime filter wait time (#35108) 2024-05-21 12:50:05 +08:00
45c145fdf7 [fix](Nereids) LogicalPlanDeepCopier copy scan conjuncts in wrong way (#35077)
pick from master #35076

introduced by PR #34933
That PR attempted to address the issue of losing conjuncts
when performing a deep copy of the outer structure.
However, the timing of copying the conjuncts is incorrect,
resulting in the inability to map slots within the conjuncts
to the output of the outer structure.
2024-05-20 21:49:53 +08:00
6fe533eede [branch-2.1](routine-load) fix routine load case fail #35084 2024-05-20 21:11:22 +08:00
aba00d7146 [Fix](executor)Fix workload reg test #35082 2024-05-20 20:36:29 +08:00
dbaa27ebbe [fix](agg) memory leak issue in agg operator (#35037) (#35055) 2024-05-20 16:50:50 +08:00
cb5e5c5aa7 [branch-2.1](test) fix dual test #35067 2024-05-20 16:47:23 +08:00
42425808a1 [Cherry-Pick](branch-2.1) Pick "Fix multiple replica partial update auto inc data inconsistency problem #34788" (#35056)
* [Fix](auto inc) Fix multiple replica partial update auto inc data inconsistency problem (#34788)

* **Problem:** For tables with auto-increment columns, updating partial columns can cause data inconsistency among replicas.

**Cause:** Previously, the implementation for updating partial columns in tables with auto-increment columns was done independently on each BE (Backend), leading to potential inconsistencies in the auto-increment column values generated by each BE.

**Solution:** Before distributing blocks, determine if the update involves partial columns of a table with an auto-increment column. If so, add the auto-increment column to the last column of the block. After distributing to each BE, each BE will check if the data key for the partial column update exists. If it exists, the previous auto-increment column value is used; if not, the auto-increment column value from the last column of the block is used. This ensures that the auto-increment column values are consistent across different BEs.

* 2

* [Fix](regression-test) Fix auto inc partial update unstable regression test (#34940)
2024-05-20 15:43:46 +08:00
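The solution described in the commit above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the Doris implementation: `Replica`, `build_block`, and the dictionary-based storage are hypothetical names chosen for illustration, and a block is reduced to a list of (key, candidate auto-increment value) pairs.

```python
from typing import Dict, List, Tuple

# Hypothetical block row: (key, auto-increment value assigned before distribution).
BlockRow = Tuple[str, int]


def build_block(keys: List[str], next_id: int) -> List[BlockRow]:
    """Assign candidate auto-increment values once, before the block is distributed."""
    return [(key, next_id + offset) for offset, key in enumerate(keys)]


class Replica:
    """Stand-in for one BE holding a copy of the table's auto-increment column."""

    def __init__(self, existing: Dict[str, int]) -> None:
        self.auto_inc_by_key = dict(existing)

    def apply_partial_update(self, block: List[BlockRow]) -> None:
        for key, candidate in block:
            # Existing key: keep the previous value. New key: take the value from the block.
            if key not in self.auto_inc_by_key:
                self.auto_inc_by_key[key] = candidate


existing_rows = {"k1": 100}
replicas = [Replica(existing_rows), Replica(existing_rows)]
block = build_block(["k1", "k2"], next_id=500)
for replica in replicas:
    replica.apply_partial_update(block)
```

Because every replica receives the same candidate values in the block, none of them generates a value on its own, so the replicas end up with identical auto-increment columns.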
5fa5ea8783 add check fe meta version to 129 for branch2.1 2024-05-20 13:30:15 +08:00
a43c6eca22 [chore](femetaversion) add a check in fe code to avoid fe meta version changed during pick PR (#35039)
* [chore](femetaversion) add a check in fe code to avoid fe meta version changed during pick PR

* f

* f

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-05-20 13:29:17 +08:00
be50139eb1 [Fix](Nereids) fix leading with cte and same subqueryalias name (#34838) (#35047)
Example:
with tbl1 as (select t1.c1 from t1)
select tbl2.c2 from (select /*+ leading(t2 tbl1) */ tbl1.c1, t2.c2 from tbl1 join t2) as tbl2 join t3;
Reason:
In this case, before analysis, preprocessing changes subquery tbl2 into a CTE plan. This CTE plan should sit in the upper-level CTE plan, not in the logical result sink plan.
2024-05-20 10:44:22 +08:00
5ac4ea2cd9 [Fix](Nereids) fix leading hint with update of alias name (#34434) (#35046)
Problem:
When using a hint such as leading(tbl1 tbl2) in
"select * from (select tbl1.c1 from t1 as tbl1 join t2 as tbl2) join t3 as tbl2 on tbl2.c3 != 101;",
tbl2.c3 refers to t3.c3, not t2.c3.
Cause and fix:
When resolving columns in a condition, the leading hint looks up the RelationId of tbl2.c3. When collecting RelationId and aliasName pairs,
the mapping must be updated if an aliasName is repeated.
2024-05-20 10:40:10 +08:00
7c29a964e5 [Fix](Nereids) fix leading with multi level of brace pairs (#34169) (#35043)
example:
leading(t1 {{t2 t3} {t4 t5}} t6) can be reduced to leading(t1 {t2 t3 {t4 t5}} t6)
also update cases which remove project node from explain shape plan
2024-05-20 10:28:22 +08:00
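The brace reduction in the commit above can be sketched as a small tree rewrite. This assumes that tables inside a brace group are joined left to right, so a brace group in the first position of another group is redundant. The nested-list representation and the function names `flatten_leading_groups` and `render` are hypothetical and are not taken from the Nereids code.

```python
from typing import List, Union

# A hint is a list whose items are table names or nested brace groups.
Node = Union[str, List["Node"]]


def flatten_leading_groups(group: List[Node]) -> List[Node]:
    """Splice a brace group into its parent when it is the parent's first element."""
    children: List[Node] = [
        flatten_leading_groups(child) if isinstance(child, list) else child
        for child in group
    ]
    while children and isinstance(children[0], list):
        children = children[0] + children[1:]
    return children


def render(group: List[Node], top: bool = True) -> str:
    """Format the nested list back into hint syntax."""
    parts = [render(child, False) if isinstance(child, list) else child for child in group]
    body = " ".join(parts)
    return f"leading({body})" if top else "{" + body + "}"


hint: List[Node] = ["t1", [["t2", "t3"], ["t4", "t5"]], "t6"]
reduced = flatten_leading_groups(hint)
print(render(hint))     # leading(t1 {{t2 t3} {t4 t5}} t6)
print(render(reduced))  # leading(t1 {t2 t3 {t4 t5}} t6)
```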
6656508579 [fix](agg) fix DCHECK failure of agg when failed to alloc memory (#35011)
* [fix](agg) fix DCHECK failure of agg when failed to alloc memory

* add comment
2024-05-20 10:12:16 +08:00
7aff10b93b [Fix](inverted index) fix race condition for column reader load inverted index reader (#34922) (#35040) 2024-05-20 09:52:22 +08:00
a6a398d7a4 [Fix](function) remove datev2 signature of microsecond #35017 2024-05-19 19:58:02 +08:00
22f85be712 [fix](hive-ctas) support create hive table with full qualified name (#34984)
Before, when executing `create table hive.db.table as select` to create a table in a hive catalog,
if the current catalog was not a hive catalog, the default engine name was filled with `olap`, which is wrong.

This PR fills the default engine name based on the specified catalog.
2024-05-18 18:42:43 +08:00
89d5f2e816 [fix](multi-catalog)remove http scheme in oss endpoint (#34907)
Remove the http scheme from the OSS endpoint; otherwise the scheme may appear in the URL (http://bucket.http//.region.aliyuncs.com) when the http client is used.
2024-05-18 18:42:33 +08:00
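A minimal sketch of the idea in the commit above: strip any scheme from the configured endpoint before prefixing the bucket name, so the scheme cannot end up inside the host. The functions `strip_scheme` and `bucket_host` and the sample endpoint value are illustrative assumptions, not the code or configuration used in Doris.

```python
def strip_scheme(endpoint: str) -> str:
    """Drop a leading http:// or https:// from an endpoint, if present."""
    for prefix in ("http://", "https://"):
        if endpoint.startswith(prefix):
            return endpoint[len(prefix):]
    return endpoint


def bucket_host(bucket: str, endpoint: str) -> str:
    """Build a virtual-hosted-style host name from a bucket and an endpoint."""
    return f"{bucket}.{strip_scheme(endpoint)}"


# Hypothetical endpoint value, given once with and once without a scheme.
with_scheme = bucket_host("bucket", "http://oss-cn-beijing.aliyuncs.com")
without_scheme = bucket_host("bucket", "oss-cn-beijing.aliyuncs.com")
```

Both calls yield the same host, whereas prefixing the bucket to the unstripped endpoint would place `http://` in the middle of the host name.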
a59f9c3fa1 [fix](planner) fix unrequired slot bug in join node introduced by #25204 (#34923)
Before the fix, the join node retained some slots that are neither materialized nor required.
The join node needs to remove these slots and must not make them output slots.

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2024-05-18 18:40:56 +08:00
435147d449 [enhance](mtmv) MTMV handles partitions by name instead of id (#34910)
The partition id changes on insert overwrite.

When the materialized view runs a task while the base table is undergoing an insert overwrite, the materialized view task may report an error: partition not found by partitionId

Upgrade compatibility: Hive currently does not support automatic refresh, so it has no impact
2024-05-18 18:40:29 +08:00
e3e5f18f26 [Fix](Json type) correct cast result for json type (#34764) 2024-05-18 18:40:17 +08:00
81bcb9d490 [opt](planner)(Nereids) support auto aggregation for random distributed table (#33630)
Support auto aggregation when querying detail data of a randomly distributed table:
rows with the same key columns are returned as a single row.
2024-05-18 18:40:16 +08:00
bfd875eae3 [opt](nereids) lazy get expression map when comparing hypergraph (#34753) 2024-05-18 18:38:19 +08:00
e66dd58860 [Improve](inverted index) improve performance by introducing bulk contains for bitmap in _build_index_result_column (#34831) 2024-05-18 18:38:04 +08:00
9b5028785d [fix](prepare) fix datetimev2 return err when binary_row_format (#34662)
Fix datetimev2 returning an error under binary_row_format. Before this PR, the Backend always returned datetimev2 via to_string.
Fix datetimev2 return metadata losing its scale.
2024-05-18 18:37:41 +08:00
437c1a1ba4 [enhancement](regression-test) modify a key type tests (#34717)
Co-authored-by: cjj2010 <2449402815@qq.com>
2024-05-18 18:37:41 +08:00
274c96b12d [enhancement](regression-test) modify a key type tests (#34600)
Co-authored-by: cjj2010 <2449402815@qq.com>
2024-05-18 18:37:41 +08:00
05605d99a9 [opt](routine-load) optimize routine load task allocation algorithm (#34778) 2024-05-18 18:37:41 +08:00
cc11e50200 [fix](mtmv)Fix slot desc wrong in query rewrite by materialized view when query is complex (#34904) 2024-05-18 18:37:10 +08:00
5b72dd1217 [chore](test) remove useless drop table in test_list_partition_datatype (#34930) 2024-05-18 18:36:48 +08:00
73419c2431 [enhance](mtmv) MTMV supports determining whether hive table data is in sync (#34845)
This was already supported; this PR only turns on the switch.
2024-05-18 18:35:42 +08:00
eb7eaee386 [fix](function) money format (#34680) 2024-05-18 18:35:29 +08:00
6f5abfd23f [regression-test](fix) fix case bug, using test_insert_dft_tbl in multiple test cases #34983 2024-05-18 18:35:16 +08:00
db273d578f [Fix](tablet id) use int64_t instead of int32_t or uint32_t for tablet_id (#34962) 2024-05-18 18:34:05 +08:00
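The overflow that motivates widening ids to int64_t (here for tablet_id, and for the index id in b4a798240a) can be shown with a small sketch. Python integers do not overflow, so `ctypes` fixed-width types stand in for the C++ types; the id value is made up for illustration.

```python
import ctypes


def as_int32(value: int) -> int:
    """Reinterpret a value as a 32-bit signed integer, wrapping as a narrowing cast would."""
    return ctypes.c_int32(value).value


def as_int64(value: int) -> int:
    """Reinterpret a value as a 64-bit signed integer."""
    return ctypes.c_int64(value).value


# Hypothetical id just past the int32 maximum of 2147483647.
tablet_id = 2**31 + 5

truncated = as_int32(tablet_id)  # wraps around to a negative number
preserved = as_int64(tablet_id)  # keeps the original value
```

Once an id exceeds the int32 range, the narrowed value no longer identifies the same tablet, so lookups keyed on it can miss or hit the wrong entry.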