[feature](delete-predicate) support delete sub predicate v2 (#22442)
New structure for delete sub predicate.
Delete sub predicates are currently stored temporarily in a string field, condition_str, and their fields are extracted from it with std::regex, which may cause a stack overflow when matching an extremely large string (a libc bug).
This commit introduces a new PB structure to hold the delete sub predicate and avoid that problem.
message DeleteSubPredicatePB {
    optional int32 column_unique_id = 1;
    optional string column_name = 2;
    optional string op = 3;
    optional string cond_value = 4;
}
Currently, both versions of the sub predicate are filled. Queries use v2, while compaction still uses v1. Old rowset meta whose delete predicates contain only v1 sub predicates is converted to v2 when it is read from PB. Moreover, we will try to rewrite such meta with the new delete sub predicate.
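As a rough illustration of the v1-to-v2 conversion described above, the following Java sketch splits a condition string into the four fields of DeleteSubPredicatePB with a single linear scan instead of a regex. The class name, the operator set, and the assumed "<column><op><value>" layout are hypothetical and chosen only for illustration; the actual conversion lives in the backend and may differ.

```java
// Sketch only: the operator set and the "<column><op><value>" layout are
// assumptions, not the exact v1 format used by Doris.
final class DeleteSubPredicateSketch {

    // Mirrors the four fields of DeleteSubPredicatePB.
    static final class SubPredicate {
        final int columnUniqueId;
        final String columnName;
        final String op;
        final String condValue;

        SubPredicate(int columnUniqueId, String columnName, String op, String condValue) {
            this.columnUniqueId = columnUniqueId;
            this.columnName = columnName;
            this.op = op;
            this.condValue = condValue;
        }
    }

    // Hypothetical operators. Two-character operators are listed first so that
    // ">=" wins over ">" when both start at the same position.
    private static final String[] OPS = {"!=", ">=", "<=", "=", ">", "<"};

    // Splits at the earliest operator; the column name is assumed to contain
    // no operator characters, while the value may contain them.
    static SubPredicate fromConditionStr(String cond, int columnUniqueId) {
        int bestPos = -1;
        String bestOp = null;
        for (String op : OPS) {
            int pos = cond.indexOf(op);
            if (pos > 0 && (bestPos == -1 || pos < bestPos)) {
                bestPos = pos;
                bestOp = op;
            }
        }
        if (bestOp == null) {
            throw new IllegalArgumentException("no operator in: " + cond);
        }
        String column = cond.substring(0, bestPos).trim();
        String value = cond.substring(bestPos + bestOp.length()).trim();
        return new SubPredicate(columnUniqueId, column, bestOp, value);
    }
}
```

Because the scan uses only indexOf and substring, its stack usage does not grow with the length of the condition value, which is the property the structured form is meant to guarantee.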
Prepare to use the column unique id to identify a column globally.
Identifying a column by its unique id rather than by its name is vital for flexible schema change. The rewritten delete predicates will carry the column unique id.
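To make the motivation concrete, here is a minimal Java sketch of why a predicate keyed by unique id survives a column rename while one keyed by name does not. TableSchemaSketch and its methods are hypothetical helpers written for this note, not Doris classes.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: TableSchemaSketch and its methods are hypothetical, not Doris classes.
final class TableSchemaSketch {
    // Column unique id -> current column name.
    private final Map<Integer, String> nameById = new HashMap<>();

    void addColumn(int uniqueId, String name) {
        nameById.put(uniqueId, name);
    }

    // A rename changes the name but never the unique id.
    void renameColumn(int uniqueId, String newName) {
        nameById.put(uniqueId, newName);
    }

    // Lookup by unique id; returns null if the column does not exist.
    String nameOf(int uniqueId) {
        return nameById.get(uniqueId);
    }

    // Lookup by name, which is what a name-based predicate would rely on.
    boolean hasColumnNamed(String name) {
        return nameById.containsValue(name);
    }
}
```

A delete predicate that recorded unique id 3 for a column originally named k1 can still be resolved after the column is renamed, whereas a predicate that recorded only the name k1 no longer matches any column.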
@@ -381,6 +381,8 @@ public class SessionVariable implements Serializable, Writable {

    public static final String ROUND_PRECISE_DECIMALV2_VALUE = "round_precise_decimalv2_value";

    public static final String ENABLE_DELETE_SUB_PREDICATE_V2 = "enable_delete_sub_predicate_v2";

    public static final String JDBC_CLICKHOUSE_QUERY_FINAL = "jdbc_clickhouse_query_final";

    public static final String ENABLE_MEMTABLE_ON_SINK_NODE =
@@ -1109,6 +1111,9 @@ public class SessionVariable implements Serializable, Writable {
    @VariableMgr.VarAttr(name = PARALLEL_SYNC_ANALYZE_TASK_NUM)
    public int parallelSyncAnalyzeTaskNum = 2;

    @VariableMgr.VarAttr(name = ENABLE_DELETE_SUB_PREDICATE_V2, fuzzy = true, needForward = true)
    public boolean enableDeleteSubPredicateV2 = true;

    @VariableMgr.VarAttr(name = TRUNCATE_CHAR_OR_VARCHAR_COLUMNS,
            description = {"是否按照表的 schema 来截断 char 或者 varchar 列。默认为 false。\n"
                    + "因为外表会存在表的 schema 中 char 或者 varchar 列的最大长度和底层 parquet 或者 orc 文件中的 schema 不一致"
@@ -1144,9 +1149,11 @@ public class SessionVariable implements Serializable, Writable {
        if (randomInt % 2 == 0) {
            this.rewriteOrToInPredicateThreshold = 100000;
            this.enableFunctionPushdown = false;
            this.enableDeleteSubPredicateV2 = false;
        } else {
            this.rewriteOrToInPredicateThreshold = 2;
            this.enableFunctionPushdown = true;
            this.enableDeleteSubPredicateV2 = true;
        }
        this.runtimeFilterType = 1 << randomInt;
        /*
@@ -2205,6 +2212,7 @@ public class SessionVariable implements Serializable, Writable {
        tResult.setEnableParquetLazyMat(enableParquetLazyMat);
        tResult.setEnableOrcLazyMat(enableOrcLazyMat);

        tResult.setEnableDeleteSubPredicateV2(enableDeleteSubPredicateV2);
        tResult.setTruncateCharOrVarcharColumns(truncateCharOrVarcharColumns);
        tResult.setEnableMemtableOnSinkNode(enableMemtableOnSinkNode);