Fix of PR #23582
Some Fe codes are deleted by [Improvement](pipeline) Cancel outdated query if original fe restarts #23582 , need to be added back;
Fix mac build failed caused by wrong thrift declaration order.
1. "set" may overwrite the original ID.
2.A bloom filter may not necessarily be an IN_OR_BLOOM_FILTER.
before may be
RuntimeFilterInfo id -1: [type = BF, input = 25, filtered = 0]
now
RuntimeFilterInfo id 0: [type = BF, input = 25, filtered = 0]
Current initialization dependency:
Daemon ───┬──► StorageEngine ──► ExecEnv ──► Disk/Mem/CpuInfo
│
│
BackendService ─┘
However, original code incorrectly initialize Daemon before StorageEngine.
This PR also stop and join threads of daemon services in their dtor, to ensure Daemon services release resources in reverse order of initialization via RAII.
1. When file meta cache is disabled (by setting `max_external_file_meta_cache_num=0` in be.conf),
the parquet's meta info is owned by parquet reader and will be released when calling `reader->close()`.
But the underlying file reader of this parquet reader will be released after `reader->close()`,
this may causing `heap-use-after-free` bug because some part of meta info may be referenced by file reader.
This PR fix it by making sure that meta info is released after file reader released.
2. Add modification time for file meta cache in BE, to avoid parquet read error like:
`Failed to deserialize parquet page header`
The reason is that sql cache just use partitionKey , latestVersion and latestTime to check if the cache should be returned, if we delete some partition(s) which is not the latest updated partition, all above values are not changed, so the cache will hit.
Use a field to save the partition num of these tables and sum the partition nums and send it to BE, there are two situations which contains delete-partition ops:
- just delete some partition(s), so the sum of partition num will be lower than before.
- delete some partition(s) coexists with add some partition(s), so the latest time or latest version will be higher than before.
log error:
`W20230830 11:19:47.495721 3046231 status.h:363] meet error status: [INTERNAL_ERROR]user function's name should be function_id.checksum[.file_name].file_type, now the all split parts are by delimiter(.): 7119053928154065546.20c8228267b6c9ce620fddb39467d3eb.postgresql-42.5.0.jar`
When the jdbc driver had `.` in its name we failed to split it properly
eg: udf('asd'), this need return a const column, otherwise will be failed use the return column as other function params.
mysql> select concat('a', 'b', cuuid9('a'), ':c');
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.6)[CANCELLED][INTERNAL_ERROR]const check failed, expr=VectorizedFn[VectorizedFnCallconcat]{
VLiteral (name = String, type = String, value = (a)),
VLiteral (name = String, type = String, value = (b)),
VectorizedFn[VectorizedFnCallcuuid9]
{ VLiteral (name = String, type = String, value = (a))},
VLiteral (name = String, type = String, value = (:c))}
New structure for delete sub predicate.
Delete sub predicate uses a string type condition_str to stored temporarily now and fields will be extracted from it using std::regex, which may introduces stack overflow when matching a extremely large string(bug of libc).
Now we attempt to use a new PB structure to hold the delete sub predicate, to avoid that problem.
message DeleteSubPredicatePB {
optional int32 column_unique_id = 1;
optional string column_name = 2;
optional string op = 3;
optional string cond_value = 4;
}
Currently, 2 versions of sub predicate will both be filled. For query, we use the v2, and during compaction we still use v1. The old rowset meta with delete predicates which had sub predicate v1 will be attempted to convert to v2 when read from PB. Moreover, efforts will be made to rewrite these meta with the new delete sub predicate.
Make preparation to use column unique id to specify a column globally.
Using the column unique id rather than the column name to identify a column is vital for flexible schema change. The rewritten delete predicate will attach column unique id.