This pull request includes part of the statistics implementation (https://github.com/apache/incubator-doris/issues/6370). It does not affect any existing code, and users cannot create statistics jobs yet.
Currently, only MetaStatisticsTask, which collects statistics directly by reading FE meta, is implemented. SQLStatisticsTask, which needs to query BE through FE, is still being implemented.
This PR implements the following:
1. Support statistics collection for partitioned and non-partitioned tables. For partitioned tables, statistics can be collected for specified partitions.
2. Tasks are divided according to whether the table is partitioned or non-partitioned. The finest granularity is the tablet level. A meta task collects as many statistics as possible.
3. Add partition statistics (Table -> Partition -> Column). For example, the size and row count of the table, the size and row count of each partition, and the maximum and minimum values of the columns.
4. Display and modify partition-level statistics.
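The Table -> Partition -> Column hierarchy from item 3 could be modeled roughly as below. This is only a minimal sketch: every class, field, and method name is hypothetical and is not taken from this PR, and deriving the table-level totals by summing the partitions is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// All names here are hypothetical; they are not the classes added by this PR.
class ColumnStatsSketch {
    String minValue;
    String maxValue;
}

class PartitionStatsSketch {
    long rowCount;
    long dataSize;
    // Column name -> column-level statistics within this partition.
    final Map<String, ColumnStatsSketch> columns = new HashMap<>();
}

class TableStatsSketch {
    // Partition name -> partition-level statistics.
    final Map<String, PartitionStatsSketch> partitions = new HashMap<>();

    // Create or overwrite the row count and data size of one partition.
    void updatePartition(String partition, long rowCount, long dataSize) {
        PartitionStatsSketch p = partitions.computeIfAbsent(partition, k -> new PartitionStatsSketch());
        p.rowCount = rowCount;
        p.dataSize = dataSize;
    }

    // Create or overwrite the min/max values of one column in one partition.
    void updateColumn(String partition, String column, String min, String max) {
        PartitionStatsSketch p = partitions.computeIfAbsent(partition, k -> new PartitionStatsSketch());
        ColumnStatsSketch c = p.columns.computeIfAbsent(column, k -> new ColumnStatsSketch());
        c.minValue = min;
        c.maxValue = max;
    }

    // Assumption: table-level totals are the sum over all partitions.
    long rowCount() {
        long total = 0;
        for (PartitionStatsSketch p : partitions.values()) {
            total += p.rowCount;
        }
        return total;
    }

    long dataSize() {
        long total = 0;
        for (PartitionStatsSketch p : partitions.values()) {
            total += p.dataSize;
        }
        return total;
    }
}
```

In this sketch, modifying partition-level statistics (item 4) is just a call to `updatePartition` or `updateColumn`; the table-level values follow from the partitions.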
…
* support like/not like conjuncts push down to storage engine
* vectorized engine support like/not like conjuncts push down to storage engine
* support both evaluate and evaluate_vec method in like predicate
* reuse remove_pushed_conjuncts and prevent logic errors when moving function conjuncts
* change #ifndef to #pragma once as per review comments
* change enable_function_pushdown default to false
Co-authored-by: heguangnan <heguangnan@bytedance.com>
Add hashCode() and equals() for operators.
Add basic UTs for them (more detailed tests are still needed).
**future ticket**: add hashCode(), equals() and UT for `Expression`
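As an illustration of the `equals()`/`hashCode()` contract mentioned above, here is a minimal sketch on a hypothetical operator class. `LimitOperatorSketch` and its fields are assumptions made for this example only; they are not the operator classes touched by this change.

```java
import java.util.Objects;

// Hypothetical operator, used only to illustrate the equals()/hashCode() contract.
final class LimitOperatorSketch {
    private final long limit;
    private final long offset;

    LimitOperatorSketch(long limit, long offset) {
        this.limit = limit;
        this.offset = offset;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (o == null || getClass() != o.getClass()) {
            return false;
        }
        LimitOperatorSketch that = (LimitOperatorSketch) o;
        return limit == that.limit && offset == that.offset;
    }

    // Must use the same fields as equals(), so that equal operators hash equally.
    @Override
    public int hashCode() {
        return Objects.hash(limit, offset);
    }
}
```

With both methods defined on the same fields, two structurally identical operators collapse to a single entry in a `HashSet` or `HashMap` key set.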
* improvement for dynamic schema
no longer use the schema as the LRU cache key.
when loading a segment, use the rowset's original schema instead of the current read schema.
generate column readers and column iterators using the original schema; use the read schema only for new columns.
use the column unique id as the key instead of the column ordinal.
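The lookup rule described above can be sketched as follows. This is an illustration in Java with hypothetical names (`ColumnSketch`, `SegmentSketch`, `resolve`), not the actual storage engine code: columns are keyed by unique id, the original schema wins for columns stored in the segment, and the read schema is used only for a column the segment does not contain.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical column definition: a stable unique id plus a name.
class ColumnSketch {
    final int uniqueId;
    final String name;

    ColumnSketch(int uniqueId, String name) {
        this.uniqueId = uniqueId;
        this.name = name;
    }
}

// Hypothetical segment that remembers the schema it was written with.
class SegmentSketch {
    // Keyed by column unique id, not by ordinal, so reordering or adding
    // columns in the read schema does not change which column is found.
    private final Map<Integer, ColumnSketch> columnsByUniqueId = new HashMap<>();

    SegmentSketch(List<ColumnSketch> originalSchema) {
        for (ColumnSketch column : originalSchema) {
            columnsByUniqueId.put(column.uniqueId, column);
        }
    }

    boolean contains(int uniqueId) {
        return columnsByUniqueId.containsKey(uniqueId);
    }

    // Pick the definition to build a reader from: the original-schema column
    // if the segment stores it, otherwise the read-schema column (new column).
    ColumnSketch resolve(ColumnSketch readColumn) {
        ColumnSketch stored = columnsByUniqueId.get(readColumn.uniqueId);
        return stored != null ? stored : readColumn;
    }
}
```

Because the key is the unique id, a column that sits at a different ordinal in the read schema still resolves to the definition the segment was written with.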
Co-authored-by: yiguolei <yiguolei@gmail.com>
In strict memory usage mode (`STRICT_MEMORY_USE=ON`), when the capacity of the vectorized hash table is greater than 2G, the table grows only once 75% of its capacity is filled, and the memory usage of the vectorized join becomes 50% of the previous value.
`STRICT_MEMORY_USE=ON` expects BE to use less memory and prioritizes stability when cluster memory is limited.
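The growth rule can be sketched as a small predicate. This is an illustration in Java, not the BE implementation: the class and method names are hypothetical, the 50% threshold used outside strict mode is an assumption (the description only states the 75% rule), and so is the unit in which the 2G capacity is measured.

```java
// Hypothetical growth policy; names and the non-strict threshold are assumptions.
final class GrowthPolicySketch {
    // "2G" from the description; whether it counts bytes or slots is an assumption.
    static final long TWO_G = 2L * 1024 * 1024 * 1024;

    private GrowthPolicySketch() {
    }

    // size and capacity are expressed in the same unit as TWO_G.
    static boolean shouldGrow(long size, long capacity, boolean strictMemoryUse) {
        if (strictMemoryUse && capacity > TWO_G) {
            // Strict mode, large table: grow only once 75% of the capacity is used.
            return size > capacity / 4 * 3;
        }
        // Assumed default: grow once 50% of the capacity is used.
        return size > capacity / 2;
    }
}
```

Under these assumptions, a large table whose fill level lies between 50% and 75% keeps its current capacity in strict mode, while it would already have grown otherwise.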
Compilation error after merging due to Literal refactoring.
Compilation failure:
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/functions/Substring.java:[40,38] org.apache.doris.nereids.trees.expressions.Literal is abstract; cannot be instantiated
* remove alpha_rowset_meta
* remove alpha rowset related codes in compaction
* remove alpha rowset related codes in RowsetMeta
* fix BE UTs, because some UTs use alpha rowset meta