1. Evict the dropped stats from cache
2. Remove codes for the partition level stats collection
3. Disable analyze whole database directly
4. Fix the potential death loop in the stats cleaner
5. Sleep thread in each loop when scanning stats table to avoid excessive IO usage by this task.
`commons-lang`(1and2) is no longer maintained since 2011, and the official recommendation is `commons-lang3`, which can be smoothly upgraded to be compatible with `commons-lang`.
We use both dependencies in `fe`, which can be completely unified.
`PatternGenerator#generateTypePattern` has many meaningless loops, and IntegerRange is introduced for,
which is unnecessary. So I refactored it.
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
Tpc-h q10 and q5 benefit from this optimization.
For a given hash join condition, A=B, sometimes both A and B are reduced by filters. In this pr, both reductions are counted in join estimation.
1. Add apache-orc git submodule update path, not update all modules
When sh build.sh, update all modules will fails serveral times because of unstable github network.
It wastes many time.
2. Add gitignore for be/src/apache-orc/ to avoid mistake commits.
PR(#17960) has introduced vector table which can map java table to c++ block.
In some cases(java udf & jdbc exector), we should map c++ block to java table. This PR implements this function.
The memory structure of java vector table and c++ block is consistent,
so the implementation doesn't copy the block, just passes the memory address.
`hudi-common` depends on `parque-avro`, but the dependency scope is `provide`.
When we use `hudi-catalog`, `HoodieAvroWriteSupport` will be called. This method depends on `parque-avro`, so it will generate ClassNotFound
Describe your changes.
For each release of Doris, there are some experimental features.
These feature may not stable or qualified enough, and user need to use it by setting config or session variables,
eg, set enable_mtmv = true, otherwise, these feature is disable by default.
We should explicitly tell user which features are experimental, so that user will notice that and decide whether to
use it.
Changes
In this PR, I support the experimental_ prefix for FE config and session variables.
Session Variable
Given enable_nereids_planner as an example.
The Nereids planner is an experimental feature in Doris, so there is an EXPERIMENTAL annotation for it:
@VariableMgr.VarAttr(..., expType = ExperimentalType.EXPERIMENTAL)
private boolean enableNereidsPlanner = false;
And for compatibility, user can set it by:
set enable_nereids_planner = true;
set experimental_enable_nereids_planner = true;
And for show variables, it will only show experimental_enable_nereids_planner entry.
And you can also see all experimental session variables by:
show variables like "%experimental%"
Config
Same as session variable, give enable_mtmv as an example.
@ConfField(..., expType = ExperimentalType.EXPERIMENTAL)
public static boolean enable_mtmv = false;
User can set it in fe.conf or ADMIN SET FRONTEND CONFIG stmt with both names:
enable_mtmv
experimental_enable_mtmv
And user can see all experimental FE configs by:
ADMIN SHOW FRONTEND CONFIG LIKE "%experimental%";
TODO
Support this feature for BE config
Only add experimental for:
enable_pipeline_engine
enable_nereids_planner
enable_single_replica_insert
and FE config:
enable_mtmv
enabel_ssl
enable_fqdn_mode
Should modify other config and session vars