The image file of our cluster has reached 2.3 GB. After a checkpoint, Followers time out while synchronizing the image, which causes the bdb directory to keep growing.
related pr: #25768
Notice: this PR changes the FE meta version.
The nullable mode of a UDF function is important.
If it is not written to the persisted info, it will be lost after a restart.
Add a new FE config `ignore_unknown_metadata_module`. The default is false.
If set to true, unknown modules encountered while reading the metadata image file
are ignored and skipped.
This is mainly used for downgrade operations, so that an old version can read an image file written by a new version.
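The skip behavior can be illustrated with a minimal Java sketch. The `ImageModuleReader` class, the module names, and the length-prefixed layout are all hypothetical and do not reflect the actual FE image format; only the config name `ignore_unknown_metadata_module` comes from this PR.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical layout: each module is (name length, name, payload length, payload).
class ImageModuleReader {
    // Stand-in for the set of modules this (older) FE version understands.
    static final Set<String> KNOWN_MODULES = Set.of("db", "alterJob");

    // Returns the names of the modules that were loaded.
    static List<String> readModules(ByteBuffer image, boolean ignoreUnknownMetadataModule) {
        List<String> loaded = new ArrayList<>();
        while (image.hasRemaining()) {
            byte[] nameBytes = new byte[image.getInt()];
            image.get(nameBytes);
            String name = new String(nameBytes, StandardCharsets.UTF_8);
            int payloadLength = image.getInt();
            if (KNOWN_MODULES.contains(name)) {
                loaded.add(name); // a real reader would parse the payload here
            } else if (!ignoreUnknownMetadataModule) {
                throw new IllegalStateException("Unknown metadata module: " + name);
            }
            // Move past the payload, whether it was "loaded" or ignored.
            image.position(image.position() + payloadLength);
        }
        return loaded;
    }

    // Builds a tiny fake image with a 4-byte dummy payload per module.
    static ByteBuffer buildImage(String... moduleNames) {
        ByteBuffer buf = ByteBuffer.allocate(1024);
        for (String name : moduleNames) {
            byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
            buf.putInt(nameBytes.length).put(nameBytes);
            buf.putInt(Integer.BYTES).putInt(42);
        }
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer image = buildImage("db", "futureModule", "alterJob");
        System.out.println(readModules(image, true));
    }
}
```

With the flag set to false, the same image makes the sketch fail on `futureModule`, which corresponds to the default behavior.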
If a user already has a database named mysql, checkpointing runs into problems.
Solution:
Add a check for this situation: if a duplicate is found, log the error and exit to prevent metadata damage.
Add the FE config field `mysqldb_replace_name` to resolve the conflict if the user already has a mysql database.
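A rough Java sketch of such a duplicate-name check follows. It assumes that `mysqldb_replace_name` supplies the name used for the built-in database; the class, method, and the example replacement name are hypothetical and are not taken from the FE code.

```java
import java.util.Set;

// Hypothetical check: fail fast if a user database collides with the built-in one.
class BuiltinDbNameCheck {
    // builtinDbName stands in for the value of mysqldb_replace_name (assumed semantics).
    static void checkNoConflict(Set<String> userDbNames, String builtinDbName) {
        if (userDbNames.contains(builtinDbName)) {
            // Stop instead of continuing with conflicting metadata.
            throw new IllegalStateException(
                    "User database '" + builtinDbName + "' conflicts with the built-in database; "
                    + "set mysqldb_replace_name to a different name");
        }
    }

    public static void main(String[] args) {
        Set<String> userDbs = Set.of("mysql", "sales");
        // "builtin_mysql" is only an example replacement name.
        checkNoConflict(userDbs, "builtin_mysql");
        System.out.println("no conflict");
    }
}
```

Calling `checkNoConflict(userDbs, "mysql")` with the same set throws, which mirrors the "log and exit" behavior described above.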
Related PRs: #23087, #22868
1. Fix the data size calculation for auto sample. Before this PR, the data size included all replicas.
2. Move some auto analyze related options to global session variables.
3. Add some logs.
In the old code, using the desc command to view the table schema
displays array types as follows:
```
ARRAY<TINYINT(4)>
ARRAY<SMALLINT(6)>
ARRAY<INT(11)>
ARRAY<BIGINT(20)>
ARRAY<LARGEINT(40)>
```
However, normal integer types are displayed without the width,
so this PR changes the output to the following:
```
ARRAY<TINYINT>
ARRAY<SMALLINT>
ARRAY<INT>
ARRAY<BIGINT>
ARRAY<LARGEINT>
```
### Support display of auto analyze jobs
After this PR, users and DBAs can use the following syntax to check the execution status of auto analyze jobs:
```sql
SHOW AUTO ANALYZE [tbl_name] [WHERE STATE='SOME STATE']
```
The number of historical auto analyze job records to keep can be configured with the FE option `auto_analyze_job_record_count`; the default value is 2000.
### Enhance auto analyze
After this PR, automatically created analyze jobs will no longer execute beyond a specific time frame.
Add variant type for metadata.
Add persistent information for variant, including the paths of variant sub-columns, and persist it to the segment footer and the tablet schema of the rowset.
1. Automatically analyze with sampling when the table size is greater than `huge_table_lower_bound_size_in_bytes` (5G by default). Users can disable this feature with the FE option `enable_auto_sample`.
2. Support syntax like `ANALYZE TABLE test WITH FULL` to force a full analyze regardless of table size.
3. Fix bugs where table stats are not updated properly when stats are dropped or when only a few columns are analyzed.
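The decision in items 1 and 2 can be sketched in Java as below. This is only an illustration: the `AnalyzeMethodChooser` class is hypothetical, and the threshold constant assumes "5G" means 5 GiB.

```java
// Hypothetical sketch of choosing between a full and a sampled analyze.
class AnalyzeMethodChooser {
    enum Method { FULL, SAMPLE }

    // Assumed default of huge_table_lower_bound_size_in_bytes: 5 GiB.
    static final long HUGE_TABLE_LOWER_BOUND_SIZE_IN_BYTES = 5L * 1024 * 1024 * 1024;

    static Method choose(long tableSizeInBytes, boolean enableAutoSample, boolean withFull) {
        if (withFull) {
            return Method.FULL; // ANALYZE TABLE ... WITH FULL overrides the size check
        }
        if (enableAutoSample && tableSizeInBytes > HUGE_TABLE_LOWER_BOUND_SIZE_IN_BYTES) {
            return Method.SAMPLE; // huge table: analyze a sample instead
        }
        return Method.FULL;
    }

    public static void main(String[] args) {
        long tenGiB = 10L * 1024 * 1024 * 1024;
        System.out.println(choose(tenGiB, true, false));
        System.out.println(choose(tenGiB, true, true));
    }
}
```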
The original virtual number is `Math.max(Math.min(512 / backends.size(), 32), 2)`, which is too small
and causes uneven cache distribution when file cache is enabled.
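To see how small the original value is, the formula can be evaluated for a few backend counts. The wrapper class and the sample backend counts below are illustrative; only the expression itself is from the original code.

```java
// Evaluates the original virtual number formula for several cluster sizes.
class VirtualNumberFormula {
    static int originalVirtualNumber(int backendCount) {
        // Integer division, capped at 32 and floored at 2, as in the original expression.
        return Math.max(Math.min(512 / backendCount, 32), 2);
    }

    public static void main(String[] args) {
        for (int backends : new int[] {1, 8, 16, 64, 300}) {
            System.out.println(backends + " backends -> " + originalVirtualNumber(backends));
        }
    }
}
```

The result never exceeds 32 per backend: 16 backends give 32, 64 backends give 8, and 300 backends hit the floor of 2.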