Previously, sampling a partitioned table assumed two conditions: 1. data is evenly distributed across partitions; 2. data is evenly distributed across buckets. As a result, the same number of rows was sampled from every partition and every bucket.
Now, the number of rows sampled from each partition and bucket is proportional to the number of rows it holds.
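A minimal sketch of the proportional allocation described above. This is not the code from the PR: the function name `allocate_sample_rows`, the `(partition, bucket)` keys, and the largest-remainder rounding are assumptions made for illustration only.

```python
def allocate_sample_rows(row_counts, total_sample_rows):
    """Split total_sample_rows across units in proportion to their row counts.

    row_counts maps a hypothetical unit key, e.g. (partition, bucket), to its row count.
    Rounding uses the largest-remainder method so the shares add up exactly.
    """
    total_rows = sum(row_counts.values())
    if total_rows == 0 or total_sample_rows <= 0:
        return {unit: 0 for unit in row_counts}

    # Never ask for more rows than the table holds.
    total_sample_rows = min(total_sample_rows, total_rows)

    # Integer arithmetic avoids floating-point rounding surprises.
    allocation = {
        unit: total_sample_rows * count // total_rows
        for unit, count in row_counts.items()
    }
    remainders = {
        unit: total_sample_rows * count % total_rows
        for unit, count in row_counts.items()
    }

    # Hand the rows lost to rounding down to the units with the largest remainders.
    leftover = total_sample_rows - sum(allocation.values())
    by_remainder = sorted(row_counts, key=lambda unit: remainders[unit], reverse=True)
    for unit in by_remainder[:leftover]:
        allocation[unit] += 1
    return allocation


if __name__ == "__main__":
    counts = {("p1", 0): 600, ("p1", 1): 300, ("p2", 0): 100}
    print(allocate_sample_rows(counts, 100))
```

With uneven data, a unit holding 60% of the rows receives 60% of the sample, instead of the equal share that the old approach would give it.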
1. Refactor the decode logic for reading Parquet. The Parquet reader first reads the data according to the Parquet physical type, and then performs a type conversion.
2. Support Hive alter table.
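The two-stage read in item 1 (decode by physical type, then convert) can be sketched as follows. This is an illustration in Python rather than the reader's actual C++ code; `convert_physical_value` and its string type tags are hypothetical. The mappings themselves follow the Parquet format: `DATE` is stored as an INT32 count of days since the Unix epoch, a microsecond timestamp as an INT64, and `DECIMAL` as an unscaled integer.

```python
from datetime import date, datetime, timedelta
from decimal import Decimal

EPOCH_DATE = date(1970, 1, 1)
EPOCH_TIMESTAMP = datetime(1970, 1, 1)


def convert_physical_value(physical_value, logical_type, scale=0):
    """Convert an already-decoded physical value to its logical value.

    Stage 1 (not shown) decodes raw column data by physical type (INT32, INT64, ...).
    Stage 2 (this function) applies the conversion for the target type.
    """
    if physical_value is None:
        return None
    if logical_type == "DATE":
        # INT32: days since 1970-01-01
        return EPOCH_DATE + timedelta(days=physical_value)
    if logical_type == "TIMESTAMP_MICROS":
        # INT64: microseconds since 1970-01-01 00:00:00
        return EPOCH_TIMESTAMP + timedelta(microseconds=physical_value)
    if logical_type == "DECIMAL":
        # Unscaled integer; shift the decimal point left by `scale` digits
        return Decimal(physical_value).scaleb(-scale)
    # No conversion needed: the physical value is already the logical value
    return physical_value


if __name__ == "__main__":
    print(convert_physical_value(19656, "DATE"))
    print(convert_physical_value(12345, "DECIMAL", scale=2))
```

Separating the two stages keeps the physical decoders small and lets a conversion such as a changed column type be handled in one place.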
This PR makes three changes to how complex types are displayed:
1. A NULL value inside a complex type is displayed as `null`, not `NULL`.
2. A struct field is displayed as `"column_name": column_value`.
3. Time types such as `datetime` and `date` are displayed in double quotes inside complex types, for example:
`{1, "2023-10-26 12:12:12"}`
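The three display rules can be sketched in Python as below. This is an illustration only, not the `DataTypeSerDe` implementation: the `render` function is hypothetical, structs are modeled as dicts and arrays as lists, the `, ` separator is taken from the example above, and printing a top-level NULL as `NULL` is an assumption. Other scalar types are passed through `str()` because their formatting is not covered here.

```python
from datetime import date, datetime


def render(value, nested=False):
    """Render a value for display; `nested` is True inside a complex type."""
    if value is None:
        # Rule 1: lowercase `null` inside complex types
        return "null" if nested else "NULL"
    if isinstance(value, datetime):  # checked before date: datetime subclasses date
        text = value.strftime("%Y-%m-%d %H:%M:%S")
        # Rule 3: time types are double-quoted inside complex types
        return f'"{text}"' if nested else text
    if isinstance(value, date):
        text = value.isoformat()
        return f'"{text}"' if nested else text
    if isinstance(value, dict):
        # Rule 2: struct fields are shown as "column_name": column_value
        fields = ", ".join(
            f'"{name}": {render(field, nested=True)}' for name, field in value.items()
        )
        return "{" + fields + "}"
    if isinstance(value, list):
        return "[" + ", ".join(render(item, nested=True) for item in value) + "]"
    # Formatting of other scalar types is outside this sketch
    return str(value)


if __name__ == "__main__":
    print(render({"id": 1, "created": datetime(2023, 10, 26, 12, 12, 12)}))
    print(render([None, date(2023, 10, 26)]))
```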
This PR also refactors the code:
1. `nesting_level` is now a member variable of `DataTypeSerDe` rather than a method parameter.
In addition, this PR fixes a bug where fileSize is incorrect, introduced by PR #25854.
When `CREATE TABLE LIKE` is run on a table that has a column type such as `decimal(10, 0)`, the precision and scale are lost because the scale is 0. This PR fixes the bug.
**Before the fix, create a table with the following SQL**:
CREATE TABLE IF NOT EXISTS db_test.table_test
(
`name` varchar COMMENT "1m size",
`id` SMALLINT COMMENT "[-32768, 32767]",
`timestamp0` decimal null comment "c0",
`timestamp1` decimal(38, 0) null comment "c1"
)
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES ('replication_num' = '1');
**and then run**:
CREATE TABLE db_test.table_test_like LIKE db_test.table_test;
SHOW CREATE TABLE db_test.table_test_like;
The field `timestamp1` is shown as `decimal(9, 0)` instead of `decimal(38, 0)`, which is wrong. This PR fixes it.
When shared scan is enabled, all scanners are created by one instance. When this main instance reaches EOS and quits, all of its state is released, but other instances may still fetch blocks from those scanners. So we must ensure that the scanners do not depend on any state of the main instance after it quits.
1. FE could not compile because of the error below, introduced by PR #25933:
```
[INFO] --- exec:3.1.0:java (doc) @ fe-core ---
...
Failed to generate doc for ignoreRuntimeFilterIds
```
2. Fix UT bugs introduced by the PRs below:
> - #25951
> - #26031
3. Because FE could not compile, the FE UT CI did not work properly, so the PRs merged after PR #25933 introduced some UT failures. This PR reverts them to fix the FE UT:
> - Revert "[Bug](materialized-view) SelectMaterializedIndexWithAggregate do not change plan when match ba… (#26145)"
> This reverts commit 8d7abf60f94d2d1208b71e96b9290ea02122b8d8.
> - Revert "[enhancement](Nereids): optimize GroupExpressionMatching (#26130)"
> This reverts commit 19122b55cd95af097b4ef7b6eb809f37db29765f.
> - Revert "[Performance](Nereids): optimize GroupExpressionMatching (#26084)"
> This reverts commit 0d956e90cf920039b8baa79c170a298be56a128d.
1. Restore overwrites an existing table
2. Backup and restore with excluded tables
3. Restore to a new table
4. Restore a mix of existing and new tables
5. Restore with an alias