ElvinWei 1a6401d682 [enhancement](statistics) support sampling collection of statistics (#18880)
1. Supports collecting statistics by sampling
2. Improves the syntax for collecting statistics
3. Supports specifying the number of buckets for histograms
4. Adjusts some code structure

---

The syntax supports both WITH clauses and PROPERTIES, keeping the same syntax as before.

Column statistics collection syntax:
```SQL
ANALYZE [ SYNC ] TABLE table_name
     [ (column_name [, ...]) ]
     [ [WITH SYNC] | [WITH INCREMENTAL] | [WITH SAMPLE PERCENT | ROWS ] ]
     [ PROPERTIES ('key' = 'value', ...) ];
```
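
For illustration, statements following this grammar might look like the sketch below. The table `example_db.example_tbl` and the columns `col1` and `col2` are hypothetical, and placing the numeric value directly after `PERCENT` or `ROWS` is an assumption, since the grammar above does not show where the value goes.

```SQL
-- Collect statistics for all columns and wait for completion.
-- example_db.example_tbl is a hypothetical table name.
ANALYZE TABLE example_db.example_tbl WITH SYNC;

-- Collect statistics for two (hypothetical) columns from a 50 percent sample.
-- Assumption: the sample value follows the PERCENT keyword.
ANALYZE TABLE example_db.example_tbl (col1, col2) WITH SAMPLE PERCENT 50;

-- Collect statistics from a sample of 1000 rows.
-- Assumption: the sample value follows the ROWS keyword.
ANALYZE TABLE example_db.example_tbl WITH SAMPLE ROWS 1000;
```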

Column histogram collection syntax:
```SQL
ANALYZE [ SYNC ] TABLE table_name
     [ (column_name [, ...]) ]
     UPDATE HISTOGRAM
     [ [ WITH SYNC ][ WITH INCREMENTAL ][ WITH SAMPLE PERCENT | ROWS ][ WITH BUCKETS ] ]
     [ PROPERTIES ('key' = 'value', ...) ];
```
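
A histogram collection statement under this grammar could be sketched as follows. The table and column names are hypothetical, and the position of the numeric values after `BUCKETS` and `ROWS` is an assumption not spelled out in the grammar above.

```SQL
-- Build a histogram for col1 with at most 10 buckets.
-- Assumption: the bucket count follows the BUCKETS keyword.
ANALYZE TABLE example_db.example_tbl (col1)
    UPDATE HISTOGRAM
    WITH BUCKETS 10;

-- Combine several WITH clauses: synchronous, sampled histogram collection.
ANALYZE TABLE example_db.example_tbl (col1)
    UPDATE HISTOGRAM
    WITH SYNC WITH SAMPLE ROWS 1000 WITH BUCKETS 10;
```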

Explanation:
- sync: Collect statistics synchronously. The statement returns after collection completes.
- incremental: Collect statistics incrementally. Incremental collection of histogram statistics is not supported.
- sample percent | rows: Collect statistics by sampling, either by percentage or by number of rows.
- buckets: Specifies the maximum number of buckets generated when collecting histogram statistics.
- table_name: The target table for collecting statistics. Can be of the form `db_name.table_name`.
- column_name: The target column. It must exist in `table_name`; multiple column names are separated by commas.
- properties: Properties used to configure the statistics task. Currently only the following configurations are supported (equivalent to the WITH clauses):
   - 'sync' = 'true'
   - 'incremental' = 'true'
   - 'sample.percent' = '50'
   - 'sample.rows' = '1000'
   - 'num.buckets' = '10'
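
As a sketch of the PROPERTIES form, the statement below expresses synchronous, sampled histogram collection using only the property keys listed above. The table and column names are hypothetical, and every value is quoted here to match the `'key' = 'value'` shape in the grammar.

```SQL
-- PROPERTIES form of a synchronous, sampled histogram collection.
-- example_db.example_tbl and col1 are hypothetical names.
ANALYZE TABLE example_db.example_tbl (col1)
    UPDATE HISTOGRAM
    PROPERTIES (
        'sync' = 'true',
        'sample.rows' = '1000',
        'num.buckets' = '10'
    );
```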

--- 

TODO:
- Add the complete p0 tests
- `Incremental` statistics: see #18653
2023-04-21 13:11:43 +08:00