Supports manually injecting table-level statistics.
Supported table stats types:
- row_count
Modify table or partition statistics:
```SQL
ALTER TABLE table_name SET STATS ('k1' = 'v1', ...)
```
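For example, a minimal sketch that injects a row count into a hypothetical table `t1` (the table name and value are illustrative, not from this PR):

```SQL
-- Inject a manual row count of 1000 for t1; 'row_count' is the only
-- stats type supported so far.
ALTER TABLE t1 SET STATS ('row_count' = '1000')
```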
TODO:
- support other table stats types if necessary
- update statistics cache if necessary
1. refactor aggregate normalization to avoid data amplification before the aggregate
2. remove useless aggregate processing in ExtractAndNormalizeWindowExpression
3. only push down distinct aggregate function children (see the sketch after this list)
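As a sketch of point 3, assuming a hypothetical table t(a, b): in a query like the one below, only the distinct aggregate function's child expression `a + 1` is pushed into the project below the aggregate, rather than whole aggregate expressions.

```SQL
-- Normalization pushes the distinct argument (a + 1) into the project
-- below the aggregate; count(...) itself stays in the aggregate.
SELECT count(DISTINCT a + 1) FROM t GROUP BY b
```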
TODO:
1. push down redundant expressions in aggregate functions (see the sketch after this list)
2. refactor normalize repeat rule
3. move expression normalization and optimization after plan normalization to avoid unexpected expression optimization.
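For point 1 of this TODO, a hypothetical query (table and columns assumed) where the repeated expression `a + 1` inside several aggregate functions could be computed once below the aggregate:

```SQL
-- a + 1 appears in two aggregate functions and is currently evaluated
-- separately in each; pushing it down would compute it once.
SELECT sum(a + 1), avg(a + 1) FROM t
```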
When the user sets default_storage_medium to true, the storage medium of all partitions should be SSD and the cooldown time should be 9999-12-31 23:59:59, so that the medium never changes to HDD. But it looks like it sometimes still changes to HDD, so I changed the debug log to info to observe it.
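For reference, a minimal sketch of the intended setup, assuming a hypothetical table `t1` (storage_medium and storage_cooldown_time are real Doris table properties; the names and values here are illustrative):

```SQL
-- All partitions should stay on SSD, with a cooldown time far enough
-- in the future that they are never migrated to HDD.
CREATE TABLE t1 (k1 INT)
DISTRIBUTED BY HASH(k1) BUCKETS 1
PROPERTIES (
    'replication_num' = '1',
    'storage_medium' = 'SSD',
    'storage_cooldown_time' = '9999-12-31 23:59:59'
)
```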
Unfortunately, BthreadCountDownEvent does not work as a sync primitive in this scenario, where all workers are pthreads. BthreadCountDownEvent::time_wait is designed for bthreads, so using it here leads to confusing synchronization problems such as a heap-buffer use-after-free.
1. some encrypt and decrypt functions have a wrong blockEncryptionMode (see the sketch after this list)
2. the topN node should compare tuples from intermediate_row_desc with first_sort_slot.tuple_id
3. must keep the limit if it is an uncorrelated IN-subquery with a limit on sort, like `select a from t1 where a in (select b from t2 order by xx limit yy)`
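For item 1, a hedged sketch of how the block encryption mode enters the picture (the key, IV, and plaintext are made-up values; Doris follows MySQL in reading the mode from the block_encryption_mode session variable):

```SQL
-- The encrypt/decrypt functions must honor the mode set in the
-- block_encryption_mode session variable.
SET block_encryption_mode = 'AES_256_CBC';
SELECT HEX(AES_ENCRYPT('some text', 'some_key_string', 'an_init_vector16'));
```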
Make Nereids generate more reasonable plans using the table row count, but without column stats.
TODO: q5 and q7 are still not good because of the column correlation between ps_suppkey and ps_partkey.
If a submitted query contains HMS tables whose data files are compressed (bz2, lzo, lz4, ...), an error occurs like this:
```[INTERNAL_ERROR]Only support csv data in utf8 codec```.
This is because `org.apache.doris.planner.external.HiveScanNode` sets `fileFormatType` to `TFileFormatType.FORMAT_CSV_PLAIN` regardless of the actual compression algorithm of the data files. This PR tries to fix the problem.
We shouldn't push down the top Project of a join cluster in PushdownProjectThroughJoin, like:
```
Project (id + 1)   <- top Project of the join cluster
        |
       Join
      /    \
    Join   Join
    /       ...
  Join
```
Doris Block does not support complex nested types yet, but the ORC and Parquet readers have generated complex nested columns, which makes the MySQL client output wrong and confuses users.
Get the last modification time from the file status, and use the combination of path and modification time to generate the cache identifier. When a file changes, its modification time changes, so the former cache path becomes invalid.
Fix some hive partition issues.
1. Fix a BE crash when using Hive partition fields of `date`, `timestamp`, or `decimal` type.
2. Fix an HDFS URI decoding error when using a `timestamp` partition field, whose values contain URL-encoded special chars, e.g. `:` is encoded as `%3A` (see the sketch after this list).
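As an illustration of issue 2, a hypothetical query over a Hive table partitioned by a `timestamp` field (the catalog, table, and value are assumed): the partition value contains `:` characters, which appear URL-encoded as `%3A` in the HDFS partition path and must be decoded correctly.

```SQL
-- The partition path looks like .../ts_col=2023-01-01 00%3A00%3A00/,
-- so the %3A sequences must decode back to ':'.
SELECT * FROM hive_catalog.db1.tbl1 WHERE ts_col = '2023-01-01 00:00:00'
```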
* change docker compose to 'docker-compose'
* modify the SQL for MySQL
* fix docker start and stop cmd
* new commit
* add comments for the partition and key/value columns
* Update cn doc format
---------
Co-authored-by: Luzhijing <82810928+luzhijing@users.noreply.github.com>