1. add the session variable max_execution_time as an alias of query timeout; if the user sets max_execution_time, the query timeout is modified too (see the sketch after this list).
2. add a setter attribute to session variables, so that extra logic can live in the setter method instead of relying on field reflection.
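A minimal usage sketch (assuming `max_execution_time` follows the MySQL convention of milliseconds, while `query_timeout` stays in seconds):
```sql
-- Setting the MySQL-compatible alias also updates Doris' own query timeout.
SET max_execution_time = 60000;
SHOW VARIABLES LIKE 'query_timeout';
```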
Fix partition field conjuncts not working.
Add the predicate_partition_columns in _slot_id_to_filter_conjuncts (single-slot conjuncts) to _filter_conjuncts; the others should already have been added from not_single_slot_filter_conjuncts.
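For illustration, a hypothetical query with a single-slot conjunct on a partition column (table and column names are made up):
```sql
-- `dt` is a partition column; `dt = '2023-01-01'` is a single-slot conjunct that was
-- previously collected only into _slot_id_to_filter_conjuncts and never reached
-- _filter_conjuncts, so it did not take effect.
SELECT * FROM hive_tbl
WHERE dt = '2023-01-01' AND amount > 100;
```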
To be more compatible with MySQL, rename the JSONB type name and function names to JSON.
The old JSONB type name and jsonb_xx functions can still be used for backward compatibility.
The function jsonb_extract remains, since json_extract is already used by the JSON string functions and changing it needs more work. It will be changed in a follow-up.
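A rough sketch of the rename (assuming the legacy jsonb_ spellings stay usable as described above):
```sql
-- New MySQL-compatible name.
SELECT json_parse('{"k": 1}');
-- Old JSONB spelling still works for backward compatibility.
SELECT jsonb_parse('{"k": 1}');
-- jsonb_extract keeps its old name for now, since json_extract is taken by the
-- JSON string functions.
SELECT jsonb_extract(json_parse('{"k": 1}'), '$.k');
```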
1. add json_unquote and json_extract functions
2. remove mv related code in visitPhysicalOlapScan
3. forbid bitmap and hll type for topn node's sort exprs
4. HashDistributionInfo of the olap scan node should use the slots from the output, not the full schema
5. SelectMaterializedIndexWithoutAggregate should use the filter node's output together with the predicate to get the correct mv
6. forbid SimplifyArithmeticRule for decimal type
7. make DecimalLiteral's type and value consistent with each other if the value is decimalv2
8. json_array needs to support an empty argument list (see the sketch after this list)
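A rough sketch of the JSON functions from items 1 and 8 (literal values are arbitrary):
```sql
-- json_extract returns the quoted JSON value at the path; json_unquote strips the quotes.
SELECT json_unquote(json_extract('{"name": "doris"}', '$.name'));
-- json_array with no arguments should return an empty JSON array.
SELECT json_array();
```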
Fix three bugs of timestampv2 precision:
1. The Hive catalog doesn't set the precision of timestampv2 and can't get the precision from the Hive metastore, so set the largest precision for timestampv2;
2. The JDBC catalog uses datetimev1 to parse timestamps and then converts to timestampv2, so the precision is lost;
3. TVF doesn't use the precision from the file format's metadata.
# Proposed changes
In Nereids, before this PR, when we accessed a nonexistent table, it reported the following exception:
```
mysql> select * from tt;
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: null
```
After this PR, we get the following result:
```
mysql> select * from tt;
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: Table [tt] does not exist in database [default_cluster:test].
```
## Problem summary
This is because in this [function](f5af07f7b2/fe/fe-core/src/main/java/org/apache/doris/nereids/CascadesContext.java (L328)) we ignore the exception, so the size of `tables` in `CascadesContext` is zero rather than null, and we can only get null from `table = cascadesContext.getTableByName(tableName);`.
We can support this by adding a new property for TVF, like:
`select * from hdfs("uri" = "xxx", ..., "compress_type" = "lz4", ...)`
Users can either:
- specify the compression type explicitly with `"compress_type" = "xxx"`, or
- let Doris infer the compression type from the file name suffix (e.g. `file1.gz`).
Currently, we only support reading compressed files in `csv` format, and the BE side already supports this.
All we need to do is analyze `"compress_type"` on the FE side and pass it to the BE.
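For example, a hypothetical query over an lz4-compressed CSV file (the URI is a placeholder and other connection properties are omitted):
```sql
SELECT *
FROM hdfs(
    "uri" = "hdfs://namenode:8020/path/file1.csv.lz4",
    "format" = "csv",
    "compress_type" = "lz4"
);
```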
1. Support exporting the `LARGEINT` data type to the parquet/orc file formats.
2. Export the Doris `DATE/DATETIME` types to the `Date/Timestamp` logical types of the parquet file format (see the sketch after this list).
3. Fix incorrect data when `DATE` type data is exported to ORC.
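For instance, a sketch with `SELECT ... INTO OUTFILE` (table, columns, and output path are hypothetical):
```sql
-- largeint_col can now be exported, and date_col / datetime_col are written as the
-- Parquet Date / Timestamp logical types.
SELECT largeint_col, date_col, datetime_col
FROM example_tbl
INTO OUTFILE "file:///tmp/export_"
FORMAT AS PARQUET;
```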
Support users manually injecting table-level statistics.
Table stats type:
- row_count
Modify table or partition statistics:
```SQL
ALTER TABLE table_name SET STATS ('k1' = 'v1', ...)
```
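For example, to inject a row count (the only table stats type listed above) for a hypothetical table:
```sql
ALTER TABLE example_tbl SET STATS ('row_count' = '10000');
```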
TODO:
- support other table stats types if necessary
- update the statistics cache if necessary
1. refactor aggregate normalization to avoid data amplification before aggregation
2. remove useless aggregate processing in ExtractAndNormalizeWindowExpression
3. only push down the children of distinct aggregate functions
TODO:
1. push down redundant expressions in aggregate functions
2. refactor normalize repeat rule
3. move expression normalization and optimization after plan normalization to avoid unexpected expression optimization.
1. some encrypt and decrypt functions have the wrong blockEncryptionMode
2. the topN node should compare tuples from intermediate_row_desc with first_sort_slot.tuple_id
3. must keep the limit if it's an uncorrelated IN-subquery with a limit on the sort, like `select a from t1 where a in (select b from t2 order by xx limit yy)`
We shouldn't push down the top Project of a JoinCluster in PushdownProjectThroughJoin, like:
```
Project (id + 1)   <- top Project of the Join cluster
        |
       Join
      /    \
   Join    Join
   /  ....
Join
```
Fix some Hive partition issues.
1. Fix BE crash when using Hive partition fields of `date`, `timestamp`, or `decimal` type.
2. Fix HDFS URI decode error when using a `timestamp` partition field; such values contain special chars that get URL-encoded, e.g. `:` becomes `%3A`.
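For illustration, a hypothetical Hive table partitioned by a `timestamp` column whose values contain `:` (catalog, database, and table names are made up):
```sql
-- The partition directory on HDFS looks like
--   /warehouse/tbl/pt=2023-01-01 00%3A00%3A00
-- i.e. ':' in the partition value is URL-encoded as '%3A'; the fix decodes the
-- URI before using the partition value.
SELECT * FROM hive_catalog.test_db.tbl
WHERE pt = '2023-01-01 00:00:00';
```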