For a comparison predicate, if either argument is of date type, both arguments must be cast to datetime when the predicate is pushed down to storage. This PR disables predicate push-down for this case.
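A minimal sketch of the affected pattern (table and column names are hypothetical):

```sql
-- `dt` is a DATE column compared against a DATETIME literal.
-- Both sides are cast to DATETIME, so with this PR the predicate
-- is evaluated in the query engine instead of being pushed down.
SELECT *
FROM t
WHERE dt > '2023-01-01 10:00:00';
```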
Support aggregate functions in a SELECT without a FROM clause. For example:
```sql
SELECT 1,
'a',
COUNT(),
SUM(1) + 1,
AVG(2) / COUNT(),
MAX(3),
MIN(4),
RANK() OVER() AS w_rank,
DENSE_RANK() OVER() AS w_dense_rank,
ROW_NUMBER() OVER() AS w_row_number,
SUM(5) OVER() AS w_sum,
AVG(6) OVER() AS w_avg,
COUNT() OVER() AS w_count,
MAX(7) OVER() AS w_max,
MIN(8) OVER() AS w_min;
```
1. Add the checks and handling of the sequence column introduced in #21896 to the INSERT statement in both the origin planner and the Nereids planner (see the sketch after this list).
2. Disallow dropping the sequence mapping column in schema change.
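A minimal sketch of a table with a sequence mapping column and an insert into it (table, column, and property values are illustrative; `function_column.sequence_col` is the existing Doris table property):

```sql
-- UNIQUE KEY table whose sequence column is mapped to `version`.
CREATE TABLE t (
    k1 INT,
    v1 INT,
    version INT
)
UNIQUE KEY (k1)
DISTRIBUTED BY HASH(k1) BUCKETS 1
PROPERTIES (
    "replication_num" = "1",
    "function_column.sequence_col" = "version"
);

-- INSERT must now check and fill the hidden sequence column in both
-- the origin planner and the Nereids planner; dropping `version`
-- via schema change is disabled.
INSERT INTO t (k1, v1, version) VALUES (1, 10, 100);
```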
Refactor Java UDF with the unified JNI framework.
The unified JNI framework uses VectorTable as the container to transfer data between C++ and Java, and hides the details of data format conversion.
In addition, the unified framework supports complex and nested types.
Performance for basic types remains the same, with a 30% improvement for string types and an order-of-magnitude improvement for complex types.
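For context, a hedged sketch of registering a Java UDF that runs on this framework (the jar path and class name are hypothetical):

```sql
-- Register a Java UDF; the BE exchanges its input/output with the JVM
-- through the unified JNI framework's VectorTable.
CREATE FUNCTION my_add(INT, INT) RETURNS INT PROPERTIES (
    "file" = "file:///path/to/my-udf.jar",
    "symbol" = "org.example.MyAddUdf",
    "type" = "JAVA_UDF"
);
```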
Add a new FE config `ignore_unknown_metadata_module`. Default is false.
If set to true, unknown modules encountered while reading the metadata image file will be ignored and skipped.
This is mainly used for downgrade operations, so that an old version can stay compatible with an image file written by a new version.
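A hedged sketch of enabling it, assuming the config is mutable at runtime (otherwise it has to be set in fe.conf before FE startup):

```sql
-- Ignore unknown modules when a downgraded FE reads a newer image.
ADMIN SET FRONTEND CONFIG ("ignore_unknown_metadata_module" = "true");
```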
Hive partition columns' stats can be calculated from Hive Metastore data, so there is no need to execute SQL to get them.
This PR uses Hive partition metadata to collect partition column stats.
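A hedged sketch of triggering and inspecting the collection (catalog, database, and table names are hypothetical):

```sql
-- With this PR, partition column stats for a Hive table are derived
-- from HMS partition metadata rather than from SQL scans.
ANALYZE TABLE hive_catalog.db1.sales;
SHOW COLUMN STATS hive_catalog.db1.sales;
```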
### Before:
A TVF query on an empty file or an invalid URI returned errors:
1. `get parsed schema failed, empty csv file`
2. `Can not get first file, please check uri.`
### Now:
We simply return an empty set when a TVF queries an empty file or an invalid URI.
```sql
mysql> select * from s3(
"uri" = "https://error_uri/exp_1.csv",
"s3.access_key"= "xx",
"s3.secret_key" = "yy",
"format" = "csv") limit 10;
Empty set (1.29 sec)
```
Support transforming Trino dialect SQL to a logical plan (#21854)
## Proposed changes
Issue Number: #21854
Use io.trino.sql.tree.AstVisitor as the visitor: visit the corresponding Trino node and transform it into a Doris logical plan.
## Further comments
Here are some examples of function transformation:
**ascii('a')** in Doris and **codepoint('a')** in Trino have the same behavior and the same method signature, so we can use [TrinoFnCallTransformer](3b37b76886/fe/fe-core/src/main/java/org/apache/doris/nereids/parser/trino/TrinoFnCallTransformer.java) to handle them.
Another example requires a complex transformer:
**date_diff('second', TIMESTAMP '2020-12-25 22:00:00', TIMESTAMP '2020-12-25 21:00:00')** in Trino
corresponds to **seconds_diff('2020-12-25 22:00:00', '2020-12-25 21:00:00')** in Doris. They have different method signatures, so we cannot handle this with TrinoFnCallTransformer alone; instead it is handled by an individual complex transformer, [DateDiffFnCallTransformer](3b37b76886/fe/fe-core/src/main/java/org/apache/doris/nereids/parser/trino/DateDiffFnCallTransformer.java).
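A hedged usage sketch, assuming the Trino dialect is selected via the `sql_dialect` session variable (the variable name is an assumption):

```sql
-- Run a Trino-syntax query; the AstVisitor transforms it into a
-- Doris logical plan, mapping codepoint('a') to ascii('a').
SET sql_dialect = "trino";
SELECT codepoint('a');
```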
At present, the `load_to_single_tablet` import implementation uses the remainder of a simple random number, which cannot achieve a truly even distribution. This leads to uneven disk IO and uneven use of cluster resources. To solve this problem, we implement round-robin over each partition's tablets on every load, so that data is loaded evenly across the tablets.
When generating the load query plan, the index of the tablet currently being loaded is passed to the BE.
Add a daemon task in FE to regularly clean up the `loadTabletRecordMap`. In `getCurrentLoadTabletIndex`, the map is used to get the partition's bucket number and update `load_tablet_index`.
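A hedged sketch of a load job that exercises this path (job, table, and Kafka settings are hypothetical; `load_to_single_tablet` is the existing job property):

```sql
-- Each batch goes to a single tablet; with this PR the target tablet
-- advances in round-robin order per partition instead of being chosen
-- by a random-number remainder.
CREATE ROUTINE LOAD db1.job1 ON tbl1
PROPERTIES (
    "load_to_single_tablet" = "true"
)
FROM KAFKA (
    "kafka_broker_list" = "broker1:9092",
    "kafka_topic" = "topic1"
);
```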