1. Filter out, and retain for normal evaluation, predicates that cannot be applied on the inverted index,
such as the `BF`, `IS_NULL`, and `IS_NOT_NULL` predicates (see the sketch after this list).
2. Add an inverted index regression case based on the tpcds_sf1 data set.
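A minimal sketch of the split from item 1, using hypothetical `ColumnPredicate`/`PredicateType` types; the actual filtering happens in the Doris BE in C++, so this Java sketch only illustrates the idea:
```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical predicate model for illustration only.
enum PredicateType { EQ, LT, GT, BF, IS_NULL, IS_NOT_NULL }
record ColumnPredicate(String column, PredicateType type) {}

class InvertedIndexPredicateFilter {
    // These predicate kinds cannot be evaluated by the inverted index;
    // they are filtered out here and evaluated on the normal read path.
    static final Set<PredicateType> NOT_INDEXABLE =
            EnumSet.of(PredicateType.BF, PredicateType.IS_NULL, PredicateType.IS_NOT_NULL);

    static List<ColumnPredicate> indexable(List<ColumnPredicate> preds) {
        return preds.stream()
                .filter(p -> !NOT_INDEXABLE.contains(p.type()))
                .collect(Collectors.toList());
    }
}
```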
* [fix](Nereids): fix scalar_function A-F.
* [fix](regression-test): fix that the regression test framework cannot compare the double values NaN and Inf.
* revert dround()
Main subtask of [DSIP-28](https://cwiki.apache.org/confluence/display/DORIS/DSIP-028%3A+Suppot+MySQL+Load+Data)
## Problem summary
Support the MySQL load syntax as below:
```sql
LOAD DATA
[LOCAL]
INFILE 'file_name'
INTO TABLE tbl_name
[PARTITION (partition_name [, partition_name] ...)]
[COLUMNS TERMINATED BY 'string']
[LINES TERMINATED BY 'string']
[IGNORE number {LINES | ROWS}]
[(col_name_or_user_var [, col_name_or_user_var] ...)]
[SET (col_name={expr | DEFAULT} [, col_name={expr | DEFAULT}] ...)]
[PROPERTIES (key1 = value1 [, key2=value2]) ]
```
For example,
```sql
LOAD DATA
LOCAL
INFILE 'local_test.file'
INTO TABLE db1.table1
PARTITION (partition_a, partition_b, partition_c, partition_d)
COLUMNS TERMINATED BY '\t'
(k1, k2, v2, v10, v11)
SET (c1=k1, c2=k2, c3=v10, c4=v11)
PROPERTIES ("auth" = "root:", "strict_mode"="true")
```
Note that in this PR the property named `auth` must be set, since stream load needs authentication. I will optimize it later.
Column names of external HMS catalog tables are all lower case in Doris,
while an Iceberg table, or a Hive table created by Spark SQL, may contain upper-case column names,
which causes empty query results. This PR fixes this bug.
1. For Parquet files, convert all column names to lower case while parsing the Parquet metadata.
2. For ORC files, store the original column names and the lower-case column names in two vectors, and use whichever fits in each case (see the sketch after this list).
3. On the FE side, change the column name back to the original Iceberg column name when doing convertToIcebergExpr.
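A minimal sketch of the two-vector idea from item 2, with hypothetical names; the real code lives in the BE's ORC reader (C++):
```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only; the actual fix is in the C++ BE ORC reader.
class OrcColumnNameMapping {
    // Keep both spellings: the file's original names for matching inside
    // the ORC file, and the lower-case names Doris uses everywhere else.
    final List<String> originalNames = new ArrayList<>();
    final List<String> lowerCaseNames = new ArrayList<>();

    void add(String nameFromFile) {
        originalNames.add(nameFromFile);
        lowerCaseNames.add(nameFromFile.toLowerCase(Locale.ROOT));
    }
}
```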
1. add TypeCoercion for (string, decimal) and (date, decimal)
2. the equality check of a LogicalProject node should consider its children in some cases
3. don't push down join conditions like "t1 join t2 on true/false" (sketched after this list)
4. add PUSH_DOWN_FILTERS after FindHashConditionForJoin
5. nested-loop join should support all kinds of joins
6. the intermediate tuple should contain slots from both children of the nested-loop join
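A hedged sketch of the idea behind item 3, with hypothetical `Expression`/`BooleanLiteral` stand-ins for Nereids' classes: a bare literal condition references no columns, so there is nothing to gain from pushing it into a child.
```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical expression model; Nereids' real classes differ.
interface Expression {}
record BooleanLiteral(boolean value) implements Expression {}

class JoinConditionPushDown {
    // An "ON TRUE" or "ON FALSE" condition references no columns from
    // either side, so keep it at the join node instead of pushing it down.
    static List<Expression> pushable(List<Expression> conjuncts) {
        return conjuncts.stream()
                .filter(c -> !(c instanceof BooleanLiteral))
                .collect(Collectors.toList());
    }
}
```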
1. support a row format using the JSONB codec
2. short-path optimization for point queries
3. support prepared statements for point queries (client usage sketched after this list)
4. support the MySQL binary format
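A client-side usage sketch for item 3, assuming a local FE on port 9030 and example table/column names; `useServerPrepStmts=true` asks MySQL Connector/J to use server-side prepared statements over the binary protocol:
```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PointQueryExample {
    public static void main(String[] args) throws Exception {
        // Server-side prepared statements let the point-query short path
        // skip re-planning on every execution.
        String url = "jdbc:mysql://127.0.0.1:9030/db1?useServerPrepStmts=true";
        try (Connection conn = DriverManager.getConnection(url, "root", "");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT v1 FROM tbl_point_query WHERE k1 = ?")) {
            ps.setInt(1, 1234);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1));
                }
            }
        }
    }
}
```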
Support Iceberg schema evolution for the Parquet file format.
Iceberg uses a unique id for each column to support schema evolution.
To support this feature in Doris, the FE side needs to get the current column id for each column and send the ids to the BE side.
The BE reads the column ids from the Parquet key_value_metadata, sets the changed column names in the Block to match the names in the Parquet file before reading data, and sets the names back after reading (see the sketch below).
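A rough Java sketch of that rename-around-read flow, with hypothetical names (`Block` and the id-to-name maps); the real logic is C++ in the BE Parquet reader:
```java
import java.util.List;
import java.util.Map;

// "Block" is a hypothetical stand-in for Doris' column block.
class Block {
    List<String> columnNames;
    Block(List<String> names) { this.columnNames = names; }
}

class IcebergSchemaEvolution {
    // fileNameById comes from the Parquet key_value_metadata written by
    // Iceberg; currentNameById comes from the FE with the query plan.
    static void readWithEvolution(Block block,
                                  Map<Integer, String> fileNameById,
                                  Map<Integer, String> currentNameById,
                                  List<Integer> columnIds) {
        // 1. Rename block columns to the names recorded in the file.
        for (int i = 0; i < columnIds.size(); i++) {
            block.columnNames.set(i, fileNameById.get(columnIds.get(i)));
        }
        // 2. ... read data from the Parquet file into the block ...
        // 3. Rename the columns back to their current names.
        for (int i = 0; i < columnIds.size(); i++) {
            block.columnNames.set(i, currentNameById.get(columnIds.get(i)));
        }
    }
}
```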
The BE storage engine has a bug in Date comparison, so if we push down predicates like Date 'x' < Date 'y', we get wrong results.
This PR just converts expressions like Date 'x' < Date 'y' into DateTime 'x' < DateTime 'y' (sketched after the TODO below).
TODO:
Does the storage engine support comparing a date slot with a datetime?
If it does, we could avoid adding a cast on the slot,
and then this expression could be pushed down to the storage engine.
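A minimal sketch of the literal rewrite described above, using hypothetical expression classes rather than Doris' real Expr tree:
```java
// Hypothetical expression classes for illustration only.
interface Expr {}
record DateLiteral(String value) implements Expr {}
record DateTimeLiteral(String value) implements Expr {}
record LessThan(Expr left, Expr right) implements Expr {}

class DateComparisonRewriter {
    // Rewrite Date 'x' < Date 'y' into DateTime 'x' < DateTime 'y' so the
    // pushed-down predicate avoids the BE's buggy Date comparison.
    static Expr rewrite(Expr e) {
        if (e instanceof LessThan lt
                && lt.left() instanceof DateLiteral l
                && lt.right() instanceof DateLiteral r) {
            return new LessThan(new DateTimeLiteral(l.value() + " 00:00:00"),
                                new DateTimeLiteral(r.value() + " 00:00:00"));
        }
        return e;
    }
}
```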
A child's slot with the same name as a slot in the output expressions would be discarded, which causes binding to fail, since the slots in the group-by expressions cannot find the corresponding bound slots in the child's output.
For example, in the case below, the `date` in the HAVING clause should be bound to the alias with the same name, instead of to the `date` field of the relation:
```sql
SELECT date_format(date, '%x%v') AS `date` FROM `tb_holiday` WHERE `date` BETWEEN 20221111 AND 20221116 HAVING date = 202245 ORDER BY date;
```
1. signatures without the order element are wrong
2. the signature with one argument is missing
3. group_concat should be a NullableAggregateFunction
4. constant folding on the FE should not fold a NullableAggregateFunction with a null argument
TODO
1. reorder the rewrite rules, and then only forbid constant folding on a NullableAggregateFunction with alwaysNullable == true (see the sketch below)
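A hedged sketch of the guard from item 4 and the TODO, with hypothetical stand-ins for the Nereids classes:
```java
// Hypothetical stand-ins for Nereids classes, for illustration only.
interface Expression {}
record NullLiteral() implements Expression {}
record NullableAggregateFunction(Expression arg, boolean alwaysNullable) implements Expression {}

class FoldConstantRule {
    // An aggregate over a group is not the same thing as its argument:
    // group_concat(NULL) is computed per group, so the FE must not fold
    // it to a plain NULL literal just because the argument is null.
    static boolean canFold(Expression e) {
        if (e instanceof NullableAggregateFunction f && f.arg() instanceof NullLiteral) {
            // Per the TODO, this could later be relaxed to only block
            // folding when alwaysNullable == true.
            return false;
        }
        return true;
    }
}
```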
Fix a bug where creating a JDBC resource with only the JDBC driver file name fails to do the checksum.
This is because we forgot to pass the full driver URL to JdbcClient.
Add ResultSet.FETCH_FORWARD and set AutoCommit to false on the JDBC connection, to avoid OOM when fetching a large amount of data.
Set useCursorFetch in the JDBC URL for both MySQL and PostgreSQL (usage sketched below).
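A usage sketch of those settings with standard JDBC APIs; the URL, table, and credentials are placeholders:
```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchExample {
    public static void main(String[] args) throws Exception {
        // useCursorFetch makes MySQL Connector/J honor the fetch size
        // instead of buffering the whole result set in memory.
        String url = "jdbc:mysql://127.0.0.1:3306/db?useCursorFetch=true";
        try (Connection conn = DriverManager.getConnection(url, "user", "pass")) {
            conn.setAutoCommit(false); // PostgreSQL requires this for cursor-based fetching
            try (Statement stmt = conn.createStatement()) {
                stmt.setFetchDirection(ResultSet.FETCH_FORWARD);
                stmt.setFetchSize(1000); // stream rows in batches, not all at once
                try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {
                    while (rs.next()) {
                        // process one row at a time
                    }
                }
            }
        }
    }
}
```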
Fix some P2 external datasource bugs.