adjust nullable on empty set should apply after unnested sub-query
some function should propagate nullable when args are datev2 or datetimev2
add back tpch sf0.1 nereids regression test
When the argument of truncate function is float type, it can match both truncate(DECIMALV3) and truncate(DOUBLE), if the match is truncate(DECIMALV3), the precision is lost when converting float to DECIMALV3(38, 0).
Here I modify it to match truncate(DOUBLE) for now, maybe we still need to solve the problem of losing precision when converting float to DECIMALV3.
```
create table tbl1 (k1 varchar(100), k2 string) distributed by hash(k1) buckets 1 properties("replication_num" = "1");
insert into tbl1 values(1, "alice");
select cast(k1 as INT) as id from tbl1 order by id limit 2;
```
The above query could pass `checkEnableTwoPhaseRead` since the order by element is SlotRef but actually it's an function call expr
The columns name in stream load and broker load are case sensitive, make it case insensitive. This would be consist with query, because query sql columns name are case insensitve.
Add DATA_TYPE in information schema for types: datev2, datatimev2, decimal, jsonb. It was 'unknown' for these types and cause problem for tools such as BI using information schema.
support 4 phase Aggregation.
example:
`select count(distinct k1), sum(k2) from t`
suppose t.k0 is distribute key.
we have plan
```
Agg(DISTINCT_GLOBAL)
|
Exchange(Gather)
|
Agg(DISTINCT_LOCAL)
|
Agg(GLOBAL)
|
Exchange(hash distribute by k1)
|
Agg(LOCAL)
|
scan
```
limitations:
1. only support sql with one distinct.
not support:`select count(distinct k1), count(distinct k2) from t`
2. only support sql with distinct one column
not support: `select count(distinct k1, k2) from t`
Doris always delays the execution of expressions as possible as it can, so as the expansion of constant expression. Given below SQL:
```sql
select i from (select 'abc' as i, sum(birth) as j from subquerytest2) as tmp
```
The aggregation would be eliminated, since its output is not required by the outer block, but the expasion for constant expression would be done in the final result expr, and since aggreagete output has been eliminate, the expasion would actually do nothing, and finally cause a empty results.
To fix this, we materialize the results expr in the inner block for such SQL, it may affect performance, but better than let system produce a mistaken result.
1.Compatible with the old optimizer, the sort and limit in the subquery will not take effect, just delete it directly.
```
select * from sub_query_correlated_subquery1 where sub_query_correlated_subquery1.k1 > (select sum(sub_query_correlated_subquery3.k3) a from sub_query_correlated_subquery3 where sub_query_correlated_subquery3.v2 = sub_query_correlated_subquery1.k2 order by a limit 1);
```
2.Adjust the unnesting position of the subquery to ensure that the conjunct in the filter has been optimized, and then unnesting
Support:
```
SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k1 = i1.k1) AND (k2 = 1)) ) > 0);
```
The reason why the above can be supported is that conjunction will be performed, which can be converted into the following
```
SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2 or k2 = 1)) ) > 0);
```
Not Support:
```
SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k2 = i1.k1) AND (k2 = 1)) ) > 0);
```
Support set skip line number for stream load to load csv file.
Usage `-H skip_lines:number`:
```
curl --location-trusted -u root: -T test.csv -H skip_lines:5 -XPUT http://127.0.0.1:8030/api/testDb/testTbl/_stream_load
```
Skip line number also can be used in mysql load as below:
```sql
LOAD DATA
LOCAL
INFILE '${mysql_load_skip_lines}'
INTO TABLE ${tableName}
COLUMNS TERMINATED BY ','
IGNORE 2 LINES
PROPERTIES ("auth" = "root:");
```
1. Filter out and remain predicates that do not support applying on inverted index,
like `BF` predicate, `IS_NULL` predicate, `IS_NOT_NULL` predicate.
2. Add inverted index regression case that based on tpcds_sf1 data set.