In previous, when querying hive table in orc format, and the file is splitted.
the result of select count(*) may be multiple of the real row number.
This is because the number of rows should be got after orc strip prune,
otherwise, it may return wrong result
original SQL
select t1.* from t1 where t1.k1 not in ( select t3.k1 from t3 where t1.k2 = t3.k2 );
rewrite SQL
before (wrong):
select t1.* from t1 null aware left anti join t2 on t1.k1 = t3.k1 and t1.k2 = t3.k2;
now (correct):
select t1.* from t1 left anti join t3 on t1.k2 = t3.k2 and (t1.k1 = t3.k1 or t3.k1 is null or t1.k1 is null);
1. Change from using string matching function to using Expr matching
2. Replace the `nvl` function with `ifnull` when pushed down to MySQL
3. Adapt ClickHouse's `from_unixtime` function to push down
4. Non-function filtering can still be pushed down when `enable_func_pushdown` is set to false
support insert into table values(...) for Nereids.
sql like:
insert into t values(1, 2, 3)
insert into t values(1 + 1, dayofweek(now()), 4), (4, 5, 6)
insert into t values('1', '6.5', cast(1.5 as int))
1. Fix hive partition prune bug, introduced from #23845, will fail `test_hive_default_partition` test case.
2. Fix `test_local_tvf.groovy` test case, the path of local tvf should be relative path.
3. Fix `test_external_catalog_hive` test case, the `partitions` is now reserve keywords
4. Support `local` tvf in Nereids, but fix related issue like:
```
Caused by: java.lang.NullPointerException
at org.apache.doris.nereids.stats.ExpressionEstimation.castMinMax(ExpressionEstimation.java:171) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.stats.ExpressionEstimation.visitCast(ExpressionEstimation.java:167) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.stats.ExpressionEstimation.visitCast(ExpressionEstimation.java:109) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.trees.expressions.Cast.accept(Cast.java:55) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.stats.ExpressionEstimation.visitAlias(ExpressionEstimation.java:394) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.stats.ExpressionEstimation.visitAlias(ExpressionEstimation.java:109) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.trees.expressions.Alias.accept(Alias.java:145) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.stats.ExpressionEstimation.estimate(ExpressionEstimation.java:119) ~[doris-fe.jar:1.2-SNAPSHOT]
at org.apache.doris.nereids.stats.StatsCalculator.lambda$computeProject$7(StatsCalculator.java:785) ~[doris-fe.jar:1.2-SNAPSHOT]
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) ~[?:1.8.0_341]
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:1.8.0_341]
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) ~[?:1.8.0_341]
```
1. Change the external hive docker network mode from the bridge mode to the host mode to support the external test of the multi-node doris cluster
2. Added more hive test data in various formats
3. Added a test case with hive
if set preserveRootTypes to false when calling substituteList, the root cast expr may be lost during substituting. For example, the top cast expr is cast(decimal_col as double), if it's lost, the data type mismatch between plan node and be crashes.
now we make error if we deal with leading zeros in decimal value , type_precision >= precision will make value overflow and DCHECK will fail , so if here has leading zero we should only make type_precision > precision to make value right