Refactored the SQL query to handle the distinct and nullable values when computing the sum of 'v' and 'v + 1' in the 't' table like:
select sum(v + 1), sum(distinct v + 1) from t
=>
select sum(v) + count(v), sum(distinct v) + count(distinct v)
1. only push having as agg's parent if having just use slots from agg's output
2. show user friendly error message when item in select list but not in aggregate node's output
support data masking policy
note:
if a user send the query
```sql
select name from tbl limit 1
```
and the user have row policy on `tbl.name` with the filter `name = 'Beijing'`, and have data masking policy on `tbl.name` with the masking `concat(substring(name, 1, 4), '****')`, we will rewrite the query to
```sql
select concat(substring(name, 1, 4), '****') as name
from tbl
where name = 'Beijing' -- note that this name is from tbl, not from the alias in the select list
limit 1
```
the result would be `Beij****`
In order to support paimon with hive2, we need to modify the origin HiveMetastoreClient.java
to let it compatible with both hive2 and hive3.
And this modified HiveMetastoreClient should be at the front of the CLASSPATH, so that
it can overwrite the HiveMetastoreClient in hadoop jar.
This PR mainly changes:
1. Copy HiveMetastoreClient.java in FE to BE's preload jar.
2. Split the origin `preload-extensions-jar-with-dependencies.jar` into 2 jars
1. `preload-extensions-project.jar`, which contains the modified HiveMetastoreClient.
2. `preload-extensions-jar-with-dependencies.jar`, which contains other dependency jars.
3. Modify the `start_be.sh`, to let `preload-extensions-project.jar` be loaded first.
4. Change the way the assemble the jni scanner jar
Only need to assemble the project jar, without other dependencies.
Because actually we only use classed under `org.apache.doris` package.
So remove other unused dependency jars can also reduce the output size of BE.
5. fix bug that the prefix of paimon properties should be `paimon.`, not `paimon`
6. Support paimon with hive2
User can set `hive.version` in paimon catalog properties to specify the hive version.
* [Fix](TransientTask)Export tasks should only be run on the master node
Add thread name
Export Task runs only on the master node, so it is necessary to explicitly start the corresponding resources. At the same time, refactor some code to avoid circular dependencies.
* TransientTaskManager is initialized twice. Therefore, the second initialization needs to be deleted.
Query like `select * from ut_p partitions(p2) where cast(var['a'] as int) > 0` will fall through parition/tablet prunning since it's plan like
```
mysql> explain analyzed plan select * from ut_p where id = 3 and cast(var['a'] as int) = 789;
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Explain String(Nereids Planner) |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| LogicalResultSink[26] ( outputExprs=[id#0, var#1] ) |
| +--LogicalProject[25] ( distinct=false, projects=[id#0, var#1], excepts=[] ) |
| +--LogicalFilter[24] ( predicates=((cast(var#4 as INT) = 789) AND (id#0 = 3)) ) |
| +--LogicalFilter[23] ( predicates=(0 = __DORIS_DELETE_SIGN__#2) ) |
| +--LogicalProject[22] ( distinct=false, projects=[id#0, var#1, __DORIS_DELETE_SIGN__#2, __DORIS_VERSION_COL__#3, element_at(var#1, 'a') AS `var`#4], excepts=[] ) |
| +--LogicalOlapScan ( qualified=regression_test_variant_p0.ut_p, indexName=<index_not_selected>, selectedIndexId=10145, preAgg=ON ) |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.01 sec)
```
with an extra LogicalProject on top of LogicalOlapScan, so we should handle such case to prune parition/tablet
1. variant type core dump at call get_data_at function, as not impl this function.
2. some case can't pass at old planner and fold_constant_by_be = on.
3. open enable_fold_constant_by_be = true.