This is an example of an S3-backed hms_catalog:
```sql
CREATE CATALOG hms_catalog properties(
"type" = "hms",
"hive.metastore.uris"="thrift://localhost:9083",
"AWS_ACCESS_KEY" = "your access key",
"AWS_SECRET_KEY"="your secret key",
"AWS_ENDPOINT"="s3 endpoint",
"AWS_REGION"="s3-region",
"fs.s3a.paging.maximum"="1000");
```
All of these parameters are required.
1. This PR adds the supported sub-types for array, which were modified in #9916.
2. Add a regression test for the supported sub-types.
Co-authored-by: hucheng01 <hucheng01@baidu.com>
# first: Add two expr rewrite rules:
1. remove duplicate expr
a = 1 and a = 1 -> a = 1
2. extract common expr
(a or b) and (a or c) -> a or (b and c)
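The two rules above can be sketched in a few lines of Java. This is only an illustration under simplifying assumptions: atoms are plain strings, the input is already split into OR-groups joined by AND, and `ExprRewriteSketch` and its methods are hypothetical names, not the Nereids implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not Doris code: atoms are plain strings such as "a" or "a = 1".
final class ExprRewriteSketch {

    // Rule 1: drop repeated conjuncts while keeping first-seen order.
    static List<String> removeDuplicates(List<String> conjuncts) {
        return new ArrayList<>(new LinkedHashSet<>(conjuncts));
    }

    // Rule 2: each inner list is one OR-group; the outer list is the AND of those groups.
    // Assumes at least one non-empty group. Returns a text rendering of the rewritten predicate.
    static String extractCommon(List<List<String>> orGroups) {
        // Atoms present in every OR-group can be factored out.
        Set<String> common = new LinkedHashSet<>(orGroups.get(0));
        for (List<String> group : orGroups) {
            common.retainAll(group);
        }
        List<String> remainders = new ArrayList<>();
        for (List<String> group : orGroups) {
            List<String> rest = new ArrayList<>(group);
            rest.removeAll(common);
            if (rest.isEmpty()) {
                // One group is exactly the common part, so it absorbs the others:
                // a and (a or c) -> a
                return String.join(" or ", common);
            }
            remainders.add(rest.size() == 1
                    ? rest.get(0)
                    : "(" + String.join(" or ", rest) + ")");
        }
        String conjunction = String.join(" and ", remainders);
        if (common.isEmpty()) {
            return conjunction; // nothing to factor out
        }
        return String.join(" or ", common) + " or (" + conjunction + ")";
    }
}
```

Calling `extractCommon` with the groups `[a, b]` and `[a, c]` yields `a or (b and c)`, matching the example in rule 2.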
# second: Add plan rewrite rules that rewrite the expressions of operators
1. NormalizeExpressionOfPlan contains the normalize expr rewrite rules. It uses these normalize rules to rewrite the exprs of LogicalFilter, LogicalAggregate, LogicalProject, and LogicalJoin.
2. OptimizeExpressionOfPlan contains the optimize expr rewrite rules. It uses these optimize rules to rewrite the exprs of LogicalFilter, LogicalAggregate, LogicalProject, and LogicalJoin.
Review and add all missing equals and hashCode functions to Expression and its subclasses:
Alias
Arithmetic
BoundFunction
CompoundPredicate
Not
UnboundFunction
UnboundSlot
UnboundStar
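As a rough illustration of why these methods matter, the sketch below shows structural equals and hashCode on two simplified expression nodes. `SlotSketch` and `AliasSketch` are hypothetical stand-ins with made-up fields; the real Doris classes carry more state.

```java
import java.util.Objects;

// Hypothetical stand-in for a leaf expression; not the Doris class.
final class SlotSketch {
    private final String name;

    SlotSketch(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof SlotSketch)) {
            return false;
        }
        return name.equals(((SlotSketch) o).name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name);
    }
}

// Hypothetical stand-in for a node with a child; equality recurses into the child.
final class AliasSketch {
    private final SlotSketch child;
    private final String alias;

    AliasSketch(SlotSketch child, String alias) {
        this.child = child;
        this.alias = alias;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof AliasSketch)) {
            return false;
        }
        AliasSketch that = (AliasSketch) o;
        return child.equals(that.child) && alias.equals(that.alias);
    }

    @Override
    public int hashCode() {
        // Must be consistent with equals so hash-based collections deduplicate correctly.
        return Objects.hash(child, alias);
    }
}
```

Without these overrides, two structurally identical nodes compare by reference, so a `HashSet` or a duplicate-removal rule would treat them as different expressions.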
Support CASE WHEN for TPC-H.
For example:
CASE [expression] WHEN [value] THEN [expression] ... ELSE [expression] END
or
CASE WHEN [predicate] THEN [expression] ... ELSE [expression] END
* [regressiontest] add tpcds_sf1 test (#10852)
Co-authored-by: smallhibiscus <844981280>
Co-authored-by: stephen <hello-stephen@qq.com>
* ignore q30 temporarily since it encounters the Latin character Ô
Co-authored-by: stephen <hello-stephen@qq.com>
GroupExpression.getParent() returns the group that contains this expr. The name is misleading, especially in tree structures,
so we rename it to getOwnerGroup.
Try to eliminate cross joins by finding join conditions in filters and changing the join order.
For example:
-- input:
SELECT * FROM t1, t2, t3 WHERE t1.id=t3.id AND t2.id=t3.id
-- output:
SELECT * FROM t1 JOIN t3 ON t1.id=t3.id JOIN t2 ON t2.id=t3.id
This feature is controlled by the session variable enable_nereids_reorder_to_eliminate_cross_join, which is true by default.
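One way to picture the reordering is a greedy pass that always prefers a table linked by a predicate to the tables already joined. The sketch below is an assumption-laden illustration, not the Nereids rule: `JoinReorderSketch`, its table-name strings, and the edge list of table pairs are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical greedy reorder; tables are names, edges are pairs of tables
// linked by an equality predicate found in the WHERE clause.
final class JoinReorderSketch {

    static List<String> reorder(List<String> tables, List<String[]> edges) {
        List<String> remaining = new ArrayList<>(tables);
        List<String> ordered = new ArrayList<>();
        ordered.add(remaining.remove(0)); // keep the first table as the starting point
        while (!remaining.isEmpty()) {
            String next = null;
            for (String candidate : remaining) {
                if (connected(candidate, ordered, edges)) {
                    next = candidate;
                    break;
                }
            }
            if (next == null) {
                // No predicate links any remaining table: a cross join is unavoidable here.
                next = remaining.get(0);
            }
            remaining.remove(next);
            ordered.add(next);
        }
        return ordered;
    }

    // True if some predicate links the candidate to a table that is already joined.
    private static boolean connected(String candidate, List<String> joined, List<String[]> edges) {
        for (String[] edge : edges) {
            if (edge[0].equals(candidate) && joined.contains(edge[1])) {
                return true;
            }
            if (edge[1].equals(candidate) && joined.contains(edge[0])) {
                return true;
            }
        }
        return false;
    }
}
```

For the query above, the tables `t1, t2, t3` with predicates on `t1`/`t3` and `t2`/`t3` come out in the order `t1, t3, t2`, which is the join order shown in the output.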
Simplify usage of Memo and rewrite rule application.
Before this PR, applying a rewrite rule to a plan required code like the following:
Memo memo = new Memo();
memo.initialize(root);
PlannerContext plannerContext = new PlannerContext(memo, new ConnectContext());
JobContext jobContext = new JobContext(plannerContext, new PhysicalProperties(), 0);
RewriteTopDownJob rewriteTopDownJob = new RewriteTopDownJob(memo.getRoot(),
ImmutableList.of(new AggregateDisassemble().build()), jobContext);
plannerContext.pushJob(rewriteTopDownJob);
plannerContext.getJobScheduler().executeJobPool(plannerContext);
Plan after = memo.copyOut();
After this PR, we can use chained calls:
new Memo(plan)
.newPlannerContext(connectContext)
.setDefaultJobContext()
.topDownRewrite(new AggregateDisassemble())
.getMemo()
.copyOut();
Rename the session variable enable_nereids to enable_nereids_planner to make it more meaningful.
During the analysis of BinaryPredicate, a CastExpr is generated when the slot must be implicitly cast, as in the case below:
SELECT * FROM t1 WHERE t1.col1 = '1';
Here col1 is an integer column.
This prevents the binary predicate from being pushed down to OlapScan, which hurts performance.
When a rowset includes multiple segments, the segments' rows are merged in generic_iterator, but merged_rows is not maintained. Compaction then fails in check_correctness.
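To see why an unmaintained merged_rows counter breaks the check, consider a row-count invariant of the following shape. This is an assumption about what check_correctness verifies, written in Java for illustration only; `CompactionCheckSketch` and its parameter names are hypothetical.

```java
// Hypothetical illustration of a row-count invariant; not the BE implementation.
final class CompactionCheckSketch {

    // Assumed invariant: every input row is either written out, merged away, or filtered.
    static boolean checkCorrectness(long inputRows, long outputRows,
                                    long mergedRows, long filteredRows) {
        return inputRows == outputRows + mergedRows + filteredRows;
    }
}
```

Under this assumption, if 10 of 100 input rows are merged but merged_rows stays at 0, the check compares 100 against 90 and fails even though the data is correct.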
Co-authored-by: yixiutt <yixiu@selectdb.com>
This pull request includes part of the statistics implementation (https://github.com/apache/incubator-doris/issues/6370). It does not affect any existing code, and users cannot yet create statistics jobs.
Currently only MetaStatisticsTask, which collects statistics directly by reading FE meta, is implemented. SQLStatisticsTask, which needs to query BE through FE, is still in progress.
This PR implements the following:
1. Support statistics collection for partitioned and non-partitioned tables. For partitioned tables, the collection of statistics for the specified partition is implemented.
2. Tasks are divided according to whether the table is partitioned or non-partitioned, with the tablet as the finest granularity. A meta task collects as many statistics as possible.
3. Add partition statistics (Table -> Partition -> Column), for example the size and row count of the table, the size and row count of the partition, and the maximum and minimum values of the columns.
4. Display and modify partition-level statistics.
…
* support like/not like conjuncts push down to storage engine
* vectorized engine support like/not like conjuncts push down to storage engine
* support both evaluate and evaluate_vec methods in like predicate
* reuse remove_pushed_conjuncts and prevent logic error during move function conjuncts
* change #ifndef to pragma once as per comments
* change enable_function_pushdown default to false
Co-authored-by: heguangnan <heguangnan@bytedance.com>