In SetOperationNode, we use pass-through if the child's output is the same as the node's own output.
The isChildPassthrough method only considers memory layout.
When the vectorized engine is used, we need to use the SlotDesc offset in TupleDesc instead of
the memory layout to check whether pass-through can be performed.
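A minimal sketch of an offset-based pass-through check follows. The `Slot` class and the exact rule (the i-th child slot must sit at offset i and match the node's i-th slot type) are assumptions made for illustration; they are not the real Doris descriptor classes or the actual isChildPassthrough logic.

```java
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the planner's descriptors; these are not the real Doris classes.
final class PassthroughSketch {
    /** A slot with its offset (position) inside its tuple descriptor and a type name. */
    static final class Slot {
        final int offsetInTuple;
        final String type;

        Slot(int offsetInTuple, String type) {
            this.offsetInTuple = offsetInTuple;
            this.type = type;
        }
    }

    /**
     * Assumed rule: a child can be passed through when its i-th output slot sits at
     * offset i of the child's tuple and has the same type as the node's i-th slot.
     */
    static boolean isChildPassthrough(List<Slot> childOutput, List<Slot> nodeOutput) {
        if (childOutput.size() != nodeOutput.size()) {
            return false;
        }
        for (int i = 0; i < childOutput.size(); i++) {
            Slot child = childOutput.get(i);
            if (child.offsetInTuple != i || !child.type.equals(nodeOutput.get(i).type)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Slot> node = Arrays.asList(new Slot(0, "INT"), new Slot(1, "VARCHAR"));
        List<Slot> reordered = Arrays.asList(new Slot(1, "INT"), new Slot(0, "VARCHAR"));
        System.out.println(isChildPassthrough(node, node));      // true
        System.out.println(isChildPassthrough(reordered, node)); // false
    }
}
```

The check depends only on slot positions and types, so it does not rely on how a row-based engine would lay the tuple out in memory.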
closed#9644
Second step of statistics derivation: implementation for nodes other than scan_node.
The statistics derivation interface for all nodes is placed in DeriveFactory.java.
Added unit tests to verify that the derivation is correct.
Statistics derivation for each node is placed in its own *StatsDerive.java
detailed design: https://docs.google.com/document/d/1u1L6XhyzKShoyYRwFQ6kE1rnvY2iFwauwg289au5Qq0/edit
Currently, only the root user has node_priv.
That is, only the root user can add and remove nodes.
In the original design of Doris, there is an Operator role, which can hold node_priv for node operations.
This PR supports assigning node_priv to users other than root.
However, only users who have both grant_priv and node_priv can assign node_priv to other users.
This ensures that only the root user can grant this permission, and users who are given node_priv
cannot extend it to further users.
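The grant rule above can be sketched as a small check. The `Priv` enum and the method name are hypothetical and only mirror the privilege names used in this description; they are not the Doris auth classes.

```java
import java.util.EnumSet;
import java.util.Set;

// Illustrative only: models the rule "both grant_priv and node_priv are required to grant node_priv".
final class NodePrivSketch {
    enum Priv { GRANT_PRIV, NODE_PRIV, SELECT_PRIV }

    /** Returns true only when the grantor holds both GRANT_PRIV and NODE_PRIV. */
    static boolean canGrantNodePriv(Set<Priv> grantorPrivs) {
        return grantorPrivs.contains(Priv.GRANT_PRIV) && grantorPrivs.contains(Priv.NODE_PRIV);
    }

    public static void main(String[] args) {
        Set<Priv> root = EnumSet.of(Priv.GRANT_PRIV, Priv.NODE_PRIV);
        Set<Priv> operator = EnumSet.of(Priv.NODE_PRIV);
        System.out.println(canGrantNodePriv(root));     // true
        System.out.println(canGrantNodePriv(operator)); // false
    }
}
```

A user who was only given node_priv fails the check, which is what keeps the permission from spreading further.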
- set inline view's slot descriptor to nullable in register column ref
- propagate slot nullability when generating the inline view's query node in SingleNodePlanner
This CL mainly changes:
1. Reduce RPC timeouts caused by RPCs waiting for brpc worker threads.
    1. Merge multiple fragment instances on the same BE into one request to reduce the number of send-fragment RPCs.
    2. If the number of fragments is >= 3, use a two-phase RPC: the first phase sends all fragments and the second starts them,
       so that there are at most 2 RPCs per query on each BE.
2. Set the timeout of the send-fragment RPC to the query timeout, so that it matches the user's expectation of the query timeout.
3. Do not close the connection when an RPC timeout occurs.
4. Change some log levels from info to debug to reduce the fe.log content.
NOTICE:
1. The definition of the execPlanFragment RPC has changed, so BE must be upgraded first.
3. Remove FE config `remote_fragment_exec_timeout_ms`
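As a rough illustration of the merging and two-phase sending described in this CL, the sketch below groups fragment instances by backend and computes how many RPCs each backend receives. The `Instance` class and both method names are hypothetical, and the single-phase case is assumed to use one merged RPC per BE.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model only; it does not reproduce the coordinator's real data structures.
final class FragmentRpcSketch {
    /** Hypothetical pairing of a fragment instance with the backend that runs it. */
    static final class Instance {
        final String backend;
        final int fragmentId;

        Instance(String backend, int fragmentId) {
            this.backend = backend;
            this.fragmentId = fragmentId;
        }
    }

    /** Groups instances by backend so that each backend receives one merged request. */
    static Map<String, List<Instance>> groupByBackend(List<Instance> instances) {
        Map<String, List<Instance>> byBackend = new LinkedHashMap<>();
        for (Instance instance : instances) {
            byBackend.computeIfAbsent(instance.backend, k -> new ArrayList<>()).add(instance);
        }
        return byBackend;
    }

    /** Two-phase (send, then start) when the query has >= 3 fragments; assumed single RPC otherwise. */
    static int rpcsPerBackend(int queryFragmentCount) {
        return queryFragmentCount >= 3 ? 2 : 1;
    }
}
```

With this grouping, the number of RPCs per backend no longer grows with the number of fragment instances placed on it.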
1. Convert the child expressions of an InPredicate to the column type, and discard those that cannot be converted exactly.
2. Fix the ColumnRange exception caused by type conversion of InPredicate child expressions.
3. Fix the problem that the tablet could not be hit because of type conversion of InPredicate child expressions.
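The idea of exact conversion can be sketched for an INT column as follows. Representing literals as strings and the method name `toExactInts` are assumptions for illustration; the real change works on planner expressions.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: keeps the IN-list literals that convert to int without loss.
final class InPredicateCastSketch {
    static List<Integer> toExactInts(List<String> literals) {
        List<Integer> kept = new ArrayList<>();
        for (String literal : literals) {
            try {
                // intValueExact throws if there is a nonzero fraction or the value overflows int.
                kept.add(new BigDecimal(literal.trim()).intValueExact());
            } catch (ArithmeticException | NumberFormatException e) {
                // Not exactly representable as int, so it can never equal an INT column value.
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(toExactInts(Arrays.asList("1", "2.0", "2.5", "abc"))); // [1, 2]
    }
}
```

Dropping the literals that cannot be converted exactly does not change the result of the predicate, because such values can never match a value of the column type.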
Issue Number: close#9627 , #9628
This PR introduces two essentials for Nereids:
1. Pattern match iterator used in the memo
The pattern match iterator is implemented by two iterators nested within each other: GroupExpressionIterator and GroupIterator.
GroupExpressionIterator uses GroupIterator to get all child Plans that match the pattern and uses them as children to generate the pattern-matched plan.
GroupIterator uses GroupExpressionIterator to get all pattern-matched Plans related to the GroupExpressions in the group itself.
2. Plan rewrite framework for the memo
The rewrite framework is implemented by two jobs: RewriteTopDownJob and RewriteBottomUpJob.
Both of them take a group, a set of rules to apply, and a context as constructor parameters.
RewriteTopDownJob applies these rules one by one from top to bottom.
RewriteBottomUpJob applies these rules one by one from bottom to top.
When a rule rewrites the plan tree at a plan node, all rules are applied to that node again until no rule can rewrite it.
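The two traversal orders and the "re-apply until no rule fires" behaviour can be sketched on a plain tree. This is a simplification under stated assumptions: the real jobs work on memo groups and take a context, while `Node`, `Rule`, and `MERGE_FILTERS` here are hypothetical toy types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Toy rewrite driver on a plain tree; not the Nereids job classes.
final class RewriteSketch {
    /** Immutable toy plan node: an operator label plus children. */
    static final class Node {
        final String op;
        final List<Node> children;

        Node(String op, Node... children) {
            this.op = op;
            this.children = Arrays.asList(children.clone());
        }

        @Override
        public String toString() {
            return children.isEmpty() ? op : op + children;
        }
    }

    /** A rule returns a replacement node, or empty when it does not match. */
    interface Rule extends Function<Node, Optional<Node>> {}

    /** Example rule: Filter(Filter(x)) becomes Filter(x). */
    static final Rule MERGE_FILTERS = node -> {
        if (node.op.equals("Filter") && node.children.size() == 1
                && node.children.get(0).op.equals("Filter")) {
            return Optional.of(node.children.get(0));
        }
        return Optional.<Node>empty();
    };

    /** Re-applies all rules at one node until none of them rewrites it. */
    static Node applyUntilFixpoint(Node node, List<Rule> rules) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                Optional<Node> result = rule.apply(node);
                if (result.isPresent()) {
                    node = result.get();
                    changed = true;
                    break; // start again from the first rule on the rewritten node
                }
            }
        }
        return node;
    }

    /** Rewrites the current node first, then its children. */
    static Node rewriteTopDown(Node node, List<Rule> rules) {
        Node current = applyUntilFixpoint(node, rules);
        Node[] kids = new Node[current.children.size()];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = rewriteTopDown(current.children.get(i), rules);
        }
        return new Node(current.op, kids);
    }

    /** Rewrites the children first, then the current node. */
    static Node rewriteBottomUp(Node node, List<Rule> rules) {
        Node[] kids = new Node[node.children.size()];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = rewriteBottomUp(node.children.get(i), rules);
        }
        return applyUntilFixpoint(new Node(node.op, kids), rules);
    }
}
```

For the input Filter(Filter(Filter(Scan))) and the single example rule, both orders end with Filter(Scan); they differ only in which node is visited first.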
This CL mainly changes:
1. Broker Load
When assigning backends, use the user-level resource tag to find available backends.
If the user-level resource tag is not set, a broker load task can be assigned to any BE node;
otherwise, the task can only be assigned to BE nodes that match the user-level tags.
2. Routine Load
The current routine load job does not have user info, so it cannot get the user-level tag when assigning tasks.
So there are 2 cases:
1. For old routine load jobs, use the tags of the replica allocation info to select BE nodes.
2. For new routine load jobs, the user info is added and persisted in the routine load job.
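The tag-based backend selection can be sketched as a simple filter. The `Backend` class, the single-tag-per-backend model, and the method name are assumptions for illustration; an empty tag set stands for "no user-level tag set".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Illustrative only; not the Doris backend or tag classes.
final class TagSelectionSketch {
    static final class Backend {
        final String host;
        final String tag;

        Backend(String host, String tag) {
            this.host = host;
            this.tag = tag;
        }
    }

    /** With no tags set, every backend is eligible; otherwise only backends with a matching tag. */
    static List<Backend> eligibleBackends(List<Backend> all, Set<String> tags) {
        if (tags.isEmpty()) {
            return new ArrayList<>(all);
        }
        List<Backend> matched = new ArrayList<>();
        for (Backend backend : all) {
            if (tags.contains(backend.tag)) {
                matched.add(backend);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<Backend> all = new ArrayList<>();
        all.add(new Backend("be1", "group_a"));
        all.add(new Backend("be2", "group_b"));
        System.out.println(eligibleBackends(all, Collections.singleton("group_a")).size()); // 1
        System.out.println(eligibleBackends(all, Collections.<String>emptySet()).size());   // 2
    }
}
```

The same filter would apply to routine load, with the tag set taken either from the replica allocation info (old jobs) or from the persisted user info (new jobs).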
Close#9623
Summary:
This PR refactors the plan node into plan + operator.
In the previous version of Nereids, a plan node consists of children and relational algebra, e.g.
```java
class LogicalJoin extends LogicalBinary {
    private Plan left, right;
}
```
The structure above is easy to understand, but it makes `Memo.copyIn` difficult to optimize: a rule generates a complete sub-plan,
and Memo must compare the complete sub-plan to distinguish GroupExpressions, which hurts performance.
First, we need to change the rules to generate a partial sub-plan, replacing some child plans with placeholders, e.g. LeafOp in the Columbia optimizer. Some children in the sub-plan are then marked as unchanged and bound to their related groups, so a sub-plan does not have to be compared and copied if its related group exists.
Second, we need to separate the original `Plan` into `Plan` and `Operator`: a Plan contains children and an Operator, and an Operator only denotes relational algebra (no children or input fields). This design keeps the operator and the children from affecting each other. The plan-group binder can then generate a placeholder plan (containing the related group) for the sub-query, and does not have to generate the current plan node case by case, because the plan is immutable (replacing children generates a new plan). Rule implementers can reuse the placeholder to generate a partial sub-plan.
Operator and Plan have similar inheritance structures, as shown below. XxxPlan contains XxxOperator, e.g. LogicalBinary contains a LogicalBinaryOperator.
```
         TreeNode
             │
             │
     ┌───────┴────────┐                                       Operator
     │                │                                           │
     │                │                                           │
     │                │                                           │
     ▼                ▼                                           ▼
Expression          Plan                                    PlanOperator
                      │                                           │
                      │                                           │
          ┌───────────┴─────────┐                                 │
          │                     │                     ┌───────────┴──────────────────┐
          │                     │                     │                              │
          │                     │                     │                              │
          ▼                     ▼                     ▼                              ▼
     LogicalPlan          PhysicalPlan       LogicalPlanOperator           PhysicalPlanOperator
          │                     │                     │                              │
          │                     │                     │                              │
          │                     │                     │                              │
          ├───►LogicalLeaf      ├──►PhysicalLeaf      ├──► LogicalLeafOperator       ├───►PhysicalLeafOperator
          │                     │                     │                              │
          │                     │                     │                              │
          ├───►LogicalUnary     ├──►PhysicalUnary     ├──► LogicalUnaryOperator      ├───►PhysicalUnaryOperator
          │                     │                     │                              │
          │                     │                     │                              │
          └───►LogicalBinary    └──►PhysicalBinary    └──► LogicalBinaryOperator     └───►PhysicalBinaryOperator
```
The concrete operators extend XxxNaryOperator, e.g.
```java
class LogicalJoin extends LogicalBinaryOperator;
class PhysicalProject extends PhysicalUnaryOperator;
class LogicalRelation extends LogicalLeafOperator;
```
So the first example changes to this:
```java
class LogicalBinary extends AbstractLogicalPlan implements BinaryPlan {
    private Plan left, right;
    private LogicalBinaryOperator operator;
}
class LogicalJoin extends LogicalBinaryOperator {}
```
With these changes, a Rule must build both the plan and the operator as needed, not only the plan as before.
For example, the JoinCommutative rule:
```java
public Rule<Plan> build() {
    // The overloaded plan() function builds the plan automatically according to the Operator's type,
    // so this returns a LogicalBinary(LogicalJoin, Plan, Plan).
    return innerLogicalJoin().then(join -> plan(
        // operator
        new LogicalJoin(join.op.getJoinType().swap(), join.op.getOnClause()),
        // children
        join.right(),
        join.left()
    )).toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
}
```
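For readers without the Nereids code at hand, here is a self-contained toy version of the Plan/Operator split. The class names `Scan`, `InnerJoin`, and `Plan` and the `withChildren` helper are hypothetical and much simpler than the real classes; the sketch only shows that an operator holds no children and that replacing children produces a new immutable plan.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model of the Plan/Operator split; not the Nereids classes.
final class PlanOperatorSketch {
    /** An operator carries relational-algebra information only, never children. */
    interface Operator {}

    static final class Scan implements Operator {
        final String table;

        Scan(String table) {
            this.table = table;
        }
    }

    static final class InnerJoin implements Operator {
        final String onClause;

        InnerJoin(String onClause) {
            this.onClause = onClause;
        }
    }

    /** A plan pairs one operator with its children and is immutable. */
    static final class Plan {
        final Operator op;
        final List<Plan> children;

        Plan(Operator op, Plan... children) {
            this.op = op;
            this.children = Collections.unmodifiableList(Arrays.asList(children.clone()));
        }

        /** Replacing children yields a new plan that reuses the same operator. */
        Plan withChildren(Plan... newChildren) {
            return new Plan(op, newChildren);
        }
    }

    /** Inner join commutativity: same operator, children swapped. */
    static Plan commute(Plan join) {
        return join.withChildren(join.children.get(1), join.children.get(0));
    }
}
```

Because the operator does not reference its children, the swapped plan can share the operator and both child plans with the original.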
Due to the current architecture, predicate derivation in the rewrite phase cannot cover all cases,
because the rewrite processes the ON clause first and then the WHERE clause, and when there are subqueries, not all cases can be derived.
So the predicate pushdown method is kept here.
e.g.
select * from t1 left join t2 on t1 = t2 where t1 = 1;
InferFiltersRule can't infer t2 = 1, because this is outside its specification.
The expression (t2 = 1) can actually be deduced and pushed down to the scan node.
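The deduction in this example can be sketched as combining an ON-clause equality with a WHERE-clause constant filter. The classes and the `infer` method below are hypothetical, and the sketch deliberately ignores the join-type checks a real planner has to perform before pushing a derived predicate down.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: derives `other = constant` from `column = other` and `column = constant`.
final class PredicateInferSketch {
    /** Hypothetical form of an ON-clause equality between two columns. */
    static final class ColumnEquality {
        final String left;
        final String right;

        ColumnEquality(String left, String right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Hypothetical form of a WHERE-clause filter `column = constant`. */
    static final class ConstantFilter {
        final String column;
        final String constant;

        ConstantFilter(String column, String constant) {
            this.column = column;
            this.constant = constant;
        }

        @Override
        public String toString() {
            return column + " = " + constant;
        }
    }

    /** For each filter on one side of an equality, emits the same filter on the other side. */
    static List<ConstantFilter> infer(List<ColumnEquality> on, List<ConstantFilter> where) {
        List<ConstantFilter> derived = new ArrayList<>();
        for (ConstantFilter filter : where) {
            for (ColumnEquality eq : on) {
                if (eq.left.equals(filter.column)) {
                    derived.add(new ConstantFilter(eq.right, filter.constant));
                } else if (eq.right.equals(filter.column)) {
                    derived.add(new ConstantFilter(eq.left, filter.constant));
                }
            }
        }
        return derived;
    }
}
```

For the query above, the inputs `t1 = t2` and `t1 = 1` yield the derived filter `t2 = 1`, which is the predicate that pushdown delivers to the scan node.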
* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. Fix the bug that the vjson scanner does not support `range_from_file_path`.
2. Fix the core dump in the vjson/vbroker scanner caused by src and dest slots differing in nullability.
3. Fix the bug in vparquet filter_block when the reference count of a column is not 1.
4. Refactor the code to simplify it.
Only the vectorized load is changed, not the original row-based load.
Co-authored-by: lihaopeng <lihaopeng@baidu.com>