ZSTD compression is fast and has a high compression ratio. It can achieve a higher compression ratio
than the default LZ4F codec, which helps when storing cost-sensitive data such as logs.
Compared with the LZ4F codec, in the comparison test below the ZSTD codec produced a 35% smaller compressed size,
was 30% faster on the first read (without OS page cache), and 40% slower on the second read (with OS page cache).
- test data: 25GB text log, 110 million rows
- test table: test_table(ts varchar(30), log string)
- test SQL: set enable_vectorized_engine=1; select sum(length(log)) from test_table
- be.conf: disable_storage_page_cache = true (disables the Doris page cache so data is not all cached in memory, to measure real decompression speed)
test results
master branch with lz4f codec:
- compressed size: 4.3G
- SQL first exec time (read data from disk + decompress + a little computation): 18.3s
- SQL second exec time (read data from OS page cache + decompress + a little computation): 2.4s
this branch with zstd codec (hardcoded on for this test):
- compressed size: 2.8G
- SQL first exec time: 12.8s
- SQL second exec time: 3.4s
This CL mainly changes:
1. Broker Load
When assigning backends, use the user-level resource tags to find available backends.
If no user-level resource tag is set, a broker load task can be assigned to any BE node;
otherwise, the task can only be assigned to BE nodes that match the user-level tags.
2. Routine Load
The current routine load job carries no user info, so it cannot get the user-level tags when assigning tasks.
There are therefore two cases:
1. For old routine load jobs, use the tags of the replica allocation info to select BE nodes.
2. For new routine load jobs, the user info will be added and persisted in the routine load job.
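The tag-matching selection described above can be sketched as follows. This is an illustrative sketch only, with hypothetical class and method names (`BackendSelector`, `selectBackends`), not the actual Doris FE code:

```java
import java.util.*;
import java.util.stream.*;

// Illustrative sketch of tag-based backend selection (not the actual Doris FE code).
class BackendSelector {
    static class Backend {
        final long id;
        final Set<String> tags;
        Backend(long id, Set<String> tags) { this.id = id; this.tags = tags; }
    }

    // If the user has no resource tags, any backend is a candidate;
    // otherwise only backends whose tags intersect the user's tags qualify.
    static List<Backend> selectBackends(List<Backend> all, Set<String> userTags) {
        if (userTags == null || userTags.isEmpty()) {
            return all;
        }
        return all.stream()
                .filter(be -> !Collections.disjoint(be.tags, userTags))
                .collect(Collectors.toList());
    }
}
```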
1. Fix the LRU Cache MemTracker consumption value being negative.
2. Fix the compaction Cache MemTracker not tracking memory.
3. Add a USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hooks are not stopped at any time.
Close #9623
Summary:
This PR refactors the plan node into plan + operator.
In the previous version of Nereids, a plan node consists of children and relational algebra, e.g.
```java
class LogicalJoin extends LogicalBinary {
    private Plan left, right;
}
```
The structure above is easy to understand, but it is difficult to optimize `Memo.copyIn`: rules generate a complete sub-plan,
and Memo must compare the complete sub-plan to de-duplicate GroupExpressions, which hurts performance.
First, we need to change the rules to generate a partial sub-plan, replacing some child plans with a placeholder (like LeafOp in the Columbia optimizer). We can then mark some children in the sub-plan as unchanged and bind the related group, so Memo does not have to compare and copy a sub-plan whose related group already exists.
Second, we need to separate the original `Plan` into `Plan` and `Operator`: a Plan contains children and an Operator, and an Operator denotes just the relational algebra (no children/input field). This design keeps operators and children from affecting each other. The plan-group binder can then generate a placeholder plan (containing the related group) for a sub-query, without generating the current plan node case by case, because plans are immutable (replacing children generates a new plan). Rule implementers can reuse the placeholder to generate a partial sub-plan.
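The placeholder idea can be sketched minimally as follows. The names (`GroupPlanSketch`, `GroupPlan`, `Join`) are illustrative, not Nereids' actual API; the point is that equality compares group ids instead of deep-comparing whole sub-plans:

```java
import java.util.*;

// Illustrative sketch of a placeholder plan bound to a memo group,
// so sub-plan comparison stops at the group id (like LeafOp in Columbia).
class GroupPlanSketch {
    interface Plan {}

    // Placeholder leaf bound to an existing memo group.
    static class GroupPlan implements Plan {
        final int groupId;
        GroupPlan(int groupId) { this.groupId = groupId; }
        @Override public boolean equals(Object o) {
            return o instanceof GroupPlan && ((GroupPlan) o).groupId == groupId;
        }
        @Override public int hashCode() { return groupId; }
    }

    // A partial sub-plan produced by a rule: a real root over placeholders.
    static class Join implements Plan {
        final Plan left, right;
        Join(Plan left, Plan right) { this.left = left; this.right = right; }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Join)) return false;
            Join other = (Join) o;
            return left.equals(other.left) && right.equals(other.right);
        }
        @Override public int hashCode() { return Objects.hash(left, right); }
    }
}
```

With this shape, de-duplicating a rule's output against the memo only walks the small partial sub-plan; the unchanged children collapse to group-id comparisons.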
Operator and Plan have similar inheritance structures, as shown below. Each XxxPlan contains an XxxOperator, e.g. LogicalBinary contains a LogicalBinaryOperator.
```
TreeNode
│
│
┌───────┴────────┐ Operator
│ │ │
│ │ │
│ │ │
▼ ▼ ▼
Expression Plan PlanOperator
│ │
│ │
┌───────────┴─────────┐ │
│ │ ┌───────────┴──────────────────┐
│ │ │ │
│ │ │ │
▼ ▼ ▼ ▼
LogicalPlan PhysicalPlan LogicalPlanOperator PhysicalPlanOperator
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
│ │ │ │
├───►LogicalLeaf ├──►PhysicalLeaf ├──► LogicalLeafOperator ├───►PhysicalLeafOperator
│ │ │ │
│ │ │ │
│ │ │ │
├───►LogicalUnary ├──►PhysicalUnary ├──► LogicalUnaryOperator ├───►PhysicalUnaryOperator
│ │ │ │
│ │ │ │
│ │ │ │
└───►LogicalBinary └──►PhysicalBinary └──► LogicalBinaryOperator └───►PhysicalBinaryOperator
```
Each concrete operator extends an XxxNaryOperator, e.g.
```java
class LogicalJoin extends LogicalBinaryOperator;
class PhysicalProject extends PhysicalUnaryOperator;
class LogicalRelation extends LogicalLeafOperator;
```
So the first example changes to this:
```java
class LogicalBinary extends AbstractLogicalPlan implements BinaryPlan {
    private Plan left, right;
    private LogicalBinaryOperator operator;
}

class LogicalJoin extends LogicalBinaryOperator {}
```
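The immutability mentioned above (replacing children yields a new plan rather than mutating the old one) can be sketched like this. The names (`ImmutablePlanSketch`, `withChildren`) are illustrative, not the actual Nereids classes:

```java
import java.util.*;

// Illustrative sketch of an immutable plan wrapping an operator plus children.
class ImmutablePlanSketch {
    static class Operator {
        final String name;
        Operator(String name) { this.name = name; }
    }

    static class Plan {
        final Operator op;
        final List<Plan> children;
        Plan(Operator op, List<Plan> children) {
            this.op = op;
            this.children = List.copyOf(children); // defensive, unmodifiable copy
        }
        // Replacing children never mutates this plan; it builds a new plan
        // that shares the same operator instance.
        Plan withChildren(List<Plan> newChildren) {
            return new Plan(op, newChildren);
        }
    }
}
```

Because the operator carries no children, the same operator instance can be shared between the old plan and the new one, which is what lets the binder swap sub-trees for placeholders cheaply.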
With these changes, a Rule must build both the plan and the operator as needed, not only the plan as before.
For example, the JoinCommutative rule:
```java
public Rule<Plan> build() {
    // the plan() helper automatically builds the plan according to the operator's type,
    // so this returns a LogicalBinary(LogicalJoin, Plan, Plan)
    return innerLogicalJoin().then(join -> plan(
            // operator
            new LogicalJoin(join.op.getJoinType().swap(), join.op.getOnClause()),
            // children
            join.right(),
            join.left()
    )).toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
}
```
* [Enhancement][Vectorized] Build the hash table in a new thread, as the non-vectorized engine did in the past
* Edit after review comments
* Format code with clang-format
Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: stephen <hello-stephen@qq.com>
Due to the current architecture, predicate derivation at rewrite time cannot satisfy all cases:
rewrite processes the `on` clauses first and then the `where` clause, and when there are subqueries, not all cases can be derived.
So the predicate pushdown method is kept here.
e.g.
select * from t1 left join t2 on t1 = t2 where t1 = 1;
InferFiltersRule cannot infer t2 = 1, because that case is outside its specification.
The expression (t2 = 1) can actually be deduced, so it is pushed down to the scan node here.
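The kind of derivation involved (from an equality between two columns plus a constant predicate on one of them, conclude the constant predicate on the other) can be sketched as a simple fixed-point propagation. This is illustrative code with hypothetical names (`PredicateInference`, `infer`), not Doris' actual InferFiltersRule:

```java
import java.util.*;

// Illustrative transitive predicate inference: given column equivalences from
// join conditions and known constant predicates, propagate the constants
// across equivalent columns until nothing new can be derived.
class PredicateInference {
    // equalPairs: column equivalences, e.g. "t1.c" -> "t2.c" (from on t1.c = t2.c).
    // constants: known constant bindings, e.g. "t1.c" -> "1" (from where t1.c = 1).
    static Map<String, String> infer(Map<String, String> equalPairs,
                                     Map<String, String> constants) {
        Map<String, String> result = new HashMap<>(constants);
        boolean changed = true;
        while (changed) { // iterate to a fixed point
            changed = false;
            for (Map.Entry<String, String> e : equalPairs.entrySet()) {
                String a = e.getKey(), b = e.getValue();
                if (result.containsKey(a) && !result.containsKey(b)) {
                    result.put(b, result.get(a));
                    changed = true;
                } else if (result.containsKey(b) && !result.containsKey(a)) {
                    result.put(a, result.get(b));
                    changed = true;
                }
            }
        }
        return result;
    }
}
```

In the example above, the equivalence t1 = t2 plus the constant t1 = 1 yields t2 = 1, which can then be pushed down to t2's scan node.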