After a projection, a Slot may be projected to another one, so we need to replace the ExprId in DistributionSpecHash with the new one. If the project does anything other than Alias, we need to return DistributionSpecAny instead of the child's DistributionSpec.
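The remapping described above can be sketched as follows. This is a simplified Python model, not the Nereids code: `SlotRef`, `Alias`, `HashSpec`, and `project_distribution` are hypothetical names, and the sketch assumes that any hash column that is neither passed through nor aliased by the project forces a fallback to DistributionSpecAny.

```python
from dataclasses import dataclass
from typing import Tuple

ANY = "DistributionSpecAny"  # stand-in for the "no guarantee" spec


@dataclass(frozen=True)
class SlotRef:
    expr_id: int  # ExprId of a slot coming from the child


@dataclass(frozen=True)
class Alias:
    expr_id: int   # new ExprId produced by the project
    child: object  # the aliased expression (a SlotRef or anything else)


@dataclass(frozen=True)
class HashSpec:
    expr_ids: Tuple[int, ...]  # ExprIds the data is hash-distributed on


def project_distribution(projects, child_spec):
    """Derive the output distribution of a project (hypothetical helper)."""
    if not isinstance(child_spec, HashSpec):
        return child_spec  # assumption: non-hash specs pass through unchanged

    # Map each child ExprId to the ExprId it has after the project.
    mapping = {}
    for p in projects:
        if isinstance(p, SlotRef):
            mapping.setdefault(p.expr_id, p.expr_id)
        elif isinstance(p, Alias) and isinstance(p.child, SlotRef):
            mapping.setdefault(p.child.expr_id, p.expr_id)

    new_ids = []
    for expr_id in child_spec.expr_ids:
        if expr_id not in mapping:
            # The hash column was dropped or wrapped in a real expression.
            return ANY
        new_ids.append(mapping[expr_id])
    return HashSpec(tuple(new_ids))
```

In this model, projecting `Alias(10, SlotRef(1))` and `SlotRef(2)` over a child distributed on ExprIds `(1, 2)` yields a hash spec on `(10, 2)`, while aliasing an arbitrary expression in place of slot `1` yields `ANY`.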
This rule eliminates a project whose output set is the same as its child's. If the project is the root of the plan, the elimination condition is that the project's output is exactly the same as its child's.
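A minimal sketch of the elimination condition, written in Python for brevity. The function name and the string representation of slots are hypothetical, and "exactly the same" is assumed here to mean the same slots in the same order:

```python
def can_eliminate_project(project_output, child_output, is_root):
    """Return True if the project adds nothing over its child (hypothetical helper)."""
    if is_root:
        # Assumption: at the root, the output order is visible to the client,
        # so the outputs must match slot by slot.
        return list(project_output) == list(child_output)
    # Elsewhere, only the set of output slots matters.
    return set(project_output) == set(child_output)
```

Under this reading, a non-root project that only reorders its child's slots can be eliminated, while the same project at the root of the plan must be kept.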
The reason for adding this rule is that, when we do join reorder during optimization, the root of the transformed plan may be a Project whose output set is the same as that of the root before the transformation. If there was already a Project on top of the root whose output set is also the same as the root's, we will have two identical projects in the memo, one being the parent of the other. After MergeProject, we get a new Project that is exactly the same as the child and needs to be added to the parent's group. This triggers Merge Group. Since the merge would produce a cycle, it is denied, and we end up with a final plan containing two consecutive projects.
## For example:
**BEFORE OPTIMIZATION**
```
LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1]
+--LogicalJoin(type=LEFT_SEMI_JOIN) [GroupId#2]
|--LogicalProject(...)
| +--LogicalJoin(type=INNER_JOIN)
| ...
+--LogicalOlapScan(...)
```
**AFTER APPLY RULE: LOGICAL_SEMI_JOIN_LOGICAL_JOIN_TRANSPOSE_PROJECT**
```
LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1]
+--LogicalProject2( projects=[c_custkey#0, c_name#1]) [GroupId#2]
+--LogicalJoin(type=INNER_JOIN) [GroupId#10]
|--LogicalProject(...)
| +--LogicalJoin(type=LEFT_SEMI_JOIN)
| ...
+--LogicalOlapScan(...)
```
**AFTER APPLY RULE: MERGE_PROJECTS**
```
LogicalProject3( projects=[c_custkey#0, c_name#1]) [should be in GroupId#1, but in GroupId#2 in fact]
+--LogicalJoin(type=INNER_JOIN) [GroupId#10]
|--LogicalProject(...)
| +--LogicalJoin(type=LEFT_SEMI_JOIN)
| ...
+--LogicalOlapScan(...)
```
Since we have exactly the same GroupExpression (LogicalProject3 and LogicalProject2) in GroupId#1 and GroupId#2, we need to do MergeGroup(GroupId#1, GroupId#2). But GroupId#2 contains a child of GroupId#1, so the merge is denied.
If the best GroupExpression in GroupId#2 is LogicalProject3, we will get two consecutive projects in the final plan.
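Why the merge is denied can be illustrated with a small reachability check. This is only a sketch of the idea, not the memo implementation: the `children` map (group id to the group ids referenced by its expressions) and both function names are assumptions made for illustration.

```python
def reachable(children, start, target):
    """Depth-first search: is `target` reachable from `start` via child edges?"""
    stack, seen = [start], set()
    while stack:
        group = stack.pop()
        if group == target:
            return True
        if group in seen:
            continue
        seen.add(group)
        stack.extend(children.get(group, ()))
    return False


def can_merge_groups(children, a, b):
    """Merging is safe only if neither group is an ancestor of the other."""
    return not reachable(children, a, b) and not reachable(children, b, a)


# Group 1 (LogicalProject1) has group 2 as its child, and group 2 has group 10.
children = {1: {2}, 2: {10}, 10: set()}
```

With this map, `can_merge_groups(children, 1, 2)` is false because group 2 sits below group 1, so merging them would make the merged group its own child.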
There are several configs related to tcmalloc, and users do not know how to configure them. Actually, users just want two modes, performance or compact: in performance mode, users want Doris to run queries and loads quickly, while in compact mode, users want Doris to run with less memory usage.
If we want to configure tcmalloc individually, we can use the env variables supported by tcmalloc.
When upgrading from version 1.1 to 1.2, the FE version will not match the BE version for a period of time. After the BE is upgraded, a schema change makes the BE use the field desc_tbl, which is added by the 1.2 version FE. The BE will core dump because the field desc_tbl is nullptr, so it needs to refuse the request.
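The guard amounts to validating the request before touching the field. A rough Python model of that check follows; the real fix lives in the BE, and `AlterRequest` and `handle_alter_request` are hypothetical names used only to show the shape of the check.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class AlterRequest:
    tablet_id: int
    desc_tbl: Optional[dict] = None  # absent when the request comes from an older FE


def handle_alter_request(request: AlterRequest) -> Tuple[bool, str]:
    """Refuse the request instead of dereferencing a missing desc_tbl."""
    if request.desc_tbl is None:
        return False, "desc_tbl is not set, the FE may still be on an older version"
    # ... the schema change would proceed here using request.desc_tbl ...
    return True, "ok"
```

Returning an error status lets the FE see a failed schema change that can be retried once both sides run the same version, rather than a crashed BE.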
- This PR fixes the BE core dump when importing arrays.
- Before the change, importing an array from a rapidjson string core dumps in the non-vectorized scenario.
- After the change, an array can be imported from a rapidjson string successfully.
When enable_nereids_planner=false and enable_fallback_to_origin=false, FE throws an exception for every select statement.
Expected: when enable_nereids_planner=false, all valid queries execute successfully.
1.
In the previous implementation, the max thread num of the olap scanner was set relatively small, such as 3,
which would slow down some queries.
In this PR, I changed the max thread num to a quarter of the scanner thread pool (default is 12),
which is less than the old scan node's max thread num, but larger than the previous implementation.
The upper limit of the max thread num of the old scan node is too high, which is not reasonable.
2.
Lower the number of pre-allocated free blocks.
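The first change boils down to a one-line computation, sketched here in Python. The function name is hypothetical, and the lower bound of one thread is an assumption added so that tiny pools still make progress.

```python
def max_scanner_threads(pool_size: int) -> int:
    """A quarter of the scanner thread pool, never less than one thread."""
    return max(1, pool_size // 4)
```

For instance, a pool of 48 threads gives a limit of 12 per scan node, and a pool of 12 gives 3.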
1. Remove quick_compaction's rowset pick policy; call cumulative compaction when quick compaction is triggered.
2. Skip a tablet's compaction task when its compaction score is too small.
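The second point can be pictured as a simple filter over candidate tablets. The sketch below is in Python with hypothetical names; the threshold parameter `min_score` is an assumption, since the text does not say how "too small" is defined.

```python
def pick_compaction_tablets(tablet_scores, min_score):
    """Keep only tablets whose compaction score reaches the threshold.

    tablet_scores: dict mapping tablet id -> compaction score.
    min_score: hypothetical cutoff below which a compaction task is skipped.
    """
    return [tablet_id for tablet_id, score in tablet_scores.items() if score >= min_score]
```

Tablets with only a few small rowsets are then left alone until their score grows, which avoids scheduling compaction tasks that would do little work.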
Co-authored-by: yixiutt <yixiu@selectdb.com>