1. generalize debug point facilities from docker suites for
fault-injection/stubbing cases
2. add segcompaction fault-injection cases for demonstration
3. add -238 TOO_MANY_SEGMENTS fault-injection case for good
Previously, when using a file scan node (e.g., querying a Hive table), the max number of scanners for each scan node
was `doris_scanner_thread_pool_thread_num` (default is 48).
If the query parallelism is N, the total number of scanners would be 48 * N, which is too many.
This PR changes the logic so that the max number of scanners for each scan node
is `doris_scanner_thread_pool_thread_num / query parallelism`, so the total number of scanners
is at most `doris_scanner_thread_pool_thread_num`.
Reducing the number of scanners can significantly reduce the memory usage of a query.
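The arithmetic above can be sketched as follows. This is only an illustration, not the actual BE code: the function names are hypothetical, and both integer division and the floor of one scanner per scan node are assumptions that the description does not state.

```python
def max_scanners_per_scan_node(thread_pool_thread_num: int, query_parallelism: int) -> int:
    """New per-scan-node limit: thread pool size divided by query parallelism."""
    # Assumption: a parallelism below 1 is treated as 1.
    parallelism = max(1, query_parallelism)
    # Assumption: integer division, and never fewer than one scanner per node.
    return max(1, thread_pool_thread_num // parallelism)


def total_scanners(thread_pool_thread_num: int, query_parallelism: int) -> int:
    """Total scanners across all parallel instances of the scan node."""
    parallelism = max(1, query_parallelism)
    return max_scanners_per_scan_node(thread_pool_thread_num, parallelism) * parallelism


# Old behaviour for comparison: 48 scanners per node, so 48 * N in total.
old_total = 48 * 4                # 192 scanners with parallelism 4
new_total = total_scanners(48, 4) # 12 per node * 4 = 48 scanners
```

With the assumed floor of one scanner per node, the total only stays within the thread pool size while the parallelism does not exceed `doris_scanner_thread_pool_thread_num`.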
In branch 2.0, we changed the read/write method of AnalysisManager
and renamed the image module to AnalysisMgrV2.
We need to make the same change in the master branch so that users
can upgrade Doris from branch-2.0 to master.
After this PR, users can:
- upgrade from 2.0.x(or branch-2.0) to master
Doris is not responsible for managing snapshots, but it needs to clear all
snapshots before doing backup/restore regression testing, so a property is
added to indicate that existing snapshots need to be cleared when creating a
repo.
In addition, a regression test case for backup/restore has been added.
Add tpcds sf100 hive shapes.
Disable query64 temporarily because it does not match the EMR cluster after collecting metadata with `analyze table xxx`.
The root cause still needs to be analyzed; the query will be re-enabled in a future PR.
The image file of our cluster has reached 2.3G. After a checkpoint, Followers time out while synchronizing the image, so the bdb directory keeps growing.
related pr: #25768
Could not run multiple group_concat distinct with more than one parameter.
This bug is not specific to group_concat, but we usually use a literal as
a parameter in group_concat, so group_concat brought the problem to light.
In the original logic, we assumed that only distinct aggregate functions with
zero or one parameter could run in multi distinct mode. This is wrong:
we can process all distinct aggregate functions with no more than one
input slot.
Consider this SQL:
```sql
SELECT
group_concat(distinct c1, ','), group_concat(distinct c2, ',')
FROM t
GROUP BY c3
```
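The difference between the old and new eligibility checks can be sketched as below. This is a simplified model and not the planner code: `Slot`, `Literal`, `old_rule` and `new_rule` are hypothetical names, and counting unique slot names is an assumption about what "input slots" means here.

```python
from dataclasses import dataclass
from typing import Sequence, Union


@dataclass(frozen=True)
class Slot:
    """A column reference, e.g. c1."""
    name: str


@dataclass(frozen=True)
class Literal:
    """A constant argument, e.g. the ',' separator."""
    value: object


Arg = Union[Slot, Literal]


def old_rule(args: Sequence[Arg]) -> bool:
    # Original logic: only zero or one parameter is allowed.
    return len(args) <= 1


def new_rule(args: Sequence[Arg]) -> bool:
    # Fixed logic: literals are ignored, at most one input slot is allowed.
    slot_names = {a.name for a in args if isinstance(a, Slot)}
    return len(slot_names) <= 1


# group_concat(distinct c1, ',') has two parameters but only one input slot.
group_concat_args = [Slot("c1"), Literal(",")]
rejected_before = not old_rule(group_concat_args)
accepted_now = new_rule(group_concat_args)
```

Under this model both `group_concat` calls in the query above pass the new check, since each takes one column and one literal.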
We translate expressions to legacy ones when folding constants on BE.
Sometimes we generate an invalid expression that cannot be translated.
So we should catch the translate exception to avoid failing the query.
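A minimal sketch of the intended behaviour, assuming the fallback is to leave the expression unfolded. All names here (`TranslateError`, `fold_constant_on_be`, `translate`, `send_to_be`) are hypothetical and do not correspond to the real FE classes.

```python
from typing import Any, Callable


class TranslateError(Exception):
    """Hypothetical error raised when an expression has no legacy form."""


def fold_constant_on_be(
    expr: Any,
    translate: Callable[[Any], Any],
    send_to_be: Callable[[Any], Any],
) -> Any:
    """Try to fold `expr` on BE; keep the original expression on failure."""
    try:
        legacy_expr = translate(expr)
    except TranslateError:
        # Assumption: skipping the fold is safe, so the query continues
        # with the unfolded expression instead of failing.
        return expr
    return send_to_be(legacy_expr)
```

Only the translate step is guarded here, so errors raised by the BE call itself still propagate.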