# Proposed changes
## refactor
- add AggregateExpression to shield the difference of AggregateFunction before disassemble and after
- request `GATHER` physicalProperties for query, because query always collect result to the coordinator, use `GATHER` maybe select a better plan
- refactor `NormalizeAggregate`
- remove some physical fields for the `LogicalAggregate`, like `AggPhase`, `isDisassemble`
- remove `AggregateDisassemble` and `DistinctAggregateDisassemble`, and use `AggregateStrategies` to generate various of PhysicalHashAggregate, like `two phases aggregate`, `three phases aggregate`, and cascades can auto select the lowest cost alternative.
- move `PushAggregateToOlapScan` to `AggregateStrategies`
- separate the traverse and visit method in FoldConstantRuleOnFE
- if some expression not implement the visit method, the traverse method can handle and rewrite the children by default
- if some expression implement the visit, the user defined traverse(invoke accept/visit method) will quickly return because the default visit method will not forward to the children, and the pre-process in traverse method will not be skipped.
## new feature
- support `disable_nereids_rules` to skip some rules.
example:
1. create 1 bucket table `n`
```sql
CREATE TABLE `n` (
`id` bigint(20) NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
);
```
2. insert some rows into `n`
```sql
insert into n select * from numbers('number'='20000000')
```
3. query table `n`
```sql
SET enable_nereids_planner=true;
SET enable_vectorized_engine=true;
SET enable_fallback_to_original_planner=false;
explain plan select id from n group by id;
```
the result show that we use the one stage aggregate
```
| PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional.empty, requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) |
| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) |
```
4. disable one stage aggregate
```sql
explain plan select
/*+SET_VAR(disable_nereids_rules=DISASSEMBLE_ONE_PHASE_AGGREGATE_WITHOUT_DISTINCT)*/
id
from n
group by id
```
the result is two stage aggregate
```
| PhysicalHashAggregate ( aggPhase=GLOBAL, aggMode=BUFFER_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_BUFFER, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[ANY], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) |
| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) |
```