The Repeat Node changes the data partition of a fragment when the fragment's original data partition is HashPartition. The Repeat Node generates new rows, and the distribution of these new rows is completely inconsistent with the original data distribution: it is RANDOM. If the data distribution is not corrected, the agg node makes the wrong decision when it determines whether to perform a colocate aggregation. The stale distribution causes the agg node to conclude that the aggregation can be colocated, which leads to wrong results. For example, the following query cannot be colocated even though the distribution column of the table is k1:

```
SELECT k1, k2, SUM( k3 )
FROM table
GROUP BY GROUPING SETS ( (k1, k2), (k1), (k2), ( ) )
```
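To make the problem concrete, here is a minimal Python sketch of what happens if each partition aggregates locally after the repeat step. It is a toy model and not the FE implementation: the helper names (`partition_of`, `repeat`), the two-partition layout, and the sample rows are all assumptions chosen for illustration.

```python
from collections import defaultdict

NUM_PARTITIONS = 2  # hypothetical partition count

def partition_of(k1):
    # Toy stand-in for hash partitioning on the distribution column k1.
    return k1 % NUM_PARTITIONS

# Sample rows as (k1, k2, k3); illustrative data only.
rows = [(1, 10, 5), (2, 10, 7), (3, 20, 1)]

# Grouping sets from the query above.
GROUPING_SETS = [("k1", "k2"), ("k1",), ("k2",), ()]

def repeat(row):
    """Emit one copy of the row per grouping set, nulling the columns
    that are not part of that grouping set."""
    k1, k2, k3 = row
    out = []
    for gid, cols in enumerate(GROUPING_SETS):
        out.append((k1 if "k1" in cols else None,
                    k2 if "k2" in cols else None,
                    k3, gid))
    return out

# Each partition repeats its own rows and aggregates them locally,
# which is what a colocated aggregation would do.
local_results = defaultdict(lambda: defaultdict(int))
for row in rows:
    p = partition_of(row[0])
    for k1, k2, k3, gid in repeat(row):
        local_results[p][(k1, k2, gid)] += k3

# Find groups whose rows ended up on more than one partition.
seen = defaultdict(set)
for p, groups in local_results.items():
    for key in groups:
        seen[key].add(p)
split_groups = {key for key, parts in seen.items() if len(parts) > 1}

for key in sorted(split_groups, key=str):
    partials = {p: local_results[p][key] for p in sorted(seen[key])}
    print(key, "partial sums per partition:", partials)
```

In this sketch, the groups produced by the `(k2)` and `( )` grouping sets no longer carry a k1 value, so rows belonging to the same group sit on different partitions and each partition yields only a partial sum. Groups that still contain k1 stay on a single partition, which is why the issue is easy to miss.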
# fe-common

This module holds common classes shared by the other modules.

# spark-dpp

This module is the Spark DPP program, used by the Spark Load function.

Depends: fe-common

# fe-core

This module is the main process module of FE.

Depends: fe-common, spark-dpp