[Opt](multi-catalog) Opt split assignment to resolve uneven distribution. (#30390)
[Opt] (multi-catalog) Opt split assignment to resolve uneven distribution. Currently only for `FileQueryScanNode`.

Referring to the implementation of Trino:
- Local node soft affinity optimization: prefer nodes that hold a local replica.
- When the file cache is turned on, remote splits are assigned with the consistent hash algorithm. Because consistent hashing can be uneven, the splits are then re-adjusted so that the maximum and minimum split counts per host differ by at most `max_split_num_variance`.
- When the file cache is turned off, remote splits are assigned with the round-robin algorithm.
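For reviewers, the consistent-hash path with re-adjustment can be pictured roughly as follows. This is a minimal sketch and not the code in this commit: the class `SplitAssignSketch`, its methods, the hash mixer, and the use of plain strings for hosts and splits are all hypothetical. It assumes distinct host names and treats a variance below 1 as 1, so that the re-adjustment loop always terminates.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names do not come from the Doris code base.
public class SplitAssignSketch {
    // Scramble String.hashCode() so virtual nodes spread over the ring.
    static int hash(String key) {
        int h = key.hashCode();
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Place `virtualNodes` ring positions for every host.
    static TreeMap<Integer, String> buildRing(List<String> hosts, int virtualNodes) {
        TreeMap<Integer, String> ring = new TreeMap<>();
        for (String host : hosts) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(host + "#" + i), host);
            }
        }
        return ring;
    }

    // First ring position at or after the split's hash, wrapping to the start.
    static String locate(TreeMap<Integer, String> ring, String splitKey) {
        Map.Entry<Integer, String> entry = ring.ceilingEntry(hash(splitKey));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    static Map<String, Deque<String>> assign(List<String> hosts, List<String> splits,
                                             int virtualNodes, int maxVariance) {
        if (hosts.isEmpty() || virtualNodes < 1) {
            throw new IllegalArgumentException("need at least one host and one virtual node");
        }
        TreeMap<Integer, String> ring = buildRing(hosts, virtualNodes);
        Map<String, Deque<String>> result = new LinkedHashMap<>();
        for (String host : hosts) {
            result.put(host, new ArrayDeque<>());
        }
        // Step 1: consistent-hash placement, which may be uneven.
        for (String split : splits) {
            result.get(locate(ring, split)).add(split);
        }
        // Step 2: move splits from the fullest to the emptiest host until the
        // gap is within the allowed variance (clamped to at least 1).
        int allowed = Math.max(1, maxVariance);
        while (true) {
            String fullest = hosts.get(0);
            String emptiest = hosts.get(0);
            for (String host : hosts) {
                if (result.get(host).size() > result.get(fullest).size()) {
                    fullest = host;
                }
                if (result.get(host).size() < result.get(emptiest).size()) {
                    emptiest = host;
                }
            }
            if (result.get(fullest).size() - result.get(emptiest).size() <= allowed) {
                return result;
            }
            result.get(emptiest).add(result.get(fullest).removeLast());
        }
    }
}
```

Each moved split gives up the placement the hash chose for it, which is presumably why the allowed difference is configurable rather than forced to a perfectly even spread.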
@@ -2239,7 +2239,28 @@ public class Config extends ConfigBase {
            "When file cache is enabled, the number of virtual nodes of each node in the consistent hash algorithm. "
                    + "The larger the value, the more uniform the distribution of the hash algorithm, "
                    + "but it will increase the memory overhead."})
    public static int virtual_node_number = 2048;
    public static int split_assigner_virtual_node_number = 256;

    @ConfField(mutable = true, description = {
            "本地节点软亲缘性优化。尽可能地优先选取本地副本节点。",
            "Local node soft affinity optimization. Prefer local replication node."})
    public static boolean split_assigner_optimized_local_scheduling = true;

    @ConfField(mutable = true, description = {
            "随机算法最小的候选数目,会选取相对最空闲的节点。",
            "The random algorithm has the smallest number of candidates and will select the most idle node."})
    public static int split_assigner_min_random_candidate_num = 2;

    @ConfField(mutable = true, description = {
            "一致性哈希算法最小的候选数目,会选取相对最空闲的节点。",
            "The consistent hash algorithm has the smallest number of candidates and will select the most idle node."})
    public static int split_assigner_min_consistent_hash_candidate_num = 2;

    @ConfField(mutable = true, description = {
            "各节点之间最大的 split 数目差异,如果超过这个数目就会重新分布 split。",
            "The maximum difference in the number of splits between nodes. "
                    + "If this number is exceeded, the splits will be redistributed."})
    public static int split_assigner_max_split_num_variance = 1;

    @ConfField(description = {
            "控制统计信息的自动触发作业执行记录的持久化行数",