[improvement](scan) avoid too many scanners for file scan node (#25727)
Previously, when using a file scan node (e.g., when querying a Hive table), the maximum number of scanners for each scan node was `doris_scanner_thread_pool_thread_num` (default 48). If the query parallelism is N, the total number of scanners could therefore reach 48 * N, which is too many. This PR changes the logic so that the maximum number of scanners for each scan node is `doris_scanner_thread_pool_thread_num / query parallelism`, capping the total number of scanners at `doris_scanner_thread_pool_thread_num`. Reducing the number of scanners significantly reduces the memory usage of a query.
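The cap is a simple division of the BE-wide scanner thread pool across the query's parallel scan node instances. A minimal C++ sketch of the before/after arithmetic (only `doris_scanner_thread_pool_thread_num` and its default come from the description; the function names are illustrative, not Doris's actual API):

```cpp
#include <algorithm>
#include <cstdint>

static constexpr int32_t doris_scanner_thread_pool_thread_num = 48; // BE config default

// Before: every file scan node could start up to the full thread-pool size of
// scanners, so a query with parallelism N could reach 48 * N scanners in total.
int32_t max_scanners_per_node_old(int32_t /*query_parallelism*/) {
    return doris_scanner_thread_pool_thread_num;
}

// After: the pool size is divided by the query parallelism, so the total
// across all N scan node instances stays at most the pool size.
int32_t max_scanners_per_node_new(int32_t query_parallelism) {
    return std::max<int32_t>(
            1, doris_scanner_thread_pool_thread_num / std::max<int32_t>(1, query_parallelism));
}
```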
@@ -464,7 +464,7 @@ Status GroupCommitMgr::group_commit_insert(int64_t table_id, const TPlan& plan,
     RETURN_IF_ERROR(file_scan_node.prepare(runtime_state.get()));
     std::vector<TScanRangeParams> params_vector;
     params_vector.emplace_back(scan_range_params);
-    file_scan_node.set_scan_ranges(params_vector);
+    file_scan_node.set_scan_ranges(runtime_state.get(), params_vector);
     RETURN_IF_ERROR(file_scan_node.open(runtime_state.get()));

     // 3. Put the block into block queue.
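This hunk adapts the group commit path to the new `set_scan_ranges` signature, which now also takes the `RuntimeState`. A hedged sketch of why the state might be threaded through, assuming the scan node reads the query's parallelism from it when sizing its scanner pool; the class shapes and the `query_parallelism` field below are illustrative assumptions, not the actual Doris internals:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct TScanRangeParams {};

struct RuntimeState {
    int32_t query_parallelism = 1; // assumed: parallel instance count of the query
};

class FileScanNode {
public:
    void set_scan_ranges(RuntimeState* state, const std::vector<TScanRangeParams>& ranges) {
        _scan_ranges = ranges;
        // Split the shared scanner thread pool among the parallel scan node instances.
        _max_scanners = std::max<int32_t>(
                1, doris_scanner_thread_pool_thread_num /
                           std::max<int32_t>(1, state->query_parallelism));
    }

private:
    static constexpr int32_t doris_scanner_thread_pool_thread_num = 48; // BE config default
    std::vector<TScanRangeParams> _scan_ranges;
    int32_t _max_scanners = 0;
};
```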
@@ -557,4 +557,4 @@ Status GroupCommitMgr::get_load_block_queue(int64_t table_id, const TUniqueId& i
     }
     return group_commit_table->get_load_block_queue(instance_id, load_block_queue);
 }
-} // namespace doris
+} // namespace doris