[improvement](scan) avoid too many scanners for file scan node (#25727)

Previously, when using a file scan node (e.g., when querying a Hive table), the maximum number of scanners for each scan node
was `doris_scanner_thread_pool_thread_num` (default 48).
So if the query parallelism is N, the total number of scanners could reach 48 * N, which is too many.

In this PR, I change the logic so that the maximum number of scanners for each scan node
is `doris_scanner_thread_pool_thread_num / query parallelism`. As a result, the total number of scanners
is capped at `doris_scanner_thread_pool_thread_num`.

Reducing the number of scanners can significantly reduce the memory usage of a query.
Mingyu Chen
2023-10-29 17:41:31 +08:00
committed by GitHub
parent 99b45e1938
commit e20cab64f4
35 changed files with 83 additions and 50 deletions

@@ -213,8 +213,6 @@ Status ExecNode::close(RuntimeState* state) {
                << " already closed";
     return Status::OK();
 }
-LOG(INFO) << "query= " << print_id(state->query_id())
-          << " fragment_instance_id=" << print_id(state->fragment_instance_id()) << " closed";
 _is_closed = true;
 Status result;
@@ -228,6 +226,9 @@ Status ExecNode::close(RuntimeState* state) {
     _peak_memory_usage_counter->set(_mem_tracker->peak_consumption());
 }
 release_resource(state);
+LOG(INFO) << "query= " << print_id(state->query_id())
+          << ", fragment_instance_id=" << print_id(state->fragment_instance_id())
+          << ", id=" << _id << " type=" << print_plan_node_type(_type) << " closed";
 return result;
}