91abfe38cdc1730f61330076cfcc616f042bee21
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
| 974263d83c |
[fix](join) Should not use the build block's size to resize mark_join_flags (#50993) (#51089)
Pick #50993 Introduced by #51050 The build block maybe be `clear_column_mem_not_keep` in build phase when the operator is closed. ```cpp Status HashJoinBuildSinkLocalState::close(RuntimeState* state, Status exec_status) { if (_closed) { return Status::OK(); } auto& p = _parent->cast<HashJoinBuildSinkOperatorX>(); Defer defer {[&]() { if (!_should_build_hash_table) { return; } // The build side hash key column maybe no need output, but we need to keep the column in block // because it is used to compare with probe side hash key column if (p._should_keep_hash_key_column && _build_col_ids.size() == 1) { p._should_keep_column_flags[_build_col_ids[0]] = true; } if (_shared_state->build_block) { // release the memory of unused column in probe stage _shared_state->build_block->clear_column_mem_not_keep(p._should_keep_column_flags, p._use_shared_hash_table); } if (p._use_shared_hash_table) { std::unique_lock lock(p._mutex); p._signaled = true; for (auto& dep : _shared_state->sink_deps) { dep->set_ready(); } for (auto& dep : p._finish_dependencies) { dep->set_ready(); } } }}; ``` ``` *** Aborted at 1747343165 (unix time) try "date -d @1747343165" if you are using GNU date *** *** Current BE git commitID: e7a3e78b97 *** *** SIGSEGV address not mapped to object (@0x1) received by PID 7474 (TID 9641 OR 0x7f3f8c0e5640) from PID 1; stack trace: *** 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:421 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so 3# 0x00007F4368F76520 in /lib/x86_64-linux-gnu/libc.so.6 4# doris::Status doris::pipeline::ProcessHashTableProbe<7>::finish_probing > > >(doris::vectorized::MethodKeysFixed > >&, doris::vectorized::MutableBlock&, doris::vectorized::Block*, bool*, bool) at /root/doris/be/src/pipeline/exec/join/process_hash_table_probe_impl.h:738 5# std::__detail::__variant::__gen_vtable_impl (*)(doris::pipeline::HashJoinProbeOperatorX::pull(doris::RuntimeState*, doris::vectorized::Block*, bool*) const::$_1&&, std::variant > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber, doris::JoinHashTable, HashCRC32 > > >, doris::vectorized::MethodOneNumber, doris::JoinHashTable, HashCRC32 > > >, doris::vectorized::MethodKeysFixed > >, doris::vectorized::MethodKeysFixed, HashCRC32 > > >, doris::vectorized::MethodKeysFixed > >, doris::vectorized::MethodKeysFixed, HashCRC32 > > >, doris::vectorized::MethodStringNoCache > > >&, std::variant, doris::pipeline::ProcessHashTableProbe<2>, doris::pipeline::ProcessHashTableProbe<8>, doris::pipeline::ProcessHashTableProbe<1>, doris::pipeline::ProcessHashTableProbe<4>, doris::pipeline::ProcessHashTableProbe<3>, doris::pipeline::ProcessHashTableProbe<7>, doris::pipeline::ProcessHashTableProbe<9>, doris::pipeline::ProcessHashTableProbe<10>, doris::pipeline::ProcessHashTableProbe<11> >&)>, std::integer_sequence >::__visit_invoke(doris::pipeline::HashJoinProbeOperatorX::pull(doris::RuntimeState*, doris::vectorized::Block*, bool*) const::$_1&&, std::variant > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber > >, doris::vectorized::MethodOneNumber, doris::JoinHashTable, HashCRC32 > > >, doris::vectorized::MethodOneNumber, doris::JoinHashTable, HashCRC32 > > >, doris::vectorized::MethodKeysFixed > >, doris::vectorized::MethodKeysFixed, HashCRC32 > > >, doris::vectorized::MethodKeysFixed > >, doris::vectorized::MethodKeysFixed, HashCRC32 > > >, doris::vectorized::MethodStringNoCache > > >&, std::variant, doris::pipeline::ProcessHashTableProbe<2>, doris::pipeline::ProcessHashTableProbe<8>, doris::pipeline::ProcessHashTableProbe<1>, doris::pipeline::ProcessHashTableProbe<4>, doris::pipeline::ProcessHashTableProbe<3>, doris::pipeline::ProcessHashTableProbe<7>, doris::pipeline::ProcessHashTableProbe<9>, doris::pipeline::ProcessHashTableProbe<10>, doris::pipeline::ProcessHashTableProbe<11> >&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1013 6# doris::pipeline::HashJoinProbeOperatorX::pull(doris::RuntimeState*, doris::vectorized::Block*, bool*) const at /root/doris/be/src/pipeline/exec/hashjoin_probe_operator.cpp:281 7# doris::pipeline::StatefulOperatorX::get_block(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/pipeline/exec/operator.cpp:670 8# doris::pipeline::OperatorXBase::get_block_after_projects(doris::RuntimeState*, doris::vectorized::Block*, bool*) at /root/doris/be/src/pipeline/exec/operator.cpp:381 9# doris::pipeline::PipelineTask::execute(bool*) in /mnt/hdd01/ci/doris-deploy-master-local/be/lib/doris_be 10# doris::pipeline::TaskScheduler::_do_work(int) at /root/doris/be/src/pipeline/task_scheduler.cpp:144 11# doris::ThreadPool::dispatch_thread() at /root/doris/be/src/util/threadpool.cpp:622 12# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:469 13# start_thread at ./nptl/pthread_create.c:442 14# 0x00007F436905A850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83 ``` Related PR: #xxx Problem Summary: None - Test <!-- At least one of them must be included. --> - [ ] Regression test - [ ] Unit Test - [ ] Manual test (add detailed scripts or steps below) - [ ] No need to test or manual test. Explain why: - [ ] This is a refactor/code format and no logic has been changed. - [ ] Previous test can cover this change. - [ ] No code files have been changed. - [ ] Other reason <!-- Add your reason? --> - Behavior changed: - [ ] No. - [ ] Yes. <!-- Explain the behavior change --> - Does this need documentation? - [ ] No. - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 --> - [ ] Confirm the release note - [ ] Confirm test cases - [ ] Confirm document - [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into --> ### What problem does this PR solve? Issue Number: close #xxx Related PR: #xxx Problem Summary: ### Release note None ### Check List (For Author) - Test <!-- At least one of them must be included. --> - [ ] Regression test - [ ] Unit Test - [ ] Manual test (add detailed scripts or steps below) - [ ] No need to test or manual test. Explain why: - [ ] This is a refactor/code format and no logic has been changed. - [ ] Previous test can cover this change. - [ ] No code files have been changed. - [ ] Other reason <!-- Add your reason? --> - Behavior changed: - [ ] No. - [ ] Yes. <!-- Explain the behavior change --> - Does this need documentation? - [ ] No. - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 --> ### Check List (For Reviewer who merge this PR) - [ ] Confirm the release note - [ ] Confirm test cases - [ ] Confirm document - [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into --> |
|||
| 51b39d0992 | [fix](join)Consider mark join when computing right_col_idx(#50720) (#50727) | |||
| d969047b50 |
[Refactor](join) refactor of hash join (#27557)
Improve the performance under the tpch data set by reconstructing the join related code and the use of hash table Co-authored-by: HappenLee <happenlee@hotmail.com> Co-authored-by: BiteTheDDDDt <pxl290@qq.com> |