Introduction to Main Classes:
- MTMVService:MTMV services for other modules to call
- MTMVHookService:All operations that affect the MTMV
- MTMVJobManager:All operations that affect the MTMV job
- MTMVCacheManager:All operations that affect the MTMV Cache
- MTMVTask&MTMVJob:Inherit from job framework
1. Fix a profile bug of `MergeRangeFileReader`, and add a profile `ApplyBytes` to show the total bytes of ranges.
2. There's no need to merge large columns, because `MergeRangeFileReader` will increase the copy time.
using blocksproduced and rowsproduced to unify the counter name in DataStreamSender and other exec node, or exchange operator and other operators.
blocks produced and rows produced are more easy to understand.
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
1. Do not use FATAL log when jni encounter error, to avoid crash.
2. Fix NPE when closing PaimonReader, the reader may not be assigned if PaimonReader open failed.
Fix two bugs:
1. Missing column is case sensitive, change the column name to lower case in FE for hive/iceberg/hudi
2. Iceberg use custom method to encode special characters in column name. Decode the column name to match the right column in parquet reader.
This pull request addresses an issue observed with inverted index tables or tables without indices when querying null values using the MATCH function.
Previously, executing a query like `SELECT * FROM table WHERE column MATCH null;` would yield incorrect results.
The update introduces enhanced handling of nullable columns within the MATCH function, ensuring accurate query results when null values are involved.
- Fix complex type crash when using the dict filter facility in the parquet-reader by turning off the dict filter facility in this case.
- Add orc complex types regression test.
1. fix race condition problem when get tablet load index
2. change tablet search algorithm from random to round-robin for random distribution table when load_to_single_tablet set to false
If preparation fails, the counter _peak_memory_usage_counter will be a wild pointer.
*** SIGSEGV address not mapped to object (@0x454d49545f) received by PID 16992 (TID 18856 OR 0x7f4d05444700) from PID 1296651359; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:417
1# os::Linux::chained_handler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
4# 0x00007F55C85B9400 in /lib64/libc.so.6
5# doris::vectorized::VDataStreamSender::close(doris::RuntimeState*, doris::Status) at /root/doris/be/src/vec/sink/vdata_stream_sender.cpp:734
6# doris::PlanFragmentExecutor::close() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:543
7# doris::PlanFragmentExecutor::~PlanFragmentExecutor() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:95
8# doris::FragmentExecState::~FragmentExecState() at /root/doris/be/src/runtime/fragment_mgr.cpp:112
9# std::_Sp_counted_ptr<doris::FragmentExecState*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() at /root/ldb/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:348
10# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:855
11# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:592
12# doris::PInternalServiceImpl::_exec_plan_fragment_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::PFragmentRequestVersion, bool) at /root/doris/be/src/service/internal_service.cpp:463
13# doris::PInternalServiceImpl::_exec_plan_fragment_in_pthread(google::protobuf::RpcController*, doris::PExecPlanFragmentRequest const*, doris::PExecPlanFragmentResult*, google::protobuf::Closure*) at /root/doris/be/src/service/internal_service.cpp:305
14# doris::WorkThreadPool<false>::work_thread(int) at /root/doris/be/src/util/work_thread_pool.hpp:160
15# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
16# start_thread in /lib64/libpthread.so.0
17# clone in /lib64/libc.so.6
Manually track query/load/compaction/etc. memory in Allocator instead of mem hook.
Can still use Mem Hook when cannot manually track memory code segments and find memory locations during debugging.
This will cause memory tracking loss for Query, loss less than 10% compared to the past, but this is expected to be more controllable.
Similarly, Mem Hook will no longer track unowned memory to the orphan mem tracker by default, so the total memory of all MemTrackers will be less than before.
Not need to get memory size from jemalloc in Mem Hook each memory alloc and free, which would lose performance in the past.
Not require caching bthread local in pthread local for memory hook, in the past this has caused core dumps inside bthread, seems to be a bug in bthread.
ThreadContext life cycle to manual control
In the past, ThreadContext was automatically created when it was used for the first time (this was usually in the Jemalloc Hook when the first malloc memory), and was automatically destroyed when the thread exited.
Now instead of manually controlling the create and destroy of ThreadContext, it is mainly created manually when the task thread start and destroyed before the task thread end.
Run 43 clickbench query tests.
Use MemHook in the past: