Commit Graph

7864 Commits

Author SHA1 Message Date
22cb7b8fcb [improvement](compaction) be do not compact invisible version to avoid query error -230 #28082 (#36222)
cherry pick from #28082
2024-06-27 13:45:21 +08:00
23cf494b48 [fix](schema-change) Fix schema-change from non-null to null (#36389)
https://github.com/apache/doris/pull/32913
2024-06-26 20:20:50 +08:00
25fb30c723 [fix](intersect) fix coredump caused by intersect of nullable and not nullable children #36401 (#36441)
## Proposed changes

Pick #36765
2024-06-26 17:45:21 +08:00
695d58f354 [cherry-pick](scan)scanner could eos early when reached limit (#36535) (#36736)
## Proposed changes
cherry-pick from master #36535
2024-06-25 17:22:43 +08:00
11201feae5 [fix](spill join) fix coredump of debug_string (#36723)
## Proposed changes

Pick #36715

2024-06-25 16:33:33 +08:00
785a1f49f5 [fix](txn) Fix coordinator be restart not abort txn #35342 (#36437)
cherry pick from #35342
2024-06-25 13:35:01 +08:00
3652fc31c3 [Pick 2.1] "Fix data loss when node channel been cancelled before close wait (#36662)" (#36744)
## Proposed changes

Pick from https://github.com/apache/doris/pull/36662
2024-06-25 11:36:31 +08:00
6ec9a731e8 [branch-2.1](cherry-pick) partial update should not read old fields from rows with delete sign (#36210) (#36755)
cherry-pick #36210
2024-06-24 21:13:24 +08:00
e4b6dac0c1 [fix](ubsan) reinterpret_cast fix length types to int8 is not safe (#36725)
## Proposed changes

Fix type check of ubsan. 
```
/root/doris/be/src/vec/exec/format/parquet/fix_length_plain_decoder.h:75:78: runtime error: member call on address 0x5582f35db5c0 which does not point to an object of type 'doris::vectorized::ColumnVector<signed char>'
0x5582f35db5c0: note: object is of type 'doris::vectorized::ColumnVector<int>'
 83 55 00 00  78 c0 b0 5a 82 55 00 00  02 00 00 00 00 00 00 00  10 a0 00 d7 83 55 00 00  10 a0 00 d7
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'doris::vectorized::ColumnVector<int>'
doris::Status doris::vectorized::FixLengthPlainDecoder::_decode_values<false>(COW<doris::vectorized::IColumn>::mutable_ptr<doris::vectorized::IColumn>&, std::shared_ptr<doris::vectorized::IDataType const>&, doris::vectorized::ColumnSelectVector&, bool) at fix_length_plain_decoder.h:75:78
```
2024-06-24 14:03:41 +08:00
Pxl
c6205783fa [Bug](function) fix wrong output_char_size on hll_to_base64 (#36572)
## Proposed changes
pick from #36529
2024-06-24 13:19:28 +08:00
02fad48870 [Fix](upgrade) Fix fields not handled correctly during upgrade and downgrade (#36691)
master version is #36690
2024-06-22 14:23:04 +08:00
17cf34b244 [Fix](multi-catalog) Fix core in orc and parquet reader sometimes after low mem exception. (#36575)
## Proposed changes

Backport #36574.
2024-06-22 11:28:21 +08:00
90a4dd09f3 [Fix](func) CoreDump and Result Error in percentile function (#36647)
cherry pick #36643
2024-06-21 23:42:45 +08:00
445d42a57d [fix](topn-opt) remove redundant check for fetch phase (#36676)
#36629
2024-06-21 22:28:38 +08:00
c8e4c404fa [Fix]check if fe set thrift field current_connect_fe (#36681)
bp #36678
2024-06-21 22:15:25 +08:00
c939781411 [Pick 2.1](inverted index) fix wrong no need read data when need_remaining_after_evaluate (#36684)
An equality predicate on a column whose inverted index uses a parser
requires remaining_after_evaluate. In that situation, the optimization
that skips reading the column data cannot be applied.

## Proposed changes

From (#36637)
2024-06-21 22:01:39 +08:00
0cff539810 [feature](function) support new function replace_empty (#36283) (#36656)
#36283
2024-06-21 16:46:22 +08:00
c8f2a3f952 [fix](eq_for_null) fix incorrect logic in function eq_for_null #36004 (#36124)
cherry pick from #36004
cherry pick from #36164
2024-06-21 14:31:21 +08:00
8105dc7de8 [Pick 2.1](inverted index) fix wrong opt for pk no need read data (#36634)
## Proposed changes
 
Pick from #36618
2024-06-21 00:57:23 +08:00
a79b56ac23 [chore](be) Support config max message size for be thrift server (#36595)
Cherry-pick #36467
2024-06-20 20:15:43 +08:00
b3dcfae864 [chore](be) Improve ingesting binlog error checking (#36596)
Cherry-pick #36487
2024-06-20 20:15:26 +08:00
f7f7b2b738 [Enhancement](multi-catalog) Add more error msgs for wrong data types in orc and parquet reader. (#36580)
Backport #36417
2024-06-20 18:10:25 +08:00
fbcf63e1f5 [cherry-pick] (branch-2.1)fix variant index (#36577)
pick from master #36163
2024-06-20 17:57:26 +08:00
bd47d5a681 [branch-2.1](auto-partition) Fix auto partition load failure in multi replica (#36586)
This PR:
1. picks #35630, which was previously reverted in #36098.
2. picks #36344 from master.

These two PRs fix an existing bug in auto partition load.

---------

Co-authored-by: Kaijie Chen <ckj@apache.org>
2024-06-20 17:51:18 +08:00
88e02c836d [Fix]Fix insert select missing audit log when connect follower FE (#36481)
## Proposed changes

pick #36472
2024-06-20 15:16:16 +08:00
dabd27edd2 [opt](inverted index) performance optimization for need_read_data in compound #35346 #36292 (#36404)
pick from master
https://github.com/apache/doris/pull/35346
https://github.com/apache/doris/pull/36292
2024-06-20 08:43:16 +08:00
0be5331b28 [Fix](Variant) fix variant schema change may cause invalid block schema and write missing blocks #36317 (#36536)
2024-06-19 19:09:16 +08:00
5b7d93df5e [Pick](Variant) pick 2 PRs to correct tmp column name to go fast execute #36277 #36313 (#36527)
2024-06-19 19:07:47 +08:00
8d5b621021 [improvement](inverted index) Change inverted index field_name from column_name to id in format v2 #36470 (#36516)
pick from master #36470
2024-06-19 17:29:26 +08:00
f59dc4fb37 [opt](split) generate and get split batch concurrently (#36044)
Backport of #36045. Also turns batch split back on, which was turned off in #36109.
Generate and get split batches concurrently.
`SplitSource.getNextBatch` no longer synchronizes, so callers fetch their splits concurrently, and `SplitAssignment` generates splits asynchronously.
2024-06-19 16:16:02 +08:00
da0138a412 [Pick 2.1](segment iterator) fix shrink non-char column coredump #36275 (#36468)
2024-06-18 21:59:15 +08:00
Pxl
dda25cceb6 [Bug](information-schema) fix some bug of information_schema.PROCESSLIST (#36447)
## Proposed changes
pick from #36409
2024-06-18 16:45:48 +08:00
33540ec87b [Pick 2.1](inverted index) fix inverted index compound reader memory leak (#36387)
## Proposed changes

Pick from #36146 #36420
2024-06-18 16:13:21 +08:00
4a117800ca [Bug](Function) fix json contains with empty value (#36320) (#36418)
2024-06-18 10:20:45 +08:00
3810861bb1 [branch-2.1](cherry-pick) add _pk_index_meta's size to Segment::_meta_mem_usage (#36329) (#36399)
cherry-pick #36329

add _pk_index_meta's size to Segment::_meta_mem_usage to make memory
estimation more accurate.
2024-06-17 20:41:38 +08:00
e68834158c [fix](inverted index)Support Chinese column name with inverted index #36321 (#36374)
1. `std::string` to `std::wstring` conversion only supports ASCII
characters. For non-ASCII characters, we need to use
`StringUtil::string_to_wstring`
2. Fix index_tool check_terms_stats_v2 and add field info to print

pick from master #36321
2024-06-17 19:41:09 +08:00
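Item 1 of commit e68834158c can be illustrated with a minimal sketch. It contrasts a byte-by-byte widening, which is only correct for ASCII, with a UTF-8 aware conversion. The names `naive_widen`, `utf8_to_wstring` and `check_example` are hypothetical and written for this illustration; this is not the code of `StringUtil::string_to_wstring`. The sketch assumes UTF-8 input and a `wchar_t` wide enough for the decoded code points (32 bits on Linux), and it does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Naive conversion: widens every byte on its own. Correct only for ASCII.
std::wstring naive_widen(const std::string& s) {
    std::wstring out;
    for (unsigned char c : s) out.push_back(static_cast<wchar_t>(c));
    return out;
}

// Hypothetical helper (not Doris code): decodes UTF-8 into one wchar_t
// per code point. Assumes wchar_t can hold the decoded value.
std::wstring utf8_to_wstring(const std::string& s) {
    std::wstring out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        unsigned long cp = 0;
        if (lead < 0x80) { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + len > s.size()) throw std::invalid_argument("truncated UTF-8 sequence");
        // Each continuation byte contributes its low six bits.
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char cont = static_cast<unsigned char>(s[i + k]);
            if ((cont & 0xC0) != 0x80) throw std::invalid_argument("invalid continuation byte");
            cp = (cp << 6) | (cont & 0x3F);
        }
        if (cp > 0x10FFFF) throw std::invalid_argument("code point out of range");
        out.push_back(static_cast<wchar_t>(cp));
        i += len;
    }
    return out;
}

// Usage: the three UTF-8 bytes of U+4E2D become three wide characters
// with the naive approach, but a single one with the decoder.
void check_example() {
    const std::string zh = "\xE4\xB8\xAD";
    assert(naive_widen(zh).size() == 3);
    assert(utf8_to_wstring(zh).size() == 1);
}
```

For a column name made only of ASCII characters both functions return the same result, which is why the problem only shows up with names such as Chinese ones.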
612f2ae961 [feature](api) add BE HTTP /api/load_streams (#36312) (#36338)
cherry-pick #36312
2024-06-16 22:09:04 +08:00
6bb670ab38 [metrics](bvar) add bvar for load stream and file writer count (#36300) (#36336)
cherry-pick #36300
2024-06-16 10:14:59 +08:00
7051431671 [branch-2.1](memory) fix query thread attach memory tracker (#36245)
## Proposed changes

fix dcheck
```
*** Check failure stack trace: ***
F20240613 12:33:01.700206 1467887 thread_context.h:204] Check failed: doris::k_doris_exit || !doris::config::enable_memory_orphan_check || thread_mem_tracker()->label() != "Orphan" If you crash here, it means that SCOPED_ATTACH_TASK and SCOPED_SWITCH_THREAD_MEM_TRACKER_LIMITER are not used correctly. starting position of each thread is expected to use SCOPED_ATTACH_TASK to bind a MemTrackerLimiter belonging to Query/Load/Compaction/Other Tasks, otherwise memory alloc using Doris Allocator in the thread will crash. If you want to switch MemTrackerLimiter during thread execution, please use SCOPED_SWITCH_THREAD_MEM_TRACKER_LIMITER, do not repeat Attach. Of course, you can modify enable_memory_orphan_check=false in be.conf to avoid this crash.

44# doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0::operator()() const at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/runtime/fragment_mgr.cpp:981
45# void std::__invoke_impl<void, doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0&>(std::__invoke_other, doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61
46# std::enable_if<is_invocable_r_v<void, doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0&>, void>::type std::__invoke_r<void, doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0&>(doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:117
47# std::_Function_handler<void (), doris::FragmentMgr::exec_plan_fragment(doris::TPipelineFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&)::$_0>::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
48# std::function<void ()>::operator()() const at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
49# doris::FunctionRunnable::run() at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/threadpool.cpp:48
50# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/threadpool.cpp:543
51# void std::__invoke_impl<void, void (doris::ThreadPool::*&)(), doris::ThreadPool*&>(std::__invoke_memfun_deref, void (doris::ThreadPool::*&)(), doris::ThreadPool*&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74
52# std::__invoke_result<void (doris::ThreadPool::*&)(), doris::ThreadPool*&>::type std::__invoke<void (doris::ThreadPool::*&)(), doris::ThreadPool*&>(void (doris::ThreadPool::*&)(), doris::ThreadPool*&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96
53# void std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()>::__call<void, , 0ul>(std::tuple<>&&, std::_Index_tuple<0ul>) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:420
54# void std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()>::operator()<, void>() at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:503
55# void std::__invoke_impl<void, std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()>&>(std::__invoke_other, std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()>&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61
56# std::enable_if<is_invocable_r_v<void, std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()>&>, void>::type std::__invoke_r<void, std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()>&>(std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()>&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:117
57# std::_Function_handler<void (), std::_Bind<void (doris::ThreadPool::*(doris::ThreadPool*))()> >::_M_invoke(std::_Any_data const&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
58# std::function<void ()>::operator()() const at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
59# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/thread.cpp:498
60# start_thread at /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:478
61# __clone at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
```
2024-06-15 13:32:42 +08:00
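The check failure quoted in commit 7051431671 describes a pattern where each thread must attach a memory tracker at its starting point, and allocations made while the thread is still on the "Orphan" tracker are treated as an error. The following is a rough sketch of that idea using a scoped RAII guard. `ScopedAttachTracker`, `g_tracker_label`, `allocation_allowed` and `example_task` are hypothetical names for this illustration; the real `SCOPED_ATTACH_TASK` macro and `MemTrackerLimiter` class in Doris are not reproduced here and may work differently.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Per-thread label of the tracker currently charged for allocations.
// "Orphan" stands for "no task tracker attached yet".
thread_local std::string g_tracker_label = "Orphan";

// Hypothetical RAII guard: attaches a tracker label for the lifetime of
// the scope and restores the previous one on exit.
class ScopedAttachTracker {
public:
    explicit ScopedAttachTracker(std::string label)
            : _saved(std::exchange(g_tracker_label, std::move(label))) {}
    ~ScopedAttachTracker() { g_tracker_label = std::move(_saved); }

    ScopedAttachTracker(const ScopedAttachTracker&) = delete;
    ScopedAttachTracker& operator=(const ScopedAttachTracker&) = delete;

private:
    std::string _saved;
};

// Mirrors the condition in the quoted check: allocating while still on
// the "Orphan" tracker counts as a usage error.
bool allocation_allowed() {
    return g_tracker_label != "Orphan";
}

// Usage: the task body attaches first, so allocations inside the scope
// are charged to "Query" instead of "Orphan".
void example_task() {
    assert(!allocation_allowed());
    {
        ScopedAttachTracker attach("Query");
        assert(allocation_allowed());
    }
    assert(!allocation_allowed());
}
```

Because the guard restores the previous label in its destructor, a thread-pool thread returns to the unattached state once the task finishes, including on early return.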
Pxl
c2fa60cbe5 [Enhancement](scan) enable parallel scan when preagg is on (#36302)
## Proposed changes
pick from #35810
2024-06-14 23:44:41 +08:00
Pxl
db2721915e [Bug](runtime-filter) release dependency when rf rpc failed or meet error status (#36297)
pick from #36126
2024-06-14 23:44:08 +08:00
f01039c224 [Pick 2.1](inverted index) fix memory leak of inverted index writer for array values #36208 (#36276)
fix inverted index writer's field leak
pick from #36208
2024-06-14 08:39:55 +08:00
845dcce7f0 Revert "[opt](inverted index) performance optimization for need_read_data in …" (#36260)
Reverts apache/doris#36192
2024-06-13 21:31:20 +08:00
56ccb9a657 [fix](parquet) fix parquet reader missing column and filter missing column (#36182)
bp #36189
2024-06-13 21:30:05 +08:00
106a55497b [minor] better column name error description (#36154) (#36212)
bp #36154

Co-authored-by: Jensen <czjourney@163.com>
2024-06-13 11:51:33 +08:00
c84b56140c [Fix](outfile) Add a configuration for exporting data in Parquet format using select into outfile (#36143)
backport: #36142
2024-06-13 11:49:46 +08:00
cc7ab2b9fe [fix](inverted index)Delete tmp dirs when BE starts to avoid tmp files left by last crash #35951 (#36190)
When BE crashes, there may be tmp files left in the tmp dir, so we
remove and rebuild the tmp dir every time we start BE to prevent rubbish
data from occupying the disk.
2024-06-12 23:05:44 +08:00
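The remove-and-rebuild step described in commit cc7ab2b9fe can be sketched with C++17 `std::filesystem`. This is only an illustration under assumptions: `reset_tmp_dir` is a hypothetical helper, not the function Doris BE uses, and the directory path is whatever the caller passes in.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not Doris code): deletes tmp_dir with everything
// inside it, then recreates it empty. Returns false on any filesystem error.
bool reset_tmp_dir(const fs::path& tmp_dir) {
    // Guard against wiping an unintended location through an empty path.
    assert(!tmp_dir.empty());

    std::error_code ec;
    fs::remove_all(tmp_dir, ec);  // a missing directory is not an error
    if (ec) return false;

    fs::create_directories(tmp_dir, ec);
    if (ec) return false;

    return fs::is_directory(tmp_dir, ec) && !ec;
}
```

Calling such a helper once at startup, before any writer creates files in the directory, means leftovers from a previous crash cannot accumulate across restarts.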
04e62d9c42 [fix](invert index) ensure that the pred result sign of the inlist is in order #36085 (#36191)
2024-06-12 23:04:31 +08:00
f1e83f5656 [opt](inverted index) performance optimization for need_read_data in compound #35346 (#36192)
2024-06-12 20:02:00 +08:00
e1694e3d91 [Pick 2.1](inverted index) fix memory leak in inverted index writer for array values #36144 (#36165)
2024-06-12 19:59:57 +08:00