Commit Graph

8124 Commits

Author SHA1 Message Date
96260d97bd [fix](ub) undefined behavior in FixedContainer (#39191) (#39201)
## Proposed changes

pick #39191
Undefined behavior occurs if there is a null value in the list.

```
/root/doris/be/src/vec/common/string_ref.h:271:54: runtime error: null pointer passed as argument 2, which is declared to never be null
/var/local/ldb-toolchain/bin/../usr/include/string.h:64:33: note: nonnull attribute specified here
#0 0x5616d072245d in doris::StringRef::eq(doris::StringRef const&) const /root/doris/be/src/vec/common/string_ref.h:271:41
#1 0x5616d072245d in doris::StringRef::operator==(doris::StringRef const&) const /root/doris/be/src/vec/common/string_ref.h:274:60
#2 0x5616d072245d in doris::FixedContainer::find(doris::StringRef const&) const /root/doris/be/src/exprs/hybrid_set.h:76:36
#3 0x5616d072245d in void doris::StringValueSet>::_find_batch(doris::vectorized::IColumn const&, unsigned long, doris::vectorized::PODArray, 16ul, 15ul> const*, doris::vectorized::PODArray, 16ul, 15ul>&) /root/doris/be/src/exprs/hybrid_set.h:688:63
#4 0x5616d0747857 in doris::vectorized::FunctionIn::execute_impl(doris::FunctionContext*, doris::vectorized::Block&, std::vector> const&, unsigned long, unsigned long) const /root/doris/be/src/vec/functions/in.h:170:21
#5 0x5616c741fa3a in doris::vectorized::DefaultExecutable::execute_impl(doris::FunctionContext*, doris::vectorized::Block&, std::vector> const&, unsigned long, unsigned long) const /root/doris/be/src/vec/functions/function.h:462:26
#6 0x5616cbb5b650 in doris::vectorized::PreparedFunctionImpl::_execute_skipped_constant_deal(doris::FunctionContext*, doris::vectorized::Block&, std::vector> const&, unsigned long, unsigned long, bool) const /root/doris/be/src/vec/functions/function.cpp
#7 0x5616cbb4e14e in doris::vectorized::PreparedFunctionImpl::execute_without_low_cardinality_columns(doris::FunctionContext*, doris::vectorized::Block&, std::vector> const&, unsigned long, unsigned long, bool) const /root/doris/be/src/vec/functions/function.cpp:244:12
#8 0x5616cbb4e3c2 in doris::vectorized::PreparedFunctionImpl::execute(doris::FunctionContext*, doris::vectorized::Block&, std::vector> const&, unsigned long, unsigned long, bool) const /root/doris/be/src/vec/functions/function.cpp:250:12
#9 0x5616c741cd68 in doris::vectorized::IFunctionBase::execute(doris::FunctionContext*, doris::vectorized::Block&, std::vector> const&, unsigned long, unsigned long, bool) const /root/doris/be/src/vec/functions/function.h:190:19
#10 0x5616c74cf712 in doris::vectorized::VInPredicate::execute(doris::vectorized::VExprContext*, doris::vectorized::Block*, int*) /root/doris/be/src/vec/exprs/vin_predicate.cpp:130:5
#11 0x5616c740d5c0 in doris::vectorized::VectorizedFnCall::_do_execute(doris::vectorized::VExprContext*, doris::vectorized::Block*, int*, std::vector>&) /root/doris/be/src/vec/exprs/vectorized_fn_call.cpp:183:9
#12 0x5616c740ecf5 in doris::vectorized::VectorizedFnCall::execute(doris::vectorized::VExprContext*, doris::vectorized::Block*, int*) /root/doris/be/src/vec/exprs/vectorized_fn_call.cpp:215:12
#13 0x5616c7462e24 in doris::vectorized::VCompoundPred::execute(doris::vectorized::VExprContext*, doris::vectorized::Block*, int*) /root/doris/be/src/vec/exprs/vcompound_pred.h:127:38
#14 0x5616c74bccec in doris::vectorized::VExprContext::execute(doris::vectorized::Block*, int*) /root/doris/be/src/vec/exprs/vexpr_context.cpp:54:5
#15 0x5616c74c1dcc in doris::vectorized::VExprContext::execute_conjuncts(std::vector, std::allocator>> const&, std::vector, 16ul, 15ul>, std::allocator, 16ul, 15ul>>> const*, bool, doris::vectorized::Block*, doris::vectorized::PODArray, 16ul, 15ul>, bool) /root/doris/be/src/vec/exprs/vexpr_context.cpp:169:9
#16 0x5616c74c5108 in doris::vectorized::VExprContext::execute_conjuncts_and_filter_block(std::vector, std::allocator>> const&, doris::vectorized::Block*, std::vector>&, int, doris::vectorized::PODArray, 16ul, 15ul>&) /root/doris/be/src/vec/exprs/vexpr_context.cpp:322:5
#17 0x5616ad8a7f1a in doris::segment_v2::SegmentIterator::_execute_common_expr(unsigned short*, unsigned short&, doris::vectorized::Block*) /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2680:5
#18 0x5616ad89e86e in doris::segment_v2::SegmentIterator::_next_batch_internal(doris::vectorized::Block*) /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2582:25
#19 0x5616ad892f5c in doris::segment_v2::SegmentIterator::next_batch(doris::vectorized::Block*)::$_0::operator()() const /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2315:9
#20 0x5616ad892f5c in doris::segment_v2::SegmentIterator::next_batch(doris::vectorized::Block*) /root/doris/be/src/olap/rowset/segment_v2/segment_iterator.cpp:2314:19
#21 0x5616ad6dd9cc in doris::segment_v2::LazyInitSegmentIterator::next_batch(doris::vectorized::Block*) /root/doris/be/src/olap/rowset/segment_v2/lazy_init_segment_iterator.h:44:33
#22 0x5616ad269d67 in doris::BetaRowsetReader::next_block(doris::vectorized::Block*) /root/doris/be/src/olap/rowset/beta_rowset_reader.cpp:380:29
#23 0x5616de6de110 in doris::vectorized::VCollectIterator::Level0Iterator::_refresh() /root/doris/be/src/vec/olap/vcollect_iterator.h
#24 0x5616de6c967f in doris::vectorized::VCollectIterator::Level0Iterator::refresh_current_row() /root/doris/be/src/vec/olap/vcollect_iterator.cpp:514:24
#25 0x5616de6ca8a6 in doris::vectorized::VCollectIterator::Level0Iterator::ensure_first_row_ref() /root/doris/be/src/vec/olap/vcollect_iterator.cpp:493:14
#26 0x5616de6d7008 in doris::vectorized::VCollectIterator::Level1Iterator::ensure_first_row_ref() /root/doris/be/src/vec/olap/vcollect_iterator.cpp:692:27
#27 0x5616de6bd200 in doris::vectorized::VCollectIterator::build_heap(std::vector, std::allocator>>&) /root/doris/be/src/vec/olap/vcollect_iterator.cpp:186:9
#28 0x5616de651b6c in doris::vectorized::BlockReader::_init_collect_iter(doris::TabletReader::ReaderParams const&) /root/doris/be/src/vec/olap/block_reader.cpp:157:5
#29 0x5616de65526f in doris::vectorized::BlockReader::init(doris::TabletReader::ReaderParams const&) /root/doris/be/src/vec/olap/block_reader.cpp:229:19
#30 0x5616e175a0f9 in doris::vectorized::NewOlapScanner::open(doris::RuntimeState*) /root/doris/be/src/vec/exec/scan/new_olap_scanner.cpp:237:32
#31 0x5616c736ad34 in doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr, std::shared_ptr) /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:236:5
#32 0x5616c736f05e in doris::vectorized::ScannerScheduler::submit(std::shared_ptr, std::shared_ptr)::$_1::operator()() const::'lambda'()::operator()() const::'lambda'()::operator()() const /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:176:21
#33 0x5616c736f05e in doris::vectorized::ScannerScheduler::submit(std::shared_ptr, std::shared_ptr)::$_1::operator()() const::'lambda'()::operator()() const /root/doris/be/src/vec/exec/scan/scanner_scheduler.cpp:175:31
#34 0x5616c736f05e in void std::_invoke_impl, std::shared_ptr)::$_1::operator()() const::'lambda'()&>(std::_invoke_other, doris::vectorized::ScannerScheduler::submit(std::shared_ptr, std::shared_ptr)::$_1::operator()() const::'lambda'()&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
#35 0x5616c736f05e in std::enable_if, std::shared_ptr)::$1::operator()() const::'lambda'()&>, void>::type std::_invoke_r, std::shared_ptr)::$_1::operator()() const::'lambda'()&>(doris::vectorized::ScannerScheduler::submit(std::shared_ptr, std::shared_ptr)::$_1::operator()() const::'lambda'()&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
#36 0x5616c736f05e in std::_Function_handler, std::shared_ptr)::$_1::operator()() const::'lambda'()>::_M_invoke(std::_Any_data const&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291:9
#37 0x5616aeed6a3b in doris::ThreadPool::dispatch_thread() /root/doris/be/src/util/threadpool.cpp:543:24
#38 0x5616aeeae4f7 in doris::Thread::supervise_thread(void*) /root/doris/be/src/util/thread.cpp:498:5
#39 0x7f7e663e3ac2 in start_thread nptl/pthread_create.c:442:8
#40 0x7f7e6647584f misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /root/doris/be/src/vec/common/string_ref.h:271:54 in
```
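
For illustration only, the class of problem in the trace above (a null pointer reaching `memcmp`) can be avoided by checking for empty or null operands first. `Ref` and `safe_eq` are hypothetical names invented for this sketch; it is not the actual `doris::StringRef` code, nor necessarily the fix applied in #39191.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a non-owning string view; NOT the real doris::StringRef.
struct Ref {
    const char* data;
    std::size_t size;
};

// Compare two refs without ever handing a null pointer to memcmp.
// Passing null to memcmp is undefined behavior even when the length is 0.
inline bool safe_eq(const Ref& a, const Ref& b) {
    if (a.size != b.size) {
        return false;
    }
    if (a.size == 0) {
        return true;  // nothing to compare, so skip memcmp entirely
    }
    if (a.data == nullptr || b.data == nullptr) {
        return a.data == b.data;  // defensive: non-zero size with null data
    }
    return std::memcmp(a.data, b.data, a.size) == 0;
}
```

With this guard, a null entry in the list compares equal only to another empty value and never reaches `memcmp`.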

2024-08-12 10:02:07 +08:00
b38caed808 [Improve](columns)replace fatal with exception #38035 (#38996) 2024-08-12 09:51:30 +08:00
3da2d1c9d6 [bug](parquet)Fix the problem that the parquet reader reads the missing sub-columns of the struct and fails. (#38718) (#39192)
bp #38718
2024-08-11 20:37:40 +08:00
5f77f909d9 [cherry-pick](branch-2.1) Pick "[feature](function) support ip functions named ipv4_to_ipv6 and cut_ipv6" (#39058)
## Proposed changes

pick https://github.com/apache/doris/pull/36883 and
https://github.com/apache/doris/pull/35239
2024-08-10 18:37:11 +08:00
0db158386a [chore](rowset writer) print rowset rows number when meet too many segments (#39091) (#39182)
pick (#39091)
2024-08-10 18:36:24 +08:00
5e1e725cee [feature](inverted index) Add multi_match function #37722 #38931 #39149 (#38877) 2024-08-10 15:20:08 +08:00
8a682d43ec [fix](ut) repair segcompaction ut (#38165) (#38225)
cherry-pick #38165
2024-08-09 15:52:18 +08:00
5db4184178 [fix] (compaction) fix time series (#38791) (#39052)
## Proposed changes

pick from master #38791
2024-08-09 15:02:58 +08:00
1ef42dd94b [Fix](load) The value of the index id printed in the log is incorrect #38790 (#39131)
cherry pick from #38790
2024-08-09 12:31:33 +08:00
e2f45225d6 [branch-2.1] Picks "[opt](merge-on-write) eliminate reading the old values of non-key columns for delete stmt in publish phase #38703" (#39074)
picks https://github.com/apache/doris/pull/38703
2024-08-09 10:42:52 +08:00
e15b6cfc68 [fix](be) return correct canceled status from scanner (#36392) (#39111)
## Proposed changes

pick #36392
2024-08-09 04:02:42 +08:00
0571342538 [fix](sink) The issue with 2GB limit of protocol buffer (#37990) (#39112)
```
Fail to serialize doris.PFetchDataResult
```

If the size of `PFetchDataResult` is greater than 2G, protocol buffer
cannot serialize the message.
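
One general way to stay under such a limit is to split the payload into several messages by byte budget. The sketch below is an assumption-laden illustration: `split_by_budget` and `kMaxMessageBytes` are hypothetical names, and this is not necessarily how #37990 resolves the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Protocol buffer sizes are held in a signed 32-bit int, hence the ~2GB ceiling.
constexpr std::uint64_t kMaxMessageBytes =
        static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());

// Hypothetical helper: returns the exclusive end index of each batch so that
// the rows in one batch total at most `budget` bytes. A single row larger than
// the budget still ends up alone in its own (oversized) batch.
std::vector<std::size_t> split_by_budget(const std::vector<std::uint64_t>& row_bytes,
                                         std::uint64_t budget = kMaxMessageBytes) {
    std::vector<std::size_t> ends;
    std::uint64_t current = 0;
    for (std::size_t i = 0; i < row_bytes.size(); ++i) {
        // Close the running batch if this row would push it over the budget.
        if (current > 0 && current + row_bytes[i] > budget) {
            ends.push_back(i);
            current = 0;
        }
        current += row_bytes[i];
    }
    if (!row_bytes.empty()) {
        ends.push_back(row_bytes.size());
    }
    return ends;
}
```

Each returned index marks where one message would end and the next would begin.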

pick #37990
2024-08-09 04:01:56 +08:00
8678fcea32 [config](inverted index)Make inverted_index_ram_dir enable by default(#35094) (#39120)
## Proposed changes

bp #35094

Co-authored-by: Luennng <luennng@gmail.com>
2024-08-09 01:38:14 +08:00
f8f5be7ce7 [fix](schema-change) Fix wrong input column for cast validity check (#38894) (#39107)
## Proposed changes

1. Use column idx of ref block instead of new block to indicate the ref
column.
2. Rename some variables to clarify their meanings.
3. Clarify some log msg.
4. Add a minimal case to verify the change.
2024-08-08 19:36:06 +08:00
efdd75f286 [fix](function) stddev with DecimalV2 type will result in an error (#38731) (#39072)

https://github.com/apache/doris/pull/38731
The stddev function has a separate implementation for the DecimalV2
type, but that implementation has issues. Because almost no existing
data uses DecimalV2, the implementation is removed here. On the BE,
hitting this case after an upgrade results in an error directly.
```
SELECT STDDEV(data) FROM DECIMALV2_10_0_DATA;
ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INTERNAL_ERROR]Agg Function stddev(decimal(10,0)) is not implemented
```
After removing DecimalV2, parameters of type DecimalV2 will be converted
to double for calculations.

2024-08-08 17:53:17 +08:00
1fbfb81b8a [branch-2.1] Picks "[Fix](partial update) Persist partial_update_info in RocksDB in case of BE restart after a partial update has committed #38331" (#39035)
picks https://github.com/apache/doris/pull/38331 and
https://github.com/apache/doris/pull/39066
2024-08-08 14:50:08 +08:00
2ec1a6a7e7 [fix](group commit) Modify group commit commit/abort txn timeout as stream load (#39003) (#39069)
pick https://github.com/apache/doris/pull/39003
2024-08-08 14:36:29 +08:00
44cb7978a9 [opt](index) add more inverted index profile metrics #36696 (#38858) 2024-08-08 14:16:55 +08:00
0a3874f203 [fix](move-memtable) close stream when cancel load stream stub (#38912) (#39039)
backport #38912
2024-08-07 23:24:00 +08:00
749c9f7b56 [fix](group commit) fix replay wal check label status (#38883) (#38997)
pick https://github.com/apache/doris/pull/38883
2024-08-07 22:06:59 +08:00
773008d6fa [Fix](Json) fix some cast issue (#38683) (#39025)
#38683
2024-08-07 22:05:43 +08:00
91dcaaf7dd [fix](MoW) fix MoW & segcompaction conflict on cache of temp segment (#37760) (#38992)

MoW updates the delete bitmap during load, and segcompaction can modify
the page cache at the same time. Disabling page cache touches during
segcompaction solves this problem.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
Co-authored-by: zhengyu <freeman.zhang1992@gmail.com>
2024-08-07 21:18:10 +08:00
7e95d7cbec [bugfix](backup)(cooldown) cancel backup properly when be backup failed (#38724) (#38993)
Co-authored-by: zhangyuan <ayuanzhang@tencent.com>
2024-08-07 15:58:11 +08:00
7550fbaff7 [Fix](Exception) throw exception in defer may result std::terminate (… (#39007)
pick #38935
2024-08-07 13:46:23 +08:00
8cb5aa64f4 [test](inverted index) add an Inverted Index Testing Switch (#38077) (#38947)
https://github.com/apache/doris/pull/38077
2024-08-07 11:25:36 +08:00
fc0222a64c [opt](info) processlist schema table support show all fe (#38701) (#38953)
pick #38701
2024-08-07 11:01:46 +08:00
b856530b09 [fix](inverted index) disable range query in StringTypeInvertedIndexReader (#38218) (#38926)
## Proposed changes

pick from master #38218

2024-08-07 10:44:02 +08:00
e400859531 [fix](update null map) Fix update_null_map #38787 (#38920)
cherry pick from #38787
2024-08-07 10:21:41 +08:00
2543b569bb [Optimize](Row store) pick #37145, #38236 (#38932) 2024-08-07 09:55:42 +08:00
bc644cb253 [opt](catalog) merge scan range to avoid too many splits (#38311) (#38964)
bp #38311
2024-08-06 21:57:02 +08:00
3abb222064 [fix](group commit) Fix test_group_commit_async_wal_msg_fault_injection case (#35313) (#38911)
pick https://github.com/apache/doris/pull/35313
2024-08-06 17:57:22 +08:00
fe6ea3b8b5 [Fix](inverted index) fix missed array inverted index null bitmap #38907 (#38934)
cherry pick from #38907
2024-08-06 17:17:28 +08:00
21a67dba5d [fix](index) fix inverted index compound file entry size int32 overflow #38891 (#38928) 2024-08-06 15:57:09 +08:00
28c0510440 [fix](pipeline) Fix mem control in local exchanger (#38885) (#38910)
If a block (>128M) is dequeued by the local exchange source operator and
it is the last block, both the source operators and the sink operators
will hang. This PR fixes it.

pick #38885
2024-08-06 14:45:41 +08:00
ba5c6fba98 [scheduler](core) Use signed int as number of cores (#38514) (#38913)
pick #38514

*** is nereids: 0 ***
*** tablet id: 0 ***
*** Aborted at 1722279016 (unix time) try "date -d @1722279016" if you are using GNU date ***
*** Current BE git commitID: e9f12fac47e ***
*** SIGSEGV unknown detail explain (@0x0) received by PID 1116227 (TID 1116498 OR 0x7f009ac00640) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421
 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 3# 0x00007F01E49B0520 in /lib/x86_64-linux-gnu/libc.so.6
 4# pthread_mutex_lock at ./nptl/pthread_mutex_lock.c:80
 5# doris::pipeline::MultiCoreTaskQueue::take(unsigned long) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/task_queue.cpp:154
 6# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/task_scheduler.cpp:268
 7# doris::ThreadPool::dispatch_thread() in /mnt/disk1/STRESS_ENV/be/lib/doris_be
 8# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/thread.cpp:499
 9# start_thread at ./nptl/pthread_create.c:442
10# 0x00007F01E4A94850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83

2024-08-06 14:44:59 +08:00
8ce30963cd [fix] (compaction) fix time series compaction policy (#38220) (#38917)
## Proposed changes

pick from #38220
2024-08-06 14:26:42 +08:00
ff6fa33021 [opt](inverted index) mow supports index optimization #(#38180)
## Proposed changes

https://github.com/apache/doris/pull/37428
https://github.com/apache/doris/pull/37429

2024-08-06 11:18:13 +08:00
bcea54147c [feature](inverted index) String type inverted index match function c… (#38872)
https://github.com/apache/doris/pull/38170
2024-08-06 09:06:05 +08:00
c7b59b38ef [fix](hist) Fix unstable result of aggregate function hist #38608 (#38893)
cherry pick from #38608
2024-08-06 08:52:03 +08:00
e9bf0776d7 [fix](parquet) disable parquet page index by default #38691 (#38901)
bp #38691
2024-08-06 08:51:39 +08:00
70a518e099 [Fix](multi-catalog) Fix not throw error when call close() in hive/iceberg writer. (#38902)
## Proposed changes
[Fix] (multi-catalog) Fix errors not being thrown when close() is called
in the hive/iceberg writer.

When close() is called on the file writer, it syncs the buffer to commit.
Data is therefore sometimes written only when close() is called, which
can surface errors at that point (for example, in hdfs_file_writer).
These errors need to be caught throughout the entire close process.
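
As a rough illustration of why the result of close() matters, the sketch below uses a writer that only touches its sink at close time. `Status`, `BufferedWriter`, and `write_all` are hypothetical names made up for this example; they are not the Doris hive/iceberg writer classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical status type for illustration only.
struct Status {
    bool ok;
    std::string message;
};

// Hypothetical writer that buffers data and only touches the sink in close().
class BufferedWriter {
public:
    explicit BufferedWriter(bool sink_healthy) : sink_healthy_(sink_healthy) {}

    // Appending never fails here because nothing is flushed yet.
    void append(const std::string& data) { buffer_ += data; }

    // The flush happens here, so a sink failure first becomes visible here.
    Status close() {
        if (!buffer_.empty() && !sink_healthy_) {
            return {false, "flush failed during close"};
        }
        buffer_.clear();
        return {true, ""};
    }

private:
    bool sink_healthy_;
    std::string buffer_;
};

// Propagate the close() result to the caller instead of discarding it.
Status write_all(BufferedWriter& writer, const std::string& data) {
    writer.append(data);
    return writer.close();
}
```

A caller that ignored the return value of `close()` would report success even though no data reached the sink.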
2024-08-06 08:51:12 +08:00
0711423ee3 [Chore](pipeline) set PipelineFragmentContext::_timeout (#38890)
## Proposed changes

Currently, `query_timeout` sets a timeout value for queries. The
pipelineX engine does not use it, however, so a query does not end until
EOS. This PR fixes that.
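
A minimal sketch of a per-fragment deadline check, assuming a monotonic clock. `FragmentDeadline` and its members are hypothetical names and do not reflect the actual `PipelineFragmentContext::_timeout` handling.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical holder for a fragment's start time and its query timeout.
struct FragmentDeadline {
    Clock::time_point start;
    std::chrono::seconds timeout;

    // True once at least `timeout` has elapsed since `start`.
    bool expired(Clock::time_point now) const { return now - start >= timeout; }
};
```

A scheduler could call `expired(Clock::now())` between tasks and cancel the fragment once it returns true, instead of waiting for EOS.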

pick #35328

2024-08-05 21:47:08 +08:00
9d5af7febd [opt](inverted index) Optimization of the initialization process in topn (#38870)
pick https://github.com/apache/doris/pull/37722
2024-08-05 18:26:00 +08:00
bf1c7a1c15 [fix](clone) fix stale tablet report miss the new cloning replica #38695 (#38839)
cherry pick from #38695
2024-08-05 18:04:24 +08:00
0f69a2a47f [fix](compaction) fix mismatch between segment key and value column rows during compaction (#37960)(#38251)(#38356) (#38835)
pick master #37960 #38251 #38356
2024-08-05 16:48:08 +08:00
4c75fecea9 [fix](compile) be compile failed in mac due to std::max (#37238) (#38860)
cherry-pick #37238 to branch-2.1
2024-08-05 16:31:39 +08:00
bb962a8291 [minor](fix) Fix incorrect fmt arguments (#38840) (#38861)
pick #38840
2024-08-05 16:06:32 +08:00
65154f8abe [branch-2.1] (doris-future) Support auto partition name function (#38853)
cherry-pick https://github.com/apache/doris/pull/34258 to branch-2.1
2024-08-05 16:04:24 +08:00
Pxl
86ef0069ea [Feature](function) support group concat with distinct and order by (#38851)
pick from #38744 and #38776
2024-08-05 15:44:51 +08:00
607c0b82a9 [opt](serde) Optimize the filling of fixed values into block columns without repeated deserialization. (#37377) (#38245) (#38810)
## Proposed changes
pick pr: #38575 and fix this pr bug: #38245
2024-08-05 09:13:08 +08:00