8535 Commits

Author SHA1 Message Date
702abbff0f [Opt](orc)Optimize the merge io when orc reader read multiple tiny stripes. (#42004) (#44239)
bp #42004

Co-authored-by: kaka11chen <kaka11.chen@gmail.com>
2024-11-22 11:01:41 +08:00
346b89e683 [improve](routine load) adjust default values to make routine load more convenient to use (#42491) (#44377)
pick (#42491)

A routine load job is divided into many tasks, each of
which is a transaction. Currently, the default maximum time a task
spends consuming (max_batch_interval) is 10 seconds. The benefits of
increasing this value are:
1. Larger batch consumption can lead to better performance.
2. Reducing the number of transactions can alleviate the pressure of
compaction and the conflicts of concurrent transaction submissions.
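
The second point can be made concrete with a little arithmetic. The sketch below is not Doris code: `tasks_per_hour` is a hypothetical helper, and it assumes every task runs for the full max_batch_interval before committing.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of Doris): number of routine load tasks,
// and therefore transactions, one job commits per hour, assuming every
// task runs for the full max_batch_interval.
inline std::int64_t tasks_per_hour(std::int64_t max_batch_interval_seconds) {
    assert(max_batch_interval_seconds > 0);  // guard against division by zero
    return 3600 / max_batch_interval_seconds;
}
```

With the 10-second interval mentioned above, `tasks_per_hour(10)` is 360 transactions per hour per job. An illustrative 60-second interval (chosen only for this example, not taken from the PR) would give 60.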

related doc: https://github.com/apache/doris-website/pull/1236/files
2024-11-21 23:05:11 +08:00
9664b50eb6 [improve](load) do not block delta writer if memtable memory is low (#42649) (#44305)
backport #42649
2024-11-21 11:17:35 +08:00
fb163b55c2 branch-2.1: [Fix](merge-on-write) Fix MergeIndexDeleteBitmapCalculator::calculate_one() coredump #44284 (#44330)
Cherry-picked from #44284

Co-authored-by: bobhan1 <baohan@selectdb.com>
2024-11-20 21:07:43 +08:00
dc67086d97 [fix](scan) Avoid memory allocated by buffered_reader from being traced (#41921) (#44253)
Use OwnedSlice to replace `char*` in BufferedReader

## Proposed changes

pick #41921
2024-11-20 10:37:06 +08:00
610054c77b [cherry-pick](branch-21) fix exchange of tablet shuffle send block error (#44102) (#44230)
cherry-pick from master (#44102)
2024-11-19 17:31:06 +08:00
433c1bc9ff [cherry-pick](branch-21) replace the LOG(FATAL) to throw Exception in query execute layer (#38144) (#44183)
cherry-pick from master https://github.com/apache/doris/pull/38144
2024-11-19 17:28:20 +08:00
c9801f7a38 branch-2.1: [Bug](function) fix cut_ipv6 function error about modify the input column data #43921 (#44180)
Cherry-picked from #43921

Co-authored-by: zhangstar333 <zhangsida@selectdb.com>
2024-11-19 17:27:29 +08:00
83208ee1a8 [pick](branch-2.1) pick #43960 #43929 #44177 (#44240)
pick #43960 #43929 #44177
2024-11-19 17:25:16 +08:00
eeafe45c0c [fix](brpc) coredump caused by brpc checking (#44047) (#44188)
pick #44047
```
/root/doris/be/src/runtime/fragment_mgr.cpp:1064:20: runtime error: member call on null pointer of type 'doris::PBackendService_Stub'

SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /root/doris/be/src/runtime/fragment_mgr.cpp:1064:20 in
*** Query id: 0-0 ***
*** is nereids: 0 ***
*** tablet id: 0 ***
*** Aborted at 1731663847 (unix time) try "date -d @1731663847" if you are using GNU date ***
*** Current BE git commitID: b663df0e50 ***
*** SIGSEGV address not mapped to object (@0x0) received by PID 17169 (TID 17463 OR 0x7f746d21a700) from PID 0; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo_t*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
3# 0x00007F7601263090 in /lib/x86_64-linux-gnu/libc.so.6
4# doris::FragmentMgr::_check_brpc_available(std::shared_ptr<doris::PBackendService_Stub> const&, doris::FragmentMgr::BrpcItem const&) in /mnt/ssd01/pipline/OpenSourceDoris/clusterEnv/P0/Cluster0/be/lib/doris_be
5# doris::FragmentMgr::cancel_worker() at /root/doris/be/src/runtime/fragment_mgr.cpp:1022
6# doris::Thread::supervise_thread(void*) at /root/doris/be/src/util/thread.cpp:499
7# start_thread at /build/glibc-SzIz7B/glibc-2.31/nptl/pthread_create.c:478
8# __clone at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
```
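
The sanitizer report above points at a member call through a null stub pointer. A minimal sketch of the kind of guard that avoids this class of crash follows; `FakeStub` and `check_stub_available` are hypothetical stand-ins, not the real `doris::PBackendService_Stub` or `FragmentMgr::_check_brpc_available`, and the actual fix in the PR may differ.

```cpp
#include <memory>

// Hypothetical stand-in for a generated RPC stub; not the real
// doris::PBackendService_Stub.
struct FakeStub {
    bool ping() const { return true; }
};

// Returns false instead of dereferencing when the stub is missing.
inline bool check_stub_available(const std::shared_ptr<FakeStub>& stub) {
    if (stub == nullptr) {
        // Calling a member function through a null stub is undefined behavior.
        return false;
    }
    return stub->ping();
}
```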

2024-11-19 14:56:46 +08:00
b4e136bfcc [performance](move-memtable) async close tablet streams (#41156 & #43813) (#44128)
backport #41156 and #43813
2024-11-19 14:14:53 +08:00
fa43cc0a90 [improve](http) Save the requested url in http execution error #43855 (#44106)
cherry pick from #43855
2024-11-18 15:06:02 +08:00
445e196041 [improve](load) pass cancel reason to tablet writer when cancelled (#43388) (#44131)
backport #43388
2024-11-18 14:11:38 +08:00
ea61206233 [pick](branch-2.1) pick #43281 (#44020)
pick #43281
2024-11-16 21:53:21 +08:00
261c65f72d [fix](pipeline) only sub_running_sink_operators in close #43500 (#43726)
https://github.com/apache/doris/pull/43500
### What problem does this PR solve?
Previously, sub_running_sink_operators was called only when encountering
EOS during sink or when all sources were closed. However, this approach
has issues, as it's possible for the user to manually cancel, in which
case there may be no EOS and the sources may not be closed. This would
prevent running_sink_operators from reaching zero, leading to errors.
```
PipelineTask[this = 0x7fc369fe9600, id = 0, open = true, eos = false, finish = false, dry run = false, elapse time = 26361.740784032s], block dependency = NULL, is running = true
operators: 
LOCAL_EXCHANGE_OPERATOR (LOCAL_MERGE_SORT): id=-5, parallel_tasks=4, _channel_id: 0, _num_partitions: 4, _num_senders: 4, _num_sources: 4, _running_sink_operators: 1, _running_source_operators: 1, mem_usage: 0, data queue info: Data Queue 0: [size approx = 0, eos = false], MemTrackers: 0: 0, 1: 34537472, 2: 5701632, 3: 0, 
  DATA_STREAM_SINK_OPERATOR: id=6, Sink Buffer: (_should_stop = false, _busy_channels = 0, _is_finishing = false), _reach_limit: false
0.   this=0x7fc376438f10, LOCAL_MERGE_EXCHANGE_OPERATOR_DEPENDENCY: id=-5, block task = 0, ready=true, _always_ready=true
0.   this=0x7fc3764bc110, LOCAL_MERGE_EXCHANGE_OPERATOR_DEPENDENCY: id=-5, block task = 0, ready=true, _always_ready=true
0.   this=0x7fc3764bc310, LOCAL_MERGE_EXCHANGE_OPERATOR_DEPENDENCY: id=-5, block task = 0, ready=true, _always_ready=true
0.   this=0x7fc3764bc510, LOCAL_MERGE_EXCHANGE_OPERATOR_DEPENDENCY: id=-5, block task = 0, ready=true, _always_ready=true
```
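
The idea in the title, decrementing the counter only in close, can be sketched as follows. This is a simplified illustration under assumed names (`SharedExchangeState`, `SinkLocalState`), not the actual Doris pipeline classes.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical simplification; not the real Doris local exchange state.
struct SharedExchangeState {
    std::atomic<int> running_sink_operators{0};
};

class SinkLocalState {
public:
    explicit SinkLocalState(SharedExchangeState& shared) : _shared(shared) {
        _shared.running_sink_operators.fetch_add(1);
    }

    // Reaching EOS only records the fact; it does not touch the counter.
    void on_eos() { _eos = true; }
    bool eos() const { return _eos; }

    // close() runs on normal completion and on cancellation, so the
    // decrement happens exactly once on every path.
    void close() {
        if (_closed) {
            return;  // a repeated close must not decrement again
        }
        _closed = true;
        const int previous = _shared.running_sink_operators.fetch_sub(1);
        assert(previous > 0);
        (void)previous;  // silence unused warning when asserts are disabled
    }

private:
    SharedExchangeState& _shared;
    bool _eos = false;
    bool _closed = false;
};
```

Because the decrement lives in `close()` and is guarded by `_closed`, a manually cancelled task that never sees EOS still brings the counter back to zero.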
2024-11-16 20:58:47 +08:00
10009dc062 [fix](variant) fix index in variant (#43375) (#43971)
Problem Summary:

1. Fix the error check in
`InvertedIndexColumnWriter::check_support_inverted_index`: it is not
appropriate to determine support for other indexes by checking the
inverted index.
2. Fix `TableSchema::update_index()`

pick from master #43375
2024-11-16 16:29:00 +08:00
d1cc68a26a branch-2.1: [Fix](auto-increment) Fix duplicate auto-increment column value problem #43774 (#43984)
Cherry-picked from #43774

Co-authored-by: bobhan1 <baohan@selectdb.com>
2024-11-16 16:17:24 +08:00
48e33bfb2a branch-2.1: [fix](hive)Fixed the issue of reading hive table with empty lzo files #43979 (#44063)
Cherry-picked from #43979

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2024-11-16 16:14:50 +08:00
b3022df716 [fix](inverted index) base compaction failed after restore indexes (#43962)
### What problem does this PR solve?
**Problem:**
After restoring from another cluster, rowsets get different index_ids,
which makes index compaction in base compaction always fail.

**Fix:**
On the master branch, https://github.com/apache/doris/pull/41625
already fixes this.
This PR picks it to branch-2.1.
2024-11-16 16:01:13 +08:00
Pxl
21b3e4bbf9 [Bug](runtime-filter) fix core dump on rf between varchar and char #43758 (#43934)
#43758
2024-11-16 15:59:19 +08:00
edd9015de3 [branch-2.1](function) fix error result in auto partition name (#41130) (#43977)
pick https://github.com/apache/doris/pull/41130
https://github.com/apache/doris/pull/41372

---------

Co-authored-by: zhaochangle <zhaochangle@selectdb.com>
2024-11-15 19:11:42 +08:00
a519702fac branch-2.1: [Bug](bitmap-filter) fix wrong type cast on BitmapFilterColumnPredicate::evaluate #43877 (#43886)
Cherry-picked from #43877

Co-authored-by: Pxl <xl@selectdb.com>
2024-11-15 00:02:51 +08:00
7fc78e3f87 [opt](brpc) check and remove unavailable brpc stubs (#43212) (#43859) 2024-11-14 19:52:06 +08:00
26fef2a02a branch-2.1: [Fix](UT) Fix status UT error introduced by #43731 #43922 (#43927)
Cherry-picked from #43922

Co-authored-by: abmdocrt <lianyukang@selectdb.com>
2024-11-14 19:23:37 +08:00
39b9b81d42 branch-2.1:[fix](build) Fix Mac compilation error caused by namespace conflict in find_symbols.h (#43868) 2024-11-14 10:10:12 +08:00
47f842b4d6 branch-2.1: [minor](rpc) Check client before RPC (#43818)
Cherry-picked from #43626

Co-authored-by: Gabriel <liwenqiang@selectdb.com>
2024-11-13 19:31:55 +08:00
d4712aed1a branch-2.1: [fix](string64) fix coredump caused by ColumnArray<ColumnStr<uint64_t>>::insert_indices_from (#43862)
Cherry-picked from #43624

Co-authored-by: TengJianPing <tengjianping@selectdb.com>
2024-11-13 19:31:11 +08:00
1101fbaf04 [fix](column_complex) wrong type of Field returned by ColumnComplex (#43515) (#43860) 2024-11-13 19:07:00 +08:00
6ecd55fa9e [cherry-pick](branch-2.1) Pick "[Fix](table size) Fix MoW table merge data fault (#40880)" (#43610) 2024-11-13 14:43:18 +08:00
2e64491ee3 [branch-2.1](insert-overwrite) Support create partition for auto partition table when insert overwrite (#38628) (#42644)
pick https://github.com/apache/doris/pull/38628
2024-11-13 11:16:00 +08:00
3d667e95d2 [branch-2.1](function) support array_split and array_reverse_split functions (#35619) (#43761)
pick https://github.com/apache/doris/pull/35619
2024-11-12 21:27:55 +08:00
52c29fef7f [pick](arm) use patch to opt bitshuffle and fix pick pr 38378 ,38634 (#43656)
### What problem does this PR solve?
https://github.com/apache/doris/pull/38634
https://github.com/apache/doris/pull/38378
2024-11-12 18:52:41 +08:00
24b79fd240 [branch-2.1](log) Pick all BE execution log reduction (#43267)
pick https://github.com/apache/doris/pull/42383
pick https://github.com/apache/doris/pull/42900
2024-11-12 18:18:59 +08:00
3e098e14f1 [fix](ngram bloomfilter) fix narrow conversion for ngram bf_size #43480 (#43654)
cherry pick from #43480
2024-11-12 16:54:43 +08:00
821c0d1380 branch-2.1: [improvement](paimon)Using table serialization on the jni side (#43475)
Cherry-picked from #43167

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2024-11-12 14:43:32 +08:00
66dcb943c3 branch-2.1: [opt](log) change lzo decompress log to debug level (#43583)
Cherry-picked from #43540

Co-authored-by: morningman <yunyou@selectdb.com>
2024-11-12 14:35:31 +08:00
d92c2f1b5e [cherry-pick](branch-2.1)Change spin lock to mutex (#43551)
pick from master:https://github.com/apache/doris/pull/41384
2024-11-12 14:28:26 +08:00
15f85e2cfb [fix] (bloom filter) Fix the bloom filter calculation for date and datetime (#43351) (#43622)
pick from master #43351

---------

Co-authored-by: csun5285 <sunchenyang@selectdb.com>
2024-11-12 10:56:55 +08:00
4c224bb1dd [improvement](build index)Optimize failed task check on same tablet (#42295) (#43589)
bp #42295

Co-authored-by: qidaye <luen@selectdb.com>
2024-11-12 09:56:55 +08:00
dc9ec5b177 [opt]function use sse to opt match_ipv6_subnet (#38755) (#43513)
## Proposed changes
https://github.com/apache/doris/pull/38755
test
```
-------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
-------------------------------------------------------------------
BM_matchIPv6SubnetSSE          1.89 ns         1.89 ns   1000000000
BM_matchIPv6SubnetNative       4.99 ns         4.99 ns    561455254
```
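
For readers unfamiliar with the operation being benchmarked, a scalar reference version of an IPv6 subnet match is sketched below. It is not the SSE code from the PR; the function name, the signature, and the assumption that both addresses are 16 bytes in network byte order are illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar reference sketch (not the SSE implementation): returns true when
// the first `prefix_bits` bits of two 16-byte IPv6 addresses are equal.
inline bool match_ipv6_subnet_scalar(const std::uint8_t* addr,
                                     const std::uint8_t* subnet,
                                     unsigned prefix_bits) {
    assert(prefix_bits <= 128);
    const std::size_t full_bytes = prefix_bits / 8;
    for (std::size_t i = 0; i < full_bytes; ++i) {
        if (addr[i] != subnet[i]) {
            return false;
        }
    }
    const unsigned remaining_bits = prefix_bits % 8;
    if (remaining_bits == 0) {
        return true;  // prefix ends on a byte boundary
    }
    // Keep only the high `remaining_bits` bits of the next byte.
    const std::uint8_t mask = static_cast<std::uint8_t>(0xFFu << (8 - remaining_bits));
    return ((addr[full_bytes] ^ subnet[full_bytes]) & mask) == 0;
}
```

A vectorized version typically replaces the byte loop with one 128-bit compare plus a mask, which is where the speedup in the benchmark would come from.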
2024-11-11 18:59:35 +08:00
55133f8e61 [fix](delete) Fix static type dispatch by mistake due to typo (#42260) (#43488)
### What problem does this PR solve?

Problem Summary:

DeletePredicatePB should be DeleteSubPredicatePB.
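
The commit message does not show the affected code, so the following is only a generic illustration of how naming the wrong type in a compile-time check can go unnoticed. The two structs are empty stand-ins for the real protobuf messages, and `parse_path` is a hypothetical function.

```cpp
#include <string>
#include <type_traits>

// Empty stand-ins for the two protobuf messages; the real
// DeletePredicatePB and DeleteSubPredicatePB have fields.
struct DeletePredicatePB {};
struct DeleteSubPredicatePB {};

// Dispatch on the template argument. If the check named DeletePredicatePB
// by mistake, the code would still compile, but every call made with
// DeleteSubPredicatePB would silently take the fallback branch.
template <typename SubPredType>
std::string parse_path() {
    if (std::is_same<SubPredType, DeleteSubPredicatePB>::value) {
        return "structured";  // hypothetical: column resolved from message fields
    }
    return "string";  // hypothetical: condition parsed from a plain string
}
```

Since both branches are valid for every type, the compiler gives no diagnostic for such a typo, which matches the description of a bug that only surfaced under a large random test.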

A test case is hard to add, since this bug was triggered by a huge
random test and no minimal case could be found. However, this fix was
verified under the wild test, and it does work.

Note that this problem may be triggered by another bug, because the
schema in the delete predicate rowset should contain every column
referred to in the delete condition. Even without this fix, this error
should never happen.

But this error occurred under wild tests, which means that the schema in
the delete predicate rowset is not compatible with the delete condition.
I think that in some state the delete operation uses the BE tablet
schema rather than the schema from FE, and a former rename operation
results in that state. But I failed to add a test case to reproduce it,
and according to the related code it should not be able to happen.
```
(1105, 'errCode = 2, detailMessage = (172.20.50.7)[INTERNAL_ERROR]failed to initialize storage reader. tablet=78026, res=[INTERNAL_ERROR]column not found, name=loc1, table_id=-1, schema_version=2

\t0#  doris::TabletSchema::column(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:375
\t1#  doris::Status doris::DeleteHandler::_parse_column_pred<doris::DeleteSubPredicatePB>(std::shared_ptr<doris::TabletSchema>, std::shared_ptr<doris::TabletSchema>, google::protobuf::RepeatedPtrField<doris::DeleteSubPredicatePB> const&, doris::DeleteConditions*) at /home/zcp/repo_center/doris_master/doris/be/src/util/expected.hpp:1986
\t2#  doris::DeleteHandler::init(std::shared_ptr<doris::TabletSchema>, std::vector<std::shared_ptr<doris::RowsetMeta>, std::allocator<std::shared_ptr<doris::RowsetMeta> > > const&, long) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
\t3#  doris::TabletReader::_init_delete_condition(doris::TabletReader::ReaderParams const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
\t4#  doris::TabletReader::_init_params(doris::TabletReader::ReaderParams const&) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t5#  doris::TabletReader::init(doris::TabletReader::ReaderParams const&) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t6#  doris::vectorized::BlockReader::init(doris::TabletReader::ReaderParams const&) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t7#  doris::vectorized::NewOlapScanner::open(doris::RuntimeState*) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t8#  doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:388
\t9#  std::_Function_handler<void (), doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_1::operator()() const::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
\t10# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_master/doris/be/src/util/threadpool.cpp:0
\t11# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
\t12# ?
\t13# ?
, backend=172.20.50.7')
```

```cpp
    auto tablet_schema = std::make_shared<TabletSchema>();
    tablet_schema->copy_from(*tablet->tablet_schema());
    if (!request.columns_desc.empty() && request.columns_desc[0].col_unique_id >= 0) {
        tablet_schema->clear_columns();
        // TODO(lhy) handle variant
        for (const auto& column_desc : request.columns_desc) {
            tablet_schema->append_column(TabletColumn(column_desc));
        }
    }
    RowsetSharedPtr rowset_to_add;
    // writes
    res = _convert_v2(tablet, &rowset_to_add, tablet_schema, push_type);
    if (!res.ok()) {
        LOG(WARNING) << "fail to convert tmp file when realtime push. res=" << res
                     << ", failed to process realtime push."
                     << ", tablet=" << tablet->tablet_id()
                     << ", transaction_id=" << request.transaction_id;

        Status rollback_status = _engine.txn_manager()->rollback_txn(request.partition_id, *tablet,
                                                                     request.transaction_id);
        // has to check rollback status to ensure not delete a committed rowset
        if (rollback_status.ok()) {
            _engine.add_unused_rowset(rowset_to_add);
        }
        return res;
    }

    // add pending data to tablet

    if (push_type == PushType::PUSH_FOR_DELETE) {
        rowset_to_add->rowset_meta()->set_delete_predicate(std::move(del_preds.front()));
        del_preds.pop();
    }
```
2024-11-11 17:44:20 +08:00
95d27cf6b2 [Opt](exec) change transmit block to rw lock to opt performance #43223 (#43492)
cherry pick #43223 

Co-authored-by: HappenLee <happenlee@selectdb.com>
2024-11-11 17:32:09 +08:00
e073b575cc [Opt](TabletSchema) reuse TabletColumn info to reduce mem (#42448) (#43349)
(#42448)
2024-11-11 10:38:42 +08:00
ebe6b4d4db [Opt](Serde) optimize serialization to string on variant type (#43237) (#43342)
(#43237)
2024-11-11 10:35:30 +08:00
138103f9eb [opt](arm)Remove negative optimizations of SSE2NEON on memcmp for ARM… (#43510)
… (#38759)
https://github.com/apache/doris/pull/38759
The main issue is that _mm_movemask_epi8 does not have a one-to-one
corresponding instruction on ARM. Testing shows that it performs worse
compared to using memcmp, which allows the compiler to generate the
corresponding ARM instructions.
The following tests were conducted on ARM.
```
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
BM_memequal16_sse         3.77 ns         3.77 ns    743238946
BM_memequal16_orgin       2.11 ns         2.11 ns   1000000000
```
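
The approach that benchmarked faster, deferring to memcmp, can be sketched as below. The name `memequal16` is taken from the benchmark labels; the signature is an assumption.

```cpp
#include <cstring>

// Compares two 16-byte regions with memcmp and lets the compiler choose the
// instructions for the target, rather than emulating _mm_movemask_epi8.
inline bool memequal16(const void* lhs, const void* rhs) {
    return std::memcmp(lhs, rhs, 16) == 0;
}
```

With a constant size of 16, compilers can usually inline the comparison instead of calling the library routine, which is consistent with the ARM numbers above.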
2024-11-10 21:20:46 +08:00
91eb8f8365 branch-2.1: [chore](log) Use correct error type of uneven user behaviour (#43494)
Cherry-picked from #43334

Co-authored-by: zclllhhjj <zhaochangle@selectdb.com>
2024-11-10 21:12:35 +08:00
b9e5d878fc [refine](bits) refine bytes_mask_to_bits_mask code (#38360) (#43511)
https://github.com/apache/doris/pull/38360
The previous code only considered the x86 architecture, and
_mm_movemask_epi8 does not have a corresponding instruction in ARM.
According to the article below, we need to abstract the overall logic.
For ARM, optimize using the content mentioned in the following article:
filter function origin: 0.711375 seconds, 0.7154 seconds, 0.71782 seconds, 0.715296 seconds
filter function arm opt: 0.559854 seconds, 0.559854 seconds, 0.559854 seconds, 0.559854 seconds
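
As a point of reference for the logic being abstracted, here is a portable scalar sketch. The semantics (64 filter bytes packed into one 64-bit mask, bit i set when byte i is non-zero) are an assumption inferred from the function name, and `bytes_mask_to_bits_mask_scalar` is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>

// Portable scalar sketch: packs 64 filter bytes into one 64-bit mask,
// setting bit i when byte i is non-zero.
inline std::uint64_t bytes_mask_to_bits_mask_scalar(const std::uint8_t* bytes) {
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < 64; ++i) {
        mask |= static_cast<std::uint64_t>(bytes[i] != 0) << i;
    }
    return mask;
}
```

On x86, `_mm_movemask_epi8` produces this kind of mask 16 bytes at a time; an architecture-specific path would replace the loop while keeping the same contract.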
2024-11-10 21:06:39 +08:00
8867a826bc [opt](arm) Optimize the BlockBloomFilter::bucket_find on ARM platform… (#43508)
…s using NEON instructions. (#38888)
https://github.com/apache/doris/pull/38888
## Proposed changes

```
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
BM_BucketFindNeon         8.14 ns         8.14 ns    344002441
BM_BucketFindNative       17.5 ns         17.5 ns    160152491
```
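
For context, a scalar sketch of a bucket lookup in a split block Bloom filter is shown below. It assumes a layout of eight 32-bit words per bucket with one bit set per word; the salt constants and function names are illustrative and are not taken from the PR or verified against `BlockBloomFilter`.

```cpp
#include <cstdint>

// Illustrative odd multipliers, assumed for this sketch.
static const std::uint32_t kSalt[8] = {0x47b6137bU, 0x44974d91U, 0x8824ad5bU, 0xa2b7289dU,
                                       0x705495c7U, 0x2df1424bU, 0x9efc4947U, 0x5c6bfb31U};

// The top 5 bits of hash * salt select one of the 32 bits in a word.
inline std::uint32_t word_mask(std::uint32_t hash, int word) {
    return 1U << ((hash * kSalt[word]) >> 27);
}

// Sets one bit in each of the eight words of the bucket.
inline void bucket_insert_scalar(std::uint32_t* bucket, std::uint32_t hash) {
    for (int i = 0; i < 8; ++i) {
        bucket[i] |= word_mask(hash, i);
    }
}

// A key may be present only if all eight selected bits are set.
inline bool bucket_find_scalar(const std::uint32_t* bucket, std::uint32_t hash) {
    for (int i = 0; i < 8; ++i) {
        if ((bucket[i] & word_mask(hash, i)) == 0) {
            return false;
        }
    }
    return true;
}
```

The eight independent word operations map naturally onto SIMD lanes, which is the kind of work a NEON version can do in a few vector instructions.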
2024-11-10 18:23:58 +08:00
625a1ea6ad branch-2.1: [Optimize](Expr) Opt getting value of VLitreal (#43249)
Cherry-picked from #43204

Co-authored-by: zclllhhjj <zhaochangle@selectdb.com>
2024-11-10 01:03:56 +08:00
d933956449 [branch-2.1](timezone) Preload time offset in datetime (#42395) (#42607)
pick https://github.com/apache/doris/pull/42395
2024-11-10 00:30:28 +08:00