Commit Graph

20685 Commits

Author SHA1 Message Date
797a31bbb8 [fix](statistics)Change column statistics table varchar length to 1024. (#43244) (#43760)
backport: https://github.com/apache/doris/pull/43244/
2024-11-12 19:13:23 +08:00
52c29fef7f [pick](arm) use patch to opt bitshuffle and fix pick pr 38378 ,38634 (#43656)
### What problem does this PR solve?
https://github.com/apache/doris/pull/38634
https://github.com/apache/doris/pull/38378
Issue Number: close #xxx
2024-11-12 18:52:41 +08:00
24b79fd240 [branch-2.1](log) Pick all BE execution log reduction (#43267)
pick https://github.com/apache/doris/pull/42383
pick https://github.com/apache/doris/pull/42900
2024-11-12 18:18:59 +08:00
3e098e14f1 [fix](ngram bloomfilter) fix narrow conversion for ngram bf_size #43480 (#43654)
cherry pick from #43480
2024-11-12 16:54:43 +08:00
6ae2106cd9 branch-2.1: [improvement](statistics)Remove useless stats validation check. (#43499)
Cherry-picked from #43279

Co-authored-by: Jibing-Li <64681310+Jibing-Li@users.noreply.github.com>
2024-11-12 16:00:35 +08:00
821c0d1380 branch-2.1: [improvement](paimon)Using table serialization on the jni side (#43475)
Cherry-picked from #43167

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2024-11-12 14:43:32 +08:00
66dcb943c3 branch-2.1: [opt](log) change lzo decompress log to debug level (#43583)
Cherry-picked from #43540

Co-authored-by: morningman <yunyou@selectdb.com>
2024-11-12 14:35:31 +08:00
d92c2f1b5e [cherry-pick](branch-2.1)Change spin lock to mutex (#43551)
pick from master:https://github.com/apache/doris/pull/41384
2024-11-12 14:28:26 +08:00
95a3d237b8 branch-2.1: [fix](case) Fix test_backup_restore_atomic_with_alter with SYNC (#43640)
Cherry-picked from #43601

Co-authored-by: walter <w41ter.l@gmail.com>
2024-11-12 14:03:31 +08:00
f1a8ac7f81 [fix](group commit) fix NPE in group commit select backend (#43629) (#43651)
pick https://github.com/apache/doris/pull/43629

Co-authored-by: meiyi <meiyi@selectdb.com>
2024-11-12 14:01:50 +08:00
15f85e2cfb [fix] (bloom filter) Fix the bloom filter calculation for date and datetime (#43351) (#43622)
pick from master #43351

---------

Co-authored-by: csun5285 <sunchenyang@selectdb.com>
2024-11-12 10:56:55 +08:00
cfae726428 [cherry-pick][test](doc) add some data-admin example in doris's doc to regression test (#43240) (#43566)
(cherry picked from commit 72b4daa9b461b14f1ea179f62543d2336f6ea5b2)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #43240

Problem Summary:
2024-11-12 10:55:54 +08:00
4c224bb1dd [improvement](build index)Optimize failed task check on same tablet (#42295) (#43589)
bp #42295

Co-authored-by: qidaye <luen@selectdb.com>
2024-11-12 09:56:55 +08:00
541284ec21 branch-2.1: [fix](test) fix clickhouse jdbc catalog func with cast push down case (#43431)
Cherry-picked from #43348

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-11-12 00:18:20 +08:00
14caaf33cb branch-2.1: [fix](Export) fix issue that Export can not specify the columns which have capital letters (#43572)
Cherry-picked from #42994

Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
2024-11-12 00:17:21 +08:00
a1ff02288f branch-2.1: [fix](hive) support query hive view created by spark (#43553)
Cherry-picked from #43530

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
Co-authored-by: morningman <yunyou@selectdb.com>
2024-11-11 23:28:53 +08:00
850afa1319 branch-2.1: [test](doc) add some remote-storage-example in doris's doc to regression test (#43507)
Cherry-picked from #43213

Co-authored-by: yagagagaga <zhangminkefromflydish@gmail.com>
2024-11-11 23:22:18 +08:00
97e9d4d9c0 [enhancement](log) logout tablename with temp partitions in backup (#42143) (#43139)
2024-11-11 23:14:53 +08:00
97ca90075c [chore](agent) log the binary message size of agent tasks #43239 (#43598)
cherry pick from #43239
2024-11-11 19:39:13 +08:00
14d511fe3a [feat](restore) Support compressed snapshot meta and job info #43516 (#43569)
cherry pick from #43516
2024-11-11 19:29:27 +08:00
4ffa84e143 [chore](restore) reduce logged unfinished snapshoting task #43076 (#43607)
cherry pick from #43076
2024-11-11 19:28:48 +08:00
dc9ec5b177 [opt]function use sse to opt match_ipv6_subnet (#38755) (#43513)
## Proposed changes
https://github.com/apache/doris/pull/38755
test
```
-------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
-------------------------------------------------------------------
BM_matchIPv6SubnetSSE          1.89 ns         1.89 ns   1000000000
BM_matchIPv6SubnetNative       4.99 ns         4.99 ns    561455254
```
2024-11-11 18:59:35 +08:00
55133f8e61 [fix](delete) Fix static type dispatch by mistake due to typo (#42260) (#43488)
### What problem does this PR solve?

Problem Summary:

DeletePredicatePB should be DeleteSubPredicatePB.

A test case is hard to add: this bug was triggered by a huge random
test, and no minimal reproducing case was found. However, the fix has
been verified under that wild test and it does work.

Note that this problem may be triggered by another bug, because the
schema in a delete predicate rowset should contain every column referred
to in the delete condition. Even without this fix, this error should
never happen.

But this error did occur under wild tests, which means that the schema in
the delete predicate rowset did not match the delete condition. I think
that in some state the delete operation uses the BE tablet schema rather
than the schema from FE, and that an earlier rename operation leads to
that state. I failed to add a test case to reproduce it, though, and
according to the related code it should not be able to happen.
```
(1105, 'errCode = 2, detailMessage = (172.20.50.7)[INTERNAL_ERROR]failed to initialize storage reader. tablet=78026, res=[INTERNAL_ERROR]column not found, name=loc1, table_id=-1, schema_version=2

\t0#  doris::TabletSchema::column(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:375
\t1#  doris::Status doris::DeleteHandler::_parse_column_pred<doris::DeleteSubPredicatePB>(std::shared_ptr<doris::TabletSchema>, std::shared_ptr<doris::TabletSchema>, google::protobuf::RepeatedPtrField<doris::DeleteSubPredicatePB> const&, doris::DeleteConditions*) at /home/zcp/repo_center/doris_master/doris/be/src/util/expected.hpp:1986
\t2#  doris::DeleteHandler::init(std::shared_ptr<doris::TabletSchema>, std::vector<std::shared_ptr<doris::RowsetMeta>, std::allocator<std::shared_ptr<doris::RowsetMeta> > > const&, long) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
\t3#  doris::TabletReader::_init_delete_condition(doris::TabletReader::ReaderParams const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
\t4#  doris::TabletReader::_init_params(doris::TabletReader::ReaderParams const&) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t5#  doris::TabletReader::init(doris::TabletReader::ReaderParams const&) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t6#  doris::vectorized::BlockReader::init(doris::TabletReader::ReaderParams const&) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t7#  doris::vectorized::NewOlapScanner::open(doris::RuntimeState*) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:499
\t8#  doris::vectorized::ScannerScheduler::_scanner_scan(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>) at /home/zcp/repo_center/doris_master/doris/be/src/common/status.h:388
\t9#  std::_Function_handler<void (), doris::vectorized::ScannerScheduler::submit(std::shared_ptr<doris::vectorized::ScannerContext>, std::shared_ptr<doris::vectorized::ScanTask>)::$_1::operator()() const::{lambda()#1}>::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:701
\t10# doris::ThreadPool::dispatch_thread() at /home/zcp/repo_center/doris_master/doris/be/src/util/threadpool.cpp:0
\t11# doris::Thread::supervise_thread(void*) at /var/local/ldb-toolchain/bin/../usr/include/pthread.h:562
\t12# ?
\t13# ?
, backend=172.20.50.7')
```

```cpp
    auto tablet_schema = std::make_shared<TabletSchema>();
    tablet_schema->copy_from(*tablet->tablet_schema());
    if (!request.columns_desc.empty() && request.columns_desc[0].col_unique_id >= 0) {
        tablet_schema->clear_columns();
        // TODO(lhy) handle variant
        for (const auto& column_desc : request.columns_desc) {
            tablet_schema->append_column(TabletColumn(column_desc));
        }
    }
    RowsetSharedPtr rowset_to_add;
    // writes
    res = _convert_v2(tablet, &rowset_to_add, tablet_schema, push_type);
    if (!res.ok()) {
        LOG(WARNING) << "fail to convert tmp file when realtime push. res=" << res
                     << ", failed to process realtime push."
                     << ", tablet=" << tablet->tablet_id()
                     << ", transaction_id=" << request.transaction_id;

        Status rollback_status = _engine.txn_manager()->rollback_txn(request.partition_id, *tablet,
                                                                     request.transaction_id);
        // has to check rollback status to ensure not delete a committed rowset
        if (rollback_status.ok()) {
            _engine.add_unused_rowset(rowset_to_add);
        }
        return res;
    }

    // add pending data to tablet

    if (push_type == PushType::PUSH_FOR_DELETE) {
        rowset_to_add->rowset_meta()->set_delete_predicate(std::move(del_preds.front()));
        del_preds.pop();
    }
```
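
The type mix-up named above (`DeletePredicatePB` where `DeleteSubPredicatePB` was meant) can be illustrated with a small, self-contained sketch. It is not the Doris code: the three structs are hypothetical stand-ins for the protobuf messages, and `Path`, `dispatch_with_typo`, `dispatch_fixed` and `check_dispatch` are names invented for this example. It only shows why a wrong type name in a compile-time check compiles cleanly yet silently selects the wrong branch.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the protobuf messages named in the commit.
struct DeletePredicatePB {};                                 // whole predicate (container)
struct DeleteSubPredicatePB { int column_unique_id = -1; };  // one sub-predicate
struct InPredicatePB { int column_unique_id = -1; };         // one IN predicate

enum class Path { SubPredicate, Other };

// With the typo: the check names the container type, which is never the
// template argument here, so the first branch is dead code.
template <typename T>
constexpr Path dispatch_with_typo(const T&) {
    if constexpr (std::is_same_v<T, DeletePredicatePB>) {
        return Path::SubPredicate;
    } else {
        return Path::Other;
    }
}

// Intended check: compare against the sub-predicate type.
template <typename T>
constexpr Path dispatch_fixed(const T&) {
    if constexpr (std::is_same_v<T, DeleteSubPredicatePB>) {
        return Path::SubPredicate;
    } else {
        return Path::Other;
    }
}

// Usage: both versions compile, but only the fixed one takes the
// sub-predicate branch for a DeleteSubPredicatePB argument.
inline void check_dispatch() {
    assert(dispatch_with_typo(DeleteSubPredicatePB{}) == Path::Other);
    assert(dispatch_fixed(DeleteSubPredicatePB{}) == Path::SubPredicate);
    assert(dispatch_fixed(InPredicatePB{}) == Path::Other);
}
```

Because both type names are valid, the compiler accepts either spelling, which fits the description of a bug that only surfaced under a large random test.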
2024-11-11 17:44:20 +08:00
95d27cf6b2 [Opt](exec) change transmit block to rw lock to opt performance #43223 (#43492)
cherry pick #43223 

Co-authored-by: HappenLee <happenlee@selectdb.com>
2024-11-11 17:32:09 +08:00
5939200076 [Impl](Nereids) add propagateNullLiteral trait for special functions (#42256) (#43491)
pick: #42256

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [x] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->

Co-authored-by: LiBinfeng <libinfeng@selectdb.com>
2024-11-11 16:22:41 +08:00
5dda61b410 [fix](backup) Load backup meta and job info bytes from disk #43276 (#43519)
cherry pick from #43276
2024-11-11 14:08:35 +08:00
e073b575cc [Opt](TabletSchema) reuse TabletColumn info to reduce mem (#42448) (#43349)
(#42448)
2024-11-11 10:38:42 +08:00
ebe6b4d4db [Opt](Serde) optimize serialization to string on variant type (#43237) (#43342)
(#43237)
2024-11-11 10:35:30 +08:00
138103f9eb [opt](arm)Remove negative optimizations of SSE2NEON on memcmp for ARM (#38759) (#43510)
https://github.com/apache/doris/pull/38759
The main issue is that _mm_movemask_epi8 does not have a one-to-one
corresponding instruction on ARM. Testing shows that it performs worse
compared to using memcmp, which allows the compiler to generate the
corresponding ARM instructions.
The following tests were conducted on ARM.
```
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
BM_memequal16_sse         3.77 ns         3.77 ns    743238946
BM_memequal16_orgin       2.11 ns         2.11 ns   1000000000
```
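
A minimal sketch of the memcmp-based comparison this message argues for. The helper name `memequal16` is an assumption taken from the benchmark names above, and this is not the code from the Doris tree. It only shows the plain form that leaves instruction selection to the compiler instead of emulating `_mm_movemask_epi8`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns true when two 16-byte regions are equal.
// The constant size lets the compiler pick suitable instructions for the
// target (x86 or ARM) without any SSE emulation layer.
inline bool memequal16(const void* a, const void* b) {
    assert(a != nullptr && b != nullptr);
    return std::memcmp(a, b, 16) == 0;
}
```

Called as `memequal16(lhs, rhs)`, both pointers must reference at least 16 readable bytes.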
2024-11-10 21:20:46 +08:00
91eb8f8365 branch-2.1: [chore](log) Use correct error type of uneven user behaviour (#43494)
Cherry-picked from #43334

Co-authored-by: zclllhhjj <zhaochangle@selectdb.com>
2024-11-10 21:12:35 +08:00
b9e5d878fc [refine](bits) refine bytes_mask_to_bits_mask code (#38360) (#43511)
https://github.com/apache/doris/pull/38360
The previous code only considered the x86 architecture, and
_mm_movemask_epi8 does not have a corresponding instruction in ARM.
According to the article below, we need to abstract the overall logic.
For ARM, optimize using the content mentioned in the following article:
filter function origin: 0.711375 seconds, 0.7154 seconds, 0.71782 seconds, 0.715296 seconds
filter function arm opt: 0.559854 seconds, 0.559854 seconds, 0.559854 seconds, 0.559854 seconds
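
For orientation, a portable scalar reference of what a bytes-mask to bits-mask conversion computes. This assumes, from the function name only, that the helper turns a run of 0/1 filter bytes into an integer bitmask. The name `bytes_mask_to_bits_mask_scalar` and its signature are hypothetical, and the commit's actual x86 and ARM paths are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar reference (assumed semantics): bit i of the result is set when
// bytes[i] is non-zero. Handles at most 64 bytes per call.
inline std::uint64_t bytes_mask_to_bits_mask_scalar(const std::uint8_t* bytes,
                                                    std::size_t n) {
    assert(bytes != nullptr && n <= 64);
    std::uint64_t mask = 0;
    for (std::size_t i = 0; i < n; ++i) {
        mask |= static_cast<std::uint64_t>(bytes[i] != 0) << i;
    }
    return mask;
}
```

A vectorized version for either architecture should return the same value as this loop, which makes it a convenient baseline when abstracting the logic across platforms.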
2024-11-10 21:06:39 +08:00
72b1e2a346 [fix](Outfile) forbid parallel outfile if pipeline engine enabled. (#43437)
Because pipeline engine doesn't implement parallel outfile, we should
forbid the parallel outfile if pipeline engine enabled.
2024-11-10 18:29:08 +08:00
8867a826bc [opt](arm) Optimize the BlockBloomFilter::bucket_find on ARM platforms using NEON instructions. (#38888) (#43508)
https://github.com/apache/doris/pull/38888
## Proposed changes

```
--------------------------------------------------------------
Benchmark                    Time             CPU   Iterations
--------------------------------------------------------------
BM_BucketFindNeon         8.14 ns         8.14 ns    344002441
BM_BucketFindNative       17.5 ns         17.5 ns    160152491
```
2024-11-10 18:23:58 +08:00
4c7e495392 [cherry-pick](branch-2.1) fix wrong property of S3 resource (#43413)
2024-11-10 18:17:51 +08:00
5ac3aee460 branch-2.1: [opt](max-compute) avoid repeated location path creation (#43383)
Cherry-picked from #43355

Co-authored-by: Mingyu Chen (Rayner) <morningman@163.com>
2024-11-10 10:11:37 +08:00
182f37f837 [fix](planner) NullLiteral should always having a correct Type and set to be analyzed (#43371)
2024-11-10 10:10:40 +08:00
1d740ff825 [fix](auditlog) set isQuery to true when query is short circuited (#42647) (#43345)
(#42647)
2024-11-10 10:08:56 +08:00
ea67e3a6b4 branch-2.1: [enhance](mtmv)MTMV interface optimization (#43329)
Cherry-picked from #43086

Co-authored-by: zhangdong <493738387@qq.com>
2024-11-10 10:07:35 +08:00
80fd76677e branch-2.1: [Improvement](LDAP Auth)Enhance LDAP authentication with a configurable group filter (#43293)
Cherry-picked from #42038

Co-authored-by: nsivarajan <117266407+nsivarajan@users.noreply.github.com>
Co-authored-by: Sivarajan Narayanan <narayanan_sivarajan@apple.com>
2024-11-10 10:06:13 +08:00
fe1b8d44fd branch-2.1: [fix](mtmv)Fix the problem where the job does not exist, which prevents the deletion of MTMV (#43325)
Cherry-picked from #43080

Co-authored-by: zhangdong <493738387@qq.com>
2024-11-10 01:11:26 +08:00
625a1ea6ad branch-2.1: [Optimize](Expr) Opt getting value of VLitreal (#43249)
Cherry-picked from #43204

Co-authored-by: zclllhhjj <zhaochangle@selectdb.com>
2024-11-10 01:03:56 +08:00
fba06b33b9 [cherry-pick](branch-2.1)add SessionVariable for enableCooldownReplicaAffinity (#42675)
pick from master:https://github.com/apache/doris/pull/41741
2024-11-10 00:46:26 +08:00
486dfe9f42 branch-2.1: [fix](auth)Fix concurrency issue during role manager upgrade (#43194)
Cherry-picked from #42419

Co-authored-by: zhangdong <493738387@qq.com>
2024-11-10 00:45:07 +08:00
e8d4c4cb7a branch-2.1: [fix](regression) fix flaky partial update cases (#43143)
Cherry-picked from #42908

Co-authored-by: zhannngchen <48427519+zhannngchen@users.noreply.github.com>
2024-11-10 00:37:33 +08:00
d933956449 [branch-2.1](timezone) Preload time offset in datetime (#42395) (#42607)
pick https://github.com/apache/doris/pull/42395
2024-11-10 00:30:28 +08:00
5195d61b6f [fix](profile) update_rpc_time when enable_verbose_profile = false (#43096)
### What problem does this PR solve?

If enable_verbose_profile is false, update_rpc_time will not be called,
and RPC count, max, and min statistics will not be recorded.

```
                          -  RpcCount:  0
                          -  RpcMaxTime:  0ns
                          -  RpcMinTime:  0ns
                          -  RpcSumTime:  0ns
```
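
The behaviour described above can be sketched as follows. This is a simplified model, not the Doris implementation: `RpcStats`, `on_rpc_done_before_fix` and `on_rpc_done_after_fix` are hypothetical names, and only `update_rpc_time` and `enable_verbose_profile` come from the message. It assumes the fix is to record the basic counters regardless of the flag.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical counters mirroring RpcCount / RpcMaxTime / RpcMinTime / RpcSumTime.
struct RpcStats {
    std::int64_t count = 0;
    std::int64_t sum_ns = 0;
    std::int64_t max_ns = 0;
    std::int64_t min_ns = std::numeric_limits<std::int64_t>::max();

    void update_rpc_time(std::int64_t elapsed_ns) {
        assert(elapsed_ns >= 0);
        ++count;
        sum_ns += elapsed_ns;
        max_ns = std::max(max_ns, elapsed_ns);
        min_ns = std::min(min_ns, elapsed_ns);
    }
};

// Reported problem: the update is skipped when the flag is off,
// so every counter stays at its initial value.
inline void on_rpc_done_before_fix(RpcStats& stats, bool enable_verbose_profile,
                                   std::int64_t elapsed_ns) {
    if (enable_verbose_profile) {
        stats.update_rpc_time(elapsed_ns);
    }
}

// Assumed shape of the fix: always record the basic counters; the flag
// would then only gate any extra per-RPC detail (not modelled here).
inline void on_rpc_done_after_fix(RpcStats& stats, bool /*enable_verbose_profile*/,
                                  std::int64_t elapsed_ns) {
    stats.update_rpc_time(elapsed_ns);
}
```

With the flag off, the first function leaves `count` at 0, matching the all-zero profile output above, while the second still accumulates count, max, min and sum.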



Co-authored-by: Mryange <yanxuecheng@selectdb.com>
2024-11-09 22:17:29 +08:00
9d7bc5b765 [pick](branch-2.1) pick #38215 (#43386)
pick #38215

---------

Co-authored-by: Zou Xinyi <zouxinyi@selectdb.com>
2024-11-09 22:13:05 +08:00
fefc8a8efb branch-2.1: [fix](new_json_reader)fix new_json_reader core (#43188)
Cherry-picked from #41290

Co-authored-by: amory <wangqiannan@selectdb.com>
2024-11-09 12:33:34 +08:00
95cfe72f61 [test](p0) fix load stream leak in injection cases (#42681) (#43505)
cherry-pick #42681
2024-11-08 21:36:13 +08:00
2ba88ed2a8 [improve](report) split agent batch tasks automaticlly #43257 (#43365)
cherry pick from #43257
2024-11-08 18:59:53 +08:00