Commit Graph

8651 Commits

Author SHA1 Message Date
5deeb42f6d branch-2.1: [opt] Optimization for short circuit of CompoundPred #45422 (#46241)
cherry pick from #45422
2025-01-02 10:11:47 +08:00
8e730faec5 [Exec](expr) Opt the compound pred performance (#45414) (#46232)
cherry-pick #45414

before:
```
 mysqlslap -hd3 -uroot -P9130  --create-schema=test_db2 -c 10 -i 500 -q "SELECT count(k) FROM sbtest1_dup WHERE k BETWEEN 4850578 AND 8454295 OR k BETWEEN 8776291 AND 29749077;"
Benchmark
	Average number of seconds to run all queries: 0.041 seconds
	Minimum number of seconds to run all queries: 0.037 seconds
	Maximum number of seconds to run all queries: 0.115 seconds
	Number of clients running queries: 10
	Average number of queries per client: 1
```

after:
```
mysqlslap -hd3 -uroot -P9030  --create-schema=test_db -c 10 -i 500 -q "SELECT count(k) FROM sbtest1 WHERE k BETWEEN 4850578 AND 8454295 OR k BETWEEN 8776291 AND 29749077;"
Benchmark
	Average number of seconds to run all queries: 0.029 seconds
	Minimum number of seconds to run all queries: 0.027 seconds
	Maximum number of seconds to run all queries: 0.034 seconds
	Number of clients running queries: 10
	Average number of queries per client: 1
```

2025-01-01 16:19:42 +08:00
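
The before/after figures in commit 8e730faec5 can be turned into relative reductions with a few lines of arithmetic. The sketch below copies only the numbers from the two `mysqlslap` outputs; the helper names are illustrative. Note that the two runs in the commit message use different ports, schemas, and tables, so this assumes they are otherwise comparable.

```python
# Average, minimum and maximum latencies (seconds) copied from the two
# mysqlslap runs quoted in commit 8e730faec5.
before = {"avg": 0.041, "min": 0.037, "max": 0.115}
after = {"avg": 0.029, "min": 0.027, "max": 0.034}


def reduction_percent(old: float, new: float) -> float:
    """Return the latency reduction as a percentage of the old value."""
    return (old - new) / old * 100.0


def summarize(old_run: dict, new_run: dict) -> dict:
    """Compute the rounded percentage reduction for every metric."""
    return {k: round(reduction_percent(old_run[k], new_run[k]), 1) for k in old_run}


if __name__ == "__main__":
    for metric, pct in summarize(before, after).items():
        print(f"{metric}: {pct}% lower")
```

The maximum latency shrinks far more than the average, which suggests the change mostly trims the slow tail, although 500 iterations on one machine is a small sample.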
726e1c8c80 [opt](profile) add summary metric for file scanner #45941 (#46188)
bp #45941
2025-01-01 13:27:50 +08:00
e4adf9b931 [fix](routine load) replace heavy work pool with routine load thread pool for metadata fetching (#44907) (#46186)
pick #44907

In production, we encountered an issue where the librdkafka consumer
got stuck during destruction, saturating the heavy work pool and making
every feature that depends on it, such as querying, unusable. To limit
this impact, we replaced the heavy work pool with routine load threads
for metadata fetching.
2024-12-31 23:06:08 +08:00
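
The isolation idea behind commit e4adf9b931 can be sketched generically: if work that may hang runs in its own pool, it cannot starve a shared pool. This is a minimal Python illustration, not the Doris code; the pool names, function names, and pool sizes are hypothetical stand-ins.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: these pools and functions are illustrative only
# and do not correspond to Doris identifiers.
heavy_work_pool = ThreadPoolExecutor(max_workers=2)
routine_load_pool = ThreadPoolExecutor(max_workers=2)

release = threading.Event()


def fetch_kafka_metadata() -> str:
    """Simulate a metadata fetch whose consumer hangs until released."""
    release.wait()
    return "metadata"


def run_query() -> str:
    """Simulate query work that depends on the heavy work pool."""
    return "query ok"


def demo() -> str:
    # Saturate the dedicated pool with hung metadata fetches.
    stuck = [routine_load_pool.submit(fetch_kafka_metadata) for _ in range(2)]
    # The heavy work pool has free workers, so the query still completes.
    result = heavy_work_pool.submit(run_query).result(timeout=5)
    # Release the simulated fetches so the worker threads can finish.
    release.set()
    for future in stuck:
        future.result(timeout=5)
    return result


if __name__ == "__main__":
    print(demo())
```

Had the hung fetches been submitted to `heavy_work_pool` instead, `run_query` would have waited behind them until the timeout expired.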
9390c743c5 branch-2.1: [Fix](multi-catalog) Fix column mutate() crash replace it by assume_mutable(). #46151 (#46198)
Cherry-picked from #46151

Co-authored-by: Qi Chen <chenqi@selectdb.com>
2024-12-31 22:02:49 +08:00
4472648c07 [branch-2.1] pick workload group usage metrics (#46177)
pick #45284  #44870
2024-12-31 10:09:48 +08:00
df26475e1a [Enhancement](compaction) enable the compaction producer to generate multiple compaction tasks in a single run (#45411) (#46160)
pick master #45411
2024-12-31 09:51:43 +08:00
3d79955db3 branch-2.1: [fix](parquet-reader) Fixed the issue of excessive scanning data in late materialization case of parquet reader #46121 (#46183)
Cherry-picked from #46121

Co-authored-by: Qi Chen <chenqi@selectdb.com>
2024-12-31 07:30:49 +08:00
Pxl
43c646363e [Bug](runtime-filter) support ip rf and use exception to replace dcheck when PrimitiveType to PColumnType (#39985) (#41531)

use exception to replace dcheck when PrimitiveType to PColumnType
```
*** SIGABRT unknown detail explain (@0x11d3f) received by PID 73023 (TID 74292 OR 0x7fd758225640) from PID 73023; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
 1# 0x00007FDDBE6B9520 in /lib/x86_64-linux-gnu/libc.so.6
 2# pthread_kill at ./nptl/pthread_kill.c:89
 3# raise at ../sysdeps/posix/raise.c:27
 4# abort at ./stdlib/abort.c:81
 5# 0x000056123F81A94D in /root/output/be/lib/doris_be
 6# 0x000056123F80CF8A in /root/output/be/lib/doris_be
 7# google::LogMessage::SendToLog() in /root/output/be/lib/doris_be
 8# google::LogMessage::Flush() in /root/output/be/lib/doris_be
 9# google::LogMessageFatal::~LogMessageFatal() in /root/output/be/lib/doris_be
10# doris::to_proto(doris::PrimitiveType) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:114
11# doris::IRuntimeFilter::push_to_remote(doris::TNetworkAddress const*) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:1143
12# doris::IRuntimeFilter::publish(bool)::$_0::operator()(doris::IRuntimeFilter*) const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:959
13# doris::IRuntimeFilter::publish(bool)::$_2::operator()() const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:983
14# doris::IRuntimeFilter::publish(bool) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:997
```

## Proposed changes
pick from #39985
2024-12-30 20:56:11 +08:00
419456f3a9 branch-2.1: [fix](sort)fix merge sort may miss the limit #46072 (#46158)
Cherry-picked from #46072

Co-authored-by: Mryange <yanxuecheng@selectdb.com>
2024-12-30 20:02:24 +08:00
10ad255198 branch-2.1: [fix](memory) Fix purge jemalloc dirty page (#46146)
### What problem does this PR solve?

Fix purge jemalloc dirty page
2024-12-30 20:00:54 +08:00
7040abfb04 [fix](correctness) Fix operator initialization (#45728) (#46150)
Planning for local exchange now depends on operator initialization. This
PR fixes the incorrect ordering of those two steps.

pick #45728
2024-12-30 17:01:33 +08:00
1d742b5f7d [Cherry-pick](branch-2.1) Pick "[Enhancement](compaction) Do not set failure time when cumulative compaction dealing with delete rowset (#43466)" (#46117)
Before this PR, when rowsets alternated as data rowset -> delete rowset
-> data rowset -> delete rowset, cumulative compaction would only move
the cumulative point forward so that base compaction could handle the
delete rowset. Cumulative compaction itself would not process any data
and would be marked as a failure. This caused the compaction submission
task to pause for 5 seconds, reducing efficiency.

This PR modifies the return status to OK for such cases, which improves
the efficiency of the compaction submission task.
2024-12-30 10:18:57 +08:00
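
The effect described in commit 1d742b5f7d can be modeled with a toy submission loop. This is a sketch under stated assumptions: only the 5-second pause and the alternating rowset pattern come from the commit message, while the `Status` enum, the function names, and the `delete_is_failure` switch are hypothetical.

```python
from enum import Enum
from typing import Sequence

FAILURE_PAUSE_SECONDS = 5  # pause length taken from the commit message


class Status(Enum):
    OK = "ok"
    FAILED = "failed"


def cumulative_compaction(rowset: str, delete_is_failure: bool) -> Status:
    """Hypothetical model: a delete rowset only advances the cumulative point."""
    if rowset == "delete":
        return Status.FAILED if delete_is_failure else Status.OK
    return Status.OK


def total_pause(rowsets: Sequence[str], delete_is_failure: bool) -> int:
    """Sum the pauses a submission loop would take over the given rowsets."""
    paused = 0
    for rowset in rowsets:
        if cumulative_compaction(rowset, delete_is_failure) is Status.FAILED:
            paused += FAILURE_PAUSE_SECONDS
    return paused


if __name__ == "__main__":
    pattern = ["data", "delete", "data", "delete"]
    print("old behavior:", total_pause(pattern, delete_is_failure=True), "s paused")
    print("new behavior:", total_pause(pattern, delete_is_failure=False), "s paused")
```

In this model, every delete rowset in the alternating pattern costs one pause under the old status, and none once the status is OK.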
a380f5d222 [enhancement](utf8) import enable_text_validate_utf8 session var (#45537) (#46070)
bp #45537
2024-12-28 10:05:03 +08:00
4746e9e3a2 [opt](inverted index)Optimize code to get rid of heap use after free (#45745) (#46075)
bp #45745
2024-12-27 16:46:58 +08:00
a5f27c5b28 branch-2.1: [fix](scan) Fix scan with limit #46035 (#46091)
cherry pick from #46035
2024-12-27 16:41:58 +08:00
d2aa890887 branch-2.1: [fix](clone) Fix the skipped link file due to the stale value #46009 (#46068)
Cherry-picked from #46009

Co-authored-by: walter <maochuan@selectdb.com>
2024-12-27 16:30:16 +08:00
fcc4d0d451 [fix](inverted index) Modify Error Handling for File Open Failure (#45773)
https://github.com/apache/doris/pull/44551
2024-12-27 14:09:57 +08:00
d2c108726d [opt](bloomfilter index) optimize memory usage for bloom filter index writer #45833 (#46047)
cherry pick from #45833
2024-12-27 12:10:56 +08:00
df8bc8f23d branch-2.1: [fix](parquet) impl has_dict_page to replace old logic and fix write empty parquet row group bug #45740 (#45954)
Cherry-picked from #45740

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-12-26 15:17:49 +08:00
289d621faa [improvement](information_schema)Show view definition in information_schema.views. (#45857) (#45930)
backport: https://github.com/apache/doris/pull/45857
2024-12-26 10:11:13 +08:00
1cf6986cea [pick](branch-2.1) pick #44092 (#45836)
2024-12-25 23:11:19 +08:00
c94ac6c9f8 [branch-2.1](ORC) fix predicate filter failed when use hive 1.x version (#45809)
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #43185 

Pick the PR to branch-2.1 to fix the predicate filter failure when using
Hive 1.x.

Co-authored-by: fantasy12345zsq <1575033031@qq.com>
2024-12-25 18:56:57 +08:00
a6e4497811 [opt](inverted index) Add troubleshooting logs (#44182) (#45777)
https://github.com/apache/doris/pull/44182
2024-12-25 13:58:01 +08:00
547e88b1ee branch-2.1: [fix](csv reader) fix core dump when parsing csv with enclose #45485 (#45889)
Cherry-picked from #45485

Co-authored-by: hui lai <laihui@selectdb.com>
2024-12-25 12:09:20 +08:00
64195d79ee [refactor](metrics) Remove IntAtomicCounter & CoreLocal #45742 (#45870)
cherry pick from #45742
2024-12-24 23:13:48 +08:00
6d6473efae branch-2.1: [fix](tabletScheduler) Fix addTablet dead lock in tabletScheduler #45298 (#45769)
Cherry-picked from #45298

Co-authored-by: deardeng <dengxin@selectdb.com>
2024-12-24 21:44:30 +08:00
f0031d9954 [cherry-pick](branch-21)support posexplode table function (#43221) (#45783)
cherry-pick from master #43221
2024-12-24 21:42:30 +08:00
02f15a8ef0 [fix](inverted index) Fix Null Pointer Exception in function match(#45456)(#45774)
pick: https://github.com/apache/doris/pull/45456
2024-12-24 11:27:13 +08:00
Pxl
d70c17bdc0 [Improvement](scan) use a loop instead of recursion in Level1Iterator::_normal_next #38005 (#45767)
pick from  #38005
2024-12-23 13:54:12 +08:00
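
The recursion-to-loop change named in commit d70c17bdc0 follows a common pattern for iterators that walk a sequence of child iterators. The sketch below is a generic Python illustration of that pattern, not the Doris implementation; the class layout and the use of `None` as an end marker are assumptions made for the example.

```python
from typing import Iterator, List, Optional


class Level1Iterator:
    """Hypothetical two-level iterator: walks child iterators in order."""

    def __init__(self, children: List[Iterator[int]]):
        self._children = children
        self._pos = 0

    def next_recursive(self) -> Optional[int]:
        """Recursive form: one extra stack frame per exhausted child."""
        if self._pos >= len(self._children):
            return None  # all children are exhausted
        value = next(self._children[self._pos], None)
        if value is not None:
            return value
        self._pos += 1
        return self.next_recursive()

    def next_loop(self) -> Optional[int]:
        """Loop form: same result with constant stack depth."""
        while self._pos < len(self._children):
            value = next(self._children[self._pos], None)
            if value is not None:
                return value
            self._pos += 1  # child exhausted, move on to the next one
        return None
```

With a long run of empty children, the recursive form grows the call stack by one frame per empty child, while the loop form skips them in place.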
367ecc3292 [fix](expr)Remove the _can_fast_execute flag from VExpr. (#45542) (#45662)
2024-12-22 21:26:32 +08:00
39c69c766e [Optimize](Variant) optimize schema update performance (#45480) (#45731)
(#45480)
2024-12-21 23:41:03 +08:00
06efd5b4af [Opt](SegmentIterator) clear and release iterators memory footprint in advance when EOF (#44768) (#45734)
(#44768)
2024-12-20 20:38:56 +08:00
37c4de3cbf branch-2.1: [improve](variant) only sanitize in debug mode #45689 (#45698)
Cherry-picked from #45689

Co-authored-by: lihangyu <lihangyu@selectdb.com>
2024-12-20 14:25:49 +08:00
19c0e89da7 [enhancement](iceberg) support read iceberg partition evolution table. (#45367) (#45569)
cherry-pick #45367

Co-authored-by: daidai <changyuwei@selectdb.com>
2024-12-20 08:56:51 +08:00
c459ad7382 [fix](binlog) Avoid clear binlog dir #45581 (#45620)
cherry pick from #45581
2024-12-19 23:42:54 +08:00
9272c650b4 [Refactor](query) refactor lock in fragment mgr and change std::unordered_map to phmap (#45069)
### What problem does this PR solve?

Related PR: #44821
2024-12-19 22:27:33 +08:00
4b7c2eaa7d [branch-2.1](fix) fix incorrect result of hash join with const column (#45630)
2024-12-19 19:14:38 +08:00
d6c629d293 branch-2.1: [fix](tvf) Tvf supports to parse the enclose character in csv files #45407 (#45570)
Cherry-picked from #45407

Co-authored-by: Tiewei Fang <fangtiewei@selectdb.com>
2024-12-19 16:23:44 +08:00
7d32e4f71f branch-2.1: [Fix](ORC) Not push down fixed char type in orc reader #45484 (#45525)
cherry-pick #45484
2024-12-19 14:06:00 +08:00
eb67db3d25 branch-2.1: [feat](docker)Add a BE ENV item 'SKIP_CHECK_ULIMIT' for Docker to start quickly #45267 (#45468)
Cherry-picked from #45267

Co-authored-by: FreeOnePlus <54164178+FreeOnePlus@users.noreply.github.com>
2024-12-19 09:31:41 +08:00
Pxl
fe0cc289de [Bug](function) fix wrong result on group_concat with distinct+order_… (#45513)
2024-12-18 22:49:18 +08:00
59b3760fdd branch-2.1: [opt](join) Check the property of nullable from intermediate row #45017 (#45476)
Cherry-picked from #45017

Co-authored-by: Jerry Hu <hushenggang@selectdb.com>
2024-12-18 22:40:17 +08:00
02feb16530 branch-2.1: [bug](s3) fix S3 file system gets absolute path #44965 (#45529)
Cherry-picked from https://github.com/apache/doris/pull/44965
2024-12-18 22:29:24 +08:00
855e9a508c [fix](catalog) opt the count pushdown rule for iceberg/paimon/hive scan node (#44038) (#45564)
bp #44038
2024-12-18 09:54:56 +08:00
01684ce3b1 branch-2.1: [fix](mysql-buffer) fix special buffer size with nested type #45126 (#45458)
Cherry-picked from #45126

Co-authored-by: amory <wangqiannan@selectdb.com>
2024-12-17 17:48:10 +08:00
Pxl
900086667f [Chore](pipeline) catch exception on task::prepare to avoid exception make backend coredump #45479 (#45516)
2024-12-17 17:22:05 +08:00
Pxl
7856662ecf [Bug](pipeline) make sink operator process eos signals after wake_up_early #45207 (#45400)
2024-12-17 17:21:42 +08:00
191ef9b8b0 branch-2.1: [fix](schema-change) fix array/map/struct in schema-change not-null to null will make core #45305 (#45482)
Cherry-picked from #45305

Co-authored-by: amory <wangqiannan@selectdb.com>
2024-12-17 17:05:17 +08:00
8dc845671f [test](load) injection cases should check Exception is thrown (#44713) (#45321)
backport #44713
2024-12-17 14:37:27 +08:00