Commit Graph

8664 Commits

Author SHA1 Message Date
03a3f37cc4 [fix](index meta) make has_inverted_index function more robust #46364 (#46428)
cherry pick from #46364
2025-01-07 09:36:22 +08:00
1405c48a77 [opt](coordinator) optimize parallel degree of shuffle when use nereids (#44754) (#46397)
cherry pick from #44754
2025-01-04 19:04:17 +08:00
db224ba15f [fix](variant) fix schema change for variant from not null to null (#46403)
cherry-pick from #46279
2025-01-04 09:00:43 +08:00
f93452dba7 branch-2.1: [improve](move-memtable) disable stack trace in load stream reply #46318 (#46332)
Cherry-picked from #46318

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
2025-01-03 22:01:21 +08:00
08cb06396b branch-2.1: [Fix](index build) should not append index to meta while column id is -1 #46307 (#46345)
cherry pick from #46307

Co-authored-by: airborne12 <jiangkai@selectdb.com>
2025-01-03 21:59:55 +08:00
384b78fa4e [Fix]delete internal group (#46351) 2025-01-03 21:59:36 +08:00
333b54eaba [fix](ip) fix ip nullable param without check (#44700) (#46252)
Calling the `ipv6_cidr_to_range` function with a nullable argument that
holds an invalid IPv6 value makes the BE core dump:
```
mysql> select id, ipv6_cidr_to_range(nullable(''), 32) from fn_test_ip_nullable order by id;
```
2025-01-03 21:24:37 +08:00
18846751d5 branch-2.1: [Bug](scan) do not release tablet_reader on NewOlapScanner::close #46296 (#46355)
Cherry-picked from #46296

Co-authored-by: Pxl <xl@selectdb.com>
2025-01-03 16:26:33 +08:00
34f9072ab6 [Improvement](local shuffle) Reduce locking scope in local exchanger … (#46294)
…(#46251)

Reduce lock scope from global level to data queue level.
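
The narrower lock scope can be illustrated with a minimal C++ sketch. The names `DataQueue` and `LocalExchangerSketch` are hypothetical and are not taken from the Doris source, and `int` stands in for a real data block. The only point shown is that each data queue owns its own mutex, so writers to different queues no longer contend on one global lock.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical sketch: one mutex per data queue instead of one global mutex.
struct DataQueue {
    std::mutex mtx;          // guards only this queue
    std::deque<int> blocks;  // stand-in for real data blocks
};

class LocalExchangerSketch {
public:
    // The vector is sized once here and never resized afterwards.
    explicit LocalExchangerSketch(std::size_t num_queues) : _queues(num_queues) {}

    // Locks only the target queue; pushes to other queues can run in parallel.
    void push(std::size_t channel, int block) {
        assert(channel < _queues.size());
        DataQueue& q = _queues[channel];
        std::lock_guard<std::mutex> guard(q.mtx);
        q.blocks.push_back(block);
    }

    // Writes the oldest block of the queue to *out and returns true,
    // or returns false when the queue is empty.
    bool pop(std::size_t channel, int* out) {
        assert(channel < _queues.size());
        DataQueue& q = _queues[channel];
        std::lock_guard<std::mutex> guard(q.mtx);
        if (q.blocks.empty()) {
            return false;
        }
        *out = q.blocks.front();
        q.blocks.pop_front();
        return true;
    }

private:
    std::vector<DataQueue> _queues;
};
```

With a single global mutex, a `push` to queue 0 would block a concurrent `pop` from queue 1; in this sketch the two calls take different locks.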
2025-01-03 09:14:22 +08:00
a12fa2881f [fix](metrics) Fix negative scanner cnt (#46291)
### What problem does this PR solve?
Fix the negative scanner count:

![image](https://github.com/user-attachments/assets/c78e4978-2292-46c2-8cc1-59da4c05cf41)

Related PR: https://github.com/apache/doris/pull/41314

Scanner has two constructors on branch-2.1 but only one on master, so the
cherry-pick introduced a bug.
2025-01-03 09:07:06 +08:00
7206ca39ec [fix](DECIMAL) error DECIMAL cat to BOOLEAN (#44326) (#46275)
In the past, there were issues with converting `double` and `decimal` to
`boolean`.
For example, a `double` value like 0.13 would first be cast to `uint8`,
resulting in 0.
Then, it would be converted to `bool`, yielding 0 (incorrect, as the
expected result is 1).

Similarly, `decimal` values were directly cast to `uint8`, leading to
non-0/1 values for `bool`.
This issue arises because Doris internally uses `uint8` to represent
`boolean`.

before
```
mysql> select cast(40.123 as BOOLEAN);
+-------------------------+
| cast(40.123 as BOOLEAN) |
+-------------------------+
|                      40 |
+-------------------------+
```

now
```
mysql> select cast(40.123 as BOOLEAN);
+-------------------------+
| cast(40.123 as BOOLEAN) |
+-------------------------+
|                       1 |
+-------------------------+
```
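
The difference between the two behaviours can be reproduced in a few lines of C++. This is a sketch under stated assumptions, not the Doris implementation: the function names are hypothetical, and `double` is used for both cases because the message says the `decimal` path narrowed to `uint8` in the same way.

```cpp
#include <cassert>
#include <cstdint>

// Old behaviour described above: narrow the value to uint8 first.
// The fractional part is dropped, so 0.13 becomes 0 and 40.123 becomes 40.
// Only call this with values in [0, 256); other inputs are undefined behaviour.
inline std::uint8_t cast_to_bool_via_uint8(double value) {
    return static_cast<std::uint8_t>(value);
}

// Fixed behaviour: compare against zero so the result is always 0 or 1.
inline std::uint8_t cast_to_bool_by_compare(double value) {
    return value != 0.0 ? 1 : 0;
}

// Exercises both helpers with the values used in the commit message.
inline void check_examples() {
    assert(cast_to_bool_via_uint8(0.13) == 0);     // wrong: 0.13 is non-zero
    assert(cast_to_bool_via_uint8(40.123) == 40);  // wrong: not a 0/1 value
    assert(cast_to_bool_by_compare(0.13) == 1);
    assert(cast_to_bool_by_compare(40.123) == 1);
    assert(cast_to_bool_by_compare(0.0) == 0);
}
```

Comparing against zero keeps the stored `uint8` within the 0/1 range that a `boolean` column expects.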

### What problem does this PR solve?

pick #44326
Related PR: #[44326](https://github.com/mrhhsg/doris/tree/pick_44326)

2025-01-02 18:42:04 +08:00
2c7e5cae9a branch-2.1: [fix](catalog) Fix mark handling for failed tasks to maintain getLeftMarks #46205 (#46256)
cherry pick from #46205
2025-01-02 16:39:43 +08:00
df6935ad04 [Improvement](fragment) Improvement map performance in fragment mgr (#46245)
pick #46235
2025-01-02 15:11:56 +08:00
5deeb42f6d branch-2.1: [opt] Optimization for short circuit of CompoundPred #45422 (#46241)
cherry pick from #45422
2025-01-02 10:11:47 +08:00
8e730faec5 [Exec](expr) Opt the compound pred performace (#45414) (#46232)
cherry-pick #45414

before:
```
 mysqlslap -hd3 -uroot -P9130  --create-schema=test_db2 -c 10 -i 500 -q "SELECT count(k) FROM sbtest1_dup WHERE k BETWEEN 4850578 AND 8454295 OR k BETWEEN 8776291 AND 29749077;"
Benchmark
	Average number of seconds to run all queries: 0.041 seconds
	Minimum number of seconds to run all queries: 0.037 seconds
	Maximum number of seconds to run all queries: 0.115 seconds
	Number of clients running queries: 10
	Average number of queries per client: 1
```

after:
```
mysqlslap -hd3 -uroot -P9030  --create-schema=test_db -c 10 -i 500 -q "SELECT count(k) FROM sbtest1 WHERE k BETWEEN 4850578 AND 8454295 OR k BETWEEN 8776291 AND 29749077;"
Benchmark
	Average number of seconds to run all queries: 0.029 seconds
	Minimum number of seconds to run all queries: 0.027 seconds
	Maximum number of seconds to run all queries: 0.034 seconds
	Number of clients running queries: 10
	Average number of queries per client: 1
```

2025-01-01 16:19:42 +08:00
726e1c8c80 [opt](profile) add summary metric for file scanner #45941 (#46188)
bp #45941
2025-01-01 13:27:50 +08:00
e4adf9b931 [fix](routine load) replace heavy work pool with routine load thread pool for metadata fetching (#44907) (#46186)
pick #44907

In production, we encountered an issue where the librdkafka consumer got
stuck during destruction and saturated the heavy work pool, which in turn
made every feature that depends on the heavy work pool, such as querying,
unusable. To mitigate this impact, we replaced the heavy work pool with
routine load threads for metadata fetching.
2024-12-31 23:06:08 +08:00
9390c743c5 branch-2.1: [Fix](multi-catalog) Fix column mutate() crash replace it by assume_mutable(). #46151 (#46198)
Cherry-picked from #46151

Co-authored-by: Qi Chen <chenqi@selectdb.com>
2024-12-31 22:02:49 +08:00
4472648c07 [branch-2.1] pick workload group usage metrics (#46177)
pick #45284  #44870
2024-12-31 10:09:48 +08:00
df26475e1a [Enhancement](compaction) enable the compaction producer to generate multiple compaction tasks in a single run (#45411) (#46160)
pick master #45411
2024-12-31 09:51:43 +08:00
3d79955db3 branch-2.1: [fix](parquet-reader) Fixed the issue of excessive scanning data in late materialization case of parquet reader #46121 (#46183)
Cherry-picked from #46121

Co-authored-by: Qi Chen <chenqi@selectdb.com>
2024-12-31 07:30:49 +08:00
Pxl
43c646363e [Bug](runtime-filter) support ip rf and use exception to replace dche… (#41531)
…ck when PrimitiveType to PColumnType (#39985)

Use an exception instead of a DCHECK when converting PrimitiveType to PColumnType. The DCHECK aborted the BE with the following stack trace:
```cpp
*** SIGABRT unknown detail explain (@0x11d3f) received by PID 73023 (TID 74292 OR 0x7fd758225640) from PID 73023; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
 1# 0x00007FDDBE6B9520 in /lib/x86_64-linux-gnu/libc.so.6
 2# pthread_kill at ./nptl/pthread_kill.c:89
 3# raise at ../sysdeps/posix/raise.c:27
 4# abort at ./stdlib/abort.c:81
 5# 0x000056123F81A94D in /root/output/be/lib/doris_be
 6# 0x000056123F80CF8A in /root/output/be/lib/doris_be
 7# google::LogMessage::SendToLog() in /root/output/be/lib/doris_be
 8# google::LogMessage::Flush() in /root/output/be/lib/doris_be
 9# google::LogMessageFatal::~LogMessageFatal() in /root/output/be/lib/doris_be
10# doris::to_proto(doris::PrimitiveType) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:114
11# doris::IRuntimeFilter::push_to_remote(doris::TNetworkAddress const*) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:1143
12# doris::IRuntimeFilter::publish(bool)::$_0::operator()(doris::IRuntimeFilter*) const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:959
13# doris::IRuntimeFilter::publish(bool)::$_2::operator()() const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:983
14# doris::IRuntimeFilter::publish(bool) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:997
```
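
A minimal sketch of the idea follows. The enum types, their members, and the function names are hypothetical stand-ins, not the real Doris definitions; the sketch only shows an unsupported type raising a catchable exception instead of reaching a fatal check that aborts the process.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the real enums; the member names are illustrative.
enum class PrimitiveTypeSketch { kInt, kString, kIPv4, kUnsupported };
enum class PColumnTypeSketch { kInt, kString, kIPv4 };

// Throws on an unsupported type instead of aborting, so the caller can
// report a query error while the process keeps running.
inline PColumnTypeSketch to_proto_sketch(PrimitiveTypeSketch type) {
    switch (type) {
    case PrimitiveTypeSketch::kInt:
        return PColumnTypeSketch::kInt;
    case PrimitiveTypeSketch::kString:
        return PColumnTypeSketch::kString;
    case PrimitiveTypeSketch::kIPv4:
        return PColumnTypeSketch::kIPv4;
    default:
        throw std::invalid_argument("unsupported primitive type: " +
                                    std::to_string(static_cast<int>(type)));
    }
}

// Returns true when the conversion is supported, false when it throws.
inline bool can_convert(PrimitiveTypeSketch type) {
    try {
        (void)to_proto_sketch(type);
        return true;
    } catch (const std::invalid_argument&) {
        return false;
    }
}
```

A caller would typically catch the exception at the query boundary and convert it into an error status for that query.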

## Proposed changes
pick from #39985
2024-12-30 20:56:11 +08:00
419456f3a9 branch-2.1: [fix](sort)fix merge sort may miss the limit #46072 (#46158)
Cherry-picked from #46072

Co-authored-by: Mryange <yanxuecheng@selectdb.com>
2024-12-30 20:02:24 +08:00
10ad255198 branch-2.1: [fix](memory) Fix purge jemalloc dirty page (#46146)
### What problem does this PR solve?

Fix purge jemalloc dirty page
2024-12-30 20:00:54 +08:00
7040abfb04 [fix](correctness) Fix operator initialization (#45728) (#46150)
Planning for local exchange now depends on operator initialization. This
PR fixes the wrong order between those two steps.

pick #45728
2024-12-30 17:01:33 +08:00
1d742b5f7d [Cherry-pick](branch-2.1) Pick "[Enhancement](compaction) Do not set failure time when cumulative compaction dealing with delete rowset (#43466)" (#46117)
Before this PR, when rowsets alternated as data rowset -> delete rowset ->
data rowset -> delete rowset, cumulative compaction would only move the
cumulative point forward so that base compaction could handle the delete
rowset. Cumulative compaction itself would not process any data and would
be marked as a failure. This caused the compaction submission task to
pause for 5 seconds, impacting efficiency.

This PR modifies the return status to OK for such cases, which improves
the efficiency of the compaction submission task.
2024-12-30 10:18:57 +08:00
a380f5d222 [enchement](utf8)import enable_text_validate_utf8 session var (#45537) (#46070)
bp #45537
2024-12-28 10:05:03 +08:00
4746e9e3a2 [opt](inverted index)Optimize code to get rid of heap use after free (#45745) (#46075)
bp #45745
2024-12-27 16:46:58 +08:00
a5f27c5b28 branch-2.1: [fix](scan) Fix scan with limit #46035 (#46091)
cherry pick from #46035
2024-12-27 16:41:58 +08:00
d2aa890887 branch-2.1: [fix](clone) Fix the skipped link file due to the stale value #46009 (#46068)
Cherry-picked from #46009

Co-authored-by: walter <maochuan@selectdb.com>
2024-12-27 16:30:16 +08:00
fcc4d0d451 [fix](inverted index) Modify Error Handling for File Open Failure (#45773)
https://github.com/apache/doris/pull/44551
2024-12-27 14:09:57 +08:00
d2c108726d [opt](bloomfilter index) optimize memory usage for bloom filter index writer #45833 (#46047)
cherry pick from #45833
2024-12-27 12:10:56 +08:00
df8bc8f23d branch-2.1: [fix](parquet) impl has_dict_page to replace old logic and fix write empty parquet row group bug #45740 (#45954)
Cherry-picked from #45740

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-12-26 15:17:49 +08:00
289d621faa [improvement](information_schema)Show view definition in information_schema.views. (#45857) (#45930)
backport: https://github.com/apache/doris/pull/45857
2024-12-26 10:11:13 +08:00
1cf6986cea [pick](branch-2.1) pick #44092 (#45836) 2024-12-25 23:11:19 +08:00
c94ac6c9f8 [branch-2.1](ORC) fix predicate filter failed when use hive 1.x version (#45809)
### What problem does this PR solve?

Related PR: #43185 

Pick the PR to branch-2.1 to fix the predicate filter failure when using
Hive 1.x.

Co-authored-by: fantasy12345zsq <1575033031@qq.com>
2024-12-25 18:56:57 +08:00
a6e4497811 [opt](inverted index) Add troubleshooting logs (#44182) (#45777)
https://github.com/apache/doris/pull/44182
2024-12-25 13:58:01 +08:00
547e88b1ee branch-2.1: [fix](csv reader) fix core dump when parsing csv with enclose #45485 (#45889)
Cherry-picked from #45485

Co-authored-by: hui lai <laihui@selectdb.com>
2024-12-25 12:09:20 +08:00
64195d79ee [refactor](metrics) Remove IntAtomicCounter & CoreLocal #45742 (#45870)
cherry pick from #45742
2024-12-24 23:13:48 +08:00
6d6473efae branch-2.1: [fix](tabletScheduler) Fix addTablet dead lock in tabletScheduler #45298 (#45769)
Cherry-picked from #45298

Co-authored-by: deardeng <dengxin@selectdb.com>
2024-12-24 21:44:30 +08:00
f0031d9954 [cherry-pick](branch-21)support posexplode table function (#43221) (#45783)
cherry-pick from master #43221
2024-12-24 21:42:30 +08:00
02f15a8ef0 [fix](inverted index) Fix Null Pointer Exception in function match(#45456)(#45774)
pick: https://github.com/apache/doris/pull/45456
2024-12-24 11:27:13 +08:00
Pxl
d70c17bdc0 [Improvement](scan) use loop to instead recursion on Level1Iterator::_normal_next #38005 (#45767)
pick from  #38005
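
The general technique named in the title, replacing recursion with a loop, can be sketched as follows. `Level1IteratorSketch` is a hypothetical two-level iterator over plain vectors and does not reproduce the real `Level1Iterator`; it only contrasts a recursive `next` that adds one stack frame per exhausted child with a loop that runs at constant stack depth.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical two-level iterator: walks a list of child sequences in order.
class Level1IteratorSketch {
public:
    explicit Level1IteratorSketch(std::vector<std::vector<int>> children)
            : _children(std::move(children)) {}

    // Recursive form: every exhausted child costs one more stack frame
    // unless the compiler happens to optimize the tail call.
    bool next_recursive(int* out) {
        if (_child == _children.size()) {
            return false;
        }
        if (_pos < _children[_child].size()) {
            *out = _children[_child][_pos++];
            return true;
        }
        ++_child;
        _pos = 0;
        return next_recursive(out);
    }

    // Loop form: same results, constant stack depth.
    bool next_loop(int* out) {
        while (_child < _children.size()) {
            if (_pos < _children[_child].size()) {
                *out = _children[_child][_pos++];
                return true;
            }
            ++_child;
            _pos = 0;
        }
        return false;
    }

private:
    std::vector<std::vector<int>> _children;
    std::size_t _child = 0;  // index of the current child sequence
    std::size_t _pos = 0;    // position inside the current child
};
```

Both methods yield the same sequence of values; the loop form stays safe when many consecutive children are empty.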
2024-12-23 13:54:12 +08:00
367ecc3292 [fix](expr)Remove the _can_fast_execute flag from VExpr. (#45542) (#45662) 2024-12-22 21:26:32 +08:00
39c69c766e [Optimize](Variant) optimize schema update performance (#45480) (#45731)
(#45480)
2024-12-21 23:41:03 +08:00
06efd5b4af [Opt](SegmentIterator) clear and release iterators memory footprint in advance when EOF (#44768) (#45734)
(#44768)
2024-12-20 20:38:56 +08:00
37c4de3cbf branch-2.1: [improve](variant) only sanitize in debug mode #45689 (#45698)
Cherry-picked from #45689

Co-authored-by: lihangyu <lihangyu@selectdb.com>
2024-12-20 14:25:49 +08:00
19c0e89da7 [enchement](iceberg)support read iceberg partition evolution table. (#45367) (#45569)
cherry-pick #45367

Co-authored-by: daidai <changyuwei@selectdb.com>
2024-12-20 08:56:51 +08:00
c459ad7382 [fix](binlog) Avoid clear binlog dir #45581 (#45620)
cherry pick from #45581
2024-12-19 23:42:54 +08:00
9272c650b4 [Refactor](query) refactor lock in fragment mgr and change std::unorder_map to phmap (#45069)
### What problem does this PR solve?

Related PR: #44821
2024-12-19 22:27:33 +08:00