Commit Graph

8782 Commits

Author SHA1 Message Date
6430ff365d [Bug](partition) should not do reset for the partition_sorts (#49148)
### What problem does this PR solve?
Problem Summary:

The entries in partition_sorts are unique_ptrs, so they are released
automatically on destruction and need no explicit reset. In extreme cases
such as a cancel, if the source resets a sorter early, the sink operator
may still be using it and will core dump.
2025-03-18 18:03:29 +08:00
16e348b189 [fix](array/map) Fix BE crash in lambda functions (#49139) 2025-03-18 11:51:38 +08:00
afb143c1e8 [improve](binlog) Add config to control whether enable persistent connection during ingesting (#49005)
Cherry-pick #48467, #48761
2025-03-15 10:45:53 +08:00
e5a2b0eea8 Revert "[cherry-pick](jsonb) add a check for jsonb value to avoid invalid jsonb value write into segment file " (#49058)
Reverts apache/doris#48729
Temporarily revert this PR because
PartialUpdateInfo::_generate_default_values_for_missing_cids uses an empty
string, which makes this check fail.
2025-03-14 17:41:06 +08:00
ad6cf63a28 branch-2.1: [opt](inverted index) uniform profile naming convention #48826 (#48975)
Cherry-picked from #48826

Co-authored-by: zzzxl <yangsiyu@selectdb.com>
2025-03-14 14:04:46 +08:00
79595ad62f [fix](minor) Reorder incorrect logging (#49001) 2025-03-14 12:03:34 +08:00
c989ac9467 branch-2.1: [chore](binlog) add ingesting/downloading binlog latency metrics #48599 (#49002)
cherry pick from #48599
2025-03-14 11:22:23 +08:00
7d521ce288 branch-2.1: [fix](binlog) avoid adding acqurie_md5 param when enable_download_md5… #48573 (#49004)
cherry pick from #48573
2025-03-14 11:18:51 +08:00
ed2e1ac34a branch-2.1: [fix](variant) update least common type in ColumnObject::pop_back #48935 (#48979)
Cherry-picked from #48935

Co-authored-by: Sun Chenyang <sunchenyang@selectdb.com>
2025-03-13 17:41:17 +08:00
ef4101b52a branch-2.1: [fix](cold hot separation) Fix the issue that files on the remote storage are not deleted after triggering cold data compaction. (#48109) (#48446)
Auto pick #48109 branch 2.1
2025-03-12 15:37:00 +08:00
e455bceb91 [fix](function) fix error result when STR_TO_DATE input all space (#4… (#48920)
…8872)
https://github.com/apache/doris/pull/48872
before
```
mysql> select STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s');
+-----------------------------------------+
| STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s') |
+-----------------------------------------+
|                                         |
+-----------------------------------------+
```
now
```
mysql> select STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s');
+-----------------------------------------+
| STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s') |
+-----------------------------------------+
| NULL                                    |
+-----------------------------------------+
```
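
The before/now behavior above can be sketched in a few lines. This is a minimal illustration, not the Doris implementation: the function name `str_to_date`, the specifier table, and the choice to return `None` on a parse failure are all assumptions made for the example.

```python
from datetime import datetime
from typing import Optional

# Hypothetical translation table: only the MySQL-style specifiers from the
# example above that differ from Python's strptime are listed.
_MYSQL_TO_PYTHON = {"%i": "%M", "%s": "%S"}


def str_to_date(value: str, mysql_format: str) -> Optional[datetime]:
    """Parse `value`, returning None (SQL NULL) for all-space input."""
    stripped = value.strip()
    if not stripped:
        # The fixed behavior: whitespace-only input yields NULL,
        # not an empty result.
        return None
    python_format = mysql_format
    for mysql_spec, python_spec in _MYSQL_TO_PYTHON.items():
        python_format = python_format.replace(mysql_spec, python_spec)
    try:
        return datetime.strptime(stripped, python_format)
    except ValueError:
        # Assumption for this sketch: unparsable input also maps to NULL.
        return None
```

Calling `str_to_date('  ', '%Y-%m-%d %H:%i:%s')` takes the early-return branch, which mirrors the `NULL` row in the "now" output.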

Problem Summary:

None

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->

2025-03-11 19:30:38 +08:00
bc3e93a8a9 Revert "branch-2.1: [bug](auto partition) Fix be crash with single replica insert" (#48926)
Reverts apache/doris#48536

BE cores after this PR, so revert it.
*** Check failure stack trace: ***
    @     0x564fa82e5606  google::LogMessage::SendToLog()
    @     0x564fa82e2050  google::LogMessage::Flush()
    @     0x564fa82e5e49  google::LogMessageFatal::~LogMessageFatal()
    @     0x564fa9dcb44a  (unknown)
    @     0x564fa8a19e05  google::protobuf::internal::LogMessage::Finish()
    @     0x564f9e68d49e  google::protobuf::Map<>::at<>()
    @     0x564f9e68b805  doris::TabletsChannel::_commit_txn()
    @     0x564f9e68b20b  doris::TabletsChannel::close()
    @     0x564f9e591fee  doris::LoadChannel::_handle_eos()
    @     0x564f9e591ca2  doris::LoadChannel::add_batch()
    @     0x564f9e58c300  doris::LoadChannelMgr::add_batch()
    @     0x564f9e702bb1  std::_Function_handler<>::_M_invoke()
    @     0x564f9e71d3db  doris::WorkThreadPool<>::work_thread()
    @     0x564fab0ad760  execute_native_thread_routine
    @     0x7efd907a8ac3  (unknown)
    @     0x7efd9083a850  (unknown)
    @              (nil)  (unknown)
2025-03-11 17:44:34 +08:00
1462dc50d4 branch-2.1: [bug](auto partition) Fix be crash with single replica insert (#48536)
### What problem does this PR solve?

pick:#48101
2025-03-11 12:09:58 +08:00
4dbf3eb12b branch-2.1-pick: [Fix](merge-on-write) should re-calculate delete bitmaps between segments if BE restart before publish (#48775) (#48873)
pick https://github.com/apache/doris/pull/48775
2025-03-11 12:05:45 +08:00
7ca00c114b [fix](load) add lock for runtime_state->tablet_commit_infos (#48709) (#48850)
backport #48709
2025-03-11 12:03:39 +08:00
22f1cd1f64 [Improvement](local exchange) Reuse memory in PassToOneExchanger (#48745)
### What problem does this PR solve?

pick #39031

2025-03-10 09:49:26 +08:00
e957a2f94c [improve](ipv6) Enhance ipv6 type to accept uint128 strings in network byte order (#48799) 2025-03-09 00:22:42 +08:00
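
As a rough illustration of what the IPv6 change above (e957a2f94c) describes, the sketch below converts a uint128 string into an IPv6 address. It assumes that "uint128 string" means the decimal text of a 128-bit unsigned integer, and the helper name `ipv6_from_uint128_string` is hypothetical; the commit title does not confirm either detail.

```python
import ipaddress


def ipv6_from_uint128_string(text: str) -> ipaddress.IPv6Address:
    """Interpret a decimal uint128 string as an IPv6 address."""
    value = int(text)
    if not 0 <= value < 2**128:
        raise ValueError(f"not a uint128 value: {text!r}")
    # Network byte order is big-endian: the most significant byte comes first.
    packed = value.to_bytes(16, "big")
    return ipaddress.IPv6Address(packed)
```

For example, the string `"1"` maps to the loopback address `::1`, because only the least significant of the 16 bytes is set.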
Pxl
98c782bedb [Chore](case) pick FragmentMgr::send_filter_size.return_eof/RuntimeFilterProducer::send_size.rpc_fail to 2.1 (#48817)
part of https://github.com/apache/doris/pull/48225
2025-03-08 16:21:22 +08:00
48e3a73118 [fix](cancel) Fix cancel failure (#48751)
### What problem does this PR solve?

If a query is canceled before fragment contexts are prepared, no
fragment context will be found. So we should set execution ready to
ensure tasks will not be blocked.
2025-03-07 11:10:58 +08:00
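
The reasoning in the cancel fix above (48e3a73118) can be sketched as follows. This is a simplified model under stated assumptions, not the BE code: `QueryContext`, `execution_ready`, and `run_task` are hypothetical names, and a `threading.Event` stands in for whatever readiness mechanism Doris actually uses.

```python
import threading


class QueryContext:
    """Hypothetical per-query state shared by tasks (not a Doris class)."""

    def __init__(self):
        self.execution_ready = threading.Event()
        self.cancelled = False
        self.fragment_contexts = []

    def prepare(self, fragments):
        # Normal path: fragment contexts exist, then execution is marked ready.
        self.fragment_contexts.extend(fragments)
        self.execution_ready.set()

    def cancel(self):
        self.cancelled = True
        # The point of the fix: cancel may arrive before prepare(), when no
        # fragment context can be found, so readiness is set here as well.
        self.execution_ready.set()


def run_task(ctx, timeout=1.0):
    """Wait until execution is ready, then report what the task would do."""
    if not ctx.execution_ready.wait(timeout):
        return "blocked"
    return "cancelled" if ctx.cancelled else "ran"
```

Without the `set()` call in `cancel()`, a task started for a query that was canceled before preparation would wait until the timeout and report `"blocked"`.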
3f684f2899 branch-2.1:[fix] (inverted index) Fix UTF-8 4-byte truncation issue and add configuration to control correct term writing (#48657) (#48741)
Cherry-picked from #48657
2025-03-06 21:28:24 +08:00
7b2899a7ff [cherry-pick](jsonb) add a check for jsonb value to avoid invalid jsonb value write into segment file (#48729)
…ke select core (#48625)

Fix invalid jsonb values being written into the segment file, which makes
select core. We add a check in convert_to_olap for every jsonb value that
will be written into the segment file.
2025-03-06 15:50:35 +08:00
d61737f04b branch-2.1: [fix](function) fix the function elt #48701 (#48719)
Cherry-picked from #48701

Co-authored-by: Sun Chenyang <sunchenyang@selectdb.com>
2025-03-06 11:39:54 +08:00
c9a299e914 [fix](columns) fix bug found by UT and add regression test (#48554) (#48690)
### What problem does this PR solve?

Related PR:  Pick #48554

Problem Summary:

### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
2025-03-06 09:33:33 +08:00
03514b476f Revert "branch-2.1: [fix](inverted index) fix wrong read data for primary key #47841 (#48207)" (#48684)
This reverts commit #48207 for branch2.1 and #47841 for master

2025-03-05 18:51:54 +08:00
84c638ee68 [opt](Inverted index)Avoid repeated calculations of suffix paths (#48137) (#48155)
bp #48137
2025-03-05 17:56:16 +08:00
fa9c05a54a [fix](inverted index) Fix for Inaccurate match_phrase_prefix Cache in Query Processing (#48604)
https://github.com/apache/doris/pull/46310
2025-03-05 16:01:28 +08:00
1b33f8cfd6 branch-2.1: [fix](hudi) Set Spark Hudi JNI scanner as default (#48602) (#48606)
### What problem does this PR solve?
Related PR: #45041 

Problem Summary:
If we set hudi_jni_scanner to an incorrect value, jni_connector will be
null, causing a core dump.
So we set the Spark Hudi JNI scanner as the default; the Hadoop Hudi JNI
reader will be supported in the future.
2025-03-05 14:02:46 +08:00
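
The Hudi fix above (1b0bb4010b) comes down to never leaving the connector unset when the configured scanner name is wrong. A minimal sketch of that idea follows; the registry, the key `"spark"`, and the function `create_scanner` are hypothetical and only illustrate the fallback, not how the BE selects its scanner.

```python
# Hypothetical registry of available scanner factories, keyed by config value.
SCANNER_FACTORIES = {
    "spark": lambda: "SparkHudiJniScanner",
}
DEFAULT_SCANNER = "spark"


def create_scanner(configured_name: str) -> str:
    """Return a scanner for the configured name, falling back to the default.

    Returning None for an unknown name is the failure mode described above:
    the caller would later dereference a missing connector.
    """
    factory = SCANNER_FACTORIES.get(configured_name)
    if factory is None:
        factory = SCANNER_FACTORIES[DEFAULT_SCANNER]
    return factory()
```

With this shape, a typo in the configuration value still produces a usable scanner instead of a null connector.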
1b0bb4010b [Enhancement-2.1](log) Reduce INFO log size by changing some routine query log to VLOG (#48293)
### What problem does this PR solve?

Problem Summary:

These logs are only useful when a query hangs in BE. When that happens,
we can change the VLOG level dynamically.

### Release note

Reduce query-side INFO log quantity

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [x] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [x] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
2025-03-05 10:18:21 +08:00
be6210d742 branch-2.1: [improve](load) improve error message "unknown load_id" #47509 (#48639)
Cherry-picked from #47509

Co-authored-by: Kaijie Chen <chenkaijie@selectdb.com>
2025-03-05 10:11:24 +08:00
74b85d6bff branch-2.1-pick: [compaction](config) Add a config to control whether to prune rows with delete sign=1 in base compaction (#48241) (#48620)
pick https://github.com/apache/doris/pull/48241
2025-03-05 10:10:58 +08:00
fa4b901ec4 branch-2.1: [chore](http) add HttpClient::execute debug log #48595 (#48619)
Cherry-picked from #48595

Co-authored-by: walter <maochuan@selectdb.com>
2025-03-05 10:03:05 +08:00
621944d487 [InvertedIndex](Variant) support inverted index for array type in variant (#48594)
cherry-pick from #47688
2025-03-05 10:02:13 +08:00
08e7d920db branch-2.1: [fix](index build) Correct inverted index behavior after dynamically adding a column #48389 (#48546)
Cherry-picked from #48389

---------

Co-authored-by: airborne12 <jiangkai@selectdb.com>
2025-03-05 09:26:54 +08:00
Pxl
548c79f336 [Improvement](column) add sanity check and add some fix for ColumnString #47964 (#48512)
pick part of #47964
2025-03-04 21:47:16 +08:00
1ccd879e3d branch-2.1: [fix](serde)fix arrow serde with no value into column #48053 (#48097)
Cherry-picked from #48053

Co-authored-by: amory <wangqiannan@selectdb.com>
2025-03-04 21:37:52 +08:00
c9381b0285 [fix](load) Fix import failure when the stream load parameter specifies Transfer-Encoding:chunked (#48196) (#48503)
pick from master  #48196
2025-03-04 10:12:54 +08:00
ddf63dc2f0 branch-2.1: [enhancement](threadpool) reduce thread pool for arrow flight and spill io threads #48530 (#48556)
Cherry-picked from #48530

Co-authored-by: yiguolei <guolei@selectdb.com>
2025-03-03 18:51:48 +08:00
cd3e1dce74 [feature](inverted index) Add profile statistics for each condition in inverted index filters (#48459)
https://github.com/apache/doris/pull/47504
2025-03-01 11:00:19 +08:00
c5e67bf82d branch-2.1: [Chore](client) Do not log in thrift exception when ADDRESS_SANITIZER is defined #48430 (#48455)
Cherry-picked from #48430

Co-authored-by: Pxl <xl@selectdb.com>
2025-02-28 15:12:12 +08:00
da90fba590 branch-2.1: [fix](client) Do not log in thrift exception when ADDRESS_SANITIZER is defined #48347 (#48404)
cherry pick from #48347
2025-02-27 15:52:39 +08:00
dae9d9d5e4 [cherry-pick](branch-2.1) Don't prematurely erase DeleteRows in reading iceberg table with position delete (#47977) (#48308)
### What problem does this PR solve?
Issue Number: close #41460
Problem Summary:
When reading the Iceberg table, previously read `DeleteRows` should not
be released immediately, as the Iceberg data file is split into multiple
`IcebergSplit`s for execution. These `IcebergSplit`s belong to the same
data file, meaning they share the same `DeleteRows`. Therefore,
`DeleteRows` in the `DeleteFile` should not be released prematurely.
Instead, they should be released when the shared_kv is reset, at which
point all `DeleteRows` will be freed along with the cached `DeleteFile`.

2025-02-27 15:44:40 +08:00
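
The lifetime rule described in the Iceberg fix above (dae9d9d5e4) can be sketched as a small cache. This is an illustrative model only: `DeleteFileCache`, `read_split`, and the loader callback are hypothetical names, and the cache merely stands in for the shared_kv mentioned in the commit message.

```python
class DeleteFileCache:
    """Hypothetical stand-in for the shared cache of delete files."""

    def __init__(self, loader):
        self._loader = loader  # reads deleted row positions for a delete file
        self._rows = {}        # delete file path -> set of deleted positions
        self.loads = 0         # how many times a delete file was actually read

    def get_delete_rows(self, delete_file):
        if delete_file not in self._rows:
            self._rows[delete_file] = self._loader(delete_file)
            self.loads += 1
        return self._rows[delete_file]

    def reset(self):
        # Delete rows are freed only here, together with the cached files.
        self._rows.clear()


def read_split(split_rows, cache, delete_file):
    """Return the values of one split, skipping position-deleted rows."""
    deleted = cache.get_delete_rows(delete_file)
    # Nothing is released after a split finishes: sibling splits of the same
    # data file still need the same delete rows.
    return [value for position, value in split_rows if position not in deleted]
```

If the first split erased its delete rows after reading, the second split of the same data file would either reload them or, worse, treat deleted rows as live.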
393cb00174 [fix](metrics) max_compaction_score metrics do not update while compaction_num_per_round > 1 (#48383)
refer to #46160
2025-02-27 11:07:15 +08:00
45ebb21cad branch-2.1-pick: [Opt](partial update) Add some cases for partial update (#47900) (#48272)
pick https://github.com/apache/doris/pull/47900
2025-02-26 22:04:57 +08:00
d0cb4f8864 branch-2.1: [fix](schema scan) Fix invalid pointer access #48370 (#48387)
Cherry-picked from #48370

Co-authored-by: Gabriel <liwenqiang@selectdb.com>
2025-02-26 22:04:05 +08:00
4f3ccca9a2 branch-2.1: [fix](schema scan) Fix invalid pointer access #48313 (#48341)
Cherry-picked from #48313

Co-authored-by: Gabriel <liwenqiang@selectdb.com>
2025-02-26 20:29:39 +08:00
b1ac39587b [fix](variant) fix variant used in order by back to legacy planner would meet core (#48332)
### What problem does this PR solve?
sql 
```
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY orderid ORDER BY v DESC) AS row_idfirst
FROM test_v_legacy
ORDER BY orderid;
```
core:
```
F20250225 11:23:08.735848 1788347 column_object.h:439] should not call the method in column object
*** Check failure stack trace: ***
F20250225 11:23:08.736686 1788345 column_object.h:439] should not call the method in column object
*** Check failure stack trace: ***
    @     0x5555fa0821b6  google::LogMessage::SendToLog()
    @     0x5555fa0821b6  google::LogMessage::SendToLog()
    @     0x5555fa07ec00  google::LogMessage::Flush()
    @     0x5555fa07ec00  google::LogMessage::Flush()
    @     0x5555fa0829f9  google::LogMessageFatal::~LogMessageFatal()
    @     0x5555fa0829f9  google::LogMessageFatal::~LogMessageFatal()
    @     0x5555dd722b1c  doris::vectorized::ColumnObject::get_permutation()
    @     0x5555dd722b1c  doris::vectorized::ColumnObject::get_permutation()
    @     0x5555dd6b933c  doris::vectorized::ColumnNullable::get_permutation()
    @     0x5555dd6b933c  doris::vectorized::ColumnNullable::get_permutation()
    @     0x5555dde3997f  doris::vectorized::sort_block()
    @     0x5555dde3997f  doris::vectorized::sort_block()
    @     0x5555e2f13646  doris::vectorized::Sorter::partial_sort()
    @     0x5555e2f13646  doris::vectorized::Sorter::partial_sort()
    @     0x5555e2f1503b  doris::vectorized::FullSorter::_do_sort()
    @     0x5555e2f1503b  doris::vectorized::FullSorter::_do_sort()
    @     0x5555e2f15a2f  doris::vectorized::FullSorter::prepare_for_read()
    @     0x5555e2f15a2f  doris::vectorized::FullSorter::prepare_for_read()
    @     0x5555f97fb76a  doris::pipeline::SortSinkOperatorX::sink()
    @     0x5555f97fb76a  doris::pipeline::SortSinkOperatorX::sink()
    @     0x5555fa018798  doris::pipeline::PipelineXTask::execute()
    @     0x5555fa018798  doris::pipeline::PipelineXTask::execute()
    @     0x5555fa04efe5  doris::pipeline::TaskScheduler::_do_work()
    @     0x5555fa04efe5  doris::pipeline::TaskScheduler::_do_work()
    @     0x5555fa052f2b  doris::pipeline::TaskScheduler::start()::$_0::operator()()
    @     0x5555fa052f2b  doris::pipeline::TaskScheduler::start()::$_0::operator()()

```
### Release note

None

### Check List (For Author)

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:
    - [ ] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [ ] No.
    - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
2025-02-26 20:27:43 +08:00
82fe8bc3d7 branch-2.1:[fix](libhdfs) fix the lifecycle issue of libhdfs config (#48352)
pick part of #47299
When calling `hdfsBuilderSetKerb5Conf`, the `value` string must live at
least as long as `hdfs_builder`.
2025-02-26 14:10:49 +08:00
69196d8212 [fix](jvm) the jvm opt should only be set once #48335 (#48336)
bp #48335
2025-02-26 09:52:58 +08:00
a2fe1bda7f branch-2.1: [fix](group commit) fix wal reader handle empty block #48290 (#48334)
Cherry-picked from #48290

Co-authored-by: meiyi <meiyi@selectdb.com>
2025-02-26 09:51:55 +08:00
babab64bbd branch-2.1: [fix](group commit) group commit failed if enable global enable_unique_key_partial_update (#48251)
pick https://github.com/apache/doris/pull/48120
2025-02-25 22:08:50 +08:00