Commit Graph

1533 Commits

Author SHA1 Message Date
505c9af95a [fix](inverted index) fix query error (#50860) (#50909)
pick from master #50860
2025-05-17 16:19:15 +08:00
0f50cea3d8 branch-2.1: [fix](memory) Fix PODArray::add_num_element (#50785)
pick #50756
2025-05-11 19:14:25 +08:00
0002500757 branch-2.1: [Fix](inverted index) fix rename column build index bug #50056 (#50732)
pick #47562 #50056 from master

---------

Co-authored-by: qiye <luen@selectdb.com>
2025-05-09 17:13:46 +08:00
5501e130bf [fix](parquet)Fixed the problem that when Parquet reader use index to read files, there will be multiple threads modify same object (#50161) (#50496)
bp #50161
2025-05-08 15:51:25 +08:00
ebcec779ec branch-2.1: [fix](function) fix error result when input utf8 in url_encode, strright, append_trailing_char_if_absent #49127 (#50660)

The url_encode function previously performed a modulus operation on a
signed number. Converting it to an unsigned number will fix the issue.
```
before
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %5.%23%0-%5.%10%/(   |
+----------------------+
now
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %E7%BC%96%E7%A0%81   |
+----------------------+
```
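
The signed-versus-unsigned issue described above can be illustrated with a small sketch. This is not the Doris implementation: the helper names `c_remainder` and `url_encode`, and the choice of the RFC 3986 unreserved set as the characters left unescaped, are assumptions made for illustration. Python's `%` always returns a non-negative result for a positive divisor, so `c_remainder` emulates the C behaviour (truncation toward zero) to show why a negative byte value yields an unusable hex digit index.

```python
def c_remainder(value: int, divisor: int) -> int:
    """Remainder with C semantics (quotient truncated toward zero)."""
    return value - divisor * int(value / divisor)


def url_encode(text: str) -> str:
    """Percent-encode UTF-8 bytes, treating every byte as unsigned (0..255)."""
    # Assumption: RFC 3986 unreserved characters are passed through unchanged.
    unreserved = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_.~"
    hex_digits = "0123456789ABCDEF"
    out = []
    for byte in text.encode("utf-8"):  # iterating bytes yields ints in 0..255
        if byte in unreserved:
            out.append(chr(byte))
        else:
            out.append("%" + hex_digits[byte // 16] + hex_digits[byte % 16])
    return "".join(out)


first_byte = "编码".encode("utf-8")[0]  # 0xE7, i.e. 231 as an unsigned byte
as_signed = first_byte - 256            # the value a signed 8-bit char would hold

print(as_signed, c_remainder(as_signed, 16))  # negative remainder: not a hex digit index
print(first_byte % 16)                        # unsigned remainder: a valid index
print(url_encode("编码"))
```

With unsigned bytes, both the quotient and the remainder stay in the range 0..15, so each byte maps to exactly two hex digits.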

The strright function did not calculate the length according to the
number of UTF-8 characters.
```
before
mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
|                            |
+----------------------------+
now

mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
| 你好世界                   |
+----------------------------+
```
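
A minimal sketch of the difference between counting bytes and counting UTF-8 characters, assuming only positive lengths need to be modelled. The function below is a hypothetical stand-in, not the Doris code; returning an empty string for non-positive lengths is a simplification of this sketch.

```python
def strright(text: str, length: int) -> str:
    """Return the rightmost `length` characters (code points), not bytes."""
    if length <= 0:
        return ""  # simplification for this sketch only
    return text[-length:]  # slicing past the start returns the whole string


raw = "你好世界".encode("utf-8")  # 4 characters, 3 bytes each
byte_tail = raw[-5:]             # a byte-based cut lands in the middle of a character

print(strright("你好世界", 5))
print(strright("你好世界", 2))
print(byte_tail.decode("utf-8", errors="replace"))
```

Cutting by bytes splits a three-byte character, so the byte-based tail is no longer valid UTF-8, while the character-based version returns the whole string when the requested length exceeds the character count.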

For append_trailing_char_if_absent, the case of inputting a UTF-8 character was not considered.
```
before
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| NULL                                            |
+-------------------------------------------------+
now
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| 中文                                            |
+-------------------------------------------------+
```
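
The behaviour shown in the "now" output can be sketched as follows. This is an illustration rather than the Doris code, and it assumes that a trailing argument which is not exactly one character yields NULL (modelled here as `None`). The key point is that the length check counts characters, not bytes.

```python
from typing import Optional


def append_trailing_char_if_absent(text: str, trailing: str) -> Optional[str]:
    """Append `trailing` to `text` unless `text` already ends with it."""
    # len() on a Python str counts code points, so '文' has length 1
    # even though it occupies 3 bytes in UTF-8.
    if len(trailing) != 1:
        return None  # assumption: anything but exactly one character gives NULL
    return text if text.endswith(trailing) else text + trailing


print(len("文".encode("utf-8")))                      # byte length of the character
print(append_trailing_char_if_absent("中文", "文"))    # already present
print(append_trailing_char_if_absent("中文", "!"))     # appended
```

A check based on byte length would see a three-byte argument and reject it, which is consistent with the NULL result in the "before" output.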
2025-05-07 22:37:50 +08:00
02c3157e4c [branch-2.1](function) fix wrong floor of function date_diff when unit less than day (#49429) (#50606)
pick https://github.com/apache/doris/pull/49429
2025-05-07 09:27:37 +08:00
af4195e399 branch-2.1: [fix](geo) Fix ST_Contains behavior #50115 (#50569)
Cherry-picked from #50115

Co-authored-by: linrrarity <142187136+linrrzqqq@users.noreply.github.com>
2025-05-03 22:36:22 +08:00
4b3dd6c10a branch-2.1: [feat](func) any function supports json #50311 (#50484)
Cherry-picked from #50311

Co-authored-by: lw112 <131352377+felixwluo@users.noreply.github.com>
2025-04-29 19:11:25 +08:00
3660139c64 branch-2.1: [fix](path gc) Fix path gc race with publish task #50343 (#50520)
cherry pick from #50343
2025-04-29 16:17:34 +08:00
4e8148105a [fix](serde)Fixed the issue that serde may cause be core when reading schema changed text table. (#50105) (#50504)
bp #50105
2025-04-28 21:54:43 -07:00
bc68a8da07 branch-2.1: remove visible rowset from memory during deletion transaction (#50329) 2025-04-24 09:41:37 +08:00
cf72fa82e2 [Improve](explode) explode function support multi param (#50310)
### What problem does this PR solve?
backport:https://github.com/apache/doris/pull/48537
2025-04-23 23:27:07 +08:00
ea29bc523e branch-2.1: [Enhancement](GEO) Support Multipolygon and some spatial functions (#50073)
pick: https://github.com/apache/doris/pull/37003,
https://github.com/apache/doris/pull/48695 and
https://github.com/apache/doris/pull/49665

---------

Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
Co-authored-by: koi <koi20000@163.com>
2025-04-17 09:25:29 +08:00
aa4b54952c branch-2.1: [enhancement]Optimize GeoFunctions for const columns #34396 (#50067)
Cherry-picked from #34396

Co-authored-by: koarz <66543806+koarz@users.noreply.github.com>
2025-04-16 14:05:46 +08:00
fe634555bd [fix](variant)fix core in column_object when sort from empty block (#50035) 2025-04-16 14:03:04 +08:00
6e448d3a56 [feat](test)add some be ut for orc/parquet reader (#49418) (#49948)
bp #49418
2025-04-16 12:38:45 +08:00
d8a274251e branch-2.1: [feature](function) support utf8 input in initcap #49846 (#49977) 2025-04-11 15:06:23 +08:00
8199febcdb [Test][Fix](parquet-reader) Add parquet decoder unit tests and fix bugs by these tests. (#49922) 2025-04-10 21:56:53 +08:00
aad189cf40 [feature](function) upper lower support utf8 input (#49756)
### What problem does this PR solve?
https://github.com/apache/doris/pull/49231
2025-04-07 12:00:31 +08:00
c0bc16d88f [fix](function) wrong result of arrays_overlap (#49403) (#49707)
Pick #49403
If the two arrays share at least one non-null element, they are considered
overlapping, and the result is 1.
If the two arrays have no common non-null element and either array
contains a null element, the result is null.
Otherwise, the result is 0.

```
select arrays_overlap([1, 2, 3], [1, null]);  -- result should be 1

select arrays_overlap([2, 3], [1, null]);  -- result should be null

select arrays_overlap([2, 3], [1]);   -- result should be 0
```
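
The three rules above can be written as a short sketch, with Python lists standing in for arrays and `None` for SQL NULL. The function is a hypothetical model of the described semantics, not the Doris implementation.

```python
from typing import Optional, Sequence


def arrays_overlap(left: Sequence, right: Sequence) -> Optional[int]:
    """Three-valued overlap check: 1, 0, or None (SQL NULL)."""
    left_values = {x for x in left if x is not None}
    right_values = {x for x in right if x is not None}
    if left_values & right_values:
        return 1      # a common non-null element exists
    if None in left or None in right:
        return None   # no common element, but a null makes the answer unknown
    return 0          # no common element and no nulls


print(arrays_overlap([1, 2, 3], [1, None]))
print(arrays_overlap([2, 3], [1, None]))
print(arrays_overlap([2, 3], [1]))
```

The order of the checks matters: a definite overlap wins over the presence of a null, and the null case is only reached when no common element was found.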

2025-04-04 20:58:01 +08:00
145e393d3d branch-2.1: [fix](function) check return type is nullptr in FunctionBasePtr::build #49737 (#49761) 2025-04-02 20:23:41 +08:00
1259ee5088 branch-2.1: [Feature](function) support year of week #48870 (#49012) 2025-03-29 11:24:45 +08:00
4a31fc4e09 [Bug](fix) fix the percentile func result do not equal the percentile array rewrite result (#49379)
cherry pick https://github.com/apache/doris/pull/49351
2025-03-29 08:56:24 +08:00
ce49f37a5e branch-2.1: [fix](core) fix subreplace when inputting a large number of empty strings #49241 (#49303)
Cherry-picked from #49241

Co-authored-by: Mryange <yanxuecheng@selectdb.com>
2025-03-20 22:56:44 +08:00
8f79742f7d branch-2.1: [fix](arrow) Fix Arrow serialization and deserialization of Date/Datetime/Array/Map/Struct/Bitmap/HLL/Decimal256 types (#49244)
### What problem does this PR solve?

pick #48944 [fix](arrow) Fix UT DataTypeSerDeArrowTest of
Array/Map/Struct/Bitmap/HLL/Decimal256 types
pick #48398  [fix](arrow) Fix UT DataTypeSerDeArrowTest of Date type
2025-03-20 09:57:04 +08:00
f771a422a9 branch-2.1: [fix](column) fix ColumnWithTypeAndName::get_nested use-after-free when input Const(Nullable) column #48288 (#49258) 2025-03-20 09:53:20 +08:00
3b61f840f4 [fix](function) Undefined behavior in parse_url (#49149) (#49226) 2025-03-19 17:32:47 +08:00
e5a2b0eea8 Revert "[cherry-pick](jsonb) add a check for jsonb value to avoid invalid jsonb value write into segment file " (#49058)
Reverts apache/doris#48729
Temporarily revert this PR because
PartialUpdateInfo::_generate_default_values_for_missing_cids uses an empty
string, which makes this check fail.
2025-03-14 17:41:06 +08:00
ad6cf63a28 branch-2.1: [opt](inverted index) uniform profile naming convention #48826 (#48975)
Cherry-picked from #48826

Co-authored-by: zzzxl <yangsiyu@selectdb.com>
2025-03-14 14:04:46 +08:00
ed2e1ac34a branch-2.1: [fix](variant) update least common type in ColumnObject::pop_back #48935 (#48979)
Cherry-picked from #48935

Co-authored-by: Sun Chenyang <sunchenyang@selectdb.com>
2025-03-13 17:41:17 +08:00
e455bceb91 [fix](function) fix error result when STR_TO_DATE input all space (#48872) (#48920)
https://github.com/apache/doris/pull/48872
before
```
mysql> select STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s');
+-----------------------------------------+
| STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s') |
+-----------------------------------------+
|                                         |
+-----------------------------------------+
```
now
```
mysql> select STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s');
+-----------------------------------------+
| STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s') |
+-----------------------------------------+
| NULL                                    |
+-----------------------------------------+
```
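
A rough model of the fixed behaviour, in which an all-space input returns NULL instead of an empty value. It is a sketch under assumptions: `str_to_date` is a hypothetical helper, only the `%i` and `%s` format specifiers are translated to their Python equivalents, and unparsable input is also mapped to `None`.

```python
from datetime import datetime
from typing import Optional


def str_to_date(text: str, mysql_format: str) -> Optional[datetime]:
    """Parse `text` with a MySQL-style format; blank input gives None (NULL)."""
    if not text.strip():
        return None  # all-space or empty input: nothing to parse
    # MySQL uses %i for minutes and %s for seconds; Python uses %M and %S.
    python_format = mysql_format.replace("%i", "%M").replace("%s", "%S")
    try:
        return datetime.strptime(text, python_format)
    except ValueError:
        return None  # assumption: unparsable input is also NULL


print(str_to_date("  ", "%Y-%m-%d %H:%i:%s"))
print(str_to_date("2025-03-11 19:30:38", "%Y-%m-%d %H:%i:%s"))
```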

- Test: regression test, unit test
- Behavior changed: No
- Does this need documentation: No

2025-03-11 19:30:38 +08:00
3f684f2899 branch-2.1:[fix] (inverted index) Fix UTF-8 4-byte truncation issue and add configuration to control correct term writing (#48657) (#48741)
Cherry-picked from #48657
2025-03-06 21:28:24 +08:00
7b2899a7ff [cherry-pick](jsonb) add a check for jsonb value to avoid invalid jsonb value write into segment file (#48729)
…ke select core (#48625)

Fix an invalid jsonb value being written into a segment file, which makes
select core. A check on the jsonb value is added in convert_to_olap, whose
output is written into the segment file.
2025-03-06 15:50:35 +08:00
621944d487 [InvertedIndex](Variant) support inverted index for array type in variant (#48594)
cherry-pick from #47688
2025-03-05 10:02:13 +08:00
08e7d920db branch-2.1: [fix](index build) Correct inverted index behavior after dynamically adding a column #48389 (#48546)
Cherry-picked from #48389

---------

Co-authored-by: airborne12 <jiangkai@selectdb.com>
2025-03-05 09:26:54 +08:00
cd3e1dce74 [feature](inverted index) Add profile statistics for each condition in inverted index filters (#48459)
https://github.com/apache/doris/pull/47504
2025-03-01 11:00:19 +08:00
1aa57a3b13 branch-2.1: [fix](array index) Correct null bitmap writing for inverted index #47846 (#48214)
cherry pick from #47846 #48231
2025-02-25 20:31:18 +08:00
470030b878 [feat](clone) Speed clone tablet via batch small file downloading #45061 (#45218)
cherry pick from #45061
2025-02-10 19:38:40 +08:00
3ec723f2cb branch-2.1: [fix](prepared statement) fix protocol with TIME datatype #47389 (#47543)
Cherry-picked from #47389

Co-authored-by: lihangyu <lihangyu@selectdb.com>
2025-02-08 13:00:49 +08:00
701aec6b21 branch-2.1: [opt](jsonb) add ut for the jsonb parser #47181 (#47388)
Cherry-picked from #47181

Co-authored-by: Sun Chenyang <sunchenyang@selectdb.com>
2025-01-24 17:29:33 +08:00
Pxl
58415c3591 [Chore](case) add test case for cityhash #46928 (#46957)
pick from #46928
2025-01-14 14:03:19 +08:00
4472648c07 [branch-2.1] pick workload group usage metrics (#46177)
pick #45284  #44870
2024-12-31 10:09:48 +08:00
df26475e1a [Enhancement](compaction) enable the compaction producer to generate multiple compaction tasks in a single run (#45411) (#46160)
pick master #45411
2024-12-31 09:51:43 +08:00
Pxl
43c646363e [Bug](runtime-filter) support ip rf and use exception to replace dcheck when PrimitiveType to PColumnType (#39985) (#41531)

Use an exception instead of a DCHECK when converting PrimitiveType to PColumnType.
```cpp
*** SIGABRT unknown detail explain (@0x11d3f) received by PID 73023 (TID 74292 OR 0x7fd758225640) from PID 73023; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
 1# 0x00007FDDBE6B9520 in /lib/x86_64-linux-gnu/libc.so.6
 2# pthread_kill at ./nptl/pthread_kill.c:89
 3# raise at ../sysdeps/posix/raise.c:27
 4# abort at ./stdlib/abort.c:81
 5# 0x000056123F81A94D in /root/output/be/lib/doris_be
 6# 0x000056123F80CF8A in /root/output/be/lib/doris_be
 7# google::LogMessage::SendToLog() in /root/output/be/lib/doris_be
 8# google::LogMessage::Flush() in /root/output/be/lib/doris_be
 9# google::LogMessageFatal::~LogMessageFatal() in /root/output/be/lib/doris_be
10# doris::to_proto(doris::PrimitiveType) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:114
11# doris::IRuntimeFilter::push_to_remote(doris::TNetworkAddress const*) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:1143
12# doris::IRuntimeFilter::publish(bool)::$_0::operator()(doris::IRuntimeFilter*) const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:959
13# doris::IRuntimeFilter::publish(bool)::$_2::operator()() const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:983
14# doris::IRuntimeFilter::publish(bool) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:997
```
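
The idea of raising a recoverable error instead of aborting on an unmapped type can be sketched as below. All enum members, the `_TO_PROTO` table and the error message are hypothetical and chosen for illustration; only the names `PrimitiveType`, `PColumnType` and `to_proto` come from the trace above.

```python
from enum import Enum


class PrimitiveType(Enum):   # hypothetical subset of types
    TYPE_INT = 1
    TYPE_IPV4 = 2
    TYPE_UNSUPPORTED = 3


class PColumnType(Enum):     # hypothetical subset of protobuf column types
    COLUMN_TYPE_INT = 1
    COLUMN_TYPE_IPV4 = 2


_TO_PROTO = {
    PrimitiveType.TYPE_INT: PColumnType.COLUMN_TYPE_INT,
    PrimitiveType.TYPE_IPV4: PColumnType.COLUMN_TYPE_IPV4,
}


def to_proto(primitive_type: PrimitiveType) -> PColumnType:
    """Map a type, raising an error the caller can handle when no mapping exists."""
    try:
        return _TO_PROTO[primitive_type]
    except KeyError:
        # An assertion here would abort the whole process; an exception
        # lets the query fail while the process keeps running.
        raise ValueError(f"unsupported PrimitiveType: {primitive_type.name}") from None


print(to_proto(PrimitiveType.TYPE_IPV4))
```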

## Proposed changes
pick from #39985
2024-12-30 20:56:11 +08:00
d2c108726d [opt](bloomfilter index) optimize memory usage for bloom filter index writer #45833 (#46047)
cherry pick from #45833
2024-12-27 12:10:56 +08:00
df8bc8f23d branch-2.1: [fix](parquet) impl has_dict_page to replace old logic and fix write empty parquet row group bug #45740 (#45954)
Cherry-picked from #45740

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-12-26 15:17:49 +08:00
1cf6986cea [pick](branch-2.1) pick #44092 (#45836) 2024-12-25 23:11:19 +08:00
64195d79ee [refactor](metrics) Remove IntAtomicCounter & CoreLocal #45742 (#45870)
cherry pick from #45742
2024-12-24 23:13:48 +08:00
02f15a8ef0 [fix](inverted index) Fix Null Pointer Exception in function match(#45456)(#45774)
pick: https://github.com/apache/doris/pull/45456
2024-12-24 11:27:13 +08:00
79662fcc94 [branch-2.1](functions) clean some ip functions code and make IS_IP_ADDRESS_IN_RANGE DEPENDS_ON_ARGUMENT (#45358)
pick https://github.com/apache/doris/pull/35239


Add special logic to handle smooth upgrades.

The original PR is https://github.com/apache/doris/pull/35239. On
branch-3.0 it was merged in 3.0.0, but registering the old version was
forgotten. On branch-3.0 this is now fixed in
https://github.com/apache/doris/pull/45428, which must be merged in
3.0.4. This PR does the same thing and must be merged in 2.1.8.
Then:
```
FROM    TO    result
217-    218+    
217-    303-    💥
218+    303-    
218+    304+    
303-    304+    
```
This is the best result we can achieve.
2024-12-17 11:51:07 +08:00