Commit Graph

489 Commits

Author SHA1 Message Date
6404277795 [fix](json) Add . after $ in JSON path to support correct token parsing (#52543) (#52544)
Boost tokenizer requires explicit "." after "$" to correctly extract
JSON path tokens. Without this, expressions like "$[0].key" cannot be
properly split, causing issues in downstream logic. This commit ensures
a "." is automatically added after "$" to maintain consistent token
parsing behavior.
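
A minimal sketch of the normalization described above, assuming a plain split on "." as a stand-in for the Boost tokenizer. The helper names `normalize_json_path` and `split_tokens` are hypothetical and are not taken from the Doris code base.

```python
def normalize_json_path(path):
    """Insert "." after a leading "$" when it is missing (hypothetical helper)."""
    if path.startswith("$") and not path.startswith("$."):
        return "$." + path[1:]
    return path


def split_tokens(path):
    # str.split stands in for a "."-separated tokenizer; empty tokens are dropped.
    return [tok for tok in normalize_json_path(path).split(".") if tok]


# Without normalization, "$" and "[0]" stay glued together in one token.
print("$[0].key".split("."))     # ['$[0]', 'key']
# With normalization, the root and the array index become separate tokens.
print(split_tokens("$[0].key"))  # ['$', '[0]', 'key']
```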
2025-07-03 14:36:53 +08:00
fb70742e87 branch-2.1: [Fix](field) Fix potential memory leak and wrong binary reading about JsonbField (#50174) (#52693)
pick https://github.com/apache/doris/pull/50174
2025-07-03 12:38:37 +08:00
a75760d18f branch-2.1 cherry-pick [Fix](Variant) fix serialize with json key contains . as name (#51864)
cherry-pick from #51857
2025-06-20 14:00:00 +08:00
18d2f93120 branch-2.1: [fix](function) JSON_EXTRACT_STRING should return NULL instead of the string 'null' when encountering a NULL value #51516 (#51566)
Cherry-picked from #51516

---------

Co-authored-by: Jerry Hu <hushenggang@selectdb.com>
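
The distinction between SQL NULL and the string 'null' can be illustrated with a small model. This is a sketch only: `json_extract_string` below is a hypothetical Python helper that looks up a single top-level key, and Python's `None` stands in for SQL NULL.

```python
import json


def json_extract_string(doc, key):
    """Hypothetical model: look up a top-level key and return it as a string."""
    value = json.loads(doc).get(key)
    if value is None:
        # A JSON null (or a missing key) maps to SQL NULL, not the text "null".
        return None
    if isinstance(value, str):
        return value
    # Other JSON values are rendered back to their JSON text.
    return json.dumps(value)


print(json_extract_string('{"a": null}', "a"))  # None
print(json_extract_string('{"a": "x"}', "a"))   # x
```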
2025-06-13 11:07:31 +08:00
505c9af95a [fix](inverted index) fix query error (#50860) (#50909)
pick from master #50860
2025-05-17 16:19:15 +08:00
0f50cea3d8 branch-2.1: [fix](memory) Fix PODArray::add_num_element (#50785)
pick #50756
2025-05-11 19:14:25 +08:00
5501e130bf [fix](parquet)Fixed the problem that when Parquet reader use index to read files, there will be multiple threads modify same object (#50161) (#50496)
bp #50161
2025-05-08 15:51:25 +08:00
ebcec779ec branch-2.1: [fix](function) fix error result when input utf8 in url_encode, strright, append_trailing_char_if_absent #49127 (#50660)

The url_encode function previously performed a modulus operation on a
signed number. Converting it to an unsigned number will fix the issue.
```
before
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %5.%23%0-%5.%10%/(   |
+----------------------+
now
mysql> select url_encode('编码');
+----------------------+
| url_encode('编码')   |
+----------------------+
| %E7%BC%96%E7%A0%81   |
+----------------------+
```
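
Why signedness matters here can be sketched in a few lines. The code below is an illustration, not the Doris implementation: `to_signed` and `percent_encode` are hypothetical helpers, and the set of characters left unescaped is an assumption.

```python
from urllib.parse import quote


def to_signed(byte):
    """Reinterpret an unsigned byte (0..255) as a signed 8-bit value."""
    return byte - 256 if byte > 127 else byte


def percent_encode(text):
    hex_digits = "0123456789ABCDEF"
    out = []
    for b in text.encode("utf-8"):  # iterating bytes yields values in 0..255
        ch = chr(b)
        if ch.isascii() and (ch.isalnum() or ch in "-_.~"):
            out.append(ch)  # assumed unreserved set, kept as-is
        else:
            # Both indices are in 0..15 only because b is unsigned.
            out.append("%" + hex_digits[b // 16] + hex_digits[b % 16])
    return "".join(out)


# The first UTF-8 byte of '编' is 0xE7; read as a signed byte it is negative,
# so division and remainder by 16 no longer yield a valid hex digit index.
print(to_signed(0xE7))       # -25
print(percent_encode("编码"))  # %E7%BC%96%E7%A0%81
print(quote("编码"))           # %E7%BC%96%E7%A0%81
```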

The strright function did not calculate the length according to the
number of UTF-8 characters.
```
before
mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
|                            |
+----------------------------+
now

mysql> select strright("你好世界",5);
+----------------------------+
| strright("你好世界",5)     |
+----------------------------+
| 你好世界                   |
+----------------------------+
```
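
The difference between counting bytes and counting UTF-8 characters can be sketched as follows. `strright_chars` is a hypothetical helper that models only a positive length argument; Python strings are indexed by code point, which is the behavior the fix aims for.

```python
def strright_chars(s, n):
    """Return the last n characters (code points) of s, for positive n only."""
    return s[-n:] if n > 0 else ""


text = "你好世界"
print(len(text))                  # 4 characters
print(len(text.encode("utf-8")))  # 12 bytes
# Asking for 5 characters from a 4-character string returns the whole string.
print(strright_chars(text, 5))    # 你好世界
print(strright_chars(text, 2))    # 世界
```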

The append_trailing_char_if_absent function did not handle a multi-byte UTF-8 character as the trailing character.
```
before
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| NULL                                            |
+-------------------------------------------------+
now
mysql> select append_trailing_char_if_absent('中文', '文');
+-------------------------------------------------+
| append_trailing_char_if_absent('中文', '文')    |
+-------------------------------------------------+
| 中文                                            |
+-------------------------------------------------+
```
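
A small model of the corrected behavior, under the assumption (inferred from the "before" output above) that the trailing argument must be exactly one character and that anything else yields NULL. The Python function is a hypothetical stand-in, with `None` playing the role of NULL.

```python
def append_trailing_char_if_absent(s, trailing):
    # The length check must count characters, not bytes:
    # '文' is one character but three UTF-8 bytes.
    if len(trailing) != 1:
        return None
    return s if s.endswith(trailing) else s + trailing


print(len("文"), len("文".encode("utf-8")))           # 1 3
print(append_trailing_char_if_absent("中文", "文"))   # 中文
print(append_trailing_char_if_absent("中", "文"))     # 中文
```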
2025-05-07 22:37:50 +08:00
02c3157e4c [branch-2.1](function) fix wrong floor of function date_diff when unit less than day (#49429) (#50606)
pick https://github.com/apache/doris/pull/49429
2025-05-07 09:27:37 +08:00
4b3dd6c10a branch-2.1: [feat](func) any function supports json #50311 (#50484)
Cherry-picked from #50311

Co-authored-by: lw112 <131352377+felixwluo@users.noreply.github.com>
2025-04-29 19:11:25 +08:00
4e8148105a [fix](serde)Fixed the issue that serde may cause be core when reading schema changed text table. (#50105) (#50504)
bp #50105
2025-04-28 21:54:43 -07:00
cf72fa82e2 [Improve](explode) explode function support multi param (#50310)
### What problem does this PR solve?
backport:https://github.com/apache/doris/pull/48537
2025-04-23 23:27:07 +08:00
aa4b54952c branch-2.1: [enhancement]Optimize GeoFunctions for const columns #34396 (#50067)
Cherry-picked from #34396

Co-authored-by: koarz <66543806+koarz@users.noreply.github.com>
2025-04-16 14:05:46 +08:00
fe634555bd [fix](variant)fix core in column_object when sort from empty block (#50035) 2025-04-16 14:03:04 +08:00
6e448d3a56 [feat](test)add some be ut for orc/parquet reader (#49418) (#49948)
bp #49418
2025-04-16 12:38:45 +08:00
d8a274251e branch-2.1: [feature](function) support utf8 input in initcap #49846 (#49977) 2025-04-11 15:06:23 +08:00
8199febcdb [Test][Fix](parquet-reader) Add parquet decoder unit tests and fix bugs by these tests. (#49922) 2025-04-10 21:56:53 +08:00
aad189cf40 [feature](function) upper lower support utf8 input (#49756)
### What problem does this PR solve?
https://github.com/apache/doris/pull/49231
2025-04-07 12:00:31 +08:00
c0bc16d88f [fix](function) wrong result of arrays_overlap (#49403) (#49707)
Pick #49403
If the two arrays share at least one non-null element, they are considered
overlapping, and the result is 1.
If the two arrays have no common non-null element and either array
contains a null element, the result is null.
Otherwise, the result is 0.

```
select arrays_overlap([1, 2, 3], [1, null]);  -- result should be 1

select arrays_overlap([2, 3], [1, null]);  -- result should be null

select arrays_overlap([2, 3], [1]);   -- result should be 0
```
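
The three rules above can be modeled in a few lines. This is a sketch in Python, with `None` standing in for SQL NULL; it mirrors the rules as stated in this commit message and is not the Doris implementation.

```python
def arrays_overlap(left, right):
    left_vals = {x for x in left if x is not None}
    right_vals = {x for x in right if x is not None}
    if left_vals & right_vals:
        return 1     # a shared non-null element decides the result
    if None in left or None in right:
        return None  # no proven overlap, but a null could match: unknown
    return 0         # no nulls and no common element


print(arrays_overlap([1, 2, 3], [1, None]))  # 1
print(arrays_overlap([2, 3], [1, None]))     # None
print(arrays_overlap([2, 3], [1]))           # 0
```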

2025-04-04 20:58:01 +08:00
145e393d3d branch-2.1: [fix](function) check return type is nullptr in FunctionBasePtr::build #49737 (#49761) 2025-04-02 20:23:41 +08:00
1259ee5088 branch-2.1: [Feature](function) support year of week #48870 (#49012) 2025-03-29 11:24:45 +08:00
ce49f37a5e branch-2.1: [fix](core) fix subreplace when inputting a large number of empty strings #49241 (#49303)
Cherry-picked from #49241

Co-authored-by: Mryange <yanxuecheng@selectdb.com>
2025-03-20 22:56:44 +08:00
8f79742f7d branch-2.1: [fix](arrow) Fix Arrow serialization and deserialization of Date/Datetime/Array/Map/Struct/Bitmap/HLL/Decimal256 types (#49244)
### What problem does this PR solve?

pick #48944 [fix](arrow) Fix UT DataTypeSerDeArrowTest of
Array/Map/Struct/Bitmap/HLL/Decimal256 types
pick #48398  [fix](arrow) Fix UT DataTypeSerDeArrowTest of Date type
2025-03-20 09:57:04 +08:00
f771a422a9 branch-2.1: [fix](column) fix ColumnWithTypeAndName::get_nested use-after-free when input Const(Nullable) column #48288 (#49258) 2025-03-20 09:53:20 +08:00
3b61f840f4 [fix](function) Undefined behavior in parse_url (#49149) (#49226) 2025-03-19 17:32:47 +08:00
e5a2b0eea8 Revert "[cherry-pick](jsonb) add a check for jsonb value to avoid invalid jsonb value write into segment file " (#49058)
Reverts apache/doris#48729
Temporarily revert this PR because
PartialUpdateInfo::_generate_default_values_for_missing_cids uses an empty
string, which makes this check fail.
2025-03-14 17:41:06 +08:00
ed2e1ac34a branch-2.1: [fix](variant) update least common type in ColumnObject::pop_back #48935 (#48979)
Cherry-picked from #48935

Co-authored-by: Sun Chenyang <sunchenyang@selectdb.com>
2025-03-13 17:41:17 +08:00
e455bceb91 [fix](function) fix error result when STR_TO_DATE input all space (#48872) (#48920)
https://github.com/apache/doris/pull/48872
before
```
mysql> select STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s');
+-----------------------------------------+
| STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s') |
+-----------------------------------------+
|                                         |
+-----------------------------------------+
```
now
```
mysql> select STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s');
+-----------------------------------------+
| STR_TO_DATE ('  ', '%Y-%m-%d %H:%i:%s') |
+-----------------------------------------+
| NULL                                    |
+-----------------------------------------+
```
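
The intended behavior, returning NULL when the input cannot be parsed, can be sketched with Python's `datetime.strptime`. This is only an analogy: `str_to_date` is a hypothetical wrapper, `None` stands in for NULL, and the Python format uses `%M` for minutes where the SQL format above uses `%i`.

```python
from datetime import datetime


def str_to_date(text, fmt="%Y-%m-%d %H:%M:%S"):
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        # An all-space (or otherwise unparsable) input yields NULL,
        # not an empty value.
        return None


print(str_to_date("  "))                   # None
print(str_to_date("2025-03-11 19:30:38"))  # 2025-03-11 19:30:38
```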

Problem Summary:

None

- Test <!-- At least one of them must be included. -->
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason? -->

- Behavior changed:
    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?
    - [x] No.
- [ ] Yes. <!-- Add document PR link here. eg:
https://github.com/apache/doris-website/pull/1214 -->

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->

2025-03-11 19:30:38 +08:00
7b2899a7ff [cherry-pick](jsonb) add a check for jsonb value to avoid invalid jsonb value write into segment file (#48729)
…ke select core (#48625)

Fix an invalid jsonb value being written into the segment file, which
makes select core dump. This adds a check on the jsonb value in
convert_to_olap, whose output is written into the segment file.
2025-03-06 15:50:35 +08:00
Pxl
43c646363e [Bug](runtime-filter) support ip rf and use exception to replace dcheck when PrimitiveType to PColumnType (#39985) (#41531)

use exception to replace dcheck when PrimitiveType to PColumnType
```cpp
*** SIGABRT unknown detail explain (@0x11d3f) received by PID 73023 (TID 74292 OR 0x7fd758225640) from PID 73023; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
 1# 0x00007FDDBE6B9520 in /lib/x86_64-linux-gnu/libc.so.6
 2# pthread_kill at ./nptl/pthread_kill.c:89
 3# raise at ../sysdeps/posix/raise.c:27
 4# abort at ./stdlib/abort.c:81
 5# 0x000056123F81A94D in /root/output/be/lib/doris_be
 6# 0x000056123F80CF8A in /root/output/be/lib/doris_be
 7# google::LogMessage::SendToLog() in /root/output/be/lib/doris_be
 8# google::LogMessage::Flush() in /root/output/be/lib/doris_be
 9# google::LogMessageFatal::~LogMessageFatal() in /root/output/be/lib/doris_be
10# doris::to_proto(doris::PrimitiveType) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:114
11# doris::IRuntimeFilter::push_to_remote(doris::TNetworkAddress const*) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:1143
12# doris::IRuntimeFilter::publish(bool)::$_0::operator()(doris::IRuntimeFilter*) const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:959
13# doris::IRuntimeFilter::publish(bool)::$_2::operator()() const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:983
14# doris::IRuntimeFilter::publish(bool) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:997
```

## Proposed changes
pick from #39985
2024-12-30 20:56:11 +08:00
df8bc8f23d branch-2.1: [fix](parquet) impl has_dict_page to replace old logic and fix write empty parquet row group bug #45740 (#45954)
Cherry-picked from #45740

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-12-26 15:17:49 +08:00
02f15a8ef0 [fix](inverted index) Fix Null Pointer Exception in function match(#45456)(#45774)
pick: https://github.com/apache/doris/pull/45456
2024-12-24 11:27:13 +08:00
79662fcc94 [branch-2.1](functions) clean some ip functions code and make IS_IP_ADDRESS_IN_RANGE DEPENDS_ON_ARGUMENT (#45358)
pick https://github.com/apache/doris/pull/35239


Add special logic to handle smooth upgrades.

The original PR is https://github.com/apache/doris/pull/35239. On
branch-3.0 it was merged in 3.0.0, but the old version was never
registered. branch-3.0 now fixes this in
https://github.com/apache/doris/pull/45428, which must be merged in
3.0.4. This PR does the same thing and must be merged in 2.1.8.
The resulting upgrade matrix:
```
FROM    TO    result
217-    218+    
217-    303-    💥
218+    303-    
218+    304+    
303-    304+    
```
This is the best result we can achieve.
2024-12-17 11:51:07 +08:00
d4a6fd1850 Revert #43255 & #44615 (#45096)
Revert "branch-2.1: [enhance](orc) Optimize ORC Predicate Pushdown for
OR-connected Predicate #43255 (#44438)"
Revert "[fix](orc) check all the cases before build_search_argument
(#44615) (#44801)"
2024-12-06 21:14:13 +08:00
5f952cf6ed branch-2.1: [fix](iceberg)Bring field_id with parquet files And fix map type's key optional #44470 (#44828)
Cherry-picked from #44470

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2024-12-02 10:24:07 +08:00
4b15b1f263 [fix](orc) check all the cases before build_search_argument (#44615) (#44801)
cherry-pick #44615

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-11-30 09:17:56 +08:00
dceaf97381 branch-2.1: [enhance](orc) Optimize ORC Predicate Pushdown for OR-connected Predicate #43255 (#44438)
Cherry-picked from #43255

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-11-22 22:52:53 +08:00
d4712aed1a branch-2.1: [fix](string64) fix coredump caused by ColumnArray<ColumnStr<uint64_t>>::insert_indices_from (#43862)
Cherry-picked from #43624

Co-authored-by: TengJianPing <tengjianping@selectdb.com>
2024-11-13 19:31:11 +08:00
1101fbaf04 [fix](column_complex) wrong type of Field returned by ColumnComplex (#43515) (#43860) 2024-11-13 19:07:00 +08:00
9d7bc5b765 [pick](branch-2.1) pick #38215 (#43386)
pick #38215

---------

Co-authored-by: Zou Xinyi <zouxinyi@selectdb.com>
2024-11-09 22:13:05 +08:00
46afbfca01 branch-2.1: [fix](ip) fix datatype serde for ipv6 with rowstore (#43252)
Cherry-picked from #43065

Co-authored-by: amory <wangqiannan@selectdb.com>
2024-11-05 20:09:14 +08:00
25d7d0b255 [fix](move-memtable) abstract multi-streams to one logical stream (#42039) (#42250)
backport #42039
2024-10-22 20:26:42 +08:00
7eec0f8fbb [branch-2.1](datetime) Fix date floor functions overflow (#35477) (#42238)
pick https://github.com/apache/doris/pull/35477
2024-10-22 15:54:53 +08:00
0b4552f74b [cherry-pick](branch-2.1) pick hive text write from master (#40537)
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315

---------

Co-authored-by: Calvin Kirs <kirs@apache.org>
2024-09-27 20:57:07 +08:00
cecd214345 [branch-2.1](Column) refactor ColumnNullable to provide flags safety (#40769) (#40848)
pick https://github.com/apache/doris/pull/40769

Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
2024-09-14 16:27:43 +08:00
ca07a00c93 Revert "[branch-2.1](hive) support hive write text table (#38549) (#40063)" (#40157)

This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-30 10:25:38 +08:00
c6df7c21a3 [branch-2.1](hive) support hive write text table (#38549) (#40063)
1. Support write hive text table
2. Add SessionVariable `hive_text_compression` to write compressed hive
text table
3. Supported compression type: gzip, bzip2, snappy, lz4, zstd

pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
bb687bd69c [cherry-pick](branch-2.1) add function regexp_extract_or_null (#39561)
# Proposed changes

pick https://github.com/apache/doris/pull/38296
2024-08-21 09:14:58 +08:00
021678c7c3 [fix](window_funnel) fix wrong result of window_funnel #38954 (#39270)
## Proposed changes

BP #38954
2024-08-16 09:59:31 +08:00
a44a274563 [Fix](parquet-reader) Fix and optimize parquet min-max filtering. (#39375)
Backport #38277.
2024-08-15 14:12:54 +08:00