Commit Graph

111 Commits

Author SHA1 Message Date
56d87c7f4d [cherry-pick](branch-21) fix array_map cause coredump as NULL (#51618) (#51742) 2025-06-16 14:53:43 +08:00
fbad523a13 [cherry-pick](branch-21) pick (#50913) (#51072)
pick from master (#50913)
2025-05-22 14:34:29 +08:00
02c3157e4c [branch-2.1](function) fix wrong floor of function date_diff when unit less than day (#49429) (#50606)
pick https://github.com/apache/doris/pull/49429
2025-05-07 09:27:37 +08:00
c0bc16d88f [fix](function) wrong result of arrays_overlap (#49403) (#49707)
Pick #49403
If the two arrays share at least one non-null element, they are considered
overlapping, and the result is 1.
If the two arrays have no common non-null elements and either array
contains a null element, the result is null.
Otherwise, the result is 0.

```
select arrays_overlap([1, 2, 3], [1, null]);  -- result should be 1

select arrays_overlap([2, 3], [1, null]);  -- result should be null

select arrays_overlap([2, 3], [1]);   -- result should be 0
```
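The three rules above can be modeled in a few lines. This is only a sketch of the described semantics in Python, not the Doris implementation; the function name mirrors the SQL function, and `None` stands in for SQL null.

```python
from typing import Optional, Sequence


def arrays_overlap(left: Sequence[Optional[int]],
                   right: Sequence[Optional[int]]) -> Optional[int]:
    """Model of the three-valued result described in the commit message."""
    # Compare only the non-null elements of each array.
    left_values = {x for x in left if x is not None}
    right_values = {x for x in right if x is not None}
    if left_values & right_values:
        return 1  # a common non-null element exists
    if None in left or None in right:
        return None  # no common element, but a null makes the answer unknown
    return 0


print(arrays_overlap([1, 2, 3], [1, None]))
print(arrays_overlap([2, 3], [1, None]))
print(arrays_overlap([2, 3], [1]))
```

The three calls correspond to the three SQL statements in the commit message.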

2025-04-04 20:58:01 +08:00
123000ed9d [fix](array_avg) fix core for array_avg (#46927) (#48631)
If the session variables fold_constant_for_be = 1 and
enable_decimal256 = true are set,
the following SQL fails:
```
SELECT ARRAY_AVG(CAST([] AS ARRAY < DECIMALV3(1,0) > ));
```
with this core dump:
```
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior /mnt/disk1/wangqiannan/amory/doris/be/src/vec/columns/column_decimal.h:200:15 in
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk1/wangqiannan/amory/doris/be/src/common/signal_handler.h:421
 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /mnt/disk1/wangqiannan/tool/jdk-17.0.10/lib/server/libjvm.so
 2# JVM_handle_linux_signal in /mnt/disk1/wangqiannan/tool/jdk-17.0.10/lib/server/libjvm.so
 3# 0x00007FD98C1A2B50 in /lib64/libc.so.6
 4# doris::vectorized::ColumnDecimal<doris::vectorized::Decimal<wide::integer<256ul, int> > >::get_data_at(unsigned long) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/columns/column_decimal.h:201
 5# doris::vectorized::DataTypeDecimalSerDe<doris::vectorized::Decimal<wide::integer<256ul, int> > >::write_column_to_pb(doris::vectorized::IColumn const&, doris::PValues&, long, long) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/data_types/serde/data_type_decimal_serde.h:158
 6# doris::vectorized::DataTypeNullableSerDe::write_column_to_pb(doris::vectorized::IColumn const&, doris::PValues&, long, long) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/data_types/serde/data_type_nullable_serde.cpp:237
 7# doris::FoldConstantExecutor::fold_constant_vexpr(doris::TFoldConstantParams const&, doris::PConstantExprResult*) at /mnt/disk1/wangqiannan/amory/doris/be/src/runtime/fold_constant_executor.cpp:118
 8# doris::PInternalService::_fold_constant_expr(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::PConstantExprResult*) at /mnt/disk1/wangqiannan/amory/doris/be/src/service/internal_service.cpp:1537
 9# doris::PInternalService::fold_constant_expr(google::protobuf::RpcController*, doris::PConstantExprRequest const*, doris::PConstantExprResult*, google::protobuf::Closure*)::$_0::operator()() const at /mnt/disk1/wangqiannan/amory/doris/be/src/service/internal_service.cpp:1515
10# void std::__invoke_impl<void, doris::PInternalService::fold_constant_expr(google::protobuf::RpcController*, doris::PConstantExprRequest const*, doris::PConstantExprResult*, google::protobuf::Closure*)::$_0&>(std::__invoke_other, doris::PInternalService::fold_constant_expr(google::protobuf::RpcController*, doris::PConstantExprRequest const*, doris::PConstantExprResult*, google::protobuf::Closure*)::$_0&) at /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61
11# std::enable_if<is_invocable_r_v<void, doris::PInternalService::fold_constant_expr(google::protobuf::RpcController*, doris::PConstantExprRequest const*, doris::PConstantExprResult*, google::protobuf::Closure*)::$_0&>, void>::type std::__invoke_r<void, doris::PInternalService::fold_constant_expr(google::protobuf::RpcController*, doris::PConstantExprRequest const*, doris::PConstantExprResult*, google::protobuf::Closure*)::$_0&>(doris::PInternalService::fold_constant_expr(google::protobuf::RpcController*, doris::PConstantExprRequest const*, doris::PConstantExprResult*, google::protobuf::Closure*)::$_0&) at /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:117
12# std::_Function_handler<void (), doris::PInternalService::fold_constant_expr(google::protobuf::RpcController*, doris::PConstantExprRequest const*, doris::PConstantExprResult*, google::protobuf::Closure*)::$_0>::_M_invoke(std::_Any_data const&) at /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
13# std::function<void ()>::operator()() const at /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
14# doris::WorkThreadPool<false>::work_thread(int) at /mnt/disk1/wangqiannan/amory/doris/be/src/util/work_thread_pool.hpp:158
15# void std::__invoke_impl<void, void (doris::WorkThreadPool<false>::* const&)(int), doris::WorkThreadPool<false>*&, int&>(std::__invoke_memfun_deref, void (doris::WorkThreadPool<false>::* const&)(int), doris::WorkThreadPool<false>*&, int&) at /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:74
16# std::__invoke_result<void (doris::WorkThreadPool<false>::* const&)(int), doris::WorkThreadPool<false>*&, int&>::type std::__invoke<void (doris::WorkThreadPool<false>::* const&)(int), doris::WorkThreadPool<false>*&, int&>(void (doris::WorkThreadPool<false>::* const&)(int), doris::WorkThreadPool<false>*&, int&) at /mnt/disk1/wangqiannan/tool/ldb_toolchain_16/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96
```

2025-03-04 21:46:26 +08:00
803d3a1545 Revert "[fix](date_function) fix str_to_date function return wrong microsecond issue" (#47441) 2025-01-26 12:53:11 +08:00
8925a390d6 [fix](date_function) fix str_to_date function return wrong microsecond issue (#47252) 2025-01-24 17:32:23 +08:00
fb407f2e94 [opt](lambda) let lambda expression support refer outer slot (#45186) 2024-12-11 18:55:49 +08:00
3d667e95d2 [branch-2.1](function) support array_split and array_reverse_split functions (#35619) (#43761)
pick https://github.com/apache/doris/pull/35619
2024-11-12 21:27:55 +08:00
a45dc8796a [fix](Nereids) simplify decimal comparison wrong when cast to smaller scale (#41151) (#41618)
pick from master #41151
2024-10-09 23:03:01 +08:00
226e01889c [fix](array_apply) pick array apply fix (#39328)
backport: https://github.com/apache/doris/pull/39105
2024-08-14 18:52:29 +08:00
ceef9ee123 [feature](serde) support presto compatible output format (#37039) (#37253)
bp #37039
2024-07-04 13:56:05 +08:00
d237a4d303 [fix](array)fix array_except/union for left const return only one row result #36776 (#36986) 2024-06-30 12:25:17 +08:00
cbaff8a700 [fix](nereids)change the decimal's precision and scale for cast(xx as decimal) (#36540)
pick from master #36316

The data type of the expression cast(xx as decimal) may be decimalv3 or decimalv2,
depending on the value of enable_decimal_conversion in the FE conf file. If
enable_decimal_conversion is true, the data type is decimalv3(9, 0), but
it was decimalv3(38, 9) in the 2.0 releases. This PR changes the
data type to match the 2.0 releases to keep the behavior consistent.
2024-06-20 17:46:11 +08:00
f163d56a98 [feature](function) support sequence function(alias of array_range), enhance both to handle datetimev2 (#30823) 2024-02-27 10:12:19 +08:00
9cbb55d49b [fix](Nereids) create double literal when create decimal literal failed (#28959)
FIX
1. remove the float and double literal toString and getStringValue methods introduced by
  PR #23504 and PR #23271.
  These functions lead to wrong cast results for double and float literals.
2. fix compute signature for datetimev2, which always produced scale 6
3. fix the stats calculator failing when generating node stats with two identical columns
4. fix constant folding on FE failing when casting double to an integral type

TODO
After fixing the first problem, some MV matching does not work well; fix these later:
- test_dup_mv_div
- test_dup_mv_json
- test_tcu
2024-01-12 11:46:29 +08:00
c0f63915f7 [chore](test) make configuartion of parallel scan be fuzzy (#29356) 2024-01-05 11:09:43 +08:00
7a4ef90110 [Improve](regresstests)add test cases for array functions (#28492) 2024-01-04 20:39:35 +08:00
51f320a606 [bug](function) fix array_apply function return wrong result (#28133) 2023-12-08 20:14:54 +08:00
79f6f85cf1 [FIX](serde)fix datetimev2 serde parse from string with scale (#27965) 2023-12-05 13:58:32 +08:00
3ddc8211d1 [FIX](array )fix array<null> literal in fe (#27750) 2023-12-03 13:19:22 +08:00
6c4ec3cb82 [FIX](complextype)fix array/map/struct impl hashcode and equals (#27717) 2023-11-30 22:08:15 +08:00
2c6d2255c3 [fix](Nereids) nested type literal type coercion and insert values with map (#26669) 2023-11-14 21:13:26 -06:00
3e10e5af39 [Fix](Serde) Fix content displayed by complex types in MySQL Client (#25946)
This PR makes three changes to the display of complex types:
1. A NULL value in a complex type is displayed as `null`, not `NULL`.
2. A struct field is displayed as "column_name": column_value.
3. Time types such as `datetime` and `date` are displayed with double quotes in complex types, like
    `{1, "2023-10-26 12:12:12"}`.

This PR also includes a code refactor:
1. nesting_level is now a member variable of `DataTypeSerDe` rather than a method parameter.

In addition, this PR fixes a bug where fileSize was incorrect, introduced by #25854.
2023-11-01 23:48:55 +08:00
af8832389f [feature](Nereids) add 4 array functions (#25488)
- array_concat
- array_pushback
- array_pushfront
- array_zip
2023-10-17 04:45:15 -05:00
5f95e97c56 [fix](function) array distance should return null when result is nan (#25214) 2023-10-10 04:41:51 -05:00
90c5461ad2 [fix](Nereids) let dml work well (#24748)
Co-authored-by: sohardforaname <organic_chemistry@foxmail.com>

TODO:
1. support agg_state type
2. support implicit cast literal exception
3. use nereids execute dml for these regression cases:

- test_agg_state_nereids (for TODO 1)
- test_array_insert_overflow (for TODO 2)
- nereids_p0/json_p0/test_json_load_and_function (for TODO 2)
- nereids_p0/json_p0/test_json_unique_load_and_function (for TODO 2)
- nereids_p0/jsonb_p0/test_jsonb_load_and_function (for TODO 2)
- nereids_p0/jsonb_p0/test_jsonb_unique_load_and_function (for TODO 2)
- json_p0/test_json_load_and_function (for TODO 2)
- json_p0/test_json_unique_load_and_function (for TODO 2)
- jsonb_p0/test_jsonb_load_and_function (for TODO 2)
- jsonb_p0/test_jsonb_unique_load_and_function (for TODO 2)
- test_multi_partition_key (for TODO 2)
2023-09-26 21:08:24 +08:00
e4c0c98efa [fix](Nereids): round microsecond when specify scale of microsecond (#24854) 2023-09-26 10:11:53 +08:00
e9435c14f8 [Improve](array-func)improve array union support multi params (#24327) 2023-09-20 14:29:48 +08:00
268c867679 [Improve](serde)replace function_cast from_string to serde (#24087)
Stream load currently cannot handle a column whose type is a map/array nested inside a map/array.
Serde can handle this now, so function_cast's from_string is replaced with serde.
Note: if an item in complex type data is empty, an error is returned instead of making up a default value, because a correct default cannot yet be defined for complex types.
2023-09-14 13:53:16 +08:00
6c5072ffc5 [FIX](array-func) fix array index func with decimal (#23399)
Fix array index functions with decimal.
In the old analyzer, SQL that uses array_position or array_contains with a decimal may lose precision, which makes the result wrong.
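To illustrate the general class of problem (not the specific code path in the old analyzer, which the commit message does not describe), the sketch below uses Python's `decimal` module to show how a membership test that succeeds on exact decimals can fail once the values pass through binary floating point. The variable names are illustrative only.

```python
from decimal import Decimal

# Exact decimal arithmetic: 0.1 + 0.2 is exactly 0.3.
decimal_array = [Decimal("0.1") + Decimal("0.2")]
found_exact = Decimal("0.3") in decimal_array

# The same values as binary floats: 0.1 + 0.2 is not exactly 0.3,
# so an array_contains-style lookup misses the element.
float_array = [0.1 + 0.2]
found_lossy = 0.3 in float_array

print(found_exact, found_lossy)
```

Keeping the decimal type for both the array elements and the searched value avoids the mismatch.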
2023-08-24 17:58:20 +08:00
22e373a799 [feature](vector-search) add 4 distance functions to support vector search (#23129) 2023-08-23 15:51:15 +08:00
b670dd0db7 [feature](Nereids) support array type (#22851)
FEATURE:
1. enable array type in Nereids
2. support generics in function signatures
3. support array and map type in type coercion and type check
4. add element_at and element_slice syntax in Nereids parser

REFACTOR:
1. remove AbstractDataType

BUG FIX:
1. remove FROM from nonReserved keyword list

TODO:
1. support lambda expression
2. use Nereids' way to do function type coercion
3. use castIfnotSame when doing implicit cast on BoundFunction
4. let AnyDataType type coercion do the same thing as function type coercion
5. add the array functions below
- array_apply
- array_concat
- array_filter
- array_sortby
- array_exists
- array_first_index
- array_last_index
- array_count
- array_shuffle shuffle
- array_pushfront
- array_pushback
- array_repeat
- array_zip
- reverse
- concat_ws
- split_by_string
- explode
- bitmap_from_array
- bitmap_to_array
- multi_search_all_positions
- multi_match_any
- tokenize
2023-08-22 09:47:55 +08:00
2d96d19030 [FIX](array-func) fix array() with decimal type (#23117)
If we write SQL such as: select array(1.0,2.0,null, null,2.0)
the argument type passed to BE is uint8, which does not match the array() function signature with decimal and makes BE core. So the cast should happen on the BE side, with the null tag cast to the decimal type.
2023-08-18 12:12:50 +08:00
c1f36639fd [fix](sort) VSortedRunMerger does not return any rows with a large offset value (#22191) 2023-07-31 22:28:13 +08:00
7261845b3d [FIX](complex-type)fix complex type nested col_const (#22375)
For array/map/struct, unpack_if_const in mysql_writer only unpacks the column itself, not its nested columns, so col_const should not be used for nested columns.
2023-07-31 14:53:18 +08:00
b5fa29e138 [fix](bitmap) incorrect result of function 'bitmap_from_array' (#22305) 2023-07-27 22:44:06 +08:00
18beb822a3 [FIX](array-type) fix array string output with fe const expr (#21042)
The FE foldconstRule turns an array() function expression with constant literals into an array literal and does not pass it to BE, so the FE array string output format should be the same as the BE array string output.
2023-06-21 11:52:02 +08:00
Pxl
a0d4f11667 [Bug](function) catch error state in function cast to avoid core dump (#20751)
catch error state in function cast to avoid core dump
2023-06-14 17:34:34 +08:00
99c0592157 [Feature](array-function) Support array_pushback function #17417 (#19988)
Implement array_pushback.

mysql> select array_pushback([1, 2], 3);
+--------------------------------+
| array_pushback(ARRAY(1, 2), 3) |
+--------------------------------+
| [1, 2, 3]                      |
+--------------------------------+
1 row in set (0.01 sec)
2023-06-12 16:51:12 +08:00
1f032a551d [Improve](array-functions) support array first function (#20397)
Add the array_first(lambda, [1,2,3,null]) function to Doris.
2023-06-06 12:08:46 +08:00
59a0f80233 [Improve](array-function)Improve array function intersect (#20085)
Currently the array intersect function supports only 2 arrays, but the intersect operation can support more than 2 arrays.
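One way to extend a two-array intersection to any number of arrays is to fold the pairwise operation across the inputs. The Python sketch below shows that idea only; it is not the Doris implementation, and the choice to keep first-array order and drop duplicates is an assumption made for the example.

```python
from functools import reduce
from typing import List, Sequence


def intersect_two(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Elements of a that also occur in b, in a's order, without duplicates."""
    b_values = set(b)
    seen = set()
    result = []
    for x in a:
        if x in b_values and x not in seen:
            seen.add(x)
            result.append(x)
    return result


def array_intersect(first: Sequence[int], *others: Sequence[int]) -> List[int]:
    """Fold the pairwise intersection over every remaining array."""
    return reduce(intersect_two, others, list(first))


print(array_intersect([1, 2, 3], [2, 3, 4], [3, 5]))
```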
2023-06-05 10:38:48 +08:00
d68f3f3b3d [Feature](array-functions)improve array functions for array_last_index (#20294)
Currently only array_first_index is supported for lambda input; array_last_index is missing.
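The relationship between the two functions can be sketched as a forward scan and a backward scan with the same predicate. The Python model below is an assumption-laden illustration: the 1-based positions, the 0 returned when nothing matches, and the skipping of null elements are conventions chosen for the sketch, not confirmed by the commit message.

```python
from typing import Callable, Optional, Sequence


def array_first_index(pred: Callable[[int], bool],
                      values: Sequence[Optional[int]]) -> int:
    """1-based position of the first matching element, or 0 (assumed convention)."""
    for position, value in enumerate(values, start=1):
        if value is not None and pred(value):
            return position
    return 0


def array_last_index(pred: Callable[[int], bool],
                     values: Sequence[Optional[int]]) -> int:
    """1-based position of the last matching element, or 0 (assumed convention)."""
    for position in range(len(values), 0, -1):
        value = values[position - 1]
        if value is not None and pred(value):
            return position
    return 0


print(array_first_index(lambda x: x > 1, [1, 2, 3, None]))
print(array_last_index(lambda x: x > 1, [1, 2, 3, None]))
```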
2023-06-02 13:54:03 +08:00
519f01133a [feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811) 2023-06-01 13:09:58 +08:00
ff05217a1e [regression](p0) fix test for array_enumerate_uniq (#20231) 2023-05-30 22:14:19 +08:00
bb12a1cb49 [Enhance](array function) add support for DecimalV3 for array_enumerate_uniq() (#17724) 2023-05-30 13:09:19 +08:00
55ccddb62c [Conf](decimalv3) enable decimalv3 by default 2023-05-29 15:38:31 +08:00
ee34b6de2d [Refact] (serde) refact mysql serde with data type (#19543)
Refactor MySQL output (de)serialization to use data type serde, avoiding the switch-case on primitive type written in mysqlWriter.
2023-05-26 14:11:17 +08:00
67dc68630b [Improve](complex-type)improve array/map/struct creating and function with decimalv3 (#19830) 2023-05-19 17:43:36 +08:00
325a1d4b28 [vectorized](function) support array_count function (#18557)
Support the array_count function.
array_count: returns the number of non-zero and non-null elements in the given array.
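The counting rule in that sentence can be written down directly. This Python sketch models only the rule as stated (count the elements that are neither null nor zero); it is not the Doris implementation, and `None` stands in for SQL null.

```python
from typing import Optional, Sequence


def array_count(values: Sequence[Optional[int]]) -> int:
    """Count the elements that are neither null (None) nor zero."""
    return sum(1 for value in values if value is not None and value != 0)


print(array_count([1, 0, None, 3]))
```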
2023-05-16 17:00:01 +08:00