Commit Graph

307 Commits

Author SHA1 Message Date
75e2bc8a25 [function](bitmap) support bitmap_to_base64 and bitmap_from_base64 (#23759) 2023-09-02 00:58:48 +08:00
3a34ec95af [FE](fucntion) add date_floor/ceil in FE function (#23539) 2023-08-31 19:26:47 +08:00
d326cb0c99 [fix](planner) array constructor do type coercion with decimal in wrong way (#23630)
array creator with decimal type and integer type parameters should return array<decimal>,
but the legacy planner return array<double>
2023-08-30 11:18:31 +08:00
94a8fa6bc9 [bug](function) fix explode_number function return wrong rows (#23603)
before the explode_number function result is random with const value.
because the _cur_size is reset, so it's can't insert values to column.
2023-08-29 19:02:49 +08:00
0128dd42d9 [fix](regexp_extract_all) fix be OOM when quering with regexp_extrac… (#23284) 2023-08-29 10:34:12 +08:00
7cfb3cc0aa [fix](functions) fix function substitute for datetimeV1/V2 (#23344)
* fix

* function fe
2023-08-25 09:59:38 +08:00
6c5072ffc5 [FIX](array-func) fix array index func with decimal (#23399)
fix array index func with decimal
in old analyzer when sql with array_position or array_contains with decimal , may loss precision to which will make result wrong
2023-08-24 17:58:20 +08:00
22e373a799 [feature](vector-search) add 4 distance functions to support vector search (#23129) 2023-08-23 15:51:15 +08:00
Pxl
8ed4045df9 [Chore](primitive-type) remove VecPrimitiveTypeTraits (#22842) 2023-08-23 08:37:40 +08:00
f842a5cf37 [fix](case) comment out orthogonal_bitmap_* (#23259)
comment out orthogonal_bitmap_*
2023-08-22 10:02:53 +08:00
b670dd0db7 [feature](Nereids) support array type (#22851)
FEATURE:
1. enable array type in Nereids
2. support generice on function signature
3. support array and map type in type coercion and type check
4. add element_at and element_slice syntax in Nereids parser

REFACTOR:
1. remove AbstractDataType

BUG FIX:
1. remove FROM from nonReserved keyword list

TODO:
1. support lambda expression
2. use Nereids' way do function type coercion
3. use castIfnotSame when do implict cast on BoundFunction
4. let AnyDataType type coercion do same thing as function type coercion
5. add below array function
- array_apply
- array_concat
- array_filter
- array_sortby
- array_exists
- array_first_index
- array_last_index
- array_count
- array_shuffle shuffle
- array_pushfront
- array_pushback
- array_repeat
- array_zip
- reverse
- concat_ws
- split_by_string
- explode
- bitmap_from_array
- bitmap_to_array
- multi_search_all_positions
- multi_match_any
- tokenize
2023-08-22 09:47:55 +08:00
ae9f04f969 [fix](array) fix typeExtactMatch for array() type (#23264)
if we write sql with : `select cast(array() as array<varchar(10)>)`
castexpr in fe will call analyze() with `Type.matchExactType(childType, type, true);`
here array type only check contains_null , but should check inner type to make array matchExactType right
2023-08-21 19:41:09 +08:00
2d96d19030 [FIX](array-func) fix array() with decimal type (#23117)
if we write sql with : select array(1.0,2.0,null, null,2.0)
here will pass arg type with uint8 to be which does not match array() func sign with deicmal, and make be core. so here should cast from be and make null tag to cast decimal type
2023-08-18 12:12:50 +08:00
d7a6b64a65 [Fix](Planner) fix case function with null cast to array null (#22947) 2023-08-17 16:37:07 +08:00
2019bb3870 [fix](bitmap) fix wrong result of bitmap intersect functions (#22735)
* [fix](bitmap) fix wrong result of bitmap intersect functions

* fix test case
2023-08-09 18:31:24 +08:00
4608dcb2d9 [fix](agg) fix coredump caused by push down count aggregation (#22699)
fix coredump caused by push down count aggregation
2023-08-09 10:21:20 +08:00
1a8a1e5b16 [Feature](count_by_enum) support count_by_enum function (#22071)
count_by_enum(expr1, expr2, ... , exprN);

Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.
2023-08-06 16:05:14 +08:00
d3b50e3b2a [BUG](date_trunc) fix date_trunc function only handle lower string (#22602)
fix date_trunc function only handle lower string
2023-08-05 12:53:13 +08:00
265cded7da [Fix](Planner) fix window function in aggregation (#22603)
Problem:
When window function in aggregation function, executor would report an error like: Required field 'node_type' was not present!

Example:
SELECT SUM(MAX(c1) OVER (PARTITION BY c2, c3)) FROM test_window_in_agg;

Reason:
When analyze aggregate, analytic expr (window function carrior when analyze) transfered to slot and loss message. So when
serialize to thrift package, TExpr can not determine node_type of analytic expr.

Solved:
We do not support aggregate(window function) yet. So we report an error when analyze.
2023-08-04 19:15:51 +08:00
d5a21de796 [Enhancement](planner)support fold constant for date_trunc() (#22122) 2023-08-04 13:32:48 +08:00
469886eb4e [FIX](array)fix if function for array() #22553
[FIX](array)fix if function for array() #22553
2023-08-03 19:40:45 +08:00
7a2ff56863 [regression](fix) fix test_round case (#22441) 2023-08-01 11:35:44 +08:00
c1f36639fd [fix](sort) VSortedRunMerger does not return any rows with a large offset value (#22191) 2023-07-31 22:28:13 +08:00
7261845b3d [FIX](complex-type)fix complex type nested col_const (#22375)
for array/map/struct in mysql_writer unpack_if_const only unpack self column not nested , so col_const should not used in nested column.
2023-07-31 14:53:18 +08:00
79289e32dc [fix](cast) fix wrong result of casting empty string to array date (#22281) 2023-07-30 21:15:03 +08:00
c2155678ca [fix](functions) fix now(null) crash (#22321)
before: BE crash
now:

mysql [test]>select now(null);
+-----------+
| now(NULL) |
+-----------+
| NULL      |
+-----------+
1 row in set (0.06 sec)
2023-07-28 14:07:56 +08:00
341c45974c [round](decimalv2) round precise decimalv2 value (#22258) 2023-07-27 10:00:36 +08:00
Pxl
560731f392 [Bug](runtime-filter) fix probe expr prepared twice on minmax runtime filter (#22229)
fix probe expr prepared twice on minmax runtime filter
2023-07-26 19:44:35 +08:00
8ff487cc4b [fix](cast) fix invalid value error when casting null date value to string then casting to date value (#22223) 2023-07-26 17:59:01 +08:00
a0463ea047 [round](decimalv2) round decimalv2 to precision value (#22138)
* [round](decimalv2) round decimalv2 to precision value

* update

* update`
2023-07-25 03:29:48 +08:00
99bf901607 [fix](in) throw exception for unsupported data type of in expr (#22050) 2023-07-24 14:13:31 +08:00
3d0f952934 [FIX](complex-type)delete enable_map/struct_type switch #21957 2023-07-22 15:29:32 +08:00
5b043a980e [fix](planner)only forbid literal value in AnalyticExpr's order by list (#21819)
* [fix](planner)only forbid literal value in AnalyticExpr's order by list
2023-07-19 09:40:55 +08:00
03b575842d [Feature](table function) support explode_json_array_json (#21795) 2023-07-17 11:40:02 +08:00
736d6f3b4c [improvement](timezone) support mixed uppper-lower case of timezone names (#21572) 2023-07-11 09:37:14 +08:00
fba3ae96b9 Revert "[Fix](planner) Set inline view output as non constant after analyze (#21212)" (#21581)
This reverts commit 0c3acfdb7c744decb7b60e372007707a55d14e00.
2023-07-06 20:30:27 +08:00
0c3acfdb7c [Fix](planner) Set inline view output as non constant after analyze (#21212)
Problem:
Select list should be non const when from list have tables or multiple tuples. Or upper query will regard wrong of isConstant
And make wrong constant folding
For example: when using nullif funtion with subquery which result in two alternative constant, planner would treat it as constant expr. So analyzer would report an error of order by clause can not be constant

Solusion:
Change inline view output to non constant, because (select 1 a from table) as view , a in output is no constant when we see
view.a outside
2023-07-06 15:37:43 +08:00
33fa5dd1e9 [fix](cast) fix coredump of cast string of invalid datetime (#21350)
For sql like select cast("627492340" as datetime); the string is an invalid datetime, function DateV2Value<T>::from_date_str cast it as datetime 2062-74-92 23:40:00, with an out-of-range month and day value, which cause memory violation in function DateV2Value<T>::format_datetime when trying to access s_days_in_month.

==256444==ERROR: AddressSanitizer: global-buffer-overflow on address 0x55a7c1a5cff8 at pc 0x55a7e5aa3d2a bp 0x7f3b805f0370 sp 0x7f3b805f0368
READ of size 4 at 0x55a7c1a5cff8 thread T390 (FragmentMgrThre)
    #0 0x55a7e5aa3d29 in doris::vectorized::DateV2Value<doris::vectorized::DateTimeV2ValueType>::format_datetime(unsigned int*, bool*) const /home/zcp/repo_center/doris_master/doris/be/src/vec/runtime/vdatetime_value.cpp:1821:31
    #1 0x55a7e5aa3052 in doris::vectorized::DateV2Value<doris::vectorized::DateTimeV2ValueType>::from_date_str(char const*, int, int) /home/zcp/repo_center/doris_master/doris/be/src/vec/runtime/vdatetime_value.cpp:1968:5
    #2 0x55a7d48f0c49 in bool doris::vectorized::read_datetime_v2_text_impl<unsigned long>(unsigned long&, doris::vectorized::ReadBuffer&, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/vec/io/io_helper.h:309:19
    #3 0x55a7ddb21642 in bool doris::vectorized::try_read_datetime_v2_text<unsigned long>(unsigned long&, doris::vectorized::ReadBuffer&, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/vec/io/io_helper.h:409:12
    #4 0x55a7ddb215ec in bool doris::vectorized::try_parse_impl<doris::vectorized::DataTypeDateTimeV2, unsigned int, void*>(doris::vectorized::DataTypeDateTimeV2::FieldType&, doris::vectorized::ReadBuffer&, DateLUTImpl const*, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:839:16
    #5 0x55a7ddb21c84 in auto doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*)::'lambda'(void*, auto)::operator()<std::integral_constant<bool, false>, std::integral_constant<bool, true>>(void*, auto) const /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:1340:38
    #6 0x55a7ddb216f7 in void* std::__invoke_impl<doris::Status, doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*)::'lambda'(void*, auto), std::integral_constant<bool, false>, std::integral_constant<bool, true>>(std::__invoke_other, auto&&, std::integral_constant<bool, false>&&, std::integral_constant<bool, true>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #7 0x55a7ddb2167f in std::__invoke_result<void*, std::integral_constant<bool, false>, std::integral_constant<bool, true>>::type std::__invoke<doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*)::'lambda'(void*, auto), std::integral_constant<bool, false>, std::integral_constant<bool, true>>(void*&&, std::integral_constant<bool, false>&&, std::integral_constant<bool, true>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96:14
    #8 0x55a7ddb20d14 in std::__detail::__variant::__gen_vtable_impl<std::__detail::__variant::_Multi_array<std::__detail::__variant::__deduce_visit_result<doris::Status> (*)(doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*)::'lambda'(void*, auto)&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&)>, std::integer_sequence<unsigned long, 0ul, 1ul>>::__visit_invoke(doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*)::'lambda'(void*, auto)&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1013:11
    #9 0x55a7ddb20c15 in decltype(auto) std::__do_visit<std::__detail::__variant::__deduce_visit_result<doris::Status>, doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*)::'lambda'(void*, auto), std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>>(auto&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1714:14
    #10 0x55a7ddb20b6a in decltype(auto) std::visit<doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*)::'lambda'(void*, auto), std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>>(void*&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&, std::variant<std::integral_constant<bool, false>, std::integral_constant<bool, true>>&&) /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/variant:1769:9
    #11 0x55a7ddb205ff in doris::Status doris::vectorized::ConvertThroughParsing<doris::vectorized::DataTypeString, doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute<void*>(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long, bool, void*) /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:1321:23
    #12 0x55a7ddb1f2c7 in doris::vectorized::FunctionConvertFromString<doris::vectorized::DataTypeDateTimeV2, doris::vectorized::NameCast>::execute_impl(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long>> const&, unsigned long, unsigned long) /home/zcp/repo_center/doris_master/doris/be/src/vec/functions/function_cast.h:1417:20
2023-06-30 10:12:31 +08:00
e0b20f0437 [feature](function) add ip function ipv4numtostring (alias inet_ntoa) (#20936) 2023-06-27 10:17:40 +08:00
baf9a2107b [fix](regression) fix case failure by adding sync after stream load (#21155) 2023-06-26 10:38:46 +08:00
55e7af1e31 [fix](test) fix two case bug #21124 2023-06-25 18:53:20 +08:00
0cf9de8cef [fix](decimalv3) fix result error when cast a round decimalv3 to double (#20678) 2023-06-21 00:02:48 +08:00
824bc02603 [Function] Support date function: microsecond() (#20044) 2023-06-20 10:32:54 +08:00
0bd9aecfcc [Fix](planner&Nereids) fold constant invalid yyyy-mm-31 to the last day of month incorrectly (#20685)
currently, expression: cast('20230631' as date) will be evaluate to 2023-06-30 incorrectly, and '20230632' will be null, we fix the bug and evaluate all the invalid date to null.
2023-06-16 16:23:02 +08:00
97135a1cbb [Feature] (json)add json_contains function (#20824) 2023-06-16 15:10:12 +08:00
f741129ffe [Fix](regression-test) fix incorrect explain requirement when globally enable nereids (#20809) 2023-06-16 14:51:22 +08:00
affe36d32e [test](find_in_set) add find_in_set function test case (#20718) 2023-06-14 09:43:48 +08:00
99c0592157 [Feature](array-function) Support array_pushback function #17417 (#19988)
Implement array_pushback.

mysql> select array_pushback([1, 2], 3);
+--------------------------------+
| array_pushback(ARRAY(1, 2), 3) |
+--------------------------------+
| [1, 2, 3]                      |
+--------------------------------+
1 row in set (0.01 sec)
2023-06-12 16:51:12 +08:00
10134ea8c6 [fix](planner) fix RewriteInPredicateRule may be useless (#20668)
Issue Number: close #20669

RewriteInPredicateRule may cast InPredicate expr's two child to the same type, for example: where cast(age as char) in ('11'), the type of age is int, RewriteInPredicateRule will cast expr's two child type to int. As in the example above, child 0 will be such struct: 
```
child 0: type: int
    |---  child: type : char
            |-- child: type : int
```

Due to the RewriteInPredicateRule cast the type of the expr to int, it will reanalyze stmt, but it will reset stmt first before reanalyze the stmt, and reset opt will change child 0 to such struct:
```
child: type : char
    |-- child: type : int
```
It cause two child's type will be cast to varchar in func castAllToCompatibleType, the logic of RewriteInPredicateRule will be useless.

In 1.1-lts and 1.2-lts, such case  " where cast(age as char) in ('11')"  can't work well,  because func castAllToCompatibleType will cast int to char but int can't cast to char(master can work well because func castAllToCompatibleType will cast int to varchar in such case).
```
MySQL [test]> select user_id from test_cast where cast(age as char) in ('45');
ERROR 1105 (HY000): errCode = 2, detailMessage = type not match, originType=INT, targeType=CHAR(*)
```
2023-06-12 14:39:01 +08:00
a347063390 [fix](case expr) fix coredump of case for null value 2 (#20635)
fix coredump of case for null value 2
2023-06-11 23:08:53 +08:00