Commit Graph

426 Commits

Author SHA1 Message Date
47edc5a06e [fix](functions) Support nullable column for multi_string functions (#19498) 2023-05-11 01:13:13 +08:00
Pxl
dfad7b6b38 [Feature](generic-aggregation) some prowork of generic aggregation (#19343)
some prowork of generic aggregation
2023-05-09 21:42:21 +08:00
673cbe3317 [chore](build) Porting to GCC-13 (#19293)
Support using GCC-13 to build the codebase.
2023-05-08 10:42:06 +08:00
6626f26506 [optimize](string) optimize char_length function by SIMD (#18925)
Optimize char_length function by SIMD
(1) optimize utf8_len compute
(2) 840% up
2023-04-28 17:22:35 +08:00
Pxl
ec517a53a8 [Chore](build) upgrade clang-format version to 16 && move thrift to fe-common (#19155)
upgrade clang-format version to 16
move thrift to fe-common
fix core dump on pipeline engine when operator canceled and not prepared
2023-04-28 14:14:51 +08:00
20395ce501 [feature](array_function): add support for array_cum_sum function (#18231) 2023-04-27 09:57:13 +08:00
925efc1902 [bug](map-type)fix some bugs in map and map element function (#18935)
fix some bugs in map and map element function.
2023-04-26 22:10:15 +08:00
1dfc5ea34c [bugfix](jsonb) fix jsonb parser crash on noavx2 host (#18977)
support avx2 and noavx2 for jsonb parser using __AVX2__ macro.
2023-04-26 15:10:12 +08:00
375789d345 [enhancement](JNI) Provide default environment variables if it is unset (#19041) 2023-04-26 12:06:38 +08:00
5fd6d8ebd4 [fix](function) Support more behaviors of cast time in MySQL 2023-04-26 07:49:54 +08:00
8d21f20753 [enhancement](javaudf) not depend on parent will cause deconstructor core (#18948)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-25 15:26:54 +08:00
16a394da0e [chore](build) Use include-what-you-use to optimize includes (PART III) (#18958)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-24 14:51:51 +08:00
ab2a6864bc [function](json) Json unquote (#18037) 2023-04-24 10:33:29 +08:00
b75f4c97f3 [function](string) support char function (#18878)
* [function](string) support char function

* fix
2023-04-22 08:36:48 +08:00
de0e89d1b4 [feature](function) Modified cast as time to behave more like MySQL (#18565)
Because the underlying type of time was float64, select cast("19:22:18" as time) would result in a null value in the past.
Results in the following:
2023-04-22 06:11:59 +08:00
ec1ab1a3d2 [Improve](GEO)wkb input and output are represented as hexadecimal strings And delete EWKB (#18721) 2023-04-21 15:11:18 +08:00
ab9500bfa6 [optimize](string) optimize instr and locate function for constant arguments (#18692)
Optimize instr and locate function for constant arguments.

    instr and locate function constant arguments has 58%~200% performance improvement.
    refactor locate(substr, str, pos) as standardized arguments processing.
2023-04-20 10:40:19 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
fb377a9da9 [Improvement](functions)Optimized some datetime function's return value (#18369) 2023-04-19 15:51:11 +08:00
79c446c89f [enhancement](exception) Column filter/replicate supports exception safety (#18503) 2023-04-18 19:23:09 +08:00
564446e52f [Refact](type system) refact serde for type system and pb serde impl (#18627) 2023-04-18 14:13:56 +08:00
18898db09d [feature](function) Add new parameters to 'trim'. (#18580) 2023-04-18 14:13:30 +08:00
0b074ade02 [fix](const column) fix coredump caused by const column for some functions (#18737) 2023-04-18 13:57:55 +08:00
1e06763366 [fix](bitmap) fix bitmap_count errors to set nullable to non-nullable bitmap col (#18689) 2023-04-17 13:23:27 +08:00
092d81f88a [BugFix](functions) fix multi_search_all_positions #18682 2023-04-17 08:32:57 +08:00
c704351273 [enhancement](memory) Refactor memory limit exceeded behavior (#18590)
No check mem tracker limit and no cancel task in mem hook, only in Allocator. This helps in clearer analysis of memory issues and reduces performance loss.
PODArray/hash table/arena memory allocation will use Allocator.

Optimize mem limit exceeded log printing

Optimize compilation time
2023-04-14 10:42:35 +08:00
2519931a04 [vectorized](function) support time_to_sec function (#18354)
support time_to_sec function
2023-04-13 19:31:12 +08:00
2f64a8b387 [feature](GEO)Support read/write WKB/EWKB to gis types (#18526)
Support mutual conversion from wkb and gis types.also compatible with EWKB format
https://cwiki.apache.org/confluence/display/DORIS/DSIP-033%3A+More+GEO+functions
2023-04-13 16:25:18 +08:00
4335c9998f [chore](ARM) Add some vectorization compatibility code on aarch64 (#18553)
update sse2noen to support more sse code on arm cpus
2023-04-13 10:15:33 +08:00
d57371da13 [feature](struct-type) support basic struct constructor function (#18190)
This commit will support struct and named_struct function.
2023-04-13 09:18:00 +08:00
43392918cd [Optimization](functions)Optimize function call for const columns. (#18310) 2023-04-12 11:11:01 +08:00
1238f6de97 [bug](array) fix be core in array_with_constant/array_repeat function when the first argument is nullable (#18404)
fix be core in array_with_constant/array_repeat function when the first argument is nullable
2023-04-11 19:46:41 +08:00
0c5e3df4a3 [optimize](string) optimize split_by_string and substring_index function (#18496)
Use SIMD stringsearcher and SIMD memcmp optimze split_by_string and substring_index function.

split_by_string function has 32%~540% up
substring_index function has 22%~46% up
Performance difference depends on the needle size and whether the needle is constant param. And the longer the needle, the more performance improvement
2023-04-11 15:49:03 +08:00
5efafefeda [refactor](string) remove volnitsky search algorithm (#18474) 2023-04-10 10:56:07 +08:00
f38e00b4c0 [refactor](typesystem) using typeindex to create column instead of type name because type name is not stable (#18328)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-09 18:08:31 +08:00
3d28de6e54 [Enhencement](like) fallback to re2 if hyperscan failed (#18350) 2023-04-09 09:18:13 +08:00
fb50626075 [optimize](string) optimize concat function by SIMD memcpy (#18458)
Optimize concat function 29% up by memcpy_small_allow_read_write_overflow15.
Optimize string functions list: concat, convert_to, mask, initcap, lower, upper.

concat function has 29% up:
2023-04-08 17:05:34 +08:00
58bbd46c65 [Optimization](string) optimize constant empty string compare ( column='', column!='') (#18321)
Optimize constant empty string compare:
(1) When the constant empy string '' (size is 0), we can compare offsets in SIMD directly.

q10: SELECT MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhoneModel ORDER BY u DESC LIMIT 10;
q11: SELECT MobilePhone, MobilePhoneModel, COUNT(DISTINCT UserID) AS u FROM hits WHERE MobilePhoneModel <> '' GROUP BY MobilePhone, MobilePhoneModel ORDER BY u DESC LIMIT 10;
q12: SELECT SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY c DESC LIMIT 10;
q13: SELECT SearchPhrase, COUNT(DISTINCT UserID) AS u FROM hits WHERE SearchPhrase <> '' GROUP BY SearchPhrase ORDER BY u DESC LIMIT 10;
q14: SELECT SearchEngineID, SearchPhrase, COUNT(*) AS c FROM hits WHERE SearchPhrase <> '' GROUP BY SearchEngineID, SearchPhrase ORDER BY c DESC LIMIT 10;
Issue Number: close #xxx
2023-04-08 16:04:10 +08:00
0517616242 [vectorized](function) support array_repeat function to be compatible with hive syntax (#18028)
---------

Co-authored-by: zhangyu209 <zhangyu209@meituan.com>
2023-04-08 15:50:28 +08:00
f6f4dac1d0 [Improvement](DECIMAL) Improve decimal operation (#18437) 2023-04-07 15:58:28 +08:00
550c8aa648 [Bug](DECIMALV3) fix wrong decimal scale returned by function round (#18375) 2023-04-06 14:44:21 +08:00
Pxl
76d76f672c [Chore](build) enchancement for backend build time usage (#18344) 2023-04-06 11:13:21 +08:00
50e6c4216a [vectorized](function) suppoort date_trunc function truncate week mode (#18334)
support date_trunc could truncate week eg:
select date_trunc('2023-4-3 19:28:30', 'week');
2023-04-04 10:24:26 +08:00
8b85c55117 [vectorized](function) Support array_shuffle and shuffle function. (#18116)
---------

Co-authored-by: zhangyu209 <zhangyu209@meituan.com>
2023-04-04 08:53:13 +08:00
d4688620e9 [opt](array) optimize array_sortby using qsort instead of bubble sort #18311 2023-04-03 17:10:51 +08:00
961f5d1bb7 [feature](function)Add St_Angle/St_Azimuth function (#18293)
Add St_Angle/St_azimuth function:
St_Angle:
Enter three point, which represent two intersecting lines. Returns the angle between these lines. Point 2 and point 1 represent the first line and point 2 and point 3 represent the second line. The angle between these lines is in radians, in the range [0, 2pi). The angle is measured clockwise from the first line to the second line.

`

mysql> SELECT ST_Angle(ST_Point(1, 0),ST_Point(0, 0),ST_Point(0, 1));
+----------------------------------------------------------------------+
| st_angle(st_point(1.0, 0.0), st_point(0.0, 0.0), st_point(0.0, 1.0)) |
+----------------------------------------------------------------------+
| 4.71238898038469 |
+----------------------------------------------------------------------+
1 row in set (0.04 sec)
`

St_azimuth:
Enter two point, and returns the azimuth of the line segment formed by points 1 and 2. The azimuth is the angle in radians measured between the line from point 1 facing true North to the line segment from point 1 to point 2.
`

mysql> SELECT st_azimuth(ST_Point(0, 0),ST_Point(1, 0));
+----------------------------------------------------+
| st_azimuth(st_point(0.0, 0.0), st_point(1.0, 0.0)) |
+----------------------------------------------------+
| 1.5707963267948966 |
+----------------------------------------------------+
1 row in set (0.04 sec)
2023-04-03 13:01:59 +08:00
94e3472050 [bug](function) fix count equal function return incorrect value (#18200)
fix count equal function return incorrect value
2023-04-03 11:20:36 +08:00
7cd8f7c9ba [fix](grouping) fix coredump of grouping function for outer join (#18292)
Result of functions grouping and grouping_id is always not nullable, but outer join will convert the result column to nullable when necessary, which will cause mismatch of column type and column object when executing unctions grouping and grouping_id.
2023-04-03 09:35:31 +08:00
a77921d767 [refactor](typesystem) remove unused rpc common file and using function rpc (#18270)
rpc common is duplicate, all its method is included in function rpc. So that I remove it.
get_field_type is never used, remove it.
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-31 18:13:25 +08:00
20b3bdb000 [vectorized](function) support array_first_index function (#18175)
mysql> select array_first_index(x->x+1>3, [2, 3, 4]);
+-------------------------------------------------------------------+
| array_first_index(array_map([x] -> x(0) + 1 > 3, ARRAY(2, 3, 4))) |
+-------------------------------------------------------------------+
|                                                                 2 |
+-------------------------------------------------------------------+

mysql> select array_first_index(x -> x is null, [null, 1, 2]);
+----------------------------------------------------------------------+
| array_first_index(array_map([x] -> x(0) IS NULL, ARRAY(NULL, 1, 2))) |
+----------------------------------------------------------------------+
|                                                                    1 |
+----------------------------------------------------------------------+

mysql> select array_first_index(x->power(x,2)>10, [1, 2, 3, 4]);
+---------------------------------------------------------------------------------+
| array_first_index(array_map([x] -> power(x(0), 2.0) > 10.0, ARRAY(1, 2, 3, 4))) |
+---------------------------------------------------------------------------------+
|                                                                               4 |
+---------------------------------------------------------------------------------+
2023-03-31 12:51:29 +08:00