444 Commits

Author SHA1 Message Date
c03a19ea23 [improvement](bitmap) Using set to store a small number of elements to improve performance (#19973)
Test on SSB 100g:

select lo_suppkey, count(distinct lo_linenumber) from lineorder group by lo_suppkey;
exec time: 4.388s

After creating a materialized view:

create materialized view customer_uv as select lo_suppkey, bitmap_union(to_bitmap(lo_linenumber)) from lineorder group by lo_suppkey;
select lo_suppkey, count(distinct lo_linenumber) from lineorder group by lo_suppkey;
exec time: 12.908s

With this patch, exec time: 5.790s
2023-05-31 16:13:42 +08:00
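The idea named in the title of c03a19ea23 (store a small number of elements in a set) can be illustrated with a toy sketch. This is not the Doris implementation: the class name, the `SET_LIMIT` threshold, and the use of a Python int as the "large" representation are all assumptions made for illustration, and only non-negative integers are handled.

```python
class SmallSetBitmap:
    """Toy bitmap: a plain set while small, a bit vector once it grows."""

    SET_LIMIT = 32  # hypothetical threshold, not taken from the patch

    def __init__(self):
        self._small = set()  # used while the element count is small
        self._bits = None    # Python int used as a bit vector after promotion

    def add(self, value):
        if self._bits is not None:
            self._bits |= 1 << value
            return
        self._small.add(value)
        if len(self._small) > self.SET_LIMIT:
            self._promote()

    def _promote(self):
        # Convert the small set into the bit-vector representation.
        bits = 0
        for v in self._small:
            bits |= 1 << v
        self._bits = bits
        self._small = None

    def values(self):
        if self._bits is None:
            return sorted(self._small)
        return [i for i in range(self._bits.bit_length()) if (self._bits >> i) & 1]

    def cardinality(self):
        if self._bits is None:
            return len(self._small)
        return bin(self._bits).count("1")

    def union(self, other):
        result = SmallSetBitmap()
        for v in self.values():
            result.add(v)
        for v in other.values():
            result.add(v)
        return result
```

A column such as lo_linenumber has very few distinct values per group, so in this sketch each group's bitmap would stay in the cheap set representation.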
de08c4a57b [enhance](match) Support match query without inverted index (#19936) 2023-05-30 15:02:57 +08:00
bb12a1cb49 [Enhance](array function) add support for DecimalV3 for array_enumerate_uniq() (#17724) 2023-05-30 13:09:19 +08:00
Pxl
bbb3af6ce6 [Feature](agg_state) support agg_state combinators (#19969)
Support the agg_state combinators state, merge, and union.
2023-05-29 13:07:29 +08:00
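A rough sketch of what the three combinators in bbb3af6ce6 could mean, using avg as the example aggregate. The function names and the (sum, count) state layout are assumptions for illustration only, not Doris internals: state builds an intermediate state, union combines states into another state, and merge combines states into a final value.

```python
def avg_state(values):
    """'state': turn raw values into an intermediate (sum, count) state."""
    present = [v for v in values if v is not None]
    return (sum(present), len(present))


def avg_union(states):
    """'union': combine several states into one state (still intermediate)."""
    total = sum(s for s, _ in states)
    count = sum(c for _, c in states)
    return (total, count)


def avg_merge(states):
    """'merge': combine several states and produce the final result."""
    total, count = avg_union(states)
    return total / count if count else None


# Two partial states, e.g. computed on different data slices.
partials = [avg_state([1, 2, 3]), avg_state([4, 5])]
final = avg_merge(partials)
```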
a86134cb39 [fix](executor) Fixed an error with cast as time. #20144
Before:

mysql [(none)]>select cast("10:10:10" as time);
+-------------------------------+
| CAST('10:10:10' AS TIMEV2(0)) |
+-------------------------------+
| 00:00:00                      |
+-------------------------------+
After:

mysql [(none)]>select cast("10:10:10" as time);
+-------------------------------+
| CAST('10:10:10' AS TIMEV2(0)) |
+-------------------------------+
| 10:10:10                      |
+-------------------------------+
Previously, this syntax was already supported:

mysql [(none)]>select cast("2023:05:01 13:14:15" as time);
+------------------------------------------+
| CAST('2023:05:01 13:14:15' AS TIMEV2(0)) |
+------------------------------------------+
| 13:14:15                                 |
+------------------------------------------+
However, "10:10:10" is also a valid datetime.

mysql [(none)]>select cast("10:10:10" as datetime);
+-----------------------------------+
| CAST('10:10:10' AS DATETIMEV2(0)) |
+-----------------------------------+
| 2010-10-10 00:00:00               |
+-----------------------------------+
So the order of parsing has been adjusted.
2023-05-29 12:17:21 +08:00
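The ambiguity described in a86134cb39 can be sketched as follows. This is an illustration only and assumes the adjusted order tries the plain time form first and falls back to datetime parsing; the function name, the regular expression, and the accepted datetime formats are hypothetical.

```python
import re
from datetime import datetime

_TIME_RE = re.compile(r"^(\d{1,2}):(\d{2}):(\d{2})$")


def cast_as_time(text):
    """Parse as a time first; only then try a datetime and keep its time part."""
    match = _TIME_RE.match(text)
    if match:
        hour, minute, second = (int(g) for g in match.groups())
        if minute < 60 and second < 60:
            return f"{hour:02d}:{minute:02d}:{second:02d}"
    # Fallback: treat the input as a datetime and drop the date.
    for fmt in ("%Y:%m:%d %H:%M:%S", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.strptime(text, fmt).strftime("%H:%M:%S")
        except ValueError:
            continue
    return None
```

If the datetime branch ran first, an input like "10:10:10" could be read as the date 2010-10-10 and lose its time part, which matches the 00:00:00 result shown in the "before" output.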
08ec5e2eb5 [fix](function) fix result column is nullable type when fast execute (#19889) 2023-05-24 10:27:50 +08:00
a434a49f71 [Bug](decimal) fix mod function (#19925)
Bug:
select id, kdcml * ktint, kdcml / ktint, kdcml % ktint from expr_test order by id;
+------+-------------------+-------------------+-----------------------+
| id   | kdcml * ktint     | kdcml / ktint     | kdcml % ktint         |
+------+-------------------+-------------------+-----------------------+
| NULL |              NULL |              NULL |                  NULL |
|    1 |            24.395 |            24.395 |  -4702111234474983.74 |
|    2 |            68.968 |            17.242 |  -4702111234474983.74 |
|    3 |           146.268 |            16.252 |  -4702111234474983.74 |
|    4 |           275.772 |            17.235 |  -4702111234474983.74 |
|    5 |           487.470 |            19.498 |  -4702111234474983.74 |
|    6 |           827.244 |            22.979 |  -4702111234474983.74 |
|    7 |          1364.860 |            27.854 |  -4702111234474983.74 |
|    8 |          2205.928 |            34.467 |  -4702111234474983.74 |
|    9 |          3509.595 |            43.328 |  -4702111234474983.74 |
|   10 |          5514.790 |            55.147 |  -4702111234474983.74 |
|   11 |          8578.988 |            70.900 |  -4702111234474983.74 |
|   12 |         13235.484 |            91.913 |  -4702111234474983.74 |
|   13 |            24.395 |            24.395 |  -4702111234474983.74 |
|   14 |            68.968 |            17.242 |  -4702111234474983.74 |
|   15 |           146.268 |            16.252 |  -4702111234474983.74 |
|   16 |           275.772 |            17.235 |  -4702111234474983.74 |
|   17 |           487.470 |            19.498 |  -4702111234474983.74 |
|   18 |           827.244 |            22.979 |  -4702111234474983.74 |
|   19 |          1364.860 |            27.854 |  -4702111234474983.74 |
|   20 |          2205.928 |            34.467 |  -4702111234474983.74 |
|   21 |          3509.595 |            43.328 |  -4702111234474983.74 |
|   22 |          5514.790 |            55.147 |  -4702111234474983.74 |
|   23 |          8578.988 |            70.900 |  -4702111234474983.74 |
|   24 |         13235.484 |            91.913 |  -4702111234474983.74 |
2023-05-23 18:24:31 +08:00
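To see why the kdcml % ktint column in a434a49f71 is clearly wrong, one can work out plausible remainders with exact decimal arithmetic. The operands below are not given in the commit; they are inferred from the first two rows (a product equal to the quotient implies ktint = 1, and 68.968 with 17.242 implies ktint = 2), so treat them as assumptions.

```python
from decimal import Decimal

# (kdcml, ktint) pairs inferred from the product and quotient columns.
inferred_rows = [
    (Decimal("24.395"), 1),
    (Decimal("34.484"), 2),
]


def expected_mod(kdcml, ktint):
    """Remainder of an exact decimal division; the sign follows the dividend."""
    return kdcml % ktint


results = [(k * t, expected_mod(k, t)) for k, t in inferred_rows]
```

Under these assumptions the remainders would be 0.395 and 0.484, nowhere near the constant -4702111234474983.74 reported for every row.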
3dcdadcea6 [Improvement](function) support decimalv3 for function least and greatest (#19931) 2023-05-22 22:48:44 +08:00
Pxl
d64be9565d [Bug](function) fix function in get wrong result when input const column (#19791)
Fix the in function returning a wrong result when the input is a const column.
2023-05-22 10:58:29 +08:00
5547bbbaef [decimalv3](function) support function width_bucket (#19806) 2023-05-19 20:28:59 +08:00
c4900eb658 [Bug](DecimalV3) fix decimalv3 functions (#19801) 2023-05-19 14:10:01 +08:00
294599ee45 [feature](jsonb) rename JSONB type name and function name to JSON (#19774)
To be more compatible with MySQL, rename JSONB type name and function name to JSON.

The old JSONB type name and jsonb_xx function can still be used for backward compatibility.

The function jsonb_extract keeps its name for now, because json_extract is already used by the JSON string function and changing it requires more work. It will be changed later.
2023-05-18 16:16:52 +08:00
88ca4f3e6b [feature](like) make like regexp used as a sql function (#19755) 2023-05-18 10:03:12 +08:00
48ec530d2c [fix](functions) fix least/greatest function coredump bug (#19462)
fix least/greatest function coredump bug
2023-05-17 14:12:52 +08:00
56809230d1 [Improvement](string function) optimize substring and in string set (#19257)
* [Improvement](string function) optimize substring and in string set

* update
2023-05-17 14:09:52 +08:00
2bdfaac609 [fix](ubsan) fix ubsan errors (#19658)
Fix ubsan errors:

doris/be/src/util/string_parser.hpp:275:58: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

doris/be/src/vec/functions/functions_comparison.h:214:51: runtime error: addition of unsigned offset to 0x7fea6c6b7010 overflowed to 0x7fea6c6b700c

doris/be/src/vec/functions/multiply.cpp:67:50: runtime error: signed integer overflow: 1295699415680000000 * 0x0000000000015401d0a4cd4890a77700 cannot be represented in type '__int128'

doris/be/src/vec/aggregate_functions/aggregate_function_percentile_approx.h:445:73: runtime error: addition of unsigned offset to 0x7feca3343d10 overflowed to 0x7feca3343d08

doris/be/src/exec/schema_scanner/schema_tables_scanner.cpp:330:24: run
2023-05-17 09:32:03 +08:00
325a1d4b28 [vectorized](function) support array_count function (#18557)
support array_count function.
array_count: returns the number of non-zero and non-null elements in the given array.
2023-05-16 17:00:01 +08:00
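The description of array_count in 325a1d4b28 maps to a very small function. The sketch below follows only the one-line description above (count elements that are neither null nor zero); the Python function name and the use of None for SQL NULL are assumptions, and any additional arguments the real function may take are not modeled.

```python
def array_count(values):
    """Count elements that are neither None (NULL) nor zero."""
    return sum(1 for v in values if v is not None and v != 0)
```

For example, an array holding 1, 0, NULL, and 3 has two qualifying elements.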
c87e78dc35 [bug](jsonb) fix jsonb query bug When the json key value contains "." (#19185)
Issue Number: close #19173

mysql> SELECT jsonb_extract('{"a.b.c":{"k1":"v31", "k2.a1": 300},"a":"opentelemetry"}', '$."a.b.c".k1');
+-------------------------------------------------------------------------------------------+
| jsonb_extract('{"a.b.c":{"k1":"v31", "k2.a1": 300},"a":"opentelemetry"}', '$."a.b.c".k1') |
+-------------------------------------------------------------------------------------------+
| "v31" |
+-------------------------------------------------------------------------------------------+
1 row in set (0.06 sec)
2023-05-15 15:43:12 +08:00
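The fix in c87e78dc35 concerns keys that themselves contain ".", which must be quoted in the path so they are not split. A minimal sketch of such path handling follows; it is not the Doris parser, the function names are hypothetical, and only object keys are handled (no array indexes or escapes inside quotes).

```python
import json


def split_path(path):
    """Split '$."a.b.c".k1' into ['a.b.c', 'k1'], honoring double-quoted keys."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    keys = []
    i, n = 1, len(path)
    while i < n:
        if path[i] != ".":
            raise ValueError(f"expected '.' at offset {i}")
        i += 1
        if i < n and path[i] == '"':
            # Quoted key: everything up to the next quote, dots included.
            end = path.find('"', i + 1)
            if end == -1:
                raise ValueError("unterminated quoted key")
            keys.append(path[i + 1:end])
            i = end + 1
        else:
            # Bare key: runs until the next dot.
            start = i
            while i < n and path[i] != ".":
                i += 1
            if start == i:
                raise ValueError("empty key")
            keys.append(path[start:i])
    return keys


def extract(document, path):
    """Walk the parsed JSON document along the path; None if a key is missing."""
    value = json.loads(document)
    for key in split_path(path):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return json.dumps(value)
```

With the document and path from the query above, this sketch yields the JSON string "v31", in line with the mysql output.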
47edc5a06e [fix](functions) Support nullable column for multi_string functions (#19498) 2023-05-11 01:13:13 +08:00
Pxl
dfad7b6b38 [Feature](generic-aggregation) some prowork of generic aggregation (#19343)
Some preliminary work for generic aggregation.
2023-05-09 21:42:21 +08:00
673cbe3317 [chore](build) Porting to GCC-13 (#19293)
Support using GCC-13 to build the codebase.
2023-05-08 10:42:06 +08:00
6626f26506 [optimize](string) optimize char_length function by SIMD (#18925)
Optimize the char_length function with SIMD:
(1) optimize the utf8_len computation
(2) performance up 840%
2023-04-28 17:22:35 +08:00
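The quantity that 6626f26506 speeds up can be stated in scalar form: the character length of a UTF-8 string equals the number of bytes that are not continuation bytes (those of the form 10xxxxxx). The sketch below shows only this counting rule and assumes valid UTF-8 input; it is not the SIMD code from the patch, and the function name is made up for illustration.

```python
def utf8_char_length(data):
    """Count code points by skipping UTF-8 continuation bytes (0b10xxxxxx)."""
    return sum(1 for byte in data if (byte & 0xC0) != 0x80)
```

A SIMD version would apply the same mask-and-compare to many bytes per instruction instead of one byte per loop iteration.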
Pxl
ec517a53a8 [Chore](build) upgrade clang-format version to 16 && move thrift to fe-common (#19155)
upgrade clang-format version to 16
move thrift to fe-common
fix core dump on pipeline engine when operator canceled and not prepared
2023-04-28 14:14:51 +08:00
20395ce501 [feature](array_function): add support for array_cum_sum function (#18231) 2023-04-27 09:57:13 +08:00
925efc1902 [bug](map-type)fix some bugs in map and map element function (#18935)
fix some bugs in map and map element function.
2023-04-26 22:10:15 +08:00
1dfc5ea34c [bugfix](jsonb) fix jsonb parser crash on noavx2 host (#18977)
support avx2 and noavx2 for jsonb parser using __AVX2__ macro.
2023-04-26 15:10:12 +08:00
375789d345 [enhancement](JNI) Provide default environment variables if it is unset (#19041) 2023-04-26 12:06:38 +08:00
5fd6d8ebd4 [fix](function) Support more behaviors of cast time in MySQL 2023-04-26 07:49:54 +08:00
8d21f20753 [enhancement](javaudf) not depend on parent will cause deconstructor core (#18948)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-25 15:26:54 +08:00
16a394da0e [chore](build) Use include-what-you-use to optimize includes (PART III) (#18958)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-24 14:51:51 +08:00
ab2a6864bc [function](json) Json unquote (#18037) 2023-04-24 10:33:29 +08:00
b75f4c97f3 [function](string) support char function (#18878)
* [function](string) support char function

* fix
2023-04-22 08:36:48 +08:00
de0e89d1b4 [feature](function) Modified cast as time to behave more like MySQL (#18565)
Because the underlying type of time was float64, select cast("19:22:18" as time) previously returned a null value.
Results in the following:
2023-04-22 06:11:59 +08:00
ec1ab1a3d2 [Improve](GEO)wkb input and output are represented as hexadecimal strings And delete EWKB (#18721) 2023-04-21 15:11:18 +08:00
ab9500bfa6 [optimize](string) optimize instr and locate function for constant arguments (#18692)
Optimize the instr and locate functions for constant arguments.

    With constant arguments, instr and locate show a 58%~200% performance improvement.
    Refactor locate(substr, str, pos) to use standardized argument processing.
2023-04-20 10:40:19 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
fb377a9da9 [Improvement](functions)Optimized some datetime function's return value (#18369) 2023-04-19 15:51:11 +08:00
79c446c89f [enhancement](exception) Column filter/replicate supports exception safety (#18503) 2023-04-18 19:23:09 +08:00
564446e52f [Refact](type system) refact serde for type system and pb serde impl (#18627) 2023-04-18 14:13:56 +08:00
18898db09d [feature](function) Add new parameters to 'trim'. (#18580) 2023-04-18 14:13:30 +08:00
0b074ade02 [fix](const column) fix coredump caused by const column for some functions (#18737) 2023-04-18 13:57:55 +08:00
1e06763366 [fix](bitmap) fix bitmap_count errors to set nullable to non-nullable bitmap col (#18689) 2023-04-17 13:23:27 +08:00
092d81f88a [BugFix](functions) fix multi_search_all_positions #18682 2023-04-17 08:32:57 +08:00
c704351273 [enhancement](memory) Refactor memory limit exceeded behavior (#18590)
The mem hook no longer checks the mem tracker limit or cancels tasks; this now happens only in Allocator. This makes memory issues easier to analyze and reduces performance loss.
PODArray/hash table/arena memory allocation will use Allocator.

Optimize mem limit exceeded log printing

Optimize compilation time
2023-04-14 10:42:35 +08:00
2519931a04 [vectorized](function) support time_to_sec function (#18354)
support time_to_sec function
2023-04-13 19:31:12 +08:00
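For 2519931a04, the arithmetic behind a time_to_sec style function is short enough to spell out. This is a sketch of the conversion only, assuming an "HH:MM:SS" text input with an optional leading minus sign; it does not reflect how Doris represents time values internally.

```python
def time_to_sec(text):
    """Convert 'HH:MM:SS' (optionally negative) to a number of seconds."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes, seconds = (int(part) for part in text.lstrip("-").split(":"))
    return sign * (hours * 3600 + minutes * 60 + seconds)
```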
2f64a8b387 [feature](GEO)Support read/write WKB/EWKB to gis types (#18526)
Support conversion between WKB and GIS types in both directions. The EWKB format is also supported.
https://cwiki.apache.org/confluence/display/DORIS/DSIP-033%3A+More+GEO+functions
2023-04-13 16:25:18 +08:00
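As background for 2f64a8b387, a 2D point in standard WKB is a byte-order flag, a 32-bit geometry type (1 for Point), and two 64-bit floats. The sketch below round-trips such a point through a hexadecimal string using only the Python standard library. The function names are hypothetical, EWKB extensions such as SRID are not handled, and the exact text form Doris emits may differ.

```python
import struct

WKB_POINT = 1  # geometry type code for Point in standard WKB


def point_to_wkb_hex(x, y):
    """Encode a point as little-endian WKB and return it as a hex string."""
    return struct.pack("<BIdd", 1, WKB_POINT, x, y).hex()


def wkb_hex_to_point(hex_text):
    """Decode a hex WKB point, honoring the byte-order flag in the first byte."""
    raw = bytes.fromhex(hex_text)
    order = "<" if raw[0] == 1 else ">"
    geom_type, x, y = struct.unpack(order + "Idd", raw[1:21])
    if geom_type != WKB_POINT:
        raise ValueError("only Point geometries are handled in this sketch")
    return (x, y)
```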
4335c9998f [chore](ARM) Add some vectorization compatibility code on aarch64 (#18553)
Update sse2neon to support more SSE code on ARM CPUs.
2023-04-13 10:15:33 +08:00
d57371da13 [feature](struct-type) support basic struct constructor function (#18190)
This commit adds support for the struct and named_struct functions.
2023-04-13 09:18:00 +08:00
43392918cd [Optimization](functions)Optimize function call for const columns. (#18310) 2023-04-12 11:11:01 +08:00
1238f6de97 [bug](array) fix be core in array_with_constant/array_repeat function when the first argument is nullable (#18404)
Fix a BE core dump in the array_with_constant/array_repeat functions when the first argument is nullable.
2023-04-11 19:46:41 +08:00