doris

Author	SHA1	Message	Date
ZhangYu0123	089a91ecd5	[vectorized](function) support array_exists lambda function (#17931 ) Co-authored-by: zhangyu209 <zhangyu209@meituan.com>	2023-03-23 11:11:39 +08:00
Xinyi Zou	ebef0c038d	Revert "[fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420 )" (#17887 ) This reverts commit 397cc011c4f1ba5a25c770258c13f1cd3f28b47d.	2023-03-22 13:28:25 +08:00
Mellorsssss	4193884a32	[feature](array_zip) Support array_zip function (#17696 )	2023-03-21 18:44:30 +08:00
HappenLee	7b93c17364	[Bug][Fix] regexp function core dump DCHECK failed and error result (#17953 ) CREATE TABLE `test` ( `name` varchar(64) NULL, `age` int(11) NULL ) ENGINE=OLAP DUPLICATE KEY(`name`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`name`) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); insert into `test` values ("lemon",1),("tom",2); select a.name regexp concat('^', a.name) from test a;	2023-03-21 08:56:19 +08:00
zhangstar333	dc284b62d9	[vectorized](function) support array_filter function (#17832 )	2023-03-20 23:18:10 +08:00
TengJianPing	dfa2528b5e	[fix](bitmap) fix wrong result of bitmap count functions for null values (#17849 ) bitmap count functions result is null when there are null values, which is not right:	2023-03-19 11:49:58 +08:00
ZhangYu0123	a53d46e317	[Fix](array function) fix array_pushfront function with DecimalV3 #17760 Support array_pushfront function with DecimalV3 Issue Number: close #xxx	2023-03-16 09:03:52 +08:00
ZhaoChangle	66f3ef568e	(functions) optimize const_column to full convert	2023-03-15 10:57:03 +08:00
zhangstar333	85080ee3c3	[vectorized](function) support array_map function (#17581 )	2023-03-15 10:51:29 +08:00
spaces-x	5b39fa9843	[Feature](vec)(quantile_state): support quantile state in vectorized engine (#16562 ) * [Feature](vectorized)(quantile_state): support vectorized quantile state functions 1. now quantile column only support not nullable 2. add up some regression test cases 3. set default enable_quantile_state_type = true --------- Co-authored-by: spaces-x <weixiang06@meituan.com>	2023-03-14 10:54:04 +08:00
gitccl	c302fa2564	[Feature](array-function) Support array_pushfront function (#17584 )	2023-03-13 14:26:02 +08:00
starocean999	782001c75b	[fix](planner) project should be done inside subquery (#17630 ) WITH t0 AS( SELECT report.date1 AS date2 FROM( SELECT DATE_FORMAT(date, '%Y%m%d') AS date1 FROM cir_1756_t1 ) report GROUP BY report.date1 ), t3 AS( SELECT date_format(date, '%Y%m%d') AS date3 FROM cir_1756_t2 ) SELECT row_number() OVER(ORDER BY date2) FROM( SELECT t0.date2 FROM t0 LEFT JOIN t3 ON t0.date2 = t3.date3 ) tx; The DATE_FORMAT(date, '%Y%m%d') was calculated in GROUP BY node, which is wrong. This expr should be calculated inside the subquery.	2023-03-13 11:10:27 +08:00
abmdocrt	55c42da511	[Feature](array) Support array<decimalv3> data type (#16640 )	2023-03-13 10:48:13 +08:00
camby	6dcd791b74	[feature](struct-type) support CAST AS Struct type (#17553 ) 1. add support `CAST AS Struct` from Struct type; 2. fix crash while `CAST('{}' AS Struct)`; 3. `CAST('' AS complext_type)` should return NULL instead of empty object; --------- Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2023-03-10 21:21:16 +08:00
bobhan1	e1bf9411de	[feature](array function) add support for array_enumerate_uniq (#17541 ) add support for array_enumerate_uniq()	2023-03-10 10:20:49 +08:00
lihangyu	368e6a4f9c	[Bug](array filter) Fix bug due to `ColumnArray::filter_generic` invalid inplace `size_at` after `set_end_ptr` (#17554 ) We should make a new PodArray to add items instead of do it inplace	2023-03-09 10:59:29 +08:00
Xinyi Zou	397cc011c4	[fix](function) fix AES/SM3/SM4 encrypt/ decrypt algorithm initialization vector bug (#17420 ) ECB algorithm, block_encryption_mode does not take effect, it only takes effect when init vector is provided. Solved: 192/256 supports calculation without init vector For other algorithms, an error should be reported when there is no init vector Initialization Vector. The default value for the block_encryption_mode system variable is aes-128-ecb, or ECB mode, which does not require an initialization vector. The alternative permitted block encryption modes CBC, CFB1, CFB8, CFB128, and OFB all require an initialization vector. Reference: https://dev.mysql.com/doc/refman/8.0/en/encryption-functions.html#function_aes-decrypt Note: This fix does not support smooth upgrades. during upgrade process, query may report error: funciton not found	2023-03-09 09:51:41 +08:00
ElvinWei	bd5ed2b0c2	[enhancement](histogram) optimize the histogram bucketing strategy, etc (#17264 ) * optimize the histogram bucketing strategy, etc * fix p0 regression of histogram	2023-03-08 20:12:05 +08:00
TengJianPing	eea6d770d7	[fix](bitmap) fix wrong result of bitmap_or for null (#17456 ) Result of select bitmap_to_string(bitmap_or(to_bitmap(1), null)) should be 1 instead of null. This PR fix logic of bitmap_or and bitmap_or_count. Other count related funcitons should also be checked and fix, they will be fixed in another PR.	2023-03-08 16:29:01 +08:00
bobhan1	4ea0d6c5fa	[feature](array_function) add support for array_popfront (#17416 )	2023-03-08 13:57:38 +08:00
gitccl	b1d65f855d	[Feature](array-function) Support array_concat function (#17436 )	2023-03-08 13:57:16 +08:00
zhangstar333	06468ba627	[vectorized](bug) fix array constructor function change origin column from block (#17296 )	2023-03-07 16:42:23 +08:00
Gabriel	979cf42d7a	[Bug](decimalv3) Use correct decimal scale for function round (#17232 ) Co-authored-by: maochongxin <maochongxin@gmail.com>	2023-03-01 12:28:41 +08:00
zhangstar333	1dd2a41e38	[vectorized](bug) fix window function can't handle first row of beyond (#17084 ) Issue Number: close #16845	2023-02-28 17:30:23 +08:00
HappenLee	3e40467ce6	[Bug](vec) Fix chinese pinyin order by (#17152 ) bug: some chinese word not sort by pinyin in GBK coding CREATE TABLE `test_convert` ( `a` varchar(100) NULL ) ENGINE=OLAP DUPLICATE KEY(`a`) DISTRIBUTED BY HASH(`a`) BUCKETS 3 PROPERTIES ( "replication_allocation" = "tag.location.default: 1" ); insert into test_convert values("b"), ("a"), ("c"), ("睿"), ("多"), ("丝"); Query OK, 6 rows affected (0.03 sec) {'label':'insert_ca73a6acc2194d5b_888218a3949355a6', 'status':'VISIBLE', 'txnId':'18068'} mysql [test]>select * from test_convert; +------+ \| a \| +------+ \| a \| \| c \| \| 丝 \| \| b \| \| 多 \| \| 睿 \| +------+ 6 rows in set (0.01 sec) mysql [test]>select * from test_convert order by convert(a using gbk); +------+ \| a \| +------+ \| a \| \| b \| \| c \| \| 多 \| \| 丝 \| \| 睿 \| +------+ 6 rows in set (0.01 sec)	2023-02-28 14:29:56 +08:00
奕冷	c0360f80bb	[enhancement](aggregate-function) enhance aggregate funtion collect and add group_array aliases (#15339 ) Enhance aggregate function `collect_set` and `collect_list` to support optional `max_size` param, which enables to limit the number of elements in result array.	2023-02-27 14:22:30 +08:00
ZhaoChangle	b5d67781a2	[Fix](function)fix datatime-diff function's overflow (#16935 )	2023-02-24 20:06:06 +08:00
TengJianPing	883f575cfe	[fix](string function) fix wrong usage of iconv_open (#17048 ) * [fix](string function) fix wrong usage of iconv_open Also add test case for function convert * fix test case	2023-02-24 09:13:10 +08:00
lihangyu	526a66e9fb	[Function](array-type) support array_apply (#17020 ) Filter array to match specific binary condition ``` mysql> select array_apply([1000000, 1000001, 1000002], '=', 1000002); +-------------------------------------------------------------+ \| array_apply(ARRAY(1000000, 1000001, 1000002), '=', 1000002) \| +-------------------------------------------------------------+ \| [1000002] \| +-------------------------------------------------------------+ ```	2023-02-23 17:38:16 +08:00
morrySnow	7956800df7	[refactor](Nereids) let type coercion same with legacy planner (#16844 ) - change for Nereids 1. add a variable length parameter to the ctor of Count for a good error reporting of Count(a, b) 2. refactor StringRegexPredicate, let it inherit from ScalarFunction 3. remove useless class TypeCollection 4. use catalog.Type.Collection to check expression arguments type 5. change type coercion for TimestampArithmetic, divide, integral divide, comparison predicate, case when and in predicate. Let them same as legacy planner. - change for legacy planner 1. change the common type of floating and Decimal from Decimal to Double	2023-02-22 17:29:37 +08:00
ZhaoChangle	d6a841409f	[Enhancement](func)Introduce non_nullable extraction function. #16621 Introduced a new function non_nullable to BE, which can extract concrete data column from a nullable column. If the input argument is already not a nullable column, raise an error.	2023-02-18 20:44:07 +08:00
HappenLee	de1337511c	[Bug](Datetime) Fix date time function mem use after free (#16814 )	2023-02-16 16:15:58 +08:00
abmdocrt	41947c73eb	[Feature](array-function) Support array functions for nested type datev2 and datetimev2 (#16382 )	2023-02-08 12:51:07 +08:00
luozenglin	289a4b2ea4	[fix](func) fix truncate float type result error (#16468 ) When the argument of truncate function is float type, it can match both truncate(DECIMALV3) and truncate(DOUBLE), if the match is truncate(DECIMALV3), the precision is lost when converting float to DECIMALV3(38, 0). Here I modify it to match truncate(DOUBLE) for now, maybe we still need to solve the problem of losing precision when converting float to DECIMALV3.	2023-02-08 08:57:43 +08:00
luozenglin	09870098af	[fix](func) fix core dump when the pattern of the regexp_extract_all function does not contain subpatterns (#16408 )	2023-02-05 01:16:54 +08:00
gnehil	ca7b2e27a8	[regression-test](function) add regression test for money_format with truncate (#16052 )	2023-02-04 23:10:01 +08:00
Gabriel	918004c016	[Bug](date) Fix BE crash caused by function `datediff` (#16397 ) * [Bug](date) Fix BE crash caused by function `datediff` * update	2023-02-04 18:43:23 +08:00
yongkang.zhong	941e192019	[enhancement](test) add function case date_sub(datetime,INTERVAL dayofmonth(datetime)-1 DAY) (#16306 )	2023-02-02 09:56:01 +08:00
HaveAnOrangeCat	e3c8fffd99	[function](round) fix decimal scale for scale not specified (#15541 )	2023-02-01 14:58:48 +08:00
HappenLee	95d7c2de26	[Refactor](function) Rewrite the function elt (#16287 )	2023-02-01 11:17:06 +08:00
abmdocrt	ca7eb94f23	[improvement](agg-function) Increase the limit maximum number of agg function parameters (#15924 )	2023-01-31 21:03:50 +08:00
shee	6bebf92254	[fix][FE] fix be coredump when children of FunctionCallExpr is folded (#16064 ) Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com> fix be coredump when children of FunctionCallExpr is folded	2023-01-30 15:25:00 +08:00
abmdocrt	eb7da1c0ee	[fix](datatype) fix some bugs about data type array datetimev2 and decimalv3 (#16132 )	2023-01-29 14:26:08 +08:00
lihangyu	578a855b3e	[Bug](topn-opt) filter condition for analytic info for two phase read opt (#16173 ) two phase read optimization should not be enabled when query has analytic info	2023-01-29 12:06:18 +08:00
AKIRA	b919cbe487	[ehancement](nereids) Enhancement for limit clause (#16114 ) support limit offset without order by. the legacy planner supoort this feature in PR #15218	2023-01-28 11:04:03 +08:00
abmdocrt	9ffd109b35	[fix](datetimev2) Fix BE datetimev2 type returning wrong result (#15885 )	2023-01-20 22:25:20 +08:00
Pxl	81bab55d43	[Bug](function) catch function calculation error on aggregate node to avoid core dump (#15903 )	2023-01-16 11:21:28 +08:00
abmdocrt	7441b4dc96	[Feature](function) Support width_bucket function (#14396 )	2023-01-12 13:59:21 +08:00
luozenglin	05f6e4c48a	[fix](predicate) fix be core dump caused by pushing down the double column predicate (#15693 )	2023-01-09 19:31:04 +08:00
ElvinWei	76ad599fd7	[enhancement](histogram) optimise aggregate function histogram (#15317 ) This pr mainly to optimize the histogram(👉🏻 https://github.com/apache/doris/pull/14910) aggregation function. Including the following: 1. Support input parameters `sample_rate` and `max_bucket_num` 2. Add UT and regression test 3. Add documentation 4. Optimize function implementation logic Parameter description： - `sample_rate`：Optional. The proportion of sample data used to generate the histogram. The default is 0.2. - `max_bucket_num`：Optional. Limit the number of histogram buckets. The default value is 128. --- Example： ``` MySQL [test]> SELECT histogram(c_float) FROM histogram_test; +-------------------------------------------------------------------------------------------------------------------------------------+ \| histogram(`c_float`) \| +-------------------------------------------------------------------------------------------------------------------------------------+ \| {"sample_rate":0.2,"max_bucket_num":128,"bucket_num":3,"buckets":[{"lower":"0.1","upper":"0.1","count":1,"pre_sum":0,"ndv":1},...]} \| +-------------------------------------------------------------------------------------------------------------------------------------+ MySQL [test]> SELECT histogram(c_string, 0.5, 2) FROM histogram_test; +-------------------------------------------------------------------------------------------------------------------------------------+ \| histogram(`c_string`) \| +-------------------------------------------------------------------------------------------------------------------------------------+ \| {"sample_rate":0.5,"max_bucket_num":2,"bucket_num":2,"buckets":[{"lower":"str1","upper":"str7","count":4,"pre_sum":0,"ndv":3},...]} \| +-------------------------------------------------------------------------------------------------------------------------------------+ ``` Query result description： ``` { "sample_rate": 0.2, "max_bucket_num": 128, "bucket_num": 3, "buckets": [ { "lower": "0.1", "upper": "0.2", "count": 2, "pre_sum": 0, "ndv": 2 }, { "lower": "0.8", "upper": "0.9", "count": 2, "pre_sum": 2, "ndv": 2 }, { "lower": "1.0", "upper": "1.0", "count": 2, "pre_sum": 4, "ndv": 1 } ] } ``` Field description： - sample_rate：Rate of sampling - max_bucket_num：Limit the maximum number of buckets - bucket_num：The actual number of buckets - buckets：All buckets - lower：Upper bound of the bucket - upper：Lower bound of the bucket - count：The number of elements contained in the bucket - pre_sum：The total number of elements in the front bucket - ndv：The number of different values in the bucket > Total number of histogram elements = number of elements in the last bucket(count) + total number of elements in the previous bucket(pre_sum).	2023-01-07 00:50:32 +08:00

1 2 3 4

151 Commits