doris

Author	SHA1	Message	Date
jiawei liang	99c0592157	[Feature](array-function) Support array_pushback function #17417 (#19988 ) Implement array_pushback. mysql> select array_pushback([1, 2], 3); +--------------------------------+ \| array_pushback(ARRAY(1, 2), 3) \| +--------------------------------+ \| [1, 2, 3] \| +--------------------------------+ 1 row in set (0.01 sec)	2023-06-12 16:51:12 +08:00
zxealous	10134ea8c6	[fix](planner) fix RewriteInPredicateRule may be useless (#20668 ) Issue Number: close #20669 RewriteInPredicateRule may cast both children of an InPredicate expr to the same type. For example, in `where cast(age as char) in ('11')`, where age is an int, RewriteInPredicateRule casts both children to int, so child 0 has this structure: ``` child 0: type: int \|--- child: type : char \|-- child: type : int ``` Because RewriteInPredicateRule casts the expr to int, the stmt is reanalyzed, but the stmt is reset before reanalysis, and the reset changes child 0 to this structure: ``` child: type : char \|-- child: type : int ``` As a result, both children are cast to varchar in castAllToCompatibleType, which makes the rewrite by RewriteInPredicateRule useless. In 1.1-lts and 1.2-lts, a case such as `where cast(age as char) in ('11')` does not work, because castAllToCompatibleType casts int to char, and int cannot be cast to char (master works because castAllToCompatibleType casts int to varchar in this case). ``` MySQL [test]> select user_id from test_cast where cast(age as char) in ('45'); ERROR 1105 (HY000): errCode = 2, detailMessage = type not match, originType=INT, targeType=CHAR(*) ```	2023-06-12 14:39:01 +08:00
Xinyi Zou	a347063390	[fix](case expr) fix coredump of case for null value 2 (#20635 )	2023-06-11 23:08:53 +08:00
TengJianPing	dd71e101d3	[fix](case expr) fix coredump of case for null value (#20564 ) BE coredumps when the `when` expr is null:	2023-06-08 20:05:23 +08:00
Jerry Hu	49f8f20fb1	[fix](regex) String with Chinese characters matching failed (#20493 )	2023-06-07 07:27:47 +08:00
Chengpeng Yan	ae428c29e2	[feature](planner)(nereids) support user defined variable (#20334 ) Support user-defined variables. After this PR, we can use `set @a = xx` to define a user variable and use it in the query like `select @a`. the changes of this PR: 1. Support the grammar for `set user variable` in the parser. 2. Add the `userVars` in `VariableMgr` to store the user-defined variables. 3. For the `set @a = xx`, we will store the variable name and its value in the `userVars` in `VariableMgr`. 4. For the `select @a`, we will get the value for the variable name in `userVars`.	2023-06-06 14:35:16 +08:00
amory	1f032a551d	[Improve](array-functions) support array first function (#20397 ) add array_first(lambda, [1,2,3,null]) function for doris	2023-06-06 12:08:46 +08:00
TengJianPing	1b94b6368f	[fix](load) in strict mode, return error for insert if datatype convert fails (#20378 ) * [fix](load) in strict mode, return error for load and insert if datatype convert fails Revert "[fix](MySQL) the way Doris handles boolean type is consistent with MySQL (#19416)" This reverts commit 68eb420cabe5b26b09d6d4a2724ae12699bdee87, since it changed other behaviours: e.g. in strict mode, insert into t_int values ("a") results in 0 being inserted into the table, but it should return an error instead. * fix be ut * fix regression tests	2023-06-06 12:04:03 +08:00
morrySnow	e553615a27	[opt](Nereids) prefer to use datev2 / datetimev2 in date related functions (#20224 ) 1. Update the signature order of all date-related functions. 1.1. If the return value must be computed with time info, signatures with datetimev2 args come first, followed by datev2, datetime and date. 1.2. If the return value must be computed with only date info, signatures with datev2 args come first, followed by datetimev2, date and datetime. 2. Prefer datev2 when date must be cast to either datev2 or datetime/datetimev2.	2023-06-06 11:42:29 +08:00
Yang, Xu	d02737a293	[feature](struct-type) support struct_element function (#19045 ) This commit adds a function that returns a field column from a named struct column. Since the function can return any type, this commit also adds support for ANY_STRUCT_TYPE and ANY_ELEMENT_TYPE.	2023-06-06 10:44:08 +08:00
mch_ucchi	fac0b50f56	[Fix](Planner)fix cast date/datev2/datetime to float/double return null. (#20008 )	2023-06-05 19:06:50 +08:00
ZhangYu0123	d03bb4ba7b	[Optimize](function) Optimize locate function by compare across strings (#20290 ) About 90% speedup when tested with sum().	2023-06-05 12:43:14 +08:00
amory	59a0f80233	[Improve](array-function)Improve array function intersect (#20085 ) Until now the array function supported only 2 arrays, but the intersect operator can support more than 2 arrays.	2023-06-05 10:38:48 +08:00
amory	d68f3f3b3d	[Feature](array-functions)improve array functions for array_last_index (#20294 ) Until now only array_first_index was supported for lambda input, not array_last_index.	2023-06-02 13:54:03 +08:00
HappenLee	608d2a3eca	[Bug](exec) push down no group by agg min cause error result (#20289 ) sql """ CREATE TABLE t1_int ( num int(11) NULL, dgs_jkrq bigint(20) NULL ) ENGINE=OLAP DUPLICATE KEY(num) COMMENT 'OLAP' DISTRIBUTED BY HASH(num) BUCKETS 1 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "storage_format" = "V2", "light_schema_change" = "true", "disable_auto_compaction" = "false", "enable_single_replica_compaction" = "false" ); """ sql """insert into t1_int values(1,1),(1,2),(1,3),(1,4),(1,null);""" qt_sql """ select min(dgs_jkrq) from t1_int; """ Before the change this returned the wrong result: 4. After the change it returns the right result: 1.	2023-06-01 17:29:46 +08:00
Mryange	519f01133a	[feature](decimal)support cast rounding half up and div precision increment in decimalv3. (#19811 )	2023-06-01 13:09:58 +08:00
LiBinfeng	65a75abecb	[Fix](Nereids) bitmap type should not be used in comparison predicate (#19807 ) When using Nereids, a comparison operator applied to a bitmap type must throw an analysis exception. For example, in `select id from (select BITMAP_EMPTY() as c0 from expr_test) as ref0 where c0 = 1 order by id`, c0 in the subquery is of bitmap type; this scenario is not supported right now.	2023-05-31 23:09:36 +08:00
Gabriel	ff05217a1e	[regression](p0) fix test for `array_enumerate_uniq` (#20231 )	2023-05-30 22:14:19 +08:00
Chenyang Sun	accaff1026	[Feature](compaction) wip: single replica compaction (#19237 ) Currently, compaction is executed separately for each backend, and the reconstruction of the index during compaction leads to high CPU usage. To address this, we are introducing single replica compaction, where a specific primary replica is selected to perform compaction, and the remaining replicas fetch the compaction results from the primary replica. The Backend (BE) requests replica information for all peers corresponding to a tablet from the Frontend (FE). This information includes the host where the replica is located and the replica_id. By calculating hash(replica_id), the replica with the smallest hash value is responsible for executing compaction, while the remaining replicas are responsible for fetching the compaction results from this replica. The compaction task producer thread, before submitting a compaction task, checks whether the local replica should fetch from its peer. If it should, the task is then submitted to the single replica compaction thread pool. When performing single replica compaction, the process begins by requesting rowset versions from the target replica. These rowset_versions are then compared with the local rowset versions. The first version that can be fetched is selected.	2023-05-30 21:12:48 +08:00
bobhan1	bb12a1cb49	[Enhance](array function) add support for DecimalV3 for array_enumerate_uniq() (#17724 )	2023-05-30 13:09:19 +08:00
Pxl	5788214416	[Bug](function) fix equals implements not judge order by elements of function call expr (#20083 ) #19296	2023-05-29 19:03:05 +08:00
Gabriel	55ccddb62c	[Conf](decimalv3) enable decimalv3 by default	2023-05-29 15:38:31 +08:00
Yanko	f54a068d82	[feature](function) add json->operator convert to json_extract (#19899 )	2023-05-27 12:45:45 +08:00
lihangyu	23c95d15da	[regression-test](sort) Fix unstable sorting (#20125 )	2023-05-26 23:42:05 +08:00
lihangyu	317338913c	[Bug](topn) Fix topn fetch set real default value (#20074 ) 1. Before this PR, if a rowset did not contain a column that should be read for the related SlotDescriptor, `insert_default` was called on the column, but that is not the real default value. The real default value information should be provided by the frontend. 2. Support fetch when light schema change is not enabled, but disable it for the AGG or UNIQUE MOR model.	2023-05-26 16:06:55 +08:00
Pxl	43aa062fb1	[Chore](hash-join) remove useless conditions and add some case (#20050 )	2023-05-26 14:45:24 +08:00
TengJianPing	315b30c23d	[testcase](union) add test case for union of decimal (#20080 )	2023-05-26 14:12:14 +08:00
amory	ee34b6de2d	[Refact] (serde) refact mysql serde with data type (#19543 ) Refactor MySQL output (de)serialization to use the data type serde, avoiding the switch-case on primitive type written in mysqlWriter.	2023-05-26 14:11:17 +08:00
zclllyybb	384a0c7aa7	[fix](testcases) Fix some unstable testcases. (#19956 ) The case test_string_concat_extremely_long_string exceeds our test limit; move it to p2 so that it is tested only in the SelectDB test environment. Because we need to stay consistent with MySQL and avoid overflow, q67 must keep its current behavior; once we fully apply nereids and decimalV3, it will be fixed automatically. In the parallel test, although all query stats were cleaned, cases running in parallel still affect them, so we need to use a unique table for query_stats_test. test_query_sys_tables did not handle some unstable situations; fixed it. Temporarily disable the unstable analyze_test case for p0.	2023-05-24 09:52:02 +08:00
HappenLee	35f8fc22f2	[testcase](test) Fix query stats test may failed (#19958 )	2023-05-23 18:33:07 +08:00
Pxl	9945067e3c	[Bug](function) make VcompoundPred optimization work well (#19870 ) #19818 tried to enable the VcompoundPred optimization but got a wrong result on tpcds q28. The reason is that some nullable logic in MySQL needs special handling. mysql [regression_test_tpcds_sf1_p1]>select null and false; +----------------+ \| NULL AND FALSE \| +----------------+ \| 0 \| +----------------+ 1 row in set (0.00 sec) mysql [regression_test_tpcds_sf1_p1]>select null and true; +---------------+ \| NULL AND TRUE \| +---------------+ \| NULL \| +---------------+ 1 row in set (0.00 sec) mysql [regression_test_tpcds_sf1_p1]>select null or false; +---------------+ \| NULL OR FALSE \| +---------------+ \| NULL \| +---------------+ 1 row in set (0.00 sec) mysql [regression_test_tpcds_sf1_p1]>select null or true; +--------------+ \| NULL OR TRUE \| +--------------+ \| 1 \| +--------------+ 1 row in set (0.00 sec)	2023-05-22 18:32:17 +08:00
Pxl	d64be9565d	[Bug](function) fix function in get wrong result when input const column (#19791 )	2023-05-22 10:58:29 +08:00
amory	67dc68630b	[Improve](complex-type)improve array/map/struct creating and function with decimalv3 (#19830 )	2023-05-19 17:43:36 +08:00
Kang	88ca4f3e6b	[feature](like) make like regexp used as a sql function (#19755 )	2023-05-18 10:03:12 +08:00
amory	67668905d6	[Improve](complex-type)add complex type support unique table with regress test #19751 struct / map / array already support unique tables but had no regression test; this adds one.	2023-05-17 21:32:46 +08:00
mch_ucchi	1d05feea1b	[Feature](Nereids) add executable function to support fold constant for functions (#18209 ) 1. Add date-time functions for constant folding in Nereids. The executable date-time functions Nereids supports so far are: - now() - now(int) - current_timestamp() - current_timestamp(int) - localtime() - localtimestamp() - curdate() - current_date() - curtime() - current_time() - date_{add/sub}(),{years/months/days/hours/minutes/seconds}_{add/sub}() - datediff() - {date/datev2}() - {year/quarter/month/day/hour/minute/second}() - dayof{year/month/week}() - date_format() - date_trunc() - from_days() - last_day() - to_monday() - from_unixtime() - unix_timestamp() - utc_timestamp() - to_date() - to_days() - str_to_date() - makedate() 2. Solved problems: - enable datev2/datetimev2 by default. - refactor Nereids foldConstantOnFE and support folding nested expressions. - split the executable functions into multiple files for easier reading and adding new functions	2023-05-17 21:26:31 +08:00
xueweizhang	48ec530d2c	[fix](functions) fix least/greatest function coredump bug (#19462 )	2023-05-17 14:12:52 +08:00
Pxl	d784c99360	[Bug](planner) fix unassigned conjunct assigned on wrong node (#19672 )	2023-05-17 10:28:22 +08:00
Pxl	7f73749b88	[Bug](pipeline) fix distributionColumnIds not updated correct when outputColumnUnique… (#19704 ) fix distributionColumnIds not updated correct when outputColumnUnique	2023-05-17 00:13:10 +08:00
Ziyu Wang	325a1d4b28	[vectorized](function) support array_count function (#18557 ) array_count: returns the number of non-zero and non-null elements in the given array.	2023-05-16 17:00:01 +08:00
Zhengguo Yang	6748ae4a57	[Feature] Collect the information statistics of the query hit (#18805 ) 1. Show the query hit statistics for `baseall` ```sql MySQL [test_query_db]> show query stats from baseall; +-------+------------+-------------+ \| Field \| QueryCount \| FilterCount \| +-------+------------+-------------+ \| k0 \| 0 \| 0 \| \| k1 \| 0 \| 0 \| \| k2 \| 0 \| 0 \| \| k3 \| 0 \| 0 \| \| k4 \| 0 \| 0 \| \| k5 \| 0 \| 0 \| \| k6 \| 0 \| 0 \| \| k10 \| 0 \| 0 \| \| k11 \| 0 \| 0 \| \| k7 \| 0 \| 0 \| \| k8 \| 0 \| 0 \| \| k9 \| 0 \| 0 \| \| k12 \| 0 \| 0 \| \| k13 \| 0 \| 0 \| +-------+------------+-------------+ 14 rows in set (0.002 sec) MySQL [test_query_db]> select k0, k1,k2, sum(k3) from baseall where k9 > 1 group by k0,k1,k2; +------+------+--------+-------------+ \| k0 \| k1 \| k2 \| sum(`k3`) \| +------+------+--------+-------------+ \| 0 \| 6 \| 32767 \| 3021 \| \| 1 \| 12 \| 32767 \| -2147483647 \| \| 0 \| 3 \| 1989 \| 1002 \| \| 0 \| 7 \| -32767 \| 1002 \| \| 1 \| 8 \| 255 \| 2147483647 \| \| 1 \| 9 \| 1991 \| -2147483647 \| \| 1 \| 11 \| 1989 \| 25699 \| \| 1 \| 13 \| -32767 \| 2147483647 \| \| 1 \| 14 \| 255 \| 103 \| \| 0 \| 1 \| 1989 \| 1001 \| \| 0 \| 2 \| 1986 \| 1001 \| \| 1 \| 15 \| 1992 \| 3021 \| +------+------+--------+-------------+ 12 rows in set (0.050 sec) MySQL [test_query_db]> show query stats from baseall; +-------+------------+-------------+ \| Field \| QueryCount \| FilterCount \| +-------+------------+-------------+ \| k0 \| 1 \| 0 \| \| k1 \| 1 \| 0 \| \| k2 \| 1 \| 0 \| \| k3 \| 1 \| 0 \| \| k4 \| 0 \| 0 \| \| k5 \| 0 \| 0 \| \| k6 \| 0 \| 0 \| \| k10 \| 0 \| 0 \| \| k11 \| 0 \| 0 \| \| k7 \| 0 \| 0 \| \| k8 \| 0 \| 0 \| \| k9 \| 1 \| 1 \| \| k12 \| 0 \| 0 \| \| k13 \| 0 \| 0 \| +-------+------------+-------------+ 14 rows in set (0.001 sec) ``` 2. 
Show the query hit statistics summary for all the mv in a table ```sql MySQL [test_query_db]> show query stats from baseall all; +-----------+------------+ \| IndexName \| QueryCount \| +-----------+------------+ \| baseall \| 1 \| +-----------+------------+ 1 row in set (0.005 sec) ``` 3. Show the query hit statistics detail info for all the mv in a table ```sql MySQL [test_query_db]> show query stats from baseall all verbose; +-----------+-------+------------+-------------+ \| IndexName \| Field \| QueryCount \| FilterCount \| +-----------+-------+------------+-------------+ \| baseall \| k0 \| 1 \| 0 \| \| \| k1 \| 1 \| 0 \| \| \| k2 \| 1 \| 0 \| \| \| k3 \| 1 \| 0 \| \| \| k4 \| 0 \| 0 \| \| \| k5 \| 0 \| 0 \| \| \| k6 \| 0 \| 0 \| \| \| k10 \| 0 \| 0 \| \| \| k11 \| 0 \| 0 \| \| \| k7 \| 0 \| 0 \| \| \| k8 \| 0 \| 0 \| \| \| k9 \| 1 \| 1 \| \| \| k12 \| 0 \| 0 \| \| \| k13 \| 0 \| 0 \| +-----------+-------+------------+-------------+ 14 rows in set (0.017 sec) ``` 4. Show the query hit for a database ```sql MySQL [test_query_db]> show query stats for test_query_db; +----------------------------+------------+ \| TableName \| QueryCount \| +----------------------------+------------+ \| compaction_tbl \| 0 \| \| bigtable \| 0 \| \| empty \| 0 \| \| tempbaseall \| 0 \| \| test \| 0 \| \| test_data_type \| 0 \| \| test_string_function_field \| 0 \| \| baseall \| 1 \| \| nullable \| 0 \| +----------------------------+------------+ 9 rows in set (0.005 sec) ``` 5. Show query hit statistics for all the databases ```sql MySQL [(none)]> show query stats; +-----------------+------------+ \| Database \| QueryCount \| +-----------------+------------+ \| test_query_db \| 1 \| +-----------------+------------+ 1 rows in set (0.005 sec) ```	2023-05-15 10:56:34 +08:00
starocean999	e9392780a9	[fix](nereids)fix some nereids planner bugs (#19509 ) 1. Some encrypt and decrypt functions have the wrong blockEncryptionMode. 2. The topN node should compare tuples from intermediate_row_desc with first_sort_slot.tuple_id. 3. The limit must be kept if it is an uncorrelated in-subquery with limit on sort, like select a from t1 where a in ( select b from t2 order by xx limit yy )	2023-05-12 09:06:16 +08:00
xy720	39ec8aa64c	[refactor](complex-type) refactor array/map/struct literal to not invoke execute() function in prepare state (#19068 )	2023-05-11 18:44:37 +08:00
herry2038	834bf2eab7	[feature](array) Add array_last lambda function (#18388 )	2023-05-11 13:15:54 +08:00
Jerry Hu	47edc5a06e	[fix](functions) Support nullable column for multi_string functions (#19498 )	2023-05-11 01:13:13 +08:00
Pxl	5473795a51	[Bug](scan) forbid push down in predicate when in_state->use_set is false (#19471 )	2023-05-10 11:12:20 +08:00
Gabriel	4c6ca88088	Revert "[refactor](function) ignore DST for function `from_unixtime` (#19151 )" (#19333 ) This reverts commit 9dd6c8f87b73db238bfd38fb1d76f3796910f398.	2023-05-06 16:33:58 +08:00
Gabriel	9dd6c8f87b	[refactor](function) ignore DST for function `from_unixtime` (#19151 )	2023-05-05 11:51:49 +08:00
yiguolei	8eab20d3df	[bugfix](low cardinality) cached code is wrong will result wrong query result when many null pages (#19221 ) Sometimes the dict is not initialized when a comparison predicate runs; for example, when a full page is null, the reader skips the read, so the dictionary is not initialized. The cached code is wrong in this case, because a following page may be non-null, and the dict will then have items. This causes queries on a dict string column to return wrong results if there are many null values in the column. Also add regression tests for equal, greater-than, and less-than queries on dict columns. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-29 21:28:41 +08:00
brody715	20395ce501	[feature](array_function): add support for array_cum_sum function (#18231 )	2023-04-27 09:57:13 +08:00
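
The NULL results quoted in commit 9945067e3c (`select null and false`, `select null and true`, `select null or false`, `select null or true`) follow SQL three-valued logic. The following is a minimal Python sketch of that logic, not the Doris VcompoundPred implementation; the function names `and3` and `or3` are hypothetical, and `None` stands in for NULL.

```python
from typing import Optional

# Hypothetical helpers for illustration only; None models SQL NULL.

def and3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """SQL AND: FALSE wins over NULL, otherwise NULL propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def or3(a: Optional[bool], b: Optional[bool]) -> Optional[bool]:
    """SQL OR: TRUE wins over NULL, otherwise NULL propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


if __name__ == "__main__":
    # Mirrors the four queries in the commit message.
    print(and3(None, False))  # NULL AND FALSE
    print(and3(None, True))   # NULL AND TRUE
    print(or3(None, False))   # NULL OR FALSE
    print(or3(None, True))    # NULL OR TRUE
```

The point of the sketch is that a compound predicate cannot short-circuit to NULL as soon as one operand is NULL: the other operand can still decide the result (FALSE for AND, TRUE for OR).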

343 Commits