doris

Author	SHA1	Message	Date
奕冷	c0360f80bb	[enhancement](aggregate-function) enhance aggregate function collect and add group_array aliases (#15339 ) Enhance aggregate functions `collect_set` and `collect_list` to support an optional `max_size` param, which limits the number of elements in the result array.	2023-02-27 14:22:30 +08:00
lihangyu	29dc08fc45	[Optimize](simd json reader) Cached search results for previous row (keyed as index in JSON object) - used as a hint. (#17124 ) `_simdjson_set_column_value` could become a hot spot while parsing JSON in simdjson mode. Introduce `_prev_positions` to cache results for the previous row (keyed as index in JSON object), since the order of the JSON name fields should be nearly the same from line to line. * fix case	2023-02-27 10:39:22 +08:00
DuRipeng	aefcc98715	[Enhancement](datetimev2-enhance) support 'microseconds_sub' function for datetimev2 (#17130 ) Based on #16970 , introduce microseconds_sub function for datetimev2	2023-02-27 08:47:30 +08:00
morrySnow	469b6b8466	[enhancement](Nereids) datetime v2 type precision derive (#17079 )	2023-02-26 22:33:55 +08:00
Tiewei Fang	3a9aa03aab	[BugFix](oracle-catalog) Modify the doris data type mapping of oracle `NUMBER(p,s)` type (#17051 ) The Oracle data type `NUMBER(p,s)` differs semantically from the Doris decimal type. For the Oracle `NUMBER(p,s)` type: 1. If s<0, the value is an integer. This `NUMBER(p,s)` has (p+|s|) significant digits, and rounding is performed at position s. E.g., if we insert 1234567 into `NUMBER(5,-2)`, Oracle stores 1234600. In this case, Doris uses an int type (`TINYINT/SMALLINT/INT/.../LARGEINT`). 2. If s>=0 && s<p, it behaves just like Doris Decimal(p,s). 3. If s>=0 && s>p, the value is a pure fraction (like 0.xxxxx). p is the maximum number of significant digits after the decimal point, and the value is rounded at the s-th digit after the decimal point. E.g., we cannot insert 0.0123456 into `NUMBER(5,7)`, because at least two zeros (s-p) must follow the decimal point, but we can insert 0.0012345 into `NUMBER(5,7)`. In this case, Doris uses `DECIMAL(s,s)`. 4. If p and s are not specified, as in plain `NUMBER`, the precision and scale are uncertain. In this case, Doris cannot determine p and s, so it cannot determine the data type.	2023-02-26 09:05:41 +08:00
Pxl	2db4a981b3	[Feature](Materialized-View) forbid renaming columns on materialized view (#17030 )	2023-02-24 21:28:31 +08:00
YangShaw	c53b6a9532	[fix](Nereids) fix nullable() of lead/lag (#17014 ) fix bug when we use NULL as default value for window function lead() and lag()	2023-02-24 21:27:44 +08:00
YueW	5f2dad29ca	[enhancement](inverted index) Support inverted index without specified parser to use match query (#17110 )	2023-02-24 20:34:55 +08:00
ZhaoChangle	b5d67781a2	[Fix](function)fix datatime-diff function's overflow (#16935 )	2023-02-24 20:06:06 +08:00
Pxl	0691586eb7	[Chore](regression-test) add createMV action && add some mv case from fe ut MaterializedViewFunctionTest (#16825 ) 1. add createMV action 2. add some mv case from fe ut MaterializedViewFunctionTest 3. reduce mv scheduler interval time from 10s to 0.3s	2023-02-24 16:35:37 +08:00
AKIRA	cf5bc9594b	[fix](planner) conjuncts of the outer query block didn't work when it's on the results expr of inline view (#17036 ) Here is a case: select id, name from (select '123' as id, '1234' as name, age from test_insert ) a where name != '1234';	2023-02-24 15:27:34 +08:00
AlexYue	c39914c0a0	[feature](partition)add default list partition (#15509 ) This PR implements the default list partition described in the related issue #15507. It is similar to Greenplum's default partition: it stores all data that does not satisfy the constraints of the other partition keys. The optimizer does not filter out the default partition, so it is scanned every time you select from a table that has one. Users can either create a table with a default partition or add one with ALTER. ```sql PARTITION LIST(key) { PARTITION p1 values in (xx,xx), PARTITION DEFAULT } ALTER TABLE XXX ADD PARTITION DEFAULT ``` When a new partition is added with ALTER, data in the default partition that satisfies the new partition key's constraints is not migrated automatically. Users should select from the default partition using the new constraints as the predicate and insert the rows into the new partition. ```sql insert into tbl select * from tbl partition default where partition_key=xx; ```	2023-02-24 15:24:59 +08:00
starocean999	479d57df88	[fix](planner) the project expr should be calculated in join node in some case (#17035 ) Consider the SQL below: select sum(cc.qlnm) as qlnm FROM outerjoin_A left join (SELECT outerjoin_B.b, coalesce(outerjoin_C.c, 0) AS qlnm FROM outerjoin_B inner JOIN outerjoin_C ON outerjoin_B.b = outerjoin_C.c ) cc on outerjoin_A.a = cc.b group by outerjoin_A.a; The coalesce(outerjoin_C.c, 0) was calculated in the agg node, which is wrong. This PR corrects that; the expr is now calculated in the inner join node.	2023-02-24 15:20:05 +08:00
TengJianPing	883f575cfe	[fix](string function) fix wrong usage of iconv_open (#17048 ) Also add test case for function convert * fix test case	2023-02-24 09:13:10 +08:00
qiye	92ecd16573	(feature)[DOE]Support array for Doris on ES (#16941 )	2023-02-23 19:31:18 +08:00
lihangyu	526a66e9fb	[Function](array-type) support array_apply (#17020 ) Filter array to match specific binary condition ``` mysql> select array_apply([1000000, 1000001, 1000002], '=', 1000002); +-------------------------------------------------------------+ | array_apply(ARRAY(1000000, 1000001, 1000002), '=', 1000002) | +-------------------------------------------------------------+ | [1000002] | +-------------------------------------------------------------+ ```	2023-02-23 17:38:16 +08:00
zhannngchen	edead494cb	[Enhancement](storage) add a new hidden column __DORIS_VERSION_COL__ for unique key table (#16509 )	2023-02-23 15:47:17 +08:00
xy720	91fc9fae8e	[Bug](complex-type) Fix is null predicate in delete stmt for array/struct/map type (#17018 )	2023-02-23 15:06:49 +08:00
morrySnow	37960e83d3	[test](Nereids) add ssb sf0.1 p1 regression case (#17046 )	2023-02-23 12:25:10 +08:00
DuRipeng	e65a061256	[Enhancement](datetimev2-enhance) support 'microseconds_add' function for datetimev2 (#16970 )	2023-02-22 17:49:41 +08:00
morrySnow	7956800df7	[refactor](Nereids) let type coercion same with legacy planner (#16844 ) - change for Nereids 1. add a variable length parameter to the ctor of Count for better error reporting of Count(a, b) 2. refactor StringRegexPredicate, let it inherit from ScalarFunction 3. remove useless class TypeCollection 4. use catalog.Type.Collection to check expression argument types 5. change type coercion for TimestampArithmetic, divide, integral divide, comparison predicate, case when and in predicate, so that they match the legacy planner. - change for legacy planner 1. change the common type of floating and Decimal from Decimal to Double	2023-02-22 17:29:37 +08:00
AKIRA	a95f47ac0a	[ehancement](planner) Support filter the output of set operation node (#16666 )	2023-02-21 19:22:09 +08:00
lihangyu	113023fb86	(Enhancement)[load-json] support simdjson in new json reader (#16903 ) be config: enable_simdjson_reader=true related PR #11665	2023-02-21 11:31:00 +08:00
Xin Liao	3a5e8f83e8	[fix](merge-on-write) fix that be may coredump when sequence column is null (#16832 ) To facilitate the use of the primary key index, encode the seq column to the minimum value of the corresponding length when the seq column is null.	2023-02-20 16:25:52 +08:00
Pxl	ce3afe7f13	[Enchancement](Materialized-View) forbid some case in create mv with group by and fix select fail on g… (#16820 ) 1. Forbid some cases in create mv with group by: select k1+1,sum(abs(k2+2)+k3+3) from d_table group by k1; 2. Fix select failure when the grouping column has a different expr from the select list: create materialized view k1p2ap3psg as select k1+1,sum(abs(k2+2)+k3+3) from d_table group by k1+1; mysql [test]>explain select k1+1,sum(abs(k2+2)+k3+3) from d_table group by k1; ERROR 1105 (HY000): errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): `k1` + 1	2023-02-20 13:04:50 +08:00
zhangstar333	5291f14aff	[vectorized](udf) java udf support array type (#16841 )	2023-02-20 10:00:25 +08:00
amory	8b70bfdc31	[Feature](map-type) Support stream load and fix some bugs for map type (#16776 ) 1. support stream load with json and csv formats for map 2. fix the olap convertor during compaction on a map column that contains nulls 3. support select outToFile for map 4. add some regression tests	2023-02-19 15:11:54 +08:00
zhengshengjun	e2e6a0dd83	[Feature](load) Support mutable property for partition (#16036 ) The background is described in issue #15723, where users previously used Apache Druid to satisfy such lambda requirements. We will not make Doris automatically drop data that does not belong to the current time window as Druid does, because that is not flexible. We need the ability to support mutable/immutable partitions. The PR works this way: 1. Support a mutable property for a partition. 2. The mutable property of a partition is passed from FE to BE in a load procedure. 3. If a record's partition is immutable, we mark the row as "unselected" so that it is not included in the computation of 'max_filter_ratio'; data written to an immutable partition is therefore ignored and does not cause load failure. Usage example: 1. Add an immutable partition or modify a partition to be immutable: - alter table test_tbl add [temporary] partition xxx values less than ('xxx') ('mutable' = 'true'); - alter table test_tbl modify partition xx set ('mutable' = 'false'); 2. Write 5 records into the table, two of which belong to an immutable partition	2023-02-18 23:09:34 +08:00
ZhaoChangle	d6a841409f	[Enhancement](func)Introduce non_nullable extraction function. #16621 Introduced a new function non_nullable to BE, which can extract concrete data column from a nullable column. If the input argument is already not a nullable column, raise an error.	2023-02-18 20:44:07 +08:00
AKIRA	861e4bc64a	[fix](planner) Nullable of slot descriptor is mistaken and cause BE crash #16862	2023-02-18 20:39:56 +08:00
morrySnow	9b94729c87	Revert "[test](pipeline) Run nereids cases in p1/p2 (#16130 )" (#16792 ) This reverts commit b480db2e119ac0516e8621ea3d53c40f250c1d24.	2023-02-17 18:48:27 +08:00
lihangyu	6acee1ce88	[Fix](topn opt) double check plan From OriginalPlanner to make sure optimized SQL is a general topn query (#16848 ) Under the original logic, a query like `select * from a where exists (select * from b order by 1) order by 1 limit 1` contains a subquery, but the top query still passes `checkEnableTwoPhaseRead` and sets `isTwoPhaseOptEnabled=true`. So we need to double check that the plan is a general topn query plan, and roll back the needMaterialize flag set by the previous `analyze`.	2023-02-17 10:59:35 +08:00
lihangyu	5dfd6d2390	[improve](dynamic table) refine SegmentWriter columns writer generate (#16816 ) ``` Dynamic Block consists of two parts, dynamic part of columns and static part of columns static dynamic | ----- | ------- | the static ones are original _tablet_schema columns the dynamic ones are auto generated and extended from file scan. ``` We should only consider using Block info to generate columns when it is a dynamic table load procedure, and separate the static columns from the dynamic ones. * test	2023-02-17 10:24:33 +08:00
Gabriel	b35998a3b7	[Bug](datetimev2) Support cast datetimev2 to datetimev2 with different precision #16826	2023-02-17 08:42:36 +08:00
HappenLee	de1337511c	[Bug](Datetime) Fix date time function mem use after free (#16814 )	2023-02-16 16:15:58 +08:00
Jibing-Li	292926e5aa	[Fix](multi catalog)Fix partition case bug (#16763 ) Set column names from path to lower case in case-insensitive case. This is for Iceberg columns from path. Iceberg columns are case sensitive, which may cause error for table with partitions.	2023-02-16 15:47:23 +08:00
mch_ucchi	b6f2dfa994	[test](Nereids) add not nullable test for scalar functions (#16498 )	2023-02-16 11:57:19 +08:00
xy720	0c56a4622c	[Feature](struct-type) Add implicitly cast for struct-type (#16613 ) Currently, inserting {1, 'a'} into struct<f1:tinyint, f2:varchar(20)> is not supported. This commit adds support for implicitly casting the char type in the struct to varchar.	2023-02-15 16:55:00 +08:00
Pxl	ea78184551	[Feature](Materialized-View) support multiple slot on one column in materialized view (#16378 )	2023-02-14 16:10:50 +08:00
TengJianPing	fb0d08ff4c	[fix](mark join) fix bug of mark join with other conjuncts (#16655 ) Fix bug that probe_index is not increased for mark hash join with other conjuncts.	2023-02-14 14:47:15 +08:00
zhangstar333	af5dc7565e	[bug](udf) fix udf return type of decimal check scale must is 9 (#16497 )	2023-02-14 10:53:53 +08:00
xueweizhang	90af1b0113	[fix](subquery) fix bug of using constexpr and some agg func(like count,max) as subquery's output (#16579 ) Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2023-02-14 00:11:56 +08:00
lihangyu	36955a6769	[regression-test](dynamic-table) add regression test for dynamic table (#16656 )	2023-02-14 00:03:19 +08:00
YangShaw	77a3288ce7	[feature](Nereids) support window function (#14397 )	2023-02-13 21:20:56 +08:00
minghong	a2b9b9edd7	[fix](planner) fix bug in agg on constant column (#16442 ) For performance reasons, we want to remove constant columns from groupingExprs. For example: `select sum(T.A) from T group by T.B, 'xyz'` is equivalent to `select sum(T.A) from T group by T.B`, so we can remove the constant column `'xyz'` from groupingExprs. But there is an exception when all groupingExprs are constant. For example: sql1: `select 'abc' from t group by 'abc'` is not equivalent to sql2: `select 'abc' from t`; sql3: `select 'abc', sum(a) from t group by 'abc'` is not equivalent to sql4: `select 1, sum(a) from t` (when t is empty, sql3 returns 0 tuples, sql4 returns 1 tuple). We need to keep some constant columns if all groupingExprs are constant. Consider sql5 `select a from (select "abc" as a, 'def' as b) T group by b, a;` if the constant column `a` is in the select list, this column should not be removed. sql5 is transformed to sql6 `select a from (select "abc" as a, 'def' as b) T group by a;`	2023-02-13 11:26:08 +08:00
Tiewei Fang	3c3110b253	[Fix](Jdbc Catalog) jdbc catalog support to connect to doris database (#16527 ) Doris can use mysql-jdbc-jar to connect to a Doris database, but Doris has some data types that MySQL lacks, such as DecimalV3 and Date/DatetimeV2. This adds some case checks in `Mysql Catalog` so that the Jdbc catalog can identify Doris data types.	2023-02-10 20:24:40 +08:00
YueW	43eca4f209	[Feature-WIP](inverted index) Implementation for alter inverted index. (#16371 ) implementation for add/drop inverted index.	2023-02-10 17:56:17 +08:00
Xin Liao	6a5277b391	[fix](sequence-column) MergeIterator does not use the correct seq column for comparison (#16494 )	2023-02-10 17:51:15 +08:00
zhangdong	8758cd412f	[feature](auth)Implementing privilege management with rbac model (#16091 ) Change the auth implementation to RBAC. Each user has one default role which cannot be dropped; privileges granted to a user are granted to that default role. In the current PR, a user can still have only one role other than the default role, but in the future users and roles will be many-to-many. Rename PaloRole, PaloAuth, PaloPrivilege to Role, Auth, Privilege.	2023-02-10 12:30:49 +08:00
xueweizhang	379bef598d	[fix-core](block) clear block row_same_bit when block reuse (#16172 )	2023-02-10 12:21:27 +08:00
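
The Oracle `NUMBER(p,s)` mapping described in commit 3a9aa03aab above can be sketched roughly as follows. This is only an illustration of the four cases in that commit message, not the code from the PR: the function name `map_oracle_number` and the digit thresholds chosen for each Doris integer type are assumptions made for this example.

```python
from typing import Optional

# Assumed digit capacities (hypothetical, not taken from the commit):
# the largest number of decimal digits that always fits in each type.
_INT_TYPES = [
    (2, "TINYINT"),
    (4, "SMALLINT"),
    (9, "INT"),
    (18, "BIGINT"),
    (38, "LARGEINT"),
]


def map_oracle_number(p: Optional[int], s: Optional[int]) -> Optional[str]:
    """Return a Doris type name for Oracle NUMBER(p,s), or None if undetermined."""
    # Case 4: precision and scale are not specified, so no type can be chosen.
    if p is None or s is None:
        return None
    # Case 1: negative scale means an integer with p + |s| significant digits.
    if s < 0:
        digits = p + abs(s)
        for max_digits, name in _INT_TYPES:
            if digits <= max_digits:
                return name
        return None  # too wide for any integer type in this sketch
    # Case 2: ordinary decimal, same meaning as Doris DECIMAL(p,s).
    if s < p:
        return f"DECIMAL({p},{s})"
    # Case 3: scale exceeds precision (values like 0.00xxxxx).
    # s == p is not covered by the commit message; it yields the same
    # DECIMAL(p,p) under either reading, so it is handled here as well.
    return f"DECIMAL({s},{s})"


if __name__ == "__main__":
    for args in [(5, -2), (10, 2), (5, 7), (None, None)]:
        print(args, "->", map_oracle_number(*args))
```

Under these assumed thresholds, `NUMBER(5,-2)` has 7 significant digits and maps to `INT`, while `NUMBER(5,7)` maps to `DECIMAL(7,7)`.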


837 Commits