doris

Author	SHA1	Message	Date
zhangstar333	5291f14aff	[vectorized](udf) java udf support array type (#16841 )	2023-02-20 10:00:25 +08:00
amory	8b70bfdc31	[Feature](map-type) Support stream load and fix some bugs for map type (#16776 ) 1、support stream load with json, csv format for map 2、fix olap convertor when compaction action in map column which has null 3、support select outToFile for map 4、add some regression-test	2023-02-19 15:11:54 +08:00
zhengshengjun	e2e6a0dd83	[Feature](load) Support mutable property for partition (#16036 ) The background is described in this issue: #15723, where users used Apache Druid to satisfy such lambada requirements before. We will not make Doris dropping data not belonged to current time window automatically like Druid, which is not flexible. We demand a ability to support mutable/immutable partition, the PR works this way: 1. Support mutable property for a partition. 2. The mutable property of a partition is passed from FE to BE in a load procedure 3. If a record's partition is immutable, we mark this row as "un selected" which will not be included in computation of 'max_filter_ratio', so that data write to immutable partition will be neglected and not cause load failure. Use Example: 1. Add immutable partition or modify an partition to be immutable: - alter table test_tbl add [temporary] partition xxx values less than ('xxx') ('mutable' = 'true'); - alter table test_tbl modify partition xx set ('mutable' = 'false'); 2. Write 5 records into table, two of then belongs to immutable partition	2023-02-18 23:09:34 +08:00
ZhaoChangle	d6a841409f	[Enhancement](func)Introduce non_nullable extraction function. #16621 Introduced a new function non_nullable to BE, which can extract concrete data column from a nullable column. If the input argument is already not a nullable column, raise an error.	2023-02-18 20:44:07 +08:00
AKIRA	861e4bc64a	[fix](planner) Nullable of slot descriptor is mistaken and cause BE crash #16862	2023-02-18 20:39:56 +08:00
morrySnow	9b94729c87	Revert "[test](pipeline) Run nereids cases in p1/p2 (#16130 )" (#16792 ) This reverts commit b480db2e119ac0516e8621ea3d53c40f250c1d24.	2023-02-17 18:48:27 +08:00
lihangyu	6acee1ce88	[Fix](topn opt) double check plan From OriginalPlanner to make sure optimized SQL is a general topn query (#16848 ) From the original logic, query like `select * from a where exists (select * from b order by 1) order by 1 limit 1` is a query contains subquery, but the top query will pass `checkEnableTwoPhaseRead` and set `isTwoPhaseOptEnabled=true`.So check the double plan is a general topn query plan is needed, and rollback the needMaterialize flag setted by the previous `analyze`.	2023-02-17 10:59:35 +08:00
lihangyu	5dfd6d2390	[improve](dynamic table) refine SegmentWriter columns writer generate (#16816 ) * [improve](dynamic table) refine SegmentWriter columns writer generate ``` Dynamic Block consists of two parts, dynamic part of columns and static part of columns static dynamic \| ----- \| ------- \| the static ones are original _tablet_schame columns the dynamic ones are auto generated and extended from file scan. ``` We should only consisder to use Block info to generte columns when it's a dynamic table load procudure. And seperate the static ones and dynamic ones * test	2023-02-17 10:24:33 +08:00
Gabriel	b35998a3b7	[Bug](datetimev2) Support cast datetimev2 to datetimev2 with different precision #16826	2023-02-17 08:42:36 +08:00
HappenLee	de1337511c	[Bug](Datetime) Fix date time function mem use after free (#16814 )	2023-02-16 16:15:58 +08:00
Jibing-Li	292926e5aa	[Fix](multi catalog)Fix partition case bug (#16763 ) Set column names from path to lower case in case-insensitive case. This is for Iceberg columns from path. Iceberg columns are case sensitive, which may cause error for table with partitions.	2023-02-16 15:47:23 +08:00
mch_ucchi	b6f2dfa994	[test](Nereids) add not nullable test for scalar functions (#16498 )	2023-02-16 11:57:19 +08:00
xy720	0c56a4622c	[Feature](struct-type) Add implicitly cast for struct-type (#16613 ) Currently not support insert {1, 'a'} into struct<f1:tinyint, f2:varchar(20)> This commit will support implicitly cast the char type in the struct to varchar. Add implicitly cast for struct-type.	2023-02-15 16:55:00 +08:00
Pxl	ea78184551	[Feature](Materialized-View) support multiple slot on one column in materialized view (#16378 )	2023-02-14 16:10:50 +08:00
TengJianPing	fb0d08ff4c	[fix](mark join) fix bug of mark join with other conjuncts (#16655 ) Fix bug that probe_index is not increased for mark hash join with other conjuncts.	2023-02-14 14:47:15 +08:00
zhangstar333	af5dc7565e	[bug](udf) fix udf return type of decimal check scale must is 9 (#16497 )	2023-02-14 10:53:53 +08:00
xueweizhang	90af1b0113	[fix](subquery) fix bug of using constexpr and some agg func(like count,max) as subquery's output (#16579 ) Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2023-02-14 00:11:56 +08:00
lihangyu	36955a6769	[regression-test](dynamic-table) add regression test for dynamic table (#16656 )	2023-02-14 00:03:19 +08:00
YangShaw	77a3288ce7	[feature](Nereids) support window function (#14397 )	2023-02-13 21:20:56 +08:00
minghong	a2b9b9edd7	[fix](planner) fix bug in agg on constant column (#16442 ) For performance reason, we want to remove constant column from groupingExprs. For example: `select sum(T.A) from T group by T.B, 'xyz'` is equivalent to `select sum(T.A) from T group by T.B` We can remove constant column `abc` from groupingExprs. But there is an exception when all groupingExpr are constant For example: sql1: `select 'abc' from t group by 'abc'` is not equivalent to sql2: `select 'abc' from t` sql3: `select 'abc', sum(a) from t group by 'abc'` is not equivalent to sql4: `select 1, sum(a) from t` (when t is empty, sql3 returns 0 tuple, sql4 return 1 tuple) We need to keep some constant columns if all groupingExpr are constant. Consider sql5 `select a from (select "abc" as a, 'def' as b) T group by b, a;` if the constant column `a` is in select list, this column should not be removed. sql5 is transformed to sql6 `select a from (select "abc" as a, 'def' as b) T group by a;`	2023-02-13 11:26:08 +08:00
Tiewei Fang	3c3110b253	[Fix](Jdbc Catalog) jdbc catalog support to connect to doris database (#16527 ) Doris can use mysql-jdbc-jar to connect doris database, but doris has some data type that mysql without. Such as DecimalV3 and Date/DatetimeV2 I add some case judgments in `Mysql Catalog` , so that Jdbc catalog can identify the data type of DORIS	2023-02-10 20:24:40 +08:00
YueW	43eca4f209	[Feature-WIP](inverted index) Implementation for alter inverted index. (#16371 ) implementation for add/drop inverted index.	2023-02-10 17:56:17 +08:00
Xin Liao	6a5277b391	[fix](sequence-column) MergeIterator does not use the correct seq column for comparison (#16494 )	2023-02-10 17:51:15 +08:00
zhangdong	8758cd412f	[feature](auth)Implementing privilege management with rbac model (#16091 ) change implement of auth to rbac each user has one default role which can not be drop; if you grant priv to user,it will grant to default role , In the current pr, the user can still only have one role other than the default role, but in the future, the user and role will be many-to-many rename PaloRole,PaloAuth,PaloPrivilege to Role,Auth,Privilege	2023-02-10 12:30:49 +08:00
xueweizhang	379bef598d	[fix-core](block) clear block row_same_bit when block reuse (#16172 )	2023-02-10 12:21:27 +08:00
AKIRA	0c20c607b2	fix stats (#16556 )	2023-02-10 11:00:01 +08:00
morrySnow	05ed1f751b	[fix](planner)(Nereids) add date and datev2 signature to greatest and least function (#16565 )	2023-02-09 21:36:53 +08:00
starocean999	f0b0eedbc5	[fix](planner)group_concat lost order by info in second phase merge agg (#16479 )	2023-02-08 20:48:52 +08:00
morrySnow	a512469537	[fix](planner) cannot process more than one subquery in disjunct (#16506 ) before this PR, Doris cannot process sql like that ```sql CREATE TABLE `test_sq_dj1` ( `c1` int(11) NULL, `c2` int(11) NULL, `c3` int(11) NULL ) ENGINE=OLAP DUPLICATE KEY(`c1`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`c1`) BUCKETS 3 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); CREATE TABLE `test_sq_dj2` ( `c1` int(11) NULL, `c2` int(11) NULL, `c3` int(11) NULL ) ENGINE=OLAP DUPLICATE KEY(`c1`) COMMENT 'OLAP' DISTRIBUTED BY HASH(`c1`) BUCKETS 3 PROPERTIES ( "replication_allocation" = "tag.location.default: 1", "in_memory" = "false", "storage_format" = "V2", "disable_auto_compaction" = "false" ); insert into test_sq_dj1 values(1, 2, 3), (10, 20, 30), (100, 200, 300); insert into test_sq_dj2 values(10, 20, 30); -- core SELECT * FROM test_sq_dj1 WHERE c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 < 10; -- invalid slot SELECT * FROM test_sq_dj1 WHERE c1 IN (SELECT c1 FROM test_sq_dj2) OR c1 IN (SELECT c2 FROM test_sq_dj2) OR c1 < 10; ``` there are two problems: 1. we should remove redundant sub-query in one conjuncts to avoid generate useless join node 2. when we have more than one sub-query in one disjunct. we should put the conjunct contains the disjunct at the top node of the set of mark join nodes. And pop up the mark slot to the top node.	2023-02-08 18:46:06 +08:00
TengJianPing	f6a20f844b	[fix](hashjoin) join produce blocks with rows larger than batch size: handle join with other conjuncts (#16402 )	2023-02-08 14:26:35 +08:00
abmdocrt	41947c73eb	[Feature](array-function) Support array functions for nested type datev2 and datetimev2 (#16382 )	2023-02-08 12:51:07 +08:00
morrySnow	81dbed70c2	[fix](Nereids) back off on tpch p1 (#16478 ) adjust nullable on empty set should apply after unnested sub-query some function should propagate nullable when args are datev2 or datetimev2 add back tpch sf0.1 nereids regression test	2023-02-08 10:43:13 +08:00
luozenglin	289a4b2ea4	[fix](func) fix truncate float type result error (#16468 ) When the argument of truncate function is float type, it can match both truncate(DECIMALV3) and truncate(DOUBLE), if the match is truncate(DECIMALV3), the precision is lost when converting float to DECIMALV3(38, 0). Here I modify it to match truncate(DOUBLE) for now, maybe we still need to solve the problem of losing precision when converting float to DECIMALV3.	2023-02-08 08:57:43 +08:00
lihangyu	1d0fdff98a	[Bug](sort) disable 2phase read for sort by expressions exclude slotref (#16460 ) ``` create table tbl1 (k1 varchar(100), k2 string) distributed by hash(k1) buckets 1 properties("replication_num" = "1"); insert into tbl1 values(1, "alice"); select cast(k1 as INT) as id from tbl1 order by id limit 2; ``` The above query could pass `checkEnableTwoPhaseRead` since the order by element is SlotRef but actually it's an function call expr	2023-02-07 19:42:54 +08:00
Gabriel	91229bb87d	[Bug](makr join) Fix mark join with other conjuncts (#16435 )	2023-02-07 09:31:41 +08:00
Jibing-Li	a13beca0de	[Fix](load)Use lower case for load column names. #16422 The columns name in stream load and broker load are case sensitive, make it case insensitive. This would be consist with query, because query sql columns name are case insensitve.	2023-02-07 09:18:37 +08:00
Kang	36a5e0a2a9	[bugfix](array) fix element revert on error in DataTypeArray::from_string (#16434 ) * fix array from_string element revert on error * add testcase	2023-02-06 18:27:36 +08:00
starocean999	dccd04a3ba	[fix](fe)predicate is wrongly pushed through CUBE function (#15831 )	2023-02-06 11:29:15 +08:00
lihangyu	f2fd47f238	[Improve](row-store) support row cache (#16263 )	2023-02-06 11:16:39 +08:00
Mingyu Chen	f940cf4cf6	[fix](multi-catalog) fix recursive get schema cache bug (#16415 )	2023-02-06 09:23:07 +08:00
luozenglin	09870098af	[fix](func) fix core dump when the pattern of the regexp_extract_all function does not contain subpatterns (#16408 )	2023-02-05 01:16:54 +08:00
starocean999	df3a6e2412	[fix](fe)only set column info for slots in sortTupleDesc (#16407 )	2023-02-04 23:14:25 +08:00
gnehil	ca7b2e27a8	[regression-test](function) add regression test for money_format with truncate (#16052 )	2023-02-04 23:10:01 +08:00
starocean999	dd63897757	[fix](be)the set operation node should accept both nullable and non-nullable data from child node (#16126 )	2023-02-04 23:08:59 +08:00
Gabriel	918004c016	[Bug](date) Fix BE crash caused by function `datediff` (#16397 ) * [Bug](date) Fix BE crash caused by function `datediff` * update	2023-02-04 18:43:23 +08:00
Kang	125b60b4b9	[improvement](compatibility) add DATA_TYPE in information schema for new types #16391 Add DATA_TYPE in information schema for types: datev2, datatimev2, decimal, jsonb. It was 'unknown' for these types and cause problem for tools such as BI using information schema.	2023-02-03 22:28:42 +08:00
minghong	4f778c38a1	[feature](nereids) support explore 4 phase aggregation (#16298 ) support 4 phase Aggregation. example: `select count(distinct k1), sum(k2) from t` suppose t.k0 is distribute key. we have plan ``` Agg(DISTINCT_GLOBAL) \| Exchange(Gather) \| Agg(DISTINCT_LOCAL) \| Agg(GLOBAL) \| Exchange(hash distribute by k1) \| Agg(LOCAL) \| scan ``` limitations: 1. only support sql with one distinct. not support:`select count(distinct k1), count(distinct k2) from t` 2. only support sql with distinct one column not support: `select count(distinct k1, k2) from t`	2023-02-03 21:51:10 +08:00
AKIRA	5e232a30d8	[fix](planner) Doris returns empty sets when select from a inline view (#16370 ) Doris always delays the execution of expressions as possible as it can, so as the expansion of constant expression. Given below SQL: ```sql select i from (select 'abc' as i, sum(birth) as j from subquerytest2) as tmp ``` The aggregation would be eliminated, since its output is not required by the outer block, but the expasion for constant expression would be done in the final result expr, and since aggreagete output has been eliminate, the expasion would actually do nothing, and finally cause a empty results. To fix this, we materialize the results expr in the inner block for such SQL, it may affect performance, but better than let system produce a mistaken result.	2023-02-03 21:23:52 +08:00
AKIRA	a5d9aca7ba	[test](Nereids) enable G-K and L-Q scalar function regression test cases (#16169 ) 1. delete invalid signature of nvl function 2. fix some test cases that failed because of malformed function name	2023-02-03 21:18:43 +08:00
Gabriel	87fbb8341a	[Bug](datev2) Fix bug when cast datev2 to date (#16394 )	2023-02-03 20:50:16 +08:00

1 2 3 4 5 ...

812 Commits