doris

Author	SHA1	Message	Date
zy-kkk	88e2753e40	[fix](Nereids) fix ShowProcedureStatusCommand sendResultSet (#35355 )	2024-05-24 17:22:07 +08:00
TengJianPing	639c7ee7fb	[fix](decimalv2) fix scale of decimalv2 to string (#35222 ) (#35359 ) * [fix](decimalv2) fix scale of decimalv2 to string	2024-05-24 17:20:43 +08:00
Mingyu Chen	ca86ee7b15	[fix](load) fix wrong assert and cancel load error (#35362 )	2024-05-24 17:11:01 +08:00
feiniaofeiafei	1e07971a98	[Feat](nereids)when dealing insert into stmt with empty table source, fe returns directly (#35333 ) * [Feat](nereids) when dealing insert into stmt with empty table source, fe returns directly (#34418) When a LogicalOlapScan has no partitions, transform it to a LogicalEmptyRelation. When dealing insert into stmt with empty table source, fe returns directly. * [Fix](nereids) fix when insert into select empty table --------- Co-authored-by: feiniaofeiafei <moailing@selectdb.com>	2024-05-24 16:25:00 +08:00
starocean999	bfe293c725	[fix](nereids) AdjustNullable rule should handle union node with no children (#35074 ) The output slot's nullable info was not calculated correctly in the union node, because the old code only produced the correct result when the union node had children. A union node may have no children and only a constantExprList, so in that case the output's nullable info should be calculated from both the children and constantExprList.	2024-05-24 16:23:58 +08:00
Tiewei Fang	f6beeb1ddd	[Enhencement](tvf) select tvf supports using resource (#35139 ) Create an S3/HDFS resource that a TVF can use directly to access the data source.	2024-05-24 16:23:58 +08:00
seawinde	d6e8fb7d77	[feature](mtmv) Support agg state roll up and optimize the roll up code (#35026 ) agg_state is the intermediate state of an aggregate; for details see the state combinator: https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-functions/combinators/state This supports aggregate function roll-up as follows: +---------------------+---------------------------------------------+---------------------+ \| query \| materialized view \| roll up \| \| ------------------- \| ------------------------------------------- \| ------------------- \| \| agg_function() \| agg_function_union() or agg_function_state() \| agg_function_merge() \| \| agg_function_union() \| agg_function_union() or agg_function_state() \| agg_function_union() \| \| agg_function_merge() \| agg_function_union() or agg_function_state() \| agg_function_merge() \| +---------------------+---------------------------------------------+---------------------+ For example, the following query can be rewritten by the MV successfully. MV definition is ``` select o_orderstatus, l_partkey, l_suppkey, sum_union(sum_state(o_shippriority)), group_concat_union(group_concat_state(l_shipinstruct)), avg_union(avg_state(l_linenumber)), max_by_union(max_by_state(l_shipmode, l_suppkey)), count_union(count_state(l_orderkey)), multi_distinct_count_union(multi_distinct_count_state(l_shipmode)) from lineitem left join orders on lineitem.l_orderkey = o_orderkey and l_shipdate = o_orderdate group by o_orderstatus, l_partkey, l_suppkey; ``` Query is ``` select o_orderstatus, l_suppkey, sum(o_shippriority), group_concat(l_shipinstruct), avg(l_linenumber), max_by(l_shipmode,l_suppkey), count(l_orderkey), multi_distinct_count(l_shipmode) from lineitem left join orders on l_orderkey = o_orderkey and l_shipdate = o_orderdate group by o_orderstatus, l_suppkey; ```	2024-05-24 16:23:58 +08:00
Calvin Kirs	bbf502dfcf	[fix](create-table)The CREATE TABLE IF NOT EXISTS AS SELECT statement should refrain from performing any INSERT operations if the table already exists (#35210 )	2024-05-24 16:23:58 +08:00
feiniaofeiafei	bd4dd94c24	[Fix](nereids) add checkBlockRules() check for create view and alter view (#34104 )	2024-05-24 16:23:58 +08:00
Yongqiang YANG	78fab91d6b	[fix](overflow) show backends overflow for backend ids (#35245 )	2024-05-24 16:23:58 +08:00
Xujian Duan	dd567fa774	[fix](function) support return JsonType for If function (#35199 ) Add a FunctionSignature for If so that it can return JsonType.	2024-05-24 16:23:58 +08:00
morrySnow	98b2bda660	[opt](Nereids) remove restrict for count(*) in window (#35220 ) Support count(*) used as a window function. CREATE TABLE `t1` ( `id` INT NULL, `dt` TEXT NULL ) DISTRIBUTED BY HASH(`id`) BUCKETS 10 PROPERTIES ( "replication_allocation" = "tag.location.default: 1" ); select *, count(*) over() from t1;	2024-05-24 16:23:58 +08:00
walter	473e14ca82	[chore](backup) log backup/restore job during replay (#35234 )	2024-05-24 16:23:57 +08:00
catpineapple	edb276ad92	[fix](typo)fix show backend typo (#35198 )	2024-05-24 16:23:57 +08:00
zy-kkk	cf46ebe31d	[improve](jdbc catalog) Remove all property checks during create (#35194 ) (#35354 )	2024-05-24 16:12:02 +08:00
starocean999	f062506b22	[fix](nereids)the preagg state for count(*) is wrong (#35326 )	2024-05-24 15:23:04 +08:00
wangqt	0b90e37227	[fix](Nereids) string literal coercion of in predicate (#35337 ) pick from master #35200 Description: The SQL executes much more slowly when the literal values in an `in predicate` are strings and the real data is of integral type. ``` mysql> set enable_nereids_planner = false; Query OK, 0 rows affected (0.03 sec) mysql> select id,sum(clicks) from a_table where id in ('787934713', '306960695') group by id limit 10; +------------+---------------+ \| id \| sum(`clicks`) \| +------------+---------------+ \| 787934713 \| 2838 \| \| 306960695 \| 339 \| +------------+---------------+ 2 rows in set (1.81 sec) mysql> set enable_nereids_planner = true; Query OK, 0 rows affected (0.02 sec) mysql> select id,sum(clicks) from a_table where id in ('787934713', '306960695') group by id limit 10; +------------+-------------+ \| id \| sum(clicks) \| +------------+-------------+ \| 787934713 \| 2838 \| \| 306960695 \| 339 \| +------------+-------------+ 2 rows in set (28.14 sec) ``` Reason: In the legacy planner, the string literal is converted to an integral value, but the Nereids planner does not do this conversion and does string matching in BE. Solved: process numeric string literals in `in predicate` the same way as in `comparison predicate`. Test table: ``` create table a_table( k1 BIGINT NOT NULL, k2 VARCHAR(100) NOT NULL, v1 INT SUM NULL DEFAULT "0" ) ENGINE=OLAP AGGREGATE KEY(k1,k2) distributed BY hash(k1) buckets 2 properties("replication_num" = "1"); insert into a_table values (10, 'name1', 10),(20, 'name2', 10); explain plan select * from a_table where k1 in ('10', '20001'); ``` before optimize: ``` +--------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String(Nereids Planner) \| +--------------------------------------------------------------------------------------------------------------------------------------+ \| ========== PARSED PLAN (time: 1ms) ========== \| \| 
UnboundResultSink[4] ( ) \| \| +--LogicalProject[3] ( distinct=false, projects=[], excepts=[] ) \| \| +--LogicalFilter[2] ( predicates='k1 IN ('10001', '20001') ) \| \| +--LogicalCheckPolicy ( ) \| \| +--UnboundRelation ( id=RelationId#0, nameParts=a_table ) \| \| \| \| ========== ANALYZED PLAN (time: 2ms) ========== \| \| LogicalResultSink[15] ( outputExprs=[k1#0, k2#1, v1#2] ) \| \| +--LogicalProject[13] ( distinct=false, projects=[k1#0, k2#1, v1#2], excepts=[] ) \| \| +--LogicalFilter[11] ( predicates=cast(k1#0 as TEXT) IN ('10001', '20001') ) \| \| +--LogicalOlapScan ( qualified=internal.db.a_table, indexName=<index_not_selected>, selectedIndexId=12003, preAgg=UNSET ) \| \| \| \| ========== REWRITTEN PLAN (time: 6ms) ========== \| \| LogicalResultSink[45] ( outputExprs=[k1#0, k2#1, v1#2] ) \| \| +--LogicalFilter[43] ( predicates=cast(k1#0 as TEXT) IN ('10001', '20001') ) \| \| +--LogicalOlapScan ( qualified=internal.db.a_table, indexName=a_table, selectedIndexId=12003, preAgg=OFF, No aggregate on scan. 
) \| \| \| \| ========== OPTIMIZED PLAN (time: 6ms) ========== \| \| PhysicalResultSink[90] ( outputExprs=[k1#0, k2#1, v1#2] ) \| \| +--PhysicalDistribute[87]@1 ( stats=0.33, distributionSpec=DistributionSpecGather ) \| \| +--PhysicalFilter[84]@1 ( stats=0.33, predicates=cast(k1#0 as TEXT) IN ('10001', '20001') ) \| \| +--PhysicalOlapScan[a_table]@0 ( stats=1 ) \| +--------------------------------------------------------------------------------------------------------------------------------------+ ``` after optimize: ``` +--------------------------------------------------------------------------------------------------------------------------------------+ \| Explain String(Nereids Planner) \| +--------------------------------------------------------------------------------------------------------------------------------------+ \| ========== PARSED PLAN (time: 15ms) ========== \| \| UnboundResultSink[4] ( ) \| \| +--LogicalProject[3] ( distinct=false, projects=[], excepts=[] ) \| \| +--LogicalFilter[2] ( predicates='k1 IN ('10001', '20001') ) \| \| +--LogicalCheckPolicy ( ) \| \| +--UnboundRelation ( id=RelationId#0, nameParts=a_table ) \| \| \| \| ========== ANALYZED PLAN (time: 11ms) ========== \| \| LogicalResultSink[15] ( outputExprs=[k1#0, k2#1, v1#2] ) \| \| +--LogicalProject[13] ( distinct=false, projects=[k1#0, k2#1, v1#2], excepts=[] ) \| \| +--LogicalFilter[11] ( predicates=k1#0 IN (10001, 20001) ) \| \| +--LogicalOlapScan ( qualified=internal.db.a_table, indexName=<index_not_selected>, selectedIndexId=12003, preAgg=UNSET ) \| \| \| \| ========== REWRITTEN PLAN (time: 12ms) ========== \| \| LogicalResultSink[45] ( outputExprs=[k1#0, k2#1, v1#2] ) \| \| +--LogicalFilter[43] ( predicates=k1#0 IN (10001, 20001) ) \| \| +--LogicalOlapScan ( qualified=internal.db.a_table, indexName=a_table, selectedIndexId=12003, preAgg=OFF, No aggregate on scan. 
) \| \| \| \| ========== OPTIMIZED PLAN (time: 4ms) ========== \| \| PhysicalResultSink[90] ( outputExprs=[k1#0, k2#1, v1#2] ) \| \| +--PhysicalDistribute[87]@1 ( stats=0, distributionSpec=DistributionSpecGather ) \| \| +--PhysicalFilter[84]@1 ( stats=0, predicates=k1#0 IN (10001, 20001) ) \| \| +--PhysicalOlapScan[a_table]@0 ( stats=2 ) \| +--------------------------------------------------------------------------------------------------------------------------------------+ ```	2024-05-24 14:26:52 +08:00
starocean999	bb3a0fd30e	[fix](nereids)should use nereids expr's nullable info when call Expr's toThrift method (#35274 )	2024-05-24 02:24:40 +08:00
starocean999	9277480f00	[fix](nereids)days_diff should match datetimev2 function sigature in higher priority (#35295 )	2024-05-24 02:21:55 +08:00
zhangdong	a52ee6e9b9	[opt](mtmv) generate bi-map between base table and materialized view partitions (#35131 )	2024-05-23 19:11:33 +08:00
HHoflittlefish777	9ba995317a	[fix](routineload) fix data source properties do not persist in edit log (#35137 )	2024-05-23 19:09:41 +08:00
924060929	bf37e5c905	[feature](Nereids) support select distinct with aggregate (#35300 ) (cherry picked from commit adcbc8cce57aaec507174f39536a028db803a2e5)	2024-05-23 19:01:10 +08:00
seawinde	4075408b84	[feature](mtmv)Support single table mv rewrite (#34185 ) (#35242 ) Support single-table query rewrite without group by. This is useful for complex filters or expressions. The MV definition and query below are an example that can be rewritten. mv def: ``` select * from lineitem where l_comment like '%xx%' ``` query: ``` select l_linenumber, l_receiptdate from lineitem where l_comment like '%xx%' ``` Co-authored-by: zfr9527 <qhu15zhang3294197@163.com>	2024-05-23 19:00:36 +08:00
seawinde	82887cc2b3	[improvement](mtmv)Split expression get cherry pick21 (#35240 ) * [improvement](mtmv) Split the expression mapping in LogicalCompatibilityContext for performance (#34646) The query-to-view expression mapping is needed when checking whether the hyper graphs are logically equal. Getting all expression mappings at once may hurt performance, so split the expressions into three types (JOIN_EDGE, NODE, FILTER_EDGE) and get them step by step. * fix code style	2024-05-23 18:59:56 +08:00
walter	acf741fa80	[feature](binlog) Support gc binlogs by history nums and size (#35250 ) * [chore](binlog) Add logs about binlog gc (#34359) * [feature](binlog) Support gc binlogs by history nums and size (#34888)	2024-05-23 14:39:57 +08:00
jakevin	0b440685d9	[fix](nereids): fix PlanPostProcessor use visitor (#35244 ) (cherry picked from commit 46e004a358b9e13adb492d376f77e4317e558a6a)	2024-05-23 14:12:25 +08:00
Mingyu Chen	adc364a6fd	[feature](Paimon) support deletion vector for Paimon naive reader (#34743 ) (#35241 ) bp #34743 Co-authored-by: 苏小刚 <suxiaogang223@icloud.com>	2024-05-23 00:01:30 +08:00
zy-kkk	3a5fb6265a	[refactor](jdbc catalog) split trino jdbc executor (#34932 ) (#35176 ) pick #34932	2024-05-22 19:09:57 +08:00
zy-kkk	05a390e050	[refactor](jdbc catalog) split oceanbase jdbc executor (#34869 ) (#35175 ) pick #34869	2024-05-22 19:09:35 +08:00
Qi Chen	291cf57c54	[Configurations](multi-catalog) Add `enable_parquet_filter_by_min_max` and `enable_orc_filter_by_min_max` Session variables. (#35012 ) (#35164 ) backport #35012	2024-05-22 19:06:12 +08:00
Mingyu Chen	05cedfca4e	[fix](hudi) catch exception when getting hudi partition (#35027 ) (#35159 ) bp #35027	2024-05-22 18:44:19 +08:00
morrySnow	9ed4a2023b	[fix](Nereids) DatetimeV2 round floor and round ceiling is wrong (#35153 ) (#35155 ) pick from master #35153 1. round floor was incorrectly implemented as round 2. round ceiling did not really round because a double type was used when dividing	2024-05-22 16:23:20 +08:00
feiniaofeiafei	15f70c8183	[Feat](planner)create table stmt offer default distribution attribute :random distribution and auto bucket (#35189 ) Co-authored-by: feiniaofeiafei <moailing@selectdb.com>	2024-05-22 15:18:29 +08:00
yiguolei	dbf7a76592	Revert "[Chore](rollup) check duplicate column name when create table with rollup (#34827 )" This reverts commit 4a8df535537e8eab8fa2ad54934a185e17d4e660.	2024-05-22 10:19:51 +08:00
Xujian Duan	af7b16f213	[optimize](desc) display the correct data type of aggStateType (#34968 ) If a table column is AGG_STATE type, we can't get the clear defined data type if we use `desc tbl` statement. create table a_table( k1 int null, k2 agg_state<max_by(int not null,int)> generic, k3 agg_state<group_concat(string)> generic ) aggregate key (k1) distributed BY hash(k1) buckets 3 properties("replication_num" = "1"); before optimize: mysql> desc a_table; +-------+------------------------------------------------+------+-------+---------+---------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +-------+------------------------------------------------+------+-------+---------+---------+ \| k1 \| INT \| Yes \| true \| NULL \| \| \| k2 \| org.apache.doris.catalog.AggStateType@239f771c \| No \| false \| NULL \| GENERIC \| \| k3 \| org.apache.doris.catalog.AggStateType@2e535f50 \| No \| false \| NULL \| GENERIC \| +-------+------------------------------------------------+------+-------+---------+---------+ 3 rows in set (0.00 sec) after optimize: mysql> desc a_table; +-------+------------------------------------+------+-------+---------+---------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +-------+------------------------------------+------+-------+---------+---------+ \| k1 \| INT \| Yes \| true \| NULL \| \| \| k2 \| AGG_STATE<max_by(INT, INT NULL)> \| No \| false \| NULL \| GENERIC \| \| k3 \| AGG_STATE<group_concat(TEXT NULL)> \| No \| false \| NULL \| GENERIC \| +-------+------------------------------------+------+-------+---------+---------+ Co-authored-by: duanxujian <duanxujian@jd.com>	2024-05-22 10:03:31 +08:00
zhiqiang	e8fb47bec1	[fix](broker load) Make Config.enable_pipeline_load works as expected for BrokerLoad (#35105 ) * FIX LOAD PROFILE * FIX	2024-05-22 10:02:02 +08:00
xzj7019	7ae83b60fd	[opt](Nereids) opt locality under multi-replica (#34927 ) Make tablet locality fixed under multi-replica cases. Session variable: set enable_ordered_scan_range_locations = true, default false; 3 replica tpcds 100g: 7% improvement	2024-05-22 10:00:13 +08:00
Jibing-Li	37f1bf317c	[fix](statistics)Disable fetch min/max column stats through HMS, because the value may inaccurate and misleading. (#35124 ) (#35145 ) backport #35124	2024-05-21 22:58:12 +08:00
wuwenchi	009ab77c25	[feature](iceberg)Support write to iceberg for 2.1 (#35103 ) #34257 #33629 bp: #34257 #33629	2024-05-21 22:46:37 +08:00
Mingyu Chen	903ff32021	[opt](fe) exit FE when transfer to (non)master failed (#34809 ) (#35158 ) bp #34809	2024-05-21 22:31:47 +08:00
Ashin Gau	98f8eb5c43	[opt](split) get file splits in batch mode (#34032 ) (#35107 ) bp #34032	2024-05-21 22:27:07 +08:00
GoGoWen	0599cb2efd	fix replica's remote data size set to data size (#35098 ) fix replica's remote data size set to data size	2024-05-21 16:48:08 +08:00
yujun	706c9c473b	[fix](autobucket) calc bucket num exclude today's partition #34304 #35129	2024-05-21 15:49:16 +08:00
HHoflittlefish777	44bb2bb639	[opt](routine-load) do not schedule invalid task (#34918 )	2024-05-21 13:02:42 +08:00
Tiewei Fang	c0fd98abe5	[Fix](tvf) Fix that tvf reading empty files in compressed formats. (#34926 ) 1. Fix the issue with tvf reading empty compressed files. 2. move two test cases (`test_local_tvf_compression` and `test_s3_tvf_compression`) from p2 to p0	2024-05-21 12:59:31 +08:00
starocean999	f3762322c8	[opt](nereids)new way to set pre-agg status (#34738 )	2024-05-21 12:54:49 +08:00
minghong	518b143caa	[feat](Nereids)choose agg mv in cbo #35020	2024-05-21 12:54:10 +08:00
morrySnow	45c145fdf7	[fix](Nereids) LogicalPlanDeepCopier copy scan conjuncts in wrong way (#35077 ) pick from master #35076 intro by PR #34933 This PR attempts to address the issue of losing conjuncts when performing a deep copy of the outer structure. However, the timing of copying the conjuncts is incorrect, resulting in the inability to map slots within the conjuncts to the output of the outer structure.	2024-05-20 21:49:53 +08:00
abmdocrt	42425808a1	[Cherry-Pick](branch-2.1) Pick "Fix multiple replica partial update auto inc data inconsistency problem #34788 " (#35056 ) * [Fix](auto inc) Fix multiple replica partial update auto inc data inconsistency problem (#34788) * Problem: For tables with auto-increment columns, updating partial columns can cause data inconsistency among replicas. Cause: Previously, the implementation for updating partial columns in tables with auto-increment columns was done independently on each BE (Backend), leading to potential inconsistencies in the auto-increment column values generated by each BE. Solution: Before distributing blocks, determine if the update involves partial columns of a table with an auto-increment column. If so, add the auto-increment column to the last column of the block. After distributing to each BE, each BE will check if the data key for the partial column update exists. If it exists, the previous auto-increment column value is used; if not, the auto-increment column value from the last column of the block is used. This ensures that the auto-increment column values are consistent across different BEs. * 2 * [Fix](regression-test) Fix auto inc partial update unstable regression test (#34940)	2024-05-20 15:43:46 +08:00
yiguolei	a43c6eca22	[chore](femetaversion) add a check in fe code to avoid fe meta version changed during pick PR (#35039 ) * [chore](femetaversion) add a check in fe code to avoid fe meta version changed during pick PR * f * f --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-05-20 13:29:17 +08:00


6958 Commits