Commit Graph

4047 Commits

c29582bd57 [pipeline](split by segment)support segment split by scanner (#17738)
* support segment split by scanner

* revise code per code review comments
2023-03-16 15:25:52 +08:00
b3d8be7cac [fix](cooldown)add push conf for alter storage policy (#17818)
* add push conf for alter storage policy
2023-03-16 14:27:27 +08:00
ee7226348d [FIX](Map) fix map compaction error (#17795)
In the compaction case, in-memory map offsets arrive at the same OLAP converter as a range from 0 to 0+size,
but they should be continuous across pages within one segment writer.
E.g.:
last block with map offsets: [3, 6, 8, ..., 100]
this block with map offsets: [5, 10, 15, ..., 100]
The same converter should record the last offset so that later incoming offsets follow it.
After conversion, the current offsets should be [105, 110, 115, ..., 200]; the column writer then just calls append_data() to append pages with the correct offset data.
2023-03-16 13:54:01 +08:00
0086fdbbdb [enhancement](planner) support delete from using syntax (#17787)
Support the DELETE ... USING syntax; this syntax is only supported on the UNIQUE KEY model.

Use the result of `t2` joined with `t3` to remove rows from `t1`:

```sql
-- create t1, t2, t3 tables
CREATE TABLE t1
  (id INT, c1 BIGINT, c2 STRING, c3 DOUBLE, c4 DATE)
UNIQUE KEY (id)
DISTRIBUTED BY HASH (id)
PROPERTIES('replication_num'='1', "function_column.sequence_col" = "c4");

CREATE TABLE t2
  (id INT, c1 BIGINT, c2 STRING, c3 DOUBLE, c4 DATE)
DISTRIBUTED BY HASH (id)
PROPERTIES('replication_num'='1');

CREATE TABLE t3
  (id INT)
DISTRIBUTED BY HASH (id)
PROPERTIES('replication_num'='1');

-- insert data
INSERT INTO t1 VALUES
  (1, 1, '1', 1.0, '2000-01-01'),
  (2, 2, '2', 2.0, '2000-01-02'),
  (3, 3, '3', 3.0, '2000-01-03');

INSERT INTO t2 VALUES
  (1, 10, '10', 10.0, '2000-01-10'),
  (2, 20, '20', 20.0, '2000-01-20'),
  (3, 30, '30', 30.0, '2000-01-30'),
  (4, 4, '4', 4.0, '2000-01-04'),
  (5, 5, '5', 5.0, '2000-01-05');

INSERT INTO t3 VALUES
  (1),
  (4),
  (5);

-- remove rows from t1
DELETE FROM t1
  USING t2 INNER JOIN t3 ON t2.id = t3.id
  WHERE t1.id = t2.id;
```

The expected result is that only the row where id = 1 is removed from table t1:

```
+----+----+----+--------+------------+
| id | c1 | c2 | c3     | c4         |
+----+----+----+--------+------------+
| 2  | 2  | 2  |    2.0 | 2000-01-02 |
| 3  | 3  | 3  |    3.0 | 2000-01-03 |
+----+----+----+--------+------------+
```
2023-03-16 13:12:00 +08:00
bece027135 [enhancement](profile) Add HTTP interface for q-error (#17786)
1. Add an HTTP interface for query q-error
2. Fix the selectivity calculation of inner join; previously it was always 0 when there was only one join condition
2023-03-16 12:19:23 +08:00
c2edca7bda [fix](Nereids) construct project with all slots in semi-semi-transpose-project rule (#17811)
Error message in TPC-H Q20:
```
SlotRef have invalid slot id: , desc: 22, slot_desc: tuple_desc_map: [Tuple(id=10 slots=[Slot(id=51 type=DECIMALV2(27, 9) col=-1, colname= null=(offset=0 mask=80)), Slot(id=52 type=INT col=-1, colname= null=(offset=0 mask=0)), Slot(id=53 type=INT col=-1, colname= null=(offset=0 mask=0)), Slot(id=54 type=INT col=-1, colname= null=(offset=0 mask=0)), Slot(id=55 type=INT col=-1, colname= null=(offset=0 mask=0))] has_varlen_slots=0)] tuple_id_map: [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0] tuple_is_nullable: [0] , desc_tbl: Slot(id=22 type=INT col=-1, colname= null=(offset=0 mask=0))
```

Previously we only used slots from `hashJoin` conditions to construct projects, which may lose some slots in `project`, for example:
```
LOGICAL_SEMI_JOIN_LOGICAL_JOIN_TRANSPOSE_PROJECT

LogicalJoin[1135] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(PS_PARTKEY#0 = P_PARTKEY#6)], otherJoinConjuncts=[] )
|--LogicalProject[1128] ( distinct=false, projects=[PS_PARTKEY#0, PS_SUPPKEY#1], excepts=[], canEliminate=true )
|  +--LogicalJoin[1120] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(L_PARTKEY#17 = PS_PARTKEY#0), (L_SUPPKEY#18 = PS_SUPPKEY#1)], otherJoinConjuncts=[(cast(PS_AVAILQTY#2 as DECIMAL(27, 9)) > (0.5 * sum(L_QUANTITY))#33)] )
|     |--GroupPlan( GroupId#2 )
|     +--GroupPlan( GroupId#7 )
+--GroupPlan( GroupId#12 )
----------------------after----------------------
LogicalJoin[1141] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(L_PARTKEY#17 = PS_PARTKEY#0), (L_SUPPKEY#18 = PS_SUPPKEY#1)], otherJoinConjuncts=[(cast(PS_AVAILQTY#2 as DECIMAL(27, 9)) > (0.5 * sum(L_QUANTITY))#33)] )
|--LogicalProject[1140] ( distinct=false, projects=[PS_PARTKEY#0, PS_SUPPKEY#1], excepts=[], canEliminate=true )
|  +--LogicalJoin[1139] ( type=LEFT_SEMI_JOIN, markJoinSlotReference=Optional.empty, hashJoinConjuncts=[(PS_PARTKEY#0 = P_PARTKEY#6)], otherJoinConjuncts=[] )
|     |--GroupPlan( GroupId#2 )
|     +--GroupPlan( GroupId#12 )
+--GroupPlan( GroupId#7 )
```
`PS_AVAILQTY#2` was lost in the project.

Now we use all slots to construct projects.
2023-03-16 11:53:32 +08:00
ebe651dae9 [Fix](Planner)Add call once logic to analyze of function aes_decrypt #17829
The problem is an exception during analyze:
java.lang.IllegalStateException: exceptions :
errCode = 2, detailMessage = select list expression not produced by aggregation output (missing from GROUP BY clause?): xxx

The scenario is:
select aes_decrypt(xxx,xxx) as c0 from table group by c0;

Analysis of the problem:
The direct cause is a mismatched SlotRef, which stems from a mismatch in the number of parameters of the aes_decrypt function. When debugging, we can see that the SlotRef of the group-by column is added to ExprSubstitutionMap but cannot be matched with the select result columns. This happens because substituting an expr triggers analyze again, so the implicit parameter is added twice. The function signature then no longer matches, the expr is not substituted with a SlotRef, and the exception is thrown.

Fix:
Add call-once logic when appending the third parameter to aes_decrypt-type functions: compare the child to be added with the last child of the function; if they are the same, do not add it again.
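A concrete reproduction of the scenario (hypothetical table and column names; `'doris'` is an illustrative key):

```sql
-- Before the fix, grouping by an alias of aes_decrypt failed with
-- "select list expression not produced by aggregation output",
-- because re-analysis appended the implicit third argument a second time.
SELECT aes_decrypt(ciphertext_col, 'doris') AS c0
FROM example_tbl
GROUP BY c0;
```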
2023-03-16 11:04:21 +08:00
1da3e7596e [fix](point query) Fix NegativeArraySizeException when prepared statement contains a long string (#17651) 2023-03-16 10:24:33 +08:00
f8ad01f55d [fix](fe) fix drop frontend removeUnReadyElectableNode incorrectly (#17680)
* When adding two non-existent FEs and then dropping them, we may hit an exception like this:
```
    java.lang.IllegalArgumentException: com.sleepycat.je.config.IntConfigParam:
             param je.rep.electableGroupSizeOverride doesn't validate, -1 is less than min of 0
    at com.sleepycat.je.config.IntConfigParam.validate(IntConfigParam.java:47)
    at com.sleepycat.je.config.IntConfigParam.validateValue(IntConfigParam.java:75)
    at com.sleepycat.je.dbi.DbConfigManager.setVal(DbConfigManager.java:648)
    at com.sleepycat.je.dbi.DbConfigManager.setIntVal(DbConfigManager.java:694)
    at com.sleepycat.je.rep.ReplicationMutableConfig.setElectableGroupSizeOverrideVoid(ReplicationMutableConfig.java:523)
    at com.sleepycat.je.rep.ReplicationMutableConfig.setElectableGroupSizeOverride(ReplicationMutableConfig.java:512)
    at org.apache.doris.ha.BDBHA.removeUnReadyElectableNode(BDBHA.java:236)
    at org.apache.doris.catalog.Env.dropFrontend(Env.java:2533)
```
2023-03-16 10:22:42 +08:00
b043b9798d [feature](bdbje) Add config param for bdbje logging level (#17064)
Add new config param bdbje_file_logging_level
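A sketch of how it might look in fe.conf (the parameter name is from this commit; the value is illustrative):

```
bdbje_file_logging_level = INFO
```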
2023-03-16 09:50:44 +08:00
a53d46e317 [Fix](array function) fix array_pushfront function with DecimalV3 #17760
Support array_pushfront function with DecimalV3

Issue Number: close #xxx
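A minimal sketch of the fixed behavior (hypothetical literals; assumes the array constructor and DecimalV3 element types):

```sql
-- Push a DecimalV3 value onto the front of a decimal array;
-- this previously failed for DecimalV3 element types.
SELECT array_pushfront(array(2.50, 3.75), 1.25);
```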
2023-03-16 09:03:52 +08:00
d4dc56c99e [fix](insert) Fragment is not cancelled when the client quits without committing or rolling back an insert transaction (#17678) 2023-03-15 21:46:40 +08:00
990dce9a47 [fix](load) fix load channel timeout too fast in routine load task (#17796)
Enlarge the load channel timeout in routine load tasks.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-03-15 21:12:02 +08:00
0ad459fea5 [fix](multi-catalog) fix forward to master throw NPE (#17791)
The ConnectionContext may be null, and actually we don't need the ConnectionContext in MasterCatalogExecutor.
2023-03-15 20:50:29 +08:00
e4a1e57d6f [feature](multi-catalog) support sap hana jdbc catalog and jdbc external table (#17780) 2023-03-15 20:37:36 +08:00
54a51933b6 [improvement](FQDN) Support existing cluster upgrade (#17659) 2023-03-15 20:13:13 +08:00
1fdc265083 [fix](Nereids) lock tables when analyze may lead to dead lock (#17776)
Nereids uses AutoCloseable to take and release table read locks.
However, if an AutoCloseable throws an exception while opening the resource, its close function will not be called.
So we close manually when an exception is thrown during the opening stage.
2023-03-15 14:00:27 +08:00
1e3da95359 [fix](Nereids): fix LAsscom split conjuncts. (#17792) 2023-03-15 13:39:08 +08:00
ceff7e851d [fix](cooldown)Check cooldown ttl and datetime when alter storage policy (#17779)
* Check cooldown ttl and datetime when alter storage policy
2023-03-15 12:19:30 +08:00
c8de04f9d7 [fix][Nereids] fix not correct condition to checkReorder in InnerJoinRightAssociate. (#17799) 2023-03-15 11:49:03 +08:00
97bf07fe26 [enhancement](Nereids) add new distributed cost model (#17556)
Add a new distributed cost model in Nereids. The new cost model models the cost of the pipeline execution engine by dividing cost into start and run costs:
* START COST: the cost from starting to emitting the first tuple
* RUN COST: the cost from emitting the first tuple to emitting all tuples

For the parent operator and child operator, we assume the timeline of them is:
  ```
  child start ---> child run --------------------> finish
             |---> parent start ---> parent run -> finish
  ```

Therefore, in the parallel model, we can get:
  ```
  start_cost(parent) = start_cost(child) + start_cost(parent)
  run_cost(parent) = max(run_cost(child), start_cost(parent) + run_cost(parent))
  ```
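As a worked example (illustrative numbers only): suppose the child has start cost 2 and run cost 10, while the parent itself contributes start cost 3 and run cost 8. Then:
  ```
  start_cost(parent) = 2 + 3 = 5
  run_cost(parent)   = max(10, 3 + 8) = 11
  ```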
2023-03-15 11:22:31 +08:00
85080ee3c3 [vectorized](function) support array_map function (#17581) 2023-03-15 10:51:29 +08:00
5ab758674e [fix](planner) nested loop join with left semi generate repeat result (#17767) 2023-03-15 09:56:44 +08:00
45fcdaabc7 [Bug](catalog) Fix fetching information_schema table timed out(#17692) (#17694)
Co-authored-by: hugoluo <hugoluo@tencent.com>
2023-03-15 09:56:24 +08:00
16a4dc0a85 [enhancement](profile) Disable profiling for the internal query (#17720) 2023-03-15 09:48:29 +08:00
9b047d2c94 Feat: Add byte size to TTypedesc in TExpr, which will be used to carry scalarType information. (#17757)
Co-authored-by: libinfeng <libinfeng@selectdb.com>
2023-03-15 08:24:32 +08:00
7872f3626a [feature](Nereids): Rewrite InPredicate to disjunction if there are fewer than 3 elements in the InPredicate (#17646)
* [feature](Nereids): Rewrite InPredicate to disjunction if there are fewer than 3 elements in the InPredicate

* fix SimplifyRange
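For instance (a hypothetical sketch with illustrative table and column names), a two-element IN list becomes a disjunction of equality predicates that later rules such as SimplifyRange can work with:

```sql
-- Before the rewrite:
SELECT * FROM t WHERE c IN (1, 2);
-- After the rewrite, the equivalent disjunction form:
SELECT * FROM t WHERE c = 1 OR c = 2;
```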
2023-03-15 08:23:56 +08:00
02220560c5 [Improvement](multi catalog)Hive splitter. Get HDFS/S3 splits by using FileSystem api (#17706)
Use the FileSystem API to get splits for files in HDFS/S3 instead of calling InputFormat.getSplits.
The splits are based on the blocks in HDFS/S3.
2023-03-15 00:25:00 +08:00
b28f31f98d [fix](meta) fix show create table result of hive table (#17677)
Make the output usable in Hive.

Current issue: the type of the partition column is wrapped in backticks, which is illegal in Hive. One problem case:

```sql
CREATE TABLE t3p_parquet(
  id int,
  name string)
PARTITIONED BY (
  dt int)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'hdfs://path/to/t3p_parquet'
TBLPROPERTIES (
  'transient_lastDdlTime'='1671700883')
```
2023-03-14 22:50:35 +08:00
e46077fbf4 print group id for physical plan node (#17742) 2023-03-14 22:35:08 +08:00
6348819c27 [fix](Nereids) remove bitmap_union_int(bigint) signature (#17356) 2023-03-14 20:42:47 +08:00
ff9e03e2bf [Feature](add bitmap udaf) add the bitmap intersection and difference set for mixed calculation of udaf (#15588)
* Add the bitmap intersection and difference set for mixed calculation of udaf

Co-authored-by: zhangbinbin05 <zhangbinbin05@baidu.com>
2023-03-14 20:40:37 +08:00
65f71d9e06 [enhance](nereids) broadcast cost calculate (#17711)
Update the broadcast join cost estimate according to the BE implementation.
There is an enhancement on the BE side: in a broadcast join, the BE builds only one hash table, not instanceNum hash tables.
2023-03-14 19:45:03 +08:00
699159698e [enhancement](planner) support update from syntax (#17639)
Support the UPDATE ... FROM syntax.

Note: enable_concurrent_update is not supported yet.

```
UPDATE <target_table>
  SET <col_name> = <value> [ , <col_name> = <value> , ... ]
  [ FROM <additional_tables> ]
  [ WHERE <condition> ]
```

for example:
t1
```
+----+----+----+-----+------------+
| id | c1 | c2 | c3  | c4         |
+----+----+----+-----+------------+
| 3  | 3  | 3  | 3.0 | 2000-01-03 |
| 2  | 2  | 2  | 2.0 | 2000-01-02 |
| 1  | 1  | 1  | 1.0 | 2000-01-01 |
+----+----+----+-----+------------+
```

t2
```
+----+----+----+------+------------+
| id | c1 | c2 | c3   | c4         |
+----+----+----+------+------------+
| 4  | 4  | 4  |  4.0 | 2000-01-04 |
| 2  | 20 | 20 | 20.0 | 2000-01-20 |
| 5  | 5  | 5  |  5.0 | 2000-01-05 |
| 1  | 10 | 10 | 10.0 | 2000-01-10 |
| 3  | 30 | 30 | 30.0 | 2000-01-30 |
+----+----+----+------+------------+
```

t3
```
+----+
| id |
+----+
| 1  |
| 5  |
| 4  |
+----+
```

perform the update:
```sql
 update t1 set t1.c1 = t2.c1, t1.c3 = t2.c3 * 100 from t2 inner join t3 on t2.id = t3.id where t1.id = t2.id;
```

the result
```
+----+----+----+--------+------------+
| id | c1 | c2 | c3     | c4         |
+----+----+----+--------+------------+
| 3  | 3  | 3  |    3.0 | 2000-01-03 |
| 2  | 2  | 2  |    2.0 | 2000-01-02 |
| 1  | 10 | 1  | 1000.0 | 2000-01-01 |
+----+----+----+--------+------------+
```
2023-03-14 19:26:30 +08:00
f1dde20315 [enhancement](nereids) Refactor statistics (#17637)
1. Support more expression types
2. Support derivation with histograms
3. Use StatisticRange to abstract the logic
4. Use Statistics rather than StatsDeriveResult
2023-03-14 13:10:55 +08:00
be3a7e69cd [refactor](Nereids): polish code SemiJoinLogicalJoinTranspose. (#17740) 2023-03-14 12:48:58 +08:00
3a97190661 [fix](Nereids) Compare plan with their output rather than string in UnrankTest (#17698)
After adding a unique ID, UnrankTest fails because each plan has a different ID in its string representation.
To avoid the effect of the unique ID, compare plans by their output rather than by their string representation.
2023-03-14 11:10:06 +08:00
5b39fa9843 [Feature](vec)(quantile_state): support quantile state in vectorized engine (#16562)
* [Feature](vectorized)(quantile_state): support vectorized quantile state functions
1. now the quantile column only supports non-nullable values
2. add up some regression test cases
3. set default enable_quantile_state_type = true
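A minimal usage sketch (assumes a QUANTILE_STATE aggregate column `qs` in a hypothetical table `example_tbl`, and the QUANTILE_UNION / QUANTILE_PERCENT functions):

```sql
-- Merge pre-aggregated quantile states and read the median.
SELECT QUANTILE_PERCENT(QUANTILE_UNION(qs), 0.5) FROM example_tbl;
```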
---------

Co-authored-by: spaces-x <weixiang06@meituan.com>
2023-03-14 10:54:04 +08:00
f3c6ee5961 [Enhance](ComputeNode) ES Scan node support to be scheduled to compute node (#16533)
Allow the ES scan node to be scheduled to compute nodes.
2023-03-14 00:13:24 +08:00
9b7596f1c6 [Feature](Dynamic schema table) step1 support schema change expression (#17494)
1. introduce a new type `VARIANT` to encapsulate dynamically generated columns, hiding the details of the types and names of newly generated columns
2. introduce a new expression `SchemaChangeExpr` for performing schema changes, with extensibility in mind
2023-03-13 15:12:42 +08:00
c302fa2564 [Feature](array-function) Support array_pushfront function (#17584) 2023-03-13 14:26:02 +08:00
ac944e2ac1 [fix](cooldown)Fix bug for storage policy in dynamic partition (#17665)
* fix bug for partition storage policy
2023-03-13 14:13:55 +08:00
be5147c32e [enhancement](feservice) catch throwable and print log for frontend service (#17708)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-13 11:27:00 +08:00
782001c75b [fix](planner) project should be done inside subquery (#17630)
```sql
WITH t0 AS(
    SELECT report.date1 AS date2 FROM(
        SELECT DATE_FORMAT(date, '%Y%m%d') AS date1 FROM cir_1756_t1
    ) report GROUP BY report.date1
),
t3 AS(
    SELECT date_format(date, '%Y%m%d') AS date3
    FROM cir_1756_t2
)
SELECT row_number() OVER(ORDER BY date2)
FROM(
    SELECT t0.date2 FROM t0 LEFT JOIN t3 ON t0.date2 = t3.date3
) tx;
```

The DATE_FORMAT(date, '%Y%m%d') was calculated in the GROUP BY node, which is wrong. This expr should be calculated inside the subquery.
2023-03-13 11:10:27 +08:00
55c42da511 [Feature](array) Support array<decimalv3> data type (#16640) 2023-03-13 10:48:13 +08:00
39b5682d59 [Pipeline](shared_scan_opt) Support shared scan opt in pipeline exec engine 2023-03-13 10:33:57 +08:00
a0a2809324 [Enhancement](multi-catalog) support hms event deserialization for HDP/CDH Hive versions. (#17660)
Some HDP/CDH Hive versions use gzip to compress the message body of HMS NotificationEvent,
so com.qihoo.finance.hms.event.MetastoreEventFactory cannot deserialize it correctly.
2023-03-13 09:47:28 +08:00
b0d1166989 [fix](meta) fix concurrent modification exception and potential NPE (#17602) 2023-03-12 22:12:07 +08:00
46dcf69644 [fix](jdbc-catalog) avoid calculate driver's md5 when replaying edit log (#17693) 2023-03-12 22:11:45 +08:00
54e5c71e52 [fix](planner) Fix NPE when update stats by profile 2023-03-12 21:40:47 +08:00