doris

Author	SHA1	Message	Date
谢健	97bf07fe26	[enhancement](Nereids) add new distributed cost model (#17556 ) Add a new distributed cost model in Nereids. The new cost model models the cost of the pipeline execution engine by dividing cost into run and start costs. They are: * START COST: the cost from starting to emitting the first tuple * RUN COST: the cost from emitting the first tuple to emitting all tuples For the parent operator and child operator, we assume the timeline of them is: ``` child start ---> child run --------------------> finish \|---> parent start ---> parent run -> finish ``` Therefore, in the parallel model, we can get: ``` start_cost(parent) = start_cost(child) + start_cost(parent) run_cost(parent) = max(run_cost(child), start_cost(parent) + run_cost(parent)) ```	2023-03-15 11:22:31 +08:00
ZhaoChangle	66f3ef568e	(functions) optimize const_column to full convert	2023-03-15 10:57:03 +08:00
zhangstar333	85080ee3c3	[vectorized](function) support array_map function (#17581 )	2023-03-15 10:51:29 +08:00
Stalary	ca0367d846	FIX: es doc (#17771 )	2023-03-15 10:40:53 +08:00
morrySnow	5ab758674e	[fix](planner) nested loop join with left semi generate repeat result (#17767 )	2023-03-15 09:56:44 +08:00
yuxuan-luo	45fcdaabc7	[Bug](catalog) Fix fetching information_schema table timed out(#17692 ) (#17694 ) Co-authored-by: hugoluo <hugoluo@tencent.com>	2023-03-15 09:56:24 +08:00
AKIRA	16a4dc0a85	[enhancement](profile) Disable profiling for the internal query (#17720 )	2023-03-15 09:48:29 +08:00
TengJianPing	64c2437be5	[fix](coalesce) support coalesce function for bitmap (#17798 )	2023-03-15 09:34:44 +08:00
LiBinfeng	9b047d2c94	Feat: Add byte size to TTypedesc in TExpr. Which will be used to carry scalarType information. (#17757 ) Co-authored-by: libinfeng <libinfeng@selectdb.com>	2023-03-15 08:24:32 +08:00
jakevin	7872f3626a	[feature](Nereids): Rewrite InPredicate to disjunction if there exist items < 3 elements in InPredicate (#17646 ) * [feature](Nereids): Rewrite InPredicate to disjunction if there exists < 3 elements in InPredicate * fix SimplifyRange	2023-03-15 08:23:56 +08:00
Jibing-Li	02220560c5	[Improvement](multi catalog)Hive splitter. Get HDFS/S3 splits by using FileSystem api (#17706 ) Use the FileSystem API to get splits for files in HDFS/S3 instead of calling InputFormat.getSplits. The splits are based on blocks in HDFS/S3.	2023-03-15 00:25:00 +08:00
Yulei-Yang	b28f31f98d	[fix](meta) fix show create table result of hive table (#17677 ) Make the result usable in Hive. Current issue: the types of partition columns are wrapped in ``, which is not legal in Hive. One problem case: CREATE TABLE t3p_parquet( id int, name string) PARTITIONED BY ( dt int) ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' LOCATION 'hdfs://path/to/t3p_parquet' TBLPROPERTIES ( 'transient_lastDdlTime'='1671700883')	2023-03-14 22:50:35 +08:00
Luzhijing	76f486980a	[docs](user)update the users number (#17749 )	2023-03-14 22:42:51 +08:00
minghong	e46077fbf4	print group id for physical plan node (#17742 )	2023-03-14 22:35:08 +08:00
lihangyu	7180cf3d9b	[Improve](row store) avoid serialize null slot into a jsonb row (#17734 ) This could save some disk space	2023-03-14 22:13:41 +08:00
morrySnow	6348819c27	[fix](Nereids) remove bitmap_union_int(bigint) signature (#17356 )	2023-03-14 20:42:47 +08:00
zhbinbin	ff9e03e2bf	[Feature](add bitmap udaf) add the bitmap intersection and difference set for mixed calculation of udaf (#15588 ) * Add the bitmap intersection and difference set for mixed calculation of udaf Co-authored-by: zhangbinbin05 <zhangbinbin05@baidu.com>	2023-03-14 20:40:37 +08:00
minghong	65f71d9e06	[enhance](nereids) broadcast cost calculate (#17711 ) Update the broadcast join cost estimate according to the BE implementation. There is an enhancement on BE: in a broadcast join, BE builds only one hash table, not instanceNum hash tables.	2023-03-14 19:45:03 +08:00
morrySnow	699159698e	[enhancement](planner) support update from syntax (#17639 ) support update from syntax note: enable_concurrent_update is not supported now ``` UPDATE <target_table> SET <col_name> = <value> [ , <col_name> = <value> , ... ] [ FROM <additional_tables> ] [ WHERE <condition> ] ``` for example: t1 ``` +----+----+----+-----+------------+ \| id \| c1 \| c2 \| c3 \| c4 \| +----+----+----+-----+------------+ \| 3 \| 3 \| 3 \| 3.0 \| 2000-01-03 \| \| 2 \| 2 \| 2 \| 2.0 \| 2000-01-02 \| \| 1 \| 1 \| 1 \| 1.0 \| 2000-01-01 \| +----+----+----+-----+------------+ ``` t2 ``` +----+----+----+------+------------+ \| id \| c1 \| c2 \| c3 \| c4 \| +----+----+----+------+------------+ \| 4 \| 4 \| 4 \| 4.0 \| 2000-01-04 \| \| 2 \| 20 \| 20 \| 20.0 \| 2000-01-20 \| \| 5 \| 5 \| 5 \| 5.0 \| 2000-01-05 \| \| 1 \| 10 \| 10 \| 10.0 \| 2000-01-10 \| \| 3 \| 30 \| 30 \| 30.0 \| 2000-01-30 \| +----+----+----+------+------------+ ``` t3 ``` +----+ \| id \| +----+ \| 1 \| \| 5 \| \| 4 \| +----+ ``` do update ```sql update t1 set t1.c1 = t2.c1, t1.c3 = t2.c3 * 100 from t2 inner join t3 on t2.id = t3.id where t1.id = t2.id; ``` the result ``` +----+----+----+--------+------------+ \| id \| c1 \| c2 \| c3 \| c4 \| +----+----+----+--------+------------+ \| 3 \| 3 \| 3 \| 3.0 \| 2000-01-03 \| \| 2 \| 2 \| 2 \| 2.0 \| 2000-01-02 \| \| 1 \| 10 \| 1 \| 1000.0 \| 2000-01-01 \| +----+----+----+--------+------------+ ```	2023-03-14 19:26:30 +08:00
Kang	f999b823fc	[feature](array) support array for apache arrow convertor (#17682 ) * support array type for arrow * fix builder.Append() for each array row * fix array child column append start offset	2023-03-14 17:53:16 +08:00
AKIRA	f1dde20315	[enhancement](nereids) Refactor statistics (#17637 ) 1. Support more expression types 2. Support deriving with histogram 3. Use StatisticRange to abstract the logic 4. Use Statistics rather than StatisDeriveResult	2023-03-14 13:10:55 +08:00
jakevin	be3a7e69cd	[refactor](Nereids): polish code SemiJoinLogicalJoinTranspose. (#17740 )	2023-03-14 12:48:58 +08:00
yiguolei	77ab2fac20	[refactor](functioncontext) remove function context impl class (#17715 ) * [refactor](functioncontext) remove function context impl class Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> --------- Co-authored-by: yiguolei <yiguolei@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-03-14 11:21:45 +08:00
谢健	3a97190661	[fix](Nereids) Compare plan with their output rather than string in UnrankTest (#17698 ) After adding a unique ID, the unRankTest fails because each plan has a different ID in the string. To avoid the effect of the unique ID, compare the plan with the output rather than the string	2023-03-14 11:10:06 +08:00
spaces-x	5b39fa9843	[Feature](vec)(quantile_state): support quantile state in vectorized engine (#16562 ) * [Feature](vectorized)(quantile_state): support vectorized quantile state functions 1. now quantile column only support not nullable 2. add up some regression test cases 3. set default enable_quantile_state_type = true --------- Co-authored-by: spaces-x <weixiang06@meituan.com>	2023-03-14 10:54:04 +08:00
ElvinWei	36a0d40ac3	Fix errors in the data-partition.md (#17756 )	2023-03-14 10:44:57 +08:00
weij	ba0f5a2355	[test](mv) Add mv case from fe ut (#17204 ) add some mv case from fe ut MaterializedViewFunctionTest	2023-03-14 10:29:43 +08:00
airborne12	2e0af4e33c	[Enhancement](inverted-index) use read buffer when read index bytes in compound reader (#17306 ) Read IO would be a problem when reading inverted index from disk. Using read buffer to reduce IO. Set use buffer flag to be true when reading internal bytes in compound reader for inverted index.	2023-03-14 10:10:59 +08:00
TengJianPing	7d91114304	[fix](join) fix wrong result of null aware left anti join (#17752 )	2023-03-14 09:35:46 +08:00
Qi Chen	c6630a06c1	[Fix](multi-catalog) Fix "test_hive_other" regression test. (#17611 )	2023-03-14 09:16:48 +08:00
caoliang-web	76458cf091	[typo](partition)Modify the list partition document #17744	2023-03-14 08:27:26 +08:00
yagagagaga	883ae8a86d	[typo](docs) Add some content for bitmap_hash.md. (#17747 )	2023-03-14 08:27:07 +08:00
huangzhaowei	f3c6ee5961	[Enhance](ComputeNode) ES Scan node support to be scheduled to compute node (#16533 ) ES Scan node support to be scheduled to compute node.	2023-03-14 00:13:24 +08:00
lihangyu	9b7596f1c6	[Feature](Dynamic schema table) step1 support schema change expression (#17494 ) 1. introduce a new type `VARIANT` to encapsulate dynamically generated columns, hiding the details of the types and names of newly generated columns 2. introduce a new expression `SchemaChangeExpr` for doing schema change for extensibility	2023-03-13 15:12:42 +08:00
gitccl	c302fa2564	[Feature](array-function) Support array_pushfront function (#17584 )	2023-03-13 14:26:02 +08:00
pengxiangyu	ac944e2ac1	[fix](cooldown)Fix bug for storage policy in dynamic partition (#17665 ) * fix bug for partition storage policy	2023-03-13 14:13:55 +08:00
yiguolei	be5147c32e	[enhancement](feservice) catch throwable and print log for frontend service (#17708 ) --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-13 11:27:00 +08:00
zhengyu	2b31fc1472	[fix](regression) segcompaction timeout too short (#16731 ) (#17565 ) Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>	2023-03-13 11:19:21 +08:00
chunping	b9fac82fb1	[fix](regression) adjust regression pipeline config(tablet_create_timeout_second) for avoiding create partition timeout (#17668 ) This pull request addresses the problem below: failing cases in the regression pipeline always hit the error "Failed to create partition. Timeout. Unfinished mark: 10003=57059", so adjust tablet_create_timeout_second to 100	2023-03-13 11:18:03 +08:00
DongLiang-0	5fccbac81b	[fix](demo)add Sync full database for versions below doris 1.2 (#17669 )	2023-03-13 11:17:29 +08:00
Pxl	16fc3a0e22	[Chore](compile) remove some unused static on inline function to reduce compile time (#17603 ) remove some unused static on inline function to reduce compile time	2023-03-13 11:11:59 +08:00
starocean999	782001c75b	[fix](planner) project should be done inside subquery (#17630 ) WITH t0 AS( SELECT report.date1 AS date2 FROM( SELECT DATE_FORMAT(date, '%Y%m%d') AS date1 FROM cir_1756_t1 ) report GROUP BY report.date1 ), t3 AS( SELECT date_format(date, '%Y%m%d') AS date3 FROM cir_1756_t2 ) SELECT row_number() OVER(ORDER BY date2) FROM( SELECT t0.date2 FROM t0 LEFT JOIN t3 ON t0.date2 = t3.date3 ) tx; The DATE_FORMAT(date, '%Y%m%d') was calculated in GROUP BY node, which is wrong. This expr should be calculated inside the subquery.	2023-03-13 11:10:27 +08:00
abmdocrt	55c42da511	[Feature](array) Support array<decimalv3> data type (#16640 )	2023-03-13 10:48:13 +08:00
camby	3a6c0e7867	[fix](regression) fix test_array_export and test_map_export dir conflict #17636 The regression tests test_array_export and test_map_export use the same output dir; if they run at the same time, the cases will fail.	2023-03-13 10:35:50 +08:00
HappenLee	39b5682d59	[Pipeline](shared_scan_opt) Support shared scan opt in pipeline exec engine	2023-03-13 10:33:57 +08:00
yuxuan-luo	edb2d90852	[fix](routine load) fix ROUTINE LOAD bug,kafka commit a lack of one(#17282 ) (#17291 ) Co-authored-by: hugoluo <hugoluo@tencent.com>	2023-03-13 10:20:59 +08:00
Xiangyu Wang	a0a2809324	[Enhancement](multi-catalog) support hms event deserialization for HDP/CDH Hive versions. (#17660 ) Some HDP/CDH Hive versions use gzip to compress the message body of hms NotificationEvent, so com.qihoo.finance.hms.event.MetastoreEventFactory cannot deserialize it correctly.	2023-03-13 09:47:28 +08:00
Jerry Hu	93a865c3e8	[improvement](join) Avoid reading from left child while hash table is empty(right join) (#17655 ) When the right (build) side is empty in a right outer join, there is no need to read data from the left child.	2023-03-13 09:03:17 +08:00
Johnny_Sc	47cfc81925	[fix docs] (#17634 ) Co-authored-by: shenshoucheng <shenshoucheng@jd.com>	2023-03-13 08:06:33 +08:00
ZhangYu0123	33059d92cc	[docs](doc) fix faq docs (#17707 )	2023-03-13 08:05:12 +08:00

9296 Commits