For outer join / right outer join / right semi join, when HashJoinNode::pull calls process_data_in_hashtable to output a block, it writes all rows of a key in the hash table into the block, and only after a key's output is complete does it check whether the block size exceeds the batch size and, if so, stop the output.
If a key has more than 20 million rows, the subsequent block operations on those rows will cause a memory overflow.
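A hedged sketch of the batching idea (hypothetical names, not the actual HashJoinNode code): check the output block size while emitting a key's rows rather than only after the whole key has been emitted, so one huge key is split across many blocks.
```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: emit rows of one key, stopping as soon as the block is full
// and remembering where to resume, so a 20M-row key never lands in a single block.
struct EmitState {
    size_t next_row = 0;  // resume position inside the current key's row list
};

// Returns true when the key is fully emitted, false when the block filled up first.
bool emit_rows_for_key(const std::vector<uint32_t>& key_rows, size_t batch_size,
                       std::vector<uint32_t>* output_block, EmitState* state) {
    while (state->next_row < key_rows.size()) {
        output_block->push_back(key_rows[state->next_row++]);
        if (output_block->size() >= batch_size) {
            return false;  // caller hands the block downstream and calls again later
        }
    }
    state->next_row = 0;  // ready for the next key
    return true;
}
```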
Tablet::version_for_delete_predicate has to traverse all rowset metas in the tablet meta, which is O(N); instead, we can directly judge whether a rowset is a delete rowset via RowsetMeta::has_delete_predicate, which is O(1).
Since we no longer call Tablet::version_for_delete_predicate when picking input rowsets for compaction, we also shrink the critical section guarded by Tablet::_meta_lock.
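A minimal sketch of the complexity difference, using simplified stand-ins rather than the real Doris structures:
```cpp
#include <cstdint>
#include <vector>

// Simplified stand-in for RowsetMeta.
struct RowsetMetaLike {
    int64_t start_version;
    int64_t end_version;
    bool delete_predicate = false;
    bool has_delete_predicate() const { return delete_predicate; }  // O(1)
};

// O(N): a version_for_delete_predicate-style lookup scans every rowset meta in the tablet meta.
bool is_delete_version_slow(const std::vector<RowsetMetaLike>& all_rs_metas, int64_t version) {
    for (const auto& rs : all_rs_metas) {
        if (rs.start_version == version && rs.end_version == version && rs.has_delete_predicate()) {
            return true;
        }
    }
    return false;
}

// O(1): when picking compaction inputs we already hold the rowset meta, so ask it directly.
bool is_delete_rowset_fast(const RowsetMetaLike& rs) {
    return rs.has_delete_predicate();
}
```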
Although many other systems do not support this syntax, since the ORDER BY clause here is almost redundant and useless, we have to keep consistent with the legacy Doris syntax.
Here is an example:
```sql
SELECT * FROM (SELECT k1, k3 FROM tbl1 ORDER BY k3 UNION ALL SELECT k1, k5 FROM tbl2) t;
```
When we need to scan more than one OLAP table partition, and the table is not a colocate table or its colocate group is unstable, we need to treat it as ANY distribution even if its distribution type is HASH.
Bug fix
Fix image loading failure when creating a catalog with a resource
When creating a JDBC catalog with a resource, the metadata image fails to load:
while loading the JDBC catalog image, Doris tries to get the resource from ResourceMgr,
but ResourceMgr has not been loaded yet, so an NPE is thrown.
This PR fixes the bug and refactors some logic about catalog and resource.
When loading the JDBC catalog image, it no longer gets the resource from ResourceMgr.
Users can now create a catalog with both a resource and properties, like:
```sql
create catalog jdbc_catalog with resource jdbc_resource
properties("user" = "user1");
```
The properties in the "properties" clause will overwrite the properties in "jdbc_resource".
Force adding tinyInt1isBit=false to the JDBC URL
The default value of tinyInt1isBit is true, which causes MySQL tinyint columns to be treated as the bit type.
Forcing tinyInt1isBit=false in the JDBC URL makes MySQL tinyint map to tinyint in Doris.
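For illustration only (the actual FE code is Java), a sketch of one way to append the parameter to a JDBC URL when it is not already specified:
```cpp
#include <string>

// Append tinyInt1isBit=false to a JDBC URL if the caller did not set it.
// e.g. "jdbc:mysql://host:3306/db" -> "jdbc:mysql://host:3306/db?tinyInt1isBit=false"
std::string force_tiny_int1_is_bit_false(std::string jdbc_url) {
    if (jdbc_url.find("tinyInt1isBit=") == std::string::npos) {
        jdbc_url += (jdbc_url.find('?') == std::string::npos) ? "?" : "&";
        jdbc_url += "tinyInt1isBit=false";
    }
    return jdbc_url;
}
```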
Avoid calculating the checksum of the JDBC driver jar multiple times
Refactor
Refactor the notification logic when updating properties in a resource.
Previously, updating properties in a resource notified the corresponding catalog to update its own properties.
This PR changes that logic: after updating properties in a resource, it only uninitializes the catalog's internal
objects such as the "jdbc client" or "hms client", and these objects are re-initialized lazily.
All properties are now read from the Resource at runtime, so the latest properties are always used.
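The lazy re-initialization described above, sketched in C++ with hypothetical names (the actual catalog code lives in the Java FE and its APIs differ):
```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical client built from whatever properties the Resource currently holds.
struct JdbcClientLike {
    std::map<std::string, std::string> props;
};

class CatalogLike {
public:
    // Called when the underlying resource's properties change: only drop the cached client.
    void uninitialize() {
        std::lock_guard<std::mutex> guard(_lock);
        _client.reset();
    }

    // The client is rebuilt lazily, reading the latest resource properties at use time.
    std::shared_ptr<JdbcClientLike> get_client() {
        std::lock_guard<std::mutex> guard(_lock);
        if (!_client) {
            _client = std::make_shared<JdbcClientLike>(JdbcClientLike{load_resource_properties()});
        }
        return _client;
    }

private:
    // Stand-in for "get all properties from the Resource at runtime".
    static std::map<std::string, std::string> load_resource_properties() { return {}; }

    std::mutex _lock;
    std::shared_ptr<JdbcClientLike> _client;
};
```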
Regression test cases
Because we add tinyInt1isBit=false to the JDBC URL, some of the test cases need to be changed.
Currently in ScannerContext::push_back_scanner_and_reschedule, _num_running_scanners-- happens before _num_scheduling_ctx++.
In PipScannerContext::can_finish, we check _num_running_scanners == 0 && _num_scheduling_ctx == 0 without holding _transfer_lock.
In the following interleaving, PipScannerContext::can_finish returns the wrong result:
1. _num_running_scanners--
2. can_finish checks _num_running_scanners == 0 && _num_scheduling_ctx == 0 and returns true.
3. _num_scheduling_ctx++
So we can move _num_running_scanners-- to the end of this function.
Make PipScannerContext::get_block_from_queue non-blocking.
Move _num_running_scanners-- to the end of ScannerContext::push_back_scanner_and_reschedule, as sketched below.
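A hedged sketch of the reordering (simplified counters only; the real ScannerContext also manages queues and _transfer_lock):
```cpp
#include <atomic>

std::atomic<int> _num_running_scanners{0};
std::atomic<int> _num_scheduling_ctx{0};

bool can_finish() {
    // Checked without holding _transfer_lock in PipScannerContext.
    return _num_running_scanners == 0 && _num_scheduling_ctx == 0;
}

// Before: the scanner left the "running" count before entering the "scheduling" count,
// so can_finish() could observe 0 && 0 in between and wrongly return true.
void push_back_scanner_and_reschedule_buggy() {
    _num_running_scanners--;   // (1)
    /* can_finish() may run here */
    _num_scheduling_ctx++;     // (3)
}

// After: decrement _num_running_scanners at the end, so the scanner is always
// counted in at least one of the two counters.
void push_back_scanner_and_reschedule_fixed() {
    _num_scheduling_ctx++;
    // ... push the scanner back and reschedule ...
    _num_running_scanners--;   // moved to the end of the function
}
```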
1. Add the missing rowKey property to the antd Table component to fit the React specification.
2. Fix a possible memory leak on the system / query profile / session / configuration pages when switching between these pages quickly.
3. Other grammar fixes to conform to TypeScript and React conventions.
Co-authored-by: tongyang.hty <hantongyang@douyu.tv>
* [refactor] Delete non-vectorized load from memtable
Delete the non-vectorized load path from memtable entirely.
Remove the keys_type() function from memtable.
Co-authored-by: zhoubintao <1229701101@qq.com>
In old versions, NODE_PRIV was incorrectly assigned to normal users.
So when upgrading to 1.2.x, this unexpected case fails to be handled.
This PR fixes it by removing NODE_PRIV from normal users.
Fix a DCHECK error for vertical compaction on Merge-On-Write tables.
When merging rowsets with empty segments, VerticalHeapMergeIterator::init
returns OK directly without setting _record_rowids, so the DCHECK fails when
_unique_key_next_block calls current_block_row_locations.
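One possible shape of the fix, as a hedged C++ sketch (member and option names are assumptions, not the exact Doris iterator code):
```cpp
#include <vector>

// Hypothetical, simplified stand-ins; not the real Doris types.
struct ReadOptionsLike {
    bool record_rowids = false;
};

struct VerticalMergeIteratorLike {
    bool _record_rowids = false;
    std::vector<int> _origin_iters;  // stand-in for the input sub-iterators

    // Record the rowid flag *before* the early return for empty input, so that a later
    // current_block_row_locations() call no longer trips its DCHECK on empty segments.
    int init(const ReadOptionsLike& opts) {  // returns 0 for OK
        _record_rowids = opts.record_rowids;
        if (_origin_iters.empty()) {
            return 0;  // previously returned here before _record_rowids was set
        }
        // ... build the merge heap over _origin_iters ...
        return 0;
    }
};
```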
When creating an external catalog, Doris automatically syncs the table schemas from the external catalog.
But some column types, such as struct and map, are not yet supported by Doris.
Previously, when meeting these unsupported columns, Doris threw an exception and the corresponding
table could not be synced, even though the user may just want to query the other, supported columns.
This PR adds a new column type, UNSUPPORTED, currently used only for external table schema sync.
When an unsupported column is met, it is synced as a column of UNSUPPORTED type.
When querying such a table, there are several situations:
- select * from table: throws the error Unsupported type 'UNSUPPORTED_TYPE' xxx
- select k1 from table: k1 has a supported type, so the query succeeds.
- select * except(k2): k2 has an unsupported type; excluding it lets the query succeed.
A delete file may belong to multiple data files, and each data file reads the full set of delete files,
so a delete file may be read repeatedly. The delete files can be cached so that multiple data files
reuse the content from the first read.
Performance improves by 60% in the single-threaded case and by 30% in the multi-threaded case.
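A hedged sketch of the caching idea, with hypothetical names: the cache is keyed by delete-file path, so every data file that references the same delete file reuses the content parsed on the first read.
```cpp
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical cache shared by all data-file readers in one scan.
class DeleteFileCache {
public:
    using Content = std::vector<std::string>;  // stand-in for the parsed delete rows

    std::shared_ptr<const Content> get_or_load(const std::string& path) {
        {
            std::lock_guard<std::mutex> guard(_lock);
            auto it = _cache.find(path);
            if (it != _cache.end()) {
                return it->second;  // reuse the content from the first read
            }
        }
        // Read outside the lock; if another thread raced us, emplace keeps the existing entry.
        auto content = std::make_shared<const Content>(read_delete_file(path));
        std::lock_guard<std::mutex> guard(_lock);
        return _cache.emplace(path, std::move(content)).first->second;
    }

private:
    static Content read_delete_file(const std::string& /*path*/) {
        // Stand-in for actually reading and parsing the delete file from storage.
        return {};
    }

    std::mutex _lock;
    std::map<std::string, std::shared_ptr<const Content>> _cache;
};
```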
Histogram statistics are more expensive to collect, so we collect and persist them separately.
This PR does the following work:
1. Add histogram syntax and the keyword `TABLE`
2. Add the task of collecting histogram statistics
3. Persist histogram statistics
4. Replace fastjson with gson
5. Add unit tests...
Relevant syntax examples:
> Following some databases such as MySQL, we add the keyword `TABLE`.
```SQL
-- collect column statistics
ANALYZE TABLE statistics_test;
-- collect histogram statistics
ANALYZE TABLE statistics_test UPDATE HISTOGRAM ON col1,col2;
```
Based on #15317
This PR mainly optimizes the histogram (👉🏻https://github.com/apache/doris/pull/14910) aggregation function, including the following:
1. Support input parameters `sample_rate` and `max_bucket_num`
2. Add UT and regression test
3. Add documentation
4. Optimize function implementation logic
Parameter description:
- `sample_rate`: Optional. The proportion of sample data used to generate the histogram. The default is 0.2.
- `max_bucket_num`: Optional. Limits the number of histogram buckets. The default value is 128.
---
Example:
```
MySQL [test]> SELECT histogram(c_float) FROM histogram_test;
+-------------------------------------------------------------------------------------------------------------------------------------+
| histogram(`c_float`) |
+-------------------------------------------------------------------------------------------------------------------------------------+
| {"sample_rate":0.2,"max_bucket_num":128,"bucket_num":3,"buckets":[{"lower":"0.1","upper":"0.1","count":1,"pre_sum":0,"ndv":1},...]} |
+-------------------------------------------------------------------------------------------------------------------------------------+
MySQL [test]> SELECT histogram(c_string, 0.5, 2) FROM histogram_test;
+-------------------------------------------------------------------------------------------------------------------------------------+
| histogram(`c_string`) |
+-------------------------------------------------------------------------------------------------------------------------------------+
| {"sample_rate":0.5,"max_bucket_num":2,"bucket_num":2,"buckets":[{"lower":"str1","upper":"str7","count":4,"pre_sum":0,"ndv":3},...]} |
+-------------------------------------------------------------------------------------------------------------------------------------+
```
Query result description:
```
{
"sample_rate": 0.2,
"max_bucket_num": 128,
"bucket_num": 3,
"buckets": [
{
"lower": "0.1",
"upper": "0.2",
"count": 2,
"pre_sum": 0,
"ndv": 2
},
{
"lower": "0.8",
"upper": "0.9",
"count": 2,
"pre_sum": 2,
"ndv": 2
},
{
"lower": "1.0",
"upper": "1.0",
"count": 2,
"pre_sum": 4,
"ndv": 1
}
]
}
```
Field description:
- sample_rate: Sampling rate.
- max_bucket_num: The maximum number of buckets allowed.
- bucket_num: The actual number of buckets.
- buckets: All buckets.
- lower: Lower bound of the bucket.
- upper: Upper bound of the bucket.
- count: The number of elements contained in the bucket.
- pre_sum: The total number of elements in all preceding buckets.
- ndv: The number of distinct values in the bucket.
> Total number of histogram elements = the last bucket's count + its pre_sum (the total number of elements in all preceding buckets).
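> For the sample result above, that is 2 (the last bucket's count) + 4 (its pre_sum) = 6 elements in total.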
In the case below, the expression `date > 20200101` should implicitly cast both sides to datetime instead of bigint.
```sql
CREATE TABLE `part_by_date`
(
`date` date NOT NULL COMMENT '',
`id` int(11) NOT NULL COMMENT ''
) ENGINE=OLAP
UNIQUE KEY(`date`, `id`)
PARTITION BY RANGE(`date`)
(PARTITION p201912 VALUES [('0000-01-01'), ('2020-01-01')),
PARTITION p202001 VALUES [('2020-01-01'), ('2020-02-01')))
DISTRIBUTED BY HASH(`id`) BUCKETS 3
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
INSERT INTO part_by_date VALUES('0001-02-01', 1),('2020-01-15', 2);
SELECT
id
FROM
part_by_date
WHERE date > 20200101;
```