doris

Author	SHA1	Message	Date
Xinyi Zou	fc12362a6d	[feature-wip](arrow-flight)(step2) FE support Arrow Flight server (#24314 ) This is a POC, the design documentation will be updated soon	2023-09-20 14:42:54 +08:00
LiBinfeng	a3361df7b9	[Feat](Nereids) support json and jsonb datatype (#24156 ) Feature: support jsonb and json types in Nereids. Document: this feature supports these two data types in the Nereids optimizer as in the original planner; the SQL reference is the same as before: [JSON - Apache Doris](https://doris.apache.org/zh-CN/docs/dev/sql-manual/sql-reference/Data-Types/JSON)	2023-09-20 14:32:22 +08:00
amory	e9435c14f8	[Improve](array-func)improve array union support multi params (#24327 )	2023-09-20 14:29:48 +08:00
zclllyybb	ca56921481	[docs](partition) Auto partition docs (#24574 )	2023-09-20 14:28:23 +08:00
zclllyybb	8aea31e383	[fix](timezone) fix timezone parse when there is no tzfile (#24578 )	2023-09-20 14:28:12 +08:00
Guangdong Liu	aa9f2260ea	[fix](multi-catalog)Es catalog needs to verify whether it is a valid configuration. (#24309 )	2023-09-20 14:20:57 +08:00
Calvin Kirs	df66922bc0	[Chore](sonar)sonar (C++) configuration file name error (#24662 ) FYI https://community.sonarsource.com/t/project-root-configuration-file-none/99389	2023-09-20 13:58:30 +08:00
wangbo	26ca0b2780	Add some block counter (#24465 )	2023-09-20 13:23:01 +08:00
谢健	deafa2dd88	[fix](Nereids) fix inconsistent row count when join ordering (#24589 ) In the context of reorder join, when a new plan is generated, it may include a project operation. In this case, the newly generated join root and the original join root will no longer be in the same group. To avoid inconsistencies in the statistics between these two groups, we keep the child group's row count unchanged when the parent group expression is a project operation.	2023-09-20 13:11:35 +08:00
Gabriel	901ee7a8d3	[regression](pipelineX) disable pipelineX test cases (#24654 )	2023-09-20 13:01:08 +08:00
Gabriel	c0df8fca20	[pipelineX](fix) Fix potential concurrent problem (#24651 )	2023-09-20 13:00:58 +08:00
daidai	c704497d02	[fix](csv_reader)Fixed bug when parsing multi-character delimiters. (#24572 )	2023-09-20 12:41:35 +08:00
Liqf	075552ead4	[feature](partitions)support batch delete partition (#23986 ) ALTER TABLE example_db.my_table DROP PARTITION p1, DROP PARTITION p2, DROP PARTITION p3;	2023-09-20 11:45:52 +08:00
Siyang Tang	0fb79e4011	[fix](broker-load) fix file offset for compressed file #24564 Co-authored-by: Kang <kxiao.tiger@gmail.com>	2023-09-20 11:41:52 +08:00
Yongqiang YANG	a2e29d171a	[enhancement](be-meta) sync rocksdb by default to protect data (#24571 ) Users whose disks are slow can set the config to false; that way they knowingly accept what would happen on a kernel panic.	2023-09-20 11:41:26 +08:00
xu tao	b7ca4fcc8d	[fix](io): use try-with-resources to close io streams automatically and avoid resource leaks (#24605 )	2023-09-20 11:39:03 +08:00
bobhan1	848290d8a8	[Fix](nereids) Support partial update for insert into table (#24594 )	2023-09-20 11:35:09 +08:00
morrySnow	b02398ba85	[fix](planner) statement runs successfully but logs error msg in audit log (#24628 ) The legacy planner sets an error msg when it throws AnalysisException. However, in some places we catch these exceptions and mute them, so we should reset the error msg and error code.	2023-09-20 11:32:47 +08:00
HHoflittlefish777	5a0ccd702c	[typo](docs) fix error in routine load doc (#24623 )	2023-09-20 11:13:14 +08:00
Adonis Ling	8316aad417	[chore](macOS) Fix linkage errors (#24642 ) Issue Number: close #24643	2023-09-20 10:50:10 +08:00
minghong	9a4a4c0760	[opt](Nereids)skip unknown col stats check on __internal_schema and information_schema (#24625 ) columns in __internal_schema and information_schema do not have column stats	2023-09-20 10:48:05 +08:00
Mingyu Chen	c41cadb64d	[fix](broker) fix broker read issue (#24635 ) The given "length" of broker's pread() method is the buffer length, not the length required from the file, so it may be larger than the file length. We should therefore return all data read, instead of returning EOF when the `read()` method returns -1. I will add a regression test case later when the framework supports the broker process.	2023-09-20 10:43:16 +08:00
yiguolei	c3b3f0f00a	[enhancement](serialize) add dcheck to ensure pb type is set (#24645 ) Check that the pb's type is set; otherwise deserialization will core dump. Do not return an unknown type, because deserialization will core dump. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-09-20 10:42:28 +08:00
Jerry Hu	49f6eda843	[fix](nested_join) incorrect result of semi/anti mark join (#24616 )	2023-09-20 10:41:06 +08:00
Liqf	14bd290aec	[feature](jsonb)support json_length and json_contains function (#24332 )	2023-09-20 10:40:44 +08:00
Liqf	e59aa49f28	[feature](datetime-func)support milliseconds_add/sub/diff and microseconds_diff (#24114 )	2023-09-20 10:38:56 +08:00
zhangstar333	a71d7f2beb	[pipelineX](operator) support partition sort operator and distinct streaming agg operator (#24544 )	2023-09-20 09:50:51 +08:00
morrySnow	7e17e0d3f7	[fix](Nereids) select outfile column order is wrong (#24595 )	2023-09-20 09:27:40 +08:00
zy-kkk	527b284e90	[improvement](jdbc catalog) Extend conjunctExprToString to Support both 'AND' and 'OR' with Optimized DateLiteral Handling (#24537 )	2023-09-19 23:11:44 +08:00
Calvin Kirs	4f215a7dc3	[Improve](Fe)Ensure that only one FE process uses the metadata file (#24442 )	2023-09-19 23:11:20 +08:00
Calvin Kirs	1a553f7e14	[Improve](start-shell)Optimize fe&be startup (#24556 ) - sh start_fe/start_be --console is used to instruct the program to run in console mode. - sh start_fe/start_be --daemon is used to instruct the program to run in daemon mode. - sh start_fe/start_be starts the program as a background process and records output and error logs to the specified file	2023-09-19 23:00:59 +08:00
Calvin Kirs	420914abfc	[Fix](RoutineLoad)multi-table query table error (#24538 ) multi-table will take all tables and then convert them into OlapTable, thus causing View type conversion errors.	2023-09-19 22:57:13 +08:00
Ashin Gau	19ccb9517f	[fix](iceberg) should call UserGroupInformation when security authentication is enabled (#24614 ) Fix two bugs: 1. Call `UserGroupInformation.doAs` when security authentication is enabled 2. `catalogId` is 0 when `IcebergExternalCatalog` is loaded from fe image	2023-09-19 22:39:58 +08:00
Mingyu Chen	32c6f5f905	[opt](test) set longer timeout for hive query cache test case (#24569 ) Sometimes the first run of a query may take longer than the given threshold, which causes the test to fail. Also add a new session variable test_query_cache_hit so that we can use it to test whether the cache is hit in regression tests	2023-09-19 22:25:18 +08:00
Yongqiang YANG	71dcb58db9	[improvement](scanner_schedule) reduce memory consumption of scanner (#24199 ) 1. limit scanner by memory consumption rather than blocks. 2. scheduler run correctly instead of at least 1.	2023-09-19 21:36:23 +08:00
Dongyang Li	8afdfd58e2	[fix](case) ensure jar downloaded (#24475 )	2023-09-19 21:26:12 +08:00
airborne12	8c502f65f2	[Fix](metrics) fix wrong timer metrics for _seek_columns (#24622 )	2023-09-19 20:59:09 +08:00
谢健	c3bd2a22d4	[feature](Nereids) add many array functions (#24301 ) Add function array_filter, array_sortby, array_last_index, array_first_index, array_orderby, array_count	2023-09-19 18:58:49 +08:00
Liqf	c9f5142420	[Improve](UNIX_TIMESTAMP) UNIX_TIMESTAMP func support 'yyyy-MM-dd HH:mm:ss' format (#24561 ) The format parameter of the UNIX_TIMESTAMP function now supports 'yyyy-MM-dd HH:mm:ss'. The implementation is the same as the date_format function. Before: ```sql mysql> select UNIX_TIMESTAMP('2023-09-18 00:00:00','yyyy-MM-dd HH:mm:ss'); +--------------------------------------------------------------+ | unix_timestamp('2023-09-18 00:00:00', 'yyyy-MM-dd HH:mm:ss') | +--------------------------------------------------------------+ | NULL | +--------------------------------------------------------------+ 1 row in set (0.04 sec) ``` Now: ```sql mysql> select UNIX_TIMESTAMP('2023-09-18 00:00:00','yyyy-MM-dd HH:mm:ss'); +------------+ | 1694966400 | +------------+ | 1694966400 | +------------+ 1 row in set (0.01 sec) ```	2023-09-19 18:41:59 +08:00
minghong	037ff2d5a6	[fix](nereids) bug: runtimefilter should not be pushed through window and topN (#24439 ) A runtime filter should not be pushed down through topN. A runtime filter should not be pushed down through a window if the target slot is not a partition key of all windowExpressions.	2023-09-19 18:18:06 +08:00
Gabriel	e54c4ef258	[pipelineX](dependency) refactor write dependency (#24555 )	2023-09-19 18:01:42 +08:00
Yongqiang YANG	3cac6806b4	[fix](txn) persist txn record of single replica load and ccr ingestion (#24543 ) Otherwise txn would be dropped when a be reboots.	2023-09-19 15:10:38 +08:00
Pxl	5e4ab7cd25	[Bug](materialized-view) add limit for drop column on mv (#24493 )	2023-09-19 14:32:14 +08:00
Mryange	ee56783629	[fix](Java UDF) Do not use enum as the data type for JavaUdfDataType. (#24460 )	2023-09-19 14:06:02 +08:00
jakevin	eea84ac36c	[fix](Nereids): use == instead of id to identify PhysicalHashJoin (#24535 )	2023-09-19 12:06:30 +08:00
Siyang Tang	b092bdaabf	[feature](load) collect loaded rows on table level after txn published (#24346 ) As title. Stream load 20 lines ``` 2023-09-14 11:40:04,186 DEBUG (PUBLISH_VERSION|23) [DatabaseTransactionMgr.updateCatalogAfterVisible():1769] table id to loaded rows:{51016=20} ``` ``` mysql> select count() from dup_tbl_basic; +----------+ | count() | +----------+ | 20 | +----------+ 1 row in set (0.05 sec) ```	2023-09-19 12:00:08 +08:00
Jibing-Li	80bcb43143	[Feature]Support external table sample stats collection (#24376 ) Support hive table sample stats collection. The syntax is like `analyze table with sample percent 10`	2023-09-19 11:20:27 +08:00
HappenLee	6a33e4639a	[schedule](pipeline) Remove wait schedule time in pipeline query engine and change current queue to std::mutex (#24525 ) This reverts commit 591aeaa98d1178e2e277278c7afeafef9bdb88d6.	2023-09-18 23:57:56 +08:00
Yongqiang YANG	1ac7c8f14d	[improvement](scan_queue_mem_limit) scan queue mem limit is so small for (#24553 ) a wide table Users rarely set scan_queue_mem_limit, so it almost always works as 2G/20. However, in some cases we need to set it to a larger value, especially for insert into select from a wide table.	2023-09-18 20:22:03 +08:00
minghong	c54fc82031	[improve](nereids) expand runtime filter target by hashJoin's equal condition (#23274 ) Generate more runtime filters. Example: lineitem join partsupp on l_partkey= ps_partkey join filter(part) on ps_partkey=p_partkey. We need two RFs: RF1: p_partkey->ps_partkey, RF2: p_partkey->l_partkey. This PR will generate RF2, but the current version will not. Merge runtime filters: in the current version, if one src could affect 2 targets, we generate 2 runtime filters; after this PR, the two RFs will be merged. Refer to regression test: ds_rf2/ds_rf5/ds_rf54	2023-09-18 18:27:01 +08:00
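
The runtime filter expansion described in the last entry (#23274) can be illustrated with a small sketch. This is not the Nereids implementation; it only shows the idea of following hash-join equal conditions transitively so that one source column reaches every column it is equal to. The function name `expand_rf_targets` and the tuple-based representation of join conditions are assumptions made for illustration.

```python
from collections import defaultdict


def expand_rf_targets(equal_conditions, source):
    """Return all columns transitively equal to `source` (hypothetical helper).

    `equal_conditions` is a list of (left_column, right_column) pairs taken
    from hash-join equality predicates.
    """
    # Build an undirected graph: an edge means "these two columns are equal".
    graph = defaultdict(set)
    for left, right in equal_conditions:
        graph[left].add(right)
        graph[right].add(left)

    # Depth-first walk from the source column to collect every reachable column.
    seen = {source}
    stack = [source]
    while stack:
        column = stack.pop()
        for neighbor in graph[column]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)

    # The source itself is not a target of its own runtime filter.
    seen.discard(source)
    return sorted(seen)


# Join conditions from the commit message example.
conditions = [("l_partkey", "ps_partkey"), ("ps_partkey", "p_partkey")]
targets = expand_rf_targets(conditions, "p_partkey")
print(targets)
```

With the conditions from the commit message, the source `p_partkey` reaches both `ps_partkey` (RF1) and `l_partkey` (RF2). Returning the targets as one list per source also mirrors the merging the entry mentions, where a single source feeding two targets becomes one filter rather than two.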

13721 Commits