doris

Author	SHA1	Message	Date
zzzzzzzs	ffd50b6aeb	[improvement](broker) TOperationStatus determines that a null pointer is redundant. (#18712 ) If tOperationStatus is a null pointer, then tOperationStatus.getMessage() will throw a null pointer exception.	2023-05-04 10:03:09 +08:00
DuRipeng	52d25f41a4	[feature](multi-catalog) Rename multi-catalog config 'specified_database_list' to 'include_database_list', and introduce new multi-catalog config 'exclude_database_list' (#18834 ) In our scenario, we need to exclude certain databases from synchronization to Doris, such as databases that store temporary tables. Since #17803 introduced `specified_database_list` to specify the databases to include, this PR introduces a new config `exclude_database_list` to specify the databases to exclude, and renames `specified_database_list` to `include_database_list` for naming symmetry. When `include_database_list` and `exclude_database_list` specify overlapping databases, `exclude_database_list` takes precedence over `include_database_list`.	2023-05-04 09:30:02 +08:00
minghong	7652d8649b	[regression](nereids) check tpc-h 1G/500G/1T plan if backend_num == 1 #18848 The cases in nereids_tpch_shape_sf1_p0, nereids_tpch_shape_sf500_p0 and nereids_tpch_shape_sf1000_p0 apply only to a single-BE environment.	2023-05-04 08:55:06 +08:00
Yongqiang YANG	c98829c94b	[improvement](load) log time consumed by waiting flush (#19226 )	2023-05-03 17:48:13 +08:00
zhangdong	72d937ad52	[fix](auth)fix es catalog show table (#19202 )	2023-05-02 20:22:07 +08:00
Mingyu Chen	9d18be9dd3	[doc](thrift) update doc for thrift 0.16 (#19217 ) * 1 update doc for thrift 0.16	2023-05-02 16:00:10 +08:00
TsukiokaKogane	145b94531f	[Fix](load) fix request_slave_tablet_pull_rowset get wrong url in case of ipv6 address (#19026 )	2023-05-02 09:55:09 +08:00
hechao	224bca3794	[docker](hudi) add hudi docker compose (#19048 )	2023-05-02 09:54:52 +08:00
AlexYue	b0c215e694	[enhance](be)add more profile in prefetched buffered reader (#19119 )	2023-05-02 09:53:39 +08:00
Xiangyu Wang	05beb8538e	[Fix](multi-catalog) fix FE abnormal exit when replay OP_REFRESH_EXTERNAL_TABLE (#19120 ) When slave FE nodes replay the OP_REFRESH_EXTERNAL_TABLE log, they invoke `org.apache.doris.datasource.hive.HiveMetaStoreCache#invalidateTableCache`, which, for a non-partitioned table, invokes `catalog.getClient().getTable`. If a network problem occurs or the table does not exist, an exception is thrown and FE exits immediately. The solution is to use a dummy key, containing only the db name and table name, as the file cache key. Then, when slave FE nodes replay the OP_REFRESH_EXTERNAL_TABLE log, they do not rely on the HMS client and no exception occurs.	2023-05-02 09:53:20 +08:00
abmdocrt	43803940f5	[community](collaborator) add more collaborators (#19229 ) Add @TangSiyang2001 as collaborator, and he helped a lot in good first issue.	2023-05-01 23:34:06 +08:00
zhangstar333	eac61dc410	[vectorized](function) add some check about result type in array map (#19228 )	2023-05-01 16:28:11 +08:00
Yongqiang YANG	a978be32a6	[fix](schema_change) remove shadow prefix of schema for tablesink (#18822 ) LSC updates a tablet's schema during writes. BE optimizes adding columns via linked schema change and identifies added columns by comparing column names, e.g. if a new column's name is not found in the old schema, it is treated as a newly added column. When a table is undergoing a schema change, the __doris_shadow_ prefix is added to the names of columns in the shadow index. Writes during the schema change therefore bring a schema containing __doris_shadow_ to BE. If the schema change request arrives at BE after such writes, BE handles it as an add-column schema change because __doris_shadow_ is not in the base tablet.	2023-04-30 22:46:36 +08:00
nanfeng	da4de37dec	[feature-wip](mv lifecycle) separate life cycle of base table and its materialized views (#19210 ) support related syntax and add regression test case --------- Co-authored-by: yzy <yzy@nanfeng_yzy@163.com>	2023-04-30 17:42:02 +08:00
yiguolei	8eab20d3df	[bugfix](low cardinality) wrong cached code results in wrong query results when there are many null pages (#19221 ) Sometimes the dict is not initialized when a comparison predicate runs. For example, if a full page is null, the reader skips reading it, so the dictionary is not initialized. The cached code is wrong in this case, because following pages may not be null and the dict will have items later. As a result, queries on a dict string column return wrong results if the column has many null values. Also adds regression tests for equal, greater-than, and less-than queries on dict columns. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-29 21:28:41 +08:00
zclllyybb	d383f1f3d7	[optimization](simd) optimize count_zero_num for ColumnNullable #19124	2023-04-29 14:50:39 +08:00
wangbo	f2b15c03ca	[fix]disable enable_resource_group for regression test (#19206 ) When a regression test sets enable_resource_group = true, the setting is shared by other test cases and may cause regression tests to fail. So we should not set it to true until it has been fully tested.	2023-04-29 14:47:50 +08:00
Mingyu Chen	8c6ccc092a	[fix](test) fix 2 unstable test (#19220 )	2023-04-29 14:42:47 +08:00
Mingyu Chen	fc3728c6ab	[fix](dynamic-partition) create HOUR unit partition with DATEV2 throw exception (#19213 ) Need to forbid create HOUR unit partition with partition column type DATEV2 ``` Unexpected exception: String index out of range: 10 ```	2023-04-29 08:23:06 +08:00
Tiewei Fang	c74c2a4f8e	[fix](Metadata tvf) Metadata TVF supports read the specified columns from Fe (#19110 )	2023-04-29 00:06:08 +08:00
slothever	d006143330	[fix](multi-catalog) when endpoint has no region, need a suggestion (#19203 ) solve the problem ``` mysql> CREATE CATALOG iceberg PROPERTIES ( 'type'='iceberg', 'iceberg.catalog.type'='rest', 'uri' = 'http://0.0.0.0:8888, "AWS_ACCESS_KEY" = "admin", "AWS_SECRET_KEY" = "password", "AWS_REGION" = "us-east-1", "AWS_ENDPOINT" = "http://minio:9000" ); show databases; ERROR 1105 (HY000): IllegalArgumentException, msg: java.lang.IllegalArgumentException: The value of property fs.s3a.endpoint.region must not be null ```	2023-04-29 00:05:41 +08:00
HappenLee	4a10d146bf	[pipeline](exec) fix regression prepare failed cause query core dump (#19208 )	2023-04-28 20:46:39 +08:00
yongjinhou	bee3aa3007	be conf action supports specify item (#19159 )	2023-04-28 19:12:51 +08:00
Xinyi Zou	a324ee794c	[fix](memory) Fix Aggregation null key memory leak due to incorrect aggfunc destroy #19201	2023-04-28 18:41:41 +08:00
liujinhui	b87d21d836	[doc](spark-load)add spark load ha EN docs (#19194 ) * 15000-doc-spark-ha english doc * Update spark-load-manual.md format --------- Co-authored-by: liujh <liujh@t3go.cn> Co-authored-by: Luzhijing <82810928+luzhijing@users.noreply.github.com>	2023-04-28 18:18:42 +08:00
zgxme	fd3c132d91	[enhancement](test) split large data of p2 cases (#19186 )	2023-04-28 18:18:25 +08:00
Xinyi Zou	1379d7f3e0	[fix](memory) mmap threshold can be modified in conf, Increase to 128M	2023-04-28 18:17:22 +08:00
Zhengguo Yang	43e70ab252	[chore](recover) add a config to recover remaining data in emergency (#18986 )	2023-04-28 17:42:00 +08:00
zhangdong	365ac54102	[doc](fqdn)fqdn doc cn (#19179 ) * fqdn doc cn * Update fqdn.md format --------- Co-authored-by: Luzhijing <82810928+luzhijing@users.noreply.github.com>	2023-04-28 17:26:49 +08:00
ZhangYu0123	6626f26506	[optimize](string) optimize char_length function by SIMD (#18925 ) Optimize char_length function by SIMD (1) optimize utf8_len compute (2) 840% up	2023-04-28 17:22:35 +08:00
yixiutt	aef9355cd3	[feature-wip](partial update) PART1: support basic partial write (#17542 )	2023-04-28 17:17:57 +08:00
ElvinWei	718297d3c1	[test](statistics) add p0 test of sampling statistics (#19176 ) 1. Added test p0 for sampling collection statistics 2. Modify the uniqueKeys of table analysis_jobs for deletion based on relevant conditions 3. Solve the problem that incremental statistics p0 is less stable	2023-04-28 15:50:05 +08:00
starocean999	f0852f2ac9	[fix](fe)fix bug if left table is empty and there are multiple right tables need do bucket shuffle to left side (#19169 ) * [fix](fe)fix bug if left table is empty and there are multiple right tables need do bucket shuffle to left side * fix bug * fix test cases	2023-04-28 15:06:38 +08:00
fornaix	48c4679019	[doc] fix broken link in docs (#19175 )	2023-04-28 14:29:14 +08:00
Pxl	ec517a53a8	[Chore](build) upgrade clang-format version to 16 && move thrift to fe-common (#19155 ) upgrade clang-format version to 16 move thrift to fe-common fix core dump on pipeline engine when operator canceled and not prepared	2023-04-28 14:14:51 +08:00
yagagagaga	ffe27baeaf	[FAQ](docs) add a FAQ about hive catalog occurring UnknownHostException (#19182 )	2023-04-28 13:50:24 +08:00
Zhengguo Yang	52b1bd2c81	[clone](download) fix be clone action download tablet content length overflow (#18851 )	2023-04-28 11:35:17 +08:00
WenYao	5e9c0c3500	[Enhancement](data-type) add FE config to prohibit create date and decimalv2 type (#19077 ) * prohibits date and decimal type * add config in test	2023-04-28 11:31:51 +08:00
Ashin Gau	65a82a0b57	[opt](FileReader) turn off prefetch data in parquet page reader when using MergeRangeFileReader (#19102 ) Using both `MergeRangeFileReader` and `BufferedStreamReader` simultaneously would waste a lot of memory, so turn off prefetch data in `BufferedStreamReader` when using MergeRangeFileReader.	2023-04-28 09:27:56 +08:00
Tiewei Fang	86be6d27e7	[Enhancement](Cancel Export) Cancel export supports canceling export jobs in IN_QUEUE state (#19058 )	2023-04-28 09:27:23 +08:00
Mingyu Chen	3082ed806f	[chore](branch-1.1) remove some checks on branch-1.1-lts (#19145 ) The compilation env for branch 1.1 lts is no longer supported, so remove the required GitHub checks to allow PRs to be merged.	2023-04-28 09:12:32 +08:00
Gabriel	28016c53f0	[profile](rf) refactor profile of runtime filters (#19134 ) * [profile](rf) refactor profile of runtime filters --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-04-28 08:46:42 +08:00
airborne12	31c1ebc165	[Chore](thirdparty) update clucene from 2.4.11 to 2.4.12 (#19150 ) fix memory leak in Standard Analyzer	2023-04-27 23:33:26 +08:00
gnehil	745f29f557	[typo](doc) add version label (#19148 )	2023-04-27 23:22:24 +08:00
jiugem	dfa3fd1bcf	Update CHAR.md (#19154 )	2023-04-27 23:22:09 +08:00
liujinhui	4aa4abebe7	15000-doc-spark-ha (#19153 ) Co-authored-by: liujh <liujh@t3go.cn>	2023-04-27 23:21:41 +08:00
jiugem	e0ca4e061b	Update CHAR.md (#19156 )	2023-04-27 23:21:26 +08:00
yagagagaga	72799622e1	[typo](docs) Supplementary explanation on the hint section of INSERT.md (#19171 )	2023-04-27 23:21:01 +08:00
jakevin	a35fc02bd4	[enhance](Nereids): handle project of OuterJoin in Reorder. (#19137 )	2023-04-27 22:17:03 +08:00
morrySnow	0f895640d9	[opt](Nereids)(WIP) optimize agg and window normalization step 1 (#19168 ) 1. move SimplifyAggGroupBy behind NormalizeAggregate 2. fix project to agg rule for the project containing window expression	2023-04-27 21:42:23 +08:00

10260 Commits