doris

Author	SHA1	Message	Date
caiconghui	2e20ff8cab	[feature](metric) Support collect query counter and error query counter metric in user level (#22125 ) 1. support collect query counter and error query counter metric in user level 2. add sum and count for histogram metric for mistaken delete in PR #22045	2023-07-25 11:16:38 +08:00
Euporia	ba2eb4d788	[typo](docs) add jdbc catalog error handling methods (#22160 )	2023-07-25 10:45:29 +08:00
LiBinfeng	3c58e9bac9	[Fix](Nereids) Fix problem of infer predicates not completely (#22145 ) Problem: When inferring predicate in nereids, new inferred predicates can not be the source of next round. For example: create table tt1(c1 int, c2 int) distributed by hash(c1) properties('replication_num'='1'); create table tt2(c1 int, c2 int) distributed by hash(c1) properties('replication_num'='1'); create table tt3(c1 int, c2 int) distributed by hash(c1) properties('replication_num'='1'); explain select * from tt1 left join tt2 on tt1.c1 = tt2.c1 left join tt3 on tt2.c1 = tt3.c1 where tt1.c1 = 123; we expect to get t33.c1 = 123, but we can just get t22.c1 = 123. Because when infer tt1.c1 = 123 and tt2.c1 = tt3.c1, we can not get any relationship of these two predicates. Solution: We need to cache middle results of source predicates like t22.c1 = 123 in example.	2023-07-25 10:05:00 +08:00
Gabriel	a0463ea047	[round](decimalv2) round decimalv2 to precision value (#22138 ) * [round](decimalv2) round decimalv2 to precision value * update * update`	2023-07-25 03:29:48 +08:00
Qi Chen	752cec9e19	[Fix](multi-catalog) Fix not single slot filter conjuncts with dict filter issue. (#22052 ) ### Issue Dictionary filtering is a mechanism that directly reads the dictionary encoding of a single string column filter condition for filter comparison. But dictionary filtered single string columns may be included in other multi-column filter conditions. This can cause problems. For example: `select * from multi_catalog.lineitem_string_date_orc where l_commitdate < l_receiptdate and l_receiptdate = '1995-01-01' order by l_orderkey, l_partkey, l_suppkey, l_linenumber limit 10;` `l_receiptdate` is string filter column，it is included by multi-column filter condition `l_commitdate < l_receiptdate`. ### Solution Resolve it by separating the multi-column filter conditions and executing it after the dictionary filter column is converted to string.	2023-07-24 22:31:18 +08:00
zhangdong	fc67929e34	[improvement](catalog) optimize ldap and support more character in user and table name (#21968 ) - common name support `-` ,reason: MySQL's db name support `-` - table name support `-` - username support `.`,reason:LDAP's username support `.` - ldap doc - ldap support rbac	2023-07-24 22:04:37 +08:00
zhangdong	7fcf702081	[improvement](multi catalog)paimon support filesystem metastore (#21910 ) 1.support filesystem metastore 2.support predicate and project when split 3.fix partition table query error todo: Now you need to manually put paimon-s3-0.4.0-incubating.jar in be/lib/java_extensions when use s3 filesystem doc pr: #21966	2023-07-24 22:02:57 +08:00
Dongyang Li	8180cde83b	[tools](tpcds) Update README.md, use default gcc (#21159 ) compile with gcc-11 is not ok, compile with gcc 9.40 or below is ok, default gcc often meet requirements.	2023-07-24 21:47:51 +08:00
Dongyang Li	9fe470b273	[pipeline](check) update check-pr-if-need-run-build.sh (#22171 ) no need to run pipeline if only modify regression-test/pipeline/p0/conf/regression-conf.groovy or regression-test/pipeline/p1/conf/regression-conf.groovy	2023-07-24 21:04:23 +08:00
morrySnow	82bdcb3da8	[fix](Nereids) translate partition topn order key on wrong tuple (#22168 ) partition key should on child tuple, sort key should on partition top's tuple	2023-07-24 20:46:27 +08:00
AKIRA	2d52d8d926	[opt](stats) Update stats table config and comment (#22070 ) 1. set replica count fot stats tbl as :"Math.max(Config.statistic_internal_table_replica_num,Config.min_replication_num_per_tablet)" 2. update comment for stats tbl remove symbol `'`	2023-07-24 20:43:55 +08:00
morrySnow	0677b261b5	[fix](Nereids) should not process prepare command by Nereids (#22167 )	2023-07-24 20:11:40 +08:00
Siyang Tang	0205f540ac	[enhancement](config) Enlarge broker scanner bytes conf to 500G, 5G is still not enough (#22126 )	2023-07-24 19:49:39 +08:00
morrySnow	cf30ea914a	[fix](Nereids) forbid gather sort with explict shuffle (#22153 ) gather sort with explict shuffle usually bad, forbid it	2023-07-24 19:45:18 +08:00
Ashin Gau	30c21789c8	[opt](filecache) use weak_ptr to cache the file handle of file segment (#21975 ) Use weak_ptr to cache the file handle of file segment. The max cached number of file handles can be configured by `file_cache_max_file_reader_cache_size`, default `1000000`. Users can inspect the number of cached file handles by request BE metrics: `http://be_host:be_webserver_port/metrics`: ``` # TYPE doris_be_file_cache_segment_reader_cache_size gauge doris_be_file_cache_segment_reader_cache_size{path="/mnt/datadisk1/gaoxin/file_cache"} 2500 ```	2023-07-24 19:09:27 +08:00
Calvin Kirs	3ba3690f93	[Fix](Http-API)Check and replace user sensitive characters (#22148 )	2023-07-24 18:21:42 +08:00
谢健	68bd4a1a96	[opt](Nereids) check multiple distinct functions that cannot be transformed into muti_distinct (#21626 ) This commit introduces a transformation for SQL queries that contain multiple distinct aggregate functions. When the number of distinct values processed by these functions is greater than 1, they are converted into multi_distinct functions for more efficient handling. Example: ``` SELECT COUNT(DISTINCT c1), SUM(DISTINCT c2) FROM tbl GROUP BY c3 -- Transformed to SELECT MULTI_DISTINCT_COUNT(c1), MULTI_DISTINCT_SUM(c2) FROM tbl GROUP BY c3 ``` The following functions can be transformed: - COUNT - SUM - AVG - GROUP_CONCAT If any unsupported functions are encountered, an error is now reported during the optimization phase. To ensure the absence of such cases, a final check has been implemented after the rewriting phase.	2023-07-24 16:34:17 +08:00
morrySnow	21deb57a4d	[fix](Nereids) remove double sigature of ceil, floor and round (#22134 ) we convert input parameters to double for function ceil, floor and round, because DecimalV2 could not do these operation. Since we intro DecimalV3, we should convert all parameters to DecimalV3 to get correct result. For example, when we use double as parameters, we get wrong result: ```sql select round(341/20000,4),341/20000,round(0.01705,4); +-------------------------+---------------+-------------------+ \| round((341 / 20000), 4) \| (341 / 20000) \| round(0.01705, 4) \| +-------------------------+---------------+-------------------+ \| 0.017 \| 0.01705 \| 0.0171 \| +-------------------------+---------------+-------------------+ ``` DecimalV3 could get correct result ```sql select round(341/20000,4),341/20000,round(0.01705,4); +-------------------------+---------------+-------------------+ \| round((341 / 20000), 4) \| (341 / 20000) \| round(0.01705, 4) \| +-------------------------+---------------+-------------------+ \| 0.0171 \| 0.01705 \| 0.0171 \| +-------------------------+---------------+-------------------+ ```	2023-07-24 16:08:00 +08:00
YueW	d2531db1cf	[fix](inverted index) fix regression case test_index_change_7 occasional failure (#22066 )	2023-07-24 15:39:08 +08:00
morrySnow	ac9480123c	[refactor](Nereids) push down all non-slot order key in sort and prune them upper sort (#22034 ) According the implementation in execution engine, all order keys in SortNode will be output. We must normalize LogicalSort follow by it. We push down all non-slot order key in sort to materialize them behind sort. So, all order key will be slot and do not need do projection by SortNode itself. This will simplify translation of SortNode by avoid to generate resolvedTupleExprs and sortTupleDesc.	2023-07-24 15:36:33 +08:00
HHoflittlefish777	e146969376	[Fix](config) delete unuse lazy open config #22136	2023-07-24 15:02:34 +08:00
DeadlineFen	667e4ea99b	[Fix](binlog) Fix bugs in tombstone (#22031 )	2023-07-24 14:33:16 +08:00
xzj7019	b5f27b5349	[enhance](nereids) enable wf partition topn by default (#21860 )	2023-07-24 14:21:45 +08:00
TengJianPing	99bf901607	[fix](in) throw exception for unsupported data type of in expr (#22050 )	2023-07-24 14:13:31 +08:00
Calvin Kirs	1a6709d3ac	[Fix](Sonar)Fix Java heap space error (#22135 )	2023-07-24 12:46:19 +08:00
jakevin	66fa1bef6d	[refactor](Nereids): avoid useless groupByColStats Map (#22000 )	2023-07-24 12:13:52 +08:00
mch_ucchi	ea35437c44	[Fix](Nereids)fix insert into default value exception (#21924 ) default value in the first cell of values when rise a cast exception, we filter it when check the types of values in insert, when the literal is string and value is the specific default value string, we skip type check.	2023-07-24 12:08:43 +08:00
mch_ucchi	e141409171	[Fix](planner) fix rewritten alias function's original function is not analyzed again (#21497 ) fn is null because the alias function's original function is analyzed again, we fix it by add an analysis phase.	2023-07-24 11:40:00 +08:00
minghong	138e6c2f01	[stats](nereids)keep min/max expr in colstats (#22064 ) columnStatistics.minExpr and maxExpr is useful when we derive stats for cast function. This pr 1. maintains the min/max expr during stats derive in filter condition: col<literal, col>literal and col=literal 2. adjust column stats range for cast function (now only support cast from string to other types) ds9 is changed, but no performance issue: on tpcds_sf100_rf exe time is 1.5~1.6sec, the same as master	2023-07-24 10:28:36 +08:00
zhannngchen	86e80ae175	[enhancement](merge-on-write) support concurrent delete bitmap calc while close_wait (#21488 )	2023-07-24 10:09:28 +08:00
Pxl	19ba6bec38	[Improvement](pipeline) support send eos on local exchange and remove some unused code (#22086 ) support send eos on local exchange and remove some unused code	2023-07-24 09:25:32 +08:00
Kaijie Chen	8fa41e4973	[fix](docs) update version since of stream load enable_profile (#21786 ) Stream load enable_profile is not supported in 1.2.4.1, update the docs. Currently it's supported in 2.0.0-rc01. #21784 has backported this feature into branch 1.2-lts, and it will be included in the 1.2.7 release.	2023-07-24 09:22:12 +08:00
gnehil	c78341b728	[improvement](spark-load) support datev2 and datetimev2 #21839	2023-07-24 09:07:53 +08:00
shee	ff9811fa1b	[Bug][Colocate] when adding a table to the colocate group, we should check that the number of buckets per partition is the same (#21906 ) for example CREATE TABLE `colocate_a` ( dt date, k1 int, v1 int ) ENGINE=OLAP DUPLICATE KEY(`k1`) PARTITION BY RANGE(`dt`) (PARTITION p1 VALUES [('2022-10-02'), ('2022-10-03')) DISTRIBUTED BY HASH(`k1`) BUCKETS 2 PROPERTIES ( "replication_num" = "3", "in_memory" = "false", "storage_format" = "V2" ); ALTER TABLE colocate_a set ("colocate_with" = "ab"); CREATE TABLE `colocate_b` ( dt date, k1 int, v1 int ) ENGINE=OLAP DUPLICATE KEY(`k1`) PARTITION BY RANGE(`dt`) (PARTITION p1 VALUES [('2022-10-02'), ('2022-10-03')) DISTRIBUTED BY HASH(`k1`) BUCKETS 2 PROPERTIES ( "replication_num" = "3", "in_memory" = "false", "storage_format" = "V2" ); ALTER TABLE colocate_b ADD PARTITION p2 VALUES [("2022-10-03"),("2022-10-04")) DISTRIBUTED BY HASH(k1) BUCKETS 10; ALTER TABLE colocate_b set ("colocate_with" = "ab"); table colocate_b partition p2 set bucket num is 10 then take it into group ab. In ColocateTableCheckerAndBalancer matchGroup occur : java.lang.IllegalStateException: 2 vs. 10 303861 at com.google.common.base.Preconditions.checkState(Preconditions.java:508) ~[guava-30.0-jre.jar:?] 303862 at org.apache.doris.clone.ColocateTableCheckerAndBalancer.matchGroup(ColocateTableCheckerAndBalancer.java:242) ~[doris-fe.jar:1.2-SNAPSHOT] 303863 at org.apache.doris.clone.ColocateTableCheckerAndBalancer.runAfterCatalogReady(ColocateTableCheckerAndBalancer.java:95) ~[doris-fe.jar:1.2-SNAPSHOT] 303864 at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) ~[doris-fe.jar:1.2-SNAPSHOT] 303865 at org.apache.doris.common.util.Daemon.run(Daemon.java:116) ~[doris-fe.jar:1.2-SNAPSHOT] --------- Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>	2023-07-24 09:01:16 +08:00
赵立伟	d0219062ef	[refactor](be) use std::move to improve performance of push_back #22056	2023-07-24 08:51:28 +08:00
wuwenchi	64348055a1	[improvement](iceberg) Optimize the split to the user-specified size #22078 According to the specified split size, the split tasks are merged to keep a single task near the expected size.	2023-07-24 08:48:10 +08:00
Chenyang Sun	0396ac9d38	fix(compaction) release the block and segment iterator after reading to the end of the segment file (#22082 ) When reading to the end of the segment file, clearing the block did not release the memory, leading to high memory usage during compaction. When reading through segment file for columns that are dictionary encoded, the column iterator in the segment iterator will hold the dictionary. Release the segment iterator to free up the dictionary.	2023-07-24 08:47:19 +08:00
Mingyu Chen	0c811edb78	[deps](hadoop) update hadoop libs to 3.3.4.5 (#22062 )	2023-07-23 20:17:16 +08:00
Mingyu Chen	a5099a2d3b	[minor](log) print error msg to fe.out before log is initialized (#22106 ) The exception may be thrown before LOG is initialized. Such as wrong config value. So we need to print it to fe.out, otherwise we can't know what's wrong. After this PR, the error can be found in fe.out, such as: ``` java.lang.NumberFormatException: For input string: "3g" at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) at java.lang.Long.parseLong(Long.java:589) at java.lang.Long.parseLong(Long.java:631) at org.apache.doris.common.ConfigBase.setConfigField(ConfigBase.java:253) at org.apache.doris.common.ConfigBase.setFields(ConfigBase.java:232) at org.apache.doris.common.ConfigBase.initConf(ConfigBase.java:146) at org.apache.doris.common.ConfigBase.init(ConfigBase.java:112) at org.apache.doris.DorisFE.start(DorisFE.java:101) at org.apache.doris.DorisFE.main(DorisFE.java:73) ```	2023-07-23 19:20:10 +08:00
Liqf	ddd7e9871d	[improvement](Jsonb) optimization Jsonb path parse (#21495 ) The previous logic was to read jsonbvalue while parsing the json path. For complex json paths, there will be a lot of repeated parsing work. The optimization idea is to separate the analysis and value of jsonpath	2023-07-23 18:59:12 +08:00
Xin Liao	4f0158c458	[fix](partial-update) fix update core for merge-on-write table (#22090 )	2023-07-23 13:35:08 +08:00
yiguolei	2c16fe0da9	[bugfix](runtimefilter) runtime filter is shared between multi instances with same node id, should not cache exprs (#22114 ) runtime filter is shared among multi instances. in the past, we cached pushdown expr(runtime filter generated) every scannode[runtime filter consumer] will try to call prepare expr but the expr may generated with different fn_context_id --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-07-23 13:04:33 +08:00
zhangstar333	256051a965	[bug](node) fix partiton sort node core dump when eos (#22108 ) fix partiton sort node core dump when eos	2023-07-23 12:00:53 +08:00
Siyang Tang	22aa54e335	[enhancement](config) enlarge max_bytes_per_broker_scanner to 5G #22099	2023-07-23 12:00:32 +08:00
zhangdong	eceb30f47e	[doc](catalog)paimon doc (#21966 ) code pr: #21910	2023-07-23 11:24:40 +08:00
zhangdong	dfb5d4bc13	[fix](catalog) do not call makeSureInitialized when create/drop table/db from hms meta event (#21941 ) Supplement to #21104	2023-07-23 11:24:20 +08:00
yiguolei	f8307f1a1a	[bugfix](scanner) when scanner init failed during get tablet, not need call update counters (#22117 ) Co-authored-by: yiguolei <yiguolei@gmail.com> If the scanner is failed during init or open, then not need update counters because the query is fail and the counter is useless. And it may core during update counters. For example, update counters depend on scanner's tablet, but the tablet == null when init failed.	2023-07-23 10:19:20 +08:00
caiconghui	8cb532230a	[fix](metric) fix prometheus metric format error (#22045 ) we should define metric name only once like following: # HELP doris_fe_query_latency_ms # TYPE doris_fe_query_latency_ms summary doris_fe_query_latency_ms{quantile="0.75"} 1.0 doris_fe_query_latency_ms{quantile="0.95"} 2.0 doris_fe_query_latency_ms{quantile="0.98"} 100.0 doris_fe_query_latency_ms{quantile="0.99"} 100.0 doris_fe_query_latency_ms{quantile="0.999"} 100.0 doris_fe_query_latency_ms{quantile="0.75",user="default_cluster:test1"} 1.0 doris_fe_query_latency_ms{quantile="0.95",user="default_cluster:test1"} 1.0 doris_fe_query_latency_ms{quantile="0.98",user="default_cluster:test1"} 1.0 doris_fe_query_latency_ms{quantile="0.99",user="default_cluster:test1"} 1.0 doris_fe_query_latency_ms{quantile="0.999",user="default_cluster:test1"} 1.0	2023-07-22 22:38:29 +08:00
Pxl	0755fd16d8	remove create hot partition failed check (#22093 )	2023-07-22 17:47:46 +08:00
amory	3d0f952934	[FIX](complex-type)delete enable_map/struct_type switch #21957	2023-07-22 15:29:32 +08:00

1 2 3 4 5 ...

12044 Commits