doris

Author	SHA1	Message	Date
Xujian Duan	1069d4f91e	[Enhancement](Stmt)ShowPartitionsStmt support forward to master #16359 Co-authored-by: duanxujian <duanxujian@jd.com>	2023-02-04 22:51:19 +08:00
Xinyi Zou	63d57b83f3	[fix](memory) Fix request jemalloc metrics wait lock je_malloc_mutex_lock_slow #16381 MetricRegistry::trigger_all_hooks holds the metrics lock and is stuck in get_je_metrics, while to_prometheus is waiting for MetricRegistry::trigger_all_hooks to release the lock, so get_je_metrics is no longer called in MetricRegistry::trigger_all_hooks.	2023-02-04 22:49:22 +08:00
plat1ko	bd8ef4edeb	[fix](cooldown) Fix core in remove_all_remote_rowsets (#16374 )	2023-02-04 22:31:38 +08:00
plat1ko	1473a9716b	[fix](cooldown) Fix bug in report tablet (#16414 )	2023-02-04 22:30:57 +08:00
huangzhaowei	1146bde695	[feature-wip](MTMV) Support refresh mtmv (#16218 ) Support using this SQL to refresh an mtmv manually. It generates an mtmv task immediately. ``` REFRESH MATERIALIZED VIEW test_mv_view [complete]; ``` You can use `show mtmv task` to show the latest task. In this PR, I also clear the mtmv tasks when the mtmv is dropped, so that the test suite behaves correctly.	2023-02-04 20:17:45 +08:00
camby	60386a46a6	fix ADMIN-CHECK-TABLET typo (#16389 )	2023-02-04 18:44:08 +08:00
Gabriel	918004c016	[Bug](date) Fix BE crash caused by function `datediff` (#16397 ) * [Bug](date) Fix BE crash caused by function `datediff` * update	2023-02-04 18:43:23 +08:00
yagagagaga	712fa8c538	[typo](docs) Fixed some display errors caused by MD syntax errors (#16395 )	2023-02-04 18:12:05 +08:00
ElvinWei	ad78f313be	[Improvement](statistics) show analysis job info (#16305 ) Supports query analysis job info. syntax: ```SQL SHOW ANALYZE [TABLE \| ID] [ WHERE [STATE = ["PENDING"\|"RUNNING"\|"FINISHED"\|"FAILED"]] ] [ORDER BY ...] [LIMIT limit]; ``` example: ```SQL SHOW ANALYZE test_table1 WHERE state = 'FINISHED' ORDER BY col_name LIMIT 1; ``` result: \| job_id \| catalog_name \| db_name \| tbl_name \| col_name \| job_type \| analysis_type \| message \| last_exec_time_in_ms \| state \| schedule_type \| \| ------ \| ------------ \| -------------------- \| ----------- \| -------- \| -------- \| ------------- \| ------- \| -------------------- \| -------- \| ------------- \| \| 10086 \| internal \| default_cluster:test \| test_table1 \| pv \| MANUAL \| FULL \| \| 2023-02-01 09:36:41 \| FINISHED \| ONCE \|	2023-02-03 23:21:47 +08:00
ElvinWei	f443ebfd9a	[Improvement](statistics) optimise histogram keyword (#16369 )	2023-02-03 23:02:41 +08:00
Kang	125b60b4b9	[improvement](compatibility) add DATA_TYPE in information schema for new types #16391 Add DATA_TYPE in information schema for types: datev2, datetimev2, decimal, jsonb. It was 'unknown' for these types, which caused problems for tools such as BI that use the information schema.	2023-02-03 22:28:42 +08:00
Hu Yanjun	b621d1d68a	[docs](docs) update en docs (#16257 )	2023-02-03 22:01:43 +08:00
minghong	4f778c38a1	[feature](nereids) support explore 4 phase aggregation (#16298 ) support 4 phase Aggregation. example: `select count(distinct k1), sum(k2) from t` suppose t.k0 is distribute key. we have plan ``` Agg(DISTINCT_GLOBAL) \| Exchange(Gather) \| Agg(DISTINCT_LOCAL) \| Agg(GLOBAL) \| Exchange(hash distribute by k1) \| Agg(LOCAL) \| scan ``` limitations: 1. only support sql with one distinct. not support:`select count(distinct k1), count(distinct k2) from t` 2. only support sql with distinct one column not support: `select count(distinct k1, k2) from t`	2023-02-03 21:51:10 +08:00
yixiutt	56be2e5a1a	[bugfix](disk balance) fix new rowset time check when add tablet (#16261 ) In the disk balancer, if a tablet is under highly concurrent load, the creation time of the new rowset (which uses the current time) may be the same as that of the newest rowset. When adding the tablet, there is a creation time check requiring new_time to be greater than the old time, so the disk balancer fails many times and the tablet loses many versions, because migration blocks writes.	2023-02-03 21:49:37 +08:00
lihangyu	54c85e36ad	[Fix](point query) OlapScanNode `result` could be memleak since it's cached (#16406 ) For a cached OlapScanNode, each call to `addScanRangeLocations` adds TScanRangeLocations to `result`. So `result` could grow too large and make `getReplicaNumPerHost` a CPU hot spot in its loop.	2023-02-03 21:42:53 +08:00
AKIRA	5e232a30d8	[fix](planner) Doris returns empty sets when select from an inline view (#16370 ) Doris always delays the execution of expressions as long as it can, including the expansion of constant expressions. Given the SQL below: ```sql select i from (select 'abc' as i, sum(birth) as j from subquerytest2) as tmp ``` The aggregation would be eliminated, since its output is not required by the outer block, but the expansion of the constant expression would be done in the final result expr, and since the aggregate output has been eliminated, the expansion would actually do nothing, and finally cause an empty result. To fix this, we materialize the result exprs in the inner block for such SQL; it may affect performance, but that is better than letting the system produce a wrong result.	2023-02-03 21:23:52 +08:00
AKIRA	a5d9aca7ba	[test](Nereids) enable G-K and L-Q scalar function regression test cases (#16169 ) 1. delete invalid signature of nvl function 2. fix some test cases that failed because of malformed function name	2023-02-03 21:18:43 +08:00
Gabriel	87fbb8341a	[Bug](datev2) Fix bug when cast datev2 to date (#16394 )	2023-02-03 20:50:16 +08:00
lihangyu	f94a78ab4a	[Fix](topn) fix wrong nullable cast for RowId column and use heapsorter for two phase read (#16399 ) convert_nullable_flags does not contain nullable info for the RowID column, but valid_column_ids contains the RowID column, so the nullable flag will be undefined for the RowID column	2023-02-03 20:49:45 +08:00
zhengshiJ	929b31bd3c	[Feature](Nereids) Support CaseWhen with subquery (#16385 ) Co-authored-by: jianghaochen <jianghaochen@meituan.com>	2023-02-03 18:20:47 +08:00
谢健	3891083474	[fix](Nereids): fix some bugs in DpHyper (#16282 )	2023-02-03 18:19:48 +08:00
Gabriel	3f4ca3da32	[Bug](CURRENT_TIMESTAMP) Fix wrong default value after schema change (#16364 ) * [Bug](CURRENT_TIMESTAMP) Fix wrong default value after schema change * update * update	2023-02-03 17:06:24 +08:00
luozenglin	4df70becb9	[refactor](reader) refactor broker_file_reader to get _client in the constructor (#16021 )	2023-02-03 16:51:19 +08:00
caiconghui	13cb81a724	[fix](broker) Fix bug that heavy broker load may fail due to BrokerException which indicates the fd is not owned by client (#16350 ) Co-authored-by: caiconghui1 <caiconghui1@jd.com>	2023-02-03 15:06:45 +08:00
xy720	6294b29f0a	[chore](regression-test) Remove array config in regression test (#16376 ) The fe config "enable_array_type" is not used, this commit removes it from regression test.	2023-02-03 14:44:03 +08:00
xy720	b1fd124f02	[feature](struct-type/map-type) Add switch for struct and map type for creating table (#16379 ) Add switches to forbid users from creating tables with struct or map columns.	2023-02-03 13:46:52 +08:00
starocean999	dfb610d7ec	[fix](nereids) the order exprs in sort node should be slotRef in its tupleDesc (#16363 )	2023-02-03 13:28:08 +08:00
morrySnow	a9177569c6	[refactor](Nereids) remove trick datatype code in Expression (#16365 ) Since we already do typeCoercion bottom-up in the binding step, the trick dataType code in Expression is useless. This PR removes it.	2023-02-03 13:02:34 +08:00
Pxl	5e4bb98900	[Chore](build) enable -Wpedantic and update lowest gcc version to 11.1 (#16290 ) enable -Wpedantic and update lowest gcc version to 11.1	2023-02-03 11:28:48 +08:00
zhangstar333	7d5a10e1af	[bug](function) fix mask_first_n function can't handle const value (#16308 )	2023-02-03 10:32:42 +08:00
zhangdong	4fc0715156	[fix](auth)fix external catalog cannot use db (#16269 )	2023-02-03 10:10:33 +08:00
zhangstar333	545b91f8f7	[bug](jdbc) fix jdbc insert decimalv3 be core dump (#16353 )	2023-02-03 10:00:06 +08:00
Jerry Hu	7a800bd3c6	[fix](scan) coredump caused by null of _scanner_ctx (#16361 )	2023-02-03 09:24:15 +08:00
lihangyu	13f74088fa	[Improve](row-store) check light schema change enabled (#16358 )	2023-02-02 20:57:18 +08:00
lihangyu	1d8265c5a3	[refactor](row-store) make row store column a hidden column in meta (#16251 ) This could simplify the storage engine logic and make the code more readable, and we could analyze the hidden `__DORIS_ROW_STORE_COL__` length, etc.	2023-02-02 20:56:13 +08:00
plat1ko	6ee0dbfb23	[fix](cooldown) Fix bugs in cooldown single replica files (#16299 )	2023-02-02 19:31:26 +08:00
Pxl	0d5b115993	[Feature](Materialized-View) support duplicate base column for different aggregate function (#15837 ) support duplicate base column for different aggregate function	2023-02-02 18:57:39 +08:00
zhengshiJ	e31913faca	[Feature](Nereids) Support order and limit in subquery (#15971 ) 1.Compatible with the old optimizer, the sort and limit in the subquery will not take effect, just delete it directly. ``` select * from sub_query_correlated_subquery1 where sub_query_correlated_subquery1.k1 > (select sum(sub_query_correlated_subquery3.k3) a from sub_query_correlated_subquery3 where sub_query_correlated_subquery3.v2 = sub_query_correlated_subquery1.k2 order by a limit 1); ``` 2.Adjust the unnesting position of the subquery to ensure that the conjunct in the filter has been optimized, and then unnesting Support: ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count() FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k1 = i1.k1) AND (k2 = 1)) ) > 0); ``` The reason why the above can be supported is that conjunction will be performed, which can be converted into the following ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count() FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2 or k2 = 1)) ) > 0); ``` Not Support: ``` SELECT DISTINCT k1 FROM sub_query_correlated_subquery1 i1 WHERE ((SELECT count(*) FROM sub_query_correlated_subquery1 WHERE ((k1 = i1.k1) AND (k2 = 2)) or ((k2 = i1.k1) AND (k2 = 1)) ) > 0); ```	2023-02-02 18:17:30 +08:00
Mingyu Chen	cb6875b5a4	[improvement](multi-catalog) use date/datetimev2 as default col type for catalog table (#16304 ) 1. When mapping column from external datasource, use date/datetimev2 as default type 2. check `is_cancelled` when read data, to avoid endless loop after query is cancelled	2023-02-02 17:35:48 +08:00
Tiewei Fang	557159d3ce	[feature](JdbcExternalCatalog) support insert data in JdbcExternalCatalog (#16271 )	2023-02-02 17:31:33 +08:00
谢健	09abd32957	[fix](test) result order in group-by-costant case is not stable (#16323 )	2023-02-02 16:54:01 +08:00
谢健	398da44e46	[fix](Nereids) fix bugs in test join5 (#16312 ) make bucket-shuffle-join in PhysicalPlanTranlator when property of left child is not enforced	2023-02-02 16:51:45 +08:00
Kang	68d2067f51	[improvement](testcase) change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable change order by sql in test_dup_mv_bitmap_hash.groovy to make result stable	2023-02-02 16:42:58 +08:00
YueW	bb179b77f7	[Feature-WIP](inverted index) support array type for inverted index reader (#16355 )	2023-02-02 16:14:14 +08:00
DuRipeng	a69c0f28ca	[typo](doc) revise zh-CN document markdown format in ALTER-SYSTEM-DECOMMISSION-BACKEND (#16221 )	2023-02-02 15:42:27 +08:00
morrySnow	a6c1eaf1d8	[refactor] bind slot and function in one rule (#16288 ) 1. use one rule to bind slot and function and do type coercion to fix type and nullable error a. SUM(a1 + AVG(a2)) when a1 and a2 are TINYINT. Before, the return type was SMALLINT; after this PR it returns the right type, DOUBLE. 2. fix runtime filter generator bugs - bind runtime filter on wrong join conjuncts.	2023-02-02 15:02:32 +08:00
lsy3993	42960ffd08	[typo](docs)fix docs format (#16279 )	2023-02-02 14:13:17 +08:00
Gabriel	3b8182ee7e	[nereids](nvl) Fix function signature (#16345 )	2023-02-02 14:05:51 +08:00
Ashin Gau	9618427020	[improvement](multi-catalog) increase default batch_size to 4064 (#16326 ) The performance of ClickBench Q30 is affected by batch_size: \| batch_size \| 1024 \| 4096 \| 20480 \| \| -- \| -- \| -- \| -- \| \| Q30 query time \| 2.27 \| 1.08 \| 0.62 \| Because the aggregation operator creates a new result block for each batch block, and Q30 has 90 columns, this is time-consuming. A larger batch_size decreases the number of aggregation blocks, so a larger batch_size improves performance. The Doris internal reader reads at least 4064 rows even if batch_size < 4064, so this PR keeps the process of reading external tables the same as for internal tables.	2023-02-02 11:51:09 +08:00
zhannngchen	69f34cd1c3	[fix](load) sequence column does not compare correctly in memtable (#16211 )	2023-02-02 11:00:23 +08:00

8516 Commits