doris

Author	SHA1	Message	Date
Jerry Hu	501e7b9132	[chore][config] increase the default value of doris_blocking_priority_queue_wait_timeout_ms (#12580 ) The default value of Config::doris_blocking_priority_queue_wait_timeout_ms causes high CPU usage (about 8%) in PriorityWorkStealingThreadPool::work_thread.	2022-09-14 14:26:13 +08:00
HappenLee	a219a41dde	[dependency](xxhash) Add xxhash lib (#12566 ) Add the xxhash library for BE, which testing showed to be the faster hash method.	2022-09-14 12:30:09 +08:00
jakevin	fd0cf78aa7	[fix](Nereids): fix StatsCalculator compute project and correct commute join type. (#12539 )	2022-09-14 10:32:05 +08:00
ChPi	ead016e0d2	[Enhancement](execute) add timeout for executing fragment rpc (#12512 ) Co-authored-by: chenjie <chenjie@cecdat.com>	2022-09-14 09:12:33 +08:00
lsy3993	8448867bed	[regression-test](window-function) add big table in regression of window function #12562	2022-09-14 08:43:24 +08:00
Yongqiang YANG	5dcf933012	[Bug](column) ColumnNullable::replace_column_data should DCHECK size > sel… #12558	2022-09-14 08:42:15 +08:00
camby	56b2fc43d4	[enhancement](array-type) shrink column suffix zero for type ARRAY<CHAR> (#12443 ) At the compute level, the CHAR type shrinks suffix zeros. To keep the logic the same as for the CHAR type, we also shrink them for ARRAY or ARRAY<ARRAY> types. Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-09-13 23:24:48 +08:00
HappenLee	d913ca5731	[Opt](vectorized) Speed up bucket shuffle join hash compute (#12407 ) * [Opt](vectorized) Speed up bucket shuffle join hash compute	2022-09-13 20:19:22 +08:00
jakevin	9a5be4bab5	[feature](Nereids): Eliminate redundant filter and limit. (#12511 )	2022-09-13 20:08:13 +08:00
AlexYue	58508aea13	[enhance](information_schema) show hll type and bitmap type instead of unknown (#12519 ) Before this PR, querying the data type of an hll/bitmap column returned 'unknown' instead of the column's actual data type.	2022-09-13 19:43:42 +08:00
TengJianPing	6bf5fc6db5	[improvement](storage) For debugging problems: add session variable `skip_storage_engine_merge` to treat agg and unique data model as dup model (#11952 ) For debugging purposes: add session variable `skip_storage_engine_merge`; when set to true, tables of the aggregate key model and unique key model are read as the duplicate key model. Add session variable `skip_delete_predicate`; when set to true, rows deleted with a DELETE statement are still selected.	2022-09-13 19:18:56 +08:00
Henry2SS	6a3385437b	[fix](comments) modify comments of setting global variables #12514 Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-13 19:13:57 +08:00
Pxl	9e49f68663	[fix](new-scan) try to fix invalid call to nullptr slot (#12552 )	2022-09-13 18:54:29 +08:00
deardeng	b98a3ed86c	[fix](frontend) fix notify update storage policy agent task null exception #12470	2022-09-13 16:20:11 +08:00
Pxl	2306e46658	[Enhancement](compaction) reduce VMergeIterator copy block (#12316 ) This PR makes VMergeIterator support returning row references instead of copying a full block.	2022-09-13 16:19:34 +08:00
Jibing-Li	dc80a993bc	[feature-wip](new-scan) New load scanner. (#12275 ) Related PRs: https://github.com/apache/doris/pull/11582 https://github.com/apache/doris/pull/12048 Use the new file scan node and the new scheduling framework to run load jobs, replacing the old broker scan node. The load part (BE) is a work in progress. The query part (FE) has been tested with the TPC-H benchmark. Please review only the FE code in this PR; the BE code is disabled by the enable_new_load_scan_node configuration. Another PR will follow soon to fix the BE-side code.	2022-09-13 13:36:34 +08:00
jakevin	5b4d3616a4	[feature](Nereids): semi join transpose. (#12515 ) * [feature](Nereids): semi join transpose. * fix conditionChecker and check lasscom	2022-09-13 13:32:47 +08:00
Kikyou1997	d35a8a24a5	[feature](nereids) push down Project through Limit (#12490 ) This rule rewrites project -> limit to limit -> project. The reason is that we can get a tree like project -> limit -> project -> other node. If we do not rewrite it, we cannot merge the two projects into one. And if there is more than one project on one node, the second one overwrites the first one during translation. BE will then core dump or return a "slot cannot find" error.	2022-09-13 13:26:12 +08:00
jakevin	c3d7d4ce7a	[fix](Nereids): fix LAsscom project split. (#12506 )	2022-09-13 12:12:39 +08:00
starocean999	6b52e47805	[fix](agg)the intermediate slots should be materialized as output slots (#12441 ) In some cases, the output slots of the agg info may be materialized by calling SlotDescriptor's materializeSrcExpr method, while the intermediate slots are not. This PR sets the materialized info of the intermediate slots to keep them consistent with the output slots.	2022-09-13 11:28:27 +08:00
catpineapple	550b1e531b	[fix](doc) add the key columns description of the table model document (#12500 ) Add a description of the key columns to the table model document.	2022-09-13 11:27:05 +08:00
yinzhijian	353f9e3782	[regression](json) add a nullable case for stream load with json format (#12505 )	2022-09-13 10:45:01 +08:00
slothever	9f25544f2f	[feature-wip](parquet-reader) page index bug fix (#12428 ) Co-authored-by: jinzhe <jinzhe@selectdb.com>	2022-09-13 10:28:53 +08:00
Mingyu Chen	8a274d7851	[feature-wip](new-scan) refactor some interface about predicate push down in scan node (#12527 ) This PR introduces a new enum type `PushDownType`: ``` enum class PushDownType { // The predicate can not be pushed down to data source UNACCEPTABLE, // The predicate can be pushed down to data source // and the data source can fully evaluate it ACCEPTABLE, // The predicate can be pushed down to data source // but the data source can not fully evaluate it. PARTIAL_ACCEPTABLE }; ``` Derived classes of VScanNode can override the following methods to determine whether to accept a binary/in/function filter/bloom filter/is null predicate: ``` PushDownType _should_push_down_binary_predicate(); PushDownType _should_push_down_in_predicate(); PushDownType _should_push_down_function_filter(); PushDownType _should_push_down_bloom_filter(); PushDownType _should_push_down_is_null_predicate(); ```	2022-09-13 10:25:13 +08:00
Stalary	87439e227e	[Enhancement](DOE): Doe support object/nested use string (#12401 ) * MOD: doe support object/nested use string	2022-09-13 09:59:48 +08:00
zy-kkk	97cb095010	[test](join)add test join case4 #12508	2022-09-13 09:09:49 +08:00
zy-kkk	8be5527be4	[test](join)add some join cases (#12501 )	2022-09-13 08:59:32 +08:00
lsy3993	4c73755b40	[test](window-function) add regression test of window function (#12529 )	2022-09-13 08:58:19 +08:00
Mingyu Chen	e33f4f90ae	[fix](exec) Avoid query thread block on wait_for_start (#12411 ) When FE sends a cancel RPC to BE, it does not notify the thread blocked in wait_for_start(), so the fragment stays blocked and occupies the execution thread. Add a max wait time to wait_for_start() so that it does not block forever.	2022-09-13 08:57:37 +08:00
xy720	b1c2a8343f	[Bug](array_type) Forbid adding array key columns #12479 mysql> desc array_test; +-----------+----------------+------+-------+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-----------+----------------+------+-------+---------+-------+ | id | INT | Yes | true | NULL | | | c_array | ARRAY<INT(11)> | Yes | false | NULL | NONE | +-----------+----------------+------+-------+---------+-------+ Before: mysql> ALTER TABLE array_test ADD COLUMN add_arr_key array<int> key NULL DEFAULT NULL; Query OK, 0 rows affected (0.00 sec) After: mysql> ALTER TABLE array_test ADD COLUMN c_array array<int> key NULL DEFAULT NULL; ERROR 1105 (HY000): errCode = 2, detailMessage = Array can only be used in the non-key column of the duplicate table at present. mysql> ALTER TABLE array_test MODIFY COLUMN c_array array<int> key NULL DEFAULT NULL; ERROR 1105 (HY000): errCode = 2, detailMessage = Array can only be used in the non-key column of the duplicate table at present.	2022-09-13 08:48:28 +08:00
Zhengguo Yang	503a79e4d8	[Bugfix](load) fix be may core dump when load column mapping has function (#12509 ) Fix a possible BE core dump when the load column mapping contains a function. This bug may have been introduced by #12375.	2022-09-13 08:44:10 +08:00
TaoZex	c8e9a32bb2	[Function](cbrt)Add cbrt function for doris (#12523 ) Add cbrt function for doris	2022-09-12 19:58:45 +08:00
Henry2SS	ecfefae715	[enhancement](load) make default load mem limit configurable (#12348 ) * make LoadMemLimit valid for broker load, stream load and routine load Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-12 10:25:01 +08:00
carlvinhust2012	fc605779ed	[fix](array-type) support to export the array type to hdfs (#12504 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-09-12 10:23:33 +08:00
wudi	9b73b45d05	[Doc](Streamload) update streamload default timeout #12499 Co-authored-by: wudi <>	2022-09-12 10:23:18 +08:00
Mingyu Chen	efd2bdb203	[improvement](new-scan) avoid too many scanner context scheduling (#12491 ) When selecting a large amount of data from a table, the profile shows: - ScannerCtxSchedCount: 2.82664M(2826640) But ScannerSchedCount is only 8; most of them are busy running. After this improvement, ScannerCtxSchedCount is reduced to only 10.	2022-09-12 10:22:54 +08:00
weizuo93	e879c26232	[Enhancement](ChunkAllocator) Constructor of singleton class should be private #12516 Co-authored-by: weizuo <weizuo@xiaomi.com>	2022-09-12 10:21:49 +08:00
luozenglin	0c260152b7	[fix](profile) fix query instance profile may be lost. (#12418 )	2022-09-09 22:58:04 +08:00
Kikyou1997	a6a378c9ca	[fix](regression-test) remove 2 regression cases for nereids temporarily which blocked the pipeline (#12517 ) Removed the following cases from the regression suite nereids_syntax_p0/sub_query_correlated: 1. qt_not_exists_unCorrelated 2. qt_not_exist_uncorr	2022-09-09 22:20:35 +08:00
Kikyou1997	f80d7bdd5b	[enhancement](Nereids) add type coercion between decimal and integral (#12482 )	2022-09-09 20:08:03 +08:00
Shuo Wang	2b62ac2fef	[Feature](Nereids) Main framework for selecting rollup index. (#12464 ) First step of #12303. This is the first step toward supporting rollup index selection for aggregate/unique key OLAP tables. This PR selects a rollup index when the aggregate node is present and the aggregate function matches the value type, so pre-aggregation is turned on by default. Cases where pre-aggregation should be turned off will be addressed in the next PR. Main steps for rollup index selection: 1. keep the rollup indexes that contain all the required columns. 2. keep the rollup indexes that match the key prefix most. 3. order the rollup indexes by row count, column count, rollup index id. Remaining TODOs: 1. address cases where pre-aggregation should be turned off (next PR). 2. add more test cases. Refactor: - Add `Project.getSlotToProducer` to extract a map from the project output slot to its producing expression. - Add `Filter.getConjuncts` to split the filter condition into conjunctive predicates. - Move the usage of `ExpressionReplacer` to `ExpressionUtils.replace(expr, replaceMap)` to simplify the code.	2022-09-09 18:14:31 +08:00
zhengshiJ	dc7e5ca039	[fix](nereids) uncorrelated subquery can't get the correct result (#12421 ) Currently, executing an uncorrelated subquery reports an error saying that the corresponding column cannot be found. The reason is that the tupleID of the child obtained in visitPhysicalNestedLoopJoin is not consistent with the child. Uncorrelated subqueries trigger this bug because they use crossJoin. Subquery regression tests for uncorrelated and complex scenarios have also been added. Co-authored-by: morrySnow <morrysnow@126.com>	2022-09-09 18:08:34 +08:00
Xin Liao	554ba40b13	[feature-wip](unique-key-merge-on-write) update delete bitmap when incremental clone (#12364 )	2022-09-09 17:03:27 +08:00
jakevin	77b93ebc09	[enhancement](Nereids) add optionalAnd to simplify code (#12497 ) Add optionalAnd to avoid adding True which may make BE crash. Use optional to simplify code.	2022-09-09 15:54:32 +08:00
Gabriel	66491ec137	[Improvement](sort) improve partial sort algorithm (#12349 ) * [Improvement](sort) improve partial sort algorithm	2022-09-09 15:44:18 +08:00
924060929	6b8a139f2d	[feature](Nereids) Support function registry (#12481 ) Support function registry. The classes: - BuiltinFunctions: contains the list of built-in functions - FunctionRegistry: used to register scalar functions and aggregate functions; it can find a function by name - FunctionBuilder: used to resolve a BoundFunction class, extract the constructor, and build a BoundFunction from arguments (`List<Expression>`) Register example: for simplicity, you can add built-in functions to the list ```java public class BuiltinFunctions implements FunctionHelper { public final List<ScalarFunc> scalarFunctions = ImmutableList.of( scalar(Substring.class, "substr", "substring"), scalar(WeekOfYear.class), scalar(Year.class) ); public final ImmutableList<AggregateFunc> aggregateFunctions = ImmutableList.of( agg(Avg.class), agg(Count.class), agg(Max.class), agg(Min.class), agg(Sum.class) ); } ``` Note: - Currently, only scalar functions and aggregate functions can be registered; registering table functions will be supported later. - Currently, functions are resolved only by function name and arity; overloads with the same arity cannot be resolved, e.g. `some_function(Expression)` and `some_function(Literal)`	2022-09-09 15:19:45 +08:00
morrySnow	c9a6486f8c	[fix](Nereids) subquery predicate's slot appears in having's output by mistake (#12494 ) When an uncorrelated subquery appears in the HAVING predicates, one slot from the subquery mistakenly appears in the HAVING output. This PR fixes it by always adding a project on top of the HAVING. Co-authored-by: mch_ucchi <organic_chemistry@foxmail.com>	2022-09-09 11:52:56 +08:00
carlvinhust2012	b1db8aef58	[regression](array-type) add some case for array insert (#12474 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-09-09 11:18:06 +08:00
xy720	73351917ab	[Enhancement](array-type) Add readable information in subquery for array type #12463	2022-09-09 11:17:50 +08:00
morrySnow	a04f9814fe	[fix](Nereids) column prune generate empty project list on join's child (#12486 ) * [fix](Nereids) column prune generate empty project list on join's child	2022-09-09 10:43:57 +08:00
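
The three rollup index selection steps listed in commit 2b62ac2fef (#12464) can be sketched roughly as follows. This is an illustrative Python sketch, not the Nereids implementation. The `RollupIndex` class, its fields, and both function names are hypothetical. Reading "match the key prefix most" as "the longest run of leading key columns used in predicates", and sorting each criterion in ascending order, are also assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollupIndex:
    """Hypothetical stand-in for a rollup index; not a Doris class."""
    index_id: int
    key_columns: tuple   # key columns in index order
    columns: frozenset   # every column stored in the index
    row_count: int


def matched_key_prefix_len(index, predicate_columns):
    """Count leading key columns that appear in the predicates (assumed meaning)."""
    matched = 0
    for col in index.key_columns:
        if col not in predicate_columns:
            break
        matched += 1
    return matched


def select_rollup_index(indexes, required_columns, predicate_columns):
    required = set(required_columns)
    predicates = set(predicate_columns)

    # Step 1: keep indexes that contain all the required columns.
    candidates = [i for i in indexes if required <= i.columns]
    if not candidates:
        return None

    # Step 2: keep indexes whose key prefix matches the most columns.
    best = max(matched_key_prefix_len(i, predicates) for i in candidates)
    candidates = [
        i for i in candidates
        if matched_key_prefix_len(i, predicates) == best
    ]

    # Step 3: order by row count, column count, index id; take the first.
    # Ascending order is an assumption made for this sketch.
    return min(
        candidates,
        key=lambda i: (i.row_count, len(i.columns), i.index_id),
    )
```

For example, given a base index and a smaller rollup that both contain the required columns and match the same key prefix, the sketch picks the rollup because it has fewer rows.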


8276 Commits