doris

Author	SHA1	Message	Date
Shuo Wang	d2d5c19d51	[Improvement](Nereids) Avoid unsafe cast. (#12603 ) This PR changes some interfaces to avoid unsafe casts. - Change the return type of `Plan.getExpressions()` from `List<Expression>` to `List<? extends Expression>`. Returning the projects (a list of named expressions) from `getExpressions` then needs no unsafe cast; see `LogicalProject.getExpression()` for an example. - Change the return type of `EmptyRelation.getProjects()` from `List<NamedExpression>` to `List<? extends NamedExpression>`. Creating an empty relation from a list of slots then needs no unsafe cast; see the `EliminateLimit` rule for an example.	2022-09-15 12:02:35 +08:00
mch_ucchi	5e0dc11f87	[feature](Nereids)add RelationId as a unique identifier of relations (#12461 ) In Nereids, we could not distinguish two relations from the same table in one PlanTree. This led to some tricky code to handle them during planning, such as a separate branch for equals in GroupExpression. This PR adds RelationId to LogicalRelation and PhysicalRelation. The equals function of every relation now compares RelationId, which lets us distinguish two relations from the same table. TODO: add relation id to UnboundRelation, UnboundOneRowRelation, LogicalOneRowRelation, PhysicalOneRowRelation.	2022-09-15 11:56:56 +08:00
Gabriel	fc4298e85e	[feature](outfile) support parquet writer (#12492 )	2022-09-15 11:09:12 +08:00
zhangstar333	22a8d35999	[Feature](vectorized) support jdbc sink for insert into data to table (#12534 )	2022-09-15 11:08:41 +08:00
carlvinhust2012	33f5a86e69	[fix](array-type) forbid to create materialized view for array column (#12543 ) Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-09-15 11:08:23 +08:00
HappenLee	e413a2b8e9	[Opt](vectorized) Use new way to do hash shuffle to speed up query (#12586 )	2022-09-15 11:08:04 +08:00
Henry2SS	2ac790bf31	[enhancement](statistic) the calculation of routine load statistics is not accurate (#12594 ) Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-15 11:00:57 +08:00
jakevin	6543924790	[fix](Nereids): avoid commute cause dead-loop. (#12616 ) * [fix](Nereids): avoid commute cause dead-loop. * update best plan	2022-09-15 10:47:11 +08:00
Mingyu Chen	8aa5899484	[fix](load) add scan tuple for stream load scan node only when vectorization is enabled (#12578 )	2022-09-15 08:44:39 +08:00
Gabriel	beeb0ef3eb	[Bug](lead) fix wrong child expression of `lead` function (#12587 )	2022-09-15 08:44:18 +08:00
Kikyou1997	d4cb0bbdd5	[test](nereids) Add TPC-H regression test cases for nereids (#12600 ) Some test cases that cannot run successfully are disabled. They will be re-enabled once the corresponding bugs are fixed.	2022-09-14 22:37:56 +08:00
Yongqiang YANG	be0a0200cf	[fix](grpc-java) use pooled stub to call rpc on be instead of one stub (#10439 ) A channel is closed when a timeout or exception happens; if only one stub is used, all queries would then fail. If we don't close the channel, grpc-java sometimes gets stuck without sending any rpc.	2022-09-14 22:30:45 +08:00
Kikyou1997	3543f85ae5	[feature](nereids) merge push down and remove redundant operator rules into one batch (#12569 ) 1. For some related rules, we need to execute them together to get the expected plan. 2. Add session variables to avoid fallback to stale planner when running regression tests of nereids for piggyback.	2022-09-14 14:37:36 +08:00
jakevin	fd0cf78aa7	[fix](Nereids): fix StatsCalculator compute project and correct commute join type. (#12539 )	2022-09-14 10:32:05 +08:00
ChPi	ead016e0d2	[Enhancement](execute) add timeout for executing fragment rpc (#12512 ) Co-authored-by: chenjie <chenjie@cecdat.com>	2022-09-14 09:12:33 +08:00
HappenLee	d913ca5731	[Opt](vectorized) Speed up bucket shuffle join hash compute (#12407 ) * [Opt](vectorized) Speed up bucket shuffle join hash compute	2022-09-13 20:19:22 +08:00
jakevin	9a5be4bab5	[feature](Nereids): Eliminate redundant filter and limit. (#12511 )	2022-09-13 20:08:13 +08:00
TengJianPing	6bf5fc6db5	[improvement](storage) For debugging problems: add session variable `skip_storage_engine_merge` to treat agg and unique data model as dup model (#11952 ) For debug purpose: Add session variable skip_storage_engine_merge, when set to true, tables of aggregate key model and unique key model will be read as duplicate key model. Add session variable skip_delete_predicate, when set to true, rows deleted with delete statement will be selected.	2022-09-13 19:18:56 +08:00
Henry2SS	6a3385437b	[fix](comments) modify comments of setting global variables #12514 Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-13 19:13:57 +08:00
deardeng	b98a3ed86c	[fix](frontend) fix notify update storage policy agent task null exception #12470	2022-09-13 16:20:11 +08:00
Jibing-Li	dc80a993bc	[feature-wip](new-scan) New load scanner. (#12275 ) Related pr: https://github.com/apache/doris/pull/11582 https://github.com/apache/doris/pull/12048 Using new file scan node and new scheduling framework to do the load job, replace the old broker scan node. The load part (Be part) is work in progress. Query part (Fe) has been tested using tpch benchmark. Please review only the FE code in this pr, BE code has been disabled by enable_new_load_scan_node configuration. Will send another pr soon to fix be side code.	2022-09-13 13:36:34 +08:00
jakevin	5b4d3616a4	[feature](Nereids): semi join transpose. (#12515 ) * [feature](Nereids): semi join transpose. * fix conditionChecker and check lasscom	2022-09-13 13:32:47 +08:00
Kikyou1997	d35a8a24a5	[feature](nereids) push down Project through Limit (#12490 ) This rule rewrites project -> limit to limit -> project. The reason is that we can get a tree like project -> limit -> project -> other node. If we do not rewrite it, we cannot merge the two projects into one. And if there is more than one project on one node, the second one overwrites the first one during translation. BE will then core dump or return a "slot cannot find" error.	2022-09-13 13:26:12 +08:00
jakevin	c3d7d4ce7a	[fix](Nereids): fix LAsscom project split. (#12506 )	2022-09-13 12:12:39 +08:00
starocean999	6b52e47805	[fix](agg)the intermediate slots should be materialized as output slots (#12441 ) In some cases, the output slots of agg info may be materialized by calling SlotDescriptor's materializeSrcExpr method, while the intermediate slots are not. This PR sets the materialized info of the intermediate slots to keep them consistent with the output slots.	2022-09-13 11:28:27 +08:00
Stalary	87439e227e	[Enhancement](DOE): Doe support object/nested use string (#12401 ) * MOD: doe support object/nested use string	2022-09-13 09:59:48 +08:00
xy720	b1c2a8343f	[Bug](array_type) Forbid adding array key columns #12479 mysql> desc array_test; +-----------+----------------+------+-------+---------+-------+ \| Field \| Type \| Null \| Key \| Default \| Extra \| +-----------+----------------+------+-------+---------+-------+ \| id \| INT \| Yes \| true \| NULL \| \| \| c_array \| ARRAY<INT(11)> \| Yes \| false \| NULL \| NONE \| +-----------+----------------+------+-------+---------+-------+ Before: mysql> ALTER TABLE array_test ADD COLUMN add_arr_key array<int> key NULL DEFAULT NULL; Query OK, 0 rows affected (0.00 sec) After: mysql> ALTER TABLE array_test ADD COLUMN c_array array<int> key NULL DEFAULT NULL; ERROR 1105 (HY000): errCode = 2, detailMessage = Array can only be used in the non-key column of the duplicate table at present. mysql> ALTER TABLE array_test MODIFY COLUMN c_array array<int> key NULL DEFAULT NULL; ERROR 1105 (HY000): errCode = 2, detailMessage = Array can only be used in the non-key column of the duplicate table at present.	2022-09-13 08:48:28 +08:00
Zhengguo Yang	503a79e4d8	[Bugfix](load) fix be may core dump when load column mapping has function (#12509 ) fix be may core dump when load column mapping has function this bug may be introduced by #12375	2022-09-13 08:44:10 +08:00
Henry2SS	ecfefae715	[enhancement](load) make default load mem limit configurable (#12348 ) * make LoadMemLimit valid for broker load, stream load and routine load Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-12 10:25:01 +08:00
luozenglin	0c260152b7	[fix](profile) fix query instance profile may be lost. (#12418 )	2022-09-09 22:58:04 +08:00
Kikyou1997	f80d7bdd5b	[enhancement](Nereids) add type coercion between decimal and integral (#12482 )	2022-09-09 20:08:03 +08:00
Shuo Wang	2b62ac2fef	[Feature](Nereids) Main framework for selecting rollup index. (#12464 ) # Proposed changes First step of #12303 ## Problem summary This is the first step for supporting rollup index selection for aggregate/unique key OLAP table. This PR aims to select rollup index when the aggregate node is present and the aggregate function matches the value type. So pre-aggregation is turned on by default. Cases that pre-aggregation should be turned off will be addressed in the next PR. Main steps for rollup index selection: 1. filter rollup indexes with all the required columns. 2. filter rollup indexes that match the key prefix most. 3. order the rollup indexes by row count, column count, rollup index id. TODO remaining: 1. address cases that pre-aggregation should be turned off. (next PR) 2. add more test cases. Refactor - Add `Project.getSlotToProducer` to extract a map from the project output slot to its producing expression. - Add `Filter.getConjuncts` to split the filter condition to conjunctive predicates. - Move the usage of `ExpressionReplacer` to `ExpressionUtils.replace(expr, replaceMap)` to simplify the code.	2022-09-09 18:14:31 +08:00
zhengshiJ	dc7e5ca039	[fix](nereids) uncorrelated subquery can't get the correct result (#12421 ) When the current non-correlated subquery is executed, an error will be reported that the corresponding column cannot be found. The reason is that the tupleID of the child obtained in visitPhysicalNestedLoopJoin is not consistent with the child. The non-correlated subquery will trigger this bug because it uses crossJoin. At the same time, sub-query regression tests for non-associative and complex scenarios have been added Co-authored-by: morrySnow <morrysnow@126.com>	2022-09-09 18:08:34 +08:00
jakevin	77b93ebc09	[enhancement](Nereids) add optionalAnd to simplify code (#12497 ) Add optionalAnd to avoid adding True which may make BE crash. Use optional to simplify code.	2022-09-09 15:54:32 +08:00
924060929	6b8a139f2d	[feature](Nereids) Support function registry (#12481 ) Support function registry. The classes: - BuiltinFunctions: contains the built-in functions list - FunctionRegistry: used to register scalar functions and aggregate functions; it can find a function by name - FunctionBuilder: used to resolve a BoundFunction class, extract the constructor, and build a BoundFunction from arguments (`List<Expression>`) Register example: you can add built-in functions to the list for simplicity ```java public class BuiltinFunctions implements FunctionHelper { public final List<ScalarFunc> scalarFunctions = ImmutableList.of( scalar(Substring.class, "substr", "substring"), scalar(WeekOfYear.class), scalar(Year.class) ); public final ImmutableList<AggregateFunc> aggregateFunctions = ImmutableList.of( agg(Avg.class), agg(Count.class), agg(Max.class), agg(Min.class), agg(Sum.class) ); } ``` Note: - Currently, we only support registering scalar functions and aggregate functions; registering table functions will be supported later. - Currently, we only resolve functions by name and arity, and cannot resolve overloads with the same arity, e.g. `some_function(Expression)` and `some_function(Literal)`	2022-09-09 15:19:45 +08:00
morrySnow	c9a6486f8c	[fix](Nereids) subquery predicate's slot appears in having's output by mistake (#12494 ) When an uncorrelated subquery is used in having predicates, having's output contains one slot from the subquery by mistake. This PR fixes it by always adding a project on top of having. Co-authored-by: mch_ucchi <organic_chemistry@foxmail.com>	2022-09-09 11:52:56 +08:00
xy720	73351917ab	[Enhancement](array-type) Add readable information in subquery for array type #12463	2022-09-09 11:17:50 +08:00
morrySnow	a04f9814fe	[fix](Nereids) column prune generate empty project list on join's child (#12486 ) * [fix](Nereids) column prune generate empty project list on join's child	2022-09-09 10:43:57 +08:00
zy-kkk	a468085efe	[improvement](error info)improve the s3 path err msg #12438	2022-09-09 09:14:24 +08:00
TengJianPing	b45a8379eb	[bugfix](odbc) escape identifiers for sqlserver and postgresql (#12487 ) The delimited identifier format for SQL Server and PostgreSQL is different from MySQL. SQL Server uses brackets ([ ]) and PostgreSQL uses double quotes ("").	2022-09-09 09:11:03 +08:00
Mingyu Chen	e84272ed43	[improvement](planner) unset common fields to reduce plan thrift size (#12495 ) 1. For a query with 1656 unions, the plan thrift size is reduced from 400MB+ to 2MB. This optimization was introduced in #4904 but lost after #9720. 2. Disable ExprSubstitutionMap.verify when debug is disabled, so that the plan time of the query with 1656 unions is reduced from 20s to 2s.	2022-09-09 09:02:45 +08:00
morrySnow	d2a23a4cf9	[enhancement](Nereids) change aggregate and join stats calc algorithm (#12447 ) The original statistics derivation algorithm relies on NDV and other column statistics, but we cannot get these stats in a production environment. This PR changes the stats calculation for these operators to use a DEFAULT RATIO variable instead of column statistics. We should change these algorithms once column stats are available in a production environment.	2022-09-09 01:00:07 +08:00
Kikyou1997	b4f0f39e77	[feature](Nereids) implement uncheckedCast method in VarcharLiteral (#12468 ) Implement uncheckedCast on VarcharLiteral as a temporary way to let TimestampArithmetic work. We should remove this code and do the implicit cast in the TypeCoercion rule in the future.	2022-09-09 00:33:37 +08:00
jakevin	8478efad44	[improve](Nereids): check same logicalProperty when insert a Group. (#12469 )	2022-09-09 00:00:11 +08:00
qiye	85bd297777	[feature](function)Support function "current_date" in FE (#11702 ) Issue Number: close #11699	2022-09-08 16:00:57 +08:00
Kikyou1997	d1ab6b1db2	[enhancement](nereids) add syntax support for fractional literal (#12444 ) Like the legacy planner, Nereids parses all fractional literals as decimal. In the future, we will add more syntax to let users control the fractional literal type.	2022-09-08 15:54:20 +08:00
jakevin	7c7ac86fe8	[feature](Nereids): Left deep tree join order. (#12439 ) * [feature](Nereids): Left deep tree join order.	2022-09-08 15:09:22 +08:00
lihuigang	491dd34ba7	[fix](planner) fix orthogonal_bitmap_union_count plan : wrong PREAGGREGATION (#12095 ) Execution plan display when using orthogonal_bitmap_union_count function: PREAGGREGATION: OFF Reason: Invalid Aggregate Operator: orthogonal_bitmap_union_count The correct plan is: PREAGGREGATION: ON Co-authored-by: lihuigang <lihuigang@meituan.com>	2022-09-08 15:00:43 +08:00
Henry2SS	461a4cc94e	[Enhancement](Error Msg) show details of COLUMN and TABLE name regex #11999 Co-authored-by: wuhangze <wuhangze@jd.com>	2022-09-08 14:59:39 +08:00
Tiewei Fang	824a192f8f	[enhancement](http) executeSQL rest api support streaming response (#12239 )	2022-09-08 14:57:15 +08:00
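
The three rollup index selection steps listed in commit 2b62ac2fef (keep indexes that contain all required columns, keep those matching the longest key prefix, then order by row count, column count and index id) might be sketched roughly as follows. This is an illustration only, not the Nereids implementation: the `Index` class, the `select` and `matchedKeyPrefix` helpers, the use of predicate columns to measure the key prefix match, and the ascending sort order are all assumptions made for this example.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.stream.Collectors;

public class RollupSelectionSketch {
    /** Hypothetical stand-in for a rollup index; not a Doris class. */
    public static final class Index {
        public final long id;
        public final List<String> keyColumns;   // key columns in index order
        public final List<String> valueColumns;
        public final long rowCount;

        public Index(long id, List<String> keyColumns, List<String> valueColumns, long rowCount) {
            this.id = id;
            this.keyColumns = keyColumns;
            this.valueColumns = valueColumns;
            this.rowCount = rowCount;
        }

        public int columnCount() {
            return keyColumns.size() + valueColumns.size();
        }

        public boolean containsAll(Set<String> required) {
            Set<String> all = new HashSet<>(keyColumns);
            all.addAll(valueColumns);
            return all.containsAll(required);
        }
    }

    /** Number of leading key columns that appear in the predicate columns (assumed metric). */
    public static int matchedKeyPrefix(Index index, Set<String> predicateColumns) {
        int matched = 0;
        for (String key : index.keyColumns) {
            if (!predicateColumns.contains(key)) {
                break;
            }
            matched++;
        }
        return matched;
    }

    public static Optional<Index> select(List<Index> candidates,
                                         Set<String> requiredColumns,
                                         Set<String> predicateColumns) {
        // Step 1: keep only indexes that contain every required column.
        List<Index> covering = candidates.stream()
                .filter(i -> i.containsAll(requiredColumns))
                .collect(Collectors.toList());

        // Step 2: find the longest key prefix match among the remaining indexes.
        int best = covering.stream()
                .mapToInt(i -> matchedKeyPrefix(i, predicateColumns))
                .max()
                .orElse(0);

        // Step 3: among the best matches, order by row count, column count, index id
        // (ascending order is an assumption) and take the first.
        return covering.stream()
                .filter(i -> matchedKeyPrefix(i, predicateColumns) == best)
                .min(Comparator.comparingLong((Index i) -> i.rowCount)
                        .thenComparingInt(Index::columnCount)
                        .thenComparingLong(i -> i.id));
    }
}
```

Using the index id as the final sort key gives a deterministic result when two candidates tie on row count and column count.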


2755 Commits
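
The identifier delimiters described in commit b45a8379eb (brackets for SQL Server, double quotes for PostgreSQL, as opposed to MySQL's backticks) could be illustrated with a small sketch like the one below. The `IdentifierQuotingSketch` class, the `Dialect` enum and the `quote` method are hypothetical names for this example and are not taken from the Doris ODBC code; doubling an embedded closing delimiter is the usual escaping rule in each dialect, but whether the commit handles that case is not stated in the log.

```java
public class IdentifierQuotingSketch {
    /** Hypothetical dialect enum for this example only. */
    public enum Dialect { MYSQL, SQLSERVER, POSTGRESQL }

    /** Wraps an identifier in the dialect's delimiters, doubling any embedded closing delimiter. */
    public static String quote(Dialect dialect, String identifier) {
        switch (dialect) {
            case SQLSERVER:
                // SQL Server: [name], with ] escaped as ]]
                return "[" + identifier.replace("]", "]]") + "]";
            case POSTGRESQL:
                // PostgreSQL: "name", with " escaped as ""
                return "\"" + identifier.replace("\"", "\"\"") + "\"";
            case MYSQL:
            default:
                // MySQL: `name`, with ` escaped as ``
                return "`" + identifier.replace("`", "``") + "`";
        }
    }

    public static void main(String[] args) {
        System.out.println(quote(Dialect.SQLSERVER, "order"));
        System.out.println(quote(Dialect.POSTGRESQL, "user"));
        System.out.println(quote(Dialect.MYSQL, "key"));
    }
}
```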