doris

Author	SHA1	Message	Date
morrySnow	31d8fdd9e4	[fix](Nereids) finalize local aggregate should not turn on stream pre agg (#13922 )	2022-11-03 11:08:06 +08:00
starocean999	a4a991207b	[fix](agg)fix group by constant value bug (#13827 ) * [fix](agg)fix group by constant value bug * keep only one const grouping exprs if no agg exprs	2022-11-03 10:26:59 +08:00
Zhengguo Yang	b3c6af0059	[Bugfix](MV) Fixed load negative values into bitmap type materialized views successfully under non-vectorization (#13719 ) * [Bugfix](MV) Fixed load negative values into bitmap type materialized views successfully under non-vectorization	2022-11-03 09:21:38 +08:00
Mingyu Chen	7b4c2cabb4	[feature](new-scan) support transactional insert in new scan framework (#13858 ) Support running transactional insert operations with the new scan framework, e.g.: admin set frontend config("enable_new_load_scan_node" = "true"); begin; insert into tbl1 values(1,2); insert into tbl1 values(3,4); insert into tbl1 values(5,6); commit; Add some limitations to transactional insert: non-literal values in the insert stmt are not supported. Fix some issues with the array type: forbid casting other non-array types to a NESTED array type, as it may cause a BE crash. Add a getStringValueForArray() method to Expr to get a valid string-formatted array type value. Add useLocalSessionState=true to the regression-test jdbc url. Without this config, the jdbc driver sends some init commands each time it connects to the server, such as select @@session.tx_read_only. But with transactional insert, after the begin command, Doris does not support any type of stmt other than insert, commit or rollback. So this config is added to keep the jdbc driver from sending commands when connecting.	2022-11-03 08:36:07 +08:00
Fy	e021705053	[feature](nereids) support common table expression (#12742 ) Support common table expression (CTE) in Nereids: - Just implemented inline CTE, which means we will copy the logicalPlan of the CTE everywhere it is referenced; - If the name of the CTE is the same as an existing table or view, we will choose the CTE first;	2022-11-02 23:41:53 +08:00
Mingyu Chen	0ea7f85986	[fix](keyword) add BIN as keyword (#13907 )	2022-11-02 22:30:43 +08:00
mch_ucchi	53814e466b	[Enhancement](Nereids)optimize merge group in memo #13900	2022-11-02 20:42:55 +08:00
zhangstar333	374303186c	[Vectorized](function) support topn_array function (#13869 )	2022-11-02 19:49:23 +08:00
ZenoYang	b26d8f284c	[fix](rpc) The proxy removed when rpc exception occurs is not an abnormal proxy (#13836 ) `BackendServiceProxy.getInstance()` uses the round robin strategy to obtain the proxy, so when the current RPC request is abnormal, the proxy removed by `BackendServiceProxy.getInstance().removeProxy(...)` is not an abnormal proxy.	2022-11-02 19:39:33 +08:00
924060929	6eea855e78	[feature](Nereids) Support lots of scalar function and fix some bug (#13764 ) Proposed changes 1. function interfaces that can search for the matched signature, say ComputeSignature. It is equivalent to Function.CompareMode. - IdenticalSignature: equal to Function.CompareMode.IS_IDENTICAL - NullOrIdenticalSignature: equal to Function.CompareMode.IS_INDISTINGUISHABLE - ImplicitlyCastableSignature: equal to Function.CompareMode.IS_SUPERTYPE_OF - ExplicitlyCastableSignature: equal to Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF 3. generate lots of scalar functions 4. bug-fix: disassembling the avg function computed a wrong result because of the wrong input type; AggregateParam.inputTypesBeforeDissemble is used to save the original input type and pass it to the backend to find the correct global aggregate function. 5. bug-fix: a subquery with OneRowRelation crashed because of a wrong nullable property. Note: 1. currently there are no more unit tests/regression tests for the scalar functions; I will add the tests once the aggregate functions are migrated for unified processing. 2. A known problem is that variable-length functions cannot be invoked; I will fix it later.	2022-11-02 18:01:08 +08:00
shee	a871fef815	[Improve](Nereids): refactor eliminate outer join (#13402 ) Refactor eliminate outer join #12985 Evaluate the expression with ConstantFoldRule. If the evaluation result is NULL or FALSE, then the elimination condition is satisfied.	2022-11-02 17:39:05 +08:00
morrySnow	1bafb26217	[fix](Nereids) throw NPE when call getOutputExprIds in LogicalProperties (#13898 )	2022-11-02 16:52:18 +08:00
morrySnow	699ffbca0e	[enhancement](Nereids) generate correct distribution spec after project (#13725 ) After a project, some Slots may be projected to other ones, so we need to replace the ExprId in DistributionSpecHash with the new one. If the project does something other than Alias, we need to return DistributionSpecAny rather than the child's DistributionSpec.	2022-11-02 16:50:44 +08:00
xueweizhang	f2a0adf34e	[fix](fe) Inconsistent behavior for string comparison in FE and BE (#13604 )	2022-11-02 15:32:13 +08:00
morrySnow	6f3db8b4b4	[enhancement](Nereids) add eliminate unnecessary project rule (#13886 ) This rule eliminates a project whose output set is the same as its child's. If the project is the root of the plan, the elimination condition is that the project's output is exactly the same as its child's. The reason to add this rule is that when we do join reorder in optimization, the root of the plan after the transformation may be a Project whose output set is the same as that of the root of the plan before the transformation. If we had a Project on top of the root whose output set is also the same as the root's, we will have two exactly identical projects in the memo, one of them the parent of the other. After MergeProject, we get a new Project exactly the same as the child, which needs to be added to the parent's group. Then we trigger Merge Group. Since the merge would produce a cycle, it is denied, and we get a final plan with two consecutive projects. ## for example: BEFORE OPTIMIZATION ``` LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1] +--LogicalJoin(type=LEFT_SEMI_JOIN) [GroupId#2] |--LogicalProject(...) | +--LogicalJoin(type=INNER_JOIN) | ... +--LogicalOlapScan(...) ``` AFTER APPLY RULE: LOGICAL_SEMI_JOIN_LOGICAL_JOIN_TRANSPOSE_PROJECT ``` LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1] +--LogicalProject2( projects=[c_custkey#0, c_name#1]) [GroupId#2] +--LogicalJoin(type=INNER_JOIN) [GroupId#10] |--LogicalProject(...) | +--LogicalJoin(type=LEFT_SEMI_JOIN) | ... +--LogicalOlapScan(...) ``` AFTER APPLY RULE: MERGE_PROJECTS ``` LogicalProject3( projects=[c_custkey#0, c_name#1]) [should be in GroupId#1, but in GroupId#2 in fact] +--LogicalJoin(type=INNER_JOIN) [GroupId#10] |--LogicalProject(...) | +--LogicalJoin(type=LEFT_SEMI_JOIN) | ... +--LogicalOlapScan(...) ``` Since we have exactly the same GroupExpression (LogicalProject3 and LogicalProject2) in GroupId#1 and GroupId#2, we need to do MergeGroup(GroupId#1, GroupId#2). But a child of GroupId#1 is in GroupId#2, so the merge is denied. If the best GroupExpression in GroupId#2 is LogicalProject3, we will get two consecutive projects in the final plan.	2022-11-02 14:16:03 +08:00
Mingyu Chen	ee8dffbfb7	[meta](recover) change dropInfo and RecoverInfo to GSON (#13830 )	2022-11-02 13:32:46 +08:00
Mingyu Chen	d5becdb4a1	[fix](dynamic-partition) fix wrong check of replication num (#13755 )	2022-11-02 12:55:33 +08:00
wxy	947e67fa76	[enhancement](test) retry start be or fe when port has been bind. (#13860 ) Co-authored-by: wangxiangyu@360shuke.com <wangxiangyu@360shuke.com>	2022-11-02 08:42:35 +08:00
Mingyu Chen	0eeb4d2881	[minor](log) remove some e.printStackTrace() (#13870 )	2022-11-02 08:42:10 +08:00
minghong	7f34698eef	[enhancement](Nereids) use join estimation v2 only when stats derive v2 is enable (#13845 ) join estimation V2 should be invoked when enableNereidsStatsDeriveV2=true	2022-11-01 20:38:39 +08:00
minghong	f0c9867af3	[fix](nereids) map literal to double in FilterSelectivityCalculator (#13776 ) fix literal to double bug: all literal type implements getDouble() function	2022-11-01 20:20:44 +08:00
morrySnow	01f9f8ad43	[enhancement](Nereids) add merge project rule to column prune rule set (#13835 ) when we do column prune, we add project on child plan. If child plan is Project. we need to merge them.	2022-11-01 20:17:53 +08:00
qiye	61c817f4cc	[feature](syntax) support SELECT * EXCEPT (#13844 ) * [feature](syntax) support SELECT * EXCEPT: add regression test	2022-11-01 19:41:25 +08:00
minghong	1eef986e75	[feature](nereids) add rule for semi/anti join exploration, when there is project between them (#13756 )	2022-11-01 19:07:25 +08:00
TengJianPing	c14277e587	[fix](analytic) fix coredump cause by empty analytic parameter types (#13808 ) * fix fe compile error	2022-11-01 17:25:36 +08:00
jakevin	83e55cade8	[feature](Nereids): add rule for matching plan into HyperGraph. (#13805 )	2022-11-01 14:57:25 +08:00
morrySnow	34e68a41dd	[enhancement](explain) add cardinality to explain string and explain graph (#13720 ) 1. set cardinality when translate Nereids plan to legacy planner's plan 2. print cardinality when use EXPLAIN GRAPH	2022-11-01 11:43:21 +08:00
morrySnow	b27714542d	[fix](planner) infer predicate could generate predicates in another scope (#13691 ) * [fix](planner) infer predicate could generate predicates in another scope	2022-11-01 09:03:41 +08:00
jakevin	36a47dfe16	[enhancement](Nereids): use ImmutableList explicitly in Plan (#13817 )	2022-10-31 20:23:30 +08:00
minghong	18be77af64	[fix](nereids) query cannot execution when both nereids enable and fallback to legacy planner are set to false (#13787 ) when enable_nereids_planner=false and enable_fallback_to_origin=false, FE throws exception for all select statement. Expected: when enable_nereids_planner=false, all valid query execution success	2022-10-31 19:02:01 +08:00
xueweizhang	ba177a15cb	[feature-wip](recover) new recover ddl and support show catalog recycle bin (#13067 )	2022-10-31 17:44:56 +08:00
jakevin	ceb7b60a64	[fix](Nereids) update immutable LogicalAggregate attribute by mistake (#13740 )	2022-10-31 14:11:55 +08:00
starocean999	53e5f3939e	[fix](plan)result exprs should be substituted in the same way as agg exprs (#13744 ) * [fix](cast)ignore implicit cast when comparing two exprs * fix fe ut	2022-10-31 10:19:32 +08:00
luozenglin	61b7c2c96c	[fix](join) fix incorrect result when using anti join with other join predicates (#13743 )	2022-10-31 09:51:34 +08:00
Mingyu Chen	efe813ba60	[fix](test)(explain) add full qualified name for scan node explain string (#13777 ) 1. In the "explain" result of SQL, the table name in `ScanNode` should be full qualified with dbname. And for olap scan node, the selected index name should not be "null". 2. Remove `tpch_sf1_p1/tpch_sf1/nereids/` in regression test, it will be fixed later.	2022-10-30 13:24:48 +08:00
谢健	2a5d3dbb6e	feat(nereids): draw hyper graph by graphviz (#13749 )	2022-10-28 17:23:35 +08:00
Ashin Gau	e0667b297f	[feature-wip](multi-catalog) reuse hdfsFs and decode parquet values in batch (#13688 ) PR(https://github.com/apache/doris/pull/13404) made ParquetReader break up batch insertion when encountering null values, which leads to bad performance compared to OrcReader. So this PR pushes the null map into the decode function, reducing the time spent on virtual function calls when encountering null values. Furthermore, it reuses hdfsFS among file readers to reduce the time spent building connections to hdfs.	2022-10-28 15:52:52 +08:00
Pxl	2fab0c45c7	[Feature](runtime-filter) add runtime filter breaking change adapt (#13246 ) add runtime filter breaking change adapt	2022-10-28 10:59:28 +08:00
Ashin Gau	45b31506c7	[improvement](delete) support delete from partitioned table without partition specified (#13533 ) Support delete from partitioned table without partition specified in [DELETE] stmt. ## Usage If it is a partitioned table, you can specify a partition. If not specified, Doris will infer partition from the given conditions. In two cases, Doris cannot infer the partition from conditions: 1) the conditions do not contain partition columns; 2) The operator of the partition column is `not in`. When a partition table does not specify the partition, or the partition cannot be inferred from the conditions, the session variable `delete_without_partition` needs to be `true` to make delete statement be applied to all partitions. ## Test case Test case is added in `regression-test/suites/delete_p0/test_delete_from_partition.groovy`, user can delete from partitioned table without partition specified now.	2022-10-27 21:32:45 +08:00
huangzhaowei	ec86e9c9b2	[feature-wip][MTMV] The schedule framework for the MTMV (#13147 ) Design document: https://github.com/apache/doris/issues/13146	2022-10-27 11:37:24 +08:00
谢健	0e70d681d9	[feature](Nereids): Construct join graph (#13679 ) * feat: add hypergraph and its api * feat: add visulization api Signed-off-by: xiejiann <jianxie0@gmail.com> * remove unused code Signed-off-by: xiejiann <jianxie0@gmail.com> * fix format Signed-off-by: xiejiann <jianxie0@gmail.com> * remove unused test Signed-off-by: xiejiann <jianxie0@gmail.com> * remove unused tests Signed-off-by: xiejiann <jianxie0@gmail.com> * format Signed-off-by: xiejiann <jianxie0@gmail.com> Signed-off-by: xiejiann <jianxie0@gmail.com>	2022-10-27 11:32:31 +08:00
DongLiang-0	2697f72d77	[Improvement][SET-PROPERTY] Support for set query_timeout property (#13444 )	2022-10-27 10:03:39 +08:00
Mingyu Chen	7557980d64	[improvement](regression-test) avoid query empty result after loading finished (#13682 ) When running regression tests, we often found that a query returned an empty result after loading finished, even if we called "sync" before the query. This is because for `stream load`, the load task result is returned immediately after the txn's status changes to VISIBLE, but before the edit log is written. So if we run the query right after we get the load task result, it is possible that we cannot see the latest loaded data. The same issue applies to the `insert` operation.	2022-10-27 09:47:18 +08:00
Mingyu Chen	5bd66243ee	[minor](log) remove some unused logs (#13689 ) 1. When running regression test with specific suites or group, do not print other suite name or file name 2. Remove unused alter table job log.	2022-10-27 09:37:32 +08:00
minghong	ddb27b9c3f	nereids use decimal(27,9) (#13678 )	2022-10-26 21:37:24 +08:00
minghong	f4c8d4ce85	[feature](nereids) estimate plan cost by column ndv and table row count (#13375 ) In this version, we use column ndv information to estimate plan cost. This is the first version, covers TPCH queries.	2022-10-26 20:35:10 +08:00
camby	bed759b3f5	[Fix](array-type) support CTAS for ARRAY column from collect_list and collect_set (#13627 ) Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2022-10-26 19:42:15 +08:00
Zhengguo Yang	0841c5bf28	[Bugfix](manager) fix query profile key incompatible with old versions (#13596 )	2022-10-26 14:27:58 +08:00
luozenglin	3548d0b824	[fix](statistics) fix cross join statistics exception (#13645 )	2022-10-26 14:10:57 +08:00
Tiewei Fang	c418bbd2d1	[feature-wip](new-scan) support Json reader (#13546 ) Issue Number: close #12574 This pr adds `NewJsonReader` which implements GenericReader interface to support read json format file. TODO: 1. modify `_scann_eof` later. 2. Rename `NewJsonReader` to `JsonReader` when `JsonReader` is deleted.	2022-10-26 12:52:21 +08:00
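
The round-robin problem described in commit b26d8f284c can be illustrated with a small model. This is only a sketch in Python, not Doris code: `Proxy`, `ProxyPool`, the two handler functions and the address `be1:8060` are hypothetical names chosen for the illustration, and the "fixed" handler shows one plausible remedy (keeping a reference to the proxy that was actually used), which may differ from what the commit does.

```python
class Proxy:
    """Hypothetical stand-in for one RPC proxy holding per-backend clients."""

    def __init__(self, name, addresses):
        self.name = name
        self.clients = {addr: f"client-{name}-{addr}" for addr in addresses}

    def remove_proxy(self, address):
        # Drop the cached client for this backend address, if present.
        self.clients.pop(address, None)


class ProxyPool:
    """Hands out proxies in round-robin order on every get_instance() call."""

    def __init__(self, size, addresses):
        self.proxies = [Proxy(f"proxy-{i}", addresses) for i in range(size)]
        self._next = 0

    def get_instance(self):
        proxy = self.proxies[self._next % len(self.proxies)]
        self._next += 1
        return proxy


def on_rpc_error_buggy(pool, failed_proxy, address):
    # A second get_instance() call advances the round robin, so the client
    # is removed from the NEXT proxy, not from the one that failed.
    pool.get_instance().remove_proxy(address)


def on_rpc_error_fixed(pool, failed_proxy, address):
    # Clean up the proxy that actually served the failed request.
    failed_proxy.remove_proxy(address)


def demo(handler):
    pool = ProxyPool(size=3, addresses=["be1:8060"])
    used = pool.get_instance()       # the proxy whose RPC "fails"
    handler(pool, used, "be1:8060")
    # Names of the proxies that lost their client for the address.
    return [p.name for p in pool.proxies if "be1:8060" not in p.clients]
```

Calling `demo(on_rpc_error_buggy)` shows the cleanup landing on a proxy other than the one that served the request, while `demo(on_rpc_error_fixed)` cleans up the proxy that was used.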


1904 Commits