1. Add support for statements that have an aggregate function in the ORDER BY list, e.g.:
SELECT COUNT(*) FROM t GROUP BY c1 ORDER BY COUNT(*) DESC;
2. Add ClickBench analyze unit tests.
This PR does three things:
1. Modified the framework of table-valued functions (TVF).
2. Supported the `fetch_table_schema` RPC on BE.
3. Implemented the `S3(path, AK, SK, format)` table-valued function (sketched below).
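A minimal usage sketch, assuming the positional `S3(path, AK, SK, format)` signature above and CSV input (bucket, keys, and path below are placeholders, not from this PR):
```
-- Query a CSV file on S3 through the new table-valued function:
SELECT * FROM S3(
    "s3://my-bucket/path/to/data.csv",  -- path (placeholder)
    "my_access_key",                    -- AK (placeholder)
    "my_secret_key",                    -- SK (placeholder)
    "csv"                               -- format
);
```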
[What is DLF](https://www.alibabacloud.com/product/datalake-formation)
This PR is a preparation for supporting DLF, with some changes to multi-catalog:
1. Add RuntimeException handling for most Hive Metastore and ES client access operations.
2. Add DLF-related dependencies.
3. Move the checks of ES catalog properties to the analysis phase of creating an ES catalog (see the sketch after this list).
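For item 3, a hedged sketch of the effect: an ES catalog with invalid properties should now fail while the CREATE CATALOG statement is analyzed, instead of later when the catalog is first accessed (property values below are illustrative):
```
-- Property checks (e.g. a malformed "hosts" value) now run at analysis time:
CREATE CATALOG es_catalog PROPERTIES (
    "type" = "es",
    "hosts" = "http://127.0.0.1:9200",
    "user" = "root",
    "password" = "root"
);
```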
TODO(in next PR):
1. Refactor the `getSplit` method to support not only HDFS, but also S3-compatible object storage.
2. Finish the implementation of DLF support.
Persist external catalogs/dbs/tables, including the columns of external tables.
After this change, external objects can have their own unique IDs throughout their lifetimes,
which is required for statistics collection.
Add a rule to check the permissions of the user executing a query: forbid users who do not have SELECT_PRIV on a table from running queries against it.
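A hedged illustration of the new check (user, database, and table names are made up):
```
-- u1 is granted only LOAD_PRIV, not SELECT_PRIV, on db1.tbl1:
GRANT LOAD_PRIV ON db1.tbl1 TO 'u1'@'%';
-- Running the following as u1 is now rejected by the rule:
SELECT * FROM db1.tbl1;
```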
Support running transactional insert operations with the new scan framework, e.g.:
```
admin set frontend config("enable_new_load_scan_node" = "true");
begin;
insert into tbl1 values(1,2);
insert into tbl1 values(3,4);
insert into tbl1 values(5,6);
commit;
```
Add some limitations to transactional insert:
Non-literal values are not supported in insert statements.
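A hedged example of a statement that would be rejected under this limitation, since now() is not a literal (table name as in the example above):
```
-- Inside a transaction, only literal values are allowed:
insert into tbl1 values(1, now());  -- rejected: now() is not a literal
```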
Fix some issues with the array type:
- Forbid casting other non-array types to a NESTED array type, which may cause a BE crash.
- Add a getStringValueForArray() method to Expr to get a valid string-formatted array value.
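A hedged sketch of the first change, assuming Doris's CAST-to-array syntax (the value is illustrative):
```
-- Casting a non-array value to a NESTED array type is now forbidden
-- instead of potentially crashing the BE:
SELECT CAST("1" AS ARRAY<ARRAY<INT>>);  -- now rejected
```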
Add useLocalSessionState=true to the regression-test JDBC URL.
Without this config, the JDBC driver sends some init commands each time it connects to the server, such as
select @@session.tx_read_only.
But when we use transactional insert, after the begin command Doris does not support any statement type
other than insert, commit, or rollback.
So this config is added so that the JDBC driver does NOT send those commands when connecting.
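A hedged sketch of the resulting URL (host, port, and database are placeholders):
```
jdbc:mysql://127.0.0.1:9030/regression_test?useLocalSessionState=true
```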
Support common table expressions (CTE) in Nereids:
- Only inline CTE is implemented, which means we copy the logicalPlan of the CTE everywhere it is referenced (see the sketch after this list);
- If the name of a CTE is the same as that of an existing table or view, the CTE takes precedence;
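A brief sketch (names are made up): both references to cte1 below are replaced by copies of its logical plan, and if a table named cte1 also existed, the CTE definition would be chosen:
```
WITH cte1 AS (SELECT c1 FROM t)
SELECT a.c1
FROM cte1 a JOIN cte1 b ON a.c1 = b.c1;
```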
`BackendServiceProxy.getInstance()` uses a round-robin strategy to obtain a proxy,
so when the current RPC request fails, the proxy removed by
`BackendServiceProxy.getInstance().removeProxy(...)` is not the abnormal one.
Proposed changes
1. Add function interfaces that can search for the matched signature, namely ComputeSignature. They correspond to Function.CompareMode:
- IdenticalSignature: equal to Function.CompareMode.IS_IDENTICAL
- NullOrIdenticalSignature: equal to Function.CompareMode.IS_INDISTINGUISHABLE
- ImplicitlyCastableSignature: equal to Function.CompareMode.IS_SUPERTYPE_OF
- ExplicitlyCastableSignature: equal to Function.CompareMode.IS_NONSTRICT_SUPERTYPE_OF
3. Generate lots of scalar functions.
4. Bug fix: the disassembled avg function computed a wrong result because of a wrong input type; AggregateParam.inputTypesBeforeDissemble is used to save the original input type and pass it to the backend to find the correct global aggregate function.
5. Bug fix: a subquery with OneRowRelation crashed because of a wrong nullable property.
Note:
1. There are currently no unit or regression tests for the scalar functions; I will add tests when migrating the aggregate functions for unified processing.
2. A known problem is that variable-length functions cannot be invoked; I will fix this later.
Refactor eliminate outer join #12985
Evaluate the expression with ConstantFoldRule. If the evaluation result is NULL or FALSE, then the elimination condition is satisfied.
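For example (a hedged sketch with made-up table and column names): folding the predicate with an all-NULL row substituted for t2's side yields NULL, so the condition is satisfied and the outer join can be reduced:
```
-- The WHERE predicate filters out any NULL-padded t2 rows,
-- so this LEFT JOIN can be rewritten as an INNER JOIN:
SELECT t1.id, t2.c1
FROM t1 LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.c1 > 0;
```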
After a project, some slots may be projected to other ones, so we need to replace the ExprId in DistributionSpecHash with the new one. If the project does anything other than Alias, we need to return DistributionSpecAny instead of the child's DistributionSpec.
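A hedged illustration (names are made up): if the child is hash-distributed by c1, a pure Alias keeps the hash spec with a remapped ExprId, while any other computation degrades to DistributionSpecAny:
```
-- Pure Alias: the hash distribution on c1 carries over to c2,
-- with the ExprId in DistributionSpecHash replaced accordingly.
SELECT c1 AS c2 FROM t;
-- Non-Alias expression: the child's hash spec no longer describes
-- the output, so DistributionSpecAny is returned.
SELECT c1 + 1 AS c2 FROM t;
```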
This rule eliminates a Project whose output set is the same as its child's. If the Project is the root of the plan, the elimination condition is that the Project's output is exactly the same as its child's.
The reason for adding this rule: when we do join reorder during optimization, the root of the transformed plan may be a Project whose output set is the same as that of the root before the transformation. If we also had a Project on top of the root with the same output set, we would have two identical Projects in the memo, one the parent of the other. After MergeProjects, we get a new Project exactly like the child, which needs to be added to the parent's group and therefore triggers a group merge. Since that merge would produce a cycle, it is denied, and we end up with a final plan containing two consecutive Projects.
## For example:
**BEFORE OPTIMIZATION**
```
LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1]
+--LogicalJoin(type=LEFT_SEMI_JOIN) [GroupId#2]
|--LogicalProject(...)
| +--LogicalJoin(type=INNER_JOIN)
| ...
+--LogicalOlapScan(...)
```
**AFTER APPLY RULE: LOGICAL_SEMI_JOIN_LOGICAL_JOIN_TRANSPOSE_PROJECT**
```
LogicalProject1( projects=[c_custkey#0, c_name#1]) [GroupId#1]
+--LogicalProject2( projects=[c_custkey#0, c_name#1]) [GroupId#2]
+--LogicalJoin(type=INNER_JOIN) [GroupId#10]
|--LogicalProject(...)
| +--LogicalJoin(type=LEFT_SEMI_JOIN)
| ...
+--LogicalOlapScan(...)
```
**AFTER APPLY RULE: MERGE_PROJECTS**
```
LogicalProject3( projects=[c_custkey#0, c_name#1]) [should be in GroupId#1, but in GroupId#2 in fact]
+--LogicalJoin(type=INNER_JOIN) [GroupId#10]
|--LogicalProject(...)
| +--LogicalJoin(type=LEFT_SEMI_JOIN)
| ...
+--LogicalOlapScan(...)
```
Since we have exactly the same GroupExpression (LogicalProject3 and LogicalProject2) in GroupId#1 and GroupId#2, we need to do MergeGroup(GroupId#1, GroupId#2). But a child of GroupId#1 is in GroupId#2, so the merge is denied.
If the best GroupExpression in GroupId#2 is LogicalProject3, we will get two consecutive Projects in the final plan.