doris

Author	SHA1	Message	Date
Gabriel	184cee2d2b	[Bug](outfile) Fix wrong decimal format for ORC (#14124 )	2022-11-10 11:01:30 +08:00
Tiewei Fang	43eb946543	[feature](table-valued-function)S3 table valued function supports parquet/orc/json file format #14130 S3 table valued function supports parquet/orc/json file format. For example: parquet format	2022-11-10 10:33:12 +08:00
Jerry Hu	10df61b5bf	[improvement](join) Share hash table in fragments for broadcast join (#13921 )	2022-11-10 09:48:34 +08:00
zhangstar333	df622d8b7d	[Bug](udf) fix java-udaf process string type error and add some tests (#14106 )	2022-11-10 09:30:57 +08:00
mch_ucchi	3117ac9289	[enhancement](Nereids) use post-order to generate runtime filter in RuntimeFilterGenerator (#13949 ) Change the runtime filter generator from pre-order to post-order traversal. This may change the number of generated runtime filters, so the unit tests are corrected accordingly.	2022-11-09 14:28:49 +08:00
Tiewei Fang	b74d0a4747	[feature](table-valued-function) Support `desc from s3()` and modify the syntax of tvf (#14047 ) This PR does two things: 1. Support desc function s3(). 2. Modify the syntax of tvf.	2022-11-09 14:12:43 +08:00
morrySnow	84bb82acc0	[fix](Nereids) aggregate disassemble generate error output list on GLOBAL phase aggregate (#14079 ) we must use localAggregateFunction as key of globalOutputSMap, because we use local output exprs to generate global output in disassembleDistinct	2022-11-09 13:43:12 +08:00
jakevin	b144d2b4f4	[improve](Nereids): remove redundant code, add annotation in Memo. (#14083 )	2022-11-09 13:39:20 +08:00
morrySnow	aff62655c4	[feature](Nereids) binding slot in order by that not show in project (#14042 ) 1. Bind slots in ORDER BY that do not appear in the projection, such as: SELECT c1 FROM t WHERE c2 > 0 ORDER BY c3 2. Do not check for unbound slots when binding slot references; do it in the analysis check instead.	2022-11-09 13:25:41 +08:00
xueweizhang	572f491756	[fix](ctas) text column type len = 1 when create table as select (#13906 ) Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2022-11-09 09:09:34 +08:00
Kang	151842a1fe	[feature](inverted index)WIP inverted index api: SQL syntax and metadata (#13430 ) Introduce a SQL syntax for creating inverted index and related metadata changes. ``` -- create table with INVERTED index CREATE TABLE httplogs ( ts datetime, clientip varchar(20), request string, status smallint, size int, INDEX idx_size (size) USING INVERTED, INDEX idx_status (status) USING INVERTED, INDEX idx_clientip (clientip) USING INVERTED PROPERTIES("parser"="none") ) DUPLICATE KEY(ts) DISTRIBUTED BY RANDOM BUCKETS 10 -- add an INVERTED index to a table CREATE INDEX idx_request ON httplogs(request) USING INVERTED PROPERTIES("parser"="english"); ```	2022-11-08 23:46:53 +08:00
Tiewei Fang	826cfdaf93	[feature](information_schema) add `backends` information_schema table (#13086 )	2022-11-08 22:15:10 +08:00
shee	3f3f2eb098	[Nereids][Improve] infer predicate after push down predicate (#12996 ) This PR implements predicate inference. For example: ``` sql select * from student left join score on student.id = score.sid where score.sid > 1 ``` Transformed logical plan tree: left join / \ filter(sid >1) filter(id > 1) <---- inferred predicate \| \| scan scan See `InferPredicatesTest` for more cases. The logic is as follows: 1. Pull up the bottom predicates, then infer additional predicates. For example: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id (a) Pull up the bottom predicate: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 (b) Infer: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t.id = 1 and t2.id = 1 Final transformed SQL: select * from (select * from t1 where t1.id = 1) t join t2 on t.id = t2.id and t2.id = 1 2. Put these predicates into `otherJoinConjuncts`; they are processed in the next round of predicate push-down. Currently only `ComparisonPredicate` inference is supported. TODO: We should determine whether `expression` satisfies the condition for replacement, e.g. whether `expression` is non-deterministic.	2022-11-08 21:36:17 +08:00
Mingyu Chen	b6f91b6eff	[improvement](profile) support ordinary user to get query profile via http api (#14016 )	2022-11-08 20:39:01 +08:00
Kikyou1997	ecfdf0320d	[fix](statistics) ColumnStatistics was changed unexpectedly when show stats (#14068 ) The show stats logic unexpectedly modified the internally collected ColumnStat, which could cause inaccurate cost estimates and inefficient plans.	2022-11-08 20:26:37 +08:00
minghong	cdc635610b	[enhancement](Nereids) tpch q21 anti and semi join reorder (#14037 ) The estimation for anti and semi joins needs rework; for now, this change only makes TPC-H q21 pass.	2022-11-08 17:21:50 +08:00
morrySnow	54c07f8782	[regression](Nereids) add back tpch regression test cases (#13826 ) 1. add back TPC-H regression test cases 2. fix decimal problem on aggregate function sum and agg introduced by #13764 3. fix memo merge group NPE introduced by #13900	2022-11-08 16:40:46 +08:00
Mingyu Chen	1c07a01038	[feature](multi-catalog) Support data on s3-compatible oss and support aliyun DLF (#13994 ) 1. Support Aliyun DLF. 2. Support data on S3-compatible object storage, such as Aliyun OSS. 3. Refactor some catalog interfaces to make them tidier. 4. Fix a bug: the default text format field delimiter of Hive should be \x01. 5. Add a new class PooledHiveMetaStoreClient to wrap the IMetaStoreClient.	2022-11-08 14:02:41 +08:00
谢健	61d4974ba1	[fix](Nereids) Use simple cost to calculate benefit and avoid unuseless calculation (#14056 ) In GraphSimplifier, we can use simple cost to calculate the benefit. We only need to update recursively when the best neighbor of the apply step is the processing edge.	2022-11-08 13:11:38 +08:00
morrySnow	e6b12ce8e8	[feature](Nereids) support query that group by use alias generated in aggregate output (#14030 ) support query having alias in group by list, such as: SELECT c1 AS a, SUM(c2) FROM t GROUP BY a;	2022-11-08 11:02:42 +08:00
Mingyu Chen	b09e5ced97	[fix](priv) fix meta replay bug when upgrading from 1.1.x to 1.2.x (#14046 )	2022-11-08 10:43:33 +08:00
luozenglin	6ed443c7e8	[enhancement](profile) add instanceNum, tableIds to profile. (#13985 )	2022-11-08 08:49:16 +08:00
jakevin	17a4746a08	[enhancement](Nereids) support otherJoinConjuncts in cascades join reorder (#13681 )	2022-11-08 00:08:44 +08:00
Gabriel	1c2532b9dc	[Bug](udf) Make UDF's type always nullable (#14002 )	2022-11-07 20:51:31 +08:00
morrySnow	4ea1b39cb2	[enhancement](Nereids) remove unnecessary decimal cast (#13745 )	2022-11-07 19:24:10 +08:00
谢健	f2978fb6ff	[feat](Nereids) add graph simplifier (#14007 )	2022-11-07 18:45:45 +08:00
morrySnow	22b4c6af20	[feature](Nereids) support statement having aggregate function in order by list (#13976 ) 1. Support statements that have an aggregate function in the ORDER BY list, such as: SELECT COUNT(*) FROM t GROUP BY c1 ORDER BY COUNT(*) DESC; 2. Add ClickBench analyze unit tests.	2022-11-07 17:01:31 +08:00
starocean999	bb9182d602	[fix](repeat)remove unmaterialized expr from repeat node (#13953 )	2022-11-07 14:13:05 +08:00
zhoumengyks	3c8524b9d8	[security](fe jar) upgrade commons-codec:commons-codec to 1.13 #13951	2022-11-07 13:50:07 +08:00
Tiewei Fang	27549564a7	[feature](table-valued-function) Support S3 tvf (#13959 ) This PR does three things: 1. Modifies the framework of table-valued-function (tvf). 2. Adds BE support for the `fetch_table_schema` RPC. 3. Implements the `S3(path, AK, SK, format)` table-valued-function.	2022-11-06 11:04:26 +08:00
Mingyu Chen	fb5a3e118a	[feature-wip](dlf) prepare to support aliyun dlf (#13969 ) [What is DLF](https://www.alibabacloud.com/product/datalake-formation) This PR prepares for DLF support, with some changes to multi-catalog: 1. Add RuntimeException for most Hive metastore or ES client access operations. 2. Add DLF-related dependencies. 3. Move the checks of ES catalog properties to the analysis phase of creating an ES catalog. TODO (in next PR): 1. Refactor the `getSplit` method to support not only HDFS but also S3-compatible object storage. 2. Finish the implementation of DLF support.	2022-11-06 10:01:57 +08:00
Mingyu Chen	d01f7c546a	[refactor](iceberg-hudi) disable iceberg and hudi table by default (#13932 )	2022-11-05 19:22:27 +08:00
wxy	620a137bd7	[enhancement](test) support tablet repair and balance process in ut (#13940 )	2022-11-05 19:20:23 +08:00
Gabriel	2ee7ba79a8	[Improvement](javaudf) improve java loader usage (#13962 )	2022-11-05 13:20:04 +08:00
924060929	06a1efdb01	[fix](Nerieds) fix tpch and support trace plan's change event (#13957 ) This pr fix some bugs for run tpc-h 1. fix the avg(decimal) crash the backend. The fix code in `Avg.getFinalType()` and every child class of `ComputeSinature` 2. fix the ReorderJoin dead loop. The fix code in `ReorderJoin.findInnerJoin()` 3. fix the TimestampArithmetic can not bind the functions in the child. The fix code in `BindFunction.FunctionBinder.visitTimestampArithmetic()` New feature: support trace the plan's change event, you can `set enable_nereids_trace=true` to open trace log and see some log like this: ``` 2022-11-03 21:07:38,391 INFO (mysql-nio-pool-0\|208) [Job.printTraceLog():128] ========== RewriteBottomUpJob ANALYZE_FILTER_SUBQUERY ========== before: LogicalProject ( projects=[S_ACCTBAL#17, S_NAME#13, N_NAME#4, P_PARTKEY#19, P_MFGR#21, S_ADDRESS#14, S_PHONE#16, S_COMMENT#18] ) +--LogicalFilter ( predicates=((((((((P_PARTKEY#19 = PS_PARTKEY#7) AND (S_SUPPKEY#12 = PS_SUPPKEY#8)) AND (P_SIZE#24 = 15)) AND (P_TYPE#23 like '%BRASS')) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (R_NAME#1 = 'EUROPE')) AND (PS_SUPPLYCOST#10 = (SCALARSUBQUERY) (QueryPlan: LogicalAggregate ( phase=LOCAL, outputExpr=[min(PS_SUPPLYCOST#31) AS `min(PS_SUPPLYCOST)`#33], groupByExpr=[] )), (CorrelatedSlots: [P_PARTKEY#19, S_SUPPKEY#12, S_NATIONKEY#15, N_NATIONKEY#3, N_REGIONKEY#5, R_REGIONKEY#0, R_NAME#1]))) ) +--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \|--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \| \|--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \| \| \|--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \| \| \| \|--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.part, output=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, 
P_COMMENT#27], candidateIndexIds=[], selectedIndexId=11076, preAgg=ON ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.supplier, output=[S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18], candidateIndexIds=[], selectedIndexId=11124, preAgg=ON ) \| \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON ) \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.nation, output=[N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6], candidateIndexIds=[], selectedIndexId=11044, preAgg=ON ) +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.region, output=[R_REGIONKEY#0, R_NAME#1, R_COMMENT#2], candidateIndexIds=[], selectedIndexId=11108, preAgg=ON ) after: LogicalProject ( projects=[S_ACCTBAL#17, S_NAME#13, N_NAME#4, P_PARTKEY#19, P_MFGR#21, S_ADDRESS#14, S_PHONE#16, S_COMMENT#18] ) +--LogicalFilter ( predicates=((((((((P_PARTKEY#19 = PS_PARTKEY#7) AND (S_SUPPKEY#12 = PS_SUPPKEY#8)) AND (P_SIZE#24 = 15)) AND (P_TYPE#23 like '%BRASS')) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (R_NAME#1 = 'EUROPE')) AND (PS_SUPPLYCOST#10 = min(PS_SUPPLYCOST)#33)) ) +--LogicalProject ( projects=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27, S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18, PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11, N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6, R_REGIONKEY#0, R_NAME#1, R_COMMENT#2, min(PS_SUPPLYCOST)#33] ) +--LogicalApply ( correlationSlot=[P_PARTKEY#19, S_SUPPKEY#12, S_NATIONKEY#15, N_NATIONKEY#3, N_REGIONKEY#5, R_REGIONKEY#0, 
R_NAME#1], correlationFilter=Optional.empty ) \|--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \| \|--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \| \| \|--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \| \| \| \|--LogicalJoin ( type=CROSS_JOIN, hashJoinCondition=[], otherJoinCondition=[] ) \| \| \| \| \|--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.part, output=[P_PARTKEY#19, P_NAME#20, P_MFGR#21, P_BRAND#22, P_TYPE#23, P_SIZE#24, P_CONTAINER#25, P_RETAILPRICE#26, P_COMMENT#27], candidateIndexIds=[], selectedIndexId=11076, preAgg=ON ) \| \| \| \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.supplier, output=[S_SUPPKEY#12, S_NAME#13, S_ADDRESS#14, S_NATIONKEY#15, S_PHONE#16, S_ACCTBAL#17, S_COMMENT#18], candidateIndexIds=[], selectedIndexId=11124, preAgg=ON ) \| \| \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#7, PS_SUPPKEY#8, PS_AVAILQTY#9, PS_SUPPLYCOST#10, PS_COMMENT#11], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON ) \| \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.nation, output=[N_NATIONKEY#3, N_NAME#4, N_REGIONKEY#5, N_COMMENT#6], candidateIndexIds=[], selectedIndexId=11044, preAgg=ON ) \| +--LogicalOlapScan ( qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.region, output=[R_REGIONKEY#0, R_NAME#1, R_COMMENT#2], candidateIndexIds=[], selectedIndexId=11108, preAgg=ON ) +--LogicalAggregate ( phase=LOCAL, outputExpr=[min(PS_SUPPLYCOST#31) AS `min(PS_SUPPLYCOST)`#33], groupByExpr=[] ) +--LogicalFilter ( predicates=(((((P_PARTKEY#19 = PS_PARTKEY#28) AND (S_SUPPKEY#12 = PS_SUPPKEY#29)) AND (S_NATIONKEY#15 = N_NATIONKEY#3)) AND (N_REGIONKEY#5 = R_REGIONKEY#0)) AND (CAST(R_NAME AS STRING) = CAST(EUROPE AS STRING))) ) +--LogicalOlapScan ( 
qualified=default_cluster:regression_test_tpch_sf1_p1_tpch_sf1.partsupp, output=[PS_PARTKEY#28, PS_SUPPKEY#29, PS_AVAILQTY#30, PS_SUPPLYCOST#31, PS_COMMENT#32], candidateIndexIds=[], selectedIndexId=11092, preAgg=ON ) ```	2022-11-04 15:01:06 +08:00
morrySnow	dc01fb4085	[enhancement](Nereids) remove unnecessary string cast (#13730 ) Convert string-like literals to the cast target type instead of running the cast at runtime.	2022-11-04 11:18:22 +08:00
morrySnow	9bf20a7b5d	[enhancement](Nereids) remove unnecessary int cast (#13881 )	2022-11-04 11:07:59 +08:00
morrySnow	efb2596c7a	[enhancment](Nereids) enable push down filter through aggregation (#13938 )	2022-11-04 11:04:00 +08:00
Jibing-Li	f2d84d81e6	[feature-wip][refactor](multi-catalog) Persist external catalog related metadata. (#13746 ) Persist external catalog/db/table metadata, including the columns of external tables. After this change, external objects can keep their own unique IDs throughout their lifetime, which is required for statistics collection.	2022-11-04 09:04:00 +08:00
zhannngchen	698541e58d	[improvement](exec) add more debug info on fragment exec error (#13899 )	2022-11-04 08:55:31 +08:00
Mingyu Chen	5d56fe6d32	[fix](meta)(recover) fix recover info persist bug (#13948 ) introduced from #13830	2022-11-04 07:40:21 +08:00
Gabriel	0a228a68d6	[Improvement](javaudf) support different date argument for date/datetime type (#13920 )	2022-11-03 20:33:20 +08:00
carlvinhust2012	8043418db4	[optimization](array-type) update the exception message when create table with array column (#13731 ) This PR updates the exception message shown when creating a table with an array column. Co-authored-by: hucheng01 <hucheng01@baidu.com>	2022-11-03 17:12:17 +08:00
jakevin	c1438cbad6	[revert](Nereids): revert GroupExpression Children ImmutableList. (#13918 )	2022-11-03 16:29:54 +08:00
gnehil	b1816d49e7	[fix](typo) check catalog enable exception message spelling mistake (#13925 )	2022-11-03 14:44:37 +08:00
mch_ucchi	29e01db7ce	[Fix](Nereids) add comments to CostAndEnforcerJob and fix view test case (#13046 ) 1. add comments to cost and enforcer job as some code is too hard to understand 2. fix nereids_syntax_p0/view.groovy's multi-answer bug.	2022-11-03 12:12:24 +08:00
Gabriel	bfba058ecf	[Feature](join) Support null aware left anti join (#13871 )	2022-11-03 12:11:25 +08:00
Adonis Ling	57ee5c4a65	[feature](nereids) Support authentication (#13434 ) Add a rule to check the permissions of the user who is executing a query. Users who do not have SELECT_PRIV on a table are forbidden from executing queries on that table.	2022-11-03 11:58:14 +08:00
morrySnow	31d8fdd9e4	[fix](Nereids) finalize local aggregate should not turn on stream pre agg (#13922 )	2022-11-03 11:08:06 +08:00
starocean999	a4a991207b	[fix](agg)fix group by constant value bug (#13827 ) * [fix](agg)fix group by constant value bug * keep only one const grouping exprs if no agg exprs	2022-11-03 10:26:59 +08:00
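
Note on 3f3f2eb098 (#12996): the predicate inference described in that commit message can be illustrated with a small standalone sketch. This is not the Nereids implementation. The function name `infer_predicates` and the tuple encoding of predicates are hypothetical, chosen only for illustration, and the sketch ignores join types and the non-determinism check mentioned in the commit's TODO. It groups columns linked by join equalities and copies each comparison predicate to the other columns in the same group.

```python
from collections import defaultdict


def infer_predicates(equalities, predicates):
    """Infer extra comparison predicates from join equalities.

    equalities: list of (col_a, col_b) pairs, e.g. ("t.id", "t2.id").
    predicates: list of (col, op, value) tuples, e.g. ("t.id", "=", 1).
    Returns the sorted list of newly inferred (col, op, value) tuples.
    Hypothetical helper for illustration only; not Doris code.
    """
    parent = {}

    def find(col):
        # Union-find lookup with path halving.
        parent.setdefault(col, col)
        while parent[col] != col:
            parent[col] = parent[parent[col]]
            col = parent[col]
        return col

    # Merge columns that are declared equal by the join conditions.
    for a, b in equalities:
        parent[find(a)] = find(b)

    # Collect the members of each equivalence class.
    groups = defaultdict(set)
    for col in list(parent):
        groups[find(col)].add(col)

    known = set(predicates)
    inferred = set()
    for col, op, value in predicates:
        if col not in parent:
            continue  # column takes part in no equality; nothing to infer
        for other in groups[find(col)]:
            candidate = (other, op, value)
            if candidate not in known:
                inferred.add(candidate)
    return sorted(inferred)


# The two examples from the commit message, in this sketch's encoding.
print(infer_predicates([("student.id", "score.sid")], [("score.sid", ">", 1)]))
print(infer_predicates([("t.id", "t2.id")], [("t.id", "=", 1)]))
```

With the first input the sketch derives a `student.id > 1` predicate, and with the second it derives `t2.id = 1`, mirroring the two examples in the commit message. A real optimizer would also have to check that the replacement is legal for the join type and the expression involved.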

8289 Commits