1. For a query with 1656 unions, the plan thrift size is reduced from 400MB+ to 2MB.
This optimization was introduced in #4904 but was lost after #9720.
2. Disable `ExprSubstitutionMap.verify` when debug is disabled, so that the planning time of the same query drops from 20s to 2s (sketched below).
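A minimal sketch of the second fix's idea: gate the expensive `verify` pass behind a debug switch so production planning does not pay the quadratic cost. The class is a stripped-down stand-in, not Doris's actual `ExprSubstitutionMap`, and the `planner.debug` system property is an assumption for illustration.
```java
import java.util.ArrayList;
import java.util.List;

class SubstitutionMapSketch {
    // assumed debug switch; the real code would read the FE config instead
    static final boolean DEBUG = Boolean.getBoolean("planner.debug");

    private final List<String> lhs = new ArrayList<>();
    private final List<String> rhs = new ArrayList<>();

    void put(String from, String to) {
        lhs.add(from);
        rhs.add(to);
        if (DEBUG) {
            verify(); // O(n^2): with thousands of union branches this dominates plan time
        }
    }

    // quadratic scan for duplicate left-hand-side expressions
    private void verify() {
        for (int i = 0; i < lhs.size(); i++) {
            for (int j = i + 1; j < lhs.size(); j++) {
                if (lhs.get(i).equals(lhs.get(j))) {
                    throw new IllegalStateException("duplicate substitution: " + lhs.get(i));
                }
            }
        }
    }
}
```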
The original statistics derivation algorithm relies on NDV and other column statistics, but we cannot get these stats in a production environment.
This PR changes these operators' stats calculation to use a DEFAULT_RATIO variable instead of column statistics.
We should change these algorithms back once column stats are available in production.
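A minimal sketch of the idea for a filter operator, assuming a made-up DEFAULT_RATIO of 0.5; the names are illustrative, not the actual stats-calculation code:
```java
class RatioStatsSketch {
    // assumed constant; the real value in the PR may differ
    static final double DEFAULT_RATIO = 0.5;

    // no column stats needed: just scale the input cardinality by a fixed ratio
    static double estimateFilterRows(double inputRows) {
        return inputRows * DEFAULT_RATIO;
    }

    public static void main(String[] args) {
        System.out.println(estimateFilterRows(1_000_000)); // 500000.0
    }
}
```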
Implement `uncheckedCast` on `VarcharLiteral` as a temporary way to make `TimestampArithmetic` work.
We should remove this code and do the implicit cast in the `TypeCoercion` rule in the future.
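A minimal sketch of what such an unchecked cast looks like, using a simplified stand-in rather than the real Nereids `VarcharLiteral`; the datetime format is an assumption:
```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Parse the varchar eagerly instead of inserting an implicit CAST; a proper
// TypeCoercion rule should replace this later, as the description says.
class VarcharLiteralSketch {
    final String value;

    VarcharLiteralSketch(String value) {
        this.value = value;
    }

    // "unchecked": throws at plan time if the string is not a datetime
    LocalDateTime uncheckedCastToDateTime() {
        return LocalDateTime.parse(value, DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"));
    }

    public static void main(String[] args) {
        // e.g. '2022-08-01 10:00:00' + INTERVAL 1 DAY needs a datetime operand
        System.out.println(new VarcharLiteralSketch("2022-08-01 10:00:00").uncheckedCastToDateTime());
    }
}
```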
Just like the legacy planner, Nereids parses all fractional literals as decimal.
In the future, we will add more syntax for users to control the fractional literal type.
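A minimal sketch of the behavior, assuming `java.math.BigDecimal` as the decimal representation; the parser class is hypothetical:
```java
import java.math.BigDecimal;

// Every fractional literal becomes an exact decimal, never a binary double.
class FractionalLiteralSketch {
    static BigDecimal parse(String text) {
        return new BigDecimal(text); // keeps precision and scale exactly
    }

    public static void main(String[] args) {
        BigDecimal v = parse("0.1");
        // precision 1, scale 1 -> maps to DECIMAL(1,1), with no 0.1000...04 noise
        System.out.println(v.precision() + ", " + v.scale());
    }
}
```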
Execution plan displayed when using the `orthogonal_bitmap_union_count` function:
PREAGGREGATION: OFF
Reason: Invalid Aggregate Operator: orthogonal_bitmap_union_count
The correct plan is: `PREAGGREGATION: ON`
Co-authored-by: lihuigang <lihuigang@meituan.com>
Support OneRowRelation and EmptyRelation.
OneRowRelation: `select 100, 'abc', substring('abc', 1, 2)`
EmptyRelation: `select * from tbl limit 0`
Note:
PhysicalOneRowRelation will be translated to a UnionNode(constExpr) for BE execution, as sketched below.
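A minimal sketch of that shape, with illustrative names rather than the real FE/BE structures: a union node carrying one constant row and no child plans.
```java
import java.util.List;

// A union node with zero child plans and one row of constant expressions:
// evaluating the constants yields the single output row.
class ConstUnionSketch {
    final List<List<Object>> constExprRows;

    ConstUnionSketch(List<List<Object>> rows) {
        this.constExprRows = rows;
    }

    public static void main(String[] args) {
        // select 100, 'abc' -> one constant row, nothing to scan
        List<Object> row = List.of(100, "abc");
        ConstUnionSketch oneRow = new ConstUnionSketch(List.of(row));
        System.out.println(oneRow.constExprRows); // [[100, abc]]
    }
}
```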
In the earlier PR #11976, we changed `DistributionSpecHash#equalsSatisfy` and forgot to check whether the lengths of both sides are the same. When the required shuffle slot list is longer than the current one, an exception is thrown. A sketch of the missing guard follows.
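A minimal sketch of the fix, with simplified types rather than the real `DistributionSpecHash` signature:
```java
import java.util.List;

class DistributionCheckSketch {
    // compare shuffle slots pairwise, but only after checking arity:
    // without the size check, a longer `required` list walks past the
    // end of `current` and throws
    static boolean equalsSatisfy(List<Integer> current, List<Integer> required) {
        if (current.size() != required.size()) {
            return false;
        }
        for (int i = 0; i < current.size(); i++) {
            if (!current.get(i).equals(required.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equalsSatisfy(List.of(1, 2), List.of(1, 2, 3))); // false, no exception
    }
}
```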
## Fix five bugs:
1. Parquet dictionary data may be compressed, but `ColumnChunkReader` tries to parse the dictionary data before creating the compression codec, causing unexpected data errors.
2. `FE` doesn't resolve the array type.
3. `ParquetFileHdfsScanner` doesn't fill partition values when the table is partitioned.
4. `ParquetFileHdfsScanner` sets `_scanner_eof = true` when a scan range is empty, prematurely ending the whole scanner and causing data loss.
5. A typographical error in `PageReader`.
Simplify the code of getting input/output slots from `Expression` or `Plan`.
**New interfaces added**
`Expression`:
- `getInputSlots`: Get all the input slots of the expression.
`Plan`:
- `getOutputSet`: Get the output slot set of the plan.
- `getInputSlots`: Get the input slot set of the plan.
**Changed interface**
`TreeNode`:
- `collect`: returns a `Set` as the result instead of a `List`.
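A minimal sketch of how these accessors fit together, using stripped-down stand-ins for the real Nereids interfaces:
```java
import java.util.HashSet;
import java.util.Set;

interface ExpressionSketch {
    Set<String> getInputSlots(); // every slot this expression reads
}

interface PlanSketch {
    Set<String> getOutputSet();  // slots the plan produces
    Set<String> getInputSlots(); // slots the plan consumes
}

// a leaf expression contributes exactly one input slot
class SlotRefSketch implements ExpressionSketch {
    final String name;

    SlotRefSketch(String name) {
        this.name = name;
    }

    @Override
    public Set<String> getInputSlots() {
        Set<String> slots = new HashSet<>();
        slots.add(name); // a Set, so repeated references de-duplicate naturally
        return slots;
    }
}
```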
In the earlier PR #11812, we split the join condition into two parts: hash join conjuncts and other conditions. But we forgot to translate the other conditions into other conjuncts in the legacy planner's HashJoinNode, so we get wrong results when a query has other conditions on the join node. For example:
`SELECT * FROM lineorder INNER JOIN part ON lo_partkey = p_partkey WHERE lo_orderkey > p_size;`
In the current spark load implementation, the types of the source data that BE reads from the Broker are all set to varchar.
However, varchar and bitmap are no longer compatible since version 1.1.0, which causes spark load to fail.
An example spark load error message:
detailMessage = type not match, originType=VARCHAR(*), targeType=BITMAP
Set the src type of bitmap columns from varchar to bitmap when FE pushes tasks, as sketched below.
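A minimal sketch of the idea, with hypothetical types rather than the actual FE push-task code:
```java
class PushTaskColumnSketch {
    enum Type { VARCHAR, BITMAP }

    // before the fix this always returned VARCHAR, which BITMAP destination
    // columns no longer accept after 1.1.0
    static Type srcType(Type destType) {
        return destType == Type.BITMAP ? Type.BITMAP : Type.VARCHAR;
    }

    public static void main(String[] args) {
        System.out.println(srcType(Type.BITMAP)); // BITMAP, not VARCHAR
    }
}
```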
Implement the having clause for Nereids Planner.
NOTE:
This PR only aims to make the Nereids Planner generate correct logical and physical plans. Runtime correctness is not a goal of this PR, because GROUP BY is not ready in the Nereids Planner yet.
This PR:
1. adds support in Nereids for the join algorithms below, which the legacy planner already supports:
- colocate join
- bucket shuffle join
- shuffle join
- broadcast join
2. updates all cost/enforcer derivation utilities:
- ChildOutputPropertyDeriver
- EnforceMissingPropertiesHelper
- RequestPropertyDeriver
3. adds a local quick sort plan used by the enforcer
4. sets `PhysicalProperties` on the `PhysicalPlan` when choosing the best plan from the memo
5. renames `Job#pushTask` to `Job#pushJob`
After applying the NormalizeAggregate rule, the owner groups of all the aggregate's children are removed.
The root cause is that the new aggregate node is regarded as the old aggregate node, because `LogicalAggregate.equals()` does not take some attributes ("normalized", "disassembled") into account.
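A minimal sketch of the fix, using the attribute names from the description above on a stand-in class rather than the real `LogicalAggregate`:
```java
import java.util.Objects;

class AggregateEqualsSketch {
    final String exprs; // stands in for the real group-by/output fields
    final boolean normalized;
    final boolean disassembled;

    AggregateEqualsSketch(String exprs, boolean normalized, boolean disassembled) {
        this.exprs = exprs;
        this.normalized = normalized;
        this.disassembled = disassembled;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof AggregateEqualsSketch)) {
            return false;
        }
        AggregateEqualsSketch that = (AggregateEqualsSketch) o;
        return normalized == that.normalized         // previously missing
                && disassembled == that.disassembled // previously missing
                && exprs.equals(that.exprs);
    }

    @Override
    public int hashCode() {
        return Objects.hash(exprs, normalized, disassembled);
    }
}
```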
In the earlier PR #11842, we added the ability to do projection on each ExecNode.
But the projection expr list did not show up in explain output, which is inconvenient for debugging.
This PR adds it to the explain string when present.
Add a new property called 'reserve_replica', which means you can
get a table with the same partitions and the same replication num
as before the backup.
Co-authored-by: Stalary <stalary@163.com>
Co-authored-by: camby <104178625@qq.com>
Fix some bugs when adding REWRITE rules to the Cascades optimizer:
- all rules should be set as non-rewrite rules when used in the Cascades optimizer
- the IMPLEMENT rule promise should be larger than the others', since we should do exploration first
In the old planner, Predicate sets its type in analyzeImpl(). However, analyzeImpl() is on the old planner's path but not on the Nereids path, and hence the type is invalid there.
Because all predicates have type boolean, we set the type in the constructor, as sketched below.
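A minimal sketch of the approach, with illustrative stand-ins for the Nereids expression classes:
```java
// Every predicate evaluates to boolean, so the type is fixed in the
// constructor; no analyze pass is needed to discover it.
class PredicateSketch {
    final String dataType;

    PredicateSketch() {
        this.dataType = "BOOLEAN"; // invariant for all predicates
    }
}

class GreaterThanSketch extends PredicateSketch {
    final Object left;
    final Object right;

    GreaterThanSketch(Object left, Object right) {
        this.left = left;   // dataType is already BOOLEAN
        this.right = right; // via the PredicateSketch constructor
    }
}
```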
Currently, Nereids doesn't support aggregate functions with no slot reference in the query, since all the columns would be pruned, e.g.
`SELECT COUNT(1) FROM t;`
This PR reserves the column with the smallest amount of data when doing column pruning in this situation (see the sketch below).
To be noticed, this PR ONLY handles aggregate functions, so projections with no slot reference still need to be handled in the future.
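A minimal sketch of the selection step, assuming a per-column byte-width estimate; the width map is an illustration, not Doris's actual cost model:
```java
import java.util.Map;

class ColumnPruneSketch {
    // keep the column with the smallest estimated width so the scan still
    // produces rows for COUNT(1) to count
    static String smallestColumn(Map<String, Integer> columnWidthBytes) {
        return columnWidthBytes.entrySet().stream()
                .min(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElseThrow(() -> new IllegalStateException("no columns"));
    }

    public static void main(String[] args) {
        System.out.println(smallestColumn(Map.of("id", 4, "name", 32, "flag", 1))); // flag
    }
}
```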
#11392 made `_input_block` in each `BetaRowsetReader` sharable. However, for some types (e.g. nested arrays more than one level deep), the `_column_vector_batches` in `RowBlockV2` can be nested, which means there is a `ColumnVectorBatch` inside another `ColumnVectorBatch`. In this case, the data of the inner `ColumnVectorBatch` may be corrupted, because the data of `_input_block` is copied shallowly to the `_output_block`.
Currently, the explain string prints every expression as a slot id, e.g. `<slot 1>`.
This PR prints the name together with the slot id instead, e.g. `column_a[#1]`. In detail:
- print the qualified table name for OlapScanNode
- print the NamedExpression name with the SlotId instead of just the SlotId
- OlapScanNode's node name uses "OlapScanNode" instead of the table name
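A minimal sketch of the new slot formatting described above; the helper is hypothetical:
```java
class ExplainSlotSketch {
    // column_a[#1] instead of <slot 1>
    static String slotString(String name, int slotId) {
        return name + "[#" + slotId + "]";
    }

    public static void main(String[] args) {
        System.out.println(slotString("column_a", 1)); // column_a[#1]
    }
}
```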
Currently, there are still lots of bugs related to ARRAY<NOT_NULL(T)>.
We have decided not to support ARRAY<NOT_NULL(T)> types in the first version; all elements in an ARRAY are nullable.
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>