Commit Graph

11919 Commits

Author SHA1 Message Date
0fa3efae1d [fix](Nereids): removePhysicalExpression() should clear empty Group. (#21951) 2023-07-19 14:41:06 +08:00
bd40767754 [stats](nereids) dump col stats for all physical plan node and cost details in memo #21902
1. print cost detail
2. dump col stats in memo
2023-07-19 14:10:26 +08:00
56c67a442a [regression-test] add p0/p1 case about partition table (#21777) 2023-07-19 14:05:56 +08:00
f668b3965e [Enhancement](Nereids)enable nereids DML by default. (#21539)
TODO: fix cast agg_state type when do insert
2023-07-19 13:52:15 +08:00
d8272b16e9 [fix](fe) fd leak of ssl #19645 2023-07-19 12:45:54 +08:00
ce397a8d32 [FIX](map)fix arrow serde with map null key #21955 2023-07-19 12:09:34 +08:00
d987f782d2 [refactor](Nereids) refactor cte analyze, rewrite and reuse code (#21727)
REFACTOR:

1. Generate CTEAnchor, CTEProducer, and CTEConsumer during analysis.

For example, take the statement `WITH cte1 AS (SELECT * FROM t) SELECT * FROM cte1`.
Before this PR, the analyzed plan looked like this:
```
logicalCTE(LogicalSubQueryAlias(cte1))
+-- logicalProject()
    +-- logicalCteConsumer()
```
The plan contains only a LogicalCteConsumer and no LogicalCteProducer.
This is not a valid plan and should not be the final result of analysis.
After this PR, the analyzed plan looks like this:
```
logicalCteAnchor()
|-- logicalCteProducer()
+-- logicalProject()
    +-- logicalCteConsumer()
```
This is a valid plan with both a LogicalCteProducer and a LogicalCteConsumer.

2. Replace re-analyzing the unbound plan with deep-copying the plan when doing CTEInline

Because we now generate LogicalCteAnchor and LogicalCteProducer during analysis,
we can no longer re-analyze to generate the CTE inline plan.
Another reason is that we reuse the relation id between unbound and bound relations.
If we re-analyzed an unresolved CTE plan, we would get two relations
with the same RelationId. This is wrong, because we use RelationId to distinguish
two different relations.
This PR implements two helper classes to deep copy a new plan from CTEProducer:
`LogicalPlanDeepCopier` and `ExpressionDeepCopier`.

3. New rewrite framework to ensure CTEInline is done in the right way.

Before this PR, we did CTEInline before applying any rewrite rule.
But sometimes a CteConsumer can be eliminated by a rewrite.
After this PR, we do CTEInline after the plans relying on CTEProducer have
been rewritten. So we can still do CTEInline if the count of CTEConsumers
drops below the CTEInline threshold.

4. add a relation id to all relation plan nodes
5. let all relations generated from tables implement the trait CatalogRelation
6. reuse the relation id between an unbound relation and the relation after binding


ENHANCEMENT:

1. Pull up CTEAnchor before RBO to avoid breaking other rules' patterns

Before this PR, we generated CTEAnchor and LogicalCTE in the middle of the plan.
So every rule had to handle LogicalCTEAnchor, otherwise it would generate an unexpected plan.
For example, push down filter and push down project would need patterns like:
```
logicalProject(logicalCTE)
...
logicalFilter(logicalCteAnchor)
...
```
Project and filter must be pushed through these virtual plan nodes to ensure that all projects
and filters can be merged together in the right order. For example:
```
logicalProject
+-- logicalFilter
    +-- logicalCteAnchor
        +-- logicalProject
            +-- logicalFilter
                +-- logicalOlapScan
```
The plan above leads to a translation error, because we cannot apply filter and
project twice on the bottom logicalOlapScan.


BUGFIX:

1. Recursively analyze LogicalCTE to avoid binding an outer relation to an inner CTE

For example
```sql
SELECT * FROM (WITH cte1 AS (SELECT * FROM t1) SELECT * FROM cte1)v1, cte1 v2; 
```
Before this PR, we used the nested CTE name to bind the outer plan,
so the outer cte1 with alias v2 was bound to the inner cte1.
After this PR, the SQL throws a Table not exists exception when binding.

2. Implement withChildren correctly in CTEProducer and remove projects from it

Before this PR, we added an attribute named projects to CTEProducer to represent its
output, because calling the `getOutput` method on it did not return the right output.
The root cause was the wrong implementation of computeOutput in LogicalCteProducer.
This PR fixes that problem and removes the projects attribute from CTEProducer.

3. The adjust nullable rule updates CTEConsumer's output from CTEProducer's output

This PR processes nullability on LogicalCteConsumer to ensure that the CteConsumer's output
carries the right nullable info when the CteProducer's output nullability has been adjusted.

4. Binding a set operation expression should not change the nullability of its children's output

This PR fixes a problem introduced by the previous PR #21168. The nullable info of
SetOperation's children should not change after binding SetOperation.
2023-07-19 11:41:41 +08:00
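The scoping rule behind BUGFIX item 1 of commit d987f782d2 can be illustrated with a minimal Python sketch. This is not Nereids code: `Scope`, `define`, and `resolve` are hypothetical names, and the sketch assumes only that each WITH clause opens a nested scope whose CTE names are invisible to the enclosing query.

```python
class Scope:
    """Hypothetical name scope; each WITH clause opens a child scope."""

    def __init__(self, parent=None, tables=()):
        self.parent = parent
        self.names = set(tables)

    def define(self, cte_name):
        # Register a CTE name in this scope only.
        self.names.add(cte_name)

    def resolve(self, name):
        # Walk outward: inner scopes see outer names, never the reverse.
        scope = self
        while scope is not None:
            if name in scope.names:
                return name
            scope = scope.parent
        raise LookupError(f"Table not exists: {name}")


outer = Scope(tables={"t1"})   # scope of the outer SELECT
inner = Scope(parent=outer)    # scope of the subquery aliased v1
inner.define("cte1")

inner_result = inner.resolve("cte1")  # the subquery sees its own CTE

try:
    outer.resolve("cte1")             # the reference aliased v2 is in the outer scope
    outer_error = None
except LookupError as exc:
    outer_error = str(exc)
```

Under these assumptions the inner reference binds and the outer one fails, which matches the behavior the commit describes after the fix.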
c28b90a301 [Bug](topn opt) disable topn 2 phase read when storage policy is not emtpy (#21909) 2023-07-19 10:28:41 +08:00
1110ff49f3 [feature-wip](dbt) exchange table temp to target table atomically (#21931)
exchange table temp to target table atomically
2023-07-19 10:20:50 +08:00
21633908bd [feature-wip](dbt) overwrite the materialization for table and view (#21935)
overwrite the materialization for table and view
2023-07-19 10:20:29 +08:00
1818526fba [fix](profile) Fix wrong instance number in query profile (#21808) 2023-07-19 10:00:48 +08:00
c993663827 [fix](nereids) fix cte as bc right side hang bug (#21897)
In the original computeMultiCastFragmentParams process, we did not handle the scenario where the CTE is the broadcast right side, so the buildHashTableForBroadcastJoin flag was never set to true and the SQL eventually hung.
2023-07-19 09:43:31 +08:00
5b043a980e [fix](planner)only forbid literal value in AnalyticExpr's order by list (#21819)
* [fix](planner)only forbid literal value in AnalyticExpr's order by list
2023-07-19 09:40:55 +08:00
d349c955f0 [fix](nereids) Disable auto analyze temporarily #21919 2023-07-19 09:27:24 +08:00
e0705f1149 [chore](third-party) Introduce libunwind (#21938) 2023-07-19 01:55:26 +08:00
24c00698f2 [fix](stmt-forward) fix should-be-required fields in forward params (#21945)
* fix-optional-fields-in-forward-param

* fix reviewed
2023-07-19 01:52:50 +08:00
b35cfc5d5e [opt](join) Opt the performance of join probe (#21845) 2023-07-19 01:21:22 +08:00
Pxl
0de94e857f [Bug](materialized view) fix wrong match mv when mv have where clause (#21797) 2023-07-19 01:11:39 +08:00
845cf94a7a [feature](function) support time_to_sec (#21722)
mysql >select sec_to_time(time_to_sec(cast('16:32:18' as time)));
+----------------------------------------------------+
| sec_to_time(time_to_sec(CAST('16:32:18' AS TIME))) |
+----------------------------------------------------+
| 16:32:18                                           |
+----------------------------------------------------+
1 row in set (0.53 sec)

mysql [test]>select sec_to_time(59538);
+--------------------+
| sec_to_time(59538) |
+--------------------+
| 16:32:18           |
+--------------------+
1 row in set (0.03 sec)
2023-07-19 01:09:48 +08:00
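The arithmetic behind the two queries in commit 845cf94a7a can be checked with a small Python sketch. It is not the Doris implementation, and it assumes a non-negative time in `HH:MM:SS` form.

```python
def time_to_sec(hms: str) -> int:
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def sec_to_time(total: int) -> str:
    """Convert a non-negative number of seconds back to 'HH:MM:SS'."""
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


seconds = time_to_sec("16:32:18")   # 16*3600 + 32*60 + 18 = 59538
round_trip = sec_to_time(seconds)   # "16:32:18"
```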
1c149439d7 [docs](map)Add map and struct type support parameters (#21802) 2023-07-19 01:06:23 +08:00
802d73f16d [optimization](heartbeart) Rm startuptime from front heart beart class (#21904)
---------

Co-authored-by: yuxianbing <iloveqaz123>
2023-07-19 00:56:36 +08:00
Pxl
4171309b9b [Bug](scanner) fix core dump due to release ScannerContext too early #21946 2023-07-19 00:53:23 +08:00
f6bfe058be [Fix](information_schema) Schema table varchar len error #21308 2023-07-19 00:50:01 +08:00
Pxl
f87fad97e1 [Bug](storage) add lock on base tablet when create_tablet #21915 2023-07-19 00:47:19 +08:00
fff1983f40 [fix](planner)use tupleId of agg node to get its unsigned conjuncts (#21949) 2023-07-19 00:46:49 +08:00
beec0e9169 [Improvement](tablet clone) impr tablet sched speed and fix tablet sched failed too many times (#21856) 2023-07-18 23:25:22 +08:00
dcb165cc9f [opt](hudi) get hudi split concurrently by using parallelStream (#21871)
This PR contains two optimizations:
1. Using a parallel stream to get hoodie splits concurrently. It reduces the split time from 1min20s to 12s when splitting 10,000 partitions.
2. Reading the hoodie meta table to get table partitions. It reduces the time to get partitions from 12min to 3s when reading 10,000 partitions.
2023-07-18 23:19:34 +08:00
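Commit dcb165cc9f uses Java's `parallelStream`. The sketch below shows the same idea, listing splits for many partitions concurrently, using Python's `concurrent.futures.ThreadPoolExecutor` instead. `list_splits` is a hypothetical stand-in for the per-partition listing call, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def list_splits(partition):
    """Hypothetical stand-in for the per-partition file listing."""
    return [f"{partition}/file-{i}" for i in range(2)]


def get_splits_concurrently(partitions, max_workers=8):
    """List splits for all partitions in parallel, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of the input partitions.
        per_partition = pool.map(list_splits, partitions)
        return [split for splits in per_partition for split in splits]


splits = get_splits_concurrently(["p=1", "p=2"])
```

The speed-up in practice depends on how much of the per-partition work is I/O wait; the sketch makes no claim about the timings reported in the commit.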
28dfcd8785 [fix](pipeline) Fix pipeline that cause plenty timeout of p0 cases #21917 2023-07-18 23:15:49 +08:00
d2b199955a [bugfix](deserialize ) pack struct to avoid parse wrong content for file header (#21907)
Recently we encountered a strange bug where the log says `file length is not match. file=/mnt/hdd01/master/NO_AVX2/doris.HDD/snapshot/20230713122303.26.72000/45832/536215111/45832.hdr, file_length=, real_file_length=0` when running the restore P2 case. After checking the file on the remote storage, we suspected that deserializing the local file caused this situation.
We then analyzed the layout of the struct and the content of the hdr file, and found that a wrong layout must have caused the wrong content to be read.
2023-07-18 22:32:41 +08:00
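Why packing matters for a file header, as in commit d2b199955a, can be seen with Python's `struct` module. The two-field header here (a 1-byte flag followed by an 8-byte length) is hypothetical and is not the actual Doris header layout.

```python
import struct

# Hypothetical header: a 1-byte flag followed by an 8-byte length.
PACKED = "<BQ"  # standard sizes, no padding between fields
NATIVE = "@BQ"  # native alignment; padding is typically inserted after the flag

packed_size = struct.calcsize(PACKED)  # 1 + 8 bytes
native_size = struct.calcsize(NATIVE)  # platform dependent, never smaller

# A writer and a reader that agree on the packed layout round-trip correctly.
raw = struct.pack(PACKED, 1, 4096)
flag, length = struct.unpack(PACKED, raw)
```

If the writer uses one layout and the reader the other, the reader interprets bytes at the wrong offsets, which is the kind of mismatch the commit attributes the bad header content to.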
a9ea138caf [fix](two level hash table) fix dead loop when converting to two level hash table for zero value (#21899)
When the two level hash table is enabled and the existing one level hash table contains a zero value, converting to a two level hash table causes a dead loop, because the PartitionedHashTable::_is_partitioned flag is not set correctly during the conversion.
2023-07-18 19:50:30 +08:00
c6063ed92f [Revert](lazy open) revert lazy open and add case (#21821) 2023-07-18 19:41:33 +08:00
87556b5741 [bug](test) fix regression test case failed with curdate (#21922)
fix regression test case failed with curdate
2023-07-18 19:10:55 +08:00
d6d27ef428 [fix](Nereids) join other conjuncts should get slot from join output (#21840) 2023-07-18 18:22:40 +08:00
2013dcd0e9 [refactor](load) cleanup segment flush logic in beta rowset writer (#21635) 2023-07-18 18:17:57 +08:00
c36d225a27 [feature](profile) add process hashtable time in join node (#21878)
add process hashtable time in join node
2023-07-18 18:09:42 +08:00
Pxl
3089e4b3b6 [Bug](excution) fix ScannerContext is done make query failed (#21923)
fix ScannerContext is done make query failed
2023-07-18 17:58:00 +08:00
e654b5ddfc [enhancement](broker-load) support special partition path pattern (#21778)
Some users may have a non-ACID path like `/path/to/k=v/1/filename`, introduced by the HQL statement `insert into union all`. For such a path, the path partition `k=v` should be parsed normally in broker load.
2023-07-18 14:50:37 +08:00
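The path shape handled by commit e654b5ddfc can be parsed with a short Python sketch. It is not the broker load code: `parse_partition_columns` is a hypothetical helper that assumes partition values appear as `key=value` directory segments and that other segments, such as `1`, are skipped.

```python
def parse_partition_columns(path, columns):
    """Extract key=value directory segments for the requested columns."""
    found = {}
    for segment in path.split("/")[:-1]:  # the last segment is the file name
        key, sep, value = segment.partition("=")
        if sep and key in columns:
            found[key] = value
    return found


parsed = parse_partition_columns("/path/to/k=v/1/filename", ["k"])
```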
ec12a4159a [fix](planner) push conjuncts into SetOperationStmt inline view (#21718)
* [fix](planner)push conjuncts into SetOperationStmt inline view
2023-07-18 14:17:07 +08:00
50b81a9c13 [Fix](multi-catalog) Filter invisible files for hive table. (#21867)
Hive does not read files whose names start with "." or "_", so we need to filter these files out.
2023-07-18 13:08:12 +08:00
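The filter rule in commit 50b81a9c13 amounts to a prefix check on the file name. A minimal Python sketch, with hypothetical paths and a hypothetical helper name, assuming only the last path component is tested:

```python
def visible_files(paths):
    """Keep files whose base name does not start with '.' or '_'."""
    return [
        path for path in paths
        if not path.rsplit("/", 1)[-1].startswith((".", "_"))
    ]


kept = visible_files([
    "/warehouse/t/part-0000",
    "/warehouse/t/_SUCCESS",
    "/warehouse/t/.part-0001.crc",
])
```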
Pxl
417e3e5616 [Feature](delete) support fold constant on delete stmt (#21833)
support fold constant on delete stmt
2023-07-18 12:56:28 +08:00
Pxl
19492b06c1 [Bug](decimalv3) fix failed on test_dup_tab_decimalv3 due to wrong precision (#21890)
fix failed on test_dup_tab_decimalv3 due to wrong precision
2023-07-18 12:53:09 +08:00
e1a116af94 [fix](planner)normalize the behavior of from_unixtime() according to Nereids planner (#21723)
If from_unixtime() receives an integer outside the int range, the function returns null.
2023-07-18 12:15:38 +08:00
07e720e65d [fix](planner)need recalculate nullable info of output slots for join node (#21650)
* [fix](planner)need recalculate nullable info of output slots for join node
2023-07-18 12:10:27 +08:00
Pxl
b3d3ffa2de [Bug](pipeline) adjust scanner scheduler.submit and _num_scheduling_ctx maintain (#21843)
adjust scanner scheduler.submit and _num_scheduling_ctx maintain
2023-07-18 11:55:21 +08:00
489171e4c1 [Fix](multi catalog)Fix hive partition value contains special character such as / bug (#21876)
Hive escapes some special characters in partition values as %XX; for example, / is escaped to %2F.
Doris did not handle this case, so it failed to list the files under partitions whose values contain special characters.
This PR fixes the bug.
2023-07-18 11:20:38 +08:00
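The %XX form described in commit 489171e4c1 is percent-encoding, so a partition value can be decoded in Python with `urllib.parse.unquote`. This is a sketch, not the Doris fix: the partition directory name is hypothetical, and it assumes every %XX sequence in the value is an escape.

```python
from urllib.parse import unquote


def unescape_partition_value(escaped: str) -> str:
    """Decode %XX sequences, e.g. %2F back to '/'."""
    return unquote(escaped)


directory = "dt=2023%2F07%2F18"          # hypothetical partition directory name
key, _, escaped_value = directory.partition("=")
value = unescape_partition_value(escaped_value)
```

Here the directory name keeps the escaped form on disk, while `value` holds the original partition value `2023/07/18`.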
ebd2a4b707 [fix](dynamic partition) fix create hot partition failed without error response (#20996) 2023-07-18 10:56:37 +08:00
e24867e138 [typo][docs] Modify the description of CREATE-TABLE (#21858) 2023-07-18 10:29:47 +08:00
726e0d5ebf [fix](load) fix dead loop in _handle_mem_exceed_limit function when reduce memory for load (#21886) 2023-07-18 09:49:36 +08:00
b656f31cf2 [Enchancement](compatible) show decimalv3 to decimal (#21782) 2023-07-18 09:17:14 +08:00
b6517ed83b [Enhance](function) add boolean type for sum agg function (#21862)
Previously the sum aggregate function was not registered for the boolean type, so a boolean had to be cast to another type before it could be summed.
2023-07-18 08:06:52 +08:00