Commit Graph

2298 Commits

Author SHA1 Message Date
5dfb59844f [enhancement](Nereids)refactor PlannerContext and JobContext (#10485)
Refactor the contexts in Cascades:
use two contexts in the Cascades framework.

JobContext is used in each job and contains the following attributes:
- reference to PlannerContext
- current cost upper bound
- current required physical properties

PlannerContext holds global information for the query planner and contains the following attributes:
- reference to Memo
- reference to connectContext
- reference to the rule set that can be used for planning
- job pool to maintain unexecuted jobs
- job scheduler to schedule unexecuted jobs
- current job context for the next job to be executed
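
The split between the two contexts can be pictured with a small Python sketch. This is only an illustration of the attribute lists above, not the Nereids code (which is Java); every field and method name here is an assumption chosen for readability.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Callable, Deque, Optional


@dataclass
class PlannerContext:
    """Global state shared by every job of one planning run (hypothetical names)."""
    memo: Any                      # reference to Memo
    connect_context: Any           # reference to connectContext
    rule_set: Any                  # rules that may be used for planning
    job_pool: Deque[Callable[[], None]] = field(default_factory=deque)
    current_job_context: Optional["JobContext"] = None

    def schedule(self) -> None:
        """Toy job scheduler: run pooled jobs until none remain."""
        while self.job_pool:
            job = self.job_pool.pop()
            job()


@dataclass
class JobContext:
    """Per-job state; it points back to the shared PlannerContext."""
    planner_context: PlannerContext
    cost_upper_bound: float
    required_properties: Any


planner = PlannerContext(memo={}, connect_context=None, rule_set=[])
planner.current_job_context = JobContext(planner, float("inf"), "ANY")

executed = []
planner.job_pool.append(lambda: executed.append("optimize root group"))
planner.schedule()
```

Keeping the cost bound and required properties in a per-job object lets them change from job to job while the memo and rule set stay shared.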
2022-07-06 18:36:31 +08:00
f758e1166a [fix] Fix RewriteBinaryPredicatesRule which causes wrong query results in some cases. (#10551)
During the query planning phase, the binary predicate rewrite optimization converts DecimalLiteral to integers, and this conversion may overflow, producing wrong results for predicates like "id = 12345678901.0" (see the issue for detailed examples).

This PR fixes the possible overflow and optimizes the case where the DecimalLiteral is outside the value range of the column type.

Issue Number: close #10544
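
A minimal Python sketch of the idea, assuming a 32-bit INT column and assuming that an out-of-range literal may be folded to a constant false: check the range before narrowing the literal. The helper name and the folding strategy are illustrative assumptions, not the actual `RewriteBinaryPredicatesRule` logic.

```python
from decimal import Decimal

# Value range of a 32-bit signed INT column (assumed column type).
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def rewrite_int_eq_decimal(column: str, literal: Decimal):
    """Rewrite `column = literal` for an INT column (hypothetical helper).

    Returns False when the predicate can never match, otherwise a
    (column, "=", int) triple with the literal narrowed to an integer.
    """
    # A literal with a fractional part can never equal an integer column.
    if literal != literal.to_integral_value():
        return False
    # Check the range BEFORE narrowing, so the conversion cannot overflow.
    if not (INT_MIN <= literal <= INT_MAX):
        return False
    return (column, "=", int(literal))


out_of_range = rewrite_int_eq_decimal("id", Decimal("12345678901.0"))
in_range = rewrite_int_eq_decimal("id", Decimal("3.0"))
fractional = rewrite_int_eq_decimal("id", Decimal("3.5"))
```

Here `out_of_range` and `fractional` are `False`, while `in_range` is `("id", "=", 3)`.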
2022-07-06 15:39:27 +08:00
0b80457c1f [feature](nereids) support like and regexp predicate (#10411)
Support like and regexp predicates for Nereids.
For example:
select * from t1 where k1 like 'xxx' and k2 regexp '^sa'
2022-07-06 14:32:06 +08:00
0b9f508379 [fix](nereids) fix ut,check bound should be called recursively on the plan node (#10530)
Fix the unit test: the bound check should be called recursively on the plan node.
2022-07-06 10:37:05 +08:00
c936abd2a3 [fix](fe) when bdbje adding follower, master write op may failed. (#10376) 2022-07-06 10:29:16 +08:00
5f5e01b285 [feature-wip](multi-catalog) Fix hive partition prune in hive and hudi external table. (#10547)
`ExprBuilder` uses a stack to build the expression.
The input order is col, value, but the output order is value, col, while the `>=` operator is not reversed.
Example:
`col >= 1` => `1 >= col`

In this case, it is better to use a queue to keep the input order.

`CompoundPredicate(OR)` also has a problem: it should be `alwaysTrue` whenever the predicate is not on a partition key or uses an unsupported op.
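
The operand-order problem can be reproduced in a few lines of Python. This is a toy model, not the `ExprBuilder` code; the two function names are made up for the illustration.

```python
from collections import deque


def build_with_stack(operands, op):
    """Pop operands from a stack: the last one pushed comes out first."""
    stack = list(operands)
    left = stack.pop()
    right = stack.pop()
    return f"{left} {op} {right}"


def build_with_queue(operands, op):
    """Take operands from a queue: input order is preserved."""
    queue = deque(operands)
    left = queue.popleft()
    right = queue.popleft()
    return f"{left} {op} {right}"


reversed_expr = build_with_stack(["col", "1"], ">=")   # "1 >= col"
ordered_expr = build_with_queue(["col", "1"], ">=")    # "col >= 1"
```

With the stack, the operands swap sides but the operator stays `>=`, which changes the meaning of the predicate; the queue avoids the swap altogether.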
2022-07-06 10:22:16 +08:00
589ab06b5c [enhancement](nereids) make filter node and join node work in Nereids (#10605)
enhancement
- add functions `finalizeForNereids` and `finalizeImplForNereids` to the stale expressions to generate some attributes used in BE
- remove the unnecessary parameter `Analyzer` from function `getBuiltinFunction`
- swap the join condition if its left-hand expression relates to the right table
- change the physical join implementation to broadcast hash join
- add the push predicate rule to the planner

fix
- swap the visit order of join children to ensure the last fragment is the root
- avoid visiting the join's left child twice

known issues
- expression computation generates a wrong answer when the expression includes arithmetic with two literal children.
2022-07-05 18:23:00 +08:00
3b0ddd7ae0 [Enhancement](Nereids)(Step1) prune column for filter/agg/join/sort (#10478)
Column pruning for filter/agg/join/sort.

#### For agg
Pattern : agg()
Transformed:
```
agg
  |
project
  |
child
```
#### For filter()/sort():
Pattern: project(filter()/join()/sort())
Transformed:
```
project
    |
filter/sort
   |
project
   |
child
```
#### For join
Pattern: project(join())
Transformed:
```
        project
             |
           join
       /          \
project    project
   |              |
child        child
```

for example:
```sql
-- table a: k1,v1
-- table b: k1,k2,k3,v1
select a.k1, b.k2 from a join b on a.k1 = b.k1 where a.k1 > 1
```

original plan tree:
```
         project(a.k1,b.k2 )
                        |
          join(a:k1,v1 b:k1,k2,k3,v1)
                /                   \
 scan(a:k1,v1)         scan(b:k1,k2,k3,v1)
```

transformed plan tree:

```
              project(a.k1,b.k2 )
                        |
               join(a:k1 b:k1,k2)
               /                  \
          project(k1)   project(k1,k2)
               |                      |
 scan(a:k1,v1)       scan(b:k1,k2,k3,v1)
```
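
The join case can be sketched in Python as follows: each child keeps only the columns that the parent project, the join condition, or a filter still needs. The function and its argument names are assumptions for illustration and do not mirror the Nereids rule classes.

```python
def prune_join_children(project_cols, join_cols, filter_cols,
                        left_schema, right_schema):
    """Return the columns each join child must still output (toy model)."""
    required = set(project_cols) | set(join_cols) | set(filter_cols)
    # Preserve the original column order of each child's schema.
    left = [c for c in left_schema if c in required]
    right = [c for c in right_schema if c in required]
    return left, right


# select a.k1, b.k2 from a join b on a.k1 = b.k1 where a.k1 > 1
left_out, right_out = prune_join_children(
    project_cols=["a.k1", "b.k2"],
    join_cols=["a.k1", "b.k1"],
    filter_cols=["a.k1"],
    left_schema=["a.k1", "a.v1"],
    right_schema=["b.k1", "b.k2", "b.k3", "b.v1"],
)
# left_out  == ["a.k1"]           -> project(k1) above scan(a)
# right_out == ["b.k1", "b.k2"]   -> project(k1,k2) above scan(b)
```

Note that `b.k1` survives even though the final project does not output it, because the join condition still reads it.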
2022-07-05 17:54:21 +08:00
f40ae7c654 [feature-wip](multi-catalog) support "show proc 'catalogs/'" (#10596) 2022-07-05 13:40:24 +08:00
680118c6b9 [Feature] [nereids] Agg rewrite rule of nereids optimizer (#10412)
Add a rule to disassemble the logical aggregate node. This is necessary because our execution framework is distributed and an aggregate always executes in two steps: first aggregate locally, then merge the results.

Add fields to the logical aggregate to record whether a logical aggregate operator has been disassembled and which aggregate phase it belongs to, and add logic to map the new aggregate function to its stale definition to get the function's intermediate type.
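
The two-step execution can be sketched in Python for a `sum` grouped by key: each data partition aggregates locally, and a second phase merges the partial results. This is a conceptual sketch with made-up function names, not the rule added by this commit.

```python
from collections import defaultdict


def local_sum(rows):
    """Phase 1: aggregate one partition's (key, value) rows locally."""
    partial = defaultdict(int)
    for key, value in rows:
        partial[key] += value
    return dict(partial)


def merge_sums(partials):
    """Phase 2: merge the partial sums produced by every partition."""
    merged = defaultdict(int)
    for partial in partials:
        for key, value in partial.items():
            merged[key] += value
    return dict(merged)


partitions = [
    [("a", 1), ("b", 2)],   # rows held by one node
    [("a", 3)],             # rows held by another node
]
result = merge_sums(local_sum(rows) for rows in partitions)
# result == {"a": 4, "b": 2}
```

For `sum`, the intermediate type equals the final type; functions such as `avg` need a richer intermediate state (for example a sum and a count), which is why the intermediate type has to be looked up per function.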
2022-07-05 11:57:42 +08:00
e444ac7a87 [format](*): using guava package header (#10325) 2022-07-05 11:05:39 +08:00
73ba806046 [feature-wip](multi-catalog) Add catalog to information_schema table "columns". (#10592) 2022-07-05 09:57:19 +08:00
1cee0a7028 [feature-wip](multi-catalog) Modify the persist method about data source (#10523) 2022-07-04 18:24:14 +08:00
46bff6bba0 [fix](multi-catalog) fix the core dump on hms table (#10573)
In the function `TextConverter::write_vec_column`, the statement `nullable_column->get_null_map_data().push_back(0);` should be executed for every row.
Otherwise the null map becomes incorrect and causes the core dump.
2022-07-04 15:52:05 +08:00
e6f090e5bf [enhancement](Nereids)make nereids work (#10550)
Nereids could execute query: `select a from t;`

**enhancement**
- add a queriable interface for QueryStmt and LogicalPlanAdapter temporarily
- refactor GroupId; GroupId extends doris.common.id now
- GroupId is generated by its memo now, not globally yet
- add varchar type
- Nereids enabled only when vectorized engine enabled

**fix**
- set output and column label to logicalPlanAdapter
- set output expression on root fragment
- set select partition and select index id to OlapScanNode
- BatchRulesJob add rule type mismatch
- add all implementation rules to rule set
- SlotReference get catalog column no longer returns null values
- bind star correctly
- implement `isNullable` in expressions

**known issue**
- could not do expression mapping (e.g. a + 1) on the project node (waiting for the intermediate tuple interface and project ability in ExecNode in BE)
- aggregate does not work
- sort does not work
- filter does not work
- join does not work
2022-07-04 14:15:33 +08:00
4e00584e40 [fix] fix api of table schema in http v2 (#10476)
schema in this api should be a list, just like in v1
2022-07-03 23:20:03 +08:00
8b6c46cfd1 [fix] fix create table like when having sequence column (#10464) 2022-07-03 23:19:46 +08:00
614b782d4d [feature](doris-on-es) Support es external table not assign schema (#9583) 2022-07-03 23:19:05 +08:00
bfaa60b695 [fix](fe-ut) fix ut compile bug (#10562)
Introduced from #10306
2022-07-02 22:54:14 +08:00
848e0c5987 [fix](planner)infer predicate generate infered predicate using wrong information from another scope (#10519)
This PR fixes a bug in predicate inference.

The original predicate inference compares two slots without using SlotId. This causes an error when a query has a SetOperand and more than one of the SetOperand's children uses the same table alias, e.g.

```
select * from tb1 inner join tb2 on tb1.k1 = tb2.k1 
union
select * from tb1 inner join tb2 on tb1.k2 = tb2.k2 where tb1.k1 = 3;
```

In this case, we infer a predicate `tb2.k1 = 3` on table 'tb2' of the SetOperand's second child by mistake.
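
A toy Python model of why names alone are not enough: slots from the two union branches share the name `tb1.k1` but are different slots. The `Slot` class, the ids, and the `infer` helper are all assumptions made for this illustration; the real planner structures differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    slot_id: int   # unique per scope (hypothetical ids)
    name: str      # alias-qualified column name


def infer(equalities, constants, same_slot):
    """From `a = b` and `a = const`, infer `b = const` when the `a`s match."""
    inferred = []
    for left, right in equalities:
        for slot, value in constants:
            if same_slot(left, slot):
                inferred.append((right, value))
    return inferred


# First union branch: tb1.k1 = tb2.k1
branch1_tb1_k1, branch1_tb2_k1 = Slot(1, "tb1.k1"), Slot(2, "tb2.k1")
# Second union branch: where tb1.k1 = 3
branch2_tb1_k1 = Slot(3, "tb1.k1")

equalities = [(branch1_tb1_k1, branch1_tb2_k1)]
constants = [(branch2_tb1_k1, 3)]

by_name = infer(equalities, constants, lambda a, b: a.name == b.name)
by_id = infer(equalities, constants, lambda a, b: a.slot_id == b.slot_id)
# by_name wrongly yields tb2.k1 = 3; by_id yields nothing.
```

Matching on the slot id keeps the equality from one branch from combining with the constant predicate of the other.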
2022-07-02 22:41:04 +08:00
078cb3b4db [feature-wip](multi-catalog) end to end to support multi-catalog (#10521)
Connect the previous pull requests that support multi-catalog to achieve end-to-end multi-catalog support.
2022-07-02 20:43:10 +08:00
632ff01bbb [enhancement](Nereids) add post processor and error listener to parser (#10306)
Add a parser error listener and a post processor to the parser.

error listener:
- throw an exception when the parser finds unexpected syntax

post processor:
- throw an exception when an error indent is found
- replace '``' with '`' in quoted identifiers
- replace non-reserved keywords with normal identifiers
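
The quoted-identifier step can be sketched in Python: strip the surrounding backticks and collapse each doubled backtick into a single one. The function name is hypothetical, and the real post processor works on parse-tree tokens rather than plain strings.

```python
def unquote_identifier(text: str) -> str:
    """Turn a backtick-quoted identifier into its plain name (toy version)."""
    if len(text) >= 2 and text[0] == "`" and text[-1] == "`":
        # Drop the outer quotes, then unescape doubled backticks.
        return text[1:-1].replace("``", "`")
    return text


plain = unquote_identifier("`my_table`")     # "my_table"
escaped = unquote_identifier("`a``b`")       # "a`b"
unquoted = unquote_identifier("k1")          # "k1"
```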
2022-07-01 21:25:13 +08:00
3b3debf5a4 [build] Fix nested resource path error when as maven project from eclipse (#10427)
1. Fix the nested resource path error when importing as a Maven project from Eclipse
2. Add instructions for "Eclipse import FE as maven project" to the developer guide
2022-07-01 18:03:54 +08:00
0401c04497 [chore] remove unused code for enable_lateral_view (#10438)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-01 16:45:41 +08:00
558a21d7de [style](*): fix declarationOrder error by rearrange code (#10347) 2022-07-01 15:40:34 +08:00
f998c0b044 [Enhancement](Nereids) push down predicate through join (#10462)
Add filter operators to the join children according to the predicates of the filter and the join, in order to achieve predicate push-down.

Pattern: 
```
      filter
         |
       join
      /     \
child    child
```

Transform:
```
      filter
         |
       join
      /     \
filter     filter
 |            |  
child     child
```
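
A minimal Python sketch of the transformation, assuming an inner join and predicates already split into conjuncts: a conjunct that only reads columns of one child moves into that child's filter, and everything else stays above the join. The helper name and the `(text, columns)` representation are assumptions for this illustration.

```python
def push_down(conjuncts, left_cols, right_cols):
    """Split (text, columns) conjuncts into left, right and remaining filters."""
    left_filter, right_filter, remaining = [], [], []
    for text, cols in conjuncts:
        if cols <= left_cols:
            left_filter.append(text)      # only reads the left child
        elif cols <= right_cols:
            right_filter.append(text)     # only reads the right child
        else:
            remaining.append(text)        # reads both sides, stays on top
    return left_filter, right_filter, remaining


conjuncts = [
    ("a.k1 > 1", {"a.k1"}),
    ("b.k2 = 5", {"b.k2"}),
    ("a.v1 < b.v1", {"a.v1", "b.v1"}),
]
left_f, right_f, rest = push_down(
    conjuncts,
    left_cols={"a.k1", "a.v1"},
    right_cols={"b.k1", "b.k2", "b.v1"},
)
# left_f == ["a.k1 > 1"], right_f == ["b.k2 = 5"], rest == ["a.v1 < b.v1"]
```

Outer joins need extra care, because filtering the null-supplying side before the join can change the result; the sketch does not cover that case.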
2022-07-01 15:39:01 +08:00
a44a222f76 [Feature](neireids) Add support of ProjectNode in PlanTranslator (#10499)
Since column pruning is done with the project node, we need to compact the project outputs into the PlanNode in PhysicalPlanTranslator::visitPhysicalProject.

1. Add support for ProjectNode to make column pruning available.
2. Add SortNode to PlanFragment when it is unpartitioned piggyback
2022-07-01 13:17:49 +08:00
d0b757c03a [bugfix](fe) fix add follower failed due to conflict socket . (#10429) 2022-07-01 11:12:36 +08:00
6c2e76e39f [enhancement](proc) Support showing more details in show proc "/dbs" (#10471) 2022-07-01 10:27:38 +08:00
d43d3fc35f [improvement] modify comment " to ', to be compatible with mysql. (#10327) 2022-07-01 08:59:29 +08:00
aab7dc956f [refactor](load) Remove mini load (#10520) 2022-06-30 23:21:41 +08:00
aae619ef2c [feature-wip](nereids) Adjust plan execution flow and fix physical bugs (#10481)
Organize the plan process, improve the batch execution of rules and the way to add jobs.
Fix the problem that the condition in PhysicalHashJoin is empty.
2022-06-30 20:07:48 +08:00
77b1565b96 [feature](nereids) Support analyze for test SSB (#10415)
Following up on #10241, this PR goes through parsing and analyzing the SSB queries and adds these functions:

1. support parsing parenthesizedExpression
2. support analyzing LogicalAggregate and LogicalSort
3. replace functionCall with UnboundFunction and BoundFunction
4. support the sum aggregate function
5. fix the dead loop in the ExpressionRewriter
6. refine some code
2022-06-30 16:26:39 +08:00
620faf4959 [feature-wip](multi-catalog) add auth&catalog check (#10480)
This PR follows up [#10435](https://github.com/apache/doris/pull/10435). [#10435](https://github.com/apache/doris/pull/10435) added catalog support to the SQL syntax, but some Doris statements are only valid in the internal catalog. To remind users of the scope of catalog usage, it is necessary to throw errors for invalid catalog usage in the analyze phase.

## How does it affect the original behavior
It is fully compatible with previous SQL statements. For statements that are only valid in the internal catalog, explicitly naming the internal catalog is still valid, but naming an external catalog directly throws an error. For example:
```
MySQL [(none)]> show data from tpch10.lineitem;
+-----------+-----------+------------+--------------+----------+
| TableName | IndexName | Size       | ReplicaCount | RowCount |
+-----------+-----------+------------+--------------+----------+
| lineitem  | lineitem  | 210.809 MB | 32           | 6001215  |
|           | Total     | 210.809 MB | 32           |          |
+-----------+-----------+------------+--------------+----------+

MySQL [(none)]> show data from internal_catalog.tpch10.lineitem;
+-----------+-----------+------------+--------------+----------+
| TableName | IndexName | Size       | ReplicaCount | RowCount |
+-----------+-----------+------------+--------------+----------+
| lineitem  | lineitem  | 210.809 MB | 32           | 6001215  |
|           | Total     | 210.809 MB | 32           |          |
+-----------+-----------+------------+--------------+----------+

MySQL [(none)]> show data from hive.tpch10.lineitem;
ERROR 1105 (HY000): errCode = 2, detailMessage = External catalog 'hive' is not allowed in 'ShowDataStmt'
```
2022-06-30 12:04:23 +08:00
c62c2e308f [chore]replace checkstyle action with mvn checkstyle:check (#10474) 2022-06-30 11:20:50 +08:00
9b6ed1525d [fix](nereids) extractPlan should use physical expressions but is logical expressions actually (#10483)
extractPlan should use physical expressions but is logical expressions actually
2022-06-29 19:38:10 +08:00
c695ccb827 [fix](proc) Fix show proc '/current_query_stmts' error due to wrong index for execTime (#10488)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-06-29 17:41:47 +08:00
d1055eacb4 [feature-wip](array-type) Use uppercase to describe columns with array type (#10193)
Use uppercase to describe columns with array type.
2022-06-29 14:07:27 +08:00
abd10f0f3e [feature-wip](multi-catalog) Impl FileScanNode in be (#10402)
Define a new file scanner node for HMS tables in BE.
This file scanner node differs from the broker scan node as follows:
1. The broker scan node defines a src slot and a dest slot, so there are two memory copies: first from the file to the src slot,
    and second from the src slot to the dest slot. FileScanNode, in contrast, has only a one-step memory copy, directly from the file to the dest slot.
2. The broker scan node reads all the fields in the file into the src slot, while FileScanNode reads only the needed fields.
3. The broker scan node converts values to string type for the src slot and then uses a cast to convert them to the dest slot type,
    but FileScanNode produces the final type directly.

For now FileScanNode is standalone code, but we will unify the file scan and the broker scan in the future.
2022-06-29 11:04:01 +08:00
9aa800141d [fix](ut)(nereids) the check bound function lacks recursive processing (#10357) 2022-06-29 10:40:13 +08:00
0a36c34326 [feature](nereids) costAndEnforcerJob interim solution (#10468)
In order to complete the ssb test, temporarily increase the implementation of costAndEnforcerJob, and create an OptimizeGroupjob for all children of the group.
2022-06-28 18:45:23 +08:00
f5936aa7ce [enhancement](Nereids): add more implementation rules. (#10335)
Add more implementation rules.

Currently some `logical` and `physical` operators differ. I changed some code to make them match.

Implementation
- Sort:only heap sort
- Agg
- OlapScan
2022-06-28 17:08:33 +08:00
17eb8c00d3 [feature] add table valued function framework and numbers table valued function (#10214) 2022-06-28 14:01:57 +08:00
1f2bf39140 [feature-wip](multi-catalog) get catalog name from TableName (#10435) 2022-06-28 10:42:37 +08:00
498a80547c [fix](fe-ut) fix fe ut and build.sh bug (#10432) 2022-06-27 19:01:05 +08:00
ca94867b4e [Feature-wip] add date v2 type (#9916) 2022-06-26 16:07:56 +08:00
f6ef1aad5c [fix](backup) fix mkdir failed (#10422) (#10423) 2022-06-26 09:55:48 +08:00
4408231765 [fix](random-distribution) Make aggregate keys table with replace type columns and unique keys table can only have hash distribution to make data computing correctly (#10414) 2022-06-26 09:52:16 +08:00
79ad05eec6 [fix](doe) fix doe on es v8 (#10391)
Doris on ES 8 does not work because of changes to types. The use of types is no longer recommended in ES 7,
and support for types has been removed in ES 8.

1. /_mapping does not support include_type_name
2. /_search does not support using a type
2022-06-26 09:51:29 +08:00
eebfbd0c91 Revert "[fix](vectorized) Support outer join for vectorized exec engine (#10323)" (#10424)
This reverts commit 2cc670dba697a330358ae7d485d856e4b457c679.
2022-06-25 22:18:08 +08:00