Commit Graph

2153 Commits

Author SHA1 Message Date
2fc2113b05 [feature] Support show proc BackendLoadStatistic (#9618)
The proc info method already exists in `ClusterLoadStatistic.getBackendStatistic`, I'll add a proc node to show it.
2022-06-01 23:30:10 +08:00
8effdd95a7 [fix](routine-load) fix bug that routine load task can not find backend (#9902)
Introduced from #9492.
2022-06-01 17:55:30 +08:00
92babc7a47 [improvement][fix](planner) Add a rewrite rule to optimize InPredicate. (#9739)
1. Convert child expressions in InPredicate to column type and discard child expressions in them that cannot be converted exactly.
2. Fix the bug of ColumnRange exception caused by InPredicate child expressions type conversion.
3. Fix the problem that the tablet could not be hit due caused by InPredicate child expressions type conversion.
2022-06-01 15:26:32 +08:00
4ab7694a7f [Enhancement](Nereids)rewrite framework used in Memo (#9807)
Issue Number: close #9627 , #9628

This PR introduce two essentials for Nereids

1. pattern match iterator used in memo

pattern match iterator is implemented by two iterators nested within each other: GroupExpressionIterator and GroupIterator.
GroupExpressionIterator use GroupIterator to get all children Plan which matching pattern and use them as children to generate pattern matched plan.
GroupIterator use GroupExpressionIterator to get all pattern matched Plan related to GroupExpressions in itself.

2. plan rewrite framework for memo

Rewrite framework is implemented by two jobs: RewriteTopDownJob and RewriteBottomUpJob
Both of them takes a group, a set of rules that need to be applied, and a context as construction parameters.
RewriteTopDownJob apply these jobs from top to down one by one.
RewriteBottomUpJob apply these jobs from bottom to up one by one.
When one rule rewrites plan tree at a plan node. This plan node will be applied all rules again until no rules can rewrite it.
2022-06-01 15:12:58 +08:00
54e9d49718 [Bug][Vectorized] Fix call nvl function core dump (#9883)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-31 22:18:38 +08:00
e2b93a4165 [feature-wip](multi-catalog) Add basic class and interface for multi catalog support 2022-05-31 11:32:44 +08:00
8092439634 [feature](hudi) Step2: Support query hudi external table(include cow and mor table) (#9752)
support query cow and mor hudi table.
2022-05-30 09:43:36 +08:00
7b98dd438d [feature](function) Add nvl function (#9726) 2022-05-30 09:43:00 +08:00
0683181fef [API changed](parser) Remove merge join syntax (#9795)
Remove merge join sql and merge join node
2022-05-30 09:04:21 +08:00
c5369d3220 [bugfix] Fix create table like when having hidden columns (#9694) 2022-05-29 18:02:16 +08:00
e231273ddf [fix](sql-block-rule) sql block rule NPE (#9778) 2022-05-29 16:21:00 +08:00
ee1bed46be [config] Add backend_rpc_timeout_second in FE config (#9779) 2022-05-27 21:58:09 +08:00
b2c2cdb122 [feature] Support compression prop (#8923) 2022-05-27 21:52:05 +08:00
6698f63dec [fix](function) If function adds type inference (#9728) 2022-05-26 22:43:18 +08:00
e701c057dc [style](fe) wrap and whitespace rules (#9764)
change below rules' severity to error and fix original code error:

- EmptyBlock
- EmptyCatchBlock
- LeftCurly
- RightCurly
- IllegalTokenText
- MultipleVariableDeclarations
- OneStatementPerLine
- StringLiteralEquality
- UnusedLocalVariable
- Indentation
- OuterTypeFilename
- MethodParamPad
- GenericWhitespace
- NoWhitespaceBefore
- OperatorWrap
- ParenPad
- WhitespaceAfter
- WhitespaceAround
2022-05-26 16:56:20 +08:00
32a210f426 [fix](help) fix bug of help command (#9761)
This bug is introduced from #9306, that user need to execute
"help stream-load" to show the help doc.
But actually, it should be "help stream load".
2022-05-26 08:44:00 +08:00
0c70359404 [fix](resource-tag) Consider resource tags when assigning tasks for broker & routine load (#9492)
This CL mainly changes:
1. Broker Load
    When assigning backends, use user level resource tag to find available backends.
    If user level resource tag is not set, broker load task can be assigned to any BE node,
    otherwise, task can only be assigned to BE node which match the user level tags.

2. Routine Load
    The current routine load job does not have user info, so it can not get user level tag when assigning tasks.
    So there are 2 ways:
    1. For old routine load job, use tags of replica allocation info to select BE nodes.
    2. For new routine load job, the user info will be added and persisted in routine load job.
2022-05-26 08:42:09 +08:00
2a11a4ab99 [feature-wip][array-type] Support more sub types. (#9466)
Please refer to #9465
2022-05-26 08:41:34 +08:00
be026addde [security] update canal version to fix fastjson security issue (#9763) 2022-05-25 18:22:37 +08:00
cc9321a09b [Enhancement](Nereids)refactor plan node into plan + operator (#9755)
Close #9623 

Summary:
This pr refactor plan node into plan + operator.

In the previous version in nereids, a plan node consists of children and relational algebra, e.g.
```java
class LogicalJoin extends LogicalBinary {
  private Plan left, right;
}
```
This structure above is easy to understand, but it difficult to optimize `Memo.copyIn`: rule generate complete sub-plan,
and Memo must compare the complete sub-plan to distinct GroupExpression and hurt performance.

First, we need change the rule to generate partial sub-plan, and replace some children plan to a placeholder, e.g. LeafOp in Columbia optimizer. And then mark some children in sub-plan to unchanged, and bind the relate group, so don't have to compare and copy some sub-plan if relate group exists.

Second, we need separate the origin `Plan` into `Plan` and `Operator`, which Plan contains children and Operator, and Operator just denote relation relational algebra(no children/ input field). This design make operator and children not affect each other. So plan-group binder can generate placeholder plan(contains relate group) for the sub-query, don't have to generate current plan node case by case because the plan is immutable(means generate a new plan with replace children). And rule implementer can reuse the placeholder to generate partial sub-plan.

Operator and Plan have the similar inheritance structure like below. XxxPlan contains XxxOperator, e.g. LogicalBinary contains a LogicalBinaryOperator.
```
          TreeNode
             │
             │
     ┌───────┴────────┐                                                   Operator
     │                │                                                       │
     │                │                                                       │
     │                │                                                       │
     ▼                ▼                                                       ▼
Expression          Plan                                                PlanOperator
                      │                                                       │
                      │                                                       │
          ┌───────────┴─────────┐                                             │
          │                     │                                 ┌───────────┴──────────────────┐
          │                     │                                 │                              │
          │                     │                                 │                              │
          ▼                     ▼                                 ▼                              ▼
     LogicalPlan          PhysicalPlan                   LogicalPlanOperator           PhysicalPlanOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          ├───►LogicalLeaf      ├──►PhysicalLeaf                  ├──► LogicalLeafOperator       ├───►PhysicalLeafOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          ├───►LogicalUnary     ├──►PhysicalUnary                 ├──► LogicalUnaryOperator      ├───►PhysicalUnaryOperator
          │                     │                                 │                              │
          │                     │                                 │                              │
          │                     │                                 │                              │
          └───►LogicalBinary    └──►PhysicalBinary                └──► LogicalBinaryOperator     └───►PhysicalBinaryOperator
```

The concrete operator extends the XxxNaryOperator, e.g.
```java
class LogicalJoin extends LogicalBinaryOperator;
class PhysicalProject extends PhysicalUnaryOperator;
class LogicalRelation extends LogicalLeafOperator;
```

So the first example change to this:
```java
class LogicalBinary extends AbstractLogicalPlan implements BinaryPlan {
  private Plan left, right;
  private LogicalBinaryOperator operator;
}

class LogicalJoin extends LogicalBinaryOperator {}
```

Under such changes, Rule must build the plan and operator as needed, not only the plan like before.
for example: JoinCommutative Rule
```java
public Rule<Plan> build() {
  // the plan override function can automatic build plan, according to the Operator's type,
  // so return a LogicalBinary(LogicalJoin, Plan, Plan)
  return innerLogicalJoin().then(join -> plan(
    // operator
    new LogicalJoin(join.op.getJoinType().swap(), join.op.getOnClause()),
    // children
    join.right(),
    join.left()
  )).toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
}
```
2022-05-24 20:53:24 +08:00
77297bb7ee Fix some typos in fe/. (#9682) 2022-05-23 12:11:01 +08:00
d8f1b77cc1 [improvement](planner) Backfill the original predicate pushdown code (#9703)
Due to the current architecture, predicate derivation at rewrite cannot satisfy all cases,
because rewrite is performed on first and then where, and when there are subqueries, all cases cannot be derived.
So keep the predicate pushdown method here.

eg.
select * from t1 left join t2 on t1 = t2 where t1 = 1;

InferFiltersRule can't infer t2 = 1, because this is out of specification.

The expression(t2 = 1) can actually be deduced to push it down to the scan node.
2022-05-22 21:35:32 +08:00
d270f4f2d4 [config](checksum) Disable consistency checker by default (#9699)
Disable by default because current checksum logic has some bugs.
And it will also bring some overhead.
2022-05-22 21:31:43 +08:00
ad4da4aa8f [doc] Fix typos in documentation (#9692) 2022-05-22 21:30:22 +08:00
3391de482b [Refactor] simplify some code in routine load (#9532) 2022-05-22 21:25:39 +08:00
31e40191a8 [Refactor] add vpre_filter_expr for vectorized to improve performance (#9508) 2022-05-22 11:45:57 +08:00
8fa677b59c [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner (#9666)
* [Refactor][Bug-Fix][Load Vec] Refactor code of basescanner and vjson/vparquet/vbroker scanner
1. fix bug of vjson scanner not support `range_from_file_path`
2. fix bug of vjson/vbrocker scanner core dump by src/dest slot nullable is different
3. fix bug of vparquest filter_block reference of column in not 1
4. refactor code to simple all the code

It only changed vectorized load, not original row based load.

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-20 11:43:03 +08:00
6f61af7682 [Vectorized][java-udf] add datetime&&largeint&&decimal type to java-udf (#9440) 2022-05-20 10:26:09 +08:00
5fa6e892be [fix](broker-scan-node) Remove trailing spaces in broker_scanner. Make it consistent with hive and trino behavior. (#9190)
Hive and trino/presto would automatically trim the trailing spaces but Doris doesn't.
This would cause different query result with hive.

Add a new session variable "trim_tailing_spaces_for_external_table_query".
If set to true, when reading csv from broker scan node, it will trim the tailing space of the column
2022-05-20 09:55:13 +08:00
c2d41c84bf [feature](nereids): add join rules base code (#9598) 2022-05-20 08:18:08 +08:00
c048b1f0f9 [fix](sparkload): fix min_value will be negative number when maxGlobalDictValue exceeds integer range (#9436) 2022-05-19 23:56:24 +08:00
1355bc162b [Enhance] Add host info to heartbeat error msg (#9499) 2022-05-19 23:45:53 +08:00
cbc7b167b1 [Feature] cancel load support state (#9537) 2022-05-19 16:37:56 +08:00
235d586f11 [style](fe) code correct rules and name rules (#9670)
* [style](fe) code correct rules and name rules

* revert some change according to comments
2022-05-19 16:36:03 +08:00
7a9bf5b23e [FeConfig](Project) Project optimization is enabled by default (#9667) 2022-05-19 14:03:14 +08:00
a3183ec45c [fix](planner) unnecessary cast will be added on children in CaseExpr sometimes (#9600)
unnecessary cast will be added on children in CaseExpr because use symbolized equal to compare to `Expr`'s type.
it will lead to expression compare mistake and then lead to expression substitute failed when use `ExprSubstitutionMap`
2022-05-18 22:44:51 +08:00
94c89e8a37 [improment](planner) push down predicate past two phase aggregate (#9498)
Push down predicate past aggregate cannot push down predicate past 2 phase aggregate.

origin plan is like this:
```
second phase agg (conjuncts on olap scan node tuples)
|
first phase agg
|
olap scan node
```
should be optimized to
```
second phase agg
|
first phase agg
|
olap scan node (conjuncts on olap scan node tuples)
```
2022-05-18 10:09:39 +08:00
682cc14182 [bug] (init) Java version check fail (#9607) 2022-05-18 07:47:03 +08:00
7d9c25e718 [config] Remove some old config and session variable (#9495)
1. Remove session variable "enable_lateral_view"
2. Remove Fe config: enable_materialized_view
3. Remove Fe config: enable_create_sync_job
4. Fe config dynamic_partition_enable is only used for disable dynamic partition scheduler.
2022-05-17 22:37:11 +08:00
2ba81899d0 [fix] fix bug that replica can not be repaired duo to DECOMMISSION state (#9424)
Reset state of replica which state are in DECOMMISSION after finished scheduling.
2022-05-17 22:36:30 +08:00
4ba75d3195 [feature] Add StoragePolicyResource for Remote Storage (#9554)
Add StoragePolicyResource for Remote Storage
2022-05-17 20:17:33 +08:00
d95fe08458 [feature] group_concat support distinct (#9576) 2022-05-17 19:29:47 +08:00
72e0042efb [feature-wip](hudi) Step1: Support create hudi external table (#9559)
support create hudi table
support show create table for hudi table

### Design
1. create hudi table without schema(recommanded)
```sql
    CREATE [EXTERNAL] TABLE table_name
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```

2. create hudi table with schema
```sql
    CREATE [EXTERNAL] TABLE table_name
    [(column_definition1[, column_definition2, ...])]
    ENGINE = HUDI
    [COMMENT "comment"]
    PROPERTIES (
    "hudi.database" = "hudi_db_in_hive_metastore",
    "hudi.table" = "hudi_table_in_hive_metastore",
    "hudi.hive.metastore.uris" = "thrift://127.0.0.1:9083"
    );
```
When create hudi table with schema, the columns must exist in corresponding table in hive metastore.
2022-05-17 11:30:23 +08:00
bee5c2f8aa [feature-wip](parquet-vec) Support parquet scanner in vectorized engine (#9433) 2022-05-17 09:37:17 +08:00
c731e84341 [fix](planner)VecNotImplException thrown when query need rewrite and some slot cannot changed to nullable (#9589) 2022-05-16 22:34:02 +08:00
9f9b666bc1 [Feature](Nereids) Data structure of comparison predicate (#9506)
1. The data structure of the comparison expression
2. Refactored the inheritance and implementation relationship of tree node

```
        +-- ---- ---- ---+- ---- ---- ---- ---+- ---- ----- ---- ----TreeNode-----------------+
        |                |                    |                                               |
                                                                                              |
        |                |                    |                                               |
                                                                                              v
        v                v                    v                                           Abstract Tree Node
    Leaf Node        Unary Node          Binary Node                              +--------          ---------+
        |                |                    |                                   |        (children)         |
                                                                                  |                           |
        v                v                    v                                   v                           v
Leaf Expression   Unary Expression      Binary Expression              +------Expression----+           Plan Node
        |                |                    |                        |                    |
                                                                       |                    |
        |                |                    |                        v                    v
        |                |                    +- ---- ---- -----> Comparison Predicate     Named Expr
                                                                                       +----   -------+
        |                |                                                             v              v
        |                +- -- --- --- --- --- --- --- --- --- --- --- --- --- ---> Alias Expr      Slot
                                                                                                      ^
        |                                                                                             |
        |                                                                                             |
        +---- --- ---- ------ ---- ------- ------ ------- --- ------ ------ ----- ---- ----- ----- ---+
```
2022-05-16 15:01:13 +08:00
953429e370 [fix](function) fix last_value get wrong result when have order by clause (#9247) 2022-05-15 23:56:01 +08:00
a9653f00bb [fix](lateral-view) Error view includes lateral view (#9530)
Fixed #9529

When the lateral view based on a inline view which belongs to a view,
Doris could not resolve the column of lateral view in query.
When a query uses a view, it mainly refers to the string representation of the view.
That is, if the view's string representation is wrong, the view is wrong.
The string representation of the inline view lacks the handling of the lateral view.
This leads to query errors when using such views.
This PR mainly fixes the string representation of inline views.
2022-05-14 09:57:08 +08:00
8a0097cfb9 [style](java) format fe code with some check rules (#9460)
Issue Number: close #9403 

set below rules' severity to error and format code according check info.
a. Merge conflicts unresolved
b. Avoid using corresponding octal or Unicode escape
c. Avoid Escaped Unicode Characters
d. No Line Wrap
e. Package Name
f. Type Name
g. Annotation Location
h. Interface Type Parameter
i. CatchParameterName
j. Pattern Variable Name
k. Record Component Name
l. Record Type Parameter Name
m. Method Type Parameter Name
n. Redundant Import
o. Custom Import Order
p. Unused Imports
q. Avoid Star Import
r. tab character in file
s. Newline At End Of File
t. Trailing whitespace found
2022-05-12 20:14:38 +08:00
d7705ace65 [fix](binlog-load) binlog load fails because txn exceeds the default value (#9471)
binlog load Because txn exceeds the default value, resume is a failure,
and a friendly prompt message is given to the user, instead of prompting success now,
it still fails after a while, and the user will feel inexplicable
Issue Number: close #9468
2022-05-12 13:31:22 +08:00