Close #9623

Summary:
This PR refactors the plan node into plan + operator.

In the previous version of Nereids, a plan node consisted of children and relational algebra, e.g.

```java
class LogicalJoin extends LogicalBinary {
  private Plan left, right;
}
```

This structure is easy to understand, but it makes `Memo.copyIn` difficult to optimize: a rule generates a complete sub-plan, and Memo must compare the complete sub-plan to de-duplicate GroupExpressions, which hurts performance.

First, we need to change rules to generate a partial sub-plan, replacing some child plans with a placeholder, e.g. LeafOp in the Columbia optimizer. We then mark some children in the sub-plan as unchanged and bind them to the related group, so those sub-plans don't have to be compared and copied if the related group exists.

Second, we need to separate the original `Plan` into `Plan` and `Operator`: a Plan contains children and an Operator, and an Operator denotes only the relational algebra (no children/input field). This design keeps operator and children from affecting each other. The plan-group binder can therefore generate a placeholder plan (containing the related group) for the sub-query without having to generate the current plan node case by case; because plans are immutable, replacing children means generating a new plan. Rule implementers can reuse the placeholder to generate a partial sub-plan.

Operator and Plan have similar inheritance structures, as shown below. XxxPlan contains XxxOperator, e.g. LogicalBinary contains a LogicalBinaryOperator.
```
         TreeNode                                               Operator
            │                                                       │
     ┌──────┴───────┐                                               │
     ▼              ▼                                               ▼
Expression         Plan                                       PlanOperator
                    │                                               │
         ┌──────────┴──────────┐                     ┌──────────────┴───────────────┐
         ▼                     ▼                     ▼                              ▼
    LogicalPlan          PhysicalPlan       LogicalPlanOperator           PhysicalPlanOperator
         │                     │                     │                              │
         ├───►LogicalLeaf      ├──►PhysicalLeaf      ├──► LogicalLeafOperator       ├───►PhysicalLeafOperator
         │                     │                     │                              │
         ├───►LogicalUnary     ├──►PhysicalUnary     ├──► LogicalUnaryOperator      ├───►PhysicalUnaryOperator
         │                     │                     │                              │
         └───►LogicalBinary    └──►PhysicalBinary    └──► LogicalBinaryOperator     └───►PhysicalBinaryOperator
```

Concrete operators extend an XxxNaryOperator, e.g.

```java
class LogicalJoin extends LogicalBinaryOperator {}
class PhysicalProject extends PhysicalUnaryOperator {}
class LogicalRelation extends LogicalLeafOperator {}
```

So the first example changes to this:

```java
class LogicalBinary extends AbstractLogicalPlan implements BinaryPlan {
  private Plan left, right;
  private LogicalBinaryOperator operator;
}

class LogicalJoin extends LogicalBinaryOperator {}
```

With these changes, a Rule must build both the plan and the operator as needed, not only the plan as before. For example, the JoinCommutative rule:

```java
public Rule<Plan> build() {
  // The overloaded plan() function automatically builds the plan according to the
  // Operator's type, so this returns a LogicalBinary(LogicalJoin, Plan, Plan).
  return innerLogicalJoin().then(join -> plan(
    // operator
    new LogicalJoin(join.op.getJoinType().swap(), join.op.getOnClause()),
    // children
    join.right(), join.left()
  )).toRule(RuleType.LOGICAL_JOIN_COMMUTATIVE);
}
```
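To make the motivation concrete, here is a minimal, self-contained sketch of how separating operator from children lets a memo skip comparing unchanged sub-plans. It is not the Nereids implementation: `ScanOp`, `JoinOp`, `GroupPlaceholderOp`, the simplified `Plan` and `Memo`, and the `copyIn(plan, targetGroup)` signature are all hypothetical names chosen for illustration. The idea shown is only that a placeholder child resolves directly to its bound group, so `copyIn` compares one operator plus child group ids rather than a whole sub-tree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified stand-ins; none of these are the real Nereids classes.

/** An operator carries only relational algebra: no children, no input field. */
interface Operator {}

final class ScanOp implements Operator {
    final String table;
    ScanOp(String table) { this.table = table; }
    @Override public boolean equals(Object o) {
        return o instanceof ScanOp && ((ScanOp) o).table.equals(table);
    }
    @Override public int hashCode() { return table.hashCode(); }
}

final class JoinOp implements Operator {
    final String joinType;
    final String onClause;
    JoinOp(String joinType, String onClause) {
        this.joinType = joinType;
        this.onClause = onClause;
    }
    @Override public boolean equals(Object o) {
        if (!(o instanceof JoinOp)) return false;
        JoinOp other = (JoinOp) o;
        return joinType.equals(other.joinType) && onClause.equals(other.onClause);
    }
    @Override public int hashCode() { return Objects.hash(joinType, onClause); }
}

/** Placeholder bound to an existing memo group (compare LeafOp in Columbia). */
final class GroupPlaceholderOp implements Operator {
    final int groupId;
    GroupPlaceholderOp(int groupId) { this.groupId = groupId; }
}

/** An immutable plan node: one operator plus its children. */
final class Plan {
    final Operator operator;
    final List<Plan> children;
    Plan(Operator operator, Plan... children) {
        this.operator = operator;
        this.children = Collections.unmodifiableList(Arrays.asList(children.clone()));
    }
    static Plan placeholder(int groupId) {
        return new Plan(new GroupPlaceholderOp(groupId));
    }
}

/** Toy memo: an expression is identified by its operator plus child group ids. */
final class Memo {
    static final int NEW_GROUP = -1;
    private final Map<List<Object>, Integer> groupOfExpression = new HashMap<>();
    private int nextGroupId = 0;

    /** Returns the group holding the plan; a new expression joins targetGroup. */
    int copyIn(Plan plan, int targetGroup) {
        if (plan.operator instanceof GroupPlaceholderOp) {
            // Unchanged child: reuse the bound group, no comparison or copy needed.
            return ((GroupPlaceholderOp) plan.operator).groupId;
        }
        List<Object> key = new ArrayList<>();
        key.add(plan.operator);
        for (Plan child : plan.children) {
            key.add(copyIn(child, NEW_GROUP));
        }
        Integer existing = groupOfExpression.get(key);
        if (existing != null) {
            return existing;
        }
        int group = (targetGroup == NEW_GROUP) ? nextGroupId++ : targetGroup;
        groupOfExpression.put(key, group);
        return group;
    }

    int expressionCount() { return groupOfExpression.size(); }
}

class PlanOperatorSketch {
    public static void main(String[] args) {
        Memo memo = new Memo();
        int a = memo.copyIn(new Plan(new ScanOp("a")), Memo.NEW_GROUP);
        int b = memo.copyIn(new Plan(new ScanOp("b")), Memo.NEW_GROUP);
        JoinOp join = new JoinOp("INNER", "a.id = b.id");
        int joinGroup = memo.copyIn(
                new Plan(join, Plan.placeholder(a), Plan.placeholder(b)), Memo.NEW_GROUP);
        // A commutativity rule emits only the new operator plus placeholder children.
        Plan commuted = new Plan(join, Plan.placeholder(b), Plan.placeholder(a));
        System.out.println("commuted join stored in group " + memo.copyIn(commuted, joinGroup));
        System.out.println("expressions in memo: " + memo.expressionCount());
    }
}
```

In this sketch the rule output never contains the scan sub-plans themselves, so copying it in touches a single map key. A real memo would also track logical properties and group merging, which are left out here.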
Apache Doris (incubating)
Doris is an MPP-based interactive SQL data warehouse for reporting and analysis. Its original name was Palo, and it was developed at Baidu. After being donated to the Apache Software Foundation, it was renamed Doris.
- Doris provides high-concurrency, low-latency point query performance, as well as high-throughput ad-hoc analytical queries.
- Doris provides batch data loading and real-time mini-batch data loading.
- Doris provides high availability, reliability, fault tolerance, and scalability.
The main advantages of Doris are its simplicity (of developing, deploying and using) and its ability to meet many data serving requirements in a single system. For details, refer to Overview.
Official website: https://doris.apache.org/
License
Note
Some licenses of the third-party dependencies are not compatible with the Apache 2.0 License, so you need to disable some Doris features to comply with the Apache 2.0 License. For details, refer to thirdparty/LICENSE.txt.
Technology
Doris mainly integrates the technology of Google Mesa and Apache Impala. It is based on a column-oriented storage engine and can be accessed with a MySQL client.
Compile and install
See Compilation
Getting started
See Basic Usage
Doris Connector
Doris supports Spark and Flink reading data stored in Doris through a Connector, and also supports writing data to Doris through a Connector.
apache/incubator-doris-flink-connector
apache/incubator-doris-spark-connector
Doris Manager
Doris provides tools for one-click, visual, automatic installation and deployment, as well as cluster management and monitoring.
apache/incubator-doris-manager
Report issues or submit a pull request
If you find any bugs, feel free to file a GitHub issue or fix them by submitting a pull request.
Contact Us
Contact us through the following mailing list.
| Name | Scope | |||
|---|---|---|---|---|
| dev@doris.apache.org | Development-related discussions | Subscribe | Unsubscribe | Archives |
Links
- Doris official site - https://doris.incubator.apache.org
- Developer mailing list - dev@doris.apache.org. Mail to dev-subscribe@doris.apache.org and follow the reply to subscribe to the mailing list.
- Slack channel - Join the Slack