when show databases/tables/table status where xxx, it will change a selectStmt to select result from
information_schema, it need catalog info to scan schema table, otherwise may get many
database or table info from multi catalog.
for example
mysql> show databases where schema_name='test';
+----------+
| Database |
+----------+
| test |
| test |
+----------+
MySQL [internal.test]> show tables from test where table_name='test_dc';
+----------------+
| Tables_in_test |
+----------------+
| test_dc |
| test_dc |
+----------------+
# Proposed changes
## refactor
- add AggregateExpression to shield the difference of AggregateFunction before disassemble and after
- request `GATHER` physicalProperties for query, because query always collect result to the coordinator, use `GATHER` maybe select a better plan
- refactor `NormalizeAggregate`
- remove some physical fields for the `LogicalAggregate`, like `AggPhase`, `isDisassemble`
- remove `AggregateDisassemble` and `DistinctAggregateDisassemble`, and use `AggregateStrategies` to generate various of PhysicalHashAggregate, like `two phases aggregate`, `three phases aggregate`, and cascades can auto select the lowest cost alternative.
- move `PushAggregateToOlapScan` to `AggregateStrategies`
- separate the traverse and visit method in FoldConstantRuleOnFE
- if some expression not implement the visit method, the traverse method can handle and rewrite the children by default
- if some expression implement the visit, the user defined traverse(invoke accept/visit method) will quickly return because the default visit method will not forward to the children, and the pre-process in traverse method will not be skipped.
## new feature
- support `disable_nereids_rules` to skip some rules.
example:
1. create 1 bucket table `n`
```sql
CREATE TABLE `n` (
`id` bigint(20) NOT NULL
) ENGINE=OLAP
DUPLICATE KEY(`id`)
COMMENT 'OLAP'
DISTRIBUTED BY HASH(`id`) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"in_memory" = "false",
"storage_format" = "V2",
"disable_auto_compaction" = "false"
);
```
2. insert some rows into `n`
```sql
insert into n select * from numbers('number'='20000000')
```
3. query table `n`
```sql
SET enable_nereids_planner=true;
SET enable_vectorized_engine=true;
SET enable_fallback_to_original_planner=false;
explain plan select id from n group by id;
```
the result show that we use the one stage aggregate
```
| PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional.empty, requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) |
| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) |
```
4. disable one stage aggregate
```sql
explain plan select
/*+SET_VAR(disable_nereids_rules=DISASSEMBLE_ONE_PHASE_AGGREGATE_WITHOUT_DISTINCT)*/
id
from n
group by id
```
the result is two stage aggregate
```
| PhysicalHashAggregate ( aggPhase=GLOBAL, aggMode=BUFFER_TO_RESULT, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[GATHER], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalHashAggregate ( aggPhase=LOCAL, aggMode=INPUT_TO_BUFFER, groupByExpr=[id#0], outputExpr=[id#0], partitionExpr=Optional[[id#0]], requestProperties=[ANY], stats=(rows=1, width=1, penalty=2.0E7) ) |
| +--PhysicalProject ( projects=[id#0], stats=(rows=20000000, width=1, penalty=0.0) ) |
| +--PhysicalOlapScan ( qualified=default_cluster:test.n, output=[id#0, name#1], stats=(rows=20000000, width=1, penalty=0.0) ) |
```
add DateV2 and DateTimeV2 for Literal.uncheckCastTo()
move nereids tpch cases into suite nereids_tpch_p1
move nereids datav2 cases into suite nereids_datav2_p1
The PhysicalFilter can't be assigned to ExchangeNode, SortNode and UnionNode. The nereids would create a standalone SelectNode to do the filter work properly.
## date_add series
- DATE_ADD
- DAYS_ADD
- ADDDATE
- TIMESTAMPADD
## date_sub series
- DATE_SUB
- DAYS_SUB
- SUBDATE
## NOTE
1. For DAYS_XXX, time unit is omissible, by default the time unit is DAY
2. no TIMESTAMPSUB
Fix two things:
1. Fix that the MySQL table displays the garbled code even if the UTF8 is specified for table.
2. Fix that `test_mysql_jdbc_catalog.out` lack of returned data for table `ex_tb13`.
Issue Number: close #xxx
I add jdbc catalog for doris multi-catalog feature.
Currently, the jdbc catalog only supports MYSQL DBMS.
TODO:
support for postgre DB
Support for other databases.
Problem summary
For jdbc catalog, we can create catalog like:
CREATE CATALOG jdbc4 PROPERTIES (
"type"="jdbc",
"jdbc.user"="root",
"jdbc.password"="123456",
"jdbc.jdbc_url" = "jdbc:mysql://127.0.0.1:13396/demo?yearIsDateType=false",
"jdbc.driver_url" = "file:/mnt/disk2/ftw/tools/jar/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar",
"jdbc.driver_class" = "com.mysql.jdbc.Driver"
);
Note:
yearIsDateType is a param of jdbc:
If yearIsDateType configuration property is set to false, then the returned object type is java.sql.Short. If set to true (the default), then the returned object is of type java.sql.Date with the date set to January 1st, at midnight.
To compat with mysql, we force the use of yearIsDateType=false in FE. if user sets yearIsDateType=true, doris FE will force to change yearIsDateType=false.
Add tpch 1g orc test case in hive docker
Refactor some suites dir of catalog test cases.
And "-internal" for dlf endpoint, to support access oss with aliyun vpc.