Add an `information_schema` database to every catalog.
This is useful when BI tools connect to Doris,
because the tools can read meta info from `information_schema`.
Main changes in this PR:
1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, which defaults to false. If it is set to true,
the `TABLE_SCHEMA` column's value takes the form `ctl.db`. This is needed because,
when connecting to Doris, the `database` info in the connection url will be: `xxx?db=ctl.db`,
and some BI tools will then query `information_schema` with sql like:
`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
So the value has to be formatted as `ctl.db`.
For example, the `information_schema.columns` table in the external catalog `doris` looks like:
```
mysql> select * from information_schema.columns limit 1\G
*************************** 1. row ***************************
TABLE_CATALOG: doris
TABLE_SCHEMA: doris.__internal_schema
TABLE_NAME: column_statistics
COLUMN_NAME: id
ORDINAL_POSITION: 1
COLUMN_DEFAULT: NULL
IS_NULLABLE: NO
DATA_TYPE: varchar
CHARACTER_MAXIMUM_LENGTH: 4096
CHARACTER_OCTET_LENGTH: 16384
NUMERIC_PRECISION: NULL
NUMERIC_SCALE: NULL
DATETIME_PRECISION: NULL
CHARACTER_SET_NAME: NULL
COLLATION_NAME: NULL
COLUMN_TYPE: varchar(4096)
COLUMN_KEY:
EXTRA:
PRIVILEGES:
COLUMN_COMMENT:
COLUMN_SIZE: 4096
DECIMAL_DIGITS: NULL
GENERATION_EXPRESSION: NULL
SRS_ID: NULL
```
5. Modify the behavior of
- show tables
- show databases
- show columns
- show table status
The above statements may query the `information_schema` db if a `where` predicate follows them.
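The `TABLE_SCHEMA` formatting rule controlled by `show_full_dbname_in_info_schema_db` can be sketched as below. This is only an illustration of the rule described in this PR, not the FE implementation; the function name `format_table_schema` and its parameters are hypothetical.

```python
def format_table_schema(catalog: str, database: str, show_full_dbname: bool = False) -> str:
    """Return the value a TABLE_SCHEMA column would show (illustrative only).

    show_full_dbname stands in for the global variable
    show_full_dbname_in_info_schema_db; the function itself is hypothetical.
    """
    if show_full_dbname:
        # BI tools that connect with `db=ctl.db` filter on the full name.
        return f"{catalog}.{database}"
    # Default: only the database name.
    return database


default_value = format_table_schema("doris", "__internal_schema")
full_value = format_table_schema("doris", "__internal_schema", show_full_dbname=True)
print(default_value)
print(full_value)
```

With the variable enabled, the value matches the `doris.__internal_schema` shown in the sample output above, so a predicate such as `TABLE_SCHEMA = "ctl.db"` can match.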
1. `expand_runtime_filter_by_inner_join` can create redundant runtime filters (RFs), e.g., in TPC-H q5 and q9; one of the redundant RFs needs to be removed
2. hive: prune an RF if its target is only used as a probe
case:
```
MySQL root@127.0.0.1:test> select cast(12 as decimalv3(2,1))
+-----------------------------+
| cast(12 as DECIMALV3(2, 1)) |
+-----------------------------+
| 12.0 |
+-----------------------------+
```
A decimalv2 literal generates a wrong result too. However, this is caused by
bugs not only in the planner but also in the executor, so the executor
bug needs to be fixed in another PR.
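Why the result in the case above is wrong: a decimal with precision 2 and scale 1 has only one digit before the decimal point, so its largest value is 9.9 and `12` cannot be represented. A minimal range check, assuming standard SQL decimal semantics, is sketched below. The helper `fits_decimal` is hypothetical, and what Doris should return for the overflowing cast is not stated in this PR.

```python
from decimal import Decimal


def fits_decimal(value: str, precision: int, scale: int) -> bool:
    """Check whether value fits DECIMAL(precision, scale) after rounding to scale."""
    step = Decimal(1).scaleb(-scale)          # smallest unit, e.g. 0.1 for scale 1
    rounded = Decimal(value).quantize(step)   # round to the target scale
    max_abs = Decimal(10) ** (precision - scale) - step  # e.g. 9.9 for (2, 1)
    return abs(rounded) <= max_abs


fits_12 = fits_decimal("12", 2, 1)     # needs two integer digits, only one available
fits_9_9 = fits_decimal("9.9", 2, 1)   # largest representable value
print(fits_12, fits_9_9)
```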
The alter column stats operation needs to write to bdbje, so it should be forwarded to the master FE for execution. Otherwise, running the operation on a follower/observer will cause the FE to crash.
The `ADMIN SHOW` statements cannot be executed with recent versions of the mysql 8.x jdbc driver.
So these statements are renamed to remove the `ADMIN` keyword.
1. ADMIN SHOW CONFIG -> SHOW CONFIG
2. ADMIN SHOW REPLICA -> SHOW REPLICA
3. ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS
4. ADMIN SHOW TABLET -> SHOW TABLET
For compatibility, the old statements are still supported, but their use is not recommended.
They will be removed in a later version.
FIX
1. remove the `toString` and `getStringValue` methods of float and double literals introduced by
PR #23504 and PR #23271
These functions lead to wrong cast results for double and float literals
2. fix compute signature for datetimev2 always producing scale 6
3. fix stats calculator failing when generating node stats with two identical columns
4. fix constant folding on fe failing when casting double to integral
TODO
after the first problem is fixed, some mv matching does not work well; fix them later
- test_dup_mv_div
- test_dup_mv_json
- test_tcu
The materialized view is defined as follows:
> select l_linenumber, o_custkey
> from orders
> left join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY
> where o_custkey = 1;
A query like the following can be rewritten by the mv above.
This requires the query to have null-rejecting filters on the join's right input;
the currently supported filters are "=", "<", "<=", ">", ">=", "<=>"
> select IFNULL(orders.O_CUSTKEY, 0) as custkey_not_null,
> case when l_linenumber in (1,2,3) then l_linenumber else o_custkey end as case_when
> from orders
> inner join lineitem on orders.O_ORDERKEY = lineitem.L_ORDERKEY
> where o_custkey = 1 and l_linenumber > 0;
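The reason a null-rejecting filter is required can be checked on a small data set: once the filter discards rows where the right-side column is NULL, the left join in the mv and the inner join in the query return the same rows. The sketch below uses Python's built-in `sqlite3` purely as a stand-in engine; the table contents are made up for illustration and this is not how Doris performs the rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER);
    CREATE TABLE lineitem (l_orderkey INTEGER, l_linenumber INTEGER);
    -- Order 2 deliberately has no matching lineitem row.
    INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO lineitem VALUES (1, 1), (1, 5), (3, 2);
""")


def join_rows(join_type: str, extra_filter: str = "") -> list:
    """Run the mv-style query with the given join type and optional extra filter."""
    sql = f"""
        SELECT l_linenumber, o_custkey
        FROM orders
        {join_type} JOIN lineitem ON lineitem.l_orderkey = orders.o_orderkey
        WHERE o_custkey = 1 {extra_filter}
        ORDER BY l_linenumber
    """
    return conn.execute(sql).fetchall()


left_all = join_rows("LEFT")                                # keeps the NULL-padded row
left_filtered = join_rows("LEFT", "AND l_linenumber > 0")   # filter rejects NULLs
inner_filtered = join_rows("INNER", "AND l_linenumber > 0")
print(left_all)
print(left_filtered == inner_filtered)
```

Without the filter, the left join keeps the NULL-padded row for the order that has no line item, which is why such a query could not be answered by an inner join.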
Only write the editlog for manual analyze tasks; auto tasks skip it to reduce editlog writes.
Add the error message to the job info when a task fails.
Support creating a partitioned materialized view on a table that has no data.
For example, given the following table definition:
> CREATE TABLE `test_no_data` (
> `user_id` LARGEINT NOT NULL COMMENT '"user id"',
> `date` DATE NOT NULL COMMENT '"data load date and time"',
> `num` SMALLINT NOT NULL COMMENT '"quantity"'
> ) ENGINE=OLAP
> DUPLICATE KEY(`user_id`, `date`, `num`)
> COMMENT 'OLAP'
> PARTITION BY RANGE(`date`)
> (PARTITION p201701_1000 VALUES [('0000-01-01'), ('2017-02-01')),
> PARTITION p201702_2000 VALUES [('2017-02-01'), ('2017-03-01')),
> PARTITION p201703_all VALUES [('2017-03-01'), ('2017-04-01')))
> DISTRIBUTED BY HASH(`user_id`) BUCKETS 2
> PROPERTIES ('replication_num' = '1') ;
Even when table test_no_data has no data, a partitioned materialized view can be created as follows:
> CREATE MATERIALIZED VIEW no_data_partition_mv
> BUILD IMMEDIATE REFRESH AUTO ON MANUAL
> partition by(`date`)
> DISTRIBUTED BY RANDOM BUCKETS 2
> PROPERTIES ('replication_num' = '1')
> AS
> SELECT * FROM test_no_data where date > '2017-05-01';
>
Query rewrite by mv supports bitmap_union and bitmap_union_count roll up. The aggregate functions that support roll up are listed below:
| Function in query  | Function in materialized view | Function after roll up |
|--------------------|-------------------------------|------------------------|
| max                | max                           | max                    |
| min                | min                           | min                    |
| sum                | sum                           | sum                    |
| count              | count                         | sum                    |
| count(distinct )   | bitmap_union                  | bitmap_union_count     |
| bitmap_union       | bitmap_union                  | bitmap_union           |
| bitmap_union_count | bitmap_union                  | bitmap_union_count     |
this depends on https://github.com/apache/doris/pull/29256
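The roll up table can be read as a lookup from (function in query, function in materialized view) to the function applied on top of the mv. The sketch below encodes that table and checks the `count` to `sum` row on a toy data set; the names `ROLLUP_FUNCTIONS` and `rollup_function` are hypothetical and do not come from the Doris code base.

```python
from collections import Counter

# (function in query, function in materialized view) -> function after roll up
ROLLUP_FUNCTIONS = {
    ("max", "max"): "max",
    ("min", "min"): "min",
    ("sum", "sum"): "sum",
    ("count", "count"): "sum",
    ("count(distinct )", "bitmap_union"): "bitmap_union_count",
    ("bitmap_union", "bitmap_union"): "bitmap_union",
    ("bitmap_union_count", "bitmap_union"): "bitmap_union_count",
}


def rollup_function(query_func: str, mv_func: str):
    """Return the roll up function, or None if the pair is not in the table."""
    return ROLLUP_FUNCTIONS.get((query_func, mv_func))


# Toy check of the count -> sum row: the mv stores counts per (k1, k2),
# and a query grouped by k1 alone adds those partial counts together.
rows = [("x", 1), ("x", 1), ("x", 2), ("y", 1)]
mv_counts = Counter(rows)                      # count(*) grouped by (k1, k2)
rolled_up = Counter()
for (k1, _k2), partial_count in mv_counts.items():
    rolled_up[k1] += partial_count             # sum of partial counts per k1
direct = Counter(k1 for k1, _k2 in rows)       # count(*) grouped by k1
print(rollup_function("count", "count"), rolled_up == direct)
```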
The current logic for SQL dialect conversion is all in the `fe-core` module, which may lead to the following issues:
- The dialect conversion logic may change frequently. Because it lives in the fe-core module, each change requires users to upgrade the Doris version, leading to a longer change cycle.
- The cost of customized development is high, requiring users to replace the fe-core JAR package.
Turning it into a plugin can address the above issues properly.
Problem:
fe ut failed because of a null pointer error
Cause:
fe ut failed to get the statement context from the connection context
Resolved:
add a null check