Commit Graph

5983 Commits

Author SHA1 Message Date
62064a86bf [test](ut) added UT cases for show create load stmt (#29564) 2024-01-12 13:58:20 +08:00
1718341051 [pipelineX](fix) Fix correctness problem due to local hash shuffle (#29881) 2024-01-12 13:58:19 +08:00
acda8d2129 [feature](profile )merge of profiles can be disabled by profile level. #29861
The merging of profiles requires ensuring the correctness of the profiles themselves. However, if merging is intended for troubleshooting correctness issues through profiles, errors may occur.

Moreover, the 'try-catch' does not catch exceptions related to profile merging. If merging fails, even the normal profile cannot be obtained.
2024-01-12 13:58:19 +08:00
3ef1229635 [docs](query-accel) refine several statements in docs (#29716) 2024-01-12 13:58:19 +08:00
2a51750abd [fix](dynamic partition) fix dynamic partition storage medium not working (#29490) 2024-01-12 13:58:19 +08:00
0d6ab3c68c [chore](regression test) check disk is good (#29740) 2024-01-12 13:58:19 +08:00
53639a01fe [Fix] (schema change) fix the bug that non light schema change tables can rename column (#29850) 2024-01-12 13:58:19 +08:00
fc5dc1c285 [config](move-memtable) set default load_stream_per_node to 20 (#29822) 2024-01-12 13:58:19 +08:00
cbffdbb8bf [bug](group_commit) fix relay wal problem on materialized-view (#29848) 2024-01-12 13:58:19 +08:00
a4f29193f6 [pipelineX](fix) Fix incorrect runtime filter (#29860) 2024-01-12 13:58:19 +08:00
ebfbe0c8dd [opt](information_schema) support information_schema in external catalog (#28919)
Add an `information_schema` database to every catalog.
This is useful when BI tools connect to Doris,
because the tools can read meta info from `information_schema`.

This PR mainly changes:

1. There will be an `information_schema` db in each catalog.
2. Each `information_schema` db only stores the meta info of the catalog it belongs to.
3. For `information_schema`, the `TABLE_SCHEMA` column's value is the database name.
4. There is a new global variable `show_full_dbname_in_info_schema_db`, which defaults to false. If set to true,
    the `TABLE_SCHEMA` column's value looks like `ctl.db`, because:

	When connecting to Doris, the `database` info in the connection url will be: `xxx?db=ctl.db`.
	
	Some BI tools will then try to query `information_schema` with sql like:
	
	`select * from information_schema.columns where TABLE_SCHEMA = "ctl.db"`
	
	So it has to be formatted as `ctl.db`.
	
	e.g., the `information_schema.columns` table in external catalog `doris` looks like:
	
	```
	mysql> select * from information_schema.columns limit 1\G
	*************************** 1. row ***************************
	           TABLE_CATALOG: doris
	            TABLE_SCHEMA: doris.__internal_schema
	              TABLE_NAME: column_statistics
	             COLUMN_NAME: id
	        ORDINAL_POSITION: 1
	          COLUMN_DEFAULT: NULL
	             IS_NULLABLE: NO
	               DATA_TYPE: varchar
	CHARACTER_MAXIMUM_LENGTH: 4096
	  CHARACTER_OCTET_LENGTH: 16384
	       NUMERIC_PRECISION: NULL
	           NUMERIC_SCALE: NULL
	      DATETIME_PRECISION: NULL
	      CHARACTER_SET_NAME: NULL
	          COLLATION_NAME: NULL
	             COLUMN_TYPE: varchar(4096)
	              COLUMN_KEY:
	                   EXTRA:
	              PRIVILEGES:
	          COLUMN_COMMENT:
	             COLUMN_SIZE: 4096
	          DECIMAL_DIGITS: NULL
	   GENERATION_EXPRESSION: NULL
	                  SRS_ID: NULL
	```
	
5. Modify the behavior of

	- show tables
	- show databases
	- show columns
	- show table status

	The above statements may query the `information_schema` db if there is a `where` predicate after them.
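
The `TABLE_SCHEMA` formatting described in item 4 can be sketched as follows. This is only an illustrative Python sketch, not Doris code: the function `table_schema_value` and its parameters are hypothetical, and the `show_full_dbname` flag merely stands in for the `show_full_dbname_in_info_schema_db` variable.

```python
def table_schema_value(catalog: str, database: str,
                       show_full_dbname: bool = False) -> str:
    """Return the value reported in TABLE_SCHEMA for a database.

    Hypothetical helper: show_full_dbname mirrors the
    show_full_dbname_in_info_schema_db variable (default false).
    When it is true, the name is qualified as `ctl.db`.
    """
    if show_full_dbname:
        return f"{catalog}.{database}"
    return database


# With the variable enabled, the value matches what a BI tool would
# filter on after connecting with `db=ctl.db`.
print(table_schema_value("doris", "__internal_schema", show_full_dbname=True))
```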
2024-01-12 13:58:19 +08:00
d3721455b0 [Session](rf) Change the default min size of bf runtime filter (#29837) 2024-01-12 13:58:19 +08:00
f67a00ffbb [opt](nereids) prune runtime redundant filters (#29828)
1. expand_runtime_filter_by_inner_join creates some redundant runtime filters (e.g., in tpch q5 and q9), so one of them needs to be removed
2. hive: prune the rf if the target is only used as probe
2024-01-12 13:58:19 +08:00
ed3c8bba87 [fix](auth)remove the key when priv is empty (#29522)
- remove the key when priv is empty
- check priv when create mv
2024-01-12 13:58:19 +08:00
8ba1eb0b02 [feature](mtmv) task tvf add queryId (#29671)
To better locate abnormal situations, add queryId
2024-01-12 12:00:32 +08:00
4d97f8ea75 [enhance](function) support two special format for str_to_date (#29823) 2024-01-12 12:00:32 +08:00
eed72a101e [fix](Nereids) decimalv3 cast in fe produce wrong data (#29808)
case:
```
MySQL root@127.0.0.1:test> select cast(12 as decimalv3(2,1))
+-----------------------------+
| cast(12 as DECIMALV3(2, 1)) |
+-----------------------------+
| 12.0                        |
+-----------------------------+
```

The decimalv2 literal generates a wrong result too. However, the bugs are
not only in the planner but also in the executor, so the executor bug
needs to be fixed in another PR.
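
Why `12.0` is a wrong result for the case above can be checked with a small range test: a decimal with precision 2 and scale 1 has room for one integer digit, so 12 does not fit. The sketch below uses Python's `decimal` module; `fits_decimal` is a hypothetical helper, and it only shows the range check, not what Doris returns for an overflowing cast (the commit message does not say).

```python
from decimal import Decimal


def fits_decimal(value: Decimal, precision: int, scale: int) -> bool:
    """Check whether value fits DECIMAL(precision, scale).

    The value is first rounded to `scale` fractional digits, then its
    integer part must fit in (precision - scale) digits.
    """
    quantized = value.quantize(Decimal(1).scaleb(-scale))
    limit = Decimal(10) ** (precision - scale)
    return abs(quantized) < limit


print(fits_decimal(Decimal("12"), 2, 1))   # 12 needs two integer digits
print(fits_decimal(Decimal("9.9"), 2, 1))  # largest value that fits
```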
2024-01-12 12:00:13 +08:00
885d8b28ba [fix](Nerids): fix function deps when check unique and not null #29797 2024-01-12 11:59:52 +08:00
18f850c94f [enhance](auto-partition) forbid null column for auto partition (#29749) 2024-01-12 11:59:52 +08:00
e93a16ac6e [fix](Nereids) support complex literal cast in fe (#29599) 2024-01-12 11:59:52 +08:00
0d691c638b [Feature](profile)Support report runtime workload statistics #29591 2024-01-12 11:59:27 +08:00
17a2b89945 [runtimeFilter](nereids) expand runtime filter by join condition by default (#29633)
1. expand rf by join condition 
2. fix ignore_shape_nodes bug
2024-01-12 11:59:27 +08:00
a94343c5f9 [fix](planner) Fix table sample not take effect if exist conjunct #29814 2024-01-12 11:59:27 +08:00
81d6775b7b [Cleanup](Nereids): delete useless ddlSql to avoid wrong usage (#29788)
ddlSql is useless and some code uses getDdlSql() incorrectly, so delete that code
2024-01-12 11:57:16 +08:00
99c8e47518 [fix](nereids) fix regression case "nereids_p0/runtimefilter" (#29776) 2024-01-12 11:53:58 +08:00
a244f11da2 [fix](statistics)Fix alter column stats not forward to master bug (#29786)
The alter column stats operation needs to write to bdbje, so it should be forwarded to the master for execution. Otherwise, running the operation on a follower/observer will crash the FE.
2024-01-12 11:53:57 +08:00
697a6a4ba2 [Refactor](admin-stmt) rename some admin-show statestmt (#29492)
The `ADMIN SHOW` statements cannot be executed with higher versions of the MySQL 8.x JDBC driver.
So these statements are renamed to remove the `ADMIN` keyword.

1. ADMIN SHOW CONFIG -> SHOW CONFIG
2. ADMIN SHOW REPLICA -> SHOW REPLICA
3. ADMIN DIAGNOSE TABLET -> SHOW TABLET DIAGNOSIS
4. ADMIN SHOW TABLET -> SHOW TABLET

For compatibility, the old statements are still supported, but their use is not recommended.
They will be removed in a later version.
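
For client code that still issues the old forms, the four renames listed above can be applied mechanically. The following is a minimal Python sketch, not part of Doris: `RENAMES` and `modernize` are hypothetical names, and the sketch assumes the remainder of each statement is unchanged by the rename, which the commit message does not confirm.

```python
import re

# Old statement prefix -> new statement prefix, as listed in the commit message.
RENAMES = [
    ("ADMIN SHOW CONFIG", "SHOW CONFIG"),
    ("ADMIN SHOW REPLICA", "SHOW REPLICA"),
    ("ADMIN DIAGNOSE TABLET", "SHOW TABLET DIAGNOSIS"),
    ("ADMIN SHOW TABLET", "SHOW TABLET"),
]


def modernize(statement: str) -> str:
    """Rewrite a deprecated ADMIN statement prefix to its new form.

    Statements that do not start with one of the old prefixes are
    returned unchanged. Matching ignores case and extra whitespace.
    """
    for old, new in RENAMES:
        pattern = r"^\s*" + r"\s+".join(old.split()) + r"\b"
        if re.match(pattern, statement, flags=re.IGNORECASE):
            return re.sub(pattern, new, statement, count=1, flags=re.IGNORECASE)
    return statement


print(modernize("ADMIN SHOW CONFIG LIKE '%x%'"))
```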
2024-01-12 11:53:57 +08:00
ddf2e8d3dd [feature](Nereids): merge topNs (#28246)
merge topNs like 
```
TopN
|
TopN

merge ->

TopN
```
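
One way to read the merge above, assuming both TopN nodes sort by the same keys, is that the merged node keeps the smaller effective limit and adds the offsets. The Python sketch below illustrates only that arithmetic; the `TopN` class and `merge_topn` function are hypothetical, and the actual preconditions of the Nereids rule are not stated in the commit message.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TopN:
    """Hypothetical plan node: sort by order_keys, skip offset, keep limit rows."""
    order_keys: Tuple[str, ...]
    limit: int
    offset: int = 0


def merge_topn(parent: TopN, child: TopN) -> Optional[TopN]:
    """Merge parent-over-child TopN when both sort by the same keys.

    Returns None when the keys differ (this sketch does not handle that case).
    """
    if parent.order_keys != child.order_keys:
        return None
    # Rows left after the parent skips its offset within the child's output.
    remaining = max(0, child.limit - parent.offset)
    return TopN(
        order_keys=child.order_keys,
        limit=min(parent.limit, remaining),
        offset=child.offset + parent.offset,
    )


print(merge_topn(TopN(("a",), 5), TopN(("a",), 10)))
```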
2024-01-12 11:53:42 +08:00
3e9cd3a8b9 [minor](jdbc) fix wrong log and add more info (#29557) 2024-01-12 11:53:21 +08:00
e17809a684 [fix](nereids)logicalhaving is in wrong place after logicalagg and logicalwindow (#29463) 2024-01-12 11:48:39 +08:00
883d6dfc73 [fix](planner)strip trailing zeros for decimal literal if the precision larger than max decimal precision in doris (#29737) 2024-01-12 11:48:39 +08:00
HB
ff7f09fe1f [fix](executor) Fe publish topic info tcp leak (#29739)
* [fix](executor) Fe publish topic info tcp leak

* enhancement
2024-01-12 11:48:39 +08:00
fe5b0e9880 [FIX](struct)fix struct literal in fe const fold with field name #29735 2024-01-12 11:48:39 +08:00
2c44951543 [fix](planner)only allow null safe equal when both children are nullable (#29470) 2024-01-12 11:46:29 +08:00
463a7ab212 [Performance](exec) opt the exchange performance (#29579) 2024-01-12 11:46:29 +08:00
7a75cde77d [fix](schemachange) Fixed the issue of incorrect log information when distribution columns are compared inconsistently (#27013) 2024-01-12 11:46:29 +08:00
0a853be3d1 [fix](storage medium) show create table don't print empty storage medium #29650 2024-01-12 11:46:29 +08:00
b7a819bd24 [Fix](typo) Fix analyzeGroupCommitDataBytes typo #29640 2024-01-12 11:46:29 +08:00
Pxl
7738eca6da [Bug](stream-load) fix stream load failed on table with rollup (#29665)
fix stream load failed on table with rollup
2024-01-12 11:46:29 +08:00
9cbb55d49b [fix](Nereids) create double literal when create decimal literal failed (#28959)
FIX
1. remove the float and double literal toString and getStringValue introduced by
  PR #23504 and PR #23271.
  These functions lead to wrong cast results for double and float literals
2. fix compute signature for datetimev2 always producing scale 6
3. fix stats calculator failure when generating node stats with two identical columns
4. fix constant folding failure on fe when casting double to an integral type

TODO
After fixing the first problem, some mv matching cases do not work well; fix them later:
- test_dup_mv_div
- test_dup_mv_json
- test_tcu
2024-01-12 11:46:29 +08:00
da182a8b6f [feature](nereids)print nereids node id in explain (#29238)
* print nereids id in explain
2024-01-12 11:44:21 +08:00
e4707154fa [opt](statistics) create or update table stats after alter column stats.
Create or update table stats after alter column stats.
Set flag to disable auto analyze for the table after user inject column stats.
2024-01-12 11:44:21 +08:00
fda001b6d3 [Improvement](nereids) Support join derivation when mv rewrite (#29609)
The materialized view definition is as follows:
>            select l_linenumber, o_custkey
>           from orders
>            left join lineitem on lineitem.L_ORDERKEY = orders.O_ORDERKEY
>            where o_custkey = 1;

When the query is as follows, it can be rewritten using the mv above.
This requires that the query has null-rejecting filters on the join's right input;
the currently supported filters are "=", "<", "<=", ">", ">=", "<=>"
>            select IFNULL(orders.O_CUSTKEY, 0) as custkey_not_null,
>           case when l_linenumber in (1,2,3) then l_linenumber else o_custkey end as case_when
>            from orders
>            inner join lineitem on orders.O_ORDERKEY = lineitem.L_ORDERKEY
>            where o_custkey = 1 and l_linenumber > 0;
2024-01-12 11:44:21 +08:00
34fe5ee38b [feat](Nereids) support show constraint command (#29667)
show constraints from t1;
+------+-------------+-----------------------------------------+
| Name | Type        | Definition                              |
+------+-------------+-----------------------------------------+
| fk   | FOREIGN KEY | FOREIGN KEY (id) REFERENCES cir.t1 (id) |
| uk   | UNIQUE      | UNIQUE (id)                             |
| pk   | PRIMARY KEY | PRIMARY KEY (id)                        |
+------+-------------+-----------------------------------------+
2024-01-12 11:44:21 +08:00
be56bf06cf [feature](function) support ip function named is_ip_address_in_range(addr, cidr) (#29681) 2024-01-12 11:44:21 +08:00
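
The check performed by a function with the signature `is_ip_address_in_range(addr, cidr)` can be sketched with Python's standard `ipaddress` module. This is an illustration of the general technique only; the exact semantics of the Doris function (for example its handling of invalid input or mixed IPv4/IPv6 arguments) are not described in the commit title and are assumptions here.

```python
import ipaddress


def is_ip_address_in_range(addr: str, cidr: str) -> bool:
    """Return True if addr lies inside the network described by cidr.

    Assumption: an address and a network of different IP versions
    never match. Invalid input raises ValueError.
    """
    network = ipaddress.ip_network(cidr, strict=False)
    address = ipaddress.ip_address(addr)
    return address.version == network.version and address in network


print(is_ip_address_in_range("192.168.1.5", "192.168.1.0/24"))
```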
7357ca62af [improvement](statistics)Only write editlog for manual analyze task, don't need to do so for auto tasks. #29685
Only write the editlog for manual analyze tasks; auto tasks skip it to reduce editlog writes.
Add the error message to job info when a task fails.
2024-01-12 11:44:21 +08:00
78aabc3492 [session](shared) disable shared scan in default (#29703) 2024-01-12 11:44:21 +08:00
028e59efab [refactor](Nereids): unify all replaceNamedExpressions (#28228)
Use a unified function `replaceNamedExpressions` instead of reimplementing it repeatedly.
2024-01-12 11:44:21 +08:00
0c7c9485b6 [Fix](nereids) Fix get ralated partition table when nodata (#29453)
Support creating a partitioned materialized view on a table with no data.
For example, given the following table definition:
>        CREATE TABLE `test_no_data` (
>         `user_id` LARGEINT NOT NULL COMMENT '"user id"',
>         `date` DATE NOT NULL COMMENT '"data ingestion date and time"',
>         `num` SMALLINT NOT NULL COMMENT '"quantity"'
>        ) ENGINE=OLAP
>        DUPLICATE KEY(`user_id`, `date`, `num`)
>        COMMENT 'OLAP'
>        PARTITION BY RANGE(`date`)
>        (PARTITION p201701_1000 VALUES [('0000-01-01'), ('2017-02-01')),
>        PARTITION p201702_2000 VALUES [('2017-02-01'), ('2017-03-01')),
>        PARTITION p201703_all VALUES [('2017-03-01'), ('2017-04-01')))
>        DISTRIBUTED BY HASH(`user_id`) BUCKETS 2
>        PROPERTIES ('replication_num' = '1') ;

When table test_no_data has no data, a partitioned materialized view can still be created as follows:
>        CREATE MATERIALIZED VIEW no_data_partition_mv
>            BUILD IMMEDIATE REFRESH AUTO ON MANUAL
>            partition by(`date`)
>            DISTRIBUTED BY RANDOM BUCKETS 2
>            PROPERTIES ('replication_num' = '1')
>            AS
>           SELECT * FROM test_no_data where date > '2017-05-01';
>
2024-01-12 11:44:21 +08:00
28695249ea [Fix](nereids) Fix partition check failure (#29642)
Optimize the mv rewrite partition check logic, fix the check failure, and
add more relevant explain info.
2024-01-12 11:44:21 +08:00