9414 Commits

Author SHA1 Message Date
3a59ee1c5d [fix](auditlog)Record return row count in audit log for internal query. (#39616) (#39702)
backport: https://github.com/apache/doris/pull/39616
2024-08-21 17:37:01 +08:00
1460878bdf [fix](cluster key) forbid cluster key and remove case (#39679)
branch-2.1 does not support mow cluster key
2024-08-21 14:31:54 +08:00
ba3b56d269 [fix](nereids)prevent null pointer exception if datetime value overflows (#39675)
pick from master https://github.com/apache/doris/pull/39482
2024-08-21 14:17:34 +08:00
0bfcee1251 [opt](file-cache) support system table file_cache_statistics (#39552)
1. Add new system table: `file_cache_statistics`

	This table is used for viewing metrics related to file cache on BE side

	```
	mysql> select * from information_schema.file_cache_statistics limit 10;
	+-------+---------------+----------------------------+--------------------------------+--------------------+
	| BE_ID | BE_IP         | CACHE_PATH                 | METRIC_NAME                    | METRIC_VALUE       |
	+-------+---------------+----------------------------+--------------------------------+--------------------+
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_elements | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_curr_size     | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_elements  | 102400             |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | disposable_queue_max_size      | 21474836480        |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio                     | 0.8539634687001242 |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_1h                  | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | hits_ratio_5m                  | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_elements      | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_curr_size          | 0                  |
	| 10003 | 172.20.32.136 | /mnt/output/be/file_cache/ | index_queue_max_elements       | 102400             |
	+-------+---------------+----------------------------+--------------------------------+--------------------+
	```

	It will show metrics of file caches on each BE.

2. Add new metrics `hits_ratio_1h` and `hits_ratio_5m` for file cache

These two metrics show the file cache hit ratio over the most recent
1 hour and 5 minutes respectively, so that we can see the recent hit
ratio instead of only the global historical hit ratio.
2024-08-21 10:03:39 +08:00
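
The windowed metrics added in 0bfcee1251 (`hits_ratio_5m`, `hits_ratio_1h`) can be pictured with a minimal Python sketch. This is not the BE implementation (which the commit message does not show); the class name `WindowedHitRatio`, its methods, and the injectable `clock` are all assumptions made for illustration.

```python
from collections import deque
import time


class WindowedHitRatio:
    """Hit ratio over the most recent `window_seconds` (hypothetical helper)."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the window can be tested
        self.events = deque()       # (timestamp, was_hit) pairs, oldest first
        self.hits = 0               # number of hits currently inside the window

    def _evict(self, now):
        # Drop events that have fallen out of the window.
        while self.events and now - self.events[0][0] > self.window_seconds:
            _, was_hit = self.events.popleft()
            self.hits -= was_hit

    def record(self, was_hit):
        now = self.clock()
        self._evict(now)
        self.events.append((now, bool(was_hit)))
        self.hits += bool(was_hit)

    def ratio(self):
        # With no accesses inside the window the ratio is reported as 0.
        self._evict(self.clock())
        return self.hits / len(self.events) if self.events else 0.0
```

A five-minute metric would be `WindowedHitRatio(300)` and a one-hour metric `WindowedHitRatio(3600)`; a global ratio, by contrast, never evicts anything.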
bf26f49505 [bugfix](external)add check of engine and catalog types for 2.1 #39343 (#39643)
bp #39343
2024-08-21 09:50:17 +08:00
8a562aeb77 [opt](nereids) recover adaptive bucket shuffle (#39598)
## Proposed changes

pick from https://github.com/apache/doris/pull/36784

Co-authored-by: xiongzhongjian <xiongzhongjian@selectdb.com>
2024-08-21 09:26:53 +08:00
6df6f1dc97 [improvement](iceberg)support doris's char/varchar to iceberg's string for 2.1 #38807 (#39645)
bp: #38807
2024-08-21 09:19:10 +08:00
28ce116b17 [improvement](iceberg)add some description for show create for 2.1 #39179 (#39644)
## Proposed changes

bp: #39179

1. add `location` and `properties` for `show create table`.
2. add `location` for `show create database`.
2024-08-21 09:18:38 +08:00
2fe6d580be [improvement](diagnose) add tablet in recycle bin hint #39547 (#39622)
cherry pick from #39547
2024-08-21 09:16:01 +08:00
57262a3d5c [fix](partition rebalancer) fix migrate tablets between backends back and forth #39333 (#39606)
cherry pick from #39333
2024-08-21 09:15:31 +08:00
bb687bd69c [cherry-pick](branch-2.1) add function regexp_extract_or_null (#39561)
# Proposed changes

pick https://github.com/apache/doris/pull/38296
2024-08-21 09:14:58 +08:00
7c3c5c67fc [log](statistics)Add result row count log for statistics internal query. (#39556) (#39609)
backport: https://github.com/apache/doris/pull/39556
2024-08-20 23:18:48 +08:00
a5daa3edc8 [opt](variables) enlarge the default value of max_allowed_packet (#38697) (#39626)
bp #38697
2024-08-20 22:02:01 +08:00
dfd21bd2a0 [fix](fe-log) add position info in async mode #39419 (#39571)
pick part of #39419
2024-08-20 22:01:34 +08:00
a4deefea5d [fix](catalog) gen partition id by name (#39325) (#39625)
bp #39325
2024-08-20 22:00:19 +08:00
a3fd13fee6 [fix](catalog) set timeout for split fetch (#39346) (#39624)
bp #39346
2024-08-20 21:59:55 +08:00
0e21dba817 [opt](catalog) modify some meta cache logic (#38506) (#39628)
#38506
2024-08-20 21:57:55 +08:00
607887673e [improvement](report) report handler discard old report tasks #39469 (#39605)
cherry pick from #39469
2024-08-20 17:40:49 +08:00
e302882e52 [branch-2.1](pick) Pick 2 PRs to branch-2.1 (#39604)
## Proposed changes

pick #39480 #39589

2024-08-20 17:10:30 +08:00
621d394a5e [enhance](Backup) Do connectivity check when creating repository (#38350) (#39538)
Previously, FE did not check connectivity when creating a repository,
which could lead to confusing errors later during backup and restore.

pick #38350

Co-authored-by: AlexYue <yj976240184@gmail.com>
2024-08-19 22:16:02 +08:00
9647885b95 [fix](routine load) should update progress before handle transaction state transform (#39311) (#39526)
pick (#39311)

Updating the progress may throw an exception. If that happens after the
offset has been persisted to the edit log or meta service, the in-memory
data is left stale, which causes repeated consumption.
2024-08-19 21:23:59 +08:00
3d8b04a782 [fix](stream load) do not throw exception but skip record when can not find database (#39360) (#39527)
pick (#39360)

When fetching stream load records from a BE node, if the database cannot
be found, StreamLoadRecordMgr throws an exception and the remaining
records are not recorded in memory.

For example: ten stream load records are pulled, and the database
associated with the first record has been deleted by the user. The pull
then ends early, so the remaining nine records are never recorded in
memory.

This PR skips the record instead of throwing an exception when the
database cannot be found, which solves this problem.
2024-08-19 21:23:26 +08:00
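
The skip-instead-of-throw behaviour described in 3d8b04a782 can be sketched as follows. This is a simplified Python illustration, not the actual `StreamLoadRecordMgr` code; the function name `load_records` and the record fields `db` and `label` are hypothetical.

```python
def load_records(pulled_records, known_databases):
    """Keep every record whose database still exists; skip the rest.

    `pulled_records` is a list of dicts with hypothetical keys "db" and
    "label"; `known_databases` is the set of database names that exist.
    """
    kept, skipped = [], []
    for record in pulled_records:
        if record["db"] not in known_databases:
            # Before the fix, an exception raised at this point ended the
            # whole pull and the remaining records were lost.
            skipped.append(record["label"])
            continue
        kept.append(record)
    return kept, skipped
```

With the ten-record example from the commit message, the first record is skipped and the other nine are still kept.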
a1aa9b8ab9 [fix](routine load) add read lock to fix some concurrent bugs (#39242) (#39525)
pick #39242
2024-08-19 21:18:27 +08:00
830f250a80 [opt](query cancel) cancel query if it has pipeline task leakage #39223 (#39537)
pick #39223 with some modifications. Optimization will only be applied
to pipeline x.
2024-08-19 14:33:59 +08:00
c0cbb2362c [enhancement](schema-change) Record detailed fail reason for schema change tasks (#39351) (#39501)
## Proposed changes

Expose the error msg from BE as the real fail reason recorded for schema
change tasks. To avoid too much memory usage, we just pick one among all
to record.
2024-08-18 13:51:06 +08:00
e01d051acf [improvement](external catalog)Optimize the process of refreshing catalog for 2.1 (#39205) (#39186)
## Proposed changes

bp: #39205

When the catalog attributes have not changed, refreshing the catalog
only requires processing the cache, without rebuilding the entire
catalog.
2024-08-17 17:02:06 +08:00
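
The refresh rule from e01d051acf (invalidate only the cache when the catalog attributes are unchanged) might look roughly like the sketch below. The `Catalog` class, its fields, and the returned strings are illustrative assumptions, not Doris FE classes.

```python
class Catalog:
    """Toy catalog with properties and a metadata cache (hypothetical)."""

    def __init__(self, properties):
        self.properties = dict(properties)
        self.cache = {}
        self.rebuild_count = 0  # how many times the catalog was fully rebuilt

    def refresh(self, new_properties):
        new_properties = dict(new_properties)
        if new_properties == self.properties:
            # Attributes unchanged: dropping cached metadata is enough.
            self.cache.clear()
            return "cache_invalidated"
        # Attributes changed: the catalog itself has to be rebuilt.
        self.properties = new_properties
        self.cache.clear()
        self.rebuild_count += 1
        return "rebuilt"
```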
fd4d1f4e4f [chore](table) Add batch method to get visible version of the olap table (#38949) (#39495)
Cherry-pick #38949
2024-08-17 16:55:06 +08:00
b0da8430bc [opt](inverted index) Optimize the usage of the multi_match function (#39472)
## Proposed changes

https://github.com/apache/doris/pull/39193

2024-08-17 16:53:52 +08:00
20936fe054 [branch-2.1][improvement](jdbc catalog) Compatible with ojdbc6 by adding version check (#39408)
pick (#39341)

In previous versions, we used a method based on JDBC 4.2 to read data,
so it was equivalent to abandoning support for ojdbc6. However, we
recently found that a large number of users still use Oracle version
11g, which will have some unexpected compatibility issues when using
ojdbc8 to connect. Therefore, I use version verification to make it
compatible with both ojdbc6 and ojdbc8, so that good compatibility can
be obtained through ojdbc6, and better reading efficiency can be
obtained through ojdbc8.
2024-08-17 16:43:01 +08:00
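
The version check described in 20936fe054 can be pictured as a small dispatch on the JDBC specification level that the driver reports. The commit message does not say which version value is actually inspected, so the sketch below is an assumption: it takes the major and minor numbers as plain integers (in Java these could come from `DatabaseMetaData.getJDBCMajorVersion()` and `getJDBCMinorVersion()`), and the function name `choose_reader` and its return labels are hypothetical.

```python
def choose_reader(jdbc_major, jdbc_minor):
    """Pick a read path from the JDBC spec version the driver reports.

    Returns "jdbc42" for drivers at JDBC 4.2 or newer (the faster path
    mentioned in the commit) and "legacy" for older drivers.
    """
    if (jdbc_major, jdbc_minor) >= (4, 2):
        return "jdbc42"
    return "legacy"
```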
79aa079cc6 [fix](array-funcs) array min/max #39307 (#39484)
2024-08-17 10:56:44 +08:00
472565cd48 [enhance](mtmv) refresh hms table before run mtmv task (#38212) (#39490)
pick from master #38212
2024-08-16 20:05:52 +08:00
ae8073f155 [opt](mtmv) partition rollup support week and quarter (#39286) (#39477)
pick from master #39286
2024-08-16 20:01:06 +08:00
c84cb5cf3d [enhance](mtmv) hive cache add partitionId to partitionName Map (#38525) (#39476)
pick from master #38525
2024-08-16 20:00:22 +08:00
c9a246b65b [fix](mtmv) Fix cancelled tasks with running status (#39424) (#39489)
pick from master #39424
2024-08-16 19:54:16 +08:00
2948b5ea2b [branch-2.1][fix](jdbc scan) Remove the conjuncts.remove call in JdbcScan (#39407)
pick (#39180)

In #37565, due to the change in the calling order of finalize, the final
generated Plan will be missing the PREDICATES that have been pushed down
in Jdbc. Although this behavior is correct, before perfectly handling
the push down of various PREDICATES, we need to keep all conjuncts to
ensure that we can still filter data normally when the data returned by
Jdbc is a superset.
2024-08-16 19:01:40 +08:00
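
The reasoning in 2948b5ea2b (keep every conjunct so that rows can still be filtered when the JDBC source returns a superset) amounts to re-applying the predicates locally. A minimal Python sketch, with the hypothetical names `filter_rows` and `conjuncts`, and predicates modelled as plain functions:

```python
def filter_rows(remote_rows, conjuncts):
    """Re-apply all conjuncts locally to rows returned by the remote source.

    Even if some predicates were pushed down, the remote side may return a
    superset of the matching rows, so every conjunct is evaluated again.
    """
    return [row for row in remote_rows if all(pred(row) for pred in conjuncts)]
```

If a pushed-down predicate had been removed from `conjuncts`, any extra rows returned by the remote source would pass through unfiltered.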
f203ee8224 [enhance](auth)modify priv of refresh catalog/db/table (#39008) (#39475)
pick from master #39008
2024-08-16 17:31:58 +08:00
4458302a77 [Fix](Planner) fix delete from using does not attach partition information (#39020) (#39418)
cherry-pick from master #39020
Problem:
When a `delete from ... using` statement specifies partition information,
it could also delete data from other partitions.
Solution:
Carry the partition information over when the delete clause is
transformed into an `insert into ... select` clause.
2024-08-16 17:16:08 +08:00
e249e00586 [fix](Nereids) fix explain plan with sql cache (#39431) (#39463)
introduced by #38950, explain plan with sql cache will throw an exception
```
errCode = 2, detailMessage = Cannot invoke "org.apache.doris.nereids.trees.plans.Plan.treeString()" because "this.optimizedPlan" is null
```
2024-08-16 15:51:47 +08:00
6aafc5adb4 [improvement](statistics)Support drop cached stats. (#39367) (#39462)
backport: https://github.com/apache/doris/pull/39367
2024-08-16 14:12:58 +08:00
d56000e924 [opt](Nereids) polish aggregate function signature matching (#39352) (#39460)
pick from master #39352

use double to match string
- corr
- covar
- covar_samp
- stddev
- stddev_samp

use largeint to match string
- group_bit_and
- group_bit_or
- group_bit_xor

use double to match decimalv3
- topn_weighted

optimize error message
- multi_distinct_sum
- multi_distinct_sum0
2024-08-16 13:57:11 +08:00
ec0e413317 [fix](Nereids) npe when delete with cte and without using (#39379) (#39441)
pick from master #39379
2024-08-16 13:56:24 +08:00
5d6527c536 [opt](set operation) INTERSECT should be evaluated before others (#39095) (#39437)
pick from master #39095

This is a behaviour-change PR.

The set operation INTERSECT should be evaluated before the others.
Historically, all set operators in Doris had the same priority.

This PR changes Nereids so that it behaves the same as MySQL.
2024-08-16 11:08:36 +08:00
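
The behaviour change in 5d6527c536 can be seen with plain Python sets. The sketch below is an illustration only, not Nereids code; the function `evaluate` and its `intersect_first` flag are hypothetical, and UNION is modelled with set (DISTINCT) semantics.

```python
def evaluate(operands, operators, intersect_first=True):
    """Evaluate `operands[0] op operands[1] op ...` over Python sets.

    Each operator is "UNION", "INTERSECT" or "EXCEPT". With
    `intersect_first=True`, INTERSECT binds tighter than the other two;
    with False, all operators share one priority (left to right).
    """
    operands = [set(o) for o in operands]
    operators = list(operators)
    if intersect_first:
        # First pass: fold every INTERSECT into its left neighbour.
        folded_operands, folded_operators = [operands[0]], []
        for op, rhs in zip(operators, operands[1:]):
            if op == "INTERSECT":
                folded_operands[-1] = folded_operands[-1] & rhs
            else:
                folded_operators.append(op)
                folded_operands.append(rhs)
        operands, operators = folded_operands, folded_operators
    # Remaining operators are applied left to right.
    result = operands[0]
    for op, rhs in zip(operators, operands[1:]):
        if op == "UNION":
            result = result | rhs
        elif op == "EXCEPT":
            result = result - rhs
        elif op == "INTERSECT":
            result = result & rhs
        else:
            raise ValueError(f"unknown set operator: {op}")
    return result
```

For `A UNION B INTERSECT C` with `A = {1, 2}`, `B = {2, 3}`, `C = {3, 4}`, the new priority computes `A UNION (B INTERSECT C)`, while the old equal priority computes `(A UNION B) INTERSECT C`, and the two results differ.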
021678c7c3 [fix](window_funnel) fix wrong result of window_funnel #38954 (#39270)
## Proposed changes

BP #38954
2024-08-16 09:59:31 +08:00
289096692f [opt](Nereids) support parse sql with semicolon at beginning (#39399) (#39442)
pick from master #39399

For a statement that begins with a semicolon, such as `;SELECT 1;`, MySQL
could not parse it, but the legacy planner could.
2024-08-16 09:57:44 +08:00
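
One way to picture the leading-semicolon handling from 289096692f is to treat empty statements as no-ops when splitting the input. The commit message does not describe how the parser was changed, so this is only a naive Python sketch; the function `split_statements` is hypothetical and assumes that no semicolon appears inside a string literal or comment.

```python
def split_statements(sql):
    """Split on ';' and drop empty statements (naive: ignores quoting)."""
    return [part.strip() for part in sql.split(";") if part.strip()]
```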
4e889bbc6d [fix](Nereids) support implicit cast ip types to string (#39318) (#39440)
pick from master #39318
2024-08-16 09:57:02 +08:00
4380f3cb51 [fix](variable) support all type functions (#39144) (#39438)
pick from master #39144
2024-08-16 09:51:02 +08:00
46dc0d2192 [fix](variant) fix variant cast (#39426)
## Proposed changes
backport: https://github.com/apache/doris/pull/39377
2024-08-16 09:47:44 +08:00
3aaee8f7d5 [fix](Nereids) polish function signature search algorithm (#38497) (#39436)
pick from master #38497 and #39342

use array<double> for array<string>
- array_avg
- array_cum_sum
- array_difference
- array_product

use array<bigint> for array<string>
- bitmap_from_array

use double first
- fmod
- pmod

let high order function throw friendly exception
- array_filter
- array_first
- array_last
- array_reverse_split
- array_sort_by
- array_split

let return type same as parameter's type
- array_push_back
- array_push_front
- array_with_constant
- if

let greatest / least work same as mysql's greatest
2024-08-16 08:24:25 +08:00
b3597ea898 [improvement](statistics)Drop column stats after schema change. (#39101) (#39401)
backport: https://github.com/apache/doris/pull/39101
2024-08-15 17:02:42 +08:00
01090cf61f [improvement](statistics)Improve statistics cache loading logic. (#38829) (#39410)
backport: https://github.com/apache/doris/pull/38829
2024-08-15 17:01:24 +08:00