Commit Graph

9001 Commits

Author SHA1 Message Date
dfd21bd2a0 [fix](fe-log) add position info in async mode #39419 (#39571)
pick part of #39419
2024-08-20 22:01:34 +08:00
a4deefea5d [fix](catalog) gen partition id by name (#39325) (#39625)
bp #39325
2024-08-20 22:00:19 +08:00
a3fd13fee6 [fix](catalog) set timeout for split fetch (#39346) (#39624)
bp #39346
2024-08-20 21:59:55 +08:00
0e21dba817 [opt](catalog) modify some meta cache logic (#38506) (#39628)
#38506
2024-08-20 21:57:55 +08:00
607887673e [improvement](report) report handler discard old report tasks #39469 (#39605)
cherry pick from #39469
2024-08-20 17:40:49 +08:00
e302882e52 [branch-2.1](pick) Pick 2 PRs to branch-2.1 (#39604)
## Proposed changes

pick #39480 #39589

2024-08-20 17:10:30 +08:00
621d394a5e [enhance](Backup) Do connectivity check when creating repository (#38350) (#39538)
Previously, when creating a repository, the FE would not do a connectivity check.
This could result in confusing errors when using backup or restore.
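As a rough sketch of where the check now kicks in (repository name, bucket, endpoint, and credentials below are placeholders, not taken from this PR), a statement like the following now fails immediately if the storage is unreachable, instead of surfacing the error later during BACKUP or RESTORE:

```
CREATE REPOSITORY `s3_backup_repo`
WITH S3
ON LOCATION "s3://backup-bucket/doris_backup"
PROPERTIES (
    "s3.endpoint" = "http://s3.example.com",
    "s3.region" = "us-east-1",
    "s3.access_key" = "<access_key>",
    "s3.secret_key" = "<secret_key>"
);
```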

pick #38350

Co-authored-by: AlexYue <yj976240184@gmail.com>
2024-08-19 22:16:02 +08:00
9647885b95 [fix](routine load) should update progress before handle transaction state transform (#39311) (#39526)
pick (#39311)

Updating the progress may throw an exception, so the offset has already been
persisted in the edit log or meta service while the in-memory data has not been
updated. This leads to repeated consumption.
2024-08-19 21:23:59 +08:00
3d8b04a782 [fix](stream load) do not throw exception but skip record when can not find database (#39360) (#39527)
pick (#39360)

When fetching stream load records from a BE node, if the database cannot be
found, StreamLoadRecordMgr throws an exception and the remaining records are
not recorded in memory.

For example: ten stream load records were pulled, and the database associated
with the first record's stream load had been deleted by the user. The pull
therefore ends early, so the remaining nine records are never recorded in
memory.

This PR skips the record instead of throwing an exception when the database
cannot be found.
2024-08-19 21:23:26 +08:00
a1aa9b8ab9 [fix](routine load) add read lock to fix some concurrent bugs (#39242) (#39525)
pick #39242
2024-08-19 21:18:27 +08:00
830f250a80 [opt](query cancel) cancel query if it has pipeline task leakage #39223 (#39537)
pick #39223 with some modifications. The optimization is only applied
to PipelineX.
2024-08-19 14:33:59 +08:00
c0cbb2362c [enhancement](schema-change) Record detailed fail reason for schema change tasks (#39351) (#39501)
## Proposed changes

Expose the error message from the BE as the real failure reason recorded for
schema change tasks. To avoid excessive memory usage, only one of them is
recorded.
2024-08-18 13:51:06 +08:00
e01d051acf [improvement](external catalog)Optimize the process of refreshing catalog for 2.1 (#39205) (#39186)
## Proposed changes

bp: #39205

When the catalog attributes have not changed, refreshing the catalog
only requires processing the cache, without rebuilding the entire
catalog.
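For example (the catalog name is a placeholder), refreshing an external catalog whose attributes are unchanged now only needs to touch the caches:

```
-- With unchanged attributes, only the metadata caches are invalidated;
-- the catalog object itself is no longer rebuilt.
REFRESH CATALOG hive_catalog;
```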
2024-08-17 17:02:06 +08:00
fd4d1f4e4f [chore](table) Add batch method to get visible version of the olap table (#38949) (#39495)
Cherry-pick #38949
2024-08-17 16:55:06 +08:00
b0da8430bc [opt](inverted index) Optimize the usage of the multi_match function (#39472)
## Proposed changes

https://github.com/apache/doris/pull/39193

2024-08-17 16:53:52 +08:00
20936fe054 [branch-2.1][improvement](jdbc catalog) Compatible with ojdbc6 by adding version check (#39408)
pick (#39341)

In previous versions, we read data using a method based on JDBC 4.2, which
effectively dropped support for ojdbc6. However, we recently found that a
large number of users still run Oracle 11g, which hits unexpected
compatibility issues when connecting with ojdbc8. Therefore, this change adds
a driver version check so that both ojdbc6 and ojdbc8 are supported: ojdbc6
gives good compatibility, while ojdbc8 gives better reading efficiency.
2024-08-17 16:43:01 +08:00
79aa079cc6 [fix](array-funcs) array min/max #39307 (#39484) 2024-08-17 10:56:44 +08:00
472565cd48 [enhance](mtmv) refresh hms table before run mtmv task (#38212) (#39490)
pick from master #38212
2024-08-16 20:05:52 +08:00
ae8073f155 [opt](mtmv) partition rollup support week and quarter (#39286) (#39477)
pick from master #39286
2024-08-16 20:01:06 +08:00
c84cb5cf3d [enhance](mtmv) hive cache add partitionId to partitionName Map (#38525) (#39476)
pick from master #38525
2024-08-16 20:00:22 +08:00
c9a246b65b [fix](mtmv) Fix cancelled tasks with running status (#39424) (#39489)
pick from master #39424
2024-08-16 19:54:16 +08:00
2948b5ea2b [branch-2.1][fix](jdbc scan) Remove the conjuncts.remove call in JdbcScan (#39407)
pick (#39180)

In #37565, the change in the calling order of finalize caused the final
generated plan to be missing the predicates that had been pushed down to JDBC.
Although that behavior is correct, until the push-down of the various
predicates is handled perfectly we need to keep all conjuncts, to ensure that
data can still be filtered correctly when the data returned by JDBC is a
superset.
2024-08-16 19:01:40 +08:00
f203ee8224 [enhance](auth)modify priv of refresh catalog/db/table (#39008) (#39475)
pick from master #39008
2024-08-16 17:31:58 +08:00
4458302a77 [Fix](Planner) fix delete from using does not attach partition information (#39020) (#39418)
cherry-pick from master #39020
Problem:
when the DELETE FROM ... USING clause is used with partition information
specified, data in other partitions could also be deleted.
Solved:
add the partition information when rewriting the delete clause into an
insert-into-select clause.
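A sketch of the affected pattern (table, partition, and column names are invented): with the fix, the partition named in the delete is carried into the rewritten insert-into-select, so rows in other partitions are left untouched.

```
DELETE FROM target_tbl PARTITION p20240801
USING source_tbl
WHERE target_tbl.id = source_tbl.id;
```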
2024-08-16 17:16:08 +08:00
e249e00586 [fix](Nereids) fix explain plan with sql cache (#39431) (#39463)
Introduced by #38950: explaining a plan with the SQL cache enabled throws an exception
```
errCode = 2, detailMessage = Cannot invoke "org.apache.doris.nereids.trees.plans.Plan.treeString()" because "this.optimizedPlan" is null
```
2024-08-16 15:51:47 +08:00
6aafc5adb4 [improvement](statistics)Support drop cached stats. (#39367) (#39462)
backport: https://github.com/apache/doris/pull/39367
2024-08-16 14:12:58 +08:00
d56000e924 [opt](Nereids) polish aggregate function signature matching (#39352) (#39460)
pick from master #39352

use double to match string
- corr
- covar
- covar_samp
- stddev
- stddev_samp

use largeint to match string
- group_bit_and
- group_bit_or
- group_bit_xor

use double to match decimalv3
- topn_weighted

optimize error message
- multi_distinct_sum
- multi_distinct_sum0
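A quick illustration of the new matching, assuming a build with this pick (the literals are arbitrary): string arguments now resolve through the double or largeint signatures instead of failing to match.

```
SELECT stddev('1.5');      -- string argument matched via the double signature
SELECT group_bit_or('7');  -- string argument matched via the largeint signature
```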
2024-08-16 13:57:11 +08:00
ec0e413317 [fix](Nereids) npe when delete with cte and without using (#39379) (#39441)
pick from master #39379
2024-08-16 13:56:24 +08:00
5d6527c536 [opt](set operation) INTERSECT should evaluated before others (#39095) (#39437)
pick from master #39095

This is a behaviour-change PR.

The set operation INTERSECT should be evaluated before the other set
operations. Historically, all set operators in Doris had the same priority.

This PR changes Nereids to behave the same as MySQL.
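A small example of the behaviour change (table and column names are invented): under the new precedence the first query is evaluated like the second, matching MySQL.

```
SELECT k FROM a
UNION
SELECT k FROM b
INTERSECT
SELECT k FROM c;

-- is now evaluated as:
SELECT k FROM a
UNION
(SELECT k FROM b INTERSECT SELECT k FROM c);
```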
2024-08-16 11:08:36 +08:00
021678c7c3 [fix](window_funnel) fix wrong result of window_funnel #38954 (#39270)
## Proposed changes

BP #38954
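For reference, a typical window_funnel call of the kind this fix covers (table, columns, and event values are invented; the actual fix lives in the backported PR):

```
SELECT user_id,
       window_funnel(3600, 'default', event_time,
                     event_type = 'view',
                     event_type = 'cart',
                     event_type = 'buy') AS reached_step
FROM user_events
GROUP BY user_id;
```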
2024-08-16 09:59:31 +08:00
289096692f [opt](Nereids) support parse sql with semicolon at beginning (#39399) (#39442)
pick from master #39399

For a statement with a semicolon at the beginning, like `;SELECT 1;`: MySQL
could not parse it, but the legacy planner could.
2024-08-16 09:57:44 +08:00
4e889bbc6d [fix](Nereids) support implicit cast ip types to string (#39318) (#39440)
pick from master #39318
2024-08-16 09:57:02 +08:00
4380f3cb51 [fix](variable) support all type functions (#39144) (#39438)
pick from master #39144
2024-08-16 09:51:02 +08:00
46dc0d2192 [fix](variant) fix variant cast (#39426)
## Proposed changes
backport: https://github.com/apache/doris/pull/39377
2024-08-16 09:47:44 +08:00
3aaee8f7d5 [fix](Nereids) polish function signature search algorithm (#38497) (#39436)
pick from master #38497 and #39342

use array<double> for array<string>
- array_avg
- array_cum_sum
- array_difference
- array_product

use array<bigint> for array<string>
- bitmap_from_array

use double first
- fmod
- pmod

let high order function throw friendly exception
- array_filter
- array_first
- array_last
- array_reverse_split
- array_sort_by
- array_split

let return type same as parameter's type
- array_push_back
- array_push_front
- array_with_constant
- if

let greatest / least work same as mysql's greatest
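A couple of illustrative calls under the new matching rules (literal values are arbitrary):

```
SELECT array_avg(['1.5', '2.5']);  -- array<string> argument matched via array<double>
SELECT fmod('10.5', 3);            -- string argument matched via the double signature first
```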
2024-08-16 08:24:25 +08:00
b3597ea898 [improvement](statistics)Drop column stats after schema change. (#39101) (#39401)
backport: https://github.com/apache/doris/pull/39101
2024-08-15 17:02:42 +08:00
01090cf61f [improvement](statistics)Improve statistics cache loading logic. (#38829) (#39410)
backport: https://github.com/apache/doris/pull/38829
2024-08-15 17:01:24 +08:00
642beb069b [fix](schema-change) Fix potential data race when a schema change jobs is set to cancelled but the table state is still SCHEMA_CHANGE (#39164) (#39327)
## Proposed changes

Set the job's cancelled state after the table state has changed back to normal.
2024-08-15 14:18:11 +08:00
aebc70d75a revert [improvement](mv) Support to use cast when create sync materialized view #38008 (#39378)
## Proposed changes

This was introduced by https://github.com/apache/doris/pull/38008.
If cast(FLOOR(MINUTE(time) / 15) as decimal(9, 0)) is used in the GROUP BY
clause when creating a sync materialized view, then downgrading from 2.1.6 to
2.1.5 or upgrading from 2.1.6 to 3.0.0 may leave the FE unable to run. So
revert the feature.
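For context, a sketch of the kind of sync materialized view this affects (table and column names are invented); after this revert, such a cast in the GROUP BY is no longer accepted, rather than risking an FE that cannot start after an upgrade or downgrade:

```
CREATE MATERIALIZED VIEW mv_15min AS
SELECT cast(FLOOR(MINUTE(event_time) / 15) AS decimal(9, 0)), count(*)
FROM events
GROUP BY cast(FLOOR(MINUTE(event_time) / 15) AS decimal(9, 0));
```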
2024-08-15 14:16:57 +08:00
4acd69590d [Fix](function) fix wrong nullable signature of function corr (#39380)
## Proposed changes

Previously, `corr(nullable_x, nullable_y)` would core dump if left unfixed.
There is no need to patch master, because the refactor
https://github.com/apache/doris/pull/37330 has already changed the
implementation context.
2024-08-15 14:10:09 +08:00
265bf9d54f [fix](protocol) CLIENT_MULTI_STATEMENTS not used actually (#39308) (#39370)
pick #39308 to branch-2.1
2024-08-15 14:06:12 +08:00
1accde9fb3 [fix](nestedtype) support nested type for schema change reorder (#39392)
## Proposed changes
backport: https://github.com/apache/doris/pull/39210
2024-08-15 14:03:03 +08:00
7e9aa2b9ac [feature](restore) Support clean_tables/clean_partitions properties for restore job #39028 (#39363)
cherry pick from #39028
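A hedged usage sketch (database, label, repository, and timestamp are placeholders; the property names come from the PR title):

```
RESTORE SNAPSHOT example_db.snapshot_label
FROM `s3_backup_repo`
PROPERTIES (
    "backup_timestamp" = "2024-08-14-10-00-00",
    "clean_tables" = "true",
    "clean_partitions" = "true"
);
```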
2024-08-15 09:58:26 +08:00
cf089d2cbe [Cherry-pick](branch-2.1) Pick "[Enhancement](wal) modify wal api which hard to use (#38895)" (#39188)
## Proposed changes

Pick #38895

Before this PR, this API required a backend's IP and port as a parameter,
which was hard to use. This PR modifies it: if no parameter is given, Doris
prints the WAL info of all backends.

The acceptable usages are as follows:

```
curl -u root: "127.0.0.1:8038/api/get_wal_size?host_ports=127.0.0.1:9058"
{"msg":"success","code":0,"data":["127.0.0.1:9058:0"],"count":0}

curl -u root: "127.0.0.1:8038/api/get_wal_size?host_ports="
{"msg":"success","code":0,"data":["127.0.0.1:9058:0"],"count":0}

curl -u root: "127.0.0.1:8038/api/get_wal_size"
{"msg":"success","code":0,"data":["127.0.0.1:9058:0"],"count":0}
```

2024-08-15 09:37:10 +08:00
c12137a8d6 [branch-2.1][fix](expr) Enhance SQL Expression Handling by Introducing printSqlInParens to CompoundPredicate (#39082)
pick #39064
2024-08-14 21:14:58 +08:00
226e01889c [fix](array_apply) pick array apply fix (#39328)
## Proposed changes
backport: https://github.com/apache/doris/pull/39105
2024-08-14 18:52:29 +08:00
a9692a305e [fix](function)timediff with now function causes a error signature (#39322) (#39349)
https://github.com/apache/doris/pull/39322
## Proposed changes

```
mysql [(none)]>select round(timediff(now(),'2024-08-15')/60/60,2);
ERROR 1105 (HY000): errCode = 2, detailMessage = argument 1 requires datetimev2 type, however 'now()' is of datetime type
```
The reason is that the function parameter types were modified in
expectedInputTypes, which led to no match being found. This code is from a
long time ago: because the precision of datetimev2 could not be deduced in the
past, a separate implementation was made here. That code can now be safely
deleted.


2024-08-14 18:36:14 +08:00
701e23b65b [Fix](nereids) fix condition function partition prune (#39298) (#39332)
cherry-pick #39298 to branch-2.1
2024-08-14 18:32:50 +08:00
187461e2fd [Fix](Export) Export delete multiple times when specify the delete_existing_files property () (#39304)
bp: #38400

When the `Export` statement specifies the `delete_existing_files`
property, each `Outfile` statement generated by the `Export` will carry
this property. This causes each `Outfile` statement to delete existing
files, so only the result of the last Outfile statement will be
retained.

So, we add an RPC method that deletes existing files for the `Export`
statement, and the `Outfile` statements generated by the `Export` no longer
carry the `delete_existing_files` property.
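For reference, a sketch of an Export that sets the property (bucket, path, table, and credentials are placeholders); with this change the existing files are removed once via the new RPC rather than by every generated `Outfile` statement:

```
EXPORT TABLE example_db.orders
TO "s3://export-bucket/orders/"
PROPERTIES (
    "delete_existing_files" = "true"
)
WITH S3 (
    "s3.endpoint" = "http://s3.example.com",
    "s3.region" = "us-east-1",
    "s3.access_key" = "<access_key>",
    "s3.secret_key" = "<secret_key>"
);
```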

2024-08-13 22:26:02 +08:00
Pxl
33220109f7 [Bug](materialized-view) fix analyze where clause failed on mv (#39061) (#39209)
## Proposed changes
pick from #39061
Fix analyzing the WHERE clause failing on a materialized view:
do not analyze the slot after replaceSlot, to avoid duplicate columns in the descriptor.
2024-08-13 16:08:20 +08:00