Commit Graph

7446 Commits

Author SHA1 Message Date
c0cbb2362c [enhancement](schema-change) Record detailed fail reason for schema change tasks (#39351) (#39501)
## Proposed changes

Expose the error message from BE as the actual failure reason recorded for
schema change tasks. To limit memory usage, only one of the reported
messages is recorded.
2024-08-18 13:51:06 +08:00
e01d051acf [improvement](external catalog)Optimize the process of refreshing catalog for 2.1 (#39205) (#39186)
## Proposed changes

bp: #39205

When the catalog attributes have not changed, refreshing the catalog
only requires processing the cache, without rebuilding the entire
catalog.
2024-08-17 17:02:06 +08:00
fd4d1f4e4f [chore](table) Add batch method to get visible version of the olap table (#38949) (#39495)
Cherry-pick #38949
2024-08-17 16:55:06 +08:00
b0da8430bc [opt](inverted index) Optimize the usage of the multi_match function (#39472)
## Proposed changes

https://github.com/apache/doris/pull/39193

2024-08-17 16:53:52 +08:00
79aa079cc6 [fix](array-funcs) array min/max #39307 (#39484) 2024-08-17 10:56:44 +08:00
472565cd48 [enhance](mtmv) refresh hms table before run mtmv task (#38212) (#39490)
pick from master #38212
2024-08-16 20:05:52 +08:00
ae8073f155 [opt](mtmv) partition rollup support week and quarter (#39286) (#39477)
pick from master #39286
2024-08-16 20:01:06 +08:00
c84cb5cf3d [enhance](mtmv) hive cache add partitionId to partitionName Map (#38525) (#39476)
pick from master #38525
2024-08-16 20:00:22 +08:00
c9a246b65b [fix](mtmv) Fix cancelled tasks with running status (#39424) (#39489)
pick from master #39424
2024-08-16 19:54:16 +08:00
2948b5ea2b [branch-2.1][fix](jdbc scan) Remove the conjuncts.remove call in JdbcScan (#39407)
pick (#39180)

In #37565, due to the change in the calling order of finalize, the final
generated Plan will be missing the PREDICATES that have been pushed down
in Jdbc. Although this behavior is correct, before perfectly handling
the push down of various PREDICATES, we need to keep all conjuncts to
ensure that we can still filter data normally when the data returned by
Jdbc is a superset.
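
A minimal Python sketch of the reasoning above, not Doris code: `remote_scan` is a hypothetical stand-in for a JDBC source that may or may not honor a pushed-down predicate, and `scan_with_local_filter` is an illustrative name. Keeping every conjunct for a local pass gives the same result in both cases.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


def remote_scan(rows: Iterable[Row], pushed: List[Predicate], honors_pushdown: bool) -> List[Row]:
    """Hypothetical remote source: applies pushed-down predicates or ignores them."""
    if honors_pushdown:
        return [r for r in rows if all(p(r) for p in pushed)]
    return list(rows)  # a superset of the requested rows


def scan_with_local_filter(rows: Iterable[Row], conjuncts: List[Predicate], honors_pushdown: bool) -> List[Row]:
    """Push the predicates down, then re-apply every conjunct locally."""
    fetched = remote_scan(rows, conjuncts, honors_pushdown)
    return [r for r in fetched if all(p(r) for p in conjuncts)]


data = [{"id": 1}, {"id": 2}, {"id": 3}]
conjuncts = [lambda r: r["id"] > 1]

result_strict = scan_with_local_filter(data, conjuncts, honors_pushdown=True)
result_loose = scan_with_local_filter(data, conjuncts, honors_pushdown=False)
```

Both calls return the rows with `id` 2 and 3; the local pass is redundant only when the remote side applied the predicate exactly.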
2024-08-16 19:01:40 +08:00
f203ee8224 [enhance](auth)modify priv of refresh catalog/db/table (#39008) (#39475)
pick from master #39008
2024-08-16 17:31:58 +08:00
4458302a77 [Fix](Planner) fix delete from using does not attach partition information (#39020) (#39418)
cherry-pick from master #39020
Problem:
When a DELETE FROM ... USING statement specifies partition information, it
also deletes data from other partitions.
Solution:
Attach the partition information when translating the delete clause into an
INSERT INTO ... SELECT clause.
2024-08-16 17:16:08 +08:00
e249e00586 [fix](Nereids) fix explain plan with sql cache (#39431) (#39463)
Introduced by #38950: explaining a plan with the SQL cache enabled throws an exception:
```
errCode = 2, detailMessage = Cannot invoke "org.apache.doris.nereids.trees.plans.Plan.treeString()" because "this.optimizedPlan" is null
```
2024-08-16 15:51:47 +08:00
6aafc5adb4 [improvement](statistics)Support drop cached stats. (#39367) (#39462)
backport: https://github.com/apache/doris/pull/39367
2024-08-16 14:12:58 +08:00
d56000e924 [opt](Nereids) polish aggregate function signature matching (#39352) (#39460)
pick from master #39352

use double to match string
- corr
- covar
- covar_samp
- stddev
- stddev_samp

use largeint to match string
- group_bit_and
- group_bit_or
- group_bit_xor

use double to match decimalv3
- topn_weighted

optimize error message
- multi_distinct_sum
- multi_distinct_sum0
2024-08-16 13:57:11 +08:00
ec0e413317 [fix](Nereids) npe when delete with cte and without using (#39379) (#39441)
pick from master #39379
2024-08-16 13:56:24 +08:00
4380f3cb51 [fix](variable) support all type functions (#39144) (#39438)
pick from master #39144
2024-08-16 09:51:02 +08:00
46dc0d2192 [fix](variant) fix variant cast (#39426)
## Proposed changes
backport: https://github.com/apache/doris/pull/39377
2024-08-16 09:47:44 +08:00
3aaee8f7d5 [fix](Nereids) polish function signature search algorithm (#38497) (#39436)
pick from master #38497 and #39342

use array<double> for array<string>
- array_avg
- array_cum_sum
- array_difference
- array_product

use array<bigint> for array<string>
- bitmap_from_array

use double first
- fmod
- pmod

let high order function throw friendly exception
- array_filter
- array_first
- array_last
- array_reverse_split
- array_sort_by
- array_split

let return type same as parameter's type
- array_push_back
- array_push_front
- array_with_constant
- if

let greatest / least work the same as MySQL's
2024-08-16 08:24:25 +08:00
b3597ea898 [improvement](statistics)Drop column stats after schema change. (#39101) (#39401)
backport: https://github.com/apache/doris/pull/39101
2024-08-15 17:02:42 +08:00
01090cf61f [improvement](statistics)Improve statistics cache loading logic. (#38829) (#39410)
backport: https://github.com/apache/doris/pull/38829
2024-08-15 17:01:24 +08:00
642beb069b [fix](schema-change) Fix potential data race when a schema change job is set to cancelled but the table state is still SCHEMA_CHANGE (#39164) (#39327)
## Proposed changes

Set job cancel state after table state changed to normal.
2024-08-15 14:18:11 +08:00
aebc70d75a revert [improvement](mv) Support to use cast when create sync materialized view #38008 (#39378)
## Proposed changes

The problem was introduced by https://github.com/apache/doris/pull/38008.
If cast(FLOOR(MINUTE(time) / 15) as decimal(9, 0)) is used in the group by
clause of a sync materialized view, downgrading from 2.1.6 to 2.1.5 or
upgrading from 2.1.6 to 3.0.0 may prevent FE from running, so the feature
is reverted.
2024-08-15 14:16:57 +08:00
4acd69590d [Fix](function) fix wrong nullable signature of function corr (#39380)
## Proposed changes


Before this change, `corr(nullable_x, nullable_y)` would core dump; it is now fixed.
There is no need to patch master, because the refactor in
https://github.com/apache/doris/pull/37330 already changed the
implementation.
2024-08-15 14:10:09 +08:00
265bf9d54f [fix](protocol) CLIENT_MULTI_STATEMENTS not used actually (#39308) (#39370)
pick #39308 to branch-2.1
2024-08-15 14:06:12 +08:00
1accde9fb3 [fix](nestedtype) support nested type for schema change reorder (#39392)
## Proposed changes
backport: https://github.com/apache/doris/pull/39210
2024-08-15 14:03:03 +08:00
7e9aa2b9ac [feature](restore) Support clean_tables/clean_partitions properties for restore job #39028 (#39363)
cherry pick from #39028
2024-08-15 09:58:26 +08:00
cf089d2cbe [Cherry-pick](branch-2.1) Pick "[Enhancement](wal) modify wal api which hard to use (#38895)" (#39188)
## Proposed changes

Pick #38895

Before this PR, this API required the backends' IP and port as a parameter,
which was hard to use. This PR changes that: if no parameter is given, Doris
prints the WAL info of all backends.

The accepted usages are as follows:

```
curl -u root: "127.0.0.1:8038/api/get_wal_size?host_ports=127.0.0.1:9058"
{"msg":"success","code":0,"data":["127.0.0.1:9058:0"],"count":0}

curl -u root: "127.0.0.1:8038/api/get_wal_size?host_ports="
{"msg":"success","code":0,"data":["127.0.0.1:9058:0"],"count":0}

curl -u root: "127.0.0.1:8038/api/get_wal_size"
{"msg":"success","code":0,"data":["127.0.0.1:9058:0"],"count":0}
```

2024-08-15 09:37:10 +08:00
c12137a8d6 [branch-2.1][fix](expr) Enhance SQL Expression Handling by Introducing printSqlInParens to CompoundPredicate (#39082)
pick #39064
2024-08-14 21:14:58 +08:00
226e01889c [fix](array_apply) pick array apply fix (#39328)
## Proposed changes
backport: https://github.com/apache/doris/pull/39105
2024-08-14 18:52:29 +08:00
a9692a305e [fix](function)timediff with now function causes a error signature (#39322) (#39349)
https://github.com/apache/doris/pull/39322
## Proposed changes

```
mysql [(none)]>select round(timediff(now(),'2024-08-15')/60/60,2);
ERROR 1105 (HY000): errCode = 2, detailMessage = argument 1 requires datetimev2 type, however 'now()' is of datetime type
```
The reason is that the function parameter types were modified in
expectedInputTypes, which led to no match being found. The code here is
from a long time ago. Because the precision of datetimev2 could not be
deduced in the past, a separate implementation was made here. This code
can be safely deleted.


2024-08-14 18:36:14 +08:00
701e23b65b [Fix](nereids) fix condition function partition prune (#39298) (#39332)
cherry-pick #39298 to branch-2.1
2024-08-14 18:32:50 +08:00
187461e2fd [Fix](Export) Export delete multiple times when specify the delete_existing_files property () (#39304)
bp: #38400

When the `Export` statement specifies the `delete_existing_files`
property, each `Outfile` statement generated by the `Export` will carry
this property. This causes each `Outfile` statement to delete existing
files, so only the result of the last Outfile statement will be
retained.

So, we add a rpc method which can delete existing files for `Export`
statement and the `Outfile` statements generated by the `Export` will
not carry `delete_existing_files` property any more.
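
A rough Python sketch of the behavior described above, assuming an in-memory list stands in for the target directory. `run_outfile`, `export_old`, and `export_new` are hypothetical names, and the single up-front clear merely stands in for the new RPC method.

```python
from typing import List


def run_outfile(target: List[str], new_files: List[str], delete_existing: bool) -> None:
    """One Outfile statement: optionally wipe the target, then write its files."""
    if delete_existing:
        target.clear()
    target.extend(new_files)


def export_old(target: List[str], batches: List[List[str]]) -> None:
    """Old behavior: every generated Outfile carries delete_existing_files."""
    for batch in batches:
        run_outfile(target, batch, delete_existing=True)


def export_new(target: List[str], batches: List[List[str]]) -> None:
    """New behavior: delete once for the whole Export, never per Outfile."""
    target.clear()  # stands in for the dedicated delete RPC
    for batch in batches:
        run_outfile(target, batch, delete_existing=False)


batches = [["part-0.csv"], ["part-1.csv"], ["part-2.csv"]]
old_dir = ["stale.csv"]
new_dir = ["stale.csv"]
export_old(old_dir, batches)
export_new(new_dir, batches)
```

With the old flow only `part-2.csv` survives; with the new flow all three parts are kept and the stale file is gone.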

2024-08-13 22:26:02 +08:00
Pxl
33220109f7 [Bug](materialized-view) fix analyze where clause failed on mv (#39061) (#39209)
## Proposed changes
pick from #39061
fix analyze where clause failed on mv
do not analyze slot after replaceSlot to avoid duplicate columns in desc
2024-08-13 16:08:20 +08:00
9fb103e979 [opt](fe) Optimize calculate load job num metric in FE (#39267)
cherry pick from https://github.com/apache/doris/pull/31952
https://github.com/apache/doris/pull/34020
Co-authored-by: Lei Zhang <27994433+SWJTU-ZhangLei@users.noreply.github.com>
2024-08-13 15:42:15 +08:00
228f78b80d [fix] (nereids) fix Match Expression in filter estimation (#39050) (#39215)
## Proposed changes

pick from master #39050
2024-08-13 10:57:53 +08:00
60eeec3754 [fix] (inverted index) Fix match function without inverted index (#38989) (#39220)
## Proposed changes

pick from #38989
2024-08-13 10:55:54 +08:00
f5e896f6eb [improvement](balance) don't balance tablet which has unfinish alter job #39121 (#39202)
cherry pick from #39121
2024-08-13 09:33:26 +08:00
c2044409da [branch-2.1] Picks "[fix](delete) Fix delete stmt on MOW table doesn't use partial update in Nereids planner #38751" (#39214)
## Proposed changes

picks https://github.com/apache/doris/pull/38751
2024-08-12 17:02:48 +08:00
0c39b88804 [Improvement](expr) fold child when const expr not folded (#38493) (#38961)
cherry-pick from master #38493

1. fold child when const expr not folded
2. do not fold function `sleep`
3. move all exceptional expression into shouldSkipFold

before

mysql [test]>explain select sleep(sign(1)*100);
+-----------------------------------------------+
| Explain String(Nereids Planner)               |
+-----------------------------------------------+
| PLAN FRAGMENT 0                               |
|   OUTPUT EXPRS:                               |
|     sleep(cast((sign(1.0) * 100) as INT))[#0] |
|   PARTITION: UNPARTITIONED                    |
|                                               |
|   HAS_COLO_PLAN_NODE: false                   |
|                                               |
|   VRESULT SINK                                |
|      MYSQL_PROTOCAL                           |
|                                               |
|   0:VUNION(32)                                |
|      constant exprs:                          |
|          sleep(CAST((sign(1) * 100) AS int))  |
+-----------------------------------------------+
13 rows in set (15.02 sec)

mysql [test]>select sleep(sign(1)*100);
+-----------------------------------------------------+
| sleep(cast((sign(cast(1 as DOUBLE)) * 100) as INT)) |
+-----------------------------------------------------+
|                                                   1 |
+-----------------------------------------------------+
1 row in set (1 min 55.34 sec)


after

mysql [test]>explain select sleep(sign(1)*100);
+---------------------------------+
| Explain String(Nereids Planner) |
+---------------------------------+
| PLAN FRAGMENT 0                 |
|   OUTPUT EXPRS:                 |
|     sleep(100)[#0]              |
|   PARTITION: UNPARTITIONED      |
|                                 |
|   HAS_COLO_PLAN_NODE: false     |
|                                 |
|   VRESULT SINK                  |
|      MYSQL_PROTOCAL             |
|                                 |
|   0:VUNION(32)                  |
|      constant exprs:            |
|          sleep(100)             |
+---------------------------------+
13 rows in set (0.23 sec)

mysql [test]> select sleep(sign(1)*100);
+-----------------------------------------------------+
| sleep(cast((sign(cast(1 as DOUBLE)) * 100) as INT)) |
+-----------------------------------------------------+
|                                                   1 |
+-----------------------------------------------------+
1 row in set (1 min 40.37 sec)
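
The folding rule illustrated above can be sketched in Python as follows. This is a toy model under stated assumptions, not the Nereids implementation: expressions are tuples of the form `(name, *args)`, and `SKIP_FOLD`, `FUNCS`, and `fold` are hypothetical names.

```python
from typing import Union

Expr = Union[tuple, int, float]

# Functions that must never be evaluated at plan time (assumed set for this sketch).
SKIP_FOLD = {"sleep"}

# Toy implementations for the foldable functions used in this example.
FUNCS = {
    "sign": lambda x: (x > 0) - (x < 0),
    "mul": lambda a, b: a * b,
}


def fold(expr: Expr) -> Expr:
    """Fold children first; fold the node itself only if allowed and fully constant."""
    if not isinstance(expr, tuple):
        return expr  # already a literal
    name, *args = expr
    folded_args = [fold(a) for a in args]
    all_const = all(not isinstance(a, tuple) for a in folded_args)
    if name in SKIP_FOLD or name not in FUNCS or not all_const:
        return (name, *folded_args)  # keep the call, but with folded children
    return FUNCS[name](*folded_args)


plan = fold(("sleep", ("mul", ("sign", 1), 100)))
```

Here `plan` keeps the `sleep` call but with its argument reduced to the literal 100, which matches the `sleep(100)` shown in the "after" plan.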


Co-authored-by: Pxl <pxl290@qq.com>
2024-08-12 15:13:48 +08:00
ebf5d70c9d [fix](function) MicroSecondsSub without scale (#38945) (#39194)
## Proposed changes
https://github.com/apache/doris/pull/38945
Added the computeSignature function for millisecond/microsecond
calculation functions to generate parameters and return values with the
appropriate precision.
Modified the microSecondsAdd function, which was used for constant
folding, because constant folding uses the precision of the parameters
for calculation. However, for millisecond/microsecond calculations, it
is necessary to set the precision to the maximum to ensure correct
display.


before
```
mysql> SELECT MICROSECONDS_SUB('2010-11-30 23:50:50', 2);
+-------------------------------------------------------------------+
| microseconds_sub(cast('2010-11-30 23:50:50' as DATETIMEV2(0)), 2) |
+-------------------------------------------------------------------+
| 2010-11-30 23:50:49                                               |
+-------------------------------------------------------------------+
```
now
```
mysql> SELECT MICROSECONDS_SUB('2010-11-30 23:50:50', 2);
+-------------------------------------------------------------------+
| microseconds_sub(cast('2010-11-30 23:50:50' as DATETIMEV2(0)), 2) |
+-------------------------------------------------------------------+
| 2010-11-30 23:50:49.999998                                        |
+-------------------------------------------------------------------+
```
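
The same arithmetic can be checked with Python's standard `datetime` module. This is only an illustration of why the result needs microsecond precision; `microseconds_sub` and `truncate_to_scale` are hypothetical helpers, not Doris functions.

```python
from datetime import datetime, timedelta


def microseconds_sub(ts: str, micros: int) -> str:
    """Subtract microseconds and render the result with six fractional digits."""
    base = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    shifted = base - timedelta(microseconds=micros)
    return shifted.strftime("%Y-%m-%d %H:%M:%S.%f")


def truncate_to_scale(value: str, scale: int) -> str:
    """Drop fractional digits beyond `scale`, mimicking display at a lower precision."""
    whole, frac = value.split(".")
    return whole if scale == 0 else f"{whole}.{frac[:scale]}"


full = microseconds_sub("2010-11-30 23:50:50", 2)
at_scale_zero = truncate_to_scale(full, 0)
```

At scale 0 the subtraction is invisible (`2010-11-30 23:50:49`), while the full-precision string carries the `.999998` fraction.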


2024-08-12 10:01:28 +08:00
0ee0dd6ae3 [fix](routine load) reset Kafka progress cache when routine load job topic change (#38474) (#39181)
pick (#38474)

When the routine load job topic is changed from test_topic_before to
test_topic_after by
```
ALTER ROUTINE LOAD FOR test_topic_change FROM KAFKA("kafka_topic" = "test_topic_after");
```
(test_topic_before has 5 rows and test_topic_after has 1 row)

the job keeps logging the following warning and cannot consume any data:
```
2024-07-29 15:57:28,122 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,123 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,125 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,126 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,128 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,129 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,131 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,133 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,134 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,136 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
2024-07-29 15:57:28,137 WARN (Routine load task scheduler|55) [KafkaRoutineLoadJob.hasMoreDataToConsume():792] Kafka offset fallback. partition: 0, cache offset: 5 get latest offset: 1, task 16656914-ba0a-465d-8e79-8252b423b0fc, job 16615
```

It is necessary to reset the Kafka progress cache when the routine load
job topic changes.
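
A simplified Python model of the failure and the fix, assuming a per-partition offset cache. `ProgressCache` and its methods are hypothetical names for illustration; they do not mirror the actual `KafkaRoutineLoadJob` code.

```python
from typing import Dict


class ProgressCache:
    """Hypothetical per-partition offset cache of a routine load job."""

    def __init__(self, topic: str) -> None:
        self.topic = topic
        self.offsets: Dict[int, int] = {}

    def has_more_data(self, partition: int, latest_offset: int) -> bool:
        cached = self.offsets.get(partition, 0)
        if latest_offset < cached:
            # Corresponds to the "Kafka offset fallback" warning: nothing is consumed.
            return False
        return latest_offset > cached

    def change_topic(self, new_topic: str, reset: bool) -> None:
        self.topic = new_topic
        if reset:
            self.offsets.clear()  # offsets of the old topic mean nothing for the new one


# Without a reset, the cached offset 5 from the old topic blocks the new topic (latest offset 1).
stale = ProgressCache("test_topic_before")
stale.offsets[0] = 5
stale.change_topic("test_topic_after", reset=False)
stuck = stale.has_more_data(partition=0, latest_offset=1)

# With a reset, the job starts from a clean cache and sees the new row.
fresh = ProgressCache("test_topic_before")
fresh.offsets[0] = 5
fresh.change_topic("test_topic_after", reset=True)
resumed = fresh.has_more_data(partition=0, latest_offset=1)
```

In this model `stuck` is false and `resumed` is true, which is the difference the reset is meant to make.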
2024-08-10 23:00:39 +08:00
5f77f909d9 [cherry-pick](branch-2.1) Pick "[feature](function) support ip functions named ipv4_to_ipv6 and cut_ipv6" (#39058)
## Proposed changes

pick https://github.com/apache/doris/pull/36883 and
https://github.com/apache/doris/pull/35239
2024-08-10 18:37:11 +08:00
9b4354fcb7 [fix](mtmv) second level MTMV always refresh all partition by mistake… (#39167)
pick: https://github.com/apache/doris/pull/38698
2024-08-10 18:26:56 +08:00
4dd9d4e1dc [enhance](mtmv) change mysql table type of mtmv to table (#38797) (#39166)
pick: https://github.com/apache/doris/pull/38797
2024-08-10 18:20:48 +08:00
878ce29fa7 [fix](mtmv) Fix rewrite by materialized view fail when query hive table (#38909) (#39163)
## Proposed changes

commitId: 08c9e051
pr: https://github.com/apache/doris/pull/38909
2024-08-10 18:15:11 +08:00
df45398912 [fix](mtmv)fix can not show create mtmv use follower fe (#38794) (#39162)
pick: https://github.com/apache/doris/pull/38794
2024-08-10 18:12:01 +08:00
29ad364007 [enhance](mtmv)Disable mtmv list rollup (#38124) (#39158)
pick master: https://github.com/apache/doris/pull/38124
2024-08-10 18:06:38 +08:00
3dc150a0da [Fix](nereids) fix bind expression compare dbname ignore cluster (#39114) (#39142)
cherry-pick #39114 to branch-2.1
2024-08-10 18:00:48 +08:00
5e1e725cee [feature](inverted index) Add multi_match function #37722 #38931 #39149 (#38877) 2024-08-10 15:20:08 +08:00