8924 Commits

Author SHA1 Message Date
e083dc26a0 [cherry-pick](branch-2.1) Pick "[Fix](group commit) Fix multiple cluster group commit BE select strategy (#38644)" (#39010)
## Proposed changes

Pick #38644 

2024-08-07 22:07:30 +08:00
749c9f7b56 [fix](group commit) fix replay wal check label status (#38883) (#38997)
pick https://github.com/apache/doris/pull/38883
2024-08-07 22:06:59 +08:00
fd3f95066e [fix](Nereids) lock table when generate distribute plan (#38950) (#39029)
The table should be locked while generating the distributed plan. Otherwise, an insert overwrite issued by an async materialized view can drop partitions concurrently, and the query thread throws an exception:
```
java.lang.RuntimeException: Cannot invoke "org.apache.doris.catalog.Partition.getBaseIndex()" because "partition" is null
    at org.apache.doris.nereids.util.Utils.execWithUncheckedException(Utils.java:76) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.glue.translator.PhysicalPlanTranslator.translatePlan(PhysicalPlanTranslator.java:278) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.NereidsPlanner.splitFragments(NereidsPlanner.java:341) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.NereidsPlanner.distribute(NereidsPlanner.java:400) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.NereidsPlanner.plan(NereidsPlanner.java:147) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:796) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:605) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.queryRetry(StmtExecutor.java:558) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:548) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.executeQuery(ConnectProcessor.java:385) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:237) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.MysqlConnectProcessor.handleQuery(MysqlConnectProcessor.java:260) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.MysqlConnectProcessor.dispatch(MysqlConnectProcessor.java:288) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:342) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.doris.catalog.Partition.getBaseIndex()" because "partition" is null
    at org.apache.doris.planner.OlapScanNode.mockRowCountInStatistic(OlapScanNode.java:589) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.planner.OlapScanNode.finalizeForNereids(OlapScanNode.java:1733) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.nereids.util.Utils.execWithUncheckedException(Utils.java:74) ~[doris-fe.jar:1.2-SNAPSHOT]
    ... 17 more
2024-07-29 00:46:17,608 WARN (mysql-nio-pool-114|201) Analyze failed. stmt[210035, 49d3041004ba4b6a-b07fe4491d03c5de]
org.apache.doris.common.NereidsException: errCode = 2, detailMessage = Cannot invoke "org.apache.doris.catalog.Partition.getBaseIndex()" because "partition" is null
    at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:803) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:605) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.queryRetry(StmtExecutor.java:558) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:548) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.executeQuery(ConnectProcessor.java:385) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:237) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.MysqlConnectProcessor.handleQuery(MysqlConnectProcessor.java:260) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.MysqlConnectProcessor.dispatch(MysqlConnectProcessor.java:288) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:342) ~[doris-fe.jar:1.2-SNAPSHOT]
    at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
    at java.lang.Thread.run(Thread.java:833) ~[?:?]
```

This exception is too hard to reproduce, so I could not write a test case.
2024-08-07 19:00:44 +08:00
6f37e483f8 [improve](config)del useless creation config for inverted index (#39005)
## Proposed changes
Delete the useless config `enable_create_inverted_index_for_array`.
backport: https://github.com/apache/doris/pull/39006
2024-08-07 17:13:05 +08:00
36edfa0c65 [cherry-pick](branch-2.1) Pick "[Enhancement](audit log) Set print audit log session variable default value to false #38865" (#39009)
pick #38865
2024-08-07 16:59:26 +08:00
7e95d7cbec [bugfix](backup)(cooldown) cancel backup properly when be backup failed (#38724) (#38993)
Co-authored-by: zhangyuan <ayuanzhang@tencent.com>
2024-08-07 15:58:11 +08:00
843afccdf9 [fix](catalog) remove backend in black list from candidate backends for external table (#38984)
When selecting backends for an external table query,
skip the backends in the blacklist.
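The selection rule can be sketched as a simple filter. This is a minimal illustration, not the Doris implementation; `candidate_backends` and the backend names are hypothetical.

```python
def candidate_backends(backends, blacklist):
    """Return the backends that may serve an external table query.

    Hypothetical helper: any backend present in the blacklist is skipped,
    and the original order of the remaining backends is preserved.
    """
    blocked = set(blacklist)
    return [be for be in backends if be not in blocked]


# Made-up backend identifiers, for illustration only.
all_backends = ["be-1", "be-2", "be-3"]
blacklisted = ["be-2"]
candidates = candidate_backends(all_backends, blacklisted)
print(candidates)
```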
2024-08-07 14:09:06 +08:00
f9788b4ca5 [Fix](nereids) fix partition_prune or expression evaluate wrongly (#38897) (#38998)
cherry-pick #38897 to branch-2.1
2024-08-07 13:49:42 +08:00
8cb5aa64f4 [test](inverted index) add an Inverted Index Testing Switch (#38077) (#38947)
https://github.com/apache/doris/pull/38077
2024-08-07 11:25:36 +08:00
fc0222a64c [opt](info) processlist schema table support show all fe (#38701) (#38953)
pick #38701
2024-08-07 11:01:46 +08:00
2b1aa05370 pick some PRs to branch-2.1 #38115 #38008 #37929 (#38940)
## Proposed changes

pr: https://github.com/apache/doris/pull/38115
commitId: 2b29288c

pr: https://github.com/apache/doris/pull/38008
commitId: c6b924da

pr: https://github.com/apache/doris/pull/37929
commitId: d44fcdc5
2024-08-07 10:19:41 +08:00
2543b569bb [Optimize](Row store) pick #37145, #38236 (#38932) 2024-08-07 09:55:42 +08:00
bc644cb253 [opt](catalog) merge scan range to avoid too many splits (#38311) (#38964)
bp #38311
2024-08-06 21:57:02 +08:00
2540835b58 [opt](log) Remove unnecessary log for analysis (#38943)
This was already fixed on the master branch in this PR:
https://github.com/apache/doris/pull/36884

This commit cherry-picks it to branch-2.1.
2024-08-06 21:44:18 +08:00
07ea511141 [opt](optimizer) Remove unused code to unify code (#38918)
## Proposed changes
Now, the predicates of an Agg's child do not all spread to the Agg.
    For example:
    select a, sum(b)
    from (
     select a,b from t where a = 1 and b = 2
    ) t
    group by a
    `a = 1` in the scan can be propagated to `a` of the agg,
    but `b = 2` in the scan cannot be propagated to `sum(b)` of the agg.

Issue Number: #38905

Co-authored-by: liutang123 <liulijia@gmail.com>
2024-08-06 19:09:25 +08:00
5066be6df3 [fix](multicatalog) fix hadoop authenticator not inited for existing hms catalog. (#38930)
Backport #38475.

Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
2024-08-06 15:34:32 +08:00
fcb4483ed1 [chore](sql) Forbid show hidden columns and create table with hidden column (#38796) (#38924)
Forbid show hidden columns and create table with hidden column
2024-08-06 14:24:41 +08:00
75fe929dc4 [enhancement](nereids) Support eliminate outer join by match expression (#38537) (#38925)
Enable match expressions to run outside of a filter plan, e.g. in a join conjunct.
Support eliminating an outer join by a match expression if any argument of the match expression is a null literal.
2024-08-06 13:16:57 +08:00
ab3057b2d4 [Feat](nereids) support date function in partition prune (#38743) (#38898)
cherry-pick #38743 to branch-2.1
2024-08-06 09:13:13 +08:00
3b9394a8c7 [improvement](tablet scheduler) Adjust tablet sched priority to help load data succ #38528 (#38884)
cherry pick from #38528
2024-08-06 02:13:47 +08:00
9c020f9db1 [fix](fe) Fix the default value of ReplacePartitionClause.isStrictRange (#38688) (#38879) 2024-08-05 20:59:50 +08:00
ce75e6adfe [fix](group commit) Fix group commit debug log and improve performance (#38754) (#38841)
Pick https://github.com/apache/doris/pull/38754
2024-08-05 18:34:49 +08:00
0f0b0e9b37 [Feat](nereids) Support date_trunc function in partition prune (#38025) (#38849)
cherry-pick #38025 to branch-2.1
2024-08-05 18:29:10 +08:00
40567b5d69 [fix](nereids)support group_concat with distinct and order by (#38871)
## Proposed changes

pick from master https://github.com/apache/doris/pull/38080

2024-08-05 18:23:55 +08:00
bf1c7a1c15 [fix](clone) fix stale tablet report miss the new cloning replica #38695 (#38839)
cherry pick from #38695
2024-08-05 18:04:24 +08:00
994c56f914 [fix](txn) fix abortTxn by label does not acquire table write lock (#38777) (#38842)
pick https://github.com/apache/doris/pull/38777
2024-08-05 16:33:20 +08:00
65154f8abe [branch-2.1] (doris-future) Support auto partition name function (#38853)
cherry-pick https://github.com/apache/doris/pull/34258 to branch-2.1
2024-08-05 16:04:24 +08:00
5dfc5d2c77 [enhancement](querycancel) print detail message when query is cancelled (#38859)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-05 14:47:03 +08:00
de9b9d6a39 [Fix](nereids) change char(0) to char(1), varchar(0) to varchar(65533) when create table (#38427) (#38530)
cherry-pick #38427 to branch-2.1

---------

Co-authored-by: morrySnow <101034200+morrySnow@users.noreply.github.com>
2024-08-05 09:18:18 +08:00
9430b27e68 [branch-2.1][improvement](jdbc catalog) improvement some jdbc catalog properties check order (#38770)
pick (#38439)

1. Move the execution of testJdbcConnection() to checkWhenCreating
instead of the constructor
2. Move the logic of renaming lower_case_table_names to
lower_case_meta_names to setDefaultPropsIfMissing
2024-08-05 09:14:04 +08:00
5d02c48715 [feature](hive)Support reading renamed Parquet Hive and Orc Hive tables. (#38432) (#38809)
bp #38432 

## Proposed changes
Add `hive_parquet_use_column_names` and `hive_orc_use_column_names`
session variables to read tables after a column has been renamed in `Hive`.

These two session variables are modeled on
`parquet_use_column_names` and `orc_use_column_names` of the `Trino` hive
connector.

By default, these two session variables are true. When they are set to
false, reading orc/parquet will access the columns according to the
ordinal position in the Hive table definition.

For example:
```mysql
-- in Hive:
hive> create table tmp (a int , b string) stored as parquet;
hive> insert into table tmp values(1,"2");
hive> alter table tmp  change column  a new_a int;
hive> insert into table tmp values(2,"4");

-- in Doris:
mysql> set hive_parquet_use_column_names=true;
Query OK, 0 rows affected (0.00 sec)

mysql> select  * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|  NULL | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)

mysql> set hive_parquet_use_column_names=false;
Query OK, 0 rows affected (0.00 sec)

mysql> select  * from tmp;
+-------+------+
| new_a | b    |
+-------+------+
|     1 | 2    |
|     2 | 4    |
+-------+------+
2 rows in set (0.02 sec)
```

In Hive 3, you can use `set
parquet.column.index.access/orc.force.positional.evolution = true/false`
to control how the table is read, similar to these two
session variables. However, for a Parquet table in which a field inside a
struct column has been renamed, Hive and Doris behave differently.
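The difference between the two modes in the example above can be sketched as a small simulation. It is an illustration only and does not use any Doris or Hive API; `read_rows` and `scan` are hypothetical helpers, and each data file is modeled as a list of column names plus rows.

```python
def read_rows(file_columns, file_rows, table_columns, use_column_names):
    """Resolve table columns against one data file.

    use_column_names=True  -> match by name; a column missing in the file is NULL (None).
    use_column_names=False -> match by ordinal position in the table definition.
    """
    result = []
    for row in file_rows:
        if use_column_names:
            by_name = dict(zip(file_columns, row))
            result.append(tuple(by_name.get(col) for col in table_columns))
        else:
            result.append(tuple(
                row[i] if i < len(row) else None
                for i in range(len(table_columns))
            ))
    return result


# Table definition after `alter table tmp change column a new_a int`.
table_columns = ["new_a", "b"]
# The file written before the rename still carries the old column name.
old_file = (["a", "b"], [(1, "2")])
new_file = (["new_a", "b"], [(2, "4")])


def scan(use_column_names):
    rows = []
    for columns, data in (old_file, new_file):
        rows.extend(read_rows(columns, data, table_columns, use_column_names))
    return rows


print(scan(True))   # name-based: the old file has no column called new_a
print(scan(False))  # position-based: the first file column maps to new_a
```

With name-based access the row from the old file yields NULL for `new_a`; with positional access it yields 1, matching the two result sets shown above.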
2024-08-05 09:06:49 +08:00
40767003c6 [Fix](ScanNode) Move the finalize phase of ScanNode to after the end of the Physical Translate phase (#38604)
bp: #37565

Currently, Doris first obtains splits and then performs projection.
After column pruning, it calls `updateRequiredSlots` to update the
scanRange information. However, the Trino connector's column pruning
pushdown needs to be completed before obtaining splits.

Therefore, we move the finalize phase of `ScanNode` to after the end of
the `Physical Translate` phase, so that `createScanRangeLocations` can
use the final, pruned columns.

2024-08-05 08:58:59 +08:00
f76397277e [fix](routine load) fix show routine load task result incorrect (#38523) (#38826)
pick (#38523)

Create a job:
```
CREATE ROUTINE LOAD testShow ON test_show_routine_load
COLUMNS TERMINATED BY ","
PROPERTIES
(
"max_batch_interval" = "5",
"max_batch_rows" = "300000",
"max_batch_size" = "209715200"
)
FROM KAFKA
(
"kafka_broker_list" = "127.0.0.1:19092",
"kafka_topic" = "test_show_routine_load",
"property.kafka_default_offsets" = "OFFSET_BEGINNING"
);
```
show routine load task:
```
SHOW ROUTINE LOAD TASK WHERE JobName = "testShow";
```
result:
```
ERROR 1105 (HY000): errCode = 2, detailMessage = The job named testshowdoes not exists or job state is stopped or cancelled
```

Fix: do not use the `toLowerCase` method.
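The failure mode can be sketched as follows, assuming that job names are stored with their original case (the error message above shows the lowercased name `testshow`). The `jobs` dictionary and both lookup functions are hypothetical stand-ins, not Doris code.

```python
# Hypothetical registry keyed by the job name exactly as it was created.
jobs = {"testShow": "RUNNING"}


def find_job_buggy(name):
    """Lowercases the name first, so a mixed-case job is never found."""
    return jobs.get(name.lower())


def find_job_fixed(name):
    """Looks the job up by the name as given."""
    return jobs.get(name)


print(find_job_buggy("testShow"))
print(find_job_fixed("testShow"))
```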
2024-08-04 22:18:25 +08:00
79b07d0b8a [fix](routine load) fix enclose and escape can not set in routine load job (#38402) (#38825)
pick (#38402)
2024-08-04 22:17:12 +08:00
7c70f75198 [Fix](Load)Audit logs avoid recording certain sensitive information #38769 (#38784)
…

## Proposed changes

#38769

2024-08-04 10:53:03 +08:00
556f0fc784 [pick](json-keys) support json_keys function (#38631)
## Proposed changes
backport: https://github.com/apache/doris/pull/36411
2024-08-02 19:10:00 +08:00
2425730609 [enhance](auth)support cache ranger datamask and row filter (#37723) (#38575)
pick: https://github.com/apache/doris/pull/37723
2024-08-02 14:59:32 +08:00
f24d55fc94 [fix](syntax) multi statements must delim with semicolon (#38670) (#38753)
pick from master #38670
2024-08-02 14:49:51 +08:00
da7b2cf578 [refactor](catalog) set "use_meta_cache" default to true (#38244)(#38352)(#38619) (#38355)
bp #38244 #38352 #38619

---------

Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
2024-08-02 14:13:38 +08:00
d800434859 [Chore](Fe)Upgrade dependencies (#38509) (#38747)
## Proposed changes

upgrade spring-boot to 2.7.18
upgrade zookeeper to 3.9.2
upgrade jetty to 9.4.55.v20240627
upgrade ivy to 2.5.2
upgrade icu4j to 75.1
upgrade ini4j to 0.5.4

(cherry picked from commit 3f633c2018e86c6c842647262853d88ad63672bf)

pick #38509
2024-08-02 12:34:17 +08:00
f21d7e3833 [test](inverted index)Add cases for inverted index format v2 (#38132)(#38443) (#38222)
## Proposed changes

backport #38132 #38443
2024-08-02 12:04:26 +08:00
4f2ca43917 [minor](fe) simplify some code in HMSExternalTable (#32344) (#38675)
bp #32344

Co-authored-by: DuRipeng <453243496@qq.com>
2024-08-02 11:27:10 +08:00
84d9b2fcf4 [pick](nestedtypes) support nested type with agg replace_if_not_null (#38719)
## Proposed changes
backport: https://github.com/apache/doris/pull/38304
2024-08-02 11:18:33 +08:00
e140613ae1 [fix](Nereids) remove db readlock before get table from db (#38660) (#38729)
pick from master #38660

Insert holds the read lock of the target table before planning. If Nereids
then needs the db read lock, this leads to a deadlock, because other
statements need to hold the db lock before getting the table lock.

For example:

insert: target table read lock -> database read lock
drop table: database write lock -> target table write lock
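The conflicting acquisition orders above can be checked mechanically. The sketch below is a simplification under stated assumptions: it ignores read/write lock modes and treats any two statements that take the same two locks in opposite order as a potential deadlock. `conflicting_lock_orders` is a hypothetical helper, and the `after` ordering assumes the fix means insert no longer takes the database lock during planning.

```python
from itertools import combinations


def conflicting_lock_orders(orders):
    """Return (stmt_a, stmt_b, first, second) for every pair of statements
    where stmt_a takes `first` before `second` but stmt_b takes them reversed."""
    conflicts = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(orders.items(), 2):
        for first, second in combinations(seq_a, 2):
            if (first in seq_b and second in seq_b
                    and seq_b.index(second) < seq_b.index(first)):
                conflicts.append((name_a, name_b, first, second))
    return conflicts


# Acquisition orders before the fix, as listed in the commit message.
before = {
    "insert": ["table", "database"],
    "drop table": ["database", "table"],
}
# Assumed orders after the fix: insert no longer needs the database lock.
after = {
    "insert": ["table"],
    "drop table": ["database", "table"],
}

print(conflicting_lock_orders(before))
print(conflicting_lock_orders(after))
```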
2024-08-02 08:34:59 +08:00
555dccb1a4 [fix](bloom filter)Fix rename column with bloom filter (#38431) (#38662)
backport #38431
2024-08-01 19:01:14 +08:00
2562cf33a7 [fix](mtmv) Choose a valid partition column when there are both valid and invalid expressions (#38367) (#38684)
## Proposed changes
pick #38367 

2024-08-01 19:00:28 +08:00
60091f072a [fix](auth)fix create table like need create_priv of existed table (#… (#38570)
…37879)

pick: https://github.com/apache/doris/pull/37879
2024-08-01 18:57:44 +08:00
a4e793752f [bugfix](iceberg)revert count(*) directly returned by fe for 2.1 (#38566) (#38655)
bp: #38566
2024-08-01 18:56:19 +08:00
e8690b62ee [fix](group commit) Pick add debug log show why group commit not work; delete wal when replay success (#38611) (#38659)
Pick https://github.com/apache/doris/pull/38611
2024-08-01 16:59:54 +08:00
cafcf7acc1 [cherry-pick](SSL) Fix ssl connection close 2.1 (#38587) (#38677)
## Proposed changes

Issue Number: close #38590 

When an SSL connection is closed, a specific packet is sent to indicate the
closing of the connection. The SSL engine is then shut down and outputs an
empty unwrapped result.

Therefore, handle this case correctly to avoid a buffer overflow by
breaking out of the read flow and doing the cleanup proactively.
2024-08-01 16:06:30 +08:00