9390 Commits

Author SHA1 Message Date
ae88d032db [chore](ddl) support force_enable_feature_binlog #41796 (#42926)
cherry pick from #41796
2024-10-31 09:53:45 +08:00
fce4695f37 [Configuration](transactional-hive) Add skip_checking_acid_version_file session var to skip checking acid version file in some hive envs. (#42111)(#42225) (#42939)
cherry-pick (#42111)(#42225)

---------

Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
2024-10-31 09:52:20 +08:00
0d008b5a43 [feat](nereids)disable join reorder if column stats is invalid #41790 (branch-2.1) (#42902)
## Proposed changes
pick #41790
2024-10-30 23:47:03 +08:00
2a6d6c15c5 [enhance](hive)hive event code optimization #42637 (#42876)
cherry pick from #42637

Co-authored-by: zhangdong <493738387@qq.com>
2024-10-30 12:52:21 +08:00
17d84dc88f [enhance](paimon)paimon scanner code optimization #42606 (#42875)
cherry pick from #42606

Co-authored-by: zhangdong <493738387@qq.com>
2024-10-30 12:51:59 +08:00
ba4d01d93c [branch-2.1](hint) add detail for replica missing msg (#42546)
2024-10-29 22:47:41 +08:00
233fad4815 [opt](variable) force update some variable by variable version (#41607) (#42643)
pick from master #41607

variable version:
000-100: doris-2.0.x
100-200: doris-2.1.x
200-300: doris-3.0.x

update variables

000:
nereids_timeout_second = 30 if original value is 5

100:
enable_nereids_dml = true
enable_nereids_dml_with_pipeline = true
enable_nereids_planner = true
enable_fallback_to_original_planner = true
enable_pipeline_x_engine = true
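
The version-gated update listed above can be sketched as follows. This is a minimal Python illustration, not the FE code. The function name `force_update_variables`, the rule that a step runs when the stored version is at or below the step number, and the resulting version `101` are all assumptions made for the example.

```python
# Hypothetical sketch: force-update variables based on a stored "variable version".
FORCED_ON_AT_100 = (
    "enable_nereids_dml",
    "enable_nereids_dml_with_pipeline",
    "enable_nereids_planner",
    "enable_fallback_to_original_planner",
    "enable_pipeline_x_engine",
)


def force_update_variables(variables, stored_version):
    """Return (updated_variables, new_version) without mutating the input."""
    updated = dict(variables)
    # Step 000: only touch the timeout if it still holds the old value 5.
    if stored_version <= 0 and updated.get("nereids_timeout_second") == 5:
        updated["nereids_timeout_second"] = 30
    # Step 100: switch these flags on regardless of their previous value.
    if stored_version <= 100:
        for name in FORCED_ON_AT_100:
            updated[name] = True
    # Assumed: bump the stored version so the steps are not applied again.
    return updated, max(stored_version, 101)
```

A user who deliberately set `nereids_timeout_second` to something other than 5 keeps that value, which is the point of the "if original value is 5" condition.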
2024-10-29 15:57:14 +08:00
1538b82221 [Impl](Nereids) add nereids gc cost time and be fold const cost time to profile (#42007) (#42516)
pick:#42007
add Nereids GarbageCollect Time and Nereids BeFoldConst Time to profile

2024-10-29 15:16:40 +08:00
d5c43c51b5 [branch-2.1](fe) Avoid interrupt daemon thread and use proper polling interval (#42210) (#42646)
pick https://github.com/apache/doris/pull/42210
2024-10-29 10:10:26 +08:00
73621c480a [fix](log) fix group commit warn log (#42403) (#42575)
pick https://github.com/apache/doris/pull/42403
2024-10-28 21:04:29 +08:00
0baca54200 [fix](Nereids) offset do more than once when have shuffle after limit (#42576) (#42577)
pick from master #42576

This bug was introduced by #39316, which was meant to fix a problem
introduced by #36699 but did not remove all of the incorrect code from
#36699.

After #39316, we should not set the offset on an exchange when the
exchange sits on top of a limit with an offset.
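
Why applying the offset twice loses rows can be shown with a toy model. The sketch below is plain Python with hypothetical helper names (`limit_offset`, `exchange`); it models only the row arithmetic, not the planner.

```python
def limit_offset(rows, limit, offset):
    """Model of LIMIT ... OFFSET ...: skip `offset` rows, then keep `limit` rows."""
    return rows[offset:offset + limit]


def exchange(rows, offset=0):
    """Model of an exchange: it only moves rows. A non-zero offset models the bug."""
    return rows[offset:]


rows = list(range(10))
limited = limit_offset(rows, limit=3, offset=2)  # rows 2, 3, 4
correct = exchange(limited)                      # offset applied once
wrong = exchange(limited, offset=2)              # offset applied a second time
```

Here `correct` keeps all three rows, while `wrong` keeps only the last one because the exchange skipped two rows that the limit had already accounted for.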
2024-10-28 19:48:13 +08:00
85674814eb [fix](query-forward) Fix forward query exception or stuck or potential query result loss (#41303) (#42369)
## Proposed changes

1. Fix forward query exception if no status code is set in master
execution. EOF may result in this status.

2. Fix forward query stuck due to no result packet sent to mysql
channel. Should use result packets from master.

3. Fix potential forward query result loss if follower can read status
change during query process. Should judge by the status once before
execution.

4. Add assertion for regression test.
2024-10-28 17:39:57 +08:00
0e63133c80 [Chore](job) Provides configuration of job execution queue size (#42253) (#42530)
When dealing with a large number of tasks, the default execution queue
size is 1024. This can lead to tasks being dropped if the queue becomes
full.
For example, the following error is reported:

`dispatch instant task failed, job id is xxx`

To address this, you can add the parameters `insert_task_queue_size` and
`mtmv_task_queue_size` in the `fe.conf` configuration file. These
parameters must be set to a power of 2.

**Keep in mind, increasing this value is recommended only when thread
resources are limited; otherwise, you should consider increasing the
number of task execution threads.**
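
Since both parameters must be a power of 2, a candidate value can be checked before it goes into `fe.conf`. The helpers below are a hypothetical Python sketch and not part of Doris.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; False for zero, negatives and everything else."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (at least 1)."""
    if n < 1:
        return 1
    return 1 << (n - 1).bit_length()
```

For instance, a desired queue size of 1500 is not valid as-is; `next_power_of_two(1500)` gives the nearest valid size above it.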

(cherry picked from commit f9ea8f8229e9f5514c1773bd25c3cc11985c63fb)

## Proposed changes

Issue Number: #42253

2024-10-28 13:42:08 +08:00
9eef393e2a [pick]support cgroup v2 (#42465)
## Proposed changes

pick #39991   #39374  #36663
2024-10-25 20:13:27 +08:00
4a62d9e44b Revert "[2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool" (#42481)
Reverts apache/doris#42255

We found that disabling the connection pool causes class-loading and
connection-release problems for some data sources. We are removing this
feature for now and will re-add it once these problems are fixed and
fully tested.
2024-10-25 19:37:36 +08:00
859332c918 [improve](common) Add synchronized to avoid concurrent modification #42384 (#42454)
cherry pick from #42384
2024-10-25 16:33:53 +08:00
b88d4db6e7 [fix](auth)Fix use encryptkey should check auth (#41791) (#42105)
pick from master #41791
2024-10-25 14:32:42 +08:00
120bf28d1e [fix](statistics)Skip shadow index while analyzing a table. (#42201) (#42414)
backport: https://github.com/apache/doris/pull/42201
2024-10-25 13:44:42 +08:00
db9c74c38f [improvement](statistics)Drop expired external stats only when the catalog is dropped. (#42244) (#42410)
backport: https://github.com/apache/doris/pull/42244
2024-10-24 23:47:53 +08:00
dba6296d3e [improvement](statistics)Skip auto analyze when table row count is not fully reported. (return -1) (#42209) (#42407)
backport: https://github.com/apache/doris/pull/42209
2024-10-24 19:40:03 +08:00
7e710701ce [enhancement](bakcup) throw detailed msg for backup or restore (#42288) (#42312)
2024-10-24 16:56:04 +08:00
8ff77c9509 [improve](restore) Split batch creating replica task by table id #42235 (#42342)
cherry pick from #42235
2024-10-24 16:20:19 +08:00
aa0fe8ccf4 [fix] (inverted index) fix the error result in the query when using count on index (#42346) (#42347)
## Proposed changes

pick from master #42346
2024-10-24 12:11:59 +08:00
5453337e7b [fix](mtmv)Fix the old version of the materialized view error 'Curren… (#42307)
…t database is not set'` (#42152)

pick: https://github.com/apache/doris/pull/42152
2024-10-24 11:16:01 +08:00
cbdaaa62b2 [feature](function) hour/minute/second functions support time as an a… 41008 (#42232)
…rgument. (#41008)

## Proposed changes

```
mysql> select cast(4562632 as time),hour(cast(4562632 as time)), minute(cast(4562632 as time)),second(cast(4562632 as time));
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
| cast(4562632 as TIME) | hour(cast(4562632 as TIME)) | minute(cast(4562632 as TIME)) | second(cast(4562632 as TIME)) |
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
| 456:26:32             |                         456 |                            26 |                            32 |
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
```
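
The output above is consistent with reading the integer's decimal digits as hours, minutes and seconds. A minimal Python sketch of that decomposition, assuming a non-negative input and this digit layout (the function name is hypothetical):

```python
def split_time_literal(value):
    """Split an integer laid out as HHHMMSS into (hours, minutes, seconds)."""
    seconds = value % 100           # last two digits
    minutes = (value // 100) % 100  # next two digits
    hours = value // 10000          # everything before that
    return hours, minutes, seconds
```

`split_time_literal(4562632)` yields `(456, 26, 32)`, matching the `456:26:32` row shown in the query result.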


Co-authored-by: Dongyang Li <hello_stephen@qq.com>
2024-10-24 11:09:36 +08:00
73fc5c734e [fix](mtmv)fix in the scenario of recreating a table, the materialize… (#42339)
…d view may assume that the data has not changed (#41762)

pick: https://github.com/apache/doris/pull/41762
2024-10-24 11:07:29 +08:00
e3b059e339 [fix](nereids) Fix not check column name when create or alter view (#42206) (#42323)
pr: https://github.com/apache/doris/pull/42206
commitId: 2bcaa5b4
2024-10-24 10:11:32 +08:00
44e9368c78 Pick some fix from master to 21(#41472) (#40106)(#40173) (#42212)
## Proposed changes

pr: https://github.com/apache/doris/pull/41472
commitId: 2745e044


pr: https://github.com/apache/doris/pull/40106
commitId: 0fdb1ee0


pr: https://github.com/apache/doris/pull/40173
commitId: 0d07e3d1
2024-10-24 10:09:55 +08:00
036906f025 [fix](routine load) make the timeout of load channel consistent with routine load task (#42042) (#42267)
pick (#42042)

Routine load task timeout is max_batch_interval * 10, but load channel
timeout is max_batch_interval * 2.
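
The mismatch is easy to see with concrete numbers. The sketch below is illustrative Python with hypothetical function names; it uses only the two multipliers quoted above.

```python
def routine_load_task_timeout(max_batch_interval):
    """Routine load task timeout, as stated above: max_batch_interval * 10."""
    return max_batch_interval * 10


def load_channel_timeout_before_fix(max_batch_interval):
    """Load channel timeout before this change: max_batch_interval * 2."""
    return max_batch_interval * 2


interval = 10  # example value, in the same time unit as max_batch_interval
gap = routine_load_task_timeout(interval) - load_channel_timeout_before_fix(interval)
```

With an interval of 10, the task may run for 100 units while the load channel gives up after 20, leaving a window of 80 units in which the channel can time out under a still-valid task.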
2024-10-23 16:17:39 +08:00
0aa07e6118 [enhancement](blacklist) should check the real error reason when sendfragment timeout (#42314)
## Proposed changes

1. Sending a fragment to a BE may run for a long time.
2. A timeout does not necessarily mean that the BE is down.
3. The cancel logic should not handle the blacklist, because if a BE is
down, send fragment will detect it.

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-10-23 16:16:04 +08:00
a5b3520cbb [fix](mtmv) regression test unstable and error (#41145) (#42185)
## Proposed changes

pr: https://github.com/apache/doris/pull/41145
commitId: 5e6e4bf6
2024-10-23 14:42:53 +08:00
e7395198d4 [fix](estimate) func call with not filter will estimate some statisti… (#42302)
…cs (#41989)

Before this PR, using `!` or `not` inside an `or` condition on a table
that had been analyzed produced the following error:
```
SELECT
  count(1)
FROM
  table_30_un_pa_ke_pr_di4
where
  col_int_undef_signed_not_null < -128
  or not array_contains(col_array_bigint__undef_signed, col_int_undef_signed_not_null);

ERROR 1105 (HY000): errCode = 2, detailMessage = Not-predicate meet unexpected child:
  array_contains(col_array_bigint__undef_signed, cast(col_int_undef_signed_not_null as BIGINT))
```

2024-10-23 14:33:41 +08:00
c9acd71ad6 [fix](inverted index) Fix errors caused by enable_need_read_data_opt #42064 (#42247)
cherry pick from #42064

Co-authored-by: Sun Chenyang <csun5285@gmail.com>
2024-10-23 09:18:19 +08:00
bde8e2d474 [2.1][improvement](jdbc catalog) Add catalog property to enable jdbc connection pool (#42255)
pick (#41992)

We initially introduced the jdbc connection pool to improve the
connection performance of jdbc catalog, but we repeatedly found that the
connection pool caused unexpected errors. We therefore add a catalog
property, `enable_connection_pool`, which controls whether the jdbc
connection pool of a jdbc catalog is enabled; it defaults to false.
However, catalogs that already exist keep the connection pool enabled
after an upgrade; only newly created catalogs default to false.

We also ran performance tests for this change, and the performance loss
is within the expected range.

- Enable connection pool: `mysqlslap -uroot -h127.0.0.1 -P9030 --concurrency=1 --iterations=100 --query='SELECT * FROM mysql.test.test limit 1;' --create-schema=mysql --delimiter=";" --verbose`
Benchmark
        Average number of seconds to run all queries: 0.008 seconds
        Minimum number of seconds to run all queries: 0.004 seconds
        Maximum number of seconds to run all queries: 0.133 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1

- Disable connection pool: `mysqlslap -uroot -h127.0.0.1 -P9030 --concurrency=1 --iterations=100 --query='SELECT * FROM mysql_no_pool.test.test limit 1;' --create-schema=mysql --delimiter=";" --verbose`
Benchmark
        Average number of seconds to run all queries: 0.054 seconds
        Minimum number of seconds to run all queries: 0.047 seconds
        Maximum number of seconds to run all queries: 0.184 seconds
        Number of clients running queries: 1
        Average number of queries per client: 1
2024-10-22 23:28:28 +08:00
1fc1662e88 [2.1][fix](external) Fix case-insensitive table name mapping retrieval rules (#42259)
pick (#38227)

```
Doris > CREATE CATALOG `oracle` PROPERTIES (
    -> "user" = "doris_test",
    -> "type" = "jdbc",
    -> "password" = "xxx",
    -> "jdbc_url" = "jdbc:oracle:thin:@xxx:1521:XE",
    -> "driver_url" = "ojdbc8-19.3.0.0.jar",
    -> "driver_class" = "oracle.jdbc.driver.OracleDriver"
    -> );
Query OK, 0 rows affected (2.16 sec)
```

```
Doris > show variables like '%lower_case%';
+------------------------+-------+---------------+---------+
| Variable_name          | Value | Default_Value | Changed |
+------------------------+-------+---------------+---------+
| lower_case_table_names | 1     | 0             | 1       |
+------------------------+-------+---------------+---------+
1 row in set (0.00 sec)

Doris > show tables from oracle.DORIS_TEST;
+----------------------+
| Tables_in_DORIS_TEST |
+----------------------+
| aa/d                 |
| aaad                 |
| lower_test           |
| student              |
| student2             |
| student3             |
| test_all_types       |
| test_char            |
| test_clob            |
| test_date            |
| test_insert          |
| test_int             |
| test_num             |
| test_number          |
| test_number2         |
| test_number3         |
| test_number4         |
| test_raw             |
| test_timestamp       |
+----------------------+
19 rows in set (0.01 sec)

```

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (1.03 sec)

Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
ERROR 1105 (HY000): errCode = 2, detailMessage = Table [TEST_INT] does not exist in database [DORIS_TEST].

```

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.20 sec)

Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.20 sec)
```

```
Doris > show variables like '%lower_case%';
+------------------------+-------+---------------+---------+
| Variable_name          | Value | Default_Value | Changed |
+------------------------+-------+---------------+---------+
| lower_case_table_names | 2     | 0             | 1       |
+------------------------+-------+---------------+---------+
1 row in set (0.01 sec)

Doris > show tables from oracle.DORIS_TEST;
+----------------------+
| Tables_in_DORIS_TEST |
+----------------------+
| AA/D                 |
| AAAD                 |
| LOWER_TEST           |
| STUDENT              |
| TEST_ALL_TYPES       |
| TEST_CHAR            |
| TEST_CLOB            |
| TEST_DATE            |
| TEST_INSERT          |
| TEST_INT             |
| TEST_NUM             |
| TEST_NUMBER          |
| TEST_NUMBER2         |
| TEST_NUMBER3         |
| TEST_NUMBER4         |
| TEST_RAW             |
| TEST_TIMESTAMP       |
| student2             |
| student3             |
+----------------------+
19 rows in set (1.05 sec)
```

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
ERROR 1105 (HY000): errCode = 2, detailMessage = Table [test_int] does not exist in database [DORIS_TEST].
Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (1.07 sec)
```

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.21 sec)

Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.20 sec)
```
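
The sessions above suggest that, with `lower_case_table_names` set to 1 or 2, a table should be found regardless of the case used in the query. A minimal Python sketch of such a lookup follows. The function name, the exact-match behavior for mode 0, and the error raised on ambiguous names are assumptions for illustration, not the Doris implementation.

```python
def resolve_table(requested, remote_names, lower_case_table_names):
    """Map a requested table name to the real remote name, or None if absent."""
    if lower_case_table_names == 0:
        # Assumed: mode 0 requires an exact, case-sensitive match.
        return requested if requested in remote_names else None
    # Modes 1 and 2: compare names case-insensitively.
    wanted = requested.lower()
    matches = [name for name in remote_names if name.lower() == wanted]
    if not matches:
        return None
    if len(matches) > 1:
        # Assumed: names that differ only by case cannot be told apart.
        raise ValueError(f"ambiguous table name: {requested}")
    return matches[0]
```

With remote tables `TEST_INT` and `student2`, both `test_int` and `TEST_INT` resolve to `TEST_INT` in modes 1 and 2, which corresponds to the sessions where both spellings return a row.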

2024-10-22 23:28:01 +08:00
56c2a0f523 [2.1][opt](Catalog) Remove unnecessary conjuncts handling on External Scan (#42261)
pick (#41218)

In the previous FileScanNode, some code that used conjuncts for
predicate conversion ran in the init phase. With the Nereids planner,
however, filters are pushed down to the scan in the Translator, so the
ScanNode only has the complete conjuncts in the finalized phase. This PR
therefore removes all conjuncts variables in External for the Nereids
planner: they no longer store conjuncts themselves or add them to the
ScanNode, and every place in the ScanNode that uses conjuncts is moved
to the finalized phase.

This refactor also fixes a performance issue introduced by #40176. After
that change started generating a SelectNode for consecutive projects or
filters, FileScan still added conjuncts too early, in the init phase.
When the upper layers were then translated, the filters appeared
consecutive, so a SelectNode was unexpectedly generated on top of the
ScanNode. Because the Project node trims columns of a SelectNode and a
ScanNode differently, the project could no longer prune the ScanNode
columns, and the ScanNode scanned unnecessary columns.

This change no longer adds conjuncts in the ScanNode step, which keeps
the structure from ScanNode to Project and restores correct column
trimming.
2024-10-22 21:42:55 +08:00
6f2bac012a [pick](branch-2.1) pick #39398 #41754 #41770 (#42231)
pick #39398 #41754 #41770
2024-10-22 18:05:40 +08:00
c1d2b8d548 [2.1][improvement](jdbc catalog) Disallow non-constant type conversion pushdown and implicit conversion pushdown (#42242)
pick (#42102)

Add a variable `enable_jdbc_cast_predicate_push_down`, which defaults to
false. When it is false, non-constant predicates with type conversion
and all predicates with implicit conversion are not pushed down. This
prevents wrong predicates from being pushed down to the Jdbc data
source and causing incorrect query results, because predicates with cast
were not pushed down correctly before. If your data was read correctly
and performance was better before this change, you can manually set this
variable to true.

```
| Expression                                          | Can Push Down |
|-----------------------------------------------------|---------------|
| column type equals const type                       | Yes           |
| column type equals cast const type                  | Yes           |
| cast column type equals const type                  | No            |
| cast column type equals cast const type             | No            |
| column type not equals column type                  | No            |
| column type not equals cast const type              | No            |
| cast column type not equals const type              | No            |
| cast column type not equals cast const type         | No            |

```
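
One way to read the table above is that a predicate is pushed down only when the column side carries no cast and the two sides have the same type. The Python sketch below encodes that reading. The function and parameter names are hypothetical, and treating the variable as a switch that restores unconditional pushdown is an assumption.

```python
def can_push_down(column_is_cast, types_match, enable_cast_push_down=False):
    """Decide pushdown for a comparison between a column side and the other side.

    column_is_cast: the column side is wrapped in an explicit cast.
    types_match: both sides have the same type (no implicit conversion needed).
    enable_cast_push_down: models enable_jdbc_cast_predicate_push_down (assumed semantics).
    """
    if enable_cast_push_down:
        # Assumed: the old behavior pushes the predicate down unconditionally.
        return True
    return (not column_is_cast) and types_match
```

Under this reading, a cast applied only to the constant side does not block pushdown as long as the resulting types match, which corresponds to the second row of the table.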
2024-10-22 17:27:29 +08:00
9c4ce73dfa [Pick](nereids) pick 40529 41464 40349 (#42073)
## Proposed changes
pick  #40529 #41464 #40349

2024-10-22 15:04:50 +08:00
85a98df9c2 [Fix](count on index) fix count on index opt when count project expr #41772 (#42229)
cherry pick from #41772
2024-10-22 13:16:32 +08:00
803c052100 [fix](nereids)modify agg function nullability in PhysicalHashAggregate (#42018)
## Proposed changes

pick from master https://github.com/apache/doris/pull/41943

2024-10-22 11:23:39 +08:00
104d427afa [cherry-pick][chore](audit) Optimize the SQL (insert into values) length in audit logs (#37894) and let line comment work well (#40599) (#42186)
## Proposed changes

cherry-pick from master #37894 and #40599

2024-10-22 10:16:05 +08:00
8ce65cab86 [chore](log) Adjust log level for replaying a batch editlog cost time(#41392) (#42182)
* `replay journal cost too much time` is a counter for replaying a batch
of editlog entries. It is normal for this to take a long time, so
reporting it at warning level can be confusing.

2024-10-22 10:06:11 +08:00
c6b35ba9b8 [cherry-pick](paimon) pick part of refactor code from #34496 (#42221)
#34496

Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
2024-10-22 09:44:41 +08:00
ac3d64c010 [feature](hive)support create hive table for text format #41860 (#42195)
cherry pick from #41860

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-10-21 21:30:11 +08:00
720a4c9f30 [Fix](Branch-2.1) fix fallback to original planer when enable_server_side_prepared_statement = false (#42156) 2024-10-21 17:46:24 +08:00
bbd4970ed8 [feature](jdbc catalog) support gbase jdbc catalog #41027 #41587 (#42123)
cherry pick from #41027 #41587

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-10-21 16:52:23 +08:00
a150d160ea [fix](jdbc catalog) fix and add mysql and doris extremum test #41679 (#42122)
cherry pick from #41679

---------

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-10-21 16:39:40 +08:00
9ac8b44d65 [bugfix](hive)Use the connected user to initialize the owner of the hive table #41876 (#42121)
cherry pick from #41876

---------

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-10-21 14:41:13 +08:00
9383378f2e [Fix](Export) show export statement supports specify the catalog name #41662 (#42117)
cherry pick from #41662

Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
2024-10-19 10:47:28 +08:00