8777 Commits

Author SHA1 Message Date
63c2d22513 [cherry-pick](branch-2.1) Pick "[Fix](delete command) Mark delete sign when do delete command in MoW table (#35917)" (#37594)
Pick #35917 and #37151
2024-07-15 18:54:01 +08:00
03e21dddff [cherry-pick](branch-21) fix cast string to int return wrong result (#36788) (#37803)
## Proposed changes
cherry-pick from master:
https://github.com/apache/doris/pull/36788
https://github.com/apache/doris/pull/36505

2024-07-15 18:48:49 +08:00
Pxl
d1fc4e2e60 [Bug](query) fix meet invalid column when direct scan on mow mv (#37806)
pick from #36483
2024-07-15 18:29:30 +08:00
57301920e3 [fix](colocate join) fix wrong use of colocate join (#37361) (#37714)
cherry-pick from master #37361
2024-07-15 16:47:17 +08:00
e5339a4014 [feature](ES Catalog)Support control scroll level by config #37180 (#37290)
## Proposed changes

backport #37180
2024-07-15 16:41:38 +08:00
ff7a04093e [fix](fe) fix several blocking bugs #37756 (#37757)
bp #37756
2024-07-15 15:56:01 +08:00
7bd6818350 [branch-2.1][improvement](jdbc catalog) Added support for Oracle Raw type (#37776)
pick (#37078)
In previous versions, we adopted the strategy of reading the object
address for Oracle's raw type, which would lead to unstable and
meaningless results. Here I changed it to read hexadecimal or UTF8
2024-07-15 14:43:05 +08:00
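The Oracle RAW change in 7bd6818350 can be illustrated with a minimal sketch. The commit message is cut off, so the rule for choosing between the two renderings is an assumption here: try UTF-8 first and fall back to hexadecimal. The function name `raw_to_string` is hypothetical and is not taken from the Doris code base.

```python
def raw_to_string(raw: bytes) -> str:
    """Render RAW bytes as text (assumed policy, not Doris's actual one).

    Decodes as UTF-8 when the bytes form valid UTF-8; otherwise returns
    an uppercase hexadecimal string, so the result never depends on an
    object address.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.hex().upper()


if __name__ == "__main__":
    print(raw_to_string(b"doris"))
    print(raw_to_string(bytes([0xFF, 0x00])))
```

Either rendering is deterministic for the same stored bytes, which is the property the commit message asks for.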
2759383365 [branch-2.1](timezone) refactor tzdata load to accelerate and unify timezone parsing (#37062) (#37269)
pick https://github.com/apache/doris/pull/37062

1. Revert https://github.com/apache/doris/pull/25097. We now rely on the
OS tzdata and no longer maintain an independent copy, to keep results
consistent.
2. Refactor timezone loading and remove the rwlock.

before:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
|                                                                            16000000 |                                               16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (6.88 sec)
```
now:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
|                                                                            16000000 |                                               16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (2.61 sec)
```
3. Timezone offset strings such as 'UTC+8' are no longer supported, as
already documented at
https://doris.apache.org/docs/dev/query/query-variables/time-zone/#usage
4. Support case-insensitive timezone parsing in Nereids.
5. Fix a bug when parsing timezones in Nereids: DST should be checked
against the input time, but was wrongly checked against the current time.

doc pr: https://github.com/apache/doris-website/pull/810
2024-07-15 10:56:48 +08:00
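Points 3 and 4 of 2759383365 (reject 'UTC+8'-style strings, match zone names case-insensitively) can be sketched as follows. This is only an illustration: `KNOWN_ZONES` is a hypothetical stand-in for the names the OS tzdata provides, `resolve_time_zone` is not a Doris function, and the accepted offset shape `+HH:MM` is inferred from the `'+00:00'` literal in the queries above.

```python
import re
from datetime import timedelta, timezone

# Hypothetical stand-in for the zone names found in the OS tzdata.
KNOWN_ZONES = ["Asia/Shanghai", "America/Los_Angeles", "UTC"]
_BY_LOWER = {name.lower(): name for name in KNOWN_ZONES}

# Assumed offset shape: sign, two-digit hours, colon, two-digit minutes.
_OFFSET = re.compile(r"^([+-])(\d{2}):(\d{2})$")


def resolve_time_zone(text: str):
    """Return a fixed-offset tzinfo or the canonical zone name.

    Raises ValueError for anything else, including 'UTC+8'.
    """
    match = _OFFSET.match(text)
    if match:
        sign = 1 if match.group(1) == "+" else -1
        delta = timedelta(hours=int(match.group(2)), minutes=int(match.group(3)))
        return timezone(sign * delta)
    canonical = _BY_LOWER.get(text.lower())
    if canonical is None:
        raise ValueError(f"unsupported time zone: {text!r}")
    return canonical


if __name__ == "__main__":
    print(resolve_time_zone("asia/shanghai"))
    print(resolve_time_zone("+08:00"))
```

Building the lowercase lookup table once, rather than scanning names per call, is in the spirit of the refactor, though the real loading code may differ.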
3da5b17abf [branch-2.1](timezone) make TimeUtils formatter use correct time_zone (#37465) (#37652)
All timestamp/datetime parsing in Doris is controlled by the session
variable `time_zone`.
Apply it to the `TimeUtils` interfaces in FE as well.

pick https://github.com/apache/doris/pull/37465
2024-07-15 10:23:38 +08:00
16de141743 [regression](kerberos)add hive kerberos docker regression env (#37657)
## Proposed changes
pick:
[regression](kerberos)fix regression pipeline env when write hosts (#37057)
[regression](kerberos)add hive kerberos docker regression env (#36430)
2024-07-15 09:35:39 +08:00
ec8467f57b [fix](auto bucket) Fix hit not support alter estimate_partition_size #33670 (#37633)
cherry pick from #33670
2024-07-13 22:12:38 +08:00
8930df3b31 [Feature](iceberg-writer) Implements iceberg partition transform. (#37692)
## Proposed changes

Cherry-pick iceberg partition transform functionality. #36289 #36889

---------

Co-authored-by: kang <35803862+ghkang98@users.noreply.github.com>
Co-authored-by: lik40 <lik40@chinatelecom.cn>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
2024-07-13 16:07:50 +08:00
56a207c3f0 [case](paimon/iceberg)move cases from p2 to p0 (#37276) (#37738)
bp #37276

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-07-13 10:01:05 +08:00
d91376cd52 [bugfix](paimon)adding dependencies for clang #37512 (#37737)
cherry pick from #37512

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-07-13 09:59:35 +08:00
20758576b2 [fix](split) remove retry when fetch split batch failed (#37637)
bp: #37636
2024-07-12 22:46:03 +08:00
019cd9b4ec [fix](hudi) return empty if there is no commit implemented (#37703)
bp: #37702
2024-07-12 22:44:58 +08:00
f2556ba182 [feature](insert)support external hive truncate table DDL (#37659)
pick: #36801
2024-07-12 22:37:47 +08:00
259d28407e [improvement](statistics)Enable estimate hive table row count using file size. (#37218) (#37694)
backport: https://github.com/apache/doris/pull/37218
2024-07-12 13:47:27 +08:00
ffa9e49bc7 [feature](mtmv) pick some mtmv pr from master (#37651)
cherry-pick from master
pr: #36318
commitId: c1999479

pr: #36111
commitId: 35ebef62

pr: #36175
commitId: 4c8e66b4

pr: #36414
commitId: 5e009b5a

pr: #36770
commitId: 19e2126c

pr: #36567
commitId: 3da83514
2024-07-12 10:35:54 +08:00
6214d6421f [Fix](planner) fix bug of char(255) toSql (#37340) (#37671)
cherry-pick #37340 from master
2024-07-12 10:33:24 +08:00
217eac790b [pick](Variant) pick some refactor and fix #34925 #36317 #36201 #36793 (#37526)
2024-07-11 21:25:34 +08:00
fdf21ec251 [fix](readconsistency) avoid table not exist error (#37593) (#37641)
A query issued right after creating a table could fail with a
table-not-exist error.

For example:
t1: the client issues CREATE TABLE to the master FE
t2: the client issues a query to an observer FE; the query fails in the
plan phase because the table does not exist there yet
t3: the observer FE receives the editlog that creates the table from the
master FE

After this PR:
the query at t2 waits until the observer FE has received the latest edit
log from the master FE.

pick #37593

2024-07-11 18:57:53 +08:00
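The waiting behaviour described in fdf21ec251 can be modelled with a small sketch. Everything here is hypothetical (`ObserverJournal`, `replay`, `wait_until`, and the idea of comparing integer journal ids); it only illustrates "block the query until the observer has replayed the master's latest edit log", not the actual FE implementation.

```python
import threading


class ObserverJournal:
    """Hypothetical model of an observer FE replaying the master's edit log."""

    def __init__(self) -> None:
        self._replayed_id = 0
        self._cond = threading.Condition()

    def replay(self, journal_id: int) -> None:
        """Record that edit log entries up to journal_id have been applied."""
        with self._cond:
            self._replayed_id = max(self._replayed_id, journal_id)
            self._cond.notify_all()

    def wait_until(self, target_id: int, timeout: float) -> bool:
        """Block until target_id has been replayed; False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self._replayed_id >= target_id, timeout
            )


if __name__ == "__main__":
    journal = ObserverJournal()
    # t3 happens shortly after the query at t2 starts waiting.
    timer = threading.Timer(0.05, journal.replay, args=(1,))
    timer.start()
    print(journal.wait_until(1, timeout=2.0))
    timer.join()
```

A timeout keeps a query from hanging forever if the observer falls far behind; whether the real change uses one is not stated in the commit message.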
cee3cf8499 [fix](statistics)Fix column cached stats size bug. (#37545) (#37667)
backport: https://github.com/apache/doris/pull/37545
2024-07-11 18:53:12 +08:00
8a0d940914 [fix](publish) Pick Fix publish failed because because task is null (#37546)
## Proposed changes

Pick https://github.com/apache/doris/pull/37531

This PR catches the exception so that a failed txn does not block the
other txns.
2024-07-11 15:22:04 +08:00
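The idea behind 8a0d940914 (one failing txn must not stop the rest) follows a common pattern, sketched here under assumptions: `publish_all` and `publish_one` are hypothetical names, and the real code may log or retry rather than collect failures.

```python
def publish_all(txns, publish_one):
    """Try to publish every txn; return the (txn, error) pairs that failed.

    An exception raised for one txn is caught, so the loop continues
    with the remaining txns instead of aborting.
    """
    failed = []
    for txn in txns:
        try:
            publish_one(txn)
        except Exception as exc:  # broad on purpose: isolate each txn
            failed.append((txn, exc))
    return failed


if __name__ == "__main__":
    published = []

    def publish_one(txn):
        # Simulates the "task is null" failure from the commit title.
        if txn["task"] is None:
            raise ValueError("task is null")
        published.append(txn["id"])

    txns = [{"id": 1, "task": "a"}, {"id": 2, "task": None}, {"id": 3, "task": "c"}]
    failures = publish_all(txns, publish_one)
    print(published, [t["id"] for t, _ in failures])
```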
39ded1f649 [branch-2.1][improvement](jdbc catalog) Change JdbcExecutor's error reporting from UDF to JDBC (#37635)
pick (#35692)

In the initial version, JdbcExecutor directly used UdfRuntimeException,
which could lead to misunderstanding of the exception. Therefore, I
created a separate Exception for JdbcExecutor to help us view the
exception more clearly.
2024-07-11 15:11:41 +08:00
ef754487d9 [branch-2.1][improvement](jdbc catalog) Catch AbstractMethodError in getColumnValue Method and Suggest Updating to ojdbc8+ (#37634)
pick (#37608)

Catch AbstractMethodError in getColumnValue method. Provide a clear
error message suggesting the use of ojdbc8 or higher versions to avoid
compatibility issues.
2024-07-11 15:10:47 +08:00
1eb04cf538 [feature](mtmv) Support query rewrite by materialized view when query is aggregate and materialized view has no aggregate (#36278) (#37497)
cherry-pick from master
pr: #36278
commitId: 649f9bc6
2024-07-11 10:54:50 +08:00
e6b8ebc847 [Fix](Short Circuit) fix no project list in OlapScanNode (#37121) (#37504)
pick from #37121
2024-07-11 10:04:28 +08:00
e1cb568d11 [Optimize] Add session variable `max_fetch_remote_schema_tablet_count… (#37505)
pick from #37217
2024-07-11 10:04:20 +08:00
f58032f1da [fix](dynamic partition) drop partition exclude history_partition_num #37539 (#37570)
cherry pick from #37539

---------

Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>
2024-07-10 22:09:00 +08:00
a7416f6ff9 [fix](dump) do not report success if dumping fails (#37510)
## Proposed changes
pick https://github.com/apache/doris/pull/37508

2024-07-10 16:26:14 +08:00
87cd366636 [chore](dependencies)Remove Unnecessary Dependencies (#37469) (#37555)
bp #37469
2024-07-10 09:33:43 +08:00
770e7d21a4 [dependencies](fe)upgrade paimon to 0.8.1 (#37205) (#37554)
bp #37205
2024-07-10 09:32:33 +08:00
db4d061a68 [fix](Nereids) null type result with alias name should keep alias name (#37457) (#37524)
pick from master #37457
2024-07-09 20:46:51 +08:00
8ef83259ff [fix](planner) fix bug of select stmt toSql(#37274) (#37344)
cherry-pick from master #37274
2024-07-09 20:34:57 +08:00
9b075bc873 [fix](nereids) derive column stats for 'expr and A is not null' (#37235) (#37498)
pick from #37235
The algorithm for computing stats for an "expr1 and expr2" predicate is
as follows:
1. Compute the output stats of expr1 based on the input stats. The result
is denoted by leftStats.
2. Compute the stats of expr2 based on leftStats. After step 1, leftStats
should be normalized to avoid abnormal cases, such as ndv > rowCount or
numNulls > rowCount.

2024-07-09 17:46:57 +08:00
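The two-step estimate described in 9b075bc873 can be sketched as below. This is an illustration only: `ColumnStats`, `normalize`, and `estimate_and` are hypothetical names, and clamping ndv and numNulls to rowCount is an assumed form of the normalization, since the commit message does not spell it out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ColumnStats:
    row_count: float
    ndv: float
    num_nulls: float


def normalize(stats: ColumnStats) -> ColumnStats:
    """Clamp ndv and num_nulls so neither exceeds row_count (assumed rule)."""
    return replace(
        stats,
        ndv=min(stats.ndv, stats.row_count),
        num_nulls=min(stats.num_nulls, stats.row_count),
    )


def estimate_and(input_stats, estimate_expr1, estimate_expr2):
    """Estimate stats for 'expr1 and expr2'.

    Step 1: apply expr1 to the input stats to obtain leftStats.
    Step 2: normalize leftStats, then apply expr2 to it.
    """
    left_stats = normalize(estimate_expr1(input_stats))
    return estimate_expr2(left_stats)


if __name__ == "__main__":
    # Toy estimators: expr1 keeps half of the rows, expr2 is 'A is not null'.
    def halve(s):
        return replace(s, row_count=s.row_count / 2)

    def not_null(s):
        return replace(s, row_count=s.row_count - s.num_nulls, num_nulls=0)

    print(estimate_and(ColumnStats(100, 80, 10), halve, not_null))
```

Without the normalization step, the toy `halve` estimator would hand expr2 a column with ndv 80 but only 50 rows, which is exactly the abnormal case the message mentions.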
7f6b846f58 [improve](fe) Add a counter metric for recording large editlog write (#37328) (#37474)
2024-07-09 17:16:31 +08:00
9b500faa0c [fix](create table) create table fail not write drop table editlog #37488 (#37506)
cherry pick from #37488
2024-07-09 13:44:43 +08:00
4426d6d80f [fix](fe) Add check editlog size mechanism for backupJob (#35653) (#37466)
* Creating a backupJob for a database with a huge number of tables can
push the backupJob editlog size over 2GB, and bdbje then throws an
exception because of a ByteBuffer overflow.

2024-07-09 10:33:28 +08:00
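A size check of the kind 4426d6d80f describes might look like the sketch below. The names `MAX_EDITLOG_BYTES` and `check_editlog_size` are hypothetical, and the actual threshold and failure handling in Doris are not given in the commit message; the only fixed fact used is that a Java ByteBuffer is indexed by a 32-bit signed int.

```python
# A Java ByteBuffer is indexed by a signed 32-bit int, so it cannot hold
# more than 2**31 - 1 bytes. The limit actually used by Doris is an assumption.
MAX_EDITLOG_BYTES = 2**31 - 1


def check_editlog_size(serialized_size: int, limit: int = MAX_EDITLOG_BYTES) -> None:
    """Raise before writing if the serialized job would exceed the limit."""
    if serialized_size > limit:
        raise ValueError(
            f"editlog entry is {serialized_size} bytes, over the {limit}-byte limit"
        )


if __name__ == "__main__":
    check_editlog_size(10 * 1024 * 1024)  # a 10 MiB entry passes silently
    try:
        check_editlog_size(3 * 1024**3)  # a 3 GiB entry is rejected up front
    except ValueError as err:
        print(err)
```

Failing early with a clear message replaces an opaque overflow deep inside the storage layer.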
ca0e44f83f [fix](case) fix struct format out files (#37350) (#37499)
bp #37350
2024-07-09 10:11:50 +08:00
a6ec78ec5f [fix](truncate) fix tablet invert index leaky #37334 (#37410)
cherry pick from #37334
2024-07-08 20:57:30 +08:00
4c5a7b26e2 [chore](restore) Log partition visible version (#37414)
Cherry-pick #36920.
2024-07-08 18:58:16 +08:00
68352c7c77 [fix](Nereids) constant folding for str_to_date on datev1 is wrong (#37360) (#37367)
pick from master #37360

When datev1 is enabled, we should return datev1/datetimev1.
2024-07-08 15:21:12 +08:00
fbc954e8be [feat](mtmv) Support grouping_sets rewrite when query rewrite by materialized view (#36056) (#37436)
cherry pick from master
pr: #36056
commitId: 569c9772
2024-07-08 15:06:16 +08:00
779a51570e [opt](mtmv) Set query rewrite by materialized view default enable (#35897) (#36949)
cherry pick from master
pr: #35897
commitId: 603fa82f
2024-07-08 14:29:38 +08:00
95dad14062 [chore](query) print query id when killed by timeout checker (#37402)
pick #36868
2024-07-08 11:26:29 +08:00
dd18652861 [branch-2.1](routine-load) make get Kafka meta timeout configurable (#37399)
pick #36619
2024-07-08 10:39:17 +08:00
af7b69da48 [fix](nereids) Pick the stop watch is not reset (#37168) (#37397)
Pick https://github.com/apache/doris/pull/37168 and
https://github.com/apache/doris/pull/37095
2024-07-08 10:28:03 +08:00
97e4025ee0 [branch-2.1](routine-load) increase routine load job default max batch size and rows (#37388)
pick #36632

Most users only care about the size of **max_batch_interval**, but in
order to achieve an interval effect, they have to configure
**max_batch_rows** and **max_batch_size** according to the
characteristics of the data. By adjusting these two default values,
users do not need to worry about configuration in most scenarios.

2024-07-07 18:35:08 +08:00
a05406ecc9 [branch-2.1] Picks "[Fix](delete) Fix delete job timeout when executing delete from ... #37363" (#37374)
## Proposed changes

picks https://github.com/apache/doris/pull/37363
2024-07-07 18:33:17 +08:00