Commit Graph

19376 Commits

Author SHA1 Message Date
78eb9d8e33 [case](restart_fe) add demo case for restart_fe test (#37091) (#37313)
pick from master #37091

Co-authored-by: stephen <hello-stephen@qq.com>
2024-07-15 19:42:20 +08:00
63c2d22513 [cherry-pick](branch-2.1) Pick "[Fix](delete command) Mark delete sign when do delete command in MoW table (#35917)" (#37594)
Pick #35917 and #37151
2024-07-15 18:54:01 +08:00
03e21dddff [cherry-pick](branch-21) fix cast string to int return wrong result (#36788) (#37803)
## Proposed changes
cherry-pick from master:
https://github.com/apache/doris/pull/36788
https://github.com/apache/doris/pull/36505

<!--Describe your changes.-->
2024-07-15 18:48:49 +08:00
Pxl
e5219467dd [Bug](join) avoid overflow on bucket_size+1 (#37807)
## Proposed changes
pick from #37493
2024-07-15 18:47:36 +08:00
b7dbd5c186 [feature](inverted index) add ordered functionality to match_phrase query (#37751)
## Proposed changes

1. select count() from tbl where b match_phrase 'the brown ~2+';
2024-07-15 18:42:48 +08:00
Pxl
d1fc4e2e60 [Bug](query) fix meet invalid column when direct scan on mow mv (#37806)
pick from #36483
2024-07-15 18:29:30 +08:00
ce4983eaf2 [datalake](hudi) add hudi docker compose to run hudi examples (#37774)
bp: #37451
2024-07-15 17:16:59 +08:00
967173d7d0 [cherry-pick-2.1](table-function) pick some table functions exec performance (#34090) (#37778)
## Proposed changes

pick from master:
https://github.com/apache/doris/pull/33904
https://github.com/apache/doris/pull/34090

Co-authored-by: HappenLee <happenlee@hotmail.com>
2024-07-15 17:15:56 +08:00
a4d37d96ca [opt](file-scanner) add not found file number in profile (#37042) (#37764)
bp #37042
2024-07-15 17:11:06 +08:00
57301920e3 [fix](colocate join) fix wrong use of colocate join (#37361) (#37714)
cherry-pick from master #37361
2024-07-15 16:47:17 +08:00
e5339a4014 [feature](ES Catalog)Support control scroll level by config #37180 (#37290)
## Proposed changes

backport #37180
2024-07-15 16:41:38 +08:00
8df2432e94 [fix](inverted index) implementation of match function without index #36471 (#36918) 2024-07-15 16:19:41 +08:00
ea12114549 [fix](dockerfile) Switch repos to point to to vault.centos.org because CentOS 7 is EOL (#37568) (#37763)
bp #37568
2024-07-15 15:57:56 +08:00
ff7a04093e [fix](fe) fix several blocking bugs #37756 (#37757)
bp #37756
2024-07-15 15:56:01 +08:00
8360e3f6cf [fix](sleep) sleep with character const make be crash (#37681) (#37775)
cherry-pick #37681 to branch-2.1
2024-07-15 14:57:46 +08:00
de61887cdc [chore](log) reduce print warning msg during be starting up #36710 (#37780)
cherry pick from #36710
2024-07-15 14:46:54 +08:00
7bd6818350 [branch-2.1][improvement](jdbc catalog) Added support for Oracle Raw type (#37776)
pick (#37078)
In previous versions, we adopted the strategy of reading the object
address for Oracle's raw type, which would lead to unstable and
meaningless results. Here I changed it to read hexadecimal or UTF8
2024-07-15 14:43:05 +08:00
79f6b647d5 [FIX] should check fe host standing when coordinator is not found. (#37772)
fix https://github.com/apache/doris/pull/37707
2024-07-15 12:27:31 +08:00
232202b71f [improve](load) reduce memory reserved in memtable limiter (#37511) (#37699)
cherry-pick #37511
2024-07-15 11:09:09 +08:00
2759383365 [branch-2.1](timezone) refactor tzdata load to accelerate and unify timezone parsing (#37062) (#37269)
pick https://github.com/apache/doris/pull/37062

1. revert https://github.com/apache/doris/pull/25097. we decide to rely
on OS. not maintain independent tzdata anymore to keep result
consistency
2. refactor timezone load. removed rwlock.

before:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
|                                                                            16000000 |                                               16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (6.88 sec)
```
now:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
|                                                                            16000000 |                                               16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (2.61 sec)
```
3. now don't support timezone offset format string like 'UTC+8', like we
already said in
https://doris.apache.org/docs/dev/query/query-variables/time-zone/#usage
4. support case-insensitive timezone parsing in nereids.
5. a bug when parse timezone using nereids. should check DST by input,
but wrongly by now before. now fixed.

doc pr: https://github.com/apache/doris-website/pull/810
2024-07-15 10:56:48 +08:00
351ba4aeb2 [opt](spill) handle oom exception in spill tasks (#35025) (#35171) 2024-07-15 10:33:33 +08:00
31b3afa2c8 [fix](pipeline) fix exception safety issue in MultiCastDataStreamer (#36814)
## Proposed changes

pick #36748

```cpp
RETURN_IF_ERROR(vectorized::MutableBlock(block).merge(*pos_to_pull->_block))
```
this line may throw an exception(cannot allocate)

```
*** Query id: b7b80bfd76cc42a5-a9916f8364d5a4d3 ***
*** tablet id: 0 ***
*** Aborted at 1719187603 (unix time) try "date -d @1719187603" if you are using GNU date ***
*** Current BE git commitID: a8c48f5328 ***
*** SIGSEGV address not mapped to object (@0x47) received by PID 1197117 (TID 1197376 OR 0x7f49a25e4640) from PID 71; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris_branch-2.0/doris/be/src/common/signal_handler.h:417
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 4# 0x00007F4ABB927520 in /lib/x86_64-linux-gnu/libc.so.6
 5# std::default_delete<doris::vectorized::Block>::operator()(doris::vectorized::Block*) const at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:85
 6# doris::pipeline::MultiCastDataStreamer::close_sender(int) at /root/doris_branch-2.0/doris/be/src/pipeline/exec/multi_cast_data_streamer.cpp:60
 7# doris::pipeline::MultiCastDataStreamerSourceOperator::close(doris::RuntimeState*) at /root/doris_branch-2.0/doris/be/src/pipeline/exec/multi_cast_data_stream_source.cpp:120
 8# doris::pipeline::PipelineTask::close() at /root/doris_branch-2.0/doris/be/src/pipeline/pipeline_task.cpp:334
 9# doris::pipeline::TaskScheduler::_try_close_task(doris::pipeline::PipelineTask*, doris::pipeline::PipelineTaskState) at /root/doris_branch-2.0/doris/be/src/pipeline/task_scheduler.cpp:353
10# doris::pipeline::TaskScheduler::_do_work(unsigned long) in /mnt/disk1/STRESS_ENV/be/lib/doris_be
11# doris::ThreadPool::dispatch_thread() in /mnt/disk1/STRESS_ENV/be/lib/doris_be
12# doris::Thread::supervise_thread(void*) at /root/doris_branch-2.0/doris/be/src/util/thread.cpp:499
13# start_thread at ./nptl/pthread_create.c:442
14# 0x00007F4ABBA0B850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
```

<!--Describe your changes.-->
2024-07-15 10:32:20 +08:00
3da5b17abf [branch-2.1](timezone) make TimeUtils formatter use correct time_zone (#37465) (#37652)
All timestamp/datetime parsing in Doris is controlled by the session
variable `time_zone`.
Apply it also to interface of `TimeUtils` in FE.

pick https://github.com/apache/doris/pull/37465
2024-07-15 10:23:38 +08:00
9556c07a16 [mac](compile) fix compile error on mac (#37726) 2024-07-15 10:19:42 +08:00
8de13c5cc8 [fix](function) error scale set in unix_timestamp (#36110) (#37619)
## Proposed changes

```
mysql [test]>set DEBUG_SKIP_FOLD_CONSTANT = true;
Query OK, 0 rows affected (0.00 sec)

mysql [test]>select cast(unix_timestamp("2024-01-01",'yyyy-MM-dd') as bigint);
+------------------------------------------------------------+
| cast(unix_timestamp('2024-01-01', 'yyyy-MM-dd') as BIGINT) |
+------------------------------------------------------------+
|                                           1704038400000000 |
+------------------------------------------------------------+
```
now
```
mysql [test]>select cast(unix_timestamp("2024-01-01",'yyyy-MM-dd') as bigint);
+------------------------------------------------------------+
| cast(unix_timestamp('2024-01-01', 'yyyy-MM-dd') as BIGINT) |
+------------------------------------------------------------+
|                                                 1704038400 |
+------------------------------------------------------------+
1 row in set (0.01 sec)
```

The column does not have a scale set, but the cast uses the scale to
perform the cast.


<!--Describe your changes.-->

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-07-15 10:00:04 +08:00
b55dd6f644 [fix](delete) fix the error message for valid decimal data for 2.1 (#37710)
## Proposed changes

cherry-pick : #36802

<!--Describe your changes.-->
2024-07-15 09:54:42 +08:00
16de141743 [regression](kerberos)add hive kerberos docker regression env (#37657)
## Proposed changes
pick:
[regression](kerberos)fix regression pipeline env when write hosts 
(#37057)
[regression](kerberos)add hive kerberos docker regression env (#36430)
2024-07-15 09:35:39 +08:00
8f39143c14 [test](fix) replace hardcode s3BucketName (#37750)
## Proposed changes

pick from master #37739 

<!--Describe your changes.-->

---------

Co-authored-by: stephen <hello-stephen@qq.com>
2024-07-14 18:38:52 +08:00
747172237a [branch-2.1](memory) Pick some memory GC patch (#37725)
pick
#36768
#37164
#37174
#37525
2024-07-14 15:19:40 +08:00
ec8467f57b [fix](auto bucket) Fix hit not support alter estimate_partition_size #33670 (#37633)
cherry pick from #33670
2024-07-13 22:12:38 +08:00
00a5718541 [fix](test) fix test typo (#37741) 2024-07-13 20:00:56 +08:00
5162789234 [Refactor](Variant) make many insterfaces exception safe (#37640) (#37719) 2024-07-13 16:52:10 +08:00
8930df3b31 [Feature](iceberg-writer) Implements iceberg partition transform. (#37692)
## Proposed changes

Cherry-pick iceberg partition transform functionality. #36289 #36889

---------

Co-authored-by: kang <35803862+ghkang98@users.noreply.github.com>
Co-authored-by: lik40 <lik40@chinatelecom.cn>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mingyu Chen <morningman@163.com>
2024-07-13 16:07:50 +08:00
56a207c3f0 [case](paimon/iceberg)move cases from p2 to p0 (#37276) (#37738)
bp #37276

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-07-13 10:01:05 +08:00
d91376cd52 [bugfix](paimon)adding dependencies for clang #37512 (#37737)
cherry pick from #37512

Co-authored-by: wuwenchi <wuwenchihdu@hotmail.com>
2024-07-13 09:59:35 +08:00
7887f51e9b [fix](partial update) fix a mem leak issue (#37706) (#37730)
cherry-pick #37706
2024-07-13 09:20:01 +08:00
20758576b2 [fix](split) remove retry when fetch split batch failed (#37637)
bp: #37636
2024-07-12 22:46:03 +08:00
019cd9b4ec [fix](hudi) return empty if there is no commit implemented (#37703)
bp: #37702
2024-07-12 22:44:58 +08:00
f2556ba182 [feature](insert)support external hive truncate table DDL (#37659)
pick: #36801
2024-07-12 22:37:47 +08:00
326b40cde2 [branch-2.1](memory) Add HTTP API to clear data cache (#37704)
pick #36599

Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2024-07-12 17:21:52 +08:00
a61030215e [branch-2.1](memory) Support make all memory snapshots (#37705)
pick #36679
2024-07-12 16:21:37 +08:00
ab8204b33b [fix](regression) fix multi replica case not executed (#37425) (#37698)
cherry-pick #37425
2024-07-12 15:53:20 +08:00
035027f831 [fix](query cancel) Fix query is cancelled when it comes from follower FE #37662 (#37707)
cherry pick from #37662
2024-07-12 15:50:45 +08:00
259d28407e [improvement](statistics)Enable estimate hive table row count using file size. (#37218) (#37694)
backport: https://github.com/apache/doris/pull/37218
2024-07-12 13:47:27 +08:00
37583d2d0a [test](statistics)Add test case for set global variables. (#37582) (#37691)
backport: https://github.com/apache/doris/pull/37582
2024-07-12 13:46:53 +08:00
ef031c5fb2 [branch-2.1](memory) Fix reserve memory compatible with memory GC and logging (#37682)
pick
#36307
#36412
2024-07-12 11:43:26 +08:00
ffa9e49bc7 [feature](mtmv) pick some mtmv pr from master (#37651)
cherry-pick from master
pr: #36318
commitId: c1999479

pr: #36111
commitId: 35ebef62

pr: #36175
commitId: 4c8e66b4

pr: #36414
commitId: 5e009b5a

pr: #36770
commitId: 19e2126c

pr: #36567
commitId: 3da83514
2024-07-12 10:35:54 +08:00
6214d6421f [Fix](planner) fix bug of char(255) toSql (#37340) (#37671)
cherry-pick #37340 from master
2024-07-12 10:33:24 +08:00
4dc933bb28 [cherry-pick] (branch-2.1) fix query errors caused by ignore_above (#37685)
## Proposed changes
pick from master #37679
2024-07-12 09:31:45 +08:00
87912de93f [fix](scan) catch exceptions thrown in scanner (#36101) (#37408)
## Proposed changes

pick #36101

The uncaught exceptions thrown in the scanner will cause the BE to
crash.
2024-07-12 08:49:39 +08:00