Commit Graph

19400 Commits

Author SHA1 Message Date
Pxl
c1b1437fc3 [Bug](bitmap) clear set when bitmap fastunion (#37816) (#37875)
pick from #37816
2024-07-16 16:01:32 +08:00
80ea98b371 [fix](nereids)subquery unnesting get wrong result if correlated conjuncts is not slot_a = slot_b (#37683)
pick from master https://github.com/apache/doris/pull/37644

2024-07-16 15:06:40 +08:00
02716598d4 [Fix](sql function) memory overflow to the left of string address when do_money_format has small negative value #36226 (#37870)
cherry pick from #36226

Co-authored-by: sparrow <38098988+biohazard4321@users.noreply.github.com>
2024-07-16 15:04:42 +08:00
2c80259f66 [fix](mtmv) use isManagedTable instead of check table type (#34287) (#37822)
pick: https://github.com/apache/doris/pull/34287
2024-07-16 15:01:28 +08:00
2c4b5519e9 [cherry-pick](branch-2.1) let insert statement support CTE (#36150) (#36265)
cherry-pick: #36150
2024-07-16 14:50:53 +08:00
Pxl
9bb37e3616 [Bug](runtime-filter) try to fix DCHECK fail on _acquire_runtime_filter (#37805)
## Proposed changes
pick from https://github.com/apache/doris/pull/35422
2024-07-16 14:37:54 +08:00
8e42871228 [fix](in expr) fix error result when in expr has null value and lef… (#37800)
https://github.com/apache/doris/pull/36024

## Proposed changes

```
create table t2 (id int, c1 int);
insert into t2 values(1, null);
select 0 in (c1, null) from t2; -- should return NULL, but returns 1
```
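
The expected result follows from SQL three-valued logic. The sketch below is a minimal model of that rule, not Doris code: `sql_in` is a hypothetical helper, and Python's `None` stands in for SQL NULL.

```python
from typing import Iterable, Optional


def sql_in(value: Optional[int], candidates: Iterable[Optional[int]]) -> Optional[bool]:
    """Evaluate `value IN (candidates...)` under SQL three-valued logic.

    None stands in for SQL NULL. Returns True, False, or None (unknown).
    """
    if value is None:
        return None  # NULL IN (...) is unknown for any non-empty list
    saw_null = False
    for candidate in candidates:
        if candidate is None:
            saw_null = True  # comparison with NULL is unknown, keep looking
        elif candidate == value:
            return True  # a definite match wins over any NULLs
    # No match: unknown if a NULL was compared, otherwise definitely False.
    return None if saw_null else False


c1 = None  # the row inserted above has c1 = NULL
print(sql_in(0, [c1, None]))  # None, i.e. SQL NULL
```

With `c1` set to NULL, every comparison is unknown, so the whole predicate is NULL rather than 1.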

2024-07-16 14:04:35 +08:00
253f929234 [cherry-pick](branch-2.1) fix inverted index file size (#37836)
## Proposed changes

pick from master #37232
pick from master #37564
2024-07-16 11:28:47 +08:00
e84b9a0eaa [fix](auth)fix fe can not restart when replay create row policy log (… (#37820)
pick: https://github.com/apache/doris/pull/37342
2024-07-16 11:28:19 +08:00
Pxl
d7e84b7ee3 [Enhancement](bitmap) optimize bitmap deserialize and remove some unused code (#37623)
## Proposed changes
pick from #35789
2024-07-16 11:21:54 +08:00
2957fdc039 [branch2.1] pick [fix](show) show create table show index comment err… (#37034)
## Proposed changes
pick https://github.com/apache/doris/pull/36306
2024-07-16 11:19:27 +08:00
47096f2083 [test](regression) add cases for data quality error url (#34987) (#37777)
cherry-pick #34987
2024-07-16 11:12:52 +08:00
7951090283 [minor](dependencies)upgrade aircompressor to 0.27 (#36106) (#37572)
(cherry picked from commit e4aaa3294213191e3ed04703861a8307c6fb850b)

## Proposed changes

Issue Number: #36106
2024-07-16 11:11:58 +08:00
f1d22a9610 [fix](mtmv) fix mtmv task nereids cost too much time (#37589) (#37819)
pick: https://github.com/apache/doris/pull/37589
2024-07-16 11:08:18 +08:00
3cb1d4e842 [feature](json)support explode_json_object func #36887 (#37378)
2024-07-16 10:59:11 +08:00
e64f2997e9 [fix](function) fix core when input not null array in foreach functio… (#37798)
## Proposed changes
https://github.com/apache/doris/pull/37349
Code that triggers the error:
```C++
return creator_without_type::create<AggregateFunctionForEach>(transform_arguments, true,
                                                                      nested_function);
```
`transform_arguments` holds the internal (element) types of the array.
All of those internal types are nullable, so an array that is itself not
null was mistakenly treated as a null array.

2024-07-16 10:57:11 +08:00
6932eef65e [opt](serde)Optimize the filling of fixed values into block columns without repeated deserialization. (#37377) (#37530)
bp #37377
2024-07-16 10:56:13 +08:00
0080815d11 [cherry-pick](branch-2.1)fix be metric doris_be_process_thread_num is zero (#36719)
cherry-pick #35511

2024-07-16 10:43:06 +08:00
e7a001c420 [enhance](mtmv)support partition tvf (#37795)
pick from: https://github.com/apache/doris/pull/36479 and
https://github.com/apache/doris/pull/37201
2024-07-16 09:27:44 +08:00
aa2b902633 [cherry-pick](branch-21) fix broadcast join running when hash table build not finished (#37844)
cherry-pick from master https://github.com/apache/doris/pull/37792

2024-07-16 09:20:06 +08:00
bdf3e3a17e [test](docker) change the default region for docker compose (#37768) (#37813)
bp #37768
2024-07-15 22:18:33 +08:00
1d49d386aa [cherry-pick](branch-21) remove the useless code in column vector (#34432) (#37827)
cherry-pick from master https://github.com/apache/doris/pull/34432

Co-authored-by: HappenLee <happenlee@hotmail.com>
2024-07-15 22:10:58 +08:00
Pxl
010d9d88f8 [Feature](rpc) support set brpc_idle_timeout_sec and enable thrift so… (#37808)
pick from #37333
2024-07-15 21:12:25 +08:00
091725e915 [case](paimon/iceberg)move cases from p2 to p0 for 2.1 (#37276) (#37818)
move cases from p2 to p0

bp: #37276
2024-07-15 21:08:52 +08:00
78eb9d8e33 [case](restart_fe) add demo case for restart_fe test (#37091) (#37313)
pick from master #37091

Co-authored-by: stephen <hello-stephen@qq.com>
2024-07-15 19:42:20 +08:00
63c2d22513 [cherry-pick](branch-2.1) Pick "[Fix](delete command) Mark delete sign when do delete command in MoW table (#35917)" (#37594)
Pick #35917 and #37151
2024-07-15 18:54:01 +08:00
03e21dddff [cherry-pick](branch-21) fix cast string to int return wrong result (#36788) (#37803)
## Proposed changes
cherry-pick from master:
https://github.com/apache/doris/pull/36788
https://github.com/apache/doris/pull/36505

2024-07-15 18:48:49 +08:00
Pxl
e5219467dd [Bug](join) avoid overflow on bucket_size+1 (#37807)
## Proposed changes
pick from #37493
2024-07-15 18:47:36 +08:00
b7dbd5c186 [feature](inverted index) add ordered functionality to match_phrase query (#37751)
## Proposed changes

1. select count() from tbl where b match_phrase 'the brown ~2+';
2024-07-15 18:42:48 +08:00
Pxl
d1fc4e2e60 [Bug](query) fix meet invalid column when direct scan on mow mv (#37806)
pick from #36483
2024-07-15 18:29:30 +08:00
ce4983eaf2 [datalake](hudi) add hudi docker compose to run hudi examples (#37774)
bp: #37451
2024-07-15 17:16:59 +08:00
967173d7d0 [cherry-pick-2.1](table-function) pick some table functions exec performance (#34090) (#37778)
## Proposed changes

pick from master:
https://github.com/apache/doris/pull/33904
https://github.com/apache/doris/pull/34090

Co-authored-by: HappenLee <happenlee@hotmail.com>
2024-07-15 17:15:56 +08:00
a4d37d96ca [opt](file-scanner) add not found file number in profile (#37042) (#37764)
bp #37042
2024-07-15 17:11:06 +08:00
57301920e3 [fix](colocate join) fix wrong use of colocate join (#37361) (#37714)
cherry-pick from master #37361
2024-07-15 16:47:17 +08:00
e5339a4014 [feature](ES Catalog)Support control scroll level by config #37180 (#37290)
## Proposed changes

backport #37180
2024-07-15 16:41:38 +08:00
8df2432e94 [fix](inverted index) implementation of match function without index #36471 (#36918)
2024-07-15 16:19:41 +08:00
ea12114549 [fix](dockerfile) Switch repos to point to vault.centos.org because CentOS 7 is EOL (#37568) (#37763)
bp #37568
2024-07-15 15:57:56 +08:00
ff7a04093e [fix](fe) fix several blocking bugs #37756 (#37757)
bp #37756
2024-07-15 15:56:01 +08:00
8360e3f6cf [fix](sleep) sleep with character const make be crash (#37681) (#37775)
cherry-pick #37681 to branch-2.1
2024-07-15 14:57:46 +08:00
de61887cdc [chore](log) reduce print warning msg during be starting up #36710 (#37780)
cherry pick from #36710
2024-07-15 14:46:54 +08:00
7bd6818350 [branch-2.1][improvement](jdbc catalog) Added support for Oracle Raw type (#37776)
pick (#37078)
In previous versions, Oracle's RAW type was read as the object address,
which produced unstable and meaningless results. This change reads the
value as hexadecimal or UTF8 instead.
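
As a rough illustration of the two renderings named above, the sketch below shows raw bytes as text when they decode cleanly as UTF-8 and as hexadecimal otherwise. The function name `render_raw`, the fallback order, and the `0x` prefix are all assumptions made for illustration; they are not taken from the JDBC catalog code.

```python
def render_raw(value: bytes) -> str:
    """Render raw bytes as UTF-8 text if possible, else as a hex string.

    Hypothetical helper: the order of the two attempts is an assumption.
    """
    try:
        return value.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: fall back to a stable hexadecimal rendering.
        return "0x" + value.hex().upper()


print(render_raw(b"Doris"))             # Doris
print(render_raw(bytes([0xFF, 0x00])))  # 0xFF00
```

Either rendering depends only on the stored bytes, unlike an object address, which changes from read to read.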
2024-07-15 14:43:05 +08:00
79f6b647d5 [FIX] should check fe host standing when coordinator is not found. (#37772)
fix https://github.com/apache/doris/pull/37707
2024-07-15 12:27:31 +08:00
232202b71f [improve](load) reduce memory reserved in memtable limiter (#37511) (#37699)
cherry-pick #37511
2024-07-15 11:09:09 +08:00
2759383365 [branch-2.1](timezone) refactor tzdata load to accelerate and unify timezone parsing (#37062) (#37269)
pick https://github.com/apache/doris/pull/37062

1. Revert https://github.com/apache/doris/pull/25097. We now rely on the
OS tzdata and no longer maintain an independent copy, to keep results
consistent.
2. Refactor timezone loading and remove the rwlock.

before:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
|                                                                            16000000 |                                               16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (6.88 sec)
```
now:
```sql
mysql [optest]>select count(convert_tz(d, 'Asia/Shanghai', 'America/Los_Angeles')), count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) from dates;
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
| count(convert_tz(cast(d as DATETIMEV2(6)), 'Asia/Shanghai', 'America/Los_Angeles')) | count(convert_tz(dt, 'America/Los_Angeles', '+00:00')) |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
|                                                                            16000000 |                                               16000000 |
+-------------------------------------------------------------------------------------+--------------------------------------------------------+
1 row in set (2.61 sec)
```
3. Timezone offset strings such as 'UTC+8' are no longer supported, as
already documented in
https://doris.apache.org/docs/dev/query/query-variables/time-zone/#usage
4. Support case-insensitive timezone parsing in Nereids.
5. Fix a bug in Nereids timezone parsing: DST should be determined from
the input time, but was previously determined from the current time.
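
Points 3 and 4 can be modeled with a short sketch. This is not the Nereids implementation: `normalize_time_zone` and `KNOWN_ZONES` are hypothetical, and the zone list is a tiny hard-coded subset where a real system would read names from the OS tzdata.

```python
import re
from typing import Optional

# Hypothetical subset of region names, for illustration only.
KNOWN_ZONES = ("Asia/Shanghai", "America/Los_Angeles", "UTC")
_BY_LOWER = {name.lower(): name for name in KNOWN_ZONES}
# Plain numeric offsets such as '+00:00' or '-08:00'.
_OFFSET = re.compile(r"[+-]\d{2}:\d{2}")


def normalize_time_zone(text: str) -> Optional[str]:
    """Return the canonical zone name, or None if the string is rejected."""
    candidate = text.strip()
    if _OFFSET.fullmatch(candidate):
        return candidate  # numeric offsets pass through unchanged
    # Case-insensitive lookup of region names; 'UTC+8' matches nothing.
    return _BY_LOWER.get(candidate.lower())


print(normalize_time_zone("asia/shanghai"))  # Asia/Shanghai
print(normalize_time_zone("+00:00"))         # +00:00
print(normalize_time_zone("UTC+8"))          # None
```

Mapping lower-cased input back to a canonical spelling keeps the lookup case-insensitive without changing how zone names are stored.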

doc pr: https://github.com/apache/doris-website/pull/810
2024-07-15 10:56:48 +08:00
351ba4aeb2 [opt](spill) handle oom exception in spill tasks (#35025) (#35171)
2024-07-15 10:33:33 +08:00
31b3afa2c8 [fix](pipeline) fix exception safety issue in MultiCastDataStreamer (#36814)
## Proposed changes

pick #36748

```cpp
RETURN_IF_ERROR(vectorized::MutableBlock(block).merge(*pos_to_pull->_block))
```
This line may throw an exception (cannot allocate).

```
*** Query id: b7b80bfd76cc42a5-a9916f8364d5a4d3 ***
*** tablet id: 0 ***
*** Aborted at 1719187603 (unix time) try "date -d @1719187603" if you are using GNU date ***
*** Current BE git commitID: a8c48f5328 ***
*** SIGSEGV address not mapped to object (@0x47) received by PID 1197117 (TID 1197376 OR 0x7f49a25e4640) from PID 71; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris_branch-2.0/doris/be/src/common/signal_handler.h:417
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 4# 0x00007F4ABB927520 in /lib/x86_64-linux-gnu/libc.so.6
 5# std::default_delete<doris::vectorized::Block>::operator()(doris::vectorized::Block*) const at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:85
 6# doris::pipeline::MultiCastDataStreamer::close_sender(int) at /root/doris_branch-2.0/doris/be/src/pipeline/exec/multi_cast_data_streamer.cpp:60
 7# doris::pipeline::MultiCastDataStreamerSourceOperator::close(doris::RuntimeState*) at /root/doris_branch-2.0/doris/be/src/pipeline/exec/multi_cast_data_stream_source.cpp:120
 8# doris::pipeline::PipelineTask::close() at /root/doris_branch-2.0/doris/be/src/pipeline/pipeline_task.cpp:334
 9# doris::pipeline::TaskScheduler::_try_close_task(doris::pipeline::PipelineTask*, doris::pipeline::PipelineTaskState) at /root/doris_branch-2.0/doris/be/src/pipeline/task_scheduler.cpp:353
10# doris::pipeline::TaskScheduler::_do_work(unsigned long) in /mnt/disk1/STRESS_ENV/be/lib/doris_be
11# doris::ThreadPool::dispatch_thread() in /mnt/disk1/STRESS_ENV/be/lib/doris_be
12# doris::Thread::supervise_thread(void*) at /root/doris_branch-2.0/doris/be/src/util/thread.cpp:499
13# start_thread at ./nptl/pthread_create.c:442
14# 0x00007F4ABBA0B850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
```

2024-07-15 10:32:20 +08:00
3da5b17abf [branch-2.1](timezone) make TimeUtils formatter use correct time_zone (#37465) (#37652)
All timestamp/datetime parsing in Doris is controlled by the session
variable `time_zone`.
Apply it to the `TimeUtils` interface in FE as well.

pick https://github.com/apache/doris/pull/37465
2024-07-15 10:23:38 +08:00
9556c07a16 [mac](compile) fix compile error on mac (#37726)
2024-07-15 10:19:42 +08:00
8de13c5cc8 [fix](function) error scale set in unix_timestamp (#36110) (#37619)
## Proposed changes

```
mysql [test]>set DEBUG_SKIP_FOLD_CONSTANT = true;
Query OK, 0 rows affected (0.00 sec)

mysql [test]>select cast(unix_timestamp("2024-01-01",'yyyy-MM-dd') as bigint);
+------------------------------------------------------------+
| cast(unix_timestamp('2024-01-01', 'yyyy-MM-dd') as BIGINT) |
+------------------------------------------------------------+
|                                           1704038400000000 |
+------------------------------------------------------------+
```
now
```
mysql [test]>select cast(unix_timestamp("2024-01-01",'yyyy-MM-dd') as bigint);
+------------------------------------------------------------+
| cast(unix_timestamp('2024-01-01', 'yyyy-MM-dd') as BIGINT) |
+------------------------------------------------------------+
|                                                 1704038400 |
+------------------------------------------------------------+
1 row in set (0.01 sec)
```

The column does not have a scale set, but the cast still applied a scale
when converting the value.
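
The two outputs above can be checked with plain arithmetic. The sketch assumes the session time zone was UTC+08:00, which is inferred from the printed value rather than stated in the commit. The factor of 10^6 between the two results is consistent with a scale of 6 being applied to a value that has no fractional digits.

```python
from datetime import datetime, timedelta, timezone

# Assumption: session time zone is UTC+08:00 (inferred, not stated).
session_tz = timezone(timedelta(hours=8))
expected_seconds = int(datetime(2024, 1, 1, tzinfo=session_tz).timestamp())

wrong_result = 1704038400000000  # value printed before the fix
fixed_result = 1704038400        # value printed after the fix

print(expected_seconds == fixed_result)  # True
print(wrong_result // fixed_result)      # 1000000
```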


2024-07-15 10:00:04 +08:00
b55dd6f644 [fix](delete) fix the error message for valid decimal data for 2.1 (#37710)
## Proposed changes

cherry-pick : #36802

2024-07-15 09:54:42 +08:00