Commit Graph

8254 Commits

Author SHA1 Message Date
dc438649d9 [bugfix](handshake) brpc handshake should not use light pool (#42115) (#42127)
The light pool may be full. The handshake is used to check the connection
state of brpc, so it should not be affected by the thread pool logic.

---------
pick #42115

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-10-19 16:19:17 +08:00
d5fef266ec [fix](inverted index) Fix incorrect exception handling (#42094)
https://github.com/apache/doris/pull/41874
2024-10-19 10:45:32 +08:00
5db44a1b91 [fix](arrays_overlap) support arrays overlap with inverted index (#42090)
## Proposed changes
backport : https://github.com/apache/doris/pull/41286
https://github.com/apache/doris/pull/41495
2024-10-18 22:08:39 +08:00
dde0bf92ce [fix](inverted index) Fix incorrect usage of regexp compile_err (#41944) (#42085)
https://github.com/apache/doris/pull/41944
2024-10-18 22:06:59 +08:00
460ff02997 [cherry-pick](branch-21)fix date_floor function return wrong result (#41948) (#42065)
## Proposed changes

cherry-pick from master https://github.com/apache/doris/pull/41948

2024-10-18 21:54:22 +08:00
03136baacf [fix](scanner) Fix incorrect _max_thread_num in scanner context when many queries are running. #41273 (#42016)
cherry pick from #41273
2024-10-18 18:08:07 +08:00
fb12e10272 [fix](array-funcs)fix array agg func with decimal type (#40839) (#42023)
## Proposed changes
backport: (https://github.com/apache/doris/pull/40839)
2024-10-17 20:47:39 +08:00
Pxl
4d04db467e [Bug](predicate) Fixed the problem that the number of rows in inlist #41824 (#41910)
pick from #41824
2024-10-17 17:13:00 +08:00
Pxl
f4d9ddcb00 [Improvement](runtime-filter) set some rf brpc request to ignore_eovercrowded #41698 (#41897)
pick from #41698
2024-10-17 16:57:26 +08:00
5806dae467 [fix](move-memtable) do not retry open streams (#41550) (#41999)
backport #41550
2024-10-17 15:56:56 +08:00
b4875c2789 [fix](jni)fix jni use timezone_obj get timezone be core. (#41956) (#42003)
bp #41956 

PR #40225 tries to pass time zone info from BE to JNI, and it uses
`_state->timezone_obj().name()`
to get the time zone name.
However, during a rolling upgrade of BE, this may cause a core dump like:

```
*** SIGSEGV address not mapped to object (@0x610) received by PID 72661 (TID 73538 OR 0x7f2e898d1640) from PID 1552; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421
 1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
 4# 0x00007F3070D3E520 in /lib/x86_64-linux-gnu/libc.so.6
 5# cctz::time_zone::name[abi:cxx11]() const in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
 6# doris::vectorized::JniConnector::open(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/jni_connector.cpp:87
 7# doris::vectorized::AvroJNIReader::init_fetch_table_schema_reader() at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/format/avro/avro_jni_reader.cpp:119
 8# std::_Function_handler::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
 9# doris::WorkThreadPool::work_thread(int) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/work_thread_pool.hpp:159
10# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
11# start_thread at ./nptl/pthread_create.c:442
12# 0x00007F3070E22850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
172.20.50.206 last coredump sql: 2024-10-13 04:12:23,985 [query] 
```

This PR uses another method, `_state->timezone()`, which just returns a
string instead of reading and initializing the
time zone info file, to avoid a potential core dump.
2024-10-17 14:47:33 +08:00
67d057a711 [cherry-pick](branch-21) fix conv function parser string failure return wrong result (#40530) (#41964)
## Proposed changes

Issue Number: close #39618
cherry-pick from master (#40530)
2024-10-17 14:45:46 +08:00
0b41cd2472 [fix](serde)fix the bug in DataTypeNullableSerDe.deserialize_column_from_fixed_json (#41217) (#41960)
bp #41217 

2024-10-17 14:36:01 +08:00
968e33f07e [cherry-pick](branch-21) pick (#39057) (#41352) (#41958)
## Proposed changes

pick from master (#39057) (#41352)

---------

Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
2024-10-17 14:30:40 +08:00
1b901f6fcc [cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug (#41931)
## Proposed changes
pick pr:
  https://github.com/apache/doris/pull/41683
  https://github.com/apache/doris/pull/41506
  https://github.com/apache/doris/pull/41338
  https://github.com/apache/doris/pull/39326

---------

Co-authored-by: morningman <morningman@163.com>
2024-10-17 14:20:58 +08:00
b8214952a1 [branch-2.1] Fix is_partial_update parameter is not set in append_block_with_partial_content() (#41865)
https://github.com/apache/doris/pull/41439 forgot to set the
`is_partial_update` parameter for `Tablet::lookup_row_key()` in
`append_block_with_partial_content()`.
2024-10-17 12:44:41 +08:00
19784d420c [opt](inverted index) Improved top-N optimization by refining the sorting column check. (#39496) (#41954)
https://github.com/apache/doris/pull/39496
2024-10-17 11:31:11 +08:00
0b6447faeb [Fix](SchemaChange) refactor variant root column iterator to make row… (#41941)
pick #41700
2024-10-17 10:39:07 +08:00
7d99d5fcc4 [fix](analytic) Fix data distribution after analytic operator (#41902) (#41949)
Fix data distribution after analytic operator

pick #41902
2024-10-16 18:41:56 +08:00
5bd33fc88c [pick](branch-2.1) pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751 (#41927)
## Proposed changes

pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751

---------

Co-authored-by: Pxl <pxl290@qq.com>
2024-10-16 15:41:28 +08:00
e56216211e [pick](branch-2.1) pick #40667 #40714 (#41905)
pick
#40667
#40714

---------

Co-authored-by: wangbo <wangbo@apache.org>
2024-10-16 14:09:03 +08:00
e6545a36a3 [improvement](iceberg)Parallelize splits for count(*) for 2.1 (#41169) (#41880)
bp: #41169
2024-10-16 10:52:06 +08:00
b185dfcbf6 [pick](branch-2.1) pick #41676 #41740 #41857 (#41904)
pick #41676 #41740 #41857
2024-10-15 22:41:17 +08:00
b91d8e2327 [Improvement](minor) Reduce locking scope (#41845) (#41844)
pick #41845
2024-10-15 18:39:53 +08:00
78b6157aa9 [fix](ip/variant) fix information meta (#41871)
Fix data type information meta for ip/variant (#41666)

2024-10-15 18:01:14 +08:00
abcba778ff [fix](cancel) Fix cancel msg on branch-2.1 (#41798)
Make sure the cancel message distinguishes between these cancel reasons:
1. user cancel
2. timeout
3. others

```text
mysql [demo]>set query_timeout=1;
--------------
set query_timeout=1
--------------

Query OK, 0 rows affected (0.00 sec)

mysql [demo]>select sleep(5);
--------------
select sleep(5)
--------------

ERROR 1105 (HY000): errCode = 2, detailMessage = Timeout

mysql [demo]>select sleep(5);
--------------
select sleep(5)
--------------

^C^C -- sending "KILL QUERY 0" to server ...
^C -- query aborted
ERROR 1105 (HY000): errCode = 2, detailMessage = cancel query by user from 127.0.0.1:64208
```
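
The transcript above shows two of the three reasons as they appear in `detailMessage`. As a minimal client-side sketch, assuming only those two message shapes (`Timeout` and a `cancel query by user` prefix), the text could be mapped back to the three categories like this. `CancelReason` and `classify_cancel_message` are hypothetical names for illustration and are not part of Doris.

```python
from enum import Enum


class CancelReason(Enum):
    """The three categories listed in the commit message (names are hypothetical)."""
    USER_CANCEL = "user cancel"
    TIMEOUT = "timeout"
    OTHER = "others"


def classify_cancel_message(detail_message: str) -> CancelReason:
    """Classify a detailMessage string using the two shapes shown in the transcript.

    Assumption: any message that is neither exactly "Timeout" nor starts with
    "cancel query by user" falls into the "others" category.
    """
    text = detail_message.strip()
    if text == "Timeout":
        return CancelReason.TIMEOUT
    if text.startswith("cancel query by user"):
        return CancelReason.USER_CANCEL
    return CancelReason.OTHER
```

Matching on message text is fragile, so this is only meant to illustrate the distinction the fix makes visible to clients.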
2024-10-15 17:15:05 +08:00
77fbe6397a [fix](http) Remove file if downloading file is failed #41778 (#41827)
cherry pick from #41778
2024-10-15 15:30:29 +08:00
94687a2f3c [fix](array/map) fix resize impl in array/map (#41595) (#41699)
backport: https://github.com/apache/doris/pull/41595
2024-10-15 09:50:11 +08:00
d97642e9b5 [cherry-pick](branch-21) fix tablet sink shuffle without project not match the output tuple (#40299)(#41293) (#41327)
## Proposed changes

cherry-pick from master (#40299)(#41293)

2024-10-15 00:12:23 +08:00
4888c632f4 [cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text (#41684)
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
ff52e73a07 [Fix](inverted index) fix match null for inverted index #41746 (#41787)
cherry pick from #41746
2024-10-14 14:45:36 +08:00
f112af0fd2 [pick](branch-2.1) pick #41555 #41592 #38204 (#41781)
pick #41555 #41592 #38204
2024-10-14 14:05:08 +08:00
e10458baad [enhancement](err-msg) Output column info when size invalid in block data convertor (#41535) (#41764)
## Proposed changes

pick: #41535

As title.
2024-10-12 21:08:04 +08:00
2ae37626bb [opt](index compaction)Use RAM dir to create tmp index_writer (#41371) (#41705)
## Proposed changes

bp #41371
2024-10-12 17:13:55 +08:00
90d6985f91 [Fix](bug) Is null predicate get error query result (#41704)
cherry-pick #41668
2024-10-12 13:18:14 +08:00
4ac07fe918 [Feature](json) Support json_search function in 2.1 (#41590)
cherry-pick #40948 

Like MySQL, json_search returns the path that points to a JSON string
which matches the pattern.
`SELECT JSON_SEARCH('["A",[{"B":"1"}],{"C":"AB"},{"D":"BC"}]', 'one',
'A_') as res;`
```
+----------+
| res      |
+----------+
| "$[2].C" |
+----------+
```
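
The lookup in the example above can be sketched in a few lines of Python. This is an illustration of the `'one'` mode only, under the assumption that `%` matches any run of characters and `_` matches exactly one, as in SQL `LIKE`. Escape characters and quoting of unusual keys are not handled, and `json_search_one` is a hypothetical helper, not the Doris implementation. It returns the bare path, whereas the SQL function returns it as a JSON string.

```python
import json
import re


def like_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a LIKE-style pattern ('%' = any run, '_' = one char) to a regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def json_search_one(doc: str, pattern: str):
    """Return the path of the first string value matching the pattern, or None."""
    regex = like_to_regex(pattern)

    def walk(node, path):
        if isinstance(node, str):
            return path if regex.fullmatch(node) else None
        if isinstance(node, list):
            for i, item in enumerate(node):
                found = walk(item, f"{path}[{i}]")
                if found is not None:
                    return found
        elif isinstance(node, dict):
            for key, value in node.items():
                found = walk(value, f"{path}.{key}")
                if found is not None:
                    return found
        return None  # numbers, booleans, null, or no match in this subtree

    return walk(json.loads(doc), "$")


print(json_search_one('["A",[{"B":"1"}],{"C":"AB"},{"D":"BC"}]', "A_"))  # $[2].C
```

The string `"A"` at `$[0]` does not match because `A_` requires exactly two characters, which is why the result points at `$[2].C`.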

Co-authored-by: liutang123 <liulijia@gmail.com>
2024-10-11 16:33:07 +08:00
e9cfbb56b3 [bugfix](becore) use after free problem when the segment is pop (#41685) (#41697)
## Proposed changes

pick #41685
introduced by #41608

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-10-11 14:07:46 +08:00
8c0f73cb90 [Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.(#40225 , #40888 ,#41386 ) (#41610)
bp #40225 , #40888 ,#41386

## Proposed changes
Among them, #40225 introduces the new MaxCompute API,
#40888 fixes the bug when reading null values between the new and old
APIs, and
#41386 provides compatibility between the new and old versions.
2024-10-11 11:55:41 +08:00
b489cdf840 [opt](merge-on-write) avoid to check delete bitmap while lookup rowkey in some situation to reduce CPU cost (#41480) (#41439)
## Proposed changes

cherry-pick #41480
2024-10-11 10:15:39 +08:00
6dddd4c499 [function](cast)Make string casting to integers more like MySQL's beh… (#41541)
…avior (#38847)
https://github.com/apache/doris/pull/38847
## Proposed changes

There are two issues here. First, the results of casting are
inconsistent between FE and BE.
```
FE
mysql [(none)]>select cast('3.000' as int); 
+----------------------+
| cast('3.000' as INT) |
+----------------------+
|                    3 |
+----------------------+

mysql [(none)]>set debug_skip_fold_constant = true;

BE
mysql [(none)]>select cast('3.000' as int);
+----------------------+
| cast('3.000' as INT) |
+----------------------+
|                 NULL |
+----------------------+
```
The second issue is that casting on BE converts '3.0' to NULL. Here, the
casting logic for FE and BE has been unified.
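
As a rough illustration of the unified behavior, the sketch below parses a decimal string and keeps only its integer part. Only the `'3.000'` to `3` and `'3.0'` cases come from the description above. The handling of non-zero fractions (dropped here), surrounding whitespace, and invalid input (mapped to `None` as a stand-in for NULL) are assumptions, and `cast_string_to_int` is a hypothetical name rather than a Doris function.

```python
import re

# Optional sign, at least one digit, then an optional fractional part.
_DECIMAL_TEXT = re.compile(r"\s*([+-]?\d+)(?:\.\d*)?\s*")


def cast_string_to_int(text: str):
    """Return the integer part of a decimal string, or None if it is not numeric.

    Assumption: the fractional part is simply discarded, so '3.000' gives 3.
    """
    match = _DECIMAL_TEXT.fullmatch(text)
    if match is None:
        return None  # stands in for SQL NULL
    return int(match.group(1))


print(cast_string_to_int("3.000"))  # 3
```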

---------

Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
2024-10-11 09:32:00 +08:00
4c9ebbb3b9 [fix](cloud) cloud group commit should skip replay wal if label is already used and the txn state is committed or visible (#41262) (#41461)
pick https://github.com/apache/doris/pull/41262
2024-10-10 22:27:04 +08:00
f2ba1f2fb3 [bugfix](segmentload) should remove segment from segment cache if load segment failed (#41608) (#41660)
2024-10-10 19:40:22 +08:00
0fb42d3a48 [Enhancement](tvf)catalog tvf implements user permission checks and hides sensitive information (#41497) (#41604)
bp #41497 

before #21790
## Proposed changes
This PR unifies the duplicate parts of `catalog tvf` and `show
catalogs`, adds a permission check when querying `catalog tvf`, and hides
sensitive information.
2024-10-10 17:55:40 +08:00
1db0aef9b7 [feature](array_agg) support array_agg with param is array/map/struct… (#41651)
… (#40697)

This PR makes the array_agg function support parameters of array, map,
and struct types.

2024-10-10 17:54:54 +08:00
3120bfb6e3 [fix](pipelinex) fix fragment instance progress reports (part 2) (#40694) (#41641)
backport #40694
2024-10-10 17:49:41 +08:00
30492a2438 [opt](load) print more detailed log when stream load finished #41398 (#41639)
cherry pick from #41398
2024-10-10 17:47:48 +08:00
d32688e091 [Enhancement](multi-catalog) Set hdfs native client logger to glog and redirect jvm stdout/stderr logger to jni.log. (#41633)
Backport #39540.

Co-authored-by: Mingyu Chen <morningman@163.com>
2024-10-10 17:47:21 +08:00
a26079c09d [Opt](load) Optimize the error messages of -235 and -238 for loading #41048 (#41638)
cherry pick from #41048
2024-10-10 14:20:52 +08:00
33fad04341 [opt](Nereids) use 1 instead narrowest column when do column pruning (#41548) (#41627)
pick from master #41548
2024-10-10 14:02:23 +08:00
aa541fddf9 [fix](load) disable num segments check in compatibility mode (#41053) (#41552)
backport #41053
2024-10-10 11:20:16 +08:00