b4875c2789
[fix](jni)fix jni use timezone_obj get timezone be core. ( #41956 ) ( #42003 )
...
bp #41956
PR #40225 tries to pass time zone info from BE to JNI, and it uses
`_state->timezone_obj().name()`
to get the time zone name.
However, during a rolling upgrade of BE, this may cause a coredump like:
```
*** SIGSEGV address not mapped to object (@0x610) received by PID 72661 (TID 73538 OR 0x7f2e898d1640) from PID 1552; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421
1# os::Linux::chained_handler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
3# signalHandler(int, siginfo_t*, void*) in /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
4# 0x00007F3070D3E520 in /lib/x86_64-linux-gnu/libc.so.6
5# cctz::time_zone::name[abi:cxx11]() const in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
6# doris::vectorized::JniConnector::open(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/jni_connector.cpp:87
7# doris::vectorized::AvroJNIReader::init_fetch_table_schema_reader() at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/exec/format/avro/avro_jni_reader.cpp:119
8# std::_Function_handler::_M_invoke(std::_Any_data const&) at /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
9# doris::WorkThreadPool::work_thread(int) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/work_thread_pool.hpp:159
10# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
11# start_thread at ./nptl/pthread_create.c:442
12# 0x00007F3070E22850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
172.20.50.206 last coredump sql: 2024-10-13 04:12:23,985 [query]
```
This PR uses another method, `_state->timezone()`, which just returns a
string instead of reading and initializing the
time zone info file, to avoid the potential coredump.
2024-10-17 14:47:33 +08:00
67d057a711
[cherry-pick](branch-21) fix conv function parser string failure return wrong result ( #40530 ) ( #41964 )
...
## Proposed changes
Issue Number: close #39618
cherry-pick from master (#40530 )
2024-10-17 14:45:46 +08:00
0b41cd2472
[fix](serde)fix the bug in DataTypeNullableSerDe.deserialize_column_from_fixed_json ( #41217 ) ( #41960 )
...
bp #41217
2024-10-17 14:36:01 +08:00
968e33f07e
[cherry-pick](branch-21) pick ( #39057 ) ( #41352 ) ( #41958 )
...
## Proposed changes
pick from master (#39057 ) (#41352 )
---------
Co-authored-by: Mryange <59914473+Mryange@users.noreply.github.com>
2024-10-17 14:30:40 +08:00
1b901f6fcc
[cherry-pick](branch-2.1) add parquet tvf cases and fix some parquet bug ( #41931 )
...
## Proposed changes
pick pr:
https://github.com/apache/doris/pull/41683
https://github.com/apache/doris/pull/41506
https://github.com/apache/doris/pull/41338
https://github.com/apache/doris/pull/39326
---------
Co-authored-by: morningman <morningman@163.com>
2024-10-17 14:20:58 +08:00
b8214952a1
[branch-2.1] Fix is_partial_update parameter is not set in append_block_with_partial_content() ( #41865 )
...
https://github.com/apache/doris/pull/41439 forgot to set the
`is_partial_update` parameter for `Tablet::lookup_row_key()` in
`append_block_with_partial_content()`.
2024-10-17 12:44:41 +08:00
19784d420c
[opt](inverted index) Improved top-N optimization by refining the sorting column check. ( #39496 ) ( #41954 )
...
https://github.com/apache/doris/pull/39496
2024-10-17 11:31:11 +08:00
0b6447faeb
[Fix](SchemaChange) refactor variant root column iterator to make row… ( #41941 )
...
pick #41700
2024-10-17 10:39:07 +08:00
7d99d5fcc4
[fix](analytic) Fix data distribution after analytic operator ( #41902 ) ( #41949 )
...
Fix data distribution after analytic operator
pick #41902
2024-10-16 18:41:56 +08:00
5bd33fc88c
[pick](branch-2.1) pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751 ( #41927 )
...
## Proposed changes
pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751
---------
Co-authored-by: Pxl <pxl290@qq.com>
2024-10-16 15:41:28 +08:00
e56216211e
[pick](branch-2.1) pick #40667 #40714 ( #41905 )
...
pick
#40667
#40714
---------
Co-authored-by: wangbo <wangbo@apache.org>
2024-10-16 14:09:03 +08:00
e6545a36a3
[improvement](iceberg)Parallelize splits for count(*) for 2.1 ( #41169 ) ( #41880 )
...
bp: #41169
2024-10-16 10:52:06 +08:00
b185dfcbf6
[pick](branch-2.1) pick #41676 #41740 #41857 ( #41904 )
...
pick #41676 #41740 #41857
2024-10-15 22:41:17 +08:00
b91d8e2327
[Improvement](minor) Reduce locking scope ( #41845 ) ( #41844 )
...
pick #41845
2024-10-15 18:39:53 +08:00
78b6157aa9
[fix](ip/variant) fix information meta ( #41871 )
...
Fix datatype information meta for ip/variant (#41666)
2024-10-15 18:01:14 +08:00
abcba778ff
[fix](cancel) Fix cancel msg on branch-2.1 ( #41798 )
...
Make sure we can tell which of the following caused a cancel:
1. user cancel
2. timeout
3. others
```text
mysql [demo]>set query_timeout=1;
--------------
set query_timeout=1
--------------
Query OK, 0 rows affected (0.00 sec)
mysql [demo]>select sleep(5);
--------------
select sleep(5)
--------------
ERROR 1105 (HY000): errCode = 2, detailMessage = Timeout
mysql [demo]>select sleep(5);
--------------
select sleep(5)
--------------
^C^C -- sending "KILL QUERY 0" to server ...
^C -- query aborted
ERROR 1105 (HY000): errCode = 2, detailMessage = cancel query by user from 127.0.0.1:64208
```
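The distinction shown in the session above can be sketched as a small mapping from cancel reason to message. This is only an illustration, not the Doris implementation: `CancelReason` and `cancel_message` are hypothetical names, and only the two message texts are taken from the output above.

```python
from enum import Enum


class CancelReason(Enum):
    """Hypothetical reasons, mirroring the three cases listed above."""
    USER_CANCEL = "user cancel"
    TIMEOUT = "timeout"
    OTHER = "others"


def cancel_message(reason, client=None, detail=None):
    """Build a detail message that lets the user tell the reasons apart."""
    if reason is CancelReason.TIMEOUT:
        # Text as shown for the query_timeout=1 case above.
        return "Timeout"
    if reason is CancelReason.USER_CANCEL:
        # Text as shown for the KILL QUERY case above.
        return f"cancel query by user from {client}"
    # Any other cause: pass through whatever detail is available (assumption).
    return detail or "cancelled"


print(cancel_message(CancelReason.TIMEOUT))
print(cancel_message(CancelReason.USER_CANCEL, client="127.0.0.1:64208"))
```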
2024-10-15 17:15:05 +08:00
77fbe6397a
[fix](http) Remove file if downloading faile is failed #41778 ( #41827 )
...
cherry pick from #41778
2024-10-15 15:30:29 +08:00
94687a2f3c
[fix](array/map) fix resize impl in array/map ( #41595 ) ( #41699 )
...
backport: https://github.com/apache/doris/pull/41595
2024-10-15 09:50:11 +08:00
d97642e9b5
[cherry-pick](branch-21) fix tablet sink shuffle without project not match the output tuple ( #40299 )( #41293 ) ( #41327 )
...
## Proposed changes
cherry-pick from master (#40299 )(#41293 )
2024-10-15 00:12:23 +08:00
4888c632f4
[cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text ( #41684 )
...
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
ff52e73a07
[Fix](inverted index) fix match null for inverted index #41746 ( #41787 )
...
cherry pick from #41746
2024-10-14 14:45:36 +08:00
f112af0fd2
[pick](branch-2.1) pick #41555 #41592 #38204 ( #41781 )
...
pick #41555 #41592 #38204
2024-10-14 14:05:08 +08:00
e10458baad
[enhancement](err-msg) Output column info when size invalid in block data convertor ( #41535 ) ( #41764 )
...
## Proposed changes
pick: #41535
As title.
2024-10-12 21:08:04 +08:00
2ae37626bb
[opt](index compaction)Use RAM dir to create tmp index_writer ( #41371 ) ( #41705 )
...
## Proposed changes
bp #41371
2024-10-12 17:13:55 +08:00
90d6985f91
[Fix](bug) Is null predicate get error query result ( #41704 )
...
cherry-pick #41668
2024-10-12 13:18:14 +08:00
4ac07fe918
[Feature](json) Support json_search function in 2.1 ( #41590 )
...
cherry-pick #40948
Like MySQL, json_search returns the path that points to a JSON string
which matches the pattern.
`SELECT JSON_SEARCH('["A",[{"B":"1"}],{"C":"AB"},{"D":"BC"}]', 'one',
'A_') as res;`
```
+----------+
| res |
+----------+
| "$[2].C" |
+----------+
```
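To make the matching rule concrete, here is a minimal Python sketch of the `'one'` mode used in the example above. It is not the Doris code: `like_to_regex` and `json_search_one` are hypothetical helpers, and the sketch assumes LIKE-style wildcards (`%` for any run of characters, `_` for exactly one) with no escape character and no quoting of special key names.

```python
import json
import re


def like_to_regex(pattern):
    """Translate a LIKE-style pattern into a compiled regex (no escape support)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def json_search_one(doc, pattern):
    """Return the path of the first string value matching pattern, else None."""
    regex = like_to_regex(pattern)

    def walk(node, path):
        if isinstance(node, str):
            return path if regex.fullmatch(node) else None
        if isinstance(node, list):
            for i, item in enumerate(node):
                found = walk(item, f"{path}[{i}]")
                if found is not None:
                    return found
        elif isinstance(node, dict):
            for key, value in node.items():
                found = walk(value, f"{path}.{key}")
                if found is not None:
                    return found
        return None  # numbers, booleans and null never match

    return walk(json.loads(doc), "$")


print(json_search_one('["A",[{"B":"1"}],{"C":"AB"},{"D":"BC"}]', "A_"))
```

Here `"A"` does not match `A_` because `_` requires exactly one more character, so the first match is the value `"AB"` under key `C` of the third array element.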
Co-authored-by: liutang123 <liulijia@gmail.com>
2024-10-11 16:33:07 +08:00
e9cfbb56b3
[bugfix](becore) use after free problem when the segment is pop ( #41685 ) ( #41697 )
...
## Proposed changes
pick #41685
introduced by #41608
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-10-11 14:07:46 +08:00
8c0f73cb90
[Enhancement](MaxCompute)Refactoring maxCompute catalog using Storage API.( #40225 , #40888 ,#41386 ) ( #41610 )
...
bp #40225 , #40888 ,#41386
## Proposed changes
Among them, #40225 introduces the new MaxCompute (MC) API,
#40888 fixes a bug when reading null values between the new and old
APIs, and
#41386 provides compatibility between the new and old versions.
2024-10-11 11:55:41 +08:00
b489cdf840
[opt](merge-on-write) avoid to check delete bitmap while lookup rowkey in some situation to reduce CPU cost ( #41480 ) ( #41439 )
...
## Proposed changes
cherry-pick #41480
2024-10-11 10:15:39 +08:00
6dddd4c499
[function](cast)Make string casting to integers more like MySQL's beh… ( #41541 )
...
…avior (#38847 )
https://github.com/apache/doris/pull/38847
## Proposed changes
There are two issues here. First, the results of casting are
inconsistent between FE and BE.
```
FE
mysql [(none)]>select cast('3.000' as int);
+----------------------+
| cast('3.000' as INT) |
+----------------------+
| 3 |
+----------------------+
mysql [(none)]>set debug_skip_fold_constant = true;
BE
mysql [(none)]>select cast('3.000' as int);
+----------------------+
| cast('3.000' as INT) |
+----------------------+
| NULL |
+----------------------+
```
The second issue is that casting on BE converts '3.0' to null. Here, the
casting logic for FE and BE has been unified.
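A minimal sketch of the unified behavior, assuming that a string with an integer part and an optional all-digit fractional part casts to its integer part, and that anything else yields NULL. `cast_string_to_int` is a hypothetical helper, not the FE or BE code; how non-zero fractions such as '3.7' are handled (truncated here) and any range checks are assumptions outside what is described above.

```python
import re

# Optional sign, digits, then an optional fractional part made only of digits.
_NUMERIC = re.compile(r"\s*([+-]?\d+)(?:\.\d*)?\s*")


def cast_string_to_int(text):
    """Return the integer part of a numeric string, or None (NULL) otherwise."""
    match = _NUMERIC.fullmatch(text)
    if match is None:
        return None
    # Assumption: the fractional digits are dropped rather than rounded.
    return int(match.group(1))


print(cast_string_to_int("3.000"))
print(cast_string_to_int("3.0"))
print(cast_string_to_int("abc"))
```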
---------
Co-authored-by: Xinyi Zou <zouxinyi02@gmail.com>
2024-10-11 09:32:00 +08:00
4c9ebbb3b9
[fix](cloud) cloud group commit should skip repaly wal if label is already used and the txn state is committed or visible ( #41262 ) ( #41461 )
...
pick https://github.com/apache/doris/pull/41262
2024-10-10 22:27:04 +08:00
f2ba1f2fb3
[bugfix](segmentload) should remove segment from segment cache if load segment failed ( #41608 ) ( #41660 )
2024-10-10 19:40:22 +08:00
0fb42d3a48
[Enhancement](tvf)catalog tvf implements user permission checks and hides sensitive information ( #41497 ) ( #41604 )
...
bp #41497
before #21790
## Proposed changes
This PR unifies the duplicate parts of `catalog tvf` and `show
catalogs`, adds a permission check when querying `catalog tvf`, and hides
sensitive information.
2024-10-10 17:55:40 +08:00
1db0aef9b7
[feature](array_agg) support array_agg with param is array/map/struct… ( #41651 )
...
… (#40697 )
This PR makes the array_agg function support parameters of array, map,
and struct type.
2024-10-10 17:54:54 +08:00
3120bfb6e3
[fix](pipelinex) fix fragment instance progress reports (part 2) ( #40694 ) ( #41641 )
...
backport #40694
2024-10-10 17:49:41 +08:00
30492a2438
[opt](load) print more detailed log when stream load finished #41398 ( #41639 )
...
cherry pick from #41398
2024-10-10 17:47:48 +08:00
d32688e091
[Enhancement](multi-catalog) Set hdfs native client logger to glog and redirect jvm stdout/stderr logger to jni.log. ( #41633 )
...
Backport #39540 .
Co-authored-by: Mingyu Chen <morningman@163.com>
2024-10-10 17:47:21 +08:00
a26079c09d
[Opt](load) Optimize the error messages of -235 and -238 for loading #41048 ( #41638 )
...
cherry pick from #41048
2024-10-10 14:20:52 +08:00
33fad04341
[opt](Nereids) use 1 instead narrowest column when do column pruning ( #41548 ) ( #41627 )
...
pick from master #41548
2024-10-10 14:02:23 +08:00
aa541fddf9
[fix](load) disable num segments check in compatibility mode ( #41053 ) ( #41552 )
...
backport #41053
2024-10-10 11:20:16 +08:00
e218fd2314
[Fix](inverted index) add DATEV2 and DATETIMEV2 for inverted index reader #41565 ( #41579 )
...
cherry pick from #41565
2024-10-09 15:32:41 +08:00
31b506c8cc
[Enhancement](inverted index) return OK instead of not supported in expr evaluate_inverted_index #41567 ( #41578 )
...
cherry pick from #41567
2024-10-09 15:14:38 +08:00
0185f8069f
[fix](crash) fix be crash because of int overflow ( #41554 ) ( #41568 )
2024-10-09 14:20:55 +08:00
9fe77b335c
[Enhancement](inverted index) apply inverted index when has any #41547 ( #41584 )
...
cherry pick from #41547
2024-10-09 14:13:38 +08:00
afb477c66d
[Fix](inverted index) Fix wrong need read data opt when enable_common_expr_pushdown is disabled #40689 ( #41562 )
...
cherry pick from #40689
2024-10-08 22:12:10 +08:00
c24ff2ff81
[fix](upgrade) fix version check failure of window_funnel when upgrading ( #41542 )
...
## Proposed changes
Fix version check failure of window_funnel when upgrading from 2.1.6
and higher versions to the latest branch-2.1.
```
02:49:13 F20240930 02:47:48.546983 7581 block.cpp:89] Check failed: BeExecVersionManager::check_be_exec_version(be_exec_version)
02:49:13 *** Check failure stack trace: ***
02:49:13 @ 0x564640041856 google::LogMessage::SendToLog()
02:49:13 @ 0x56464003e2a0 google::LogMessage::Flush()
02:49:13 @ 0x564640042099 google::LogMessageFatal::~LogMessageFatal()
02:49:13 @ 0x56463922d106 doris::vectorized::Block::deserialize()
02:49:13 @ 0x5646390a82bf doris::vectorized::WindowFunnelState<>::read()
02:49:13 @ 0x5646390a6889 doris::vectorized::IAggregateFunctionDataHelper<>::deserialize_and_merge()
02:49:13 @ 0x5646390acdc3 doris::vectorized::IAggregateFunctionHelper<>::deserialize_and_merge_from_column_range()
02:49:13 @ 0x56463fa77152 doris::pipeline::AggSinkLocalState::_merge_without_key()
02:49:13 @ 0x56463fa9d114 doris::pipeline::AggSinkLocalState::Executor<>::execute()
02:49:13 @ 0x56463fa78569 doris::pipeline::AggSinkOperatorX::sink()
02:49:13 @ 0x564640013296 doris::pipeline::PipelineXTask::execute()
02:49:13 @ 0x56464001d41c doris::pipeline::TaskScheduler::_do_work()
02:49:13 @ 0x56463663e078 doris::ThreadPool::dispatch_thread()
02:49:13 @ 0x564636634901 doris::Thread::supervise_thread()
02:49:13 @ 0x7fb64cf58ac3 (unknown)
02:49:13 @ 0x7fb64cfea850 (unknown)
02:49:13 @ (nil) (unknown)
02:49:13 *** Query id: b0cd194940184766-961c310e833e92b1 ***
02:49:13 *** is nereids: 1 ***
02:49:13 *** tablet id: 0 ***
02:49:13 *** Aborted at 1727635668 (unix time) try "date -d @1727635668" if you are using GNU date ***
02:49:13 *** Current BE git commitID: 653e315ba5 ***
02:49:13 *** SIGABRT unknown detail explain (@0x1648) received by PID 5704 (TID 7581 OR 0x7fb354a9a640) from PID 5704; stack trace: ***
02:49:13 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/common/signal_handler.h:421
02:49:13 1# 0x00007FB64CF06520 in /lib/x86_64-linux-gnu/libc.so.6
02:49:13 2# pthread_kill at ./nptl/pthread_kill.c:89
02:49:13 3# raise at ../sysdeps/posix/raise.c:27
02:49:13 4# abort at ./stdlib/abort.c:81
02:49:13 5# 0x000056464004C06D in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
02:49:13 6# 0x000056464003E76A in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
02:49:13 7# google::LogMessage::SendToLog() in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
02:49:13 8# google::LogMessage::Flush() in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
02:49:13 9# google::LogMessageFatal::~LogMessageFatal() in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
02:49:13 10# doris::vectorized::Block::deserialize(doris::PBlock const&) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/core/block.cpp:113
02:49:13 11# doris::vectorized::WindowFunnelState<(doris::vectorized::TypeIndex)14, long>::read(doris::vectorized::BufferReadable&) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/aggregate_functions/aggregate_function_window_funnel.h:363
02:49:13 12# doris::vectorized::IAggregateFunctionDataHelper<doris::vectorized::WindowFunnelState<(doris::vectorized::TypeIndex)14, long>, doris::vectorized::AggregateFunctionWindowFunnel<(doris::vectorized::TypeIndex)14, long> >::deserialize_and_merge(char*, char*, doris::vectorized::BufferReadable&, doris::vectorized::Arena*) const at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/aggregate_functions/aggregate_function.h:517
02:49:13 13# doris::vectorized::IAggregateFunctionHelper<doris::vectorized::AggregateFunctionNullVariadicInline<doris::vectorized::AggregateFunctionWindowFunnel<(doris::vectorized::TypeIndex)14, long>, false> >::deserialize_and_merge_from_column_range(char*, doris::vectorized::IColumn const&, unsigned long, unsigned long, doris::vectorized::Arena*) const at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/vec/aggregate_functions/aggregate_function.h:465
02:49:13 14# doris::pipeline::AggSinkLocalState::_merge_without_key(doris::vectorized::Block*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:389
02:49:13 15# doris::pipeline::AggSinkLocalState::Executor<true, true>::execute(doris::pipeline::AggSinkLocalState*, doris::vectorized::Block*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/exec/aggregation_sink_operator.h:73
02:49:13 16# doris::pipeline::AggSinkOperatorX::sink(doris::RuntimeState*, doris::vectorized::Block*, bool) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/exec/aggregation_sink_operator.cpp:744
02:49:13 17# doris::pipeline::PipelineXTask::execute(bool*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/pipeline_x/pipeline_x_task.cpp:332
02:49:13 18# doris::pipeline::TaskScheduler::_do_work(unsigned long) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/pipeline/task_scheduler.cpp:347
02:49:13 19# doris::ThreadPool::dispatch_thread() in /mnt/hdd01/ci/compatibility-deploy/be/lib/doris_be
02:49:13 20# doris::Thread::supervise_thread(void*) at /home/zcp/repo_center/doris_branch-2.1/doris/be/src/util/thread.cpp:499
02:49:13 21# start_thread at ./nptl/pthread_create.c:442
02:49:13 22# 0x00007FB64CFEA850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
02:49:13 [2024-09-30 02:49:13,147 __main__:796] [INFO]: 172.20.50.73 last coredump sql: 2024-09-30 02:48:18,328 [query] Query b0cd194940184766-961c310e833e92b1 1 times with new query id: 2e0e00de0e7548dd-95f9abc9d8d11c3a
```
2024-10-08 17:13:33 +08:00
cb24ccc112
[bugfix](brpc) Should use status to generate protobuf message, because it will encoding Backend Info ( #41515 ) ( #41522 )
...
Should use the status to generate the protobuf message, because it will
encode the Backend Info.
2024-10-04 23:03:55 +08:00
d33a3eb1c5
[cherry-pick](branch-2.1) Pick "[Fix](LZ4 compression) Fix wrong LZ4 compression max input size limit ( #41239 )" ( #41505 )
...
## Proposed changes
The maximum input size LZ4 compression supports is LZ4_MAX_INPUT_SIZE, which is
0x7E000000 (2,113,929,216 bytes). Doris checked against the wrong maximum, INT_MAX, which
is 2,147,483,647. If the input data size is between these two values,
it passes the check but LZ4 compression fails.
This PR fixes it.
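The arithmetic behind the bug can be checked with a short sketch. The two constants come from the description above; `passes_old_check` and `passes_fixed_check` are hypothetical stand-ins for the size check before and after the fix, not the actual Doris functions.

```python
LZ4_MAX_INPUT_SIZE = 0x7E000000   # 2,113,929,216 bytes
INT_MAX = 2**31 - 1               # 2,147,483,647


def passes_old_check(size):
    """Old behavior (assumed shape): compare against INT_MAX."""
    return size <= INT_MAX


def passes_fixed_check(size):
    """Fixed behavior (assumed shape): compare against LZ4_MAX_INPUT_SIZE."""
    return size <= LZ4_MAX_INPUT_SIZE


# Number of sizes that slipped through the old check but are too big for LZ4.
gap = INT_MAX - LZ4_MAX_INPUT_SIZE
print(gap)

# A size inside the gap passes the old check and is rejected by the fixed one.
size = LZ4_MAX_INPUT_SIZE + 1
print(passes_old_check(size), passes_fixed_check(size))
```

Any input in that gap of roughly 32 MiB was accepted by the check and only failed later, inside the compression call.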
2024-10-01 22:43:11 +08:00
98a1311aa2
[Opt](scanner-scheduler) Opt scanner scheduler starvation issue. ( #41484 )
...
## Proposed changes
Backport #40641
2024-09-30 15:40:20 +08:00
2b9c963edb
[fix](scanner) Check query status when iterating through rowsets and segments #41363 ( #41452 )
...
cherry pick from #41363
2024-09-30 09:49:46 +08:00