3494 Commits

Author SHA1 Message Date
70daa1f85d [opt](inverted index) Controls whether the in_list can execute fast_execute. (#40141)
https://github.com/apache/doris/pull/40022
2024-08-30 10:32:43 +08:00
ca07a00c93 Revert "[branch-2.1](hive) support hive write text table (#38549) (#40063)" (#40157)

This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-30 10:25:38 +08:00
a7156ee775 [fix](parquet)Fix the be core issue when reading parquet unsigned types. (#39926) (#40123)
bp #39926
2024-08-29 21:52:52 +08:00
c6df7c21a3 [branch-2.1](hive) support hive write text table (#38549) (#40063)
1. Support writing Hive text tables.
2. Add the session variable `hive_text_compression` for writing compressed Hive text tables.
3. Supported compression types: gzip, bzip2, snappy, lz4, zstd.

pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
8df93f8dfe [Opt](parquet/orc-reader) Opt get dict ids in _rewrite_dict_predicates() (#40095)
## Proposed changes

backport #39893.
2024-08-29 14:50:42 +08:00
15b14ef49b [fix](inverted index) fix error handling in fast_execute (#40086)
https://github.com/apache/doris/pull/40024
2024-08-29 14:45:25 +08:00
1d439d2ea9 [opt](parquet) add predicate filter time for parquet reader (#40005) (#40053)
bp #40005
2024-08-28 21:57:00 +08:00
4909c34555 [fix](compile) Fix Type Mismatch in min Function for VMysqlResultWriter on macOS (#38202) (#40042)
pick #38202 to branch-2.1

Co-authored-by: zy-kkk <zhongyk10@gmail.com>
2024-08-28 21:36:04 +08:00
b052f46944 [fix](json-quote) fix json quote func for not find the func (#39931) (#40061)
1. Fix the "function not found" error. Before this PR:
```
drop table if exists t003;
create table t003 (a bigint, b json not null) properties ("replication_num"="1");
insert into t003 values (1, '{"a":1,"b":2}');
select a, map_agg("k1", json_quote(b)) from t003 group by a;

[17:47]>  select a, map_agg("k1", json_quote(b)) from t003 group by a;
(1105, 'errCode = 2, detailMessage = (172.20.48.119)[INTERNAL_ERROR]Function json_quote get failed, expr is VectorizedFnCall[json_quote](arguments=(CAST b(JSONB) TO String),return=Nullable(String)) and return type is Nullable(String).')
``` 
After this PR, the query runs successfully.
2. Fix a BE core dump when json_quote is used with a table column:
```
[WARNING!] /sys/kernel/mm/transparent_hugepage/enabled: [always] madvise never, Doris not recommend turning on THP, which may cause the BE process to use more memory and cannot be freed in time. Turn off THP: `echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled`
start BE in local mode
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: Assertion ‘!hasRoot_’ failed.
*** Query id: b7b94dead55e4090-8c8e94f65ec9efd3 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1724744881 (unix time) try "date -d @1724744881" if you are using GNU date ***
*** Current BE git commitID: bd5844ea0d ***
*** SIGABRT unknown detail explain (@0x461003611e1) received by PID 3543521 (TID 3548858 OR 0x7f8a53c1a700) from PID 3543521; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk1/wangqiannan/amory/doris/be/src/common/signal_handler.h:421
 1# 0x00007F9429595B50 in /lib64/libc.so.6
 2# gsignal in /lib64/libc.so.6
 3# __GI_abort in /lib64/libc.so.6
 4# _nl_load_domain.cold.0 in /lib64/libc.so.6
 5# 0x00007F942958E426 in /lib64/libc.so.6
 6# rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u>::Prefix(rapidjson::Type) at /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488
 7# rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u>::String(char const*, unsigned int, bool) at /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:206
 8# bool rapidjson::GenericValue<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::Accept<rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u> >(rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u>&) const at /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/document.h:1974
 9# doris::vectorized::FunctionJsonQuoteImpl::execute(std::vector<doris::vectorized::ColumnStr<unsigned int> const*, std::allocator<doris::vectorized::ColumnStr<unsigned int> const*> > const&, doris::vectorized::ColumnStr<unsigned int>&, unsigned long) at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function_json.cpp:826
10# doris::vectorized::FunctionJson<doris::vectorized::FunctionJsonQuoteImpl>::execute_impl(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function_json.cpp:985
11# doris::vectorized::DefaultExecutable::execute_impl(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function.h:463
12# doris::vectorized::PreparedFunctionImpl::_execute_skipped_constant_deal(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long, bool) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function.cpp:120
13# doris::vectorized::PreparedFunctionImpl::execute_without_low_cardinality_columns(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long, bool) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function.cpp:245
```

2024-08-28 21:31:08 +08:00
ddf72ce09a [Fix](branch-2.1) fix manual pick with mistake when handling decimal type (#40008)

introduced by #39843
2024-08-28 10:15:52 +08:00
4127eec9a7 [fix](function) fix error return type in mod(float32,BigInt) (#39358) (#39971)
## Proposed changes
https://github.com/apache/doris/pull/39358
```
CREATE TABLE testdb (
    K1 BIGINT,
    K2 FLOAT
) properties("replication_num" = "1");
insert into testdb values(1,1.1);
select mod(k1,k2) from testdb;
mysql [test10]>select mod(k1,k2) from testdb;
ERROR 1105 (HY000): errCode = 2, detailMessage = (127.0.0.1)[INTERNAL_ERROR]Function mod get failed, expr is VectorizedFnCall[mod](arguments=K1, K2,return=Nullable(Float32)) and return type is Nullable(Float32).
```

2024-08-27 18:32:52 +08:00
fc9936d923 [fix](funcs) fix map struct construct funcs (#39973)
## Proposed changes
backport: https://github.com/apache/doris/pull/39699
2024-08-27 18:31:51 +08:00
b7e0bfa1c2 [improve](function) opt aes_encrypt/decrypt function to handle const column (#37194) (#39954)
pick #37194 to branch-2.1

---------

Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
2024-08-27 14:56:38 +08:00
4cf769b39f [Improve](table-function) explode json array with json args (#39491)
2024-08-27 14:53:17 +08:00
8256c6f0ba [Fix](parquet-reader) Fix definition level rle decode dead loop in parquet-reader. (#39523) (#39945)
bp #39523

Co-authored-by: Qi Chen <kaka11.chen@gmail.com>
2024-08-27 08:54:43 +08:00
db0724dfe0 [Fix-2.1](function) fix function covar core for not null input (#39943)
## Proposed changes

Add test cases such as:
```groovy
    qt_notnull1 "select covar_samp(non_nullable(x), non_nullable(y)) from test_covar_samp"
    qt_notnull2 "select covar_samp(x, non_nullable(y)) from test_covar_samp"
    qt_notnull3 "select covar_samp(non_nullable(x), y) from test_covar_samp"
```

Before this fix, all of these caused a core dump on 2.1.
2024-08-27 08:39:47 +08:00
21bd4a4ac8 [bug](function)fix json_replace check return type error (#37014) (#39938)
1. Fix the return type DCHECK error:
```
mysql [test]>select (json_replace(a, '$.fparam.nested_2', "qwe")) from json_table_2 limit 1;
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.8)[INTERNAL_ERROR]Function json_replace get failed, expr is VectorizedFnCall[json_replace](arguments=a, String, String, String,return=Nullable(String)) and return type is Nullable(String).
```

2. Improve json_replace/json_insert/json_set execution by not converting
const columns; in testing this was about 1s faster on a table with 10 million rows.

2024-08-27 08:03:48 +08:00
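
The const-column point in 21bd4a4ac8 can be illustrated with a minimal Python sketch. It is not Doris code: `ConstColumn` and `replace_key` are hypothetical names, and top-level dict keys stand in for JSON paths. The idea shown is that a constant argument can be read through a thin wrapper that returns the same value for every row, instead of being expanded into a full column first.

```python
class ConstColumn:
    """Hypothetical stand-in for a constant column: one value, logical length `size`."""

    def __init__(self, value, size):
        self.value = value
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, row):
        # Every valid row index yields the same stored value; nothing is copied.
        if not 0 <= row < self.size:
            raise IndexError(row)
        return self.value


def replace_key(docs, keys, values):
    """Row-wise replace: set docs[i][keys[i]] = values[i] only if the key already exists."""
    out = []
    for i in range(len(docs)):
        doc = dict(docs[i])
        if keys[i] in doc:
            doc[keys[i]] = values[i]
        out.append(doc)
    return out


docs = [{"a": 1}, {"a": 2}, {"b": 3}]
key = ConstColumn("a", len(docs))      # constant path argument, never materialized
val = ConstColumn("qwe", len(docs))    # constant replacement value
result = replace_key(docs, key, val)
```

The same `replace_key` also accepts ordinary lists for `keys` and `values`, so the row loop does not need to know whether an argument is constant.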
4c1c67e03a [improvemen](overflow) Provide the user with a suggestion to avoid th… (#39631) (#39897)
cherry-pick #39631 to branch-2.1
2024-08-26 08:10:32 +08:00
8dbd73988a [fix](recvr) catch exception of transmit_block (#39882)
BP #39881
2024-08-25 00:25:20 +08:00
5a810122a2 [debug](load) check the column type when string column is invalid (#39337)
2024-08-24 18:14:21 +08:00
263746b04b [fix](paimon) fix crash when enable cache with paimon deletion vector(#39877) (#39875)
bp #39877
2024-08-24 17:58:20 +08:00
14a2a66106 [fix](paimon) fix not able to read paimon data from hdfs with HA (#39806) (#39876)
bp #39806
2024-08-24 17:51:15 +08:00
460605ae3c [branch-2.1] pick some prs (#39860)
## Proposed changes

https://github.com/apache/doris/pull/38385 optimize datetime parsing
https://github.com/apache/doris/pull/38978 make stream load failure messages clearer and disable stack traces for some errors by default
https://github.com/apache/doris/pull/39255 fix random function core dump
https://github.com/apache/doris/pull/39324 fix corr function inconsistency with the docs
https://github.com/apache/doris/pull/39449 check auto partition nullity when creating partitions
https://github.com/apache/doris/pull/39695 make DynamicPartitionScheduler pick up interval changes immediately
https://github.com/apache/doris/pull/39754 add some partition expression checks when creating tables
2024-08-24 17:26:42 +08:00
564d3cd647 [Performance](opt) opt the order by performance in permutation (#39089)
## Proposed changes

Issue Number: cherry pick #38985

2024-08-24 16:05:46 +08:00
a6f267c479 [pick](Variant) fix element_at should return nullable if result type is nullable (#39846)
#39732
2024-08-24 09:22:03 +08:00
8cf6c6a2b5 [fix](agg function) incorrect result of map agg(#39743) (#39854)
## Proposed changes

pick #39743
2024-08-24 09:19:52 +08:00
3103bb08dc [pick](Variant) casting to decimal type may lost precision (#39843)
#39650
2024-08-23 22:47:32 +08:00
37443aa7e1 [improve](move-memtable) reuse connection in load_stream_stub (#39231) (#39762)
backport #39231
2024-08-23 22:46:28 +08:00
6ceb574aa0 [branch-2.1]Pick IO limit/workload group usage table (#39839)
2024-08-23 18:51:47 +08:00
baf5b71b39 [branch-2.1](memory) Modify the default JEMALLOC_CONF and support flush Jemalloc tcache (#39829)
pick #38185
2024-08-23 17:21:42 +08:00
c40246efa9 [bugfix](iceberg)Fixed random core with writing iceberg partitioned table for 2.1 (#39808)(#39569) (#39832)
## Proposed changes

bp: #39808 #39569
2024-08-23 17:19:48 +08:00
1f16daa5f6 Revert "[bugfix](iceberg)clear block for partition values for 2.1 (#39569)" (#39815)
Reverts apache/doris#39729
2024-08-23 11:58:42 +08:00
1367f74e7a [branch-2.1](memory) Optimize ClearCacheAction implementation (#39796)
pick #38438
2024-08-23 01:51:14 +08:00
dc732fe33f [bugfix](iceberg)clear block for partition values for 2.1 (#39569) (#39729)
## Proposed changes

bp: #39569

Clear the block; otherwise we get wrong partition values.
2024-08-22 22:43:02 +08:00
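
As a rough illustration of why dc732fe33f clears the block, the Python sketch below reuses one scratch buffer across batches. It is not the Doris implementation: `write_partitions`, the `dt` column, and the batch layout are all hypothetical. Without the clear, values from an earlier batch remain in the buffer and are mixed into the next batch's partition values.

```python
def write_partitions(batches, clear_between_batches):
    """Collect the partition values seen for each batch, using one reused scratch list."""
    scratch = []   # reused buffer, standing in for the block
    seen = []
    for batch in batches:
        if clear_between_batches:
            scratch.clear()                          # drop values left by the previous batch
        scratch.extend(row["dt"] for row in batch)
        seen.append(list(scratch))                   # snapshot of what this batch would use
    return seen


batches = [[{"dt": "2024-08-21"}], [{"dt": "2024-08-22"}]]
stale = write_partitions(batches, clear_between_batches=False)
fresh = write_partitions(batches, clear_between_batches=True)
```

In `stale`, the second batch still carries `2024-08-21` from the first; in `fresh`, each batch sees only its own value.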
04e993c1de [refine](pipeline) refine some VDataStreamRecvr code (#35063) (#37802)
## Proposed changes
https://github.com/apache/doris/pull/35063
https://github.com/apache/doris/pull/35428
2024-08-22 19:55:17 +08:00
8ce8887b75 [branch-2.1](memory) Refactor refresh workload groups weighted memory ratio and record refresh interval memory growth (#39760)
pick #38168
This overwrites the changes from #37221 in workload_group_manager.cpp. If
#37221 needs to be picked, ignore its changes to that file.
2024-08-22 17:33:11 +08:00
ba7baa7e6b [fix](window_funnel) fix upgrading problem caused by behaviour change of window_funnel (#39766)
## Proposed changes

For the latest published 2.1 version `2.1.5`:
```
max_be_exec_version=4;
AGG_FUNCTION_NEW=2;
```
and `branch-2.1`:
```
max_be_exec_version=5;
AGG_FUNCTION_NEW=2;
```
This causes problems when upgrading.
This PR fixes the problem by setting `AGG_FUNCTION_NEW` to `5`.
2024-08-22 17:26:51 +08:00
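
One way to read the version numbers in ba7baa7e6b is sketched below in Python. The constants come from the commit message, but the rest is an assumption for illustration: that a mixed cluster runs at the lowest version every BE supports, and that new behaviour is enabled when the execution version is at least the gate value. The function names are hypothetical.

```python
def cluster_exec_version(be_max_versions):
    """Assumed rule: during a rolling upgrade, run at the lowest version every BE supports."""
    return min(be_max_versions)


def uses_new_window_funnel(exec_version, gate):
    """Assumed rule: a node with the new code enables it only when exec_version >= gate."""
    return exec_version >= gate


RELEASE_2_1_5 = 4   # max_be_exec_version in 2.1.5 (from the commit message)
BRANCH_2_1 = 5      # max_be_exec_version on branch-2.1 (from the commit message)

mixed = cluster_exec_version([RELEASE_2_1_5, BRANCH_2_1])

# Old gate (AGG_FUNCTION_NEW = 2): an upgraded node already switches to the new
# behaviour, while 2.1.5 nodes in the same cluster do not have it.
old_gate_on_upgraded_node = uses_new_window_funnel(mixed, gate=2)

# New gate (AGG_FUNCTION_NEW = 5): upgraded nodes keep the old behaviour until
# every BE in the cluster has been upgraded.
new_gate_on_upgraded_node = uses_new_window_funnel(mixed, gate=5)
new_gate_after_upgrade = uses_new_window_funnel(
    cluster_exec_version([BRANCH_2_1, BRANCH_2_1]), gate=5
)
```

Under these assumptions, raising the gate to `5` means the new `window_funnel` behaviour is only used once no 2.1.5 node remains.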
0e694f19db [fix](merge-on-write) segcompaction should process delete bitmap if necessary (#38369) (#39707)
## Proposed changes

cherry-pick #38369 and #38800
2024-08-22 00:42:56 +08:00
1e30d4ebaf Revert "[Bug](compatibility) fix window funnel function coredump when upgrade" (#39681)
Reverts apache/doris#39646
2024-08-21 14:47:27 +08:00
bb687bd69c [cherry-pick](branch-2.1) add function regexp_extract_or_null (#39561)
# Proposed changes

pick https://github.com/apache/doris/pull/38296
2024-08-21 09:14:58 +08:00
7bb83ae379 [cherry-pick](branch-21) fix append_data_by_selector_impl reserve too much useless memory (#39581) (#39635)
## Proposed changes

cherry-pick from master #39581
2024-08-21 08:47:30 +08:00
75eded04d7 [Bug](compatibility) fix window funnel function coredump when upgrade (#39646)
## Proposed changes
PR https://github.com/apache/doris/pull/39270 changed the window funnel
aggregate function, and max_be_exec_version was raised to 5 to keep the
aggregate function compatible during upgrades.
2024-08-21 08:46:50 +08:00
a3fd13fee6 [fix](catalog) set timeout for split fetch (#39346) (#39624)
bp #39346
2024-08-20 21:59:55 +08:00
12ed2951c4 [fix] (inverted index) remove tmp columns in block (#39369) (#39533)
2024-08-20 20:53:23 +08:00
5fcd6e6270 [Fix](load) Fix the incorrect src value printed in the error log when strict mode is true #39447 (#39587)
cherry pick from #39447
2024-08-20 12:02:13 +08:00
273a62584c [opt](inverted index) unified optimization judgment to prevent omissions (#39473)
https://github.com/apache/doris/pull/38027
2024-08-17 16:57:19 +08:00
b0da8430bc [opt](inverted index) Optimize the usage of the multi_match function (#39472)
## Proposed changes

https://github.com/apache/doris/pull/39193

2024-08-17 16:53:52 +08:00
79aa079cc6 [fix](array-funcs) array min/max #39307 (#39484)
2024-08-17 10:56:44 +08:00
7687f2c53a [fix](ip-funcs) fix ip inet6_aton funcs #39415 (#39513)
2024-08-17 10:56:06 +08:00
824f035b98 [pick](Row store) fix row store with invalid json string in variant ty… (#39456)
#39394
2024-08-16 14:43:11 +08:00