321 Commits

Author SHA1 Message Date
Pxl
43c646363e [Bug](runtime-filter) support ip rf and use exception to replace dcheck when PrimitiveType to PColumnType (#39985) (#41531)

use exception to replace dcheck when PrimitiveType to PColumnType
```
*** SIGABRT unknown detail explain (@0x11d3f) received by PID 73023 (TID 74292 OR 0x7fd758225640) from PID 73023; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
 1# 0x00007FDDBE6B9520 in /lib/x86_64-linux-gnu/libc.so.6
 2# pthread_kill at ./nptl/pthread_kill.c:89
 3# raise at ../sysdeps/posix/raise.c:27
 4# abort at ./stdlib/abort.c:81
 5# 0x000056123F81A94D in /root/output/be/lib/doris_be
 6# 0x000056123F80CF8A in /root/output/be/lib/doris_be
 7# google::LogMessage::SendToLog() in /root/output/be/lib/doris_be
 8# google::LogMessage::Flush() in /root/output/be/lib/doris_be
 9# google::LogMessageFatal::~LogMessageFatal() in /root/output/be/lib/doris_be
10# doris::to_proto(doris::PrimitiveType) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:114
11# doris::IRuntimeFilter::push_to_remote(doris::TNetworkAddress const*) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:1143
12# doris::IRuntimeFilter::publish(bool)::$_0::operator()(doris::IRuntimeFilter*) const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:959
13# doris::IRuntimeFilter::publish(bool)::$_2::operator()() const at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:983
14# doris::IRuntimeFilter::publish(bool) at /home/zcp/repo_center/doris_master/doris/be/src/exprs/runtime_filter.cpp:997
```
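The trace above ends in a fatal log check inside a type-mapping function, which aborts the whole process. The following is a minimal sketch of the alternative the commit title describes, throwing an exception for an unmapped type so the caller can fail a single request instead. The enums and function names are hypothetical stand-ins, not the real Doris definitions.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the real enums; the names and values are
// illustrative only and do not mirror the Doris definitions.
enum class PrimitiveKind { Int, String, IPv4, Unsupported };
enum class ProtoKind { Int, String, IPv4 };

// Maps one enum to the other. An unmapped input throws instead of hitting
// a fatal check, so the process keeps running.
inline ProtoKind to_proto_kind(PrimitiveKind kind) {
    switch (kind) {
    case PrimitiveKind::Int:
        return ProtoKind::Int;
    case PrimitiveKind::String:
        return ProtoKind::String;
    case PrimitiveKind::IPv4:
        return ProtoKind::IPv4;
    default:
        throw std::invalid_argument("unsupported primitive kind: " +
                                    std::to_string(static_cast<int>(kind)));
    }
}

// Caller-side handling: returns true and writes `out` on success; returns
// false and writes a message to `error` otherwise.
inline bool try_to_proto_kind(PrimitiveKind kind, ProtoKind* out, std::string* error) {
    assert(out != nullptr && error != nullptr);
    try {
        *out = to_proto_kind(kind);
        return true;
    } catch (const std::exception& e) {
        *error = e.what();
        return false;
    }
}
```

Calling `try_to_proto_kind(PrimitiveKind::Unsupported, &out, &error)` returns `false` with a non-empty message, where a fatal check would have raised SIGABRT.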

## Proposed changes
pick from #39985
2024-12-30 20:56:11 +08:00
419456f3a9 branch-2.1: [fix](sort)fix merge sort may miss the limit #46072 (#46158)
Cherry-picked from #46072

Co-authored-by: Mryange <yanxuecheng@selectdb.com>
2024-12-30 20:02:24 +08:00
df8bc8f23d branch-2.1: [fix](parquet) impl has_dict_page to replace old logic and fix write empty parquet row group bug #45740 (#45954)
Cherry-picked from #45740

Co-authored-by: Socrates <suyiteng@selectdb.com>
2024-12-26 15:17:49 +08:00
4b7c2eaa7d [branch-2.1](fix) fix incorrect result of hash join with const column (#45630) 2024-12-19 19:14:38 +08:00
5f952cf6ed branch-2.1: [fix](iceberg)Bring field_id with parquet files And fix map type's key optional #44470 (#44828)
Cherry-picked from #44470

Co-authored-by: wuwenchi <wuwenchi@selectdb.com>
2024-12-02 10:24:07 +08:00
Pxl
846fe83152 [Chore](runtime-filter) add rpc error msg to RuntimeFilterContext (#43517) (#44622) (#44719)
pick from #43517
2024-11-28 16:46:27 +08:00
dc9ec5b177 [opt]function use sse to opt match_ipv6_subnet (#38755) (#43513)
## Proposed changes
https://github.com/apache/doris/pull/38755
test
```
-------------------------------------------------------------------
Benchmark                         Time             CPU   Iterations
-------------------------------------------------------------------
BM_matchIPv6SubnetSSE          1.89 ns         1.89 ns   1000000000
BM_matchIPv6SubnetNative       4.99 ns         4.99 ns    561455254
```
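For readers unfamiliar with the technique being benchmarked, here is a rough sketch of a subnet match done with one 128-bit comparison. It is not the code from the PR: the function names are hypothetical, and it assumes the address, network, and mask are each 16 bytes in network order. When SSE2 is unavailable it falls back to the scalar loop.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

using IPv6Bytes = std::array<std::uint8_t, 16>;

// Builds a 16-byte mask whose first `prefix_len` bits are set.
inline IPv6Bytes make_prefix_mask(unsigned prefix_len) {
    assert(prefix_len <= 128);
    IPv6Bytes mask{};
    for (unsigned i = 0; i < 16; ++i) {
        const unsigned bits = prefix_len > i * 8 ? prefix_len - i * 8 : 0;
        if (bits >= 8) {
            mask[i] = 0xFF;
        } else if (bits > 0) {
            mask[i] = static_cast<std::uint8_t>(0xFF << (8 - bits));
        }
    }
    return mask;
}

// Byte-by-byte reference version.
inline bool match_subnet_scalar(const IPv6Bytes& addr, const IPv6Bytes& net,
                                const IPv6Bytes& mask) {
    for (std::size_t i = 0; i < 16; ++i) {
        if (((addr[i] ^ net[i]) & mask[i]) != 0) return false;
    }
    return true;
}

// SSE2 version: XOR the two addresses, keep only the masked bits, and
// check that all 16 bytes of the result are zero.
inline bool match_subnet_simd(const IPv6Bytes& addr, const IPv6Bytes& net,
                              const IPv6Bytes& mask) {
#if defined(__SSE2__)
    const __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(addr.data()));
    const __m128i n = _mm_loadu_si128(reinterpret_cast<const __m128i*>(net.data()));
    const __m128i m = _mm_loadu_si128(reinterpret_cast<const __m128i*>(mask.data()));
    const __m128i diff = _mm_and_si128(_mm_xor_si128(a, n), m);
    const __m128i is_zero = _mm_cmpeq_epi8(diff, _mm_setzero_si128());
    return _mm_movemask_epi8(is_zero) == 0xFFFF;
#else
    return match_subnet_scalar(addr, net, mask);
#endif
}
```

The SIMD path replaces a 16-iteration loop with three loads and four vector instructions, which is consistent in spirit with the speedup reported above, although the exact numbers depend on the real implementation and hardware.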
2024-11-11 18:59:35 +08:00
95d27cf6b2 [Opt](exec) change transmit block to rw lock to opt performance #43223 (#43492)
cherry pick #43223 

Co-authored-by: HappenLee <happenlee@selectdb.com>
2024-11-11 17:32:09 +08:00
9d7bc5b765 [pick](branch-2.1) pick #38215 (#43386)
pick #38215

---------

Co-authored-by: Zou Xinyi <zouxinyi@selectdb.com>
2024-11-09 22:13:05 +08:00
cbdaaa62b2 [feature](function) hour/minute/second functions support time as an a… 41008 (#42232)
…rgument. (#41008)

## Proposed changes

```
mysql> select cast(4562632 as time),hour(cast(4562632 as time)), minute(cast(4562632 as time)),second(cast(4562632 as time));
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
| cast(4562632 as TIME) | hour(cast(4562632 as TIME)) | minute(cast(4562632 as TIME)) | second(cast(4562632 as TIME)) |
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
| 456:26:32             |                         456 |                            26 |                            32 |
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
```

Co-authored-by: Dongyang Li <hello_stephen@qq.com>
2024-10-24 11:09:36 +08:00
7eec0f8fbb [branch-2.1](datetime) Fix date floor functions overflow (#35477) (#42238)
pick https://github.com/apache/doris/pull/35477
2024-10-22 15:54:53 +08:00
5bd33fc88c [pick](branch-2.1) pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751 (#41927)
## Proposed changes

pick #41292 #41350 #41589 #41628 #41743 #41601 #41667 #41751

---------

Co-authored-by: Pxl <pxl290@qq.com>
2024-10-16 15:41:28 +08:00
4888c632f4 [cherry-pick](branch2.1) support escape.delim and serialization.null.format for hive text (#41684)
## Proposed changes
pick from master:
https://github.com/apache/doris/pull/40291
2024-10-15 00:08:23 +08:00
0b4552f74b [cherry-pick](branch-2.1) pick hive text write from master (#40537)
## Proposed changes
pick prs:
https://github.com/apache/doris/pull/38549
https://github.com/apache/doris/pull/40183
https://github.com/apache/doris/pull/40315

---------

Co-authored-by: Calvin Kirs <kirs@apache.org>
2024-09-27 20:57:07 +08:00
e4ee0e8379 [fix](topn) Fix wrong rows returned by TOPN sorter (#40243)
## Proposed changes

pick #40241

2024-09-02 14:34:53 +08:00
1768169b9a Revert "[Improvement](sort) Free sort blocks if this block is exhausted (#39306)" (#40211)
Reverts apache/doris#39956
2024-08-31 15:58:55 +08:00
3ac8347e3d [Improvement](sort) Free sort blocks if this block is exhausted (#39306) (#39956)
## Proposed changes

pick #39306

2024-08-30 13:35:43 +08:00
ca07a00c93 Revert "[branch-2.1](hive) support hive write text table (#38549) (#40063)" (#40157)

This reverts commit c6df7c21a3c09ae1664deabacb88dfcea9d94b68.

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-08-30 10:25:38 +08:00
c6df7c21a3 [branch-2.1](hive) support hive write text table (#38549) (#40063)
1. Support writing hive text tables
2. Add SessionVariable `hive_text_compression` to write compressed hive text tables
3. Supported compression types: gzip, bzip2, snappy, lz4, zstd

pick from https://github.com/apache/doris/pull/38549
2024-08-29 16:50:40 +08:00
8dbd73988a [fix](recvr) catch exception of transmit_block (#39882)
BP #39881
2024-08-25 00:25:20 +08:00
460605ae3c [branch-2.1] pick some prs (#39860)
## Proposed changes

https://github.com/apache/doris/pull/38385 optimize parsing datetime
https://github.com/apache/doris/pull/38978 make stream load failure message clearer and disable some errors' stack traces by default
https://github.com/apache/doris/pull/39255 fix random function coredump
https://github.com/apache/doris/pull/39324 fix function corr inconsistency with doc
https://github.com/apache/doris/pull/39449 check auto partition nullity when creating partition
https://github.com/apache/doris/pull/39695 make DynamicPartitionScheduler immediately know interval's change
https://github.com/apache/doris/pull/39754 Add some partition expr check on creating table
2024-08-24 17:26:42 +08:00
04e993c1de [refine](pipeline) refine some VDataStreamRecvr code (#35063) (#37802)
## Proposed changes
https://github.com/apache/doris/pull/35063
https://github.com/apache/doris/pull/35428
2024-08-22 19:55:17 +08:00
8ce8887b75 [branch-2.1](memory) Refactor refresh workload groups weighted memory ratio and record refresh interval memory growth (#39760)
pick #38168
This overwrites the changes from #37221 in workload_group_manager.cpp. If
#37221 needs to be picked, ignore it.
2024-08-22 17:33:11 +08:00
017dad8c54 [fix](type)support runtime predicate for time type (#38258) (#38465)
## Proposed changes
https://github.com/apache/doris/pull/38258
2024-07-31 10:27:36 +08:00
a751372e76 [Feature](multi-catalog) Add memory tracker for orc reader/writer and arrow parquet writer. (#37257)
## Proposed changes

backport #37234
2024-07-25 13:51:59 +08:00
7819c75e55 [fix](shuffle) Fix local exchange dependency blocking (#38160)
## Proposed changes

pick #38151

2024-07-20 00:19:47 +08:00
4b31e52b24 [enhancement](runtimefilter) fix potential core in runtime filter sync filter size (#38058) (#38093)
pick #38058

## Proposed changes
IRuntimeFilter may be destructed before the RPC finishes, so a raw pointer
cannot be used in the closure. The closure has to use the context's shared
ptr instead.
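
The lifetime problem described here can be sketched as follows. This is a simplified illustration, not the actual Doris code: `FilterContext`, `PendingCall`, and `Filter` are hypothetical names, and the RPC is modelled as a stored callback that runs later.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical context object that the completion callback needs to touch.
struct FilterContext {
    int synced_size = 0;
};

// Hypothetical stand-in for an in-flight RPC: it stores the completion
// callback and runs it later, possibly after the request's owner is gone.
class PendingCall {
public:
    explicit PendingCall(std::function<void(int)> on_done) : _on_done(std::move(on_done)) {}

    void finish(int result) {
        assert(_on_done);
        _on_done(result);
    }

private:
    std::function<void(int)> _on_done;
};

// Hypothetical owner of the context.
class Filter {
public:
    Filter() : _ctx(std::make_shared<FilterContext>()) {}

    PendingCall start_sync() {
        // Capture a copy of the shared_ptr, not `this` or a raw pointer:
        // the closure then keeps the context alive until it has run.
        auto ctx = _ctx;
        return PendingCall([ctx](int size) { ctx->synced_size = size; });
    }

    std::shared_ptr<FilterContext> context() const { return _ctx; }

private:
    std::shared_ptr<FilterContext> _ctx;
};
```

If the closure captured `this` instead, calling `finish` after the `Filter` was destroyed would write through a dangling pointer. With the shared pointer captured by value, the write lands in memory that is still owned by the closure.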

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-07-18 23:11:26 +08:00
88d771d360 [pipeline](fix) Avoid to use a freed dependency when cancelled (#34584) (#38046)
## Proposed changes

pick #34584
2024-07-18 15:27:10 +08:00
0aeb768bf9 [Fix](export/outfile) Support compression when exporting data to Parquet / ORC. (#37167)
bp: #36490
2024-07-03 10:53:57 +08:00
72c20d3ccc [branch-2.1](function) fix date_format and from_unixtime core when meet long format string (#35883) (#36158)
pick #35883
2024-07-01 20:35:31 +08:00
Pxl
cb80ae906f [Bug](runtime-filter) disable sync filter when pipeline engine is off (#36994)
## Proposed changes
1. disable sync filter when pipeline engine is off
2. reduce some warning log
2024-06-28 16:59:26 +08:00
c84b56140c [Fix](outfile) Add a configuration for exporting data in Parquet format using select into outfile (#36143)
backport: #36142
2024-06-13 11:49:46 +08:00
1715bae26f [opt](parquet-writer) Specify the row group size when writing data to Parquet files. (#35081) (#36042)
bp #35081

Co-authored-by: Tiewei Fang <43782773+BePPPower@users.noreply.github.com>
2024-06-07 17:57:11 +08:00
b91d2caab8 [Feature](iceberg-writer) Implements iceberg sink basic functionality for inserting into table. (#35587)
backport #34929
2024-05-29 16:40:54 +08:00
Pxl
b143f0dfe2 [Improvement](date) shortcut for str to date parse (#35288)
shortcut for str to date parse
2024-05-25 17:47:20 +08:00
a6f7747d29 [feature](datatype) add BE config to allow zero date (#34961)
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2024-05-23 19:12:39 +08:00
a8c24d7698 [Fix](function) fix overflow of date_add function (#35080)
fix overflow of date_add function
2024-05-22 10:02:59 +08:00
b96148c9cd [Fix](function) fix days/weeks_diff result wrong on BE #35104
select days_diff('2024-01-01 00:00:00', '2023-12-31 23:59:59');
should be 0 but got 1 on BE.
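The expected result of 0 implies that the function counts whole elapsed days rather than the difference between calendar dates. A minimal sketch of that arithmetic follows, assuming timestamps are reduced to seconds since 1970-01-01; the helper names are hypothetical and this is not the BE implementation.

```cpp
#include <cassert>

// Days since 1970-01-01 for a proleptic Gregorian date (well-known
// civil-date arithmetic; valid for any year representable in long long).
inline long long days_from_civil(long long y, unsigned m, unsigned d) {
    assert(m >= 1 && m <= 12 && d >= 1 && d <= 31);
    y -= m <= 2;
    const long long era = (y >= 0 ? y : y - 399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);
    const unsigned doy = (153 * (m > 2 ? m - 3 : m + 9) + 2) / 5 + d - 1;
    const unsigned doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    return era * 146097 + static_cast<long long>(doe) - 719468;
}

// Seconds since 1970-01-01 00:00:00 for a date and time of day.
inline long long to_seconds(long long y, unsigned m, unsigned d,
                            unsigned hh, unsigned mm, unsigned ss) {
    assert(hh < 24 && mm < 60 && ss < 60);
    return days_from_civil(y, m, d) * 86400LL + hh * 3600LL + mm * 60LL + ss;
}

// Whole elapsed days between two timestamps, truncated toward zero.
inline long long days_diff_elapsed(long long lhs_seconds, long long rhs_seconds) {
    return (lhs_seconds - rhs_seconds) / 86400LL;
}

// Difference between the calendar dates only, ignoring the time of day.
inline long long days_diff_calendar(long long y1, unsigned m1, unsigned d1,
                                    long long y2, unsigned m2, unsigned d2) {
    return days_from_civil(y1, m1, d1) - days_from_civil(y2, m2, d2);
}
```

For the two timestamps in the commit message, the operands are one second apart, so `days_diff_elapsed` gives 0 while `days_diff_calendar` gives 1, matching the expected and the reported values respectively.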
2024-05-22 10:00:26 +08:00
c7134faea9 [Fix](outfile) Fix the timing of setting the _is_closed flag in Parquet/ORC writer (#34668) 2024-05-15 10:28:22 +08:00
4dd5379951 [bugfix](hive)fix error for writing to hive for 2.1 (#34518)
mirror #34520
2024-05-14 23:27:29 +08:00
520774a24b [fix](serde) fix ipv4/v6 serde functions for arrow, orc, parquet format (#34042)
This PR is based on @sjyango's work in #32326.
The intent was to merge #32326 into the master branch, but it is still a draft and has not been maintained for a long time, hence this new PR.
Co-authored-by: sjyango <sjyang2022@zju.edu.cn>
2024-05-10 14:37:04 +08:00
Pxl
804586b342 [Improvement](sort) insert data by batch on VSortedRunMerger::get_next (#34363)
insert data by batch on VSortedRunMerger::get_next
2024-05-10 14:36:53 +08:00
a173513e27 [fix](pipelinex) exchange sink not set ready when source limit #34241 2024-04-29 20:58:50 +08:00
946d28646a [fix](outfile)Fixed orcOutputStream.close() throwing an exception during destruction causing the program to hang. (#34254)
bp #34243
2024-04-28 19:54:34 +08:00
30a68c1240 [fix](spill) use different algorithm to avoid partition data skew (#34162) 2024-04-27 11:20:36 +08:00
60e20a3afe [fix](pipeline_x) Crc32HashPartitioner should use ShuffleChannelIds (#34147) 2024-04-26 15:03:11 +08:00
25358564ca [Fix](compile) Fix gcc compile on master (#33864)
This was introduced by #33511, which wrongly used

ColumnStr<T> ();

which violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but is still supported by clang up until now (see llvm/llvm-project#58112)
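
A small sketch of the portable spelling, using a hypothetical class template rather than the real `ColumnStr`: inside the class, the constructor is declared with the injected class name and no template argument list.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class template, named to avoid confusion with the real ColumnStr.
template <typename T>
class OffsetColumn {
public:
    // Spelling that the commit describes as non-conforming in C++20:
    //     OffsetColumn<T>();
    // Conforming spelling: the declarator uses the injected class name.
    OffsetColumn() = default;

    // Appends a string and records its end offset.
    void push_back(const std::string& value) {
        _chars += value;
        _offsets.push_back(static_cast<T>(_chars.size()));
    }

    std::size_t size() const { return _offsets.size(); }

    // Returns the i-th string by slicing between consecutive offsets.
    std::string at(std::size_t i) const {
        assert(i < _offsets.size());
        const std::size_t begin = i == 0 ? 0 : static_cast<std::size_t>(_offsets[i - 1]);
        const std::size_t end = static_cast<std::size_t>(_offsets[i]);
        return _chars.substr(begin, end - begin);
    }

private:
    std::string _chars;
    std::vector<T> _offsets;
};
```

Both spellings name the same constructor where they are accepted, so dropping the `<T>` changes nothing for callers such as `OffsetColumn<unsigned> column;`.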
2024-04-19 23:41:37 +08:00
657a29fd9e [refactor](partitioner) refine get channel id logics (#33765) 2024-04-18 19:05:24 +08:00
4863167f90 [refactor](pipelineX) Reduce prepare overhead (PART I) (#33550) 2024-04-17 23:42:12 +08:00
Pxl
341cb40693 [Chore](log) adjust output order on PrintInstanceStandardInfo and reduce warning log when rpc finished (#33652)
adjust output order on PrintInstanceStandardInfo and reduce warning log when rpc finished
2024-04-17 23:42:12 +08:00