5826 Commits

Author SHA1 Message Date
22684dedff [pipelineX](pick) pick PRs from pipeline (#25340) 2023-10-12 14:35:32 +08:00
2664d1cffb [chore](vec) Make this copy constructor of StringRef explicit (#25337) 2023-10-12 14:12:46 +08:00
Pxl
f14e4311c4 [Chore](check) add length check for BufferWritable (#25322)
add length check for BufferWritable
2023-10-12 10:51:50 +08:00
7447ac71b5 [minor](format) fix BE code format (#25328) 2023-10-12 10:34:36 +08:00
022762d5f0 [fix](memory) Fix work load group GC and add logs to locate slow GC #24975
Fix work load group GC, add cancel load and add logs.
Unify the format of the GC logs and make them all lowercase, to avoid unnecessary trouble when using grep or less.
Add logs to help locate the cause of slow GC.
2023-10-12 10:33:56 +08:00
2014e16cfb [fix](es catalog)fix es http timeout (#25273) 2023-10-12 10:21:55 +08:00
7ca63665b4 [fix](agg) garbled characters in result of map_agg (#25318) 2023-10-12 10:10:55 +08:00
58d96ecdbf [Improve](status) avoid printing too many stack logs for DATA_QUALITY_ERROR code (#25292) 2023-10-12 09:58:51 +08:00
46ab4346ca [Opt](parquet reader) Optimize the performance of reading decimal in parquet reader. (#25012)
Optimize the performance of reading decimal in parquet reader.

- Static dispatch `DecimalScaleParams`.
- Optimize `memcpy`, static dispatch copy size in fixed length cases.
- Use right shift bit operator to convert decimals.
2023-10-12 09:53:08 +08:00
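The ideas listed in 46ab4346ca can be illustrated with a minimal sketch. It is not the code from the PR: `decode_fixed` and `decode_decimal` are hypothetical names, and the sketch only assumes that the decimal arrives as a fixed-length, big-endian two's-complement byte string of at most 8 bytes. The runtime length is dispatched once to a template instantiation, so the copy size and shift amounts become compile-time constants, and a right shift on a signed value performs the sign extension.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not Doris code: decode an N-byte big-endian
// two's-complement integer. N is a template parameter, so the loop bound
// and the shift amount are compile-time constants.
template <std::size_t N>
std::int64_t decode_fixed(const unsigned char* src) {
    static_assert(N >= 1 && N <= 8, "sketch only covers values that fit in 64 bits");
    std::uint64_t bits = 0;
    for (std::size_t i = 0; i < N; ++i) {
        bits = (bits << 8) | src[i];
    }
    constexpr unsigned kShift = static_cast<unsigned>(64 - 8 * N);
    // Park the value in the top bytes, then shift right on a signed type so
    // the sign bit is replicated (arithmetic shift, guaranteed since C++20).
    return static_cast<std::int64_t>(bits << kShift) >> kShift;
}

// Static dispatch: map the runtime length to a fixed-size instantiation once,
// instead of looping over a runtime length for every value.
std::int64_t decode_decimal(const unsigned char* src, std::size_t len) {
    switch (len) {
    case 1: return decode_fixed<1>(src);
    case 2: return decode_fixed<2>(src);
    case 3: return decode_fixed<3>(src);
    case 4: return decode_fixed<4>(src);
    case 5: return decode_fixed<5>(src);
    case 6: return decode_fixed<6>(src);
    case 7: return decode_fixed<7>(src);
    case 8: return decode_fixed<8>(src);
    default: throw std::invalid_argument("length not covered by this sketch");
    }
}
```

In a real reader the dispatch would presumably happen once per column or page rather than once per value; the per-value switch here only keeps the sketch short.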
e41b03e530 [Fix](multi-catalog) delete hdfs hedged configs at BE side. (#25094)
Issue Number: close #25093 

We can set hdfs hedged configs when creating catalog, just like this:
```
CREATE CATALOG `test_ctl` PROPERTIES (
...
"dfs.client.hedged.read.threadpool.size" = "128",
"dfs.client.hedged.read.threshold.millis" = "500",
...
);
```
It is redundant to set these configs at BE side, and doing so causes the occasional bug reported in #25093.
2023-10-11 23:25:30 +08:00
73c3e3ab55 [Feature](x-load) support config min replica num for loading data (#21118) 2023-10-11 21:07:35 +08:00
ba87f7d3a3 [fix](pipelineX) add table sink and some fix in pipelineX (#25314) 2023-10-11 20:18:08 +08:00
e94fca4949 [enhancement](bvar) add metrics to monitor load throughput (#25189)
to monitor realtime load throughput of the BE

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-10-11 19:56:33 +08:00
f960b8c989 [bugfix](stream receiver) be will core during stop because receiver is not closed (#25298)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-10-11 19:49:40 +08:00
c6b1c903e4 [fix](Regression-test) fix that the String type in a nested type should contain double quotes and add regression-test (#25115) 2023-10-11 18:30:26 +08:00
e514d52232 [fix](point-query) Support mow table with sequence column (#25308) 2023-10-11 18:22:16 +08:00
73f632a4e3 [fix](move-memtable) handle error in LoadStreamWriter::close (#24805) 2023-10-11 16:54:42 +08:00
3db33207d4 [pipelineX](fix) Fix nullable types for set operator (#25294) 2023-10-11 16:50:54 +08:00
bb670118f5 [coverage](test) Delete unused function to improve test coverage (#25233) 2023-10-11 11:50:51 +08:00
cdf5f0fe68 [fix](pipelineX) mark join column should be nullable (#25275) 2023-10-11 11:35:43 +08:00
2f706cc84b [compile](simdjson reader) use __AVX2__ macro to decide whether use simdjson to parse (#25165) 2023-10-11 10:50:13 +08:00
8e66dbc4a8 [enhancement](log) add some decheck log to debug (#25210) 2023-10-11 10:33:13 +08:00
5be29f859a [enhancement](node) add filter in partition sort node in BE #25188
add filter in partition sort node in BE
2023-10-11 10:30:15 +08:00
2ed5245014 [FIX](array_function) fix array_map function with array index function without checkout arg… #25226 2023-10-11 10:23:33 +08:00
7b22ae0c80 [pipelineX](feature) Support set operation operator (#25251)
---------

Co-authored-by: zhaochangle <zhaochangle@selectdb.com>
2023-10-11 10:22:45 +08:00
be11b48407 [fix](load) fix MemTableWriter::active_memtable_mem_consumption (#25207) 2023-10-10 22:33:50 +08:00
fb3b888ff1 [prune](partition)support prune partition when is auto partition with function call (#24747)
Tables can now be created with auto partitioning, for example:
AUTO PARTITION BY RANGE date_trunc(event_day, 'day')
Each value of event_day is then inserted into the partition given by date_trunc(event_day, 'day').
For a query such as: select * from partition_range where date_trunc(event_day,"day")= "2023-08-07 11:00:00";
some partitions can be pruned by invoking date_trunc("2023-08-07 11:00:00","day").
2023-10-10 20:39:43 +08:00
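The pruning idea in fb3b888ff1 can be sketched as follows. This is an illustration under assumptions, not the planner code from the PR: `DayPartition`, `trunc_to_day`, and `prune` are hypothetical names, timestamps are non-negative seconds since the epoch in UTC, and each partition covers one half-open range `[lower, upper)`. The partition function is applied to the literal, and only partitions whose range contains the result are kept.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical partition descriptor: covers [lower, upper) in epoch seconds.
struct DayPartition {
    std::int64_t lower;
    std::int64_t upper;
};

// Stand-in for date_trunc(ts, 'day'); assumes ts >= 0 and UTC.
std::int64_t trunc_to_day(std::int64_t ts) {
    constexpr std::int64_t kSecondsPerDay = 86400;
    return ts - ts % kSecondsPerDay;
}

// Apply the partition function to the literal, then keep only the
// partitions whose range contains the truncated value.
std::vector<std::size_t> prune(const std::vector<DayPartition>& parts,
                               std::int64_t literal) {
    const std::int64_t key = trunc_to_day(literal);
    std::vector<std::size_t> kept;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (parts[i].lower <= key && key < parts[i].upper) {
            kept.push_back(i);
        }
    }
    return kept;
}
```

With three consecutive one-day partitions, a literal that falls inside the second day selects only index 1, so the other two partitions never need to be scanned.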
913282b29b [refactor](column) remove get_data_type in IColumn (#25242) 2023-10-10 20:27:15 +08:00
62a6b132be [Fix](func numbers) Remove backend_nums argument of numbers function (#25200) 2023-10-10 20:25:58 +08:00
ba1edcf2dc [fix](stack trace) Optimize stack trace output (#24933)
Status prints the stack trace with the first four frames removed, since they carry no useful information.
The field order of the stack trace is also optimized.
Example:

  0#  doris::PlanFragmentExecutor::cancel(doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/core/be/src/common/status.h:0
  1#  doris::FragmentMgr::cancel_query_unlocked(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::unique_lock<std::mutex> const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/core/be/src/runtime/fragment_mgr.cpp:984
  2#  doris::FragmentMgr::cancel_query(doris::TUniqueId const&, doris::PPlanFragmentCancelReason const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at doris/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/x86_64-linux-gnu/c++/11/bits/gthr-default.h:778
  3#  long doris::MemTrackerLimiter::free_top_memory_query<doris::MemTrackerLimiter::TrackerLimiterGroup>(long, doris::MemTrackerLimiter::Type, std::vector<doris::MemTrackerLimiter::TrackerLimiterGroup, std::allocator<doris::MemTrackerLimiter::TrackerLimiterGroup> >&, std::function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > (long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)> const&, doris::RuntimeProfile*) at doris/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/basic_string.h:187
  4#  doris::MemTrackerLimiter::free_top_memory_query(long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::RuntimeProfile*, doris::MemTrackerLimiter::Type) at doris/core/be/src/runtime/memory/mem_tracker_limiter.cpp:362
  5#  doris::MemInfo::process_full_gc() at doris/core/be/src/util/mem_info.cpp:198
  6#  doris::Daemon::memory_gc_thread() at doris/core/be/src/common/daemon.cpp:0
  7#  doris::Thread::supervise_thread(void*) at doris/ldb_toolchain/bin/../usr/include/pthread.h:562
  8#  start_thread
  9#  __clone
2023-10-10 18:23:07 +08:00
5f95e97c56 [fix](function) array distance should return null when result is nan (#25214) 2023-10-10 04:41:51 -05:00
6ca0f3fa5f [Bug](writer) Fix ub in async writer (#25218) 2023-10-10 16:00:45 +08:00
7434f80300 [pipelineX](refactor) Refactor pending finish dependency (#25181) 2023-10-10 11:56:02 +08:00
880d0d7e70 [Bug](pipeline) Support the auto partition in pipeline load (#25176) 2023-10-10 11:51:12 +08:00
6ad22721cb fix: ubsan compile bug (#25199) 2023-10-10 11:46:33 +08:00
39669c6df2 [feature](pipelineX) add runtime filter in pipelineX multicast sink (#25120) 2023-10-10 10:41:08 +08:00
f5b826b66d [fix](mark join) mark join column should be nullable (#24910) 2023-10-10 10:10:36 +08:00
b8621364d2 [FIX](serde)fix scale with decimalv2 in mysql writer which get real scale #25190 2023-10-10 09:09:57 +08:00
90ad48cdb7 [feature](pipelineX) add node id and profilev2 in pipelineX (#25084) 2023-10-10 09:09:26 +08:00
b58010c48e [fix](export) BufferWritable must be committed before deconstruct (#25185)
F20231009 16:03:47.659968 3342535 string_buffer.hpp:48] Check failed: _now_offset == 0
*** Check failure stack trace: ***
@ 0x561a6f8e21e6 google::LogMessage::SendToLog()
@ 0x561a6f8de7b0 google::LogMessage::Flush()
@ 0x561a6f8e2a29 google::LogMessageFatal::~LogMessageFatal()
@ 0x561a4a409233 doris::vectorized::BufferWritable::~BufferWritable()
@ 0x561a6e202853 doris::vectorized::VCSVTransformer::write()
@ 0x561a6e1f19ba doris::vectorized::VFileResultWriter::_write_file()
@ 0x561a6e1f1522 doris::vectorized::VFileResultWriter::append_block()
@ 0x561a6e121bed

The error occurs in DEBUG mode, and an export in that state produces invalid data.
It has been covered by baidu case.
2023-10-09 22:39:45 +08:00
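The invariant behind the `Check failed: _now_offset == 0` line in b58010c48e can be sketched like this. `SketchWriter` is a hypothetical stand-in, not the real `BufferWritable`: it only assumes that a writer appends bytes to a shared buffer, that `commit()` records the end of the current value, and that destroying the writer with uncommitted bytes is a bug caught by a debug-only check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical writer: appends bytes to `chars` and records the end of each
// committed value in `offsets`.
class SketchWriter {
public:
    SketchWriter(std::string& chars, std::vector<std::size_t>& offsets)
            : _chars(chars), _offsets(offsets) {}

    void write(const char* data, std::size_t len) {
        _chars.append(data, len);
        _pending += len;
    }

    // Close the current value: record where it ends and reset the counter.
    void commit() {
        _offsets.push_back(_chars.size());
        _pending = 0;
    }

    std::size_t pending() const { return _pending; }

    // Debug-only check, analogous in spirit to the failed check in the log:
    // every written byte must have been committed before destruction.
    ~SketchWriter() { assert(_pending == 0); }

private:
    std::string& _chars;
    std::vector<std::size_t>& _offsets;
    std::size_t _pending = 0;
};
```

A caller that writes and then returns without calling `commit()` leaves bytes in the buffer that no offset accounts for. The check only fires in builds with assertions enabled, which matches the note that the error shows up in DEBUG mode.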
53b46b7e6c [FIX](filter) update for filter_by_select logic (#25007)
This PR updates the filter_by_select logic and changes the delete restrictions:

- Only scalar types are supported in the WHERE condition of a DELETE statement.
- Only nullable columns and predicate columns support the filter_by_select logic, because non-scalar types cannot be pushed down to the storage layer and packed into a predicate column, yet still need to be filtered.
2023-10-09 21:27:40 +08:00
4de3df6a46 [refactor](column) remove unused method and column definitions (#25152)
remove unused method and column definitions
using primitive type in predicate column to check datev1 and datev2
2023-10-09 17:14:35 +08:00
af707e5244 [pipelineX](fix) fix external table scan operator (#25166) 2023-10-09 16:33:27 +08:00
e1b9854f90 [bugfix](thirdparty) Upgrade aws s3 sdk to prevent mem leak (#25106)
During the use of the AWS S3 SDK, we found that there is a memory leak. According to the official issue, upgrading the SDK should resolve the issue.
2023-10-09 16:08:50 +08:00
d7b6fe57df [Bug](java-udf) fix java-udf memory leak (#25151) 2023-10-09 15:10:56 +08:00
5a55e47acd [Enhancement](Load) stream tvf support two phase commit (#23800) 2023-10-09 14:15:56 +08:00
9e31cb26bb [fix](parse_url) fix parse_url not working in some cases when extracting the HOST (#25040)
Issue Number: close #24452
2023-10-09 00:14:58 +08:00
451e299151 [Opt](performance) Optimize timeround with minute / second (#25073) 2023-10-08 23:14:23 +08:00
5c020be4d2 [Bug](join) corner case cause the mark join + null aware left join core dump in regression test in pipeline query engine (#25087) 2023-10-08 22:50:12 +08:00
9d8b993c51 [fix](fs) fix remove error log failed (#25108) 2023-10-08 22:15:37 +08:00