Commit Graph

6030 Commits

Author SHA1 Message Date
b1eef30b49 [pipelineX](dependency) Wake up task by dependencies (#26879)
---------

Co-authored-by: Mryange <2319153948@qq.com>
2023-11-18 03:20:24 +08:00
0aec436ef8 [chore](be) format reader parameter settings (#22964) 2023-11-18 00:11:46 +08:00
5fb27eb652 [fix](compile) fix BE compile failure on Mac (#27206) 2023-11-17 23:52:51 +08:00
5d548935e0 [improvement](insert) support schema change and decommission for group commit (#26359) 2023-11-17 21:41:38 +08:00
0a1a6cf02f [fix](topn) add defensive code in topn opt to avoid crash due to column not in tablet schema 2023-11-17 21:14:10 +08:00
c459408580 [fix](jni) avoid BE crash and NPE when close paimon reader (#27129)
1. Do not use a FATAL log when JNI encounters an error, to avoid a crash.
2. Fix an NPE when closing PaimonReader: the reader may not be assigned if opening PaimonReader failed.
2023-11-17 20:01:08 +08:00
52995c528e [fix](iceberg) iceberg use customer method to encode special characters of field name (#27108)
Fix two bugs:
1. Missing-column detection is case sensitive, so change the column name to lower case in FE for hive/iceberg/hudi.
2. Iceberg uses a custom method to encode special characters in column names. Decode the column name to match the right column in the parquet reader.
2023-11-17 18:38:55 +08:00
xy
fdec286e82 [optimize](cooldown)Shorten the _meta_lock lock interval (#27118)
Change the two passes over _rs_version_map to one, reducing CPU overhead and shortening the time _meta_lock is held.

Co-authored-by: xingying01@corp.netease.com <xingying01@corp.netease.com>
2023-11-17 16:59:36 +08:00
593e3662b0 [Fix](match) fix match null for no index (#26983)
This pull request addresses an issue observed with inverted index tables or tables without indices when querying null values using the MATCH function. 
Previously, executing a query like `SELECT * FROM table WHERE column MATCH null;` would yield incorrect results. 

The update introduces enhanced handling of nullable columns within the MATCH function, ensuring accurate query results when null values are involved.
2023-11-17 15:57:50 +08:00
4d2fb1fffb [fix](load) add lock in active_memtable_mem_consumption (#27101) 2023-11-17 15:03:15 +08:00
e1b180d53d [improve](streamload) Explicitly judge the return value of close #27134 2023-11-17 14:17:09 +08:00
a0661ed9d2 [Fix](multi-catalog) Fix complex type crash when using dict filter facility in the parquet-reader. (#27151)
- Fix complex type crash when using the dict filter facility in the parquet-reader by turning off the dict filter facility in this case.
- Add orc complex types regression test.
2023-11-17 13:43:58 +08:00
4fff9a5937 [Improvement](inverted index) delay inverted index col read to reduce IO (#26080) (#26337) 2023-11-17 13:12:12 +08:00
91af86bc78 [fix](function) fix error when use negative number in explode_numbers #27020 2023-11-17 12:02:14 +08:00
Pxl
1188d88a10 [Chore](status) catch some error status on storage (#27132)
catch some error status on storage
2023-11-17 12:00:39 +08:00
334260dff7 [feature](function) support ip function ipv4stringtonum(ordefault, ornull), inet_aton (#25510) 2023-11-17 10:27:07 +08:00
a4d78682ff [Optimize](point query) clear names to reduce mem consumption and cpu cost related to block column name (#26931) 2023-11-17 10:18:21 +08:00
0c264c8a14 [fix](pipelineX) fix scheduling bug in union operator (#27131) 2023-11-17 10:02:54 +08:00
afffcfd14c [fix](load) skip cancel already cancelled channels (#27111) 2023-11-16 18:38:40 +08:00
e29d8cb110 [feature](move-memtable) support pipelineX in sink v2 (#27067) 2023-11-16 15:00:55 +08:00
54989175fb [case] Load json data with enable_simdjson_reader=false (#26601) 2023-11-16 14:40:59 +08:00
f10ab4e113 [enhancement](JNI) Provide default environment variables if it is unset (#27037) 2023-11-16 14:37:11 +08:00
7e82e7651a [Improve](txn) Add some fuzzy test stub in txn (#26712) 2023-11-16 11:50:06 +08:00
7fbc6d26a7 [debug](log) add some log to debug issue about insert (#27045) 2023-11-16 11:46:47 +08:00
042f6e8458 [cleanup](move-memtable) cleanup unused fields in rowset writer v2 (#27073) 2023-11-16 10:13:00 +08:00
xy
b8b86a7262 [enhance](cooldown) Reduce the locking interval for cooldown task (#26984)
Co-authored-by: xingying01 <xingying01@corp.netease.com>
2023-11-16 10:02:32 +08:00
0eabe9a651 [Test](orc-reader) Add orc submodule's unit tests. (#26878) 2023-11-16 09:53:42 +08:00
7ef1f7e511 [Bug](pipeline) try fix the exchange sink buffer result error (#27052) 2023-11-16 09:20:56 +08:00
02f3762ab3 [refactor](status) define error code and enable stacktrace in same place (#27065) 2023-11-16 08:41:40 +08:00
3ad865fef9 [refactor](storage) Expressing the types of computation layer and storage layer in PrimitiveTypeTraits (#26191) 2023-11-15 21:34:49 +08:00
10ee48bb6f [fix](publish version) publish version task no need return VERSION_NOT_EXIST #27005
If a tablet on the BE does not contain a txn, publishing that txn on it does not return an error. Instead, the version-existence check reports the tablet as an error_tablet_id in the task's response, so FE knows that this tablet has failed.

The task also does not need to set its status to "VERSION_NOT_EXIST". If the status is not OK, the BE will run the task twice, and because the tablet does not contain the txn, the retry is in vain.
2023-11-15 21:09:54 +08:00
035e593b26 remove useless hash function (#26955) 2023-11-15 20:37:21 +08:00
83edcdead9 [enhancement](random_sink) change tablet search algorithm from random to round-robin for random distribution table (#26611)
1. Fix a race condition when getting the tablet load index.
2. Change the tablet search algorithm from random to round-robin for random distribution tables when load_to_single_tablet is set to false.
2023-11-15 19:55:31 +08:00
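Note on 83edcdead9: a minimal sketch of race-free round-robin selection, assuming the tablet index is a shared counter advanced atomically. `RoundRobinTabletPicker` and its members are hypothetical names for illustration and are not taken from the Doris source.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the Doris implementation): hands out tablet
// indexes in round-robin order.
class RoundRobinTabletPicker {
public:
    explicit RoundRobinTabletPicker(std::size_t tablet_count, std::uint64_t start = 0)
            : _tablet_count(tablet_count), _next(start) {
        assert(tablet_count > 0);
    }

    // fetch_add is a single atomic read-modify-write, so two concurrent
    // callers never receive the same ticket.
    std::size_t next() {
        std::uint64_t ticket = _next.fetch_add(1, std::memory_order_relaxed);
        return static_cast<std::size_t>(ticket % _tablet_count);
    }

private:
    const std::size_t _tablet_count;
    std::atomic<std::uint64_t> _next;
};
```

With three tablets, successive calls to `next()` yield 0, 1, 2, 0, and so on. A plain integer incremented without synchronization could hand the same index to two loaders at once.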
0491437a86 [Opt](scanner-scheduler) Optimize BlockingQueue, BlockingPriorityQueue and change remote scan thread pool. (#26784)
## Proposed changes
- Optimize `BlockingQueue`, `BlockingPriorityQueue` by swapping `notify` and `unlock` to reduce lock competition. Ref: https://www.boost.org/doc/libs/1_54_0/boost/thread/sync_bounded_queue.hpp
- Change remote scan thread pool to `PriorityQueue`.

### Test result
Before:
```
mysql> select  sum(lo_partkey)  from  lineorder;
+-----------------+
| sum(lo_partkey) |
+-----------------+
| 300021444265405 |
+-----------------+
1 row in set (1.11 sec)
```

After:
```
mysql> select  sum(lo_partkey)  from  lineorder;
+-----------------+
| sum(lo_partkey) |
+-----------------+
| 300021444265405 |
+-----------------+
1 row in set (0.80 sec)
```
2023-11-15 18:24:36 +08:00
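Note on 0491437a86: a minimal sketch of the notify/unlock ordering, assuming the swap means releasing the mutex before signalling the condition variable. `SimpleBlockingQueue` is a hypothetical class for illustration, not the Doris `BlockingQueue`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical bounded queue (not the Doris class).
template <typename T>
class SimpleBlockingQueue {
public:
    explicit SimpleBlockingQueue(std::size_t capacity) : _capacity(capacity) {
        assert(capacity > 0);
    }

    void put(T value) {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_full.wait(lock, [this] { return _items.size() < _capacity; });
        _items.push_back(std::move(value));
        lock.unlock();           // release the mutex first ...
        _not_empty.notify_one(); // ... so the woken consumer can lock it at once
    }

    T take() {
        std::unique_lock<std::mutex> lock(_mutex);
        _not_empty.wait(lock, [this] { return !_items.empty(); });
        T value = std::move(_items.front());
        _items.pop_front();
        lock.unlock();
        _not_full.notify_one();
        return value;
    }

private:
    const std::size_t _capacity;
    std::mutex _mutex;
    std::condition_variable _not_empty;
    std::condition_variable _not_full;
    std::deque<T> _items;
};
```

If the notification is sent while the mutex is still held, the woken thread may immediately block again on that mutex. Unlocking first avoids that extra contention, and the predicate-based `wait` keeps the queue correct either way.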
d3fd923447 [opt](pipeline) Return InternalError to FE instead of doing a useless DCHECK in ExecNode #27035
Effect: the client will see an error message like the one below when BE hits a logical error in the plan.

ERROR 1105 (HY000): errCode = 2, detailMessage = ([xxx]())[CANCELLED]Logical error during processing VNewOlapScanNode(dr_case_tag), output of projections 2 mismatches with exec node output 3
2023-11-15 18:15:21 +08:00
00896d8954 [fix](agg) fix coredump of multi distinct of decimal128I (#27014)
* [fix](agg) fix coredump of multi distinct of decimal128

* fix
2023-11-15 17:37:20 +08:00
e1ba471727 [fix](send_batch_parallelism) add test case for send_batch_parallelism (#26908) 2023-11-15 14:21:58 +08:00
dbac12bae8 [fix](memory)Modify the default conf values of mem_limit and cache_last_version_interval_second (#26945)
mem_limit from 80% to 90%
cache_last_version_interval_second from 900 to 30
2023-11-15 14:02:58 +08:00
15c43d8b8a [BugFix](JDBC Catalog) fix jdbc catalog query bitmap may cause be core sometimes (#26933)
BitmapValue::write_to produces a string of size 1 for an empty BitmapValue. However, ColumnComplexType::insert reinterprets that 1-byte string as a BitmapValue:

    void insert(const Field& x) override {
        const String& s = doris::vectorized::get<const String&>(x);
        data.push_back(*reinterpret_cast<const T*>(s.c_str()));
    }

data.push_back then invokes the BitmapValue copy constructor. Because _type is not the first member of BitmapValue, the copy constructor accesses an unknown memory location.
2023-11-15 10:20:42 +08:00
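Note on 15c43d8b8a: a minimal sketch of why a serialized buffer should be decoded rather than reinterpreted as an object. `TinyBitmap` is a hypothetical stand-in whose layout and wire format are assumptions. It is not the real `BitmapValue`, and the commit's actual fix may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a bitmap value; NOT the real BitmapValue layout.
struct TinyBitmap {
    std::vector<std::uint8_t> values; // first member, so the tag is not at offset 0
    std::uint8_t type = 0;            // 0 = empty, 1 = explicit list of values

    // Assumed wire format: one tag byte followed by the values.
    std::string write_to() const {
        std::string out(1, static_cast<char>(type));
        out.append(values.begin(), values.end());
        return out;
    }

    // Rebuild the object from its bytes instead of casting the buffer pointer.
    static TinyBitmap deserialize(const std::string& bytes) {
        assert(!bytes.empty());
        TinyBitmap result;
        result.type = static_cast<std::uint8_t>(bytes[0]);
        result.values.assign(bytes.begin() + 1, bytes.end());
        return result;
    }
};
```

An empty `TinyBitmap` serializes to a single byte, which is far smaller than `sizeof(TinyBitmap)`. Casting that buffer to `const TinyBitmap*` and copying from it would read past the end of the string, which is the kind of access the commit message describes.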
6183b298e1 [refactor](data_type) remove some unused functions (#26966) 2023-11-15 09:23:53 +08:00
89215306d3 [improve](load) add switch for vertical segment writer (#26996) 2023-11-15 08:19:12 +08:00
30d1e6036c [feature](runtime filter) New session variable runtime_filter_wait_infinitely (#26888)
New session variable: runtime_filter_wait_infinitely. If runtime_filter_wait_infinitely = true, the consumer of a runtime filter (rf) keeps waiting to receive it until the query times out.
2023-11-14 21:05:59 +08:00
cdef768629 [fix](sink) crash caused by wild pointer of counter in VDataStreamSender (#26947)
If preparation fails, the counter _peak_memory_usage_counter will be a wild pointer.

*** SIGSEGV address not mapped to object (@0x454d49545f) received by PID 16992 (TID 18856 OR 0x7f4d05444700) from PID 1296651359; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:417
 1# os::Linux::chained_handler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 4# 0x00007F55C85B9400 in /lib64/libc.so.6
 5# doris::vectorized::VDataStreamSender::close(doris::RuntimeState*, doris::Status) at /root/doris/be/src/vec/sink/vdata_stream_sender.cpp:734
 6# doris::PlanFragmentExecutor::close() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:543
 7# doris::PlanFragmentExecutor::~PlanFragmentExecutor() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:95
 8# doris::FragmentExecState::~FragmentExecState() at /root/doris/be/src/runtime/fragment_mgr.cpp:112
 9# std::_Sp_counted_ptr<doris::FragmentExecState*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() at /root/ldb/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:348
10# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:855
11# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:592
12# doris::PInternalServiceImpl::_exec_plan_fragment_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::PFragmentRequestVersion, bool) at /root/doris/be/src/service/internal_service.cpp:463
13# doris::PInternalServiceImpl::_exec_plan_fragment_in_pthread(google::protobuf::RpcController*, doris::PExecPlanFragmentRequest const*, doris::PExecPlanFragmentResult*, google::protobuf::Closure*) at /root/doris/be/src/service/internal_service.cpp:305
14# doris::WorkThreadPool<false>::work_thread(int) at /root/doris/be/src/util/work_thread_pool.hpp:160
15# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
16# start_thread in /lib64/libpthread.so.0
17# clone in /lib64/libc.so.6
2023-11-14 19:05:49 +08:00
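Note on cdef768629: a minimal sketch of guarding against a counter pointer that is never assigned when preparation fails, assuming the remedy is a null default plus a check in `close`. `Counter` and `SenderSketch` are hypothetical types, and the commit's actual change may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical profile counter (not the Doris RuntimeProfile API).
struct Counter {
    std::int64_t value = 0;
    void set(std::int64_t v) { value = v; }
};

// Hypothetical sender showing the prepare()/close() life cycle.
class SenderSketch {
public:
    // Returns false when preparation fails before the counter is assigned.
    bool prepare(Counter* counter, bool fail_early) {
        assert(counter != nullptr);
        if (fail_early) {
            return false;
        }
        _peak_memory_usage_counter = counter;
        return true;
    }

    // close() can run after a failed prepare(), so it checks the pointer.
    void close(std::int64_t peak_bytes) {
        if (_peak_memory_usage_counter != nullptr) {
            _peak_memory_usage_counter->set(peak_bytes);
        }
    }

private:
    // The default member initializer keeps the pointer null instead of
    // indeterminate when prepare() exits early.
    Counter* _peak_memory_usage_counter = nullptr;
};
```

Without the initializer, the pointer would hold an indeterminate value after a failed `prepare`, and dereferencing it in `close` could crash as in the stack trace above.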
50fa96c185 [fix](memory)Fix MacOS perf_counters.cpp compile Error (#26942) 2023-11-14 18:53:04 +08:00
a2ae225c77 [Fix](row cache) invalid row cache using key encoded without sequence column (#26948) 2023-11-14 16:51:23 +08:00
573fa2063b [Fix](wal) Fix wal space back pressure core (#26907) 2023-11-14 16:10:25 +08:00
3585c7e216 [test](parquet)append parquet reader byte_array_decimal and rle_bool case (#26751) 2023-11-14 15:05:10 +08:00
9b3ddd13eb [pipelineX](jdbc) prevent JDBC scan if disable Java support (#26919) 2023-11-14 14:20:36 +08:00
d2eea9b3ae [chore](macOS) Reduce the size of executables on macOS arm64 (#26894)
Like #15641, we should reduce the size of executables on macOS arm64. Otherwise, we cannot run doris_be and doris_be_test with the ASAN build type on macOS arm64.
2023-11-14 12:21:08 +08:00
39473cdf48 [performance](load) add vertical segment writer (#24403) 2023-11-14 11:53:09 +08:00