Commit Graph

6172 Commits

Author SHA1 Message Date
a4d78682ff [Optimize](point query) clear names to reduce mem consumption and cpu cost related to block column name (#26931) 2023-11-17 10:18:21 +08:00
0c264c8a14 [fix](pipelineX) fix scheduling bug in union operator (#27131) 2023-11-17 10:02:54 +08:00
afffcfd14c [fix](load) skip cancel already cancelled channels (#27111) 2023-11-16 18:38:40 +08:00
e29d8cb110 [feature](move-memtable) support pipelineX in sink v2 (#27067) 2023-11-16 15:00:55 +08:00
54989175fb [case] Load json data with enable_simdjson_reader=false (#26601) 2023-11-16 14:40:59 +08:00
f10ab4e113 [enhancement](JNI) Provide default environment variables if it is unset (#27037) 2023-11-16 14:37:11 +08:00
7e82e7651a [Improve](txn) Add some fuzzy test stub in txn (#26712) 2023-11-16 11:50:06 +08:00
7fbc6d26a7 [debug](log) add some log to debug issue about insert (#27045) 2023-11-16 11:46:47 +08:00
042f6e8458 [cleanup](move-memtable) cleanup unused fields in rowset writer v2 (#27073) 2023-11-16 10:13:00 +08:00
b8b86a7262 [enhance](cooldown) Reduce the locking interval for cooldown task (#26984)
Co-authored-by: xingying01 <xingying01@corp.netease.com>
2023-11-16 10:02:32 +08:00
0eabe9a651 [Test](orc-reader) Add orc submodule's unit tests. (#26878) 2023-11-16 09:53:42 +08:00
7ef1f7e511 [Bug](pipeline) try fix the exchange sink buffer result error (#27052) 2023-11-16 09:20:56 +08:00
02f3762ab3 [refactor](status) define error code and enable stacktrace in same place (#27065) 2023-11-16 08:41:40 +08:00
3ad865fef9 [refactor](storage) Expressing the types of computation layer and storage layer in PrimitiveTypeTraits (#26191) 2023-11-15 21:34:49 +08:00
10ee48bb6f [fix](publish version) publish version task no need return VERSION_NOT_EXIST #27005
If a BE tablet does not contain the txn, publishing the txn on it does not error; when the version check finds the version missing, the tablet is reported as an error_tablet_id in the task's response, so the FE knows this tablet has failed.

Also, the task does not need to set its status to "VERSION_NOT_EXIST": if the status is not OK, the BE will try this task twice, and since it does not contain this tablet's txn, the retry is in vain.
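A hedged sketch of the reporting pattern described above; the struct and function names are illustrative stand-ins, not the actual BE publish-version code.
```
#include <cstdint>
#include <functional>
#include <vector>

// Illustrative sketch only: tablets with no data for the txn are recorded in
// error_tablet_ids while the task status stays OK, so the FE learns which
// tablets failed without the BE retrying a hopeless task.
struct PublishVersionResult {
    bool ok = true;                         // overall task status reported to FE
    std::vector<int64_t> error_tablet_ids;  // tablets the FE should treat as failed
};

PublishVersionResult publish_version(const std::vector<int64_t>& tablet_ids,
                                     const std::function<bool(int64_t)>& tablet_has_txn) {
    PublishVersionResult result;
    for (int64_t tablet_id : tablet_ids) {
        if (!tablet_has_txn(tablet_id)) {
            // Do not fail the whole task with VERSION_NOT_EXIST; a retry would
            // be in vain because this tablet has no data for the txn anyway.
            result.error_tablet_ids.push_back(tablet_id);
        }
    }
    return result;
}
```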
2023-11-15 21:09:54 +08:00
035e593b26 remove useless hash function (#26955) 2023-11-15 20:37:21 +08:00
83edcdead9 [enhancement](random_sink) change tablet search algorithm from random to round-robin for random distribution table (#26611)
1. fix a race condition when getting the tablet load index
2. change the tablet selection algorithm from random to round-robin for random distribution tables when load_to_single_tablet is set to false (see the sketch below)
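A minimal sketch of round-robin selection with an atomic cursor, assuming that is roughly how the race is avoided; the class and member names are illustrative, not the actual sink code.
```
#include <atomic>
#include <cstddef>
#include <cstdint>

// Illustrative sketch only: an atomic cursor both removes the race on the
// shared tablet index and spreads rows evenly across tablets.
class RoundRobinTabletPicker {
public:
    explicit RoundRobinTabletPicker(size_t tablet_count) : _tablet_count(tablet_count) {}

    size_t next_tablet_index() {
        // fetch_add is atomic, so concurrent writers never observe a torn or
        // duplicated index the way an unsynchronized counter could.
        return _cursor.fetch_add(1, std::memory_order_relaxed) % _tablet_count;
    }

private:
    std::atomic<uint64_t> _cursor{0};
    size_t _tablet_count;
};
```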
2023-11-15 19:55:31 +08:00
0491437a86 [Opt](scanner-scheduler) Optimize BlockingQueue, BlockingPriorityQueue and change remote scan thread pool. (#26784)
## Proposed changes
- Optimize `BlockingQueue`, `BlockingPriorityQueue` by swapping `notify` and `unlock` to reduce lock contention (see the sketch after this list). Ref: https://www.boost.org/doc/libs/1_54_0/boost/thread/sync_bounded_queue.hpp
- Change remote scan thread pool to `PriorityQueue`.
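A minimal sketch of the notify-after-unlock idea referenced above, assuming a standard mutex/condition-variable bounded queue; this is not the actual Doris `BlockingQueue`, just the pattern.
```
#include <condition_variable>
#include <mutex>
#include <queue>

// Signalling the condition variable *after* releasing the mutex avoids waking
// a consumer that would immediately block again on the still-held lock.
template <typename T>
class BoundedBlockingQueue {
public:
    explicit BoundedBlockingQueue(size_t cap) : _cap(cap) {}

    void put(T v) {
        std::unique_lock<std::mutex> lock(_mu);
        _not_full.wait(lock, [&] { return _q.size() < _cap; });
        _q.push(std::move(v));
        lock.unlock();            // release the mutex first ...
        _not_empty.notify_one();  // ... then notify, so the woken thread can
                                  // acquire the mutex without contention
    }

    T take() {
        std::unique_lock<std::mutex> lock(_mu);
        _not_empty.wait(lock, [&] { return !_q.empty(); });
        T v = std::move(_q.front());
        _q.pop();
        lock.unlock();            // same unlock-then-notify order as put()
        _not_full.notify_one();
        return v;
    }

private:
    std::mutex _mu;
    std::condition_variable _not_full, _not_empty;
    std::queue<T> _q;
    size_t _cap;
};
```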

### Test result
Before:
```
mysql> select  sum(lo_partkey)  from  lineorder;
+-----------------+
| sum(lo_partkey) |
+-----------------+
| 300021444265405 |
+-----------------+
1 row in set (1.11 sec)
```

After:
```
mysql> select  sum(lo_partkey)  from  lineorder;
+-----------------+
| sum(lo_partkey) |
+-----------------+
| 300021444265405 |
+-----------------+
1 row in set (0.80 sec)
```
2023-11-15 18:24:36 +08:00
d3fd923447 [opt](pipeline) Return InternalError to FE instead of doing a useless DCHECK in ExecNode #27035
Effect: the client will see an error message like the one below when the BE hits a plan logical error.

ERROR 1105 (HY000): errCode = 2, detailMessage = ([xxx]())[CANCELLED]Logical error during processing VNewOlapScanNode(dr_case_tag), output of projections 2 mismatches with exec node output 3
2023-11-15 18:15:21 +08:00
00896d8954 [fix](agg) fix coredump of multi distinct of decimal128I (#27014)
* [fix](agg) fix coredump of multi distinct of decimal128

* fix
2023-11-15 17:37:20 +08:00
e1ba471727 [fix](send_batch_parallelism) add test case for send_batch_parallelism (#26908) 2023-11-15 14:21:58 +08:00
dbac12bae8 [fix](memory)Modify the default conf values of mem_limit and cache_last_version_interval_second (#26945)
mem_limit from 80% to 90%
cache_last_version_interval_second from 900 to 30
2023-11-15 14:02:58 +08:00
15c43d8b8a [BugFix](JDBC Catalog) fix jdbc catalog query bitmap may cause be core sometimes (#26933)
`BitmapValue::write_to` produces a string of size 1 for an empty BitmapValue; however, that 1-byte string is later reinterpreted back as a `BitmapValue*` in `ColumnComplexType::insert`:
```
void insert(const Field& x) override {
    const String& s = doris::vectorized::get<const String&>(x);
    data.push_back(*reinterpret_cast<const T*>(s.c_str()));
}
```

`data.push_back` then invokes the BitmapValue copy constructor; because `_type` is not the first member of BitmapValue, the copy reads an unknown memory location beyond the 1-byte buffer.
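A hedged, self-contained illustration of the failure mode; the struct below is a stand-in, not the real BitmapValue.
```
#include <cstdint>
#include <string>

// Only one byte of the serialized string is valid, yet a whole object is read
// from it. The copy constructor touches fields that lie past the end of the
// serialized data, which is undefined behaviour.
struct FakeBitmapValue {
    uint64_t _sv = 0;  // single-value storage
    int _type = 0;     // NOT the first member, mirroring the description above
};

int main() {
    std::string serialized(1, '\0');  // what write_to produces for an empty bitmap
    const auto* wild = reinterpret_cast<const FakeBitmapValue*>(serialized.c_str());
    FakeBitmapValue copy = *wild;  // out-of-bounds read relative to the 1 valid byte
    (void)copy;
    return 0;
}
```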
2023-11-15 10:20:42 +08:00
6183b298e1 [refactor](data_type) remove some unused functions (#26966) 2023-11-15 09:23:53 +08:00
89215306d3 [improve](load) add switch for vertical segment writer (#26996) 2023-11-15 08:19:12 +08:00
30d1e6036c [feature](runtime filter) New session variable runtime_filter_wait_infinitely (#26888)
New session variable: runtime_filter_wait_infinitely. If runtime_filter_wait_infinitely is set to true, the runtime filter consumer will keep waiting for the filter until the query times out.
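A minimal sketch of the waiting behaviour, under the assumption that the consumer blocks on a condition variable and that the non-infinite path uses the normal runtime filter wait time; the names are illustrative, not the actual runtime filter code.
```
#include <chrono>
#include <condition_variable>
#include <mutex>

// With the flag on, the consumer waits until the query deadline; with it off,
// it gives up after the regular wait time and proceeds without the filter.
struct FilterWaiter {
    std::mutex mu;
    std::condition_variable cv;
    bool filter_ready = false;

    bool wait(bool wait_infinitely,
              std::chrono::milliseconds rf_wait_time,
              std::chrono::steady_clock::time_point query_deadline) {
        std::unique_lock<std::mutex> lock(mu);
        auto deadline = wait_infinitely
                ? query_deadline                                 // bounded only by the query timeout
                : std::chrono::steady_clock::now() + rf_wait_time;
        return cv.wait_until(lock, deadline, [&] { return filter_ready; });
    }
};
```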
2023-11-14 21:05:59 +08:00
cdef768629 [fix](sink) crash caused by wild pointer of counter in VDataStreamSender (#26947)
If preparation fails, the counter _peak_memory_usage_counter will be a wild pointer.
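A hedged sketch of the defensive pattern the fix implies (default the pointer to nullptr and guard it in close); the class below is illustrative, not the real VDataStreamSender.
```
// Illustrative sketch only: default the counter to nullptr so that close()
// can safely detect that prepare() never created it.
class SenderSketch {
    struct Counter {
        void report() {}
    };

public:
    bool prepare(bool will_fail) {
        if (will_fail) {
            return false;  // bail out before the counter is ever created
        }
        _peak_memory_usage_counter = new Counter();
        return true;
    }

    void close() {
        // Without the nullptr default and this check, close() after a failed
        // prepare() would dereference an uninitialized (wild) pointer.
        if (_peak_memory_usage_counter != nullptr) {
            _peak_memory_usage_counter->report();
        }
    }

    ~SenderSketch() { delete _peak_memory_usage_counter; }

private:
    Counter* _peak_memory_usage_counter = nullptr;
};
```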

*** SIGSEGV address not mapped to object (@0x454d49545f) received by PID 16992 (TID 18856 OR 0x7f4d05444700) from PID 1296651359; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /root/doris/be/src/common/signal_handler.h:417
 1# os::Linux::chained_handler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 2# JVM_handle_linux_signal in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 3# signalHandler(int, siginfo*, void*) in /app/doris/Nexchip-doris-1.2.4.2-bin-x86_64/java8/jre/lib/amd64/server/libjvm.so
 4# 0x00007F55C85B9400 in /lib64/libc.so.6
 5# doris::vectorized::VDataStreamSender::close(doris::RuntimeState*, doris::Status) at /root/doris/be/src/vec/sink/vdata_stream_sender.cpp:734
 6# doris::PlanFragmentExecutor::close() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:543
 7# doris::PlanFragmentExecutor::~PlanFragmentExecutor() at /root/doris/be/src/runtime/plan_fragment_executor.cpp:95
 8# doris::FragmentExecState::~FragmentExecState() at /root/doris/be/src/runtime/fragment_mgr.cpp:112
 9# std::_Sp_counted_ptr<doris::FragmentExecState*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() at /root/ldb/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/shared_ptr_base.h:348
10# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&, std::function<void (doris::RuntimeState*, doris::Status*)> const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:855
11# doris::FragmentMgr::exec_plan_fragment(doris::TExecPlanFragmentParams const&) at /root/doris/be/src/runtime/fragment_mgr.cpp:592
12# doris::PInternalServiceImpl::_exec_plan_fragment_impl(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, doris::PFragmentRequestVersion, bool) at /root/doris/be/src/service/internal_service.cpp:463
13# doris::PInternalServiceImpl::_exec_plan_fragment_in_pthread(google::protobuf::RpcController*, doris::PExecPlanFragmentRequest const*, doris::PExecPlanFragmentResult*, google::protobuf::Closure*) at /root/doris/be/src/service/internal_service.cpp:305
14# doris::WorkThreadPool<false>::work_thread(int) at /root/doris/be/src/util/work_thread_pool.hpp:160
15# execute_native_thread_routine at ../../../../../libstdc++-v3/src/c++11/thread.cc:84
16# start_thread in /lib64/libpthread.so.0
17# clone in /lib64/libc.so.6
2023-11-14 19:05:49 +08:00
50fa96c185 [fix](memory)Fix MacOS perf_counters.cpp compile Error (#26942) 2023-11-14 18:53:04 +08:00
a2ae225c77 [Fix](row cache) invalid row cache using key encoded without sequence column (#26948) 2023-11-14 16:51:23 +08:00
573fa2063b [Fix](wal) Fix wal space back pressure core (#26907) 2023-11-14 16:10:25 +08:00
3585c7e216 [test](parquet)append parquet reader byte_array_decimal and rle_bool case (#26751) 2023-11-14 15:05:10 +08:00
9b3ddd13eb [pipelineX](jdbc) prevent JDBC scan if disable Java support (#26919) 2023-11-14 14:20:36 +08:00
d2eea9b3ae [chore](macOS) Reduce the size of executables on macOS arm64 (#26894)
Like #15641, we should reduce the size of executables on macOS arm64; otherwise, we cannot run doris_be and doris_be_test with the ASAN build type on macOS arm64.
2023-11-14 12:21:08 +08:00
39473cdf48 [performance](load) add vertical segment writer (#24403) 2023-11-14 11:53:09 +08:00
f6a9914bc7 [feature](move-memtable) support auto partition in sink v2 (#26914) 2023-11-14 11:39:44 +08:00
de6ecd2035 [fix](tls) Manually track memory in Allocator instead of mem hook and ThreadContext life cycle to manual control (#26904)
Manually track query/load/compaction/etc. memory in the Allocator instead of via the mem hook.
The Mem Hook can still be used for code segments where memory cannot be tracked manually, and for locating memory usage during debugging.
This causes some loss of memory tracking for queries (less than 10% compared to the past), but it is expected to be more controllable.
Similarly, the Mem Hook no longer tracks unowned memory to the orphan mem tracker by default, so the total memory of all MemTrackers will be less than before.
The Mem Hook no longer needs to query jemalloc for the allocation size on every alloc and free, which cost performance in the past.
It also no longer requires caching the bthread local in a pthread local for the memory hook; in the past this caused core dumps inside bthread, which appears to be a bthread bug.
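A minimal sketch of allocation-site tracking inside an allocator, as opposed to a global malloc hook; the tracker and allocator types are illustrative stand-ins, not the actual Doris classes.
```
#include <atomic>
#include <cstdlib>
#include <new>

// The allocator itself reports every alloc/free to a tracker, so no hook and
// no jemalloc size lookup is needed on the hot path.
class SimpleMemTracker {
public:
    void consume(int64_t bytes) { _consumed.fetch_add(bytes, std::memory_order_relaxed); }
    void release(int64_t bytes) { _consumed.fetch_sub(bytes, std::memory_order_relaxed); }
    int64_t consumption() const { return _consumed.load(std::memory_order_relaxed); }

private:
    std::atomic<int64_t> _consumed{0};
};

class TrackingAllocator {
public:
    explicit TrackingAllocator(SimpleMemTracker* tracker) : _tracker(tracker) {}

    void* alloc(size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        _tracker->consume(static_cast<int64_t>(size));  // tracked at the call site
        return p;
    }

    void free(void* p, size_t size) {
        std::free(p);
        _tracker->release(static_cast<int64_t>(size));  // size is known, no jemalloc query
    }

private:
    SimpleMemTracker* _tracker;
};
```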
ThreadContext life cycle under manual control:
In the past, ThreadContext was created automatically on first use (usually in the jemalloc hook on the first malloc) and destroyed automatically when the thread exited.
Now the creation and destruction of ThreadContext are controlled manually: it is created when the task thread starts and destroyed before the task thread ends.
Ran the 43 ClickBench queries.
Using the Mem Hook in the past:
2023-11-14 10:30:42 +08:00
34edc578f1 [opt](MergeIO) use equivalent merge size to measure merge effectiveness (#26741)
`MergeRangeFileReader` is used to merge small IOs, and `max_amplified_read_ratio` controls the proportion of read amplification. However, in some extreme cases (e.g. when the `orc strip size`/`parquet row group size` is less than 3MB), `max_amplified_read_ratio` does not control this well, resulting in a large number of small IOs.

Testing shows that the latency of a single IO remains basically unchanged for IO sizes smaller than 4KB on HDFS (512KB on OSS). Therefore, an equivalent IO size is used to measure merge effectiveness:
```
EquivalentIOSize = MergeSize / Request IOs
```
When `EquivalentIOSize` is greater than 4KB on HDFS, or 512KB on OSS, we consider this kind of merge effective.
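A short, hedged sketch of the decision rule above; the function and thresholds mirror the description, not the actual `MergeRangeFileReader` code.
```
#include <cstdint>

// Decide whether merging a set of small ranges is worthwhile by the
// equivalent IO size rule described above.
bool is_merge_effective(int64_t merged_bytes, int64_t request_io_count, bool is_hdfs) {
    if (request_io_count <= 0) return false;
    const int64_t equivalent_io_size = merged_bytes / request_io_count;
    const int64_t threshold = is_hdfs ? 4 * 1024       // 4KB for HDFS
                                      : 512 * 1024;    // 512KB for object storage (OSS)
    return equivalent_io_size > threshold;
}
```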
2023-11-14 10:07:14 +08:00
6f82c798eb [fix](delta-writer) fix total received rows in delta writer incorrect (#26905) 2023-11-14 08:31:16 +08:00
ec40603b93 [fix](parquet) compressed_page_size has the same meaning in page v1 and v2 (#26783)
1. Parquet files with page v2 were parsed incorrectly when using codecs other than snappy, because `compressed_page_size` has the same meaning in page v1 and v2: it always contains the bytes of the definition levels, repetition levels, and compressed data.
2. Add regression tests for decimal types stored as `fix_length_byte_array`, and for dictionary-encoded date/datetime types.
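A small sketch of the size arithmetic the fix relies on, assuming standard Parquet data page v2 semantics (levels stored uncompressed up front, `compressed_page_size` covering levels plus compressed values); the struct is a simplified stand-in for the Thrift page header.
```
#include <cstdint>

struct PageV2Header {
    int32_t compressed_page_size;           // levels + compressed values, as in v1
    int32_t repetition_levels_byte_length;  // stored uncompressed
    int32_t definition_levels_byte_length;  // stored uncompressed
};

// Number of bytes that actually need to go through the decompressor.
int32_t compressed_values_size(const PageV2Header& h) {
    return h.compressed_page_size
           - h.repetition_levels_byte_length
           - h.definition_levels_byte_length;
}
```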
2023-11-14 08:30:42 +08:00
b19abac5e2 [fix](move-memtable) pass num local sink to backends (#26897) 2023-11-14 08:28:49 +08:00
de62c00f4e [fix](move-memtable) init auto partition context in VRowDistribution::open (#26911) 2023-11-14 08:16:14 +08:00
5ad49dceaa [fix](scanner_schedule) scanner hangs due to negative num_running_scanners (#26816)
* [fix] scanner hangs due to negative num_running_scanners

Before the patch, num_running_scanners was increased after submitting. It could therefore be decreased before being increased, so get_block_from_queue could see a negative value and an expected submit did not happen.
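A minimal sketch of the ordering bug and the fix, with illustrative names; the real code submits scanners to a thread pool, and `schedule` below stands in for that handoff.
```
#include <atomic>
#include <functional>

std::atomic<int> num_running_scanners{0};

// `schedule` hands the scanner off; the scanner decrements the counter when it
// finishes, possibly before the submit call returns.
void submit_buggy(const std::function<void()>& schedule) {
    schedule();               // scanner may already finish and decrement here
    num_running_scanners++;   // too late: readers can observe a negative value
}

void submit_fixed(const std::function<void()>& schedule) {
    num_running_scanners++;   // count the scanner before it can possibly finish
    schedule();
}
```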

Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2023-11-13 23:03:49 +08:00
f698bb7be2 [Feature](inverted index) index tool add match_all and match_phrase (#26896) 2023-11-13 22:53:11 +08:00
ebc15fc6cc [fix](transaction) Fix concurrent schema change and txn cause dead lock (#26428)
Concurrent schema change and txn may cause a deadlock. An example:

1. Txn T is committed but not yet published;
2. A schema change or rollup runs on T's related partition and adds alter replica R;
3. sc/rollup adds a scheduled txn watermark M;
4. FE restarts;
5. After the FE restart, T's loadedTblIndexes is cleared because it is not saved to disk;
6. T publishes its version to all tablets, including sc/rollup's new alter replica R;
7. Since R does not contain the txn data, T fails and keeps waiting for R's data;
8. sc/rollup waits for txns before M to finish; only after that will it let R copy history data;
9. Since T never finishes, sc/rollup waits forever, so R never copies history data;
10. Txn T and sc/rollup wait for each other forever, causing a deadlock.

Fix: because sc/rollup guarantees double write after the scheduled watermark M, when finishing a transaction and checking an alter replica (see the sketch below):

- if the txn id is bigger than M, check it just like a normal replica;
- otherwise skip checking this replica; the BE will backfill history data later.
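A hedged sketch of the decision rule in the fix; the real check lives in the FE (Java), and this merely restates the rule with illustrative names.
```
#include <cstdint>

// Replicas created by schema change / rollup are double-written only for txns
// after the watermark M, so older txns skip the check and the BE backfills
// history data later.
bool should_check_alter_replica(int64_t txn_id, int64_t sched_watermark_txn_id) {
    return txn_id > sched_watermark_txn_id;
}
```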
2023-11-13 21:39:28 +08:00
504ec324bb Revert "[refactor](scan) delete bloom_filter_predicate (#26499)" (#26851)
This reverts commit 2bb3ef198144954583aea106591959ee09932cba.
2023-11-13 16:27:23 +08:00
c6b97c4daa [Improvement](segment iterator) remove range in first read to save time (#26689)
Currently, rowids may be fragmented significantly after `_get_row_ranges_by_column_conditions`, potentially leading to high CPU costs when processing these scattered ranges of rowid.

This PR enhances the `SegmentIterator` by eliminating the initial range read in the `BitmapRangeIterator` constructor and introducing a `read_batch_rowids` method to both `BitmapRangeIterator` and `BackwardBitmapRangeIterator` classes. The aim is to boost performance by omitting redundant read operations, thereby reducing execution time.

Moreover, to avoid unnecessary reads when the range is relatively complete, we employ a simple `is_continuous` check to determine if the block of rows is continuous. If so, we call `next_batch` instead of `read_by_rowids`, streamlining the processing of consecutive rowids.
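A minimal sketch of the `is_continuous` fast path described above; the reader type is a stand-in, and the method names follow the commit message but are otherwise illustrative.
```
#include <cstdint>
#include <vector>

// If the rowids form one unbroken run, a sequential batch read is cheaper;
// otherwise fall back to point reads by rowid.
template <typename ColumnReader>
void read_rows(ColumnReader& reader, const std::vector<uint32_t>& rowids) {
    if (rowids.empty()) return;
    const bool is_continuous =
            rowids.back() - rowids.front() + 1 == rowids.size();
    if (is_continuous) {
        reader.next_batch(rowids.front(), rowids.size());      // consecutive rowids
    } else {
        reader.read_by_rowids(rowids.data(), rowids.size());   // scattered rowids
    }
}
```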


We selected three SQL statement scenarios to test the effects of the optimization, which are:

1. ```select COUNT() from wc_httplogs_inverted_index where request match "images" and (size >= 10 and status = 200);```
2. ```select COUNT() from wc_httplogs_inverted_index where request match "HTTP" and (size >= 10 and status = 200);```
3. ```select COUNT() from wc_httplogs_inverted_index where request match "GET" and (size >= 10 and status = 200);```

- The first SQL statement represents the scenario primarily optimized in this PR, where the first read matches a large number of rows but is highly fragmented. 
- The second SQL statement represents a scenario where the first read fully hits, mainly to verify if there is any performance degradation in the PR when hitting a complete rowid range. 
- The third SQL statement represents a near-total hit with only occasional misses, used to check if the PR degrades when the rowid range contains many continuous ranges.

The results are as follows:

1. For the first SQL statement:
    1. Before optimization: Execution time: 0.32 sec, FirstReadTime: 6s628ms
    2. After optimization: Execution time: 0.16 sec, FirstReadTime: 1s604ms
2. For the second SQL statement:
    1. Before optimization: Execution time: 0.16 sec, FirstReadTime: 682.816ms
    2. After optimization: Execution time: 0.15 sec, FirstReadTime: 635.156ms
3. For the third SQL statement:
    1. Before optimization: Execution time: 0.16 sec, FirstReadTime: 787.904ms
    2. After optimization: Execution time: 0.16 sec, FirstReadTime: 798.861ms
2023-11-13 15:51:48 +08:00
2f32a721ee [refactor](jni) unified jni framework for jdbc catalog (#26317)
This commit overhauls the JDBC connector logic within our project, transitioning from the previous mechanism of fetching data through JNI calls for individual ResultSet items to a more efficient and unified approach using the VectorTable data structure.
2023-11-13 14:28:15 +08:00
fa3c7d98c8 [fix](map) the implementation of ColumnMap::replicate was incorrect (#26647) 2023-11-13 12:17:14 +08:00
c0fda8c5c2 [improve](group commit) Add a switch to wait internal group commit lo… (#26734)
* [improve](group commit) Add a switch to make internal group commit load finish

* modify group commit tvf plan
2023-11-13 10:35:35 +08:00
7332b1b371 [fix](decimal) fix undefined behaviour of divide by zero when cast string to decimal (#26822)
* [fix](decimal) fix undefined behaviour of divide by zero when cast string to decimal

* fix format
2023-11-13 10:09:06 +08:00