Commit Graph

7734 Commits

Author SHA1 Message Date
f8fcd17f33 [fix](memory) Fix nested scoped tracker and nested reserve memory (#35257)
SCOPED_ATTACH_TASK cannot be nested, but SCOPED_SWITCH_THREAD_MEM_TRACKER_LIMITER can continue to be called, so attach_limiter_tracker may be nested.
2024-05-28 13:12:03 +08:00
9d6b2d66ca [feature](metrics)support be jvm metrics. (#35023)
support be jvm metrics.
if you `curl http://be_host:webserver_port/metrics` , you will get :
```
doris_be_jvm_heap_size_bytes{type="max"} 8589934592
doris_be_jvm_heap_size_bytes{type="committed"} 8589934592
doris_be_jvm_heap_size_bytes{type="used"} 364159504

doris_be_jvm_non_heap_size_bytes{type="committed"} 117899264
doris_be_jvm_non_heap_size_bytes{type="used"} 115330424

doris_be_jvm_young_size_bytes{type="used"} 255852544
doris_be_jvm_young_size_bytes{type="peak_used"} 255852544
doris_be_jvm_young_size_bytes{type="max"} 8589934592

doris_be_jvm_old_size_bytes{type="used"} 94393344
doris_be_jvm_old_size_bytes{type="peak_used"} 94393344
doris_be_jvm_old_size_bytes{type="max"} 8589934592

doris_be_jvm_gc{name="G1 Young Generation Count", type="count"} 3
doris_be_jvm_gc{name="G1 Young Generation Time", type="time"} 33
doris_be_jvm_gc{name="G1 Old Generation Count", type="count"} 0
doris_be_jvm_gc{name="G1 Old Generation Time", type="time"} 0

doris_be_jvm_thread{type="count"} 147
doris_be_jvm_thread{type="peak_count"} 147
doris_be_jvm_thread{type="new_count"} 0
doris_be_jvm_thread{type="runnable_count"} 25
doris_be_jvm_thread{type="blocked_count"} 0
doris_be_jvm_thread{type="waiting_count"} 48
doris_be_jvm_thread{type="timed_waiting_count"} 74
doris_be_jvm_thread{type="terminated_count"} 0
```
2024-05-28 13:12:03 +08:00
79cd726132 [Fix](inverted index) fix race condition in index build (#35427)
Fix race condition problem introduced by #35366 , which will cause heap-use-after-free
2024-05-28 13:12:03 +08:00
d8eefd0be8 [fix] fix wrong result of spill agg with limit (#35403) 2024-05-28 13:12:03 +08:00
7058b31edd [fix](move-memtable) clear load streams before shutdown SegmentFileWriterThreadPool (#35217) 2024-05-28 13:12:03 +08:00
238e218312 [fix](httpapi) restore compaction/run_status api can show be's overall compaction status and refactor code (#35409) 2024-05-28 09:43:43 +08:00
596fb6f327 [improve](ub) fix some runtime error of ubsan when downcast (#35343)
those code could work well, but it will be report some runtime error under UBSAN,
so refactor it to let's ubsan could running happy.
2024-05-27 15:27:43 +08:00
c44affb43f Add downgrade scan thread num by column num (#35351) 2024-05-27 15:27:12 +08:00
68eda58a8c [Fix](multi-catalog) Fix string dict filtering when use null related function in parquet and orc reader. (#35335)
The following sql and when the dictionary column contains functions related to null, the results will be incorrect.
```
select * from ( select IF(o_orderpriority IS NULL, 'null', o_orderpriority) AS o_orderpriority from test_string_dict_filter_orc ) as A where o_orderpriority = 'null';
```
```
select * from ( select IFNULL(o_orderpriority, 'null') AS o_orderpriority from test_string_dict_filter_parquet ) as A where o_orderpriority = 'null'
```
```
select * from ( select COALESCE(o_orderpriority, 'null') AS o_orderpriority from test_string_dict_filter_parquet ) as A where o_orderpriority = 'null';
```
2024-05-27 15:25:29 +08:00
7284b6959f [Configurations](multi-catalog)Fix enable_orc_filter_by_min_max functionality, the mistake for #35012. (#35320)
fix bug introduced from  #35012
2024-05-27 15:25:07 +08:00
09f9012817 [Fix](hive-writer) Fix hive partition update core. (#35311)
Issue: #31442
```
/home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
3# 0x00007F963FA9D090 in /lib/x86_64-linux-gnu/libc.so.6
4# doris::vectorized::VHivePartitionWriter::_build_partition_update() at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/vhive_partition_writer.cpp:215
5# doris::vectorized::VHivePartitionWriter::close(doris::Status const&) at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/vhive_partition_writer.cpp:164
6# doris::vectorized::VHiveTableWriter::close(doris::Status) at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/vhive_table_writer.cpp:209
7# doris::vectorized::AsyncResultWriter::process_block(doris::RuntimeState*, doris::RuntimeProfile*) at /home/zcp/repo_center/doris_master/doris/be/src/vec/sink/writer/async_result_writer.cpp:184
8# doris::vectorized::AsyncResultWriter::start_writer(doris::RuntimeState*, doris::RuntimeProfile*)::$_0::operator()() const at
```
2024-05-27 15:24:53 +08:00
5ab5ec3d0d [Fix](inverted index) fix build index wrong size for inverted index (#35366) 2024-05-27 15:24:17 +08:00
2e20e38523 [improvement](jdbc catalog) remove useless jdbc catalog code (#34986) (#35418) 2024-05-27 14:25:26 +08:00
b6eaf95720 [fix](memory) Fix BE memory info compatible with Cgroup (#35412) (#35425)
1. `memory.usage_in_bytes ~= free.used + free.(buff/cache) - (buff)`, free cache can be reused,
   so, modify cgroup_memory_usage = memory.usage_in_bytes - memory.meminfo["Cached"].
2. If system not configured with cgroup, find cgroup file path will failed, refactor refresh cgroup memory info, compatible with find failed.
2024-05-27 12:31:44 +08:00
8f5deb10be [be](oom) add stacktrace in debugmode to find oom reason 2024-05-26 23:39:46 +08:00
ade1841a01 [fix](shuffle) Do not return error if local recvr is null (#35399) 2024-05-26 20:20:50 +08:00
6e17dc1e87 (cherry-pick)[branch-2.1] add calc tablet file crc and fix single compaction test #33076 #34915 (#35215)
* [fix](compaction test) show single replica compaction status and fix test (#33076)
* [improve](http action) add http interface to calculate the crc of all files in tablet (#34915)
2024-05-26 17:15:09 +08:00
65b9e5ab69 [fix](chore) fix DCHECK failure of BufferWritable if failed to alloc memory (#35345) 2024-05-25 17:48:04 +08:00
Pxl
b143f0dfe2 [Improvement](date) shortcut for str to date parse (#35288)
shortcut for str to date parse
2024-05-25 17:47:20 +08:00
34e5030702 [bugifx](core) fix logical error of status check in nestedloop join (#35365) 2024-05-25 17:46:44 +08:00
c6c90ff63e [chore](routine-load) make routine_load_consumer_pool_size can update using HTTP API (#35315) 2024-05-25 17:46:29 +08:00
5bcdc75283 fix compile 2024-05-25 09:00:48 +08:00
0f550aeda7 [fix](compression) handle exception to reuse compression context (#35338) (#35380)
* [fix](compression) handle exception to reuse compression context

Otherwise, there is memleak and new context is allocated, then flush tlb
consumes a lot sys cpu.
2024-05-24 19:56:27 +08:00
c4b2ddd688 [Fix](Variant) clear block after a flush complete (#35226) (#35372)
Otherwise result in crash

```
*** SIGSEGV address not mapped to object (@0x0) received by PID 4149909 (TID 4152328 OR 0x7efefc60d700) from PID 0; stack trace: ***
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
 1# PosixSignals::chained_handler(int, siginfo*, void*) [clone .part.0] in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 2# JVM_handle_linux_signal in /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
 3# 0x00007F031AD0E090 in /lib/x86_64-linux-gnu/libc.so.6
 4# doris::Status doris::vectorized::MutableBlock::merge_impl<doris::vectorized::Block const&>(doris::vectorized::Block const&) at /home/zcp/repo_center/doris_master/doris/be/src/vec/core/block.h:586
 5# doris::Status doris::vectorized::MutableBlock::merge<doris::vectorized::Block const&>(doris::vectorized::Block const&) at /home/zcp/repo_center/doris_master/doris/be/src/vec/core/block.h:521
```
2024-05-24 19:10:07 +08:00
41f29cf4cd [fix](decompress)(review) context leaked in failure path (#33622) (#35364)
* [fix](decompress)(review) context leaked in failure path

* [fix](decompress)(review) context leaked in failure path review fix

Co-authored-by: Vallish Pai <vallishpai@gmail.com>
2024-05-24 17:40:13 +08:00
639c7ee7fb [fix](decimalv2) fix scale of decimalv2 to string (#35222) (#35359)
* [fix](decimalv2) fix scale of decimalv2 to string
2024-05-24 17:20:43 +08:00
4b91ad003f [opt](memory) avoid allocate memory in agg operator constructor (#35301)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-05-24 16:23:58 +08:00
309503855e [Fix](bloom filter) Fix bloom filter memory leak (#34871)
* Issue: Doris occasionally encounters an issue where memory usage becomes exceptionally high and does not decrease. The leaked memory is occupied by Bloom filters stored in memory.

Reason: The segment cache stores segment objects read from files into memory. It functions as an LRU cache with an eviction strategy: when the number of segments exceeds the maximum number, or the total memory size of segment objects in the cache exceeds the maximum usage, it evicts the older segments. However, there is a piece of logic in the code that first reads the segment object into memory, assuming it occupies memory size A, then places the read segment object into the cache (at this point, the cache considers the segment object size to be A). It then reads the segment's Bloom filter from the file and assigns it to the segment's Bloom filter member variable, assuming the Bloom filter occupies memory size B. Thus, the total size of the segment object at this point is A+B. However, the cache does not update this size, leading to the actual size of the segment object stored in the cache (A+B) being larger than the size considered by the cache (A). When the number of segment objects in the cache increases to a certain extent, the used memory will surge dramatically. However, the cache does not perceive the size as reaching the eviction limit, so it does not evict the segment objects. In such cases, a memory leak issue arises.

Solution: Since each segment object only reads the Bloom filter once, the issue can be resolved by changing the logic from reading the segment, placing it into the cache, and then reading the Bloom filter to reading the segment, reading the Bloom filter, and then placing it into the cache.
2024-05-24 16:23:58 +08:00
682d72bf4d [fix](noexcept) Remove incorrect noexcept #35230 2024-05-24 16:23:58 +08:00
4b7608c2bf [fix](inverted index)Change index_id from int32 to int64 to avoid overflow (#35206)
Co-authored-by: Luennng <luennng@gmail.com>
2024-05-23 19:12:55 +08:00
a6f7747d29 [feature](datatype) add BE config to allow zero date (#34961)
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
2024-05-23 19:12:39 +08:00
eb49cd839b [refactor](datalake) return the error status instead of static_cast<void> (#34873)
Followup #34797
`static_cast<void>` has ignored the wrong status, some of them should make the query finished with error status, so replace `static_cast<void>`  with `RETURN_IF_ERROR`.

The following three scenarios need to be handled separately and cannot be simply replaced:
1. The outer function returns void;
2. Call status function inner constructors or destructors;
3. Call status function with best effort, and should ignore the wrong status.
2024-05-23 19:06:21 +08:00
0d2ab9d5c3 [fix](clean trash) Fix clean trash lost submit task (#35271) 2024-05-23 16:27:20 +08:00
Pxl
e962a7309b [Chore](runtime-filter) adjust some check and error msg on runtime filter (#35018) (#35251)
adjust some check and error msg on runtime filter
2024-05-23 11:20:02 +08:00
adc364a6fd [feature](Paimon) support deletion vector for Paimon naive reader (#34743) (#35241)
bp #34743
Co-authored-by: 苏小刚 <suxiaogang223@icloud.com>
2024-05-23 00:01:30 +08:00
3a5fb6265a [refactor](jdbc catalog) split trino jdbc executor (#34932) (#35176)
pick #34932
2024-05-22 19:09:57 +08:00
05a390e050 [refactor](jdbc catalog) split oceanbase jdbc executor (#34869) (#35175)
pick #34869
2024-05-22 19:09:35 +08:00
291cf57c54 [Configurations](multi-catalog) Add enable_parquet_filter_by_min_max and enable_orc_filter_by_min_max Session variables. (#35012) (#35164)
backport #35012
2024-05-22 19:06:12 +08:00
d63c3ae2d4 [bugfix](hive)fix testcase for viewfs for 2.1 #35178 2024-05-22 18:13:09 +08:00
72f2d0d449 [fix](memory) Allow flush memtable failed when process exceed memlimit #35150 2024-05-22 18:11:59 +08:00
c23384ff07 [fix](decimal) Fix long string casting to decimalv2 (#35121) 2024-05-22 14:32:29 +08:00
Pxl
84f7bfffe2 [Bug](bitmap-filter) fix empty bitmap when rf do merge (#34182)
fix empty bitmap when rf do merge
2024-05-22 14:29:50 +08:00
9d7c65b4d8 [fix](memory) Avoid frequently refresh cgroup memory info (#35083) (#35182)
pick #35083
2024-05-22 11:42:08 +08:00
f0b2f5ba36 [Fix](bug) agg limit contains null values may cause error result (#35180) 2024-05-22 10:57:57 +08:00
a8c24d7698 [Fix](function) fix overflow of date_add function (#35080)
fix overflow of date_add function
2024-05-22 10:02:59 +08:00
ced0093d74 [fix](mem_tracker] attach mem tracker in FragmentMgr::apply_filter (#35128) 2024-05-22 10:02:46 +08:00
b96148c9cd [Fix](function) fix days/weeks_diff result wrong on BE #35104
select days_diff('2024-01-01 00:00:00', '2023-12-31 23:59:59');
should be 0 but got 1 on BE.
2024-05-22 10:00:26 +08:00
2ed6a00fd1 [opt](memory) Add GlobalMemoryArbitrator and support ReserveMemory (#34985) (#35070) 2024-05-22 09:53:45 +08:00
11971eddb4 [atomicstatus](be) add atomic status to share state between multi thread (#35002) 2024-05-22 01:11:07 +08:00
9bc2c88384 [enhancement](memory) add exception check in page builder to avoid be oom during flush memtable (#35138) 2024-05-22 01:07:50 +08:00