Commit Graph

67 Commits

Author SHA1 Message Date
06451c4ff1 fix: infinit loop when handle exceed limit memory (#21556)
In some situation, _handle_mem_exceed_limit will alloc a large memory block, more than 5G. After add some log, we found that:

alloc memory was made in vector::insert_realloc
writers_to_reduce_mem's size is more than 8 million.
which indicated that an infinite loop was met in while (!tablets_mem_heap.empty()).
By reviewing codes, """ if (std::get<0>(tablet_mem_item)++ != std::get<1>(tablet_mem_item)) """ is wrong,
which must be """ if (++std::get<0>(tablet_mem_item) != std::get<1>(tablet_mem_item)) """.
In the original code, we will made ++ on end iterator, and then compare to end iterator, the behavior is undefined.
2023-07-06 14:34:29 +08:00
b2c4e51be1 [fix](load) delete lazy open DCheck when unkown load id (#21083) 2023-06-21 20:42:31 +08:00
a3342cb088 [improvement](load) avoid producing small segment (#20852)
avoid producing small segment
2023-06-19 18:34:44 +08:00
f2025b9eed [fix](memory) before compaction run, check memory exceed limit #20782 2023-06-14 14:20:48 +08:00
e801e3b737 [fix](memory) Fix crash at bthread_setspecific in brpc::Socket::CheckHealth() (#20450)
Only switch to bthread local when modifying the mem tracker in the thread context. No longer switches to bthread local by default when bthread starts
mem tracker increases brpc IOBufBlockMemory memory
remove thread mem tracker metrics
2023-06-08 19:48:19 +08:00
2db900b775 [fix](lazy_open) fix lazy open null point (#20540) 2023-06-07 17:56:31 +08:00
Pxl
8376e5eefb [Chore](build) add non-virtual-dtor, remove no-embedded-directive/no-zero-length-array (#20118)
add non-virtual-dtor, remove no-embedded-directive/no-zero-length-array
2023-05-29 14:42:47 +08:00
e0d9f7f955 [enhancement](load) add some profile items for load (#20141) 2023-05-29 09:54:03 +08:00
d5d47703fe [fix](memory) remove auto option in memory config and optimize memtracker logs #19706
fix mem_limit default value
memory_gc_sleep_time_s to memory_gc_sleep_time_ms
LoadChannelMgr::_handle_mem_exceed_limit process_mem_limit to process soft mem limit
fix query mem tracker print
2023-05-18 08:54:03 +08:00
f8ef25bb10 [enhancement](load) lazy-open necessary partitions when load (#18874) 2023-05-14 16:09:55 +08:00
c98829c94b [improvement](load) log time consumed by waiting flush (#19226) 2023-05-03 17:48:13 +08:00
3899c08036 [optimize](compile) remove unused template param from load channel (#18980)
* [optimize](compile) remove unused template param from load channel



---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-24 23:36:47 +08:00
8e4710079d [improvement](profile) Insert into add LoadChannel runtime profile (#18908)
TabletSink and LoadChannel in BE are M: N relationship,
Every once in a while LoadChannel will randomly return its own runtime profile to a TabletSink, so usually all LoadChannel runtime profiles are saved on each TabletSink, and the timeliness of the same LoadChannel profile saved on different TabletSinks is different, and each TabletSink will periodically send fe reports all the LoadChannel profiles saved by itself, and ensures to update the latest LoadChannel profile according to the timestamp.
2023-04-24 09:41:57 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
c704351273 [enhancement](memory) Refactor memory limit exceeded behavior (#18590)
No check mem tracker limit and no cancel task in mem hook, only in Allocator. This helps in clearer analysis of memory issues and reduces performance loss.
PODArray/hash table/arena memory allocation will use Allocator.

Optimize mem limit exceeded log printing

Optimize compilation time
2023-04-14 10:42:35 +08:00
6470ae58ea [enhancement](config) remove config load_process_max_memory_limit_bytes (#15686) 2023-01-31 21:36:34 +08:00
97fcad76f8 [enhancement](memtracker) Improve readability (#15716) 2023-01-16 16:30:35 +08:00
ef0e0cf68d [enhancement](load) refine the reduce memory policy when process memory is nearly full (#15685)
If process memory is almost full but data load don't consume more than 5% (50% * 10%) of total memory, we don't need to reduce memory of load jobs
2023-01-12 14:43:33 +08:00
4380f1ec54 [Enhancement](load) reduce memory by memory size of global delta writer (#14491) 2023-01-03 20:05:21 +08:00
b23d068281 [refactor](remove-non-vec) Remove non vec load from memtable and delta writer (#15517)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-12-30 21:22:58 +08:00
aa0f38f864 [chore](gutil) remove some gutil files and use c++ stl instead (#15357)
* [chore](gutil) remove some gutil files and use c++ stl instead

* fix

* fix
2022-12-26 21:25:09 +08:00
e1f0fa069c [enhancement](memory) Refactored process memory statistics periodically refresh, and fix catch bad_alloc (#14580) 2022-11-29 10:15:25 +08:00
72748c229a update (#14215) 2022-11-13 12:31:42 +08:00
33b50860c7 [improvement](load) release load channel actively when error occurs (#14218) 2022-11-13 12:31:15 +08:00
7682c08af0 [improvement](load) reduce memory in batch for small load channels (#14214) 2022-11-12 22:14:01 +08:00
0b945fe361 [enhancement](memtracker) Refactor mem tracker hierarchy (#13585)
mem tracker can be logically divided into 4 layers: 1)process 2)type 3)query/load/compation task etc. 4)exec node etc.

type includes

enum Type {
        GLOBAL = 0,        // Life cycle is the same as the process, e.g. Cache and default Orphan
        QUERY = 1,         // Count the memory consumption of all Query tasks.
        LOAD = 2,          // Count the memory consumption of all Load tasks.
        COMPACTION = 3,    // Count the memory consumption of all Base and Cumulative tasks.
        SCHEMA_CHANGE = 4, // Count the memory consumption of all SchemaChange tasks.
        CLONE = 5, // Count the memory consumption of all EngineCloneTask. Note: Memory that does not contain make/release snapshots.
        BATCHLOAD = 6,  // Count the memory consumption of all EngineBatchLoadTask.
        CONSISTENCY = 7 // Count the memory consumption of all EngineChecksumTask.
    }
Object pointers are no longer saved between each layer, and the values of process and each type are periodically aggregated.

other fix:

In [fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe #13528, I tried to separate the memory that was manually abandoned in the query from the orphan mem tracker. But in the actual test, the accuracy of this part of the memory cannot be guaranteed, so put it back to the orphan mem tracker again.
2022-11-08 09:52:33 +08:00
32a029d9dc [enhancement](memtracker) Refactor load channel + memtable mem tracker (#13795) 2022-11-03 09:47:12 +08:00
d8ec53c83f [enhancement](load) avoid duplicate reduce on same TabletsChannel #12975
In the policy changed by PR #12716, when reaching the hard limit, there might be multiple threads can pick same LoadChannel and call reduce_mem_usage on same TabletsChannel. Although there's a lock and condition variable can prevent multiple threads to reduce mem usage concurrently, but they still can do same reduce-work on that channel multiple times one by one, even it's just reduced.
2022-09-27 22:03:08 +08:00
b14b178928 [enhancement](memory) Trigger load channel flush based on process physical memory to avoid OOM #12960
When the physical memory of the process reaches 90% of the mem limit, trigger the load channel mgr to brush down
The default value of be.conf mem_limit is changed from 90% to 80%, and stability is the priority.
Fix deadlock in arena_locks in BufferPool::BufferAllocator::ScavengeBuffers and _lock in DebugString
2022-09-27 09:07:38 +08:00
3bb920ba54 [Enhancement](load) Refine the load channel flush policy on mem limit (#12716)
1. Remove single load channel mem limit, only use load channel mgr mem limit
2. Default load channel mgr mem limit from 50% to 80%
3. load channel mgr add soft mem limit. When the soft limit is exceeded, other threads will not hang, only current thread triggers flush
4. When exceed load channel mgr mem limit, find a load channel with the largest mem usage, continue to find a tablet channel with the largest mem usage, and try to flush 1/3 of the mem usage of this tablet channel.
2022-09-24 10:01:13 +08:00
942b31038f [fix](memory) Fix BE OOM when load -238 fail (#12666)
When the flush is triggered when the load channel exceeds the mem limit, if the flush fails, an error message is returned and the load is terminated.

Usually flush failure is -238 error code. Because the memtable is frequently flushed after the load channel exceeds the mem limit, the number of segments exceeds the max value.
2022-09-17 00:17:53 +08:00
73d8f5901d fix mem tracker limiter (#11376) 2022-08-01 09:44:04 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
ec5996f1f8 [improvement]do not acquire mutex in metric hook (#10941) 2022-07-18 08:52:24 +08:00
eb25df5a2c [fix] (mem tracker) Fix inaccurate mem tracker leads to load OOM (#10409)
* fix load tracker

* fix comment
2022-06-25 14:13:02 +08:00
85362a907e [fix](mem tracker) Fix some memory leaks, inaccurate statistics, core dump, deadlock bugs (#10072)
1. Fix the memory leak. When the load task is canceled, the `IndexChannel` and `NodeChannel` mem trackers cannot be destructed in time.
2. Fix Load task being frequently canceled by oom and inaccurate `LoadChannel` mem tracker limit, and rewrite the variable name of `mem limit` in `LoadChannel`.
3. Fix core dump, when logout task mem tracker, phmap erase fails, resulting in repeated logout of the same tracker.
4. Fix the deadlock, when add_child_tracker mem limit exceeds, calling log_usage causes `_child_trackers_lock` deadlock.
5. Fix frequent log printing when thread mem tracker limit exceeds, which will affect readability and performance.
6. Optimize some details of mem tracker display.
2022-06-14 21:38:37 +08:00
b34ed43ec9 [feature-wip] (memory tracker) (step6, End) Fix some details (#9301)
1. Fix LoadTask, ChunkAllocator, TabletMeta, Brpc, the accuracy of memory track.
2. Modified some MemTracker names, deleted some unnecessary trackers, and improved readability.
3. More powerful MemTracker debugging capabilities.
4. Avoid creating TabletColumn temporary objects and improve BE startup time by 8%.
5. Fix some other details.
2022-05-10 18:17:09 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- sh build-support/clang-format.sh  to  clang-format all c++ code
2022-04-29 16:14:22 +08:00
d330bc3806 [Vectorized](stream-load-vec) Support stream load in vectorized engine (#8709) (#9280)
Implement vectorized stream load.
Added fe configuration option `enable_vectorized_load` to enable vectorized stream load.

    Co-authored-by: tengjp@outlook.com
    Co-authored-by: mrhhsg@gmail.com
    Co-authored-by: minghong.zhou@163.com
    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
2022-04-29 09:50:51 +08:00
290366787c [refactor] refactor code, replace some file with stl libs (#8759)
1. replace ConditionVariables with std::condition_variable
2. repalace Mutex with std::mutex
3. repalce MonoTime with std::chrono
2022-04-13 09:55:29 +08:00
519305cb22 [feature-wip] (memory tracker) (step4) Switch TLS mem tracker to separate more detailed memory usage (#8669)
Based on #8605, Separate out the memory usage of each operator from the Query/Load/StorageEngine mem tracker.
2022-04-08 09:02:26 +08:00
eeae516e37 [Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476)
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G

Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
2022-03-20 23:06:54 +08:00
e17aef9467 [refactor] refactor the implement of MemTracker, and related usage (#8322)
Modify the implementation of MemTracker:
1. Simplify a lot of useless logic;
2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing;
3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes;
4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently;
6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
2022-03-11 22:04:23 +08:00
26289c28b0 [fix](load)(compaction) Fix NodeChannel coredump bug and modify some compaction logic (#8072)
1. Fix the problem of BE crash caused by destruct sequence. (close #8058)
2. Add a new BE config `compaction_task_num_per_fast_disk`

    This config specify the max concurrent compaction task num on fast disk(typically .SSD).
    So that for high speed disk, we can execute more compaction task at same time,
    to compact the data as soon as possible

3. Avoid frequent selection of unqualified tablet to perform compaction.
4. Modify some log level to reduce the log size of BE.
5. Modify some clone logic to handle error correctly.
2022-02-17 10:52:08 +08:00
ef984a6a72 [improvement](load) Improve load fault tolerance (#7674)
Currently, if we encounter a problem with a replica of a tablet during the load process,
such as a write error, rpc error, -235, etc., it will cause the entire load job to fail,
which results in a significant reduction in Doris' fault tolerance.

This PR mainly changes:

1. refined the judgment of failed replicas in the load process, so that the failure of a few replicas will not affect the normal completion of the load job.
2. fix a bug introduced from #7754 that may cause BE coredump
2022-01-20 09:23:21 +08:00
5f8d91257b [improvement](routine-load) Reduce the probability that the routine load task rpc timeout (#7754)
If an load task has a relatively short timeout, then we need to ensure that
each RPC of this task does not get blocked for a long time.
And an RPC is usually blocked for two reasons.

1. handling "memory exceeds limit" in the RPC
    
    If the system finds that the memory occupied by the load exceeds the threshold,
    it will select the load channel that occupies the most memory and flush the memtable in it.
    this operation is done in the RPC, which may be more time consuming.

2. close the load channel

    When the load channel receives the last batch, it will end the task.
    It will wait for all memtables flushes to finish synchronously. This process is also time consuming.

Therefore, this PR solves this problem by.

1. Use timeout to determine whether it is a high-priority load task

    If the timeout of an load task is relatively short, then we mark it as a high-priority task.

2. not processing "memory exceeds limit" for high priority tasks
3. use a separate flush thread to flush memtable for high priority tasks.
2022-01-16 10:41:31 +08:00
d57c2344e1 [MemTracker] Refactored the hierarchical structure of memtracker (#5956)
To avoid showing too many memtracker on BE web pages.
The MemTracker level now has 3 levels: OVERVIEW, TASK and VERBOSE.

OVERVIEW Mainly used for main memory consumption module such as Query/Load/Metadata.
TASK is mainly used to record the memory overhead of a single task such as a single query, load, and compaction task.
VERBOSE is used for other more detailed memtrackers.
2021-06-16 09:44:24 +08:00
be733cfa9c [Metrics] Add some large memtrackers' metric (#5614)
MemTracker can provide memory consumption for us to find out which
module consume more memory, but it's just a current value, this patch
add metrics for some large memory consumers, then we can find out
which module consume more memory in timeline, it would be useful to
troubleshoot OOM problems and optimize configs.
2021-04-21 09:15:04 +08:00
b423274f17 [Enhance] Make MemTracker more accurate (#5515) (#5516)
* [Enhance] Make MemTracker more accurate (#5515)
 This PR main about:
 1. Improve the readability of MemTrackers' name
 2. Add the MemTracker of:
    * Load
    * Compaction
    * SchemaChange
    * StoragePageCache
    * TabletManager
 3. Change SchemaChange to a Singleon

* revise some code for Code Review

* change the name of mem_tracker

* keep reader_context have the same lifetime of rowset_reader in schema change.

* change vlog notice to log(warning) in schema change
2021-04-08 09:14:55 +08:00
689602e686 [Enhancement] Support Pallralel Merge In Exchange Node (#5468)
Support Parallel Merge In Exchange Node
2021-03-11 22:34:18 +08:00