Commit Graph

208 Commits

Author SHA1 Message Date
f3aea7f0f0 [Enhancement](status) Unify error code and enable customed err msg for BE internal errors (#14744) 2022-12-11 23:33:18 +08:00
12304bc0ee [Pipeline](exec) Support pipeline exec engine (#14736)
Co-authored-by: Lijia Liu <liutang123@yeah.net>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: Pxl <952130278@qq.com>
Co-authored-by: shee <13843187+qzsee@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>

## Problem Summary:

### 1. Design

DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-027%3A+Support+Pipeline+Exec+Engine

### 2. How to use:

Set the environment variable `set enable_pipeline_engine = true; `
2022-12-02 17:11:34 +08:00
22883e7e08 [fuzzy](test) be fuzzy conf (#14654) 2022-11-29 19:38:40 +08:00
e1f0fa069c [enhancement](memory) Refactored process memory statistics periodically refresh, and fix catch bad_alloc (#14580) 2022-11-29 10:15:25 +08:00
0702277196 [improvement](tcmalloc) add moderate mode and avoid oom with a lot of cache (#14374)
ReleaseToSystem aggressively when there are little free memory.
2022-11-28 20:17:51 +08:00
21416f9947 [enhancement](memory) Support Jemalloc metrics and default allocator changed to Jemalloc (#14384) 2022-11-18 21:02:54 +08:00
bd5a593403 [enhancement](memtracker) Use proc/meminfo MemAvailable to control memory and optimize MemTracker log printing (#14335) 2022-11-17 22:46:07 +08:00
6da2948283 [feature-wip](multi-catalog) support iceberg v2(step 1) (#13867)
Support position delete(part of).
2022-11-17 17:56:48 +08:00
cffdeff4ec [fix](memory) Fix memory leak by calling boost::stacktrace (#14269)
boost::stacktrace::stacktrace() has memory leak, so use glog internal func to print stacktrace.
The reason for the memory leak of boost::stacktrace is that a state is saved in the thread local of each thread but not actively released. The test found that each thread leaked about 100M after calling boost::stacktrace.
refer to:
boostorg/stacktrace#118
boostorg/stacktrace#111
2022-11-15 08:58:57 +08:00
12652ebb0e [UDF](java udf) using config to enable java udf instead of macro at compile time (#14062)
* [UDF](java udf) useing config to enable java udf instead of macro at compile time
2022-11-11 09:03:52 +08:00
43eb946543 [feature](table-valued-function)S3 table valued function supports parquet/orc/json file format #14130
S3 table valued function supports parquet/orc/json file format.
For example: parquet format
2022-11-10 10:33:12 +08:00
0b945fe361 [enhancement](memtracker) Refactor mem tracker hierarchy (#13585)
mem tracker can be logically divided into 4 layers: 1)process 2)type 3)query/load/compation task etc. 4)exec node etc.

type includes

enum Type {
        GLOBAL = 0,        // Life cycle is the same as the process, e.g. Cache and default Orphan
        QUERY = 1,         // Count the memory consumption of all Query tasks.
        LOAD = 2,          // Count the memory consumption of all Load tasks.
        COMPACTION = 3,    // Count the memory consumption of all Base and Cumulative tasks.
        SCHEMA_CHANGE = 4, // Count the memory consumption of all SchemaChange tasks.
        CLONE = 5, // Count the memory consumption of all EngineCloneTask. Note: Memory that does not contain make/release snapshots.
        BATCHLOAD = 6,  // Count the memory consumption of all EngineBatchLoadTask.
        CONSISTENCY = 7 // Count the memory consumption of all EngineChecksumTask.
    }
Object pointers are no longer saved between each layer, and the values of process and each type are periodically aggregated.

other fix:

In [fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe #13528, I tried to separate the memory that was manually abandoned in the query from the orphan mem tracker. But in the actual test, the accuracy of this part of the memory cannot be guaranteed, so put it back to the orphan mem tracker again.
2022-11-08 09:52:33 +08:00
32fea672b0 [chore](gutil) remove some gutil macros and solve some macro conflict with brpc (#13954)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-11-07 13:39:52 +08:00
27549564a7 [feature](table-valued-function) Support S3 tvf (#13959)
This pr does three things:

1. Modified the framework of table-valued-function(tvf).
2. be support `fetch_table_schema` rpc.
3. Implemented `S3(path, AK, SK, format)` table-valued-function.
2022-11-06 11:04:26 +08:00
948e080b31 [minor](error msg) Fix wrong error message (#13950) 2022-11-04 13:49:46 +08:00
32a029d9dc [enhancement](memtracker) Refactor load channel + memtable mem tracker (#13795) 2022-11-03 09:47:12 +08:00
8b3afd431e [improvement](memory) simplify memory config related to tcmalloc (#13781)
There are several configs related to tcmalloc, users do know how to config them. Actually users just want two modes, performance or compact, in performance mode, users want doris run query and load quickly while in compact mode, users want doris run with less memory usage.

If we want to config tcmalloc individually, we can use env variables which are supported by tcmalloc.
2022-11-01 21:45:19 +08:00
0134e9d2f4 [Improvement](runtime filter) Reduce merging time for bloom filter (#13668) 2022-10-27 00:02:05 +08:00
295d887cf5 [improvement](thread) set name for priority thread pool (#13552) 2022-10-26 09:32:15 +08:00
9dc5dd382a [enhancement](memtracker) Fix Brpc mem count and refactored thread context macro (#13469) 2022-10-21 12:01:38 +08:00
736d113700 [fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe #13528 2022-10-21 08:30:30 +08:00
2745a88814 [enhancement](memtracker) Fix brpc causing query mem tracker to be inaccurate #13401 2022-10-19 12:28:20 +08:00
125def5102 [enhancement](macOS M1) Support building from source on macOS (M1) (#13195)
# Proposed changes

This PR fixed lots of issues when building from source on macOS with Apple M1 chip.

## ATTENTION

The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime:
1. Some errors with memory tracker occur when BE (RELEASE) starts.
2. Some UT cases fail.
...

Temporarily, the following changes are made on macOS to start BE successfully.
1. Disable memory tracker.
2. Use tcmalloc instead of jemalloc.

This PR kicks off the job. Guys who are interested in this job can continue to fix these runtime issues.

## Use case

```shell
./build.sh -j 8 --be --clean

cd output/be/bin
ulimit -n 60000
./start_be.sh --daemon
```

## Something else

It takes around _**10+**_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the  development experience on macOS greatly when we finish the adaptation job.
2022-10-18 13:10:13 +08:00
87a6b1a13b [enhancement](memtracker) Fix bthread local consume mem tracker (#13368)
Previously, bthread_getspecific was called every time bthread local was used. In the test at #10823, it was found that frequent calls to bthread_getspecific had performance problems.

So a cache is implemented on pthread local based on the btls key, but the btls key cannot correctly sense bthread switching.

So, based on bthread_self to get the bthread id to implement the cache.
2022-10-17 18:31:07 +08:00
c494ca0ed4 [enhancement](memtracker) Print query memory usage log every second when memory_verbose_track is enabled (#13302) 2022-10-13 09:11:23 +08:00
c5f802b93c [Bug](libjvm) reorder initialization of JNI (#13165) 2022-10-08 18:53:47 +08:00
b14b178928 [enhancement](memory) Trigger load channel flush based on process physical memory to avoid OOM #12960
When the physical memory of the process reaches 90% of the mem limit, trigger the load channel mgr to brush down
The default value of be.conf mem_limit is changed from 90% to 80%, and stability is the priority.
Fix deadlock in arena_locks in BufferPool::BufferAllocator::ScavengeBuffers and _lock in DebugString
2022-09-27 09:07:38 +08:00
70ab9cb43e [feature](http) refactor version info and add new http api for get version info (#12513)
Refactor version info and add new http api for get version info
2022-09-22 10:53:04 +08:00
b41eaa5ac0 [fix](memtracker) Introduce orphan mem tracker to verify memory tracking accuracy (#12794)
The mem hook consumes the orphan tracker by default. If the thread does not attach other trackers, by default all consumption will be passed to the process tracker through the orphan tracker.

In real time, consumption of all other trackers + orphan tracker consumption = process tracker consumption.

Ideally, all threads are expected to attach to the specified tracker, so that "all memory has its own ownership", and the consumption of the orphan mem tracker is close to 0, but greater than 0.
2022-09-21 15:47:10 +08:00
3bb042e45c [fix](memtracker) Process physical mem check does not include tc/jemalloc allocator cache (#12688)
tcmalloc/jemalloc allocator cache does not participate in the mem check as part of the process physical memory.

because new/malloc will trigger mem hook when using tcmalloc/jemalloc allocator cache, but it may not actually alloc physical memory, which is not expected in mem hook fail.

in addition:

The value of tcmalloc/jemalloc allocator cache is used as a mem tracker, the parent is the process mem tracker, which is updated every 1s.
Modify the process default mem_limit to 90%. expect mem tracker to effectively limit the memory usage of the process.
2022-09-17 11:31:01 +08:00
42b6532131 remove gc and fix print (#12682) 2022-09-17 00:16:15 +08:00
e7aa131506 [enhancement](tcmalloc) add aggressive_memory_decommit conf and make it disable (#12436)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-09-08 08:37:16 +08:00
8370115cf6 [enhancement](memtracker) Improve performance of tracking real physical memory of PODArray #12168 2022-08-30 10:22:12 +08:00
db07e51cd3 [refactor](status) Refactor status handling in agent task (#11940)
Refactor TaggableLogger
Refactor status handling in agent task:
Unify log format in TaskWorkerPool
Pass Status to the top caller, and replace some OLAPInternalError with more detailed error message Status
Premature return with the opposite condition to reduce indention
2022-08-29 12:06:01 +08:00
b1fd701493 [fix](memtracker) Improve memory tracking accuracy for exec nodes (#11947) 2022-08-22 08:56:05 +08:00
b300b4faa0 [enhancement](memtracker) Optimize readability of mem exceed limit error message #11877 2022-08-18 14:39:41 +08:00
d2bb3ad08e [fix](memtracker) Fix core in logout task mem tracker (#11797) 2022-08-16 11:28:06 +08:00
b36680796f [optimization] (be-log) modify the backendservice log (#11689)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-08-12 11:52:24 +08:00
b892dfdbbd [Improvement](regresstion test) Fix regression test case failure for ASAN build (#11400)
* [improvement](regresstion test) Improve performance of ASAN build by using -O3 and fix mem limit exceed error for nereids test cases

* exclude tpcds_sf1 q72 for ASAN build because this query takes too long time
2022-08-03 11:19:50 +08:00
f730a048b1 [feature-wip](load) Support single replica load (#10298)
During load process, the same operation are performed on all replicas such as sort and aggregation,
which are resource-intensive.
Concurrent data load would consume much CPU and memory resources.
It's better to perform write process (writing data into MemTable and then data flush) on single replica
and synchronize data files to other replicas before transaction finished.
2022-08-02 11:44:18 +08:00
5c1cd058f2 [Feature] Add interface to check tablet segment lost (#10711)
Co-authored-by: weizuo <weizuo@xiaomi.com>
2022-08-02 09:40:04 +08:00
b35daf0a04 [improvement](light-schema-change) Support tablet schema cache (#11131) 2022-08-01 12:18:00 +08:00
73d8f5901d fix mem tracker limiter (#11376) 2022-08-01 09:44:04 +08:00
d3c88471ad [tracing] Support opentelemtry collector. (#10864)
* [tracing] Support opentelemtry collector.

1. support for exporting traces to multiple distributed tracing system via collector;
2. support using collector to process traces.
2022-07-29 16:49:40 +08:00
b6bdb3bdbc [fix] (mem tracker) Fix MemTracker accuracy (#11190) 2022-07-27 18:59:24 +08:00
e210db426c [opt] stop script opt (#11183) 2022-07-27 11:32:26 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
d5fa66d9a3 [Enhancement] [Memory] Limit memory usage use process actual physical memory (#10924) 2022-07-19 11:08:39 +08:00
cc7c31b080 [Bug](be) fix be coredump when receive singal(#10903). (#10953) 2022-07-18 15:23:51 +08:00
b04a791895 [Enhancement] support compile with jemalloc (#10542)
A test feature to use jemalloc as default malloc.
2022-07-11 12:15:35 +08:00