This CL mainly changes:
1. the `storage_page_cache_limit` is based on config `mem_limit`
the default is 20% of `mem_limit`.
2. the `buffer_pool_limit` is based on config `mem_limit`
the default is 20% of `mem_limit`.
3. the `buffer_pool_clean_pages_limit` is based on config `buffer_pool_limit`
the default is 50% of `buffer_pool_limit`
4. Fix some show bugs of lru cache hit ratio and usage ratio
5. Fix a create view bug that `notEvalNondeterministicFunction` should be reset after analyze.
1. `StorageEngine::_delete_tablets_on_unused_root_path` will try to obtain tablet shard write lock in `TabletManager`
```
StorageEngine::_delete_tablets_on_unused_root_path
TabletManager::drop_tablets_on_error_root_path
obtain each tablet shard's write lock
```
2. `TabletManager::build_all_report_tablets_info` and other methods will obtain tablet shard read lock frequently.
So, `StorageEngine::_delete_tablets_on_unused_root_path` will hold `_store_lock` for a long time.
This will make it difficult for other threads to get write `_store_lock`, such as `StorageEngine::get_stores_for_create_tablet`
`drop_tablets_on_error_root_path` is a small probability event, `TabletManager::drop_tablets_on_error_root_path` should return when its param `tablet_info_vec` is empty
This is part of the array type support and has not been fully completed.
The following functions are implemented
1. fe array type support and implementation of array function, support array syntax analysis and planning
2. Support import array type data through insert into
3. Support select array type data
4. Only the array type is supported on the value lie of the duplicate table
this pr merge some code from #4655#4650#4644#4643#4623#2979
When parsing memory parameters in `ParseUtil::parse_mem_spec`, convert the percentage to `double` instead of `int`.
The currently affected parameters include `mem_limit` and `storage_page_cache_limit`
1. relocation R_X86_64_32 against `__gxx_personality_v0' can not be used when making a shared object; recompile with -fPIC
2. warning: the use of `tmpnam' is dangerous, better use `mkstemp'
3. Death tests use fork(), which is unsafe particularly in a threaded context. For this test, Google Test couldn't detect the number of threads.
* [doris-1008] support backup and restore directly to cloud storage via aws s3 protocol
* Internal][S3DirectAccess] Support backup,restore,load,export directlyconnect to s3
1. Support load and export data from/to s3 directly.
2. Add a config to auto convert broker access to s3 acces when available
Change-Id: Iac96d4b3670776708bc96a119ff491db8cb4cde7
(cherry picked from commit 2f03832ca52221cc7436069b96c45c48c4bc7201)
* [Internal][S3DirectAccess] File path glob compatible with broker
Change-Id: Ie55e07a547aa22c6fa8d432ca926216c10384e68
(cherry picked from commit d4fb25544c0dc06d23e1ada571ec3f8edd4ba56f)
* [internal] [doris-1008] fix log4j class not found
Change-Id: I468176aca0d821383c74ee658d461aba9e7d5be3
(cherry picked from commit 029adaa9d6ded8503acbd6644c1519456f3db232)
* add poms
Co-authored-by: yangzhengguo01 <yangzhengguo01@baidu.com>
#5146
Add histogram metrics into util/metrics.h. The data structure of histogram is implemented in util/histogram.h,
which could also be used in other situations that in need of histogram. Unit tests added as well.
There are some long loops and sleeps in unit tests, it will cost a
very long time to run all unit tests, especially run in TSAN mode.
This patch speed up unit tests by shortening long loops and sleeps,
on my environment all unit tests finished in 1 minite. It's useful
to do basic functional unit tests.
You can switch to run in this mode by adding a new environment variable
'DORIS_ALLOW_SLOW_TESTS'. For example, you can set:
export DORIS_ALLOW_SLOW_TESTS=1
and also you can disable it by setting:
export DORIS_ALLOW_SLOW_TESTS=0
After tablet level metrics is supported, the http metrics API may response
a very large body when a BE holds a large number of tablets, and cause heavy
network traffic.
This patch introduce http content compression to reduce network traffic.
Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet,
but we have no insight about tablets in the cluster.
This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `.
However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request,
and not return tablet level metrics by default.
Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet,
but we have no insight about tablets in the cluster.
This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `.
However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request,
and not return tablet level metrics by default.
Redesign metrics to 3 layers:
MetricRegistry - MetricEntity - Metrics
MetricRegistry : the register center
MetricEntity : the entity registered on MetricRegistry. Generally a MetricRegistry can be registered on several
MetricEntities, each of MetricEntity is an independent entity, such as server, disk_devices, data_directories, thrift
clients and servers, and so on.
Metric : metrics of an entity. Such as fragment_requests_total on server entity, disk_bytes_read on a disk_device entity,
thrift_opened_clients on a thrift_client entity.
MetricPrototype: the type of a metric. MetricPrototype is a global variable, can be shared by the same metrics across
different MetricEntities.
We make all MemTrackers shared, in order to show MemTracker real-time consumptions on the web.
As follows:
1. nearly all MemTracker raw ptr -> shared_ptr
2. Use CreateTracker() to create new MemTracker(in order to add itself to its parent)
3. RowBatch & MemPool still use raw ptrs of MemTracker, it's easy to ensure RowBatch & MemPool destructor exec
before MemTracker's destructor. So we don't change these code.
4. MemTracker can use RuntimeProfile's counter to calc consumption. So RuntimeProfile's counter need to be shared
too. We add a shared counter pool to store the shared counter, don't change other counters of RuntimeProfile.
Note that, this PR doesn't change the MemTracker tree structure. So there still have some orphan trackers, e.g. RowBlockV2's MemTracker. If you find some shared MemTrackers are little memory consumption & too time-consuming, you could make them be the orphan, then it's fine to use the raw ptr.
from/to_base64 may return incorrect value when the value is null #4130
remove the duplicated base64 code
fix the base64 encoded string length is wrong, and this will cause the memory error
cpp-mustache is a C++ implementation of a Mustache template engine
with support for RapidJSON, and in order to simplify RapidJSON object
building, we introduce class EasyJson from Apache Kudu.
Add a JSON format for existing metrics like this.
```
{
"tags":
{
"metric":"thread_pool",
"name":"thrift-server-pool",
"type":"active_thread_num"
},
"unit":"number",
"value":3
}
```
I add a new JsonMetricVisitor to handle the transformation.
It's not to modify existing PrometheusMetricVisitor and SimpleCoreMetricVisitor.
Also I add
1. A unit item to indicate the metric better
2. Cloning tablet statistics divided by database.
3. Use white space to replace newline in audit.log
Fix some unittest failed due to glog, this may be we change the ut build dir,and the log path is not exist in new build dir, so we change the log from file to stdout
Ref https://github.com/apache/incubator-doris/issues/3566
Introduce trace utility from Kudu to BE. This utility has been widely used in Kudu,
Impala also import this trace utility.
This trace util is used for tracing each phases in a thread, and can be dumped to
string to see each phases' time cost and diagnose which phase cost more time.
This util store a Trace object as a threadlocal variable, we can add trace entries
which record the current file name, line number, user specified symbols and
timestamp to this object, and it's able to add some counters to this Trace
object. And then, it can be dumped to human readable string.
There are some helpful macros defined in trace.h, here is a simple example for
usage:
```
scoped_refptr<Trace> t1(new Trace); // New 2 traces
scoped_refptr<Trace> t2(new Trace);
t1->AddChildTrace("child_trace", t2.get()); // t1 add t2 as a child named "child_trace"
TRACE_TO(t1, "step $0", 1); // Explicitly trace to t1
usleep(10);
// ... do some work
ADOPT_TRACE(t1.get()); // Explicitly adopt to trace to t1
TRACE("step $0", 2); // Implicitly trace to t1
{
// The time spent in this scope is added to counter t1.scope_time_cost
TRACE_COUNTER_SCOPE_LATENCY_US("scope_time_cost");
ADOPT_TRACE(t2.get()); // Adopt to trace to t2 for the duration of the current scope
TRACE("sub start"); // Implicitly trace to t2
usleep(10);
// ... do some work
TRACE("sub before loop");
for (int i = 0; i < 10; ++i) {
TRACE_COUNTER_INCREMENT("iterate_count", 1); // Increase counter t2.iterate_count
MicrosecondsInt64 start_time = GetMonoTimeMicros();
usleep(10);
// ... do some work
MicrosecondsInt64 end_time = GetMonoTimeMicros();
int64_t dur = end_time - start_time;
// t2's simple histogram metric with name prefixed with "lbm_writes"
const char* counter = BUCKETED_COUNTER_NAME("lbm_writes", dur);
TRACE_COUNTER_INCREMENT(counter, 1);
}
TRACE("sub after loop");
}
TRACE("goodbye $0", "cruel world"); // Automatically restore to trace to t1
std::cout << t1->DumpToString(Trace::INCLUDE_ALL) << std::endl;
```
output looks like:
```
0514 02:16:07.988054 (+ 0us) trace_test.cpp:76] step 1
0514 02:16:07.988112 (+ 58us) trace_test.cpp:80] step 2
0514 02:16:07.988863 (+ 751us) trace_test.cpp:103] goodbye cruel world
Related trace 'child_trace':
0514 02:16:07.988120 (+ 0us) trace_test.cpp:85] sub start
0514 02:16:07.988188 (+ 68us) trace_test.cpp:88] sub before loop
0514 02:16:07.988850 (+ 662us) trace_test.cpp:101] sub after loop
Metrics: {"scope_time_cost":744,"child_traces":[["child_trace",{"iterate_count":10,"lbm_writes_lt_1ms":10}]]}
```
Exclude the original source code, this patch
do the following work to adapt to Doris:
- Rename "kudu" namespace to "doris"
- Update some names to the existing function names in Doris, i.g. strings::internal::SubstituteArg::kNoArg -> strings::internal::SubstituteArg::NoArg
- Use doris::SpinLock instead of kudu::simple_spinlock which hasn't been imported
- Use manual malloc() and free() instead of kudu::Arena which hasn't been imported
- Use manual rapidjson::Writer instead of kudu::JsonWriter which hasn't been imported
- Remove all TRACE_EVENT related unit tests since TRACE_EVENT is not imported this time
- Update CMakeLists.txt
NOTICE(#3622):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
LSAN detected errors have been fixed by a prior pathch (#3326), but
there are still some ASAN detected errors.
This patch try to fix these errors to make Doris BE more robustness.
And then we can add CI run in LSAN/ASAN mode to detect memory errors
as early as possible.
We can observe the workload of BE, and also it's a way to check
whether there is any problem in BE, like some container increase
too large and lead to OOM.
This patch add the following metrics:
```
Name Description
rowset_count_generated_and_in_use The total count of rowset id generated and in use since BE last start
unused_rowsets_count The total count of unused rowset waiting to be GC
broker_count The total count of brokers in management
data_stream_receiver_count The total count of data stream receivers in management
fragment_endpoint_count The total count of fragment endpoints of data stream in management, should always equal to data_stream_receiver_count
active_scan_context_count The total count of active scan contexts
plan_fragment_count The total count of plan fragments in executing
load_channel_count The total count of load channels in management
result_buffer_block_count The total count of result buffer blocks for queries, each block has a limited queue size (default 1024)
result_block_queue_count The total count of queues for fragments, each queue has a limited size (default 20, by config::max_memory_sink_batch_count)
routine_load_task_count The total count of routine load tasks in executing
small_file_cache_count The total count of cached small files' digest info
stream_load_pipe_count The total count of stream load pipes, each pipe has a limited buffer size (default 1M)
tablet_writer_count The total count of tablet writers
brpc_endpoint_stub_count The total count of brpc endpoints
```