Commit Graph

70 Commits

Author SHA1 Message Date
d5ea677282 [feature](tracing) Support query tracing to improve doris observability by introducing OpenTelemetry. (#10533)
The collection of query traces is implemented in fe and be, and the spans are exported to zipkin.
DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-012%3A+Introduce+opentelemetry
2022-07-09 15:50:40 +08:00
4ec6e3ee81 [refactor] Remove debug action since it is never used. (#10484)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-06-29 20:37:51 +08:00
26bc462e1c [feature-wip] (memory tracker) (step5) Fix track bthread, fix track vectorized query (#9145)
1. fix track bthread
- Bthread, a high performance M:N thread library used by brpc. In Doris, a brpc server response runs on one bthread, possibly on multiple pthreads. Currently, MemTracker consumption relies on pthread local variables (TLS).
- This caused pthread TLS MemTracker confusion when switching pthread TLS MemTracker in brpc server response. So replacing pthread TLS with bthread TLS in the brpc server response saves the MemTracker.
Ref: 731730da85/docs/en/server.md (bthread-local)

2. fix track vectorized query
- Added track mmap. Currently, mmap allocates memory in many places of the vectorized execution engine.
- Refactored ThreadContext to avoid dependency conflicts and make it easier to debug.
- Fix some bugs.
2022-04-27 20:34:02 +08:00
869fdff2f0 [refactor] add reference path for source file from impala (#9115)
According to the requirements of the APLv2, the referenced code needs to be marked with the path of the source code.
2022-04-20 12:29:57 +08:00
c71ffc01de [Refactor] Cleanup some unused include (#9063) 2022-04-18 09:52:31 +08:00
eeae516e37 [Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476)
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G

Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
2022-03-20 23:06:54 +08:00
e17aef9467 [refactor] refactor the implement of MemTracker, and related usage (#8322)
Modify the implementation of MemTracker:
1. Simplify a lot of useless logic;
2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing;
3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes;
4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently;
6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
2022-03-11 22:04:23 +08:00
09bfb8b9d3 [fix] (rpc-udf) Fixed the problem that the query could not be interrupted (#8248)
if an error occurred in the rpc server during the execution of rpc-udf.
Add java,cpp,python demo of rpc-udf server
2022-03-03 09:30:03 +08:00
6e8d52f3fc [fix](stream-load) fix bug that stream load may be blocked with unqualified data (#8176)
Co-authored-by: morningman <chenmingyu@baidu.com>
2022-02-22 09:26:23 +08:00
d0ee101c2f [refactor] (runtime)tidy up the plan_fragment_executor codes (#8110)
Co-authored-by: zuochunwei <zuochunwei@meituan.com>
2022-02-22 09:20:27 +08:00
26289c28b0 [fix](load)(compaction) Fix NodeChannel coredump bug and modify some compaction logic (#8072)
1. Fix the problem of BE crash caused by destruct sequence. (close #8058)
2. Add a new BE config `compaction_task_num_per_fast_disk`

    This config specify the max concurrent compaction task num on fast disk(typically .SSD).
    So that for high speed disk, we can execute more compaction task at same time,
    to compact the data as soon as possible

3. Avoid frequent selection of unqualified tablet to perform compaction.
4. Modify some log level to reduce the log size of BE.
5. Modify some clone logic to handle error correctly.
2022-02-17 10:52:08 +08:00
ef233701b3 [feature](vec)(load) Support vtablet sink to enable insert into by using vec query engine (#7957)
Support vtablet sink to enable insert into query in vec query engine
2022-02-08 11:04:09 +08:00
5fc0a9f40d [improvement](Load) Cancel the load job ASAP when encounter unqualified data (#6319)
This PR mainly changes:

1. Help to Cancel the load job ASAP when encounter unqualified data.
    Solution is described in #6318 .
    Also replace some std::stringstream with fmt::memory_buffer to avoid performance issues.

2. fix a NPE bug when create user with empty host
3. fix compile warning after rebasing the master(vectorization)
2022-01-18 13:13:55 +08:00
efb4e189df [fix](lateral-view) Fix some lateral view bugs (#7772)
1. Fix bug that BE may crash when input node of TableFunctionNode has non-null column
2. Fix bug that TableFunctionNode may not return all results
2022-01-18 12:09:32 +08:00
e1d7233e9c [feature](vectorization) Support Vectorized Exec Engine In Doris (#7785)
# Proposed changes

Issue Number: close #6238

    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
    Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
    Co-authored-by: wangbo <506340561@qq.com>
    Co-authored-by: emmymiao87 <522274284@qq.com>
    Co-authored-by: Pxl <952130278@qq.com>
    Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
    Co-authored-by: thinker <zchw100@qq.com>
    Co-authored-by: Zeno Yang <1521564989@qq.com>
    Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
    Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
    Co-authored-by: xinghuayu007 <1450306854@qq.com>
    Co-authored-by: weizuo93 <weizuo@apache.org>
    Co-authored-by: yiguolei <guoleiyi@tencent.com>
    Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
    Co-authored-by: awakeljw <993007281@qq.com>
    Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
    Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>


## Problem Summary:

### 1. Some code from clickhouse

**ClickHouse is an excellent implementation of the vectorized execution engine database,
so here we have referenced and learned a lot from its excellent implementation in terms of
data structure and function implementation.
We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers.**

The following comment has been added to the code from Clickhouse, eg:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris

### 2. Support exec node and query:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node

You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set.

### 3. Data Model

Vec Exec Engine Support **Dup/Agg/Unq** table, Support Block Reader Vectorized.
Segment Vec is working in process.

### 4. How to use

1. Set the environment variable `set enable_vectorized_engine = true; `(required)
2. Set the environment variable `set batch_size = 4096; ` (recommended)

### 5. Some diff from origin exec engine

https://github.com/doris-vectorized/doris-vectorized/issues/294

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
2022-01-18 10:07:15 +08:00
85521944dd [refactor](olap-scan-node) Refactor olap scannode (#7131)
1. Delete useless variables
2. Add const modifier for read-only function
3. Delete the empty destructor, the compiler will automatically generate it, refer to the 3/5/0 rule: 
    [https://en.cppreference.com/w/cpp/language/rule_of_three]
4. It is recommended to add the override keyword (instead of the virtual keyword) to the subclass virtual function. 
    Override will let the compiler help check and improve security. This is also the reason why C++11 introduces override
2021-12-16 10:33:41 +08:00
836c95c2ca [feat](memory-track) Print peak memory use of all backend after query in audit log (#7030)
Add a new field `peakMemoryBytes` in fe.audit.log
2021-11-22 14:46:08 +08:00
fcd4f0b5c2 [fix](profile) fix some bugs about ReportProfile on BE (#7144)
1. setting _report_thread_active to false is not necessary protected by _report_thread_lock, because 
_report_thread_active's type is bool, writing data is multi-threadly safety if size <= marchine word length

2. report_profile thread terminates early is possiable, in the function report_profile(), while (_report_thread_active) may 
break if  _report_thread_active is false,  the thread of calling open() may be scheduled out between 
_report_thread_started_cv.wait(l) and _report_thread_active = true, we should not assume that how long time elapsed 
between a thread be scheduled twice
2021-11-20 21:43:57 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
896a08cbcf [Enhancement] add thread id in be log (#6891)
Add thread id in be log in order to quickly find the query id that caused the BE crushed by segmentation fault
See #6890
2021-11-14 18:52:01 +08:00
ca8268f1c9 [Feature] Extend logger interface, support structured log output (#6600)
Support structured logging.
2021-11-07 17:39:53 +08:00
74ddea8d83 [Optimize] Remove some unused code to reduce lock contention (#6566)
1. Remove global runtime profile counter
2. Remove unused thread token register
2021-09-07 11:56:12 +08:00
9469b2ce1a [Outfile] Support concurrent export of query results (#6539)
This pr mainly supports
1. Export query result sets concurrently
2. Query result set export supports s3 protocol

Among them, there are several preconditions for concurrently exporting query result sets
1. Enable concurrent export variables
2. The query itself can be exported concurrently
    (some queries containing sort nodes at the top level cannot be exported concurrently)
3. Export the s3 protocol used instead of the broker

After exporting the result set concurrently,
the file prefix is changed to outfile_{query_instance_id}_filenumber.{file_format}
2021-09-07 11:53:32 +08:00
3f2fdd236f Add scan thread token (#6443) 2021-08-27 10:56:17 +08:00
2030c44dba [Log] Modify some log level on BE side (#6381) 2021-08-14 10:25:45 +08:00
9216735cfa [New Featrue] Support Vectorization Execution Engine Interface For Doris (#6329)
1. FE vectorized plan code
2. Function register vec function
3. Diff function nullable type
4. New thirdparty code and new thrift struct
2021-08-11 14:54:06 +08:00
d57c2344e1 [MemTracker] Refactored the hierarchical structure of memtracker (#5956)
To avoid showing too many memtracker on BE web pages.
The MemTracker level now has 3 levels: OVERVIEW, TASK and VERBOSE.

OVERVIEW Mainly used for main memory consumption module such as Query/Load/Metadata.
TASK is mainly used to record the memory overhead of a single task such as a single query, load, and compaction task.
VERBOSE is used for other more detailed memtrackers.
2021-06-16 09:44:24 +08:00
1a81b9e160 [MemTracker] Some enchance of MemTracker (#5783)
1 Make some MemTracker have reasonable parent MemTracker not the root tracker
2 Make each MemTracker can be easily to trace.
3 Add show level of MemTracker to reduce the MemTracker show in the web page to have a way to control show how many tracker in web page.
2021-05-19 09:27:50 +08:00
9d25bfe980 [Bug] Fix bug that database not found when replaying batch transaction remove log (#5815)
* [Bug] Fix bug that database not found when replaying batch transaction remove log

[GlobalTransactionMgr.replayBatchRemoveTransactions():353] replay batch remove transactions failed. db 0
org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = databaseTransactionMgr[0] does not exist
        at org.apache.doris.transaction.GlobalTransactionMgr.getDatabaseTransactionMgr(GlobalTransactionMgr.java:84) ~[palo-fe.jar:3.4.0]
        at org.apache.doris.transaction.GlobalTransactionMgr.replayBatchRemoveTransactions(GlobalTransactionMgr.java:350) [palo-fe.jar:3.4.0]
        at org.apache.doris.persist.EditLog.loadJournal(EditLog.java:601) [palo-fe.jar:3.4.0]
        at org.apache.doris.catalog.Catalog.replayJournal(Catalog.java:2452) [palo-fe.jar:3.4.0]
        at org.apache.doris.master.Checkpoint.runAfterCatalogReady(Checkpoint.java:101) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.MasterDaemon.runOneCycle(MasterDaemon.java:58) [palo-fe.jar:3.4.0]
        at org.apache.doris.common.util.Daemon.run(Daemon.java:116) [palo-fe.jar:3.4.0]

The id of information_scheam database is 0, and it has no txn at all.
2021-05-17 11:50:46 +08:00
0c83e43a67 [Optimize] Optimize profile lock conflict and view profile while query is executing (#5762)
1. Reduce lock conflicts in RuntimeProfile of be;
2. can view query profile when the query is executing;
3. reduce wait time for 'show proc /current_queries'.
2021-05-13 22:33:26 +08:00
98e80aa65e [refactor] Replace boost::function with std::function (#5700)
Replace boost::function with std::function
2021-05-09 22:00:48 +08:00
a803ceea86 [refactor] Remove boost mutex, use std::mutex instead (#5684)
* Remove boost mutex, use std::mutex instead

* replace shared_mutex
2021-04-22 11:29:36 +08:00
c4cc681d14 remove boost_foreach, using c++ foreach instead (#5611) 2021-04-15 10:52:29 +08:00
0131c33966 [Enhance] Improve the readability of memtrackers' name (#5455)
Improve the readability of memtrackers' name, then you will be happy to read website be_ip:port/mem_tracker
2021-03-11 22:33:31 +08:00
93a4c7efc1 [LOG] Standardize the use of VLOG in code (#5264)
At present, the application of vlog in the code is quite confusing.
It is inherited from impala VLOG_XX format, and there is also VLOG(number) format.
VLOG(number) format does not have a unified specification, so this pr standardizes the use of VLOG
2021-01-21 12:09:09 +08:00
ff4bd1223f [Profile] Add cpu time cost in query audit (#5051) 2020-12-13 22:22:15 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
f1b57c4418 [Optimize] Avoid repeated sending of common components in Fragments (#4904)
This CL mainly changes:

1. Avoid repeated sending of common components in Fragments

    In the previous implementation, a query may generate multiple Fragments,
these Fragments contain some common information, such as DescriptorTable.
Fragment will be sent to BE in a certain order, so these public information will be sent repeatedly
and generated repeatedly on the BE side.

    In some complex SQL, these public information may be very large,
thereby increasing the execution time of Fragment.

    So I improved this. For multiple Fragments sent to the same BE, only the first Fragment will carry
these public information, and it will be cached on the BE side, and subsequent Fragments
no longer need to carry this information.

    In the local test, the execution time of some complex SQL can be reduced from 3 seconds to 1 second.

2. Add the time-consuming part of FE logic in Profile

    Including SQL analysis, planning, Fragment scheduling and sending on the FE side, and the time to fetch data.
2020-11-22 20:38:05 +08:00
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
004b955ca4 [Bug] Fix a null pointer bug in PlanFragmentExecutor. (#4473)
Fix a null pointer bug in PlanFragmentExecutor. Add null check operation before it is used.
Detail: #4472
2020-08-28 09:28:23 +08:00
e4e9af4577 This PR contain three things (#4448)
1. Fix core bug wild pointer in PlanFragmentExecutor, fix issue #4447
2. Fix core bug wild pointer json load, fix issue #4452
3. Change the declare order of ODBC type in thrift for compatibility
2020-08-26 10:53:53 +08:00
10f822eb43 [MemTracker] make all MemTrackers shared (#4135)
We make all MemTrackers shared, in order to show MemTracker real-time consumptions on the web.
As follows:
1. nearly all MemTracker raw ptr -> shared_ptr
2. Use CreateTracker() to create new MemTracker(in order to add itself to its parent)
3. RowBatch & MemPool still use raw ptrs of MemTracker, it's easy to ensure RowBatch & MemPool destructor exec 
     before MemTracker's destructor. So we don't change these code.
4. MemTracker can use RuntimeProfile's counter to calc consumption. So RuntimeProfile's counter need to be shared 
    too. We add a shared counter pool to store the shared counter, don't change other counters of RuntimeProfile.
Note that, this PR doesn't change the MemTracker tree structure. So there still have some orphan trackers, e.g. RowBlockV2's MemTracker. If you find some shared MemTrackers are little memory consumption & too time-consuming, you could make them be the orphan, then it's fine to use the raw ptr.
2020-07-31 21:57:21 +08:00
9b0ad66b78 [runtime] Replace the thread pool in FragmentMgr (#4057) 2020-07-15 10:03:48 +08:00
9bb7e5d208 Fix some code & comments (#3999)
TPlanExecParams::volume_id is never used, so delete the print_volume_ids() function.
Fix log, and log if PlanFragmentExecutor::open() returns error.
Fix some comments
2020-07-03 21:18:47 +08:00
4a44c457a3 [Bug] Fix bug that a query plan is not correctly cancelled (#3933)
This bug is introduced by #3872.
It will cause some expected to be cancelled queries not being cancelled.
2020-06-25 16:23:13 +08:00
51367abce7 [Bug] Fix bug that BE crash when doing Insert Operation (#3872)
Mainly change:
1. Fix the bug in `update_status(status)` of `PlanFragmentExecutor`.
2. When the FE Coordinator executes `execRemoteFragmentAsync()`, if it finds an RPC error, return a Future with an error code instead of exception.
3. Protect the `_status` in RuntimeState with lock
4. Move the `_runtime_profile` of RuntimeState before the `_obj_pool`, so that the profile will be
deconstructed after the object pool.
5. Remove the unused `ObjectPool` param in RuntimeProfile constructor. If I don't remove it,
RuntimeProfile will depends on the `_obj_pool` in RuntimeProfile.
2020-06-19 17:09:04 +08:00
101c7c161d [Bug] Fix bug that double unregister the resource pool in runtime state (#3458)
The resource pool in runtime state will be automatically unregistered
when deconstructing the RuntimeState. So no need to unregister it when
closing the plan fragment executor.
2020-05-04 14:48:57 +08:00
4eb27bc7e3 [Profile] Make running profile clearer and more intuitive to improve usability (#3365) (#3383)
This CL mainly made the following modifications:
1. Delete Invalid method in Running Profile Class.
2. Move Memlimit Counter from blockmgr to fragment and add PeakMemUsage Counter
3. Fix the bug of buffer pool memlimit counter
4. Call compute_time_in_profile() before pretty_print() to show the _local_time_percent without child running profile
5. Add TransferThread ThreadToken count in AveThreadToken Counter
2020-04-24 21:38:55 +08:00
807499427c unregister fragment mem tracker in close() (#3286)
ref https://github.com/apache/incubator-doris/issues/3273

P.S.
614a76beea/be/src/runtime/plan_fragment_executor.cpp (L559-L562)
I think this piece of code is useless.
This `_mem_tracker` in `PlanFragmentExecutor` is set as fragment_mem_tracker of `RuntimeState`.

**direct use**
We use it in these code, when rowbatch reset, mem tracker's consumption will be released.
7eab12a40e/be/src/exec/olap_rewrite_node.cpp (L57-L58)
839ec45197/be/src/exec/olap_scan_node.cpp (L1217-L1218)

**other usage** 
e.g.
6c33f80544/be/src/exec/olap_scanner.cpp (L245)
won't consume the fragment mem tracker. We don't need to worry about the fragment mem tracker consumption is not zero when we want to destroy it.

Or we can add a consumption check before we close the mem tracker?
2020-04-13 23:15:56 +08:00
50af594c66 [MemLimit] Normalize the setting of mem limit (#3033)
Normalize the setting of mem limit to avoid some unexpected exception.
For example, use may not setting query mem limit in query plan, which
may cause BE crash.
2020-03-05 08:47:45 +08:00