Commit Graph

76 Commits

Author SHA1 Message Date
4e783afa7a [feature] add Generic debug timer for debugging or profiling (#7923)
add a group of debug-timer for the purpose of profiling or testing
you can use these timers for custom meaning purpose unlike the specific named timer
2022-01-31 22:15:43 +08:00
e1d7233e9c [feature](vectorization) Support Vectorized Exec Engine In Doris (#7785)
# Proposed changes

Issue Number: close #6238

    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
    Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
    Co-authored-by: wangbo <506340561@qq.com>
    Co-authored-by: emmymiao87 <522274284@qq.com>
    Co-authored-by: Pxl <952130278@qq.com>
    Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
    Co-authored-by: thinker <zchw100@qq.com>
    Co-authored-by: Zeno Yang <1521564989@qq.com>
    Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
    Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
    Co-authored-by: xinghuayu007 <1450306854@qq.com>
    Co-authored-by: weizuo93 <weizuo@apache.org>
    Co-authored-by: yiguolei <guoleiyi@tencent.com>
    Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
    Co-authored-by: awakeljw <993007281@qq.com>
    Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
    Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>


## Problem Summary:

### 1. Some code from clickhouse

**ClickHouse is an excellent implementation of the vectorized execution engine database,
so here we have referenced and learned a lot from its excellent implementation in terms of
data structure and function implementation.
We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers.**

The following comment has been added to the code from Clickhouse, eg:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris

### 2. Support exec node and query:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node

You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set.

### 3. Data Model

Vec Exec Engine Support **Dup/Agg/Unq** table, Support Block Reader Vectorized.
Segment Vec is working in process.

### 4. How to use

1. Set the environment variable `set enable_vectorized_engine = true; `(required)
2. Set the environment variable `set batch_size = 4096; ` (recommended)

### 5. Some diff from origin exec engine

https://github.com/doris-vectorized/doris-vectorized/issues/294

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
2022-01-18 10:07:15 +08:00
2a2f12ca51 [refactor & fix](exce & olap) refactor reader: rename Reader to TabletReader (#7544)
1. Consider the responsibility of Reader,  Rename Reader to TabletReader, I think the new name TabletReader can represent its function exactly,  it is more suitable and meaningful
2. add virtual keyword for the destructor of OlapScanner, because VOlapScanner is derived from it
3. refactor struct ReaderParams and KeysParam as TabletReader's inner struct,guard by TabletReader name scope, it's also more reasonable
4. reduce OlapScanner's member data amount, just use _parent->member_data is simpler
5. bugfix: TupleReader has the same memeber data _collect_iter to its parent class Reader, this usage is dangerous, the writer may make some mistake, so i delete TupleReader::_collect_iter to fix it.
6. call set_tablet_reader() in OlapScanner::prepare() to setup _tablet_reader, VOlapScanner should override set_tablet_reader to new BlockReader instead,  use this way to avoid new Reader twice by reset unique_ptr _tablet_reader
7. if the member data is a inseparable part of a class, i suggest using normal variable while not pointer variable, because pointer bring a indirect lay and must handle coping and destructing carefully, it's not necessary
8. some other small changes for readability or design
2022-01-06 00:00:32 +08:00
85521944dd [refactor](olap-scan-node) Refactor olap scannode (#7131)
1. Delete useless variables
2. Add const modifier for read-only function
3. Delete the empty destructor, the compiler will automatically generate it, refer to the 3/5/0 rule: 
    [https://en.cppreference.com/w/cpp/language/rule_of_three]
4. It is recommended to add the override keyword (instead of the virtual keyword) to the subclass virtual function. 
    Override will let the compiler help check and improve security. This is also the reason why C++11 introduces override
2021-12-16 10:33:41 +08:00
8b557c0e70 [Refactor] Refact code of sequence column (#7007) 2021-11-15 11:10:45 +08:00
c3b133bdb3 [Refactor] Refactor the reader code (#6866)
1. Removed useless redundant code logic
2. Change reader to interface, add tuple reader to simplify the structure of reader
2021-10-30 18:15:28 +08:00
b2f1e21a3b [Bugs] Fix some bugs (#6586)
* fix regex lazy

* fix result file core

* fix dynamic partition replica and table name length bug

* fix replicanum 0

* fix delete bug

* renew proxy

Co-authored-by: morningman <chenmingyu@baidu.com>
2021-09-10 09:53:30 +08:00
ca3eb6490e push down conditions on unique table value columns to base rowset (#6457) 2021-08-26 09:14:49 +08:00
8738ce380b Add long text type STRING, with a maximum length of 2GB. Usage is similar to varchar, and there is no guarantee for the performance of storing extremely long data (#6391) 2021-08-18 09:05:40 +08:00
9216735cfa [New Featrue] Support Vectorization Execution Engine Interface For Doris (#6329)
1. FE vectorized plan code
2. Function register vec function
3. Diff function nullable type
4. New thirdparty code and new thrift struct
2021-08-11 14:54:06 +08:00
16bc5fa585 [Bug] fix violating C/C++ aliasing rules cause a error hash value in decimal value (#6348)
In RuntimeFilter BloomFilter, decimal column will got a wrong hash value because violating  aliasing rules
decimal12_t decimal = { 12, 12 };
murmurhash3(decimal) in bloom filter: 2167721464
expect: 4203026776
2021-08-03 12:00:03 +08:00
ed3ff470ce [ARRAY] Support array type load and select not include access by index (#5980)
This is part of the array type support and has not been fully completed. 
The following functions are implemented
1. fe array type support and implementation of array function, support array syntax analysis and planning
2. Support import array type data through insert into
3. Support select array type data
4. Only the array type is supported on the value lie of the duplicate table

this pr merge some code from #4655 #4650 #4644 #4643 #4623 #2979
2021-07-13 14:02:39 +08:00
290a844e04 [optimize] Optimize bloomfilter performance (#6180)
refactor runtime filter bloomfilter and eliminate some virtual function calls which obtained a performance improvement of about 5%
import block bloom filter, for avx version obtained 40% performance improvement
before: bloomfilter size:default, about 2000W item cost about 1s400ms
after: bloomfilter size:524288, about 2000W item cost about 400ms
2021-07-10 10:12:12 +08:00
739c0268ff [refactor] Remove decimal v1 related code from code base (#6079)
remove ALL DECIMAL V1 type code , this is a part of #6073
2021-07-07 10:26:32 +08:00
149def9e42 [Feature] Support RuntimeFilter in Doris (BE Implement) (#6077)
1. support in/bloomfilter/minmax
2. support broadcast/shuffle/bucket shuffle/colocate join
3. opt memory use and cpu cache miss while build runtime filter
4. opt memory use in left semi join (works well on tpcds-95)
2021-07-04 20:59:05 +08:00
d57c2344e1 [MemTracker] Refactored the hierarchical structure of memtracker (#5956)
To avoid showing too many memtracker on BE web pages.
The MemTracker level now has 3 levels: OVERVIEW, TASK and VERBOSE.

OVERVIEW Mainly used for main memory consumption module such as Query/Load/Metadata.
TASK is mainly used to record the memory overhead of a single task such as a single query, load, and compaction task.
VERBOSE is used for other more detailed memtrackers.
2021-06-16 09:44:24 +08:00
1ec615c562 [BUG] Fixed some uninitialized variables (#5850)
Fixed some potential bugs caused by uninitialized variables
2021-05-25 10:34:35 +08:00
1a81b9e160 [MemTracker] Some enchance of MemTracker (#5783)
1 Make some MemTracker have reasonable parent MemTracker not the root tracker
2 Make each MemTracker can be easily to trace.
3 Add show level of MemTracker to reduce the MemTracker show in the web page to have a way to control show how many tracker in web page.
2021-05-19 09:27:50 +08:00
50ffae44b1 [BUG] Fix bug that Unique/AGG key will read all key columns when there are two rowsets (#5632) 2021-04-14 00:12:05 +08:00
9c7d8d2e98 [Bug] Fix bug that isPreAggregation is incorrectly set (#5608)
1. The MaterializedViewSelector should be reset for each scan node
2. On the BE side, columns with delete conditions must be added to the return column.
2021-04-09 14:13:06 +08:00
0131c33966 [Enhance] Improve the readability of memtrackers' name (#5455)
Improve the readability of memtrackers' name, then you will be happy to read website be_ip:port/mem_tracker
2021-03-11 22:33:31 +08:00
462efeaf39 [Performance Optimization and Refactor] (#5358) (#5364)
1. Add BlockColumnPredicate support OR and AND column predicate in RowBlockV2
2. Support evaluate vectorization delete predicate in storage engine not in Reader in SegmentV2
2021-02-07 22:41:33 +08:00
93a4c7efc1 [LOG] Standardize the use of VLOG in code (#5264)
At present, the application of vlog in the code is quite confusing.
It is inherited from impala VLOG_XX format, and there is also VLOG(number) format.
VLOG(number) format does not have a unified specification, so this pr standardizes the use of VLOG
2021-01-21 12:09:09 +08:00
17d939b789 [Bug] Fix scanner threads heap-use-after-free (#5111)
Scanner threads may be running and using the member vars of OlapScanNode,
when the OlapScanNode has already destroyed.

We can use `_running_thread` to be the last accessed member variable.
And `transfer_thread` need to wait for `_running_thread==0`.
After `transfer_thread` joined, `OlapScanNode::close()` can continue.
2021-01-04 09:28:51 +08:00
9e19b6b133 [Performance Improve] Push Down _conjunct of 'A is NULL' and 'B is not NULL' to Storage Engine. (#5092)
This patch mainly do the following:
- Support #5086
- Refactor ColumnRangeValue to support contain null
2021-01-03 15:45:07 +08:00
c57145b4c2 [Bug] Fix bug that routine load may lost some data (#5093)
In the previous implementation, whether a subtask is in commit or abort state,
we will try to update the job progress, such as the consumed offset of kafka.
Under normal circumstances, the aborted transaction does not consume any data,
and all progress is 0, so even we update the progress, the progress will remain
unchanged.
However, in the case of high cluster load, the subtask may fail half of the execution on the BE side.
At this time, although the task is aborted, part of the progress is updated.
Cause the next subtask to skip these data for consumption, resulting in data loss.
2020-12-23 09:33:52 +08:00
81c7c0360e [Bug] Fix a core dump of counter in BE (#5078)
Introduced by PR #5051.
As @liutang123 said, when PlanFragmentExecutor is destructed, it will call
`close -> ExecNode::close -> OlapScanNode::close`. OlapScanNode will wait for `_transfer_thread`.
`_transfer_thread` will wait for all OlapScanner processing to complete.
OlapScanner is processed by the scanner thread. When the last scanner processing is completed,
`_transfer_thread` will break out of the loop, and PlanFragmentExecutor will continue to destruct.
And if it is completed, its RuntimeProfile::Counter will also be destructed.
At this time, the ScopedTimer in the Scan thread may still use this Counter when it is destructed.

So we must make sure that the timer is deconstructed before deconstructing the runtime profile.
2020-12-15 09:33:38 +08:00
b9dabc3b5b [Enhance] Push down predicate on value column of unique table to base rowset (#5022) 2020-12-06 08:50:37 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
6247408689 [Compact]Take tablet scan frequency into consider when selecting tablet for compaction (#4837)
A large number of small segment files will lead to low efficiency for scan operations.
Multiple small files can be merged into a large file by compaction operation.
So we could take the tablet scan frequency into consideration when selecting an tablet for compaction
and preferentially do compaction for those tablets which are scanned frequently during a
latest period of time at the present.

Using the compaction strategy of Kudu for reference, scan frequency can be calculated
for tablet during a latest period of time and be taken into consideration when calculating compaction score.
2020-11-18 21:51:12 +08:00
66132d2836 [Feature] Running Profile OLAP_SCAN_NODE layering and enhance readability (#4825)
mainly includes:
- `OLAP_SCAN_NODE` profile layering: `OLAP_SCAN_NODE`,`OlapScanner`, and `SegmentIterator`.
- Delete meaningless statistical values. mainly in scan_node.cpp.
- Increase `RowsConditionsFiltered` statistical, split from `RowsDelFiltered`, the meaning is the number of rows filtered by various column indexes, only in segment V2.
- Modify the document based on the above, and enhance readability.
2020-11-11 21:21:25 +08:00
b1c1ffda4a [Refactor] Refactor olap scan node code (#4823)
1. Remove meaningless code in Doris
2. Replace string copy by string reference
3. Simplified the implementation of some functions
2020-11-01 09:12:23 +08:00
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
83f6f46c34 [Config] Limit the version number of tablet (#4687)
Add a BE config `max_tablet_version_num` to limit the version number of a single tablet.
To avoid too many versions
2020-10-13 10:08:16 +08:00
068707484d Support sequence column for UNIQUE_KEYS Table (#4256)
* add sequence  col

Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>
2020-09-04 10:10:17 +08:00
498b06fbe2 [Metrics] Support tablet level metrics (#4428)
Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet,
but we have no insight about tablets in the cluster.
This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `. 
However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request,
and not return tablet level metrics by default.
2020-09-02 10:39:41 +08:00
8b0b120aca [Profile] Add 2 Segment related metrics in query profile (#4348)
Total number of segments and filterd number of segment
2020-08-27 12:07:21 +08:00
4c571cb6f5 Revert "[Metrics] Support tablet level metrics (#4327)" (#4397)
This reverts commit 56260a65c87830ffe34109195ee4d6f1d543e630.

Co-authored-by: morningman <chenmingyu@baidu.com>
2020-08-19 22:37:52 +08:00
56260a65c8 [Metrics] Support tablet level metrics (#4327)
Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet,
but we have no insight about tablets in the cluster.
This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `.
However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request,
and not return tablet level metrics by default.
2020-08-18 16:56:12 +08:00
10f822eb43 [MemTracker] make all MemTrackers shared (#4135)
We make all MemTrackers shared, in order to show MemTracker real-time consumptions on the web.
As follows:
1. nearly all MemTracker raw ptr -> shared_ptr
2. Use CreateTracker() to create new MemTracker(in order to add itself to its parent)
3. RowBatch & MemPool still use raw ptrs of MemTracker, it's easy to ensure RowBatch & MemPool destructor exec 
     before MemTracker's destructor. So we don't change these code.
4. MemTracker can use RuntimeProfile's counter to calc consumption. So RuntimeProfile's counter need to be shared 
    too. We add a shared counter pool to store the shared counter, don't change other counters of RuntimeProfile.
Note that, this PR doesn't change the MemTracker tree structure. So there still have some orphan trackers, e.g. RowBlockV2's MemTracker. If you find some shared MemTrackers are little memory consumption & too time-consuming, you could make them be the orphan, then it's fine to use the raw ptr.
2020-07-31 21:57:21 +08:00
725ebafd99 [Bug] Cancel the query if OlapScanner prepare failed (#4002) 2020-07-03 21:33:07 +08:00
b8ee84a120 [Doc] Add docs to OLAP_SCAN_NODE query profile (#3808) 2020-06-13 16:25:40 +08:00
4f79036a7e Add error code into error message (#3645) 2020-05-21 19:14:35 +08:00
b58b1b3953 [metrics] Make DorisMetrics to be a real singleton (#3417) 2020-05-04 09:20:53 +08:00
6c33f80544 Add disable_storage_page_cache config (#2890)
1. when read column data page:
    for compaction, schema_change, check_sum: we don't use page cache
    for query and config::disable_storage_page_cache is false, we use page cache
2. when read column index page
    if config::disable_storage_page_cache is false, we use page cache
2020-02-16 19:13:30 +08:00
4c5b0b6dc9 Remove VersionHash used to comparison in BE (#2622) 2019-12-31 19:38:45 +08:00
d31f774852 Add block split bloom filter (#2471)
[STORAGE][SEGMENTV2]

    use block split bloom filter
    build bloom filter against data page
    add distinct value to bloom filter
    add ordinal index to bloom filter index
2019-12-18 12:57:44 +08:00
f828670245 Add Bitmap index reader (#2319)
[STORAGE] [INDEX]

For #2061 and #2062

Add bitmap index reader
SegmentIterator support bitmap index
Add some metrics
2019-12-03 23:01:40 +08:00
95a3b4ccfe Add object type (#1948)
Add a new type: Object. Currently, it's mainly for complex aggregate metrics(HLL , Bitmap).

The Object type has the following constraints:
1 Object type could not as key column type
2 Object type doesn't support all indices (BloomFilter, short key, zone map, invert index)
3 Object type doesn't support filter and group by

In the implementation:

The Object type reuse the StringValue and StringVal, because in storage engine, the Object type is binary, it has a pointer and length.
2019-10-31 21:42:58 +08:00
9bc2325c6a Fix incorrect scan bytes in metrics (#2034) 2019-10-23 18:13:40 +08:00