66 Commits

Author SHA1 Message Date
b44c47fc10 [fix] (remote storage) fix bug for storage policy (#11597) 2022-08-09 09:05:48 +08:00
321107cb40 [refactor](schema change) Using tablet schema shared ptr instead of raw ptr (#11475)
* Using tabletschema shared ptr instead of raw ptrs


Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-08-05 11:04:38 +08:00
b35daf0a04 [improvement](light-schema-change) Support tablet schema cache (#11131) 2022-08-01 12:18:00 +08:00
73d8f5901d fix mem tracker limiter (#11376) 2022-08-01 09:44:04 +08:00
2783267599 [feature-wip](unique-key-merge-on-write) update bitmap after compaction, DSIP-018 (#11289) 2022-07-30 19:50:47 +08:00
b6bdb3bdbc [fix] (mem tracker) Fix MemTracker accuracy (#11190) 2022-07-27 18:59:24 +08:00
d4fb27125a [feature-wip](unique-key-merge-on-write) row id conversion for compaction (#11149) 2022-07-27 16:32:13 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
523d395527 [refactor] Remove alpha rowset meta (#10933)
* remove alpha_rowset_meta
* remove alpha rowset related codes in compaction
* remove alpha rowset related codes in RowsetMeta
* fix be ut because some ut use alpha rowsetmeta
2022-07-18 08:45:46 +08:00
486cf0ebd4 [Feature] Lightweight schema change of add/drop column (#10136)
* [Schema Change] support fast add/drop column  (#49)

* [feature](schema-change) support fast schema change. coauthor: yixiutt

* [schema change] Using columns desc from fe to read data. coauthor: Lchangliang

* [feature](schema change) schema change optimize for add/drop columns.

1.add uniqueId field for class column.
2.schema change for add/drop columns directly update schema meta

Co-authored-by: yixiutt <yixiu@selectdb.com>
Co-authored-by: SWJTU-ZhangLei <1091517373@qq.com>

[Feature](schema change) fix write and add regression test (#69)

Co-authored-by: yixiutt <yixiu@selectdb.com>

[schema change] be support that delete use newest schema

add delete regression test

fix regression case (#107)

tmp

[feature](schema change) light schema change exclude rollup and agg/uniq/dup key type.

[feature](schema change) fe olapTable maxUniqueId write in disk.

[feature](schema change) add rpc iface for sc add column.

[feature](schema change) add columnsDesc to TPushReq for light sc.

resolve the deadlock when schema change (#124)

fix columns from fe don't have bitmap_index flag (#134)

add update/delete case

construct MATERIALIZED schema from origin schema when insert

fix not vectorized compaction coredump

use segment cache

choose newest schema by schema version when compaction (#182)

[bugfix](schema change) fix light schema change problem.

[feature](schema change) light schema change add alter job. (#1)

fix be ut

[bug] (schema change) unique drop key column should not light schema change

[feature](schema change) add schema change regression-test.

fix regression test

[bugfix](schema change) fix multi alter clauses for light schema change. (#2)

[bugfix](schema change) fix multi clauses calculate column unique id (#3)

modify PushTask process (#217)

[Bugfix](schema change) fix jobId replay cause bdbje exception.

[bug](schema change) fix max col unique id repetitive. (#232)

[optimize](schema change) modify pendingMaxColUniqueId generate rule.

fix compaction error
* fix be ut

* fix snapshot load core

fix unique_id error (#278)

[refact](fe) remove redundant code for light schema change. (#4)

[refact](fe) remove redundant code for light schema change. (#4)

format fe core

format be core

fix be ut

modify fe meta version

fix rebase error

flush schema into rowset_meta in old table

[refactor](schema change) refact fe light schema change. (#5)

delete the change of schemahash and support get max version schema

* modify for review

* fix be ut

* fix schema change test
2022-07-12 19:41:06 +08:00
331fa50501 [feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280)
This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet,
and there is no need to prohibit loading new data into cooled tablets.

Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.

The abstracted `RemoteFileSystem` can try local caching strategies with different granularity,
instead of caching segment files as before.

To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
2022-07-08 12:18:39 +08:00
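A minimal sketch of the "each rowset is bound to a `FileSystem`" idea from #10280 above. The names `InMemoryFileSystem`, `Rowset::is_cold` and the method set are hypothetical and chosen only for illustration; the real interfaces under be/src/io/fs may look quite different.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical interface: the storage layer only talks to this abstraction.
class FileSystem {
public:
    virtual ~FileSystem() = default;
    virtual void write(const std::string& path, const std::string& data) = 0;
    virtual std::string read(const std::string& path) const = 0;
    virtual bool is_remote() const = 0;
};

// Stand-in backend that keeps files in memory so the sketch runs anywhere.
// A real local or object-storage backend would implement the same interface.
class InMemoryFileSystem : public FileSystem {
public:
    explicit InMemoryFileSystem(bool remote) : _remote(remote) {}
    void write(const std::string& path, const std::string& data) override {
        _files[path] = data;
    }
    std::string read(const std::string& path) const override {
        auto it = _files.find(path);
        return it == _files.end() ? std::string() : it->second;
    }
    bool is_remote() const override { return _remote; }

private:
    bool _remote;
    std::map<std::string, std::string> _files;
};

// Each rowset is bound to exactly one filesystem, so one tablet can mix
// hot (local) and cold (remote) rowsets.
class Rowset {
public:
    Rowset(std::string path, std::shared_ptr<FileSystem> fs)
            : _path(std::move(path)), _fs(std::move(fs)) {
        assert(_fs != nullptr);
    }
    void write(const std::string& data) { _fs->write(_path, data); }
    std::string read() const { return _fs->read(_path); }
    bool is_cold() const { return _fs->is_remote(); }

private:
    std::string _path;
    std::shared_ptr<FileSystem> _fs;
};
```

Callers read and write through `Rowset` and never need to know which backend holds the data.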
4dfebb9852 [Feature] compaction quickly for small data import (#9804)
* compaction quickly for small data import #9791
1. Merge small rowset versions as soon as possible to increase the import frequency of small version data.
2. A small version is one whose number of rows is less than config::small_compaction_rowset_rows (default 1000).
2022-06-15 21:48:34 +08:00
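The "small version" rule of #9804 above can be sketched as follows. Only the threshold name and its default of 1000 come from the commit message; `RowsetInfo` and the policy of picking the trailing run of small rowsets are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for config::small_compaction_rowset_rows (default 1000).
constexpr int64_t kSmallCompactionRowsetRows = 1000;

// Hypothetical, simplified view of a rowset.
struct RowsetInfo {
    int64_t start_version;
    int64_t end_version;
    int64_t num_rows;
};

// A version is "small" when it has fewer rows than the threshold.
bool is_small_rowset(const RowsetInfo& rs,
                     int64_t threshold = kSmallCompactionRowsetRows) {
    assert(threshold > 0);
    return rs.num_rows < threshold;
}

// Assumed policy: collect the newest consecutive small rowsets
// (input is ordered oldest to newest) as candidates for a quick merge.
std::vector<RowsetInfo> pick_small_rowsets(
        const std::vector<RowsetInfo>& rowsets,
        int64_t threshold = kSmallCompactionRowsetRows) {
    std::vector<RowsetInfo> picked;
    for (auto it = rowsets.rbegin();
         it != rowsets.rend() && is_small_rowset(*it, threshold); ++it) {
        picked.push_back(*it);
    }
    std::reverse(picked.begin(), picked.end());
    return picked;
}
```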
79620f6fa2 [Improvement] change the condition of vectorized compaction (#9950) 2022-06-04 12:29:23 +08:00
47dfdd8e09 [fix](storage) Disable compaction before schema change is actually executed(#9032) (#9065)
As described in the issue, running compaction and schema change at the same time may lead to version intersection.
Overview of changes:
1. Do not do compaction before schema change is actually executed.
2. Set tablet as bad when it has version intersection.
3. Do not do schema change when it can not find appropriate versions to delete in new tablet.
4. Do not change rowsets after compaction if the rowsets of the tablet have changed.
2022-06-01 23:29:18 +08:00
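To make "version intersection" in #9065 above concrete, here is a small sketch that detects overlapping version ranges. The `Version` struct and the function name are hypothetical; the actual check in the patch may be organized differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical inclusive version range [first, second] covered by a rowset.
struct Version {
    int64_t first;
    int64_t second;
};

// Returns true when any two ranges overlap. After sorting by start version,
// it is enough to compare each range with its predecessor: if no adjacent
// pair overlaps, the end versions are increasing and nothing else can overlap.
bool has_version_intersection(std::vector<Version> versions) {
    for (const Version& v : versions) {
        assert(v.first <= v.second);
    }
    std::sort(versions.begin(), versions.end(),
              [](const Version& a, const Version& b) { return a.first < b.first; });
    for (std::size_t i = 1; i < versions.size(); ++i) {
        if (versions[i].first <= versions[i - 1].second) {
            return true;
        }
    }
    return false;
}
```

A tablet for which this returns true would be the kind of tablet that change 2 marks as bad.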
ca05d1ee01 [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661)
1. Fix the negative consumption value of the LRU cache MemTracker.
2. Fix the compaction cache MemTracker having no tracking.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
2022-05-25 08:56:17 +08:00
2c79d223e4 [refactor][rowset]move rowset writer to a single place (#9368) 2022-05-19 23:57:02 +08:00
73c4ec7167 Fix some typos in be/. (#9681) 2022-05-19 20:55:39 +08:00
7c2db79b73 [BUG] fix bug for vectorized compaction and some storage vectorization bug (#9610) 2022-05-19 16:35:15 +08:00
22439cb6a6 [Improvement] [compaction]Enable vectorized compaction by default (#9383) 2022-05-07 08:46:35 +08:00
e130d2f233 [fix][compaction] Rowset::end_version null pointer(#9379) 2022-05-06 14:40:08 +08:00
49a0cd1925 [fix](compaction) fix bug for vectorized compaction (#9344)
1. add a BE config to switch vectorized compaction
2. Fix vectorized compaction bug that row statistic is not right.
2022-05-03 17:31:40 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- Run `sh build-support/clang-format.sh` to clang-format all C++ code
2022-04-29 16:14:22 +08:00
d330bc3806 [Vectorized](stream-load-vec) Support stream load in vectorized engine (#8709) (#9280)
Implement vectorized stream load.
Added fe configuration option `enable_vectorized_load` to enable vectorized stream load.

    Co-authored-by: tengjp@outlook.com
    Co-authored-by: mrhhsg@gmail.com
    Co-authored-by: minghong.zhou@163.com
    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
2022-04-29 09:50:51 +08:00
e5e0dc421d [refactor] Change ALL OLAPStatus to Status (#8855)
Currently, there are two status codes in BE: one is common/Status.h,
and the other is OLAPStatus in olap/olap_define.h.
OLAPStatus is just an enum type; it is very simple and cannot carry much information.
This change unifies the code to use common/Status.
2022-04-14 11:43:49 +08:00
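The motivation of #8855 above (a bare enum cannot carry context, a status object can) can be illustrated with a stripped-down status class. This is a sketch with hypothetical names (`ErrorCode`, `Status::Error`), not the actual `Status` class in common/Status.h.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical error codes; a bare enum like this is all an enum-only
// status type can express.
enum class ErrorCode { OK = 0, IO_ERROR, NOT_FOUND };

// A status object carries the code plus a human-readable message.
class Status {
public:
    static Status OK() { return Status(); }
    static Status Error(ErrorCode code, std::string msg) {
        assert(code != ErrorCode::OK);
        return Status(code, std::move(msg));
    }

    bool ok() const { return _code == ErrorCode::OK; }
    ErrorCode code() const { return _code; }
    const std::string& message() const { return _msg; }

private:
    Status() = default;
    Status(ErrorCode code, std::string msg) : _code(code), _msg(std::move(msg)) {}

    ErrorCode _code = ErrorCode::OK;
    std::string _msg;
};
```

A caller can then both branch on `ok()` and log `message()`, which an enum return value alone does not allow.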
290366787c [refactor] refactor code, replace some file with stl libs (#8759)
1. replace ConditionVariables with std::condition_variable
2. replace Mutex with std::mutex
3. replace MonoTime with std::chrono
2022-04-13 09:55:29 +08:00
bd0a3369b7 [fix] check disk capacity before writing data (#8887)
1. We forgot to check disk capacity when writing data.
2. TODO: the user specified disk capacity is not used now. We need to find a way to use it.
3. Avoid printing too many compaction logs when there is no suitable version for compaction.
2022-04-08 11:29:49 +08:00
eeae516e37 [Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476)
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G

Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
2022-03-20 23:06:54 +08:00
e17aef9467 [refactor] refactor the implement of MemTracker, and related usage (#8322)
Modify the implementation of MemTracker:
1. Simplify a lot of useless logic;
2. Added MemTrackerTaskPool as the ancestor of all query and import trackers; it is used to track the local memory usage of all executing tasks;
3. Add a consume/release cache, triggering a consume/release when the accumulated memory exceeds the parameter mem_tracker_consume_min_size_bytes;
4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
5. Added Virtual MemTracker, whose consume/release will not sync to the parent. It will be used when introducing the TCMalloc Hook to record memory later, to record the specified memory independently;
6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
2022-03-11 22:04:23 +08:00
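Item 3 of the MemTracker changes in #8322 above (accumulate small consume/release calls and apply them in one step once a minimum size is reached) can be sketched like this. `BatchedMemTracker` and its members are hypothetical; only the idea of a threshold such as mem_tracker_consume_min_size_bytes comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tracker that caches small updates locally and only applies
// them to the tracked total once the cached amount is large enough.
class BatchedMemTracker {
public:
    explicit BatchedMemTracker(int64_t min_batch_bytes) : _min_batch(min_batch_bytes) {
        assert(min_batch_bytes > 0);
    }

    void consume(int64_t bytes) {
        _untracked += bytes;
        maybe_flush();
    }
    void release(int64_t bytes) { consume(-bytes); }

    // Apply whatever is cached, regardless of size.
    void flush() {
        _consumption += _untracked;
        _untracked = 0;
    }

    int64_t consumption() const { return _consumption; }
    int64_t untracked() const { return _untracked; }

private:
    void maybe_flush() {
        const int64_t magnitude = _untracked < 0 ? -_untracked : _untracked;
        if (magnitude >= _min_batch) {
            flush();
        }
    }

    int64_t _min_batch;
    int64_t _untracked = 0;    // cached, not yet visible in consumption()
    int64_t _consumption = 0;  // value that would be propagated upward
};
```

The trade-off is that `consumption()` can lag behind the true value by up to the batch size, in exchange for far fewer updates to shared counters.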
c86d469baf [Refactor](storage_engine) Use std::shared_mutex to replace RWMutex (#8387) 2022-03-11 18:14:24 +08:00
aea3e4e59b [refactor] Remove version hash from BE and related test in BE (#8027) 2022-02-14 09:29:27 +08:00
20ef8a6e21 [feature-wip](remote storage)(step1) use a struct instead of string for parameter path, add basic remote method (#7098)
First, we need a parameter that describes whether the data is local or remote.
Then, we need some basic functions to support operations on remote storage.
2021-12-22 22:58:23 +08:00
d57c2344e1 [MemTracker] Refactored the hierarchical structure of memtracker (#5956)
To avoid showing too many MemTrackers on BE web pages,
the MemTracker now has 3 levels: OVERVIEW, TASK and VERBOSE.

OVERVIEW is mainly used for the main memory-consuming modules such as Query/Load/Metadata.
TASK is mainly used to record the memory overhead of a single task such as a single query, load, and compaction task.
VERBOSE is used for other more detailed memtrackers.
2021-06-16 09:44:24 +08:00
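A small sketch of how the three levels from #5956 above could be used to filter what a web page shows. The enum values mirror the commit message; `TrackerInfo`, `visible_trackers` and the rule "a show level also includes all coarser levels" are assumptions for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Levels named in the commit message, ordered from coarse to detailed.
enum class MemTrackerLevel { OVERVIEW = 0, TASK = 1, VERBOSE = 2 };

// Hypothetical, simplified tracker description.
struct TrackerInfo {
    std::string label;
    MemTrackerLevel level;
};

// Assumed rule: showing level L lists every tracker whose level is L or coarser.
std::vector<std::string> visible_trackers(const std::vector<TrackerInfo>& all,
                                          MemTrackerLevel show_level) {
    std::vector<std::string> labels;
    for (const TrackerInfo& t : all) {
        assert(!t.label.empty());  // labels are expected to be non-empty
        if (static_cast<int>(t.level) <= static_cast<int>(show_level)) {
            labels.push_back(t.label);
        }
    }
    return labels;
}
```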
6d6c3d9703 [Enhancement] Reduce memory consumption by releasing readers earlier (#5811)
We create multiple rowset readers to read the data of one tablet.
After a rowset reader has reached EOF, it can be released to
reduce resource (typically memory) consumption.
Likewise, we can release a segment reader when it reaches EOF.
2021-06-16 09:37:50 +08:00
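The idea in #5811 above, sketched with a toy reader: as soon as a reader reports EOF its owning pointer is reset, so its buffers are freed before the remaining readers are drained. `RowsetReader` here is a hypothetical stand-in that yields integers, not the real class.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy reader: yields the rows it was constructed with, then reports EOF.
class RowsetReader {
public:
    explicit RowsetReader(std::vector<int> rows) : _rows(std::move(rows)) {}

    // Returns false once all rows have been consumed (EOF).
    bool next(int* row) {
        assert(row != nullptr);
        if (_pos >= _rows.size()) {
            return false;
        }
        *row = _rows[_pos++];
        return true;
    }

private:
    std::vector<int> _rows;
    std::size_t _pos = 0;
};

// Reads every row from every reader, releasing each reader at EOF instead
// of keeping all of them alive until the whole scan is finished.
std::vector<int> read_all(std::vector<std::unique_ptr<RowsetReader>>& readers) {
    std::vector<int> out;
    for (auto& reader : readers) {
        int row = 0;
        while (reader->next(&row)) {
            out.push_back(row);
        }
        reader.reset();  // EOF: free this reader's memory right away
    }
    return out;
}
```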
4ef1dbf394 [Bug] Fix lack of rdlock before rowset_with_max_version() in compaction log (#5953) 2021-06-03 10:01:35 +08:00
1a81b9e160 [MemTracker] Some enhancements of MemTracker (#5783)
1. Give some MemTrackers a reasonable parent MemTracker instead of the root tracker.
2. Make each MemTracker easy to trace.
3. Add a show level to MemTracker to control how many trackers are shown on the web page.
2021-05-19 09:27:50 +08:00
e519a24c9a dynamic adjust compaction policy (#5651)
Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-04-26 12:39:13 +08:00
b423274f17 [Enhance] Make MemTracker more accurate (#5515) (#5516)
* [Enhance] Make MemTracker more accurate (#5515)
 This PR is mainly about:
 1. Improve the readability of MemTrackers' name
 2. Add the MemTracker of:
    * Load
    * Compaction
    * SchemaChange
    * StoragePageCache
    * TabletManager
 3. Change SchemaChange to a Singleton

* revise some code for Code Review

* change the name of mem_tracker

* keep reader_context have the same lifetime of rowset_reader in schema change.

* change vlog notice to log(warning) in schema change
2021-04-08 09:14:55 +08:00
79544d39cb [Metrics][LOG] Update metrics of 'max_compaction_score' and log for compaction (#5592)
* optimize compaction metrics and log

Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-04-08 09:10:40 +08:00
0131c33966 [Enhance] Improve the readability of memtrackers' name (#5455)
Improve the readability of MemTracker names, so that the web page at be_ip:port/mem_tracker is easier to read.
2021-03-11 22:33:31 +08:00
99404df8b2 [Bug][Compaction] Fix bug that output rowset is not deleted after compaction failure (#4964)
This CL fixes 2 bugs:

1. When the compaction fails, we must explicitly delete the output rowset,
   otherwise the GC logic cannot process these rows.

2. Base compaction failed if the compaction process included some delete versions in SegmentV2,
   because the number of filtered rows was wrong.
2020-11-30 22:02:03 +08:00
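For the first bug in #4964 above, one way to guarantee that a failed compaction never leaves its output behind is a scope guard. This is a swapped-in technique shown for illustration, not necessarily how the patch deletes the rowset; `OutputRowsetGuard` and `do_compaction` are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Runs the cleanup callback on destruction unless commit() was called.
class OutputRowsetGuard {
public:
    explicit OutputRowsetGuard(std::function<void()> cleanup)
            : _cleanup(std::move(cleanup)) {}
    ~OutputRowsetGuard() {
        if (!_committed && _cleanup) {
            _cleanup();
        }
    }
    OutputRowsetGuard(const OutputRowsetGuard&) = delete;
    OutputRowsetGuard& operator=(const OutputRowsetGuard&) = delete;

    void commit() { _committed = true; }

private:
    std::function<void()> _cleanup;
    bool _committed = false;
};

// Toy compaction: "writes" an output rowset, then either succeeds or fails.
// files_on_disk stands in for the data directory.
bool do_compaction(bool merge_succeeds, std::vector<std::string>* files_on_disk) {
    assert(files_on_disk != nullptr);
    files_on_disk->push_back("output_rowset");
    OutputRowsetGuard guard([files_on_disk] { files_on_disk->pop_back(); });

    if (!merge_succeeds) {
        return false;  // guard removes the half-written output rowset
    }
    guard.commit();    // success: keep the output rowset
    return true;
}
```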
ec7e1c6b1b [Refactor] Execute 'pick rowsets' before applying for permits for a compaction task (#4891)
In the current compaction mechanism, a producer thread keeps producing compaction tasks,
and the selected tablet must apply for `permits`.
When a tablet can hold `permits`, the compaction task for this tablet is submitted to the thread pool.
We take the compaction score as `permits`, which is used for limiting memory consumption.
However, `pick_rowset_to_compaction()` is executed before the file merge in the compaction thread,
and the number of segment files that actually take part in the merge operation is smaller than the compaction score.
In addition, it is also possible that the compaction task exits directly because the tablet doesn't meet
the requirements of compaction.

This patch optimizes and refactors the code of compaction, so that we can execute 'pick rowsets'
before applying for permits for a compaction task, calculate the number of segment files that actually
participate in the merge operation, and take this number as `permits`.
2020-11-30 11:41:14 +08:00
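A sketch of the reordering described in #4891 above: pick the rowsets first, then request only as many permits as there are segment files in the picked set. `RowsetInfo`, `pick_rowsets` and its segment budget are hypothetical; the real pick policy is more involved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of a rowset.
struct RowsetInfo {
    int64_t start_version;
    int64_t end_version;
    int num_segments;
};

// Permits = number of segment files in the given rowsets.
int64_t calc_permits(const std::vector<RowsetInfo>& rowsets) {
    int64_t permits = 0;
    for (const RowsetInfo& rs : rowsets) {
        assert(rs.num_segments >= 0);
        permits += rs.num_segments;
    }
    return permits;
}

// Assumed pick policy: take rowsets in order until a segment budget is hit.
// With fewer than two rowsets there is nothing to merge, so nothing is picked
// and the task can exit without asking for any permits.
std::vector<RowsetInfo> pick_rowsets(const std::vector<RowsetInfo>& candidates,
                                     int64_t max_segments) {
    std::vector<RowsetInfo> picked;
    int64_t total = 0;
    for (const RowsetInfo& rs : candidates) {
        if (total + rs.num_segments > max_segments) {
            break;
        }
        picked.push_back(rs);
        total += rs.num_segments;
    }
    if (picked.size() < 2) {
        picked.clear();
    }
    return picked;
}
```

Calling `calc_permits(pick_rowsets(...))` instead of `calc_permits(candidates)` is the whole point: the request reflects the files that will really be merged.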
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
10e1e29711 Remove header file common/names.h (#4945) 2020-11-26 17:00:48 +08:00
588e5bee47 [Bug] Fix bug of cumulative compaction and deletion of stale version (#4593)
When selecting candidate rowsets to do the cumulative compaction,
some rowsets may not be selected because the protection time has not expired.
Therefore, we need to find the current longest continuous version path in the candidate rowsets.
2020-10-21 10:03:55 +08:00
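The "longest continuous version path" from #4593 above can be sketched as a single pass over the candidate rowsets. The sketch assumes the candidates are sorted by start version and do not overlap, and it measures "longest" by the number of rowsets; both are assumptions, as are the names `Version` and `longest_continuous_path`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical inclusive version range [first, second] of a rowset.
struct Version {
    int64_t first;
    int64_t second;
};

// Returns the longest run of candidates in which every rowset starts right
// after the previous one ends. Gaps appear where a rowset was skipped, for
// example because its protection time has not expired yet.
std::vector<Version> longest_continuous_path(const std::vector<Version>& sorted) {
    std::size_t best_begin = 0;
    std::size_t best_len = 0;
    std::size_t run_begin = 0;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        if (i > 0) {
            assert(sorted[i].first > sorted[i - 1].second);  // sorted, no overlap
            if (sorted[i].first != sorted[i - 1].second + 1) {
                run_begin = i;  // gap: start a new run here
            }
        }
        const std::size_t run_len = i - run_begin + 1;
        if (run_len > best_len) {
            best_len = run_len;
            best_begin = run_begin;
        }
    }
    return std::vector<Version>(sorted.begin() + best_begin,
                                sorted.begin() + best_begin + best_len);
}
```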
75e0ba32a1 Fixes some be typo (#4714) 2020-10-13 09:37:15 +08:00
eba595583e [Optimize] Optimize the execution model of compaction to limit memory consumption (#4670)
Currently, there are M threads to do base compaction and N threads to do cumulative compaction for each disk. 
Too many compaction tasks may run out of memory, so the max concurrency of running compaction tasks
is limited by semaphore.
If the running threads consume too much memory, we have no defense against it. In addition, reducing concurrency to avoid OOM
means that some compaction tasks cannot be executed in time, and we may encounter heavier compaction later.
Therefore, concurrency limitation is not enough.

The strategy proposed in #3624 may be effective in solving the OOM problem.

A CompactionPermitLimiter is used to limit compaction, following a single-producer/multi-consumer model.
The producer tries to generate compaction tasks and acquire `permits` for each task.
A compaction task that holds `permits` is executed in the thread pool, and each finished task
releases its `permits`.

`permits` must be acquired before a compaction task can execute. When the sum of `permits`
held by executing compaction tasks reaches a threshold, subsequent compaction tasks are no longer admitted
until some `permits` are released. The tablet compaction score is used as the `permits` of a compaction task here.

To some extent, memory consumption can be limited by setting appropriate `permits` threshold.
2020-10-11 11:39:25 +08:00
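The permit accounting described in #4670 above can be sketched with a small limiter. This `PermitLimiter` is a hypothetical, non-blocking simplification: the producer would call `try_acquire` and retry later on failure, whereas a real limiter may block until permits are released.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical limiter: tasks may run only while the sum of permits held by
// running tasks stays within the configured threshold.
class PermitLimiter {
public:
    explicit PermitLimiter(int64_t total_permits) : _total(total_permits) {
        assert(total_permits > 0);
    }

    // Called by the producer before submitting a task to the thread pool.
    bool try_acquire(int64_t permits) {
        assert(permits >= 0);
        std::lock_guard<std::mutex> lock(_mutex);
        if (_used + permits > _total) {
            return false;  // threshold reached: the task has to wait
        }
        _used += permits;
        return true;
    }

    // Called by a worker when its compaction task has finished.
    void release(int64_t permits) {
        std::lock_guard<std::mutex> lock(_mutex);
        _used -= permits;
    }

    int64_t used() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _used;
    }

private:
    const int64_t _total;
    mutable std::mutex _mutex;
    int64_t _used = 0;
};
```

Note that a task needing more permits than the threshold would never be admitted by this sketch; a real limiter has to decide how to handle that case.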
10f822eb43 [MemTracker] make all MemTrackers shared (#4135)
We make all MemTrackers shared, in order to show MemTracker real-time consumptions on the web.
As follows:
1. nearly all MemTracker raw ptr -> shared_ptr
2. Use CreateTracker() to create new MemTracker(in order to add itself to its parent)
3. RowBatch & MemPool still use raw ptrs of MemTracker; it's easy to ensure that the RowBatch & MemPool destructors execute
     before MemTracker's destructor, so we don't change this code.
4. MemTracker can use RuntimeProfile's counter to calculate consumption, so RuntimeProfile's counter needs to be shared
    too. We add a shared counter pool to store the shared counters and don't change the other counters of RuntimeProfile.
Note that this PR doesn't change the MemTracker tree structure, so there are still some orphan trackers, e.g. RowBlockV2's MemTracker. If you find that some shared MemTrackers track little memory but cost too much time, you can make them orphans; then it's fine to use the raw ptr.
2020-07-31 21:57:21 +08:00
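Points 1 and 2 of #4135 above (shared ownership, and a factory that registers the new tracker with its parent) can be sketched as below. The class is a hypothetical miniature: only the name `CreateTracker` comes from the commit message, and the choice of `weak_ptr` for the child list is an assumption made here to avoid reference cycles.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class MemTracker {
public:
    // Factory: creates the tracker and adds it to its parent's child list.
    static std::shared_ptr<MemTracker> CreateTracker(
            std::string label, std::shared_ptr<MemTracker> parent = nullptr) {
        assert(!label.empty());
        std::shared_ptr<MemTracker> tracker(new MemTracker(std::move(label), parent));
        if (parent) {
            parent->_children.push_back(tracker);  // stored as weak_ptr
        }
        return tracker;
    }

    // Consumption is propagated to every ancestor.
    void consume(int64_t bytes) {
        for (MemTracker* t = this; t != nullptr; t = t->_parent.get()) {
            t->_consumption += bytes;
        }
    }
    void release(int64_t bytes) { consume(-bytes); }

    int64_t consumption() const { return _consumption; }
    const std::string& label() const { return _label; }

    // Children that are still alive, e.g. for listing on a web page.
    int live_children() const {
        int n = 0;
        for (const auto& child : _children) {
            if (!child.expired()) {
                ++n;
            }
        }
        return n;
    }

private:
    MemTracker(std::string label, std::shared_ptr<MemTracker> parent)
            : _label(std::move(label)), _parent(std::move(parent)) {}

    std::string _label;
    std::shared_ptr<MemTracker> _parent;              // child keeps parent alive
    std::vector<std::weak_ptr<MemTracker>> _children; // parent only observes children
    int64_t _consumption = 0;
};
```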
a01d1aec56 [Compaction] track RowsetReader's mem & add metric (#4068)
Ref https://github.com/apache/incubator-doris/issues/3624#issuecomment-655933244
Only RowsetReaders in compaction are under the track.
Other RowsetReaders won't be affected, because the parent_tracker is nullptr.
2020-07-24 07:58:09 +08:00
3c09e1e1d8 [trace] Adapt trace util to compaction module (#3814)
Trace util is helpful for diagnosing compaction performance problems.
We can get a trace log for base compaction like:
```
W0610 11:26:33.804431 56452 storage_engine.cpp:552] Trace:
0610 11:23:03.727535 (+     0us) storage_engine.cpp:554] start to perform base compaction
0610 11:23:03.728961 (+  1426us) storage_engine.cpp:560] found best tablet 546859
0610 11:23:03.728963 (+     2us) base_compaction.cpp:40] got base compaction lock
0610 11:23:03.729029 (+    66us) base_compaction.cpp:44] rowsets picked
0610 11:24:51.784439 (+108055410us) compaction.cpp:46] got concurrency lock and start to do compaction
0610 11:24:51.784818 (+   379us) compaction.cpp:74] prepare finished
0610 11:26:33.359265 (+101574447us) compaction.cpp:87] merge rowsets finished
0610 11:26:33.484481 (+125216us) compaction.cpp:102] output rowset built
0610 11:26:33.484482 (+     1us) compaction.cpp:106] check correctness finished
0610 11:26:33.513197 (+ 28715us) compaction.cpp:110] modify rowsets finished
0610 11:26:33.513300 (+   103us) base_compaction.cpp:49] compaction finished
0610 11:26:33.513441 (+   141us) base_compaction.cpp:56] unused rowsets have been moved to GC queue
Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"input_rowsets_data_size":1256413170,"input_segments_num":44,"merge_rowsets_latency_us":101574444,"merged_rows":0,"output_row_num":3346807,"output_rowset_data_size":1228439659,"output_segments_num":6}
```
for cumulative compaction like:
```
W0610 11:14:18.714366 56468 storage_engine.cpp:518] Trace:
0610 11:14:08.068484 (+     0us) storage_engine.cpp:520] start to perform cumulative compaction
0610 11:14:08.069844 (+  1360us) storage_engine.cpp:526] found best tablet 547083
0610 11:14:08.069846 (+     2us) cumulative_compaction.cpp:42] got cumulative compaction lock
0610 11:14:08.069947 (+   101us) cumulative_compaction.cpp:46] calculated cumulative point
0610 11:14:08.070141 (+   194us) cumulative_compaction.cpp:50] rowsets picked
0610 11:14:08.070143 (+     2us) compaction.cpp:46] got concurrency lock and start to do compaction
0610 11:14:08.070518 (+   375us) compaction.cpp:74] prepare finished
0610 11:14:15.389893 (+7319375us) compaction.cpp:87] merge rowsets finished
0610 11:14:15.390916 (+  1023us) compaction.cpp:102] output rowset built
0610 11:14:15.390917 (+     1us) compaction.cpp:106] check correctness finished
0610 11:14:15.409460 (+ 18543us) compaction.cpp:110] modify rowsets finished
0610 11:14:15.409496 (+    36us) cumulative_compaction.cpp:55] compaction finished
0610 11:14:15.410138 (+   642us) cumulative_compaction.cpp:65] unused rowsets have been moved to GC queue
Metrics: {"filtered_rows":0,"input_row_num":136707,"input_rowsets_count":302,"input_rowsets_data_size":76617836,"input_segments_num":302,"merge_rowsets_latency_us":7319372,"merged_rows":0,"output_row_num":136707,"output_rowset_data_size":53893280,"output_segments_num":1}
```
2020-06-13 19:31:51 +08:00
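The `Metrics` lines in the traces of #3814 above are enough for a quick throughput estimate. The sketch below takes three of the logged fields and derives rows per second and MiB per second for the merge step; the struct and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Subset of the fields printed in the "Metrics:" line of a compaction trace.
struct CompactionMetrics {
    int64_t input_row_num;
    int64_t input_rowsets_data_size;   // bytes
    int64_t merge_rowsets_latency_us;  // microseconds
};

// Rows merged per second; 0 when the latency is not positive.
double rows_per_second(const CompactionMetrics& m) {
    assert(m.input_row_num >= 0);
    if (m.merge_rowsets_latency_us <= 0) {
        return 0.0;
    }
    return static_cast<double>(m.input_row_num) * 1e6 /
           static_cast<double>(m.merge_rowsets_latency_us);
}

// Input data merged per second in MiB; 0 when the latency is not positive.
double mib_per_second(const CompactionMetrics& m) {
    if (m.merge_rowsets_latency_us <= 0) {
        return 0.0;
    }
    const double mib = static_cast<double>(m.input_rowsets_data_size) / (1024.0 * 1024.0);
    const double seconds = static_cast<double>(m.merge_rowsets_latency_us) / 1e6;
    return mib / seconds;
}
```

With the numbers logged above, the base compaction merges roughly 33,000 rows per second and the cumulative compaction roughly 18,700 rows per second.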
37fccd53c4 [Tablet] A small refactor on class Tablet (#3339)
There are no functional changes in this patch.
Key refactor points are:
- Remove meaningless return value of functions in class Tablet, and
  also some related functions in other classes
- Allow RowsetGraph::capture_consistent_versions to pass a nullptr
  to the output parameter
- Use CHECK instead of LOG(FATAL) to simplify code
2020-04-24 22:22:26 +08:00