Commit Graph

42 Commits

Author SHA1 Message Date
580ce38a3f [fix](schema_hash) Fix bug that introduced by removing schema_hash (#9449) 2022-05-08 21:03:10 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- sh build-support/clang-format.sh  to  clang-format all c++ code
2022-04-29 16:14:22 +08:00
e5e0dc421d [refactor] Change ALL OLAPStatus to Status (#8855)
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
2022-04-14 11:43:49 +08:00
c872793a23 remove rowset converter since it is useless (#8974)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-04-13 10:40:12 +08:00
290366787c [refactor] refactor code, replace some file with stl libs (#8759)
1. replace ConditionVariables with std::condition_variable
2. repalace Mutex with std::mutex
3. repalce MonoTime with std::chrono
2022-04-13 09:55:29 +08:00
519305cb22 [feature-wip] (memory tracker) (step4) Switch TLS mem tracker to separate more detailed memory usage (#8669)
Based on #8605, Separate out the memory usage of each operator from the Query/Load/StorageEngine mem tracker.
2022-04-08 09:02:26 +08:00
c69dd54116 [refactor](mutex) Use std::mutex to replace Mutex and refactor some lock logic (#8452) 2022-03-24 14:50:02 +08:00
eeae516e37 [Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476)
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G

Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
2022-03-20 23:06:54 +08:00
c86d469baf [Refactor](storage_engine) Use std::shared_mutex to replace RWMutex (#8387) 2022-03-11 18:14:24 +08:00
0ff7de4157 [refactor] remove agent status (#8273)
There are 3 error code types in BE: OLAPStatus AgentStatus Status.
It is very confused and sometimes conflict during write code.
I will try to unify them to Status.
2022-03-09 13:04:50 +08:00
26289c28b0 [fix](load)(compaction) Fix NodeChannel coredump bug and modify some compaction logic (#8072)
1. Fix the problem of BE crash caused by destruct sequence. (close #8058)
2. Add a new BE config `compaction_task_num_per_fast_disk`

    This config specify the max concurrent compaction task num on fast disk(typically .SSD).
    So that for high speed disk, we can execute more compaction task at same time,
    to compact the data as soon as possible

3. Avoid frequent selection of unqualified tablet to perform compaction.
4. Modify some log level to reduce the log size of BE.
5. Modify some clone logic to handle error correctly.
2022-02-17 10:52:08 +08:00
7d7e3a39f5 [refactor] Remove snapshot converter and unused Protobuf Definitions (#8026)
1. remove snapshot converter
2. remove unused protobuf definitions
3. move some macro as const variables
2022-02-12 16:06:04 +08:00
20ef8a6e21 [feature-wip](remote storage)(step1) use a struct instead of string for parameter path, add basic remote method (#7098)
For the first, we need to make a parameter to discribe the data is local or remote.
At then, we need to support some basic function to support the operation for remote storage.
2021-12-22 22:58:23 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
8850cfe2ad [Compaction] Modify compaction logic (#5737)
1. Add /api/compaction/run_status to show the running compaction tasks.
2. Support do base and cumulative compaction for one tablet at same time.
3. Modify some log level.
4. Add a feedback document.
2021-05-07 11:18:47 +08:00
d641a26490 [Refactor] Remove boost filesystem (#5579)
* use std::filesystem instead of boost
Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>
2021-04-08 09:11:59 +08:00
a56e7e2192 [Refactor] make uint24_t,OLAPIndexFixedHeader as a POD type (#5559) 2021-03-27 18:59:23 +08:00
a6e2c3e3f1 [Bug][Clone] Fix the bug that incremental clone is not triggered (#5230)
In version 0.13, we support a more efficient compaction logic. 
This logic will maintain multiple version paths of the tablet.
This can avoid -230 errors and can also support incremental clone.

But the previous incremental clone uses the incremental rowset meta recorded in `incr_rs_meta`.
At present, the incremental rowset meta recorded in `incr_rs_meta` and the records
in `stale_rs_meta` are duplicated, and the current clone logic does not adapt to the
new multi-version path, resulting in many cases not triggering incremental clone.

This CL mainly modified:

1. Removed `incr_rs_meta` metadata
2. Modified the clone logic. When the clone is incremented, it will try to read the rowset in `stale_rs_meta`.
3. Delete a lot of code that was previously used for version compatibility.
2021-02-06 22:04:48 +08:00
93a4c7efc1 [LOG] Standardize the use of VLOG in code (#5264)
At present, the application of vlog in the code is quite confusing.
It is inherited from impala VLOG_XX format, and there is also VLOG(number) format.
VLOG(number) format does not have a unified specification, so this pr standardizes the use of VLOG
2021-01-21 12:09:09 +08:00
ec7e1c6b1b [Refactor] Execute 'pick rowsets' before applying for permits for a compaction task (#4891)
The current compaction mechanism is that there is a producer thread that has been producing compaction tasks,
and the selected tablet must apply for `permits`.
When a tablet could hold `permits`, compaction task for this tablet will be submitted to  thread pool.
We take compaction score as `permits` which is used for limiting memory consumption.
However,  `pick_rowset_to_compaction()` will be executed before the file merge in compaction thread,
and the number of segment files that actually perform the merge operation is smaller than compaction score.
In addition, it is also possible that compaction task exits directly because the tablet doesn't meet
the requirements of compaction. 

This patch optimizes and refactors the code of compaction, so that we can execute 'pick rowsets'
before applying for permits for a compaction task, calculate the number of segment files that actually
participate in the merge operation, and take this number as `permits`.
2020-11-30 11:41:14 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
10e1e29711 Remove header file common/names.h (#4945) 2020-11-26 17:00:48 +08:00
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
b85bb0e2e9 [Bug-Fix] Some deleted tablets are not recycled on BE (#4401) 2020-08-27 12:09:19 +08:00
8a169981cf [Bug][TabletRepair] Fix bug that too many replicas generated when decommission BE (#4148)
Try to select the BE with an existing replicas as the destination BE for
REPLICA_RELOCATING clone task.
Fix #4147 

Also add 2 new FE configs `max_clone_task_timeout_sec` and `min_clone_task_timeout_sec`
2020-07-30 09:46:33 +08:00
c08d6e4708 [tablet meta] Do some refactor on TabletMeta (#3136)
remove some functions' return value which always return OLAP_SUCCESS
optimize some loops
2020-03-20 15:03:22 +08:00
c83729435f Write delete predicate into RowsetMeta upon upgrade from Doris-0.10 to Doris-0.11 (#3044)
If delete predicate exists in meta in Doris-0.10, all of this predicates should
be remained. There is an confused place in Doris-0.10. The delete predicate
only exists in OLAPHeaderMessage and PPendingDelta, not in PDelta.
This trick results this bug.
2020-03-07 11:16:48 +08:00
b3c5f0fac7 Remove unneeded headers included in agent-util (#2929) 2020-02-18 13:18:56 +08:00
7c4149cf27 Improve comparison and printing of Version (#2796)
* Improve comparison and printing of Version

There are two members in `Version`:` first` and `second`.
There are many places where we need to print one `Version` object  and
compare two `Version` objects, but in the current code, these two members
are accessed directly, which makes the code very tedious.

This patch mainly do:
1. Adds overloaded methods for `operator<<()` for `Version`, so
   we can directly print a Version object;
2. Adds the `cantains()` method to determine whether it is an containment
   relationship;
3. Uses `operator==()` to determine if two `Version` objects are equal.

Because there are too many places need to be modified, there are still some
naked codes left, which will be modified later.

This patch also removes some necessary header file references.

No functional changes in this patch.
2020-01-19 18:04:28 +08:00
9e54751098 [Snapshot] Modify the prefer snapshot version (#2748)
In this CL, prefer snapshot version in snapshot request is defined
in thrift. So that both FE and BE can use this version value.
2020-01-15 15:10:14 +08:00
9c90b09a3f [Alter Table] No need to check whether table is stable when doing some kinds of alter operation (#2617)
* [Alter Table] No need to check whether table is stable when doing some kinds of alter operation.

Not all alter table operation require table to be stable. Such as rename, modify meta data.
2020-01-02 20:51:23 +08:00
fbee3c7722 Remove VersionHash used to comparison in BE (#2358) 2019-12-04 20:09:03 +08:00
e98bbb5bc5 Refactor clone task (#2285)
In the previous implementation, clone task will continue download files
even if some error happened. This may cause unexpected problem. This
Change List refactor it to that when error happends, clone task will
fail total and try to clone from another remote source.

Besides above change, I call FileUtils::remove_all and create_dir
instead of boost one, which may cause exception. What's more
AgentMasterClient is replaced with ThriftRpcHelper, by this change
conncection can be reused.
2019-11-24 22:36:10 +08:00
d0316d158d Refactor and reorganize the file utils (#2089) 2019-11-11 20:25:41 +08:00
6b4ef34162 fix AlphaRowsetTest by remove StorageEngine #2078 (#2091) 2019-10-30 19:39:41 +08:00
01e71def63 Update engine_clone_task.cpp (#1979) 2019-10-14 16:23:04 +08:00
85940a292b RowsetFactory as a single entry for Rowset creation (#1748) 2019-09-05 18:10:18 +08:00
6f4feca3dc Add rowset id generator to FE and BE (#1678) 2019-09-02 18:51:31 +08:00
7e981b2b14 Limit the disk usage to avoid running out of disk capacity (#1702)
Set high watermark and flood stage of disk used capacity.
And forbid some operations if disk usage is too high.
2019-08-27 22:18:17 +08:00
2b2bc82ae2 Add timeout on snapshot of data (#1672)
Release snapshot when finishing or cancelling backup/restore job.
Snapshot may takes a lot disk space if not releasing them in time.
2019-08-21 21:18:53 +08:00
dcb75729db Change cumulative compaction for decoupling storage from compution (#1576)
1. Calculate cumulative point when loading tablet first time.
2. Simplify pick rowsets logic upon delete predicate.
3. Saving meta and modify rowsets only once after cumulative compaction.
2019-08-13 18:25:56 +08:00
0d48a3961c Refactor Storage Engine (#1478)
NOTE: This patch would modify all Backend's data.
And this will cause a very long time to restart be.
So if you want to interferer your product environment,
you should upgrade backend one by one.

1. Refactoring be is to clarify the structure the codes.
2. Use unique id to indicate a rowset.
   Nameing rowset with tablet_id and version will lead to
   many conflicts among compaction, clone, restore.
3. Extract an rowset interface to encapsulate rowsets
   with different format.
2019-07-15 21:18:22 +08:00