Commit Graph

127 Commits

Author SHA1 Message Date
942611c185 Revert "[enhancement](compaction) opt compaction task producer and quick compaction (#13495)" (#13833)
This reverts commit 4f2ea0776ca3fe5315ab5ef7e00eefabfb5771a0.
2022-11-01 14:22:12 +08:00
4f2ea0776c [enhancement](compaction) opt compaction task producer and quick compaction (#13495)
1.remove quick_compaction's rowset pick policy, call cu compaction when trigger
quick compaction
2. skip tablet's compaction task when compaction score is too small

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-10-31 12:24:05 +08:00
671dc93035 [feature-wip](unique-key-merge-on-write) fix that versions of multiple replicas are inconsistent when rebalance (#12363) 2022-10-09 11:31:27 +08:00
Pxl
8731eea26e [Chore](clang) fix some build fail on clang15 (#12882)
remove unused variables
2022-09-26 23:13:28 +08:00
e01986b8b9 [feature](light-schema-change) fix light-schema-change and add more cases (#12160)
Fix _delete_sign_idx and _seq_col_idx when append_column or build_schema when load.
Tablet schema cache support recycle when schema sptr use count equals 1.
Add a http interface for flink-connector to sync ddl.
Improve tablet->tablet_schema() by max_version_schema.
2022-09-17 11:29:36 +08:00
db07e51cd3 [refactor](status) Refactor status handling in agent task (#11940)
Refactor TaggableLogger
Refactor status handling in agent task:
Unify log format in TaskWorkerPool
Pass Status to the top caller, and replace some OLAPInternalError with more detailed error message Status
Premature return with the opposite condition to reduce indention
2022-08-29 12:06:01 +08:00
805c13aaa1 [fix](backup) fix backup restore raise Storage backend not initialized. error (#11736)
fix backup restore raise Storage backend not initialized. error
2022-08-15 13:24:38 +08:00
a6537a90cd [Enhancement] Garbage collection of unused data on remote storage backend (#10731)
* [Feature](cold_on_s3) support unused remote rowset gc

* return aborted when skip drop tablet

* perform unused remote rowset gc
2022-07-29 14:38:39 +08:00
01e108cb7b [feature-wip](unique-key-merge-on-write) update delete bitmap while publish version (#11195)
1.make version publish work in version order
2.update delete bitmap while publish version, load current version rowset
primary key and search in pre rowsets
3.speed up publish version task by parallel tablet publish task

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-07-27 16:26:42 +08:00
0b6d2ae290 [fix] Move s3 fs connect outside the lock critical area (#11026)
* fix potential bug of S3FileSystem

* move s3 fs connect outside the lock critical area
2022-07-23 16:06:29 +08:00
98abb8bc1f fix empty storage policy, be refresh exception log. (#11123)
* fix empty storage policy, be refresh exception log.

* fix log level
2022-07-22 22:10:16 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
2df1822269 [bugfix]fix DCHECK failure in remove_all_remote_rowsets (#10994) 2022-07-20 19:06:21 +08:00
8c544b6e13 fix show storage policy null pointer and redundant log (#10906)
* fix show storage policy null pointer and redundant log
2022-07-18 14:08:54 +08:00
401203da6a [feature](code-data) move cold data to object storage without losing any feature(FE) (#10693)
Co-authored-by:platonekosama@gmail.com
2022-07-15 18:00:48 +08:00
331fa50501 [feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280)
This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet,
and there is no necessary to prohibit loading new data to cooled tablets.

Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.

The abstracted `RemoteFileSystem` can try local caching strategies with different granularity,
instead of caching segment files as before.

To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
2022-07-08 12:18:39 +08:00
1d3496c6ab [feature] support backup/restore connect to HDFS (#10081) 2022-06-19 10:26:20 +08:00
Pxl
fd0bd395ac [Enhancement] Remove some unused include (#10035) 2022-06-17 10:47:25 +08:00
Pxl
5805f8077f [Feature] [Vectorized] Some pre-refactorings or interface additions for schema change part2 (#10003) 2022-06-16 10:50:08 +08:00
4dfebb9852 [Feature] compaction quickly for small data import (#9804)
* compaction quickly for small data import #9791
1.merge small versions of rowset as soon as possible to increase the import frequency of small version data
2.small version means that the number of rows is less than config::small_compaction_rowset_rows  default 1000
2022-06-15 21:48:34 +08:00
f4e2f78a1a [fix] Fix the bug that data balance causes tablet loss (#9971)
1. Provide a FE conf to test the reliability in single replica case when tablet scheduling are frequent.
2. According to #6063, almost apply this fix on current code.
2022-06-15 09:52:56 +08:00
ca05d1ee01 [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661)
1. Fix Lru Cache MemTracker consumption value is negative.
2. Fix compaction Cache MemTracker has no track.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
2022-05-25 08:56:17 +08:00
4cd579b155 [refactor] Check status precise_code instead of construct OLAPInternalError (#9514)
* check status precise_code instead of construct OLAPInternalError
* move is_io_error to Status
2022-05-12 15:39:29 +08:00
580ce38a3f [fix](schema_hash) Fix bug that introduced by removing schema_hash (#9449) 2022-05-08 21:03:10 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- sh build-support/clang-format.sh  to  clang-format all c++ code
2022-04-29 16:14:22 +08:00
e157c2c254 [feature-wip](remote-storage) step3: Support remote storage, only for be, add migration_task_v2 (#8806)
1. Add TStorageMigrationReqV2 and EngineStorageMigrationTask to support migration action
2. Change TabletManager::create_tablet() for remote storage
3. Change TabletManager::try_delete_unused_tablet_path() for remote storage
2022-04-22 22:38:10 +08:00
d1d834694f [fix] Fix bug of wrong argument of drop_tablet function (#9031)
introduced from #8574
2022-04-15 15:19:28 +08:00
e5e0dc421d [refactor] Change ALL OLAPStatus to Status (#8855)
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
2022-04-14 11:43:49 +08:00
290366787c [refactor] refactor code, replace some file with stl libs (#8759)
1. replace ConditionVariables with std::condition_variable
2. repalace Mutex with std::mutex
3. repalce MonoTime with std::chrono
2022-04-13 09:55:29 +08:00
d51545a952 [fix](ut)(memory-leak) Fix be asan ut failed and hdfs file reader memory leak (#8905) 2022-04-08 00:07:00 +08:00
98cab78320 [refactor](schema_hash) remove schema_hash since every tablet id in be is unique (#8574) 2022-04-07 08:37:45 +08:00
f96bc62573 [feature](balance) Support balance between disks on a single BE (#8553)
Current situation of Doris is that the cluster is balanced, but the disks of a backend may be unbalanced.
for example, backend A have two disks: disk1 and disk2, disk1's usage is 98%, but disk2's usage is only 40%.
disk1 is unable to take more data, therefore only one disk of backend A can take new data,
the available write throughput of backend A is only half of its ability, and we can not resolve this through load or 
partition rebalance now.

So we introduce disk rebalancer, disk rebalancer is different from other rebalancer(load or partition)
which take care of cluster-wide data balancing. it takes care about backend-wide data balancing.

[For more details see #8550](https://github.com/apache/incubator-doris/issues/8550)
2022-03-28 10:03:21 +08:00
e17aef9467 [refactor] refactor the implement of MemTracker, and related usage (#8322)
Modify the implementation of MemTracker:
1. Simplify a lot of useless logic;
2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing;
3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes;
4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently;
6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
2022-03-11 22:04:23 +08:00
c86d469baf [Refactor](storage_engine) Use std::shared_mutex to replace RWMutex (#8387) 2022-03-11 18:14:24 +08:00
0ff7de4157 [refactor] remove agent status (#8273)
There are 3 error code types in BE: OLAPStatus AgentStatus Status.
It is very confused and sometimes conflict during write code.
I will try to unify them to Status.
2022-03-09 13:04:50 +08:00
50864aca7d [refactor] fix warings when compile with clang (#8069) 2022-02-19 11:29:02 +08:00
26289c28b0 [fix](load)(compaction) Fix NodeChannel coredump bug and modify some compaction logic (#8072)
1. Fix the problem of BE crash caused by destruct sequence. (close #8058)
2. Add a new BE config `compaction_task_num_per_fast_disk`

    This config specify the max concurrent compaction task num on fast disk(typically .SSD).
    So that for high speed disk, we can execute more compaction task at same time,
    to compact the data as soon as possible

3. Avoid frequent selection of unqualified tablet to perform compaction.
4. Modify some log level to reduce the log size of BE.
5. Modify some clone logic to handle error correctly.
2022-02-17 10:52:08 +08:00
aea3e4e59b [refactor] Remove version hash from BE and related test in BE (#8027) 2022-02-14 09:29:27 +08:00
ed39ff1500 [feature](compaction) Support triggering compaction for a specific partition manually (#7521)
Add statement to trigger cumulative or base compaction for a specified partition.
2022-01-21 09:27:06 +08:00
9ddcf0625c [improvement](load) Transaction for load job with no data for all partitions should be considered as normal and should not be aborted (#7240)
If the load result set is empty, or the load data is all filtered by the `where` condition,
it will not return failed with msg `all partitions have no load data`, but will return success directly.
2022-01-05 10:38:33 +08:00
20ef8a6e21 [feature-wip](remote storage)(step1) use a struct instead of string for parameter path, add basic remote method (#7098)
For the first, we need to make a parameter to discribe the data is local or remote.
At then, we need to support some basic function to support the operation for remote storage.
2021-12-22 22:58:23 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
ed7a873a44 [Memory Usage] Implement segment lru cache to save memory of BE (#6829) 2021-10-25 10:07:15 +08:00
24d38614a0 [Dependency] Upgrade thirdparty libs (#6766)
Upgrade the following dependecies:

libevent -> 2.1.12
OpenSSL 1.0.2k -> 1.1.1l
thrift 0.9.3 -> 0.13.0
protobuf 3.5.1 -> 3.14.0
gflags 2.2.0 -> 2.2.2
glog 0.3.3 -> 0.4.0
googletest 1.8.0 -> 1.10.0
snappy 1.1.7 -> 1.1.8
gperftools 2.7 -> 2.9.1
lz4 1.7.5 -> 1.9.3
curl 7.54.1 -> 7.79.0
re2 2017-05-01 -> 2021-02-02
zstd 1.3.7 -> 1.5.0
brotli 1.0.7 -> 1.0.9
flatbuffers 1.10.0 -> 2.0.0
apache-arrow 0.15.1 -> 5.0.0
CRoaring 0.2.60 -> 0.3.4
orc 1.5.8 -> 1.6.6
libdivide 4.0.0 -> 5.0
brpc 0.97 -> 1.0.0-rc02
librdkafka 1.7.0 -> 1.8.0

after this pr compile doris should use build-env:1.4.0
2021-10-15 13:03:04 +08:00
f772649535 [Optimize] Optimize lock when check error storage (#6321)
1. `StorageEngine::_delete_tablets_on_unused_root_path` will try to obtain tablet shard write lock in `TabletManager`
```
StorageEngine::_delete_tablets_on_unused_root_path
  TabletManager::drop_tablets_on_error_root_path
    obtain each tablet shard's write lock
```
2. `TabletManager::build_all_report_tablets_info` and other methods will obtain tablet shard read lock frequently.

So, `StorageEngine::_delete_tablets_on_unused_root_path` will hold `_store_lock` for a long time.
This will make it difficult for other threads to get write `_store_lock`, such as `StorageEngine::get_stores_for_create_tablet`

`drop_tablets_on_error_root_path` is a small probability event, `TabletManager::drop_tablets_on_error_root_path` should return when its param `tablet_info_vec` is empty
2021-08-07 21:30:49 +08:00
Pxl
0c6726f7cd [Bug] Fix bug of TDisk have wrong static_cast (#6175)
* remove some useless static_cast
2021-07-09 09:42:08 +08:00
58d0c8971e [Bugfix] Fix BE metrics http API dead lock bug (#5730) 2021-04-30 10:15:33 +08:00
9b0d6ecaf0 [Log] Add error msg when tablet not found (#5659)
Before drop a tablet, it will try to find the tablet in tablet map.
But the tablet maybe has been not existed.
Therefore, it is better to print the error message and error status.
2021-04-21 16:37:47 +08:00
d15fe05f3c [Metrics] Add metrics to monitor BE's agent task queue size (#5648)
* [Metrics] Add metrics to monitor BE's agent task queue size

Sometimes, user's DDL or background task may last a long time,
it's not easy to find out which procedure has problem.
This patch add metric to monitor BE's agent task queue size,
which would be helpful for troubleshooting.

The raw metrics on BE looks like:
doris_be_agent_task_queue_size{type="REPORT_OLAP_TABLE"} 0
doris_be_agent_task_queue_size{type="REPORT_DISK_STATE"} 0
doris_be_agent_task_queue_size{type="REPORT_TASK"} 0
doris_be_agent_task_queue_size{type="CHECK_CONSISTENCY"} 0
doris_be_agent_task_queue_size{type="DELETE"} 0
doris_be_agent_task_queue_size{type="CLEAR_TRANSACTION_TASK"} 0
doris_be_agent_task_queue_size{type="PUBLISH_VERSION"} 0
doris_be_agent_task_queue_size{type="UPLOAD"} 0
doris_be_agent_task_queue_size{type="DROP_TABLE"} 0
doris_be_agent_task_queue_size{type="CREATE_TABLE"} 39
doris_be_agent_task_queue_size{type="RELEASE_SNAPSHOT"} 0
doris_be_agent_task_queue_size{type="STORAGE_MEDIUM_MIGRATE"} 245
doris_be_agent_task_queue_size{type="CLONE"} 0
doris_be_agent_task_queue_size{type="MOVE"} 0
doris_be_agent_task_queue_size{type="ALTER_TABLE"} 0
doris_be_agent_task_queue_size{type="DOWNLOAD"} 0
doris_be_agent_task_queue_size{type="PUSH"} 0
doris_be_agent_task_queue_size{type="UPDATE_TABLET_META_INFO"} 0
doris_be_agent_task_queue_size{type="MAKE_SNAPSHOT"} 0

* fix typo
2021-04-21 09:23:33 +08:00
422456c31a Add warn log when client report be state failed and refactor some report code (#5342)
There are some redundant code for report task, disk and tablet in be, and when fe return error report message, there is no any warn log showing report failed.

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-03-03 17:00:21 +08:00