doris

Author	SHA1	Message	Date
GoGoWen	91332fa6bd	[fix](reader) fix logic error for Tablet::capture_rs_readers (#7469 )	2021-12-24 21:32:49 +08:00
pengxiangyu	20ef8a6e21	[feature-wip](remote storage)(step1) use a struct instead of string for parameter path, add basic remote method (#7098 ) For the first, we need to make a parameter to discribe the data is local or remote. At then, we need to support some basic function to support the operation for remote storage.	2021-12-22 22:58:23 +08:00
Mingyu Chen	db57c42c83	[improvement](compaction)(tablet repair) Add missing rowsets in compaction status url and support force dropping redundant replica (#7283 ) 1. Add missing rowsets in compaction status url 2. Add a new config `force_drop_redundant_replica` to force drop redundant replicas. 3. Fix FE ut	2021-12-09 22:34:57 +08:00
Mingyu Chen	adb6bfdf74	[Bug] Fix bug that truncate table may change the storage medium property (#6905 )	2021-10-25 10:07:27 +08:00
HappenLee	a0b3840daa	[MemerySave] Change TabletSchema in tablet to reference to save mem (#6814 ) Change TabletSchema in tablet to reference to save memory	2021-10-16 21:54:32 +08:00
Mingyu Chen	982b76c3c0	[Bug] Fix resource tag bug, add documents and some other bug fix (#6708 ) 1. Fix bug of UNKNOWN Operation Type 91 2. Support using resource_tag property of user to limit the usage of BE 3. Add new FE config `disable_tablet_scheduler` to disable tablet scheduler. 4. Add documents for resource tag. 5. Modify the default value of FE config `default_db_data_quota_bytes` to 1PB. 6. Add a new BE config `disable_compaction_trace_log` to disable the trace log of compaction time cost. 7. Modify the default value of BE config `remote_storage_read_buffer_mb` to 16MB 8. Fix `show backends` results error 9. Add new BE config `external_table_connect_timeout_sec` to set the timeout when connecting to odbc and mysql table. 10. Modify issue template to enable blank issue, for release note or other specific usage. 11. Fix a bug in alpha_row_set split_range() function.	2021-09-28 10:37:42 +08:00
Mingyu Chen	fee8e6afc5	[Bug] Fix some bugs (#6665 ) 1.Fix a potential BE coredump of sending batch when loading data. (Fix [Bug] BE crash when loading data #6656) 2.Fix a potential BE coredump when doing schema change. (Fix [Bug] BE crash when doing alter task #6657) 3.Optimize the metric of base_compaction_request_failed. 4.Add Order column in show tablet result. (Fix [Feature] Add order column in SHOW TABLET stmt result #6658) 5.Fix bug that tablet repair slot not being released. (Fix [Bug] Tablet scheduler stop working #6659) 6.Fix bug that REPLICA_MISSING error can not be handled. (Fix [Bug] REPLICA_MISSING error can not be handled. #6660) 7.Modify column name of SHOW PROC "/cluster_balance/cluster_load_stat" 8.Optimize the result of SHOW PROC "/statistic" to show COLOCATE_MISMATCH tablets (Fix [Feature] the health status of colocate table's tablet is not shown in show proc statistic #6663) 9.Fix bug that show load where state='pending' can not be executed. (Fix [Bug] show load where state='pending' can not be executed. #6664)	2021-09-17 10:11:37 +08:00
Mingyu Chen	c6aa37f5ef	[Alter] Support doing compaction for tablets under alter operation (#6365 ) The problem I want to solve is described in #6355. This CL mainly changes: 1. Support compacting tablets under alter operations On BE side, the compaction logic will select tablets which state is "TABLET_NOTREADY" to do cumulative compaction. 2. Remove "alter_task" field in tablet's meta on BE side. "alter_task" field is never used long time ago 3. Support doing delete operation when table is doing alter operation. Previously, when a table is doing alter operation, execution of delete will return error: Table's state is not NORMAL. But now, delete can be executed successfully only if the condition column is not under schema change. And delete condition will be applied to all materialized indexes.	2021-08-07 21:32:26 +08:00
weizuo93	1454aacd69	[Metric] Add metrics to monitor size of queued tasks in load thread pool (#6306 ) (1) Add metrics to monitor the size of queued tasks in load thread pool. (2) Change some log level to VLOG_NOTICE	2021-07-27 13:41:44 +08:00
Mingyu Chen	4a5f0f859d	[Bug] Add readlock when calling get_rowset_by_version() (#6120 )	2021-07-01 09:19:10 +08:00
Mingyu Chen	81ecf3d097	[Bug] Rebuilt version graph of a tablet when there are too many orphan vertex (#5945 ) The version information of the tablet will be stored in the memory in an adjacency graph data structure. And as the new version is written and the old version is deleted, the data structure will begin to have empty vertex with no edge associations(orphan vertex). These orphan vertexs should be removed somehow.	2021-06-03 09:59:20 +08:00
HappenLee	1a81b9e160	[MemTracker] Some enchance of MemTracker (#5783 ) 1 Make some MemTracker have reasonable parent MemTracker not the root tracker 2 Make each MemTracker can be easily to trace. 3 Add show level of MemTracker to reduce the MemTracker show in the web page to have a way to control show how many tracker in web page.	2021-05-19 09:27:50 +08:00
stdpain	17cd32ffee	[BUG] Fixed uninitialized variables in compaction (#5828 )	2021-05-18 12:13:58 +08:00
Mingyu Chen	8850cfe2ad	[Compaction] Modify compaction logic (#5737 ) 1. Add /api/compaction/run_status to show the running compaction tasks. 2. Support do base and cumulative compaction for one tablet at same time. 3. Modify some log level. 4. Add a feedback document.	2021-05-07 11:18:47 +08:00
Zhengguo Yang	49b2bc39ae	[Optimize] Reduce meaningless memory copies (#5748 ) Reduce meaningless memory copies of rowset_meta pb	2021-05-05 10:20:09 +08:00
weizuo93	e519a24c9a	dynamic adjust compaction policy (#5651 ) Co-authored-by: weizuo <weizuo@xiaomi.com>	2021-04-26 12:39:13 +08:00
wangbo	b4a4c29651	(#5638 ) stale rowset can't be access after clone finish (#5639 ) * (#5638) stale rowset can't be access after clone finish * clear stale rowset after clone	2021-04-19 09:27:41 +08:00
Zhengguo Yang	d641a26490	[Refactor] Remove boost filesystem (#5579 ) * use std::filesystem instead of boost Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>	2021-04-08 09:11:59 +08:00
Mingyu Chen	087fd8159b	[Bug] Fix bug that the stale rowset file will not be deleted (#5527 ) 1. If cumulative compaction compact only one rowset, the old rowset will not be put into `stale_rowset_meta_map` 2. Show rowset id in `/api/compaction/show` Co-authored-by: xxiao2018 <benghua3_1@sina.com>	2021-03-17 22:31:05 +08:00
Mingyu Chen	a6e2c3e3f1	[Bug][Clone] Fix the bug that incremental clone is not triggered (#5230 ) In version 0.13, we support a more efficient compaction logic. This logic will maintain multiple version paths of the tablet. This can avoid -230 errors and can also support incremental clone. But the previous incremental clone uses the incremental rowset meta recorded in `incr_rs_meta`. At present, the incremental rowset meta recorded in `incr_rs_meta` and the records in `stale_rs_meta` are duplicated, and the current clone logic does not adapt to the new multi-version path, resulting in many cases not triggering incremental clone. This CL mainly modified: 1. Removed `incr_rs_meta` metadata 2. Modified the clone logic. When the clone is incremented, it will try to read the rowset in `stale_rs_meta`. 3. Delete a lot of code that was previously used for version compatibility.	2021-02-06 22:04:48 +08:00
Zhengguo Yang	93a4c7efc1	[LOG] Standardize the use of VLOG in code (#5264 ) At present, the application of vlog in the code is quite confusing. It is inherited from impala VLOG_XX format, and there is also VLOG(number) format. VLOG(number) format does not have a unified specification, so this pr standardizes the use of VLOG	2021-01-21 12:09:09 +08:00
Mingyu Chen	3d4b2cb1ae	[Bug] Fix tablet shared ptr circular reference causing the tablet not to be cleared (#5100 ) Regardless of whether the tablet is submitted for compaction or not, we need to call 'reset_compaction' to clean up the base_compaction or cumulative_compaction objects in the tablet, because these two objects store the tablet's own shared_ptr. If it is not cleaned up, the reference count of the tablet will always be greater than 1, thus cannot be collected by the garbage collector. (TabletManager::start_trash_sweep) This bug is introduced from #4891	2020-12-18 21:17:18 +08:00
Yingchun Lai	49f7eb69bf	[Refactor] Refactor DeleteHandler and Cond module (2nd) (#5030 ) * [Refactor] Refactor DeleteHandler and Cond module (#4925) This patch mainly do the following refactors: - Use int64_t instead of int32_t for 'version' in DeleteHandler - Move some comments from .cpp to .h file, add some new comments in .h files, and also remove some meaningless comments - Use switch...case... instead of multiple if..else.. for DeleteConditionHandler::is_condition_value_valid - Use range loop to simplify code - Reduce some compare operations in Cond::del_eval - Improve some branch predictions in Reader - Fix and improve some unit tests	2020-12-08 10:01:18 +08:00
Mingyu Chen	c440aa07d1	Revert "[Refactor] Refactor DeleteHandler and Cond module (#4925 )" (#5028 ) This reverts commit 9c9992e0aa28ee85364eebf86a6675f1073e08fb. Co-authored-by: morningman <chenmingyu@baidu.com>	2020-12-05 21:39:49 +08:00
Yingchun Lai	9c9992e0aa	[Refactor] Refactor DeleteHandler and Cond module (#4925 ) This patch mainly do the following refactors: - Use int64_t instead of int32_t for 'version' in DeleteHandler - Move some comments from .cpp to .h file, add some new comments in .h files, and also remove some meaningless comments - Use switch...case... instead of multiple if..else.. for DeleteConditionHandler::is_condition_value_valid - Use range loop to simplify code - Reduce some compare operations in Cond::del_eval - Improve some branch predictions in Reader - Fix and improve some unit tests	2020-12-04 12:13:30 +08:00
weizuo93	ec7e1c6b1b	[Refactor] Execute 'pick rowsets' before applying for permits for a compaction task (#4891 ) The current compaction mechanism is that there is a producer thread that has been producing compaction tasks, and the selected tablet must apply for `permits`. When a tablet could hold `permits`, compaction task for this tablet will be submitted to thread pool. We take compaction score as `permits` which is used for limiting memory consumption. However, `pick_rowset_to_compaction()` will be executed before the file merge in compaction thread, and the number of segment files that actually perform the merge operation is smaller than compaction score. In addition, it is also possible that compaction task exits directly because the tablet doesn't meet the requirements of compaction. This patch optimizes and refactors the code of compaction, so that we can execute 'pick rowsets' before applying for permits for a compaction task, calculate the number of segment files that actually participate in the merge operation, and take this number as `permits`.	2020-11-30 11:41:14 +08:00
sduzh	6fedf5881b	[CodeFormat] Clang-format cpp sources (#4965 ) Clang-format all c++ source files.	2020-11-28 18:36:49 +08:00
sduzh	10e1e29711	Remove header file common/names.h (#4945 )	2020-11-26 17:00:48 +08:00
Yingchun Lai	b7b1d5eb38	[Refactor] Short circuit return to avoid meaningless loop (#4933 )	2020-11-24 13:46:50 +08:00
weizuo93	6247408689	[Compact]Take tablet scan frequency into consider when selecting tablet for compaction (#4837 ) A large number of small segment files will lead to low efficiency for scan operations. Multiple small files can be merged into a large file by compaction operation. So we could take the tablet scan frequency into consideration when selecting an tablet for compaction and preferentially do compaction for those tablets which are scanned frequently during a latest period of time at the present. Using the compaction strategy of Kudu for reference, scan frequency can be calculated for tablet during a latest period of time and be taken into consideration when calculating compaction score.	2020-11-18 21:51:12 +08:00
Mingyu Chen	f239f44b37	[Compaction][Bug-Fix] Fix bug that meta lock need to be held when calculating compaction score (#4829 ) * [Compaction][Buf] Fix bug that meta lock need to be held when calucating compaction score * fix Co-authored-by: morningman <chenmingyu@baidu.com>	2020-11-05 20:29:01 +08:00
Zhengguo Yang	09f97f8a05	[Refactor] Fixes some be typo part 2 (#4747 )	2020-10-20 09:28:57 +08:00
HuangWei	e31b4a4561	[Bug] fix illegal defer in Tablet::rowset_with_max_version() (#4737 )	2020-10-17 13:44:15 +08:00
weizuo93	eba595583e	[Optimize] Optimize the execution model of compaction to limit memory consumption (#4670 ) Currently, there are M threads to do base compaction and N threads to do cumulative compaction for each disk. Too many compaction tasks may run out of memory, so the max concurrency of running compaction tasks is limited by semaphore. If the running threads cost too much memory, we can't defense it. In addition, reducing concurrency to avoid OOM will lead to some compaction tasks can't be executed in time and we may encounter more heavy compaction. Therefore, concurrency limitation is not enough. The strategy proposed in #3624 may be effective to solve the OOM. A CompactionPermitLimiter is used for compaction limitation, and use single-producer/multi-consumer model. Producer will try to generate compaction tasks and acquire `permits` for each task. The compaction task which can hold `permits` will be executed in thread pool and each finished task will release its `permits`. `permits` should be applied for before a compaction task can execute. When the sum of `permits` held by executing compaction tasks reaches a threshold, subsequent compaction task will be no longer allowed, until some `permits` are released. Tablet compaction score is used as `permits` of compaction task here. To some extent, memory consumption can be limited by setting appropriate `permits` threshold.	2020-10-11 11:39:25 +08:00
ZhangYu0123	4f7cfee908	[compaction][config] Change default config policy to size_based (#4599 ) (1) change default compaction config policy to size_based (2) change missed version check policy when delete stale rowsets	2020-09-16 15:04:06 +08:00
ZhangYu0123	d29bf30f74	[BUG] Fix stale path delete checking logic when current main path is missing. (#4549 ) Fix stale path delete checking logic. When current main path is version missing, then delete checking logic is always core dumped. So we fix the checking logic to tolerate current main version missing.	2020-09-08 18:52:53 +08:00
ZhangYu0123	c29d41f675	[BUG] Fix recover persistent stale rowsets bug from multi-single version rowsets in stale rowsets (#4513 ) (1) fix recover persistent stale rowsets bug from multi-single version rowset in stale rowsets (2) delete_expired_inc_rowsets check consistent version convert to [0, max_version]	2020-09-03 16:59:18 +08:00
Yingchun Lai	498b06fbe2	[Metrics] Support tablet level metrics (#4428 ) Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet, but we have no insight about tablets in the cluster. This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `. However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request, and not return tablet level metrics by default.	2020-09-02 10:39:41 +08:00
ZhangYu0123	1d93ba027a	[Compaction] Compaction show policy type and disk format (#4466 ) Add more information in compaction show api 1、add cumulative policy type 2、format rowset total disk size	2020-08-30 21:09:47 +08:00
ZhangYu0123	123237afb7	[Compaction] Persistence stale rowsets meta (#4454 ) Persistence stale rowsets meta. When BE reboots, stale rowsets meta can resume and the stale version can also be readable before stale gc time. ISSUE: #4453	2020-08-30 21:05:48 +08:00
ZhangYu0123	a7422ee142	[UT][Bug-Fix] Resolve UT memory leak problem (#4406 ) Fix ut memory leak on Fix #4164	2020-08-21 10:41:54 +08:00
Mingyu Chen	4c571cb6f5	Revert "[Metrics] Support tablet level metrics (#4327 )" (#4397 ) This reverts commit 56260a65c87830ffe34109195ee4d6f1d543e630. Co-authored-by: morningman <chenmingyu@baidu.com>	2020-08-19 22:37:52 +08:00
ZhangYu0123	dc3ed1c525	[Compaction]Compaction rules optimization (#4212 ) Compaction rules optimization, the detail problem description and design to see #4164. This pr commits 2 functions: (1) add the cumulative policy configable, and implement original policy. (2) implement universal policy, the optimization version in #4164.	2020-08-19 09:34:13 +08:00
Yingchun Lai	56260a65c8	[Metrics] Support tablet level metrics (#4327 ) Sometimes we want to detect the hotspot of a cluster, for example, hot scanned tablet, hot wrote tablet, but we have no insight about tablets in the cluster. This patch introduce tablet level metrics to help to achieve this object, now support 4 metrics on tablets: `query_scan_bytes `, `query_scan_rows `, `flush_bytes `, `flush_count `. However, one BE may holds hundreds of thousands of tablets, so I add a parameter for the metrics HTTP request, and not return tablet level metrics by default.	2020-08-18 16:56:12 +08:00
Mingyu Chen	3359467b9a	[Tablet][Recovery] Support using empty tablet to repair the damaged or missing tablet (#4255 ) In some very special circumstances, such as code bugs, or human misoperation, etc., all replicas of some tablets may be lost. In this case, the data has been substantially lost. However, in some scenarios, the business still hopes to ensure that the query will not report errors even if there is data loss, and reduce the perception of the user layer. At this point, we can use the blank Tablet to fill the missing replica to ensure that the query can be executed normally. Add a new FE config `recover_with_empty_tablet`. default is false. true means to use empty tablet to fill the missing one. Also fix a bug in Fix #4274	2020-08-18 06:13:53 +00:00
ZhangYu0123	3372958a4c	[BUG] Fix calculation of cumulative point (#4259 ) Fix calculation of cumulative point. The problem is calculation of cumulative point is wrong when be restarts and there is delete rowset. also see #4258	2020-08-06 23:13:43 +08:00
ZhangYu0123	16c89c7d56	[BUG]Fix remove expired stale rowset path order error (#4214 ) Delete stale rowset path order error. This bug leads to stale rowsets version inconsistents. #4213	2020-08-01 17:44:39 +08:00
ZhangYu0123	03cf9b2a24	[Compaction] Add delayed deletion of rowsets function, fix -230 error. (#4039 ) Related issue #4017, main changes as follows: 1. Add expired_snapshot_rs_version_map，_expired_snapshot_rs_metas， 2. Add VersionedRowsetTracker record compacted path version 3. Record path version when rowsets compact 4. In gc process, add expired snapshot rowsets to unused set to remove.	2020-07-19 22:03:59 +08:00
Mingyu Chen	15d9e10a8b	[Bug] Fix bug that tablet meta lock twice (#4112 ) * [Bug] Fix bug that tablet meta lock twice The tablet meta load may already be hold before calling generate_tablet_meta_copy(), so we need provide a unlocked version of generate_tablet_meta_copy() * fix typo Co-authored-by: chenmingyu <chenmingyu@baidu.com>	2020-07-19 21:27:24 +08:00
caiconghui	2e460f581c	[Bug] Support get all rowset meta info in memory from tablet meta url (#4061 ) This PR is to fix bug that we cannot get the newest tablet meta info from tablet meta url.	2020-07-13 20:53:51 +08:00

1 2

93 Commits