1. `StorageEngine::_delete_tablets_on_unused_root_path` will try to obtain each tablet shard's write lock in `TabletManager`:
```
StorageEngine::_delete_tablets_on_unused_root_path
TabletManager::drop_tablets_on_error_root_path
obtain each tablet shard's write lock
```
2. `TabletManager::build_all_report_tablets_info` and other methods obtain tablet shard read locks frequently.
As a result, `StorageEngine::_delete_tablets_on_unused_root_path` can hold `_store_lock` for a long time.
This makes it difficult for other threads, such as `StorageEngine::get_stores_for_create_tablet`, to acquire the `_store_lock` write lock.
Dropping tablets on an error root path is a low-probability event, so `TabletManager::drop_tablets_on_error_root_path` should return early when its parameter `tablet_info_vec` is empty, as sketched below.
Before dropping a tablet, it tries to find the tablet in the tablet map.
But the tablet may no longer exist.
Therefore, it is better to print the error message and error status.
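A minimal sketch of the early-return guard, with stand-in types since the real `TabletManager` carries much more state:
```
#include <cstdint>
#include <shared_mutex>
#include <vector>

// Stand-ins for the real Doris types (illustrative only).
struct TabletInfo { int64_t tablet_id; int32_t schema_hash; };
enum OLAPStatus { OLAP_SUCCESS };

class TabletManager {
public:
    OLAPStatus drop_tablets_on_error_root_path(
            const std::vector<TabletInfo>& tablet_info_vec) {
        // Dropping tablets on an error root path is rare; return before
        // taking any tablet shard write lock when there is nothing to drop.
        if (tablet_info_vec.empty()) {
            return OLAP_SUCCESS;
        }
        std::unique_lock<std::shared_mutex> wrlock(_tablet_map_lock);
        // ... look up each tablet; if it is already gone, log the error
        // message and status instead of failing silently.
        return OLAP_SUCCESS;
    }

private:
    std::shared_mutex _tablet_map_lock;
};
```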
* [Metrics] Add metrics to monitor BE's agent task queue size
Sometimes a user's DDL or background task lasts a long time,
and it is not easy to find out which procedure has the problem.
This patch adds metrics to monitor BE's agent task queue size,
which is helpful for troubleshooting.
The raw metrics on BE look like:
doris_be_agent_task_queue_size{type="REPORT_OLAP_TABLE"} 0
doris_be_agent_task_queue_size{type="REPORT_DISK_STATE"} 0
doris_be_agent_task_queue_size{type="REPORT_TASK"} 0
doris_be_agent_task_queue_size{type="CHECK_CONSISTENCY"} 0
doris_be_agent_task_queue_size{type="DELETE"} 0
doris_be_agent_task_queue_size{type="CLEAR_TRANSACTION_TASK"} 0
doris_be_agent_task_queue_size{type="PUBLISH_VERSION"} 0
doris_be_agent_task_queue_size{type="UPLOAD"} 0
doris_be_agent_task_queue_size{type="DROP_TABLE"} 0
doris_be_agent_task_queue_size{type="CREATE_TABLE"} 39
doris_be_agent_task_queue_size{type="RELEASE_SNAPSHOT"} 0
doris_be_agent_task_queue_size{type="STORAGE_MEDIUM_MIGRATE"} 245
doris_be_agent_task_queue_size{type="CLONE"} 0
doris_be_agent_task_queue_size{type="MOVE"} 0
doris_be_agent_task_queue_size{type="ALTER_TABLE"} 0
doris_be_agent_task_queue_size{type="DOWNLOAD"} 0
doris_be_agent_task_queue_size{type="PUSH"} 0
doris_be_agent_task_queue_size{type="UPDATE_TABLET_META_INFO"} 0
doris_be_agent_task_queue_size{type="MAKE_SNAPSHOT"} 0
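A rough sketch of the counting discipline behind such a gauge; the real BE wires this through its metrics framework, so the names and types below are only stand-ins:
```
#include <atomic>
#include <cstdio>
#include <map>
#include <string>

// One gauge per agent task type; std::atomic stands in for the real gauge.
// Entries are created at startup; afterwards updates are thread-safe.
static std::map<std::string, std::atomic<long>> g_agent_task_queue_size;

void on_task_submitted(const std::string& type) { ++g_agent_task_queue_size[type]; }
void on_task_finished(const std::string& type) { --g_agent_task_queue_size[type]; }

// Render the metrics in the text format shown above.
void dump_metrics() {
    for (auto& [type, size] : g_agent_task_queue_size) {
        std::printf("doris_be_agent_task_queue_size{type=\"%s\"} %ld\n",
                    type.c_str(), size.load());
    }
}
```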
* fix typo
There is some redundant code for the report task, disk, and tablet reporting in BE, and when FE returns an error report message, there is no warning log showing that the report failed.
Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
* [doris-1008] Support backup and restore directly to cloud storage via the AWS S3 protocol
* [Internal][S3DirectAccess] Support backup, restore, load, and export connecting directly to S3
1. Support loading and exporting data from/to S3 directly.
2. Add a config to automatically convert broker access to S3 access when available.
Change-Id: Iac96d4b3670776708bc96a119ff491db8cb4cde7
(cherry picked from commit 2f03832ca52221cc7436069b96c45c48c4bc7201)
* [Internal][S3DirectAccess] File path glob compatible with broker
Change-Id: Ie55e07a547aa22c6fa8d432ca926216c10384e68
(cherry picked from commit d4fb25544c0dc06d23e1ada571ec3f8edd4ba56f)
* [internal] [doris-1008] fix log4j class not found
Change-Id: I468176aca0d821383c74ee658d461aba9e7d5be3
(cherry picked from commit 029adaa9d6ded8503acbd6644c1519456f3db232)
* add poms
Co-authored-by: yangzhengguo01 <yangzhengguo01@baidu.com>
In version 0.13, we introduced a more efficient compaction logic.
This logic maintains multiple version paths of the tablet.
This can avoid -230 errors and can also support incremental clone.
But the previous incremental clone used the incremental rowset metas recorded in `incr_rs_meta`.
At present, the incremental rowset metas recorded in `incr_rs_meta` duplicate the records
in `stale_rs_meta`, and the current clone logic does not adapt to the
new multi-version path, so in many cases incremental clone is not triggered.
This CL mainly:
1. Removed the `incr_rs_meta` metadata.
2. Modified the clone logic: for an incremental clone, it tries to read the rowsets in `stale_rs_meta`.
3. Deleted a lot of code that was previously used for version compatibility.
At present, the use of VLOG in the code is quite confusing.
Part of it is inherited from Impala's VLOG_XX format, and there is also the VLOG(number) format.
The VLOG(number) format has no unified specification, so this PR standardizes the use of VLOG, as sketched below.
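One way such a standardization could look; the concrete level names and numbers here are hypothetical, only the glog `VLOG(n)` mechanism underneath is real:
```
#include <cstdint>
#include <glog/logging.h>

// Hypothetical named levels: one agreed-on mapping from meaning to verbosity,
// instead of bare VLOG(3) / inherited VLOG_XX calls scattered in the code.
#define VLOG_NOTICE VLOG(3)  // infrequent, coarse-grained events
#define VLOG_DEBUG VLOG(7)   // detailed debugging output
#define VLOG_ROW VLOG(10)    // per-row tracing on very hot paths

void compact_example(int64_t tablet_id) {
    VLOG_NOTICE << "start compaction for tablet " << tablet_id;
    VLOG_ROW << "merging one row of tablet " << tablet_id;
}
```
Levels are then enabled at runtime with glog's `--v` flag, e.g. `--v=7` enables VLOG_NOTICE and VLOG_DEBUG but not VLOG_ROW.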
Add trace for create tablet tasks; it's a useful tool for admins to find
the bottleneck when creating tablets times out.
For example, an admin could enlarge 'tablet_map_shard_size' when the
'got tablets shard lock' step is found to cost too much time.
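A minimal illustration of the idea with a scoped timer; the real patch uses the engine's trace utility, so everything below is a stand-in:
```
#include <chrono>
#include <iostream>
#include <string>

// Illustrative scoped trace: logs how long each named step of the create
// tablet path takes, so an admin can spot a step like "got tablets shard
// lock" that dominates latency.
class ScopedTrace {
public:
    explicit ScopedTrace(std::string step)
            : _step(std::move(step)), _start(std::chrono::steady_clock::now()) {}
    ~ScopedTrace() {
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                          std::chrono::steady_clock::now() - _start)
                          .count();
        std::cerr << _step << " cost " << us << " us\n";
    }

private:
    std::string _step;
    std::chrono::steady_clock::time_point _start;
};

void create_tablet_example() {
    { ScopedTrace t("got tablets shard lock"); /* lock the tablet map shard */ }
    { ScopedTrace t("save tablet meta"); /* persist the new tablet's meta */ }
}
```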
'_task_worker_type' is not yet initialized when it is used to initialize '_name',
so '_name' is always 'TaskWorkerPool.CREATE_TABLE'; this patch fixes
the bug.
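This is the classic C++ member-initialization-order pitfall: members are initialized in declaration order, not initializer-list order. A sketch with hypothetical declarations (the real class differs):
```
#include <string>

// If _name is declared before _task_worker_type and built from that member,
// it reads a not-yet-initialized value, so every pool gets the default name.
enum TaskWorkerType { CREATE_TABLE, DROP_TABLE };

class TaskWorkerPool {
public:
    explicit TaskWorkerPool(TaskWorkerType type)
            : _name("TaskWorkerPool." + type_to_string(type)),  // fixed: build from the
              _task_worker_type(type) {}                        // parameter, not the member

private:
    static std::string type_to_string(TaskWorkerType t) {
        return t == CREATE_TABLE ? "CREATE_TABLE" : "DROP_TABLE";
    }

    std::string _name;                 // declared first, so initialized first
    TaskWorkerType _task_worker_type;  // reading this in _name's initializer is UB
};
```
Initializing `_name` from the constructor parameter instead of the member sidesteps the ordering entirely.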
This CL refactors the storage medium migration task process in BE.
I did not modify the execution logic, just extracted part of the logic
from the migration task and put it in task_work_pool.
In this way, the migration task is only used to process the migration
of a specified tablet to a specified data dir.
Later, we can use this task to migrate tablets between different disks. #4476
The tablet and disk information reporting threads need to report to the FE periodically.
At the same time, these two reporting threads are also triggered by certain events.
The modification in PR #4440 caused these two threads to be triggered only by events,
so they could no longer report periodically.
BE cannot exit gracefully because some threads run in endless
loops. This patch does the following optimizations:
- Use the well encapsulated Thread and ThreadPool instead of std::thread
and std::vector<std::thread>
- Use CountDownLatch in thread's loop condition to avoid endless loop
- Introduce a new class Daemon for daemon works, like tcmalloc_gc,
memory_maintenance and calculate_metrics
- Decouple statistics-type TaskWorkerPool and StorageEngine notification
by submitting tasks to TaskWorkerPool's queue
- Reorder objects' stop and destruction in main(), i.e. stop network
services first, then internal services
- Use libevent in pthreads mode, by calling evthread_use_pthreads(),
so that EvHttpServer can exit gracefully with multiple threads
- Call brpc::Server's Stop() and ClearServices() explicitly
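A minimal sketch of the CountDownLatch-style loop condition, assuming a simplified latch (Doris's Thread and CountDownLatch classes carry more than this):
```
#include <chrono>
#include <condition_variable>
#include <mutex>

// wait_for() returns true once stop() is called, so the worker both sleeps
// between iterations and wakes immediately on shutdown, instead of spinning
// in an endless loop that blocks graceful exit.
class StopLatch {
public:
    void stop() {
        std::lock_guard<std::mutex> l(_mu);
        _stopped = true;
        _cv.notify_all();
    }
    // Returns true if stopped, false if the timeout elapsed.
    bool wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> l(_mu);
        return _cv.wait_for(l, timeout, [this] { return _stopped; });
    }

private:
    std::mutex _mu;
    std::condition_variable _cv;
    bool _stopped = false;
};

void memory_maintenance_loop(StopLatch& latch) {
    while (!latch.wait_for(std::chrono::seconds(10))) {
        // ... run one round of tcmalloc_gc / memory maintenance ...
    }
}
```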
Sometimes we want to detect the hotspots of a cluster, for example, hot scanned tablets or hot written tablets,
but we have no insight into the tablets in the cluster.
This patch introduces tablet-level metrics to help achieve this; it now supports 4 metrics on tablets: `query_scan_bytes`, `query_scan_rows`, `flush_bytes`, `flush_count`.
However, one BE may hold hundreds of thousands of tablets, so I added a parameter to the metrics HTTP request,
and tablet-level metrics are not returned by default.
During the transformation of historical data for materialized views, the transformation may fail due to data quality.
Add an error status code, OLAP_ERR_DATE_QUALITY_ERR, to determine whether a data problem caused the failure.
#3344
In some very special circumstances, such as code bugs or human misoperation,
all replicas of some tablets may be lost. In this case, the data has substantially been lost.
However, in some scenarios, the business still hopes to ensure that queries will not
report errors even if there is data loss, and to reduce the impact perceived by the user layer.
At this point, we can use a blank tablet to fill in for the missing replica to ensure that queries can be executed normally.
Add a new FE config `recover_with_empty_tablet`, default false; true means to use an empty tablet to fill in for the missing one.
Also fix a bug in #4274.
Redesign metrics into 3 layers:
MetricRegistry - MetricEntity - Metric
MetricRegistry: the registration center.
MetricEntity: an entity registered on the MetricRegistry. Generally, several MetricEntities can be registered
on one MetricRegistry; each MetricEntity is an independent entity, such as the server, disk_devices, data_directories, thrift
clients and servers, and so on.
Metric: a metric of an entity, such as fragment_requests_total on the server entity, disk_bytes_read on a disk_device entity,
or thrift_opened_clients on a thrift_client entity.
MetricPrototype: the type of a metric. A MetricPrototype is a global variable and can be shared by the same metric across
different MetricEntities.
Fixes #3893
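A much-simplified sketch of the layering; the real classes carry metric types, units, labels, and locking, so everything here is illustrative:
```
#include <map>
#include <memory>
#include <string>
#include <vector>

// MetricPrototype is global and shared; each entity owns its own Metric
// instances hooked to a prototype.
struct MetricPrototype { std::string name; };  // e.g. "disk_bytes_read"
struct Metric { const MetricPrototype* proto; long value; };

class MetricEntity {  // e.g. the server, or one disk device
public:
    explicit MetricEntity(std::string name) : _name(std::move(name)) {}
    Metric* register_metric(const MetricPrototype* proto) {
        _metrics.push_back(std::make_unique<Metric>(Metric{proto, 0}));
        return _metrics.back().get();
    }

private:
    std::string _name;
    std::vector<std::unique_ptr<Metric>> _metrics;
};

class MetricRegistry {  // the registration center
public:
    MetricEntity* register_entity(const std::string& name) {
        auto& entity = _entities[name];
        if (!entity) entity = std::make_unique<MetricEntity>(name);
        return entity.get();
    }

private:
    std::map<std::string, std::unique_ptr<MetricEntity>> _entities;
};

// One prototype, shared by the disk_bytes_read metric of every disk entity:
static MetricPrototype g_disk_bytes_read{"disk_bytes_read"};
```
Usage would then look like `registry.register_entity("disk_sda")->register_metric(&g_disk_bytes_read)`.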
In a cluster with frequent load activity, FE ignores most tablet reports from BE,
because currently it only handles reports whose version >= BE's latest report version
(which is increased each time a transaction is published). This can be observed in FE's log,
with many entries like `out of date report version 15919277405765 from backend[177969252].
current report version[15919277405766]` in it.
However, many system functionalities rely on TabletReport processing to work properly. For example:
1. bad or version-missing replicas are detected and repaired during TabletReport
2. storage medium migration decisions and actions are made based on TabletReport
3. BE's old transactions are cleared/republished during TabletReport
In fact, it is not necessary to update the report version after the publish task,
because this is actually a problem left over from history. In the current version's reporting logic,
we no longer decrease the version information of the replica in the FE metadata according to the report.
So even if we receive a report with a stale version, it does not matter.
This CL mainly contains two changes:
1. Do not increase the report version for publish tasks.
2. Populate `tabletWithoutPartitionId` outside the read lock of TabletInvertedIndex (see the sketch below).
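The FE code in question is Java; the C++ sketch below, with hypothetical types, only shows the lock-scope pattern of change 2: copy what you need under the shared (read) lock, then do the expensive population work after releasing it.
```
#include <cstdint>
#include <shared_mutex>
#include <vector>

struct TabletMeta { int64_t tablet_id; int64_t partition_id; };

std::vector<int64_t> collect_tablets_without_partition_id(
        std::shared_mutex& index_lock, const std::vector<TabletMeta>& index) {
    std::vector<TabletMeta> snapshot;
    {
        std::shared_lock<std::shared_mutex> rlock(index_lock);
        snapshot = index;  // only a cheap copy happens under the read lock
    }
    std::vector<int64_t> result;  // heavy filtering runs outside the lock
    for (const auto& t : snapshot) {
        if (t.partition_id <= 0) result.push_back(t.tablet_id);
    }
    return result;
}
```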
1. Add a PushBrokerReader in push_handler.cpp.
2. PushBrokerReader wraps the ParquetScanner to support reading data from Parquet-format files through a broker.
This bug occurred when BE made a snapshot: the version required by FE had already been merged into a cumulative version, so the snapshot task could not complete even if it retried. To solve this problem, the BackupJob can be set to CANCELLED, and the user can continue to retry the job.
Fix #3057
Now the disks_total_capacity metric is a user-specified capacity, but
disks_avail_capacity is the disk's actual available capacity, so
disks_total_capacity may be less than disks_avail_capacity, and
UsedPct on FE may be negative as a result.
We'd better use the disk's actual capacity for the disks_total_capacity metric.
1. MonoTime/MonoDelta
MonoTime: represents a particular point in time, relative to some fixed but unspecified reference point.
MonoDelta: represents an elapsed duration of time, the delta between two MonoTime instances.
2. CountDownLatch
This is a C++ implementation of the Java CountDownLatch.
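A usage sketch, assuming the Kudu-style API these utility classes derive from (the header paths and exact signatures are assumptions):
```
#include "util/countdown_latch.h"  // assumed paths in the BE tree
#include "util/monotime.h"

void example() {
    MonoTime start = MonoTime::Now();
    // ... do some work ...
    MonoDelta elapsed = MonoTime::Now() - start;  // delta between two points
    double secs = elapsed.ToSeconds();

    // A latch initialized to 1 acts as a stop flag with a timed wait:
    // another thread calls latch.CountDown() to release every waiter.
    CountDownLatch latch(1);
    bool stopped = latch.WaitFor(MonoDelta::FromSeconds(1));
    (void)secs;
    (void)stopped;
}
```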
The `TResourceInfo` was used to help `cgroups` isolate resources,
but it is no longer used.
In fact, the `TResourceInfo` information is no longer carried in
the requests from FE to BE.
[Metric] Add tablet compaction score metrics
Backend:
Add metric "tablet_max_compaction_score" to monitor the current max compaction
score of tablets on this Backend. This metric is updated each time
the compaction thread picks tablets to compact.
Frontend:
Add metric "tablet_max_compaction_score" for each Backend. These metrics will
be updated when backends report tablet.
And also add a calculated metric "max_tablet_compaction_core" to monitor the
max compaction core of tablets on all Backends.
When there are lots of expired transactions on a BE with a large
number of tablets, the report thread may become slow, because it
has to iterate the whole transaction map for each tablet.
But this is unnecessary. We should first build an expired-transaction
map keyed by tablet id. Then, for each tablet, we only need to seek
the expired-transaction map once by tablet id, instead of traversing
the whole transaction map, as sketched below.
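A sketch of the reworked lookup with simplified types (the real structures hold richer transaction state):
```
#include <cstdint>
#include <map>
#include <unordered_map>
#include <vector>

using TabletId = int64_t;
using TransactionId = int64_t;

// One pass over all transactions builds a tablet_id -> expired transaction ids
// index, turning the per-tablet cost from a full-map scan into one hash seek.
std::unordered_map<TabletId, std::vector<TransactionId>> build_expired_index(
        const std::map<TransactionId, std::vector<TabletId>>& expired_txn_map) {
    std::unordered_map<TabletId, std::vector<TransactionId>> index;
    for (const auto& [txn_id, tablets] : expired_txn_map) {
        for (TabletId tablet_id : tablets) {
            index[tablet_id].push_back(txn_id);
        }
    }
    return index;
}
```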
Now Env unifies all environment operations, such as file operations.
However, some of our old functions don't leverage it. This change unifies
FileUtils::scan_dir to use Env's functions.
FE uses partition_id to publish versions. BE should check whether all tablets related to this partition have the version, but Tablet in BE does not have the partition id in its metadata, so BE cannot check it.
This patch adds the partition id to the tablet meta during the report task,
syncing at most 10k tablets per round when setting tablet meta.
NOTE: This patch will modify all Backends' data,
which will make restarting a BE take a very long time.
So if you don't want to disturb your production environment,
you should upgrade the Backends one by one.
1. Refactor BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
Naming a rowset with tablet_id and version leads to
many conflicts among compaction, clone, and restore (see the sketch after this list).
3. Extract a rowset interface to encapsulate rowsets
with different formats.
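Illustrative only: one way to mint a globally unique rowset id, so a rowset's identity no longer collides when compaction, clone, and restore all produce rowsets for the same (tablet_id, version). Doris's actual id generator differs.
```
#include <random>
#include <sstream>
#include <string>

// Mint a unique rowset id from 128 random bits rendered as hex.
std::string new_rowset_id() {
    static thread_local std::mt19937_64 rng{std::random_device{}()};
    std::ostringstream oss;
    oss << std::hex << rng() << rng();
    return oss.str();
}
```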