doris

Author	SHA1	Message	Date
Mingyu Chen	cf2b93532b	[fix](file-scanner) fix some logic about broker load with parquet with new file scanner (#13135 ) Fix some logic about broker load using new file scanner, with parquet format: 1. If columns are specified in load stmt, but none of them are in parquet file, error will be thrown like `err: No columns found in file`. See `parquet_s3_case4` 2. If the first column of table are not in table, the result number of rows is wrong. See `parquet_s3_case8` 3. If column specified in `columns` in load stmt does not exist in file and table, error will be thrown like: `failed to find default value expr for slot: x1`. See `parquet_s3_case2`	2022-10-08 13:08:08 +08:00
Pxl	9607f60845	[Feature](serialize) move block_data_version to fe heart beat (#12667 ) Move block_data_version from be config to fe heart beat	2022-09-27 18:25:54 +08:00
Pxl	fd0bd395ac	[Enhancement] Remove some unused include (#10035 )	2022-06-17 10:47:25 +08:00
BePPPower	51db78d375	[refactor] modify all OLAP_LOG_WARNING to LOG(WARNING) (#9473 ) Co-authored-by: BePPPower <fangtiewei@selectdb.com>	2022-05-10 09:25:25 +08:00
chenlinzhong	c9961c9bb9	[style] clang-format all c++ code (#9305 ) - sh build-support/clang-format.sh to clang-format all c++ code	2022-04-29 16:14:22 +08:00
yiguolei	0ff7de4157	[refactor] remove agent status (#8273 ) There are 3 error code types in BE: OLAPStatus, AgentStatus, and Status. This is confusing and sometimes causes conflicts when writing code. I will try to unify them into Status.	2022-03-09 13:04:50 +08:00
Zhengguo Yang	6c6380969b	[refactor] replace boost smart ptr with stl (#6856 ) 1. replace all boost::shared_ptr with std::shared_ptr 2. replace all boost::scoped_ptr with std::unique_ptr 3. replace all boost::scoped_array with std::unique_ptr<T[]> 4. replace all boost::thread with std::thread	2021-11-17 10:18:35 +08:00
Zhengguo Yang	24d38614a0	[Dependency] Upgrade thirdparty libs (#6766 ) Upgrade the following dependencies: libevent -> 2.1.12 OpenSSL 1.0.2k -> 1.1.1l thrift 0.9.3 -> 0.13.0 protobuf 3.5.1 -> 3.14.0 gflags 2.2.0 -> 2.2.2 glog 0.3.3 -> 0.4.0 googletest 1.8.0 -> 1.10.0 snappy 1.1.7 -> 1.1.8 gperftools 2.7 -> 2.9.1 lz4 1.7.5 -> 1.9.3 curl 7.54.1 -> 7.79.0 re2 2017-05-01 -> 2021-02-02 zstd 1.3.7 -> 1.5.0 brotli 1.0.7 -> 1.0.9 flatbuffers 1.10.0 -> 2.0.0 apache-arrow 0.15.1 -> 5.0.0 CRoaring 0.2.60 -> 0.3.4 orc 1.5.8 -> 1.6.6 libdivide 4.0.0 -> 5.0 brpc 0.97 -> 1.0.0-rc02 librdkafka 1.7.0 -> 1.8.0 After this PR, compiling Doris requires build-env:1.4.0	2021-10-15 13:03:04 +08:00
Zhengguo Yang	d641a26490	[Refactor] Remove boost filesystem (#5579 ) * use std::filesystem instead of boost Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>	2021-04-08 09:11:59 +08:00
sduzh	6fedf5881b	[CodeFormat] Clang-format cpp sources (#4965 ) Clang-format all c++ source files.	2020-11-28 18:36:49 +08:00
Yingchun Lai	1151a0063c	[Bug] Make 'LastStartTime' in backends list as the actual BE start time (#4872 ) We use 'LastStartTime' in backends list to check whether there is an unexpected restart of BE, but it will be changed as BE's first heartbeat time after FE restarted, it would be better to set it to BE's actual start time.	2020-11-11 21:24:06 +08:00
Yingchun Lai	b780df697a	[refactor] Optimize threads usage mode in BE (#4440 ) BE cannot exit gracefully because some threads run in endless loops. This patch makes the following optimizations: - Use the well-encapsulated Thread and ThreadPool instead of std::thread and std::vector<std::thread> - Use CountDownLatch in each thread's loop condition to avoid endless loops - Introduce a new class Daemon for daemon work, such as tcmalloc_gc, memory_maintenance and calculate_metrics - Decouple statistics type TaskWorkerPool and StorageEngine notification by submitting tasks to TaskWorkerPool's queue - Reorder the stopping and destruction of objects in main(), i.e. stop network services first, then internal services - Use libevent in pthreads mode, by calling evthread_use_pthreads(), so that EvHttpServer can exit gracefully with multiple threads - Call brpc::Server's Stop() and ClearServices() explicitly	2020-09-06 20:19:14 +08:00
Yingchun Lai	e71152132c	[metrics] Redesign metrics to 3 layers (#4115 ) Redesign metrics to 3 layers: MetricRegistry - MetricEntity - Metrics MetricRegistry : the register center MetricEntity : the entity registered on MetricRegistry. Generally a MetricRegistry can be registered on several MetricEntities, each of MetricEntity is an independent entity, such as server, disk_devices, data_directories, thrift clients and servers, and so on. Metric : metrics of an entity. Such as fragment_requests_total on server entity, disk_bytes_read on a disk_device entity, thrift_opened_clients on a thrift_client entity. MetricPrototype: the type of a metric. MetricPrototype is a global variable, can be shared by the same metrics across different MetricEntities.	2020-08-08 11:23:01 +08:00
Mingyu Chen	9a934ec9f6	[Load] Add more info in SHOW LOAD result (#3391 ) Fix #3390 This CL adds more info to the `JobDetails` column of the `SHOW LOAD` result for Broker Load Job. For example: ``` { "Unfinished backends": { "9c3441027ff948a0-8287923329a2b6a7": [10002] }, "All backends": { "9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006] }, "ScannedRows": 2390016, "TaskNumber": 1, "FileNumber": 1, "FileSize": 1073741824 } ``` 2 newly added keys: `Unfinished backends` lists the BEs whose tasks are not yet finished. `All backends` lists the BEs on which this job has tasks. In addition, the Backend Id is now passed along with the heartbeat message from FE to BE, so that each BE knows its own Id.	2020-04-26 21:30:23 +08:00
Yingchun Lai	8276c6d7f8	Show BE version in 'show backends;' (#3074 ) In a large-scale cluster, BEs may be upgraded in a rolling fashion. This patch adds a column named 'Version' to the command 'show backends;', as well as to the web page '/system?path=//backends', to provide a way to check whether any BE has missed the upgrade.	2020-03-12 22:15:13 +08:00
LingBin	5440e19d01	Improve the triggering strategy of BE report (#2881 ) Currently, the report from BE to FE is completed in the background threads of `AgentServer` (`report_tablet_thread` and `report_disk_stat_thread`). These two threads will sleep and be in a standby state after each report, if there is any need to report immediately, they will be notified and wake up immediately to report. For example, when background thread (`disk_monitor_thread`) in `StorageEngine` finds some tablets were deleted, it will notify `AgentServer` to trigger a report immediately. In the current implementation, in order to report ASAP, a local variable (`_is_drop_tables`) and two other flags are used to record whether reporting is needed, and then `StorageEngine::disk_monitor_thread` checks the value of this variable every time it runs, to determine whether it needs to be triggered Reporting. This is actually superfluous, and it may result in untimely notifications, as shown below: ``` (thread_1) (thread_2) disk-monitor disk-stat-reporter \| \| \| reporting \| \| notify_1 \| \| \| \| wait_for_notify(will wait until timeout or next notification) \| \| V V ``` When `report_tablet_thread` has not started waiting, `StorageEngine::disk_monitor_thread` triggers a notification, so this notification will not be received by `report_tablet_thread`, resulting in the BE not reporting to the FE until the lock times out or the next round of `disk_monitor_thread` detection. This change restructures the triggering implementation, and solves the above problem. This change also changes some methods(that do not need to be public) to private.	2020-02-11 20:38:44 +08:00
kangpinghuang	c07f37d78c	[Segment V2] Add a control framework between FE and BE through heartbeat #2247 (#2364 ) The control framework is implemented through heartbeat message. Use uint64_t as flags to control different functions. Now add a flag to set the default rowset type to beta.	2019-12-12 12:18:32 +08:00
lichaoyong	0d48a3961c	Refactor Storage Engine (#1478 ) NOTE: This patch modifies all Backend data, which makes restarting BE take a very long time. To avoid interfering with your production environment, upgrade the backends one by one. 1. The BE refactoring clarifies the structure of the code. 2. Use a unique id to identify a rowset. Naming a rowset with tablet_id and version leads to many conflicts among compaction, clone, and restore. 3. Extract a rowset interface to encapsulate rowsets with different formats.	2019-07-15 21:18:22 +08:00
ZHAO Chun	9d03ba236b	Uniform Status (#1317 )	2019-06-14 23:38:31 +08:00
Mingyu Chen	ff0dd0d2da	Support SSL authentication with Kafka in routine load job (#1235 )	2019-06-07 16:29:01 +08:00
李超勇	3d324e38ea	Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#372 )	2018-11-30 20:59:40 +08:00
ZHAO Chun	49302955c8	Revert "Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#370 )" (#371 ) This reverts commit a816925776de06dc7503ea7429802cad9042d0e4.	2018-11-30 20:56:51 +08:00
李超勇	a816925776	Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#370 ) * Remove unused row-oriented format flags * Remove unused row-oriented format flags * Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead	2018-11-30 20:36:58 +08:00
Mingyu Chen	9a2ad18428	Add path info of replica in catalog (#327 ) Add path info of replica in catalog Also fix a bug that when calling check_none_row_oriented_table, store is null, it cannot be used to create table. Instead, OLAPHeader can be used to get storage type information.	2018-11-19 17:42:46 +08:00
Zhao Chun	a2b299e3b9	Reduce UT binary size (#314 ) * Reduce UT binary size Almost every module depends on ExecEnv, and ExecEnv contains all singletons, which makes the UT binary contain all object files. This patch separates ExecEnv's initialization and destruction into another file to avoid other files' dependence. Also, status.cc includes debug_util.h, which depends on tuple.h and tuple_row.h, so I move get_stack_trace() to stack_util.cpp to reduce status.cc's dependence. I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a Issue: #292 * Update	2018-11-15 16:17:23 +08:00
chenhao7253886	37b4cafe87	Change variable and namespace name in BE (#268 ) Change 'palo' to 'doris'	2018-11-02 10:22:32 +08:00
morningman	2868793b6b	Change license to Apache License 2.0 (#262 )	2018-11-01 09:06:01 +08:00
morningman	051aced48d	Missing many files in last commit In the last commit, a lot of files were missed	2018-10-31 16:19:21 +08:00
morningman	4f6f8572de	Added: Add 3 new metrics of Backends: host_fd_metrics, process_fd_metrics and process_thread_metrics, to monitor open file number and thread number. Added: Support getting column size and precision info of table or view using JDBC. Updated: Change the Prometheus type name GAUGE to lowercase, to fit the latest Prometheus version. Updated: Backend ip saved in FE will be compared with BE's local ip when doing heartbeat, to avoid false positive heartbeat response. Updated: Using version_num of tablet instead of calculating nice value to select cumulative compaction candidates. Fixed: Predicates should not be pushed down to subquery which contains limit clause. Fixed: Fix the formula of calculating BE load score. Fixed: Fix a bug that in some edge cases, non-master Frontend may wait for an unnecessarily long timeout after forwarding cmd to Master FE. Fixed: A bug that granting privs on more than one table does not work. Fixed: Support 'Insert into' table which contains HLL columns. Fixed: ExportStmt's toSql() method may throw NullPointer Exception if table does not exist. Fixed: Remove unnecessary 'get capacity' operation to avoid IO impact. Internal commit id: merge to c16bd603a53dfe2089ff95704c698a738c317792	2018-10-26 14:48:21 +08:00
morningman	cc74efb3c5	merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224 ) 1. Apache HDFS broker support HDFS HA and Hadoop kerberos authentication. 2. New Backup and Restore function. Use Fs Broker to backup your data to HDFS or restore them from HDFS. 3. Table-Level Privileges. Grant fine-grained privileges on table-level to specified user. 4. A lot of bugs fixed. 5. Performance improvement.	2018-08-24 17:12:26 +08:00
morningman	19997510a6	merge to 9625ef157dd44c58802d63cb7547f037b75fd710 (#208 ) 1. Implement Backend http server using libevent instead of mongoose. 2. Remove Old Hypertable rpc framework, use brpc instead. 3. Change rpc from FE to BE to brpc. 4. Fs broker support HDFS HA. 5. add more metrics to monitor. 6. Lots of bug fixed.	2018-07-17 09:20:30 +08:00
morningman	2419384e8a	push 3.3.19 to github (#193 ) * push 3.3.19 to github * merge to 20ed420122a8283200aa37b0a6179b6a571d2837	2018-05-15 20:38:22 +08:00
LingBin	51d5c727a7	make UUID to be authentication token (#107 )	2017-09-20 21:25:10 +08:00
李超勇	6486be64c3	fix license statement (#29 ) * change picture to word * change picture to word * SHOW FULL TABLES WHERE Table_type != VIEW sql can not execute * change license description	2017-08-18 19:16:23 +08:00
cyongli	e2311f656e	baidu palo	2017-08-11 17:51:21 +08:00

35 Commits