doris

Author	SHA1	Message	Date
HappenLee	52c45e38af	[Refactor](RF) refactor the profile of rf and pipeline-x support local ignore (#31287 ) * [Refactor](RF) refactor the profile of rf and pipeline-x support local ignore * fix local merge filter	2024-02-23 19:05:06 +08:00
ZhangJian He	72f4e7e2d1	[security] Don't print token (#30227 )	2024-01-24 09:59:45 +08:00
zhiqiang	8c59e16f81	[opt](query cancel) optimization for query cancel #28778	2023-12-22 12:48:37 +08:00
HHoflittlefish777	d69cdf8635	[improve](heartbeat) show more info when receive invalid cluster id (#27975 )	2023-12-05 11:10:22 +08:00
Xinyi Zou	fc12362a6d	[feature-wip](arrow-flight)(step2) FE support Arrow Flight server (#24314 ) This is a POC, the design documentation will be updated soon	2023-09-20 14:42:54 +08:00
yiguolei	1d1a9e2bfc	[improvement](graceful shutdown) waiting for all query finished when graceful shutdown (#23865 ) In some cloud native deployment scenario, BE(especially the Compute Node BE) will be add to cluster and remove from cluster very frequently. User's query will fail if there is a fragment is running on the shutting down BE. Users could use stop_be.sh --grace, then BE will wait all running queries to stop to avoiding running query failure, but if the waiting time exceed the limit, then be will exit directly. During this period, FE will not send any queries to BE and waiting for all running queries to stop	2023-09-05 09:52:28 +08:00
plat1ko	25b6e4deb2	[fix](daemon) Fix incorrect initialization order of daemon services (#23578 ) Current initialization dependency: Daemon ───┬──► StorageEngine ──► ExecEnv ──► Disk/Mem/CpuInfo │ │ BackendService ─┘ However, original code incorrectly initialize Daemon before StorageEngine. This PR also stop and join threads of daemon services in their dtor, to ensure Daemon services release resources in reverse order of initialization via RAII.	2023-08-31 19:46:38 +08:00
hzq	c083336bbe	[Improvement](pipeline) Cancel outdated query if original fe restarts (#23582 ) If any FE restarts, queries that is emitted from this FE will be cancelled. Implementation of #23704	2023-08-31 17:58:52 +08:00
Pxl	f35ab37e1e	[Bug](materialized-view) fix load db use analyzer to analyze diffrent metaindex (#23673 ) fix load db use analyzer to analyze diffrent metaindex	2023-08-31 12:35:38 +08:00
zhangdong	17e7c1ca53	[fix](fqdn)Fqdn with ipv6 (#22454 ) now,`hostname_to_ip` only can resolve `ipv4`,Therefore, a method is provided to parse ipv4 or ipv6 based on parameters。 when `_heartbeat` call `hostname_to_ip`,Resolve to ipv4 or ipv6, determined by `BackendOptions.is_bind_ipv6` Decision Additionally, a method is provided to first attempt to parse the host into ipv4, and then try ipv6 if it fails	2023-08-25 21:24:55 +08:00
Yongqiang YANG	14fe95578e	[enhancement](heartbeat) print a warning log for long running heartbeat (#20559 )	2023-06-09 08:39:35 +08:00
zhangdong	b481045372	[improvement](k8s)when the IP of be is different from the IP specified by the master, obey it (#19951 ) When FE is deployed on a virtual machine and CN is deployed on k8s, FE needs to use a proxy IP to communicate with CN nodes, and BE cannot resolve the proxy IP from the local network card. We change the previous verification rules and obey the IP assigned by the master	2023-05-26 15:40:29 +08:00
zhangdong	8ec18660fe	[improvement](FQDN)Remove unused code (#19638 )	2023-05-16 08:48:20 +08:00
zhangdong	b129c9901b	[improvement](FQDN)Change the implementation of fqdn (#19123 ) Main changes: 1. If fqdn is enabled in the configuration file, when fe starts, localAddr will obtain fqdn instead of IP, priority_ Networks will fail 2. The IP and host names of Backend and Front are combined into one field, host. When fqdn is enabled, it represents the host name, and when not enabled, it represents the IP address 3. The communication between clusters directly uses fqdn, and various Connection pool add authentication mechanisms to prevent the IP address of the domain name from changing and the connection between nodes from making errors 4. No longer requires polling to verify if the IP has changed, delete fqdnManager 5. Change the method of verifying the legitimacy of nodes between FEs from obtaining client IP to displaying the identity of the transmitting node itself in the HTTP request header or the message body of the throttle 6. When processing the heartbeat, if BE finds that the host stored by itself is inconsistent with the host stored by the master, after verifying the legitimacy of the host, it will change its own host instead of directly reporting an error 7. Simplify the generation logic of fe name Scope of influence: 1. Establishing communication connections between clusters 2. Determine whether it is the same node through attributes such as IP 3. Print Log 4. Information display 5. Address Splicing 6. k8s deployment 7. Upgrade compatibility Test plan: 1. Change the IP address of the node, while keeping the fqdn unchanged, change the IP addresses of fe and be, and verify whether the cluster can read and write data normally 2. Use the master code to generate metadata, and use the previous metadata on the current pr to verify whether it is compatible with the old version (upgrading is no longer supported if fqdn has been enabled before) 3. Deploy fe and be clusters using k8s to verify whether the cluster can read and write data normally 4. According to https://doris.apache.org/zh-CN/docs/dev/admin-manual/cluster-management/fqdn?_highlight=fqdn#%E6%97%A7%E9%9B%86%E7%BE%A4%E5%90%AF%E7%94%A8fqdn Upgrading old clusters 5. Use streamload to specify the fqdn of fe and be to import data separately 6. Use different users to start transactions and write data using insert statements	2023-05-11 00:44:48 +08:00
Mingyu Chen	84d040bdbf	[fix](heartbeat) fix update BE last start time (#18962 ) Sometimes the LastStartTime info in show backends result is unchanged even if BE restart. This PR fix it	2023-04-27 09:59:04 +08:00
Adonis Ling	9e960f4c4f	[chore](build) Use include-what-you-use to optimize includes (#18681 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-17 11:44:58 +08:00
plat1ko	f3aea7f0f0	[Enhancement](status) Unify error code and enable customed err msg for BE internal errors (#14744 )	2022-12-11 23:33:18 +08:00
Pxl	2fab0c45c7	[Feature](runtime-filter) add runtime filter breaking change adapt (#13246 ) add runtime filter breaking change adapt	2022-10-28 10:59:28 +08:00
huangzhaowei	17ba40f947	[feature-wip](CN Node)Support compute node (#13231 ) Introduce the node role to doris, and the table creation and tablet scheduler will control the storage only assign to the BE nodes.	2022-10-25 21:44:33 +08:00
Mingyu Chen	cf2b93532b	[fix](file-scanner) fix some logic about broker load with parquet with new file scanner (#13135 ) Fix some logic about broker load using new file scanner, with parquet format: 1. If columns are specified in load stmt, but none of them are in parquet file, error will be thrown like `err: No columns found in file`. See `parquet_s3_case4` 2. If the first column of table are not in table, the result number of rows is wrong. See `parquet_s3_case8` 3. If column specified in `columns` in load stmt does not exist in file and table, error will be thrown like: `failed to find default value expr for slot: x1`. See `parquet_s3_case2`	2022-10-08 13:08:08 +08:00
Pxl	9607f60845	[Feature](serialize) move block_data_version to fe heart beat (#12667 ) Move block_data_version from be config to fe heart beat	2022-09-27 18:25:54 +08:00
Pxl	fd0bd395ac	[Enhancement] Remove some unused include (#10035 )	2022-06-17 10:47:25 +08:00
BePPPower	51db78d375	[refactor] modify all OLAP_LOG_WARNING to LOG(WARNING) (#9473 ) Co-authored-by: BePPPower <fangtiewei@selectdb.com>	2022-05-10 09:25:25 +08:00
chenlinzhong	c9961c9bb9	[style] clang-format all c++ code (#9305 ) - sh build-support/clang-format.sh to clang-format all c++ code	2022-04-29 16:14:22 +08:00
yiguolei	0ff7de4157	[refactor] remove agent status (#8273 ) There are 3 error code types in BE: OLAPStatus AgentStatus Status. It is very confused and sometimes conflict during write code. I will try to unify them to Status.	2022-03-09 13:04:50 +08:00
Zhengguo Yang	6c6380969b	[refactor] replace boost smart ptr with stl (#6856 ) 1. replace all boost::shared_ptr to std::shared_ptr 2. replace all boost::scopted_ptr to std::unique_ptr 3. replace all boost::scoped_array to std::unique<T[]> 4. replace all boost:thread to std::thread	2021-11-17 10:18:35 +08:00
Zhengguo Yang	24d38614a0	[Dependency] Upgrade thirdparty libs (#6766 ) Upgrade the following dependecies: libevent -> 2.1.12 OpenSSL 1.0.2k -> 1.1.1l thrift 0.9.3 -> 0.13.0 protobuf 3.5.1 -> 3.14.0 gflags 2.2.0 -> 2.2.2 glog 0.3.3 -> 0.4.0 googletest 1.8.0 -> 1.10.0 snappy 1.1.7 -> 1.1.8 gperftools 2.7 -> 2.9.1 lz4 1.7.5 -> 1.9.3 curl 7.54.1 -> 7.79.0 re2 2017-05-01 -> 2021-02-02 zstd 1.3.7 -> 1.5.0 brotli 1.0.7 -> 1.0.9 flatbuffers 1.10.0 -> 2.0.0 apache-arrow 0.15.1 -> 5.0.0 CRoaring 0.2.60 -> 0.3.4 orc 1.5.8 -> 1.6.6 libdivide 4.0.0 -> 5.0 brpc 0.97 -> 1.0.0-rc02 librdkafka 1.7.0 -> 1.8.0 after this pr compile doris should use build-env:1.4.0	2021-10-15 13:03:04 +08:00
Zhengguo Yang	d641a26490	[Refactor] Remove boost filesystem (#5579 ) * use std::filesystem instead of boost Co-authored-by: Mingyu Chen <morningman.cmy@gmail.com>	2021-04-08 09:11:59 +08:00
sduzh	6fedf5881b	[CodeFormat] Clang-format cpp sources (#4965 ) Clang-format all c++ source files.	2020-11-28 18:36:49 +08:00
Yingchun Lai	1151a0063c	[Bug] Make 'LastStartTime' in backends list as the actual BE start time (#4872 ) We use 'LastStartTime' in backends list to check whether there is an unexpected restart of BE, but it will be changed as BE's first heartbeat time after FE restarted, it would be better to set it to BE's actual start time.	2020-11-11 21:24:06 +08:00
Yingchun Lai	b780df697a	[refactor] Optimize threads usage mode in BE (#4440 ) BE can not graceful exit because some threads are running in endless loop. This patch do the following optimization: - Use the well encapsulated Thread and ThreadPool instead of std::thread and std::vector<std::thread> - Use CountDownLatch in thread's loop condition to avoid endless loop - Introduce a new class Daemon for daemon works, like tcmalloc_gc, memory_maintenance and calculate_metrics - Decouple statistics type TaskWorkerPool and StorageEngine notification by submit tasks to TaskWorkerPool's queue - Reorder objects' stop and deconstruct in main(), i.e. stop network services at first, then internal services - Use libevent in pthreads mode, by calling evthread_use_pthreads(), then EvHttpServer can exit gracefully in multi-threads - Call brpc::Server's Stop() and ClearServices() explicitly	2020-09-06 20:19:14 +08:00
Yingchun Lai	e71152132c	[metrics] Redesign metrics to 3 layers (#4115 ) Redesign metrics to 3 layers: MetricRegistry - MetricEntity - Metrics MetricRegistry : the register center MetricEntity : the entity registered on MetricRegistry. Generally a MetricRegistry can be registered on several MetricEntities, each of MetricEntity is an independent entity, such as server, disk_devices, data_directories, thrift clients and servers, and so on. Metric : metrics of an entity. Such as fragment_requests_total on server entity, disk_bytes_read on a disk_device entity, thrift_opened_clients on a thrift_client entity. MetricPrototype: the type of a metric. MetricPrototype is a global variable, can be shared by the same metrics across different MetricEntities.	2020-08-08 11:23:01 +08:00
Mingyu Chen	9a934ec9f6	[Load] Add more info in SHOW LOAD result (#3391 ) Fix #3390 This CL add more info in `JobDetails` column of `SHOW LOAD` result for Broker Load Job. For example: ``` { "Unfinished backends": { "9c3441027ff948a0-8287923329a2b6a7": [10002] }, "All backends": { "9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006] }, "ScannedRows": 2390016, "TaskNumber": 1, "FileNumber": 1, "FileSize": 1073741824 } ``` 2 newly added keys: `Unfinished backends` indicates the BE which task on them are not finished. `All backends` indicates the BE which this job has tasks on it. One more thing, I pass the Backend Id along with the heartbeat msg from FE to BE, so that BE can know the Id of themselves.	2020-04-26 21:30:23 +08:00
Yingchun Lai	8276c6d7f8	Show BE version in 'show backends;' (#3074 ) In a large scale cluster, we may rolling upgrade BEs, this patch add a column named 'Version' for command 'show backends;', as well as website '/system?path=//backends', to provide a method to check whether there is any BE missing upgraded.	2020-03-12 22:15:13 +08:00
LingBin	5440e19d01	Improve the triggering strategy of BE report (#2881 ) Currently, the report from BE to FE is completed in the background threads of `AgentServer` (`report_tablet_thread` and `report_disk_stat_thread`). These two threads will sleep and be in a standby state after each report, if there is any need to report immediately, they will be notified and wake up immediately to report. For example, when background thread (`disk_monitor_thread`) in `StorageEngine` finds some tablets were deleted, it will notify `AgentServer` to trigger a report immediately. In the current implementation, in order to report ASAP, a local variable (`_is_drop_tables`) and two other flags are used to record whether reporting is needed, and then `StorageEngine::disk_monitor_thread` checks the value of this variable every time it runs, to determine whether it needs to be triggered Reporting. This is actually superfluous, and it may result in untimely notifications, as shown below: ``` (thread_1) (thread_2) disk-monitor disk-stat-reporter \| \| \| reporting \| \| notify_1 \| \| \| \| wait_for_notify(will wait until timeout or next notification) \| \| V V ``` When `report_tablet_thread` has not started waiting, `StorageEngine::disk_monitor_thread` triggers a notification, so this notification will not be received by `report_tablet_thread`, resulting in the BE not reporting to the FE until the lock times out or the next round of `disk_monitor_thread` detection. This change restructures the triggering implementation, and solves the above problem. This change also changes some methods(that do not need to be public) to private.	2020-02-11 20:38:44 +08:00
kangpinghuang	c07f37d78c	[Segment V2] Add a control framework between FE and BE through heartbeat #2247 (#2364 ) The control framework is implemented through heartbeat message. Use uint64_t as flags to control different functions. Now add a flag to set the default rowset type to beta.	2019-12-12 12:18:32 +08:00
lichaoyong	0d48a3961c	Refactor Storage Engine (#1478 ) NOTE: This patch would modify all Backend's data. And this will cause a very long time to restart be. So if you want to interferer your product environment, you should upgrade backend one by one. 1. Refactoring be is to clarify the structure the codes. 2. Use unique id to indicate a rowset. Nameing rowset with tablet_id and version will lead to many conflicts among compaction, clone, restore. 3. Extract an rowset interface to encapsulate rowsets with different format.	2019-07-15 21:18:22 +08:00
ZHAO Chun	9d03ba236b	Uniform Status (#1317 )	2019-06-14 23:38:31 +08:00
Mingyu Chen	ff0dd0d2da	Support SSL authentication with Kafka in routine load job (#1235 )	2019-06-07 16:29:01 +08:00
李超勇	3d324e38ea	Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#372 )	2018-11-30 20:59:40 +08:00
ZHAO Chun	49302955c8	Revert "Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#370 )" (#371 ) This reverts commit a816925776de06dc7503ea7429802cad9042d0e4.	2018-11-30 20:56:51 +08:00
李超勇	a816925776	Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead (#370 ) * Remove unused row-oriented format flags * Remove unused row-oriented format flags * Remove OLAP_LOG_INFO log format. Use LOG(INFO) instead	2018-11-30 20:36:58 +08:00
Mingyu Chen	9a2ad18428	Add path info of replica in catalog (#327 ) Add path info of replica in catalog Also fix a bug that when calling check_none_row_oriented_table, store is null, it cannot be used to create table. Instead, OLAPHeader can be used to get storage type information.	2018-11-19 17:42:46 +08:00
Zhao Chun	a2b299e3b9	Reduce UT binary size (#314 ) * Reduce UT binary size Almost every module depend on ExecEnv, and ExecEnv contains all singleton, which make UT binary contains all object files. This patch seperate ExecEnv's initial and destory to anthor file to avoid other file's dependence. And status.cc include debug_util.h which depend tuple.h tuple_row.h, and I move get_stack_trace() to stack_util.cpp to reduce status.cc's dependence. I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a Issue: #292 * Update	2018-11-15 16:17:23 +08:00
chenhao7253886	37b4cafe87	Change variable and namespace name in BE (#268 ) Change 'palo' to 'doris'	2018-11-02 10:22:32 +08:00
morningman	2868793b6b	Change license to Apache License 2.0 (#262 )	2018-11-01 09:06:01 +08:00
morningman	051aced48d	Missing many files in last commit In last commit, a lot of files has been missed	2018-10-31 16:19:21 +08:00
morningman	4f6f8572de	Added: Add 3 new metrics of Backends: host_fd_metrics, process_fd_metrics and process_thread_metrics, to monitor open file number and thread number. Added: Support getting column size and precision info of table or view using JDBC. Updated: Change the promethues type name GAUGE to lowercase, to fit the latest promethues version. Updated: Backend ip saved in FE will be compared with BE's local ip when doing heartbeat, to avoid false positive heartbeat response. Updated: Using version_num of tablet instead of calculating nice value to select cumulative compaction candicates. Fixed: Predicates should not be pushed down to subquery which contains limit clause. Fixed: Fix the formula of calculating BE load score. Fixed: Fix a bug that in some edge cases, non-master Fontend may wait for a unnecessary long timeout after forwarding cmd to Master FE. Fixed: A bug that granting privs on more than one table does not work. Fixed: Support 'Insert into' table which contains HLL columns. Fixed: ExportStmt' toSql() method may throw NullPointer Exception if table does not exist. Fixed: Remove unnecessary 'get capacity' operation to avoid IO impact. Internal commit id: merge to c16bd603a53dfe2089ff95704c698a738c317792	2018-10-26 14:48:21 +08:00
morningman	cc74efb3c5	merge to ddb65b69f9c788e359e191889cb31f15279c41ec (#224 ) 1. Apache HDFS broker support HDFS HA and Hadoop kerberos authentication. 2. New Backup and Restore function. Use Fs Broker to backup your data to HDFS or restore them from HDFS. 3. Table-Level Privileges. Grant fine-grained privileges on table-level to specified user. 4. A lot of bugs fixed. 5. Performance improvement.	2018-08-24 17:12:26 +08:00
morningman	19997510a6	merge to 9625ef157dd44c58802d63cb7547f037b75fd710 (#208 ) 1. Implement Backend http server using libevent instead of mongoose. 2. Remove Old Hypertable rpc framework, use brpc instead. 3. Change rpc from FE to BE to brpc. 4. Fs broker support HDFS HA. 5. add more metrics to monitor. 6. Lots of bug fixed.	2018-07-17 09:20:30 +08:00

1 2

54 Commits