doris

Author	SHA1	Message	Date
Xinyi Zou	b41eaa5ac0	[fix](memtracker) Introduce orphan mem tracker to verify memory tracking accuracy (#12794 ) The mem hook consumes the orphan tracker by default. If the thread does not attach other trackers, by default all consumption will be passed to the process tracker through the orphan tracker. In real time, consumption of all other trackers + orphan tracker consumption = process tracker consumption. Ideally, all threads are expected to attach to the specified tracker, so that "all memory has its own ownership", and the consumption of the orphan mem tracker is close to 0, but greater than 0.	2022-09-21 15:47:10 +08:00
Xinyi Zou	3bb042e45c	[fix](memtracker) Process physical mem check does not include tc/jemalloc allocator cache (#12688) The tcmalloc/jemalloc allocator cache does not participate in the mem check as part of the process physical memory, because new/malloc triggers the mem hook when served from the tcmalloc/jemalloc allocator cache but may not actually allocate physical memory, so failing in the mem hook is not expected in that case. In addition: the size of the tcmalloc/jemalloc allocator cache is recorded in a mem tracker whose parent is the process mem tracker and which is updated every 1s. Modify the process default mem_limit to 90%, expecting the mem tracker to effectively limit the memory usage of the process.	2022-09-17 11:31:01 +08:00
zxealous	254cb321b9	[optimize](remote) Optimize cache reader use a pre-created buffer when downloading the cache (#12165 ) * optimize cache reader * add description for config * optimize cache reader * optimize cache reader	2022-08-31 10:15:40 +08:00
Xinyi Zou	8370115cf6	[enhancement](memtracker) Improve performance of tracking real physical memory of PODArray #12168	2022-08-30 10:22:12 +08:00
Xinyi Zou	22157077e9	[fix](memtracker) Optimize the return msg of process memory limit exceed #12086 Return the real process memory information when the process exceeds mem limit Optimize the memory exceed limit log printing logic process tracker does not participate in process memory limit.	2022-08-26 14:28:46 +08:00
zxealous	54fc038dc5	[Fix](remote) Fix thread safety issue in cache (#11984 )	2022-08-24 18:14:14 +08:00
Xinyi Zou	1304a17600	[fix](memtracker) Improve performance of tracking real physical memory of PodArray #12021	2022-08-24 14:24:14 +08:00
Xinyi Zou	1fc5515a78	[enhancement](memory) Remove unused reservation tracker (#11969 )	2022-08-24 08:49:34 +08:00
Mingyu Chen	05da3d947f	[feature-wip](new-scan) add scanner scheduling framework (#11582 ) There are currently many types of ScanNodes in Doris. And most of the logic of these ScanNodes is the same, including: Runtime filter Predicate pushdown Scanner generation and scheduling So I intend to unify the common logic of all ScanNodes. Different data sources only need to implement different Scanners for data access. So that the future optimization for scan can be applied to the scan of all data sources, while also reducing the code duplication. This PR mainly adds 4 new class: VScanner All Scanners' parent class. The subclasses can inherit this class to implement specific data access methods. VScanNode The unified ScanNode, and is responsible for common logic including RuntimeFilter, predicate pushdown, Scanner generation and scheduling. ScannerContext ScannerContext is responsible for recording the execution status of a group of Scanners corresponding to a ScanNode. Including how many scanners are being scheduled, and maintaining a producer-consumer blocks queue between scanners and scan nodes. ScannerContext is also the scheduling unit of ScannerScheduler. ScannerScheduler schedules a ScannerContext at a time, and submits the Scanners to the scanner thread pool for data scanning. ScannerScheduler Unified responsible for all Scanner scheduling tasks Test: This work is still in progress and default is disabled. I tested it with jmeter with 50 concurrency, but currently the scanner is just return without data. The QPS can reach about 9000. I can't compare it to origin implement because no data is read for now. I will test it when new olap scanner is ready. Co-authored-by: morningman <morningman@apache.org>	2022-08-23 08:45:18 +08:00
Mingyu Chen	abbf75d302	[doc][refactor](metrics) Reorganize FE and BE metrics and add document (#11307 )	2022-08-02 11:34:06 +08:00
Xinyi Zou	73d8f5901d	fix mem tracker limiter (#11376 )	2022-08-01 09:44:04 +08:00
Luwei	d6f937cb01	(performance)[scanner] Isolate local and remote queries using different scanner… (#11006 )	2022-07-29 19:14:46 +08:00
Pxl	4e6a59df4c	[Improvement][chore] add const to all operator== (#11251 )	2022-07-27 21:46:47 +08:00
Xinyi Zou	b6bdb3bdbc	[fix] (mem tracker) Fix MemTracker accuracy (#11190 )	2022-07-27 18:59:24 +08:00
Xinyi Zou	4960043f5e	[enhancement] Refactor to improve the usability of MemTracker (step2) (#10823 )	2022-07-21 17:11:28 +08:00
Xinyi Zou	d5fa66d9a3	[Enhancement] [Memory] Limit memory usage use process actual physical memory (#10924 )	2022-07-19 11:08:39 +08:00
plat1ko	331fa50501	[feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280 ) This PR supports rowset level data upload on the BE side, so that there can be both cold data and hot data in a tablet, and there is no necessary to prohibit loading new data to cooled tablets. Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without perceiving the underlying filesystem. The abstracted `RemoteFileSystem` can try local caching strategies with different granularity, instead of caching segment files as before. To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory. In the future, `FileReader`s and `FileWriter`s should be unified.	2022-07-08 12:18:39 +08:00
Kidd	659e863bd7	[bugfix] fix tcmalloc hook cancel deadlock (#10514)	2022-07-01 10:41:59 +08:00
yiguolei	aab7dc956f	[refactor](load) Remove mini load (#10520 )	2022-06-30 23:21:41 +08:00
Xinyi Zou	deeb3028ad	[Enhancement] [Memory] [Vectorized] Stress test and optimize memory allocation (#9581 ) * vec stress test, Allocator introduce chunkallocator * fix comment	2022-06-29 02:57:51 +08:00
Dayue Gao	4d1e926b6c	[feature][config] introduce a new BE config storage_page_cache_shard_size (#9821 ) Co-authored-by: gaodayue <gaodayue@bytedance.com>	2022-05-28 10:17:09 +08:00
yiguolei	cd105bee0a	[refactor](es) Clean es tcp scannode and related thrift definitions (#9553 ) PaloExternalSourcesService is designed for es_scan_node using tcp protocol. But es tcp protocol need deploy a tcp jar into es code. Both es version and lucene version are upgraded, and the tcp jar is not maintained any more. So that I remove all the related code and thrift definitions.	2022-05-14 10:03:55 +08:00
chenlinzhong	c9961c9bb9	[style] clang-format all c++ code (#9305 ) - sh build-support/clang-format.sh to clang-format all c++ code	2022-04-29 16:14:22 +08:00
hongbin	c71ffc01de	[Refactor] Cleanup some unused include (#9063 )	2022-04-18 09:52:31 +08:00
yiguolei	aeee738af0	Revert "[Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635 )" (#8666 ) This reverts commit 6bc982c37436acf288f566cf10e084731b80fa44.	2022-03-25 18:32:50 +08:00
yiguolei	6bc982c374	[Refactor][agent_task] Remove etl mgr and etl job pool from be (#8635 )	2022-03-25 15:17:39 +08:00
Xinyi Zou	e17aef9467	[refactor] refactor the implementation of MemTracker, and related usage (#8322) Modify the implementation of MemTracker: 1. Simplify a lot of useless logic; 2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing; 3. Add consume/release cache, trigger a consume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes; 4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection 5. Added Virtual MemTracker, consume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently; 6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later; 7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env; 8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.; Modify where MemTracker is used: 1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code; 2. Added trackers for global objects such as ChunkAllocator and StorageEngine; 3. Added more fine-grained trackers such as ExprContext; 4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode; 5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;	2022-03-11 22:04:23 +08:00
Zhengguo Yang	409aefdfbf	[refactor] add some log when close parquet file (#8144 )	2022-02-21 09:36:53 +08:00
yinzhijian	936da4f10a	[feature](thread-pool) Support thread pool per disk for scanners (#7994) Support a thread pool per disk for scanners to prevent poor performance caused by disks with high I/O utilization. Key points: 1. each disk has a thread pool for scanners 2. whenever a thread pool of one disk runs out of local work, tasks can be retrieved from other threads (disks). This is done round-robin. Performance testing: vec version: 25% faster than single thread pool in a high io util disk test case; normal version: 8% faster than single thread pool in a high io util disk test case	2022-02-18 09:40:58 +08:00
yiguolei	6b9cb49779	[Refactor] remove plugin folder in be since it is useless and it need fPIC tag to build and we will remove all fPIC tag in the future (#8008 )	2022-02-12 12:28:14 +08:00
Zhengguo Yang	f8d086d87f	[feature](rpc) (experimental)Support implement UDF through GRPC protocol. (#7519 ) Support implement UDF through GRPC protocol. This brings several benefits: 1. The udf implementation language is not limited to c++, users can use any familiar language to implement udf 2. UDF is decoupled from Doris, udf will not cause doris coredump, udf computing resources are separated from doris, and doris services are not affected But RPC's UDF has a fixed overhead, so its performance is much slower than C++ UDF, especially when the amount of data is large. Create function like ``` CREATE FUNCTION rpc_add(INT, INT) RETURNS INT PROPERTIES ( "SYMBOL"="add_int", "OBJECT_FILE"="127.0.0.1:9999", "TYPE"="RPC" ); ``` Function service need to implement `check_fn` and `fn_call` methods Note: THIS IS AN EXPERIMENTAL FEATURE, THE INTERFACE AND DATA STRUCTURE MAY BE CHANGED IN FUTURE !!!	2022-02-08 09:25:09 +08:00
HappenLee	e1d7233e9c	[feature](vectorization) Support Vectorized Exec Engine In Doris (#7785) # Proposed changes Issue Number: close #6238 Co-authored-by: HappenLee <happenlee@hotmail.com> Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com> Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com> Co-authored-by: wangbo <506340561@qq.com> Co-authored-by: emmymiao87 <522274284@qq.com> Co-authored-by: Pxl <952130278@qq.com> Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com> Co-authored-by: thinker <zchw100@qq.com> Co-authored-by: Zeno Yang <1521564989@qq.com> Co-authored-by: Wang Shuo <wangshuo128@gmail.com> Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com> Co-authored-by: Gabriel <gabrielleebuaa@gmail.com> Co-authored-by: xinghuayu007 <1450306854@qq.com> Co-authored-by: weizuo93 <weizuo@apache.org> Co-authored-by: yiguolei <guoleiyi@tencent.com> Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com> Co-authored-by: awakeljw <993007281@qq.com> Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com> Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com> ## Problem Summary: ### 1. Some code from clickhouse ClickHouse is an excellent implementation of the vectorized execution engine database, so here we have referenced and learned a lot from its excellent implementation in terms of data structure and function implementation. We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers. The following comment has been added to the code from Clickhouse, eg: // This file is copied from // https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h // and modified by Doris ### 2. Support exec node and query: * vaggregation_node * vanalytic_eval_node * vassert_num_rows_node * vblocking_join_node * vcross_join_node * vempty_set_node * ves_http_scan_node * vexcept_node * vexchange_node * vintersect_node * vmysql_scan_node * vodbc_scan_node * volap_scan_node * vrepeat_node * vschema_scan_node * vselect_node * vset_operation_node * vsort_node * vunion_node * vhash_join_node You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set. ### 3. Data Model Vec Exec Engine Support Dup/Agg/Unq table, Support Block Reader Vectorized. Segment Vec is working in process. ### 4. How to use 1. Set the environment variable `set enable_vectorized_engine = true; `(required) 2. Set the environment variable `set batch_size = 4096; ` (recommended) ### 5. Some diff from origin exec engine https://github.com/doris-vectorized/doris-vectorized/issues/294 ## Checklist(Required) 1. Does it affect the original behavior: (No) 2. Has unit tests been added: (Yes) 3. Has document been added or modified: (No) 4. Does it need to update dependencies: (No) 5. Are there any changes that cannot be rolled back: (Yes)	2022-01-18 10:07:15 +08:00
HappenLee	4e02109926	[refactor][fix](constants-fold) Refactor the code of fold constant mgr and fix some undefined behavior and mem leak (#7373 ) 1. Fix some memory leaks 2. Remove redundant and invalid code 3. Fix some buggy writes to reduce extra memory copies and return null pointers to string 4. Reframing the naming to make the structure clearer	2021-12-14 15:53:56 +08:00
Zhengguo Yang	e2d3d0134e	Add a method to get doris current memory usage (#6979) Add all memory usage check when TryConsume memory	2021-11-24 10:07:54 +08:00
Mingyu Chen	ed7a873a44	[Memory Usage] Implement segment lru cache to save memory of BE (#6829 )	2021-10-25 10:07:15 +08:00
caiconghui	0393c9b3b9	[Optimize] Support send batch parallelism for olap table sink (#6397 ) * Support send batch parallelism for olap table sink Co-authored-by: caiconghui <caiconghui@xiaomi.com>	2021-08-30 11:03:09 +08:00
Mingyu Chen	3f2fdd236f	Add scan thread token (#6443 )	2021-08-27 10:56:17 +08:00
Mingyu Chen	fa382f8602	[Bug][MemLimit] Modify the memory limit of storage page cache (#6451 ) This CL mainly changes: 1. the `storage_page_cache_limit` is based on config `mem_limit` the default is 20% of `mem_limit`. 2. the `buffer_pool_limit` is based on config `mem_limit` the default is 20% of `mem_limit`. 3. the `buffer_pool_clean_pages_limit` is based on config `buffer_pool_limit` the default is 50% of `buffer_pool_limit` 4. Fix some show bugs of lru cache hit ratio and usage ratio 5. Fix a create view bug that `notEvalNondeterministicFunction` should be reset after analyze.	2021-08-19 14:16:53 +08:00
qiye	a1a37c8cba	[Feature] Support calc constant expr by BE (#6233 ) At present, some constant expression calculations are implemented on the FE side, but they are incomplete, and some expressions cannot be completely consistent with the value calculated by BE (such as part of the time function) Therefore, we provide a way to pass all the constants in SQL to BE for calculation, and then begin to analyze and plan SQL. This method can also solve the problem that some complex constant calculations issued by BI cannot be processed on the FE side. Here through a session variable enable_fold_constant_by_be to control this function, which is disabled by default.	2021-07-19 10:25:53 +08:00
Mingyu Chen	d57c2344e1	[MemTracker] Refactored the hierarchical structure of memtracker (#5956 ) To avoid showing too many memtracker on BE web pages. The MemTracker level now has 3 levels: OVERVIEW, TASK and VERBOSE. OVERVIEW Mainly used for main memory consumption module such as Query/Load/Metadata. TASK is mainly used to record the memory overhead of a single task such as a single query, load, and compaction task. VERBOSE is used for other more detailed memtrackers.	2021-06-16 09:44:24 +08:00
HappenLee	1a81b9e160	[MemTracker] Some enhancements of MemTracker (#5783) 1. Give some MemTrackers a reasonable parent MemTracker instead of the root tracker 2. Make each MemTracker easy to trace. 3. Add a show level to MemTracker to control how many trackers are shown on the web page.	2021-05-19 09:27:50 +08:00
Yingchun Lai	be733cfa9c	[Metrics] Add some large memtrackers' metric (#5614 ) MemTracker can provide memory consumption for us to find out which module consume more memory, but it's just a current value, this patch add metrics for some large memory consumers, then we can find out which module consume more memory in timeline, it would be useful to troubleshoot OOM problems and optimize configs.	2021-04-21 09:15:04 +08:00
曹建华	a2e83e65d2	[BE] Add scanner/etl thread pool queue size metric. (#5619 ) * [BE] Add scanner/etl thread pool queue size metric. * Fix compilation problem.	2021-04-20 09:14:57 +08:00
HappenLee	b423274f17	[Enhance] Make MemTracker more accurate (#5515) (#5516) * [Enhance] Make MemTracker more accurate (#5515) This PR is mainly about: 1. Improve the readability of MemTrackers' name 2. Add the MemTracker of: * Load * Compaction * SchemaChange * StoragePageCache * TabletManager 3. Change SchemaChange to a Singleton * revise some code for Code Review * change the name of mem_tracker * keep reader_context have the same lifetime of rowset_reader in schema change. * change vlog notice to log(warning) in schema change	2021-04-08 09:14:55 +08:00
Yingchun Lai	0131c33966	[Enhance] Improve the readability of memtrackers' name (#5455 ) Improve the readability of memtrackers' name, then you will be happy to read website be_ip:port/mem_tracker	2021-03-11 22:33:31 +08:00
Skysheepwang	6c098e45fc	[Optimize][Cache]Implementation of Separated Page Cache (#5008 ) #4995 Implementation of Separated Page Cache - Add config "index_page_cache_ratio" to set the ratio of capacity of index page cache - Change the member of StoragePageCache to maintain two type of cache - Change the interface of StoragePageCache for selecting type of cache - Change the usage of page cache in read_and_decompress_page in page_io.cpp - add page type as argument - check if current page type is available in StoragePageCache (cover the situation of ratio == 0 or 1) - Add type as argument in superior call of read_and_decompress_page - Change Unit Test	2021-01-04 12:19:24 +08:00
sduzh	6fedf5881b	[CodeFormat] Clang-format cpp sources (#4965 ) Clang-format all c++ source files.	2020-11-28 18:36:49 +08:00
Zhengguo Yang	75e0ba32a1	Fixes some be typo (#4714 )	2020-10-13 09:37:15 +08:00
HaiBo Li	5f43fb3bde	[Cache][BE] LRU cache for sql/partition cache #2581 (#4005 ) 1. Find the cache node by SQL Key, then find the corresponding partition data by Partition Key, and then decide whether to hit Cache by LastVersion and LastVersionTime 2. Refers to the classic cache algorithm LRU, which is the least recently used algorithm, using a three-layer data structure to achieve 3. The Cache elimination algorithm is implemented by ensuring the range of the partition as much as possible, to avoid the situation of partition discontinuity, which will reduce the hit rate of the Cache partition, 4. Use the two thresholds of maximum memory and elastic memory to control to avoid frequent elimination of data	2020-09-20 20:50:51 +08:00
Yingchun Lai	b780df697a	[refactor] Optimize threads usage mode in BE (#4440 ) BE can not graceful exit because some threads are running in endless loop. This patch do the following optimization: - Use the well encapsulated Thread and ThreadPool instead of std::thread and std::vector<std::thread> - Use CountDownLatch in thread's loop condition to avoid endless loop - Introduce a new class Daemon for daemon works, like tcmalloc_gc, memory_maintenance and calculate_metrics - Decouple statistics type TaskWorkerPool and StorageEngine notification by submit tasks to TaskWorkerPool's queue - Reorder objects' stop and deconstruct in main(), i.e. stop network services at first, then internal services - Use libevent in pthreads mode, by calling evthread_use_pthreads(), then EvHttpServer can exit gracefully in multi-threads - Call brpc::Server's Stop() and ClearServices() explicitly	2020-09-06 20:19:14 +08:00


70 Commits
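
Note on commit b41eaa5ac0 (orphan mem tracker): the accounting rule in that message, "consumption of all other trackers + orphan tracker consumption = process tracker consumption", can be modelled with a small sketch. This is an illustration only and not Doris's MemTracker API: the class TrackerModel and its methods attach, detach, consume, orphan, process, named_total and balanced are hypothetical names, and a single "current thread" is assumed for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical, simplified model of the rule described in the commit message.
// It is NOT the Doris implementation; all names here are made up.
class TrackerModel {
public:
    // Attach the (single, modelled) thread to a named tracker.
    void attach(const std::string& name) { attached_ = name; }
    // Detach; later consumption falls back to the orphan tracker.
    void detach() { attached_.clear(); }

    // Stand-in for the allocation hook. Negative bytes model a release.
    void consume(int64_t bytes) {
        if (attached_.empty()) {
            orphan_ += bytes;            // no tracker attached: orphan absorbs it
        } else {
            named_[attached_] += bytes;  // owned by the attached tracker
        }
        process_ += bytes;               // the process tracker sees everything
        assert(balanced());              // the invariant holds after every update
    }

    int64_t orphan() const { return orphan_; }
    int64_t process() const { return process_; }

    // Sum of consumption over all named (non-orphan) trackers.
    int64_t named_total() const {
        int64_t sum = 0;
        for (const auto& kv : named_) sum += kv.second;
        return sum;
    }

    // named trackers + orphan tracker == process tracker
    bool balanced() const { return named_total() + orphan_ == process_; }

private:
    std::string attached_;
    std::unordered_map<std::string, int64_t> named_;
    int64_t orphan_ = 0;
    int64_t process_ = 0;
};
```

In this model, a large orphan() value means that much memory was allocated by code that never attached a tracker, which is the accuracy check the commit message describes.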