doris

Author	SHA1	Message	Date
Adonis Ling	2b6f85ab96	[chore](macOS) Fix BE UT (#14307 ) #13195 left some unresolved issues. One of them is that some BE unit tests fail. This PR fixes this issue. Now, we can run the command ./run-be-ut.sh --run successfully on macOS.	2022-11-18 10:13:38 +08:00
Xinyi Zou	0b945fe361	[enhancement](memtracker) Refactor mem tracker hierarchy (#13585 ) The mem tracker can be logically divided into 4 layers: 1) process 2) type 3) query/load/compaction task etc. 4) exec node etc. The type layer includes:
enum Type {
    GLOBAL = 0, // Life cycle is the same as the process, e.g. Cache and default Orphan
    QUERY = 1, // Count the memory consumption of all Query tasks.
    LOAD = 2, // Count the memory consumption of all Load tasks.
    COMPACTION = 3, // Count the memory consumption of all Base and Cumulative tasks.
    SCHEMA_CHANGE = 4, // Count the memory consumption of all SchemaChange tasks.
    CLONE = 5, // Count the memory consumption of all EngineCloneTask. Note: Memory that does not contain make/release snapshots.
    BATCHLOAD = 6, // Count the memory consumption of all EngineBatchLoadTask.
    CONSISTENCY = 7 // Count the memory consumption of all EngineChecksumTask.
}
Object pointers are no longer saved between layers, and the values of process and each type are periodically aggregated. Other fix: In "[fix](memtracker) Fix transmit_tracker null pointer because phamp is not thread safe" (#13528), I tried to separate the memory that was manually abandoned in the query from the orphan mem tracker. In actual testing, however, the accuracy of this part of the memory could not be guaranteed, so it is put back into the orphan mem tracker.	2022-11-08 09:52:33 +08:00
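The layering described in #13585 can be illustrated with a small sketch. This is not the Doris implementation: `TaskTracker`, `Snapshot`, and `aggregate` are hypothetical names, and the sketch assumes that task-level trackers only record their own type and consumption, while the type and process totals are recomputed by a periodic pass instead of being updated through parent pointers.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Mirrors the enum quoted in the commit message; everything else is hypothetical.
enum class TrackerType : std::size_t {
    GLOBAL = 0, QUERY = 1, LOAD = 2, COMPACTION = 3,
    SCHEMA_CHANGE = 4, CLONE = 5, BATCHLOAD = 6, CONSISTENCY = 7
};
constexpr std::size_t kTypeCount = 8;

// Layer 3: a task-level tracker. It holds no pointer to the type or process layer.
struct TaskTracker {
    TrackerType type;
    std::atomic<int64_t> consumption{0};

    explicit TaskTracker(TrackerType t) : type(t) {}
    void consume(int64_t bytes) { consumption.fetch_add(bytes, std::memory_order_relaxed); }
};

// Layers 1 and 2: totals produced by one aggregation pass.
struct Snapshot {
    std::array<int64_t, kTypeCount> per_type{};
    int64_t process = 0;
};

// Intended to be called periodically (e.g. from a background thread):
// sums task consumption into its type slot and into the process total.
inline Snapshot aggregate(const std::vector<const TaskTracker*>& tasks) {
    Snapshot snap;
    for (const TaskTracker* t : tasks) {
        const int64_t bytes = t->consumption.load(std::memory_order_relaxed);
        snap.per_type[static_cast<std::size_t>(t->type)] += bytes;
        snap.process += bytes;
    }
    return snap;
}
```

Because the upper layers are derived values, a task tracker can be destroyed without touching any shared parent object; the cost is that the type and process numbers lag by at most one aggregation interval.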
plat1ko	db07e51cd3	[refactor](status) Refactor status handling in agent task (#11940 ) Refactor TaggableLogger. Refactor status handling in agent task: unify the log format in TaskWorkerPool; pass Status to the top caller, and replace some OLAPInternalError with a Status carrying a more detailed error message; return early on the opposite condition to reduce indentation.	2022-08-29 12:06:01 +08:00
Xinyi Zou	1fc5515a78	[enhancement](memory) Remove unused reservation tracker (#11969 )	2022-08-24 08:49:34 +08:00
Xinyi Zou	4960043f5e	[enhancement] Refactor to improve the usability of MemTracker (step2) (#10823 )	2022-07-21 17:11:28 +08:00
Mingyu Chen	dce18cb325	[doc] Add window functions sql help doc (#9393 )	2022-05-07 08:43:51 +08:00
Mingyu Chen	e5d4cf01ed	[fix](ut) fix a potential memory leak in BE ut (#9362 )	2022-05-05 20:47:31 +08:00
Zhengguo Yang	5a44eeaf62	[refactor] Unify all unit tests into one binary file (#8958 ) 1. Solve the previous problems of the unit test files being too large (1.7G+) and the unit test link time being too long 2. Unify all unit tests into one file to significantly reduce unit test execution time to less than 3 mins 3. Temporarily disable stream_load_test.cpp, metrics_action_test.cpp, load_channel_mgr_test.cpp because they re-implement part of the code and affect other tests	2022-04-12 15:30:40 +08:00
Xinyi Zou	eeae516e37	[Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476 ) Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G Implement a new way of memory statistics based on TCMalloc New/Delete Hook, MemTracker and TLS, and it is expected that all memory new/delete/malloc/free of the BE process can be counted.	2022-03-20 23:06:54 +08:00
Pxl	a8af8d2981	[fix](vectorized) fix core dump on get_json_string and add some ut (#8496 )	2022-03-17 10:08:31 +08:00
camby	3ba4de0d27	[fix](ut) fix some UT compile or run failed cases (#8489 )	2022-03-16 11:38:35 +08:00
Xinyi Zou	e17aef9467	[refactor] refactor the implement of MemTracker, and related usage (#8322 ) Modify the implementation of MemTracker: 1. Simplify a lot of useless logic; 2. Added MemTrackerTaskPool as the ancestor of all query and import trackers. It is used to track the local memory usage of all executing tasks; 3. Add a consume/release cache, which triggers a consume/release when the accumulated memory exceeds the parameter mem_tracker_consume_min_size_bytes; 4. Add a new memory leak detection mode (experimental feature), which throws an exception when the remaining statistical value is greater than the specified range at the time the MemTracker is destructed, and prints the accurate statistical value in HTTP (parameter: memory_leak_detection); 5. Added Virtual MemTracker, whose consume/release will not sync to the parent. It will be used when introducing the TCMalloc Hook to record memory later, to record the specified memory independently; 6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later; 7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env; 8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.; Modify where MemTracker is used: 1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code; 2. Added trackers for global objects such as ChunkAllocator and StorageEngine; 3. Added more fine-grained trackers such as ExprContext; 4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode; 5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;	2022-03-11 22:04:23 +08:00
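Item 3 of #8322 (the consume/release cache) can be sketched as follows. This is a simplified illustration under stated assumptions, not the Doris code: `BatchedTracker` is a hypothetical class, the cache is a plain member and not a thread-local, and the constructor argument stands in for the `mem_tracker_consume_min_size_bytes` parameter named in the commit message.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sketch: small consume/release calls accumulate in a local
// cache and are only applied to the shared counter once the accumulated
// amount reaches a threshold in either direction.
class BatchedTracker {
public:
    explicit BatchedTracker(int64_t min_size_bytes) : _min_size(min_size_bytes) {}

    void consume(int64_t bytes) {
        _untracked += bytes;
        if (_untracked >= _min_size || _untracked <= -_min_size) {
            flush();
        }
    }

    void release(int64_t bytes) { consume(-bytes); }

    // Applies the cached delta to the shared counter.
    void flush() {
        _consumption.fetch_add(_untracked, std::memory_order_relaxed);
        _untracked = 0;
    }

    int64_t consumption() const { return _consumption.load(std::memory_order_relaxed); }
    int64_t untracked() const { return _untracked; }

private:
    const int64_t _min_size;               // stand-in for mem_tracker_consume_min_size_bytes
    int64_t _untracked = 0;                // cached, not yet visible to readers
    std::atomic<int64_t> _consumption{0};  // shared, visible value
};
```

The trade-off is accuracy against contention: the visible value can be off by up to the threshold per cache, so a final `flush()` is needed before the tracker's value is compared or reported exactly.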
Zhengguo Yang	6c6380969b	[refactor] replace boost smart ptr with stl (#6856 ) 1. replace all boost::shared_ptr with std::shared_ptr 2. replace all boost::scoped_ptr with std::unique_ptr 3. replace all boost::scoped_array with std::unique_ptr<T[]> 4. replace all boost::thread with std::thread	2021-11-17 10:18:35 +08:00
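The smart pointer substitutions listed in #6856 look roughly like this in practice. The snippet is an illustrative sketch only: `Buffer` and `fill_and_sum` are made-up names, and the comments show which Boost type each standard type would have replaced.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical type used only for this illustration.
struct Buffer {
    std::unique_ptr<int[]> data;  // was: boost::scoped_array<int>
    std::size_t size;

    explicit Buffer(std::size_t n) : data(new int[n]()), size(n) {}
};

inline int fill_and_sum(std::size_t n) {
    std::unique_ptr<Buffer> buf(new Buffer(n));             // was: boost::scoped_ptr<Buffer>
    std::shared_ptr<int> total = std::make_shared<int>(0);  // was: boost::shared_ptr<int>

    for (std::size_t i = 0; i < buf->size; ++i) {
        buf->data[i] = static_cast<int>(i) + 1;
    }
    for (std::size_t i = 0; i < buf->size; ++i) {
        *total += buf->data[i];
    }
    return *total;
}
```

One point to watch with the fourth substitution, which the sketch does not cover: a `std::thread` that is still joinable when it is destroyed calls `std::terminate`, so every thread has to be joined or detached explicitly.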
Zhengguo Yang	24d38614a0	[Dependency] Upgrade thirdparty libs (#6766 ) Upgrade the following dependencies: libevent -> 2.1.12 OpenSSL 1.0.2k -> 1.1.1l thrift 0.9.3 -> 0.13.0 protobuf 3.5.1 -> 3.14.0 gflags 2.2.0 -> 2.2.2 glog 0.3.3 -> 0.4.0 googletest 1.8.0 -> 1.10.0 snappy 1.1.7 -> 1.1.8 gperftools 2.7 -> 2.9.1 lz4 1.7.5 -> 1.9.3 curl 7.54.1 -> 7.79.0 re2 2017-05-01 -> 2021-02-02 zstd 1.3.7 -> 1.5.0 brotli 1.0.7 -> 1.0.9 flatbuffers 1.10.0 -> 2.0.0 apache-arrow 0.15.1 -> 5.0.0 CRoaring 0.2.60 -> 0.3.4 orc 1.5.8 -> 1.6.6 libdivide 4.0.0 -> 5.0 brpc 0.97 -> 1.0.0-rc02 librdkafka 1.7.0 -> 1.8.0. After this PR, compiling Doris should use build-env:1.4.0	2021-10-15 13:03:04 +08:00
HappenLee	9216735cfa	[New Featrue] Support Vectorization Execution Engine Interface For Doris (#6329 ) 1. FE vectorized plan code 2. Function register vec function 3. Diff function nullable type 4. New thirdparty code and new thrift struct	2021-08-11 14:54:06 +08:00
Zhengguo Yang	739c0268ff	[refactor] Remove decimal v1 related code from code base (#6079 ) Remove all DECIMAL V1 type code; this is a part of #6073	2021-07-07 10:26:32 +08:00
sduzh	6fedf5881b	[CodeFormat] Clang-format cpp sources (#4965 ) Clang-format all c++ source files.	2020-11-28 18:36:49 +08:00
sduzh	10e1e29711	Remove header file common/names.h (#4945 )	2020-11-26 17:00:48 +08:00
HuangWei	10f822eb43	[MemTracker] make all MemTrackers shared (#4135 ) We make all MemTrackers shared, in order to show MemTracker real-time consumption on the web. As follows: 1. Nearly all MemTracker raw ptrs -> shared_ptr 2. Use CreateTracker() to create a new MemTracker (in order to add itself to its parent) 3. RowBatch & MemPool still use raw ptrs of MemTracker; it is easy to ensure that the RowBatch & MemPool destructors run before MemTracker's destructor, so we don't change this code. 4. MemTracker can use RuntimeProfile's counter to calculate consumption, so RuntimeProfile's counter needs to be shared too. We add a shared counter pool to store the shared counters and don't change the other counters of RuntimeProfile. Note that this PR doesn't change the MemTracker tree structure, so there are still some orphan trackers, e.g. RowBlockV2's MemTracker. If you find that some shared MemTrackers track little memory and are too time-consuming, you can make them orphans; then it's fine to use the raw ptr.	2020-07-31 21:57:21 +08:00
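Point 2 of #4135 (a factory that registers the new tracker with its parent) can be sketched as below. Only the name `CreateTracker` comes from the commit message; the class `Tracker`, its members, and the choice to hold children as `weak_ptr` are assumptions made for this illustration and may differ from the real code.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of a shared tracker tree.
class Tracker {
public:
    // Factory: builds the tracker and adds it to its parent's child list.
    static std::shared_ptr<Tracker> CreateTracker(std::string label,
                                                  std::shared_ptr<Tracker> parent = nullptr) {
        std::shared_ptr<Tracker> tracker(new Tracker(std::move(label), parent));
        if (parent) {
            parent->_children.push_back(tracker);
        }
        return tracker;
    }

    // Consumption is propagated to every ancestor.
    void Consume(int64_t bytes) {
        for (Tracker* t = this; t != nullptr; t = t->_parent.get()) {
            t->_consumption += bytes;
        }
    }

    int64_t consumption() const { return _consumption; }

    // Number of children that are still alive (e.g. for a web status page).
    std::size_t live_children() const {
        std::size_t n = 0;
        for (const auto& child : _children) {
            if (!child.expired()) ++n;
        }
        return n;
    }

private:
    Tracker(std::string label, std::shared_ptr<Tracker> parent)
            : _label(std::move(label)), _parent(std::move(parent)) {}

    std::string _label;
    std::shared_ptr<Tracker> _parent;               // child keeps its parent alive
    std::vector<std::weak_ptr<Tracker>> _children;  // parent only observes children
    int64_t _consumption = 0;
};
```

Holding the parent strongly and the children weakly avoids a reference cycle: a status page can walk the tree from the root and skip expired entries, while a finished task's tracker is still freed as soon as its last owner releases it.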
Yingchun Lai	b58b1b3953	[metrics] Make DorisMetrics to be a real singleton (#3417 )	2020-05-04 09:20:53 +08:00
Yingchun Lai	72f3082358	[Metrics] Add some metrics for container size in BE (#3246 ) With these metrics we can observe the workload of BE, and they are also a way to check whether there is any problem in BE, such as a container growing too large and leading to OOM. This patch adds the following metrics:
```
Name                               Description
rowset_count_generated_and_in_use  The total count of rowset id generated and in use since BE last start
unused_rowsets_count               The total count of unused rowset waiting to be GC
broker_count                       The total count of brokers in management
data_stream_receiver_count         The total count of data stream receivers in management
fragment_endpoint_count            The total count of fragment endpoints of data stream in management, should always equal to data_stream_receiver_count
active_scan_context_count          The total count of active scan contexts
plan_fragment_count                The total count of plan fragments in executing
load_channel_count                 The total count of load channels in management
result_buffer_block_count          The total count of result buffer blocks for queries, each block has a limited queue size (default 1024)
result_block_queue_count           The total count of queues for fragments, each queue has a limited size (default 20, by config::max_memory_sink_batch_count)
routine_load_task_count            The total count of routine load tasks in executing
small_file_cache_count             The total count of cached small files' digest info
stream_load_pipe_count             The total count of stream load pipes, each pipe has a limited buffer size (default 1M)
tablet_writer_count                The total count of tablet writers
brpc_endpoint_stub_count           The total count of brpc endpoints
```
	2020-04-25 16:13:39 +08:00
Yingchun Lai	4a7a88ede1	[LSAN] Fix some memory leak detected by LSAN (#3326 )	2020-04-22 22:59:44 +08:00
trueeyu	099e0f74bd	Remove unused LLVM related codes of directory:be/src/exprs (#2910 ) (#2972 ) Remove unused LLVM related codes of directory (step 3): be/src/exprs (#2910). There is a lot of LLVM-related code in the code base, but it is not really used. Higher versions of GCC are not compatible with LLVM 3.4.2, the version currently used by Doris. This PR deletes all LLVM-related code in the directory be/src/exprs	2020-02-24 18:23:08 +08:00
yangzhg	3e6dfa31c4	[UnitTest] Fix BE unit test randomly failed (#2970 ) * fix HTTP server related unit test failures caused by the HTTP port already being in use * fix unit test failures in DEBUG build type	2020-02-21 22:21:02 +08:00
ZHAO Chun	1648226927	Adapt arrow 0.15 API (#2657 ) This CL supports arrow's zero copy read interface, which makes the code comply with arrow 0.15. The schema change unit test has some problems, so it is disabled in run-ut.sh	2020-01-04 15:54:29 +08:00
Yunfeng,Wu	f53f188c5d	Add arrow IPC serialization for Doris-Spark-Connector (#2013 )	2019-10-31 10:32:06 +08:00
Dayue Gao	f76dad289e	Basic implementation for BetaRowsetReader (#1718 )	2019-09-03 13:52:16 +08:00
ZHAO Chun	58801c6ab0	Support converting RowBatch and RowBlockV2 to/from Arrow (#1699 )	2019-08-27 11:30:00 +08:00

28 Commits