doris

Author	SHA1	Message	Date
Xinyi Zou	e17aef9467	[refactor] refactor the implementation of MemTracker, and related usage (#8322 ) Modify the implementation of MemTracker: 1. Simplify a lot of useless logic; 2. Add MemTrackerTaskPool, as the ancestor of all query and import trackers, used to track the local memory usage of all executing tasks; 3. Add a consume/release cache, triggering a consume/release when the accumulated memory exceeds the parameter mem_tracker_consume_min_size_bytes; 4. Add a new memory leak detection mode (experimental feature), which throws an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and prints the accurate statistical value in HTTP (parameter: memory_leak_detection); 5. Add Virtual MemTracker, whose consume/release will not sync to the parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently; 6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later; 7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env; 8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.; Modify where MemTracker is used: 1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code; 2. Add trackers for global objects such as ChunkAllocator and StorageEngine; 3. Add more fine-grained trackers such as ExprContext; 4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistics of scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode; 5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;	2022-03-11 22:04:23 +08:00
zhangstar333	e0ef9b8f6c	[refactor](vectorized) to_bitmap(-1) return NULL instead of return parse failed error_message (#8373 )	2022-03-11 17:21:47 +08:00
yiguolei	7cfcddd8df	[fix] brpc will check required field in proto and need_gen_rollup is moved will throw exception (#8420 )	2022-03-11 00:28:33 +08:00
yiguolei	d880559214	[refactor] remove old schema change code on BE (#8342 )	2022-03-09 13:05:44 +08:00
yiguolei	0ff7de4157	[refactor] remove agent status (#8273 ) There are 3 error code types in BE: OLAPStatus, AgentStatus, and Status. This is very confusing and sometimes causes conflicts when writing code. I will try to unify them to Status.	2022-03-09 13:04:50 +08:00
Pxl	cd8694e532	[feature][vectorized] support replace() (#8384 )	2022-03-08 18:57:12 +08:00
zhangstar333	454b45bea3	[feature](vectorize)(function) support regexp&&sm4&&aes functions (#8307 )	2022-03-08 13:14:02 +08:00
Zhengguo Yang	f52d479cbc	[fix](ut) fix be ut fragment_mgr_test compile failed (#8344 )	2022-03-05 14:43:20 +08:00
Zhengguo Yang	e7c417505c	[fix] fix hash table insert() may fail without the error being handled (#8207 )	2022-03-03 22:33:05 +08:00
Zhengguo Yang	f622ce0497	[refactor] remove types_test (#8289 ) 1. Remove types_test; it causes a core dump with newer versions of GCC or Clang because of memory alignment, as some code is vectorized by newer GCC or Clang 2. Change string type length to 2 GB instead of -1 3. Modify inaccessible code	2022-03-03 09:31:35 +08:00
yiguolei	8be71b69d5	[refactor] remove pusher.cpp and related mock test code (#8288 )	2022-03-03 09:30:54 +08:00
Zhengguo Yang	246ac4e37a	[fix] fix a bug of encryption function with iv may return wrong result (#8277 )	2022-03-02 17:26:44 +08:00
Adonis Ling	b40e9144cb	[feature-wip][array-type] Refactor type info for nested array. (#8279 )	2022-03-02 14:20:39 +08:00
zhangstar333	2b9b0fc1ec	[Fix] Function percentile input null return null (#8238 )	2022-03-01 14:42:48 +08:00
Pxl	7d0e36a054	[fix](be-ut) fix bitmap_ut result wrong && fix schema_change compile error (#8261 )	2022-03-01 11:11:02 +08:00
caiconghui	c66a9bf64b	[fix](be-ut) fix unit test bug for tablet_info_test (#8253 ) introduced from #8041	2022-02-27 10:44:20 +08:00
HappenLee	a6bc9cbe53	[Function] Refactor the function code of log (#8199 ) 1. Support returning null when input is invalid 2. Delete the useless code in vec function Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-24 11:06:58 +08:00
Mingyu Chen	9a7931cfed	[fix](mem-pool) fix bug that mem pool failed to allocate in ASAN mode (#8216 ) Also fix BE ut: 1. fix scheme_change_test memory leak 2. fix mem_pool_test: do not use DEFAULT_PADDING_SIZE = 0x10 in mem_pool when running ut. 3. remove plugin_test	2022-02-24 10:52:58 +08:00
Adonis Ling	0726a43a2a	[fix](be-ut) Fix unused-but-set-variable errors. (#8211 )	2022-02-23 21:43:15 +08:00
HappenLee	01fb25a498	[UT] Fix the UT of column_nullable_test (#8180 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-23 15:37:40 +08:00
wangbo	e3f1efcbbf	[Vec][Storage] Support delete condition;ut (#8091 ) Co-authored-by: Wang Bo <wangbo36@meituan.com>	2022-02-23 12:48:18 +08:00
wangbo	d17ed5e27a	[vectorization](storage)support seq column in storage layer (#8186 )	2022-02-23 12:23:31 +08:00
zhangstar333	31ab569c1d	[Vectorized][Feature] support some bitmap functions (#8138 )	2022-02-23 11:42:16 +08:00
zuochunwei	802fcbbb05	(#8162 )refactor binary dict Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-02-22 11:23:54 +08:00
jacktengg	f13fd13e1b	[fix] (schema change) Fix BE crash after schema change int column to varchar column(#8073 ) (#8142 ) Co-authored-by: jianping.teng <tengjp@outlook.com>	2022-02-22 09:22:00 +08:00
zuochunwei	5f50d9ae3b	predicate test bugfix (#8134 ) Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-02-19 12:05:26 +08:00
Zhengguo Yang	50864aca7d	[refactor] fix warnings when compiling with clang (#8069 )	2022-02-19 11:29:02 +08:00
HappenLee	bcde1f265a	[Function][Vectorized] Support least/greatest function (#8107 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-18 11:57:07 +08:00
HappenLee	68b24d608f	[fix] (vectorization)Fix nullable column compute the hash value error (#8105 ) Co-authored-by: lihaopeng <lihaopeng@baidu.com>	2022-02-18 11:20:47 +08:00
zuochunwei	a162f56284	(test) resolve unit test failed problem for VGenericIteratorsTest Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-02-17 20:03:07 +08:00
Pxl	f06c13a828	[feature](vec)(function) support function `convert_tz()` (#8060 )	2022-02-17 10:51:32 +08:00
HappenLee	bef1b55c1f	[feature][fix](vec)(function) Fix multi args function call the DATETIME type not effective in DATE type and add the alias function (#8050 ) 1. Support some function alias of mod/fmod, adddate/add_data 2. Support some function of multi args: week, yearweek 3. Fix bug of multi args function call the DATETIME type not effective in DATE type	2022-02-17 10:49:25 +08:00
yiguolei	aea3e4e59b	[refactor] Remove version hash from BE and related test in BE (#8027 )	2022-02-14 09:29:27 +08:00
Pxl	64f71ddae3	[fix](be-ut) fix segmentation fault at unaligned address int128 (#8021 )	2022-02-14 09:29:05 +08:00
Adonis Ling	18e2071278	[fix](be-unit-test) Fix memory problems in agg_test.cpp. (#8019 )	2022-02-14 09:23:40 +08:00
yiguolei	7d7e3a39f5	[refactor] Remove snapshot converter and unused Protobuf Definitions (#8026 ) 1. remove snapshot converter 2. remove unused protobuf definitions 3. move some macro as const variables	2022-02-12 16:06:04 +08:00
Pxl	b26e7e3c28	[feature](function)(vec) support locate function (#7988 ) * support function locate in vectorized engine * add ut and fix some bug	2022-02-12 16:00:37 +08:00
Zhengguo Yang	7a73645eee	[refactor] remove some unused code (#8022 )	2022-02-12 15:17:28 +08:00
Zhengguo Yang	5029ef46c9	[fix] fix ltrim result may be incorrect in some cases (#7963 ) According to https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html, for Built-in Function: int __builtin_clz/__builtin_ctz (unsigned int x), if x is 0, the result is undefined. So we handle the case of 0 separately; the function returns different results between GCC and Clang when x is 0.	2022-02-09 13:06:37 +08:00
Pxl	0553ce2944	[feature](vectorization) support function topn && remove some unused code (#7793 )	2022-02-09 13:05:31 +08:00
Mingyu Chen	3048ce8a4f	[improvement][refactor](vec) Refactor serde of vec block and using brpc attachment (#7939 ) This PR mainly changes: 1. Change the definition of PBlock: the new PBlock consists of a set of PColumnMeta and a binary buffer. The PColumnMeta records the metadata information of all columns in the Block, while the buffer stores the serialized binary data of all columns. 2. Refactor the serialize/deserialize method of data type: rewrite the `serialize()/deserialize()` of IDataType, and add a new method `get_uncompressed_serialized_bytes()` to get the total length of uncompressed serialized data of a column. 3. Rewrite the serialize/deserialize method of Block: now, when serializing a Block to PBlock, it will first get the total length of uncompressed serialized data of all columns in this Block, and then allocate the memory to write the serialized data to the buffer. 4. Use brpc attachment to transmit the serialized column data	2022-02-08 11:11:42 +08:00
Zhengguo Yang	f8d086d87f	[feature](rpc) (experimental)Support implement UDF through GRPC protocol. (#7519 ) Support implement UDF through GRPC protocol. This brings several benefits: 1. The udf implementation language is not limited to c++, users can use any familiar language to implement udf 2. UDF is decoupled from Doris, udf will not cause doris coredump, udf computing resources are separated from doris, and doris services are not affected But RPC's UDF has a fixed overhead, so its performance is much slower than C++ UDF, especially when the amount of data is large. Create function like ``` CREATE FUNCTION rpc_add(INT, INT) RETURNS INT PROPERTIES ( "SYMBOL"="add_int", "OBJECT_FILE"="127.0.0.1:9999", "TYPE"="RPC" ); ``` Function service need to implement `check_fn` and `fn_call` methods Note: THIS IS AN EXPERIMENTAL FEATURE, THE INTERFACE AND DATA STRUCTURE MAY BE CHANGED IN FUTURE !!!	2022-02-08 09:25:09 +08:00
Mingyu Chen	c0e59e59aa	[fix][refactor] fix bugs and refactor some code by lint (#7871 ) 1. Fix some `passedByValue` issues. 2. Fix some `dereferenceBeforeCheck` issues. 3. Fix some `uninitMemberVar` issues. 4. Fix some iterator `eraseDereference` issues. 5. Fix compile issue introduced from #7923 #7905 #7848	2022-02-01 14:31:14 +08:00
Mingyu Chen	82f421a019	[fix](brpc-attachment) Fix bug that may cause BE crash when enabling `transfer_data_by_brpc_attachment` (#7921 ) This PR mainly changes: 1. Fix a bug when `transfer_data_by_brpc_attachment` is enabled. In `data_stream_sender`, we send a serialized PRowBatch to multiple Channels. If `transfer_data_by_brpc_attachment` is enabled, we mistakenly clear the data in PRowBatch after sending it to the first Channel. As a result, the following Channels cannot receive the correct data, causing an error. So I use a separate buffer instead of `tuple_data` in PRowBatch to store the serialized data and reuse it across multiple channels. 2. Fix a bug that the offset in a serialized row batch may overflow. Use int64 offsets to replace int32. For compatibility, add a new field `new_tuple_offsets` in PRowBatch.	2022-02-01 08:51:16 +08:00
924060929	c1fef37399	[improvement](runtime-filter) Support adaptive runtime filter(#7546 ) (#7645 ) Change 1: Support an adaptive runtime filter: IN_OR_BLOOM_FILTER. The processing logic is: if the number of rows in the right table < runtime_filter_max_in_num, the IN predicate is used; if the number of rows in the right table >= runtime_filter_max_in_num, the Bloom filter is used. Change 2: The default runtime filter is changed to IN_OR_BLOOM_FILTER	2022-01-30 16:46:52 +08:00
zhangstar333	fb6e22f4ca	[Fix] fix memory leak in be unit test (#7857 ) 1. fix be unit test memory leak 2. ignore minidump test under ASAN	2022-01-29 01:00:38 +08:00
Pxl	cd73a6b84b	[chore] fix clang compile error (#7883 )	2022-01-26 12:53:35 +08:00
wangbo	cf02e43ec1	[improvement](vectorized) optimize dict read (#7805 )	2022-01-22 10:18:30 +08:00
Amos Bird	800a36343a	[chore] Prolog of hermetic build with GCC 11 and Clang 13. (#7712 ) Prepare to generate hermetic build using GCC 11 and Clang 13. The ideal toolchain would be ldb toolchain generated by [ldb_toolchain_gen.sh](https://github.com/amosbird/ldb_toolchain_gen/releases/download/v0.3/ldb_toolchain_gen.sh) To kick off a clang build, set `DORIS_TOOLCHAIN=clang` before running any build scripts.	2022-01-21 12:12:04 +08:00
Mingyu Chen	0efef1b332	[fix](schema-change) Fix bug that schema change may return -102 error (#7808 ) When using linked schema change, we need to check whether all rowsets are of the same type, ALPHA or BETA. Otherwise, we need to use direct schema change to convert the data.	2022-01-21 10:59:54 +08:00

584 Commits