doris

Author	SHA1	Message	Date
yiguolei	aea3e4e59b	[refactor] Remove version hash from BE and related test in BE (#8027 )	2022-02-14 09:29:27 +08:00
yiguolei	7d7e3a39f5	[refactor] Remove snapshot converter and unused Protobuf Definitions (#8026 ) 1. remove snapshot converter 2. remove unused protobuf definitions 3. move some macro as const variables	2022-02-12 16:06:04 +08:00
Pxl	b26e7e3c28	[feature](function)(vec) support locate function (#7988 ) * support function locate in vectorized engine * add ut and fix some bug	2022-02-12 16:00:37 +08:00
Pxl	64fb8dab39	[feature] (function)(vec) support pmod function (#7977 )	2022-02-12 16:00:11 +08:00
Zhengguo Yang	7a73645eee	[refactor] remove some unused code (#8022 )	2022-02-12 15:17:28 +08:00
yiguolei	6b9cb49779	[Refactor] remove plugin folder in be since it is useless and it needs the fPIC tag to build, and we will remove all fPIC tags in the future (#8008 )	2022-02-12 12:28:14 +08:00
Pxl	a4e7c76336	[Enhancement] use std::search to replace custom search (#7999 )	2022-02-11 10:47:58 +08:00
wangyongfeng	690b3b7283	[doc] Translate the Chinese comments (#7982 ) Translate the Chinese comments of file /be/src/common/config.h	2022-02-10 15:08:45 +08:00
smallhibiscus	2e27827c73	[doc] Added http interface return example to obtain the specified table structure information (#7955 ) 1. Added http interface return example in table-schema-action.md. 2. Correct typos in the document in error.md. 3. Modify the content of the code comments in the text_converter.hpp file.	2022-02-10 15:07:28 +08:00
Zhengguo Yang	5029ef46c9	[fix] fix ltrim result that may be incorrect in some cases (#7963 ) According to https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html, for the built-in functions `int __builtin_clz (unsigned int x)` and `int __builtin_ctz (unsigned int x)`, the result is undefined if x is 0, and gcc and clang return different values in that case. So we handle the case of 0 separately.	2022-02-09 13:06:37 +08:00
zuochunwei	db20e1f323	[refactor](storage) VGenericIterator to reuse Schema (#7858 ) 1. reuse Schema to avoid copying, because clone Schema will generate a lot of sub Field object 2. call interface provided by Block to reduce code lines	2022-02-09 13:06:03 +08:00
Pxl	0553ce2944	[feature](vectorization) support function topn && remove some unused code (#7793 )	2022-02-09 13:05:31 +08:00
Mingyu Chen	3048ce8a4f	[improvement][refactor](vec) Refactor serde of vec block and using brpc attachment (#7939 ) This PR mainly changes: 1. Change the define of PBlock The new PBlock consists of a set of PColumnMeta and a binary buffer. The PColumnMeta records the metadata information of all columns in the Block, while the buffer stores the serialized binary data of all columns. 2. Refactor the serialize/deserialize method of data type Rewrite the `serialize()/deserialize()` of IDataType. And also add a new method `get_uncompressed_serialized_bytes()` to get the total length of uncompressed serialized data of a column. 3. Rewrite the serialize/deserialize method of Block Now, when serializing a Block to PBlock, it will first get the total length of uncompressed serialized data of all columns in this Block, and then allocate the memory to write the serialized data to the buffer. 4. Use brpc attachment to transmit the serialized column data	2022-02-08 11:11:42 +08:00
HappenLee	ef233701b3	[feature](vec)(load) Support vtablet sink to enable insert into by using vec query engine (#7957 ) Support vtablet sink to enable insert into query in vec query engine	2022-02-08 11:04:09 +08:00
HappenLee	505acae931	[fix](vectorization) make sure the mem address use in agg is align in proper way before use (#7960 )	2022-02-08 10:05:03 +08:00
caoliang-web	8fcae0f0f4	[refactor] Modify the content of code comments (#7950 ) Co-authored-by: caol <caol@shuhaisc.com>	2022-02-08 09:55:46 +08:00
Zhengguo Yang	f8d086d87f	[feature](rpc) (experimental)Support implement UDF through GRPC protocol. (#7519 ) Support implement UDF through GRPC protocol. This brings several benefits: 1. The udf implementation language is not limited to c++, users can use any familiar language to implement udf 2. UDF is decoupled from Doris, udf will not cause doris coredump, udf computing resources are separated from doris, and doris services are not affected But RPC's UDF has a fixed overhead, so its performance is much slower than C++ UDF, especially when the amount of data is large. Create function like ``` CREATE FUNCTION rpc_add(INT, INT) RETURNS INT PROPERTIES ( "SYMBOL"="add_int", "OBJECT_FILE"="127.0.0.1:9999", "TYPE"="RPC" ); ``` Function service need to implement `check_fn` and `fn_call` methods Note: THIS IS AN EXPERIMENTAL FEATURE, THE INTERFACE AND DATA STRUCTURE MAY BE CHANGED IN FUTURE !!!	2022-02-08 09:25:09 +08:00
HappenLee	9eb1d1df27	[fix](vec) fix block mem use-after-free bug in agg table read (#7944 )	2022-02-06 00:34:38 +08:00
HappenLee	51abaa89f3	[fix](vec) Fix some bugs about vec engine (#7884 ) 1. mem leak in vcollector iter 2. query slow in agg table limit 10 3. query slow in SSB q4,q5,q6	2022-02-03 19:21:17 +08:00
Mingyu Chen	c0e59e59aa	[fix][refactor] fix bugs and refactor some code by lint (#7871 ) 1. Fix some `passedByValue` issues. 2. Fix some `dereferenceBeforeCheck` issues. 3. Fix some `uninitMemberVar` issues. 4. Fix some iterator `eraseDereference` issues. 5. Fix compile issue introduced from #7923 #7905 #7848	2022-02-01 14:31:14 +08:00
Mingyu Chen	82f421a019	[fix](brpc-attachment) Fix bug that may cause BE crash when enabling `transfer_data_by_brpc_attachment` (#7921 ) This PR mainly changes: 1. Fix bug when `transfer_data_by_brpc_attachment` is enabled In `data_stream_sender`, we will send a serialized PRowBatch data to multiple Channels. And if `transfer_data_by_brpc_attachment` is enabled, we will mistakenly clear the data in PRowBatch after sending PRowBatch to the first Channel. As a result, the following Channels cannot receive the correct data, causing an error. So I use a separate buffer instead of `tuple_data` in PRowBatch to store the serialized data and reuse it in multiple channels. 2. Fix bug that the offset in serialized row batch may overflow Use int64 to replace int32 offset. And for compatibility, add a new field `new_tuple_offsets` in PRowBatch.	2022-02-01 08:51:16 +08:00
zuochunwei	4e783afa7a	[feature] add Generic debug timer for debugging or profiling (#7923 ) Add a group of debug timers for profiling or testing. Unlike the specifically named timers, these can be used for any custom purpose.	2022-01-31 22:15:43 +08:00
HappenLee	358bd79fb1	[improvement](vec)(Join) Mem reuse to speed up join operator (#7905 ) 1. Reuse the mem of output block in vec join node 2. Add the function `replicate` in column	2022-01-31 22:14:12 +08:00
dataroaring	14c209c7cf	[refactor] remove useless if statement in segment_writer.cpp (#7864 )	2022-01-31 22:12:54 +08:00
Pxl	2003da7cf9	[fix](ut) fix abs function ut (#7938 )	2022-01-31 14:58:29 +08:00
wangbo	3f221e1d0b	[fix](memory-leak) using unique_ptr to refactor some fields (#7933 ) Using unique_ptr to refactor some class members. Fix mem leak for `SegmentIterator`'s `_pre_eval_block_predicate`.	2022-01-30 16:49:04 +08:00
Pxl	3ee000c13c	[chore] support build with libc++ && add some build config (#7903 ) support LIBCPP/LDD/BUILD_META_TOOL for build.sh	2022-01-30 16:47:22 +08:00
924060929	c1fef37399	[improvement](runtime-filter) Support adaptive runtime filter(#7546 ) (#7645 ) Change 1: Support an adaptive runtime filter: IN_OR_BLOOM_FILTER The processing logic is: if the number of rows in the right table < runtime_filter_max_in_num, the IN predicate takes effect; if the number of rows in the right table >= runtime_filter_max_in_num, the Bloom filter takes effect. Change 2: The default runtime filter is changed to IN_OR_BLOOM_FILTER	2022-01-30 16:46:52 +08:00
Zeno Yang	a72eaa2b2e	[fix](Vectorized) optimize dict page decoder init (#7917 ) this may cause mem leak	2022-01-29 11:47:57 +08:00
qiye	6a1a2a2ed5	[fix](query) Add init function for result_file_sink (#7927 ) Add init function in `result_file_sink` to fix the error "Empty partition info", which is occasionally reported when using SELECT INTO OUTFILE.	2022-01-29 10:08:57 +08:00
EmmyMiao87	1d900d8605	(fix)[planner] Fix the right tuple ids in empty set node (#7931 ) The tuple ids of the empty set node must be exactly the same as the tuple ids of the origin root node. In the issue, we found that once the tree where the root node is located has a window function, the tuple ids of the empty set node cannot be calculated correctly. This pr mostly fixes the problem. In order to calculate the correct tuple ids, the tuple ids obtained from the SelectStmt.getMaterializedTupleIds() function in the past are changed to directly use the tuple ids of the origin root node. Although we tried to fix #7929 by modifying the SelectStmt.getMaterializedTupleIds() function, this method can't get the tuple of the last correct window function. So we use other ways to construct tupleids of empty nodes.	2022-01-29 09:46:05 +08:00
zhangstar333	fb6e22f4ca	[Fix] fix memory leak in be unit test (#7857 ) 1. fix be unit test memory leak 2. ignore minidump test with ASAN test	2022-01-29 01:00:38 +08:00
zhangstar333	071be928f9	[fix](vectorized) fix bug multi distinct function get wrong type (#7900 )	2022-01-28 22:31:41 +08:00
zuochunwei	1ba20b1dbb	[improvement](storage) improving Column inserter (#7855 ) * optimize Column inserter * DCHECK * DCHECK Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-01-27 14:18:15 +08:00
caiconghui	d2386dd85d	[improvement](rewrite) Make RewriteDateLiteralRule to be compatible with mysql (#7876 )	2022-01-27 10:32:18 +08:00
zuochunwei	df76a5b34c	refactor SegmentIterator (#7852 ) Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-01-26 16:44:02 +08:00
zuochunwei	ec5ecd1604	handle conflict (#7836 ) Co-authored-by: zuochunwei <zuochunwei@meituan.com>	2022-01-26 16:33:37 +08:00
HappenLee	015371ac72	[fix](grouping-set) Fix the bug of grouping set core in both vec and non vec query engine (#7800 )	2022-01-26 16:15:30 +08:00
Henry2SS	f227472db2	[chore] fix error while compiling with -O3 (#7890 )	2022-01-26 12:53:56 +08:00
Pxl	cd73a6b84b	[chore] fix clang compile error (#7883 )	2022-01-26 12:53:35 +08:00
zhangstar333	a6831535e9	[Vectorized][Bug] fix bug of coalesce function (#7827 )	2022-01-25 20:44:16 +08:00
Zeno Yang	c2520c878c	[Improvement](Vectorized) optimize SegmentIterator predication evaluate (#7795 ) * [Improvement](Vectorized) optimize SegmentIterator predication evaluate * fix bug * move bytes32_mask_to_bits32_mask to util/simd/bits.h	2022-01-22 15:31:07 +08:00
wangbo	cf02e43ec1	[improvement](vectorized) optimize dict read (#7805 )	2022-01-22 10:18:30 +08:00
Pxl	b56c568a8d	[fix](vectorized) fix fold const value fail at datetime type (#7803 )	2022-01-22 10:16:38 +08:00
shee	b14d1c54fd	[fix](function) fix vec round reference #7421 (#7801 ) reference #7421	2022-01-22 10:09:10 +08:00
Amos Bird	800a36343a	[chore] Prolog of hermetic build with GCC 11 and Clang 13. (#7712 ) Prepare to generate hermetic build using GCC 11 and Clang 13. The ideal toolchain would be ldb toolchain generated by [ldb_toolchain_gen.sh](https://github.com/amosbird/ldb_toolchain_gen/releases/download/v0.3/ldb_toolchain_gen.sh) To kick off a clang build, set `DORIS_TOOLCHAIN=clang` before running any build scripts.	2022-01-21 12:12:04 +08:00
Mingyu Chen	0efef1b332	[fix](schema-change) Fix bug that schema change may return -102 error (#7808 ) When using linked schema change, we need to check if all rowsets are of the same type, ALPHA or BETA. otherwise, we need to use direct schema change to convert the data.	2022-01-21 10:59:54 +08:00
weizuo93	ed39ff1500	[feature](compaction) Support triggering compaction for a specific partition manually (#7521 ) Add statement to trigger cumulative or base compaction for a specified partition.	2022-01-21 09:27:06 +08:00
Mingyu Chen	ef984a6a72	[improvement](load) Improve load fault tolerance (#7674 ) Currently, if we encounter a problem with a replica of a tablet during the load process, such as a write error, rpc error, -235, etc., it will cause the entire load job to fail, which results in a significant reduction in Doris' fault tolerance. This PR mainly changes: 1. refined the judgment of failed replicas in the load process, so that the failure of a few replicas will not affect the normal completion of the load job. 2. fix a bug introduced from #7754 that may cause BE coredump	2022-01-20 09:23:21 +08:00
Mingyu Chen	7574d39d14	[fix](bitmap-index) Fix bug that bitmap index may return wrong result. (#7788 ) Fix the bug that occurs in the following case. 1. `column1` has a bitmap index. 2. `column1` has a lot of index items in the bitmap index, and the index page is divided into two levels. 3. `column1`'s value range is `[1000, 10000000]`. 4. the query condition is `column1 > 0` 5. an empty result is returned, while the expected result is 9999000 rows.	2022-01-19 12:27:08 +08:00
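
Commit 5029ef46c9 above handles a zero argument separately because the GCC documentation leaves `__builtin_clz` and `__builtin_ctz` undefined for 0. The following is a minimal sketch of that idea, not the code from the commit: the helper names `safe_clz32` and `safe_ctz32` and the choice of returning 32 for a zero input are assumptions made for illustration, and the builtins require GCC or Clang.

```cpp
#include <cstdint>

// Hypothetical helpers (not from the Doris source): wrappers that give
// __builtin_clz / __builtin_ctz a defined result when x == 0.

// Count leading zero bits of a 32-bit value.
inline int safe_clz32(std::uint32_t x) {
    if (x == 0) {
        return 32;  // assumed convention: all 32 bits are zero
    }
    return __builtin_clz(x);  // well-defined because x != 0
}

// Count trailing zero bits of a 32-bit value.
inline int safe_ctz32(std::uint32_t x) {
    if (x == 0) {
        return 32;  // assumed convention: all 32 bits are zero
    }
    return __builtin_ctz(x);  // well-defined because x != 0
}
```

With the zero case branched out, the result no longer depends on which compiler built the binary.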


1627 Commits
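
The selection rule described in commit c1fef37399 (IN_OR_BLOOM_FILTER) can be sketched as a single comparison. This is an illustrative sketch only: the enum `RuntimeFilterKind` and the function `resolve_in_or_bloom` are hypothetical names, and only the threshold name `runtime_filter_max_in_num` and the `<` / `>=` rule come from the commit message.

```cpp
#include <cstddef>

// Hypothetical type (not from the Doris source): the two concrete filters
// that an IN_OR_BLOOM_FILTER can resolve to.
enum class RuntimeFilterKind { In, Bloom };

// Pick the concrete filter from the number of rows on the right (build) side.
// Fewer rows than runtime_filter_max_in_num: use the IN predicate.
// Otherwise: use the Bloom filter.
inline RuntimeFilterKind resolve_in_or_bloom(std::size_t right_table_rows,
                                             std::size_t runtime_filter_max_in_num) {
    return right_table_rows < runtime_filter_max_in_num
                   ? RuntimeFilterKind::In
                   : RuntimeFilterKind::Bloom;
}
```

Note that a row count exactly equal to the threshold resolves to the Bloom filter, matching the `>=` condition in the commit message.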