doris

Author	SHA1	Message	Date
yiguolei	4e9b32d622	[bugfix](exception) remove fmt code to test if there still exist core (#19009 )	2023-04-25 07:24:14 +08:00
yiguolei	3899c08036	[optimize](compile) remove unused template param from load channel (#18980 ) * [optimize](compile) remove unused template param from load channel --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-24 23:36:47 +08:00
HappenLee	b2c26e17e1	[Compile](vec) Fix compile by BHREAD_SCANNER (#18979 )	2023-04-24 17:07:06 +08:00
Adonis Ling	16a394da0e	[chore](build) Use include-what-you-use to optimize includes (PART III) (#18958 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-24 14:51:51 +08:00
gitccl	296b0c92f7	[Enhancement](compaction) stop tablet compaction when table dropped (#18702 ) * [Enhancement](compaction) stop tablet compaction when table dropped * fix be ut	2023-04-24 11:04:27 +08:00
Mellorsssss	ab2a6864bc	[function](json) Json unquote (#18037 )	2023-04-24 10:33:29 +08:00
yiguolei	8d7a9fd21b	[refactor](exceptionsafe) add factory creator to some class (#18978 ) make vexprecontext,vexpr,function,query context,runtimestate thread safe. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-24 10:32:11 +08:00
Xinyi Zou	8e4710079d	[improvement](profile) Insert into add LoadChannel runtime profile (#18908 ) TabletSink and LoadChannel in BE have an M:N relationship. Every once in a while a LoadChannel randomly returns its own runtime profile to a TabletSink, so usually all LoadChannel runtime profiles are saved on each TabletSink, and the freshness of the same LoadChannel profile saved on different TabletSinks differs. Each TabletSink periodically reports to FE all the LoadChannel profiles it has saved, and the latest LoadChannel profile is kept according to the timestamp.	2023-04-24 09:41:57 +08:00
Jerry Hu	0c95d760fe	[fix](fixed_hashtable) The incorrect implementation of copy constructor (#18921 )	2023-04-24 08:36:52 +08:00
airborne12	07ea350201	[Fix](inverted index) fix memory leak when create bkd reader (#18914 ) The function compoundReader->openInput is called three times, and if any of these calls fail, an error is logged, and the function returns early. If one or two of the calls succeed, but the others fail, there might be a situation where the allocated memory for the IndexInput objects is not freed. To fix this, you could use std::unique_ptr to manage the memory for IndexInput objects. This would automatically clean up the memory when the function goes out of scope.	2023-04-23 23:21:44 +08:00
AlexYue	c3baa65de3	[feature](io) enable s3 file writer with multi part uploading concurrently (#17585 ) Formerly, S3FileWriter had to fill each buffer with 5MB or more and upload it as one part, and only after that was done could it process incoming data, which is blocking and inefficient. This PR adds a buffer pool so that data can be written into a memory buffer immediately if a free buffer is available, and then uploaded to S3. This PR does not yet handle the case where there is no free buffer gracefully; that is left as future work.	2023-04-23 23:19:44 +08:00
yiguolei	3736530585	[refactor](query context) rename query fragments context to query context and make query context safe (#18950 ) * [refactor](query context) rename query fragments context to query context and make query context safe --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-23 22:53:56 +08:00
ZenoYang	0da2cf270a	[improvement](fetch data) Merge result into batch to reduce rpc times (#17828 )	2023-04-23 15:07:28 +08:00
yiguolei	61b44108e2	[bugfix](asan) fix possible asan check bug in exception to string (#18936 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-23 12:26:36 +08:00
Ashin Gau	29f502380c	[opt](FileReader) merge small IO to optimize read performace (#18796 ) Add `MergeRangeFileReader` to merge small IO to optimize parquet&orc read performance. `MergeRangeFileReader` is a FileReader that efficiently supports random access in formats like parquet and orc. In order to merge small IO in parquet and orc, the random access ranges should be generated when creating the reader. The random access ranges are a list of ranges ordered by offset. The ranges should be read sequentially; they can be skipped, but can't be read repeatedly. When calling read_at, if the start offset is located in the random access ranges, the slice size should not span two ranges. For example, in parquet, the random access ranges are the column offsets in a row group. When reading at offset, if [offset, offset + 8MB) contains many random access ranges, the reader will read the data in [offset, offset + 8MB) as a whole, and copy the data in the random access ranges into small buffers (named boxes, default 1MB each, 64MB in total). A box can be occupied by many ranges, and uses a reference counter to record how many ranges are cached in the box. If the reference counter equals zero, the box can be released or reused by other ranges. When there is no empty box for a new read operation, the read operation is done directly. ## Effects The runtime of ClickBench reduces from 102s to 77s, and the runtime of Query 24 reduces from 24.74s to 9.45s. The profile of Query 24: ``` VFILE_SCAN_NODE (id=0):(Active: 8s344ms, % non-child: 83.06%) - FileReadBytes: 534.46 MB - FileReadCalls: 1.031K (1031) - FileReadTime: 28s801ms - GetNextTime: 8s304ms - MaxScannerThreadNum: 12 - MergedSmallIO: 0ns - CopyTime: 157.774ms - MergedBytes: 549.91 MB - MergedIO: 94 - ReadTime: 28s642ms - RequestBytes: 507.96 MB - RequestIO: 1.001K (1001) - NumScanners: 18 ``` 1001 request IOs have been merged into 94 IOs. ## Remaining problems 1. Add p2 regression test in next PR 2. Profiles are scattered in various codes and will be refactored in the next PR 3. Support ORC reader	2023-04-23 10:51:38 +08:00
Mryange	b81b470d4f	[fix](planner) fix pr "using crchash replace murmurhash in the runtime filter" (#18759 )	2023-04-23 10:33:35 +08:00
huanghaibin	9756be6bf0	[improvement](stream-load) use vector instead of skiplist when insert dup keys (#18686 )	2023-04-23 09:40:09 +08:00
amory	1ffd34f6f1	[Refact](type system)refact interconversion for jsonb with column (#18819 ) * refact jsonb to column * update * fix format * fixed * fix file head for compile	2023-04-22 14:01:05 +08:00
yiguolei	c80dc91a78	[bugfix](memleak) UserFunctionCache may have memory leak during close (#18913 ) * [bugfix](memleak) UserFunctionCache may have memory leak during close * [bugfix](memleak) UserFunctionCache may have memory leak during close --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-22 10:15:51 +08:00
gitccl	a1c05b5c13	[fix](compaction) fix potential null pointer dereference (#18915 )	2023-04-22 08:38:32 +08:00
TengJianPing	b75f4c97f3	[function](string) support char function (#18878 ) * [function](string) support char function * fix	2023-04-22 08:36:48 +08:00
Mryange	de0e89d1b4	[feature](function) Modified cast as time to behave more like MySQL (#18565 ) Because the underlying type of time was float64, select cast("19:22:18" as time) would result in a null value in the past. Results in the following:	2023-04-22 06:11:59 +08:00
yiguolei	24ee391a7e	[bugfix](memoryleak) inlist is memory leak if the type is int (#18883 ) * [bugfix](memoryleak) inlist is memory leak if the type is int --------- Co-authored-by: yiguolei <yiguolei@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-04-22 00:34:10 +08:00
lihangyu	af20b2c95e	[Bug](topn opt) Fix be crash when enable topn opt with larger thresho… (#18858 ) topn opt should be inited when update it	2023-04-21 17:45:00 +08:00
Jack Drogon	5706bef2b3	[feature](common) Add unexpected/result support (#18312 ) * Add unexpected/result support * Rename result.hpp -> result.h && Add NOLINT in expected.hpp * Add NOLINT in result.h to avoid clang-tidy checker * Rename result.h to expected.h * Add Apache License for be/src/util/expected.hpp * Disable clang-format in be util/expected.hpp	2023-04-21 17:07:20 +08:00
Liqf	ec1ab1a3d2	[Improve](GEO)wkb input and output are represented as hexadecimal strings And delete EWKB (#18721 )	2023-04-21 15:11:18 +08:00
lihangyu	8cc0af150a	[Fix](dynamic table) fix dynamic table with insert into and column al… (#18808 ) 1. The num_rows should be correctly set 2. insert into has no dynamic column	2023-04-21 11:19:00 +08:00
Pxl	c033c6239f	[Bug](table-function) fix wrong result when seprator of explode_split size more than one (#18824 ) fix wrong result when seprator of explode_split size more than one	2023-04-21 11:00:47 +08:00
yiguolei	63a76ed115	[refactor](exceptionsafe) disallow call new method explicitly (#18830 ) disallow call new method explicitly force to use create_shared or create_unique to use shared ptr placement new is allowed reference https://abseil.io/tips/42 to add factory method to all class. I think we should follow this guide because if throw exception in new method, the program will terminate. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-21 09:13:24 +08:00
HappenLee	eb93afc614	[MemLeak](pipeline) fix mem leak by exchange node in pipeline (#18864 )	2023-04-21 09:06:55 +08:00
yiguolei	b26e2d5d50	[bugfix](memoryleak) close expr after it is pushdown to storage layer (#18849 ) (#18852 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-21 05:21:16 +08:00
Pxl	9e64951721	[Chore](asan) set decrementOutputRecursionDepth to suppressions and remove some unu… (#18845 ) 18845	2023-04-20 23:33:25 +08:00
Jerry Hu	c4e469c82c	[feature](agg) Support spill to disk in aggregation (#18051 )	2023-04-20 18:59:08 +08:00
Tiewei Fang	8e2146f48c	[Enhencement](Export) support export with outfile syntax (#18325 ) The `Export` syntax provides an asynchronous export function, but `Export` is not vectorized. The `Outfile` syntax provides a synchronous export function, so we can reimplement the export syntax with the outfile syntax.	2023-04-20 17:27:04 +08:00
Qi Chen	3328a65b75	[Fix](mutli-catalog) Use decimal v3 type to fix decimal loss issue in multi-catalog module. (#18835 ) Fix decimal v3 precision loss issues in the multi-catalog module. Now it will use decimal v3 to represent decimal type in the multi-catalog module. Regression Test: `test_load_with_decimal.groovy`	2023-04-20 11:02:53 +08:00
ZhangYu0123	ab9500bfa6	[optimize](string) optimize instr and locate function for constant arguments (#18692 ) Optimize instr and locate function for constant arguments. instr and locate function constant arguments has 58%~200% performance improvement. refactor locate(substr, str, pos) as standardized arguments processing.	2023-04-20 10:40:19 +08:00
hulk	7c099c5747	[bugfix](be) Fix segment fault if the PID_DIR wasn't set (#18789 ) Doris BE would crash if the PID_DIR wasn't set	2023-04-20 10:39:54 +08:00
Gabriel	293e115536	[Improvement](bloom filter) initialize bloom filter with adaptive size (#18785 )	2023-04-20 10:06:40 +08:00
Pxl	908fbf92cf	[Chore](build) ignore compile warning on orc && fix invalid command curdate on conf (#18810 ) ignore compile warning on orc && fix invalid command curdate on conf	2023-04-20 10:03:40 +08:00
Adonis Ling	e412dd12e8	[chore](build) Use include-what-you-use to optimize includes (PART II) (#18761 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-19 23:11:48 +08:00
zclllyybb	fb377a9da9	[Improvement](functions)Optimized some datetime function's return value (#18369 )	2023-04-19 15:51:11 +08:00
zhangstar333	0b379de602	[refactor](scan) optimize the agg function of count(1) (#18739 )	2023-04-19 09:10:51 +08:00
Xinyi Zou	a68af93d30	[fix](compile) Fix block.cpp compilation failure (#18797 )	2023-04-19 08:49:23 +08:00
airborne12	0165ffbcae	[Fix](vertical compaction) Preserve _segment_num_rows during final segment flush (#18779 )	2023-04-18 20:58:23 +08:00
Xinyi Zou	79c446c89f	[enhancement](exception) Column filter/replicate supports exception safety (#18503 )	2023-04-18 19:23:09 +08:00
amory	564446e52f	[Refact](type system) refact serde for type system and pb serde impl (#18627 )	2023-04-18 14:13:56 +08:00
Mryange	18898db09d	[feature](function) Add new parameters to 'trim'. (#18580 )	2023-04-18 14:13:30 +08:00
TengJianPing	0b074ade02	[fix](const column) fix coredump caused by const column for some functions (#18737 )	2023-04-18 13:57:55 +08:00
AlexYue	7b0e5ad54d	[enhance](buffered reader)add bvar to detect download bytes and download speed (#18736 )	2023-04-18 10:14:07 +08:00
Jerry Hu	3de4d64657	[chore](hashtable) Use doris' Allocator to replace std::allocator in phmap (#18735 )	2023-04-18 09:58:28 +08:00
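
Note on 07ea350201 (#18914): the message describes three successive open calls where an early return could leak the inputs that were already opened, and suggests std::unique_ptr as the fix. The following is a minimal sketch of that pattern only, not the code from the commit. `FakeInput`, `open_input`, `open_three` and the counter `g_live_inputs` are hypothetical names made up for illustration; they do not correspond to the CLucene or Doris APIs.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical counter of live inputs, so a leak would be visible.
int g_live_inputs = 0;

// Hypothetical stand-in for an index input object.
struct FakeInput {
    std::string name;
    explicit FakeInput(std::string n) : name(std::move(n)) { ++g_live_inputs; }
    ~FakeInput() {
        assert(g_live_inputs > 0);
        --g_live_inputs;
    }
};

// Hypothetical opener: an empty name simulates a failed open.
std::unique_ptr<FakeInput> open_input(const std::string& name) {
    if (name.empty()) return nullptr;
    return std::make_unique<FakeInput>(name);
}

// Opens three inputs in sequence. Because each one is owned by a
// unique_ptr, any early return destroys the inputs opened so far.
bool open_three(const std::string& a, const std::string& b, const std::string& c) {
    auto in_a = open_input(a);
    if (!in_a) return false;
    auto in_b = open_input(b);
    if (!in_b) return false;  // in_a is released here
    auto in_c = open_input(c);
    if (!in_c) return false;  // in_a and in_b are released here
    return true;              // in this sketch all three are released at scope exit
}
```

With raw pointers, each failure branch would need its own cleanup; with owning smart pointers the cleanup is tied to scope, so `g_live_inputs` returns to zero on every path. A real reader would presumably move the three pointers into a longer-lived object on success instead of dropping them.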


4324 Commits
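
Note on 29f502380c (#18796): the message describes merging many small random access ranges that fall inside one window (8MB in the commit) into a single larger read. The sketch below illustrates only that grouping step under simplifying assumptions: ranges are sorted by offset and non-overlapping, and a merged read ends at the last covered range instead of spanning the full window. It does not model the boxes or reference counters, and `Range` and `plan_merged_reads` are hypothetical names, not Doris identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open byte range [start, end).
struct Range {
    std::uint64_t start;
    std::uint64_t end;
};

// Greedy grouping: starting from the first range not yet covered, one merged
// read covers every following range that ends inside [start, start + window).
// A single range larger than the window is read on its own.
std::vector<Range> plan_merged_reads(const std::vector<Range>& ranges,
                                     std::uint64_t window) {
    std::vector<Range> merged;
    std::size_t i = 0;
    while (i < ranges.size()) {
        assert(ranges[i].start < ranges[i].end);
        const std::uint64_t io_start = ranges[i].start;
        std::uint64_t io_end = ranges[i].end;
        std::size_t j = i + 1;
        while (j < ranges.size() && ranges[j].end <= io_start + window) {
            // Input must be ordered by offset and non-overlapping.
            assert(ranges[j].start >= ranges[j - 1].end);
            io_end = ranges[j].end;
            ++j;
        }
        merged.push_back({io_start, io_end});
        i = j;
    }
    return merged;
}
```

For example, with a window of 50 bytes, the ranges [0, 10), [20, 30), [100, 110), [115, 120) are planned as two reads, [0, 30) and [100, 120). The gaps between ranges are read and discarded, which is the trade the commit message describes: fewer IO calls at the cost of some extra bytes (MergedBytes exceeds RequestBytes in the quoted profile).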