doris

Author	SHA1	Message	Date
Gabriel	4483e3a6e1	[Improvement](scan) add a config for scan queue memory limit (#19439 )	2023-05-10 13:14:23 +08:00
DeadlineFen	a05dbd3f81	[chore](compile) Improves PCH cache hit ratio (#19469 ) Supplement the documentation of be-clion-dev, avoid the problem of undefined DORIS_JAVA_HOME and inability to find jni.h when using clion development without directly compiling through build.sh Complete the classification of header files in pch.h and introduce some header files that are not frequently modified in doris. Separate the declaration and definition in common/config.h. If you need to modify the default configuration now, please modify it in common/config.cpp. gen_cpp/version.h is regenerated every time it is recompiled, which may cause PCH to fail, so now you need to get the version information indirectly rather than directly.	2023-05-10 12:49:01 +08:00
Xinyi Zou	cf8ceb8586	[fix](scan) fix scanner mem tracker (#19354 )	2023-05-10 09:56:41 +08:00
luozenglin	03538381a3	[enhancement](memory) MemCounter supports lock-free thread safety (#19256 ) make try_add() and update_peak() thread-safe.	2023-05-10 02:24:07 +08:00
Pxl	dfad7b6b38	[Feature](generic-aggregation) some prowork of generic aggregation (#19343 ) some prowork of generic aggregation	2023-05-09 21:42:21 +08:00
DeadlineFen	e08de52ee7	[chore](compile) using PCH for compilation acceleration under clang (#19303 )	2023-05-08 19:51:06 +08:00
Adonis Ling	673cbe3317	[chore](build) Porting to GCC-13 (#19293 ) Support using GCC-13 to build the codebase.	2023-05-08 10:42:06 +08:00
Yusheng Xu	9edbfa37cd	[Enhancement](Broker Load) New progress manager for showing loading progress status (#19170 ) This work is at an early stage. The current progress is not accurate because the scan range is too large for gathering information; moreover, only the file scan node and import jobs support the new progress manager. ## How it works For example, when we use the following load query: ``` LOAD LABEL test_broker_load ( DATA INFILE("XXX") INTO TABLE `XXX` ...... ) ``` Initial Progress: the query calls `BrokerLoadJob` to create the job, then `coordinator` is called to calculate the scan ranges and their locations. Update Progress: BE reports runtime_state to FE, and FE updates the progress status according to jobID and fragmentID. We can use `show load` to see the progress. PENDING: ``` State: PENDING Progress: 0.00% ``` LOADING: ``` State: LOADING Progress: 14.29% (1/7) ``` FINISH: ``` State: FINISHED Progress: 100.00% (7/7) ``` At the current time, the full output of `show load\G` looks like: ``` ************************* 1. row ************************* JobId: 25052 Label: test_broker State: LOADING Progress: 0.00% (0/7) Type: BROKER EtlInfo: NULL TaskInfo: cluster:N/A; timeout(s):250000; max_filter_ratio:0.0 ErrorMsg: NULL CreateTime: 2023-05-03 20:53:13 EtlStartTime: 2023-05-03 20:53:15 EtlFinishTime: 2023-05-03 20:53:15 LoadStartTime: 2023-05-03 20:53:15 LoadFinishTime: NULL URL: NULL JobDetails: {"Unfinished backends":{"5a9a3ecd203049bc-85e39a765c043228":[10080]},"ScannedRows":39611808,"TaskNumber":1,"LoadBytes":7398908902,"All backends":{"5a9a3ecd203049bc-85e39a765c043228":[10080]},"FileNumber":1,"FileSize":7895697364} TransactionId: 14015 ErrorTablets: {} User: root Comment: ``` ## TODO: 1. The current partition granularity of the scan range is too large, resulting in uneven progress during loading. 2. Only broker load supports the new Progress Manager; support progress for other queries.	2023-05-06 22:44:40 +08:00
Pxl	dff669899a	[Feature](generic-aggregation) add some type define for generic aggregate functions support (#19252 ) add some type define for generic aggregate functions support	2023-05-06 11:30:13 +08:00
Xinyi Zou	58cb404661	[fix](memory) Allocator throws Exception instead of std::bad_alloc (#19285 ) W0505 01:31:25.840227 1727715 scanner_scheduler.cpp:340] Scan thread read VScanner failed: [MEM_LIMIT_EXCEEDED]PreCatch error code:11, [E11] Allocator sys memory check failed: Cannot alloc:16384, consuming tracker:<Orphan>, exec node:<>, process memory used 5.87 GB exceed limit 5.64 GB or sys mem available 252.17 GB less than low water mark 1.60 GB, failed alloc size 16.00 KB. @ 0x555c19e0cca8 doris::Exception::Exception() @ 0x555c1c3e0c3f Allocator<>::sys_memory_check() @ 0x555c1c3e1052 Allocator<>::memory_check() @ 0x555c19e0a645 Allocator<>::alloc() @ 0x555c1c34508b COWHelper<>::create<>() @ 0x555c1e23f574 doris::vectorized::ConvertThroughParsing<>::execute<>() @ 0x555c1e23f209 doris::vectorized::FunctionConvertFromString<>::execute_impl() @ 0x555c1e23f4aa doris::vectorized::FunctionConvertFromString<>::execute_impl() @ 0x555c1e15ac29 doris::vectorized::PreparedFunctionImpl::execute_without_low_cardinality_columns() @ 0x555c1e15ac56 doris::vectorized::PreparedFunctionImpl::execute() @ 0x555c1e245276 _ZNSt17_Function_handlerIFN5doris6StatusEPNS0_15FunctionContextERNS0_10vectorized5BlockERKSt6vectorImSaImEEmmEZNKS4_12FunctionCast14create_wrapperINS4_14DataTypeNumberIiEEEESt8functionISC_ERKSt10shared_ptrIKNS4_9IDataTypeEEPKT_bEUlS3_S6_SB_mmE_E9_M_invokeERKSt9_Any_dataOS3_S6_SB_OmSY_ @ 0x555c1e2a9341 _ZZNK5doris10vectorized12FunctionCast23prepare_remove_nullableEPNS_15FunctionContextERKSt10shared_ptrIKNS0_9IDataTypeEES9_bENKUlS3_RNS0_5BlockERKSt6vectorImSaImEEmmE_clES3_SB_SG_mm @ 0x555c1e2a8d42 _ZNSt17_Function_handlerIFN5doris6StatusEPNS0_15FunctionContextERNS0_10vectorized5BlockERKSt6vectorImSaImEEmmEZNKS4_12FunctionCast23prepare_remove_nullableES3_RKSt10shared_ptrIKNS4_9IDataTypeEESJ_bEUlS3_S6_SB_mmE_E9_M_invokeERKSt9_Any_dataOS3_S6_SB_OmSQ_ @ 0x555c1e20e42b doris::vectorized::PreparedFunctionCast::execute_impl() @ 0x555c1e15ac29 doris::vectorized::PreparedFunctionImpl::execute_without_low_cardinality_columns() @ 0x555c1e15ac56 doris::vectorized::PreparedFunctionImpl::execute() @ 0x555c1d63e960 doris::vectorized::IFunctionBase::execute() @ 0x555c1d628700 doris::vectorized::VCastExpr::execute() @ 0x555c1d6163e5 doris::vectorized::VExprContext::execute() @ 0x555c20a83fe1 doris::vectorized::VFileScanner::_convert_to_output_block() @ 0x555c20a809af doris::vectorized::VFileScanner::_get_block_impl() @ 0x555c209b9bc4 doris::vectorized::VScanner::get_block() @ 0x555c209b1a50 doris::vectorized::ScannerScheduler::_scanner_scan() @ 0x555c209b2ac1 _ZNSt17_Function_handlerIFvvEZZN5doris10vectorized16ScannerScheduler18_schedule_scannersEPNS2_14ScannerContextEENK3$_0clEvEUlvE1_E9_M_invokeERKSt9_Any_data @ 0x555c1a8378cf doris::ThreadPool::dispatch_thread() @ 0x555c1a830fac doris::Thread::supervise_thread() @ 0x7f461faa117a start_thread @ 0x7f462033bdf3 __GI___clone @ (nil) (unknown)	2023-05-05 18:01:48 +08:00
Yongqiang YANG	c98829c94b	[improvement](load) log time consumed by waiting flush (#19226 )	2023-05-03 17:48:13 +08:00
HappenLee	4a10d146bf	[pipeline](exec) fix regression prepare failed cause query core dump (#19208 ) fix regression prepare failed cause query core dump	2023-04-28 20:46:39 +08:00
Gabriel	28016c53f0	[profile](rf) refactor profile of runtime filters (#19134 ) * [profile](rf) refactor profile of runtime filters --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2023-04-28 08:46:42 +08:00
Kang	68d3111629	[bugfix](topn) fix memory leak in topn AcceptNullPredicate (#19060 ) fix the memory leak reported by ASAN as follows.	2023-04-27 14:07:57 +08:00
Gabriel	aabcab9dbe	[Improvement](runtime filter) Improve merge phase (#18828 )	2023-04-26 21:01:20 +08:00
Pxl	60cda12e57	[Bug](pipeline-engine) fix hang on insert into select when enable pipeline engine (#19075 )	2023-04-26 16:50:19 +08:00
Kang	1dfc5ea34c	[bugfix](jsonb) fix jsonb parser crash on noavx2 host (#18977 ) support avx2 and noavx2 for jsonb parser using __AVX2__ macro.	2023-04-26 15:10:12 +08:00
WenYao	339d804ec4	[Refactor](exceptionsafe) add factory creator to some class (#19000 )	2023-04-25 14:33:47 +08:00
yiguolei	3899c08036	[optimize](compile) remove unused template param from load channel (#18980 ) * [optimize](compile) remove unused template param from load channel --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-24 23:36:47 +08:00
yiguolei	8d7a9fd21b	[refactor](exceptionsafe) add factory creator to some class (#18978 ) make vexprecontext,vexpr,function,query context,runtimestate thread safe. --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-24 10:32:11 +08:00
Xinyi Zou	8e4710079d	[improvement](profile) Insert into add LoadChannel runtime profile (#18908 ) TabletSink and LoadChannel in BE have an M:N relationship. Every once in a while, a LoadChannel randomly returns its own runtime profile to a TabletSink, so usually all LoadChannel runtime profiles are saved on each TabletSink, and the freshness of the same LoadChannel profile saved on different TabletSinks differs. Each TabletSink periodically reports to FE all the LoadChannel profiles it has saved, and the latest LoadChannel profile is kept according to the timestamp.	2023-04-24 09:41:57 +08:00
yiguolei	3736530585	[refactor](query context) rename query fragments context to query context and make query context safe (#18950 ) * [refactor](query context) rename query fragments context to query context and make query context safe --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-23 22:53:56 +08:00
ZenoYang	0da2cf270a	[improvement](fetch data) Merge result into batch to reduce rpc times (#17828 )	2023-04-23 15:07:28 +08:00
yiguolei	c80dc91a78	[bugfix](memleak) UserFunctionCache may have memory leak during close (#18913 ) * [bugfix](memleak) UserFunctionCache may have memory leak during close * [bugfix](memleak) UserFunctionCache may have memory leak during close --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-22 10:15:51 +08:00
lihangyu	af20b2c95e	[Bug](topn opt) Fix be crash when enable topn opt with larger thresho… (#18858 ) topn opt should be inited when update it	2023-04-21 17:45:00 +08:00
Jerry Hu	c4e469c82c	[feature](agg) Support spill to disk in aggregation (#18051 )	2023-04-20 18:59:08 +08:00
Qi Chen	3328a65b75	[Fix](mutli-catalog) Use decimal v3 type to fix decimal loss issue in multi-catalog module. (#18835 ) Fix decimal v3 precision loss issues in the multi-catalog module. Now it will use decimal v3 to represent decimal type in the multi-catalog module. Regression Test: `test_load_with_decimal.groovy`	2023-04-20 11:02:53 +08:00
Gabriel	293e115536	[Improvement](bloom filter) initialize bloom filter with adaptive size (#18785 )	2023-04-20 10:06:40 +08:00
Adonis Ling	e412dd12e8	[chore](build) Use include-what-you-use to optimize includes (PART II) (#18761 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-19 23:11:48 +08:00
zclllyybb	fb377a9da9	[Improvement](functions)Optimized some datetime function's return value (#18369 )	2023-04-19 15:51:11 +08:00
Xinyi Zou	79c446c89f	[enhancement](exception) Column filter/replicate supports exception safety (#18503 )	2023-04-18 19:23:09 +08:00
Gabriel	5300b21db7	[Bug](DECIMALV3) report failure if a decimal value is overflow (#18336 )	2023-04-17 13:18:14 +08:00
HappenLee	eb128753ac	[Opt](pipeline) opt pipeline shared scan (#18715 )	2023-04-17 13:06:39 +08:00
wangbo	ac0b382fed	[improvement](executor) Priority Queue support vruntime (#18635 ) * 1 rename some class 2 mfqs support vruntime * fix const * as sugguestion * fix const	2023-04-17 10:17:28 +08:00
Yongqiang YANG	bcff3710ca	[fix] set execution timeout for brokerload and use query timeout when… (#18694 ) We should use query timeout if execution timeout is not set to upgrade.	2023-04-15 20:41:04 +08:00
Pxl	975b373896	[Chore](thrift) add some check on client cache && remove some unused code && catch st… #18683	2023-04-15 17:47:51 +08:00
luozenglin	81799d614e	[feature-wip](resource-group) support resource group interface in be. (#18588 )	2023-04-14 14:00:49 +08:00
Xinyi Zou	c704351273	[enhancement](memory) Refactor memory limit exceeded behavior (#18590 ) No check mem tracker limit and no cancel task in mem hook, only in Allocator. This helps in clearer analysis of memory issues and reduces performance loss. PODArray/hash table/arena memory allocation will use Allocator. Optimize mem limit exceeded log printing Optimize compilation time	2023-04-14 10:42:35 +08:00
Gabriel	2294fb46a5	[refactor](minor) update scan concurrency for pipeline (#18650 )	2023-04-14 09:45:12 +08:00
Zhengguo Yang	4335c9998f	[chore](ARM) Add some vectorization compatibility code on aarch64 (#18553 ) update sse2noen to support more sse code on arm cpus	2023-04-13 10:15:33 +08:00
Tiewei Fang	49a9956986	[Enhencement](Profile) add profile info for jdbc scanner #18569	2023-04-12 10:47:21 +08:00
ZhangYu0123	5efafefeda	[refactor](string) remove volnitsky search algorithm (#18474 )	2023-04-10 10:56:07 +08:00
Xinyi Zou	308ff9a16f	[enchancement](memory) tracking lru cache memory and page memory not in cache (#18361 ) Statistics lru cache memory in metrics Statistics page memory not in cache in mem tracker	2023-04-07 14:22:44 +08:00
HappenLee	c32adba1cf	[Refactor](Pipeline) Refactor pipeline code to improve coverage (#18376 ) Refactor pipeline code to improve coverage	2023-04-07 13:09:44 +08:00
gitccl	7f8d92656e	[fix](streamload) fix stream load failed when enable profile (#18364 ) #18015 enables stream load profile log, however be will encounter rpc fail when loading tpch data(see #18291). This is because when `is_report_success` is true, be will reportExecStatus to fe, but fe cannot find QueryInfo in `coordinatorMap`, thus it will return error to be.	2023-04-05 01:01:46 +08:00
Ashin Gau	66bfd18601	[opt](file_reader) add prefetch buffer to read csv&json file (#18301 ) Co-authored-by: ByteYue <yj976240184@gmail.com> This PR is an optimization for https://github.com/apache/doris/pull/17478: 1. Change the buffer size of `LineReader` to 4MB to align with the size of the prefetch buffer. 2. Lazily prefetch data in the first read to prevent wasted reading. 3. S3 block size is only 32MB, which is too small for a file split. Set 128MB as the default file split size. 4. Add `_end_offset` for the prefetch buffer to prevent wasted reading. The query performance of reading data on object storage is improved by more than 3x.	2023-04-04 19:05:22 +08:00
Xinyi Zou	5e7ea5e305	[fix](memory) Fix `bthread_setspecific` log fatal on UBSAN build (#18274 )	2023-03-31 19:46:53 +08:00
yiguolei	1027abe0d3	[enhancement](query exec) should print error status when query meet error (#18247 ) If BE is under heavy load, the query may fail, but BE will try to connect to FE using thrift; if FE is also under heavy load, the thrift connection will fail. The status is then rewritten at line 342, and the actual failure reason for the query is lost. The error status should be printed every time during update. Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-03-31 14:08:24 +08:00
zhangstar333	1b2aaab2f2	[vectorized](bug) fix some case in enable fold constant (#17997 ) fix some case in enable fold constant	2023-03-31 11:41:31 +08:00
Kang	4e1e0ce06d	[bugfix](topn) fix topn optimzation wrong result for NULL values (#18121 ) 1. add PassNullPredicate to fix topn wrong result for NULL values 2. refactor RuntimePredicate to avoid using TCondition 3. refactor using ordering_exprs in fe and vsort_node	2023-03-31 10:01:34 +08:00
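
Note on commit 03538381a3 (MemCounter lock-free thread safety): the log only states that `try_add()` and `update_peak()` were made thread-safe without locks. The following is a minimal sketch of how such a counter could be written with compare-and-swap loops. The class name `MemCounterSketch`, its members, and the `limit` parameter are hypothetical and are not taken from the Doris source.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a memory counter; NOT the actual Doris MemCounter.
class MemCounterSketch {
public:
    // Add `delta` bytes only if the result stays within `limit`.
    // The CAS loop retries when another thread changed the value in between.
    bool try_add(int64_t delta, int64_t limit) {
        int64_t old_value = _current.load(std::memory_order_relaxed);
        int64_t new_value = 0;
        do {
            new_value = old_value + delta;
            if (new_value > limit) {
                return false;  // would exceed the limit; counter is left unchanged
            }
            // On failure, compare_exchange_weak reloads old_value for the next try.
        } while (!_current.compare_exchange_weak(old_value, new_value,
                                                 std::memory_order_relaxed));
        update_peak(new_value);
        return true;
    }

    // Release `delta` bytes.
    void sub(int64_t delta) { _current.fetch_sub(delta, std::memory_order_relaxed); }

    // Raise the recorded peak to `value` if it is larger; never lowers it.
    void update_peak(int64_t value) {
        int64_t old_peak = _peak.load(std::memory_order_relaxed);
        while (value > old_peak &&
               !_peak.compare_exchange_weak(old_peak, value,
                                            std::memory_order_relaxed)) {
            // old_peak was refreshed by the failed CAS; loop re-checks the condition.
        }
    }

    int64_t current() const { return _current.load(std::memory_order_relaxed); }
    int64_t peak() const { return _peak.load(std::memory_order_relaxed); }

private:
    std::atomic<int64_t> _current{0};
    std::atomic<int64_t> _peak{0};
};
```

Relaxed memory ordering is used here on the assumption that the counter is purely statistical and does not guard other data; a real implementation may need stronger ordering.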

997 Commits