doris

Author	SHA1	Message	Date
Kaijie Chen	7058b31edd	[fix](move-memtable) clear load streams before shutdown SegmentFileWriterThreadPool (#35217 )	2024-05-28 13:12:03 +08:00
Pxl	b143f0dfe2	[Improvement](date) shortcut for str to date parse (#35288 )	2024-05-25 17:47:20 +08:00
TengJianPing	639c7ee7fb	[fix](decimalv2) fix scale of decimalv2 to string (#35222 ) (#35359 )	2024-05-24 17:20:43 +08:00
abmdocrt	309503855e	[Fix](bloom filter) Fix bloom filter memory leak (#34871 ) Issue: Doris occasionally shows exceptionally high memory usage that does not decrease. The leaked memory is held by Bloom filters stored in memory. Reason: The segment cache keeps segment objects read from files in memory. It is an LRU cache that evicts older segments when the number of segments exceeds the maximum, or when the total memory size of the cached segment objects exceeds the maximum usage. However, the code first reads the segment object into memory (size A) and places it into the cache, which records its size as A. It then reads the segment's Bloom filter (size B) from the file and assigns it to the segment's Bloom filter member variable, so the segment object now occupies A+B. The cache never updates the recorded size, so the actual size of each cached segment object (A+B) is larger than the size the cache accounts for (A). As the number of cached segment objects grows, real memory usage surges, but the cache does not see its size reaching the eviction limit and therefore evicts nothing, which results in the memory leak. Solution: Each segment object reads its Bloom filter only once, so the fix changes the order from read segment, insert into cache, read Bloom filter to read segment, read Bloom filter, insert into cache.	2024-05-24 16:23:58 +08:00
Kaijie Chen	a6f7747d29	[feature](datatype) add BE config to allow zero date (#34961 ) Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>	2024-05-23 19:12:39 +08:00
Gabriel	c23384ff07	[fix](decimal) Fix long string casting to decimalv2 (#35121 )	2024-05-22 14:32:29 +08:00
Ashin Gau	98f8eb5c43	[opt](split) get file splits in batch mode (#34032 ) (#35107 ) bp #34032	2024-05-21 22:27:07 +08:00
Yongqiang YANG	b4a798240a	[fix](inverted_index) do not use int32_t for index id to avoid overflow (#35062 )	2024-05-21 12:58:38 +08:00
lihangyu	e3e5f18f26	[Fix](Json type) correct cast result for json type (#34764 )	2024-05-18 18:40:17 +08:00
zhiqiang	eb7eaee386	[fix](function) money format (#34680 )	2024-05-18 18:35:29 +08:00
HHoflittlefish777	1a24895257	[opt](routine-load) optimize routine load task thread pool and related param(#32282 ) (#34896 )	2024-05-15 12:42:02 +08:00
Sun Chenyang	95b05928fd	[fix](compaction) fix time series compaction merge empty rowsets priority #34562 (#34765 )	2024-05-14 09:10:09 +08:00
zhiqiang	0ae1b9c70a	[chore](remove code) Remove dragonbox related (#34528 ) * Revert "[refactor](mysql result format) use new serde framework to tuple convert (#25006)" This reverts commit e5ef0aa6d439c3f9b1f1fe5bc89c9ea6a71d4019. * run buildall * MORE * FIX	2024-05-13 22:16:57 +08:00
yiguolei	32cbd4a583	[chore](status) unify error code between thrift,pb, status.h (#34397 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-05-10 14:41:01 +08:00
yangshijie	9b712b03b4	[FIX]fix is_ip_address_in_range func with const param (#34266 )	2024-05-10 14:37:20 +08:00
yiguolei	8fdfbcb3c4	Revert "[Opt](func) opt the percentile func performance (#34373 ) (#34416 )" This reverts commit 509ae425e416b4779ae94eab9c2b21f9850e03c3.	2024-05-07 07:23:48 +08:00
Chester	f7900b53ce	[enhancement](function) floor/ceil/round/round_bankers can use column as scale argument (#34391 )	2024-05-06 22:18:36 +08:00
HappenLee	509ae425e4	[Opt](func) opt the percentile func performance (#34373 ) (#34416 )	2024-05-06 20:10:35 +08:00
苏小刚	0f0c0a266b	[opt](parquet)Skip page with offset index (#33082 ) Make skip_page() in ColumnChunkReader more efficient. No more reading page headers if there are pagelocations in chunk.	2024-04-26 15:06:16 +08:00
Ashin Gau	c631f4f8a8	[fix](schema change) resolve the use count check of source logical column (#33932 ) Fix an error like: ``` 8# google::LogMessageFatal::~LogMessageFatal() in /mnt/hdd01/ci/master-deploy/be/lib/doris_be 9# doris::vectorized::Block::clear_column_data(int) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be 10# doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block, unsigned long, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:514 11# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/vfile_scanner.cpp:333 12# doris::vectorized::VScanner::get_block(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/vscanner.cpp:132 13# doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/vscanner.cpp:99 ``` The source logical column is the destination logical column when the logical converter is consistent. Previously, the column reference was reset only after the conversion completed, so an early return on EOF skipped the reset, even though EOF is not a true error. ``` if (_logical_converter->is_consistent()) { // If logical converter is consistent, _src_logical_column is the final destination column, // other components will check the use count _src_logical_column.reset(); } ```	2024-04-22 12:31:46 +08:00
Sun Chenyang	7e91e69eb9	[fix](compaction) fix single compaction (#33907 ) * [fix](compaction)Fix single compaction to get all local versions #33849 add test and comment * remove single replica compaction prepare input rowsets revised	2024-04-19 23:30:25 +08:00
Kaijie Chen	ffd9da44a2	[fix](move-memtable) fix commit may fail due to duplicated reports (#32403 )	2024-04-19 15:02:49 +08:00
Ashin Gau	9b7af4c0cf	[feature](schema change) unified schema change for parquet and orc reader (#32873 ) Following #25138, unified schema change interface for parquet and orc reader, and can be applied to other format readers as well. Unified schema change interface for all format readers: - First, read the data according to the column type of the file into source column; - Second, convert source column to the destination column with type planned by FE.	2024-04-12 15:09:25 +08:00
yiguolei	a4924dabb7	[enhancement](exception) enable exception logic in pipeline execute thread (#33437 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-04-12 15:09:25 +08:00
Pxl	5f30463bb3	[Chore](descriptors) remove unused codes for descriptors (#33408 )	2024-04-12 15:09:25 +08:00
Jensen	26d9082b9a	[Feature](function) Add function strcmp (#33272 )	2024-04-12 15:09:25 +08:00
Uniqueyou	31984bb4f0	[feature](function) support quote string function #33055	2024-04-12 15:09:25 +08:00
zclllyybb	3d66723214	[branch-2.1](auto-partition) pick auto partition and some more prs (#33523 )	2024-04-11 17:12:17 +08:00
Pxl	5688c28364	[Bug](runtime-filter) try to fix heap use after free on runtime filter send filter size (#33465 ) (#33522 )	2024-04-11 13:10:24 +08:00
yujun	bc929686e3	[feature](debug point) add macro DBUG_RUN_CALLBACK (#33407 )	2024-04-11 09:31:50 +08:00
zhangstar333	ef26479282	[improve](serde) support complex type in write/read pb serde (#33124 ) support complex type and ip/jsonb in DataTypeSerDe::write_column_to_pb/read_column_from_pb function	2024-04-11 09:31:50 +08:00
yujun	e2ad7149c3	[feature](debug point) Add handler to debug point (#33350 )	2024-04-10 16:24:13 +08:00
Pxl	8fd6d4c41b	[Chore](build) add -Wconversion and remove some unused code (#33127 )	2024-04-10 15:26:08 +08:00
zclllyybb	c61d6ad1e2	[Feature] support function uuid_to_int and int_to_uuid #33005	2024-04-10 14:53:56 +08:00
zhiqiang	bf022f9d8d	[enhancement](function truncate) truncate can use column as scale argument (#32746 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-04-10 14:53:56 +08:00
Xinyi Zou	cf7595d423	[opt](memory) Optimize mem tracker accuracy (#32039 ) (#33140 )	2024-04-10 11:42:19 +08:00
DuRipeng	39fba884fb	[fix](typo) typo fix for 'delete bimap' changing to 'delete bitmap' (#32341 )	2024-04-10 11:34:30 +08:00
amory	28e2d89ce3	[Improve](inverted_index) update clucene and improve array inverted index writer (#32436 )	2024-04-10 11:34:29 +08:00
Xin Liao	950ca68fac	[fix](move-memtable) fix timeout to get tablet schema (#33256 ) (#33260 )	2024-04-04 21:45:55 +08:00
Xin Liao	df197c6a14	[fix](move-memtable) fix initial use count of streams for auto partition (#33165 ) (#33236 ) Co-authored-by: Kaijie Chen <ckj@apache.org>	2024-04-03 20:31:29 +08:00
TengJianPing	ff0da8108b	[fix](RF) fix 'Invalid value' error of RF of decimal type (#32749 )	2024-03-25 22:34:19 +08:00
Tiewei Fang	d7a3ff1ddf	[Fix](Outfile) Fix the column type mapping in the orc/parquet file format (#32281 ) \| Doris Type \| Orc Type \| Parquet Type \| \|---------------------\|--------------------\|------------------------\| \| Date \| Long (logical: DATE) \| int32 (Logical: Date) \| \| DateTime \| TIMESTAMP (logical: TIMESTAMP) \| int96 \|	2024-03-22 08:52:16 +08:00
zhangstar333	7486e96b12	[improve](function) add error msg if exceeded maximum default value in repeat function (#32219 ) add some error msg from repeat function, so the user could know the count is greater than default value.	2024-03-21 14:07:49 +08:00
huanghaibin	2196c534e8	[fix](group commit) Fix compatibility issues on serializing and deserializing wal file (#32299 )	2024-03-21 14:07:24 +08:00
Mryange	8bd101129a	[behavior change](output) change float output format (#32049 )	2024-03-21 14:07:22 +08:00
zhiqiang	0990014e94	[fix](datetime) fix datetime rounding on BE (#32075 )	2024-03-21 14:07:19 +08:00
Mingyu Chen	ef2151ae66	[Feature-WIP](multi-catalog) Add Hive sink on BE side. (#32306 ) (#32364 ) bp #32306 Co-authored-by: Qi Chen <kaka11.chen@gmail.com>	2024-03-18 11:23:01 +08:00
Xinyi Zou	7b74b199a5	[fix](memory) Fix LRU cache deleter and memory tracking (#32080 ) To add common code to the value deleter of the LRU cache, let all LRU cache values inherit from the LRUCacheValueBase class and track memory in the destructor.	2024-03-15 17:57:58 +08:00
zclllyybb	847ec368be	[Fix](smooth-upgrade) Fix incompatibility when upgrade from 2.0 to 2.1 (#32220 )	2024-03-14 11:23:05 +08:00
lihangyu	0da010603e	[Improve](TabletSchemaCache) reduce duplicated memory consumption for column name and column path (#31141 ) Both can be references to the related field in TabletColumn. Also use shared_ptr for TabletColumn in TabletSchema for later memory reuse.	2024-03-09 19:44:42 +08:00

1399 Commits