doris

Author	SHA1	Message	Date
苏小刚	0f0c0a266b	[opt](parquet)Skip page with offset index (#33082 ) Make skip_page() in ColumnChunkReader more efficient. No more reading page headers if there are pagelocations in chunk.	2024-04-26 15:06:16 +08:00
Ashin Gau	c631f4f8a8	[fix](schema change) resolve the use count check of source logical column (#33932 ) Fix an error like: ``` 8# google::LogMessageFatal::~LogMessageFatal() in /mnt/hdd01/ci/master-deploy/be/lib/doris_be 9# doris::vectorized::Block::clear_column_data(int) in /mnt/hdd01/ci/master-deploy/be/lib/doris_be 10# doris::vectorized::ParquetReader::get_next_block(doris::vectorized::Block, unsigned long, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/format/parquet/vparquet_reader.cpp:514 11# doris::vectorized::VFileScanner::_get_block_impl(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/vfile_scanner.cpp:333 12# doris::vectorized::VScanner::get_block(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/vscanner.cpp:132 13# doris::vectorized::VScanner::get_block_after_projects(doris::RuntimeState, doris::vectorized::Block, bool) at /home/zcp/repo_center/doris_master/doris/be/src/vec/exec/scan/vscanner.cpp:99 ``` The cause: if the logical converter is consistent, the source logical column is the destination logical column. Previously, the column reference was reset only after the conversion completed, so when an EOF occurred the function returned early without resetting it, even though EOF is not a true error. ``` if (_logical_converter->is_consistent()) { // If logical converter is consistent, _src_logical_column is the final destination column, // other components will check the use count _src_logical_column.reset(); } ```	2024-04-22 12:31:46 +08:00
Sun Chenyang	7e91e69eb9	[fix](compaction) fix single compaction (#33907 ) * [fix](compaction) Fix single compaction to get all local versions #33849 add test and comment * remove single replica compaction prepare input rowsets revised	2024-04-19 23:30:25 +08:00
Kaijie Chen	ffd9da44a2	[fix](move-memtable) fix commit may fail due to duplicated reports (#32403 )	2024-04-19 15:02:49 +08:00
Ashin Gau	9b7af4c0cf	[feature](schema change) unified schema change for parquet and orc reader (#32873 ) Following #25138, this unifies the schema change interface for the parquet and orc readers; it can be applied to other format readers as well. Unified schema change interface for all format readers: - First, read the data into the source column according to the column type in the file; - Second, convert the source column to the destination column with the type planned by FE.	2024-04-12 15:09:25 +08:00
yiguolei	a4924dabb7	[enhancement](exception) enable exception logic in pipeline execute thread (#33437 ) * [enhancement](exception) enable exception logic in pipeline execute thread * f --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-04-12 15:09:25 +08:00
Pxl	5f30463bb3	[Chore](descriptors) remove unused codes for descriptors (#33408 ) remove unused codes for descriptors	2024-04-12 15:09:25 +08:00
Jensen	26d9082b9a	[Feature](function) Add function strcmp (#33272 )	2024-04-12 15:09:25 +08:00
Uniqueyou	31984bb4f0	[feature](function) support quote string function #33055	2024-04-12 15:09:25 +08:00
zclllyybb	3d66723214	[branch-2.1](auto-partition) pick auto partition and some more prs (#33523 )	2024-04-11 17:12:17 +08:00
Pxl	5688c28364	[Bug](runtime-filter) try to fix heap use after free on runtime filter send filter size (#33465 ) (#33522 )	2024-04-11 13:10:24 +08:00
yujun	bc929686e3	[feature](debug point) add macro DBUG_RUN_CALLBACK (#33407 )	2024-04-11 09:31:50 +08:00
zhangstar333	ef26479282	[improve](serde) support complex type in write/read pb serde (#33124 ) support complex type and ip/jsonb in DataTypeSerDe::write_column_to_pb/read_column_from_pb function	2024-04-11 09:31:50 +08:00
yujun	e2ad7149c3	[feature](debug point) Add handler to debug point (#33350 )	2024-04-10 16:24:13 +08:00
Pxl	8fd6d4c41b	[Chore](build) add -Wconversion and remove some unused code (#33127 ) add -Wconversion and remove some unused code	2024-04-10 15:26:08 +08:00
zclllyybb	c61d6ad1e2	[Feature] support function uuid_to_int and int_to_uuid #33005	2024-04-10 14:53:56 +08:00
zhiqiang	bf022f9d8d	[enhancement](function truncate) truncate can use column as scale argument (#32746 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2024-04-10 14:53:56 +08:00
Xinyi Zou	cf7595d423	[opt](memory) Optimize mem tracker accuracy (#32039 ) (#33140 )	2024-04-10 11:42:19 +08:00
DuRipeng	39fba884fb	[fix](typo) typo fix for 'delete bimap' changing to 'delete bitmap' (#32341 )	2024-04-10 11:34:30 +08:00
amory	28e2d89ce3	[Improve](inverted_index) update clucene and improve array inverted index writer (#32436 )	2024-04-10 11:34:29 +08:00
Xin Liao	950ca68fac	[fix](move-memtable) fix timeout to get tablet schema (#33256 ) (#33260 )	2024-04-04 21:45:55 +08:00
Xin Liao	df197c6a14	[fix](move-memtable) fix initial use count of streams for auto partition (#33165 ) (#33236 ) Co-authored-by: Kaijie Chen <ckj@apache.org>	2024-04-03 20:31:29 +08:00
TengJianPing	ff0da8108b	[fix](RF) fix 'Invalid value' error of RF of decimal type (#32749 )	2024-03-25 22:34:19 +08:00
Tiewei Fang	d7a3ff1ddf	[Fix](Outfile) Fix the column type mapping in the orc/parquet file format (#32281 ) Doris Date maps to Orc Long (logical: DATE) and Parquet int32 (Logical: Date); Doris DateTime maps to Orc TIMESTAMP (logical: TIMESTAMP) and Parquet int96.	2024-03-22 08:52:16 +08:00
zhangstar333	7486e96b12	[improve](function) add error msg if exceeded maximum default value in repeat function (#32219 ) Add an error message to the repeat function so that the user knows when the count is greater than the default maximum value.	2024-03-21 14:07:49 +08:00
huanghaibin	2196c534e8	[fix](group commit) Fix compatibility issues on serializing and deserializing wal file (#32299 )	2024-03-21 14:07:24 +08:00
Mryange	8bd101129a	[behavior change](output) change float output format (#32049 )	2024-03-21 14:07:22 +08:00
zhiqiang	0990014e94	[fix](datetime) fix datetime rounding on BE (#32075 )	2024-03-21 14:07:19 +08:00
Mingyu Chen	ef2151ae66	[Feature-WIP](multi-catalog) Add Hive sink on BE side. (#32306 ) (#32364 ) bp #32306 Co-authored-by: Qi Chen <kaka11.chen@gmail.com>	2024-03-18 11:23:01 +08:00
Xinyi Zou	7b74b199a5	[fix](memory) Fix LRU cache deleter and memory tracking (#32080 ) In order to add common code to the value deleter of the LRU cache, let all LRU cache values inherit from the LRUCacheValueBase class and track memory in the destructor.	2024-03-15 17:57:58 +08:00
zclllyybb	847ec368be	[Fix](smooth-upgrade) Fix incompatibility when upgrade from 2.0 to 2.1 (#32220 )	2024-03-14 11:23:05 +08:00
lihangyu	0da010603e	[Improve](TabletSchemaCache) reduce duplicated memory consumption for column name and column path (#31141 ) Both can be references to the related field in TabletColumn. Also use shared_ptr for TabletColumn in TabletSchema for later memory reuse.	2024-03-09 19:44:42 +08:00
Uniqueyou	779ca464a5	[Fix](Status) Handle returned overall Status correctly (#31692 ) Handle returned overall Status correctly	2024-03-09 19:44:39 +08:00
meiyi	eea9b56f69	[fix](group commit) handle group commit create plan error (#31757 )	2024-03-06 13:07:59 +08:00
yiguolei	7d1db6cd1f	[refactor](exception safe) Refactor delete handler and block column predicates to make sure exception safe (#31618 )	2024-03-01 14:21:17 +08:00
abmdocrt	747faeed17	[Enhancement](group commit) optimize some group commit code (#31392 ) This PR optimizes some of the logic related to group commit: 1. Improved the error handling when there is insufficient WAL space during import. 2. Accounted for cases where the content length is negative during import. 3. Added missing error log printing in `group_commit_mgr.cpp`.	2024-02-28 13:05:57 +08:00
abmdocrt	48804a978a	[Fix](group commit) Fix group commit flink error message (#31350 ) * When using stream processing frameworks like Flink with group commit mode enabled, the uncertain size of imported data makes such behavior prohibitive. Previously, to simplify the process, the error message for excessive data volume during streamload was combined with the one for group commit mode, leading to confusion for users when encountering errors indicating the data volume is too large during Flink imports. To address this issue, we are adjusting the logic: if a user employs stream processing imports like Flink with group commit mode enabled, we will automatically disable group commit mode, switching to the standard import mode instead. This is the essence of this PR.	2024-02-26 19:07:10 +08:00
yangshijie	8f77e6363a	[Feature](function) Support xxhash function like murmur hash function (#31193 )	2024-02-23 19:03:28 +08:00
zzzxl	90ab5ec2d9	[fix](invert index) fix the error issue in the unit test remove_element_only_in_table (#31238 )	2024-02-22 13:01:49 +08:00
amory	ad07dec0ed	[Improve](InPredict) enhance in predict with struct type (#30840 )	2024-02-22 13:01:49 +08:00
huanghaibin	b66583551c	[fix](group_commit)Fix bound checking problem when reading wal block (#31112 )	2024-02-22 13:01:48 +08:00
Uniqueyou	f2a38e6345	[chore](columns) remove update_hashes_with_value for SipHash (#31224 )	2024-02-22 13:01:48 +08:00
Xinyi Zou	1abe9d4384	[fix](memory) Fix LRU cache stale sweep (#31122 ) Remove LRUCacheValueBase, put last_visit_time into LRUHandle, and automatically update timestamp to last_visit_time during cache insert and lookup. Do not rely on external modification of last_visit_time, which is often forgotten.	2024-02-21 17:01:29 +08:00
Mingyu Chen	a8d8c6a271	[fix](file-writer) opt s3 file writer and fix empty file related issue #28983 #30703 #31169 (#31213 ) * (feature)(cloud) Use dynamic allocator instead of static buffer pool for better elasticity. (#28983) * [fix](outfile) Fix unable to export empty data (#30703) Issue Number: close #30600 Fix the inability to export empty data to HDFS/S3. This behavior is inconsistent with version 1.2.7, which can export empty data to HDFS/S3 and leaves exported files there. * [fix](file-writer) avoid empty file for segment writer (#31169) --------- Co-authored-by: AlexYue <yj976240184@gmail.com> Co-authored-by: zxealous <zhouchangyue@baidu.com>	2024-02-21 16:48:54 +08:00
huanghaibin	7a1bd6abb0	[improvement](group_commit) Refactor scan wal function (#30939 ) Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com>	2024-02-20 09:12:38 +08:00
Pxl	bb4575a392	[Improvement](join) optimization for build_side_output_column (#30826 ) optimization for build_side_output_column	2024-02-19 17:22:03 +08:00
koarz	6cf7468073	[enhancement](function) change some function nullable mode (#30991 ) change some function nullable mode	2024-02-18 14:45:25 +08:00
zclllyybb	68102fd531	[Fix](auto-partition) fix a concurrent bug of extremely long values (#31005 )	2024-02-18 14:45:25 +08:00
abmdocrt	b5012dc55a	[Enhancement](group commit) optimize pre allocated calculation (#30893 )	2024-02-18 11:50:17 +08:00
HappenLee	45b4189bb6	[Refactor](opt) Opt rf and remove useless code (#30900 ) Opt rf and remove useless code	2024-02-18 11:50:16 +08:00

1381 Commits