9b7af4c0cf
[feature](schema change) unified schema change for parquet and orc reader ( #32873 )
...
Following #25138, this PR unifies the schema change interface for the parquet and orc readers; the same interface can be applied to other format readers as well.
Unified schema change interface for all format readers:
- First, read the data into a source column using the column type recorded in the file;
- Second, convert the source column to the destination column, whose type is planned by FE.
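The two steps can be illustrated with a minimal Python sketch. It is not the reader code from this PR: the type names, the `CONVERTERS` registry, and both function names are hypothetical and exist only to show the read-then-convert flow.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical registry of per-value converters keyed by
# (type recorded in the file, type planned by FE).
CONVERTERS: Dict[Tuple[str, str], Callable] = {
    ("int32", "int64"): int,
    ("int32", "string"): str,
    ("float", "double"): float,
    ("string", "int64"): int,
}


def read_source_column(file_values, file_type: str) -> List[Optional[object]]:
    """Step 1: materialize the column exactly as the file stores it."""
    # file_type is unused here; a real reader would pick a decoder from it.
    return list(file_values)


def convert_column(source, file_type: str, dest_type: str) -> List[Optional[object]]:
    """Step 2: convert the source column to the destination type."""
    if file_type == dest_type:
        return list(source)  # no schema change, nothing to convert
    try:
        convert = CONVERTERS[(file_type, dest_type)]
    except KeyError:
        raise TypeError(f"unsupported conversion: {file_type} -> {dest_type}")
    # Nulls pass through unchanged; every other value is converted.
    return [None if value is None else convert(value) for value in source]


source = read_source_column([1, None, 3], "int32")
destination = convert_column(source, "int32", "string")
```

Keeping the file-typed read separate from the conversion means a new format reader only has to supply step 1 and can reuse step 2 as is.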
2024-04-12 15:09:25 +08:00
a4924dabb7
[enhancement](exception) enable exception logic in pipeline execute thread ( #33437 )
...
* [enhancement](exception) enable exception logic in pipeline execute thread
* f
---------
Co-authored-by: yiguolei <yiguolei@gmail.com >
2024-04-12 15:09:25 +08:00
5f30463bb3
[Chore](descriptors) remove unused codes for descriptors ( #33408 )
...
remove unused codes for descriptors
2024-04-12 15:09:25 +08:00
26d9082b9a
[Feature](function) Add function strcmp ( #33272 )
2024-04-12 15:09:25 +08:00
31984bb4f0
[feature](function) support quote string function #33055
2024-04-12 15:09:25 +08:00
3d66723214
[branch-2.1](auto-partition) pick auto partition and some more prs ( #33523 )
2024-04-11 17:12:17 +08:00
5688c28364
[Bug](runtime-filter) try to fix heap use after free on runtime filter send filter size ( #33465 ) ( #33522 )
2024-04-11 13:10:24 +08:00
bc929686e3
[feature](debug point) add macro DBUG_RUN_CALLBACK ( #33407 )
2024-04-11 09:31:50 +08:00
ef26479282
[improve](serde) support complex type in write/read pb serde ( #33124 )
...
support complex type and ip/jsonb in DataTypeSerDe::write_column_to_pb/read_column_from_pb function
2024-04-11 09:31:50 +08:00
e2ad7149c3
[feature](debug point) Add handler to debug point ( #33350 )
2024-04-10 16:24:13 +08:00
8fd6d4c41b
[Chore](build) add -Wconversion and remove some unused code ( #33127 )
...
add -Wconversion and remove some unused code
2024-04-10 15:26:08 +08:00
c61d6ad1e2
[Feature] support function uuid_to_int and int_to_uuid #33005
2024-04-10 14:53:56 +08:00
bf022f9d8d
[enhancement](function truncate) truncate can use column as scale argument ( #32746 )
...
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-04-10 14:53:56 +08:00
cf7595d423
[opt](memory) Optimize mem tracker accuracy ( #32039 ) ( #33140 )
2024-04-10 11:42:19 +08:00
39fba884fb
[fix](typo) typo fix for 'delete bimap' changing to 'delete bitmap' ( #32341 )
2024-04-10 11:34:30 +08:00
28e2d89ce3
[Improve](inverted_index) update clucene and improve array inverted index writer ( #32436 )
2024-04-10 11:34:29 +08:00
950ca68fac
[fix](move-memtable) fix timeout to get tablet schema ( #33256 ) ( #33260 )
2024-04-04 21:45:55 +08:00
df197c6a14
[fix](move-memtable) fix initial use count of streams for auto partition ( #33165 ) ( #33236 )
...
Co-authored-by: Kaijie Chen <ckj@apache.org >
2024-04-03 20:31:29 +08:00
ff0da8108b
[fix](RF) fix 'Invalid value' error of RF of decimal type ( #32749 )
2024-03-25 22:34:19 +08:00
d7a3ff1ddf
[Fix](Outfile) Fix the column type mapping in the orc/parquet file format ( #32281 )
...
| Doris Type | Orc Type                       | Parquet Type          |
|------------|--------------------------------|-----------------------|
| Date       | Long (logical: DATE)           | int32 (logical: Date) |
| DateTime   | TIMESTAMP (logical: TIMESTAMP) | int96                 |
2024-03-22 08:52:16 +08:00
7486e96b12
[improve](function) add error msg if exceeded maximum default value in repeat function ( #32219 )
...
Add an error message to the repeat function so the user knows when the count exceeds the default maximum value.
2024-03-21 14:07:49 +08:00
2196c534e8
[fix](group commit) Fix compatibility issues on serializing and deserializing wal file ( #32299 )
2024-03-21 14:07:24 +08:00
8bd101129a
[behavior change](output) change float output format ( #32049 )
2024-03-21 14:07:22 +08:00
0990014e94
[fix](datetime) fix datetime rounding on BE ( #32075 )
2024-03-21 14:07:19 +08:00
ef2151ae66
[Feature-WIP](multi-catalog) Add Hive sink on BE side. ( #32306 ) ( #32364 )
...
bp #32306
Co-authored-by: Qi Chen <kaka11.chen@gmail.com >
2024-03-18 11:23:01 +08:00
7b74b199a5
[fix](memory) Fix LRU cache deleter and memory tracking ( #32080 )
...
To share common code in the value deleter of the LRU cache, all LRU cache values now inherit from the LRUCacheValueBase class, and memory is tracked in the destructor.
2024-03-15 17:57:58 +08:00
847ec368be
[Fix](smooth-upgrade) Fix incompatibility when upgrade from 2.0 to 2.1 ( #32220 )
2024-03-14 11:23:05 +08:00
0da010603e
[Improve](TabletSchemaCache) reduce duplicated memory consumption for column name and column path ( #31141 )
...
Both can be references to the related field in TabletColumn. Also use shared_ptr for TabletColumn in TabletSchema for later memory reuse.
2024-03-09 19:44:42 +08:00
779ca464a5
[Fix](Status) Handle returned overall Status correctly ( #31692 )
...
Handle returned overall Status correctly
2024-03-09 19:44:39 +08:00
eea9b56f69
[fix](group commit) handle group commit create plan error ( #31757 )
2024-03-06 13:07:59 +08:00
7d1db6cd1f
[refactor](exception safe) Refactor delete handler and block column predicates to make sure exception safe ( #31618 )
2024-03-01 14:21:17 +08:00
747faeed17
[Enhancement](group commit) optimize some group commit code ( #31392 )
...
This PR optimizes some of the logic related to group commit:
1. Improved the error handling when there is insufficient WAL space during import.
2. Accounted for cases where the content length is negative during import.
3. Added missing error log printing in `group_commit_mgr.cpp`.
2024-02-28 13:05:57 +08:00
48804a978a
[Fix](group commit) Fix group commit flink error message ( #31350 )
...
* When a stream processing framework such as Flink imports data with group commit mode enabled, the size of the imported data is not known in advance, which makes group commit impractical. Previously, to simplify the process, the error message for excessive data volume during stream load was combined with the one for group commit mode, which confused users who saw errors indicating that the data volume is too large during Flink imports. This PR adjusts the logic: if a user runs a stream processing import such as Flink with group commit mode enabled, group commit mode is automatically disabled and the import switches to the standard import mode.
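The fallback rule can be sketched as below. This is an illustration only: the function name, the mode strings, and the use of a missing or negative content length to detect a streaming import are assumptions, not taken from this PR.

```python
from typing import Optional


def choose_import_mode(group_commit_requested: bool,
                       content_length: Optional[int]) -> str:
    """Pick the import mode for one load request (hypothetical helper)."""
    # Assumption: a streaming client cannot announce its payload size,
    # so the length is either absent or negative.
    size_unknown = content_length is None or content_length < 0
    if group_commit_requested and size_unknown:
        # Silently fall back instead of failing with a size error.
        return "standard"
    return "group_commit" if group_commit_requested else "standard"
```

For example, `choose_import_mode(True, None)` returns `"standard"`, while `choose_import_mode(True, 1024)` keeps `"group_commit"`.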
2024-02-26 19:07:10 +08:00
8f77e6363a
[Feature](function) Support xxhash function like murmur hash function ( #31193 )
2024-02-23 19:03:28 +08:00
90ab5ec2d9
[fix](invert index) fix the error issue in the unit test remove_element_only_in_table ( #31238 )
2024-02-22 13:01:49 +08:00
ad07dec0ed
[Improve](InPredict) enhance in predict with struct type ( #30840 )
2024-02-22 13:01:49 +08:00
b66583551c
[fix](group_commit)Fix bound checking problem when reading wal block ( #31112 )
2024-02-22 13:01:48 +08:00
f2a38e6345
[chore](columns) remove update_hashes_with_value for SipHash ( #31224 )
2024-02-22 13:01:48 +08:00
1abe9d4384
[fix](memory) Fix LRU cache stale sweep ( #31122 )
...
Remove LRUCacheValueBase, put last_visit_time into LRUHandle, and automatically update timestamp to last_visit_time during cache insert and lookup.
Do not rely on external modification of last_visit_time, which is often forgotten.
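A minimal Python sketch of this design, assuming a simple map-backed cache rather than the actual BE implementation; apart from `LRUHandle` and `last_visit_time`, the class, method, and parameter names are hypothetical.

```python
import time
from collections import OrderedDict


class LRUHandle:
    """Cache entry that owns its own last_visit_time."""
    __slots__ = ("value", "last_visit_time")

    def __init__(self, value, now):
        self.value = value
        self.last_visit_time = now


class LRUCache:
    def __init__(self, capacity, clock=time.monotonic):
        self._capacity = capacity
        self._clock = clock  # injectable so tests can control time
        self._entries = OrderedDict()

    def insert(self, key, value):
        # The timestamp is set by the cache itself on insert.
        self._entries[key] = LRUHandle(value, self._clock())
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def lookup(self, key):
        handle = self._entries.get(key)
        if handle is None:
            return None
        # The timestamp is refreshed on every hit; callers never touch it.
        handle.last_visit_time = self._clock()
        self._entries.move_to_end(key)
        return handle.value

    def prune_stale(self, max_age):
        """Stale sweep: drop entries not visited within max_age seconds."""
        now = self._clock()
        stale = [key for key, handle in self._entries.items()
                 if now - handle.last_visit_time > max_age]
        for key in stale:
            del self._entries[key]
        return len(stale)
```

Because only `insert` and `lookup` write the timestamp, the stale sweep cannot be misled by a caller that forgot to update it.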
2024-02-21 17:01:29 +08:00
a8d8c6a271
[fix](file-writer) opt s3 file writer and fix empty file related issue #28983 #30703 #31169 ( #31213 )
...
* (feature)(cloud) Use dynamic allocator instead of static buffer pool for better elasticity. (#28983 )
* [fix](outfile) Fix unable to export empty data (#30703 )
Issue Number: close #30600
Fix the inability to export empty data to HDFS/S3. This behavior is inconsistent with version 1.2.7,
which can export empty data to HDFS/S3 and leaves exported files on S3/HDFS.
* [fix](file-writer) avoid empty file for segment writer (#31169 )
---------
Co-authored-by: AlexYue <yj976240184@gmail.com >
Co-authored-by: zxealous <zhouchangyue@baidu.com >
2024-02-21 16:48:54 +08:00
7a1bd6abb0
[improvement](group_commit) Refactor scan wal function ( #30939 )
...
Co-authored-by: Yongqiang YANG <98214048+dataroaring@users.noreply.github.com >
2024-02-20 09:12:38 +08:00
bb4575a392
[Improvement](join) optimization for build_side_output_column ( #30826 )
...
optimization for build_side_output_column
2024-02-19 17:22:03 +08:00
6cf7468073
[enhancement](function) change some function nullable mode ( #30991 )
...
change some function nullable mode
2024-02-18 14:45:25 +08:00
68102fd531
[Fix](auto-partition) fix a concurrent bug of extremely long values ( #31005 )
2024-02-18 14:45:25 +08:00
b5012dc55a
[Enhancement](group commit) optimize pre allocated calculation ( #30893 )
2024-02-18 11:50:17 +08:00
45b4189bb6
[Refactor](opt) Opt rf and remove useless code ( #30900 )
...
Opt rf and remove useless code
2024-02-18 11:50:16 +08:00
8ff8d94697
[fix](ip) change IPv6 to little-endian byte order storage (like IPv4) ( #30730 )
2024-02-05 21:56:57 +08:00
0f47f7f389
[Feature](runtime filter) normalize ignore runtime filter ( #30152 )
...
normalize ignore runtime filter
2024-02-03 20:24:39 +08:00
3315c16383
[enhance](function) refactor from_format_str and support more format ( #30452 )
2024-02-01 19:08:37 +08:00
e9c112b843
[Refact](inverted index) refactor inverted index cache to decouple it from the reader ( #30574 )
2024-02-01 19:00:50 +08:00