Commit Graph

6333 Commits

Author SHA1 Message Date
a7d1e92fc2 [Fix](variant) handle StorageReadOptions to avoid crash in new_column_iterator_with_path (#27936)
In a partial update, reading a variant without `opt` leads to a crash.
2023-12-04 17:02:35 +08:00
2022a8ab32 [fix](invert index) fix reader does not close fd (#27918) 2023-12-04 16:44:50 +08:00
4d1aa131ee [Feature](datatype) add be ut codes for IPv4/v6 (#26534)
Add unit test codes for IP types
2023-12-04 15:25:02 +08:00
a6a6892f90 [chore](status code) avoid print stack for DATA_QUALITY_ERROR (#27935)
issue introduced by #27065
2023-12-04 15:04:27 +08:00
48935c14e2 [Improvement](variant) limit the column size on tablet schema (#27399) (#27785)
1. limit the column count to default 2048
2. fix get_inverted_index returning nullptr when the variant's unique id is -1, by using its parent's unique id instead
3. avoid adding a subcolumn with the same path to the tablet schema more than once
4. make extracted column unique id -1
2023-12-04 14:47:36 +08:00
Pxl
2b715924c5 [Chore](function) set normal function use_default_implementation_for_constants to default (#27891)
set normal function use_default_implementation_for_constants to default
2023-12-04 14:19:25 +08:00
Pxl
45a49ac059 [Bug](column) support insert default for ColumnFixedLengthObject #27927 2023-12-04 12:52:50 +08:00
e62d19d90d [improve](partition) support auto list partition with more columns (#27817)
Previously, the partition-by clause could have only one column.
That limit is now removed, so it can have more columns.
2023-12-04 11:33:18 +08:00
Pxl
e3d2425d47 [Improvement](join) remove insert_indices_from_join and special judge for -1 (#27779)
remove insert_indices_from_join and special judge for -1
2023-12-04 11:03:22 +08:00
d2a99aa03b [refactor](scan) change scan reschedule into scan context (#27766)
* [refactor](scan) change scan reschedule into scan context
2023-12-04 10:25:52 +08:00
a64656748b [Enhancement](wal) disable group commit when streamload size is too large (#27781) 2023-12-03 23:05:11 +08:00
97d36b4f38 [fix](csv_reader) fix trim_double_quotes behavior change (#27882) 2023-12-03 22:57:55 +08:00
80d2c7ab41 [feature](parquet)support read parquet lzo compress. (#27706) 2023-12-03 09:55:52 +08:00
fc8b32be7a [Opt](multi-catalog) Opt parquet orc reader numeric copy by memcpy() and memset(). (#27545)
Opt parquet orc reader null map decoding by memset().
2023-12-03 09:55:05 +08:00
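The idea behind fc8b32be7a (bulk `memcpy()` for numeric values, `memset()` for the null map) can be illustrated with a minimal sketch. This is not the Doris reader code: the `Run` struct, the `decode_runs` function, and the assumption that the source holds only non-null values packed densely are all hypothetical, chosen to show the technique.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical descriptor: `length` consecutive rows that are all null
// or all non-null.
struct Run {
    std::size_t length;
    bool is_null;
};

// Assumption: `src` holds only the non-null values, packed densely.
// Each run is handled with one memcpy/memset call instead of a per-row loop.
void decode_runs(const int32_t* src, std::size_t src_len,
                 const std::vector<Run>& runs,
                 std::vector<int32_t>& dst, std::vector<uint8_t>& null_map) {
    std::size_t total = 0;
    for (const Run& r : runs) total += r.length;
    dst.resize(total);
    null_map.resize(total);

    std::size_t in = 0;   // next unread value in src
    std::size_t out = 0;  // next row to write in dst / null_map
    for (const Run& r : runs) {
        if (r.length == 0) continue;
        if (r.is_null) {
            // Null rows: zero the values and mark the rows as null.
            std::memset(dst.data() + out, 0, r.length * sizeof(int32_t));
            std::memset(null_map.data() + out, 1, r.length);
        } else {
            assert(in + r.length <= src_len);  // source must hold enough values
            // Non-null rows: copy the whole run at once, mark as not null.
            std::memcpy(dst.data() + out, src + in, r.length * sizeof(int32_t));
            std::memset(null_map.data() + out, 0, r.length);
            in += r.length;
        }
        out += r.length;
    }
}
```

The per-run calls replace element-by-element loops, which is where the saving in a change of this kind would be expected to come from.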
be30bd1e40 [improvement](spinlock) remove some potential bad spinlock usage (#27904)
* [improvement](spinlock) remove some potential spinlock usage

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-12-02 20:33:54 +08:00
421ab56c3e [pipelineX](improvement) Support local shuffle for join and agg (#27852) 2023-12-02 20:17:18 +08:00
10483ea12c [fix](profile) fix error set with peak_memory_usage in pipeline #27749 2023-12-02 14:12:38 +08:00
2e1ce758f1 [feature](function) support ip function ipv6numtostring(alias inet6_ntoa) (#27342) 2023-12-02 11:48:19 +08:00
54b5d04ff9 [improve](csv_reader) handle csv reader error (#27892) 2023-12-02 10:05:02 +08:00
Pxl
f65103e2a6 [Chore](runtime-filter) unify interfaces of bloom filter and remove some unused code (#27822)
* unify interfaces of bloom filter and remove some unused code
2023-12-02 07:42:55 +08:00
a1a75fcfbd [fix](runtime filter) Fix extremely high CPU usage caused by rf merge #27894 2023-12-02 07:40:52 +08:00
1706699e7e [fix](multi-catalog)support the max compute partition prune (#27154)
1. MaxCompute partition pruning:
previously, MaxCompute (mc) partitions could only be filtered by '=', which selects just one partition.
Partition pruning is needed to support filters over multiple partitions and range operators ('>', '<', '>=', ...).

2. Add a MaxCompute row count cache and a partitionValues cache.

3. Add a MaxCompute regression case.
2023-12-01 22:28:26 +08:00
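The difference between equality-only filtering and pruning with range operators, as described in 1706699e7e, can be sketched as follows. This is an illustrative sketch only, not the Doris implementation: `Op`, `Predicate`, `matches`, and `prune_partitions` are hypothetical names, and partition values are assumed to be plain integers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical comparison operators a partition filter may use.
enum class Op { EQ, LT, LE, GT, GE };

// Hypothetical predicate on a single integer partition column.
struct Predicate {
    Op op;
    int value;
};

// Returns true if the partition value satisfies the predicate.
bool matches(int partition_value, const Predicate& p) {
    switch (p.op) {
        case Op::EQ: return partition_value == p.value;
        case Op::LT: return partition_value < p.value;
        case Op::LE: return partition_value <= p.value;
        case Op::GT: return partition_value > p.value;
        case Op::GE: return partition_value >= p.value;
    }
    assert(false && "unknown operator");
    return false;
}

// Keeps only the partitions that satisfy every predicate (a conjunction).
// With range operators, more than one partition can survive pruning.
std::vector<int> prune_partitions(const std::vector<int>& partition_values,
                                  const std::vector<Predicate>& conjuncts) {
    std::vector<int> kept;
    for (int v : partition_values) {
        bool keep = true;
        for (const Predicate& p : conjuncts) {
            if (!matches(v, p)) {
                keep = false;
                break;
            }
        }
        if (keep) kept.push_back(v);
    }
    return kept;
}
```

An equality predicate keeps at most one partition here, while a pair such as `>= 2` and `< 5` keeps every partition in that range.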
68525fc112 [feature](profile) add RuntimeFilterInfo in merge profile #27869 2023-12-01 21:42:25 +08:00
7e3d6bc9f1 [Fix](Variant) Implement ColumnObject::update_hash_with_value (#27873) 2023-12-01 20:14:47 +08:00
007506ce42 [fix](like_func) incorrect result of like with 'NO_BACKSLASH_ESCAPES' mode (#27842) 2023-12-01 17:32:46 +08:00
18338a33b6 [bugfix](mergeprofile) ignore null profile to avoid bug (#27860)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-12-01 16:56:29 +08:00
137f94eac9 [Bug](func) coredump in equal for null in function (#27844) 2023-12-01 15:48:01 +08:00
60bc3be8a2 [Opt](Compression) Opt zstd block decompression by ZSTD_decompressDCtx(). (#27534)
Optimize zstd block decompression by using `ZSTD_decompressDCtx()` in place of streaming decompression.
This improves performance but consumes more memory.

Test result: 
- env: 1 node(16 cores, 64G).
- parquet column: 100 million rows of char(255) column.
- result: 5.2 -> 4.6.
2023-12-01 09:10:32 +08:00
0b7becd4b7 [fix](executor)Fix memtracker not set to task group #27699 2023-11-30 22:35:51 +08:00
97105e9a16 [regression](compaction) Add case to test single replica compaction (#27199) 2023-11-30 21:27:13 +08:00
a2fa0b3745 [compatibility](segment) fix compatibility issue introduced by #27676 (#27799)
Prior to PR #27676, data was written with empty path information. Consequently, after #27676, data that already exists in a segment is not included in `column_id_to_footer_ordinal`. This leads to an `invalid nonexistent column without default value` error.
2023-11-30 21:24:59 +08:00
c0aac043b6 [pipelineX](local shuffle) Use local shuffle to optimize BHJ (#27823) 2023-11-30 21:08:45 +08:00
Pxl
96f2ef3d99 [Improvement](schema-change) Reserve some memory for use by other parts except hold block of schem… (#27800)
Reserve some memory for use by other parts except hold block of schema change job
2023-11-30 17:01:51 +08:00
Pxl
f573918aa4 [Chore](materialized-view) output reference column info when create mv can't find ref column (#27182)
output reference column info when create mv can't find ref column
2023-11-30 16:48:06 +08:00
f1846c10a1 [fix](stop)fix missing notify_all() after the stop (#27796) 2023-11-30 16:04:13 +08:00
8ca8a0655e [fix](memtracking) require size in Allocator::free (#27795) 2023-11-30 15:57:15 +08:00
db8e56b9f2 [improve](move-memtable) increase open load stream timeout (#26909) 2023-11-30 15:27:29 +08:00
5a4948f0f9 [fix](load) fix DataSink prepared check in PlanFragmentExecutor (#27735) 2023-11-30 15:24:04 +08:00
838225b6be [fix](move-memtable) wait stream close before releasing streams (#27791) 2023-11-30 15:03:07 +08:00
3e910e2978 [refactor](simd_json_reader) refactor simd json reader to adapt to parse multi json (#27272) 2023-11-30 15:01:06 +08:00
e4149c6e4c [Fix](parquet-reader) Fix null map issue in parquet reader. (#27777)
Fix a null map issue in the parquet reader that causes incorrect results for functions such as `min()` and `max()`.

The null map is shared between the parquet-converted src column and the dst column to avoid copying. This is tricky: sharing it calls the mutable function `doris_nullable_column->get_null_map_column_ptr()`, which sets `_need_update_has_null = true`, while some operations such as agg call `has_null()`, which sets `_need_update_has_null = false`.
2023-11-30 13:55:37 +08:00
da03a50824 [refine](pipelineX) refine dataqueue set source ready block (#27733) 2023-11-30 13:00:18 +08:00
112ae59aa4 [fix](move-memtable) add timeout for load stream close wait (#27439) 2023-11-30 12:00:06 +08:00
34e53acaea [pipelineX](fix) Fix local exchange on pipelineX engine (#27763) 2023-11-30 11:16:20 +08:00
5739167142 [feature](window_function) support to secondary argument to ignore null values in first_value/last_value (#27623) 2023-11-30 09:56:43 +08:00
e9debca97c [Improve](sort) avoid too many tmp vectors for get_columns (#27734) 2023-11-30 09:47:31 +08:00
1f9aa8ab16 [fix](group commit) Fix some group commit problems (#27769) 2023-11-29 23:43:21 +08:00
d96e2dfefb [feature-wip](arrow-flight)(step5) Support JDBC and PreparedStatement and Fix Bug (#27661) 2023-11-29 21:17:20 +08:00
ce271ff382 [fix](parquet)fix can not read parquet lz4 compress. (#27383)
Fixed the problem of not being able to read parquet lz4 compressed format. By default, it is decompressed according to the Hadoop lz4 format. If it fails, it will fall back to the standard lz4 compression format.
2023-11-29 19:04:53 +08:00
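The fallback order described in ce271ff382 (try the Hadoop lz4 format first, then the standard lz4 format) can be sketched like this. The sketch only shows the control flow: `Decoder`, `decompress_with_fallback`, and the two decoder arguments are hypothetical stand-ins, not the Doris or lz4 API, and real decoders would be supplied by the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Hypothetical decoder signature: returns true on success and writes the
// decompressed bytes to `output`; returns false if the input is not in
// the format this decoder understands.
using Decoder = std::function<bool(const Bytes& input, Bytes& output)>;

// Tries the Hadoop-framed decoder first; if it fails, falls back to the
// standard decoder. Returns false only if both decoders fail.
bool decompress_with_fallback(const Bytes& input, Bytes& output,
                              const Decoder& hadoop_lz4,
                              const Decoder& standard_lz4) {
    assert(hadoop_lz4 && standard_lz4);  // both decoders must be callable
    output.clear();
    if (hadoop_lz4(input, output)) {
        return true;
    }
    output.clear();  // discard any partial output from the failed attempt
    return standard_lz4(input, output);
}
```

Clearing `output` between attempts keeps a failed first attempt from leaking partial data into the result of the second one.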
f1e9e6dba8 [fix](pipelineX) make RuntimeFilterTimerQueue graceful exit (#27653)
make RuntimeFilterTimerQueue graceful exit
2023-11-29 18:53:13 +08:00