Commit Graph

6324 Commits

Author SHA1 Message Date
d2a99aa03b [refactor](scan) change scan reschedule into scan context (#27766)
* [refactor](scan) change scan reschedule into scan context
2023-12-04 10:25:52 +08:00
a64656748b [Enhancenment](wal) disable group commit when streamload size is too large (#27781) 2023-12-03 23:05:11 +08:00
97d36b4f38 [fix](csv_reader) fix trim_double_quotes behavior change (#27882) 2023-12-03 22:57:55 +08:00
80d2c7ab41 [feature](parquet)support read parquet lzo compress. (#27706) 2023-12-03 09:55:52 +08:00
fc8b32be7a [Opt](multi-catalog) Opt parquet orc reader numeric copy by memcpy() and memset(). (#27545)
Opt parquet orc reader null map decoding by memset().
2023-12-03 09:55:05 +08:00
be30bd1e40 [improvement](spinlock) remove some potential bad spinlock usage (#27904)
* [improvement](spinlock) remove some potential spinlock usage

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-12-02 20:33:54 +08:00
421ab56c3e [pipelineX](improvement) Support local shuffle for join and agg (#27852) 2023-12-02 20:17:18 +08:00
10483ea12c [fix](profile) fix error set with peak_memory_usage in pipeline #27749 2023-12-02 14:12:38 +08:00
2e1ce758f1 [feature](function) support ip function ipv6numtostring(alias inet6_ntoa) (#27342) 2023-12-02 11:48:19 +08:00
54b5d04ff9 [improve](csv_reader) handle csv reader error (#27892) 2023-12-02 10:05:02 +08:00
Pxl
f65103e2a6 [Chore](runtime-filter) unify interfaces of bloom filter and remove some unused code (#27822)
* unify interfaces of bloom filter and remove some unused code
2023-12-02 07:42:55 +08:00
a1a75fcfbd [fix](runtime filter) Fix extremely high CPU usage caused by rf merge #27894 2023-12-02 07:40:52 +08:00
1706699e7e [fix](multi-catalog)support the max compute partition prune (#27154)
1. max compute partition prune,
we just support filter mc partitions by '=',it can filter just one partition
to support multiple partition filter and range operator('>','<', '>='..), the partition prune should be supported.

2. add max compute row count cache and partitionValues cache

3. add max compute regression case
2023-12-01 22:28:26 +08:00
68525fc112 [feature](profile) add RuntimeFilterInfo in merge profile #27869 2023-12-01 21:42:25 +08:00
7e3d6bc9f1 [Fix](Variant) Implement ColumnObject::update_hash_with_value (#27873) 2023-12-01 20:14:47 +08:00
007506ce42 [fix](like_func) incorrect result of like with 'NO_BACKSLASH_ESCAPES' mode (#27842) 2023-12-01 17:32:46 +08:00
18338a33b6 [bugfix](mergeprofile) ignore null profile to avoid bug (#27860)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-12-01 16:56:29 +08:00
137f94eac9 [Bug](func) coredump in equal for null in function (#27844) 2023-12-01 15:48:01 +08:00
60bc3be8a2 [Opt](Compression) Opt zstd block decompression by ZSTD_decompressDCtx(). (#27534)
Opt zstd block decompression by `ZSTD_decompressDCtx()` to replace streaming decompression.
It will improve performance but consume more memory. 

Test result: 
- env: 1 node(16 cores, 64G).
- parquet column: 100 million rows of char(255) column.
- result: 5.2 -> 4.6.
2023-12-01 09:10:32 +08:00
0b7becd4b7 [fix](executor)Fix memtracker not set to task group #27699 2023-11-30 22:35:51 +08:00
97105e9a16 [regression](compaction) Add case to test single replica compaction (#27199) 2023-11-30 21:27:13 +08:00
a2fa0b3745 [compability](segment) fix compability issue introduced by #27676 (#27799)
Prior to PR #27676, data was written with empty path information. Consequently, after implementing #27676, data that already exists in a segment is not included in `column_id_to_footer_ordinal`. This issue will lead to `invalid nonexistent column without default value` error.
2023-11-30 21:24:59 +08:00
c0aac043b6 [pipelineX](local shuffle) Use local shuffle to optimize BHJ (#27823) 2023-11-30 21:08:45 +08:00
Pxl
96f2ef3d99 [Improvement](schema-change) Reserve some memory for use by other parts except hold block of schem… (#27800)
Reserve some memory for use by other parts except hold block of schema change job
2023-11-30 17:01:51 +08:00
Pxl
f573918aa4 [Chore](materialized-view) output reference column info when create mv can't find ref column (#27182)
output reference column info when create mv can't find ref column
2023-11-30 16:48:06 +08:00
f1846c10a1 [fix](stop)fix missing notify_all() after the stop (#27796) 2023-11-30 16:04:13 +08:00
8ca8a0655e [fix](memtracking) require size in Allocator::free (#27795) 2023-11-30 15:57:15 +08:00
db8e56b9f2 [improve](move-memtable) increase open load stream timeout (#26909) 2023-11-30 15:27:29 +08:00
5a4948f0f9 [fix](load) fix DataSink prepared check in PlanFragmentExecutor (#27735) 2023-11-30 15:24:04 +08:00
838225b6be [fix](move-memtable) wait stream close before releasing streams (#27791) 2023-11-30 15:03:07 +08:00
3e910e2978 [refactor](simd_json_reader) refactor simd json reader to adapt to parse multi json (#27272) 2023-11-30 15:01:06 +08:00
e4149c6e4c [Fix](parquet-reader) Fix null map issue in parquet reader. (#27777)
Fix null map issue in parquet reader which cause result incorrect such as `min()`, `max()`.

In order to share null map between parquet converted src column and dst column to avoid copying. It is very tricky that will call mutable function `doris_nullable_column->get_null_map_column_ptr()` which will set `_need_update_has_null = true`. Because some operations such as agg will call `has_null()` to set `_need_update_has_null = false`.
2023-11-30 13:55:37 +08:00
da03a50824 [refine](pipelineX) refine dataqueue set source ready block (#27733) 2023-11-30 13:00:18 +08:00
112ae59aa4 [fix](move-memtable) add timeout for load stream close wait (#27439) 2023-11-30 12:00:06 +08:00
34e53acaea [pipelineX](fix) Fix local exchange on pipelineX engine (#27763) 2023-11-30 11:16:20 +08:00
5739167142 [feature](window_function) support to secondary argument to ignore null values in first_value/last_value (#27623) 2023-11-30 09:56:43 +08:00
e9debca97c [Improve](sort) avoid too may tmp vectors for get_columns (#27734) 2023-11-30 09:47:31 +08:00
1f9aa8ab16 [fix](group commit) Fix some group commit problems (#27769) 2023-11-29 23:43:21 +08:00
d96e2dfefb [feature-wip](arrow-flight)(step5) Support JDBC and PreparedStatement and Fix Bug (#27661) 2023-11-29 21:17:20 +08:00
ce271ff382 [fix](parquet)fix can not read parquet lz4 compress. (#27383)
Fixed the problem of not being able to read parquet lz4 compressed format. By default, it is decompressed according to the Hadoop lz4 format. If it fails, it will fall back to the standard lz4 compression format.
2023-11-29 19:04:53 +08:00
f1e9e6dba8 [fix](pipelineX) make RuntimeFilterTimerQueue graceful exit (#27653)
make RuntimeFilterTimerQueue graceful exit
2023-11-29 18:53:13 +08:00
498d27c905 [improve](json_reader) add prompt when all fields is null (#27630) 2023-11-29 18:26:42 +08:00
e208072a06 [fix](gc tablet) fix get shutdown tablet cost a lot time (#27693) 2023-11-29 17:35:14 +08:00
b152abc292 [fix](faststring) fix memtracking in faststring free (#27731) 2023-11-29 17:05:14 +08:00
4633a5c49b [chore](log) Warning log to trace send fragment #27738 2023-11-29 16:43:25 +08:00
9daa7dc6b5 [refactor](http) disable snapshot and get_log_file api (#27724)
Disable 2 http api by default:

1. BE's `/api/snapshot`
2. FE's `/get_log_file`
2023-11-29 16:11:51 +08:00
d9d5468621 [feature](audit-log) add audit-log in insert into (#27641) 2023-11-29 15:01:57 +08:00
f3a1abf20b [chore](compile) fix compile error in ColumnObject (#27739)
This is issue is caused by the two PR merged without conflict
2023-11-29 13:39:32 +08:00
7398c3daf1 [Feature-Variant](Variant Type) support variant type query and index (#27676) 2023-11-29 10:37:28 +08:00
6371dbab33 [Pipeline](load) fix may oom in pipeline load (#27714) 2023-11-29 10:20:59 +08:00