6157 Commits

Author SHA1 Message Date
54b5d04ff9 [improve](csv_reader) handle csv reader error (#27892) 2023-12-02 10:05:02 +08:00
Pxl
f65103e2a6 [Chore](runtime-filter) unify interfaces of bloom filter and remove some unused code (#27822)
2023-12-02 07:42:55 +08:00
a1a75fcfbd [fix](runtime filter) Fix extremely high CPU usage caused by rf merge #27894 2023-12-02 07:40:52 +08:00
1706699e7e [fix](multi-catalog)support the max compute partition prune (#27154)
1. MaxCompute partition pruning:
currently MC partitions can only be filtered by '=', which selects a single partition.
Partition pruning is needed to support filters across multiple partitions and range operators ('>', '<', '>=', ...).

2. add max compute row count cache and partitionValues cache

3. add max compute regression case
2023-12-01 22:28:26 +08:00
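
A rough illustration of the partition pruning described in #27154, not the actual Doris code. It assumes partition values are cached as plain strings per partition column and that a predicate is a hypothetical `(column, operator, literal)` triple; `prune_partitions` and `OPS` are names made up for this sketch.

```python
import operator

# Hypothetical operator table; the commit mentions '=', '>', '<', '>=' and similar.
OPS = {
    "=": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def prune_partitions(partitions, predicates):
    """Return the partitions whose values satisfy every predicate.

    partitions: list of dicts mapping partition column -> value
    predicates: list of (column, op, literal) triples
    """
    kept = []
    for part in partitions:
        if all(OPS[op](part[col], lit) for col, op, lit in predicates):
            kept.append(part)
    return kept


# Example data is invented for illustration only.
partitions = [{"dt": "2023-11-29"}, {"dt": "2023-11-30"}, {"dt": "2023-12-01"}]
selected = prune_partitions(partitions, [("dt", ">=", "2023-11-30")])
```

With only '=' available, a query could reach one partition at a time; a range predicate such as `dt >= '2023-11-30'` keeps every matching partition in one pass.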
68525fc112 [feature](profile) add RuntimeFilterInfo in merge profile #27869 2023-12-01 21:42:25 +08:00
7e3d6bc9f1 [Fix](Variant) Implement ColumnObject::update_hash_with_value (#27873) 2023-12-01 20:14:47 +08:00
007506ce42 [fix](like_func) incorrect result of like with 'NO_BACKSLASH_ESCAPES' mode (#27842) 2023-12-01 17:32:46 +08:00
18338a33b6 [bugfix](mergeprofile) ignore null profile to avoid bug (#27860)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-12-01 16:56:29 +08:00
137f94eac9 [Bug](func) coredump in equal for null in function (#27844) 2023-12-01 15:48:01 +08:00
60bc3be8a2 [Opt](Compression) Opt zstd block decompression by ZSTD_decompressDCtx(). (#27534)
Optimize zstd block decompression by using `ZSTD_decompressDCtx()` in place of streaming decompression.
This improves performance but consumes more memory.

Test result: 
- env: 1 node(16 cores, 64G).
- parquet column: 100 million rows of char(255) column.
- result: 5.2 -> 4.6.
2023-12-01 09:10:32 +08:00
0b7becd4b7 [fix](executor)Fix memtracker not set to task group #27699 2023-11-30 22:35:51 +08:00
97105e9a16 [regression](compaction) Add case to test single replica compaction (#27199) 2023-11-30 21:27:13 +08:00
a2fa0b3745 [compability](segment) fix compability issue introduced by #27676 (#27799)
Prior to PR #27676, data was written with empty path information. Consequently, after #27676, data that already exists in a segment is not included in `column_id_to_footer_ordinal`, which leads to an `invalid nonexistent column without default value` error.
2023-11-30 21:24:59 +08:00
c0aac043b6 [pipelineX](local shuffle) Use local shuffle to optimize BHJ (#27823) 2023-11-30 21:08:45 +08:00
Pxl
96f2ef3d99 [Improvement](schema-change) Reserve some memory for use by other parts except hold block of schem… (#27800)
Reserve some memory for use by other parts except hold block of schema change job
2023-11-30 17:01:51 +08:00
Pxl
f573918aa4 [Chore](materialized-view) output reference column info when create mv can't find ref column (#27182)
2023-11-30 16:48:06 +08:00
f1846c10a1 [fix](stop)fix missing notify_all() after the stop (#27796) 2023-11-30 16:04:13 +08:00
8ca8a0655e [fix](memtracking) require size in Allocator::free (#27795) 2023-11-30 15:57:15 +08:00
db8e56b9f2 [improve](move-memtable) increase open load stream timeout (#26909) 2023-11-30 15:27:29 +08:00
5a4948f0f9 [fix](load) fix DataSink prepared check in PlanFragmentExecutor (#27735) 2023-11-30 15:24:04 +08:00
838225b6be [fix](move-memtable) wait stream close before releasing streams (#27791) 2023-11-30 15:03:07 +08:00
3e910e2978 [refactor](simd_json_reader) refactor simd json reader to adapt to parse multi json (#27272) 2023-11-30 15:01:06 +08:00
e4149c6e4c [Fix](parquet-reader) Fix null map issue in parquet reader. (#27777)
Fix a null map issue in the parquet reader that causes incorrect results for functions such as `min()` and `max()`.

To avoid copying, the null map is shared between the parquet-converted src column and the dst column. This is tricky: it calls the mutable function `doris_nullable_column->get_null_map_column_ptr()`, which sets `_need_update_has_null = true`, while some operations such as agg call `has_null()`, which sets `_need_update_has_null = false`.
2023-11-30 13:55:37 +08:00
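
The hazard described in #27777 can be sketched as follows. This is one reading of the commit message, not the Doris implementation: only `has_null()` and `_need_update_has_null` come from the message, while the `NullableColumn` class and `get_null_map_mutable` are hypothetical stand-ins.

```python
class NullableColumn:
    """Toy column that caches whether its null map contains any null."""

    def __init__(self, null_map):
        self._null_map = list(null_map)
        self._has_null = False
        self._need_update_has_null = True

    def get_null_map_mutable(self):
        # Handing out a mutable reference marks the cached flag as dirty.
        self._need_update_has_null = True
        return self._null_map

    def has_null(self):
        # Recompute only when dirty, then clear the dirty flag.
        if self._need_update_has_null:
            self._has_null = any(self._null_map)
            self._need_update_has_null = False
        return self._has_null


col = NullableColumn([0, 0, 0])
shared = col.get_null_map_mutable()  # shared with another column to avoid a copy
first = col.has_null()               # caches "no nulls" and clears the dirty flag
shared[1] = 1                        # a later write through the shared reference
stale = col.has_null()               # cached answer, no longer matches the data
col.get_null_map_mutable()           # requesting the mutable map again marks it dirty
fresh = col.has_null()               # recomputed from the current null map
```

In this toy model, a write through the shared reference after `has_null()` has been cached goes unnoticed, which is the kind of stale state that would produce wrong aggregate results.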
da03a50824 [refine](pipelineX) refine dataqueue set source ready block (#27733) 2023-11-30 13:00:18 +08:00
112ae59aa4 [fix](move-memtable) add timeout for load stream close wait (#27439) 2023-11-30 12:00:06 +08:00
34e53acaea [pipelineX](fix) Fix local exchange on pipelineX engine (#27763) 2023-11-30 11:16:20 +08:00
5739167142 [feature](window_function) support to secondary argument to ignore null values in first_value/last_value (#27623) 2023-11-30 09:56:43 +08:00
e9debca97c [Improve](sort) avoid too may tmp vectors for get_columns (#27734) 2023-11-30 09:47:31 +08:00
1f9aa8ab16 [fix](group commit) Fix some group commit problems (#27769) 2023-11-29 23:43:21 +08:00
d96e2dfefb [feature-wip](arrow-flight)(step5) Support JDBC and PreparedStatement and Fix Bug (#27661) 2023-11-29 21:17:20 +08:00
ce271ff382 [fix](parquet)fix can not read parquet lz4 compress. (#27383)
Fix reading of parquet files that use lz4 compression. Data is first decompressed as Hadoop lz4 format; if that fails, the reader falls back to the standard lz4 format.
2023-11-29 19:04:53 +08:00
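
The try-then-fall-back order from #27383 might look roughly like this. It is a sketch under stated assumptions, not the Doris code: the Hadoop framing is assumed to be repeated blocks, each prefixed by two big-endian 32-bit sizes (decompressed, then compressed), and `block_decompress` is a hypothetical injected callable standing in for a real LZ4 block decoder.

```python
import struct


def decompress_hadoop_lz4(buf, expected_size, block_decompress):
    """Try the assumed Hadoop framing; return None if the buffer does not fit it."""
    out = bytearray()
    pos = 0
    while pos < len(buf):
        if len(buf) - pos < 8:
            return None  # not enough bytes left for a size prefix
        raw_len, comp_len = struct.unpack_from(">II", buf, pos)
        pos += 8
        if comp_len > len(buf) - pos:
            return None  # prefix claims more data than is present
        try:
            block = block_decompress(buf[pos:pos + comp_len], raw_len)
        except Exception:
            return None
        if len(block) != raw_len:
            return None
        out += block
        pos += comp_len
    return bytes(out) if len(out) == expected_size else None


def decompress_parquet_lz4(buf, expected_size, block_decompress):
    """Hadoop framing first, then plain block decompression as the fallback."""
    result = decompress_hadoop_lz4(buf, expected_size, block_decompress)
    if result is None:
        result = block_decompress(buf, expected_size)
    return result


# Identity "codec" used only to exercise the control flow; it is NOT LZ4.
def fake_block_decompress(data, size):
    return bytes(data)


framed = struct.pack(">II", 5, 5) + b"hello"
from_framed = decompress_parquet_lz4(framed, 5, fake_block_decompress)
from_plain = decompress_parquet_lz4(b"plain-bytes!", 12, fake_block_decompress)
```

The size checks are what make the fallback safe in this sketch: a buffer that is not Hadoop-framed almost always fails the length validation, so it is handed to the plain decoder untouched.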
f1e9e6dba8 [fix](pipelineX) make RuntimeFilterTimerQueue graceful exit (#27653)
2023-11-29 18:53:13 +08:00
498d27c905 [improve](json_reader) add prompt when all fields is null (#27630) 2023-11-29 18:26:42 +08:00
e208072a06 [fix](gc tablet) fix get shutdown tablet cost a lot time (#27693) 2023-11-29 17:35:14 +08:00
b152abc292 [fix](faststring) fix memtracking in faststring free (#27731) 2023-11-29 17:05:14 +08:00
4633a5c49b [chore](log) Warning log to trace send fragment #27738 2023-11-29 16:43:25 +08:00
9daa7dc6b5 [refactor](http) disable snapshot and get_log_file api (#27724)
Disable two HTTP APIs by default:

1. BE's `/api/snapshot`
2. FE's `/get_log_file`
2023-11-29 16:11:51 +08:00
d9d5468621 [feature](audit-log) add audit-log in insert into (#27641) 2023-11-29 15:01:57 +08:00
f3a1abf20b [chore](compile) fix compile error in ColumnObject (#27739)
This issue is caused by two PRs that were merged without conflict.
2023-11-29 13:39:32 +08:00
7398c3daf1 [Feature-Variant](Variant Type) support variant type query and index (#27676) 2023-11-29 10:37:28 +08:00
6371dbab33 [Pipeline](load) fix may oom in pipeline load (#27714) 2023-11-29 10:20:59 +08:00
0078a445d2 [fix](pipelineX) make union source do not block (#27721) 2023-11-28 23:24:56 +08:00
8910772cb8 [pipelineX](log) refine debug string (#27712) 2023-11-28 21:15:52 +08:00
f54db85ea3 [Opt](compression) Opt gzip decompress by libdeflate on X86 and X86_64 platforms: 2. Opt gzip decompression by libdeflate lib. (#27669)
Part 2 of optimizing gzip decompression with libdeflate on X86 and X86_64 platforms: use libdeflate for gzip decompression, following the addition of the libdeflate lib in #27542.
2023-11-28 20:05:24 +08:00
Pxl
d969047b50 [Refactor](join) refactor of hash join (#27557)
Improve performance on the TPC-H data set by restructuring the join-related code and its use of the hash table.

Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: BiteTheDDDDt <pxl290@qq.com>
2023-11-28 19:46:00 +08:00
38d30f21f1 [pipelineX](bug) Fix scan dependency timeout (#27696) 2023-11-28 18:21:11 +08:00
Pxl
91b0edfaa2 [Bug](join) try fix wrong _has_null_in_build_side setted (#27684)
2023-11-28 17:42:14 +08:00
b93dd1d5f7 [enhancement](load) improve error msg for load when cancelled by mem gc (#26809)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-11-28 17:36:11 +08:00
c203d36300 [pipelineX](bug) Add logs (#27665) 2023-11-28 15:53:40 +08:00
f565f60bc3 [refactor](standard)BE:Initialize pointer variables in the class to nullptr by default (#27587) 2023-11-28 13:02:30 +08:00