322 Commits

Author SHA1 Message Date
4c1c67e03a [improvement](overflow) Provide the user with a suggestion to avoid th… (#39631) (#39897)
cherry-pick #39631 to branch-2.1
2024-08-26 08:10:32 +08:00
564d3cd647 [Performance](opt) opt the order by performance in permutation (#39089)
Issue Number: cherry pick #38985

2024-08-24 16:05:46 +08:00
7bb83ae379 [cherry-pick](branch-21) fix append_data_by_selector_impl reserve too much useless memory (#39581) (#39635)
cherry-pick from master #39581
2024-08-21 08:47:30 +08:00
b38caed808 [Improve](columns)replace fatal with exception #38035 (#38996) 2024-08-12 09:51:30 +08:00
aa9bdd76d0 [Pick](Variant) pick some fix #38413 #38364 (#38512) 2024-07-31 11:03:31 +08:00
b15ccdbe98 [Pick](Variant) pick some fix (#37922)
#37674
#37839
#37883 
#37857 
#37794
2024-07-16 21:38:47 +08:00
1d49d386aa [cherry-pick](branch-21) remove the useless code in column vector (#34432) (#37827)
cherry-pick from master https://github.com/apache/doris/pull/34432

Co-authored-by: HappenLee <happenlee@hotmail.com>
2024-07-15 22:10:58 +08:00
5162789234 [Refactor](Variant) make many interfaces exception safe (#37640) (#37719) 2024-07-13 16:52:10 +08:00
217eac790b [pick](Variant) pick some refactor and fix #34925 #36317 #36201 #36793 (#37526) 2024-07-11 21:25:34 +08:00
680be6d19f [fix](ub) fix uninitialized accesses in BE (#35370)
ubsan hints:
```c++
/root/doris/be/src/olap/hll.h:93:29: runtime error: load of value 3078029312, which is not a valid value for type 'HllDataType'
/root/doris/be/src/olap/hll.h:94:23: runtime error: load of value 3078029312, which is not a valid value for type 'HllDataType'
/root/doris/be/src/runtime/descriptors.h:439:38: runtime error: load of value 118, which is not a valid value for type 'bool'
/root/doris/be/src/vec/exec/vjdbc_connector.cpp:61:50: runtime error: load of value 35, which is not a valid value for type 'bool' 
```
2024-05-29 20:31:07 +08:00
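UBSan reports of this form typically appear when a `bool` or enum member is read before anything has been assigned to it. The following is only a minimal sketch of one common remedy, default member initializers. The names `HllKind`, `SlotInfo` and `describe` are hypothetical and are not taken from the Doris source; the actual fix in #35370 may differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an enum such as HllDataType; the enumerators
// are illustrative only.
enum class HllKind : std::uint8_t { Empty = 0, Explicit = 1, Sparse = 2, Full = 3 };

// Hypothetical descriptor-like struct. The default member initializers
// guarantee that every object starts with a valid enum value and a valid
// bool, even if a constructor or caller forgets to assign them.
struct SlotInfo {
    HllKind kind = HllKind::Empty;
    bool is_materialized = false;
};

// Reads both members. This is well defined because both were initialized;
// without the initializers above, a default-initialized local SlotInfo
// would hold indeterminate values and this read would be undefined behavior.
inline int describe(const SlotInfo& s) {
    return static_cast<int>(s.kind) * 2 + (s.is_materialized ? 1 : 0);
}
```

With the initializers in place, a freshly declared `SlotInfo` can be passed to `describe` without triggering the "not a valid value for type" diagnostics.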
682d72bf4d [fix](noexcept) Remove incorrect noexcept #35230 2024-05-24 16:23:58 +08:00
944d9bd4bd [exec](performance) opt the topn nullable column order performance in Heap Sort (#35042) 2024-05-21 12:58:58 +08:00
691f3c5ee7 [Performance](Variant) Improve load performance for variant type (#33890)
1. remove phmap for padding rows
2. add SimpleFieldVisitorToScarlarType for short circuit type deducing
3. correct type coercion for conflict types between integers
4. improve nullable column performance
5. remove shared_ptr dependency for DataType, use TypeIndex instead
6. Optimization by caching the order of fields (which is almost always the same)
and a quick check to match the next expected field, instead of searching the hash table.

benchmark:
In clickbench data, load performance:
12m36.799s -> 7m10.934s, about a 43% latency reduction

In variant_p2/performance.groovy:
3min44s20 -> 1min15s80, about a 66% latency reduction
2024-05-18 17:58:33 +08:00
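Item 6 of the commit above (caching the field order and checking the next expected field before falling back to the hash table) can be illustrated roughly as follows. This is a sketch under stated assumptions, not the Doris implementation: `FieldOrderCache`, `find`, `fast_hits` and `slow_lookups` are hypothetical names, and field names are assumed to be unique.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: maps field names to positions and remembers where
// the previous lookup ended, so that rows whose fields arrive in the usual
// order never touch the hash table.
class FieldOrderCache {
public:
    static constexpr std::size_t npos = static_cast<std::size_t>(-1);

    explicit FieldOrderCache(std::vector<std::string> fields) : _fields(std::move(fields)) {
        for (std::size_t i = 0; i < _fields.size(); ++i) {
            _index.emplace(_fields[i], i);
        }
    }

    // Returns the position of `name`, or npos if the field is unknown.
    std::size_t find(std::string_view name) {
        // Fast path: the field is the one expected next, a single string compare.
        if (_next < _fields.size() && _fields[_next] == name) {
            ++fast_hits;
            const std::size_t idx = _next;
            _next = (_next + 1) % _fields.size();
            return idx;
        }
        // Slow path: fall back to the hash table and resynchronize the cursor.
        const auto it = _index.find(std::string(name));
        if (it == _index.end()) {
            return npos;
        }
        ++slow_lookups;
        _next = (it->second + 1) % _fields.size();
        return it->second;
    }

    std::size_t fast_hits = 0;     // lookups answered by the cached order
    std::size_t slow_lookups = 0;  // successful lookups that needed the hash table

private:
    std::vector<std::string> _fields;
    std::unordered_map<std::string, std::size_t> _index;
    std::size_t _next = 0;
};
```

When successive rows list their fields in the same order, every call takes the fast path. An out-of-order field costs one hash lookup and moves the cursor so that the following fields match again.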
Pxl
e2ea54c0a7 [Improvement](sink) remove unused check on string's write_column_to_mysql (#34491)
remove unused check on string's write_column_to_mysql
2024-05-10 22:13:05 +08:00
aa684d85d7 [Bug](Variant) fix rapidjson::Allocator may cause memory allocation issue when built with DENABLE_CLANG_COVERAGE (#34150) 2024-05-10 22:12:00 +08:00
Pxl
804586b342 [Improvement](sort) insert data by batch on VSortedRunMerger::get_next (#34363)
insert data by batch on VSortedRunMerger::get_next
2024-05-10 14:36:53 +08:00
f6ec64c6ad [fix](exception) Fix Block noexcept method not throw exception (#34002) 2024-04-24 17:13:50 +08:00
25358564ca [Fix](compile) Fix gcc compile on master (#33864)
This was introduced by #33511, which wrongly used

ColumnStr<T> ();

which violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but is still accepted by clang for now (see llvm/llvm-project#58112).
2024-04-19 23:41:37 +08:00
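To illustrate the rule referenced above: since CWG 2237, a constructor may no longer be declared with a template-id such as `ColumnStr<T>()`, and the plain injected-class-name has to be used instead. The class below, `ColumnStrSketch`, is a hypothetical stand-in and not the real `ColumnStr`; it only shows the two spellings side by side.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a templated column class.
template <typename T>
class ColumnStrSketch {
public:
    // Ill-formed since C++20 (CWG 2237): the constructor is declared with a
    // template-id. GCC rejects this in C++20 mode.
    // ColumnStrSketch<T>() = default;

    // Portable spelling: use the injected-class-name without template arguments.
    ColumnStrSketch() = default;

    // Records the end offset of one appended string.
    void append_offset(T end) { _offsets.push_back(end); }

    // Number of strings recorded so far.
    std::size_t size() const { return _offsets.size(); }

private:
    std::vector<T> _offsets;
};
```

Uncommenting the first declaration and removing the second is enough to reproduce the kind of GCC error this commit addresses.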
1300317723 [Exec](join) Support column string64 to avoid join failure when the string size overflows uint32 (#33511) (#33850) 2024-04-18 19:43:08 +08:00
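The overflow this commit refers to can be shown with a small sketch. It assumes a string column stores one cumulative end offset per row, so the last offset equals the total byte size. The function names are hypothetical and the code is not taken from Doris.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Returns true when the cumulative end offset of the given string lengths
// no longer fits in a 32-bit offset. Assumes the sum itself fits in 64 bits.
inline bool needs_64bit_offsets(const std::vector<std::uint64_t>& lengths) {
    std::uint64_t total = 0;
    for (const std::uint64_t len : lengths) {
        total += len;
        if (total > std::numeric_limits<std::uint32_t>::max()) {
            return true;
        }
    }
    return false;
}

// Shows what a 32-bit offset would store for a given total size:
// the value wraps around modulo 2^32.
inline std::uint32_t wrapped_offset(std::uint64_t total_bytes) {
    return static_cast<std::uint32_t>(total_bytes);
}
```

Two strings of 2 GiB each already exceed the 32-bit range, which is the kind of situation a 64-bit offset column avoids.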
5b616da543 [refine](Operator) When _stop_emplace_flag is not set to true, perform batch processing on the block. (#33173) 2024-04-17 23:42:12 +08:00
249a9c9875 [Feature](Variant) support aggregation model for Variant type (#33493)
Refactor: use `insert_from` instead of `replace_column_data` for variable-length columns
2024-04-17 23:42:00 +08:00
3d66723214 [branch-2.1](auto-partition) pick auto partition and some more prs (#33523) 2024-04-11 17:12:17 +08:00
ef26479282 [improve](serde) support complex type in write/read pb serde (#33124)
support complex type and ip/jsonb in DataTypeSerDe::write_column_to_pb/read_column_from_pb function
2024-04-11 09:31:50 +08:00
Pxl
8fd6d4c41b [Chore](build) add -Wconversion and remove some unused code (#33127)
add -Wconversion and remove some unused code
2024-04-10 15:26:08 +08:00
8b1d174b13 [Optimize] Move strings_pool from individual tree nodes to the tree itself (#33089)
Previously, strings_pool was allocated within each tree node. However, due to the Arena's alignment of allocated chunks to at least 4K, this allocation size was excessively large for a single tree node. Consequently, when there are numerous nodes within the SubcolumnTree, a significant portion of memory was wasted. Moving strings_pool to the tree itself optimizes memory usage and reduces wastage, improving overall efficiency.
2024-04-10 14:53:56 +08:00
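The memory effect described above can be approximated with a toy model. `ToyArena` below is a hypothetical simplification that reserves chunks of at least 4096 bytes. It is not the Doris `Arena`, and the two helper functions exist only to compare the per-node and per-tree layouts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <string_view>
#include <vector>

// Toy arena: hands out memory from chunks of at least 4096 bytes.
class ToyArena {
public:
    static constexpr std::size_t kMinChunk = 4096;

    // Copies `s` into the arena and returns a view of the stored copy.
    std::string_view insert(std::string_view s) {
        if (_chunks.empty() || _used + s.size() > _capacity) {
            _capacity = s.size() > kMinChunk ? s.size() : kMinChunk;
            _chunks.push_back(std::make_unique<char[]>(_capacity));
            _reserved += _capacity;
            _used = 0;
        }
        char* dst = _chunks.back().get() + _used;
        if (!s.empty()) {
            std::memcpy(dst, s.data(), s.size());
        }
        _used += s.size();
        return std::string_view(dst, s.size());
    }

    // Total bytes reserved from the system, used or not.
    std::size_t reserved_bytes() const { return _reserved; }

private:
    std::vector<std::unique_ptr<char[]>> _chunks;
    std::size_t _capacity = 0;
    std::size_t _used = 0;
    std::size_t _reserved = 0;
};

// Old layout: every node owns its own pool.
struct NodeWithPool {
    ToyArena pool;
    std::string_view key;
};

// Bytes reserved when each node allocates its key from a private pool.
inline std::size_t reserved_with_pool_per_node(const std::vector<std::string>& keys) {
    std::vector<NodeWithPool> nodes(keys.size());
    std::size_t total = 0;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        nodes[i].key = nodes[i].pool.insert(keys[i]);
        total += nodes[i].pool.reserved_bytes();
    }
    return total;
}

// Bytes reserved when all nodes share one pool owned by the tree.
inline std::size_t reserved_with_shared_pool(const std::vector<std::string>& keys) {
    ToyArena pool;
    std::vector<std::string_view> node_keys;
    for (const std::string& k : keys) {
        node_keys.push_back(pool.insert(k));
    }
    return pool.reserved_bytes();
}
```

In this model, three short keys reserve three chunks (12288 bytes) with one pool per node but only a single chunk (4096 bytes) with a shared pool, which mirrors the waste the commit message describes.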
Pxl
e4993a19e5 [Chore](column) remove ColumnVectorHelper (#33036)
remove ColumnVectorHelper
2024-04-10 11:56:41 +08:00
cf7595d423 [opt](memory) Optimize mem tracker accuracy (#32039) (#33140) 2024-04-10 11:42:19 +08:00
ebbfb06162 [Bug](array) fix array column core dump in get_shrinked_column caused by a missing type check (#33295)
* [Bug](array) fix array column core dump in get_shrinked_column as not check type

* add function could_shrinked_column
2024-04-08 07:27:40 +08:00
797b8fa456 [FIX](agg) fix vertical_compaction_reader for agg table with array/map type (#33130) 2024-04-03 18:09:45 +08:00
617cc667fe [Fix](Variant) fix variant serialize root node (#31769) 2024-03-21 14:07:50 +08:00
724bc82362 [refactor](chore) replace HashMapWithStackMemory with std::unordered_map (#32309) 2024-03-21 14:07:19 +08:00
b248d3a27e [Refactor](rf) Refactor the rf code interface to remove update filter v1 (#31643) 2024-03-02 17:12:49 +08:00
c72e55d867 [enhancement](core) throw exception instead of core during insert_range_from method (#31592)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-02-29 19:51:18 +08:00
f039ec8cfb [debug](Variant) sanitize variant type and column in find_and_set_leave_value (#31436) 2024-02-27 13:58:13 +08:00
f2a38e6345 [chore](columns) remove update_hashes_with_value for SipHash (#31224) 2024-02-22 13:01:48 +08:00
Pxl
bb4575a392 [Improvement](join) optimization for build_side_output_column (#30826)
optimization for build_side_output_column
2024-02-19 17:22:03 +08:00
0442d5dc0e [fix](Variant Type) Add sparse columns meta to fix compaction (#28673)
Co-authored-by: eldenmoon <15605149486@163.com>
2024-02-16 10:12:23 +08:00
b23a785775 [Fix](Variant) support materialize view for variant and accessing variant subcolumns (#30603)
* [Fix](Variant) support materialize view for variant and accessing variant subcolumns
1. fix schema change that loses paths and leads to invalid data reads
2. support element_at function in BE side and use simdjson to parse data
3. fix multi slot expression
2024-02-16 10:12:23 +08:00
8ff8d94697 [fix](ip) change IPv6 to little-endian byte order storage (like IPv4) (#30730) 2024-02-05 21:56:57 +08:00
82aa304706 [Opt](exec) opt the repeat node code (#30683) 2024-02-01 23:14:14 +08:00
e6fbccd3ed [Feature](Variant) support row store for variant type (#30052) 2024-01-31 23:53:39 +08:00
221308f78a [fix](datatype) fix bugs for IPv4/v6 datatype and add some basic regression test cases (#30261) 2024-01-31 23:53:39 +08:00
5e66e2519d [improve](column) support append_data_by_selector function in const column (#29996)
support append_data_by_selector function in const column
2024-01-24 09:59:45 +08:00
24ed3e4103 [Fix](Expr&code-style) check prepare&open before every VExpr execute (#26673) 2024-01-23 10:09:54 +08:00
Pxl
3cf95d0fdf [Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec (#28788)
optimize for ColumnNullable's serialize_vec/deserialize_vec
2024-01-12 11:59:52 +08:00
cbcb81f381 [FIX](complextype)fix compare_at base function support nested types (#29297) 2024-01-06 12:05:43 +08:00
5db496d844 [Improve](Variant) make output stable (#29389) 2024-01-02 20:29:17 +08:00
9490d5e9a2 [Debug](Variant) sanitize variant in write_column_to_mysql (#29380) 2024-01-02 20:28:59 +08:00
fcc4cfb900 [Fix](Variant) add more info before crash in serialization (#29344) 2023-12-31 11:17:36 +08:00
e7d67e9411 [fix](be) resolves some unused-raii and used-after-moved issues (#29285) 2023-12-30 12:14:49 +08:00