Commit Graph

51 Commits

Author SHA1 Message Date
Pxl
477961dc21 [Chore](agg) refactor of hash map (#22958)
refactor of hash map
2023-08-18 17:59:30 +08:00
Pxl
d371101bfd [Improvement](aggregation) make fixed hashmap's bitmap_size flexable (#22573)
make fixed hashmap's bitmap_size flexable
2023-08-14 10:47:06 +08:00
9d3f1dcf44 [improvement](vectorized) Deserialized elements of count distinct aggregation directly inserted into target hashset (#21888)
The original logic is to first deserialize the ColumnString into a HashSet (insert the deserialized elements into the hashset), and then traverse all the HashSet elements into the target HashSet during the merge phase.
After optimization, when deserializing, elements are directly inserted into the target HashSet, thereby reducing unnecessary hashset insert overhead.

In one of our internal query tests, 30 hashsets were merged in second phase aggregation(the average cardinality is 1,400,000), and the cardinality after merging is 42,000,000. After optimization, the MergeTime dropped from 5s965ms to 3s375ms.
2023-08-02 21:19:56 +08:00
Pxl
f5e3cd2737 [Improvement](aggregation) optimization for aggregation hash_table_lazy_emplace (#22327)
optimization for aggregation hash_table_lazy_emplace
2023-08-02 11:50:21 +08:00
1c6246f7ee [improve](agg) support distinct agg node (#22169)
select c_name from customer union select c_name from customer
this sql used agg node to get distinct row of c_name,
so it's no need to wait for inserted all data to hash map,
could output the data which it's inserted into hash map successed.
2023-07-28 13:54:10 +08:00
Pxl
9451382428 [Improvement](aggregate) optimization for AggregationMethodKeysFixed::insert_keys_into_columns (#22216)
optimization for AggregationMethodKeysFixed::insert_keys_into_columns
2023-07-26 16:19:15 +08:00
6512893257 [refactor](vectorized) Remove useless control variables to simplify aggregation node code (#22026)
* [refactor](vectorized) Remove useless control variables to simplify aggregation node code

* fix
2023-07-21 12:45:23 +08:00
254f76f61d [Agg](exec) support aggregation_node limit short circuit (#21767) 2023-07-14 00:29:19 +08:00
2c9bdd64fa [fix](memory) arena support memory reuse after clear() (#21033) 2023-06-21 23:27:21 +08:00
a6f625676b [profile](remove child) child is for node, should not be used to organize counters (#20676)
Currently, there are many profiles using add child profile to orgnanize profile into blocks. But it is wrong. Child profile will have a total time counter. Actually, what we should use is just a label.

                          -  MemoryUsage:  
                              -  HashTable:  23.98  KB
                              -  SerializeKeyArena:  446.75  KB
Add a new macro ADD_LABEL_COUNTER to add just a label in the profile.

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-06-12 10:00:35 +08:00
9f8de89659 [refactor](exec) replace the single pointer with an array of 'conjuncts' in ExecNode (#19758)
Refactoring the filtering conditions in the current ExecNode from an expression tree to an array can simplify the process of adding runtime filters. It eliminates the need for complex merge operations and removes the requirement for the frontend to combine expressions into a single entity.

By representing the filtering conditions as an array, each condition can be treated individually, making it easier to add runtime filters without the need for complex merging logic. The array can store the individual conditions, and the runtime filter logic can iterate through the array to apply the filters as needed.

This refactoring simplifies the codebase, improves readability, and reduces the complexity associated with handling filtering conditions and adding runtime filters. It separates the conditions into discrete entities, enabling more straightforward manipulation and management within the execution node.
2023-05-29 11:47:31 +08:00
Pxl
b927f8cd37 [Chore](asan) change asan_suppr from interceptor_via_lib to interceptor_via_fun (#19636)
change asan_suppr from interceptor_via_lib to interceptor_via_fun
2023-05-16 10:51:43 +08:00
f23c93b3c6 [fix](memory) Fix AggFunc memory leak due to incorrect destroy (#19126) 2023-04-27 14:58:32 +08:00
c4e469c82c [feature](agg) Support spill to disk in aggregation (#18051) 2023-04-20 18:59:08 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
d9fe5f7b67 [enhancement](memory) Remove MemPool and replace it with Arena (#17820)
Arena can replace MemPool in most scenarios. Except for memory reuse, MemPool supports reuse of previous memory chunks after clear, but Arena does not.

Some comparisons between MemPool and Arena:

 1. Expansion
     Arena is less than 128M index 2 alloc chunk; more than 128M memory, allocate 128M * n > `size`, n is equal to the minimum value that satisfies the expression;
     MemPool less than 512K index 2 alloc chunk, greater than 512K memory, separately apply for a `size` length chunk
     
     After Arena applied for a chunk larger than 128M last time, the minimum chunk applied for after that is 128M. Does this seem to be a waste of memory? MemPool is also similar. After the chunk of 512K was applied for last time, the minimum chunk of subsequent applications is 512K.

 2. Alignment
     MemPool defaults to 16 alignment, because memtable and other places that use int128 require 16 alignment;
     Arena has no default alignment;

 3. Memory reuse
     Arena only supports `rollback`, which reuses the memory of the current chunk, usually the memory requested last time.
     MemPool supports clear(), all chunks can be reused; or call ReturnPartialAllocation() to roll back the last requested memory; if the last chunk has no memory, search for the most free chunk for allocation

 4. Realloc
     Arena supports realloc contiguous memory; it also supports realloc contiguous memory from any position at the time of the last allocation. The difference between `alloc_continue` and `realloc` is:
         1. Alloc_continue does not need to specify the old size, but the default old size = head->pos - range_start
         2. alloc_continue supports expansion from range_start when additional_bytes is between head and pos, which is equivalent to reusing a part of memory, while realloc completely allocates a new memory
     MemPool does not support realloc, but supports transferring or absorbing chunks between two MemPools

 5. check mem limit
     MemPool checks the mem limit, and Arena checks at the Allocator layer.

 6. Support for ASAN
     Arena does something extra

 7. Error handling
     MemPool supports returning the error message of application failure directly through `Status`, and Arena throws Exception.
Tests that Arena can consider

 1. After the last applied chunk is larger than 128M, the minimum applied chunk is 128M, which seems to waste memory;

 2. Support clear, memory multiplexing;

 3. Increase the large list, alloc the memory larger than 128M, and the size is equal to `size`, so as to avoid the current chunk not being fully used, which is wasteful.

 4. In some cases, it may be possible to allocate backwards to find chunks t
2023-03-29 20:56:49 +08:00
Pxl
ca73c60442 [Chore](build) enable ignored-qualifiers check (#16196)
enable ignored-qualifiers check
2023-02-01 15:15:59 +08:00
a9671b6dfd [feature](agg)support two level-hash map in aggregation node (#15967) 2023-01-30 16:43:33 +08:00
1ec88cbff6 [fix](nereids) AggregationNode process null as key column in wrong way (#16125)
in AggregationNode, _merge_with_serialized_key_helper method should convert the key column to full column if the key column is null literal.
2023-01-29 20:12:07 +08:00
Pxl
81bab55d43 [Bug](function) catch function calculation error on aggregate node to avoid core dump (#15903) 2023-01-16 11:21:28 +08:00
Pxl
93f5e440eb [Bug](execute) fix get next non stop for eos on streaming preagg (#15611)
* fix get nnext non stop for eos on streaming preagg

* update
2023-01-05 09:36:11 +08:00
b085ff49f0 [refactor](non-vec) delete non-vec data sink (#15283)
* [refactor](non-vec) delete non-vec data sink

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2022-12-23 14:10:47 +08:00
b30cd86e9e [Refactor](pipeline) Refactor operator and builder code of pipeline (#14787) 2022-12-05 18:35:00 +08:00
8c0e13ab51 [improvement](profile) add detail memory counter for exec nodes (#14806)
* [improvement](profile) improve accuraccy of memory usage and add detail memory counter

* fix
2022-12-05 11:51:52 +08:00
12304bc0ee [Pipeline](exec) Support pipeline exec engine (#14736)
Co-authored-by: Lijia Liu <liutang123@yeah.net>
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: Jerry Hu <mrhhsg@gmail.com>
Co-authored-by: Pxl <952130278@qq.com>
Co-authored-by: shee <13843187+qzsee@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>

## Problem Summary:

### 1. Design

DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-027%3A+Support+Pipeline+Exec+Engine

### 2. How to use:

Set the environment variable `set enable_pipeline_engine = true; `
2022-12-02 17:11:34 +08:00
176f519fa1 [enhancement](memtracker) Optimize exec node memory tracking (#14711) 2022-12-01 14:52:21 +08:00
1520e5c88a [enhancement](agg)use new method to serialize keys in batch if the key is too large (#14484)
* [enhancement](agg)use new method to serialize keys in batch if the key is too large

* fix compile error
2022-11-23 17:35:39 +08:00
1f326fc0d6 [enhancement](be)limit mem cost to 16m when pre serialize keys in agg node (#14321)
* [enhancement](be)limit mem cost to 16m when pre serialize keys in agg node

* use only one chunk memory when serializing keys in agg node
2022-11-18 12:31:52 +08:00
6d2e6d85d3 [enhancement](be)release memory in Node's close() method (#14258)
* [enhancement](be)release memory in Node's close() method

* format code
2022-11-15 15:59:23 +08:00
4bc33a54a1 [Fix](agg) fix bitmap agg core dump when phmap pointer assert alignment (#13381) 2022-10-15 10:39:23 +08:00
8f4bb0f804 [improvement](agg) iterate aggregation data in memory written order (#12704)
Following the iteration order of the hash table will result in out-of-order access to aggregate states, which is very inefficient.
Traversing aggregate states in memory write order can significantly improve memory read efficiency.

Test
hash table items count: 3.35M

Before this optimization: insert keys into column takes 500ms
With this optimization only takes 80ms
2022-09-21 14:58:50 +08:00
8e4374b7ec [enhancement](agg)remove unnessasery mem alloc and dealloc in agg node (#12535) 2022-09-15 11:07:06 +08:00
Pxl
0ead048b93 [Enhancement](column) remove ColumnString terminating zero and add a data_version for pblock (#12456)
1. remove ColumnString terminating zero
    2. add a data_version for pblock
    3. change EncryptionMode to enum class
2022-09-14 21:25:22 +08:00
3485dfa927 [chore](profile) add some counters in aggregatation & sender (#12385) 2022-09-07 10:09:05 +08:00
dc8f64b3e3 [improvement](agg) Serialize the fixed-length aggregation results with corresponding columns instead of ColumnString (#11801) 2022-08-22 10:12:06 +08:00
Pxl
cac317430f [Bug](aggregation) fix core dump on 2nd phase aggregate (#11843) 2022-08-18 14:42:34 +08:00
092a394782 [improvement](agg)limit the output of agg node (#11461)
* [improvement](agg)limit the output of agg node
2022-08-05 07:53:55 +08:00
842a5b8e24 [refactor](agg) Abstract the hash operation into a method" (#11399) 2022-08-02 17:27:19 +08:00
0325fa436e [fix](agg)Add field of 'is_first_phase' in TAggregationNode (#11321) 2022-08-01 11:49:50 +08:00
b74f36e009 [improvement]Use phmap for aggregation with integer keys (#11175) 2022-07-27 13:58:20 +08:00
b7c9007776 [improvement][agg]Process aggregated results in the vectorized way (#11084) 2022-07-22 22:04:43 +08:00
4960043f5e [enhancement] Refactor to improve the usability of MemTracker (step2) (#10823) 2022-07-21 17:11:28 +08:00
899acb6564 [improvement][agg]import sub hashmap (#10937) 2022-07-18 18:36:45 +08:00
d1573e1a4a [improvement]Use phmap for aggregation with serialized key (#10821) 2022-07-14 11:26:09 +08:00
e293fbd277 [improvement]pre-serialize aggregation keys (#10700) 2022-07-09 06:21:56 +08:00
476be35961 [TYPO] fix typo 'destory' -> 'destroy' (#10373) 2022-06-24 19:11:28 +08:00
eeae516e37 [Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476)
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G

Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
2022-03-20 23:06:54 +08:00
50864aca7d [refactor] fix warings when compile with clang (#8069) 2022-02-19 11:29:02 +08:00
505acae931 [fix](vectorization) make sure the mem address use in agg is align in proper way before use (#7960) 2022-02-08 10:05:03 +08:00
015371ac72 [fix](grouping-set) Fix the bug of grouping set core in both vec and non vec query engine (#7800) 2022-01-26 16:15:30 +08:00