doris/be
Xinyi Zou f5a35c28e9 [Optimize] [Memory] BitShufflePageDecoder use memory allocated by ChunkAllocator instead of Faststring (#6515)
BitShufflePageDecoder now reuses the memory that stores decoded results and allocates that memory directly from 
`ChunkAllocator`, which improves performance to a certain extent.

In the case of #6285, total time consumption is reduced by 13.5%, and the share of time spent in `~Reader()` 
drops from 17.65% to 1.53%. Memory allocation is also unified under `ChunkAllocator` for centralized 
management, which is conducive to subsequent memory optimization.

Allocating directly from `ChunkAllocator` can also avoid the memory waste caused by `Mempool`, because a chunk can be 
freed at any time, but performance is lower than allocating from `Mempool`. The guess is that, without `Mempool` 
performing secondary allocation out of large chunks, a large number of small chunks are requested directly from 
`ChunkAllocator`, and more time is spent taking the lock in `pop_free_chunk` and `push_free_chunk` (although the flame 
graphs of BE's CPU and contention do not confirm this).
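
To make the suspected lock cost concrete, here is a minimal sketch of a chunk allocator with per-size-class free lists behind a single mutex. It is an illustration under assumptions, not the Doris implementation: `ToyChunkAllocator`, `Chunk`, `round_up_pow2`, `release` and `reused_count` are hypothetical names, and only `pop_free_chunk` and `push_free_chunk` are borrowed from the text above. Every allocate and release takes the lock once, so many small requests mean many lock acquisitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <mutex>
#include <vector>

// Hypothetical chunk: a raw buffer plus its (rounded-up) capacity.
struct Chunk {
    uint8_t* data = nullptr;
    size_t size = 0;
};

// Round up to the next power of two so freed chunks fall into size classes.
inline size_t round_up_pow2(size_t n) {
    size_t p = 1;
    while (p < n) p <<= 1;
    return p;
}

// Toy allocator (assumption, not Doris code): one free list per
// power-of-two size class, all guarded by one mutex.
class ToyChunkAllocator {
public:
    ~ToyChunkAllocator() {
        for (auto& list : _free_lists)
            for (uint8_t* p : list) std::free(p);
    }

    // Reuse a cached chunk of the right size class, else fall back to malloc.
    bool allocate(size_t size, Chunk* chunk) {
        assert(size > 0 && size <= (size_t{1} << 30));
        chunk->size = round_up_pow2(size);
        if (pop_free_chunk(chunk->size, &chunk->data)) return true;
        chunk->data = static_cast<uint8_t*>(std::malloc(chunk->size));
        return chunk->data != nullptr;
    }

    // Return the chunk to the free list so a later allocate can reuse it.
    void release(const Chunk& chunk) { push_free_chunk(chunk.data, chunk.size); }

    size_t reused_count() {
        std::lock_guard<std::mutex> guard(_lock);
        return _reused;
    }

private:
    // Index of the size class for a power-of-two size (log2).
    static size_t size_class(size_t pow2) {
        size_t idx = 0;
        while ((size_t{1} << idx) < pow2) ++idx;
        return idx;
    }

    bool pop_free_chunk(size_t size, uint8_t** ptr) {
        std::lock_guard<std::mutex> guard(_lock);  // one lock per allocation
        auto& list = _free_lists[size_class(size)];
        if (list.empty()) return false;
        *ptr = list.back();
        list.pop_back();
        ++_reused;
        return true;
    }

    void push_free_chunk(uint8_t* ptr, size_t size) {
        std::lock_guard<std::mutex> guard(_lock);  // one lock per release
        _free_lists[size_class(size)].push_back(ptr);
    }

    std::mutex _lock;
    std::vector<uint8_t*> _free_lists[64];
    size_t _reused = 0;
};
```

In this sketch a caller that allocates a chunk, releases it, and then asks for another chunk in the same size class gets the same buffer back without touching `malloc`. A pool that carves many small allocations out of one large chunk would take the lock once per large chunk rather than once per small request, which is the trade-off the guess above describes.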
2021-11-17 11:20:21 +08:00