6 Commits

Author SHA1 Message Date
efdb3b79a5 [feature] add zstd compression codec (#9747)
ZSTD compression is fast and has a high compression ratio. It can be used to achieve a higher compression ratio
than the default Lz4f codec when storing cost-sensitive data such as logs.

Compared to the Lz4f codec, the zstd codec produced a compressed size about 35% smaller, was about 30% faster on the
first read (without OS page cache), and was about 40% slower on the second read (with OS page cache) in the following comparison test.

test data: 25GB text log, 110 million rows
test table: test_table(ts varchar(30), log string)
test SQL: set enable_vectorized_engine=1; select sum(length(log)) from test_table
be.conf: disable_storage_page_cache = true
(this config disables the Doris page cache so that the data is not all cached in memory, which lets the test measure real decompression speed)
test result

master branch with lz4f codec result: 
- compressed size: 4.3G
- SQL first exec time (read data from disk + decompress + a little computation): 18.3s
- SQL second exec time (read data from OS page cache + decompress + a little computation): 2.4s

this branch with zstd codec (enabled by hardcoding) result:
- compressed size: 2.8G
- SQL first exec time: 12.8s
- SQL second exec time: 3.4s
2022-05-27 21:56:18 +08:00
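
The percentages quoted in the commit message above can be re-derived from the raw figures it reports. The following is a minimal sketch of that arithmetic only; the helper name `percent_change` and the dictionary keys are illustrative and are not part of the Doris code base.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent.

    A negative value means the candidate is smaller (or faster) than the baseline.
    """
    return (candidate - baseline) / baseline * 100.0


# Figures copied from the commit message: size in GB, times in seconds.
lz4f = {"size_gb": 4.3, "first_read_s": 18.3, "second_read_s": 2.4}
zstd = {"size_gb": 2.8, "first_read_s": 12.8, "second_read_s": 3.4}

# Compare zstd against the lz4f baseline for every reported metric.
changes = {key: percent_change(lz4f[key], zstd[key]) for key in lz4f}

for key, value in changes.items():
    print(f"{key}: {value:+.1f}%")
```

With the reported numbers this gives roughly -34.9% for compressed size, -30.1% for the first read, and +41.7% for the second read, which matches the rounded 35% / 30% / 40% quoted in the message.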
e0c790094c [enhancement][betarowset]optimize lz4 compress and decompress speed by reusing context (#9566) 2022-05-15 21:18:32 +08:00
5a44eeaf62 [refactor] Unify all unit tests into one binary file (#8958)
1. Solved the previously outstanding problems that the unit test file size was too large (1.7G+) and the unit test link time was too long
2. Unified all unit tests into one file, significantly reducing unit test execution time to less than 3 mins
3. Temporarily disabled stream_load_test.cpp, metrics_action_test.cpp, and load_channel_mgr_test.cpp because they re-implement part of the code and affect other tests
2022-04-12 15:30:40 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all C++ source files.
2020-11-28 18:36:49 +08:00
acf868c9d0 Support page compression and checksum in BetaRowset (#1646) 2019-08-19 09:40:47 +08:00
c0253a17fc Add block compression codec and remove not used codec (#1622) 2019-08-12 20:47:16 +08:00