As described in the issue, running compaction and a schema change at the same time may lead to version intersection.
Overview of changes:
1. Do not run compaction before the schema change is actually executed.
2. Mark the tablet as bad when it has a version intersection.
3. Do not run the schema change when it cannot find appropriate versions to delete in the new tablet.
4. Do not change rowsets after compaction if the rowsets of the tablet have changed.
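To make "version intersection" concrete, here is a minimal sketch of how overlapping rowset versions could be detected. It assumes each rowset version is an inclusive `(start, end)` range; the function name `find_version_intersections` is hypothetical and is not taken from the Doris code base.

```python
from typing import List, Optional, Tuple

# Assumption: a rowset version is an inclusive (start_version, end_version) range.
Version = Tuple[int, int]


def find_version_intersections(versions: List[Version]) -> List[Tuple[Version, Version]]:
    """Pair each overlapping range with the earlier range that reaches furthest."""
    overlaps: List[Tuple[Version, Version]] = []
    furthest: Optional[Version] = None  # earlier range with the largest end so far
    for current in sorted(versions):
        # Inclusive ranges overlap when the new start is not past the furthest end.
        if furthest is not None and current[0] <= furthest[1]:
            overlaps.append((furthest, current))
        if furthest is None or current[1] > furthest[1]:
            furthest = current
    return overlaps
```

A tablet whose versions are `[(0, 5), (6, 10)]` yields no pairs, while `[(0, 5), (3, 8)]` yields one pair, which is the situation item 2 treats as a bad tablet.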
* handle ColumnDictionary in evaluate_or
We need to handle ColumnDictionary in evaluate_or, otherwise the delete handler
would trigger a core dump.
* handle ColumnDictionary in evaluate_and
Because the only difference between evaluate_and and evaluate_or is the
operation (or versus delete), I merged the two macros into one.
Delete handlers also trigger evaluate_and; I am not sure whether a column
dictionary would be used in evaluate_and.
* clang format
* fix short circuit for evaluate_and and evaluate_or
* clang format
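The short-circuit behaviour mentioned in the commits above can be illustrated with a small sketch. This is not the Doris macro: it assumes a per-row boolean `flags` list that each predicate updates in place, and the Python functions here only borrow the names `evaluate_and` and `evaluate_or` for illustration.

```python
from typing import Callable, List

Predicate = Callable[[int], bool]


def evaluate_and(values: List[int], flags: List[bool], pred: Predicate) -> None:
    """AND the predicate into flags in place, skipping rows that are already False."""
    for i, value in enumerate(values):
        if not flags[i]:
            continue  # short circuit: False AND anything stays False
        flags[i] = pred(value)


def evaluate_or(values: List[int], flags: List[bool], pred: Predicate) -> None:
    """OR the predicate into flags in place, skipping rows that are already True."""
    for i, value in enumerate(values):
        if flags[i]:
            continue  # short circuit: True OR anything stays True
        flags[i] = pred(value)
```

In both functions the predicate is only evaluated for rows whose result can still change, which is the point of the short circuit.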
ZSTD compression is fast and has a high compression ratio. It can be used to achieve a higher compression ratio
than the default Lz4f codec when storing cost-sensitive data such as logs.
Compared to the Lz4f codec, the zstd codec produces a 35% smaller compressed size, is 30% faster on the first read without the OS page
cache, and is 40% slower on the second read with the OS page cache in the following comparison test.
test data: 25GB text log, 110 million rows
test table: test_table(ts varchar(30), log string)
test SQL: set enable_vectorized_engine=1; select sum(length(log)) from test_table
be.conf: disable_storage_page_cache = true
Set this config to disable the Doris page cache, so that data is not all cached in memory and the test measures real decompression speed.
test result
master branch with lz4f codec result:
- compressed size: 4.3G
- SQL first exec time(read data from disk + decompress + little computation) : 18.3s
- SQL second exec time(read data from OS pagecache + decompress + little computation) : 2.4s
this branch with zstd codec (hard-coded to be enabled) result:
- compressed size: 2.8G
- SQL first exec time: 12.8s
- SQL second exec time: 3.4s
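The percentages quoted in the summary can be reproduced from the raw numbers above. The following is a small arithmetic check; the helper name `percent_change` is only illustrative.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


# lz4f (master branch) is the baseline, zstd (this branch) is the candidate.
size_change = percent_change(4.3, 2.8)          # compressed size in GB
first_read_change = percent_change(18.3, 12.8)  # first exec time in seconds
second_read_change = percent_change(2.4, 3.4)   # second exec time in seconds

print(f"compressed size: {size_change:+.1f}%")
print(f"first exec time: {first_read_change:+.1f}%")
print(f"second exec time: {second_read_change:+.1f}%")
```

The changes come out at roughly -34.9%, -30.1% and +41.7%, which matches the rounded figures of 35% smaller, 30% faster and 40% slower.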
1. Fix the negative consumption value of the LRU Cache MemTracker.
2. Fix the compaction Cache MemTracker not tracking any memory.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
* [Enhancement][Vectorized] build hash table in a new thread, as the non-vectorized engine does
edit after review comments
* format code with clang format
Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: stephen <hello-stephen@qq.com>
Currently, the libhdfs3 library integrated into Doris BE does not support accessing clusters with Kerberos authentication
enabled, and we found that the Kerberos-related dependencies (gsasl and krb5) were not added when building libhdfs3.
This PR therefore enables Kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:
- gsasl version: 1.8.0
- krb5 version: 1.19