[bugfix](column_reader) index_page should not be pre-decoded (#16605)

With the current logic, index pages also go through pre-decoding. This normally
returns OK without doing anything: index pages are built with
BinaryPlainPageBuilder, so the first 4 bytes of the page are an offset, which is
very unlikely to equal EncodingTypePB::DICT_ENCODING (5).
Code in bitshuffle_page_pre_decode.h:
```
        if constexpr (USED_IN_DICT_ENCODING) {
            auto type = decode_fixed32_le((const uint8_t*)&data.data[0]);
            if (static_cast<EncodingTypePB>(type) != EncodingTypePB::DICT_ENCODING) {
                return Status::OK();
            }
            size_of_dict_header = BINARY_DICT_PAGE_HEADER_SIZE;
            data.remove_prefix(4);
        }
```
But if those bytes happen to equal EncodingTypePB::DICT_ENCODING, the
BinaryPlainPage is decoded with BitShuffle, which leads to a fatal error.
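
The collision described above can be illustrated with a minimal standalone sketch. This is not Doris code: `decode_fixed32_le` is reimplemented locally on the assumption that it is a plain little-endian decode, `kDictEncoding` is a stand-in for EncodingTypePB::DICT_ENCODING (5, per the text above), and `looks_like_dict_page` plus the byte buffers are hypothetical names and data made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for EncodingTypePB::DICT_ENCODING (value 5 per the commit message).
constexpr uint32_t kDictEncoding = 5;

// Local little-endian fixed32 decode (assumed to match the semantics of the
// decode_fixed32_le used in the snippet above).
inline uint32_t decode_fixed32_le(const uint8_t* buf) {
    return static_cast<uint32_t>(buf[0]) |
           (static_cast<uint32_t>(buf[1]) << 8) |
           (static_cast<uint32_t>(buf[2]) << 16) |
           (static_cast<uint32_t>(buf[3]) << 24);
}

// Hypothetical helper mirroring only the type check of the pre-decode step:
// it looks at the first 4 bytes and nothing else.
inline bool looks_like_dict_page(const uint8_t* page) {
    return decode_fixed32_le(page) == kDictEncoding;
}

// Two made-up plain pages that differ only in their leading 4 bytes.
inline void check_examples() {
    const uint8_t typical_page[] = {0x00, 0x00, 0x00, 0x00, 'k', 'e', 'y'};
    const uint8_t unlucky_page[] = {0x05, 0x00, 0x00, 0x00, 'k', 'e', 'y'};
    assert(!looks_like_dict_page(typical_page)); // early return, page untouched
    assert(looks_like_dict_page(unlucky_page));  // would be treated as a dict page
}
```

The check cannot tell a real dict page header from a plain page whose leading bytes encode 5, so the only safe option is to never run it on pages that are not data pages.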
yixiutt
2023-02-14 00:06:14 +08:00
committed by GitHub
parent 89754eb200
commit de725d5d44

@@ -211,6 +211,10 @@ Status ColumnReader::read_page(const ColumnIteratorOptions& iter_opts, const Pag
     opts.type = iter_opts.type;
     opts.encoding_info = _encoding_info;
     opts.io_ctx = iter_opts.io_ctx;
+    // index page should not pre decode
+    if (iter_opts.type == INDEX_PAGE) {
+        opts.pre_decode = false;
+    }
     return PageIO::read_and_decompress_page(opts, handle, page_body, footer);
 }
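
The effect of the change can be sketched in isolation as follows. This is a simplified illustration, not the actual Doris types: `PageType`, `PageReadOptions`, and `make_read_options` are hypothetical stand-ins, and the `pre_decode = true` default is an assumption made for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for the page type enum used in the diff.
enum PageType { DATA_PAGE, INDEX_PAGE };

// Hypothetical, reduced version of the read options; the default value of
// pre_decode is an assumption for this sketch.
struct PageReadOptions {
    PageType type = DATA_PAGE;
    bool pre_decode = true;
};

// Builds read options and applies the rule from the diff:
// index pages must never be pre-decoded.
inline PageReadOptions make_read_options(PageType type) {
    PageReadOptions opts;
    opts.type = type;
    if (type == INDEX_PAGE) {
        opts.pre_decode = false;
    }
    return opts;
}
```

With this gate in place, the type check in the pre-decode step is never reached for index pages, regardless of what their first 4 bytes contain.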