Doris have use RLE to encoding/decoding integer.
Four types are comprised of the RLE encoding/decoding algorithm.
Short Repeat : used for short repeating integer sequences.
Direct : used for integer sequences whose values have a relatively constant bit width.
Patched Base : used for integer sequences whose bit widths varies a lot.
Delta : used for monotonically increasing or decreasing sequences.
This bug occurs in Patched Base Type for large negative number.
In patched base, base value is stored 1 to 8 bytes and encoding to 0 ~ 7.
If the base value is 8 byte, the encoding value for base width should be 7.
But now will encoding to 8, this is problem.
It will result in inconsistent data with loaded data because wrong encoding procedure.
In extreme case, the BE process will be cored dump because illegal address.
NOTE: This patch would modify all Backend's data.
And this will cause a very long time to restart be.
So if you want to interferer your product environment,
you should upgrade backend one by one.
1. Refactoring be is to clarify the structure the codes.
2. Use unique id to indicate a rowset.
Nameing rowset with tablet_id and version will lead to
many conflicts among compaction, clone, restore.
3. Extract an rowset interface to encapsulate rowsets
with different format.
Class definition of ByteBuffer duplicates between olap/byte_buffer.h and util/byte_buffer.h.
All of the two classes has a function names as remaining().
Some place which want to call remaining() of util/byte_buffer.h is linked to the other remaining() function of olap/byte_buffer.h