Row mode schema change sometimes fails because the memory limit is exceeded.
When the remaining memory is enough for sorting but not enough for the next block,
it does not flush the data held in row_block_arr before allocating the next block,
so the allocation fails and it returns directly.
Instead, when allocating a block fails, it should flush row_block_arr and retry the
allocation, giving up only when row_block_arr is empty (see the sketch below).
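A minimal sketch of the retry logic described above, using a hypothetical RowBlock type and flush_row_blocks helper rather than the actual schema-change code:

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <new>
#include <vector>

// Illustrative stand-in for a row block; names here are hypothetical.
struct RowBlock {
    std::vector<char> buf;
    explicit RowBlock(size_t bytes) : buf(bytes) {}
};

// Flush the sorted blocks currently held in memory and release them (stubbed).
void flush_row_blocks(std::vector<std::unique_ptr<RowBlock>>& blocks) {
    std::printf("flushing %zu in-memory blocks\n", blocks.size());
    blocks.clear();
}

// Allocate the next block; if allocation fails, flush row_block_arr and retry,
// giving up only when there is nothing left in memory to flush.
std::unique_ptr<RowBlock> alloc_block_with_retry(
        std::vector<std::unique_ptr<RowBlock>>& row_block_arr, size_t bytes) {
    while (true) {
        try {
            return std::make_unique<RowBlock>(bytes);
        } catch (const std::bad_alloc&) {
            if (row_block_arr.empty()) {
                return nullptr; // genuine OOM: report the failure to the caller
            }
            flush_row_blocks(row_block_arr); // free memory, then try again
        }
    }
}
```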
Two improvements have been added:
1. Translate parquet physical types into Doris logical types (a simplified type-mapping sketch follows this list).
2. Decode parquet column chunks into Doris ColumnPtr, and add unit tests showing how to use the related API.
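As an illustration of the first improvement, a simplified mapping from parquet physical types to Doris logical types; the enum names below are placeholders, and the real translation also consults the parquet logical/converted type:

```cpp
#include <stdexcept>

// Illustrative enums only; the real code maps the parquet type metadata to Doris types.
enum class ParquetPhysicalType { BOOLEAN, INT32, INT64, FLOAT, DOUBLE, BYTE_ARRAY, FIXED_LEN_BYTE_ARRAY };
enum class DorisLogicalType { BOOLEAN, INT, BIGINT, FLOAT, DOUBLE, STRING, DECIMAL };

// Map a parquet physical type to a default Doris logical type (sketch only).
DorisLogicalType to_doris_type(ParquetPhysicalType t) {
    switch (t) {
    case ParquetPhysicalType::BOOLEAN:              return DorisLogicalType::BOOLEAN;
    case ParquetPhysicalType::INT32:                return DorisLogicalType::INT;
    case ParquetPhysicalType::INT64:                return DorisLogicalType::BIGINT;
    case ParquetPhysicalType::FLOAT:                return DorisLogicalType::FLOAT;
    case ParquetPhysicalType::DOUBLE:               return DorisLogicalType::DOUBLE;
    case ParquetPhysicalType::BYTE_ARRAY:           return DorisLogicalType::STRING;
    case ParquetPhysicalType::FIXED_LEN_BYTE_ARRAY: return DorisLogicalType::DECIMAL;
    }
    throw std::runtime_error("unknown parquet physical type");
}
```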
* [bugfix](schema change) when there is a string column with a delete predicate, the schema change may core dump
Co-authored-by: yiguolei <yiguolei@gmail.com>
We should close EOF scanners before the transfer is done; otherwise they are
not closed until the scan node is closed. Since the scan node is closed after
the plan has finished, the query profile misses stats from scanners closed by
ScanNode::close, e.g. SegmentTotalNum in the profile is lower than it should be.
A rough sketch of the intent follows.
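The sketch below uses a simplified Scanner stand-in, not the actual Doris ScanNode/scanner interfaces:

```cpp
#include <memory>
#include <vector>

// Simplified scanner stand-in; the real scanner merges its counters into the
// query profile when it is closed.
struct Scanner {
    bool eof = false;
    void close() { /* merge counters (e.g. SegmentTotalNum) into the profile */ }
};

// Close scanners that have reached EOF before the transfer is marked done,
// instead of leaving them all to ScanNode::close(), so their stats are still
// reflected in the query profile.
void close_eof_scanners(std::vector<std::unique_ptr<Scanner>>& scanners) {
    for (auto& scanner : scanners) {
        if (scanner && scanner->eof) {
            scanner->close();
            scanner.reset();
        }
    }
}
```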
# Proposed changes
Read and decode parquet physical types.
1. The encoding of boolean values is bit-packing; this PR introduces the bit-packing implementation from Impala (see the sketch after this list)
2. Create a parquet file including all the primitive types supported by Hive
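A minimal sketch of decoding bit-packed boolean values (one bit per value, least-significant bit first within each byte, as in parquet PLAIN encoding); the PR itself imports Impala's BitPacking utilities rather than this hand-rolled loop:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode bit-packed booleans: 1 bit per value, LSB first within each byte.
std::vector<bool> unpack_booleans(const uint8_t* data, size_t num_values) {
    std::vector<bool> out;
    out.reserve(num_values);
    for (size_t i = 0; i < num_values; ++i) {
        out.push_back(((data[i / 8] >> (i % 8)) & 1) != 0);
    }
    return out;
}
```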
## Remaining Problems
1. At present, only physical types are decoded; there are no corresponding conversion methods to Doris logical types.
2. Decimal / Timestamp / Date types are not yet parsed or processed.
3. Int_8 / Int_16 are stored as Int_32; how to resolve these types is still open.
* [feature](planner): push limit to olapscan when meeting sort.
* if olap_scan_node's sort_info is set, push sort_limit, read_orderby_key
and read_orderby_key_reverse down to the olap scanner
* There is a common query pattern for finding the latest time-series data,
e.g. SELECT * from t_log WHERE t>t1 AND t<t2 ORDER BY t DESC LIMIT 100.
If the ORDER BY columns are a prefix of the table's sort key, this can be
greatly optimized to read much less data instead of reading all data
between t1 and t2.
By leveraging the fact that the ORDER BY columns and the table's sort key share
the same order, we only read the top LIMIT N rows from each related segment and merge those N-row results.
1. set read_orderby_key to true for read_params and _reader_context
if olap_scan_node's sort info is set.
2. set read_orderby_key_reverse to true for read_params and _reader_context
if is_asc_order is false.
3. the rowset reader forces a merge read of segments if read_orderby_key is true.
4. the block reader and tablet reader force a merge read of rowsets if read_orderby_key is true.
5. for ORDER BY DESC, read and compare in reverse order
5.1 the segment iterator reads backward using a new BackwardBitmapRangeIterator and
reverses the result block before returning it to the caller.
5.2 VCollectIterator::LevelIteratorComparator and VMergeIteratorContext return the
opposite result in their compare functions when _is_reverse is set (a minimal
comparator sketch follows this list).
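The sketch below uses stand-in types, not the exact Doris classes: the comparison is flipped when _is_reverse is set, so the same merge machinery emits rows in descending key order.

```cpp
#include <algorithm>
#include <vector>

struct RowRef {
    int key; // stands in for the sort-key columns of a row
};

struct LevelIteratorComparator {
    bool _is_reverse = false;
    bool operator()(const RowRef& a, const RowRef& b) const {
        // Ascending comparison by default; operands are flipped for ORDER BY DESC.
        return _is_reverse ? (b.key < a.key) : (a.key < b.key);
    }
};

int main() {
    std::vector<RowRef> rows = {{1}, {3}, {2}};
    LevelIteratorComparator cmp;
    cmp._is_reverse = true;
    std::sort(rows.begin(), rows.end(), cmp);
    // rows are now in descending key order: 3, 2, 1
    return 0;
}
```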
Co-authored-by: jackwener <jakevingoo@gmail.com>
In our original design, we calculated the delete bitmap in publish txn, and this
operation cost too much time because it loads segment data and looks up row keys
in previous rowsets and segments. Since publish version tasks must run in order,
this led to timeouts in publish_txn.
In this PR, we separate the delete_bitmap calculation into two parts. One part is
done during memtable flush, so that work can run in parallel. Then we calculate the
final delete_bitmap in publish_txn: we get the set of rowset ids that should be
included and remove rowsets that have been compacted. The rowset difference between
memtable_flush and publish_txn is very small, so publish_txn becomes very fast; in
our test, publish_txn costs about 10ms. A sketch of the set-difference step follows.
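The sketch below uses an illustrative RowsetId alias and function name, not the actual Doris code: only rowsets that were not already handled at memtable flush and are still visible (not compacted away) need delete-bitmap calculation at publish time.

```cpp
#include <set>
#include <string>
#include <vector>

using RowsetId = std::string; // illustrative; Doris has a dedicated RowsetId type

// Rowsets that still need delete-bitmap calculation at publish time: those
// visible now but not already handled at memtable flush. Rowsets that have
// been compacted away simply no longer appear in visible_at_publish.
std::vector<RowsetId> rowsets_remaining_at_publish(
        const std::set<RowsetId>& handled_at_flush,
        const std::set<RowsetId>& visible_at_publish) {
    std::vector<RowsetId> remaining;
    for (const auto& id : visible_at_publish) {
        if (handled_at_flush.count(id) == 0) {
            remaining.push_back(id);
        }
    }
    return remaining;
}
```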
Co-authored-by: yixiutt <yixiu@selectdb.com>
column_ptr becomes a non-nullable column pointer after `column_ptr = &nullable_column->get_nested_column()`,
so we must not cast column_ptr to ColumnNullable anymore (see the illustration below).
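A self-contained illustration with stand-in classes (not the real Doris column types) of why the cast becomes invalid once column_ptr points at the nested column:

```cpp
#include <cassert>

// Minimal stand-ins for the nullable-column hierarchy.
struct IColumn {
    virtual ~IColumn() = default;
};
struct ColumnString : IColumn {};
struct ColumnNullable : IColumn {
    ColumnString nested;
    IColumn& get_nested_column() { return nested; }
};

int main() {
    ColumnNullable nullable_column;
    const IColumn* column_ptr = &nullable_column;

    // After this assignment, column_ptr points at the nested, non-nullable column...
    column_ptr = &nullable_column.get_nested_column();

    // ...so a cast back to ColumnNullable yields nullptr; dereferencing that
    // null pointer is what caused the crash this fix addresses.
    assert(dynamic_cast<const ColumnNullable*>(column_ptr) == nullptr);
    assert(dynamic_cast<const ColumnString*>(column_ptr) != nullptr);
    return 0;
}
```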