when config::enable_simdjson_parser=true in vec streamload, may lead to core dump when json input invalid format string like '{ "a', or all the fields is null like '{}', this may lead to simdjson lib throw some unhandled expection like `Objects and arrays can only be iterated when they are first encountered`.We should take care of these cases
Signed-off-by: eldenmoon <15605149486@163.com>
The column id check in VSlotRef::execute function before is too strict for fuzzy test to continuously produce random query. Temporarily loosen the check logic.
Moreover, there exists some careless call to VExpr::get_const_col, it might return a nullptr but not every function call checks if it's valid. It's an underlying problem.
Currently we use rapidjson to parse json document, It's fast but not fast enough compare to simdjson.And I found that the simdjson has a parsing front-end called simdjson::ondemand which will parse json when accessing fields and could strip the field token from the original document, using this feature we could reduce the cost of string copy(eg. we convert everthing to a string literal in _write_data_to_column by sprintf, I saw a hotspot from the flamegrame in this function, using simdjson::to_json_string will strip the token(a string piece) which is std::string_view and this is exactly we need).And second in _set_column_value we could iterate through the json document by for (auto field: object_val) {xxx}, this is much faster than looking up a field by it's field name like objectValue.FindMember("k1").The third optimization is the at_pointer interface simdjson provided, this could directly get the json field from original document.
Disable Chunk Allocator in Vectorized Allocator, this will reduce memory cache.
For high concurrent queries, using Chunk Allocator with vectorized Allocator can reduce the impact of gperftools tcmalloc central lock.
Jemalloc or google tcmalloc have core cache, Chunk Allocator may no longer be needed after replacing gperftools tcmalloc.
The bug is caused by use _num_rows_read for limit check. _num_rows_read is count of rows read from storage, but may be filtered by filter_block for WHERE predicate.
Add a _num_rows_return, which is rows after filter_block for WHERE predicate, for count for really returned rows.
Currently, only the virtual memory used by the query can be tracked through the tcmalloc hook. When the memory is not fully used after the application, the recorded virtual memory will be larger than the physical memory.
At present, it is mainly because PODArray does not memset 0 when applying for memory, and blocks applied for through PODArray in places such as VOlapScanNode::_free_blocks are usually used for memory reuse and cannot be fully used.
Two improvements have been added:
1. Translate parquet physical type into doris logical type.
2. Decode parquet column chunk into doris ColumnPtr, and add unit tests to show how to use related API.
We should close eof scanners before transfer done, otherwise,
they are closed until scannode is closed. Because plan is closed
after the plan is finished, so query profile would leak stats from
scanners closed by scannode::close. e.g. SegmentTotalNum in profile
is less.
# Proposed changes
Read and decode parquet physical type.
1. The encoding type of boolean is bit-packing, this PR introduces the implementation of bit-packing from Impala
2. Create a parquet including all the primitive types supported by hive
## Remaining Problems
1. At present, only physical types are decoded, and there is no corresponding and conversion methods with doris logical.
2. No parsing and processing Decimal type / Timestamp / Date.
3. Int_8 / Int_16 is stored as Int_32. How to resolve these types.
* [feature](planner): push limit to olapscan when meet sort.
* if olap_scan_node's sort_info is set, push sort_limit, read_orderby_key
and read_orderby_key_reverse for olap scanner
* There is a common query pattern to find latest time serials data.
eg. SELECT * from t_log WHERE t>t1 AND t<t2 ORDER BY t DESC LIMIT 100
If the ORDER BY columns is the prefix of the sort key of table, it can
be greatly optimized to read much fewer data instead of read all data
between t1 and t2.
By leveraging the same order of ORDER BY columns and sort key of table,
just read the LIMIT N rows for each related segment and merge N rows.
1. set read_orderby_key to true for read_params and _reader_context
if olap_scan_node's sort info is set.
2. set read_orderby_key_reverse to true for read_params and _reader_context
if is_asc_order is false.
3. rowset reader force merge read segments if read_orderby_key is true.
4. block reader and tablet reader force merge read rowsets if read_orderby_key is true.
5. for ORDER BY DESC, read and compare in reverse order
5.1 segment iterator read backward using a new BackwardBitmapRangeIterator and
reverse the result block before return to caller.
5.2 VCollectIterator::LevelIteratorComparator, VMergeIteratorContext return
opposite result for _is_reverse order in its compare function.
Co-authored-by: jackwener <jakevingoo@gmail.com>