Reuse compression ctx and buffer.
Use one global context instance for each compression algorithm, and use a
thread-safe buffer pool to reuse compression buffers. The pool size equals the
maximum number of threads compressing in parallel, so it will not grow too large.
Tests show this feature speeds up data import and compaction by about 5%.
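A minimal sketch of what such a thread-safe buffer pool can look like (names and types here are illustrative, not the actual Doris implementation):
```
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Buffers are handed out to compressing threads and returned afterwards,
// so the pool never holds more buffers than the number of threads that
// compress in parallel.
class CompressionBufferPool {
public:
    std::unique_ptr<std::string> acquire() {
        std::lock_guard<std::mutex> lock(_mutex);
        if (_buffers.empty()) {
            return std::make_unique<std::string>();
        }
        auto buf = std::move(_buffers.back());
        _buffers.pop_back();
        return buf;
    }

    void release(std::unique_ptr<std::string> buf) {
        buf->clear(); // keep the allocated capacity, drop the contents
        std::lock_guard<std::mutex> lock(_mutex);
        _buffers.push_back(std::move(buf));
    }

private:
    std::mutex _mutex;
    std::vector<std::unique_ptr<std::string>> _buffers;
};
```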
Co-authored-by: yixiutt <yixiu@selectdb.com>
Store the offset rather than the length in the file for array-type data. The new file format improves seek performance; see #12246 for the performance report.
Co-authored-by: xy720 <22125576+xy720@users.noreply.github.com>
Currently, Doris has a variety of readers for different file formats,
such as parquet reader, orc reader, csv reader, json reader and so on.
The interfaces of these readers are not unified, which makes it impossible to call them through a single method.
In this PR, I added a `GenericReader` interface class; the other readers will implement this interface
and expose the `get_next_block()` method.
This PR currently only modifies `arrow_reader` and the parquet reader.
The other readers will be migrated one by one in subsequent PRs.
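A rough sketch of the unified interface (the exact signature is an assumption; `Status` and `vectorized::Block` refer to the existing Doris types):
```
namespace vectorized { class Block; } // Doris row-batch type (assumed)
class Status;                         // Doris status type (assumed)

// Common interface implemented by every file reader (parquet, orc, csv,
// json, ...). Callers only depend on GenericReader and do not need to
// know which file format is behind it.
class GenericReader {
public:
    virtual ~GenericReader() = default;
    // Fill the next batch of rows into `block`; set `*eof` when the file
    // is exhausted.
    virtual Status get_next_block(vectorized::Block* block, bool* eof) = 0;
};
```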
At the compute level, CHAR values have their trailing padding zeros trimmed.
To keep the logic consistent with CHAR, we also trim them for ARRAY and nested ARRAY<ARRAY> types.
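For illustration, trimming the trailing padding zeros of a CHAR value can look like this (a hypothetical helper, not the actual Doris code):
```
#include <string_view>

// Drop the trailing '\0' padding of a fixed-length CHAR value, so that a
// CHAR element inside an ARRAY compares equal to the same value stored in
// a plain CHAR column.
std::string_view shrink_char_padding(std::string_view value) {
    auto len = value.size();
    while (len > 0 && value[len - 1] == '\0') {
        --len;
    }
    return value.substr(0, len);
}
```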
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
For debugging purposes:
Add session variable skip_storage_engine_merge: when set to true, tables using the aggregate key and unique key models are read as if they used the duplicate key model.
Add session variable skip_delete_predicate: when set to true, rows deleted with a DELETE statement are still returned by queries.
Related PRs:
https://github.com/apache/doris/pull/11582
https://github.com/apache/doris/pull/12048
Use the new file scan node and the new scheduling framework for load jobs, replacing the old broker scan node.
The load part (BE side) is a work in progress. The query part (FE side) has been tested with the TPC-H benchmark.
Please review only the FE code in this PR; the BE code is disabled by the enable_new_load_scan_node configuration. Another PR will follow soon to finish the BE-side code.
This PR introduces a new enum type `PushDownType`:
```
enum class PushDownType {
    // The predicate can not be pushed down to the data source.
    UNACCEPTABLE,
    // The predicate can be pushed down to the data source,
    // and the data source can fully evaluate it.
    ACCEPTABLE,
    // The predicate can be pushed down to the data source,
    // but the data source can not fully evaluate it.
    PARTIAL_ACCEPTABLE
};
```
A derived class of VScanNode can override the following methods to determine whether to accept
a binary / in / function / bloom filter / is-null predicate:
```
PushDownType _should_push_down_binary_predicate();
PushDownType _should_push_down_in_predicate();
PushDownType _should_push_down_function_filter();
PushDownType _should_push_down_bloom_filter();
PushDownType _should_push_down_is_null_predicate();
```
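For example, a derived scan node might override these hooks as sketched below (the stub base class is simplified, and the real methods take parameters describing the predicate being considered):
```
// Simplified stand-in for the real VScanNode.
class VScanNode {
public:
    virtual ~VScanNode() = default;
    virtual PushDownType _should_push_down_binary_predicate() { return PushDownType::UNACCEPTABLE; }
    virtual PushDownType _should_push_down_bloom_filter() { return PushDownType::UNACCEPTABLE; }
};

class MyScanNode : public VScanNode {
public:
    PushDownType _should_push_down_binary_predicate() override {
        // This data source can evaluate binary predicates completely on its own.
        return PushDownType::ACCEPTABLE;
    }
    PushDownType _should_push_down_bloom_filter() override {
        // A bloom filter only prunes data and may keep false positives,
        // so the predicate must still be re-evaluated above the scan.
        return PushDownType::PARTIAL_ACCEPTABLE;
    }
};
```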
When the FE sends a cancel RPC to the BE, it does not notify the thread blocked in wait_for_start(), so the fragment stays blocked and occupies an execution thread.
Add a maximum wait time to wait_for_start() so that it does not block forever.
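A minimal sketch of the bounded wait (the names here are hypothetical, not the actual fragment-execution code):
```
#include <chrono>
#include <condition_variable>
#include <mutex>

struct FragmentStartGate {
    std::mutex mutex;
    std::condition_variable cv;
    bool started = false;

    // Returns true once the fragment is started; returns false after
    // max_wait, so the execution thread is released instead of blocking
    // forever when the cancel RPC never wakes it up.
    bool wait_for_start(std::chrono::seconds max_wait) {
        std::unique_lock<std::mutex> lock(mutex);
        return cv.wait_for(lock, max_wait, [this] { return started; });
    }
};
```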
When selecting a large amount of data from a table, the profile shows:
- ScannerCtxSchedCount: 2.82664M (2826640)
But ScannerSchedCount is only 8, because most of the scanners are busy running.
After this improvement, ScannerCtxSchedCount is reduced to only 10.
Reading a parquet file with many columns (>1600) failed:
mysql> select int_col from types_sf100_r100w limit 5;
ERROR 1105 (HY000): errCode = 2, detailMessage = Couldn't deserialize thrift msg:
TProtocolException: Invalid data
parse_thrift_footer uses a fixed-length buffer (64 KB) to read the parquet footer, but the metadata of a parquet file with 1600 columns can exceed 5 MB.
Therefore, the buffer needs to be allocated according to the actual footer length.
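A sketch of sizing the buffer from the file itself (a hypothetical helper, not the actual parse_thrift_footer): a parquet file ends with a 4-byte little-endian metadata length followed by the 4-byte magic "PAR1", so the footer buffer can be allocated to exactly that length.
```
#include <cstdint>
#include <cstring>
#include <fstream>
#include <vector>

std::vector<char> read_parquet_footer(std::ifstream& file, uint64_t file_size) {
    // Read the trailing 8 bytes: [metadata length (little-endian)] ["PAR1"].
    char tail[8];
    file.seekg(static_cast<std::streamoff>(file_size - 8));
    file.read(tail, 8);

    uint32_t meta_len = 0;
    std::memcpy(&meta_len, tail, 4); // assumes a little-endian host

    // Allocate exactly the footer length instead of a fixed 64 KB buffer.
    std::vector<char> footer(meta_len);
    file.seekg(static_cast<std::streamoff>(file_size - 8 - meta_len));
    file.read(footer.data(), meta_len);
    return footer;
}
```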
* [fix](threadpool) thread pool scheduling does not work correctly on a concurrent token
Assume there is a concurrent thread token whose concurrency is 2, and the 1st
task submitted on the token is dispatched to the thread pool while the 2nd is not
dispatched because the pool is busy. The token's active_threads is then 1, and the
thread pool never schedules the token again.
The patch fixes the problem.
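The corrected scheduling condition can be sketched like this (a hypothetical illustration, not the actual Doris ThreadPool code):
```
#include <cstddef>

// A CONCURRENT token must be considered schedulable whenever it still has
// queued tasks and spare concurrency, not only when it has no active
// threads; otherwise the situation above (active_threads == 1, one task
// still queued) is never rescheduled.
bool token_is_schedulable(std::size_t queued_tasks, int active_threads, int max_concurrency) {
    return queued_tasks > 0 && active_threads < max_concurrency;
}
```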
We already separated the array Offset64 and the string Offset (32-bit) in PR #12341.
Now we restrict their use: Offset only inside IColumn, Offset64 only inside ColumnArray, to avoid misuse.
If the wrong one is used, compilation fails.
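For illustration, making the two offset kinds distinct types (rather than plain integer typedefs) is what turns misuse into a compile error; the definitions below are a sketch, not the actual Doris code:
```
#include <cstdint>
#include <vector>

struct Offset   { uint32_t value; };  // 32-bit string offsets, used inside IColumn
struct Offset64 { uint64_t value; };  // 64-bit array offsets, used only inside ColumnArray

using Offsets   = std::vector<Offset>;
using Offsets64 = std::vector<Offset64>;

void append_array_offset(Offsets64& offsets, Offset64 next) {
    offsets.push_back(next);
}

// append_array_offset(string_offsets, Offset{10}); // does not compile: wrong offset kind
```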
* Use pooled connections and enlarge the brpc connection timeout and retry count
When a connection failure happens, Doris fails the queries using that connection.
We should lower the impact of a connection failure by using pooled connections
and enlarging the connection timeout and retry count.
* clang format
## Fix five bugs:
1. Parquet dictionary data may be compressed, but `ColumnChunkReader` tried to parse the dictionary data before creating the compression codec, causing unexpected data errors.
2. `FE` doesn't resolve the array type.
3. `ParquetFileHdfsScanner` doesn't fill partition values when the table is partitioned.
4. `ParquetFileHdfsScanner` sets `_scanner_eof = true` when a scan range is empty, ending the scanner early and resulting in data loss.
5. Typographical error in `PageReader`.
When the load channel is canceled, the memtracker does not subtract the memory released by the load channel. This will cause the memory usage counted by the memtracker of the load channel mgr to be larger than the actual memory usage.
Each NodeChannel has its own queue, whose size can grow up to 1/20 of exec_mem_limit.
Users can run into OOM if exec_mem_limit is set high. This commit uses a
fixed number to cap the total memory used by all NodeChannels.
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>