Scanner threads may be running and using the member vars of OlapScanNode,
when the OlapScanNode has already destroyed.
We can use `_running_thread` to be the last accessed member variable.
And `transfer_thread` need to wait for `_running_thread==0`.
After `transfer_thread` joined, `OlapScanNode::close()` can continue.
And Refactor ColumnRangeValue and OlapScanNode
This patch mainly do the following:
- Fix issue #5071
- Change type_min in ColumnRangeValue as static
- Add Class of type_limit make code clear
- Refactor the function of normalize_in_and_eq_predicate
mainly includes:
- `OLAP_SCAN_NODE` profile layering: `OLAP_SCAN_NODE`,`OlapScanner`, and `SegmentIterator`.
- Delete meaningless statistical values. mainly in scan_node.cpp.
- Increase `RowsConditionsFiltered` statistical, split from `RowsDelFiltered`, the meaning is the number of rows filtered by various column indexes, only in segment V2.
- Modify the document based on the above, and enhance readability.
Replace some boost to std in OlapScanNode.
This refactor seems solve the problem describe in #3929.
Because I found that BE will crash to calling `boost::condition_variable.notify_all()`.
But after upgrade to this, BE does not crash any more.
This CL mainly changes:
1. Add a new BE config `max_pushdown_conditions_per_column` to limit the number of conditions of a single column that can be pushed down to storage engine.
2. Add 2 new session variables `max_scan_key_num` and `doris_max_scan_key_num` which can set in session level and overwrite the config value in BE.
This CL mainly made the following modifications:
1. Delete Invalid MemoryUsed Counter and Add PeakMemUsage in each exec node and datastreamsender
2. Add intent in child execnode profile,make it is easily to know the relationship between execnode
3. Del _is_result_order we not support any more in olap_scan_node.h and olap_scan_node.cpp
4. Add scan_disk method to olap_scanner to fix the counter _num_disks_accessed_counter
5. Now we do not use buffer pool to read and write disk, so annotation eadio counter and
6. Delete the MemUsed counter in exec node.
[STORAGE][SEGMENTV2]
use block split bloom filter
build bloom filter against data page
add distinct value to bloom filter
add ordinal index to bloom filter index
There is unnegligible cost to covnert VectorRowBatch to RowBatch,
When we seek block, we only read one row from engine to minimize
this convert cost.
This patch can optimize some query's time from 5s to 2s
* Reduce UT binary size
Almost every module depend on ExecEnv, and ExecEnv contains all
singleton, which make UT binary contains all object files.
This patch seperate ExecEnv's initial and destory to anthor file to
avoid other file's dependence. And status.cc include debug_util.h which
depend tuple.h tuple_row.h, and I move get_stack_trace() to
stack_util.cpp to reduce status.cc's dependence.
I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a
Issue: #292
* Update