Files
doris/be
airborne12 c6b97c4daa [Improvement](segment iterator) remove range in first read to save time (#26689)
Currently, rowids may be fragmented significantly after `_get_row_ranges_by_column_conditions`, potentially leading to high CPU costs when processing these scattered ranges of rowid.

This PR enhances the `SegmentIterator` by eliminating the initial range read in the `BitmapRangeIterator` constructor and introducing a `read_batch_rowids` method to both `BitmapRangeIterator` and `BackwardBitmapRangeIterator` classes. The aim is to boost performance by omitting redundant read operations, thereby reducing execution time.

Moreover, to avoid unnecessary reads when the range is relatively complete, we employ a simple `is_continuous` check to determine if the block of rows is continuous. If so, we call `next_batch` instead of `read_by_rowids`, streamlining the processing of consecutive rowids.


We selected three SQL statement scenarios to test the effects of the optimization, which are:

1. ```select COUNT() from wc_httplogs_inverted_index where request match "images" and (size >= 10 and status = 200);```
2. ```select COUNT() from wc_httplogs_inverted_index where request match "HTTP" and (size >= 10 and status = 200);```
3. ```select COUNT() from wc_httplogs_inverted_index where request match "GET" and (size >= 10 and status = 200);```

- The first SQL statement represents the scenario primarily optimized in this PR, where the first read matches a large number of rows but is highly fragmented. 
- The second SQL statement represents a scenario where the first read fully hits, mainly to verify if there is any performance degradation in the PR when hitting a complete rowid range. 
- The third SQL statement represents a near-total hit with only occasional misses, used to check if the PR degrades when the rowid range contains many continuous ranges.

The results are as follows:

1. For the first SQL statement:
    1. Before optimization: Execution time: 0.32 sec, FirstReadTime: 6s628ms
    2. After optimization: Execution time: 0.16 sec, FirstReadTime: 1s604ms
2. For the second SQL statement:
    1. Before optimization: Execution time: 0.16 sec, FirstReadTime: 682.816ms
    2. After optimization: Execution time: 0.15 sec, FirstReadTime: 635.156ms
3. For the third SQL statement:
    1. Before optimization: Execution time: 0.16 sec, FirstReadTime: 787.904ms
    2. After optimization: Execution time: 0.16 sec, FirstReadTime: 798.861ms
2023-11-13 15:51:48 +08:00
..