Improve the accuracy of sample stats collection. For non-distribution columns, estimate NDV with
`n*d / (n - f1 + f1*n/N)`
where `n` is the number of sampled rows (out of `N` total rows), `d` is the number of distinct values in the sample, and `f1` is the number of distinct values that occur exactly once in the sample.
For distribution columns, use `ndv(n) * fraction of tablets sampled` as the NDV estimate.
For very large tablets, apply a limit to bound the number of rows scanned. This applies to non-key columns only: key columns are sorted, so sampling them with a limit would read a biased prefix and give inaccurate stats.
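As a worked example with made-up numbers (not from this PR): sample `n = 10000` rows out of `N = 1000000`, and suppose the sample contains `d = 500` distinct values, `f1 = 100` of which appear exactly once. Then

$$
\hat{D} = \frac{n \cdot d}{n - f_1 + f_1 \cdot n/N}
        = \frac{10000 \cdot 500}{10000 - 100 + 100 \cdot 10000/1000000}
        = \frac{5000000}{9901} \approx 505.
$$

With few singletons the estimate stays close to `d`; at the other extreme, if every sampled row is distinct (`f1 = d = n`), the denominator collapses to `n^2/N` and the estimate rises to `N`, which is the desired limiting behavior.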
Currently, rowids may be significantly fragmented after `_get_row_ranges_by_column_conditions`, which can incur a high CPU cost when processing these scattered rowid ranges.
This PR enhances the `SegmentIterator` by eliminating the initial range read in the `BitmapRangeIterator` constructor and by introducing a `read_batch_rowids` method on both the `BitmapRangeIterator` and `BackwardBitmapRangeIterator` classes, avoiding redundant read operations and reducing execution time.
Moreover, to avoid unnecessary reads when the range is relatively complete, we employ a simple `is_continuous` check to determine if the block of rows is continuous. If so, we call `next_batch` instead of `read_by_rowids`, streamlining the processing of consecutive rowids.
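A minimal sketch of this dispatch, with simplified stand-in names and signatures (not Doris's actual `SegmentIterator` interfaces):

```cpp
#include <cstdint>
#include <vector>

using rowid_t = uint32_t;

// Stub standing in for a real column reader.
struct ColumnReader {
    void next_batch(rowid_t /*start*/, size_t /*n*/) { /* one sequential range read */ }
    void read_by_rowids(const rowid_t* /*ids*/, size_t /*n*/) { /* scattered point reads */ }
};

// A sorted, distinct rowid batch is continuous iff it covers
// [front, front + size) with no gaps.
bool is_continuous(const std::vector<rowid_t>& rowids) {
    return rowids.size() > 1 &&
           rowids.back() - rowids.front() + 1 == rowids.size();
}

void read_rows(ColumnReader& reader, const std::vector<rowid_t>& rowids) {
    if (is_continuous(rowids)) {
        // Consecutive rowids: one sequential read beats many point lookups.
        reader.next_batch(rowids.front(), rowids.size());
    } else {
        reader.read_by_rowids(rowids.data(), rowids.size());
    }
}
```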
We selected three SQL scenarios to test the effect of the optimization:
1. ```select COUNT() from wc_httplogs_inverted_index where request match "images" and (size >= 10 and status = 200);```
2. ```select COUNT() from wc_httplogs_inverted_index where request match "HTTP" and (size >= 10 and status = 200);```
3. ```select COUNT() from wc_httplogs_inverted_index where request match "GET" and (size >= 10 and status = 200);```
- The first SQL statement represents the scenario primarily targeted by this PR: the first read matches a large number of rows, but the matched rowids are highly fragmented.
- The second SQL statement represents a scenario where the first read fully hits, mainly to verify whether the PR causes any performance degradation when hitting a complete rowid range.
- The third SQL statement represents a near-total hit with only occasional misses, used to check whether the PR degrades when the rowid set contains many continuous ranges.
The results are as follows:

| SQL | Execution time (before) | FirstReadTime (before) | Execution time (after) | FirstReadTime (after) |
| --- | --- | --- | --- | --- |
| 1 | 0.32 sec | 6s628ms | 0.16 sec | 1s604ms |
| 2 | 0.16 sec | 682.816ms | 0.15 sec | 635.156ms |
| 3 | 0.16 sec | 787.904ms | 0.16 sec | 798.861ms |
This commit overhauls the project's JDBC connector, replacing the previous mechanism of fetching each ResultSet item through an individual JNI call with a more efficient, unified approach based on the VectorTable data structure.
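A rough sketch of the idea, using made-up names (`VectorTableView`, `append_column`) rather than the connector's real classes: the Java side fills a columnar batch once, and the C++ side copies each column out in bulk instead of crossing the JNI boundary for every cell:

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Stand-in for the columnar batch the Java side hands back through a
// single JNI call (buffer address + row count per column).
struct VectorTableView {
    const int64_t* column_data;  // start of one column's value buffer
    size_t num_rows;
};

// Bulk-append one column of the batch into the engine's own storage:
// a single memcpy replaces num_rows per-cell JNI getObject() round trips.
void append_column(std::vector<int64_t>& dst, const VectorTableView& src) {
    size_t old_size = dst.size();
    dst.resize(old_size + src.num_rows);
    std::memcpy(dst.data() + old_size, src.column_data,
                src.num_rows * sizeof(int64_t));
}
```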
The SQL `select * from t1 a join t1 b on b.id in (select 1) and a.id = b.id;` will report an error.
This PR supports uncorrelated subqueries in join conditions to fix this.
When running a LoadPendingTask or LoadLoadingTask, an `Error` such as `NoClassDefFoundError` may be thrown. Previously, we only caught Java's `Exception`, so other kinds of `Throwable` could not be shown clearly.
1. The closure should be managed by a `unique_ptr` and released by brpc; it should not be held by our code. If our code held it, we would have to wait for brpc to finish during cancel or close.
2. The closure should be exception safe: if any exception happens, it must not leak memory.
3. Using a specific callback interface implemented by Doris's code, we can write arbitrary logic there, and Doris manages the callback's lifecycle.
4. Using a `weak_ptr` between the callback and the closure, if the callback is destructed before the closure's `Run`, there should be no core dump (see the sketch after this list).
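A minimal sketch of this ownership scheme, with simplified stand-in names (`Callback`, `RpcClosure`); the real Doris and brpc classes differ (e.g. brpc closures derive from `google::protobuf::Closure`):

```cpp
#include <memory>
#include <utility>

// Interface implemented on the Doris side; Doris manages its lifecycle (point 3).
struct Callback {
    virtual ~Callback() = default;
    virtual void on_rpc_done() = 0;
};

// Closure handed to brpc. brpc invokes Run() exactly once and the closure
// deletes itself there, so our code never holds it after issuing the RPC (point 1).
class RpcClosure {
public:
    explicit RpcClosure(std::weak_ptr<Callback> cb) : _cb(std::move(cb)) {}

    void Run() {
        // The guard frees the closure even if on_rpc_done() throws (point 2).
        std::unique_ptr<RpcClosure> self_guard(this);
        if (auto cb = _cb.lock()) {  // callback may already be destructed (point 4)
            cb->on_rpc_done();       // lock() keeps it alive for this call
        }                            // expired weak_ptr: skip quietly, no core dump
    }

private:
    std::weak_ptr<Callback> _cb;  // never extends the callback's lifetime
};
```

The caller allocates the closure with `new` and hands it to brpc; since only `Run` deletes it, cancel and close paths never need to wait for brpc to finish.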
Currently, the update time of an hms table is generated independently on every FE node (each uses `System.currentTimeMillis()` separately), so it may differ between FE nodes; as a result, the same query can miss the sql-cache when submitted more than once through different FE nodes. This PR mainly makes the following changes to avoid this problem:
- Use `transient_lastDdlTime` instead of `System.currentTimeMillis()` as the `schemaUpdateTime` of hms tables
- Use the `eventTime` of the hms event instead of `System.currentTimeMillis()` as the update time when processing hms events