* [improvement](regression test) Improve performance of the ASAN build by using -O3, and fix the mem limit exceeded error for nereids test cases
* exclude tpcds_sf1 q72 from the ASAN build because this query takes too long
* no need to call the delete handler to filter rows, since they are already filtered in the rowset reader
* no need to call delete eval in schema change; remove the related code
Co-authored-by: yiguolei <yiguolei@gmail.com>
During the load process, the same resource-intensive operations, such as sorting and aggregation,
are performed on every replica.
Loading data concurrently on all replicas therefore consumes a lot of CPU and memory.
It is better to perform the write process (writing data into the MemTable and then flushing it) on a single replica,
and to synchronize the data files to the other replicas before the transaction finishes.
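A conceptual sketch of this flow with hypothetical helper names (write_memtable_and_flush, fetch_segments, and the commit step are illustrations, not the actual Doris API):

```cpp
#include <string>
#include <vector>

// Hypothetical stubs standing in for the real load pipeline.
std::vector<std::string> write_memtable_and_flush(const std::string& tablet) {
    // Sort, aggregate, write the MemTable, flush; return the segment files.
    return {tablet + "/segment_0.dat"};
}

void fetch_segments(const std::string& peer,
                    const std::vector<std::string>& files) {
    // Peers only copy finished files; no sorting or aggregation here.
}

// Single-replica load: only the primary pays the CPU/memory cost of the
// write path; the other replicas synchronize files before the commit.
void single_replica_load(const std::string& primary,
                         const std::vector<std::string>& peers) {
    std::vector<std::string> segments = write_memtable_and_flush(primary);
    for (const std::string& peer : peers) {
        fetch_segments(peer, segments);
    }
    // The transaction is committed only after every peer has the files.
}
```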
Analyze the schema elements in the parquet FileMetaData, and generate the hierarchy of nested fields (a reconstruction sketch follows the examples below).
For example:
1. primitive type
```
// thrift:
optional int32 <column-name>;
// sql definition:
<column-name> int32;
```
2. nested type
```
// thrift:
optional group <column-name> (LIST) {
    repeated group bag {
        optional group array_element (LIST) {
            repeated group bag {
                optional int32 array_element
            }
        }
    }
}
// sql definition:
<column-name> array<array<int32>>
```
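The FileMetaData stores these schema elements as a flat, depth-first list in which every group records its number of direct children. A minimal reconstruction sketch with simplified stand-in types (the real code reads the thrift-generated SchemaElement):

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for the thrift-generated schema types.
struct SchemaElement {
    std::string name;
    int num_children = 0;  // 0 => primitive leaf
};

struct FieldNode {
    std::string name;
    std::vector<FieldNode> children;
};

// The flat schema list is in depth-first order; each group records how many
// direct children follow it, so one pass with a moving index rebuilds the tree.
FieldNode build_field(const std::vector<SchemaElement>& schema, std::size_t& idx) {
    const SchemaElement& elem = schema[idx++];
    FieldNode node{elem.name, {}};
    for (int i = 0; i < elem.num_children; ++i) {
        node.children.push_back(build_field(schema, idx));
    }
    return node;
}
```

Calling build_field with idx = 0 consumes the root element and returns the full field hierarchy.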
```
if (!_dynamic_mode) {
    // _len_pos is a raw pointer into _buf; if reserve() reallocated the
    // buffer, this write lands in freed memory.
    int8store(_len_pos, _pos - _len_pos - 8);
    _len_pos = nullptr;
}
```
_len_pos may point to a position that has already been freed in reserve(); int8store will then assign a value to that freed address,
which leads to a use-after-free in the ASAN build. So I changed _len_pos to an offset into _buf, as sketched below.
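A minimal sketch of the offset idea with a hypothetical buffer class (not the real mysql_row_buffer code; names and layout are illustrative):

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical buffer: _len_pos is stored as an offset into _buf, so it
// stays valid even when an append reallocates the underlying storage.
class RowBuffer {
public:
    void open_length_slot() {
        _len_pos = _buf.size();          // remember where the length goes
        _buf.resize(_buf.size() + 8);    // reserve 8 bytes for it
    }
    void append(const char* data, std::size_t n) {
        _buf.insert(_buf.end(), data, data + n);  // may reallocate _buf
    }
    void close_length_slot() {
        // Equivalent of int8store(_len_pos, _pos - _len_pos - 8), but the
        // address is recomputed from the offset after any reallocation.
        uint64_t len = _buf.size() - _len_pos - 8;
        std::memcpy(_buf.data() + _len_pos, &len, sizeof(len));
        _len_pos = 0;
    }
private:
    std::vector<char> _buf;
    std::size_t _len_pos = 0;  // offset, not a pointer
};
```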
1. Fix a bug where querying a table with large columns may cause an infinite loop
2. Optimize the query logic with LIMIT: when the limit value is relatively small, reduce the scanner parallelism. This cuts unnecessary resource consumption, increases the number of similar queries the system can carry at the same time, and improves query speed by more than 60% (see the sketch after this list).
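A minimal sketch of the parallelism heuristic, with hypothetical names and a hypothetical rows_per_scanner parameter (the real planning logic may differ):

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical heuristic: a small LIMIT needs only enough scanners to
// produce roughly `limit` rows, so clamp the parallelism instead of always
// spawning the configured maximum.
int choose_scanner_parallelism(int64_t limit, int max_parallelism,
                               int64_t rows_per_scanner) {
    if (limit < 0) return max_parallelism;  // no limit pushed down
    int64_t needed = (limit + rows_per_scanner - 1) / rows_per_scanner;
    return static_cast<int>(std::clamp<int64_t>(needed, 1, max_parallelism));
}
```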
* fix infinite loop when reading a wide table
When a wide table is read, the first batch may already exceed raw_bytes_threshold,
so the Scanner should read at least one row per batch; a sketch of the guard is below.
Ideally, we should adjust the batch size automatically to reduce memory usage.
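A minimal sketch of the at-least-one-row guard, with a hypothetical row source standing in for the real Scanner:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row source: next_row() returns the raw byte size of the row
// it read, or 0 at end of data.
struct RowSource {
    std::vector<std::size_t> row_sizes;
    std::size_t cursor = 0;
    std::size_t next_row() {
        return cursor < row_sizes.size() ? row_sizes[cursor++] : 0;
    }
};

// The do/while reads at least one row before checking the threshold, so a
// single row wider than raw_bytes_threshold still makes progress instead
// of spinning forever on an always-full batch.
std::size_t read_batch(RowSource& src, std::size_t raw_bytes_threshold) {
    std::size_t raw_bytes = 0;
    std::size_t rows = 0;
    do {
        std::size_t n = src.next_row();
        if (n == 0) break;  // end of data
        raw_bytes += n;
        ++rows;
    } while (raw_bytes < raw_bytes_threshold);
    return rows;
}
```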
* refactor first and last
[refactor][vectorized] refactor first/last value agg functions
* add some changes
* remove the always-nullable handling for first/last
* remove always nullable and register it
* refactor value: remove the bool null flag
* refactor window first/last to use ptr and pos (a sketch follows)
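A minimal sketch of the ptr-and-pos state, with a stand-in Column type (an illustration, not the actual Doris aggregate state):

```cpp
#include <cstddef>

struct Column;  // stand-in for the vectorized column type

// Instead of copying the value plus a separate bool null flag, the state
// keeps a pointer to the source column and the row position; the value is
// materialized only when the result is emitted.
struct FirstLastState {
    const Column* column = nullptr;  // column holding the candidate value
    std::size_t pos = 0;             // row position inside that column

    bool has_value() const { return column != nullptr; }

    void set_first(const Column* col, std::size_t row) {
        if (!has_value()) { column = col; pos = row; }  // keep the earliest
    }
    void set_last(const Column* col, std::size_t row) {
        column = col;  // always overwrite with the latest
        pos = row;
    }
};
```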