During the load process, the same operations, such as sort and aggregation, are performed on all replicas,
which is resource-intensive.
Concurrent data loads consume a lot of CPU and memory.
It is better to perform the write process (writing data into the MemTable and then flushing it) on a single replica
and synchronize the data files to the other replicas before the transaction finishes.
Analyze the schema elements in the parquet FileMetaData and generate the hierarchy of nested fields.
For example:
1. primitive type
```
// thrift:
optional int32 <column-name>;
// sql definition:
<column-name> int32;
```
2. nested type
```
// thrift:
optional group <column-name> (LIST) {
repeated group bag {
optional group array_element (LIST) {
repeated group bag {
optional int32 array_element
}
}
}
}
// sql definition:
<column-name> array<array<int32>>
```
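A minimal sketch of how such a hierarchy could be rebuilt. The parquet format stores the schema in FileMetaData as a flat list produced by a depth-first traversal, where each element records its number of children. `SchemaElement` below is a simplified, hypothetical stand-in for the thrift-generated struct, and `FieldNode`, `build_node`, `build_hierarchy` and `max_depth` are illustrative names, not the actual reader code.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a parquet schema element.
struct SchemaElement {
    std::string name;
    int num_children;  // 0 for primitive (leaf) fields
};

// One node of the reconstructed field hierarchy.
struct FieldNode {
    std::string name;
    std::vector<std::unique_ptr<FieldNode>> children;
};

// Consumes elements[pos] and all of its descendants; pos ends up just past the subtree.
std::unique_ptr<FieldNode> build_node(const std::vector<SchemaElement>& elements,
                                      std::size_t& pos) {
    if (pos >= elements.size()) {
        throw std::runtime_error("schema ended before all children were read");
    }
    const SchemaElement& element = elements[pos++];
    auto node = std::make_unique<FieldNode>();
    node->name = element.name;
    for (int i = 0; i < element.num_children; ++i) {
        node->children.push_back(build_node(elements, pos));
    }
    return node;
}

// elements[0] is the root; the rest follow in depth-first order.
std::unique_ptr<FieldNode> build_hierarchy(const std::vector<SchemaElement>& elements) {
    std::size_t pos = 0;
    auto root = build_node(elements, pos);
    if (pos != elements.size()) {
        throw std::runtime_error("trailing schema elements after the root subtree");
    }
    return root;
}

// Number of edges from this node down to its deepest leaf.
int max_depth(const FieldNode& node) {
    int depth = 0;
    for (const auto& child : node.children) {
        depth = std::max(depth, 1 + max_depth(*child));
    }
    return depth;
}
```

For the nested example above, the flat list would be root, `<column-name>`, `bag`, `array_element`, `bag`, `array_element`, each with one child except the final primitive.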
* [fix](TabletInvertedIndex) fix potential deadlock between ForkJoinPool and TabletInvertedIndex
The default ForkJoinPool is shared by all parallelStream calls. In TabletInvertedIndex we obtain the read lock outside the ForkJoinPool, while in TabletStatMgr we obtain the read lock inside the same ForkJoinPool, which may cause a deadlock.
```
if (!_dynamic_mode) {
int8store(_len_pos, _pos - _len_pos - 8);
_len_pos = nullptr;
}
```
_len_pos may point to a position that was already freed in reserve. int8store then assigns a value to the freed address,
which leads to a use-after-free when built with ASAN. So I changed _len_pos to an offset into _buf.
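The idea of the fix can be sketched as follows. This is a simplified illustration, not the actual buffer class: `LengthPrefixedBuffer` and its methods are hypothetical names, `std::vector` stands in for the manually managed `_buf`, and `std::memcpy` writes the length in native byte order, whereas `int8store` writes it little-endian. The point is only that an offset stays valid after the buffer reallocates, while a raw pointer does not.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical buffer that writes an 8-byte length header before each block.
class LengthPrefixedBuffer {
public:
    // Reserve 8 bytes for the header and remember its position as an offset.
    void begin_block() {
        _len_offset = _buf.size();
        _buf.resize(_buf.size() + 8);
        _has_open_block = true;
    }

    // Appending may reallocate _buf, which would invalidate any saved pointer.
    void append(const void* data, std::size_t size) {
        const auto* bytes = static_cast<const std::uint8_t*>(data);
        _buf.insert(_buf.end(), bytes, bytes + size);
    }

    // Resolve the offset against the current storage, then fill in the length.
    void end_block() {
        if (!_has_open_block) {
            return;
        }
        std::uint64_t len = _buf.size() - _len_offset - 8;
        std::memcpy(_buf.data() + _len_offset, &len, sizeof(len));
        _has_open_block = false;
    }

    // Read back the length stored in the header at header_offset.
    std::uint64_t block_length(std::size_t header_offset) const {
        std::uint64_t len = 0;
        std::memcpy(&len, _buf.data() + header_offset, sizeof(len));
        return len;
    }

    std::size_t size() const { return _buf.size(); }

private:
    std::vector<std::uint8_t> _buf;
    std::size_t _len_offset = 0;  // offset into _buf, not a pointer
    bool _has_open_block = false;
};
```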
Rules for normalizing expressions should be applied once, before any further expression transforms.
Normalization rules include:
1. NormalizeBinaryPredicatesRule
2. BetweenToCompoundRule
3. SimplifyNotExprRule
1. Fix a bug where querying a large-column table may cause an infinite loop
2. Optimize the query logic for queries with a limit: when the limit value is relatively small, reduce the parallelism of the scanner to avoid unnecessary resource consumption. This increases the number of similar queries the system can serve at the same time and increases query speed by more than 60%
* fix infinite loop when reading wide table
When a wide table is read, the first batch may exceed raw_bytes_threshold,
so the Scanner should read at least 1 row.
In the long run, we should adjust the batch size automatically to reduce memory usage.
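A minimal sketch of the "read at least 1 row" rule, assuming per-row byte sizes are known up front. `next_batch_rows` and `count_batches` are hypothetical helpers for illustration; the real Scanner logic differs. The threshold is only enforced once the batch already holds a row, so every batch makes progress and the read loop terminates.

```cpp
#include <cstddef>
#include <vector>

// Returns how many rows, starting at `start`, go into the next batch.
// Always returns at least 1 while rows remain, even if that single row
// is larger than raw_bytes_threshold.
std::size_t next_batch_rows(const std::vector<std::size_t>& row_bytes,
                            std::size_t start,
                            std::size_t raw_bytes_threshold) {
    std::size_t rows = 0;
    std::size_t bytes = 0;
    while (start + rows < row_bytes.size()) {
        const std::size_t next = row_bytes[start + rows];
        // Apply the threshold only after the first row has been accepted.
        if (rows > 0 && bytes + next > raw_bytes_threshold) {
            break;
        }
        bytes += next;
        ++rows;
    }
    return rows;
}

// Counts the batches needed to read every row; terminates because each
// call to next_batch_rows advances by at least one row.
std::size_t count_batches(const std::vector<std::size_t>& row_bytes,
                          std::size_t raw_bytes_threshold) {
    std::size_t start = 0;
    std::size_t batches = 0;
    while (start < row_bytes.size()) {
        start += next_batch_rows(row_bytes, start, raw_bytes_threshold);
        ++batches;
    }
    return batches;
}
```

Without the `rows > 0` guard, a row larger than the threshold would yield an empty batch on every call and the outer loop would never advance.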