This PR mainly optimizes the following items:
- Statistics collection: clear out invalid historical statistics before collecting, so that they do not affect the final table statistics.
- Incremental statistics collection: in the incremental case, only the statistics of the corresponding partitions need to be collected (see the example below).
TODO: support incremental collection of materialized view statistics.
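To illustrate the incremental case, a minimal sketch in SQL; the table and partition names are hypothetical, and the exact ANALYZE partition syntax may vary across Doris versions:

```sql
-- Collect statistics only for the newly added partition instead of
-- re-analyzing the whole table.
ANALYZE TABLE sales PARTITIONS (p20230101);
```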
Fix two bugs in `SHOW LOAD PROFILE`:
For broker load, the second level should be the subtask's id.
The `SHOW LOAD PROFILE` statement should be forwarded to the master FE for execution.
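For context, a hedged usage sketch; the job and subtask ids in the profile path are illustrative:

```sql
-- List the load profiles, then drill down; for broker load, the second
-- path level is now the subtask id.
SHOW LOAD PROFILE "/";
SHOW LOAD PROFILE "/10041/10042";
```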
1. Fix constant folding failure on the decimalv3 type.
2. Support reducing decimalv3 literal precision in comparison predicates.
3. Support the FE config `enable_decimal_conversion`.
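A hedged illustration of item 2; the table and column are hypothetical:

```sql
-- Assuming c1 is DECIMALV3(10, 2): the literal 3.14159 has a wider
-- precision/scale than c1, so its precision can be reduced to match the
-- column instead of casting the column side up for the comparison.
SELECT * FROM t WHERE c1 > 3.14159;
```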
Since we cannot derive statistics and estimate cost for aggregates very accurately, this PR removes some aggregate patterns that are usually not good:
1. One-stage aggregation after an exchange; this pattern is good only when processing very few rows.
2. Three-stage distinct aggregation with a gather middle merge (a query of the affected kind is sketched below).
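A hedged example of a query whose plan may involve multi-stage distinct aggregation; the table and column are hypothetical:

```sql
-- With the pattern removed, the planner no longer considers the
-- three-stage distinct plan that gathers for the middle merge step.
SELECT COUNT(DISTINCT user_id) FROM events;
```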
Some syntax errors produce an unclear message: an NPE, because symbol.value is null, so calling toLowerCase() on it throws a NullPointerException. We fix it by checking whether the value is null and returning early.
This reverts commit 296b0c92f702675b92eee3c8af219f3862802fb2.
We can use the DROP TABLE FORCE statement to drop tablets quickly, so there is no need to check the tablet dropped state in every report.
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
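For reference, the statement in question; the table name is an example:

```sql
-- FORCE drops the table directly instead of moving it to the recycle
-- bin, so its tablets can be reclaimed quickly.
DROP TABLE example_tbl FORCE;
```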
Hive supports creating a partition with a specific location. In that case, the file path of the created partition may not contain the partition name and value, which causes Doris to fail to query that Hive partition.
This PR fixes this bug.
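For example, such a partition can be created in Hive as follows; the table name and path are illustrative:

```sql
-- The partition data lives under a custom path that does not embed the
-- usual dt=2023-01-01 name/value segment.
ALTER TABLE hive_tbl ADD PARTITION (dt = '2023-01-01')
LOCATION '/user/hive/warehouse/custom_path';
```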
```sql
select if(
date_format(CONCAT_WS('', '9999-07', '-26'), '%Y-%m') = DATE_FORMAT(curdate(), '%Y-%m'),
curdate(),
DATE_FORMAT(DATE_SUB(month_ceil(CONCAT_WS('', '9999-07', '-26')), 1), '%Y-%m-%d')
)
```
The query above incorrectly returns NULL. When constructing the new children of `if()`, we found that results at indexes greater than 0 in the result map do not replace the corresponding entries in the const map, due to an incorrect value assignment in the code.
Add this option in conf:

```java
/**
 * If set to false, users cannot submit analyze SQL and the FE won't allocate any related resources.
 */
@ConfField
public static boolean enable_stats = true;
```

It is checked during the analysis of analyze-related statements and when initializing the analyze manager.
Fix a bug when reading array types in Parquet files:
```
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]Read parquet file xxx failed,
reason = [IO_ERROR]Decode too many values in current page
```
When reading normal columns, `ScalarColumnReader::_read_values` still calls `ColumnSelectVector::set_run_length_null_map` to initialize the select vector, but `ScalarColumnReader::_read_nested_column` does not do this, making the number of values wrong.
The situation in which this error occurs is particularly extreme: the column pages still have values remaining to be read, but all of them are null at the ancestor level, so there is no actual read operation, only skipping of the null values at the ancestor level.
Refactor the FileQueryScanNode init and finalize methods.
Handle schema-related initialization in the init method, and scan range generation in the finalize method.