Fix some logic in broker load when using the new file scanner with the parquet format:
1. If columns are specified in the load stmt but none of them exist in the parquet file,
an error like `err: No columns found in file` is thrown. See `parquet_s3_case4`.
2. If the first column of the table is not present in the file, the resulting number of rows is wrong.
See `parquet_s3_case8`.
3. If a column specified in `columns` in the load stmt exists in neither the file nor the table,
an error like `failed to find default value expr for slot: x1` is thrown. See `parquet_s3_case2`.
Changes in this PR:
1. Fix issue #13115.
2. Modify the `get_next_block` method of `GenericReader` to return `read_rows` explicitly (see the sketch after this list).
Some columns in the block may not be filled by the reader; if the first column is not filled, `block->rows()` cannot return the real number of rows.
3. Add more checks for broker load test cases.
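`GenericReader` lives in the C++ backend; purely as an illustration, the following Java sketch (all names hypothetical) shows why returning `read_rows` explicitly is safer than inferring the row count from the block when the reader may leave some columns, including the first, unfilled:

```java
import java.util.List;

// Hypothetical stand-in for a columnar block: an unfilled column is null.
class Block {
    List<Object>[] columns; // columns[i] == null means the reader left it unfilled

    // Mirrors the old block->rows() assumption: derive the row count from the
    // first column, which is wrong whenever that column was not filled.
    int rows() {
        return columns[0] == null ? 0 : columns[0].size();
    }
}

// Hypothetical reader interface mirroring the fix: the number of rows actually
// read is reported explicitly instead of being inferred via Block.rows().
interface GenericReaderSketch {
    // Fills `block` and returns read_rows for this call.
    long getNextBlock(Block block) throws Exception;
}
```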
Now, regression data is stored in `sf1DataPath`, which can be local or remote.
For performance reasons, we use a local dir for the community pipeline; however, we then need to prepare the data on every machine,
and this process is error-prone. So we now transparently cache the S3 data on local disk, and thus only one data source needs to be configured (see the sketch below).
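As a rough illustration of the idea (the real framework code differs; the class and helper names below are hypothetical): resolve each data file against the local cache first, and download it from S3 only on a cache miss.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of transparent S3-to-local caching for regression data.
class RegressionDataCache {
    private final String s3BaseUrl;   // the single configured data source
    private final Path localCacheDir; // local dir used by the pipeline

    RegressionDataCache(String s3BaseUrl, Path localCacheDir) {
        this.s3BaseUrl = s3BaseUrl;
        this.localCacheDir = localCacheDir;
    }

    // Returns a local path for the requested file, downloading it from S3
    // only when it is not cached yet; callers never see the cache.
    Path resolve(String relativePath) throws IOException {
        Path cached = localCacheDir.resolve(relativePath);
        if (!Files.exists(cached)) {
            Files.createDirectories(cached.getParent());
            try (InputStream in = new URL(s3BaseUrl + "/" + relativePath).openStream()) {
                Files.copy(in, cached, StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return cached;
    }
}
```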
This is the second step for #12303.
The previous PR #12464 added the framework for selecting the rollup index for an OLAP table, but pre-aggregation was left turned on by default.
This PR sets pre-aggregation appropriately when scanning an OLAP table.
The main steps are as follows:
1. Select the rollup index when an aggregate is present; this is handled by the `SelectRollupWithAggregate` rule. Expressions in aggregate functions, grouping expressions, and pushed-down predicates are used to check whether pre-aggregation should be turned off (a simplified sketch of this check follows the list).
2. When scanning an OLAP table without an aggregate plan, the selection is handled by `SelectRollupWithoutAggregate`.
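A simplified, hypothetical sketch of the check in step 1 (not the actual rule code): pre-aggregation stays on only if grouping expressions and pushed-down predicates reference key columns only, and every aggregate function matches the rollup aggregation type of its column.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification of the pre-aggregation decision. Filtering or
// regrouping rows that are already partially aggregated would give wrong
// results, so any such access forces pre-aggregation off.
class PreAggDecision {
    static boolean preAggOn(Set<String> keyColumns,
                            Map<String, String> valueAggType, // value column -> SUM/MIN/MAX...
                            List<String> groupingCols,
                            List<String> predicateCols,
                            Map<String, String> aggFuncOnCol) { // value column -> query agg func
        // Grouping expressions and pushed-down predicates may only touch keys.
        for (String c : groupingCols) {
            if (!keyColumns.contains(c)) return false;
        }
        for (String c : predicateCols) {
            if (!keyColumns.contains(c)) return false;
        }
        // Each aggregate function must match the column's rollup agg type,
        // e.g. SUM(v) over a column that is pre-aggregated with SUM.
        for (Map.Entry<String, String> e : aggFuncOnCol.entrySet()) {
            if (!e.getValue().equalsIgnoreCase(valueAggType.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }
}
```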
When light schema change is enabled, running the `test_materialized_view_hll` case throws a NullPointerException:
java.lang.NullPointerException: null
at org.apache.doris.analysis.SlotDescriptor.setColumn(SlotDescriptor.java:153)
at org.apache.doris.planner.OlapScanNode.updateSlotUniqueId(OlapScanNode.java:399)
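A minimal sketch of the kind of null guard that avoids this NPE (simplified stand-in types; the actual fix may differ): under light schema change the column lookup for a slot can come back null, so the result has to be checked before `setColumn` is called.

```java
import java.util.Map;

// Hypothetical, simplified stand-ins for the classes in the stack trace.
class Column {
    final String name;
    Column(String name) { this.name = name; }
}

class SlotDescriptorSketch {
    Column column;
    String colName;

    void setColumn(Column column) {
        this.column = column;
        this.colName = column.name; // passing null here is what raises the NPE
    }
}

class OlapScanNodeSketch {
    // Look up the slot's column in the selected index meta; under light
    // schema change the lookup can miss, so guard against null instead of
    // calling setColumn unconditionally.
    static void updateSlotUniqueId(SlotDescriptorSketch slot,
                                   Map<String, Column> indexMeta,
                                   String colName) {
        Column col = indexMeta.get(colName);
        if (col == null) {
            return; // keep the slot's existing column info rather than NPE
        }
        slot.setColumn(col);
    }
}
```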
In an earlier PR #11976, we added shuffle join and bucket shuffle support. But if the join's right child's distribution spec satisfied the join's requirement, we did not add a distribute on the right child; instead, we did it in the plan translator.
It is hard to calculate an accurate cost this way, since some distribution costs are not counted.
In this PR, we introduce a new shuffle type, BUCKET, and change the way enforcers are added to ensure that all necessary distributes are added in the cost and enforcer job (a rough sketch follows).
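A rough sketch of the enforcement idea (hypothetical names, not the real Nereids classes): the cost-and-enforcer job compares each child's output distribution with the requirement and inserts an explicit distribute node on any mismatch, so every shuffle, including the new BUCKET type, is visible to the cost model instead of being patched in by the plan translator.

```java
import java.util.List;

// Hypothetical model of distribution enforcement (not the real Nereids API).
enum ShuffleType { ANY, BROADCAST, SHUFFLE, BUCKET }

record DistributionSpec(ShuffleType type, List<String> hashColumns) {
    // A child's spec satisfies a requirement when the requirement is ANY, or
    // when both the shuffle type and the hash columns match.
    boolean satisfies(DistributionSpec required) {
        return required.type() == ShuffleType.ANY
                || (type == required.type() && hashColumns.equals(required.hashColumns()));
    }
}

interface PlanNode {}
record Distribute(PlanNode child, DistributionSpec spec) implements PlanNode {}

class EnforcerSketch {
    // Add an explicit distribute whenever the child does not already satisfy
    // the requirement, so the shuffle participates in costing.
    static PlanNode enforce(PlanNode child, DistributionSpec childSpec,
                            DistributionSpec required) {
        return childSpec.satisfies(required) ? child : new Distribute(child, required);
    }
}
```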