Fix some logic about broker load using new file scanner, with parquet format:
1. If columns are specified in load stmt, but none of them are in parquet file,
error will be thrown like `err: No columns found in file`. See `parquet_s3_case4`
2. If the first column of table are not in table, the result number of rows is wrong.
See `parquet_s3_case8`
3. If column specified in `columns` in load stmt does not exist in file and table,
error will be thrown like: `failed to find default value expr for slot: x1`. See `parquet_s3_case2`
1. Fix issue #13115
2. Modify the method of `get_next_block` or `GenericReader`, to return "read_rows" explicitly.
Some columns in block may not be filled in reader, if the first column is not filled, use `block->rows()` can not return real row numbers.
3. Add more checks for broker load test cases.
This is the second step for #12303.
The previous PR #12464 added the framework to select the rollup index for OLAP table, but pre-aggregation is turned on by default.
This PR set pre-aggregation for scan OLAP table.
The main steps are as below:
1. Select rollup index when aggregate is present, this is handled by `SelectRollupWithAggregate` rule. Expressions in aggregate functions, grouping expressions, and pushdown predicates would be used to check whether the pre-aggregation should be turned off.
2. When selecting from olap scan table without aggregate plan, it would be handled by `SelectRollupWithoutAggregate`.
when enable light schema change, run test_materialized_view_hll case throw NullPointerException.
java.lang.NullPointerException: null
at org.apache.doris.analysis.SlotDescriptor.setColumn(SlotDescriptor.java:153)
at org.apache.doris.planner.OlapScanNode.updateSlotUniqueId(OlapScanNode.java:399)
In an earlier PR #11976 , we add shuffle join and bucket shuffle support. But if join's right child's distribution spec satisfied join's require, we do not add distribute on right child. Instead of, do it in plan translator.
It is hard to calculate accurate cost in this way, since we some distribute cost do not calculated.
In this PR, we introduce a new shuffle type BUCKET, and change the way of add enforce to ensure all necessary distribute will be added in cost and enforcer job.
We used output list to compare two LogicalProperties before. Since join reorder will change the children order of a join plan and caused output list changed. the two join plan will not equals anymore in memo although they should be. So we must add a project on the new join to keep the LogicalProperties the same.
This PR changes the equals and hashCode funtions of LogicalProperties. use a set of output to compare two LogicalProperties. Then we do not need add the top peoject anymore. This help us keep memo simple and efficient.