Refactor Context in Cascades:
Use two contexts in the Cascades framework.
JobContext is used in each job and contains the following attributes:
- reference to PlannerContext
- current cost upper bound
- current required physical properties
PlannerContext holds global information for the query planner and contains the following attributes:
- reference to Memo
- reference to connectContext
- reference to the rule set that can be used for planning
- job pool to maintain unexecuted jobs
- job scheduler to schedule unexecuted jobs
- current job context for next job to be executed
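The two contexts listed above could be laid out roughly as follows. This is a minimal sketch, not the actual Doris code: `Memo`, `ConnectContext`, `RuleSet`, `PhysicalProperties`, `Job`, `JobPool` and `JobScheduler` are empty or simplified stand-ins, and the stack-based job pool is an assumption made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical placeholders; the real classes carry far more state.
final class Memo {}
final class ConnectContext {}
final class RuleSet {}
final class PhysicalProperties {}

interface Job {
    void execute();
}

/** Holds jobs that have not been executed yet (assumed here to be a stack). */
final class JobPool {
    private final Deque<Job> jobs = new ArrayDeque<>();

    void push(Job job) { jobs.push(job); }
    Job pop() { return jobs.pop(); }
    boolean isEmpty() { return jobs.isEmpty(); }
}

/** Runs jobs from the pool until none are left. */
final class JobScheduler {
    void executeAll(JobPool pool) {
        while (!pool.isEmpty()) {
            pool.pop().execute();
        }
    }
}

/** Global state shared by the whole planning process. */
final class PlannerContext {
    final Memo memo;
    final ConnectContext connectContext;
    final RuleSet ruleSet;
    final JobPool jobPool = new JobPool();
    final JobScheduler jobScheduler = new JobScheduler();
    JobContext currentJobContext; // context for the next job to be executed

    PlannerContext(Memo memo, ConnectContext connectContext, RuleSet ruleSet) {
        this.memo = memo;
        this.connectContext = connectContext;
        this.ruleSet = ruleSet;
    }
}

/** Per-job state; points back to the shared PlannerContext. */
final class JobContext {
    final PlannerContext plannerContext;
    final double costUpperBound;
    final PhysicalProperties requiredProperties;

    JobContext(PlannerContext plannerContext, double costUpperBound,
               PhysicalProperties requiredProperties) {
        this.plannerContext = plannerContext;
        this.costUpperBound = costUpperBound;
        this.requiredProperties = requiredProperties;
    }
}
```

Keeping the cost bound and required properties in `JobContext` lets each job carry its own search constraints while the memo and rule set stay shared.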
During query planning, the binary predicate rewrite that converts a DecimalLiteral to an integer may overflow, producing wrong results for predicates like `id = 12345678901.0` (see the issue for detailed examples).
This PR fixes the possible overflow and optimizes the case where the DecimalLiteral is outside the value range of the column type.
Issue Number: close #10544
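The overflow hazard can be illustrated with a range check done before narrowing. This is a sketch under assumptions: `DecimalLiteralRange` and its methods are hypothetical names, the column is assumed to be a 32-bit INT, and what the planner does with an out-of-range literal is left to the caller.

```java
import java.math.BigDecimal;
import java.util.OptionalLong;

final class DecimalLiteralRange {
    /** True when v has no fractional part, e.g. 12345678901.0 or 42.00. */
    static boolean isIntegral(BigDecimal v) {
        return v.stripTrailingZeros().scale() <= 0;
    }

    /** True when v lies inside [min, max]. */
    static boolean inRange(BigDecimal v, long min, long max) {
        return v.compareTo(BigDecimal.valueOf(min)) >= 0
                && v.compareTo(BigDecimal.valueOf(max)) <= 0;
    }

    /**
     * Converts a decimal literal for comparison with an INT column.
     * Returns empty when the literal cannot be represented, so the caller
     * can handle that case separately instead of silently wrapping around.
     */
    static OptionalLong toIntColumnValue(BigDecimal literal) {
        if (!isIntegral(literal)
                || !inRange(literal, Integer.MIN_VALUE, Integer.MAX_VALUE)) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(literal.longValueExact());
    }

    public static void main(String[] args) {
        BigDecimal big = new BigDecimal("12345678901.0");
        // Unchecked narrowing silently drops the high-order bits.
        System.out.println("unchecked intValue: " + big.intValue());
        System.out.println("checked: " + toIntColumnValue(big));
        System.out.println("checked: " + toIntColumnValue(new BigDecimal("42.0")));
    }
}
```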
`ExprBuilder` uses a stack to build the expression.
The input order is col, value, but the stack yields value, col, while the `>=` operator is not reversed.
Example:
`col >= 1` => `1 >= col`
In this case, it is better to use a queue to keep the input order.
The `CompoundPredicate(OR)` handling also has a problem: it should be `alwaysTrue` whenever the predicate is not on a partition key or the op is not supported.
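The operand-order problem described above can be reproduced in a few lines. This sketch is not the `ExprBuilder` code; `OperandOrderDemo` and its methods are hypothetical and only show how a stack reverses two operands while a queue preserves them.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandOrderDemo {
    /** LIFO: operands come back in reverse, but the operator stays as-is. */
    static String buildWithStack(String col, String op, String value) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push(col);
        stack.push(value);
        String left = stack.pop();  // value comes out first
        String right = stack.pop(); // col comes out second
        return left + " " + op + " " + right;
    }

    /** FIFO: operands come back in the order they were added. */
    static String buildWithQueue(String col, String op, String value) {
        Deque<String> queue = new ArrayDeque<>();
        queue.addLast(col);
        queue.addLast(value);
        String left = queue.pollFirst();  // col
        String right = queue.pollFirst(); // value
        return left + " " + op + " " + right;
    }

    public static void main(String[] args) {
        System.out.println(buildWithStack("col", ">=", "1")); // 1 >= col
        System.out.println(buildWithQueue("col", ">=", "1")); // col >= 1
    }
}
```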
Sometimes an upstream system (e.g., Hive) may create an empty ORC file
that has only a header and footer, without a schema.
If we call `_reader->createRowReader()` with selected columns,
it throws ParserError: Invalid column selected xx.
So here we first check the number of rows and skip such files.
This fix applies only to non-vectorized load; vectorized load uses the Arrow scanner
to read ORC files, which does not have this problem.
enhancement
- add functions `finalizeForNereids` and `finalizeImplForNereids` to stale expressions to generate some attributes used in BE
- remove unnecessary parameter `Analyzer` in function `getBuiltinFunction`
- swap the operands of a join condition if its left-hand expression refers to the right table
- change the join physical implementation to broadcast hash join
- add the push predicate rule to the planner
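The join-condition swap in the list above could look roughly like this. It is a sketch with assumptions: `EqualCondition` is a hypothetical class, operands are modelled as plain slot names, and each table is represented by the set of slot names it produces.

```java
import java.util.Set;

/** Hypothetical equality join condition: left = right. */
final class EqualCondition {
    final String left;
    final String right;

    EqualCondition(String left, String right) {
        this.left = left;
        this.right = right;
    }

    /**
     * Returns a condition whose left operand comes from the left table.
     * The operands are swapped only when they are on the wrong sides.
     */
    EqualCondition normalize(Set<String> leftTableSlots, Set<String> rightTableSlots) {
        if (rightTableSlots.contains(left) && leftTableSlots.contains(right)) {
            return new EqualCondition(right, left);
        }
        return this;
    }

    @Override
    public String toString() {
        return left + " = " + right;
    }
}
```

For example, with `t1` as the left table, `t2.id = t1.id` would be normalized to `t1.id = t2.id`, while a condition that is already ordered is returned unchanged.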
fix
- swap the visit order of join children to ensure the last fragment is the root
- avoid visiting the join's left child twice
known issues
- expression evaluation generates a wrong answer when the expression includes arithmetic with two literal children.
Add a rule to disassemble the logical aggregate node. This is necessary because our execution framework is distributed and an aggregate always executes in two steps: first aggregate locally, then merge the partial results.
Add fields to the logical aggregate to record whether the operator has been disassembled and which aggregate phase it belongs to, and add logic to map the new aggregate function to its stale definition to get the function's intermediate type.
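The two-step execution can be illustrated with a grouped `count`. This is a conceptual sketch, not the Nereids rule: `TwoPhaseCount` is a hypothetical class, partitions are plain lists of group keys, and the intermediate type is simply a map of partial counts.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoPhaseCount {
    /** Phase 1 (local): each node counts rows per key within its own partition. */
    static Map<String, Long> localCount(List<String> partition) {
        Map<String, Long> partial = new HashMap<>();
        for (String key : partition) {
            partial.merge(key, 1L, Long::sum);
        }
        return partial;
    }

    /** Phase 2 (merge): partial counts for the same key are summed. */
    static Map<String, Long> mergeCounts(List<Map<String, Long>> partials) {
        Map<String, Long> result = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, count) -> result.merge(key, count, Long::sum));
        }
        return result;
    }
}
```

Note that the merge phase of `count` is a sum, not another count, which is why the disassembled aggregate needs to know which phase it is in and what the intermediate type is.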
In the function `TextConverter::write_vec_column`, the statement `nullable_column->get_null_map_data().push_back(0);` should be executed for every row.
Otherwise the null map becomes incorrect and causes a core dump.
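The invariant behind this fix can be modelled in a language-neutral way. The sketch below is written in Java, not the BE's C++, and `NullableIntColumn` is a hypothetical class: it assumes a nullable column is a data array plus a parallel null map, so every appended row must add one entry to both.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy nullable column: one data slot and one null-map flag per row. */
final class NullableIntColumn {
    final List<Integer> data = new ArrayList<>();
    final List<Byte> nullMap = new ArrayList<>();

    /** Appends a non-null row: the flag 0 must be pushed for every such row. */
    void appendValue(int value) {
        data.add(value);
        nullMap.add((byte) 0);
    }

    /** Appends a null row: a default value keeps the data array aligned. */
    void appendNull() {
        data.add(0);
        nullMap.add((byte) 1);
    }

    /** The two arrays must always have the same length. */
    boolean isConsistent() {
        return data.size() == nullMap.size();
    }
}
```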
Nereids can now execute the query `select a from t;`
**enhancement**
- add a queryable interface for QueryStmt and LogicalPlanAdapter temporarily
- refactor GroupId; GroupId now extends doris.common.id
- GroupId is now generated by its memo, not globally yet
- add varchar type
- Nereids is enabled only when the vectorized engine is enabled
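The per-memo GroupId generation mentioned above might be shaped like this. All names are hypothetical stand-ins: `IdBase` plays the role of the common id base class, and `MemoWithIds` only shows the id counter, not a real memo.

```java
/** Hypothetical base class standing in for the common id type. */
abstract class IdBase {
    protected final int id;

    protected IdBase(int id) {
        this.id = id;
    }

    int asInt() {
        return id;
    }
}

final class GroupId extends IdBase {
    GroupId(int id) {
        super(id);
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof GroupId && ((GroupId) other).id == id;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(id);
    }
}

/** Each memo owns its own counter, so ids are unique per memo only. */
final class MemoWithIds {
    private int nextGroupId = 0;

    GroupId newGroupId() {
        return new GroupId(nextGroupId++);
    }
}
```

Because the counter lives in the memo, two different memos can hand out the same id value, which matches the "not global yet" caveat.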
**fix**
- set output and column label to logicalPlanAdapter
- set output expression on root fragment
- set select partition and select index id to OlapScanNode
- fix rule type mismatch when BatchRulesJob adds rules
- add all implementation rules to rule set
- getting the catalog column from SlotReference no longer returns null
- bind star correctly
- implement `isNullable` in expressions
**known issue**
- cannot do expression mapping (e.g. `a + 1`) on the project node (waiting for the intermediate tuple interface and projection ability in ExecNode in BE)
- aggregate does not work
- sort does not work
- filter does not work
- join does not work