runtime filter is shared among multi instances.
in the past, we cached pushdown expr(runtime filter generated)
every scannode[runtime filter consumer] will try to call prepare expr
but the expr may generated with different fn_context_id
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
With bellow json path
`["$.data","$.data.datatimestamp"]`
After `array_obj->PushBack` the `data` field owner will be taken from array_obj, and lead to null values for json path `$.data.datatimestamp`
Rapidjson doc:
```
//! Append a GenericValue at the end of the array.
\note The ownership of \c value will be transferred to this array on success.
*/
GenericValue& PushBack(GenericValue& value, Allocator& allocator);
```
Refactoring the filtering conditions in the current ExecNode from an expression tree to an array can simplify the process of adding runtime filters. It eliminates the need for complex merge operations and removes the requirement for the frontend to combine expressions into a single entity.
By representing the filtering conditions as an array, each condition can be treated individually, making it easier to add runtime filters without the need for complex merging logic. The array can store the individual conditions, and the runtime filter logic can iterate through the array to apply the filters as needed.
This refactoring simplifies the codebase, improves readability, and reduces the complexity associated with handling filtering conditions and adding runtime filters. It separates the conditions into discrete entities, enabling more straightforward manipulation and management within the execution node.
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
In #17976, we introduced small fix container to optimize the in expr. This PR will change small fix container size of In set to 8, which has better performance when size > 8 by the perf test.
When the be_exec_version is less than 2, murmurhash will still be used, otherwise crc32 will be used. When the be_exec_version is upgraded to 2, please remove.
For scan node with no vectorized predicate, the input column for the first short-circuit predicate is dense and we don't need to access the selector column.
This PR improve performance by ~30% on TPCH Q3.
rpc common is duplicate, all its method is included in function rpc. So that I remove it.
get_field_type is never used, remove it.
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
* [Feature](vectorized)(quantile_state): support vectorized quantile state functions
1. now quantile column only support not nullable
2. add up some regression test cases
3. set default enable_quantile_state_type = true
---------
Co-authored-by: spaces-x <weixiang06@meituan.com>