In this PR, I have refactored the initialization of the FunctionSet. Previously, all the functions were in one large method which led to the generation of Java code that was too long. This posed a problem for the compiler, as the length of the method exceeded the limit imposed by the Java compiler.
To resolve this issue and improve the readability and manageability of our code, I have categorized these functions by type, and created dedicated initialization methods for each type. As such, our code is now not only more readable and understandable, but also each method is of a length that is acceptable to the compiler and can be compiled successfully.
Moreover, this change makes it easier for us to add new functions as we can directly locate the right category and add new functions there.
This is a significant change aimed at enhancing the maintainability and scalability of our code, while ensuring that our code can be successfully compiled.
Inspired by c++ function `std::vector::emplace_back()`, we can use variadic template for this issue.
e.g.
```
[['struct'], 'STRUCT<TYPES>', ['TYPES'], 'ALWAYS_NOT_NULLABLE', ['TYPES...']]
```
`...TYPES` in template_types defines a variadic template `TYPE`. Then the variadic template will be expanded to multiple normal templates based on actual input arguments at runtime in FE.
But make sure `TYPES...` is placed on the last position in all template type arguments.
BTW, the origin template function logic is not affected.
A new way just like c++ template is proposed in this PR. The previous functions can be defined much simpler using template function.
# map element extract template function
[['element_at', '%element_extract%'], 'E', ['ARRAY<E>', 'BIGINT'], 'ALWAYS_NULLABLE', ['E']],
# map element extract template function
[['element_at', '%element_extract%'], 'V', ['MAP<K, V>', 'K'], 'ALWAYS_NULLABLE', ['K', 'V']],
BTW, the plain type function is not affected and the legacy ARRAY_X MAP_K_V is still supported for compatability.
This commit support:
1、Insert + select for struct/map type
2、Json stream load for struct type
3、m[key] function for map type
How to use:
Set the fe config to create table for struct and map type
1、admin set frontend config("enable_struct_type" = "true");
2、admin set frontend config("enable_map_type" = "true");
#16547
Co-authored-by: xy720 <xuyang25@baidu.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
# Proposed changes
Issue Number: close#6238
Co-authored-by: HappenLee <happenlee@hotmail.com>
Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
Co-authored-by: wangbo <506340561@qq.com>
Co-authored-by: emmymiao87 <522274284@qq.com>
Co-authored-by: Pxl <952130278@qq.com>
Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
Co-authored-by: thinker <zchw100@qq.com>
Co-authored-by: Zeno Yang <1521564989@qq.com>
Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
Co-authored-by: xinghuayu007 <1450306854@qq.com>
Co-authored-by: weizuo93 <weizuo@apache.org>
Co-authored-by: yiguolei <guoleiyi@tencent.com>
Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
Co-authored-by: awakeljw <993007281@qq.com>
Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>
## Problem Summary:
### 1. Some code from clickhouse
**ClickHouse is an excellent implementation of the vectorized execution engine database,
so here we have referenced and learned a lot from its excellent implementation in terms of
data structure and function implementation.
We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers.**
The following comment has been added to the code from Clickhouse, eg:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris
### 2. Support exec node and query:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node
You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set.
### 3. Data Model
Vec Exec Engine Support **Dup/Agg/Unq** table, Support Block Reader Vectorized.
Segment Vec is working in process.
### 4. How to use
1. Set the environment variable `set enable_vectorized_engine = true; `(required)
2. Set the environment variable `set batch_size = 4096; ` (recommended)
### 5. Some diff from origin exec engine
https://github.com/doris-vectorized/doris-vectorized/issues/294
## Checklist(Required)
1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
This CL modify the `evalExpr()` of ExpressionFunctions, so that it won't change the
`FunctionCallExpr` to `NullLiteral` when there is null parameter in UDF. Which will fix the
problem described in ISSUE: #2913
1. Remove all design docs. They will be pushed again after modification.
2. Add streaming load and privilege help docs.
3. Rename palo*.py to doris*.py in gensrc/script/.