Current situation of Doris is that the cluster is balanced, but the disks of a backend may be unbalanced.
for example, backend A have two disks: disk1 and disk2, disk1's usage is 98%, but disk2's usage is only 40%.
disk1 is unable to take more data, therefore only one disk of backend A can take new data,
the available write throughput of backend A is only half of its ability, and we can not resolve this through load or
partition rebalance now.
So we introduce disk rebalancer, disk rebalancer is different from other rebalancer(load or partition)
which take care of cluster-wide data balancing. it takes care about backend-wide data balancing.
[For more details see #8550](https://github.com/apache/incubator-doris/issues/8550)
1. add a config string_type_soft_limit to soft limit max length of string type
2. disable using String type in Key column, partition column and
distribution column
3. remove String type alias BLOB for futrue use
Support a lot of actions for regression testing framework.
e.g. thread, lazyCheck, onSuccess, connect, selectUnionAll, timer
Demo exists in ${DORIS_HOME}/regression-test/suites/demo
In pr #8476, all memory usage of a process is recorded in the process mem tracker,
and all memory usage of a query is recorded in the query mem tracker,
and it is still necessary to manually call `transfer to` to track the cached memory size.
We hope to separate out more detailed memory usage based on Hook TCMalloc new/delete + TLS mem tracker.
In this pr, the more detailed mem tracker is switched to TLS, which automatically and accurately
counts more detailed memory usage than before.
This is only a temporary fix its performance is not ideal. Finally,
we need to reconstruct the functions of `stddev` and delete the interface of `insert_to_null_default ()`.
Add a new column-type to speed up the approximation of quantiles.
1. The new column-type is named `quantile_state` with fixed aggregation function `quantile_union`, which stores the intermediate results of pre-aggregated approximation calculations for quantiles.
2. support pre-aggregation of new column-type and quantile_state related functions.
This feature is propsoed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF).
This PR support fixed-length input and output Java UDF. Phase I in DIP-1 is done after this PR.
To support Java UDF effeciently, I use no data copy in JNI call and all compute operations are off-heap in Java.
To achieve that, I use a UdfExecutor instead.
For users, a UDF class must have a public evaluate method.
Currently, the compiled output of BE mainly consists of two binaries:
palo_be and meta_tool, which are both around 1.6G in size.
However, the debug information is only needed for debugging purposes.
So I separate the debug info from binaries.
After BE is built, the debug info file will be saved in `be/lib/debug_info/` dir.
`palo_be` and `meta_tool`'s size decrease to about 100MB
This is optional, and default is disabled.
To enable it, use:
`STRIP_DEBUG_INFO=ON sh build.sh`
WindowFunctionLastData::add should keep the last value,
but current implementation keeps the first one.
Obviously, this code is copied from WindowFunctionFirstData::add.