The broadcast join cost is currently computed from the compressed data size,
so the amount of memory actually used may be significantly more than estimated.
This patch:
1. Adds a compression ratio to the broadcast join cost, set to 5 based on experience.
2. Adds a new session variable `auto_broadcast_join_threshold` to limit the memory used by broadcast, in bytes. The default value is 1073741824 (1 GB).
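The effect of the ratio and the threshold can be illustrated with a small sketch. The ratio of 5 and the 1 GB default come from the description above; the function names and the exact comparison rule are assumptions made for illustration and are not taken from the planner code.

```python
# Values from the patch description; names and comparison rule are hypothetical.
COMPRESSION_RATIO = 5
AUTO_BROADCAST_JOIN_THRESHOLD = 1073741824  # bytes (1 GB)


def estimated_broadcast_memory(compressed_bytes: int,
                               ratio: int = COMPRESSION_RATIO) -> int:
    """Estimate the in-memory size of the broadcast side from its compressed size."""
    return compressed_bytes * ratio


def fits_broadcast_limit(compressed_bytes: int,
                         threshold: int = AUTO_BROADCAST_JOIN_THRESHOLD) -> bool:
    """Return True if the estimated memory stays within the threshold."""
    return estimated_broadcast_memory(compressed_bytes) <= threshold


small = 100 * 1024 * 1024  # 100 MB compressed, about 500 MB estimated
large = 300 * 1024 * 1024  # 300 MB compressed, about 1500 MB estimated
print(fits_broadcast_limit(small))  # True
print(fits_broadcast_limit(large))  # False
```

With the old estimate (compressed size only), the 300 MB table would have looked well under 1 GB; with the ratio applied it does not.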
Add [IF NOT EXISTS] and [IF EXISTS] support to the following statements:
- CREATE [IF NOT EXISTS] USER
- CREATE [IF NOT EXISTS] ROLE
- DROP [IF EXISTS] USER
- DROP [IF EXISTS] ROLE
Support suite blocks that specify multiple groups.
TestAction supports comparing the result to an iterator, a local file, or an HTTP stream.
Support printing TeamCity service messages.
Abandon the logic of generating a groovy file for each sql file.
Support 3 levels of parallelism: script file, suite block, and thread action.
Support specifying JAVA_OPTS for the boot shell.
Avoid JVM metaspace OOM.
Use -d instead of -g to run the suites in specified directories; -g is now used to specify groups.
Currently, a Doris cluster can be balanced while the disks of a single backend are unbalanced.
For example, backend A has two disks, disk1 and disk2: disk1's usage is 98%, but disk2's usage is only 40%.
disk1 is unable to take more data, so only one disk of backend A can take new data,
and the available write throughput of backend A is only half of its capacity. We cannot resolve this through load or
partition rebalance now.
So we introduce the disk rebalancer. Unlike the other rebalancers (load or partition),
which take care of cluster-wide data balancing, it takes care of backend-wide data balancing.
[For more details see #8550](https://github.com/apache/incubator-doris/issues/8550)
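The backend A example can be made concrete with a short sketch. The 95% "full" cutoff and both helper names are hypothetical and chosen only for illustration; the actual rebalancer's criteria are described in the linked issue.

```python
from typing import Dict, List

# Hypothetical cutoff: a disk at or above this usage is treated as unable to take new data.
FULL_USAGE = 0.95


def writable_disks(disk_usage: Dict[str, float],
                   full_usage: float = FULL_USAGE) -> List[str]:
    """Return the disks of one backend that can still accept new data."""
    return sorted(name for name, usage in disk_usage.items() if usage < full_usage)


def usage_spread(disk_usage: Dict[str, float]) -> float:
    """Gap between the most and least used disk of one backend."""
    values = list(disk_usage.values())
    return max(values) - min(values)


backend_a = {"disk1": 0.98, "disk2": 0.40}
print(writable_disks(backend_a))           # ['disk2']
print(round(usage_spread(backend_a), 2))   # 0.58
```

Only one of the two disks is writable, which is why the backend's write throughput is roughly halved even though the cluster as a whole looks balanced.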
1. Add a config `string_type_soft_limit` to set a soft limit on the max length of the String type.
2. Disable using the String type in key columns, partition columns, and distribution columns.
3. Remove the String type alias BLOB, reserving it for future use.
Support many more actions in the regression testing framework,
e.g. thread, lazyCheck, onSuccess, connect, selectUnionAll, timer.
Demos can be found in ${DORIS_HOME}/regression-test/suites/demo
Add a new column type to speed up the approximation of quantiles.
1. The new column type is named `quantile_state` and has the fixed aggregation function `quantile_union`; it stores the intermediate results of pre-aggregated approximate quantile calculations.
2. Support pre-aggregation of the new column type and the `quantile_state`-related functions.
Currently, the compiled output of BE mainly consists of two binaries,
`palo_be` and `meta_tool`, which are both around 1.6 GB in size.
However, the debug information is only needed for debugging purposes,
so this patch separates the debug info from the binaries.
After BE is built, the debug info files are saved in the `be/lib/debug_info/` dir,
and the size of `palo_be` and `meta_tool` decreases to about 100 MB.
This is optional and disabled by default.
To enable it, use:
`STRIP_DEBUG_INFO=ON sh build.sh`
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G
Implement a new way of collecting memory statistics based on TCMalloc New/Delete hooks,
MemTracker, and TLS (thread-local storage). It is expected that all new/delete/malloc/free
memory operations of the BE process can be counted.
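The combination of allocation hooks, a tracker, and thread-local storage can be sketched as follows. This is a conceptual Python illustration only: the BE is C++ and the real hooks are installed in TCMalloc, so every class and function name here is hypothetical and the hooks are called by hand rather than by an allocator.

```python
import threading


class MemTracker:
    """Hypothetical tracker that counts bytes currently attributed to one task."""

    def __init__(self, label):
        self.label = label
        self.consumption = 0
        self._lock = threading.Lock()

    def consume(self, nbytes):
        with self._lock:
            self.consumption += nbytes

    def release(self, nbytes):
        with self._lock:
            self.consumption -= nbytes


_tls = threading.local()                # each thread sees its own attributes
process_tracker = MemTracker("process")  # fallback when nothing is attached


def attach(tracker):
    """Bind a tracker to the calling thread."""
    _tls.tracker = tracker


def current_tracker():
    return getattr(_tls, "tracker", process_tracker)


def new_hook(nbytes):
    """Stand-in for an allocation hook: charge the calling thread's tracker."""
    current_tracker().consume(nbytes)


def delete_hook(nbytes):
    """Stand-in for a deallocation hook: credit the calling thread's tracker."""
    current_tracker().release(nbytes)


def run_task(tracker, nbytes):
    attach(tracker)
    new_hook(nbytes)
    new_hook(nbytes)
    delete_hook(nbytes)


query_a = MemTracker("query_a")
query_b = MemTracker("query_b")
threads = [
    threading.Thread(target=run_task, args=(query_a, 100)),
    threading.Thread(target=run_task, args=(query_b, 256)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

new_hook(8)  # the main thread has no attached tracker, so the process tracker is charged
print(query_a.consumption, query_b.consumption, process_tracker.consumption)  # 100 256 8
```

Because the hook looks up the tracker through thread-local storage, allocation sites do not need to pass a tracker explicitly, which is what makes it plausible to count every allocation in the process.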