In #8319, I removed the mysql-connector-java dependency because of license incompatibility.
But we still need a MySQL-compatible driver for the HTTP query API, so I chose mariadb-java-client,
which is licensed under the LGPL.
This PR fixes #8731 and refactors the `build.sh` script.
The `build.sh` script is currently responsible for compiling the following Doris components:
1. FE
   - `fe-common`
   - `fe-core`
   - `spark-dpp`
   - `hive-udf`
   - `java-udf`
   - `ui`
2. BE
   - `palo_be`
   - `meta_tool`
3. Broker
In the FE module:
- The four submodules `fe-common`, `fe-core`, `spark-dpp`, and `ui` together form the Frontend.
- `spark-dpp`, `hive-udf`, and `java-udf` can be compiled separately to produce jar packages for standalone use.

In the BE module:
- `palo_be` can be run on its own to start the BE process.
- `meta_tool` can be compiled separately to produce a standalone binary.
The modified `build.sh` script has the following changes:
1. There is no longer an option to compile `ui` separately; it is built together with `--fe`.
2. `fe`, `be`, `spark-dpp`, `hive-udf`, `java-udf`, `palo_be`, and `meta_tool` can each be compiled separately.
3. All components except `java-udf` are compiled by default (`java-udf` is still in development).
Remaining issues:
Several submodules of FE have messy dependencies.
For example, `java-udf` depends on `fe-core`, and `fe-core` depends on `spark-dpp`,
resulting in a very large `java-udf` jar.
These dependencies need to be reorganized in a follow-up.
1. Remove some unused code.
2. Handle mini load jobs with the wrong state:
   1. For historical reasons, some mini load jobs in the LOADING state were never cleared.
      As a result, new load jobs cannot be committed.
   2. If a mini load job is created right before an FE restart, the job stays in the PENDING state forever,
      but it should eventually be removed.
After `VDataStreamRecvr::SenderQueue::close` clears `_block_queue`, calling
`VDataStreamRecvr::SenderQueue::add_block` again will cause a memory leak.
So this PR changes the position of the lock, matching the other `add_block` and `add_batch` implementations.
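For illustration, here is a minimal C++ sketch of the idea, assuming a simplified `SenderQueue` with `_lock`, `_is_cancelled`, and `_block_queue` members. The names mirror the ones mentioned above, but the surrounding code is only a sketch, not the actual Doris implementation: the point is that `add_block` takes the lock and checks the closed/cancelled flag before touching the queue, so a block that arrives after `close` is dropped instead of leaked.

```cpp
#include <condition_variable>
#include <deque>
#include <memory>
#include <mutex>

// Illustrative stand-in for the real vectorized Block type.
struct Block {};

class SenderQueue {
public:
    // Take _lock first and bail out if the queue has already been closed,
    // mirroring the other add_block/add_batch paths. Without this early
    // check under the lock, a block added after close() would never be
    // consumed or freed.
    void add_block(std::unique_ptr<Block> block) {
        std::unique_lock<std::mutex> lock(_lock);
        if (_is_cancelled) {
            // Queue already closed: drop the block here so nothing leaks.
            return;
        }
        _block_queue.push_back(std::move(block));
        _data_arrival_cv.notify_one();
    }

    void close() {
        std::unique_lock<std::mutex> lock(_lock);
        _is_cancelled = true;
        _block_queue.clear();  // any blocks still queued are released here
        _data_arrival_cv.notify_all();
    }

private:
    std::mutex _lock;
    std::condition_variable _data_arrival_cv;
    bool _is_cancelled = false;
    std::deque<std::unique_ptr<Block>> _block_queue;
};
```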
The current situation in Doris is that the cluster may be balanced while the disks of a single backend are unbalanced.
For example, backend A has two disks, disk1 and disk2; disk1's usage is 98%, but disk2's usage is only 40%.
Since disk1 cannot take more data, only one disk of backend A can accept new writes, so the available write
throughput of backend A is only half of its capacity, and this cannot be resolved by load or partition
rebalancing today.
So we introduce the disk rebalancer. It differs from the other rebalancers (load and partition), which take care of
cluster-wide data balancing: the disk rebalancer takes care of backend-wide data balancing, as sketched below.
[For more details see #8550](https://github.com/apache/incubator-doris/issues/8550)
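As a rough illustration of the backend-wide check (a hedged sketch only: the function name, threshold, and usage representation are assumptions made for this example, not the actual Doris code), the disk rebalancer looks at the usage gap between the disks of one backend rather than at differences between backends:

```cpp
#include <algorithm>
#include <vector>

// disk_usage holds the usage ratio (used / capacity) of each disk of one
// backend, e.g. {0.98, 0.40} for the disk1/disk2 example above. The 0.3
// threshold is an arbitrary value chosen for illustration.
bool backend_disks_unbalanced(const std::vector<double>& disk_usage,
                              double skew_threshold = 0.3) {
    if (disk_usage.size() < 2) {
        return false;  // nothing to balance inside a single-disk backend
    }
    auto [min_it, max_it] =
            std::minmax_element(disk_usage.begin(), disk_usage.end());
    // Cluster-level rebalancers compare usage across backends; the disk
    // rebalancer instead compares the disks of one backend and triggers
    // when the gap between its fullest and emptiest disk is too large.
    return (*max_it - *min_it) > skew_threshold;
}

// For backend A above the gap is 0.98 - 0.40 = 0.58 > 0.3, so tablets would
// be moved from disk1 to disk2 even though the cluster as a whole is balanced.
```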