1. Fix the negative consumption value of the LRU cache MemTracker.
2. Fix the compaction cache MemTracker not tracking anything.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
Currently, the libhdfs3 library integrated into Doris BE does not support accessing clusters with Kerberos authentication
enabled, because the Kerberos-related dependencies (gsasl and krb5) were not added when building libhdfs3.
This PR therefore enables Kerberos support and rebuilds libhdfs3 with the gsasl and krb5 dependencies:
- gsasl version: 1.8.0
- krb5 version: 1.19
This PR fixes #8731 and refactors the `build.sh` script.
The `build.sh` script is currently responsible for compiling the following Doris components:
1. FE
- fe-common
- fe-core
- spark-dpp
- hive-udf
- java-udf
- ui
2. BE
- palo_be
- meta_tool
3. broker
In the FE module:
- The 4 submodules `fe-common, fe-core, spark-dpp and ui` together form Frontend.
- `spark-dpp, hive-udf and java-udf` can be compiled separately to produce jar packages for individual use.
In the BE module:
- `palo_be` can start the BE process separately.
- `meta_tool` can be compiled separately to produce binaries.
The modified build.sh script has the following changes:
1. There is no longer an option to compile `ui` separately; it is built together with `--fe`.
2. `fe/be/spark-dpp/hive-udf/java-udf/palo_be/meta_tool` can each be compiled separately.
3. All components except `java-udf` are compiled by default (`java-udf` is still in development).
Remaining issues:
Several submodules of FE have messy dependencies.
For example, `java-udf` depends on `fe-core`, and `fe-core` depends on `spark-dpp`,
resulting in a large binary jar of `java-udf`.
It needs to be reorganized afterwards.
This feature is proposed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF).
This PR supports Java UDFs with fixed-length input and output. Phase I of DSIP-1 is complete after this PR.
To support Java UDFs efficiently, JNI calls copy no data and all compute operations run off-heap in Java.
To achieve that, I use a UdfExecutor instead.
For users, a UDF class must have a public evaluate method.
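As a rough illustration of that requirement, a UDF class could look like the sketch below. The class name `AddOne`, its `Integer` signature, and the `EvaluateFinder` helper are hypothetical and not part of this PR; the helper only shows one way a caller might locate `evaluate` via reflection, not how UdfExecutor actually does it.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical UDF: the name and the Integer -> Integer signature are illustrative only.
// Declared package-private so the sketch fits in one file.
class AddOne {
    // The one requirement stated above: a public method named "evaluate".
    public Integer evaluate(Integer value) {
        return value == null ? null : value + 1;
    }
}

// Hypothetical helper (an assumption, not Doris code): finds the evaluate method.
class EvaluateFinder {
    static Method findEvaluate(Class<?> udfClass) {
        // getMethods() returns only public member methods, including inherited ones.
        for (Method m : udfClass.getMethods()) {
            if (m.getName().equals("evaluate") && !Modifier.isStatic(m.getModifiers())) {
                return m;
            }
        }
        throw new IllegalArgumentException(
                udfClass.getName() + " has no public evaluate method");
    }
}
```

A class without a public `evaluate` method would make `findEvaluate` throw, which mirrors the constraint placed on UDF authors.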
Currently, the compiled output of BE mainly consists of two binaries:
palo_be and meta_tool, which are both around 1.6G in size.
However, the debug information is only needed for debugging purposes.
So this PR separates the debug info from the binaries.
After BE is built, the debug info files are saved in the `be/lib/debug_info/` directory.
The sizes of `palo_be` and `meta_tool` decrease to about 100 MB.
This is optional and disabled by default.
To enable it, use:
`STRIP_DEBUG_INFO=ON sh build.sh`
CMAKE_BUILD_DIR is set while building BE. `build.sh --clean` just
cleans and exits; however, clean_be does not work without
CMAKE_BUILD_DIR set. This patch sets CMAKE_BUILD_DIR in clean_be
so that `build.sh --clean` works correctly.
Add brpc stub cache check and reset APIs, used to test whether the brpc stub cache is available and to reset it.
Add a config for automatically checking and resetting the brpc stub cache.
Refactor the runtime filter bloom filter and eliminate some virtual function calls, which gives a performance improvement of about 5%.
Import a block bloom filter; the AVX version gives about a 40% performance improvement.
before: bloom filter size: default, about 20 million items took about 1.4 s
after: bloom filter size: 524288, about 20 million items took about 400 ms
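For readers unfamiliar with the structure, the following is a minimal scalar sketch of a blocked bloom filter, not the code in this PR. Each key is hashed to one 256-bit block (eight 32-bit words) and sets one bit per word, so a lookup touches a single cache line and the eight words fit in one AVX2 register. The class name, the hash mixer, and the multiplier constants are assumptions chosen for illustration.

```java
// Illustrative blocked bloom filter (assumed design, scalar code, no AVX).
class BlockBloomFilter {
    private static final int WORDS_PER_BLOCK = 8; // 8 x 32 bits = one 256-bit block

    // Odd multipliers, one per word; the specific values are illustrative.
    private static final int[] SALT = {
        0x47b6137b, 0x44974d91, 0x8824ad5b, 0xa2b7289d,
        0x705495c7, 0x2df1424b, 0x9efc4947, 0x5c6bfb31
    };

    private final int numBlocks;
    private final int[] words;

    BlockBloomFilter(int numBlocks) {
        if (numBlocks <= 0 || numBlocks > Integer.MAX_VALUE / WORDS_PER_BLOCK) {
            throw new IllegalArgumentException("invalid block count: " + numBlocks);
        }
        this.numBlocks = numBlocks;
        this.words = new int[numBlocks * WORDS_PER_BLOCK];
    }

    // 64-bit finalizer from MurmurHash3, used here as a cheap hash for long keys.
    private static long mix(long key) {
        key ^= key >>> 33;
        key *= 0xff51afd7ed558ccdL;
        key ^= key >>> 33;
        key *= 0xc4ceb9fe1a85ec53L;
        key ^= key >>> 33;
        return key;
    }

    // Upper 32 hash bits select the block via multiply-shift.
    private int blockStart(long hash) {
        long high = hash >>> 32;                       // unsigned upper half
        int block = (int) ((high * numBlocks) >>> 32); // in [0, numBlocks)
        return block * WORDS_PER_BLOCK;
    }

    // Lower 32 hash bits select one bit (0..31) in word i of the block.
    private static int bitMask(int low, int i) {
        return 1 << ((low * SALT[i]) >>> 27);
    }

    void add(long key) {
        long hash = mix(key);
        int start = blockStart(hash);
        int low = (int) hash;
        for (int i = 0; i < WORDS_PER_BLOCK; i++) {
            words[start + i] |= bitMask(low, i);
        }
    }

    // False means definitely absent; true means possibly present.
    boolean mightContain(long key) {
        long hash = mix(key);
        int start = blockStart(hash);
        int low = (int) hash;
        for (int i = 0; i < WORDS_PER_BLOCK; i++) {
            if ((words[start + i] & bitMask(low, i)) == 0) {
                return false;
            }
        }
        return true;
    }
}
```

The fixed loop over eight independent words is what a vectorized version would collapse into a handful of SIMD instructions.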
1. Fix the thirdparty build failing on some OSes, either because the default lib path may be `lib` or `lib64`, or because the `arrow` build fails on `brotli` and `zstd`.
2. Fix `.tar.bz2` files failing to extract.
Add Ninja build system support. If Ninja is installed, you can build BE with it using `bash build.sh --be --ninja`.
Ninja builds are faster than make.
This is the last PR of proposal #4308
1. Add a new FE config `enable_http_server_v2` to enable new HTTP Server implementation. The default value is false.
2. Add a new FE config `http_api_extra_base_path` so that we can set base path for Frontend UI.
3. Refactor the new HTTP API response body. The returned HTTP status code is always 200, and an internal code in the response body indicates the specific error.
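
A response body following this convention might be modeled as in the sketch below. The field names `code`, `msg` and `data`, the value 0 for success, and the class itself are assumptions for illustration; the actual response schema is defined by the FE code.

```java
// Hypothetical response envelope: the HTTP status stays 200 and the
// outcome is carried by an internal code inside the body.
class ApiResponse {
    static final int HTTP_STATUS = 200; // always 200, per the description above

    final int code;    // internal code; 0 means success in this sketch (assumption)
    final String msg;  // human-readable message
    final String data; // payload as a plain string, or null on error

    private ApiResponse(int code, String msg, String data) {
        this.code = code;
        this.msg = msg;
        this.data = data;
    }

    static ApiResponse ok(String data) {
        return new ApiResponse(0, "success", data);
    }

    static ApiResponse error(int code, String msg) {
        return new ApiResponse(code, msg, null);
    }

    boolean isSuccess() {
        return code == 0;
    }

    // Minimal JSON rendering; escapes only backslashes and double quotes.
    String toJson() {
        return "{\"code\":" + code
                + ",\"msg\":\"" + escape(msg) + "\""
                + ",\"data\":" + (data == null ? "null" : "\"" + escape(data) + "\"")
                + "}";
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

With this shape, a client checks the `code` field in the body rather than the HTTP status line to detect failures.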
1. Disable the MySQL client and LZO library by default when building Doris.
MySQL client library is used for MySQL external table feature.
This feature will be replaced by the new ODBC external table soon.
LZO library is used to compress/decompress data in some old Doris data formats,
which are no longer used.
2. Add missing license to some files.
3. All non-Apache-licensed code is explained in the NOTICE file, with the corresponding license declared.
4. Remove the JS source code from webroot; it will be downloaded as a thirdparty dependency.
This CL mainly changes:
1. Add 2 new FE modules
    1. fe-common
       Holds all common classes for other modules; currently only `jmockit`.
    2. spark-dpp
       The Spark DPP application for Spark Load. All DPP-related classes, including unit tests, were moved to this module.
2. Change `build.sh`
    Add a new param `--spark-dpp` to compile `spark-dpp` alone, while `--fe` compiles all FE modules.
    The output of the `spark-dpp` module is `spark-dpp-1.0.0-jar-with-dependencies.jar`, and it is installed to `output/fe/spark-dpp/`.
3. Fix some bugs in Spark Load
This CL mainly changes:
1. Reorganized the code logic to limit the supported JSON formats to two, so that import behavior is more consistent.
2. Modified how error rows are counted when loading in JSON format, so that they are counted correctly.
3. See `load-json-format.md` to get details of loading json format.
The format of some docs is incorrect for building the doc website.
* Fix a bug where the `gensrc` dir cannot be built with `-j`.
* Fix a UT bug in CreateFunctionTest.