* [enhancement](ldap) optimize LDAP authentication.
1. Support caching LDAP user information.
2. HTTP authentication supports LDAP.
3. LDAP temporary users support default user property.
4. LDAP configuration supports the `admin show config` and `admin set config` commands.
In the strict memory usage mode of STRICT_MEMORY_USE=ON, when the capacity of the vectorized Hash Table is greater than 2G, it starts to grow when 75% of the capacity is satisfied, the memory usage of the vectorized Join becomes 50% of the previous value.
STRICT_MEMORY_USE=ON` expects BE to use less memory, and gives priority to ensuring stability when the cluster memory is limited.
1. Fix Lru Cache MemTracker consumption value is negative.
2. Fix compaction Cache MemTracker has no track.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
Currently, the libhdfs3 library integrated by doris BE does not support accessing the cluster with kerberos authentication
enabled, and found that kerberos-related dependencies(gsasl and krb5) were not added when build libhdfs3.
so, this pr will enable kerberos support and rebuild libhdfs3 with dependencies gsasl and krb5:
- gsasl version: 1.8.0
- krb5 version: 1.19
This PR fixes the #8731 and refactor the `build.sh` script.
The build.sh script is currently responsible for the compilation of the following Doris components.
1. FE
- fe-common
- fe-core
- spark-dpp
- hive-udf
- java-udf
- ui
2. BE
- palo_be
- meta_tool
3. broker
In the FE module.
- The 4 submodules `fe-common, fe-core, spark-dpp and ui` together form Frontend.
- `spark-dpp, hive-udf and java-udf` can be compiled separately to produce jar packages for individual use.
In the BE module.
- `palo_be` can start the BE process separately.
- `meta_tool` can be compiled separately to produce binaries.
The modified build.sh script has the following changes:
1. there is no longer an option to compile `ui` separately, build together with `--fe`.
2. `fe/be/spark-dpp/hive-udf/java-udf/palo_be/meta_tool` can be compiled separately.
3. all components except `java-udf` will be compiled by default (`java-udf` is in development)
Remaining issues:
Several submodules of FE have messy dependencies.
For example, `java-udf` depends on `fe-core`, and `fe-core` depends on `spark-dpp`,
resulting in a large binary jar of `java-udf`.
It needs to be reorganized afterwards.
This feature is propsoed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF).
This PR support fixed-length input and output Java UDF. Phase I in DIP-1 is done after this PR.
To support Java UDF effeciently, I use no data copy in JNI call and all compute operations are off-heap in Java.
To achieve that, I use a UdfExecutor instead.
For users, a UDF class must have a public evaluate method.
Currently, the compiled output of BE mainly consists of two binaries:
palo_be and meta_tool, which are both around 1.6G in size.
However, the debug information is only needed for debugging purposes.
So I separate the debug info from binaries.
After BE is built, the debug info file will be saved in `be/lib/debug_info/` dir.
`palo_be` and `meta_tool`'s size decrease to about 100MB
This is optional, and default is disabled.
To enable it, use:
`STRIP_DEBUG_INFO=ON sh build.sh`
CMAKE_BUILD_DIR is set while building be. "build.sh --clean" just
cleans and exits, however clean_be does not works without
CMAKE_BUILD_DIR set. This patch set CMAKE_BUILD_DIR in clean_be
to teach build.sh --clean work correctly.
Added bprc stub cache check and reset api, used to test whether the bprc stub cache is available, and reset the bprc stub cache
add a config used for auto check and reset bprc stub
refactor runtime filter bloomfilter and eliminate some virtual function calls which obtained a performance improvement of about 5%
import block bloom filter, for avx version obtained 40% performance improvement
before: bloomfilter size:default, about 2000W item cost about 1s400ms
after: bloomfilter size:524288, about 2000W item cost about 400ms