Refactor the config for log dir of FE and BE
TLDR:
- Use env variable `LOG_DIR` to set root log dir
- Remove `sys_log_dir` for FE and BE
Details:
1. FE
1. The root log dir is set by env variable `LOG_DIR` in `fe.conf`
2. The default value of `audit_log_dir` is same as `${LOG_DIR}/`
3. The default value of `spark_launcher_log_dir` is `${LOG_DIR}/spark_launcher_log`
4. The default value of `nereids_trace_log_dir` is `${LOG_DIR}/nereids_trace_log`
5. The origin `sys_log_dir` is deprecated, and default value is `""`.
But for compatibility, if user already set `sys_log_dir` before, Doris will still use it as root log dir.
2. BE
1. The root log dir is set by env variable `LOG_DIR` in `be.conf`
2. Remove `pipeline_tracing_log_dir`, use `${LOG_DIR}` directly.
3. The origin `sys_log_dir` is deprecated, and default value is `""`.
But for compatibility, if user already set `sys_log_dir` before, Doris will still use it as root log dir.
Issue Number: close#30484
problem:
gson will use Java's reflection mechanism to generate a default Adapter, but JDK17 is prohibited from visiting such an access.
solution:
gson has provided solutions since 2.9.1, which can bypass this problem: Add support for reflection access filter by Marcono1234 · Pull Request #1905 · google/gson
We need to upgrade the gson version and use this solution
tc/jemalloc_free_memory in web beip:8040/mem_tracker is the cache size of Jemalloc.
Previously lg_tcache_max:20, this will cache up to 1M Bin in the thread cache, which will cause the Jemalloc cache to be too large in some scenarios.
Restore the default lg_tcache_max:16, which can cache a maximum of 64K Bins.
If you are doing a performance POC, you can consider increasing it.
Fix hadoop short circuit reading can not enabled in some environments.
- Revert #21430 because it will cause performance degradation issue.
- Add `$HADOOP_CONF_DIR` to `$CLASSPATH`.
- Remove empty `hdfs-site.xml`. Because in some environments it will cause hadoop short circuit reading can not enabled.
- Copy the hadoop common native libs(which is copied from https://github.com/apache/doris-thirdparty/pull/98
) and add it to `LD_LIBRARY_PATH`. Because in some environments `LD_LIBRARY_PATH` doesn't contain hadoop common native libs, which will cause hadoop short circuit reading can not enabled.
As we know, log4j2 some times may be bottleneck in doris fe when there are many logs to be output in sync mode while asynchronous logging has a better performance, and we find that capturing caller location has a similar impact across all logging libraries, and slows down asynchronous logging by about 30-100x. so, here we provide three log mode for log4j2 to meet the needs of different users.
refer to https://logging.apache.org/log4j/2.x/performance.html
1. cast string literal to date like type should not be an implict cast
2. the string representation of float like type should not be scientific notation
3. the data type of like function's regex expr should be string type even if it's a null literal
4. add -Xss4m in fe.conf to prevent stack overflow in some case
1. Introduce hadoop libhdfs
2. For Linux-X86 platform, use the hadoop libhdfs
3. For other platform, use libhdfs3, because currently we don't have hadoop libhdfs binary for other platform
Co-authored-by: adonis0147 <adonis0147@gmail.com>
* [enhancement](ldap) optimize LDAP authentication.
1. Support caching LDAP user information.
2. HTTP authentication supports LDAP.
3. LDAP temporary users support default user property.
4. LDAP configuration supports the `admin show config` and `admin set config` commands.