* Refactor thirdparty.cmake by add_thirdparty function
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
* Refactor thirdparty.cmake
- Rename add_library NOADD to NOTADD
- Fix tcmalloc not add to COMMON_THIRDPARTY
- Fix libintl in OS_MACOSX
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
---------
Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>
Introduce libunwind get stack trace, cost is negligible and has line numbers.
use StackTraceCache, PHDRCache speed up, is customizable and has some optimizations.
Other stack trace tools remain: glog, boost, glibc, in case for need.
TODO:
currently support linux __x86_64__, __arm__, __powerpc__, not supported __FreeBSD__, APPLE
Note: __arm__, __powerpc__ not been verified
Support signal handle
libunwid support unw_backtrace for jemalloc
Use of undefined compile option USE_MUSL for later
Fix hadoop short circuit reading can not enabled in some environments.
- Revert #21430 because it will cause performance degradation issue.
- Add `$HADOOP_CONF_DIR` to `$CLASSPATH`.
- Remove empty `hdfs-site.xml`. Because in some environments it will cause hadoop short circuit reading can not enabled.
- Copy the hadoop common native libs(which is copied from https://github.com/apache/doris-thirdparty/pull/98
) and add it to `LD_LIBRARY_PATH`. Because in some environments `LD_LIBRARY_PATH` doesn't contain hadoop common native libs, which will cause hadoop short circuit reading can not enabled.
Add an optional executable binary fs_benchmark_tool, for test the performance of file system such as hdfs, s3.
Usage:
./fs_benchmark_tool --conf my.conf --fs_type=s3 --operation=read --iterations=5
in my.conf, you can add any config key value with following format:
key1=value1
key2=value2
By default, this binary will not be built. Only build it when setting BUILD_FS_BENCHMARK=ON.
The binary will be installed in output/be/lib.
For developer, you can add new subclass of BaseBenchmark to add your own benchmark.
See be/src/io/fs/benchmark/s3_benchmark.hpp for an example
Supplement the documentation of be-clion-dev, avoid the problem of undefined DORIS_JAVA_HOME and inability to find jni.h when using clion development without directly compiling through build.sh
Complete the classification of header files in pch.h and introduce some header files that are not frequently modified in doris.
Separate the declaration and definition in common/config.h. If you need to modify the default configuration now, please modify it in common/config.cpp.
gen_cpp/version.h is regenerated every time it is recompiled, which may cause PCH to fail, so now you need to get the version information indirectly rather than directly.
1. Introduce hadoop libhdfs
2. For Linux-X86 platform, use the hadoop libhdfs
3. For other platform, use libhdfs3, because currently we don't have hadoop libhdfs binary for other platform
Co-authored-by: adonis0147 <adonis0147@gmail.com>
This PR ports the codebase to Clang-16.
Upgrade some third-party libraries:
1. Apache BRPC: 1.2.0 -> 1.4.0 (Some bugs are fixed and all patches for 1.2.0 can be removed.)
2. Boost: 1.73.0 -> 1.81.0 (Porting to Clang-16)
3. libclucene: 2.4.6 -> 2.4.8 (Porting to Clang-16)
Follow #17586.
This PR mainly changes:
Remove env/
Remove FileUtils/FilesystemUtils
Some methods are moved to LocalFileSystem
Remove olap/file_cache
Add s3 client cache for s3 file system
In my test, the time of open s3 file can be reduced significantly
Fix cold/hot separation bug for s3 fs.
This is the last PR of #17764.
After this, all IO operation should be in io/fs.
Except for tests in #17586, I also tested some case related to fs io:
clone
concurrency query on local/s3/hdfs
load error log create and clean
disk metrics
1. change PipelineTaskState to enum class
2. remove some row-based code on FoldConstantExecutor::_get_result
3. reduce memcpy on minmax runtime filter function(Now we can guarantee that the input data is aligned)
4. add Wunused-template check, and remove some unused function, change some static function to inline function.
According to the post https://developer.apple.com/forums/thread/676684, the executable whose size is bigger than 2G may fail to start. The size of the executable `doris_be_test` generated by run-be-ut.sh is 2.1G (> 2G) now and we can't run it on macOS (arm64).
We can separate the debug info from the executable `doris_be_test` to reduce the size. After that, we can run `doris_be_test` successfully.
test data: https://data.gharchive.org/2020-11-13-18.json.gz, 2GB, 197696 lines
before: String 13s vs. JSONB 28s
after: String 13s vs. JSONB 16s
**NOTICE: simdjson need to be patched since BOOL is conflicted with a macro BOOL defined in odbc sqltypes.h**