Like #15641, we should reduce the size of executables on macOS arm64. Otherwise, we can not run doris_be and doris_be_test with ASAN build type on macOS arm64 now.
In the previous implementation, `bin/start_fe.sh --version` will
complain that "Frontend running as process xxx. Stop it first."
To show version
1. `bin/start_fe.sh --version` will print version info to fe.out
2. `bin/start_fe.sh --console --version` will print version info to stdout
Modify the BE fd number check to 60000,
because the default fd number value of some system is 65535, which is smaller than previous threshold 65536,
so reduce to 60000 to let Doris start normally in most of system.
- sh start_fe/start_be --console is used to instruct the program to run in console mode.
- sh start_fe/start_be --daemon is used to instruct the program to run in daemon mode.
- sh start_fe/start_be used starts as a background execution, records output and error logs to the specified file
Sometimes, user need to add some custom libs to the cluster, such lzo.jar, orai18n.jar, etc.
In previous, these lib files are places in fe/lib or be/lib.
But when upgrading cluster, the lib dir will be replaced by new lib dir, so that all custom libs are lost.
In this PR, I add new dir custom_lib for FE and BE, and user can place custom lib files in it.
- allow put custom jar in `${DORIS_HOME}/lib/java_extensions/custom_extension` such as `paimon-s3-0.4.0-incubating.jar`
- add some note for paimon and fqdn
Add scanner isolation class loader to make each plugin non-conflicting.
The BE will get scanner classes by JNI call and use JniClassLoader load them.
In the last version,we always get canner classes from the system class path by default,
so it cannot isolate the classes for each scanner
Jemalloc heap profile follows libgcc's way of backtracing by default.
rewrites dl_iterate_phdr will cause Jemalloc to fail to run after enable profile.
TODO, two solutions:
- Jemalloc specifies GNU libunwind as the prof backtracing way, but my test failed,
--enable-prof-libunwind not work: --enable-prof-libunwind not work jemalloc/jemalloc#2504
- ClickHouse/libunwind solves Jemalloc profile backtracing, but the branch of ClickHouse/libunwind
has been out of touch with GNU libunwind and LLVM libunwind, which will leave the fate to others.
Fix hadoop short circuit reading can not enabled in some environments.
- Revert #21430 because it will cause performance degradation issue.
- Add `$HADOOP_CONF_DIR` to `$CLASSPATH`.
- Remove empty `hdfs-site.xml`. Because in some environments it will cause hadoop short circuit reading can not enabled.
- Copy the hadoop common native libs(which is copied from https://github.com/apache/doris-thirdparty/pull/98
) and add it to `LD_LIBRARY_PATH`. Because in some environments `LD_LIBRARY_PATH` doesn't contain hadoop common native libs, which will cause hadoop short circuit reading can not enabled.
1. fix concurrency bug of s3 fs benchmark tool, to avoid crash on multi thread.
2. Add `prefetch_read` operation to test prefetch reader.
3. add `AWS_EC2_METADATA_DISABLED` env in `start_be.sh` to avoid call ec2 metadata when creating s3 client.
4. add `AWS_MAX_ATTEMPTS` env in `start_be.sh` to avoid warning log of s3 sdk.
Use spark-bundle to read hudi data instead of using hive-bundle to read hudi data.
**Advantage** for using spark-bundle to read hudi data:
1. The performance of spark-bundle is more than twice that of hive-bundle
2. spark-bundle using `UnsafeRow` can reduce data copying and GC time of the jvm
3. spark-bundle support `Time Travel`, `Incremental Read`, and `Schema Change`, these functions can be quickly ported to Doris
**Disadvantage** for using spark-bundle to read hudi data:
1. More dependencies make hudi-dependency.jar very cumbersome(from 138M -> 300M)
2. spark-bundle only provides `RDD` interface and cannot be used directly
Optimize the usage of fs benchmark tool:
1. Remove `Open` benchmark, it is useless.
2. Remove `Delete` benchmark, it is dangerous.
3. Add `SingleRead` benchmark, user can specify an exist file to test read operation:
`sh bin/run-fs-benchmark.sh --conf=conf/hdfs_read.conf --fs_type=hdfs --operation=single_read`
4. Modify the `run-fs-benchmark.sh`, remove `OPTS` section, use options in `fs_benchmark_tool` directly
5. Add some custom counters in the benchmark result, eg:
```
--------------------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations UserCounters...
--------------------------------------------------------------------------------------------------------------------------------
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 6864 ms 2385 ms 1 ReadRate=200.936M/s
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 3919 ms 1828 ms 1 ReadRate=351.96M/s
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1 3839 ms 1819 ms 1 ReadRate=359.265M/s
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_mean 4874 ms 2011 ms 3 ReadRate=304.054M/s
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_median 3919 ms 1828 ms 3 ReadRate=351.96M/s
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_stddev 1724 ms 324 ms 3 ReadRate=89.3768M/s
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_cv 35.37 % 16.11 % 3 ReadRate=29.40%
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_max 6864 ms 2385 ms 3 ReadRate=359.265M/s
HdfsReadBenchmark/iterations:1/repeats:3/manual_time/threads:1_min 3839 ms 1819 ms 3 ReadRate=200.936M/s
```
- For `open_read` and `single_read`, add `ReadRate` as `bytes per second`.
- For `create_write`, add `WriteRate` as `bytes per second`.
- For `exists` and `rename`, add `ExistsCost` and `RenameCost` as `time cost per one operation`.
1. Introduce hadoop libhdfs
2. For Linux-X86 platform, use the hadoop libhdfs
3. For other platform, use libhdfs3, because currently we don't have hadoop libhdfs binary for other platform
Co-authored-by: adonis0147 <adonis0147@gmail.com>
Remove page cache regular clear
Now the page cache is turned off by default. If the user manually opens the page cache, it can be considered that the user can accept the memory usage of the page cache, and then can consider adding a manual clear command to the cache.
fix memory gc cancel top memory query
jemalloc prof is not enabled by default
We set LIBHDFS3_CONF env in start_be.sh, so libhdfs3 will try to read this hdfs-site.xml,
if file does not exist, it will throw error. But Doris does not handle this error, cause BE crash.
This CL mainly changes:
Modify start_be.sh to only set LIBHDFS3_CONF if hdfs-site.xml exist.
Refactor the HDFSCommonBuilder so that it can return error correctly.
Add BE IP info in status, so that we can get ip from error msg like:
ERROR 1105 (HY000): errCode = 2, detailMessage = [INTERNAL_ERROR]failed to init reader for file 000.snappy.orc, err:
[INTERNAL_ERROR][172.21.0.101]failed to init HDFSCommonBuilder, please check check be/conf/hdfs-site.xml
The logic of prefer compute node is wrong, which causing the external table query can only assign up to 3 backends.
This CL refactor this logic and also change some FE config:
prefer_compute_node_for_external_table
If set to true, query on external table will prefer to assign to compute node.
And the max number of compute node is controlled by min_backend_num_for_external_table.
If set to false, query on external table will assign to any node.
min_backend_num_for_external_table
Only take effect when prefer_compute_node_for_external_table is true.
If the compute node number is less than this value, query on external table will try to get some mix node
to assign, to let the total number of node reach this value.
If the compute node number is larger than this value, query on external table will assign to compute node only.
As discussed in 16107
Sometimes jvm would try to reduce the whole stack to just one line, it's kind of confusing for debugging.
Issue Number: close #xxx
1. Spark dpp
Move `DppResult` and `EtlJobConfig` to sparkdpp package in `fe-common` module.
So taht `fe-core` is longer depends on `spark-dpp` module, so that the `spark-dpp.jar`
will not be moved into `fe/lib`, which reduce the size of FE output.
2. Modify start_fe.sh
Modify the CLASSPATH to make sure that doris-fe.jar is at front, so that
when loading classes with same qualified name, it will be got from doris-fe.jar firstly.
3. Upgrade hadoop and hive version
hadoop: 2.10.2 -> 3.3.3
hive: 2.3.7 -> 3.1.3
4. Override the IHiveMetastoreClient implementations from dependency
`ProxyMetaStoreClient.java` for Aliyun DLF.
`HiveMetaStoreClient.java` for origin Apache Hive metastore.
Because I need to modified some of their method to make them compatible with
different version of Hive.
5. Exclude some unused dependencies to reduce the size of FE output
Now it is only 370MB (Before is 600MB)
6. Upgrade aws-java-sdk version to 1.12.31
7. Support AWS Glue Data Catalog
8. Remove HudiScanNode(no longer support)