- Implement ORC lazy materialization, integrating with the implementation in https://github.com/apache/doris-thirdparty/pull/56 and https://github.com/apache/doris-thirdparty/pull/62.
- Refactor code: move `execute_conjuncts()` and `execute_conjuncts_and_filter_block()` from `parquet_group_reader` to `VExprContext`, so that both the parquet reader and the orc reader can use them.
- Add session variables `enable_parquet_lazy_materialization` and `enable_orc_lazy_materialization` to control whether lazy materialization is enabled.
- Modify `build.sh` to update apache-orc submodule or download package every time.
After #19246, compiling FE automatically generates the Config and Session Variables docs and overwrites the original ones.
This needs to be avoided because the generated docs are not ready to use yet.
1. Add an update path for the apache-orc git submodule, so that not all modules are updated
When running `sh build.sh`, updating all modules fails several times because of the unstable GitHub network.
This wastes a lot of time.
2. Add a gitignore entry for be/src/apache-orc/ to avoid accidental commits.
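Updating a single submodule path instead of every module could look roughly like the sketch below. The path `be/src/apache-orc` is taken from the description above; the helper name `update_one_submodule` and the `DRY_RUN` switch are hypothetical and are not taken from `build.sh`.

```shell
# Sketch: update only one submodule path instead of all submodules.
# update_one_submodule and DRY_RUN are hypothetical names for illustration.
update_one_submodule() {
    path="$1"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command that would be run.
        echo "git submodule update --init --recursive -- ${path}"
    else
        # Fetch and check out just this submodule; returns git's exit status.
        git submodule update --init --recursive -- "${path}"
    fi
}

# Usage (from the repository root):
#   update_one_submodule be/src/apache-orc
```

Limiting the update to one path means a network failure in an unrelated submodule does not block the build.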
1. Organize http documents
2. Add http interface authentication for FE
3. **Support https interface for FE**
4. Provide authentication interface
5. Add http interface authentication for BE
6. Support https interface for BE
1. Introduce hadoop libhdfs
2. For Linux-X86 platform, use the hadoop libhdfs
3. For other platforms, use libhdfs3, because we currently don't have a hadoop libhdfs binary for them
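The platform rule above can be sketched as a small selection function. The function name and the returned labels are hypothetical; the actual selection is done in the build scripts and may be expressed differently.

```shell
# Sketch of the platform rule: Linux on x86_64 uses the hadoop libhdfs,
# every other platform falls back to libhdfs3.
# select_hdfs_client and its output labels are hypothetical.
select_hdfs_client() {
    os="$1"    # e.g. the output of `uname -s`
    arch="$2"  # e.g. the output of `uname -m`
    if [ "${os}" = "Linux" ] && [ "${arch}" = "x86_64" ]; then
        echo "hadoop-libhdfs"
    else
        echo "libhdfs3"
    fi
}

# Usage: report the choice for the current machine.
select_hdfs_client "$(uname -s)" "$(uname -m)"
```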
Co-authored-by: adonis0147 <adonis0147@gmail.com>
This PR ports the codebase to Clang-16.
Upgrade some third-party libraries:
1. Apache BRPC: 1.2.0 -> 1.4.0 (Some bugs are fixed and all patches for 1.2.0 can be removed.)
2. Boost: 1.73.0 -> 1.81.0 (Porting to Clang-16)
3. libclucene: 2.4.6 -> 2.4.8 (Porting to Clang-16)
As part of the Inverted Index DSIP, we'd like to contribute our inverted index implementation step by step.
First, we need to introduce clucene into the Doris third-party libs, because the inverted index implementation is based on
the lucene API and index file format. We have also added our own features and performance improvements on top of clucene, so we
need to maintain the repo ourselves.
According to the post https://developer.apple.com/forums/thread/676684, an executable larger than 2G may fail to start. The executable `doris_be_test` generated by run-be-ut.sh is now 2.1G (> 2G), so we can't run it on macOS (arm64).
We can separate the debug info from the executable `doris_be_test` to reduce the size. After that, we can run `doris_be_test` successfully.
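A quick way to check whether a binary is over the limit is sketched below. It assumes "2G" means 2 GiB (2147483648 bytes), which the post does not state precisely; the helper name `exceeds_size_limit` and the `LIMIT_BYTES` variable are hypothetical.

```shell
# Sketch: report whether a file is larger than the size limit.
# Assumption: "2G" is read as 2 GiB; LIMIT_BYTES and the helper are hypothetical.
LIMIT_BYTES=2147483648

exceeds_size_limit() {
    file="$1"
    # wc -c is portable across macOS and Linux; the arithmetic expansion
    # strips the leading spaces that some wc implementations print.
    size=$(( $(wc -c < "${file}") ))
    [ "${size}" -gt "${LIMIT_BYTES}" ]
}

# Usage:
#   if exceeds_size_limit doris_be_test; then
#       echo "doris_be_test is over the limit"
#   fi
```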
Support Aliyun DLF
Support data on s3-compatible object storage, such as aliyun oss.
Refactor some catalog interfaces to make them tidier.
Fix the bug that the default text format field delimiter of hive should be \x01.
Add a new class PooledHiveMetaStoreClient to wrap the IMetaStoreClient.
Some errors occur when building FE with an arm64 JDK on M1, because the dependencies protoc and grpc-java don't support M1.
#13563 modified build.sh to fix these issues by adding -Dos.arch=x86_64 to the build command.
However, if someone executes `mvn clean package -DskipTests=true` under the folder fe, the errors occur again.
This PR introduces a better way to fix them.
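For context, the earlier workaround from #13563 amounts to appending the property to the Maven arguments on Apple silicon, roughly as in the sketch below. The function name `fe_mvn_args` is hypothetical and the exact logic in `build.sh` may differ; this illustrates the old workaround, not the approach this PR introduces.

```shell
# Sketch of the earlier workaround: add -Dos.arch=x86_64 when building on arm64.
# fe_mvn_args is a hypothetical helper, not a function from build.sh.
fe_mvn_args() {
    arch="$1"   # e.g. the output of `uname -m`
    args="clean package -DskipTests=true"
    if [ "${arch}" = "arm64" ]; then
        args="${args} -Dos.arch=x86_64"
    fi
    echo "${args}"
}

# Usage (under the fe directory; the unquoted expansion is intentional
# so that the arguments are split into separate words):
#   mvn $(fe_mvn_args "$(uname -m)")
```

Because the flag is added only by the wrapper, a plain `mvn` invocation bypasses it, which is the gap described above.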
1. Modify default behavior of `build.sh`
`BUILD_JAVA_UDF` is ON by default, so a JVM is needed for compilation and at runtime.
2. Add docker-compose for MySQL 5.7, PostgreSQL 14 and Hive 2
See `docker/thirdparties/docker-compose`.
3. Add some regression test cases for jdbc query on MySQL, PG and Hive Catalog
The default is `false`. If it is set to `true`, you first need to start the docker containers for MySQL/PG/Hive.
4. Support `if not exists` and `if exists` for create/drop resource and create/drop encryptkey
# Proposed changes
This PR fixes many issues that occur when building from source on macOS with the Apple M1 chip.
## ATTENTION
The job for supporting macOS with Apple M1 chip is too big and there are lots of unresolved issues during runtime:
1. Some errors with memory tracker occur when BE (RELEASE) starts.
2. Some UT cases fail.
...
Temporarily, the following changes are made on macOS to start BE successfully.
1. Disable memory tracker.
2. Use tcmalloc instead of jemalloc.
This PR kicks off the job. Anyone who is interested can continue to fix these runtime issues.
## Use case
```shell
./build.sh -j 8 --be --clean
cd output/be/bin
ulimit -n 60000
./start_be.sh --daemon
```
## Something else
It takes around _**10+**_ minutes to build BE (with prebuilt third-parties) on macOS with M1 chip. We will improve the development experience on macOS greatly when we finish the adaptation job.