1. Update only the apache-orc git submodule instead of updating all submodules.
When running `sh build.sh`, updating all submodules often fails several times because of the unstable GitHub network, which wastes a lot of time.
2. Add a .gitignore entry for be/src/apache-orc/ to avoid accidental commits (a sketch of both changes follows this list).
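A minimal sketch of both changes in shell, assuming the submodule path is be/src/apache-orc and the entry goes into the repository's root .gitignore:
```shell
# Sketch only: update just the apache-orc submodule instead of all submodules
# (assumes the submodule is checked out at be/src/apache-orc).
git submodule update --init --recursive be/src/apache-orc

# Ignore the checked-out directory to avoid accidental commits
# (assumes the entry belongs in the repository's root .gitignore).
echo "be/src/apache-orc/" >> .gitignore
```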
Currently there are 3 packages for the release binaries of Doris: https://doris.apache.org/download
Users may be confused about how to download and deploy these packages.
So this PR provides a download script for each release. Users can simply download the script and run it, like:
```
> sh download_x64_apache.sh
Begin to download FE from "https://mirrors.tuna.tsinghua.edu.cn/apache/doris/1.2/1.2.3-rc02/apache-doris-fe-1.2.3-bin-x86_64.tar.xz" to "apache-doris-1.2.3-bin/" ...
Total size: 408078012 Bytes
#################################################### 100.0%
Begin to download BE from "https://mirrors.tuna.tsinghua.edu.cn/apache/doris/1.2/1.2.3-rc02/apache-doris-be-1.2.3-bin-x86_64.tar.xz" to "apache-doris-1.2.3-bin/" ...
Total size: 606211324 Bytes
#################################################### 100.0%
Begin to download DEPS from "https://mirrors.tuna.tsinghua.edu.cn/apache/doris/1.2/1.2.3-rc02/apache-doris-dependencies-1.2.3-bin-x86_64.tar.xz" to "apache-doris-1.2.3-bin/" ...
Total size: 253869148 Bytes
#################################################### 100.0%
Begin to assemble the binaries ...
Move java-udf-jar-with-dependencies.jar to be/lib/ ...
Download complete!
You can now deploy Apache Doris from apache-doris-1.2.3-bin/
```
The script will do the rest.
This script will later be published on the Download page of the Apache Doris website, so that users can easily get and use it.
Currently it only supports the Linux platform; other platforms are untested.
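A minimal sketch of what such a download script could look like, assuming curl is available; the mirror URL and file names are copied from the example output above and will differ per release, and the assembly step is omitted:
```shell
#!/usr/bin/env bash
# Sketch only: download the three release packages for 1.2.3 x86_64.
# The assembly step (e.g. moving java-udf-jar-with-dependencies.jar into be/lib/)
# is left out here.
set -e

BASE_URL="https://mirrors.tuna.tsinghua.edu.cn/apache/doris/1.2/1.2.3-rc02"
PACKAGES=(
    "apache-doris-fe-1.2.3-bin-x86_64.tar.xz"
    "apache-doris-be-1.2.3-bin-x86_64.tar.xz"
    "apache-doris-dependencies-1.2.3-bin-x86_64.tar.xz"
)

for pkg in "${PACKAGES[@]}"; do
    echo "Begin to download ${pkg} ..."
    curl -# -L -O "${BASE_URL}/${pkg}"   # -# prints the progress bar shown above
    tar -xJf "${pkg}"                    # unpack next to the script
done
```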
Introduce a tool to start multiple FEs on a single node; an example invocation follows the usage block below.
Use case:
```
$ ./multi-fe
./multi-fe start|stop|clean [OPTIONS ...]
start -n <NUM> -l <LIBRARY_PATH> -p <BASE_PORT>
Start the FE cluster.
-n The number of FEs.
-l The FE library path (default: doris/output/fe/lib)
-p The base port to generate all needed ports (default: 9030).
stop Stop the FE cluster.
clean Clean the data (rm -rf "$(pwd)"/fe*).
```
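For example, a hypothetical run that starts three FEs from a local build output using the defaults listed above:
```shell
# Start 3 FEs using the library path from a local build and the default base port.
./multi-fe start -n 3 -l doris/output/fe/lib -p 9030

# Stop the cluster, then wipe the generated fe* directories.
./multi-fe stop
./multi-fe clean
```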
Overwrite the PATH environment variable to avoid using the binutils from Homebrew when building third-party libraries, which may cause compilation errors such as:
Error: building for macOS-x86_64 but attempting to link with file built for unknown-unsupported file format
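A minimal sketch of the idea, assuming the macOS system toolchain lives in the standard locations and would otherwise be shadowed by Homebrew's binutils:
```shell
# Sketch only: put the system toolchain first so Homebrew's binutils
# (e.g. under /usr/local/opt/binutils/bin) is not picked up when building third parties.
export PATH="/usr/bin:/bin:/usr/sbin:/sbin:${PATH}"
```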
1. Refactor file reader creation in FileFactory for simplicity.
Previously, FileFactory had too many `create_file_reader` interfaces.
They are now unified into two categories: the interface used by the previous BrokerScanNode,
and the interface used by the new FileScanNode.
The creation methods for readers that read a `StreamLoadPipe` are also separated from the readers that read files.
2. Modify the StreamLoadPlanner on the FE side to support using ExternalFileScanNode.
3. For the generic reader, the file reader is now created inside the reader instead of being passed in from the outside.
4. Add some test cases for csv stream load; the behavior is the same as the old broker scanner (see the curl sketch after this list).
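For reference, a csv stream load can be exercised with a plain curl call like the one below; the database, table, user, label, and file names are placeholders:
```shell
# Hypothetical csv stream load against a local FE; db/table/user/file are placeholders.
curl --location-trusted -u root: \
    -H "label:csv_load_test_001" \
    -H "column_separator:," \
    -T test_data.csv \
    http://127.0.0.1:8030/api/demo_db/demo_tbl/_stream_load
```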
This PR proposes to stop tracking the following file and directory auto-generated by ANTLR4 (see the sketch below):
fe/fe-core/src/main/antlr4/org/apache/doris/nereids/DorisLexer.tokens
fe/fe-core/src/main/antlr4/org/apache/doris/nereids/gen/
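A minimal sketch of the change, assuming the entries go into the repository's root .gitignore:
```shell
# Sketch only: stop tracking the ANTLR4-generated file and directory
# (assumes the entries live in the root .gitignore).
cat >> .gitignore <<'EOF'
fe/fe-core/src/main/antlr4/org/apache/doris/nereids/DorisLexer.tokens
fe/fe-core/src/main/antlr4/org/apache/doris/nereids/gen/
EOF
```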
This is an example of an S3-backed hms_catalog:
```sql
CREATE CATALOG hms_catalog PROPERTIES (
    "type" = "hms",
    "hive.metastore.uris" = "thrift://localhost:9083",
    "AWS_ACCESS_KEY" = "your access key",
    "AWS_SECRET_KEY" = "your secret key",
    "AWS_ENDPOINT" = "s3 endpoint",
    "AWS_REGION" = "s3-region",
    "fs.s3a.paging.maximum" = "1000"
);
```
All these parameters are required.
Define a new file scan node for HMS tables in BE.
This file scan node differs from the broker scan node as follows:
1. The broker scan node defines a src slot and a dest slot, so there are two memory copies: the first from the file to the src slot,
and the second from the src slot to the dest slot. In contrast, FileScanNode needs only one memory copy, directly from the file to the dest slot.
2. The broker scan node reads all the fields in the file into the src slot, while FileScanNode only reads the needed fields.
3. The broker scan node converts values to string type for the src slot and then uses casts to convert them to the dest slot type,
while FileScanNode produces the final type directly.
For now FileScanNode is standalone code, but we will unify file scan and broker scan in the future.
Add a scalable regression testing framework (#7584), which contains:
- A test framework written in Groovy, supporting a built-in **readable DSL** named `Action`
- A demo in `${DORIS_HOME}/regression-test/data/demo`
- A Chinese doc in `${DORIS_HOME}/docs/zh-CN/developer-guide/regression-testing.md`
The English document is coming soon.
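A hedged example of how the framework might be invoked; the script name and location are assumptions based on the usual `${DORIS_HOME}` layout, so check the regression-testing doc above for the exact options:
```shell
# Assumed invocation: run the regression suites from the repo root.
cd "${DORIS_HOME}"
sh run-regression-test.sh
```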
* [improvement](show) Support that users can use the SHOW DATA SKEW statement instead of the admin statement
This PR mainly does two things:
1. Support that users can use the SHOW DATA SKEW statement instead of the admin statement (see the sketch below).
2. Fix FE UT failures caused by PR [improvement](rewrite) Make RewriteDateLiteralRule to be compatible with mysql #7876 and PR [feature-wip](iceberg) Step1: Support create Iceberg external table #7391.
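A hedged example of the non-admin statement, issued through the MySQL client; the host, port, user, and table names are placeholders:
```shell
# Hypothetical check of data skew without the ADMIN prefix; db/table/user are placeholders.
mysql -h 127.0.0.1 -P 9030 -u test_user -p \
    -e "SHOW DATA SKEW FROM example_db.example_tbl;"
```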
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
When using the dev container feature in VS Code, some config files are generated that shouldn't be put in git.
So it's better to add these config files to .gitignore for convenience.
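A minimal sketch, assuming the VS Code dev container config lives under .devcontainer/ (the exact files are not named in this PR):
```shell
# Sketch only: ignore local dev container config; the path is an assumption.
echo ".devcontainer/" >> .gitignore
```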
This PR reduces lock contention by supporting table-level read and write locks. When we modify or read a table's meta, we no longer need to take the database lock, only the table's write or read lock. And when we hold the database lock, the meta stored directly in the db cannot be modified by other threads. The database lock only protects meta in the Database class, while the table lock protects meta in the Table class.
Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
* Optimized the read performance of tables with multiple versions by changing the merge method of the unique table:
the cumulative version data is merged first and then merged with the base version.
For data with only one base version, it is read directly without merging.
This is the last PR of proposal #4308.
1. Add a new FE config `enable_http_server_v2` to enable the new HTTP server implementation. The default value is false.
2. Add a new FE config `http_api_extra_base_path` so that we can set a base path for the Frontend UI.
3. Refactor the new HTTP API response body. The returned HTTP status code is always 200, and an internal code in the response body indicates the specific error (see the sketch after this list).
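A hedged sketch of enabling the new server and observing the response convention; the base path value, API endpoint, and response fields below are illustrative, not exact:
```shell
# Sketch only: enable the new HTTP server in fe.conf (restart FE afterwards).
cat >> conf/fe.conf <<'EOF'
enable_http_server_v2 = true
http_api_extra_base_path = /doris
EOF

# With the v2 API the HTTP status is always 200; errors are reported through an
# internal code in the JSON body (illustrative output, not an exact response):
curl -s http://127.0.0.1:8030/api/bootstrap
# -> {"msg":"success","code":0,"data":{...},"count":0}
```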
* Implement the grammar for batch delete #4051
* Handle CREATE TABLE and ALTER TABLE when the table has a delete sign column
* Support the syntax for enabling the delete column (see the sketch after the TODO list)
* Automatically filter deleted data in SELECT statements
* Automatically add the delete sign when creating a rollup table
TODO:
* Optimize the reading and compaction logic on the BE side, so that data marked as deleted is completely deleted during base compaction
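A hedged sketch of how batch delete is typically exercised, assuming a unique-key table where the delete column has been enabled; the database, table, user, and file names are placeholders:
```shell
# Sketch only: enable the hidden delete sign column on an existing table,
# then stream load a batch of keys to delete (db/table/user/file are placeholders).
mysql -h 127.0.0.1 -P 9030 -u root -p \
    -e 'ALTER TABLE example_db.example_tbl ENABLE FEATURE "BATCH_DELETE";'

curl --location-trusted -u root: \
    -H "merge_type: DELETE" \
    -H "column_separator:," \
    -T keys_to_delete.csv \
    http://127.0.0.1:8030/api/example_db/example_tbl/_stream_load
```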