[typo](doc)flink and spark connector remove thrift compiled documents (#19794)

* [typo](doc)flink and spark connector remove thrift compiled documents

* delete enable_http_server_v2
DongLiang-0
2023-05-19 14:12:07 +08:00
committed by GitHub
parent 7d1844d380
commit a7376bf109
4 changed files with 40 additions and 257 deletions

@ -54,78 +54,16 @@ Github: https://github.com/apache/doris-flink-connector
Preparation
1.Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
1. Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
2.Specify the thrift installation directory
2. Execute the following command in the source directory:
`sh build.sh`
When prompted, enter the Flink version you want to compile against.
```bash
##source file content
#export THRIFT_BIN=
#export MVN_BIN=
#export JAVA_HOME=
##amend as below, MacOS as an example
export THRIFT_BIN=/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift
#export MVN_BIN=
#export JAVA_HOME=
```
Install `thrift` 0.13.0 (Note: `Doris` 0.15 and the latest builds are based on `thrift` 0.13.0, previous versions are still built with `thrift` 0.9.3)
Windows:
1. Download: `http://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.exe`
2. Rename `thrift-0.13.0.exe` to `thrift`
MacOS:
1. Install: `brew install thrift@0.13.0`
2. Default install path: `/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift`
Note: `brew install thrift@0.13.0` on MacOS may fail with an error that the version cannot be found. To work around this, run the following in the terminal:
1. `brew tap-new $USER/local-tap`
2. `brew extract --version='0.13.0' thrift $USER/local-tap`
3. `brew install thrift@0.13.0`
Reference link: `https://gist.github.com/tonydeng/02e571f273d6cce4230dc8d5f394493c`
Linux:
```bash
wget https://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.tar.gz # Download source package
yum install -y autoconf automake libtool cmake ncurses-devel openssl-devel lzo-devel zlib-devel gcc gcc-c++ # Install dependencies
tar zxvf thrift-0.13.0.tar.gz
cd thrift-0.13.0
./configure --without-tests
make
make install
thrift --version # Check the version after installation is complete
```
Note: If you have already compiled Doris, you do not need to install thrift; you can use `$DORIS_HOME/thirdparty/installed/bin/thrift` directly.
After successful compilation, the target jar is generated in the `dist` directory, for example `flink-doris-connector-1.3.0-SNAPSHOT.jar`.
Copy this file to the `classpath` of `Flink` to use `Flink-Doris-Connector`. For example, if `Flink` runs in `Local` mode, put this file in the `lib/` folder; if `Flink` runs in `Yarn` cluster mode, put it in the pre-deployment package.
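For the `Local` mode case, the copy step could be scripted roughly as follows. This is a minimal sketch, not part of the connector repository: the function name `install_connector` and its two arguments are hypothetical, and it assumes a Flink installation whose jars live under `lib/` as described above.

```bash
# Hypothetical helper, not shipped with flink-doris-connector.
# Copies every connector jar found in a build output directory into Flink's lib/ folder.
install_connector() {
    dist_dir="$1"     # build output directory, e.g. dist
    flink_home="$2"   # root of a local Flink installation (assumed layout: $flink_home/lib)
    found=0
    for jar in "$dist_dir"/flink-doris-connector-*.jar; do
        # If nothing matches, the glob stays literal, so skip anything that is not a file.
        [ -f "$jar" ] || continue
        cp "$jar" "$flink_home/lib/" || return 1
        echo "installed $(basename "$jar")"
        found=1
    done
    if [ "$found" -eq 0 ]; then
        echo "no connector jar found in $dist_dir" >&2
        return 1
    fi
}
```

It would be called as `install_connector dist /path/to/flink`, where the Flink path is a placeholder. Returning a non-zero status when no jar is found makes a failed or skipped build visible instead of silently copying nothing.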
Execute the following command in the source directory:
```bash
sh build.sh
Usage:
build.sh --flink version # specify flink version (after flink-doris-connector v1.2 and flink-1.15, there is no need to provide scala version)
build.sh --tag # this is a build from tag
e.g.:
build.sh --flink 1.16.0
build.sh --tag
```
Then run the build command for the version you need, for example:
`sh build.sh --flink 1.16.0`
After successful compilation, the file `flink-doris-connector-1.16-1.3.0-SNAPSHOT.jar` is generated in the `target/` directory. Copy this file to the `classpath` of `Flink` to use `Flink-Doris-Connector`. For example, if `Flink` runs in `Local` mode, put this file in the `lib/` folder; if `Flink` runs in `Yarn` cluster mode, put it in the pre-deployment package.
**Remarks:**
1. Doris FE must have HTTP v2 enabled in its configuration
conf/fe.conf
```
enable_http_server_v2 = true
```
## Using Maven
Add flink-doris-connector Maven dependencies

@ -47,84 +47,38 @@ Github: https://github.com/apache/incubator-doris-spark-connector
Preparation
1.Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
1. Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
2.Specify the thrift installation directory
2. Execute the following command in the source directory:
`sh build.sh`
Follow the prompts to enter the Scala and Spark versions you need to start compiling.
```bash
##source file content
#export THRIFT_BIN=
#export MVN_BIN=
#export JAVA_HOME=
After successful compilation, the target jar is generated in the `dist` directory, for example `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar`.
Copy this file to the `ClassPath` of `Spark` to use `Spark-Doris-Connector`. For example, if `Spark` runs in `Local` mode, put this file in the `jars/` folder; if `Spark` runs in `Yarn` cluster mode, put it in the pre-deployment package.
##amend as below, MacOS as an example
export THRIFT_BIN=/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift
#export MVN_BIN=
#export JAVA_HOME=
For example, upload `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar` to HDFS and add its HDFS path to `spark.yarn.jars`.
Install `thrift` 0.13.0 (Note: `Doris` 0.15 and the latest builds are based on `thrift` 0.13.0, previous versions are still built with `thrift` 0.9.3)
Windows:
1. Download: `http://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.exe`
2. Rename `thrift-0.13.0.exe` to `thrift`
MacOS:
1. Install: `brew install thrift@0.13.0`
2. Default install path: `/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift`
Note: `brew install thrift@0.13.0` on MacOS may fail with an error that the version cannot be found. To work around this, run the following in the terminal:
1. `brew tap-new $USER/local-tap`
2. `brew extract --version='0.13.0' thrift $USER/local-tap`
3. `brew install thrift@0.13.0`
Reference link: `https://gist.github.com/tonydeng/02e571f273d6cce4230dc8d5f394493c`
Linux:
1. Download source package: `wget https://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.tar.gz`
2. Install dependencies: `yum install -y autoconf automake libtool cmake ncurses-devel openssl-devel lzo-devel zlib-devel gcc gcc-c++`
3. `tar zxvf thrift-0.13.0.tar.gz`
4. `cd thrift-0.13.0`
5. `./configure --without-tests`
6. `make`
7. `make install`
Check the version after installation is complete: `thrift --version`
Note: If you have already compiled Doris, you do not need to install thrift; you can use `$DORIS_HOME/thirdparty/installed/bin/thrift` directly.
```
Execute the following command in the source directory:
```bash
sh build.sh --spark 2.3.4 --scala 2.11 ## spark 2.3.4, scala 2.11
sh build.sh --spark 3.1.2 --scala 2.12 ## spark 3.1.2, scala 2.12
sh build.sh --spark 3.2.0 --scala 2.12 \
--mvn-args "-Dnetty.version=4.1.68.Final -Dfasterxml.jackson.version=2.12.3" ## spark 3.2.0, scala 2.12
```
> Note: If you check out the source code from a tag, you can just run `sh build.sh --tag` without specifying the Spark and Scala versions, because the versions are fixed in the tag's source code.
After successful compilation, the file `doris-spark-2.3.4-2.11-1.0.0-SNAPSHOT.jar` is generated in the `output/` directory. Copy this file to the `ClassPath` of `Spark` to use `Spark-Doris-Connector`. For example, if `Spark` runs in `Local` mode, put this file in the `jars/` folder; if `Spark` runs in `Yarn` cluster mode, put it in the pre-deployment package. For example, upload `doris-spark-2.3.4-2.11-1.0.0-SNAPSHOT.jar` to HDFS and add its HDFS path to `spark.yarn.jars`.
1. Upload `doris-spark-connector-3.1.2-2.12-1.0.0.jar` to HDFS.
1. Upload `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar` to HDFS.
```
hdfs dfs -mkdir /spark-jars/
hdfs dfs -put /your_local_path/doris-spark-connector-3.1.2-2.12-1.0.0.jar /spark-jars/
hdfs dfs -put /your_local_path/spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar /spark-jars/
```
2. Add the `doris-spark-connector-3.1.2-2.12-1.0.0.jar` dependency in the cluster.
2. Add the `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar` dependency in the cluster.
```
spark.yarn.jars=hdfs:///spark-jars/doris-spark-connector-3.1.2-2.12-1.0.0.jar
spark.yarn.jars=hdfs:///spark-jars/spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar
```
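The two steps above can be tied together in a small helper so that the uploaded path and the configured path cannot drift apart. This is a minimal sketch under stated assumptions: the function name `yarn_jars_conf` and its default `/spark-jars` directory argument are hypothetical, and the function only prints the commands and the setting for review rather than running `hdfs`.

```bash
# Hypothetical helper, not part of spark-doris-connector.
# Prints the upload commands and the matching spark.yarn.jars setting for one jar.
yarn_jars_conf() {
    jar_path="$1"                   # local path to the built connector jar
    hdfs_dir="${2:-/spark-jars}"    # target HDFS directory (assumed default)
    jar_name="$(basename "$jar_path")"
    echo "hdfs dfs -mkdir $hdfs_dir/"
    echo "hdfs dfs -put $jar_path $hdfs_dir/"
    echo "spark.yarn.jars=hdfs://$hdfs_dir/$jar_name"
}

yarn_jars_conf /your_local_path/spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar
```

Deriving the jar name once with `basename` keeps the `-put` target and the `spark.yarn.jars` value consistent when the connector version changes.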
## Using Maven
```
<dependency>
<groupId>org.apache.doris</groupId>
<artifactId>spark-doris-connector-3.1_2.12</artifactId>
<!--artifactId>spark-doris-connector-2.3_2.11</artifactId-->
<version>1.0.1</version>
<groupId>org.apache.doris</groupId>
<artifactId>spark-doris-connector-3.1_2.12</artifactId>
<version>1.1.0</version>
</dependency>
```

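@ -56,77 +56,15 @@ Flink Doris Connector supports operating on (reading, inserting, modifying
Preparation
1.Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
1. Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
2.Specify the thrift installation directory
2. Execute the following command in the source directory:
`sh build.sh`
When prompted, enter the Flink version you want to compile against.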
```bash
##source file content
#export THRIFT_BIN=
#export MVN_BIN=
#export JAVA_HOME=
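After successful compilation, the target jar is generated in the `dist` directory, for example `flink-doris-connector-1.3.0-SNAPSHOT.jar`.
Copy this file to the `classpath` of `Flink` to use `Flink-Doris-Connector`. For example, if `Flink` runs in `Local` mode, put this file in the `lib/` folder; if `Flink` runs in `Yarn` cluster mode, put it in the pre-deployment package.
##amend as below, MacOS as an example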
export THRIFT_BIN=/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift
#export MVN_BIN=
#export JAVA_HOME=
```
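Install `thrift` 0.13.0 (Note: `Doris` 0.15 and the latest builds are based on `thrift` 0.13.0; earlier versions are still built with `thrift` 0.9.3)
Windows:
1.Download: `http://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.exe` (choose your own download directory)
2.Rename `thrift-0.13.0.exe` to `thrift`
MacOS:
1. Install: `brew install thrift@0.13.0`
2. Default install path: `/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift`
Note: `brew install thrift@0.13.0` on MacOS may fail with an error that the version cannot be found. To work around this, run the following in the terminal: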
1. `brew tap-new $USER/local-tap`
2. `brew extract --version='0.13.0' thrift $USER/local-tap`
3. `brew install thrift@0.13.0`
Reference link: `https://gist.github.com/tonydeng/02e571f273d6cce4230dc8d5f394493c`
Linux:
```bash
wget https://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.tar.gz # Download source package
yum install -y autoconf automake libtool cmake ncurses-devel openssl-devel lzo-devel zlib-devel gcc gcc-c++ # Install dependencies
tar zxvf thrift-0.13.0.tar.gz
cd thrift-0.13.0
./configure --without-tests
make
make install
thrift --version # Check the version after installation is complete
```
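Note: If you have already compiled Doris, you do not need to install thrift; you can use `$DORIS_HOME/thirdparty/installed/bin/thrift` directly.
Execute the following command in the source directory: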
```bash
sh build.sh
Usage:
build.sh --flink version # specify flink version (after flink-doris-connector v1.2 and flink-1.15, there is no need to provide scala version)
build.sh --tag # this is a build from tag
e.g.:
build.sh --flink 1.16.0
build.sh --tag
```
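Then run the build command for the version you need, for example:
`sh build.sh --flink 1.16.0`
After successful compilation, a file such as `flink-doris-connector-1.16-1.3.0-SNAPSHOT.jar` is generated in the `target/` directory. Copy this file to the `classpath` of `Flink` to use `Flink-Doris-Connector`. For example, if `Flink` runs in `Local` mode, put this file in the `lib/` folder; if `Flink` runs in `Yarn` cluster mode, put it in the pre-deployment package.
**Remarks**
1. Doris FE must have HTTP v2 enabled in its configuration
conf/fe.conf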
```
enable_http_server_v2 = true
```
## Using Maven

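@ -47,84 +47,37 @@ Spark Doris Connector supports reading data stored in Doris through Spark
Preparation
1.Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
1. Modify the `custom_env.sh.tpl` file and rename it to `custom_env.sh`
2.Specify the thrift installation directory
2. Execute the following command in the source directory:
`sh build.sh`
When prompted, enter the Scala and Spark versions you want to compile against.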
```bash
##source file content
#export THRIFT_BIN=
#export MVN_BIN=
#export JAVA_HOME=
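After successful compilation, the target jar is generated in the `dist` directory, for example `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar`.
Copy this file to the `ClassPath` of `Spark` to use `Spark-Doris-Connector`. For example, if `Spark` runs in `Local` mode, put this file in the `jars/` folder; if `Spark` runs in `Yarn` cluster mode, put it in the pre-deployment package.
##amend as below, MacOS as an example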
export THRIFT_BIN=/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift
#export MVN_BIN=
#export JAVA_HOME=
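For example, upload `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar` to HDFS and add the jar's HDFS path to the `spark.yarn.jars` parameter.
Install `thrift` 0.13.0 (Note: `Doris` 0.15 and the latest builds are based on `thrift` 0.13.0; earlier versions are still built with `thrift` 0.9.3)
Windows:
1.Download: `http://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.exe` (choose your own download directory)
2.Rename `thrift-0.13.0.exe` to `thrift`
MacOS:
1. Install: `brew install thrift@0.13.0`
2. Default install path: `/opt/homebrew/Cellar/thrift@0.13.0/0.13.0/bin/thrift`
Note: `brew install thrift@0.13.0` on MacOS may fail with an error that the version cannot be found. To work around this, run the following in the terminal: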
1. `brew tap-new $USER/local-tap`
2. `brew extract --version='0.13.0' thrift $USER/local-tap`
3. `brew install thrift@0.13.0`
Reference link: `https://gist.github.com/tonydeng/02e571f273d6cce4230dc8d5f394493c`
Linux:
1. Download source package: `wget https://archive.apache.org/dist/thrift/0.13.0/thrift-0.13.0.tar.gz`
2. Install dependencies: `yum install -y autoconf automake libtool cmake ncurses-devel openssl-devel lzo-devel zlib-devel gcc gcc-c++`
3. `tar zxvf thrift-0.13.0.tar.gz`
4. `cd thrift-0.13.0`
5. `./configure --without-tests`
6. `make`
7. `make install`
Check the version after installation is complete: `thrift --version`
Note: If you have already compiled Doris, you do not need to install thrift; you can use `$DORIS_HOME/thirdparty/installed/bin/thrift` directly.
```
Execute the following command in the source directory:
```bash
sh build.sh --spark 2.3.4 --scala 2.11 ## spark 2.3.4, scala 2.11
sh build.sh --spark 3.1.2 --scala 2.12 ## spark 3.1.2, scala 2.12
sh build.sh --spark 3.2.0 --scala 2.12 \
--mvn-args "-Dnetty.version=4.1.68.Final -Dfasterxml.jackson.version=2.12.3" ## spark 3.2.0, scala 2.12
```
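> Note: If you check out the source code from a tag, you can just run `sh build.sh --tag` without specifying the Spark and Scala versions, because the versions are fixed in the tag's source code.
After successful compilation, the file `doris-spark-2.3.4-2.11-1.0.0-SNAPSHOT.jar` is generated in the `output/` directory. Copy this file to the `ClassPath` of `Spark` to use `Spark-Doris-Connector`. For example, if `Spark` runs in `Local` mode, put this file in the `jars/` folder; if `Spark` runs in `Yarn` cluster mode, put it in the pre-deployment package.
For example, upload `doris-spark-2.3.4-2.11-1.0.0-SNAPSHOT.jar` to HDFS and add the jar's HDFS path to the `spark.yarn.jars` parameter.
1. Upload `doris-spark-connector-3.1.2-2.12-1.0.0.jar` to HDFS.
1. Upload `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar` to HDFS.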
```
hdfs dfs -mkdir /spark-jars/
hdfs dfs -put /your_local_path/doris-spark-connector-3.1.2-2.12-1.0.0.jar /spark-jars/
hdfs dfs -put /your_local_path/spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar /spark-jars/
```
2. Add the `doris-spark-connector-3.1.2-2.12-1.0.0.jar` dependency in the cluster.
2. Add the `spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar` dependency in the cluster.
```
spark.yarn.jars=hdfs:///spark-jars/doris-spark-connector-3.1.2-2.12-1.0.0.jar
spark.yarn.jars=hdfs:///spark-jars/spark-doris-connector-3.1_2.12-1.1.0-SNAPSHOT.jar
```
## Using Maven
```
<dependency>
<groupId>org.apache.doris</groupId>
<artifactId>spark-doris-connector-3.1_2.12</artifactId>
<!--artifactId>spark-doris-connector-2.3_2.11</artifactId-->
<version>1.0.1</version>
<groupId>org.apache.doris</groupId>
<artifactId>spark-doris-connector-3.1_2.12</artifactId>
<version>1.1.0</version>
</dependency>
```