diff --git a/docs/en/extending-doris/flink-doris-connector.md b/docs/en/extending-doris/flink-doris-connector.md
index d17fb8ae1d..8cb0a9af17 100644
--- a/docs/en/extending-doris/flink-doris-connector.md
+++ b/docs/en/extending-doris/flink-doris-connector.md
@@ -26,7 +26,7 @@ under the License.

 # Flink Doris Connector

-Flink Doris Connector can support reading data stored in Doris through Flink.
+Flink Doris Connector supports reading and writing data stored in Doris through Flink.

 - You can map the `Doris` table to` DataStream` or `Table`.

@@ -35,12 +35,33 @@ Flink Doris Connector can support reading data stored in Doris through Flink.
 | Connector | Flink | Doris | Java | Scala |
 | --------- | ----- | ------ | ---- | ----- |
 | 1.0.0 | 1.11.2 | 0.13+ | 8 | 2.12 |
+| 1.0.0 | 1.13.x | 0.13+ | 8 | 2.12 |
+**Adapting to Flink 1.13.x**
+
+```xml
+<properties>
+    <scala.version>2.12</scala.version>
+    <flink.version>1.11.2</flink.version>
+    <libthrift.version>0.9.3</libthrift.version>
+    <arrow.version>0.15.1</arrow.version>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+    <doris.home>${basedir}/../../</doris.home>
+    <doris.thirdparty>${basedir}/../../thirdparty</doris.thirdparty>
+</properties>
+```
+
+Just change `flink.version` here to match the version of your Flink cluster, then recompile.

 ## Build and Install

 Execute following command in dir `extension/flink-doris-connector/`:

+**Notice:**
+
+1. If you have not yet compiled the Doris source code as a whole, compile it first by running `sh build.sh` in the `incubator-doris` directory; otherwise the `thrift` command will not be found.
+2. It is recommended to compile in the Doris Docker build environment `apache/incubator-doris:build-env-1.2`, because the JDK in the 1.3 image is version 11, which causes compilation problems.
+
 ```bash
 sh build.sh
 ```
diff --git a/docs/en/extending-doris/spark-doris-connector.md b/docs/en/extending-doris/spark-doris-connector.md
index f9b7a8ee38..4735574d06 100644
--- a/docs/en/extending-doris/spark-doris-connector.md
+++ b/docs/en/extending-doris/spark-doris-connector.md
@@ -37,14 +37,21 @@ Spark Doris Connector can support reading data stored in Doris through Spark.
 | Connector | Spark | Doris | Java | Scala |
 | --------- | ----- | ------ | ---- | ----- |
 | 1.0.0 | 2.x | 0.12+ | 8 | 2.11 |
+| 1.0.0 | 3.x | 0.12+ | 8 | 2.12 |

 ## Build and Install

 Execute following command in dir `extension/spark-doris-connector/`:

+**Notice:**
+
+1. If you have not yet compiled the Doris source code as a whole, compile it first by running `sh build.sh` in the `incubator-doris` directory; otherwise the `thrift` command will not be found.
+2. It is recommended to compile in the Doris Docker build environment `apache/incubator-doris:build-env-1.2`, because the JDK in the 1.3 image is version 11, which causes compilation problems.
+
 ```bash
-sh build.sh
+sh build.sh 3 ## Spark 3.x; the default version is 3.1.2
+sh build.sh 2 ## Spark 2.x; the default version is 2.3.4
 ```

 After successful compilation, the file `doris-spark-1.0.0-SNAPSHOT.jar` will be generated in the `output/` directory. Copy this file to `ClassPath` in `Spark` to use `Spark-Doris-Connector`. For example, `Spark` running in `Local` mode, put this file in the `jars/` folder. `Spark` running in `Yarn` cluster mode, put this file in the pre-deployment package.
diff --git a/docs/zh-CN/extending-doris/flink-doris-connector.md b/docs/zh-CN/extending-doris/flink-doris-connector.md
index ce89964563..083226f2a1 100644
--- a/docs/zh-CN/extending-doris/flink-doris-connector.md
+++ b/docs/zh-CN/extending-doris/flink-doris-connector.md
@@ -26,7 +26,7 @@ under the License.
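
 # Flink Doris Connector

-Flink Doris Connector can support reading data stored in Doris through Flink.
+Flink Doris Connector supports reading and writing data stored in Doris through Flink.

 - You can map the `Doris` table to `DataStream` or `Table`.

@@ -34,13 +34,34 @@ Flink Doris Connector can support reading data stored in Doris through Flink

 | Connector | Flink | Doris | Java | Scala |
 | --------- | ----- | ------ | ---- | ----- |
-| 1.0.0 | 1.11.2 | 0.13+ | 8 | 2.12 |
+| 1.0.0 | 1.11.x, 1.12.x | 0.13+ | 8 | 2.12 |
+| 1.0.0 | 1.13.x | 0.13+ | 8 | 2.12 |
+**Adapting to Flink 1.13.x**
+
+```xml
+<properties>
+    <scala.version>2.12</scala.version>
+    <flink.version>1.11.2</flink.version>
+    <libthrift.version>0.9.3</libthrift.version>
+    <arrow.version>0.15.1</arrow.version>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+    <doris.home>${basedir}/../../</doris.home>
+    <doris.thirdparty>${basedir}/../../thirdparty</doris.thirdparty>
+</properties>
+```
+
+Just change `flink.version` here to match the version of your Flink cluster, then recompile.

 ## Build and Install

 Execute the following command in the source directory `extension/flink-doris-connector/`:

+**Notice:**
+
+1. If you have not yet compiled the Doris source code as a whole, compile it first by running `sh build.sh` in the `incubator-doris` directory; otherwise the `thrift` command will not be found.
+2. It is recommended to compile in the Doris Docker build environment `apache/incubator-doris:build-env-1.2`, because the JDK in the 1.3 image is version 11, which causes compilation problems.
+
 ```bash
 sh build.sh
 ```
diff --git a/docs/zh-CN/extending-doris/spark-doris-connector.md b/docs/zh-CN/extending-doris/spark-doris-connector.md
index 2fbc54e1e7..b8eb53e8d5 100644
--- a/docs/zh-CN/extending-doris/spark-doris-connector.md
+++ b/docs/zh-CN/extending-doris/spark-doris-connector.md
@@ -37,14 +37,21 @@ Spark Doris Connector can support reading data stored in Doris through Spark
 | Connector | Spark | Doris | Java | Scala |
 | --------- | ----- | ------ | ---- | ----- |
 | 1.0.0 | 2.x | 0.12+ | 8 | 2.11 |
+| 1.0.0 | 3.x | 0.12+ | 8 | 2.12 |

 ## Build and Install

 Execute the following command in the source directory `extension/spark-doris-connector/`:

+**Notice:**
+
+1. If you have not yet compiled the Doris source code as a whole, compile it first by running `sh build.sh` in the `incubator-doris` directory; otherwise the `thrift` command will not be found.
+2.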
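It is recommended to compile in the Doris Docker build environment `apache/incubator-doris:build-env-1.2`, because the JDK in the 1.3 image is version 11, which causes compilation problems.
+
 ```bash
-sh build.sh
+sh build.sh 3 ## Spark 3.x; the default version is 3.1.2
+sh build.sh 2 ## Spark 2.x; the default version is 2.3.4
 ```

 After successful compilation, the file `doris-spark-1.0.0-SNAPSHOT.jar` is generated in the `output/` directory. Copy this file to the `ClassPath` of `Spark` to use `Spark-Doris-Connector`. For example, for `Spark` running in `Local` mode, put this file in the `jars/` folder. For `Spark` running in `Yarn` cluster mode, put this file in the pre-deployment package.
diff --git a/extension/spark-doris-connector/build.sh b/extension/spark-doris-connector/build.sh
index c37b14ee9a..b4ea0429a7 100755
--- a/extension/spark-doris-connector/build.sh
+++ b/extension/spark-doris-connector/build.sh
@@ -28,6 +28,7 @@ set -eo pipefail

 ROOT=`dirname "$0"`
 ROOT=`cd "$ROOT"; pwd`
+
 export DORIS_HOME=${ROOT}/../../

 # include custom environment variables
@@ -37,6 +38,8 @@ fi

 # check maven
 MVN_CMD=mvn
+
+
 if [[ ! -z ${CUSTOM_MVN} ]]; then
     MVN_CMD=${CUSTOM_MVN}
 fi
@@ -45,9 +48,14 @@ if ! ${MVN_CMD} --version; then
     exit 1
 fi
 export MVN_CMD
-
-${MVN_CMD} clean package
-
+if [ "$1" = "3" ]  # Spark 3.x: build with pom_3.0.xml
+then
+   ${MVN_CMD} clean package -f pom_3.0.xml
+fi
+if [ "$1" = "2" ]  # Spark 2.x: build with the default pom.xml
+then
+   ${MVN_CMD} clean package
+fi
 mkdir -p output/
 cp target/doris-spark-1.0.0-SNAPSHOT.jar ./output/

diff --git a/extension/spark-doris-connector/pom_3.0.xml b/extension/spark-doris-connector/pom_3.0.xml
new file mode 100644
index 0000000000..4973ff85f4
--- /dev/null
+++ b/extension/spark-doris-connector/pom_3.0.xml
@@ -0,0 +1,290 @@
+<?xml version="1.0" encoding="UTF-8"?>
+
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <groupId>org.apache</groupId>
+    <artifactId>doris-spark</artifactId>
+    <version>1.0.0-SNAPSHOT</version>
+
+    <properties>
+        <scala.version>2.12</scala.version>
+        <spark.version>3.1.2</spark.version>
+        <libthrift.version>0.9.3</libthrift.version>
+        <arrow.version>1.0.1</arrow.version>
+        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+    </properties>
+
+    <profiles>
+        <profile>
+            <id>custom-env</id>
+            <activation>
+                <property>
+                    <name>env.CUSTOM_MAVEN_REPO</name>
+                </property>
+            </activation>
+
+            <repositories>
+                <repository>
+                    <id>custom-nexus</id>
+                    <url>${env.CUSTOM_MAVEN_REPO}</url>
+                </repository>
+            </repositories>
+
+            <pluginRepositories>
+                <pluginRepository>
+                    <id>custom-nexus</id>
+                    <url>${env.CUSTOM_MAVEN_REPO}</url>
+                </pluginRepository>
+            </pluginRepositories>
+        </profile>
+
+        <profile>
+            <id>general-env</id>
+            <activation>
+                <property>
+                    <name>!env.CUSTOM_MAVEN_REPO</name>
+                </property>
+            </activation>
+
+            <repositories>
+                <repository>
+                    <id>central</id>
+                    <name>central maven repo https</name>
+                    <url>https://repo.maven.apache.org/maven2</url>
+                </repository>
+            </repositories>
+        </profile>
+    </profiles>
+
+    <dependencies>
+        <dependency>
+            <groupId>org.apache.spark</groupId>
+            <artifactId>spark-core_${scala.version}</artifactId>
+            <version>${spark.version}</version>
+            <scope>provided</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.spark</groupId>
+            <artifactId>spark-sql_${scala.version}</artifactId>
+            <version>${spark.version}</version>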
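+            <scope>provided</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.thrift</groupId>
+            <artifactId>libthrift</artifactId>
+            <version>${libthrift.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.arrow</groupId>
+            <artifactId>arrow-vector</artifactId>
+            <version>${arrow.version}</version>
+        </dependency>
+
+        <dependency>
+            <groupId>org.hamcrest</groupId>
+            <artifactId>hamcrest-core</artifactId>
+            <version>1.3</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.mockito</groupId>
+            <artifactId>mockito-scala_${scala.version}</artifactId>
+            <version>1.4.7</version>
+            <exclusions>
+                <exclusion>
+                    <artifactId>hamcrest-core</artifactId>
+                    <groupId>org.hamcrest</groupId>
+                </exclusion>
+            </exclusions>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>junit</groupId>
+            <artifactId>junit</artifactId>
+            <version>4.11</version>
+            <exclusions>
+                <exclusion>
+                    <artifactId>hamcrest-core</artifactId>
+                    <groupId>org.hamcrest</groupId>
+                </exclusion>
+            </exclusions>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>com.fasterxml.jackson.core</groupId>
+            <artifactId>jackson-databind</artifactId>
+            <version>2.10.0</version>
+        </dependency>
+
+        <dependency>
+            <groupId>com.fasterxml.jackson.core</groupId>
+            <artifactId>jackson-core</artifactId>
+            <version>2.10.0</version>
+        </dependency>
+        <dependency>
+            <groupId>io.netty</groupId>
+            <artifactId>netty-all</artifactId>
+            <version>4.1.27.Final</version>
+            <scope>provided</scope>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.thrift.tools</groupId>
+                <artifactId>maven-thrift-plugin</artifactId>
+                <version>0.1.11</version>
+                <executions>
+                    <execution>
+                        <id>thrift-sources</id>
+                        <phase>generate-sources</phase>
+                        <goals>
+                            <goal>compile</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>net.alchim31.maven</groupId>
+                <artifactId>scala-maven-plugin</artifactId>
+                <version>3.2.1</version>
+                <executions>
+                    <execution>
+                        <id>scala-compile-first</id>
+                        <phase>process-resources</phase>
+                        <goals>
+                            <goal>compile</goal>
+                        </goals>
+                    </execution>
+                    <execution>
+                        <id>scala-test-compile</id>
+                        <phase>process-test-resources</phase>
+                        <goals>
+                            <goal>testCompile</goal>
+                        </goals>
+                    </execution>
+                </executions>
+                <configuration>
+                    <args>
+                        <arg>-feature</arg>
+                    </args>
+                </configuration>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <artifactSet>
+                        <excludes>
+                            <exclude>com.google.code.findbugs:*</exclude>
+                            <exclude>org.slf4j:*</exclude>
+                        </excludes>
+                    </artifactSet>
+                    <relocations>
+                        <relocation>
+                            <pattern>org.apache.arrow</pattern>
+                            <shadedPattern>org.apache.doris.shaded.org.apache.arrow</shadedPattern>
+                        </relocation>
+                        <relocation>
+                            <pattern>io.netty</pattern>
+                            <shadedPattern>org.apache.doris.shaded.io.netty</shadedPattern>
+                        </relocation>
+                        <relocation>
+                            <pattern>com.fasterxml.jackson</pattern>
+                            <shadedPattern>org.apache.doris.shaded.com.fasterxml.jackson</shadedPattern>
+                        </relocation>
+                        <relocation>
+                            <pattern>org.apache.commons.codec</pattern>
+                            <shadedPattern>org.apache.doris.shaded.org.apache.commons.codec</shadedPattern>
+                        </relocation>
+                        <relocation>
+                            <pattern>com.google.flatbuffers</pattern>
+                            <shadedPattern>org.apache.doris.shaded.com.google.flatbuffers</shadedPattern>
+                        </relocation>
+                        <relocation>
+                            <pattern>org.apache.thrift</pattern>
+                            <shadedPattern>org.apache.doris.shaded.org.apache.thrift</shadedPattern>
+                        </relocation>
+                    </relocations>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.jacoco</groupId>
+                <artifactId>jacoco-maven-plugin</artifactId>
+                <version>0.7.8</version>
+                <configuration>
+                    <excludes>
+                        <exclude>**/thrift/**</exclude>
+                    </excludes>
+                </configuration>
+                <executions>
+                    <execution>
+                        <id>prepare-agent</id>
+                        <goals>
+                            <goal>prepare-agent</goal>
+                        </goals>
+                    </execution>
+                    <execution>
+                        <id>check</id>
+                        <goals>
+                            <goal>check</goal>
+                        </goals>
+                    </execution>
+                    <execution>
+                        <id>report</id>
+                        <phase>test</phase>
+                        <goals>
+                            <goal>report</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-compiler-plugin</artifactId>
+                <version>3.8.1</version>
+                <configuration>
+                    <source>8</source>
+                    <target>8</target>
+                </configuration>
+            </plugin>
+        </plugins>
+    </build>
+
+</project>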
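
Note on the `build.sh` version switch: the script selects the POM from its first argument (`3` uses `pom_3.0.xml`, `2` uses the default POM). One way to make that selection fail early on a missing or unsupported argument is sketched below. This is only an illustration under assumptions: the function name `select_pom_args` and the usage message are hypothetical and not part of the repository. Only the `2`/`3` arguments and the `pom_3.0.xml` file name come from the diff.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Doris repository): map the Spark
# major version passed to build.sh onto the extra Maven arguments.
select_pom_args() {
    case "${1:-}" in
        3) echo "-f pom_3.0.xml" ;;  # Spark 3.x: use the POM added in this change
        2) echo "" ;;                # Spark 2.x: use the default pom.xml
        *) echo "Usage: sh build.sh <2|3>" >&2
           return 1 ;;               # missing or unsupported version
    esac
}
```

In `build.sh` this could be used as `POM_ARGS=$(select_pom_args "$1") || exit 1`, followed by `${MVN_CMD} clean package ${POM_ARGS}`, so that an unknown version stops the build before the `cp target/...` step runs.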