diff --git a/docs/en/community/how-to-contribute/contribute-doc.md b/docs/en/community/how-to-contribute/contribute-doc.md
index 541d13d77e..1fd7f30119 100644
--- a/docs/en/community/how-to-contribute/contribute-doc.md
+++ b/docs/en/community/how-to-contribute/contribute-doc.md
@@ -312,6 +312,70 @@ Directory structure description:
 
 All images are in the `static/images` directory
 
+## How to write the SQL manual
+
+SQL manual docs refer to the documentation under `docs/sql-manual`. These documents are used in two places:
+
+1. The official website documentation.
+2. The output of the HELP command.
+
+To support the HELP command output, these documents must be written in strict accordance with the following format, otherwise they will fail the admission check.
+
+An example of the `SHOW ALTER` command is as follows:
+
+```
+---
+{
+    "title": "SHOW-ALTER",
+    "language": "en"
+}
+---
+
+
+
+## SHOW-ALTER
+
+### Name
+
+SHOW ALTER
+
+### Description
+
+(Describe the syntax)
+
+### Example
+
+(Give some examples)
+
+### Keywords
+
+SHOW, ALTER
+
+### Best Practice
+
+(Optional)
+
+```
+
+Note that, for both Chinese and English documents, the above headings are written in English, and pay attention to the heading levels.
+
 ## Multiple Versions
 
 Website documentation supports version tagging via html tags. You can use the `` tag to mark which version a section of content in the document started from, or which version it was removed from.
diff --git a/docs/en/docs/admin-manual/config/config-dir.md b/docs/en/docs/admin-manual/config/config-dir.md
new file mode 100644
index 0000000000..a1341bccad
--- /dev/null
+++ b/docs/en/docs/admin-manual/config/config-dir.md
@@ -0,0 +1,49 @@
+---
+{
+    "title": "Config Dir",
+    "language": "en"
+}
+---
+
+
+
+# Config Dir
+
+The configuration file directory for FE and BE is `conf/`. In addition to holding the default fe.conf, be.conf and other files, this directory also serves as the common configuration file storage directory.
+
+Users can place configuration files in it, and the system will read them automatically.
+
+
+
+## hdfs-site.xml and hive-site.xml
+
+Some functions of Doris need to access data on HDFS or access the Hive metastore.
+
+The various HDFS/Hive parameters can be filled in manually in the corresponding statements of these functions.
+
+But there are many such parameters, and filling in all of them manually is troublesome.
+
+Therefore, users can place the HDFS or Hive configuration files hdfs-site.xml/hive-site.xml directly in the `conf/` directory. Doris will read these configuration files automatically.
+
+The configuration that the user fills in the command overrides the configuration items in the configuration files.
+
+In this way, users only need to fill in a small amount of configuration to access HDFS/Hive.
+
diff --git a/docs/en/docs/admin-manual/maint-monitor/metadata-operation.md b/docs/en/docs/admin-manual/maint-monitor/metadata-operation.md
index 1b7d536380..f71204e3b7 100644
--- a/docs/en/docs/admin-manual/maint-monitor/metadata-operation.md
+++ b/docs/en/docs/admin-manual/maint-monitor/metadata-operation.md
@@ -269,6 +269,18 @@
    curl -u $root_user:$password http://$master_hostname:8030/dump
    ```
3. Replace the image file in the `meta_dir/image` directory on the OBSERVER FE node with the image_mem file, restart the OBSERVER FE node, and verify the integrity and correctness of the image_mem file.
You can check whether the DB and Table metadata are normal on the FE Web page, whether there is an exception in `fe.log`, whether it is in a normal replayed journal.
+
+   Since 1.2.0, it is recommended to use the following method to verify the `image_mem` file:
+
+   ```
+   sh start_fe.sh --image path_to_image_mem
+   ```
+
+   > Notice: `path_to_image_mem` is the path of the `image_mem` file.
+   >
+   > If the verification succeeds, it will print: `Load image success. Image file /absolute/path/to/image.xxxxxx is valid`.
+   >
+   > If the verification fails, it will print: `Load image failed. Image file /absolute/path/to/image.xxxxxx is invalid`.
+
 4. Replace the image file in the `meta_dir/image` directory on the FOLLOWER FE node with the image_mem file in turn, restart the FOLLOWER FE node, and confirm that the metadata and query services are normal.
 5. Replace the image file in the `meta_dir/image` directory on the Master FE node with the image_mem file, restart the Master FE node, and then confirm that the FE Master switch is normal and The Master FE node can generate a new image file through checkpoint.
@@ -393,3 +405,4 @@ The deployment recommendation of FE is described in the Installation and [Deploy
    ```
 
    This means that some transactions that have been persisted need to be rolled back, but the number of entries exceeds the upper limit. Here our default upper limit is 100, which can be changed by setting `txn_rollback_limit`. This operation is only used to attempt to start FE normally, but lost metadata cannot be recovered.
+
diff --git a/docs/en/docs/advanced/broker.md b/docs/en/docs/advanced/broker.md
index 7b463aaf02..dd7ac2cf86 100644
--- a/docs/en/docs/advanced/broker.md
+++ b/docs/en/docs/advanced/broker.md
@@ -26,7 +26,13 @@ under the License.
 
 # Broker
 
-Broker is an optional process in the Doris cluster. It is mainly used to support Doris to read and write files or directories on remote storage, such as HDFS, BOS, and AFS.
+Broker is an optional process in the Doris cluster. It is mainly used to support Doris to read and write files or directories on remote storage. The following remote storage systems are currently supported:
+
+- Apache HDFS
+- Aliyun OSS
+- Tencent Cloud CHDFS
+- Huawei Cloud OBS (since 1.2.0)
+- Amazon S3
 
 Broker provides services through an RPC service port. It is a stateless JVM process that is responsible for encapsulating some POSIX-like file operations for read and write operations on remote storage, such as open, pred, pwrite, and so on. In addition, the Broker does not record any other information, so the connection information, file information, permission information, and so on stored remotely need to be passed to the Broker process in the RPC call through parameters in order for the Broker to read and write files correctly .
@@ -194,3 +200,37 @@ Authentication information is usually provided as a Key-Value in the Property Ma
 )
 ```
 The configuration for accessing the HDFS cluster can be written to the hdfs-site.xml file. When users use the Broker process to read data from the HDFS cluster, they only need to fill in the cluster file path and authentication information.
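+
+For example, the following is a minimal Broker Load sketch (the label, database, table, file path and broker name below are placeholders, and it assumes the HA and authentication related settings have already been placed in hdfs-site.xml):
+
+```sql
+-- Hypothetical example: all object names and paths are placeholders.
+-- With hdfs-site.xml in place, only the file path and authentication
+-- information need to be passed to the Broker.
+LOAD LABEL example_db.broker_load_with_hdfs_site
+(
+    DATA INFILE("hdfs://my_nameservice/user/doris/data/input_file.txt")
+    INTO TABLE example_tbl
+)
+WITH BROKER "broker_name"
+(
+    "username" = "hdfs_user",
+    "password" = "hdfs_passwd"
+);
+```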
+
+#### Tencent Cloud CHDFS
+
+Same as Apache HDFS
+
+#### Aliyun OSS
+
+```
+(
+    "fs.oss.accessKeyId" = "",
+    "fs.oss.accessKeySecret" = "",
+    "fs.oss.endpoint" = ""
+)
+```
+
+#### Huawei Cloud OBS
+
+```
+(
+    "fs.obs.access.key" = "xx",
+    "fs.obs.secret.key" = "xx",
+    "fs.obs.endpoint" = "xx"
+)
+```
+
+#### Amazon S3
+
+```
+(
+    "fs.s3a.access.key" = "xx",
+    "fs.s3a.secret.key" = "xx",
+    "fs.s3a.endpoint" = "xx"
+)
+```
diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md
index 79484cf0b4..b96797babe 100644
--- a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md
+++ b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md
@@ -37,8 +37,10 @@ This statement is used to undo an import job for the specified label. Or batch u
 
 ```sql
 CANCEL LOAD
 [FROM db_name]
-WHERE [LABEL = "load_label" | LABEL like "label_pattern"];
-````
+WHERE [LABEL = "load_label" | LABEL like "label_pattern" | STATE = "PENDING/ETL/LOADING"]
+```
+
+Notice: Canceling by state is supported since 1.2.0.
 
 ### Example
 
@@ -58,6 +60,18 @@ WHERE [LABEL = "load_label" | LABEL like "label_pattern"];
    WHERE LABEL like "example_";
   ````
 
+
+
+3. Cancel all import jobs whose state is "LOADING"
+
+   ```sql
+   CANCEL LOAD
+   FROM example_db
+   WHERE STATE = "loading";
+   ```
+
+
+
 ### Keywords
 
     CANCEL, LOAD
diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md
index beb2ca2db1..4556f683e6 100644
--- a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md
+++ b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md
@@ -48,15 +48,19 @@ illustrate:
 
1. file_path
 
   file_path points to the path where the file is stored and the file prefix. Such as `hdfs://path/to/my_file_`.
 
-   The final filename will consist of `my_file_`, the file number and the file format suffix. The file serial number starts from 0, and the number is the number of files to be divided. Such as:
+   ```
+   The final filename will consist of `my_file_`, the file number and the file format suffix. The file serial number starts from 0, and the count equals the number of files the result is split into. Such as:
    my_file_abcdefg_0.csv
    my_file_abcdefg_1.csv
    my_file_abcdegf_2.csv
+   ```
 
2. format_as
 
+   ```
    FORMAT AS CSV
+   ```
 
   Specifies the export format. Supported formats include CSV, PARQUET, CSV_WITH_NAMES, CSV_WITH_NAMES_AND_TYPES and ORC. Default is CSV.
 
@@ -64,38 +68,40 @@ illustrate:
 
   Specify related properties. Currently exporting via the Broker process, or via the S3 protocol is supported.
 
-   grammar:
-   [PROPERTIES ("key"="value", ...)]
-   The following properties are supported:
-   column_separator: column separator
-   line_delimiter: line delimiter
-   max_file_size: the size limit of a single file, if the result exceeds this value, it will be cut into multiple files.
+   ```
+   grammar:
+   [PROPERTIES ("key"="value", ...)]
+   The following properties are supported:
+   column_separator: column separator. Multi-byte separators are supported, such as: "\\x01", "abc"
+   line_delimiter: line delimiter. Multi-byte separators are supported, such as: "\\x01", "abc"
+   max_file_size: the size limit of a single file, if the result exceeds this value, it will be cut into multiple files.
-   Broker related properties need to be prefixed with `broker.`:
-   broker.name: broker name
-   broker.hadoop.security.authentication: specify the authentication method as kerberos
-   broker.kerberos_principal: specifies the principal of kerberos
-   broker.kerberos_keytab: specifies the path to the keytab file of kerberos. The file must be the absolute path to the file on the server where the broker process is located. and can be accessed by the Broker process
+   Broker related properties need to be prefixed with `broker.`:
+   broker.name: broker name
+   broker.hadoop.security.authentication: specify the authentication method as kerberos
+   broker.kerberos_principal: specifies the principal of kerberos
+   broker.kerberos_keytab: specifies the path to the keytab file of kerberos. The file must be the absolute path to the file on the server where the broker process is located, and must be accessible to the Broker process
 
-   HDFS related properties:
-   fs.defaultFS: namenode address and port
-   hadoop.username: hdfs username
-   dfs.nameservices: if hadoop enable HA, please set fs nameservice. See hdfs-site.xml
-   dfs.ha.namenodes.[nameservice ID]:unique identifiers for each NameNode in the nameservice. See hdfs-site.xml
-   dfs.namenode.rpc-address.[nameservice ID].[name node ID]`:the fully-qualified RPC address for each NameNode to listen on. See hdfs-site.xml
-   dfs.client.failover.proxy.provider.[nameservice ID]:the Java class that HDFS clients use to contact the Active NameNode, usually it is org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
+   HDFS related properties:
+   fs.defaultFS: namenode address and port
+   hadoop.username: hdfs username
+   dfs.nameservices: if Hadoop HA is enabled, please set the fs nameservice. See hdfs-site.xml
+   dfs.ha.namenodes.[nameservice ID]: unique identifiers for each NameNode in the nameservice. See hdfs-site.xml
+   dfs.namenode.rpc-address.[nameservice ID].[name node ID]: the fully-qualified RPC address for each NameNode to listen on. See hdfs-site.xml
+   dfs.client.failover.proxy.provider.[nameservice ID]: the Java class that HDFS clients use to contact the Active NameNode, usually it is org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
 
-   For a kerberos-authentication enabled Hadoop cluster, additional properties need to be set:
-   dfs.namenode.kerberos.principal: HDFS namenode service principal
-   hadoop.security.authentication: kerberos
-   hadoop.kerberos.principal: the Kerberos pincipal that Doris will use when connectiong to HDFS.
-   hadoop.kerberos.keytab: HDFS client keytab location.
+   For a kerberos-authentication enabled Hadoop cluster, additional properties need to be set:
+   dfs.namenode.kerberos.principal: HDFS namenode service principal
+   hadoop.security.authentication: kerberos
+   hadoop.kerberos.principal: the Kerberos principal that Doris will use when connecting to HDFS.
+   hadoop.kerberos.keytab: HDFS client keytab location.
-   For the S3 protocol, you can directly execute the S3 protocol configuration:
-   AWS_ENDPOINT
-   AWS_ACCESS_KEY
-   AWS_SECRET_KEY
-   AWS_REGION
+   For the S3 protocol, you can directly specify the S3 protocol configuration:
+   AWS_ENDPOINT
+   AWS_ACCESS_KEY
+   AWS_SECRET_KEY
+   AWS_REGION
+   ```
 
### example
diff --git a/docs/en/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK.md b/docs/en/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK.md
new file mode 100644
index 0000000000..9ac00222a4
--- /dev/null
+++ b/docs/en/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK.md
@@ -0,0 +1,62 @@
+---
+{
+    "title": "ADMIN-CANCEL-REBALANCE-DISK",
+    "language": "en"
+}
+---
+
+
+
+## ADMIN-CANCEL-REBALANCE-DISK
+
+
+
+### Name
+
+ADMIN CANCEL REBALANCE DISK
+
+### Description
+
+This statement is used to cancel rebalancing disks of the specified backends with high priority.
+
+Grammar:
+
+ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)];
+
+Explain:
+
+1. This statement only indicates that the system no longer rebalances disks of the specified backends with high priority. The system will still rebalance these disks through its default scheduling.
+
+### Example
+
+1. Cancel high-priority disk rebalancing for all backends in the cluster
+
+ADMIN CANCEL REBALANCE DISK;
+
+2. Cancel high-priority disk rebalancing for the specified backends
+
+ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234");
+
+### Keywords
+
+ADMIN, CANCEL, REBALANCE, DISK
+
+### Best Practice
+
+
+
diff --git a/docs/en/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK.md b/docs/en/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK.md
new file mode 100644
index 0000000000..9b0734fdd0
--- /dev/null
+++ b/docs/en/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK.md
@@ -0,0 +1,68 @@
+---
+{
+    "title": "ADMIN-REBALANCE-DISK",
+    "language": "en"
+}
+---
+
+
+
+## ADMIN-REBALANCE-DISK
+
+
+
+### Name
+
+ADMIN REBALANCE DISK
+
+### Description
+
+This statement is used to try to rebalance disks of the specified backends first, regardless of whether the cluster is balanced.
+
+Grammar:
+
+```
+ADMIN REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)];
+```
+
+Explain:
+
+1. This statement only means that the system attempts to rebalance disks of the specified backends with high priority, regardless of whether the cluster is balanced.
+2. The default timeout is 24 hours. After the timeout, the system will no longer rebalance disks of the specified backends with high priority, and the command needs to be executed again to re-enable it.
+
+### Example
+
+1. Attempt to rebalance disks of all backends
+
+```
+ADMIN REBALANCE DISK;
+```
+
+2. 
Attempt to rebalance disks oof the specified backends + +``` +ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); +``` + +### Keywords + +ADMIN,REBALANCE,DISK + +### Best Practice + + diff --git a/docs/sidebars.json b/docs/sidebars.json index 9e2c343305..d5d4b17ea3 100644 --- a/docs/sidebars.json +++ b/docs/sidebars.json @@ -809,7 +809,9 @@ "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-SET-CONFIG", "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-SHOW-TABLET-STORAGE-FORMAT", "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-SHOW-REPLICA-STATUS", - "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET" + "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-COPY-TABLET", + "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK", + "sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK" ] }, { @@ -982,6 +984,7 @@ "type": "category", "label": "Config", "items": [ + "admin-manual/config/config-dir", "admin-manual/config/fe-config", "admin-manual/config/be-config", "admin-manual/config/user-property" diff --git a/docs/zh-CN/community/how-to-contribute/contribute-doc.md b/docs/zh-CN/community/how-to-contribute/contribute-doc.md index b0b38ae6f6..88718d45d0 100644 --- a/docs/zh-CN/community/how-to-contribute/contribute-doc.md +++ b/docs/zh-CN/community/how-to-contribute/contribute-doc.md @@ -24,8 +24,6 @@ specific language governing permissions and limitations under the License. --> - - # Doris 文档贡献 这里我们主要介绍 Doris 的文档怎么修改和贡献, @@ -311,6 +309,70 @@ under the License. 所有图片都在 `static/images `目录下面 +## 如何编写命令帮助手册 + +命令帮助手册文档,是指在 `docs/sql-manual` 下的文档。这些文档用于两个地方: + +1. 官网文档展示。 +2. HELP 命令的输出。 + +为了支持 HELP 命令输出,这些文档需要严格按照以下格式排版编写,否则无法通过准入检查。 + +以 `SHOW ALTER` 命令示例如下: + +``` +--- +{ + "title": "SHOW-ALTER", + "language": "zh-CN" +} +--- + + + +## SHOW-ALTER + +### Nameo + +SHOW ALTER + +### Description + +(描述命令语法。) + +### Example + +(提供命令示例。) + +### Keywords + +SHOW, ALTER + +### Best Practice + +(最佳实践(如有)) + +``` + +注意,不论中文还是英文文档,以上标题都是用英文,并且注意标题的层级。 + ## 文档多版本 网站文档支持通过 html 标签标记版本。可以通过 `` 标签标记文档中的某段内容是从哪个版本开始的,或者从哪个版本移除。 diff --git a/docs/zh-CN/docs/admin-manual/config/config-dir.md b/docs/zh-CN/docs/admin-manual/config/config-dir.md new file mode 100644 index 0000000000..46baf4e67e --- /dev/null +++ b/docs/zh-CN/docs/admin-manual/config/config-dir.md @@ -0,0 +1,49 @@ +--- +{ + "title": "配置文件目录", + "language": "zh-CN" +} +--- + + + +# 配置文件目录 + +FE 和 BE 的配置文件目录为 `conf/`。这个目录除了存放默认的 fe.conf, be.conf 等文件外,也被用于公用的配置文件存放目录。 + +用户可以在其中存放一些配置文件,系统会自动读取。 + + + +## hdfs-site.xml 和 hive-site.xml + +在 Doris 的一些功能中,需要访问 HDFS 上的数据,或者访问 Hive metastore。 + +我们可以通过在功能相应的语句中,手动的填写各种 HDFS/Hive 的参数。 + +但这些参数非常多,如果全部手动填写,非常麻烦。 + +因此,用户可以将 HDFS 或 Hive 的配置文件 hdfs-site.xml/hive-site.xml 直接放置在 `conf/` 目录下。Doris 会自动读取这些配置文件。 + +而用户在命令中填写的配置,会覆盖配置文件中的配置项。 + +这样,用户仅需填写少量的配置,即可完成对 HDFS/Hive 的访问。 + + diff --git a/docs/zh-CN/docs/admin-manual/maint-monitor/metadata-operation.md b/docs/zh-CN/docs/admin-manual/maint-monitor/metadata-operation.md index aa1752c93d..4494f767fc 100644 --- a/docs/zh-CN/docs/admin-manual/maint-monitor/metadata-operation.md +++ b/docs/zh-CN/docs/admin-manual/maint-monitor/metadata-operation.md @@ -267,10 +267,25 @@ FE 目前有以下几个端口 ``` curl -u $root_user:$password http://$master_hostname:8030/dump ``` + 3. 
用 image_mem 文件替换掉 OBSERVER FE 节点上`meta_dir/image`目录下的 image 文件,重启 OBSERVER FE 节点, 验证 image_mem 文件的完整性和正确性(可以在 FE Web 页面查看 DB 和 Table 的元数据是否正常,查看fe.log 是否有异常,是否在正常 replayed journal) + + 自 1.2.0 版本起,推荐使用以下功能验证 `image_mem` 文件: + + ``` + sh start_fe.sh --image path_to_image_mem + ``` + + > 注意:`path_to_image_mem` 是 image_mem 文件的路径。 + > + > 如果文件有效会输出 `Load image success. Image file /absolute/path/to/image.xxxxxx is valid`。 + > + > 如果文件无效会输出 `Load image failed. Image file /absolute/path/to/image.xxxxxx is invalid`。 + 4. 依次用 image_mem 文件替换掉 FOLLOWER FE 节点上`meta_dir/image`目录下的 image 文件,重启 FOLLOWER FE 节点, 确认元数据和查询服务都正常 + 5. 用 image_mem 文件替换掉 Master FE 节点上`meta_dir/image`目录下的 image 文件,重启 Master FE 节点, 确认 FE Master 切换正常, Master FE 节点可以通过 checkpoint 正常生成新的 image 文件 6. 集群恢复所有 Load,Create,Alter 操作 diff --git a/docs/zh-CN/docs/advanced/broker.md b/docs/zh-CN/docs/advanced/broker.md index b7bd374efb..8b95a5f497 100644 --- a/docs/zh-CN/docs/advanced/broker.md +++ b/docs/zh-CN/docs/advanced/broker.md @@ -26,7 +26,13 @@ under the License. # Broker -Broker 是 Doris 集群中一种可选进程,主要用于支持 Doris 读写远端存储上的文件和目录,如 HDFS、BOS 和 AFS 等。 +Broker 是 Doris 集群中一种可选进程,主要用于支持 Doris 读写远端存储上的文件和目录。目前支持以下远端存储: + +- Apache HDFS +- 阿里云 OSS +- 腾讯云 CHDFS +- 华为云 OBS (1.2.0 版本后支持) +- 亚马逊 S3 Broker 通过提供一个 RPC 服务端口来提供服务,是一个无状态的 Java 进程,负责为远端存储的读写操作封装一些类 POSIX 的文件操作,如 open,pread,pwrite 等等。除此之外,Broker 不记录任何其他信息,所以包括远端存储的连接信息、文件信息、权限信息等等,都需要通过参数在 RPC 调用中传递给 Broker 进程,才能使得 Broker 能够正确读写文件。 @@ -91,7 +97,7 @@ WITH BROKER "broker_name" 不同的 Broker 类型,以及不同的访问方式需要提供不同的认证信息。认证信息通常在 `WITH BROKER "broker_name"` 之后的 Property Map 中以 Key-Value 的方式提供。 -#### 社区版 HDFS +#### Apache HDFS 1. 简单认证 @@ -187,3 +193,37 @@ WITH BROKER "broker_name" ``` 关于HDFS集群的配置可以写入hdfs-site.xml文件中,用户使用Broker进程读取HDFS集群的信息时,只需要填写集群的文件路径名和认证信息即可。 + +#### 腾讯云 CHDFS + +同 Apache HDFS + +#### 阿里云 OSS + +``` +( + "fs.oss.accessKeyId" = "", + "fs.oss.accessKeySecret" = "", + "fs.oss.endpoint" = "" +) +``` + +#### 华为云 OBS + +``` +( + "fs.obs.access.key" = "xx", + "fs.obs.secret.key" = "xx", + "fs.obs.endpoint" = "xx" +) +``` + +#### 亚马逊 S3 + +``` +( + "fs.s3a.access.key" = "xx", + "fs.s3a.secret.key" = "xx", + "fs.s3a.endpoint" = "xx" +) +``` diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md index 2f4ca856a4..5c7a648cc2 100644 --- a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md +++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/CANCEL-LOAD.md @@ -37,9 +37,11 @@ CANCEL LOAD ```sql CANCEL LOAD [FROM db_name] -WHERE [LABEL = "load_label" | LABEL like "label_pattern"]; +WHERE [LABEL = "load_label" | LABEL like "label_pattern" | STATE = "PENDING/ETL/LOADING"] ``` +注:1.2.0 版本之后支持根据 State 取消作业。 + ### Example 1. 撤销数据库 example_db 上, label 为 `example_db_test_load_label` 的导入作业 @@ -58,6 +60,18 @@ WHERE [LABEL = "load_label" | LABEL like "label_pattern"]; WHERE LABEL like "example_"; ``` + + +3. 
取消状态为 LOADING 的导入作业。 + + ```sql + CANCEL LOAD + FROM example_db + WHERE STATE = "loading"; + ``` + + + ### Keywords CANCEL, LOAD diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md index 327e22f7e3..b3db0a349c 100644 --- a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md +++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/OUTFILE.md @@ -47,53 +47,62 @@ INTO OUTFILE "file_path" 1. file_path - ​ file_path 指向文件存储的路径以及文件前缀。如 `hdfs://path/to/my_file_`。 + ``` + file_path 指向文件存储的路径以及文件前缀。如 `hdfs://path/to/my_file_`。 - ​ 最终的文件名将由 `my_file_`,文件序号以及文件格式后缀组成。其中文件序号由0开始,数量为文件被分割的数量。如: - ​ my_file_abcdefg_0.csv - ​ my_file_abcdefg_1.csv - ​ my_file_abcdegf_2.csv + 最终的文件名将由 `my_file_`,文件序号以及文件格式后缀组成。其中文件序号由0开始,数量为文件被分割的数量。如: + my_file_abcdefg_0.csv + my_file_abcdefg_1.csv + my_file_abcdegf_2.csv + ``` 2. format_as - ​ FORMAT AS CSV - ​ 指定导出格式. 支持 CSV、PARQUET、CSV_WITH_NAMES、CSV_WITH_NAMES_AND_TYPES、ORC. 默认为 CSV。 + + ``` + FORMAT AS CSV + ``` + + 指定导出格式. 支持 CSV、PARQUET、CSV_WITH_NAMES、CSV_WITH_NAMES_AND_TYPES、ORC. 默认为 CSV。 3. properties - ​ 指定相关属性。目前支持通过 Broker 进程, 或通过 S3 协议进行导出。 + ``` + 指定相关属性。目前支持通过 Broker 进程, 或通过 S3 协议进行导出。 - 语法: - [PROPERTIES ("key"="value", ...)] - 支持如下属性: - column_separator: 列分隔符 - line_delimiter: 行分隔符 - max_file_size: 单个文件大小限制,如果结果超过这个值,将切割成多个文件。 - - Broker 相关属性需加前缀 `broker.`: - broker.name: broker名称 - broker.hadoop.security.authentication: 指定认证方式为 kerberos - broker.kerberos_principal: 指定 kerberos 的 principal - broker.kerberos_keytab: 指定 kerberos 的 keytab 文件路径。该文件必须为 Broker 进程所在服务器上的文件的绝对路径。并且可以被 Broker 进程访问 - - HDFS 相关属性: - fs.defaultFS: namenode 地址和端口 - hadoop.username: hdfs 用户名 - dfs.nameservices: name service名称,与hdfs-site.xml保持一致 - dfs.ha.namenodes.[nameservice ID]: namenode的id列表,与hdfs-site.xml保持一致 - dfs.namenode.rpc-address.[nameservice ID].[name node ID]: Name node的rpc地址,数量与namenode数量相同,与hdfs-site.xml保 -持一致 - dfs.client.failover.proxy.provider.[nameservice ID]: HDFS客户端连接活跃namenode的java类,通常是"org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" + 语法: + [PROPERTIES ("key"="value", ...)] + 支持如下属性: + column_separator: 列分隔符。支持多字节分隔符,如:"\\x01", "abc" + line_delimiter: 行分隔符。支持多字节分隔符,如:"\\x01", "abc" + max_file_size: 单个文件大小限制,如果结果超过这个值,将切割成多个文件。 + + Broker 相关属性需加前缀 `broker.`: + broker.name: broker名称 + broker.hadoop.security.authentication: 指定认证方式为 kerberos + broker.kerberos_principal: 指定 kerberos 的 principal + broker.kerberos_keytab: 指定 kerberos 的 keytab 文件路径。该文件必须为 Broker 进程所在服务器上的文件的绝对路径。并且可以被 Broker 进程访问 + + HDFS 相关属性: + fs.defaultFS: namenode 地址和端口 + hadoop.username: hdfs 用户名 + dfs.nameservices: name service名称,与hdfs-site.xml保持一致 + dfs.ha.namenodes.[nameservice ID]: namenode的id列表,与hdfs-site.xml保持一致 + dfs.namenode.rpc-address.[nameservice ID].[name node ID]: Name node的rpc地址,数量与namenode数量相同,与hdfs-site.xml保 +持一 + dfs.client.failover.proxy.provider.[nameservice ID]: HDFS客户端连接活跃namenode的java类,通常是"org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider" - 对于开启kerberos认证的Hadoop 集群,还需要额外设置如下 PROPERTIES 属性: - dfs.namenode.kerberos.principal: HDFS namenode 服务的 principal 名称 - hadoop.security.authentication: 认证方式设置为 kerberos - hadoop.kerberos.principal: 设置 Doris 连接 HDFS 时使用的 Kerberos 主体 - hadoop.kerberos.keytab: 设置 keytab 本地文件路径 - S3 协议则直接执行 S3 协议配置即可: - AWS_ENDPOINT - AWS_ACCESS_KEY - AWS_SECRET_KEY - AWS_REGION + 对于开启kerberos认证的Hadoop 集群,还需要额外设置如下 PROPERTIES 属性: + 
dfs.namenode.kerberos.principal: HDFS namenode 服务的 principal 名称 + hadoop.security.authentication: 认证方式设置为 kerberos + hadoop.kerberos.principal: 设置 Doris 连接 HDFS 时使用的 Kerberos 主体 + hadoop.kerberos.keytab: 设置 keytab 本地文件路径 + + S3 协议则直接执行 S3 协议配置即可: + AWS_ENDPOINT + AWS_ACCESS_KEY + AWS_SECRET_KEY + AWS_REGION + ``` ### example diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK.md b/docs/zh-CN/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK.md new file mode 100644 index 0000000000..4ddf546c25 --- /dev/null +++ b/docs/zh-CN/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-CANCEL-REBALANCE-DISK.md @@ -0,0 +1,61 @@ +--- +{ + "title": "ADMIN-CANCEL-REBALANCE-DISK", + "language": "zh-CN" +} +--- + + + +## ADMIN-CANCEL-REBALANCE-DISK + + + +### Name + +ADMIN CANCEL REBALANCE DISK + +### Description + + 该语句用于取消优先均衡BE的磁盘 + + 语法: + + ADMIN CANCEL REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; + + 说明: + + 1. 该语句仅表示系统不再优先均衡指定BE的磁盘数据。系统仍会以默认调度方式均衡BE的磁盘数据。 + +### Example + + 1. 取消集群所有BE的优先磁盘均衡 + + ADMIN CANCEL REBALANCE DISK; + + 2. 取消指定BE的优先磁盘均衡 + + ADMIN CANCEL REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); + +### Keywords + + ADMIN,CANCEL,REBALANCE,DISK + +### Best Practice + + diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK.md b/docs/zh-CN/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK.md new file mode 100644 index 0000000000..1966bc1fba --- /dev/null +++ b/docs/zh-CN/docs/sql-manual/sql-reference/Database-Administration-Statements/ADMIN-REBALANCE-DISK.md @@ -0,0 +1,70 @@ +--- +{ + "title": "ADMIN-REBALANCE-DISK", + "language": "zh-CN" +} +--- + + + +## ADMIN-REBALANCE-DISK + + + +### Name + +ADMIN REBALANCE DISK + +### Description + +该语句用于尝试优先均衡指定的BE磁盘数据 + +语法: + + ``` + ADMIN REBALANCE DISK [ON ("BackendHost1:BackendHeartBeatPort1", "BackendHost2:BackendHeartBeatPort2", ...)]; + ``` + +说明: + + 1. 该语句表示让系统尝试优先均衡指定BE的磁盘数据,不受限于集群是否均衡。 + 2. 默认的 timeout 是 24小时。超时意味着系统将不再优先均衡指定的BE磁盘数据。需要重新使用该命令设置。 + 3. 指定BE的磁盘数据均衡后,该BE的优先级将会失效。 + +### Example + +1. 尝试优先均衡集群内的所有BE + + ``` + ADMIN REBALANCE DISK; + ``` + +2. 
尝试优先均衡指定BE + + ``` + ADMIN REBALANCE DISK ON ("192.168.1.1:1234", "192.168.1.2:1234"); + ``` + +### Keywords + + ADMIN,REBALANCE,DISK + +### Best Practice + + + diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/CancelLoadStmt.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/CancelLoadStmt.java index 263b66cf31..c08f6370a4 100644 --- a/fe/fe-core/src/main/java/org/apache/doris/analysis/CancelLoadStmt.java +++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/CancelLoadStmt.java @@ -21,6 +21,7 @@ import org.apache.doris.analysis.BinaryPredicate.Operator; import org.apache.doris.cluster.ClusterNamespace; import org.apache.doris.common.AnalysisException; import org.apache.doris.common.UserException; +import org.apache.doris.load.loadv2.JobState; import com.google.common.base.Strings; import com.google.common.collect.Sets; @@ -83,6 +84,14 @@ public class CancelLoadStmt extends DdlStmt { throw new AnalysisException("Only label can use like"); } state = inputValue; + try { + JobState jobState = JobState.valueOf(state); + if (jobState != JobState.PENDING && jobState != JobState.ETL && jobState != JobState.LOADING) { + throw new AnalysisException("invalid state: " + state); + } + } catch (IllegalArgumentException e) { + throw new AnalysisException("invalid state: " + state); + } } } diff --git a/fe/fe-core/src/test/java/org/apache/doris/analysis/CancelLoadStmtTest.java b/fe/fe-core/src/test/java/org/apache/doris/analysis/CancelLoadStmtTest.java index f2e9a39a63..5351e638e8 100644 --- a/fe/fe-core/src/test/java/org/apache/doris/analysis/CancelLoadStmtTest.java +++ b/fe/fe-core/src/test/java/org/apache/doris/analysis/CancelLoadStmtTest.java @@ -58,7 +58,6 @@ public class CancelLoadStmtTest extends TestWithFeService { StringLiteral labelStringLiteral = new StringLiteral("doris_test_label"); SlotRef stateSlotRef = new SlotRef(null, "state"); - StringLiteral stateStringLiteral = new StringLiteral("FINISHED"); BinaryPredicate labelBinaryPredicate = new BinaryPredicate(BinaryPredicate.Operator.EQ, labelSlotRef, labelStringLiteral); @@ -75,11 +74,12 @@ public class CancelLoadStmtTest extends TestWithFeService { Assertions.assertEquals("CANCEL LOAD FROM default_cluster:testDb WHERE `LABEL` = 'doris_test_label'", stmtUpper.toString()); + StringLiteral stateStringLiteral = new StringLiteral("LOADING"); BinaryPredicate stateBinaryPredicate = new BinaryPredicate(BinaryPredicate.Operator.EQ, stateSlotRef, stateStringLiteral); stmt = new CancelLoadStmt(null, stateBinaryPredicate); stmt.analyze(analyzer); - Assertions.assertEquals("CANCEL LOAD FROM default_cluster:testDb WHERE `state` = 'FINISHED'", stmt.toString()); + Assertions.assertEquals("CANCEL LOAD FROM default_cluster:testDb WHERE `state` = 'LOADING'", stmt.toString()); LikePredicate labelLikePredicate = new LikePredicate(LikePredicate.Operator.LIKE, labelSlotRef, labelStringLiteral); @@ -93,7 +93,7 @@ public class CancelLoadStmtTest extends TestWithFeService { stmt = new CancelLoadStmt(null, compoundAndPredicate); stmt.analyze(analyzer); Assertions.assertEquals( - "CANCEL LOAD FROM default_cluster:testDb WHERE `label` = 'doris_test_label' AND `state` = 'FINISHED'", + "CANCEL LOAD FROM default_cluster:testDb WHERE `label` = 'doris_test_label' AND `state` = 'LOADING'", stmt.toString()); CompoundPredicate compoundOrPredicate = new CompoundPredicate(Operator.OR, labelBinaryPredicate, @@ -101,7 +101,7 @@ public class CancelLoadStmtTest extends TestWithFeService { stmt = new CancelLoadStmt(null, compoundOrPredicate); 
stmt.analyze(analyzer); Assertions.assertEquals( - "CANCEL LOAD FROM default_cluster:testDb WHERE `label` = 'doris_test_label' OR `state` = 'FINISHED'", + "CANCEL LOAD FROM default_cluster:testDb WHERE `label` = 'doris_test_label' OR `state` = 'LOADING'", stmt.toString()); // test match @@ -127,19 +127,19 @@ public class CancelLoadStmtTest extends TestWithFeService { stmt = new CancelLoadStmt(null, stateBinaryPredicate); stmt.analyze(analyzer); LoadManager.addNeedCancelLoadJob(stmt, loadJobs, matchLoadJobs); - Assertions.assertEquals(3, matchLoadJobs.size()); + Assertions.assertEquals(0, matchLoadJobs.size()); // or matchLoadJobs.clear(); stmt = new CancelLoadStmt(null, compoundOrPredicate); stmt.analyze(analyzer); LoadManager.addNeedCancelLoadJob(stmt, loadJobs, matchLoadJobs); - Assertions.assertEquals(3, matchLoadJobs.size()); + Assertions.assertEquals(1, matchLoadJobs.size()); // and matchLoadJobs.clear(); stmt = new CancelLoadStmt(null, compoundAndPredicate); stmt.analyze(analyzer); LoadManager.addNeedCancelLoadJob(stmt, loadJobs, matchLoadJobs); - Assertions.assertEquals(1, matchLoadJobs.size()); + Assertions.assertEquals(0, matchLoadJobs.size()); } @Test