doris/docs/en/sql-reference/sql-statements/Data Definition/CREATE RESOURCE.md
qiye bca121333e [feature](cold-hot) support s3 resource (#8808)
Add cold hot support in FE meta, support alter resource DDL in FE
2022-04-13 09:52:03 +08:00



CREATE RESOURCE

Description

This statement creates a resource. Only the root or admin user can create resources. Doris currently supports Spark, ODBC, and S3 external resources.
Other external resources may be added to Doris in the future, such as Spark/GPU for query, HDFS/S3 for external storage, and MapReduce for ETL.

Syntax:
     CREATE [EXTERNAL] RESOURCE "resource_name"
     PROPERTIES ("key"="value", ...);
        
Explanation:
    1. The resource type must be specified in PROPERTIES as "type" = "[spark|odbc_catalog|s3]". The currently supported types are spark, odbc_catalog, and s3.
    2. The remaining PROPERTIES vary by resource type; see the examples for details.
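
Once a resource exists, it can be inspected or removed by name. The following is a minimal sketch, assuming the `SHOW RESOURCES` and `DROP RESOURCE` statements are available in your Doris version; the resource name spark0 is taken from the first example below.

```sql
-- List the properties of the resource named spark0 (assumes SHOW RESOURCES is supported).
SHOW RESOURCES WHERE NAME = "spark0";

-- Remove the resource when it is no longer needed (assumes DROP RESOURCE is supported).
DROP RESOURCE "spark0";
```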

Example

1. Create a Spark resource named spark0 in yarn cluster mode.

````
    CREATE EXTERNAL RESOURCE "spark0"
    PROPERTIES
    (
      "type" = "spark",
      "spark.master" = "yarn",
      "spark.submit.deployMode" = "cluster",
      "spark.jars" = "xxx.jar,yyy.jar",
      "spark.files" = "/tmp/aaa,/tmp/bbb",
      "spark.executor.memory" = "1g",
      "spark.yarn.queue" = "queue0",
      "spark.hadoop.yarn.resourcemanager.address" = "127.0.0.1:9999",
      "spark.hadoop.fs.defaultFS" = "hdfs://127.0.0.1:10000",
      "working_dir" = "hdfs://127.0.0.1:10000/tmp/doris",
      "broker" = "broker0",
      "broker.username" = "user0",
      "broker.password" = "password0"
    );
````
                                                                                                                                                                                                          
The Spark-related parameters are as follows:
- spark.master: Required. Currently supports yarn and spark://host:port.
- spark.submit.deployMode: Required. The deployment mode of the Spark program; supports cluster and client.
- spark.hadoop.yarn.resourcemanager.address: Required when spark.master is yarn.
- spark.hadoop.fs.defaultFS: Required when spark.master is yarn.
- Other parameters are optional; see http://spark.apache.org/docs/latest/configuration.html
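
For comparison with the yarn example, the sketch below shows what a resource pointing at a standalone Spark master in client mode might look like. The resource name spark1, the master address, and the broker name broker1 are hypothetical placeholders; the two yarn-specific spark.hadoop.* properties are left out because the list above marks them as required only when the master is yarn.

```sql
-- Hypothetical resource for a standalone Spark master in client mode.
CREATE EXTERNAL RESOURCE "spark1"
PROPERTIES
(
  "type" = "spark",
  -- spark://host:port form of spark.master; the address is a placeholder
  "spark.master" = "spark://127.0.0.1:7777",
  "spark.submit.deployMode" = "client",
  -- working_dir and broker are needed when the resource is used for ETL
  "working_dir" = "hdfs://127.0.0.1:10000/tmp/doris",
  "broker" = "broker1"
);
```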

working_dir and broker must be specified when Spark is used for ETL:
- working_dir: The directory used by ETL. Required when Spark is used as an ETL resource. For example: hdfs://host:port/tmp/doris.
- broker: The broker name. Required when Spark is used as an ETL resource. The broker must be configured in advance with the `ALTER SYSTEM ADD BROKER` command.
- broker.property_key: The authentication information that the broker needs when reading the intermediate files generated by ETL.

2. Create an ODBC resource

````
    CREATE EXTERNAL RESOURCE `oracle_odbc`
    PROPERTIES (
    "type" = "odbc_catalog",
    "host" = "192.168.0.1",
    "port" = "8086",
    "user" = "test",
    "password" = "test",
    "database" = "test",
    "odbc_type" = "oracle",
    "driver" = "Oracle 19 ODBC driver"
    );
````

The ODBC-related parameters are as follows:
- host: IP address of the external database
- driver: The driver name used by the ODBC external table, which must match the driver name in be/conf/odbcinst.ini.
- odbc_type: The type of the external database; currently supports oracle, mysql, and postgresql
- user: Username for the external database
- password: Password of the corresponding user
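
An ODBC resource is typically referenced from an external table. The sketch below assumes the ODBC external table syntax with the `odbc_catalog_resource` property; the table name, column definitions, and the remote table name baseall are hypothetical and should be replaced with those of your own schema.

```sql
-- Hypothetical external table that reads through the oracle_odbc resource.
CREATE EXTERNAL TABLE `baseall_oracle` (
  `k1` decimal(9, 3) NOT NULL COMMENT "",
  `k2` char(10) NOT NULL COMMENT ""
) ENGINE=ODBC
COMMENT "ODBC"
PROPERTIES (
"odbc_catalog_resource" = "oracle_odbc",  -- name of the resource created above
"database" = "test",                      -- database in the external system
"table" = "baseall"                       -- hypothetical table in the external system
);
```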

3. Create an S3 resource

````
CREATE RESOURCE "remote_s3"
PROPERTIES
(
"type" = "s3",
"s3_endpoint" = "http://bj.s3.com",
"s3_region" = "bj",
"s3_root_path" = "/path/to/root",
"s3_access_key" = "bbb",
"s3_secret_key" = "aaaa",
"s3_max_connections" = "50",
"s3_request_timeout_ms" = "3000",
"s3_connection_timeout_ms" = "1000"
);
````

The S3-related parameters are as follows:
- required
    - s3_endpoint: S3 endpoint
    - s3_region: S3 region
    - s3_root_path: S3 root directory
    - s3_access_key: S3 access key
    - s3_secret_key: S3 secret key
- optional
    - s3_max_connections: the maximum number of S3 connections; the default is 50
    - s3_request_timeout_ms: S3 request timeout in milliseconds; the default is 3000
    - s3_connection_timeout_ms: S3 connection timeout in milliseconds; the default is 1000
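
Since the three optional parameters have defaults, a resource can presumably be created with only the required ones. The sketch below reuses the placeholder endpoint and credentials from the example above; the resource name remote_s3_minimal is hypothetical.

```sql
-- Only the required S3 properties are set; s3_max_connections,
-- s3_request_timeout_ms and s3_connection_timeout_ms fall back to
-- their documented defaults (50, 3000 and 1000).
CREATE RESOURCE "remote_s3_minimal"
PROPERTIES
(
"type" = "s3",
"s3_endpoint" = "http://bj.s3.com",
"s3_region" = "bj",
"s3_root_path" = "/path/to/root",
"s3_access_key" = "bbb",
"s3_secret_key" = "aaaa"
);
```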

Keyword

CREATE, RESOURCE