Close related #7389
Support create Iceberg external table in Doris.
This is the first step to support Iceberg external table.
### Create Iceberg external table
This pr describes two ways to create Iceberg external tables. Both ways do not require explicitly specifying column definitions, Doris automatically converts them based on Iceberg's column definitions.
1. Create an Iceberg external table directly
```sql
CREATE [EXTERNAL] TABLE table_name
ENGINE = ICEBERG
[COMMENT "comment"]
PROPERTIES (
"iceberg.database" = "iceberg_db_name",
"iceberg.table" = "icberg_table_name",
"iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);
```
2. Create an Iceberg database and automatically create all the tables under that db.
```sql
CREATE DATABASE db_name
[COMMENT "comment"]
PROPERTIES (
"iceberg.database" = "iceberg_db_name",
"iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
"iceberg.catalog.type" = "HIVE_CATALOG"
);
```
### Show table creation
1. For individual tables you can view them with `help show create table`.
```sql
mysql> show create table iceberg_db.logs_1;
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logs_1 | CREATE TABLE `logs_1` (
`level` varchar(-1) NOT NULL COMMENT "null",
`event_time` datetime NOT NULL COMMENT "null",
`message` varchar(-1) NOT NULL COMMENT "null"
) ENGINE=ICEBERG
COMMENT "ICEBERG"
PROPERTIES (
"iceberg.database" = "doris",
"iceberg.table" = "logs_1",
"iceberg.hive.metastore.uris" = "thrift://10.10.10.10:9087",
"iceberg.catalog.type" = "HIVE_CATALOG"
) |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
2. For Iceberg database, you can view it with `help show table creation`.
```sql
mysql> show table creation from iceberg_db;
+--------+---------+---------------------+---------------------------------------------------------+
| Table | Status | Create Time | Error Msg |
+--------+---------+---------------------+---------------------------------------------------------+
| logs | fail | 2021-12-14 13:50:10 | Cannot convert unknown type to Doris type: list<string> |
| logs_1 | success | 2021-12-14 13:50:10 | |
+--------+---------+---------------------+---------------------------------------------------------+
2 rows in set (0.00 sec)
```
This is a new syntax.
Show table creation records in Iceberg database:
Syntax:
```sql
SHOW TABLE CREATION [FROM db] [LIKE mask]
```
Users can directly query the data in the hive table in Doris, and can use join to perform complex queries without laboriously importing data from hive.
Main changes list below:
FE:
Extend HiveScanNode from BrokerScanNode
HiveMetaStoreClientHelper communicate with HIVE and HDFS.
BE:
Treate HiveScanNode as BrokerScanNode, treate HiveTable as BrokerTable.
broker_scanner.cpp: suppot read column from HDFS path.
orc_scanner.cpp: support read hdfs file.
POM:
Add hive.version=2.3.7, hive-metastore and hive-exec
Add hadoop.version=2.8.0, hadoop-hdfs
Upgrade commons-lang to fix incompatiblity of Java 9 and later.
Thrift:
Add THiveTable
Add read_by_column_def in TBrokerRangeDesc
some user do not know Doris support type boolean, they use TINYINT,
so i add type BOOLEAN when enter 'help create table' in mysql client.
currently, type BOOLEAN size is 1 byte, but the value of boolean column only in {0,1} ,
which waste some memory, and i want change it's implement to 1 bit in the future.
1. Fix bug of UNKNOWN Operation Type 91
2. Support using resource_tag property of user to limit the usage of BE
3. Add new FE config `disable_tablet_scheduler` to disable tablet scheduler.
4. Add documents for resource tag.
5. Modify the default value of FE config `default_db_data_quota_bytes` to 1PB.
6. Add a new BE config `disable_compaction_trace_log` to disable the trace log of compaction time cost.
7. Modify the default value of BE config `remote_storage_read_buffer_mb` to 16MB
8. Fix `show backends` results error
9. Add new BE config `external_table_connect_timeout_sec` to set the timeout when connecting to odbc and mysql table.
10. Modify issue template to enable blank issue, for release note or other specific usage.
11. Fix a bug in alpha_row_set split_range() function.
fix the issue #5995
Add the property "dynamic_partition.history_partition_num" to specify the history partition number when enable create_history_partition to fix the invalid date format value
and add these two properties to docs
- There is a fe configuration called dynamic_partition_enable
which controls the opening and closing of the dynamic partition function.
When this configuration is false, it means that all tables do not support dynamic partitioning.
- But when the user tried to create the dynamic partition table, Doris did not detect this parameter.
This will cause the user can normally create a dynamic partition table,
but in fact Doris cannot create a partition for this table.
- This pr detect this config when building the table.
The dynamic partition table can be created only when the dynamic_partition_enable configuration is true.
If the configuration is false, the command to create a dynamic partition table will directly report an error.
Currently we choose BE random without check disk is available,
the create table will failed until create tablet task is sent to BE
and BE will check is there has available capacity to create tablet.
So check backend disk available by storage medium will reduce unnecessary RPC call.