[enhancement](schema) dynamic_partition.time_unit support year (#19551)

dynamic_partition.time_unit support year
xueweizhang
2023-05-17 23:49:15 +08:00
committed by GitHub
parent 8aa7f0e188
commit 97d4778ecf
8 changed files with 79 additions and 12 deletions

View File

@ -83,7 +83,7 @@ The rules of dynamic partition are prefixed with `dynamic_partition.`:
* `dynamic_partition.time_unit`
The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, or `MONTH`, which create or delete partitions by hour, day, week, and month, respectively.
The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, `MONTH`, or `YEAR`, which create or delete partitions by hour, day, week, month, and year, respectively.
When specified as `HOUR`, the suffix format of the dynamically created partition name is `yyyyMMddHH`, for example, `2020032501`. *When the time unit is HOUR, the data type of partition column cannot be DATE.*
@ -93,6 +93,8 @@ The rules of dynamic partition are prefixed with `dynamic_partition.`:
When specified as `MONTH`, the suffix format of the dynamically created partition name is `yyyyMM`, for example, `202003`.
When specified as `YEAR`, the suffix format of the dynamically created partition name is `yyyy`, for example, `2020`.
* `dynamic_partition.time_zone`
The time zone of the dynamic partition. If not set, it defaults to the system time zone of the current machine, such as `Asia/Shanghai`. For the list of supported time zones, see `https://en.wikipedia.org/wiki/List_of_tz_database_time_zones`.
@ -159,11 +161,11 @@ The rules of dynamic partition are prefixed with `dynamic_partition.`:
* `dynamic_partition.reserved_history_periods`
The range of reserved history periods. It should be in the form `[yyyy-MM-dd,yyyy-MM-dd],[...,...]` when `dynamic_partition.time_unit` is `DAY`, `WEEK`, or `MONTH`, and in the form `[yyyy-MM-dd HH:mm:ss,yyyy-MM-dd HH:mm:ss],[...,...]` when `dynamic_partition.time_unit` is `HOUR`. No extra spaces are allowed. The default value is `"NULL"`, which means it is not set.
The range of reserved history periods. It should be in the form `[yyyy-MM-dd,yyyy-MM-dd],[...,...]` when `dynamic_partition.time_unit` is `DAY`, `WEEK`, `MONTH`, or `YEAR`, and in the form `[yyyy-MM-dd HH:mm:ss,yyyy-MM-dd HH:mm:ss],[...,...]` when `dynamic_partition.time_unit` is `HOUR`. No extra spaces are allowed. The default value is `"NULL"`, which means it is not set.
Let us give an example. Suppose today is 2021-09-06, the table is partitioned by day, and the dynamic partition properties are set to:
```time_unit="DAY/WEEK/MONTH", end=3, start=-3, reserved_history_periods="[2020-06-01,2020-06-20],[2020-10-31,2020-11-15]"```.
```time_unit="DAY/WEEK/MONTH/YEAR", end=3, start=-3, reserved_history_periods="[2020-06-01,2020-06-20],[2020-10-31,2020-11-15]"```.
The system will automatically reserve the partitions in the following periods:

View File

@ -364,7 +364,7 @@ distribution_desc
The relevant parameters of dynamic partition are as follows:
* `dynamic_partition.enable`: Used to specify whether the dynamic partition function at the table level is enabled. The default is true.
* `dynamic_partition.time_unit`: Used to specify the time unit for dynamically adding partitions. Can be DAY (day), WEEK (week), MONTH (month), or HOUR (hour).
* `dynamic_partition.time_unit`: Used to specify the time unit for dynamically adding partitions. Can be DAY (day), WEEK (week), MONTH (month), YEAR (year), or HOUR (hour).
* `dynamic_partition.start`: Used to specify how many partitions to delete forward. The value must be less than 0. The default is Integer.MIN_VALUE.
* `dynamic_partition.end`: Used to specify the number of partitions created in advance. The value must be greater than 0.
* `dynamic_partition.prefix`: Used to specify the partition name prefix to be created. For example, if the partition name prefix is p, the partition name will be automatically created as p20200108.
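
The partition name is the configured prefix followed by a time suffix whose length depends on `dynamic_partition.time_unit`. The following is a minimal Java sketch of that naming rule, not the Doris implementation: the class `PartitionNameSketch` and the method `partitionName` are hypothetical names introduced for illustration, and `WEEK` is left out because its suffix format is not shown in this excerpt.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class PartitionNameSketch {
    // Hypothetical helper (not part of Doris): returns prefix + time suffix.
    static String partitionName(String prefix, String timeUnit, LocalDateTime time) {
        String pattern;
        switch (timeUnit.toUpperCase(Locale.ROOT)) {
            case "HOUR":
                pattern = "yyyyMMddHH";
                break;
            case "DAY":
                pattern = "yyyyMMdd";
                break;
            case "MONTH":
                pattern = "yyyyMM";
                break;
            case "YEAR":
                pattern = "yyyy"; // the suffix format added by this change
                break;
            default:
                // WEEK is intentionally not handled in this sketch.
                throw new IllegalArgumentException("Unsupported time unit " + timeUnit);
        }
        return prefix + DateTimeFormatter.ofPattern(pattern).format(time);
    }

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.of(2020, 1, 8, 1, 0);
        System.out.println(partitionName("p", "DAY", t));
        System.out.println(partitionName("p", "YEAR", t));
    }
}
```

With the prefix `p`, a `YEAR` partition name is five characters long, which is the length the unit test added in this commit checks for.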

View File

@ -76,7 +76,7 @@ under the License.
- `dynamic_partition.time_unit`
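The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, or `MONTH`, which create or delete partitions by hour, day, week, and month, respectively.
The unit for dynamic partition scheduling. Can be specified as `HOUR`, `DAY`, `WEEK`, `MONTH`, or `YEAR`, which create or delete partitions by hour, day, week, month, and year, respectively.
When specified as `HOUR`, the suffix format of the dynamically created partition name is `yyyyMMddHH`, for example, `2020032501`. When the time unit is HOUR, the data type of the partition column cannot be DATE.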
@ -86,6 +86,8 @@ under the License.
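When specified as `MONTH`, the suffix format of the dynamically created partition name is `yyyyMM`, for example, `202003`.
When specified as `YEAR`, the suffix format of the dynamically created partition name is `yyyy`, for example, `2020`.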
- `dynamic_partition.time_zone`
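The time zone of the dynamic partition. If not set, it defaults to the system time zone of the current machine, such as `Asia/Shanghai`. For the list of supported time zones, see `https://en.wikipedia.org/wiki/List_of_tz_database_time_zones`.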
@ -150,11 +152,11 @@ under the License.
- `dynamic_partition.reserved_history_periods`
The time ranges of the history partitions to be reserved. When `dynamic_partition.time_unit` is set to "DAY/WEEK/MONTH", it must be given in the form `[yyyy-MM-dd,yyyy-MM-dd],[...,...]`. When `dynamic_partition.time_unit` is set to "HOUR", it must be given in the form `[yyyy-MM-dd HH:mm:ss,yyyy-MM-dd HH:mm:ss],[...,...]`. If not set, the default is `"NULL"`.
The time ranges of the history partitions to be reserved. When `dynamic_partition.time_unit` is set to "DAY/WEEK/MONTH/YEAR", it must be given in the form `[yyyy-MM-dd,yyyy-MM-dd],[...,...]`. When `dynamic_partition.time_unit` is set to "HOUR", it must be given in the form `[yyyy-MM-dd HH:mm:ss,yyyy-MM-dd HH:mm:ss],[...,...]`. If not set, the default is `"NULL"`.
Let us give an example. Suppose today is 2021-09-06, the table is partitioned by day, and the dynamic partition properties are set to:
`time_unit="DAY/WEEK/MONTH", end=3, start=-3, reserved_history_periods="[2020-06-01,2020-06-20],[2020-10-31,2020-11-15]"`.
`time_unit="DAY/WEEK/MONTH/YEAR", end=3, start=-3, reserved_history_periods="[2020-06-01,2020-06-20],[2020-10-31,2020-11-15]"`.
The system will then automatically reserve:

View File

@ -361,7 +361,7 @@ distribution_desc
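The parameters related to dynamic partitioning are as follows:
* `dynamic_partition.enable`: Used to specify whether the table-level dynamic partition function is enabled. The default is true.
* `dynamic_partition.time_unit`: Used to specify the time unit for dynamically adding partitions. Can be DAY (day), WEEK (week), MONTH (month), or HOUR (hour).
* `dynamic_partition.time_unit`: Used to specify the time unit for dynamically adding partitions. Can be DAY (day), WEEK (week), MONTH (month), YEAR (year), or HOUR (hour).
* `dynamic_partition.start`: Used to specify how many partitions to delete forward. The value must be less than 0. The default is Integer.MIN_VALUE.
* `dynamic_partition.end`: Used to specify the number of partitions created in advance. The value must be greater than 0.
* `dynamic_partition.prefix`: Used to specify the prefix of the created partition names. For example, if the prefix is p, the partition name is automatically created as p20200108.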

View File

@ -1125,7 +1125,7 @@ public enum ErrorCode {
ERR_DYNAMIC_PARTITION_MUST_HAS_SAME_BUCKET_NUM_WITH_COLOCATE_TABLE(5063, new byte[]{'4', '2', '0', '0', '0'},
"Dynamic partition buckets must equal the distribution buckets if creating a colocate table: %s"),
ERROR_DYNAMIC_PARTITION_TIME_UNIT(5065, new byte[]{'4', '2', '0', '0', '0'},
"Unsupported time unit %s. Expect HOUR/DAY/WEEK/MONTH."),
"Unsupported time unit %s. Expect HOUR/DAY/WEEK/MONTH/YEAR."),
ERROR_DYNAMIC_PARTITION_START_ZERO(5066, new byte[]{'4', '2', '0', '0', '0'},
"Dynamic partition start must less than 0"),
ERROR_DYNAMIC_PARTITION_START_FORMAT(5066, new byte[]{'4', '2', '0', '0', '0'},

View File

@ -81,7 +81,8 @@ public class DynamicPartitionUtil {
|| !(timeUnit.equalsIgnoreCase(TimeUnit.DAY.toString())
|| timeUnit.equalsIgnoreCase(TimeUnit.HOUR.toString())
|| timeUnit.equalsIgnoreCase(TimeUnit.WEEK.toString())
|| timeUnit.equalsIgnoreCase(TimeUnit.MONTH.toString()))) {
|| timeUnit.equalsIgnoreCase(TimeUnit.MONTH.toString())
|| timeUnit.equalsIgnoreCase(TimeUnit.YEAR.toString()))) {
ErrorReport.reportDdlException(ErrorCode.ERROR_DYNAMIC_PARTITION_TIME_UNIT, timeUnit);
}
Preconditions.checkState(partitionInfo instanceof RangePartitionInfo);
@ -716,6 +717,8 @@ public class DynamicPartitionUtil {
return formattedDateStr.substring(0, 8);
} else if (timeUnit.equalsIgnoreCase(TimeUnit.MONTH.toString())) {
return formattedDateStr.substring(0, 6);
} else if (timeUnit.equalsIgnoreCase(TimeUnit.YEAR.toString())) {
return formattedDateStr.substring(0, 4);
} else if (timeUnit.equalsIgnoreCase(TimeUnit.HOUR.toString())) {
return formattedDateStr.substring(0, 10);
} else {
@ -741,7 +744,6 @@ public class DynamicPartitionUtil {
// return the partition range date string formatted as yyyy-MM-dd[ HH:mm::ss]
// add support: HOUR by caoyang10
// TODO: support YEAR
public static String getPartitionRangeString(DynamicPartitionProperty property, ZonedDateTime current,
int offset, String format) {
String timeUnit = property.getTimeUnit();
@ -751,8 +753,10 @@ public class DynamicPartitionUtil {
return getPartitionRangeOfWeek(current, offset, property.getStartOfWeek(), format);
} else if (timeUnit.equalsIgnoreCase(TimeUnit.HOUR.toString())) {
return getPartitionRangeOfHour(current, offset, format);
} else { // MONTH
} else if (timeUnit.equalsIgnoreCase(TimeUnit.MONTH.toString())) {
return getPartitionRangeOfMonth(current, offset, property.getStartOfMonth(), format);
} else { // YEAR
return getPartitionRangeOfYear(current, offset, format);
}
}
@ -856,6 +860,11 @@ public class DynamicPartitionUtil {
return getFormattedTimeWithoutHourMinuteSecond(resultTime, format);
}
private static String getPartitionRangeOfYear(ZonedDateTime current, int offset, String format) {
ZonedDateTime resultTime = current.plusYears(offset).withMonth(1).withDayOfMonth(1);
return getFormattedTimeWithoutHourMinuteSecond(resultTime, format);
}
private static String getFormattedTimeWithoutHourMinuteSecond(ZonedDateTime zonedDateTime, String format) {
ZonedDateTime timeWithoutHourMinuteSecond = zonedDateTime.withHour(0).withMinute(0).withSecond(0);
return DateTimeFormatter.ofPattern(format).format(timeWithoutHourMinuteSecond);
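
To see what the new `YEAR` branch yields, the sketch below reproduces the arithmetic of `getPartitionRangeOfYear` in a standalone class using only `java.time`. The class `YearRangeSketch` and the method `yearBorder` are hypothetical names. Pairing the border at `offset` with the border at `offset + 1` to form a half-open range is an assumption about how the caller uses these strings and is not shown in this diff.

```java
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class YearRangeSketch {
    // Same steps as the diff: shift by `offset` years, snap to January 1st,
    // then zero the hour, minute, and second before formatting.
    static String yearBorder(ZonedDateTime current, int offset, String format) {
        ZonedDateTime start = current.plusYears(offset)
                .withMonth(1).withDayOfMonth(1)
                .withHour(0).withMinute(0).withSecond(0);
        return DateTimeFormatter.ofPattern(format).format(start);
    }

    public static void main(String[] args) {
        // Assumed "now" for illustration, reusing the example date from the docs.
        ZonedDateTime now = ZonedDateTime.of(2021, 9, 6, 15, 30, 0, 0, ZoneId.of("Asia/Shanghai"));
        // start = -3 and end = 3 give seven yearly ranges.
        for (int offset = -3; offset <= 3; offset++) {
            String lower = yearBorder(now, offset, "yyyy-MM-dd");
            String upper = yearBorder(now, offset + 1, "yyyy-MM-dd"); // assumed upper bound
            System.out.println("p" + lower.substring(0, 4) + ": [" + lower + ", " + upper + ")");
        }
    }
}
```

Seven ranges for `start = -3` and `end = 3` line up with the partition count asserted in the accompanying unit test when `create_history_partition` is enabled.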

View File

@ -827,6 +827,35 @@ public class DynamicPartitionTableTest {
Assert.assertEquals(7, partitionName.length());
}
createOlapTblStmt = "CREATE TABLE test.`year_dynamic_partition` (\n"
+ " `k1` datetime NULL COMMENT \"\",\n"
+ " `k2` int NULL COMMENT \"\"\n"
+ ") ENGINE=OLAP\n"
+ "PARTITION BY RANGE(`k1`)\n"
+ "()\n"
+ "DISTRIBUTED BY HASH(`k2`) BUCKETS 3\n"
+ "PROPERTIES (\n"
+ "\"replication_num\" = \"1\",\n"
+ "\"dynamic_partition.enable\" = \"true\",\n"
+ "\"dynamic_partition.start\" = \"-3\",\n"
+ "\"dynamic_partition.end\" = \"3\",\n"
+ "\"dynamic_partition.create_history_partition\" = \"true\",\n"
+ "\"dynamic_partition.time_unit\" = \"year\",\n"
+ "\"dynamic_partition.prefix\" = \"p\",\n"
+ "\"dynamic_partition.buckets\" = \"1\"\n"
+ ");";
createTable(createOlapTblStmt);
emptyDynamicTable = (OlapTable) Env.getCurrentInternalCatalog()
.getDbOrAnalysisException("default_cluster:test")
.getTableOrAnalysisException("year_dynamic_partition");
Assert.assertEquals(7, emptyDynamicTable.getAllPartitions().size());
partitionIterator = emptyDynamicTable.getAllPartitions().iterator();
while (partitionIterator.hasNext()) {
String partitionName = partitionIterator.next().getName();
Assert.assertEquals(5, partitionName.length());
}
createOlapTblStmt = "CREATE TABLE test.`int_dynamic_partition_day` (\n"
+ " `k1` int NULL COMMENT \"\",\n"
+ " `k2` int NULL COMMENT \"\"\n"

View File

@ -39,6 +39,31 @@ suite("test_dynamic_partition") {
// XXX: buckets at pos(8), next maybe impl by sql meta
assertEquals(Integer.valueOf(result.get(0).get(8)), 10)
sql "drop table dy_par"
sql "drop table if exists dy_par"
sql """
CREATE TABLE IF NOT EXISTS dy_par ( k1 date NOT NULL, k2 varchar(20) NOT NULL, k3 int sum NOT NULL )
AGGREGATE KEY(k1,k2)
PARTITION BY RANGE(k1) ( )
DISTRIBUTED BY HASH(k1) BUCKETS 3
PROPERTIES (
"dynamic_partition.enable"="true",
"dynamic_partition.end"="3",
"dynamic_partition.buckets"="10",
"dynamic_partition.start"="-3",
"dynamic_partition.prefix"="p",
"dynamic_partition.time_unit"="YEAR",
"dynamic_partition.create_history_partition"="true",
"dynamic_partition.replication_allocation" = "tag.location.default: 1")
"""
result = sql "show tables like 'dy_par'"
logger.info("${result}")
assertEquals(result.size(), 1)
result = sql "show partitions from dy_par"
// XXX: buckets at pos(8), next maybe impl by sql meta
assertEquals(Integer.valueOf(result.get(0).get(8)), 10)
sql "drop table dy_par"
sql "drop table if exists dy_par_bucket_set_by_distribution"
sql """
CREATE TABLE IF NOT EXISTS dy_par_bucket_set_by_distribution