From 5ee5d70f51c44620972d8ee130bfe38cf2e4e417 Mon Sep 17 00:00:00 2001
From: Gabriel
Date: Thu, 15 Dec 2022 21:27:51 +0800
Subject: [PATCH] [DOCS](Decimalv3) Add document for Decimalv3 (#15108)

* [DOCS](Decimalv3) Add document for Decimalv3
* update
---
 .../sql-reference/Data-Types/DECIMALV3.md | 82 +++++++++++++++++++
 docs/sidebars.json | 1 +
 .../sql-reference/Data-Types/DECIMALV3.md | 79 ++++++++++++++++++
 3 files changed, 162 insertions(+)
 create mode 100644 docs/en/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md
 create mode 100644 docs/zh-CN/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md

diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md b/docs/en/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md
new file mode 100644
index 0000000000..2197507e57
--- /dev/null
+++ b/docs/en/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md
@@ -0,0 +1,82 @@
+---
+{
+    "title": "DECIMALV3",
+    "language": "en"
+}
+---
+
+
+
+## DECIMALV3
+### Description
+DECIMALV3 (M [,D])
+
+A high-precision fixed-point number. M is the total number of significant digits (precision), and D is the number of digits after the decimal point (scale).
+
+The range of M is [1, 38], and the range of D is [0, M].
+
+The default is DECIMALV3(9, 0).
+
+### Precision Deduction
+
+DECIMALV3 has a fairly complex set of type deduction rules: different expressions apply different rules to deduce the precision of the result.
+
+#### Arithmetic Expressions
+
+* Plus / Minus: DECIMALV3(a, b) + DECIMALV3(x, y) -> DECIMALV3(max(a - b, x - y) + max(b, y), max(b, y)). That is, the integer part and the fractional part each take the larger of the two operands' values.
+* Multiply: DECIMALV3(a, b) * DECIMALV3(x, y) -> DECIMALV3(max(a, x), max(b, y)).
+* Divide: DECIMALV3(a, b) / DECIMALV3(x, y) -> DECIMALV3(a + y, b).
+
+#### Aggregation functions
+
+* SUM / MULTI_DISTINCT_SUM: SUM(DECIMALV3(a, b)) -> DECIMALV3(38, b).
+* AVG: AVG(DECIMALV3(a, b)) -> DECIMALV3(38, b).
+
+#### Default rules
+
+All expressions other than those listed above use the default rule for precision deduction: for an expression `expr(DECIMALV3(a, b))`, the result type is also DECIMALV3(a, b).
+
+#### Adjust the result precision
+
+Different users have different precision requirements for DECIMALV3. The rules above are the default behavior of Doris. If users **have different precision requirements, they can adjust the precision in the following ways**:
+
+* If the expected result precision is greater than the default precision, adjust the precision of the argument. For example, to compute `AVG(col)` and get DECIMALV3(x, y) as the result, where the type of `col` is DECIMALV3(a, b), rewrite the expression as `AVG(CAST(col as DECIMALV3(x, y)))`.
+* If the expected result precision is less than the default precision, round the output to the desired precision. For example, to compute `AVG(col)` and get DECIMALV3(x, y) as the result, where the type of `col` is DECIMALV3(a, b), rewrite the expression as `ROUND(AVG(col), y)`.
+
+### Why DECIMALV3 is required
+
+DECIMALV3 in Doris is a true high-precision fixed-point number. Compared with the old version of DECIMAL, DECIMALV3 has the following core advantages:
+1. It can represent a wider range. The value ranges of both precision and scale in DECIMALV3 have been significantly expanded.
+2. Higher performance. The old version of DECIMAL requires 16 bytes in memory and 12 bytes in storage, while DECIMALV3 adapts its size to the declared precision, as shown below.
+``` ++----------------------+------------------------------+ +| precision | Space occupied (memory/disk) | ++----------------------+------------------------------+ +| 0 < precision <= 8 | 4 bytes | ++----------------------+------------------------------+ +| 8 < precision <= 18 | 8 bytes | ++----------------------+------------------------------+ +| 18 < precision <= 38 | 16 bytes | ++----------------------+------------------------------+ +``` +3. More complete precision deduction. For different expressions, different precision inference rules are applied to deduce the precision of the results. + +### keywords +DECIMALV3 diff --git a/docs/sidebars.json b/docs/sidebars.json index ee3d0d8b0c..16e734c3f6 100644 --- a/docs/sidebars.json +++ b/docs/sidebars.json @@ -945,6 +945,7 @@ "sql-manual/sql-reference/Data-Types/SMALLINT", "sql-manual/sql-reference/Data-Types/TINYINT", "sql-manual/sql-reference/Data-Types/DECIMAL", + "sql-manual/sql-reference/Data-Types/DECIMALV3", "sql-manual/sql-reference/Data-Types/BIGINT", "sql-manual/sql-reference/Data-Types/BOOLEAN", "sql-manual/sql-reference/Data-Types/FLOAT", diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md new file mode 100644 index 0000000000..e21b3ff357 --- /dev/null +++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Types/DECIMALV3.md @@ -0,0 +1,79 @@ +--- +{ + "title": "DECIMALV3", + "language": "zh-CN" +} +--- + + + +## DECIMALV3 +### description + DECIMALV3(M[,D]) + 高精度定点数,M 代表一共有多少个有效数字(precision),D 代表小数位有多少数字(scale), + 有效数字 M 的范围是 [1, 38],小数位数字数量 D 的范围是 [0, precision]。 + + 默认值为 DECIMALV3(9, 0)。 + +### 精度推演 + +DECIMALV3有一套很复杂的类型推演规则,针对不同的表达式,会应用不同规则进行精度推断。 + +#### 四则运算 + +* 加法 / 减法:DECIMALV3(a, b) + DECIMALV3(x, y) -> DECIMALV3(max(a - b, x - y) + max(b, y), max(b, y)),即整数部分和小数部分都分别使用两个操作数中较大的值。 +* 乘法:DECIMALV3(a, b) + DECIMALV3(x, y) -> DECIMALV3(max(a, x), max(b, y))。 +* 除法:DECIMALV3(a, b) + DECIMALV3(x, y) -> 
DECIMALV3(a + y, b)。 + +#### 聚合运算 + +* SUM / MULTI_DISTINCT_SUM:SUM(DECIMALV3(a, b)) -> DECIMALV3(38, b)。 +* AVG:AVG(DECIMALV3(a, b)) -> DECIMALV3(38, b)。 + +#### 默认规则 + +除上述提到的函数外,其余表达式都使用默认规则进行精度推演。即对于表达式 `expr(DECIMALV3(a, b))`,结果类型同样也是DECIMALV3(a, b)。 + +#### 调整结果精度 + +不同用户对DECIMALV3的精度要求各不相同,上述规则为当前Doris的默认行为,如果用户**有不同的精度需求,可以通过以下方式进行精度调整**: +1. 如果期望的结果精度大于默认精度,可以通过调整入参精度来调整结果精度。例如用户期望计算`AVG(col)`得到DECIMALV3(x, y)作为结果,其中`col`的类型为DECIMALV3(a, b),则可以改写表达式为`AVG(CAST(col as DECIMALV3(x, y)))`。 +2. 如果期望的结果精度小于默认精度,可以通过对输出结果求近似得到想要的精度。例如用户期望计算`AVG(col)`得到DECIMALV3(x, y)作为结果,其中`col`的类型为DECIMALV3(a, b),则可以改写表达式为`ROUND(AVG(col), y)`。 + +### 为什么需要DECIMALV3 + +Doris中的DECIMALV3是真正意义上的高精度定点数,相比于老版本的Decimal,DecimalV3有以下核心优势: +1. 可表示范围更大。DECIMALV3中precision和scale的取值范围都进行了明显扩充。 +2. 性能更高。老版本的DECIMAL在内存中需要占用16 bytes,在存储中占用12 bytes,而DECIMALV3进行了自适应调整(如下表格)。 +``` ++----------------------+-------------------+ +| precision | 占用空间(内存/磁盘)| ++----------------------+-------------------+ +| 0 < precision <= 8 | 4 bytes | ++----------------------+-------------------+ +| 8 < precision <= 18 | 8 bytes | ++----------------------+-------------------+ +| 18 < precision <= 38 | 16 bytes | ++----------------------+-------------------+ +``` +3. 更完备的精度推演。对于不同的表达式,应用不同的精度推演规则对结果的精度进行推演。 + +### keywords + DECIMALV3
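
The precision deduction rules documented in this patch can be sanity-checked with a small sketch. The Python below is illustrative only: `DecimalV3`, `add_sub`, `multiply`, `divide`, and `sum_or_avg` are hypothetical helper names, not Doris APIs, and each rule is transcribed from the document rather than verified against the Doris implementation. The document does not say how a deduced precision above 38 is handled, so the sketch does not clamp it.

```python
from typing import NamedTuple


class DecimalV3(NamedTuple):
    """Hypothetical descriptor for a DECIMALV3(precision, scale) type."""
    precision: int
    scale: int


def add_sub(lhs: DecimalV3, rhs: DecimalV3) -> DecimalV3:
    # Plus / Minus: integer digits and scale each take the larger operand value.
    scale = max(lhs.scale, rhs.scale)
    integer_digits = max(lhs.precision - lhs.scale, rhs.precision - rhs.scale)
    return DecimalV3(integer_digits + scale, scale)


def multiply(lhs: DecimalV3, rhs: DecimalV3) -> DecimalV3:
    # Multiply, as documented: DECIMALV3(max(a, x), max(b, y)).
    return DecimalV3(max(lhs.precision, rhs.precision), max(lhs.scale, rhs.scale))


def divide(lhs: DecimalV3, rhs: DecimalV3) -> DecimalV3:
    # Divide, as documented: DECIMALV3(a + y, b).
    return DecimalV3(lhs.precision + rhs.scale, lhs.scale)


def sum_or_avg(arg: DecimalV3) -> DecimalV3:
    # SUM / MULTI_DISTINCT_SUM / AVG: precision widens to 38, scale is kept.
    return DecimalV3(38, arg.scale)


if __name__ == "__main__":
    a = DecimalV3(10, 2)
    b = DecimalV3(8, 4)
    print("a + b   ->", add_sub(a, b))
    print("a * b   ->", multiply(a, b))
    print("a / b   ->", divide(a, b))
    print("SUM(a)  ->", sum_or_avg(a))
```

For DECIMALV3(10, 2) and DECIMALV3(8, 4), the plus rule keeps 8 integer digits and 4 fractional digits, so the deduced type is DECIMALV3(12, 4).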