Fix some typos for docs. (#9680)
@@ -35,14 +35,14 @@ brpc_port = 8060
# data root path, separate by ';'
# you can specify the storage medium of each root path, HDD or SSD
-# you can add capacity limit at the end of each root path, seperate by ','
+# you can add capacity limit at the end of each root path, separate by ','
# eg:
# storage_root_path = /home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris
# /home/disk1/doris.HDD, capacity limit is 50GB, HDD;
# /home/disk2/doris.SSD, capacity limit is 1GB, SSD;
# /home/disk2/doris, capacity limit is disk capacity, HDD(default)
#
-# you also can specify the properties by setting '<property>:<value>', seperate by ','
+# you also can specify the properties by setting '<property>:<value>', separate by ','
# property 'medium' has a higher priority than the extension of path
#
# Default value is ${DORIS_HOME}/storage, you should create it by hand.
dist/NOTICE-dist.txt
@@ -6206,7 +6206,7 @@ by an original code donated by Sébastien Brisard.
===============================================================================
-The complete text of licenses and disclaimers associated with the the original
+The complete text of licenses and disclaimers associated with the original
sources enumerated above at the time of code translation are in the LICENSE.txt
file.
dist/licenses/LICENSE.aalto-xml.txt
@@ -7,7 +7,7 @@ You may obtain a copy of the License at:
https://www.apache.org/licenses/
-A copy is also included with both the the downloadable source code package
+A copy is also included with both the downloadable source code package
and jar that contains class bytecodes, as file "ASL 2.0". In both cases,
that file should be located next to this file: in source distribution
the location should be "release-notes/asl"; and in jar "META-INF/"
@@ -1145,7 +1145,7 @@ Cache for storage page size
* Type: string
-* Description: data root path, separate by ';'.you can specify the storage medium of each root path, HDD or SSD. you can add capacity limit at the end of each root path, seperate by ','
+* Description: data root path, separate by ';'.you can specify the storage medium of each root path, HDD or SSD. you can add capacity limit at the end of each root path, separate by ','
eg.1: `storage_root_path=/home/disk1/doris.HDD,50;/home/disk2/doris.SSD,1;/home/disk2/doris`
@@ -163,7 +163,7 @@ The rules of dynamic partition are prefixed with `dynamic_partition.`:
```time_unit="DAY/WEEK/MONTH", end=3, start=-3, reserved_history_periods="[2020-06-01,2020-06-20],[2020-10-31,2020-11-15]"```.
-The the system will automatically reserve following partitions in following period :
+The system will automatically reserve following partitions in following period :
```
["2020-06-01","2020-06-20"],
@@ -173,7 +173,7 @@ The rules of dynamic partition are prefixed with `dynamic_partition.`:
```time_unit="HOUR", end=3, start=-3, reserved_history_periods="[2020-06-01 00:00:00,2020-06-01 03:00:00]"```.
-The the system will automatically reserve following partitions in following period :
+The system will automatically reserve following partitions in following period :
```
["2020-06-01 00:00:00","2020-06-01 03:00:00"]
@@ -123,7 +123,7 @@ Let's assume that this is a table that records the user's behavior in accessing
| 10 | User's visit, time to stay on the page|
| 10 | User's current visit, time spent on the page (redundancy)|
-Then when this batch of data is imported into Doris correctly, the final storage in Doris is is as follows:
+Then when this batch of data is imported into Doris correctly, the final storage in Doris is as follows:
|user\_id|date|city|age|sex|last\_visit\_date|cost|max\_dwell\_time|min\_dwell\_time|
|---|---|---|---|---|---|---|---|---|
@@ -178,7 +178,7 @@ The imported data are as follows:
| 10004 | 2017-10-01 | 2017-10-01 12:12:48 | Shenzhen | 35 | 0 | 2017-10-01 10:00:15 | 100 | 3 | 3|
| 10004 | 2017-10-03 | 2017-10-03 12:38:20 | Shenzhen | 35 | 0 | 2017-10-03 10:20:22 | 11 | 6 | 6|
-Then when this batch of data is imported into Doris correctly, the final storage in Doris is is as follows:
+Then when this batch of data is imported into Doris correctly, the final storage in Doris is as follows:
|user_id|date|timestamp|city|age|sex|last\_visit\_date|cost|max\_dwell\_time|min\_dwell\_time|
|---|---|---|---|---|---|---|---|---|---|
@@ -212,7 +212,7 @@ We imported a new batch of data:
| 10004 | 2017-10-03 | Shenzhen | 35 | 0 | 2017-10-03 11:22:00 | 44 | 19 | 19|
| 10005 | 2017-10-03 | Changsha | 29 | 1 | 2017-10-03 18:11:02 | 3 | 1 | 1|
-Then when this batch of data is imported into Doris correctly, the final storage in Doris is is as follows:
+Then when this batch of data is imported into Doris correctly, the final storage in Doris is as follows:
|user_id|date|city|age|sex|last\_visit\_date|cost|max\_dwell\_time|min\_dwell\_time|
|---|---|---|---|---|---|---|---|---|
@@ -116,7 +116,7 @@ brpc_port = 8060
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/
# Default value is empty.
-priority_networks = 192.168.59.0/24 # data root path, seperate by ';'
+priority_networks = 192.168.59.0/24 # data root path, separate by ';'
storage_root_path = /soft/be/storage
# sys_log_dir = ${PALO_HOME}/log
# sys_log_roll_mode = SIZE-MB-
@@ -85,7 +85,7 @@ ProxySQL 有配置文件 `/etc/proxysql.cnf` 和配置数据库文件`/var/lib/p
#### 查看及修改配置文件
-这里主要是是几个参数,在下面已经注释出来了,可以根据自己的需要进行修改
+这里主要是几个参数,在下面已经注释出来了,可以根据自己的需要进行修改
```
# egrep -v "^#|^$" /etc/proxysql.cnf
@@ -382,7 +382,7 @@ FROM BINLOG
);
```
-创建数据同步作业的的详细语法可以连接到 Doris 后,[CREATE SYNC JOB](../../../sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-SYNC-JOB.md) 查看语法帮助。这里主要详细介绍,创建作业时的注意事项。
+创建数据同步作业的详细语法可以连接到 Doris 后,[CREATE SYNC JOB](../../../sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-SYNC-JOB.md) 查看语法帮助。这里主要详细介绍,创建作业时的注意事项。
语法:
```
@@ -80,7 +80,7 @@ under the License.
### 创建任务
-创建例行导入任务的的详细语法可以连接到 Doris 后,查看[CREATE ROUTINE LOAD](../../../sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md)命令手册,或者执行 `HELP ROUTINE LOAD;` 查看语法帮助。
+创建例行导入任务的详细语法可以连接到 Doris 后,查看[CREATE ROUTINE LOAD](../../../sql-manual/sql-reference/Data-Manipulation-Statements/Load/CREATE-ROUTINE-LOAD.md)命令手册,或者执行 `HELP ROUTINE LOAD;` 查看语法帮助。
下面我们以几个例子说明如何创建Routine Load任务:
@@ -119,7 +119,7 @@ brpc_port = 8060
# If no ip match this rule, will choose one randomly.
# use CIDR format, e.g. 10.10.10.0/
# Default value is empty.
-priority_networks = 192.168.59.0/24 # data root path, seperate by ';'
+priority_networks = 192.168.59.0/24 # data root path, separate by ';'
storage_root_path = /soft/be/storage
# sys_log_dir = ${PALO_HOME}/log
# sys_log_roll_mode = SIZE-MB-
@@ -159,7 +159,7 @@ Total: 668.6 MB
* 第一列:函数直接申请的内存大小,单位MB
* 第四列:函数以及函数所有调用的函数总共内存大小。
-* 第二列、第五列分别是第一列与第四列的的比例值。
+* 第二列、第五列分别是第一列与第四列的比例值。
* 第三列是个第二列的累积值。
当然也可以生成调用关系图片,更加方便分析。比如下面的命令就能够生成SVG格式的调用关系图。
@@ -305,7 +305,7 @@ pprof --svg --seconds=60 http://be_host:be_webport/pprof/profile > be.svg
#### perf + flamegragh
-这个是相当通用的一种CPU分析方式,相比于`pprof`,这种方式必须要求能够登陆到分析对象的物理机上。但是相比于pprof只能定时采点,perf是能够通过不同的事件来完成堆栈信息采集的。具体的的使用方式如下:
+这个是相当通用的一种CPU分析方式,相比于`pprof`,这种方式必须要求能够登陆到分析对象的物理机上。但是相比于pprof只能定时采点,perf是能够通过不同的事件来完成堆栈信息采集的。具体的使用方式如下:
```
perf record -g -p be_pid -- sleep 60
@@ -427,7 +427,7 @@ outputFormat.close();
| doris.request.read.timeout.ms | 30000 | 向 Doris 发送请求的读取超时时间 |
| doris.request.query.timeout.s | 3600 | 查询 Doris 的超时时间,默认值为1小时,-1表示无超时限制 |
| doris.request.tablet.size | Integer. MAX_VALUE | 一个 Partition 对应的 Doris Tablet 个数。<br />此数值设置越小,则会生成越多的 Partition。从而提升 Flink 侧的并行度,但同时会对 Doris 造成更大的压力。 |
-| doris.batch.size | 1024 | 一次从 BE 读取数据的最大行数。增大此数值可减少 Flink 与 Doris 之间建立连接的次数。<br />从而减轻网络延迟所带来的的额外时间开销。 |
+| doris.batch.size | 1024 | 一次从 BE 读取数据的最大行数。增大此数值可减少 Flink 与 Doris 之间建立连接的次数。<br />从而减轻网络延迟所带来的额外时间开销。 |
| doris.exec.mem.limit | 2147483648 | 单个查询的内存限制。默认为 2GB,单位为字节 |
| doris.deserialize.arrow.async | false | 是否支持异步转换 Arrow 格式到 flink-doris-connector 迭代所需的 RowBatch |
| doris.deserialize.queue.size | 64 | 异步转换 Arrow 格式的内部处理队列,当 doris.deserialize.arrow.async 为 true 时生效 |
@@ -273,7 +273,7 @@ kafkaSource.selectExpr("CAST(key AS STRING)", "CAST(value as STRING)")
| doris.request.read.timeout.ms | 30000 | 向Doris发送请求的读取超时时间 |
| doris.request.query.timeout.s | 3600 | 查询doris的超时时间,默认值为1小时,-1表示无超时限制 |
| doris.request.tablet.size | Integer.MAX_VALUE | 一个RDD Partition对应的Doris Tablet个数。<br />此数值设置越小,则会生成越多的Partition。从而提升Spark侧的并行度,但同时会对Doris造成更大的压力。 |
-| doris.batch.size | 1024 | 一次从BE读取数据的最大行数。增大此数值可减少Spark与Doris之间建立连接的次数。<br />从而减轻网络延迟所带来的的额外时间开销。 |
+| doris.batch.size | 1024 | 一次从BE读取数据的最大行数。增大此数值可减少Spark与Doris之间建立连接的次数。<br />从而减轻网络延迟所带来的额外时间开销。 |
| doris.exec.mem.limit | 2147483648 | 单个查询的内存限制。默认为 2GB,单位为字节 |
| doris.deserialize.arrow.async | false | 是否支持异步转换Arrow格式到spark-doris-connector迭代所需的RowBatch |
| doris.deserialize.queue.size | 64 | 异步转换Arrow格式的内部处理队列,当doris.deserialize.arrow.async为true时生效 |
@@ -34,7 +34,7 @@ Remote UDF Service 支持通过 RPC 的方式访问用户提供的 UDF Service
2. 使用限制
* 性能:相比于 Native UDF,UDF Service 会带来额外的网络开销,因此性能会远低于 Native UDF。同时,UDF Service 自身的实现也会影响函数的执行效率,用户需要自行处理高并发、线程安全等问题。
-* 单行模式和批处理模式:Doris 原先的的基于行存的查询执行框架会对每一行数据执行一次 UDF RPC 调用,因此执行效率非常差,而在新的向量化执行框架下,会对每一批数据(默认2048行)执行一次 UDF RPC 调用,因此性能有明显提升。实际测试中,基于向量化和批处理方式的 Remote UDF 性能和基于行存的 Native UDF 性能相当,可供参考。
+* 单行模式和批处理模式:Doris 原先的基于行存的查询执行框架会对每一行数据执行一次 UDF RPC 调用,因此执行效率非常差,而在新的向量化执行框架下,会对每一批数据(默认2048行)执行一次 UDF RPC 调用,因此性能有明显提升。实际测试中,基于向量化和批处理方式的 Remote UDF 性能和基于行存的 Native UDF 性能相当,可供参考。
## 编写 UDF 函数
@@ -151,7 +151,7 @@ Observer 角色和这个单词的含义一样,仅仅作为观察者来同步
3. 查看be.INFO中是否有F开头的日志。
-F开头的的日志是 Fatal 日志。如 F0916 ,表示9月16号的Fatal日志。Fatal日志通常表示程序断言错误,断言错误会直接导致进程退出(说明程序出现了Bug)。欢迎前往微信群、github discussion 或dev邮件组寻求帮助。
+F开头的日志是 Fatal 日志。如 F0916 ,表示9月16号的Fatal日志。Fatal日志通常表示程序断言错误,断言错误会直接导致进程退出(说明程序出现了Bug)。欢迎前往微信群、github discussion 或dev邮件组寻求帮助。
4. Minidump
@@ -282,7 +282,7 @@ cp fe-core/target/generated-sources/cup/org/apache/doris/analysis/action_table.d
```
### Q14. Doris 升级到1.0 以后版本通过ODBC访问MySQL外表报错 `Failed to set ciphers to use (2026)`
-这个问题出现在doris 升级到1.0 版本以后,且使用 Connector/ODBC 8.0.x 以上版本,Connector/ODBC 8.0.x 有多种获取方式,比如通过yum安装的的方式获取的 `/usr/lib64/libmyodbc8w.so` 依赖的是 `libssl.so.10` 和 `libcrypto.so.10`
+这个问题出现在doris 升级到1.0 版本以后,且使用 Connector/ODBC 8.0.x 以上版本,Connector/ODBC 8.0.x 有多种获取方式,比如通过yum安装的方式获取的 `/usr/lib64/libmyodbc8w.so` 依赖的是 `libssl.so.10` 和 `libcrypto.so.10`
而doris 1.0 以后版本中openssl 已经升级到1.1 且内置在doris 二进制包中,因此会导致 openssl 的冲突进而出现 类似 如下的错误
```
ERROR 1105 (HY000): errCode = 2, detailMessage = driver connect Error: HY000 [MySQL][ODBC 8.0(w) Driver]SSL connection error: Failed to set ciphers to use (2026)
@@ -390,7 +390,7 @@ WITH BROKER broker_name
)
```
-`my_table` 必须是是 Unqiue Key 模型表,并且指定了 Sequcence Col。数据会按照源数据中 `source_sequence` 列的值来保证顺序性。
+`my_table` 必须是 Unqiue Key 模型表,并且指定了 Sequcence Col。数据会按照源数据中 `source_sequence` 列的值来保证顺序性。
### Keywords
@@ -166,7 +166,7 @@ DorisWriter 通过Doris原生支持Stream load方式导入数据, DorisWriter
* **lineDelimiter**
-- 描述:每批次数据包含多行,每行为 Json 格式,每行的的分隔符即为 lineDelimiter。支持多个字节, 例如'\x02\x03'。
+- 描述:每批次数据包含多行,每行为 Json 格式,每行的分隔符即为 lineDelimiter。支持多个字节, 例如'\x02\x03'。
- 必选:否
- 默认值:`\n`