I started a discussion on this earlier; see the mailing list thread: https://lists.apache.org/thread/o770bc3k623kyfks2mzkt21qsc4g6328

To make it easier for everyone to organize the documents, I created a `new-docs` directory under the `incubator-doris` directory. The new directory structure is shown below. So far I have only created the directory structure; the content still needs to be rearranged.

For the data import scenario, to accommodate the reading habits of existing users, the import documents are organized in two ways:

1. By usage scenario: this gives users clearer guidance. For example, a user whose data source is Kafka can go directly to Routine Load.
2. By import method: this is the introduction to the various import methods we provided before.

To make it easier for everyone to run and debug locally, I migrated the entire `.vuepress` directory from the original docs. Once the work is complete, you only need to delete the original `docs` directory and rename `new-docs` to `docs`. You can also run the site locally while organizing documents, so that you can see the content of each document directory.

To debug locally, switch to the `new-docs` directory and run the following commands:

````
npm install
npm run dev
````

Then open either of these in a browser:

http://ip:port/zh-CN
http://ip:port/en
| title | language |
|---|---|
| Doris base concept | en |
Doris base concept
- FE: Frontend, the front-end node of Doris. It is mainly responsible for receiving client requests and returning results, managing metadata and the cluster, and generating query plans.
- BE: Backend, the back-end node of Doris. It is mainly responsible for data storage and management and for query plan execution.
- Broker: A stateless process that helps Doris access external data sources, such as data on HDFS, through a Unix-like file system interface. It is used, for example, in data import and data export operations.
- Tablet: The actual physical storage unit of a table. After partitioning and bucketing, a table is stored as Tablets in the distributed storage layer formed by the BEs. Each Tablet consists of meta information and several consecutive Rowsets.
- Rowset: The set of data produced by one data change in a Tablet, where a data change is an import, deletion, or update. Rowsets are recorded by version; each change generates a new version.
- Version: Consists of two attributes, Start and End, and records data changes. It is usually used to indicate the version range of a Rowset: a new import generates a Rowset whose Start and End are equal, and Compaction generates a Rowset whose version covers a range.
- Segment: A data segment within a Rowset. Multiple Segments form a Rowset.
- Compaction: The process of merging Rowsets with consecutive versions. The data is compressed during the merge.
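
The relationship between Version, Rowset, and Compaction described above can be illustrated with a small Python sketch. This is only a toy model, not Doris code: the `Version`, `Rowset`, and `compact` names, the segment file names, and the consecutiveness check are all assumptions made for illustration, and real compaction rewrites and compresses data rather than concatenating segment names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Version:
    """Hypothetical model of a version range [start, end]."""
    start: int
    end: int


@dataclass
class Rowset:
    """Hypothetical model of a Rowset: a version range plus its Segments."""
    version: Version
    segments: List[str]  # placeholder segment names, not real file names


def compact(rowsets: List[Rowset]) -> Rowset:
    """Merge Rowsets with consecutive versions into one Rowset.

    The merged Rowset covers the range from the smallest Start to the
    largest End. Segment names are simply concatenated here; an actual
    storage engine would rewrite the underlying data.
    """
    if not rowsets:
        raise ValueError("nothing to compact")
    ordered = sorted(rowsets, key=lambda r: r.version.start)
    for prev, cur in zip(ordered, ordered[1:]):
        # Consecutive means the next range starts right after the previous one ends.
        if cur.version.start != prev.version.end + 1:
            raise ValueError("versions are not consecutive")
    merged_segments = [s for r in ordered for s in r.segments]
    merged_version = Version(ordered[0].version.start, ordered[-1].version.end)
    return Rowset(merged_version, merged_segments)


# Three imports, each producing a Rowset whose Start equals its End.
imports = [Rowset(Version(v, v), [f"seg_{v}_0"]) for v in (2, 3, 4)]
merged = compact(imports)
print(merged.version)  # Version(start=2, end=4)
```

In this sketch, each import yields a single-version Rowset such as `[2-2]`, and compacting `[2-2]`, `[3-3]`, `[4-4]` yields one Rowset with the range `[2-4]`, matching the description of Version above.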