[typo](docs) Capitalize and Rename Table Design Files (#22453)

KassieZ
2023-08-02 21:51:58 +08:00
committed by GitHub
parent 76108bac2f
commit d5bf00583f
11 changed files with 29 additions and 29 deletions

View File

@ -196,7 +196,7 @@ mysql> SHOW VARIABLES LIKE "%mem_limit%";
>* The above modification is session level and is only valid within the current connection session. Disconnecting and reconnecting will change back to the default value.
>* If you need to modify the global variable, you can set it as follows: `SET GLOBAL exec_mem_limit = 8589934592;` When the setup is complete, disconnect the session and log in again, and the parameters will take effect permanently.
### Query timeout
### Query Timeout
The default query timeout is 300 seconds. If a query does not complete within 300 seconds, Doris cancels it. Users can use this parameter to customize the timeout of their applications and achieve a blocking mode similar to wait (timeout).

View File

@ -92,7 +92,7 @@ Follow-up logins can be performed with the following connection commands.
## Create Data Table and Import Data
### Create a database
### Create a Database
Initially, root or admin users can create a database by the following command:
@ -290,7 +290,7 @@ MySQL> DESC table2;
> 5. You can add Rollups to Tables to improve query performance. See the Rollup-related section in "Advanced Usage".
> 6. The value of the column is nullable by default, which may affect query performance.
### Load data
### Load Data
Doris supports a variety of data loading methods. You can refer to [Data Loading](../data-operate/import/load-manual.md) for more details. The following uses Stream Load and Broker Load as examples.

View File

@ -28,7 +28,7 @@ under the License.
This topic introduces the data models in Doris from a logical perspective so you can make better use of Doris in different business scenarios.
## Basic concepts
## Basic Concepts
In Doris, data is logically described in the form of tables. A table consists of rows and columns. A row is one record of user data. A column describes one field in a row of data.
@ -46,7 +46,7 @@ The following is the detailed introduction to each of them.
We illustrate what the Aggregate Model is and how to use it correctly with practical examples.
### Example 1: Importing data aggregation
### Example 1: Importing Data Aggregation
Assume that the business has the following data table schema:
@ -160,7 +160,7 @@ As you can see, the data of User 10000 have been aggregated to one row, while th
After aggregation, Doris only stores the aggregated data. In other words, the detailed raw data will no longer be available.
### Example 2: keep detailed data
### Example 2: Keep Detailed Data
Here is a modified version of the table schema in Example 1:
@ -218,7 +218,7 @@ After importing, this batch of data will be stored in Doris as follows:
As you can see, the stored data are exactly the same as the imported data. No aggregation has happened. This is because the newly added `timestamp` column results in **different Keys** among the rows. That is to say, as long as the Keys of the rows are not identical in the import data, Doris can save the complete detailed data even in the Aggregate Model.
### Example 3: aggregate import data and existing data
### Example 3: Aggregate Import Data and Existing Data
Based on Example 1, suppose that you have the following data stored in Doris:
@ -437,7 +437,7 @@ Different from the Aggregate and Unique Models, the Duplicate Model stores the d
The Duplicate Model is suitable for storing raw data without aggregation requirements or primary key uniqueness constraints. For more usage scenarios, see the [Limitations of Aggregate Model](#Limitations of Aggregate Model) section.
### Duplicate Model without SORTING COLUMN (Since Doris 2.0)
### Duplicate Model Without SORTING COLUMN (Since Doris 2.0)
When creating a table without specifying Unique, Aggregate, or Duplicate, a table with a Duplicate model will be created by default, and the SORTING COLUMN will be automatically specified.
@ -580,7 +580,7 @@ The above adds a count column, the value of which will always be **1**, so the r
Another method is to add a `count` column of value 1 but aggregation type of REPLACE. Then `select sum (count) from table;` and `select count (*) from table;` produce the same results. Moreover, this method does not require that the import data contain no rows with the same AGGREGATE KEY columns.
### Merge on Write of Unique model
### Merge on Write of Unique Model
The Merge on Write implementation in the Unique Model does not impose the same limitation as the Aggregate Model. In Merge on Write, the model adds a `delete bitmap` for each imported rowset to mark the data being overwritten or deleted. With the previous example, after Batch 1 is imported, the data status will be as follows:

View File

@ -24,7 +24,7 @@ specific language governing permissions and limitations
under the License.
-->
# Data Partitioning
# Data Partition
This topic is about table creation and data partitioning in Doris, including the common problems in table creation and their solutions.

View File

@ -1,6 +1,6 @@
---
{
"title": "Rollup and query",
"title": "Rollup and Query",
"language": "en"
}
---

View File

@ -1,6 +1,6 @@
---
{
"title": "BloomFilter index",
"title": "BloomFilter Index",
"language": "en"
}
---
@ -24,7 +24,7 @@ specific language governing permissions and limitations
under the License.
-->
# BloomFilter index
# BloomFilter Index
BloomFilter is a fast lookup algorithm based on multi-hash function mapping, proposed by Bloom in 1970. It is typically used where it is necessary to quickly determine whether an element belongs to a set but 100% accuracy is not strictly required. BloomFilter has the following characteristics:
@ -40,7 +40,7 @@ Figure below shows an example of Bloom Filter with m=18, k=3 (m is the size of t
So how do we judge whether an element is in the set? Similarly, all the offset positions of this element are obtained by hash function calculation. If these positions are all 1, the element is judged to be in the set; if any one of them is not 1, the element is judged not to be in the set. It's that simple!
## Doris BloomFilter index and usage scenarios
## Doris BloomFilter Index and Usage Scenarios
When we use HBase, we know that the HBase data block index provides an effective method to find the data block of the HFile that should be read when accessing a specific row. But its utility is limited. The default size of the HFile data block is 64KB, and this size cannot be adjusted too much.
@ -99,7 +99,7 @@ Check that the BloomFilter index we built on the table is to use:
SHOW CREATE TABLE <table_name>;
```
### Delete BloomFilter index
### Delete BloomFilter Index
Deleting the index is to remove the index column from the bloom_filter_columns attribute:
@ -107,7 +107,7 @@ Deleting the index is to remove the index column from the bloom_filter_columns a
ALTER TABLE <db.table_name> SET ("bloom_filter_columns" = "");
```
### Modify BloomFilter index
### Modify BloomFilter Index
Modifying the index is to modify the bloom_filter_columns attribute of the table:
@ -115,7 +115,7 @@ Modifying the index is to modify the bloom_filter_columns attribute of the table
ALTER TABLE <db.table_name> SET ("bloom_filter_columns" = "k1,k3");
```
### **Doris BloomFilter usage scenarios**
### **Doris BloomFilter Usage Scenarios**
You can consider establishing a Bloom Filter index for a column when the following conditions are met:
@ -123,7 +123,7 @@ You can consider establishing a Bloom Filter index for a column when the followi
2. The column is frequently used as a query filter, and most of the query conditions are `in` and `=` filtering.
3. Unlike Bitmap, BloomFilter is suitable for high-cardinality columns, such as UserID. If it is created on a low-cardinality column, such as a "gender" column, almost every Block will contain all values, causing the BloomFilter index to lose its meaning.
### **Doris BloomFilter use precautions**
### **Doris BloomFilter Use Precautions**
1. It does not support the creation of Bloom Filter indexes for Tinyint, Float, and Double columns.
2. The Bloom Filter index only has an acceleration effect on in and = filtering queries.
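The membership test described in this file (hash an element to k offsets, and report it present only if every offset is set to 1) can be sketched in a few lines of Python. This is a toy illustration, not how Doris implements its index: the class name `BloomFilter`, the method names, and the use of salted SHA-256 as the k hash functions are all assumptions made for the example. The defaults m=18 and k=3 mirror the figure mentioned above.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter with m bits and k hash functions (illustrative only)."""

    def __init__(self, m=18, k=3):
        self.m = m              # size of the bit array
        self.k = k              # number of hash functions
        self.bits = [0] * m

    def _positions(self, item):
        # Assumption: simulate k independent hash functions by salting
        # a single SHA-256 hash with the function index.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item):
        # Set every offset produced by the k hash functions to 1.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True means "possibly present" (false positives can occur);
        # False means "definitely absent".
        return all(self.bits[pos] == 1 for pos in self._positions(item))
```

A value that was added is always reported as possibly present, while a value that was never added is usually, but not always, reported as absent. This is why the index can only help skip data for `in` and `=` filters and cannot confirm a match by itself.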

View File

@ -44,7 +44,7 @@ In the Aggregate, Unique and Duplicate data models. The underlying data storage
The prefix index, which is based on sorting, is an indexing method to query data quickly based on a given prefix column.
## Example
## Examples
We use the first 36 bytes of a row of data as the prefix index of this row of data. Prefix indexes are simply truncated when a VARCHAR type is encountered. We give an example:
@ -82,6 +82,6 @@ SELECT * FROM table WHERE age=20;
Therefore, when building a table, choosing the correct column order can greatly improve query efficiency.
## Adjust prefix index by ROLLUP
## Adjust Prefix Index by ROLLUP
Because the column order has been specified when the table is created, there is only one prefix index for a table. This may not be efficient for queries whose conditions use other columns that cannot hit the prefix index. Therefore, we can manually adjust the column order by creating a ROLLUP. For details, please refer to [ROLLUP](../hit-the-rollup.md).
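To see why column order matters, the 36-byte rule can be sketched as a small Python function that walks the columns in table order and reports which ones end up in the prefix index. This is a rough model under stated assumptions, not Doris's actual implementation: the function name `prefix_index_columns`, the 20-byte cap on a VARCHAR prefix, and the rule that a fixed-width column is skipped when it does not fit entirely are all assumptions made for illustration.

```python
PREFIX_LIMIT = 36   # bytes, as stated in the text above
VARCHAR_CAP = 20    # assumption: a VARCHAR contributes at most a 20-byte prefix


def prefix_index_columns(columns):
    """Return [(name, bytes_used)] for the columns that form the prefix index.

    `columns` is a list of (name, type, width_in_bytes) in table order.
    """
    used = 0
    result = []
    for name, col_type, width in columns:
        remaining = PREFIX_LIMIT - used
        if remaining <= 0:
            break
        if col_type.upper() == "VARCHAR":
            # A VARCHAR is truncated and ends the prefix index.
            result.append((name, min(width, VARCHAR_CAP, remaining)))
            break
        if width > remaining:
            # Assumption: a fixed-width column is not split across the limit.
            break
        result.append((name, width))
        used += width
    return result


numeric_first = prefix_index_columns([
    ("user_id", "BIGINT", 8),
    ("age", "INT", 4),
    ("message", "VARCHAR", 100),
    ("max_dwell_time", "DATETIME", 8),
])
varchar_first = prefix_index_columns([
    ("user_name", "VARCHAR", 20),
    ("age", "INT", 4),
])
```

Under these assumptions, `numeric_first` covers `user_id`, `age`, and a truncated `message`, while `varchar_first` covers only `user_name`. A filter on `age` alone would miss the prefix index in both layouts, which is the case a ROLLUP with a different column order is meant to address.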

View File

@ -1,6 +1,6 @@
---
{
"title": "Inverted index",
"title": "Inverted Index",
"language": "en"
}
---
@ -168,11 +168,11 @@ SELECT * FROM table_name WHERE ts > '2023-01-01 00:00:00';
SELECT * FROM table_name WHERE op_type IN ('add', 'delete');
```
## Example
## Examples
This example demonstrates inverted index creation, full-text queries, and normal queries using a hackernews dataset with 1 million rows. A performance comparison between queries with and without the inverted index is also shown.
### Create table
### Create Table
```sql
@ -209,7 +209,7 @@ PROPERTIES ("replication_num" = "1");
```
### Load data
### Load Data
- load data by stream load
@ -252,7 +252,7 @@ mysql> SELECT count() FROM hackernews_1m;
### Query
#### Fulltext search query
#### Fulltext Search Query
- count the rows that comment contains 'OLAP' using LIKE, cost 0.18s
```sql
@ -338,7 +338,7 @@ mysql> SELECT count() FROM hackernews_1m WHERE comment MATCH_ANY 'OLAP OLTP';
```
#### normal equal, range query
#### Normal Equal, Range Query
- range query on DateTime column
```sql

View File

@ -31,7 +31,7 @@ under the License.
To improve `LIKE` query performance, the NGram BloomFilter index was implemented.
## Create Column With NGram BloomFilter Index
## Create Column with NGram BloomFilter Index
When creating a table:
@ -74,7 +74,7 @@ Add NGram BloomFilter Index for old column:
alter table example_db.table3 add index idx_ngrambf(username) using NGRAM_BF PROPERTIES("gram_size"="3", "bf_size"="512")comment 'username ngram_bf index'
```
## **Some notes about Doris NGram BloomFilter**
## **Some Notes about Doris NGram BloomFilter**
1. NGram BloomFilter only supports CHAR/VARCHAR/String columns.
2. The NGram BloomFilter index and the BloomFilter index are mutually exclusive on the same column.

View File

@ -48,7 +48,7 @@
"type": "category",
"label": "Index",
"items": [
"data-table/index/prefix-index",
"data-table/index/index-overview",
"data-table/index/inverted-index",
"data-table/index/bloomfilter",
"data-table/index/ngram-bloomfilter-index",