Fix the mistake for HLL in mini load (#1981)
[Docs] Fix mistakes for HLL column in mini load
@@ -37,11 +37,11 @@
 
 2. Import the data; see the relevant curl help for the load methods
 
     a. Generate the HLL column from columns of the table
-    curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,id:set2,name
+    curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, os, set1=hll_hash(id), set2=hll_hash(name)"
+    http://host/api/test_db/test/_stream_load
 
     b. Generate the HLL column from a column in the data
-    curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,cuid:set2,os
-    \&columns=dt,id,name,province,sex,cuid,os
+    curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, sex, cuid, os, set1=hll_hash(cuid), set2=hll_hash(os)"
+    http://host/api/test_db/test/_stream_load
 
 3. Aggregate the data; three ways are common (if you query the base table directly without aggregating, the speed may be roughly the same as using ndv directly):
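Point 3 above only names the three common aggregation approaches. As a minimal sketch of the first two, assuming the test table and HLL column set1 from the examples (the rollup and UV-table names are illustrative, not from the patch):

    -- a. Add a rollup so set1 is pre-aggregated per dt:
    alter table test add rollup test_rollup(dt, set1);

    -- b. Keep a dedicated UV table in the aggregate model; HLL_UNION
    --    merges the per-row sketches as rows are loaded into it:
    create table test_uv (
        dt     date,
        uv_set hll hll_union
    )
    aggregate key (dt)
    distributed by hash(dt) buckets 32;

    insert into test_uv select dt, set1 from test;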
@@ -95,6 +95,9 @@
 
 6. Load into a table that contains HLL columns; either columns of the table or columns in the data can be used to generate the HLL columns (the user is in default_cluster)
 
+    curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2
+    \&columns=k1,k2,k3\&hll=hll_column1,k1:hll_column2,k2
+
     curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2
     \&hll=hll_column1,tmp_k4:hll_column2,tmp_k5\&columns=k1,k2,k3,tmp_k4,tmp_k5
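The \&hll= pairs above map a source column to an HLL column (hll_column1 is built from k1, or from the temporary column tmp_k4 declared in \&columns). For this to work the target table needs HLL value columns; a hypothetical testTbl schema, mirroring only the names in the curl examples:

    -- Hypothetical schema; the column names come from the examples above.
    create table testTbl (
        k1 int,
        k2 int,
        k3 varchar(32),
        hll_column1 hll hll_union,  -- generated from k1 (or tmp_k4) via \&hll=
        hll_column2 hll hll_union   -- generated from k2 (or tmp_k5)
    )
    aggregate key (k1, k2, k3)
    distributed by hash(k1) buckets 32;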
@@ -36,12 +36,15 @@ distributed by hash(id) buckets 32;
 
 2. Import data. See help curl for the way you import it.
 
 A. Generate HLL columns using columns in tables
-curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,id:set2,name
+curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, os, set1=hll_hash(id), set2=hll_hash(name)"
+http://host/api/test_db/test/_stream_load
 
 B. Generate HLL columns using a column in the data
-curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,cuid:set2,os
-\&columns=dt,id,name,province,sex,cuid,os
+curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, sex, cuid, os, set1=hll_hash(cuid), set2=hll_hash(os)"
+http://host/api/test_db/test/_stream_load
 
 3. There are three common ways of aggregating the data (if you query the base table directly without aggregating, the speed may be similar to using NDV directly):
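Once loaded, HLL columns are not read directly; they are queried through Doris's HLL aggregate functions. A minimal sketch against the test table and set1 column used in the examples:

    -- Approximate number of distinct ids per day, merging the row sketches:
    select dt, hll_union_agg(set1) as uv from test group by dt;

    -- hll_cardinality estimates the distinct count held in a single sketch:
    select hll_cardinality(set1) from test limit 10;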
@@ -92,8 +92,11 @@ seq 1 10 | awk '{OFS="\t"}{print $1, $1 * 10}' | curl --location-trusted -u root
 
 6. Load a table containing HLL columns; columns of the table or columns in the data can be used to generate the HLL columns (the user is in default_cluster)
 
+curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2\&hll=hll_column1,k1:hll_column2,k2
+\&columns=k1,k2,k3
+
 curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2
 \&hll=hll_column1,tmp_k4:hll_column2,tmp_k5\&columns=k1,k2,k3,tmp_k4,tmp_k5
 
 7. View imports after submission
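Point 7's check maps to the usual label-based status query; a sketch assuming the label 123 from the curl examples above:

    -- Mini load is asynchronous: poll the job state by its label.
    show load where label = "123";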