Fix the mistake for HLL in mini load (#1981)

[Docs] Fix mistakes for HLL column in mini load
Author: EmmyMiao87
Date: 2019-10-14 19:46:23 +08:00
Committed by: ZHAO Chun
Parent: ccc236484b
Commit: b84ef013eb
4 changed files with 20 additions and 11 deletions


@@ -37,11 +37,11 @@
2. Import data; see the relevant help curl for how to import it.
a. Generate the HLL column from columns in the table
curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,id:set2,name
curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, os, set1=hll_hash(id), set2=hll_hash(name)"
http://host/api/test_db/test/_stream_load
b. Generate the HLL column from a column in the data
curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,cuid:set2,os
\&columns=dt,id,name,province,sex,cuid,os
curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, sex, cuid, os, set1=hll_hash(cuid), set2=hll_hash(os)"
http://host/api/test_db/test/_stream_load
3. Aggregate the data, commonly in one of three ways: (if you query the base table directly without aggregating, the speed may be about the same as using ndv directly)
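
As a note, the corrected stream load call from step 2.a reads more easily as a single command; a sketch below, where name:password, host, and the data file are the doc's placeholders:

# Corrected stream load for 2.a: set1/set2 are derived from table columns id/name.
curl --location-trusted -uname:password -T data \
    -H "label:load_1" \
    -H "columns:dt, id, name, province, os, set1=hll_hash(id), set2=hll_hash(name)" \
    http://host/api/test_db/test/_stream_load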


@@ -95,6 +95,9 @@
6. Load into a table containing HLL columns; either columns of the table or columns in the data can be used to generate the HLL columns (the user is in default_cluster)
curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2
\&columns=k1,k2,k3\&hll=hll_column1,k1:hll_column2,k2
curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2
\&hll=hll_column1,tmp_k4:hll_column2,tmp_k5\&columns=k1,k2,k3,tmp_k4,tmp_k5
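
A one-line form of the second example may be easier to copy; quoting the URL avoids the \& escaping (host:port, testDb/testTbl, and the column names are the doc's placeholders):

# Mini load where tmp_k4/tmp_k5 exist only in the data and feed the HLL columns;
# note that every source column named in hll= also appears in columns=.
curl --location-trusted -u root -T testData \
    'http://host:port/api/testDb/testTbl/_load?label=123&max_filter_ratio=0.2&hll=hll_column1,tmp_k4:hll_column2,tmp_k5&columns=k1,k2,k3,tmp_k4,tmp_k5'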


@@ -36,12 +36,15 @@ distributed by hash(id) buckets 32;
2. Import data. See help curl for how to import it.
A. Generate HLL columns using columns in tables
curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,id:set2,name
A. Generate HLL columns using columns in tables
B. Generate HLL columns using a column in the data
curl --location-trusted -uname:password -T data http://host/api/test_db/test/_load?label=load_1\&hll=set1,cuid:set2,os
\&columns=dt,id,name,province,sex,cuid,os
curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, os, set1=hll_hash(id), set2=hll_hash(name)"
http://host/api/test_db/test/_stream_load
B. Generate HLL columns using a column in the data
curl --location-trusted -uname:password -T data -H "label:load_1" -H "columns:dt, id, name, province, sex, cuid, os, set1=hll_hash(cuid), set2=hll_hash(os)"
http://host/api/test_db/test/_stream_load
3. There are three common ways of aggregating data (if you query the base table directly without aggregating, the speed may be similar to using NDV directly):
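
Likewise, the corrected case B call as one command; here the source columns cuid and os come from the data file rather than the table (same placeholders as above):

# Corrected stream load for case B: hll_hash is applied to data columns cuid/os.
curl --location-trusted -uname:password -T data \
    -H "label:load_1" \
    -H "columns:dt, id, name, province, sex, cuid, os, set1=hll_hash(cuid), set2=hll_hash(os)" \
    http://host/api/test_db/test/_stream_load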


@@ -92,8 +92,11 @@ seq 1 10 | awk '{OFS="\t"}{print $1, $1 * 10}' | curl --location-trusted -u root
6. Load into a table containing HLL columns; either columns of the table or columns in the data can be used to generate the HLL columns (the user is in default_cluster)
curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2
\&hll=hll_column1,tmp_k4:hll_column2,tmp_k5\&columns=k1,k2,k3,tmp_k4,tmp_k5
curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2\&hll=hll_column1,k1:hll_column2,k2
\&columns=k1,k2,k3
curl --location-trusted -u root -T testData http://host:port/api/testDb/testTbl/_load?label=123\&max_filter_ratio=0.2
\&hll=hll_column1,tmp_k4:hll_column2,tmp_k5\&columns=k1,k2,k3,tmp_k4,tmp_k5
7. View imports after submission
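
Step 7's check can be done over the FE's MySQL interface once the load is submitted; a minimal sketch, assuming the default query port 9030 and the label 123 from the examples (host and testDb are placeholders):

# Mini load is asynchronous, so poll its state; SHOW LOAD filters by label.
mysql -h host -P 9030 -u root -D testDb \
    -e 'SHOW LOAD WHERE LABEL = "123";'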