574 Commits

Author SHA1 Message Date
620de71052 Update basic-usage.md (#5493)
[Docs] Fix typo in docs zh-CN getting-started basic-usage.md
2021-03-10 18:36:56 +08:00
1d1a2569aa Update install-deploy.md (#5482)
[Docs] fix a mistake in docs zh-CN installing
2021-03-10 10:23:06 +08:00
bd53f407aa [Bucket Shuffle Join] Support some features of Bucket Shuffle Join (#5459)
1. Support Bucket Shuffle Join when the left table is a colocate table or a Colocate/Bucket Shuffle Join
2. Enable Local Runtime Filter when there is Bucket Shuffle Join and Colocate Join
3. Add doc for Bucket Shuffle Join
2021-03-09 14:47:59 +08:00
8855782aab [Doc] Fix page links (#5454) 2021-03-06 16:13:56 +08:00
4e1b6b3eef [ODBC] Report column information when type conversion fails in a query on an ODBC MySQL table (#5422)
When type conversion fails while querying a MySQL table through ODBC, the error message now includes information about the column.
2021-03-04 22:23:37 +08:00
bf086408d8 [Doc] Add query cache docs (#4479) 2021-03-04 22:21:08 +08:00
80d237510d [Doc] Modify dead link of doc (#5411)
Fix #5405

Co-authored-by: xxiao2018 <benghua3_1@sina.com>
2021-03-03 15:20:17 +08:00
e93a6da0e5 [Doc] correct format errors in English doc (#5321)
Fix some English doc format errors
2021-02-26 11:32:14 +08:00
5781d67afe Fix file licences (#5414)
Add license to files
For Doris 0.14
2021-02-24 16:37:17 +08:00
8046172c31 Update CREATE TABLE.md (#5398)
add "```" to ## keyword section, to fit markdown syntax.
2021-02-22 16:08:18 +08:00
6ede4c6ec1 [Feature] Support backup,restore,load,export directly connect to s3 (#5399)
* [doris-1008] support backup and restore directly to cloud storage via aws s3 protocol

* [Internal][S3DirectAccess] Support backup, restore, load, export connecting directly to s3
1. Support load and export data from/to s3 directly.
2. Add a config to auto convert broker access to s3 access when available

Change-Id: Iac96d4b3670776708bc96a119ff491db8cb4cde7

(cherry picked from commit 2f03832ca52221cc7436069b96c45c48c4bc7201)

* [Internal][S3DirectAccess] File path glob compatible with broker

Change-Id: Ie55e07a547aa22c6fa8d432ca926216c10384e68
(cherry picked from commit d4fb25544c0dc06d23e1ada571ec3f8edd4ba56f)

* [internal] [doris-1008] fix log4j class not found

Change-Id: I468176aca0d821383c74ee658d461aba9e7d5be3
(cherry picked from commit 029adaa9d6ded8503acbd6644c1519456f3db232)

* add poms

Co-authored-by: yangzhengguo01 <yangzhengguo01@baidu.com>
2021-02-22 16:07:56 +08:00
b098261253 docs(Doc): correct wrong num in create table help doc (#5365)
Co-authored-by: liuyuan <liuyuan.a@miaozhen.com>
2021-02-20 10:07:48 +08:00
37c976b9af [Docs] Reorder docs index in sidebar (#5388)
reorder the docs sidebar
2021-02-16 22:35:35 +08:00
b8612a4be5 [DOCS] add some missing documents (#5370) 2021-02-09 09:31:39 +08:00
a1808c1a71 [Function] Add BE udf bitmap_not (#5346) (#5357)
This function returns the NOT result of the two input bitmaps.
2021-02-07 22:39:17 +08:00
780900ac9c [Feature] Support preceding filter original data when loading (#5338)
Support conditional filtering of the original data in broker load and routine load,
e.g.:

```
LOAD LABEL `label1`
(
DATA INFILE ('bos://cmy-repo/1.csv')
INTO TABLE tbl2
COLUMNS TERMINATED BY '\t'
(event_day, product_id, ocpc_stage, user_id)
SET (
	ocpc_stage = ocpc_stage + 100
)
PRECEDING FILTER user_id = 1381035
WHERE ocpc_stage > 30
)
...
```
2021-02-07 22:37:48 +08:00
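The load statement in #5338 can be read as a three-stage pipeline. The sketch below is a plain-Python model of that order, not Doris code: `load_rows` and its arguments are hypothetical names, and it assumes that `PRECEDING FILTER` is evaluated on the original rows, that `SET` is applied next, and that `WHERE` is evaluated on the transformed rows.

```python
def load_rows(raw_rows, preceding_filter, set_exprs, where):
    """Model of the assumed evaluation order; not the Doris implementation."""
    loaded = []
    for raw in raw_rows:
        # 1. PRECEDING FILTER sees the original, untransformed row.
        if not preceding_filter(raw):
            continue
        # 2. SET expressions derive or overwrite column values.
        row = dict(raw)
        for column, expr in set_exprs.items():
            row[column] = expr(row)
        # 3. WHERE sees the transformed row.
        if where(row):
            loaded.append(row)
    return loaded


raw_rows = [
    {"user_id": 1381035, "ocpc_stage": 10},
    {"user_id": 1381035, "ocpc_stage": -80},
    {"user_id": 42, "ocpc_stage": 50},
]
result = load_rows(
    raw_rows,
    preceding_filter=lambda r: r["user_id"] == 1381035,
    set_exprs={"ocpc_stage": lambda r: r["ocpc_stage"] + 100},
    where=lambda r: r["ocpc_stage"] > 30,
)
print(result)
```

Under these assumptions only the first row is loaded: the third row is dropped by the preceding filter before any transformation, and the second row is dropped by `WHERE` because its transformed `ocpc_stage` is 20.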
059791c6ac [Config] Change some default values of Doris config (#5348)
1. Enable bucket shuffle join by default in session variables.
2. Remove the FE config enable_odbc_table.
2021-02-03 13:22:38 +08:00
b315244ba7 [Doc] Fix the error description for the number of bytes of double type. (#5273)
Modify the error description of double type: 12 bytes is modified to 8 bytes
2021-02-01 00:11:14 +08:00
be0b0f930c [Load] Load job should not begin transaction when task queue in loadingLoadTaskScheduler is full to avoid txn timeout (#5205) 2021-02-01 00:10:24 +08:00
de57667d6d [Delete] Support delete with multi partitions (#5252)
Support delete statement like:
1. delete from table partitions(p1, p2) where xxx;  // apply to p1, p2
2. delete from table where xxx;     // apply to all partitions

Also remove code about the deprecated sync/async delete job.

This CL changes FE meta version to 94
2021-01-30 20:33:34 +08:00
e774314ffb Fix some problems related to thrift rpc when using the nonblocking IO model (#5117)
* Fix some problems related to thrift rpc when using the nonblocking IO model

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:57:30 +08:00
67b0631257 [Enhancement] Fix bug for auditloader plugin that audit event cannot be processed in time (#5194)
* [Enhancement] Fix bug that audit event cannot be processed in time

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:53:32 +08:00
ca10205137 [Function] Support show create function statement (#5197)
* [Function]Support show create function stmt

Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
2021-01-28 10:52:37 +08:00
2010d331d3 [Doc] Translate a Chinese statement which appears in English version doc (#5290) 2021-01-26 09:13:30 +08:00
139709d060 [Storage] Optimize Zone map create policy (#5260)
If a table contains very large fields, each page may hold only one row,
and that row still gets its own zone map index.
This causes the stored data to expand to three times the size of the original data,
and reading those segments also takes more memory.
Therefore, zone map index creation is disabled for segments with too few rows.
2021-01-24 10:11:21 +08:00
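One way to see where the threefold expansion described in #5260 could come from is the sketch below. It assumes that a zone map keeps one copy of the minimum and one copy of the maximum value of each page. That layout is an assumption made for illustration, and `stored_bytes` is a hypothetical helper, not a Doris function.

```python
def stored_bytes(values, rows_per_page, with_zone_map):
    """Return data bytes plus, optionally, the assumed per-page min/max bytes."""
    total = sum(len(v) for v in values)
    if with_zone_map:
        for start in range(0, len(values), rows_per_page):
            page = values[start:start + rows_per_page]
            # Assumed zone map layout: one copy of the page minimum, one of the maximum.
            total += len(min(page)) + len(max(page))
    return total


# Four large values of 1000 bytes each.
large_values = [bytes([i]) * 1000 for i in range(4)]
raw = stored_bytes(large_values, rows_per_page=1, with_zone_map=False)
one_row_pages = stored_bytes(large_values, rows_per_page=1, with_zone_map=True)
four_row_pages = stored_bytes(large_values, rows_per_page=4, with_zone_map=True)
print(raw, one_row_pages, four_row_pages)
```

With one row per page the minimum and maximum are both the row itself, so every value is stored three times. With more rows per page the same index costs proportionally less.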
83b7a23d5c fix alter routine load not work (#5257) 2021-01-20 10:52:02 +08:00
73a67901ed [Metric] Add system memory metrics for fe (#5149)
Currently, FE's SystemMetrics only supports TCP metrics. This change adds system memory metrics for FE,
which can be used to troubleshoot memory problems.
2021-01-17 09:37:01 +08:00
99b22c92f8 [Feature] Add a http interface for single tablet migration between different disks (#5101)
Based on PR #4475, this patch adds a new feature for migrating a single tablet between different disks over HTTP.
Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-01-16 21:35:20 +08:00
07eaf50084 [Feature] Add a http interface to acquire the tablets distribution between different disks (#5096)
For the task of rebalancing tablets among different disks on the same BE,
an effective strategy might be to ensure that all tablets under the same partition
are evenly distributed across the different disks. Thus, it is necessary to obtain the
distribution of tablets under the same partition between different disks on a BE.

This patch adds a new HTTP interface for BE to acquire the distribution of tablets
under a partition between different disks on the same BE.
2021-01-15 09:32:27 +08:00
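The kind of summary that #5096 describes can be sketched as a per-partition count of tablets on each disk. The tuple layout, the disk paths, and the function name `tablet_distribution` are hypothetical, chosen only for illustration. The actual HTTP interface and its response format are not shown in this log.

```python
from collections import Counter, defaultdict


def tablet_distribution(tablets):
    """Count tablets per disk for every partition.

    `tablets` is an iterable of (partition_id, disk_path, tablet_id) tuples.
    """
    per_partition = defaultdict(Counter)
    for partition_id, disk_path, _tablet_id in tablets:
        per_partition[partition_id][disk_path] += 1
    return {pid: dict(counts) for pid, counts in per_partition.items()}


tablets = [
    (10, "/disk1", 1001),
    (10, "/disk1", 1002),
    (10, "/disk2", 1003),
    (11, "/disk2", 1004),
]
distribution = tablet_distribution(tablets)
print(distribution)
```

A rebalancer could compare the counts within each partition to decide which disk is overloaded for that partition.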
05631cfa4f [Config] Add publish timeout param when exec insert (#5170)
Add new session variable to control the timeout of publish task of insert operation.
2021-01-10 20:48:10 +08:00
f2a11fe1f7 [Enhancement] Add more comprehensive prometheus jvm thread metrics on fe (#5112)
Currently, the FE thread metrics are very simple: only thread count and peak_count.
More comprehensive Prometheus JVM thread metrics on FE are needed;
they will be useful when analyzing FE's running status.
2021-01-10 20:43:17 +08:00
5d6a1a7290 [Load] support ignoring eovercrowded when tablet sink (#5156)
If the ignore_eovercrowded flag is set, the `PTabletWriterAddBatchRequest`
won't fail on `EOVERCROWDED`, so load jobs no longer fail with this error.
It only affects the NodeChannel (the load job); other rpc requests will still check for overcrowding.
2021-01-09 23:40:51 +08:00
fe1ca824cc [Config] change some static config to dynamic config and delete some unused config (#5158)
* change some BE static config to dynamic config

Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-01-06 09:55:09 +08:00
1be429600b Update install-deploy.md (#5190)
update description about expanding FE
2021-01-05 09:49:15 +08:00
e536823f92 [Thirdparty] Fix possible thirdparty build failure (#5187)
1. Fix thirdparty build failures on some OSes, caused by the default lib path being `lib` or `lib64`, and `arrow` build failures caused by `brotli` and `zstd`
2. Fix failure to extract `.tar.bz2` files
2021-01-04 15:21:18 +08:00
6c098e45fc [Optimize][Cache]Implementation of Separated Page Cache (#5008)
#4995
**Implementation of Separated Page Cache**
- Add config "index_page_cache_ratio" to set the ratio of capacity of index page cache
- Change the member of StoragePageCache to maintain two types of cache
- Change the interface of StoragePageCache for selecting type of cache
- Change the usage of page cache in read_and_decompress_page in page_io.cpp
  - add page type as argument
  - check if current page type is available in StoragePageCache (cover the situation of ratio == 0 or 1)
- Add type as argument in superior call of read_and_decompress_page
- Change Unit Test
2021-01-04 12:19:24 +08:00
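The idea in #5008 of splitting one page cache into two by a ratio can be sketched as follows. This is a simplified Python model, not the C++ implementation: only `index_page_cache_ratio` comes from the commit message, while the class names, method names, and the entry-count capacity are assumptions.

```python
from collections import OrderedDict
from enum import Enum


class PageType(Enum):
    DATA_PAGE = 1
    INDEX_PAGE = 2


class LruCache:
    """Tiny LRU cache whose capacity is counted in entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def lookup(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def insert(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry


class SeparatedPageCache:
    """Holds one cache per page type, sized by index_page_cache_ratio."""

    def __init__(self, capacity, index_page_cache_ratio):
        index_capacity = int(capacity * index_page_cache_ratio)
        data_capacity = capacity - index_capacity
        self._caches = {}
        # A ratio of 0 or 1 leaves one page type without any cache.
        if index_capacity > 0:
            self._caches[PageType.INDEX_PAGE] = LruCache(index_capacity)
        if data_capacity > 0:
            self._caches[PageType.DATA_PAGE] = LruCache(data_capacity)

    def is_cache_available(self, page_type):
        return page_type in self._caches

    def lookup(self, key, page_type):
        cache = self._caches.get(page_type)
        return cache.lookup(key) if cache is not None else None

    def insert(self, key, value, page_type):
        cache = self._caches.get(page_type)
        if cache is not None:
            cache.insert(key, value)
```

Keeping the two caches separate means a burst of data pages can no longer evict index pages, and callers can check `is_cache_available` to skip caching for a page type whose share is zero.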
05ac7fcd4a [Function] Add BE udf bitmap_xor (#5098)
This function returns the XOR result of the two input bitmaps.
2021-01-04 09:27:46 +08:00
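The XOR of two bitmaps contains the values that are set in exactly one of them. A minimal illustration, using Python sets as a stand-in for bitmaps (the BE function itself operates on Doris bitmap values, which are not modelled here):

```python
def bitmap_xor(lhs, rhs):
    """Return the values present in exactly one of the two inputs."""
    return lhs ^ rhs  # symmetric difference of two sets


print(bitmap_xor({1, 2, 3}, {3, 4}))
```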
a8b8c4760c [Doc] Fix some spelling mistakes and default value mistakes in document (#5180) 2021-01-03 15:45:56 +08:00
d7a584ac59 [Rebalancer] support partition rebalancer (#5010)
RebalancerType can be configured via Config.rebalancer_type (BeLoad, Partition).
PartitionRebalancer is based on TwoDimensionalGreedyAlgo.
The two dimensions in Doris are cluster and partition. Only the replica count is considered,
not the replica size.
See #4845 for further details.
2020-12-31 09:41:38 +08:00
62604dfeac Improve the processing logic of Load statement derived columns (#5140)
* support transitive in load expr
2020-12-30 10:27:46 +08:00
16d52651f3 [Docs] some brpc configs can't be modified at runtime (#5137)
brpc_max_body_size & brpc_socket_max_unwritten_bytes can't be modified at runtime. Only flags which have (R)(has_validator_fn) can.
2020-12-25 15:31:14 +08:00
279ae1cb75 Add fuzzy_parse option to speed up json import (#5114)
Add a fuzzy_parse flag: if all objects in the JSON file have the same keys in the same order, only the keys of the first row need to be parsed, and values in later rows are then looked up by index instead of by key.
2020-12-25 09:19:42 +08:00
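The fuzzy_parse idea in #5114 can be illustrated as follows: resolve the key positions once from the first row, then read later rows by position. This is a conceptual sketch only. The function name and arguments are hypothetical, and Python's `json` module still parses every key, so the sketch shows the lookup strategy rather than the speed-up.

```python
import json


def rows_with_fuzzy_parse(lines, wanted_columns):
    """Extract columns by position, resolving key names only on the first row.

    Assumes every JSON object has the same keys in the same order.
    """
    first_keys = list(json.loads(lines[0]).keys())
    positions = [first_keys.index(name) for name in wanted_columns]
    rows = []
    for line in lines:
        values = list(json.loads(line).values())
        # Later rows are read by index; their key names are never compared.
        rows.append(tuple(values[p] for p in positions))
    return rows


lines = [
    '{"id": 1, "city": "beijing", "score": 9.5}',
    '{"id": 2, "city": "shanghai", "score": 8.0}',
]
rows = rows_with_fuzzy_parse(lines, ["city", "id"])
print(rows)
```

If a row breaks the same-keys, same-order assumption, a positional lookup silently returns the wrong value, which is why the option is only safe for uniformly structured files.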
80209ef1b6 Update outfile to support cos.md (#5129)
update doc to add how to export query result on cos
2020-12-23 20:21:10 +08:00
7199bcc88b Update outfile(en) to support cos.md (#5130)
Export query result to `COS` (Tencent Cloud Object Storage)
2020-12-23 15:39:45 +08:00
6673306fda [DOC] fix toSql of ShowPartitionsStmt (#5070) 2020-12-19 11:18:00 +08:00
5bf84814cc [Doc] Improve broadcast instructions (#5048) 2020-12-19 11:16:59 +08:00
b640991e43 [Enhance] Add profile for load job (#5052)
Add viewable profile for broker load. Similar to the query profile,
the user can submit the import job by setting the session variable is_report_success to true,
and then view the running profile of the job on the FE web page for easy analysis and debugging.
2020-12-16 23:52:10 +08:00
74bfd69595 [Bug] Forbid creating table with dynamic partition when FE.config dynamic_partition_enable=false (#5043)
- There is an FE configuration called dynamic_partition_enable
    which turns the dynamic partition function on and off.
  When this configuration is false, no table supports dynamic partitioning.

- But when a user tried to create a dynamic partition table, Doris did not check this parameter.
  As a result, the user could create a dynamic partition table normally,
    but Doris could not actually create partitions for that table.

- This PR checks this config when building the table.
  A dynamic partition table can be created only when the dynamic_partition_enable configuration is true.
  If the configuration is false, the command to create a dynamic partition table directly reports an error.
2020-12-16 23:44:20 +08:00
650536d53e [Feature] Add Topn udaf (#4803)
For #4674
This is a udaf for approximate topn using the Space-Saving algorithm. At present, it can only calculate
the frequent items and their frequencies in a certain column, based on which similar
topN functions supported by Kylin can be implemented in the future.

I have also added a test to calculate the accuracy of this algorithm. The following is a rough running result.
The total amount of data is 1 million lines and follows the Zipfian distribution. Element cardinality
is the data cardinality, and the columns 20X, 50X, 100X correspond to space_expand_rate values of 20, 50, 100,
which set the number of counters in the Space-Saving algorithm.

```
zf exponent = 0.5
Element cardinality    20X     50X     100X
1000                   100%    100%    100%
10000                  100%    100%    100%
100000                 100%    100%    100%
500000                  94%     98%     99%

zf exponent = 0.6,1
Element cardinality    20X     50X     100X
1000                   100%    100%    100%
10000                  100%    100%    100%
100000                 100%    100%    100%
500000                 100%    100%    100%

```
2020-12-16 21:58:34 +08:00
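For readers unfamiliar with the Space-Saving algorithm mentioned in #4803, a minimal Python sketch follows. It is not the BE implementation. The function names are hypothetical, and sizing the counter table as `n * space_expand_rate` is an assumption about how the rate maps to the number of counters.

```python
def space_saving(stream, num_counters):
    """Approximate item frequencies using at most num_counters counters."""
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < num_counters:
            counters[item] = 1
        else:
            # Replace the item with the smallest count; the newcomer inherits
            # that count plus one, so counts may overestimate but never underestimate.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    return counters


def top_n(stream, n, space_expand_rate):
    """Return the n most frequent items as (item, estimated_count) pairs."""
    # Assumption: the counter table holds n * space_expand_rate entries.
    counters = space_saving(stream, n * space_expand_rate)
    ranked = sorted(counters.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


stream = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
print(top_n(stream, n=2, space_expand_rate=2))
```

With `space_expand_rate=1` the same stream has only two counters, so late arrivals such as `d` evict `b` and inherit its count. This is the loss of accuracy that a larger expansion rate reduces, consistent with the trend in the table above.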
115d4332aa [ODBC] Support ODBC Sink for inserting data into ODBC external tables (#5033)
issue:#5031

1. Support ODBC Sink for inserting data into ODBC external tables.
2. Support transactions for ODBC sink to make sure inserts are atomic.
3. Update the document about ODBC sink.
2020-12-13 21:53:27 +08:00