Commit Graph

313 Commits

Author SHA1 Message Date
a1500eb544 Update doris-on-es.md (#3446) 2020-05-03 12:48:48 +08:00
2cb4027164 Update doris-on-es.md (#3441) 2020-05-03 12:48:19 +08:00
54da5a491c Fix delete statement doc not displaying correctly (#3445) 2020-05-01 19:20:00 +08:00
73a3c59efb [Bug] Fix bug that help-resource.zip file is missing. (#3423) 2020-04-29 19:25:28 +08:00
432965e360 [Enhancement] documents rebuild with Vuepress (#3408) (#3414) 2020-04-29 09:14:31 +08:00
9a934ec9f6 [Load] Add more info in SHOW LOAD result (#3391)
Fix #3390
This CL adds more info to the `JobDetails` column of the `SHOW LOAD` result for Broker Load Jobs.

For example:
```
{
	"Unfinished backends": {
		"9c3441027ff948a0-8287923329a2b6a7": [10002]
	},
	"All backends": {
		"9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006]
	},
	"ScannedRows": 2390016,
	"TaskNumber": 1,
	"FileNumber": 1,
	"FileSize": 1073741824
}
```

2 newly added keys:

`Unfinished backends` lists the BEs whose tasks are not yet finished.
`All backends` lists the BEs on which this job has tasks.
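
As a rough illustration of how a client might use the two keys, the sketch below parses the sample `JobDetails` value shown above and derives the backends that have already finished. The helper name `finished_backends` is hypothetical, and the meaning of the map key (`9c34...`) is not stated in this commit, so the code treats it as an opaque string.

```python
import json

# Sample JobDetails value copied from the example above.
JOB_DETAILS = """
{
  "Unfinished backends": {"9c3441027ff948a0-8287923329a2b6a7": [10002]},
  "All backends": {"9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006]},
  "ScannedRows": 2390016,
  "TaskNumber": 1,
  "FileNumber": 1,
  "FileSize": 1073741824
}
"""


def finished_backends(job_details):
    """Return, per key, the backend ids in `All backends` that are no longer unfinished."""
    details = json.loads(job_details)
    unfinished = details.get("Unfinished backends", {})
    result = {}
    for key, backends in details.get("All backends", {}).items():
        pending = set(unfinished.get(key, []))
        # Keep the original order of `All backends`, dropping pending ones.
        result[key] = [b for b in backends if b not in pending]
    return result
```

With the sample above, only backend 10002 is still pending, so the other two ids are reported as finished.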

One more thing: I pass the Backend Id along with the heartbeat msg from FE to BE, so that each BE
knows its own Id.
2020-04-26 21:30:23 +08:00
7715deed4e [Doc] Add download link for 0.12.0 release (#3388) 2020-04-24 21:04:19 +08:00
09eb40e356 [New Stmt] Alter replication number for table (#3360)
This CL adds a new command to set the replication number of a table in one statement.
```
alter table test_tbl set ("replication_num" = "3");
```
It changes the replication num of an unpartitioned table.

and

```
alter table test_tbl set ("default.replication_num" = "3");
```

It changes the default replication num of the specified table.
2020-04-23 21:58:09 +08:00
d854a79878 [Bug] isQuery field should be reset at the beginning of query execution (#3374)
If not reset, all queries that come from the same session will have the same isQuery field value.
This bug causes all entries in fe.audit.log to have the same IsQuery=true.
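
The failure mode can be sketched in a few lines. This is a simplified Python analog, not the FE code: `SessionContext`, `execute`, and the `select` prefix check are all assumptions made for illustration. The point is only that per-session state must be reset at the start of each execution.

```python
class SessionContext:
    """Hypothetical per-session state that is reused across statements."""

    def __init__(self):
        self.is_query = False


def execute(ctx, stmt, audit_log):
    # Reset first, so the value left by the previous statement cannot leak.
    ctx.is_query = False
    # Simplified stand-in for real statement analysis.
    if stmt.lstrip().lower().startswith("select"):
        ctx.is_query = True
    audit_log.append({"stmt": stmt, "IsQuery": ctx.is_query})
```

Without the reset line, a session that once ran a query would report `IsQuery=true` for every later statement.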

This CL also fixes another bug:
The resolved IPs of a user's domain should not appear in another user's white list. Fix #3380
2020-04-23 09:00:47 +08:00
wyb
2de78e50e2 [Bug] Fix authorization missing when auditloader plugin redirect stream load (#3367)
HttpURLConnection can automatically redirect stream load to BE, but the http request headers
carry no authorization information after the redirect.

HttpURLConnection may remove the authorization info when it follows the redirect.

The solution is to set the followRedirect property to false on the connection object and perform
the redirect request manually.

#3364
2020-04-21 22:03:18 +08:00
c6ac60bab9 [SegmentV2] Optimize the upgrade logic of SegmentV2 (#3340)
This CL mainly made the following modifications:

1. Reorganized SegmentV2 upgrade document.
2. When the variable `use_v2_rollup` is set to true, the base rollup in v2 format is forcibly queried for verifying the data.
3. Fix a problem that there is no persistent storage format information in the schema change operation that performs v2 conversion.
4. Allow users to directly create v2 format tables.
2020-04-21 10:45:29 +08:00
c69bf9ac44 [New Stmt] Add SHOW KEYS grammar (#3342)
Support `SHOW KEYS FROM table` for the data connectors of mainstream BI tools
such as PowerBI/FineBI

#3334
2020-04-20 15:58:20 +08:00
1d3370532b [Doc] Fix some typos, modify routine load doc (#3350)
Fix BOOLEAN typo, improve the routine load sample
2020-04-19 11:39:10 +08:00
31ebb2496d [ISSUE #3190] Add documents for delete simplification (#3335) 2020-04-18 22:48:18 +08:00
f3e5320fea Fix document bug of storage_cooldown_time (#3333) 2020-04-17 09:34:28 +08:00
9257535f91 [New Feature] Support setting replica quota in db level (#3283)
This PR limits replica usage. Admins need to know the replica usage of every db and
table, and need to be able to set a replica quota for every db.

```
ALTER DATABASE db_name SET REPLICA QUOTA quota; 
```
2020-04-14 22:25:32 +08:00
d0f87728e0 [Doc] Add example of timeout property in alter table stmt (#3274) 2020-04-07 19:51:16 +08:00
c9c58342b2 [License] Add License to codes (#3272) 2020-04-07 16:35:13 +08:00
2ed184e06a Add config: tablet writer open rpc timeout (#3258) 2020-04-03 16:43:56 +08:00
fcb651329c [Plugin] Making FE audit module pluggable (#3219)
Currently we have implemented the plugin framework in FE. 
This CL makes the original audit log logic pluggable.
The following classes are mainly implemented:

1. AuditPlugin
    The interface of audit plugin

2. AuditEvent
    An AuditEvent contains all information about an audit event, such as a query, or a connection.

3. AuditEventProcessor
    The audit event processor receives all audit events and delivers them to all installed audit plugins.

This CL implements two audit module plugins:

1. The builtin plugin `AuditLogBuilder`, which acts the same as the previous logic and saves the
    audit log to `fe.audit.log`

2. An optional plugin `AuditLoader`, which periodically inserts the audit log into a Doris table
    specified by the user. In this way, users can conveniently use SQL to query and analyze this
    audit log table.
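
The dispatch pattern described above can be sketched as follows. This is a Python analog written for illustration only: the method names (`exec_event`, `install`, `handle_event`), the event fields, and the log line layout are assumptions, and the batch-size trigger merely stands in for the periodic load that `AuditLoader` performs.

```python
from dataclasses import dataclass


@dataclass
class AuditEvent:
    # Illustrative fields only; a real audit event carries more information.
    user: str
    stmt: str
    is_query: bool


class AuditPlugin:
    """Interface of an audit plugin (method name is an assumption)."""

    def exec_event(self, event):
        raise NotImplementedError


class LogBuilderPlugin(AuditPlugin):
    """Stands in for the builtin plugin: formats each event as one log line."""

    def __init__(self):
        self.lines = []

    def exec_event(self, event):
        flag = str(event.is_query).lower()
        self.lines.append(f"|User={event.user}|IsQuery={flag}|Stmt={event.stmt}")


class BatchLoaderPlugin(AuditPlugin):
    """Stands in for the optional loader: buffers events, flushes in batches."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buffer = []
        self.loaded_batches = []

    def exec_event(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            # A real loader would insert the batch into a table here.
            self.loaded_batches.append(self.buffer)
            self.buffer = []


class AuditEventProcessor:
    """Receives every audit event and delivers it to all installed plugins."""

    def __init__(self):
        self.plugins = []

    def install(self, plugin):
        self.plugins.append(plugin)

    def handle_event(self, event):
        for plugin in self.plugins:
            plugin.exec_event(event)
```

Because the processor only knows the plugin interface, the builtin logger and the optional loader can be installed or removed independently.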

Some documents are added:

1. HELP docs of install/uninstall/show plugin.
2. Rename the `README.md` in `fe_plugins/` dir to `plugin-development-manual.md` and move
    it to the `docs/` dir
3. `audit-plugin.md` to introduce the usage of `AuditLoader` plugin.

ISSUE: #3226
2020-04-03 09:53:50 +08:00
29b37dad49 Sql reference of materialized view (#3208)
* Sql reference of materialized view

Sql reference of Create and drop materialized view in English and Chinese.

* Change description
2020-04-01 21:22:19 +08:00
d3555e3624 [Conf][API Change] Change the default FE meta dir and BE storage_root_path
1. Change the word palo to doris in the conf files.
2. Set default meta_dir to ${DORIS_HOME}/doris-meta
3. Comment out FE meta_dir, leaving it as ${DORIS_HOME}/doris-meta, as it exists in FE Config.java.
4. Comment out BE storage_root_path, leaving it as ${DORIS_HOME}/storage, as it exists in BE config.h.

NOTICE: default config is changed.
2020-03-27 20:42:12 +08:00
aa8b2f86c4 [Bug][Refactor] Fix the conflict of temp partition and dynamic partition operations (#3201)
The bug is described in issue: #3200.

This CL solves the problem as follows:
1. Refactor the alter operation conflict checking logic by introducing new classes `AlterOperations` and `AlterOpType`.
2. Allow add/drop temporary partition when dynamic partition feature is enabled.
3. Allow modifying a table's properties when the table has temporary partitions.
4. Make the property `dynamic_partition.enable` optional, with a default of true.
2020-03-27 20:25:15 +08:00
c1969a3fb3 [Conf] Make default_storage_medium configurable (#2980)
Doris supports choosing the storage medium when creating a table, and the cluster balance strategy
depends on the storage medium. Most users do not specify the storage medium when creating a table,
and even if they know that they should choose one, they have no idea which storage medium the
cluster uses. So I think storage_medium and storage_cooldown_time should be configurable,
and this should be the admin's responsibility.

For example, suppose the cluster's storage medium is HDD, but we need to change some of the machines
to SSD. After the change, the tablets created earlier are stored on HDD and cannot find a destination
path to migrate to. Users keep creating tables as usual, so all tablets are stored on the old machines
and the new machines store only a few tablets. Without this config, the only option is for the admin
to traverse all partitions in the cluster and change the storage_medium property, which increases
operational and maintenance costs.

So I add an FE config, default_storage_medium, so that users can set the default storage medium.
2020-03-27 20:22:18 +08:00
8fa328c344 [Doc]Update doc for dynamic partition (#3093)
Add an explanation of dynamically dropping partitions.
2020-03-25 20:45:13 +08:00
3b32938140 [Doc] Create CONTRIBUTING.md (#3180) 2020-03-24 13:42:21 +08:00
d4c1938b5c Open datetime min value limit (#3158)
The min_value of datetime in olap/type.h is 0000-01-01 00:00:00, so we don't need to restrict the datetime minimum in tablet_sink.
2020-03-24 10:52:57 +08:00
0f14408f13 [Temp Partition] Support loading data into temp partitions (#3120)
Related issue: #2663, #2828.

This CL supports loading data into specified temporary partitions.

```
INSERT INTO tbl TEMPORARY PARTITIONS(tp1, tp2, ..) ....;

curl .... -H "temporary_partition: tp1, tp, .. "  ....

LOAD LABEL db1.label1 (
DATA INFILE("xxxx") 
INTO TABLE `tbl2`
TEMPORARY PARTITION(tp1, tp2, ...)
...
```

NOTICE: this CL changes the FE meta version to 77.

There are 3 major changes in this CL:

## Syntax reorganization

Reorganized the syntax related to the `specify-partitions`. Removed some redundant syntax
 definitions, and unified the syntax related to the `specify-partitions` under one syntax entry.

## Meta refactor

In order to be able to support specifying temporary partitions, 
I made some changes to the way the partition information in the table is stored.

Partition information is now organized as follows:

The following two maps are reserved in OlapTable for storing formal partitions:

    ```
    idToPartition
    nameToPartition
    ```

Use the `TempPartitions` class for storing temporary partitions.

All the partition attributes of the formal partition and the temporary partition,
such as the range, the number of replicas, and the storage medium, are all stored
in the `partitionInfo` of the OlapTable.

In `partitionInfo`, we use two maps to store the range of formal partition
and temporary partition:

    ```
    idToRange
    idToTempRange
    ```

Separate maps are used because the partition ranges of formal partitions and
temporary partitions may overlap. With separate maps, it is easier to check the partition ranges.

All partition attributes except the partition range are stored using the same map,
and the partition id is used as the map key.

## Method to get partition

A table may contain both formal and temporary partitions.
There are several methods to get the partition of a table.
They typically fall into two categories:

1. Get partition by id
2. Get partition by name

Depending on the requirements, the caller may want to obtain
a formal partition or a temporary partition. The methods are
described below so that callers can pick the correct one.

1. Get by name

This type of request usually comes from a user who specifies partition names, such as
`select * from tbl partition(p1);`.
This type of request has clear information to indicate whether to obtain a
formal or temporary partition.
Therefore, we need to get the partition through this method:

`getPartition(String partitionName, boolean isTemp)`

To avoid modifying too much code, we keep `getPartition(String
partitionName)`, which is the same as:

`getPartition(partitionName, false)`

2. Get by id

This type of request usually means that the previous step has obtained
certain partition ids in some way,
so we only need to get the corresponding partition through this method:

`getPartition(long partitionId)`.

This method will try to get both formal partitions and temporary partitions.

3. Get all partition instances

Depending on the requirements, the caller may want to obtain all formal
partitions,
all temporary partitions, or all partitions. Therefore we provide 3 methods,
the caller chooses according to needs.

`getPartitions()`
`getTempPartitions()`
`getAllPartitions()`
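
The lookup rules above can be summarized in a small sketch. It is a Python analog of the Java methods named in this commit, not the actual implementation: Python has no overloading, so the by-id lookup gets its own name, and temporary partitions are kept in two plain dicts here rather than in a `TempPartitions` class.

```python
class Partition:
    def __init__(self, partition_id, name):
        self.id = partition_id
        self.name = name


class OlapTable:
    def __init__(self):
        # Formal partitions, indexed by id and by name.
        self.id_to_partition = {}
        self.name_to_partition = {}
        # Temporary partitions, kept separately (simplified stand-in).
        self.temp_id_to_partition = {}
        self.temp_name_to_partition = {}

    def add_partition(self, partition, is_temp=False):
        by_id = self.temp_id_to_partition if is_temp else self.id_to_partition
        by_name = self.temp_name_to_partition if is_temp else self.name_to_partition
        by_id[partition.id] = partition
        by_name[partition.name] = partition

    def get_partition(self, name, is_temp=False):
        # By name: the caller states which kind it wants; default is formal.
        by_name = self.temp_name_to_partition if is_temp else self.name_to_partition
        return by_name.get(name)

    def get_partition_by_id(self, partition_id):
        # By id: try formal partitions first, then temporary ones.
        partition = self.id_to_partition.get(partition_id)
        if partition is None:
            partition = self.temp_id_to_partition.get(partition_id)
        return partition

    def get_partitions(self):
        return list(self.id_to_partition.values())

    def get_temp_partitions(self):
        return list(self.temp_id_to_partition.values())

    def get_all_partitions(self):
        return self.get_partitions() + self.get_temp_partitions()
```

Defaulting `is_temp` to false mirrors the retained single-argument `getPartition(String partitionName)`, so existing callers keep seeing only formal partitions.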
2020-03-19 15:07:01 +08:00
f6374fa9a5 Use default_rowset_type to replace compaction_rowset_type (#3101)
* use default_rowset_type to replace compaction_rowset_type

* segment v2 usage document
2020-03-16 22:23:48 +08:00
14c088161c [New Stmt] Support setting replica status manually (#1522)
Sometimes a replica is broken on BE, but FE does not notice that.
In this case, we have to manually delete that replica on BE.
If hundreds of replicas need to be handled, this is a disaster.

So I add a new stmt:

    ADMIN SET REPLICA STATUS

which supports setting a tablet on a specified BE as BAD or OK.
2020-03-16 13:42:30 +08:00
5f18e99cdb [Doc] Update add fe node description (#3100) 2020-03-13 18:05:09 +08:00
d8c756260b Rewrite count distinct to bitmap and hll (#3096) 2020-03-13 11:44:40 +08:00
cf219ddf18 [ConsistencyCheck] Support checking replica consistency of tablet manually (#3067) 2020-03-10 15:25:25 +08:00
7400535b37 [Doc] Update compaction-action_EN.md (#3060)
fix typo
2020-03-09 22:09:43 +08:00
6e46dccd39 [Doc] Update compaction-action.md (#3059)
fix typo
2020-03-09 21:14:09 +08:00
765f284dcd [Doc] Add Downloads page to Doris website (#3039) 2020-03-09 09:42:46 +08:00
cd7207c869 Add ORC help doc (#3041) 2020-03-05 12:44:47 +08:00
cc1a5fb8ea [Function] Support '%' in date format string (#3037)
eg:
select str_to_date('2014-12-21 12%3A34%3A56', '%Y-%m-%d %H%%3A%i%%3A%s');
select unix_timestamp('2007-11-30 10:30%3A19', '%Y-%m-%d %H:%i%%3A%s');

This also enables us to extract column fields from HDFS file paths that contain '%'.
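
To make the `%%` escaping concrete, here is a minimal sketch of a format-string parser that treats `%%` as a literal percent sign. It is not the Doris implementation: the function names and the small specifier table are assumptions, and only the six specifiers used in the examples above are handled.

```python
import re
from datetime import datetime

# Assumed subset of MySQL-style specifiers, mapped to named regex groups.
_SPECIFIERS = {
    "Y": r"(?P<year>\d{4})",
    "m": r"(?P<month>\d{1,2})",
    "d": r"(?P<day>\d{1,2})",
    "H": r"(?P<hour>\d{1,2})",
    "i": r"(?P<minute>\d{1,2})",
    "s": r"(?P<second>\d{1,2})",
}


def format_to_regex(fmt):
    """Translate a format string into a regex; '%%' matches a literal '%'."""
    parts = []
    i = 0
    while i < len(fmt):
        ch = fmt[i]
        if ch == "%" and i + 1 < len(fmt):
            nxt = fmt[i + 1]
            if nxt == "%":
                parts.append(re.escape("%"))
            elif nxt in _SPECIFIERS:
                parts.append(_SPECIFIERS[nxt])
            else:
                raise ValueError(f"unsupported specifier: %{nxt}")
            i += 2
        else:
            # Any other character must match literally.
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def str_to_date(text, fmt):
    """Parse text with the given format; return None when it does not match."""
    match = format_to_regex(fmt).fullmatch(text)
    if match is None:
        return None
    f = {name: int(value) for name, value in match.groupdict().items()}
    return datetime(f.get("year", 1970), f.get("month", 1), f.get("day", 1),
                    f.get("hour", 0), f.get("minute", 0), f.get("second", 0))
```

In this sketch, the format `'%Y-%m-%d %H%%3A%i%%3A%s'` expects the literal text `%3A` between the time fields, which is how a URL-encoded colon appears in a file path.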
2020-03-05 08:56:02 +08:00
70cc6df415 [Doc] Fix some typos (#3024) 2020-03-02 22:13:47 +08:00
511c5eed50 [Doc] Modify format of some docs (#3021)
The format of some docs is incorrect for building the doc website.
* fix a bug that `gensrc` dir can not be built with -j.
* fix ut bug of CreateFunctionTest
2020-03-02 19:07:52 +08:00
ef4bb0c011 [RoutineLoad] Auto Resume RoutineLoadJob (#2958)
When all backends restart, the routine load job can be resumed.
2020-03-02 13:27:35 +08:00
df56588bb5 [Temp Partition] Support add/drop/replace temp partitions (#2828)
This CL implements 3 new operations:

```
ALTER TABLE tbl ADD TEMPORARY PARTITION ...;
ALTER TABLE tbl DROP TEMPORARY PARTITION ...;
ALTER TABLE tbl REPLACE TEMPORARY PARTITION (p1, p2, ...);
```

User manual can be found in document:
`docs/documentation/cn/administrator-guide/alter-table/alter-table-temp-partition.md`

I did not update the grammar manual of `alter-table.md`.
That manual is too confusing and too big; I will reorganize it later.

This is the first part to implement the "overwrite load" feature mentioned in issue #2663.
I will implement the "load to temp partition" feature in the next PR.

This CL also adds GSON serialization methods for the following classes (but they are not used yet):

```
Partition.java
MaterializedIndex.java
Tablet.java
Replica.java
```
2020-03-01 21:30:34 +08:00
0d1e28746e [Function] Support null_or_empty function (#2977)
It returns true if the string is empty or NULL. Otherwise it returns false.
2020-03-01 17:35:45 +08:00
078e35a62e Support Amazon S3 data source in Broker Load (#3004) 2020-03-01 12:53:50 +08:00
2ac07a8c07 [Doc] Fix docs mixed Chinese and English (#3017) 2020-02-28 16:36:37 +08:00
54b7828c3f [Doc] The doc of max_running_txn_num_per_db config (#3007) 2020-02-27 14:57:46 +08:00
57483ade00 [Doc] Fix typo in Chinese document (#2963)
Fix some errors in Chinese document
2020-02-25 22:30:21 +08:00
wyb
fc2d92d68a Update spark load doc (#2973) 2020-02-22 12:00:50 +08:00
96248058a1 [Doc] Modify the default port of restore meta instance in document (#2971) 2020-02-21 22:05:30 +08:00
70d2ccf384 Add spark load design (#2856) 2020-02-21 14:32:18 +08:00