Fix some format.
NOTICE(#3622 ):
This is a "revert of revert pull request".
This pr is mainly used to synthesize the PRs whose commits were
scattered and submitted due to the wrong merge method into a complete single commit.
Add a new config `drop_backend_after_decommission` in FE. if this config
is false, the BE will not be dropped after finishing decommission operation.
This new config is try to solve the problem described in ISSUE: #3460 .
TODO:
This method will generate a lot of data migration, so it is only a temporary solution.
After that, we should try to solve the problem of data balancing within the BE.
This CL also add the documents of FE and BE configuration.
These documents are incomplete and can be added later.
Fix#3390
This CL add more info in `JobDetails` column of `SHOW LOAD` result for Broker Load Job.
For example:
```
{
"Unfinished backends": {
"9c3441027ff948a0-8287923329a2b6a7": [10002]
},
"All backends": {
"9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006]
},
"ScannedRows": 2390016,
"TaskNumber": 1,
"FileNumber": 1,
"FileSize": 1073741824
}
```
2 newly added keys:
`Unfinished backends` indicates the BE which task on them are not finished.
`All backends` indicates the BE which this job has tasks on it.
One more thing, I pass the Backend Id along with the heartbeat msg from FE to BE, so that BE can
know the Id of themselves.
This CL add new command to set replication number of table in one time.
```
alter table test_tbl set ("replication_num" = "3");
```
It changes replication num of a unpartitioned table.
and
```
alter table test_tbl set ("default.replication_num" = "3");
```
It changes default replication num of the specified table.
If not reset, all queries comes from same session will have save isQuery field value.
This bug will cause all entries in fe.audit.log has same IsQuery=true.
This CL also fix another bug:
The resolved IPs of domain of a user should not appear in other user's white list. Fix#3380
HttpURLConnection can automatically redirect stream load to BE, but there is no authorization
information in http request headers after redirect.
Maybe HttpURLConnection remove authorization info when do followRedirect.
The solution is set the followRedirect property to false on the connection object and do the
redirect request manually.
#3364
This CL mainly made the following modifications:
1. Reorganized SegmentV2 upgrade document.
2. When the variable `use_v2_rollup` is set to true, the base rollup in v2 format is forcibly queried for verifying the data.
3. Fix a problem that there is no persistent storage format information in the schema change operation that performs v2 conversion.
4. Allow users to directly create v2 format tables.
This PR is to limit the replica usage, admin need to know the replica usage for every db and
table, be able to set replica quota for every db.
```
ALTER DATABASE db_name SET REPLICA QUOTA quota;
```
Currently we have implemented the plugin framework in FE.
This CL make the original audit log logic pluggable.
The following classes are mainly implemented:
1. AuditPlugin
The interface of audit plugin
2. AuditEvent
An AuditEvent contains all information about an audit event, such as a query, or a connection.
3. AuditEventProcessor
Audit event processor receive all audit events and deliver them to all installed audit plugins.
This CL implements two audit module plugins:
1. The builtin plugin `AuditLogBuilder`, which act same as the previous logic, to save the
audit log to the `fe.audit.log`
2. An optional plugin `AuditLoader`, which will periodically inserts the audit log into a Doris table
specified by the user. In this way, users can conveniently use SQL to query and analyze this
audit log table.
Some documents are added:
1. HELP docs of install/uninstall/show plugin.
2. Rename the `README.md` in `fe_plugins/` dir to `plugin-development-manual.md` and move
it to the `docs/` dir
3. `audit-plugin.md` to introduce the usage of `AuditLoader` plugin.
ISSUE: #3226
1. Change word of palo to doris in conf file.
2. Set default meta_dir to ${DORIS_HOME}/doris-meta
3. Comment out FE meta_dir, leave it to ${DORIS_HOME}/doris-meta, as exsting in FE Config.java.
4. Comment out BE storage_root_path, leave it to ${DORIS_HOME}/storage, as exsting in BE config.h.
NOTICE: default config is changed.
The bug is described in issue: #3200.
This CL solve the problem by:
1. Refactor the alter operation conflict checking logic by introducing new classes `AlterOperations` and `AlterOpType`.
2. Allow add/drop temporary partition when dynamic partition feature is enabled.
3. Allow modifying table's property when there is temporary partition in table.
4. Make the properties `dynamic_partition.enable` optional, and default is true.
Doris support choose medium when create table, and the cluster balance strategy is dependent
between different storage medium, and most use will not specify the storage medium when create table,
even they kown that they should choose a storage medium, they have no idea about the
cluster's storage medium, so, I think we should make storage_medium and storage_cooldown_time
configurable, and this should be the admin's responsibility.
For Example, if the cluster's storage medium is HDD, but we need to change part of machines to SSD,
if we change the machine, the tablets before change is stored in HDD and they can't find a dest path
to migrate, and user will create table as usual, it will make all tablets stored in old machines and
the new machines will only store a little tablets. Without this config the only way is admin need
to traverse all partitions in cluster and change the property of storage_medium, it will increase
operational and maintenance costs.
So I add a FE config default_storage_medium, so that user can set the default storage medium.
Related issue: #2663, #2828.
This CL support loading data into specified temporary partitions.
```
INSERT INTO tbl TEMPORARY PARTITIONS(tp1, tp2, ..) ....;
curl .... -H "temporary_partition: tp1, tp, .. " ....
LOAD LABEL db1.label1 (
DATA INFILE("xxxx")
INTO TABLE `tbl2`
TEMPORARY PARTITION(tp1, tp2, ...)
...
```
NOTICE: this CL change the FE meta version to 77.
There 3 major changes in this CL
## Syntax reorganization
Reorganized the syntax related to the `specify-partitions`. Removed some redundant syntax
definitions, and unified the syntax related to the `specify-partitions` under one syntax entry.
## Meta refactor
In order to be able to support specifying temporary partitions,
I made some changes to the way the partition information in the table is stored.
Partition information is now organized as follows:
The following two maps are reserved in OlapTable for storing formal partitions:
```
idToPartition
nameToPartition
```
Use the `TempPartitions` class for storing temporary partitions.
All the partition attributes of the formal partition and the temporary partition,
such as the range, the number of replicas, and the storage medium, are all stored
in the `partitionInfo` of the OlapTable.
In `partitionInfo`, we use two maps to store the range of formal partition
and temporary partition:
```
idToRange
idToTempRange
```
Use separate map is because the partition ranges of the formal partition and
the temporary partition may overlap. Separate map can more easily check the partition range.
All partition attributes except the partition range are stored using the same map,
and the partition id is used as the map key.
## Method to get partition
A table may contain both formal and temporary partitions.
There are several methods to get the partition of a table.
Typically divided into two categories:
1. Get partition by id
2. Get partition by name
According to different requirements, the caller may want to obtain
a formal partition or a temporary partition. These methods are
described below in order to obtain the partition by using the correct method.
1. Get by name
This type of request usually comes from a user with partition names. Such as
`select * from tbl partition(p1);`.
This type of request has clear information to indicate whether to obtain a
formal or temporary partition.
Therefore, we need to get the partition through this method:
`getPartition(String partitionName, boolean isTemp)`
To avoid modifying too much code, we leave the `getPartition(String
partitionName)`, which is same as:
`getPartition(partitionName, false)`
2. Get by id
This type of request usually means that the previous step has obtained
certain partition ids in some way,
so we only need to get the corresponding partition through this method:
`getPartition(long partitionId)`.
This method will try to get both formal partitions and temporary partitions.
3. Get all partition instances
Depending on the requirements, the caller may want to obtain all formal
partitions,
all temporary partitions, or all partitions. Therefore we provide 3 methods,
the caller chooses according to needs.
`getPartitions()`
`getTempPartitions()`
`getAllPartitions()`
Sometimes a replica is broken on BE, but FE does not notice that.
In this case, we have to manually delete that replica on BE.
If there are hundreds of replicas need to be handled, this is a disaster.
So I add a new stmt:
ADMIN SET REPLICA STATUS
which support setting tablet on specified BE as BAD or OK.