* Broker load supports column functions
This commit adds support for column functions in broker load.
The grammar of LoadStmt has not been changed.
Example:
columns terminated by ',' (tmp_c1, tmp_c2) set (c1=tmp_c1+tmp_c2)
The old functions, such as default_value and strftime, remain compatible.
After this commit, column functions behave the same in stream load and broker load, except for the old functions.
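As a rough illustration of what the SET clause in the example above does, the sketch below splits one source line on the column separator, binds the fields to the temporary names, and derives c1 from them. It is a Python model rather than Doris code: apply_column_mapping is a hypothetical helper, and integer-valued fields are assumed.

```python
def apply_column_mapping(line, separator=","):
    """Model of: columns terminated by ',' (tmp_c1, tmp_c2) set (c1=tmp_c1+tmp_c2)."""
    # Bind the raw fields of the source line to the temporary column names.
    # Assumption: both fields hold integers.
    tmp_c1, tmp_c2 = (int(field) for field in line.split(separator))
    # Evaluate the SET expression to produce the destination column.
    return {"c1": tmp_c1 + tmp_c2}


row = apply_column_mapping("3,4")
```

Only c1 reaches the destination table in this model; the temporary names exist just to give the expression something to refer to.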
When creating a table with the OLAP engine, users can specify multiple partition columns.
e.g.:
PARTITION BY RANGE(`date`, `id`)
(
PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"),
PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"),
PARTITION `p201703_all` VALUES LESS THAN ("2017-04-01")
)
Note that loading through a hadoop cluster does not support tables with multiple partition columns.
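To make the range semantics concrete, here is a small Python sketch of how a row could be matched against multi-column upper bounds. It assumes that rows are compared to the bounds column by column (tuple order) and that `id` is an integer column. PARTITIONS and find_partition are illustrative names, and only the two fully specified partitions are modeled, because the text does not say how a missing bound value is filled in.

```python
from bisect import bisect_right

# Exclusive upper bounds copied from the two fully specified partitions above.
PARTITIONS = [
    ("p201701_1000", ("2017-02-01", 1000)),
    ("p201702_2000", ("2017-03-01", 2000)),
]


def find_partition(date, id_):
    """Return the first partition whose upper bound is greater than the key."""
    bounds = [bound for _, bound in PARTITIONS]
    # Tuples compare column by column, which models a multi-column range key.
    # bisect_right skips bounds equal to the key, matching VALUES LESS THAN.
    index = bisect_right(bounds, (date, id_))
    return PARTITIONS[index][0] if index < len(PARTITIONS) else None
```

Under these assumptions the second column only matters when the first column equals the bound: ("2017-01-15", 5000) still lands in p201701_1000, while ("2017-02-01", 1000) moves on to p201702_2000.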
Doris supports deploying multiple BEs on one host. So when allocating BEs for the replicas of
a tablet, we should select different hosts. But there is a bug in the tablet scheduler
that allows the same host to be selected more than once for one tablet. This patch fixes the problem.
There are some places related to this problem:
1. Create Table
There is no bug in Create Table process.
2. Tablet Scheduler
Fixed when selecting a BE for REPLICA_MISSING and REPLICA_RELOCATING.
Fixed when balancing tablets.
3. Colocate Table Balancer
Fixed when selecting a BE for repairing the colocate backend sequence.
Not fixed in colocate group balance. This is left to colocate repairing.
4. Tablet report
Tablet report may add a replica to the catalog. The host is not checked here;
the Tablet Scheduler will fix it.
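The rule behind all of these fixes, that no two replicas of a tablet share a host, can be sketched in a few lines of Python. This is an illustrative model and not the scheduler's actual code: pick_backends is a hypothetical helper and backends are given as (backend_id, host) pairs.

```python
def pick_backends(backends, replica_num):
    """Pick replica_num backends so that no two of them share a host.

    backends is a list of (backend_id, host) pairs.
    """
    chosen, used_hosts = [], set()
    for backend_id, host in backends:
        if host in used_hosts:
            continue  # another replica of this tablet is already on this host
        chosen.append(backend_id)
        used_hosts.add(host)
        if len(chosen) == replica_num:
            return chosen
    raise ValueError("not enough distinct hosts for the requested replicas")


backends = [(1, "host1"), (2, "host1"), (3, "host2"), (4, "host3")]
selected = pick_backends(backends, 3)
```

Here backend 2 is skipped because it lives on the same host as backend 1, even though four backends are available.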
The catalogue of load docs:
---- load-manual.md
---- broker-load-manual.md
---- insert-into-manual.md
---- stream-load-manual.md
This commit also renames max/min_stream_load_timeout to max/min_load_timeout.
Although the old names mention stream_load_timeout, the limits apply to all types of load,
so the config names have been changed.
Currently, GRANT_PRIV can only be granted at the global level, which means
it can only be granted on *.*. Granting it on db.* or db.tbl is not allowed.
This cannot meet the requirement of creating a user who has the privilege
to grant privileges to other users on a specified database or table, such as:
GRANT SELECT_PRIV ON db1.* TO cmy@'%';
So I extended the scope of GRANT_PRIV. GRANT_PRIV can now be granted at the
database or even table level, such as:
GRANT GRANT_PRIV ON db1.* TO cmy@'%';
Once this is granted, the user cmy@'%' can grant GRANT_PRIV on db1.* to
other users.
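One way to picture the extended scope is a check that asks whether the granting user holds GRANT_PRIV at a level that covers the target. The Python sketch below is a model under assumptions, not the real privilege checker: covers and can_grant are hypothetical names, and it assumes that a grant on db.* also covers every table in that database.

```python
def covers(granted_on, target):
    """True if a grant on the pattern granted_on covers the pattern target."""
    g_db, g_tbl = granted_on.split(".")
    t_db, t_tbl = target.split(".")
    # Assumption: "*" at a level covers every name at that level.
    return g_db in ("*", t_db) and g_tbl in ("*", t_tbl)


def can_grant(grants, user, target):
    """A user may grant on target if they hold GRANT_PRIV on a covering level."""
    return any(
        who == user and priv == "GRANT_PRIV" and covers(level, target)
        for who, priv, level in grants
    )


# State after: GRANT GRANT_PRIV ON db1.* TO cmy@'%';
grants = [("cmy@'%'", "GRANT_PRIV", "db1.*")]
```

With this grant in place, can_grant returns True for db1.* and False for db2.* or *.*, which is the narrower delegation the change is meant to allow.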
* This commit introduces streaming mini load
Streaming mini load is operated in the same way as before, and users can still check the load through the frontend.
The difference is that streaming mini load finishes the task before the REST API replies, while non-streaming mini load only registers a load.
* When updating Doris
Updating either FE or BE first is supported. Streaming mini load takes effect once both FE and BE have been updated.
* For multi mini load
Multi mini load still uses non-streaming mini load. The behavior of multi mini load has not been changed.
* Add an interface named isSupportedFunction
This function protects the correctness of new features that span BE and FE while the cluster is being updated.
If there are only 3 backends and the replication num is 3, and one replica of a
tablet is bad, there is no 4th backend for tablet repair. So we need to delete
a bad replica first to make room for the new replica.
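The ordering described above can be sketched as a small planning function. This is a simplified Python model, not the scheduler's implementation: plan_repair and the step names are hypothetical, and cleanup of the bad replica in the case where a spare backend exists is not modeled.

```python
def plan_repair(backend_ids, replicas):
    """Return the ordered repair steps for one tablet.

    replicas maps backend_id -> True (healthy) or False (bad).
    """
    bad = [be for be, healthy in replicas.items() if not healthy]
    if not bad:
        return []
    spare = [be for be in backend_ids if be not in replicas]
    if spare:
        # A free backend exists, so the new replica can be created right away.
        return [("clone_to", spare[0])]
    # Every backend already holds a replica: drop a bad one first to make
    # room, then re-create the replica on the backend that was just freed.
    return [("delete_from", bad[0]), ("clone_to", bad[0])]


steps = plan_repair([1, 2, 3], {1: True, 2: True, 3: False})
```

With 3 backends and 3 replicas the plan starts with a delete; adding a 4th backend to backend_ids turns it into a plain clone.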
This change adds a load property named strict_mode, which is used to reject incorrect data.
When it is set to false, incorrect data is loaded as NULL, just like before.
When it is set to true, incorrect data that belongs to a column without an expr is filtered out.
strict_mode is supported in broker load v2 now. It will be supported in stream load later.
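The two settings can be compared with a short Python model of a single INT destination column. It is a sketch under assumptions: load_value is a hypothetical helper, "incorrect data" is modeled as a value that fails integer conversion, and a column that has an expr is assumed to keep the old NULL behavior even in strict mode.

```python
def load_value(raw, has_expr, strict_mode):
    """Return (keep_row, value) for one INT destination column."""
    try:
        return True, int(raw)
    except ValueError:
        if strict_mode and not has_expr:
            return False, None  # the row is filtered out
        # Non-strict mode, or (by assumption) a column derived from an expr:
        # the incorrect value is loaded as NULL.
        return True, None
```

For example, load_value("abc", False, True) drops the row, while load_value("abc", False, False) keeps it with a NULL value.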
The non-streaming hint of insert into now uses the streaming plan, which is the same as the plan of streaming insert.
It also records the load info and returns the label of the insert stmt.
Partitions are supported in the insert into stmt. Only result rows that fall into the target partitions are loaded.
The introduction of the examples has been changed, especially for non-streaming insert.
Also, the partition_names param has been added to the SQL syntax to declare the target partitions in the target table.
Change META_VERSION to 50
1. get_json_xxx() now supports using quotes to escape dots
2. Implement json_path_prepare() function to preprocess json_path
The running time of get_json_string() on 1000000 rows drops from 2.27s to 0.27s
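Both points can be illustrated with a short Python sketch: a path is split into keys once, with dots inside double quotes kept as part of the key, and the prepared keys are then reused for every row. This is only a model of the idea. prepare_json_path and get_json_value are hypothetical names, double quotes are assumed as the quoting character, and array indexes are not handled.

```python
def prepare_json_path(path):
    """Split a path such as $."k1.a".k2 into keys, keeping quoted dots intact."""
    keys, current, in_quotes = [], "", False
    for ch in path:
        if ch == '"':
            in_quotes = not in_quotes  # quotes delimit a key, they are not part of it
        elif ch == "." and not in_quotes:
            keys.append(current)
            current = ""
        else:
            current += ch
    keys.append(current)
    # Drop the leading "$" that stands for the document root.
    return keys[1:] if keys and keys[0] == "$" else keys


def get_json_value(doc, prepared_keys):
    """Walk a parsed JSON object using keys produced by prepare_json_path."""
    for key in prepared_keys:
        if not isinstance(doc, dict) or key not in doc:
            return None
        doc = doc[key]
    return doc


keys = prepare_json_path('$."k1.a".k2')  # parsed once, reused for every row
rows = [{"k1.a": {"k2": "v1"}}, {"k1.a": {"k2": "v2"}}]
values = [get_json_value(row, keys) for row in rows]
```

Parsing the path outside the per-row loop is the kind of saving that preprocessing aims at; the sketch makes no claim about how much of the measured speedup comes from it.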