When creating a table with the OLAP engine, you can specify multiple partition columns.
e.g.:
PARTITION BY RANGE(`date`, `id`)
(
PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"),
PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"),
PARTITION `p201703_all` VALUES LESS THAN ("2017-04-01")
)
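For reference, a minimal complete CREATE TABLE sketch built around this clause might look like the following (the table name, column definitions, aggregation model and distribution are illustrative, not part of the original change):

-- illustrative multi-column range partition; `date` and `id` must both be key columns
CREATE TABLE example_db.example_tbl
(
    `date` DATE,
    `id` INT,
    `value` BIGINT SUM DEFAULT "0"
)
AGGREGATE KEY(`date`, `id`)
PARTITION BY RANGE(`date`, `id`)
(
    PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"),
    PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"),
    PARTITION `p201703_all` VALUES LESS THAN ("2017-04-01")
)
DISTRIBUTED BY HASH(`id`) BUCKETS 10
PROPERTIES ("replication_num" = "3");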
Note that loading via a Hadoop cluster does not support tables with multiple partition columns.
TabletQuorumFailedException will be thrown in commitTxn when the number of successful replicas of a tablet is less than the quorum replica number.
Hadoop load does not handle this exception, because the push task will be retried later.
Streaming broker load, insert, stream load and mini load will catch this exception and then abort the txn.
Use the same UUID as both the query ID and the load ID of a load execution plan.
Each load execution plan has a load ID and, being a plan, also a query ID.
Using the same UUID for both makes it easier to trace the load process.
Change the load ID when retrying a load execution plan.
When a load execution plan is retried, the load ID should be changed; otherwise BE cannot
distinguish the old and new load requests.
Cancel the running loading task when cancelling the broker load.
When a user cancels a broker load, the running loading task should also be cancelled, or
it may occupy a worker thread for a long time.
Remove the unnecessary query reports when executing a load plan.
Only the last query report is needed.
Add a new BE config tablet_writer_rpc_timeout_sec.
It is used for the RPC of the tablet sink. The default is 600 seconds, which is long enough for flushing
about 6GB of data. The longer timeout reduces the possibility of hitting a "fail to send batch" error when loading.
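As an illustrative be.conf snippet (the value 1200 is just an example, not a recommendation), the timeout could be raised further if large batches still fail:

# be.conf -- tablet sink RPC timeout; the default is 600 seconds
tablet_writer_rpc_timeout_sec = 1200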
Use streaming_load_max_mb instead of mini_load_max_mb in BE config.
Add more logs for tracing a broker load process easily.
The miniLoadBegin function returns the txn_id.
If the backend sends a duplicated request to the frontend, the frontend returns the txn_id that was created by the same mini load.
The issue is that the frontend may return the txn_id when the previous identical request has not yet begun the txn.
In that case the frontend returns zero, the initial txn_id, and the BE cannot execute the load plan with an invalid txn_id.
This commit combines `createLoadJob` and `execute` under the write lock, which protects the atomicity of `create` and `beginTxn`.
As a result, a duplicated request cannot get the txn_id before the previous identical request has finished.
Currently, GRANT_PRIV can only be granted at the global level, which means
it can only be granted on *.*; granting it on db.* or db.tbl is not allowed.
This cannot meet the requirement of creating a user who has the privilege
to grant privileges to other users on a specified database or table, such as:
GRANT SELECT_PRIV ON db1.* TO cmy@'%';
So I extend the scope of GRANT_PRIV. Users can now grant GRANT_PRIV at the
database or even table level, such as:
GRANT GRANT_PRIV ON db1.* TO cmy@'%';
And after being granted, the user cmy@'%' can now grant GRANT_PRIV on db1.* to
other users.
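As a sketch of the resulting workflow (the second user name is a placeholder), cmy@'%' could then delegate privileges within db1 like this:

-- executed by cmy@'%' after receiving GRANT_PRIV on db1.*
GRANT SELECT_PRIV ON db1.* TO other_user@'%';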
NOTE: This patch will modify all of a Backend's data,
which will cause the BE restart to take a very long time.
So when upgrading your production environment,
you should upgrade the Backends one by one.
1. Refactor BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
Naming a rowset with tablet_id and version leads to
many conflicts among compaction, clone, and restore.
3. Extract a rowset interface to encapsulate rowsets
with different formats.
The new class 'AuthorizationInfo' is used to save the auth info in jobs.
The job no longer needs to retrieve the auth info by meta id, which may throw an exception when the db or table has been dropped or renamed.
The persistence of 'AuthorizationInfo' takes effect in META_VERSION 56.
Before the default insert operation was changed to streaming load, if the select result
of an insert statement was empty, a label was still returned to the user, and the user
could use this label to check the insert load job's status.
After the change, if the select result is empty, an exception
is thrown to the user's client directly, without any label.
This new usage pattern is not friendly to existing users, as it forces
them to change how they use the insert operation.
So I add a new FE config 'using_old_load_usage_pattern', which defaults to false.
If set to true, a label will be returned to the user even if the select result is empty.
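A hypothetical fe.conf entry for restoring the old behavior would simply be:

# fe.conf -- return a label even when the INSERT's SELECT result is empty
using_old_load_usage_pattern = true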
* This commit introduces streaming mini load
The usage of streaming mini load is the same as before, and the user can still check the load via the frontend.
The difference is that streaming mini load finishes the task before the REST API replies, while the non-streaming version only registers a load.
* When upgrading Doris
Upgrading either FE or BE first is supported. After both FE and BE are upgraded, streaming mini load takes effect.
* For multi mini load
Multi mini load still uses the non-streaming mini load; its behavior has not changed.
* Add an interface named isSupportedFunction
This function is used to protect the correctness of the new feature, which involves both BE and FE, during the upgrade.
* Enhance usability
1. Add metrics to monitor transactions and the streaming load process in BE.
2. Change the BE config 'result_buffer_cancelled_interval_time' to 300s.
3. Change the FE config 'enable_metric_calculator' to true.
4. Add more logs for tracing the broker load process.
5. Modify the query report process to cancel the query immediately if an instance fails.
* Fix bugs
1. Avoid a NullPointerException when enabling colocation join with broker load
2. Return immediately when the pull load task coordinator fails to execute
1. TaskScheduler will process one task per round
2. TaskScheduler will block until the queue receives a new task
3. TaskScheduler will submit tasks when the queue is empty
4. Add an example of creating a broker table via BOS
5. Change the syntax of the show routine load job statement
* Fix a bug in the listener
* Change txnStateChangeListener to txnStateChangeCallback
* Fix the logic of beforeAborted
1. If the task does not belong to the job, the txn attachment will be set to null.
* The txn will be aborted normally without the attachment.
* The job will not be updated by a task whose attachment is null.
1. Add Config.max_routine_load_concurrent_task_num to replace the old config
2. Fix a bug where SHOW ALTER TABLE COLUMN may throw a NullPointerException
3. Fix some misspellings in the docs
1. Move the lock of the routine load job from inside the txn lock to outside.
2. The process of routine load task commit or abort is as follows:
* lock job
  * check task
  * lock txn
  * commit txn
  * unlock txn
  * commit task
* unlock job
3. Checking for a timed-out txn is skipped when the txn still has a related task.
4. The relationship between a task and its txn is removed when the task times out.
1. Use a data consumer group to share a single stream load pipe among multiple data consumers. This increases the consuming speed of Kafka messages and reduces the number of tasks of a routine load job.
Test results:
* 1 consumer, 1 partition:
consume time: 4.469s, rows: 990140, bytes: 128737139. 221557 rows/s, 28M/s
* 1 consumer, 3 partitions:
consume time: 12.765s, rows: 2000143, bytes: 258631271. 156689 rows/s, 20M/s
blocking get time(us): 12268241, blocking put time(us): 1886431
* 3 consumers, 3 partitions:
consume time(all 3): 6.095s, rows: 2000503, bytes: 258631576. 328220 rows/s, 42M/s
blocking get time(us): 1041639, blocking put time(us): 10356581
The latter two cases show that we can achieve higher speed by adding more consumers, but the bottleneck shifts from the Kafka consumer to Doris ingestion, so 3 consumers in a group is enough.
I also add a Backend config `max_consumer_num_per_group` to change the number of consumers in a data consumer group; the default value is 3.
In my test (1 Backend, 2 tablets, 1 replica), 1 routine load task can achieve 10M/s, which is the same as raw stream load.
2. Add OFFSET_BEGINNING and OFFSET_END support for Kafka routine load
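As a rough sketch of how these offsets could be used when creating a Kafka routine load (the job name, table, broker list and topic are placeholders, and the exact property names may differ between versions):

-- Illustrative only: start consuming from the earliest available offsets;
-- OFFSET_END would instead start from the latest offsets.
CREATE ROUTINE LOAD example_db.kafka_job ON example_tbl
PROPERTIES
(
    "desired_concurrent_number" = "1"
)
FROM KAFKA
(
    "kafka_broker_list" = "broker1:9092",
    "kafka_topic" = "my_topic",
    "property.kafka_default_offsets" = "OFFSET_BEGINNING"
);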
1. Add SHOW PROC "/routine_loads" to show statistics of all jobs and tasks (see the example after this list)
2. Add SHOW PROC "/routine_loads/jobname" to show info about all jobs named jobname
3. Add SHOW PROC "/routine_loads/jobname/jobid" to show the tasks belonging to jobid
4. Fix a bug in allocateBeToTask
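For example (the job name and job id below are placeholders):

SHOW PROC "/routine_loads";
SHOW PROC "/routine_loads/my_job";
SHOW PROC "/routine_loads/my_job/10003";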
1. ShowRoutineLoadStmt works as described in its class comment; it does not support showing all routine load jobs across all dbs
2. ShowRoutineLoadTaskStmt works as described in its class comment; it does not support showing all routine load tasks across all jobs
3. Initialize partitionIdsToOffset in the constructor of KafkaProgress
4. Change Create/Pause/Resume/Stop routine load job statements to take a LabelName such as [db.]name (see the example after this list)
5. Exclude jobs in a final state when updating jobs
6. Catch all exceptions when scheduling one job, so that an exception will not block the other jobs.
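A sketch of the label-style usage for item 4 (the job name is a placeholder, and keyword details may differ slightly between versions):

PAUSE ROUTINE LOAD FOR example_db.kafka_job;
RESUME ROUTINE LOAD FOR example_db.kafka_job;
STOP ROUTINE LOAD FOR example_db.kafka_job;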