This change adds a load property named strict_mode which is used to prohibit the incorrect data.
When it is set to false, the incorrect data will be loaded by NULL just like before.
When it is set to true, the incorrect data which belongs to a column without expr will be filtered.
The strict_mode is supported in broker load v2 now. It will be supported in stream load later.
For example, we start the process for the first time. The pid is 12345. Due to the accident, the process is killed and the fe.pid exists. Then we start the process for the second time. The pid is 6789. The fe.pid shows 67895 , Because file.write only cover the first four digits. This case can happen easily when we use supervise. Then I add the file.setLength(0) and delete the old data.
All FE logs are now with suffix yyyyMMdd or yyyyMMddHH, specified by 2 new
configurations: 'sys_log_roll_interval' and 'audit_log_roll_interval'
All FE logs roll at max size of 1024MB(default), specified by new
configuration: 'log_roll_size_mb'
By default, the new FE logs will look like this:
log/
fe.audit.log
fe.audit.log.20190524-1
fe.audit.log.20190523-2
fe.log
fe.log.20190524-1
fe.log.20190524-2
fe.log.20190523-3
fe.warn.log
fe.warn.log.20190524-1
fe.warn.log.20190523-2
fe.gc.log.20190524
Configurations 'sys_log_roll_mode' and 'audit_log_roll_mode' are deprecated.
The non-streaming hint of insert into will use the streamin plan which is same as the plan of stream insert.
It will also record the load info and return the label of insert stmt.
The partition is supportted in insert into stmt. The result which meet the target partitions will be loaded.
The introduction of example has been changed especially non-streaming insert.
Also, the param of partition_names is added in sql syntax which is used to declare the target partition_names in target table.
Change META_VERSION to 50
A bad tablet is always be chosen to do compaction, and failed again
and again, which may block compaction of other tablets.
Add a BE config 'min_compaction_failure_interval_ms' to avoid choosing
bad tablet again at a certain interval, so that other tablets have
chance to do the compaction.
Also fix a bug that using avg() function on varchar column return
unexpected exception.
1. Print the last few tablets of decommission backend in fe.log for debug.
2. OlapTableSink should get replica on alive Backends, not only available Backends.
3. When decommission multi Backends, we should drop the redundant replicas before creating a new one.
4. Replicas on decommissioning Backends should be not added to catalog again.
5. Decommissioning Backends should not be chosen as destination of tablet repairing.
* Change label of broker load txn
1. put broker load label into txn label
2. fix the bug of `label is already used`
3. fix partition error of new broker load
* Fix count error in mini load and broker load
There are three params (num_rows_load_total, num_rows_load_filtered, num_rows_load_unselected) which are used to count dpp.norm.ALL and dpp.abnorm.ALL.
num_rows_load_total is the number rows of source file.
num_rows_load_unselected is the not satisfied (where conjuncts) rows of num_rows_load_total
num_rows_load_filtered is the rows (quality not good enough) of (num_rows_load_total-num_rows_load_unselected)
This change include the show load of loadv2 and some bug fix of loadv2.
Firstly, the show load will perform both load and loadv2 info. According to loadv2, the ETL progress of loadv2 is N/A during the period of loading.
Secondly, the loadv2 will be created when version of property is v2.
This is a temporary property which will not influence the old broker load.
After the loadv2 is finished, the default load will be changed to loadv2.
Finally, there are some bug in LoadingTaskPlanner fixed by this change.
* Enhence the usabilities
1. Add metrics to monitor transactions and steaming load process in BE.
2. Modify BE config 'result_buffer_cancelled_interval_time' to 300s.
3. Modify FE config 'enable_metric_calculator' to true.
4. Add more log for tracing broker load process.
5. Modify the query report process, to cancel query immediately if some instance failed.
* Fix bugs
1. Avoid NullPointer when enabling colocation join with broker load
2. Return immediately when pull load task coordinator execution failed
* Add new scheduler of load in fe
1. New scheduler only support the broker load now.
2. The stage of load consist of PENDING -> LOADING -> FINISHED
3. The LoadScheduler will divide job into a pending task. There are preparations that need to be done on pending task.
4. OnPendingTaskFinished will be invoked after pending task. It is used to submit the loading task which is created based on attachment of pending task.
5. OnLoadingTaskFinished will be invoked after loding task. It is used to record the commit info and commit txn when all of task has been finished.
.
* Combine pendingTask and loadingTask into loadTask
1. The load task callback include two methods: onTaskFinished, onTaskFailed
* Add txn callback of load job
1. isCommittting is true when beforeCommitted in txn
2. job could not be cancelled when isCommitting is true
3. job will be finished after txn is visible
4. old job will be cleaned when (CurrentTS - FinishedTs) / 1000 > Config.label_keep_seconds
5. LoadTimeoutChecker is performed to cancel timeout job