1. Unify the Thrift RPC timeout from BE to FE.
Add a BE config 'thrift_rpc_timeout_ms'; the default is 5000.
2. Add hostname to the result of the "show proc '/frontends';" stmt.
3. Fix a lock order bug in Load.java
1. Improve the error message of the tablet scheduler.
2. Fix missing helper node info when modifying Frontends.
3. Fix a bug where the OLAP tablet's header lock is not released.
* Remove build rows counter in PartitionHashJoinNode
* Fix unit test failure in RuntimeProfileTest
* Add check for result type length in cast_to_string_val
There is a non-negligible cost to convert VectorRowBatch to RowBatch.
When we seek a block, we only read one row from the engine to minimize
this conversion cost.
This patch can reduce the time of some queries from 5s to 2s.
There are three replicas, A, B and C, of one tablet.
A has version 0 - 10.
B has versions 0 - 5, 6, 7, 9, 10.
1. B has missing versions, so it clones 0 - 10 from A and removes the overlapped versions from its header.
2. Coincidentally, 6 is the version of a delete predicate (delete where day = 20181221).
When removing the overlapped versions, version 6 is removed but its delete predicate is not.
3. Unfortunately, the 0 - 10 version cloned from A contains data for 20181221.
4. B performs compaction, and the data for 20181221 is wrongly removed.
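
The scenario above can be sketched in a few lines of Python. This is only an illustration, assuming the intended behavior is to drop a delete predicate together with the overlapped version it belongs to. The names Header and apply_clone are hypothetical and do not come from the BE code.

```python
from dataclasses import dataclass, field


@dataclass
class Header:
    # Hypothetical stand-in for a tablet header.
    # versions: (start, end) ranges held by the replica.
    versions: list = field(default_factory=list)
    # delete_predicates: version number -> predicate text.
    delete_predicates: dict = field(default_factory=dict)


def apply_clone(header, cloned):
    """Install a cloned version range and drop everything it overlaps,
    including delete predicates whose version falls inside the range."""
    start, end = cloned
    kept = [(s, e) for (s, e) in header.versions if e < start or s > end]
    header.versions = sorted(kept + [cloned])
    header.delete_predicates = {
        v: p for v, p in header.delete_predicates.items()
        if not (start <= v <= end)
    }
    return header


# Replica B from the scenario: version 8 is missing, version 6 is a delete.
b = Header(
    versions=[(0, 5), (6, 6), (7, 7), (9, 9), (10, 10)],
    delete_predicates={6: "day = 20181221"},
)
apply_clone(b, (0, 10))
print(b.versions)           # [(0, 10)]
print(b.delete_predicates)  # {}
```

If the predicate for version 6 stayed in the header, a later compaction would apply it to the cloned 0 - 10 data and discard the rows for 20181221.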
Record query consumption in the FE audit log. It works as follows: one instance of each parent plan is responsible for accumulating the consumption of its sub plans and sending the total to its own parent. The BE coordinator gets the total consumption because it is a single instance.
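
The bottom-up accumulation can be pictured with a small Python sketch. It is a simplification under stated assumptions: each plan is modelled as a single node with one cost number, and PlanNode and total_consumption are hypothetical names, not part of FE or BE.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    # Hypothetical model: one node per plan, with its own consumption.
    name: str
    own_cost: int
    children: list = field(default_factory=list)


def total_consumption(node):
    """Add the node's own cost to the totals reported by its sub plans."""
    return node.own_cost + sum(total_consumption(c) for c in node.children)


root = PlanNode("coordinator", 1, [
    PlanNode("join", 10, [
        PlanNode("scan_a", 100),
        PlanNode("scan_b", 200),
    ]),
])
print(total_consumption(root))  # 311
```

In the real system each plan runs as several instances and only one of them forwards the sum, which this sketch does not model.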
Help to locate big queries when the system is overloaded, by checking the consumption of the running parts of all current queries or of one specified query. It works as follows: first, trigger BE to report RuntimeProfiles and wait a moment; second, calculate the consumption from the RuntimeProfiles reported by BE. The consumption it reports is the cost of the ExecNodes running in the query at the time it is called.
1. Add broker load error hub
A broker load error hub collects error messages during the load process and saves them as a file to the specified remote storage via broker. This is needed because in broker/mini/streaming load, the user may not be able to access the error log file on the Backend directly.
We also add a new header option, 'enable_hub', to the streaming load request; the default is false. Enabling the broker load error hub significantly slows down streaming load, because it has to visit remote storage via broker. So users can disable the error hub with this header option to avoid slowing down the load.
2. Show load error logs using the SHOW LOAD WARNINGS stmt
We also provide an easier way to get load error logs. We implement the 'SHOW LOAD WARNINGS ON 'url'' stmt to show load error logs directly. The 'url' in the stmt is provided in the result of the 'SHOW LOAD' stmt.
eg:
show load warnings on "http://192.168.1.1:8040/api/_load_error_log?file=__shard_2/error_log_xxx";
3. Support now() function in broker load
Users can map a column to now() in a broker load stmt, which means this column will be filled with the time when the ETL started.
4. Support more types of wildcard in broker load
Previously, we only supported the wildcard '*' to match file names. A wildcard like '/path/to/20190[1-4]*' was not supported.
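
To show what a pattern such as '/path/to/20190[1-4]*' is expected to select, here is a small sketch using Python's standard fnmatch module. It is only a stand-in for illustration: the broker's own matcher may differ in details (for example, fnmatch lets '*' match '/' as well), and the file names are made up.

```python
from fnmatch import fnmatchcase

pattern = "/path/to/20190[1-4]*"

# Hypothetical file names, used only to illustrate the pattern.
files = [
    "/path/to/201901.csv",
    "/path/to/201904_part2.csv",
    "/path/to/201905.csv",
    "/path/to/201812.csv",
]

# '[1-4]' matches exactly one character in the range 1..4,
# and '*' matches any remaining characters.
matched = [f for f in files if fnmatchcase(f, pattern)]
print(matched)
# ['/path/to/201901.csv', '/path/to/201904_part2.csv']
```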
1. It is wrong to use _tablet_map_lock to protect the critical region in the get_tablet_stat function.
Add a _tablet_stat_mutex to protect the critical region.
2. When base_compaction finishes, it checks whether any version is missing in the tablet.
If so, BE will core dump. Now check the tablet's integrity in advance.
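
A check for missing versions can be sketched as a scan over the sorted version ranges. This is a minimal illustration, assuming versions are inclusive (start, end) ranges that should cover every number from 0 upward; find_missing_versions is a hypothetical name, not the BE function.

```python
def find_missing_versions(ranges):
    """Return the gaps, as inclusive (start, end) ranges, in a list of
    inclusive version ranges that is expected to start at version 0."""
    missing = []
    expected = 0  # next version number that must be covered
    for start, end in sorted(ranges):
        if start > expected:
            missing.append((expected, start - 1))
        expected = max(expected, end + 1)
    return missing


print(find_missing_versions([(0, 5), (6, 6), (7, 7), (9, 9), (10, 10)]))
# [(8, 8)]
print(find_missing_versions([(0, 10)]))
# []
```

Running such a check before compaction starts lets the caller skip or fail the task cleanly instead of discovering the gap afterwards.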
1. Print broker address for debug.
2. Do not let a backup job be cancelled if it is already in state UPLOAD_INFO.
3. Cancel task on Backends when job is cancelled.
4. Show detailed progress of backup and restore jobs.
5. Make 'show snapshot' result more readable.
6. Change upload and download thread num of backup and restore in Backend to 1.
* Add UserFunctionCache to cache UDF's library
This patch replaces LibCache with UserFunctionCache. LibCache uses an
HDFS URL to identify a UDF's library, and when the BE process restarts,
all downloaded libraries have to be loaded again. We now use the
function id to identify a library, so when the process restarts, all
downloaded libraries can be loaded without downloading them again.
* update