* [Refactor] Refactor DeleteHandler and Cond module (#4925)
This patch mainly does the following refactors:
- Use int64_t instead of int32_t for 'version' in DeleteHandler
- Move some comments from .cpp files to .h files, add some new comments in .h files, and remove some meaningless comments
- Use switch...case... instead of multiple if...else... for DeleteConditionHandler::is_condition_value_valid (see the sketch after this list)
- Use range-based for loops to simplify code
- Reduce some comparison operations in Cond::del_eval
- Improve some branch predictions in Reader
- Fix and improve some unit tests
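A minimal sketch of the switch-based validation mentioned above, assuming illustrative field types and checks; this is not the actual DeleteConditionHandler code.
```cpp
#include <string>

// Illustrative field types; the real code validates a delete condition's
// value against the column's actual type.
enum class FieldType { BOOL, TINYINT, INT, BIGINT, DATE, VARCHAR };

bool is_condition_value_valid(FieldType type, const std::string& value) {
    switch (type) {
    case FieldType::BOOL:
        return value == "true" || value == "false" || value == "0" || value == "1";
    case FieldType::TINYINT:
    case FieldType::INT:
    case FieldType::BIGINT:
        // Integer-like types share one branch instead of repeated if...else.
        return !value.empty() &&
               value.find_first_not_of("-0123456789") == std::string::npos;
    case FieldType::DATE:
        return value.size() == 10 && value[4] == '-' && value[7] == '-';
    case FieldType::VARCHAR:
        return true;  // any string value is acceptable
    }
    return false;  // defensive: unreachable for well-formed enum values
}
```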
'_task_worker_type' is not yet initialized when it is used to initialize '_name',
so '_name' is always 'TaskWorkerPool.CREATE_TABLE'. This patch fixes this bug.
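A minimal sketch of the initialization-order pitfall behind this kind of bug, with illustrative class and member names rather than the actual TaskWorkerPool code.
```cpp
#include <string>

// C++ initializes data members in declaration order, regardless of the
// order in the constructor's initializer list. In the buggy layout below,
// _name is declared before _task_worker_type, so its initializer reads
// _task_worker_type before the constructor has set it.
class BuggyWorkerPool {
public:
    explicit BuggyWorkerPool(int type)
            : _task_worker_type(type),
              _name("TaskWorkerPool." + std::to_string(_task_worker_type)) {}

private:
    std::string _name;      // initialized FIRST, reads an unset member
    int _task_worker_type;  // initialized second
};

// Fixed: declare (and therefore initialize) the type before the name, or
// build _name in the constructor body after _task_worker_type is assigned.
class FixedWorkerPool {
public:
    explicit FixedWorkerPool(int type)
            : _task_worker_type(type),
              _name("TaskWorkerPool." + std::to_string(_task_worker_type)) {}

private:
    int _task_worker_type;  // initialized first
    std::string _name;      // safe: uses the already-initialized member
};
```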
1. Add metadata table 'statistics' to store index information;
2. In the header information returned via the MySQL protocol, the data type length is returned according to the actual type.
The return type of str_to_date depends on whether the time part is included in the format.
If included, it is DATETIME, otherwise it is DATE.
If the format parameter is not constant, the return type will be DATETIME.
This judgment has already been made in the FE query planning stage,
so here we directly set the value type to the return type chosen in the query plan.
For example:
A table with one varchar column k1, containing 2 rows:
"%Y-%m-%d"
"%Y-%m-%d %H:%i:%s"
Query:
SELECT str_to_date("2020-09-01", k1) from tbl;
Result will be:
2020-09-01 00:00:00
2020-09-01 00:00:00
Query:
SELECT str_to_date("2020-09-01", "%Y-%m-%d");
Return type is DATE
Query:
SELECT str_to_date("2020-09-01", "%Y-%m-%d %H:%i:%s");
Return type is DATETIME
_tablets_under_clone in TabletManager is not sharded, but the lock
used to prevent concurrent access to it is sharded, so when the shard size
is not 1 it will cause a coredump.
This patch fixes this bug, and also refactors the code to make shard
locks more convenient to use (see the sketch below).
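A minimal sketch of the failure mode and one possible fix, assuming a simplified sharded-lock layout; the names and shard count are illustrative, not the actual TabletManager code.
```cpp
#include <array>
#include <cstdint>
#include <mutex>
#include <set>

constexpr size_t kShardNum = 4;

// Buggy layout: the locks are sharded by tablet id, but the set itself is a
// single shared container. Two threads working on tablets that hash to
// different shards take different mutexes, yet still mutate the same
// std::set concurrently -> data race / coredump when the shard size != 1.
std::array<std::mutex, kShardNum> g_shard_locks;
std::set<int64_t> g_tablets_under_clone;  // NOT sharded

void add_tablet_under_clone_buggy(int64_t tablet_id) {
    std::lock_guard<std::mutex> guard(
            g_shard_locks[static_cast<size_t>(tablet_id) % kShardNum]);
    g_tablets_under_clone.insert(tablet_id);  // unsafe when kShardNum > 1
}

// One possible fix: shard the container together with its lock, so each
// mutex guards exactly the data it is paired with.
struct Shard {
    std::mutex lock;
    std::set<int64_t> tablets_under_clone;
};
std::array<Shard, kShardNum> g_shards;

void add_tablet_under_clone_fixed(int64_t tablet_id) {
    Shard& shard = g_shards[static_cast<size_t>(tablet_id) % kShardNum];
    std::lock_guard<std::mutex> guard(shard.lock);
    shard.tablets_under_clone.insert(tablet_id);
}
```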
There is no clear instruction on how to manually modify partitions when the dynamic partition feature is enabled;
the user is only informed after trying to modify a partition on the command line.
This PR adds instructions for converting dynamic and manual partition tables to each other.
Since the plan is retained in the task, if the task is not cleaned up the memory usage will be too large, causing a memory leak or OOM.
When a load job is finished, there is no need to keep holding its tasks, which are the biggest memory consumers.
Fixed #4992
* Optimized the read performance of tables with multiple versions by changing the merge method of the unique table:
the cumulative version data is merged first, and the result is then merged with the base version.
For data with only one base version, read directly without merging (a sketch of the idea follows).
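A minimal sketch of the two-phase merge idea on a unique-key table, using a deliberately simplified rowset model; this is not the actual Doris reader code.
```cpp
#include <map>
#include <string>
#include <vector>

// Illustrative model: a "rowset" is a key -> value map; on a unique-key
// table, a newer version's value replaces an older one for the same key.
using Rowset = std::map<std::string, std::string>;

// Merge a list of rowsets ordered from oldest to newest.
Rowset merge_rowsets(const std::vector<Rowset>& rowsets) {
    Rowset merged;
    for (const Rowset& rs : rowsets) {
        for (const auto& kv : rs) {
            merged[kv.first] = kv.second;  // newer version wins
        }
    }
    return merged;
}

Rowset read_unique_table(const Rowset& base, const std::vector<Rowset>& cumulative) {
    // Only a base version: read it directly, no merge needed.
    if (cumulative.empty()) {
        return base;
    }
    // Phase 1: merge the (usually small) cumulative versions first.
    Rowset merged_cumulative = merge_rowsets(cumulative);
    // Phase 2: merge that single result with the (large) base version.
    return merge_rowsets({base, merged_cumulative});
}
```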
This CL fixes 2 bugs:
1. When compaction fails, we must explicitly delete the output rowset, otherwise the GC logic cannot process these rowsets.
2. Base compaction failed when the compaction process included a delete version in SegmentV2, because the number of filtered rows was wrong.
The current compaction mechanism is that a producer thread keeps producing compaction tasks,
and the selected tablet must apply for `permits`.
When a tablet can hold the `permits`, the compaction task for this tablet is submitted to the thread pool.
We take the compaction score as `permits`, which is used for limiting memory consumption.
However, `pick_rowset_to_compaction()` is executed before the file merge in the compaction thread,
and the number of segment files that actually perform the merge operation is smaller than the compaction score.
In addition, it is also possible that the compaction task exits directly because the tablet doesn't meet
the requirements of compaction.
This patch optimizes and refactors the compaction code so that we can execute 'pick rowsets'
before applying for permits for a compaction task, calculate the number of segment files that actually
participate in the merge operation, and take this number as `permits` (see the sketch below).
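A minimal sketch of the reworked flow, assuming simplified placeholder types for tablets and rowsets; this is not the actual Doris compaction API.
```cpp
#include <cstdint>
#include <vector>

// Illustrative placeholder types; not the actual Doris compaction API.
struct Rowset {
    int64_t num_segments = 0;
};

struct Tablet {
    // Rowsets that a hypothetical picking policy has decided to compact.
    std::vector<Rowset> candidate_rowsets;
};

// Reworked flow: pick the rowsets to merge *before* applying for permits,
// and use the number of segment files that will actually be merged as the
// permits, instead of the tablet's compaction score (which over-estimates
// the real work).
int64_t calc_compaction_permits(const Tablet& tablet, std::vector<Rowset>* picked) {
    *picked = tablet.candidate_rowsets;  // stand-in for "pick rowsets"
    if (picked->empty()) {
        // The tablet doesn't meet the compaction requirements: the task can
        // exit directly without ever holding any permits.
        return 0;
    }
    int64_t permits = 0;
    for (const Rowset& rs : *picked) {
        permits += rs.num_segments;  // real merge work, not the score
    }
    return permits;
}
```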
1. Add metrics for `used permits` and `waiting permits` of compaction.
It is useful to monitor the `permits` held by all executing compaction tasks and by waiting compaction tasks.
2. Add a log, which can be enabled by config, for merging rowsets.
It is helpful to track the progress of rowset merging for compaction tasks that last a long time.
1. Random().nextInt() may return a negative value, which would result in `java.lang.ArrayIndexOutOfBoundsException`;
passing a positive bound avoids this problem.
```
int seed = new Random().nextInt(Short.MAX_VALUE) % nodesInfo.size();
```
2. `EsNodeInfo[] nodeInfos = (EsNodeInfo[]) nodesInfo.values().toArray()` may lead to `java.lang.ClassCastException` in some JDK versions: `[Ljava.lang.Object; cannot be cast to [Lorg.apache.doris.external.elasticsearch.EsNodeInfo`. Passing the original class type resolves this.
```
EsNodeInfo[] nodeInfos = nodesInfo.values().toArray(new EsNodeInfo[0]);
```
When a parquet file contains a `Map/List/Struct` structure, Doris cannot recognize the column correctly
and throws the exception 'Invalid column: xxxx', which means Doris cannot find the column.
The `Map` structure will be recognized as two columns: `key` and `value`.
The following is the schema of a parquet file as recognized by Doris. This patch tries to solve this problem.
When a user tries to load parquet files into Doris from a path like `hdfs://hadoop/user/data/date=20201024/*`,
but the path actually contains some non-parquet files, the error
`Couldn't deserialize thrift: No more data to read.\\nDeserializing page header failed.` is thrown.
If the error message included the file name, we could quickly locate the errors.
Therefore, this patch tries to add the file name to the error message.
1. Support modifying column type from CHAR to TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE/DATE,
and converting TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE to a wider range of numeric types (#4937)
2. Use templates to refactor the code of types.h and schema_change.cpp and delete redundant code (a sketch of the idea follows).
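A minimal sketch of how a single template can replace per-type conversion functions; the names below are illustrative and not the actual types.h / schema_change.cpp code.
```cpp
#include <cstdint>
#include <type_traits>

// One generic converter covers every "numeric -> wider numeric" pair, so
// the schema change code no longer needs a hand-written function for each
// (from, to) combination.
template <typename From, typename To>
To convert_numeric(From value) {
    static_assert(std::is_arithmetic<From>::value && std::is_arithmetic<To>::value,
                  "numeric types only");
    static_assert(std::is_floating_point<To>::value || sizeof(To) >= sizeof(From),
                  "target type must be at least as wide as the source");
    return static_cast<To>(value);
}

void example() {
    int32_t a = convert_numeric<int8_t, int32_t>(42);   // TINYINT -> INT
    int64_t b = convert_numeric<int32_t, int64_t>(a);   // INT -> BIGINT
    double c = convert_numeric<int64_t, double>(b);     // BIGINT -> DOUBLE
    (void)c;
}
```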
When a tablet selects which replica's host will execute the scan operation,
it takes a `round-robin` strategy for load balancing. `minAssignedBytes` is the current load of one host.
If a backend is momentarily not alive, one of the other replicas is randomly taken as the choice,
but the dead backend's `minAssignedBytes` is not decreased and the new choice's `minAssignedBytes`
is not increased either. That makes the recorded load of the backends incorrect.
All columns created in an inline view are set with `allowNull = false`, which causes `NULL` data in a CTE to be ignored during processing.
So we should set the inline view's columns to allowNull to ensure the correctness of the query.
This is a minor issue when FE starts after a fresh installation:
an error about the missing log directory occurs, because the log directory does not exist yet
when some environment check messages are written to the log file.
The log directory creation code in bin/start_fe.sh is in the wrong place;
it only needs to be moved to the beginning of the script.
This CL mainly changes:
1. Avoid repeated sending of common components in Fragments
In the previous implementation, a query may generate multiple Fragments,
and these Fragments contain some common information, such as the DescriptorTable.
Fragments are sent to BE in a certain order, so this common information was sent repeatedly
and regenerated repeatedly on the BE side.
For some complex SQL, this common information may be very large,
thereby increasing the execution time of the Fragments.
So I improved this: for multiple Fragments sent to the same BE, only the first Fragment carries
this common information; it is cached on the BE side, and subsequent Fragments
no longer need to carry it (a sketch of the idea follows item 2).
In a local test, the execution time of some complex SQL can be reduced from 3 seconds to 1 second.
2. Add the time-consuming parts of the FE logic to the Profile,
including SQL analysis, planning, Fragment scheduling and sending on the FE side, and the time to fetch data.
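A minimal sketch of the BE-side caching idea from item 1, assuming simplified placeholder types; the query-id keyed map below is illustrative, not the actual FragmentMgr implementation.
```cpp
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Illustrative stand-ins for the components shared by a query's fragments.
struct DescriptorTable {};

struct SharedQueryState {
    DescriptorTable desc_tbl;
    // ... other components common to all fragments of one query
};

// Only the first fragment of a query carries the shared components; later
// fragments of the same query just look them up by query id.
class SharedStateCache {
public:
    std::shared_ptr<SharedQueryState> get_or_create(
            const std::string& query_id,
            const DescriptorTable* carried /* null if not carried */) {
        std::lock_guard<std::mutex> guard(_lock);
        auto it = _states.find(query_id);
        if (it != _states.end()) {
            return it->second;  // reuse what the first fragment brought
        }
        auto state = std::make_shared<SharedQueryState>();
        if (carried != nullptr) {
            state->desc_tbl = *carried;
        }
        _states[query_id] = state;
        return state;
    }

    void erase(const std::string& query_id) {
        std::lock_guard<std::mutex> guard(_lock);
        _states.erase(query_id);  // drop when the query finishes
    }

private:
    std::mutex _lock;
    std::unordered_map<std::string, std::shared_ptr<SharedQueryState>> _states;
};
```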