When we use Spark load from a Hive table, the function loadDataFromHiveTable
reads the whole Hive table and then filters the data in process().
If the Hive table has many partitions and a lot of historical data, the load costs too much time and resources.
So we can do the filtering in loadDataFromHiveTable when reading from the Hive table.
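A minimal sketch of the idea, assuming hypothetical names (`dbName`, `tableName`, `filterExpr`) rather than the actual Spark DPP code: apply the load's filter expression while reading the Hive table, so Spark can prune partitions instead of scanning everything and filtering later.
```
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class HiveLoadFilterSketch {
    // Read only the rows (and partitions) that the load actually needs.
    public static Dataset<Row> loadDataFromHiveTable(SparkSession spark,
                                                     String dbName,
                                                     String tableName,
                                                     String filterExpr) {
        Dataset<Row> df = spark.table(dbName + "." + tableName);
        if (filterExpr != null && !filterExpr.isEmpty()) {
            // Pushing the predicate here lets Spark prune Hive partitions,
            // instead of reading the whole table and filtering in process().
            df = df.filter(filterExpr);
        }
        return df;
    }
}
```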
Co-authored-by: 杜安明 <anming.du@mihoyo.com>
To fix #4977, the master FE returns the queryId when the query finishes, so that a non-master FE can audit it (#4978).
But when the query fails (e.g. times out), the client may not receive the right queryId for auditing.
In this PR:
The non-master FE sends the queryId to the master when forwarding the query;
Add more logs.
issue: #5031
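A minimal sketch of the idea, using hypothetical names (`ForwardRequest`, `setQueryId`) rather than the actual FE classes: the non-master FE generates the queryId up front and attaches it to the request it forwards to the master, so the same id is available for auditing even if the query later fails or times out.
```
import java.util.UUID;

public class ForwardQueryIdSketch {
    // Hypothetical request object standing in for the real forward RPC payload.
    static class ForwardRequest {
        String sql;
        String queryId;
        void setQueryId(String id) { this.queryId = id; }
    }

    // On the non-master FE: create the queryId before forwarding, so the audit
    // log can always record it, even when no result ever comes back.
    static ForwardRequest buildForwardRequest(String sql) {
        ForwardRequest req = new ForwardRequest();
        req.sql = sql;
        req.setQueryId(UUID.randomUUID().toString());
        return req;
    }
}
```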
1. Support ODBC sink for inserting data into ODBC external tables.
2. Support transactions for the ODBC sink to make sure the inserted data is atomic.
3. The documentation about the ODBC sink has been updated.
The close method of OlapTabletSink may be called twice.
In the open_internal() method of plan_fragment_executor, close is called once.
If an error occurs in that call, close will be called again in fragment_mgr.
So here we use a flag to prevent repeated close operations.
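A conceptual sketch of the guard-flag pattern in Java (the actual sink is C++; `closed` and `doClose()` are placeholders): the first call performs the real close, every later call returns immediately.
```
import java.util.concurrent.atomic.AtomicBoolean;

public class CloseOnceSketch {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    // Safe to call from both the executor path and the fragment manager:
    // only the first caller actually performs the close work.
    public void close() {
        if (!closed.compareAndSet(false, true)) {
            return; // already closed
        }
        doClose();
    }

    private void doClose() {
        // release channels, flush buffers, etc. (placeholder)
    }
}
```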
Co-authored-by: morningman <chenmingyu@baidu.com>
* [Refactor] Refactor DeleteHandler and Cond module (#4925)
This patch mainly does the following refactors:
- Use int64_t instead of int32_t for 'version' in DeleteHandler
- Move some comments from .cpp to .h files, add some new comments in .h files, and remove some meaningless comments
- Use switch...case... instead of multiple if..else.. for DeleteConditionHandler::is_condition_value_valid (see the sketch after this list)
- Use range loop to simplify code
- Reduce some compare operations in Cond::del_eval
- Improve some branch predictions in Reader
- Fix and improve some unit tests
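A minimal Java sketch of the if-else-to-switch refactor mentioned above (the real code is C++ and the type/value checks here are placeholders): dispatching on an enum with switch...case is clearer and easier to check for missing cases than a chain of equality comparisons.
```
public class ConditionValueCheckSketch {
    enum FieldType { TINYINT, INT, BIGINT, VARCHAR, DATE }

    // Before: if (type == FieldType.TINYINT) {...} else if (type == FieldType.INT) {...} ...
    // After: a single switch over the enum.
    static boolean isConditionValueValid(FieldType type, String value) {
        switch (type) {
            case TINYINT:
            case INT:
            case BIGINT:
                return value.matches("-?\\d+");   // numeric literal
            case DATE:
                return value.matches("\\d{4}-\\d{2}-\\d{2}");
            case VARCHAR:
                return true;                       // any string is acceptable
            default:
                return false;
        }
    }
}
```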
'_task_worker_type' is not yet initialized when it is used to initialize '_name',
so '_name' is always 'TaskWorkerPool.CREATE_TABLE'. This patch fixes
this bug.
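A conceptual Java sketch of the fix (the real pool is C++; the class and most enum values here are placeholders): assign the worker type before deriving the name from it, otherwise the name is built from a default value and every pool ends up with the same name.
```
public class TaskWorkerPoolSketch {
    enum TaskWorkerType { CREATE_TABLE, DROP_TABLE, CLONE, PUSH }

    private final TaskWorkerType taskWorkerType;
    private final String name;

    TaskWorkerPoolSketch(TaskWorkerType type) {
        // The type must be set before the name is derived from it;
        // deriving the name first is what produced the constant
        // "TaskWorkerPool.CREATE_TABLE" name for every pool.
        this.taskWorkerType = type;
        this.name = "TaskWorkerPool." + this.taskWorkerType.name();
    }

    String getName() { return name; }
}
```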
1. Add the metadata table 'statistics' to store index information;
2. In the header information returned over the MySQL protocol, the data type length is returned according to the actual type.
The return type of str_to_date depends on whether the time part is included in the format.
If included, it is DATETIME, otherwise it is DATE.
If the format parameter is not constant, the return type will be DATETIME.
This judgment has already been made in the FE query planning stage,
so here we directly set the value type to the return type set in the query plan.
For example:
A table tbl with one varchar column k1 contains 2 rows:
"%Y-%m-%d"
"%Y-%m-%d %H:%i:%s"
Query:
SELECT str_to_date("2020-09-01", k1) from tbl;
Result will be:
2020-09-01 00:00:00
2020-09-01 00:00:00
Query:
SELECT str_to_date("2020-09-01", "%Y-%m-%d");
Return type is DATE
Query:
SELECT str_to_date("2020-09-01", "%Y-%m-%d %H:%i:%s");
Return type is DATETIME
_tablets_under_clone in TabletManager is not sharded, but the lock
used to prevent concurrent access is sharded, so when the number of shards
is not 1, it can cause a coredump.
This patch fixes this bug, and also does some refactoring to make shard
locks more convenient to use.
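A conceptual Java sketch of the problem and the shard-lock pattern (the real manager is C++; names are placeholders): each tablet id hashes to one lock in an array, which only protects state that is sharded the same way; a single shared set must use one dedicated lock, not the per-shard locks.
```
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ShardLockSketch {
    private static final int SHARD_NUM = 32;
    private final ReentrantReadWriteLock[] shardLocks = new ReentrantReadWriteLock[SHARD_NUM];

    // Not sharded: guarded by its own lock, NOT by shardLocks, otherwise two
    // tablets in different shards could mutate it concurrently.
    private final Set<Long> tabletsUnderClone = new HashSet<>();
    private final Object cloneLock = new Object();

    public ShardLockSketch() {
        for (int i = 0; i < SHARD_NUM; i++) {
            shardLocks[i] = new ReentrantReadWriteLock();
        }
    }

    // Per-tablet state uses the lock of the tablet's shard.
    ReentrantReadWriteLock lockForTablet(long tabletId) {
        return shardLocks[(Long.hashCode(tabletId) & 0x7fffffff) % SHARD_NUM];
    }

    void markUnderClone(long tabletId) {
        synchronized (cloneLock) {
            tabletsUnderClone.add(tabletId);
        }
    }
}
```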
There is no clear instruction on how to manually modify partitions when the dynamic partition feature is enabled.
The user is informed only after trying to modify a partition on the command line.
This PR adds instructions for converting between dynamic and manual partition tables.
Since the plan is retained in the task, if the task is not cleaned up, the memory usage can grow too large and cause a memory leak or OOM.
When a load job is finished, there is no need to hold the tasks, which are the biggest memory consumers.
Fixed #4992
* Optimized the read performance of a table when it has multiple versions:
changed the merge method of the unique table to
merge the cumulative version data first, and then merge the result with the base version.
For data with only one base version, read it directly without merging.
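A conceptual Java sketch of the two-level merge described above (the real read path is C++; the row model and method names are placeholders): cumulative versions are merged among themselves first, only the merged result is then combined with the base version, and a lone base version is read directly.
```
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class TieredMergeSketch {
    // A "version" is sketched as rows of (key, value); later versions win on key conflicts,
    // which mirrors the unique table's replace semantics.
    record Row(String key, String value) {}

    static List<Row> merge(List<List<Row>> versions) {
        TreeMap<String, String> merged = new TreeMap<>();
        for (List<Row> version : versions) {           // versions ordered old -> new
            for (Row row : version) {
                merged.put(row.key(), row.value());    // newer value replaces older
            }
        }
        List<Row> result = new ArrayList<>();
        merged.forEach((k, v) -> result.add(new Row(k, v)));
        return result;
    }

    static List<Row> read(List<Row> base, List<List<Row>> cumulatives) {
        if (cumulatives.isEmpty()) {
            return base;                               // lone base version: no merge at all
        }
        // Merge the (usually small) cumulative versions first,
        // then merge that single result with the (usually large) base version once.
        List<Row> mergedCumulative = merge(cumulatives);
        return merge(List.of(base, mergedCumulative));
    }
}
```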
This CL fixes 2 bugs:
1.
When a compaction fails, we must explicitly delete the output rowset,
otherwise the GC logic cannot process these rowsets.
2.
Base compaction fails if the compaction process includes some delete version in SegmentV2,
because the number of filtered rows is wrong.
The current compaction mechanism has a producer thread that keeps producing compaction tasks,
and the selected tablet must apply for `permits`.
When a tablet can hold the `permits`, the compaction task for this tablet is submitted to the thread pool.
We take the compaction score as `permits`, which is used to limit memory consumption.
However, `pick_rowset_to_compaction()` is only executed in the compaction thread right before the file merge,
and the number of segment files that actually take part in the merge operation is smaller than the compaction score.
In addition, it is also possible that a compaction task exits directly because the tablet doesn't meet
the requirements of compaction.
This patch optimizes and refactors the compaction code so that 'pick rowsets' is executed
before applying for permits for a compaction task, the number of segment files that actually
participate in the merge operation is calculated, and this number is taken as the `permits`.
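A conceptual Java sketch of the refactored flow (the real scheduler is C++; `pickRowsets`, `segmentCount`, and the semaphore size are placeholders): rowsets are picked first, the permits are set to the number of segment files that will actually be merged, and the task only runs once it has acquired that many permits.
```
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

public class CompactionPermitsSketch {
    // Global budget for concurrent compaction work, sized by config (placeholder value).
    private static final Semaphore PERMITS = new Semaphore(10000);
    private static final ExecutorService POOL = Executors.newFixedThreadPool(8);

    interface Rowset { int segmentCount(); }

    static void scheduleCompaction(List<Rowset> candidateRowsets) throws InterruptedException {
        // 1. Pick rowsets BEFORE asking for permits, so the cost is known precisely.
        List<Rowset> picked = pickRowsets(candidateRowsets);
        if (picked.isEmpty()) {
            return; // tablet does not meet compaction requirements: no permits consumed
        }
        // 2. Permits = number of segment files that will actually be merged,
        //    instead of the (larger) compaction score.
        int permits = picked.stream().mapToInt(Rowset::segmentCount).sum();
        PERMITS.acquire(permits);
        POOL.submit(() -> {
            try {
                mergeRowsets(picked);
            } finally {
                PERMITS.release(permits);
            }
        });
    }

    static List<Rowset> pickRowsets(List<Rowset> candidates) { return candidates; } // placeholder
    static void mergeRowsets(List<Rowset> rowsets) { /* placeholder for the actual merge */ }
}
```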
1. Add metrics for `used permits` and `waiting permits` of compaction.
It is useful to monitor the `permits` held by all executing compaction tasks and by waiting compaction tasks.
2. Add a config-controlled log for merging rowsets.
It is helpful for tracking the progress of rowset merging for compaction tasks that last a long time.
1. Random().nextInt() may return a negative value, which would result in `java.lang.ArrayIndexOutOfBoundsException`;
passing a positive bound avoids this problem.
```
// nextInt(bound) returns a value in [0, bound), so seed is never negative
int seed = new Random().nextInt(Short.MAX_VALUE) % nodesInfo.size();
```
2. `EsNodeInfo[] nodeInfos = (EsNodeInfo[]) nodesInfo.values().toArray()` may lead to a `java.lang.ClassCastException` in some JDK versions: `[Ljava.lang.Object; cannot be cast to [Lorg.apache.doris.external.elasticsearch.EsNodeInfo`; passing the target array type resolves this.
```
// toArray(T[]) returns an array of the requested component type, so no cast is needed
EsNodeInfo[] nodeInfos = nodesInfo.values().toArray(new EsNodeInfo[0]);
```
When a parquet file contains a `Map/List/Struct` structure, Doris cannot recognize the column correctly
and throws the exception 'Invalid column: xxxx', which means Doris cannot find the column.
The `Map` structure is recognized as two columns: `key` and `value`.
The following is the schema of a parquet file as recognized by Doris. This patch tries to solve this problem.
When a user tries to load parquet files into Doris from a path like `hdfs://hadoop/user/data/date=20201024/*`,
but the path actually contains some non-parquet files, the following error is thrown:
`Couldn't deserialize thrift: No more data to read.\\nDeserializing page header failed.`.
If the error message included the file name, we could quickly locate the errors.
Therefore, this patch tries to add the file name to the error message.
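A minimal Java sketch of the idea (the real reader is C++; `readParquetFile` and `ParquetReadException` are hypothetical): wrap the low-level deserialization error and prepend the file path, so a bad file in a wildcard path can be identified immediately.
```
public class ParquetErrorContextSketch {
    static class ParquetReadException extends RuntimeException {
        ParquetReadException(String message, Throwable cause) { super(message, cause); }
    }

    static void readParquetFile(String filePath) {
        try {
            deserializePageHeaders(filePath);
        } catch (RuntimeException e) {
            // Attach the file name so the user can tell which file in
            // hdfs://.../date=20201024/* is not actually a parquet file.
            throw new ParquetReadException("Failed to read parquet file: " + filePath
                    + ", reason: " + e.getMessage(), e);
        }
    }

    static void deserializePageHeaders(String filePath) {
        // placeholder for the real thrift page-header deserialization
    }
}
```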
1. Support modifying a column type from CHAR to TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE/DATE,
and converting TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE to wider numeric types (#4937)
2. Use templates to refactor the code of types.h and schema_change.cpp to remove redundant code.
When a tablet selects which replica's host will execute the scan operation,
it takes a `round-robin` strategy for load balancing. `minAssignedBytes` is the current load of one host.
If a backend is not alive at that moment, one of the other replicas is randomly chosen instead,
but the dead backend's `minAssignedBytes` is not decreased and the new choice's `minAssignedBytes`
is not increased. As a result, the recorded load of the backends no longer matches the real load.
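A conceptual Java sketch of the accounting idea (field and method names are placeholders, not the actual FE code): the scan bytes are charged to whichever backend is actually chosen, so a fallback from a dead backend does not leave the per-host load counters out of sync.
```
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ReplicaLoadAccountingSketch {
    // Bytes currently assigned to each backend host (placeholder for the minAssignedBytes bookkeeping).
    private final Map<Long, Long> assignedBytesPerBackend = new ConcurrentHashMap<>();

    long selectBackend(long preferredBackendId, List<Long> aliveReplicaBackends, long scanBytes) {
        long chosen;
        if (aliveReplicaBackends.contains(preferredBackendId)) {
            chosen = preferredBackendId;
        } else {
            // Preferred backend is down: fall back to another alive replica...
            chosen = aliveReplicaBackends.get(0);
        }
        // ...and charge the bytes to the backend that will really do the scan,
        // so the round-robin load statistics stay correct.
        assignedBytesPerBackend.merge(chosen, scanBytes, Long::sum);
        return chosen;
    }
}
```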