* [Bug] Fix bug that the image cannot be pulled after a new FE node is added
This is because httpv2 modified the response body of the "/info" API,
causing the FE to fail to obtain info from this API.
In addition, the system did not exit correctly in this case.
This also causes the problem described in issue #5292.
from_unixtime is a CPU-intensive function.
SQL: select field from table where from_unixtime(field) > '2021-03-02'.
If there are one million rows of data, from_unixtime will be called one million times,
which makes the query very slow.
In issue #5443, we try to rewrite from_unixtime into a comparison on the raw timestamp value to reduce calls to this function.
This rewriting can bring a 2x query performance improvement.
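As a minimal sketch of the idea (this is not the actual Doris rewrite rule), assuming the column stores Unix seconds, the predicate can be checked against one precomputed boundary instead of calling from_unixtime on every row:

```java
import java.time.LocalDate;
import java.time.ZoneId;

public class FromUnixtimeRewriteSketch {
    public static void main(String[] args) {
        // Original predicate: from_unixtime(field) > '2021-03-02'
        // Equivalent idea: field > <Unix seconds of 2021-03-02 00:00:00>
        long boundary = LocalDate.parse("2021-03-02")
                .atStartOfDay(ZoneId.systemDefault())
                .toEpochSecond();

        long field = 1614700800L; // hypothetical row value in Unix seconds
        boolean matches = field > boundary; // no per-row from_unixtime call needed
        System.out.println(matches);
    }
}
```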
Currently, FeMetaVersion.java is in fe-common, so users may forget to copy fe-common.jar when upgrading the service.
This is really dangerous because the data may be corrupted and cannot be recovered.
Some invalid config values may cause the BE to behave unexpectedly.
This patch aims to validate the config both when the BE bootstraps and when the BE's config is updated via API,
so that invalid values are rejected.
This is a follow-up work to accomplish PR #4423.
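A generic sketch of the idea in Java (the real validation lives in the BE; the config names and value ranges below are assumptions for illustration): check the value at bootstrap and on API updates, and reject it instead of running with an unexpected value.

```java
import java.util.Map;
import java.util.function.Predicate;

public class ConfigValidationSketch {
    // Hypothetical validators keyed by config name; the ranges here are made up.
    private static final Map<String, Predicate<Long>> VALIDATORS = Map.of(
            "write_buffer_size", v -> v > 0,
            "webserver_num_workers", v -> v > 0 && v <= 1024
    );

    static void setConfig(String name, long value) {
        Predicate<Long> validator = VALIDATORS.get(name);
        if (validator != null && !validator.test(value)) {
            // Reject the update instead of letting the process run with an invalid value.
            throw new IllegalArgumentException("invalid value " + value + " for config " + name);
        }
        // ... apply the config ...
    }
}
```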
There is some redundant code for the report task, disk, and tablet in the BE, and when the FE returns an error report message, there is no warn log showing that the report failed.
Co-authored-by: caiconghui [蔡聪辉] <caiconghui@xiaomi.com>
When there is a count(*) function in the query, we only need to scan the smallest column.
For example:
Query: select count(*) from (select k1, k2, k3 from base) tmp;
Only k1, which is the smallest column, should be scanned.
The remaining columns (k2, k3) should be pruned.
This PR implements this column pruning optimization.
Fixed #5409
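An illustrative sketch only (not the planner code): since count(*) only needs row existence, keep the cheapest column from the child's output and prune the rest. The column names and byte sizes below are made up for the example.

```java
import java.util.Comparator;
import java.util.List;

public class CountStarPruningSketch {
    static class Column {
        final String name;
        final int typeSizeBytes; // e.g. INT = 4, BIGINT = 8, VARCHAR = average length
        Column(String name, int typeSizeBytes) {
            this.name = name;
            this.typeSizeBytes = typeSizeBytes;
        }
    }

    // Keep only the smallest column; all others can be pruned for count(*).
    static Column pickSmallest(List<Column> outputColumns) {
        return outputColumns.stream()
                .min(Comparator.comparingInt(c -> c.typeSizeBytes))
                .orElseThrow(IllegalStateException::new);
    }

    public static void main(String[] args) {
        List<Column> cols = List.of(new Column("k1", 4), new Column("k2", 8), new Column("k3", 16));
        System.out.println(pickSmallest(cols).name); // k1
    }
}
```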
When a table has multiple partitions, each partition has its own
version, so the version number alone does not indicate which partition's data is newer.
When a partition happens to have a large version, it may currently be considered
the latest one, which causes incorrect query results.
Suppose there are 2 partitions:
PartitionName | VisibleVersion | VisibleVersionTime
p1 | 123 | 2021-02-17 23:31:32
p2 | 23 | 2021-02-22 11:39:19
Partition p1 will be considered the latest partition, and if there is a
cache entry created before p2's last update time, the cache will hit and return an
incorrect result.
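A minimal sketch of the idea, with a hypothetical Partition class (the actual fix is not shown here): the "latest" partition should be chosen by its visible version time, not by the per-partition version counter, because each partition increments its version independently.

```java
import java.util.Comparator;
import java.util.List;

public class LatestPartitionSketch {
    static class Partition {
        final String name;
        final long visibleVersion;       // per-partition counter, not comparable across partitions
        final long visibleVersionTimeMs; // when that version became visible
        Partition(String name, long visibleVersion, long visibleVersionTimeMs) {
            this.name = name;
            this.visibleVersion = visibleVersion;
            this.visibleVersionTimeMs = visibleVersionTimeMs;
        }
    }

    static Partition latest(List<Partition> partitions) {
        // Wrong: comparing visibleVersion would pick p1 (version 123) in the example above.
        // Right: compare by the time the version became visible.
        return partitions.stream()
                .max(Comparator.comparingLong(p -> p.visibleVersionTimeMs))
                .orElseThrow(IllegalStateException::new);
    }
}
```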
* Update fe-idea-dev.md
use `brew install thrift@0.9` to install thrift 0.9.3.1
`brew edit thrift090 | head` shows thrift@0.9 uses thrift 0.9.3.1
* [Refactor] Remove the unnecessary if statement
Future<?> submit(Runnable task)
Submits a Runnable task for execution and returns a Future representing that task. The Future's get method will return null upon successful completion.
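Since submit(Runnable) always yields a Future whose get() returns null on success, an if statement that checks that result for non-null is effectively dead code. A minimal demo:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SubmitNullResultDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> future = pool.submit(() -> System.out.println("task ran"));
        // get() blocks until the task completes; for a Runnable it returns null on
        // success and throws ExecutionException on failure, so a null check adds nothing.
        Object result = future.get();
        System.out.println(result); // prints: null
        pool.shutdown();
    }
}
```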
* Fix null type
* add comment
Co-authored-by: tanhao <tanhao.0902@bytedance.com>
* [doris-1008] Support backup and restore directly to cloud storage via the AWS S3 protocol
* [Internal][S3DirectAccess] Support backup, restore, load, and export connecting directly to S3
1. Support load and export data from/to S3 directly.
2. Add a config to automatically convert broker access to S3 access when available.
Change-Id: Iac96d4b3670776708bc96a119ff491db8cb4cde7
(cherry picked from commit 2f03832ca52221cc7436069b96c45c48c4bc7201)
* [Internal][S3DirectAccess] File path glob compatible with broker
Change-Id: Ie55e07a547aa22c6fa8d432ca926216c10384e68
(cherry picked from commit d4fb25544c0dc06d23e1ada571ec3f8edd4ba56f)
* [internal] [doris-1008] fix log4j class not found
Change-Id: I468176aca0d821383c74ee658d461aba9e7d5be3
(cherry picked from commit 029adaa9d6ded8503acbd6644c1519456f3db232)
* add poms
Co-authored-by: yangzhengguo01 <yangzhengguo01@baidu.com>
If the difference between the two times exceeds Integer.MAX_VALUE,
the comparator's return value will overflow, and an exception may be triggered when sorting the profile.
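A minimal sketch of the overflow pattern (the actual profile comparator is not shown): casting a long time difference to int can flip the sign, which breaks the comparator contract and can make sorting fail.

```java
public class TimeCompareOverflowDemo {
    // Buggy pattern: the difference overflows int once it exceeds Integer.MAX_VALUE.
    static int buggyCompare(long t1, long t2) {
        return (int) (t1 - t2);
    }

    // Safe pattern: compare the full long values.
    static int safeCompare(long t1, long t2) {
        return Long.compare(t1, t2);
    }

    public static void main(String[] args) {
        long a = 0L;
        long b = Integer.MAX_VALUE + 1L;
        System.out.println(buggyCompare(b, a)); // negative due to overflow, although b > a
        System.out.println(safeCompare(b, a));  // 1, as expected
    }
}
```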
In the previous broker load, multiple OlapTableSinks would send data to the same LoadChannel,
and because of the lock granularity problem, the LoadChannel could only process these requests serially,
which made it impossible to make full use of cluster resources.
This CL modifies the related locks so that the LoadChannel can process these requests in parallel.
In a test loading a 20 GB dataset of 334 million rows on 3 nodes, the load time was
reduced from 9 min to 5 min, and after enabling a concurrency of 2, it was further reduced to 3 min.
The profile of the load job is also modified.
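A generic Java illustration of the lock-granularity change (the real LoadChannel is in the BE and this is not its code): replacing one channel-wide lock with per-index locks lets unrelated sink requests proceed in parallel.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FineGrainedLockSketch {
    // Before: a single lock guarded every add-batch request, so all senders serialized here.
    private final Object channelLock = new Object();

    // After: one lock per index, so requests for different indexes don't block each other.
    private final Map<Long, Object> indexLocks = new ConcurrentHashMap<>();

    void addBatchSerial(Runnable work) {
        synchronized (channelLock) {
            work.run();
        }
    }

    void addBatchParallel(long indexId, Runnable work) {
        Object lock = indexLocks.computeIfAbsent(indexId, id -> new Object());
        synchronized (lock) {
            work.run();
        }
    }
}
```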
1. Add BlockColumnPredicate to support OR and AND column predicates in RowBlockV2.
2. Support evaluating the vectorized delete predicate in the storage engine instead of in the Reader in SegmentV2.
After an Alter job finishes, the job's state is FINISHED, but the table's state
may not become NORMAL for a while.
We need to make sure that the table's state becomes NORMAL before continuing with the next UT.
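A minimal sketch of the waiting logic, using a hypothetical helper to read the table state:

```java
public class WaitTableNormalSketch {
    interface TableStateProvider {
        String getTableState(String tableName); // hypothetical helper for the UT
    }

    static void waitUntilNormal(TableStateProvider provider, String table, long timeoutMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        // The Alter job may already be FINISHED while the table state lags behind.
        while (!"NORMAL".equals(provider.getTableState(table))) {
            if (System.currentTimeMillis() > deadline) {
                throw new IllegalStateException("table " + table + " did not become NORMAL in time");
            }
            Thread.sleep(100);
        }
    }
}
```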