Support time zone variable like "-8:00","+8:00","8:00"
Time zone variable like "-8:00" is illegal in time-zone ID ,so we mush transfer it to standard format
Some export job from old version of Doris may not has timeout property,
which will cause NPE.
2 more changes:
1. Change the default BE config "max_runnings_transactions" to 2000.
2. Add a new metric to FE to show the master ip:port.
Those query of issue could not be supported. #2483#2493
Those query is forbidden:
query1: select * from t1 where k1=(select k1 from t2 where t1.k2=t2.k2);
query2: select * from t1 where k1=(select distinct k1 from t2 where t1.k2=t2.k2);
Only sum, max, min, avg and count function could appear on select clause for correlated subquery. #2420
Those query is legal:
query1: select * from t1 where k1=(select avg(k1) from t2 where t1.k2=t2.k2);
to solve the issue #2246.
scheme is as following:
add a optional preferred_rowset_type in TabletMeta for V2 format rollup index tablet
add a boolean session variable use_v2_rollup, if set true, the query will v2 storage format rollup index to process the query.
test queries will be sent to online service to verify the correctness of segment-v2 by send the the same queries to fe with use_v2_rollup set or not to check whether the returned results are the same.
Support to create materialized view
This commit support to create materiliazed view.
The syntax of stmt is following:
CREATE Materialized View [MV name] AS
SELECT select_expr[, select_expr ...]
FROM [Base table name]
GROUP BY column_name[, column_name ...]
ORDER BY column_name[, column_name ...]
The CreateMaterializedViewClause is used to check the semantic of stmt in the first step.
Now, the where, having, limit clause is forbidden in CREATE MATERIALIZED VIEW.
Also the aggregation function is restricted in SUM/MIN/MAX.
The second step is to validate stmt according to metadata of base table.
For example, the aggregate type of mv column must be same as the aggregate type of base column in aggregate table.
The last step is to prepare index of mv and add this new mvJob in Handler.
The handler will asynchronous process this new mvJob.
This commit will promote the priority of the || operator to the front of the + - * / mod operators.
It solves the problems 2.1 that mentioned at issue #2396 .
For problem at 2.2 in issue #2396 , it is actually the same problem mentioned in issue #2142 . As it said in pr #2398 before, the influence of modifying that logic will cause semantic errors in insert and load, so this commit will left the bug unsolved temporary.
appendix:
In Mysql 5.7.27
|| and |
select 23|1||7;
23
select (23|1)||7
237
select 23|(1||7)
23
Priority : || > |
|| and &
select 10&1||7;
0
select (10&1)||7
7
select 10&(1||7)
0
Priority : || > &
|| and ^
select 10^1||7
27
select (10^1)||7
117
select 10^(1||7)
27
Priority : || > ^
|| and ~
select ~1||7
184467440737095516147
select ~(1||7)
18446744073709551598
priority : || < ~
[Tag System]
This CL includes 2 parts:
Add classes related to "tag"
Resource: is the collective name of the nodes that provide various service capabilities in Doris cluster.
Tag: A Tag consists of type and name.
TagSet: TagSet represents a set of tags.
TagManager: maintains 2 indexes:
one is from tag to resource.
one is from resource to tags
ISSUE #1723
Using JSON as serialization methods of metadata
Introduce GSON library to serialize the new classes mentioned above.
ISSUE #2415#2389
GSON's version is updated to 2.8.6
1 Because we don't support array type currently, so I use variable arguments instead.
2 intersect_count directly return final count, not bitmap like bitmap_union, because intersect_count return bitmap is more complex and need more serialize. If we really need bitmap format from intersect_count, we could do that in another PR and which won't have compatibility problems.
The multi cluster feature will be deprecated soon.
Add a FE config "disable_cluster_feature", and default is true, to
forbid any cluster related operations, include:
* create/drop cluster
* add free backend/add backend to cluster/decommission cluster balance
* change the backends num of cluster
* link/migration db
* fix ut
This commit fixs the bug below,
FE throws a unexpected exception when encounter a query like :
Set sql_mode = '0,PIPES_AS_CONCAT'.
and make some change to sql mode analyze process, now the analyze process is no longer put in SetVar.class, but in VariableMgr.class.
[Metric] Add tablet compaction score metrics
Backend:
Add metric "tablet_max_compaction_score" to monitor the current max compaction
score of tablets on this Backend. This metric will be updated each time
the compaction thread picking tablets to compact.
Frontend:
Add metric "tablet_max_compaction_score" for each Backend. These metrics will
be updated when backends report tablet.
And also add a calculated metric "max_tablet_compaction_core" to monitor the
max compaction core of tablets on all Backends.
The control framework is implemented through heartbeat message. Use uint64_t as flags to control different functions.
Now add a flag to set the default rowset type to beta.
1. Reduce the publish version interval
2. Change the visible version check from `getReadyToPublishTransactions` to `finishTransaction`,and make the publish version task from serial to concurrent.
3. When `getReadyToPublishTransactions` sort the transactionState by CommitTime to make low version transaction publish firstly and reduce the wait time in `finishTransaction`,
'isRestore' flag is for the old version of backup and restore process,
which is deprecated long time ago. Remove it.
This commit is also for making a further step to ISSUE #1723.
[Load]
When performing a long-time load job, the following errors may occur. Causes the load to fail.
load channel manager add batch with unknown load id: xxx
There is a case of this error because Doris opened an unrelated channel during the load
process. This channel will not receive any data during the entire load process. Therefore,
after a fixed timeout, the channel will be released.
And after the entire load job is completed, it will try to close all open channels. When it try to
close this channel, it will find that the channel no longer exists and an error is reported.
This CL will pass the timeout of load job to the load channel, so that the timeout of load channels
will be same as load job's.
All classes that implement the Wriable interface need only implement the write() method.
The read() method should be implemented by itself according to the situation of different
classes.
**Authorization checking logic**
There are some problems with the current password and permission checking logic. For example:
First, we create a user by:
`create user cmy@"%" identified by "12345";`
And then 'cmy' can login with password '12345' from any hosts.
Second, we create another user by:
`create user cmy@"192.168.%" identified by "abcde";`
Because "192.168.%" has a higher priority in the permission table than "%". So when "cmy" try
to login in by password "12345" from host "192.168.1.1", it should match the second permission
entry, and will be rejected because of invalid password.
But in current implementation, Doris will continue to check password on first entry, than let it pass. So we should change it.
**Permission checking logic**
After a user login, it should has a unique identity which is got from permission table. For example,
when "cmy" from host "192.168.1.1" login, it's identity should be `cmy@"192.168.%"`. And Doris
should use this identity to check other permission, not by using the user's real identity, which is
`cmy@"192.168.1.1"`.
**Black list**
Functionally speaking, Doris only support adding WHITE LIST, which is to allow user to login from
those hosts in the white list. But is some cases, we do need a BLACK LIST function.
Fortunately, by changing the logic described above, we can simulate the effect of the BLACK LIST.
For example, First we add a user by:
`create user cmy@'%' identified by '12345';`
And now user 'cmy' can login from any hosts. and if we don't want 'cmy' to login from host A, we
can add a new user by:
`create user cmy@'A' identified by 'other_passwd';`
Because "A" has a higher priority in the permission table than "%". If 'cmy' try to login from A using password '12345', it will be rejected.
fix arithmetic operation between numeric and non-numeric will cause unexpected value.
After this patch you will get
mysql> select 1 + "kks";
+-----------+
| 1 + 'kks' |
+-----------+
| 1 |
+-----------+
1 row in set (0.02 sec)
mysql> select 1 - "kks";
+-----------+
| 1 - 'kks' |
+-----------+
| 1 |
+-----------+
1 row in set (0.01 sec)
This commit add a new plan node named AssertNumRowsNode
which is used to determine whether the number of rows exceeds the limit.
The subquery in Binary predicate and Between-and predicate should be added a AssertNumRowsNode
which is used to determine whether the number of rows in subquery is more than 1.
If the number of rows in subquery is more than 1, the query will be cancelled.
For example:
There are 4 rows in table t1.
Query: select c1 from t1 where c1=(select c2 from t1);
Result: ERROR 1064 (HY000): Expected no more than 1 to be returned by expression select c2 from t1
ISSUE-2270
TPC-DS 6,54,58
For #2383
1. Limit the concurrent transactions of routine load job
2. Create new routine load task when txn is VISIBLE, not after COMMITTED.
For #2267
1. All non-master daemon thread should also be started after catalog is ready.
For #2354
1. `fixLoadJobMetaError()` should be called after all meta data is read, including image and edit logs.
2. Mini load job should set to CANCELLED when corresponding transaction is not found, instead
of UNKNOWN.
Pure DocValue optimization for doris-on-es
Future todo:
Today, for every tuple scan we check if pure_docvalue is enabled, this is not reasonable, should check pure_docvalue enabled for one whole scan outside, I will add this todo in future
The timeline for this question is as follows:
1. For some reason, the master have lost contact with the other two followers.
Judging from the logs of the master, for almost 40 seconds, the master did not print any logs.
It is suspected that it is stuck due to full gc or other reasons, causing the
other two followers to think that the master has been disconnected.
2. After the other two followers re-elected, they continued to provide services.
3. The master node is manually restarted afterwards. When restarting it for the first time,
it needs to rollback some committed logs, so it needs to be closed and restarted again.
After restarting again, it returns to normal.
The main reason is that the master got stuck for 40 seconds for some reason.
This issue requires further observation.
At the same time, in order to alleviate this problem, we decided to set bdbje's heartbeat timeout
as a configurable value. The default is 30 seconds. Can be configured to 1 minute,
try to avoid this problem first.
There is bug in Doris version 0.10.x. When a load job in PENDING or LOADING
state was replayed from image (not through the edit log), we forgot to add
the corresponding callback id in the CallbackFactory. As a result, the
subsequent finish txn edit logs cannot properly finish the job during the
replay process. This results in that when the FE restarts, these load jobs
that should have been completed are re-entered into the pending state,
resulting in repeated submission load tasks.
Those wrong images are unrecoverable, so that we have to reset all load jobs
in PENDING or LOADING state when restarting FE, depends on its corresponding
txn's status, to avoid submit jobs repeatedly.
If corresponding txn exist, set load job' state depends on txn's status.
If txn does not exist, may be the txn has been removed due to label expiration.
So that we don't know the txn is aborted or visible. So we have to set the job's state
as UNKNOWN, which need handle it manually.
Previously, only Master FE has node info metrics to indicate which node is alive.
But this info should be available on every FE, so that the monitor system
can get all metrics from any FE.