Commit Graph

5755 Commits

Author SHA1 Message Date
e27d2fcfb1 Fix bug of wrong judgement for Hll type column's default value (#1458)
Fix the bug where a user may receive the 'Hll can not set default value' error
when creating a table with Hll type columns.
2019-07-11 10:13:07 +08:00
9c96a688c3 Fix bug that user can set null default value to non-nullable column in create table stmt (#1453)
In create table stmt, column definition `k1 INT NOT NULL DEFAULT NULL`
should not be allowed
2019-07-10 23:48:29 +08:00
645f0a5279 Persist auth info in LoadJob (#1443)
The new class named 'AuthorizationInfo' is used to save the auth info in jobs.
The job no longer needs to retrieve the auth info by meta id, which may throw an exception when the db or table has been dropped or renamed.
The persistence of 'AuthorizationInfo' takes effect in META_VERSION 56.
2019-07-09 20:50:55 +08:00
1336093395 Set stmt support Expr (#1418) 2019-07-09 13:12:03 +08:00
3202dc28e8 Fix bug that unable to delete redundant replicas (#1442)
This can happen if the Doris cluster is deployed entirely on one medium, for example SSD,
but all tables are created with the HDD storage medium property. Then getLoadScore(SSD) will
always return 0.0, so no replica will be chosen when trying to delete redundant
replicas.
2019-07-09 10:38:35 +08:00
bde362c3cd Modify insert operation's behavior (#1444)
Before the default insert operation was changed to streaming load, if the select result
of an insert stmt was empty, a label was still returned to the user, and the user
could use this label to check the insert load job's status.

After the change, if the select result is empty, an exception
is thrown to the user client directly, without any label.

This new usage pattern is unfriendly to existing users, because it forces
them to change the way they use the insert operation.

So this commit adds a new FE config 'using_old_load_usage_pattern', which defaults to false.
If set to true, a label will be returned to the user even if the select result is empty.
2019-07-09 10:17:09 +08:00
dc64521607 Modify bugs in sample variance and variance confusion (#1439)
1. variance is the same as var_pop/variance_pop
2. stddev is the same as stddev_pop
2019-07-08 14:12:36 +08:00
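The distinction behind dc64521607 (#1439) is between population and sample statistics. The following is a minimal sketch using Python's standard `statistics` module, not Doris code; the data values are made up for illustration.

```python
import statistics

# Illustrative data only: mean is 5, sum of squared deviations is 32.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# Population forms divide by n (what var_pop / stddev_pop compute).
pop_var = statistics.pvariance(data)   # 32 / 8
pop_std = statistics.pstdev(data)      # sqrt(32 / 8)

# Sample forms divide by n - 1, so they are slightly larger.
samp_var = statistics.variance(data)   # 32 / 7
samp_std = statistics.stdev(data)      # sqrt(32 / 7)
```

Per the commit message, `variance` and `stddev` are meant to match the population forms.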
44d89ddcba Resolve the problem of accidental pause in RoutineLoadJob (#1437)
The issue is as follows:
Request1:
  BE aborts the txn.
  Attachment of txn is set.
  Attachment of txn is set to null without lock of txn because the task has been aborted by FE.
Request2:
  BE aborts the txn again.
  Attachment of txn is set again.
Request1:
  The attachment is not null so the job wants to find the task and commit it.
  The job cannot find the task, so it is paused. (NullPointerException)

In this commit, the commit request checks whether the task exists instead of checking whether
the txn attachment is not null.
2019-07-08 11:01:42 +08:00
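The change of check described in 44d89ddcba (#1437) can be sketched as below. This is a simplified illustration, not the Doris implementation: `RoutineLoadJobSketch`, `handle_commit_old`, and `handle_commit_new` are hypothetical names, and pausing the job stands in for the NullPointerException path.

```python
class RoutineLoadJobSketch:
    """Hypothetical stand-in for a routine load job; not the Doris class."""

    def __init__(self):
        self.tasks = {}          # task_id -> task payload
        self.state = "RUNNING"

    def handle_commit_old(self, task_id, txn_attachment):
        # Old check: trust the attachment. A stale attachment left by a
        # second abort request leads to a lookup of a task that is gone.
        if txn_attachment is None:
            return False
        if task_id not in self.tasks:
            self.state = "PAUSED"   # stands in for the accidental pause
            return False
        del self.tasks[task_id]
        return True

    def handle_commit_new(self, task_id, txn_attachment):
        # New check: proceed only when the job still tracks the task.
        if task_id not in self.tasks:
            return False
        del self.tasks[task_id]
        return True
```

With the new check, a stale attachment for an already-removed task is ignored rather than pausing the job.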
e2246e7181 Fix bug that publish task missing dbId parameter (#1426)
Without the dbId parameter, the backend report version cannot be
updated when the publish task reports to FE, which may cause reports
to be processed in the wrong order.

Related commit: 5c1b4f6
2019-07-04 15:42:39 +08:00
eeb6fa082b Revert "Cast the type of constant value in binary predicate to column type. (#1422)" (#1425)
This reverts commit 687f5a10868ce62d2d25d028152f84f3385b73f4.
2019-07-04 13:48:19 +08:00
687f5a1086 Cast the type of constant value in binary predicate to column type. (#1422)
If one child of a binary predicate is a column and the other is a constant expr,
set the compatible type to the column's type.
e.g.:
k1(int):
... WHERE k1 = "123"  -->  k1 = cast("123" as int);

k2(varchar):
... WHERE 123 = k2 --> cast(123 as varchar) = k2

This optimization is for the case where users store dates in an int column, e.g. 20190703,
but query with the predicate: col = "20190703".

Without casting "20190703" to int, the query optimizer cannot prune partitions correctly.
2019-07-03 21:49:30 +08:00
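The idea in 687f5a1086 (#1422, later reverted by eeb6fa082b) can be illustrated as follows. This is a toy model and not the Doris planner: `align_constant`, `prune_partitions`, and the partition names and ranges are all hypothetical.

```python
# Map a column type name to a Python caster (illustrative subset only).
CASTERS = {"int": int, "varchar": str}


def align_constant(column_type, constant):
    """Cast the constant side of a predicate to the column's type."""
    return CASTERS[column_type](constant)


def prune_partitions(partitions, value):
    """Keep partitions whose half-open int range [low, high) contains value."""
    return [name for name, (low, high) in partitions.items()
            if low <= value < high]


# Hypothetical int-typed date partitions.
partitions = {
    "p201906": (20190601, 20190701),
    "p201907": (20190701, 20190801),
}

# WHERE col = "20190703" becomes col = 20190703 before pruning.
matched = prune_partitions(partitions, align_constant("int", "20190703"))
```

Once the constant is an int, the range comparison used for pruning is well defined; comparing the string "20190703" against int bounds would not be.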
4ff17c1fc3 Forbid adding rollup with REPLACE value but without all key columns. (#1421)
When a rollup table contains value columns of REPLACE aggregation type,
all key columns of base table should be included in this rollup.
Without all key columns, the order of rows is undefined.

For example, table schema k1,k2,k3,v1(REPLACE)

1,2,3,A
1,2,4,B
1,2,5,C

A rollup with column(k1,k2,v1):

1,2,A
1,2,B
1,2,C

Whether A, B, or C ends up as the final value, the result is meaningless to the user.

Also fix a bug where setting the password for the current user on a non-master FE
throws a NullPointerException.
2019-07-03 09:17:39 +08:00
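The ambiguity described in 4ff17c1fc3 (#1421) can be reproduced with a toy model of REPLACE aggregation, in which the last row seen for a key wins. This is a sketch only; `rollup_replace` is a hypothetical helper, and real row order depends on the storage engine.

```python
def rollup_replace(rows, key_idx, value_idx):
    """Aggregate rows by the chosen key columns; REPLACE keeps the last value."""
    result = {}
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        result[key] = row[value_idx]
    return result


# Base table schema k1, k2, k3, v1(REPLACE), as in the commit message.
rows = [(1, 2, 3, "A"), (1, 2, 4, "B"), (1, 2, 5, "C")]

# Rollup on (k1, k2, v1): the winner depends entirely on row order.
forward = rollup_replace(rows, (0, 1), 3)
backward = rollup_replace(list(reversed(rows)), (0, 1), 3)

# Rollup that keeps all key columns: every row keeps its own value.
full = rollup_replace(rows, (0, 1, 2), 3)
```

Here `forward` and `backward` disagree on the value for key (1, 2), while `full` is the same regardless of order.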
d9f6829f4f Fix the deadlock of database and loadjob (#1419)
This commit changes idToTable in the database to a concurrent hashmap, so the database no longer needs to be locked before getTable.
The database is locked in GlobalTxnManager, and the load job is locked after that.
So the lock sequence is: database, load manager, load job.
2019-07-02 16:31:01 +08:00
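The lock sequence stated in d9f6829f4f (#1419) is an instance of a fixed global lock order, which avoids deadlock because no two threads can hold the same pair of locks in opposite order. A minimal sketch, assuming one lock per component; `LOCK_ORDER`, `locks`, and `acquire_in_order` are hypothetical names, not Doris APIs.

```python
import threading
from contextlib import ExitStack

# Fixed order taken from the commit message.
LOCK_ORDER = ("database", "load_manager", "load_job")
locks = {name: threading.Lock() for name in LOCK_ORDER}


def acquire_in_order(names):
    """Acquire the requested locks in the global order.

    Returns an ExitStack; leaving its `with` block releases the locks
    in reverse order of acquisition.
    """
    stack = ExitStack()
    for name in LOCK_ORDER:
        if name in names:
            stack.enter_context(locks[name])
    return stack


# The caller may list locks in any order; acquisition order is still fixed.
with acquire_in_order({"load_job", "database"}):
    held_inside = (locks["database"].locked(), locks["load_job"].locked())
held_after = (locks["database"].locked(), locks["load_job"].locked())
```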
7eab12a40e Support reading Parquet file when loading data (#1173) 2019-07-01 18:39:27 +08:00
9f743e3343 Use load job v2 in default (#1404) 2019-06-29 11:20:57 +08:00
6b83440b59 Get table name from DataSourceInfo instead of DataDesc (#1405) 2019-06-29 11:20:12 +08:00
5c1b4f641e Add report version for publish task (#1401) 2019-06-28 20:15:08 +08:00
7cb3c95330 Modify some method for external invoking and add extra sink (#1383) 2019-06-26 19:11:43 +08:00
24a46a4552 Fix column nullable property in show create table stmt result (#1381)
Since the default nullability of a column changed to NOT NULL,
the result of the show create table stmt should show the correct nullable property
of a column.
2019-06-26 18:55:27 +08:00
edeac1a339 Fix publish version task timeout (#1380)
The publish version task was using the wrong timeout config.
2019-06-26 12:27:25 +08:00
566e122c0d Optimize Export feature (#1378)
1. Add a 'timeout' property to the Export stmt.
2. Add more info to the 'show export' stmt.
3. Add more logs for debugging.
2019-06-26 00:20:53 +08:00
e807064a88 Modify colocation creation logic (#1289) 2019-06-25 21:20:18 +08:00
322de9cd8e Add sql-function doc of cast_to_bigint (#1370) 2019-06-24 19:40:57 +08:00
9cd0e09457 Fix bug that cast DateLiteral to other types (#1364) 2019-06-24 10:57:25 +08:00
9f7a335d02 Increase the timeout of publish version task when doing alter job (#1359)
The previous timeout setting of a publish version task was a mess.
It is now a configurable time, 30 seconds by default.

When the table is under rollup or schema change, this timeout is doubled.
This is a best-effort optimization: with a short timeout,
a replica's publish version task is more likely to fail, and if a quorum of replicas
of a tablet fail to publish, the alter job will fail.

If the table is not under rollup or schema change, the failure of a replica's
publish version task has a minor effect because the replica can be repaired
by tablet repair process very soon. But the tablet repair process will not
repair rollup replicas.
2019-06-22 14:29:16 +08:00
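The timeout rule in 9f7a335d02 (#1359) amounts to the following. This is a sketch only: `publish_timeout_seconds` and its parameter names are hypothetical, and only the 30-second default and the doubling come from the commit message.

```python
def publish_timeout_seconds(base_timeout=30, under_alter=False):
    """Return the publish version task timeout in seconds.

    The timeout is doubled while the table is under rollup or schema change.
    """
    return base_timeout * 2 if under_alter else base_timeout


normal = publish_timeout_seconds()
during_alter = publish_timeout_seconds(under_alter=True)
```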
bc14855f25 Fix FE unit test's failure (#1360) 2019-06-21 22:23:36 +08:00
120e7e9119 Add more UT for FEFunctions (#1344) 2019-06-21 21:54:14 +08:00
6a17f07f97 Fix wrong query result with constant InPredicate (#1357) 2019-06-21 21:48:00 +08:00
3b95867603 Add Checkstyle for doris-fe (#1353) 2019-06-21 21:45:54 +08:00
7550b2f09b Convert mini load to streaming mini load (#1323)
* This commit adds streaming mini load
Streaming mini load is used the same way as before, and the user can still check the load through the frontend.
The difference is that streaming mini load finishes the task before the REST API replies, while non-streaming mini load only registers a load.

* When upgrading doris
Upgrading either fe or be first is supported. Streaming mini load takes effect after both fe and be are upgraded.

* For multi mini load
Multi mini load still uses non-streaming mini load. The behavior of multi mini load has not changed.

* Add an interface named isSupportedFunction
This function protects the correctness of a new feature that spans be and fe during upgrading.
2019-06-21 19:34:50 +08:00
ea71277094 Support mysql client 8.0 connection fe (#1349)
For example:
mysql --default-auth=mysql_native_password -P9030 -utest -ptest123456 -hA.B.C.D
2019-06-21 19:15:34 +08:00
3024a6675a Fix the broker hang when fe restart (#1338) 2019-06-20 19:44:17 +08:00
0d3c80dd8a Fix bug that function greatest and least return wrong type (#1342) 2019-06-20 19:02:10 +08:00
7f1c3640ed Fix show proc '/jobs/load' stmt for loadv2 (#1335)
* Add streaming job in LoadProc
* Add a config named desired_max_waiting_jobs
1. If the number of pending load jobs is more than desired_max_waiting_jobs, the create load stmt will be rejected.
2. If the number of need_scheduler load jobs is more than desired_max_waiting_jobs, the new routine load job will be rejected.
3. The desired max size is only an expected number, so the size of the queue may sometimes exceed it.

* Merge load manager and load jobs in jobs proc dir
2019-06-20 13:12:12 +08:00
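The admission rule for desired_max_waiting_jobs in 7f1c3640ed (#1335) can be sketched as a simple threshold check. `should_reject` is a hypothetical helper, and the limit is passed in explicitly because the commit message does not state its default.

```python
def should_reject(waiting_jobs, desired_max_waiting_jobs):
    """Reject a new job when the waiting count is already above the limit."""
    return waiting_jobs > desired_max_waiting_jobs


# With a limit of 100 (illustrative value), the 101st waiting job is still
# admitted, which is one way the queue can exceed the desired size.
at_limit = should_reject(100, 100)
over_limit = should_reject(101, 100)
```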
e143dd2c1a Modify the max keep time of historical alter jobs (#1334)
Currently, historical alter jobs are kept for a while before being removed.
This time is configured by label_keep_max_second, which is also used for
load jobs.

But to avoid keeping too many historical load jobs in memory,
'label_keep_max_second' is always set to a short time, causing alter jobs to be
removed very soon.

Add a new FE config 'history_job_keep_max_second' to configure the keep time of
alter jobs. Default is 7 days.
2019-06-19 19:41:48 +08:00
5cd4777bb4 Fix bug that if match wrong symbol (#1324) 2019-06-19 09:17:13 +08:00
bad6478d4f Allow chars i,h,s in time_format (#1328) 2019-06-18 19:48:19 +08:00
30028bc35b Deny specify partition for unpartitioned table (#1319) 2019-06-15 18:19:56 +08:00
5c2cf9f2ce Handle the situation when there is no enough backends for tablet repair (#1299)
If there are only 3 backends, the replication num is 3, and one replica of a
tablet is bad, there is no 4th backend for tablet repair. So we need to delete
a bad replica first to make room for the new replica.
2019-06-14 20:28:29 +08:00
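The situation handled by 5c2cf9f2ce (#1299) can be modelled as a small planning function. This is a sketch under assumptions, not the Doris tablet scheduler: `plan_repair` and the step names are hypothetical, and the ordering used when a spare backend exists is an assumption for contrast only.

```python
def plan_repair(backends, replicas):
    """Plan repair steps for one tablet.

    backends: list of backend names.
    replicas: dict mapping backend name -> "ok" or "bad".
    """
    bad = [b for b, state in replicas.items() if state == "bad"]
    if not bad:
        return []
    free = [b for b in backends if b not in replicas]
    if free:
        # Assumed ordering: a spare backend exists, so clone first.
        return [("clone_to", free[0]), ("delete", bad[0])]
    # No spare backend: delete the bad replica first to make room.
    return [("delete", bad[0]), ("clone_to", bad[0])]


replicas = {"be1": "ok", "be2": "ok", "be3": "bad"}
no_spare = plan_repair(["be1", "be2", "be3"], replicas)
with_spare = plan_repair(["be1", "be2", "be3", "be4"], replicas)
```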
c8d7b8e1c4 Fix bug that FE web frontend can not get static resource files (#1312) 2019-06-14 20:12:57 +08:00
b002ba04d9 Fix the error of duplicated label (#1303) 2019-06-14 14:13:38 +08:00
a8900d102d Change default value to NOT NULL when creating table (#1293) 2019-06-12 18:28:11 +08:00
8b79abcaba Support setting exec_mem_limit in ExportJob (#1280) 2019-06-11 21:05:45 +08:00
53062122ea Change strategy of incorrect data (#1255)
This change adds a load property named strict_mode, which is used to reject incorrect data.
When it is set to false, incorrect data is loaded as NULL, as before.
When it is set to true, incorrect data belonging to a column without an expr is filtered out.
strict_mode is supported in broker load v2 for now. It will be supported in stream load later.
2019-06-10 20:39:45 +08:00
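The two strict_mode behaviours from 53062122ea (#1255) can be sketched for a single int column as below. This is a toy model, not the broker load code path: `load_int_column` is a hypothetical helper, each value stands for a one-column row, and "incorrect data" is modelled as a value that fails int conversion.

```python
def load_int_column(raw_values, strict_mode):
    """Load raw strings into an int column.

    Returns (loaded_values, filtered_count).
    """
    loaded = []
    filtered = 0
    for raw in raw_values:
        try:
            loaded.append(int(raw))
        except ValueError:
            if strict_mode:
                filtered += 1          # strict: drop the incorrect row
            else:
                loaded.append(None)    # non-strict: load it as NULL
    return loaded, filtered


lenient = load_int_column(["1", "x", "3"], strict_mode=False)
strict = load_int_column(["1", "x", "3"], strict_mode=True)
```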
6a54464ee8 Fix bug that NullPredicate is not correctly handled by partition pruning (#1276) 2019-06-10 20:11:42 +08:00
2efd5a4d86 Fix bug: FE pid file has wrong content (#1273)
For example, the process is started for the first time with pid 12345. The process is killed by accident and fe.pid remains. The process is then started a second time with pid 6789. fe.pid now shows 67895, because file.write only overwrites the first four digits. This can happen easily when running under supervise. This commit adds file.setLength(0) to delete the old data before writing.
2019-06-10 17:29:04 +08:00
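The stale-digit problem from 2efd5a4d86 (#1273) is easy to reproduce with any file API that overwrites in place. Below is a Python analogue of the behaviour, not the FE code: `write_pid` and `read_pid` are hypothetical helpers, and `truncate(0)` plays the role of `file.setLength(0)`.

```python
import os
import tempfile


def write_pid(path, pid, truncate):
    """Write pid at the start of the file, optionally truncating it first."""
    # "r+" opens an existing file without discarding its content.
    mode = "r+" if os.path.exists(path) else "w+"
    with open(path, mode) as f:
        if truncate:
            f.truncate(0)   # drop old content before writing
        f.write(str(pid))


def read_pid(path):
    with open(path) as f:
        return f.read()


with tempfile.TemporaryDirectory() as workdir:
    pid_file = os.path.join(workdir, "fe.pid")

    # Without truncation, the trailing "5" of the old pid survives.
    write_pid(pid_file, 12345, truncate=False)
    write_pid(pid_file, 6789, truncate=False)
    stale = read_pid(pid_file)

    # With truncation, only the new pid remains.
    write_pid(pid_file, 12345, truncate=True)
    write_pid(pid_file, 6789, truncate=True)
    clean = read_pid(pid_file)
```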
038ddcfa0b Set timeout configuration for stream load (#1271) 2019-06-10 15:51:58 +08:00
e4e04e8203 Make LZO support optional (#1263) 2019-06-07 22:26:54 +08:00
ff0dd0d2da Support SSL authentication with Kafka in routine load job (#1235) 2019-06-07 16:29:01 +08:00
f424321625 Fix IllegalArgumentException in LoadManager (#1240) 2019-06-04 22:23:13 +08:00