Commit Graph

18263 Commits

Author SHA1 Message Date
ae6f2d99c5 Fix bug when using SELECT * FROM TABLE LIMIT 1 (#1469) 2019-07-13 23:57:14 +08:00
5be3e73325 Build snappy with optimize-options enabled (#1467) 2019-07-13 21:27:17 +08:00
aff1559c4d Fix bug: BE crashes if the Doris table has fewer columns than the Parquet file (#1464) 2019-07-12 15:23:13 +08:00
863eb83cb1 Delete deprecated code in Frontend (#1463)
1. Delete Clone/CloneJob/CloneChecker
    The old clone framework is deprecated; TabletChecker/TabletScheduler is used instead
2. Delete old BackupJob/RestoreJob
3. Delete OP_DROP_USER edit log
4. Delete CLONE_DONE edit log
2019-07-12 13:34:05 +08:00
734032d917 Fix the wrong unit of create timestamp in mini load (#1460)
The old create timestamp is in microseconds, while the create timestamp in FE is in milliseconds.
2019-07-11 19:29:18 +08:00
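The unit mismatch described in #1460 can be illustrated with a minimal sketch. The helper name `micros_to_millis` and the sample value are hypothetical and are not taken from the Doris code base.

```python
def micros_to_millis(ts_micros: int) -> int:
    """Convert an epoch timestamp in microseconds to milliseconds."""
    return ts_micros // 1000

# An example epoch timestamp expressed in microseconds (illustrative value).
create_ts_micros = 1_562_844_558_000_000
create_ts_millis = micros_to_millis(create_ts_micros)
```

Comparing the two values without this conversion would be off by a factor of 1000.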
2fd2b714c1 Add aggregate function doc (#1434) 2019-07-11 16:45:45 +08:00
a7390c03f4 Add percentile_approx aggregate function (#1432) 2019-07-11 16:44:43 +08:00
01a3bea456 Add new CreateTabletReq and AlterTabletReqV2 thrift struct (#1459)
This is for developing new alter table process.
2019-07-11 16:38:39 +08:00
b9c79d4b1b Fix importing non-parquet format file causing BE crash (#1454) 2019-07-11 16:04:36 +08:00
81f062dd4c Bug-fix: querying ES table fails when thrift_port configuration is not set (#1455) 2019-07-11 12:29:18 +08:00
941dec215b Add utc_timestamp function (#1456) 2019-07-11 11:09:08 +08:00
e27d2fcfb1 Fix bug of wrong judgement for Hll type column's default value (#1458)
Fix the bug that a user may receive the 'Hll can not set default value' error
when creating a table with Hll type columns.
2019-07-11 10:13:07 +08:00
51c92a0bec Validate the UTF-8 encoding of loading data (#1457)
Currently, Doris only supports UTF-8 encoded data. All data will be
shown to the user in UTF-8 format. So if data loaded into Doris is not
UTF-8 encoded, the user will see garbled data when querying.

I introduce a fast UTF-8 validator from

    https://github.com/lemire/fastvalidate-utf-8

This validator is highly optimized: it takes only about 0.7 CPU cycles per byte
to validate a 64k string. In a test loading 1GB of data into Doris, the
validator had no impact on performance.
2019-07-11 09:46:38 +08:00
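The kind of check added in #1457 can be sketched as follows. This is only an illustration using Python's built-in strict decoder; it is not the SIMD-based validator from fastvalidate-utf-8, and the function name `is_valid_utf8` is hypothetical.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the byte string is well-formed UTF-8."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False

utf8_row = "Doris 数据".encode("utf-8")
gbk_row = "数据".encode("gbk")  # same text in a non-UTF-8 encoding

accepted = is_valid_utf8(utf8_row)
rejected = not is_valid_utf8(gbk_row)
```

A loader could reject rows failing this check instead of storing bytes that later display as garbled text.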
9c96a688c3 Fix bug that user can set null default value to non-nullable column in create table stmt (#1453)
In create table stmt, column definition `k1 INT NOT NULL DEFAULT NULL`
should not be allowed
2019-07-10 23:48:29 +08:00
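The rule enforced by #1453 can be sketched as a small validation step. The `ColumnDef` class and `check_default` function are hypothetical names for illustration, not the FE's actual classes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ColumnDef:
    name: str
    nullable: bool
    has_default: bool = False
    default: Optional[str] = None  # None with has_default=True models DEFAULT NULL

def check_default(col: ColumnDef) -> None:
    """Reject a NULL default value on a non-nullable column."""
    if col.has_default and col.default is None and not col.nullable:
        raise ValueError(f"column {col.name}: NOT NULL column cannot have DEFAULT NULL")

# Models the definition `k1 INT NOT NULL DEFAULT NULL` from the commit message.
bad_column = ColumnDef(name="k1", nullable=False, has_default=True, default=None)
```

Calling `check_default(bad_column)` raises `ValueError`, while a nullable column with the same default passes.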
98bd4b4565 Add string function split_part (#1451) 2019-07-10 09:47:33 +08:00
615c979727 Fix bug that BE crashes when inserting null value to non-nullable columns (#1447) 2019-07-10 09:20:09 +08:00
67b370a1ed Add ColumnBlock (#1450)
Use ColumnBlock to read data from Page.
2019-07-09 21:52:27 +08:00
645f0a5279 Persist auth info in LoadJob (#1443)
The new class named 'AuthorizationInfo' is used to save the auth info in jobs.
The job no longer needs to retrieve the auth info by meta id, which may throw an exception when the db or table has been dropped or renamed.
The persistence of 'AuthorizationInfo' takes effect in META_VERSION 56
2019-07-09 20:50:55 +08:00
1336093395 Set stmt support Expr (#1418) 2019-07-09 13:12:03 +08:00
3202dc28e8 Fix bug that redundant replicas cannot be deleted (#1442)
This can happen if the Doris cluster is deployed entirely on one storage medium, for example SSD,
but all tables are created with the HDD storage medium property. Then getLoadScore(SSD) will
always return 0.0, so no replica will be chosen when trying to delete redundant
replicas.
2019-07-09 10:38:35 +08:00
ded60e59f9 Add a configuration to modify the reserve time of load error log (#1433)
Currently, the load error log on BE is cleaned along with the
intermediate data of load, as configured by 'load_data_reserve_hours'.
Sometimes users want to reserve the error log for a longer time.
2019-07-09 10:36:13 +08:00
bde362c3cd Modify insert operation's behavior (#1444)
Before changing the default insert operation to streaming load, if the select result
of an insert stmt was empty, a label would still be returned to the user, and the user
could use this label to check the insert load job's status.

After changing the insert operation, if the select result is empty, an exception
is thrown to the user client directly without any label.

This new usage pattern is not friendly to existing users, as it forces
them to change the way they use the insert operation.

So I add a new FE config 'using_old_load_usage_pattern', which defaults to false.
If set to true, a label will be returned to the user even if the select result is empty.
2019-07-09 10:17:09 +08:00
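The behavior switch described in #1444 can be sketched as below. Only the config name `using_old_load_usage_pattern` comes from the commit message; the function `finish_insert` and its parameters are hypothetical.

```python
def finish_insert(rows_selected: int, label: str,
                  using_old_load_usage_pattern: bool = False) -> str:
    """Return the load label, or raise when the select result is empty."""
    if rows_selected == 0 and not using_old_load_usage_pattern:
        # New default behavior: an empty select result is reported as an error.
        raise RuntimeError("insert stmt selected no rows")
    # Old behavior (or non-empty result): the label is returned to the user.
    return label

old_style_label = finish_insert(0, "label_1", using_old_load_usage_pattern=True)
```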
dc64521607 Fix bugs caused by confusing sample variance with variance (#1439)
1. variance is the same as var_pop/variance_pop
2. stddev is the same as stddev_pop
2019-07-08 14:12:36 +08:00
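The distinction behind #1439 is that population variance divides by n while sample variance divides by n - 1. A minimal sketch using Python's standard `statistics` module (the variable names mirror the SQL function names for illustration only):

```python
import statistics

values = [1.0, 2.0, 3.0, 4.0]

# Population statistics: what variance / var_pop / variance_pop and
# stddev / stddev_pop are meant to return after this fix.
var_pop = statistics.pvariance(values)   # sum of squared deviations / n
stddev_pop = statistics.pstdev(values)   # square root of var_pop

# Sample variance, a different quantity: sum of squared deviations / (n - 1).
var_samp = statistics.variance(values)
```

For this input the population variance is 1.25, while the sample variance is 5/3.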
44d89ddcba Resolve the problem of accidental pause in RoutineLoadJob (#1437)
The issue is as follows:
Request1:
  BE aborts the txn.
  Attachment of txn is set.
  Attachment of txn is set to null without lock of txn because the task has been aborted by FE.
Request2:
  BE aborts the txn again.
  Attachment of txn is set again.
Request1:
  The attachment is not null so the job wants to find the task and commit it.
  The job could not find the task so it is paused. (NullPointer Exception)

In this commit, the commit request checks whether the task exists instead of checking whether the txn attachment
is not null.
2019-07-08 11:01:42 +08:00
4989f7bfe3 Fix spelling mistake in docs (#1435) 2019-07-07 11:55:51 +08:00
25e092f92a Fix broken image link in docs (#1436) 2019-07-07 10:51:31 +08:00
e2246e7181 Fix bug that publish task is missing dbId parameter (#1426)
Without the dbId parameter, the backend report version cannot be
updated when the publish task reports to FE, which may cause an incorrect
order of reports.

Related commit: 5c1b4f6
2019-07-04 15:42:39 +08:00
eeb6fa082b Revert "Cast the type of constant value in binary predicate to column type. (#1422)" (#1425)
This reverts commit 687f5a10868ce62d2d25d028152f84f3385b73f4.
2019-07-04 13:48:19 +08:00
687f5a1086 Cast the type of constant value in binary predicate to column type. (#1422)
If one child of a binary predicate is a column, and the other is a constant expr,
set the compatible type to the column's type.
e.g.:
k1(int):
... WHERE k1 = "123"  -->  k1 = cast("123" as int);

k2(varchar):
... WHERE 123 = k2 --> cast(123 as varchar) = k2

This optimization is for the case where some users use an int column to save dates, e.g. 20190703,
but query with the predicate: col = "20190703".

Without casting "20190703" to int, the query optimizer cannot do partition pruning correctly.
2019-07-03 21:49:30 +08:00
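The idea in #1422 can be sketched as follows. This is an illustrative model only: `coerce_constant`, `prune`, and the partition layout are hypothetical and do not reflect the FE's actual planner code.

```python
def coerce_constant(column_type: str, constant):
    """Cast a constant operand to the type of the column it is compared with."""
    casts = {"int": int, "varchar": str}
    if column_type not in casts:
        raise ValueError(f"unsupported column type: {column_type}")
    return casts[column_type](constant)

# k1(int): WHERE k1 = "123"  ->  k1 = cast("123" as int)
k1_constant = coerce_constant("int", "123")
# k2(varchar): WHERE 123 = k2  ->  cast(123 as varchar) = k2
k2_constant = coerce_constant("varchar", 123)

# Hypothetical partition ranges [lower, upper) over an int date column.
partitions = {"p201906": (20190601, 20190701), "p201907": (20190701, 20190801)}

def prune(value: int):
    """Keep only the partitions whose range contains the value."""
    return [name for name, (lo, hi) in partitions.items() if lo <= value < hi]

matching = prune(coerce_constant("int", "20190703"))
```

Once the constant has the column's int type, it can be compared directly against the partition bounds, so only one partition needs to be scanned.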
4ff17c1fc3 Forbid adding rollup with REPLACE value but without all key columns. (#1421)
When a rollup table contains value columns of REPLACE aggregation type,
all key columns of the base table should be included in this rollup.
Without all key columns, the order of rows is undefined.

For example, table schema k1,k2,k3,v1(REPLACE)

1,2,3,A
1,2,4,B
1,2,5,C

A rollup with column(k1,k2,v1):

1,2,A
1,2,B
1,2,C

No matter whether A, B, or C ends up as the final value, the result is meaningless to the user.

Also fix a bug that setting the password for the current user on a non-master FE
throws NullPointerException.
2019-07-03 09:17:39 +08:00
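The ambiguity described in #1421 can be reproduced with a small sketch. The function `rollup_replace` is hypothetical and only models REPLACE semantics as "the last row seen wins".

```python
# Base table rows with schema k1, k2, k3, v1(REPLACE), as in the commit message.
base_rows = [(1, 2, 3, "A"), (1, 2, 4, "B"), (1, 2, 5, "C")]

def rollup_replace(rows):
    """Aggregate onto (k1, k2) with REPLACE: the last row processed wins."""
    result = {}
    for k1, k2, _k3, v1 in rows:
        result[(k1, k2)] = v1
    return result

forward = rollup_replace(base_rows)
backward = rollup_replace(reversed(base_rows))
```

Because k3 is dropped, all three rows share the key (1, 2), and the surviving value depends purely on processing order: `forward` keeps "C" while `backward` keeps "A".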
d9f6829f4f Fix the deadlock of database and loadjob (#1419)
This commit changes idToTable to a concurrent hashmap in database, so there is no need to lock the database before getTable.
The database will be locked in GlobalTxnManager. The load job will be locked after that.
So the lock sequence is: database, load manager, load job.
2019-07-02 16:31:01 +08:00
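A fixed lock sequence like the one in #1419 avoids deadlock because no two threads can hold the same pair of locks in opposite orders. A minimal sketch, assuming one lock per component; the names `LOCK_ORDER`, `locks`, and `acquire_in_order` are hypothetical.

```python
import threading
from contextlib import ExitStack

# Global lock sequence from the commit message: database, load manager, load job.
LOCK_ORDER = ["database", "load_manager", "load_job"]
locks = {name: threading.Lock() for name in LOCK_ORDER}

def acquire_in_order(names):
    """Acquire the requested locks in the global order; release them on exit."""
    stack = ExitStack()
    for name in sorted(names, key=LOCK_ORDER.index):
        stack.enter_context(locks[name])
    return stack

# The caller lists locks in any order; they are still taken database-first.
with acquire_in_order(["load_job", "database"]):
    held_inside = [name for name in LOCK_ORDER if locks[name].locked()]
```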
7eab12a40e Support reading Parquet file when loading data (#1173) 2019-07-01 18:39:27 +08:00
b0af97d8aa Change error msg of mini load when PUBLISH_TIMEOUT (#1415) 2019-07-01 16:05:49 +08:00
8db97998ba Collect all documents to Doris code base (#1414) 2019-07-01 09:23:13 +08:00
9f743e3343 Use load job v2 by default (#1404) 2019-06-29 11:20:57 +08:00
6b83440b59 Get table name from DataSourceInfo instead of DataDesc (#1405) 2019-06-29 11:20:12 +08:00
8a10bf0f89 Fix binary plain page relocate bug (#1410) 2019-06-29 11:19:43 +08:00
1ff1722d93 Fix the core dump in dpp sink caused by sum of int128 (#1412) 2019-06-28 23:30:33 +08:00
5c1b4f641e Add report version for publish task (#1401) 2019-06-28 20:15:08 +08:00
4747bed306 Add rle page (#1379) 2019-06-27 22:25:19 +08:00
f8ec3d90d0 Add some fields for external service (#1403) 2019-06-27 22:10:33 +08:00
b17d1c5348 Fix a bug of v2 ColumnReader when reading not-null column (#1398) 2019-06-26 22:58:30 +08:00
7cb3c95330 Modify some methods for external invoking and add extra sink (#1383) 2019-06-26 19:11:43 +08:00
756a680143 Add a website builder for Doris documentation (#1396)
The build script is located in docs/website.
Built with Sphinx using a theme provided by Read the Docs.
2019-06-26 19:10:39 +08:00
24a46a4552 Fix column nullable property in show create table stmt result (#1381)
Since the default nullable property of a column changed to NOT NULL,
the result of the show create table stmt should show the correct nullable property
of a column.
2019-06-26 18:55:27 +08:00
edeac1a339 Fix publish version task timeout (#1380)
The publish version task was using the wrong timeout config.
2019-06-26 12:27:25 +08:00
e046f7b05a Add plain page (#1341) 2019-06-26 00:50:50 +08:00
566e122c0d Optimize Export feature (#1378)
1. Add 'timeout' property in Export stmt.
2. Add more info in 'show export' stmt.
3. Add more logs for debugging.
2019-06-26 00:20:53 +08:00
adba5249c4 Add dayofweek function (#1376) 2019-06-25 21:37:42 +08:00
e807064a88 Modify colocation creation logic (#1289) 2019-06-25 21:20:18 +08:00