AlphaRowsetWriter fails to validate the rowset when building it,
because the rowset's num_rows is not equal to the segment groups'
num_rows when the add_rowset API is called. So add some logs to
trace the process and help debug the problem. The logs will be
removed in the future.
FE uses partition_id to publish a version. BE should check whether all tablets related to this partition have the version, but a Tablet in BE does not carry the partition id in its metadata, so BE cannot perform this check.
This patch adds the partition id to the tablet meta during the report task.
Sync at most 10k tablets when setting tablet meta.
Currently, GRANT_PRIV can only be granted at the global level, which means
it can only be granted on *.*. Granting it on db.* or db.tbl is not allowed.
This cannot meet the requirement of creating a user who has the privilege
to grant privileges to other users on a specified database or table, such as:
GRANT SELECT_PRIV ON db1.* TO cmy@'%';
So I extend the scope of GRANT_PRIV. Users can now grant GRANT_PRIV at the
database or even table level, such as:
GRANT GRANT_PRIV ON db1.* TO cmy@'%';
And after being granted this, the user cmy@'%' can now grant GRANT_PRIV on db1.* to
other users.
Mini load now uses the stream load framework, but we should keep the
mini load return behavior and result JSON format the same as before.
So the PUBLISH_TIMEOUT error should be treated as OK in mini load.
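For illustration, a minimal Java sketch of that status mapping; the class and enum names are hypothetical, not the real Doris types:

```java
// Hypothetical sketch: map the final load status to the value reported in the
// old-style mini load JSON result. LoadStatus and MiniLoadResult are illustrative.
public class MiniLoadResult {

    enum LoadStatus { OK, PUBLISH_TIMEOUT, FAILED }

    // Mini load keeps the old result contract: a publish timeout only means the
    // new version is not yet visible; the data itself was written successfully,
    // so it is still reported as success.
    static String toResultStatus(LoadStatus status) {
        if (status == LoadStatus.OK || status == LoadStatus.PUBLISH_TIMEOUT) {
            return "Success";
        }
        return "Fail";
    }

    public static void main(String[] args) {
        System.out.println(toResultStatus(LoadStatus.PUBLISH_TIMEOUT)); // Success
        System.out.println(toResultStatus(LoadStatus.FAILED));          // Fail
    }
}
```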
Also add 2 counters to the OlapTableSink profile:
SerializeBatchTime: time spent serializing all row batches.
WaitInFlightPacketTime: time spent waiting for the last sent packet.
The mini load timeout needs to be added to the plan options.
The timeout property has been added to the request of the process put;
otherwise, the mini load timeout would be useless.
Add logs for the label, txn id and query id in mini load.
NOTE: This patch modifies all of a Backend's data,
and this will make restarting the BE take a very long time.
So if you do not want to disrupt your production environment,
you should upgrade the Backends one by one.
1. Refactor BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
Naming a rowset with tablet_id and version leads to
many conflicts among compaction, clone and restore.
3. Extract a rowset interface to encapsulate rowsets
with different formats.
Currently, Doris only supports UTF-8 encoded data. All data is
shown to users in UTF-8 format, so if the data loaded into Doris is not
UTF-8 encoded, users will see garbled data when querying.
I introduce a fast UTF-8 validator from
https://github.com/lemire/fastvalidate-utf-8
This validator is highly optimized: it takes only about 0.7 CPU cycles
per byte to validate a 64 KB string. In a test loading 1 GB of data into
Doris, the validator had no noticeable impact on performance.
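The BE change uses the C++ SIMD validator from the repository above. Purely as an illustration of what the validation step checks, here is a minimal Java sketch that flags non-UTF-8 byte sequences (not the actual implementation):

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Illustration only: the real check in BE is the SIMD validator from
// lemire/fastvalidate-utf-8; this just shows what "valid UTF-8" means.
public class Utf8Check {

    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false; // such a value would show up as garbled data when queried
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidUtf8("hello, 世界".getBytes(StandardCharsets.UTF_8))); // true
        System.out.println(isValidUtf8(new byte[] {(byte) 0xC3, (byte) 0x28}));          // false
    }
}
```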
The new class 'AuthorizationInfo' is used to save the auth info in jobs.
The job no longer needs to retrieve the auth info by meta id, which may throw an exception when the db or table has been dropped or renamed.
The persistence of 'AuthorizationInfo' takes effect in META_VERSION 56.
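A minimal sketch of the idea, assuming the class simply snapshots the db and table names a job needs for its privilege check; the field and method names below are illustrative, not necessarily the real ones:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch: the job keeps the names it needs for the auth check,
// so dropping or renaming the db/table later cannot break a lookup by meta id.
public class AuthorizationInfoSketch {
    private final String dbFullName;
    private final Set<String> tableNames;

    public AuthorizationInfoSketch(String dbFullName, Set<String> tableNames) {
        this.dbFullName = dbFullName;
        this.tableNames = new HashSet<>(tableNames);
    }

    public String getDbFullName() { return dbFullName; }

    public Set<String> getTableNames() { return tableNames; }

    public static void main(String[] args) {
        AuthorizationInfoSketch info =
                new AuthorizationInfoSketch("default_cluster:db1", Set.of("tbl1"));
        System.out.println(info.getDbFullName() + " " + info.getTableNames());
    }
}
```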
This can happen if the Doris cluster is deployed with only one medium, for example SSD,
but all tables are created with the HDD storage medium property. Then getLoadScore(SSD) will
always return 0.0, so no replica will be chosen when trying to delete redundant
replicas.
Currently, the load error log on BE is cleaned up along with the
intermediate load data, as configured by 'load_data_reserve_hours'.
Sometimes users want to keep the error log for a longer time.
Before the default insert operation was changed to streaming load, if the select result
of an insert stmt was empty, a label was still returned to the user, and the user
could use this label to check the insert load job's status.
After this change, if the select result is empty, an exception
is thrown to the user client directly, without any label.
This new usage pattern is not friendly to existing users, as it forces
them to change the way they use the insert operation.
So I add a new FE config 'using_old_load_usage_pattern', default false.
If set to true, a label is returned to the user even if the select result is empty.
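A minimal sketch of the intended behavior; only the config name comes from this change, while the surrounding class, method, exception and label format are hypothetical:

```java
// Hypothetical sketch of the empty-result handling for insert.
public class InsertEmptyResultHandling {

    static class Config {
        // only this config name comes from the change itself
        static boolean using_old_load_usage_pattern = false;
    }

    // Returns a label for the empty insert, or throws as the new default behavior does.
    static String onEmptySelectResult(String label) {
        if (Config.using_old_load_usage_pattern) {
            // old behavior: still return a label so the user can query the job status
            return label;
        }
        // new default behavior: report the empty result directly, no label
        throw new IllegalStateException("select result of insert stmt is empty");
    }

    public static void main(String[] args) {
        Config.using_old_load_usage_pattern = true;
        System.out.println(onEmptySelectResult("insert_example_label")); // label returned
    }
}
```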
The issue is as follows:
Request1:
BE aborts the txn.
The attachment of the txn is set.
The attachment of the txn is set to null, without holding the txn lock, because the task has been aborted by FE.
Request2:
BE aborts the txn again.
The attachment of the txn is set again.
Request1:
The attachment is not null, so the job tries to find the task and commit it.
The job cannot find the task, so it is paused. (NullPointerException)
In this commit, the commit request checks whether the task exists instead of checking that the txn attachment
is not null.
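A minimal sketch of the fix with hypothetical names: the commit path looks the task up by id and bails out if it is gone, instead of trusting a non-null attachment:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the race fix; class and method names are illustrative.
public class CommitTaskCheck {

    static class Task { /* per-backend load task state */ }

    private final Map<Long, Task> idToTask = new ConcurrentHashMap<>();

    // Old behavior: "attachment != null" implied the task still existed, which is
    // not true once the task was aborted and re-created concurrently.
    // New behavior: only commit when the task itself can still be found.
    boolean tryCommit(long taskId) {
        Task task = idToTask.get(taskId);
        if (task == null) {
            // task no longer exists (e.g. already aborted); ignore the commit
            // request instead of pausing the job with a NullPointerException
            return false;
        }
        // ... commit the task ...
        return true;
    }

    public static void main(String[] args) {
        CommitTaskCheck check = new CommitTaskCheck();
        check.idToTask.put(1L, new Task());
        System.out.println(check.tryCommit(1L)); // true
        System.out.println(check.tryCommit(2L)); // false: task missing, job not paused
    }
}
```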
Without the dbId parameter, the backend report version cannot be
updated when the publish task reports to FE, which may cause an incorrect
order of reports.
Related commit: 5c1b4f6
If one child of a binary predicate is a column and the other is a constant expr,
set the compatible type to the column's type.
eg:
k1(int):
... WHERE k1 = "123" --> k1 = cast("123" as int);
k2(varchar):
... WHERE 123 = k2 --> cast(123 as varchar) = k2
This optimization is for the case where some users use an int column to store dates, e.g. 20190703,
but query with the predicate col = "20190703".
Without casting "20190703" to int, the query optimizer cannot do partition pruning correctly.
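A minimal sketch of this casting rule using simplified stand-in types (not the real FE expression classes):

```java
// Simplified stand-ins for the analyzer types, just to show the rule:
// if one side is a column slot and the other a constant literal, the
// compatible type is the column's type and the literal gets the cast.
public class BinaryPredicateCastRule {

    enum Type { INT, VARCHAR }

    static class Expr { Type type; boolean isColumn; boolean isConstant; }

    static Type compatibleType(Expr left, Expr right) {
        if (left.isColumn && right.isConstant) {
            return left.type;   // e.g. k1 = "123"  -->  k1 = cast("123" as int)
        }
        if (right.isColumn && left.isConstant) {
            return right.type;  // e.g. 123 = k2    -->  cast(123 as varchar) = k2
        }
        // otherwise fall back to the normal type-compatibility rules (not shown)
        return left.type;
    }

    public static void main(String[] args) {
        Expr k1 = new Expr();  k1.type = Type.INT;       k1.isColumn = true;
        Expr lit = new Expr(); lit.type = Type.VARCHAR;  lit.isConstant = true;
        System.out.println(compatibleType(k1, lit)); // INT: the literal is cast to int
    }
}
```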
When a rollup table contains value columns of the REPLACE aggregation type,
all key columns of the base table should be included in this rollup.
Without all key columns, the order of rows is undefined, so the REPLACE result is non-deterministic.
For example, given a table with schema k1,k2,k3,v1(REPLACE) and rows:
1,2,3,A
1,2,4,B
1,2,5,C
A rollup with columns (k1,k2,v1) would contain:
1,2,A
1,2,B
1,2,C
No matter whether A, B, or C ends up as the final winner, the result is meaningless to the user.
Also fix a bug where setting the password for the current user on a non-master FE
throws a NullPointerException.
This commit changes idToTable in Database to a concurrent hash map, so there is no need to lock the database before getTable.
The database is locked in GlobalTxnManager, and the load job is locked after that.
So the lock sequence is: database, load manager, load job.
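A minimal sketch of the change, with the Database class and its members simplified:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified sketch: with a ConcurrentHashMap, getTable() is a thread-safe read,
// so callers no longer need to take the database lock just to look up a table.
public class DatabaseSketch {

    static class Table {
        final long id;
        final String name;
        Table(long id, String name) { this.id = id; this.name = name; }
    }

    private final Map<Long, Table> idToTable = new ConcurrentHashMap<>();

    public void addTable(Table table) {
        idToTable.put(table.id, table);
    }

    // No database lock needed for a plain lookup.
    public Table getTable(long tableId) {
        return idToTable.get(tableId);
    }

    public static void main(String[] args) {
        DatabaseSketch db = new DatabaseSketch();
        db.addTable(new Table(1001L, "tbl1"));
        System.out.println(db.getTable(1001L).name); // tbl1, read without locking the db
    }
}
```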