doris

Author	SHA1	Message	Date
Mingyu Chen	c39d35df4c	Add tablet compaction score metrics (#2427 ) [Metric] Add tablet compaction score metrics Backend: Add metric "tablet_max_compaction_score" to monitor the current max compaction score of tablets on this Backend. This metric will be updated each time the compaction thread picking tablets to compact. Frontend: Add metric "tablet_max_compaction_score" for each Backend. These metrics will be updated when backends report tablet. And also add a calculated metric "max_tablet_compaction_core" to monitor the max compaction core of tablets on all Backends.	2019-12-12 17:46:59 +08:00
xy720	7f2144e7e5	Upgrade JMockit from version 1.13 to 1.48 (#2423 )	2019-12-12 12:03:17 +08:00
Mingyu Chen	8e6535053c	[Tag System] Remove the 'isRestore' flag when creating table or partition (#2363 ) 'isRestore' flag is for the old version of backup and restore process, which is deprecated long time ago. Remove it. This commit is also for making a further step to ISSUE #1723.	2019-12-10 16:37:44 +08:00
Mingyu Chen	a3b7cf484b	Set the load channel's timeout to be the same as the load job's timeout (#2405 ) [Load] When performing a long-time load job, the following errors may occur. Causes the load to fail. load channel manager add batch with unknown load id: xxx There is a case of this error because Doris opened an unrelated channel during the load process. This channel will not receive any data during the entire load process. Therefore, after a fixed timeout, the channel will be released. And after the entire load job is completed, it will try to close all open channels. When it try to close this channel, it will find that the channel no longer exists and an error is reported. This CL will pass the timeout of load job to the load channel, so that the timeout of load channels will be same as load job's.	2019-12-06 21:51:00 +08:00
Mingyu Chen	55d64e3be8	Remove the readFields() method in Writable interface (#2394 ) All classes that implement the Wriable interface need only implement the write() method. The read() method should be implemented by itself according to the situation of different classes.	2019-12-06 21:46:21 +08:00
Mingyu Chen	a46bf1ada3	[Authorization] Modify the authorization checking logic (#2372 ) Authorization checking logic There are some problems with the current password and permission checking logic. For example: First, we create a user by: `create user cmy@"%" identified by "12345";` And then 'cmy' can login with password '12345' from any hosts. Second, we create another user by: `create user cmy@"192.168.%" identified by "abcde";` Because "192.168.%" has a higher priority in the permission table than "%". So when "cmy" try to login in by password "12345" from host "192.168.1.1", it should match the second permission entry, and will be rejected because of invalid password. But in current implementation, Doris will continue to check password on first entry, than let it pass. So we should change it. Permission checking logic After a user login, it should has a unique identity which is got from permission table. For example, when "cmy" from host "192.168.1.1" login, it's identity should be `cmy@"192.168.%"`. And Doris should use this identity to check other permission, not by using the user's real identity, which is `cmy@"192.168.1.1"`. Black list Functionally speaking, Doris only support adding WHITE LIST, which is to allow user to login from those hosts in the white list. But is some cases, we do need a BLACK LIST function. Fortunately, by changing the logic described above, we can simulate the effect of the BLACK LIST. For example, First we add a user by: `create user cmy@'%' identified by '12345';` And now user 'cmy' can login from any hosts. and if we don't want 'cmy' to login from host A, we can add a new user by: `create user cmy@'A' identified by 'other_passwd';` Because "A" has a higher priority in the permission table than "%". If 'cmy' try to login from A using password '12345', it will be rejected.	2019-12-06 17:45:56 +08:00
HaiBo Li	9fbc1c7ee6	Support where/orderby/limit after “SHOW ALTER TABLE COLUMN“ syntax (#2380 ) Features： 1、Support WHERE/ORDER BY/LIMIT 2、Columns：TableName、CreatTime、FinishTime、State 3、Only “And” between conditions 4、TableName and State column only support "=" operator 5、CreateTime and FinishTime column support “=”,“>=”,"<=",">","<","!=" operators 6、CreateTime and FinishTime column support Date and DateTime string, eg:"2019-12-04" or "2019-12-04 17:18:00" TestCase: MySQL [haibotest]> show alter table column where State='FINISHED' and CreateTime > '2019-12-03' order by FinishTime desc limit 0,2; +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ \| JobId \| TableName \| CreateTime \| FinishTime \| IndexName \| IndexId \| OriginIndexId \| SchemaVersion \| TransactionId \| State \| Msg \| Progress \| Timeout \| +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ \| 11134 \| test_schema_2 \| 2019-12-03 19:21:42 \| 2019-12-03 19:22:11 \| test_schema_2 \| 11135 \| 11059 \| 1:192010000 \| 3 \| FINISHED \| \| N/A \| 86400 \| \| 11096 \| test_schema_3 \| 2019-12-03 19:21:31 \| 2019-12-03 19:21:51 \| test_schema_3 \| 11097 \| 11018 \| 1:2063361382 \| 2 \| FINISHED \| \| N/A \| 86400 \| +-------+---------------+---------------------+---------------------+---------------+---------+---------------+---------------+---------------+----------+------+----------+---------+ 2 rows in set (0.00 sec)	2019-12-06 16:24:44 +08:00
yangzhg	597a8b2146	Revert "Fix arithmetic operation between numeric and non-numeric (#2362 )" (#2398 ) This reverts commit 6857ffe1c5976ef06003aa479279368bafc581f1.	2019-12-06 14:58:38 +08:00
yangzhg	6857ffe1c5	Fix arithmetic operation between numeric and non-numeric (#2362 ) fix arithmetic operation between numeric and non-numeric will cause unexpected value. After this patch you will get mysql> select 1 + "kks"; +-----------+ \| 1 + 'kks' \| +-----------+ \| 1 \| +-----------+ 1 row in set (0.02 sec) mysql> select 1 - "kks"; +-----------+ \| 1 - 'kks' \| +-----------+ \| 1 \| +-----------+ 1 row in set (0.01 sec)	2019-12-06 10:33:06 +08:00
EmmyMiao87	27d6794b81	Support subquery with non-scalar result in Binary predicate and Between-and predicate (#2360 ) This commit add a new plan node named AssertNumRowsNode which is used to determine whether the number of rows exceeds the limit. The subquery in Binary predicate and Between-and predicate should be added a AssertNumRowsNode which is used to determine whether the number of rows in subquery is more than 1. If the number of rows in subquery is more than 1, the query will be cancelled. For example: There are 4 rows in table t1. Query: select c1 from t1 where c1=(select c2 from t1); Result: ERROR 1064 (HY000): Expected no more than 1 to be returned by expression select c2 from t1 ISSUE-2270 TPC-DS 6,54,58	2019-12-05 21:27:33 +08:00
Mingyu Chen	4f39d405ee	Fix some load bugs (#2384 ) For #2383 1. Limit the concurrent transactions of routine load job 2. Create new routine load task when txn is VISIBLE, not after COMMITTED. For #2267 1. All non-master daemon thread should also be started after catalog is ready. For #2354 1. `fixLoadJobMetaError()` should be called after all meta data is read, including image and edit logs. 2. Mini load job should set to CANCELLED when corresponding transaction is not found, instead of UNKNOWN.	2019-12-05 13:41:04 +08:00
Mingyu Chen	c8cff85c94	Fixed a bug that HttpServer in unit test does not start correctly. (#2361 ) Because the http client in unit test try to connect to the server when server is not ready yet.	2019-12-03 20:34:16 +08:00
Mingyu Chen	086bb82fd2	Fixed a bug that Load job's state is incorrect when upgrading from 0.10.x to 0.11.x (#2356 ) There is bug in Doris version 0.10.x. When a load job in PENDING or LOADING state was replayed from image (not through the edit log), we forgot to add the corresponding callback id in the CallbackFactory. As a result, the subsequent finish txn edit logs cannot properly finish the job during the replay process. This results in that when the FE restarts, these load jobs that should have been completed are re-entered into the pending state, resulting in repeated submission load tasks. Those wrong images are unrecoverable, so that we have to reset all load jobs in PENDING or LOADING state when restarting FE, depends on its corresponding txn's status, to avoid submit jobs repeatedly. If corresponding txn exist, set load job' state depends on txn's status. If txn does not exist, may be the txn has been removed due to label expiration. So that we don't know the txn is aborted or visible. So we have to set the job's state as UNKNOWN, which need handle it manually.	2019-12-03 16:02:50 +08:00
lichaoyong	875790eb13	Remove VersionHash used to comparation in Fe (#2335 )	2019-12-02 19:59:13 +08:00
Mingyu Chen	a2d7c42042	Add a variable to specifically limit the memory usage of the load part in the insert operation (#2305 ) This variable is mainly for INSERT operation, because INSERT operation has both query and load part. Using only the exec_mem_limit variable does not make a good distinction of memory limit between the two parts.	2019-11-28 13:03:11 +08:00
Mingyu Chen	d5aeb9a6b7	Add document for session variables. (#2284 ) Also make the variable effective in current session when setting it globally.	2019-11-24 22:47:05 +08:00
Mingyu Chen	46181c0880	Fix some bugs about load label (#2241 )	2019-11-23 00:04:45 +08:00
xy720	79ff0ad2a4	Add pipes_as_concat_mode (#2252 ) This commit will add a new sql mode named MODE_PIPES_AS_CONCAT: Description: 1、If this mode is active, '\|\|' will be handled different from the original way ('\|\|' and 'or' are seen as the same symbols in Doris) that it can be used to concat two exps and returns a new string. For example, 'a' \|\| 'b' = 'ab' and 1 \|\| 0 = '10'. 2. User can active this mode by "SET sql_mode = PIPES_AS_CONCAT", and deactive it by "SET sql_mode = '' ".	2019-11-22 15:01:53 +08:00
Mingyu Chen	297542bd3f	Delay start master only daemon threads (#2268 ) These daemon thread should be started after catalog is ready, otherwise it may cause some undefined behavior.	2019-11-22 14:39:37 +08:00
caiconghui	4fb498a1dc	fix unit test failure for show columns from unknown table (#2261 )	2019-11-21 21:38:36 +08:00
Mingyu Chen	9b5eeaec19	Fix bug that DeployManager should start working after catalog is ready. (#2244 ) Otherwise, it can not get master ip/port from not-ready catalog.	2019-11-20 09:49:09 +08:00
HangyuanLiu	48d9318d07	Support date_add function to support partition prune (#2154 ) Currently in the date_add/date_sub functions (DATE_ADD(DATETIME date,INTERVAL expr type)), the expr parameter is the interval you want to add. Doris will convert these functions to xxx_sub/xxx_add. However, there is only the days_add function in fe, which causes other date_add formats, such as select date_add('2010-11-30 23:59:59', INTERVAL 2 DAY), cannot be pruned. So I've added other functions to support fe partition prune	2019-11-08 18:57:21 +08:00
Mingyu Chen	af79485eb2	Ignore --helper start argument if not first time to start FE (#2159 )	2019-11-08 08:48:11 +08:00
ZHAO Chun	89dc461f91	Fix UT and remove unused code (#2160 )	2019-11-08 08:47:48 +08:00
Mingyu Chen	5a4908e99a	Forward stmt with stmt id generated on origin FE. (#2129 ) Some stmt, such as DDL and DML stmt will be forwarded from non-master FE to Master FE. But these stmt will be logged in non-master FE's audit log with its origin stmt id generated on non-master FE. So we should also pass this origin stmt id to Master, so that we can track this stmt's execution process more easily.	2019-11-07 10:28:15 +08:00
ZHAO Chun	65c3b0907a	Support aggregation type of REPLACE_IF_NOT_NULL (#2127 ) Some use has the requirment that only some of columns will be update in one load operation, and others will retain as original. However, Doris can't handle this situation, because user must specify value for all columns. Then if a column aggregation method is REPLACE, use must query original value to overwrite it. This often needs some work for user to do. If this CL is applied, user can use REPLACE_IF_NOT_NULL instead of REPLACE. Then when load data to table, if user don't intent to change value of this column, user can specify NULL for this column. Doris will retain original value for this column.	2019-11-05 18:08:34 +08:00
xy720	ac5dd0c9f2	Support sql mode (#2083 ) At present, we do not support SQL MODE which is similar to MySQL. In MySQL, SQL MODE is stored in global session and session with a 64 bit address，and every bit 0 or 1 on this address represents a mode state. Besides, MySQL supports combine mode which is composed of several modes. We should support SQL MODE to deal with sql dialect problems. We can heuristically use the MySQL way to store SQL MODE in session and parse it into string when we need to return it back to client. This commit suggests a solution to support SQL MODE. But it's just a sample, and the mode types in SqlModeHelper.java are not really meaningful from now on.	2019-11-01 23:21:00 +08:00
Mingyu Chen	45df6aae08	Fix some routine load bugs (#2093 ) Mainly fix the following issues: 1. A null pointer exception is raised when a database or table is dropped. The expected behavior is that the routine load job is stopped. 2. Memory leaks. Batch routine load task submissions are no longer performed, and modifications are submitted separately for each task. 3. Unreasonable task timeout. Routine load tasks should not be queued in the BE thread pool for execution. The task sent to the BE should be executed immediately, otherwise the task in the FE will be timeout first. Eventually leads to constant timeout for all subsequent tasks. 4. All routine load job should be scheduled once it being submitted. Not waiting the available BE slot. Otherwise, all later submitted jobs may not be scheduled forever.	2019-10-31 21:53:03 +08:00
Mingyu Chen	5e8c96f28b	Optimize FE start logic (#2052 )	2019-10-31 11:11:50 +08:00
Mingyu Chen	3c12af4dcc	Limit the memory consumption of broker scan node (#1996 ) If memory exceed limit, no more row batch will be pushed to batch queue	2019-10-17 14:40:16 +08:00
EmmyMiao87	ac16318c9b	[Bug-fix][Broker-load] Fix the bug of the label already exists when the txn has been finished (#1992 ) If FE is restarted between txn committed and visible, the load job will be rescheduled and failed with label already exists. The reason is that there are inconsistency between transaction of load job and meta of load job. So, the replay of the txn attachment need to be done in function replayOnCommitted. The load job state and progress is correct after that.	2019-10-16 16:35:18 +08:00
Mingyu Chen	41e55cfca9	Modify fixed partition feature (#1989 ) 1. Not support MAVALUE in multi partition column. 2. Fix the incorrect show create table stmt.	2019-10-16 16:03:46 +08:00
worker24h	ec7c8a2c6f	Support adding fixed range partition eg: ALTER TABLE test_table ADD PARTITION p0125 VALUES [("20190125"), ("20190126"));	2019-10-15 09:50:30 +08:00
yangzhg	463b462b8d	Add create_time to information_schema.tables	2019-10-12 21:45:14 +08:00
HangyuanLiu	74d6d04e01	Fix two digit year bug in to_days function (#1839 )	2019-09-20 22:59:05 +08:00
xy720	e516eba940	Remove the "author" tag (#1829 )	2019-09-19 16:59:08 +08:00
Mingyu Chen	714dca8699	Support table comment and column comment for view (#1799 )	2019-09-18 09:45:28 +08:00
Mingyu Chen	9aa2045987	Refactor alter job (#1695 )	2019-09-12 16:31:29 +08:00
Mingyu Chen	76987275b9	Fix result of unix_timestamp() (#1727 )	2019-08-30 21:39:16 +08:00
kangkaisen	3a33f3d350	Make bitmap_union agg column support insert into and broker load (#1721 )	2019-08-30 14:44:51 +08:00
Mingyu Chen	378ce8ca04	Use double when converting TIME type value (#1722 ) TIME type value is saved in DOUBLE, so using int64 can extend the time range.	2019-08-29 21:19:19 +08:00
HangyuanLiu	0c2e344f45	Refactor DateLiteral class in FE (#1644 ) 1. Add FE time zone function support 2. Refactor DateLiteral class in FE ISSUE #1583	2019-08-27 22:20:06 +08:00
Mingyu Chen	7e981b2b14	Limit the disk usage to avoid running out of disk capacity (#1702 ) Set high watermark and flood stage of disk used capacity. And forbid some operations if disk usage is too high.	2019-08-27 22:18:17 +08:00
Mingyu Chen	b6b860c808	Make the max recursion depth of distribution pruner configurable (#1709 ) Add a new FE config 'max_distribution_pruner_recursion_depth'.	2019-08-27 22:17:07 +08:00
EmmyMiao87	b28f4242c3	Add config max_concurrent_task_num_per_be (#1693 ) This config is used to control the max concurrent task num per be. The cluster max concurrent task num = max_concurrent_task_num_per_be * number of be.	2019-08-24 00:56:40 +08:00
Mingyu Chen	00f8040bf3	Fix bug that 2 same stream load jobs may both be able to executed successfully (#1690 ) This will cause 2 jobs trying to write same file, and cause file damaged.	2019-08-22 19:38:16 +08:00
Mingyu Chen	2b2bc82ae2	Add timeout on snapshot of data (#1672 ) Release snapshot when finishing or cancelling backup/restore job. Snapshot may takes a lot disk space if not releasing them in time.	2019-08-21 21:18:53 +08:00
yuanli	ba6d728f26	Enable parsing columns from file path for Broker Load (#1582 ) (#1635 ) Currently, we do not support parsing encoded/compressed columns in file path, eg: extract column k1 from file path /path/to/dir/k1=1/xxx.csv This patch is able to parse columns from file path like in Spark(Partition Discovery). This patch parse partition columns at BrokerScanNode.java and save parsing result of each file path as a property of TBrokerRangeDesc, then the broker reader of BE can read the value of specified partition column.	2019-08-19 09:39:21 +08:00
Mingyu Chen	6d73658207	Support checking error data row when doing INSERT (#1597 ) If strict mode is true, and at least one row is filtered, the insert operation will fail and a url will be given to get the error rows. ``` ERROR 1064 (HY000): all partitions have no load data. url: http://host:ip/api/_load_error_log?file=__shard_2/error_log_insert_stmt_e0a620e93dc54461-b89ec64768367d25_e0a620e93dc54461_b89ec64768367d25 ``` If all rows are good, insert will return OK with affected rows: ``` Query OK, 1 row affected (0.26 sec) ``` If strict mode is false, and at least one row is good, the insert operation will return OK with affected rows and warnings. If has error row num, a label will be returned: ``` Query OK, 1 row affected, 1 warning (0.32 sec) {'label':'7d66c457-658b-4a3e-bdcf-8beee872ef2c'} ```	2019-08-16 21:40:29 +08:00
ZHAO Chun	b85bd334de	Remove tempory fail UT (#1659 )	2019-08-16 11:26:41 +08:00

1 2 3 4

167 Commits