* Fix a bug where the '<=>' operator and the 'in' operator return wrong results
* Add comments to get_result_for_null
* Add a new binary operator to replace is_safe_for_null for handling the '<=>' operator
* Add EQ_FOR_NULL to TExprOpcode
* Remove the trailing backslash from the macro definition
Use the same UUID as the query ID and load ID of a load execution plan.
Each load execution plan has a load ID and, being a plan, also a query ID.
Using the same UUID for both the query ID and the load ID makes it easier to trace the load process.
Change the load ID when retrying a load execution plan.
When a load execution plan is retried, the load ID should be changed; otherwise the BE cannot
distinguish the old and new load requests.
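The idea, as a minimal sketch in Java; the thrift TUniqueId import and the surrounding class are assumptions for illustration, not the actual FE code:

```java
import java.util.UUID;
import org.apache.doris.thrift.TUniqueId; // assumed thrift-generated id type with (hi, lo) fields

// Sketch only: derive both the query ID and the load ID of a load execution plan
// from one UUID, and mint a fresh load ID when the plan is retried.
class LoadPlanIds {
    TUniqueId queryId;
    TUniqueId loadId;

    void init() {
        UUID uuid = UUID.randomUUID();
        queryId = new TUniqueId(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits());
        loadId = new TUniqueId(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits());
    }

    void onRetry() {
        // a new load ID lets the BE distinguish the retried request from the old one
        UUID uuid = UUID.randomUUID();
        loadId = new TUniqueId(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits());
    }
}
```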
Cancel the running loading task when cancelling the broker load.
When a user cancels a broker load, the running loading task should also be cancelled; otherwise
it may occupy a worker thread for a long time.
Remove unnecessary query reports when executing a load plan.
Only the last query report is needed.
Add a new BE config tablet_writer_rpc_timeout_sec.
It is used for the tablet sink RPC. The default is 600 seconds, which is long enough to flush
about 6 GB of data. The long timeout reduces the chance of hitting a 'fail to send batch' error when loading.
Use streaming_load_max_mb instead of mini_load_max_mb in BE config.
Add more logs to make it easier to trace a broker load process.
A default value of NULL is rejected in an INSERT INTO statement when a nullable column is not mentioned in the statement.
This is a bug, because the unmentioned column does have a default value; the insert should succeed even though that default is NULL.
So a column should be treated as having no default value only when it does not allow NULL and its default value is NULL.
The catalogue of load docs:
---- load-manual.md
---- broker-load-manual.md
---- insert-into-manual.md
---- stream-load-manual.md
This commit also changes max/min_stream_load_timeout to max/min_load_timeout.
The old config named stream_load_timeout actually denotes the max timeout for all types of load,
so the config name has been changed.
The txn attachment may be null when a broker load has been cancelled without an attachment.
The end log of the broker load has been recorded, but the callback id of the txnState has not been removed,
so the txn callback is executed when the 'txn aborted' log is replayed.
Operators want to know when a job is scheduled into the PENDING
and LOADING states, and how long it takes to finish these sub-states.
Also add two metrics on BE to monitor memtable flush time:
`memtable_flush_total` and `memtable_flush_duration_us`.
When the query result reaches its limit, the Coordinator in the FE sends a cancel
request to the BE to cancel the query. When being cancelled, the BE reports the
query status to the FE for debugging purposes, but this is not actually necessary
and generates too many logs.
So I add a CancelReason to distinguish 'normal'
cancellation from 'internal error' cancellation. If cancelled normally,
no status is reported from the BE.
When a query reaches its limit, or the user cancels it actively, it is cancelled 'normally'.
Otherwise, the query is cancelled due to an internal error, which requires
a report from the BE.
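A rough sketch of the distinction; the enum values and helper below are illustrative, not necessarily the real names used by the Coordinator:

```java
// Illustrative only: classify why a query was cancelled and decide whether
// the BE needs to report a status back to the FE.
enum CancelReason {
    LIMIT_REACH,    // query result reached its limit
    USER_CANCEL,    // user cancelled the query actively
    INTERNAL_ERROR  // something actually went wrong
}

class CancelReportPolicy {
    // only an internal-error cancellation needs a status report from the BE
    static boolean needReportFromBe(CancelReason reason) {
        return reason == CancelReason.INTERNAL_ERROR;
    }
}
```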
The miniLoadBegin function returns the txn_id.
If the backend sends a duplicate request to the frontend, the frontend returns the txn_id that was created by the same mini load.
The issue is that the frontend returns the txn_id even when the previous identical request has not yet begun the txn.
In that case the frontend returns zero, the initial txn_id, and the BE cannot execute the load plan with an invalid txn_id.
This commit combines `createLoadJob` and `execute` under the write lock, which protects the atomicity of `create` and `beginTxn`.
So a duplicate request cannot get the txn id before the previous identical request has finished.
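A sketch of the locking change, with a simplified class shape and hypothetical helper methods standing in for the real FE logic:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Sketch only: creating the load job and beginning its txn happen under one write
// lock, so a duplicate mini load request cannot observe a job whose txn_id is still 0.
class MiniLoadManager {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    long miniLoadBegin(String label) {
        lock.writeLock().lock();
        try {
            Long existingTxnId = findTxnIdByLabel(label);
            if (existingTxnId != null) {
                return existingTxnId; // duplicate request: txn already begun
            }
            long jobId = createLoadJob(label);
            return beginTxn(jobId, label); // txn_id is assigned before the lock is released
        } finally {
            lock.writeLock().unlock();
        }
    }

    // placeholders standing in for the real FE logic
    private Long findTxnIdByLabel(String label) { return null; }
    private long createLoadJob(String label) { return 1L; }
    private long beginTxn(long jobId, String label) { return 100L; }
}
```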
The FE uses partition_id to publish a version. The BE should check whether all tablets related to this partition have the version, but a Tablet on the BE does not have the partition id in its metadata, so the BE cannot perform this check.
This patch adds the partition id to the tablet meta during the report task.
Sync at most 10k tablets when setting tablet meta.
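A sketch of the batching, assuming a hypothetical syncer class; the 10k limit comes from the description above:

```java
import java.util.List;

// Sketch only: push tablet meta updates (such as the newly added partition id) to the
// BE in batches of at most 10k tablets per sync.
class TabletMetaSyncer {
    private static final int MAX_TABLETS_PER_SYNC = 10_000;

    void sync(List<Long> tabletIds) {
        for (int start = 0; start < tabletIds.size(); start += MAX_TABLETS_PER_SYNC) {
            int end = Math.min(start + MAX_TABLETS_PER_SYNC, tabletIds.size());
            sendBatch(tabletIds.subList(start, end));
        }
    }

    private void sendBatch(List<Long> batch) {
        // placeholder: build and send one update-tablet-meta task for this batch
        System.out.println("syncing " + batch.size() + " tablets");
    }
}
```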
Currently, GRANT_PRIV can only be granted at the global level, which means
it can only be granted on *.*. Granting it on db.* or db.tbl is not allowed.
This cannot meet the requirement of creating a user who has the privilege
to grant privileges to other users on a specified database or table, such as:
GRANT SELECT_PRIV ON db1.* TO cmy@'%';
So I extend the scope of GRANT_PRIV. Users can now grant GRANT_PRIV at the
database or even table level, such as:
GRANT GRANT_PRIV ON db1.* TO cmy@'%';
After being granted, the user cmy@'%' can then grant GRANT_PRIV on db1.* to
other users.
Mini load timeout needs to be added to the plan options.
The timeout property has been added to the process-put request.
Otherwise, the mini load timeout has no effect.
Also add logs of the label, txn id, and query id in mini load.
NOTE: This patch modifies all of a Backend's data,
which will make restarting the BE take a very long time.
So if you are upgrading your production environment,
you should upgrade the Backends one by one.
1. Refactor the BE to clarify the structure of the code.
2. Use a unique id to identify a rowset.
Naming a rowset with tablet_id and version leads to
many conflicts among compaction, clone, and restore.
3. Extract a rowset interface to encapsulate rowsets
with different formats.
The new class named 'AuthorizationInfo' is used to save the auth info in jobs.
The job no longer needs to retrieve the auth info by meta id, which may throw an exception when the db or table has been dropped or renamed.
The persistence of 'AuthorizationInfo' takes effect in META_VERSION 56.
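A rough sketch of the data the class carries; the field names are assumptions. The point is that the job snapshots the db and table names it needs for privilege checks at creation time instead of resolving them from meta ids later:

```java
import java.util.Set;

// Sketch only: auth info captured when the job is created, so later privilege
// checks do not depend on the db/table still existing under the same meta id.
class AuthorizationInfo {
    private final String dbName;
    private final Set<String> tableNames;

    AuthorizationInfo(String dbName, Set<String> tableNames) {
        this.dbName = dbName;
        this.tableNames = tableNames;
    }

    String getDbName() { return dbName; }
    Set<String> getTableNames() { return tableNames; }
}
```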
This can happen if the Doris cluster is deployed with only one medium, for example SSD,
but all tables are created with the HDD storage medium property. Then getLoadScore(SSD) will
always return 0.0, so no replica will be chosen when trying to delete redundant
replicas.
Before the default insert operation was changed to streaming load, if the select result
of an insert stmt was empty, a label was still returned to the user, and the user
could use this label to check the insert load job's status.
After the change to the insert operation, if the select result is empty, an exception
is thrown directly to the user client without any label.
This new usage pattern is not friendly to existing users, as it forces
them to change the way they use the insert operation.
So I add a new FE config 'using_old_load_usage_pattern', which defaults to false.
If set to true, a label is returned to the user even if the select result is empty.
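How the flag could gate the behavior, as a sketch; the surrounding class and the exception are placeholders, not the actual FE code:

```java
// Sketch only: when the select result of an insert is empty, either return a label
// (old behavior) or raise an error (new default), depending on the FE config.
class EmptyInsertResultHandler {
    // mirrors the FE config 'using_old_load_usage_pattern'; defaults to false
    static boolean usingOldLoadUsagePattern = false;

    static String handle(String label) {
        if (usingOldLoadUsagePattern) {
            return label; // old behavior: user can still poll the job status by label
        }
        throw new IllegalStateException("insert select returned no rows"); // new default
    }
}
```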
The issue is as follows:
Request1:
BE aborts the txn.
The txn attachment is set.
The attachment is then set to null without holding the txn lock, because the task has been aborted by the FE.
Request2:
BE aborts the txn again.
The txn attachment is set again.
Request1:
The attachment is not null, so the job tries to find the task and commit it.
The job cannot find the task, so it is paused (NullPointerException).
In this commit, the commit request checks whether the task exists instead of checking that the
txn attachment is not null.
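A sketch of the new check under stated assumptions; the task registry and method names are hypothetical:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: the commit path looks the task up by id and bails out if it is gone,
// instead of relying on the txn attachment being non-null.
class LoadTaskRegistry {
    private final Map<Long, Object> idToTask = new ConcurrentHashMap<>();

    void onCommitRequest(long taskId) {
        Object task = idToTask.get(taskId);
        if (task == null) {
            // task no longer exists (e.g. already aborted by the FE); ignore the stale commit
            return;
        }
        // ... proceed to commit the task ...
    }
}
```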
Without the dbId parameter, the backend report version cannot be
updated when a publish task reports to the FE, which may cause an incorrect
order of reports.
Related commit: 5c1b4f6
If one child of a binary predicate is a column and the other is a constant expr,
set the compatible type to the column's type.
eg:
k1(int):
... WHERE k1 = "123" --> k1 = cast("123" as int);
k2(varchar):
... WHERE 123 = k2 --> cast(123 as varchar) = k2
This optimization is for the case where users store dates in an int column, e.g. 20190703,
but query with the predicate: col = "20190703".
Without casting "20190703" to int, the query optimizer cannot do partition pruning correctly.
When a rollup table contains value columns of the REPLACE aggregation type,
all key columns of the base table should be included in this rollup.
Without all key columns, the order of rows is undefined.
For example, given a table with schema k1,k2,k3,v1(REPLACE) and rows:
1,2,3,A
1,2,4,B
1,2,5,C
A rollup with columns (k1,k2,v1) would contain:
1,2,A
1,2,B
1,2,C
No matter whether A, B, or C ends up as the last winner, the result is meaningless to the user.
Also fix a bug where setting the password for the current user on a non-master FE
throws a NullPointerException.
This commit changes idToTable in Database to a ConcurrentHashMap, so there is no need to lock the database before getTable.
The database is locked in GlobalTxnManager, and the load job is locked after that.
So the lock sequence is: database, load manager, load job.
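A minimal sketch of the change; the Table type and surrounding class are simplified, not the actual FE Database class:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: idToTable becomes a ConcurrentHashMap, so getTable no longer needs
// the database lock; writers go through the map's own thread-safe operations.
class Database {
    static class Table {}

    private final Map<Long, Table> idToTable = new ConcurrentHashMap<>();

    Table getTable(long tableId) {
        return idToTable.get(tableId); // safe without holding the database lock
    }

    void registerTable(long tableId, Table table) {
        idToTable.put(tableId, table);
    }
}
```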
The previous timeout setting of a publish version task was a mess.
I change it to a configurable value, defaulting to 30 seconds.
And when the table is under rollup or schema change, I double this timeout.
This is a kind of best-effort optimization: with a short timeout,
a replica's publish version task is more likely to fail, and if a quorum of
a tablet's replicas fail to publish, the alter job will fail.
If the table is not under rollup or schema change, the failure of a replica's
publish version task has only a minor effect, because the replica can be repaired
by the tablet repair process very soon. But the tablet repair process will not
repair rollup replicas.
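The timeout policy, as a sketch; the field name and the alter-state check are assumptions rather than the actual config name:

```java
// Sketch only: a configurable publish version timeout, doubled while the table is
// under rollup or schema change.
class PublishVersionTimeoutPolicy {
    // configurable default timeout for a publish version task, in seconds
    static int publishVersionTimeoutSecond = 30;

    long timeoutMsFor(boolean tableUnderRollupOrSchemaChange) {
        long timeoutMs = publishVersionTimeoutSecond * 1000L;
        if (tableUnderRollupOrSchemaChange) {
            timeoutMs *= 2; // give replicas more time while an alter job is running
        }
        return timeoutMs;
    }
}
```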