doris

Author	SHA1	Message	Date
Youngwb	f7a05d8580	Support setting timezone variable in FE (#1587 )	2019-08-07 09:25:26 +08:00
Mingyu Chen	343b913f0d	Fix a serious bug that causes all replicas to be deleted. (#1589 ) Revert commit: eda55a7394fcec2f7b6c0aefd1628f9d63911815	2019-08-06 19:23:53 +08:00
ZHAO Chun	b2e678dfc1	Support Segment for BetaRowset (#1577 ) We create a new segment format for BetaRowset. The new format merges the data file and index file into one file. We also create a new format for the short key index. In the original code, the index is stored in a RowCursor-like format, which is not efficient to compare. Now we encode multiple columns into a binary string, and we ensure that this binary string sorts in the same order as the key columns.	2019-08-06 17:15:11 +08:00
yiguolei	ec7b9e421f	Acquire tablet map write lock during tablet gc (#1588 )	2019-08-06 17:14:39 +08:00
Dayue Gao	d938f9a6ea	Implement the initial version of BetaRowset (#1568 )	2019-08-06 10:40:16 +08:00
Mingyu Chen	eda55a7394	Fix bug that a replica cannot be deleted if its version is missing (#1585 ) If there is a redundant replica on a BE whose version is missing, the tablet report logic cannot drop it correctly.	2019-08-05 16:19:05 +08:00
Mingyu Chen	93a3577baa	Support multiple partition columns when creating table (#1574 ) When creating a table with the OLAP engine, users can specify multiple partition columns, e.g.: PARTITION BY RANGE(`date`, `id`) ( PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"), PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"), PARTITION `p201703_all` VALUES LESS THAN ("2017-04-01") ) Note that loading via a Hadoop cluster does not support tables with multiple partition columns.	2019-08-05 16:16:43 +08:00
EmmyMiao87	938c6d4cdf	Throw TabletQuorumFailedException in commitTxn (#1575 ) The TabletQuorumFailedException is thrown in commitTxn when the number of successful replicas of a tablet is less than the quorum replica number. The Hadoop load does not handle this exception because the push task will retry later. The streaming broker, insert, stream and mini load will catch this exception and then abort the txn.	2019-08-04 15:54:03 +08:00
ZHAO Chun	6c21a5a484	Switch MAKE_TEST off in build.sh (#1579 )	2019-08-03 22:49:35 +08:00
Mingyu Chen	cefe1794d4	Fix bug that replicas of a tablet may be located on the same host (#1517 ) Doris supports deploying multiple BEs on one host. So when allocating BEs for the replicas of a tablet, we should select different hosts. But there is a bug in the tablet scheduler: the same host may be selected more than once for one tablet. This patch fixes this problem. The places related to this problem are: 1. Create Table: there is no bug in the Create Table process. 2. Tablet Scheduler: fixed when selecting a BE for REPLICA_MISSING and REPLICA_RELOCATING, and fixed when balancing tablets. 3. Colocate Table Balancer: fixed when selecting a BE for repairing the colocate backend sequence. Not fixed in colocate group balance; this is left to colocate repairing. 4. Tablet report: the tablet report may add a replica to the catalog. The host is not checked here; the Tablet Scheduler will fix it.	2019-08-01 10:26:06 +08:00
EmmyMiao87	9128af6499	Broker load hang when rpc failed (#1567 ) Broker load hangs on the broker reader when the thrift request between the broker and BE fails.	2019-07-31 19:03:38 +08:00
ZHAO Chun	c5edf9dae0	Unify Field and ColumnSchema in Storage (#1561 ) Currently, we have Field and ColumnSchema to access column data in a row. These two classes are mostly the same, so we should unify them into one class. Field currently holds offset information, which is a row attribute, so we remove the offset from Field. RowCursor currently contains some logic that belongs to Schema, so this patch adds a Schema attribute to RowCursor to simplify RowCursor. After this change, only Schema handles Field/ColumnSchema. I extract some logic from RowCursor to be/src/olap/row.h so that the same logic can handle different types of rows. Each type of row has the same function to get a Cell of the row. A cell represents a column's content with a null indicator.	2019-07-30 14:01:57 +08:00
EmmyMiao87	8bc8fcffae	Fix NullPointerException when creating mini load in LoadManager (#1565 ) The catch statement in the function createMiniLoad cancels the load job. But sometimes the load job has not been created yet when the catch statement runs, so cancelling it throws a NullPointerException. This commit fixes this bug.	2019-07-30 12:52:14 +08:00
Dayue Gao	e0d991f4dc	Remove unreachable code in EnginePublishVersionTask (#1562 )	2019-07-30 12:35:21 +08:00
chenhao	2cb82c57bb	Fix bug that <=> operator and in operator get wrong result (#1516 ) * Fix bug that <=> operator and in operator get wrong result * Add some comments to get_result_for_null * Add a new Binary Operator to replace is_safe_for_null for handling the '<=>' operator * Add EQ_FOR_NULL to TExprOpcode * Remove the trailing backslash of the macro definition	2019-07-30 11:17:53 +08:00
Mingyu Chen	97718a35a2	Do not get file size in Broker openReader() method (#1560 ) The file size is already obtained when listing files. Getting the file size again in openReader() is unnecessary and inefficient.	2019-07-29 23:05:01 +08:00
Mingyu Chen	99836f0d7c	Modify load docs (#1558 ) Make it work with documentation website	2019-07-29 15:48:59 +08:00
Yingchun Lai	011bdcd641	Bump thirdparty's BZIP2 version to 1.0.8 (#1559 )	2019-07-29 00:10:44 +08:00
Mingyu Chen	0694b6a6fa	Fix bugs of Broker load (#1546 ) Use the same UUID as the query ID and load ID of a load execution plan. Each load execution plan has a load ID, and as a plan, it also has a query ID. We can use the same UUID for both, to trace the load process more easily. Change the load ID when retrying a load execution plan. When a load execution plan is retried, the load ID should be changed, otherwise BE cannot distinguish the old and new load requests. Cancel the running loading task when cancelling the broker load. When a user cancels a broker load, the running loading task should also be cancelled, or it may occupy the worker thread for a long time. Remove the unnecessary query reports when running a load execution plan. Only the last query report is needed. Add a new BE config tablet_writer_rpc_timeout_sec. It is used for the RPC of the tablet sink. The default is 600 seconds, which is long enough for flushing about 6GB of data. The long timeout reduces the possibility of encountering the "fail to send batch" error when loading. Use streaming_load_max_mb instead of mini_load_max_mb in BE config. Add more logs to trace a broker load process more easily.	2019-07-27 20:17:05 +08:00
EmmyMiao87	8a7fe521d6	Allow the null default in insert into stmt (#1556 ) A null default value was rejected in an insert into stmt when a nullable column was not mentioned in the stmt. This is a bug because the unmentioned column has a default value. The values should be inserted successfully even though the default value is null. So a column is treated as having no default value only when the column does not allow null and its default value is null.	2019-07-26 21:32:00 +08:00
chenhao	abda544d3c	Fix bug that getting compatible type for TIME with other types fails (#1544 )	2019-07-26 19:10:04 +08:00
ZHAO Chun	6c8d34fa70	Fix bug which make BE crash when load HLL type (#1552 )	2019-07-26 11:22:08 +08:00
kangpinghuang	e8561d71a6	Add dict page (#1409 ) Add a dict encoding page for binary/string type data. Construct a dict for the original data, and save encoded ids instead of the original data to save space. If the dict is too big, the page automatically falls back to plain encoding.	2019-07-26 09:47:11 +08:00
EmmyMiao87	000e9cf53c	Add administrator guide of load (#1488 ) The catalogue of load docs: ---- load-manual.md ---- broker-load-manual.md ---- insert-into-manual.md ---- stream-load-manual.md This commit also changes max/min_stream_load_timeout to max/min_load_timeout. The old config named stream_load_timeout means the max timeout suited for all types of load. So the config name has been changed.	2019-07-25 21:02:32 +08:00
EmmyMiao87	473d69e8f8	Fix the mistake in docs of rollup (#1551 )	2019-07-25 20:53:41 +08:00
ZHAO Chun	dbc912d2df	Unify ColumnSchemaV2 and ColumnSchema to one (#1545 ) Currently, we have two versions of ColumnSchema, in this patch, we unify these two classes to one class.	2019-07-25 10:48:16 +08:00
kangpinghuang	8160232097	Fix miss delete predicate when clone (#1541 ) related issue #1539	2019-07-25 09:16:44 +08:00
EmmyMiao87	e29eceae0a	Fix the null pointer exception when ReplayOnAborted of txn in broker load (#1543 ) The txn attachment may be null when a broker load has been cancelled without an attachment. The end log of the broker load has been recorded but the callback id of txnState hasn't been removed, so the callback of the txn is executed when the txn aborted log is replayed.	2019-07-24 22:17:55 +08:00
ZHAO Chun	0805b05d81	Remove unused FieldInfo (#1540 )	2019-07-24 19:33:30 +08:00
worker24h	4f4c8d1824	Fix Bug: Load fail when we don't specify format type. (#1538 )	2019-07-24 15:53:00 +08:00
ZHAO Chun	fde3941185	Remove unused code (#1537 )	2019-07-24 14:48:01 +08:00
HangyuanLiu	7c24bf38bc	Show load statement support offset (#1531 ) Such as `show load order by createtime desc limit 1,2`	2019-07-24 13:27:21 +08:00
ZHAO Chun	a6f0b5c789	Change RowsetWriter num_rows() return int64_t (#1535 )	2019-07-24 10:44:38 +08:00
kangpinghuang	9e2b93a8e2	Fix rowset build validate failure (#1532 ) The validate failure occurs because the cloned files' names may conflict, and loading a segment reads files through a cache whose key is the file name, so the index may be read from the wrong file. The solution is to load the index without using the file handle cache.	2019-07-24 10:08:39 +08:00
ZHAO Chun	68782be7a6	Refactor storage aggregate framework (#1529 ) Add AggregateInfo to enclose all functions that used to aggregate value column.	2019-07-24 10:02:35 +08:00
Mingyu Chen	a88b55e649	Add more logs and metrics to trace the broker load process (#1530 ) The operator wants to know when the job is scheduled as PENDING and LOADING, and how long it takes to finish these sub states. Also add 2 metrics on BE to monitor the memtable's flush time: `memtable_flush_total` and `memtable_flush_duration_us`.	2019-07-23 21:42:44 +08:00
Mingyu Chen	69040572fb	Use different ID instead of table ID for base index of an OLAP table (#1524 )	2019-07-23 15:48:45 +08:00
yiguolei	c34b35e6c4	Add ALTER_TABLET task in be (#1497 ) This is for the new implementation of the alter table process.	2019-07-23 15:16:21 +08:00
HangyuanLiu	4aedaea84e	Support TIME type and timediff function (#1505 )	2019-07-23 13:42:39 +08:00
Mingyu Chen	221cd2e103	Fix bug that user with LOAD_PRIV can see load job by SHOW LOAD stmt (#1528 ) User should have LOAD_PRIV to use SHOW LOAD stmt, not SHOW_PRIV.	2019-07-23 08:48:23 +08:00
ZHAO Chun	0c8e91adf4	Add storage rowwise iterator (#1515 ) Use RowwiseIterator to unify all data fetching in the storage engine. All objects in the storage engine, for example Segment and Rowset, can be read through an iterator. This patch implements two generic iterators: UnionRowwiseIterator and MergeRowwiseIterator. These two classes take other iterators as their inputs. To implement the iterators, we define a new class RowBlockV2; all data read from an iterator is in this format. We define a new class rather than reuse the old RowBlock because we want the old code to keep working normally.	2019-07-22 14:35:11 +08:00
WingC	cd7ab5af0b	Fix variable arguments bug in UDAF (#1523 )	2019-07-21 23:11:56 +08:00
Mingyu Chen	7b019ab37f	Fix bug that WrapperField does not consider HLL column type when creating (#1514 ) This bug may cause BE to crash when handling HLL columns in some processes. This bug was introduced by a code merge. Version 0.10 does not have this bug.	2019-07-19 18:19:23 +08:00
Mingyu Chen	6e1ccbc542	Fix index.rst file for aggregation-function SQL reference docs (#1518 )	2019-07-19 18:16:50 +08:00
kangpinghuang	74eb43206d	Fix segment group add zone check bug and remove unused meta log (#1513 )	2019-07-19 17:03:19 +08:00
lichaoyong	227af49331	Fix rollup bug when init RowCursor in MergeContext (#1510 ) When doing rollup, seek_columns equals to the complete set of tablet's columns. There is no necessity to set it. Related to commit 36df6ebe4e5f0abd3f07c1e454710590f1de23c7	2019-07-19 14:32:58 +08:00
Mingyu Chen	6c1f95c3a0	Fix bug that BE may crash when closing OlapTableSink (#1507 ) The `_profile` in OlapTableSink may not be initialized if the `prepare()` method is not called. So when closing the OlapTableSink, we should check whether `_profile` is initialized.	2019-07-19 10:30:44 +08:00
Mingyu Chen	556299aae9	Remove query status report from BE when query is cancelled normally (#1489 ) When a query result reaches the limit, the Coordinator in FE sends a cancel request to BE to cancel the query. When cancelled, BE reports the query status to FE for debugging purposes. But this is not actually necessary and generates too many logs. So I add a CancelReason to distinguish 'normal' cancellation from 'internal error' cancellation. If the query is cancelled 'normally', no status is reported from BE. A query is cancelled 'normally' when it reaches the limit or the user actively cancels it. Otherwise, the query is cancelled due to an internal error, which needs a report from BE.	2019-07-19 09:36:01 +08:00
EmmyMiao87	1f3f3f76a2	Fix the duplicated request bug of mini load (#1504 ) The function miniLoadBegin returns the txn_id. If the backend sends a duplicated request to the frontend, the frontend returns the txn_id that was created by the same mini load. The issue is that the frontend returns the txn_id before the earlier identical request has begun the txn. The frontend then returns zero, the initial value of txn_id, and the BE cannot execute the load plan with a wrong txn_id. This commit combines `createLoadJob` and `execute` under the write lock, which protects the atomicity of `create` and `beginTxn`. So the duplicated request cannot get the txn id before the earlier identical request has finished.	2019-07-18 23:52:12 +08:00
chenhao	ca480914de	Fix bug that single partition table get wrong partition type (#1503 )	2019-07-18 19:17:38 +08:00

13073 Commits