doris

Author	SHA1	Message	Date
kangkaisen	cd5cfea5cc	Encapsulate HLL logic (#1756 )	2019-09-09 15:52:10 +08:00
EmmyMiao87	b85cb0071b	Bug-fix: error result of union stmt (#1758 ) ISSUES-1725: The result of a union stmt whose child is an outer join stmt is incorrect. Example: sql: (select k1 from empty) union all (select b.k1 k1 from left_table a left join empty b on a.k2 = b.k2); context: the empty table has no data. Error result: 0. Expected result: null. Reason: The judgment of whether column k1 of the union tuple is nullable is incorrect. It cannot be determined from the slot attributes of the children when the slot is produced by an outer join: slot A is not nullable, while the outer join result that maps to slot A is nullable. So the judgment needs to consider whether the slot comes from an outer join.	2019-09-08 21:26:31 +08:00
chenhao	f23ac0eadd	Planner support push down predicates past agg, win and sort (#1471 )	2019-09-08 09:30:46 +08:00
worker24h	2f52ae7988	Add PreAgg Hint (#1617 ) e.g.: SELECT xxx FROM tbl /*+ PREAGGOPEN */ This will open pre-aggregation forcibly for the specified table	2019-09-06 19:47:18 +08:00
ZHAO Chun	a84c64785f	Shuffle partitioned instance to avoid skew (#1744 )	2019-09-04 18:31:18 +08:00
lxqfy	726509e9b9	Add MIN/MAX aggregate function compatible with char/varchar (#1739 )	2019-09-04 17:28:27 +08:00
worker24h	fddfffe4c0	Fix bug that failed to create a new partition when no partition in a table (#1688 )	2019-09-04 13:36:17 +08:00
yiguolei	6f4feca3dc	Add rowset id generator to FE and BE (#1678 )	2019-09-02 18:51:31 +08:00
Mingyu Chen	ba170aa9e6	Fix NPE of DataDescription (#1735 ) When user does not specify column mapping in BrokerLoadStmt, NPE may be thrown.	2019-09-02 16:03:26 +08:00
Mingyu Chen	76987275b9	Fix result of unix_timestamp() (#1727 )	2019-08-30 21:39:16 +08:00
EmmyMiao87	06b87d998a	Error check about column which has no default value (#1728 ) This commit checks all parsed columns, including those defined by hadoop functions and by other functions. Otherwise, the load will throw the "Column has no default value" exception even though the column has been defined by a non-hadoop function.	2019-08-30 20:23:32 +08:00
kangkaisen	3a33f3d350	Make bitmap_union agg column support insert into and broker load (#1721 )	2019-08-30 14:44:51 +08:00
Mingyu Chen	378ce8ca04	Use double when converting TIME type value (#1722 ) TIME type value is saved in DOUBLE, so using int64 can extend the time range.	2019-08-29 21:19:19 +08:00
Mingyu Chen	c541c3fd59	Fix bug that failed to get enough normal replica because path hash is not set. (#1714 ) Path Hash of a replica in metadata should be set immediately after the replica is created. And we should not depend on path hash to find replicas, because the path hash may be set with a delay.	2019-08-28 19:37:38 +08:00
kangpinghuang	6865f4238b	Add limit to show tablet stmt (#1547 ) Also add some where predicates for filtering results ISSUE #1687	2019-08-28 16:25:12 +08:00
HangyuanLiu	0c2e344f45	Refactor DateLiteral class in FE (#1644 ) 1. Add FE time zone function support 2. Refactor DateLiteral class in FE ISSUE #1583	2019-08-27 22:20:06 +08:00
Mingyu Chen	7e981b2b14	Limit the disk usage to avoid running out of disk capacity (#1702 ) Set high watermark and flood stage of disk used capacity. And forbid some operations if disk usage is too high.	2019-08-27 22:18:17 +08:00
Mingyu Chen	b6b860c808	Make the max recursion depth of distribution pruner configurable (#1709 ) Add a new FE config 'max_distribution_pruner_recursion_depth'.	2019-08-27 22:17:07 +08:00
Mingyu Chen	a1b92768dd	Add loaded rows to SHOW LOAD result (#1686 ) Loaded rows will be updated periodically by query report, so that the user can see whether a load job is still running or being blocked.	2019-08-27 14:13:47 +08:00
kangkaisen	1e4dd77d2a	Add bitmap agg type and udaf (#1610 )	2019-08-26 14:24:42 +08:00
EmmyMiao87	b28f4242c3	Add config max_concurrent_task_num_per_be (#1693 ) This config is used to control the max concurrent task num per be. The cluster max concurrent task num = max_concurrent_task_num_per_be * number of be.	2019-08-24 00:56:40 +08:00
Mingyu Chen	00f8040bf3	Fix bug that 2 same stream load jobs may both be executed successfully (#1690 ) This will cause 2 jobs trying to write the same file, and damage the file.	2019-08-22 19:38:16 +08:00
Mingyu Chen	2b2bc82ae2	Add timeout on snapshot of data (#1672 ) Release snapshot when finishing or cancelling backup/restore job. Snapshots may take a lot of disk space if they are not released in time.	2019-08-21 21:18:53 +08:00
Mingyu Chen	0792e06eed	Fix NPE of insert load job persist operation (#1683 ) tracking url may be null	2019-08-21 20:30:55 +08:00
worker24h	9f50f84b68	Fix bug: "SHOW DATA" or "SHOW PARTITIONS", the DATA-SIZE less than 0 (#1680 )	2019-08-21 15:33:26 +08:00
EmmyMiao87	978b1ee1af	Add strict mode in Routine load, Stream load and Mini load (#1677 )	2019-08-20 21:56:45 +08:00
Mingyu Chen	0a27ef030b	Reduce the number of partition info in BrokerScanNode param (#1675 ) We should reduce the number of partition info in BrokerScanNode param if the user has already set target partitions to load, instead of adding all partitions' info, which makes the RPC packet too large.	2019-08-20 19:30:57 +08:00
Mingyu Chen	8e6814cfcd	Support setting timeout for stream load (#1670 )	2019-08-20 15:43:03 +08:00
EmmyMiao87	731f78accc	Don't persist the data source info in broker load (#1665 )	2019-08-19 15:45:21 +08:00
yuanli	ba6d728f26	Enable parsing columns from file path for Broker Load (#1582 ) (#1635 ) Currently, we do not support parsing encoded/compressed columns in file path, eg: extract column k1 from file path /path/to/dir/k1=1/xxx.csv This patch is able to parse columns from file path like in Spark (Partition Discovery). This patch parses partition columns in BrokerScanNode.java and saves the parsing result of each file path as a property of TBrokerRangeDesc, then the broker reader of BE can read the value of the specified partition column.	2019-08-19 09:39:21 +08:00
Mingyu Chen	6d73658207	Support checking error data row when doing INSERT (#1597 ) If strict mode is true, and at least one row is filtered, the insert operation will fail and a url will be given to get the error rows. ``` ERROR 1064 (HY000): all partitions have no load data. url: http://host:ip/api/_load_error_log?file=__shard_2/error_log_insert_stmt_e0a620e93dc54461-b89ec64768367d25_e0a620e93dc54461_b89ec64768367d25 ``` If all rows are good, insert will return OK with affected rows: ``` Query OK, 1 row affected (0.26 sec) ``` If strict mode is false, and at least one row is good, the insert operation will return OK with affected rows and warnings. If there are error rows, a label will be returned: ``` Query OK, 1 row affected, 1 warning (0.32 sec) {'label':'7d66c457-658b-4a3e-bdcf-8beee872ef2c'} ```	2019-08-16 21:40:29 +08:00
Mingyu Chen	82d0afc1ba	FROM_UNIXTIME should only convert timestamp from 0 to 253402271999 (#1658 ) which is between 1970-01-01 00:00:00 ~ 9999-12-31 23:59:59, otherwise, return null	2019-08-16 18:29:57 +08:00
wkhappy1	1ed25ad83d	Add kafka_default_offsets when no partition is specified. Support reading kafka partitions from the start (#1642)	2019-08-16 13:30:26 +08:00
ZHAO Chun	b85bd334de	Remove temporarily failing UT (#1659 )	2019-08-16 11:26:41 +08:00
DDDDDDouble	4f27129368	Fix get label when use StreamLoad (#1655 )	2019-08-16 09:56:20 +08:00
kangkaisen	4cc2285094	Make http server and thrift server backlog num configurable (#1638 )	2019-08-14 19:58:48 +08:00
Mingyu Chen	03b99ddd37	Fix bug that bad replica can not be synchronized when report (#1634 ) When the replica is recovered from bad on BE, the report process should change the bad status of replica on FE to false, or the replica can not be recovered.	2019-08-14 09:49:44 +08:00
HangyuanLiu	199ff968dc	Fix time zone compatibility (#1631 )	2019-08-13 18:44:35 +08:00
EmmyMiao87	780a255112	Change the prefix of table info apis (#1625 ) The pathtrie could not distinguish different param keys with the same prefix path. So the prefix of table info apis has been changed to /api/external, which is used by spark-doris-connector.	2019-08-13 11:30:32 +08:00
wangbo	c8352a9e4d	Insert select Stmt keep the same semantics with mysql (#1626 ) (#1628 )	2019-08-13 09:56:26 +08:00
HangyuanLiu	69af50aa8c	Time zone related BE function (#1598 ) Details can be found in time-zone.md document	2019-08-12 20:57:59 +08:00
Mingyu Chen	3080139e78	Avoid load or query failure when doing alter job 2 cases: 1. Sometimes a missing version replica can not be repaired, which may cause the query to fail with error: failed to initialize storage reader. tablet=xxx, res=-214 2. Cancelling the rollup job when there are load jobs on that table may cause the load job to fail. We should ignore the "table not found" exception when committing the txn.	2019-08-12 16:27:34 +08:00
Yunfeng,Wu	e3348c46a9	Expose data pruned-filter-scan ability (#1527 )	2019-08-11 12:59:24 +08:00
EmmyMiao87	add6266c71	Broker load supports function (#1592 ) This commit supports column functions in broker load. The grammar of LoadStmt has not been changed. Example: columns terminated by ',' (tmp_c1, tmp_c2) set (c1=tmp_c1+tmp_c2) Also, old functions such as default_value, strftime etc. remain compatible. After this commit, there is no difference in column functions between stream load and broker load except for the old functions.	2019-08-09 13:27:31 +08:00
Mingyu Chen	69de5df167	Fix bug that cluster balance may cause load job failed (#1581 ) The bug is described in issue #1580 . This patch fixes 2 cases of cluster balance: 1. After finishing adding the new replica, the new replica's version may not catch up with the visible version, so the new replica may be treated as a stale and redundant replica, which will be deleted at the next tablet checking round. I add a mark named needFurtherRepair to the newly added replica, and only mark it when that replica's version does not catch up with the visible version. This replica will receive a further repair at the next tablet checking round, instead of being deleted. 2. When deleting the redundant replicas, there may be some load jobs on them. Deleting these replicas may cause the load jobs to fail. Before deleting a redundant replica, I first mark the next txn id on that replica, and set the replica's state to CLONE. The CLONE state will ensure that no more load jobs will be on that replica, and we will wait for all load jobs before the marked txn id to finish. After that, the replica can be deleted safely.	2019-08-08 18:38:30 +08:00
Yunfeng,Wu	60d997fe67	Fix errors when ES username and passwd is empty (#1601 )	2019-08-08 09:29:23 +08:00
xy720	4c2a3d6da4	Merge Help document to documentation (#1586 ) Help document collation (integration of help and documentation documents)	2019-08-07 21:31:53 +08:00
Youngwb	f7a05d8580	Support setting timezone variable in FE (#1587 )	2019-08-07 09:25:26 +08:00
Mingyu Chen	343b913f0d	Fix a serious bug that will cause all replicas being deleted. (#1589 ) Revert commit: eda55a7394fcec2f7b6c0aefd1628f9d63911815	2019-08-06 19:23:53 +08:00
Mingyu Chen	eda55a7394	Fix bug that unable to delete replica if version is missing (#1585 ) If there is a redundant replica on BE whose version is missing, the tablet report logic can not drop it correctly.	2019-08-05 16:19:05 +08:00

482 Commits