We can create an ODBC external table using SQL like:
```
CREATE EXTERNAL TABLE `baseall_oracle` (
`k1` decimal(9, 3) NOT NULL COMMENT "",
`k2` char(10) NOT NULL COMMENT "",
`k3` datetime NOT NULL COMMENT "",
`k5` varchar(20) NOT NULL COMMENT "",
`k6` double NOT NULL COMMENT ""
) ENGINE=ODBC
PROPERTIES (
"host" = "192.168.0.1",
"port" = "8086",
"user" = "happenlee",
"password" = "doris",
"database" = "doris",
"table" = "baseall",
"driver" = "Oracle 19 ODBC driver",
"type" = "oracle"
);
```
Currently only Oracle and MySQL databases are supported, and this feature is turned off by default, controlled by the config `enable_odbc_table`.
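A minimal usage sketch. Assumptions: `enable_odbc_table` may be changeable at runtime via `ADMIN SET FRONTEND CONFIG` (otherwise it has to be set in fe.conf), and the query columns are illustrative.
```
-- Assumption: if the config is not mutable at runtime, set it in fe.conf and restart the FE instead.
ADMIN SET FRONTEND CONFIG ("enable_odbc_table" = "true");

-- Query the ODBC external table defined above like a normal table.
SELECT k1, k2, k6 FROM baseall_oracle WHERE k1 > 0;
```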
Main CL:
1. Copy the code from BE to implement the `str_to_date()` function in FE.
2. `str_to_date("2020-08-08", "%Y-%m-%d %H:%i:%s")` will return `2020-08-08 00:00:00` instead of `2020-08-08`.
When converting the historical data of a materialized view, the conversion may fail because of data quality problems.
Add an error status code `OLAP_ERR_DATA_QUALITY_ERR` to indicate that a data quality problem caused the failure.
#3344
1. The base column of bitmap_union must be an integer type; largeint is not supported either (see the sketch after this list).
2. The base column of hll_union cannot be a decimal type.
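For illustration, a hedged sketch of a materialized view that satisfies the rule above; the table and column names are hypothetical:
```
-- Hypothetical base table `events` with an integer column `user_id` (not largeint).
-- bitmap_union over to_bitmap(user_id) is valid; a decimal base column would be rejected.
CREATE MATERIALIZED VIEW mv_daily_uv AS
SELECT dt, bitmap_union(to_bitmap(user_id))
FROM events
GROUP BY dt;
```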
Check error msg of const expr in Union Node
If a user tries to insert a negative number into a bitmap materialized view, Doris will throw an 'invalid input' exception.
This commit adds a check of the constant values in the Union Node.
1. Disable the MySQL client and LZO libraries by default when building Doris.
The MySQL client library is used for the MySQL external table feature.
This feature will be replaced by the new ODBC external table soon.
The LZO library is used to compress/decompress data in some old Doris data formats,
which are no longer used.
2. Add missing license to some files.
3. For all non-Apache-License code, everything is explained in the NOTICE file and the corresponding license is declared.
4. Remove the JS source code from webroot; it will be downloaded as a third-party dependency.
Since Segment V2 has been released for a long time, we should make it the default storage format for newly created tables.
This CL mainly changes:
1. For all newly created tables, their default storage format is Segment V2.
2. For all existing tablets, the storage format remains unchanged.
3. Fix the bugs described in #4384 and #4385
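A hedged sketch of how the `storage_format` property can still be set explicitly; the table definition is illustrative, and with this change V2 is also what you get when the property is omitted:
```
-- Illustrative table; "storage_format" = "V2" is now also the default for new tables.
CREATE TABLE example_tbl (
  k1 INT,
  v1 INT SUM
)
AGGREGATE KEY (k1)
DISTRIBUTED BY HASH(k1) BUCKETS 8
PROPERTIES (
  "replication_num" = "1",
  "storage_format" = "V2"
);
```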
`replace` is a user-defined function that replaces all occurrences of an old substring with a new substring in a string, as follows:
```
mysql> select replace("http://www.baidu.com:9090", "9090", "");
+---------------------------------------------------+
| replace('http://www.baidu.com:9090', '9090', '')  |
+---------------------------------------------------+
| http://www.baidu.com:                             |
+---------------------------------------------------+
```
The original Doris bitmap aggregation functions perform poorly on intersections and unions of bitmaps with cardinality above one billion. There are two reasons for this. First, when the bitmap cardinality is large and the data size exceeds 1 GB, network and disk I/O costs grow. Second, all sink data from the backend BE instances is transferred to the top node for the intersection and union computation, which puts pressure on that single node and makes it the bottleneck.
My solution is to create a fixed-schema table following the Doris sharding rules and hash-shard the id range underlying the bitmaps, i.e. cut the id range vertically to form small blocks. These bitmap blocks become smaller and are evenly distributed across all backend BE instances. On top of this schema, new high-performance UDAF aggregation functions are developed: all scan nodes participate in the intersection and union computation, and the top node only aggregates the partial results.
The design goal is that with bitmap cardinality above 10 billion, the response time of intersection and union computations across 100 dimension values stays within 5 s.
There are three UDAF functions in this commit: orthogonal_bitmap_intersect_count, orthogonal_bitmap_union_count, and orthogonal_bitmap_intersect.
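A hedged sketch of the fixed-schema table and one of the new functions, following the approach described above; the table, column names, and tag values are illustrative:
```
-- Illustrative fixed-schema table: the user id range is cut vertically into buckets,
-- so each BE holds small, evenly distributed bitmap blocks.
CREATE TABLE user_tag_bitmap (
  tag_group BIGINT COMMENT "tag or dimension id",
  bucket    INT    COMMENT "vertical cut of the user id range",
  user_ids  BITMAP BITMAP_UNION COMMENT "user id bitmap"
)
AGGREGATE KEY (tag_group, bucket)
DISTRIBUTED BY HASH(bucket) BUCKETS 64
PROPERTIES ("replication_num" = "1");

-- Each scan node intersects the bitmaps of its local buckets;
-- the top node only sums the partial counts.
SELECT orthogonal_bitmap_intersect_count(user_ids, tag_group, 1, 2)
FROM user_tag_bitmap
WHERE tag_group IN (1, 2);
```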
Compaction rules optimization; see #4164 for the detailed problem description and design.
This PR adds two things:
(1) make the cumulative compaction policy configurable, and implement the original policy;
(2) implement the universal policy, the optimized version described in #4164.
Sometimes we want to detect hotspots in a cluster, for example a heavily scanned tablet or a heavily written tablet,
but we have no tablet-level insight into the cluster.
This patch introduces tablet-level metrics to help achieve this goal; four tablet metrics are supported for now: `query_scan_bytes`, `query_scan_rows`, `flush_bytes`, `flush_count`.
However, one BE may hold hundreds of thousands of tablets, so I add a parameter to the metrics HTTP request
and do not return tablet-level metrics by default.
In some very special circumstances, such as code bugs, or human misoperation, etc.,
all replicas of some tablets may be lost. In this case, the data has been substantially lost.
However, in some scenarios, the business still hopes to ensure that queries will not
report errors even if there is data loss, minimizing the impact perceived by the user layer.
At this point, we can use a blank tablet to fill in for the missing replica so that queries can still be executed normally.
Add a new FE config `recover_with_empty_tablet`, default false. When true, an empty tablet is used to fill in for the missing one.
Also fix a bug described in #4274
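A minimal sketch of turning the switch on; treating `recover_with_empty_tablet` as mutable at runtime is an assumption, otherwise it has to be set in fe.conf:
```
-- Assumption: if this config is not mutable at runtime, set it in fe.conf and restart the FE instead.
ADMIN SET FRONTEND CONFIG ("recover_with_empty_tablet" = "true");
```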
A new feature has been added to obtain the tablet id and schema hash of all the tablets on a particular BE node
via a web page, so that more detailed information about each tablet can be retrieved using its
tablet id and schema hash. Depending on the web request, there are two formats
(table and JSON) to display the acquired tablet ids and schema hashes on the web page.
This PR adds InPredicate support to the delete statement,
and adds the max_allowed_in_element_num_of_delete variable to
limit the number of elements in an InPredicate of a delete statement.
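A hedged sketch of the new capability; the table, partition, and values are illustrative, and treating max_allowed_in_element_num_of_delete as a session variable is an assumption:
```
-- Assumption: max_allowed_in_element_num_of_delete is a session variable.
SET max_allowed_in_element_num_of_delete = 1024;

-- Delete rows whose key matches any of the listed values (illustrative names).
DELETE FROM sales PARTITION p202008 WHERE region_id IN (1, 3, 5);
```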
Support ALTER ROUTINE LOAD JOB stmt, for example:
```
alter routine load db1.label1
properties
(
"desired_concurrent_number"="3",
"max_batch_interval" = "5",
"max_batch_rows" = "300000",
"max_batch_size" = "209715200",
"strict_mode" = "false",
"timezone" = "+08:00"
)
```
Details can be found in `alter-routine-load.md`
Stream load should read all of the data before parsing the JSON.
Also add a new BE config `streaming_load_max_batch_read_mb`
to limit the data size when loading JSON data.
Fix the bug of loading an empty JSON array `[]`.
Add docs to explain certain cases of loading JSON-format data.
Fix: #4124
Currently, if a URL is longer than 4096 bytes, Netty will reject it.
The problem can be reproduced by constructing a very long URL (longer than 4096 bytes).
Add 2 http server params:
1. http_max_line_length
2. http_max_header_size
This PR mainly does three things:
1. Fix the FE meta version bug introduced by #4029 when resolving the conflict with #4086.
2. Make the drop check code easier to read.
3. Add doc content for the drop meta check.
Try to select a BE with an existing replica as the destination BE for the
REPLICA_RELOCATING clone task.
Fixes #4147
Also add 2 new FE configs `max_clone_task_timeout_sec` and `min_clone_task_timeout_sec`
This PR supports grammar like the following: `INSTALL PLUGIN FROM [source] [PROPERTIES("KEY"="VALUE", ...)]`
The user can set `md5sum="xxxxxxx"` so that a separate md5 URI does not need to be provided.
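A hedged example of the grammar above; the source URL and the md5 value are placeholders:
```
-- The md5sum property lets Doris verify the package without a separate md5 URI.
INSTALL PLUGIN FROM "http://example.com/my_plugin.zip"
PROPERTIES ("md5sum" = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
```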
Currently, the database's used data quota is only checked when creating or altering a table, or in some old-style load jobs, but not for routine load and stream load jobs. This PR provides a uniform solution that checks the database's used data quota when a load job begins a new transaction.
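For context, a sketch of setting the per-database quota that is checked when a load transaction begins; the database name and quota value are illustrative:
```
-- Illustrative: any load job that would push the database beyond this quota
-- is now rejected when it begins a new transaction.
ALTER DATABASE example_db SET DATA QUOTA 10T;
```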
This PR mainly does the following three things:
1. Add the thread name to FE logs to make tracing problems easier.
2. Add the agent_task_resend_wait_time_ms config to avoid sending duplicate agent tasks to BE.
3. Skip updating the replica version when the new version is lower than the replica version in FE.