doris

Author	SHA1	Message	Date
Youngwb	068707484d	Support sequence column for UNIQUE_KEYS Table (#4256 ) * add sequence col Co-authored-by: yangwenbo6 <yangwenbo3@jd.com>	2020-09-04 10:10:17 +08:00
wyb	ffe696d17c	[Doc] Add spark load sql statement doc and update manual (#4463 ) 1. add sql statement in dml 2. update spark load manual	2020-08-30 21:09:17 +08:00
Zhengguo Yang	174c9f89ea	[DOCS] Add batch delete docs (#4435 ) update documents for batch delete #4051	2020-08-28 09:24:07 +08:00
caiconghui	a5d1d010c0	[Doc] Fix typo about plugin content (#4416 )	2020-08-26 10:48:07 +08:00
Stalary	26fe510011	[Doc] Fix an error in the document (#4357 )	2020-08-17 23:06:23 +08:00
caiconghui	eefad13107	[Feature] Support InPredicate in delete statement (#4006 ) This PR adds InPredicate support to the delete statement and adds the max_allowed_in_element_num_of_delete variable to limit the number of elements in an InPredicate in a delete statement.	2020-08-06 23:19:40 +08:00
EmmyMiao87	5ba4b024e7	[Docs] Add Materialized view manual (#4229 ) Add usage manual of materialized view in Chinese and English	2020-08-06 23:18:06 +08:00
Mingyu Chen	237c0807a4	[RoutineLoad] Support modify routine load job (#4158 ) Support ALTER ROUTINE LOAD JOB stmt, for example: ``` alter routine load db1.label1 properties ( "desired_concurrent_number"="3", "max_batch_interval" = "5", "max_batch_rows" = "300000", "max_batch_size" = "209715200", "strict_mode" = "false", "timezone" = "+08:00" ) ``` Details can be found in `alter-routine-load.md`	2020-08-06 23:11:02 +08:00
HangyuanLiu	116d7ffa3c	[SQL][Function] Add approx_count_distinct() function (#4221 ) Add approx_count_distinct() function to replace the ndv() function	2020-08-01 17:54:19 +08:00
worker24h	fdcc223ad2	[Bug][Json] Refactor the json load logic to fix some bugs 1. Add `json_root` for nested json data. 2. Remove `_jmap` to make the logic more reasonable.	2020-07-30 10:36:34 +08:00
caiconghui	237271c764	[Bug] Fix fe meta version problem, make drop meta check code easy to read and add doc content for drop meta check (#4205 ) This PR mainly does three things: 1. Fix the fe meta version bug introduced by #4029 when resolving the conflict with #4086 2. Make the drop check code easy to read 3. Add doc content for the drop meta check	2020-07-30 09:54:20 +08:00
caiconghui	1b3af783e6	[Plugin] Add properties grammar in InstallPluginStmt (#4173 ) This PR supports grammar like the following: INSTALL PLUGIN FROM [source] [PROPERTIES("KEY"="VALUE", ...)] Users can set md5sum="xxxxxxx", so an md5 uri no longer needs to be provided.	2020-07-29 15:02:31 +08:00
WingC	d7893f0fa7	[Bug] Fix some schema changes not working correctly (#4009 ) This CL mainly fixes some schema changes to the varchar type that did not work correctly because a logic check was missing, and adds ConvertTypeResolver to register the supported type conversions so that the logic check is not forgotten again.	2020-07-11 10:18:29 +08:00
xy720	d2ab38a5e0	[Feature] Batch update partition's property in one command (#3981 ) Supports the following command: ``` alter table tbl_name modify partition (p1, p2, p3) set ("replication_num" = "3"); ```	2020-07-09 21:48:43 +08:00
caiconghui	b7051d0971	[Config] Make it easier for users to find the configuration items they need (#3957 ) This PR orders config items by key and supports the like predicate in the admin show config stmt	2020-07-07 23:12:21 +08:00
Mingyu Chen	c3d9feed75	[Load][Json] Refactor json load logic to make it more reasonable (#4020 ) This CL mainly changes: 1. Reorganized the code logic to limit the supported json format to two, and the import behavior is more consistent. 2. Modified the statistical behavior of the number of error rows when loading in json format, so that the error rows can be counted correctly. 3. See `load-json-format.md` to get details of loading json format.	2020-07-07 23:07:28 +08:00
funyeah	d396408861	Correct typos (#4024 )	2020-07-07 13:33:46 +08:00
WingC	b2b9e22b24	[CreateTable] Check backend disk has available capacity by storage medium before create table (#3519 ) Currently we choose BEs at random without checking whether their disks are available, so table creation fails only after the create tablet task is sent to a BE and the BE checks whether it has available capacity to create the tablet. Checking backend disk availability by storage medium beforehand reduces unnecessary RPC calls.	2020-06-28 09:36:31 +08:00
EmmyMiao87	feec4ee5bf	[UDF] Support external users to contribute udf (#3760 )	2020-06-23 13:43:08 +08:00
Mingyu Chen	b3811f910f	[Spark load][Fe 4/6] Add hive external table and update hive table syntax in loadstmt (#3819 ) * Add hive external table and update hive table syntax in loadstmt * Move check hive table from SelectStmt to FromClause and update doc * Update hive external table en sql reference	2020-06-13 16:28:24 +08:00
wyb	44dbdf4986	Update hive external table en sql reference	2020-06-12 21:38:05 +08:00
wyb	7f7ee63723	Move check hive table from SelectStmt to FromClause and update doc	2020-06-11 16:53:41 +08:00
ChenXiaofei	4adc9d45c2	[Doc] Update ALTER TABLE.md	2020-06-10 22:58:29 +08:00
wyb	4c2e73a5fe	Add hive external table and update hive table syntax in loadstmt	2020-06-10 16:32:32 +08:00
WingC	a7bf006b51	Use BackendStatus to show BE's information in `show backends;` (#3713 ) The information is displayed in JSON format. For example: {"lastTabletReportTime":"2020-05-28 15:29:01"}	2020-06-06 11:37:48 +08:00
wyb	fdf3415d06	[Website] Fix CREATE RESOURCE sidebar text and link not right bug (#3777 )	2020-06-05 09:20:36 +08:00
wyb	4978bd6c81	[Spark load] Add resource manager (#3418 ) 1. User interface: 1.1 Spark resource management Spark is used as an external computing resource in Doris to do ETL work. In the future, other external resources may be used in Doris, for example MapReduce for ETL, Spark/GPU for queries, and HDFS/S3 for external storage. We introduced resource management to manage these external resources used by Doris. ```sql -- create spark resource CREATE EXTERNAL RESOURCE resource_name PROPERTIES ( type = spark, spark_conf_key = spark_conf_value, working_dir = path, broker = broker_name, broker.property_key = property_value ) -- drop spark resource DROP RESOURCE resource_name -- show resources SHOW RESOURCES SHOW PROC "/resources" -- privileges GRANT USAGE_PRIV ON RESOURCE resource_name TO user_identity GRANT USAGE_PRIV ON RESOURCE resource_name TO ROLE role_name REVOKE USAGE_PRIV ON RESOURCE resource_name FROM user_identity REVOKE USAGE_PRIV ON RESOURCE resource_name FROM ROLE role_name ``` - CREATE EXTERNAL RESOURCE: FOR user_name is optional. If it is specified, the external resource belongs to that user. If not, the external resource belongs to the system and is available to all users. PROPERTIES: 1. type: resource type. Only spark is supported for now. 2. spark configuration: follows the standard format of Spark configurations, refer to: https://spark.apache.org/docs/latest/configuration.html. 3. working_dir: optional, used to store ETL intermediate results in spark ETL. 4. broker: optional, used in spark ETL. The ETL intermediate results need to be read with the broker when pushed into BE.
Example: ```sql CREATE EXTERNAL RESOURCE "spark0" PROPERTIES ( "type" = "spark", "spark.master" = "yarn", "spark.submit.deployMode" = "cluster", "spark.jars" = "xxx.jar,yyy.jar", "spark.files" = "/tmp/aaa,/tmp/bbb", "spark.yarn.queue" = "queue0", "spark.executor.memory" = "1g", "spark.hadoop.yarn.resourcemanager.address" = "127.0.0.1:9999", "spark.hadoop.fs.defaultFS" = "hdfs://127.0.0.1:10000", "working_dir" = "hdfs://127.0.0.1:10000/tmp/doris", "broker" = "broker0", "broker.username" = "user0", "broker.password" = "password0" ) ``` - SHOW RESOURCES: General users can only see their own resources. Admin and root users can show all resources. 1.2 Create spark load job ```sql LOAD LABEL db_name.label_name ( DATA INFILE ("/tmp/file1") INTO TABLE table_name, ... ) WITH RESOURCE resource_name [(key1 = value1, ...)] [PROPERTIES (key2 = value2, ... )] ``` Example: ```sql LOAD LABEL example_db.test_label ( DATA INFILE ("hdfs:/127.0.0.1:10000/tmp/file1") INTO TABLE example_table ) WITH RESOURCE "spark0" ( "spark.executor.memory" = "1g", "spark.files" = "/tmp/aaa,/tmp/bbb" ) PROPERTIES ("timeout" = "3600") ``` The spark configurations in load stmt can override the existing configuration in the resource for temporary use. #3010	2020-05-26 18:21:21 +08:00
Mingyu Chen	77b9acc242	[Stmt] Add rowCount column to SHOW DATA stmt (#3676 ) User can see the row count of all materialized indexes of a table. ``` mysql> show data from test; +-----------+-----------+-----------+--------------+----------+ \| TableName \| IndexName \| Size \| ReplicaCount \| RowCount \| +-----------+-----------+-----------+--------------+----------+ \| test2 \| r1 \| 10.000MB \| 30 \| 10000 \| \| \| r2 \| 20.000MB \| 30 \| 20000 \| \| \| test2 \| 50.000MB \| 30 \| 50000 \| \| \| Total \| 80.000 \| 90 \| \| +-----------+-----------+-----------+--------------+----------+ ``` Fix #3675	2020-05-26 15:53:38 +08:00
worker24h	ef8fd1fcbe	[Load] Support loading json-data into Doris by RoutineLoad or StreamLoad (#3553 ) Doris supports loading json-data by RoutineLoad or StreamLoad	2020-05-21 13:00:49 +08:00
Seaven	488aa22938	[Doc] Update plugin document (#3447 ) (#3505 )	2020-05-09 19:19:38 +08:00
EmmyMiao87	f591976976	[Doc] Fix the incorrect docs (#3501 )	2020-05-08 12:47:00 +08:00
hffariel	432965e360	[Enhancement] documents rebuild with Vuepress (#3408 ) (#3414 )	2020-04-29 09:14:31 +08:00

32 Commits