* [doris-1008] Support backup and restore directly to cloud storage via the AWS S3 protocol
* [Internal][S3DirectAccess] Support backup, restore, load, and export connecting directly to S3
1. Support loading and exporting data from/to S3 directly.
2. Add a config to automatically convert broker access to S3 access when available.
* [Internal][S3DirectAccess] File path glob compatible with broker
* [Internal][doris-1008] Fix log4j class-not-found error
* Add POM files
Co-authored-by: yangzhengguo01 <yangzhengguo01@baidu.com>
Support conditional filtering of original data in broker load and routine load. For example:
```
LOAD LABEL `label1`
(
    DATA INFILE ('bos://cmy-repo/1.csv')
    INTO TABLE tbl2
    COLUMNS TERMINATED BY '\t'
    (event_day, product_id, ocpc_stage, user_id)
    SET (
        ocpc_stage = ocpc_stage + 100
    )
    PRECEDING FILTER user_id = 1381035
    WHERE ocpc_stage > 30
)
...
```
Support DELETE statements like:
1. `delete from table partitions(p1, p2) where xxx;` (applies to p1 and p2 only)
2. `delete from table where xxx;` (applies to all partitions)
Also remove the code for the deprecated sync/async delete jobs.
This CL changes the FE meta version to 94.
Add a fuzzy_parse flag: if all objects in a JSON file have the same keys in the same order, we only need to parse the first row, and can then use the index instead of the key to parse the remaining values.
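For illustration, input like the following qualifies, since every object has the same keys in the same order; after the first row is parsed, later values can be read by position. (Exactly how the fuzzy_parse flag is switched on for a given load channel is not shown here.)
```
[
    {"k1": 1, "k2": "a"},
    {"k1": 2, "k2": "b"},
    {"k1": 3, "k2": "c"}
]
```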
- There is an FE configuration called dynamic_partition_enable
which controls whether the dynamic partition feature is on or off.
When this configuration is false, no table supports dynamic partitioning.
- But when a user tried to create a dynamic partition table, Doris did not check this parameter.
As a result, the user could create the table normally,
while Doris was in fact unable to create any partition for it.
- This PR checks the config when building the table, as sketched below.
A dynamic partition table can be created only when the dynamic_partition_enable configuration is true.
If the configuration is false, the command to create a dynamic partition table reports an error directly.
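A sketch of the new behavior; the table, columns, and property values are hypothetical, and the dynamic_partition.* properties follow the existing dynamic partition syntax:
```
-- with the FE config dynamic_partition_enable = false:
CREATE TABLE db1.tbl1 (k1 DATE, v1 INT)
PARTITION BY RANGE (k1) ()
DISTRIBUTED BY HASH (k1) BUCKETS 4
PROPERTIES (
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "DAY",
    "dynamic_partition.end" = "3",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "4"
);
-- now fails immediately with an error, instead of creating a table
-- for which Doris can never actually create partitions
```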
For #4674
This is a UDAF for approximate top-N using the Space-Saving algorithm. At present, it can only compute
the frequent items in a column and their frequencies; based on this, we can implement
top-N functions similar to those supported by Kylin in the future.
I have also added a test that measures the accuracy of this algorithm. A rough run is shown below.
The total amount of data is 1 million rows following a Zipfian distribution. Element cardinality
is the data cardinality; 20X, 50X, ... denote space_expand_rate values of 20, 50, ...,
which set the number of counters in the Space-Saving algorithm.
```
zf exponent = 0.5
Element cardinality 20X 50X 100X
1000 100% 100% 100%
10000 100% 100% 100%
100000 100% 100% 100%
500000 94% 98% 99%
zf exponent = 0.6,1
Element cardinality 20X 50X 100X
1000 100% 100% 100%
10000 100% 100% 100%
100000 100% 100% 100%
500000 100% 100% 100%
```
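As a usage sketch (the function signature and table/column names here are assumptions based on the description above, not verified syntax):
```
-- approximate top 10 most frequent values of `query`, with their counts
SELECT topn(query, 10) FROM site_visits;
-- a larger space_expand_rate allocates more counters: more memory, better accuracy
SELECT topn(query, 10, 100) FROM site_visits;
```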
1. Support modifying a column's type from CHAR to TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE/DATE,
and converting TINYINT/SMALLINT/INT/BIGINT/LARGEINT/FLOAT/DOUBLE to wider numeric types (#4937); see the sketch below.
2. Use templates to refactor the code in types.h and schema_change.cpp and remove redundant code.
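For example (hypothetical table and columns; each conversion must be one of the supported pairs listed above):
```
-- c1 is CHAR: convert it to a numeric type
ALTER TABLE example_db.tbl MODIFY COLUMN c1 INT;
-- c2 is TINYINT: widen it to a larger numeric type
ALTER TABLE example_db.tbl MODIFY COLUMN c2 BIGINT;
```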
The config name is so close to another config name that the two are hard to tell apart,
so this PR renames it. The documentation has been updated accordingly.
Support the new syntax CREATE TABLE [IF NOT EXISTS] [db_name].table_name AS [db_name2].table_name2;
to create a new table from an existing table with the same schema.
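A minimal usage sketch per the grammar above (database and table names are hypothetical):
```
CREATE TABLE IF NOT EXISTS db1.tbl_copy AS db2.tbl_orig;
```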
ISSUE: #4355
#4619
Add time_round functions that provide `time_floor` & `time_ceil` at each time unit (usage sketch below).
Fix two related bugs:
- #4618
- Fix `struct TimeInterval` to use `int64_t` instead of `int32_t`, in case the second diff overflows.
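A usage sketch, assuming the functions are exposed per time unit as `<unit>_floor` / `<unit>_ceil` (e.g. `hour_floor`, `hour_ceil`); the results follow standard floor/ceil rounding:
```
SELECT hour_floor('2020-01-01 12:34:56');  -- 2020-01-01 12:00:00
SELECT hour_ceil('2020-01-01 12:34:56');   -- 2020-01-01 13:00:00
```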
* [SQL] Support WHERE, ORDER BY, and LIMIT clauses in the SHOW RESOURCES statement.
Grammar:
SHOW RESOURCES
[
WHERE
[NAME [ = "your_resource_name" | LIKE "name_matcher"]]
[RESOURCETYPE = ["SPARK"]]
]
[ORDER BY ...]
[LIMIT limit][OFFSET offset];
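For example, combining the clauses above (the filter values are hypothetical):
```
SHOW RESOURCES WHERE RESOURCETYPE = "SPARK" ORDER BY NAME LIMIT 10;
```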
issue #4501
Main CL:
1. Copy the code from BE to implement the `str_to_date()` function in FE.
2. `str_to_date("2020-08-08", "%Y-%m-%d %H:%i:%s")` will return `2020-08-08 00:00:00` instead of `2020-08-08`.
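Restating the example above as a query:
```
SELECT str_to_date("2020-08-08", "%Y-%m-%d %H:%i:%s");
-- before this CL: 2020-08-08
-- after this CL:  2020-08-08 00:00:00
```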
replace is a user-defined function that replaces all occurrences of an old substring with a new substring in a string, as follows:
mysql> select replace("http://www.baidu.com:9090", "9090", "");
+---------------------------------------------------+
| replace('http://www.baidu.com:9090', '9090', '')  |
+---------------------------------------------------+
| http://www.baidu.com:                             |
+---------------------------------------------------+
This PR adds InPredicate support to the DELETE statement,
and adds a max_allowed_in_element_num_of_delete variable to
limit the number of elements of an InPredicate in a DELETE statement, as sketched below.
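A sketch (the table, partition, and variable value are hypothetical; the variable name comes from this PR):
```
DELETE FROM tbl PARTITION p1 WHERE k1 IN (1, 2, 3);
-- cap how many elements an IN predicate may contain in DELETE
SET max_allowed_in_element_num_of_delete = 1024;
```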
Support the ALTER ROUTINE LOAD JOB statement, for example:
```
alter routine load db1.label1
properties
(
    "desired_concurrent_number" = "3",
    "max_batch_interval" = "5",
    "max_batch_rows" = "300000",
    "max_batch_size" = "209715200",
    "strict_mode" = "false",
    "timezone" = "+08:00"
)
```
Details can be found in `alter-routine-load.md`
This PR mainly does three things:
1. Fix the FE meta version bug introduced by #4029 when resolving a conflict with #4086.
2. Make the drop-check code easier to read.
3. Add doc content for the drop meta check.
This PR adds support for grammar like the following: INSTALL PLUGIN FROM [source] [PROPERTIES("KEY"="VALUE", ...)]
Users can set md5sum="xxxxxxx", so we don't need to provide a separate md5 URI.
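For example (the URL is a placeholder; the md5sum value is from the description above):
```
INSTALL PLUGIN FROM "http://example.com/my_plugin.zip"
PROPERTIES ("md5sum" = "xxxxxxx");
```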
[Bug] Fix some schema changes not working correctly
This CL mainly fixes schema changes to the VARCHAR type that did not work correctly
because a logic check was forgotten, and adds a ConvertTypeResolver that registers
the supported conversion types, so that such logic checks cannot be forgotten again.
This CL mainly changes:
1. Reorganized the code logic to limit the supported JSON formats to two, making import behavior more consistent.
2. Modified how error rows are counted when loading data in JSON format, so that error rows are counted correctly.
3. See `load-json-format.md` for details on loading data in JSON format.
Fix: #3946
CL:
1. Add a prepare phase for the `from_unixtime()`, `date_format()`, and `convert_tz()` functions, to process the format string once for all rows.
2. Find the cctz timezone when initializing the `runtime state`, so we don't need to look up the timezone for each row.
3. Add a constant rewrite rule for `utc_timestamp()`.
4. Add a doc for `to_date()`.
5. Comment out `push_handler_test`; it cannot run in DEBUG mode and will be fixed later.
6. Remove `timezone_db.h/cpp` and add `timezone_utils.h/cpp`.
Performance results are shown below:
11,000,000 rows
SQL1: `select count(from_unixtime(k1)) from tbl1;`
Before: 8.85s
After: 2.85s
SQL2: `select count(from_unixtime(k1, '%Y-%m-%d %H:%i:%s')) from tbl1 limit 1;`
Before: 10.73s
After: 4.85s
Date string formatting still seems slow; it may need further enhancement.
Currently we choose BEs at random without checking whether their disks are available.
CREATE TABLE will not fail until the create-tablet task has been sent to the BEs,
where each BE checks whether it has enough capacity to create the tablet.
Checking backend disk availability by storage medium up front reduces these unnecessary RPC calls.