Commit Graph

357 Commits

Author SHA1 Message Date
86d235a76a [Extension] Logstash Doris output plugin (#3800)
This plugin is used by Logstash to output data to Doris.
It uses the HTTP protocol to interact with the Doris FE HTTP interface
and loads data through Doris's stream load.
2020-06-11 08:54:51 +08:00
4adc9d45c2 [Doc] Update ALTER TABLE.md 2020-06-10 22:58:29 +08:00
de91037d8c [Doc] Add some routine load docs (#3796)
Add some documentation about using routine load in the cloud environment
2020-06-10 22:57:00 +08:00
4cb5f7a535 [Config] Remove max_user_connections from config (#3805)
Update max_user_connections via the user property instead:

```
SET PROPERTY FOR 'user' 'max_user_connections' = '100';
```
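To verify, the property can presumably be checked with:

```sql
SHOW PROPERTY FOR 'user' LIKE 'max_user_connections';
```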
2020-06-10 22:56:05 +08:00
a7bf006b51 Use BackendStatus to show BE's information in show backends (#3713)
The information is displayed in JSON format. For example:
{"lastTabletReportTime":"2020-05-28 15:29:01"}
2020-06-06 11:37:48 +08:00
ed9022a908 Ignore broken disk when BE starts up (#3741) 2020-06-05 10:26:07 +08:00
73719f263d Fix document (#3773) 2020-06-05 10:19:17 +08:00
wyb
fdf3415d06 [Website] Fix CREATE RESOURCE sidebar text and link not right bug (#3777) 2020-06-05 09:20:36 +08:00
01c1de1870 [Load] Add more metrics to trace the time cost in stream load and make brpc_num_threads configurable (#3703) 2020-06-04 13:37:28 +08:00
27046c5b61 [Enhancement] Improve the performance of query with IN predicate (#3694)
This CL mainly changes:
1. Add a new BE config `max_pushdown_conditions_per_column` to limit the number of conditions on a single column that can be pushed down to the storage engine.

2. Add 2 new session variables `max_scan_key_num` and `doris_max_scan_key_num`, which can be set at the session level and override the config value in BE.
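A minimal usage sketch, assuming both behave like ordinary session variables (48 is an illustrative value):

```sql
-- session-level override of the scan key limit
SET max_scan_key_num = 48;
SET doris_max_scan_key_num = 48;
```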
2020-06-04 11:39:00 +08:00
fc33ee3618 [Plugin] Add timeout of connection when downloading the plugins from URL (#3755)
If no timeout is set, the download process may be blocked forever.
2020-06-04 11:37:18 +08:00
791f8fee49 [Bug][Outfile] Fix bug that column separator is missing in output file. (#3765)
When outputting the result of a query using the `OUTFILE` statement, if some
output column is null, then the following column separator is missing.
2020-06-04 10:35:32 +08:00
2ad1b20b24 [Config] Add new BE config for tcmalloc (#3732)
Add a new BE config `tc_max_total_thread_cache_bytes`.
2020-06-03 21:58:13 +08:00
3194aa129d Add a link to Tablet Meta URL (#3745) 2020-06-03 10:10:32 +08:00
761a0ccd12 [Bug] Fix bug that the running profile shows time incorrectly on the FE web page, and add the running profile doc (#3722) 2020-06-02 11:07:15 +08:00
bc35f3a31f [DynamicPartition] Optimize the rule of creating dynamic partition (#3679)
Problem is described in ISSUE #3678 
This CL mainly changes the rule of creating dynamic partitions.

1. If the time unit is DAY, the logic remains unchanged.
2. If the time unit is WEEK, the logic changes as follows:

	1. Allow setting the start day of each week. The default is Monday; options range from Monday to Sunday.
	2. Assuming the starting day is Tuesday, the range of a partition is from Tuesday of one week to Monday of the next week.

3. If the time unit is MONTH, the logic changes as follows:

	1. Allow setting the start date of each month. The default is the 1st, and it can be selected from the 1st to the 28th.
	2. Assuming the starting date is the 2nd, the range of a partition is from the 2nd of one month to the 1st of the next month.

4. The `SHOW DYNAMIC PARTITION TABLES` statement adds a `StartOf` column to show the start day of the week or month.

It is recommended to refer to the example in `dynamic-partition.md`; a minimal sketch is also given below.
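A sketch under the new rule, assuming the `dynamic_partition.start_day_of_week` property introduced here (1 = Monday, so 2 below means weeks run Tuesday to Monday):

```sql
CREATE TABLE example_db.tbl1 (k1 DATE, v1 INT)
PARTITION BY RANGE (k1) ()
DISTRIBUTED BY HASH (k1) BUCKETS 8
PROPERTIES
(
    "dynamic_partition.enable" = "true",
    "dynamic_partition.time_unit" = "WEEK",
    "dynamic_partition.start" = "-2",
    "dynamic_partition.end" = "2",
    "dynamic_partition.prefix" = "p",
    "dynamic_partition.buckets" = "8",
    "dynamic_partition.start_day_of_week" = "2"  -- assumed: partitions span Tuesday to Monday
);
```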

TODO:
It would be better to also support HOUR and YEAR time units, maybe in a later PR.

FIX: #3678
2020-05-27 16:42:41 +08:00
wyb
4978bd6c81 [Spark load] Add resource manager (#3418)
1. User interface:

1.1 Spark resource management

Spark is used as an external computing resource in Doris to do ETL work. In the future, there may be other external resources used in Doris, for example, MapReduce for ETL, Spark/GPU for queries, HDFS/S3 for external storage. We introduce resource management to manage these external resources used by Doris.

```sql
-- create spark resource
CREATE EXTERNAL RESOURCE resource_name
PROPERTIES 
(                 
  type = spark,
  spark_conf_key = spark_conf_value,
  working_dir = path,
  broker = broker_name,
  broker.property_key = property_value
)

-- drop spark resource
DROP RESOURCE resource_name

-- show resources
SHOW RESOURCES
SHOW PROC "/resources"

-- privileges
GRANT USAGE_PRIV ON RESOURCE resource_name TO user_identity
GRANT USAGE_PRIV ON RESOURCE resource_name TO ROLE role_name

REVOKE USAGE_PRIV ON RESOURCE resource_name FROM user_identity
REVOKE USAGE_PRIV ON RESOURCE resource_name FROM ROLE role_name
```



- CREATE EXTERNAL RESOURCE:

`FOR user_name` is optional. If present, the external resource belongs to that user. If not, the external resource belongs to the system and is available to all users.

PROPERTIES:
1. type: resource type. Only spark is supported now.
2. spark configuration: follows the standard writing of Spark configurations; refer to: https://spark.apache.org/docs/latest/configuration.html.
3. working_dir: optional, used to store intermediate ETL results in Spark ETL.
4. broker: optional, used in Spark ETL. The intermediate ETL results need to be read with the broker when pushed into BE.

Example: 

```sql
CREATE EXTERNAL RESOURCE "spark0"
PROPERTIES 
(                                                                             
  "type" = "spark",                   
  "spark.master" = "yarn",
  "spark.submit.deployMode" = "cluster",
  "spark.jars" = "xxx.jar,yyy.jar",
  "spark.files" = "/tmp/aaa,/tmp/bbb",
  "spark.yarn.queue" = "queue0",
  "spark.executor.memory" = "1g",
  "spark.hadoop.yarn.resourcemanager.address" = "127.0.0.1:9999",
  "spark.hadoop.fs.defaultFS" = "hdfs://127.0.0.1:10000",
  "working_dir" = "hdfs://127.0.0.1:10000/tmp/doris",
  "broker" = "broker0",
  "broker.username" = "user0",
  "broker.password" = "password0"
)
```



- SHOW RESOURCES:
General users can only see their own resources.
Admin and root users can see all resources.




1.2 Create spark load job

```sql
LOAD LABEL db_name.label_name 
(
  DATA INFILE ("/tmp/file1") INTO TABLE table_name, ...
)
WITH RESOURCE resource_name
[(key1 = value1, ...)]
[PROPERTIES (key2 = value2, ... )]
```

Example:

```sql
LOAD LABEL example_db.test_label 
(
  DATA INFILE ("hdfs:/127.0.0.1:10000/tmp/file1") INTO TABLE example_table
)
WITH RESOURCE "spark0"
(
  "spark.executor.memory" = "1g",
  "spark.files" = "/tmp/aaa,/tmp/bbb"
)
PROPERTIES ("timeout" = "3600")
```

The Spark configurations in the load statement override the existing configuration in the resource for temporary use.

#3010
2020-05-26 18:21:21 +08:00
77b9acc242 [Stmt] Add rowCount column to SHOW DATA stmt (#3676)
Users can see the row count of all materialized indexes of a table.

```
mysql> show data from test;
+-----------+-----------+-----------+--------------+----------+
| TableName | IndexName | Size      | ReplicaCount | RowCount |
+-----------+-----------+-----------+--------------+----------+
| test2     | r1        | 10.000MB  | 30           | 10000    |
|           | r2        | 20.000MB  | 30           | 20000    |
|           | test2     | 50.000MB  | 30           | 50000    |
|           | Total     | 80.000MB  | 90           |          |
+-----------+-----------+-----------+--------------+----------+
```

Fix #3675
2020-05-26 15:53:38 +08:00
963d4d48aa Override the style of sidebar's sub-directory (#3683)
Override the style of sidebar's sub-directory.
2020-05-26 09:07:55 +08:00
3ffc447b38 [OUTFILE] Support INTO OUTFILE to export query result (#3584)
This CL mainly changes:

1. Support `SELECT INTO OUTFILE` command.
2. Support export query result to a file via Broker.
3. Support CSV export format with specified column separator and line delimiter.
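A minimal sketch of the new command (table, broker name, and path are assumed placeholders):

```sql
SELECT * FROM example_tbl
INTO OUTFILE "hdfs://127.0.0.1:10000/tmp/result_"
FORMAT AS CSV
PROPERTIES
(
    "broker.name" = "broker0",       -- assumed broker name
    "column_separator" = ",",
    "line_delimiter" = "\n"
);
```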
2020-05-25 21:24:56 +08:00
e6864a1cda Allow user to set thrift_client_timeout_ms config for thrift server (#3670)
1. Allow user to set thrift_client_timeout_ms config for thrift server
2. Add doc for thrift_client_timeout_ms config
2020-05-25 11:32:14 +08:00
ba7d2dbf7b [Function] Support utf-8 encoding in instr, locate, locate_pos, lpad, rpad (#3638)
Support UTF-8 encoding for the string functions `instr`, `locate`, `locate_pos`, `lpad`, `rpad`,
and add unit tests for them.
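A quick sketch of the expected character-based (rather than byte-based) semantics:

```sql
-- with UTF-8 aware semantics these count characters, not bytes
SELECT instr('你好世界', '世界');  -- expected: 3
SELECT lpad('世界', 4, 'ab');      -- expected: 'ab世界'
```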
2020-05-22 14:34:26 +08:00
dbfe8a067f [Doc] Add docs of max_running_txn_num_per_db (#3657)
Change-Id: Ibdbc19a5558b0eb3f6a5fc4ef630de255b408a92
2020-05-22 10:22:11 +08:00
f6b5c8839b [Bug] Ignore loading DELETE status tablet error when restarting BE (#3641)
Fix: #3640 

Also add a `batch delete meta` feature for `meta tool`
Fix #3639
2020-05-21 19:08:28 +08:00
ef8fd1fcbe [Load] Support loading JSON data into Doris by RoutineLoad or StreamLoad (#3553)
Doris supports loading JSON data by RoutineLoad or StreamLoad.
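A minimal routine load sketch for JSON (job name, table, topic, and jsonpaths are assumed placeholders):

```sql
CREATE ROUTINE LOAD example_db.json_job ON example_tbl
PROPERTIES
(
    "format" = "json",
    "jsonpaths" = "[\"$.k1\", \"$.k2\"]"  -- assumed column mapping
)
FROM KAFKA
(
    "kafka_broker_list" = "127.0.0.1:9092",
    "kafka_topic" = "topic0"
);
```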
2020-05-21 13:00:49 +08:00
0d66e6bd15 Support bitmap_intersect (#3571)
* Support bitmap_intersect

Support the aggregate function bitmap_intersect; it is mainly used to take the intersection of grouped data.
The function `bitmap_intersect(expr)` calculates the intersection of bitmap columns and returns a bitmap object.
The definition is as follows:
FunctionName: bitmap_intersect,
InputType: bitmap,
OutputType: bitmap

The scenario is as follows:
Query which users satisfy all three tags a, b, and c at the same time.

```
select bitmap_to_string(bitmap_intersect(user_id)) from
(
    select bitmap_union(user_id) user_id from bitmap_intersect_test
    where tag in ('a', 'b', 'c')
    group by tag
) a
```
Closed #3552.

* Add docs of bitmap_union and bitmap_intersect

* Support null of bitmap_intersect
2020-05-20 21:12:02 +08:00
6be7a6232f [Config] Add ignore config to determine whether BE continues to start when loading a tablet from its header fails. (#3632)
Add config ignore_load_tablet_failure to determine whether BE should continue to start when loading a tablet from its header fails.
2020-05-20 09:40:50 +08:00
4cbcae1574 [Spark on Doris] Shade and provide the thrift lib in spark-doris-connector (#3631)
Mainly changes:
1. Shade and provide the thrift lib in spark-doris-connector
2. Add a `build.sh` for spark-doris-connector
3. Move the README.md of spark-doris-connector to `docs/`
4. Change the line delimiter of `fe/src/test/java/org/apache/doris/analysis/AggregateTest.java`
2020-05-19 14:20:21 +08:00
87caa697a9 [Doc] Update table-restore-tool.md
Fix some format.

NOTICE (#3622):
This is a "revert of revert" pull request.
This PR is mainly used to combine PRs whose commits were scattered
due to the wrong merge method into a single complete commit.
2020-05-18 14:42:17 +08:00
24ca937877 Revert "[Doc] Update table-restore-tool.md" (#3606) 2020-05-18 12:08:54 +08:00
0d76c78537 [Doc] Update table-restore-tool.md 2020-05-18 11:12:24 +08:00
d4ff6dcdd6 fix by review 2020-05-18 10:56:12 +08:00
a4e98953be [website] modify download links & remove some links' suffix _EN(master) (#3573)
modify download links & remove some links' suffix _EN
2020-05-15 14:03:28 +08:00
4464328d8f [Doc] Add doc link to char_length (#3548) 2020-05-14 21:21:31 +08:00
47bce081d2 [website] Support documents' fulltext searching (master) (#3535)
add documents' fulltext search powered by algolia
2020-05-13 21:18:42 +08:00
95c67db712 [community] Add Committer Guide (#3522) 2020-05-13 21:17:12 +08:00
40cd5365ce [Doc] Update table-restore-tool.md
Fix some format.
2020-05-13 18:51:11 +08:00
56db6e7a35 [Config] Allow user to configure BRPC socket_max_unwritten_bytes (#3488)
Add new BE config `brpc_socket_max_unwritten_bytes`
2020-05-10 17:56:14 +08:00
488aa22938 [Doc] Update plugin document (#3447) (#3505) 2020-05-09 19:19:38 +08:00
a656a7ddd4 Support append_trailing_char_if_absent function (#3439) 2020-05-09 08:59:34 +08:00
94b3a2bd50 [Bug] Fix string functions not supporting multibyte strings (#3345)
Make string functions support UTF-8 encoding.
2020-05-08 12:52:46 +08:00
f591976976 [Doc] Fix the incorrect docs (#3501) 2020-05-08 12:47:00 +08:00
5e63629b8b [Decommission] Support NOT dropping BE after decommission (#3461)
Add a new config `drop_backend_after_decommission` in FE. If this config
is false, the BE will not be dropped after finishing the decommission operation.

This new config tries to solve the problem described in ISSUE #3460; a sketch of its use is given below.
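A minimal sketch of the two together (host and heartbeat port are assumed placeholders):

```sql
-- keep the BE in the cluster even after decommission finishes
ADMIN SET FRONTEND CONFIG ("drop_backend_after_decommission" = "false");
ALTER SYSTEM DECOMMISSION BACKEND "127.0.0.1:9050";
```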

TODO:
This method will generate a lot of data migration, so it is only a temporary solution.
After that, we should try to solve the problem of data balancing within the BE.

This CL also adds the documents of FE and BE configuration.
These documents are incomplete and can be extended later.
2020-05-06 17:14:24 +08:00
dafb356b42 [Bugfix] Fix navbar not showing on mobile clients (#3419) & image relative path problem (#3427) 2020-05-06 11:57:03 +08:00
a1500eb544 Update doris-on-es.md (#3446) 2020-05-03 12:48:48 +08:00
2cb4027164 Update doris-on-es.md (#3441) 2020-05-03 12:48:19 +08:00
54da5a491c Fix delete statement doc display not correctly (#3445) 2020-05-01 19:20:00 +08:00
73a3c59efb [Bug] Fix bug that help-resource.zip file is missing. (#3423) 2020-04-29 19:25:28 +08:00
432965e360 [Enhancement] Documents rebuilt with Vuepress (#3408) (#3414) 2020-04-29 09:14:31 +08:00
9a934ec9f6 [Load] Add more info in SHOW LOAD result (#3391)
Fix #3390
This CL adds more info in the `JobDetails` column of the `SHOW LOAD` result for Broker Load jobs.

For example:
```
{
    "Unfinished backends": {
        "9c3441027ff948a0-8287923329a2b6a7": [10002]
    },
    "All backends": {
        "9c3441027ff948a0-8287923329a2b6a7": [10002, 10004, 10006]
    },
    "ScannedRows": 2390016,
    "TaskNumber": 1,
    "FileNumber": 1,
    "FileSize": 1073741824
}
```

2 newly added keys:

`Unfinished backends` indicates the BEs whose tasks are not yet finished.
`All backends` indicates all BEs on which this job has tasks.

One more thing: I pass the backend ID along with the heartbeat message from FE to BE, so that each BE can know its own ID.
2020-04-26 21:30:23 +08:00