doris

Author	SHA1	Message	Date
Hu Yanjun	4512569a3a	[docs](releasenote)Update en release note 2.0.0 (#23041 )	2023-08-16 15:13:09 +08:00
Liqf	a2095b7d9e	[fix](docs) add enable_single_replica_load on be config doc (#22948 )	2023-08-16 10:31:01 +08:00
zy-kkk	fe08db191f	[typo](docs) Optimize the release note 2.0.0 (#22926 )	2023-08-15 20:09:56 +08:00
irenesrl	27f5b623e6	[Chore](docs)Add SSL Faq (#22956 )	2023-08-15 09:49:39 +08:00
Luzhijing	c67d1cc805	[docs](releasenote)2.0.0 release note (#22904 )	2023-08-14 10:11:03 +08:00
zxealous	e2b06cd0cf	[opt](docs) Optimize docs to avoid user set wrong replication_allocation (#22767 )	2023-08-14 09:38:22 +08:00
Kaijie Chen	79a61ced42	[docs](load) fix indentation in stream load manual (#22807 )	2023-08-13 10:16:11 +08:00
Jack Drogon	1f8cb3f54a	[Chore](doc) Fix doc zh-CN typo (#22903 ) Signed-off-by: Jack Drogon <jack.xsuperman@gmail.com>	2023-08-12 16:14:06 +08:00
zxealous	2b81553879	[doc](docs) Add some docs of baidu cloud bos (#22833 ) * [doc](docs) Add some docs of baidu cloud bos * fix	2023-08-12 07:09:57 +08:00
Jibing-Li	5b09254fac	[improvement](external statistics)Fix external stats collection bugs (#22788 ) 1. Collect external table row count when executing analyze database. 2. Support showing cached table stats (row count). 3. Support altering external table column stats. 4. Refresh/invalidate the in-memory table row count cache when an analyze task finishes and when table stats are dropped.	2023-08-11 21:58:24 +08:00
KassieZ	84ee814bc3	[docs](docs) Update invalid pics of release note 1.1.0 and 2.0-beta (#22804 )	2023-08-11 20:08:21 +08:00
zy-kkk	3e9ba632d7	[typo](docs) Add a guide to using SQL for the jdbc catalog (#22880 )	2023-08-11 16:28:42 +08:00
AKIRA	0c38f42827	[fix](doc) Remove introduction to unstable features (#22832 ) 1. Remove introduction to unstable features 2. Rename some sub-titles to avoid mixed use of Chinese and English	2023-08-11 15:59:16 +08:00
Yulei-Yang	94a7b44540	[Improvement](log) add config to control compression of fe log & fe audit log (#22865 ) The FE log is large on a busy Doris cluster, so preserving historical logs costs a lot of disk space. Enabling compression is a good way to save space, and a gzip-compressed text file can be viewed without decompression.	2023-08-11 14:08:08 +08:00
Calvin Kirs	caf496a67e	[Chore](RoutineLoad)Change max_batch_interval minimum limit from 5 to 1 (#22858 )	2023-08-11 12:02:20 +08:00
Chuanle Chen	71807ceb5f	[Enhancement](tvf) Table value function support reading local file (#17404 ) I tested the local tvf with TPC-H queries. First, generate a `lineitem` dataset with 6001215 rows and load it into the `lineitem` table by: ``` insert into lineitem select c11, c1, c4, c2, c3, c5, c6, c7, c8, c9, c10, c12, c13, c14, c15, c16 from local( "file_path" = "tools/tpch-tools/bin/tpch-data/lineitem.tbl.1", "backend_id" = "10003", "format" = "csv", "column_separator" = "\|" ); ``` Then run the `q1` and `q16` TPC-H queries; the query results are correct. The local tvf can also be used to analyze a BE log directly, for example: ``` mysql> select * from local( "file_path" = "log/be.out", "backend_id" = "10006", "format" = "csv") where c1 like "%start_time%" limit 10; +--------------------------------------------------------+ \| c1 \| +--------------------------------------------------------+ \| start time: 2023年 08月 07日星期一 23:20:32 CST \| \| start time: 2023年 08月 07日星期一 23:32:10 CST \| \| start time: 2023年 08月 08日星期二 00:20:50 CST \| \| start time: 2023年 08月 08日星期二 00:29:15 CST \| +--------------------------------------------------------+ ```	2023-08-10 20:07:42 +08:00
Calvin Kirs	221e860cb7	[Feature](Routine Load)Support Partial Update (#22785 )	2023-08-10 17:41:53 +08:00
Qi Chen	f2658dc7bd	[Feature](multi-catalog) Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema. (#22318 ) Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema by session var `truncate_char_or_varchar_columns`.	2023-08-10 14:37:20 +08:00
Jerry Hu	57fb9799b5	[feature](agg) add aggregation function 'bitmap_agg' (#22768 ) This function can be used to replace bitmap_union(to_bitmap(expr)), because bitmap_union(to_bitmap(expr)) needs to create many small bitmaps first and then merge them into a single bitmap, whereas bitmap_agg converts the column values into a bitmap directly. Its performance is better than bitmap_union(to_bitmap(expr)): in our test, there is about a 30% improvement.	2023-08-10 12:18:25 +08:00
AKIRA	c1bc2c289b	[doc](stats) Add description for some new configure option in stats related docs (#22723 )	2023-08-10 11:37:50 +08:00
herry2038	eafdab0cfd	[Enhancement](tvf) Add frontends_disks table-valued-function (#22568 ) --------- Co-authored-by: yuxianbing <yuxianbing@yy.com> Co-authored-by: yuxianbing <iloveqaz123>	2023-08-10 10:40:24 +08:00
HB	5147c096ef	[Enhancement] Add an API to query session information for all FEs (#20134 ) Currently, Doris only has an interface for querying the session information of one specific FE, but we often need to know how many sessions there are across the current cluster, so I added this API. ` GET /rest/v1/session/all { "msg": "success", "code": 0, "data": { "column_names": ["FE", "Id", "User", "Host", "Cluster", "Db", "Command", "Time", "State", "Info"], "rows": [{ "FE": "10.14.170.23", "User": "root", "Command": "Sleep", "State": "", "Cluster": "default_cluster", "Host": "10.81.85.89:31465", "Time": "230", "Id": "0", "Info": "", "Db": "db1" }, { "FE": "10.14.170.24", "User": "root", "Command": "Sleep", "State": "", "Cluster": "default_cluster", "Host": "10.81.85.88:61465", "Time": "460", "Id": "1", "Info": "", "Db": "db1" }] }, "count": 2 } `	2023-08-09 19:02:45 +08:00
KassieZ	9422494064	[docs](docs)Rename Title and URL of HLL Functions (#22728 )	2023-08-09 15:53:39 +08:00
KassieZ	58ef388c32	[docs](docs)Rename Title and URL of JSON Functions (#22732 )	2023-08-09 15:53:25 +08:00
KassieZ	af5f3ae2a6	[docs](docs)Rename Title & URL and Change Category Name as Numeric of Math Functions (#22733 )	2023-08-09 15:52:49 +08:00
KassieZ	2fb7aba9bc	[docs](docs)Rename Title and URL of IP Functions (#22741 )	2023-08-09 15:52:35 +08:00
KassieZ	910863b329	[docs](docs) Rename Window Functions (#22742 )	2023-08-09 15:52:22 +08:00
KassieZ	780ba83d91	[docs](docs)Rename the Files Without Category of Sql Functions (#22746 )	2023-08-09 15:51:47 +08:00
KassieZ	61e661d389	[docs](docs)Rename Title and URL of Table Functions (#22747 )	2023-08-09 15:51:15 +08:00
KassieZ	c443bce141	[docs](docs)Delete Dash Between Title of Benchmark (#22751 )	2023-08-09 15:51:01 +08:00
KassieZ	bf29110856	[docs](docs)Rename Title of FAQ-CN Version (#22752 )	2023-08-09 15:50:44 +08:00
KassieZ	4332e15800	[docs](docs)Rename Title and URL of Hash Functions (#22726 )	2023-08-09 15:50:23 +08:00
KassieZ	ed91ce5b1a	[docs](docs)Rename Title and URL of Conditional Functions (#22725 )	2023-08-09 15:49:11 +08:00
KassieZ	1625a7993c	[docs](docs)Rename Title and URL of Bitmap Functions (#22721 )	2023-08-09 15:48:16 +08:00
Xinyi Zou	77d3d4e324	[fix](cache) add sql cache conf `cache_result_max_data_size` (#22645 ) Limiting only the maximum number of rows in the sql cache with `cache_result_max_row_count` is not enough: if a row of data is too large, FE may still OOM.	2023-08-09 14:46:23 +08:00
Kaijie Chen	19a2617d70	[docs](streamload) improve some formatting (#22659 )	2023-08-09 14:38:22 +08:00
KassieZ	445bebb4cb	[docs](docs)Rename Titles & URL and Add Histogram Files of Aggregate Functions (#22720 )	2023-08-09 13:13:07 +08:00
KassieZ	512af9b2ff	[docs](docs)Rename Title and URL of Struct Functions (#22712 )	2023-08-09 13:12:48 +08:00
KassieZ	7164dd09e6	[docs](docs) Rename Title and URL of String Functions (#22711 )	2023-08-09 13:12:26 +08:00
KassieZ	d7ea27a5c7	[docs](docs) Rename Title and URL of GIS Functions (#22704 )	2023-08-09 13:12:10 +08:00
echo-dundun	a778027569	[typo](docs)Modified description of JSON /String size (#21694 )	2023-08-09 10:00:25 +08:00
KassieZ	8ef38637ae	[docs](docs) Rename Title and URL of Date Functions (#22686 )	2023-08-08 14:44:05 +08:00
KassieZ	0c972288ef	[docs](docs)Rename Title and URL of Array Functions for SEO (#22669 )	2023-08-08 14:32:05 +08:00
KassieZ	1d2046de64	[docs](docs)Rename Title of zh-CN Docs (#22662 )	2023-08-08 14:31:28 +08:00
zzzzzzzs	66784cef71	[Enhancement](Load) Stream Load using SQL (#22509 ) This PR was originally #16940 by @Cai-Yao , but that PR has not been updated for a long time. For now, we will merge some of the code into master first. Thanks @Cai-Yao @yiguolei	2023-08-08 13:49:04 +08:00
Calvin Kirs	d1a2473944	[Feature](broker)Support GCS (#20904 )	2023-08-07 19:37:18 +08:00
Siyang Tang	77e772e103	[enhancement](config) add some pre-process and pre-check for BE storage config attentions in docs (#22486 )	2023-08-07 18:16:57 +08:00
czzmmc	1a8a1e5b16	[Feature](count_by_enum) support count_by_enum function (#22071 ) count_by_enum(expr1, expr2, ... , exprN); Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.	2023-08-06 16:05:14 +08:00
Mingyu Chen	d628baba0a	[improvement](hdfs) support hedged read (#22634 ) In some cases, high load on HDFS may make reading data from HDFS take a long time, slowing down the overall query. The HDFS Client provides Hedged Read: when a read request exceeds a certain threshold without returning, it starts another read thread to read the same data, and the result of whichever read returns first is used. For example: create catalog regression properties ( 'type'='hms', 'hive.metastore.uris' = 'thrift://172.21.16.47:7004', 'dfs.client.hedged.read.threadpool.size' = '128', 'dfs.client.hedged.read.threshold.millis' = "500" );	2023-08-06 14:51:48 +08:00
Xinyi Zou	96f42ca20a	[fix](memory) Independent count exec node memory profile (#22598 ) Independent count exec node memory profile, after #22582	2023-08-06 10:56:31 +08:00

2592 Commits