Commit Graph

2575 Commits

Author SHA1 Message Date
f2658dc7bd [Feature](multi-catalog) Truncate char or varchar columns if size is smaller than file columns or not found in the file column schema. (#22318)
When the session variable `truncate_char_or_varchar_columns` is set, char or varchar columns are truncated if their declared size is smaller than that of the file columns, or if they are not found in the file column schema.
2023-08-10 14:37:20 +08:00
57fb9799b5 [feature](agg) add aggregation function 'bitmap_agg' (#22768)
This function can replace bitmap_union(to_bitmap(expr)), which must first create many small bitmaps and then merge them into a single bitmap.
bitmap_agg converts the column values into a bitmap directly, so it performs better than bitmap_union(to_bitmap(expr)). In our test, the improvement was about 30%.
2023-08-10 12:18:25 +08:00
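The difference between the two approaches described for `bitmap_agg` (#22768) can be illustrated with a toy model. This is only a sketch, not Doris code: a bitmap is modeled as a Python integer bitmask, and the helper names below are illustrative stand-ins for the SQL functions of the same name.

```python
def to_bitmap(value: int) -> int:
    """Build a one-element bitmap (modeled here as an int bitmask)."""
    return 1 << value


def bitmap_union_of_to_bitmap(values):
    """Two-step approach: create one small bitmap per value, then merge them."""
    small_bitmaps = [to_bitmap(v) for v in values]  # many temporary bitmaps
    merged = 0
    for bm in small_bitmaps:
        merged |= bm
    return merged


def bitmap_agg(values):
    """Direct approach: set each value's bit in a single accumulator."""
    acc = 0
    for v in values:
        acc |= 1 << v
    return acc


def bitmap_to_list(bm: int):
    """List the positions of the set bits, in ascending order."""
    return [i for i in range(bm.bit_length()) if (bm >> i) & 1]


values = [3, 1, 3, 7, 1]
# Both approaches produce the same bitmap; only the intermediate work differs.
assert bitmap_union_of_to_bitmap(values) == bitmap_agg(values)
print(bitmap_to_list(bitmap_agg(values)))  # [1, 3, 7]
```

The toy model shows only why the results are equivalent; the roughly 30% improvement quoted in the commit comes from the authors' own test and is not reproduced here.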
c1bc2c289b [doc](stats) Add description for some new configure option in stats related docs (#22723) 2023-08-10 11:37:50 +08:00
eafdab0cfd [Enhancement](tvf) Add frontends_disks table-valued-function (#22568)
---------

Co-authored-by: yuxianbing <yuxianbing@yy.com>
Co-authored-by: yuxianbing <iloveqaz123>
2023-08-10 10:40:24 +08:00
5147c096ef [Enhancement] Add an API to query session information for all FEs (#20134)
Currently, Doris only has an interface for querying the session information of a specific FE, but we often need to know how many sessions there are in the whole cluster, so I added this API.

```
GET /rest/v1/session/all

{
    "msg": "success",
    "code": 0,
    "data": {
        "column_names": ["FE", "Id", "User", "Host", "Cluster", "Db", "Command", "Time", "State", "Info"],
        "rows": [{
            "FE": "10.14.170.23",
            "User": "root",
            "Command": "Sleep",
            "State": "",
            "Cluster": "default_cluster",
            "Host": "10.81.85.89:31465",
            "Time": "230",
            "Id": "0",
            "Info": "",
            "Db": "db1"
        },
        {
            "FE": "10.14.170.24",
            "User": "root",
            "Command": "Sleep",
            "State": "",
            "Cluster": "default_cluster",
            "Host": "10.81.85.88:61465",
            "Time": "460",
            "Id": "1",
            "Info": "",
            "Db": "db1"
        }]
    },
    "count": 2
}
```
2023-08-09 19:02:45 +08:00
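A client of the `/rest/v1/session/all` endpoint from #20134 might consume the response roughly as follows. This is a sketch under assumptions: the base URL and the use of HTTP basic auth are guesses about a particular deployment, and `fetch_all_sessions` and `sessions_per_fe` are hypothetical helper names. Only the path and the response fields come from the commit message.

```python
import base64
import json
import urllib.request
from collections import Counter


def fetch_all_sessions(base_url, user, password):
    """GET /rest/v1/session/all from one FE (assumes HTTP basic auth)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(
        f"{base_url}/rest/v1/session/all",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def sessions_per_fe(payload):
    """Count the sessions reported for each FE in the response payload."""
    return dict(Counter(row["FE"] for row in payload["data"]["rows"]))


# Offline example using a trimmed copy of the response in the commit message.
sample = json.loads("""
{
  "msg": "success",
  "code": 0,
  "data": {
    "rows": [
      {"FE": "10.14.170.23", "User": "root", "Id": "0", "Db": "db1"},
      {"FE": "10.14.170.24", "User": "root", "Id": "1", "Db": "db1"}
    ]
  },
  "count": 2
}
""")
print(sessions_per_fe(sample))  # {'10.14.170.23': 1, '10.14.170.24': 1}
```

The network call is kept separate from the parsing so the summary logic can be exercised without a running cluster.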
9422494064 [docs](docs)Rename Title and URL of HLL Functions (#22728) 2023-08-09 15:53:39 +08:00
58ef388c32 [docs](docs)Rename Title and URL of JSON Functions (#22732) 2023-08-09 15:53:25 +08:00
af5f3ae2a6 [docs](docs)Rename Title & URL and Change Category Name as Numeric of Math Functions (#22733) 2023-08-09 15:52:49 +08:00
2fb7aba9bc [docs](docs)Rename Title and URL of IP Functions (#22741) 2023-08-09 15:52:35 +08:00
910863b329 [docs](docs) Rename Window Functions (#22742) 2023-08-09 15:52:22 +08:00
780ba83d91 [docs](docs)Rename the Files Without Category of Sql Functions (#22746) 2023-08-09 15:51:47 +08:00
61e661d389 [docs](docs)Rename Title and URL of Table Functions (#22747) 2023-08-09 15:51:15 +08:00
c443bce141 [docs](docs)Delete Dash Between Title of Benchmark (#22751) 2023-08-09 15:51:01 +08:00
bf29110856 [docs](docs)Rename Title of FAQ-CN Version (#22752) 2023-08-09 15:50:44 +08:00
4332e15800 [docs](docs)Rename Title and URL of Hash Functions (#22726) 2023-08-09 15:50:23 +08:00
ed91ce5b1a [docs](docs)Rename Title and URL of Conditional Functions (#22725) 2023-08-09 15:49:11 +08:00
1625a7993c [docs](docs)Rename Title and URL of Bitmap Functions (#22721) 2023-08-09 15:48:16 +08:00
77d3d4e324 [fix](cache) add sql cache conf cache_result_max_data_size (#22645)
Limiting only the maximum number of rows in the SQL cache with cache_result_max_row_count is not enough: if a single row of data is too large, the FE may still run out of memory (OOM).
2023-08-09 14:46:23 +08:00
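The reasoning behind `cache_result_max_data_size` (#22645) can be sketched as a check that applies both limits. This is an illustration only, not the FE implementation: the function name, the way a row's size is estimated, and the limit values in the example are all assumptions.

```python
def is_cacheable(rows, max_row_count, max_data_size):
    """Return True only if the result fits within both the row-count limit
    and the total-size limit (size estimated from the UTF-8 text of each cell)."""
    total_bytes = 0
    for count, row in enumerate(rows, start=1):
        if count > max_row_count:
            return False  # too many rows
        total_bytes += sum(len(str(cell).encode("utf-8")) for cell in row)
        if total_bytes > max_data_size:
            return False  # too many bytes, even if the row count is small
    return True


small_result = [(1, "a"), (2, "b")]
one_huge_row = [(1, "x" * 1_000_000)]

# Hypothetical limits: 3000 rows and 1024 bytes.
print(is_cacheable(small_result, 3000, 1024))  # True
print(is_cacheable(one_huge_row, 3000, 1024))  # False
```

The second call is the case the commit describes: a result with a single row passes any row-count limit, so only the size limit rejects it.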
19a2617d70 [docs](streamload) improve some formatting (#22659) 2023-08-09 14:38:22 +08:00
445bebb4cb [docs](docs)Rename Titles & URL and Add Histogram Files of Aggregate Functions (#22720) 2023-08-09 13:13:07 +08:00
512af9b2ff [docs](docs)Rename Title and URL of Struct Functions (#22712) 2023-08-09 13:12:48 +08:00
7164dd09e6 [docs](docs) Rename Title and URL of String Functions (#22711) 2023-08-09 13:12:26 +08:00
d7ea27a5c7 [docs](docs) Rename Title and URL of GIS Functions (#22704) 2023-08-09 13:12:10 +08:00
a778027569 [typo](docs)Modified description of JSON /String size (#21694) 2023-08-09 10:00:25 +08:00
8ef38637ae [docs](docs) Rename Title and URL of Date Functions (#22686) 2023-08-08 14:44:05 +08:00
0c972288ef [docs](docs)Rename Title and URL of Array Functions for SEO (#22669) 2023-08-08 14:32:05 +08:00
1d2046de64 [docs](docs)Rename Title of zh-CN Docs (#22662) 2023-08-08 14:31:28 +08:00
66784cef71 [Enhancement](Load) Stream Load using SQL (#22509)
This PR was originally #16940, but that PR has not been updated by its original author @Cai-Yao for a long time. For now, we will merge part of the code into master first.

Thanks @Cai-Yao @yiguolei
2023-08-08 13:49:04 +08:00
d1a2473944 [Feature](broker)Support GCS (#20904) 2023-08-07 19:37:18 +08:00
77e772e103 [enhancement](config) add some pre-process and pre-check for BE storage config attentions in docs (#22486) 2023-08-07 18:16:57 +08:00
1a8a1e5b16 [Feature](count_by_enum) support count_by_enum function (#22071)
count_by_enum(expr1, expr2, ... , exprN);

Treats the data in a column as an enumeration and counts the number of values in each enumeration. Returns the number of enumerated values for each column, and the number of non-null values versus the number of null values.
2023-08-06 16:05:14 +08:00
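The behavior described for `count_by_enum` (#22071) can be modeled in a few lines. This is a sketch of the described semantics only: the layout of the result (the `counts`, `notnull`, and `null` keys) is chosen for illustration and is not claimed to match what Doris returns.

```python
from collections import Counter


def count_by_enum(*columns):
    """For each column, count each distinct non-null value, plus the number
    of non-null and null entries. None stands in for SQL NULL."""
    results = []
    for col in columns:
        non_null = [v for v in col if v is not None]
        results.append({
            "counts": dict(Counter(non_null)),
            "notnull": len(non_null),
            "null": len(col) - len(non_null),
        })
    return results


gender = ["M", "F", None, "F"]
print(count_by_enum(gender))
# [{'counts': {'M': 1, 'F': 2}, 'notnull': 3, 'null': 1}]
```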
d628baba0a [improvement](hdfs) support hedged read (#22634)
In some cases, high load on HDFS can make reading data from HDFS slow,
which reduces overall query efficiency. The HDFS client provides hedged reads:
when a read request exceeds a certain threshold without returning, the client starts
another read thread for the same data and uses whichever result returns first.

Example:

create catalog regression properties (
    'type'='hms',
    'hive.metastore.uris' = 'thrift://172.21.16.47:7004',
    'dfs.client.hedged.read.threadpool.size' = '128',
    'dfs.client.hedged.read.threshold.millis' = '500'
);
2023-08-06 14:51:48 +08:00
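The hedged-read idea from #22634 can be sketched with Python's standard thread pool. This is a generic illustration of the technique, not the HDFS client's code: `hedged_read`, `fake_read`, the replica names, and the latencies are all hypothetical. Loosely, `threshold_s` plays the role of `dfs.client.hedged.read.threshold.millis` and the pool size plays the role of `dfs.client.hedged.read.threadpool.size`.

```python
import concurrent.futures as cf
import time


def hedged_read(read_fn, replicas, threshold_s, pool):
    """Read from replicas[0]; if it is still pending after threshold_s seconds,
    start the same read on replicas[1] and return whichever finishes first."""
    primary = pool.submit(read_fn, replicas[0])
    done, _ = cf.wait([primary], timeout=threshold_s)
    if done:
        return primary.result()  # primary answered within the threshold

    # Threshold exceeded: launch the hedge and race the two requests.
    hedge = pool.submit(read_fn, replicas[1])
    done, _ = cf.wait([primary, hedge], return_when=cf.FIRST_COMPLETED)
    # If the first finisher failed, its exception propagates here.
    return next(iter(done)).result()


# Simulated replicas with hypothetical latencies, in seconds.
LATENCY = {"replica-a": 0.5, "replica-b": 0.01}


def fake_read(replica):
    time.sleep(LATENCY[replica])
    return f"data from {replica}"


with cf.ThreadPoolExecutor(max_workers=2) as pool:
    result = hedged_read(fake_read, ["replica-a", "replica-b"], 0.05, pool)
print(result)  # data from replica-b
```

The slower request is not cancelled in this sketch; it simply runs to completion in the pool and its result is discarded.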
96f42ca20a [fix](memory) Independent count exec node memory profile (#22598)
Count the exec node memory profile independently; follow-up to #22582.
2023-08-06 10:56:31 +08:00
1073a924c1 [fix](doc) Update upgrade.md file add mysql_ssl_default_certificate folder info #22623 2023-08-06 10:55:46 +08:00
20fbc6dcf4 [typo](doc) Modify auto pull up document #21662 2023-08-06 10:47:40 +08:00
60b802eafa [typo](docs) update document syntax and add links to referenced articles (#21817) 2023-08-06 10:41:08 +08:00
fe6bae2924 [fix](invert index) supports utf8 and non-utf8 strings (#22570)
Supports UTF-8 and non-UTF-8 strings. See doris-thirdparty#110: [fix] compatible with utf8 and invalid utf8.
2023-08-05 12:52:53 +08:00
fcdd1b96d2 [docs](delete-recover) merge docs: recover catalog and recover tablet trash #22525
Doris trash includes the FE catalog recycle bin and the BE tablet trash. Users are sometimes confused about the two, so the docs are merged to make them easier to understand.
2023-08-05 10:31:48 +08:00
ea674aa540 [docs](community) Delete Gitter Mannual of EN & CN Verison (#22348) 2023-08-04 19:45:31 +08:00
846d6edab8 [docs](docs) Rename Advanced Usage Files for SEO (#22511) 2023-08-04 19:33:57 +08:00
d040a858f2 [docs](docs) Capitalize Query Acceleration Files Name and Title (#22512) 2023-08-04 19:33:31 +08:00
30b8c7b9e6 [docs](docs) Rename Lakehouse Files for SEO (#22513) 2023-08-04 19:33:02 +08:00
577cd51fde [docs](docs) Capitalize Ecosystem Files Name and Titles (#22515) 2023-08-04 19:32:39 +08:00
89fff98ced [docs](docs)Update spark_load.md (#22428) 2023-08-04 19:22:52 +08:00
872280135d [exec](pipeline) revert FE pipeline instance num pr (#22617)
* Revert "[fix](executor) only mysql connect to set GlobalPipelineTask (#22205)"
* Revert "[feature](executor) using fe version to set instance_num (#22047)"
2023-08-04 19:07:14 +08:00
d974af5feb [Fix](Load)Multi table plan not include task info (#22613) 2023-08-04 18:52:22 +08:00
95aa4d8631 [Feature](Export) Supports concurrently export of table data (#21911) 2023-08-04 18:50:17 +08:00
93593a013d [feature](load) add segment bytes limit in segcompaction (#22526) 2023-08-04 18:00:52 +08:00
9cf6b1b4cf [docs](typo) fix some typo of docs (#22591) 2023-08-04 16:29:04 +08:00
b9e344617a [typo](kerberos)support read jdk auth creds and add some krb tips in FAQ (#22535)
Support reading JDK auth credentials and add some Kerberos tips to the FAQ:
1. About 'javax.security.auth.useSubjectCredsOnly': https://stackoverflow.com/questions/43660265/java-automatically-uses-kerberos-ticketcache-when-it-shouldnt
2. Add tips for `No common protection layer between client and server` and the yum JDK version.
2023-08-04 14:51:31 +08:00