2795 Commits

SHA1 Message Date
51bc49a727 [docs](truncate-table) load may fail when truncating table (#25793) 2023-10-24 14:10:26 +08:00
87b414cdae [Fix](query execution) Fix result sink fragment can't be cancelled in non-pipeline (#25524) 2023-10-24 11:30:29 +08:00
f23fdcbbef [typo](doc)Adjust tablet_rowset_stale_sweep_time_sec parameter default value to 300 (#25584) 2023-10-24 10:32:08 +08:00
28c799ce35 [Docs](partial update) Fix a typo in docs in partial update for insert statement (#25776) 2023-10-23 17:54:47 +08:00
6f7f0a24c5 [doc](sidebar) Update partial-update sidebars.json (#25357) 2023-10-23 15:15:59 +08:00
fbc448520a [feature](ColdHeatSeperation) Support to upload cold data to HDFS (#22048) 2023-10-22 21:04:43 +08:00
49ca36720d [fix](docs) Fix mistakes in flink-doris-connector docs (#24512) (#25557) 2023-10-20 17:53:18 +08:00
68d3c25f26 [typo](doc)Modify the default value of Stale rowset cleanup policy(#25517) 2023-10-20 15:03:49 +08:00
32fe78511a [typo](doc) update spark connector two phase commit option doc (#24458) 2023-10-20 10:22:05 +08:00
68eaba7220 [DOC](fix) fix hyperlink to create tpch table (#25561) 2023-10-19 16:22:50 +08:00
2353582493 [enhancement](load) support for broker load, routine load, mysql load and add docs (#25528)
cases will be added later.
2023-10-19 15:43:22 +08:00
fcf7bdc9e0 [typo](docs) Rename Import Advanced to CN Version (#25374) 2023-10-19 10:18:30 +08:00
4752b800b2 [typo](doc)update config (#25425) 2023-10-19 10:02:31 +08:00
2a442972a8 [Fix](merge-on-write) Fix some bugs about sequence column (#24915)
1. Add the sequence column checks and handling from #21896 to the insert statement in the original planner and the Nereids planner.
2. Disallow dropping the sequence mapping column in schema change.
2023-10-18 20:40:12 +08:00
2ddd2e5079 [feature](Nereids) add map_agg function (#25246) 2023-10-18 06:44:36 -05:00
0ec537edef [fix](column-id) fix null conn ctx in column id flusher and parser for database field in corresponding show stmt (#25393) 2023-10-18 14:11:31 +08:00
1130317b91 [Improvement](statistics)Collect stats for hive partition column using metadata (#24853)
Stats for Hive partition columns can be calculated from Hive metastore data, so there is no need to execute SQL to get them.
This PR uses Hive partition metadata to collect partition column stats.
2023-10-17 10:31:57 +08:00
85b8497624 [fix](Tvf) return empty set when tvf queries an empty file or an error uri (#25280)
### Before:
Errors were returned when a tvf queried an empty file or an invalid uri:
1. get parsed schema failed, empty csv file
2. Can not get first file, please check uri.

### Now:
We just return an empty set when a tvf queries an empty file or an invalid uri.
```sql
mysql> select * from s3( 
"uri" = "https://error_uri/exp_1.csv", 
"s3.access_key"= "xx", 
"s3.secret_key" = "yy", 
"format" = "csv") limit 10;

Empty set (1.29 sec)
```
2023-10-17 09:52:53 +08:00
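The behavior change in 85b8497624 (return an empty result instead of raising on an empty or unreachable input) can be illustrated with a small local analogue. This is only a sketch under stated assumptions: `read_rows` is a hypothetical helper that reads a local CSV file, not Doris code, and a missing local path stands in for an invalid uri.

```python
import csv
from pathlib import Path
from typing import List


def read_rows(path: str, limit: int = 10) -> List[List[str]]:
    """Hypothetical analogue of the tvf change: no error on bad input."""
    p = Path(path)
    # A missing file stands in for an invalid uri; a zero-byte file
    # stands in for an empty csv file. Both yield an empty set.
    if not p.is_file() or p.stat().st_size == 0:
        return []
    with p.open(newline="") as f:
        # Stop after `limit` rows, mirroring the `limit 10` in the query.
        return [row for _, row in zip(range(limit), csv.reader(f))]
```

A caller no longer needs a try/except around the read; it only has to handle an empty list.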
fe1980d7f2 [docs](docs) Add release note 2.0.2 (#25375) 2023-10-16 20:38:45 +08:00
4c42f3b783 [Improvement](hive-udf)(doc) minimize hive-udf and add some docs. (#24786) 2023-10-16 16:47:21 +08:00
5eff36417a [typo](docs) Fix some ambiguous descriptions (#23912) 2023-10-16 16:44:11 +08:00
0585beee02 [typo](docs) Modify parameter description (#23782) 2023-10-16 01:29:00 -05:00
471cf2c48b [improvement](auth) support show view priv (#25370)
Currently, a user who has select_priv or load_priv can run show create table view_name.
This is not safe, so add show_view_priv to guard show create table view_name.

MySQL SHOW VIEW privilege description: https://dev.mysql.com/doc/refman/8.0/en/privileges-provided.html#priv_show-view
2023-10-14 22:37:51 +08:00
96f31ae9a7 [Docs](merge-on-write) Add more docs for partial update using native insert statement (#25356) 2023-10-13 14:48:51 +08:00
6298f90347 [ecosystem](doc) mysql synchronization example add mysql-conf port (#24666) 2023-10-13 01:36:26 -05:00
6757d2f361 Revert "[Enhancement](show-backends-disks) Add show backends disks (#24229)" (#25389)
This reverts commit 21223e65c59c23cfcb9e8ab610ea321168bcb75a.
2023-10-13 14:08:45 +08:00
aa0b74d63a [improvement](fe and broker) support specify broker to getSplits, check isSplitable, file scan for HMS Multi-catalog (#24830)
I want to use Doris Multi-catalog to accelerate HMS queries. My organization has a custom distributed file system, and we think wrapping the fs access differences into the broker (listLocatedFiles, openReader..) would be an elegant approach.

This PR introduces the HMS catalog property `broker.name`. When it is set, file split and query scan operations are sent to the broker.

Usage:
Create an HMS catalog that uses a broker:
```
CREATE CATALOG hive_catalog_broker PROPERTIES (
    'type'='hms',
    'hive.metastore.uris' = 'thrift://xxx',
    'broker.name' = 'hdfs_broker'
);
```
When querying this catalog, file split and query scan requests are sent to the broker `hdfs_broker`.

More details about this pr:
1. Introduce the HMS catalog property `broker.name` to specify the broker that does the remote path work. When `broker.name` is set, `enable.self.splitter` must be `true` to ensure that file splitting is executed in FE.
2. Introduce 2 more interfaces to broker service:
- `TBrokerIsSplittableResponse isSplittable(1: TBrokerIsSplittableRequest request)`, helps to invoke input format `isSplitable` interface.
- `TBrokerListResponse listLocatedFiles(1: TBrokerListPathRequest request)`, helps to do `listFiles` or `listLocatedStatus` for remote file system
3. Three parts of the processing are executed in the broker:
- Check whether the path with specified input format name `isSplittable`
- `listLocatedFiles` of table / partition locations.
- `OpenReader` for specified file splits.

Co-authored-by: chenlinzhong <490103404@qq.com>
2023-10-13 11:04:38 +08:00
ed67d5a2c2 [docs](developer-guide) Improve the be-vscode-gdb document (#25192)
Add miDebuggerPath to the document so that users can set the gdb path.
If miDebuggerPath is not set, vscode may choose a gdb with a low version.

ref: https://code.visualstudio.com/docs/cpp/launch-json-reference#_midebuggerpath
2023-10-13 11:03:46 +08:00
a30d30e7b5 [improvement](resource-tag) limit the default user's resource tag to 'default' (#25331)
Previously, if the user property `'resource_tags.location'` was not set, the user could use Backends with any resource tag.
This is confusing: when the DBA assigns some Backends to resource group A, existing users
should not be able to use group A until their `'resource_tags.location'` is set.

So this PR changes the behavior: if the user property `'resource_tags.location'` is not set, the user can only use
Backends with the `default` tag.
2023-10-13 10:50:00 +08:00
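The rule introduced in a30d30e7b5 can be sketched as a simple filter. This is a minimal illustration, not the Doris implementation: the `Backend` class, its `location_tag` field, and `allowed_backends` are hypothetical names, and each Backend is assumed to carry a single tag.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

DEFAULT_TAG = "default"


@dataclass(frozen=True)
class Backend:
    backend_id: int
    location_tag: str  # hypothetical field: the resource tag of this Backend


def allowed_backends(
    backends: Iterable[Backend],
    user_location_tags: Optional[Set[str]],
) -> List[Backend]:
    """Return the Backends a user may use under the new rule."""
    # If 'resource_tags.location' is unset (None or empty), the user is
    # limited to the `default` tag instead of being allowed any tag.
    tags = user_location_tags if user_location_tags else {DEFAULT_TAG}
    return [b for b in backends if b.location_tag in tags]
```

With one Backend tagged `default` and one tagged `group_a`, a user without the property only sees the first; setting the property to `group_a` switches the user to the second.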
11bbeb9a21 [Enhance](resource group)db support replication_allocation (#25195)
- db supports `replication_allocation`; when creating a table, if neither `replication_num` nor `replication_allocation` is set, the value from the db is used
- fix partition properties disappearing when the table partition is not null
2023-10-13 10:24:01 +08:00
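The fallback described in 11bbeb9a21 can be sketched as a property lookup. This is an illustration under assumptions: `effective_replication` is a hypothetical helper, properties are modeled as plain dicts, and an empty result means "nothing was configured at either level" (what the system does in that case is not stated in the commit message).

```python
from typing import Dict


def effective_replication(
    table_props: Dict[str, str],
    db_props: Dict[str, str],
) -> Dict[str, str]:
    """Pick the replication setting that applies to a new table."""
    # A table-level setting always wins when one is present.
    for key in ("replication_num", "replication_allocation"):
        if key in table_props:
            return {key: table_props[key]}
    # Otherwise fall back to the db-level replication_allocation.
    if "replication_allocation" in db_props:
        return {"replication_allocation": db_props["replication_allocation"]}
    # Nothing configured at either level.
    return {}
```

For example, a table created without either property in a db whose `replication_allocation` is `tag.location.default: 2` (an illustrative value) would resolve to that db-level setting.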
21223e65c5 [Enhancement](show-backends-disks) Add show backends disks (#24229)
* Add a statement to query disk information for the data directories of a BE node

mysql> show backends disks;
+-----------+-------------+------------------------------+---------+-----------+---------------+--------------+-------------------+---------+
| BackendId | Host        | RootPath                     | DirType | DiskState | TotalCapacity | UsedCapacity | AvailableCapacity | UsedPct |
+-----------+-------------+------------------------------+---------+-----------+---------------+--------------+-------------------+---------+
| 10002     | 10.xx.xx.90 | /home/work/output/be/storage | STORAGE | ONLINE    | 7.049 TB      | 2.478 TB     | 4.571 TB          | 35.16 % |
| 10002     | 10.xx.xx.90 | /home/work/output/be         | DEPLOY  | ONLINE    | 7.049 TB      | 2.478 TB     | 4.571 TB          | 35.16 % |
| 10002     | 10.xx.xx.90 | /home/work/output/be/log     | LOG     | ONLINE    | 7.049 TB      | 2.478 TB     | 4.571 TB          | 35.16 % |
+-----------+-------------+------------------------------+---------+-----------+---------------+--------------+-------------------+---------+
2023-10-12 20:24:45 +08:00
e41b03e530 [Fix](multi-catalog) delete hdfs hedged configs at BE side. (#25094)
Issue Number: close #25093 

We can set hdfs hedged configs when creating a catalog, like this:
```
CREATE CATALOG `test_ctl` PROPERTIES (
...
"dfs.client.hedged.read.threadpool.size" = "128",
"dfs.client.hedged.read.threshold.millis" = "500",
...
);
```
Setting these configs on the BE side is redundant, and it causes the occasional bug described in #25093.
2023-10-11 23:25:30 +08:00
d1f59a4025 [fix](catalog)fix when modifying comments in property, it will modify the comments in the catalog (#24857)
- fix when modifying comments in property, it will modify the comments in the catalog
- add `alter catalog modify comment` to modify comment for catalog
- abstract some logic of `alter catalog` to parent class
2023-10-11 23:16:19 +08:00
73c3e3ab55 [Feature](x-load) support config min replica num for loading data (#21118) 2023-10-11 21:07:35 +08:00
df7724d6cb [typo](docs)delete wrong description of from_unixtime (#23897) 2023-10-11 03:20:13 -05:00
004d3264a6 [typo](docs) add 'order by' when use 'limit m,n' (#24236) 2023-10-11 03:15:33 -05:00
62a6b132be [Fix](func numbers) Remove backend_nums argument of numbers function (#25200) 2023-10-10 20:25:58 +08:00
643f7cad0e [typo](docs) Delete wrong schema change memory parameters (#25234) 2023-10-10 04:49:40 -05:00
d702bc3c13 [typo](doc) hot and cold stratification increases FAQ (#24974) 2023-10-10 17:38:43 +08:00
3a29bb4bc5 [fix](doc) spelling error for colocate join #25053 (#25202)
Issue: 25053

Translation text not cleaned up
2023-10-10 10:10:55 +08:00
aa1704c50b [doc](data-model) update data-model doc (#24941) 2023-10-08 21:08:16 -05:00
f41b6a5fc3 [minor](doc) update the doc for docker env and custom_lib dir (#25088)
1. Update the doc for `apache/doris:build-env-for-2.0`
2. Update the doc for `custom_dir`
2023-10-09 09:50:31 +08:00
7af4be1ee3 [fix](mysqldb) Fix mysqldb upgrade (#25111)
If a user already has a database named mysql, it causes problems when doing a checkpoint.

Solution:

- Add a check for this situation: if the name is duplicated, exit and print log info to prevent metadata damage.
- Add the FE config field mysqldb_replace_name to make things correct if the user already has a mysql db.

Related pr: #23087 #22868
2023-10-09 09:40:56 +08:00
541f48a754 [feature](es-catalog) add include_hidden_index in order to get the hidden index. (#24826) 2023-10-08 14:35:08 +08:00
934e9d5617 [typo](docs) Add example for create sql block rule (#24754) 2023-10-08 01:18:11 -05:00
961ca76bd3 [doc](fix)fix doc misspell (#25072) 2023-10-08 10:24:59 +08:00
f8e4cefb8c [typo](doc)Add be's enable_java_support configuration document (#25069) 2023-10-07 23:56:14 +08:00
f3e95608cb (Fix)(RoutineLoad)Query the transaction status NPE when the task has not yet started scheduling (#25074) 2023-10-07 07:26:49 -05:00
cb03703990 [fix](doc) spelling error for colocate join #25053 (#25054)
Issue: 25053

Change spell error for Colocate Join.
2023-10-07 19:51:07 +08:00
5130a6c006 [improvement](jdbc catalog)Adjustment to JDBC External Table Configuration Based on Internal Table Settings (#25059)
This pull request adjusts the behavior of the jdbc catalog's `lower_case_table_names` parameter based on the corresponding parameter configured for internal tables.

Changes:
- For internal tables, if `lower_case_table_names` is set to 1 or 2, the jdbc catalog's parameter is forcefully set to `true`.
- For internal tables, if `lower_case_table_names` is set to 0, the jdbc catalog's parameter can be either `true` or `false` with a default value of `false`.

These adjustments ensure consistency and predictability when working with both internal and external table configurations in Doris.
2023-10-07 06:25:52 -05:00
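The two rules listed in 5130a6c006 can be written down as a small decision function. This is a sketch only: `resolve_jdbc_lower_case` and its arguments are hypothetical names, and rejecting values outside 0, 1 and 2 is an assumption rather than something the commit message states.

```python
from typing import Optional


def resolve_jdbc_lower_case(
    internal_mode: int,
    requested: Optional[bool] = None,
) -> bool:
    """Resolve the jdbc catalog's parameter from the internal setting."""
    if internal_mode not in (0, 1, 2):
        # Assumption: only 0, 1 and 2 are meaningful values here.
        raise ValueError(f"unexpected lower_case_table_names: {internal_mode}")
    if internal_mode in (1, 2):
        # Forced to true no matter what the catalog definition asked for.
        return True
    # Mode 0: honor the requested value, defaulting to false.
    return False if requested is None else requested
```

The asymmetry is the point of the change: a catalog can opt in to lower-casing when the internal setting is 0, but it cannot opt out when the internal setting is 1 or 2.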