2473 Commits

Author SHA1 Message Date
22aa54e335 [enhancement](config) enlarge max_bytes_per_broker_scanner to 5G #22099 2023-07-23 12:00:32 +08:00
eceb30f47e [doc](catalog)paimon doc (#21966)
code pr: #21910
2023-07-23 11:24:40 +08:00
3d0f952934 [FIX](complex-type)delete enable_map/struct_type switch #21957 2023-07-22 15:29:32 +08:00
50c8563f35 [fix](partial update) fix some bugs of sequence column (#21896) 2023-07-22 15:26:48 +08:00
32fce013f7 [feature](docs) add docs dbt-doris adapter (#22067) 2023-07-21 23:34:47 +08:00
e489b60ea3 [feature](load) support line delimiter for old broker load (#22030) 2023-07-21 19:31:19 +08:00
63b17bc7ba [typo](docs) fix some mistake in Doris & Spark Column Type Mapping (#19998) 2023-07-21 16:37:51 +08:00
67a3f37779 [doc](routineload)add routine load ssl example for access ali-kafka (#21877) 2023-07-21 16:03:10 +08:00
732e0d14ff [Enhancement](window-funnel)add different modes for window_funnel() function (#20563) 2023-07-21 13:57:27 +08:00
74313c7d54 [feature-wip](autoinc)(step-3) add auto increment support for unique table (#22036) 2023-07-21 13:24:41 +08:00
ab11dea98d [Enhancement](config) optimize behavior of default_storage_medium (#20739) 2023-07-20 22:00:11 +08:00
7d488688b4 [fix](multi-catalog)fix minio default region and throw minio error msg, support s3 bucket root path (#21994)
1. check minio region, set default region if user region is not provided, and throw minio error msg
2. support read root path s3://bucket1
3. fix max compute public access
2023-07-20 20:48:55 +08:00
367ad9164a [feature-wip](auto-inc)(step-2) support auto-increment column for duplicate table (#19917) 2023-07-20 18:03:39 +08:00
c31e826756 [opt](config) rename alter_inverted_index_worker_count to alter_index_worker_count, and add docs (#21985) 2023-07-20 17:50:04 +08:00
2ae9bfa3b2 [typo](docs) add oracle jdbc catalog FAQ of orai18n.jar (#22016) 2023-07-20 14:10:58 +08:00
1afe090486 [improvement](memory) modify jemalloc conf in be.conf (#21943)
modify jemalloc conf in be.conf
    disable je_purge_all_arena_dirty_pages
2023-07-20 10:34:31 +08:00
2daad2151d [enhancement](jdbc catalog) Add mysql jdbc catalog function to filter push-down identification (#21745) 2023-07-19 23:48:23 +08:00
845cf94a7a [feature](function) support time_to_sec (#21722)
mysql >select sec_to_time(time_to_sec(cast('16:32:18' as time)));
+----------------------------------------------------+
| sec_to_time(time_to_sec(CAST('16:32:18' AS TIME))) |
+----------------------------------------------------+
| 16:32:18                                           |
+----------------------------------------------------+
1 row in set (0.53 sec)

mysql [test]>select sec_to_time(59538);
+--------------------+
| sec_to_time(59538) |
+--------------------+
| 16:32:18           |
+--------------------+
1 row in set (0.03 sec)
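
The round trip above is plain arithmetic on hours, minutes, and seconds. A minimal Python sketch of that arithmetic follows. The helper names mirror the SQL functions for readability only; this is not Doris code, and it ignores negative or out-of-range times.

```python
def time_to_sec(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def sec_to_time(total: int) -> str:
    """Convert a non-negative number of seconds back to 'HH:MM:SS'."""
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(time_to_sec("16:32:18"))  # 59538
print(sec_to_time(59538))       # 16:32:18
```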
2023-07-19 01:09:48 +08:00
1c149439d7 [docs](map)Add map and struct type support parameters (#21802) 2023-07-19 01:06:23 +08:00
c6063ed92f [Revert](lazy open) revert lazy open and add case (#21821) 2023-07-18 19:41:33 +08:00
e24867e138 [typo][docs] Modify the description of CREATE-TABLE (#21858) 2023-07-18 10:29:47 +08:00
ebc1e9e9f9 [docs](releasenote)add 1.2.6 release note (#21875) 2023-07-17 17:56:08 +08:00
1c36b77024 [typo][docs] Modify a typo in the aggr_type description for CREATE-TABLE (#21861)
Modify a typo in the CREATE-TABLE's aggr_type description to change "后倒入" to "后导入".
2023-07-17 17:02:39 +08:00
4cea785f13 [typo][docs] Delete the extra characters in the tablet-local-debug Chinese document. (#21846) 2023-07-17 17:02:16 +08:00
03b575842d [Feature](table function) support explode_json_array_json (#21795) 2023-07-17 11:40:02 +08:00
ca6e33ec0c [feature](table-value-functions)add catalogs table-value-function (#21790)
mysql> select * from catalogs() order by CatalogId;
2023-07-14 10:25:16 +08:00
4158253799 [feature](hudi) support hudi time travel in external table (#21739)
Support hudi time travel in external table:
```
select * from hudi_table for time as of '20230712221248';
```
PR #15418 (https://github.com/apache/doris/pull/15418) supports using either a timestamp or a version as the snapshot ID in Iceberg, but Hudi only uses timestamps as snapshot IDs. Therefore, querying a Hudi table with `for version as of` throws an error like:
```
ERROR 1105 (HY000): errCode = 2, detailMessage = Hudi table only supports timestamp as snapshot ID
```
The supported timestamp formats in Hudi are 'yyyy-MM-dd HH:mm:ss[.SSS]', 'yyyy-MM-dd', and 'yyyyMMddHHmmss[SSS]', which is consistent with Hudi's [time-travel query](https://hudi.apache.org/docs/quick-start-guide#time-travel-query) documentation.
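
To make the three accepted layouts concrete, here is a small Python sketch that recognizes each of them. The function name `parse_hudi_instant` is hypothetical, and the code only illustrates the formats listed above; it is not how Doris or Hudi actually parses the value.

```python
import re
from datetime import datetime


def parse_hudi_instant(text: str) -> datetime:
    """Parse a timestamp written in one of the three layouts listed above."""
    # 'yyyy-MM-dd HH:mm:ss.SSS'
    if re.fullmatch(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}", text):
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    # 'yyyy-MM-dd HH:mm:ss'
    if re.fullmatch(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}", text):
        return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    # 'yyyy-MM-dd'
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", text):
        return datetime.strptime(text, "%Y-%m-%d")
    # 'yyyyMMddHHmmss'
    if re.fullmatch(r"\d{14}", text):
        return datetime.strptime(text, "%Y%m%d%H%M%S")
    # 'yyyyMMddHHmmssSSS': parse the seconds part, then add the milliseconds.
    if re.fullmatch(r"\d{17}", text):
        base = datetime.strptime(text[:14], "%Y%m%d%H%M%S")
        return base.replace(microsecond=int(text[14:]) * 1000)
    raise ValueError(f"unsupported timestamp format: {text!r}")


print(parse_hudi_instant("20230712221248"))  # 2023-07-12 22:12:48
```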

## Partitioning Strategies
Before this PR, Hudi's partitions had to be synchronized to Hive through [hive-sync-tool](https://hudi.apache.org/docs/syncing_metastore/#hive-sync-tool), or by setting very complex synchronization parameters in [spark conf](https://hudi.apache.org/docs/syncing_metastore/#sync-template). These processes are exceptionally complex and unnecessary unless you want to query Hudi data through Hive.

In addition, the set of partitions can differ between points in time, so partition synchronization cannot guarantee correct time travel results.

So this PR obtains partitions directly by reading Hudi meta information. It caches and updates table partition information based on the Hudi instant timestamp, and reuses Doris' partition pruning.
2023-07-13 22:30:07 +08:00
23272abf48 [chore](docs)Removed documentation related to dynamic tables (#21803)
since the feature was reworked
2023-07-13 22:20:20 +08:00
c5dbd53e6f [fix](multi-catalog)support oss-hdfs service (#21504)
1. support oss-hdfs if it is enabled when using a dlf or hms catalog
2. add docs for aliyun dlf and mc.
2023-07-13 18:02:15 +08:00
c78349a4c6 [Docs](statistics)Add external table statistic docs (#21567) 2023-07-13 17:54:34 +08:00
8a42ba5742 [typo](docs) modify bitmap function document (#21721) 2023-07-13 14:02:10 +08:00
06d129c364 [docs](stats) Update statistics related content #21766
1. Update grammar of `ANALYZE`
2. Add command description about how to delete an analyze job
2023-07-13 13:51:26 +08:00
e18465eac7 [feature](TVF) support path partition keys for external file TVF (#21648) 2023-07-13 10:15:55 +08:00
8ffa21a157 [fix](config) set FE header size limit to 1MB from 10k (#21719)
Enlarge jetty_server_max_http_header_size to avoid Request Header Fields
Too Large error when streamloading to FE.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-07-11 19:52:14 +08:00
7a758f7944 [enhancement](mysql) Add have_query_cache variable to be compatible with old mysql client (#21701) 2023-07-11 14:05:40 +08:00
7b403bff62 [feature](partial update)support insert new rows in non-strict mode partial update with nullable unmentioned columns (#21623)
1. expand the semantics of variable strict_mode to control the behavior for stream load: if strict_mode is true, the stream load can only update existing rows; if strict_mode is false, the stream load can insert new rows if the key is not present in the table
2. when inserting a new row in non-strict mode stream load, the unmentioned columns should have a default value or be nullable
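
The two rules above can be sketched in Python as follows. Everything here is a simplified model for illustration: the in-memory `table`, the `schema` layout, and both function names are hypothetical, and returning unapplied rows as "rejected" is an assumption, since the commit message does not say how such rows are reported.

```python
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical column metadata: column name -> (nullable, default or None).
Schema = Dict[str, Tuple[bool, Optional[Any]]]
Row = Dict[str, Any]


def build_new_row(schema: Schema, row: Row) -> Optional[Row]:
    """Fill unmentioned columns with a default or NULL; None if impossible."""
    new_row: Row = {}
    for column, (nullable, default) in schema.items():
        if column in row:
            new_row[column] = row[column]
        elif default is not None:
            new_row[column] = default
        elif nullable:
            new_row[column] = None
        else:
            return None  # neither a default value nor nullable
    return new_row


def partial_update(table: Dict[Any, Row], schema: Schema, key_col: str,
                   rows: List[Row], strict_mode: bool) -> List[Row]:
    """Apply partial-update rows to `table`; return the rows not applied."""
    rejected: List[Row] = []
    for row in rows:
        key = row[key_col]
        if key in table:
            # Existing key: overwrite only the columns the row mentions.
            table[key].update(row)
        elif strict_mode:
            # Strict mode only updates existing rows.
            rejected.append(row)
        else:
            new_row = build_new_row(schema, row)
            if new_row is None:
                rejected.append(row)
            else:
                table[key] = new_row
    return rejected
```

With a schema such as `{"id": (False, None), "name": (True, None), "score": (False, 0)}`, a row `{"id": 2, "score": 5}` for a missing key is left out in strict mode and inserted as `{"id": 2, "name": None, "score": 5}` in non-strict mode.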
2023-07-11 09:38:56 +08:00
47dd2db292 [doc](fix) storage policy fe conf doc (#21679)
* [doc](fix) storage policy fe conf doc
2023-07-11 09:16:58 +08:00
f5641b59ae [typo](docs) Fixed a typo that changed "不再分区范围内的数据" to "不在分区范围内的数据" (#21655) 2023-07-10 22:17:53 +08:00
90bebc57b9 [docs]Update upgrade.md #21658
* Update upgrade.md

* Update upgrade.md
2023-07-10 22:17:05 +08:00
202a5c636f [fix](create table) modify varchar default length 1 to 65533 (#21302)
*Modify the VARCHAR default length from 1 to varchar.max.length when creating a table.*

```mysql
create table t2 (
    k1 CHAR,
    K2 CHAR(10),
    K3 VARCHAR,
    K4 VARCHAR(1024)
)
duplicate key (k1)
distributed by hash(k1) buckets 1
properties('replication_num' = '1');

desc t2;
```

| Field | Type           | Null | Key   | Default | Extra |
| ----- | -------------- | ---- | ----- | ------- | ----- |
| k1    | CHAR(1)        | Yes  | true  | NULL    |       |
| K2    | CHAR(10)       | Yes  | false | NULL    | NONE  |
| K3    | VARCHAR(65533) | Yes  | false | NULL    | NONE  |
| K4    | VARCHAR(1024)  | Yes  | false | NULL    | NONE  |
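
The defaulting rule shown by the `desc` output can be summarized in a few lines of Python. This is only a sketch of the observable behavior (CHAR without a length becomes CHAR(1), VARCHAR without a length becomes VARCHAR(65533)); the function name and constants are hypothetical and not taken from the Doris code base.

```python
import re

DEFAULT_CHAR_LENGTH = 1
DEFAULT_VARCHAR_LENGTH = 65533  # the new default named in the commit title

_TYPE_RE = re.compile(r"\s*(CHAR|VARCHAR)\s*(?:\(\s*(\d+)\s*\))?\s*", re.IGNORECASE)


def resolve_string_type(declared: str) -> str:
    """Return the declared type with its explicit or default length."""
    match = _TYPE_RE.fullmatch(declared)
    if match is None:
        raise ValueError(f"not a CHAR/VARCHAR declaration: {declared!r}")
    name = match.group(1).upper()
    if match.group(2) is not None:
        length = int(match.group(2))  # an explicit length is kept as written
    elif name == "CHAR":
        length = DEFAULT_CHAR_LENGTH
    else:
        length = DEFAULT_VARCHAR_LENGTH
    return f"{name}({length})"


for declared in ("CHAR", "CHAR(10)", "VARCHAR", "VARCHAR(1024)"):
    print(declared, "->", resolve_string_type(declared))
```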
2023-07-10 17:57:21 +08:00
0be349e250 [feature](jdbc) Support jdbc catalog to read json types (#21341) 2023-07-10 16:21:00 +08:00
a1a8ee8320 [enchancement](stats) Inject partition statistics #21543
Cost estimation can be more accurate when partition statistics are available. However, for very large data sets (around 1T), actually importing the data just to collect statistics is not practical.

So now we want to extend this by injecting partition statistics.

Syntax:

ALTER TABLE table_name MODIFY COLUMN column_name SET STATS ('stat_name' = 'stat_value', ...)
  [ PARTITION (partition_name) ];
Explanation:

- table_name: The table whose statistics are being set. It can be in db_name.table_name form.

- column_name: The target column. It must be a column that exists in table_name. Statistics can only be modified one column at a time.

- stat_name and stat_value: The statistic name and its value. Multiple stats are comma separated. Statistics that can be modified include row_count, ndv, num_nulls, min_value, max_value, and data_size.

- partition_name: The target partition. It must be a partition that exists in table_name. Multiple partitions are separated by commas.
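
As an illustration of the syntax above, the following Python helper renders such a statement from a dictionary of statistics. The helper name `build_set_stats` is hypothetical, it only checks stat names against the list in this message, and it performs no quoting or escaping, so treat it as a sketch rather than something to run against untrusted input.

```python
from typing import Mapping, Optional, Sequence

# Statistic names listed in the commit message.
ALLOWED_STATS = {"row_count", "ndv", "num_nulls",
                 "min_value", "max_value", "data_size"}


def build_set_stats(table: str, column: str, stats: Mapping[str, object],
                    partitions: Optional[Sequence[str]] = None) -> str:
    """Render an ALTER TABLE ... SET STATS statement for one column."""
    if not stats:
        raise ValueError("at least one statistic is required")
    unknown = sorted(set(stats) - ALLOWED_STATS)
    if unknown:
        raise ValueError(f"unsupported statistics: {unknown}")
    pairs = ", ".join(f"'{name}' = '{value}'" for name, value in stats.items())
    sql = f"ALTER TABLE {table} MODIFY COLUMN {column} SET STATS ({pairs})"
    if partitions:
        # Optional clause: multiple partitions are separated by commas.
        sql += f" PARTITION ({', '.join(partitions)})"
    return sql + ";"


print(build_set_stats("db1.t1", "c1", {"row_count": 1000, "ndv": 10}, ["p1"]))
```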
2023-07-10 15:06:25 +08:00
c36cd18a08 [docs](docs)add more explanation for Fe config (#21627)
add more explanation for Fe config
2023-07-09 08:46:37 +08:00
29dd0158cf Delete alter system modify broker related documents (#21578) 2023-07-07 15:34:59 +08:00
d76293d9bf [improve](doc) add doc about explain plan (#21561) 2023-07-07 12:34:52 +08:00
fde73b6cc6 [Fix](multi-catalog) Fix hadoop short circuit reading can not enabled in some environments. (#21516)
Fix Hadoop short-circuit reading failing to be enabled in some environments.
- Revert #21430 because it causes a performance degradation issue.
- Add `$HADOOP_CONF_DIR` to `$CLASSPATH`.
- Remove the empty `hdfs-site.xml`, because in some environments it prevents Hadoop short-circuit reading from being enabled.
- Copy the Hadoop common native libs (copied from https://github.com/apache/doris-thirdparty/pull/98) and add them to `LD_LIBRARY_PATH`, because in some environments `LD_LIBRARY_PATH` doesn't contain the Hadoop common native libs, which prevents Hadoop short-circuit reading from being enabled.
2023-07-06 15:00:26 +08:00
b1be59c799 [enhancement](query) enable strong consistency by syncing max journal id from master (#21205)
Add a session variable and config, enable_strong_consistency_read, to solve the problem that a load result may be briefly invisible to followers, meeting users' requirements in strong-consistency read scenarios.

When enabled, the follower syncs the max journal id from the master and waits until it has replayed up to that id.
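
The idea can be modeled with a toy Python example: before answering a read, the follower asks the master for its max journal id and does not answer until it has replayed up to that id. The `Master` and `Follower` classes are hypothetical stand-ins, and the waiting is simulated by replaying inline; in a real system replay would run in the background and the read would block on it.

```python
from typing import List


class Master:
    """Toy master that appends entries to an ordered journal."""

    def __init__(self) -> None:
        self.journal: List[str] = []

    def write(self, entry: str) -> int:
        self.journal.append(entry)
        return len(self.journal)  # journal ids start at 1

    def max_journal_id(self) -> int:
        return len(self.journal)


class Follower:
    """Toy follower that replays the master's journal lazily."""

    def __init__(self, master: Master) -> None:
        self.master = master
        self.replayed_id = 0
        self.state: List[str] = []

    def replay_one(self) -> None:
        # Apply the next journal entry that has not been replayed yet.
        self.state.append(self.master.journal[self.replayed_id])
        self.replayed_id += 1

    def read(self, strong_consistency: bool) -> List[str]:
        if strong_consistency:
            # Sync the max journal id, then catch up before answering.
            target = self.master.max_journal_id()
            while self.replayed_id < target:
                self.replay_one()
        return list(self.state)


master = Master()
follower = Follower(master)
master.write("load-1")
print(follower.read(strong_consistency=False))  # [] (load not visible yet)
print(follower.read(strong_consistency=True))   # ['load-1']
```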
2023-07-06 10:25:38 +08:00
242a35fa80 [fix](s3) fix s3 fs benchmark tool (#21401)
1. Fix a concurrency bug in the s3 fs benchmark tool to avoid crashes when running with multiple threads.
2. Add a `prefetch_read` operation to test the prefetch reader.
3. Add the `AWS_EC2_METADATA_DISABLED` env in `start_be.sh` to avoid calling the EC2 metadata service when creating the s3 client.
4. Add the `AWS_MAX_ATTEMPTS` env in `start_be.sh` to avoid warning logs from the s3 sdk.
2023-07-05 16:20:58 +08:00
8c2963961f [docs](releasenote) 2.0 beta release note (#21457) 2023-07-04 19:02:18 +08:00
be406a1696 [typo](docs) fix presto jdbc catalog docs (#21445) 2023-07-04 18:24:58 +08:00