Commit Graph

2439 Commits

Author SHA1 Message Date
7a758f7944 [enhancement](mysql) Add have_query_cache variable to be compatible with old mysql client (#21701) 2023-07-11 14:05:40 +08:00
7b403bff62 [feature](partial update)support insert new rows in non-strict mode partial update with nullable unmentioned columns (#21623)
1. Expand the semantics of the strict_mode variable to control stream load behavior: if strict_mode is true, the stream load can only update existing rows; if strict_mode is false, the stream load can insert new rows when the key is not present in the table.
2. When inserting a new row in a non-strict mode stream load, the unmentioned columns must have a default value or be nullable.
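
The two rules above can be illustrated with a small in-memory model. This is only a sketch, not Doris code: the function name, the `schema` layout, and the `"rejected"` return value are assumptions made for illustration, since the commit message does not say how a strict-mode load reports a missing key.

```python
# Hypothetical in-memory model of the rules above; not Doris code.
def partial_update(table, schema, key, values, strict_mode):
    """Apply one partial-update row.

    table:  dict mapping key -> dict of column values
    schema: dict mapping column -> {"nullable": bool, "default": value or None}
    values: the columns mentioned in the load
    """
    if key in table:
        table[key].update(values)  # existing rows are updated in both modes
        return "updated"
    if strict_mode:
        return "rejected"  # assumption: strict mode never inserts new rows
    row = {}
    for col, spec in schema.items():
        if col in values:
            row[col] = values[col]
        elif spec.get("default") is not None:
            row[col] = spec["default"]  # unmentioned column with a default
        elif spec.get("nullable"):
            row[col] = None  # unmentioned nullable column
        else:
            raise ValueError(f"column {col!r} has no default and is not nullable")
    table[key] = row
    return "inserted"
```

With `strict_mode=False`, a load that mentions only some columns for a new key yields a row whose remaining columns take their default or NULL.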
2023-07-11 09:38:56 +08:00
47dd2db292 [doc](fix) storage policy fe conf doc (#21679)
2023-07-11 09:16:58 +08:00
f5641b59ae [typo](docs) Fixed a typo that changed "不再分区范围内的数据" to "不在分区范围内的数据" (#21655) 2023-07-10 22:17:53 +08:00
90bebc57b9 [docs]Update upgrade.md #21658
2023-07-10 22:17:05 +08:00
202a5c636f [fix](create table) modify varchar default length 1 to 65533 (#21302)
*Change the default varchar length from 1 to the maximum varchar length (65533) when creating a table.*

```mysql
create table t2 (
    k1 CHAR,
    K2 CHAR(10),
    K3 VARCHAR,
    K4 VARCHAR(1024)
)
duplicate key (k1)
distributed by hash(k1) buckets 1
properties('replication_num' = '1');

desc t2;
```

| Field | Type           | Null | Key   | Default | Extra |
| ----- | -------------- | ---- | ----- | ------- | ----- |
| k1    | CHAR(1)        | Yes  | true  | NULL    |       |
| K2    | CHAR(10)       | Yes  | false | NULL    | NONE  |
| K3    | VARCHAR(65533) | Yes  | false | NULL    | NONE  |
| K4    | VARCHAR(1024)  | Yes  | false | NULL    | NONE  |
2023-07-10 17:57:21 +08:00
0be349e250 [feature](jdbc) Support jdbc catalog to read json types (#21341) 2023-07-10 16:21:00 +08:00
a1a8ee8320 [enchancement](stats) Inject partition statistics #21543
Cost estimation can be more accurate when partition statistics are available. However, when working with large datasets (around 1T), it is not practical to actually import the data.

So now we want to extend this by injecting partition statistics.

Syntax:

ALTER TABLE table_name MODIFY COLUMN column_name SET STATS ('stat_name' = 'stat_value', ...)
  [ PARTITION (partition_name) ];
Explanation:

- table_name: The table whose statistics are modified. It can be in db_name.table_name form.

- column_name: The target column. It must be a column that exists in table_name. Statistics can only be modified one column at a time.

- stat_name and stat_value: The stat name and the corresponding value. Multiple stats are separated by commas. Statistics that can be modified include row_count, ndv, num_nulls, min_value, max_value, and data_size.

- partition_name: The target partition. It must be a partition that exists in table_name. Multiple partitions are separated by commas.
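
As a rough illustration of the syntax above, the statement can be assembled from a dictionary of stats. The helper name `build_set_stats` and the table, column, and partition names in the usage note are hypothetical. The sketch does no identifier or value escaping.

```python
# Stat names listed in the commit message above.
ALLOWED_STATS = {"row_count", "ndv", "num_nulls", "min_value", "max_value", "data_size"}

def build_set_stats(table_name, column_name, stats, partitions=None):
    """Render the statement for one column; stats is a dict of name -> value."""
    if not stats:
        raise ValueError("at least one stat is required")
    unknown = set(stats) - ALLOWED_STATS
    if unknown:
        raise ValueError(f"unsupported stats: {sorted(unknown)}")
    pairs = ", ".join(f"'{name}' = '{value}'" for name, value in stats.items())
    sql = f"ALTER TABLE {table_name} MODIFY COLUMN {column_name} SET STATS ({pairs})"
    if partitions:
        # Multiple partitions are comma separated.
        sql += f" PARTITION ({', '.join(partitions)})"
    return sql + ";"
```

For example, `build_set_stats("db1.t1", "c1", {"row_count": 1000, "ndv": 10}, ["p1"])` renders a statement that injects two stats for column `c1` in partition `p1`.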
2023-07-10 15:06:25 +08:00
c36cd18a08 [docs](docs)add more explanation for Fe config (#21627)
2023-07-09 08:46:37 +08:00
29dd0158cf Delete alter system modify broker related documents (#21578) 2023-07-07 15:34:59 +08:00
d76293d9bf [improve](doc) add doc about explain plan (#21561) 2023-07-07 12:34:52 +08:00
fde73b6cc6 [Fix](multi-catalog) Fix hadoop short circuit reading can not enabled in some environments. (#21516)
Fix Hadoop short-circuit reading that cannot be enabled in some environments.
- Revert #21430 because it causes performance degradation.
- Add `$HADOOP_CONF_DIR` to `$CLASSPATH`.
- Remove the empty `hdfs-site.xml`, because in some environments it prevents Hadoop short-circuit reading from being enabled.
- Copy the Hadoop common native libs (copied from https://github.com/apache/doris-thirdparty/pull/98) and add them to `LD_LIBRARY_PATH`, because in some environments `LD_LIBRARY_PATH` does not contain the Hadoop common native libs, which prevents Hadoop short-circuit reading from being enabled.
2023-07-06 15:00:26 +08:00
b1be59c799 [enhancement](query) enable strong consistency by syncing max journal id from master (#21205)
Add a session variable and config enable_strong_consistency_read to solve the problem that load results may be briefly invisible to followers, to meet users' requirements in strong consistency read scenarios.

When enabled, the max journal id is synced from the master, and the read waits until that journal id has been replayed.
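
The sync-then-wait step can be sketched as a polling loop. This is only an illustration under stated assumptions: both callables, the timeout, and the polling interval are hypothetical stand-ins, and the actual implementation may wait differently.

```python
import time

def wait_for_replay(get_master_max_journal_id, get_replayed_journal_id,
                    timeout_s=5.0, poll_s=0.01):
    """Block until the local replay reaches the master's max journal id.

    Both callables are hypothetical stand-ins for the real RPC and local state.
    Returns True if caught up before the timeout, False otherwise.
    """
    target = get_master_max_journal_id()  # sync once from the master
    deadline = time.monotonic() + timeout_s
    while get_replayed_journal_id() < target:
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True
```

A read would run only after this returns True, so it sees every load the master had journaled when the read started.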
2023-07-06 10:25:38 +08:00
242a35fa80 [fix](s3) fix s3 fs benchmark tool (#21401)
1. Fix a concurrency bug in the s3 fs benchmark tool to avoid crashes with multiple threads.
2. Add the `prefetch_read` operation to test the prefetch reader.
3. Add the `AWS_EC2_METADATA_DISABLED` env in `start_be.sh` to avoid calling EC2 metadata when creating the s3 client.
4. Add the `AWS_MAX_ATTEMPTS` env in `start_be.sh` to avoid warning logs from the s3 sdk.
2023-07-05 16:20:58 +08:00
8c2963961f [docs](releasenote) 2.0 beta release note (#21457) 2023-07-04 19:02:18 +08:00
be406a1696 [typo](docs) fix presto jdbc catalog docs (#21445) 2023-07-04 18:24:58 +08:00
890e55b604 [typo](docs)Delete unsupported sql statements in GROUP_CONCAT() (#21455)
2023-07-04 14:46:49 +08:00
e4c0a0ac24 [improve](dependency)Upgrade dependency version (#21431)
exclude old netty version
upgrade spring-boot version to 2.7.13
replace ojdbc6 with ojdbc8
upgrade jackson version to 2.15.2
upgrade fabric8 version to 6.7.2
2023-07-04 11:29:21 +08:00
5e6242e235 [typo](docs) Refactor upgrade documentation (#21449)
Co-authored-by: Yijia Su <suyijia@selectdb.com>
2023-07-03 20:14:19 +08:00
bb33ad0bde [opt](docs) update nereids doc to reflect the latest changes (#21444) 2023-07-03 18:50:01 +08:00
124516c1ea [Fix](orc-reader) Fix Wrong data type for column error when column order in hive table is not same in orc file schema. (#21306)
A `Wrong data type for column` error occurs when the column order in the Hive table differs from the column order in the ORC file schema.

The root cause lies in the logic that handles the following case:

Tables in ORC format written by Hive 1.x may have system column names such as `_col0`, `_col1`, `_col2`... in the underlying ORC file schema, which must be mapped using the column names in the Hive table.

### Solution
Currently, this issue is fixed by handling the above case only when the Hive version is set to 1.x.x in the Hive catalog configuration.

```sql
CREATE CATALOG hive PROPERTIES (
    'hive.version' = '1.x.x'
);
```
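
The mapping decision can be sketched as follows. The sketch makes several assumptions not stated in the commit message: `_colN` maps to the N-th Hive table column by position, the version check is a simple `1.` prefix test, and other files are matched by case-insensitive name. The function name is hypothetical.

```python
import re

SYSTEM_COL = re.compile(r"^_col(\d+)$")

def map_orc_columns(orc_columns, hive_columns, hive_version):
    """Return a dict of ORC file column name -> Hive table column name."""
    use_position = (
        hive_version.startswith("1.")
        and all(SYSTEM_COL.match(c) for c in orc_columns)
    )
    if use_position:
        # Assumption: _colN in the file is the N-th column of the Hive table.
        return {c: hive_columns[int(SYSTEM_COL.match(c).group(1))]
                for c in orc_columns}
    # Otherwise match by name, so a different column order is harmless.
    hive_by_lower = {h.lower(): h for h in hive_columns}
    return {c: hive_by_lower[c.lower()]
            for c in orc_columns if c.lower() in hive_by_lower}
```

Matching by name in the default path is what keeps a reordered ORC schema from being read with the wrong column types.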
2023-07-03 09:32:55 +08:00
1fe04b7242 [Chore](metrics) remove trace metrics code using runtime profile instead (#21394)
* commit

* fix

* format
2023-07-01 12:18:23 +08:00
e8ffca6487 [doc](stream load json) modify enable_simdjson_reader since it's default open (#21379) 2023-07-01 08:42:50 +08:00
df23ab3f29 [Enhancement](tvf) Add authentication for workload group tvf (#21323) 2023-06-30 12:56:23 +08:00
6d63261b71 [docs]<docs>Add file system benchmark tools docs (#21262) 2023-06-30 09:27:18 +08:00
3fb75c1844 [docs](workload-group) Modify workload group docs (#21349) 2023-06-29 23:25:06 +08:00
f07e0d7686 [typo](docs) Some typo in nereids.md has been fixed (#20475) 2023-06-29 22:04:13 +08:00
f5668ac1a0 [fix](doc) Fix table typo in star schema benchmark documentation and join optimization (#19181) 2023-06-29 11:50:04 +08:00
3a12b67517 [Improvement](statistics, multi catalog)Implement hive table statistic connector (#21053)
This PR adds the ability to collect Hive table statistics. When the CBO fetches Hive table statistics, the statistics cache first loads them from the internal stats OLAP table. If they are not found there, it uses this PR's function to fetch them from the remote Hive metastore.
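
The lookup order described above can be sketched as a simple fallback chain. All names here are hypothetical: a dict stands in for the internal stats OLAP table and a callable stands in for the remote Hive metastore request.

```python
def load_column_stats(key, cache, internal_stats_table, fetch_from_metastore):
    """Look up stats in the order described above, caching the result.

    internal_stats_table: dict standing in for the internal stats OLAP table
    fetch_from_metastore: callable standing in for the remote Hive metastore call
    """
    if key in cache:
        return cache[key]
    stats = internal_stats_table.get(key)
    if stats is None:
        stats = fetch_from_metastore(key)  # fallback to the remote metastore
    cache[key] = stats
    return stats
```

Caching the metastore result means the remote call is made at most once per key in this model.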
2023-06-29 10:50:54 +08:00
54e2e2f7ee [typo](doc)FlinkCDC access to multi-table or whole database example document mod… (#21295) 2023-06-29 09:42:13 +08:00
73bce9e750 [typo](doc) add params description and example for accessing hdfs in ha mode by tvf #21277 2023-06-29 09:05:35 +08:00
449c8d4568 [fix](jdbc) Handling Zero DateTime Values in Non-nullable Columns for JDBC Catalog Reading MySQL (#21296) 2023-06-28 22:51:17 +08:00
274203a59c [typo](storage)Fixed wrong description about Storage_root_path parameter (#20641) 2023-06-28 21:28:50 +08:00
a6b51ec19a [Feature](avro) Support Apache Avro file format (#19990)
Support reading Avro files with hdfs() or s3().
```sql
select * from s3(
    "uri" = "http://127.0.0.1:9312/test2/person.avro",
    "ACCESS_KEY" = "ak",
    "SECRET_KEY" = "sk",
    "FORMAT" = "avro");
+--------+--------------+-------------+-----------------+
| name   | boolean_type | double_type | long_type       |
+--------+--------------+-------------+-----------------+
| Alyssa |            1 |     10.0012 | 100000000221133 |
| Ben    |            0 |    5555.999 |      4009990000 |
| lisi   |            0 | 5992225.999 |      9099933330 |
+--------+--------------+-------------+-----------------+

select * from hdfs(
    "uri" = "hdfs://127.0.0.1:9000/input/person2.avro",
    "fs.defaultFS" = "hdfs://127.0.0.1:9000",
    "hadoop.username" = "doris",
    "format" = "avro");
+--------+--------------+-------------+-----------+
| name   | boolean_type | double_type | long_type |
+--------+--------------+-------------+-----------+
| Alyssa |            1 |  8888.99999 |  89898989 |
+--------+--------------+-------------+-----------+
```

The current Avro reader only supports common data types; complex data types will be supported later.
2023-06-28 21:15:35 +08:00
4e082a803f [typo](docs) improvement lakehouse doc sidebar (#21270) 2023-06-28 20:19:17 +08:00
283fd2903f [typo](doc)json document optimization (#20753) 2023-06-28 18:01:41 +08:00
824c1fe165 [typo](docs)delete the native udf doc (#21146) 2023-06-28 11:29:49 +08:00
1d406d486c [typo](docs) modify invalid URLs in release-1.2.0 (#21175) 2023-06-28 11:29:33 +08:00
e9bbac71dc [typo](docs) poor phrasing (#21224) 2023-06-28 11:05:09 +08:00
a6ff87f32c [docker](trino) add Trino docker compose and hive catalog (#21086) 2023-06-28 11:04:41 +08:00
33ace22471 [typo](docs) improvement SQL manual doc sidebar (#21267) 2023-06-28 11:03:53 +08:00
18878df1c0 [typo](doc)outfile export document optimization (#21211) 2023-06-28 10:30:30 +08:00
ac62ca0320 [typo](doc) add model limitation description for inverted index (#21245) 2023-06-28 10:13:42 +08:00
bed2a5efa7 [typo](doc) Fix errors in the example (#21151) 2023-06-27 18:23:48 +08:00
3ab06bf381 [typo](docs) fix monitor alert doc start grafana err (#21244) 2023-06-27 18:20:32 +08:00
5f4167d816 [fix](doc)description of stream-load (#20979) 2023-06-27 15:22:34 +08:00
f7fd891cd3 [typo](docs) delete no-used ENABLE-FEATURE doc (#21227) 2023-06-27 14:24:29 +08:00
Pxl
70ddf64126 [Chore](agg-state) add documentation about agg_state, add group_concat agg_state test case (#21147)
2023-06-27 11:28:19 +08:00
e0b20f0437 [feature](function) add ip function ipv4numtostring (alias inet_ntoa) (#20936) 2023-06-27 10:17:40 +08:00
efcc65a0d3 [feature-wip](workload-group) Support for workload group Authentication (#20242) 2023-06-27 09:57:18 +08:00