155 Commits

Author SHA1 Message Date
4c7525cf2c [improvement](show) Support that user can use show data skew statement instead of admin (#7914)
* [improvement](show) Support that user can use show data skew statement instead of admin
This PR mainly does two things:
1. Support that user can use show data skew statement instead of admin
2. Fix the FE unit test failures caused by PR #7876 ([improvement](rewrite) Make RewriteDateLiteralRule to be compatible with mysql) and PR #7391 ([feature-wip](iceberg) Step1: Support create Iceberg external table)

Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-01-29 10:45:03 +08:00
f9ac302807 [docs] add function substring document (#7895) 2022-01-28 22:26:11 +08:00
3b8d48f08b [feature-wip](iceberg) Step1: Support create Iceberg external table (#7391)
Close related #7389

Support create Iceberg external table in Doris. 

This is the first step to support Iceberg external table.

### Create Iceberg external table
This PR describes two ways to create Iceberg external tables. Neither way requires explicit column definitions; Doris converts them automatically from Iceberg's column definitions.

1. Create an Iceberg external table directly

```sql
    CREATE [EXTERNAL] TABLE table_name 
    ENGINE = ICEBERG
    [COMMENT "comment"]
    PROPERTIES (
    "iceberg.database" = "iceberg_db_name",
    "iceberg.table" = "iceberg_table_name",
    "iceberg.hive.metastore.uris"  =  "thrift://192.168.0.1:9083",
    "iceberg.catalog.type"  =  "HIVE_CATALOG"
    );
```

2. Create an Iceberg database and automatically create all the tables under that db.

```sql
    CREATE DATABASE db_name 
    [COMMENT "comment"]
    PROPERTIES (
    "iceberg.database" = "iceberg_db_name",
    "iceberg.hive.metastore.uris" = "thrift://192.168.0.1:9083",
    "iceberg.catalog.type" = "HIVE_CATALOG"
    );
```

### Show table creation

1. For individual tables you can view them with `help show create table`.

```sql 
mysql> show create table iceberg_db.logs_1;
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                                                                                                                                                                                                                                                                                                                 |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| logs_1 | CREATE TABLE `logs_1` (
  `level` varchar(-1) NOT NULL COMMENT "null",
  `event_time` datetime NOT NULL COMMENT "null",
  `message` varchar(-1) NOT NULL COMMENT "null"
) ENGINE=ICEBERG
COMMENT "ICEBERG"
PROPERTIES (
"iceberg.database" = "doris",
"iceberg.table" = "logs_1",
"iceberg.hive.metastore.uris"  =  "thrift://10.10.10.10:9087",
"iceberg.catalog.type"  =  "HIVE_CATALOG"
) |
+--------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```

2. For an Iceberg database, you can view it with `help show table creation`.

```sql
mysql> show table creation from iceberg_db;
+--------+---------+---------------------+---------------------------------------------------------+
| Table  | Status  | Create Time         | Error Msg                                               |
+--------+---------+---------------------+---------------------------------------------------------+
| logs   | fail    | 2021-12-14 13:50:10 | Cannot convert unknown type to Doris type: list<string> |
| logs_1 | success | 2021-12-14 13:50:10 |                                                         |
+--------+---------+---------------------+---------------------------------------------------------+
2 rows in set (0.00 sec)
```

  This is a new syntax.
  
  Show table creation records in Iceberg database:
  
  Syntax:
  ```sql
      SHOW TABLE CREATION [FROM db] [LIKE mask]
  ```
2022-01-27 10:22:47 +08:00
461b352d3e [fix](function) Change digital_masking function arg type to BIGINT (#7888)
Change digital_masking function arg type to BIGINT to fix the wrong result.
2022-01-25 22:28:05 +08:00
4e9bc5cb65 [doc] add documents for bitwise functions (#7790) 2022-01-24 21:08:41 +08:00
ed39ff1500 [feature](compaction) Support triggering compaction for a specific partition manually (#7521)
Add statement to trigger cumulative or base compaction for a specified partition.
2022-01-21 09:27:06 +08:00
e1d7233e9c [feature](vectorization) Support Vectorized Exec Engine In Doris (#7785)
# Proposed changes

Issue Number: close #6238

    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
    Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
    Co-authored-by: wangbo <506340561@qq.com>
    Co-authored-by: emmymiao87 <522274284@qq.com>
    Co-authored-by: Pxl <952130278@qq.com>
    Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
    Co-authored-by: thinker <zchw100@qq.com>
    Co-authored-by: Zeno Yang <1521564989@qq.com>
    Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
    Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
    Co-authored-by: xinghuayu007 <1450306854@qq.com>
    Co-authored-by: weizuo93 <weizuo@apache.org>
    Co-authored-by: yiguolei <guoleiyi@tencent.com>
    Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
    Co-authored-by: awakeljw <993007281@qq.com>
    Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
    Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>


## Problem Summary:

### 1. Some code from ClickHouse

**ClickHouse is an excellent implementation of a vectorized execution engine,
so we have referenced and learned a lot from its data structures and function implementations.
Our work is based on ClickHouse v19.16.2.2, and we would like to thank the ClickHouse community and developers.**

The following comment has been added to code taken from ClickHouse, e.g.:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris

### 2. Supported exec nodes and queries:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node

The exec engine can run the SSB and TPC-H query sets and 70% of the TPC-DS standard query test set.

### 3. Data Model

The vectorized exec engine supports **Dup/Agg/Unq** tables and a vectorized block reader.
Segment vectorization is a work in progress.

### 4. How to use

1. Set the session variable `set enable_vectorized_engine = true;` (required)
2. Set the session variable `set batch_size = 4096;` (recommended)

### 5. Some differences from the original exec engine

https://github.com/doris-vectorized/doris-vectorized/issues/294

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
2022-01-18 10:07:15 +08:00
e80c34b6fe [docs][typo] fix some typos in documents (#7769) 2022-01-16 10:43:42 +08:00
8b7d7e4dac [improvement] create/drop index support if [not] exist (#7748)
The create index and drop index clauses now support IF [NOT] EXISTS.
2022-01-16 10:40:44 +08:00
5b0f11b665 [feature](mysql-compatibility)(function) add WEEKDAY function (#7673)
`WEEKDAY` in MySQL: returns an index from 0 to 6 for Monday to Sunday.
`DAYOFWEEK` in MySQL: returns an index from 1 to 7 for Sunday to Saturday.

Doris only has the `DAYOFWEEK` function, so this change adds the `WEEKDAY` function.
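
The difference between the two index conventions can be sketched in Python. This is only an illustration of the MySQL semantics described above, not the Doris implementation; the helper names `weekday` and `dayofweek` are chosen here for readability.

```python
from datetime import date


def weekday(d: date) -> int:
    """MySQL-style WEEKDAY: 0 = Monday ... 6 = Sunday."""
    # Python's date.weekday() already uses Monday = 0.
    return d.weekday()


def dayofweek(d: date) -> int:
    """MySQL-style DAYOFWEEK: 1 = Sunday ... 7 = Saturday."""
    # Shift so that Sunday becomes 0, then move to a 1-based index.
    return (d.weekday() + 1) % 7 + 1


sunday = date(2022, 1, 16)  # 2022-01-16 was a Sunday
print(weekday(sunday), dayofweek(sunday))
```

For a Sunday the two functions return the opposite ends of their ranges (6 and 1), which is the usual source of confusion when porting queries.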

Thanks for the following materials:
- https://github.com/apache/incubator-doris/pull/6982/files
- https://www.bilibili.com/video/BV1V44y1Y7Ro
2022-01-16 10:39:21 +08:00
2de79832fc [docs](hive)(function) fix Hive type error and optimize alias function example (#7694)
1. fix Hive type error 
2. optimize alias function example
2022-01-11 15:07:32 +08:00
a60d86c1e1 [improvement](broker) add disable cache config for broker (#7506) 2021-12-31 16:48:55 +08:00
dc9cd34047 [docs] Add user manual for hdfs load and transaction. (#7497) 2021-12-30 10:22:48 +08:00
07e2acb2f3 [feature] Support national secret (national commercial password) algorithm SM3/SM4 (#7464)
SM3 is a cryptographic hash algorithm.
SM4 is a block cipher intended to replace DES, AES, and other international algorithms.
2021-12-28 10:39:54 +08:00
0c154733e0 [feature](function) support bitmap_union/intersect have more columns parameters (#7379)
Support multiple bitmap parameters for all bitmap aggregation functions.
2021-12-26 11:03:20 +08:00
0499b2211b [feat](lateral-view) Support execution of lateral view stmt (#7255)
1. Add table function node
2. Add 3 table functions: explode_split, explode_bitmap and explode_json_array
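
As a rough illustration of what a lateral view with `explode_split` does, the Python sketch below expands each input row into one output row per split piece. The names `lateral_view`, `tags` and `e1` are hypothetical, and edge cases such as NULL handling in Doris are not modeled.

```python
def explode_split(value, delimiter):
    """Split a string into the pieces that become separate rows."""
    return value.split(delimiter)


def lateral_view(rows, column, delimiter, alias="e1"):
    """Join every row with the pieces produced from one of its columns."""
    for row in rows:
        for piece in explode_split(row[column], delimiter):
            # Keep the original columns and add the exploded value.
            yield {**row, alias: piece}


rows = [{"k": 1, "tags": "a,b"}, {"k": 2, "tags": "c"}]
for out in lateral_view(rows, "tags", ","):
    print(out)
```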
2021-12-16 10:46:15 +08:00
62d12067aa [feature](udf) make orthogonal bitmap udaf as build in functions (#7211)
Move the orthogonal bitmap UDAFs into built-in functions.
Add three built-in bitmap functions:

- orthogonal_bitmap_intersect
- orthogonal_bitmap_intersect_count
- orthogonal_bitmap_union_count
2021-12-07 09:57:26 +08:00
fbab8afe24 [feature] Support disable query and load for backend to make Doris more robust and set default value to 1 for max_query_retry_time (#7155)
ALTER SYSTEM MODIFY BACKEND "host1:9050" SET ("disable_query" = "true");
ALTER SYSTEM MODIFY BACKEND "host1:9050" SET ("disable_load" = "true");
2021-11-30 22:08:32 +08:00
be89f0f77e [feat-opt](routine-load) Support show offset lag in show routine load stmt (#7114)
Add a new field `Lag` in result of `show routine load` stmt.

`Lag: {"0":10, "1":0}` means Kafka partition 0 is 10 messages behind and partition 1 is up to date.
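
A minimal sketch of how such a per-partition lag could be derived, assuming lag is the latest offset in Kafka minus the offset already consumed. The function name `partition_lag` and the offset values are hypothetical and are not taken from the Doris code.

```python
def partition_lag(latest_offsets, consumed_offsets):
    """Return {partition: number of messages not yet consumed}."""
    lag = {}
    for partition, latest in latest_offsets.items():
        consumed = consumed_offsets.get(partition, 0)
        # Never report a negative lag if the offsets are briefly out of sync.
        lag[partition] = max(latest - consumed, 0)
    return lag


latest = {"0": 110, "1": 42}    # assumed latest offsets per partition
consumed = {"0": 100, "1": 42}  # assumed consumed offsets per partition
print(partition_lag(latest, consumed))
```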
2021-11-18 14:31:16 +08:00
5b01f7bba2 [Feature] Support query hive table (#6569)
Users can directly query data in Hive tables from Doris and use joins to run complex queries without laboriously importing the data from Hive.

The main changes are listed below.

FE:

- Extend HiveScanNode from BrokerScanNode.
- HiveMetaStoreClientHelper communicates with Hive and HDFS.

BE:

- Treat HiveScanNode as BrokerScanNode and HiveTable as BrokerTable.
- broker_scanner.cpp: support reading columns from the HDFS path.
- orc_scanner.cpp: support reading HDFS files.

POM:

- Add hive.version=2.3.7, hive-metastore and hive-exec.
- Add hadoop.version=2.8.0, hadoop-hdfs.
- Upgrade commons-lang to fix an incompatibility with Java 9 and later.

Thrift:

- Add THiveTable.
- Add read_by_column_def in TBrokerRangeDesc.
2021-11-16 11:59:07 +08:00
d4c0156e0f [Doc] REPLACE_IF_NOT_NULL document modification (#7100)
REPLACE_IF_NOT_NULL document modification
2021-11-13 17:11:20 +08:00
3d8166504a [Alter] Support alter table engine type from MySQL to ODBC (#6993)
Support alter table engine type from MySQL to ODBC:

```
ALTER TABLE tbl MODIFY ENGINE TO odbc PROPERTIES("driver" = "odbc");
```
2021-11-12 15:12:41 +08:00
e69249c082 sub_bitmap (#6977)
Starting from the offset position, take up to the specified limit of bitmap elements and return them as a bitmap subset.
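
The offset and limit behaviour can be sketched with a Python set standing in for the bitmap. This assumes the offset is a non-negative position in ascending element order; the real function may accept other inputs that are not modeled here.

```python
def sub_bitmap(bitmap, offset, limit):
    """Return up to `limit` elements starting at position `offset`."""
    ordered = sorted(bitmap)  # bitmap elements in ascending order
    return set(ordered[offset:offset + limit])


print(sub_bitmap({1, 2, 3, 4, 5}, 2, 2))
```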

Types of chang
2021-11-06 13:31:03 +08:00
599ecb1f30 [Function] Add bitmap function bitmap_subset_limit (#6980)
Add bitmap function bitmap_subset_limit.
This function will return subset in specified index.
2021-11-04 12:14:47 +08:00
aeec9c45e6 [Function] Add bitmap-xor-count function for doris (#6982)
Add bitmap-xor-count function for doris

Related to #6875
2021-11-02 16:37:00 +08:00
1ff3d708ca [Function] add functions of bitmap_and/or_count (#6912)
issue #6875
add bitmap_and_count/ bitmap_or_count
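
A small Python sketch of what the two counts compute, with sets standing in for bitmaps. It assumes the functions return the cardinality of the intersection and of the union respectively, as the names suggest; it is not the BE implementation.

```python
def bitmap_and_count(left, right):
    """Number of elements present in both bitmaps."""
    return len(left & right)


def bitmap_or_count(left, right):
    """Number of elements present in at least one bitmap."""
    return len(left | right)


a, b = {1, 2, 3}, {3, 4, 5}
print(bitmap_and_count(a, b), bitmap_or_count(a, b))
```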
2021-11-01 14:00:07 +08:00
c7a3116f98 [Function] add bitmap function of bitmap_has_all (#6918)
The 'bitmap_has_all' function returns true if the first bitmap contains all the elements of the second bitmap.
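
The containment check described above corresponds to a subset test. The sketch below uses Python sets as a stand-in for bitmaps and is only meant to illustrate the semantics.

```python
def bitmap_has_all(first, second):
    """True if every element of `second` is also in `first`."""
    return second <= first  # subset test


print(bitmap_has_all({1, 2, 3}, {1, 2}))
print(bitmap_has_all({1, 2, 3}, {1, 4}))
```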
2021-11-01 12:50:47 +08:00
65ded82778 [Function] add BE bitmap function bitmap_subset_in_range (#6917)
Add bitmap function bitmap_subset_in_range.
This function returns the subset within the specified range (excluding range_end).
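
The half-open range can be illustrated with a Python set standing in for the bitmap. The parameter name `range_start` is an assumption; only `range_end` appears in the commit message.

```python
def bitmap_subset_in_range(bitmap, range_start, range_end):
    """Keep elements x with range_start <= x < range_end."""
    return {x for x in bitmap if range_start <= x < range_end}


print(bitmap_subset_in_range({1, 2, 3, 4, 5}, 2, 4))
```

Note that 4 is excluded in the call above because the end of the range is not part of the result.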
2021-11-01 11:05:19 +08:00
Pxl
28030294f7 [Feature] Support bitmap_and_not & bitmap_and_not_count (#6910)
Support bitmap_and_not & bitmap_and_not_count.
2021-11-01 10:11:54 +08:00
a842d41b87 [Function] add BE bitmap function bitmap_max (#6942)
Support bitmap_max.
2021-10-30 18:16:38 +08:00
3267455eca Replace replica_allocation to replication_allocation (#6870)
Fix #6869
2021-10-20 15:32:35 +08:00
bd25d1a828 [Doc] Add documents for MySQL Binlog Load (#6859)
* add zh-CN docs

* add en docs and image

* fix

* fix
2021-10-19 10:25:42 +08:00
bb2b29c64f [Doc] Add type BOOLEAN when enter 'help create table' in mysql client (#6852)
Some users do not know that Doris supports the BOOLEAN type, so they use TINYINT instead.
This change adds the BOOLEAN type to the output of 'help create table' in the mysql client.

Currently a BOOLEAN value occupies 1 byte, although a boolean column only holds values in {0,1},
which wastes some memory. I want to change its implementation to 1 bit in the future.
2021-10-17 22:54:12 +08:00
fcd15edbf9 [Export] Support export job with label (#6835)
```
EXPORT TABLE xxx
...
PROPERTIES
(
    "label" = "mylabel",
    ...
);
```

Then the user can use the label to get the job info with the SHOW EXPORT statement:
```
show export from db where label="mylabel";
```

For compatibility, if no label is specified, a random label is used. For historical jobs, the label is "export_job_id".

Unlike the LOAD statement, the label is specified in `properties` here because this does not cause grammatical conflicts,
and there is no need to modify the meta version of the metadata.
2021-10-15 10:18:11 +08:00
ad949c2f65 Optimize Hex and add related Doc (#6697)
I tested hex in a for loop of 10 million (1000w) iterations with random numbers:
the old hex took 4.92 s on average and the optimized hex took 0.46 s, roughly 10x faster.
2021-10-13 11:36:14 +08:00
675aef7d75 [AliasFunction] Add support for cast in alias function (#6754)
support #6753
2021-10-10 23:05:44 +08:00
7a20d6d4c2 [Doc] Modify document of resource tag (#6778)
Fix typo
2021-10-03 11:37:45 +08:00
e7707c8180 [FOLLOWUP] create table like clause support copy rollup (#6580)
* Remove the `ALL` keyword to make the grammar clearer.

Co-authored-by: qzsee <shizhiqiang03@meituan.com>
2021-09-30 18:26:21 +08:00
cdf9f9e980 [Dynamic Partition] reserve specific history periods by dynamic partition. (#6554)
Add RESERVED_HISTORY_STARTS and RESERVED_HISTORY_ENDS.
Fixes #6514
2021-09-28 11:39:35 +08:00
982b76c3c0 [Bug] Fix resource tag bug, add documents and some other bug fix (#6708)
1. Fix bug of UNKNOWN Operation Type 91
2. Support using resource_tag property of user to limit the usage of BE
3. Add new FE config `disable_tablet_scheduler` to disable tablet scheduler.
4. Add documents for resource tag.
5. Modify the default value of FE config `default_db_data_quota_bytes` to 1PB.
6. Add a new BE config `disable_compaction_trace_log` to disable the trace log of compaction time cost.
7. Modify the default value of BE config `remote_storage_read_buffer_mb` to 16MB
8. Fix `show backends` results error
9. Add new BE config `external_table_connect_timeout_sec` to set the timeout when connecting to odbc and mysql table.
10. Modify issue template to enable blank issue, for release note or other specific usage.
11. Fix a bug in alpha_row_set split_range() function.
2021-09-28 10:37:42 +08:00
56031cbbe1 [Doc] Change CN/EN sql-functions single quote in markdown (#6698) 2021-09-24 21:42:52 +08:00
b3f02955d3 [Doc] modify irregular documents (like/ not like/ regexp.md) (#6572) 2021-09-09 14:11:37 +08:00
7a15e583a7 [Feature]Support functions of json_array, json_object, json_quote (#6504) 2021-09-02 09:59:02 +08:00
a949dcd9f6 [Feature] Create table like clause support copy rollup (#6475)
for issue #6474

```sql
create table test.table1 like test.table with rollup r1,r2 -- copy some rollup

create table test.table1 like test.table with rollup all -- copy all rollup

create table test.table1 like test.table  -- only copy base table
```
2021-08-31 20:33:26 +08:00
0393c9b3b9 [Optimize] Support send batch parallelism for olap table sink (#6397)
* Support send batch parallelism for olap table sink

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-30 11:03:09 +08:00
3f2fdd236f Add scan thread token (#6443) 2021-08-27 10:56:17 +08:00
c71f58fef9 [Doc] Add sidebar for percentile doc (#6470) 2021-08-22 22:03:07 +08:00
66a7a4b294 [Feature] Support exact percentile aggregate function (#6410)
Support calculating the exact percentile value array of numeric column `col` at the given percentage(s).
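
One way to picture an exact percentile array is the Python sketch below. It assumes linear interpolation between the two closest ranks of the sorted values, which is a common definition but is not confirmed by the commit message; the function name `exact_percentiles` is hypothetical.

```python
def exact_percentiles(values, percentages):
    """Return one exact percentile per percentage in [0, 1]."""
    ordered = sorted(values)
    n = len(ordered)
    result = []
    for p in percentages:
        pos = p * (n - 1)          # fractional rank in the sorted data
        lower = int(pos)
        upper = min(lower + 1, n - 1)
        frac = pos - lower
        # Interpolate between the two neighbouring values (assumed method).
        result.append(ordered[lower] + (ordered[upper] - ordered[lower]) * frac)
    return result


print(exact_percentiles([4, 1, 3, 2], [0.0, 0.5, 1.0]))
```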
2021-08-18 15:56:06 +08:00
8738ce380b Add long text type STRING, with a maximum length of 2GB. Usage is similar to varchar, and there is no guarantee for the performance of storing extremely long data (#6391) 2021-08-18 09:05:40 +08:00
4be06a470f fix typo: dynamic_partitoin -> dynamic_partition (#6445) 2021-08-16 09:17:57 +08:00