806 Commits

Author SHA1 Message Date
416fb73621 docs format fix for explode-json-array table function (#10613)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-06 17:57:19 +08:00
302e078e6a [dev env]: add idea provided doc. (#10597) 2022-07-05 11:06:53 +08:00
3c140ae05b [fix] [docs] Fixed Use examples in sequence-column-manual.md file. (#10588)
* [fix] [docs] Fixed Use examples in sequence-column-manual.md file.

Co-authored-by: 杨帅统 <yangshuaitong@gaolvgo.com>
Co-authored-by: spaces-x <weixiao5220@gmail.com>
2022-07-05 10:27:13 +08:00
cc2de23455 [docs] add quick compaction configs (#10559) 2022-07-05 10:03:37 +08:00
1a173a854e [fix](routine-load) Fix that routine load cannot work with old kafka version (#10554)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-07-04 10:47:50 +08:00
c5f85c9818 [community] modify release doc to remove incubator (#10574) 2022-07-04 10:18:23 +08:00
614b782d4d [feature](doris-on-es) Support es external table not assign schema (#9583) 2022-07-03 23:19:05 +08:00
3b3debf5a4 [build] Fix nested resource path error when as maven project from eclipse (#10427)
1. Fix the nested resource path error when importing FE as a Maven project from Eclipse
2. Add instructions for "Eclipse import FE as maven project" to the developer guide
2022-07-01 18:03:54 +08:00
e3d9f9430e [docs] Change the problem field in SSB test (#10540) 2022-07-01 15:30:35 +08:00
2708289816 [Doc]Change the download url for the binary package #10527 (#10528) 2022-07-01 10:47:14 +08:00
d52da675aa [docs](array-type)Fix keywords in array functions' help documents (#10500)
* save code

* save code
2022-07-01 10:43:04 +08:00
ec6620ae3e [feature-wip](array-type) add function arrays_overlap (#10233) 2022-06-30 08:12:29 +08:00
73999feca7 [doc] mod alter-table-replace (#10324)
Modify alter-table-replace to alter-table-replace-column, move alter-table-replace to data-definition-statements.
2022-06-30 08:11:59 +08:00
deeb3028ad [Enhancement] [Memory] [Vectorized] Stress test and optimize memory allocation (#9581)
* vec stress test, Allocator introduce chunkallocator

* fix comment
2022-06-29 02:57:51 +08:00
17eb8c00d3 [feature] add table valued function framework and numbers table valued function (#10214) 2022-06-28 14:01:57 +08:00
79ad05eec6 [fix](doe) fix doe on es v8 (#10391)
Doris on ES 8 does not work because of the change to mapping types. Using types has been discouraged since ES 7,
and support for them was removed in ES 8.

1. /_mapping no longer supports include_type_name
2. /_search no longer supports specifying a type
2022-06-26 09:51:29 +08:00
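
The version-dependent request paths described in this commit can be sketched as follows. This is a minimal illustration, not the Doris implementation: the function names `mapping_path` and `search_path` are hypothetical, and it assumes a 7.x client asks for the legacy typed mapping via `include_type_name=true`.

```python
from typing import Optional


def mapping_path(index: str, es_major: int) -> str:
    """Build the path used to fetch an index mapping (hypothetical helper)."""
    if es_major == 7:
        # Assumption: a 7.x caller opts in to the legacy typed response.
        return f"{index}/_mapping?include_type_name=true"
    # Other versions use the plain path; ES 8 no longer accepts include_type_name.
    return f"{index}/_mapping"


def search_path(index: str, es_major: int, doc_type: Optional[str] = None) -> str:
    """Build the search path, dropping the type segment on ES 8 and later."""
    if doc_type and es_major < 8:
        return f"{index}/{doc_type}/_search"
    return f"{index}/_search"
```

Keeping the version check in one place means a later major version only needs a change in these two helpers.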
a0e330a156 [website] add website external resource (#10416) 2022-06-26 01:22:14 +08:00
8a49c7ef04 [chore] Rename Doris binary output format 2022-06-24 15:30:05 +08:00
f15d84335c [website][doc]Modify image path (#10361) 2022-06-24 09:12:20 +08:00
ad8da109c3 [community] update PMC & Committer list (#10360) 2022-06-24 09:11:49 +08:00
573ad57467 [doc]Added Iceberg 0.13.2 version support (#10312)
Added 0.13.2 version support
2022-06-23 09:33:26 +08:00
98b3306e05 [docs]add key words for helps (#10263)
* add key words for helps

* add key words for helps

* add key words for helps
2022-06-22 14:41:15 +08:00
b913d59560 [docs] aes docs fix (#10251)
* fix aes docs

* update keywords inside aes.md

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-22 14:40:40 +08:00
e49fa6075f [doc] fix wrong number (#10305)
Co-authored-by: stephen <hello-stephen@qq.com>
2022-06-22 08:58:08 +08:00
d18808d0eb [docs] Add some user case to user list (#10256)
* [docs] Add user case to user list
2022-06-21 09:25:57 +08:00
75719bca92 [doc](website)Remove incubator prefix and add graduate note (#10257) 2022-06-20 17:48:29 +08:00
087fc596b1 [feature] add remote storage policy config for create table properties (#10159)
Add a remote storage policy config to the create table properties. It sets the storage policy for a table and its partitions in `CREATE TABLE` and `ALTER TABLE`.
This policy is used when a partition is migrated from local to remote storage.
Syntax:
1.
`CREATE TABLE TblPxy1
(...)
ENGINE=olap
DISTRIBUTED BY HASH (aa) BUCKETS 1
PROPERTIES(
    "remote_storage_policy" = "testPolicy3"
);`
2.
`ALTER TABLE TblPxy01 SET ("remote_storage_policy" = "testPolicy3");`
3.
`ALTER TABLE TblPxy01 MODIFY PARTITION p2 SET ("remote_storage_policy" = "testPolicy3");`
2022-06-20 12:42:23 +08:00
185de4dd43 [docs]update develop document (#10242) 2022-06-20 09:24:49 +08:00
9a1f1c3864 [improvement](variables) change session variable when set global variable (#10238)
Currently, setting a variable with the `global` keyword does not affect the
current session variable's value, which often confuses users.

This CL mainly changes:

1. Change session variable when set global variable
2022-06-20 09:05:50 +08:00
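
The behaviour change in this commit can be illustrated with a toy model. This is only a sketch under assumptions: `VariableStore` and `Session` are hypothetical names, not Doris classes, and the model assumes that other sessions that are already open keep their own values.

```python
from typing import Dict


class VariableStore:
    """Toy holder of global variable values (hypothetical, not Doris code)."""

    def __init__(self, defaults: Dict[str, int]) -> None:
        self.global_vars = dict(defaults)

    def new_session(self) -> "Session":
        return Session(self)


class Session:
    def __init__(self, store: VariableStore) -> None:
        self.store = store
        # A new session starts from a copy of the current global values.
        self.vars = dict(store.global_vars)

    def set(self, name: str, value: int, is_global: bool = False) -> None:
        if is_global:
            self.store.global_vars[name] = value
        # After this change, a global set also updates the issuing session.
        self.vars[name] = value


store = VariableStore({"query_timeout": 300})
current, other = store.new_session(), store.new_session()
current.set("query_timeout", 600, is_global=True)
print(current.vars["query_timeout"], other.vars["query_timeout"])
```

In this model the issuing session sees the new value immediately, while sessions created later pick it up from the global copy.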
e09066c7ee [Improvement] delete deprecated config in document and regression test (#10231) 2022-06-19 18:16:59 +08:00
8439adad05 [doc] update array functions docs' location (#10226)
Move the array function docs to the correct directory,
because the docs directory has already been refactored.

```
docs/en/sql-manual/sql-functions/array-functions/    ===>
docs/en/docs/sql-manual/sql-functions/array-functions
```
```
docs/zh-CN/sql-manual/sql-functions/array-functions/    ===>
docs/zh-CN/docs/sql-manual/sql-functions/array-functions/
```
2022-06-19 10:40:40 +08:00
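
The directory move shown in this commit follows one pattern: an extra `docs` segment is inserted after the language directory. A minimal sketch of that mapping, with the hypothetical helper name `new_docs_path`, could look like this:

```python
def new_docs_path(old_path: str) -> str:
    """Map docs/<lang>/... to docs/<lang>/docs/... (hypothetical helper)."""
    parts = old_path.split("/")
    # Leave paths alone if they are not under docs/<lang>/ or are already moved.
    if len(parts) < 3 or parts[0] != "docs" or parts[2] == "docs":
        return old_path
    return "/".join(parts[:2] + ["docs"] + parts[2:])


print(new_docs_path("docs/zh-CN/sql-manual/sql-functions/array-functions/"))
```

The check on the third segment makes the function safe to apply twice to the same path.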
1d3496c6ab [feature] support backup/restore connect to HDFS (#10081) 2022-06-19 10:26:20 +08:00
b7b78ae707 [style](fe)the last step of fe CheckStyle (#10134)
1. fix all checkstyle warnings
2. change all checkstyle rules to error
3. remove some Javadoc rules
    a. RequireEmptyLineBeforeBlockTagGroup
    b. JavadocStyle
    c. JavadocParagraph
4. suppress some rules for old code
    a. all Javadoc rules apply only to Nereids
    b. DeclarationOrder applies only to Nereids
    c. OverloadMethodsDeclarationOrder applies only to Nereids
    d. VariableDeclarationUsageDistance applies only to Nereids
    e. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/ColumnParser.java
    f. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/SparkRDDAggregator.java
    g. suppress LineLength on org/apache/doris/catalog/FunctionSet.java
    h. suppress LineLength on org/apache/doris/common/ErrorCode.java
2022-06-17 21:02:45 +08:00
f35b235c3b [opt](compaction) optimize compaction in concurrent load (#10153)
Add some logic to optimize compaction:
1. Separate base and cumulative compaction, so that a long-running base compaction
does not affect cumulative compaction.
2. Fix the level size in cumulative compaction so that files below 64M get the right
level size. When choosing rowsets to compact, the policy ignores big rowsets;
this reduces CPU usage by about 25% under high-frequency concurrent load.
3. Remove the skip window restriction so that a rowset can be compacted right after it
is generated, because rowsets are not deleted after compaction. This greatly
reduces the compaction score under concurrent load.
4. Remove the version consistency check in can_do_compaction. Consecutive rowsets are
chosen for compaction, so this check is unnecessary.

With the changes above, compaction score and CPU cost improve substantially
under concurrent load.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-06-17 17:49:45 +08:00
5e47b03595 [feature-wip](array-type) Add array aggregation functions (#10108) 2022-06-17 11:07:49 +08:00
44e979e43b [Vectorized][Function] add orthogonal bitmap agg functions (#10126)
* [Vectorized][Function] add orthogonal bitmap agg functions
save some file about orthogonal bitmap function
add some file to rebase
update functions file

* refactor union_count function
refactor orthogonal union count functions

* remove bool is_variadic
2022-06-17 08:48:41 +08:00
f1c9105af1 [feature] Support hive on s3 (#10128)
Support querying Hive tables on S3. Pass the AK/SK, region, and S3 endpoint to the Hive table when creating the external table.

Example CREATE TABLE statement:
```
CREATE TABLE `region_s3` (
`r_regionkey` integer NOT NULL,
`r_name` char(25) NOT NULL,
`r_comment` varchar(152) )
engine=hive
properties
("database"="default",
"table"="region_s3",
"hive.metastore.uris"="thrift://127.0.0.1:9083",
"AWS_ACCESS_KEY"="YOUR_ACCESS_KEY",
"AWS_SECRET_KEY"="YOUR_SECRET_KEY",
"AWS_ENDPOINT"="s3.us-east-1.amazonaws.com",
"AWS_REGION"="us-east-1");
```
2022-06-16 19:15:46 +08:00
41b693e1df [test] Add window cast bitmap digital_masking function regression test. (#9924) 2022-06-16 19:14:51 +08:00
9217223cc5 [doc] update sequence en and zh-CN doc. (#10164)
* update sequence en and zh-CN doc.
2022-06-16 09:32:52 +08:00
ca88f258d9 [improvement] remove unused codes and docs for SHOW USER (#10107)
* remove unused codes and docs for `SHOW USER`
2022-06-15 21:49:08 +08:00
c4871fb306 [doc](website)remove translate warning from Chinese docs (#10157)
* modify home page text
2022-06-15 18:17:37 +08:00
4005b34a52 [doc] add tpc-h benchmark (#10150)
[doc] add tpc-h benchmark
2022-06-15 16:43:10 +08:00
96b54dd1d5 [doc](website)modify home page text and navbar (#10148)
* fix docs bugs with sidebar can not display and some style problems
2022-06-15 12:21:40 +08:00
c4d0fba713 Add storage policy for remote storage migration (#9997) 2022-06-15 11:00:06 +08:00
4c24586865 [Vectorized][UDF] support java-udaf (#9930) 2022-06-15 10:53:44 +08:00
7ab64f9155 [doc][website]update home page content and add slack button (#10091)
* fix docs bugs with sidebar can not display and some style problems
2022-06-15 09:31:40 +08:00
34ea6ce850 [doc]Added be enable_stream_load_record configuration description (#10130) 2022-06-15 08:14:47 +08:00
be3aa2aa37 [enhancement](community): polish doc to reformat (#10137) 2022-06-15 08:14:13 +08:00
f7b5f36da4 [feature] Support read hive external table and outfile into HDFS that authenticated by kerberos (#9579)
At present, Doris can access a Hadoop cluster with Kerberos authentication enabled only through the broker; Doris BE itself
does not support access to Kerberos-authenticated HDFS files.

This PR aims to solve the problem.

When creating a Hive external table, users only need to specify the following properties to access HDFS data with Kerberos authentication enabled:

```sql
CREATE EXTERNAL TABLE t_hive (
k1 int NOT NULL COMMENT "",
k2 char(10) NOT NULL COMMENT "",
k3 datetime NOT NULL COMMENT "",
k5 varchar(20) NOT NULL COMMENT "",
k6 double NOT NULL COMMENT ""
) ENGINE=HIVE
COMMENT "HIVE"
PROPERTIES (
'hive.metastore.uris' = 'thrift://192.168.0.1:9083',
'database' = 'hive_db',
'table' = 'hive_table',
'dfs.nameservices'='hacluster',
'dfs.ha.namenodes.hacluster'='n1,n2',
'dfs.namenode.rpc-address.hacluster.n1'='192.168.0.1:8020',
'dfs.namenode.rpc-address.hacluster.n2'='192.168.0.2:8020',
'dfs.client.failover.proxy.provider.hacluster'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
'dfs.namenode.kerberos.principal'='hadoop/_HOST@REALM.COM',
'hadoop.security.authentication'='kerberos',
'hadoop.kerberos.principal'='doris_test@REALM.COM',
'hadoop.kerberos.keytab'='/path/to/doris_test.keytab'
);
```

To `select into outfile` to an HDFS cluster with Kerberos authentication enabled, refer to the following SQL statement:

```sql
select * from test into outfile "hdfs://tmp/outfile1" 
format as csv
properties
(
'fs.defaultFS'='hdfs://hacluster/',
'dfs.nameservices'='hacluster',
'dfs.ha.namenodes.hacluster'='n1,n2',
'dfs.namenode.rpc-address.hacluster.n1'='192.168.0.1:8020',
'dfs.namenode.rpc-address.hacluster.n2'='192.168.0.2:8020',
'dfs.client.failover.proxy.provider.hacluster'='org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
'dfs.namenode.kerberos.principal'='hadoop/_HOST@REALM.COM',
'hadoop.security.authentication'='kerberos',
'hadoop.kerberos.principal'='doris_test@REALM.COM',
'hadoop.kerberos.keytab'='/path/to/doris_test.keytab'
);
```
2022-06-14 20:07:03 +08:00
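
Long property lists like the ones in this commit are easy to get wrong when written by hand, for example by dropping a separator between two entries. One way to avoid that is to render the clause from a mapping. The sketch below is illustrative only: `render_properties` is a hypothetical helper, and it rejects values containing a single quote rather than assuming any escaping rule.

```python
from typing import Dict


def render_properties(props: Dict[str, str]) -> str:
    """Render a PROPERTIES clause with one quoted key/value pair per line."""
    lines = []
    for key, value in props.items():
        for text in (key, value):
            # No escaping rule is assumed here, so quotes are refused outright.
            if "'" in text:
                raise ValueError(f"single quote not allowed: {text!r}")
        lines.append(f"'{key}'='{value}'")
    # Joining guarantees exactly one comma between consecutive entries.
    return "PROPERTIES (\n" + ",\n".join(lines) + "\n)"


print(render_properties({
    "hadoop.security.authentication": "kerberos",
    "hadoop.kerberos.principal": "doris_test@REALM.COM",
    "hadoop.kerberos.keytab": "/path/to/doris_test.keytab",
}))
```

The rendered clause can then be appended to a `CREATE EXTERNAL TABLE` or `select ... into outfile` statement like those shown in this commit.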
eb4d0f508a [doc] Add docs for SHOW TABLETS (#10105)
* add docs for SHOW TABLETS

* update

* add more examples for SHOW TABLETS

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-14 15:29:46 +08:00