Commit Graph

4943 Commits

Author SHA1 Message Date
5fdd995b4c [fix] Fix heap-use-after-free when using type array<string> (#10127) 2022-06-19 10:27:36 +08:00
1d3496c6ab [feature] support backup/restore connect to HDFS (#10081) 2022-06-19 10:26:20 +08:00
0e404edf54 [improvement] Change array offset type from UInt32 to UInt64 (#10070)
Currently, column `Array<T>` consists of the columns `offsets` and `data`, and the type of `offsets` is UInt32.
If we call array_union repeatedly to merge arrays, the size of the array may overflow.
So we need to widen the offset type before the `Array Data Type` is released.
2022-06-19 10:24:08 +08:00
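The overflow that motivates 0e404edf54 can be illustrated with a minimal sketch. It assumes that `offsets` stores the cumulative end position of each array inside `data`; the helper name `append_offsets` and the array sizes are hypothetical and do not come from the Doris code base.

```python
import ctypes


def append_offsets(sizes, offset_type):
    """Return cumulative end offsets for arrays of the given sizes.

    Each offset is stored in a fixed-width unsigned integer type
    (hypothetical helper, for illustration only).
    """
    offsets = []
    total = 0
    for size in sizes:
        total += size
        # ctypes integer types do no overflow checking, so the stored
        # value wraps modulo 2**bits, as a fixed-width column would.
        offsets.append(offset_type(total).value)
    return offsets


# Two large arrays whose combined element count exceeds 2**32 - 1.
sizes = [3_000_000_000, 2_000_000_000]

narrow = append_offsets(sizes, ctypes.c_uint32)  # second offset wraps around
wide = append_offsets(sizes, ctypes.c_uint64)    # second offset is preserved
```

With 32-bit offsets the second entry is smaller than the first, so the second array's boundaries can no longer be recovered. With 64-bit offsets the sequence stays monotonic.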
534844ead6 [chore] update fe checkstyle workflow to required (#10237) 2022-06-18 21:32:28 +08:00
a52f40eb77 [fix](regression-test) fix run-regression-test Xmx2048m param (#10234)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-17 23:30:44 +08:00
7a85e8d525 [bug](be) fix be block_reader.cc::_update_agg_value() mem leak.(#10216) (#10218) 2022-06-17 21:25:52 +08:00
b7b78ae707 [style](fe)the last step of fe CheckStyle (#10134)
1. fix all checkstyle warnings
2. change all checkstyle rules to error
3. remove some Javadoc rules
    a. RequireEmptyLineBeforeBlockTagGroup
    b. JavadocStyle
    c. JavadocParagraph
4. suppress some rules for old code
    a. all Javadoc rules only apply to Nereids
    b. DeclarationOrder only applies to Nereids
    c. OverloadMethodsDeclarationOrder only applies to Nereids
    d. VariableDeclarationUsageDistance only applies to Nereids
    e. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/ColumnParser.java
    f. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/SparkRDDAggregator.java
    g. suppress LineLength on org/apache/doris/catalog/FunctionSet.java
    h. suppress LineLength on org/apache/doris/common/ErrorCode.java
2022-06-17 21:02:45 +08:00
fea815f290 [doc](website)Replace CDN files with local files (#10212)
Replace CDN files with local files
2022-06-17 20:58:56 +08:00
f7789f4bc4 [fix]InListPredicate wrong result (#10211)
* fix

* reg test

Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-06-17 18:34:25 +08:00
6baa694bc1 [feature-wip](multi-catalog) Catalog operation syntax (#10033)
Impl catalog operation syntax
2022-06-17 17:50:31 +08:00
f35b235c3b [opt](compaction) optimize compaction in concurrent load (#10153)
Add some logic to optimize compaction:
1. Separate base and cumulative compaction, in case base compaction runs too long and
affects cumulative compaction.
2. Fix the level size in cumulative compaction so that files smaller than 64M have the right level
size. When choosing rowsets to compact, the policy will ignore big rowsets;
this reduces CPU usage by about 25% under high-frequency concurrent load.
3. Remove the skip window restriction so that a rowset can be compacted right after it is
generated, because we will not delete the rowset after compaction. This greatly
reduces the compaction score under concurrent load.
4. Remove the version consistency check in can_do_compaction. We always choose
consecutive rowsets to compact, so this check is useless.

With the logic above, compaction score and CPU cost are substantially
reduced under concurrent load.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-06-17 17:49:45 +08:00
d51166dd2a [Enhancement](Nereids) Automatic compute logical properties (#10176)
Automatic compute logical properties
2022-06-17 11:31:05 +08:00
60147ad7a5 [Improvement] build runtime filters asynchronously (#10186) 2022-06-17 11:09:13 +08:00
5e47b03595 [feature-wip](array-type) Add array aggregation functions (#10108) 2022-06-17 11:07:49 +08:00
b9f8df0264 [Bug] Compatible with Datagrip, fix checkStyle (#10143)
* Compatible with Datagrip, fix checkStyle

* ADD: comment
2022-06-17 11:05:17 +08:00
67e95276fb [fix](optimizer) Fix the default join reorder algorithm (#10174)
The default join reorder algorithm does not work in most cases.
2022-06-17 10:59:33 +08:00
Pxl fd0bd395ac [Enhancement] Remove some unused include (#10035) 2022-06-17 10:47:25 +08:00
de86c0dd25 [doc](website)fix algolia search bug (#10196) 2022-06-17 08:51:28 +08:00
2a1d1b951a [data lake]Add HMS external data source. (#10088) 2022-06-17 08:49:15 +08:00
44e979e43b [Vectorized][Function] add orthogonal bitmap agg functions (#10126)
* [Vectorized][Function] add orthogonal bitmap agg functions
save some file about orthogonal bitmap function
add some file to rebase
update functions file

* refactor union_count function
refactor orthogonal union count functions

* remove bool is_variadic
2022-06-17 08:48:41 +08:00
a62a485faf [regression test]Constrain run-regression-test mem to 2G (#10165)
* Update .asf.yaml

* Update vtablet_sink_test.cpp

* Constrain run-regression-test mem to 2G
2022-06-17 08:46:16 +08:00
1cca319d18 [fix](vectorized) intersect operator takes too long time to execute (#10183)
* fix intersect operator takes too long time to execute

* modify code based on review comments
2022-06-17 08:43:53 +08:00
6f5f447aa3 [FOLLOWUP] cherrypick after refactoring scan nodes (#10177) 2022-06-17 08:41:47 +08:00
96de99525e [compile&build]clang compile errors fix (#10201)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-17 08:41:25 +08:00
c784fb3ddd [fix] (mem tracker) Fix core dump during transmit_block (#10133)
In some cases, the query mem tracker does not exist in BE when transmitting a block. This results in a null pointer when getting the query mem tracker in brpc transmit_block.
2022-06-17 00:01:30 +08:00
8d98c17c4e [Bug][Vectorized] Fix DCHECK failed in VExchangeNode close twice (#10184)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-06-16 23:56:49 +08:00
f1c9105af1 [feature] Support hive on s3 (#10128)
Support querying Hive tables on S3. Pass the AK/SK, region, and S3 endpoint to the Hive table when creating the external table.

Example CREATE TABLE statement:
```
CREATE TABLE `region_s3` (
`r_regionkey` integer NOT NULL,
`r_name` char(25) NOT NULL,
`r_comment` varchar(152) )
engine=hive
properties
("database"="default",
"table"="region_s3",
"hive.metastore.uris"="thrift://127.0.0.1:9083",
"AWS_ACCESS_KEY"="YOUR_ACCESS_KEY",
"AWS_SECRET_KEY"="YOUR_SECRET_KEY",
"AWS_ENDPOINT"="s3.us-east-1.amazonaws.com",
"AWS_REGION"="us-east-1");
```
2022-06-16 19:15:46 +08:00
41b693e1df [test] Add window cast bitmap digital_masking function regression test. (#9924) 2022-06-16 19:14:51 +08:00
ac2be958b3 [tpch tools]set exec_mem_limit=8G for tpch queries (#10119)
Co-authored-by: Jerry <root@localhost.localdomain>
2022-06-16 18:19:11 +08:00
75a7e72402 [Refactor] Use iequal to replace boost::iequals (#10146)
* [Refactor] Use iequal to replace boost::iequals

* remove unused include
2022-06-16 18:18:38 +08:00
14d21edf65 [fix] croaringbitmap compile support USE_AVX2=0 (#10140)
* If we disable AVX2 by setting USE_AVX2=0, we need to build croaringbitmap with ROARING_DISABLE_AVX=ON

* update to trigger regression test again

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-16 18:17:46 +08:00
Pxl ae9c231925 [Enhancement][Storage] refactor InListPredicate/NotInListPredicate (#10139)
* refactor in_list_pred

* update
2022-06-16 18:09:29 +08:00
f49a4535c4 [Fix] fix vjson_scanner heap use after free when meet object or array type (#10179)
quick merge. It is a serious bug in 1.1.
2022-06-16 16:01:18 +08:00
33921c5e75 [Bug] Fix _add_block_closure do not delete in ~VNodeChannel() (#10180)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-06-16 15:56:07 +08:00
dad953bc08 [doc](website)fix SSR bug and add algolia search (#10178)
* fix ssr bug and add algolia search
2022-06-16 14:25:46 +08:00
3f9436c6a8 [compile]fix simdjson compile flags (#10054) 2022-06-16 11:28:51 +08:00
28e8effc52 [Refactor] Refactor vectorized scan node (#9968) 2022-06-16 11:10:56 +08:00
4b9d500425 [improvement](profile) Add table name and predicates (#10093) 2022-06-16 10:59:31 +08:00
Pxl 3b6451273b [regression test]fix test_outfile to use user regression conf (#10123) 2022-06-16 10:58:36 +08:00
Pxl 5805f8077f [Feature] [Vectorized] Some pre-refactorings or interface additions for schema change part2 (#10003) 2022-06-16 10:50:08 +08:00
90f229c038 [refactor] remove useless plugin test code (#10061)
* remove plugin test code

* remove plugin test

Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-06-16 10:43:28 +08:00
bc431f2806 [typo] Fix typos in comments (#10142) 2022-06-16 10:13:59 +08:00
9217223cc5 [doc] update sequence en and zh-CN doc. (#10164)
* update sequence en and zh-CN doc.
2022-06-16 09:32:52 +08:00
dff1f09406 [doc](website)update Chinese home page text (#10168)
update Chinese home page text
2022-06-16 08:04:21 +08:00
ca88f258d9 [improvement] remove unused codes and docs for SHOW USER (#10107)
* remove unused codes and docs for `SHOW USER`
2022-06-15 21:49:08 +08:00
4dfebb9852 [Feature] compaction quickly for small data import (#9804)
* compaction quickly for small data import #9791
1. Merge small rowset versions as soon as possible to increase the import frequency of small version data.
2. A small version is one whose number of rows is less than config::small_compaction_rowset_rows (default 1000).
2022-06-15 21:48:34 +08:00
c4871fb306 [doc](website)remove translate warning from Chinese docs (#10157)
* modify home page text
2022-06-15 18:17:37 +08:00
4005b34a52 [doc] add tpc-h benchmark (#10150)
[doc] add tpc-h benchmark
2022-06-15 16:43:10 +08:00
49f4437396 [fix] Fix disk used pct only consider the data that used by Doris (#9705) 2022-06-15 16:28:56 +08:00
f1d0c231b9 [Opt][Vectorized] Opt vectorized the unique_table in storage vectorized (#10132)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-06-15 15:32:15 +08:00