4969 Commits

Author SHA1 Message Date
f5e5880fb6 [Improvement] make expression for template argument a constexpr (#10268) 2022-06-21 07:42:02 +08:00
dd39287a0a [fix](compile) Fix compile error in external file scan node (#10274) 2022-06-20 22:39:32 +08:00
5974e452bc [enhancement] CRC32 instructions compatible arm arch (#10261)
Performance is particularly poor on some CPUs that do not implement CRC instructions.
2022-06-20 17:49:06 +08:00
75719bca92 [doc](website)Remove incubator prefix and add graduate note (#10257) 2022-06-20 17:48:29 +08:00
c3743ec9aa [enhancement] optimize 2 cases in seg_iter: all/none rows passed predicate (#10259)
* [enhancement] optimize 2 cases: all/none rows passed predicate in seg_iter.

* format
2022-06-20 17:47:52 +08:00
8a8d24b4a6 [feature-wip](multi-catalog) External file scan node (#9973) 2022-06-20 17:46:24 +08:00
8531dcb885 [refactor](nereids) Abstract interface of statistics framework for new optimizer reuse (#10240)
The statistics framework could not previously be reused by the new optimizer, so this change abstracts some interfaces to make it reusable.

1. Make Slot extend Id
2. Add new interfaces: ExprStats, PlanStats
3. Move the definition of PlanNode.NodeType to the statistics sub-directory
2022-06-20 15:16:06 +08:00
57327e6236 [improvement]Separate input and output parameters in ColumnPredicate (#10249)
```cpp
for (uint16_t i = 0; i < *size; ++i) {
	// some code here
}
```
In a loop like this, the value of `*size` is re-read on every test of the loop condition, which can also prevent vectorization.
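To illustrate the point, here is a minimal sketch, not the actual ColumnPredicate code. The function names, the `values`/`limit` parameters, and the selection-vector layout are assumptions made for the example. In the first version the loop bound is read through a pointer that may alias the array being written, so the compiler has to reload it. In the second the count is passed by value and the result is returned.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical in/out style: `size` is both the input count and the output count.
// Stores to sel[...] may alias *size (same type), so *size is reloaded each iteration.
inline void evaluate_in_out(const int32_t* values, int32_t limit,
                            uint16_t* sel, uint16_t* size) {
    uint16_t new_size = 0;
    for (uint16_t i = 0; i < *size; ++i) {
        uint16_t idx = sel[i];
        sel[new_size] = idx;              // keep the row index tentatively
        new_size += values[idx] < limit;  // advance only if the row passes
    }
    *size = new_size;
}

// Hypothetical separated style: the input count is a plain value and the
// output count is the return value, so the loop bound is loop-invariant.
inline uint16_t evaluate_separated(const int32_t* values, int32_t limit,
                                   uint16_t* sel, uint16_t size) {
    assert(values != nullptr && sel != nullptr);
    uint16_t new_size = 0;
    for (uint16_t i = 0; i < size; ++i) {
        uint16_t idx = sel[i];
        sel[new_size] = idx;
        new_size += values[idx] < limit;
    }
    return new_size;
}
```

Both functions compute the same result. The second form only makes it easier for the compiler to keep the bound in a register and consider vectorizing the loop.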
2022-06-20 15:04:57 +08:00
087fc596b1 [feature] add remote storage policy config for create table properties (#10159)
Add a remote storage policy config to the create table properties. It sets the storage policy for a table and its partitions in `CREATE TABLE` and `ALTER TABLE`.
The policy is used when a partition is migrated from local to remote storage.
Grammar:
1.
`CREATE TABLE TblPxy1
(...)
ENGINE=olap
DISTRIBUTED BY HASH (aa) BUCKETS 1
PROPERTIES(
    "remote_storage_policy" = "testPolicy3"
);`
2.
`ALTER TABLE TblPxy01 SET ("remote_storage_policy" = "testPolicy3");`
3.
`ALTER TABLE TblPxy01 MODIFY PARTITION p2 SET ("remote_storage_policy" = "testPolicy3");`
2022-06-20 12:42:23 +08:00
acf07d8966 [test] Add bitmap_intersect and schema_change test for regression test. (#9854) 2022-06-20 09:51:31 +08:00
588634ddf6 [feature] support runtime filter on vectorized engine (#10103) 2022-06-20 09:46:38 +08:00
2f37e108e3 [feature-wip](array-type) add ArrayType support for FeFunctions (#10041)
FEFunctionSignature does not support ArrayType as an argument, so the following SQL fails:
`> select array_contains([1,2,3], 1);`
ERROR 1105 (HY000): errCode = 2, detailMessage = Unexpected exception: org.apache.doris.catalog.ArrayType cannot be cast to org.apache.doris.catalog.ScalarType
2022-06-20 09:35:06 +08:00
185de4dd43 [docs]update develop document (#10242) 2022-06-20 09:24:49 +08:00
ecdf8bcfdd [comments]Replace some chinese comments in product Code (#10243) 2022-06-20 09:24:19 +08:00
1c9ce29440 [improvement]Avoid frequently allocating and releasing flags in InListPredicate (#10248) 2022-06-20 09:08:02 +08:00
ab29ad2144 [typo] Fix typos in comments (#10247) 2022-06-20 09:06:29 +08:00
9a1f1c3864 [improvement](variables) change session variable when set global variable (#10238)
Currently, setting a variable with the `global` keyword does not affect the
value of the current session variable. This often confuses users.

This CL mainly changes:

1. Change the session variable when setting a global variable
2022-06-20 09:05:50 +08:00
f728fd4933 [fix](auth) Authentication exception when the name of database or table contains an underscore in grant statement. (#10213) 2022-06-19 22:20:01 +08:00
67f341f44e [TLP](step-1) Remove incubator prefix (#10230)
Remove some `incubator-` prefix in source code.
The documentation is not modified; that will be done in the next PR.
2022-06-19 19:34:52 +08:00
e09066c7ee [Improvement] delete deprecated config in document and regression test (#10231) 2022-06-19 18:16:59 +08:00
6ad024a2bf [fix] (mem tracker) Refactor memtable mem tracker, fix flush memtable DCHECK failed (#10156)
1. Add memory leak detection for the `DeltaWriter` and `MemTable` mem trackers.
2. Make the memtable mem tracker virtual to avoid frequent recursive consumption on the parent tracker.
3. Stop the memtable flush thread from attaching the memtable tracker, so that the memtable mem tracker stays completely accurate.
4. Set `memory_verbose_track=false`. At present, frequently switching the thread mem tracker causes a performance problem.
      - The mem tracker is held as a shared_ptr in thread-local storage. Each switch decrements the atomic use_count of the current tracker's shared_ptr and increments that of the replacement tracker. When multiple threads frequently change the shared_ptr of the same tracker, this is slow.
      - TODO: 1. Reduce unnecessary thread mem tracker switches, 2. Consider using raw pointers for the mem tracker in thread local.
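The cost described in item 4 can be sketched as follows. This is an illustration only, not the Doris implementation: `MemTracker` here is a hypothetical stand-in struct, and `switch_shared` and `switch_raw` are made-up names. Copy-assigning a `std::shared_ptr` updates two atomic reference counts, while storing a raw pointer touches none.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical stand-in for a memory tracker.
struct MemTracker {
    int64_t consumed = 0;
};

// Thread-local slot holding a shared_ptr: every switch increments the
// reference count of the new tracker and decrements that of the old one.
// Both updates are atomic and contend when many threads share a tracker.
thread_local std::shared_ptr<MemTracker> tls_tracker_shared;

// Thread-local slot holding a raw pointer: a switch is a plain store.
// The owner must guarantee the tracker outlives every thread that uses it.
thread_local MemTracker* tls_tracker_raw = nullptr;

inline void switch_shared(const std::shared_ptr<MemTracker>& next) {
    tls_tracker_shared = next;  // atomic ++ on next, atomic -- on previous
}

inline void switch_raw(MemTracker* next) {
    tls_tracker_raw = next;     // no reference counting involved
}
```

The raw-pointer variant trades safety for speed, which is presumably why the commit lists it as something to consider rather than something already done.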
2022-06-19 16:48:42 +08:00
61d7724ab3 [tpch] Change all replication_num to 1 (#10244) 2022-06-19 10:42:04 +08:00
8439adad05 [doc] update array functions docs' location (#10226)
Move the array function docs to the correct directory,
because the docs directory has already been refactored.

```
docs/en/sql-manual/sql-functions/array-functions/    ===>
docs/en/docs/sql-manual/sql-functions/array-functions

```
```
docs/zh-CN/sql-manual/sql-functions/array-functions/    ===>
docs/zh-CN/docs/sql-manual/sql-functions/array-functions/
```
2022-06-19 10:40:40 +08:00
6b61b970f5 [chore] Fix a warning reported by maven (#10205) 2022-06-19 10:34:27 +08:00
70450d04ba [typo] Fix typos in comments (#10172) 2022-06-19 10:30:17 +08:00
ffe466cbc7 [fix](reader)replace an auto with size_t to avoid integer overflow (#10163) 2022-06-19 10:29:01 +08:00
5fdd995b4c [fix] Fix heap-use-after-free when using type array<string> (#10127) 2022-06-19 10:27:36 +08:00
1d3496c6ab [feature] support backup/restore connect to HDFS (#10081) 2022-06-19 10:26:20 +08:00
0e404edf54 [improvement] Change array offset type from UInt32 to UInt64 (#10070)
Column `Array<T>` currently contains the columns `offsets` and `data`, and the type of column `offsets` is UInt32.
If array_union is called repeatedly to merge arrays, the array size may overflow.
So the offset type needs to be extended before the `Array Data Type` release.
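A minimal sketch of the overflow, assuming offsets are stored as cumulative end positions. The `build_offsets` helper and the element counts below are hypothetical and chosen only to cross the 2^32 boundary:

```cpp
#include <cstdint>
#include <vector>

// Builds cumulative end offsets for a sequence of array lengths.
// `Offset` is the storage type of the offsets column (hypothetical helper).
template <typename Offset>
std::vector<Offset> build_offsets(const std::vector<uint64_t>& lengths) {
    std::vector<Offset> offsets;
    offsets.reserve(lengths.size());
    Offset end = 0;
    for (uint64_t len : lengths) {
        // The sum is computed in 64 bits, then narrowed to Offset.
        // With a 32-bit Offset the value wraps modulo 2^32.
        end = static_cast<Offset>(end + len);
        offsets.push_back(end);
    }
    return offsets;
}
```

With two arrays of 3,000,000,000 elements each, `build_offsets<uint32_t>` yields a second offset that is smaller than the first, while `build_offsets<uint64_t>` yields 6,000,000,000 as expected.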
2022-06-19 10:24:08 +08:00
534844ead6 [chore] update fe checkstyle workflow to required (#10237) 2022-06-18 21:32:28 +08:00
a52f40eb77 [fix](regression-test) fix run-regression-test Xmx2048m param (#10234)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-17 23:30:44 +08:00
7a85e8d525 [bug](be) fix be block_reader.cc::_update_agg_value() mem leak.(#10216) (#10218) 2022-06-17 21:25:52 +08:00
b7b78ae707 [style](fe)the last step of fe CheckStyle (#10134)
1. fix all checkstyle warnings
2. change all checkstyle rules to error
3. remove some Javadoc rules
    a. RequireEmptyLineBeforeBlockTagGroup
    b. JavadocStyle
    c. JavadocParagraph
4. suppress some rules for old code
    a. all Javadoc rules apply only to Nereids
    b. DeclarationOrder applies only to Nereids
    c. OverloadMethodsDeclarationOrder applies only to Nereids
    d. VariableDeclarationUsageDistance applies only to Nereids
    e. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/ColumnParser.java
    f. suppress OneTopLevelClass on org/apache/doris/load/loadv2/dpp/SparkRDDAggregator.java
    g. suppress LineLength on org/apache/doris/catalog/FunctionSet.java
    h. suppress LineLength on org/apache/doris/common/ErrorCode.java
2022-06-17 21:02:45 +08:00
fea815f290 [doc](website)Replace CDN files with local files (#10212)
Replace CDN files with local files
2022-06-17 20:58:56 +08:00
f7789f4bc4 [fix]InListPredicate wrong result (#10211)
* fix

* reg test

Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-06-17 18:34:25 +08:00
6baa694bc1 [feature-wip](multi-catalog) Catalog operation syntax (#10033)
Implement catalog operation syntax
2022-06-17 17:50:31 +08:00
f35b235c3b [opt](compaction) optimize compaction in concurrent load (#10153)
Add logic to optimize compaction:
1. Separate base and cumulative compaction, in case base compaction runs too long and affects cumulative compaction.
2. Fix the level size in cumulative compaction so that files below 64M get the right level size. When choosing rowsets to compact, the policy ignores big rowsets, which reduces CPU usage by about 25% under high-frequency concurrent load.
3. Remove the skip window restriction so that a rowset can be compacted right after it is generated, since rowsets are not deleted after compaction. This greatly reduces the compaction score under concurrent load.
4. Remove the version consistency check in can_do_compaction. Consecutive rowsets are chosen for compaction, so this check is unnecessary.

With the logic above, compaction score and CPU cost are substantially reduced under concurrent load.

Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-06-17 17:49:45 +08:00
d51166dd2a [Enhancement](Nereids) Automatic compute logical properties (#10176)
Automatic compute logical properties
2022-06-17 11:31:05 +08:00
60147ad7a5 [Improvement] build runtime filters asynchronously (#10186) 2022-06-17 11:09:13 +08:00
5e47b03595 [feature-wip](array-type) Add array aggregation functions (#10108) 2022-06-17 11:07:49 +08:00
b9f8df0264 [Bug] Compatible with Datagrip, fix checkStyle (#10143)
* Compatible with Datagrip, fix checkStyle

* ADD: comment
2022-06-17 11:05:17 +08:00
67e95276fb [fix](optimizer) Fix the default join reorder algorithm (#10174)
The default join reorder algorithm does not work in most cases.
2022-06-17 10:59:33 +08:00
fd0bd395ac [Enhancement] Remove some unused include (#10035) 2022-06-17 10:47:25 +08:00
de86c0dd25 [doc](website)fix algolia search bug (#10196) 2022-06-17 08:51:28 +08:00
2a1d1b951a [data lake]Add HMS external data source. (#10088) 2022-06-17 08:49:15 +08:00
44e979e43b [Vectorized][Function] add orthogonal bitmap agg functions (#10126)
* [Vectorized][Function] add orthogonal bitmap agg functions
save some file about orthogonal bitmap function
add some file to rebase
update functions file

* refactor union_count function
refactor orthogonal union count functions

* remove bool is_variadic
2022-06-17 08:48:41 +08:00
a62a485faf [regression test]Constrain run-regression-test mem to 2G (#10165)
* Update .asf.yaml

* Update vtablet_sink_test.cpp

* Constrain run-regression-test mem to 2G
2022-06-17 08:46:16 +08:00
1cca319d18 [fix](vectorized) intersect operator takes too long time to execute (#10183)
* fix intersect operator taking too long to execute

* modify code based on review comments
2022-06-17 08:43:53 +08:00
6f5f447aa3 [FOLLOWUP] cherrypick after refactoring scan nodes (#10177) 2022-06-17 08:41:47 +08:00
96de99525e [compile&build]clang compile errors fix (#10201)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-06-17 08:41:25 +08:00