Commit Graph

5265 Commits

Author SHA1 Message Date
ca5dbb1bcc Fix olap scan node normalize_in_and_eq_predicate infinite loop bug. (#10817) 2022-07-14 14:54:57 +08:00
799300c475 [website]1.1 release formulation (#10838) 2022-07-14 14:43:33 +08:00
190daee9f3 [doc] Fixed doc typo for materialized views 2022-07-14 14:16:45 +08:00
d245ab76cc [improvement]Use uint32 instead of size_t to reduce agg key's length (#10832) 2022-07-14 14:11:55 +08:00
575a1cb173 [Release] add download links for v1.1 (#10836) 2022-07-14 14:01:52 +08:00
13e9cb146f [feature-wip](unique-key-merge-on-write) Add option to enable unique-key-merge-on-write, DSIP-018[5/1] (#10814)
* Add option in FE

* add opt in be

* some fix

* update

* fix code style

* fix typo

* fix typo

* update

* code format
2022-07-14 12:10:58 +08:00
d1573e1a4a [improvement]Use phmap for aggregation with serialized key (#10821) 2022-07-14 11:26:09 +08:00
e361eb385e [vectorized][udf] improvement java-udaf with group by clause (#10296)
save for file about udaf
add bool _destory_deserialize
update some code according to reviewer comments
change destroy all data at once
2022-07-14 11:23:42 +08:00
3d52bff8d1 [improvement]output query_id when be core dumped. (#10822) 2022-07-14 10:55:28 +08:00
3b46242483 [feature-wip] Optimize Decimal type (#10794)
* [feature-wip](decimalv3) support decimalv3

* [feature-wip] Optimize Decimal type

Co-authored-by: liaoxin <liaoxinbit@126.com>
2022-07-14 10:50:50 +08:00
bb0d023abd [docs] Change the incubator fields before and after Doris' Graduation (#10482)
Change the incubator fields before and after Doris' Graduation
2022-07-14 10:48:20 +08:00
077ec4b114 [bug](multi-catalog) empty hadoop configuration when reading iceberg table (#10793) 2022-07-14 10:18:59 +08:00
4f4ce4674a [fix](array) regression test fix for array (#10815) 2022-07-13 21:21:34 +08:00
e78cca1009 (Refactor)[Nereids] Combine operator and plan (#10786)
In #9755 we split plan into plan and operator, but in subsequent development we found that rules became complex and counterintuitive:
1. We must create an operator instance, then wrap it in a plan according to the operator type.
2. A relational algebra operator does not contain its children.

e.g.
```java
logicalProject().then(project -> {
    List<NamedExpression> boundSlots =
        bind(project.operator.getProjects(), project.children(), project);
    LogicalProject op = new LogicalProject(flatBoundStar(boundSlots));
    // wrap a plan
    return new LogicalUnaryPlan(op, project.child());
})
```

After combining operator and plan, the code becomes:
```java
logicalProject().then(project -> {
    List<NamedExpression> boundSlots =
        bind(project.getProjects(), project.children(), project);
    return new LogicalProject(flatBoundStar(boundSlots), project.child());
})
```

Originally, we thought splitting plan and operator would make `Memo.copyIn()` more convenient, because Memo doesn't know how to re-create a plan (assembling child plans from the children groups) from the plan type alone. So every plan must provide the abstract `withChildren()` method to assemble its children. The fewer plan types there are, the lower the code cost (logical/physical with leaf/unary/binary plan, about 6 plans, and no concrete plans such as LogicalAggregatePlan).

But this convenience has a downside: the design is hard to understand, developers must learn the concept before they can write new rules, and the rules become ugly. So we combine plan and operator to make rules as simple as possible. The cost is that we must override some withXxx methods for every concrete plan, e.g. LogicalAggregate, PhysicalHashJoin.
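
As a rough illustration of the combined design, here is a minimal sketch in which each concrete plan carries its own children and rebuilds itself through `withChildren()`. The `Plan` interface and the simplified `LogicalRelation` and `LogicalProject` classes below are hypothetical stand-ins written for this example, not the actual Nereids classes.

```java
import java.util.List;

// Hypothetical, simplified stand-ins for plan classes; not the actual Doris API.
interface Plan {
    List<Plan> children();

    // Each concrete plan rebuilds itself around new children, which is what a
    // memo needs when it re-creates a plan from its child groups.
    Plan withChildren(List<Plan> children);
}

final class LogicalRelation implements Plan {
    final String table;

    LogicalRelation(String table) {
        this.table = table;
    }

    public List<Plan> children() {
        return List.of();
    }

    public Plan withChildren(List<Plan> children) {
        if (!children.isEmpty()) {
            throw new IllegalArgumentException("a leaf plan takes no children");
        }
        return this;
    }
}

final class LogicalProject implements Plan {
    final List<String> projects;
    final Plan child;

    LogicalProject(List<String> projects, Plan child) {
        this.projects = projects;
        this.child = child;
    }

    public List<Plan> children() {
        return List.of(child);
    }

    public Plan withChildren(List<Plan> children) {
        if (children.size() != 1) {
            throw new IllegalArgumentException("a unary plan takes exactly one child");
        }
        // Keep the operator's own fields and swap only the child.
        return new LogicalProject(projects, children.get(0));
    }
}

class PlanSketch {
    public static void main(String[] args) {
        Plan project = new LogicalProject(List.of("a", "b"), new LogicalRelation("t1"));
        Plan rewired = project.withChildren(List.of(new LogicalRelation("t2")));
        System.out.println(((LogicalRelation) rewired.children().get(0)).table); // prints t2
    }
}
```

The trade-off described above shows up directly: every concrete plan needs its own `withChildren()` override, but a rule can construct `LogicalProject` in one step without a separate wrapper plan.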
2022-07-13 19:05:15 +08:00
56b55563c6 [feature-wip](unique-key-merge-on-write) add bloom filter index for primary key, DSIP-018[1.2] (#10706) 2022-07-13 18:58:45 +08:00
ad7702f06e add flink local debug log dep doc (#10806)
add flink local debug log dep doc
2022-07-13 17:43:42 +08:00
def59a686e [improvement]output fatal log to stderr (#10789) 2022-07-13 16:34:37 +08:00
bd982ac815 [Bug] Fix array functions arguments mismatch (#10549)
Currently, we convert array<Int> to array<BigInt>

For example, the input array_sum([1, 2, 3]) can match function array_sum(Array<Int>) as well as array_sum(Array<BigInt>).

But when a function has more than one argument, it may be matched incorrectly.

For example, the input array_contains([1, 2, 3], 2147483648) will match the function array_contains(Array<BigInt>, BigInt), but the correct match should be array_contains(Array<Int>, Int)

The correct match should be:
array_contains([1, 2, 3], 1) match array_contains(Array<Int>, Int)
array_contains([1, 2, 3], 2147483648) match array_contains(Array<Int>, Int)
array_contains([2147483648, 2147483649, 2147483650], 2147483648) match array_contains(Array<BigInt>, BigInt)

The current match is:
array_contains([1, 2, 3], 1) match array_contains(Array<Int>, Int)
array_contains([1, 2, 3], 2147483648) match array_contains(Array<BigInt>, BigInt)
array_contains([2147483648, 2147483649, 2147483650], 2147483648) match array_contains(Array<BigInt>, BigInt)

And this will cause some trouble.

Assume that there are two functions being defined:
Int array_functions(Array<Int>, Int)
BigInt array_functions(Array<BigInt>, BigInt)

Then array_functions([1,2,3], 2147483648) matches BigInt array_functions(Array<BigInt>, BigInt), so the result type is BigInt when it should be Int.
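
One way to read the intended matching rule is that the array argument's element type decides the overload, regardless of how wide the scalar argument is. The sketch below illustrates that reading only. The class `ArrayOverloadSketch`, its methods, and the string signatures are hypothetical and are not the Doris function resolver.

```java
import java.util.List;

// Hypothetical illustration of overload selection; not the Doris function resolver.
class ArrayOverloadSketch {
    enum ElemType { INT, BIGINT }

    // Narrowest element type that can hold every literal in the array.
    static ElemType elementType(List<Long> literals) {
        for (long v : literals) {
            if (v < Integer.MIN_VALUE || v > Integer.MAX_VALUE) {
                return ElemType.BIGINT;
            }
        }
        return ElemType.INT;
    }

    // Assumption: the overload is chosen from the array argument alone, so a
    // wide scalar argument does not promote the whole call to BigInt.
    static String resolveArrayContains(List<Long> array, long value) {
        return elementType(array) == ElemType.INT
                ? "array_contains(Array<Int>, Int)"
                : "array_contains(Array<BigInt>, BigInt)";
    }

    public static void main(String[] args) {
        System.out.println(resolveArrayContains(List.of(1L, 2L, 3L), 1L));
        System.out.println(resolveArrayContains(List.of(1L, 2L, 3L), 2147483648L));
        System.out.println(resolveArrayContains(
                List.of(2147483648L, 2147483649L, 2147483650L), 2147483648L));
    }
}
```

With these inputs the three calls select the Int, Int, and BigInt overloads respectively, which matches the "correct match" list in the message.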
2022-07-13 14:54:49 +08:00
d03b5c29a8 [bugfix] fix bug of ComparisonPredicate for nullable column (#10798) 2022-07-13 12:26:05 +08:00
4719d4705f [regression] update test framework and fix cases (#10686)
Also temporarily exclude the suite test_create_table_with_bloom_filter from the regression test.

Co-authored-by: stephen <hello-stephen@qq.com>
2022-07-13 10:16:16 +08:00
7906866826 Fix show table status docs. (#10782)
Co-authored-by: smallhibiscus <844981280>
2022-07-13 08:41:49 +08:00
f9f711cd16 FIX: fix datetimev2 decimal error. (#10736) 2022-07-13 08:32:26 +08:00
a9a08d3d0b [doc]Add common errors to broker load import documentation (#10773)
* Add common errors to broker load import documentation
2022-07-13 08:31:17 +08:00
Pxl
4190f7354c [Bug][Memtable] fix core dump on int128 because not aligned by 16 byte (#10775)
* fix core dump on int128 because not aligned by 16 byte

* update
2022-07-13 08:30:58 +08:00
Pxl
d6210edcda [bugfix]set IsNullPredicate to ALWAYS_NOT_NULLABLE (#10785) 2022-07-13 08:28:00 +08:00
d278f400d4 [enhancement](show data skew) Support show avg_row_count for data skew of one table (#10790) 2022-07-13 08:27:20 +08:00
6063c0c9c8 [enhancement](signal) output git commit id when the program coredump (#10788)
* [enhancement](signal) output git commit id when the program coredump

* modify output info
2022-07-13 08:24:58 +08:00
89e2678f4e [improvement]Increase min_ht_mem of StreamingHtMinReductionEntry (#10787) 2022-07-12 22:20:02 +08:00
89bec9b56a [enhancement](be) be asan add asan_suppr.conf to ignore known leak. (#10768) 2022-07-12 19:51:34 +08:00
486cf0ebd4 [Feature] Lightweight schema change of add/drop column (#10136)
* [Schema Change] support fast add/drop column  (#49)

* [feature](schema-change) support fast schema change. coauthor: yixiutt

* [schema change] Using columns desc from fe to read data. coauthor: Lchangliang

* [feature](schema change) schema change optimize for add/drop columns.

1.add uniqueId field for class column.
2.schema change for add/drop columns directly update schema meta

Co-authored-by: yixiutt <yixiu@selectdb.com>
Co-authored-by: SWJTU-ZhangLei <1091517373@qq.com>

[Feature](schema change) fix write and add regression test (#69)

Co-authored-by: yixiutt <yixiu@selectdb.com>

[schema change] be supports delete using the newest schema

add delete regression test

fix regression case (#107)

tmp

[feature](schema change) light schema change exclude rollup and agg/uniq/dup key type.

[feature](schema change) fe olapTable maxUniqueId write in disk.

[feature](schema change) add rpc iface for sc add column.

[feature](schema change) add columnsDesc to TPushReq for light sc.

resolve the deadlock when schema change (#124)

fix columns from fe not having the bitmap_index flag (#134)

add update/delete case

construct MATERIALIZED schema from origin schema when insert

fix not vectorized compaction coredump

use segment cache

choose newest schema by schema version when compaction (#182)

[bugfix](schema change) fix light schema change problem.

[feature](schema change) light schema change add alter job. (#1)

fix be ut

[bug] (schema change) unique drop key column should not light schema
change

[feature](schema change) add schema change regression-test.

fix regression test

[bugfix](schema change) fix multi alter clauses for light schema change. (#2)

[bugfix](schema change) fix multi clauses calculate column unique id (#3)

modify PushTask process (#217)

[Bugfix](schema change) fix jobId replay cause bdbje exception.

[bug](schema change) fix repeated max col unique id. (#232)

[optimize](schema change) modify pendingMaxColUniqueId generate rule.

fix compaction error
* fix be ut

* fix snapshot load core

fix unique_id error (#278)

[refact](fe) remove redundant code for light schema change. (#4)

format fe core

format be core

fix be ut

modify fe meta version

fix rebase error

flush schema into rowset_meta in old table

[refactor](schema change) refact fe light schema change. (#5)

delete the change of schemahash and support get max version schema

* modify for review

* fix be ut

* fix schema change test
2022-07-12 19:41:06 +08:00
41f9ee2f9e mem_tracker_factor_v2 (#10743) 2022-07-12 18:09:41 +08:00
eb079950cb [feature-wip] (array-type) add the array_distinct function (#10388)
* add the array_distinct function

* add the support for decimal and update variable names

* add docs and regression test for array_distinct function

Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-12 17:02:42 +08:00
5f99a95816 fix use released memory bugs (#10777)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-12 16:54:32 +08:00
2084d8bdf3 [feature-wip](unique-key-merge-on-write) Add delete bitmap for DSIP-018 (#10548)
Add delete bitmap for
DSIP-018: Support Merge-On-Write implementation for UNIQUE KEY data model
2022-07-12 16:34:42 +08:00
4e9d5a7f7a optimize substr performance and fix ASAN global buffer overflow (#10442)
* add volnitsky substr algorithm

* replace std::search with volnitsky search algorithm in StringSearch

* optimize substring for constant_substring_fn case
use long run length search for performance
2022-07-12 08:36:21 +08:00
f5036fea63 [enhancement][multi-catalog]Add strong checker for hms table (#10724) 2022-07-11 23:48:15 +08:00
aea41c2ffa support different python version in build-support scripts (#10378)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-11 23:23:45 +08:00
62fba5aa5d [Fix](distribution) fix random tablet (#10756)
_compute_tablet_index captured a local variable
2022-07-11 23:12:56 +08:00
3b9cb524bc [feature-wip](unique-key-merge-on-write) Add rowset tree, based on interval-tree, DSIP-018[3/3] (#10714)
* port from rowset-tree from kudu

* use shared_ptr

* some update

* add mock rowset

* some compatibility update

* fix ut fail

* reformat code
2022-07-11 23:10:38 +08:00
88f466ab86 [bugfix] temporarily disable pushing RF to scanner to avoid coredump (#10776) 2022-07-11 22:48:08 +08:00
a266d7b040 [bug](be) fix be _quick_compaction_thread_pool without shutdown. (#10758) 2022-07-11 22:33:56 +08:00
5a54d518dc [Refactor](Nereids) remove generic type from concrete expressions (#10761)
In the past, we used generic types for plans and expressions to support the pattern-matching framework, which allowed type inference without unsafe type casts. We then observed that expressions are usually traversed or rewritten with the visitor pattern, so generic types are useless for expressions and only add complexity. This commit removes generic types from concrete expressions.
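
To make the visitor-based rewriting mentioned above concrete, here is a minimal sketch of a non-generic expression tree that is rewritten through a visitor. All names here (`Expr`, `IntLiteral`, `Add`, `ExprVisitor`, `FoldConstants`) are hypothetical and chosen for this example. They are not the Nereids expression classes.

```java
// Hypothetical, simplified expression classes; not the Nereids API.
interface ExprVisitor<R> {
    R visitLiteral(IntLiteral literal);

    R visitAdd(Add add);
}

// Concrete expressions carry no type parameters; only the visitor is generic
// in its result type.
abstract class Expr {
    abstract <R> R accept(ExprVisitor<R> visitor);
}

final class IntLiteral extends Expr {
    final int value;

    IntLiteral(int value) {
        this.value = value;
    }

    <R> R accept(ExprVisitor<R> visitor) {
        return visitor.visitLiteral(this);
    }
}

final class Add extends Expr {
    final Expr left;
    final Expr right;

    Add(Expr left, Expr right) {
        this.left = left;
        this.right = right;
    }

    <R> R accept(ExprVisitor<R> visitor) {
        return visitor.visitAdd(this);
    }
}

// A rewrite expressed as a visitor: folds additions of two literals.
final class FoldConstants implements ExprVisitor<Expr> {
    public Expr visitLiteral(IntLiteral literal) {
        return literal;
    }

    public Expr visitAdd(Add add) {
        Expr left = add.left.accept(this);
        Expr right = add.right.accept(this);
        if (left instanceof IntLiteral && right instanceof IntLiteral) {
            return new IntLiteral(((IntLiteral) left).value + ((IntLiteral) right).value);
        }
        return new Add(left, right);
    }
}

class VisitorSketch {
    public static void main(String[] args) {
        Expr expr = new Add(new IntLiteral(1), new Add(new IntLiteral(2), new IntLiteral(3)));
        Expr folded = expr.accept(new FoldConstants());
        System.out.println(((IntLiteral) folded).value); // prints 6
    }
}
```

The traversal logic lives entirely in the visitor, so the expression classes themselves need no generic parameters describing their children.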
2022-07-11 22:30:42 +08:00
27505773f5 add regression test for array functions inside WHERE condition (#10748)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-11 22:18:48 +08:00
195d3b4a5a fix Level1Iterator memory leak (#10772) 2022-07-11 22:00:50 +08:00
9b554be698 [improvement]Division of integer is too slow (#10769) 2022-07-11 19:36:12 +08:00
c51badb1ae [feature-wip](datev2) add FE functions and fix some bugs (#10767) 2022-07-11 19:25:31 +08:00
5eb38467ef [bug](be) be asan core doris::DiskIoMgr::~DiskIoMgr(#10759) (#10760) 2022-07-11 19:04:16 +08:00
deae728fc6 [refactor](nereids) Refine some code snippets (#10672)
Refine some code snippets:
1. Rename: ExpressionUtils::add -> ExpressionUtils::and
2. Reduce temporary objects when combining expressions.
2022-07-11 16:31:38 +08:00
51855633e4 [feature](Nereids): cost and enforcer job in cascades. (#10657)
Issue Number: close #9640

Add enforcer job for cascades.

Inspired by the *NoisePage enforcer job* and the *ORCA paper*.

During this period, we derive physical properties for the plan tree and prune the plan according to the cost.
2022-07-11 15:01:59 +08:00
f21ce35059 [refactor]remove unused private field _profile (#10732) 2022-07-11 14:04:09 +08:00