Commit Graph

5948 Commits

Author SHA1 Message Date
bd982ac815 [Bug] Fix array functions arguments mismatch (#10549)
Currently, we convert array<Int> to array<BigInt>

For example, the input array_sum([1, 2, 3]) can match function array_sum(Array<Int>) as well as array_sum(Array<BigInt>).

But when a function has more than one argument, it may be matched incorrectly.

For example, the input array_contains([1, 2, 3], 2147483648) will match the function array_contains(Array<BigInt>, BigInt), but the correct match should be array_contains(Array<Int>, Int)

The correct match should be:
array_contains([1, 2, 3], 1) matches array_contains(Array<Int>, Int)
array_contains([1, 2, 3], 2147483648) matches array_contains(Array<Int>, Int)
array_contains([2147483648, 2147483649, 2147483650], 2147483648) matches array_contains(Array<BigInt>, BigInt)

The current behavior is:
array_contains([1, 2, 3], 1) matches array_contains(Array<Int>, Int)
array_contains([1, 2, 3], 2147483648) matches array_contains(Array<BigInt>, BigInt)
array_contains([2147483648, 2147483649, 2147483650], 2147483648) matches array_contains(Array<BigInt>, BigInt)

And this will cause some trouble.

Assume that there are two functions being defined:
Int array_functions(Array<Int>, Int)
BigInt array_functions(Array<BigInt>, BigInt)

As a result, array_functions([1,2,3], 2147483648) matches BigInt array_functions(Array<BigInt>, BigInt), so the result type is BigInt when it should be Int.
2022-07-13 14:54:49 +08:00
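
The matching rule described in the #10549 message can be sketched roughly as follows. This is only an illustration of the "correct match" table in that message, not Doris code: the helper names (literal_type, array_type, resolve_array_contains) are hypothetical, and the rule "the array's element type alone picks the overload" is an assumption read off the three examples.

```python
# Hypothetical sketch: choose an array_contains overload from the array's
# element type only, so a large scalar argument cannot widen the match.
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def literal_type(value):
    """Return "Int" if the literal fits in 32 bits, otherwise "BigInt"."""
    return "Int" if INT_MIN <= value <= INT_MAX else "BigInt"


def array_type(values):
    """An array is Array<BigInt> only if some element needs BigInt."""
    if any(literal_type(v) == "BigInt" for v in values):
        return "BigInt"
    return "Int"


def resolve_array_contains(values, needle):
    """Pick the overload; `needle` is deliberately ignored (assumption)."""
    elem = array_type(values)
    return f"array_contains(Array<{elem}>, {elem})"
```

Under this assumed rule, resolve_array_contains([1, 2, 3], 2147483648) selects the Array<Int> overload, which is the behavior the message lists as correct.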
d03b5c29a8 [bugfix] fix bug of ComparisonPredicate for nullable column (#10798) 2022-07-13 12:26:05 +08:00
4719d4705f [regression] update test framework and fix cases (#10686)
Also temporarily exclude the suite test_create_table_with_bloom_filter from the regression tests.

Co-authored-by: stephen <hello-stephen@qq.com>
2022-07-13 10:16:16 +08:00
7906866826 Fix show table status docs. (#10782)
Co-authored-by: smallhibiscus <844981280>
2022-07-13 08:41:49 +08:00
f9f711cd16 FIX: fix datetimev2 decimal error. (#10736) 2022-07-13 08:32:26 +08:00
a9a08d3d0b [doc]Add common errors to broker load import documentation (#10773)
* Add common errors to broker load import documentation
2022-07-13 08:31:17 +08:00
Pxl
4190f7354c [Bug][Memtable] fix core dump on int128 because not aligned by 16 byte (#10775)
* fix core dump on int128 because not aligned by 16 byte

* update
2022-07-13 08:30:58 +08:00
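
The #10775 title attributes the crash to an int128 value that was not aligned to 16 bytes. The actual fix is not shown in this log; the sketch below only illustrates the usual arithmetic for rounding an offset up to a 16-byte boundary. The function names align_up and is_aligned are hypothetical.

```python
def align_up(offset, alignment=16):
    """Round a non-negative offset up to the next multiple of `alignment`.

    `alignment` must be a power of two so the bit mask trick is valid.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)


def is_aligned(address, alignment=16):
    """True when `address` is a multiple of `alignment`."""
    return address % alignment == 0
```

An allocator that places a 16-byte integer at align_up(offset) instead of at offset always hands out an address relative to the buffer start that passes is_aligned, provided the buffer itself starts on a 16-byte boundary.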
Pxl
d6210edcda [bugfix]set IsNullPredicate to ALWAYS_NOT_NULLABLE (#10785) 2022-07-13 08:28:00 +08:00
d278f400d4 [enhancement](show data skew) Support show avg_row_count for data skew of one table (#10790) 2022-07-13 08:27:20 +08:00
6063c0c9c8 [enhancement](signal) output git commit id when the program core dumps (#10788)
* [enhancement](signal) output git commit id when the program core dumps

* modify output info
2022-07-13 08:24:58 +08:00
89e2678f4e [improvement]Increase min_ht_mem of StreamingHtMinReductionEntry (#10787) 2022-07-12 22:20:02 +08:00
89bec9b56a [enhancement](be) be asan add asan_suppr.conf to ignore known leak. (#10768) 2022-07-12 19:51:34 +08:00
486cf0ebd4 [Feature] Lightweight schema change of add/drop column (#10136)
* [Schema Change] support fast add/drop column  (#49)

* [feature](schema-change) support fast schema change. coauthor: yixiutt

* [schema change] Using columns desc from fe to read data. coauthor: Lchangliang

* [feature](schema change) schema change optimize for add/drop columns.

1.add uniqueId field for class column.
2.schema change for add/drop columns directly update schema meta

Co-authored-by: yixiutt <yixiu@selectdb.com>
Co-authored-by: SWJTU-ZhangLei <1091517373@qq.com>

[Feature](schema change) fix write and add regression test (#69)

Co-authored-by: yixiutt <yixiu@selectdb.com>

[schema change] be supports delete using the newest schema

add delete regression test

fix regression case (#107)

tmp

[feature](schema change) light schema change exclude rollup and agg/uniq/dup key type.

[feature](schema change) fe olapTable maxUniqueId write in disk.

[feature](schema change) add rpc iface for sc add column.

[feature](schema change) add columnsDesc to TPushReq for light sc.

resolve the deadlock when schema change (#124)

fix columns from fe missing the bitmap_index flag (#134)

add update/delete case

construct MATERIALIZED schema from origin schema when insert

fix not vectorized compaction coredump

use segment cache

choose newest schema by schema version when compaction (#182)

[bugfix](schema change) fix light schema change problem.

[feature](schema change) light schema change add alter job. (#1)

fix be ut

[bug] (schema change) unique drop key column should not light schema change

[feature](schema change) add schema change regression-test.

fix regression test

[bugfix](schema change) fix multi alter clauses for light schema change. (#2)

[bugfix](schema change) fix multi clauses calculate column unique id (#3)

modify PushTask process (#217)

[Bugfix](schema change) fix jobId replay cause bdbje exception.

[bug](schema change) fix max col unique id repetitive. (#232)

[optimize](schema change) modify pendingMaxColUniqueId generate rule.

fix compaction error
* fix be ut

* fix snapshot load core

fix unique_id error (#278)

[refact](fe) remove redundant code for light schema change. (#4)

format fe core

format be core

fix be ut

modify fe meta version

fix rebase error

flush schema into rowset_meta in old table

[refactor](schema change) refact fe light schema change. (#5)

delete the change of schemahash and support get max version schema

* modify for review

* fix be ut

* fix schema change test
2022-07-12 19:41:06 +08:00
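
The #10136 message mentions a uniqueId field on columns, add/drop operations that update only the schema metadata, and a persisted maxUniqueId. A minimal sketch of how those pieces could fit together is given below. Every class and function name here is hypothetical, and the read path (matching stored values to columns by unique id, with a default for columns the stored row does not have) is an assumption, not the Doris implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    unique_id: int
    name: str
    default: object = None


@dataclass
class TableSchema:
    columns: list = field(default_factory=list)
    max_unique_id: int = -1  # never decreases, so ids are never reused
    version: int = 0

    def add_column(self, name, default=None):
        """Metadata-only change: no stored data is rewritten."""
        self.max_unique_id += 1
        self.columns.append(Column(self.max_unique_id, name, default))
        self.version += 1

    def drop_column(self, name):
        """Metadata-only change: old values simply stop being referenced."""
        self.columns = [c for c in self.columns if c.name != name]
        self.version += 1


def read_row(stored, schema):
    """Project a row stored as {unique_id: value} onto the newest schema."""
    return {c.name: stored.get(c.unique_id, c.default) for c in schema.columns}
```

Because ids are never reused in this sketch, dropping a column and later adding one with the same name does not resurrect the old values; the re-added column gets a fresh id and rows written earlier fall back to its default.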
41f9ee2f9e mem_tracker_factor_v2 (#10743) 2022-07-12 18:09:41 +08:00
eb079950cb [feature-wip] (array-type) add the array_distinct function (#10388)
* add the array_distinct function

* add the support for decimal and update variable names

* add docs and regression test for array_distinct function

Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-12 17:02:42 +08:00
5f99a95816 fix use released memory bugs (#10777)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-12 16:54:32 +08:00
2084d8bdf3 [feature-wip](unique-key-merge-on-write) Add delete bitmap for DSIP-018 (#10548)
Add delete bitmap for DSIP-018: Support Merge-On-Write implementation for UNIQUE KEY data model
2022-07-12 16:34:42 +08:00
4e9d5a7f7a optimize substr performance and fix ASAN global buffer overflow (#10442)
* add volnitsky substr algorithm

* replace std::search with volnitsky search algorithm in StringSearch

* optimize substring for constant_substring_fn case
use long run length search for performance
2022-07-12 08:36:21 +08:00
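
The #10442 message names the Volnitsky substring algorithm as the replacement for std::search. The sketch below is a simplified Python rendering of the general idea (index the needle's bigrams, then probe the haystack every len(needle) - 1 bytes), not the code from that commit. The function name volnitsky_find is hypothetical, and a plain dict stands in for the fixed-size hash table a native implementation would normally use.

```python
def volnitsky_find(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first occurrence of needle, or -1."""
    n, m = len(haystack), len(needle)
    # Bigram probing needs at least two bytes; fall back for trivial cases.
    if m < 2 or n < m:
        return haystack.find(needle)

    # Map each bigram of the needle to its offsets, largest offset first,
    # so that candidates at one probe are tried from left to right.
    table = {}
    for i in range(m - 2, -1, -1):
        table.setdefault(needle[i:i + 2], []).append(i)

    step = m - 1   # any occurrence contains exactly one probed bigram
    pos = m - 2    # first probe: last bigram of a match starting at 0
    while pos + 2 <= n:
        for off in table.get(haystack[pos:pos + 2], ()):
            start = pos - off
            if start >= 0 and haystack[start:start + m] == needle:
                return start
        pos += step
    return -1
```

Most probes look up a bigram that does not occur in the needle and skip ahead immediately, which is where the speedup over a byte-by-byte scan would come from for longer needles.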
f5036fea63 [enhancement][multi-catalog]Add strong checker for hms table (#10724) 2022-07-11 23:48:15 +08:00
aea41c2ffa support different python version in build-support scripts (#10378)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-11 23:23:45 +08:00
62fba5aa5d [Fix](distribution) fix random tablet (#10756)
_compute_tablet_index captured a local variable
2022-07-11 23:12:56 +08:00
3b9cb524bc [feature-wip](unique-key-merge-on-write) Add rowset tree, based on interval-tree, DSIP-018[3/3] (#10714)
* port rowset-tree from kudu

* use shared_ptr

* some update

* add mock rowset

* some compatibility update

* fix ut fail

* reformat code
2022-07-11 23:10:38 +08:00
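
The #10714 title describes a rowset tree built on an interval tree. As a rough illustration of why that structure suits the job, the sketch below implements a small centered interval tree and a point query ("which rowsets' key ranges contain this key?"). The class name IntervalNode, the helper find_rowsets, and the (min_key, max_key, rowset_name) tuple layout are all hypothetical; the ported Kudu code is not reproduced here.

```python
class IntervalNode:
    """Centered interval tree over closed intervals (lo, hi, name)."""

    def __init__(self, intervals):
        # The center is an actual endpoint, so at least one interval
        # overlaps it and both subtrees are strictly smaller.
        points = sorted(p for lo, hi, _ in intervals for p in (lo, hi))
        self.center = points[len(points) // 2]
        here = [iv for iv in intervals if iv[0] <= self.center <= iv[1]]
        left = [iv for iv in intervals if iv[1] < self.center]
        right = [iv for iv in intervals if iv[0] > self.center]
        self.by_lo = sorted(here, key=lambda iv: iv[0])
        self.by_hi = sorted(here, key=lambda iv: iv[1], reverse=True)
        self.left = IntervalNode(left) if left else None
        self.right = IntervalNode(right) if right else None

    def stab(self, key, out):
        """Append the names of all intervals containing key to out."""
        if key <= self.center:
            for lo, _, name in self.by_lo:
                if lo > key:
                    break
                out.append(name)
            if key < self.center and self.left:
                self.left.stab(key, out)
        else:
            for _, hi, name in self.by_hi:
                if hi < key:
                    break
                out.append(name)
            if self.right:
                self.right.stab(key, out)
        return out


def find_rowsets(tree, key):
    """Names of rowsets whose [min_key, max_key] range contains key."""
    return sorted(tree.stab(key, []))
```

A query only walks one root-to-leaf path and scans the overlapping intervals at each node, so it avoids checking every rowset's key range for each lookup.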
88f466ab86 [bugfix] temporarily disable pushing RF to scanner to avoid coredump (#10776) 2022-07-11 22:48:08 +08:00
a266d7b040 [bug](be) fix be _quick_compaction_thread_pool without shutdown. (#10758) 2022-07-11 22:33:56 +08:00
5a54d518dc [Refactor](Nereids) remove generic type from concrete expressions (#10761)
In the past, we used generic types for plans and expressions to support the pattern-matching framework, which allows type inference without unsafe type casts. We then observed that expressions are usually traversed or rewritten through the visitor pattern, so generic types are useless for expressions and only add complexity. We therefore remove generic types from concrete expressions.
2022-07-11 22:30:42 +08:00
27505773f5 add regression test for array functions inside WHERE condition (#10748)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-11 22:18:48 +08:00
195d3b4a5a fix Level1Iterator memory leak (#10772) 2022-07-11 22:00:50 +08:00
9b554be698 [improvement]Division of integer is too slow (#10769) 2022-07-11 19:36:12 +08:00
c51badb1ae [feature-wip](datev2) add FE functions and fix some bugs (#10767) 2022-07-11 19:25:31 +08:00
5eb38467ef [bug](be) be asan core doris::DiskIoMgr::~DiskIoMgr(#10759) (#10760) 2022-07-11 19:04:16 +08:00
deae728fc6 [refactor](nereids) Refine some code snippets (#10672)
Refine some code snippets:
1. Rename: ExpressionUtils::add -> ExpressionUtils::and
2. Reduce temporary objects when combining expressions.
2022-07-11 16:31:38 +08:00
51855633e4 [feature](Nereids): cost and enforcer job in cascades. (#10657)
Issue Number: close #9640

Add enforcer job for cascades.

Inspired by the *NoisePage enforcer job* and the *ORCA paper*.

During this period, we derive physical properties for the plan tree and prune the plan according to the cost.
2022-07-11 15:01:59 +08:00
f21ce35059 [refactor]remove unused private field _profile (#10732) 2022-07-11 14:04:09 +08:00
7fa72406a5 [Doc]Update flink / spark connector download url (#10746) 2022-07-11 14:02:53 +08:00
277a7dd97e [bugfix]ColumnDecimal missed some interfaces about pre-serialization (#10751) 2022-07-11 14:00:58 +08:00
cc279d09a1 [BUG] Wrong result when build size is beyond IN runtime filter threshold (#10735) 2022-07-11 12:19:38 +08:00
639f1cd26c [improvement](parquet-reader) Add some profile for parquet reader (#10740) 2022-07-11 12:19:06 +08:00
d6e6aae6c6 [docs] how-to-contribute remove incubator (#10730) 2022-07-11 12:17:16 +08:00
81101fc1c5 [enhancement](alter) Make alter job more robust by ignoring some task failure (#10719)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-07-11 12:16:48 +08:00
8472ea8324 Revert "[Enhancement] Add column prune support for VOlapScanNode (#10615)" (#10734) 2022-07-11 12:16:08 +08:00
b04a791895 [Enhancement] support compile with jemalloc (#10542)
A test feature to use jemalloc as default malloc.
2022-07-11 12:15:35 +08:00
4cd4f94717 [docs]fix typo in substring docs (#10747) 2022-07-11 11:50:13 +08:00
1dccfa3d84 [enhancement](nereids) make SSB works (#10659)
enhancement
- refactor compute output expression on root fragment in nereids planner
- refactor aggregate plan translator
- refactor aggregate disassemble rule
- slightly refactor sort plan translator
- add exchange node on the top of plan node tree if it is needed
- slightly refactor PhysicalPlanTranslator#translatePlan

fix
- slotDescriptor should not reuse between TupleDescriptors
- expression's nullable now works fine
- remove quotes when parse string literal
- set resolvedTupleExprs in SortNode to control output
- remove the extra column in sortTupleSlotExprs in SortInfo

known issues
- aggregate function must be the top expression in output expression (need project in ExecNode in BE)
- first phase aggregate cannot be converted to stream mode.
- OlapScanNode does not set data partition
- Sort cannot process expressions like 'order by a + 1'; SortInfo is generated in a tricky way and should be refactored when we want to support 'order by a + 1'
- column pruning does not work as expected
2022-07-11 11:33:17 +08:00
a044b5dcc5 [refactor](predicate) refactor predicates in scan node (#10701)
* [refactor](predicate) refactor predicates in scan node

* update
2022-07-11 09:21:01 +08:00
4cb80c5733 [memtracker]fix fix_memtracker_performance_ (#10629) 2022-07-11 08:35:05 +08:00
46662bfee8 [Bug] CTAS varchar length lost (#10738) 2022-07-10 23:51:36 +08:00
9e9d6a4dea [Load][Vectorized] load opt code by change replace and replace_if_not_null do not copy value (#10447)
Optimize load code by changing `replace` and `replace_if_not_null` so that they do not copy the value
2022-07-10 22:04:32 +08:00
502ac4e76b [Load][Vectorized] opt the mem use of aggregate function in load to speed up (#10448)
Optimize the memory use of aggregate functions in load to speed it up
2022-07-10 13:34:25 +08:00
a6e4c88663 [improve](planner): split output expr to multiple line. (#10710)
* [improve](planner): split output expr to multiple line.

+---------------------------------------------------+
| Explain String                                    |
+---------------------------------------------------+
| PLAN FRAGMENT 0                                   |
|   OUTPUT EXPRS:                                   |
|     <slot 9> `user_id`                            |
|     <slot 11> `default_cluster:test`.`tbl`.`date` |
|     <slot 10> `city`                              |
|     <slot 12> `default_cluster:test`.`tbl`.`age`  |
+---------------------------------------------------+

* *: fix UT and regression-test.
2022-07-10 11:35:48 +08:00
7f9eeb8fc3 [BUG] runtime filter core dump (#10716) 2022-07-09 21:36:22 +08:00