Commit Graph

5226 Commits

Author SHA1 Message Date
88f466ab86 [bugfix] temporarily disable pushing RF to scanner to avoid coredump (#10776) 2022-07-11 22:48:08 +08:00
a266d7b040 [bug](be) fix be _quick_compaction_thread_pool without shutdown. (#10758) 2022-07-11 22:33:56 +08:00
5a54d518dc [Refactor](Nereids) remove generic type from concrete expressions (#10761)
In the past, we used generic types for plans and expressions to support the pattern-matching framework, which allowed type inference without unsafe type casts. We then observed that expressions are usually traversed or rewritten with the visitor pattern, so generic types are useless for expressions and only introduce complexity. We therefore remove generic types from concrete expressions.
2022-07-11 22:30:42 +08:00
27505773f5 add regression test for array functions inside WHERE condition (#10748)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-11 22:18:48 +08:00
195d3b4a5a fix Level1Iterator memory leak (#10772) 2022-07-11 22:00:50 +08:00
9b554be698 [improvement]Division of integer is too slow (#10769) 2022-07-11 19:36:12 +08:00
c51badb1ae [feature-wip](datev2) add FE functions and fix some bugs (#10767) 2022-07-11 19:25:31 +08:00
5eb38467ef [bug](be) be asan core doris::DiskIoMgr::~DiskIoMgr(#10759) (#10760) 2022-07-11 19:04:16 +08:00
deae728fc6 [refactor](nereids) Refine some code snippets (#10672)
Refine some code snippets:
1. Rename: ExpressionUtils::add -> ExpressionUtils::and
2. Reduce temporary objects when combining expressions.
2022-07-11 16:31:38 +08:00
51855633e4 [feature](Nereids): cost and enforcer job in cascades. (#10657)
Issue Number: close #9640

Add enforcer job for cascades.

Inspired by the *NoisePage enforcer job* and the *ORCA paper*.

During this period, we derive physical properties for the plan tree and prune the plan according to the cost.
2022-07-11 15:01:59 +08:00
f21ce35059 [refactor]remove unused private field _profile (#10732) 2022-07-11 14:04:09 +08:00
7fa72406a5 [Doc]Update flink / spark connector download url (#10746) 2022-07-11 14:02:53 +08:00
277a7dd97e [bugfix]ColumnDecimal missed some interfaces about pre-serialization (#10751) 2022-07-11 14:00:58 +08:00
cc279d09a1 [BUG] Wrong result when build size is beyond IN runtime filter threshold (#10735) 2022-07-11 12:19:38 +08:00
639f1cd26c [improvement](parquet-reader) Add some profile for parquet reader (#10740) 2022-07-11 12:19:06 +08:00
d6e6aae6c6 [docs] how-to-contribute remove incubator (#10730) 2022-07-11 12:17:16 +08:00
81101fc1c5 [enhancement](alter) Make alter job more robust by ignoring some task failure (#10719)
Co-authored-by: caiconghui1 <caiconghui1@jd.com>
2022-07-11 12:16:48 +08:00
8472ea8324 Revert "[Enhancement] Add column prune support for VOlapScanNode (#10615)" (#10734) 2022-07-11 12:16:08 +08:00
b04a791895 [Enhancement] support compile with jemalloc (#10542)
A test feature to use jemalloc as default malloc.
2022-07-11 12:15:35 +08:00
4cd4f94717 [docs]fix typo in substring docs (#10747) 2022-07-11 11:50:13 +08:00
1dccfa3d84 [enhancement](nereids) make SSB works (#10659)
enhancement
- refactor compute output expression on root fragment in nereids planner
- refactor aggregate plan translator
- refactor aggregate disassemble rule
- slightly refactor sort plan translator
- add exchange node on the top of plan node tree if it is needed
- slightly refactor PhysicalPlanTranslator#translatePlan

fix
- slotDescriptor should not be reused between TupleDescriptors
- expression's nullable now works fine
- remove quotes when parsing string literals
- set resolvedTupleExprs in SortNode to control output
- remove the extra column in sortTupleSlotExprs in SortInfo

known issues
- aggregate function must be the top expression in output expression (need project in ExecNode in BE)
- first phase aggregate could not convert to stream mode.
- OlapScanNode does not set data partition
- Sort cannot process expressions like 'order by a + 1'; SortInfo is generated in a tricky way and should be refactored when we want to support 'order by a + 1'
- column pruning does not work as expected
2022-07-11 11:33:17 +08:00
a044b5dcc5 [refactor](predicate) refactor predicates in scan node (#10701)
* [refactor](predicate) refactor predicates in scan node

* update
2022-07-11 09:21:01 +08:00
4cb80c5733 [memtracker]fix fix_memtracker_performance_ (#10629) 2022-07-11 08:35:05 +08:00
46662bfee8 [Bug] CTAS varchar length lost (#10738) 2022-07-10 23:51:36 +08:00
9e9d6a4dea [Load][Vectorized] load opt code by change replace and replace_if_not_null do not copy value (#10447)
Optimize load code by changing `replace` and `replace_if_not_null` so that they do not copy the value
2022-07-10 22:04:32 +08:00
502ac4e76b [Load][Vectorized] opt the mem use of aggregate function in load to speed up (#10448)
Optimize the memory use of aggregate functions in load to speed it up
2022-07-10 13:34:25 +08:00
a6e4c88663 [improve](planner): split output expr to multiple line. (#10710)
* [improve](planner): split output expr to multiple line.

+---------------------------------------------------+
| Explain String                                    |
+---------------------------------------------------+
| PLAN FRAGMENT 0                                   |
|   OUTPUT EXPRS:                                   |
|     <slot 9> `user_id`                            |
|     <slot 11> `default_cluster:test`.`tbl`.`date` |
|     <slot 10> `city`                              |
|     <slot 12> `default_cluster:test`.`tbl`.`age`  |
+---------------------------------------------------+

* *: fix UT and regression-test.
2022-07-10 11:35:48 +08:00
7f9eeb8fc3 [BUG] runtime filter core dump (#10716) 2022-07-09 21:36:22 +08:00
1f08f2d144 [Bug][Vectorized] Support array function in where pre in volap_scan_node (#10467)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Support array function in where pre in volap_scan_node
2022-07-09 16:22:01 +08:00
24d824a783 [improvement](multi-catalog) Impl parallel for file scanner to improve the scanner performance (#10620)
Add multi-thread support in FileScanNode on BE and implement the file split logic in FE.
2022-07-09 15:52:53 +08:00
d5ea677282 [feature](tracing) Support query tracing to improve doris observability by introducing OpenTelemetry. (#10533)
The collection of query traces is implemented in fe and be, and the spans are exported to zipkin.
DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-012%3A+Introduce+opentelemetry
2022-07-09 15:50:40 +08:00
1112dba525 [be ut]add some case for array type in block_test (#10656)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-09 12:00:42 +08:00
a5efda6882 [chore] Update .gitignore to ignore generated files by be ut (#10713) 2022-07-09 10:45:00 +08:00
3229730933 [refactor]broker rpc timeout configuration parameterization (#10692) 2022-07-09 06:27:02 +08:00
ed4b2140d7 [BUG](datev2) fix bloom filter for datev2 and remove redundant code (#10695) 2022-07-09 06:26:28 +08:00
08384fea1c [BUG] fix DCHECK failed for vectorized InPredicate (#10709) 2022-07-09 06:25:32 +08:00
9c7601841e [Doc]broker load rpc timeout problem FQA (#10698) 2022-07-09 06:24:08 +08:00
e293fbd277 [improvement]pre-serialize aggregation keys (#10700) 2022-07-09 06:21:56 +08:00
c358a43f35 [feature-wip] support parquet predicate push down (#10512) 2022-07-08 23:11:25 +08:00
feeef7e4da [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655)
Add interfaces for segment key bounds. Key bounds will be used to speed up point lookups
on the primary key index of each segment.
For details, see DSIP-018: https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model

KeyBounds will be updated by BetaRowsetWriter and will be used to construct a RowsetTree (based on IntervalTree,
to be added in the next patch).
2022-07-08 21:39:13 +08:00
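
The idea behind segment key bounds in feeef7e4da can be illustrated with a minimal sketch. It assumes each segment records the smallest and largest primary key it holds, so a point lookup can skip every segment whose range cannot contain the key. The names `KeyBounds`, `may_contain`, and `candidate_segments` are hypothetical and are not taken from the patch. Keys are plain strings here, and a linear scan stands in for the interval-tree-based RowsetTree that the commit says will arrive later.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-segment bounds: the smallest and
// largest primary key stored in one segment.
struct KeyBounds {
    std::string min_key;
    std::string max_key;

    // A key can only live in this segment if it falls inside [min_key, max_key].
    bool may_contain(const std::string& key) const {
        return min_key <= key && key <= max_key;
    }
};

// Returns the indexes of the segments whose bounds cover `key`.
// Only these segments need a lookup in their primary key index.
std::vector<std::size_t> candidate_segments(const std::vector<KeyBounds>& segments,
                                            const std::string& key) {
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < segments.size(); ++i) {
        // Bounds are expected to be well formed (assumption of this sketch).
        assert(segments[i].min_key <= segments[i].max_key);
        if (segments[i].may_contain(key)) {
            hits.push_back(i);
        }
    }
    return hits;
}
```

With segments covering `a..f`, `d..k`, and `m..z`, a lookup for `e` only has to probe the first two segments, and a lookup for `l` probes none. The linear scan costs O(n) per lookup; an interval tree would reduce that when there are many segments.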
2b2bf017f8 [enhancement](regression-test) add real data path for regression test. (#10577)
In some situations, we need to compare the real result with
the previous result for analysis.
2022-07-08 20:51:23 +08:00
Pxl
f58a071605 [Bug][Function] pass intermediate argument list to be (#10650) 2022-07-08 20:50:05 +08:00
6f29a8ac0d [refactor] update stop_be.sh to avoid error message (#10691)
* update stop_be.sh to avoid error message

* update stop_be.sh
2022-07-08 20:49:00 +08:00
e6da00bb26 [feature](nereides) support sort translator (#10678)
Physical sort:
1. Build sortInfo.
   There are two types of slotRef:
   one is generated by the previous node, collectively called old;
   the other is newly generated by the sort node, collectively called new.
   Filling of sortInfo-related data structures:
   a. ordering uses newSlotRef.
   b. sortTupleSlotExprs uses oldSlotRef.
2. Create sortNode.
3. Create mergeFragment.

TODO:
1. Currently, columns that do not exist in select but exist in order by cannot be parsed.
e.g.: select key from table order by value;

2. For the combination of Literal and SlotReference in select, there is a problem with parsing,
e.g.: select key, (10-value) from table;
2022-07-08 19:22:48 +08:00
2b1d8ac28a [Doc] add flink-doris-connector 1.1.0 doc (#10660)
* add flink connector 1.1.0 doc

* update

Co-authored-by: wudi <>
2022-07-08 17:40:23 +08:00
35a282fd61 [BugFix] Column datas doesn't match nullmap when vectorization load (#10684)
* block column doesn't match nullmap

* remove _nullmap+_row_pos in convertor_to_olap
2022-07-08 17:39:44 +08:00
eeee036cba [fix](optimizer) join reorder may cause column non-existence problem (#10670)
for example:
select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b;
If t3 is a large table, it is placed first after reorderTable,
and reanalysis then reports that t2.b does not exist.
2022-07-08 17:28:32 +08:00
d127cfeea2 [docs]fix keywords in sql-functions help documents (#10671)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-08 16:22:47 +08:00
7c330e38d9 [fix](multi-catalog)Fix coredump when reading the parquet file for multi-thread (#10635)
Two issues are fixed in this PR:

**The first issue** is the C++ rule `do not call virtual functions in a constructor or destructor`.
The destructor of `ArrowReaderWrap` calls the virtual function `close()`.
During destruction, it never calls `ParquetReaderWrap::close()`; it only calls `ArrowReaderWrap::close()`.

**The second issue** is the concurrent destruction of `ParquetReaderWrap` and `prefetch_batch`.
`prefetch_batch` uses `thread.detach()` to separate control from `ParquetReaderWrap`, but it relies on some variables from `ParquetReaderWrap`, such as `_closed`, `_total_groups`, and `_reader`.

In this case, `ParquetReaderWrap` may be destructed before `prefetch_batch` finishes, which causes the core dump.
2022-07-08 14:54:10 +08:00
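
The first issue described in 7c330e38d9 is standard C++ behaviour and can be reproduced in a minimal sketch. `BaseReader`, `DerivedReader`, and `FixedReader` are hypothetical classes standing in for `ArrowReaderWrap` and `ParquetReaderWrap`; they are not the Doris code. The `FixedReader` variant shows one possible remedy (cleaning up in the derived destructor), which is not necessarily the one the PR chose.

```cpp
#include <string>
#include <vector>

// Records which close() implementation ran (helper for this sketch only).
using Log = std::vector<std::string>;

class BaseReader {
public:
    explicit BaseReader(Log& log) : _log(log) {}
    virtual ~BaseReader() {
        // When this body runs, the derived part is already destroyed, so the
        // call resolves to BaseReader::close() and never to an override.
        close();
    }
    virtual void close() { _log.push_back("BaseReader::close"); }

protected:
    Log& _log;
};

// Relies on the base destructor to call its override: this does not work.
class DerivedReader : public BaseReader {
public:
    explicit DerivedReader(Log& log) : BaseReader(log) {}
    void close() override { _log.push_back("DerivedReader::close"); }
};

// One possible remedy: the derived class cleans up in its own destructor,
// through a non-virtual helper that is safe to call at that point.
class FixedReader : public BaseReader {
public:
    explicit FixedReader(Log& log) : BaseReader(log) {}
    ~FixedReader() override { close_derived(); }
    void close() override { close_derived(); }

private:
    void close_derived() {
        if (!_closed) {
            _closed = true;
            _log.push_back("FixedReader::close");
        }
    }
    bool _closed = false;
};

// Create and destroy one reader, then report which close() calls happened.
Log destroy_derived() {
    Log log;
    { DerivedReader reader(log); }
    return log;
}

Log destroy_fixed() {
    Log log;
    { FixedReader reader(log); }
    return log;
}
```

`destroy_derived()` records only `BaseReader::close`, so the derived cleanup is silently skipped. `destroy_fixed()` records `FixedReader::close` followed by `BaseReader::close`, because the derived destructor runs before the base one.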
43915936b6 [refactor] add evaluate_and_vec() for ComparisonPredicateBase (#10631) 2022-07-08 14:47:37 +08:00