Commit Graph

2235 Commits

Author SHA1 Message Date
88f466ab86 [bugfix] temporarily disable pushing RF to scanner to avoid coredump (#10776) 2022-07-11 22:48:08 +08:00
a266d7b040 [bug](be) fix be _quick_compaction_thread_pool without shutdown. (#10758) 2022-07-11 22:33:56 +08:00
195d3b4a5a fix Level1Iterator memory leak (#10772) 2022-07-11 22:00:50 +08:00
9b554be698 [improvement]Division of integer is too slow (#10769) 2022-07-11 19:36:12 +08:00
5eb38467ef [bug](be) be asan core doris::DiskIoMgr::~DiskIoMgr(#10759) (#10760) 2022-07-11 19:04:16 +08:00
f21ce35059 [refactor]remove unused private field _profile (#10732) 2022-07-11 14:04:09 +08:00
277a7dd97e [bugfix]ColumnDecimal missed some interfaces about pre-serialization (#10751) 2022-07-11 14:00:58 +08:00
cc279d09a1 [BUG] Wrong result when build size is beyond IN runtime filter threshold (#10735) 2022-07-11 12:19:38 +08:00
639f1cd26c [improvement](parquet-reader) Add some profile for parquet reader (#10740) 2022-07-11 12:19:06 +08:00
8472ea8324 Revert "[Enhancement] Add column prune support for VOlapScanNode (#10615)" (#10734) 2022-07-11 12:16:08 +08:00
b04a791895 [Enhancement] support compile with jemalloc (#10542)
A test feature to use jemalloc as default malloc.
2022-07-11 12:15:35 +08:00
a044b5dcc5 [refactor](predicate) refactor predicates in scan node (#10701)
* [refactor](predicate) refactor predicates in scan node

* update
2022-07-11 09:21:01 +08:00
4cb80c5733 [memtracker]fix fix_memtracker_performance_ (#10629) 2022-07-11 08:35:05 +08:00
9e9d6a4dea [Load][Vectorized] load opt code by change replace and replace_if_not_null do not copy value (#10447)
Optimize the load code by changing `replace` and `replace_if_not_null` so that they do not copy the value.
2022-07-10 22:04:32 +08:00
502ac4e76b [Load][Vectorized] opt the mem use of aggregate function in load to speed up (#10448)
Optimize the memory use of aggregate functions in load to speed it up.
2022-07-10 13:34:25 +08:00
7f9eeb8fc3 [BUG] runtime filter core dump (#10716) 2022-07-09 21:36:22 +08:00
1f08f2d144 [Bug][Vectorized] Support array function in where pre in volap_scan_node (#10467)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Support array function in where pre in volap_scan_node
2022-07-09 16:22:01 +08:00
24d824a783 [improvement](multi-catalog) Impl parallel for file scanner to improve the scanner performance (#10620)
Add multi-thread support to FileScanNode on BE and implement the file split logic in FE.
2022-07-09 15:52:53 +08:00
d5ea677282 [feature](tracing) Support query tracing to improve doris observability by introducing OpenTelemetry. (#10533)
The collection of query traces is implemented in fe and be, and the spans are exported to zipkin.
DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-012%3A+Introduce+opentelemetry
2022-07-09 15:50:40 +08:00
ed4b2140d7 [BUG](datev2) fix bloom filter for datev2 and remove redundant code (#10695) 2022-07-09 06:26:28 +08:00
08384fea1c [BUG] fix DCHECK failed for vectorized InPredicate (#10709) 2022-07-09 06:25:32 +08:00
e293fbd277 [improvement]pre-serialize aggregation keys (#10700) 2022-07-09 06:21:56 +08:00
c358a43f35 [feature-wip] support parquet predicate push down (#10512) 2022-07-08 23:11:25 +08:00
feeef7e4da [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655)
Add interfaces for segment key bounds. Key bounds will be used to speed up point lookups
on the primary key index of each segment.
For details, see DSIP-018: https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model

KeyBounds will be updated by BetaRowsetWriter and will be used to construct a RowsetTree (based on IntervalTree,
which will be added in the next patch).
2022-07-08 21:39:13 +08:00
Pxl
f58a071605 [Bug][Function] pass intermediate argument list to be (#10650) 2022-07-08 20:50:05 +08:00
35a282fd61 [BugFix] Column datas doesn't match nullmap when vectorization load (#10684)
* block column doesn't match nullmap

* remove _nullmap+_row_pos in convertor_to_olap
2022-07-08 17:39:44 +08:00
7c330e38d9 [fix](multi-catalog)Fix coredump when reading the parquet file for multi-thread (#10635)
Two issues are fixed in this PR:

**The first issue** violates the C++ rule `do not call virtual functions in a constructor or destructor`.
The destructor of `ArrowReaderWrap` calls the virtual function `close()`.
During destruction, that call never reaches `ParquetReaderWrap::close()`; it only runs `ArrowReaderWrap::close()`.
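
The dispatch behavior behind the first issue can be reproduced in isolation. The following is a minimal sketch, not the Doris code: `BaseReader`, `LeakyReader`, `FixedReader`, and `destroy_and_trace` are hypothetical names introduced only for illustration, and the explicit call in the derived destructor is one possible remedy, not necessarily the one used in the PR.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a base reader whose destructor calls a virtual close().
class BaseReader {
public:
    explicit BaseReader(std::vector<std::string>* log) : _log(log) {}
    virtual ~BaseReader() {
        // By the time this runs, the derived part of the object is already gone,
        // so the call resolves to BaseReader::close(), never to an override.
        close();
    }
    virtual void close() { _log->push_back("BaseReader::close"); }

protected:
    std::vector<std::string>* _log;
};

// Relies on the base destructor to close: the override below is never reached.
class LeakyReader : public BaseReader {
public:
    using BaseReader::BaseReader;
    void close() override { _log->push_back("LeakyReader::close"); }
};

// Assumed remedy: the derived destructor closes its own resources explicitly.
class FixedReader : public BaseReader {
public:
    using BaseReader::BaseReader;
    ~FixedReader() override { FixedReader::close(); }
    void close() override { _log->push_back("FixedReader::close"); }
};

// Destroys a Reader through a base pointer and returns the close() calls observed.
template <typename Reader>
std::vector<std::string> destroy_and_trace() {
    std::vector<std::string> log;
    {
        std::unique_ptr<BaseReader> reader = std::make_unique<Reader>(&log);
    }  // reader is destroyed here
    return log;
}
```

Calling `destroy_and_trace<LeakyReader>()` records only `BaseReader::close`, while `destroy_and_trace<FixedReader>()` records `FixedReader::close` followed by `BaseReader::close`. In real code `close()` would usually be made idempotent so that the second call is harmless.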

**The second issue** is the concurrent destruction of `ParquetReaderWrap` and `prefetch_batch`.
`prefetch_batch` uses `thread.detach()` to run independently of `ParquetReaderWrap`, but it still relies on variables owned by `ParquetReaderWrap`, such as **`_closed`, `_total_groups`, and `_reader`**.

In this case, `ParquetReaderWrap` may be destroyed while `prefetch_batch` is still running, which causes the core dump.
2022-07-08 14:54:10 +08:00
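
The lifetime problem described in the second issue of the commit above (a detached thread outliving the object whose members it reads) can be avoided by having the owner join its worker before its members are destroyed. The sketch below illustrates that general pattern under stated assumptions: `PrefetchingReader`, `prefetch_loop`, and `prefetch_all` are hypothetical names, the "prefetch" is reduced to a counter, and this is not claimed to be the fix applied in the PR.

```cpp
#include <atomic>
#include <thread>

// Hypothetical owner of a background prefetch loop; not Doris's ParquetReaderWrap.
class PrefetchingReader {
public:
    explicit PrefetchingReader(int total_groups)
            : _total_groups(total_groups), _worker([this] { prefetch_loop(); }) {}

    ~PrefetchingReader() {
        // Signal the worker and wait for it, so it can never touch the members
        // below after they have been destroyed. A detached thread gives no such
        // guarantee.
        _closed.store(true);
        if (_worker.joinable()) {
            _worker.join();
        }
    }

    int fetched() const { return _fetched.load(); }

private:
    void prefetch_loop() {
        // Stands in for reading row groups: stops when closed or when done.
        while (!_closed.load() && _fetched.load() < _total_groups) {
            _fetched.fetch_add(1);
        }
    }

    std::atomic<bool> _closed{false};
    const int _total_groups;
    std::atomic<int> _fetched{0};
    std::thread _worker;  // declared last so it starts after the other members exist
};

// Waits until every group has been prefetched, then lets the reader go out of scope.
int prefetch_all(int total_groups) {
    PrefetchingReader reader(total_groups);
    while (reader.fetched() < total_groups) {
        std::this_thread::yield();
    }
    return reader.fetched();
}
```

Destroying a `PrefetchingReader` early, before the loop has finished, is also safe in this sketch: the destructor sets `_closed`, the loop exits, and `join()` returns before any member is released.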
43915936b6 [refactor] add evaluate_and_vec() for ComparisonPredicateBase (#10631) 2022-07-08 14:47:37 +08:00
e37d29485f [Enhancement] Add column prune support for VOlapScanNode (#10615) 2022-07-08 13:56:26 +08:00
fe8acdb268 [feature-wip](array-type) add agg function collect_list and collect_set (#10606)
Add code for collect_list and collect_set and update the regression output, since the output format for ARRAY(string) had already changed.

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-08 12:48:46 +08:00
331fa50501 [feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280)
This PR supports rowset-level data upload on the BE side, so that a tablet can hold both cold data and hot data,
and there is no need to prohibit loading new data into cooled tablets.

Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.

The abstracted `RemoteFileSystem` can try local caching strategies with different granularity,
instead of caching segment files as before.

To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
2022-07-08 12:18:39 +08:00
03296aedd5 [BUG] fix core dump caused by runtime filter (#10611) 2022-07-08 08:28:39 +08:00
853f85aea4 [enhancement] improve performance of week() and yearweek() (#10633) 2022-07-08 08:26:58 +08:00
c583d3e27c [fix][vectorized] Fix bug of VInPredicate on date type (#10663) 2022-07-07 22:15:33 +08:00
8012d63ea0 [fix] substr('', 1, 5) return empty string instead of null (#10622) 2022-07-06 22:51:02 +08:00
29d4809c80 [BugFix](Array) fix DataTypeArray to_string use after free (#10640)
ColumnArray::convert_to_full_column_if_const overrides the base function
and ColumnArray::create generates a temporary variable
2022-07-06 18:18:00 +08:00
cff9ffa0e1 fix the inaccurate comments (#10617)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-06 17:54:43 +08:00
b4c5dfc28e [Improvement] remove redundant code of VOlapScanner (#10621) 2022-07-06 17:54:10 +08:00
a7df6e3dee rename some files inside vec/sink dir (#10636)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-06 17:52:47 +08:00
006283c036 [Fix] select nested type of string within type array should be wrapped with '' in vectorized path (#10498) 2022-07-06 10:47:36 +08:00
8e364fb848 [fix](load) skip empty orc file (#10593)
Sometimes the upstream system (e.g. Hive) may create an empty ORC file
which has only a header and footer, without a schema.
If we call `_reader->createRowReader()` with selected columns,
it will throw ParserError: Invalid column selected xx.
So here we first check the number of rows and skip such files.

This is only a fix for non-vec load. Vec load uses the arrow scanner
to read ORC files, which does not have this problem.
2022-07-05 22:18:56 +08:00
89e56ea67f [refactor] remove alpha rowset related code and vectorized row batch related code (#10584) 2022-07-05 20:33:34 +08:00
3e87960202 [bugfix] fix bug of vhash join build (#10614)
* [bugfix] fix bug of vhash join build

* format code
2022-07-05 19:14:42 +08:00
86502b014d [feature-wip](unique-key-merge-on-write)port IntervalTree from kudu (#10511)
See DSIP-018: https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model
This patch is for step 3.1 in scheduling.
2022-07-05 17:43:01 +08:00
575bf18d55 [enhancement] speed up week_of_year by pre_calc table (#10586) 2022-07-05 15:37:02 +08:00
585d42330c [BUG] fix bug in bloom filter for datev2 (#10579) 2022-07-05 11:10:03 +08:00
a2f74bf260 [Improvement] remove profile with poor readability (#10581) 2022-07-05 11:09:23 +08:00
73ba806046 [feature-wip](multi-catalog) Add catalog to information_schema table "columns". (#10592) 2022-07-05 09:57:19 +08:00
570139e332 [fix][be] Delete uncivilized comments. (#10578) 2022-07-04 22:35:15 +08:00
1f1bdaa9c3 [bugfix] fix coredump of left anti join (#10591) 2022-07-04 22:29:41 +08:00