Commit Graph

5948 Commits

Author SHA1 Message Date
1f08f2d144 [Bug][Vectorized] Support array function in where pre in volap_scan_node (#10467)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
Support array function in where pre in volap_scan_node
2022-07-09 16:22:01 +08:00
24d824a783 [improvement](multi-catalog) Impl parallel for file scanner to improve the scanner performance (#10620)
Add multi-thread support to FileScanNode on BE and implement the file split logic in FE.
2022-07-09 15:52:53 +08:00
d5ea677282 [feature](tracing) Support query tracing to improve doris observability by introducing OpenTelemetry. (#10533)
The collection of query traces is implemented in fe and be, and the spans are exported to zipkin.
DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-012%3A+Introduce+opentelemetry
2022-07-09 15:50:40 +08:00
1112dba525 [be ut]add some case for array type in block_test (#10656)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-09 12:00:42 +08:00
a5efda6882 [chore] Update .gitignore to ignore generated files by be ut (#10713) 2022-07-09 10:45:00 +08:00
3229730933 [refactor]broker rpc timeout configuration parameterization (#10692) 2022-07-09 06:27:02 +08:00
ed4b2140d7 [BUG](datev2) fix bloom filter for datev2 and remove redundant code (#10695) 2022-07-09 06:26:28 +08:00
08384fea1c [BUG] fix DCHECK failed for vectorized InPredicate (#10709) 2022-07-09 06:25:32 +08:00
9c7601841e [Doc]broker load rpc timeout problem FQA (#10698) 2022-07-09 06:24:08 +08:00
e293fbd277 [improvement]pre-serialize aggregation keys (#10700) 2022-07-09 06:21:56 +08:00
c358a43f35 [feature-wip] support parquet predicate push down (#10512) 2022-07-08 23:11:25 +08:00
feeef7e4da [feature-wip](unique-key-merge-on-write) add interface for segment key bounds, DSIP-018[3/2] (#10655)
Add interfaces for segment key bounds. Key bounds will be used to speed up point lookups
on the primary key index of each segment.
For details, see DSIP-018: https://cwiki.apache.org/confluence/display/DORIS/DSIP-018%3A+Support+Merge-On-Write+implementation+for+UNIQUE+KEY+data+model

KeyBounds will be updated by BetaRowsetWriter and used to construct a RowsetTree (based on IntervalTree,
to be added in the next patch).
2022-07-08 21:39:13 +08:00
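A minimal sketch of how per-segment key bounds can narrow a point lookup. It is not taken from the patch: the `SegmentKeyBounds` class, the `candidate_segments` helper, and the byte-string key encoding are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SegmentKeyBounds:
    """Hypothetical record: smallest and largest primary key in one segment."""
    segment_id: int
    min_key: bytes
    max_key: bytes

    def may_contain(self, key: bytes) -> bool:
        # Assumption: keys are encoded so that byte order matches key order.
        return self.min_key <= key <= self.max_key


def candidate_segments(bounds: List[SegmentKeyBounds], key: bytes) -> List[int]:
    """Return ids of segments whose key range could contain `key`."""
    return [b.segment_id for b in bounds if b.may_contain(key)]


bounds = [
    SegmentKeyBounds(0, b"a001", b"a500"),
    SegmentKeyBounds(1, b"a400", b"b100"),
    SegmentKeyBounds(2, b"c000", b"c999"),
]

# Only segments whose bounds cover the key need an index probe.
print(candidate_segments(bounds, b"a450"))
```

A linear scan is enough for this sketch. An interval tree, as the commit message mentions for RowsetTree, would answer the same question without visiting every range.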
2b2bf017f8 [enhancement](regression-test) add real data path for regression test. (#10577)
In some situations, we need to compare the real result with
the previous result for analysis.
2022-07-08 20:51:23 +08:00
Pxl
f58a071605 [Bug][Function] pass intermediate argument list to be (#10650) 2022-07-08 20:50:05 +08:00
6f29a8ac0d [refactor] update stop_be.sh to avoid error message (#10691)
* update stop_be.sh to avoid error message

* update stop_be.sh
2022-07-08 20:49:00 +08:00
e6da00bb26 [feature](nereides) support sort translator (#10678)
Physical sort:
1. Build sortInfo
   There are two types of slotRef:
   one is generated by the previous node, collectively called old;
   the other is newly generated by the sort node, collectively called new.
   When filling the sortInfo-related data structures:
   a. ordering uses newSlotRef.
   b. sortTupleSlotExprs uses oldSlotRef.
2. Create sortNode
3. Create mergeFragment

TODO:
1. Currently, columns that appear in ORDER BY but not in SELECT cannot be parsed,
e.g.: select key from table order by value;

2. Combining a Literal and a SlotReference in SELECT causes a parsing problem,
e.g.: select key, (10-value) from table;
2022-07-08 19:22:48 +08:00
2b1d8ac28a [Doc] add flink-doris-connector 1.1.0 doc (#10660)
* add flink connector 1.1.0 doc

* update

Co-authored-by: wudi <>
2022-07-08 17:40:23 +08:00
35a282fd61 [BugFix] Column datas doesn't match nullmap when vectorization load (#10684)
* block column doesn't match nullmap

* remove _nullmap+_row_pos in convertor_to_olap
2022-07-08 17:39:44 +08:00
eeee036cba [fix](optimizer) join reorder may cause column non-existence problem (#10670)
for example:
select * from t1 inner join t2 on t1.a = t2.b inner join t3 on t3.c = t2.b;
If t3 is a large table, reorderTable places it first,
and reanalysis then fails because t2.b does not exist.
2022-07-08 17:28:32 +08:00
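The failure mode can be illustrated with a small check that walks a join order and reports the first table an ON condition references before that table has been joined. This is a sketch under assumptions: the `first_unresolved` helper, the `JoinStep` shape, and the way conditions are attached after reordering are hypothetical and are not the optimizer's actual data structures.

```python
from typing import List, Set, Tuple

# (table joined at this step, tables referenced by its ON condition)
JoinStep = Tuple[str, Set[str]]


def first_unresolved(first_table: str, steps: List[JoinStep]) -> str:
    """Return the first table referenced before it is available, or ''."""
    available = {first_table}
    for table, referenced in steps:
        available.add(table)
        missing = sorted(referenced - available)
        if missing:
            return missing[0]
    return ""


# Original order: t1 JOIN t2 ON t1.a = t2.b JOIN t3 ON t3.c = t2.b
original = first_unresolved(
    "t1", [("t2", {"t1", "t2"}), ("t3", {"t3", "t2"})]
)

# Hypothetical reordered plan: t3 comes first, and the condition
# t3.c = t2.b is analyzed before t2 has been joined.
reordered = first_unresolved(
    "t3", [("t1", {"t3", "t2"}), ("t2", {"t1", "t2"})]
)

print(repr(original), repr(reordered))
```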
d127cfeea2 [docs]fix keywords in sql-functions help documents (#10671)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-08 16:22:47 +08:00
7c330e38d9 [fix](multi-catalog)Fix coredump when reading the parquet file for multi-thread (#10635)
Two issues are fixed in this PR:

**The first issue** violates the C++ rule `do not call virtual functions in a constructor or destructor`.
The destructor of `ArrowReaderWrap` calls the virtual function `close()`.
During destruction, this call never reaches `ParquetReaderWrap::close()`; it only calls `ArrowReaderWrap::close()`.

**The second issue** is the concurrent destruction of `ParquetReaderWrap` and `prefetch_batch`.
`prefetch_batch` uses `thread.detach()` to separate its control from `ParquetReaderWrap`, but it relies on some variables owned by `ParquetReaderWrap`, such as **`_closed`, `_total_groups` and `_reader`**.

In this case, `ParquetReaderWrap` may be destroyed before `prefetch_batch` finishes, which causes the core dump.
2022-07-08 14:54:10 +08:00
43915936b6 [refactor] add evaluate_and_vec() for ComparisonPredicateBase (#10631) 2022-07-08 14:47:37 +08:00
e37d29485f [Enhancement] Add column prune support for VOlapScanNode (#10615) 2022-07-08 13:56:26 +08:00
fe8acdb268 [feature-wip](array-type) add agg function collect_list and collect_set (#10606)
Add code for collect_list and collect_set and update the regression output, because the output format for ARRAY(string) had already changed.

Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-08 12:48:46 +08:00
331fa50501 [feature](cold-data) move cold data to object storage without losing any feature(BE) (#10280)
This PR supports rowset-level data upload on the BE side, so that a tablet can hold both cold data and hot data,
and there is no need to prohibit loading new data into cooled tablets.

Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.

The abstracted `RemoteFileSystem` can try local caching strategies with different granularity,
instead of caching segment files as before.

To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
2022-07-08 12:18:39 +08:00
e159e748df [chore](dependency) fix opentelemetry-cpp enable o3 optimization will core. (#10675) 2022-07-08 10:08:07 +08:00
6c3a25bf14 [enhancement](nereids) add betweentocompound rewrite rule for ssb (#10630)
add betweentocompound rewrite rule for ssb.
for example:
1. A BETWEEN X AND Y ==> A >= X AND A <= Y
2. A NOT BETWEEN X AND Y ==> A < X OR A > Y
2022-07-08 10:07:04 +08:00
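The two rewrites listed above can be sketched as a small expression transformation. This is an illustration only: the `Between`, `Compare`, and `Compound` classes and the `between_to_compound` function are hypothetical and do not mirror the Nereids rule's actual classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Between:
    value: str
    low: str
    high: str
    negated: bool = False


@dataclass(frozen=True)
class Compare:
    left: str
    op: str
    right: str


@dataclass(frozen=True)
class Compound:
    op: str  # "AND" or "OR"
    left: Compare
    right: Compare


def between_to_compound(expr: Between) -> Compound:
    """Rewrite [NOT] BETWEEN into a pair of comparisons."""
    if expr.negated:
        # A NOT BETWEEN X AND Y  ==>  A < X OR A > Y
        return Compound(
            "OR",
            Compare(expr.value, "<", expr.low),
            Compare(expr.value, ">", expr.high),
        )
    # A BETWEEN X AND Y  ==>  A >= X AND A <= Y
    return Compound(
        "AND",
        Compare(expr.value, ">=", expr.low),
        Compare(expr.value, "<=", expr.high),
    )


def to_sql(c: Compound) -> str:
    """Render the rewritten predicate as text."""
    lhs = f"{c.left.left} {c.left.op} {c.left.right}"
    rhs = f"{c.right.left} {c.right.op} {c.right.right}"
    return f"{lhs} {c.op} {rhs}"


print(to_sql(between_to_compound(Between("A", "X", "Y"))))
print(to_sql(between_to_compound(Between("A", "X", "Y", negated=True))))
```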
874299f39e [feature-wip](multi-catalog)(fix) federation query failed (#10602)
Fix https://github.com/apache/doris/pull/10521, multi-catalog query failed for two reasons:
1. The `SelectStmt` does not get the correct catalog.
2. External table should have three level aliases.

Disable querying external views.
Support show create table for external table&view.
2022-07-08 08:52:17 +08:00
03296aedd5 [BUG] fix core dump caused by runtime filter (#10611) 2022-07-08 08:28:39 +08:00
853f85aea4 [enhancement] improve performance of week() and yearweek() (#10633) 2022-07-08 08:26:58 +08:00
3ce9e7cfca [enhance](planner): remove redundant field in sort (#10624)
SortInfo is in SortNode, but SortNode also contains some duplicated fields.

Issue Number: close #10616

Remove the redundant fields in `TSortNode` that already exist in `TSortInfo`.

[API-BREAK] This changes the `Thrift` file.
2022-07-07 22:32:07 +08:00
c583d3e27c [fix][vectorized] Fix bug of VInPredicate on date type (#10663) 2022-07-07 22:15:33 +08:00
f03335d61d [action](Nereids): add label auto for nereids UT. (#10665) 2022-07-07 18:21:04 +08:00
a2df5beebb [fix](Nereids): fix ut. (#10658)
fix ut.
2022-07-07 12:00:47 +08:00
8012d63ea0 [fix] substr('', 1, 5) return empty string instead of null (#10622) 2022-07-06 22:51:02 +08:00
8de8a9571a [docs] Fixed description about networks in Quick Start (#10639) 2022-07-06 22:49:43 +08:00
3bf8c761a4 [BUG] Fix invalid return type for left and right function (#10643) 2022-07-06 22:49:19 +08:00
5dfb59844f [enhancement](Nereids)refactor PlannerContext and JobContext (#10485)
Refactor Context in Cascades:
use two contexts in the Cascades framework.

JobContext is used in each job and contains the following attributes:
- reference to PlannerContext
- current cost upper bound
- current required physical properties

PlannerContext holds global information for the query planner and contains the following attributes:
- reference to Memo
- reference to connectContext
- reference to the rule set that can be used for planning
- job pool to maintain unexecuted jobs
- job scheduler to schedule unexecuted jobs
- current job context for the next job to be executed
2022-07-06 18:36:31 +08:00
29d4809c80 [BugFix](Array) fix DataTypeArray to_string use after free (#10640)
ColumnArray::convert_to_full_column_if_const overrides the base function,
and ColumnArray::create generates a temporary variable.
2022-07-06 18:18:00 +08:00
416fb73621 docs format fix for explode-json-array table function (#10613)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-06 17:57:19 +08:00
Pxl
6d092a6d53 set strleft to always_nullable (#10496) 2022-07-06 17:56:01 +08:00
cff9ffa0e1 fix the inaccurate comments (#10617)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-07-06 17:54:43 +08:00
b4c5dfc28e [Improvement] remove redundant code of VOlapScanner (#10621) 2022-07-06 17:54:10 +08:00
d9ba946118 [enhance](*): git ignore package-lock.json. (#10637) 2022-07-06 17:53:22 +08:00
bff561c0da [feature](script) add --grace option for stop_be.sh (#10626)
The BE ASAN memory leak check requires the application to exit gracefully.
2022-07-06 17:53:01 +08:00
a7df6e3dee rename some files inside vec/sink dir (#10636)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-07-06 17:52:47 +08:00
f758e1166a [fix] Fix RewriteBinaryPredicatesRule which causes wrong query results in some cases. (#10551)
During the query planning phase, the binary predicate rewrite that converts a DecimalLiteral to an integer may overflow, producing incorrect values for predicates like "id = 12345678901.0" (see the issue for detailed examples).

This PR fixes the possible overflow and optimizes the case where the DecimalLiteral is outside the value range of the column type.

Issue Number: close #10544
2022-07-06 15:39:27 +08:00
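A minimal sketch of an overflow-safe version of such a rewrite for an equality predicate on a 32-bit integer column. The `rewrite_eq_for_int_column` function, its return shape, and the choice to fold out-of-range literals to a constant false are assumptions for illustration, not the behavior of `RewriteBinaryPredicatesRule` itself.

```python
from decimal import Decimal
from typing import Optional, Tuple

# Assumption: the column is a signed 32-bit INT.
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def rewrite_eq_for_int_column(literal: str) -> Tuple[str, Optional[int]]:
    """Rewrite `int_col = <decimal literal>`.

    Returns ("eq", n) when the literal is an integer within the column
    range, otherwise ("false", None) because no INT value can match.
    """
    value = Decimal(literal)
    if value != value.to_integral_value():
        # A fractional literal can never equal an integer column value.
        return ("false", None)
    n = int(value)  # Python integers do not overflow, so the check below is exact
    if not INT_MIN <= n <= INT_MAX:
        # Range check instead of a narrowing cast that could wrap around.
        return ("false", None)
    return ("eq", n)


print(rewrite_eq_for_int_column("42.0"))
print(rewrite_eq_for_int_column("12345678901.0"))
```

Checking the range before narrowing avoids the wrap-around a fixed-width cast could introduce. How the real rule treats NULL column values is not covered here.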
0b80457c1f [feature](nereids) support like and regexp predicate (#10411)
support like and regexp predicate for nereids.
for example:
select * from t1 where k1 like 'xxx' and k2 regexp '^sa'
2022-07-06 14:32:06 +08:00
006283c036 [Fix] select nested type of string within type array should be wrapped with '' in vectorized path (#10498) 2022-07-06 10:47:36 +08:00
0b9f508379 [fix](nereids) fix ut,check bound should be called recursively on the plan node (#10530)
fix ut,check bound should be called recursively on the plan node
2022-07-06 10:37:05 +08:00