Commit Graph

259 Commits

Author SHA1 Message Date
05da3d947f [feature-wip](new-scan) add scanner scheduling framework (#11582)
Doris currently has many types of ScanNodes, and most of their logic is the same, including:

Runtime filter
Predicate pushdown
Scanner generation and scheduling
So I intend to unify the common logic of all ScanNodes.
Each data source then only needs to implement its own Scanner for data access.
Future scan optimizations can be applied to the scans of all data sources,
and code duplication is reduced as well.

This PR mainly adds 4 new classes:

VScanner
The parent class of all Scanners. Subclasses inherit from this class to implement specific data access methods.

VScanNode
The unified ScanNode, responsible for the common logic including RuntimeFilter, predicate pushdown, and Scanner generation and scheduling.

ScannerContext
ScannerContext records the execution status
of the group of Scanners that corresponds to a ScanNode,
including how many scanners are being scheduled. It also maintains
a producer-consumer block queue between the scanners and the scan node.

ScannerContext is also the scheduling unit of ScannerScheduler.
ScannerScheduler schedules one ScannerContext at a time
and submits its Scanners to the scanner thread pool for data scanning.

ScannerScheduler
The single component responsible for all Scanner scheduling tasks.
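
The relationship among these classes can be pictured with a small sketch. This is an illustrative Python model, not the Doris C++ code: the names `ToyScanner`, `ToyScannerContext`, and `run_scan`, the block size, the `None` end-of-stream sentinel, and the use of `ThreadPoolExecutor` are all assumptions made for the example.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyScanner:
    """Hypothetical stand-in for a VScanner subclass: produces blocks of rows."""

    def __init__(self, rows, block_size=2):
        self.rows = rows
        self.block_size = block_size

    def blocks(self):
        for i in range(0, len(self.rows), self.block_size):
            yield self.rows[i:i + self.block_size]


class ToyScannerContext:
    """Hypothetical stand-in for ScannerContext: tracks how many scanners
    are still running and owns the producer-consumer block queue."""

    def __init__(self, scanners):
        self.scanners = scanners
        self.block_queue = queue.Queue()
        self.remaining = len(scanners)
        self._lock = threading.Lock()
        if not scanners:
            self.block_queue.put(None)  # nothing to scan: signal end at once

    def run_scanner(self, scanner):
        # Producer side: push every block, then report completion.
        for block in scanner.blocks():
            self.block_queue.put(block)
        with self._lock:
            self.remaining -= 1
            if self.remaining == 0:
                self.block_queue.put(None)  # assumed end-of-stream sentinel


def run_scan(ctx, pool_size=2):
    """Plays the scheduler (submits scanners to a thread pool) and the
    scan node (consumes blocks from the queue)."""
    rows = []
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        for scanner in ctx.scanners:
            pool.submit(ctx.run_scanner, scanner)
        while True:
            block = ctx.block_queue.get()
            if block is None:
                break
            rows.extend(block)
    return rows
```

Blocks from different scanners may interleave in any order, so a caller in this sketch should not rely on row order across scanners.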

Test:
This work is still in progress and is disabled by default.
I tested it with JMeter at a concurrency of 50, but the scanner currently just returns without data.
The QPS reaches about 9000.
I can't compare it with the original implementation because no data is read for now. I will test it when the new OLAP scanner is ready.
Co-authored-by: morningman <morningman@apache.org>
2022-08-23 08:45:18 +08:00
caec862d91 [feature](Nereids)add type coercion rule for nereids (#11802)
- add an interface ExpectsInputTypes to Expression
- add an interface ImplicitCastInputTypes to Expression
- add an Expression rewrite rule for type coercion
- add a Check Analysis rule to check whether the Plan is semantically correct

If an Expression implements ImplicitCastInputTypes, the type coercion rule automatically rewrites its children by casting them to the most suitable type.
If an Expression implements ExpectsInputTypes, Check Analysis checks whether its children's types are accepted by the expected input types.
2022-08-22 23:06:02 +08:00
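
The two rules described in the type coercion commit above (#11802) can be sketched roughly as follows. This is an illustrative Python model, not the Nereids Java code: `Literal`, `Cast`, `Add`, `coerce`, and `check` are hypothetical names, and the sketch simplifies "the most suitable type" to a single declared expected type per child.

```python
from dataclasses import dataclass


@dataclass
class Literal:
    value: object
    dtype: str


@dataclass
class Cast:
    child: object
    dtype: str


@dataclass
class Add:
    """Hypothetical expression that declares the input types it expects."""
    left: object
    right: object
    # Class attribute (not a dataclass field): one expected type per child.
    expected_input_types = ("double", "double")

    @property
    def dtype(self):
        return "double"


def coerce(expr):
    """Type coercion sketch: wrap each child whose type differs from the
    expected type in a Cast to that type."""
    children = (expr.left, expr.right)
    new_children = [
        child if child.dtype == expected else Cast(child, expected)
        for child, expected in zip(children, expr.expected_input_types)
    ]
    return Add(*new_children)


def check(expr):
    """Check sketch: reject children whose types are not the expected ones."""
    for child, expected in zip((expr.left, expr.right), expr.expected_input_types):
        if child.dtype != expected:
            raise TypeError(f"expected {expected}, got {child.dtype}")
```

In this sketch, running `check` on an uncoerced `Add(Literal(1, "int"), Literal(2.0, "double"))` raises `TypeError`, while the result of `coerce` passes.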
68e2b3db44 [regression](rollup) Modify test case (#11960) 2022-08-22 19:18:35 +08:00
Pxl
192cdd4d76 [Bug](cast) change binary predicate finally cast to varchar (#11796) 2022-08-21 10:13:47 +08:00
25b427d0c6 [Bugfix](inpredicate) fix in predicate in group by clause may cause NPE (#11886)
2022-08-21 10:03:30 +08:00
d83c11a032 [regression](datev2) add schema change cases for datev2/datetimev2 (#11924) 2022-08-19 21:29:24 +08:00
ffe7af49c8 [fix](array-type) run 'show create table' return null (#11912)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-08-19 21:28:15 +08:00
be7a38e170 [refactor](planner): refactor and replace use NIO (#11645)
* [refactor](planner): refactor equals code in Catalog dir.
2022-08-19 21:26:39 +08:00
f66e42f848 [optimization](array-type) support the decimal/datetime as the nest type of array in print_value (#11784)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-08-19 17:59:09 +08:00
1f9eec5462 [Regression](datev2) Add test cases for datev2/datetimev2 (#11831) 2022-08-19 10:57:55 +08:00
f1ede2aa9d [fix](function) Fix semantic analysis error in window function at first_value (#11855) 2022-08-19 09:13:29 +08:00
Pxl
c0dc51b453 [Bug](Vectorzed alter table)modify schema change cast validate (#11864) 2022-08-18 16:05:48 +08:00
b9dcb60172 [Planner](fix)Fix unexpected index out of bound exception (#11819) 2022-08-18 15:52:54 +08:00
066bc7693e [fix](orderby)disallow hll and bitmap data type in order by list (#11837) 2022-08-18 14:50:25 +08:00
Pxl
cac317430f [Bug](aggregation) fix core dump on 2nd phase aggregate (#11843) 2022-08-18 14:42:34 +08:00
0637c339b1 [fix](array-type) support to insert the largeint in array (#11868)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-08-18 14:41:07 +08:00
cfb90b39c7 (vec-stream-load-json) simdjson throw execption lead to core dump (#11880)
When config::enable_simdjson_parser=true in vectorized stream load, a core dump may occur if the JSON input is an invalid string such as '{ "a', or if all the fields are null, as in '{}'. In these cases the simdjson library may throw an unhandled exception such as `Objects and arrays can only be iterated when they are first encountered`. We should handle these cases.

Signed-off-by: eldenmoon <15605149486@163.com>
2022-08-18 10:27:34 +08:00
6c66bdbf30 [fix](orderby)remove useless null literal in order by (#11821) 2022-08-18 10:10:25 +08:00
ff1971f916 [improvement](test) add dryRun option and group all cases into either p0 or p1 (#11576)
1. add dryRun option to list tests
2. group all cases into p0 p1 p2
2022-08-17 22:45:53 +08:00
3a49156e30 [performance] (vectorization)optimize In Expr (#11826)
Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-08-17 10:46:37 +08:00
a07e153419 [Feature](nereids)support view and nested view (#11589)
Support views in queries
and add a rewrite rule that merges consecutive projects.
The rule merges consecutive projects into one project to improve efficiency.
2022-08-16 19:24:01 +08:00
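
The "merge consecutive projects" rule from the view commit above (#11589) can be illustrated with a minimal sketch. This is a Python model under stated assumptions, not the Nereids implementation: the `Scan` and `Project` classes, the tuple encoding of expressions (`("col", name)`, `("lit", value)`, `("add", e1, e2)`), and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    table: str


@dataclass
class Project:
    exprs: dict    # output name -> expression tree (nested tuples)
    child: object


def substitute(expr, mapping):
    """Replace column references with the expressions that define them."""
    kind = expr[0]
    if kind == "col":
        return mapping.get(expr[1], expr)
    if kind == "lit":
        return expr
    return (kind,) + tuple(substitute(e, mapping) for e in expr[1:])


def merge_projects(plan):
    """Collapse a chain of Project nodes into a single Project by inlining
    the lower project's expressions into the upper one."""
    if not isinstance(plan, Project):
        return plan
    child = merge_projects(plan.child)
    if isinstance(child, Project):
        merged = {name: substitute(e, child.exprs) for name, e in plan.exprs.items()}
        return Project(merged, child.child)
    return Project(plan.exprs, child)
```

For example, a project computing `y = x + x` over a project computing `x = a + 1` becomes one project computing `y = (a + 1) + (a + 1)` directly over the scan.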
fadc78c6cf [fix](str_to_date) str_to_date support format without leading zero (#11817)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-08-16 18:23:16 +08:00
f2292a3b1d [Enhancement](array-type) enable_array_type flag update (#11785)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-08-16 14:41:57 +08:00
340ee6af6a (fix)[regression-test] add qt_having1 (#11800) 2022-08-16 14:29:37 +08:00
dc18def456 [regression-test](bitmap) Add regression case for some bitmap funcations (#11783)
Co-authored-by: smallhibiscus <844981280>
2022-08-16 14:23:59 +08:00
f3f1bbc48c [fix](agg)disallow group by bitmap or hll data type (#11782)
2022-08-16 09:25:02 +08:00
d37cf0a41b [regression-test](p0) add p0 test cases (#11624)
Add p0 test cases, including:
aggregate
join
union
order by
group by
keyword
arithmetic operators
logical operators
case function
coalesce
between
in
like
limit
where
regexp
window function
runtime filter
schema change
2022-08-15 23:12:07 +08:00
8f98357c0b [fix](array-type) disable cast function to array type on origin exec engine. (#11602)
This commit disables cast to array type on the original exec engine, except for casting varchar to array type.
2022-08-15 21:30:56 +08:00
0f75bd0e38 [fix](delete) fix query result error after delete (#11754)
convert dictionary code for delete predicates.
2022-08-15 17:52:03 +08:00
910d51c76f [fix](update) Fix where clause is not reanalyzed after rewrite (#11723) 2022-08-15 13:24:57 +08:00
8c8f48c4c2 [feature-wip](array-type) add the array_join function (#11406)
This PR adds the array_join function.
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-08-15 11:43:17 +08:00
abd2eb4fa1 [Bug](date function) Fix bug for date format %T (#11729)
2022-08-12 19:29:58 +08:00
408dbf840b [bugfix](schema change) when there is a string column with delete predicate, the schema change may core (#11739)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-08-12 19:29:22 +08:00
ec4347ad39 [enhancement](Nereids) support StatementContext, SET_VAR, and Plan pre/post processor (#11654)
1. Add StatementContext, and rename PlannerContext to CascadesContext. A CascadesContext belongs to a StatementContext, and a StatementContext belongs to a ConnectionContext, with lifecycles increasing in that order. StatementContext can hold state tied to a statement's lifecycle, such as ExpressionId and TableLock. MemoTestUtil simplifies creating a CascadesContext and Memo for tests.
2. Add PlanPreprocessor to process the parsed logical plan before it is copied into the memo, and add PlanPostprocessor to process the physical plan after it is copied out of the memo.
3. Use PlanPreprocessor to process the SET_VAR hint; the class is EliminateLogicalSelectHint.
4. Pass the limit clause in the regression test case, in set_var.groovy
2022-08-12 14:49:11 +08:00
e353be7dcb [Bug](date function) Return null if date format is invalid (#11720) 2022-08-12 14:07:55 +08:00
6c6328fc6d [fix](join)fix outer join bug when a subquery as nullable side #11700 2022-08-12 11:50:15 +08:00
2d5ffac590 [fix](optimization) InferFiltersRule bug: a self inner join on a view, which contains where clause, will cause mis-inference. (#11566) 2022-08-11 17:13:26 +08:00
ea57bf6370 [refactor](delete predicate) Unify delete to segmentiterator (#11650)
* remove seek columns and unify delete columns in rowset reader


Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-08-11 15:12:43 +08:00
180cc35815 [Feature](nereids) support sub-query and alias in FromClause (#11035)
Support sub-queries and aliases for TPC-H, for example:
select * from (select * from (T1) A join T2 as B on T1.id = T2.id) T;
2022-08-11 12:42:19 +08:00
02a3f21b65 [fix](analyzer) InferFilterRule bug: equations in on clause of outer/anti join are not inferable. (#11515) 2022-08-11 09:36:43 +08:00
a153af9698 [chore](regression-test) Add drop table in aggregate_count1 (#11632) 2022-08-10 19:25:43 +08:00
c8418d13b5 [improvement](config)Use session variable to replace configuration for 'enable_function_pushdown' (#11641) 2022-08-10 19:25:02 +08:00
01e4522612 [fix]collect_list/collect_set without GROUP BY for NOT NULL column (#11529)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-08-09 20:49:37 +08:00
df47b6941d [feature-wip](array-type) support the array type in reverse function (#11213)
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-08-09 20:49:09 +08:00
b9f7f63c81 [Fix](planner) Fix wrong planner with count(*) optmizer for cross join optimization (#11569) 2022-08-09 09:01:25 +08:00
7c950c7cd5 [feature](Nereids) support cross join in Nereids (#11502)
support cross join in Nereids

1. add PhysicalNestedLoopJoin
2. Translate PhysicalNestedLoopJoin to CrossJoinNode in PhysicalPlanTranslator
2022-08-08 22:14:27 +08:00
1701ffa7c0 [fix](planner)push constant expr in predicate to outer join's other conjuncts by mistake (#11527)
A constant expr in a predicate should not be pushed to an outer join's other conjuncts.
2022-08-08 20:56:08 +08:00
Fy
647b6e843a [feature](nereids)add InPredicate in expressions (#11264)
1. Add InPredicate expression parser and translator
2. Add regression-test for In predicate (in nereids_syntax)
3. Support NOT EqualTo and NOT InPredicate in ExpressionTranslator#visitNot()
2022-08-08 19:59:54 +08:00
9349746987 [Fix](stream-load-json) fix VJsonReader::_write_data_to_column invalid column type cast when meet null (#11564)
column_ptr will be a non-nullable column pointer after `column_ptr = &nullable_column->get_nested_column()`,
so we should not cast column_ptr to ColumnNullable any more.
2022-08-08 15:57:39 +08:00
Pxl
2cd3bf80dc [bugfix](schema change)fix core dump on vectorized_alter_table (#11538) 2022-08-08 10:45:28 +08:00