Commit Graph

133 Commits

Author SHA1 Message Date
78abb40fdc [improvement](string) throw exception instead of log fatal if string column exceed total size limit (#17989)
Throw exception instead of log fatal if string column exceed total size limit, so that we can catch it and let query fail, instead of causing be exit.
2023-03-27 08:55:26 +08:00
Pxl
40ca250678 [Feature](materialized-view) support where clause on create materialized view (#17534)
support where clause on create materialized view
2023-03-22 11:25:13 +08:00
dc284b62d9 [vectorized](function) support array_filter function (#17832) 2023-03-20 23:18:10 +08:00
Pxl
a92115f709 [Bug](materialized-view) fix select mv rollback fail on left join (#17850)
fix select mv rollback fail on left join
2023-03-20 19:14:17 +08:00
85080ee3c3 [vectorized](function) support array_map function (#17581) 2023-03-15 10:51:29 +08:00
77ab2fac20 [refactor](functioncontext) remove function context impl class (#17715)
* [refactor](functioncontext) remove function context impl class


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-03-14 11:21:45 +08:00
9b7596f1c6 [Feature](Dynamic schema table) step1 support schema change expression (#17494)
1. introduce a new type `VARIANT` to encapsulate dynamic generated columns for hidding the detail of types and names of newly generated columns
2. introduce a new expression `SchemaChangeExpr` for doing schema change for extensibility
2023-03-13 15:12:42 +08:00
6386458498 [Refactor](exec) remove unless attr of slot ref (#17688)
Remove unless attr of slot ref
2023-03-12 23:45:32 +08:00
9213dd906a [enhancement](exception) add exception structure and using unique ptr in VExplodeBitmapTableFunction (#17531)
add exception class in common.
using unique ptr in VExplodeBitmapTableFunction
support single exception or nested exception, like this:

---SingleException
[E-100] test OS_ERROR bug
@ 0x55e80b93c0d9 doris::Exception::Exception<>()
@ 0x55e80b938df1 doris::ExceptionTest_NestedError_Test::TestBody()
@ 0x55e82e16bafb testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x55e82e15ab3a testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x55e82e1361e3 testing::Test::Run()
@ 0x55e82e136f29 testing::TestInfo::Run()
@ 0x55e82e1376e4 testing::TestSuite::Run()
@ 0x55e82e148042 testing::internal::UnitTestImpl::RunAllTests()
@ 0x55e82e16dcab testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x55e82e15ce4a testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x55e82e147bab testing::UnitTest::Run()
@ 0x55e80c4b39e3 RUN_ALL_TESTS()
@ 0x55e80c4a99b5 main
@ 0x7f0a619d0493 __libc_start_main
@ 0x55e80b84602a _start
@ (nil) (unknown)
2023-03-08 10:44:14 +08:00
9477c48ef8 [refactor](functioncontext) remove duplicate type definition in function context (#17421)
remove duplicate type definition in function context
remove unused method in function context
not need stale state in vexpr context because vexpr is stateless and function context saves state and they are cloned.
remove useless slot_size in all tuple or slot descriptor.
remove doris_udf namespace, it is useless.
remove some unused macro definitions.
init v_conjuncts in vscanner, not need write the same code in every scanner.
using unique ptr to manage function context since it could only belong to a single expr context.
Issue Number: close #xxx
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-06 16:07:09 +08:00
823d968452 [fix](expr) avoid crashing caused by big depth of expression tree (#17314) 2023-03-02 16:55:53 +08:00
17f4990bd3 [enhancement](functioncontext) function context should use shared ptr and simply function context (#17311)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-02 16:23:54 +08:00
0732eb54bc [feature](struct-type) support csv format stream load for struct type (#17143)
Refactor from_string method in data_type_struct.cpp to support csv format stream load for struct type.
2023-03-01 15:48:48 +08:00
7229751bd9 [Improve](map-type) Add contains_null for map (#16948)
Add contains_null for map type.
2023-02-23 20:47:26 +08:00
Pxl
2bc014d83a [Enchancement](function) remove unused params on aggregate function (#16886)
remove unused params on aggregate function
2023-02-20 11:08:45 +08:00
8b70bfdc31 [Feature](map-type) Support stream load and fix some bugs for map type (#16776)
1、support stream load with json, csv format for map
2、fix olap convertor when compaction action in map column which has null
3、support select outToFile for map
4、add some regression-test
2023-02-19 15:11:54 +08:00
Pxl
f50edff59d [Chore](build) enable fallthrough check annd fix some fallthrough bug (#16748)
* enable fallthrough check annd fix some fallthrough bug

* fix

* fix
2023-02-15 15:58:43 +08:00
1b3902baa2 [Feature](Complex-type) Add struct and map type to Doris (#16444)
This commit support:
1、Insert + select for struct/map type
2、Json stream load for struct type
3、m[key] function for map type

How to use:
Set the fe config to create table for struct and map type
1、admin set frontend config("enable_struct_type" = "true");
2、admin set frontend config("enable_map_type" = "true");

#16547

Co-authored-by: xy720 <xuyang25@baidu.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2023-02-10 11:00:33 +08:00
7d035486ad [Opt](vec) opt the fast execute logic to remove useless function call (#16532) 2023-02-09 14:12:40 +08:00
9114896178 [DecimalV3](opt) opt the function of decimalv3 to_string logic (#16427) 2023-02-07 13:28:07 +08:00
c3a6eb4f9a [Refactor](function) remove useless function get to create column (#16333)
remove unless create_column to redurce the unless new operator
2023-02-04 22:54:14 +08:00
Pxl
5e4bb98900 [Chore](build) enable -Wpedantic and update lowest gcc version to 11.1 (#16290)
enable -Wpedantic and update lowest gcc version to 11.1
2023-02-03 11:28:48 +08:00
Pxl
0d5b115993 [Feature](Materialized-View) support duplicate base column for diffrent aggregate function (#15837)
support duplicate base column for diffrent aggregate function
2023-02-02 18:57:39 +08:00
aaae1497cd [Refactor](function) opt the exec of function with null column (#16256) 2023-02-01 15:56:31 +08:00
Pxl
ca73c60442 [Chore](build) enable ignored-qualifiers check (#16196)
enable ignored-qualifiers check
2023-02-01 15:15:59 +08:00
471db80f69 [Bug](date) Fix invalid date (#16205)
Issue Number: close #15777
2023-01-31 10:08:44 +08:00
f72efd4523 [Improvement](decimal) do not log fatal when precision is invalid (#16207) 2023-01-30 09:54:22 +08:00
Pxl
46347a51d2 [Bug](exec) enable warning on ignoring function return value for vctx (#16157)
* enable warning on ignoring function return value for vctx
2023-01-29 17:23:21 +08:00
e49766483e [refactor](remove unused code) remove many xxxVal structure (#16143)
remove many xxxVal structure
remove BetaRowsetWriter::_add_row
remove anyval_util.cpp
remove non-vectorized geo functions
remove non-vectorized like predicate
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-01-28 14:17:43 +08:00
79ad74637d [refactor](remove expr) remove non vectorized Expr and ExprContext related codes (#16136) 2023-01-24 10:45:35 +08:00
199d7d3be8 [Refactor]Merged string_value into string_ref (#15925) 2023-01-22 16:39:23 +08:00
3894de49d2 [Enhancement](topn) support two phase read for topn query (#15642)
This PR optimize topn query like `SELECT * FROM tableX ORDER BY columnA ASC/DESC LIMIT N`.

TopN is is compose of SortNode and ScanNode, when user table is wide like 100+ columns the order by clause is just a few columns.But ScanNode need to scan all data from storage engine even if the limit is very small.This may lead to lots of read amplification.So In this PR I devide TopN query into two phase:
1. The first phase we just need to read `columnA`'s data from storage engine along with an extra RowId column called `__DORIS_ROWID_COL__`.The other columns are pruned from ScanNode.
2. The second phase I put it in the ExchangeNode beacuase it's the central node for topn nodes in the cluster.The ExchangeNode will spawn a RPC to other nodes using the RowIds(sorted and limited from SortNode) read from the first phase and read row by row from storage engine.

After the second phase read, Block will contain all the data needed for the query
2023-01-19 10:01:33 +08:00
d5a3e8df3a [Exec](opt) Opt the vexplode_split function performance (#15945) 2023-01-17 19:02:57 +08:00
d062ca2944 [refactor](vectorized) remove unnecessary vectorization check (#15984) 2023-01-17 12:21:46 +08:00
Pxl
81bab55d43 [Bug](function) catch function calculation error on aggregate node to avoid core dump (#15903) 2023-01-16 11:21:28 +08:00
0fbdf8e3e1 [Refactor](table function) Decouple vectorized table functions from non-vectorized ones (#15772) 2023-01-12 15:08:21 +08:00
8f31a36429 [feature] support spill to disk for sort node (#15624) 2023-01-11 08:40:58 +08:00
edecc2e706 [feature-wip](inverted index) API for inverted index reader and syntax for fulltext match (#14211)
* [feature-wip](inverted index)inverted index api: reader

* [feature-wip](inverted index) Fulltext query syntax with MATCH/MATCH_ALL/MATCH_ALL

* [feature-wip](inverted index) Adapt to index meta

* [enhance] add more metrics

* [enhance] add fulltext match query check for column type and index parser

* [feature-wip](inverted index) Support apply inverted index in compound predicate which except leaf node of and node
2022-12-30 21:48:14 +08:00
520b6d7910 [Improvement](decimalv3) Add a config to check overflow for DECIMALV3 (#15463) 2022-12-30 14:02:24 +08:00
305dd15fea [improvement](index) Support bitmap index can be applied with compound predicate when enable vectorized engine query (#13035)
Current bitmap index only can apply pushed down predicates which in AND conditions. When predicates in OR conditions and other complex compound conditions, it will not be pushed down to the storage layer, this leads to read more data.

Based on that situation, this pr will do:

1. this pr in order to support bitmap index apply compound predicates, query sql like:
select * from tb where a > 'hello' or b < 100;
select * from tb where a > 'hello' or b < 100 or c > 'ok';
select * from tb where (a > 'hello' or b <100) and (a < 'world' or b > 200);
select * from tb where (not a> 'hello') or b < 100;
...
above sql,column a and b and c has created bitmap_index.

2. this optimization can reduce reading data by index
3. set config enable_index_apply_compound_predicates to use this optimization
2022-12-28 20:08:57 +08:00
fe562bc3e7 [Bug](Agg) fix crash when encountering not supported agg function like last_value(bitmap) (#15257)
The former logic inside aggregate_function_window.cpp would shutdown BE once encountering agg function with complex type like BITMAP. This pr makes it don't crash and would return one more concrete error message which tells the unsupported function signature to user.
2022-12-23 14:23:21 +08:00
b2438b076d [Bug](case function) do not crash if prepare failed (#15113) 2022-12-16 11:02:05 +08:00
f3aea7f0f0 [Enhancement](status) Unify error code and enable customed err msg for BE internal errors (#14744) 2022-12-11 23:33:18 +08:00
fcea89bcf4 [fix](const_expr) fix coredump caused by unsupported cast const expr (#14825) 2022-12-06 10:31:15 +08:00
8c0e13ab51 [improvement](profile) add detail memory counter for exec nodes (#14806)
* [improvement](profile) improve accuraccy of memory usage and add detail memory counter

* fix
2022-12-05 11:51:52 +08:00
3e8b3658c7 [feature-wip](decimalv3) Support basic agg and arithmetic operations for decimal v3 (#14513) 2022-11-29 15:12:41 +08:00
52c6ba051e [feature](jsonb type)refactor JSONB type using column and add testcase (#13778)
1. Refactor JSONB type using ColumnString instead making a copy.
2. Add regression testcase for JSONB load and functions.
2022-11-26 10:06:15 +08:00
4728e75079 [feature](bitmap) Support in bitmap syntax and bitmap runtime filter (#14340)
1.Support in bitmap syntax, like 'where k1 in (select bitmap_column from tbl)';
2.Support bitmap runtime filter. Generate a bitmap filter using the right table bitmap and push it down to the left table storage layer for filtering.
2022-11-25 15:22:44 +08:00
b04ec41c1d [Vectorized](udaf) fix java-udaf couldn't get jar core dump (#14393)
fix java-udaf couldn't get jar core dump
2022-11-22 20:49:02 +08:00
Pxl
c18a471303 [Optimize](predicate) update inplace on VcompoundPred (#14402)
select count(*) from lineorder where lo_orderkey<100000000 OR lo_orderkey>100000000 AND lo_orderkey<200000000 OR lo_orderkey >200000000;

0.6s -> 0.5s
2022-11-21 09:12:30 +08:00