Commit Graph

145 Commits

Author SHA1 Message Date
79c446c89f [enhancement](exception) Column filter/replicate supports exception safety (#18503) 2023-04-18 19:23:09 +08:00
c904384672 Revert "[refactor](planner) using crchash replace murmurhash in the runtime filter (#18472)" (#18730)
This reverts commit a8315b86ca5543a6cc5b3eab97e4f0953b984247.
2023-04-17 20:25:18 +08:00
5300b21db7 [Bug](DECIMALV3) report failure if a decimal value is overflow (#18336) 2023-04-17 13:18:14 +08:00
2519931a04 [vectorized](function) support time_to_sec function (#18354)
support time_to_sec function
2023-04-13 19:31:12 +08:00
a8315b86ca [refactor](planner) using crchash replace murmurhash in the runtime filter (#18472)
When the be_exec_version is less than 2, murmurhash will still be used, otherwise crc32 will be used. When the be_exec_version is upgraded to 2, please remove.
2023-04-10 14:12:39 +08:00
f38e00b4c0 [refactor](typesystem) using typeindex to create column instead of type name because type name is not stable (#18328)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-04-09 18:08:31 +08:00
4e1cdb9ce7 [fix](agg_sort)fix bug of agg sort group concat with order by(#18447) 2023-04-07 08:42:36 +08:00
4b914c196a [fix](expr pushdown) Fix VRuntimeFilterWrapper cannot get children #18289 2023-04-03 09:09:52 +08:00
Pxl
307170030c [Bug](materialized-view) fix core dump when create mv have case different with base table (#18206)
fix core dump when create mv have case different with base table
2023-03-31 12:32:09 +08:00
7d92bf095a [fix](expr) refractor create_tree_from_thrift to avoid stack overflow (#18214) 2023-03-31 10:38:20 +08:00
Pxl
664fbffcba [Enchancement](table-function) optimization for vectorized table function (#17973) 2023-03-29 10:45:00 +08:00
6b6682cd96 [Enhancement](Expr) Opt In Set by small size fixed container to improve performance. (#17976) 2023-03-28 23:10:39 +08:00
78abb40fdc [improvement](string) throw exception instead of log fatal if string column exceed total size limit (#17989)
Throw exception instead of log fatal if string column exceed total size limit, so that we can catch it and let query fail, instead of causing be exit.
2023-03-27 08:55:26 +08:00
Pxl
40ca250678 [Feature](materialized-view) support where clause on create materialized view (#17534)
support where clause on create materialized view
2023-03-22 11:25:13 +08:00
dc284b62d9 [vectorized](function) support array_filter function (#17832) 2023-03-20 23:18:10 +08:00
Pxl
a92115f709 [Bug](materialized-view) fix select mv rollback fail on left join (#17850)
fix select mv rollback fail on left join
2023-03-20 19:14:17 +08:00
85080ee3c3 [vectorized](function) support array_map function (#17581) 2023-03-15 10:51:29 +08:00
77ab2fac20 [refactor](functioncontext) remove function context impl class (#17715)
* [refactor](functioncontext) remove function context impl class


Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-03-14 11:21:45 +08:00
9b7596f1c6 [Feature](Dynamic schema table) step1 support schema change expression (#17494)
1. introduce a new type `VARIANT` to encapsulate dynamic generated columns for hidding the detail of types and names of newly generated columns
2. introduce a new expression `SchemaChangeExpr` for doing schema change for extensibility
2023-03-13 15:12:42 +08:00
6386458498 [Refactor](exec) remove unless attr of slot ref (#17688)
Remove unless attr of slot ref
2023-03-12 23:45:32 +08:00
9213dd906a [enhancement](exception) add exception structure and using unique ptr in VExplodeBitmapTableFunction (#17531)
add exception class in common.
using unique ptr in VExplodeBitmapTableFunction
support single exception or nested exception, like this:

---SingleException
[E-100] test OS_ERROR bug
@ 0x55e80b93c0d9 doris::Exception::Exception<>()
@ 0x55e80b938df1 doris::ExceptionTest_NestedError_Test::TestBody()
@ 0x55e82e16bafb testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x55e82e15ab3a testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x55e82e1361e3 testing::Test::Run()
@ 0x55e82e136f29 testing::TestInfo::Run()
@ 0x55e82e1376e4 testing::TestSuite::Run()
@ 0x55e82e148042 testing::internal::UnitTestImpl::RunAllTests()
@ 0x55e82e16dcab testing::internal::HandleSehExceptionsInMethodIfSupported<>()
@ 0x55e82e15ce4a testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x55e82e147bab testing::UnitTest::Run()
@ 0x55e80c4b39e3 RUN_ALL_TESTS()
@ 0x55e80c4a99b5 main
@ 0x7f0a619d0493 __libc_start_main
@ 0x55e80b84602a _start
@ (nil) (unknown)
2023-03-08 10:44:14 +08:00
9477c48ef8 [refactor](functioncontext) remove duplicate type definition in function context (#17421)
remove duplicate type definition in function context
remove unused method in function context
not need stale state in vexpr context because vexpr is stateless and function context saves state and they are cloned.
remove useless slot_size in all tuple or slot descriptor.
remove doris_udf namespace, it is useless.
remove some unused macro definitions.
init v_conjuncts in vscanner, not need write the same code in every scanner.
using unique ptr to manage function context since it could only belong to a single expr context.
Issue Number: close #xxx
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-06 16:07:09 +08:00
823d968452 [fix](expr) avoid crashing caused by big depth of expression tree (#17314) 2023-03-02 16:55:53 +08:00
17f4990bd3 [enhancement](functioncontext) function context should use shared ptr and simply function context (#17311)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-03-02 16:23:54 +08:00
0732eb54bc [feature](struct-type) support csv format stream load for struct type (#17143)
Refactor from_string method in data_type_struct.cpp to support csv format stream load for struct type.
2023-03-01 15:48:48 +08:00
7229751bd9 [Improve](map-type) Add contains_null for map (#16948)
Add contains_null for map type.
2023-02-23 20:47:26 +08:00
Pxl
2bc014d83a [Enchancement](function) remove unused params on aggregate function (#16886)
remove unused params on aggregate function
2023-02-20 11:08:45 +08:00
8b70bfdc31 [Feature](map-type) Support stream load and fix some bugs for map type (#16776)
1、support stream load with json, csv format for map
2、fix olap convertor when compaction action in map column which has null
3、support select outToFile for map
4、add some regression-test
2023-02-19 15:11:54 +08:00
Pxl
f50edff59d [Chore](build) enable fallthrough check annd fix some fallthrough bug (#16748)
* enable fallthrough check annd fix some fallthrough bug

* fix

* fix
2023-02-15 15:58:43 +08:00
1b3902baa2 [Feature](Complex-type) Add struct and map type to Doris (#16444)
This commit support:
1、Insert + select for struct/map type
2、Json stream load for struct type
3、m[key] function for map type

How to use:
Set the fe config to create table for struct and map type
1、admin set frontend config("enable_struct_type" = "true");
2、admin set frontend config("enable_map_type" = "true");

#16547

Co-authored-by: xy720 <xuyang25@baidu.com>
Co-authored-by: amory <wangqiannan@selectdb.com>
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2023-02-10 11:00:33 +08:00
7d035486ad [Opt](vec) opt the fast execute logic to remove useless function call (#16532) 2023-02-09 14:12:40 +08:00
9114896178 [DecimalV3](opt) opt the function of decimalv3 to_string logic (#16427) 2023-02-07 13:28:07 +08:00
c3a6eb4f9a [Refactor](function) remove useless function get to create column (#16333)
remove unless create_column to redurce the unless new operator
2023-02-04 22:54:14 +08:00
Pxl
5e4bb98900 [Chore](build) enable -Wpedantic and update lowest gcc version to 11.1 (#16290)
enable -Wpedantic and update lowest gcc version to 11.1
2023-02-03 11:28:48 +08:00
Pxl
0d5b115993 [Feature](Materialized-View) support duplicate base column for diffrent aggregate function (#15837)
support duplicate base column for diffrent aggregate function
2023-02-02 18:57:39 +08:00
aaae1497cd [Refactor](function) opt the exec of function with null column (#16256) 2023-02-01 15:56:31 +08:00
Pxl
ca73c60442 [Chore](build) enable ignored-qualifiers check (#16196)
enable ignored-qualifiers check
2023-02-01 15:15:59 +08:00
471db80f69 [Bug](date) Fix invalid date (#16205)
Issue Number: close #15777
2023-01-31 10:08:44 +08:00
f72efd4523 [Improvement](decimal) do not log fatal when precision is invalid (#16207) 2023-01-30 09:54:22 +08:00
Pxl
46347a51d2 [Bug](exec) enable warning on ignoring function return value for vctx (#16157)
* enable warning on ignoring function return value for vctx
2023-01-29 17:23:21 +08:00
e49766483e [refactor](remove unused code) remove many xxxVal structure (#16143)
remove many xxxVal structure
remove BetaRowsetWriter::_add_row
remove anyval_util.cpp
remove non-vectorized geo functions
remove non-vectorized like predicate
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-01-28 14:17:43 +08:00
79ad74637d [refactor](remove expr) remove non vectorized Expr and ExprContext related codes (#16136) 2023-01-24 10:45:35 +08:00
199d7d3be8 [Refactor]Merged string_value into string_ref (#15925) 2023-01-22 16:39:23 +08:00
3894de49d2 [Enhancement](topn) support two phase read for topn query (#15642)
This PR optimize topn query like `SELECT * FROM tableX ORDER BY columnA ASC/DESC LIMIT N`.

TopN is is compose of SortNode and ScanNode, when user table is wide like 100+ columns the order by clause is just a few columns.But ScanNode need to scan all data from storage engine even if the limit is very small.This may lead to lots of read amplification.So In this PR I devide TopN query into two phase:
1. The first phase we just need to read `columnA`'s data from storage engine along with an extra RowId column called `__DORIS_ROWID_COL__`.The other columns are pruned from ScanNode.
2. The second phase I put it in the ExchangeNode beacuase it's the central node for topn nodes in the cluster.The ExchangeNode will spawn a RPC to other nodes using the RowIds(sorted and limited from SortNode) read from the first phase and read row by row from storage engine.

After the second phase read, Block will contain all the data needed for the query
2023-01-19 10:01:33 +08:00
d5a3e8df3a [Exec](opt) Opt the vexplode_split function performance (#15945) 2023-01-17 19:02:57 +08:00
d062ca2944 [refactor](vectorized) remove unnecessary vectorization check (#15984) 2023-01-17 12:21:46 +08:00
Pxl
81bab55d43 [Bug](function) catch function calculation error on aggregate node to avoid core dump (#15903) 2023-01-16 11:21:28 +08:00
0fbdf8e3e1 [Refactor](table function) Decouple vectorized table functions from non-vectorized ones (#15772) 2023-01-12 15:08:21 +08:00
8f31a36429 [feature] support spill to disk for sort node (#15624) 2023-01-11 08:40:58 +08:00
edecc2e706 [feature-wip](inverted index) API for inverted index reader and syntax for fulltext match (#14211)
* [feature-wip](inverted index)inverted index api: reader

* [feature-wip](inverted index) Fulltext query syntax with MATCH/MATCH_ALL/MATCH_ALL

* [feature-wip](inverted index) Adapt to index meta

* [enhance] add more metrics

* [enhance] add fulltext match query check for column type and index parser

* [feature-wip](inverted index) Support apply inverted index in compound predicate which except leaf node of and node
2022-12-30 21:48:14 +08:00