Commit Graph

27 Commits

Author SHA1 Message Date
a11fd62043 [fix](window function) Fix illegal frame range (#41147) (#41305)
pick #41147

0# doris::signal::(anonymous namespace)::FailureSignalHandler(int,
siginfo_t*, void*) at

/home/zcp/repo_center/doris_master/doris/be/src/common/signal_handler.h:421
 1# 0x00007F591D573520 in /lib/x86_64-linux-gnu/libc.so.6
 2# pthread_kill at ./nptl/pthread_kill.c:89
 3# raise at ../sysdeps/posix/raise.c:27
 4# abort at ./stdlib/abort.c:81
 5# _nl_load_domain at ./intl/loadmsgcat.c:1177
 6# 0x00007F591D56AE96 in /lib/x86_64-linux-gnu/libc.so.6
7# doris::vectorized::PODArray<unsigned char, 4096ul, Allocator<false,
false, false, DefaultMemoryAllocator>, 16ul, 15ul>::operator[](long)
const at

/home/zcp/repo_center/doris_master/doris/be/src/vec/common/pod_array.h:365
8# doris::vectorized::ColumnNullable::is_null_at(unsigned long) const at
/home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_nullable.h:158
9#

doris::vectorized::ReaderFirstAndLastData<doris::vectorized::ColumnVector<double>,
true, true, false>::insert_result_into(doris::vectorized::IColumn&)
const at

/home/zcp/repo_center/doris_master/doris/be/src/vec/aggregate_functions/aggregate_function_reader_first_last.h:125
10# doris::pipeline::AnalyticLocalState::_insert_result_info(long) in
/mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
11# std::_Function_handler<void (long), std::_Bind_result<void, void
(doris::pipeline::AnalyticLocalState::*(doris::pipeline::AnalyticLocalState*,
std::_Placeholder<1>))(long)> >::_M_invoke(std::_Any_data const&,
long&&) at

/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
12# std::function<void (long)>::operator()(long) const at
/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
13# doris::pipeline::AnalyticLocalState::_get_next_for_rows(unsigned
long) in /mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be 14#
std::enable_if<is_invocable_r_v<doris::Status, doris::Status
(doris::pipeline::AnalyticLocalState::*&)(unsigned long),
doris::pipeline::AnalyticLocalState*&, unsigned long>,
doris::Status>::type std::__invoke_r<doris::Status, doris::Status
(doris::pipeline::AnalyticLocalState::*&)(unsigned long),
doris::pipeline::AnalyticLocalState*&, unsigned long>(doris::Status
(doris::pipeline::AnalyticLocalState::*&)(unsigned long),
doris::pipeline::AnalyticLocalState*&, unsigned long&&) at
/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:114
15# doris::Status std::_Bind_result<doris::Status, doris::Status
(doris::pipeline::AnalyticLocalState::*(doris::pipeline::AnalyticLocalState*,
std::_Placeholder<1>))(unsigned long)>::__call<doris::Status, unsigned
long&&, 0ul, 1ul>(std::tuple<unsigned long&&>&&, std::_Index_tuple<0ul,
1ul>) at

/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/functional:570
16# std::_Function_handler<doris::Status (unsigned long),
std::_Bind_result<doris::Status, doris::Status

(doris::pipeline::AnalyticLocalState::*(doris::pipeline::AnalyticLocalState*,
std::_Placeholder<1>))(unsigned long)> >::_M_invoke(std::_Any_data
const&, unsigned long&&) at

/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
17# std::function<doris::Status (unsigned long)>::operator()(unsigned
long) const at

/var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:560
18#

doris::pipeline::AnalyticSourceOperatorX::get_block(doris::RuntimeState*,
doris::vectorized::Block*, bool*) in
/mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
19#

doris::pipeline::OperatorXBase::get_block_after_projects(doris::RuntimeState*,
doris::vectorized::Block*, bool*) at

/home/zcp/repo_center/doris_master/doris/be/src/pipeline/exec/operator.cpp:322
20# doris::pipeline::PipelineTask::execute(bool*) in
/mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
21# doris::pipeline::TaskScheduler::_do_work(unsigned long) at
/home/zcp/repo_center/doris_master/doris/be/src/pipeline/task_scheduler.cpp:138
22# doris::ThreadPool::dispatch_thread() in
/mnt/hdd01/PERFORMANCE_ENV/be/lib/doris_be
23# doris::Thread::supervise_thread(void*) at
/home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:499 24#
start_thread at ./nptl/pthread_create.c:442
25# 0x00007F591D657850 at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:83
2024-09-26 09:55:33 +08:00
25358564ca [Fix](compile) Fix gcc compile on master (#33864)
This is imported by #33511. wrongly used

ColumnStr<T> ();

which violate C++20 standard(see https://wg21.cmeerw.net/cwg/issue2237) but still supported by clang up until now(see llvm/llvm-project#58112)
2024-04-19 23:41:37 +08:00
4f174c4fb9 [feature](function) Support for aggregate function foreach combiner (#31526) 2024-03-06 13:08:30 +08:00
e68019c10a [Function](Exec) Support windows function cume_dist (#30997) 2024-02-16 10:16:40 +08:00
40e1326bc9 [feature](window-func) support percent_rank window function (#30926) 2024-02-16 10:12:25 +08:00
5739167142 [feature](window_function) support to secondary argument to ignore null values in first_value/last_value (#27623) 2023-11-30 09:56:43 +08:00
b42828cf69 [fix](window_function) min/max/sum/avg should be always nullable (#27104)
Co-authored-by: starocean999 <40539150+starocean999@users.noreply.github.com>
2023-11-18 18:41:42 +08:00
6dd60c6ebb [Enhance](BE) Add -Wshadow-field compile option to avoid unexpected shadowing behavior (#25698)
* Fix `Tablet::_meta_lock` shadows member inherited from `BaseTablet`

* Add -Wshadow-field compile option to avoid unexpected shadowing behavior
2023-10-26 10:00:28 +08:00
f23c93b3c6 [fix](memory) Fix AggFunc memory leak due to incorrect destroy (#19126) 2023-04-27 14:58:32 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
2b6d971c2f [fix](nereids)fix first_value/lead/lag window function bug in nereids (#17315)
* [fix](nereids)fix first_value/lead/lag window function bug in nereids

* add more test

* add order by to fix test case

* fix test cases
2023-03-09 09:35:27 +08:00
1dd2a41e38 [vectorized](bug) fix window function can't handle first row of beyond (#17084)
Issue Number: close #16845
2023-02-28 17:30:23 +08:00
Pxl
2bc014d83a [Enchancement](function) remove unused params on aggregate function (#16886)
remove unused params on aggregate function
2023-02-20 11:08:45 +08:00
1f30e563a7 [refactor][vectorized] refactor first/last value agg functions (#10661)
* refactor first and last
[refactor][vectorized] refactor first/last value agg functions

* add some change

* remove first/last about always nullable

* remove always nullable and register it

* refactor value remove bool null flag

* refactor win first last to ptr and pos
2022-07-30 18:38:56 +08:00
babab5d535 [feature-wip] support datetimev2 (#11085) 2022-07-23 16:07:59 +08:00
9e9d6a4dea [Load][Vectorized] load opt code by change replace and replace_if_not_null do not copy value (#10447)
load opt code by change `replace` and `replace_if_not_null` do not copy value
2022-07-10 22:04:32 +08:00
ca94867b4e [Feature-wip] add date v2 type (#9916) 2022-06-26 16:07:56 +08:00
4a474420c8 [feature](function) Add ntile function (#9867)
Add ntile function.
For non-vectorized-engine, I just implemented like Impala, rewrite ntile to row_number and count.
But for vectorized-engine, I implemented WindowFunctionNTile.
2022-06-10 10:32:40 +08:00
953429e370 [fix](function) fix last_value get wrong result when have order by clause (#9247) 2022-05-15 23:56:01 +08:00
718a51a388 [refactor][style] Use clang-format to sort includes (#9483) 2022-05-10 21:25:35 +08:00
49890ce9aa [BUG][Vectorized] fix replace_if_not_null in vectorized compaction (#9376) 2022-05-07 17:16:54 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- sh build-support/clang-format.sh  to  clang-format all c++ code
2022-04-29 16:14:22 +08:00
d330bc3806 [Vectorized](stream-load-vec) Support stream load in vectorized engine (#8709) (#9280)
Implement vectorized stream load.
Added fe configuration option `enable_vectorized_load` to enable vectorized stream load.

    Co-authored-by: tengjp@outlook.com
    Co-authored-by: mrhhsg@gmail.com
    Co-authored-by: minghong.zhou@163.com
    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
2022-04-29 09:50:51 +08:00
Pxl
d24735e95a [refactor] add some clang-tidy checks && some code style fix (#8752) 2022-03-31 13:53:41 +08:00
c772020db4 [fix] fix bug in WindowFunctionLastData::data, it keeps the first data not the last. (#8536)
WindowFunctionLastData::add should keep the last value,
but current implementation keeps the first one.
Obviously, this code is copied from WindowFunctionFirstData::add.
2022-03-21 09:51:56 +08:00
Pxl
0553ce2944 [feature](vectorization) support function topn && remove some unused code (#7793) 2022-02-09 13:05:31 +08:00
e1d7233e9c [feature](vectorization) Support Vectorized Exec Engine In Doris (#7785)
# Proposed changes

Issue Number: close #6238

    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
    Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
    Co-authored-by: wangbo <506340561@qq.com>
    Co-authored-by: emmymiao87 <522274284@qq.com>
    Co-authored-by: Pxl <952130278@qq.com>
    Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
    Co-authored-by: thinker <zchw100@qq.com>
    Co-authored-by: Zeno Yang <1521564989@qq.com>
    Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
    Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
    Co-authored-by: xinghuayu007 <1450306854@qq.com>
    Co-authored-by: weizuo93 <weizuo@apache.org>
    Co-authored-by: yiguolei <guoleiyi@tencent.com>
    Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
    Co-authored-by: awakeljw <993007281@qq.com>
    Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
    Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>


## Problem Summary:

### 1. Some code from clickhouse

**ClickHouse is an excellent implementation of the vectorized execution engine database,
so here we have referenced and learned a lot from its excellent implementation in terms of
data structure and function implementation.
We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers.**

The following comment has been added to the code from Clickhouse, eg:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris

### 2. Support exec node and query:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node

You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set.

### 3. Data Model

Vec Exec Engine Support **Dup/Agg/Unq** table, Support Block Reader Vectorized.
Segment Vec is working in process.

### 4. How to use

1. Set the environment variable `set enable_vectorized_engine = true; `(required)
2. Set the environment variable `set batch_size = 4096; ` (recommended)

### 5. Some diff from origin exec engine

https://github.com/doris-vectorized/doris-vectorized/issues/294

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
2022-01-18 10:07:15 +08:00