Commit Graph

33 Commits

Author SHA1 Message Date
b052f46944 [fix](json-quote) fix json quote func for not find the func (#39931) (#40061)
1.  fix function not found before this pr:
```
drop table if exists t003;
create table t003 (a bigint, b json not null) properties ("replication_num"="1");
insert into t003 values (1, '{"a":1,"b":2}');
select a, map_agg("k1", json_quote(b)) from t003 group by a;

[17:47]>  select a, map_agg("k1", json_quote(b)) from t003 group by a;
(1105, 'errCode = 2, detailMessage = (172.20.48.119)[INTERNAL_ERROR]Function json_quote get failed, expr is VectorizedFnCall[json_quote](arguments=(CAST b(JSONB) TO String),return=Nullable(String)) and return type is Nullable(String).')
``` 
after pr , we can carry on it 
2. fix a core if we use json_quote with table column
```
[WARNING!] /sys/kernel/mm/transparent_hugepage/enabled: [always] madvise never, Doris not recommend turning on THP, which may cause the BE process to use more memory and cannot be freed in time. Turn off THP: `echo madvise | sudo tee /sys/kernel/mm/transparent_hugepage/enabled`
start BE in local mode
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: 假设 ‘!hasRoot_’ 失败。
*** Query id: b7b94dead55e4090-8c8e94f65ec9efd3 ***
*** is nereids: 1 ***
*** tablet id: 0 ***
*** Aborted at 1724744881 (unix time) try "date -d @1724744881" if you are using GNU date ***
*** Current BE git commitID: bd5844ea0d ***
*** SIGABRT unknown detail explain (@0x461003611e1) received by PID 3543521 (TID 3548858 OR 0x7f8a53c1a700) from PID 3543521; stack trace: ***
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: 假设 ‘!hasRoot_’ 失败。
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: 假设 ‘!hasRoot_’ 失败。
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: 假设 ‘!hasRoot_’ 失败。
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: 假设 ‘!hasRoot_’ 失败。
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: 假设 ‘!hasRoot_’ 失败。
doris_be: /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488:void rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<>>>::Prefix(Type) [OutputStream = rapidjson::GenericStringBuffer<rapidjson::UTF8<>>, SourceEncoding = rapidjson::UTF8<>, TargetEncoding = rapidjson::UTF8<>, StackAllocator = rapidjson::CrtAllocator, writeFlags = 0]: 假设 ‘!hasRoot_’ 失败。
 0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/disk1/wangqiannan/amory/doris/be/src/common/signal_handler.h:421
 1# 0x00007F9429595B50 in /lib64/libc.so.6
 2# gsignal in /lib64/libc.so.6
 3# __GI_abort in /lib64/libc.so.6
 4# _nl_load_domain.cold.0 in /lib64/libc.so.6
 5# 0x00007F942958E426 in /lib64/libc.so.6
 6# rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u>::Prefix(rapidjson::Type) at /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:488
 7# rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u>::String(char const*, unsigned int, bool) at /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/writer.h:206
 8# bool rapidjson::GenericValue<rapidjson::UTF8<char>, rapidjson::MemoryPoolAllocator<rapidjson::CrtAllocator> >::Accept<rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u> >(rapidjson::Writer<rapidjson::GenericStringBuffer<rapidjson::UTF8<char>, rapidjson::CrtAllocator>, rapidjson::UTF8<char>, rapidjson::UTF8<char>, rapidjson::CrtAllocator, 0u>&) const at /mnt/disk1/wangqiannan/amory/doris/thirdparty/installed/include/rapidjson/document.h:1974
 9# doris::vectorized::FunctionJsonQuoteImpl::execute(std::vector<doris::vectorized::ColumnStr<unsigned int> const*, std::allocator<doris::vectorized::ColumnStr<unsigned int> const*> > const&, doris::vectorized::ColumnStr<unsigned int>&, unsigned long) at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function_json.cpp:826
10# doris::vectorized::FunctionJson<doris::vectorized::FunctionJsonQuoteImpl>::execute_impl(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function_json.cpp:985
11# doris::vectorized::DefaultExecutable::execute_impl(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function.h:463
12# doris::vectorized::PreparedFunctionImpl::_execute_skipped_constant_deal(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long, bool) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function.cpp:120
13# doris::vectorized::PreparedFunctionImpl::execute_without_low_cardinality_columns(doris::FunctionContext*, doris::vectorized::Block&, std::vector<unsigned long, std::allocator<unsigned long> > const&, unsigned long, unsigned long, bool) const at /mnt/disk1/wangqiannan/amory/doris/be/src/vec/functions/function.cpp:245
```

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-28 21:31:08 +08:00
21bd4a4ac8 [bug](function)fix json_replace check return type error (#37014) (#39938)
1. fix the return type dcheck error:
```
mysql [test]>select (json_replace(a, '$.fparam.nested_2', "qwe")) from json_table_2 limit 1;
ERROR 1105 (HY000): errCode = 2, detailMessage = (10.16.10.8)[INTERNAL_ERROR]Function json_replace get failed, expr is VectorizedFnCall[json_replace](arguments=a, String, String, String,return=Nullable(String)) and return type is Nullable(String).
```

2. improve the json_replace/json_insert/json_set function execute of not
convert const column, test about could faster 1s on 1000w table rows

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->
2024-08-27 08:03:48 +08:00
876248aa4e [fix](function) json_object can not input null value (#34591) 2024-05-18 18:00:48 +08:00
0578b28d54 [fix](function) fixed the get_json_string function (#32150) 2024-03-15 18:04:02 +08:00
0f25a4b3c6 [bug](json)Fix the problem of be down caused by json path ending with \ (#28180) 2023-12-15 15:57:08 +08:00
Pxl
2b715924c5 [Chore](function) set normal function use_default_implementation_for_constants to default (#27891)
set normal function use_default_implementation_for_constants to default
2023-12-04 14:19:25 +08:00
a5565f68b2 [Refactor](opentelemetry) Remove opentelemetry (#26605) 2023-11-09 18:05:34 +08:00
3a954cd1aa [fix](function)return NULL rather than 'null' if path not found (#25880)
fix json_extract not return NULL but null
2023-10-30 14:26:44 +08:00
68087f6c82 [fix](json function) Fix the slow performance of get_json_path when processing JSONB (#24631)
When processing JSONB, automatically convert to jsonb_extract_string
2023-09-27 21:17:39 +08:00
5d138b6928 [remove](function) make execute_impl const and remove running_difference function (#24935) 2023-09-27 18:17:28 +08:00
xfz
1b95ce1d93 [feature](json-function) add json_insert, json_replace, json_set functions (#24384)
[feature](json-function) add three json funcitons
2023-09-25 12:52:29 +08:00
14bd290aec [feature](jsonb)support json_length and json_contains function (#24332) 2023-09-20 10:40:44 +08:00
4e880288c6 [refactor]use clear concept to replace std::enable_if_t (#22801)
---------

Signed-off-by: flynn <fenglv15@mails.ucas.ac.cn>
2023-08-12 15:10:30 +08:00
97135a1cbb [Feature] (json)add json_contains function (#20824) 2023-06-16 15:10:12 +08:00
06e7c14320 [Improve](json-array) Support json array with nereids bool (#20248)
Support json array with nereids bool
now : 

```
set enable_nereids_planner=true;
mysql> SELECT json_array(1, "abc", NULL, TRUE, '10:00:00');
+----------------------------------------------+
| json_array(1, 'abc', NULL, TRUE, '10:00:00') |
+----------------------------------------------+
| [1,"abc",null,false,"10:00:00"]              |
+----------------------------------------------+
1 row in set (0.02 sec)
```
 
nereids boolean is "true"/"false" is not '0' /'1' , so we always get false
2023-06-02 14:47:24 +08:00
Pxl
d64be9565d [Bug](function) fix function in get wrong result when input const column (#19791)
fix function in get wrong result when input const column
2023-05-22 10:58:29 +08:00
673cbe3317 [chore](build) Porting to GCC-13 (#19293)
Support using GCC-13 to build the codebase.
2023-05-08 10:42:06 +08:00
1dfc5ea34c [bugfix](jsonb) fix jsonb parser crash on noavx2 host (#18977)
support avx2 and noavx2 for jsonb parser using __AVX2__ macro.
2023-04-26 15:10:12 +08:00
ab2a6864bc [function](json) Json unquote (#18037) 2023-04-24 10:33:29 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
43392918cd [Optimization](functions)Optimize function call for const columns. (#18310) 2023-04-12 11:11:01 +08:00
ee80c12815 [feature](json) add json_extract function (#17808) 2023-03-27 21:19:47 +08:00
50eeb2d9a4 [fix](json) change int to bigint for json function (#17769) 2023-03-25 21:57:29 +08:00
9f088f6e90 [feature](json) add json_valid function (#17247)
add json_valid function

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-03-02 14:08:52 +08:00
55ca810445 [fix](Vectorized)fix json_object and json_array function return wrong result on vectorized engine (#13775)
Issue Number: close #13598
2022-11-09 11:26:55 +08:00
291fa499e9 [fix](JSON) Fail to parse JSONPath (libc++) (#13941) 2022-11-09 08:58:01 +08:00
Pxl
0ead048b93 [Enhancement](column) remove ColumnString terminating zero and add a data_version for pblock (#12456)
1. remove ColumnString terminating zero
    2. add a data_version for pblock
    3. change EncryptionMode to enum class
2022-09-14 21:25:22 +08:00
Pxl
6a566ccb74 [Enhancement][Vectorized] add constexpr_loop_match (#10283) 2022-06-29 14:58:50 +08:00
Pxl
a8af8d2981 [fix](vectorized) fix core dump on get_json_string and add some ut (#8496) 2022-03-17 10:08:31 +08:00
2b9b0fc1ec [Fix] Function percentile input null return null (#8238) 2022-03-01 14:42:48 +08:00
Pxl
87e555c27d [Feature][Vectorized] support function json_array/json_object/json_quote (#8158) 2022-02-22 09:29:56 +08:00
Pxl
3ee000c13c [chore] support build with libc++ && add some build config (#7903)
support LIBCPP/LDD/BUILD_META_TOOL for build.sh
2022-01-30 16:47:22 +08:00
e1d7233e9c [feature](vectorization) Support Vectorized Exec Engine In Doris (#7785)
# Proposed changes

Issue Number: close #6238

    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: stdpain <34912776+stdpain@users.noreply.github.com>
    Co-authored-by: Zhengguo Yang <yangzhgg@gmail.com>
    Co-authored-by: wangbo <506340561@qq.com>
    Co-authored-by: emmymiao87 <522274284@qq.com>
    Co-authored-by: Pxl <952130278@qq.com>
    Co-authored-by: zhangstar333 <87313068+zhangstar333@users.noreply.github.com>
    Co-authored-by: thinker <zchw100@qq.com>
    Co-authored-by: Zeno Yang <1521564989@qq.com>
    Co-authored-by: Wang Shuo <wangshuo128@gmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
    Co-authored-by: Gabriel <gabrielleebuaa@gmail.com>
    Co-authored-by: xinghuayu007 <1450306854@qq.com>
    Co-authored-by: weizuo93 <weizuo@apache.org>
    Co-authored-by: yiguolei <guoleiyi@tencent.com>
    Co-authored-by: anneji-dev <85534151+anneji-dev@users.noreply.github.com>
    Co-authored-by: awakeljw <993007281@qq.com>
    Co-authored-by: taberylyang <95272637+taberylyang@users.noreply.github.com>
    Co-authored-by: Cui Kaifeng <48012748+azurenake@users.noreply.github.com>


## Problem Summary:

### 1. Some code from clickhouse

**ClickHouse is an excellent implementation of the vectorized execution engine database,
so here we have referenced and learned a lot from its excellent implementation in terms of
data structure and function implementation.
We are based on ClickHouse v19.16.2.2 and would like to thank the ClickHouse community and developers.**

The following comment has been added to the code from Clickhouse, eg:
// This file is copied from
// https://github.com/ClickHouse/ClickHouse/blob/master/src/Interpreters/AggregationCommon.h
// and modified by Doris

### 2. Support exec node and query:
* vaggregation_node
* vanalytic_eval_node
* vassert_num_rows_node
* vblocking_join_node
* vcross_join_node
* vempty_set_node
* ves_http_scan_node
* vexcept_node
* vexchange_node
* vintersect_node
* vmysql_scan_node
* vodbc_scan_node
* volap_scan_node
* vrepeat_node
* vschema_scan_node
* vselect_node
* vset_operation_node
* vsort_node
* vunion_node
* vhash_join_node

You can run exec engine of SSB/TPCH and 70% TPCDS stand query test set.

### 3. Data Model

Vec Exec Engine Support **Dup/Agg/Unq** table, Support Block Reader Vectorized.
Segment Vec is working in process.

### 4. How to use

1. Set the environment variable `set enable_vectorized_engine = true; `(required)
2. Set the environment variable `set batch_size = 4096; ` (recommended)

### 5. Some diff from origin exec engine

https://github.com/doris-vectorized/doris-vectorized/issues/294

## Checklist(Required)

1. Does it affect the original behavior: (No)
2. Has unit tests been added: (Yes)
3. Has document been added or modified: (No)
4. Does it need to update dependencies: (No)
5. Are there any changes that cannot be rolled back: (Yes)
2022-01-18 10:07:15 +08:00