Commit Graph

2042 Commits

Author SHA1 Message Date
a0b95d8fcb [fix](storage) fix core for string predicate in storage layer (#9500)
Co-authored-by: Wang Bo <wangbo36@meituan.com>
2022-05-12 15:41:39 +08:00
4cd579b155 [refactor] Check status precise_code instead of construct OLAPInternalError (#9514)
* check status precise_code instead of construct OLAPInternalError
* move is_io_error to Status
2022-05-12 15:39:29 +08:00
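The idea behind #9514 can be illustrated with a toy model. This is a minimal sketch, not Doris's actual `Status` class: the `Status` type below, the `E_IO_ERROR` and `E_NOT_FOUND` codes, and the `is_not_found` helper are hypothetical stand-ins. Only the two ideas (compare `precise_code` directly, keep `is_io_error` on the status) come from the commit message.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical error codes; the real values used in Doris are not shown in this log.
constexpr int E_OK = 0;
constexpr int E_IO_ERROR = -1;
constexpr int E_NOT_FOUND = -2;

// Toy stand-in for a status type that carries a precise error code.
class Status {
public:
    static Status ok() { return Status(E_OK, ""); }
    static Status error(int precise_code, std::string msg) {
        return Status(precise_code, std::move(msg));
    }

    bool is_ok() const { return _precise_code == E_OK; }
    int precise_code() const { return _precise_code; }
    // Classification helper kept on the status itself.
    bool is_io_error() const { return _precise_code == E_IO_ERROR; }

private:
    Status(int precise_code, std::string msg)
            : _precise_code(precise_code), _msg(std::move(msg)) {}

    int _precise_code;
    std::string _msg;
};

// Compare the integer code directly instead of constructing a second
// error object only to compare against it.
inline bool is_not_found(const Status& st) {
    return st.precise_code() == E_NOT_FOUND;
}
```

Comparing an integer avoids building a throwaway error object, message included, on every check.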
d26f5d22be [refactor]Cleanup unused empty files (#9497) 2022-05-12 14:58:28 +08:00
289608cc20 [fixbug]fix bug for OLAP_SUCCESS with Status (#9427) 2022-05-11 20:04:06 +08:00
e3bac86b43 [bugfix](vtablet_sink) fix max_pending_bytes for vtablet_sink (#9462)
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-05-11 18:00:56 +08:00
375c1bf5c0 [feature](mysql-table) support utf8mb4 for mysql external table (#9402)
This patch adds utf8mb4 support for MySQL external tables.

Some users need a MySQL external table with the utf8mb4 charset, but only the utf8 charset is supported right now.

When creating a MySQL external table, you can add an optional property "charset" that sets the character set of the MySQL connection.
The default value is "utf8". Set it to "utf8mb4" instead of "utf8" when needed.
2022-05-11 09:39:23 +08:00
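The optional-property behaviour described in #9402 can be sketched as a lookup with a default. This is an illustration only: `resolve_charset` and the `properties` map are hypothetical names, not the code from the patch. The property key "charset" and the default "utf8" are taken from the commit message.

```cpp
#include <cassert>
#include <map>
#include <string>

// Returns the connection charset for a MySQL external table.
// `properties` is a hypothetical key/value map of the table's properties.
inline std::string resolve_charset(const std::map<std::string, std::string>& properties) {
    auto it = properties.find("charset");
    // "utf8" is the default stated in the commit message.
    return it == properties.end() ? std::string("utf8") : it->second;
}
```

Tables created without the property keep the old behaviour, so existing definitions need no change.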
718a51a388 [refactor][style] Use clang-format to sort includes (#9483) 2022-05-10 21:25:35 +08:00
ce926a7abb [refactor] delete OLAP_LOG_WARNING related macro definition (#9484)
Co-authored-by: BePPPower <fangtiewei@selectdb.com>
2022-05-10 20:53:45 +08:00
b34ed43ec9 [feature-wip] (memory tracker) (step6, End) Fix some details (#9301)
1. Fix the accuracy of memory tracking for LoadTask, ChunkAllocator, TabletMeta, and Brpc.
2. Modified some MemTracker names, deleted some unnecessary trackers, and improved readability.
3. More powerful MemTracker debugging capabilities.
4. Avoid creating TabletColumn temporary objects and improve BE startup time by 8%.
5. Fix some other details.
2022-05-10 18:17:09 +08:00
e61d296486 [Refactor] Replace '#ifndef' with '#pragma once' (#9456)
* Replace '#ifndef' with '#pragma once'
2022-05-10 09:25:59 +08:00
51db78d375 [refactor] modify all OLAP_LOG_WARNING to LOG(WARNING) (#9473)
Co-authored-by: BePPPower <fangtiewei@selectdb.com>
2022-05-10 09:25:25 +08:00
eec1dfde3a [feature] (vec) instead of converting line to src tuple for stream load in vectorized. (#9314)
Co-authored-by: xiepengcheng01 <xiepengcheng01@xafj-palo-rpm64.xafj.baidu.com>
2022-05-09 11:24:07 +08:00
ae01862ae4 [fix](ut) fix DeltaWriter::close_wait parameter mismatch in delta_writer_test (#9457) 2022-05-09 09:38:12 +08:00
7e86c1beab [fix] UT MathFunctionTest.round_test fix (#9447)
The round function supports two forms, round(double) and round(double, int), so its arguments are variadic.
However, FunctionBinaryArithmetic does not support variadic arguments yet, so get_function fails for round(double, int).

Steps to reproduce:
1. set enable_vectorized_engine=true;
2. try to call round(double, int);
```
> select round(10.12345,2);
ERROR 1105 (HY000): errCode = 2, detailMessage = Function round is not implemented
```
2022-05-09 09:37:27 +08:00
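The lookup failure in #9447 can be pictured with a registry that resolves a function by name and argument count. This is a hedged sketch, not Doris's function factory: `FunctionRegistry`, `register_function`, `make_registry`, and this `get_function` signature are all hypothetical. It only shows why both arities of `round` must be registered for `round(double, int)` to resolve.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Args = std::vector<double>;
using Fn = std::function<double(const Args&)>;

// Hypothetical registry keyed by (function name, argument count).
class FunctionRegistry {
public:
    void register_function(const std::string& name, std::size_t arity, Fn fn) {
        _fns[std::make_pair(name, arity)] = std::move(fn);
    }

    // Returns an empty Fn when no overload matches the requested arity.
    Fn get_function(const std::string& name, std::size_t arity) const {
        auto it = _fns.find(std::make_pair(name, arity));
        return it == _fns.end() ? Fn() : it->second;
    }

private:
    std::map<std::pair<std::string, std::size_t>, Fn> _fns;
};

// Registers both forms of round so that either arity resolves.
inline FunctionRegistry make_registry() {
    FunctionRegistry registry;
    registry.register_function("round", 1, [](const Args& a) { return std::round(a[0]); });
    registry.register_function("round", 2, [](const Args& a) {
        const double scale = std::pow(10.0, a[1]);
        return std::round(a[0] * scale) / scale;
    });
    return registry;
}
```

If only the one-argument form were registered, `get_function("round", 2)` would return an empty function, which corresponds to the "not implemented" error in the reproduction above.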
6834fb23ca [fix](s3) fix s3 Temp file may write failed because of has no space on disk (#9421) 2022-05-09 09:28:43 +08:00
580ce38a3f [fix](schema_hash) Fix bug that introduced by removing schema_hash (#9449) 2022-05-08 21:03:10 +08:00
7234c964ae [Bug] Missing error tablet list when close_wait return error (#9418) 2022-05-08 06:45:28 +08:00
fd11a6b493 [fix][feature](Function) fix return type && support hll_union_agg/group_concat agg to window function (#9119) 2022-05-07 20:44:04 +08:00
4235db8902 [refactor] some code cleanup for min/max function. (#8874) 2022-05-07 20:39:44 +08:00
53574ce0ea [Bug] (fix) DeltaWriter::mem_consumption() coredump (#9245) 2022-05-07 19:13:08 +08:00
49890ce9aa [BUG][Vectorized] fix replace_if_not_null in vectorized compaction (#9376) 2022-05-07 17:16:54 +08:00
Pxl
98bfeaf560 [Enhancement] [Vectorized] Refactor and optimize BinaryOperation (#9087) 2022-05-07 10:55:15 +08:00
2ccaa6338c [enhancement](load) optimize load string data and dict page write (#9123)
* [enhancement](load) optimize load string data and dict page write
2022-05-07 10:27:27 +08:00
22439cb6a6 [Improvement] [compaction]Enable vectorized compaction by default (#9383) 2022-05-07 08:46:35 +08:00
dce18cb325 [doc] Add window functions sql help doc (#9393) 2022-05-07 08:43:51 +08:00
811f019e47 [performance][query]improve the performance of DISTINCT aggregation by using flat hash set replace unordered set (#9401)
Co-authored-by: shizhiqiang03 <shizhiqiang03@meituan.com>
2022-05-07 08:43:14 +08:00
e7f12db06c [fixbug][compaction] update OLAP_ERR_CUMULATIVE_NO_SUITABLE_VERSION (#9410) 2022-05-07 08:39:20 +08:00
a9831f87f2 [refactor]refactor lazy materialized (#8834)
2022-05-06 19:16:35 +08:00
edc833ab76 [Bug][stream-vec-load] Null data load do not skip the same place data (#9360)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-06 16:07:45 +08:00
e130d2f233 [fix][compaction] Rowset::end_version null pointer(#9379) 2022-05-06 14:40:08 +08:00
e3b90de2d5 remove file result writer from result sink (#9378) 2022-05-06 02:37:20 +08:00
e5d4cf01ed [fix](ut) fix a potential memory leak in BE ut (#9362) 2022-05-05 20:47:31 +08:00
a33191e222 [fix](memtracker) DCHECK failed in vectorized exec engine fold constant execute (#9354) 2022-05-05 09:55:38 +08:00
832338c55e [improvement] set name for scanner threads and fix compile error in clang (#9336) 2022-05-05 09:53:43 +08:00
f1aa9668af [refactor][storage format] Forbidden rowset v1 (#9248)
- Force existing OLAP tables to change their storage format from V1 to V2
- Forbid creating new OLAP tables with storage format V1, and forbid schema changes that would create V1-format data
2022-05-04 17:32:20 +08:00
eed62695e1 [fix](function) handle merge in window_funnel_init and add test (#9338) 2022-05-03 22:37:06 +08:00
49a0cd1925 [fix](compaction) fix bug for vectorized compaction (#9344)
1. Add a BE config to switch vectorized compaction on or off.
2. Fix a vectorized compaction bug where the row statistics were wrong.
2022-05-03 17:31:40 +08:00
Pxl
b1870faddd [Bug] [Build] fix clang build fail (#9323)
* fix clang compile fail
2022-05-02 18:04:57 +08:00
Pxl
d2374dbd5e [fix](Lateral-View) fix outer combinator not work on non-vectorized (#9212) 2022-05-01 22:09:50 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- Run `sh build-support/clang-format.sh` to clang-format all C++ code
2022-04-29 16:14:22 +08:00
201cd207f9 [Enhancement][Vectorized] Improve hash table build efficiency (#9250)
1. MAP_POPULATE was missing for mmap in Allocator because the macro OS_LINUX is not defined in allocator.h.
2. MAP_POPULATE has no effect for mremap, unlike for mmap, so the enlarged memory range is zero-filled explicitly to pre-fault the pages.
2022-04-29 14:26:33 +08:00
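The two points in #9250 can be sketched with the raw Linux calls. This is a minimal, Linux-only illustration under stated assumptions: `map_prefaulted` and `grow_prefaulted` are hypothetical helpers, not functions from Doris's Allocator, and the sketch assumes `mremap` is visible (g++ on Linux defines `_GNU_SOURCE` by default).

```cpp
#include <sys/mman.h>

#include <cassert>
#include <cstddef>
#include <cstring>

// Maps anonymous memory and asks the kernel to pre-fault the pages.
inline void* map_prefaulted(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? nullptr : p;
}

// Grows a mapping. mremap has no MAP_POPULATE equivalent, so the newly
// added tail is zero-filled explicitly, which touches (pre-faults) its pages.
inline void* grow_prefaulted(void* old_ptr, std::size_t old_size, std::size_t new_size) {
    assert(new_size >= old_size);
    void* p = mremap(old_ptr, old_size, new_size, MREMAP_MAYMOVE);
    if (p == MAP_FAILED) return nullptr;
    std::memset(static_cast<char*>(p) + old_size, 0, new_size - old_size);
    return p;
}
```

Pre-faulting moves the page-fault cost to allocation time, so a later hot loop such as a hash table build does not pay for it page by page.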
ce7905e983 [fix](vectorized) Query get wrong result when ColumnDict concurrent predicate eval (#9270) 2022-04-29 11:45:04 +08:00
2fa19113ab [fix](profile) Short-circuit and del predicate filter rows are not counted on vectorized exec (#9268) 2022-04-29 10:45:48 +08:00
d330bc3806 [Vectorized](stream-load-vec) Support stream load in vectorized engine (#8709) (#9280)
Implement vectorized stream load.
Added fe configuration option `enable_vectorized_load` to enable vectorized stream load.

    Co-authored-by: tengjp@outlook.com
    Co-authored-by: mrhhsg@gmail.com
    Co-authored-by: minghong.zhou@163.com
    Co-authored-by: HappenLee <happenlee@hotmail.com>
    Co-authored-by: zhoubintao <35688959+zbtzbtzbt@users.noreply.github.com>
2022-04-29 09:50:51 +08:00
48222f1fb0 [fix](storage)bloom filter support ColumnDict (#9167)
bloom filter support ColumnDict(#9167)
2022-04-28 20:03:26 +08:00
2ec0b98787 [fix](routine-load) Fix bug that new coming routine load tasks are rejected all the time and report TOO_MANY_TASK error (#9164)
```
CREATE ROUTINE LOAD iaas.dws_nat ON dws_nat
WITH APPEND PROPERTIES (
"desired_concurrent_number"="2",
"max_batch_interval" = "20",
"max_batch_rows" = "400000",
"max_batch_size" = "314572800",
"format" = "json",
"max_error_number" = "0"
)
FROM KAFKA (
"kafka_broker_list" = "xxxx:xxxx",
"kafka_topic" = "nat_nsq",
"property.kafka_default_offsets" = "2022-04-19 13:20:00"
);
```

In the CREATE statement above:
1. The user did not specify custom partitions, so FE fetches all Kafka partitions from the server in the routine load scheduler.
2. The user set the default offset by datetime, so FE fetches the Kafka offsets by time from the server in the routine load scheduler.

When step 1 succeeds but step 2 fails, the progress of this routine load may not contain any partitions or offsets.
Nevertheless, since newCurrentKafkaPartition, which is fetched from the Kafka server, may always be equal to currentKafkaPartitions,
the wrong progress is never updated.
2022-04-27 23:21:17 +08:00
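The stuck state described in #9164 can be reduced to a small predicate. FE is written in Java, so this is only a model of the logic with hypothetical names (`PartitionSet`, `Progress`, and both functions). The second function is one possible guard, an assumption, and not necessarily the fix that was merged.

```cpp
#include <cassert>
#include <map>
#include <set>

using PartitionSet = std::set<int>;
using Progress = std::map<int, long>;  // partition id -> offset

// Behaviour described in the commit message: the progress is refreshed
// only when the fetched partition list differs from the current one.
inline bool needs_update_by_partitions(const PartitionSet& current,
                                       const PartitionSet& fetched) {
    return current != fetched;
}

// One possible guard (an assumption): also refresh when the progress
// has no offset recorded for some fetched partition.
inline bool needs_update_with_progress_check(const PartitionSet& current,
                                             const PartitionSet& fetched,
                                             const Progress& progress) {
    if (current != fetched) return true;
    for (int partition : fetched) {
        if (progress.find(partition) == progress.end()) return true;
    }
    return false;
}
```

With identical partition sets and an empty progress, the first check never fires, which matches the stuck state in the commit message. The second check does fire in that case.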
26bc462e1c [feature-wip] (memory tracker) (step5) Fix track bthread, fix track vectorized query (#9145)
1. fix track bthread
- bthread is a high-performance M:N thread library used by brpc. In Doris, a brpc server response runs on one bthread, which may be scheduled across multiple pthreads. Currently, MemTracker consumption relies on pthread-local variables (TLS).
- As a result, the pthread TLS MemTracker could get mixed up when a brpc server response switched pthreads. The MemTracker for a brpc server response is now saved in bthread TLS instead of pthread TLS.
Ref: 731730da85/docs/en/server.md (bthread-local)

2. fix track vectorized query
- Added tracking for mmap. Currently, mmap allocates memory in many places of the vectorized execution engine.
- Refactored ThreadContext to avoid dependency conflicts and make it easier to debug.
- Fix some bugs.
2022-04-27 20:34:02 +08:00
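The hazard behind the bthread change in #9145 can be shown without brpc. This sketch uses plain `std::thread` and does not call any bthread API: `tls_tracker`, `TaskContext`, and `run_task_across_threads` are hypothetical names. It models one logical task whose two halves run on different OS threads, as can happen to a bthread.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <utility>

// Per-OS-thread slot, analogous to pthread TLS.
thread_local std::string tls_tracker = "none";

// State that travels with the task itself, analogous to task-local
// (bthread-local) storage.
struct TaskContext {
    std::string tracker = "none";
};

// Runs one logical task in two halves on two different OS threads and
// returns {tracker seen through thread TLS, tracker seen through the task context}.
inline std::pair<std::string, std::string> run_task_across_threads() {
    TaskContext ctx;
    std::thread first([&ctx] {
        tls_tracker = "query-1";
        ctx.tracker = "query-1";
    });
    first.join();

    std::pair<std::string, std::string> seen;
    std::thread second([&ctx, &seen] {
        seen.first = tls_tracker;   // a different thread: the value set by `first` is not here
        seen.second = ctx.tracker;  // the task context still carries it
    });
    second.join();
    return seen;
}
```

The thread-local value does not follow the task to the second thread, while the value stored with the task does. That is the motivation for keeping the tracker in task-local storage.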
597115c305 [feature] add SHOW TABLET STORAGE FORMAT stmt (#9037)
Use this statement to show the storage format of tablets in BE. If verbose is set,
    it shows detailed storage format information for each tablet.
    e.g.
    ```
    MySQL [(none)]> admin show tablet storage format;
    +-----------+---------+---------+
    | BackendId | V1Count | V2Count |
    +-----------+---------+---------+
    | 10002     | 0       | 2867    |
    +-----------+---------+---------+
    1 row in set (0.003 sec)
    MySQL [test_query_qa]> admin show tablet storage format verbose;
    +-----------+----------+---------------+
    | BackendId | TabletId | StorageFormat |
    +-----------+----------+---------------+
    | 10002     | 39227    | V2            |
    | 10002     | 39221    | V2            |
    | 10002     | 39215    | V2            |
    | 10002     | 39199    | V2            |
    +-----------+----------+---------------+
    4 rows in set (0.034 sec)
    ```
    Also adds storage format information to the SHOW FULL TABLES statement.
    ```
    MySQL [test_query_qa]> show full tables;
    +-------------------------+------------+---------------+
    | Tables_in_test_query_qa | Table_type | StorageFormat |
    +-------------------------+------------+---------------+
    | bigtable                | BASE TABLE | V2            |
    | test_dup                | BASE TABLE | V2            |
    | test                    | BASE TABLE | V2            |
    | baseall                 | BASE TABLE | V2            |
    | test_string             | BASE TABLE | V2            |
    +-------------------------+------------+---------------+
    5 rows in set (0.002 sec)
    ```
2022-04-27 10:53:43 +08:00
87fc46f84c update comments in run-be-ut.sh (#9092) 2022-04-26 12:48:35 +08:00
47a59c7fe6 [fix](OlapScanner)fix bitmap or hll's OOM when loading too many unqualified data (#9205) 2022-04-26 10:25:56 +08:00