Commit Graph

2133 Commits

Author SHA1 Message Date
f49284036e [Enhancement] Refactor functions in int_exp by templates (#9939) 2022-06-04 11:53:31 +08:00
c996334ad1 [improvement] Optimize send fragment logic to reduce send fragment timeout error (#9720)
This CL mainly changes:
1. Reduce the RPC timeouts caused by RPCs waiting for a brpc worker thread.
    1. Merge multiple fragment instances on the same BE into one request to reduce the number of send fragment RPCs.
    2. If the number of fragments is >= 3, use a 2-phase RPC: the first phase sends all fragments and the second starts them, so that there
         are at most 2 RPCs for each query on one BE.

2. Set the timeout of the send fragment RPC to the query timeout, so that it matches the query timeout the user expects.

3. Do not close the connection anymore when an RPC timeout occurs.
4. Change some log levels from info to debug to simplify the fe.log content.

NOTICE:
1. The definition of the execPlanFragment RPC has changed, so BE must be upgraded first.
2. The FE config `remote_fragment_exec_timeout_ms` is removed.
2022-06-03 15:47:40 +08:00
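A minimal sketch of the batching idea described in c996334ad1, not the actual FE code. It assumes "fragments size" means the number of fragments in the query; `plan_fragment_rpcs`, `TWO_PHASE_THRESHOLD` and the RPC labels are hypothetical names chosen for illustration.

```python
from collections import defaultdict

# Threshold taken from the commit message ("fragments size >= 3").
TWO_PHASE_THRESHOLD = 3


def plan_fragment_rpcs(instances, num_fragments):
    """Group fragment instances by backend and decide which RPCs to issue.

    `instances` is a list of (backend, instance_id) pairs.
    Returns {backend: [(rpc_kind, instance_ids), ...]}.
    All names are illustrative, not Doris identifiers.
    """
    by_backend = defaultdict(list)
    for backend, instance_id in instances:
        by_backend[backend].append(instance_id)

    two_phase = num_fragments >= TWO_PHASE_THRESHOLD
    plan = {}
    for backend, ids in by_backend.items():
        if two_phase:
            # Phase 1 ships every instance, phase 2 starts them all.
            plan[backend] = [("send", tuple(ids)), ("start", tuple(ids))]
        else:
            # Small queries keep a single combined request per backend.
            plan[backend] = [("send_and_start", tuple(ids))]
    return plan


plan = plan_fragment_rpcs(
    [("be1", "a"), ("be1", "b"), ("be2", "c")], num_fragments=3
)
```

Either way each backend receives at most two RPCs per query, regardless of how many fragment instances it hosts.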
cacad64d2c [fix] Error when compiling under ARM architecture machine, fixed a loop (#9948) 2022-06-03 08:00:55 +08:00
Pxl
c0ad1be1bd [Enhancement][Chore] remove breakpad and unused variable (#9937) 2022-06-02 20:52:17 +08:00
c426c2e4b1 [Vectorized-Load] Support vectorized load table with materialized view (#9923)
* [Vectorized-Load] Support vectorized load table with materialized view

* fix ut

Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-06-02 14:59:01 +08:00
4ea5782838 [fix](Function) fix to_bitmap to return always not nullable (#9859) 2022-06-02 10:37:45 +08:00
e896fffd76 [Vectorized][Function] fix bitmap_intersect get wrong result (#9907) 2022-06-01 23:51:52 +08:00
47dfdd8e09 [fix](storage) Disable compaction before schema change is actually executed(#9032) (#9065)
As described in the issue, running compaction and schema change at the same time may lead to version intersection.
Overview of changes:
1. Do not do compaction before schema change is actually executed.
2. Set the tablet as bad when it has version intersection.
3. Do not do schema change when it cannot find appropriate versions to delete in the new tablet.
4. Do not change rowsets after compaction if the rowsets of the tablet have changed.
2022-06-01 23:29:18 +08:00
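To make "version intersection" in 47dfdd8e09 concrete, here is a small sketch. It assumes each rowset covers an inclusive version range `(start, end)`, which the commit message does not spell out; `has_version_intersection` is a hypothetical helper, not a function from the code base.

```python
def has_version_intersection(versions):
    """Return True if any two inclusive (start, end) ranges overlap.

    Sorting by start lets a single pass compare each range against the
    largest end seen so far.
    """
    max_end = None
    for start, end in sorted(versions):
        if max_end is not None and start <= max_end:
            return True
        max_end = end if max_end is None else max(max_end, end)
    return False


healthy = [(0, 1), (2, 5), (6, 6)]   # contiguous, no overlap
broken = [(0, 5), (3, 8)]            # versions 3..5 are covered twice
```

Under this assumption, a tablet whose rowsets look like `broken` is the case the commit marks as bad.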
Pxl
ac08c7ac91 [fix](vectorized) fix vcast expr input wrong row number (#9520) 2022-06-01 15:19:31 +08:00
632f7a3d3d [Feature] add weekday function on vectorized engine (#9901) 2022-06-01 14:47:37 +08:00
00719db3a2 [bugfix]handle ColumnDictionary in evaluate_and and evaluate_or (#9818)
* handle ColumnDictionary in evaluate_or

We need to handle ColumnDictionary in evaluate_or, otherwise the delete handler
would trigger a core dump.

* handle ColumnDictionary in evaluate_and

Because there is only one difference between evaluate_and and
evaluate_or, that is or and delete, I merge two macros into one.

Delete handlers also trigger evaluate_and, I am not sure if column
dictionary would be used in evaluate_and.

* clang format

* fix short circuit for evaluate_and and evaluate_or

* clang format
2022-06-01 08:10:43 +08:00
35f99faa0a [Bug][Vectorized] fix core dump on vcase_expr::close (#9893)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-06-01 08:05:09 +08:00
0376ca17f3 [Enhancement] Remove minidump (#9894) 2022-06-01 08:04:24 +08:00
f3193c5ea3 [improvement]opt column_dictionary range filter (#9881)
* opt column_dictionary range filter

* format
2022-05-31 22:30:05 +08:00
54e9d49718 [Bug][Vectorized] Fix call nvl function core dump (#9883)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-31 22:18:38 +08:00
Pxl
d34d631519 [bugfix]fix TableFunctionNode memory leak (#9853) 2022-05-31 19:20:22 +08:00
c8d303a82c [bugfix] Fix BE core about vectorized join build thread memtracker switch, and FileStat duplicate 2022-05-31 19:12:42 +08:00
Pxl
fa50b63cee fix core dump on vcase_expr::close (#9875) 2022-05-31 15:45:39 +08:00
0cba6b7d95 [Bug][Fix] One Rowset have same key output in unique table (#9858)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-31 12:29:16 +08:00
7199102d7c [Opt][VecLoad] Opt the vec stream load performance (#9772)
Co-authored-by: lihaopeng <lihaopeng@baidu.com>
2022-05-31 11:53:32 +08:00
7b55d4cb88 [BUG] return NULL for invalid date value (#9862) 2022-05-30 21:35:41 +08:00
85f525e991 [Bugfix(Vec)] Close result_sink properly (#9849)
Close result_sink properly so that error code is reported and
expr_context is always closed.
2022-05-30 19:03:33 +08:00
f377c26bf7 [refactor][be] Optimize headers (#9708) 2022-05-30 16:12:10 +08:00
4af2493c42 [Improvement] optimize scannode concurrency query performance in vectorized engine. (#9792) 2022-05-30 16:04:40 +08:00
7b98dd438d [feature](function) Add nvl function (#9726) 2022-05-30 09:43:00 +08:00
0683181fef [API changed](parser) Remove merge join syntax (#9795)
Remove merge join sql and merge join node
2022-05-30 09:04:21 +08:00
a96b41db7a [Improvement] Simplify expressions for _vconjunct_ctx_ptr (#9816) 2022-05-29 23:05:21 +08:00
63aab5ee5d [Bugfix(Vec)] Fix some memory leak issues (#9824) 2022-05-29 23:04:11 +08:00
1aeb16d153 [improvement](load) reduce useless err_msg format in VOlapTableSink send (#9531) 2022-05-29 16:02:57 +08:00
9fe3827239 [fix](ut) fix BE ut (#9831)
Introduced by #8923; the GitHub checks had a problem and failed to run the BE ut in #8923.
2022-05-29 12:25:41 +08:00
Pxl
f33ef32d92 [Bug] [Bitmap] change to_bitmap to always_not_nullable (#9716) 2022-05-28 17:33:55 +08:00
4d1e926b6c [feature][config] introduce a new BE config storage_page_cache_shard_size (#9821)
Co-authored-by: gaodayue <gaodayue@bytedance.com>
2022-05-28 10:17:09 +08:00
efdb3b79a5 [feature] add zstd compression codec (#9747)
ZSTD compression is fast with a high compression ratio. It can be used to achieve a higher compression ratio
than the default Lz4f codec for storage-cost-sensitive data such as logs.

Compared to the Lz4f codec, the zstd codec gives a 35% smaller compressed size, is 30% faster on the first read (without OS page
cache) and 40% slower on the second read (with OS page cache) in the following comparison test.

test data: 25GB text log, 110 million rows
test table: test_table(ts varchar(30), log string)
test SQL: set enable_vectorized_engine=1; select sum(length(log)) from test_table
be.conf: disable_storage_page_cache = true
Setting this config disables the Doris page cache, so that the data is not all cached in memory and the test measures real decompression speed.
test result:

master branch with lz4f codec result:
- compressed size: 4.3G
- SQL first exec time(read data from disk + decompress + little computation) : 18.3s
- SQL second exec time(read data from OS pagecache + decompress + little computation) : 2.4s

this branch with zstd codec (hardcode enable it) result:
- compressed size: 2.8G
- SQL first exec time: 12.8s
- SQL second exec time: 3.4s
2022-05-27 21:56:18 +08:00
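The percentages quoted in efdb3b79a5 can be re-derived from the raw figures in its test result. This is only an arithmetic check; `pct_change` and the dictionary keys are names made up for the sketch.

```python
def pct_change(new, old):
    """Relative change of `new` versus `old`, in percent."""
    return (new - old) / old * 100.0


# Figures copied from the test result in the commit message.
lz4f = {"size_gb": 4.3, "first_read_s": 18.3, "second_read_s": 2.4}
zstd = {"size_gb": 2.8, "first_read_s": 12.8, "second_read_s": 3.4}

changes = {key: pct_change(zstd[key], lz4f[key]) for key in lz4f}
for key, value in changes.items():
    print(f"{key}: {value:+.1f}%")
```

The size and first-read numbers come out at roughly -35% and -30%, matching the message. The second read takes about 42% longer, which the message rounds to 40%.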
b2c2cdb122 [feature] Support compression prop (#8923) 2022-05-27 21:52:05 +08:00
af2cfa2db4 [fix] Fix bug of bloom filter hash value calculation error (#9802)
* Fix bug of bloom filter hash value calculation error

* fix code style
2022-05-27 20:44:26 +08:00
cbbda7857b [feature-wip](parquet-orc) Support orc scanner in vectorized engine (#9541) 2022-05-26 21:39:12 +08:00
Pxl
13c1d20426 [Bug] [Vectorized] add padding when load char type data (#9734) 2022-05-26 16:51:01 +08:00
9236c2efc9 [improvement] Show detail status code string for be http api (#9771)
1. move to_json method to common/status
2. modify related usage in http folder
2022-05-26 15:09:21 +08:00
f4dd3bf013 [bugfix] fix memleak in olapscannode(#9736) 2022-05-26 15:06:54 +08:00
24631915ed [bugfix] fix correctness for vectorized compaction (#9773) 2022-05-26 15:05:50 +08:00
cd99c24844 [Improvement] remove unused code in vectorized compaction (#9774) 2022-05-26 15:05:27 +08:00
2a11a4ab99 [feature-wip][array-type] Support more sub types. (#9466)
Please refer to #9465
2022-05-26 08:41:34 +08:00
73e31a2179 [stream-load-vec]: memtable flush only if necessary after aggregated (#9459)
Co-authored-by: weixiang <weixiang06@meituan.com>
2022-05-25 21:12:24 +08:00
8470543144 [Improvement] fix typo (#9743) 2022-05-25 19:29:01 +08:00
f5bef328fe [fix] disable transferring data larger than 2GB by brpc (#9770)
Because brpc and protobuf cannot transfer data larger than 2GB (larger sizes overflow), add a check before sending.
2022-05-25 18:41:13 +08:00
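A hedged sketch of the kind of guard f5bef328fe describes. The exact threshold used in the BE is not given in the commit message. This sketch assumes the limit is the largest signed 32-bit value, and `MAX_PAYLOAD_BYTES` and `check_payload_size` are hypothetical names.

```python
# Assumption: sizes are carried in a signed 32-bit integer, so anything
# above 2**31 - 1 bytes would overflow.
MAX_PAYLOAD_BYTES = 2**31 - 1


def check_payload_size(num_bytes):
    """Return num_bytes if it is safe to send, otherwise raise ValueError."""
    if num_bytes > MAX_PAYLOAD_BYTES:
        raise ValueError(
            f"payload of {num_bytes} bytes exceeds the "
            f"{MAX_PAYLOAD_BYTES}-byte limit"
        )
    return num_bytes
```

Failing before the send turns a silent overflow into an explicit error that the caller can report.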
2725127421 [fix] group by with two NULL rows after left join (#9688)
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-05-25 16:43:55 +08:00
ca05d1ee01 [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661)
1. Fix the negative consumption value of the LRU cache MemTracker.
2. Fix the compaction cache MemTracker not being tracked.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
2022-05-25 08:56:17 +08:00
90e8cda5f2 [Enhancement](Vectorized)build hash table with new thread, as non-vec… (#9290)
* [Enhancement][Vectorized]build hash table with new thread, as the non-vectorized engine did

edit after comments

* format code with clang format

Co-authored-by: lidongyang <dongyang.li@rateup.com.cn>
Co-authored-by: stephen <hello-stephen@qq.com>
2022-05-24 10:23:15 +08:00
6353539ef7 [bugfix]teach BufferedBlockMgr2 to track memory correctly (#9722)
The problem was introduced by e2d3d0134eee5d50b6619fd9194a2e5f9cb557dc.
2022-05-24 10:18:51 +08:00
8b7bb2d07c [bugfix]fix column reader compress codec unsafe problem (#9741)
by moving codec from shared reader to unshared iterator
2022-05-23 20:25:49 +08:00