Commit Graph

295 Commits

Author SHA1 Message Date
07e2acb2f3 [feature] Suport national secret (national commercial password) algorithm SM3/SM4 (#7464)
SM3 is password hash algorithm
SM4 is a block cipher used to replace DES / AES and other international algorithms.
2021-12-28 10:39:54 +08:00
Pxl
6d1cf599f8 [fix] DCHECK fail at BitmapValue getSizeInBytes (#7430) 2021-12-24 21:23:58 +08:00
20ef8a6e21 [feature-wip](remote storage)(step1) use a struct instead of string for parameter path, add basic remote method (#7098)
For the first, we need to make a parameter to discribe the data is local or remote.
At then, we need to support some basic function to support the operation for remote storage.
2021-12-22 22:58:23 +08:00
97749ed85b [community][chore] Modify .asf.yaml and fix BE build warning (#7439) 2021-12-21 11:06:12 +08:00
2d72c039ad [deps](openssl) upgrade openssl to 1.1.1m (#7446)
upgrade openssl to 1.1.1m, ready for support SM2 / SM3 / SM4 national secret (national commercial password) algorithm
2021-12-21 10:09:36 +08:00
7d4da7af5c [fix](rpc) fix BE crash in SendRpcResponse when high concurrency (#7413)
The response is accessed when done->Run is called in transmit_data(),
give response a default value to avoid null pointers in high concurrency.
2021-12-16 20:27:24 +08:00
0499b2211b [feat](lateral-view) Support execution of lateral view stmt (#7255)
1. Add table function node
2. Add 3 table functions: explode_split, explode_bitmap and explode_json_array
2021-12-16 10:46:15 +08:00
d3316ff567 [performance](function) Support SIMD function in some string function (#7236)
Support SIMD function in some string function:lrtim,rtrim,trim,reverse,hex
2021-12-06 10:24:26 +08:00
fc9e502b51 [improvement](brpc)(config) Support transfer RowBatch in Controller Attachment (#7164)
Transfer RowBatch in Protobuf Request to Controller Attachment,
when the maximum length of the RowBatch in the Protobuf Request is exceeded.
This can avoid reaching the upper limit of the Protobuf Request length (2G),
and it is expected that performance can be improved.
2021-12-02 11:41:38 +08:00
dd36ccc3bf [feature](storage-format) Z-Order Implement (#7149)
Support sort data by Z-Order:

```
CREATE TABLE table2 (
siteid int(11) NULL DEFAULT "10" COMMENT "",
citycode int(11) NULL COMMENT "",
username varchar(32) NULL DEFAULT "" COMMENT "",
pv bigint(20) NULL DEFAULT "0" COMMENT ""
) ENGINE=OLAP
DUPLICATE KEY(siteid, citycode)
COMMENT "OLAP"
DISTRIBUTED BY HASH(siteid) BUCKETS 1
PROPERTIES (
"replication_allocation" = "tag.location.default: 1",
"data_sort.sort_type" = "ZORDER",
"data_sort.col_num" = "2",
"in_memory" = "false",
"storage_format" = "V2"
);
```
2021-12-02 11:39:51 +08:00
d8ba6e3eb6 1. Fix an error when fetch string type field may cause malform packet error. (#7262)
This is beacuse of an const MAX_PHYSICAL_PACKET_LENGTH  in fe should be 2^24 -1,
   but it is set as 2^24 -2 by mistake.
2. Fix bitmap_to_string may failed when the result is large than 2G
2021-12-01 10:02:34 +08:00
Pxl
2445f10868 [fix](bitmap-function) fix core dump at some bitmap function (#7221) 2021-11-25 22:52:50 +08:00
c9e578032b optimize bitmap function count, use roaring cardinality method, this will more fast than current version (#7151) 2021-11-24 14:42:48 +08:00
e2d3d0134e dd a method to get doris current memory usage (#6979)
Add all memory usage check when TryConsume memory
2021-11-24 10:07:54 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
d751937828 [Optimize] Optimize mem_tracker (#6988)
1. Optimize HighWaterMarkCounter::add(), call `UpdateMax()` only if delta greater than 0
to reduce function call times

2. delete useless code lines to keep MemTracker clean
    some member datas never be set, but check its value,the if condition never meet, so clean these codes
2021-11-12 10:51:45 +08:00
632f8fcc75 [libhdfs] Add errno for hdfs writer. when no dir, hdfs writer open failed, the dir need to be created. (#7050)
1. Add errno message for hdfs writer failed.
2. When call openWrite for hdfs, the dir will be created when it doesn't exist,
2021-11-11 15:21:21 +08:00
Pxl
fc62090558 [Bug] fix Log tags empty reference core dump (#7043)
key may have been destructed when key reference is called.
2021-11-09 10:00:08 +08:00
ca8268f1c9 [Feature] Extend logger interface, support structured log output (#6600)
Support structured logging.
2021-11-07 17:39:53 +08:00
e69249c082 sub_bitmap (#6977)
Starting from the offset position, intercept the specified limit bitmap elements and return a bitmap subset.

Types of chang
2021-11-06 13:31:03 +08:00
760fc02bfe Added bprc stub cache check and reset api, used to test whether the bprc stub cache is available, and reset the bprc stub cache (#6916)
Added bprc stub cache check and reset api, used to test whether the bprc stub cache is available, and reset the bprc stub cache
add a config used for auto check and reset bprc stub
2021-11-05 09:45:37 +08:00
599ecb1f30 [Function] Add bitmap function bitmap_subset_limit (#6980)
Add bitmap function bitmap_subset_limit.
This function will return subset in specified index.
2021-11-04 12:14:47 +08:00
65ded82778 [Function] add BE bitmap function bitmap_subset_in_range (#6917)
Add bitmap function bitmap_subset_in_range.
This function will return subset in specified range (not include the range_end).
2021-11-01 11:05:19 +08:00
e8cabfff27 [S3] Support path style endpoint (#6962)
Add a use_path_style property for S3
Upgrade hadoop-common and hadoop-aws to 2.8.0 to support path style property
Fix some S3 URI bugs
Add some logs for tracing load process.
2021-11-01 10:48:10 +08:00
a842d41b87 [Function] add BE bitmap function bitmap_max (#6942)
Support bitmap_max.
2021-10-30 18:16:38 +08:00
51e210869a [ARM64] Fix some problem when compiling on ARM64 platform (#6836) (#6872)
With thirdparties 1.4.0 to 1.4.1

1. Add patch for aws-c-cal-0.4.5
2. Add some solutions for `undefined reference libpsl`
3. Move libgsasl to fix link problme of libcurl.
4. Downgrade openssl to 1.0.2k to fix problem of low version glibc
2021-10-19 13:26:02 +08:00
59017cebe6 [ARM64] Fix some problem when compiling on ARM64 platform (#6836)
1. Refactor the create method of hdfs reader & writer.

    libhdfs3 does not support arm64. So we should not support hdfs reader & writer on arm64.

2. And micro for LowerUpperImpl
2021-10-16 21:56:49 +08:00
24d38614a0 [Dependency] Upgrade thirdparty libs (#6766)
Upgrade the following dependecies:

libevent -> 2.1.12
OpenSSL 1.0.2k -> 1.1.1l
thrift 0.9.3 -> 0.13.0
protobuf 3.5.1 -> 3.14.0
gflags 2.2.0 -> 2.2.2
glog 0.3.3 -> 0.4.0
googletest 1.8.0 -> 1.10.0
snappy 1.1.7 -> 1.1.8
gperftools 2.7 -> 2.9.1
lz4 1.7.5 -> 1.9.3
curl 7.54.1 -> 7.79.0
re2 2017-05-01 -> 2021-02-02
zstd 1.3.7 -> 1.5.0
brotli 1.0.7 -> 1.0.9
flatbuffers 1.10.0 -> 2.0.0
apache-arrow 0.15.1 -> 5.0.0
CRoaring 0.2.60 -> 0.3.4
orc 1.5.8 -> 1.6.6
libdivide 4.0.0 -> 5.0
brpc 0.97 -> 1.0.0-rc02
librdkafka 1.7.0 -> 1.8.0

after this pr compile doris should use build-env:1.4.0
2021-10-15 13:03:04 +08:00
332ba4cded [config] use thrift_rpc_timeout_ms config replace hard code value (#6637)
use thrift_rpc_timeout_ms config to replace hard code value
2021-09-16 10:22:57 +08:00
794d4e7ace fix insert null as string type may coredump (#6615) 2021-09-13 12:30:34 +08:00
b3ae607fe9 [Sprak-Doris-Connector] support boolean data type (#6601)
1. Support boolean data type for spark-doris-connector because Doris has previously supported the boolean data type
2. Bug-Fix for the Doris BE core when spark request data from be
2021-09-12 10:07:23 +08:00
a4fbad3736 [BUG][Profile] Fixed the problem that BE's profile could not add chil… (#6268)
* [BUG][Profile] Fixed the problem that BE's profile could not add child profile in the specified correct location

bug:
    runtime_profile()->add_child(build_phase_profile, false, nullptr);
    child profile will add to second location

* Update runtime_profile.cpp
2021-09-10 09:53:51 +08:00
39bb669dcb [BUG] fix extra memory copy in bitmap value (#6599) 2021-09-10 09:52:41 +08:00
4f744333c2 fix some core in local test: (#6594)
1. insert very large string value may coredump
    2. some analitic functiuon and agg function result may be incorrect
    3. string compare may be coredump when string type is too large
    4. string type in delete condition can not process correctly
    5. add text/blob as alias of string to compitable with mysql
    6. fix string type min/max agg may  process incorrectly
2021-09-10 09:52:03 +08:00
74ddea8d83 [Optimize] Remove some unused code to reduce lock contention (#6566)
1. Remove global runtime profile counter
2. Remove unused thread token register
2021-09-07 11:56:12 +08:00
57199955d6 [Compaction][ThreadPool]Support adjust compaction threads num at runtime (#5781)
* adjust thread number of compaction thread pool dynamically

Co-authored-by: weizuo <weizuo@xiaomi.com>
2021-09-02 10:01:44 +08:00
0393c9b3b9 [Optimize] Support send batch parallelism for olap table sink (#6397)
* Support send batch parallelism for olap table sink

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-30 11:03:09 +08:00
acc5fd2f21 [BUG] Fix string type cast bug and runtime filter may core when not support avx2 (#6495)
* fix string type cast bug and runtime filter instructions may not support

* add arm support
2021-08-26 09:14:31 +08:00
7e30b28f3a [Optimize] Speed up converting the data of other types to string in mysql_result_writer (#6384)
Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-24 22:30:58 +08:00
fa382f8602 [Bug][MemLimit] Modify the memory limit of storage page cache (#6451)
This CL mainly changes:

1. the `storage_page_cache_limit` is based on config `mem_limit`

    the default is 20% of `mem_limit`. 

2. the `buffer_pool_limit` is based on config `mem_limit`

    the default is 20% of `mem_limit`. 

3. the `buffer_pool_clean_pages_limit` is based on config `buffer_pool_limit`

    the default is 50% of `buffer_pool_limit`

4. Fix some show bugs of lru cache hit ratio and usage ratio
5. Fix a create view bug that `notEvalNondeterministicFunction` should be reset after analyze.
2021-08-19 14:16:53 +08:00
66a7a4b294 [Feature] Support exact percentile aggregate function (#6410)
Support to calculate the exact percentile value array of numeric column `col` at the given percentage(s).
2021-08-18 15:56:06 +08:00
8738ce380b Add long text type STRING, with a maximum length of 2GB. Usage is similar to varchar, and there is no guarantee for the performance of storing extremely long data (#6391) 2021-08-18 09:05:40 +08:00
9216735cfa [New Featrue] Support Vectorization Execution Engine Interface For Doris (#6329)
1. FE vectorized plan code
2. Function register vec function
3. Diff function nullable type
4. New thirdparty code and new thrift struct
2021-08-11 14:54:06 +08:00
f772649535 [Optimize] Optimize lock when check error storage (#6321)
1. `StorageEngine::_delete_tablets_on_unused_root_path` will try to obtain tablet shard write lock in `TabletManager`
```
StorageEngine::_delete_tablets_on_unused_root_path
  TabletManager::drop_tablets_on_error_root_path
    obtain each tablet shard's write lock
```
2. `TabletManager::build_all_report_tablets_info` and other methods will obtain tablet shard read lock frequently.

So, `StorageEngine::_delete_tablets_on_unused_root_path` will hold `_store_lock` for a long time.
This will make it difficult for other threads to get write `_store_lock`, such as `StorageEngine::get_stores_for_create_tablet`

`drop_tablets_on_error_root_path` is a small probability event, `TabletManager::drop_tablets_on_error_root_path` should return when its param `tablet_info_vec` is empty
2021-08-07 21:30:49 +08:00
d1007afe80 Use fmt and std::from_chars to make convert integer to string and convert string to integer more efficient (#6361)
* [Optimize] optimize the speed of converting integer to string

* Use fmt and std::from_chars to make convert integer to string and convert string to integer more efficient

Co-authored-by: caiconghui <caiconghui@xiaomi.com>
2021-08-04 10:55:19 +08:00
1454aacd69 [Metric] Add metrics to monitor size of queued tasks in load thread pool (#6306)
(1) Add metrics to monitor the size of queued tasks in load thread pool.
(2) Change some log level to VLOG_NOTICE
2021-07-27 13:41:44 +08:00
776df2effc [BUG][stack-buffer-overflow] fix overflow while calculate hash code in ArrayType and fix some warning 2021-07-27 13:41:00 +08:00
13ef2c9e1d [Function][Enhance] lower/upper case transfer function vectorized (#6253)
Currently, the function lower()/upper() can only handle one char at a time.
A vectorized function has been implemented, it makes performance 2 times faster. Here is the performance test:

The length of char: 26, test 100 times
vectorized-function-cost: 99491 ns
normal-function-cost: 134766 ns

The length of char: 260, test 100 times
vectorized-function-cost: 179341 ns
normal-function-cost: 344995 ns
2021-07-26 09:38:07 +08:00
327e31c227 [Feature] Support setting concurrency for thread pool token (#6237)
Now we can submit a group of tasks using thread pool token, and limit
the max concurrency of this task group
2021-07-21 12:30:43 +08:00
19cd42ccbd [BUG] avoid std::function copy in client cache (#6186)
* [BUG] avoid std::function copy in client cache

* Refactor ClientFactory Name
2021-07-16 09:20:28 +08:00