88e08a92d8
[fix](array-type) fix the wrong result when importing array elements with double quotes ( #12786 )
...
Co-authored-by: hucheng01 <hucheng01@baidu.com>
2022-10-13 23:07:19 +08:00
de4315c1c5
[feature](function) support initcap string function ( #13193 )
...
support `initcap` string function
2022-10-13 21:31:44 +08:00
cb300b0b39
[feature](agg) support any,any_value agg functions. ( #13228 )
2022-10-13 18:31:19 +08:00
baf2689610
[Improvement](join) compute hash values by vectorized way ( #13335 )
2022-10-13 16:04:58 +08:00
87793b7c00
[bugfix](datatimev2) fix value column losing precision and scale ( #13233 )
...
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-10-13 15:39:53 +08:00
c1ed7d4d7d
[Bug](function) fix core dump when CASE WHEN has 1000 conditions #13315
2022-10-13 14:37:03 +08:00
830183984a
[fix](hash)update_hashes_with_value method should handle if input value is null ( #13332 )
...
* [fix](hash)update_hashes_with_value method should handle if input value is null
* remove unnecessary xxHash64NullWithSeed
2022-10-13 14:36:01 +08:00
3e84c04195
[Bug](predicate) fix nullptr in scan node ( #13316 )
2022-10-13 12:14:42 +08:00
9b590ac4cb
[improvement](olap) cache value of has_null in ColumnNullable ( #13289 )
2022-10-13 09:12:02 +08:00
c494ca0ed4
[enhancement](memtracker) Print query memory usage log every second when memory_verbose_track is enabled ( #13302 )
2022-10-13 09:11:23 +08:00
d430aec3ae
[Bug](bloomfilter) fix concurrency bug caused by bloom filter ( #13306 )
2022-10-13 09:10:02 +08:00
a77808e103
[Enhancement](function) optimize decimal minus and plus #13320
2022-10-13 09:00:05 +08:00
d63a80eaba
[fix](bitmap_intersect) fix bitmap_intersect result error ( #13298 )
2022-10-12 19:12:11 +08:00
dfe308f501
[Improvement](join) refine prefetch strategy ( #13286 )
2022-10-12 19:02:06 +08:00
4fc7a048d2
[feature-wip](parquet-reader) fix string test and support decimal64 ( #13184 )
...
1. Refactor the argument list of the parquet min max filter, passing the parquet type for min max value parsing
2. Fix the filter of string min max
Co-authored-by: jinzhe <jinzhe@selectdb.com>
2022-10-12 16:52:28 +08:00
bb4414e303
[feature-wip](multi-catalog) optimize parquet profile & add null map timer ( #13257 )
...
Use indentation to make `ParquetReader`'s profile more readable
Add `ParquetReader.DecodeNullMapTime` to show the time of parsing `NullMap` for `NullableColumn`
```
VFILE_SCAN_NODE (id=0):(Active: 279.62ms, % non-child: 85.83%)
   - FileReadBytes: 2.36 MB
   - FileReadCalls: 20
   - FileReadTime: 5.686ms
   - MaxScannerThreadNum: 1
   - NewlyCreateFreeBlocksNum: 125
   - NumScanners: 1
   - ParquetReader: 0ns
     - ColumnReadTime: 259.946ms
     - DecodeDictTime: 0ns
     - DecodeHeaderTime: 437.707us
     - DecodeLevelTime: 30.101us
     - DecodeNullMapTime: 53.295ms
     - DecodeValueTime: 62.607ms
     - DecompressCount: 511
     - DecompressTime: 1.159ms
     - FilteredBytes: 0.00
     - FilteredGroups: 0
     - FilteredRowsByGroup: 0
     - FilteredRowsByPage: 0
     - ParseMetaTime: 22.517ms
     - ReadBytes: 2.36 MB
     - ReadGroups: 20
```
2022-10-12 16:51:06 +08:00
b7621e1615
[feature-wip](new-scan) support csv reader ( #13282 )
...
Issue Number: close #12574
This PR adds CsvReader, which implements the GenericReader interface, to support reading CSV format files.
2022-10-12 16:22:13 +08:00
4a5095f00d
[cleanup](config) remove unused config push_write_mbytes_per_sec ( #13290 )
2022-10-12 15:58:04 +08:00
1bd14f1d82
[feature-wip](jsonb) jsonb parse function and load ( #13129 )
...
Add a function that parses a JSON string into jsonb format, and use it to support stream load.
2022-10-12 13:56:37 +08:00
239e5b9943
[enhancement](storage) set the segment cache capacity according to the open file limit of the process ( #13269 )
2022-10-12 12:10:58 +08:00
af7b6524f2
add a hide config to hide configs in the webserver for safety. ( #13255 )
2022-10-12 10:27:09 +08:00
89b295c6cc
[enhancement](memory) Print memory usage log when memory allocation fails ( #13301 )
2022-10-12 10:08:25 +08:00
16999ef02d
[Vectorized][Function] support date_trunc and countequal function ( #13039 )
2022-10-12 10:01:09 +08:00
5c68f69362
[improvement](config) set enable_local_exchange default value to true ( #13292 )
2022-10-12 09:07:24 +08:00
df54c6b63a
[enhancement](memtracker) Add independent and unique scanner mem tracker for each query ( #13262 )
2022-10-11 19:47:12 +08:00
334708dc8c
[fix](memory): avoid coredump when list pointer is null ( #12919 )
2022-10-11 16:00:23 +08:00
e8e171e0a3
[improvement](log) limit the number of log messages about disabled auto compaction ( #13113 )
2022-10-11 15:52:56 +08:00
1724a91f53
[Bug](predicate) Cover all const predicates in scan node ( #13238 )
...
For a vectorized expression that satisfies vexpr->is_constant(), a const column is expected to be returned.
But we still don't cover all predicates for const expressions.
For example, for the query SELECT col FROM tbl WHERE 'PROMOTION' LIKE 'AAA%', the LIKE predicate returns a ColumnVector that contains a single value.
This PR covers all const predicates in the scan node, whether or not they return a const column.
2022-10-11 15:49:53 +08:00
4e4f8afa28
[fix](array-type) fix get_data_at for zero element array #13225
...
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-11 15:41:34 +08:00
606b514329
[fix](olap) fix core dump caused by LikeColumnPredicate with nullable column ( #13250 )
2022-10-11 15:38:55 +08:00
c1ce48ffe4
[fix](new-scann) scanner may be marked close twice ( #13263 )
2022-10-11 15:37:15 +08:00
5757bbc9f3
fix BE OOM when calling replace with an empty old string ( #13220 )
2022-10-10 15:58:12 +08:00
86d55dd79c
[Improvement](like function) avoid converting const column to full column ( #13214 )
2022-10-10 14:19:46 +08:00
a8535e91af
[Improvement](runtimefilter) DO NOT allocate memory for bbf in prepare phase ( #13207 )
2022-10-10 14:19:33 +08:00
bdcb600f3d
[Bug](load) fix core dump on big block load ( #13014 )
2022-10-10 12:38:32 +08:00
1cd4e5cec6
refactor insert_xxx functions ( #13088 )
...
As mentioned in #13074 , there is a problem in ColumnVector<int>::insert_many_in_copy_way.
Column::insert_xxx functions append data, so they should reserve or resize before appending.
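The "resize before append" rule can be illustrated with a minimal sketch. `IntColumnSketch` and `insert_many` are hypothetical names backed by a plain `std::vector`; this is not the Doris `ColumnVector` code, only the general pattern under that assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a column's backing storage; not Doris's ColumnVector.
struct IntColumnSketch {
    std::vector<int> data;

    // Append `num` values from `src`. The buffer is grown first, so the copy
    // below never writes past the end of the allocation.
    void insert_many(const int* src, std::size_t num) {
        if (num == 0) {
            return;
        }
        const std::size_t old_size = data.size();
        data.resize(old_size + num);  // grow first
        std::memcpy(data.data() + old_size, src, num * sizeof(int));  // then copy
    }
};
```

Copying into `data.data() + old_size` without the preceding `resize` would write outside the vector's storage, which is the kind of overflow the commit guards against.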
Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>
2022-10-10 11:54:27 +08:00
20b583c91e
[Bug](array-type) Fix memory buffer overflow ( #13074 )
2022-10-10 11:42:13 +08:00
935ef5a598
[feature-wip](new-scan) Add new ES scanner and new ES scan node #13027
2022-10-10 09:56:38 +08:00
dd089259be
[feature-wip](multi-catalog) Optimize the performance of boolean & dictionary decoding ( #13212 )
...
Generate vector for dictionary data.
Decode boolean values in batch.
2022-10-10 08:41:11 +08:00
3dc4dc6d43
[compaction](http_action) enable BE to run manual compaction concurrently ( #13219 )
...
In some cases, we need to run manual compaction via the HTTP interface
concurrently, so we remove the mutex. The tablet's compaction lock
is enough to prevent concurrent compaction within a tablet.
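A rough sketch of the idea, assuming one lock per tablet and no process-wide mutex. `TabletSketch`, `compaction_lock` and `run_manual_compaction` are illustrative names, not the actual Doris classes or HTTP action code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a tablet; the member name is illustrative only.
struct TabletSketch {
    std::mutex compaction_lock;
};

// Returns true if the compaction ran, false if another compaction already
// holds this tablet's lock. There is no global mutex, so requests for
// different tablets never block each other.
template <typename Fn>
bool run_manual_compaction(TabletSketch& tablet, Fn&& do_compaction) {
    std::unique_lock<std::mutex> guard(tablet.compaction_lock, std::try_to_lock);
    if (!guard.owns_lock()) {
        return false;  // this tablet is already being compacted
    }
    do_compaction();
    return true;
}
```

With this shape, two requests for the same tablet are still serialized or rejected, while requests for different tablets can proceed at the same time.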
Co-authored-by: yixiutt <yixiu@selectdb.com>
2022-10-10 08:33:18 +08:00
15c7c0b754
[chore](release build) copy license and notice file to output folder and strip debug info from meta tool ( #13222 )
...
* [chore](release build) copy license and notice file to output folder and strip debug info from meta tool
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-10-10 08:31:34 +08:00
7b2fdd26a1
[schema change](fix) fix coredump of schema change ( #13183 )
...
When schema change and compaction are executing simultaneously, both
nullable and not nullable data can be read for the same column, so we need to
reset _nullmap for each Block when converting Block data, or else Column
case will be wrong.
2022-10-09 19:44:00 +08:00
fc711d89c8
[fix](projections) Open the project expressions properly. ( #13162 )
...
In the current 'ExecNode::open' function, 'open(_projections)' is unreachable, which might cause serious crashes. (#13150)
2022-10-09 18:43:45 +08:00
89514fc964
[fix](rowset) fix that rowset writer doesn't process the return value, which may result in data loss ( #13189 )
2022-10-09 17:10:11 +08:00
dc2d33298b
[chore](be config) remove config use_mmap_allocate_chunk #13196
...
This config is never used online, and there are bugs if it is enabled, so I removed the config and its related tests.
Co-authored-by: yiguolei <yiguolei@gmail.com>
2022-10-09 16:19:59 +08:00
f373b22dcf
[fix](string) Fix over-allocated memory for string type ( #13167 )
...
For string/varchar/text types, the length field in `ColumnMetaPB` is fixed at 2GB.
We don't actually have to allocate 2GB for every string type, because we
reallocate the precise size of memory for the string in
`WrapperField::from_string()`:
```
Status from_string(const std::string& value_string, const int precision = 0,
                   const int scale = 0) {
    if (_is_string_type) {
        if (value_string.size() > _var_length) {
            Slice* slice = reinterpret_cast<Slice*>(cell_ptr());
            slice->size = value_string.size();
            _var_length = slice->size;
            _string_content.reset(new char[slice->size]);
            slice->data = _string_content.get();
        }
    }
    return _rep->from_string(_field_buf + 1, value_string, precision, scale);
}
```
2022-10-09 14:14:39 +08:00
245490d6b7
[Enhancement](runtime filter) optimize for runtime filter ( #12856 )
...
optimize for runtime filter
2022-10-09 14:11:03 +08:00
9e42804298
[feature-wip](unique-key-merge-on-write) unique key with merge on write table support schema change ( #12886 )
2022-10-09 11:31:53 +08:00
671dc93035
[feature-wip](unique-key-merge-on-write) fix that versions of multiple replicas are inconsistent when rebalance ( #12363 )
2022-10-09 11:31:27 +08:00
b8b18e5153
[enhancement](array-type) Handle cast empty string value to array ( #13028 )
...
Handle empty values between two commas when casting a string to an array type.
before:
mysql> select cast("[a,b,c,,,,]" as array<string>);
+------------------------------------+
| CAST('[a,b,c,,,,]' AS ARRAY<TEXT>) |
+------------------------------------+
| ['a', 'b', 'c', ',', ',']          |
+------------------------------------+
1 row in set (0.01 sec)
after:
mysql> select cast("[a,b,c,,,,]" as array<string>);
+------------------------------------+
| CAST('[a,b,c,,,,]' AS ARRAY<TEXT>) |
+------------------------------------+
| ['a', 'b', 'c', '', '', '']        |
+------------------------------------+
1 row in set (0.01 sec)
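The "after" behavior can be reproduced with a small sketch. `split_array_literal` is a hypothetical helper, not the Doris cast code. It assumes that every comma closes an element (possibly an empty one) and that nothing is emitted for an empty tail after the last comma, which is what the six-element result above suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the Doris implementation: splits the text of an
// array literal such as "[a,b,c,,,,]" into its elements.
std::vector<std::string> split_array_literal(const std::string& text) {
    std::vector<std::string> elements;
    if (text.size() < 2 || text.front() != '[' || text.back() != ']') {
        return elements;  // not an array literal; the sketch returns nothing
    }
    std::string current;
    for (std::size_t i = 1; i + 1 < text.size(); ++i) {
        if (text[i] == ',') {
            elements.push_back(current);  // an empty `current` becomes ''
            current.clear();
        } else {
            current.push_back(text[i]);
        }
    }
    if (!current.empty()) {
        elements.push_back(current);  // last element when there is no trailing comma
    }
    return elements;
}
```

Under these assumptions, "[a,b,c,,,,]" yields a, b, c and three empty strings, and "[a,b,c]" yields three elements.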
2022-10-08 21:45:42 +08:00