3467 Commits

Author SHA1 Message Date
e6a5d3375e [Feature-WIP](inverted index) add chinese analyzer for inverted index reader (#15998)
add chinese analyzer for inverted index reader
dependency pr: #14211 #15807 #15823
2023-01-17 20:20:40 +08:00
6be0cc252a [fix](BrokerFileReader) fix Compile error #16018 2023-01-17 19:53:06 +08:00
95397ff05d [refactor](array) remove dependency of ColumnBlock, ColumnBlockView (#16002)
change to vectorized::MutableColumnPtr
2023-01-17 19:16:16 +08:00
d5a3e8df3a [Exec](opt) Opt the vexplode_split function performance (#15945) 2023-01-17 19:02:57 +08:00
bbdf40b6bd [Enhancement](Push Handle) use VParquetScanner in PushHandle (#15980)
* use VParquetScanner in PushHandle

* delete ParquetScanner
2023-01-17 16:21:04 +08:00
151ae71761 [fix](be)fix bug of VSetOperationNode::release_resource (#15997)
A child class that overrides the parent's method should still call "ExecNode::release_resource(state)".
2023-01-17 16:16:25 +08:00
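The rule stated in 151ae71761 (an overriding method should chain to the parent's `release_resource`) can be sketched as follows. This is a minimal illustration, not the actual Doris code: `RuntimeState`, `SetOperationNode`, and `release_order` are hypothetical stand-ins, and only the name `ExecNode::release_resource(state)` comes from the commit message.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the runtime state; it only records which
// cleanup steps ran so the order can be observed.
struct RuntimeState {
    std::vector<std::string> released;
};

// Hypothetical base class modelled on the name used in the commit message.
class ExecNode {
public:
    virtual ~ExecNode() = default;
    virtual void release_resource(RuntimeState* state) {
        state->released.push_back("ExecNode");
    }
};

// Hypothetical child class that overrides the parent's method.
class SetOperationNode : public ExecNode {
public:
    void release_resource(RuntimeState* state) override {
        state->released.push_back("SetOperationNode"); // child-specific cleanup
        ExecNode::release_resource(state);             // chain to the parent
    }
};

// Helper that runs the cleanup through a base pointer and returns the order.
inline std::vector<std::string> release_order() {
    RuntimeState state;
    SetOperationNode node;
    ExecNode* base = &node;
    base->release_resource(&state);
    return state.released;
}
```

Without the explicit `ExecNode::release_resource(state)` call, only the child's cleanup would run, and whatever the parent releases would be skipped.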
d062ca2944 [refactor](vectorized) remove unnecessary vectorization check (#15984) 2023-01-17 12:21:46 +08:00
7d34512501 [Bug](pipeline) Fix DCHECK failure (#15928) 2023-01-17 12:01:20 +08:00
9f106161a7 [Bug](join) Fix null aware anti join error in fuzzy mode (#15987) 2023-01-17 11:32:16 +08:00
9755358787 [fix](brokerload) fix be core dump caused by broker load (#15874) 2023-01-17 11:21:13 +08:00
0ab0479633 [Compile](lzo) fix lzo decompressor compiler error (#15956) 2023-01-17 09:56:07 +08:00
b1caa68706 [Feature-WIP](inverted index) inverted index reader's implementation, and add mysql_fulltext regression case to test fulltext query (#15823)
Issue Number: Step2 of DSIP-023: Add inverted index for full text search
implementation of inverted index reader

dependency pr: #14211 #15807 #15821
2023-01-17 09:13:56 +08:00
0057243f54 [improvement](reader) use union merge when rowsets are non-overlapping (#15749) 2023-01-16 21:53:18 +08:00
65a4c8b163 [refactor] refactor segment writer (#15705)
Co-authored-by: zhoubintao <1229701101@qq.com>
2023-01-16 21:50:21 +08:00
5521c7a236 [fix](load) fix that tablet channel doesn't set received rows for verify the number of rows (#15961) 2023-01-16 19:46:59 +08:00
bdec4d5ac2 [enhancement](profile) add read columns to scanner profile (#15902) 2023-01-16 19:32:46 +08:00
97fcad76f8 [enhancement](memtracker) Improve readability (#15716) 2023-01-16 16:30:35 +08:00
b7f43441e3 [enhancement](load) change the publish version log to VLOG_CRITICAL (#15673) 2023-01-16 16:22:33 +08:00
63d48564ed [fix](datetimev2) fix datetimev2 error with T (#15915)
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
2023-01-16 15:30:48 +08:00
Pxl
81bab55d43 [Bug](function) catch function calculation error on aggregate node to avoid core dump (#15903) 2023-01-16 11:21:28 +08:00
151fdc224e [Fix](inverted index) fix compilation error for inverted index compound directory (#15946)
fix compilation error for inverted index compound directory

```
be/src/olap/rowset/segment_v2/inverted_index_compound_directory.cpp:249:32: error: comparison of unsigned expression in '< 0' is always false [-Werror=type-limits]
  249 |         if (h->_reader->size() < 0) {
      |             ~~~~~~~~~~~~~~~~~~~^~~
```
2023-01-16 08:59:55 +08:00
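The diagnostic quoted in 151fdc224e fires because an unsigned value can never be less than zero. The sketch below illustrates why. `DemoReader`, `wrapped_minus_one`, and `has_data` are hypothetical names, and the replacement check is only one possible approach, not necessarily what the actual patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical reader for illustration; size() returns an unsigned type,
// which is what the compiler diagnostic implies about the real one.
struct DemoReader {
    std::size_t bytes;
    std::size_t size() const { return bytes; }
};

// Converting -1 to an unsigned type wraps to the largest value,
// so an unsigned size can never compare less than zero.
inline std::size_t wrapped_minus_one() {
    return static_cast<std::size_t>(-1);
}

// One possible replacement check (an assumption, not the actual patch):
// treat a zero size as "nothing to read" instead of testing `< 0`.
inline bool has_data(const DemoReader& reader) {
    return reader.size() > 0;
}
```

With `-Werror=type-limits`, a comparison such as `reader.size() < 0` is rejected at compile time, which is why this kind of dead check surfaces once stricter warnings are enabled.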
Pxl
b727033906 [Chore](build) enable -Wextra and remove some -Wno (#15760)
enable -Wextra and remove some -Wno
2023-01-15 10:40:35 +08:00
5af7bcaa55 [Bug](decimalv3) Fix missing precision and scale in predicates (#15930) 2023-01-15 00:01:48 +08:00
58c520dbfd [Feature](remote) Cooldown cold data to object storage only one replica (#15832) 2023-01-14 23:58:00 +08:00
0206e0bc57 [Feature](inverted index) implementation of inverted index writer for numeric types, using bkd index (#15918)
Step3 of DSIP-023: Add inverted index for full text search
implementation of inverted index writer for numeric types, using bkd index
dependency pr: #14207 #15807 #15821
2023-01-14 21:06:51 +08:00
98c74f9ab8 [improvement](signal) add tid during core dump,the tid is equal to tid in be.INFO (#15893)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-01-14 18:40:02 +08:00
84d6938a73 [Bug](pipeline) Fix BE crash caused by pipeline (#15890)
* [Bug](pipeline) Fix BE crash caused by pipeline

* update
2023-01-14 18:37:19 +08:00
c4475a8dbc [Enhencement](jdbc scanner) add profile for jdbc scanner (#15914) 2023-01-14 10:28:59 +08:00
313e14d220 [Bugfix] (ROLLUP) fix the coredump when add rollup by link schema change (#15654)
Because the rollup has the same keys in the same order, BE performs a linked schema change: the base tablet's segments are linked to the new rollup tablet. However, unique ids in the base tablet start from 0, and so do those in the rollup tablet. In this case, unique id 4 is column 'city' in the base table but 'cost' in the rollup tablet. BE then decodes the varchar page as a bigint page and core dumps.

If a rollup is added by linked schema change, the rollup is redundant: it brings no additional benefit and wastes storage space. So it needs to be rejected.
2023-01-14 10:20:07 +08:00
d8990522fb [conf](compaction) enable vertical_compaction ordered_data_compaction (#14945) 2023-01-13 23:12:42 +08:00
ecb5aea182 [Feature-WIP](inverted index) inverted index writer's implementation (#15821) 2023-01-13 21:30:44 +08:00
514de605b6 [Bug](predicate) add double predicate creator (#15762)
Add a double predicate creator, the same as the integer predicate creator.
2023-01-13 18:34:09 +08:00
049f8ad2f9 [Bug](sort)fix merge sorter might div zero when block bytes less than block rows (#15859)
If a block's bytes are fewer than its rows, integer division makes avg_size_per_row zero, which ends up dividing by zero in the following logic.
2023-01-13 18:33:40 +08:00
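The failure mode described in 049f8ad2f9 can be reproduced with plain integer arithmetic. The sketch below is an illustration under assumptions: only the name `avg_size_per_row` comes from the commit message, while the function signatures, `rows_per_buffer`, and the clamp-to-one guard are hypothetical and may differ from the real fix.

```cpp
#include <cassert>
#include <cstddef>

// Average bytes per row using integer division. When block_bytes is
// smaller than block_rows, the result truncates to 0.
inline std::size_t avg_size_per_row(std::size_t block_bytes, std::size_t block_rows) {
    return block_rows == 0 ? 0 : block_bytes / block_rows;
}

// Hypothetical follow-up computation that divides by the average.
// Clamping the average to at least 1 keeps the division defined.
inline std::size_t rows_per_buffer(std::size_t buffer_bytes,
                                   std::size_t block_bytes,
                                   std::size_t block_rows) {
    std::size_t avg = avg_size_per_row(block_bytes, block_rows);
    if (avg == 0) {
        avg = 1; // guard against dividing by zero
    }
    return buffer_bytes / avg;
}
```

For example, a block of 3 bytes and 10 rows yields an average of 0, so any unguarded division by that average would be undefined behavior.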
1489e3cfbf [Fix](file system) Make the constructor of XxxFileSystem a private method (#15889)
Since FileSystem inherits from std::enable_shared_from_this, it is dangerous to create a FileSystem object that is not owned by a shared pointer.
To avoid this, make the constructor of XxxFileSystem private and use the static method create(...) to get a new FileSystem object.
2023-01-13 15:32:16 +08:00
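The pattern described in 1489e3cfbf (private constructor plus a static `create(...)`) can be sketched as follows. `DemoFileSystem`, its `root` member, and `self()` are hypothetical and stand in for the real XxxFileSystem classes, whose signatures are not shown in the commit message.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class for illustration; not the actual Doris FileSystem.
class DemoFileSystem : public std::enable_shared_from_this<DemoFileSystem> {
public:
    // The only way to obtain an instance, so every instance is owned
    // by a std::shared_ptr from the start.
    static std::shared_ptr<DemoFileSystem> create(std::string root) {
        // std::make_shared cannot call a private constructor, so use new.
        return std::shared_ptr<DemoFileSystem>(new DemoFileSystem(std::move(root)));
    }

    // Safe because instances can only be created through create().
    std::shared_ptr<DemoFileSystem> self() { return shared_from_this(); }

    const std::string& root() const { return _root; }

private:
    explicit DemoFileSystem(std::string root) : _root(std::move(root)) {}

    std::string _root;
};
```

Calling `shared_from_this()` on an object that is not owned by a `std::shared_ptr` throws `std::bad_weak_ptr` since C++17 and is undefined behavior before that, so making the constructor private turns a runtime hazard into a compile-time error.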
34bb9cd5d3 [fix](parquet-reader) fix coredump when load datetime data to doris from parquet (#15794)
`date_time_v2` checks the scale when a datetimev2 value is constructed:
```
LOG(FATAL) << fmt::format("Scale {} is out of bounds", scale);
```

This [PR](https://github.com/apache/doris/pull/15510) fixed this issue, but the parquet reader does not use the constructor to create `TypeDescriptor`, leading to `scale = -1` when reading datetimev2 data.
2023-01-13 11:51:11 +08:00
b1fb1277dd [fix](bitmap) fix bitmap iterator comparison error (#15779)
Fix the bug that bitmap.begin() == bitmap.end() is always true when the bitmap contains a single value.
2023-01-13 11:37:07 +08:00
9468711f9f [Bug](join) fix bug null aware left anti join not correct result (#15841) 2023-01-13 10:18:05 +08:00
688a0bb96a [feature](multi-catalog) support clickhouse jdbc catalog (#15780) 2023-01-13 10:07:22 +08:00
16862d9b43 [refactor](remove unused code) remove buffer pool and disk io mgr (#15853)
* [refactor](remove buffer pool and disk io mgr) remove unused code


Co-authored-by: yiguolei <yiguolei@gmail.com>
2023-01-13 09:42:58 +08:00
bae29157aa [fix](olap) dictionary cannot be sorted after inserting some null values (#15829) 2023-01-13 09:28:55 +08:00
730571e386 [fix](sort spill) fix bug of failed to create spilled file (#15864)
Also increase buffered block size when it has started to spill.
2023-01-13 09:23:26 +08:00
174e5e601f [refactor](rpc fn) decouple vectorized remote function from row-based one (#15871) 2023-01-13 09:21:33 +08:00
0fbdf8e3e1 [Refactor](table function) Decouple vectorized table functions from non-vectorized ones (#15772) 2023-01-12 15:08:21 +08:00
ef0e0cf68d [enhancement](load) refine the reduce memory policy when process memory is nearly full (#15685)
If process memory is almost full but data loads do not consume more than 5% (50% * 10%) of total memory, we don't need to reduce the memory of load jobs.
2023-01-12 14:43:33 +08:00
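The policy described in ef0e0cf68d can be sketched as a single predicate. This is a rough illustration under stated assumptions: the function name, its parameters, and the way "nearly full" is passed in as a flag are all hypothetical. Only the 5% (50% * 10%) threshold comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate: should load jobs be asked to release memory?
// process_nearly_full: whether process memory is close to its limit
//                      (how this is detected is outside this sketch).
// load_bytes:          memory currently consumed by data loads.
// total_bytes:         total memory available to the process.
inline bool should_reduce_load_memory(bool process_nearly_full,
                                      std::int64_t load_bytes,
                                      std::int64_t total_bytes) {
    if (!process_nearly_full) {
        return false; // no memory pressure, nothing to do
    }
    // 50% * 10% = 5% of total memory, written with integers
    // to avoid floating-point rounding.
    return load_bytes * 100 > total_bytes * 5;
}
```

With this shape, loads that use 5% of total memory or less are left alone even under memory pressure, because flushing them would free too little to matter.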
7441b4dc96 [Feature](function) Support width_bucket function (#14396) 2023-01-12 13:59:21 +08:00
92dd7c442a [enhancement](unique key) disable concurrent flush memtable for unique key (#15802) 2023-01-12 12:10:50 +08:00
791604ba1f [log](vlog) improve vlog print for query TExecPlanFragmentParams (#15806)
* [log] improve vlog print for query TExecPlanFragmentParams

* improvement
2023-01-12 09:27:59 +08:00
f3ef3f7e15 [fix](sink) fix memory leak in VNodeChannel (#15834) (#15835)
Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
2023-01-12 09:24:51 +08:00
98d69d1568 [fix](compile) fix vscan node compile error (#15805)
conflict merge of #15604 and #15618
2023-01-11 15:08:46 +08:00
fe5e5d2bf4 [refactor] separate agg and flush in memtable (#15713) 2023-01-11 10:07:34 +08:00