doris

Author	SHA1	Message	Date
lihangyu	3894de49d2	[Enhancement](topn) support two phase read for topn query (#15642 ) This PR optimizes TopN queries such as `SELECT * FROM tableX ORDER BY columnA ASC/DESC LIMIT N`. A TopN plan is composed of a SortNode and a ScanNode. When the table is wide (100+ columns), the ORDER BY clause usually touches only a few columns, yet the ScanNode must scan all data from the storage engine even if the limit is very small. This can cause heavy read amplification, so this PR divides a TopN query into two phases: 1. The first phase reads only `columnA` from the storage engine, along with an extra RowId column called `__DORIS_ROWID_COL__`. The other columns are pruned from the ScanNode. 2. The second phase runs in the ExchangeNode because it is the central node for TopN nodes in the cluster. The ExchangeNode issues an RPC to the other nodes with the RowIds from the first phase (already sorted and limited by the SortNode) and reads the matching rows one by one from the storage engine. After the second phase, the Block contains all the data needed for the query.	2023-01-19 10:01:33 +08:00
pengxiangyu	c43edbdfea	[bug](cooldown)fix bug for single cooldown (#16040 ) * fix bug for single cooldown * fix bug for single cooldown	2023-01-19 08:03:32 +08:00
HappenLee	ee76b9796c	[Bug](regresstest) BE Crash in DEBUG mode run regress test (#16042 )	2023-01-18 17:58:16 +08:00
Gabriel	95c91fab2e	[refactor](vec) delete non-vec runtime filter (#16016 ) * [refactor](vec) delete non-vec runtime filter * update	2023-01-18 17:49:20 +08:00
camby	bac2adfc74	[refractor](schema) refractor schema::get_predicate_column_ptr (#16043 ) * refractor Schema::get_predicate_column_ptr * update code format Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2023-01-18 17:47:37 +08:00
yiguolei	d257059e6b	[refactor](remove hadoop dpp) remove hadoop dpp code since it is not used (#16009 )	2023-01-18 15:01:04 +08:00
yiguolei	42b5d17fa1	[refactor](remove non vec) remove column block and column view (#16022 ) * [refactor](remove non vec) remove column block and column view and column vectorized batch Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-18 12:40:53 +08:00
camby	b2fe385742	[refractor](schema) refractor function Schema::get_column_by_field to make it simple #16027 Co-authored-by: cambyzju <zhuxiaoli01@baidu.com>	2023-01-18 11:11:16 +08:00
YueW	e579530c99	[Feature-WIP](inverted index) support use inverted index searcher cache (#16003 ) Use the inverted index searcher cache to improve query performance. Dependency PRs: #14211 #15807 #15823	2023-01-18 09:30:55 +08:00
qiye	3bff5ebf9a	[fix](DOE) only return first batch data in ES 8.x (#16025 ) Do not use terminate_after and size together in scroll request of ES 8.x.	2023-01-18 09:28:34 +08:00
YueW	31cc99964c	[Feature-WIP](inverted index)(bkd) bdk index'reader implementation which in inverted index using for numeric types (#15994 ) Step 3 of DSIP-023 (Add inverted index for full text search): implementation of the bkd index reader, which the inverted index uses for numeric types. Dependency PRs: #14211 #15807 #15823	2023-01-18 09:24:19 +08:00
YueW	e6a5d3375e	[Feature-WIP](inverted index) add chinese analyzer for inverted index reader (#15998 ) Add a Chinese analyzer for the inverted index reader. Dependency PRs: #14211 #15807 #15823	2023-01-17 20:20:40 +08:00
Tiewei Fang	6be0cc252a	[fix](BrokerFileReader) fix Compile error #16018	2023-01-17 19:53:06 +08:00
lihangyu	95397ff05d	[refactor](array) remove depandancy of ColumnBlock, ColumnBlockView (#16002 ) change to vectorized::MutableColumnPtr	2023-01-17 19:16:16 +08:00
HappenLee	d5a3e8df3a	[Exec](opt) Opt the vexplode_split function performance (#15945 )	2023-01-17 19:02:57 +08:00
Tiewei Fang	bbdf40b6bd	[Enhencement](Push Handle) use VParquetScanner in PushHandle (#15980 ) * use VParquetScanner in PushHadnle * delete ParquetScanner	2023-01-17 16:21:04 +08:00
starocean999	151ae71761	[fix](be)fix bug of VSetOperationNode::release_resource (#15997 ) A child class that overrides the parent's method should still call "ExecNode::release_resource(state)".	2023-01-17 16:16:25 +08:00
Gabriel	d062ca2944	[refactor](vectorized) remove unnecessary vectorization check (#15984 )	2023-01-17 12:21:46 +08:00
Gabriel	7d34512501	[Bug](pipeline) Fix DCHECK failure (#15928 )	2023-01-17 12:01:20 +08:00
HappenLee	9f106161a7	[Bug](join) Fix null aware anti join error in fuzzy mode (#15987 )	2023-01-17 11:32:16 +08:00
luozenglin	9755358787	[fix](brokerload) fix be core dump casued by broker load (#15874 )	2023-01-17 11:21:13 +08:00
HappenLee	0ab0479633	[Compile](lzo) fix lzo decompressor compiler error (#15956 )	2023-01-17 09:56:07 +08:00
YueW	b1caa68706	[Feature-WIP](inverted index) inverted index reader's implementation, and add mysql_fulltext regression case to test fulltext query (#15823 ) Step 2 of DSIP-023 (Add inverted index for full text search): implementation of the inverted index reader. Dependency PRs: #14211 #15807 #15821	2023-01-17 09:13:56 +08:00
yixiutt	0057243f54	[improvement](reader) use union merge when rowset are noneoverlapping (#15749 )	2023-01-16 21:53:18 +08:00
zbtzbtzbt	65a4c8b163	[refactor] refactor segment writer (#15705 ) Co-authored-by: zhoubintao <1229701101@qq.com>	2023-01-16 21:50:21 +08:00
Xin Liao	5521c7a236	[fix](load) fix that tablet channel doesn't set received rows for verify the number of rows (#15961 )	2023-01-16 19:46:59 +08:00
WenYao	bdec4d5ac2	[enhancement](profile) add read columns to scanner profile (#15902 )	2023-01-16 19:32:46 +08:00
Xinyi Zou	97fcad76f8	[enhancement](memtracker) Improve readability (#15716 )	2023-01-16 16:30:35 +08:00
zhannngchen	b7f43441e3	[enhancement](load) change the publish version log to VLOG_CRITICAL (#15673 )	2023-01-16 16:22:33 +08:00
xueweizhang	63d48564ed	[fix](datetimev2) fix datetimev2 error with T (#15915 ) Signed-off-by: nextdreamblue <zxw520blue1@163.com>	2023-01-16 15:30:48 +08:00
Pxl	81bab55d43	[Bug](function) catch function calculation error on aggregate node to avoid core dump (#15903 )	2023-01-16 11:21:28 +08:00
airborne12	151fdc224e	[Fix](inverted index) fix compilation error for inverted index compound directory (#15946 ) fix compilation error for inverted index compound directory ``` be/src/olap/rowset/segment_v2/inverted_index_compound_directory.cpp:249:32: error: comparison of unsigned expression in '< 0' is always false [-Werror=type-limits] 249 \| if (h->_reader->size() < 0) { \| ~~~~~~~~~~~~~~~~~~~^~~ ```	2023-01-16 08:59:55 +08:00
Pxl	b727033906	[Chore](build) enable -Wextra and remove some -Wno (#15760 ) enable -Wextra and remove some -Wno	2023-01-15 10:40:35 +08:00
Gabriel	5af7bcaa55	[Bug](decimalv3) Fix missing precision and scale in predicates (#15930 )	2023-01-15 00:01:48 +08:00
pengxiangyu	58c520dbfd	[Feature](remote) Cooldown cold data to object storage only one replica (#15832 )	2023-01-14 23:58:00 +08:00
airborne12	0206e0bc57	[Feature](inverted index) implementation of inverted index writer for numeric types, using bkd index (#15918 ) Step 3 of DSIP-023 (Add inverted index for full text search): implementation of the inverted index writer for numeric types, using the bkd index. Dependency PRs: #14207 #15807 #15821	2023-01-14 21:06:51 +08:00
yiguolei	98c74f9ab8	[improvement](signal) add tid during core dump,the tid is equal to tid in be.INFO (#15893 ) Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-14 18:40:02 +08:00
Gabriel	84d6938a73	[Bug](pipeline) Fix BE crash caused by pipeline (#15890 ) * [Bug](pipeline) Fix BE crash caused by pipeline * update	2023-01-14 18:37:19 +08:00
Tiewei Fang	c4475a8dbc	[Enhencement](jdbc scanner) add profile for jdbc scanner (#15914 )	2023-01-14 10:28:59 +08:00
Lightman	313e14d220	[Bugfix] (ROLLUP) fix the coredump when add rollup by link schema change (#15654 ) Because the rollup has the same keys in the same order, BE performs a linked schema change: the base tablet's segments are linked to the new rollup tablet. But column unique ids start from 0 in both the base tablet and the rollup tablet. In this case, unique id 4 is column 'city' in the base table but 'cost' in the rollup tablet, so BE decodes a varchar page as a bigint page and core dumps. A rollup that would be added by linked schema change is redundant: it brings no additional benefit and wastes storage space, so it needs to be rejected.	2023-01-14 10:20:07 +08:00
yixiutt	d8990522fb	[conf](compaction) enable vertical_compaction ordered_data_compaction (#14945 )	2023-01-13 23:12:42 +08:00
airborne12	ecb5aea182	[Feature-WIP](inverted index) inverted index writer's implementation (#15821 )	2023-01-13 21:30:44 +08:00
AlexYue	514de605b6	[Bug](predicate) add double predicate creator (#15762 ) Add one double predicator the same as integer predicate creator.	2023-01-13 18:34:09 +08:00
AlexYue	049f8ad2f9	[Bug](sort)fix merge sorter might div zero when block bytes less than block rows (#15859 ) If a block's bytes are smaller than its number of rows, avg_size_per_row is zero, which ends up dividing by zero in the following logic.	2023-01-13 18:33:40 +08:00
Tiewei Fang	1489e3cfbf	[Fix](file system) Make the constructor of `XxxFileSystem` a private method (#15889 ) Since FileSystem inherits from std::enable_shared_from_this, it is dangerous to create a raw pointer to a FileSystem. To avoid this, make the constructor of XxxFileSystem private and use the static method create(...) to get a new FileSystem object.	2023-01-13 15:32:16 +08:00
Ashin Gau	34bb9cd5d3	[fix](parquet-reader) fix coredump when load datatime data to doris from parquet (#15794 ) `date_time_v2` checks the scale when a datetimev2 is constructed: ``` LOG(FATAL) << fmt::format("Scale {} is out of bounds", scale); ``` This [PR](https://github.com/apache/doris/pull/15510) fixed that issue, but the parquet reader does not use the constructor to create `TypeDescriptor`, which leaves `scale = -1` when reading datetimev2 data.	2023-01-13 11:51:11 +08:00
luozenglin	b1fb1277dd	[fix](bitmap) fix bitmap iterator comparison error (#15779 ) Fix the bug that bitmap.begin() == bitmap.end() is always true when the bitmap contains a single value.	2023-01-13 11:37:07 +08:00
HappenLee	9468711f9f	[Bug](join) fix bug null aware left anti join not correct result (#15841 )	2023-01-13 10:18:05 +08:00
yongkang.zhong	688a0bb96a	[feature](multi-catalog) support clickhouse jdbc catalog (#15780 )	2023-01-13 10:07:22 +08:00
yiguolei	16862d9b43	[refactor](remove unused code) remove buffer pool and disk io mgr (#15853 ) * [refactor](remove buffer pool and disk io mgr) remove unused code Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-01-13 09:42:58 +08:00
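
The two-phase read described in commit 3894de49d2 can be illustrated with a small in-memory sketch. This is not the Doris implementation: `Row`, `topn_row_ids`, and `fetch_rows` are hypothetical names, a `std::vector` stands in for the storage engine, and the vector index stands in for `__DORIS_ROWID_COL__`. Phase 1 touches only the sort key and the row id; phase 2 reads full rows for the surviving ids only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wide row: only `key` is needed for ordering.
struct Row {
    int key;              // stands in for columnA
    std::string payload;  // stands in for the other 100+ columns
};

// Phase 1: read only (key, row id) pairs, sort ascending, keep the first `limit` ids.
std::vector<std::size_t> topn_row_ids(const std::vector<Row>& table, std::size_t limit) {
    std::vector<std::pair<int, std::size_t>> keys;
    keys.reserve(table.size());
    for (std::size_t id = 0; id < table.size(); ++id) {
        keys.emplace_back(table[id].key, id);  // payload is never touched here
    }
    const std::size_t n = std::min(limit, keys.size());
    std::partial_sort(keys.begin(), keys.begin() + static_cast<std::ptrdiff_t>(n), keys.end());

    std::vector<std::size_t> ids;
    ids.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        ids.push_back(keys[i].second);
    }
    return ids;
}

// Phase 2: fetch the full rows, one by one, for the ids that survived the limit.
std::vector<Row> fetch_rows(const std::vector<Row>& table, const std::vector<std::size_t>& ids) {
    std::vector<Row> out;
    out.reserve(ids.size());
    for (std::size_t id : ids) {
        assert(id < table.size());  // ids must come from phase 1 on the same table
        out.push_back(table[id]);
    }
    return out;
}
```

With a small limit, phase 2 copies only `limit` wide rows, which is the read-amplification saving the commit message describes.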


3597 Commits
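
The pattern described in commit 1489e3cfbf (private constructor plus a static `create(...)`) can be sketched as follows. `LocalFileSystem`, `root()`, and `self()` are hypothetical names chosen for illustration, not the Doris classes; the sketch only assumes standard `std::enable_shared_from_this` behavior, where `shared_from_this()` is valid only for objects already owned by a `std::shared_ptr`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical file system class; callers can only obtain it via create().
class LocalFileSystem : public std::enable_shared_from_this<LocalFileSystem> {
public:
    // The only way to build an instance, so every object is owned by a shared_ptr.
    static std::shared_ptr<LocalFileSystem> create(std::string root) {
        // std::make_shared cannot call a private constructor, so use new here.
        return std::shared_ptr<LocalFileSystem>(new LocalFileSystem(std::move(root)));
    }

    // Safe because the private constructor rules out stack or raw-pointer instances.
    std::shared_ptr<LocalFileSystem> self() { return shared_from_this(); }

    const std::string& root() const { return _root; }

private:
    explicit LocalFileSystem(std::string root) : _root(std::move(root)) {}

    std::string _root;
};
```

Writing `LocalFileSystem fs("/tmp");` or `new LocalFileSystem("/tmp")` outside the class no longer compiles, which removes the case where `shared_from_this()` is called on an object that no `shared_ptr` owns.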