doris

Author	SHA1	Message	Date
abmdocrt	fd62af82d2	[enhancement](mow) Add bvar for bloom filter and segment (#32355 )	2024-03-22 08:52:12 +08:00
airborne12	ecadb60bcd	[Pick 2.1](inverted index) support inverted index format v2 (#30145 ) (#32418 )	2024-03-19 08:11:33 +08:00
Pxl	6b08a4ec93	[Bug](top-n) do not get runtime predicate when predicate not initialized #32209	2024-03-14 09:12:09 +08:00
lihangyu	0da010603e	[Improve](TabletSchemaCache) reduce duplicated memory consumption for column name and column path (#31141 ) Both could be references to the related field in TabletColumn. Also use shared_ptr for TabletColumn in TabletSchema for later memory reuse	2024-03-09 19:44:42 +08:00
Pxl	25d1934289	[Feature](topn) support multiple topn filter on backend (#31665 ) support multiple topn filter on backend	2024-03-06 13:05:22 +08:00
yiguolei	7d1db6cd1f	[refactor](exception safe) Refactor delete handler and block column predicates to make sure exception safe (#31618 )	2024-03-01 14:21:17 +08:00
Pxl	d36ad56dce	[Opt](Exec) Support runtime update topn filter (#31250 )	2024-02-29 12:38:03 +08:00
lihangyu	586217bf73	[Improve](Variant) support prune segment for querying variant (#31310 )	2024-02-28 17:52:11 +08:00
yiguolei	a3c78dd21a	[chore](refactor) refactor some rf code and delete rpc file (#31031 ) --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2024-02-18 11:50:17 +08:00
Sun Chenyang	0442d5dc0e	[fix](Variant Type) Add sparse columns meta to fix compaction (#28673 ) Co-authored-by: eldenmoon <15605149486@163.com>	2024-02-16 10:12:23 +08:00
lihangyu	b23a785775	[Fix](Variant) support materialize view for variant and accessing variant subcolumns (#30603 ) * [Fix](Variant) support materialize view for variant and accessing variant subcolumns 1. fix schema change with path lost and lead to invalid data read 2. support element_at function in BE side and use simdjson to parse data 3. fix multi slot expression	2024-02-16 10:12:23 +08:00
Luwei	e610044bae	[Enhancement] (schema) add column type check (#28718 )	2023-12-28 17:11:24 +08:00
lihangyu	6d817bc253	[fix](topn opt) avoid using topn runtime predicate which segment does not contain such column(column unique id) when pruning segment (#29148 )	2023-12-27 20:31:03 +08:00
lihangyu	e9e1e2894b	[performance](variant) support topn 2phase read for variant column (#28318 ) [performance](variant) support topn 2phase read for variant column	2023-12-25 11:50:41 +08:00
lihangyu	341822ec05	[regression-test](Variant) add compaction case for variant and fix bugs (#28066 )	2023-12-08 12:18:46 +08:00
lihangyu	a7d1e92fc2	[Fix](variant) handle `StorageReadOptions` to avoid crash in `new_column_iterator_with_path` (#27936 ) In partial update, read variant without `opt` will lead to crash	2023-12-04 17:02:35 +08:00
lihangyu	48935c14e2	[Improvement](variant) limit the column size on tablet schema (#27399 ) (#27785 ) 1. limit the column count to default 2048 2. fix get_inverted_index returning nullptr when variant's unique id is -1, using its parent unique id instead 3. avoid adding the same path subcolumn to the tablet schema more than once 4. make extracted column unique id -1	2023-12-04 14:47:36 +08:00
lihangyu	a2fa0b3745	[compability](segment) fix compability issue introduced by #27676 (#27799 ) Prior to PR #27676, data was written with empty path information. Consequently, after implementing #27676, data that already exists in a segment is not included in `column_id_to_footer_ordinal`. This issue will lead to `invalid nonexistent column without default value` error.	2023-11-30 21:24:59 +08:00
lihangyu	7398c3daf1	[Feature-Variant](Variant Type) support variant type query and index (#27676 )	2023-11-29 10:37:28 +08:00
meiyi	553e4a8903	[feature-wip](merge-on-write) MOW table support different primary keys and sort keys (#24788 )	2023-11-24 16:37:30 +08:00
airborne12	c51146df10	[Fix](segment) need to rebuild col_id_to_predicates when true predicates encountered (#25685 )	2023-10-22 21:26:52 -05:00
Xiaocc	dbf5787682	[fix](be) Make DorisCallOnce's function exception-safe (#25579 )	2023-10-18 22:13:30 +08:00
Jerry Hu	80e5e72202	[fix](scanner) coredump caused by 'prune_predicates_by_zone_map' (#25555 )	2023-10-18 16:11:41 +08:00
Jerry Hu	283bd59eba	[improvement](scanner) Remove the predicate that is always true for the segment (#25366 ) By utilizing the zonemap index of the segment, we can ascertain if a predicate is always true. For example, if the segment’s maximum value is 100 and the predicate is col < 101, then this predicate is always true for this segment.	2023-10-13 15:25:38 +08:00
plat1ko	b9ddcbf729	[feature](merge-cloud) Rewrite code related to IOContext (#24269 )	2023-09-15 19:57:58 +08:00
plat1ko	d8ef9dda59	[feature](merge-cloud) Rewrite FS interface (#23953 )	2023-09-12 19:20:25 +08:00
bobhan1	bdacefa734	[Fix](status)Fix leaky abstraction and shield the status code `END_OF_FILE` from upper layers (#24165 )	2023-09-12 11:10:52 +08:00
Yongqiang YANG	1228995dec	[improvement](segment) reduce memory footprint of column_reader and segment (#24140 )	2023-09-11 21:54:00 +08:00
zzzxl	153c7982f3	[Optimize](invert index) Optimize multiple terms conjunction query (#23871 )	2023-09-09 01:52:58 +08:00
plat1ko	09bcedb116	[feature](merge-cloud) Remove deprecated old cache (#23881 ) * Remove deprecated old cache	2023-09-06 08:07:05 +08:00
airborne12	347cceb530	[Feature](inverted index) push count on index down to scan node (#22687 ) Co-authored-by: airborne12 <airborne12@gmail.com>	2023-09-02 22:24:43 +08:00
bobhan1	e05a0466f2	[improve](Status) Add new status code`KEY_NOT_FOUND` and `KEY_ALREADY_EXISTS` for merge on write (#23619 )	2023-08-30 08:50:07 +08:00
Mingyu Chen	2678afd2db	[fix][improvement](fs) add HdfsIO profile and modification time (#21638 ) Refactor the interface of create_file_reader: file_size and mtime are merged into FileDescription and are no longer in FileReaderOptions. Now the file handle cache can get the correct file modification time from FileDescription. Add HdfsIO for the hdfs file reader, picked from [Enhancement](multi-catalog) Add hdfs read statistics profile. #21442	2023-07-08 14:49:44 +08:00
airborne12	9d2f879bd2	[Enhancement](inverted index) make InvertedIndexReader shared_from_this (#21381 ) This PR proposes several changes to improve code safety and readability by replacing raw pointers with smart pointers in several places. Use enable_factory_creator in InvertedIndexIterator and InvertedIndexReader and remove the explicit new constructor. Make InvertedIndexReader shared_from_this, since it may be destructed while InvertedIndexIterator is still using it.	2023-07-06 11:52:59 +08:00
zhannngchen	85ce6a22c0	[enhancement](merge-on-write) some misc optimizations (#21039 )	2023-06-21 16:16:06 +08:00
Yongqiang YANG	87e3a79387	[enhancement](pk) add bvar latency recorder for pk (#20942 )	2023-06-19 15:29:42 +08:00
Xin Liao	48065fce19	[bugfix](merge-on-write) optimize rowset tree and tablet header lock (#20911 )	2023-06-18 19:26:02 +08:00
zhannngchen	15b9830859	[fix](partial-update) sequence column is not proceeded correctly #20813 When checking the keys in PrimaryKeyIndex, seq_col_length is not set to correct value, then we got a NOT_FOUND result for an existing key.	2023-06-15 14:07:00 +08:00
lihangyu	ab8125d56f	[Improve](performance) introduce SchemaCache to cache TabletSchema & Schema (#20037 ) * [Improve](performance) introduce SchemaCache to cache TabletSchema & Schema 1. When the system is under high-concurrency load with wide table point queries, the frequent memory allocation and deallocation of Schema become evident system bottlenecks. Additionally, the initialization of TabletSchema and Schema also becomes a CPU hotspot. Therefore, a SchemaCache is introduced to cache these resources for reuse. 2. Make some variables wrapped with std::unique_ptr Performance: \| Status \| QPS \| Average response time (avg) \| P99 response time \| \|------------------\|-----\|------------------\|-------------\| \| SchemaCache enabled \| 501 \| 20ms \| 34ms \| \| SchemaCache disabled \| 321 \| 31ms \| 61ms \| * handle schema change with schema version * remove useless header * rebase	2023-05-29 17:34:53 +08:00
Xinyi Zou	16f5d3d5b3	[Improvement](memory) new page use Allocator (#19472 )	2023-05-16 19:09:17 +08:00
yixiutt	aef9355cd3	[feature-wip](partial update) PART1: support basic partial write (#17542 )	2023-04-28 17:17:57 +08:00
Yongqiang YANG	6eb12640a1	[fix](segment_iter) do not init segment_iterator twice (#18337 ) * [fix](segment_iter) do not init segment_iterator twice SegmentIterator::init is called by Segment::new_iterator and BetaRowsetReader::get_segment_iterators twice.	2023-04-27 09:51:57 +08:00
yiguolei	3736530585	[refactor](query context) rename query fragments context to query context and make query context safe (#18950 ) * [refactor](query context) rename query fragments context to query context and make query context safe --------- Co-authored-by: yiguolei <yiguolei@gmail.com>	2023-04-23 22:53:56 +08:00
Adonis Ling	e412dd12e8	[chore](build) Use include-what-you-use to optimize includes (PART II) (#18761 ) Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.	2023-04-19 23:11:48 +08:00
zxealous	e3ff2e3d21	[fix](file cache) Fix be core while use block/whole/sub file cache (#18440 ) BE core dumps when the whole/sub file cache is used: CachedRemoteFileReader/WholeFileCache/SubFileCache::read_at_impl() was called without an IOContext when reading the segment footer.	2023-04-07 16:39:59 +08:00
Mingyu Chen	cb79e42e5c	[refactor](file-system)(step-1) refactor file system on BE and remove storage_backend (#17586 ) See #17764 for details I have tested: - Unit test for local/s3/hdfs/broker file system: be/test/io/fs/file_system_test.cpp - Outfile to local/s3/hdfs/broker. - Load from local/s3/hdfs/broker. - Query file on local/s3/hdfs/broker file system, with table value function and catalog. - Backup/Restore with local/s3/hdfs/broker file system Not tested: - cold & hot data separation case.	2023-03-21 21:08:38 +08:00
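xueweizhang	e0cd8599d2	[fix](delete) fix delete from bug which can get wrong result (#17146 ) In theory, for two independent deletes, e.g. delete from table where a=1; delete from table where a=2;, this optimization should be usable. However, the current code puts the delete predicates of all versions and all columns together, which loses the version information and the fact that predicates may be ANDed. Everything is weakened to treating each delete predicate as independent, so a page is dropped as soon as any one delete predicate is satisfied. This PR keeps the current code but passes delete predicates only when there is exactly one (an == 1 check), because only then is the subsequent page elimination guaranteed to be correct. Evaluating pages against delete predicates from different versions and different columns with complete, rigorous logic would require considerably more design changes; the current approach prioritizes fixing the bug. The zone-accelerated page elimination logic for delete predicates can be improved later to cover more scenarios.	2023-02-28 09:20:10 +08:00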
Xinyi Zou	b194a7cf83	[improvement](memory) Support GC segment cache, when memory insufficient (#16987 ) fix segment cache memory tracker statistics support GC	2023-02-22 18:31:20 +08:00
Xin Liao	c98a0bf803	[Enchancement](merge-on-write) check the correctness of rowid conversion after compaction (#16689 ) MoW updates the delete bitmap of the imported data during the compaction by rowid conversion. The correctness of rowid conversion is very important to the result of delete bitmap. So I add a rowid conversion result check.	2023-02-20 16:27:18 +08:00
TengJianPing	9b8c91e18c	[improvement](rowset reader) fix possible memleak (#16680 ) * [improvement](rowset reader) fix possible memleak * fix be UT	2023-02-15 11:13:31 +08:00
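
The zone map optimization described in commit 283bd59eba (#25366) can be illustrated with a minimal sketch. This is not the Doris implementation (the backend is C++); the names `ZoneMap`, `is_always_true`, and `prune_always_true` are hypothetical, and the sketch assumes an integer column with no NULL values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZoneMap:
    """Hypothetical per-segment min/max summary for one column."""
    min_value: int
    max_value: int


def is_always_true(op: str, literal: int, zone: ZoneMap) -> bool:
    """Return True if `col <op> literal` holds for every value in the zone.

    Assumes the column has no NULLs; unknown operators are never pruned.
    """
    if op == "<":
        return zone.max_value < literal
    if op == "<=":
        return zone.max_value <= literal
    if op == ">":
        return zone.min_value > literal
    if op == ">=":
        return zone.min_value >= literal
    if op == "=":
        return zone.min_value == zone.max_value == literal
    return False


def prune_always_true(predicates, zone: ZoneMap):
    """Drop predicates that cannot filter any row of this segment."""
    return [(op, lit) for op, lit in predicates
            if not is_always_true(op, lit, zone)]


# The example from the commit message: segment max is 100, predicate is col < 101.
zone = ZoneMap(min_value=0, max_value=100)
remaining = prune_always_true([("<", 101), (">", 50)], zone)
```

Here `col < 101` is removed because the segment maximum is 100, while `col > 50` is kept because the segment minimum (0) does not satisfy it for every row.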

126 Commits