143 Commits

Author SHA1 Message Date
df628e1538 [chore](merge-on-write) disable rowid conversion check for mow table by default (#27482) 2023-11-23 23:39:01 +08:00
d767804815 [feature](merge-cloud) Decouple rowset id generator and local rowsets gc implementation (#25921) 2023-11-10 10:07:02 +08:00
d0960bac56 [Fix](partial update) Fix partial update info loss when the delete bitmaps of the committed transactions are calculated by the compaction (#26556)
a fix for #25147
2023-11-08 19:56:31 +08:00
44b51bf0b9 [Feature](Variant) support variant load (#26572) 2023-11-08 00:37:57 -06:00
005a36322e [opt](index compaction) optimize checks before index compaction (#25486) 2023-10-25 22:21:46 -05:00
77f727e0a1 [chore](compaction) Print rowset size when compaction finishes successfully (#25891) 2023-10-26 09:49:28 +08:00
Pxl
2e2d5bcba2 [Improvements](status) catch some error status (#25677)
catch some error status
2023-10-23 10:19:08 +08:00
9c9fc84f39 [feature](merge-cloud) Abstract BaseTablet for CloudTablet (#24929) 2023-10-18 20:29:04 +08:00
1514f78b87 [refactor](partial-update) Split partial update infos from tablet schema (#25147) 2023-10-17 14:21:40 +08:00
cda8fb6b8b [fix](load) return Status when error in RowsetWriter::build (#25381) 2023-10-17 09:40:23 +08:00
239df5860b [enhancement](tablet_meta_lock) add more trace for write lock of tablet's _meta_lock (#25095) 2023-10-08 10:28:10 +08:00
642e5cdb69 [Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly (#23395) 2023-09-29 22:38:52 +08:00
2ec50dcfc7 [log](compaction) add more stats for compaction log (#24984) 2023-09-28 15:29:15 +08:00
f4b1e2b343 [Fix](core) Fix segment cache core when output rowset is nullptr (#24778) 2023-09-22 19:48:42 +08:00
ec987b1b7d [fix](index compaction) ignore doc which does not exist in destination segment (#24729) 2023-09-21 18:27:08 +08:00
67e789e025 [Fix](point query) Fix point query unstable (#24570) 2023-09-20 18:07:07 +08:00
643d09de06 [fix](index compaction)skip index compaction when no output segment (#24468) 2023-09-16 16:42:39 +08:00
09bcedb116 [feature](merge-cloud) Remove deprecated old cache (#23881)
* Remove deprecated old cache
2023-09-06 08:07:05 +08:00
1ac0ff0ea9 [feature](delete-predicate) support delete sub predicate v2 (#22442)
New structure for delete sub predicate.
The delete sub predicate is currently stored temporarily in a string-typed condition_str, and its fields are extracted with std::regex, which may cause a stack overflow when matching an extremely large string (a bug in libc).

Now we attempt to use a new PB structure to hold the delete sub predicate, to avoid that problem.

message DeleteSubPredicatePB {
    optional int32 column_unique_id = 1;
    optional string column_name = 2;
    optional string op = 3;
    optional string cond_value = 4;
}
Currently, both versions of the sub predicate are filled in. Queries use v2, while compaction still uses v1. Old rowset meta whose delete predicates carry sub predicate v1 is converted to v2 when read from PB. In addition, such meta will be rewritten with the new delete sub predicate.

Make preparation to use column unique id to specify a column globally.
Using the column unique id rather than the column name to identify a column is vital for flexible schema change. The rewritten delete predicate will attach column unique id.
2023-08-29 19:37:23 +08:00
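As an illustration of why structured fields avoid the regex problem, here is a minimal sketch that splits a condition string with a linear scan instead of std::regex. The struct `DeleteSubPredicate`, the function `parse_condition`, the `<column><op><value>` text format, and the operator set are all assumptions made for this example; they are not taken from the commit.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical in-memory mirror of the DeleteSubPredicatePB fields.
struct DeleteSubPredicate {
    int32_t column_unique_id = -1;  // -1 means "not resolved yet"
    std::string column_name;
    std::string op;
    std::string cond_value;
};

// Parses "<column><op><value>" with a single linear scan and no recursion,
// so the cost does not depend on regex backtracking or stack depth.
// The text format and the operator set are assumptions for illustration.
std::optional<DeleteSubPredicate> parse_condition(std::string_view cond) {
    // Two-character operators come first so ">=" is not read as ">".
    static constexpr std::array<std::string_view, 6> kOps = {"!=", ">=", "<=", "=", ">", "<"};
    const size_t pos = cond.find_first_of("!=<>");
    if (pos == std::string_view::npos || pos == 0) {
        return std::nullopt;  // no operator, or empty column name
    }
    for (std::string_view op : kOps) {
        if (cond.substr(pos, op.size()) == op) {
            DeleteSubPredicate pred;
            pred.column_name = std::string(cond.substr(0, pos));
            pred.op = std::string(op);
            pred.cond_value = std::string(cond.substr(pos + op.size()));
            return pred;
        }
    }
    return std::nullopt;  // e.g. a lone '!' that is not part of "!="
}
```

With this sketch, `parse_condition("k1>=10")` yields column name `k1`, operator `>=`, and value `10`; the column unique id would be filled in afterwards from the schema.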
82a4f114e4 [improvement](compaction) add an option on delete stale rowset by judging _stale_rs_metas size when doing compaction (#23448) 2023-08-29 17:40:37 +08:00
da9eb79ac4 [Enhancement](Schema hash) Remove schema hash in tablet info (#23516) 2023-08-29 10:05:12 +08:00
d4694167a8 [Enhancement](chore) Some Status relevant enhancement (#23072) 2023-08-21 14:14:38 +08:00
cf368728be [fix](merge-on-write) Fix a typo and remove useless member rowset in CommitTabletTxnInfo (#23151)
Fix a typo in #23078
2023-08-18 14:14:34 +08:00
29ff7b7964 [fix](merge-on-write) add sentinel mark when do compaction (#23078) 2023-08-17 20:08:01 +08:00
f036cdfde6 [feature](compaction) support delete in cumulative compaction (#19609) 2023-08-07 15:22:21 +08:00
19d1f49fbe [improvement](compaction) compaction policy and options in the properties of a table (#22461) 2023-08-01 22:02:23 +08:00
c409fa0f58 [Feature](Compaction)Support full compaction (#21177) 2023-07-16 13:21:15 +08:00
fd6553b218 [Fix](MoW) Fix bug about calculating all committed rowsets delete bitmaps when do compaction (#21760) 2023-07-13 21:10:15 +08:00
f0d08da97c [enhancement](merge-on-write) split delete bitmap from tablet meta (#21456) 2023-07-12 19:13:36 +08:00
Pxl
ca71048f7f [Chore](status) avoid empty error msg on status (#21454)
avoid empty error msg on status
2023-07-11 13:48:16 +08:00
7d4c47e250 [Enhancement](Compaction) Calculate all committed rowsets delete bitmaps when do compaction (#20907)
Here we calculate the delete bitmaps of all rowsets that are committed but not yet published, to reduce the calculation pressure of the publish phase.

Step1: collect the delete bitmaps of all of this tablet's committed rowsets.

Step2: calculate all rowsets' delete bitmaps which are published during compaction.

Step3: write back updated delete bitmap and tablet info.
2023-07-10 14:06:11 +08:00
1fe04b7242 [Chore](metrics) remove trace metrics code using runtime profile instead (#21394)
* commit

* fix

* format
2023-07-01 12:18:23 +08:00
b2dc4a8cb9 [Fix](inverted index) check inverted index file existence before data compaction (#21173) 2023-06-26 19:55:55 +08:00
7a58a69aa9 [Fix](inverted index) skip index compaction when src rs did not have inverted index (#21010) 2023-06-20 21:22:25 +08:00
0585a1f004 [fix](compaction) fix time series compaction policy to adjust vertical compaction max segment size (#20889) 2023-06-17 20:32:34 +08:00
24fb05ec83 [Bug](row-store) Fix row store with materialize index (#20356)
If a query hits a materialized view that has row storage enabled, but the row storage column is not present in the materialized view, it will result in a query crash. Therefore, it is necessary to include the row storage column when creating the materialized view, and serialize the row storage column during the execution of SchemaChange.
2023-06-08 10:55:22 +08:00
Pxl
7dc7ed97eb [Chore](build) remove some unused code and remove some wno (#20326)
remove some unused code about spinlock
remove some wno and fix warning
remove variadic macro usage
2023-06-05 10:48:07 +08:00
42239d635a [fix](tablet_manager_lock) fix create tablet timeout #20067 (#20069) 2023-05-28 23:05:13 +08:00
Pxl
15a7420661 [Chore](ub) fix some undefined behaviors (#19986)
/home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_reader.cpp:895:21: runtime error: load of value 423208544, which is not a valid value for type 'doris::ReaderType'

/home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_decimal.cpp:260:33: runtime error: load of misaligned address 0x7fa3348b301c for type 'int64_t' (aka 'long'), which requires 8 byte alignment

/home/zcp/repo_center/doris_master/doris/be/src/olap/block_column_predicate.cpp:82:24: runtime error: variable length array bound evaluates to non-positive value 0

/home/zcp/repo_center/doris_master/doris/be/src/vec/columns/column_string.h:225:26: runtime error: null pointer passed as argument 2, which is declared to never be null
2023-05-26 14:08:40 +08:00
e08de52ee7 [chore](compile) using PCH for compilation acceleration under clang (#19303) 2023-05-08 19:51:06 +08:00
f199860dea [Improvement](inverted index) Enhance compaction performance through direct inverted index merging (#19207) 2023-05-08 14:07:32 +08:00
a1c05b5c13 [fix](compaction) fix potential null pointer dereference (#18915) 2023-04-22 08:38:32 +08:00
e412dd12e8 [chore](build) Use include-what-you-use to optimize includes (PART II) (#18761)
Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
2023-04-19 23:11:48 +08:00
c3e2269c4c [fix](merge-on-write) fix that missed rows don't match merged rows for base compaction (#18262) 2023-03-31 15:06:51 +08:00
fa586c00a9 [fix](merge-on-write) fix that missed rows don't match merged rows (#18128)
Due to concurrent load, there may be duplication in the delete bitmap of historical data and incremental calculations, resulting in duplicate calculations of missed rows.
2023-03-27 23:00:54 +08:00
Pxl
d8f0ca7108 [Chore](schema change) remove some unused code in schema change (#17459)
remove some unused code in schema change.
remove some row-based config and code.
2023-03-07 09:18:34 +08:00
b0d67c0358 [fix](merge-on-write) fix cu compaction correctness check (#17347)
During concurrent import, the same row location may be marked as deleted multiple times by different versions of a rowset.
Duplicate row locations need to be removed.
2023-03-06 21:31:48 +08:00
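The deduplication described in the commit above can be sketched as a sort followed by removal of adjacent repeats. The `RowLocation` struct and `dedup_locations` function are hypothetical names for this example, not the types used in the repository.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical row address: which rowset, which segment, which row.
struct RowLocation {
    int64_t rowset_id;
    uint32_t segment_id;
    uint32_t row_id;
};

inline bool operator<(const RowLocation& a, const RowLocation& b) {
    return std::tie(a.rowset_id, a.segment_id, a.row_id) <
           std::tie(b.rowset_id, b.segment_id, b.row_id);
}

inline bool operator==(const RowLocation& a, const RowLocation& b) {
    return std::tie(a.rowset_id, a.segment_id, a.row_id) ==
           std::tie(b.rowset_id, b.segment_id, b.row_id);
}

// Sorts the locations and drops repeats, so a row that was marked deleted
// by several rowset versions is counted only once.
std::vector<RowLocation> dedup_locations(std::vector<RowLocation> locs) {
    std::sort(locs.begin(), locs.end());
    locs.erase(std::unique(locs.begin(), locs.end()), locs.end());
    // Postcondition: no two neighbours are equal.
    assert(std::adjacent_find(locs.begin(), locs.end()) == locs.end());
    return locs;
}
```

Counting the deduplicated vector rather than the raw one keeps a row that was marked twice from inflating the number of missed rows.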
3636d0a561 [feature](merge-on-write) add DCHECK in compaction to detect data inconsistency (#16564)
MoW marks all duplicate primary keys as deleted, so we can add a DCHECK during compaction: if MoW's delete bitmap works incorrectly, we are able to detect this kind of issue ASAP.
In a debug build, the DCHECK makes the BE crash; in a release build, compaction fails and the load eventually fails with -235.
2023-02-22 14:59:18 +08:00
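A minimal sketch of the invariant behind that check, assuming a simplified model in which each input row carries its primary key and a flag saying whether the delete bitmap covers it. `InputRow`, `count_duplicate_live_keys`, and `check_compaction_input` are hypothetical names, and plain `assert` stands in for DCHECK; the actual check in the repository may be formulated differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical input row as compaction would see it.
struct InputRow {
    std::string primary_key;
    bool marked_deleted;  // true if the delete bitmap covers this row
};

// Returns the number of live rows whose key repeats an earlier live row.
// If the delete bitmap is correct, the result is zero.
size_t count_duplicate_live_keys(const std::vector<InputRow>& rows) {
    std::unordered_set<std::string> seen;
    size_t duplicates = 0;
    for (const InputRow& row : rows) {
        if (row.marked_deleted) continue;  // already hidden by the bitmap
        if (!seen.insert(row.primary_key).second) ++duplicates;
    }
    return duplicates;
}

// Debug builds abort on an inconsistency; builds compiled with NDEBUG skip
// the assert and report failure through the return value instead.
bool check_compaction_input(const std::vector<InputRow>& rows) {
    const size_t duplicates = count_duplicate_live_keys(rows);
    assert(duplicates == 0 && "delete bitmap missed a duplicate key");
    return duplicates == 0;
}
```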
52f9e03eea [fix](cooldown) Use pending_remote_rowsets to avoid deleting rowset files being uploaded (#16803) 2023-02-21 21:58:20 +08:00
c98a0bf803 [Enhancement](merge-on-write) check the correctness of rowid conversion after compaction (#16689)
During compaction, MoW updates the delete bitmap of the imported data through rowid conversion. The correctness of the rowid conversion is therefore critical to the resulting delete bitmap, so this change adds a check on the rowid conversion result.
2023-02-20 16:27:18 +08:00
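To make the idea concrete, the following sketch models rowid conversion as a map from a source row to its row in the compaction output, translates delete marks through it, and checks that the map is consistent. The aliases `SrcRow` and `RowIdConversion` and both functions are assumptions for illustration; the commit does not state which properties its check verifies.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <utility>

// Hypothetical types: (source segment, row id) -> row id in the output.
using SrcRow = std::pair<uint32_t, uint32_t>;
using RowIdConversion = std::map<SrcRow, uint32_t>;

// Translates delete marks on source rows into marks on output rows.
// A source row with no entry was not written to the output, so its
// mark has nothing to point at and is skipped.
std::set<uint32_t> convert_deletes(const RowIdConversion& conv,
                                   const std::set<SrcRow>& deleted_src) {
    std::set<uint32_t> deleted_dst;
    for (const SrcRow& src : deleted_src) {
        auto it = conv.find(src);
        if (it != conv.end()) deleted_dst.insert(it->second);
    }
    return deleted_dst;
}

// Assumed sanity check: every output row in [0, output_rows) is the target
// of exactly one source row.
bool conversion_is_consistent(const RowIdConversion& conv, uint32_t output_rows) {
    std::set<uint32_t> targets;
    for (const auto& entry : conv) {
        const uint32_t dst = entry.second;
        if (dst >= output_rows) return false;           // points past the output
        if (!targets.insert(dst).second) return false;  // two sources, one row
    }
    return targets.size() == output_rows;
}
```

If the consistency check fails, translated delete marks could land on the wrong output rows, which is the kind of error such a check is meant to catch.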