Generally speaking, as long as a rowset has a version, it can be
considered not to be in a pending state. However, if the rowset was
created through ingesting binlogs, it will have a version but should
still be considered in a pending state because the ingesting txn has not
yet been committed.
This PR updates the condition for determining the pending state. If a
rowset is COMMITTED, the txn should be allowed to roll back even if a
version exists.
Cherry-pick #36551
* [Enhancement](full compaction) Add run status support for full compaction (#34043)
* The usage is `curl http://{ip}:{host}/api/compaction/run_status?tablet_id={tablet_id}`
e.g. `curl http://127.0.0.1:8040/api/compaction/run_status?tablet_id=10084`
If full compaction is running, the output will be
```
{
"status" : "Success",
"run_status" : true,
"msg" : "compaction task for this tablet is running",
"tablet_id" : 10084,
"compact_type" : "full"
}
```
else the ouput will be
```
{
"status" : "Success",
"run_status" : false,
"msg" : "compaction task for this tablet is not running",
"tablet_id" : 10084,
"compact_type" : "full"
}
```
* 2
* 2
* [Fix](partial update) Fix rowset not found error when doing partial update (#34112)
Cause: In the logic of partial column updates, the existing data columns are read first, and then the data is supplemented and written back. During the reading process, initialization involves initially fetching rowset IDs, and the actual rowset object is fetched only when needed later. However, between fetching the rowset IDs and the rowset object, compaction may occur, turning the old rowset into a stale rowset. If too much time passes, the stale rowset might be directly deleted. Thus, when the rowset object is needed for an update, it cannot be found. Although the update operation with partial column logic should be able to read all keys and should not encounter new keys, if the rowset disappears, the Backend (BE) will consider these keys as missing. Consequently, it will check whether other columns have default values or are nullable. If this check fails, the aforementioned error is thrown.
Solution: To avoid such issues during partial column updates, the initialization step should involve fetching both the rowset IDs and the shared pointer to the rowset object simultaneously. This ensures that the rowset can always be found during data retrieval.
In order to add common code to the value deleter of LRU cache, let all lru cache values inherit from LRUCacheValueBase class and tracking memory in destructor.
Remove LRUCacheValueBase, put last_visit_time into LRUHandle, and automatically update timestamp to last_visit_time during cache insert and lookup.
Do not rely on external modification of last_visit_time, which is often forgotten.
* [improvement](create tablet) backend create tablet round robin among … (#29818)
* [improvement](create tablet) be choose disk tolerate with little skew (#30354)
---------
Co-authored-by: yujun <yu.jun.reach@gmail.com>
TabletSchema and Schema no longer track memory, only track columns count. because cannot accurately track memory size.
TabletMeta MemTracker changed to track TabletSchema columns count.
Segment::_meta_mem_usage Unknown value overflow, causes the value of SegmentMeta MemTracker is similar to -2912341218700198079. So, temporarily put it in experimental type tracker.
It should be ensured that the obtained versions are continuous when calculate delete bitmap calculations in publish.
The remaining NOTREADY tablet in the schema change failure should be dropped.
When a rowset was deleted, the delete bitmap cannot be deleted until there are no read requests to use the rowset.
When executing admin clean trash, if the backend daemon clean thread is cleaning trash, then SQL command will return immediately. But for the backend daemon thread, it doesn't clean all the trashes, it clean only the expired trashes.
Also if there's lots of trashes, the daemon clean thread will busy handling trashes for a long time.
Current initialization dependency:
Daemon ───┬──► StorageEngine ──► ExecEnv ──► Disk/Mem/CpuInfo
│
│
BackendService ─┘
However, original code incorrectly initialize Daemon before StorageEngine.
This PR also stop and join threads of daemon services in their dtor, to ensure Daemon services release resources in reverse order of initialization via RAII.