Mingyu Chen 0f7a25367d [fix](rowset-meta) Fix bug that rowset meta is not deleted (#8118)
As described in #8120, a large number of rowset metas remain in RocksDB. They can be left behind in two ways:

1. drop tablet

    The drop tablet task itself only sets the state of the tablet meta to `SHUTDOWN`
    and moves the tablet to the `_shutdown_tablets` vector. A background thread then
    periodically cleans up the tablets in `_shutdown_tablets` (that is why, even if we execute
    `drop table xx force`, it may take 10 minutes to 1 hour before the tablet goes into the trash directory).

    When the regular background cleanup thread deletes a tablet, it saves the complete tablet meta
    as a `.hdr` file and then moves it to the trash directory along with the data files.

    But this process does not handle the rowset metas (until the tablet meta is checkpointed,
    each rowset meta is stored independently in RocksDB as a key-value pair), so they are left behind as residue.

2. clone task

    A clone task may migrate a tablet back and forth between BEs, which can result in a situation
    where the tablet id on a BE is the same as before, but the tablet uuid is different.
    Some rowset metas then cannot find their corresponding tablet, and because no thread
    processes these rowsets, they also end up as residue.
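
The two scenarios above can be illustrated with a toy model. This is only a sketch, not Doris code: `ToyStorageEngine`, its fields, and the ids are hypothetical, and a plain dict stands in for RocksDB.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStorageEngine:
    """Toy stand-in for a BE storage engine; every name here is hypothetical."""
    tablets: dict = field(default_factory=dict)        # tablet_id -> tablet uid
    shutdown_tablets: list = field(default_factory=list)
    rowset_metas: dict = field(default_factory=dict)   # rowset_id -> (tablet_id, tablet uid)
    trash: list = field(default_factory=list)

    def drop_tablet(self, tablet_id: int) -> None:
        # "drop tablet" only parks the tablet for later cleanup.
        tablet_uid = self.tablets.pop(tablet_id)
        self.shutdown_tablets.append((tablet_id, tablet_uid))

    def sweep_shutdown_tablets(self) -> None:
        # Behaviour before the fix: the header file goes to trash,
        # but rowset_metas is never touched.
        while self.shutdown_tablets:
            tablet_id, _ = self.shutdown_tablets.pop()
            self.trash.append(f"{tablet_id}.hdr")


engine = ToyStorageEngine()
engine.tablets[10001] = "uid-a"
engine.rowset_metas["rs-1"] = (10001, "uid-a")

# Scenario 1: drop the tablet, then let the background sweep run.
engine.drop_tablet(10001)
engine.sweep_shutdown_tablets()

# Scenario 2: a clone brings the tablet back with the same id but a new uid.
engine.tablets[10001] = "uid-b"

# The old rowset meta is still in the store and no longer matches any tablet.
print(engine.tablets, engine.rowset_metas)
```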

In this PR, I handle this in the regular cleanup thread with the method `_clean_unused_rowset_metas()`.
I did not delete the rowset metas as part of the "drop tablet" task, because "drop tablet" itself is not a synchronous operation:
it also relies on a background thread to clean up the tablet periodically.
So I put this operation in the background cleanup thread.
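
As a rough illustration of what such a periodic cleanup pass might check, here is a minimal sketch. It is not the actual body of `_clean_unused_rowset_metas()`: the function name, the `RowsetMeta` tuple, and the dict standing in for RocksDB are assumptions, and the two removal conditions are taken only from the scenarios described above (tablet missing, or tablet uid mismatch).

```python
from typing import Dict, List, NamedTuple


class RowsetMeta(NamedTuple):
    # Hypothetical record; the real rowset meta carries far more fields.
    rowset_id: str
    tablet_id: int
    tablet_uid: str


def clean_unused_rowset_metas(
    kv_store: Dict[str, RowsetMeta], live_tablets: Dict[int, str]
) -> List[str]:
    """Delete rowset metas whose tablet is missing or has a different uid.

    kv_store maps rowset_id -> RowsetMeta (a dict standing in for RocksDB);
    live_tablets maps tablet_id -> tablet uid. Returns the removed rowset ids.
    """
    removed = []
    # Iterate over a snapshot so entries can be deleted while scanning.
    for rowset_id, meta in list(kv_store.items()):
        live_uid = live_tablets.get(meta.tablet_id)
        if live_uid is None or live_uid != meta.tablet_uid:
            del kv_store[rowset_id]
            removed.append(rowset_id)
    return removed


store = {
    "rs-1": RowsetMeta("rs-1", 10001, "uid-a"),  # tablet exists, uid matches
    "rs-2": RowsetMeta("rs-2", 10002, "uid-x"),  # tablet was dropped
    "rs-3": RowsetMeta("rs-3", 10003, "uid-old"),  # same id, different uid
}
live = {10001: "uid-a", 10003: "uid-new"}

removed_ids = clean_unused_rowset_metas(store, live)
print(removed_ids, sorted(store))
```

Running the check in the background pass, rather than inside "drop tablet", means both residue sources are covered by the same scan.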
2022-02-19 12:00:48 +08:00
2022-02-08 09:46:04 +08:00

Apache Doris (incubating)

Join the chat at https://gitter.im/apache-doris/Lobby

Doris is an MPP-based interactive SQL data warehouse for reporting and analysis. It was originally developed at Baidu under the name Palo and was renamed Doris after being donated to the Apache Software Foundation.

  • Doris provides high-concurrency, low-latency point queries as well as high-throughput ad-hoc analytical queries.

  • Doris provides batch data loading and real-time mini-batch data loading.

  • Doris provides high availability, reliability, fault tolerance, and scalability.

The main advantages of Doris are its simplicity (in development, deployment, and use) and its ability to meet many data serving requirements in a single system. For details, refer to Overview.

Official website: https://doris.apache.org/

License

Apache License, Version 2.0

Note

The licenses of some third-party dependencies are not compatible with the Apache 2.0 License, so you need to disable some Doris features to comply with the Apache 2.0 License. For details, refer to thirdparty/LICENSE.txt.

Technology

Doris mainly integrates the technology of Google Mesa and Apache Impala. It is built on a column-oriented storage engine and can be accessed with a MySQL client.

Compile and install

See Compilation

Getting started

See Basic Usage

Report issues or submit pull requests

If you find any bugs, feel free to file a GitHub issue or fix them by submitting a pull request.

Contact Us

Contact us through the following mailing list.

Name                   Scope
dev@doris.apache.org   Development-related discussions