[refactor](file-cache) refactor the file cache interface (#15398)

Refactor how the file cache is used.

### Motivation

Different scenarios may call for different kinds of file cache.
The file cache logic should therefore be hidden inside the file reader,
so that upper-layer callers do not have to change their calling logic
when the file cache implementation changes.

### Details

1. Add a `FileReaderOptions` parameter to `fs->open_file()`. `FileReaderOptions` contains:
    1. `CachePathPolicy`
        Determines the cache file path for a given file path.
        Different file caches can implement their own `CachePathPolicy`.

    2. `FileCacheType`
        Specifies the file cache type: `SUB_FILE_CACHE`, `WHOLE_FILE_CACHE`, `FILE_BLOCK_SIZE`, etc.

2. Hide the cache logic inside the file reader.

    `RemoteFileSystem` handles the `CacheOptions` and decides whether to
    return a `CachedFileReader` or a `RemoteFileReader`.

    The file cache itself is managed by `CachedFileReader`.
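
The two points above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the class bodies, the `NO_CACHE` enumerator, `FlatDirCachePathPolicy`, `get_cache_path()`, `kind()`, and the free function `open_file()` are all hypothetical stand-ins chosen for the example. Only the names `FileReaderOptions`, `CachePathPolicy`, `FileCacheType`, `CachedFileReader`, and `RemoteFileReader` come from the description.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical enum; NO_CACHE is an assumption added for this sketch.
enum class FileCacheType { NO_CACHE, SUB_FILE_CACHE, WHOLE_FILE_CACHE };

// Maps a file path to the local path of its cache file.
class CachePathPolicy {
public:
    virtual ~CachePathPolicy() = default;
    virtual std::string get_cache_path(const std::string& path) const = 0;
};

// Illustrative policy: one flat cache directory, '/' replaced by '_'.
class FlatDirCachePathPolicy : public CachePathPolicy {
public:
    explicit FlatDirCachePathPolicy(std::string dir) : _dir(std::move(dir)) {}
    std::string get_cache_path(const std::string& path) const override {
        std::string name = path;
        for (char& c : name) {
            if (c == '/') c = '_';
        }
        return _dir + "/" + name;
    }

private:
    std::string _dir;
};

// Options passed by the caller when opening a file.
struct FileReaderOptions {
    FileCacheType cache_type = FileCacheType::NO_CACHE;
    std::shared_ptr<CachePathPolicy> path_policy;
};

class FileReader {
public:
    virtual ~FileReader() = default;
    virtual std::string kind() const = 0;
};

class RemoteFileReader : public FileReader {
public:
    explicit RemoteFileReader(std::string path) : _path(std::move(path)) {}
    std::string kind() const override { return "remote"; }

private:
    std::string _path;
};

// Wraps a remote reader and owns the cache location.
class CachedFileReader : public FileReader {
public:
    CachedFileReader(std::unique_ptr<FileReader> remote, std::string cache_path)
            : _remote(std::move(remote)), _cache_path(std::move(cache_path)) {}
    std::string kind() const override { return "cached"; }
    const std::string& cache_path() const { return _cache_path; }

private:
    std::unique_ptr<FileReader> _remote;
    std::string _cache_path;
};

// The caller only passes options; the reader type is decided here.
std::unique_ptr<FileReader> open_file(const std::string& path, const FileReaderOptions& opts) {
    std::unique_ptr<FileReader> remote = std::make_unique<RemoteFileReader>(path);
    if (opts.cache_type == FileCacheType::NO_CACHE || !opts.path_policy) {
        return remote;
    }
    return std::make_unique<CachedFileReader>(std::move(remote),
                                              opts.path_policy->get_cache_path(path));
}
```

With this shape, a caller only ever holds a `FileReader`, so switching the cache type or path policy is a change to the options rather than to the calling code.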
Mingyu Chen
2022-12-29 12:15:46 +08:00
committed by GitHub
parent 93a7981427
commit 29492f0d6c
15 changed files with 213 additions and 46 deletions


@@ -216,12 +216,13 @@ void FileCacheManager::gc_file_caches() {
 FileCachePtr FileCacheManager::new_file_cache(const std::string& cache_dir, int64_t alive_time_sec,
                                               io::FileReaderSPtr remote_file_reader,
-                                              const std::string& file_cache_type) {
-    if (file_cache_type == "whole_file_cache") {
+                                              io::FileCacheType cache_type) {
+    switch (cache_type) {
+    case io::FileCacheType::WHOLE_FILE_CACHE:
         return std::make_unique<WholeFileCache>(cache_dir, alive_time_sec, remote_file_reader);
-    } else if (file_cache_type == "sub_file_cache") {
+    case io::FileCacheType::SUB_FILE_CACHE:
         return std::make_unique<SubFileCache>(cache_dir, alive_time_sec, remote_file_reader);
-    } else {
+    default:
         return nullptr;
     }
 }
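
The string comparison replaced in this hunk maps `"whole_file_cache"` to `WholeFileCache` and `"sub_file_cache"` to `SubFileCache`. A minimal standalone sketch of the same dispatch over an enum, keeping that pairing, might look like the following. The cache classes here are simplified stand-ins (the remote reader argument is omitted), and `name()` and the `NO_CACHE` enumerator are hypothetical additions used only to make the result observable.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// NO_CACHE is an assumption for this sketch; it exercises the default branch.
enum class FileCacheType { NO_CACHE, SUB_FILE_CACHE, WHOLE_FILE_CACHE };

// Simplified stand-in for the real cache base class.
class FileCache {
public:
    FileCache(std::string cache_dir, int64_t alive_time_sec)
            : _cache_dir(std::move(cache_dir)), _alive_time_sec(alive_time_sec) {}
    virtual ~FileCache() = default;
    virtual std::string name() const = 0;

protected:
    std::string _cache_dir;
    int64_t _alive_time_sec;
};

class WholeFileCache : public FileCache {
public:
    using FileCache::FileCache;
    std::string name() const override { return "whole"; }
};

class SubFileCache : public FileCache {
public:
    using FileCache::FileCache;
    std::string name() const override { return "sub"; }
};

using FileCachePtr = std::unique_ptr<FileCache>;

// Enum-based factory: each enumerator selects the cache class of the same name.
FileCachePtr new_file_cache(const std::string& cache_dir, int64_t alive_time_sec,
                            FileCacheType cache_type) {
    switch (cache_type) {
    case FileCacheType::WHOLE_FILE_CACHE:
        return std::make_unique<WholeFileCache>(cache_dir, alive_time_sec);
    case FileCacheType::SUB_FILE_CACHE:
        return std::make_unique<SubFileCache>(cache_dir, alive_time_sec);
    default:
        // Unknown or unsupported type: the caller must handle a null cache.
        return nullptr;
    }
}
```

Compared with string matching, the enum lets the compiler reject misspelled cache types, while the `default` branch preserves the original behavior of returning `nullptr` for anything unrecognized.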