[feature](file cache)Import file cache for remote file reader (#15622)

The main purpose of this PR is to introduce a `fileCache` for lakehouse reads of remote files.
It uses the local disk as a cache for remote files, so that the next time a file is read,
the data can be served directly from the local disk.
This PR also includes a few other minor changes.

Introduce File Cache:
1. The new `fileCache` is called `block_file_cache` and uses an LRU replacement policy.
2. Implement a new FileReader, `CachedRemoteFileReader`, so that the `file cache` logic is hidden behind `CachedRemoteFileReader`.
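
The read-through idea behind the two points above can be sketched as follows. This is a minimal illustration and not the code added by this PR: `LruBlockCachedReader` and `RemoteRead` are hypothetical names, and cached blocks are kept in memory here for brevity, whereas the PR stores them on the local disk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a remote reader: returns bytes [offset, offset + len).
using RemoteRead = std::function<std::string(size_t offset, size_t len)>;

// Hypothetical wrapper: callers only see read_at(); the cache stays hidden.
class LruBlockCachedReader {
public:
    LruBlockCachedReader(RemoteRead remote, size_t file_size, size_t block_size,
                         size_t capacity_blocks)
            : remote_(std::move(remote)),
              file_size_(file_size),
              block_size_(block_size),
              capacity_(std::max<size_t>(capacity_blocks, 1)) {
        assert(block_size_ > 0);
    }

    // Reads up to len bytes at offset, going to the remote only for missing blocks.
    std::string read_at(size_t offset, size_t len) {
        std::string out;
        if (offset >= file_size_) return out;
        const size_t end = offset + std::min(len, file_size_ - offset);
        while (offset < end) {
            const size_t idx = offset / block_size_;
            const std::string& block = get_block(idx);
            const size_t in_block = offset - idx * block_size_;
            if (in_block >= block.size()) break;  // remote returned a short block
            const size_t n = std::min(end - offset, block.size() - in_block);
            out.append(block, in_block, n);
            offset += n;
        }
        return out;
    }

    size_t hits() const { return hits_; }
    size_t misses() const { return misses_; }

private:
    using Entry = std::pair<size_t, std::string>;  // (block index, block bytes)

    const std::string& get_block(size_t idx) {
        auto it = index_.find(idx);
        if (it != index_.end()) {
            ++hits_;
            lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recently used
            return it->second->second;
        }
        ++misses_;
        const size_t start = idx * block_size_;
        const size_t n = std::min(block_size_, file_size_ - start);
        lru_.emplace_front(idx, remote_(start, n));
        index_[idx] = lru_.begin();
        if (lru_.size() > capacity_) {  // evict the least recently used block
            index_.erase(lru_.back().first);
            lru_.pop_back();
        }
        return lru_.front().second;
    }

    RemoteRead remote_;
    size_t file_size_;
    size_t block_size_;
    size_t capacity_;
    size_t hits_ = 0;
    size_t misses_ = 0;
    std::list<Entry> lru_;  // front = most recently used
    std::unordered_map<size_t, std::list<Entry>::iterator> index_;
};
```

In this sketch a second read of the same range touches only cached blocks, which is the behavior the description attributes to the file cache.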

Other changes:
1. Add a new interface `fs()` for `FileReader`.
2. `IOContext` gains statistics fields that track `FileCache` usage.
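
A rough sketch of how these two changes might look from a caller's side is shown below. All type names, field names, and signatures here are assumptions for illustration only; the actual declarations in the PR may differ.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical counters; the field names are illustrative, not taken from the PR.
struct FileCacheStats {
    int64_t cache_hits = 0;
    int64_t cache_misses = 0;
    int64_t bytes_from_cache = 0;
    int64_t bytes_from_remote = 0;
};

// Hypothetical IOContext carrying an optional pointer to the counters.
struct IOContext {
    FileCacheStats* cache_stats = nullptr;
};

// Hypothetical file system handle.
struct FileSystem {
    std::string name;
};

// A reader exposes the file system it was opened from through fs().
class FileReader {
public:
    virtual ~FileReader() = default;
    virtual std::shared_ptr<FileSystem> fs() const = 0;
};

class SimpleRemoteReader : public FileReader {
public:
    explicit SimpleRemoteReader(std::shared_ptr<FileSystem> fs) : fs_(std::move(fs)) {}
    std::shared_ptr<FileSystem> fs() const override { return fs_; }

private:
    std::shared_ptr<FileSystem> fs_;
};

// Records one read in the context's counters, if the caller asked for statistics.
inline void record_read(const IOContext& ctx, bool cache_hit, int64_t bytes) {
    if (ctx.cache_stats == nullptr) return;
    if (cache_hit) {
        ++ctx.cache_stats->cache_hits;
        ctx.cache_stats->bytes_from_cache += bytes;
    } else {
        ++ctx.cache_stats->cache_misses;
        ctx.cache_stats->bytes_from_remote += bytes;
    }
}
```

Passing the counters through the context keeps the reader interface unchanged while still letting a query report how much data came from the cache.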

Co-authored-by: Lightman <31928846+Lchangliang@users.noreply.github.com>
Tiewei Fang
2023-01-10 12:23:56 +08:00
committed by GitHub
parent dec79c000b
commit f17d69e450
86 changed files with 4105 additions and 186 deletions


@@ -216,11 +216,11 @@ void FileCacheManager::gc_file_caches() {
 FileCachePtr FileCacheManager::new_file_cache(const std::string& cache_dir, int64_t alive_time_sec,
                                               io::FileReaderSPtr remote_file_reader,
-                                              io::FileCacheType cache_type) {
+                                              io::FileCachePolicy cache_type) {
     switch (cache_type) {
-    case io::FileCacheType::SUB_FILE_CACHE:
+    case io::FileCachePolicy::SUB_FILE_CACHE:
         return std::make_unique<WholeFileCache>(cache_dir, alive_time_sec, remote_file_reader);
-    case io::FileCacheType::WHOLE_FILE_CACHE:
+    case io::FileCachePolicy::WHOLE_FILE_CACHE:
         return std::make_unique<SubFileCache>(cache_dir, alive_time_sec, remote_file_reader);
     default:
         return nullptr;