The Block layer abstraction, inspired by Kudu, sits between the "business layer" and the "underlying file storage layer" (`Env`), so the two are no longer strongly coupled. Business-layer code (such as `SegmentWriter`) no longer needs to perform file operations directly, which gives better encapsulation. Ideally, supporting a new file storage system in the future will only require adding a corresponding type of BlockManager, without modifying business code (such as `SegmentWriter`).

The Block layer brings several benefits:

1. First and foremost, the mapping between data and `Env` becomes more flexible. For example, in the storage engine, the data of a tablet can be placed in multiple file systems (`Env`) at the same time, so one-to-many relationships can be supported: one copy on local storage and one on remote storage.
2. The mapping between blocks and files can be adjusted and does not have to be one-to-one. For example, the data of multiple blocks can be stored in one physical file, which reduces the number of files that need to be opened during a query. This is what `LogBlockManager` does in Kudu.
3. The opened-file cache can be moved under the Block layer, which can then automatically close and reopen the files used by the upper layer. The business layer no longer needs to be aware of file handle limits at all (a problem we often hit in production today).
4. Better automatic cleanup when exceptions occur. For example, a block that is not closed explicitly can automatically clean up its corresponding file, which avoids most garbage files.
5. Batch file creation and deletion become more convenient. Some business operations, such as compaction, create multiple files. At present, each of these files is processed one by one: 1) creation; 2) writing data; 3) fsync to disk. This is not necessary: we only need to fsync the whole batch of files at the end. The advantage is that the operating system gets more opportunities to merge IO, which improves performance. This logic is relatively tedious and does not need to be coupled with the business code, so the Block layer is the ideal place for it.

This is the first patch. It only adds the related classes, laying the groundwork for later switching the read and write logic.
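The batch behaviour described in benefit 5 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the classes added by this patch: `FakeStorage`, `FakeFileBlock`, `BlockBatch`, `finalize()`, `sync()` and `demo_batch_commit()` are all hypothetical names, and the fake storage merely records the order of operations instead of touching a real file system. The point it shows is that business code writes through an abstract block, while the batch defers every sync until all blocks have been written.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: none of these names are the Doris or Kudu API.
namespace sketch {

// Stand-in for a storage backend (the role `Env` plays in the text).
// It only records which operations were issued, in order.
struct FakeStorage {
    std::map<std::string, std::string> files;
    std::vector<std::string> ops;
};

// What the business layer sees: no file handles, only blocks.
class WritableBlock {
public:
    virtual ~WritableBlock() = default;
    virtual void append(const std::string& data) = 0;
    virtual void finalize() = 0;  // no more writes; data not yet durable
    virtual void sync() = 0;      // make the data durable (fsync in a real backend)
};

// One block mapped to one fake file.
class FakeFileBlock : public WritableBlock {
public:
    FakeFileBlock(FakeStorage* storage, std::string path)
            : _storage(storage), _path(std::move(path)) {
        _storage->ops.push_back("create " + _path);
    }
    void append(const std::string& data) override {
        _storage->files[_path] += data;
        _storage->ops.push_back("write " + _path);
    }
    void finalize() override { _storage->ops.push_back("finalize " + _path); }
    void sync() override { _storage->ops.push_back("sync " + _path); }

private:
    FakeStorage* _storage;
    std::string _path;
};

// Collects blocks and syncs them together at the end, instead of
// syncing each block right after it is written.
class BlockBatch {
public:
    void add(std::unique_ptr<WritableBlock> block) {
        assert(block != nullptr);
        _blocks.push_back(std::move(block));
    }
    void commit() {
        for (auto& block : _blocks) block->finalize();
        for (auto& block : _blocks) block->sync();  // all syncs happen last
        _blocks.clear();
    }

private:
    std::vector<std::unique_ptr<WritableBlock>> _blocks;
};

// Writes two blocks through the batch and returns the recorded operations.
inline std::vector<std::string> demo_batch_commit() {
    FakeStorage storage;
    BlockBatch batch;
    const char* paths[] = {"seg_0", "seg_1"};
    for (const char* path : paths) {
        std::unique_ptr<WritableBlock> block(new FakeFileBlock(&storage, path));
        block->append("payload");
        batch.add(std::move(block));
    }
    batch.commit();
    return storage.ops;
}

}  // namespace sketch
```

Calling `sketch::demo_batch_commit()` returns the operation log, in which both `sync` entries come after every `create`, `write` and `finalize` entry. A real implementation would also have to propagate errors (for example as `Status` values) and decide what to do when one block in the batch fails; the sketch leaves both out.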
86 lines
2.6 KiB
C++
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#include "olap/fs/file_block_manager.h"

#include <memory>
#include <string>
#include <utility>

#include <gtest/gtest.h>

#include "env/env.h"
#include "util/file_utils.h"
#include "util/slice.h"

using std::string;

namespace doris {

class FileBlockManagerTest : public testing::Test {
protected:
    const string kBlockManagerDir = "./ut_dir/file_block_manager";

    // Start every test from an empty directory.
    void SetUp() override {
        if (FileUtils::check_exist(kBlockManagerDir)) {
            ASSERT_TRUE(FileUtils::remove_all(kBlockManagerDir).ok());
        }
        ASSERT_TRUE(FileUtils::create_dir(kBlockManagerDir).ok());
    }

    void TearDown() override {
        if (FileUtils::check_exist(kBlockManagerDir)) {
            ASSERT_TRUE(FileUtils::remove_all(kBlockManagerDir).ok());
        }
    }
};

TEST_F(FileBlockManagerTest, NormalTest) {
    fs::BlockManagerOptions bm_opts;
    bm_opts.read_only = false;
    bm_opts.enable_metric = false;
    Env* env = Env::Default();
    std::unique_ptr<fs::FileBlockManager> fbm(new fs::FileBlockManager(env, std::move(bm_opts)));

    // Write a block backed by a single file.
    std::unique_ptr<fs::WritableBlock> wblock;
    string fname = kBlockManagerDir + "/test_file";
    fs::CreateBlockOptions wblock_opts({ fname });
    Status st = fbm->create_block(wblock_opts, &wblock);
    ASSERT_TRUE(st.ok()) << st.get_error_msg();

    string data = "abcdefghijklmnopqrstuvwxyz";
    wblock->append(data);
    wblock->close();

    // Read the block back and check that size and content round-trip.
    std::unique_ptr<fs::ReadableBlock> rblock;
    st = fbm->open_block(fname, &rblock);
    ASSERT_TRUE(st.ok()) << st.get_error_msg();
    uint64_t file_size = 0;
    ASSERT_TRUE(rblock->size(&file_size).ok());
    ASSERT_EQ(data.size(), file_size);
    string read_buff(data.size(), 'a');
    Slice read_slice(read_buff);
    rblock->read(0, read_slice);
    ASSERT_EQ(data, read_buff);
    rblock->close();
}

} // namespace doris

int main(int argc, char **argv) {
    ::testing::InitGoogleTest(&argc, argv);
    return RUN_ALL_TESTS();
}