# Proposed changes
This PR fixes many issues that occur when building from source on macOS with the Apple M1 chip.
## ATTENTION
Supporting macOS on the Apple M1 chip is a large task, and many runtime issues remain unresolved:
1. Some errors with memory tracker occur when BE (RELEASE) starts.
2. Some UT cases fail.
...
As a temporary workaround, the following changes are made on macOS so that BE starts successfully:
1. Disable memory tracker.
2. Use tcmalloc instead of jemalloc.
This PR kicks off the work. Anyone interested is welcome to continue fixing these runtime issues.
## Use case
```shell
./build.sh -j 8 --be --clean
cd output/be/bin
ulimit -n 60000
./start_be.sh --daemon
```
## Something else
It takes _**10+**_ minutes to build BE (with prebuilt third-parties) on macOS with the M1 chip. The development experience on macOS will improve greatly once the adaptation work is finished.
This PR supports rowset-level data upload on the BE side, so that a tablet can hold both cold data and hot data,
and there is no need to prohibit loading new data into cooled tablets.
Each rowset is bound to a `FileSystem`, so that the storage layer can read and write rowsets without
perceiving the underlying filesystem.
The abstracted `RemoteFileSystem` can try local caching strategies at different granularities,
instead of caching segment files as before.
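
The binding between a rowset and its filesystem can be pictured with a minimal sketch. Everything below is hypothetical and simplified for illustration: `FileSystemSketch`, `InMemoryFs`, `RowsetSketch`, and their methods are assumed names, not the interfaces added by this PR. The point is only that the reading code goes through the bound filesystem and never checks whether the data is local or remote.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified filesystem interface (not the actual Doris API).
class FileSystemSketch {
public:
    virtual ~FileSystemSketch() = default;
    virtual bool is_remote() const = 0;
    virtual void write_file(const std::string& path, const std::string& data) = 0;
    virtual std::string read_file(const std::string& path) const = 0;
};

// In-memory stand-in used for both the "local" and the "remote" case.
class InMemoryFs : public FileSystemSketch {
public:
    explicit InMemoryFs(bool remote) : _remote(remote) {}
    bool is_remote() const override { return _remote; }
    void write_file(const std::string& path, const std::string& data) override {
        _files[path] = data;
    }
    std::string read_file(const std::string& path) const override {
        auto it = _files.find(path);
        return it == _files.end() ? std::string() : it->second;
    }

private:
    bool _remote;
    std::map<std::string, std::string> _files;
};

// Hypothetical rowset: it carries its own filesystem handle.
struct RowsetSketch {
    std::shared_ptr<FileSystemSketch> fs;
    std::string segment_path;

    // The caller does not need to know where the bytes come from.
    std::string load() const { return fs->read_file(segment_path); }
};

// Builds a rowset on the given filesystem and writes one segment to it.
RowsetSketch make_rowset(std::shared_ptr<FileSystemSketch> fs,
                         const std::string& path, const std::string& data) {
    fs->write_file(path, data);
    return RowsetSketch{std::move(fs), path};
}
```

With this shape, a tablet could hold one rowset bound to a local filesystem and another bound to a remote one, and both would be read through the same `load()` call.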
To avoid conflicts with the code in be/src/io, we temporarily put the file system related code in the be/src/io/fs directory.
In the future, `FileReader`s and `FileWriter`s should be unified.
First, we need a parameter that describes whether the data is local or remote.
Then, we need to add some basic functions to support operations on remote storage.
1. Replace all `boost::shared_ptr` with `std::shared_ptr`
2. Replace all `boost::scoped_ptr` with `std::unique_ptr`
3. Replace all `boost::scoped_array` with `std::unique_ptr<T[]>`
4. Replace all `boost::thread` with `std::thread`
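
The four replacements above can be sketched side by side. This is an illustration only: `Counter` and `sum_on_thread` are hypothetical names, and the comments show which Boost type each standard type stands in for.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>
#include <thread>

// Hypothetical type used only for this illustration.
struct Counter {
    int value = 0;
};

// Sums 0..n-1 on a separate thread using only standard-library types.
int sum_on_thread(std::size_t n) {
    // was: boost::shared_ptr<Counter>
    std::shared_ptr<Counter> shared = std::make_shared<Counter>();
    // was: boost::scoped_ptr<Counter>
    std::unique_ptr<Counter> scoped(new Counter());
    // was: boost::scoped_array<int>
    std::unique_ptr<int[]> buffer(new int[n]);
    std::iota(buffer.get(), buffer.get() + n, 0);

    // was: boost::thread
    std::thread worker([&shared, &buffer, n] {
        for (std::size_t i = 0; i < n; ++i) {
            shared->value += buffer[i];
        }
    });
    // A joinable std::thread must be joined or detached before it is
    // destroyed; otherwise std::terminate is called.
    worker.join();

    scoped->value = shared->value;
    return scoped->value;
}
```

One thing to check during such a migration is every place where a thread object may go out of scope without `join()` or `detach()`, since `std::thread` does not tolerate that.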
This CL mainly includes:
- Add methods to env that read a thread's stats from Linux system files.
- Support getting a thread's stats over HTTP.
- Register a page handler in BE that shows thread stats, to help developers
locate thread-related problems.
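
As a rough sketch of the first item, the snippet below parses one line in the format of Linux's `/proc/<pid>/task/<tid>/stat`. Which system file the CL actually reads is an assumption here, and `ThreadCpuTicks` and `parse_thread_stat` are hypothetical names. In that format, `utime` and `stime` are fields 14 and 15, and the thread name in field 2 may itself contain spaces or parentheses, so parsing starts after the last `)`.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type: CPU time in clock ticks.
struct ThreadCpuTicks {
    unsigned long user_ticks = 0;
    unsigned long kernel_ticks = 0;
};

// Parses one stat line; returns false if the line is malformed.
bool parse_thread_stat(const std::string& line, ThreadCpuTicks* out) {
    // Field 2 (comm) is wrapped in parentheses and may contain spaces,
    // so skip everything up to the last ')'.
    const std::size_t close = line.rfind(')');
    if (close == std::string::npos) {
        return false;
    }
    std::istringstream rest(line.substr(close + 1));
    std::vector<std::string> fields;
    std::string token;
    while (rest >> token) {
        fields.push_back(token);
    }
    // fields[0] is field 3 (state), so field 14 (utime) is fields[11]
    // and field 15 (stime) is fields[12].
    if (fields.size() < 13) {
        return false;
    }
    try {
        out->user_ticks = std::stoul(fields[11]);
        out->kernel_ticks = std::stoul(fields[12]);
    } catch (const std::exception&) {
        return false;
    }
    return true;
}
```

Tick counts can be converted to seconds by dividing by `sysconf(_SC_CLK_TCK)`.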
`RandomAccessFileOptions`, `WritableFileOptions`, and `RandomRWFileOptions` are
defined as structs but were previously declared as classes. This is valid C++,
but it results in a compile warning or error under the clang compiler.
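
A minimal sketch of the mismatch: clang reports it under `-Wmismatched-tags` when a forward declaration says `class` and the definition says `struct`. The member `example_flag` and the function `uses_options` are hypothetical and exist only to make the snippet complete.

```cpp
#include <cassert>

// The forward declaration now uses the same class-key as the definition.
// Writing `class RandomAccessFileOptions;` here would still compile, but
// clang would warn about the struct/class mismatch.
struct RandomAccessFileOptions;

// Declared before the definition, so only the forward declaration is visible.
bool uses_options(const RandomAccessFileOptions& opts);

struct RandomAccessFileOptions {
    bool example_flag = false; // hypothetical member, for illustration only
};

bool uses_options(const RandomAccessFileOptions& opts) {
    return opts.example_flag;
}
```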
The code submitted later will use this utility class.
Currently only factory methods for various file types are provided.
In the future, utility methods that are common to all Env types can
be added here.
When we need to ensure that **a newly-created file** is fully
synchronized back to disk, we should also call `fsync()` on the parent
directory, that is, the directory containing the newly-created file.
In other words, in this situation we should call `fsync()` on
both the newly-created file and its parent directory.
Unfortunately, Doris currently does not fsync directories in any
scenario.
This patch adds `sync_dir()` interface first, laying the groundwork
for future fixes.
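
The two-step sync described above can be sketched with plain POSIX calls. This is not the `sync_dir()` implementation from the patch: `fsync_directory` and `create_durable_file` are hypothetical helper names, and error handling is reduced to a boolean.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <string>

// Opens a directory read-only and fsyncs it, so that directory entries
// created inside it reach stable storage. Returns true on success.
bool fsync_directory(const std::string& dir) {
    int fd = ::open(dir.c_str(), O_RDONLY);
    if (fd < 0) {
        return false;
    }
    const bool ok = (::fsync(fd) == 0);
    ::close(fd);
    return ok;
}

// Creates dir/name with the given contents and makes both the file and
// its directory entry durable.
bool create_durable_file(const std::string& dir, const std::string& name,
                         const std::string& data) {
    const std::string path = dir + "/" + name;
    int fd = ::open(path.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        return false;
    }
    bool ok = ::write(fd, data.data(), data.size()) ==
              static_cast<ssize_t>(data.size());
    ok = ok && (::fsync(fd) == 0);  // step 1: sync the new file itself
    ok = (::close(fd) == 0) && ok;
    return ok && fsync_directory(dir);  // step 2: sync the parent directory
}
```

Skipping step 2 leaves a window in which the file's contents are on disk but the directory entry pointing to it is not, so the file can be missing after a crash.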
This patch also removes unneeded private method `dir_exists()`.
The upcoming patch will use CREATE_OR_OPEN mode.
This patch also moves the virtual dtors to the cpp file.
* Move the dtors back to env.h
Generally, whether to place a dtor in an `.h` file (inline) or in a `.cpp` file
depends on the trade-off between code expansion and function call overhead.
The code expansion rate is closely related to the number of class members
and the inheritance level.
The classes here, `Env`, `ReadableFile`, and `WritableFile`,
have no members and sit at the top of the inheritance hierarchy.
For now I have no clear evidence that making their dtors inline
causes serious code expansion or more instruction cache misses,
even if thousands of `ReadableFile` objects are continually created
and released at runtime.
`Env` now unifies all environment operations, such as file operations.
However, some of our old functions don't leverage it. This change makes
`FileUtils::scan_dir` use `Env`'s functions.