Add an optional executable binary fs_benchmark_tool, for test the performance of file system such as hdfs, s3.
Usage:
./fs_benchmark_tool --conf my.conf --fs_type=s3 --operation=read --iterations=5
in my.conf, you can add any config key value with following format:
key1=value1
key2=value2
By default, this binary will not be built. Only build it when setting BUILD_FS_BENCHMARK=ON.
The binary will be installed in output/be/lib.
For developer, you can add new subclass of BaseBenchmark to add your own benchmark.
See be/src/io/fs/benchmark/s3_benchmark.hpp for an example
This PR calculates the size of the inverted index files. The changes consist of:
Introduction of a new get_inverted_index_size() method in different column writers such as ScalarColumnWriter, StructColumnWriter, ArrayColumnWriter, and MapColumnWriter. This method will fetch the size of the inverted index file associated with that column. If the file size cannot be fetched, it defaults to 0.
A new method file_size() has been added in InvertedIndexColumnWriter class which retrieves the size of the file stored on disk. If the file size cannot be fetched, it logs an error and returns -1.
Additionally, a new method get_inverted_index_file_size() is introduced in SegmentWriter which aggregates the inverted index file sizes of all the column writers.
* [Bug](topn opt) Fix Two-Phase read when some rowset swept
If this is a Two-Phase read query, and we need to delay the release of Rowset by row->update_delayed_expired_timestamp() to expand the lifespan of rowsets. This is necessary to avoid data loss during the second phase reading, where some stale rowsets may be swept and result in missing data.
When FE is old version, be is new version, issue a schema change(add column) and
then query, old version of FE query without schema version could result in reading
stale schema from schema cache
1. Add hdfs file handle cache for hdfs file reader
Copied from Impala, `https://github.com/apache/impala/blob/master/be/src/util/lru-multi-cache.h`. (Thanks for the Impala team)
This is a lru cache that can store multi entries with same key.
The key is build with {file name + modification time}
The value is the hdfsFile pointer that point to a certain hdfs file.
This cache is to avoid reopen same hdfs file mutli time, which can save
query time.
Add a BE config `max_hdfs_file_handle_cache_num` to limit the max number
of file handle cache, default is 20000.
2. Add file meta cache
The file meta cache is a lru cache. the key is {file name + modification time},
the value is the parsed file meta info of the certain file, which can save
the time of re-parsing file meta everytime.
Currently, it is only used for caching parquet file footer.
The test show that is cache is hit, the `FileOpenTime` and `ParseFooterTime` is reduce to almost 0
in query profile, which can save time when there are lots of files to read.
The java-udf module has become increasingly large and difficult to manage, making it inconvenient to package and use as needed. It needs to be split into multiple sub-modules, such as : java-commom、java-udf、jdbc-scanner、hudi-scanner、 paimon-scanner.
Co-authored-by: lexluo <lexluo@tencent.com>
After supporting insert-only transactional hive full acid tables #19518, #19419, this PR support transactional hive full acid tables.
Support hive3 transactional hive full acid tables.
Hive2 transactional hive full acid tables need to run major compactions.
Currently, there are many profiles using add child profile to orgnanize profile into blocks. But it is wrong. Child profile will have a total time counter. Actually, what we should use is just a label.
- MemoryUsage:
- HashTable: 23.98 KB
- SerializeKeyArena: 446.75 KB
Add a new macro ADD_LABEL_COUNTER to add just a label in the profile.
---------
Co-authored-by: yiguolei <yiguolei@gmail.com>
PR(https://github.com/apache/doris/pull/19909) has implemented the framework of hudi reader for MOR table. This PR completes all functions of reading MOR table and enables end-to-end queries.
Key Implementations:
1. Use hudi meta information to generate the table schema, not from hive client.
2. Use hive client to list hudi partitions, so it strongly depends the sync-tools(https://hudi.apache.org/docs/syncing_metastore/) which syncs the partitions of hudi into hive metastore. However, we may get the hudi partitions directly from .hoodie directory.
3. Remove `HudiHMSExternalCatalog`, because other catalogs like glue is compatible with hive catalog.
4. Read the COW table originally from c++.
5. Hudi RecordReader will use ProcessBuilder to start a hotspot debugger process, which may be stuck when attaching the origin JNI process, soI use a tricky method to kill this useless process.
start time: Wed 07 Jun 2023 06:50:14 PM CST
*** Query id: e9000000e9-eb00000073 ***
*** Aborted at 1686136356 (unix time) try "date -d @1686136356" if you are using GNU date ***
*** Current BE git commitID: 5c33dd7a2c ***
*** SIGSEGV address not mapped to object (@0x23000000235) received by PID 2131238 (TID 2132258 OR 0x7f708eff7700) from PID 565; stack trace: ***
0# doris::signal::(anonymous namespace)::FailureSignalHandler(int, siginfo_t*, void*) at /mnt/hdd01/repo_center/doris_branch-2.0-beta/doris/be/src/common/signal_handler.h:413
1# 0x00007F727BBE3090 in /lib/x86_64-linux-gnu/libc.so.6
2# doris::AttachTask::AttachTask(doris::RuntimeState*) at /mnt/hdd01/repo_center/doris_branch-2.0-beta/doris/be/src/runtime/thread_context.cpp:43
3# std::_Function_handler<void (doris::PTabletWriterAddBlockResult const&, bool), doris::stream_load::VNodeChannel::open_wait()::$_1>::_M_invoke(std::_Any_data const&, doris::PTabletWriterAddBlockResult const&, bool&&) at /var/local/ldb_toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291
4# doris::stream_load::ReusableClosure<doris::PTabletWriterAddBlockResult>::Run() at /mnt/hdd01/repo_center/doris_branch-2.0-beta/doris/be/src/vec/sink/vtablet_sink.h:176
5# brpc::Controller::EndRPC(brpc::Controller::CompletionInfo const&) in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
6# brpc::Controller::OnVersionedRPCReturned(brpc::Controller::CompletionInfo const&, bool, int) in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
7# brpc::policy::ProcessRpcResponse(brpc::InputMessageBase*) in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
8# brpc::InputMessenger::InputMessageClosure::~InputMessageClosure() in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
9# brpc::InputMessenger::OnNewMessages(brpc::Socket*) in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
10# brpc::Socket::ProcessEvent(void*) in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
11# bthread::TaskGroup::task_runner(long) in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
12# bthread_make_fcontext in /root/20230607171843-doris-branch-2.0-beta-5c33dd7a/be/lib/doris_be
in join node, if it's broadcast_join
and shared hash table, some counter/timer about build hash table is useless,
so we could add those counter/timer in faker profile, and those will not display in web profile.