Currently, there are some useless includes in the codebase. We can use a tool named include-what-you-use to optimize these includes. By using a strict include-what-you-use policy, we can get lots of benefits from it.
See #17764 for details
I have tested:
- Unit test for local/s3/hdfs/broker file system: be/test/io/fs/file_system_test.cpp
- Outfile to local/s3/hdfs/broker.
- Load from local/s3/hdfs/broker.
- Query file on local/s3/hdfs/broker file system, with table value function and catalog.
- Backup/Restore with local/s3/hdfs/broker file system
Not test:
- cold & host data separation case.
Currently, BE will print fail to get master client from cache. host=xxxxx, port=9228, code=THRIFT_RPC_ERROR but we did not know which step generate this error. So that I refactor error status in be and add error stack for RPC_ERROR.
W1122 10:19:21.130796 30405 utils.cpp:89] fail to get master client from cache. host=xxxx, port=9228, code=RPC error(error -1): Couldn't open transport for xxxx:9228 (open() timed out)/n @ 0x559af8f774ea doris::Status::ConstructErrorStatus()
@ 0x559af9aacbee _ZN5doris16ThriftClientImpl4openEv.cold
@ 0x559af97f563a doris::ClientCacheHelper::_create_client()
@ 0x559af97f78cd doris::ClientCacheHelper::get_client()
@ 0x559af934f38b doris::MasterServerClient::report()
@ 0x559af932e7a7 doris::TaskWorkerPool::_handle_report()
@ 0x559af932f07c doris::TaskWorkerPool::_report_task_worker_thread_callback()
@ 0x559af9b223c5 doris::ThreadPool::dispatch_thread()
@ 0x559af9b187af doris::Thread::supervise_thread()
@ 0x7f661bd8bea5 start_thread
@ 0x7f661c09eb0d __clone
Co-authored-by: yiguolei <yiguolei@gmail.com>
boost::stacktrace::stacktrace() has memory leak, so use glog internal func to print stacktrace.
The reason for the memory leak of boost::stacktrace is that a state is saved in the thread local of each thread but not actively released. The test found that each thread leaked about 100M after calling boost::stacktrace.
refer to:
boostorg/stacktrace#118boostorg/stacktrace#111
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
There are 3 error code types in BE: OLAPStatus AgentStatus Status.
It is very confused and sometimes conflict during write code.
I will try to unify them to Status.
This CL mainly changes:
1. Reorganized the code logic to limit the supported json format to two, and the import behavior is more consistent.
2. Modified the statistical behavior of the number of error rows when loading in json format, so that the error rows can be counted correctly.
3. See `load-json-format.md` to get details of loading json format.
* Reduce UT binary size
Almost every module depend on ExecEnv, and ExecEnv contains all
singleton, which make UT binary contains all object files.
This patch seperate ExecEnv's initial and destory to anthor file to
avoid other file's dependence. And status.cc include debug_util.h which
depend tuple.h tuple_row.h, and I move get_stack_trace() to
stack_util.cpp to reduce status.cc's dependence.
I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a
Issue: #292
* Update
1. Implement Backend http server using libevent instead of mongoose.
2. Remove Old Hypertable rpc framework, use brpc instead.
3. Change rpc from FE to BE to brpc.
4. Fs broker support HDFS HA.
5. add more metrics to monitor.
6. Lots of bug fixed.