Commit Graph

44 Commits

Author SHA1 Message Date
d5ea677282 [feature](tracing) Support query tracing to improve doris observability by introducing OpenTelemetry. (#10533)
The collection of query traces is implemented in fe and be, and the spans are exported to zipkin.
DSIP: https://cwiki.apache.org/confluence/display/DORIS/DSIP-012%3A+Introduce+opentelemetry
2022-07-09 15:50:40 +08:00
70450d04ba [typo] Fix typos in comments (#10172) 2022-06-19 10:30:17 +08:00
0376ca17f3 [Enhancement] Remove minidump (#9894) 2022-06-01 08:04:24 +08:00
ca05d1ee01 [fix](memory tracker) Fix lru cache, compaction tracker, add USE_MEM_TRACKER compile (#9661)
1. Fix Lru Cache MemTracker consumption value is negative.
2. Fix compaction Cache MemTracker has no track.
3. Add USE_MEM_TRACKER compile option.
4. Make sure the malloc/free hook is not stopped at any time.
2022-05-25 08:56:17 +08:00
718a51a388 [refactor][style] Use clang-format to sort includes (#9483) 2022-05-10 21:25:35 +08:00
b34ed43ec9 [feature-wip] (memory tracker) (step6, End) Fix some details (#9301)
1. Fix LoadTask, ChunkAllocator, TabletMeta, Brpc, the accuracy of memory track.
2. Modified some MemTracker names, deleted some unnecessary trackers, and improved readability.
3. More powerful MemTracker debugging capabilities.
4. Avoid creating TabletColumn temporary objects and improve BE startup time by 8%.
5. Fix some other details.
2022-05-10 18:17:09 +08:00
c9961c9bb9 [style] clang-format all c++ code (#9305)
- sh build-support/clang-format.sh  to  clang-format all c++ code
2022-04-29 16:14:22 +08:00
e5e0dc421d [refactor] Change ALL OLAPStatus to Status (#8855)
Currently, there are 2 status code in BE, one is common/Status.h,
and the other is olap/olap_define.h called OLAPStatus.
OLAPStatus is just an enum type, it is very simple and could not save many informations,
I will unify these code to common/Status.
2022-04-14 11:43:49 +08:00
e63afc1a3c [feature-wip](remote storage)(step2) add storage_backend_mgr on BE side (#8663)
1. add storage backend mgr
2. remove env_remote
2022-03-31 11:13:14 +08:00
aaaaae53b5 [feature] (memory) Switch TLS mem tracker to separate more detailed memory usage (#8605)
In pr #8476, all memory usage of a process is recorded in the process mem tracker,
and all memory usage of a query is recorded in the query mem tracker,
and it is still necessary to manually call `transfer to` to track the cached memory size.

We hope to separate out more detailed memory usage based on Hook TCMalloc new/delete + TLS mem tracker.

In this pr, the more detailed mem tracker is switched to TLS, which automatically and accurately
counts more detailed memory usage than before.
2022-03-24 14:29:34 +08:00
b89e4c7bba [feature-wip](java-udf) support java UDF with fixed-length input and output (#8516)
This feature is propsoed in [DSIP-1](https://cwiki.apache.org/confluence/display/DORIS/DSIP-001%3A+Java+UDF). 
This PR support fixed-length input and output Java UDF. Phase I in DIP-1 is done after this PR.

To support Java UDF effeciently, I use no data copy in JNI call and all compute operations are off-heap in Java.
To achieve that, I use a UdfExecutor instead. 

For users, a UDF class must have a public evaluate method.
2022-03-23 10:32:50 +08:00
989e03ddf9 [improvement] Improve sig handler (#8545)
* Refactor glog's default signal handler

Co-authored-by: Zhengguo Yang <780531911@qq.com>
2022-03-22 10:40:31 +08:00
eeae516e37 [Feature](Memory) Hook TCMalloc new/delete automatically counts to MemTracker (#8476)
Early Design Documentation: https://shimo.im/docs/DT6JXDRkdTvdyV3G

Implement a new way of memory statistics based on TCMalloc New/Delete Hook,
MemTracker and TLS, and it is expected that all memory new/delete/malloc/free
of the BE process can be counted.
2022-03-20 23:06:54 +08:00
e17aef9467 [refactor] refactor the implement of MemTracker, and related usage (#8322)
Modify the implementation of MemTracker:
1. Simplify a lot of useless logic;
2. Added MemTrackerTaskPool, as the ancestor of all query and import trackers, This is used to track the local memory usage of all tasks executing;
3. Add cosume/release cache, trigger a cosume/release when the memory accumulation exceeds the parameter mem_tracker_consume_min_size_bytes;
4. Add a new memory leak detection mode (Experimental feature), throw an exception when the remaining statistical value is greater than the specified range when the MemTracker is destructed, and print the accurate statistical value in HTTP, the parameter memory_leak_detection
5. Added Virtual MemTracker, cosume/release will not sync to parent. It will be used when introducing TCMalloc Hook to record memory later, to record the specified memory independently;
6. Modify the GC logic, register the buffer cached in DiskIoMgr as a GC function, and add other GC functions later;
7. Change the global root node from Root MemTracker to Process MemTracker, and remove Process MemTracker in exec_env;
8. Modify the macro that detects whether the memory has reached the upper limit, modify the parameters and default behavior of creating MemTracker, modify the error message format in mem_limit_exceeded, extend and apply transfer_to, remove Metric in MemTracker, etc.;

Modify where MemTracker is used:
1. MemPool adds a constructor to create a temporary tracker to avoid a lot of redundant code;
2. Added trackers for global objects such as ChunkAllocator and StorageEngine;
3. Added more fine-grained trackers such as ExprContext;
4. RuntimeState removes FragmentMemTracker, that is, PlanFragmentExecutor mem_tracker, which was previously used for independent statistical scan process memory, and replaces it with _scanner_mem_tracker in OlapScanNode;
5. MemTracker is no longer recorded in ReservationTracker, and ReservationTracker will be removed later;
2022-03-11 22:04:23 +08:00
0ff7de4157 [refactor] remove agent status (#8273)
There are 3 error code types in BE: OLAPStatus AgentStatus Status.
It is very confused and sometimes conflict during write code.
I will try to unify them to Status.
2022-03-09 13:04:50 +08:00
d1cb2913c1 [improvement] check simd instructions before start (#8042)
Sometimes BE is build on a machine with SIMD instruction such as AVX2.
But the BE binary will be copied to a machine without AVX2. It will crashed without any error message.

This PR will check the required SIMD instructions and print error messages during startup.
2022-02-17 10:46:03 +08:00
20ef8a6e21 [feature-wip](remote storage)(step1) use a struct instead of string for parameter path, add basic remote method (#7098)
For the first, we need to make a parameter to discribe the data is local or remote.
At then, we need to support some basic function to support the operation for remote storage.
2021-12-22 22:58:23 +08:00
e2d3d0134e dd a method to get doris current memory usage (#6979)
Add all memory usage check when TryConsume memory
2021-11-24 10:07:54 +08:00
a81f4da4e4 [feat](minidump) Add minidump support (#7124)
Now minidump file will be created when BE crashes.
And user can manually trigger a minidump by sending SIGUSR1 to BE process.

More details can be found in minidump.md documents
2021-11-20 21:41:26 +08:00
6c6380969b [refactor] replace boost smart ptr with stl (#6856)
1. replace all boost::shared_ptr to std::shared_ptr
2. replace all boost::scopted_ptr to std::unique_ptr
3. replace all boost::scoped_array to std::unique<T[]>
4. replace all boost:thread to std::thread
2021-11-17 10:18:35 +08:00
6597a338dc [Feature] Support config max length of zone map index (#6293) 2021-07-30 09:23:11 +08:00
a803ceea86 [refactor] Remove boost mutex, use std::mutex instead (#5684)
* Remove boost mutex, use std::mutex instead

* replace shared_mutex
2021-04-22 11:29:36 +08:00
6fedf5881b [CodeFormat] Clang-format cpp sources (#4965)
Clang-format all c++ source files.
2020-11-28 18:36:49 +08:00
0213e93ea1 [Feature][Config] Support persistence of configuration items modified at runtime (#4704)
Support persistence of configuration items modified at runtime via HTTP API.

```
FE:
GET /api/_set_config?key=value&persist=true

BE
POST /api/update_config?key=value&persist=true
```

The modified config will be saved in `fe_custom.conf` or `be_custom.conf`.
And when process starts, it will load `fe.conf/be.conf` first, then `fe_custom.conf/be_custom.conf`.
2020-10-22 21:39:02 +08:00
09f97f8a05 [Refactor] Fixes some be typo part 2 (#4747) 2020-10-20 09:28:57 +08:00
b780df697a [refactor] Optimize threads usage mode in BE (#4440)
BE can not graceful exit because some threads are running in endless
loop. This patch do the following optimization:
- Use the well encapsulated Thread and ThreadPool instead of std::thread
  and std::vector<std::thread>
- Use CountDownLatch in thread's loop condition to avoid endless loop
- Introduce a new class Daemon for daemon works, like tcmalloc_gc,
  memory_maintenance and calculate_metrics
- Decouple statistics type TaskWorkerPool and StorageEngine notification
  by submit tasks to TaskWorkerPool's queue
- Reorder objects' stop and deconstruct in main(), i.e. stop network
  services at first, then internal services
- Use libevent in pthreads mode, by calling evthread_use_pthreads(),
  then EvHttpServer can exit gracefully in multi-threads
- Call brpc::Server's Stop() and ClearServices() explicitly
2020-09-06 20:19:14 +08:00
ed9022a908 Ignore broken disk when BE starts up (#3741) 2020-06-05 10:26:07 +08:00
2ad1b20b24 [Config] Add new BE config for tcmalloc (#3732)
Add a new BE config tc_max_total_thread_cache_bytes
2020-06-03 21:58:13 +08:00
2f7d2c7e1a [BUG] Fix a bug that ignore_broken_disk may not work (#3486)
When BE sets `ignore_broken_disk` to true, it's expected that non-exist path in storage_root_path won't prevent BE from launching, but in 0.12 BE fails to launch in such scenario.

```
W0506 14:46:11.039953 17040 options.cpp:64] path can not be canonicalized. may be not exist. path=/data11/olap
W0506 14:46:11.040014 17040 options.cpp:141] failed to parse store path /data11/olap, res=-203
```
The reason is that #2861 adds a path existence check in `parse_root_path` which precedes the usage of `ignore_broken_disk` in the main method.
2020-05-08 12:53:44 +08:00
74b987f053 [Bug] Fix bug that storage engine bg threads should start after env is ready 2020-04-29 11:21:19 +08:00
e4682398bd [web] Dump configs on BE's website '/varz' (#3220)
Dump configs on BE's website '/varz'
Change NAVIGATION_BAR_PREFIX from 'Impala' to 'Doris'
Format the related files by clang-format
2020-03-28 16:26:38 +08:00
0f98f975c7 Remove unused LLVM related codes of directory:be/src/codegen (#2910) (#2987)
Remove unused LLVM related codes of directory (step 5):be/src/codegen (#2910)

there are many LLVM related codes in code base, but these codes are not really used.
The higher version of GCC is not compatible with the LLVM 3.4.2 version currently used by Doris.
The PR delete all LLVM related code of directory: be/src/codegen
2020-02-26 10:57:57 +08:00
e991b1300f [Code Refactor] Refactor AgentServer to make it less error-prone and more readable (#2831)
In `AgentServer`, each task type needs to be processed separately,
which leads to very long code, hard to read, and not easy to detect
errors (for example, some task type processing may be missed,
corresponding relationship may be error)

Fortunately, the code for each task_type is very similar, so this
is a good case to use `MACRO`, which can greatly reduce the repeated
code and solve above problems.

This patch also fix two small bugs:
1. The `_topic_subscriber` member has not been released in dtor
2. in `submit_tasks()`, the `status_code` is not reset before
   each task is processed, resulting in wrong judgment.

No functional changes in this patch.
2020-02-06 09:56:00 +08:00
64b2291347 Allow user to ignore the broken disk (#2755)
Add a BE config `ignore_broken_disk`.
2020-01-14 22:40:43 +08:00
c07f37d78c [Segment V2] Add a control framework between FE and BE through heartbeat #2247 (#2364)
The control framework is implemented through heartbeat message. Use uint64_t as flags to control different functions. 
Now add a flag to set the default rowset type to beta.
2019-12-12 12:18:32 +08:00
f852f50acb Improve unique id performance (#1911)
Remove the default constructor for UniqueID
Add a gen_uid method in UniqueId. If need to generate a new uid, users should call this api explicitly.
Reuse boost random generator not generate a new one every time.
2019-09-29 18:20:02 +08:00
2f0808137a Refactor FrontendHelper (#1888) 2019-09-27 13:21:14 +08:00
6f4feca3dc Add rowset id generator to FE and BE (#1678) 2019-09-02 18:51:31 +08:00
0d48a3961c Refactor Storage Engine (#1478)
NOTE: This patch would modify all Backend's data.
And this will cause a very long time to restart be.
So if you want to interferer your product environment,
you should upgrade backend one by one.

1. Refactoring be is to clarify the structure the codes.
2. Use unique id to indicate a rowset.
   Nameing rowset with tablet_id and version will lead to
   many conflicts among compaction, clone, restore.
3. Extract an rowset interface to encapsulate rowsets
   with different format.
2019-07-15 21:18:22 +08:00
0820a29b8d Implement the routine load process of Kafka on Backend (#671) 2019-04-28 10:33:50 +08:00
ca2ab5da59 Add more thrift log (#605) 2019-01-30 21:28:42 +08:00
ff12281a72 Call curl_global_init at the begining (#602)
To avoid concurrent initilization of libcurl, we call curl_global_init
in our main function.
2019-01-30 10:27:19 +08:00
a2b299e3b9 Reduce UT binary size (#314)
* Reduce UT binary size

Almost every module depend on ExecEnv, and ExecEnv contains all
singleton, which make UT binary contains all object files.

This patch seperate ExecEnv's initial and destory to anthor file to
avoid other file's dependence. And status.cc include debug_util.h which
depend tuple.h tuple_row.h, and I move get_stack_trace() to
stack_util.cpp to reduce status.cc's dependence.

I add USE_RTTI=1 to build rocksdb to avoid linking librocksdb.a

Issue: #292

* Update
2018-11-15 16:17:23 +08:00
37b4cafe87 Change variable and namespace name in BE (#268)
Change 'palo' to 'doris'
2018-11-02 10:22:32 +08:00