60 Commits

Author SHA1 Message Date
b86e4e8498 [fix](profile) fix possible coredump of rpc verbose profile (#40117)
## Proposed changes

`_instance_to_rpc_stats_vec` may be updated while it is being sorted in
`ExchangeSinkBuffer<Parent>::update_profile`, which can cause a coredump.
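
One way to avoid this kind of race, sketched under assumptions: copy the shared vector while holding a lock, then sort the private copy. `StatsRegistry`, `RpcInstanceStats`, and the mutex are hypothetical names for illustration, not the actual Doris code, and the fix in this commit may differ.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for one per-instance RPC statistics entry.
struct RpcInstanceStats {
    int64_t instance_id = 0;
    int64_t max_time_ns = 0;
};

// Hypothetical owner of the shared statistics vector.
class StatsRegistry {
public:
    // May be called from any thread while a profile update is running.
    void add(int64_t instance_id, int64_t max_time_ns) {
        auto entry = std::make_shared<RpcInstanceStats>();
        entry->instance_id = instance_id;
        entry->max_time_ns = max_time_ns;
        std::lock_guard<std::mutex> guard(_mutex);
        _stats.push_back(std::move(entry));
    }

    // Copy the vector under the lock, then sort only the private copy,
    // so a concurrent add() can never invalidate the iterators being sorted.
    std::vector<std::shared_ptr<RpcInstanceStats>> sorted_snapshot() const {
        std::vector<std::shared_ptr<RpcInstanceStats>> snapshot;
        {
            std::lock_guard<std::mutex> guard(_mutex);
            snapshot = _stats;
        }
        std::sort(snapshot.begin(), snapshot.end(),
                  [](const auto& a, const auto& b) {
                      return a->max_time_ns > b->max_time_ns;
                  });
        return snapshot;
    }

private:
    mutable std::mutex _mutex;
    std::vector<std::shared_ptr<RpcInstanceStats>> _stats;
};
```

The sketch only protects the vector itself; fields of an entry that are updated concurrently would still need their own synchronization.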
2024-08-29 23:54:52 +08:00
2dea859bdb [debug](rpc) debug rpc time consumption problem (#39852)
## Proposed changes

Add detailed RPC time info for each channel, sorted by each channel's
max RPC time:
```
                     DATA_STREAM_SINK_OPERATOR  (id=1,dst_id=1):
                          -  Partitioner:  Crc32HashPartitioner(64)
                          -  BlocksProduced:  74
                          -  BrpcSendTime:  2.689us
                          -  BrpcSendTime.Wait:  0ns
                          -  BytesSent:  89.35  KB
                          -  CloseTime:  680.152us
                          -  CompressTime:  0ns
                          -  ExecTime:  160.663ms
                          -  InitTime:  263.608us
                          -  InputRows:  32.512K  (32512)
                          -  LocalBytesSent:  0.00  
                          -  LocalSendTime:  0ns
                          -  LocalSentRows:  0
                          -  MemoryUsage:  
                              -  PeakMemoryUsage:  80.00  KB
                          -  MergeBlockTime:  0ns
                          -  OpenTime:  4.113ms
                          -  OverallThroughput:  0.0  /sec
                          -  PendingFinishDependency:  41.179ms
                          -  RowsProduced:  32.512K  (32512)
                          -  RpcAvgTime:  11.850ms
                          -  RpcCount:  10
                          -  RpcMaxTime:  86.891ms
                          -  RpcMinTime:  15.200ms
                          -  RpcSumTime:  118.503ms
                          -  SerializeBatchTime:  13.517ms
                          -  SplitBlockDistributeByChannelTime:  38.923ms
                          -  SplitBlockHashComputeTime:  2.659ms
                          -  UncompressedRowBatchSize:  135.19  KB
                          -  WaitForDependencyTime:  0ns
                              -  WaitForRpcBufferQueue:  0ns
                        RpcInstanceDetails:
                              -  Instance  85d4f75b72a9ea61:  Count:  4,  MaxTime:  36.238ms,  MinTime:  12.107ms,  AvgTime:  21.722ms,  SumTime:  86.891ms
                              -  Instance  85d4f75b72a9ea91:  Count:  3,  MaxTime:  11.107ms,  MinTime:  2.431ms,  AvgTime:  5.470ms,  SumTime:  16.412ms
                              -  Instance  85d4f75b72a9eac1:  Count:  3,  MaxTime:  7.554ms,  MinTime:  3.160ms,  AvgTime:  5.066ms,  SumTime:  15.200ms
```
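
A minimal sketch of how per-instance statistics like the `RpcInstanceDetails` lines could be accumulated and ordered by max time. The type and function names (`InstanceRpcStats`, `RpcTimeRecorder`, `sorted_by_max`) are assumptions made for this example, not the identifiers used in the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <unordered_map>
#include <vector>

// Hypothetical per-instance accumulator: count, max, min and sum of RPC times.
struct InstanceRpcStats {
    int64_t instance_id = 0;
    int64_t count = 0;
    int64_t max_ns = 0;
    int64_t min_ns = std::numeric_limits<int64_t>::max();
    int64_t sum_ns = 0;

    // Average is derived from sum and count, so it never has to be stored.
    int64_t avg_ns() const { return count == 0 ? 0 : sum_ns / count; }
};

class RpcTimeRecorder {
public:
    // Record the elapsed time of one finished RPC to the given instance.
    void record(int64_t instance_id, int64_t elapsed_ns) {
        InstanceRpcStats& stats = _stats[instance_id];
        stats.instance_id = instance_id;
        ++stats.count;
        stats.sum_ns += elapsed_ns;
        stats.max_ns = std::max(stats.max_ns, elapsed_ns);
        stats.min_ns = std::min(stats.min_ns, elapsed_ns);
    }

    // Return a copy of all entries, slowest instance (by max time) first.
    std::vector<InstanceRpcStats> sorted_by_max() const {
        std::vector<InstanceRpcStats> result;
        result.reserve(_stats.size());
        for (const auto& entry : _stats) {
            result.push_back(entry.second);
        }
        std::sort(result.begin(), result.end(),
                  [](const InstanceRpcStats& a, const InstanceRpcStats& b) {
                      return a.max_ns > b.max_ns;
                  });
        return result;
    }

private:
    std::unordered_map<int64_t, InstanceRpcStats> _stats;
};
```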
2024-08-24 19:59:39 +08:00
b864aa7aa2 [fix](pipeline) Fix query hang up if limited rows is reached (#35513) (#35746)
Follow-up for #35466.

We should ensure that closed tasks do not block other tasks.

2024-05-31 22:50:57 +08:00
25358564ca [Fix](compile) Fix gcc compile on master (#33864)
This was introduced by #33511, which wrongly used

ColumnStr<T> ();

This violates the C++20 standard (see https://wg21.cmeerw.net/cwg/issue2237) but is still accepted by clang for now (see llvm/llvm-project#58112).
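
For illustration, a small sketch of the rule behind CWG 2237, assuming the offending line was a constructor declared with a template-id. The `ColumnStr` class below is a simplified stand-in written for this example, not the Doris class.

```cpp
#include <string>
#include <utility>

// Simplified stand-in class, only here to show the declaration rule.
template <typename T>
class ColumnStr {
public:
    // Ill-formed since C++20 (CWG 2237): a constructor may not be declared
    // with a simple-template-id as its name.
    //     ColumnStr<T>() {}
    //
    // Well-formed in every standard version: use the injected-class-name.
    ColumnStr() = default;

    explicit ColumnStr(T value) : _value(std::move(value)) {}

    const T& value() const { return _value; }

private:
    T _value{};
};
```

Inside the class template, the plain name `ColumnStr` already refers to the current specialization, so dropping `<T>` from the constructor declaration does not change its meaning.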
2024-04-19 23:41:37 +08:00
1a8b1e6787 [pipelineX](broadcast) Set dependency ready if a limited exchange returns EOS (#33525) 2024-04-17 23:42:00 +08:00
cf7595d423 [opt](memory) Optimize mem tracker accuracy (#32039) (#33140) 2024-04-10 11:42:19 +08:00
7675383c40 [bugfix](deadlock) fix dead lock in cancel fragment (#33181)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-04-03 13:41:24 +08:00
bea05da638 [shuffle](fix) Do not use copy assignment for TUniqueId (#32969) 2024-03-29 10:37:26 +08:00
4bf202db04 [pipelineX](exchange) Make exchange buffer size configurable (#32201) 2024-03-16 20:58:20 +08:00
5cfd7c2a1c [improvement](memtracker) should counter memory usage to query when exchange sink buffer rpc (#30964)
* [improvement](memtracker) should counter memory usage to query when rpc callback

* f

---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-02-16 10:12:25 +08:00
6614d40dad [bugfix](core) fix core due to send rpc and request is deconstructed (#30344)
Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-25 13:24:52 +08:00
ce5ba61640 [refactor](close)Full refactor async writer (#30082)
---------

Co-authored-by: yiguolei <yiguolei@gmail.com>
2024-01-23 13:22:15 +08:00
Pxl
3cf95d0fdf [Improvement](execute) optimize for ColumnNullable's serialize_vec/deserialize_vec (#28788)
optimize for ColumnNullable's serialize_vec/deserialize_vec
2024-01-12 11:59:52 +08:00
0d691c638b [Feature](profile)Support report runtime workload statistics #29591 2024-01-12 11:59:27 +08:00
18ad8562f2 [refactor](broadcastbuffer) using a queue to remove ref and unref codes (#28698)
Co-authored-by: yiguolei <yiguolei@gmail.com>
Add a new class, broadcastbufferholderqueue, to manage holders.
Use shared ptr to manage holders instead of ref and unref, which are too difficult to maintain.
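
A rough sketch of the idea, not the code from this commit: a queue hands out holders as `std::shared_ptr`, and the last owner to drop its pointer returns the holder to the queue automatically, so no manual ref and unref calls are needed. `BlockHolder`, `HolderQueue`, `acquire`, and `available` are hypothetical names.

```cpp
#include <cstddef>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for a buffer that holds one serialized block.
struct BlockHolder {
    std::string payload;
};

class HolderQueue : public std::enable_shared_from_this<HolderQueue> {
public:
    static std::shared_ptr<HolderQueue> create(std::size_t capacity) {
        std::shared_ptr<HolderQueue> queue(new HolderQueue());
        for (std::size_t i = 0; i < capacity; ++i) {
            queue->_free.push_back(std::make_unique<BlockHolder>());
        }
        return queue;
    }

    // Returns nullptr when every holder is currently in use.
    std::shared_ptr<BlockHolder> acquire() {
        std::unique_ptr<BlockHolder> holder;
        {
            std::lock_guard<std::mutex> guard(_mutex);
            if (_free.empty()) {
                return nullptr;
            }
            holder = std::move(_free.front());
            _free.pop_front();
        }
        std::weak_ptr<HolderQueue> weak_queue = weak_from_this();
        // The deleter runs once, when the last shared_ptr copy goes away.
        return std::shared_ptr<BlockHolder>(
                holder.release(), [weak_queue](BlockHolder* raw) {
                    std::unique_ptr<BlockHolder> owned(raw);
                    if (auto queue = weak_queue.lock()) {
                        queue->give_back(std::move(owned));
                    }
                    // If the queue is gone, `owned` simply frees the holder.
                });
    }

    std::size_t available() const {
        std::lock_guard<std::mutex> guard(_mutex);
        return _free.size();
    }

private:
    HolderQueue() = default;

    void give_back(std::unique_ptr<BlockHolder> holder) {
        holder->payload.clear();
        std::lock_guard<std::mutex> guard(_mutex);
        _free.push_back(std::move(holder));
    }

    mutable std::mutex _mutex;
    std::deque<std::unique_ptr<BlockHolder>> _free;
};
```

In a broadcast, each channel would keep its own copy of the `shared_ptr`; the holder goes back to the queue only after the last channel has released it.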
2023-12-20 21:23:25 +08:00
5442e8d1fc [pipelineX](dependency) split different dependencies (#27366) 2023-11-22 12:50:39 +08:00
b1eef30b49 [pipelineX](dependency) Wake up task by dependencies (#26879)
---------

Co-authored-by: Mryange <2319153948@qq.com>
2023-11-18 03:20:24 +08:00
8cf360fff7 [refactor](closure) remove ref count closure using auto release closure (#26718)
1. The closure should be managed by a unique ptr and released by brpc; it should not be held by our code. If our code holds it, we have to wait for brpc to finish during cancel or close.
2. The closure should be exception safe: if any exception happens, no memory should leak.
3. Use a specific callback interface that is implemented by Doris's code; we can write any code in it, and Doris manages the callback's lifecycle.
4. Use a weak ptr between the callback and the closure. If the callback is destructed before the closure's Run is called, it should not core.
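
The four points above can be sketched roughly as follows. This is an illustration under assumptions, not the Doris implementation: `Closure` is a minimal stand-in for the RPC framework's closure interface, and `RpcCallback`, `AutoReleaseClosure`, and `CountingCallback` are hypothetical names.

```cpp
#include <memory>
#include <utility>

// Minimal stand-in for the RPC framework's closure interface (assumption).
class Closure {
public:
    virtual ~Closure() = default;
    virtual void Run() = 0;
};

// Callback interface implemented by application code, which owns it
// through a shared_ptr and controls its lifetime.
class RpcCallback {
public:
    virtual ~RpcCallback() = default;
    virtual void call() = 0;
};

class AutoReleaseClosure final : public Closure {
public:
    // The caller owns the closure through a unique_ptr until it hands the
    // raw pointer to the framework with release().
    static std::unique_ptr<AutoReleaseClosure> create(
            std::weak_ptr<RpcCallback> callback) {
        return std::unique_ptr<AutoReleaseClosure>(
                new AutoReleaseClosure(std::move(callback)));
    }

    // Invoked exactly once by the framework when the RPC is done.
    void Run() override {
        // Take ownership of ourselves so the closure is freed on every
        // path, including when the callback throws.
        std::unique_ptr<AutoReleaseClosure> self(this);
        // If the callback was already destroyed, skip it instead of crashing.
        if (auto callback = _callback.lock()) {
            callback->call();
        }
    }

private:
    explicit AutoReleaseClosure(std::weak_ptr<RpcCallback> callback)
            : _callback(std::move(callback)) {}

    std::weak_ptr<RpcCallback> _callback;
};

// Example callback that only counts how often it was invoked.
class CountingCallback final : public RpcCallback {
public:
    void call() override { ++calls; }
    int calls = 0;
};
```

Because the closure only holds a weak ptr, cancel or close can drop the callback immediately without waiting for the RPC to finish.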
2023-11-12 11:57:46 +08:00
d988193d39 [pipelineX](shuffle) block exchange sink by memory usage (#26595) 2023-11-09 21:28:22 +08:00
a6f9df7096 [LOG] Add fatal log in exchange sink buffer (#26594) 2023-11-08 21:52:21 +08:00
a6756b4660 [pipelineX](bug) Fix broadcast buffer reference count (#26545) 2023-11-08 00:14:48 +08:00
8da1a9a370 [pipeline](fix) remove unreasonable CHECK (#26504) 2023-11-07 15:48:07 +08:00
bd89028306 [bug](pipelineX) Fix potential bug using broadcast shuffle (#26458) 2023-11-06 17:33:20 +08:00
6b2eed779c [feature](AuditLog) add scanRows scanBytes in auditlog (#25435) 2023-10-25 10:00:35 +08:00
a6925cc0cf Fix exchange operator can not aware end of file (#25562) 2023-10-20 18:56:01 +08:00
ac8fbdd53c [pipelineX](fix) Fix use-after-free in shuffling (#25409) 2023-10-13 16:57:34 +08:00
7434f80300 [pipelineX](refactor) Refactor pending finish dependency (#25181) 2023-10-10 11:56:02 +08:00
642e5cdb69 [Fix](Status) Make Status [[nodiscard]] and handle returned Status correctly (#23395) 2023-09-29 22:38:52 +08:00
864a0f9bcb [opt](pipeline) Make pipeline fragment context send_report asynchronized (#23142) 2023-09-28 17:55:53 +08:00
034582bb64 [pipelineX](fix) Fix broadcast dependency hanging (#24740) 2023-09-22 12:24:32 +08:00
e54c4ef258 [pipelineX](dependency) refactor write dependency (#24555) 2023-09-19 18:01:42 +08:00
23f01ddf3a [feature](profile) support simply profile (#23377)
A Simplified Version of the Profile

Divided into three levels:
Level 2: The original profile.
Level 1: Instances with identical structures are merged, utilizing concatenation for info strings, and recording the extremum for time types.


Note that currently, this is purely experimental, simplifying the profile on the frontend (you can view profiles at any level).

Subsequently, we will transition the simplification process to the backend. At that point, due to the simplification being done on the backend, viewing profiles at other levels won't be possible.

Because of an issue with the pipeline structure, the active time does not accurately reflect the time spent in the operators.

```
set enable_simply_profile = false;
set enable_simply_profile = true;
```
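
A rough sketch of the Level 1 merge rule described above (concatenate info strings, keep the extrema of time counters). The commit does this on the frontend; the C++ below is only an illustration, and every type and function name in it is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical, flattened view of one instance's profile.
struct InstanceProfile {
    std::map<std::string, std::string> info_strings;
    std::map<std::string, int64_t> time_counters_ns;
};

struct TimeRange {
    int64_t min_ns;
    int64_t max_ns;
};

struct MergedProfile {
    std::map<std::string, std::string> info_strings;
    std::map<std::string, TimeRange> time_counters;
};

// Merge instances that share the same structure into one profile.
MergedProfile merge_profiles(const std::vector<InstanceProfile>& instances) {
    MergedProfile merged;
    for (const InstanceProfile& instance : instances) {
        // Info strings: concatenate the values of all instances.
        for (const auto& [key, value] : instance.info_strings) {
            auto [it, inserted] = merged.info_strings.try_emplace(key, value);
            if (!inserted) {
                it->second += ", " + value;
            }
        }
        // Time counters: keep only the smallest and largest value seen.
        for (const auto& [key, ns] : instance.time_counters_ns) {
            auto [it, inserted] =
                    merged.time_counters.try_emplace(key, TimeRange{ns, ns});
            if (!inserted) {
                it->second.min_ns = std::min(it->second.min_ns, ns);
                it->second.max_ns = std::max(it->second.max_ns, ns);
            }
        }
    }
    return merged;
}
```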
2023-09-15 10:25:14 +08:00
29b94c4ed7 [pipeline](refactor) refine pipeline fragment context (#23478) 2023-08-28 15:55:02 +08:00
ba351af452 [enhancement](thirdparty) upgrade thirdparty libs - again (#23414)
Resubmission of #23290 (brpc is not upgraded, because bthread local has an error).

protobuf 3.15.0 -> 21.11
glog 0.4.0 -> 0.6.0
lz4 1.9.3 -> 1.9.4
curl 7.79.0 -> 8.2.1
zstd 1.5.2 -> 1.5.5
arrow 7.0.0 -> 13.0.0
abseil 20220623.1 -> 20230125.3
orc 1.7.2 -> 1.9.0
jemalloc for arrow 5.2.1 -> 5.3.0
xsimd 7.0.0 -> 13.0.0
opentelemetry-proto 0.19.0 -> 1.0.0
opentelemetry 1.8.3 -> 1.10.0

new:
c-ares -> 1.19.1
grpc -> 1.54.3
2023-08-26 22:59:10 +08:00
dcd6c3c022 [pipelineX](refactor) propose a new pipeline execution model (#22562) 2023-08-21 15:38:45 +08:00
Pxl
34399e2965 [Bug](exchange) init _instance_to_rpc_ctx on register_sink (#22976)
init _instance_to_rpc_ctx on register_sink
2023-08-15 13:02:28 +08:00
5f25b924b3 [opt](conf) Modify brpc eovercrowded conf (#22407)
Make brpc ignore EOVERCROWDED for the data stream sender and the exchange sink buffer.
Modify the default value of brpc_socket_max_unwritten_bytes.
2023-08-01 08:47:55 +08:00
aa75f79fad [fix](executor)cancel exchange buffer rpc when query is cancelled (#22226)
When a brpc client makes a request to a server and the server does not respond, and may never respond (for example after a BE restart), the query can be cancelled at once, but the ExchangeSinkBuffer cannot be cancelled until the RPC times out.
So when the query is cancelled, the ExchangeSinkBuffer should be closed at once as well.
2023-07-27 14:38:25 +08:00
Pxl
ca71048f7f [Chore](status) avoid empty error msg on status (#21454)
avoid empty error msg on status
2023-07-11 13:48:16 +08:00
ee9822fa7e [Fix](pipeline) fix ExchangeSinkBuffer request id memory alloc problem (#21647)
Co-authored-by: airborne12 <airborne12@gmail.com>
fix ExchangeSinkBuffer request id memory alloc problem
2023-07-09 23:45:28 +08:00
Pxl
f8cfe5e579 [Bug](pipeline) add DCHECK for _instance_to_sending_by_pipeline = false on _send_rpc (#21169)
add DCHECK for _instance_to_sending_by_pipeline = false on _send_rpc
2023-06-29 10:03:57 +08:00
601120db04 [Bug](pipeline) access map may cause coredump in sink buffer (#21108) 2023-06-24 23:03:59 +08:00
661e1ae7c5 [fix](memory) no switch bthread context in UBSAN compile (#21064)
When compiled with UBSAN, all memory is tracked to the orphan (unknown) mem tracker, and the bthread context and mem tracker are no longer switched.

The supplementary fixes are as follows: #20999
2023-06-21 21:14:07 +08:00
622ef63c69 [fix](memory) fix bthread_setspecific error in rpc done.run() (#20999) 2023-06-20 21:00:45 +08:00
14f59bef1d [improvement](profile)add sum/avg rpc time (#20511) 2023-06-12 11:34:49 +08:00
05438eab0d remove DCHECK for rpc time (#20621) 2023-06-09 13:38:12 +08:00
65100d8083 [improvement](profile)add max/min rpc time (#20339) 2023-06-06 12:03:01 +08:00
8bec2b41db [pipeline](rpc) support closure reuse in pipeline exec engine (#20278) 2023-06-02 09:50:21 +08:00
488c9ba7c2 [improvement](exchange) test: data stream sender stop sending data to receiver if it returns eos early (#20081) 2023-05-26 16:05:38 +08:00
3598518e59 [fix](revert) data stream sender stop sending data to receiver if it returns eos early (#19847)" (#20040)
* Revert "[fix](sink) fix END_OF_FILE error for pipeline caused by VDataStreamSender eof (#20007)"

This reverts commit 2ec1d282c5e27b25d37baf91cacde082cca4ec31.

* [fix](revert) data stream sender stop sending data to receiver if it returns eos early (#19847)"

This reverts commit c73003359567067ea7d44e4a06c1670c9ec37902.
2023-05-25 16:50:17 +08:00