Reland "Fix getStats() freeze bug affecting Chromium but not WebRTC standalone."

This is a reland of 05d43c6f7fe497fed0f2c8714e2042dd07a86df2

The original CL got reverted because Chrome did not support IsQuitting() which
triggered a NOTREACHED() inside of a DCHECK. With
https://chromium-review.googlesource.com/c/chromium/src/+/1491620
it is safe to reland this CL.

The only changes between this and the original patch set is that this is now
rebased on top of https://webrtc-review.googlesource.com/c/src/+/124701, i.e.
rtc::PostMessageWithFunctor() has been replaced by rtc::Thread::PostTask().

Original change's description:
> Fix getStats() freeze bug affecting Chromium but not WebRTC standalone.
>
> PeerConnection::Close() is, per-spec, a blocking operation.
> Unfortunately, PeerConnection is implemented to own resources used by
> the network thread, and Close() - on the signaling thread - destroys
> these resources. As such, tasks run in parallel like getStats() get into
> race conditions with Close() unless synchronized. The mechanism in-place
> is RTCStatsCollector::WaitForPendingRequest(), it waits until the
> network thread is done with the in-parallel stats request.
>
> Prior to this CL, this was implemented by performing
> rtc::Thread::ProcessMessages() in a loop until the network thread had
> posted a task on the signaling thread to say that it was done which
> would then get processed by ProcessMessages(). In WebRTC this works, and
> the test is RTCStatsIntegrationTest.GetsStatsWhileClosingPeerConnection.
>
> But because Chromium's thread wrapper does no support
> ProcessMessages(), calling getStats() followed by close() in Chrome
> resulted in waiting forever (https://crbug.com/850907).
>
> In this CL, the process messages loop is removed. Instead, the shared
> resources are guarded by an rtc::Event. WaitForPendingRequest() still
> blocks the signaling thread, but only while shared resources are in use
> by the network thread. After this CL, calling WaitForPendingRequest() no
> longer has any unexpected side-effects since it no longer processes
> other messages that might have been posted on the thread.
>
> The resource ownership and threading model of WebRTC deserves to be
> revisited, but this fixes a common Chromium crash without redesigning
> PeerConnection, in a way that does not cause more blocking than what
> the other PeerConnection methods are already doing.
>
> Note: An alternative to using rtc::Event is to use resource locks and
> to not perform the stats collection on the network thread if the
> request was cancelled before the start of processing, but this has very
> little benefit in terms of performance: once the network thread starts
> collecting the stats, it would use the lock until collection is
> completed, blocking the signaling thread trying to acquire that lock
> anyway. This defeats the purpose and is a riskier change, since
> cancelling partial collection in this inherently racy edge-case would
> have observable differences from the returned stats, which may cause
> more regressions.
>
> Bug: chromium:850907
> Change-Id: Idceeee0bddc0c9d5518b58a2b263abb2bbf47cff
> Reviewed-on: https://webrtc-review.googlesource.com/c/121567
> Commit-Queue: Henrik Boström <hbos@webrtc.org>
> Reviewed-by: Steve Anton <steveanton@webrtc.org>
> Cr-Commit-Position: refs/heads/master@{#26707}

TBR=steveanton@webrtc.org

Bug: chromium:850907
Change-Id: I5be7f69f0de65ff1120e4926fbf904def97ea9c0
Reviewed-on: https://webrtc-review.googlesource.com/c/124781
Reviewed-by: Henrik Boström <hbos@webrtc.org>
Reviewed-by: Steve Anton <steveanton@webrtc.org>
Commit-Queue: Henrik Boström <hbos@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#26896}
This commit is contained in:
Henrik Boström
2019-02-28 09:49:31 +01:00
committed by Commit Bot
parent 25fb765367
commit 40b030edbf
3 changed files with 142 additions and 89 deletions

View File

@ -2249,7 +2249,9 @@ class FakeRTCStatsCollector : public RTCStatsCollector,
worker_thread_(pc->worker_thread()),
network_thread_(pc->network_thread()) {}
void ProducePartialResultsOnSignalingThread(int64_t timestamp_us) override {
void ProducePartialResultsOnSignalingThreadImpl(
int64_t timestamp_us,
RTCStatsReport* partial_report) override {
EXPECT_TRUE(signaling_thread_->IsCurrent());
{
rtc::CritScope cs(&lock_);
@ -2257,13 +2259,15 @@ class FakeRTCStatsCollector : public RTCStatsCollector,
++produced_on_signaling_thread_;
}
rtc::scoped_refptr<RTCStatsReport> signaling_report =
RTCStatsReport::Create(0);
signaling_report->AddStats(std::unique_ptr<const RTCStats>(
partial_report->AddStats(std::unique_ptr<const RTCStats>(
new RTCTestStats("SignalingThreadStats", timestamp_us)));
AddPartialResults(signaling_report);
}
void ProducePartialResultsOnNetworkThread(int64_t timestamp_us) override {
void ProducePartialResultsOnNetworkThreadImpl(
int64_t timestamp_us,
const std::map<std::string, cricket::TransportStats>&
transport_stats_by_name,
const std::map<std::string, CertificateStatsPair>& transport_cert_stats,
RTCStatsReport* partial_report) override {
EXPECT_TRUE(network_thread_->IsCurrent());
{
rtc::CritScope cs(&lock_);
@ -2271,11 +2275,8 @@ class FakeRTCStatsCollector : public RTCStatsCollector,
++produced_on_network_thread_;
}
rtc::scoped_refptr<RTCStatsReport> network_report =
RTCStatsReport::Create(0);
network_report->AddStats(std::unique_ptr<const RTCStats>(
partial_report->AddStats(std::unique_ptr<const RTCStats>(
new RTCTestStats("NetworkThreadStats", timestamp_us)));
AddPartialResults(network_report);
}
private: