This CL finalizes the feature extraction part for the RNN VAD adding
a class that combines a high-pass filter, LP residual computation,
pitch estimation and spectral features extraction.
This CL also includes a minor refactoring of the pitch estimation
library.
Bug: webrtc:9076
Change-Id: I918b9e143bc6dd2bf508a891446067258a68a777
Reviewed-on: https://webrtc-review.googlesource.com/75504
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23235}
This CL defines SpectralFeaturesExtractor which is responsible for
computing the spectral features used as input for the RNN.
Bug: webrtc:9076
Change-Id: I5e1396b89eca9c13bb268e8419a16436a9c3450f
Reviewed-on: https://webrtc-review.googlesource.com/73760
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Ivo Creusen <ivoc@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23206}
During pitch search in the RNN VAD, we calculate auto
correlation. Before this CL, we computed kNumInvertedLags12kHz=147 dot
products of vectors with kBufSize12kHz-kMaxPitch12kHz=240
elements. This was the most time consuming step of the new VAD.
This CL makes the computation happen in frequency domain. Profiling
shows a 3x speed increase. In future, we can try using a more efficient
FFT and to reduce the FFT length to some of e.g. 400, 405, 432.
# For minimal Clang plugin check change.
TBR: kwiberg@webrtc.org
Bug: webrtc:9076
Change-Id: I688251a415869d53175a37f390f441d4e035d954
Reviewed-on: https://webrtc-review.googlesource.com/73366
Reviewed-by: Karl Wiberg <kwiberg@webrtc.org>
Reviewed-by: Alessio Bazzica <alessiob@webrtc.org>
Commit-Queue: Alex Loiko <aleloi@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23171}
This CL adds helper functions to be used for the spectral features
computation. Namely, it includes the following:
- band boundaries (frequency to FFT coeffcient index)
- band energy coefficients
- log band energy coefficients
- fixed size DCT table and computation
Bug: webrtc:9076
Change-Id: I03a8799b226d986bc1e37cefd0c3039f94b5592a
Reviewed-on: https://webrtc-review.googlesource.com/73687
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Minyue Li <minyue@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23170}
BandAnalysisFft class that wraps the FFT library, makes it easy to change
FFT library, applies windowing function and owns the FFT input buffer.
Bug: webrtc:9076
Change-Id: I9e7ed587ae263b906e04a66bf8c06eaae64daf19
Reviewed-on: https://webrtc-review.googlesource.com/72900
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23150}
RNN implementation for the AGC2 VAD that includes a fully connected
layer and a gated recurrent unit layer.
Bug: webrtc:9076
Change-Id: Ibb8b0b4e9213f09eb9dbe118bbdc94d7e8e4f91b
Reviewed-on: https://webrtc-review.googlesource.com/72060
Reviewed-by: Patrik Höglund <phoglund@webrtc.org>
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Ivo Creusen <ivoc@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23101}
Functions to estimate pitch period and gain.
Bug: webrtc:9076
Change-Id: Icfe9430dcae11bdb96165c5bfe6e2b1d3bf848ab
Reviewed-on: https://webrtc-review.googlesource.com/70382
Commit-Queue: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Per Åhgren <peah@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#23066}
Functions to estimate the inverse filter via LPC and compute the LP
residual applying the inverse filter.
This CL also includes test utilities, in particular BinaryFileReader,
used to read chunks of data and optionally cast them on the fly, and
Create*Reader() functions to read resource files available at test
time.
Bug: webrtc:9076
Change-Id: Ia4793b8ad6a63cb3089ed11ddad89d1aa0b840f6
Reviewed-on: https://webrtc-review.googlesource.com/70244
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Jesus de Vicente Pena <devicentepena@webrtc.org>
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#22946}
Adding a data structure to cache the results of pair-wise comparisons
between items stored in a ring buffer. This is used to avoid recomputing
the pair-wise comparison every time that a new item is added in a ring
buffer.
Bug: webrtc:9076
Change-Id: I88fb67a80bd3fd8497764dc7ae7e0a577c06b20f
Reviewed-on: https://webrtc-review.googlesource.com/70162
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Henrik Lundin <henrik.lundin@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#22942}
Ring buffer template for a finite number of arrays of given type and size.
Bug: webrtc:9076
Change-Id: Ia6c2065b0013f4a00f693966641f9aebe09f6f5c
Reviewed-on: https://webrtc-review.googlesource.com/70161
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Reviewed-by: Sam Zackrisson <saza@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#22939}
The SequenceBuffer class template implements a linear buffer with a Push
operation that is used to add a fixed size chunk of new samples into the
buffer. Its properties are its size and the size of the chunks that are
pushed. It is used to implement the pitch buffer in the RNN VAD feature
extractor, for which a ring buffer would be a painful choice.
Bug: webrtc:9076
Change-Id: I4767bf06d5a414dbed724a96ea4186ef013a1e30
Reviewed-on: https://webrtc-review.googlesource.com/70204
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Reviewed-by: Gustaf Ullberg <gustaf@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#22919}
This reverts commit 8628f5bb7c7f5bd0373567095af08cebe8bb7f8d.
Reason for revert: iOS buildbot failing
Original change's description:
> AGC2 RNN VAD: initial build targets
>
> rnn_vad_tool is an executable that reads a wav file of any sample rate
> compatible with 10 ms frames that are resampled and, when the VAD is
> fully landed, will process the resampled frames to compute the VAD
> probability.
>
> To avoid mac, win and ios trybot failures, to_be_removed.h/.cc have
> been added and will be removed as soon as the :lib target includes
> code that leads to a non-empty static lib file on those platforms.
>
> Bug: webrtc:9076
> Change-Id: I810c08acfa1adf2029e3baac2adda3045ae5214a
> Reviewed-on: https://webrtc-review.googlesource.com/70202
> Reviewed-by: Alex Loiko <aleloi@webrtc.org>
> Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
> Cr-Commit-Position: refs/heads/master@{#22898}
TBR=alessiob@webrtc.org,aleloi@webrtc.org
Change-Id: Ic6014dde78b0ef371804c52608145ba8acdd9c97
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: webrtc:9076
Reviewed-on: https://webrtc-review.googlesource.com/70144
Reviewed-by: Alessio Bazzica <alessiob@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#22899}
rnn_vad_tool is an executable that reads a wav file of any sample rate
compatible with 10 ms frames that are resampled and, when the VAD is
fully landed, will process the resampled frames to compute the VAD
probability.
To avoid mac, win and ios trybot failures, to_be_removed.h/.cc have
been added and will be removed as soon as the :lib target includes
code that leads to a non-empty static lib file on those platforms.
Bug: webrtc:9076
Change-Id: I810c08acfa1adf2029e3baac2adda3045ae5214a
Reviewed-on: https://webrtc-review.googlesource.com/70202
Reviewed-by: Alex Loiko <aleloi@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#22898}