This CL brings a large improvement to the VAD by precomputing the
energy for the sliding frame `y` in the pitch buffer instead of
computing them twice in two different places. The realtime factor
has improved by about +16x.
There is room for additional improvement (TODOs added), but that will
be done in a follow up CL since the change won't be bit-exact and
careful testing is needed.
Benchmarked as follows:
```
out/release/modules_unittests \
--gtest_filter=*RnnVadTest.DISABLED_RnnVadPerformance* \
--gtest_also_run_disabled_tests --logs
```
Results:
| baseline | this CL
------+----------------------+------------------------
run 1 | 23.568 +/- 0.990788 | 22.8319 +/- 1.46554
| 377.207x | 389.367x
------+----------------------+------------------------
run 2 | 23.3714 +/- 0.857523 | 22.4286 +/- 0.726449
| 380.379x | 396.369x
------+----------------------+------------------------
run 2 | 23.709 +/- 1.04477 | 22.5688 +/- 0.831341
| 374.963x | 393.906x
Bug: webrtc:10480
Change-Id: I599a4dda2bde16dc6c2f42cf89e96afbd4630311
Reviewed-on: https://webrtc-review.googlesource.com/c/src/+/191484
Reviewed-by: Per Åhgren <peah@webrtc.org>
Commit-Queue: Alessio Bazzica <alessiob@webrtc.org>
Cr-Commit-Position: refs/heads/master@{#32571}