ViVoLab, Aragón Institute for Engineering Research (I3A), University of Zaragoza, 50018 Zaragoza, Spain
Contact: {dribas, amiguel, ortega, lleida}@unizar.es
This paper proposes a Deep Learning (DL) based Wiener filter estimator for speech enhancement in the framework of the classical spectral-domain speech estimator algorithm. According to the characteristics of the intermediate steps of the speech enhancement algorithm, i.e., the SNR estimation and the gain function, there is determined the best usage of the network at learning a robust instance of the Wiener filter estimator.
Experiments show that the use of data-driven learning of the SNR estimator provides robustness to the statistical-based speech estimator algorithm and achieves performance on the state-of-the-art. Several objective quality metrics show the performance of the speech enhancement and beyond them, there are examples of noisy vs. enhanced speech available for listening to demonstrate in practice the skills of the method in simulated and real audio.