Noise Adaptive Speech Enhancement using Domain Adversarial Training
In this study, we propose a novel noise adaptive speech enhancement (SE) system that employs domain adversarial training (DAT) to tackle the mismatch in noise types between training and testing conditions. Such a mismatch is a critical problem in deep-learning-based SE systems: a large mismatch can seriously degrade SE performance. Because a trained SE system is generally expected to handle various unseen noise types, noise type mismatch is common in real-world scenarios. The proposed noise adaptive SE system contains an encoder-decoder-based enhancement model and a domain discriminator model. During adaptation, the DAT approach uses information from the discriminator to encourage the encoder to produce noise-invariant features, and consequently increases the robustness of the enhancement model to unseen noise types. Here we regard stationary noises as the source domain (with ground-truth clean speech) and non-stationary noises as the target domain (without ground truth). We evaluated the proposed system on the TMHINT sentences. Experimental results show that the proposed noise adaptive SE system successfully provides notable PESQ (55.9 SSNR (26.1 adaptation.
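The adversarial update described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes a scalar toy in which the encoder, decoder, and discriminator are each a single weight (`w_e`, `w_d`, `w_c`), a squared-error enhancement loss on the source domain only, a binary cross-entropy domain loss, and a reversal weight `lam`. All of these names and choices are hypothetical. The point it shows is that the encoder descends the enhancement loss while ascending the domain loss, whereas the discriminator descends the domain loss.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def dat_gradients(w_e, w_d, w_c, x_src, y_src, x_tgt, lam):
    """Return (grad_w_e, grad_w_d, grad_w_c) for one source/target pair.

    Toy assumptions (not from the paper): scalar weights, squared-error
    enhancement loss, sigmoid discriminator with cross-entropy loss.
    """
    # Forward pass: shared encoder, decoder on source, discriminator on both.
    z_src = w_e * x_src          # source feature (stationary noise, has clean target)
    z_tgt = w_e * x_tgt          # target feature (non-stationary noise, no clean target)
    y_hat = w_d * z_src          # enhanced output, source domain only
    p_src = sigmoid(w_c * z_src)  # predicted probability of "target domain"
    p_tgt = sigmoid(w_c * z_tgt)

    # Enhancement loss (y_hat - y_src)^2 uses source data only.
    err = y_hat - y_src
    g_se_wd = 2.0 * err * z_src
    g_se_we = 2.0 * err * w_d * x_src

    # Domain loss: cross-entropy with label 0 for source, 1 for target.
    # For a sigmoid output, d(loss)/d(logit) = p - label.
    dl_src = p_src - 0.0
    dl_tgt = p_tgt - 1.0
    g_dom_wc = dl_src * z_src + dl_tgt * z_tgt
    g_dom_we = dl_src * w_c * x_src + dl_tgt * w_c * x_tgt

    # Gradient reversal: the encoder receives the domain gradient with its
    # sign flipped and scaled by lam, so it learns to confuse the discriminator.
    grad_w_e = g_se_we - lam * g_dom_we
    return grad_w_e, g_se_wd, g_dom_wc


if __name__ == "__main__":
    w_e, w_d, w_c = 1.0, 1.0, 1.0
    lr = 0.1
    g_e, g_d, g_c = dat_gradients(w_e, w_d, w_c, x_src=1.0, y_src=0.5,
                                  x_tgt=2.0, lam=1.0)
    # Plain gradient descent on every weight; the reversal is already
    # folded into g_e.
    w_e, w_d, w_c = w_e - lr * g_e, w_d - lr * g_d, w_c - lr * g_c
    print(w_e, w_d, w_c)
```

Setting `lam` to 0 recovers ordinary supervised training on the source domain; larger values push the encoder harder toward features the discriminator cannot separate.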