Improving Transformer-based Speech Recognition Using Unsupervised Pre-training

10/22/2019
by Dongwei Jiang, et al.

Speech recognition technologies are gaining enormous popularity in various industrial applications. However, building a good speech recognition system usually requires significant amounts of transcribed data, which is expensive to collect. To tackle this problem, we propose a novel unsupervised pre-training method called masked predictive coding, which can be applied for unsupervised pre-training with a Transformer-based model. Experiments on HKUST show that, using the same training data and other open-source Mandarin data, we can reduce the CER of a strong Transformer-based baseline by 3.7%, and reduce the CER on AISHELL-1 by 12.9%.
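The abstract only names the method, so as a rough illustration, here is a minimal sketch of how masked predictive coding could be set up: a fraction of input FBANK frames is masked, BERT-style, and a Transformer encoder is trained to reconstruct the masked frames with an L1 loss. The encoder configuration, mask ratio, and feature dimensions below are assumptions for illustration, not the paper's exact setup.

```python
# Hypothetical sketch of masked predictive coding (MPC) pre-training.
# Assumptions (not from the paper): inputs are FBANK frames of shape
# (batch, time, feat_dim); 15% of frames are masked independently;
# reconstruction uses an L1 loss on the masked positions only.
import torch
import torch.nn as nn

class MPCPretrainer(nn.Module):
    def __init__(self, feat_dim=80, d_model=256, nhead=4, num_layers=6):
        super().__init__()
        self.in_proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.out_proj = nn.Linear(d_model, feat_dim)  # predict original frames

    def forward(self, feats, mask_ratio=0.15):
        # Randomly select frames to mask and zero them out in the input.
        mask = torch.rand(feats.shape[:2], device=feats.device) < mask_ratio
        masked = feats.masked_fill(mask.unsqueeze(-1), 0.0)
        pred = self.out_proj(self.encoder(self.in_proj(masked)))
        # L1 reconstruction loss, computed only over masked positions.
        return (pred - feats).abs()[mask].mean()

# Illustrative usage with random tensors standing in for FBANK features.
model = MPCPretrainer()
feats = torch.randn(8, 200, 80)  # (batch, frames, FBANK bins)
loss = model(feats)
loss.backward()
```

After pre-training in this fashion, the encoder weights would typically be used to initialize the encoder of a supervised Transformer ASR model before fine-tuning on transcribed data.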

