Exploring End-to-End Techniques for Low-Resource Speech Recognition

07/02/2018
by Vladimir Bataev, et al.

In this work we present a simple grapheme-based system for low-resource speech recognition using Babel data for Turkish spontaneous speech (80 hours). We investigate the performance of different neural network architectures, including fully convolutional, recurrent, and ResNet with GRU. Different features and normalization techniques are compared as well. We also propose a modification of the CTC loss that uses segmentation during training, which leads to an improvement when decoding with a small beam size. Our best model achieved a word error rate of 45.8%, which is, to our knowledge, the best reported result for end-to-end systems using in-domain data for this task.
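
For readers unfamiliar with the metric quoted above, word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring pipeline; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption, not the paper's setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a score of 45.8% means that roughly 46 word-level edits are needed per 100 reference words. Because insertions are counted, the value can exceed 100% for very poor hypotheses.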
