Large language models have proven themselves highly flexible, able to so...
Self-supervised learning via masked prediction pre-training (MPPT) has s...
While end-to-end models have shown great success on the Automatic Speech...
Accent mismatch is a critical problem for end-to-end ASR. This paper...
The effects of speaking-style variability on automatic speaker verificat...
In this work, we propose a novel and efficient minimum word error rate (...
Singing voice conversion is the task of converting a song sung by a source si...
Attention-based sequence-to-sequence models for speech recognition joint...
Text-independent speaker recognition using short utterances is a highly
...