Accurate recognition of specific categories, such as persons' names, dat...
Recently, a number of approaches to train speech models by incorpo-ratin...
We introduce the Universal Speech Model (USM), a single large model that...
We propose a novel method to accelerate training and inference process o...
This paper proposes Virtuoso, a massively multilingual speech-text joint...
Training state-of-the-art Automated Speech Recognition (ASR) models typi...
We present JOIST, an algorithm to train a streaming, cascaded, encoder
e...
Building inclusive speech recognition systems is a crucial step towards
...
We present Maestro, a self-supervised training method to unify
represent...
Self-supervised pretraining for Automated Speech Recognition (ASR) has s...
We introduce asynchronous dynamic decoder, which adopts an efficient A*
...
End-to-end (E2E) systems have played a more and more important role in
a...
End-to-end modeling (E2E) of automatic speech recognition (ASR) blends a...
Recent advances in deep learning based large vocabulary con- tinuous spe...
Speech recognition is a sequence prediction problem. Besides employing
v...
We describe initial work on an extension of the Kaldi toolkit that suppo...
End-to-end (E2E) automatic speech recognition (ASR) systems directly map...
Unsupervised single-channel overlapped speech recognition is one of the
...