While standard speaker diarization attempts to answer the question "who
...
We introduce a multilingual speaker change detection model (USM-SCD) tha...
Traditionally, audio-visual automatic speech recognition has been studie...
This work presents a large-scale audio-visual speech recognition system ...
End-to-end automatic speech recognition (ASR) models, including both
att...
Multilingual training has been shown to improve acoustic modeling perfor...
Multimodal language models attempt to incorporate non-linguistic feature...
This work presents a scalable solution to open-vocabulary visual speech
...
Recurrent neural network (RNN) language models (LMs) and Long Short Term...
We present results that show it is possible to build a competitive, grea...