Previous Multimodal Information based Speech Processing (MISP) challenge...
Speech technology has improved greatly for norm speakers, i.e., adult na...
The Multi-modal Information based Speech Processing (MISP) challenge aim...
In this work, we analyzed and compared speech representations extracted ...
We present an articulatory synthesis framework for the synthesis and man...
Background: Computational models of speech recognition often assume that...
The high cost of data acquisition makes Automatic Speech Recognition (AS...
In this paper, we investigate several existing and a new state-of-the-ar...
We present a voice conversion framework that converts normal speech into...
The development of pathological speech systems is currently hindered by ...
In this paper, we propose a new approach to pathological speech synthesi...
This paper tackles automatically discovering phone-like acoustic units (...
Automatic speech recognition (ASR) systems promise to deliver objective ...
This study addresses unsupervised subword modeling, i.e., learning acous...
This technical report describes our submission to the 2021 SLT Children ...
This paper proposes a new model, referred to as the show and speak (SAS)...
The idea of combining multiple languages' recordings to train a single a...
Image2Speech is the relatively new task of generating a spoken descripti...
Oral cancer is a disease that impacts more than half a million p...
This study addresses unsupervised subword modeling, i.e., learning featu...
We investigated word recognition in a Visually Grounded Speech model. Th...
Only a handful of the world's languages are abundant with the resources ...
An estimated half of the world's languages do not have a written form, m...
Background music in social interaction settings can hinder conversation....
Developing speech technologies for low-resource languages has become a v...
We summarize the accomplishments of a multi-disciplinary workshop explor...