A large part of the expressive speech synthesis literature focuses on le...
This paper proposes an Expressive Speech Synthesis model that utilizes t...
The gender of a voice assistant or any voice user interface is a central...
A text-to-speech (TTS) model typically factorizes speech attributes such...
Voice cloning is a difficult task which requires robust and informative ...
In this work, we present the SOMOS dataset, the first large-scale mean o...