Neural text-to-speech systems are often optimized on L1/L2 losses, which...
Numerous examples in the literature have proved that deep learning models hav...
Stuttering is a speech disorder where the natural flow of speech is inte...
In this paper, we propose GlowVC: a multilingual multi-speaker flow-base...
The goal of automatic dubbing is to perform speech-to-speech translation...
Non-parallel voice conversion (VC) is typically achieved using lossy rep...
State-of-the-art text-to-speech (TTS) systems require several hours of r...
Automatic dubbing aims at seamlessly replacing the speech in a video doc...
This paper proposes a general enhancement to the Normalizing Flows (NF) ...
End-to-end (E2E) automatic speech recognition (ASR) models have recently...
Text-to-speech systems recently achieved almost indistinguishable qualit...
A new emotional multimedia database has been recorded and aligned. The d...
This paper proposes an emotion transplantation method capable of modifyi...
We have applied two state-of-the-art speech synthesis techniques (unit s...
This paper describes two novel complementary techniques that improve the...
Recently, state-of-the-art text-to-speech synthesis systems have shif...
We present enhancements to a speech-to-speech translation pipeline in or...
We propose a Text-to-Speech method to create an unseen expressive style ...
This paper proposes a novel approach for the detection and reconstructio...
Neural text-to-speech synthesis (NTTS) models have shown significant pro...
Statistical TTS systems that directly predict the speech waveform have r...
This paper introduces a robust universal neural vocoder trained with 74 ...