Self-supervised speech representation models have succeeded in various t...
In this paper, we introduce self-distillation and online clustering for ...
We introduce the first unsupervised speech synthesis system based on a s...
Unsupervised speech recognition has shown great potential to make Automa...
Are end-to-end text-to-speech (TTS) models over-parametrized? To what ex...
Recent work on speech self-supervised learning (speech SSL) demonstrated...
Recent advances in representation learning have demonstrated an ability ...
Self-supervised speech representations have been shown to be effective i...
Speech translation (ST) aims to learn transformations from speech in the...
Recently, end-to-end multi-speaker text-to-speech (TTS) systems gain suc...
Whispering is an important mode of human speech, but no end-to-end recog...
In this paper, we investigate the benefit that off-the-shelf word embedd...
In this paper we propose a Sequential Representation Quantization AutoEn...
In this paper, we propose a novel Adversarial Training (AT) approach for...