While the performance of cross-lingual TTS based on monolingual corpora ...
Recent development of neural vocoders based on the generative adversaria...
In the current two-stage neural text-to-speech (TTS) paradigm, it is ideal t...
The zero-shot scenario for speech generation aims at synthesizing a nove...
Speaker adaptation in text-to-speech synthesis (TTS) is to finetune a
pr...
Text to speech (TTS) has made rapid progress in both academia and indust...
In this paper, we propose VISinger, a complete end-to-end high-quality
s...
The current two-stage TTS framework typically integrates an acoustic model w...
In spoken conversations, spontaneous behaviors like filled pauses and
pro...
Data-efficient voice cloning aims at synthesizing a target speaker's voice...