The inverse short-time Fourier transform network (iSTFTNet) has garnered...
In speech synthesis, a generative adversarial network (GAN), training a ...
In recent text-to-speech synthesis and voice conversion systems, a mel-s...
This paper proposes a non-autoregressive extension of our previously pro...
Non-parallel voice conversion (VC) is a technique for training voice con...
Non-parallel voice conversion (VC) is a technique for learning mappings ...
In this paper, we propose a non-parallel any-to-many voice conversion (V...
We have previously proposed a method that allows for non-parallel voice ...
This paper proposes a voice conversion (VC) method based on a sequence-t...
Automatic speaker verification (ASV) is one of the most natural and conv...
Non-parallel multi-domain voice conversion (VC) is a technique for learn...
Non-parallel voice conversion (VC) is a technique for learning the mappi...
Humans are able to imagine a person's voice from the person's appearance...
WaveCycleGAN has recently been proposed to bridge the gap between natura...
This paper describes a method based on a sequence-to-sequence learning (...
This paper proposes a voice conversion method based on fully convolution...
We propose a learning-based filter that allows us to directly modify a s...
This paper proposes a non-parallel many-to-many voice conversion (VC) me...
This paper proposes a method that allows for non-parallel many-to-many v...
In this paper, we address the problem of reconstructing a time-domain si...