Ryuichi Yamamoto

Featured Co-authors

Yu Zhang, 406 publications
Shinji Watanabe, 239 publications
Xu Tan, 86 publications
Xiaofei Wang, 67 publications
Tomoki Toda, 66 publications
Shinnosuke Takamichi, 50 publications
Wen-Chin Huang, 39 publications
Takaaki Hori, 38 publications
Tomoki Hayashi, 38 publications
Hirofumi Inaguma, 36 publications
Jiatong Shi, 30 publications

research

∙ 09/18/2023

Electrolaryngeal Speech Intelligibility Enhancement Through Robust Linguistic Encoders

We propose a novel framework for electrolaryngeal speech intelligibility...

Lester Phillip Violeta, et al.

research

∙ 09/15/2023

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

We propose PromptTTS++, a prompt-based text-to-speech (TTS) synthesis sy...

Reo Shimizu, et al.

research

∙ 10/28/2022

NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit

This paper describes the design of NNSVS, an open-source software for ne...

Ryuichi Yamamoto, et al.

research

∙ 10/28/2022

Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform

We propose a lightweight end-to-end text-to-speech model using multi-ban...

Masaya Kawamura, et al.

research

∙ 10/28/2022

Period VITS: Variational Inference with Explicit Pitch Modeling for End-to-end Emotional Speech Synthesis

Several fully end-to-end text-to-speech (TTS) models have been proposed ...

Yuma Shirahata, et al.

research

∙ 10/28/2022

Nonparallel High-Quality Audio Super Resolution with Domain Adaptation and Resampling CycleGANs

Neural audio super-resolution models are typically trained on low- and h...

Reo Yoneyama, et al.

research

∙ 06/30/2022

Language Model-Based Emotion Prediction Methods for Emotional Speech Synthesis Systems

This paper proposes an effective emotional text-to-speech (TTS) system w...

Hyun-Wook Yoon, et al.

research

∙ 06/30/2022

TTS-by-TTS 2: Data-selective augmentation for neural speech synthesis using ranking support vector machine with variational autoencoder

Recent advances in synthetic speech quality have enabled us to train tex...

Eunwoo Song, et al.

research

∙ 04/21/2022

Cross-Speaker Emotion Transfer for Low-Resource Text-to-Speech Using Non-Parallel Voice Conversion with Pitch-Shift Data Augmentation

Data augmentation via voice conversion (VC) has been successfully applie...

Ryo Terashima, et al.

research

∙ 03/29/2022

DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning

Most text-to-speech (TTS) methods use high-quality speech corpora record...

Takaaki Saeki, et al.

research

∙ 10/15/2021

ESPnet2-TTS: Extending the Edge of TTS Research

This paper describes ESPnet2-TTS, an end-to-end text-to-speech (E2E-TTS)...

Tomoki Hayashi, et al.

research

∙ 04/26/2021

Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis

We propose a novel phrase break prediction method that combines implicit...

Kosuke Futamata, et al.

research

∙ 01/19/2021

Improved parallel WaveGAN vocoder with perceptually weighted spectrogram loss

This paper proposes a spectral-domain perceptual weighting technique for...

Eunwoo Song, et al.

research

∙ 10/27/2020

Parallel waveform synthesis based on generative adversarial networks with voicing-aware conditional discriminators

This paper proposes voicing-aware conditional discriminators for Paralle...

Ryuichi Yamamoto, et al.

research

∙ 10/26/2020

TTS-by-TTS: TTS-driven Data Augmentation for Fast and High-Quality Speech Synthesis

In this paper, we propose a text-to-speech (TTS)-driven data augmentatio...

Min-Jae Hwang, et al.

research

∙ 10/25/2019

Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

We propose Parallel WaveGAN, a distillation-free, fast, and small-footpr...

Ryuichi Yamamoto, et al.

research

∙ 10/24/2019

ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit

This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit ...

Tomoki Hayashi, et al.

research

∙ 09/13/2019

A Comparative Study on Transformer vs RNN in Speech Applications

Sequence-to-sequence models have been widely used in end-to-end speech p...

Shigeki Karita, et al.

research

∙ 04/09/2019

Probability density distillation with generative adversarial networks for high-quality parallel waveform generation

This paper proposes an effective probability density distillation (PDD) ...

Ryuichi Yamamoto, et al.
