Chenpeng Du

research

∙ 09/14/2023

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

Self-supervised learning (SSL) proficiency in speech-related tasks has d...

Yifan Yang, et al.

research

∙ 09/10/2023

VoiceFlow: Efficient Text-to-Speech with Rectified Flow Matching

Although diffusion models in text-to-speech have become a popular choice...

Yiwei Guo, et al.

research

∙ 06/25/2023

DSE-TTS: Dual Speaker Embedding for Cross-Lingual Text-to-Speech

Although high-fidelity speech can be obtained for intralingual speech sy...

Sen Liu, et al.

research

∙ 06/14/2023

Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation

Recently, end-to-end (E2E) automatic speech recognition (ASR) models hav...

Zheng Liang, et al.

research

∙ 06/13/2023

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

The utilization of discrete speech tokens, divided into semantic tokens ...

Chenpeng Du, et al.

research

∙ 04/25/2023

Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge

In this paper, we describe the systems developed by the SJTU X-LANCE tea...

Chenpeng Du, et al.

research

∙ 11/17/2022

EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance

Although current neural text-to-speech (TTS) models are able to generate...

Yiwei Guo, et al.

research

∙ 04/02/2022

VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature

The mainstream neural text-to-speech (TTS) pipeline is a cascade system, ...

Chenpeng Du, et al.

research

∙ 02/15/2022

Unsupervised word-level prosody tagging for controllable speech synthesis

Although word-level prosody modeling in neural text-to-speech (TTS) has ...

Yiwei Guo, et al.

research

∙ 05/27/2021

Diverse and Controllable Speech Synthesis with GMM-Based Phone-Level Prosody Modelling

Generating natural speech with diverse and smooth prosody pattern is a c...

Chenpeng Du, et al.

research

∙ 02/01/2021

Mixture Density Network for Phone-Level Prosody Modelling in Speech Synthesis

Recent research on both utterance-level and phone-level prosody modell...

Chenpeng Du, et al.

research

∙ 11/04/2020

Data Augmentation for End-to-end Code-switching Speech Recognition

Training a code-switching end-to-end automatic speech recognition (ASR) ...

Chenpeng Du, et al.
