Jinzheng He

research

∙ 07/14/2023

Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts

Zero-shot text-to-speech aims at synthesizing voices with unseen speech ...

0 Ziyue Jiang, et al. ∙

research

∙ 05/24/2023

AV-TranSpeech: Audio-Visual Robust Speech-to-Speech Translation

Direct speech-to-speech translation (S2ST) aims to convert speech from o...

0 Rongjie Huang, et al. ∙

research

∙ 05/22/2023

ViT-TTS: Visual Text-to-Speech with Scalable Diffusion Transformer

Text-to-speech (TTS) has undergone remarkable improvements in performance...

0 Huadai Liu, et al. ∙

research

∙ 05/21/2023

Wav2SQL: Direct Generalizable Speech-To-SQL Parsing

Speech-to-SQL (S2SQL) aims to convert spoken questions into SQL queries ...

0 Huadai Liu, et al. ∙

research

∙ 05/18/2023

CLAPSpeech: Learning Prosody from Text Context with Contrastive Language-Audio Pre-training

Improving text representation has attracted much attention to achieve ex...

0 Zhenhui Ye, et al. ∙

research

∙ 05/18/2023

RMSSinger: Realistic-Music-Score based Singing Voice Synthesis

We are interested in a challenging task, Realistic-Music-Score based Sin...

0 Jinzheng He, et al. ∙

research

∙ 05/01/2023

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

Generating talking person portraits with arbitrary speech audio is a cru...

8 Zhenhui Ye, et al. ∙

research

∙ 01/31/2023

GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis

Generating photo-realistic video portrait with arbitrary speech audio is...

3 Zhenhui Ye, et al. ∙

research

∙ 05/25/2022

TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation

Direct speech-to-speech translation (S2ST) systems leverage recent progr...

0 Rongjie Huang, et al. ∙

research

∙ 08/18/2020

PopMAG: Pop Music Accompaniment Generation

In pop music, accompaniments are usually played by multiple instruments ...

0 Yi Ren, et al. ∙
