In this work, we present CleanUNet 2, a speech denoising model that comb...
In this paper, we investigate the in-context learning ability of
retriev...
Textual backdoor attack, as a novel attack model, has been shown to be
e...
Large decoder-only language models (LMs) can be largely improved in term...
Deep learning models have been widely used in commercial acoustic system...
Augmenting pretrained language models (LMs) with a vision encoder (e.g.,...
Parameter efficient learning methods (PERMs) have recently gained signif...
Despite recent progress in generative adversarial network (GAN)-based
voc...
Pretrained language models (LMs) are susceptible to generating text with
n...
Existing knowledge-grounded dialogue systems typically use finetuned ver...
In this work, we present CleanUNet, a causal speech denoising model on t...
Pre-trained language models (LMs) are shown to easily generate toxic
lan...
Speech-to-text alignment is a critical component of neural text-to-speech...
Transformers have achieved success in both language and vision domains.
...
In this work, we propose FastDPM, a unified framework for fast sampling ...
Recent work on training neural retrievers for open-domain question answe...
State-of-the-art conversational agents have advanced significantly in
co...
In this work, we propose DiffWave, a versatile diffusion probabilistic m...
In this work, we present WaveFlow, a small-footprint generative flow for...
In this work, we extend ClariNet (Ping et al., 2019), a fully end-to-end...
In this work, we propose a non-autoregressive seq2seq model that convert...
We propose a large margin criterion for training neural language models....
In this work, we propose an alternative solution for parallel wave gener...
Breast cancer diagnosis often requires accurate detection of metastasis ...
Voice cloning is a highly desired feature for personalized speech interf...
We propose a Topic Compositional Neural Language Model (TCNLM), a novel
...
We present Deep Voice 3, a fully-convolutional attention-based neural
te...
In this work, we propose an infinite restricted Boltzmann machine (RBM),...
We introduce a technique for augmenting neural text-to-speech (TTS) with...
Restricted Boltzmann machines (RBMs) and conditional RBMs (CRBMs) are po...
Marginal MAP inference involves making MAP predictions in systems define...