Haohe Liu

research

∙ 09/18/2023

Synth-AC: Enhancing Audio Captioning with Synthetic Supervision

Data-driven approaches hold promise for audio captioning. However, the d...

0 Feiyang Xiao, et al. ∙

research

∙ 09/13/2023

AudioSR: Versatile Audio Super-resolution at Scale

Audio super-resolution is a fundamental task that predicts high-frequenc...

0 Haohe Liu, et al. ∙

research

∙ 09/10/2023

Multimodal Fish Feeding Intensity Assessment in Aquaculture

Fish feeding intensity assessment (FFIA) aims to evaluate the intensity ...

0 Meng Cui, et al. ∙

research

∙ 08/03/2023

MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies

Diffusion models have shown promising results in cross-modal generation ...

0 Ke Chen, et al. ∙

research

∙ 05/30/2023

E-PANNs: Sound Recognition Using Efficient Pre-trained Audio Neural Networks

Sounds carry an abundance of information about activities and events in ...

0 Arshdeep Singh, et al. ∙

research

∙ 05/22/2023

Learning to detect an animal sound from five examples

Automatic detection and classification of animal sounds has many applica...

0 Inês Nolasco, et al. ∙

research

∙ 05/11/2023

Universal Source Separation with Weakly Labelled Data

Universal source separation (USS) is a fundamental research task for com...

0 Qiuqiang Kong, et al. ∙

research

∙ 03/30/2023

WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research

The advancement of audio-language (AL) multimodal learning tasks has bee...

0 Xinhao Mei, et al. ∙

research

∙ 11/22/2022

Ontology-aware Learning and Evaluation for Audio Tagging

This study defines a new evaluation metric for audio tagging tasks to ov...

0 Haohe Liu, et al. ∙

research

∙ 10/28/2022

Visually-Aware Audio Captioning With Adaptive Audio-Visual Attention

Audio captioning is the task of generating captions that describe the co...

0 Xubo Liu, et al. ∙

research

∙ 10/04/2022

Learning the Spectrogram Temporal Resolution for Audio Classification

The audio spectrogram is a time-frequency representation that has been w...

16 Haohe Liu, et al. ∙

research

∙ 10/03/2022

Simple Pooling Front-ends For Efficient Audio Classification

Recently, there has been increasing interest in building efficient audio...

19 Xubo Liu, et al. ∙

research

∙ 07/21/2022

Surrey System for DCASE 2022 Task 5: Few-shot Bioacoustic Event Detection with Segment-level Metric Learning

Few-shot audio event detection is a task that detects the occurrence tim...

0 Haohe Liu, et al. ∙

research

∙ 07/15/2022

Segment-level Metric Learning for Few-shot Bioacoustic Event Detection

Few-shot bioacoustic event detection is a task that detects the occurren...

10 Haohe Liu, et al. ∙

research

∙ 05/30/2022

BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

Binaural audio plays a significant role in constructing immersive augmen...

1 Yichong Leng, et al. ∙

research

∙ 05/09/2022

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

Text to speech (TTS) has made rapid progress in both academia and indust...

18 Xu Tan, et al. ∙

research

∙ 04/12/2022

VoiceFixer: A Unified Framework for High-Fidelity Speech Restoration

Speech restoration aims to remove distortions in speech signals. Prior m...

0 Haohe Liu, et al. ∙

research

∙ 03/28/2022

Separate What You Describe: Language-Queried Audio Source Separation

In this paper, we introduce the task of language-queried audio source se...

4 Xubo Liu, et al. ∙

research

∙ 03/28/2022

Neural Vocoder is All You Need for Speech Super-resolution

Speech super-resolution (SR) is a task to increase speech sampling rate ...

8 Haohe Liu, et al. ∙

research

∙ 03/06/2022

Leveraging Pre-trained BERT for Audio Captioning

Audio captioning aims at using natural language to describe the content ...

13 Xubo Liu, et al. ∙

research

∙ 12/09/2021

CWS-PResUNet: Music Source Separation with Channel-wise Subband Phase-aware ResUNet

Music source separation (MSS) shows active progress with deep learning m...

0 Haohe Liu, et al. ∙

research

∙ 09/28/2021

VoiceFixer: Toward General Speech Restoration with Neural Vocoder

Speech restoration aims to remove distortions in speech signals. Prior m...

0 Haohe Liu, et al. ∙

research

∙ 09/12/2021

Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation

Deep neural network based methods have been successfully applied to musi...

0 Qiuqiang Kong, et al. ∙

research

∙ 07/20/2021

Joint Echo Cancellation and Noise Suppression based on Cascaded Magnitude and Complex Mask Estimation

Acoustic echo and background noise can seriously degrade the intelligibi...

0 Xiaofeng Shu, et al. ∙

research

∙ 02/19/2021

Speech enhancement with weakly labelled data from AudioSet

Speech enhancement is a task to improve the intelligibility and perceptu...

0 Qiuqiang Kong, et al. ∙

research

∙ 08/12/2020

Channel-wise Subband Input for Better Voice and Accompaniment Separation on High Resolution Music

This paper presents a new input format, channel-wise subband input (CWS)...

2 Haohe Liu, et al. ∙
