Audio-Visual Segmentation (AVS) aims to precisely outline audible object...
Sound Event Detection (SED) aims to predict the temporal boundaries of a...
Automatic Audio Captioning (AAC) refers to the task of translating an au...
Automatic Audio Captioning (AAC) refers to the task of translating audio...
Dysarthria is a condition that hampers the ability of an individual to
...
Few-shot learning aims to generalize to unseen classes that appear during
t...