This paper proposes an improved Goodness of Pronunciation (GoP) that uti...
With the advent of general-purpose speech representations from large-sca...
Inspired by humans comprehending speech in a multi-modal manner, various...
Maintaining road networks is labor-intensive, especially in actively dev...
Automatic assessment of dysarthric speech is essential for sustained tre...
Self-supervised models, namely, wav2vec and its variants, have shown pro...
Active learning (AL) aims to select the most useful data samples from an...
This paper proposes a cross-lingual classification method for English, K...
Social media platforms struggle to protect users from harmful content th...
Multilingual speech data often suffer from long-tailed language distribu...
Recent studies on learning with noisy labels have shown remarkable perfo...
Improving the performance of on-device audio classification models remai...
Neural architecture search (NAS) has fostered various fields of machine ...
The current evaluation protocol of long-tailed visual recognition trains...
With the variational lower bound of mutual information (MI), the estimat...