Large-scale self-supervised pre-trained speech encoders outperform conve...
Self-supervised speech representation models have succeeded in various t...
In this paper, we introduce self-distillation and online clustering for ...
This work investigates the use of large-scale, pre-trained models (CLIP ...
Data-driven speech processing models usually perform well with a large a...
Transfer learning has proven to be crucial in advancing the state of spe...
Code-switching (CS) is common in daily conversations where more than one...
Self-supervised speech representation learning methods like wav2vec 2.0 ...
Mandarin-English code-switching (CS) is frequently used among East and S...
Automatic speech recognition (ASR) technologies today are primarily opti...
Whispering is an important mode of human speech, but no end-to-end recog...