In this paper, we explored how to boost speech emotion recognition (SER)...
Self-supervised learning (SSL) proficiency in speech-related tasks has d...
Although diffusion models in text-to-speech have become a popular choice...
In recent years, speech-based self-supervised learning (SSL) has made si...
The excellent generalization ability of self-supervised learning (SSL) f...
Recently, end-to-end (E2E) automatic speech recognition (ASR) models hav...
False information can spread quickly on social media, negatively influen...
Audio-driven talking face has attracted broad interest from academia and...
Recent years have witnessed a boom in self-supervised learning (SSL) in ...
Self-supervised speech pre-training empowers the model with the contextu...
In this paper, we provide a new perspective on self-supervised speech mo...
Recent years have witnessed great strides in self-supervised learning (S...
Temporal Moment Localization (TML) in untrimmed videos is a challenging ...