Recent advances in deep neural networks have achieved unprecedented succ...
Speech-driven animation has gained significant traction in recent years,...
Safeguarding personal information is paramount for healthcare data shari...
Recently reported state-of-the-art results in visual speech recognition ...
Audio-visual speech recognition has received a lot of attention due to i...
Cross-lingual self-supervised learning has been a growing research topic...
Talking face generation has historically struggled to produce head movem...
We present RAVEn, a self-supervised multi-modal approach to jointly lear...
Audio-visual speech enhancement aims to extract clean speech from a nois...
Recognizing a word shortly after it is spoken is an important requiremen...
This work focuses on the apparent emotional reaction recognition (AERR) ...
Several training strategies and temporal models have been recently propo...
Video-to-speech synthesis (also known as lip-to-speech) refers to the
tr...
This paper presents a novel method for face clustering in videos using a...
Visual speech recognition (VSR) aims to recognise the content of speech ...
One of the most pressing challenges for the detection of face-manipulate...
Apparent emotional facial expression recognition has attracted a lot of
...
The large amount of audiovisual content being shared online today has dr...
Video-to-speech is the process of reconstructing the audio speech from a...
Domain translation is the process of transforming data from one domain t...
In this work, we present a hybrid CTC/Attention model based on a ResNet-...
Although current deep learning-based face forgery detectors achieve
impr...
Speech recognition systems have improved dramatically over the last few
...
In this work, we present the Densely Connected Temporal Convolutional Ne...
Lipreading has witnessed a lot of progress due to the resurgence of neur...
The intuitive interaction between the audio and visual modalities is val...
Self-supervised learning has attracted plenty of recent research interes...
Lip-reading has attracted a lot of research attention lately thanks to
a...
Self supervised representation learning has recently attracted a lot of
...
Adversarial attacks pose a threat to deep learning models. However, rese...
Speech-driven facial animation involves using a speech signal to generat...
Lip-reading models have been significantly improved recently thanks to
p...
Speech is a means of communication which relies on both audio and visual...
Speech-driven facial animation is the process that automatically synthes...
Several audio-visual speech recognition models have been recently propos...
Traditional visual speech recognition systems consist of two stages, fea...
Recent works in speech recognition rely either on connectionist temporal...
This paper presents a classifier ensemble for Facial Expression Recognit...
Speech-driven facial animation is the process which uses speech signals ...
In the context of Human-Robot Interaction (HRI), face Re-Identification ...
Several end-to-end deep learning approaches have been recently presented...
Silent speech interfaces have been recently proposed as a way to enable
...
Several end-to-end deep learning approaches have been recently presented...
Local deep neural networks have been recently introduced for gender
reco...
Traditional visual speech recognition systems consist of two stages, fea...