Recent years have witnessed great progress in creating vivid audio-drive...
Existing automated dubbing methods are usually designed for Professional...
Implicit neural representations have shown powerful capacity in modeling...
Synthesizing high-fidelity head avatars is a central problem for compute...
Text-driven content creation has evolved to be a transformative techniqu...
Animating virtual avatars with free-view control is crucial for various
...
Synthetic data has emerged as a promising source for 3D human research a...
Recent advances in modeling 3D objects mostly rely on synthetic datasets...
We present 3DHumanGAN, a 3D-aware generative adversarial network (GAN) t...
Co-speech gesture is crucial for human-machine interaction and digital
e...
Video recognition in an open and dynamic world is quite challenging, as ...
We present MotionBERT, a unified pretraining framework, to tackle differ...
Realistic generative face video synthesis has long been a pursuit in bot...
Video-to-Video synthesis (Vid2Vid) has achieved remarkable results in
ge...
Generic event boundary detection (GEBD) is an important yet challenging ...
Generating high-quality and diverse human images is an important yet
cha...
Although significant progress has been made to audio-driven talking face...
Unconditional human image generation is an important task in vision and
...
This work targets at using a general deep learning framework to synthesi...
This paper focuses on the weakly-supervised audio-visual video parsing t...
Recent advances like StyleGAN have promoted the growth of controllable f...
Generating speech-consistent body and gesture movements is a long-standi...
Animating high-fidelity video portrait with speech audio is crucial for
...
We present a novel framework that brings the 3D motion retargeting task ...
Generic event boundary detection is an important yet challenging task in...
Generative adversarial networks (GANs) typically require ample data for
...
While accurate lip synchronization has been achieved for arbitrary-subje...
Despite previous success in generating audio-driven talking heads, most ...
We present a new application direction named Pareidolia Face Reenactment...
This paper reports methods and results in the DeeperForensics Challenge ...
Despite the remarkable success of generative models in creating
photorea...
Depth information has proven to be a useful cue in the semantic segmenta...
Temporal modeling is crucial for capturing spatiotemporal structure in v...
We present a lightweight video motion retargeting approach TransMoMo tha...
We present a method to edit a target portrait footage by taking a sequen...
In this paper, we present our on-going effort of constructing a large-sc...
Recently, facial landmark detection algorithms have achieved remarkable
...
Recent studies have shown remarkable success in face manipulation task w...
Facial landmark detection, or face alignment, is a fundamental task that...
It is challenging to disentangle an object into two orthogonal spaces of...
Unsupervised image-to-image translation aims at learning a mapping betwe...
We present a novel learning-based framework for face reenactment. The
pr...
We present a novel boundary-aware face alignment algorithm by utilising
...