Domain generalization (DG) aims to learn domain-generalizable models fro...
Pose-free neural radiance fields (NeRF) aim to train NeRF with unposed
m...
Black-box unsupervised domain adaptation (UDA) learns with source predic...
Neural Radiance Field (NeRF) has shown impressive performance in novel v...
One-shot skeleton action recognition, which aims to learn a skeleton act...
Traditional domain adaptation assumes the same vocabulary across source ...
In the past decade, deep neural networks have achieved significant progr...
Open-vocabulary segmentation of 3D scenes is a fundamental function of h...
Audio-driven talking face generation, which aims to synthesize talking f...
Facial expression editing has attracted increasing attention with the ad...
3D Skeleton-based human action recognition has attracted increasing atte...
Face swapping aims to generate swapped images that fuse the identity of
...
Robust point cloud parsing under all-weather conditions is crucial to le...
Most visual recognition studies rely heavily on crowd-labelled data in d...
3D style transfer aims to render stylized novel views of a 3D scene with...
The task of 3D single object tracking (SOT) with LiDAR point clouds is
c...
Quantizing images into discrete representations has been a fundamental
p...
Contrastive learning has recently demonstrated great potential for
unsup...
Recent deep-learning-based compression methods have achieved superior
pe...
3D object detection with surround-view images is an essential task for
a...
Most existing scene text detectors require large-scale training data whi...
Multi-scale features have been proven highly effective for object detect...
With the prevalence of LiDAR sensors in autonomous driving, 3D object
tr...
Recently, single image super-resolution (SR) under large scaling factors...
3D object detection using point clouds has attracted increasing attentio...
LiDAR point clouds, which are usually scanned by rotating LiDAR sensors
...
Few-shot object detection has been extensively investigated by incorpora...
The recently proposed DEtection TRansformer (DETR) has established a ful...
Most existing scene text detectors focus on detecting characters or word...
Deep generative models have achieved conspicuous progress in realistic i...
Leveraging StyleGAN's expressivity and its disentangled latent codes,
ex...
Neural Radiance Fields (NeRF) have demonstrated very impressive performa...
Video semantic segmentation has achieved great progress under the superv...
Domain adaptive panoptic segmentation aims to mitigate data annotation
c...
Exemplar-based image translation establishes dense correspondences betwe...
Semi-supervised semantic segmentation learns from small amounts of label...
State-of-the-art document dewarping techniques learn to predict 3-dimens...
Perceiving the similarity between images has been a long-standing and
fu...
The recently developed DEtection TRansformer (DETR) establishes a new ob...
Recently, Vision-Language Pre-training (VLP) techniques have greatly
ben...
Point cloud data have been widely explored due to its superior accuracy ...
Music and dance have always co-existed as pillars of human activities,
c...
Predicting human motion from historical pose sequence is crucial for a
m...
As information exists in various modalities in real world, effective
int...
In a point cloud sequence, 3D object tracking aims to predict the locati...
Dance choreography for a piece of music is a challenging task, having to...
Unsupervised domain adaptation aims to align a labeled source domain and...
Training effective Generative Adversarial Networks (GANs) requires large...
Skeleton-based human action recognition has attracted increasing attenti...
Deep learning-based 3D object detection has achieved unprecedented succe...