World models, especially in autonomous driving, are trending and drawing...
Recent works have explored the fundamental role of depth estimation in m...
In recent years, soft prompt learning methods have been proposed to fine...
The vision-based perception for autonomous driving has undergone a trans...
3D semantic scene completion (SSC) is an ill-posed task that requires in...
3D scene understanding plays a vital role in vision-based autonomous dri...
BEV perception is of great importance in the field of autonomous driving...
Depth estimation has been widely studied and serves as the fundamental s...
Monocular depth estimation is a challenging task that predicts the pixel...
Dataset distillation reduces the network training cost by synthesizing s...
Semantic occupancy perception is essential for autonomous driving, as au...
Dataset distillation aims to generate small datasets with little informa...
Talking head synthesis is a promising approach for the video production ...
Representing and synthesizing novel views in real-world dynamic scenes f...
In recent years, vision-centric perception has flourished in various aut...
Data mixing strategies (e.g., CutMix) have shown the ability to greatly ...
The pretrain-finetune paradigm in modern computer vision facilitates the...
3D object detection with surrounding cameras has been a promising direct...
Self-supervised monocular methods can efficiently learn depth informatio...
Self-supervised monocular depth estimation is an attractive solution tha...
Talking head synthesis is an emerging technology with wide applications ...
In this paper, we propose a Shapley value based method to evaluate opera...
This paper presents a language-powered paradigm for ordinal regression. ...
Domain Adaptation of Black-box Predictors (DABP) aims to learn a model o...
Face recognition, as one of the most successful applications in artifici...
In this paper, we present BEVerse, a unified framework for 3D perception...
This paper proposes an introspective deep metric learning (IDML) framewo...
Gait benchmarks empower the research community to train and evaluate hig...
Learning with noisy labels has aroused much research interest since data...
Matching and pickup processes are core features of ride-sourcing service...
Face benchmarks empower the research community to train and evaluate hig...
Learning-based Multi-View Stereo (MVS) methods warp source images into t...
Autonomous driving requires accurate and detailed Bird's Eye View (BEV) ...
Depth estimation from images serves as the fundamental step of 3D percep...
This paper probes intrinsic factors behind typical failure cases (e.g. s...
Many gait recognition methods first partition the human gait into N-part...
Dataset condensation aims at reducing the network training effort throug...
Robot mobility is critical for mission success, especially in soft or de...
Recent self-supervised contrastive learning methods greatly benefit from...
Autonomous driving perceives the surrounding environment for decision ma...
Recent progress has shown that large-scale pre-training using contrastiv...
Recently, face recognition in the wild has achieved remarkable success a...
During the COVID-19 coronavirus epidemic, almost everyone wears a facial...
According to WHO statistics, there are more than 204,617,027 confirmed C...
Recent advances in self-attention and pure multi-layer perceptrons (MLP)...
The practical application requests both accuracy and efficiency on multi...
Face clustering is a promising method for annotating unlabeled face imag...
In this paper, we contribute a new million-scale face benchmark containi...
Ride-hailing platforms generally provide various service options to cust...
Both appearance cue and constraint cue are important in human pose estim...