Four-dimensional Digital Subtraction Angiography (4D DSA) plays a critic...
High-definition (HD) maps provide abundant and precise static environmen...
3D Semantic Scene Completion (SSC) has emerged as a nascent and pivotal ...
Exploring robust and efficient association methods has always been an im...
Natural image matting algorithms aim to predict the transparency map (al...
Gait recognition is an emerging biological recognition technology that i...
We present RND-SCI, a novel framework for compressive hyperspectral imag...
Semantic map construction under bird's-eye view (BEV) plays an essential...
High-definition (HD) maps serve as the essential infrastructure of auton...
Efficient inference for object detection networks is a major challenge o...
This paper explores the properties of the plain Vision Transformer (ViT)...
Large-scale language models (LLMs) have demonstrated outstanding perform...
Although recent approaches aiming for video instance segmentation have a...
Open-world instance segmentation has recently gained significant popular...
Rendering moving human bodies at free viewpoints only from a monocular v...
Post-training quantization (PTQ) is a popular method for compressing dee...
Autonomous driving requires a comprehensive understanding of the surroun...
Online lane graph construction is a promising but challenging task in au...
In the field of skeleton-based action recognition, current top-performin...
We present a simple, efficient, and scalable unfolding network, SAUNet, ...
As a neural network compression technique, post-training quantization (P...
Motion prediction is highly relevant to the perception of dynamic object...
In contrast to fully supervised methods using pixel-wise mask labels, bo...
Labeling objects with pixel-wise segmentation requires a huge amount of ...
We present MapTR, a structured end-to-end framework for efficient online...
Multi-object tracking in videos requires solving a fundamental problem ...
In contrast to the fully supervised methods using pixel-wise mask labels...
In this work, we propose PolarBEV for vision-based uneven BEV representa...
Semantic segmentation on driving-scene images is vital for autonomous dr...
3D detection based on surround-view camera systems is a critical techniqu...
The query mechanism introduced in the DETR method is changing the paradi...
Learning Bird's Eye View (BEV) representation from surrounding-view came...
Neural radiance fields (NeRF) have shown great success in modeling 3D sc...
Recently, vision transformers have achieved tremendous success on image-lev...
Although vision transformers (ViTs) have achieved great success in compu...
Recently, the semantics of scene text has been proven to be essential in...
Studying the inherent symmetry of data is of great importance in machine...
In this paper, we propose a conceptually novel, efficient, and fully con...
Though deep learning-based object detection methods have achieved promis...
Box-supervised instance segmentation has recently attracted lots of rese...
Neural radiance fields (NeRF) have shown great potential in representin...
Semantic information has been proved effective in scene text recognition...
Multi-object tracking (MOT) aims at estimating bounding boxes and identi...
We present VoxelTrack for multi-person 3D pose estimation and tracking f...
Instance segmentation on point clouds is a fundamental task in 3D scene ...
Recent studies show that hierarchical Vision Transformer with interleave...
Recently, query-based deep networks have attracted much attention owing to the...
Can Transformer perform 2D object-level recognition from a pure sequence...
Transformers have offered a new methodology of designing neural networks...
Learning discriminative representation using large-scale face datasets i...