Synthesizing realistic videos according to a given speech is still an op...
Open-world instance-level scene understanding aims to locate and recogni...
3D semantic segmentation on multi-scan large-scale point clouds plays an...
In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence...
Existing 3D scene understanding tasks have achieved high performance on
...
3D automatic annotation has received increased attention since manually
...
3D scene understanding, e.g., point cloud semantic and instance segmenta...
Category-level 6D pose estimation aims to predict the poses and sizes of...
Semantic segmentation is still a challenging task for parsing diverse
co...
3D object detectors usually rely on hand-crafted proxies, e.g., anchors ...
Implicit neural rendering, which uses signed distance function (SDF)
rep...
A recent study has shown a phenomenon called neural collapse in that the...
Although considerable progress has been obtained in neural network
quant...
Open-vocabulary scene understanding aims to localize and recognize unsee...
Weakly supervised detection of anomalies in surveillance videos is a
cha...
There are a lot of promising results in 3D recognition, including
classi...
Most existing 3D point cloud object detection approaches heavily rely on...
The sixth-generation (6G) mobile networks are expected to feature the
ub...
3D scenes are dominated by a large number of background points, which is...
In this paper, we empirically study how to make the most of low-resoluti...
Text-guided 3D shape generation remains challenging due to the absence o...
With the rapid development of mobile devices, modern widely-used mobile
...
Despite a growing number of datasets being collected for training 3D obj...
Recent advances in 2D CNNs and vision transformers (ViTs) reveal that la...
In this work, we present a unified framework for multi-modality 3D objec...
In this work, we present a conceptually simple yet effective framework f...
In this paper, we tackle the problem of learning visual representations ...
Despite substantial progress in 3D object detection, advanced 3D detecto...
Moire patterns, appearing as color distortions, severely degrade image a...
Deep learning approaches achieve prominent success in 3D semantic
segmen...
Manually annotating 3D point clouds is laborious and costly, limiting th...
In this work, we explore the challenging task of generating 3D shapes fr...
3D point cloud segmentation has made tremendous progress in recent years...
To interpret deep networks, one main approach is to associate neurons wi...
In this paper, we propose a new query-based detection framework for crow...
Large-scale pre-training has been proven to be crucial for various compu...
Video Panoptic Segmentation (VPS) aims at assigning a class label to eac...
In this paper, we present a conceptually simple, strong, and efficient
f...
In this paper, we present a self-training method, named ST3D++, with a
h...
Domain shift is a well known problem where a model trained on a particul...
While self-training has advanced semi-supervised semantic segmentation, ...
Point cloud semantic segmentation often requires largescale annotated
tr...
Indoor scene semantic parsing from RGB images is very challenging due to...
We introduce Position Adaptive Convolution (PAConv), a generic convoluti...
The neuromorphic event cameras, which capture the optical changes of a s...
We present a new domain adaptive self-training pipeline, named ST3D, for...
In 2D image processing, some attempts decompose images into high and low...
In this paper, we propose a geometric neural network with edge-aware
ref...
In this paper, we present a conceptually simple, strong, and efficient
f...
We present an Object-aware Feature Aggregation (OFA) module for video ob...