The task of 3D semantic scene graph (3DSSG) prediction in the point clou...
We find Mask2Former also achieves state-of-the-art performance on video ...
Image segmentation is about grouping pixels with different semantics, e....
Modern approaches typically formulate semantic segmentation as a per-pix...
We propose point-based instance-level annotation, a new form of weak sup...
3D object detection in point clouds is a challenging vision task that be...
We present Boundary IoU (Intersection-over-Union), a new segmentation ev...
Scale variance among different sizes of body parts and objects is a chal...
Supervised learning in large discriminative models is a mainstay for mod...
In this work, we introduce Panoptic-DeepLab, a simple, strong, and fast ...
We present Panoptic-DeepLab, a bottom-up and single-shot approach for pa...
Developing object detection and tracking on resource-constrained embedde...
In this paper, we are interested in bottom-up multi-person human pose es...
Multi-scale context module and single-stage encoder-decoder structure ar...
We present a novel high frequency residual learning framework, which lea...
The training method of repetitively feeding all samples into a pre-defin...
In this paper, we analyze failure cases of state-of-the-art detectors an...
We study in this paper how to initialize the parameters of multinomial l...
This work provides a simple approach to discover tight object bounding b...
Recent region-based object detectors are usually built with separate cla...
Visual recognition under adverse conditions is a very important and chal...
Emotion recognition from facial expressions is tremendously useful, espe...