LLMs have demonstrated remarkable abilities at interacting with humans t...
Image matting requires high-quality pixel-level human annotations to sup...
Since the introduction of Vision Transformers, the landscape of many com...
We study a challenging task, conditional human motion generation, which ...
Implicit neural 3D representation has achieved impressive results in sur...
Although vision transformers (ViTs) have achieved great success in compu...
Very recently, Window-based Transformers, which computed self-attention ...
Despite their success for semantic segmentation, convolutional neural ne...
In this paper, we tackle the problem of human de-occlusion which reasons...
Image matting is a key technique for image and video editing and composi...
The standard petrography test method for measuring air voids in concrete...
The first Agriculture-Vision Challenge aims to encourage research in dev...
The success of deep learning in visual recognition tasks has driven adva...
Multi-scale context module and single-stage encoder-decoder structure ar...
Video object segmentation (VOS) aims at pixel-level object tracking give...
Long-range dependencies can capture useful contextual information to ben...
Human parsing has received considerable interest due to its wide applica...
Object detection is a core problem in computer vision. With the developm...
Patch-level image representation is very important for object classifica...