This paper offers a new perspective to ease the challenge of domain
gene...
Visual grounding (VG) aims to establish fine-grained alignment between v...
Parameter Efficient Tuning (PET) has gained attention for reducing the n...
Streaming video clips with large-scale video tokens impede vision
transf...
The main challenge in domain generalization (DG) is to handle the
distri...
Contrastive learning methods train visual encoders by comparing views fr...
The relation modeling between actors and scene context advances video ac...
We propose DiffusionDet, a new framework that formulates object detectio...
Although the pre-trained Vision Transformers (ViTs) achieved great succe...
Pre-training video transformers on extra large-scale datasets is general...
Recent studies in deepfake detection have yielded promising results when...
Vision Transformers (ViTs) take all the image patches as tokens and cons...
Recently, MLP-like vision models have achieved promising performances on...
Dancing video retargeting aims to synthesize a video that transfers the ...
Studies on self-supervised visual representation learning (SSL) improve
...
We propose PD-GAN, a probabilistic diverse GAN for image inpainting. Giv...
Universal style transfer retains styles from reference images in content...
Adversarial attack arises due to the vulnerability of deep neural networ...
User-intended visual content fills the hole regions of an input image in...
Image virtual try-on replaces the clothes on a person image with a desir...
MoCo is effective for unsupervised image representation learning. In thi...
Convolutional Neural Networks (CNNs) have advanced existing medical syst...
Image virtual try-on aims to fit a garment image (target clothes) to a p...
Single image deraining regards an input image as a fusion of a backgroun...
The advancement of visual tracking has continuously been brought by deep...
While deep convolutional neural networks (CNNs) are vulnerable to advers...
Deep encoder-decoder based CNNs have advanced image inpainting methods f...
In this paper, we present an end-to-end learning framework for detailed ...
Correlation filters (CF) have received considerable attention in visual
...
We address the problem of recovering the 3D geometry of a human face fro...
We propose an unsupervised visual tracking method in this paper. Differe...
We address the problem of restoring a high-resolution face image from a
...
Visual attention, derived from cognitive neuroscience, facilitates human...
The tracking-by-detection framework receives growing attentions through ...
Image correction aims to adjust an input image into a visually pleasing ...
The tracking-by-detection framework consists of two stages, i.e., drawin...
We address the problem of transferring the style of a headshot photo to ...
Discriminative correlation filters (DCFs) have been shown to perform
sup...
Exemplar-based face sketch synthesis methods usually meet the challengin...
We propose a two-stage method for face hallucination. First, we generate...