The AI community has been pursuing algorithms known as artificial genera...
With the advance of large-scale model technologies, parameter-efficient ...
Fine-tuning visual models has widely shown promising performance on...
Despite recent competitive performance across a range of vision tasks, v...
Temporal sentence grounding aims to detect the event timestamps describe...
Weakly-supervised temporal action localization (WTAL) learns to detect a...
Generating motion in line with text has attracted increasing attention n...
Since the release of various large-scale natural language processing (NL...
The objective of this work is to explore how to effectively and efficien...
Fine-grained object retrieval aims to learn discriminative representatio...
In computer vision, fine-tuning is the de-facto approach to leverage pre...
In the past few years, the emergence of vision-language pre-training (VL...
Exploiting convolutional neural networks for point cloud processing is q...
This is an opinion paper. We hope to deliver a key message that current ...
Recent years have witnessed significant progress in 3D hand mesh recover...
Recently, contrastive learning has largely advanced the progress of unsu...
Neural architecture search (NAS) has attracted increasing attention in ...
RGB-Infrared (IR) person re-identification is very challenging due to th...
For network architecture search (NAS), it is crucial but challenging to ...
Traditional clustering methods often perform clustering with low-level i...