Medical imaging has witnessed remarkable progress but usually requires a...
The recent work CLIPA presents an inverse scaling law for CLIP training ...
CLIP, the first foundation model that connects images and text, has enab...
This paper presents a simple and effective visual prompting method for a...
Image pre-training, the current de facto paradigm for a wide range of vi...
Adversarial Propagation (AdvProp) is an effective way to improve recogni...
Deep neural networks are powerful tools for representation learning, but...
We focus on the problem of novel-view human action synthesis. Given an a...
3D convolution is powerful for video classification but often computatio...
Temporal convolution has been widely used for video classification. Howe...