We present LongLoRA, an efficient fine-tuning approach that extends the
...
The recent success of Contrastive Language-Image Pre-training (CLIP) has sho...
We propose Stratified Image Transformer (StraIT), a pure
non-autoregressi...
The architecture of transformers, which has recently witnessed booming applica...
Transformer architectures, based on the self-attention mechanism and
con...
For a long time, the vision community has tried to learn the spatio-temporal...
Recent studies have shown remarkable success in the face manipulation task w...
Facial landmark detection, or face alignment, is a fundamental task that...