This paper presents an extension to train end-to-end Context-Aware Trans...
To train image-caption retrieval (ICR) methods, contrastive loss functio...
The triplet loss with semi-hard negatives has become the de facto choice...
E-commerce provides rich multimodal data that is barely leveraged in pra...
In this work we explain the setup for a technical, graduate-level course...
In recent years, Generative Adversarial Networks (GANs) have improved st...
Scene Text Recognition (STR) is the problem of recognizing the correct w...