IRGen: Generative Modeling for Image Retrieval

03/17/2023
by Yidan Zhang, et al.

While generative modeling has become ubiquitous in natural language processing and computer vision, its application to image retrieval remains unexplored. In this paper, we recast image retrieval as a form of generative modeling by employing a sequence-to-sequence model, in keeping with the current trend toward unified models. Our framework, IRGen, is a single model that enables end-to-end differentiable search and therefore benefits from direct optimization of the retrieval objective. In developing IRGen, we tackle the key technical challenge of converting an image into a very short sequence of semantic units so that retrieval is both efficient and effective. Experiments show that our model yields significant improvements on three commonly used benchmarks; for example, on the In-shop dataset its precision@10 is 22.9% higher than that of the best baseline method, with a comparable recall@10 score.
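To make the idea of retrieval as sequence generation concrete, the following is a minimal sketch and not the authors' implementation. It assumes each database image has already been assigned a short, fixed-length identifier of discrete tokens, and that some decoder returns next-token log-probabilities for a query. The names `retrieve`, `make_toy_scorer`, and the toy `database` are hypothetical; the scorer is a stand-in for a trained sequence-to-sequence model. Beam search is restricted to prefixes of identifiers that actually exist, so every completed beam maps back to a real image.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Identifier = Tuple[int, ...]


def build_prefix_set(identifiers: Sequence[Identifier]) -> set:
    """Collect every prefix of every valid identifier (including the empty one)."""
    prefixes = set()
    for ident in identifiers:
        for i in range(len(ident) + 1):
            prefixes.add(ident[:i])
    return prefixes


def retrieve(
    score_next: Callable[[Identifier], List[float]],
    database: Dict[Identifier, str],
    vocab_size: int,
    beam_size: int,
) -> List[str]:
    """Rank database images by beam search over their identifiers.

    score_next(prefix) must return one log-probability per token. It is a
    hypothetical stand-in for a decoder conditioned on the query image.
    """
    identifiers = list(database)
    prefixes = build_prefix_set(identifiers)
    length = len(identifiers[0])  # assumption: all identifiers share one length

    beams: List[Tuple[Identifier, float]] = [((), 0.0)]
    for _ in range(length):
        candidates = []
        for prefix, logp in beams:
            log_probs = score_next(prefix)
            for tok in range(vocab_size):
                extended = prefix + (tok,)
                # Keep only continuations that can still reach a real image.
                if extended in prefixes:
                    candidates.append((extended, logp + log_probs[tok]))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]

    return [database[ident] for ident, _ in beams]


def make_toy_scorer(target: Identifier, vocab_size: int):
    """Toy decoder: puts probability 0.9 on the target token at each position."""
    def score_next(prefix: Identifier) -> List[float]:
        probs = [0.1 / (vocab_size - 1)] * vocab_size
        probs[target[len(prefix)]] = 0.9
        return [math.log(p) for p in probs]
    return score_next


# Toy database: identifier -> image name (all values are illustrative).
database = {
    (0, 1, 2): "img_a",
    (0, 1, 3): "img_b",
    (2, 0, 1): "img_c",
    (0, 3, 3): "img_d",
}

ranked = retrieve(make_toy_scorer((0, 1, 2), vocab_size=4), database,
                  vocab_size=4, beam_size=2)
print(ranked)
```

In this sketch the beam size plays the role of K in top-K retrieval: the images whose identifiers share the longest high-probability prefix with the decoder's preferred sequence are ranked first.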
