Musketeer (All for One, and One for All): A Generalist Vision-Language Model with Task Explanation Prompts

05/11/2023

∙

We present a sequence-to-sequence vision-language model whose parameters are jointly trained on all tasks (all for one) and fully shared among multiple tasks (one for all), resulting in a single model which we named Musketeer. The integration of knowledge across heterogeneous tasks is enabled by a novel feature called Task Explanation Prompt (TEP). TEP reduces interference among tasks, allowing the model to focus on their shared structure. With a single model, Musketeer achieves results comparable to or better than strong baselines trained on single tasks, almost uniformly across multiple tasks.

READ FULL TEXT

Musketeer (All for One, and One for All): A Generalist Vision-Language Model with Task Explanation Prompts

Sign in with Google

Consider DeepAI Pro