Mandela Patrick

Featured Co-authors

Graham Neubig
243 publications
Andrea Vedaldi
146 publications
Florian Metze
72 publications
Junjie Hu
59 publications
Christian Rupprecht
48 publications
Ishan Misra
46 publications
Christoph Feichtenhofer
41 publications
Alexander Hauptmann
30 publications
João F. Henriques
30 publications
Yuki M. Asano
27 publications
Dylan Campbell
24 publications

research

∙ 06/09/2021

Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers

In video transformers, the time dimension is often treated in the same w...

Mandela Patrick, et al.

research

∙ 03/18/2021

Space-Time Crop Attend: Improving Cross-modal Video Representation Learning

The quality of the image representations obtained from self-supervised l...

Mandela Patrick, et al.

research

∙ 03/16/2021

Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models

This paper studies zero-shot cross-lingual transfer of vision-language m...

Po-Yao Huang, et al.

research

∙ 10/06/2020

Support-set bottlenecks for video-text representation learning

The dominant paradigm for learning video-text representations – noise co...

Mandela Patrick, et al.

research

∙ 06/24/2020

Labelling unlabelled videos from scratch with multi-modal self-supervision

A large part of the current success of deep learning lies in the effecti...

Yuki M. Asano, et al.

research

∙ 03/09/2020

Multi-modal Self-Supervision from Generalized Data Transformations

Self-supervised learning has advanced rapidly, with several results beat...

Mandela Patrick, et al.

research

∙ 10/18/2019

Understanding Deep Networks via Extremal Perturbations and Smooth Masks

The problem of attribution is concerned with identifying the parts of an...

Ruth Fong, et al.
