State-of-the-art rehearsal-free continual learning methods exploit the
p...
Although instance segmentation methods have improved considerably, the
d...
Deep convolutional networks are ubiquitous in computer vision, due to th...
Self-supervised learning models have been shown to learn rich visual
rep...
The success of transformer models trained with a language modeling objec...
Developing agents that can execute multiple skills by learning from
pre-...
Recent large-scale image generation models such as Stable Diffusion have...
Despite significant advances, the performance of state-of-the-art contin...
We propose a novel antialiasing method to increase shift invariance in
c...
Vision Transformers (ViTs) have become a dominant paradigm for visual
re...
In defense-related remote sensing applications, such as vehicle detectio...
The application of deep neural networks to remote sensing imagery is oft...
In this paper, we aim to improve the mathematical interpretability of
co...
We consider the problem of training a deep neural network on a given
cla...
Recent works in autonomous driving have widely adopted the bird's-eye-vi...
Learning a diverse set of skills by interacting with an environment with...
Audio-visual automatic speech recognition (AV-ASR) is an extension of AS...
Both a good understanding of geometrical concepts and a broad familiarit...
Self-supervised models have been shown to produce comparable or better v...
Pre-training on large scale unlabelled datasets has shown impressive
per...
We introduce regularized Frank-Wolfe, a general and effective algorithm ...
Vision-based depth estimation is a key feature in autonomous systems, wh...
In this work, we address the problem of image-goal navigation in the con...
Measuring concept generalization, i.e., the extent to which models train...
We propose a novel amortized variational inference scheme for an empiric...
The task of retrieving video content relevant to natural language querie...
Generative modeling of natural images has been extensively studied in re...
Although deep learning approaches have stood out in recent years due to ...
Generative adversarial networks (GANs) are one of the most popular metho...
Several theories in cognitive neuroscience suggest that when people inte...
In Actor and Observer we introduced a dataset linking the first and
thir...
We study the problem of segmenting moving objects in unconstrained video...
Despite their success for object detection, convolutional neural network...
In this paper, we propose a new framework for action localization that t...
The problem of determining whether an object is in motion, irrespective ...
Fully convolutional neural networks (FCNNs) trained on a large number of...
Recognizing scene text is a challenging problem, even more so than the
r...
We present alpha-expansion beta-shrink moves, a simple generalization of...