To make effective decisions in novel environments with long-horizon goal...
Large language models (LLMs) and vision-language models (VLMs) have been...
Large Language Models (LLMs) have demonstrated impressive planning abili...
Text-to-image generative models have enabled high-resolution image synth...
Large text-to-video models trained on internet-scale data have demonstra...
Reconstruction of 3D neural fields from posed images has emerged as a pr...
Large language models (LLMs) have demonstrated remarkable capabilities i...
Diffusion models are a class of flexible generative models trained with ...
We introduce a method for novel view synthesis given only a single wide-...
Recent works have shown that sequence modeling can be effectively used t...
Humans are able to accurately reason in 3D by gathering multi-view obser...
We present NeuSE, a novel Neural SE(3)-Equivariant Embedding for objects...
Foundation models pretrained on diverse data at scale have demonstrated ...
Since their introduction, diffusion models have quickly become the preva...
A robot operating in a household environment will see a wide range of un...
A goal of artificial intelligence is to construct an agent that can solv...
Humans form mental images of 3D scenes to support counterfactual imagina...
In this paper, we examine the problem of visibility-aware robot navigati...
Recent improvements in conditional generative modeling have made it poss...
We present a method for performing tasks involving spatial relations bet...
Can continuous diffusion models bring the same performance breakthrough ...
Large pre-trained models exhibit distinct and complementary capabilities...
The ability to reason about changes in the environment is crucial for ro...
Human perception reliably identifies movable and immovable parts of 3D s...
In this paper, we address the challenging problem of 3D concept groundin...
Deep learning has excelled on complex pattern recognition tasks such as ...
Large text-guided diffusion models, such as DALL-E 2, are able to generat...
Model-based reinforcement learning methods often use learning only for t...
Learning from a continuous stream of non-stationary data in an unsupervi...
Our environment is filled with rich and dynamic acoustic information. Wh...
Language model (LM) pre-training has proven useful for a wide variety of...
We present Neural Descriptor Fields (NDFs), an object representation tha...
The visual world around us can be described as a structured set of objec...
Deep neural networks have been used widely to learn the latent structure...
Humans are able to rapidly understand scenes by utilizing concepts extra...
Neural MMO is a computationally accessible research platform that combin...
We introduce the task of weakly supervised learning for detecting human ...
Self-supervised representation learning has achieved remarkable success ...
We propose a novel approach for probabilistic generative modeling of 3D ...
We present a method, Neural Radiance Flow (NeRFlow), to learn a 4D spatia...
We propose several different techniques to improve contrastive divergenc...
We motivate Energy-Based Models (EBMs) as a promising model class for co...
We present a framework for solving long-horizon planning problems involv...
When an agent interacts with a complex environment, it receives a stream...
We study the problem of unsupervised physical object discovery. Unlike e...
We propose an energy-based model (EBM) of protein conformations that ope...
A vital aspect of human intelligence is the ability to compose increasin...
Progress in multiagent intelligence research is fundamentally limited by...
A major component of overfitting in model-free reinforcement learning (R...
Model-based planning holds great promise for improving both sample effic...