Offline reinforcement learning (RL) methods strike a balance between exp...
Complex manipulation tasks often require robots with complementary capab...
We introduce Alexa Arena, a user-centric simulation platform for Embodie...
For service robots to become general-purpose in everyday household envir...
A key goal for the advancement of AI is to develop technologies that ser...
We introduce OPEND, a benchmark for learning how to use a hand to open c...
Recent breakthroughs in Vision-Language (V&L) joint research have achi...
The domain of joint vision-language understanding, especially in the con...
We propose a multimodal (vision-and-language) benchmark for cooperative ...
To solve video-and-language grounding tasks, the key is for the network ...
Language-guided Embodied AI benchmarks requiring an agent to navigate an...
We introduce a novel privacy-preserving methodology for performing Visua...
Recent years have witnessed an emerging paradigm shift toward embodied a...
We present a two-step hybrid reinforcement learning (RL) policy that is ...
Language-enabled AI systems can answer complex, multi-hop questions to h...
Learning-based methods for training embodied agents typically require a ...
Language-guided robots performing home and office tasks must navigate in...
GuessWhat?! is a two-player visual dialog guessing game where player A a...
Embodied instruction following is a challenging problem requiring an age...
Current conversational AI systems aim to understand a set of pre-designe...
The predominant approach to visual question answering (VQA) relies on en...