Intuitive physics is pivotal for human understanding of the physical wor...
Understanding the continuous states of objects is essential for task lea...
We introduce SceneDiffuser, a conditional generative model for 3D scene
...
Current computer vision models, unlike the human visual system, cannot y...
The ability to decompose complex natural scenes into meaningful
object-c...
Understanding human tasks through video observations is an essential
cap...
Causal induction, i.e., identifying unobservable mechanisms that lead to...
Spatial-temporal reasoning is a challenging task in Artificial Intellige...
Understanding and interpreting human actions is a long-standing challeng...
"Thinking in pictures," [1] i.e., spatial-temporal reasoning, effortless...
Dramatic progress has been witnessed in basic vision tasks involving
low...
This paper addresses the task of detecting and recognizing human-object
...
Future predictions on sequence data (e.g., videos or audios) require the...