Relational Language-Image Pre-training (RLIP) aims to align vision repre...
This paper introduces ModelScopeT2V, a text-to-video synthesis model tha...
The pursuit of controllability as a higher standard of visual content cr...
Detecting players from sports broadcast videos is essential for intellig...
Learning from changing tasks and sequential experience without forgettin...
The task of Human-Object Interaction (HOI) detection targets fine-graine...
Traditional object detectors are ill-equipped for incremental learning. ...
Human-Object Interaction (HOI) detection is an essential task to underst...
Group activity recognition aims to understand the activity performed by ...