Visual Relationship Detection (VRD) impels a computer vision model to 's...
Characterizing the remarkable generalization properties of over-paramete...
General-purpose semantic segmentation relies on a backbone CNN t...
Automatically detecting violence from surveillance footage is a subset o...
Action quality assessment (AQA) aims at automatically judging human acti...
Conditional generative modeling typically requires capturing one-to-many...
Humans explain inter-object relationships with semantic labels that