Video-grounded Dialogue (VGD) aims to decode an answer sentence to a que...
Video moment retrieval (VMR) aims to localize target moments in untrimme...
Existing state-of-the-art 3D point cloud instance segmentation methods r...
A video-grounded dialogue system referred to as the Structured Co-refere...
Video Moment Retrieval (VMR) is a task to localize the temporal moment i...
This paper considers a network referred to as Modality Shifting Attentio...
This paper proposes a method to gain extra supervision via multi-task le...
This paper proposes the progressive attention memory network (PAMN) for ...