We present GeGnn, a learning-based method for computing the approximate
...
Question-answering (QA) tasks often investigate specific question types,...
This work proposes a method for using any generator network as the found...
As machine learning tools progress, the inevitable question arises: How ...
In this paper, we propose a genuine group-level contrastive visual
repre...
Dense video captioning aims to identify the events of interest in an inp...
Typical vision backbones manipulate structured features. As a compromise...
Program synthesis strives to generate a computer program as a solution t...
Text summarization aims to condense long documents and retain key
inform...
Traffic classification associates packet streams with known application
...
Transfer learning with large pretrained transformer-based language model...
Heatmap-based methods dominate in the field of human pose estimation by
...
This paper studies the adaptive optimal stationary control of continuous...
Convolutional video models have an order of magnitude larger computation...
Human attention mechanisms often work in a top-down manner, yet it is no...
Image captioning models generally lack the capability to take into accou...
Learning specific hands-on skills such as cooking, car maintenance, and ...
Person re-identification (ReID) aims at searching the same identity pers...
This paper studies the robustness aspect of reinforcement learning algor...
Recent works of point clouds show that mulit-frame spatio-temporal model...
The generator model assumes that the observed example is generated by a
...
Learning energy-based model (EBM) requires MCMC sampling of the learned ...
This paper proposes a joint training method to learn both the variationa...
Multi-object tracking is a fundamental vision problem that has been stud...
Image enhancement from degradation of rainy artifacts plays a critical r...
Understanding sequential information is a fundamental task for artificia...
This paper studies the robustness of policy iteration in the context of
...
This paper reviews the NTIRE 2020 challenge on video quality mapping (VQ...
Pretraining from unlabelled web videos has quickly become the de-facto m...
Understanding interaction is an essential part of video action detection...
This paper studies the fundamental problem of learning deep generative m...
Instructional videos get high-traffic on video sharing platforms, and pr...
We present CoSQL, a corpus for building cross-domain, general-purpose
da...
Object detection plays an important role in current solutions to vision ...
We present SParC, a dataset for cross-domainSemanticParsing inContext th...
Extracting temporal and representation features efficiently plays a pivo...
We introduce the first benchmark for a new problem --- recognizing human...