This paper addresses the issue of modifying the visual appearance of vid...
The integration of natural language processing (NLP) technologies into
e...
The transcription quality of automatic speech recognition (ASR) systems
...
Explaining how important each input feature is to a classifier's decisio...
We present an algorithm for searching image collections using free-hand
...
Humans are arguably one of the most important subjects in video streams,...
Neural networks have shown great abilities in estimating depth from a si...
Do state-of-the-art natural language understanding models care about wor...
Despite significant progress in monocular depth estimation in the wild,
...
Video frame interpolation, the synthesis of novel views in time, is an
i...
Current methods for active speak er detection focus on modeling short-te...
While image captioning has progressed rapidly, existing works focus main...
We present BlockGAN, an image generative model that learns object-aware ...
Mode collapse is a well-known issue with Generative Adversarial Networks...
We propose a novel video inpainting algorithm that simultaneously
halluc...
The Ken Burns effect allows animating still images with a virtual camera...
Incremental learning targets at achieving good performance on new catego...
Despite excellent performance on stationary test sets, deep neural netwo...
Standard video frame interpolation methods first estimate optical flow
b...
Video frame interpolation typically involves two steps: motion estimatio...