Whole slide image (WSI) assessment is a challenging and crucial step in ...
Following the tracking-by-attention paradigm, this paper introduces an o...
Visual grounding of language aims at enriching textual representations o...
To generate proper captions for videos, the inference needs to identify ...
Current computational models capturing words' meaning mostly rely on tex...
Language grounding to vision is an active field of research aiming to en...
Inverse rendering of an object under entirely unknown capture conditions...
Providing explanations in the context of Visual Question Answering (VQA)...
Decomposing a scene into its shape, reflectance and illumination is a fu...
Language grounding aims at linking the symbolic representation of langua...
Decomposing a scene into its shape, reflectance, and illumination is a c...
Knowledge of the hidden factors that determine particular system dynamic...
The novel DISTributed Artificial neural Network Architecture (DISTANA) i...
Capturing the shape and spatially-varying appearance (SVBRDF) of an obje...
We introduce a distributed spatio-temporal artificial neural network arc...
Approximate nearest neighbor (ANN) search in high dimensions is an integ...
Creating plausible surfaces is an essential component in achieving a hig...
The goal of this work is to enable deep neural networks to learn represe...
As handheld video cameras are now commonplace and available in every sma...
Fisher-Vectors (FV) encode higher-order statistics of a set of multiple ...
Rating how aesthetically pleasing an image appears is a highly complex m...
Aligning video sequences is a fundamental yet still unsolved component f...
Material classification in natural settings is a challenge due to comple...
This paper proposes an approach that predicts the road course from camer...
This paper proposes a method for transferring the RGB color spectrum to ...