As one of the most fundamental techniques in multimodal learning, cross-...
The gap between low-level visual signals and high-level semantics has be...
Masked autoencoders (MAEs) have emerged recently as art self-supervised
...
Scene text recognition with arbitrary shape is very challenging due to l...
3D shape reconstruction from a single-view RGB image is an ill-posed pro...
Merchandise categories inherently form a semantic hierarchy with differe...
Deep hashing methods have been proved to be effective and efficient for
...
Inferring the 3D shape of an object from an RGB image has shown impressi...
Sketch recognition remains a significant challenge due to the limited
tr...
In this paper, we propose a novel deep framework for part-level semantic...
Recovering the 3D representation of an object from single-view or multi-...
Convolutional Neural Networks (CNNs) have achieved comparable error rate...