Attention networks for image-to-text

12/11/2017
by   Jason Poulos, et al.

This paper approaches the problem of image-to-text with attention-based encoder-decoder networks that are trained to handle sequences of characters rather than words. We experiment on lines of text from a popular handwriting database, using different attention mechanisms for the decoder. The model trained with softmax attention achieves the lowest test error, outperforming several other RNN-based models. Our results show that both softmax and sigmoid attention learn a linear alignment, but the alignment generated by sigmoid attention is much less precise.
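
To make the contrast between the two attention mechanisms concrete, here is a minimal NumPy sketch. It is not the paper's model: the abstract does not specify how alignment scores are computed, so the scores, the encoder states, and the function names below are all hypothetical. It assumes the common definitions, where softmax attention normalizes the scores across input positions and sigmoid attention squashes each score independently.

```python
import numpy as np

def softmax_attention(scores):
    """Normalize scores across input positions so the weights sum to 1."""
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def sigmoid_attention(scores):
    """Squash each score independently into (0, 1); weights need not sum to 1."""
    return 1.0 / (1.0 + np.exp(-scores))

def context_vector(weights, encoder_states):
    """Weighted sum of encoder states, one weight per input position."""
    return weights @ encoder_states

# Hypothetical inputs: 5 encoder positions with 4 features each, and made-up
# alignment scores that peak at the middle position for one decoder step.
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 4))
scores = np.array([-2.0, 0.5, 3.0, 0.5, -2.0])

soft = softmax_attention(scores)
sig = sigmoid_attention(scores)

soft_context = context_vector(soft, encoder_states)
sig_context = context_vector(sig, encoder_states)

print("softmax weights:", np.round(soft, 3), "sum =", round(float(soft.sum()), 3))
print("sigmoid weights:", np.round(sig, 3), "sum =", round(float(sig.sum()), 3))
```

In this toy setting the softmax weights compete with one another and concentrate on the highest-scoring position, while the sigmoid weights are assigned independently, so several positions can receive substantial weight at once.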
