Auto-Encoder-BoF/HMM System for Arabic Text Recognition

12/10/2018
by Najoua Rahal, et al.

The recognition of Arabic text, in both handwritten and printed forms, is a rich source of technical difficulties for Optical Character Recognition (OCR). Printed text is commonly governed by well-established calligraphy rules, and its characters are well aligned. However, a system capable of reading Arabic printed text in unconstrained environments, with unlimited vocabulary, multiple styles, mixed fonts, and great morphological variability, is not always available. This diversity complicates the choice of both the features to extract and the segmentation algorithm. In this context, we adopt a new solution for unlimited-vocabulary and mixed-font Arabic printed text recognition. The proposed system is based on a Bag of Features (BoF) model that uses a Sparse Auto-Encoder (SAE) for feature representation and Hidden Markov Models (HMM) for recognition. As a result, the obtained average accuracies of recognition vary between 99.65 exceed 99
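
To make the BoF/SAE/HMM description more concrete, the following is a minimal sketch of how a text-line image might be turned into a sequence of Bag-of-Features histograms that an HMM could then consume. It is not the authors' implementation: the window width, step, patch size, codebook size, and the function names are all assumptions, and the encoder weights and codebook are random placeholders standing in for a trained Sparse Auto-Encoder and a learned codebook.

```python
import numpy as np

def sliding_windows(line_img, width, step):
    """Split a text-line image (H x W) into overlapping vertical frames."""
    _, w = line_img.shape
    return [line_img[:, x:x + width] for x in range(0, w - width + 1, step)]

def patches(frame, size):
    """Cut a frame into non-overlapping size x size patches, each flattened."""
    h, w = frame.shape
    return np.array([frame[y:y + size, x:x + size].ravel()
                     for y in range(0, h - size + 1, size)
                     for x in range(0, w - size + 1, size)])

def encode(p, W, b):
    """Sigmoid hidden activations of one encoder layer (W, b are placeholders)."""
    return 1.0 / (1.0 + np.exp(-(p @ W + b)))

def bof_histogram(codes, codebook):
    """Assign each code to its nearest codeword; return a normalized histogram."""
    d = ((codes[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d.argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def line_to_observations(line_img, W, b, codebook, width=8, step=4, size=4):
    """One BoF histogram per frame: the observation sequence for an HMM."""
    return np.array([bof_histogram(encode(patches(f, size), W, b), codebook)
                     for f in sliding_windows(line_img, width, step)])

# Hypothetical data: a random 16 x 40 "line image", an untrained encoder
# (16 inputs -> 6 hidden units), and a random 5-word codebook.
rng = np.random.default_rng(0)
line = rng.random((16, 40))
W = rng.normal(size=(16, 6))
b = np.zeros(6)
codebook = rng.random((5, 6))

obs = line_to_observations(line, W, b, codebook)
print(obs.shape)  # (number of frames, codebook size)
```

In a full system, the encoder and codebook would be learned from training data, and the resulting histogram sequences would be used to train and decode character or word HMMs. Those stages are omitted here.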
