Implicit segmentation of Kannada characters in offline handwriting recognition using hidden Markov models
We describe a method for classification of handwritten Kannada characters using Hidden Markov Models (HMMs). Kannada script is agglutinative, where simple shapes are concatenated horizontally to form a character. This results in a large number of characters making the task of classification difficult. Character segmentation plays a significant role in reducing the number of classes. Explicit segmentation techniques suffer when overlapping shapes are present, which is common in the case of handwritten text. We use HMMs to take advantage of the agglutinative nature of Kannada script, which allows us to perform implicit segmentation of characters along with recognition. All the experiments are performed on the Chars74k dataset that consists of 657 handwritten characters collected across multiple users. Gradient-based features are extracted from individual characters and are used to train character HMMs. The use of implicit segmentation technique at the character level resulted in an improvement of around 10 tested on the same dataset by around 16 showed that increasing the training data could result in better accuracy. Accordingly, we collected additional data and obtained an improvement of 4 with 6 additional samples.
READ FULL TEXT