Strategies for Language Identification in Code-Mixed Low Resource Languages
In recent years, substantial work has been done on language tagging of code-mixed data, but most approaches rely on large amounts of data to build their models. In this article, we present three strategies for building a word-level language tagger for code-mixed data using very limited resources. Each of them achieved an accuracy higher than our baseline model, and the best-performing system reached an accuracy of around 91%; an accuracy of around 92.6% is also reported.