Including Dialects and Language Varieties in Author Profiling

07/03/2017
by Alina Maria Ciobanu, et al.

This paper presents a computational approach to author profiling that takes gender and language variety into account. We apply an ensemble system that combines the outputs of multiple linear SVM classifiers trained on character and word n-grams. We evaluate the system on the dataset provided by the organizers of the 2017 PAN lab on author profiling. Our approach achieved 75% accuracy on gender identification on tweets written in four languages and 97% accuracy on language variety identification for Portuguese.
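
As a rough illustration of the kind of system the abstract describes, the sketch below trains one linear SVM on character n-grams and one on word n-grams, then combines their outputs. It is not the authors' implementation. The n-gram orders, the hyperparameters, the bias-free subgradient trainer, the rule of summing decision scores, and the toy unaccented texts are all assumptions made for the example. The task is reduced to a binary choice between two Portuguese varieties.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams (spaces included)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n):
    """Count overlapping word n-grams over whitespace tokens."""
    tokens = text.split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def score(weights, feats):
    """Decision value w . x for a sparse feature vector."""
    return sum(weights.get(f, 0.0) * c for f, c in feats.items())


def train_linear_svm(examples, epochs=20, lr=0.1, lam=0.01):
    """Bias-free linear SVM fitted by subgradient descent on the
    L2-regularised hinge loss. `examples` holds (Counter, y) pairs
    with y in {+1, -1}. Hyperparameters are illustrative assumptions."""
    weights = {}
    for _ in range(epochs):
        for feats, y in examples:
            violated = y * score(weights, feats) < 1.0
            # L2 regularisation: shrink every weight slightly.
            for f in weights:
                weights[f] *= 1.0 - lr * lam
            # Hinge loss: move towards the example if its margin is below 1.
            if violated:
                for f, c in feats.items():
                    weights[f] = weights.get(f, 0.0) + lr * y * c
    return weights


# One extractor per ensemble member (assumed orders: char 3-grams, word 1-grams).
EXTRACTORS = [lambda t: char_ngrams(t, 3), lambda t: word_ngrams(t, 1)]


def train_ensemble(texts, labels):
    """Train one linear SVM per feature type."""
    return [
        train_linear_svm([(extract(t), y) for t, y in zip(texts, labels)])
        for extract in EXTRACTORS
    ]


def predict(models, text):
    """Combine the members by summing their decision scores (assumed rule)."""
    total = sum(score(w, extract(text)) for w, extract in zip(models, EXTRACTORS))
    return 1 if total > 0 else -1


# Hypothetical toy data: +1 = Brazilian, -1 = European Portuguese.
TEXTS = [
    "peguei onibus",
    "celular quebrou",
    "autocarro atrasado",
    "telemovel avariado",
]
LABELS = [1, 1, -1, -1]

models = train_ensemble(TEXTS, LABELS)
print(predict(models, "onibus quebrou"))
print(predict(models, "telemovel atrasado"))
```

A real system would more likely use an established SVM library, several n-gram orders per feature type, and a multi-class setup covering every variety and gender label in the dataset.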
