Improving the Interpretability of Neural Sentiment Classifiers via Data Augmentation

09/10/2019
by Hanjie Chen, et al.

Recent neural network models achieve remarkable performance on sentiment classification, but their predictions are hard to interpret, which can raise trustworthiness and other issues in practice. In this work, we study the problem of improving the interpretability of existing sentiment classifiers. We propose two data augmentation methods that create additional training examples to help improve model interpretability: one uses a predefined sentiment word list as external knowledge, and the other uses adversarial examples. We test the proposed methods on both CNN and RNN classifiers with three benchmark sentiment datasets. Model interpretability is assessed by both human evaluators and a simple automatic evaluation measure. Experiments show that the proposed data augmentation methods significantly improve the interpretability of both neural classifiers.
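
The abstract does not spell out how the word-list augmentation builds its extra examples, so the following is only a minimal sketch of one plausible reading: for each labeled sentence, keep just the tokens found in a sentiment lexicon and add them as a new training example with the same label. The function name `augment_with_word_list`, the tiny `SENTIMENT_WORDS` lexicon, and the keep-only-lexicon-words rule are all assumptions made for illustration, not the authors' procedure.

```python
from typing import List, Set, Tuple

# Hypothetical toy lexicon; the word list used in the paper is not given in the abstract.
SENTIMENT_WORDS: Set[str] = {"good", "great", "excellent", "bad", "awful", "boring"}

Example = Tuple[str, int]  # (text, label)


def augment_with_word_list(
    examples: List[Example], lexicon: Set[str] = SENTIMENT_WORDS
) -> List[Example]:
    """Return the original examples plus one extra example per sentence
    that contains lexicon words (assumed augmentation rule)."""
    augmented = list(examples)
    for text, label in examples:
        # Lowercase, split on whitespace, and strip simple trailing punctuation.
        tokens = [tok.strip(".,!?") for tok in text.lower().split()]
        kept = [tok for tok in tokens if tok in lexicon]
        if kept:
            # The new example carries only sentiment words, with the same label.
            augmented.append((" ".join(kept), label))
    return augmented


data = [
    ("The plot was boring and the acting awful.", 0),
    ("A table and a chair.", 1),
]
result = augment_with_word_list(data)
```

Under this assumption, a classifier trained on the augmented set sees examples where the label depends only on sentiment words, which is one way such data could push its explanations toward those words.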
