ATM: An Uncertainty-aware Active Self-training Framework for Label-efficient Text Classification
Despite the great success of pre-trained language models (LMs) in many natural language processing (NLP) tasks, they require excessive labeled data for fine-tuning to achieve satisfactory performance. To enhance label efficiency, researchers have resorted to active learning (AL), while the potential of unlabeled data is ignored by most prior work. To unleash the power of unlabeled data for better label efficiency and model performance, we develop ATM, a new framework that leverages self-training to exploit unlabeled data and is agnostic to the specific AL algorithm, serving as a plug-in module to improve existing AL methods. Specifically, unlabeled data with high uncertainty are exposed to the oracle for annotation, while those with low uncertainty are leveraged for self-training. To alleviate the label noise propagation issue in self-training, we design a simple and effective momentum-based memory bank to dynamically aggregate the model predictions from all rounds. Through extensive experiments, we demonstrate that ATM outperforms the strongest active learning and self-training baselines and improves label efficiency by 51.9%.
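To make the two mechanisms described above concrete, the following is a minimal sketch, not the paper's implementation: an entropy-based split of unlabeled data into an oracle query set and a self-training set, plus a momentum-based memory bank that exponentially averages predictions across AL rounds. All names (split_by_uncertainty, MomentumMemoryBank, the momentum value) are illustrative assumptions.

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Predictive entropy of per-example class probabilities, shape (N, C)."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def split_by_uncertainty(probs, query_size):
    """High-uncertainty examples go to the oracle; the rest are kept for self-training."""
    order = np.argsort(-entropy(probs))     # sort by descending uncertainty
    to_label = order[:query_size]           # high uncertainty -> human annotation
    to_selftrain = order[query_size:]       # low uncertainty -> pseudo-labeling
    return to_label, to_selftrain

class MomentumMemoryBank:
    """Exponential moving average of model predictions over rounds (illustrative)."""
    def __init__(self, num_unlabeled, num_classes, momentum=0.9):
        self.momentum = momentum
        # start from a uniform distribution for every unlabeled example
        self.bank = np.full((num_unlabeled, num_classes), 1.0 / num_classes)

    def update(self, probs):
        # aggregate the current round's predictions with the running average
        self.bank = self.momentum * self.bank + (1.0 - self.momentum) * probs
        return self.bank

    def pseudo_labels(self, indices):
        """Smoothed predictions used as self-training targets for the given examples."""
        return self.bank[indices].argmax(axis=1)
```

In this sketch, smoothing each round's predictions against the running average is what damps label noise: a single noisy round shifts the pseudo-labels only by a factor of (1 - momentum).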