Deep neural networks often suffer from poor generalization due to comple...
In this paper, we study teacher-student learning from the perspective of...
Fine-tuning large pretrained language models on a limited training corpu...
Deep neural networks often suffer from poor generalization caused by com...
In this paper, we discover two factors that inhibit POMs from achieving ...