DISCO: Distilling Phrasal Counterfactuals with Large Language Models

12/20/2022
by   Zeming Chen, et al.

Recent methods demonstrate that data augmentation using counterfactual knowledge can teach models the causal structure of a task, leading to robust and generalizable models. However, such counterfactual data often has a limited scale and diversity if crowdsourced, and is computationally expensive to extend to new perturbation types if generated using supervised methods. To address this, we introduce a new framework called DISCO for automatically generating high-quality counterfactual data at scale. DISCO engineers prompts to generate phrasal perturbations with a large general language model. Then, a task-specific teacher model filters the generations to distill high-quality counterfactual data. We show that learning with this counterfactual data yields a comparatively small student model that is 6% (absolute) more robust and generalizes 5% better across distributions than baselines on various challenging evaluations. This model is also 15% more sensitive in differentiating original and counterfactual examples, on three evaluation sets written by human workers and via human-AI collaboration.
