Fairness Through Robustness: Investigating Robustness Disparity in Deep Learning

06/17/2020
by Vedant Nanda, et al.

Deep neural networks are being increasingly used in real-world applications (e.g., surveillance, face recognition). This has raised concerns about the fairness of decisions made by these models. Various notions and measures of fairness have been proposed to ensure that a decision-making system does not disproportionately harm (or benefit) particular subgroups of the population. In this paper, we argue that traditional notions of fairness based only on models' outputs are not sufficient when decision-making systems such as deep networks are vulnerable to adversarial attacks. We argue that in some cases, it may be easier for an attacker to target a particular subgroup, resulting in a form of robustness bias. We propose a new notion of adversarial fairness that requires all subgroups to be equally robust to adversarial perturbations. We show that state-of-the-art neural networks can exhibit robustness bias on real-world datasets such as CIFAR-10, CIFAR-100, Adience, and UTKFace. We then formulate a measure of our proposed fairness notion and use it as a regularization term to decrease robustness bias in the traditional empirical risk minimization objective. Through empirical evidence, we show that training with our proposed regularization term can partially mitigate adversarial unfairness while maintaining reasonable classification accuracy.
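To make the regularized objective concrete, below is a minimal, hypothetical sketch in PyTorch. It is not the paper's exact formulation: here, each example's distance to the decision boundary is approximated by its logit margin (a common soft proxy for the perturbation size needed to flip a prediction), and the robustness-bias penalty is the sum of absolute gaps between each subgroup's mean margin and the overall mean margin. The function names (`logit_margin`, `robustness_disparity`, `fair_robust_loss`) and the weight `lam` are illustrative, not taken from the paper.

```python
# Hypothetical sketch of a robustness-bias regularizer added to the
# standard ERM objective. The paper's exact measure may differ; the
# logit margin stands in for distance to the decision boundary.

import torch
import torch.nn as nn
import torch.nn.functional as F


def logit_margin(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Margin between the true-class logit and the best other logit.

    A larger margin loosely indicates that a larger perturbation is
    needed to change the prediction.
    """
    true_logit = logits.gather(1, labels.unsqueeze(1)).squeeze(1)
    masked = logits.clone()
    masked.scatter_(1, labels.unsqueeze(1), float("-inf"))
    runner_up = masked.max(dim=1).values
    return true_logit - runner_up


def robustness_disparity(margins: torch.Tensor, groups: torch.Tensor) -> torch.Tensor:
    """Sum of absolute gaps between each subgroup's mean margin and the
    batch-wide mean margin; zero when all subgroups are equally robust."""
    overall = margins.mean()
    gaps = [(margins[groups == g].mean() - overall).abs()
            for g in torch.unique(groups)]
    return torch.stack(gaps).sum()


def fair_robust_loss(model: nn.Module,
                     x: torch.Tensor,
                     y: torch.Tensor,
                     groups: torch.Tensor,
                     lam: float = 0.1) -> torch.Tensor:
    """Cross-entropy (ERM) plus a weighted robustness-bias penalty."""
    logits = model(x)
    erm = F.cross_entropy(logits, y)
    reg = robustness_disparity(logit_margin(logits, y), groups)
    return erm + lam * reg


if __name__ == "__main__":
    # Toy demo on random CIFAR-shaped data with two subgroups.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    x = torch.randn(64, 3, 32, 32)
    y = torch.randint(0, 10, (64,))
    groups = torch.randint(0, 2, (64,))
    loss = fair_robust_loss(model, x, y, groups, lam=0.1)
    loss.backward()
    print(f"regularized loss: {loss.item():.4f}")
```

One design note: using a differentiable margin proxy keeps the penalty trainable end-to-end with SGD, whereas disparities measured via full adversarial attacks (e.g., per-group PGD success rates) would be more faithful to the robustness-bias notion but far more expensive to compute inside the training loop.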
