There Is No Free Lunch In Adversarial Robustness (But There Are Unexpected Benefits)

05/30/2018
by Dimitris Tsipras, et al.

We provide a new understanding of the fundamental nature of adversarially robust classifiers and how they differ from standard models. In particular, we show that there provably exists a trade-off between the standard accuracy of a model and its robustness to adversarial perturbations. We demonstrate an intriguing phenomenon at the root of this tension: a certain dichotomy between "robust" and "non-robust" features. We show that while robustness comes at a price, it also has some surprising benefits. Robust models turn out to have interpretable gradients and feature representations that align unusually well with salient data characteristics. In fact, they yield striking feature interpolations that have thus far been possible to obtain only using generative models such as GANs.
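
To make the claimed tension concrete, here is a minimal sketch of a toy binary task. It is an illustrative assumption, not necessarily the construction used in the paper. Each example has one "robust" feature that equals the label with probability `p`, plus `d` "non-robust" features that are each only weakly correlated with the label (Gaussian with mean `eta * y`). All names and parameter values (`sample`, `standard_clf`, `robust_clf`, `attack`, `p`, `eta`, `d`, `eps`) are hypothetical choices for this sketch.

```python
import random


def sample(n, d=100, p=0.9, eta=0.3, seed=0):
    """Draw n examples as (robust_feature, weak_features, label)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        # Robust feature: equals the label with probability p (assumed value).
        robust = y if rng.random() < p else -y
        # Non-robust features: each one is only weakly correlated with y.
        weak = [rng.gauss(eta * y, 1.0) for _ in range(d)]
        data.append((robust, weak, y))
    return data


def sign(v):
    return 1 if v >= 0 else -1


def standard_clf(robust, weak):
    # Pools the many weak features; very accurate on clean data.
    return sign(sum(weak))


def robust_clf(robust, weak):
    # Uses only the robust feature; accuracy is capped near p.
    return sign(robust)


def attack(robust, weak, y, eps):
    # Worst-case l_inf perturbation for both classifiers above:
    # shift every coordinate by eps against the true label.
    return robust - eps * y, [w - eps * y for w in weak]


def accuracy(clf, data, eps=0.0):
    correct = 0
    for robust, weak, y in data:
        r, w = attack(robust, weak, y, eps)
        correct += clf(r, w) == y
    return correct / len(data)


if __name__ == "__main__":
    data = sample(2000)
    for name, clf in [("standard", standard_clf), ("robust", robust_clf)]:
        print(name, accuracy(clf, data), accuracy(clf, data, eps=0.6))
```

Under these assumed parameters, a perturbation of size `eps = 2 * eta` flips the mean of every weak feature, so the classifier that relies on them is accurate on clean inputs but fails under attack. The classifier that relies only on the robust feature keeps the same accuracy under the perturbation, but gives up clean accuracy to do so.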
