Non-Singular Adversarial Robustness of Neural Networks
Adversarial robustness has emerged as a challenge for neural networks owing to their over-sensitivity to small input perturbations. While this issue is critical, we argue that solving it alone fails to provide a comprehensive robustness assessment. Even worse, conclusions drawn from singular robustness may give a false sense of overall model robustness. Specifically, our findings show that adversarially trained models that are robust to input perturbations are still vulnerable, or even more vulnerable, to weight perturbations when compared with standard models. In this paper, we formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs and model weights. To the best of our knowledge, this study is the first to consider simultaneous input-weight adversarial perturbations. Based on a multi-layer feed-forward neural network model with ReLU activation functions and a standard classification loss, we establish an error analysis that quantifies the loss sensitivity subject to ℓ_∞-norm bounded perturbations on data inputs and model weights. Building on this error analysis, we propose novel regularization functions for robust training and demonstrate improved non-singular robustness against joint input-weight adversarial perturbations.
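To make the idea of a joint input-weight perturbation concrete, the following is a minimal NumPy sketch, not the paper's analysis or its regularizers. It assumes a two-layer ReLU network without biases and a softmax cross-entropy loss, and it substitutes a simple single-step sign-gradient perturbation for whatever attack the paper uses. All names (`forward`, `loss_and_grads`, `joint_sign_perturbation`, `eps_x`, `eps_w`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def forward(x, W1, W2):
    """Two-layer ReLU network without biases (an illustrative assumption)."""
    h_pre = W1 @ x                 # hidden pre-activations
    h = np.maximum(h_pre, 0.0)     # ReLU
    logits = W2 @ h
    return h_pre, h, logits

def softmax(logits):
    z = logits - logits.max()      # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(logits, y):
    return -np.log(softmax(logits)[y])

def loss_and_grads(x, y, W1, W2):
    """Loss and its gradients with respect to the input and both weight matrices."""
    h_pre, h, logits = forward(x, W1, W2)
    loss = cross_entropy(logits, y)
    d_logits = softmax(logits)
    d_logits[y] -= 1.0                     # d loss / d logits
    g_W2 = np.outer(d_logits, h)
    d_pre = (W2.T @ d_logits) * (h_pre > 0)  # backprop through ReLU
    g_W1 = np.outer(d_pre, x)
    g_x = W1.T @ d_pre
    return loss, g_x, g_W1, g_W2

def joint_sign_perturbation(x, y, W1, W2, eps_x, eps_w):
    """One sign-gradient step on input and weights at once.

    Each entry moves by at most eps_x (input) or eps_w (weights), so both
    perturbations stay inside their l_inf balls. This is a hypothetical
    stand-in attack, not the method proposed in the paper.
    """
    _, g_x, g_W1, g_W2 = loss_and_grads(x, y, W1, W2)
    return (x + eps_x * np.sign(g_x),
            W1 + eps_w * np.sign(g_W1),
            W2 + eps_w * np.sign(g_W2))

# Toy example with random data (illustrative sizes and budgets).
rng = np.random.default_rng(0)
x = rng.normal(size=4)
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(3, 8))
y = 1
eps_x, eps_w = 0.05, 0.01

clean_loss = cross_entropy(forward(x, W1, W2)[2], y)
x_adv, W1_adv, W2_adv = joint_sign_perturbation(x, y, W1, W2, eps_x, eps_w)
input_only_loss = cross_entropy(forward(x_adv, W1, W2)[2], y)
weight_only_loss = cross_entropy(forward(x, W1_adv, W2_adv)[2], y)
joint_loss = cross_entropy(forward(x_adv, W1_adv, W2_adv)[2], y)
print(clean_loss, input_only_loss, weight_only_loss, joint_loss)
```

Comparing the four printed losses shows how the loss responds to an input-only, a weight-only, and a joint perturbation under the same ℓ_∞ budgets. A single sign step only approximates the worst case within each ball, so a multi-step attack would be needed for a tighter estimate.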