Controlling Over-generalization and its Effect on Adversarial Examples Generation and Detection

08/21/2018
by   Mahdieh Abbasi, et al.

Convolutional Neural Networks (CNNs) have advanced the state of the art in many vision applications. However, naive CNNs suffer from two serious issues: vulnerability to adversarial examples and confident but incorrect predictions on out-distribution samples. In this paper, we draw a connection between these two issues through over-generalization. We show that an augmented CNN (with an extra output class added) is a simple yet effective end-to-end approach capable of controlling over-generalization. We demonstrate that training an augmented CNN on only a properly selected natural out-distribution dataset and on interpolated samples enables it to classify a wide range of unseen out-distribution samples as dustbin. At the same time, its misclassification rates on a broad spectrum of well-known black-box adversaries drop drastically: it classifies a portion of the adversaries as the dustbin class (a rejection option) while correctly classifying some of the remaining ones. Notably, such an augmented CNN is never trained on any type of adversarial example. Finally, generating white-box adversarial attacks against augmented CNNs can be harder, as the attack algorithms must avoid the dustbin regions in order to produce actual adversaries.
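
Below is a minimal sketch, in PyTorch, of the core idea described in the abstract: a CNN whose output layer has K+1 classes, where the extra class acts as a dustbin, trained on natural images with their usual labels plus interpolated samples labeled as dustbin. The network architecture, the dustbin class index, the interpolation coefficient, and the helper names (AugmentedCNN, interpolated_dustbin_batch) are illustrative assumptions, not the authors' exact setup or hyperparameters.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 10          # e.g. CIFAR-10; assumption for illustration
DUSTBIN = NUM_CLASSES     # index of the extra dustbin class

class AugmentedCNN(nn.Module):
    """Small CNN with K+1 output logits: K in-distribution classes + dustbin."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes + 1)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def interpolated_dustbin_batch(x, alpha=0.5):
    """Blend pairs of in-distribution images and label the blends as dustbin.
    The blending coefficient and pairing scheme are illustrative assumptions."""
    perm = torch.randperm(x.size(0))
    x_mix = alpha * x + (1 - alpha) * x[perm]
    y_mix = torch.full((x.size(0),), DUSTBIN, dtype=torch.long)
    return x_mix, y_mix

# Training step (sketch): natural images keep their labels, while interpolated
# (and, in the paper, natural out-distribution) samples get the dustbin label;
# a standard cross-entropy loss is applied over the K+1 classes.
model = AugmentedCNN()
criterion = nn.CrossEntropyLoss()
x_in = torch.randn(8, 3, 32, 32)                 # stand-in for natural images
y_in = torch.randint(0, NUM_CLASSES, (8,))
x_mix, y_mix = interpolated_dustbin_batch(x_in)
logits = model(torch.cat([x_in, x_mix]))
loss = criterion(logits, torch.cat([y_in, y_mix]))
loss.backward()
```

At test time, any input whose predicted class is the dustbin index would be rejected rather than assigned an in-distribution label, which is how the rejection option described in the abstract would operate under this sketch.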
