Improved Detection of Adversarial Attacks via Penetration Distortion Maximization
This paper is concerned with the defense of deep models against adversarial attacks. We develop an adversarial detection method, which is inspired by the certificate defense approach, and captures the idea of separating class clusters in the embedding space to increase the margin. The resulting defense is intuitive, effective, scalable, and can be integrated into any given neural classification model. Our method demonstrates state-of-the-art (detection) performance under all threat models.