Robust Neural Networks using Randomized Adversarial Training
Since the discovery of adversarial examples in machine learning, researchers have designed several techniques to train neural networks that are robust against different types of attacks (most notably ℓ_∞ and ℓ_2 based attacks). However, it has been observed that defense mechanisms designed to protect against one type of attack often offer poor performance against the other. In this paper, we introduce Randomized Adversarial Training (RAT), a technique that is effective against both ℓ_2 and ℓ_∞ attacks. To obtain this result, we build upon adversarial training, a technique that is effective against ℓ_∞ attacks, and demonstrate that adding random noise at training and inference time further improves robustness. We then show that RAT is as effective as adversarial training against ℓ_∞ attacks while also being robust against strong ℓ_2 attacks. Our final comparative experiments demonstrate that RAT outperforms all state-of-the-art approaches against ℓ_2 and ℓ_∞ attacks.
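The abstract describes combining adversarial training with random noise injected at both training and inference time. The following is a minimal illustrative sketch of that idea, not the paper's implementation: it assumes logistic regression as the model, FGSM as the ℓ_∞ attack, and Gaussian noise as the randomization; all function names and hyperparameter values here are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    # FGSM: perturb the input in the sign of the input-gradient of the
    # logistic loss (an l_inf-bounded adversarial example).
    p = sigmoid(x @ w + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

def rat_train_step(w, b, x, y, eps=0.1, sigma=0.05, lr=0.1):
    # Random noise at training time, then adversarial training:
    # craft an adversarial example from the noisy input and update on it.
    x_noisy = x + rng.normal(0.0, sigma, size=x.shape)
    x_adv = fgsm(x_noisy, y, w, b, eps)
    p = sigmoid(x_adv @ w + b)
    grad_w = (p - y) * x_adv
    grad_b = p - y
    return w - lr * grad_w, b - lr * grad_b

def rat_predict(x, w, b, sigma=0.05, n=10):
    # Random noise at inference time: average predictions over
    # several independently noised copies of the input.
    probs = [sigmoid((x + rng.normal(0.0, sigma, size=x.shape)) @ w + b)
             for _ in range(n)]
    return float(np.mean(probs))
```

As a usage example, training this sketch on two well-separated Gaussian clusters converges to a classifier whose noisy-averaged predictions separate the classes, while each update is taken on an adversarially perturbed, noised input rather than the clean one.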