Ensemble of Models Trained by Key-based Transformed Images for Adversarially Robust Defense Against Black-box Attacks
We propose a voting ensemble of models trained by using block-wise transformed images with secret keys for an adversarially robust defense. Key-based adversarial defenses were demonstrated to outperform state-of-the-art defenses against gradient-based (white-box) attacks. However, the key-based defenses are not effective enough against gradient-free (black-box) attacks without requiring any secret keys. Accordingly, we aim to enhance robustness against black-box attacks by using a voting ensemble of models. In the proposed ensemble, a number of models are trained by using images transformed with different keys and block sizes, and then a voting ensemble is applied to the models. In image classification experiments, the proposed defense is demonstrated to defend state-of-the-art attacks. The proposed defense achieves a clean accuracy of 95.56 attacks with a noise distance of 8/255 on the CIFAR-10 dataset.
READ FULL TEXT