Provably Adversarially Robust Nearest Prototype Classifiers

07/14/2022
by   Václav Voráček, et al.

Nearest prototype classifiers (NPCs) assign to each input point the label of the nearest prototype with respect to a chosen distance metric. A direct advantage of NPCs is that their decisions are interpretable. Previous work could provide lower bounds on the minimal adversarial perturbation in the ℓ_p-threat model when using the same ℓ_p-distance for the NPC. In this paper we provide a complete discussion of the complexity when using ℓ_p-distances for the decision and ℓ_q-threat models for certification, for p, q ∈ {1, 2, ∞}. In particular, we provide scalable algorithms for the exact computation of the minimal adversarial perturbation when using the ℓ_2-distance, and improved lower bounds in the other cases. Using these efficient improved lower bounds, we train our Provably adversarially robust NPC (PNPC), which for MNIST has better ℓ_2-robustness guarantees than neural networks. Additionally, we present, to our knowledge, the first certification results w.r.t. the LPIPS perceptual metric, which has been argued to be a more realistic threat model for image classification than ℓ_p-balls. On CIFAR10, our PNPC achieves higher certified robust accuracy than the empirical robust accuracy reported in (Laidlaw et al., 2021). The code is available in our repository.
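To make the setting concrete, here is a minimal sketch (not the authors' implementation) of an ℓ_2 NPC together with a standard certified-radius lower bound: if the input stays closer to its nearest prototype w_i than to every prototype of a different class, the predicted label cannot change, so the minimum over wrong-class prototypes w_j of the ℓ_2 distance from x to the bisecting hyperplane between w_i and w_j, i.e. (‖w_j − x‖² − ‖w_i − x‖²) / (2‖w_j − w_i‖), certifies robustness. The function and variable names below are illustrative.

```python
import numpy as np

def npc_predict(x, prototypes, labels):
    """Predict the label of the nearest prototype under the l2 distance."""
    d = np.linalg.norm(prototypes - x, axis=1)
    return labels[np.argmin(d)]

def l2_certified_radius(x, prototypes, labels):
    """Lower bound on the minimal l2 perturbation changing the decision.

    For each wrong-class prototype w_j, the distance from x to the
    bisecting hyperplane between the nearest prototype w_i and w_j is
        (||w_j - x||^2 - ||w_i - x||^2) / (2 ||w_j - w_i||).
    Any perturbation smaller than the minimum over j keeps x closer to
    w_i than to all wrong-class prototypes, so the label is preserved.
    (The paper computes the exact minimal perturbation; this is only
    the classical lower bound.)
    """
    d = np.linalg.norm(prototypes - x, axis=1)
    i = np.argmin(d)
    y = labels[i]
    radii = [
        (d[j] ** 2 - d[i] ** 2) / (2 * np.linalg.norm(prototypes[j] - prototypes[i]))
        for j in range(len(prototypes))
        if labels[j] != y
    ]
    return min(radii)

# Tiny 2D example: two prototypes of different classes on the x-axis.
P = np.array([[0.0, 0.0], [4.0, 0.0]])
L = np.array([0, 1])
x = np.array([1.0, 0.0])
print(npc_predict(x, P, L))          # class of the nearest prototype
print(l2_certified_radius(x, P, L))  # distance to the bisector at x = 2
```

For this example the decision boundary is the vertical line x = 2, so the certificate for the point (1, 0) is exactly 1.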
