Semantically Adversarial Learnable Filters
We present the first adversarial framework that crafts perturbations that mislead classifiers by accounting for the content of the images and the semantics of the labels. The proposed framework combines deep neural networks and traditional image processing filters, which define the type and magnitude of the adversarial perturbation. We also introduce a semantic adversarial loss that guides the training of a fully convolutional neural network to generate adversarial images that will be classified with a label that is semantically different from the label of the original (clean) image. We analyse the limitations of existing methods that do not account for the semantics of the labels, and evaluate the proposed framework, FilterFool, on ImageNet with three object classifiers, namely ResNet50, ResNet18 and AlexNet. We discuss its success rate, robustness and transferability to unseen classifiers.
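To make the idea of a semantic adversarial loss concrete, the sketch below shows one plausible formulation under stated assumptions: it rewards classifier predictions whose label is semantically far from the clean label, using a precomputed pairwise semantic distance matrix (for example, derived from WordNet similarity between ImageNet synsets). This is not the authors' implementation; the function name, the `sem_dist` matrix, and all tensor shapes are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def semantic_adversarial_loss(logits, clean_label, sem_dist):
    """
    logits:      (B, C) classifier outputs on the filtered (adversarial) images
    clean_label: (B,)   integer labels of the original (clean) images
    sem_dist:    (C, C) pairwise semantic distance between labels, scaled to [0, 1]
    Returns a scalar that is low when the probability mass lies on labels
    that are semantically distant from the clean label.
    """
    probs = F.softmax(logits, dim=1)                 # (B, C) predicted distribution
    dist_to_clean = sem_dist[clean_label]            # (B, C) distance of each label to the clean one
    expected_dist = (probs * dist_to_clean).sum(1)   # (B,)   expected semantic distance
    return (1.0 - expected_dist).mean()              # minimising pushes mass toward distant labels

# Toy usage with random tensors (hypothetical shapes and distances).
if __name__ == "__main__":
    B, C = 4, 1000
    logits = torch.randn(B, C, requires_grad=True)
    labels = torch.randint(0, C, (B,))
    sem_dist = torch.rand(C, C)
    sem_dist = 0.5 * (sem_dist + sem_dist.T)         # make the distance symmetric
    sem_dist.fill_diagonal_(0.0)                     # a label has zero distance to itself
    loss = semantic_adversarial_loss(logits, labels, sem_dist)
    loss.backward()
    print(float(loss))
```

In a full pipeline, such a term would be combined with a constraint that keeps the filtered image perceptually close to the input; the sketch only illustrates how label semantics can enter the adversarial objective.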