CIFS: Improving Adversarial Robustness of CNNs via Channel-wise Importance-based Feature Selection

02/10/2021
by Hanshu Yan, et al.

We investigate the adversarial robustness of CNNs from the perspective of channel-wise activations. By comparing non-robust (normally trained) and robustified (adversarially trained) models, we observe that adversarial training (AT) robustifies CNNs by aligning the channel-wise activations of adversarial data with those of their natural counterparts. However, the channels that are negatively relevant (NR) to predictions are still over-activated when processing adversarial data. We also observe that AT does not yield similar robustness for all classes. For the robust classes, channels with larger activation magnitudes are usually more positively relevant (PR) to predictions, but this alignment does not hold for the non-robust classes. Given these observations, we hypothesize that suppressing NR channels and aligning the activations of PR channels with their relevance further enhances the robustness of CNNs under AT. To examine this hypothesis, we introduce a novel mechanism, Channel-wise Importance-based Feature Selection (CIFS). CIFS manipulates the channel activations of certain layers by generating non-negative multipliers for these channels based on their relevance to predictions. Extensive experiments on benchmark datasets, including CIFAR10 and SVHN, verify the hypothesis and the effectiveness of CIFS in robustifying CNNs.
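To make the idea of relevance-based, non-negative channel multipliers concrete, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that a channel's relevance is the gradient of the sum of the top-k logits of a linear probe with respect to that channel's pooled activation, and that a softplus maps relevance to a positive multiplier. The names `channel_relevances`, `reweight_channels`, `probe_weights`, and `top_k` are hypothetical; the abstract does not specify any of these choices.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); the result is always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def channel_relevances(pooled, probe_weights, top_k=1):
    """Relevance of each channel to the probe's top-k predictions (assumed definition).

    For a linear probe, logit_j = sum_c probe_weights[j][c] * pooled[c], so the
    gradient of the summed top-k logits w.r.t. channel c is the sum of the
    corresponding weights.
    """
    logits = [sum(w * a for w, a in zip(row, pooled)) for row in probe_weights]
    top = sorted(range(len(logits)), key=lambda j: logits[j], reverse=True)[:top_k]
    return [sum(probe_weights[j][c] for j in top) for c in range(len(pooled))]


def reweight_channels(pooled, probe_weights, top_k=1):
    """Scale each channel by a positive multiplier derived from its relevance."""
    relevances = channel_relevances(pooled, probe_weights, top_k)
    multipliers = [softplus(r) for r in relevances]
    reweighted = [m * a for m, a in zip(multipliers, pooled)]
    return reweighted, multipliers


# Toy example: 3 channels (globally pooled activations), 2 classes.
pooled = [3.0, 1.0, 0.5]
probe_weights = [
    [2.0, -3.0, 0.1],   # class 0: channel 1 is negatively relevant
    [-1.0, 1.0, 0.2],   # class 1
]
reweighted, multipliers = reweight_channels(pooled, probe_weights, top_k=1)
```

In this toy setting the probe predicts class 0, so channel 1 (negative weight for class 0) receives a multiplier close to zero, while channel 0 (strongly positive weight) receives the largest one. A real network would obtain relevances by automatic differentiation through its own probe layers rather than from a closed-form linear gradient.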
