Self-Healing Robust Neural Networks via Closed-Loop Control
Despite the wide applications of neural networks, there have been increasing concerns about their vulnerability issue. While numerous attack and defense techniques have been developed, this work investigates the robustness issue from a new angle: can we design a self-healing neural network that can automatically detect and fix the vulnerability issue by itself? A typical self-healing mechanism is the immune system of a human body. This biology-inspired idea has been used in many engineering designs but is rarely investigated in deep learning. This paper considers the post-training self-healing of a neural network, and proposes a closed-loop control formulation to automatically detect and fix the errors caused by various attacks or perturbations. We provide a margin-based analysis to explain how this formulation can improve the robustness of a classifier. To speed up the inference of the proposed self-healing network, we solve the control problem via improving the Pontryagin Maximum Principle-based solver. Lastly, we present an error estimation of the proposed framework for neural networks with nonlinear activation functions. We validate the performance on several network architectures against various perturbations. Since the self-healing method does not need a-priori information about data perturbations/attacks, it can handle a broad class of unforeseen perturbations.
READ FULL TEXT 
  
  
     share
 share