DeepGauge: Multi-Granularity Testing Criteria for Deep Learning Systems
Deep learning defines a new data-driven programming paradigm that constructs the internal system logic of a crafted neuron network through a set of training data. Deep learning (DL) has been widely adopted in many safety-critical scenarios. However, a plethora of studies have shown that the state-of-the-art DL systems suffer from various vulnerabilities which can lead to severe consequences when applied to real-world applications. Currently, the robustness of a DL system against adversarial attacks is usually measured by the accuracy of test data. Considering the limitation of accessible high quality test data, good accuracy performance on test data can hardly guarantee the robustness and generality of DL systems. Different from traditional software systems which have clear and controllable logic and functionality, a DL system is trained with data and lacks thorough understanding. This makes it difficult for system analysis and defect detection, which could potentially hinder its real-world deployment without safety guarantees. In this paper, we propose DeepGauge, comprehensive and multi-granularity testing criteria for DL systems, which aims to render a complete and multi-faceted portrayal of the testbed. The in-depth evaluation of our proposed testing criteria is demonstrated on two well-known datasets, five DL systems, with four state-of-the-art attack techniques against DL. The effectiveness of DeepGauge sheds light on the construction of robust DL systems.
READ FULL TEXT