Explaining the Black-box Smoothly- A Counterfactual Approach
We propose a BlackBox Counterfactual Explainer that is explicitly developed for medical imaging applications. Classical approaches (e.g. saliency maps) assessing feature importance do not explain how and why variations in a particular anatomical region is relevant to the outcome, which is crucial for transparent decision making in healthcare application. Our framework explains the outcome by gradually exaggerating the semantic effect of the given outcome label. Given a query input to a classifier, Generative Adversarial Networks produce a progressive set of perturbations to the query image that gradually changes the posterior probability from its original class to its negation. We design the loss function to ensure that essential and potentially relevant details, such as support devices, are preserved in the counterfactually generated images. We provide an extensive evaluation of different classification tasks on the chest X-Ray images. Our experiments show that a counterfactually generated visual explanation is consistent with the disease's clinical relevant measurements, both quantitatively and qualitatively.
READ FULL TEXT