Measures to Evaluate Generative Adversarial Networks Based on Direct Analysis of Generated Images
The Generative Adversarial Network (GAN) is a state-of-the-art technique in the field of deep learning. A number of recent papers address the theory and applications of GANs in various fields of image processing. Fewer studies, however, have directly evaluated GAN outputs. Those that have been conducted focused on using classification performance (e.g., Inception Score) and statistical metrics (e.g., Fréchet Inception Distance). , Here, we consider a fundamental way to evaluate GANs by directly analyzing the images they generate, instead of using them as inputs to other classifiers. We characterize the performance of a GAN as an image generator according to three aspects: 1) Creativity: non-duplication of the real images. 2) Inheritance: generated images should have the same style, which retains key features of the real images. 3) Diversity: generated images are different from each other. A GAN should not generate a few different images repeatedly. Based on the three aspects of ideal GANs, we have designed two measures: Creativity-Inheritance-Diversity (CID) index and Likeness Score (LS) to evaluate GAN performance, and have applied them to evaluate three typical GANs. We compared our proposed measures with three commonly used GAN evaluation methods: Inception Score (IS), Fréchet Inception Distance (FID) and 1-Nearest Neighbor classifier (1NNC). In addition, we discuss how these evaluations could help us deepen our understanding of GANs and improve their performance.
READ FULL TEXT