Are All Training Examples Created Equal? An Empirical Study

11/30/2018
by Kailas Vodrahalli, et al.

Modern computer vision algorithms often rely on very large training datasets. However, it is conceivable that a carefully selected subsample of the dataset is sufficient for training. In this paper, we propose a gradient-based importance measure that we use to empirically analyze the relative importance of training images in four datasets of varying complexity. We find that in some cases, a small subsample is indeed sufficient for training. For other datasets, however, the relative differences in importance are negligible. These results have important implications for active learning on deep networks. Additionally, our analysis method can be used as a general tool to better understand the diversity of training examples in datasets.
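The abstract does not specify the exact form of the gradient-based importance measure, but the general idea of scoring training examples by the magnitude of their loss gradients can be sketched as follows. This is a hypothetical illustration using a logistic-regression loss, not the paper's actual method; the function name and the toy data are assumptions.

```python
# Hypothetical sketch: rank training examples by per-example gradient norm.
# The paper's exact measure is not given in the abstract; here we use the
# gradient of a logistic-regression loss as a stand-in importance score.
import numpy as np

def gradient_importance(X, y, w):
    """Return the per-example gradient norm of the logistic loss w.r.t. w."""
    p = 1.0 / (1.0 + np.exp(-X @ w))      # predicted probabilities
    # For logistic regression, the loss gradient contributed by example i
    # is (p_i - y_i) * x_i, so its norm is a natural importance score.
    grads = (p - y)[:, None] * X
    return np.linalg.norm(grads, axis=1)  # one score per example

# Toy data: two confidently classified points and one near the boundary.
X = np.array([[3.0, 0.0], [-3.0, 0.0], [0.2, 1.0]])
y = np.array([1.0, 0.0, 1.0])
w = np.array([1.0, 0.0])

scores = gradient_importance(X, y, w)
ranking = np.argsort(-scores)  # most "important" examples first
```

Under this toy measure, the near-boundary example receives the largest score, matching the intuition that hard or ambiguous examples carry more training signal than examples the model already classifies confidently.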
