Optimal Resampling for Learning Small Models

05/04/2019
by Abhishek Ghose, et al.

Models often need to be constrained to a certain size to be considered interpretable; for example, a decision tree of depth 5 is much easier to make sense of than one of depth 30. This suggests a trade-off between interpretability and accuracy. Our work minimizes this trade-off by identifying an optimal distribution of the training data to learn from, which, surprisingly, may differ from the original distribution. We use an Infinite Beta Mixture Model (IBMM) to represent a specific family of sampling schemes, and learn the parameters of the IBMM with a Bayesian Optimizer (BO). While a distribution in the original d-dimensional space would, even under simplistic assumptions, require optimizing over O(d) variables (cumbersome for most real-world data), our technique reduces this to a fixed set of 8 variables at the cost of some additional preprocessing. The proposed technique is model-agnostic: it can be applied to any classifier, and it admits a general notion of model size. We demonstrate its effectiveness on multiple real-world datasets by constructing size-constrained decision trees, linear probability models and gradient boosted models.
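
Below is a minimal, hypothetical sketch of the resample-then-evaluate loop the abstract describes, not the authors' implementation. Training points are projected to a scalar via a preliminary model (an assumed stand-in for the paper's preprocessing step), a two-component Beta mixture (a simplified stand-in for the IBMM, with five free parameters rather than the paper's fixed set of 8) defines the sampling weights, and a Bayesian Optimizer (here scikit-optimize's gp_minimize) tunes the mixture to maximize the held-out accuracy of a depth-5 decision tree. All names and parameter ranges are illustrative assumptions.

```python
import numpy as np
from scipy.stats import beta
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from skopt import gp_minimize
from skopt.space import Real

# Toy data split into train / validation.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Assumed preprocessing (a stand-in for the paper's): project each training
# point to a scalar in (0, 1) via a preliminary model's predicted probability,
# so the sampling distribution lives on the unit interval.
prelim = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
z = np.clip(prelim.predict_proba(X_tr)[:, 1], 1e-6, 1 - 1e-6)

def objective(params):
    """Resample the training set under a 2-component Beta mixture over z,
    fit a size-constrained model (a depth-5 tree), and return the negative
    validation accuracy so the BO minimizer maximizes accuracy."""
    a1, b1, a2, b2, w = params
    dens = w * beta.pdf(z, a1, b1) + (1.0 - w) * beta.pdf(z, a2, b2)
    p = dens / dens.sum()
    rng = np.random.default_rng(0)
    idx = rng.choice(len(z), size=len(z), replace=True, p=p)
    small = DecisionTreeClassifier(max_depth=5, random_state=0)
    small.fit(X_tr[idx], y_tr[idx])
    return -small.score(X_val, y_val)

# BO over the mixture parameters; the ranges are illustrative assumptions.
space = [Real(0.5, 10.0, name=n) for n in ("a1", "b1", "a2", "b2")]
space.append(Real(0.0, 1.0, name="w"))
res = gp_minimize(objective, space, n_calls=40, random_state=0)
print("best sampling parameters:", res.x)
print("validation accuracy of depth-5 tree:", -res.fun)
```

Swapping DecisionTreeClassifier for any other classifier (or another notion of model size) leaves the loop unchanged, which is the sense in which the approach is model-agnostic.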
