Faster estimation for constrained gamma mixture models using closed-form estimators
Mixture models are useful in a wide array of applications for identifying subpopulations in noisy, overlapping distributions. For example, in multiplexed immunofluorescence (mIF), cell image intensities represent expression levels, and the cell populations are a noisy mixture of expressing and non-expressing cells. Among mixture models, the gamma mixture model is flexible enough to fit the skewed, strictly positive data that occur in many biological measurements. However, the current estimation method relies on numerical optimization within the expectation-maximization (EM) algorithm and is computationally expensive, which makes it infeasible to apply across many large data sets, as is necessary for mIF data. Building on a recently developed closed-form estimator for the gamma distribution, we propose a closed-form gamma mixture model that is not only more computationally efficient but can also incorporate constraints from known biological information into the fitted distribution. We derive the closed-form estimators for the gamma mixture model and use simulations to demonstrate that our model produces results comparable to those of the current model in significantly less time and performs well in constrained model fitting.
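To make the idea concrete, the following is a minimal sketch of how a closed-form update could replace numerical optimization inside an EM-style loop. It is not the authors' derivation. It assumes the closed-form estimator referred to is the log-moment estimator of Ye and Chen (2017), in which scale = mean(x ln x) - mean(x) mean(ln x) and shape = mean(x) / scale, and it assumes the mixture version simply replaces those means with responsibility-weighted means. The function names (`gamma_closed_form`, `gamma_pdf`, `fit_two_gamma_mixture`) and the median-split initialization are hypothetical, and the biological constraints described above are not implemented.

```python
import math
import numpy as np


def gamma_closed_form(x, w=None):
    """Closed-form (shape, scale) estimates for a gamma sample.

    Assumption: optional weights w enter as weighted means; this is an
    illustrative extension, not a formula taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)
    w = w / w.sum()                      # normalize weights to sum to 1
    log_x = np.log(x)
    mean_x = np.sum(w * x)
    # Weighted covariance between x and ln x gives the scale estimate.
    scale = np.sum(w * x * log_x) - mean_x * np.sum(w * log_x)
    shape = mean_x / scale
    return shape, scale


def gamma_pdf(x, shape, scale):
    """Gamma density, computed via the log-density for numerical stability."""
    log_p = ((shape - 1.0) * np.log(x) - x / scale
             - shape * math.log(scale) - math.lgamma(shape))
    return np.exp(log_p)


def fit_two_gamma_mixture(x, n_iter=200):
    """EM-style loop for a two-component gamma mixture (hypothetical sketch).

    The parameter update uses the closed-form estimator instead of a
    numerical maximizer, so it is not an exact M-step.
    """
    x = np.asarray(x, dtype=float)
    # Hypothetical initialization: hard split at the median.
    low = x <= np.median(x)
    resp = np.column_stack([low, ~low]).astype(float)
    for _ in range(n_iter):
        # Update step: mixing proportions and closed-form gamma parameters.
        pi = resp.mean(axis=0)
        params = [gamma_closed_form(x, resp[:, k]) for k in range(2)]
        # E-step: posterior probability of each component for each point.
        dens = np.column_stack(
            [pi[k] * gamma_pdf(x, *params[k]) for k in range(2)]
        )
        resp = dens / dens.sum(axis=1, keepdims=True)
    return pi, params
```

Because each update is a handful of weighted sums, one iteration costs a single pass over the data with no inner optimizer, which is where the speedup over a numerically optimized M-step would be expected to come from.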