Parameter Estimation for Grouped Data Using EM and MCEM Algorithms
Nowadays, the confidentiality of data and information is of great importance for many companies and organizations. For this reason, they may prefer not to release exact data, but instead to grant researchers access to approximate data. For example, rather than providing the exact income of their clients, they may only provide researchers with grouped data, that is, the number of clients falling in each of a set of non-overlapping income intervals. The challenge is to estimate the mean and variance structure of the hidden ungrouped data based on the observed grouped data. To tackle this problem, this work considers the exact observed data likelihood and applies the Expectation-Maximization (EM) and Monte-Carlo EM (MCEM) algorithms for cases where the hidden data follow a univariate, bivariate, or multivariate normal distribution. The results are then compared with the case of ignoring the grouping and applying regular maximum likelihood. The well-known Galton data and simulated datasets are used to evaluate the properties of the proposed EM and MCEM algorithms.
READ FULL TEXT