Bayesian latent Gaussian graphical models for mixed data with marginal prior information
Associations between variables of mixed types are of interest in a variety of scientific fields, particularly in the social sciences. In many such settings, learning the dependence relationships among large numbers of continuous and discrete variables from relatively few observations is critical both for understanding the data and for predictive tasks. Further, in settings with insufficient data to estimate the complete dependence structure, informative prior beliefs become essential. Collecting informative prior beliefs about the complete dependence structure, however, is practically challenging, and in many cases reliable prior information can only be elicited about the marginal distributions of the variables. In this work we introduce a latent Gaussian graphical modeling approach to characterize the underlying dependence relationships between variables of mixed types. Our approach incorporates informative prior beliefs about the marginal distributions of the variables, and we show that such information can play a significant role in decoding the dependencies between the variables. Our work is motivated by survey-based cause-of-death instruments, known as verbal autopsies (VAs). These data are widely used in places without full-coverage civil registration systems and where most deaths occur outside of hospitals. We show that our method can be integrated into existing probabilistic cause-of-death assignment algorithms and improves model performance while recovering dependencies in the data that could prove useful for streamlining future data collection.