A Bayesian latent allocation model for clustering compositional data with application to the Great Barrier Reef
Relative abundance is a common metric for estimating species composition in ecological surveys, reflecting patterns of commonness and rarity in biological assemblages. Measurements of coral reef compositions formed by four communities along Australia's Great Barrier Reef (GBR), gathered between 2012 and 2017, are the focus of this paper. We undertake the task of finding clusters of transect locations with similar community composition and investigate changes in clustering dynamics over time. During these years, an unprecedented sequence of extreme weather events (cyclones and coral bleaching) impacted the 58 surveyed locations. The dependence between the constituent parts of a composition presents a challenge for existing multivariate clustering approaches. In this paper, we introduce a finite mixture of Dirichlet distributions with group-specific parameters, where cluster memberships are dictated by unobserved latent variables. Inference is carried out in a Bayesian framework, and MCMC strategies are outlined for sampling from the posterior distribution. Simulation studies are presented to illustrate the performance of the model in a controlled setting. The application of the model to the 2012 coral reef data reveals that clusters were spatially distributed in similar ways across reefs, indicating that wave exposure may be at the origin of coral reef community composition. The number of clusters estimated by the model decreased from four in 2012 to two from 2014 through 2017. Posterior probabilities of transects being allocated to the same cluster increase substantially over time, suggesting a potential homogenization of community composition across the whole GBR. The Bayesian model highlights the diversity of community composition within a coral reef and the rapid changes across large spatial scales that may contribute to undermining the future of the GBR's biodiversity.
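To make the latent allocation idea concrete, the following is a minimal sketch of how a single composition could be assigned to a cluster under a finite mixture of Dirichlet distributions. It is not the paper's sampler: the function names, the two-cluster setup, the mixture weights, and the Dirichlet parameters are all hypothetical values chosen for illustration, and the cluster parameters are treated as known rather than sampled.

```python
import math
import random


def dirichlet_logpdf(x, alpha):
    """Log density of Dirichlet(alpha) at a composition x whose parts sum to 1."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def allocation_probs(x, weights, alphas):
    """Conditional probability of each cluster label for composition x,
    given mixture weights and cluster-specific Dirichlet parameters."""
    log_terms = [math.log(w) + dirichlet_logpdf(x, a)
                 for w, a in zip(weights, alphas)]
    m = max(log_terms)  # subtract the maximum for numerical stability
    unnorm = [math.exp(t - m) for t in log_terms]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def sample_allocation(x, weights, alphas, rng):
    """Draw one latent cluster label from its conditional distribution."""
    probs = allocation_probs(x, weights, alphas)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical inputs: two clusters and a four-part composition.
weights = [0.5, 0.5]
alphas = [[20.0, 5.0, 3.0, 2.0], [2.0, 3.0, 5.0, 20.0]]
x = [0.70, 0.15, 0.10, 0.05]

probs = allocation_probs(x, weights, alphas)
z = sample_allocation(x, weights, alphas, random.Random(0))
```

In a full MCMC scheme, a draw like this would alternate with updates of the mixture weights and the cluster-specific parameters; how those updates are performed is not specified in the abstract and is left out here.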