Geostatistical models for zero-inflated data and extreme values
Understanding the spatial distribution of animals throughout their life phases, and how that distribution is influenced by environmental covariates, is a fundamental requirement for the effective management of animal populations. Several geostatistical models have been proposed in the literature; however, the data often contain an excess of zeros and extreme values, which can lead to unreliable estimates when these features are ignored in the modelling process. To deal with these issues, we propose a point-referenced zero-inflated model that jointly describes the probability of presence and the positive observations, and a point-referenced generalised Pareto model for the extremes. We then combine the results of these two models to obtain spatial predictions of the variable of interest. We follow a Bayesian approach, and inference is carried out with the R-INLA package in R. We illustrate the proposed methodology through an analysis of the spatial distribution of sardine egg density (eggs/m^3). The results show that the combined model for zero-inflated and extreme values improves spatial prediction accuracy. We therefore conclude that the data structure should be accounted for in the modelling process. Moreover, the hierarchical model considered here is applicable to many ecological problems and to other contexts.
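The two-part structure described in the abstract can be sketched in equations. This is an illustrative formulation only and not necessarily the one used in the paper: the Gamma likelihood for the bulk of the positive densities, the logit and log links, the covariate vector x(s), the spatial random fields w_z(s) and w_y(s), the threshold u, the constant shape parameter, and the combination rule based on the law of total expectation are all assumptions introduced here.

```latex
% Illustrative sketch; every distributional choice below is an assumption.
% y(s): egg density at location s; z(s) = 1 if y(s) > 0, and 0 otherwise.
\begin{align*}
z(s) &\sim \mathrm{Bernoulli}\{\pi(s)\}, &
\operatorname{logit}\pi(s) &= \alpha_z + x(s)^\top \beta_z + w_z(s), \\
y(s) \mid z(s) = 1 &\sim \mathrm{Gamma}\{\mu(s), \phi\}, &
\log \mu(s) &= \alpha_y + x(s)^\top \beta_y + w_y(s), \\
y(s) - u \mid y(s) > u &\sim \mathrm{GPD}\{\sigma(s), \xi\}, &
\Pr\{y(s) - u \le t \mid y(s) > u\} &= 1 - \left(1 + \frac{\xi t}{\sigma(s)}\right)^{-1/\xi}.
\end{align*}
% The GPD distribution function above holds for t >= 0 and \xi \neq 0.
% One possible combination (requires \xi < 1 so that the GPD mean exists);
% p_u(s) = \Pr\{y(s) > u \mid y(s) > 0\} is the exceedance probability.
\begin{equation*}
\mathrm{E}\{y(s)\} = \pi(s)\Big[\{1 - p_u(s)\}\,\mathrm{E}\{y(s) \mid 0 < y(s) \le u\}
  + p_u(s)\Big\{u + \frac{\sigma(s)}{1 - \xi}\Big\}\Big].
\end{equation*}
```

Under this sketch, the zero-inflated part supplies the presence probability and the behaviour of densities below the threshold, while the generalised Pareto part supplies the mean excess above it, so the prediction at each location weights the two by the probability of exceeding u.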