Modelling Numerical Systems with Two Distinct Labelled Output Classes
We present a new method of modelling numerical systems where there are two distinct output solution classes, for example tipping points or bifurcations. Gaussian process emulation is a useful tool in understanding these complex systems and provides estimates of uncertainty, but we aim to include systems where there are discontinuities between the two output solutions. Due to continuity assumptions, we consider current methods of classification to split our input space into two output regions. Classification and logistic regression methods currently rely on drawing from an independent Bernoulli distribution, which neglects any information known in the neighbouring area. We build on this by including correlation between our input points. Gaussian processes are still a vital element, but used in latent space to model the two regions. Using the input values and an associated output class label, the latent variable is estimated using MCMC sampling and a unique likelihood. A threshold (usually at zero) defines the boundary. We apply our method to a motivating example provided by the hormones associated with the reproductive system in mammals, where the two solutions are associated with high and low rates of reproduction.
READ FULL TEXT