Gaussian process aided function comparison using noisy scattered data
This work proposes a new nonparametric method for comparing the mean functions underlying two noisy datasets. The motivation stems from the problem of comparing wind turbine power curves. Wind turbine data pose two particular difficulties: the two datasets do not share the same input points, and an individual dataset has no replicated responses at any given input point. Our proposed method estimates the underlying function of each data sample using a Gaussian process model. The posterior covariance is then used to build a confidence band for the difference between the mean functions. The band serves both as the basis for a hypothesis test and as a means of identifying the regions of the input space where the functions are statistically different. This identification of difference regions distinguishes the method from many existing approaches, which either conduct a hypothesis test without identifying regions of difference or provide only a point estimate (metric) of the overall functional difference. In practice, understanding the difference regions can lead to further insights and help devise better control and maintenance strategies for the turbine. The merit of the proposed method is demonstrated using three simulation studies and four real wind turbine datasets.
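The idea of comparing two functions through a band on their difference can be illustrated with a minimal sketch. It is not the authors' exact construction: it assumes a zero-mean Gaussian process with a squared-exponential kernel, fixed (not estimated) hyperparameters, independent datasets, and a simultaneous band calibrated by Monte Carlo draws from the posterior of the difference. The function names, parameter values, and synthetic data are all hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise_var=0.05, **kern):
    """Posterior mean and covariance of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train, **kern) + noise_var * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test, **kern)
    Kss = rbf_kernel(x_test, x_test, **kern)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # K^{-1} y
    v = np.linalg.solve(L, Ks)                                  # L^{-1} Ks
    return Ks.T @ alpha, Kss - v.T @ v

def difference_band(x1, y1, x2, y2, x_test, level=0.95,
                    n_draws=2000, seed=0, **gp_args):
    """Simultaneous band for f1 - f2 on x_test (Monte Carlo calibration)."""
    m1, c1 = gp_posterior(x1, y1, x_test, **gp_args)
    m2, c2 = gp_posterior(x2, y2, x_test, **gp_args)
    diff_mean = m1 - m2
    diff_cov = c1 + c2                      # assumes independent datasets
    diff_cov = 0.5 * (diff_cov + diff_cov.T)
    sd = np.sqrt(np.clip(np.diag(diff_cov), 1e-12, None))
    # Draw zero-mean deviations via an eigendecomposition, which tolerates
    # slightly negative eigenvalues caused by round-off.
    w, V = np.linalg.eigh(diff_cov)
    root = V * np.sqrt(np.clip(w, 0.0, None))
    rng = np.random.default_rng(seed)
    draws = rng.standard_normal((n_draws, len(x_test))) @ root.T
    # Critical value: quantile of the maximum standardized deviation.
    crit = np.quantile(np.max(np.abs(draws) / sd, axis=1), level)
    lower, upper = diff_mean - crit * sd, diff_mean + crit * sd
    differs = (lower > 0) | (upper < 0)     # band excludes zero
    return diff_mean, lower, upper, differs

# Synthetic example: scattered inputs that differ between the datasets,
# no replicates, and a localized difference around x = 7.
rng = np.random.default_rng(1)
x1 = np.sort(rng.uniform(0, 10, 80))
x2 = np.sort(rng.uniform(0, 10, 80))
y1 = np.sin(x1) + 0.2 * rng.standard_normal(80)
y2 = (np.sin(x2) + 2.0 * np.exp(-0.5 * ((x2 - 7.0) / 0.7) ** 2)
      + 0.2 * rng.standard_normal(80))

x_test = np.linspace(0, 10, 101)
diff_mean, lower, upper, differs = difference_band(
    x1, y1, x2, y2, x_test, noise_var=0.04, length_scale=1.0)

reject_equal = bool(differs.any())          # global test: any region differs
print("reject equality:", reject_equal)
print("flagged inputs:", x_test[differs])
```

In this sketch the global hypothesis of equal mean functions is rejected whenever the band excludes zero anywhere on the grid, and the flagged grid points approximate the difference regions. A practical analysis would estimate the kernel hyperparameters and noise variance from the data rather than fixing them.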