Bias Reduction via End-to-End Shift Learning: Application to Citizen Science
Citizen science projects are successful at gathering rich datasets for various applications. Nevertheless, the data collected by citizen scientists are often biased, aligned more with the citizens' preferences than with scientific objectives. We propose the Shift Compensation Network (SCN), an end-to-end learning scheme that learns the shift from the scientific objectives to the biased data while compensating for the shift by re-weighting the training data. Applied to bird observational data from the citizen science project eBird, SCN quantifies the data distribution shift and outperforms supervised learning models that do not address the data bias. Compared with competing models in the context of covariate shift, SCN further shows advantages in both effectiveness and the ability to handle massive high-dimensional data.
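To make the re-weighting idea concrete, here is a minimal sketch of a standard covariate-shift correction, not the SCN architecture itself. It assumes a one-dimensional feature, synthetic Gaussian data, and a logistic-regression discriminator that separates biased samples from target samples; the discriminator's odds then serve as importance weights for the biased data. All function and variable names are hypothetical and chosen for illustration only.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_discriminator(biased, target, lr=0.5, epochs=300):
    """Fit 1-D logistic regression by batch gradient descent.

    Label 1 marks a sample from the target (scientific objective)
    distribution, label 0 a sample from the biased training distribution.
    """
    a, b = 0.0, 0.0
    data = [(x, 0.0) for x in biased] + [(x, 1.0) for x in target]
    n = len(data)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, y in data:
            err = sigmoid(a * x + b) - y
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def importance_weights(biased, target, a, b):
    """Estimate p_target(x) / p_biased(x) from the discriminator odds."""
    prior = len(biased) / len(target)  # corrects for unequal sample sizes
    weights = []
    for x in biased:
        d = sigmoid(a * x + b)
        weights.append(prior * d / (1.0 - d))
    return weights


# Synthetic assumption: biased data centred at 0, target data centred at 1.
random.seed(0)
biased = [random.gauss(0.0, 1.0) for _ in range(500)]
target = [random.gauss(1.0, 1.0) for _ in range(500)]

a, b = fit_discriminator(biased, target)
weights = importance_weights(biased, target, a, b)

unweighted_mean = sum(biased) / len(biased)
weighted_mean = sum(w * x for w, x in zip(weights, biased)) / sum(weights)
print(f"unweighted mean of biased data: {unweighted_mean:.2f}")
print(f"re-weighted mean of biased data: {weighted_mean:.2f}")
```

In this toy setting the re-weighted mean of the biased sample moves toward the target mean, which is the effect that re-weighting the training loss relies on. A two-stage pipeline like this one estimates the weights first and trains the predictor afterwards, whereas the abstract describes SCN as learning the shift and the predictor end to end.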