Confidence Intervals for Error Rates in Matching Tasks: Critical Review and Recommendations
Matching algorithms are commonly used to predict matches between items in a collection. For example, in 1:1 face verification, a matching algorithm predicts whether two face images depict the same person. Accurately assessing the uncertainty of the error rates of such algorithms can be challenging when data are dependent and error rates are low, two aspects that have been often overlooked in the literature. In this work, we review methods for constructing confidence intervals for error rates in matching tasks such as 1:1 face verification. We derive and examine the statistical properties of these methods and demonstrate how coverage and interval width vary with sample size, error rates, and degree of data dependence using both synthetic and real-world datasets. Based on our findings, we provide recommendations for best practices for constructing confidence intervals for error rates in matching tasks.
READ FULL TEXT