Binary classification models with "Uncertain" predictions
Binary classification models that assign probabilities to categories, such as "the tissue is 75% likely to be toxic", are well understood statistically, but their utility as an input to decision making is less well explored. We argue that users need to know which outcome is the most probable, how likely that outcome is to be true and, in addition, whether the model is capable enough to provide an answer at all. It is this last case, where the potential outcomes of the model explicitly include "don't know", that is addressed in this paper. Including this outcome would better separate those predictions that can lead directly to a decision from those where more data is needed. An "Uncertain" answer from a model is similar to a human reply of "don't know" or "50:50". In the examples we refer to earlier, where the models give a "more true than not" answer, this would translate to actions such as "operate on tumour" or "remove compound from use". Where the models judge the result "Uncertain", the practical decision might be "carry out more detailed laboratory testing of compound" or "commission new tissue analyses". The paper presents several examples in which we first analyse the effect of introducing the "Uncertain" outcome, then present a methodology for separating "Uncertain" from binary predictions and, finally, provide arguments for its use in practice.
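As a rough illustration of the idea of a third outcome, the sketch below maps a predicted probability to "Positive", "Negative" or "Uncertain" using a symmetric band around 0.5. This is not the methodology of the paper, which the abstract does not specify. The function name, the label strings and the band width of 0.1 are all hypothetical choices made for this example.

```python
def classify_with_uncertain(p_positive: float, band: float = 0.1) -> str:
    """Map a predicted probability to one of three outcomes.

    Assumption (not from the paper): a prediction is "Uncertain" when the
    probability lies within `band` of 0.5; otherwise the more probable
    class is returned.
    """
    if not 0.0 <= p_positive <= 1.0:
        raise ValueError("p_positive must be a probability in [0, 1]")
    if abs(p_positive - 0.5) <= band:
        return "Uncertain"
    return "Positive" if p_positive > 0.5 else "Negative"


# Hypothetical probabilities, e.g. "the tissue is 75% likely to be toxic".
for p in (0.75, 0.55, 0.20):
    print(p, classify_with_uncertain(p))
```

In a decision setting, the "Positive" and "Negative" outcomes would lead directly to an action, while "Uncertain" would trigger the collection of more data. The width of the band is a design choice that trades the number of deferred cases against the reliability of the remaining binary answers.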