Aggregate-Eliminate-Predict: Detecting Adverse Drug Events from Heterogeneous Electronic Health Records
We study the problem of detecting adverse drug events in electronic healthcare records. The challenge in this work is to aggregate heterogeneous data types involving diagnosis codes, drug codes, as well as lab measurements. An earlier framework proposed for the same problem demonstrated promising predictive performance for the random forest classifier by using only lab measurements as data features. We extend this framework, by additionally including diagnosis and drug prescription codes, concurrently. In addition, we employ a recursive feature selection mechanism on top, that extracts the top-k most important features. Our experimental evaluation on five medical datasets of adverse drug events and six different classifiers, suggests that the integration of these additional features provides substantial and statistically significant improvements in terms of AUC, while employing medically relevant features.
READ FULL TEXT