Extraction of evidence tables from abstracts of randomized clinical trials using a maximum entropy classifier and global constraints
Systematic use of the published results of randomized clinical trials is increasingly important in evidence-based medicine. In order to collate and analyze the results from potentially numerous trials, evidence tables are used to represent trials concerning a set of interventions of interest. An evidence table has columns for the patient group, for each of the interventions being compared, for the criterion for the comparison (e.g. proportion who survived after 5 years from treatment), and for each of the results. Currently, it is a labour-intensive activity to read each published paper and extract the information for each field in an evidence table. There have been some NLP studies investigating how some of the features from papers can be extracted, or at least the relevant sentences identified. However, there is a lack of an NLP system for the systematic extraction of each item of information required for an evidence table. We address this need by a combination of a maximum entropy classifier, and integer linear programming. We use the later to handle constraints on what is an acceptable classification of the features to be extracted. With experimental results, we demonstrate substantial advantages in using global constraints (such as the features describing the patient group, and the interventions, must occur before the features describing the results of the comparison).
READ FULL TEXT