M^2-MedDialog: A Dataset and Benchmarks for Multi-domain Multi-service Medical Dialogues

09/01/2021
by   Guojun Yan, et al.
19

Medical dialogue systems (MDSs) aim to assist doctors and patients with a range of professional medical services, i.e., diagnosis, consultation, and treatment. However, one-stop MDS is still unexplored because: (1) no dataset has so large-scale dialogues contains both multiple medical services and fine-grained medical labels (i.e., intents, slots, values); (2) no model has addressed a MDS based on multiple-service conversations in a unified framework. In this work, we first build a Multiple-domain Multiple-service medical dialogue (M^2-MedDialog)dataset, which contains 1,557 conversations between doctors and patients, covering 276 types of diseases, 2,468 medical entities, and 3 specialties of medical services. To the best of our knowledge, it is the only medical dialogue dataset that includes both multiple medical services and fine-grained medical labels. Then, we formulate a one-stop MDS as a sequence-to-sequence generation problem. We unify a MDS with causal language modeling and conditional causal language modeling, respectively. Specifically, we employ several pretrained models (i.e., BERT-WWM, BERT-MED, GPT2, and MT5) and their variants to get benchmarks on M^2-MedDialog dataset. We also propose pseudo labeling and natural perturbation methods to expand M2-MedDialog dataset and enhance the state-of-the-art pretrained models. We demonstrate the results achieved by the benchmarks so far through extensive experiments on M2-MedDialog. We release the dataset, the code, as well as the evaluation scripts to facilitate future research in this important research direction.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset