Knowledge-Aware Graph-Enhanced GPT-2 for Dialogue State Tracking
Dialogue State Tracking is a crucial part of multi-domain task-oriented dialogue systems, responsible for extracting information from user utterances. We present a novel architecture that uses the generative model GPT-2 to generate slot values causally, one at a time, while Graph Attention Networks enable information exchange between slots and thereby exploit inter-slot relations such as correlations. Our model achieves 54.86% joint accuracy on MultiWOZ 2.0 and retains up to 50.43% under sparse supervision, where only session-level annotations (14.3% of the full training set) are used. We conduct detailed analyses to demonstrate the significance of graph models in this task, and show experimentally that the proposed graph modules indeed help to capture more inter-slot relations.
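To make the idea of inter-slot information exchange concrete, the following is a minimal sketch of one single-head graph attention step over slot nodes, following the general Graph Attention Network formulation rather than the specific modules of this paper. The slot names, feature values, adjacency, projection matrix `W`, and attention vector `a` are all hypothetical and chosen only for illustration; in a trained model `W` and `a` would be learned.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    # W is a list of rows (out_dim x in_dim); h is a list of length in_dim.
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def graph_attention(H, adj, W, a):
    """One single-head graph attention step.

    H:   node features, one vector per slot
    adj: adj[i] lists the neighbours of node i (including i itself)
    W:   shared projection matrix (out_dim x in_dim)
    a:   attention vector of length 2 * out_dim
    Returns (updated_features, attention_weights).
    """
    Z = [matvec(W, h) for h in H]  # project every node
    out, weights = [], []
    for i, nbrs in enumerate(adj):
        # Unnormalised score for each neighbour j: LeakyReLU(a . [z_i || z_j])
        scores = [
            leaky_relu(sum(p * q for p, q in zip(a, Z[i] + Z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighbourhood of node i
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        # Updated feature: attention-weighted sum of projected neighbours
        out.append([
            sum(al * Z[j][d] for al, j in zip(alpha, nbrs))
            for d in range(len(Z[i]))
        ])
        weights.append(dict(zip(nbrs, alpha)))
    return out, weights


# Hypothetical slots: 0 = hotel-area, 1 = restaurant-area, 2 = taxi-destination
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adj = [[0, 1], [0, 1, 2], [1, 2]]      # assumed slot graph with self-loops
W = [[1.0, 0.0], [0.0, 1.0]]           # identity projection, for readability
a = [0.5, -0.5, 0.25, 0.25]            # arbitrary attention vector

out, weights = graph_attention(H, adj, W, a)
for i, w in enumerate(weights):
    print(i, {j: round(v, 3) for j, v in w.items()})
```

Each slot's updated representation is a convex combination of its neighbours' projected features, so a slot such as taxi-destination can draw on the state of a correlated slot such as restaurant-area. How such a step would be coupled with GPT-2's causal generation is not specified in the abstract and is left out of this sketch.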