KDSL: a Knowledge-Driven Supervised Learning Framework for Word Sense Disambiguation
We propose KDSL, a new word sense disambiguation (WSD) framework that utilizes knowledge to automatically generate sense-labeled data for supervised learning. First, from WordNet, we au- tomatically construct a semantic knowledge base called DisDict, which provides refined feature words that highlight the differences among word senses, i.e., synsets. Second, we automatically generate new sense-labeled data by DisDict from unlabeled corpora. Third, these generated data, together with manually labeled data, are fed to a supervised learning neural network to model the semantic relations among synsets, feature words and their contexts. Jointly with the supervised learning process, we also implement unsupervised learning on unlabeled data as an auxiliary task. The experimental results show that KDSL outperforms several representative state-of-the-art methods on various major benchmarks. Interestingly, it performs relatively well even when manually labeled data is unavailable, thus provides a new promising backoff strategy for WSD.
READ FULL TEXT