Translating Domain-Specific Expressions in Knowledge Bases with Neural Machine Translation
This paper focuses on the translation of domain-specific expressions represented in semantically structured resources, such as ontologies or knowledge graphs. To make knowledge accessible across language borders, these resources need to be translated into different languages. The challenge of translating labels or terminological expressions represented in ontologies lies in their highly specific vocabulary and the lack of contextual information that could guide a machine translation system to translate ambiguous words into the targeted domain. To address these challenges, we train systems and translate terminological expressions in the medical and financial domains with statistical machine translation (SMT) as well as neural machine translation (NMT) methods. We evaluate the translation quality of domain-specific expressions with translation systems trained on a generic dataset and experiment with domain adaptation using terminological expressions. Furthermore, we perform experiments on the injection of external knowledge into the translation systems. In these experiments, we observed a clear advantage of NMT methods over SMT in domain adaptation and terminology injection. Nevertheless, because the terminological expressions are highly specific and unique, subword segmentation within NMT does not outperform a word-based neural translation model.