Addressing the Barriers to Interlingual Rule-based Machine Translation with a Concept Specification and Abstraction Semantic Representation
Interlingual machine translation offers the prospect of better preservation of meaning and requires fewer language/translation models than statistical machine translation approaches. However, it lacks popularity primarily because of the extensive training and labour required to define the language rules. In the present work, we introduce a semantic representation designed to address this. The novel representation treats all bits of meaning as individual concepts that refine or further specify one another to build a network that relates entities in space and time. Its "specifying" nature replaces the need for almost all case roles found in typical thematic-role-based representations. Also, the representation can encapsulate propositions and treat them as new concepts, meaning that concepts can be defined in terms of other concepts. This allows the approach to abstract away some technical linguistic terms and ontological details. The proposed natural language generation, parsing, and translation strategies using this representation are also amenable to probabilistic modeling and thus to learning directly from example data.