NorNE: Annotating Named Entities for Norwegian

11/27/2019
by Fredrik Jørgensen et al.

This paper presents NorNE, a manually annotated corpus of named entities that extends the annotation of the existing Norwegian Dependency Treebank. The corpus contains around 600,000 tokens drawn from both written varieties of Norwegian (Bokmål and Nynorsk) and annotates a rich set of entity types, including persons, organizations, locations, geo-political entities, products, and events, in addition to a class corresponding to nominals derived from a name. We present details on the annotation effort, guidelines, and inter-annotator agreement, along with an experimental analysis of the corpus using a neural sequence labeling architecture.
