NorNE: Annotating Named Entities for Norwegian
This paper presents NorNE, a manually annotated corpus of named entities that extends the annotation of the existing Norwegian Dependency Treebank. The corpus contains around 600,000 tokens drawn from both varieties of written Norwegian (Bokmål and Nynorsk) and annotates a rich set of entity types, including persons, organizations, locations, geo-political entities, products, and events, in addition to a class corresponding to nominals derived from a name. We present details on the annotation effort, the guidelines, and inter-annotator agreement, along with an experimental analysis of the corpus using a neural sequence labeling architecture.
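As an illustration of how token-level entity annotation of this kind is commonly fed to a sequence labeler, the sketch below converts entity spans into BIO tags. It is a minimal example under stated assumptions, not the corpus's actual format: the short label names (`PER`, `ORG`, `LOC`, `GPE`, `PROD`, `EVT`, `DRV`), the `spans_to_bio` helper, and the example sentence are all hypothetical and chosen only to mirror the entity types listed above.

```python
from typing import List, Tuple

# Hypothetical short labels for the entity types named in the abstract;
# the tag names actually used in the corpus may differ.
ENTITY_TYPES = {"PER", "ORG", "LOC", "GPE", "PROD", "EVT", "DRV"}

# (start token index, end token index exclusive, label)
Span = Tuple[int, int, str]


def spans_to_bio(num_tokens: int, spans: List[Span]) -> List[str]:
    """Convert non-overlapping entity spans into one BIO tag per token."""
    tags = ["O"] * num_tokens
    for start, end, label in spans:
        if label not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {label}")
        if not (0 <= start < end <= num_tokens):
            raise ValueError(f"span out of range: ({start}, {end})")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: ({start}, {end})")
        # First token of the entity gets B-, the rest get I-.
        tags[start] = "B-" + label
        for i in range(start + 1, end):
            tags[i] = "I-" + label
    return tags


# Invented example sentence, not taken from the corpus.
tokens = ["Erna", "Solberg", "besøkte", "Bergen", "."]
spans = [(0, 2, "PER"), (3, 4, "GPE")]
for token, tag in zip(tokens, spans_to_bio(len(tokens), spans)):
    print(f"{token}\t{tag}")
```

The flat BIO encoding assumes entities do not nest or overlap, which is why the helper rejects overlapping spans instead of silently overwriting tags.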