Capturing Evolution in Word Usage: Just Add More Clusters?
The way the words are used evolves through time, mirroring cultural or technological evolution of society. Semantic change detection is the task of detecting and analysing word evolution in textual data, even in short periods of time. This task has recently become a popular task in the NLP community. In this paper we focus on a new set of methods relying on contextualised embedding, a type of semantic modelling that revolutionised the field recently. We leverage the ability of the transformer-based BERT model to generate contextualised embeddings suitable to detect semantic change of words across time. We compare our results to other approaches from the literature in a common setting in order to establish strengths and weaknesses for each of them. We also propose several ideas for improvements, managing to drastically improve the performance of existing approaches.
READ FULL TEXT