Shape of Elephant: Study of Macro Properties of Word Embeddings Spaces

06/13/2021
by Alexey Tikhonov, et al.

Pre-trained word representations have become a key component in many NLP tasks. However, the global geometry of word embeddings remains poorly understood. In this paper, we demonstrate that a typical word embedding cloud is shaped like a high-dimensional simplex with interpretable vertices, and we propose a simple yet effective method for enumerating these vertices. We show that the proposed method can detect and describe the vertices of the simplex for GloVe and fastText spaces.
