Persistence Homology of TEDtalk: Do Sentence Embeddings Have a Topological Shape?

03/25/2021
by Shouman Das, et al.

Topological data analysis (TDA) has recently emerged as a technique for extracting meaningful, discriminative features from high-dimensional data. In this paper, we investigate whether TDA can improve the classification accuracy of public speaking ratings. We compute persistence image vectors for the sentence embeddings of TEDtalk data and feed these vectors as additional inputs to our machine learning models. Our result is negative: this topological information does not significantly improve model accuracy, and in some cases it makes the accuracy slightly worse than the baseline. From our results, we cannot conclude that the topological shape of the sentence embeddings helps train a better model for public speaking rating.
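
To make the pipeline described in the abstract concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that each talk is treated as a point cloud of sentence embeddings, and it restricts itself to 0-dimensional homology so that it needs only NumPy. The full method presumably relies on a TDA library and may use higher homology dimensions. The function names, the random stand-in data, and the parameters `resolution` and `sigma` are all hypothetical.

```python
import numpy as np

def h0_deaths(points):
    """Death times of 0-dimensional Vietoris-Rips classes.

    These equal the edge lengths of a minimum spanning tree,
    computed here with Prim's algorithm on the distance matrix.
    """
    n = len(points)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()  # cheapest known edge from the tree to each vertex
    deaths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        deaths.append(candidates[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(deaths))

def persistence_image_h0(deaths, resolution=20, sigma=0.5):
    """1-D persistence image for H0 (all births are 0, so only persistence varies).

    Each point contributes a Gaussian bump weighted linearly by its persistence.
    """
    grid = np.linspace(0.0, float(deaths.max()), resolution)
    bumps = np.exp(-((grid[None, :] - deaths[:, None]) ** 2) / (2 * sigma ** 2))
    return (deaths[:, None] * bumps).sum(axis=0)

# Assumption: random vectors stand in for real data.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(30, 16))   # one talk: 30 sentences, 16-dim embeddings
baseline_features = rng.normal(size=8)   # the model's existing input features

deaths = h0_deaths(embeddings)
image = persistence_image_h0(deaths)

# The topological vector is appended to the baseline features as extra model input.
augmented = np.concatenate([baseline_features, image])
print(augmented.shape)  # (28,)
```

The concatenation in the last step is the "additional inputs" idea: the downstream classifier is unchanged and only its feature vector grows, which is what lets accuracy with and without the topological features be compared directly.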
