Unsupervised Hashtag Retrieval and Visualization for Crisis Informatics

01/18/2018
by Yao Gu, et al.

On social media platforms such as Twitter, hashtags carry substantial semantic information and are easily distinguished from the main text. Exploring and visualizing the space of hashtags in a meaningful way can offer important insights into a dataset, especially in crisis situations. In this demonstration paper, we present a functioning prototype, HashViz, that ingests a corpus of tweets collected in the aftermath of a crisis (such as the Las Vegas mass shooting) and uses the fastText bag-of-tricks semantic embedding algorithm (from Facebook Research) to embed words and hashtags into a vector space. Hashtag vectors obtained in this way can be visualized in 2D using the t-SNE dimensionality reduction algorithm. Although multiple Twitter visualization platforms exist, HashViz is distinguished by being simple, scalable, interactive and portable enough to be deployed on a server for million-tweet corpora collected in the aftermath of arbitrary disasters, without special-purpose installation, technical expertise, manual supervision, or costly investment in software or infrastructure. Despite this simplicity, we show that HashViz offers an intuitive way to summarize, and gain insight into, a developing crisis situation. HashViz is also completely unsupervised, requiring no manual input to go from a raw corpus to a visualization and search interface. Using the recent Las Vegas mass shooting as a case study, we illustrate the potential of HashViz using only a web browser on the client side.
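To make the retrieval idea concrete, the following is a minimal sketch, not the HashViz implementation. It assumes hashtags are pulled out of raw tweet text with a simple regular expression and that, once each hashtag has an embedding vector, related hashtags are found by cosine similarity; the abstract does not state which similarity measure the search interface uses. The function names, the sample tweets, and the three-dimensional `toy_vectors` are all hypothetical stand-ins: in the described pipeline the vectors would come from fastText trained on the tweet corpus, and t-SNE would project them to 2D for display.

```python
import math
import re
from collections import Counter

# Assumption: a hashtag is '#' followed by one or more word characters.
HASHTAG_RE = re.compile(r"#\w+")


def extract_hashtags(tweets):
    """Count lowercased hashtags across an iterable of tweet strings."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(tweet))
    return counts


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_hashtags(query, vectors, k=3):
    """Return the k hashtags most similar to `query`, best match first."""
    q = vectors[query]
    scored = [(cosine(q, v), tag) for tag, v in vectors.items() if tag != query]
    scored.sort(reverse=True)
    return [tag for _, tag in scored[:k]]


# Hypothetical sample tweets, for illustration only.
tweets = [
    "Praying for everyone in Vegas tonight #PrayForVegas #LasVegas",
    "Blood donors needed, lines are long #LasVegas #DonateBlood",
    "Stay safe out there #lasvegas",
]
hashtag_counts = extract_hashtags(tweets)

# Hypothetical stand-in vectors; real ones would be learned by fastText.
toy_vectors = {
    "#lasvegas": [0.9, 0.1, 0.0],
    "#prayforvegas": [0.8, 0.3, 0.1],
    "#donateblood": [0.1, 0.9, 0.2],
}
neighbors = nearest_hashtags("#lasvegas", toy_vectors, k=2)
```

Lowercasing before counting merges variants such as `#LasVegas` and `#lasvegas`, which is one plausible normalization choice rather than something the abstract specifies.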
