Node Centrality Metrics for Hotspots Analysis in Telecom Big Data
In this work, we are interested in the applications of big data in the telecommunication domain, analysing two weeks of datasets provided by Telecom Italia for Milan and Trento. Our objective is to identify hotspots which are places with very high communication traffic relative to others and measure the interaction between them. We model the hotspots as nodes in a graph and then apply node centrality metrics that quantify the importance of each node. We review five node centrality metrics and show that they can be divided into two families: the first family is composed of closeness and betweenness centrality whereas the second family consists of degree, PageRank and eigenvector centrality. We then proceed with a statistical analysis in order to evaluate the consistency of the results over the two weeks. We find out that the ranking of the hotspots under the various centrality metrics remains practically the same with the time for both Milan and Trento. We further identify that the relative difference of the values of the metrics is smaller for PageRank centrality than for closeness centrality and this holds for both Milan and Trento. Finally, our analysis reveals that the variance of the results is significantly smaller for Trento than for Milan.
READ FULL TEXT