Predicting Research Trends From Arxiv
We perform trend detection on two datasets of arXiv papers, drawn from its machine learning (cs.LG) and natural language processing (cs.CL) categories. Our approach is bottom-up: we first rank papers by their normalized citation counts, then group the top-ranked papers into categories based on the tasks they pursue and the methods they use. We then analyze the resulting topics. We find that the dominant paradigms in cs.CL revolve around natural language generation problems, while those in cs.LG revolve around reinforcement learning and adversarial principles. By extrapolation, we predict that these topics will remain leading problems and approaches in their fields in the short and mid term.
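The ranking step can be illustrated with a minimal sketch. The abstract does not specify how citation counts are normalized, so the sketch assumes a z-score computed within each publication period, which keeps older papers from dominating simply because they have had more time to accumulate citations. The field names (`title`, `period`, `citations`), the function name, and the toy data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def rank_by_normalized_citations(papers):
    """Rank papers by a per-period z-score of their citation counts.

    ASSUMPTION: z-score normalization within each publication period;
    the abstract only says "normalized citation counts".
    """
    # Collect raw citation counts per publication period.
    counts_by_period = defaultdict(list)
    for paper in papers:
        counts_by_period[paper["period"]].append(paper["citations"])

    # Mean and population standard deviation for each period.
    stats = {
        period: (mean(counts), pstdev(counts))
        for period, counts in counts_by_period.items()
    }

    scored = []
    for paper in papers:
        mu, sigma = stats[paper["period"]]
        # A period with identical counts has sigma == 0; score those as 0.
        z = (paper["citations"] - mu) / sigma if sigma > 0 else 0.0
        scored.append((z, paper["title"]))

    # Highest normalized score first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Hypothetical toy data: an older and a newer publication period.
papers = [
    {"title": "A", "period": "2017-01", "citations": 100},
    {"title": "B", "period": "2017-01", "citations": 10},
    {"title": "C", "period": "2017-01", "citations": 10},
    {"title": "D", "period": "2018-06", "citations": 9},
    {"title": "E", "period": "2018-06", "citations": 2},
    {"title": "F", "period": "2018-06", "citations": 1},
]

ranked = rank_by_normalized_citations(papers)
for score, title in ranked:
    print(f"{title}: {score:.2f}")
```

In this toy example, paper D has fewer raw citations than paper B but ranks above it, because D stands out within its own, more recent period. The top-ranked papers would then be grouped by task and method, a step that this sketch does not attempt to automate.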