Business Taxonomy Construction Using Concept-Level Hierarchical Clustering
Business taxonomies are indispensable tools for investors to do equity research and make professional decisions. However, to identify the structure of industry sectors in an emerging market is challenging for two reasons. First, existing taxonomies are designed for mature markets, which may not be the appropriate classification for small companies with innovative business models. Second, emerging markets are fast-developing, thus the static business taxonomies cannot promptly reflect the new features. In this article, we propose a new method to construct business taxonomies automatically from the content of corporate annual reports. Extracted concepts are hierarchically clustered using greedy affinity propagation. Our method requires less supervision and is able to discover new terms. Experiments and evaluation on the Chinese National Equities Exchange and Quotations (NEEQ) market show several advantages of the business taxonomy we build. Our results provide an effective tool for understanding and investing in the new growth companies.
READ FULL TEXT