Knowledge mining of unstructured information: application to cyber-domain
Cyber intelligence is widely and abundantly available in numerous open online sources with reports on vulnerabilities and incidents. This constant stream of noisy information requires new tools and techniques if it is to be used for the benefit of analysts and investigators in various organizations. In this paper we present and implement a novel knowledge graph and knowledge mining framework for extracting relevant information from free-form text about incidents in the cyber domain. Our framework includes a machine learning based pipeline as well as crawling methods for generating graphs of entities, attackers and the related information with our non-technical cyber ontology. We test our framework on publicly available cyber incident datasets to evaluate the accuracy of our knowledge mining methods as well as the usefulness of the framework in the use of cyber analysts. Our results show analyzing the knowledge graph constructed using the novel framework, an analyst can infer additional information from the current cyber landscape in terms of risk to various entities and the propagation of risk between industries and countries. Expanding the framework to accommodate more technical and operational level information can increase the accuracy and explainability of trends and risk in the knowledge graph.
READ FULL TEXT