Deep Learning for Technical Document Classification
In large technology companies, the requirements for managing and organizing technical documents created by engineers and managers in supporting relevant decision making have increased dramatically in recent years, which has led to a higher demand for more scalable, accurate, and automated document classification. Prior studies have primarily focused on processing text for classification and small-scale databases. This paper describes a novel multimodal deep learning architecture, called TechDoc, for technical document classification, which utilizes both natural language and descriptive images to train hierarchical classifiers. The architecture synthesizes convolutional neural networks and recurrent neural networks through an integrated training process. We applied the architecture to a large multimodal technical document database and trained the model for classifying documents based on the hierarchical International Patent Classification system. Our results show that the trained neural network presents a greater classification accuracy than those using a single modality and several earlier text classification methods. The trained model can potentially be scaled to millions of real-world technical documents with both text and figures, which is useful for data and knowledge management in large technology companies and organizations.
READ FULL TEXT