ChemNet: A Transferable and Generalizable Deep Neural Network for Small-Molecule Property Prediction
With access to large datasets, deep neural networks (DNN) have achieved human-level accuracy in image and speech recognition tasks. However, in chemistry, availability of large standardized and labelled datasets is scarce, and many chemical properties of research interest, chemical data is inherently small and fragmented. In this work, we explore transfer learning techniques in conjunction with the existing Chemception CNN model, to create a transferable and generalizable deep neural network for small-molecule property prediction. Our latest model, ChemNet learns in a semi-supervised manner from inexpensive labels computed from the ChEMBL database. When fine-tuned to the Tox21, HIV and FreeSolv dataset, which are 3 separate chemical properties that ChemNet was not originally trained on, we demonstrate that ChemNet exceeds the performance of existing Chemception models and other contemporary DNN models. Furthermore, as ChemNet has been pre-trained on a large diverse chemical database, it can be used as a general-purpose plug-and-play deep neural network for the prediction of novel small-molecule chemical properties.
READ FULL TEXT