COVER: conformational oversampling as data augmentation for molecules

24.03.2020

Published in the Journal of Cheminformatics: we are very happy that this open access article is now available. There will be a lot to celebrate once the #pharminfo group can meet again. Keywords: Deep learning, Toxicity, Imbalanced learning, Upsampling

 

Congratulations go to Jennifer Hemmerich, who is receiving funding from Moltag, and our former colleague Ece Asilar. Well done!

Abstract

Training neural networks with small and imbalanced datasets often leads to overfitting and disregard of the minority class. For predictive toxicology, however, models with a good balance between sensitivity and specificity are needed. In this paper we introduce conformational oversampling as a means to balance and oversample datasets for prediction of toxicity. Conformational oversampling enhances a dataset by generation of multiple conformations of a molecule. These conformations can be used to balance, as well as oversample a dataset, thereby increasing the dataset size without the need of artificial samples. We show that conformational oversampling facilitates training of neural networks and provides state-of-the-art results on the Tox21 dataset.
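To make the balancing idea in the abstract concrete, here is a minimal sketch in Python. It is not the code from the article. The function names, the `majority_conformers` parameter, and the rounding rule are assumptions made for illustration. The sketch only plans how many conformations each molecule would contribute so that the classes end up roughly equal in size. Generating the 3D conformations themselves would be done with a cheminformatics toolkit and is left out here.

```python
from collections import Counter
from math import ceil


def conformers_per_class(labels, majority_conformers=1):
    """Plan how many conformations to use per molecule in each class.

    Hypothetical helper: the largest class gets `majority_conformers`
    conformations per molecule; every other class gets enough per molecule
    to reach (or slightly exceed) the same total number of samples.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    target = max(counts.values()) * majority_conformers
    return {label: ceil(target / n) for label, n in counts.items()}


def oversampled_index(labels, majority_conformers=1):
    """List (molecule_index, conformer_index) pairs for the enlarged dataset.

    Each pair stands for one real conformation of an existing molecule,
    so no artificial (interpolated) samples are introduced.
    """
    plan = conformers_per_class(labels, majority_conformers)
    return [
        (i, c)
        for i, label in enumerate(labels)
        for c in range(plan[label])
    ]


if __name__ == "__main__":
    # Toy labels: 9 non-toxic (0) molecules and 3 toxic (1) molecules.
    toy_labels = [0] * 9 + [1] * 3
    print(conformers_per_class(toy_labels))
    print(len(oversampled_index(toy_labels, majority_conformers=2)))
```

With `majority_conformers=1` the plan only balances the classes. Larger values balance and oversample at the same time, which corresponds to the two uses described in the abstract.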

Please find the article at https://doi.org/10.1186/s13321-020-00420-z
