MolCompass

Author(s)
Sergey Sosnin
Abstract

The exponential growth of data is challenging for humans because their ability to analyze data is limited. Especially in chemistry, there is a demand for tools that can visualize molecular datasets in a convenient graphical way. We propose a new, ready-to-use, multi-tool, and open-source framework for visualizing and navigating chemical space. This framework adheres to the low-code/no-code (LCNC) paradigm, providing a KNIME node, a web-based tool, and a Python package, making it accessible to a broad cheminformatics community. The core technique of the MolCompass framework employs a pre-trained parametric t-SNE model. We demonstrate how this framework can be adapted for the visualisation of chemical space and visual validation of binary classification QSAR/QSPR models, revealing their weaknesses and identifying model cliffs. All parts of the framework are publicly available on GitHub, providing accessibility to the broad scientific community.

Scientific contribution
We provide an open-source, ready-to-use set of tools for the visualization of chemical space. These tools can be insightful for chemists to analyze compound datasets and for the visual validation of QSAR/QSPR models.

Organisation(s)
Department of Pharmaceutical Sciences
Journal
Journal of Cheminformatics
Volume
16
ISSN
1758-2946
DOI
https://doi.org/10.1186/s13321-024-00888-z
Publication date
12-2024
Peer reviewed
Yes
Austrian Fields of Science 2012
301207 Pharmaceutical chemistry, 104027 Computational chemistry, 102018 Artificial neural networks
Keywords
ASJC Scopus subject areas
Computer Science Applications, Physical and Theoretical Chemistry, Computer Graphics and Computer-Aided Design, Library and Information Sciences
Portal url
https://ucrisportal.univie.ac.at/en/publications/molcompass(f3d4a218-3d75-4cec-88e3-8c90d80e8780).html