Congratulations
All of us at #pharminfo are so happy and proud, and we congratulate you on this great success. Unfortunately, such a success also means saying goodbye to a great and warm colleague. When we think about what you mean to us, Barbara, it becomes all the more clear that you played a special role in the team, and the gap you leave has yet to be filled. Your new employer has found a brilliant team player in you, and your new team will soon benefit from this, both scientifically and personally. Congratulations! We will miss you, Barbara. Keep us updated. Wishing you all the best!
Barbara was a PhD student in the MolTag program, working in the field of data science.
Abstract
The toxicity of medically relevant compounds remains a challenge in drug discovery, often because the mechanism behind toxic events is unknown. Computational methods for understanding and predicting toxicity are frequently developed, and systems biology aims to support the deciphering of biological systems and mechanisms. In this thesis, we introduce systems biology-based data science approaches that use already existing data to create strong hypotheses and high-quality methods for supporting drug safety assessments. The core work of the thesis was divided into three studies. In Path4Drug, a workflow was created to connect drugs with toxicity records to different biological entities and processes, such as target proteins or biological pathways. The workflow uses public repositories and web services. The statistical analysis of the collected results for withdrawn drugs and drugs with black-box warnings yielded insights into pathways that may play an essential role in causing tissue-specific toxicities. A network-based study aimed to build causal networks of proteins linked to drugs causing tissue-specific toxicities. Complex mining steps were carried out to collect and annotate causal compound-protein and protein-protein data. Based on the created networks, topological enrichment analysis was performed to establish connections between the compounds and pathways. In the last study, the collected target, interactome, and pathway connections were used to perform toxicity predictions. The resulting profiles served as descriptors for training models based on tree-based algorithms. In this work, we showed the importance of avoiding superfluous descriptors, highlighted the data availability gap, and introduced a novel way of using systemic descriptors of compounds for toxicity predictions. All the tools created as part of the thesis are openly available and can be adapted for other systems biology-based research.
Keywords
Toxicity / Systems biology / Data science / Protein / Pathway / Network