To advance knowledge about small RNA, this thesis built an online platform that homogenizes the analysis of sequencing data from over 4,000 samples collected from various organisms in different laboratories worldwide. With it, tissue- and disease-specific biomarkers were detected, and evidence was found for bacterial and viral infections of the sequenced tissues. This motivated the development of an algorithm that analyzes non-host RNA sequencing reads to detect pathogen abundance in the host. With a focus on neurological diseases, the results suggest bacterial presence in the brains of dementia patients. The platform and the algorithm can help scientists add important leads to their own studies.
Author(s): Anna-Maria Liebhoff
Publisher: Cuvillier Verlag
Year: 2021
Language: English
Pages: 139
City: Göttingen
1 Introduction
2 Biological background
2.1 Ribonucleic acid (RNA)
2.2 Small RNA
2.3 Sequencing
2.4 Microbiology
2.5 Neurodegenerative diseases
2.6 Pathogens in neurological diseases
3 Bioinformatics background
3.1 Data sources
3.2 Metadata and biological ontologies
3.3 Read alignment
3.4 Comparative analysis
3.5 Metagenomic analysis
3.6 Integrated analysis pipelines
4 Computer science background
4.1 Web development
4.2 Data science and machine learning
5 Small RNA Expression Atlas (SEA)
5.1 Background
5.2 System architecture
5.3 Implementation
5.4 System behavior
5.5 Results
6 Biological discoveries through SEA
6.1 Tissue specificity of small RNA
6.2 User data and Parkinson’s disease
6.3 Pathogens in neurodegenerative diseases
6.4 Sex specificity of small RNA in human body
6.5 Discussion
7 Pathogen detection in RNA-seq data
7.1 Background
7.2 Identification algorithm
7.3 Performance measures
7.4 Evaluation
8 Path(ogen)s to disease
8.1 Analysis pipeline
8.2 Burkholderia stabilis in frontotemporal dementia
8.3 Discussion
9 Conclusion
Bibliography
A SEA JSON communication objects
B Pathonoia algorithms
C Detailed analysis results