
Bioinformatic Analysis of microRNA Genes in Free-Living and Parasitic Nematodes

by Rina Ahmed




Institution: Freie Universität Berlin
Department: FB Mathematik und Informatik
Degree: PhD
Year: 2015
Record ID: 1106546
Full text PDF: http://edocs.fu-berlin.de/diss/receive/FUDISS_thesis_000000098864


Abstract

Bioinformatics has become the ‘bottleneck’ of high-throughput biological studies. Next-generation sequencing (NGS) instruments produce millions of sequences (reads) in a short time and at low cost. A major problem is handling and analyzing these large-scale data sets efficiently and systematically. Bioinformatics methods make it possible to analyze high-throughput sequencing data computationally and thereby help to address biological questions. This thesis addresses computational challenges and biological questions that arise when investigating microRNA (miRNA) genes in nematodes using NGS technologies (ABI SOLiD, Illumina GA II, and HiSeq). On the one hand, bioinformatics methods and computational strategies were identified and developed to analyze large-scale experimental small RNA data. These data sets were generated in-house, provided by collaborators, or obtained from public sources. On the other hand, this work addresses the question of whether miRNA genes affect developmental arrest and long-term survival in the dauer larvae of two free-living nematodes, Caenorhabditis elegans (C. elegans) and Pristionchus pacificus (P. pacificus), and in the infective stage of a parasitic nematode, Strongyloides ratti (S. ratti). In particular, I address the long-standing hypothesis that dauer and infective larvae share a common origin. The investigation focuses on determining whether these two larval stages exhibit similar miRNA expression signatures. In the first part of this study, I developed a bioinformatics workflow that characterizes the miRNA gene complement of C. elegans, P. pacificus, and S. ratti and investigates miRNA expression levels. Additionally, this workflow infers miRNA gene families and integrates the observed phylogenetic relationships with the measured changes in expression level.
As part of this study, I was involved in the development of FLEXBAR (published in 2012 in the special issue “Next-Generation Sequencing Approaches in Biology” of the journal Biology), a program that I applied to preprocess our small RNA sequencing data. FLEXBAR is a versatile solution for three critical preprocessing steps in any next-generation processing pipeline: (i) basic clipping and quality filtering, (ii) barcode recognition and processing, and (iii) adapter recognition and removal. Importantly, all of these steps can be performed in a single program call and executed in parallel. In benchmark I (removing adapters from Illumina reads), FLEXBAR performs slightly better than FASTX, which is widely considered the best of the selected competitors. Furthermore, FLEXBAR covers a large range of sequencing platform applications, formats, and features and provides detailed output statistics, e.g. graphical output of read alignments. In the second part of this study, I applied the bioinformatics workflow to address the question of whether miRNAs affect developmental arrest and long-term survival in dauer and infective larvae of nematodes (published in 2013 in Genome Biology and Evolution). This study identifies and extends…
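
The three preprocessing steps attributed to FLEXBAR in the abstract (quality clipping, barcode recognition, adapter removal) can be illustrated with a minimal Python sketch. It is not FLEXBAR's implementation or command-line interface: the function names, the exact-prefix barcode match, the simple ungapped overlap search, and the thresholds `min_q`, `min_overlap`, and `max_error_rate` are all assumptions chosen for illustration, and the barcode table and sequences in the usage lines are made-up example inputs.

```python
from typing import Dict, List, Optional, Tuple


def quality_clip_end(quals: List[int], min_q: int = 20) -> int:
    """Return the read length left after clipping low-quality 3' bases."""
    end = len(quals)
    while end > 0 and quals[end - 1] < min_q:
        end -= 1
    return end


def assign_barcode(seq: str, barcodes: Dict[str, str]) -> Tuple[Optional[str], int]:
    """Return (sample, barcode_length) if the read starts with a known barcode.

    Assumption: exact prefix matching; no mismatches are tolerated here.
    """
    for sample, barcode in barcodes.items():
        if seq.startswith(barcode):
            return sample, len(barcode)
    return None, 0


def find_adapter(seq: str, adapter: str, min_overlap: int = 3,
                 max_error_rate: float = 0.1) -> int:
    """Return the position where a 3' adapter starts, or len(seq) if absent.

    Scans left to right and accepts the first ungapped overlap whose
    mismatch count stays within max_error_rate of the overlap length.
    """
    for start in range(len(seq) - min_overlap + 1):
        overlap = min(len(adapter), len(seq) - start)
        mismatches = sum(
            a != b for a, b in zip(seq[start:start + overlap], adapter[:overlap])
        )
        if mismatches <= int(overlap * max_error_rate):
            return start
    return len(seq)


def preprocess(seq: str, quals: List[int], barcodes: Dict[str, str],
               adapter: str) -> Tuple[Optional[str], str, List[int]]:
    """Apply the three steps in the order listed in the abstract."""
    # (i) clip low-quality bases from the 3' end
    end = quality_clip_end(quals)
    seq, quals = seq[:end], quals[:end]
    # (ii) recognise and strip a 5' barcode
    sample, bc_len = assign_barcode(seq, barcodes)
    seq, quals = seq[bc_len:], quals[bc_len:]
    # (iii) recognise and remove the 3' adapter
    cut = find_adapter(seq, adapter)
    return sample, seq[:cut], quals[:cut]


# Usage with hypothetical inputs: barcode + insert + partial adapter + poor tail
barcodes = {"sampleA": "ACGT", "sampleB": "TTAG"}
adapter = "TGGAATTCTCGGGTGCCAAGG"
insert = "TGAGGTAGTAGGTTGTATAGTT"
read = "ACGT" + insert + "TGGAATTC" + "AA"
read_quals = [30] * 34 + [2] * 2
sample, trimmed, trimmed_quals = preprocess(read, read_quals, barcodes, adapter)
print(sample, trimmed)
```

Each read is handled independently of every other read, which is why preprocessing of this kind lends itself to parallel execution. A production tool would typically use a gapped alignment rather than the ungapped comparison shown here.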