
Evolutionary analysis of viral sequences in eukaryotic genomes

by Sean Eric Schneider




Institution: University of Washington
Department:
Degree: PhD
Year: 2015
Keywords: Genetics
Record ID: 2059149
Full text PDF: http://hdl.handle.net/1773/27492


Abstract

This work presents several evolutionary analyses of endogenous viral sequences in eukaryotic genomes. Endogenous viral sequences can provide key insights into the past forms and evolutionary history of viruses, as well as the responses of the host organisms they infect. I examined viral sequences in a diverse assortment of eukaryotic hosts in order to study coevolution between hosts and the organisms that infect them. This research consisted of two major lines of investigation. In the first, I outline the hypothesis that the C2H2 zinc finger gene family in vertebrates has evolved by birth-death evolution in response to sporadic retroviral infection. The hypothesis suggests an evolutionary model in which newly duplicated zinc finger genes are retained by selection in response to retroviral infection. It is supported by a strong association (R² = 0.67) between the number of endogenous retroviruses and the number of zinc fingers in diverse vertebrate genomes. Based on this and other evidence, the zinc finger gene family appears to act as a "genomic immune system" against retroviral infections. The second line of investigation examines endogenous virus sequences used by parasitic wasps to disable the hosts that they infect. These wasps package their own DNA into viral particles and inject them into the host. I found that the DNA packaged into these viral particles can be permanently transferred to the hosts that these wasps infect. I identified 105 transferred regions in two host species: the monarch butterfly (Danaus plexippus) and the silkworm (Bombyx mori). The last common ancestor of these species and wasps lived around 300 million years ago. Many of these regions are highly similar to one another, and the sequences form 12 groups when clustered at 90% nucleotide identity. These similarities may arise from repeated integration of the same sequence or from duplication after integration into the host.
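
To illustrate what grouping sequences by a 90% nucleotide identity threshold can look like, here is a minimal Python sketch. It is not the method used in the thesis, which the abstract does not specify. It assumes a simple greedy scheme in which each sequence joins the first group whose representative it matches at or above the threshold, and it assumes the sequences are already aligned to equal length. The function names and the toy sequences are hypothetical and are not data from this work.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def greedy_cluster(seqs: list[str], threshold: float = 0.90) -> list[list[int]]:
    """Greedy clustering (an assumed scheme, not the thesis method).

    Each sequence joins the first cluster whose representative (the first
    member) it matches at >= threshold; otherwise it starts a new cluster.
    Returns clusters as lists of indices into seqs.
    """
    clusters: list[list[int]] = []
    for i, s in enumerate(seqs):
        for members in clusters:
            if identity(s, seqs[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            # No existing representative was similar enough.
            clusters.append([i])
    return clusters


# Hypothetical toy sequences (10 nt each), invented for illustration only.
toy = [
    "ACGTACGTAC",
    "ACGTACGTAT",  # differs from toy[0] at one position
    "TTTTGGGGCC",
    "TTTTGGGGCA",  # differs from toy[2] at one position
    "ACGTTTTTAC",  # differs from toy[0] at three positions
]
groups = greedy_cluster(toy, threshold=0.90)
print(groups)
```

With real genomic regions, identity would normally be computed from pairwise alignments rather than position-by-position comparison, and the resulting groups can depend on the order in which sequences are processed.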