|Institution:||University of British Columbia|
|Full text PDF:||http://hdl.handle.net/2429/54739|
State-of-the-art methods in RNA secondary structure prediction focus on predicting the final, functional structure. However, ample experimental and statistical evidence indicates that structure formation starts immediately during transcription and that this co-transcriptional folding influences the resulting final RNA structure. Thus, identifying the transient structures that form co-transcriptionally may provide insight into how co-transcriptional folding leads to the final conformation in vivo. Because RNA secondary structures are currently best predicted by comparative approaches, we investigated whether homologous RNA genes not only assume the same final structure, but also share structural features during co-transcriptional folding in vivo. For this, we compiled a non-redundant data set of 32 transcripts from six different RNA families, which constitutes the most comprehensive data set with experimentally confirmed transient and alternative RNA structures so far. We present statistical evidence that homologous RNA genes from related organisms fold co-transcriptionally in a similar way. In particular, we show that some transient structures are highly conserved, at levels similar to those of the final, functional structure. Moreover, we find that the predicted co-transcriptional folding pathways of homologous sequences encounter similar transient structure features, which often coincide with known transient features. We thus also predict, in silico, candidates for these evolutionarily conserved transient features of co-transcriptional folding pathways. We further expand four alignments from this data set through covariance model searches and manual curation, in order to share them with the RNA community.
These alignments either update existing Rfam data sets with annotations of transient structures or introduce new RNA families: (1) the Trp operon leader, where alternative structures are coordinated to regulate transcription of the operon in response to tryptophan abundance; (2) the HDV ribozyme, where the self-cleavage activity is modulated by transient structures involving the extended 5’ flanking sequence; (3) the 5’ UTR of the Levivirus maturation protein, where a transient structure temporarily postpones the formation of the final structure that inhibits translation of the maturation protein; and (4) the SAM riboswitch, where downstream gene expression is regulated by alternative structures upon binding of SAM.