Learning from Large-scale Mutagenesis Data

by Vanessa Elaine Gray

Institution: University of Washington
Year: 2018
Keywords: Deep mutational scanning; Machine learning; Variant effect prediction; Genetics
Posted: 02/01/2018
Record ID: 2215981
Full text PDF: http://hdl.handle.net/1773/40907


Mutations can have profound effects on protein function. For example, mutations can increase or decrease enzymatic activity, influence aggregation propensity, or lead to novel protein functions. Mastery of the rules governing how mutations affect protein function has the potential to revolutionize bioengineering. Recently, advances in DNA sequencing technologies and molecular biology techniques have afforded new methods, such as deep mutational scanning, to measure the quantitative effects of mutations on protein function in high throughput. In this dissertation, I first describe the state of the deep mutational scanning field. In the following chapters, I employ large-scale mutagenesis datasets from deep mutational scanning experiments to perform three investigations. In Chapter 2, I report how different amino acids affect the severity of mutational effect. In Chapter 3, I show how machine-learning algorithms can be used to model the evolutionary, structural and physicochemical properties of mutations from large-scale mutagenesis data. In Chapter 4, I describe initial work on a deep mutational scan of amyloid to reveal how mutations can affect the aggregation propensity of a protein. In Chapter 5, I discuss some outstanding questions that can be resolved with future analyses of large-scale mutagenesis datasets.

Advisors/Committee Members: Fowler, Douglas M (advisor).