Learning from Large-scale Mutagenesis Data
|Institution:||University of Washington|
|Keywords:||Deep mutational scanning; Machine learning; Variant effect prediction; Genetics|
|Full text PDF:||http://hdl.handle.net/1773/40907|
Mutations can have profound effects on protein function. For example, mutations can increase or decrease enzymatic activity, influence aggregation propensity, or lead to novel protein functions. Mastery of the rules governing how mutations affect protein function has the potential to revolutionize bioengineering. Recently, advances in DNA sequencing technologies and molecular biology techniques have afforded new methods, such as deep mutational scanning, to measure the quantitative effects of mutations on protein function in high throughput. In this dissertation, I first describe the state of the deep mutational scanning field. In the following chapters, I employ large-scale mutagenesis data sets from deep mutational scanning experiments to perform three investigations. In Chapter 2, I report how different amino acids affect the severity of mutational effect. In Chapter 3, I show how machine-learning algorithms can be used to model the evolutionary, structural and physicochemical properties of mutations from large-scale mutagenesis data. In Chapter 4, I describe initial work on a deep mutational scan of amyloid to reveal how mutations can affect the aggregation propensity of a protein. In Chapter 5, I discuss some outstanding questions that can be resolved with future analyses of large-scale mutagenesis datasets.
Advisors/Committee Members: Fowler, Douglas M (advisor).