Modeling the effects of site-specific amino-acid preferences on protein evolution.

dc.contributor.advisor: Bloom, Jesse D
dc.contributor.author: Hilton, Sarah
dc.date.accessioned: 2020-08-14T03:30:31Z
dc.date.available: 2020-08-14T03:30:31Z
dc.date.issued: 2020-08-14
dc.date.submitted: 2020
dc.description: Thesis (Ph.D.)--University of Washington, 2020
dc.description.abstract: An important goal in the study of protein evolution is understanding which genetic changes that fixed in nature were selected for and why. However, understanding the functional consequence of any given mutation is a challenge because the effects of amino-acid changes are highly idiosyncratic across sites in a protein. My graduate research has focused on developing computational tools and methods that integrate two existing methods, phylogenetics and deep mutational scanning, to understand the effect of site-specific amino-acid constraint on protein evolution. These two methods, one computational and one experimental, have complementary strengths and weaknesses. Phylogenetics provides methods to study natural sequences, which are subjected to natural selective pressures, in a principled manner; however, these methods are constrained to the genetic sequences we have sampled. Deep mutational scanning allows for the unbiased measurement of all single amino-acid changes to a protein, but the assay occurs in an artificial laboratory setting. The goal of my work is to combine the strengths of each method, the comprehensiveness of the deep mutational scan and the realism of comparative sequence analysis, for a more complete and accurate understanding of site-specific protein constraint. In Chapter 2, I develop a web-based visualization tool, dms-view, for interactive exploration of deep mutational scanning experiments. dms-view addresses common analysis challenges by allowing the user to easily and iteratively view site-level summary metrics, individual mutation measurements, and the 3-D protein structure for site(s) of interest from a deep mutational scan. dms-view is a flexible tool that allows the user to explore the site-specific amino-acid preferences measured by a given deep mutational scan alongside external datasets, such as site-specific amino-acid frequencies observed in nature.
While tools like dms-view allow for qualitative comparison of natural selection and selection in the lab, more sophisticated methods are needed to make this comparison while accounting for sequence sampling and shared evolutionary history. To this end, in Chapter 3, I implement a relatively new family of phylogenetic substitution models called Experimentally Informed Codon Models (ExpCMs) in a new Python software package called phydms. ExpCMs describe the selection on amino-acid changes using the empirical measurements from a deep mutational scan and therefore represent a bridge between selection in the laboratory and selection in nature. phydms implements the models in a maximum-likelihood framework and includes auxiliary command-line tools to facilitate fast and easy analysis. In Chapter 4, I investigate the effect of the site-specific ExpCMs with empirical measurements on phylogenetic inference, specifically branch length estimation. A long-standing observation in phylogenetics is that the lengths of long branches in phylogenetic trees are consistently underestimated. I found that site-specific ExpCMs estimated longer branches than a common site-uniform codon model but that this extension in branch length was limited by intraprotein epistasis. This work suggests that the assumptions of current phylogenetic models, independent evolution between sites and identical evolution among sites, result in inaccurate branch length estimation. Overall, my graduate work has produced general computational methods and tools that integrate empirical measurements of site-specific amino-acid constraint with comparative sequence analysis to create a more complete picture of protein evolution.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Hilton_washington_0250E_21550.pdf
dc.identifier.uri: http://hdl.handle.net/1773/46018
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: evolutionary biology
dc.subject: influenza virus
dc.subject: phylogenetics
dc.subject: protein
dc.subject: Genetics
dc.subject: Evolution & development
dc.subject: Virology
dc.subject.other: Genetics
dc.title: Modeling the effects of site-specific amino-acid preferences on protein evolution.
dc.type: Thesis

Files

Original bundle

Name: Hilton_washington_0250E_21550.pdf
Size: 6.74 MB
Format: Adobe Portable Document Format
