Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps

dc.contributor.advisor: Si, Dong
dc.contributor.author: Moritz, Spencer A
dc.date.accessioned: 2019-05-02T23:16:23Z
dc.date.available: 2019-05-02T23:16:23Z
dc.date.issued: 2019-05-02
dc.date.submitted: 2019
dc.description: Thesis (Master's)--University of Washington, 2019
dc.description.abstract: Understanding a protein’s structure can lead to the discovery of therapeutic protein drugs. However, imaging proteins remains a challenge due to their small size. Cryo-electron microscopy (cryo-EM) is a leading imaging technology that has recently been able to produce near-atomic-resolution images called electron density maps. Even so, predicting protein structures remains a challenge on all but the most pristine density maps (< 2.5Å resolution). Here we introduce a deep learning model that uses a set of cascaded convolutional neural networks (CNNs) to predict the critical Cα atoms along a protein’s backbone structure. The cascaded CNN (C-CNN) is a novel deep learning architecture composed of multiple CNNs, each predicting a specific aspect of a protein’s structure. The model predicts several levels of protein structure: secondary structure elements (SSEs), backbone structure, and Cα atoms. It combines the results of each to produce a complete prediction map. The C-CNN is a semantic segmentation image classifier and was trained using thousands of simulated density maps. The method is largely automatic and requires only a recommended threshold value for each evaluated protein. A specialized tabu-search path-walking algorithm was used to produce an initial backbone trace with Cα placements. The method was tested on 50 experimental maps between 2.6Å and 4.4Å resolution. It outperformed several state-of-the-art prediction methods, including RosettaES, MAINMAST, and a Phenix-based method, by producing the most complete prediction models, as measured by the percentage of Cα atoms found. The method accurately predicted a mean of 88.5% of the Cα atoms within 3Å of a protein’s backbone structure, surpassing the 66.8% achieved by the leading alternative method (the Phenix-based fully automatic method) on the same set of density maps. The C-CNN also achieved an average RMSD of 1.23Å across all 50 experimental density maps, which is similar to that of the Phenix-based fully automatic method. The source code and a demo of this research have been published at https://github.com/DrDongSi/Ca-Backbone-Prediction.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Moritz_washington_0250O_19691.pdf
dc.identifier.uri: http://hdl.handle.net/1773/43603
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Convolutional Neural Network
dc.subject: Deep Learning
dc.subject: Machine Learning
dc.subject: Protein Structure Prediction
dc.subject: Computer science
dc.subject: Molecular biology
dc.subject.other: Computing and software systems
dc.title: Deep Learning to Predict Protein Backbone Structure from High-Resolution Cryo-EM Density Maps
dc.type: Thesis
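
The abstract reports two evaluation figures: the percentage of Cα atoms found within 3Å and an RMSD. The sketch below shows one plausible way such figures could be computed from two sets of 3D coordinates. It is not taken from the thesis or its repository: the function name `ca_metrics`, the nearest-neighbour matching rule, and the toy coordinates are all assumptions made for illustration, and the thesis may match atoms differently (for example, one-to-one).

```python
import math


def ca_metrics(true_ca, pred_ca, cutoff=3.0):
    """Return (percent of true C-alpha atoms found, RMSD over found atoms).

    Assumption: a true atom counts as "found" when its nearest predicted
    atom lies within `cutoff` angstroms. Matching is not one-to-one.
    """
    # Distance from each true atom to its closest predicted atom.
    nearest = [min(math.dist(t, p) for p in pred_ca) for t in true_ca]
    # Keep only the distances that fall inside the cutoff.
    matched = [d for d in nearest if d <= cutoff]
    percent_found = 100.0 * len(matched) / len(true_ca)
    if matched:
        rmsd = math.sqrt(sum(d * d for d in matched) / len(matched))
    else:
        rmsd = float("nan")
    return percent_found, rmsd


# Toy coordinates in angstroms, invented for illustration (not thesis data).
true_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
pred_ca = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0), (7.6, 5.0, 0.0)]

percent_found, rmsd = ca_metrics(true_ca, pred_ca)
print(f"{percent_found:.1f}% of C-alpha atoms found, RMSD {rmsd:.2f} angstroms")
```

In this toy case, two of the four true atoms have a predicted atom within the cutoff, so the completeness figure and the RMSD are computed over different subsets of atoms. That is why the two numbers are best read together rather than separately.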

Files

Original bundle

Name: Moritz_washington_0250O_19691.pdf
Size: 2.22 MB
Format: Adobe Portable Document Format