Diffusion Models for Protein Structure Design: From Backbone Generation to Atomic-Resolution Enzyme Design

Abstract

Deep learning methods have revolutionized protein structure modeling, most notably AlphaFold2, which predicts protein structures from amino acid sequences with near-experimental accuracy. This dissertation explores the application of diffusion models to create general solutions to protein design tasks. We introduce RFdiffusion, a model that generates protein structures as a series of backbone frames and achieves state-of-the-art performance on unconditional generation, motif scaffolding, and protein-protein binder design. We then leverage a broadened molecular vocabulary to predict general biomolecular structures, including nucleic acids, small molecules, post-translational modifications, metals, and ions, with RoseTTAFoldAA. Using the RoseTTAFoldAA architecture, we fine-tune a diffusion model capable of generating proteins that bind small molecules. Finally, we present RFdiffusion2, a flow-matching model trained from randomly initialized weights and capable of unindexed atomic motif scaffolding, enabling the design of enzymes with complex active sites. In all cases, we validate the design capabilities of the models in vitro. Our work demonstrates the potential of diffusion models to advance the field of protein design and opens new avenues for enzyme engineering.

Description

Thesis (Ph.D.)--University of Washington, 2025
