In silico protein evolution by intelligent design: creating new and improved protein structures
Natural proteins perform a startling diversity of biological functions, but comprise a minuscule fraction of the theoretical sequence-structure space that polypeptides might occupy. The goal of protein design is to identify new free-energy minima in this sequence-structure landscape so as to expand the functional repertoire of polypeptides beyond that observed in nature. The accurate design of new proteins requires an exacting understanding of the forces that govern protein structure and folding, and should allow for the creation of novel molecular machines and therapeutics. This dissertation details the development of a computational protein design method, RosettaDesign, its application to the design of new and improved protein structures, and the rigorous experimental characterisation and analysis of the designed proteins to evaluate and improve the design process.
First, we applied RosettaDesign to computationally redesign the sequences of nine natural globular proteins. Experimental characterisation revealed that eight of the redesigned proteins were folded with secondary structure similar to that of their wild-type counterparts, and six had stabilities equal to, or up to 7 kcal/mol greater than, those of their wild-type counterparts. High-resolution structures of the two most dramatically stabilised redesigned proteins (human procarboxypeptidase and U1A) showed them to be virtually identical to their natural templates.
Second, we extended the capabilities of RosettaDesign to create a protein topology not observed in nature, by iterating between sequence design and structure prediction. We applied this general computational strategy to create a 93-residue alpha/beta protein called Top7 with a novel sequence and topology. We showed that the Top7 protein is folded and extremely stable, and the striking similarity between the X-ray crystal structure and the designed model demonstrated the unprecedented high-resolution accuracy of the design.
Third, we showed that the final 49 C-terminal residues of Top7 (named CFr) can be efficiently mistranslated in E. coli. While an overwhelming majority of naturally mistranslated polypeptides are unfolded, the CFr protein folds into an independently stable, obligate, symmetric homodimer with a novel, high-affinity interface. We further stabilised CFr by disulfide-induced covalent circularisation to create an ideal scaffold for novel functional protein design.
- Biological chemistry