Genomic analysis by single cell flow sorting
The Human Genome Project has dramatically changed the landscape of biology. With the availability of genomic sequence from humans and many other organisms, new biological questions are being asked that involve the simultaneous study of thousands of genes or proteins. The invention of new technologies continues to be important for the timely investigation of many of these questions.

In this work, we present new technologies that address several genomics-level questions using electronic cell sorters. Because these machines are capable of examining and sorting tens of thousands of cells per second, they are potentially ideal platforms for investigating large systems. The challenge lies in converting biological attributes into readable physical attributes. To meet it, we developed a series of plasmid vectors that encode biological states as the ratio of two fluorescent proteins in E. coli.

Using this principle, we created the pGRFP series of vectors, which can be used to rapidly isolate insert-bearing clones on an electronic cell sorter. This technique is a powerful alternative to traditional colony picking based on blue/white color selection. The speed of the electronic cell sorter allows us to deposit single cells into tubes as fast as the tubes can be transported. We validate this method's precision in selecting insert-bearing clones and show its usefulness in a small sequencing project.

We also show how the pGRFP series vectors can be used to classify a large number of protein mutants. We sequenced hundreds of active mutants of a human enzyme. From these data, we introduce the concept of the "x-factor", which indicates a particular protein's tolerance to mutation. We are able to make striking correlations between the pattern of mutability throughout the enzyme and what is known about its 3D structure and mechanism of action.

Finally, we present the pGFPpDsRed series of vectors, which show promise in detecting DNA-protein interactions. Such a system could be a useful tool for scanning genomic DNA for transcription factor binding sites, a step toward solving regulatory networks. Conversely, a large number of protein mutants could be searched quickly to find variants that bind to a specific DNA sequence.