Organization and evolution of transcription factor occupancy in the human genome
Vierstra, Jeffrey David
<italic>Cis</italic>-regulatory DNA encodes the circuitry that enables cell development and differentiation. <italic>Cis</italic>-regulatory DNA is densely populated by recognition sequences for transcription factors (TFs), and the cooperative binding of TFs to these sequences determines cell fate and function through the precise transcriptional regulation of their cognate genes. As such, a mechanistic understanding of gene regulation hinges on our ability to quantify transcription factor occupancy. To map transcription factor occupancy within the human genome, I took part in the development of digital genomic footprinting, a technique leveraging the endonuclease DNase I that enables the unbiased and simultaneous detection of transcription factor occupancy genome-wide. We applied digital genomic footprinting to 41 diverse cell and tissue types to comprehensively map the human <italic>cis</italic>-regulatory lexicon. We show that this small genomic compartment contains an expansive repertoire of conserved recognition sequences for DNA-binding proteins and that nuclease patterns within these sequences mirror nucleotide-level evolutionary conservation and track the crystallographic topography of protein-DNA interfaces. We also show that both genetic and epigenetic variants affecting chromatin states are concentrated within footprints. Finally, we describe a large collection of novel regulatory factor recognition motifs that are highly conserved in both sequence and function, and exhibit cell-selective occupancy patterns that closely parallel major regulators of development, differentiation and pluripotency. These results provide for the first time an exhaustive map of TF occupancy within the human genome.
The architecture of individual <italic>cis</italic>-regulatory sites is critical for their function. While digital genomic footprinting provides rich information about the occupancy of TFs within individual <italic>cis</italic>-regulatory elements, it is currently not possible to resolve the genome-wide relationship between TFs and nucleosomes. To address this deficiency, I developed an extension to digital genomic footprinting that couples the detection of individual TF footprints to nucleosome occupancy. We find that TF occupancy is the major determinant of the positioning of <italic>cis</italic>-regulatory proximal nucleosomes, and that the positioning and occupancy of promoter-associated nucleosomes are related to transcriptional start site selection and output. The approach we describe provides a new view on the structure of <italic>cis</italic>-regulatory chromatin.
In the second part of this thesis, I used a comparative genomics approach to study the evolution of <italic>cis</italic>-regulatory DNA and protein occupancy. To do this, I mapped DNase I hypersensitive sites (DHSs) in 45 mouse cell types and primary tissues, and systematically compared these with human DHS maps from orthologous cell and tissue compartments. While I uncovered a small set of core regulatory sequences that encode a developmental program, the vast majority of <italic>cis</italic>-regulatory DNA is rapidly evolving independently in mouse and human. Overall, I find that the activity of <italic>cis</italic>-regulatory DNA is directly linked to the composition of TF recognition sequences within it, and that the aggregate recognition sequence space for each transcription factor within accessible regulatory DNA of orthologous mouse and human cell types has been strictly conserved. These results demonstrate the remarkable plasticity of the mammalian <italic>cis</italic>-regulatory program and show that TF occupancy is driven by an evolutionarily inflexible <italic>trans</italic>-environment rather than by conservation of individual regulatory elements.
Taken together, this thesis provides a framework to understand the organization and evolution of global TF occupancy within the mammalian genome.
- Genetics