Comprehensive, precision genomics

dc.contributor.advisor: Shendure, Jay A [en_US]
dc.contributor.author: Adey, Andrew [en_US]
dc.date.accessioned: 2014-04-30T16:21:22Z
dc.date.available: 2014-04-30T16:21:22Z
dc.date.issued: 2014-04-30
dc.date.submitted: 2014 [en_US]
dc.description: Thesis (Ph.D.)--University of Washington, 2014 [en_US]
dc.description.abstract: The past decade has seen a significant drop in the cost per base of DNA sequencing. Driven by a new era of "next-generation" sequencing (NGS), there has been an explosion of new technologies that use DNA sequencing, not just for primary sequence but also for a wide variety of biological assays. Despite the versatility of NGS, it has a number of drawbacks, including high sample input requirements and short read lengths. Because of the latter, the majority of genome studies cannot resolve haplotype or structural variation, which requires long-range information, can play an important role in studying evolution and disease, and is crucial in the <italic>de novo</italic> assembly of genomes. In this dissertation I describe and apply methods to overcome these obstacles. First, I describe a method for the construction of DNA sequencing libraries that used a hyperactive transposase to fragment DNA and append universal sequencing primers in a single enzymatic step. This approach reduced the turnaround time from sample to sequencing-ready libraries and, because it involves fewer enzymatic steps, significantly reduced the sample input requirements. I then describe a modified version of the method that allowed for a greater than 100-fold decrease in input requirements for the construction of libraries for the detection of DNA methylation. Next, I discuss a method that used the inherent properties of Tn5 transposase to provide long-range sequence information, which served as the input for a novel <italic>de novo</italic> genome assembly algorithm. I applied this method to human, mouse, and fly assemblies to produce output scaffolds with contiguity improvements of up to 75-fold with high accuracy. Last, I describe the application of long-range sequence information to haplotype-resolve the genome and epigenome of the aneuploid HeLa cancer cell line.
I investigated the global effects of copy number and haplotype on transcript abundance and epigenetic landscape and identified a number of outliers, including haplotype-specific expression of the proto-oncogene <italic>MYC</italic>. I reveal the mechanism responsible for this activation as the complex integration of the HPV-18 viral genome that includes an epithelial-specific enhancer at high copy number 500 kilobase pairs upstream of the <italic>MYC</italic> locus. [en_US]
dc.embargo.terms: No embargo [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.other: Adey_washington_0250E_12857.pdf [en_US]
dc.identifier.uri: http://hdl.handle.net/1773/25401
dc.language.iso: en_US [en_US]
dc.rights: Copyright is held by the individual authors. [en_US]
dc.subject: Biotechnology; Epigenetics; Epigenomics; Genome Assembly; Genomics; Sequencing [en_US]
dc.subject.other: Genetics [en_US]
dc.subject.other: Molecular biology [en_US]
dc.subject.other: molecular and cellular biology [en_US]
dc.title: Comprehensive, precision genomics [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Adey_washington_0250E_12857.pdf
Size: 17.92 MB
Format: Adobe Portable Document Format