Comprehensive, precision genomics

Adey, Andrew

dc.contributor.advisor	Shendure, Jay A	en_US
dc.contributor.author	Adey, Andrew	en_US
dc.date.accessioned	2014-04-30T16:21:22Z
dc.date.available	2014-04-30T16:21:22Z
dc.date.issued	2014-04-30
dc.date.submitted	2014	en_US
dc.description	Thesis (Ph.D.)--University of Washington, 2014	en_US
dc.description.abstract	The past decade has seen a significant drop in the cost-per-base of DNA sequencing. Driven by a new era of 'next-generation' sequencing (NGS), there has been an explosion of new technologies that utilize DNA sequencing, not just for primary sequence but for a wide variety of biological assays. Despite the versatility of NGS, there are a number of drawbacks, including high sample input requirements and short read lengths. Because of the latter, the majority of genome studies cannot resolve haplotype or structural variation, which requires long-range information, can play an important role in studying evolution and disease, and is crucial in the <italic>de novo</italic> assembly of genomes. In this dissertation I describe and apply methods to overcome these obstacles. First, I describe a method for the construction of DNA sequencing libraries that utilized a hyperactive transposase to fragment DNA and append universal sequencing primers in a single enzymatic step. This approach reduced the turnaround time from sample to sequencing-ready libraries, and significantly reduced the sample input requirements due to fewer enzymatic steps. I then describe a modified version of the method that allowed for a greater than 100-fold decrease in input requirements for the construction of libraries for the detection of DNA methylation. Next, I discuss a method that utilized the inherent properties of Tn5 transposase to provide long-range sequence information that served as the input for a novel <italic>de novo</italic> genome assembly algorithm. I applied this method to human, mouse, and fly assemblies to produce output scaffolds with contiguity improvements of up to 75-fold with high accuracy. Last, I describe the application of long-range sequence information to haplotype-resolve the genome and epigenome of the aneuploid HeLa cancer cell line.
I investigated the global effects of copy number and haplotype on transcript abundance and epigenetic landscape and identified a number of outliers, including haplotype-specific expression of the proto-oncogene <italic>MYC</italic>. I reveal the mechanism responsible for this activation as the complex integration of the HPV-18 viral genome that includes an epithelial-specific enhancer at high copy number 500 kilobase pairs upstream of the <italic>MYC</italic> locus.	en_US
dc.embargo.terms	No embargo	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.other	Adey_washington_0250E_12857.pdf	en_US
dc.identifier.uri	http://hdl.handle.net/1773/25401
dc.language.iso	en_US	en_US
dc.rights	Copyright is held by the individual authors.	en_US
dc.subject	Biotechnology; Epigenetics; Epigenomics; Genome Assembly; Genomics; Sequencing	en_US
dc.subject.other	Genetics	en_US
dc.subject.other	Molecular biology	en_US
dc.subject.other	molecular and cellular biology	en_US
dc.title	Comprehensive, precision genomics	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Adey_washington_0250E_12857.pdf
Size: 17.92 MB
Format: Adobe Portable Document Format

Collections

Molecular and cellular biology