Discovery and Applications of Bacterial Noncoding RNAs

Tseng, Huei-Hun Elizabeth

dc.contributor.advisor	Ruzzo, Walter L	en_US
dc.contributor.author	Tseng, Huei-Hun Elizabeth	en_US
dc.date.accessioned	2013-02-25T18:01:28Z
dc.date.available	2013-08-25T11:05:50Z
dc.date.issued	2013-02-25
dc.date.submitted	2012	en_US
dc.description	Thesis (Ph.D.)--University of Washington, 2012	en_US
dc.description.abstract	Noncoding RNAs (ncRNAs) are functional transcripts that do not code for proteins. Many of them play indispensable roles in the cell. For example, ribosomal RNAs make up the ribosome, the cell's factory for making proteins, and riboswitches bind small metabolites and regulate gene expression. Computational discovery of ncRNAs is challenging, however, because ncRNAs evolve rapidly at the nucleotide level while preserving secondary structure. In the first part of this thesis, we develop two clustering algorithms that are robust to weak sequence homology signals and are applicable at the genomic scale. We show that both algorithms can recover most known ncRNA families and that as few as 5 homologous sequences are needed to predict a strong motif. In the second part of the thesis, we investigate whether secondary structure information improves maximum likelihood tree inference for ncRNAs. An accurate phylogenetic tree has important biological and clinical applications: it can be used to infer the function of novel organisms and to understand the evolutionary history of species. We show that using structure information, a more realistic gap model, and a maximum likelihood approach improves phylogenetic tree inference. In the third part of the thesis, we develop a method for profiling human gut microbial communities using high-throughput sequencing. Our method works on Illumina short reads and does not require assembly or taxonomic identification. We show that it can differentiate between the gut microbiota of healthy individuals at low sequencing depth, making it a cost-effective screening tool for large population studies. In the final part of the thesis, we use a standard additions experiment to examine sequencing bias and errors in Illumina HiSeq. We identify features associated with systematic errors and develop an error correction pipeline. We show that our method reduces base errors and produces better species diversity estimates.	en_US
dc.embargo.terms	Restrict to UW for 6 months -- then make Open Access	en_US
dc.format.mimetype	application/pdf	en_US
dc.identifier.other	Tseng_washington_0250E_10974.pdf	en_US
dc.identifier.uri	http://hdl.handle.net/1773/22012
dc.language.iso	en_US	en_US
dc.rights	Copyright is held by the individual authors.	en_US
dc.subject.other	Computer science	en_US
dc.subject.other	Bioinformatics	en_US
dc.subject.other	Microbiology	en_US
dc.subject.other	Computer science and engineering	en_US
dc.title	Discovery and Applications of Bacterial Noncoding RNAs	en_US
dc.type	Thesis	en_US

Files

Original bundle

Name: Tseng_washington_0250E_10974.pdf
Size: 2.57 MB
Format: Adobe Portable Document Format

Collections

Computer science and engineering