Discovery and Applications of Bacterial Noncoding RNAs

dc.contributor.advisor: Ruzzo, Walter L [en_US]
dc.contributor.author: Tseng, Huei-Hun Elizabeth [en_US]
dc.date.accessioned: 2013-02-25T18:01:28Z
dc.date.available: 2013-08-25T11:05:50Z
dc.date.issued: 2013-02-25
dc.date.submitted: 2012 [en_US]
dc.description: Thesis (Ph.D.)--University of Washington, 2012 [en_US]
dc.description.abstract: Noncoding RNAs (ncRNAs) are functional transcripts that do not code for proteins. Many of them play indispensable roles in the cell. For example, ribosomal RNAs make up the ribosome, the factory for making proteins, and riboswitches bind to small metabolites in the cell and regulate gene expression. Computational discovery of ncRNAs is challenging, however, because ncRNAs evolve rapidly at the nucleotide level while preserving secondary structure. In the first part of this thesis, we develop two clustering algorithms that are robust to weak sequence homology signals and are applicable at the genomic scale. We show that both algorithms can recover most known ncRNA families and that as few as 5 homologous sequences are needed to predict a strong motif. In the second part of the thesis, we investigate whether secondary structure information improves maximum likelihood tree inference for ncRNAs. An accurate phylogenetic tree has important biological and clinical applications: it can be used to infer the function of novel organisms and to understand the evolutionary history of species. We show that using structure information, a more realistic gap model, and a maximum likelihood approach improves phylogenetic tree inference. In the third part of the thesis, we develop a method for profiling human gut microbial communities using high-throughput sequencing. Our method works on Illumina short reads and does not require assembly or taxonomic identification. We show that it can differentiate between the gut microbiota of healthy individuals at low sequencing depth, making it a cost-effective screening tool for large population studies. In the final part of the thesis, we use a standard additions experiment to examine sequencing bias and errors in Illumina HiSeq. We identify features associated with systematic errors and develop an error correction pipeline. We show that our method reduces base errors and produces better species diversity estimates. [en_US]
dc.embargo.terms: Restrict to UW for 6 months -- then make Open Access [en_US]
dc.format.mimetype: application/pdf [en_US]
dc.identifier.other: Tseng_washington_0250E_10974.pdf [en_US]
dc.identifier.uri: http://hdl.handle.net/1773/22012
dc.language.iso: en_US [en_US]
dc.rights: Copyright is held by the individual authors. [en_US]
dc.subject.other: Computer science [en_US]
dc.subject.other: Bioinformatics [en_US]
dc.subject.other: Microbiology [en_US]
dc.subject.other: Computer science and engineering [en_US]
dc.title: Discovery and Applications of Bacterial Noncoding RNAs [en_US]
dc.type: Thesis [en_US]

Files

Original bundle

Name: Tseng_washington_0250E_10974.pdf
Size: 2.57 MB
Format: Adobe Portable Document Format