Ontology-driven pathway data integration

dc.contributor.advisor: Gennari, John H
dc.contributor.author: Wang, Lucy Lu
dc.date.accessioned: 2019-05-02T23:16:48Z
dc.date.available: 2019-05-02T23:16:48Z
dc.date.issued: 2019-05-02
dc.date.submitted: 2019
dc.description: Thesis (Ph.D.)--University of Washington, 2019
dc.description.abstract: Biological pathways are useful tools for understanding human physiology and disease pathogenesis. Pathway analysis can be used to detect genes and functions associated with complex disease phenotypes. When performing pathway analysis, researchers take advantage of multiple pathway datasets, combining pathways from different pathway databases. Pathways from different databases do not easily inter-operate, and the resulting combined pathway dataset can suffer from redundancy or reduced interpretability. Ontologies have been used to organize pathway data and eliminate redundancy. I generated clusters of semantically similar pathways by mapping pathways from seven databases to classes of one such ontology, the Pathway Ontology (PW). I then produced a typology of differences between pathways by summarizing the differences in content and knowledge representation between databases. Using the typology, I optimized an entity and graph-based network alignment algorithm for aligning pathways between databases. The algorithm was applied to clusters of semantically similar pathways to generate normalized pathways for each PW class. These normalized pathways were used to produce normalized gene sets for gene set enrichment analysis (GSEA). I evaluated these normalized gene sets against baseline gene sets in GSEA using four public gene expression datasets. Results suggest that normalized pathways can help to reduce redundancy in enrichment outputs. The normalized pathways also retain the hierarchical structure of the PW, which can be used to visualize enrichment results and provide hints for interpretation. Ontology-based organization of biological pathways can play a vital role in improving data quality and interoperability, and the resulting normalized pathways may have broad applications in genomic analysis.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Wang_washington_0250E_19740.pdf
dc.identifier.uri: http://hdl.handle.net/1773/43615
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: biological pathways
dc.subject: biomedical ontology
dc.subject: gene set enrichment analysis
dc.subject: knowledge representation
dc.subject: pathway analysis
dc.subject: pathway ontology
dc.subject: Bioinformatics
dc.subject: Information science
dc.subject: Computer science
dc.subject.other: Biomedical and health informatics
dc.title: Ontology-driven pathway data integration
dc.type: Thesis

Files

Original bundle

Name: Wang_washington_0250E_19740.pdf
Size: 4.58 MB
Format: Adobe Portable Document Format