Knowledge-Intensive NLP for Real-World Information Needs

dc.contributor.advisor: Hajishirzi, Hannaneh
dc.contributor.author: Wadden, David
dc.date.accessioned: 2023-04-17T18:03:08Z
dc.date.available: 2023-04-17T18:03:08Z
dc.date.issued: 2023-04-17
dc.date.submitted: 2023
dc.description: Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract: Knowledge-intensive NLP aims to help humans navigate and synthesize the information contained in massive textual corpora. The past few years have witnessed rapid progress on knowledge-intensive NLP for general-domain corpora like Wikipedia, which can be accessed conveniently and annotated at scale using crowd workers. However, approaches that perform well on general-domain text may fail when applied to specialized settings like scientific literature, or when asked to generalize to novel entities or concepts. Three key research challenges arise in these real-world settings: (1) Standard task formulations and evaluation metrics may not adequately capture user information needs, (2) Training data are often scarce and expensive, and (3) Novel linguistic phenomena -- ranging from vocabulary shift to long-range semantic dependencies -- may not be captured by existing modeling approaches. In this thesis, we report progress on three knowledge-intensive NLP tasks addressing real-world information needs. Scientific information extraction aims to organize the findings reported in scientific literature into a structured knowledge graph. Scientific claim verification aims to assess the veracity of scientific claims against a corpus of research literature. Finally, entity-oriented query refinement aims to help users navigate an open-ended space of entities to quickly understand a new domain or discover relevant information. Throughout this thesis, we will frequently encounter the three key research challenges outlined above. We hope that our proposed solutions will contribute toward the development of more robust, flexible, and capable systems for knowledge-intensive NLP.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Wadden_washington_0250E_25222.pdf
dc.identifier.uri: http://hdl.handle.net/1773/49883
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Natural language processing
dc.subject: Computer science
dc.subject.other: Computer science and engineering
dc.title: Knowledge-Intensive NLP for Real-World Information Needs
dc.type: Thesis

Files

Original bundle

Name: Wadden_washington_0250E_25222.pdf
Size: 3.37 MB
Format: Adobe Portable Document Format