Interactive Learning of Relation Extractors with Weak Supervision

dc.contributor.advisor: Weld, Daniel S (en_US)
dc.contributor.author: Hoffmann, Raphael Dominik (en_US)
dc.date.accessioned: 2013-04-17T18:03:05Z
dc.date.available: 2013-10-15T11:06:15Z
dc.date.issued: 2013-04-17
dc.date.submitted: 2012 (en_US)
dc.description: Thesis (Ph.D.)--University of Washington, 2012 (en_US)
dc.description.abstract: The ability to automatically convert natural language text into a knowledge base may open the door to great new opportunities, including question-answering on the Web, detection of trends and sentiments in social media, and perhaps even intelligent agents which understand our language. Today, however, no system can reliably convert text into a knowledge base, and the task turns out to be far more difficult than it appears. A key challenge is relation extraction: detecting semantic relationships between entities mentioned in text. Most successful approaches use supervised machine learning, but creating the required labeled training examples has proven too expensive for constructing Web-scale knowledge bases. This dissertation shows that we can greatly reduce the amount of human effort necessary to create relation extractors by leveraging a richer set of user interactions, some of which use more accurate models of weak supervision from a database. Specifically, this dissertation presents (1) a weakly supervised technique based on multi-instance learning which allows relations to overlap, (2) a weakly supervised technique that allows learning from only a few instances per relation by dynamically inducing relation-specific lexicons, (3) an approach for developing extraction rules interactively, and (4) a technique which synergistically pairs weakly supervised relation extraction with extraction validation by an online community. Our proposed techniques make it possible to create a high-quality relation extractor in under one hour, moving us closer to automatically constructing Web-scale knowledge bases. (en_US)
dc.embargo.terms: Restrict to UW for 6 months -- then make Open Access (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.other: Hoffmann_washington_0250E_11163.pdf (en_US)
dc.identifier.uri: http://hdl.handle.net/1773/22600
dc.language.iso: en_US (en_US)
dc.rights: Copyright is held by the individual authors. (en_US)
dc.subject: Computational Linguistics; Human-Computer Interaction; Information Extraction; Natural Language Processing; Weak Supervision (en_US)
dc.subject.other: Computer science (en_US)
dc.subject.other: computer science and engineering (en_US)
dc.title: Interactive Learning of Relation Extractors with Weak Supervision (en_US)
dc.type: Thesis (en_US)

Files

Original bundle

Name: Hoffmann_washington_0250E_11163.pdf
Size: 35 MB
Format: Adobe Portable Document Format