Interactive Learning of Relation Extractors with Weak Supervision

Authors

Hoffmann, Raphael Dominik

Abstract

The ability to automatically convert natural language text into a knowledge base may open the door to great new opportunities, including question-answering on the Web, detection of trends and sentiments in social media, and perhaps even intelligent agents that understand our language. Today, however, no system can reliably convert text into a knowledge base, and the task turns out to be far more difficult than it appears. A key challenge is relation extraction: detecting semantic relationships between entities mentioned in text. Most successful approaches use supervised machine learning, but creating the required labeled training examples has proven too expensive for constructing Web-scale knowledge bases. This dissertation shows that we can greatly reduce the amount of human effort necessary to create relation extractors by leveraging a richer set of user interactions, some of which use more accurate models of weak supervision from a database. Specifically, this dissertation presents (1) a weakly supervised technique based on multi-instance learning that allows relations to overlap, (2) a weakly supervised technique that allows learning from only a few instances per relation by dynamically inducing relation-specific lexicons, (3) an approach for developing extraction rules interactively, and (4) a technique that synergistically pairs weakly supervised relation extraction with extraction validation by an online community. Our proposed techniques make it possible to create a high-quality relation extractor in under one hour, moving us closer to automatically constructing Web-scale knowledge bases.

Description

Thesis (Ph.D.)--University of Washington, 2012
