Interactive Learning of Relation Extractors with Weak Supervision
Hoffmann, Raphael Dominik
The ability to automatically convert natural language text into a knowledge base may open the door to great new opportunities, including question answering on the Web, detection of trends and sentiments in social media, and perhaps even intelligent agents that understand our language. Today, however, no system can reliably convert text into a knowledge base, and the task turns out to be far more difficult than it appears. A key challenge is relation extraction: detecting semantic relationships between entities mentioned in text. Most successful approaches use supervised machine learning, but creating the required labeled training examples has proven too expensive for constructing Web-scale knowledge bases. This dissertation shows that we can greatly reduce the human effort needed to create relation extractors by leveraging a richer set of user interactions, some of which use more accurate models of weak supervision from a database. Specifically, this dissertation presents (1) a weakly supervised technique based on multi-instance learning that allows relations to overlap, (2) a weakly supervised technique that allows learning from only a few instances per relation by dynamically inducing relation-specific lexicons, (3) an approach for developing extraction rules interactively, and (4) a technique that synergistically pairs weakly supervised relation extraction with extraction validation by an online community. Our proposed techniques make it possible to create a high-quality relation extractor in under one hour, moving us closer to automatically constructing Web-scale knowledge bases.