Issues in Named Entity Recognition on Early Modern English Letters

dc.contributor.advisor: Xia, Fei
dc.contributor.author: Woldenga-Racine, Vanessa
dc.date.accessioned: 2019-10-15T22:59:17Z
dc.date.available: 2019-10-15T22:59:17Z
dc.date.issued: 2019-10-15
dc.date.submitted: 2019
dc.description: Thesis (Master's)--University of Washington, 2019
dc.description.abstract: The influx of digitized historical documents into online collections has made the study of these documents much more accessible to researchers and the general public. This data, however, is frequently raw, and is sometimes obtained through automated methods such as optical character recognition. Without rich metadata, the content of these documents is difficult to search and organize. Tasks commonly undertaken in the field of computational linguistics can aid in this endeavour. These documents often present challenges for modern systems, however, as the text contained in historical documents frequently differs in many ways from the present-day newswire these systems are most often trained on. In this thesis I explore the task of Named Entity Recognition on texts written in Early Modern English. I investigate three methodologies for bootstrapping training data to train a character-based neural net model. The results show substantial improvements upon all baselines, with the best F-measure at 60.31%.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: WoldengaRacine_washington_0250O_20656.pdf
dc.identifier.uri: http://hdl.handle.net/1773/44845
dc.language.iso: en_US
dc.rights: none
dc.subject: brown cluster
dc.subject: computational linguistics
dc.subject: digital humanities
dc.subject: early modern english
dc.subject: named entity recognition
dc.subject: neural net
dc.subject: Linguistics
dc.subject: Computer science
dc.subject: History
dc.subject.other: Linguistics
dc.title: Issues in Named Entity Recognition on Early Modern English Letters
dc.type: Thesis

Files

Original bundle

Name: WoldengaRacine_washington_0250O_20656.pdf
Size: 478.18 KB
Format: Adobe Portable Document Format
