A Linguist-Friendly Machine Translation System for Low-Resource Languages

dc.contributor.advisor: Xia, Fei (en_US)
dc.contributor.author: Lockwood, Ronald Milton (en_US)
dc.date.accessioned: 2015-09-29T21:24:09Z
dc.date.available: 2015-09-29T21:24:09Z
dc.date.issued: 2015-09-29
dc.date.submitted: 2015 (en_US)
dc.description: Thesis (Master's)--University of Washington, 2015 (en_US)
dc.description.abstract: Low-resource languages have largely been left out of the machine translation revolution. Their speakers would benefit from machine translation for many different tasks if it were available. Because too little text data exists for these languages, statistical machine translation gives subpar results. The best choice for these languages is probably a transfer-based approach, in which rules define how to translate from one language to another. Unfortunately, the transfer-based systems available today are not easy to use for anyone outside the computational linguistics field. This thesis presents a transfer-based system that is easy for ordinary linguists to use. It is linguist-friendly because a central component is the intuitive application Fieldworks Language Explorer. This application serves as the repository for lexicons, the place where entries are linked, and the tool for the analysis piece of the analysis-transfer-synthesis-style system. Apertium, the well-established open-source machine translation platform, is used for the transfer piece of the system, and STAMP is used for the synthesis piece. All of these programs are well-documented. The linguist’s role is to link lexicon entries and write transfer rules that perform either word-level or syntactic-level translation. Although this machine translation system is a proof of concept, I show that it translates texts successfully in a test case using Persian and Gilaki. Such a system can be used by ordinary linguists all over the world for almost any language pair where machine translation is needed. (en_US)
dc.embargo.terms: Open Access (en_US)
dc.format.mimetype: application/pdf (en_US)
dc.identifier.other: Lockwood_washington_0250O_14408.pdf (en_US)
dc.identifier.uri: http://hdl.handle.net/1773/33999
dc.language.iso: en_US (en_US)
dc.rights: Copyright is held by the individual authors. (en_US)
dc.subject: Apertium; Fieldworks; linguist-friendly; low-resource; machine translation; transfer-based (en_US)
dc.subject.other: Linguistics (en_US)
dc.subject.other: Computer science (en_US)
dc.subject.other: linguistics (en_US)
dc.title: A Linguist-Friendly Machine Translation System for Low-Resource Languages (en_US)
dc.type: Thesis (en_US)

Files

Original bundle

Name: Lockwood_washington_0250O_14408.pdf
Size: 897.48 KB
Format: Adobe Portable Document Format
