dc.contributor.advisor: Nascimento, Anderson C
dc.contributor.author: Callow, Edward Joseph
dc.date.accessioned: 2019-10-15T22:54:00Z
dc.date.available: 2019-10-15T22:54:00Z
dc.date.submitted: 2019
dc.identifier.other: Callow_washington_0250O_20779.pdf
dc.identifier.uri: http://hdl.handle.net/1773/44691
dc.description: Thesis (Master's)--University of Washington, 2019
dc.description.abstract: Accurate classification, morphological analysis, and translation of compound words is a problem that has not been satisfactorily solved in many of its aspects. For example, as of the date of this paper, Google translates “Trittbrettunsterblichkeit”, a German compound word (GCW) meaning, in the English idiom, the act of “riding on someone’s coattails to achieve immortality”, as “footboard immortality.” This is a literal translation that does not capture the meaning. Conversely, when one describes this idiom in English in an effort to obtain “Trittbrettunsterblichkeit”, there is no way to get this word unless one inputs “footboard immortality”, which makes no sense in English. Inputting “immortality achieved by riding on someone’s coattails”, which is a fairly accurate definition of “Trittbrettunsterblichkeit”, yields the awkward phrase: “Unsterblichkeit, die durch das Reiten auf den Fellschwänzen eines Menschen erreicht wird.” Clearly, constructing a GCW to match a concept in English, even when a succinct native German word for it exists, is a problem. The goal of this thesis is to explore the generation of GCWs, existing or non-existing, from inputs of component root words. Although the methods explored may be adaptable to generating words in other languages, the focus here is German compound words (GCWs), also called Komposita. In particular, this thesis discusses the problem of predicting the correct linking element of the GCW. To accomplish this, a recurrent neural network (hereinafter ‘GCW RNN’) with Attention is used, trained on the characters of the constituent words of the GCWs in the training set. From this, a prediction is made as to the linking element. This report contains a description of the problem, the dataset, the model, and the results.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.rights: none
dc.subject: Fugenelement
dc.subject: Komposita
dc.subject: Neural Networks
dc.subject: Sequence Models
dc.subject: Computer science
dc.subject: Linguistics
dc.subject.other: Computer science and systems
dc.title: Predicting German Compound Words Using a Recurrent Neural Network
dc.type: Thesis
dc.embargo.terms: Open Access

