
dc.contributor.advisor: Steinert-Threlkeld, Shane
dc.contributor.author: Campos, Daniel
dc.date.accessioned: 2021-03-19T22:56:06Z
dc.date.available: 2021-03-19T22:56:06Z
dc.date.submitted: 2020
dc.identifier.other: Campos_washington_0250O_22318.pdf
dc.identifier.uri: http://hdl.handle.net/1773/46825
dc.description: Thesis (Master's)--University of Washington, 2020
dc.description.abstract: Understanding language in the context of its usage has always been one of the core goals of natural language processing. Recently, contextual word representations created by language models such as ELMo, BERT, ELECTRA, and RoBERTa have provided robust representations of natural language that serve as the language-understanding component for a diverse range of downstream tasks, including information retrieval, question answering, and information extraction. Curriculum learning is a method that employs a structured training regime in place of traditional random sampling. Fields such as computer vision and machine translation have used curriculum learning to improve training speed and model performance. While language models have proven transformational for the natural language processing community, they are expensive, energy-intensive, and challenging to train, which has inspired researchers to explore new training methods. In this thesis, we explore the effect of curriculum learning on the training of language models. Using the WikiText-2 and WikiText-103 datasets and evaluating transfer of the learned word representations on the GLUE benchmark, we find that curriculum learning methods produce models that outperform their traditionally trained counterparts when the training corpus is small, but as the training corpora scale, curriculum methods become less effective than traditional stochastic sampling.
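
As a rough illustration of the contrast the abstract draws, the sketch below implements one common curriculum scheme, competence-based pacing: examples are ordered by a surrogate difficulty measure (sentence length here) and the sampling pool grows over training, versus the traditional baseline of uniform random sampling. The pacing schedule, difficulty heuristic, and function names are assumptions chosen for illustration; they are not the specific methods evaluated in the thesis.

    import random

    def curriculum_batches(corpus, num_steps, batch_size, difficulty=len):
        # Sort once from "easy" to "hard" under the chosen difficulty heuristic.
        ordered = sorted(corpus, key=difficulty)
        for step in range(1, num_steps + 1):
            # Competence grows linearly toward 1, unlocking more of the data;
            # the pool is clamped so it always holds at least one full batch.
            competence = step / num_steps
            pool = ordered[: max(batch_size, int(competence * len(ordered)))]
            yield random.sample(pool, batch_size)

    def random_batches(corpus, num_steps, batch_size):
        # Traditional baseline: uniform sampling over the full corpus throughout.
        for _ in range(num_steps):
            yield random.sample(corpus, batch_size)

    if __name__ == "__main__":
        # Toy corpus whose difficulty is simply its token count.
        toy_corpus = [("word " * n).strip() for n in range(1, 101)]
        for step, batch in enumerate(curriculum_batches(toy_corpus, 5, 4), 1):
            print(step, sorted(len(s.split()) for s in batch))

In this toy run, early batches draw only from the shortest sentences while later batches can include the longest, mirroring the structured regime the abstract contrasts with stochastic sampling.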
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.rights: none
dc.subject: Curriculum Learning
dc.subject: Deep Learning
dc.subject: Language Model
dc.subject: Computer science
dc.subject: Language
dc.subject.other: Linguistics
dc.title: Explorations In Curriculum Learning Methods For Training Language Models
dc.type: Thesis
dc.embargo.terms: Open Access

