Explorations In Curriculum Learning Methods For Training Language Models
| dc.contributor.advisor | Steinert-Threlkeld, Shane | |
| dc.contributor.author | Campos, Daniel | |
| dc.date.accessioned | 2021-03-19T22:56:06Z | |
| dc.date.available | 2021-03-19T22:56:06Z | |
| dc.date.issued | 2021-03-19 | |
| dc.date.submitted | 2020 | |
| dc.description | Thesis (Master's)--University of Washington, 2020 | |
| dc.description.abstract | Understanding language depending on the context of its usage has always been one of the core goals of natural language processing. Recently, contextual word representations created by language models like ELMo, BERT, ELECTRA, and RoBERTa have provided robust representations of natural language, which serve as the language understanding component for a diverse range of downstream tasks like information retrieval, question answering, and information extraction. Curriculum learning is a method that employs a structured training regime instead of the traditional random sampling. Research areas like computer vision and machine translation have used curriculum learning methods to improve training speed and model performance. While language models have proven transformational for the natural language processing community, they have also proven expensive, energy-intensive, and challenging to train, which has inspired researchers to explore new training methods. In this thesis, we explore the effect of curriculum learning on the training of language models. Using the WikiText-2 and WikiText-103 textual datasets and evaluating word representation transfer learning on the GLUE Benchmark, we find that curriculum learning methods produce models that outperform their traditionally trained counterparts when the training corpus is small, but as the training corpora scale, curriculum methods become less effective than traditional stochastic sampling. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Campos_washington_0250O_22318.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/46825 | |
| dc.language.iso | en_US | |
| dc.rights | none | |
| dc.subject | Curriculum Learning | |
| dc.subject | Deep Learning | |
| dc.subject | Language Model | |
| dc.subject | Computer science | |
| dc.subject | Language | |
| dc.subject.other | Linguistics | |
| dc.title | Explorations In Curriculum Learning Methods For Training Language Models | |
| dc.type | Thesis | |
Files
Original bundle
- Name: Campos_washington_0250O_22318.pdf
- Size: 1.1 MB
- Format: Adobe Portable Document Format
