Learnability of Autoregressive Transformers
| dc.contributor.advisor | Steinert-Threlkeld, Shane | |
| dc.contributor.author | Hong, Jeongyeob | |
| dc.date.accessioned | 2026-02-05T19:37:29Z | |
| dc.date.available | 2026-02-05T19:37:29Z | |
| dc.date.issued | 2026-02-05 | |
| dc.date.submitted | 2025 | |
| dc.description | Thesis (Master's)--University of Washington, 2025 | |
| dc.description.abstract | This paper explores the learning mechanism of a decoder-only Transformer through the lens of human concept learning. We investigate whether decoder-only Transformers exhibit a simplicity bias, the human tendency to favor simpler representations. To do so, we create a pipeline that generates every task a decoder-only Transformer can learn and express for a given input symbol set, length, and depth. Our initial results show no sufficient evidence of a simplicity bias in autoregressive models. We end with a discussion of other factors that may explain the learnability of Transformers, such as the computational cost of each operation. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Hong_washington_0250O_29153.pdf | |
| dc.identifier.uri | https://hdl.handle.net/1773/55251 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY | |
| dc.subject | Linguistics | |
| dc.subject | Cognitive psychology | |
| dc.subject | Computer science | |
| dc.subject.other | Linguistics | |
| dc.title | Learnability of Autoregressive Transformers | |
| dc.type | Thesis |
Files
Original bundle
- Name: Hong_washington_0250O_29153.pdf
- Size: 686.16 KB
- Format: Adobe Portable Document Format
