Learnability of Autoregressive Transformers

Hong, Jeongyeob

Files

Hong_washington_0250O_29153.pdf (686.16 KB)

Date

2026-02-05

Abstract

This paper explores the learning mechanism of decoder-only Transformers through the lens of human concept learning. We investigate whether decoder-only Transformers exhibit simplicity bias, the human tendency to favor simpler representations. To do so, we build a pipeline that generates every task a decoder-only Transformer can learn and express for given input symbols, length, and depth. Our initial results do not provide sufficient evidence of simplicity bias in autoregressive models. We conclude with a discussion of other factors that may explain the learnability of Transformers, such as the computational cost of each operation.