Language Models can Generalize from Indirect Evidence: Evidence from Filtered Corpus Training (FICT)
| dc.contributor.advisor | Steinert-Threlkeld, Shane | |
| dc.contributor.author | Patil, Abhinav | |
| dc.date.accessioned | 2024-09-09T23:12:02Z | |
| dc.date.available | 2024-09-09T23:12:02Z | |
| dc.date.issued | 2024-09-09 | |
| dc.date.submitted | 2024 | |
| dc.description | Thesis (Master's)--University of Washington, 2024 | |
| dc.description.abstract | This thesis introduces Filtered Corpus Training, a method that trains language models (LMs) on corpora from which certain linguistic constructions have been filtered, and uses it to measure the ability of LMs to perform linguistic generalization on the basis of indirect evidence. Applying the method to both LSTM and Transformer LMs of roughly comparable size, we develop corpora filtered of direct evidence for a wide range of linguistic phenomena. Our results show that while transformers are better qua LMs (as measured by perplexity), both models perform equally and surprisingly well on linguistic generalization measures, suggesting that they are capable of generalizing from indirect evidence. This adds to a growing body of evidence on the limitations of perplexity as an evaluation metric, while also showing that direct attestation may not be strictly necessary for learners to develop the appropriate linguistic generalizations. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Patil_washington_0250O_27109.pdf | |
| dc.identifier.uri | https://hdl.handle.net/1773/52074 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY-SA | |
| dc.subject | Filtered Corpus Training | |
| dc.subject | Inductive Biases | |
| dc.subject | Language Model Evaluation | |
| dc.subject | Linguistic Generalization | |
| dc.subject | Poverty of the Stimulus | |
| dc.subject | Targeted Syntactic Evaluations | |
| dc.subject | Linguistics | |
| dc.subject | Artificial intelligence | |
| dc.subject | Computer science | |
| dc.subject.other | Linguistics | |
| dc.title | Language Models can Generalize from Indirect Evidence: Evidence from Filtered Corpus Training (FICT) | |
| dc.type | Thesis |
Files
- Name: Patil_washington_0250O_27109.pdf
- Size: 2.12 MB
- Format: Adobe Portable Document Format
