Training Models to Ignore Dataset Bias
| dc.contributor.advisor | Zettlemoyer, Luke | |
| dc.contributor.author | Clark, Christopher Andreas | |
| dc.date.accessioned | 2020-10-26T20:41:04Z | |
| dc.date.available | 2020-10-26T20:41:04Z | |
| dc.date.issued | 2020-10-26 | |
| dc.date.submitted | 2020 | |
| dc.description | Thesis (Ph.D.)--University of Washington, 2020 | |
| dc.description.abstract | Modern machine learning algorithms have achieved impressive results on complex tasks such as language comprehension and image understanding. However, recent work has cautioned that this success is often partially due to exploiting incidental correlations that were introduced during dataset creation and are not fundamental to the target task. For example, sentence entailment datasets can have spurious word-class correlations if nearly all contradiction sentences contain the word "not", and image recognition datasets can have tell-tale object-background correlations if dogs are always pictured indoors. Models that exploit these incidental correlations, which we call dataset bias, can be brittle and perform poorly on out-of-domain examples. In this thesis, we present several methods of solving this issue by preventing models from using dataset bias. A key challenge for this task is determining which predictive patterns in the training data constitute bias. This thesis proposes several solutions, ranging from methods that exploit domain expertise when such knowledge is available, to more broadly applicable domain-general solutions. Solving this task also requires preventing complex neural models from exploiting these biased patterns, even though they are often easy to learn and effective on the training data. We present ensembling- and data-augmentation-based methods to handle this difficulty. In all cases, we evaluate our models by showing improved performance on out-of-domain datasets that were built to penalize biased models. Our first focus is on question answering, motivated by the observation that biases can lead to poor performance when a model is applied to multiple paragraphs. To solve this task, we propose a modified training scheme that exposes the model to additional paragraphs that do not answer the question. We then consider the case where expert knowledge of the bias can be used to construct a *bias-only* model that captures the biased patterns. In this case, we can build an unbiased model by ensembling it with the bias-only model during training, which disincentivizes the main model from learning the bias. Finally, we generalize this approach by proposing a method to automatically construct the bias-only model when no such expert knowledge is available. Overall, this thesis shows that it is possible to train unbiased models on biased datasets, and proposes some fundamental answers to questions about how bias can be detected and avoided. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Clark_washington_0250E_22161.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/46425 | |
| dc.language.iso | en_US | |
| dc.rights | none | |
| dc.subject | computer vision | |
| dc.subject | dataset bias | |
| dc.subject | machine learning | |
| dc.subject | natural language processing | |
| dc.subject | Artificial intelligence | |
| dc.subject.other | Computer science and engineering | |
| dc.title | Training Models to Ignore Dataset Bias | |
| dc.type | Thesis |
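The abstract describes ensembling a main model with a bias-only model during training so the main model is not rewarded for re-learning the bias. A common way to realize this is a product-of-experts combination, and the sketch below illustrates that idea; it is a minimal illustration, not the thesis's exact implementation, and the function name, shapes, and use of PyTorch are assumptions.

```python
import torch
import torch.nn.functional as F

def product_of_experts_loss(main_logits, bias_logits, labels):
    """Train the main model through an ensemble with a frozen bias-only model."""
    # Combine the two models multiplicatively in probability space
    # (additively in log space); detach() blocks gradients through the
    # bias-only model, so only the main model is updated.
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits, dim=-1).detach()
    # cross_entropy renormalizes internally, so the combined scores act as logits.
    return F.cross_entropy(combined, labels)

# Hypothetical usage: a batch of 8 examples over 3 classes (e.g. entailment labels).
main_logits = torch.randn(8, 3, requires_grad=True)  # stands in for the main model's output
bias_logits = torch.randn(8, 3)                      # stands in for the bias-only model's output
labels = torch.randint(0, 3, (8,))
loss = product_of_experts_loss(main_logits, bias_logits, labels)
loss.backward()  # gradients reach only main_logits
```

Because the bias-only model already accounts for the biased patterns, the main model can lower the training loss only by capturing the remaining, presumably unbiased, signal; at test time the main model is used alone.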
Files
Original bundle
- Name: Clark_washington_0250E_22161.pdf
- Size: 9.81 MB
- Format: Adobe Portable Document Format
