Training Models to Ignore Dataset Bias
| dc.contributor.advisor | Zettlemoyer, Luke | |
| dc.contributor.author | Clark, Christopher Andreas | |
| dc.date.accessioned | 2020-10-26T20:41:04Z | |
| dc.date.available | 2020-10-26T20:41:04Z | |
| dc.date.issued | 2020-10-26 | |
| dc.date.submitted | 2020 | |
| dc.description | Thesis (Ph.D.)--University of Washington, 2020 | |
| dc.description.abstract | Modern machine learning algorithms have achieved impressive results on complex tasks such as language comprehension and image understanding. However, recent work has cautioned that this success is often partially due to exploiting incidental correlations that were introduced during dataset creation and are not fundamental to the target task. For example, sentence entailment datasets can have spurious word-class correlations if nearly all contradiction sentences contain the word "not", and image recognition datasets can have tell-tale object-background correlations if dogs are always pictured indoors. Models that exploit these incidental correlations, which we call dataset bias, can be brittle and perform poorly on out-of-domain examples. In this thesis, we present several methods of solving this issue by preventing models from using dataset bias. A key challenge for this task is determining which predictive patterns in the training data constitute bias. This thesis proposes several solutions, ranging from methods that exploit domain expertise when such knowledge is available, to more broadly applicable domain-general solutions. Solving this task also requires preventing complex neural models from exploiting these biased patterns, even though they are often easy to learn and effective on the training data. We present ensembling- and data-augmentation-based methods to handle this difficulty. In all cases, we evaluate our models by showing improved performance on out-of-domain datasets that were built to penalize biased models. Our first focus is on question answering, motivated by the observation that biases can lead to poor performance when a model is applied to multiple paragraphs. To solve this task, we propose a modified training scheme that exposes the model to additional paragraphs that do not answer the question. We then consider the case where expert knowledge of the bias can be used to construct a *bias-only* model that captures the biased patterns. In this case, we can build an unbiased model by ensembling it with the bias-only model during training, which disincentivizes the main model from learning the bias. Finally, we generalize this approach by proposing a method to automatically construct the bias-only model when no such expert knowledge is available. Overall, this thesis shows that it is possible to train unbiased models on biased datasets, and proposes some fundamental answers to questions about how bias can be detected and avoided. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Clark_washington_0250E_22161.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/46425 | |
| dc.language.iso | en_US | |
| dc.rights | none | |
| dc.subject | computer vision | |
| dc.subject | dataset bias | |
| dc.subject | machine learning | |
| dc.subject | natural language processing | |
| dc.subject | Artificial intelligence | |
| dc.subject.other | Computer science and engineering | |
| dc.title | Training Models to Ignore Dataset Bias | |
| dc.type | Thesis |
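The abstract describes ensembling a main model with a bias-only model during training so the main model is not rewarded for re-learning the bias. A common way to realize this is a product-of-experts combination, and the sketch below illustrates that idea; it is a minimal illustration, not the thesis's exact implementation, and the function name, shapes, and use of PyTorch are assumptions.

```python
import torch
import torch.nn.functional as F

def product_of_experts_loss(main_logits, bias_logits, labels):
    """Train the main model through an ensemble with a frozen bias-only model."""
    # Combine the two models multiplicatively in probability space
    # (additively in log space); detach() blocks gradients through the
    # bias-only model, so only the main model is updated.
    combined = F.log_softmax(main_logits, dim=-1) + F.log_softmax(bias_logits, dim=-1).detach()
    # cross_entropy renormalizes internally, so the combined scores act as logits.
    return F.cross_entropy(combined, labels)

# Hypothetical usage: a batch of 8 examples over 3 classes (e.g. entailment labels).
main_logits = torch.randn(8, 3, requires_grad=True)  # stands in for the main model's output
bias_logits = torch.randn(8, 3)                      # stands in for the bias-only model's output
labels = torch.randint(0, 3, (8,))
loss = product_of_experts_loss(main_logits, bias_logits, labels)
loss.backward()  # gradients reach only main_logits
```

Because the bias-only model already accounts for the biased patterns, the main model can lower the training loss only by capturing the remaining, presumably unbiased, signal; at test time the main model is used alone.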
Files
Original bundle
- Name: Clark_washington_0250E_22161.pdf
- Size: 9.81 MB
- Format: Adobe Portable Document Format
