Socially Responsible and Factual Reasoning for Equitable AI Systems

dc.contributor.advisor: Choi, Yejin
dc.contributor.advisor: Roesner, Franziska
dc.contributor.author: Gabriel, Saadia
dc.date.accessioned: 2023-09-27T17:19:16Z
dc.date.issued: 2023-09-27
dc.date.submitted: 2023
dc.description: Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract: Through natural language communication, writers have enormous persuasive power over readers. This can have broad-reaching positive societal impact, as in the case of social movements (e.g., the Black Lives Matter movement and protests against anti-Asian hate); however, there are severe negative ramifications when communication is used with malintent (e.g., to directly inflict harm through hate speech or to mislead). The ability to read between the lines of what is explicitly stated and to adapt to dynamic social contexts is critical to detecting false or harmful text. However, existing deep learning approaches still have limited generalization and commonsense reasoning capabilities. To expand machine reasoning capabilities, we propose theoretical formalisms to measure the intent, factuality, and social bias of language. We first introduce reaction frames, which allow us to distill knowledge of cognitive and physical effects on readers, such as implied actions (e.g., given the false statement "Water boiled with garlic cures coronavirus," we can infer that the writer is compelling an audience to "drink garlic water"). We find that while neural misinformation detection classifiers are highly capable of distinguishing between truthful and false content, these models are challenged by commonsense implications derived using our neuro-symbolic approach. We discuss how a major bottleneck comes from the inability of neural models to correctly interpret meaning, particularly when this pertains to the plausibility of claims. We conduct a meta-evaluation to test the efficacy of factuality metrics and show that the evaluation used for generation is ill-suited to benchmarking progress in learning factuality. This study pinpoints specific failure cases of metrics and underlying models, outlining future directions for factuality evaluation.
Finally, we show how, despite their limitations, large pretrained language models like GPT-3 can be used to mitigate dataset bias in existing hate speech corpora. We use adversarial generation approaches to better align classifiers with human interpretations of toxicity and to mitigate potentially harmful vulnerabilities in classifiers. As future work, we discuss the need for a proactive, community-driven approach to reducing online harms.
dc.embargo.lift: 2024-09-26T17:19:16Z
dc.embargo.terms: Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Gabriel_washington_0250E_25995.pdf
dc.identifier.uri: http://hdl.handle.net/1773/50762
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Fact-checking
dc.subject: Hate speech
dc.subject: Misinformation
dc.subject: NLP
dc.subject: Toxicity
dc.subject: Artificial intelligence
dc.subject: Computer science
dc.subject.other: Computer science and engineering
dc.title: Socially Responsible and Factual Reasoning for Equitable AI Systems
dc.type: Thesis

Files

Original bundle

Name: Gabriel_washington_0250E_25995.pdf
Size: 678.34 KB
Format: Adobe Portable Document Format