Socially Responsible and Factual Reasoning for Equitable AI Systems

dc.contributor.advisor: Choi, Yejin
dc.contributor.advisor: Roesner, Franziska
dc.contributor.author: Gabriel, Saadia
dc.date.accessioned: 2023-09-27T17:19:16Z
dc.date.issued: 2023-09-27
dc.date.submitted: 2023
dc.description: Thesis (Ph.D.)--University of Washington, 2023
dc.description.abstract: Through natural language communication, writers have enormous persuasive power over readers. This can have broad-reaching positive societal impact, as in the case of social movements (e.g., the Black Lives Matter movement and protests against anti-Asian hate); however, there are severe negative ramifications when communication is used with malintent (e.g., to directly inflict harm through hate speech or to mislead). The ability to read between the lines of what is explicitly stated and to adapt to dynamic social contexts is critical to detecting false or harmful text. However, existing deep learning approaches still have limited generalization and commonsense reasoning capabilities. To expand machine reasoning capabilities, we propose theoretical formalisms to measure the intent, factuality, and social bias of language. We first introduce reaction frames, which allow us to distill knowledge of cognitive and physical effects on readers, such as implied actions (e.g., given the false statement "Water boiled with garlic cures coronavirus," we can infer that the writer is compelling an audience to "drink garlic water"). We find that while neural misinformation detection classifiers are highly capable of distinguishing between truthful and false content, these models are challenged by commonsense implications derived using our neuro-symbolic approach. We discuss how a major bottleneck comes from the inability of neural models to correctly interpret meaning, particularly when this pertains to the plausibility of claims. We conduct a meta-evaluation to test the efficacy of factuality metrics and show that the evaluation used for generation is ill-suited to benchmarking progress in learning factuality. This study pinpoints specific failure cases of metrics and underlying models, outlining future directions for factuality evaluation.
Finally, we show how, despite their limitations, large pretrained language models like GPT-3 can be used to mitigate dataset bias in existing hate speech corpora. We use adversarial generation approaches to better align classifiers with human interpretations of toxicity and to mitigate potentially harmful vulnerabilities in classifiers. As future work, we discuss the need for a proactive, community-driven approach to reducing online harms.
dc.embargo.lift: 2024-09-26T17:19:16Z
dc.embargo.terms: Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Gabriel_washington_0250E_25995.pdf
dc.identifier.uri: http://hdl.handle.net/1773/50762
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Fact-checking
dc.subject: Hate speech
dc.subject: Misinformation
dc.subject: NLP
dc.subject: Toxicity
dc.subject: Artificial intelligence
dc.subject: Computer science
dc.subject.other: Computer science and engineering
dc.title: Socially Responsible and Factual Reasoning for Equitable AI Systems
dc.type: Thesis

Files

Original bundle

Name: Gabriel_washington_0250E_25995.pdf
Size: 678.34 KB
Format: Adobe Portable Document Format