On the Statistical Significance Testing for Natural Language Processing

Zhu, Haotian

dc.contributor.advisor	Xia, Fei
dc.contributor.author	Zhu, Haotian
dc.date.accessioned	2020-04-30T17:43:47Z
dc.date.issued	2020-04-30
dc.date.submitted	2020
dc.description	Thesis (Master's)--University of Washington, 2020
dc.description.abstract	This thesis explores and compares, in several respects, the statistical significance tests frequently used to compare the performance of Natural Language Processing (NLP) systems. We begin by establishing the fundamentals of NLP system performance comparison and formulating it as four major tasks specific to NLP. Each statistical significance test is explained in detail, with its assumptions made explicit and its testing procedure outlined. We stress the importance of verifying test assumptions before conducting a test. In addition, we examine effect size and statistical power and discuss their importance in statistical significance testing for NLP. To account for potential dependencies within a test set, the block bootstrap is introduced and employed to calibrate statistical significance testing when comparing the average performance of two systems. Four case studies, using both simulated and real data whose dependency structures vary in complexity, illustrate how to properly use a statistical significance test to compare NLP system performance under different settings. We then proceed to a discussion from different perspectives, including open issues such as cross-domain comparison and violation of the i.i.d. assumption, which call for further study. In conclusion, this thesis advocates the proper use of statistical significance testing in comparing NLP system performance and the reporting of comparison results with greater transparency and completeness.
dc.embargo.lift	2021-04-30T17:43:47Z
dc.embargo.terms	Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Zhu_washington_0250O_21188.pdf
dc.identifier.uri	http://hdl.handle.net/1773/45512
dc.language.iso	en_US
dc.rights	none
dc.subject	Effect size
dc.subject	Power analysis
dc.subject	Significance testing
dc.subject	Linguistics
dc.subject	Statistics
dc.subject.other	Linguistics
dc.title	On the Statistical Significance Testing for Natural Language Processing
dc.type	Thesis

Files

Original bundle

Name: Zhu_washington_0250O_21188.pdf
Size: 730.31 KB
Format: Adobe Portable Document Format

Collections

Linguistics