Applying Compositional Data Analysis Methods To Complete Blood-Counts Data For Early COVID-19 Detection

dc.contributor.advisor: Graffelman, Jan
dc.contributor.author: Zhang, Zhilong
dc.date.accessioned: 2025-08-01T22:16:58Z
dc.date.issued: 2025-08-01
dc.date.submitted: 2025
dc.description: Thesis (Master's)--University of Washington, 2025
dc.description.abstract: The use of complete blood-count (CBC) data for diagnosing COVID-19 infection has attracted interest in recent years. CBC analysis could serve as an affordable complement to RT-qPCR and is particularly useful in developing areas that struggle to test their populations or suffer from widespread COVID-19 infection. However, previous research on using CBC data for COVID-19 classification did not account for the compositional nature of white blood cell count data. In this master's thesis, we treat white blood cell counts as compositional variables and visualize them with biplots based on log-ratio principal component analysis. We also apply compositional classification models, using log-ratio linear discriminant analysis, to detect COVID-19 infection. We illustrate the efficacy of compositional methods by building a compositional classification model that outperforms traditional models, and we highlight the benefits of analyzing CBC data from a compositional perspective. In a database of symptomatic individuals, we achieve a classification rate of 85% for the PCR-test result using the main CBC composition together with some additional blood characteristics.
dc.embargo.lift: 2026-08-01T22:16:58Z
dc.embargo.terms: Restrict to UW for 1 year -- then make Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Zhang_washington_0250O_28047.pdf
dc.identifier.uri: https://hdl.handle.net/1773/53419
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: CBC analysis
dc.subject: Compositional data
dc.subject: COVID-19
dc.subject: Log-ratio transform
dc.subject: Multivariate analysis
dc.subject: Principal component analysis
dc.subject: Biostatistics
dc.subject.other: Biostatistics
dc.title: Applying Compositional Data Analysis Methods To Complete Blood-Counts Data For Early COVID-19 Detection
dc.type: Thesis
