Applying Compositional Data Analysis Methods To Complete Blood-Counts Data For Early COVID-19 Detection

Abstract

Using complete blood-count (CBC) data to diagnose COVID-19 infection has been a topic of interest over the last few years. CBC analysis could serve as an affordable complement to RT-qPCR and is particularly useful for developing areas that struggle to test their population or face large-scale COVID-19 outbreaks. However, previous research on using CBC data for COVID-19 infection classification did not account for the compositional nature of white blood cell count data. In this master’s thesis, we treat white blood cell counts as compositional variables and apply compositional data visualization methods, using biplots based on log-ratio principal component analysis. We also apply compositional classification models to detect COVID-19 infection, using log-ratio linear discriminant analysis. We illustrate the efficacy of compositional methods by building a compositional classification model that outperforms traditional models, and we highlight the benefits of analyzing CBC data from a compositional perspective. In a database of symptomatic individuals, we achieve a classification rate of 85% for the PCR-test result using the main CBC composition together with some additional blood characteristics.
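
To make "compositional" concrete, here is a minimal sketch of the centered log-ratio (clr) transform, which commonly underlies log-ratio principal component analysis. It is not taken from the thesis: the function names and the white blood cell values are hypothetical, and the choice of clr rather than another log-ratio transform is an assumption. The sketch shows that the transform depends only on the ratios between parts, so absolute counts and relative shares give the same coordinates.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1 (relative shares)."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical differential counts (10^9 cells/L), illustrative values only:
# neutrophils, lymphocytes, monocytes, eosinophils, basophils
wbc_counts = [4.2, 1.8, 0.5, 0.15, 0.05]

clr_from_counts = clr(wbc_counts)
clr_from_shares = clr(closure(wbc_counts))

# Both vectors agree up to floating-point error: rescaling the counts
# does not change the log-ratio coordinates.
max_gap = max(abs(a - b) for a, b in zip(clr_from_counts, clr_from_shares))
```

The clr coordinates always sum to zero, so a classifier that inverts a covariance matrix would typically need a different log-ratio representation or a solver that tolerates collinear inputs.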

Description

Thesis (Master's)--University of Washington, 2025
