The Use of Natural Language Processing and Machine Learning for Early Diagnosis of Lung and Ovarian Cancer

Authors

TURNER, GRACE

Abstract

Cancer is a serious diagnosis, and diagnostic delay is correlated with reductions in survival rates following treatment. For many cancers, providers can rely only on symptoms and signs to diagnose patients. These details are recorded primarily in free-text clinical notes. Natural language processing (NLP) can be used to extract symptoms and signs from these notes for population-level diagnostic screening. This creates an opportunity for machine learning to alert providers earlier in the diagnostic process using existing but easily overlooked information. Thus, the focus of this thesis was to determine opportunities for reducing diagnostic delay in ovarian and lung cancer. A symptom extraction model trained on a primarily COVID-19 population was adapted to lung and ovarian cancer populations. The model then extracted symptoms and signs from a retrospective case-control study (ovarian) developed as part of this work, as well as from a leveraged existing study (lung). Symptom frequencies for ovarian cancer were then explored across different routes to diagnosis. Finally, this thesis developed experiments using machine learning models to predict lung and ovarian cancer prior to diagnosis. This work showed that early prediction using symptoms was possible only in the lung cohort. Nevertheless, in both cohorts, cases had significantly more “next step” recommendations than controls, even 6 months prior to diagnosis.

Description

Thesis (Master's)--University of Washington, 2022
