Privacy Preserving Machine Learning for Next Day Pain Prediction

Authors

Engavle, Ashutosh Vilas

Abstract

The availability of healthcare data is critically limited by stringent privacy regulations, ethical considerations, and the intrinsic sensitivity of medical information. This scarcity hampers research and development in medical science, ultimately affecting the advancement of healthcare services and patient outcomes. Synthetic data is a promising solution to this challenge, offering a way to improve data accessibility while safeguarding patient privacy. This thesis examines next-day pain prediction with machine learning models, the limited availability of patient data for training such models, and the potential of synthetic data to bridge this gap. By generating realistic, non-personal data that mimics the statistical properties of real healthcare datasets, synthetic data provides a viable alternative for research and analysis while mitigating privacy concerns. We apply synthetic data generation methods with and without privacy guarantees, and evaluate their effectiveness and utility for next-day pain prediction in patients with Juvenile Idiopathic Arthritis and lupus. We compare the utility of synthetic data with that of real data by training models on each kind of data and evaluating the trained models on real data. The findings indicate that machine learning models can predict next-day pain. We also find that marginal-based synthetic data generation methods can produce synthetic data with good utility and a substantial privacy guarantee for this task.
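
The evaluation protocol described in the abstract (train on real or synthetic data, then test on held-out real data) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the toy data and the names `pain_today`, `poor_sleep`, and `pain_tomorrow` are hypothetical, the generator is a simple stand-in that adds Laplace noise to two-way (feature, label) marginals, and the classifier is a majority-vote lookup table. The thesis's actual generators and models are not specified in this record.

```python
import random

FEATURES = ["pain_today", "poor_sleep"]  # hypothetical binary features
LABEL = "pain_tomorrow"                  # hypothetical binary label


def make_toy_data(n, rng):
    """Generate toy records for illustration; NOT real patient data."""
    rows = []
    for _ in range(n):
        pain = int(rng.random() < 0.4)
        sleep = int(rng.random() < 0.3)
        p = 0.1 + 0.6 * pain + 0.2 * sleep
        rows.append({"pain_today": pain, "poor_sleep": sleep,
                     LABEL: int(rng.random() < p)})
    return rows


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_marginal(rows, feature, epsilon, rng):
    """Noisy counts for every (feature value, label) cell, floored above zero."""
    counts = {(v, y): 0.0 for v in (0, 1) for y in (0, 1)}
    for r in rows:
        counts[(r[feature], r[LABEL])] += 1
    return {k: max(c + laplace(1 / epsilon, rng), 1e-9)
            for k, c in counts.items()}


def synthesize(rows, n, epsilon, rng):
    """Sample n records from noisy marginals (assumed stand-in generator)."""
    eps_each = epsilon / len(FEATURES)  # split the budget across marginals
    marginals = {f: noisy_marginal(rows, f, eps_each, rng) for f in FEATURES}
    first = marginals[FEATURES[0]]
    label_w = [first[(0, y)] + first[(1, y)] for y in (0, 1)]
    out = []
    for _ in range(n):
        y = rng.choices((0, 1), weights=label_w)[0]
        rec = {LABEL: y}
        for f in FEATURES:  # features sampled independently given the label
            w = [marginals[f][(0, y)], marginals[f][(1, y)]]
            rec[f] = rng.choices((0, 1), weights=w)[0]
        out.append(rec)
    return out


def train(rows):
    """Majority-label lookup table keyed by the feature tuple."""
    votes = {}
    for r in rows:
        key = tuple(r[f] for f in FEATURES)
        votes.setdefault(key, [0, 0])[r[LABEL]] += 1
    totals = [sum(v[y] for v in votes.values()) for y in (0, 1)]
    default = int(totals[1] > totals[0])
    table = {k: int(v[1] > v[0]) for k, v in votes.items()}
    return lambda r: table.get(tuple(r[f] for f in FEATURES), default)


def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(r) == r[LABEL] for r in rows) / len(rows)


rng = random.Random(0)
data = make_toy_data(2000, rng)
real_train, real_test = data[:1500], data[1500:]
synthetic = synthesize(real_train, len(real_train), epsilon=1.0, rng=rng)

acc_real = accuracy(train(real_train), real_test)   # train real, test real
acc_synth = accuracy(train(synthetic), real_test)   # train synthetic, test real
print(f"train on real: {acc_real:.3f}  train on synthetic: {acc_synth:.3f}")
```

Comparing the two accuracies on the same real test set is what makes the utility comparison fair: the synthetic data is judged by how well a model trained on it transfers to real records. The noise here is illustrative only; a production system would need a vetted privacy library rather than hand-rolled floating-point sampling.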

Description

Thesis (Master's)--University of Washington, 2024
