Using biomedical data to identify genetic variants that drive drug responses in Acute Myeloid Leukemia
Kauer, Nicole Maryellen
Acute Myeloid Leukemia (AML) is a heterogeneous, rapidly progressing cancer of the blood, with approximately 10,000 AML-related deaths reported annually in the United States. Patients with AML tend to carry genetic variations that can significantly affect drug sensitivity and treatment outcomes. Although large amounts of biomedical data have been generated to characterize AML, these genetic variations are not yet well understood. The promise of precision medicine is that knowledge of the genetic characteristics present within a cancer will enable better choices for therapy, so the development of individualized approaches to AML therapy using these data has great potential. In this thesis, we applied data science techniques to analyze AML biomedical data in collaboration with Dr. Pamela Becker, Hematology, UW-Seattle, with the goal of identifying genetic variants that drive drug responses in AML. We identified 30 novel gene-drug pairs with statistically significant responses. We also built both univariate and multivariate machine learning models, using Bayesian Model Averaging for multivariate feature selection, and found that the multivariate models outperformed the univariate models in most cases. Finally, we created an automated workflow for the analysis so that patient data can be added incrementally over the course of Dr. Becker's study.