dc.contributor.advisor: Dieleman, Joseph
dc.contributor.author: Horst, Cody J
dc.date.accessioned: 2017-10-26T20:45:11Z
dc.date.available: 2017-10-26T20:45:11Z
dc.date.submitted: 2017-08
dc.identifier.other: Horst_washington_0250O_17931.pdf
dc.identifier.uri: http://hdl.handle.net/1773/40423
dc.description: Thesis (Master's)--University of Washington, 2017-08
dc.description.abstract: Prescription pharmaceuticals are a vital component of personal healthcare and contribute significantly to total health expenditure in the United States. Currently, data on pharmaceutical use and expenditure generally precludes the attribution of specific pharmaceuticals to medical conditions, either because the pharmaceutical data and diagnostic data are unlinked or are linked in such a way as to preclude specific attribution. The ability to make this attribution would open up new datasets to analyses that require specific pharmaceutical-condition associations and would strengthen analyses performed using imprecisely associated data. Cost-of-illness studies in particular would benefit from better accounting of pharmaceutical claims. This work seeks to address this problem by constructing a multilabel logistic classifier and an LSTM-based recurrent neural network classifier, training them on the MarketScan© commercial claims data and considering their feasibility. Both models pick up trends in the data based on peak f1-score and individual evaluation of output, but while the logistic model is able to recreate logical associations between medical conditions, comorbidities, and the pharmaceuticals used to treat them, the recurrent neural network learns to produce the most common medical conditions. We conclude that in their current form, the models are not refined enough to be useful. However, this work was instructive in illuminating promising directions to improve the model architecture and data to better cope with noisy classification labels and aid in causal inference.
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.rights: none
dc.subject: Health sciences
dc.subject: Statistics
dc.subject.other: Global Health
dc.title: Predicting Medical Diagnoses from Pharmaceutical Claims Data
dc.type: Thesis
dc.embargo.terms: Open Access

