Analyzing small molecule inhibition of enzymes: A preliminary machine learning approach towards drug lead generation

dc.contributor.advisor: Beck, David A.C.
dc.contributor.author: Philip, Pearl
dc.date.accessioned: 2017-08-11T22:52:14Z
dc.date.available: 2017-08-11T22:52:14Z
dc.date.issued: 2017-08-11
dc.date.submitted: 2017-06
dc.description: Thesis (Master's)--University of Washington, 2017-06
dc.description.abstract: This project implements quantitative structure-activity relationship (QSAR) models in Python to predict the inhibitory action of small-molecule drugs on USP1, an enzyme essential to DNA repair in proliferating cancer cells. Molecular descriptors are calculated using PyChem and used to characterize the properties of about 400,000 drug-like compounds from a high-throughput screening assay made available on PubChem. After feature selection and processing, multiple machine learning models are built on the training data using Scikit-learn and Theano, followed by a genetic algorithm to synthesize an ideal enzyme inhibitor to be tested for activity and use as a drug compound. Higher error and poorer model fits can be attributed to multiple sources of error: measurement of activity using AC50, an imbalanced dataset in favor of molecules with zero inhibition, an incomplete feature space, highly non-linear interactions between the enzyme and drug, and the attainment of local minima in hyperparameter optimization. Solutions are suggested for each of these issues and proposed as part of future work. The genetic algorithm is used to synthesize a molecule in silico, and as the model's prediction accuracy improves, that molecule can be pursued as a drug lead in clinical trials. This project provides a promising pipeline for future work in open-source molecular drug design and can be extended for use with other datasets and target species.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Philip_washington_0250O_17372.pdf
dc.identifier.uri: http://hdl.handle.net/1773/39980
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: data science
dc.subject: drug design
dc.subject: genetic algorithms
dc.subject: machine learning
dc.subject: molecule
dc.subject: QSAR
dc.subject: Pharmaceutical sciences
dc.subject: Pharmacology
dc.subject: Information science
dc.subject.other: Chemical engineering
dc.title: Analyzing small molecule inhibition of enzymes: A preliminary machine learning approach towards drug lead generation
dc.type: Thesis

Files

Original bundle

Name: Philip_washington_0250O_17372.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format