Analyzing small molecule inhibition of enzymes: A preliminary machine learning approach towards drug lead generation
| dc.contributor.advisor | Beck, David A.C. | |
| dc.contributor.author | Philip, Pearl | |
| dc.date.accessioned | 2017-08-11T22:52:14Z | |
| dc.date.available | 2017-08-11T22:52:14Z | |
| dc.date.issued | 2017-08-11 | |
| dc.date.submitted | 2017-06 | |
| dc.description | Thesis (Master's)--University of Washington, 2017-06 | |
| dc.description.abstract | This project implements quantitative structure-activity relationship (QSAR) models in Python to predict the inhibitory action of small-molecule drugs on USP1, an enzyme essential to DNA repair in proliferating cancer cells. Molecular descriptors are calculated using PyChem and used to characterize the properties of about 400,000 drug-like compounds from a high-throughput screening assay made available on PubChem. After feature selection and processing, multiple machine learning models are trained on the training data using Scikit-learn and Theano, followed by a genetic algorithm that synthesizes an ideal enzyme inhibitor to be tested for activity and use as a drug compound. The higher error and poorer model fits can be attributed to several sources: measurement of activity using AC50, a dataset imbalanced in favor of molecules with zero inhibition, an incomplete feature space, highly non-linear interactions between the enzyme and drug, and the attainment of local minima in hyperparameter optimization. Solutions are suggested for each of these issues and are proposed as part of future work. The genetic algorithm is used to synthesize a molecule in silico; as the model's prediction accuracy improves, this molecule can be pursued as a drug lead in clinical trials. This project provides a promising pipeline for future work in open-source molecular drug design and can be extended for use with other datasets and target species. | |
| dc.embargo.terms | Open Access | |
| dc.format.mimetype | application/pdf | |
| dc.identifier.other | Philip_washington_0250O_17372.pdf | |
| dc.identifier.uri | http://hdl.handle.net/1773/39980 | |
| dc.language.iso | en_US | |
| dc.rights | CC BY | |
| dc.subject | data science | |
| dc.subject | drug design | |
| dc.subject | genetic algorithms | |
| dc.subject | machine learning | |
| dc.subject | molecule | |
| dc.subject | QSAR | |
| dc.subject | Pharmaceutical sciences | |
| dc.subject | Pharmacology | |
| dc.subject | Information science | |
| dc.subject.other | Chemical engineering | |
| dc.title | Analyzing small molecule inhibition of enzymes: A preliminary machine learning approach towards drug lead generation | |
| dc.type | Thesis |
Files
Original bundle
- Name: Philip_washington_0250O_17372.pdf
- Size: 1.5 MB
- Format: Adobe Portable Document Format
