Solvation Meta Predictor

dc.contributor.advisor: Beck, David
dc.contributor.author: Amadu Somah, Annatu
dc.date.accessioned: 2025-08-01T22:17:48Z
dc.date.available: 2025-08-01T22:17:48Z
dc.date.issued: 2025-08-01
dc.date.submitted: 2025
dc.description: Thesis (Master's)--University of Washington, 2025
dc.description.abstract: Accurately predicting the aqueous solubility of organic molecules is essential in a wide range of scientific and industrial domains, including drug development, food, and energy storage. This study builds on prior work by Panapitiya et al. by introducing a multi-stage ensemble learning framework that improves the predictive performance of solubility models on the SOMAS dataset. The dataset comprises 11,696 molecules with diverse structural and physicochemical properties, including 2D, 3D, and quantum descriptors. Three base models (a Molecular Descriptor Model (MDM), a Graph Neural Network (GNN), and a SMILES model developed by Panapitiya et al.) were used and evaluated with RMSE, MAE, R², and Spearman correlation. Among the individual models, MDM achieved the strongest performance, but ensemble methods consistently outperformed standalone models. Simple averaging improved predictive accuracy, and Optuna-based optimization of the ensemble weights yielded the best overall results. Additionally, a Mixture of Experts (MoE) architecture was implemented to weight the model outputs dynamically based on structural input features, demonstrating strong performance and scalability. This work highlights the value of combining diverse molecular representations with advanced ensemble techniques, providing a robust, adaptive framework for high-accuracy solubility prediction and future data-driven molecular design.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: AmaduSomah_washington_0250O_28555.pdf
dc.identifier.uri: https://hdl.handle.net/1773/53454
dc.language.iso: en_US
dc.rights: none
dc.subject: Machine Learning
dc.subject: Solubility
dc.subject: Chemical engineering
dc.subject.other: Chemical engineering
dc.title: Solvation Meta Predictor
dc.type: Thesis
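
The abstract contrasts simple averaging of three base models with an ensemble whose weights are optimized. The sketch below illustrates that idea only; it is not the thesis's implementation. It assumes three prediction lists standing in for the MDM, GNN, and SMILES models, uses made-up log-solubility values, and swaps Optuna for a coarse grid search over non-negative weights that sum to 1. The function names (`rmse`, `weighted_ensemble`, `grid_search_weights`) are hypothetical.

```python
import math
from itertools import product


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def weighted_ensemble(preds, weights):
    """Combine per-model prediction lists using one weight per model."""
    n = len(preds[0])
    return [sum(w * p[i] for w, p in zip(weights, preds)) for i in range(n)]


def grid_search_weights(y_true, preds, steps=30):
    """Coarse search over three non-negative weights that sum to 1.

    A stand-in for a proper optimizer; assumes exactly three models.
    """
    best_w, best_err = None, float("inf")
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue  # the third weight would be negative
        w = (a / steps, b / steps, (steps - a - b) / steps)
        err = rmse(y_true, weighted_ensemble(preds, w))
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Made-up values for illustration only (not taken from the SOMAS dataset).
y_true = [-2.0, -3.5, -1.0, -4.2, -0.5]
mdm = [-2.1, -3.3, -1.2, -4.0, -0.6]
gnn = [-1.7, -3.9, -0.8, -4.6, -0.2]
smiles = [-2.4, -3.0, -1.5, -3.8, -1.0]
preds = [mdm, gnn, smiles]

avg_rmse = rmse(y_true, weighted_ensemble(preds, (1 / 3, 1 / 3, 1 / 3)))
best_w, best_rmse = grid_search_weights(y_true, preds)

print(f"simple average RMSE: {avg_rmse:.4f}")
print(f"best weights: {best_w}, RMSE: {best_rmse:.4f}")
```

Because the grid contains both the equal-weight point and each single-model corner, the searched weights can never score worse than simple averaging or any individual model on the data they were fitted to. In practice the weights would be fitted on a validation split and reported on a separate test split to avoid an optimistic estimate.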

Files

Original bundle

Name: AmaduSomah_washington_0250O_28555.pdf
Size: 714.55 KB
Format: Adobe Portable Document Format