Predicting Operational Lifetimes in Hybrid Perovskite Solar Cells: A Case Study of Machine Learning with Small Datasets

Sunkari, Preetham Paul

dc.contributor.advisor	Hillhouse, Hugh W.
dc.contributor.author	Sunkari, Preetham Paul
dc.date.accessioned	2025-08-01T22:17:46Z
dc.date.available	2025-08-01T22:17:46Z
dc.date.issued	2025-08-01
dc.date.submitted	2025
dc.description	Thesis (Ph.D.)--University of Washington, 2025
dc.description.abstract	Mixed organic-inorganic halide perovskite solar cells (PSCs) have emerged as promising candidates for photovoltaic applications due to their exceptional optoelectronic properties, high defect tolerance, low cost, and ease of fabrication. However, their instability under environmental stress remains a major barrier to commercial viability. Recent studies have focused on understanding the degradation mechanisms in these solar cells under varied environmental conditions to develop predictive models for operational lifetimes. Given the limited understanding of these mechanisms, data-driven approaches, particularly those leveraging machine learning (ML) tools guided by domain knowledge, have become central to lifetime prediction. However, due to the significant time and cost associated with generating the requisite data on PSC degradation, available datasets are typically small, often containing only hundreds, or even just dozens, of meticulously gathered trials. This scarcity of data is a common challenge not only in PSC research but also in other experimental fields within chemistry and materials science, limiting the effectiveness of popular data-hungry ML techniques such as neural networks. In this work, I present a case study using in-house PSC degradation datasets to explore the challenges and strategies of modeling with small data. First, I outline criteria for identifying when a dataset falls within the small data regime or is considered too small for reliable predictive modeling. I then review recent ML advances tailored to such contexts and present a modeling workflow that includes feature construction, feature selection, assessment of metrics (such as sparsity level, false discovery, and ground-truth recovery), and uncertainty quantification. Using simulated datasets generated from known ground-truth variables, I demonstrate that prediction error alone is not always the most reliable metric for feature selection. These simulations, designed to reflect real-world scientific conditions, yield heuristic insights that are then applied to real PSC datasets. Overall, with the help of the in-house PSC degradation datasets, I present a practical framework to guide researchers in selecting appropriate machine learning methods based on data availability. This work aims to educate scientists and engineers on statistically rigorous modeling techniques for small datasets.
dc.embargo.terms	Open Access
dc.format.mimetype	application/pdf
dc.identifier.other	Sunkari_washington_0250E_28412.pdf
dc.identifier.uri	https://hdl.handle.net/1773/53447
dc.language.iso	en_US
dc.rights	CC BY
dc.subject	Feature Selection
dc.subject	Machine Learning
dc.subject	Perovskites
dc.subject	Small data
dc.subject	Solar Cells
dc.subject	Chemical engineering
dc.subject	Materials Science
dc.subject	Statistics
dc.subject.other	Chemical engineering
dc.title	Predicting Operational Lifetimes in Hybrid Perovskite Solar Cells: A Case Study of Machine Learning with Small Datasets
dc.type	Thesis

Files

Original bundle

Name: Sunkari_washington_0250E_28412.pdf
Size: 8.82 MB
Format: Adobe Portable Document Format

Collections

Chemical engineering