Seidler, Gerald T
Tetef, Samantha
2023-08-14
2023-08-14
2023-08-14
2023
Tetef_washington_0250E_25795.pdf
http://hdl.handle.net/1773/50527
Thesis (Ph.D.)--University of Washington, 2023
Data science and machine learning (ML) methods are revolutionizing scientific analysis and data processing. As a case in point, the application of ML to X-ray spectroscopies has recently exploded, showcasing its effectiveness in fields such as electrical energy storage and chemical catalysis. Here, I include comprehensive computational studies of ML techniques applied to X-ray spectra, including X-ray absorption near edge structure (XANES) and valence-to-core X-ray emission spectra (VtC-XES). First, I utilize unsupervised ML to extract important chemical fingerprints and information content in sulforganics and phosphorganics, elucidating important and sometimes surprising correlations between spectral content and elemental coordination or electronic structure. In this work, I compare different unsupervised ML techniques, namely principal component analysis (PCA), variational autoencoder (VAE), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP), and I find important benefits from the added flexibility of similarity-based manifold mappings. Additionally, I help develop open-source tools for future researchers to utilize, including an API that interacts with PubChem to efficiently download and store metadata. Next, I use ML to improve the reliability of data analysis and decrease computational time in the context of XANES imaging experiments. To do so, I utilize dimensionality reduction and clustering to perform image segmentation and then identify phase composition using linear combination fitting. By decoupling the domain identification from the phase identification, I provide a more robust way to handle noise that is not reliant on obtaining appropriate linear combination fitting results.
Moreover, my pipeline is flexible enough to uniquely incorporate auxiliary data or multimodal characterization measurements. Finally, I use feature selection to accelerate high-throughput experimental design. Specifically, I use Recursive Feature Elimination to select the most important energies in XANES spectra to measure in the context of a reference library. This approach is validated by appropriate analysis on these reduced-energy spectra using a nano-XANES image. All these approaches and tools are broadly applicable to X-ray spectroscopy on other chemical systems and are also likely to find use in other spectroscopy techniques.
application/pdf
en-US
CC BY
Machine Learning
X-ray Spectroscopy
Physics
Artificial intelligence
Physics
Information Content and Analysis of X-ray Absorption Spectroscopy and X-ray Emission Spectroscopy Using Machine Learning
Thesis