Evaluating Multi-Modal Data Fusion Approaches for Predictive Clinical Models Using Multiple Medical Data Domains

dc.contributor.advisor: Tarczy-Hornoch, Peter
dc.contributor.advisor: Hadlock, Jennifer
dc.contributor.author: Alipour, Ehsan
dc.date.accessioned: 2025-08-01T22:12:50Z
dc.date.available: 2025-08-01T22:12:50Z
dc.date.issued: 2025-08-01
dc.date.submitted: 2025
dc.description: Thesis (Ph.D.)--University of Washington, 2025
dc.description.abstract: Disease outcome prediction is a central research focus in biomedical informatics: it enables precision-health interventions and supports scientific discovery, for example by enabling digital clinical trials. Multimodal deep learning models have emerged as powerful tools in biomedical research, offering the ability to integrate diverse data sources such as clinical records, multi-omics data, imaging, survey responses, and wearable data to enhance predictive accuracy and deepen understanding of medical phenomena. Central to multimodal modeling is data fusion, the process by which information from different modalities is integrated into a unified model. Three primary fusion strategies exist in deep learning: early fusion (feature-level), intermediate fusion, and late fusion (decision-level). While widely adopted in other domains, their comparative performance and implementation considerations remain underexplored in biomedical applications, where data heterogeneity, missingness, and varying dimensionality present additional challenges. This dissertation evaluates the implications of data fusion strategies for developing multimodal predictive models in medicine. Across three distinct aims, I assess the impact of early, intermediate, and late fusion techniques on predictive performance, implementation complexity, and generalizability using diverse combinations of data types, outcomes, and modeling strategies. These studies span multiple datasets and outcome types (binary categorical variables vs. continuous ratio variables), providing a broad view of fusion strategy utility in real-world biomedical settings.
In Aim 1—Evaluation and comparison of early, intermediate, and late fusion techniques for combining exposure, clinical, and genomic data for a disease risk prediction task using All of Us: Risk of CKD in patients with type 2 diabetes—I evaluated and compared early, intermediate, and late fusion strategies for integrating longitudinal EHR, genomic, and survey data to predict chronic kidney disease (CKD) progression in patients with type 2 diabetes using a novel transformer-based multimodal architecture. Using data from the NIH's All of Us initiative, I trained models on a cohort of approximately 40,000 patients. While the best-performing unimodal model achieved a baseline AUROC of 0.73 (0.71–0.75), the inclusion of multimodal data offered only marginal improvement, with an AUROC of 0.74 (0.72–0.76); the benefit was limited to the early fusion approach and lacked statistical significance. This aim highlighted the challenges of integrating multimodal data of differing dimensionality with transformer models and emphasized the role of modality-specific relative predictive strength. In Aim 2—Development and assessment of the incremental value of combining a deep convolutional neural network feature extractor on imaging data with clinical data on a binary prediction task: Predicting post-surgical margin status in soft tissue sarcoma—I extended the fusion analysis to imaging data by combining a convolutional neural network (CNN) trained on longitudinal cross-sectional imaging with a shallow neural network trained on clinical and pathology variables to predict post-surgical margin status in patients with soft tissue sarcoma (n=202). Here, the intermediate fusion strategy significantly outperformed the other approaches, achieving an AUROC of 0.80 (0.66–0.95), suggesting that cross-modal interactions between histologic features and imaging embeddings may be best captured through intermediate fusion.
This result demonstrated the potential value of intermediate fusion when complementary signals exist across modalities. In Aim 3—Evaluation and comparison of early, intermediate, and late fusion techniques for combining imaging and clinical data on a regression prediction task: Estimation of CT-based body composition metrics from chest radiographs—I explored fusion strategies for estimating continuous CT-derived body composition metrics (e.g., visceral and subcutaneous fat volumes) using only chest radiographs and clinical variables in a dataset of 1,088 patients. A multitask multimodal model was developed and evaluated across early, intermediate, and late fusion strategies. Late fusion consistently delivered the best performance across most body composition metrics, closely followed by intermediate fusion. These results suggest that when individual modalities offer high independent predictive power, decision-level integration may be optimal for regression tasks. Collectively, these aims provide a broad evaluation of data fusion strategies in multimodal biomedical modeling, highlighting their strengths, limitations, and practical considerations. The findings suggest that no single fusion strategy universally outperforms the others; rather, the optimal fusion strategy depends on data characteristics, model architecture, and task-specific objectives. This dissertation lays the groundwork for future research aimed at developing adaptive fusion strategies tailored to the complexities of real-world biomedical data.
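The three fusion strategies compared in the abstract differ only in where modalities are combined along the pipeline. A minimal schematic sketch of that distinction follows; the toy feature dimensions, the fixed random linear "models," and the prediction-averaging rule for late fusion are illustrative assumptions, not the dissertation's actual transformer or CNN architectures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature vectors for two modalities (hypothetical dimensions).
x_clinical = rng.normal(size=8)   # e.g., tabular clinical/EHR features
x_imaging = rng.normal(size=16)   # e.g., an imaging embedding

def linear_head(x, out_dim=1, seed=0):
    """Stand-in for a trained model component: a fixed random linear map."""
    w = np.random.default_rng(seed).normal(size=(x.shape[0], out_dim))
    return x @ w

# Early fusion (feature-level): concatenate raw features from all
# modalities, then train one joint model on the combined vector.
early = linear_head(np.concatenate([x_clinical, x_imaging]), seed=1)

# Intermediate fusion: encode each modality into a latent representation
# first, then combine the latents and apply a joint prediction head.
z_clin = linear_head(x_clinical, out_dim=4, seed=2)
z_img = linear_head(x_imaging, out_dim=4, seed=3)
intermediate = linear_head(np.concatenate([z_clin, z_img]), seed=4)

# Late fusion (decision-level): train a separate predictor per modality
# and combine only their outputs (here, a simple average).
late = 0.5 * (linear_head(x_clinical, seed=5) + linear_head(x_imaging, seed=6))

print(early.shape, intermediate.shape, late.shape)  # each is (1,)
```

The trade-off the dissertation examines is visible even in this sketch: early fusion exposes the joint model to raw cross-modal interactions but must cope with mismatched dimensionality, intermediate fusion lets each encoder normalize its modality before interaction, and late fusion keeps the modalities fully independent until the decision, which helps when each is strongly predictive on its own.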
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Alipour_washington_0250E_28628.pdf
dc.identifier.uri: https://hdl.handle.net/1773/53329
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Data Fusion
dc.subject: Deep Learning
dc.subject: Informatics
dc.subject: Medical Predictive Models
dc.subject: Multimodal
dc.subject: Medicine
dc.subject: Computer science
dc.subject.other: Biomedical and health informatics
dc.title: Evaluating Multi-Modal Data Fusion Approaches for Predictive Clinical Models Using Multiple Medical Data Domains
dc.type: Thesis

Files

Original bundle

Name: Alipour_washington_0250E_28628.pdf
Size: 3.75 MB
Format: Adobe Portable Document Format