Evaluating Multi-Modal Data Fusion Approaches for Predictive Clinical Models Using Multiple Medical Data Domains

dc.contributor.advisor: Tarczy-Hornoch, Peter
dc.contributor.advisor: Hadlock, Jennifer
dc.contributor.author: Alipour, Ehsan
dc.date.accessioned: 2025-08-01T22:12:50Z
dc.date.available: 2025-08-01T22:12:50Z
dc.date.issued: 2025-08-01
dc.date.submitted: 2025
dc.description: Thesis (Ph.D.)--University of Washington, 2025
dc.description.abstract: Disease outcome prediction is a central research focus in biomedical informatics: it enables precision-health interventions and supports scientific discovery, for example by enabling digital clinical trials. Multimodal deep learning models have emerged as powerful tools in biomedical research, offering the ability to integrate diverse data sources such as clinical records, multi-omics data, imaging, survey responses, and wearable data to enhance predictive accuracy and deepen understanding of medical phenomena. Central to multimodal modeling is data fusion, the process by which information from different modalities is integrated into a unified model. Three primary fusion strategies exist in deep learning: early fusion (feature-level), intermediate fusion, and late fusion (decision-level). While widely adopted in other domains, their comparative performance and implementation considerations remain underexplored in biomedical applications, where data heterogeneity, missingness, and varying dimensionality present additional challenges. This dissertation evaluates the implications of data fusion strategies for developing multimodal predictive models in medicine. Across three distinct aims, I assess the impact of early, intermediate, and late fusion techniques on predictive performance, implementation complexity, and generalizability using diverse combinations of data types, outcomes, and modeling strategies. These studies span multiple datasets and outcome types (binary categorical variables vs. continuous ratio variables), providing a broad view of fusion strategy utility in real-world biomedical settings.
In Aim 1—Evaluation and comparison of early, intermediate, and late fusion techniques for combining exposure, clinical, and genomic data for a disease risk prediction task using All of Us: Risk of CKD in patients with type 2 diabetes—I evaluated and compared early, intermediate, and late fusion strategies for integrating longitudinal EHR, genomic, and survey data to predict chronic kidney disease (CKD) progression in patients with type 2 diabetes using a novel transformer-based multimodal architecture. Using data from the NIH's All of Us initiative, I trained models on a cohort of approximately 40,000 patients. While the best-performing unimodal model achieved a baseline AUROC of 0.73 (0.71–0.75), the inclusion of multimodal data offered only marginal improvement, with an AUROC of 0.74 (0.72–0.76); the benefit was limited to the early fusion approach and lacked statistical significance. This aim highlighted the challenges of integrating multimodal data of differing dimensionality with transformer models and emphasized the role of modality-specific relative predictive strength. In Aim 2—Development and assessment of the incremental value of combining a deep convolutional neural network feature extractor on imaging data with clinical data on a binary prediction task: Predicting post-surgical margin status in soft tissue sarcoma—I extended the fusion analysis to imaging data by combining a convolutional neural network (CNN) trained on longitudinal cross-sectional imaging with a shallow neural network trained on clinical and pathology variables to predict post-surgical margin status in patients with soft tissue sarcoma (n=202). Here, the intermediate fusion strategy significantly outperformed the other approaches, achieving an AUROC of 0.80 (0.66–0.95), suggesting that cross-modal interactions between histologic features and imaging embeddings may be best captured through intermediate fusion.
This result demonstrated the potential value of intermediate fusion when complementary signals exist across modalities. In Aim 3—Evaluation and comparison of early, intermediate, and late fusion techniques for combining imaging and clinical data on a regression prediction task: Estimation of CT-based body composition metrics from chest radiographs—I explored fusion strategies for estimating continuous CT-derived body composition metrics (e.g., visceral and subcutaneous fat volumes) using only chest radiographs and clinical variables in a dataset of 1,088 patients. A multitask multimodal model was developed and evaluated across early, intermediate, and late fusion strategies. Late fusion consistently delivered the best performance across most body composition metrics, closely followed by intermediate fusion. These results suggest that when individual modalities offer high independent predictive power, decision-level integration may be optimal for regression tasks. Collectively, these aims provide a broad evaluation of data fusion strategies in multimodal biomedical modeling, highlighting their strengths, limitations, and practical considerations. The findings suggest that no single fusion strategy universally outperforms the others; rather, the optimal fusion strategy depends on data characteristics, model architecture, and task-specific objectives. This dissertation lays the groundwork for future research aimed at developing adaptive fusion strategies tailored to the complexities of real-world biomedical data.
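The three fusion strategies compared in the abstract differ only in where modalities are combined along the pipeline. A minimal schematic sketch of that distinction follows; the toy feature dimensions, the fixed random linear "models," and the prediction-averaging rule for late fusion are illustrative assumptions, not the dissertation's actual transformer or CNN architectures.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy feature vectors for two modalities (hypothetical dimensions).
x_clinical = rng.normal(size=8)   # e.g., tabular clinical/EHR features
x_imaging = rng.normal(size=16)   # e.g., an imaging embedding

def linear_head(x, out_dim=1, seed=0):
    """Stand-in for a trained model component: a fixed random linear map."""
    w = np.random.default_rng(seed).normal(size=(x.shape[0], out_dim))
    return x @ w

# Early fusion (feature-level): concatenate raw features from all
# modalities, then train one joint model on the combined vector.
early = linear_head(np.concatenate([x_clinical, x_imaging]), seed=1)

# Intermediate fusion: encode each modality into a latent representation
# first, then combine the latents and apply a joint prediction head.
z_clin = linear_head(x_clinical, out_dim=4, seed=2)
z_img = linear_head(x_imaging, out_dim=4, seed=3)
intermediate = linear_head(np.concatenate([z_clin, z_img]), seed=4)

# Late fusion (decision-level): train a separate predictor per modality
# and combine only their outputs (here, a simple average).
late = 0.5 * (linear_head(x_clinical, seed=5) + linear_head(x_imaging, seed=6))

print(early.shape, intermediate.shape, late.shape)  # each is (1,)
```

The trade-off the dissertation examines is visible even in this sketch: early fusion exposes the joint model to raw cross-modal interactions but must cope with mismatched dimensionality, intermediate fusion lets each encoder normalize its modality before interaction, and late fusion keeps the modalities fully independent until the decision, which helps when each is strongly predictive on its own.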
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Alipour_washington_0250E_28628.pdf
dc.identifier.uri: https://hdl.handle.net/1773/53329
dc.language.iso: en_US
dc.rights: CC BY
dc.subject: Data Fusion
dc.subject: Deep Learning
dc.subject: Informatics
dc.subject: Medical Predictive Models
dc.subject: Multimodal
dc.subject: Medicine
dc.subject: Computer science
dc.subject.other: Biomedical and health informatics
dc.title: Evaluating Multi-Modal Data Fusion Approaches for Predictive Clinical Models Using Multiple Medical Data Domains
dc.type: Thesis

Files

Original bundle

Name: Alipour_washington_0250E_28628.pdf
Size: 3.75 MB
Format: Adobe Portable Document Format