Efficient Estimation Under Data Fusion

Authors

Li, Sijia

Abstract

This dissertation introduces a general framework for tackling data fusion problems. Its three chapters form a cohesive narrative, with each chapter addressing a distinct statistical challenge that arises in data fusion. Chapter 1 explores the fundamental question of how much efficiency can be gained from multiple aligned data sources. Specifically, it introduces a novel data fusion method that uses multiple sources to estimate a smooth, finite-dimensional parameter. We characterize the reduction in the semiparametric efficiency bound achieved by leveraging multiple data sources in a single analysis, and provide a general means of constructing efficient estimators that achieve these bounds. Chapter 1 was co-authored with Alex Luedtke and appears in Biometrika under the title "Efficient estimation under data fusion". In Chapter 2, we consider the general case where some data sources do not perfectly align with parts of the target population, but are weakly aligned in the sense that the ratio of densities between these sources and the target distribution can be characterized parametrically. We propose methods that unlock further efficiency gains by making use of these slightly misaligned data sources. While Chapters 1 and 2 provide data fusion techniques for binary and continuous outcomes, no existing work is tailored to time-to-event outcomes. We address this gap in Chapter 3. We introduce a two-step mapping procedure that accounts for different censoring distributions across data sources, and provide the forms of the gradients needed to construct estimators for three estimands that are often of interest in vaccine trials.

Description

Thesis (Ph.D.)--University of Washington, 2023
