Preferential sampling and model checking in phylodynamic inference
Karcher, Michael D
Estimating population size fluctuations is a key task in ecology. Traditional sampling-based approaches to this task have limitations when populations of interest are extinct or hard to reach, as is the case for individuals infected by a pathogen for only a short time. Phylodynamics combines coalescent theory from population genetics with statistical modeling to estimate fluctuations of effective population size (an idealized quantity that can be mapped to census population size with additional demographic information) from molecular sequences of individuals sampled from a population of interest. However, many methods implicitly assume that the samples' collection times do not depend on the effective population size. When sampling times do probabilistically depend on effective population size, estimation methods that do not account for this dependence may be systematically biased. We propose a model that accommodates preferentially sampled data by modeling the distribution of sampling times as an inhomogeneous Poisson process that depends on effective population size through a log-linear intensity function. We extend our model to allow optional time-varying covariates in the intensity function. Using simulations and recent influenza and Ebola datasets, we demonstrate that our model not only reduces bias but also improves estimation precision. Finally, we propose and implement a posterior predictive diagnostic method to check the adequacy of the coalescent and sampling time models.
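
To make the sampling model concrete, the following is a minimal sketch of preferential sampling. It is not the author's implementation. It assumes the log-linear intensity takes the form log λ(t) = β0 + β1 log Ne(t), which is one natural reading of the abstract, and it draws sampling times with the standard thinning algorithm for inhomogeneous Poisson processes. The function names, the seasonal trajectory `example_eff_pop_size`, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def log_linear_intensity(t, eff_pop_size, beta0, beta1):
    """Assumed form: lambda(t) = exp(beta0) * Ne(t) ** beta1."""
    return math.exp(beta0) * eff_pop_size(t) ** beta1


def simulate_sampling_times(eff_pop_size, beta0, beta1,
                            t_start, t_end, intensity_bound, rng):
    """Draw sampling times on [t_start, t_end] by thinning.

    intensity_bound must be an upper bound on lambda(t) over the interval.
    """
    times = []
    t = t_start
    while True:
        # Propose the next event from a homogeneous process at the bound rate.
        t += rng.expovariate(intensity_bound)
        if t > t_end:
            break
        rate = log_linear_intensity(t, eff_pop_size, beta0, beta1)
        if rate > intensity_bound:
            raise ValueError("intensity_bound is not an upper bound")
        # Accept the proposal with probability lambda(t) / bound.
        if rng.random() < rate / intensity_bound:
            times.append(t)
    return times


def example_eff_pop_size(t):
    """Hypothetical seasonal trajectory, ranging from 50 to 250."""
    return 100.0 * (1.5 + math.sin(2.0 * math.pi * t))


rng = random.Random(1)
beta0, beta1 = 0.0, 1.0  # illustrative values, not estimates
# Small safety margin above the maximum of the assumed intensity.
bound = 1.01 * math.exp(beta0) * 250.0 ** beta1
sampling_times = simulate_sampling_times(
    example_eff_pop_size, beta0, beta1, 0.0, 1.0, bound, rng
)
```

Under this assumed form, β1 = 0 recovers sampling that is independent of effective population size, while β1 > 0 concentrates sampling times in periods when the population is large. That dependence is what an estimator ignoring the sampling process would miss.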
- Statistics