Statistics
Recent Submissions

Parameter Identification and Assessment of Independence in Multivariate Statistical Modeling
We are interested in the extent to which possibly causal relationships can be statistically quantified from multivariate data obtained from a system of random variables. In the ideal setting, we would begin with refined ...
Coevolution Regression and Composite Likelihood Estimation for Social Networks
We study how social networks and nodal attributes influence each other over time. A multiplicative coevolution regression (MCR) model is proposed for longitudinal network and nodal attribute data. The coevolution model is ... 
Linear Structural Equation Models with Non-Gaussian Errors: Estimation and Discovery
Linear structural equation models (SEMs) are multivariate models which encode direct causal effects. We focus on SEMs in which unobserved latent variables have been marginalized and only observed variables are explicitly ... 
Inference for High-Dimensional Instrumental Variables Regression
This thesis concerns statistical inference for the components of a high-dimensional regression parameter despite possible endogeneity of each regressor. Given a first-stage linear model for the endogenous regressors and a ...
Methods for estimation and inference for high-dimensional models
This thesis tackles three different problems in high-dimensional statistics. The first two parts of the thesis focus on estimation of sparse high-dimensional undirected graphical models under nonstandard conditions, ...
Topics in Graph Clustering
In this thesis, two problems in social networks will be studied. In the first part of the thesis, we focus on community recovery problems for social networks. There have been many recent theoretical advances in the model-based ...
Scalable Manifold Learning and Related Topics
The subject of manifold learning is vast and still largely unexplored. As a subset of unsupervised learning it has a fundamental challenge in adequately defining the problem but whose solution is to an increasingly important ... 
Applications of Robust Statistical Methods in Quantitative Finance
Financial asset returns and fundamental factor exposure data often contain outliers, observations that are inconsistent with the majority of the data. Both academic finance researchers and quantitative finance professionals ... 
Scalable Methods for the Inference of Identity by Descent
Identity by descent (IBD) describes the shared inheritance of DNA and underlies genetic similarity between individuals. Estimated IBD graphs describing the IBD relationships among individuals have many uses in statistical ... 
Projection and Estimation of International Migration
I propose techniques for improving both estimation and projection of international migration. By applying a Bayesian hierarchical modeling approach to net migration data, I produce projections of international migration ... 
Bayesian Methods for Inferring Gene Regulatory Networks
The recent explosion in the availability of gene expression data has opened up new possibilities in advancing our understanding of the fundamental processes of life. To keep up with the increasing size of the datasets, new ... 
Finite Sampling Exponential Bounds
This dissertation develops new exponential bounds for the tail of the hypergeometric distribution. It is organized as follows. In Chapter 1, it reviews existing exponential bounds used to control the hypergeometric tail. ... 
Finite Population Inference for Causal Parameters
Randomized experiments are often employed to determine whether a treatment X has a causal effect on an outcome Y. Under the Neyman-Rubin causal model with binary X and Y, each patient is characterized by two binary potential ...
Likelihood-Based Inference for Partially Observed Multi-Type Markov Branching Processes
Markov branching processes are a class of continuous-time Markov chains (CTMCs) frequently used in stochastic modeling with ubiquitous applications. Bivariate or multi-type processes are necessary to model phenomena such ...
Space-Time Smoothing Models for Surveillance and Complex Survey Data
Area- and time-specific estimates of disease rates, cause-specific mortality rates and other key health indicators are of great interest for health care and policy purposes. Such estimates provide the information needed to ...
Testing Independence in High Dimensions & Identifiability of Graphical Models
In this thesis two problems in multivariate statistics will be studied. In the first chapter, we treat the problem of testing independence between m continuous observations when m can be larger than the available sample ...
Statistical Hurdle Models for Single Cell Gene Expression: Differential Expression and Graphical Modeling
This dissertation describes a set of statistical methods developed for analysis of single cell gene expression. A characteristic of single cell expression is bimodal expression, in which two clusters of expression are ... 
Bayesian Modeling of a High Resolution Housing Price Index
Understanding how housing values evolve over time is important to consumers, real estate professionals, and policy makers. Existing methods for constructing housing indices are computed at a coarse spatial granularity, ... 
Phylogenetic Stochastic Mapping
Phylogenetic stochastic mapping is a method for reconstructing the history of trait changes on a phylogenetic tree relating species/organisms carrying the trait. State-of-the-art methods assume that the trait evolves ...
Degeneracy, Duration, and Coevolution: Extending Exponential Random Graph Models (ERGM) for Social Network Analysis
We address three aspects of statistical methodology in the application of Exponential family Random Graphs to modeling social network processes. The first is the topic of model degeneracy in ERGMs. We show this is a ...