Browsing Statistics by Title
Now showing items 21-40 of 66

Finite Sampling Exponential Bounds
This dissertation develops new exponential bounds for the tail of the hypergeometric distribution. It is organized as follows. In Chapter 1, it reviews existing exponential bounds used to control the hypergeometric tail. ... 
Functional Quantitative Genetics and the Missing Heritability Problem
In classical quantitative genetics, the correlation between the phenotypes of individuals with unknown genotypes and a known pedigree relationship is expressed in terms of probabilities of IBD states. In existing models ... 
Generalization of boosting algorithms and applications of Bayesian inference for massive datasets
(1999) In recent years statisticians, computational learning theorists, and engineers have developed more advanced techniques to learn complex nonlinear relationships from datasets. However, not only have models increased in ... 
Generalized linear mixed models: development and comparison of different estimation methods
(2002) The use of generalized linear mixed models is growing in popularity in the modelling of correlated data. To date, methods available are either computationally intensive or asymptotically biased. The following work examines ... 
Genetic restoration on complex pedigrees
(1990) Analyses of genetic data observed on groups of related individuals frequently require the computation of probabilities on pedigrees. Existing methods are computationally intensive and can be infeasible on large and complex ... 
Gravimetric Anomaly Detection using Compressed Sensing
We address the problem of identifying underground anomalies (e.g. holes) based on gravity measurements. This is a theoretically well-studied yet difficult problem. In all except a few special cases, the inverse problem has ... 
Inference for High-Dimensional Instrumental Variables Regression
This thesis concerns statistical inference for the components of a high-dimensional regression parameter despite possible endogeneity of each regressor. Given a first-stage linear model for the endogenous regressors and a ... 
Large-Scale B Cell Receptor Sequence Analysis Using Phylogenetics and Machine Learning
The adaptive immune system synthesizes antibodies, the soluble form of B cell receptors (BCRs), to bind to and neutralize pathogens that enter our body. B cells are able to generate a diverse set of high affinity antibodies ... 
Latent models for cross-covariance
(2001) Cross-covariance problems arise in the analysis of multivariate data that can be divided naturally into two blocks of variables, X and Y, observed on the same units. In a cross-covariance problem we are interested, not in ... 
Learning and Manifolds: Leveraging the Intrinsic Geometry
(2013-07-23) In this work, we explore and exploit the use of differential operators on manifolds (the Laplace-Beltrami operator in particular) in learning tasks. In particular, we are interested in uncovering the geometric structure ... 
The Likelihood Pivot: Performing Inference with Confidence
Maximum likelihood estimation is a popular statistical method. To account for possible model misspecification, the sandwich estimate of variance can be used to generate asymptotically correct confidence intervals. Several ... 
Likelihood-Based Inference for Partially Observed Multi-Type Markov Branching Processes
Markov branching processes are a class of continuous-time Markov chains (CTMCs) frequently used in stochastic modeling with ubiquitous applications. Bivariate or multi-type processes are necessary to model phenomena such ... 
Linear Structural Equation Models with Non-Gaussian Errors: Estimation and Discovery
Linear structural equation models (SEMs) are multivariate models which encode direct causal effects. We focus on SEMs in which unobserved latent variables have been marginalized and only observed variables are explicitly ... 
Lord's Paradox and Targeted Interventions: The Case of Special Education
Lord (1967) describes a hypothetical “paradox” in which two statisticians, analyzing the same dataset using different but defensible methods, come to very different conclusions about the effects of an intervention on student ... 
Maximum likelihood estimation in Gaussian AMP chain graph models and Gaussian ancestral graph models
(2004) Graphical Markov models use graphs to represent dependencies between stochastic variables. Via Markov properties, missing edges in the graph are translated into conditional independence statements, which, in conjunction ... 
Methods for estimation and inference for high-dimensional models
This thesis tackles three different problems in high-dimensional statistics. The first two parts of the thesis focus on estimation of sparse high-dimensional undirected graphical models under nonstandard conditions, ... 
Model-Based Penalized Regression
This thesis contains three chapters that consider penalized regression from a model-based perspective, interpreting penalties as assumed prior distributions for unknown regression coefficients. In the first chapter, we ... 
Modeling Heterogeneity within and between Matrices and Arrays
(2013-11-14) Datasets in the form of matrices and arrays arise frequently in the social and biological sciences and are characterized by measurements indexed by two or more factors. In this dissertation we address two problems relating ... 
Monte Carlo estimation of identity by descent in populations
Genetic similarity between organisms arises from segments of shared genome, which are said to be identical by descent (IBD). Modeling IBD in pedigrees forms the basis of classical linkage analysis and has been a fruitful ... 
Monte Carlo likelihood calculation for identity by descent data
(1999) Two individuals are identical by descent at a genetic locus if they share the same gene copy at that locus due to inheritance from a recent common ancestor. Identity by descent can be thought of as a continuous process ...