Linear Structural Equation Models with Non-Gaussian Errors: Estimation and Discovery
Wang, Y. Samuel
Linear structural equation models (SEMs) are multivariate models which encode direct causal effects. We focus on SEMs in which unobserved latent variables have been marginalized and only observed variables are explicitly modeled. In this thesis, we study three problems where the stochastic errors in the SEMs, and thus the corresponding data, are non-Gaussian. Throughout, we utilize graphical models to represent the causal structure. First, we consider estimation of model parameters using an empirical likelihood framework when the causal structure is known. Under very mild conditions on the error distributions, this approach yields asymptotically normal estimators and well-calibrated confidence intervals and hypothesis tests. However, the procedure can be computationally expensive and can perform poorly when the sample size is small. We propose several modifications to a naive procedure and show that empirical likelihood can be an attractive alternative to existing methods when the data are non-Gaussian. The models considered in this section correspond to general mixed graphs. We then consider the problem of estimating the underlying structure. Most of the previous work on causal discovery focuses on estimating an equivalence class of graphs rather than a specific graph. However, Shimizu et al. (2006) show that under certain conditions, when the errors are non-Gaussian, the exact causal structure can be identified. We extend these results in two ways. In Chapter 3, we show that when there is no unobserved confounding and the causal structure is suitably sparse, the identification results can be extended to the high-dimensional setting where the number of variables exceeds the number of observations. The models considered correspond to directed acyclic graphs (DAGs) with bounded in-degree.
In Chapter 4, we show that non-Gaussian errors also allow for identification of the specific graph when unobserved confounding occurs in a restricted way. In particular, we consider the case where the underlying model corresponds to a bow-free acyclic path diagram (BAP). The proposed method consistently estimates the underlying structure and, unlike previous results, does not require the number of latent variables or the distribution of the errors to be specified in advance.
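The claim that non-Gaussian errors make the causal direction identifiable can be illustrated with a small simulation. The sketch below is not the procedure developed in the thesis. It assumes a two-variable linear SEM with no confounding, and it uses a crude fourth-moment statistic in place of a proper independence test. The names `dependence_score` and `simulate` are hypothetical and chosen for illustration only. In the true causal direction the regression residual is independent of the regressor, so the statistic should be close to zero. In the reverse direction the residual is uncorrelated with the regressor but not independent of it, unless the errors are Gaussian.

```python
import numpy as np


def dependence_score(regressor, response):
    """Absolute sample mean of (standardized residual) * (standardized regressor)^3.

    The residual comes from a least-squares fit of response on regressor.
    If residual and regressor are independent, the population value is zero.
    This is a crude moment check, not a full independence test.
    """
    x = regressor - regressor.mean()
    y = response - response.mean()
    slope = (x @ y) / (x @ x)
    resid = y - slope * x
    x_std = x / x.std()
    r_std = resid / resid.std()
    return abs(np.mean(r_std * x_std ** 3))


def simulate(draw, n=200_000, effect=2.0):
    """Draw (cause, outcome) from outcome = effect * cause + error.

    `draw(n)` returns n i.i.d. mean-zero samples; it is used for both the
    cause and the error term. The effect size 2.0 is an arbitrary choice.
    """
    cause = draw(n)
    outcome = effect * cause + draw(n)
    return cause, outcome


rng = np.random.default_rng(0)
x_u, y_u = simulate(lambda n: rng.uniform(-1.0, 1.0, size=n))  # non-Gaussian errors
x_g, y_g = simulate(lambda n: rng.normal(size=n))              # Gaussian errors

scores = {
    "uniform, true direction": dependence_score(x_u, y_u),
    "uniform, reverse direction": dependence_score(y_u, x_u),
    "gaussian, true direction": dependence_score(x_g, y_g),
    "gaussian, reverse direction": dependence_score(y_g, x_g),
}

for label, value in scores.items():
    print(f"{label}: {value:.3f}")
```

With uniform errors, only the true direction gives a score near zero, so the two directions can be told apart. With Gaussian errors both scores are near zero, which matches the statement that only an equivalence class is recoverable in that case. This particular statistic can miss dependence in some configurations (for example, when the effect size is exactly 1 and both error terms have the same distribution), which is one reason to treat it as an illustration only.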
- Statistics