Fohner, Alison
Kingston, Hanley
2020-10-26
2020
Kingston_washington_0250O_22204.pdf
http://hdl.handle.net/1773/46572
Thesis (Master's)--University of Washington, 2020

The Cystic Fibrosis Genome Project (CFGP) has assembled whole genome sequences for approximately 5,000 individuals with cystic fibrosis (CF), with the goal of identifying genetic modifiers of CF-related phenotypes. We hypothesized that the over-sampling of the clinal CFTR F508del haplotype in this dataset might make such studies particularly susceptible to spurious associations between CF-related outcomes and variants correlated with CFTR F508del genotype. We assessed whether regions of the genome are associated with the CFTR F508del genotype by performing genome-wide association studies (GWASs) of CFTR F508del genotypes and measuring the type I error rate across the genome (genomic inflation) that results when population structure is not accounted for. We determined that linear mixed models with orthogonally partitioned structure (LMM-OPS) adequately controlled for the underlying relatedness and population structure within our dataset, reducing signals in genomic locations correlated with CFTR. Our results suggest that performing a GWAS of a disease-causing variant is a useful way to assess how effectively principal components and genetic relatedness estimates control for confounding in datasets with over-sampling of a clinal variant.

application/pdf
en-US
CC BY-NC-SA
CFTR; cystic fibrosis; F508del; GWAS; population structure; principal component analysis
Genetics; Epidemiology; Public health
Public health genetics
CFTR F508del and population structure in a cystic fibrosis population
Thesis