Deciphering sequence determinants of alternative splicing and polyadenylation in health and disease with massively parallel reporter assays

dc.contributor.advisor: Seelig, Georg
dc.contributor.author: Koplik, Samantha
dc.date.accessioned: 2026-04-20T15:25:27Z
dc.date.available: 2026-04-20T15:25:27Z
dc.date.issued: 2026-04-20
dc.date.submitted: 2026
dc.description: Thesis (Ph.D.)--University of Washington, 2026
dc.description.abstract: Splicing and polyadenylation are major co- and post-transcriptional processes that regulate gene expression. Alternative splicing (AS) and alternative polyadenylation (APA) are frequent drivers of human disease, yet systematic maps of how genetic variants affect these processes in different cell types remain limited. To address this gap, this thesis presents high-throughput perturbations of splicing and polyadenylation using massively parallel reporter assays (MPRAs) to measure the impact of genetic variation on both AS and APA in human cell lines of diverse tissue origin. First, I introduce Cell-type Oriented Massively Parallel Assay of Splicing Signatures (COMPASS), an MPRA that quantifies splicing outcomes for 87,546 variants across five human cell lines. COMPASS targets disease-relevant genes, including ACMG actionable and autism-associated genes, providing a resource to systematically dissect splicing impacts in health and disease. Benchmarking COMPASS data against predictive models highlights both strengths and weaknesses of current approaches. Biological relevance is further supported by prime editing experiments that validate selected variants in their native genomic context. Analyses of COMPASS data also reveal RNA-binding protein motifs whose disruption drives splicing changes and identify subsets of sequences that mediate cell type-specific splicing programs. Next, I applied a similar approach to dissect the cis-regulatory determinants of APA. APARENT2, a deep residual network for predicting APA, had previously identified variants enriched for gain-of-function for polyadenylation in autism GWAS cohorts. To validate these predictions, I developed an MPRA in multiple cell types, which confirmed these gain-of-function variants and uncovered additional cell type-specific effects. I further expanded this work to a larger APA MPRA that cataloged the effects of over 5,000 disease-associated variants, revealing both conserved and cell type-specific regulation. Together, these studies provide the most comprehensive cell type-resolved compendia of AS and APA to date. COMPASS delivers the largest atlas of splice-disrupting variants to date, and two APA MPRAs provide a complementary resource cataloging polyadenylation-disrupting variants. The resources developed in this thesis serve to support variant reclassification in clinical genomics, guide therapeutic target discovery, and aid in the refinement of predictive models.
dc.embargo.terms: Open Access
dc.format.mimetype: application/pdf
dc.identifier.other: Koplik_washington_0250E_29209.pdf
dc.identifier.uri: https://hdl.handle.net/1773/55441
dc.language.iso: en_US
dc.rights: CC BY-NC-ND
dc.subject: Bioengineering
dc.subject.other: Bioengineering
dc.title: Deciphering sequence determinants of alternative splicing and polyadenylation in health and disease with massively parallel reporter assays
dc.type: Thesis

Files

Original bundle

Name: Koplik_washington_0250E_29209.pdf
Size: 22.44 MB
Format: Adobe Portable Document Format

Collections