Genetic etiologies of Autism Spectrum Disorder

Niklas Krumm

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy

University of Washington, 2014

Reading Committee:
Evan E. Eichler, Chair
Deborah A. Nickerson
Jay Shendure

Program Authorized to Offer Degree: Genome Sciences, University of Washington

ABSTRACT

Autism spectrum disorder (ASD) is a common, heritable neurodevelopmental disorder. In this thesis, I examine how different genetic etiologies, mutation types and specific genes contribute to the risk of ASD and how these factors can be used to expand our understanding of the neurobiological underpinnings of ASD. I develop a new bioinformatics method (CoNIFER: copy number inference from exome reads) to identify copy number variants (CNVs) using exome sequencing data, enabling much more sensitive identification of a previously under-ascertained class of small CNVs (<100 kbp in size). I estimate the precision of the algorithm using 366 exomes and show that this method can be used to reliably predict both de novo and inherited rare CNVs and can predict absolute copy number for loci with fewer than eight copies.

Next, I searched for disruptive, genic rare CNVs among 411 families with sporadic ASD from the Simons Simplex Collection (SSC) and identified additional small genic rare CNVs compared to high-density single nucleotide polymorphism (SNP) microarrays (~2X higher yield). I found that affected probands inherit more CNVs than their siblings (p=0.004; OR=1.19), and these CNVs affect more genes, are enriched for brain-expressed genes, and are transmitted preferentially from the mother. Finally, I found that the excess burden of inherited CNVs among probands is driven primarily by proband-sibling pairs with discordant social behavior phenotypes.
I then created a combined set of both inherited and de novo single nucleotide variants (SNVs) and CNVs across 2,377 SSC ASD families, including 1,786 families with both an affected and unaffected child. I compared the burden of inherited and de novo mutations between affected and unaffected siblings and found that private inherited truncating SNV mutations in conserved genes are significantly enriched in probands (OR=1.14, p<0.0002), an effect that became more pronounced with increasing gene conservation. I quantified ASD risk for de novo and inherited CNVs and SNVs by using a conditional logistic regression model. Independent from de novo mutations, private truncating SNVs and rare inherited CNVs contribute an increase in risk of 1.11 (p=0.0002) and 1.23 (p=0.01), respectively. These results confirm a statistically independent role for inherited mutations in ASD risk and identify additional candidate genes (e.g., RIMS1, CUL7 and CSMD1) where inherited and de novo burden converge.

TABLE OF CONTENTS

I. INTRODUCTION
    1.1 SUMMARY
    1.2 AUTISM SPECTRUM DISORDER
    1.3 IDENTIFYING GENES AND PATHWAYS INVOLVED IN ASD AND ID
    1.4 AN INCREASE IN DE NOVO LOSS-OF-FUNCTION MUTATIONS
    1.5 CANDIDATE GENES YET FEW RECURRENT HITS
    1.6 LARGE-SCALE RESEQUENCING OF CANDIDATE GENES
    1.7 NOVEL CANDIDATES AND THEIR NEUROBIOLOGY
    1.8 PROTEIN INTERACTION NETWORKS CONVERGE ON COMMON PATHWAYS
    1.9 ASSAYING COPY NUMBER VARIATION
    1.10 THESIS GOALS
II. COPY NUMBER VARIATION DETECTION AND GENOTYPING FROM EXOME SEQUENCE DATA
    2.1 SUMMARY
    2.2 INTRODUCTION
    2.3 METHODS
    2.4 RESULTS
    2.5 DISCUSSION
III. TRANSMISSION DISEQUILIBRIUM OF SMALL CNVS IN SIMPLEX AUTISM
    3.1 SUMMARY
    3.2 INTRODUCTION
    3.3 METHODS
    3.4 RESULTS
    3.5 DISCUSSION
IV. INHERITED SNV MUTATIONS IN AUTISM SPECTRUM DISORDER
    4.1 SUMMARY
    4.2 INTRODUCTION
    4.3 METHODS
    4.4 RESULTS
    4.5 DISCUSSION
V. SUMMARY AND FUTURE DIRECTIONS
    5.1 SUMMARY OF RESULTS
    5.2 TOWARDS ASSAYING THE COMPLETE SET OF GENETIC VARIATION
    5.3 UNDERSTANDING NORMAL VARIATION AND PATHOGENIC VARIANTS
    5.4 DEFINING SUBTYPES OF THE AUTISM SPECTRUM
    5.5 DEFINING A GRADIENT OF SIMPLEX AND MULTIPLEX AUTISM
    5.6 UNDERSTANDING COMPLEX GENETIC ETIOLOGIES AT A FAMILY LEVEL
    5.7 FUTURE DIRECTIONS
REFERENCES
WEB AND SOFTWARE RESOURCES
APPENDICES

LIST OF FIGURES

Chapter 1
    Figure 1.1: Estimating the number of ASD/ID risk genes
    Figure 1.2: Location of de novo truncating mutations in ASD and ID genes
    Figure 1.3: Genes disrupted by de novo mutations form a connected network
    Figure 1.4: CHD8 and CTNNB1 putative regulatory network for head size
Chapter 2
    Figure 2.1: Method overview and CNV discovery
    Figure 2.2: CNP locus genotyping of RHD and C4A
    Figure 2.3: Genotyping accuracy across 62 CNP loci
Chapter 3
    Figure 3.1: Discovery and validation of previously undiscovered CNVs using exomes
    Figure 3.2: Increased inherited CNV burden in ASD probands for large and small CNVs
    Figure 3.3: Inherited CNV burden correlates with SRS phenotype
    Figure 3.4: Genes in proband-only CNVs from SRS-discordant quads are more likely brain-expressed
    Figure 3.5: A combined model of inherited and de novo mutations
Chapter 4
    Figure 4.1: Network of genes with recurrent de novo hits
    Figure 4.2: Transmission disequilibrium of SNVs in ASD
    Figure 4.3: Transmitted mutations and their effect on phenotype
    Figure 4.4: Convergence of de novo and inherited mutations on CSMD1
    Figure 4.5: Combined risk model for SNVs and CNVs, inherited and de novo
Chapter 5
    Figure 5.1: High genomic GC nucleotide content (green histogram) hinders whole-exome sequencing
    Figure 5.2: Number of trios sequenced and expected rate for ASD-implicated genes

LIST OF TABLES

Chapter 1
    Table 1.1: Six recent family-based exome studies of ASD and ID
    Table 1.2: Recurrent disruptive mutations in ID and ASD
Chapter 2
    Table 2.1: Cohorts analyzed
    Table 2.2: Precision of exome-based CNV calls in HapMap samples
    Table 2.3: Precision of exome-based CNV calls in autism trios
Chapter 3
    Table 3.1: Summary of transmitted CNVs in 411 ASD quads
    Table 3.2: Summary of IQ and SRS burden
    Table 3.3: Selected inherited CNVs
Chapter 4
    Table 4.1: Genes with new recurrent de novo mutations
    Table 4.2: Summary of mutations in IGF-related ASD network
    Table 4.3: Converging evidence for RIMS1, CUL7 and CSMD1 from de novo and inherited mutations
    Table 4.4: Summary of logistic regression model results

I. Introduction

1.1 Summary

Autism spectrum disorder (ASD) and intellectual disability (ID) are neurodevelopmental disorders with large genetic components, but identification of pathogenic genes has proceeded slowly because hundreds of loci are involved. This introduction describes how new exome sequencing technology has identified novel rare variants and has found that sporadic cases of ASD/ID are enriched for disruptive de novo mutations. Targeted large-scale resequencing studies have confirmed the significance of specific loci, including chromodomain helicase DNA binding protein 8 (CHD8), sodium channel, voltage-gated, type II, alpha subunit (SCN2A), dual specificity tyrosine-phosphorylation-regulated kinase 1A (DYRK1A), and beta-catenin (CTNNB1). I review recent studies and suggest that they have led to a convergence on three functional pathways: (i) chromatin remodeling; (ii) Wnt signaling during development; and (iii) synaptic function. The heritability and genetic etiology cannot be fully explained by de novo events, and I suggest that additional underlying genetic etiologies, mechanisms and effects also play a role in ASD. I describe three major topics of this thesis: (i) the development of new tools to more completely assay genetic variants using next-generation sequencing data (Chapter II), (ii) describing the role of inherited copy number variants (CNVs) in the risk of ASD (Chapter III), and (iii) exploring how multiple types of mutations, including inherited and de novo SNVs and CNVs, contribute to ASD (Chapter IV).

This introduction has been adapted from: Krumm, N., O'Roak, B. J., Shendure, J., & Eichler, E. E. (2014). A de novo convergence of autism genetics and molecular neuroscience. Trends in Neurosciences, 37(2), 95–105. doi:10.1016/j.tins.2013.11.005.

1.2 Autism spectrum disorder

Autism spectrum disorder (ASD) is a common (frequency 1/88–1/250 births) disorder that results in three distinct phenotypes which manifest early in development: 1) deficits in language ability, 2) impairment in reciprocal social interaction and communication, and 3) repetitive or stereotyped movements and interests. ASD encompasses a range of disorders, including Asperger's syndrome and Autistic Disorder itself. ASD and related disorders can commonly present with comorbidities of intellectual disability or developmental delay and epilepsy, but these are neither requirements for diagnosis nor observed in all cases.

The etiology of ASD has a strong genetic component, and the heritability of these disorders is widely estimated to be 80-90% based on studies comparing concordance in monozygotic and dizygotic twins (Lichtenstein et al. 2010; Bailey et al. 1995), although larger and more recent studies based on a broader set of familial relationships have shown significantly lower estimates of ~50% (Hallmayer et al. 2011). Additional evidence for a genetic etiology stems from increased proportions of ASD diagnoses in monogenic developmental disorders such as Fragile X and tuberous sclerosis, the identification of genes in families with Mendelian forms of autism, an increased rate of sibling recurrence (~8% for boys), and observation of a "Broad Autism Phenotype" in (unaffected) parents of children with ASD.

Understanding the genetic etiology of ASD has several important motivating factors. First, identification of specific genes (and mutations) underlying the pathogenesis of ASD will enable a new era of genetic diagnosis and classification of ASD. The observed ASD phenotypes are extremely heterogeneous, making diagnostic criteria based on behavior and phenotype necessarily broad as well. An understanding of the specific genes that cause ASD will enable a much more precise diagnosis, or refinement of an initial diagnosis.
Second, it is likely the genetic etiology of ASD encompasses two different broad mechanisms of genetic mutations: The first are so-called de novo mutations, which are present in offspring but not their parents and arise either in the parent's germline or early in development; the second type of mutation is an inherited (or transmitted) mutation from parents to offspring. Both types of mutations are likely to contribute to ASD, and an understanding of their importance and pathogenic potential will provide a crucial "big picture" understanding of how genetics affect ASD risk. Finally, an understanding of both the genes and the broader genetic mechanisms in ASD is requisite for the identification and development of therapeutics. Specifically, therapeutic development will follow the ability to identify genetic subtypes of ASD (both defined by gene and genetic etiology or mechanism) and will build off of an understanding of the function of individual genes identified. The often severe and debilitating nature of the ASD phenotype, combined with its high population prevalence, provides strong impetus to study and understand this disorder.

1.3 Identifying genes and pathways involved in ASD and ID

The identification of genes underlying ID and ASD has been most successful for syndromic Mendelian or monogenic disorders, for example, FMR1 (Fragile X syndrome, Fu et al. 1991), MECP2 (Rett syndrome, Amir et al. 1999) or UBE3A (Angelman syndrome, Matsuura et al. 1997). Together, however, these syndromes are estimated to account for less than 10% of ASD/ID, suggesting the presence of additional genes and etiologies. Initial population-based studies failed to identify single genes of major effect and few major common risk variants have been replicated, despite the strong observed heritability of these diseases (Steffenburg et al. 1989; Bailey et al. 1995; Hallmayer et al. 2011; Constantino et al. 2013).
By contrast, targeted and genome-wide microarray studies revealed that large de novo CNVs were significantly enriched among probands when compared to unaffected siblings and/or controls (Marshall et al. 2008; Sebat et al. 2007; de Vries et al. 2005; Levy et al. 2011; Sanders et al. 2011; Cooper et al. 2011; Sharp et al. 2006), a finding that echoed the earlier discovery of large chromosomal aberrations in ASD and ID. Both initial and subsequent higher-resolution studies estimate that 8% of sporadic ASD cases carried a de novo CNV, as compared with only 2% of unaffected siblings (Levy et al. 2011; Sanders et al. 2011). Furthermore, among children with general developmental delay (DD) and ID, rare large de novo CNVs are thought to account for up to 15% of disease burden (Cooper et al. 2011). Although individually rare, some of these CNVs were in fact recurrent mutations, mediated by locus-specific genomic instability (Sharp et al. 2006), and many of these same recurrent CNVs observed initially in patients with ID (Sharp et al. 2008) or ASD (Miller et al. 2009) have been identified in adults with epilepsy (Helbig et al. 2009), bipolar disorder (Ben-Shachar et al. 2009), or schizophrenia (Stefansson et al. 2008; International Schizophrenia Consortium 2008), suggesting overlap in the genetic etiology of these disorders.

The discovery of an aggregate burden of large de novo CNVs and the identification of recurrent events signaled a new paradigm for ASD and ID genetics. While specific CNVs are individually rare, combined they account for a significant fraction of cases, indicating the presence of considerable locus heterogeneity of ASD and ID. The de novo nature of these CNVs, together with their absence in the general population, suggests they represent a class of highly deleterious and highly penetrant mutations.
Their underlying genetic model does not explicitly fit a recessive model of disease, since CNVs are primarily present as hemizygous deletions or duplications. These mutations alter the dosage of genes but do not completely abolish their presence. Collectively, these observations support a complex disease/rare variant model for ASD, in which a proportion of etiologic risk is conferred by very rare variants and de novo mutations.

The commoditization of next-generation or "massively parallel" sequencing represents a turning point in human genetics and makes it possible to discover sequence-level variants across nearly all coding regions ("the exome") or the whole genome. These methods were first applied to confirm point mutations underlying Mendelian disorders (Ng et al. 2009c), and subsequent pilot studies demonstrated that family-based (trio) exome sequencing could discover pathogenic mutations in simplex ID (Vissers et al. 2010) or ASD (O'Roak et al. 2011). In the past year, this paradigm of de novo mutation discovery using exome sequencing of parent-child trios has been expanded to about one thousand ASD or ID families, resulting in the first detailed picture of how de novo coding mutations contribute to these disorders.

In this introduction, I synthesize the results of recent large-scale exome sequencing studies of ASD and ID (O'Roak et al. 2012b; Neale et al. 2012; Iossifov et al. 2012; Sanders et al. 2012; de Ligt et al. 2012; Rauch et al. 2012) and summarize their implications for human neurodevelopmental genetics. There are three themes: 1) Exome sequencing of ASD/ID families has revealed a significant excess of de novo mutations in probands when compared to unaffected siblings and has identified novel candidate genes contributing to the neurological deficits. I note that the strongest effects are observed for de novo loss-of-function (or truncating) mutations, which prematurely truncate the protein due to frameshift and nonsense mutations.
2) Both CNV and exome sequencing data suggest that no single gene will account for more than 1% of autism cases; rather, rare mutations in hundreds of genes may contribute to ASD or ID. 3) Analyses of network connectivity further implicate potentially important neurodevelopmental and synaptic pathways in ASD and ID.

| Study | Disorder | Design | N | Synonymous (b) | Missense (c) | Nonsense, splice site, indels |
|---|---|---|---|---|---|---|
| Iossifov 2012 (Iossifov et al. 2012) | ASD (a) | Quads | 343 | 79:69 | 207:207 | 59:28 |
| Sanders 2012 (Sanders et al. 2012) | ASD (a) | Quads | 200 | 29:39 | 110:82 | 15:5 (c) |
| O'Roak 2012 (O'Roak et al. 2012b) | ASD (a) | Quads | 50 | 14:16 | 40:31 | 6:3 |
| O'Roak 2012 (O'Roak et al. 2012b) | ASD (a) | Trios | 159 | 54 | 115 | 29 |
| Neale 2012 (Neale et al. 2012) | ASD | Trios | 175 | 50 | 101 | 18 |
| de Ligt 2012 (de Ligt et al. 2012) | ID | Trios | 100 | 16 | 48 | 14 |
| Rauch 2012 (Rauch et al. 2012) | ID | Trios | 51 | 11 | 56 | 20 |
| Sum | | | 1,078 | 122:124 | 357:320 | 80:36 |
| Odds ratio (ASD) (95% CI) | | | | 0.98 (0.73–1.31) | 1.29 (1.01–1.63) | 2.41 (1.58–3.75) |

(a) Trios/quads from the Simons Simplex Collection (SSC). (b) Counts refer to the number of mutations in probands or, when separated by a colon, to probands and siblings (e.g., probands:siblings). (c) Not including indels.

Table 1.1: Six recent family-based exome studies of ASD and ID

1.4 An increase in de novo loss-of-function mutations

Both de novo CNVs and single nucleotide variants (SNVs) can have, in principle, similarly disruptive effects on genes. Crucially, however, the detection of de novo SNVs yields gene-level specificity, thus allowing individual pathogenic genes and neurobiological pathways to be identified. Moreover, a small subset of the de novo mutations (~4% for unaffected and ~9% of affected; Iossifov et al. 2012; Sanders et al. 2012) are disruptive (e.g., frameshift, premature stop codon, splice-donor defect) with respect to the protein's biological function. Recurrent mutations of this type for a specific gene can strengthen the probability that the de novo mutation relates to phenotype.
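
The pooled odds ratios in the Sum row of Table 1.1 can be checked directly from the quad counts. The sketch below is illustrative only and is not the analysis code used in the cited studies. It assumes that each de novo mutation marks exactly one distinct child, so that a count of 80 truncating mutations among 593 probands is treated as 80 carriers and 513 non-carriers. The function names and the two-sided form of Fisher's exact test are choices made here for illustration.

```python
from math import comb

N_QUADS = 593  # pooled ASD quads in Table 1.1 (343 + 200 + 50)

# (probands, siblings) de novo mutation counts from the Sum row of Table 1.1
POOLED_COUNTS = {
    "synonymous": (122, 124),
    "missense": (357, 320),
    "truncating": (80, 36),
}


def odds_ratio(proband_hits, sibling_hits, n_per_group=N_QUADS):
    """Odds ratio, ASSUMING each mutation marks exactly one distinct child."""
    proband_odds = proband_hits / (n_per_group - proband_hits)
    sibling_odds = sibling_hits / (n_per_group - sibling_hits)
    return proband_odds / sibling_odds


def fisher_exact_two_sided(proband_hits, sibling_hits, n_per_group=N_QUADS):
    """Two-sided Fisher's exact test for two equally sized groups."""
    total = 2 * n_per_group
    hits = proband_hits + sibling_hits
    denominator = comb(total, n_per_group)

    def table_probability(k):
        # Hypergeometric probability that k of the carriers are probands.
        return comb(hits, k) * comb(total - hits, n_per_group - k) / denominator

    observed = table_probability(proband_hits)
    lowest = max(0, hits - n_per_group)
    highest = min(hits, n_per_group)
    # Sum every table that is no more likely than the observed one.
    return sum(
        p
        for p in map(table_probability, range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


for mutation_class, (probands, siblings) in POOLED_COUNTS.items():
    print(
        mutation_class,
        round(odds_ratio(probands, siblings), 2),
        fisher_exact_two_sided(probands, siblings),
    )
```

Under the one-mutation-per-child assumption the three odds ratios round to the values in the table (0.98, 1.29 and 2.41). The p-values depend on the exact test variant and should be read as approximate.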
Since de novo protein-altering SNVs are collectively more common mutation events (~1/generation) than large de novo CNVs (~0.02/generation), there is the exciting possibility that this type of mutation may explain a larger fraction of the genetic etiology of ASD. In all, six recent exome sequencing studies of trios (mother, father and affected child) and quads (also includes an unaffected sibling) of sporadic ASD (Iossifov et al. 2012; Sanders et al. 2012; O'Roak et al. 2012b; Neale et al. 2012) or sporadic ID (de Ligt et al. 2012; Rauch et al. 2012), together comprising 1,078 families (Table 1.1), have been performed. Three of the ASD studies included unaffected siblings in order to compare mutation rates between affected probands and siblings. While these studies found a slightly elevated rate of mutation in probands versus their unaffected siblings (1.02 vs. 0.79 mutations per offspring), the type of mutation was critical: probands had two- to threefold more disruptive de novo mutations in comparison to their siblings, or to a random model of mutation (Sanders et al. 2012; Iossifov et al. 2012). Overall, among the 593 ASD quads, there were 80 such mutations in probands with ASD, but only 36 in siblings (OR = 2.41, p < 1 x 10^-4, Fisher's exact test; Table 1.1). The reported enrichment of missense mutations in probands has been less robust, with study estimates for enrichment between 1- and 1.34-fold, but analysis of all quads does show weak statistical enrichment (OR = 1.29, p = 0.03). It is likely that some missense mutations are pathogenic while others are benign, a distinction likely dependent on the context of the mutation and affected proteins themselves.

Overall, these studies suggest that protein-truncating de novo SNVs contribute to the risk of ASD for about 10-15% of probands (Sanders et al. 2012; Iossifov et al. 2012), though this fraction is almost certainly a conservative estimate, as an unknown fraction of de novo events are still missed using current sequencing methods and bioinformatics tools. It is important to note that the six current exome studies focused primarily on a de novo mutation genetic model for the development of disease. Recent results highlight the effect of transmitted CNVs (Poultney et al. 2013; Krumm et al. 2013), as well as a renewed emphasis on the effect of common variation in ASD (Klei et al. 2012) based on study of data generated from the same samples.

1.5 Candidate genes yet few recurrent hits

Many strong neurobiological candidates have emerged from the genes disrupted by de novo mutations in these studies, including mutations in previously identified ASD/ID genes. Several mutations were identified in NRXN1 and NLGN1; both are central components of the neurexin-neuroligin synaptic cell adhesion complex (Kim et al. 2008). Numerous de novo mutations were identified in genes or loci linked to Mendelian disorders, many of which have features of ASD or ID. These loci and genes include MBD5 (mental retardation, autosomal dominant #1, OMIM 156200), CHD7 (CHARGE syndrome, OMIM 214800), PTEN (Cowden syndrome, OMIM 158350), DYRK1A (in Down syndrome critical region, OMIM 190685), TSC2 (tuberous sclerosis, OMIM 613254), SETBP1 (Schinzel-Giedion syndrome, OMIM 269150), and RPS6KA3 (Coffin-Lowry syndrome, OMIM 303600). Finally, mutations in several genes mapped to critical deletion regions or association intervals initially discovered by large CNVs, including mutations in SYNRG (17q12 deletion syndrome), POLRMT (19p13.3 deletion), and CTTNBP2, a potential candidate for the AUTS1 (7q31) deletion locus.

Recurrently mutated genes, however, were few. In sum, only three genes (CHD8, SCN2A, and SYNGAP1) had two independent truncating de novo mutations in any single study, and no gene had more than three mutations.
Models designed to estimate the significance of recurrent de novo mutations based on gene size and context found nominal significance for CHD8, NTNG1 (O'Roak et al. 2012b), and SCN2A (Sanders et al. 2012), but most genes could not be conclusively implicated. Notably, however, a review of case-control data by Neale and colleagues of 935 cases found three additional truncating CHD8 mutations and one splice-site mutation in SCN2A, further strengthening the initial disease associations of these genes (Neale et al. 2012). In addition, within a few weeks of these initial reports, a de novo translocation was discovered mapping to CHD8 (Talkowski et al. 2012).

1.6 Large-scale resequencing of candidate genes

Given the low rate of recurrence among genes with de novo mutations, estimates of overall locus heterogeneity for ASD have yielded between 300 and 1,000 genes that could confer increased ASD risk when subjected to de novo mutation (Figure 1.1). Even if exome sequencing prices continue to fall, the cost to confirm the association for a significant fraction of these genes remains impractically high, especially if thousands or tens of thousands of samples are required as has been suggested by CNV studies. Instead, targeted next-generation resequencing of candidate genes has proven to be instrumental in associating specific genes. In particular, de Ligt and colleagues resequenced five candidate genes in a confirmation series of 765 ID patients, identifying additional mutations in CTNNB1 and GATAD2B and markedly strengthening their association with ID. Similarly, we have successfully used a molecular inversion probe (MIP) assay to capture and sequence 44 candidate genes in 2,446 ASD probands (O'Roak et al. 2012a). MIP resequencing generates complete sequence across targeted regions, can be performed at high scale and low cost (under $1 per gene per sample), and delivers higher sensitivity for targeted loci than exome sequencing due to increased sequence coverage.
Altogether, this assay yielded 27 new de novo mutations across 16 genes; of these mutations, 17 were disruptive SNVs, a fraction higher than expected by chance. The discovery of these mutations confirmed the association with ASD for CHD8 and DYRK1A and provided significant statistical support for four novel genes: GRIN2B, TBR1, PTEN, and TBL1XR1.

Figure 1.1: Estimating the number of ASD/ID risk genes. We estimate the number of ASD and ID genes, using an adaptation of the "hidden species problem" based on the ratio of genes with multiple de novo mutations to all genes with de novo mutations. For each estimate, all genes with recurrent de novo mutations are considered pathogenic, as well as a defined fraction of mutations in genes observed just once (since all de novo mutations are unlikely to be pathogenic). Including more of these singleton mutations as pathogenic, as well as including a broader range of mutation type, exponentially increases the number of ASD and ID risk loci. Thus, considering a disease model in which 15% of all truncating de novo mutations are sufficient and pathogenic, only ~50 genes are expected to be similarly sufficient in their pathogenicity; however, by including missense mutations, the number of loci rises dramatically (to over 400 loci when 15% of de novo SNVs are considered pathogenic). Taken together, this model highlights the locus heterogeneity underlying the genetic etiology of ASD and ID and suggests that the etiology of a large proportion of ASD/ID cases may not be due to a single de novo mutation (truncating or missense); rather, these cases may be the result of a complex set of interactions between multiple mutations, including SNVs, indels, and CNVs. Shaded area indicates 95% confidence intervals around estimate.
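
The "hidden species" reasoning in the Figure 1.1 caption can be illustrated with a standard richness estimator. The caption does not state which estimator was used, so the sketch below swaps in the bias-corrected Chao1 estimator as a stand-in. The gene counts are hypothetical placeholders, not values from the thesis, and the function names are introduced here for illustration only.

```python
def chao1_estimate(singleton_genes, doubleton_genes, other_recurrent_genes=0):
    """Bias-corrected Chao1 richness estimate.

    This is a stand-in estimator chosen for illustration; it is NOT
    necessarily the method behind Figure 1.1.
    """
    observed = singleton_genes + doubleton_genes + other_recurrent_genes
    unseen = singleton_genes * (singleton_genes - 1) / (2 * (doubleton_genes + 1))
    return observed + unseen


# HYPOTHETICAL counts, for illustration only (not taken from the thesis).
GENES_HIT_ONCE = 200
GENES_HIT_TWICE = 10


def risk_gene_estimate(pathogenic_fraction):
    """Estimate risk genes when only a fraction of singleton hits is pathogenic.

    All recurrently hit genes are counted as pathogenic, mirroring the
    caption's description; singletons are down-weighted by the fraction.
    """
    pathogenic_singletons = round(GENES_HIT_ONCE * pathogenic_fraction)
    return chao1_estimate(pathogenic_singletons, GENES_HIT_TWICE)


for fraction in (0.15, 0.30, 0.50):
    print(fraction, round(risk_gene_estimate(fraction)))
```

With this estimator the unseen-gene term grows roughly with the square of the number of singletons treated as pathogenic, which is one way to see why the estimated number of risk loci climbs steeply as more singleton mutations are assumed to be pathogenic.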
In sum, when considering only protein-disruptive mutations from six exome sequencing studies (four ASD and two ID) and including the resequencing of some of these candidate genes, a set of 11 genes (Table 1.2) show enrichment in cases with ASD/ID and account for approximately 2.2% of all cases. We have summarized the distribution of mutations, as well as the prevalence and coding location of mutations found in exome sequence from 6,503 samples from the NHLBI Exome Sequencing Project (ESP) in Table 1.2. For several of these genes with recurrent de novo hits in ASD probands (CHD8, GRIN2B, DYRK1A), no truncating variants were observed in the ESP. Furthermore, while control mutations are sometimes found in genes in high frequency (e.g., frameshift in SYNGAP1 at 3.2% frequency in controls), these mutations are found exclusively near the carboxy terminus of the protein and outside of functional protein domains and are unlikely to affect protein function (Figure 1.2).

| Gene | ID cases | ASD cases | Summary | ESP variants | ESP frequency |
|---|---|---|---|---|---|
| CHD8 | - | 9/2,446 | 2 (O), 7 (O*) [+ 3 (N*)] | 0 | 0/6503 |
| SCN2A | 3/151 | 2/593 | 1 (L), 2 (R), 2 (S) [+ 1 (N*)] | 1 | 7/6503 |
| SYNGAP1 | 3/151 | - | 1 (L), 2 (R) | 1 | 207/6503 (a) |
| GRIN2B | - | 3/2446 | 1 (O), 2 (O*) | 0 | 0/6503 |
| DYRK1A | - | 3/2446 | 1 (I), 1 (O), 1 (O*) | 0 | 0/6503 |
| ZNF292 | 1/151 | 1/593 | 1 (L), 1 (N) | 1 | 2/6503 |
| POGZ | - | 2/593 | 1 (I), 1 (N) | 1 | 1/6503 |
| KATNAL2 | - | 2/593 | 1 (O), 1 (S) | 1 | 1/6503 |
| TBR1 | - | 2/2446 | 1 (O), 1 (O*) | 0 | 0/6503 |
| CTNNB1 | 1/151 | 1/2446 | 1 (L), 1 (O*), [+ 1 (L*)] | 0 | 0/6503 |
| SETBP1 | 1/151 | 1/593 | 1 (O), 1 (R) | 3 | 58/6503 (a) |
| ADNP | - | 2/2446 | 1 (O), 1 (O*) | 1 | 1/6500 |
| LRP2 | 1/151 | 1/593 | 1 (I), 1 (L) | 6 | 53/6500 |
| ARID1B | - | 2/2446 | 1 (O), 1 (O*) | 5 | 314/6500 |

Table 1.2: Recurrent disruptive mutations in ID and ASD. Genes with two or more de novo truncating mutations observed in studies of ASD or ID. Summary indicates studies in which mutations were discovered. I, Iossifov et al. (Iossifov et al. 2012); L, de Ligt et al. (de Ligt et al. 2012); N, Neale et al. (Neale et al. 2012); O and O*, O'Roak et al. (O'Roak et al. 2012a; 2012b); R, Rauch et al. (Rauch et al. 2012); S, Sanders et al. (Sanders et al. 2012). Mutations found in secondary replication screens or case-control studies indicated in [brackets] with starred (*) reference. Truncating events found in the ESP database and their population frequencies are shown. (a) The truncating variants found in the EVS database in SYNGAP1 and SETBP1 genes fell at the extreme 3' end of the gene, suggesting that they do not adversely affect gene function.

Figure 1.2: Location of de novo truncating mutations in top five ASD and ID genes. Red markers indicate locations of de novo mutations in ASD and ID cases; green markers indicate locations of truncating mutations in ESP database of over 6,500 samples (see Table 1.2 for details). Mutation codes: S, Stop-gain; Fs, Frameshift; Ss, Splice-site mutation; ΔAA, amino-acid loss (non-frameshifting). Blocks indicate annotated protein domains from UniProt. Domain names, top to bottom: CD, chromodomain; DEX, Helicase ATP-binding; HELC, Helicase C-terminal; TM, transmembrane domain; IQ, IQ domain; PDZ, PDZ-binding motif; LOC, Bipartite nuclear localization signal; STK, Serine/Threonine protein kinase; PH, Pleckstrin homology domain; C2, C2 domain; SH3, SRC Homology 3 Domain.

1.7 Novel candidates and their neurobiology

Many of the top genes from recent exome studies are novel candidates for ASD and ID, including the strongest overall association: CHD8, an ATP-dependent chromodomain helicase that directly regulates CTNNB1 (Beta-catenin; Nishiyama et al. 2012) as well as the p53 pathway (Nishiyama et al. 2009). The CHD8 protein has known binding activity with another chromodomain helicase, CHD7 (Batsukh et al. 2010), which is the key protein in CHARGE syndrome, a rare syndrome with high ASD comorbidity (Betancur 2011).
In addition to directly interacting, both are homologues of the Drosophila trithorax group protein kismet and are components of large chromatin remodeling complexes thought to be important in neural crest cell differentiation (Bajpai et al. 2010). Overall, eight de novo truncating mutations were observed across 2,597 cases in this gene; in contrast, no such mutations were observed in control siblings, or in over 6,500 exomes in the ESP. The frequency of mutations in this gene is the highest of all genes screened thus far and nearly matches that of CNVs at 16p11.2, which is the most frequent recurrently deleted (0.5%) or duplicated (0.3%) locus in sporadic ASD (Kumar et al. 2009; Walsh and Bracken 2011).

The second strongest overall association, with two truncating mutations in ASD cases and three such mutations in ID cases, is SCN2A, a gene previously associated with epilepsy and seizure disorders (Kamiya et al. 2004; Ogiwara et al. 2009). SCN2A encodes a voltage-gated sodium channel (type II, alpha 1; Nav1.2) that is expressed throughout the brain and is responsible for the generation and propagation of action potentials in neurons. The phenotype associated with this gene appears to be highly variable. Given the smaller number of ID cases, the prevalence of mutations in SCN2A is higher in ID; however, one of the ID cases also shows signs of autism. Lastly, only one of the five cases had a history of seizures (de Ligt et al. 2012), suggesting that mutations in SCN2A have highly variable phenotypic outcomes.

Another striking candidate is DYRK1A (dual-specificity tyrosine phosphorylation-regulated kinase 1A), for which three truncating mutations have been discovered in autism probands (O'Roak et al. 2012b; 2012a) and de novo structural variants in ID probands (van Bon et al. 2011; Moller et al. 2008). DYRK1A is a highly conserved gene whose dosage imbalance has been implicated in the cognitive deficits associated with Down syndrome.
The gene interacts with the SWI/SNF complex (Lepagnol-Bestel et al. 2009) and is considered a master regulator of brain growth, affecting diverse aspects of neurogenesis, including neuronal proliferation, morphogenesis, differentiation, and maturation (Mazur-Kolecka et al. 2012; Guedj et al. 2012; Yabut et al.). Mutations in the Drosophila ortholog (mnb) have been known for more than 20 years and result in a "minibrain phenotype" where optic lobes and central brain hemispheres are reduced (Tejedor et al. 1995). Similarly, heterozygous knockout mice for Dyrk1A (+/-) show a region-specific reduction of brain volume as well as mental impairment (Fotaki et al. 2002; Song et al. 1996). Consistent with these models, all three human loss-of-function autism patients are cognitively impaired and microcephalic (z-score < -2).

Three truncating mutations each of GRIN2B and SYNGAP1, and two mutations of TBR1, highlight the importance of excitatory/glutamatergic signaling in both ASD and ID, and these are perhaps some of the most conclusive previously implicated genes to date. GRIN2B (found in an ASD case with ID) forms a subunit of an NMDA receptor associated with learning and memory, and, in addition to its discovery in ASD, targeted sequencing has linked it to neurodevelopmental disorders (Endele et al. 2010). This receptor participates in a larger postsynaptic complex with SYNGAP1, in which three mutations in ID patients have been observed in the present cohorts, as well as in multiple previous screens of ID (Vissers et al. 2010; Hamdan et al. 2009). Interestingly, while no mutations of SYNGAP1 were found in the Simons Simplex Collection (SSC) ASD cohorts, SYNGAP1 mutations were recently implicated in several cases of ID with ASD (Berryer et al. 2012). Finally, TBR1, together with the CASK protein, regulates transcription of GRIN2B (as well as several other candidate ASD genes, such as RELN and AUTS2) (Bedogni et al. 2010).
1.8 Protein interaction networks converge on common pathways

Knowledge of molecular-level interactions between proteins has enabled the development of transcriptional networks (Voineagu et al. 2011) and protein-protein interaction (PPI) networks enriched for mutation in ASD and ID cases. These networks provide a powerful method to unify the landscape of mutations observed in genetically heterogeneous human disorders by leveraging regulatory interactions between genes and/or physical interactions between proteins. For example, Iossifov et al. found that 14/59 genes disrupted by de novo mutations were significantly enriched (p < 0.006) in a group of 842 genes previously defined (Darnell et al. 2011) as regulated by FMR1, the key protein disrupted in Fragile X syndrome, and noted that this was not true for mutations found in siblings (2/28 among FMR1-regulated genes), for the general population, or for all missense variants (Iossifov et al. 2012). Neale et al. performed a similar analysis with respect to previously identified ASD and ID risk genes (including a core set of 31 synaptic genes identified from previous proteomic studies) and found that genes with nonsynonymous de novo mutations had a significantly reduced network distance (i.e., they were more closely associated in the network) than did a set of "comparator" genes derived from silent de novo mutations and sibling mutations (Neale et al. 2012). Lastly, we developed a network for interactions between proteins corresponding to genes with de novo mutations, revealing a single connected component for 39% (49/126) of genes with disruptive or likely disruptive missense de novo mutations (O'Roak et al. 2012b). Notably, in follow-up MIP resequencing, we targeted ~50% network and ~50% non-network genes and found that 94% (16/17) of the newly discovered truncating mutations fell within the network (or a similar, expanded 74-gene network), an observation unlikely to have occurred by chance (p = 0.0002).
In contrast, the non-network genes had only six total mutations, only one of which was a truncating mutation.

I integrated the results from the six exome studies by forming PPI networks using experimentally verified interaction data from StringDB (Jensen et al. 2009) (Supplemental Methods). I found the PPI network based on all truncating and missense mutations in probands was significantly more clustered, had more edges, and created larger connected components than randomly sampled or permuted networks (p ≤ 0.009 for all tests; Supplemental Methods); in contrast, neither the genes with mutations in siblings, nor those with synonymous mutations (in either proband or siblings) showed any difference from the null distribution of networks (Table S1). In order to summarize these PPI networks, I connected all truncating mutations as well as six genes with missense mutations with important roles in brain development (Figure 1.3; Supplemental Methods). The two largest connected components of this combined network encompass three broad functional pathways: the first connected component (13 proteins) forms a highly interconnected set of postsynaptic scaffolding proteins and receptors, including SYNGAP1, DLG4, GRIN2A/B, NLGN1 and NRXN1, while the second (9 proteins) contains both WNT-signaling functions of CTNNB1, DLL1, and TBL1XR1 and chromatin remodeling functions, anchored by the CHD8 protein. It is important to emphasize that while the nodes in the displayed network are partially based on a manually selected set of genes, the connected components formed are a strict subset of the unbiased PPI simulations described above and are larger than any connected component that can be formed using disruptive mutations found in siblings, synonymous changes, or randomly chosen genes.

Figure 1.3: Genes disrupted by de novo mutations in ASD and ID form a central connected network.
Genes with de novo truncating mutations (red nodes) or selected missense mutations (blue nodes) in four ASD exome studies and two ID exome studies are connected using experimentally derived PPI data from StringDB. Only medium- and high-confidence experimental interactions are shown, though we note that these may not always represent local protein-protein interactions or interactions within the same subcellular compartment. Peripheral nodes (lighter shades) represent genes with additional truncating de novo mutations, which are separated from the central network by only a single node (white nodes; for this analysis we excluded SUMO1/SUMO2 and UBC, which are highly connected but nonspecific nodes).

Interestingly, Gilman et al., using a novel probabilistic framework (NETBAG) in conjunction with CNV data from SSC families, highlighted several genes and pathways that anticipated, and overlap remarkably with, those found in the present exome-based studies (Gilman et al. 2011). In particular, their model showed enrichment of the canonical WNT pathway, postsynaptic complexes, and dendritic spine development (e.g., DLG4, SYNGAP1) and several proteins involved in chromatin remodeling, including BAZ1B and SMARCA2, both of which interact with the central nodes of the chromatin remodeling network (Figure 1.3).

The WNT pathway and chromatin remodeling modules of the network are linked by interaction between CHD8 and CTNNB1/Beta-catenin. Both of these proteins play important roles in neural development and growth: Beta-catenin, via downstream WNT pathways, influences neuronal migration, polarity and synaptogenesis (Salinas and Zou 2008), and constitutive overexpression of beta-catenin in mice results in macrocephaly (Chenn and Walsh 2003). CHD8 negatively regulates beta-catenin via direct binding and, furthermore, downregulates beta-catenin responsive genes by recruitment to their promoter regions (Thompson et al. 2008).
Strikingly, ASD cases with truncating mutations in CHD8 have significant macrocephaly (O'Roak et al. 2012a), while all three cases with truncating mutations in beta-catenin have microcephaly (de Ligt et al. 2012; Bernier et al. 2014). These reciprocal phenotypes suggest that CHD8 and beta-catenin form a regulatory network that controls head size by altering neuronal migration and growth during development (Figure 1.4). Other proteins with de novo mutations in this network include TBL1XR1, which binds beta-catenin (Cadigan 2008), and DLL1, which is expressed in neural progenitor cells and part of the Delta/Notch signaling pathway (Barton and Fendrik 2013).

Figure 1.4: CHD8 and Beta-catenin/CTNNB1 putative regulatory network for head size. a) Truncating mutations in CTNNB1 (red arrows) and CHD8 (blue arrows) are found in patients with small and large head circumference, respectively. Gray histogram represents background distribution of age- and sex-corrected head circumference Z-scores for 2,446 probands from the SSC. (The exact head circumference for one case [marked with *] with clinically reported microcephaly could not be determined, so Z-score was estimated at -2.0, or the clinical threshold for microcephaly). b) A putative regulatory model of head growth where CHD8 negatively regulates CTNNB1 (Thompson et al. 2008); CTNNB1 promotes head growth and constitutive over-expression of CTNNB1 in mice results in macrocephaly (Chenn and Walsh 2003).

Convergence on a second common pathway, chromatin remodeling, has primarily been driven by overlap between genetic syndromes and de novo mutations in sporadic ASD and ID. As discussed, CHD8 possesses ATP-dependent chromatin remodeling activity and directly interacts with CHD7 (Batsukh et al. 2010), which is responsible for CHARGE syndrome, a complex syndrome in which up to two-thirds of patients have been found to have ASD (Betancur 2011).
Several de novo missense mutations in ASD cases have been noted in genes encoding chromodomain helicase proteins, including CHD7 and CHD3, and a de novo frameshift in CHD2 was found by Rauch et al. in an ID case (Rauch et al. 2012). A second syndrome, Coffin-Siris syndrome (OMIM 135900), characterized by ID and severe speech delays, was recently attributed to truncating mutations or disruptive CNVs in ARID1B (encoding a subunit of the SWI/SNF chromatin remodeling complex; Santen et al. 2012), and one de novo frameshift of ARID1B was found in a sporadic ASD case (O'Roak et al. 2012b). Additional disruptive de novo mutations recognized in ASH1L, KDM6B, and MLL5 suggest that the chromatin remodeling activity of these proteins may be an underlying pathway implicated in ASD and ID (Iossifov et al. 2012). Finally, we note that mutations in KANSL1 (formerly KIAA1267), a histone acetyltransferase with similar p53 regulatory activity to CHD8, were recently found to underlie 17q21.31 microdeletion syndrome, in which ID is a characteristic feature (Koolen et al. 2012). No mutations in KANSL1 have been found in ASD cases, however, though this is likely due to exclusion of known clinical syndromes from these cohorts.

In addition to these newly proposed pathways, de novo mutations also highlight the importance of genes with roles in synaptic function and localization, a pathway previously suspected to be disrupted in ASD (Glessner et al. 2009). Many of these genes with de novo mutations form a closely related network of postsynaptic proteins, including the GTPase activating protein SYNGAP1, NMDA receptor subunits GRIN2B and GRIN2A, the scaffolding proteins DLG4 and CASK (the underlying mutation in CASK syndrome, OMIM 300749), and NRXN1, which has been previously associated with ASD (Kim et al. 2008).
In conjunction with TBR1, CASK also transcriptionally activates several known neurodevelopmental genes, such as RELN, a gene with critical roles in neuronal development, synaptogenesis, and plasticity (Wang et al. 2004). Finally, this pathway is closely linked to SHANK3, a previously identified ASD protein with up to 1% mutation frequency in ASD cases (Durand et al. 2007; Moessner et al. 2007), although no mutations in this gene have been identified in the six studies presented here. While the reasons for this are not fully clear, it is likely that the high GC content of the gene impedes current short-read sequencing platforms.

Interaction networks (Figure 1.3) can also suggest novel targets for mutation screens or functional studies. For example, while DLGAP1 plays a central role in connecting the "Synaptic Function" component to beta-catenin, no mutations have been observed in DLGAP1. Similarly, SMARCA4 connects BRWD1 to the in-network ADNP protein. These proteins, as well as other "nearby" proteins suggested by PPI networks, can provide novel targets for mutation screens and deeper functional/pathway study. It is likely that sequencing studies of patients will identify novel candidates for PPI networks, creating an iterative process by which networks and genetics mutually inform each other.

Despite their widespread role in the current study of ASD and ID, PPI networks have several important limitations. First, protein interactions are difficult to assay experimentally and often are not assayed at a proteomic scale, resulting in false negatives and false positives in databases. In addition, the extent to which the temporal and spatial nature of interactions is captured is also limited, and in our network we do not distinguish between different interaction types (regulatory or physical) or cellular compartments. For example, while CASK binds NRXN1 presynaptically (Fairless et al. 2008), binding to the transcription factor TBR1 is in the nucleus (Hsueh et al. 2000).
Second, although our PPI network only uses experimentally verified interactions, the impact and weight of interactions can vary considerably for different nodes, especially for "hub" nodes, which can interact with hundreds of other proteins. Finally, current PPI networks do not take into account the functional impact of mutations on the proteins or the interactions themselves.

1.9 Assaying copy number variation

Recent advances in technology have made possible genome-wide discovery of rare CNVs and the estimation of copy number for copy number polymorphisms (CNPs). Most commonly, array comparative genomic hybridization (array-CGH; Pinkel et al. 1998) or single nucleotide polymorphism (SNP) microarray platforms (Peiffer et al. 2006) have been used to interrogate the copy number of thousands to millions of positions within the genome. Briefly, these technologies work by hybridizing fluorescent-labeled DNA to targeted oligonucleotide probes on a glass slide and use a scanner to measure the hybridization intensity of each fragment. Copy number variation increases or decreases this intensity, either relative to a control sample (for array-CGH) or in absolute units (for SNP microarray platforms); a CNV can be inferred ("called") from the presence of multiple nearby probes with increases or decreases in signal.

However, microarray-based technology has several significant limitations. First, the minimum size of a detectable CNV is limited by the number of probes printed on the microarray and further by the fact that generally at least 20 probes are required to distinguish a true CNV from random noise. Thus, even with high-resolution microarrays of 1 million or more probes, studies have typically examined CNVs greater than 50 or 100 kbp in size. A second limitation of microarray platforms is their poor performance within segmental duplications in the human genome (Bailey et al. 2002; Sudmant et al. 2010).
Thus, the majority of studies to date have excluded the analysis of duplicated and CNP loci.

The advent of whole-genome sequencing using "next-generation" short-read technology has resulted in the development of several additional methods for CNV/CNP discovery and genotyping. Two broad approaches exist. The first comprises read-depth methods, which leverage the relatively uniform distribution of sequenced reads throughout the genome (the "read-depth") to estimate copy number; these methods can estimate the copy number of genomic segments as small as 1 kbp (Alkan et al. 2009; Yoon et al. 2009; Chiang et al. 2008). The second class of CNV discovery tools is based on the "paired-end" alignment of sequences from the two ends of a DNA fragment of known approximate size: when the ends of the fragment align to distant genomic loci, a deletion in the fragment can be inferred, and similar rules can be used to infer insertions and (unique to this method) inversions (Hormozdiari et al. 2009; Korbel et al. 2009; 2007). Paired-end approaches are able to find very small CNVs less than 1 kbp in size, though they can suffer from a high false positive rate and require a large amount of sequence data as well as computational resources.

1.10 Thesis goals

This introduction and review of published literature has emphasized the role of de novo variation in ASD and ID and has cast light on how genes recurrently affected by de novo loss-of-function mutation in probands are part of functionally important and connected pathways. However, the rarity and effect size (versus unaffected siblings) of de novo mutations in ASD cannot fully explain the overall heritability of the disorder, suggesting that additional underlying genetic etiologies, mechanisms and effects also play a role in ASD.
Furthermore, given the extensive heterogeneity of both genotypes and phenotypes seen in ASD, it is likely that multiple genetic etiologies underlie ASD risk in families, and that any explanation of the genetics of ASD is incomplete if based on de novo mutations alone.

Therefore, the first aim of this dissertation is to fully explore the spectrum of mutations that contribute to ASD risk. To do this, I develop novel methods that leverage exome sequencing data to assay small CNVs, especially those under 100 kbp that have been missed using traditional assay methods such as array-CGH. Chapter one describes CoNIFER (copy number inference from exome reads), a novel computational method that uses exome read-depth data to infer copy number. First, I establish the sensitivity and precision of the method by comparison with standard high-resolution microarrays and by forward validation of novel events. I demonstrate that exome-based detection of rare CNVs has up to 10-fold higher sensitivity for small events less than 10 kbp. In addition, I show that exome-based copy number correlates with true copy number for regions with 0-10 copies, suggesting that exome-based methods can be used for quantitative assessment of copy number. Knowledge of these smaller CNVs, many of which affect only a few exons of a single gene, fills out the spectrum of mutations, from the smallest single base-pair mutations through small and large CNVs, and provides a more complete picture of the genes and pathways involved in ASD.

The second aim of this dissertation is to understand the role of rare inherited mutations in ASD. In chapter two, I apply CoNIFER to exome data from 411 ASD families and examine the pattern of inherited variation between the ASD-affected probands and their unaffected siblings. I show that rare inherited CNVs confer increased risk for ASD, even in the context of "sporadic" autism. These inherited variants are correlated with specific ASD-related phenotypes, including IQ and social ability.
Finally, inherited variants found to be associated with ASD also are more likely to affect highly brain-expressed genes and are more likely to be part of existing disease and disorder pathways.

In chapter three, I expand the hypothesis that inherited events contribute to ASD risk to SNVs. Using an expanded resource of nearly 2,400 families, I demonstrate that ultra-rare SNVs carry pathogenic risk for ASD similar to that seen in CNVs and find that this risk is specifically related to loss-of-function SNVs in genes that do not tolerate functional (i.e., deleterious) variation in control populations. These SNVs also mirror CNVs in that they are correlated with specific phenotypes, in particular the IQ of ASD individuals.

This dissertation describes the integration of multiple genetic etiologies (de novo and inherited) and types of mutations in the context of simplex ASD. First in chapter two, and more fully in chapter three, I utilize a logistic regression model of ASD risk to directly compare the effects of de novo and inherited variation, for both CNVs and SNVs. These models suggest that all four combinations contribute an independent statistical component of risk for the development of ASD. Finally, I also examine how multiple types and classes of mutation converge on specific genes and interactions and suggest new ASD candidate genes as well as an integrated etiology for risk of autism (Chapter III). In particular, convergence of de novo SNVs and inherited CNVs suggests that CSMD1, a complement control protein, may be an ASD risk factor. This gene is particularly interesting in the context of a neurodevelopmental disorder in that it displays strong and specific pan-brain expression, participates in dendritic spine restructuring, and has been implicated in other disorders with neurological basis, such as schizophrenia. I also examine how multiple mutations within one individual may lead to ASD.
One case, described in chapter three, has mutations in two previously identified ASD genes that are part of the same complex (NLGN2 and NRXN3). Underscoring the importance of the entire spectrum of genetic mutations, this interaction was discovered only by examination of inherited and de novo mutations, as the NRXN3 mutation was an inherited CNV and the NLGN2 mutation was a de novo SNV.

II. Copy number variation detection and genotyping from exome sequence data

2.1 Summary

While exome sequencing is readily amenable to single nucleotide variant discovery, the sparse and non-uniform nature of the exome capture reaction has hindered exome-based detection and characterization of genic copy number variation. We developed a novel method using singular value decomposition (SVD) normalization to discover rare genic copy number variants (CNVs) as well as genotype copy number polymorphic (CNP) loci with high sensitivity and specificity from exome sequencing data. We estimate the precision of our algorithm using 122 trios (366 exomes) and show that this method can be used to reliably predict both de novo and inherited rare CNVs involving three or more consecutive exons. We demonstrate that exome-based genotyping of CNPs strongly correlates with whole-genome data (median r² = 0.91), especially for loci with fewer than eight copies, and can estimate the absolute copy number of multi-allelic genes with high accuracy (78% call level). The resulting user-friendly computational pipeline, CoNIFER (copy number inference from exome reads), can reliably be used to discover disruptive genic CNVs missed by standard approaches and should have broad application in human genetic studies of disease.

This chapter has been published: Krumm, N., Sudmant, P. H., Ko, A., O'Roak, B. J., Malig, M., Coe, B. P., et al. (2012). Copy number variation detection and genotyping from exome sequence data. Genome Research, 22(8), 1525-1532.
doi:10.1101/gr.138115.112

2.2 Introduction

Targeted capture and sequencing of coding exons ("exome sequencing") has revealed common single-nucleotide polymorphisms (SNPs), rare sequence variants, short indels, and breakpoints of structural variation (Ng et al. 2009b; for review see Bamshad et al. 2011), but has been largely refractory to the discovery of copy number variants (CNVs). In contrast to whole-genome sequencing data, exome capture and sequencing results in non-uniform read-depth between captured regions and strong systematic biases between batches of samples. These biases, as well as the sparse nature of the capture, make exome sequencing unsuitable for "traditional" CNV detection algorithms, such as raw read-depth (Alkan et al. 2009; Yoon et al. 2009), read-pair alignment (Hormozdiari et al. 2009) or split-read mapping (Karakoc et al. 2011).

In this study, we combine read-depth data from exome sequencing with singular value decomposition (SVD) methods to discover rare CNVs and genotype known copy number polymorphic (CNP) regions from eight HapMap samples and 122 autism spectrum disorder (ASD) mother-father-proband trios sequenced as part of a separate study designed primarily to discover de novo SNPs and indels (O'Roak et al. 2011). We validated the discovered events using orthogonal datasets, including whole-genome sequencing and tiling array comparative genomic hybridization (array-CGH) data for HapMap samples, and SNP array and quantitative PCR for events discovered in the autism trios. In light of the tens of thousands of exomes already sequenced, we believe this method will have widespread application for the discovery and association of both rare and common copy number variation in disease, and will complement existing methods to discover single-nucleotide variation from exome-sequencing data.
2.3 Methods

Samples and datasets

We used exome sequencing data from eight HapMap individuals (NA12878, NA15510, NA18507, NA18517, NA18555, NA18956, NA19129, and NA19240; available in the NCBI Sequence Read Archive under accession SRA039053 or SRP007298) and exomes from 122 mother-father-proband ASD trios (for 366 total individuals). In addition, we utilized exome data from 533 individuals from the NHLBI Exome Sequencing Project (ESP) as a means to derive accurate estimates of the distribution of sequence coverage at each exon. Underlying exome sequence data are available from the Short Read Archive for the HapMap exomes (SRA039053) and from the dbGaP exchange area (for ASD exomes: phs000482.v1.p1; for ESP exomes: phs000279.v1.p1, phs000290.v1.p1, phs000291.v1.p1, phs000281.v1.p1, phs000254.v1.p1, with additional cohorts pending; more information at http://evs.gs.washington.edu/EVS/). All exomes were captured using either the Roche NimbleGen EZ Exome SeqCap Version 2 (for ESP samples and ASD trios) or Version 1 (for HapMap samples) in-solution exome capture kits (44 Mbp captured, including 36 Mbp exon target). Short-read sequencing was performed using either an Illumina HiSeq2000 platform or an Illumina GAII, with a mix of 50 bp and 76 bp paired-end reads (Table 2.1; see Supplementary Note for additional details).

Cohort | # Samples | Capture Version | Passed QC | Average Number of Mapped Reads | Average Number of Mappings
HapMap | 8 | Roche NimbleGen EZ Exome SeqCap Version 1 | 8 | 138,593,483 | 158,568,475
Autism Trios | 122 probands and 244 parents | Roche NimbleGen EZ Exome SeqCap Version 2 | 366 | 119,461,629 | 143,574,053
NHLBI Exome Sequence Project | 613 | Roche NimbleGen EZ Exome SeqCap Version 2 | 533 | 127,125,719 | 152,787,950

Table 2.1: Cohorts analyzed

Mapping

Sequence reads were divided into non-overlapping 36 bp constituents and mapped to exons and the 300 bp flanking sequence of the repeat-masked hg19 reference sequence using mrsFAST (Hach et al. 2010), allowing for up to two mismatches per 36 bp.
We calculated RPKM (reads per thousand bases per million reads sequenced; Mortazavi et al. 2008) values for 194,080 exome capture targets (see Supplementary Note) and excluded from further analysis 3,964 probes with a median RPKM of less than one, as these probes had likely failed or were improperly targeted.

Singular value decomposition

RPKM values were transformed into standardized z-scores (termed Z-RPKM values) based on the mean and standard deviation across all analyzed exomes and organized into an exon-by-sample matrix (X). Using SVD, we decomposed X into three matrices:

$X = U S V^{T}$

In order to remove the strongest k components, we set the singular values $S_1, \ldots, S_k$ to zero to form $S'$, and then recalculated X as the product of U, $S'$ and $V^{T}$ (Fig. 2.1). We termed these final values SVD-ZRPKM values, each of which represents the normalized relative copy number of an exon in a sample.

Validation

For this study, we specifically selected samples that had been subjected to extensive prior experimental validation. Copy number variation of the eight HapMap samples was previously assessed by whole-genome shotgun sequencing and targeted clone sequencing (Kidd et al. 2008), and with data from the 1000 Genomes Project (Sudmant et al. 2010). Accurate estimates of copy number for duplicated loci were determined experimentally by single-channel array-CGH data and qPCR (Sudmant et al. 2010; Campbell et al. 2011). CNV data for the 366 autism exomes were obtained by SNP microarray (Illumina 1M) and by targeted array-CGH as described previously (O'Roak et al. 2011; Sanders et al. 2011).

CoNIFER implementation

We implemented our algorithm as a collection of Python programs under the name CoNIFER (copy number inference from exome reads), available at http://conifer.sourceforge.net. CoNIFER accepts either BAM alignment files or files containing RPKM values from samples and outputs a number of charts (e.g., scree plots), a text file containing calls, and images corresponding to each call.
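The normalization described under "Singular value decomposition" above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the CoNIFER source code: the function names `rpkm`, `zrpkm` and `svd_zrpkm` are hypothetical, the sum of per-target read counts stands in for the number of reads sequenced, and every target is assumed to have nonzero variance across samples (targets with a median RPKM below one would be filtered out beforehand, as described in the text).

```python
import numpy as np


def rpkm(read_counts, target_lengths_bp):
    """Reads per thousand bases of target per million reads.

    read_counts: (n_targets, n_samples) reads assigned to each target.
    target_lengths_bp: (n_targets,) target lengths in base pairs.
    Assumption: the column sum of read_counts approximates the number
    of reads sequenced for each sample.
    """
    counts = np.asarray(read_counts, dtype=float)
    lengths_kb = np.asarray(target_lengths_bp, dtype=float)[:, None] / 1e3
    millions_of_reads = counts.sum(axis=0, keepdims=True) / 1e6
    return counts / lengths_kb / millions_of_reads


def zrpkm(rpkm_matrix):
    """Standardize each target (row) by its mean and SD across samples."""
    mean = rpkm_matrix.mean(axis=1, keepdims=True)
    sd = rpkm_matrix.std(axis=1, keepdims=True)
    return (rpkm_matrix - mean) / sd


def svd_zrpkm(z, k):
    """Zero the k largest singular values of z and rebuild the matrix.

    Returns the normalized matrix and the original singular values,
    which can be plotted as a scree plot to choose k.
    """
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    s_prime = s.copy()
    s_prime[:k] = 0.0          # remove the k strongest components
    return (u * s_prime) @ vt, s
```

A typical call would be `normalized, singular_values = svd_zrpkm(zrpkm(rpkm(counts, lengths)), k=12)`, where `k` is chosen by inspecting `singular_values`. Returning the singular values alongside the normalized matrix is a convenience of this sketch, so that the same decomposition serves both the scree plot and the normalization.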
Additionally, the raw SVD-ZRPKM values can be saved, facilitating genotyping of CNP loci and further analysis. The computational resources required to run CoNIFER are modest. BAM-format files can be converted into read-depth files in approximately 20 to 30 minutes; then, given read-depth or read-count values for targeted exons or probes, CoNIFER and the SVD normalization can be run with minimal hardware requirements (e.g., 500 samples processed in less than one hour using 4 GB or less of memory).

2.4 Results

Our method exploits differences in sequence read-depth from exome datasets to predict copy number variation (Fig. 2.1). We focused on characterizing two distinct classes of genetic variation: rare CNVs and CNPs. The former are individually rare in populations (less than 1% frequency) and are predominantly found in unique regions of the genome. In contrast, CNPs are common, both between individuals and between populations, and are frequently associated with segmental duplications (Girirajan et al. 2010). The absolute copy number of multi-allelic CNPs embedded in segmental duplications ranges widely from zero to more than 40 copies, and this variation is typically referred to as multi-copy or multi-allelic (Sudmant et al. 2010). Crucially, as the total copy number of CNPs is estimated as the sum of both haplotypes (i.e., the copy number is not phased), independent re-assortment of parental haplotypes obfuscates the pattern of inheritance for CNPs between parents and offspring. Moreover, our approach utilizes relative read-depth values for each exon; for exons with highly diverse copy number across a population, the population standard deviation will be high as well, thus shrinking the range of relative values observed at that exon. In effect, this makes a threshold-based discovery algorithm less sensitive for CNPs and exons of high copy number diversity, but does not impact genotyping of these CNPs and exons when their location is known.
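The last point can be made explicit with the standardization itself. In the notation below, which is introduced here for illustration only and is not the chapter's own, $x_{e,s}$ is the RPKM of exon $e$ in sample $s$, and $\mu_e$ and $\sigma_e$ are the mean and standard deviation of that exon across all analyzed exomes:

```latex
z_{e,s} = \frac{x_{e,s} - \mu_e}{\sigma_e},
\qquad
\left| z_{e,s} \right| = \frac{\left| x_{e,s} - \mu_e \right|}{\sigma_e}
```

For a fixed departure from the population mean, $|z_{e,s}|$ falls in proportion to $1/\sigma_e$, so doubling $\sigma_e$ halves the standardized value. An exon with high copy number diversity is therefore less likely to cross a fixed discovery threshold. Under the standardization alone, the ordering of samples at that exon is unchanged, which is consistent with genotyping at known loci being unaffected.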
Because of these fundamental differences, we chose to pursue the characterization of CNVs and CNPs differently: for CNVs, discovery within the exome data was unbiased by location, whereas for CNPs, we used a priori information regarding the location of copy number variable loci. Furthermore, when estimating the precision for rare inherited CNVs, we excluded calls within segmental duplications or CNP loci, as the inheritance pattern for these loci cannot be determined without phased copy number information.

For discovery of rare CNVs (Fig. 2.1, Fig. S1), we removed between 12 and 15 (k) singular values, a number we empirically adjusted based on the inflection point of the scree plot (Fig. S2). We set discovery thresholds at -1.5 or +1.5 SVD-ZRPKM for rare deletions and duplications, respectively, and required at least three exome probes to exceed the threshold (Supplementary Note). For genotyping CNP regions in the genome, we opted to remove only five components in order to prevent the SVD algorithm from removing bona fide signal from highly CNP loci. The genotype value was calculated by determining the average of the SVD-transformed ZRPKM values for the exons/targets in the region of interest. As the output from our algorithm provides a relative value, we estimated absolute copy number from the SVD-ZRPKM values via two methods: 1) by using population frequency information of copy number states (Campbell et al. 2011) and 2) by creating a standard curve using copy number estimated from whole-genome sequencing data of matched HapMap samples (Sudmant et al. 2010; Supplementary Note).

An overview of our method is presented in Figure 2.1. Briefly, sequence reads from each exome were mapped to exons using mrsFAST (Hach et al. 2010), which allows reads within a set edit distance to map to multiple locations.
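The discovery and genotyping rules stated above (a threshold of -1.5 or +1.5 SVD-ZRPKM, at least three probes, and a regional average for CNP genotyping) can be sketched as follows. This is a simplified illustration rather than the CoNIFER implementation: the function names `call_cnvs` and `genotype_region` are hypothetical, the input is assumed to be the SVD-ZRPKM values of one sample ordered by position along a single chromosome, the three probes are assumed to be consecutive, and no smoothing is applied before thresholding.

```python
from itertools import groupby


def call_cnvs(values, threshold=1.5, min_targets=3):
    """Return (start, end, kind) for runs of targets beyond the threshold.

    values: SVD-ZRPKM values for one sample, in genomic order.
    start and end are 0-based target indices; end is exclusive.
    kind is "dup" for values above +threshold, "del" below -threshold.
    """
    def state(value):
        if value > threshold:
            return "dup"
        if value < -threshold:
            return "del"
        return None

    calls = []
    index = 0
    # groupby collapses consecutive targets that share the same state
    for kind, run in groupby(values, key=state):
        length = len(list(run))
        if kind is not None and length >= min_targets:
            calls.append((index, index + length, kind))
        index += length
    return calls


def genotype_region(values, start, end):
    """Average SVD-ZRPKM over the targets of a known CNP region."""
    region = values[start:end]
    return sum(region) / len(region)
```

For example, `call_cnvs([0.1, 1.8, 2.1, 1.6, 0.0])` reports a single duplication spanning targets 1 to 3. Converting the regional average from `genotype_region` into an absolute copy number would additionally require a calibration such as the standard curve described in the text, which this sketch does not attempt.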
Similar to RNAseq data analysis, we calculated RPKM values and transformed these into standardized z-scores (termed ZRPKM values) based on the mean and standard deviation across all analyzed exomes. These were subsequently organized into an exon-by-sample matrix (X). Next, we implemented an SVD algorithm to overcome the systematic biases that pervade exome capture reactions. Since "singular values" can be used to examine the relative amount of variance contributed by each component, we used a plot of these singular values, known as a "scree plot", to identify this experimental noise. Our analysis reveals that the first 10-15 components contribute disproportionately to the variance of the data (Fig. S2). Given that we expect biological variation, in the form of rare CNVs as well as common CNPs, to be a minor contributor to the overall variance of the exon-by-sample matrix X, we formulated the basis of our algorithm by eliminating these strongest components. We selected the number of components for elimination based on the inflection point of the scree plot.

Figure 2.1: Method overview and CNV discovery. (a) Exome sequencing reads from FASTQ files were divided into non-overlapping 36 bp constituents and (b) aligned to targeted regions, allowing for up to two mismatches per 36 bp alignment. (c) For each exon or targeted region, we calculated RPKM values and then transformed these into "ZRPKM" values based on the median and standard deviation of each exon across all samples. (d) ZRPKM values were input into the SVD transformation, where we removed the first 12-15 singular values. Finally, a centrally weighted 15-exon average was passed over the SVD-ZRPKM values in order to reduce false positives, and a ±1.5 SVD-ZRPKM threshold was used to discover CNVs. (e) ZRPKM values from 1,000 consecutive exons on chromosome 16, plotted for 533 ESP exome background samples (black traces) and NA18507 (pink trace).
Blue bar corresponds to a rare duplication in NA18507 at the METTL9/OTOA locus at chr16p12.2 that was validated by SNP microarray CNV analysis.

The number of components selected for removal is an important parameter in our algorithm and warrants further consideration. Removing too few components leaves residual systematic bias in the data; conversely, removing too many components begins to remove bona fide signal from exomes, especially at large, common segmental duplications within which a large proportion of analyzed exomes contribute strongly altered read-depth signal. However, individually rare CNVs do not contribute significantly to the overall variance of the sample-by-exon matrix, making it unlikely that removing the first 12-15 components of the SVD results in loss of signal for these rare events.

The SVD method depends on concurrently analyzing many samples, so that systematic noise becomes evident and can subsequently be removed. For the eight HapMap samples, we included an additional 533 ESP samples and removed 12 components. For analysis of the ASD trios, we combined the 122 trios (366 samples) with 366 randomly selected samples from the ESP dataset and removed 15 components. In our comparison of mrsFAST and BWA mappings, we used 492 ESP samples (for which BWA mappings were available) and the eight HapMap samples. Overall variance was lower in the BWA-based mappings, so only six components needed to be removed during the SVD normalization.

Rare CNV discovery

To discover rare CNVs, we initially restricted our analysis to events where there was a change in copy number state for three or more consecutive exons. In order to assess the precision of our method, we intersected our exome-based deletion and duplication calls for five of the HapMap control genomes with calls from a previous high-resolution array-CGH analysis of the same genomes (2010).
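The discovery rule (three or more consecutive exons beyond a ±1.5 SVD-ZRPKM threshold) can be sketched as a simple scan over one sample's profile. This is an illustrative sketch only: the function name and return format are hypothetical, and the centrally weighted 15-exon smoothing mentioned in the Figure 2.1 legend is deliberately left out.

```python
def call_cnvs(values, threshold=1.5, min_exons=3):
    """Return (start, end, kind) runs of consecutive exons beyond the threshold.

    `values` is one sample's SVD-ZRPKM profile in genomic order, `end` is
    exclusive, and `kind` is "dup" or "del". Runs shorter than `min_exons`
    are discarded.
    """
    calls = []
    run_start, run_kind = None, None
    for i, v in enumerate(list(values) + [0.0]):  # trailing sentinel closes a final run
        if v > threshold:
            kind = "dup"
        elif v < -threshold:
            kind = "del"
        else:
            kind = None
        if kind != run_kind:
            if run_kind is not None and i - run_start >= min_exons:
                calls.append((run_start, i, run_kind))
            run_start, run_kind = i, kind
    return calls
```

For example, a profile with three consecutive exons above +1.5 yields one duplication call, whereas two consecutive exons below -1.5 are ignored because the run is too short.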
Of the 32 detected events (Table S1), seven were rare CNVs and 25 were CNPs; after intersecting with the reference set and requiring 10% reciprocal overlap, our method yielded 6/7 (86%) precision for rare CNVs and 16/25 (64%) for CNPs (Table 2.2). We also estimated sensitivity from this comparison: starting with 486 high-resolution array-CGH calls from the five HapMap samples that overlapped at least three exome probes, we restricted calls to those in unique/diploid regions of the genome (i.e., outside of segmental duplications, duplicated genes, and regions of somatic variation such as the HLA locus; Fig. S4). In this set of 41 calls (Table S2, lines annotated "Rare" and "CNP"), our algorithm identified 5/5 rare CNVs, but only 3/36 (8%) of CNPs (example calls in Fig. S7).

                            Rare CNVs     Common CNPs    Total
≥ 10% Reciprocal Overlap    6/7 (86%)     16/25 (64%)    22/32 (69%)
Any Overlap                 7/7 (100%)    19/25 (76%)    26/32 (81%)
No Overlap                  ---           6/25 (24%)     6/32 (19%)

Table 2.2: Precision of exome-based CNV calls in HapMap samples

The relative paucity of rare CNVs in the HapMap cohort prompted us to estimate the precision of our method for rare CNVs using a larger set of 122 ASD trios. In our initial analysis, we applied the same filters for unique/diploid calls as above (Fig. S3), resulting in 191 calls among 97 probands. We identified eight putative de novo events (6.6% incidence; Table S3). For six of these, we were able to validate the event using available Illumina SNP microarray data as well as targeted array-CGH experimental data (Sanders et al. 2011). We could not confirm two de novo events, both of which were single-exon duplications of FAF2 (data not shown). Next, we considered inherited events using our exome read-depth analysis and found that 128 events in the probands were inherited from either the mother or the father (Table S3a). For 117/128 (91.4%) of these events, the SVD-ZRPKM values of both the proband and the parent exceeded the detection threshold (±1.5).
However, for 11/128 (8.6%) of these calls, the SVD-ZRPKM values of proband and parents were just below the deletion or duplication threshold required for calling, and the inheritance status was determined by manual inspection. Finally, inspection of the SVD-ZRPKM values for the remaining 55 events (14 loci; Table S3b) revealed that these events strongly resembled CNP sites or mapped to genes for which processed pseudogenes exist. An example of such a locus can be seen at the DAZL gene (Fig. S8: sample 13517.p1; chr3:15,636,820-17,640,105). The lack of phased copy number information precludes these loci from inheritance-based precision analysis (as the independent assortment of haplotypes can alter copy number in the offspring), and we therefore excluded them from the precision analysis of rare CNV events. In total, we found 128 rare inherited events and validated 6/8 rare de novo events (excluding processed pseudogenes and duplicated exons), leading to a precision for the discovery of rare CNVs in the autism cohort of 134/136 (98.5%; Table 2.3). Example calls from this experiment are shown in Figure S8.

                     Total    Validated
De novo CNVs         8        6 validated by SNP microarray
Inherited CNVs       127      116 have call made in parents (exceeded threshold)
                              11 manually inspected SVD-ZRPKM values revealed inheritance
Overall Precision    133/135 (98.5%)

Table 2.3: Precision of exome-based CNV calls in autism trios

We also gauged the accuracy of the exome-based CNV discovery against previously generated Illumina SNP microarray experiments (Sanders et al. 2011). SNP microarray data were available for 108 of 128 predicted inherited events and all eight predicted de novo events. We found that 70% (76/108) of the inherited and 75% (6/8) of the de novo CNV events were confirmed by the SNP microarray (Table S3a).
Given the high concordance rate of exome-based events within trios (>98%), the lower overlap vis-à-vis SNP microarray experiments likely reflects platform-specific differences in resolution and sensitivity and not an increased false-positive rate in the exome data.

Genotyping copy number polymorphic variants

We took two approaches in assessing our method's ability to determine the copy number of CNPs: 1) a relative correlation approach between the continuous SVD-ZRPKM values and whole-genome-sequence derived copy number estimates, and 2) an unsupervised clustering approach of exome-based genotype values in order to derive absolute copy number states for CNP loci.

Figure 2.2: CNP locus genotyping of RHD and C4A. (a) SVD-transformed values for exons for the Rhesus deletion factor locus (RHD/RHCE) show distinct copy number states across both paralogous genes. (b) Histogram of average SVD-ZRPKM values for the ESP dataset (533 individuals) and seven HapMap samples. Clustering was performed using an unsupervised algorithm (Supplementary Note). (c) Correlation between SVD-ZRPKM genotype values (y-axis) and absolute copy number estimate (x-axis) based on whole-genome read-depth for seven HapMap samples and experimentally validated by array-CGH. (d-f) Similar to above, for the C4A locus.

For the first approach, we selected 62 previously identified CNP loci and genes (Sudmant et al. 2010; Table S4) and calculated the copy number of each locus based on whole-genome read-depth data using previously described methodology, which has been experimentally validated using single-channel array-CGH intensity data. For each locus, we correlated the estimated whole-genome copy number with the average of SVD-ZRPKM values for the exons in the locus (Fig. 2.2). The median r² value between exome-based and whole-genome-based genotyping at each locus was 0.91 (Fig. 2.3a; Table S4), indicating a high degree of agreement between exome and whole-genome copy number estimation for CNP loci.
Furthermore, after stratifying the results by the median copy number of each locus, we found that for loci with median copy number of eight or less, 32 of 39 loci (82%) were highly correlated (r² ≥ 0.9), but for loci with median copy number greater than eight, the median locus r² was only 0.32.

Second, we assessed the accuracy of our approach in determining the absolute copy number of common CNPs. We leveraged available genotype information for seven of the HapMap samples in this study across 43 autosomal CNP loci previously studied by Campbell and colleagues (Table S5; Campbell et al. 2011). For each locus, we again used the locus-average of SVD-ZRPKM values and clustered these genotype values using an unsupervised clustering algorithm (Supplementary Note). Each cluster was then assigned the most likely copy number based on the most common copy number state previously identified. Using this unsupervised method, we correctly predicted absolute copy number for 235/301 (78%) calls (Table S5), with an overall absolute genotype correlation across all 43 CNP loci of r² = 0.74.

Figure 2.3: Genotyping accuracy across 62 CNP loci. (a) Distribution of correlation coefficients of SVD-ZRPKM to whole-genome copy number estimate across 62 CNP loci for seven HapMap samples, split by the median copy number of each locus. (b) Correlation between SVD-ZRPKM score and relative (by median and standard deviation) whole-genome copy number estimate for 39 loci with ≤ 8 copies; and (c) for 23 loci with > 8 copies. Whole-genome read-depth copy number estimates for these specific sites and genomes were orthogonally validated using single-channel intensity data from previous array-CGH experiments.
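The correlation-based genotyping described above can be illustrated with a short sketch: a locus genotype value is the mean SVD-ZRPKM over the exons of the locus, and a least-squares line against whole-genome copy number estimates gives both r² and a per-locus standard curve. All function names are hypothetical, and an ordinary least-squares fit is assumed here as one plausible form of the standard curve.

```python
from statistics import mean

def locus_genotype(svd_zrpkm_by_exon):
    """Genotype value for one sample: mean SVD-ZRPKM over the exons of a locus."""
    return mean(svd_zrpkm_by_exon)

def fit_standard_curve(genotypes, copy_numbers):
    """Fit copy_number = slope * genotype + intercept; also return r squared.

    Assumes both inputs vary across samples (non-zero variance).
    """
    gx, cy = mean(genotypes), mean(copy_numbers)
    sxx = sum((g - gx) ** 2 for g in genotypes)
    syy = sum((c - cy) ** 2 for c in copy_numbers)
    sxy = sum((g - gx) * (c - cy) for g, c in zip(genotypes, copy_numbers))
    slope = sxy / sxx
    intercept = cy - slope * gx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

def predict_copy_number(genotype, slope, intercept):
    """Estimate an integer copy number for a new exome sample at this locus."""
    return round(slope * genotype + intercept)
```

A locus with a high r² (for example, the loci with eight or fewer copies) would give a usable curve, whereas a locus with low r² would not support reliable absolute copy number estimates.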
Our algorithm uses relative read-depth values (introduced both by the Z-transformation and the SVD algorithm itself) in order to overcome significant batch biases in exomes, thus sacrificing the genome-wide linear model of read-depth and copy number exploited by whole-genome structural variation discovery algorithms. Nonetheless, the two approaches presented above can be used to "anchor" the relative SVD-ZRPKM values to absolute copy number. First, the strong r² correlation for many loci can be exploited as a "standard curve" for each locus, from which the absolute copy number for exome samples can be estimated. Alternatively, SVD-ZRPKM values can be clustered (Supplementary Note) into copy number groups, thus facilitating absolute copy number estimates without the use of whole-genome data.

Comparison with other methods

As we generated our read-depth estimates from mrsFAST-based alignments, we were interested to see how our method would perform using BWA-based alignments. The BWA alignments were generated using commonly used parameters and filtering steps suitable for SNP-centric analyses, including removal of reads with multiple mappings (Supplementary Note). We calculated RPKM values from these BWA alignments for the HapMap samples and a subset of the ESP exomes. We observed that the signal for rare deletions and duplications in the HapMap samples was attenuated (Fig. S5), and that the median signal-to-noise ratio for the seven rare deletions and duplications was 58% lower for the BWA-based mappings (Table S6; Supplementary Note). In addition, we genotyped 47/62 loci in Table S4 and found a striking difference in the correlation between BWA-based mappings (median r² = 0.36) and mrsFAST-based mappings (median r² = 0.92). The remaining 15/62 loci did not have any probes with adequate BWA read-depth, making them intractable, and therefore false negatives, by this approach.
The difference in correlation with mrsFAST mappings was most notable for loci with copy numbers between 7 and 12 (Fig. S6b). These data highlight the importance of considering reads with multiple mappings, especially for loci with increased copy number (e.g., the LRRC37A3 locus; Fig. S6c). These differences, however, do not solely reflect differences between the alignment algorithms, but rather differences across the entire alignment and post-processing pipeline.

Finally, we compared our algorithm to ExomeCNV (Sathirapongsasuti et al. 2011), which is designed to detect copy number aberration in the context of cancer using closely matched tumor-normal pairs of exomes. Nevertheless, we were interested to see if ExomeCNV could be used to detect germline variation. We analyzed (using default settings; see Supplementary Note) four HapMap exomes with NA19240 as the reference and compared the results to a validated call set from these genomes (2010). Overall, ExomeCNV predicted 450 CNVs, of which only 63 (14%) had more than 10% reciprocal overlap with the validated call set. In contrast, our algorithm identified 24 calls among these four samples, of which 21 (88%) overlapped the validated call set. We note that ExomeCNV uses uncalibrated read-depth to estimate copy number, and, depending upon batch effects, this can result in the algorithm reporting a significant fraction of the exome as non-diploid (Fig. S9). Furthermore, similar to the BWA-based alignments (see above), ExomeCNV has limited dynamic range in CNP loci and duplicated genes: the average r² correlation across tested CNPs was 0.57 (compared to r² = 0.92 for our algorithm; Fig. S10).

2.5 Discussion

We have outlined a method for making read-depth data from exomes amenable to rare CNV discovery, as well as copy number genotyping of CNP loci. We used SVD normalization to overcome a host of coverage biases introduced by the capture and sequencing of exomes.
Our method allows for differing sample preparations and capture reactions to be integrated into the same experiment, provided each "batch" is sufficiently large (n ≥ 8). This includes correct normalization of the X chromosome, such that deletions and duplications can be assayed regardless of the sample's sex. Additionally, our method can integrate exomes captured with different exome capture target designs: the eight HapMap exomes were captured using the Roche NimbleGen SeqCap EZ Version 1, while all other exomes in our experiments were captured using the SeqCap EZ Version 2 capture kit. Remarkably, we find that sufficient dynamic range response remains to accurately predict the copy number of duplicated genes up to eight copies. The upper limit of this response is likely an effect of the stoichiometry of the exome-capture reaction, and we suggest that this may be improved simply by adjusting the concentrations and targets of exome-capture platforms.

Another important consideration in interpreting exome read-depth data is the presence of polymorphic processed pseudogenes. In our study of autism trios, we found that 14% (26/191) of events correspond to changes in the copy number of processed pseudogenes residing elsewhere in the genome, often in segmental duplications. Such events have been difficult or impossible to discover using traditional SNP microarray approaches, as the probes for these assays often do not explicitly target the coding exons themselves. While such events may be easily inferred based on the absence of intronic sequence, a comprehensive catalog of polymorphic processed pseudogenes will improve detection of bona fide exonic deletions and duplications.

We envision a number of algorithmic improvements. Although using mrsFAST mappings both increases the signal-to-noise ratio for rare CNVs and improves genotyping accuracy for CNPs, these mappings often cannot distinguish between paralogous genes.
By restricting the RPKM calculation to exons and regions that contain paralog-specific single nucleotide variants (Sudmant et al. 2010), we hope to extend our method to genotype duplicated genes in a paralog-specific manner. We also expect to lower the minimum number of exons required to detect a CNV. We applied our method to genotyping single exons (such as the third exon of GHR; Santos et al. 2004) and found that the SVD-ZRPKM values robustly distinguished different copy number classes. A discovery set of copy number polymorphic exons, genes, and loci, together with their copy number states in populations, will better inform future disease-association studies. Finally, although array-based technologies have described many CNP-disease associations (Girirajan et al. 2010), discovery of loci has been limited to those with low median copy number, and our approach will be able to examine CNP loci with higher copy number. Using our approach with large clinical cohorts currently undergoing exome sequencing, we expect to find new disease associations with rare CNVs, CNP loci, and paralog-specific copy number of known CNP loci.

ACKNOWLEDGEMENTS

We thank S. Ng, S. McGee, and T. Brown for helpful comments in the preparation of this manuscript, M. State and the Simons Simplex Collection Genetics Consortium for providing Illumina genotyping data, and A. Schachtel for suggesting the CoNIFER name. This work was supported by NIH grants HD065285 (E.E.E.), HHSN273200800010C (D.A.N.), and HL102926 (D.A.N.) and the Simons Foundation Autism Research Initiative (E.E.E.). E.E.E. is an Investigator of the Howard Hughes Medical Institute.

III. Transmission disequilibrium of small CNVs in simplex autism

3.1 Summary

We searched for disruptive, genic rare CNVs among 411 families with sporadic autism spectrum disorder (ASD) from the Simons Simplex Collection using available exome sequence and CoNIFER (copy number inference from exome reads).
Our approach yielded increased sensitivity for smaller genic rare CNVs compared to high-density SNP microarrays (~2X higher yield), especially for CNVs smaller than 20 kbp. We find that affected probands inherit more CNVs than their siblings (453 vs. 394, p = 0.004; OR = 1.19), and these affect more genes (921 vs. 726, p = 0.02; OR = 1.30). These smaller CNVs (median size 18 kbp) are transmitted preferentially from the mother (136 maternal vs. 100 paternal, p = 0.02), although this bias occurs irrespective of affected status. The excess burden of inherited CNVs among probands is driven primarily by sib-pairs with discordant social behavior phenotypes (p < 0.0002, measured by SRS score), in contrast to families where the phenotypes are more closely matched or less extreme (p > 0.5). Finally, we found strong enrichment for brain-expressed genes unique to probands, especially in the discordant SRS group (p = 0.0035). In a combined risk model, our set of inherited CNVs, de novo CNVs and de novo SNVs all independently contributed to the risk of autism (p < 0.05). Taken together, these results suggest that small transmitted rare CNVs play a role in the etiology of simplex autism. Importantly, the small size of these variants aids in the identification of specific genes as additional risk factors associated with ASD.

This chapter has been published: Krumm, N., O'Roak, B. J., Karakoc, E., Mohajeri, K., Nelson, B., Vives, L., et al. (2013). Transmission Disequilibrium of Small CNVs in Simplex Autism. American Journal of Human Genetics, 93(4), 595-606. doi:10.1016/j.ajhg.2013.07.024

3.2 Introduction

Discovering the mutations and the genes responsible for autism spectrum disorder (ASD) requires an assessment of the full spectrum of genetic variation within families, including both de novo and inherited events. There is compelling evidence that a diverse range of de novo mutations play an important role, including copy number variants (CNVs; Levy et al. 2011; Sanders et al.
2011; Sebat et al. 2007; Glessner et al. 2009; Pinto et al. 2010), single nucleotide variants (SNVs), and insertions and deletions (indels) (Iossifov et al. 2012; O'Roak et al. 2012b; Sanders et al. 2012; O'Roak et al. 2011). However, taken together, de novo variation does not fully explain the genetic etiology of ASD: only ~8% of probands carry a de novo CNV and ~10-20% carry a pathogenic de novo SNV or indel. Many of these mutations likely play a pathogenic role in the development of ASD, especially in the context of sporadic (or "simplex") ASD. Nevertheless, the heritability of ASD is estimated to be between 50% and 90% (Bailey et al. 1995; Hallmayer et al. 2011), much higher than the fraction of disease explained to date, suggesting that additional genetic factors contribute to the etiology of ASD.

The prevalence of rare CNVs smaller than 50 kbp has been underestimated in previous surveys using oligonucleotide microarrays (Levy et al. 2011; Sanders et al. 2011), and their role in ASD has yet to be explored. Such pathogenic events could in principle provide as much specificity as exonic de novo mutations with respect to genes and informative protein networks. Several recent methods based on exome sequencing read-depth data have enabled the discovery of small genic CNVs previously missed by microarray (Krumm et al. 2012; Fromer et al. 2012).

In this study, we tested the hypothesis that small genic inherited CNVs also contribute to the genetic etiology of sporadic autism. Several lines of evidence are potentially supportive of this hypothesis, including increased prevalence of the broader autism phenotype (BAP) in parents of affected children (Losh et al. 2008; Davidson et al. 2012), trends for higher burden of extremely rare singly-transmitted CNVs in simplex families (Levy et al. 2011), and enrichment for large CNVs in cases versus unrelated controls (Pinto et al. 2010).
In contrast, other previous studies that examined inherited CNVs in ASD found no significant excess of inherited burden in probands with ASD, although these studies were mainly designed to detect de novo CNVs (Sanders et al. 2011). Here, we present evidence for transmission distortion for smaller CNVs (median size ~18 kbp) by investigating families where both affected and unaffected siblings have been exome-sequenced. The availability of whole-exome sequence data for our samples has the advantage of increased sensitivity for small, genic CNVs affecting two or more exons, as well as allowing us to integrate both rare SNVs and CNVs to develop a model to explain the genetic architecture of ASD.

3.3 Methods

CNV detection from exome sequence data

We analyzed exome sequencing data from families ascertained as part of the Simons Simplex Collection (Fischbach and Lord 2010). Underlying FASTQ sequence data were obtained from 391 published ASD quads (O'Roak et al. 2012b; Sanders et al. 2012; Iossifov et al. 2012), and we generated sequence data for an additional 19 unaffected siblings from published trios (O'Roak et al. 2011). The data set includes sequence data (median coverage >50x) from 411 families where a proband, unaffected sibling, mother and father (termed a quad) had all been sequenced, for a total of 1,644 samples (see Tables S1 and S2 for details). Sequence reads were split into 36mers and mapped using the mrsFAST alignment program (Hach et al. 2010) to the Nimblegen EZ-SeqCap v2 targets (including 300 bp around each target and allowing two mismatches per 36mer). We used CoNIFER (Krumm et al. 2012) to calculate exon-level coverage and remove systematic bias between samples and targets.
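The read-splitting and coverage steps can be illustrated with a minimal sketch. The function names are hypothetical; discarding a trailing fragment shorter than 36 bp is an assumption, and the coverage measure shown is the standard RPKM definition (reads per kilobase of target per million mapped reads) that CoNIFER takes as its starting point.

```python
def split_read(sequence, size=36):
    """Split a read into non-overlapping pieces of `size` bp.

    Assumption: a trailing fragment shorter than `size` is discarded.
    """
    return [sequence[i:i + size] for i in range(0, len(sequence) - size + 1, size)]

def rpkm(reads_in_target, target_length_bp, total_mapped_reads):
    """Reads per kilobase of target per million mapped reads."""
    return 1e9 * reads_in_target / (target_length_bp * total_mapped_reads)
```

Under these assumptions, a 100 bp read contributes two 36mers, and a 1 kbp target with 500 mapped reads in a library of one million mapped reads has an RPKM of 500.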
Using a custom pipeline (Figure S1 and Supplemental Methods), we 1) segmented our CoNIFER SVD-ZRPKM values using the DNACopy algorithm (Venkatraman and Olshen 2007), 2) minimized false negatives by a quad-based genotyping method, 3) clustered CNVs into overlapping CNVRs, and 4) removed CNVs found in duplicated or repetitive genomic space. We limited our final call set to inherited CNVs (i.e., transmitted CNVs) that were present in 10 or fewer families (or approximately 1% population frequency), and we excluded CNVs that primarily fell within duplicated or highly polymorphic regions of the genome. We considered a CNV "rare" if it occurred in 10 or fewer families and "private" if it was observed in only one family. Lastly, we did not include CNVs on the X chromosome in any analysis, and all de novo CNVs were excluded from burden analyses except where noted. Throughout this paper, we define "CNV burden" as the number of rare CNVs per individual.

Array comparative genomic hybridization

We designed a customized CGH microarray (Agilent SurePrint G3 4x180k CGH microarray; probe density ranging from one probe per 125 bp to one probe per 5 kbp, depending on the size of the event to be validated) and selected 161 CNVs from a subset of 80 samples, stratified by proband/sibling (36 probands and 44 siblings) and by dataset (26 from Iossifov et al., 22 from O'Roak et al., 32 from Sanders et al.; Tables S1 and S5). Minimum deletion and duplication thresholds for validation were determined by ROC curve analysis of known positive and negative control CNVs (Figure S3).

Phenotypic measures and models

The Social Responsiveness Scale (SRS) was used as a quantitative measure of social deficits (Constantino and Gruber). We had complete phenotype information (SRS for both proband and sibling, and full-scale IQ for proband) for 389 families in this study (Table S7) based on data from the SSC.
The probands in this study had a median SRS t-score of 82, significantly higher (i.e., more severely affected) than the median SRS score of our unaffected siblings (45; p < 0.00001, two-tailed paired t-test). We defined mild, moderate and severely affected individuals based on published thresholds (Constantino and Gruber).

Expression analysis

Gene expression data were obtained from the Human U133A/GNF1H Gene Atlas (GEO: GSE1133), comprising 79 human tissues, including 18 nervous system tissues (Su 2004). Expression values were averaged across multiple probes when available. We defined a gene to be expressed in a given tissue if it ranked in the top 5% of all genes for that tissue. To measure enrichment, we compared the fraction of genes unique to either siblings or probands expressed in each tissue, and empirical p-values were calculated by shuffling proband/sibling labels 20,000 times and recomputing tissue-level expression enrichment. We FDR-corrected for 79 tests (one per tissue), and statistical significance was assessed at q < 0.05.

Combined mutation model

We generated a list of truncating de novo SNV mutations (nonsense, frameshift or splice mutations) discovered in our 411 quads from published lists (Iossifov et al. 2012; O'Roak et al. 2012b; 2011; Sanders et al. 2012). Both de novo and inherited CNV burden were derived from this work (Tables S3 and S4). We used a logistic regression model, which transforms the binary outcome (i.e., affected vs. unaffected) such that linear predictors can be used. The model shown in Figure 3.5 is summarized as:

logit[P(Affected = 1)] ~ intercept + (de novo CNV burden) * (inherited CNV burden) * (de novo SNV burden)

Data availability

The CoNIFER output files for 1,644 samples are deposited in the National Database of Autism Research (NDAR) under the NDAR Collection ID 1878 and the title of this manuscript.

3.4 Results

Samples and CNV discovery

We discovered a total of 847 transmitted, exonic, rare, autosomal CNVs (Table 3.1).
This included 453 CNVs transmitted to probands and 394 transmitted to unaffected siblings. Overall, the median estimated CNV size was 18.1 kbp (range 150 bp to 5.18 Mbp, or 2 to 320 exons). The median size of inherited CNVs was slightly larger in probands (19.4 kbp) than in unaffected siblings (16.6 kbp), but this difference was not statistically significant. As expected, duplications outnumbered deletions (519 vs. 328; p < 1×10^-10, binomial two-tailed test) and duplications were significantly larger than deletions (two-sided Mann-Whitney U test, p < 1×10^-16). The excess of duplications depended upon the size of the event. For example, rare CNVs involving 20 or more exons were overwhelmingly duplications (139 duplications vs. 25 deletions), while counts for small events were not significantly different (73 duplications and 93 deletions for 2-exon CNVs). This difference is observed irrespective of disease status (Figure S4).

Category                        CNVs   Dups   Dels   Median size (est.)   % of samples   CNVs >500 kbp   CNVRs
All Proband CNVs                453    277    176    19.4 kbp             64%            21              390
All Sibling CNVs                394    242    152    16.6 kbp             60%            16              345
Father → Both                   199    130    69     16.7 kbp             41%            7               94
Father → Proband Only           100    67     33     25.0 kbp             19%            7               93
Father → Sibling Only           82     52     30     15.4 kbp             18%            2               80
Mother → Both                   233    127    106    15.0 kbp             48%            10              118
Mother → Proband Only           136    82     54     24.9 kbp             26%            5               128
Mother → Sibling Only           97     61     36     21.7 kbp             21%            6               94
Either Parent → Proband Only    236    149    87     25.0 kbp             39%            12              211
Either Parent → Sibling Only    179    113    66     19.3 kbp             36%            8               168
Mother → Either Offspring       466    270    196    17.8 kbp             86%            21              313
Father → Either Offspring       381    249    132    18.6 kbp             72%            16              252
Totals                          847    519    328    18.1 kbp             62%            37              525

Table 3.1: Summary of transmitted CNVs in 411 ASD quads

Validation using SNP microarray and targeted array-CGH

We assessed the specificity of our call set by comparing our larger calls to Illumina 1M/Duo SNP microarray data and then selecting a subset of 80 samples for validation of smaller CNVs by array comparative genomic hybridization (array-CGH). These 80 samples carried a total of 161 exome-based CNV calls, of which 69 (43%) were confirmed by SNP microarray (Figure 3.1a). Using a customized microarray design (Methods), we were able to test 86/92 of the remaining calls and confirmed an additional 65 events (nearly a twofold increased yield of CNVs) (Table S5). Of the 27 events that were not validated by array-CGH, 14 (or 9% of all 161 calls) were found to be part of processed pseudogenes (i.e., retro-transcribed mRNA), which masquerade as duplications in exome-based discovery of CNVs; these events, while not genomic CNVs, are in fact true duplications of these genes or exons. Thus, we estimate an overall false positive rate (FPR) of 4%-8% (7/155 tested, or 13/161 in total; Figure 3.1a), dependent on the number of probes (or exons) in each call: for calls with fewer than 10 exons, the false positive rate was ~7% (6/104), while only one call with 10 or more exons did not validate (2% or 1/51). There was no difference in the FPR between probands and siblings (3/68 [4.2%] for probands, 4/80 [4.5%] for siblings; Table S6).

Figure 3.1: Discovery and validation of previously undiscovered CNVs using exomes.
(a) Fraction of CNVs previously identified using Illumina 1M SNP microarray (gray, "known true positives"), the fraction of previously undiscovered CNVs identified and confirmed by targeted array-CGH in this study (green, "previously undiscovered CNVs"), confirmed processed pseudogenes (hatched green) and the overall false positive rate for unconfirmed CNVs (gray). (b) The majority (73%, 152/207) of all previously undiscovered calls (green) discovered using exomes were smaller than 20 kbp. (c-d,f) Three examples of previously undiscovered CNVs in this study. Top: CoNIFER output and normalized coverage at each exon. Middle: targeted array-CGH at the CNV locus, with threshold for deletion/duplication (dotted red line) as determined by ROC-curve analysis of known CNVs (Supplemental Methods). Bottom: Illumina 1M SNP microarray data for the locus, showing poor probe coverage (c and d only). (e) Exome-based CNV discovery affords high exon-level specificity, as indicated by duplication of NETO1 exons (CoNIFER call). Previous work (Sanders et al. 2011) had discovered this CNV (*), but the (incorrect) breakpoints did not extend into NETO1.

We also assessed the sensitivity (one minus the false negative rate, FNR) of our calls versus the previously identified CNVs from SNP microarray data. We found that our pipeline identified 72% (FNR of 0.28) of all known CNVs intersecting at least two exons and supported by 10 SNP microarray probes. False negative CNVs corresponded to samples with reduced mapped sequence coverage (Figure S2). For example, the Iossifov dataset (Iossifov et al. 2012) had an approximately twofold higher FNR, likely due to the lower overall sequence coverage in these exomes (a known factor in exome-based CNV discovery; Krumm et al. 2012; Fromer et al. 2012). In addition, the FNR for calls affecting only two exons was significantly higher than for those with three or more exons (Table S6).
We found no differences in the mapped coverage, estimated false positive rates or false negative rates among siblings and probands (p > 0.3, Fisher's two-sided exact test; Table S6).

Increased inherited CNV burden among autism probands.

We compared the burden of inherited CNVs in the 411 probands and their siblings in terms of the total number of CNVs and the total number of genes "hit". We find that probands inherit more CNVs than siblings (453 vs. 394; Figure 3.2a) and that these CNVs harbor more genes (921 vs. 726; Figure 3.2b). These comparisons are significant when using a paired t-test of proband-sibling pairs (p = 0.02 for genes and p = 0.004 for CNVs, two-tailed paired t-test) and when comparing the summed values for probands and siblings in aggregate (p < 1 x 10^-6 for genes and p = 0.046 for CNVs, binomial two-tailed test). In order to ensure that these results were not driven by a few outlier families, we bootstrapped our data and calculated the confidence intervals for the proband-to-sibling burden (Figure S5). For CNVs, we found a median burden increase of 1.19 (95% CI: 1.09-1.29) and for genes a burden increase of 1.30 (95% CI: 1.10-1.52) across 10,000 bootstrap replicates, thereby rejecting the null hypothesis that probands have no increased inherited CNV burden in comparison to their siblings (Figure S5). Proband CNV burden was elevated over siblings across all size ranges, although individual quintile bins did not independently achieve statistical significance, due to their smaller size (Figure 3.2c). We find no significant enrichment of burden in either the smallest or the largest CNVs (chi-square test, χ² = 1.18, df = 5, p = 0.95), suggesting that the burden was not exclusively the result of either small or large CNVs.

Figure 3.2: Increased inherited CNV burden in ASD probands for large and small CNVs. (a) Total number of rare (observed in fewer than 10 families) inherited CNVs (≥ 2 exons) for 411 ASD probands (Pro) and their unaffected siblings (Sib).
(b) Total number of affected genes in rare inherited CNVs. P-values are two-tailed paired t-tests between proband and sibling counts. (c) Burden of inherited CNVs across six size categories.

Previous work has indicated that private or ultra-rare CNVs may be more likely to be pathogenic than simply "rare" (e.g., < 1% frequency) CNVs (Levy et al. 2011). We therefore examined if the inherited burden in probands was due to private CNVs in a small subset of the 411 families. We examined 271 private CNVs in probands and 245 private CNVs in siblings, but found no enrichment of private burden when compared to rare CNVs (p = 0.74, Fisher's exact test; Figure S6a), nor did we find enrichment for the number of affected genes (p = 0.46, Fisher's exact test; Figure S6b). (Note: the burden was in fact slightly increased when considering all rare events.)

We searched for additional factors which could underlie the proband-sibling burden differential. We found no significant differences in CNV burden dependent on the sex of the proband or the sibling, the concordance of their sexes, or the birth order of proband and sibling (p > 0.5, Fisher's exact test; Table S8). However, we note that the highest overall CNV burden was found in families with one affected proband and at least three unaffected siblings. In fact, there was a linear increase in burden between probands and siblings across increasing family size, culminating in a 1.38x higher burden of CNVs in probands with three or more unaffected siblings. Finally, we analyzed our dataset for parent-of-origin effects and found a greater number of maternally transmitted CNVs (136 maternal vs. 100 paternal, binomial two-tailed p-value = 0.02), but this effect was not significantly enriched in probands versus siblings (Fisher's exact test odds ratio = 1.14, two-tailed p = 0.49).
Nonetheless, when we considered a null hypothesis in which a given transmitted CNV was equally likely to be transmitted to the proband only, the sibling only, or both (each with 1/3 probability), we found strong evidence that CNVs were not transmitted in equal fashion (Table 3.1; chi-square test with equal expected proportions, χ² = 16.4, df = 5, p = 0.0058), and that CNVs transmitted from the mother to the proband only were significantly more common than other transmissions.

CNV burden-phenotype correlation.

We assessed whether the increased inherited CNV burden would segregate with markers of ASD phenotypic severity using phenotype data from the SSC. First, we utilized the Social Responsiveness Scale (SRS), a standardized parent- or teacher-completed questionnaire which measures the severity of autism symptoms in social settings (but is not a diagnostic indicator of ASD and was not used in ascertainment of the SSC). We partitioned our 411 families into two groups based on the SRS t-score: 1) we defined "Discordant SRS quads" as those where the proband was severely affected (SRS t > 75) and the sibling mildly affected (SRS t < 60), and 2) "Concordant SRS quads" as all others (Figure S7). The concordant group encompassed a range of moderately affected probands as well as some moderately affected siblings (Figure S7). There were a total of 276 discordant SRS proband-sibling pairs and 115 concordant pairs based on this definition. We found a striking split between the discordant and concordant proband-sibling pairs: the increased CNV and gene burden was almost completely driven by the discordant pairs (Figure 3.3a; p < 0.0002 for CNVs, p < 0.02 for genes, two-tailed paired t-test), and there was virtually no difference at a group or family level for concordant SRS pairs overall (1.04x, p > 0.5). Moreover, the burden ratio between probands and siblings was increased in the discordant group (for CNVs: 1.27x; for genes: 1.41x) over the ratio for the full set of 411 quads.
Finally, we found that offspring (probands and siblings) with SRS scores ≥ 60 ("moderate" and "severe" range) had higher CNV burden than did all offspring with SRS score < 60 (361 CNVs in 390 mildly affected offspring [0.92] vs. 436 CNVs in 388 moderately/severely affected offspring [1.12]; two-tailed independent t-test p < 0.0094). There was no statistically significant difference in burden between probands and siblings within each group (i.e., SRS < 60 or ≥ 60); however, the relatively low number of "affected" siblings and "unaffected" probands hampers these comparisons.

Figure 3.3: Inherited CNV burden correlates with SRS phenotype. The Social Responsiveness Scale measures autism features in social settings via parent report on 65 items. (a) We classified proband-sibling pairs with severely affected probands but mildly or unaffected siblings as "Discordant SRS" quads (276 quads), and all other quads as "Concordant SRS" quads (115 quads). Strikingly, the discordant SRS quads fully recapitulated the inherited CNV transmission bias, whereas the concordant SRS quads did not show a differential burden. (b) CNV burden was independent of full-scale IQ (FSIQ), and probands with either low FSIQ (≤ 70) or high FSIQ had more CNVs than did their siblings. P-values refer to two-tailed paired t-tests between probands and siblings.

Proband FSIQ | Proband CNVs | Sibling CNVs | Ratio | Two-tailed t-test (probands vs. sibs)
All Quads
≤ 70 | 157 | 126 | 1.25 | p = 0.014
71-85 | 89 | 70 | 1.27 | p = 0.029
≥ 86 | 184 | 166 | 1.11 | NS
Discordant SRS Quads (Proband SRS > 75, Sibling SRS < 60)
≤ 70 | 138 | 104 | 1.32 | p = 0.004
71-85 | 62 | 44 | 1.40 | p = 0.012
≥ 86 | 113 | 101 | 1.12 | NS
Concordant SRS Quads
≤ 70 | 19 | 22 | 0.86 | NS
71-85 | 27 | 26 | 1.04 | NS
≥ 86 | 71 | 65 | 1.09 | NS

Table 3.2: Summary of IQ and SRS burden.
P-values represent two-tailed paired t-tests between probands and siblings in each group.

Next, we considered if the full-scale IQ (IQ) of the probands was affected by inherited CNVs. Since IQ scores were only available for probands (Table S7), we grouped quads into three groups: IQ ≤ 70 ("low", consistent with a diagnosis of intellectual disability), between 71 and 85 ("intermediate"), and greater than 85 ("high"). The CNV burden was significantly greater for probands in the "low" and "intermediate" proband IQ bins (1.25-1.27x burden, Table 3.2). Probands with "high" IQ did not show statistically significant enrichment over siblings, although a trend was still apparent (1.11x, Table 3.2). When we examined the effect of SRS and IQ together (Table 3.2, Table S8 and Figure S8), we found that the burden differential was strongest for the most severely affected probands (those with IQ ≤ 85 and part of discordant SRS quads), reaching 1.32-1.40x for CNVs (p = 0.004). However, there was no significant burden between probands and siblings in SRS concordant quads, even with "low" IQ probands (0.86x-1.09x, p > 0.5; Table 3.2), indicating that the inherited burden may be most closely aligned with SRS score, and not IQ (however, we caution that there were only 22 quads total in this group).

Enrichment for brain-expressed genes in inherited CNVs:

We observed a trend for more of the proband-only genes to be highly expressed in brain-related tissues (19/317 or 6% for proband only vs. 6/224 or 2.7% for sibling only; Table S8; see Methods). The effect becomes most pronounced when considering discordant SRS quads (15/256 genes [5.9%] in probands and 2/170 [1.2%] in siblings; p = 0.007; Figure 3.4). When we considered all genes highly expressed in at least one brain-related tissue, we found that significantly more probands (57/411, or 13.9%) had a CNV affecting such a gene than did their siblings (33/411, or 8.0%; OR = 1.85, p = 0.009, Fisher's exact test).
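As a rough check of the carrier comparison above, the following sketch recomputes the sample odds ratio and a two-sided Fisher's exact p-value from the reported counts (57/411 probands and 33/411 siblings). The function name is hypothetical, and the two-sided convention used here (summing all tables no more probable than the observed one) is an assumption; the original analysis may have used different software or a conditional odds-ratio estimate, so small differences from the reported values are expected.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table (with the same
    margins) that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# Counts taken from the text: carriers vs. non-carriers among 411 probands
# and 411 siblings.
pro_carriers, pro_total = 57, 411
sib_carriers, sib_total = 33, 411

a, b = pro_carriers, pro_total - pro_carriers
c, d = sib_carriers, sib_total - sib_carriers

odds_ratio = (a * d) / (b * c)  # simple sample (cross-product) odds ratio
p_value = fisher_exact_two_sided(a, b, c, d)

print(f"sample OR = {odds_ratio:.2f}, two-sided Fisher p = {p_value:.4f}")
```

The cross-product odds ratio is used here only because it needs no optimization; a conditional maximum-likelihood estimate, as reported by some statistics packages, would differ slightly.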
These results suggest that a fraction of proband-specific genes are expressed in the nervous system tissues, and that this fraction is higher in proband-only genes than in sibling-only genes. While we caution that expression does not definitively imply pathogenicity, many of these genes and their biological pathways may be of interest for further study, both in these particular individuals and for ASD genetics in general.

Figure 3.4: Genes in proband-only CNVs from SRS-discordant quads are more likely brain-expressed. We defined a gene to be expressed in a tissue if it ranked in the top 5% of all genes in that tissue, and calculated the fold enrichment of proband and sibling genes expressed in each tissue. Tissues that are part of brain structures had the strongest proband enrichment (black bars), as did a computed average of expression across 18 brain regions ("Brain average") in comparison to the average expression across other regions. However, the particular brain tissues with the strongest apparent enrichment should not be considered as independently enriched, as expression values for individual genes between brain regions are highly correlated. Stars indicate an FDR-corrected p-value < 0.05. See Figure S9 for results from all 411 quads. (Figure legend: brain-related tissues, n = 18; non-brain tissues, n = 61; brain/non-brain computed averages; 43 additional tissues are listed in Table S9.)

We compared the genes detected in the CNVs in this study to a set of 1,560 genes that have been previously observed in autism/ASD, intellectual disability or schizophrenia (Table S10; Figure S10). Among SRS-discordant quads, we found significant enrichment of autism genes among proband CNVs compared to unaffected siblings (66 vs. 35 genes, p = 0.006, two-tailed paired t-test; Table 3.2a and Table S8). In contrast, there was no enrichment among "concordant SRS"
proband-sibling pairs for previously observed genes (in fact, siblings had more genes: 17 vs. 24, p = 0.069). Overall, 16% of probands (44/276) in the SRS discordant group had a CNV in a previously observed gene, while only 10% of probands (12/115) in the concordant group had such an event. Intersecting the brain-expressed genes and previously observed disease genes, we found that 13 genes matched both criteria, corresponding to 1.7% of all proband genes, and only two genes (0.3%) in siblings (Figure S10). The 13 convergent proband-only genes were found exclusively in discordant SRS families, indicating that these genes may be associated with more severe phenotypes (Table S11).

3.5 Discussion

In this study, we report a significant CNV transmission bias for autism, finding an enrichment of inherited CNVs in sporadic cases versus their unaffected siblings. The targeted nature of exome sequencing enabled us to explore a smaller CNV landscape largely inaccessible by high-density SNP microarray data (Pinto et al. 2010; Sanders et al. 2011; Levy et al. 2011). We estimate that the use of exome data increased our power to detect gene-disruptive CNVs smaller than 20 kbp by ~2.25-fold. These CNVs provide potential insight into the pathophysiology of inherited CNVs in sporadic autism. We find that the CNV burden is more strongly correlated with measures of ASD phenotypes (such as the SRS score) as opposed to IQ; for proband-sibling pairs with concordant SRS scores, IQ was not dependent on the proband's CNV burden. Genes already associated with autism and/or highly expressed in the brain are more likely to be disrupted. Private CNVs (seen once) were no more likely to be found in probands than simply rare variants (seen fewer than 10 times in our families). Burden was consistent across all sizes of CNVs, and we did not find any enrichment for either small or large events.
Mothers are significantly more likely to be carriers of transmitted CNVs than fathers, irrespective of disease status of the child. This finding is consistent with our recent analysis of "secondary" CNVs being transmitted from mothers to children with developmental delay and multiple CNVs (Girirajan et al. 2012). We also noted that the transmission bias becomes more significant in probands from ASD families with many siblings as opposed to fewer individuals. Although this observation is inconsistent with the assumption that probands in larger families with many unaffected siblings are more likely to have an underlying sporadic genetic etiology, it likely reflects an ascertainment bias in selecting the "least affected" sibling in a large family as the "designated sibling" for the purposes of forming a quad (Sanders et al. 2012). This suggests that a significant fraction of the underlying genetic etiology in the SSC may be inherited, a notion that has been examined previously (Davidson et al. 2012).

Our study benefited from the quad-based design of the SSC (Fischbach and Lord 2010), which provided a robust genetic control for each ASD proband, as well as from the detailed phenotypic information available, which sharpened the contrasts between severely affected and less affected probands and their siblings, some of whom showed subtle signs of the Broader Autism Phenotype (BAP; Davidson et al. 2012). Most of our observations were strengthened in or restricted to "SRS discordant" quads, where the proband was severely affected in terms of the SRS scale, but the sibling was unaffected. Approximately 67% (276/411) of the quads in this study were categorized as SRS discordant, and these quads explained virtually the entire overall CNV burden, encompassed the majority of brain-expressed genes and strengthened the association with previously implicated disease genes.
This effect may be driven by inherent ambiguity in the simplex and multiplex classification scheme, which is not truly binary but rather a continuous probability based on the number of unaffected siblings in the family (the more unaffected siblings, the greater the likelihood that the family is simplex). In essence, by focusing on the SRS discordant quads only, we have enriched not only for a more severe proband phenotype, but also for a truly "simplex" genetic etiology (as opposed to an environmental and/or stochastic one), thus enhancing the observed transmission disequilibrium of CNVs.

Our results should be viewed carefully in the context of previous studies. Notably, two recent studies (Levy et al. 2011; Sanders et al. 2011) failed to find statistically significant enrichment of inherited CNVs in sporadic autism probands compared to their siblings. These studies, which also analyzed families from the SSC, used high-density microarray platforms to discover CNVs in genome-wide fashion. It is possible that the increased sensitivity of our exome-based method for genic events, which are most strongly implicated by both de novo CNV and de novo SNV studies, revealed the difference in burden between probands and siblings. Additionally, our study found that the differential burden was dependent on the SRS score and not IQ, a factor which has not been previously examined in the context of ASD and CNVs. In contrast, our results are in good agreement with those of the case-control study by Pinto and colleagues (Pinto et al. 2010), who found an overall case/control ratio of 1.19 for genic CNVs and no enrichment for "ultra-rare" CNVs seen only once in their cases (although this study was largely limited to CNVs larger than 50 kbp).

The smaller size of the CNVs provides increased specificity to define individual genes when compared to previous studies focused on large CNVs, which typically encompass dozens of genes.
The deletion or duplication of a subset of exons can have, in principle, the same impact on gene function as disruptive point mutations. Accordingly, several genes in our brain-expressed/SRS-discordant set of CNVs have been previously identified as part of severe neurological disorders. Among these was a CNV affecting DDHD2, an intracellular phospholipase which plays an essential role in synaptic function, and which has recently been implicated in a recessive form of complex hereditary spastic paraplegia (HSP [MIM 615033]), a syndrome characterized by early-onset intellectual disability and spastic paraplegia (Schuurs-Hoeijmakers et al. 2012). We did not observe any CNVs in this gene in 2,972 control exomes. Similarly, another proband (and 0/2,972 controls) carried an inherited CNV affecting only the PACS2 gene (MIM 610423), one of six genes in the critical region of 14q32 deletion syndrome, characterized by intellectual disability and mild facial dysmorphology (Holder et al. 2012). Lastly, in two families, we identified a previously unidentified small (~5 kbp), 2-exon deletion of the ZNF396 gene (Figure 3.1c), which was identified as a candidate gene for Alopecia with Mental Retardation syndrome (MIM 613930) by microsatellite linkage analysis (in fact, ZNF396 was the closest gene to the linkage peak; Wali et al. 2007). The frequency of this deletion in our control set was 3/2,972 (0.1%). Although these identified genes and CNVs may play an important role in the pathogenesis of ASD on the basis of their previously identified roles in Mendelian disorders, we would like to emphasize that their individual rarity and overall small effect prevent them from being conclusively identified as having Mendelian effects.

Other genes disrupted by CNVs have functional roles in neural function, brain development or neurobehavioral phenotypes in model organisms (Table 3.3).
For example, we identified two independent disruptions of the ORC3 gene (MIM 604972; one shown in Figure 3.1d), which encodes a protein of the Origin Recognition Complex. The complex regulates dendritic spines and dendrite arborization in post-mitotic neurons, and has been implicated in olfactory learning and memory in Drosophila (Huang 2005). Notable also was a CNV affecting CPLX1 (MIM 605032; Figure 3.1f), specific to the SNARE neuronal vesicle exocytosis pathway in neurons, as well as CNVs affecting neural receptors such as HTR3E (MIM 610123; a subunit of the ionotropic serotonin receptor) and NETO1 (MIM 607973; Figure 3.1e), a key component of the NMDA-receptor complex and critical for synaptic plasticity and learning in mice (however, this CNV was transmitted to both proband and sibling; Ng et al. 2009a). Previous work has implicated the ubiquitin processing pathway (Glessner et al. 2009), and we found a rare CNV in UCHL1 (MIM 191342), a ubiquitin-adduct processing enzyme with strong and specific brain expression; knockout mice show specific neurodegenerative phenotypes (Wada et al. 1999), and recent work has shown that it regulates the NCAM1 neural cell adhesion molecule (MIM 116930; Wobst et al. 2012). Finally, we found several interesting genes on the basis of brain expression pattern, including 1) an inherited deletion of the IQSEC1/BRAG2 gene (MIM 610166), which is strongly expressed in the prefrontal cortex and involved in clathrin-mediated endocytosis of AMPA receptors critical to long-term potentiation in mice (Scholz et al. 2010), 2) a duplication of the ZNF251/ZNF517 cluster on 8q24.3, whose genes have tissue-specific expression highest in the fetal brain and cerebellum (Peter Lorenz 2010), and 3) a duplication of AQP4 (MIM 600308), the primary water transporter in brain glial cells, especially in the amygdala and prefrontal cortex, which has been implicated in epilepsy (Binder et al. 2012).
Table 3.3: Selected inherited CNVs

Sample | Chr | Start (hg19) | End (hg19) | Size (kbp) | State | # exons | Freq. in 411 quads | Genes in transmitted CNV
12647.p1 | 1 | 32,084,793 | 32,110,465 | 25.7 | Dup | 12 | 2 | HCRTR1, PEF1
11872.p1 | 1 | 65,730,593 | 65,831,879 | 101.3 | Dup | 4 | 1 | DNAJC6
12719.p1 | 1 | 146,715,494 | 146,767,190 | 51.7 | Del | 23 | 1 | CHD1L
12997.p1 | 2 | 230,632,269 | 230,724,290 | 92 | Dup | 39 | 1 | TRIP12
12394.p1 | 2 | 241,538,067 | 241,709,123 | 171.1 | Dup | 42 | 3 | KIF1A, GPR35, AQP12B, AQP12A, CAPN10
12534.p1 | 3 | 12,940,888 | 12,978,197 | 37.3 | Del | 13 | 2 | IQSEC1
13099.p1 | 3 | 97,486,951 | 97,634,880 | 147.9 | Del | 19 | 1 | ARL6
12645.p1 | 4 | 818,279 | 845,762 | 27.5 | Dup | 5 | 1 | CPLX1, GAK
11773.p1 | 4 | 2,641,461 | 2,835,561 | 194.1 | Dup | 34 | 1 | TNIP2, FAM193A, SH3BP2
11066.p1 | 4 | 41,258,993 | 41,259,143 | 150 bp | Dup | 2 | 3 | UCHL1
13385.p1 | 4 | 169,083,678 | 169,086,477 | 2.8 | Del | 3 | 1 | ANXA10
13293.p1 | 5 | 619,104 | 644,540 | 25.4 | Dup | 9 | 2 | CEP72
12758.p1 | 6 | 24,454,242 | 24,523,153 | 68.9 | Dup | 20 | 1 | ALDH5A1, GPLD1
11551.p1 | 6 | 88,315,634 | 88,318,947 | 3.3 | Del | 3 | 1 | ORC3
11459.p1 | 6 | 88,317,390 | 88,366,700 | 49.3 | Del | 10 | 1 | ORC3
13412.p1 | 7 | 33,102,179 | 33,185,976 | 83.8 | Dup | 7 | 3 | RP9, BBS9, NT5C3
11722.p1 | 7 | 48,308,576 | 48,416,169 | 107.6 | Del | 20 | 1 | ABCA13
11716.p1 | 8 | 38,090,512 | 38,117,639 | 27.1 | Del | 16 | 1 | DDHD2
13412.p1 | 8 | 86,351,940 | 86,575,726 | 223.8 | Dup | 14 | 1 | CA3, CA2, REXO1L1
12534.p1 | 8 | 145,947,028 | 146,033,780 | 86.8 | Dup | 18 | 1 | ZNF251, ZNF34, ZNF517, RPL8
11356.p1 | 9 | 139,634,401 | 139,651,044 | 16.6 | Dup | 16 | 1 | LCN6, LCN10, LCN8
13162.p1 | 10 | 5,203,384 | 5,260,723 | 57.3 | Del | 12 | 1 | AKR1C4, AKR1CL1
13843.p1 | 11 | 43,772,460 | 43,775,671 | 3.2 | Del | 2 | 1 | HSD17B12
11241.p1 | 12 | 120,875,929 | 120,884,632 | 8.7 | Dup | 7 | 2 | GATC, COX6A1, TRIAP1
12396.p1 | 14 | 105,836,177 | 105,861,009 | 24.8 | Dup | 17 | 1 | PACS2
11479.p1 | 15 | 43,696,610 | 43,701,294 | 4.7 | Dup | 5 | 1 | TP53BP1, TUBGCP4
13843.p1 | 15 | 55,475,512 | 55,497,903 | 22.4 | Dup | 6 | 2 | RAB27A, RSL24D1
12837.p1 | 15 | 57,730,197 | 57,754,090 | 23.9 | Dup | 7 | 2 | CGNL1
13543.p1 | 15 | 91,488,121 | 91,520,001 | 31.9 | Del | 25 | 2 | RCCD1, PRC1, UNC45A
13215.p1 | 16 | 15,596,178 | 15,609,285 | 13.1 | Del | 6 | 1 | C16orf45
14201.p1 | 16 | 68,710,287 | 68,713,877 | 3.6 | Dup | 5 | 2 | CDH3
12100.p1 | 16 | 70,714,696 | 70,714,928 | 232 bp | Dup | 2 | 3 | MTSS1L
12373.p1 | 16 | 81,314,461 | 81,396,216 | 81.8 | Dup | 10 | 1 | GAN, BCMO1
12697.p1 | 18 | 24,436,174 | 24,628,467 | 192.3 | Dup | 10 | 1 | CHST9, AQP4, CHST9-AS1
12869.p1 | 18 | 72,229,281 | 72,251,798 | 22.5 | Dup | 8 | 1 | CNDP1
11356.p1 | 18 | 77,470,345 | 77,891,075 | 420.7 | Dup | 28 | 2 | KCNG2, RBFA, CTDP1, ADNP2, TXNL4A, PQLC1
13296.p1 | 19 | 6,681,951 | 6,686,913 | 5 | Dup | 8 | 1 | C3
11298.p1 | 19 | 18,704,375 | 18,704,917 | 542 bp | Dup | 2 | 1 | CRLF1
13815.p1 | 19 | 57,835,049 | 57,932,849 | 97.8 | Del | 15 | 1 | ZNF547, ZNF304, ZNF17, ZNF548, ZNF543
13396.p1 | 21 | 19,628,825 | 19,632,603 | 3.8 | Del | 3 | 1 | CHODL
13327.p1 | 21 | 35,742,777 | 35,899,047 | 156.3 | Dup | 8 | 2 | KCNE2, RCAN1, KCNE1, FAM165B

Figure 3.5: A combined model of inherited and de novo mutations reveals independent risk for both. A logistic regression model estimates the odds ratio for each inherited CNV (blue), de novo CNV (red) or disruptive de novo SNV (gray; nonsense, splice and indels only) in probands and siblings. Odds ratios and burden (proband vs. sibling ratio) are given in the accompanying table, revealing independent risk for each type of mutation. The line width for each type of mutation in the figure indicates if a bias has been observed for new mutations arising on the maternal or paternal haplotypes (see also O'Roak et al. 2012 for SNVs and Hehir-Kwa et al. 2011 for CNVs).

Since the patients and families have been analyzed for both de novo CNVs and SNVs, we can develop a model to assess the relative contribution of each class of genetic variant to autism. First, we confirmed that inherited CNVs were enriched in the set of probands without other known de novo CNVs or SNVs (368 inherited CNVs in probands vs. 327 CNVs in siblings of 336 quads; p < 0.03, two-tailed paired t-test). Second, we developed a logistic regression model, in which the binary outcome of either proband or sibling is predicted by the count of disruptive de novo SNVs, the count of de novo CNVs and the count of our rare transmitted CNVs.
We performed regressions on both the set of all 411 quads and the set of 276 proband-sibling pairs with discordant SRS scores. The results (Figure 3.5 and Table S12) reveal a strong effect for disruptive (nonsense, splice and frameshift) de novo SNVs (OR 4.30, p < 0.001) and CNVs (OR 6.65, p < 0.02), and also confirm a statistically independent effect for transmitted CNVs (OR 1.16, p < 0.04); again, in this model, the effect was primarily driven by discordant SRS quads. Although the effect of de novo SNVs strongly outweighs the pathogenic effect of inherited CNVs, our model predicts that inherited CNVs contribute significantly to sporadic disease, especially in the case of discordant SRS pairs (where the OR increased to 1.26, p < 0.015). We did not find any significant interactions between our predictors, reflecting the relative infrequency of co-occurring CNVs and de novo SNVs but also the limited sample size. It is also possible that rare, disruptive inherited SNVs statistically interact with other classes of mutation, but we did not take these into account in building our model.

Taken together, our model suggests that disruptive de novo SNVs and both inherited and de novo CNVs contribute independently to the risk of autism. We believe that future studies of ASD and other complex neurological disorders will make significant strides in understanding the genetic underpinnings of disease, especially if an integrated approach considering all disruptive mutations (inherited and de novo, CNV and SNV, small and large) is applied.
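To illustrate how a logistic regression of this kind turns mutation carrier status into per-class odds ratios, here is a minimal sketch fitted by Newton-Raphson in plain Python. It is not the model code used in this study: the counts in `toy_counts` are invented for illustration, the helper names are hypothetical, and each individual is assumed to fall into exactly one carrier group (a simplification of the count-based predictors described above). Under that simplification the fitted odds ratios should reproduce the simple ratio of proband-to-sibling odds in each group relative to non-carriers.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    X is a list of rows (first entry 1.0 for the intercept); y is 0/1.
    """
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            mu = 1.0 / (1.0 + math.exp(-eta))
            w = mu * (1.0 - mu)
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += w * xi[j] * xi[k]
        step = solve_linear(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


# Hypothetical (probands, siblings) counts per carrier group; NOT study data.
toy_counts = {
    "none": (100, 120),
    "inherited_cnv": (60, 50),
    "de_novo_cnv": (12, 3),
    "de_novo_snv": (20, 6),
}
predictors = ["inherited_cnv", "de_novo_cnv", "de_novo_snv"]

X, y = [], []
for group, (n_pro, n_sib) in toy_counts.items():
    row = [1.0] + [1.0 if group == name else 0.0 for name in predictors]
    X += [row] * (n_pro + n_sib)
    y += [1] * n_pro + [0] * n_sib  # outcome: 1 = proband, 0 = sibling

beta = fit_logistic(X, y)
odds_ratios = {name: math.exp(b) for name, b in zip(predictors, beta[1:])}
for name, value in odds_ratios.items():
    print(f"{name}: OR = {value:.2f}")
```

With real data, the predictors would be per-individual mutation counts rather than mutually exclusive indicators, and a matched (conditional) formulation could be preferable for proband-sibling pairs; both are left out here to keep the sketch short.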
ACKNOWLEDGEMENTS:

We thank the National Heart, Lung, and Blood Institute, NIH Grand Opportunity (GO) Exome Sequencing Project and its ongoing studies, which produced and provided exome variant calls for comparison: the Lung GO Sequencing Project (HL-102923), the Women's Health Initiative Sequencing Project (HL-102924), the Broad GO Sequencing Project (HL-102925), the Seattle GO Sequencing Project (HL-102926), and the Heart GO Sequencing Project (HL-103010). We are grateful to all of the families at the participating Simons Simplex Collection (SSC) sites, as well as the principal investigators (A. Beaudet, R. Bernier, J. Constantino, E. Cook, E. Fombonne, D. Geschwind, E. Hanson, D. Grice, A. Klin, R. Kochel, D. Ledbetter, C. Lord, C. Martin, D. Martin, R. Maxim, J. Miles, O. Ousley, K. Pelphrey, B. Peterson, J. Piggot, C. Saulnier, M. State, W. Stone, J. Sutcliffe, C. Walsh, Z. Warren and E. Wijsman). We also acknowledge M. State and the Simons Simplex Collection Genetics Consortium for providing Illumina genotyping data, and T. Lehner and the Autism Sequencing Consortium for providing an opportunity for pre-publication data exchange among the participating groups. We are grateful for helpful discussion and manuscript preparation help from T. Brown and P. Sudmant, as well as all members of the Eichler lab. We appreciate obtaining access to phenotypic data on SFARI Base. This work was supported by the Simons Foundation Autism Research Initiative (SFARI 137578 and 191889; E.E.E., J.S. and R.B.) and NIH HD065285 (E.E.E. and J.S.). E.B. is an Alfred P. Sloan Research Fellow. E.E.E. is an Investigator of the Howard Hughes Medical Institute.

IV. Inherited SNV mutations in Autism Spectrum Disorder
Towards a holistic model of genetic variation in ASD

4.1 Summary

We describe the creation of a combined dataset and resource of inherited and de novo SNVs and CNVs across 2,377 Simons Simplex Collection (SSC) families.
The dataset includes 1,786 parent-child-unaffected sibling "quads", which enable comparison of the burden of inherited and de novo mutations between affected and unaffected siblings in simplex autism families. We find that private inherited truncating SNV mutations in conserved genes are significantly enriched in probands (OR = 1.14, p < 0.0002, two-tailed paired t-test), and we observe that this effect becomes more pronounced with increasing gene-level conservation (assessed via the RVIS score). Likewise, we confirm previous reports of transmission disequilibrium for inherited CNVs. Transmission disequilibrium of SNVs was strongest in probands with diagnoses of Autism Disorder or Pervasive Developmental Disorder, and we did not observe a significant enrichment in probands with Asperger's Disorder; similar results were observed when stratifying by IQ. We quantified ASD risk for de novo and inherited CNVs and SNVs by using a conditional logistic regression model, and found that inherited private truncating SNVs and rare inherited CNVs contribute an independent increase in risk of 1.11 (p = 0.0002) and 1.23 (p = 0.01), respectively. Our results confirm a statistically independent role for inherited mutations in ASD risk and identify additional candidate genes (e.g., RIMS1, CUL7 and CSMD1) where inherited and de novo burden converge. This chapter is in preparation for publication.

4.2 Introduction

Autism spectrum disorder (ASD) is a common neurodevelopmental disorder diagnosed in approximately 1/88 children and manifests as deficits in social behavior and language development, as well as restricted or stereotyped interests. Several large studies have confirmed initial observations that ASD is highly heritable, and consensus estimates suggest that ~50-60% of ASD etiologies are genetic (Hallmayer et al. 2011; Bailey et al. 1995; Constantino et al. 2013; Steffenburg et al. 1989).
In particular, de novo mutations have been implicated as the underlying genetic cause in cases, and these mutations have provided a rich source for understanding the pathogenic genes and neurobiological mechanisms of ASD. However, de novo mutations are rare, and their overall contribution is estimated to be 25-35% (Krumm et al. 2014; Ronemus et al. 2014), shy of the overall heritability estimate, suggesting that other genetic etiologies contribute to ASD. Previous reports have suggested additional genetic models for ASD, in which rare inherited copy number variants (CNVs) are disproportionately inherited by affected probands, rather than their unaffected siblings (Krumm et al. 2013; Poultney et al. 2013; Pinto et al. 2010). These studies describe a transmission disequilibrium for mutations of very low population frequency, suggesting that the pathogenic CNVs for ASD are of relatively young age and under strong purifying selection.

In the present study, we hypothesize that inherited single nucleotide variants (SNVs) also contribute to a model of rare inherited variation underlying the genetic etiology of autism. We leverage a family-based study design of simplex autism, in which one offspring carries a diagnosis of autism and an unaffected offspring acts as a genetic control, to discover specific ASD risk genes and integrate inherited factors with de novo factors in a general ASD risk model. This study, in conjunction with the National Database for Autism Research and the Simons Simplex Consortium, also makes available a resource of uniformly processed raw exome sequence (bam) files as well as raw variant (vcf) files from multiple variant calling tools. We have generated both inherited and de novo SNVs and CNVs across 2,377 families, including 1,786 quads. We envision that this data becomes a resource for further study and investigation in the autism research community.
The complete set of raw underlying data, variant calls and methods is available through the National Database for Autism Research (NDAR) under Study ID 334 (see Web Resources for link).

4.3 Methods

Dataset

We analyzed exome data from 2,377 ASD families after quality control of 2,391 families from the Simons Simplex Collection (Fischbach and Lord 2010), including 1,786 quads and 591 trios (total n=8,917 exomes). A subset of these families (n=752 families) was sequenced as part of three previous publications (Sanders et al. 2012; Iossifov et al. 2012; O'Roak et al. 2012b). We note that Iossifov and colleagues report on aspects of sequence generation and de novo rates (Iossifov et al., in preparation, Nature, 2014). This study was approved by the institutional review board of the University of Washington.

Alignment and SNV discovery

We aligned exome data to the GRCh37 reference genome using BWA-MEM (Li 2013; v0.7.5a) and post-processed alignments using the GATK "best practices" pipeline, including indel realignment and base quality score recalibration (BQSR). Exome data were matched to existing SNP barcodes (generated from Illumina 1M/1MDuo SNP microarrays and/or 96-SNP fingerprints collected by the Rutgers sample distribution center) in order to eliminate sample identity/paternity mix-ups. We called SNVs and indels with both GATK HaplotypeCaller (McKenna et al. 2010; v2.7-4) and FreeBayes (Garrison and Marth 2012; v0.99) to within 20 bp of the exon targets; calls were annotated using SnpEff and merged into union and intersection sets. Allele frequency was estimated by counting non-reference alleles across all parents (n=4,754 parents). We define all stop-gained, frameshift and splice-site variants as "Likely Gene Disrupting" (LGD) variants. For de novo events, we required a minimum read depth of six alternate-allele reads in offspring and a depth of >10 reference reads in parents, including no more than two low-quality bases of the de novo allele.
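As a minimal sketch of the de novo read-depth criteria stated above (at least six alternate-allele reads in the offspring and more than 10 reference reads in each parent), the following Python predicate may help make the thresholds concrete. The `SiteDepths` record and its field names are hypothetical and are not taken from the actual pipeline; the low-quality-base criterion is deliberately omitted because its exact definition is not given here.

```python
from dataclasses import dataclass

# Thresholds as stated in the Methods text.
MIN_CHILD_ALT_READS = 6     # at least six alternate-allele reads in the offspring
MIN_PARENT_REF_READS = 10   # parents must have MORE than 10 reference reads


@dataclass
class SiteDepths:
    """Hypothetical per-site read counts for one trio (illustrative names only)."""
    child_alt: int    # reads supporting the candidate de novo allele in the child
    father_ref: int   # reference-allele reads in the father
    mother_ref: int   # reference-allele reads in the mother


def passes_de_novo_depth_filter(site: SiteDepths) -> bool:
    """Return True if the site meets the stated read-depth criteria."""
    return (
        site.child_alt >= MIN_CHILD_ALT_READS
        and site.father_ref > MIN_PARENT_REF_READS
        and site.mother_ref > MIN_PARENT_REF_READS
    )


if __name__ == "__main__":
    print(passes_de_novo_depth_filter(SiteDepths(child_alt=7, father_ref=25, mother_ref=31)))
```

A site with only five alternate reads in the child, or with exactly 10 reference reads in one parent, would be rejected under these assumptions.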
To exclude common artifacts, we only accepted de novo sites that were unique across all families. Inherited events were derived from the intersection set of both algorithms, with depth and quality filters set to DP > 20 and QUAL > 50.

CNV discovery

We used the CoNIFER (Krumm et al. 2012) and XHMM (Fromer et al. 2012) algorithms to discover copy number variation from exome data at high exonic resolution. Calls from each algorithm were reconciled, merged, and genotyped within each family to determine inheritance patterns. In order to maximize both the sensitivity and the precision of the callset, we used targeted in silico genotyping of each call based on available SNP microarray data (n=1,266 families; CRLMM algorithm (Scharpf et al. 2011), see supplementary methods). In order to focus our analysis on those CNVs most likely relevant to ASD pathogenesis, we restricted our analysis to rare CNVs found at less than 0.8% frequency (< 10 events/1,266 families) and outside of repetitive genomic elements.

4.4 Results

Discovery and validation of inherited and de novo SNVs

Starting from raw sequence data, we reprocessed 8,917 exomes from the Simons Simplex Collection in order to standardize their analysis and allow comparison among the entire data set of 2,377 families. Our pipeline entailed remapping using BWA-MEM and variant calling using both GATK HaplotypeCaller and FreeBayes, which amounted to over 1,000,000 CPU-hours of computation using Amazon Web Services. Using FreeBayes and GATK, we found a median of 26,920 transmitted variants per family (95% CI 23,394-31,401). Overall, 81% of all transmitted variants were found by both FreeBayes and GATK, 12% by FreeBayes alone and 7% by GATK alone.
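The caller overlap reported above can be illustrated with a short sketch. It assumes each call is reduced to a `(chrom, pos, ref, alt)` tuple and that both callers' variants have already been normalized to a common representation; the function name `callset_overlap` and the toy variants are hypothetical and are not part of the actual pipeline.

```python
def callset_overlap(gatk_calls, freebayes_calls):
    """Fractions of the union callset found by both callers or by one alone.

    Each call is assumed to be a hashable key such as (chrom, pos, ref, alt).
    """
    gatk = set(gatk_calls)
    freebayes = set(freebayes_calls)
    union = gatk | freebayes            # sensitive set: seen by either caller
    intersection = gatk & freebayes     # specific set: seen by both callers
    if not union:
        raise ValueError("both callsets are empty")
    n = len(union)
    return {
        "both": len(intersection) / n,
        "gatk_only": len(gatk - freebayes) / n,
        "freebayes_only": len(freebayes - gatk) / n,
    }


if __name__ == "__main__":
    # Toy example with made-up variants.
    gatk = {("1", 100, "A", "G"), ("1", 200, "C", "T"), ("2", 50, "G", "A")}
    freebayes = {("1", 200, "C", "T"), ("2", 50, "G", "A"),
                 ("3", 10, "T", "C"), ("3", 99, "A", "C")}
    print(callset_overlap(gatk, freebayes))
```

The intersection is the natural choice for burden comparisons, where false positives matter most, whereas the union favors sensitivity.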
Of all transmitted mutations in the intersection set, an average of 341 (95% CI: 133-632) sites per family were novel and not observed in dbSNP (v137); 98.6% of sites were in dbSNP with a mean concordance rate of 99.7% (for the union of transmitted variants, 93.4% of variants were found in dbSNP and 99.5% were concordant; for events discovered only by GATK, 76% were found in dbSNP and these were on average 96.7% concordant; for events discovered only by FreeBayes, 64% were found in dbSNP and these were 99.4% concordant). For intersecting variants, the Ti/Tv ratio was 2.94 (95% CI 2.79-3.03) for all sites, 2.95 (2.83-3.04) for dbSNP sites and 1.94 (1.05-2.75) for novel sites. The median Ti/Tv ratio for transmitted variants found only by GATK was 1.34 overall and 1.54 for dbSNP sites; for variants found only by FreeBayes, the Ti/Tv ratio was 2.15 overall and 2.35 for dbSNP sites.

New de novo mutations (SNVs and CNVs)

Our analysis benefited from the use of newer bioinformatics tools, allowing us to discover 1,560 new de novo mutations not previously detected. We tested a small subset of these (n=75) and validated a set of 23 new likely gene disrupting (LGD) mutations and 26 new missense de novo mutations in probands (Table 4.1 and Table S1). Notably, these validated mutations established ASH1L as a newly recurrently truncated gene and added a new LGD mutation to GIGYF1, for a total of three LGD de novo mutations observed in that gene. In addition, the new mutations established recurrent hits in seven new genes, including GIGYF2 (see below), ATP1B1 (a gene with strong brain expression), SSPO (a brain-secreted protein involved in axon growth), and JAKMIP1 (previously implicated in ASD and with strong and specific brain expression).
Gene | Mutations (this work) | Previously identified mutations | Status | Siblings
ASH1L | 1 LGD | 1 LGD | Newly recurrent LGD |
GIGYF1 | 1 LGD | 2 LGD | Multiple recurrent LGD |
GIGYF2 | 1 Ms | 1 LGD | Newly recurrent |
ATP1B1 | 1 Ms | 1 LGD | Newly recurrent |
SSPO | 1 LGD | 1 Ms | Newly recurrent w/ new LGD |
JAKMIP1 | 1 Ms (NV: 12607.p1) | 1 LGD | Newly recurrent | 1 Ms, 1 LGD
RNF213 | 1 Ms | 1 LGD | Newly recurrent | 2 Ms
UBR5 | 1 Ms (NV: 11102.p1) | 1 LGD | Newly recurrent |
ZBTB45 | 1 LGD | 1 Ms | Newly recurrent w/ new LGD |

Table 4.1: Genes with new recurrent de novo mutations. All mutations were validated, except those marked NV (no validation attempted).

We used an intersection of both CoNIFER and XHMM to discover small, exonic CNVs and validated these using custom array-CGH and targeted in silico genotyping using Illumina SNP microarray data. Of 52 tested CNVs, we validated 21 new (previously unvalidated or undetected) de novo CNVs; several of these affected genes recurrently hit by de novo SNVs, including DSCAM, CHD2, and TNRC6B. Investigation of the newly recurrent genes found that three genes from this list (GIGYF1, GIGYF2 and TNRC6B) and three additional genes with single de novo mutations (GRB10, RBM12, and ZNF598) are closely linked with one another using protein-protein interaction data (Figure 4.1; Table 4.2). Gene ontology annotation of the genes in this network suggests involvement of the IGF (Insulin Growth Factor) signaling pathway (GIGYF1, GIGYF2, GRB10; accession GO:0048009), which has been previously implicated in the development of ASD (Bozdagi et al. 2013). Furthermore, GIGYF2 and ZNF598 form part of the m4EHP mRNA binding complex and have widespread translational repression roles, especially in the brain and lungs (Morita et al. 2012).

Figure 4.1: Network of genes with recurrent de novo hits, based on new de novo mutations identified in this study.
Red stars: de novo LGD mutations (frameshift, stop-gained, splice-site); blue stars: de novo missense mutations; purple star: CNV deletion (see Figure S1). Genes shown: GIGYF1, GRB10, GIGYF2, TNRC6B, ZNF598 and RBM12, with members of the IGF signalling pathway indicated.

Gene | De novo mutations in probands | ESP6500 rare LGD SNVs, allele count ^ | Brain expression | Part of IGF pathway
GIGYF1 | Fs, Fs*, SS | 3 | +++ (Cerebellar) | Yes
GIGYF2 | Stop, Ms* | 0 | | Yes
GRB10 | Ms* | 1 | + | Yes
TNRC6B | Fs, Stop, CNV del* | 101& | +++ (Cerebellar) |
ZNF598 | Stop | 4 | ++ (Cerebellar) |
RBM12 | Ms | 0 | |

Table 4.2: Summary of mutations in the IGF-related ASD network.
* Mutations newly identified in this study.
^ Sites must have a minimum read depth of 10 and be bi-allelic.
& 99/101 are a single 5' frameshift variant (AA position 4949/5502).

Transmission disequilibrium of SNVs between probands and siblings

We tested for transmission disequilibrium between probands and siblings in three ways: i) by a Fisher's exact test, ii) by paired Student's t-test or Mann-Whitney U test, and iii) by logistic regression (where the dependent variable was whether the variant was found in a proband or a sibling). We used only variants called by both the FreeBayes and GATK variant callers in order to minimize false positives. We found no statistically significant overall burden when considering all rare or private protein-altering mutations (LGD + missense) together, even when considering additional hypotheses based on highly brain-expressed genes, mutations with high CADD scores, or mutations in genes with de novo mutations in other probands (p > 0.05 in all comparisons). In contrast, we found that private LGD mutations in genes that are intolerant of deleterious mutations (defined as an RVIS score (Petrovski et al. 2013) in the lower 50% of all scores) were statistically enriched overall in probands (OR=1.14, p < 0.0002, Fisher's exact test) and at a family level (p < 0.0001, two-tailed paired t-test; Figure 4.2A).
These effects persisted even for all LGD mutations (regardless of frequency) in genes with RVIS scores in the lower 50% (OR=1.06, p=0.03, Fisher's exact test; p=0.02, two-tailed paired t-test). Furthermore, the RVIS score was a significant predictor of proband or sibling inheritance in a logistic regression model built on all LGD mutations (p=0.028, OR=1.01 per RVIS percentage point). As suggested by this model, the burden of private LGD mutations continues to increase in genes with progressively lower RVIS scores (Figure 4.2B). At the extreme, the burden between probands and siblings in genes with the lowest 1% of all RVIS scores reaches an odds ratio of 1.4, although this comparison is not yet statistically significant (due to the small number of mutations present in this bin).

Figure 4.2: Transmission disequilibrium of SNVs in ASD. (A) Private LGD (red bars) inherited SNVs in genes that are not tolerant to functional variation were significantly enriched in probands. The analysis examines only SNVs in genes with an RVIS score in the lower 50%. Non-private rare variants and missense (gray bars) inherited SNVs are not enriched in probands. (B) The RVIS score is a critical determinant for enrichment in probands: burden was highest (reaching OR=1.4) for private inherited LGD SNVs amongst genes with the lowest RVIS scores.

We looked for a relationship between the set of private LGD mutations in RVIS-restricted genes and the phenotype of probands in the SSC (Figure 4.3). First, we examined how the overall clinical diagnosis impacted burden: for the 1,575 probands with a diagnosis of "autism" or "pervasive developmental disorder", the odds ratios were 1.14 and 1.18 (p=0.001 and 0.05), respectively; in contrast, probands with a diagnosis of "Asperger's" (n=205) had a lower odds ratio of 1.05 (p > 0.7; Figure 4.3A).
Consistent with this, we found that probands with full-scale IQ lower than 70 had an odds ratio of 1.18 (p=0.014, n=530), whereas those with IQ above 100 had a lower, non-significant odds ratio of 1.06 (n=454; Figure 4.3B). In contrast to IQ, there was no difference in transmission disequilibrium between probands and siblings with highly differential Social Responsiveness Scale scores ("discordant SRS" quads) and those with less extreme scores (data not shown).

Transmission disequilibrium of CNVs between probands and siblings

There were 2,891 total autosomal CNVs detected in this study, with child-specific event counts of 854 in probands and 743 in siblings (ratio=1.25, p=0.006, binomial two-sided test), and with 47.4% of probands and 44% of siblings having a CNV. Overall, proband CNVs (median=40.6 kbp) were slightly larger than sibling events (median=38.4 kbp), but the difference was not statistically significant (p=0.09, Wilcoxon). The overall ratio of duplications to deletions was 1.6, consistent with previous results for a smaller SSC dataset (Krumm et al. 2013). Lastly, the number of CNVs >500 kbp identified in probands (n=85, median size=1,211 kbp) was 2.3-fold higher than in siblings (n=37, median size=889 kbp).

Phenotypic measures and autism CNVs

A previous publication identified a significant CNV burden in proband-sibling pairs that are discordant for their Social Responsiveness Scale score (SRS, where discordant is defined as a proband with SRS > 75 and an unaffected sibling with a score < 50) and not in those that are concordant (Krumm et al. 2013). Here, we confirm these results: we find that probands in discordant pairs have a significantly higher burden of CNVs (OR=1.16, p=0.008; Figure 4.3C). In contrast, probands were not enriched for transmitted CNVs when their SRS scores were concordant or unremarkable in comparison to their siblings (OR=1.02, p > 0.1; Figure 4.3C).
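The two-sided binomial comparison of child-specific CNV counts reported in this section (854 proband events versus 743 sibling events) can be sketched in a few lines of standard-library Python. This is an illustrative implementation under the assumption of a null in which an event is equally likely to fall in a proband or a sibling; the function name is hypothetical, and the analysis software actually used is not specified here.

```python
from math import comb


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial test of k successes in n trials against p = 0.5."""
    k_extreme = max(k, n - k)
    # Upper tail P(X >= k_extreme), computed with exact integer arithmetic.
    tail = sum(comb(n, i) for i in range(k_extreme, n + 1))
    # The p = 0.5 null is symmetric, so double the tail and cap at 1.
    return min(1.0, 2 * tail / 2 ** n)


if __name__ == "__main__":
    proband_events, sibling_events = 854, 743  # counts quoted in the text
    p = binomial_two_sided_p(proband_events, proband_events + sibling_events)
    print(f"two-sided binomial p = {p:.4f}")
```

Under these assumptions the resulting p-value is close to the reported p=0.006.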
When examining IQ, we find that probands with low IQ (FSIQ < 70) are enriched for inherited CNVs in comparison to their siblings (OR=1.16, p=0.04; Figure 4.3D), but that probands with higher IQs (>70) are not enriched. This is in agreement with our previous report, where we found that probands with low IQ (either <70 or <85) were enriched for inherited CNVs versus their siblings, while those with higher IQ (>85) did not have a higher burden.

Figure 4.3: Transmitted mutations and their effect on phenotype. Clockwise from top left: (a) Private inherited LGD SNVs are enriched in probands with Autism and Pervasive Developmental Disorder (PDD) diagnoses, but not Asperger's Syndrome (AS). (b) Private inherited LGD SNVs are primarily enriched in cases with lower IQ than average (<100). (c) We observe transmission disequilibrium of rare inherited CNVs in SRS discordant families (proband SRS score > 75, sibling < 50), but not in families where the SRS score is mild or more balanced between proband and sibling. (d) Rare inherited CNVs are enriched in probands (versus their siblings) with IQ lower than 70, but the effect is not significant in probands with IQ > 70. All tests and reported p-values are paired t-tests based on proband-sibling pairs.

[Figure 4.3 panel annotations, probands (n=1,786) versus siblings (n=1,786). Inherited SNVs (private LGD SNVs) by clinical impression: Autism OR=1.15, p=0.001; PDD OR=1.18, p=0.05; AS OR=1.04, NS. Inherited SNVs by full-scale IQ: <70 OR=1.18, p=0.014; 70-100 OR=1.18, p=0.002; 100+ OR=1.06, NS. Inherited CNVs (rare transmitted CNVs): SRS discordant OR=1.16, p=0.008; SRS concordant OR=1.02, NS; IQ < 70 OR=1.16, p=0.04; remaining IQ group OR=1.07, NS.]

Integration of mutational spectrum suggests new ASD candidate genes

We jointly examined SNVs and CNVs at a gene level in order to suggest new ASD candidate genes (Table 4.3).
Events were tabulated based on mutation type (SNV/CNV) and inheritance class as presented throughout this manuscript. In particular, we counted all de novo CNVs and LGD or missense SNV events, private LGD-inherited SNVs in genes with an RVIS score < 50%, and rare inherited CNVs in which at least one gene had an RVIS score < 50%. From these values, we calculated p-values for de novo SNVs (as in O'Roak et al. 2012a) and for inherited SNVs and CNVs (using a binomial test). Genes were ranked based on the Fisher's combined p-value heuristic. Finally, in order to remove common "false-positive" genes, we restricted our analysis to genes with low RVIS scores (<10%) or those with no events in siblings.

Gene | Mutations | RVIS | Notes/Function
RIMS1 | 2 de novo LGD; 2 private inherited LGD; 6 rare inherited LGD; 3 rare inherited LGD in siblings (2 shared with probands) | 3.3% | Strong and specific brain expression; previous candidate in ASD studies
CUL7 | 2 de novo missense SNVs; 2 private inherited LGD SNVs | 3.4% | Neuronal dendrite patterning function
CSMD1 | 3 de novo missense SNVs; 4 private inherited LGD SNVs; 5 inherited CNVs (4 focal to CSMD1) | <0.5% | Strong and specific brain expression; previous assoc. w/ SCZ

Table 4.3: Converging evidence for RIMS1, CUL7 and CSMD1 from de novo and inherited mutations.

The combined gene-level table identifies several new candidate genes. In particular, the three highest ranked genes (RIMS1, CUL7, and CSMD1) each display brain-specific expression patterns or have identified neural functions. The highest ranked gene, RIMS1, has two de novo LGD mutations and two private LGD-inherited mutations in probands. Additionally, there were six non-private inherited LGD mutations in probands (two of which are shared with siblings). RIMS1 has been previously suggested as an ASD candidate by Iossifov and colleagues (Iossifov et al.
2012), and has also been observed by Matthew State and colleagues (personal communication); it displays brain-specific expression, and disruption of the gene in mice leads to increased post-synaptic density and impaired learning. CUL7 has two de novo and two LGD-inherited mutations in probands (none in siblings); functionally, it is an E3 ligase with high cerebellar brain expression and a selective role in neural dendrite patterning and growth (Litterman et al. 2011). CUL7 is also the causative gene in 3M syndrome [OMIM:273750], which curiously is not associated with abnormal mental development. Finally, CSMD1 appears as a strong ASD risk factor candidate. Among the 1,386 quad families, we found multiple types of mutations and events in CSMD1: three de novo missense mutations, one shared inherited LGD SNV, and four rare inherited focal CNVs (one shared with siblings). Overall, there are eight events in probands and two in siblings. In addition, there were three additional LGD-inherited SNVs in probands within the trios. CSMD1 has the fourth-lowest RVIS score (0.02 percentile) of all genes, suggesting it is highly intolerant to functional mutation; this is borne out by examination of mutations in the ESP6500, where CSMD1 has only six LGD mutations. Comparison of the CSMD1-focal inherited CNVs seen in probands with the events seen in the Database of Genomic Variants (DGV) suggests that they are private to the ASD families (i.e., not observed in DGV). Furthermore, the ASD-specific events occur at the exon-dense 5'-end of CSMD1, a region nearly devoid of exonic CNVs in the DGV (Figure S#). Functionally, CSMD1 exhibits strong and specific brain expression; it functions within the complement control pathway, which has been implicated in synaptic pruning. CSMD1 in particular has been associated with schizophrenia (Håvik et al. 2011), and damaging variants of the gene segregated in two ASD families with distantly related probands (Cukier et al. 2014).
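The gene ranking behind Table 4.3 combines per-class p-values using Fisher's method, in which the statistic X = -2 * sum(ln p_i) is compared with a chi-square distribution with 2k degrees of freedom for k p-values. The sketch below is a generic, standard-library implementation of that method and is not the code used in this study; the function name is hypothetical. Note that the method formally assumes independent tests, which is one reason to treat the combined value as a ranking heuristic.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combine p-values with Fisher's method.

    Uses the closed-form chi-square survival function for even degrees of
    freedom: P(X > x) = exp(-x/2) * sum_{i<k} (x/2)^i / i!, with df = 2k.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # equals X / 2
    term = 1.0
    series = 1.0
    for i in range(1, k):
        term *= half_stat / i   # (x/2)^i / i!
        series += term
    return exp(-half_stat) * series


if __name__ == "__main__":
    # Hypothetical per-gene p-values for de novo SNVs, inherited SNVs, inherited CNVs.
    print(fisher_combined_p([0.01, 0.04, 0.20]))
```

Genes would then be sorted by this combined value, smallest first.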
Figure 4.4: Convergence of de novo and inherited mutations on CSMD1. From top: (a) RefSeq gene model of CSMD1 (RVIS score < 1%); (b) three de novo missense mutations in probands; (c) four inherited LGD SNVs in probands; (d) five inherited CNVs, four of which are focal to CSMD1 alone; (e) of all mutations, only two are shared with siblings (and none are specific to siblings); (f) expression profile of CSMD1 (RPKM, GTEx portal data), showing strong expression in brain tissues.

Integration of ASD risk across SNVs and CNVs

We quantified the risk for ASD of de novo and inherited CNVs and SNVs by using a conditional logistic regression model (methods; Figure 4.5 and Table 4.4). In this model, the binary outcome of ASD proband or unaffected sibling is predicted by four independent counts: 1) the number of de novo CNVs, 2) the number of LGD de novo SNVs, 3) the set of rare inherited CNVs and 4) the set of private LGD-inherited SNVs in genes in the lower 50% of RVIS scores. Additionally, we accounted for familial stratification effects by adding a family-level stratum to the model. Using data from the 1,786 quads, we found robust effects for de novo events: each de novo CNV increased the risk for ASD by 2.05-fold, while each de novo SNV increased risk by 1.72-fold (p = 0.0004 and p < 1 x 10^-7, respectively; Table 4.4). In addition, the results from this analysis confirm a statistically independent role for inherited mutations in ASD risk: rare inherited CNVs contribute an increase in risk of 1.23 (p = 0.01), and private LGD SNVs have an odds ratio of 1.11 (p = 0.0002). These results suggest that each of the four domains of mutations modeled contributes additively to the risk of ASD, and that they may do so in a statistically independent manner. We examined the "differential"
for each category of mutation in the model, calculated as the difference between the percentage of probands and the percentage of siblings with at least one mutation of that type, in order to estimate the proportion of ASD that might be attributable to each type of mutation individually. The strongest differential was seen for de novo SNVs (6.6% differential), suggesting this class of mutations accounts for a large portion of ASD heritability. Although inherited LGD SNVs had a low differential in the model (0.1%), we note that the RVIS score of the mutations plays a crucial role: when examining only inherited LGD SNVs with an RVIS score of 10 or lower, the differential jumps to 2.7% (probands=50.5%, siblings=47.9%). Furthermore, even stronger differentials were observed when examining SRS discordant quads only (3.7%), while SRS concordant quads had only a 1.6% differential in inherited LGD SNVs (with RVIS < 10).

Figure 4.5: Combined risk model for SNVs and CNVs, inherited and de novo. Integrative risk model for ASD, based on de novo and inherited events, and covering both SNVs and CNVs. The model used is a stratified logistic regression model, which uses proband-sibling pairs to estimate the odds ratio (i.e., risk of ASD) for each type of event.
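For one proband and one sibling per family, the stratified (conditional) logistic regression described above has a simple form: the within-family likelihood depends only on the difference in mutation counts between proband and sibling, so the model reduces to an intercept-free logistic regression on those differences. The sketch below fits that reduced model by Newton-Raphson using only the standard library. It is an illustration under that 1:1 matching assumption, with hypothetical function names and made-up toy data, and is not the software used for the analysis reported here.

```python
from math import exp


def _solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_paired_conditional_logit(proband_counts, sibling_counts, n_iter=25):
    """Return one odds ratio per covariate for 1:1 matched proband-sibling pairs.

    Each family contributes sigmoid(beta . (x_proband - x_sibling)) to the
    conditional likelihood. Every covariate must differ within at least one
    family, otherwise the Hessian is singular and the fit fails.
    """
    diffs = [[p - s for p, s in zip(pr, sb)]
             for pr, sb in zip(proband_counts, sibling_counts)]
    k = len(diffs[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]  # negative Hessian of the log-likelihood
        for d in diffs:
            eta = sum(b * x for b, x in zip(beta, d))
            prob = 1.0 / (1.0 + exp(-eta))
            weight = prob * (1.0 - prob)
            for i in range(k):
                grad[i] += d[i] * (1.0 - prob)
                for j in range(k):
                    hess[i][j] += weight * d[i] * d[j]
        beta = [b + s for b, s in zip(beta, _solve(hess, grad))]
    return [exp(b) for b in beta]


if __name__ == "__main__":
    # Toy data: one covariate (e.g. a de novo event count), made-up families.
    probands = [[1]] * 30 + [[0]] * 15 + [[0]] * 100
    siblings = [[0]] * 30 + [[1]] * 15 + [[0]] * 100
    print(fit_paired_conditional_logit(probands, siblings))
```

With a single binary covariate the estimate equals the ratio of discordant pairs (here 30/15), which is a useful sanity check; families in which proband and sibling carry identical counts contribute no information to the fit.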
[Figure 4.5 diagram: family pedigree (father, mother, proband, sibling) annotated with the odds ratio and p-value for each event class: de novo SNV (disruptive), OR=1.72, p < 1 x 10^-5; de novo CNV, OR=2.05, p=0.0004; inherited CNVs (rare), OR=1.23, p=0.01; inherited SNVs (rare), OR=1.11, p=0.0002. Model: P(ASD) ~ B0 + B1(de novo SNVs) + B2(de novo CNVs) + B3(Inherited CNVs) + B4(Inherited SNVs) + Strata(family)]

Mutation class | % Probands | % Siblings | % Differential | P-value | Odds Ratio
Inherited CNVs | 26.1 | 23.7 | 2.4% | 0.01045 | 1.23
Inherited SNVs | 92.1 | 91.2 | 0.1% | 0.00024 | 1.11
De novo CNVs | 3.8 | 1.8 | 2.0% | 0.00039 | 2.05
De novo SNVs | 15.3 | 8.7 | 6.6% | 0.00000 | 1.72

Table 4.4: Summary of logistic regression model results.

Our statistical risk model did not uncover any statistically significant interactions between the main effects, reflecting the relative rarity of each effect type in each individual. In addition, we found no evidence in the data for the presence of non-linear, exponential risk based on the summed number of mutations (methods).

4.5 Discussion

We present the complete ascertainment of inherited and de novo mutations in 2,388 families from the Simons Simplex Collection (SSC) of autism. Using a complete "ground-up" reanalysis of the data, and multiple variant discovery tools, we developed a resource of raw data and genetic variants for use throughout the community. Together with the extensive phenotype information present in the SSC, we believe that this resource will enable new and innovative research on the genetic basis and impact of autism. In the present analysis, we have explored the effect of rare inherited variation on the risk of autism. Our results extend previous work in understanding the role of rare inherited CNVs in ASD (Krumm et al. 2014; Poultney et al. 2013; Pinto et al. 2010) by providing crucial evidence that rare inherited SNVs are also a risk factor for simplex ASD.
In particular, we find that private inherited SNVs which likely disrupt the protein product are enriched in probands; crucially, the SNVs that disrupt genes intolerant to functional variation in control populations (measured by the RVIS score in the ESP6500 data set) are the most enriched in ASD. Disruptive mutations in these genes face strong selective pressure, suggesting that they may have significant phenotypic consequences. We have also used the set of 1,786 quads in this analysis to confirm the results of previous work (Krumm et al. 2013), which examined the effect of inherited CNVs in simplex ASD cases on their phenotypes. In particular, we confirm that the SRS score, and especially the differential in scores between probands and their siblings (defined as discordant or concordant SRS scores), is an important discriminant in CNV burden. Furthermore, we also confirm that probands with lower IQ scores are enriched for inherited CNVs, but those with higher IQ are not enriched in comparison to their siblings. These results suggest that more severely impacted ASD cases (as measured by SRS or IQ) are also enriched for inherited CNVs.

Exome sequence data provide the basis for detailed, gene-level examination of variants. In this study, we leverage exome sequence data to discover SNVs and CNVs, and use the convergence of inherited and de novo events to identify new ASD risk factors. We hypothesize that rare inherited mutations can highlight genes in one of two ways. First, they can narrow the focus onto those genes with identified de novo mutations but no or few recurrent de novo mutations. In this study, we identify several genes, such as RIMS1 and CSMD1, for which a combination of inherited and de novo mutations of both SNVs and CNVs paints a strong picture of ASD risk. In a second approach, we find that examination of multiple mutations within each proband reveals a "multiple hit" model for ASD.
Critically, this study is the first to examine the complete genetic picture at an individual level in the context of autism. In particular, we found an inherited two-exon intragenic deletion of NRXN3 and a de novo missense mutation of NLGN2 in 13367.p1. Both of these genes have been identified as ASD risk factors, but crucially, they are also protein-protein interacting partners. The neuroligin-neurexin interaction has long been hypothesized to be a key underlying pathway in ASD pathology (for review see Abrahams and Geschwind 2008), but to our knowledge this is the first identification of a case with mutations in both binding partners.

Finally, our ground-up reanalysis resulted in the identification of several new de novo mutations in genes, and added to the extensive existing work on de novo mutations in the SSC. The additional 49 validated de novo mutations discovered in the present analysis added several new genes to the "recurrently hit" list of genes (Table S2). Several of these "newly recurrent" genes form a network of protein-protein interactions (Figure 4.1), suggesting an underlying neurobiological pathway for ASD. Interestingly, three of these genes (GIGYF1, GIGYF2, and GRB10) all participate in IGF-pathway signaling, dysregulation of which has been previously suggested as an underlying neurobiological cause of ASD (Chen et al. 2014).

V. Summary and Future Directions

5.1 Summary of results

This thesis describes the development of a method (CoNIFER) to find a new class of genetic variation (small CNVs) and its use in more comprehensively assaying genetic variation in simplex autism families. Using exome sequence data from nearly 1,800 quads and nearly 600 trios in conjunction with CoNIFER and newer bioinformatics methods, I discovered new de novo variants and, for the first time, uniformly assayed both inherited CNVs and SNVs.
Integration of the spectrum of genetic variants yielded new insight into their relative contribution to simplex ASD risk and highlighted how convergence of multiple mutation types can identify new ASD risk genes. Although these results are in strong agreement with the highly heritable nature of ASD established through twin and sibling studies, and exome sequencing has identified over a dozen ASD-related genes, clear and specific genotype-phenotype correlations have not yet been established. Here, I summarize and highlight some of the conclusions from this thesis and contextualize them in light of remaining challenges and questions.

5.2 Towards assaying the complete set of genetic variation

The importance of de novo and ultra-rare transmitted variants in the etiology of ASD makes the sensitivity and specificity of variant detection and discovery algorithms a critical factor in our understanding of ASD genetics. In this thesis, I took two approaches to increasing sensitivity and/or specificity of the variants discovered. In Chapter II, I describe CoNIFER, a new algorithm that extends the use of exome sequence data and leverages it to find genic CNVs. In particular, the targeting of exome sequence data to the exons provides a powerful way to assay very small CNVs that disrupt genes. These CNVs are often too small to be detected using standard genome-wide microarrays, suggesting they have not been previously accounted for in our understanding of ASD. In Chapter III, I found that CoNIFER was able to identify over 40% of novel, validated gene-containing CNVs, which previously had been missed by high-density Illumina 1M/1M Duo SNP microarrays (with 1.1 and 1.3 million probes, respectively). Many of these CNVs disrupted individual genes, which were enriched for brain expression or previous associations with neurodevelopmental disorders.
In addition, the high sensitivity afforded by CoNIFER sharpened the contrast between probands and unaffected siblings, and I was able to demonstrate a significantly increased burden of CNVs in the affected probands. This differential burden was especially strong for proband-sibling pairs who were strongly discordant for social phenotype (via the SRS score). Finally, the results suggested that inherited CNVs in probands were preferentially maternally inherited. In Chapter IV, I use multiple tools (i.e., FreeBayes and GATK, CoNIFER and XHMM) to establish both a highly specific set of variants (based on the intersection of each pair) and a highly sensitive set of variants (based on the union). The use of multiple algorithms resulted in an increased yield of loss-of-function de novo variants, added two new genes (ASH1L and IRF2BPL) to the list of recurrently hit genes with multiple observed de novo likely gene disrupting (LGD) mutations, and implicated a new functional network of IGF-related proteins. Furthermore, the highly specific intersection set of these tools provided the basis for comparing rates and burden between probands and unaffected siblings. However, exome sequencing is subject to amplification, sequencing and enrichment biases, which are especially dependent on the %GC content of targets. These biases reduce coverage of, or prevent sequencing altogether in, regions with high %GC nucleotide content. As can be seen in Figure 5.1, GC bias can prevent adequate coverage of entire domains and exons in genes, in turn preventing variant discovery. Importantly, these regions harbor previously identified autism and neurodevelopmental genes, such as SHANK3. De novo mutations of SHANK3 were expected to account for up to 1% of autism cases, yet no de novo mutations have been observed so far using exome sequencing (in contrast, several mutations in SHANK2 have been observed).
These results suggest that a large fraction of ASD-related mutations and genes are not assayed using current data. Several methods are available to assay the GC-rich portion of the exome, including molecular inversion probe (MIP)-based resequencing and whole-genome sequencing, although both are still prone to some bias. Future studies will need not only to critically evaluate which portion of the genome/exome is missed but also to systematically overcome this bias.

Figure 5.1: High genomic GC nucleotide content (green histogram) hinders whole-exome sequencing for some genes, such as SHANK3 (right). Individual coding exons are shown in blue with non-genic sequences removed. Black lines indicate mean sequence depth for a single ASD trio and dark gray intervals indicate maximum and minimum depth across the family. The red dashed line indicates the minimum threshold required for accurate variant detection.

Whole-genome sequencing (WGS), in particular, is likely to provide the underlying basis for a "complete" view of genetic variation, even if results are initially "masked" to the genic portion of the genome. In addition to being much less prone to GC bias, the uniform coverage of WGS across all portions of the genome enables algorithms that capitalize on paired-end sequence information. Such algorithms are well suited for identifying fine or complex structural variation, such as very small CNVs (affecting only one or two exons), which still elude exome-based methods (as a single exon or "probe" is nearly always insufficient to identify a CNV). Moreover, WGS addresses two important remaining spaces of the human genome that have not been fully assayed to date: first, non-genic space and regulatory elements, and second, genes or regions that are part of segmentally duplicated portions. Both are more fully described below.
Mutations in non-genic regulatory elements of the genome are potentially a pervasive component of ASD genetic etiology, especially in the context of a susceptible genetic background. Such regulatory backgrounds may in fact be relatively more common than the mutations seen in exons and genes. There is little present understanding of the expected penetrance of such regulatory mutations, and it is possible that the penetrance of ASD-implicated regulatory mutations mirrors that of the genes they regulate; that is, regulatory mutations of highly penetrant genes such as CHD8 may be equally penetrant (e.g., as in mutations affecting CFTR expression; see Rowntree and Harris 2003), while other mutations may be of significantly lower penetrance (Walsh et al. 2008). Crucially, however, these mutations and backgrounds are still expected to be very rare and to fall below the frequency threshold of genome-wide association studies. Another possibility is that regulatory mutations include a subset of highly penetrant de novo mutations with properties similar to those of loss-of-function mutations seen in genes. Yet, given the difficulty in identifying important regulatory elements in the genome, our limited understanding of their effects, and the enormous genomic space within which these mutations can fall, such mutations may be extraordinarily difficult to identify in a de novo paradigm. Instead, such mutations may be more readily identifiable using large families with recurrent autism and no strongly identifiable genic mutations.

Finally, studies of mutations in ASD have systematically ignored or filtered genes in segmental duplications, due in part to the ambiguity and difficulty involved in accurately mapping sequence data to incomplete or truncated haplotypes. However, segmental duplications may be very important to a complete understanding of ASD, as they contain over 1,000 genes and are enriched for neurodevelopmental and brain-related functions (Sudmant et al. 2010).
These duplications represent the youngest structural changes to the human genome, and many are specific to the Homo lineage. Furthermore, while segmental duplications are known to be a critical driving factor in non-allelic homologous recombination (NAHR), it is not yet understood how the breakpoints of NAHR events may affect the genic content of segmental duplications. Study of these regions has not been comprehensively possible using either array-based or exome sequence data; however, WGS data and analysis may reveal that these regions contribute additional genetic risk for ASD.

5.3 Understanding normal variation and pathogenic variants

The extreme locus and variant heterogeneity of ASD has made statistical association and identification of genes and variants difficult. To combat this heterogeneity, many strategies rely on either reducing the genetic space searched or adding additional abstraction to the genetic data examined. In this thesis, a critical insight was the use of the RVIS score in chapter four. The RVIS score was able to stratify private inherited LGD SNVs into two groups: a group of variants that affected genes where functional variation was commonly observed, and a group of variants that truncated genes which were intolerant to functional variation. This broad distinction at the gene level, and the resulting reduction of the genetic/genomic space (i.e., of the number of genes studied), was critical in establishing private inherited LGD SNVs as a risk factor for ASD as well as in suggesting new candidate ASD genes. However, gene-based scores such as the RVIS score may only be a stepping stone to understanding genetic variants at a variant level. Currently, the agreement between variant pathogenicity scores (such as PolyPhen2, SIFT, MutationTaster, etc.) is low, and these scores poorly stratify risk versus non-risk variants in ASD.
Although some methods, such as the CADD score (Kircher et al. 2014), are designed to aggregate multiple variant-level scores and their genomic context, I found that the CADD score did not substantially add to the identification of SNVs with ASD risk (notably, the CADD model does not include the RVIS score).

In order to improve our understanding of ultra-rare variants and their functional impact, additional work in three domains is needed. First, while the NHLBI's ESP 6500 project has provided an excellent base for understanding rare variation, cohorts of 50,000+ individuals are likely needed to understand the spectrum of ultra-rare variation in human populations. In addition, trio- or family-based studies of control individuals are needed in order to understand the specific pattern of de novo variation.

Second, a critical next step in understanding functional variation in humans is improving the definition and identification of gene and transcript models. Over 90% of transcripts undergo alternative splicing, and over 60% are tissue-regulated (Wang et al. 2008). Crucially, these gene and transcript models are the first step in the identification of mutations as non-genic, synonymous, nonsynonymous, or likely gene disrupting (LGD). In many cases, multiple transcripts exist for a single locus/gene, resulting in multiple annotations for just one SNV. In this case, many publications (including this one) take either the "most severe" effect as the effect of that SNV, a practice that biases reporting towards more severe effects, or the effect on the "longest transcript", which likely overestimates the number of mutations that fall in coding regions. I analyzed the frequency of variant annotation disagreement for variants on currently known transcripts using a set of 2,529 de novo missense, stop-gained and frameshift variants from published sources, by taking the "most severe" functional annotation for each variant (O'Roak et al. 2012b; Iossifov et al. 2012; Sanders et al. 2012).
Of these, 151 (6%) were also annotated as intronic (i.e., not protein-altering) on another transcript, and this fraction was slightly higher for frameshift and nonsense mutations (32/418, 7.7%) than for missense mutations (119/2,111, 5.6%), suggesting a measurable bias due to the selection of the "most severe" effect at each variant. Long-read technology and RNA-seq have the potential to dramatically improve our understanding of transcript models and increase our confidence in functional variant annotation (for reviews of this topic, see Garber et al. 2011 and Mutz et al. 2013). These improvements will not only create a canonical reference set of transcripts for each gene but also have the potential to create tissue- or cell-type-specific transcripts (Lonsdale et al. 2013). This information will make it possible to annotate variants to a much smaller subset of transcripts, improving the specificity of the variant effects. Larger transcript-level datasets will also improve our understanding of splice-site variation. Current annotation tools crudely annotate a variant as a "splice-site disrupting variant" based on proximity to exons alone; future models of variant effects and transcripts will undoubtedly improve on this.

Finally, the wealth of information available at variant, gene, and gene-network levels continues to be a challenge to the interpretation of genomic variants. Future interpretation schemes will need to integrate information across levels, including, for example, the impact of an SNV on a gene at an amino acid level, the transcripts and expression levels of that gene within specific tissues of interest, and how that gene is connected within functional pathways and networks. Encouragingly, such improvements can both provide a wealth of value for new experiments and dramatically improve the value of existing data, such as the exome data from the SSC.
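The "most severe" annotation rule and the disagreement fractions reported in this section can be illustrated with a short sketch. The severity ranking, the effect labels and the function names below are hypothetical and chosen for illustration only; actual annotation tools use their own, longer consequence orderings. Only the counts (151/2,529, 32/418 and 119/2,111) are taken from the text.

```python
# Hypothetical severity ranking, least to most severe (illustration only).
SEVERITY = ["intronic", "synonymous", "missense", "stop_gained", "frameshift"]
PROTEIN_ALTERING = {"missense", "stop_gained", "frameshift"}


def most_severe(effects):
    """Return the most severe effect among per-transcript annotations."""
    return max(effects, key=SEVERITY.index)


def is_discordant(effects):
    """True if the reported (most severe) effect is protein-altering while
    at least one transcript annotates the same variant as intronic."""
    return most_severe(effects) in PROTEIN_ALTERING and "intronic" in effects


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Toy variant annotated on two hypothetical transcripts.
toy_variant = ["intronic", "missense"]
print(most_severe(toy_variant), is_discordant(toy_variant))

# Counts reported in the text: all variants, frameshift/nonsense, missense.
print(percent(151, 2529), percent(32, 418), percent(119, 2111))
```

The two subgroup counts sum to the total (32 + 119 = 151 and 418 + 2,111 = 2,529), so the three percentages are mutually consistent.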
5.4 Defining subtypes of ASD

Exome sequencing provides a unique opportunity to understand ASD not only from a genetic viewpoint but also from a phenotypic point of view. This "genotype-first" approach, in which "genetic subtypes" of ASD are defined by mutations in genes or pathways, promises to create meaningful distinctions within the ASD spectrum for both researchers and patients (Stessman et al. 2014). By assaying and stratifying patients based on the genetic data, the "search space" for possible associated genotypes is greatly reduced, and individual patterns (not significant at a study level) become more pronounced. One example of such an ASD "genetic subtype" is defined by CHD8 mutations, which are associated with larger head circumference, gastrointestinal problems and distinctive facial features. Notably, however, mutations in CHD8 are not specifically associated with low IQ, and nearly half of patients with disruptive de novo mutations in CHD8 have normal or above-normal IQ. Follow-up study of CHD8 mutations in zebrafish has confirmed these phenotypic effects (including slower gastrointestinal motility and wider spaced eye cusps, a proxy for brain or head size; Bernier et al. 2014).

The ability to find genetically and phenotypically similar subgroups within the larger ASD spectrum may also lead to the identification of genes which are "specific" to the social and language deficits of ASD, rather than those that broadly impair IQ or cortical functioning. In chapters three and four, I explored how rare inherited CNVs were enriched in probands, and specifically in those probands with the poorest (i.e., highest) scores on the Social Responsiveness Scale (SRS), which tracks a child's ability to communicate and interact socially. The families with the greatest spread in SRS score between the affected proband and his/her unaffected sibling also showed the greatest proband enrichment for inherited CNVs.
In contrast, I saw no additional enrichment of rare inherited CNVs amongst probands with lower IQs, suggesting that many rare inherited CNVs may be specific to ASD-related phenotypes. Interestingly, when I examined the relationship of private inherited LGD SNVs (in chapter four) and these two phenotypic measures, I found a different relationship: probands with lower IQ were enriched for private LGD SNVs. The difference in phenotypic effect for inherited CNVs and inherited SNVs may be explained in several ways. First, it is possible that CNV duplications, which raise gene dosage to three copies as opposed to reducing copy number (as CNV deletion or LGD mutation of a gene does), are more likely to result in social impairment, perhaps because a third copy would dysregulate pathways rather than abrogate their function. A second possibility is that a subset of private inherited LGD SNVs has a higher phenotypic impact than do CNVs; in this scenario, the observed inherited SNVs would be expected to be very young mutations, which are quickly purged from populations. A final possibility is the reverse: it is in fact CNVs that are more phenotypically impactful (and thus more likely to have more severe IQ-reducing effects) and are purged so quickly from populations that only relatively more benign CNVs remain as transmitted CNVs (in this scenario, the highly impactful CNVs are observed simply as de novo CNVs).

5.5 Defining a gradient of simplex and multiplex autism

The study of simplex autism families and a quad-based approach that includes an unaffected sibling has been instrumental in understanding both the effect of de novo mutations in ASD and the relative risk of inherited genetic mutations for ASD.
However, the ascertainment of simplex autism families is complicated by the effects of stoppage (i.e., families are more likely to stop having children when one child is diagnosed with ASD) and by the fact that family sizes are often too small to rule out an inherited phenotype. A better understanding of the impact and phenotypic effects of inherited and de novo mutations will be gained by examination of larger families, including those with multiple affected offspring. In particular, a community-based or longitudinal study of all ASD diagnoses is a logical next step in understanding the balance of influence from inherited and de novo genetic etiologies in the risk of ASD.

5.6 Understanding complex genetic etiologies at a family level

The exome sequencing and targeted resequencing studies of the past three years have created a wealth of new information about specific, highly penetrant genes, such as CHD8 and SCN2A, and their role and risk in ASD. These studies have established that de novo mutation and rare variants play a dominant role in the genetic etiology of ASD and have identified mutations in highly penetrant genes that are likely causative. In this thesis, I have examined the role of inherited mutations in the context of ASD genetic etiology and identified inherited ASD risk factors and genes. Taken together, however, the gene candidates identified from studies of both de novo and inherited variants can only explain a small fraction of the overall heritable risk for ASD. Identification of additional specific genes implicated in ASD can be achieved through the study of much larger cohorts (Figure 5.2) in order to establish genome-wide significance. However, these estimates assume a very high and constant penetrance and ASD risk for all de novo mutations of genes. Critically, if bona fide ASD risk genes are not fully penetrant, then even larger cohorts of samples will be required for statistical association and discovery of genes.
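As a rough illustration of why cohort size matters for gene discovery, the sketch below asks how many de novo hits in a single gene are needed before recurrence becomes significant at the exome-wide threshold of 2.6 × 10^-6 quoted in the Figure 5.2 caption. It is not the power model behind Figure 5.2: it assumes a simple Poisson null, and the per-gene mutation rate `MU_GENE` is a hypothetical value chosen only for illustration.

```python
import math


def poisson_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), summed directly from the upper tail."""
    if k <= 0:
        return 1.0
    # First tail term P(X = k), computed in log space.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    for i in range(200):  # ample for the small expected counts used here
        total += term
        term *= lam / (k + i + 1)
    return min(total, 1.0)


def min_recurrent_hits(n_trios, mu_gene, alpha):
    """Smallest number of de novo hits in one gene whose probability under
    the null (expected count n_trios * mu_gene) is at most alpha."""
    lam = n_trios * mu_gene
    k = 1
    while poisson_tail(k, lam) > alpha:
        k += 1
    return k


ALPHA = 2.6e-6    # exome-wide threshold from the Figure 5.2 caption
MU_GENE = 2e-5    # hypothetical de novo rate per gene per trio (assumption)

for n_trios in (2000, 10000):
    print(n_trios, min_recurrent_hits(n_trios, MU_GENE, ALPHA))
```

Under these assumptions, a gene whose mutations are only partly penetrant is less enriched among cases, so more trios must be sequenced before its observed hit count reaches the required recurrence.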
In fact, the fraction of genes with multiple de novo hits that are not fully penetrant can be estimated by examining how many of these genes also have inherited LGD mutations in siblings. From the data generated in the reanalysis of the SSC exomes, 40 of the 128 genes (31%) with two or more de novo mutations in probands (and none in siblings) also have at least one LGD SNV mutation or rare CNV inherited in siblings, suggesting that these genes have reduced penetrance for ASD. Thus, this analysis suggests that very large cohorts will be required for the large-scale identification of single ASD risk genes using a de novo sequencing approach.

Figure 5.2: Expected hit rate (or sensitivity) of true positive genes discovered using trio sequencing studies (under a family-wise error rate of 5%; that is, each gene passes exome-wide significance of 2.6 × 10^-6). We estimate the power of trio sequencing to detect statistically significant associations for disease-associated genes, under the assumption that 10% or 20% of singleton mutations could be fully penetrant.

A critical component to identifying additional ASD genes and the underlying neurobiological pathways will be comprehensive analysis of mutations at an individual level. Such an approach will leverage the existing genotype data (including "one-off" de novo mutations in genes as well as inherited CNVs and SNVs) along with pathway and protein network metadata about the genes and pathways affected. Identification of interactions between genes with inherited and de novo mutations will be a powerful method to implicate new candidate genes and may be able to increase the fraction of cases for which a plausible genetic etiology can be identified. As an example of such individual-level pathway identification, in chapter four I identified an ASD proband with a de novo missense mutation in NLGN2, in addition to an inherited 2-exon intragenic deletion of NRXN3.
These two genes are part of the neuroligin-neurexin interaction pathways, which directly mediate the trans-synaptic interface of neurons. While both genes have been implicated in ASD, this specific case illustrates how an incompletely penetrant inherited mutation and a de novo missense mutation can act in a "synergistically" pathogenic manner. Using this framework in conjunction with additional information about cellular- and tissue-level co-expression will undoubtedly reveal additional ASD risk gene combinations and pathways.

[Figure 5.2 image text (page 767 of a Nature Neuroscience review, volume 17, number 6, June 2014). Panel a: number of monogenic disease genes against the fraction of pathogenic singleton mutations; panel b: number of genes detected against the number of trios sequenced, for ASD, SCZ, EE and ID, with 10% or 20% of singletons pathogenic.]

5.7 Future directions

New sequencing technology and the establishment of large, well-phenotyped family-based cohorts, such as the SSC, have enabled the systematic discovery of mutations that underlie the genetic etiology of ASD and ID. The fraction of explained genetic etiology is a measurable indicator of progress. In 2005, ~10% of the genetic etiology of autism was understood. Within seven years, advances in genomics technology facilitated the rapid discovery of de novo SNVs and CNVs, leading to the discovery of disruptive genetic variants that may account for another ~25% of cases. Although the extent of locus heterogeneity in ASD and ID was initially underestimated, the development of exome sequencing and low-cost/high-throughput MIP-based resequencing has strongly implicated two dozen novel genes accounting for >3% of disease. Many of these genes may in fact define distinct clinical "subtypes" of ASD upon detailed examination of patients with a common genetic etiology, consistent with the hypothesis that autism is an umbrella term covering many different and distinct "autisms". This is reminiscent of the work with CNVs, where the identification of recurrent mutations and patient follow-up led to the identification of novel syndromes and subtypes from idiopathic cases of disease (Sharp et al. 2006).
There is already compelling evidence for this based on an assessment of multiple patients with ADNP mutations (Helsmoortel et al. 2014), as well as DYRK1A (Courcet et al. 2012) and CHD8 (Bernier et al. 2014) mutations, the latter two of which appear to define microcephalic and macrocephalic subtypes, respectively. Alternatively, the "genotype-first" approach may also reveal phenotypic variability of genic mutations across a diverse array of neuropsychiatric and neurodevelopmental disorders. Similar to the CNVs of 16p11.2 and 15q13.3, which are associated with several disorders, there is already evidence for this for mutations in SETBP1 (Hoischen et al. 2010) and SCN2A, which result in very different outcomes. Establishment of cohorts with different types of mutations and careful study of their phenotypes and comorbidities may reveal specific protein domains and mutation types associated with different diseases.

The knowledge of specific genes, loci, and pathways now spurs the development of functional experiments. These include using novel methods with induced pluripotent stem cells to assay specific mutations in a patient with Timothy syndrome (Yazawa et al. 2011; Paşca et al. 2011), as well as established model systems, such as mouse and zebrafish models, to explore the roles of CHD8 (Bernier et al. 2014), DYRK1A (Ahn et al. 2006) and PTEN (Backman et al. 2001) in brain volume. Improving knowledge of ASD genetic and neurobiological etiologies will aid in the diagnosis of ASD/ID subtypes, allowing for specific recruitment for clinical trials and the development of targeted therapeutics for each subtype. This model is akin to the heterogeneity seen in other broad categories of human disorders and disease and has proven to be successful in many cases (e.g., specific therapeutics for a particular mutation in cystic fibrosis or specific forms of cancer).
Integrating the genetics, neurology, and pathophysiology of these disorders holds considerable promise not only for our understanding of the biology of the human brain but also for potential treatments.

REFERENCES

Abrahams BS, Geschwind DH. 2008. Advances in autism genetics: on the threshold of a new neurobiology. Nature Reviews Neuroscience 9: 341-355.
Ahn K-J, Jeong HK, Choi H-S, Ryoo S-R, Kim YJ, Goo J-S, Choi S-Y, Han J-S, Ha I, Song W-J. 2006. DYRK1A BAC transgenic mice show altered synaptic plasticity with learning and memory defects. Neurobiol Dis 22: 463-472.
Alkan C, Kidd JM, Marques-Bonet T, Aksay G, Antonacci F, Hormozdiari F, Kitzman JO, Baker C, Malig M, Mutlu O, et al. 2009. Personalized copy number and segmental duplication maps using next-generation sequencing. Nat Genet 41: 1061-1067.
Amir RE, Van den Veyver IB, Wan M, Tran CQ, Francke U, Zoghbi HY. 1999. Rett syndrome is caused by mutations in X-linked MECP2, encoding methyl-CpG-binding protein 2. Nat Genet 23: 185-188.
Backman SA, Stambolic V, Suzuki A, Haight J, Elia A, Pretorius J, Tsao MS, Shannon P, Bolon B, Ivy GO, et al. 2001. Deletion of Pten in mouse brain causes seizures, ataxia and defects in soma size resembling Lhermitte-Duclos disease. Nat Genet 29: 396-403.
Bailey A, Le Couteur A, Gottesman I, Bolton P, Simonoff E, Yuzda E, Rutter M. 1995. Autism as a strongly genetic disorder: evidence from a British twin study. Psychol Med 25: 63-77.
Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE. 2002. Recent segmental duplications in the human genome. Science 297: 1003-1007.
Bajpai R, Chen DA, Rada-Iglesias A, Zhang J, Xiong Y, Helms J, Chang C-P, Zhao Y, Swigut T, Wysocka J. 2010. CHD7 cooperates with PBAF to control multipotent neural crest formation. Nature 463: 958-962.
Bamshad MJ, Ng SB, Bigham AW, Tabor HK, Emond MJ, Nickerson DA, Shendure J. 2011. Exome sequencing as a tool for Mendelian disease gene discovery. Nature Reviews Genetics 12: 745-755. http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pubmed&id=21946919&retmode=ref&cmd=prlinks.
Barton A, Fendrik AJ. 2013. Sustained vs. oscillating expressions of Ngn2, Dll1 and Hes1: A model of neural differentiation of embryonic telencephalon. Journal of Theoretical Biology 328: 1-8.
Batsukh T, Pieper L, Koszucka AM, Velsen von N, Hoyer-Fender S, Elbracht M, Bergman JEH, Hoefsloot LH, Pauli S. 2010. CHD8 interacts with CHD7, a protein which is mutated in CHARGE syndrome. Human Molecular Genetics 19: 2858-2866.
Bedogni F, Hodge RD, Elsen GE, Nelson BR, Daza RAM, Beyer RP, Bammler TK, Rubenstein JLR, Hevner RF. 2010. Tbr1 regulates regional and laminar identity of postmitotic neurons in developing neocortex. Proceedings of the National Academy of Sciences 107: 13129-13134.
Ben-Shachar S, Lanpher B, German JR, Qasaymeh M, Potocki L, Nagamani SCS, Franco LM, Malphrus A, Bottenfield GW, Spence JE, et al. 2009. Microdeletion 15q13.3: a locus with incomplete penetrance for autism, mental retardation, and psychiatric disorders. J Med Genet 46: 382-388.
Bernier R, Golzio C, Xiong B, Stessman HA, Coe BP, Penn O, Witherspoon K, Gerdts J, Baker C, Vulto-van Silfhout AT, et al. 2014. Disruptive CHD8 Mutations Define a Subtype of Autism Early in Development. Cell 158: 263-276.
Berryer MH, Hamdan FF, Klitten LL, Moller RS, Carmant L, Schwartzentruber J, Patry L, Dobrzeniecka S, Rochefort D, Neugnot-Cerioli M, et al. 2012. Mutations in SYNGAP1 Cause Intellectual Disability, Autism, and a Specific Form of Epilepsy by Inducing Haploinsufficiency. Human Mutation 34: 385-394.
Betancur C. 2011. Etiological heterogeneity in autism spectrum disorders: more than 100 genetic and genomic disorders and still counting. Brain Res 1380: 42-77.
Binder DK, Nagelhus EA, Ottersen OP. 2012. Aquaporin-4 and epilepsy eds. C. Steinhäuser and D. Boison. FEBS J 60: 1203-1214.
Bozdagi O, Tavassoli T, Buxbaum JD. 2013. Insulin-like growth factor-1 rescues synaptic and motor deficits in a mouse model of autism and developmental delay. Mol Autism 4: 9.
Cadigan KM. 2008. Wnt/β-Catenin Signaling: Turning the Switch. Developmental Cell 14: 322-323.
Campbell CD, Sampas N, Tsalenko A, Sudmant PH, Kidd JM, Malig M, Vu TH, Vives L, Tsang P, Bruhn L. 2011. Population-Genetic Properties of Differentiated Human Copy-Number Polymorphisms. The American Journal of Human Genetics 88: 317-332.
Chen J, Alberts I, Li X. 2014. Dysregulation of the IGF-I/PI3K/AKT/mTOR signaling pathway in autism spectrum disorders. International Journal of Developmental Neuroscience 35: 35-41.
Chenn A, Walsh CA. 2003. Increased neuronal production, enlarged forebrains and cytoarchitectural distortions in beta-catenin overexpressing transgenic mice. Cereb Cortex 13: 599-606.
Chiang DY, Getz G, Jaffe DB, O'Kelly MJT, Zhao X, Carter SL, Russ C, Nusbaum C, Meyerson M, Lander ES. 2008. High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat Methods 6: 99-103.
Constantino JN, Gruber CP. Social Responsiveness Scale. Western Psychological Services, Los Angeles. http://portal.wpspublish.com/portal/page?_pageid=53,70492&_dad=portal&_schema=PORTAL.
Constantino JN, Todorov A, Hilton C, Law P, Zhang Y, Molloy E, Fitzgerald R, Geschwind D. 2013. Autism recurrence in half siblings: strong support for genetic mechanisms of transmission in ASD. Mol Psychiatry 18: 137-138.
Cooper GM, Coe BP, Girirajan S, Rosenfeld JA, Vu TH, Baker C, Williams C, Stalker H, Hamid R, Hannig V, et al. 2011. A copy number variation morbidity map of developmental delay. Nat Genet 43: 838-846.
Courcet J-B, Faivre L, Malzac P, Masurel-Paulet A, Lopez E, Callier P, Lambert L, Lemesle M, Thevenon J, Gigot N, et al. 2012. The DYRK1A gene is a cause of syndromic intellectual disability with severe microcephaly and epilepsy. J Med Genet 49: 731-736.
Cukier HN, Dueker ND, Slifer SH, Lee JM, Whitehead PL, Lalanne E, Leyva N, Konidari I, Gentry RC, Hulme WF, et al. 2014. Exome sequencing of extended families with autism reveals genes shared across neurodevelopmental and neuropsychiatric disorders. Mol Autism 5: 1.
Darnell JC, Van Driesche SJ, Zhang C, Hung KYS, Mele A, Fraser CE, Stone EF, Chen C, Fak JJ, Chi SW, et al. 2011. FMRP stalls ribosomal translocation on mRNAs linked to synaptic function and autism. Cell 146: 247-261.
Davidson J, Goin-Kochel RP, Green-Snyder LA, Hundley RJ, Warren Z, Peters SU. 2012. Expression of the Broad Autism Phenotype in Simplex Autism Families from the Simons Simplex Collection. J Autism Dev Disord.
de Ligt J, Willemsen MH, van Bon BWM, Kleefstra T, Yntema HG, Kroes T, Vulto-van Silfhout AT, Koolen DA, de Vries P, Gilissen C, et al. 2012. Diagnostic exome sequencing in persons with severe intellectual disability. N Engl J Med 367: 1921-1929.
de Vries BBA, Pfundt R, Leisink M, Koolen DA, Vissers LELM, Janssen IM, Reijmersdal SV, Nillesen WM, Huys EHLPG, Leeuw N de, et al. 2005. Diagnostic genome profiling in mental retardation. Am J Hum Genet 77: 606-616.
Durand CM, Betancur C, Boeckers TM, Bockmann J, Chaste P, Fauchereau F, Nygren G, Rastam M, Gillberg IC, Anckarsäter H, et al. 2007. Mutations in the gene encoding the synaptic scaffolding protein SHANK3 are associated with autism spectrum disorders. Nat Genet 39: 25-27.
Endele S, Rosenberger G, Geider K, Popp B, Tamer C, Stefanova I, Milh M, Kortüm F, Fritsch A, Pientka FK, et al. 2010. Mutations in GRIN2A and GRIN2B encoding regulatory subunits of NMDA receptors cause variable neurodevelopmental phenotypes. Nat Genet 42: 1021-1026.
Fairless R, Masius H, Rohlmann A, Heupel K, Ahmad M, Reissner C, Dresbach T, Missler M. 2008. Polarized Targeting of Neurexins to Synapses Is Regulated by their C-Terminal Sequences. J Neurosci 28: 12969-12981.
Fischbach GD, Lord C. 2010. The Simons Simplex Collection: A Resource for Identification of Autism Genetic Risk Factors. Neuron 68: 192-195.
Fotaki V, Dierssen M, Alcántara S, Martínez S, Martí E, Casas C, Visa J, Soriano E, Estivill X, Arbonés ML. 2002. Dyrk1A haploinsufficiency affects viability and causes developmental delay and abnormal brain morphology in mice. Mol Cell Biol 22: 6636-6647.
Fromer M, Moran JL, Chambert K, Banks E, Bergen SE, Ruderfer DM, Handsaker RE, McCarroll SA, O'Donovan MC, Owen MJ, et al. 2012. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth. Am J Hum Genet 91: 597-607.
Fu Y-H, Kuhl DPA, Pizzuti A, Pieretti M, Sutcliffe JS, Richards S, Verkert AJMH, Holden JJA, Fenwick RG Jr., Warren ST, et al. 1991. Variation of the CGG repeat at the fragile X site results in genetic instability: Resolution of the Sherman paradox. Cell Reports 67: 1047-1058.
Garber M, Grabherr MG, Guttman M, Trapnell C. 2011. Computational methods for transcriptome annotation and quantification using RNA-seq. Nat Methods 8: 469-477.
Garrison E, Marth G. 2012. Haplotype-based variant detection from short-read sequencing. arXiv q-bio.GN.
Gilman SR, Iossifov I, Levy D, Ronemus M, Wigler M, Vitkup D. 2011. Rare De Novo Variants Associated with Autism Implicate a Large Functional Network of Genes Involved in Formation and Function of Synapses. Neuron 70: 898-907.
Girirajan S, Campbell CD, Eichler EE. 2010. Human Copy Number Variation and Complex Genetic Disease. Annu Rev Genet.
Girirajan S, Rosenfeld JA, Coe BP, Parikh S, Friedman N, Goldstein A, Filipink RA, McConnell JS, Angle B, Meschino WS, et al. 2012. Phenotypic heterogeneity of genomic disorders and rare copy-number variants. N Engl J Med 367: 1321-1331.
Glessner JT, Wang K, Cai G, Korvatska O, Kim CE, Wood S, Zhang H, Estes A, Brune CW, Bradfield JP, et al. 2009. Autism genome-wide copy number variation reveals ubiquitin and neuronal genes. Nature 459: 569-573.
Guedj F, Pereira PL, Najas S, Barallobre M-J, Chabert C, Souchet B, Sebrie C, Verney C, Herault Y, Arbones M, et al. 2012. DYRK1A: A master regulatory protein controlling brain growth. Neurobiol Dis 46: 190?203. Hach F, Hormozdiari F, Alkan C, Hormozdiari F, Birol I, Eichler EE, Sahinalp SC. 2010. mrsFAST: a cache-oblivious algorithm for short-read mapping. Nat Methods 7: 576?577. Hallmayer J, Cleveland S, Torres A, Phillips J, Cohen B, Torigoe T, Miller J, Fedele A, Collins J, Smith K, et al. 2011. Genetic heritability and shared environmental factors among twin pairs with autism. Arch Gen Psychiatry 68: 1095?1102. Hamdan FF, Gauthier J, Spiegelman D, Noreau A, Yang Y, Pellerin S, Dobrzeniecka S, C?t? M, Perreau-Linck E, Perreault-Linck E, et al. 2009. Mutations in SYNGAP1 in autosomal nonsyndromic mental retardation. N Engl J Med 360: 599?605. H?vik B, Le Hellard S, Rietschel M, Lyb?k H, Djurovic S, Mattheisen M, M?hleisen TW, Degenhardt F, Priebe L, Maier W, et al. 2011. The complement control-related genes CSMD1 and CSMD2 associate to schizophrenia. Biol Psychiatry 70: 35?42. Helbig I, Mefford HC, Sharp AJ, Guipponi M, Fichera M, Franke A, Muhle H, de Kovel C, Baker C, Spiczak von S, et al. 2009. 15q13.3 microdeletions increase risk of idiopathic generalized epilepsy. Nat Genet 41: 160?162. Helsmoortel, C., Vulto-van Silfhout, A. T., Coe, B. P., Vandeweyer, G., Rooms, L., van den Ende, J., et al. (2014). A SWI/SNF-related autism syndrome caused by de novo mutations in ADNP. Nature Genet, 46: 380?384. Hoischen A, van Bon BWM, Gilissen C, Arts P, van Lier B, Steehouwer M, de Vries P, de Reuver R, Wieskamp N, Mortier G, et al. 2010. De novo mutations of SETBP1 cause Schinzel-Giedion syndrome. Nat Genet 42: 483?485. Holder JL Jr., Lotze TE, Bacino C, Cheung SW. 2012. A child with an inherited 0.31%Mb microdeletion of chromosome 14q32.33: Further delineation of a critical region for the 14q32 deletion syndrome. Am J Med Genet 158A: 1962?1966. 
Hormozdiari F, Alkan C, Eichler EE, Sahinalp SC. 2009. Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes. Genome Research 19: 1270?1278. Hsueh Y-P, Wang T-F, Yang F-C, Sheng M. 2000. Nuclear translocation and transcription regulation by the membrane-associated guanylate kinase CASK/LIN-2. Nature 404: 298?302. 99 Huang Z. 2005. The origin recognition core complex regulates dendrite and spine development in postmitotic neurons. The Journal of Cell Biology 170: 527?535. International Schizophrenia Consortium. 2008. Rare chromosomal deletions and duplications increase risk of schizophrenia. Nature 455: 237?241. Iossifov I, Ronemus M, Levy D, Wang Z, Hakker I, Rosenbaum J, Yamrom B, Lee Y-H, Narzisi G, Leotta A, et al. 2012. De novo gene disruptions in children on the autistic spectrum. Neuron 74: 285?299. Jensen LJ, Kuhn M, Stark M, Chaffron S, Creevey C, Muller J, Doerks T, Julien P, Roth A, Simonovic M, et al. 2009. STRING 8--a global view on proteins and their functional interactions in 630 organisms. Nucleic Acids Research 37: D412?6. Kamiya K, Kaneda M, Sugawara T, Mazaki E, Okamura N, Montal M, Makita N, Tanaka M, Fukushima K, Fujiwara T, et al. 2004. A nonsense mutation of the sodium channel gene SCN2A in a patient with intractable epilepsy and mental decline. J Neurosci 24: 2690?2698. Karakoc E, Alkan C, O'Roak BJ, Dennis MY, Vives L, Mark K, Rieder MJ, Nickerson DA, Eichler EE. 2011. Detection of structural variants and indels within exome data. Nat Methods 9: 176?178. Kidd JM, Cooper GM, Donahue WF, Hayden HS, Sampas N, Graves T, Hansen N, Teague B, Alkan C, Antonacci F, et al. 2008. Mapping and sequencing of structural variation from eight human genomes. Nature 453: 56?64. Kim H-G, Kishikawa S, Higgins AW, Seong I-S, Donovan DJ, Shen Y, Lally E, Weiss LA, Najm J, Kutsche K, et al. 2008. Disruption of Neurexin 1 Associated with Autism Spectrum Disorder. The American Journal of Human Genetics 82: 199?207. 
Klei L, Sanders SJ, Murtha MT, Hus V, Lowe JK, Willsey AJ, Moreno-De-Luca D, Yu TW, Fombonne E, Geschwind D, et al. 2012. Common genetic variants, acting additively, are a major source of risk for autism. Mol Autism 3: 9. Koolen DA, Kramer JM, Neveling K, Nillesen WM, Moore-Barton HL, Elmslie FV, Toutain A, Amiel J, Malan V, Tsai AC-H, et al. 2012. Mutations in the chromatin modifier gene KANSL1 cause the 17q21.31 microdeletion syndrome. Nat Genet 44: 639?641. Korbel JO, Abyzov A, Mu XJ, Carriero N, Cayting P, Zhang Z, Snyder M, Gerstein MB. 2009. PEMer: a computational framework with simulation-based error models for inferring genomic structural variants from massive paired-end sequencing data. Genome Biology 10: R23. Korbel JO, Urban AE, Gruber F, Du J, Royce TE, Starr P, Zhong G, Emanuel B, Weissman SM, Snyder M, et al. 2007. Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome. 100 Proceedings of the National Academy of Sciences of the United States of America 104: 10110. Krumm N, O'Roak BJ, Karakoc E, Mohajeri K, Nelson B, Vives L, Jacquemont S, Munson J, Bernier R, Eichler EE. 2013. Transmission Disequilibrium of Small CNVs in Simplex Autism. Am J Hum Genet 93: 595?606. Krumm N, O'Roak BJ, Shendure J, Eichler EE. 2014. A de novo convergence of autism genetics and molecular neuroscience. Trends Neurosci 37: 95?105. Krumm N, Sudmant PH, Ko A, O'Roak BJ, Malig M, Coe BP, NHLBI Exome Sequencing Project, Quinlan AR, Nickerson DA, Eichler EE. 2012. Copy number variation detection and genotyping from exome sequence data. Genome Research 22: 1525?1532. Kumar RA, Marshall CR, Badner JA, Babatz TD, Mukamel Z, Aldinger KA, Sudi J, Brune CW, Goh G, KaraMohamed S, et al. 2009. Association and Mutation Analyses of 16p11.2 Autism Candidate Genes. PLoS ONE 4: e4582. Lepagnol-Bestel A-M, Zvara A, Maussion G, Quignon F, Ngimbous B, Ramoz N, Imbeaud S, Loe-Mie Y, Benihoud K, Agier N, et al. 2009. 
DYRK1A interacts with the REST/NRSF-SWI/SNF chromatin remodelling complex to deregulate gene clusters involved in the neuronal phenotypic traits of Down syndrome. Human Molecular Genetics 18: 1405?1414. Levy D, Ronemus M, Yamrom B, Lee Y-H, Leotta A, Kendall J, Marks S, Lakshmi B, Pai D, Ye K, et al. 2011. Rare De Novo and Transmitted Copy-Number Variation in Autistic Spectrum Disorders. Neuron 70: 886?897. Li H. 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. Lichtenstein P, Carlstrom E, Rastam M, Gillberg C, Anckarsater H. 2010. The Genetics of Autism Spectrum Disorders and Related Neuropsychiatric Disorders in Childhood. American Journal of Psychiatry 167: 1357?1363. Litterman N, Ikeuchi Y, Gallardo G, O'Connell BC, Sowa ME, Gygi SP, Harper JW, Bonni A. 2011. An OBSL1-Cul7Fbxw8 ubiquitin ligase signaling mechanism regulates Golgi morphology and dendrite patterning. ed. P. Scheiffele. PLoS Biol 9: e1001060. Lonsdale J, Thomas J, Salvatore M, Phillips R, Lo E, Shad S, Hasz R, Walters G, Garcia F, Young N, et al. 2013. The Genotype-Tissue Expression (GTEx) project. Nat Genet 45: 580?585. Losh M, Childress D, Lam K, Piven J. 2008. Defining key features of the broad autism phenotype: a comparison across parents of multiple- and single-incidence autism families. Am J Med Genet B Neuropsychiatr Genet 147B: 424?433. 101 Marshall CR, Noor A, Vincent JB, Lionel AC, Feuk L, Skaug J, Shago M, Moessner R, Pinto D, Ren Y, et al. 2008. Structural variation of chromosomes in autism spectrum disorder. Am J Hum Genet 82: 477?488. Morita, M., Ler, L. W., Fabian, M. R., Siddiqui, N., Mullin, M., Henderson, V. C., et al. (2012). A novel 4EHP-GIGYF2 translational repressor complex is essential for mammalian development. Molecular and Cellular Biology, 32(17), 3585?3593. doi:10.1128/MCB.00455-12. Matsuura T, Sutcliffe JS, Fang P, Galjaard RJ, Jiang YH, Benton CS, Rommens JM, Beaudet AL. 1997. 
De novo truncating mutations in E6-AP ubiquitin-protein ligase gene (UBE3A) in Angelman syndrome. Nat Genet 15: 74?77. Mazur-Kolecka B, Golabek A, Kida E, Rabe A, Hwang Y-W, Adayev T, Wegiel J, Flory M, Kaczmarski W, Marchi E, et al. 2012. Effect of DYRK1A activity inhibition on development of neuronal progenitors isolated from Ts65Dn mice. J Neurosci Res 90: 999?1010. McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al. 2010. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Research 20: 1297?1303. Miller DT, Shen Y, Weiss LA, Korn J, Anselm I, Bridgemohan C, Cox GF, Dickinson H, Gentile J, Harris DJ, et al. 2009. Microdeletion/duplication at 15q13.2q13.3 among individuals with features of autism and other neuropsychiatric disorders. J Med Genet 46: 242?248. Moessner R, Marshall CR, Sutcliffe JS, Skaug J, Pinto D, Vincent J, Zwaigenbaum L, Fernandez B, Roberts W, Szatmari P, et al. 2007. Contribution of SHANK3 mutations to autism spectrum disorder. Am J Hum Genet 81: 1289?1297. Moller RS, K?bart S, Hoeltzenbein M, Heye B, Vogel I, Hansen CP, Menzel C, Ullmann R, Tommerup N, Ropers H-H, et al. 2008. Truncation of the Down syndrome candidate gene DYRK1A in two unrelated patients with microcephaly. Am J Hum Genet 82: 1165?1170. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621?628. Mutz K-O, Heilkenbrinker A, L?nne M, Walter J-G, Stahl F. 2013. Transcriptome analysis using next-generation sequencing. Curr Opin Biotechnol 24: 22?30. Neale BM, Kou Y, Liu L, Ma?ayan A, Samocha KE, Sabo A, Lin C-F, Stevens C, Wang L-S, Makarov V, et al. 2012. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485: 242?245. Ng D, Pitcher GM, Szilard RK, Serti? A, Kanisek M, Clapcote SJ, Lipina T, Kalia LV, 102 Joo D, McKerlie C, et al. 
2009a. Neto1 Is a Novel CUB-Domain NMDA Receptor?Interacting Protein Required for Synaptic Plasticity and Learning. PLoS Biol 7: e1000041. Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, Huff CD, Shannon PT, Jabs EW, Nickerson DA, et al. 2009b. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet 42: 30?35. Ng SB, Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, Shaffer T, Wong M, Bhattacharjee A, Eichler EE, et al. 2009c. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 461: 272?276. Nishiyama M, Oshikawa K, Tsukada Y-I, Nakagawa T, Iemura S-I, Natsume T, Fan Y, Kikuchi A, Skoultchi AI, Nakayama KI. 2009. CHD8 suppresses p53-mediated apoptosis through histone H1 recruitment during early embryogenesis. Nat Cell Biol 11: 172?182. Nishiyama M, Skoultchi AI, Nakayama KI. 2012. Histone H1 recruitment by CHD8 is essential for suppression of the Wnt-?-catenin signaling pathway. Mol Cell Biol 32: 501?512. O'Roak BJ, Deriziotis P, Lee C, Vives L, Schwartz JJ, Girirajan S, Karakoc E, Mackenzie AP, Ng SB, Baker C, et al. 2011. Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nat Genet 43: 585?589. O'Roak BJ, Vives L, Fu W, Egertson JD, Stanaway IB, Phelps IG, Carvill G, Kumar A, Lee C, Ankenman K, et al. 2012a. Multiplex targeted sequencing identifies recurrently mutated genes in autism spectrum disorders. Science 338: 1619?1622. O'Roak BJ, Vives L, Girirajan S, Karakoc E, Krumm N, Coe BP, Levy R, Ko A, Lee C, Smith JD, et al. 2012b. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485: 246?250. Ogiwara I, Ito K, Sawaishi Y, Osaka H, Mazaki E, Inoue I, Montal M, Hashikawa T, Shike T, Fujiwara T, et al. 2009. De novo mutations of voltage-gated sodium channel alphaII gene SCN2A in intractable epilepsies. Neurology 73: 1046?1053. 
Pa?ca SP, Portmann T, Voineagu I, Yazawa M, Shcheglovitov A, Pasca AM, Cord B, Palmer TD, Chikahisa S, Nishino S, et al. 2011. Using iPSC-derived neurons to uncover cellular phenotypes associated with Timothy syndrome. Nat Med 17: 1657?1662. Peiffer DA, Le JM, Steemers FJ, Chang W, Jenniges T, Garcia F, Haden K, Li J, Shaw CA, Belmont J, et al. 2006. High-resolution genomic profiling of chromosomal aberrations using Infinium whole-genome genotyping. Genome Research 16: 1136?1148. Lorenz, P., Dietmann, S., Wilhelm, T., Koczan, D., Autran, S., Gad, S., et al. (2010). The 103 ancient mammalian KRAB zinc finger gene cluster on human chromosome 8q24.3 illustrates principles of C2H2 zinc finger evolution associated with unique expression profiles in human tissues. BMC Genomics, 11(1), 206. doi:10.1186/1471-2164-11-206 Petrovski S, Wang Q, Heinzen EL, Allen AS, Goldstein DB. 2013. Genic Intolerance to Functional Variation and the Interpretation of Personal Genomes. PLoS Genetics 9: e1003709. Pinkel D, Segraves R, Sudar D, Clark S, Poole I, Kowbel D, Collins C, Kuo WL, Chen C, Zhai Y, et al. 1998. High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 20: 207?211. Pinto D, Pagnamenta AT, Klei L, Anney R, Merico D, Regan R, Conroy J, Magalhaes TR, Correia C, Abrahams BS, et al. 2010. Functional impact of global rare copy number variation in autism spectrum disorders. Nature 466: 368?372. Poultney CS, Goldberg AP, Drapeau E, Kou Y, Harony-Nicolas H, Kajiwara Y, De Rubeis S, Durand S, Stevens C, Rehnstr?m K, et al. 2013. Identification of Small Exonic CNV from Whole-Exome Sequence Data and Application to Autism Spectrum Disorder. The American Journal of Human Genetics 93: 607?619. Rauch A, Wieczorek D, Graf E, Wieland T, Endele S, Schwarzmayr T, Albrecht B, Bartholdi D, Beygo J, Di Donato N, et al. 2012. 
Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet 380: 1674?1682. Ronemus M, Iossifov I, Levy D, Wigler M. 2014. The role of de novo mutations in the genetics of autism spectrum disorders. Nature Reviews Genetics 15: 133?141. Rowntree RK, Harris A. 2003. The Phenotypic Consequences of CFTR Mutations. Ann Human Genet 67: 471?485. Salinas PC, Zou Y. 2008. Wnt Signaling in Neural Circuit Assembly. Annu Rev Neurosci 31: 339?358. Sanders SJ, Ercan-Sencicek AG, Hus V, Luo R, Murtha MT, Moreno-De-Luca D, Chu SH, Moreau MP, Gupta AR, Thomson SA, et al. 2011. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron 70: 863?885. Sanders SJ, Murtha MT, Gupta AR, Murdoch JD, Raubeson MJ, Willsey AJ, Ercan-Sencicek AG, DiLullo NM, Parikshak NN, Stein JL, et al. 2012. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485: 237?241. Santen GWE, Aten E, Sun Y, Almomani R, Gilissen C, Nielsen M, Kant SG, Snoeck IN, Peeters EAJ, Hilhorst-Hofstee Y, et al. 2012. Mutations in SWI/SNF chromatin 104 remodeling complex gene ARID1B cause Coffin-Siris syndrome. Nat Genet 44: 379?380. Santos Dos C, Essioux L, Teinturier C, Tauber M, Goffin V, Bougn?res P. 2004. A common polymorphism of the growth hormone receptor is associated with increased responsiveness to growth hormone. Nat Genet 36: 720?724. Sathirapongsasuti JF, Lee H, Horst BAJ, Brunner G, Cochran AJ, Binder S, Quackenbush J, Nelson SF. 2011. Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV. Bioinformatics 27: 2648?2654. Scharpf RB, Irizarry RA, Ritchie ME, Carvalho B, Ruczinski I. 2011. Using the R Package crlmm for Genotyping and Copy Number Estimation. J Stat Softw 40: 1?32. Scholz R, Berberich S, Rathgeber L, Kolleker A, K?hr G, Kornau H-C. 2010. 
AMPA Receptor Signaling through BRAG2 and Arf6 Critical for Long-Term Synaptic Depression. Neuron 66: 768?780. Schuurs-Hoeijmakers JHM, Geraghty MT, Kamsteeg E-J, Ben-Salem S, de Bot ST, Nijhof B, van de Vondervoort IIGM, van der Graaf M, Nobau AC, Otte-H?ller I, et al. 2012. Mutations in DDHD2, Encoding an Intracellular Phospholipase A1, Cause a Recessive Form of Complex Hereditary Spastic Paraplegia. The American Journal of Human Genetics 91: 1073?1081. Sebat J, Lakshmi B, Malhotra D, Troge J, Lese-Martin C, Walsh T, Yamrom B, Yoon S, Krasnitz A, Kendall J, et al. 2007. Strong Association of De Novo Copy Number Mutations with Autism. Science 316: 445?449. Sharp AJ, Hansen S, Selzer RR, Cheng Z, Regan R, Hurst JA, Stewart H, Price SM, Blair E, Hennekam RC, et al. 2006. Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome. Nat Genet 38: 1038?1042. Sharp AJ, Mefford HC, Li K, Baker C, Skinner C, Stevenson RE, Schroer RJ, Novara F, De Gregori M, Ciccone R, et al. 2008. A recurrent 15q13.3 microdeletion syndrome associated with mental retardation and seizures. Nat Genet 40: 322?328. Song WJ, Sternberg LR, Kasten-Sport?s C, Keuren ML, Chung SH, Slack AC, Miller DE, Glover TW, Chiang PW, Lou L, et al. 1996. Isolation of human and murine homologues of the Drosophila minibrain gene: human homologue maps to 21q22.2 in the Down syndrome "critical region". Genomics 38: 331?339. Stefansson H, Rujescu D, Cichon S, Pietil?inen OPH, Ingason A, Steinberg S, Fossdal R, Sigurdsson E, Sigmundsson T, Buizer-Voskamp JE, et al. 2008. Large recurrent microdeletions associated with schizophrenia. Nature 455: 232?236. Steffenburg S, Gillberg C, Hellgren L, Andersson L, Gillberg IC, Jakobsson G, Bohman M. 1989. A twin study of autism in Denmark, Finland, Iceland, Norway and Sweden. 105 J Child Psychol Psychiatry 30: 405?416. Stessman HA, Bernier R, Eichler EE. 2014. 
A genotype-first approach to defining the subtypes of a complex disease. Cell 156: 872?877. Su AI. 2004. A gene atlas of the mouse and human protein-encoding transcriptomes. Proceedings of the National Academy of Sciences 101: 6062?6067. Sudmant PH, Kitzman JO, Antonacci F, Alkan C, Malig M, Tsalenko A, Sampas N, Bruhn L, Shendure J, 1000 Genomes Project, et al. 2010. Diversity of human copy number variation and multicopy genes. Science 330: 641?646. Talkowski ME, Rosenfeld JA, Blumenthal I, Pillalamarri V, Chiang C, Heilbut A, Ernst C, Hanscom C, Rossin E, Lindgren AM, et al. 2012. Sequencing chromosomal abnormalities reveals neurodevelopmental loci that confer risk across diagnostic boundaries. Cell 149: 525?537. Tejedor F, Zhu XR, Kaltenbach E, Ackermann A, Baumann A, Canal I, Heisenberg M, Fischbach KF, Pongs O. 1995. minibrain: a new protein kinase family involved in postembryonic neurogenesis in Drosophila. Neuron 14: 287?301. Thompson BA, Tremblay V, Lin G, Bochar DA. 2008. CHD8 is an ATP-dependent chromatin remodeling factor that regulates beta-catenin target genes. Mol Cell Biol 28: 3894?3904. van Bon BWM, Hoischen A, Hehir-Kwa J, de Brouwer APM, Ruivenkamp C, Gijsbers ACJ, Marcelis CL, de Leeuw N, Veltman JA, Brunner HG, et al. 2011. Intragenic deletion in DYRK1A leads to mental retardation and primary microcephaly. Clin Genet 79: 296?299. Venkatraman ES, Olshen AB. 2007. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23: 657?663. Vissers LELM, de Ligt J, Gilissen C, Janssen I, Steehouwer M, de Vries P, van Lier B, Arts P, Wieskamp N, del Rosario M, et al. 2010. A de novo paradigm for mental retardation. Nat Genet 42: 1109?1112. Voineagu I, Wang X, Johnston P, Lowe JK, Tian Y, Horvath S, Mill J, Cantor RM, Blencowe BJ, Geschwind DH. 2011. Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature 474: 380?384. 
Wada K, Saigoh K, Wang Y-L, Suh J-G, Yamanishi T, Sakai Y, Kiyosawa H, Harada T, Ichihara N, Wakana S, et al. 1999. Intragenic deletion in the gene encoding ubiquitin carboxy-terminal hydrolase in gad mice. Nat Genet 23: 47?51. Wali A, Ali G, John P, Lee K, Chishti MS, Leal SM, Ahmad W. 2007. Mapping of a Gene for Alopecia with Mental Retardation Syndrome (APMR3) on Chromosome 18q11.2-q12.2. Ann Human Genet 71: 570?577. 106 Walsh CA, Morrow EM, Rubenstein JLR. 2008. Autism and Brain Development. Cell 135: 396?400. Walsh KM, Bracken MB. 2011. Copy number variation in the dosage-sensitive 16p11.2 interval accounts for only a small proportion of autism incidence: A systematic review and meta-analysis. Genet Med 13: 377?384. Wang ET, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB. 2008. Alternative isoform regulation in human tissue transcriptomes. Nature 456: 470?476. Wang T-F, Ding C-N, Wang G-S, Luo S-C, Lin Y-L, Ruan Y, Hevner R, Rubenstein JLR, Hsueh Y-P. 2004. Identification of Tbr-1/CASK complex target genes in neurons. J Neurochem 91: 1483?1492. Wobst H, F?rster S, Laurini C, Sekulla A, Dreiseidler M, H?hfeld J, Schmitz B, Diestel S. 2012. UCHL1 regulates ubiquitination and recycling of the neural cell adhesion molecule NCAM. FEBS J 279: 4398?4409. Yabut O, Domogauer J, D'Arcangelo G. Dyrk1A Overexpression Inhibits Proliferation and Induces Premature Neuronal Differentiation of Neural Progenitor Cells. jneurosciorg. Yazawa M, Hsueh B, Jia X, Pasca AM, Bernstein JA, Hallmayer J, Dolmetsch RE. 2011. Using induced pluripotent stem cells to investigate cardiac phenotypes in Timothy syndrome. Nature 471: 230?234. Yoon S, Xuan Z, Makarov V, Ye K, Sebat J. 2009. Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Research 19: 1586?1592. 
Web and Software Resources

CoNIFER (Copy Number Inference from Exome Reads): source code, tutorials and sample data can be downloaded from http://conifer.sf.net, and additional pipeline implementations are available at https://github.com/nkrumm/conifer-tools

MIM (Mendelian Inheritance in Man) identifiers can be accessed via OMIM (Online Mendelian Inheritance in Man): http://omim.org

mrsFAST: source code and binaries are available at http://mrsfast.sf.net

NDAR (National Database of Autism Research): http://ndar.nih.gov

Data from Chapter 4 are available at: https://ndar.nih.gov/study.html?id=334

Appendix A (Chapter 1)

Details of StringDB network generation: To create the PPI network in Figure 3, we started with the de novo mutations published in each of the six exome studies [1-6] and limited these to events found in probands and intersecting exons or canonical splice sites. The network in Figure 3 was created using all genes with de novo truncating variants (defined as nonsense variants, frameshifting variants or variants likely affecting mRNA splicing) as well as six additional genes (DLG4, GRIN2A, CASK, PSEN1, CHD7, NLGN1) in which only missense variants have been observed thus far, but which have important neurobiological roles and/or disease association. In all, we included 158 genes, of which 157 could be identified in the StringDB database (LTN1 was not found in any human interactions in StringDB).

Data from the StringDB interaction database version 9.05 [7] were used to create the edges of the PPI network. We restricted the analysis to human (organism ID 9606) interactions that were supported by experimental evidence and had an overall combined interaction confidence score of 400 or more. We did not include interactions based solely on any of the other StringDB interaction types, such as in silico text-mining, co-expression, etc. Overall, we included 85,678 interactions and 12,113 nodes in our analysis.
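The interaction filter described above can be sketched as follows. This is a minimal illustration and not the code used for the thesis: the column names (`protein1`, `protein2`, `experimental`, `combined_score`) are an assumed layout modeled on a whitespace-separated links table, "experimental evidence" is interpreted here as an experimental sub-score greater than zero, and the identifiers and scores in the toy input are made up.

```python
def filter_string_links(lines, organism="9606", min_combined=400):
    """Keep human, experimentally supported links above a score cutoff.

    `lines` is any iterable of whitespace-separated rows whose first
    non-blank row is a header naming at least: protein1, protein2,
    experimental, combined_score (an assumed layout, not from the thesis).
    """
    prefix = organism + "."
    header = None
    kept = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if header is None:
            header = fields  # first non-blank row names the columns
            continue
        row = dict(zip(header, fields))
        a, b = row["protein1"], row["protein2"]
        if not (a.startswith(prefix) and b.startswith(prefix)):
            continue  # non-human protein on either side
        if int(row["experimental"]) <= 0:
            continue  # no experimental evidence for this pair
        score = int(row["combined_score"])
        if score < min_combined:
            continue  # below the combined confidence cutoff
        kept.append((a, b, score))
    return kept


# Toy input with hypothetical identifiers and scores.
example = [
    "protein1 protein2 experimental textmining combined_score",
    "9606.P1 9606.P2 620 0 650",
    "9606.P1 9606.P3 0 800 810",
    "9606.P2 9606.P4 300 0 350",
    "10090.Q1 10090.Q2 900 0 900",
]
print(filter_string_links(example))
```

Only the first toy row passes all three conditions; the others fail on evidence type, score, or organism, respectively.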
To create the network displayed in Figure 3, we took these steps:

1. Intersected the 157 identified genes based on the criteria above with the human, experimentally validated StringDB interactions. These form the central red (truncating mutations) and blue (selected missense mutations) nodes in Figure 3, and are connected using thick black lines.

2. Found the two largest connected components. We observed that these were connected via the DLGAP1 protein (see main text for discussion) and added this node as an unfilled (white) node with dashed lines.

3. In addition, we surmised that our set of truncating mutations was likely incomplete, and that many ASD/ID genes may be excluded from the central network simply because rare variants in these genes have not yet been discovered. Thus, we "grew" or expanded the network by allowing genes with truncating mutations to be included as "peripheral" nodes if they were within a distance of two (i.e., one intervening node) of the central network. These nodes are drawn in a lighter shade of red and have finely dashed edges. For this analysis, we excluded three proteins (SUMO1, SUMO2 and UBC) which had highly non-specific interactions in StringDB (sumoylation and ubiquitination).

4. We indicated which mutations have only been observed in studies of ID by using half-filled circles. The reciprocal (ASD-only) situation is not indicated because nearly ten-fold more ASD exomes than ID exomes have been sequenced.

5. Lastly, we scaled the sizes of the nodes based on the number of times mutations in cases had been observed in each gene (including the mutations from the MIP resequencing data).

Estimating PPI significance: To test whether the PPI network of de novo mutations found in the six reviewed exome studies was significantly distinct from randomly formed networks of similar size, we performed two simulation studies.
These two simulations were based on random sampling from the complete set of known PPI interactions (i.e., from StringDB) or on random permutation of the existing network. Both simulations were designed to take into account the highly variable degree distribution of interaction networks: some nodes are highly connected "hub nodes", while other proteins are sparsely connected, if at all. The results of the simulations are summarized in Table S1, and each simulation is described in more detail below.

Stratified node resampling: For each iteration of the simulation, we randomly selected a set of nodes from the complete StringDB interaction network (limited to interactions with "experimental" evidence and a minimum interaction score of 400), stratified by the degree distribution of the nodes with mutations. This ensured that the nodes we picked were similar in connectivity and that the representation of "hub" nodes and "outlier" nodes was equivalent to that of the actual network. A new PPI graph was generated from each set of stratified random nodes, and the structural characteristics of these graphs were compiled into a null distribution. We primarily examined the average clustering coefficient and the total number of edges of the resampled graphs and compared these to the characteristics of the actual PPI networks. P-values were derived using the empirical distributions from 10,000 iterations of the simulation.

Edge swapping simulation: In this simulation, we did not alter the set of nodes included in each PPI network, but instead permuted the edges found within the PPI network, thus preserving the degree distribution of the network.
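The two null models can be sketched as follows. This is an illustrative sketch under stated assumptions, not the thesis implementation: strata are taken to be exact node degrees (the thesis does not say how degrees were binned), the edge permutation is the standard degree-preserving double edge swap that replaces edges u-v and x-y with u-x and v-y, and the empirical p-value uses a +1 correction that is also an assumption. All function names are hypothetical.

```python
import random
from collections import defaultdict


def degrees(edges):
    """Degree of every node in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def stratified_sample(deg, observed, rng):
    """Draw one distinct background node per observed node, matched on degree."""
    by_degree = defaultdict(list)
    for node, d in deg.items():
        by_degree[d].append(node)
    chosen = set()
    for node in observed:
        pool = [n for n in by_degree[deg[node]] if n not in chosen]
        chosen.add(rng.choice(pool))
    return chosen


def induced_edge_count(edges, nodes):
    """Number of edges with both endpoints inside `nodes`."""
    return sum(1 for u, v in edges if u in nodes and v in nodes)


def double_edge_swap(edges, n_swaps, rng, max_tries=100):
    """Swap (u-v, x-y) -> (u-x, v-y); every node keeps its degree."""
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    done = tries = 0
    while done < n_swaps and tries < n_swaps * max_tries:
        tries += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        (u, v), (x, y) = edge_list[i], edge_list[j]
        if len({u, v, x, y}) < 4:
            continue  # shared endpoint: swap would create a self-loop
        if frozenset((u, x)) in present or frozenset((v, y)) in present:
            continue  # swap would duplicate an existing edge
        present -= {frozenset((u, v)), frozenset((x, y))}
        present |= {frozenset((u, x)), frozenset((v, y))}
        edge_list[i], edge_list[j] = (u, x), (v, y)
        done += 1
    return edge_list


def empirical_p(observed_value, null_values):
    """One-sided empirical p-value with a +1 correction (an assumption)."""
    hits = sum(1 for x in null_values if x >= observed_value)
    return (hits + 1) / (len(null_values) + 1)
```

In use, one would compute a statistic such as `induced_edge_count` on the observed gene set, repeat it on each resampled node set or edge-swapped network, and pass the resulting null values to `empirical_p`.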
Specifically, in each iteration of the edge swapping simulation, we randomly sampled edges from the network (as many as the total number of edges in the PPI network) and swapped each with another eligible edge, so that a pair of edges u-v and x-y was replaced by the pair u-x and v-y:

    u --- v          u     v
               ->    |     |
    x --- y          x     y

After randomly swapping edges, we re-computed the average clustering coefficient, the size of the largest connected component and the number of edges for the subgraph of the genes (nodes) with observed mutations, and computed the empirical p-value as above. Due to the increased complexity and running time of this simulation, we performed only 1,000 iterations.

Table S1: Summary table of PPI network simulations. Top-row p-values are from the stratified node resampling simulation; bottom-row p-values are from the edge-swap simulation. Nominally significant (p < 0.05) values are highlighted in bold.

Details of Hidden Species simulation in Figure 1: To estimate the number of genes implicated in ASD under a de novo/rare variant model, we used mutations in probands from the four ASD exome studies and a reformulation of the "unseen species problem" (see [8] for review; [9] for an application to de novo CNVs discovered in autism), in which genes with severe de novo SNPs in probands are considered "observed species" and are binned by their frequency of appearance (i.e., "singletons", "doubletons", etc.). For each category (truncating, truncating+missense), we find the distribution of the number of recurrently mutated genes (i.e., the bins and bin counts of a histogram function). All genes with more than one mutation are included, as is a fraction of the "singleton" mutations (those with only one observed mutation in the four studies). The recurrence counts are shown below:

Table S2: Recurrence of de novo mutations in 4 ASD studies

Given these frequencies and frequency counts, we estimated the total number of genes implicated in autism (the total number of species) using the Chao and Lee estimator implemented in the R package SPECIES [10].
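As a rough illustration of this kind of estimate, the sketch below implements the basic coverage-based form of the Chao and Lee estimator in Python. It is not the SPECIES package code: it applies the formula to all frequency classes, omits confidence intervals and any rare/abundant cutoff the package may use, and the `singleton_fraction` parameter is a hypothetical stand-in for including only a fraction of singleton genes. The counts in the usage example are made up and are not the values from Table S2.

```python
def chao_lee_estimate(freq_counts, singleton_fraction=1.0):
    """Coverage-based estimate of the total number of classes (genes).

    freq_counts maps a recurrence k to the number of genes observed
    exactly k times. Only a fraction of the singletons (k = 1) is kept,
    mirroring the idea that not every singleton event is pathogenic.
    """
    counts = dict(freq_counts)
    counts[1] = counts.get(1, 0) * singleton_fraction

    n = sum(k * f for k, f in counts.items())  # total observed events
    d = sum(counts.values())                   # distinct genes observed
    f1 = counts[1]
    if n < 2 or f1 >= n:
        raise ValueError("need at least one recurrently observed gene")

    coverage = 1.0 - f1 / n                    # estimated sample coverage
    n0 = d / coverage                          # estimate assuming equal rates
    # Squared coefficient of variation of per-gene rates, floored at zero.
    pair_sum = sum(k * (k - 1) * f for k, f in counts.items())
    cv2 = max(n0 * pair_sum / (n * (n - 1)) - 1.0, 0.0)
    return n0 + n * (1.0 - coverage) / coverage * cv2


# Hypothetical recurrence counts: 10 singletons and 5 doubletons.
print(chao_lee_estimate({1: 10, 2: 5}))
```

Lowering `singleton_fraction` shrinks the singleton class and therefore raises the estimated coverage, which is why the estimate depends strongly on how many singleton events are treated as pathogenic.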
The "Percentage of de novo singleton events considered pathogenic" refers to the fraction of the singletons (recurrence = 1) included in the frequency counts.

References:
1. O'Roak, B.J. et al. (2012) Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature 485, 246–250
2. Sanders, S.J. et al. (2012) De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature 485, 237–241
3. Iossifov, I. et al. (2012) De novo gene disruptions in children on the autistic spectrum. Neuron 74, 285–299
4. Neale, B.M. et al. (2012) Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature 485, 242–245
5. Rauch, A. et al. (2012) Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet 380, 1674–1682
6. de Ligt, J. et al. (2012) Diagnostic exome sequencing in persons with severe intellectual disability. N Engl J Med 367, 1921–1929
7. Franceschini, A. et al. (2012) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Research 41, D808–D815
8. Bunge, J. and Fitzpatrick, M. (1993) Estimating the Number of Species: A Review. Journal of the American Statistical Association 88, 364–374
9. Sanders, S.J. et al. (2011) Multiple recurrent de novo CNVs, including duplications of the 7q11.23 Williams syndrome region, are strongly associated with autism. Neuron 70, 863–885
10. Wang, J.-P. (2011) SPECIES: An R Package for Species Richness Estimation. Journal of Statistical Software. [Online]. Available: http://www.jstatsoft.org/. [Accessed: 30-Aug-2011]

Appendix B (Chapter 2)

Library construction and exome capture: All exome samples were prepared by subjecting 2 µg of genomic DNA to a series of shotgun library construction steps, including fragmentation through acoustic sonication (Covaris), end-polishing and A-tailing, ligation of sequencing adaptors, and PCR amplification.
Following library construction, 1 µg of shotgun library is hybridized to biotinylated capture probes for 72 hours and then recovered via streptavidin beads. Unbound DNA is washed away, and the captured DNA is PCR amplified for sequencing.

Sequence data processing and alignment: Raw sequenced reads (from FASTQ files) were first split into 36bp chunks (in order to avoid interference from indels), and mapped using the mrsFAST (v.2.3.0.2) aligner. Up to two mismatches were allowed per read. To reduce computational overhead, we created a concatenated exome index, consisting of the targeted exons (see below), plus 300bp flanking sequence from the hg19 (NCBI build 37) human reference genome, masked with RepeatMasker and Tandem Repeat Finder. After mapping to this concatenated "exome", we translated mapped coordinates back to hg19 genome coordinates for further processing.

Exome probe definitions: For the mrsFAST-based alignments, we developed a probe set (i.e., target regions) by intersecting target definitions of the Roche Nimblegen EZ Exome SeqCap Version 2 exome capture kit (from http://www.nimblegen.com/downloads/annotation/ez_exome_v2/SeqCapEZ_Exome_v2.0_Design_Annotation_files.zip) with RefSeq exons (excluding UTR regions). In addition, we included 4,857 non-exonic targeted regions from the SeqCap Version 2 target definition list. This resulted in 194,080 target probes (available at http://conifer.sourceforge.net).

Initial exon-level normalization: We calculated RPKM values for the 194,080 target probes individually. The RPKM normalization is given by

$$\mathrm{RPKM} = \frac{10^{9} \times \text{Read Starts}}{\text{Total Mapped Reads} \times \text{Target Size (bp)}}$$

where the number of Read Starts is defined as the number of reads starting within the target boundaries, and the Total Mapped Reads corresponds to the number of unique reads which had at least one mapping.
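As a small worked example of the RPKM normalization defined above, the sketch below computes the value for a single target. The function name and the example numbers are illustrative assumptions, not part of the pipeline used in this work.

```python
def rpkm(read_starts, total_mapped_reads, target_size_bp):
    """Reads per kilobase of target per million mapped reads.

    read_starts: number of reads starting within the target boundaries.
    total_mapped_reads: number of unique reads with at least one mapping.
    target_size_bp: target (exon) length in base pairs.
    """
    return 1e9 * read_starts / (total_mapped_reads * target_size_bp)


# Hypothetical target: 100 read starts on a 1,000 bp target,
# in an experiment with one million mapped reads.
example_rpkm = rpkm(read_starts=100, total_mapped_reads=1_000_000, target_size_bp=1000)
```

Doubling either the target size or the total number of mapped reads halves the value, which is the sense in which RPKM adjusts for exon size and overall sequencing coverage.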
This initial RPKM normalization step adjusts our read-depth estimates for target (exon) size as well as the overall sequencing coverage in the experiment. To reduce erroneous signal from failed or improperly targeted probes, we excluded 3,964 targets which had a median RPKM < 1 in the 533 ESP samples. Next, to control for probe-to-probe differences in capture efficiency, we standardized the RPKM values using a z-transformation. The median and standard deviation of each exon were derived from RPKM values of the 533 ESP exomes. The formula for the zRPKM value is:

$$\mathrm{zRPKM} = \frac{\mathrm{RPKM}_{\text{exon},\text{sample}} - \mathrm{Median}_{\text{exon}}}{\mathrm{StdDev}_{\text{exon}}}$$

Removing systematic bias between batches: A previous analysis of exome read-depth values from ~1,700 ESP exomes using principal components analysis (PCA) revealed several strong components, some of which were attributed to "batch" effects (unpublished, Sara Ng and Jay Shendure). We hypothesized that these strong components do not correspond to biological signal, but rather to differences in capture protocol, efficiency and sequencing bias. Using singular value decomposition, a mathematical analog of PCA, we decompose the exon-by-sample data matrix (X) into three matrices:

$$X = U S V^{T}$$

In order to remove the strongest k components, we set $S_1 \ldots S_k$ to zero to form $S'$, and then recalculate X as the product of $U$, $S'$ and $V^{T}$. For computational efficiency, each chromosome is normalized individually across the population. We used an implementation of SVD in the scipy.stats package available for the python programming language.

Discovery of rare CNVs: For discovery of rare CNVs, we removed between 12 and 15 (k) singular values, a number which we empirically adjusted based on the inflection point of the "scree plot" (Fig S2), as well as by manual inspection of the final normalized data. To reduce the false positive rate of discovery for rare CNVs, we applied a 15-exon centrally-weighted moving average across exons.
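The normalization steps described above (z-transformation, removal of the k strongest singular values, and smoothing) can be sketched as follows. This is a minimal illustration and not the CoNIFER source code: it uses `numpy.linalg.svd` for the decomposition, and the Blackman window used for the "centrally-weighted" moving average is an assumption, since the exact window shape is not specified here. All function names are hypothetical.

```python
import numpy as np


def zrpkm(rpkm, reference_rpkm):
    """Standardize RPKM per exon using a reference panel.

    rpkm: (exons x samples) array to transform.
    reference_rpkm: (exons x reference samples) array supplying the
    per-exon median and standard deviation.
    """
    median = np.median(reference_rpkm, axis=1, keepdims=True)
    sd = np.std(reference_rpkm, axis=1, keepdims=True)
    return (rpkm - median) / sd


def remove_top_components(x, k):
    """Zero the k largest singular values of x and reconstruct the matrix."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    s_prime = s.copy()
    s_prime[:k] = 0.0          # S' : strongest k components removed
    return (u * s_prime) @ vt  # U S' V^T


def smooth(values, width=15):
    """Centrally weighted moving average (Blackman window is an assumption)."""
    window = np.blackman(width)
    window = window / window.sum()
    return np.convolve(values, window, mode="same")
```

A rank-one matrix is a convenient sanity check: it stands in for a pure batch effect shared across all exons, and removing a single component should leave essentially nothing behind.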
We set discovery thresholds at -1.5 or +1.5 for rare deletions and duplications, respectively, and required at least three exome probes to exceed the threshold. To account for the fact that smoothing shrinks the apparent size of discovered events, regions which exceeded this threshold were slightly expanded until the sample's smoothed value crossed within two standard deviations surrounding the population mean of the smoothed values (Fig S1b).

Sample-level quality control: We excluded ESP exomes from the final background distribution if our algorithm predicted more than 10 calls, as we noted that these samples had a greatly increased total call count (up to 111 calls/sample), and that the calls were largely false positives. This resulted in the exclusion of 80 of the 613 initial ESP exomes (87% pass rate) from the background distribution, leaving our final set of 533 exomes. No exomes from the HapMap cohort (range: 1-7 calls per individual) or the autism cohort (range: 0-14 calls per individual) were excluded.

Genotyping CNPs: For genotyping copy number polymorphic (CNP) regions of the genome, as well as assessing the copy number of multi-copy genes, we developed a slightly modified approach. Starting from zRPKM values, we again applied the SVD transformation, but opted to remove only five components, in order to prevent the SVD algorithm from removing bona fide signal from the regions of interest. We genotyped each individual by determining the average, resulting in the "SVD-ZRPKM value".

Whole Genome Copy Number Correlations: To estimate the absolute copy number at CNP loci, read-depth from independent whole-genome sequencing (as previously described in (Sudmant et al., 2010)) was used. Briefly, regions of known copy number were used to create a copy-number standard curve, and the absolute copy number of tiling 1kb windows across the genome was estimated. For genotyping, the median of the 1kb window estimates was used.
Because we wanted to assess a correlation between exome and whole-genome based methods, we only included loci in the final set if the whole-genome copy number estimate indicated that the locus was polymorphic among the seven HapMap samples tested. We defined a locus to be polymorphic if the absolute range of copy numbers amongst the HapMap samples was greater than 1. Finally, we defined the median copy number of each locus as the median of the absolute copy number estimates among the seven HapMap samples.

Absolute copy number estimation using population frequency information: To convert relative SVD-ZRPKM values into absolute copy numbers, we used an unsupervised clustering algorithm to cluster SVD-ZRPKM genotype values, and then leveraged genotypes from 43 CNPs in a large set of HapMap samples from (Campbell et al., 2011) to match clusters to absolute copy number. Unsupervised clustering was done using a mean-shift algorithm implemented in the python package SciKits.learn. The mean-shift algorithm is similar to k-means clustering, but does not require a priori information regarding the number of clusters. After clustering, we automatically merged clusters together if their centers were not spaced linearly on the x-axis, as we found that this marginally improved the clustering for some loci. Finally, we fit the most common copy-number state(s) for each locus from (Campbell et al., 2011) to the largest cluster(s) identified by the exome-based SVD-ZRPKM values by maximizing the r² value between the two vectors (from each data source) of copy-number states. In other words, we attempted to match the frequencies of each copy number state identified by (Campbell et al., 2011) to consecutive clusters identified by our clustering method. To determine an absolute copy number genotype of a CNP locus for a HapMap sample, we simply determined to which cluster the sample belonged and the matched absolute copy number for that cluster.
Sensitivity call set for HapMap Samples: To assess sensitivity, we started with CNV calls from the discovery experiment from Conrad and colleagues (Conrad et al., 2010) as a gold standard. This list initially contained 6,919 calls for the 5 overlapping HapMap samples in our set. Of these, 486 overlapped at least 3 exome probes (required by our discovery algorithm). Because segmental duplications are prone to array-CGH reference and detection bias, we removed 416 calls for which 50% of the underlying exome probes were in segmental duplications. Finally, we removed 20 calls found in somatically rearranged regions:

chr2:89156874-89630175 Ig light chain kappa
chr6:32386993-32787910 HLA
chr6:31226231-31328167 HLA
chr14:105994256-107283087 Ig heavy chain
chr22:22380820-23265082 Ig light chain lambda
chr7:141975722-142519580 T-cell receptor beta subunit

This resulted in 50 calls. For each call, we reviewed several data sources: 1) Illumina 1M or 650Y (for NA15510) SNP array LogR intensities and B-allele frequency, 2) whole genome copy number estimates (from (Sudmant et al., 2010), but not available for NA15510), 3) fosmid-based calls from (Kidd et al., 2008) and 4) SVD-ZRPKM signal across ESP and HapMap samples. We manually curated the 50 calls into four categories: rare CNVs (5 total), CNPs or CNP-like (36 events), events in high-diversity regions of the genome (6 events; primarily olfactory receptors and zinc-finger genes), and false positives in the Conrad et al. set (3 calls). False positives had no corroborating evidence in any other data set, and were not counted towards the sensitivity estimates.
Discovery of rare CNVs in ASD trios: Using the input set of 366 ASD cohort individuals (122 probands) with 366 randomly picked ESP samples, and removing 15 components, our algorithm made a total of 1,043 calls among the 366 individuals in the ASD cohort (with 369 calls in probands), with each sample having between 0 and 14 calls; overall 340 individuals had at least one call. Merging all overlapping calls in the ASD cohort resulted in 282 CNVRs. As the exome capture reaction targets many genes present in duplicated regions of the genome, and as many exons share homologous sequence, a significant proportion of our calls in probands reflect changes in the copy number of these genes caused by independent assortment of parental haplotypes.

Starting with the 369 calls made in the 117 probands, we filtered calls to enrich for "rare" CNVs. Calls which had greater than 50% reciprocal overlap with segmental duplications (as determined by the fraction of underlying exome probes within the call also in segmental duplications) were removed (163/369, or 44%). Next, we calculated the median copy number of calls based on whole-genome read-depth copy-number estimates from ~660 genomes (Sudmant et al., 2010), and additionally filtered 13 calls (3.5%) with more than 3+ copies population-wide (as events stemming from these segmentally duplicated or higher-copy regions of the genome are likely due to the independent assortment of parental haplotypes, and not "true" rare CNVs). Additionally, we manually curated the calls to remove calls within regions undergoing somatic rearrangement (two calls; one at the IGH locus and one in the HLA locus), and merged adjacent or overlapping calls. These steps left 191 calls among 97 probands, and these calls were primarily found in non-duplicated genes and diploid regions of the genome. We categorized each call into one of three bins: de novo, inherited or copy-number polymorphic (Table S3).
Description of CNVs in ASD trios: We found eight putative de novo events (Table S2). For six of these, we were able to corroborate the event using available Illumina SNP microarray data as well as targeted array-CGH experimental data (Sanders et al. 2011, O'Roak et al., submitted). The other two de novo events were each driven by increases in SVD-ZRPKM values for the eighth exon of the FAF2 gene. Although we were able to confirm an excess of reads mapping to this exon by manual inspection of the mapped reads (from both mrsFAST and BWA alignments), we were not able to experimentally validate these duplications using a quantitative real-time PCR assay targeting the eighth exon itself (data not shown).

Next, we looked for inherited events using our exome read-depth analysis and found that 128 events in the probands were inherited from either the proband's mother or father. For 117 of these events, the SVD-RPKM values of both the proband and the parent exceeded the detection threshold (±1.5); however, for 11 of these calls, the SVD-RPKM values between proband and parents were just below the deletion or duplication threshold required for calling, and inheritance status was determined by manual inspection.

Inspection of the SVD-RPKM values for the remaining 55 events (14 loci; see Table S#) revealed that these events strongly resemble copy-number polymorphic sites or contained processed pseudogenes. Such events are likely due to increases or decreases in copy number from the independent assortment of parental alleles; furthermore, changes in processed pseudogenes at other genomic loci can change the apparent copy number of annotated genes in the exome-capture reaction. As we had explicitly attempted to filter out such sites, we investigated these sites further. The observed signals in five of the loci, PRKRA (18 events), RNF145 (3 events), CDC27 (2 events), HNRNPA1 (1 event), and TDG (1 event), are likely driven by processed pseudogenes.
Most of the remaining loci (DAZL, BTNL8/BTNL3, CLPS, OR4, SIGLEC14, and KRT34) were previously identified to be copy-number polymorphic by Conrad et al. (2010). Finally, the SVD-RPKM values of the last event, a duplication of exons in KRT8 and KRT18, are in line with the signature of CNPs or highly duplicated exons.

Comparison of mrsFAST- and BWA-based read-depth estimation: BWA-based mappings were generated using the default settings for BWA (0.5.6) and post-processed with a pipeline developed specifically for SNP and single nucleotide variant (SNV) discovery. Reads which had more than one high-quality mapping were removed from the alignment and a minimum mapping quality (MAPQ) of 30 was required of all reads. The same method for generating RPKM values was used for BWA alignments as for mrsFAST-based alignments. We calculated RPKM values for the same 194,080 intervals used elsewhere in this report, and again excluded targets with a median RPKM < 1, a total of 7,117 probes in this experiment. To make up the sample set for the comparison experiment, we combined 492 ESP samples, for which we had both mrsFAST and BWA-based mapping information, with the 8 HapMap samples. We noticed that the overall variance (as determined by the scree plot) in the BWA-based mapping was lower, and opted to remove only 6 components of variance. For the mrsFAST-based mappings, we removed the usual first 12 components. All other processing steps were done in the same fashion as elsewhere in this paper. The signal-to-noise ratio for calls was calculated using the formula

$$\mathrm{SNR} = \frac{|\mu_{\text{call}}|}{\sigma_{\text{chromosome}}}$$

where $\mu_{\text{call}}$ is the mean of the SVD-ZRPKM values for the exons within a call, and $\sigma_{\text{chromosome}}$ is the standard deviation of all the SVD-ZRPKM values of the call's chromosome. We calculated the SNR for the seven rare validated calls from Table S1 for both mrsFAST-based and BWA-based SVD-ZRPKM values (Table S6).
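The signal-to-noise ratio defined above is straightforward to compute. The sketch below is illustrative only: the function name, the half-open `[start, stop)` indexing of exons within a call, and the toy values are assumptions made for the example.

```python
import numpy as np


def call_snr(chromosome_values, start, stop):
    """SNR of a call: |mean of exons in the call| / SD of the whole chromosome.

    chromosome_values: SVD-ZRPKM values for all exons on the call's chromosome.
    start, stop: exon indices of the call (half-open interval, an assumption).
    """
    values = np.asarray(chromosome_values, dtype=float)
    call_mean = values[start:stop].mean()
    return abs(call_mean) / values.std()


# Toy chromosome of ten exons with a two-exon duplication signal.
toy_values = [0, 0, 0, 0, 2, 2, 0, 0, 0, 0]
toy_snr = call_snr(toy_values, start=4, stop=6)
```

Because the denominator is computed over the whole chromosome, a noisier normalization lowers the SNR of every call on that chromosome, which is what makes the measure useful for comparing two alignment strategies on the same calls.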
Six of seven rare CNVs showed improved SNR using the mrsFAST-based mappings, with a median improvement of 58% over BWA (mean 38% improvement).

Comparison to ExomeCNV algorithm: We compared our algorithm to the previously published ExomeCNV (Sathirapongsasuti et al., 2011) in order to better understand the strengths and weaknesses of each. ExomeCNV is designed to detect copy number aberration in the context of cancer, a special case of copy number variation which requires additional parameters to be defined (e.g., the rate of admixture/contamination of tumor and normal), and which must be able to handle samples for which a large fraction of the genome is not diploid. Accordingly, ExomeCNV is designed around a digital comparative hybridization algorithm, which requires that both the test and reference are as closely matched as possible (e.g., tumor-normal pairs of exomes from the same capture and sequence), and includes many features to better characterize cancer exomes. In contrast, our algorithm is designed to discover genic deletions and duplications of exonic regions independently in each sample by first eliminating systematic noise using singular value decomposition.

We compared the ability of both algorithms to detect germline variation in DNA samples extensively analyzed and validated as part of other studies. To assess the sensitivity and specificity of both algorithms, we used the five HapMap samples for which exome sequence data had been generated and where high-density microarray analyses had been performed previously (Conrad et al., 2010). We set NA19240 as the reference sample, and used ExomeCNV to call CNVs on the remaining four samples (NA12878, NA15510, NA18517, and NA19129). Similar to the authors' own use of the Novoalign alignment package, we used the available BWA alignments for this comparison, and used the same 194,080 probes to generate an interval coverage file using the GATK (version 1.3.8) software package.
We left all ExomeCNV parameters at their default values: sensitivity and specificity were set at 0.9999 for exons (maximizing specificity) and 0.99 for calls ("auc" option), and the admixture rate was set at a conservative 0.5 (although we did not expect any biological admixture, we found that keeping this setting reduced the number of false positive calls). Among the four test samples, ExomeCNV predicted 450 CNVs, of which only 63 (14%) overlapped with calls in the Conrad et al. call set by more than 10% reciprocal overlap. In contrast, our algorithm found 24 calls among these four samples, of which 21 (87.5%) overlapped the Conrad et al. set. While both programs were able to find all of the five rare CNVs (Table S3), we note that ExomeCNV predicted 16 CNVs larger than 500kb, which did not have any overlap with the high resolution Conrad et al. set of calls. This low specificity would make it very difficult to find "true positives" in the ExomeCNV output, even when filtering for large CNVs only.

Using exon-level log-ratio output from ExomeCNV, we next compared how sensitive it was to changes in copy number of duplicated genes. Across the 62 CNP loci genotyped by our algorithm (Table S4), ExomeCNV was able to generate LogR values for 51 loci (82%). Example correlations and a comparison between ExomeCNV and our algorithm are shown for four loci in Figure S8a. When the log-ratio values were compared to the whole-genome estimate for each locus, the median r² across these loci was 0.57 (cf. r² = 0.92 for the algorithm in this work). As with the BWA alignment comparison, the genotyping dynamic range of ExomeCNV was severely limited, and the LogR values from ExomeCNV correlated only poorly with the corresponding whole-genome estimates of absolute copy number for loci with median copy number greater than seven (Figure S8c).
Finally, although the authors of ExomeCNV recognize that their algorithm depends on sample-to-sample consistency, large cohorts of tens to hundreds of exomes cannot be expected to maintain such consistency. Crucially, our algorithm allows for the comparison of samples from different cohorts, and even different iterations of the exome capture reaction itself. To demonstrate this, we examined two ESP samples from two different experimental cohorts (but stemming from the same study, and using the same capture kit version, library preparation steps and sequencing machines). The output from ExomeCNV for chromosome 20 is shown in the top left panel of Figure S7. When we counted the fraction of exome probes which ExomeCNV predicted as copy-number variant, we found that a biologically implausible 96.6% of the exome was detected as changed from diploid copy number (Figure S7, top right panel). In contrast, when we picked an ESP sample from the same experimental batch (and which was closely matched based on the variance we observed using the SVD decomposition) as the reference, ExomeCNV reported only 0.4% of exome probes as non-diploid (Figure S7, bottom panel). When we applied our algorithm (this work) at a very sensitive setting (±1 SVD-ZRPKM threshold) to the same samples, we found that only 0.06% and 0.15% of the exons were altered from diploid. This comparison highlights the strength of singular value decomposition for eliminating batch effects and systematic noise that may arise from exome capture experiments.

References:
Campbell, C. D., Sampas, N., Tsalenko, A., Sudmant, P. H., Kidd, J. M., Malig, M., Vu, T. H., et al. (2011). Population-Genetic Properties of Differentiated Human Copy-Number Polymorphisms. The American Journal of Human Genetics, 88(3), 317–332. doi:10.1016/j.ajhg.2011.02.004
Conrad, D. F., Pinto, D., Redon, R., Feuk, L., Gokcumen, O., Zhang, Y., Aerts, J., et al. (2010).
Origins and functional impact of copy number variation in the human genome. Nature, 464(7289), 704–712. doi:10.1038/nature08516
Kidd, J. M., Cooper, G. M., Donahue, W. F., Hayden, H. S., Sampas, N., Graves, T., Hansen, N., et al. (2008). Mapping and sequencing of structural variation from eight human genomes. Nature, 453(7191), 56–64. doi:10.1038/nature06862
Sudmant, P. H., Kitzman, J. O., Antonacci, F., Alkan, C., Malig, M., Tsalenko, A., Sampas, N., et al. (2010). Diversity of human copy number variation and multicopy genes. Science, 330(6004), 641–646. doi:10.1126/science.1197005
Sathirapongsasuti, J. F., Lee, H., Horst, B. A. J., Brunner, G., Cochran, A. J., Binder, S., Quackenbush, J., et al. (2011). Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV. Bioinformatics, 27(19), 2648–2654. doi:10.1093/bioinformatics/btr462

Figure S1: Threshold algorithm. [Figure labels: +1.5 duplication threshold (A); extension boundary (B); 2 SD range across population; final call (C).] To discover rare CNVs, we found smoothed SVD-ZRPKM values which crossed a threshold (A) of +1.5 or -1.5 for duplications and deletions, respectively. To account for the fact that our smoothed values shrink the apparent size of the call, we extended calls such that the final call (C) better represented the extent of the actual CNV. To do this, we extended calls from the initial supra-threshold event until the smoothed SVD-ZRPKM values dipped below ±2 standard deviations surrounding the population median (red highlight) of the SVD-ZRPKM values (marked in figure by line [B], and by black circles).

Figure S2: Scree plot. This scree plot shows the first 40 singular values (S_n) from the HapMap (blue) and ASD trio (green) samples. The relative contributed variance of each singular value is proportional to its strength indicated on the y-axis.
[Figure S2 axes: Number (n) on the x-axis, Singular Value (S_n) on the y-axis; legend text not recoverable.]

[...] 80% of 34 unrelated genomes had a copy number of three or greater in 500bp repeat-masked windows across the genome). Excluding calls which overlapped at least 50% with these regions resulted in the exclusion of 2,353 calls (45% of all calls), corresponding to 609 CNVRs.

Next, we excluded calls which were likely due solely to the insertion of processed pseudogenes. CoNIFER and most exome-based read-depth methods are sensitive to copy number changes specifically of exons, which can be the result of retro-insertion of processed mRNA transcripts (see Krumm et al., 2012 for more details). Processed pseudogenes can be polymorphic or fixed among individuals; furthermore, inserted copies may be present in variable copy number or absent altogether within the reference genome. To eliminate CNV calls in our dataset due to polymorphic or novel insertions of processed pseudogenes, we referenced three sources to define likely processed pseudogenes: (1) a list of commonly polymorphic processed pseudogenes generated using SPLIT-READ (Karakoc et al., 2011) from 20 control exomes and (2) 225 autism trios (data not reported here). We excluded calls from our call list for which ≥90% of the probes corresponded to a gene which had been observed at least once in 225 trios. This excluded a total of 1,098/7,628 calls (17%, in line with previous estimates from (Krumm et al., 2012)).

Final filtering and call set generation: Our final set of calls was created by requiring an absolute median SVD-ZRPKM score (i.e., signal strength) of ≥ 0.5 for calls with 5 or more probes, ≥ 1.0 for calls 3-5 probes in length, and ≥ 1.0 for calls 2 probes in length. We excluded any calls on the X or Y chromosomes for all analyses in this work.
Validation of CNV calls

Targeted array CGH microarray design: To validate the small rare CNVs discovered using the discovery pipeline, we designed a custom Agilent SurePrint G3 4x180k CGH microarray. As the variable spacing of the exome probes prevents precise knowledge of CNV breakpoints, we used a variable density array design with a ±3 exon overhang based on the exome-based breakpoints (with min/max limits of 5kbp and 50kbp) where possible. Probe density within the CNV call ranged from one probe per 125bp for calls smaller than 10kbp to one probe per 5kbp for large calls up to 500kbp, in order to ensure at least 10 probes per call. Due to the high density of probes required for validation of small CNVs, some of the probes were of lower quality (based on the manufacturer's quality score), and their performance was accordingly lower.

Array CGH methods: Test DNA from each sample and reference DNA (from HapMap sample NA18507) were labeled with Cy3 and Cy5 dye using a NimbleGen array labeling kit according to the manufacturer's instructions. Five micrograms of labeled test and reference DNA was hybridized to the microarray slide for 24 hours using Agilent reagents and washed according to the manufacturer's directions. Slides were scanned using an Agilent Microarray Scanner and analyzed using Agilent Feature Extract v10.5.1.1.

Data processing and array quality control: Array intensity ratios were log-transformed and assessed for quality control. Arrays with a per-sample standard deviation of LogR values > 0.5 were repeated. In order to reduce systematic and batch noise between probes and samples, we employed a normalization strategy similar to the CoNIFER pipeline and used SVD to remove the three strongest components of variance.
Receiver operating characteristic determination of array CGH thresholds: We determined minimum logR thresholds for the validation arrays by leveraging the logR values across the 60 previously identified CNVs (from Sanders et al., 2011), each found in at least one of our validation samples. We calculated receiver operating characteristic (ROC) curves (Figure S3) for duplications (39 calls) and deletions (21 calls), using the samples without the previously identified CNVs as the "true negatives". Next, we individually picked the optimal operating point (OOP) for deletions (median LogR OOP <= -0.178) and duplications (median LogR OOP >= 0.24), such that we maximally discerned our known true positives from true negatives. Both OOPs had a false positive rate (FPR) of ~1% and a recall rate >90%, indicating our array was highly specific and sensitive to true events. These logR cutoff values were used in assessing whether novel CNVs were true positives: if the mean LogR across all probes in the call interval was greater than the duplication threshold (or lower than the deletion threshold), we considered the call validated.

Estimating false positive rate: We started with 161 exome-based calls (Table S5) among 80 randomly selected probands and siblings. Of these 161 calls, 69 could be confirmed by a CNV already reported by Sanders and colleagues. To validate the remaining calls, we used array CGH data from our customized microarray, the OOPs determined above, and the mean of all the array CGH probes between the exome-based CNV start and stop. The OOP thresholds were exceeded in 61 calls, and based on inheritance across multiple and a combination of available raw Illumina 1M data, we scored four additional calls as validated (Table S5). Six calls did not have sufficient array probe coverage (our upper estimate of the reported FPR includes these calls as false positives; the lower estimate excludes these six from all calculations).
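The validation rule described above (mean LogR across the call interval compared against the optimal operating points) can be written as a short check. This is a minimal sketch: the threshold values are taken from the text, while the function name, the `"dup"`/`"del"` labels, and the treatment of values exactly at the threshold as validated are illustrative assumptions.

```python
# Optimal operating points reported in the text.
DELETION_MAX_LOGR = -0.178
DUPLICATION_MIN_LOGR = 0.24


def is_validated(probe_logr, call_type):
    """Return True if the mean array CGH LogR supports the exome-based call.

    probe_logr: LogR values of all array probes within the call interval.
    call_type: "dup" or "del" (labels are an assumption for this sketch).
    """
    if not probe_logr:
        raise ValueError("no array probes cover this call")
    mean_logr = sum(probe_logr) / len(probe_logr)
    if call_type == "dup":
        return mean_logr >= DUPLICATION_MIN_LOGR
    if call_type == "del":
        return mean_logr <= DELETION_MAX_LOGR
    raise ValueError("call_type must be 'dup' or 'del'")
```

Raising an error for calls without probes mirrors the separate handling of calls with insufficient array coverage, which should not silently count as either validated or failed.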
We found that 14 of the unconfirmed calls were rare processed pseudogenes specific to the family or samples tested. To find these, we mapped exome sequencing reads from each sample to a customized reference sequence composed of mRNA sequences extracted from RefSeq. If more reads mapped across the exon-exon junctions within the CNV call in the sample tested than across the same junction in other samples, we considered the elevation of exome read-depth signal to be due to a processed pseudogene insertion, rather than a true genomic CNV.

Additional analyses

Bootstrap permutations of burden analysis: We tested the robustness of the overall effect of burden by a bootstrap method, in which we calculated the CNV burden ratio (for CNVs and genes) of 10,000 randomly sampled (with replacement) sets of families from the overall set of 411. The resulting distributions of total CNV counts and total gene count and the distribution of burden for both are shown in Figure S5 (compare to actual results for all 411 families in Figure 2a and 2b). The empirical 95% confidence intervals for both the burden of CNVs (CI: 1.09-1.29) and genic burden (CI: 1.10-1.52) reject the null hypothesis (at alpha = 0.05) of no differential CNV burden between probands and siblings.

Phenotypes and regression models: Full-scale IQ, SRS t-scores, and individual components of the SRS were downloaded from the SSC database and release 14 of the SSC. For all analyses we excluded the entire family if any values were missing for either the proband or sibling. To clarify how we classified SRS discordant and concordant families, we provide the following table:

Gene expression data and enrichment analysis: We used publicly available gene expression data from the Human U133A/GNF1H Gene Atlas (GEO: GSE1133), comprising 79 human tissues, including 18 nervous system tissues (Su, 2004). We associated the microarray probe IDs with HUGO gene names and averaged expression across multiple probes in the same gene.
For each tissue, genes were sorted by expression, and we considered the top 5% of each category to be "highly expressed". To calculate enrichment, we took the unique sets of genes disrupted in probands and those disrupted in siblings and intersected each with the set of highly expressed genes in each category. The ratio of counts between these two intersections constituted the fold enrichment for each category. To correct for the 79 multiple comparisons, we employed a permutation and false discovery rate (FDR) strategy. First, we derived a null distribution of enrichment between probands and siblings by shuffling the proband-only and sibling-only sets of genes and recomputing the enrichment. Next, an empirical p-value was derived by scoring the actual enrichment value against the null distribution for each tissue. Using the FDR method described by Storey & Tibshirani (2003) and the R package qvalue, we calculated q-values for each tissue and assessed statistical significance at q < 0.05. To calculate the brain and non-brain averages, we averaged gene expression across all 18 brain and nervous system tissues and across the 61 non-brain tissues. These two categories were corrected for two comparisons each.

Previously associated genes

To establish the list of genes previously associated with autism/ASD/intellectual disability/schizophrenia, we attempted to identify all genes that were "causal" or associated with developmental delay, intellectual disabilities and schizophrenia. We conducted searches of OMIM with the following terms: "mental retardation", "intellectual disabilities", "autism", and "schizophrenia". We also included genes from the Simons SFARI autism candidate genes with "association scores" ranging from 1 to 4 (n=155 genes) (https://gene.sfari.org/autdb/submitsearch?selfld_0=GENES_GENE_SYMBOL&selfldv_0=&numOfFields=1&userAction=viewall&tableName=AUT_HG&submit2=View+All#GS).
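The tissue enrichment and permutation procedure described under "Gene expression data and enrichment analysis" can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the original analysis code: the function names, the handling of a zero sibling count, the add-one correction in the empirical p-value, and the requirement that the proband-only and sibling-only gene sets be disjoint are all choices made here. The q-value step (R package qvalue) is not reproduced.

```python
import random


def top_fraction(expr_by_gene, fraction=0.05):
    """Return the set of genes in the top `fraction` by expression for one tissue."""
    ranked = sorted(expr_by_gene, key=expr_by_gene.get, reverse=True)
    n_top = max(1, int(len(ranked) * fraction))
    return set(ranked[:n_top])


def fold_enrichment(proband_genes, sibling_genes, high_genes):
    """Ratio of highly expressed genes hit in probands vs. siblings."""
    n_pro = len(proband_genes & high_genes)
    n_sib = len(sibling_genes & high_genes)
    # Assumption: report infinity when no sibling gene is highly expressed.
    return n_pro / n_sib if n_sib else float("inf")


def permutation_p_value(proband_genes, sibling_genes, high_genes,
                        n_perm=1000, seed=0):
    """Empirical one-sided p-value from shuffling proband/sibling labels.

    Assumes the two gene sets are disjoint (proband-only and sibling-only).
    """
    rng = random.Random(seed)
    observed = fold_enrichment(proband_genes, sibling_genes, high_genes)
    pooled = sorted(proband_genes) + sorted(sibling_genes)
    n_pro = len(proband_genes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_pro, perm_sib = set(pooled[:n_pro]), set(pooled[n_pro:])
        if fold_enrichment(perm_pro, perm_sib, high_genes) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero (assumption).
    return (hits + 1) / (n_perm + 1)


# Toy gene sets with placeholder names, invented for illustration.
high = top_fraction({"a": 4.0, "b": 3.0, "c": 2.0, "d": 1.0}, fraction=0.5)
print(high)
print(fold_enrichment({"a", "b", "x"}, {"c", "y"}, {"a", "b", "c"}))
print(permutation_p_value({"a", "b", "x"}, {"c", "y"}, {"a", "b", "c"}, n_perm=200))
```

In a full analysis, one such p-value would be computed per tissue and the resulting 79 p-values passed to an FDR procedure.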
Additional control exomes

The set of 2,972 exomes used to assess the population frequency of ultra-rare CNVs was taken from the National Heart, Lung, and Blood Institute's Exome Sequencing Project (ESP). These exomes were processed in bulk using CoNIFER (with 21 components removed), and locus-specific population frequencies were determined by manual inspection of outlier samples for each locus.

Combined mutation model

We used published lists of de novo SNV and indel mutations from three studies (Iossifov et al., 2012; O'Roak et al., 2012; Sanders et al., 2012). In our combined model (see Discussion), we only counted disruptive SNVs and indels (i.e., nonsense, splice, and frameshifting), as these have been shown to be most enriched in probands. Inherited and de novo CNV counts were derived from this work (Table S7). We used a logistic regression model, which transforms a binary outcome (i.e., affected vs. unaffected) such that linear predictors can be used. The model as shown in Figure 5 is:

Code availability

CoNIFER and CNV calling

CoNIFER can be downloaded from http://conifer.sf.net. Version 0.2.2 was used in this work. The custom pipeline for CNV calling described in this work is also available at http://conifer.sf.net, although the authors cannot guarantee or provide technical support for it.

Supplement References

Hach, F., Hormozdiari, F., Alkan, C., Hormozdiari, F., Birol, I., Eichler, E. E., & Sahinalp, S. C. (2010). mrsFAST: a cache-oblivious algorithm for short-read mapping. Nature Methods, 7(8), 576-577. doi:10.1038/nmeth0810-576
Iossifov, I., Ronemus, M., Levy, D., Wang, Z., Hakker, I., Rosenbaum, J., et al. (2012). De novo gene disruptions in children on the autistic spectrum. Neuron, 74(2), 285-299. doi:10.1016/j.neuron.2012.04.009
Krumm, N., Sudmant, P. H., Ko, A., O'Roak, B. J., Malig, M., Coe, B. P., et al. (2012). Copy number variation detection and genotyping from exome sequence data. Genome Research. doi:10.1101/gr.138115.112
O'Roak, B. J., Deriziotis, P., Lee, C., Vives, L., Schwartz, J. J., Girirajan, S., et al. (2011). Exome sequencing in sporadic autism spectrum disorders identifies severe de novo mutations. Nature Genetics, 43(6), 585-589. doi:10.1038/ng.835
O'Roak, B. J., Vives, L., Girirajan, S., Karakoc, E., Krumm, N., Coe, B. P., et al. (2012). Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature. doi:10.1038/nature10989
Sanders, S. J., Murtha, M. T., Gupta, A. R., Murdoch, J. D., Raubeson, M. J., Willsey, A. J., et al. (2012). De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature, 485(7397), 237-241. doi:10.1038/nature10945
Storey, J. D., & Tibshirani, R. (2003). Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences, 100(16), 9440-9445. doi:10.1073/pnas.1530509100
Su, A. I. (2004). A gene atlas of the mouse and human protein-encoding transcriptomes. Proceedings of the National Academy of Sciences, 101(16), 6062-6067. doi:10.1073/pnas.0400782101

Figure S1: CNV calling flowchart. Flow chart for inherited CNV detection. See Methods and Supplemental Methods for details.

Figure S2: Mapped coverage between probands/siblings and by data source. X-axis: total mapped 36mer reads (×10^8) by the mrsFAST alignment program to the human exome. (a) Histograms of probands (left) and siblings (center) and their overlap (right) show no significant difference in coverage levels (paired t-test p = 0.09). (b) Same as in (a), but by dataset, revealing that the Iossifov dataset had lower coverage than the O'Roak or Sanders datasets.

Figure S3: Array-CGH validation ROC curves. Receiver operating characteristic (ROC) curves determining deletion and duplication thresholds in array-CGH validation.
ROC curves based on 60 true-positive deletions (a) and duplications (b) from Sanders et al., 2011 in these samples. Arrows indicate the chosen optimal operating point (OOP), which was used as the threshold for validation of unknown calls.

Figure S4: CNV size, inheritance, and copy number. Inherited CNVs in probands and siblings, binned by size in exons (a) or estimated genomic size (b). As expected, larger CNVs are more likely to be duplications, an effect we found true for both probands and siblings.

Figure S5: Bootstrap results. Results of bootstrap permutation test. We bootstrapped our set of inherited CNVs (sampling CNVs by family, with replacement), and calculated the total CNV counts (a) for probands (green) and siblings (blue) and CNV burden (b) between probands and siblings (dark blue: inner 95% of empirical distribution). In (c) and (d), the results when counting total number of genes and genic burden.

Figure S6: Rare vs. private burden in 411 quads. [Panels: (a) CNVs, y-axis: transmitted CNVs; (b) genes affected, y-axis: transmitted genes; legend: siblings, probands; annotations: p=0.029, ns, p=0.004, ns.] There was no increased burden for CNVs (a) observed only once in 411 families, or for genes in those CNVs (b).

Figure S7: Phenotypes (SRS and IQ) in 411 probands and siblings. (a) Distribution of SRS t-scores in probands (blue) and siblings (green). Higher scores are more affected, and SRS t-scores greater than 75 are considered "severely affected". (b) Heatmap plot of SRS values for probands (x-axis) and their siblings (y-axis). In almost all cases, the probands have higher SRS scores, but the difference in SRS score between probands and siblings varies widely among all pairs. We designated the pairs with the most extreme differences of SRS score between them as "Discordant SRS"
pairs (indicated by arrow and dashed orange box, lower right). (c) All of the SRS discordant pairs had SRS differences > 25 (by definition, as we required these pairs to have a proband SRS [...] (d) Scatter plot showing both proband SRS score and proband IQ score. Dashed blue line indicates cutoff for High and Low IQ in our com- [...]

Figure S8: Burden contrasts including SRS and IQ. [Panels a-d; x-axis: Pro, Sib; legend: CNVs inherited by probands, by siblings, by both.] Burden between SRS and IQ in probands and siblings. (a) Genic burden and (b) CNV burden for proband-sibling pairs where the proband has low IQ (< 70) for [...] shown in (c) and (d). P-value bars are drawn if the two-tailed paired t-test p value is less than 0.05.

Figure S9: Enrichment of brain-expressed genes in SRS discordant quads (a) and all quads (b). Enrichment of brain-expressed genes in probands vs. siblings. Bars (y-axis) represent the ratio of enrichment between probands and siblings for genes highly expressed in each tissue (defined as top 5%, see Methods). Black bars: tissue is part of brain or nervous system; white bars: non-brain or non-nervous system tissues; hatched bars: computed averages. An asterisk indicates significance using an FDR-based multiple testing correction (q-value < 0.05). (a) Probands from SRS discordant quads only show greater enrichment for brain-expressed genes than do (b) all quads.

Figure S10: Intersection between brain-expressed genes and previously associated genes in proband CNVs, but not sibling CNVs. [Panels: Probands (discordant SRS), Probands (all), Siblings (discordant SRS), Siblings (all); circles: Brain Expression, Previously associated w. ID/ASD/SCZ.] Intersection of brain-expressed and disease genes in probands. We intersected the sets of genes found in probands (top row) and siblings (bottom row) that were either brain expressed (teal circles) or had previously been observed in ASD/Schizophrenia/ID (yellow circles).
Probands, especially those in SRS discordant pairs, had a higher fraction of intersecting genes (13 genes, Table S11) than other groups or their siblings, suggesting that these genes may be top candidates for follow-up study in the pathogenesis of ASD.

Sample  Source  SNV data available  Part of Sanders 2011 (Illumina SNP)  Part of QC analysis  CoNIFER Std Dev
11000.fa  Sanders et al   TRUE   TRUE   FALSE  0.30
11000.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11000.p1  Sanders et al   TRUE   TRUE   FALSE  0.28
11000.s1  Sanders et al   TRUE   TRUE   FALSE  0.37
11008.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11008.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11008.p1  Sanders et al   TRUE   TRUE   FALSE  0.27
11008.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11010.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
11010.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11010.p1  Sanders et al   TRUE   TRUE   FALSE  0.28
11010.s1  Sanders et al   TRUE   TRUE   FALSE  0.30
11013.fa  O'Roak et al    TRUE   TRUE   FALSE  0.38
11013.mo  O'Roak et al    TRUE   TRUE   FALSE  0.42
11013.p1  O'Roak et al    TRUE   TRUE   FALSE  0.39
11013.s1  O'Roak et al    TRUE   TRUE   TRUE   0.36
11014.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11014.mo  Sanders et al   TRUE   TRUE   FALSE  0.26
11014.p1  Sanders et al   TRUE   TRUE   FALSE  0.25
11014.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11029.fa  O'Roak et al    TRUE   TRUE   FALSE  0.39
11029.mo  O'Roak et al    TRUE   TRUE   FALSE  0.40
11029.p1  O'Roak et al    TRUE   TRUE   TRUE   0.38
11029.s1  O'Roak et al    TRUE   TRUE   TRUE   0.39
11045.fa  Sanders et al   TRUE   TRUE   FALSE  0.32
11045.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11045.p1  Sanders et al   TRUE   TRUE   TRUE   0.34
11045.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11057.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11057.mo  Sanders et al   TRUE   TRUE   FALSE  0.25
11057.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
11057.s1  Sanders et al   TRUE   TRUE   FALSE  0.23
11060.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
11060.mo  Sanders et al   TRUE   TRUE   FALSE  0.32
11060.p1  Sanders et al   TRUE   TRUE   FALSE  0.38
11060.s2  Sanders et al   TRUE   FALSE  FALSE  0.35
11066.fa  Sanders et al   TRUE   TRUE   FALSE  0.37
11066.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11066.p1  Sanders et al   TRUE   TRUE   FALSE  0.23
11066.s2  Sanders et al   TRUE   FALSE  FALSE  0.34
11067.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11067.mo  Sanders et al   TRUE   TRUE   FALSE  0.21
11067.p1  Sanders et al   TRUE   TRUE   FALSE  0.23
11067.s1  Sanders et al   TRUE   TRUE   FALSE  0.24
11074.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11074.mo  Sanders et al   TRUE   TRUE   FALSE  0.24
11074.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11074.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11075.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
11075.mo  Sanders et al   TRUE   TRUE   FALSE  0.20
11075.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
11075.s1  Sanders et al   TRUE   TRUE   FALSE  0.23
11077.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11077.mo  Sanders et al   TRUE   TRUE   FALSE  0.21
11077.p1  Sanders et al   TRUE   TRUE   FALSE  0.22
11077.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11079.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11079.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11079.p1  Sanders et al   TRUE   TRUE   FALSE  0.22
11079.s1  Sanders et al   TRUE   TRUE   FALSE  0.43
11085.fa  Sanders et al   TRUE   TRUE   FALSE  0.30
11085.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11085.p1  Sanders et al   TRUE   TRUE   FALSE  0.27
11085.s1  Sanders et al   TRUE   TRUE   FALSE  0.30
11089.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11089.mo  Sanders et al   TRUE   TRUE   FALSE  0.23
11089.p1  Sanders et al   TRUE   TRUE   FALSE  0.23
11089.s1  Sanders et al   TRUE   TRUE   FALSE  0.25
11090.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11090.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11090.p1  Sanders et al   TRUE   TRUE   TRUE   0.29
11090.s1  Sanders et al   TRUE   TRUE   FALSE  0.39
11092.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11092.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11092.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11092.s1  Sanders et al   TRUE   TRUE   FALSE  0.29
11094.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11094.mo  Sanders et al   TRUE   TRUE   FALSE  0.22
11094.p1  Sanders et al   TRUE   TRUE   FALSE  0.39
11094.s1  Sanders et al   TRUE   TRUE   FALSE  0.24
11107.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11107.mo  Sanders et al   TRUE   TRUE   FALSE  0.24
11107.p1  Sanders et al   TRUE   TRUE   FALSE  0.27
11107.s1  Sanders et al   TRUE   TRUE   FALSE  0.25
11108.fa  Sanders et al   TRUE   TRUE   FALSE  0.32
11108.mo  Sanders et al   TRUE   TRUE   FALSE  0.32
11108.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
11108.s1  Sanders et al   TRUE   TRUE   FALSE  0.38
11114.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
11114.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11114.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11114.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11115.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11115.mo  Sanders et al   TRUE   TRUE   FALSE  0.40
11115.p1  Sanders et al   TRUE   TRUE   FALSE  0.31
11115.s1  Sanders et al   TRUE   TRUE   TRUE   0.32
11117.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11117.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11117.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
11117.s1  Sanders et al   TRUE   TRUE   FALSE  0.35
11118.fa  Sanders et al   TRUE   TRUE   FALSE  0.46
11118.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11118.p1  Sanders et al   TRUE   TRUE   TRUE   0.30
11118.s1  Sanders et al   TRUE   TRUE   FALSE  0.35
11132.fa  Sanders et al   TRUE   TRUE   FALSE  0.25
11132.mo  Sanders et al   TRUE   TRUE   FALSE  0.23
11132.p1  Sanders et al   TRUE   TRUE   FALSE  0.21
11132.s1  Sanders et al   TRUE   TRUE   FALSE  0.31
11146.fa  Sanders et al   TRUE   TRUE   FALSE  0.25
11146.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11146.p1  Sanders et al   TRUE   TRUE   FALSE  0.35
11146.s1  Sanders et al   TRUE   TRUE   FALSE  0.36
11154.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11154.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11154.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11154.s1  Sanders et al   TRUE   TRUE   FALSE  0.42
11172.fa  O'Roak et al    TRUE   TRUE   FALSE  0.37
11172.mo  O'Roak et al    TRUE   TRUE   FALSE  0.44
11172.p1  O'Roak et al    TRUE   TRUE   FALSE  0.39
11172.s1  O'Roak et al    TRUE   TRUE   FALSE  0.36
11180.fa  Sanders et al   TRUE   FALSE  FALSE  0.31
11180.mo  Sanders et al   TRUE   FALSE  FALSE  0.33
11180.p1  Sanders et al   TRUE   FALSE  FALSE  0.28
11180.s1  Sanders et al   TRUE   FALSE  FALSE  0.33
11190.fa  O'Roak et al    TRUE   TRUE   FALSE  0.55
11190.mo  O'Roak et al    TRUE   TRUE   FALSE  0.46
11190.p1  O'Roak et al    TRUE   TRUE   FALSE  0.36
11190.s1  O'Roak et al    TRUE   TRUE   FALSE  0.41
11196.fa  Sanders et al   TRUE   TRUE   FALSE  0.22
11196.mo  Sanders et al   TRUE   TRUE   FALSE  0.24
11196.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
11196.s1  Sanders et al   TRUE   TRUE   TRUE   0.23
11203.fa  Sanders et al   TRUE   TRUE   FALSE  0.23
11203.mo  Sanders et al   TRUE   TRUE   FALSE  0.22
11203.p1  Sanders et al   TRUE   TRUE   FALSE  0.21
11203.s1  Sanders et al   TRUE   TRUE   FALSE  0.25
11219.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11219.mo  Sanders et al   TRUE   TRUE   FALSE  0.26
11219.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11219.s1  Sanders et al   TRUE   TRUE   FALSE  0.23
11220.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11220.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11220.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11220.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11229.fa  O'Roak et al    TRUE   TRUE   FALSE  0.45
11229.mo  O'Roak et al    TRUE   TRUE   FALSE  0.44
11229.p1  O'Roak et al    TRUE   TRUE   TRUE   0.42
11229.s1  O'Roak et al    TRUE   TRUE   TRUE   0.39
11241.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
11241.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11241.p1  Sanders et al   TRUE   TRUE   TRUE   0.28
11241.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11242.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11242.mo  Sanders et al   TRUE   TRUE   FALSE  0.24
11242.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11242.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11247.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11247.mo  Sanders et al   TRUE   TRUE   FALSE  0.23
11247.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11247.s1  Sanders et al   TRUE   TRUE   FALSE  0.21
11252.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
11252.mo  Sanders et al   TRUE   TRUE   FALSE  0.32
11252.p1  Sanders et al   TRUE   TRUE   TRUE   0.26
11252.s1  Sanders et al   TRUE   TRUE   FALSE  0.31
11265.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
11265.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11265.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
11265.s1  Sanders et al   TRUE   TRUE   FALSE  0.38
11267.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11267.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11267.p1  Sanders et al   TRUE   TRUE   FALSE  0.35
11267.s1  Sanders et al   TRUE   TRUE   TRUE   0.32
11282.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
11282.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11282.p1  Sanders et al   TRUE   TRUE   TRUE   0.32
11282.s1  Sanders et al   TRUE   TRUE   TRUE   0.35
11285.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11285.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11285.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11285.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11290.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11290.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11290.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
11290.s1  Sanders et al   TRUE   TRUE   FALSE  0.36
11291.fa  O'Roak et al    TRUE   TRUE   FALSE  0.39
11291.mo  O'Roak et al    TRUE   TRUE   FALSE  0.41
11291.p1  O'Roak et al    TRUE   TRUE   FALSE  0.41
11291.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11298.fa  Sanders et al   TRUE   TRUE   FALSE  0.35
11298.mo  Sanders et al   TRUE   TRUE   FALSE  0.32
11298.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11298.s1  Sanders et al   TRUE   TRUE   FALSE  0.31
11301.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11301.mo  Sanders et al   TRUE   TRUE   FALSE  0.23
11301.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11301.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11304.fa  Sanders et al   TRUE   TRUE   FALSE  0.30
11304.mo  Sanders et al   TRUE   TRUE   FALSE  0.32
11304.p1  Sanders et al   TRUE   TRUE   FALSE  0.31
11304.s1  Sanders et al   TRUE   TRUE   TRUE   0.26
11316.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
11316.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11316.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
11316.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11336.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11336.mo  Sanders et al   TRUE   TRUE   FALSE  0.22
11336.p1  Sanders et al   TRUE   TRUE   FALSE  0.23
11336.s1  Sanders et al   TRUE   TRUE   FALSE  0.23
11353.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
11353.mo  Sanders et al   TRUE   TRUE   FALSE  0.40
11353.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
11353.s1  Sanders et al   TRUE   TRUE   FALSE  0.36
11356.fa  Sanders et al   TRUE   TRUE   FALSE  0.21
11356.mo  Sanders et al   TRUE   TRUE   FALSE  0.21
11356.p1  Sanders et al   TRUE   TRUE   FALSE  0.22
11356.s1  Sanders et al   TRUE   TRUE   FALSE  0.25
11364.fa  O'Roak et al    TRUE   TRUE   FALSE  0.38
11364.mo  O'Roak et al    TRUE   TRUE   FALSE  0.41
11364.p1  O'Roak et al    TRUE   TRUE   FALSE  0.39
11364.s1  O'Roak et al    TRUE   TRUE   FALSE  0.39
11382.fa  Sanders et al   TRUE   TRUE   FALSE  0.23
11382.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11382.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
11382.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11390.fa  O'Roak et al    TRUE   TRUE   FALSE  0.32
11390.mo  O'Roak et al    TRUE   TRUE   FALSE  0.36
11390.p1  O'Roak et al    TRUE   TRUE   FALSE  0.35
11390.s1  O'Roak et al    TRUE   TRUE   FALSE  0.38
11411.fa  Sanders et al   TRUE   FALSE  FALSE  0.22
11411.mo  Sanders et al   TRUE   FALSE  FALSE  0.25
11411.p1  Sanders et al   TRUE   FALSE  FALSE  0.21
11411.s1  Sanders et al   TRUE   FALSE  FALSE  0.26
11412.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11412.mo  Sanders et al   TRUE   TRUE   FALSE  0.28
11412.p1  Sanders et al   TRUE   TRUE   TRUE   0.27
11412.s1  Sanders et al   TRUE   TRUE   TRUE   0.30
11429.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11429.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11429.p1  Sanders et al   TRUE   TRUE   FALSE  0.27
11429.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11433.fa  Sanders et al   TRUE   TRUE   FALSE  0.30
11433.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11433.p1  Sanders et al   TRUE   TRUE   FALSE  0.35
11433.s1  Sanders et al   TRUE   TRUE   TRUE   0.41
11437.fa  Sanders et al   TRUE   TRUE   FALSE  0.32
11437.mo  Sanders et al   TRUE   TRUE   FALSE  0.35
11437.p1  Sanders et al   TRUE   TRUE   FALSE  0.35
11437.s1  Sanders et al   TRUE   TRUE   FALSE  0.46
11452.fa  O'Roak et al    TRUE   TRUE   FALSE  0.39
11452.mo  O'Roak et al    TRUE   TRUE   FALSE  0.44
11452.p1  O'Roak et al    TRUE   TRUE   FALSE  0.47
11452.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11456.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11456.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11456.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11456.s1  Sanders et al   TRUE   TRUE   TRUE   0.29
11459.fa  O'Roak et al    TRUE   TRUE   FALSE  0.38
11459.mo  O'Roak et al    TRUE   TRUE   FALSE  0.41
11459.p1  O'Roak et al    TRUE   TRUE   FALSE  0.44
11459.s1  O'Roak et al    TRUE   TRUE   FALSE  0.36
11462.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11462.mo  Sanders et al   TRUE   TRUE   FALSE  0.26
11462.p1  Sanders et al   TRUE   TRUE   FALSE  0.22
11462.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11469.fa  O'Roak et al    TRUE   TRUE   FALSE  0.42
11469.mo  O'Roak et al    TRUE   TRUE   FALSE  0.46
11469.p1  O'Roak et al    TRUE   TRUE   FALSE  0.44
11469.s1  O'Roak et al    TRUE   TRUE   FALSE  0.43
11472.fa  O'Roak et al    TRUE   TRUE   FALSE  0.48
11472.mo  O'Roak et al    TRUE   TRUE   FALSE  0.42
11472.p1  O'Roak et al    TRUE   TRUE   FALSE  0.44
11472.s1  O'Roak et al    TRUE   TRUE   FALSE  0.44
11474.fa  Sanders et al   TRUE   TRUE   FALSE  0.34
11474.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11474.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11474.s1  Sanders et al   TRUE   TRUE   FALSE  0.24
11479.fa  O'Roak et al    TRUE   TRUE   FALSE  0.37
11479.mo  O'Roak et al    TRUE   TRUE   FALSE  0.48
11479.p1  O'Roak et al    TRUE   TRUE   FALSE  0.39
11479.s1  O'Roak et al    TRUE   TRUE   FALSE  0.38
11484.fa  Sanders et al   TRUE   TRUE   FALSE  0.25
11484.mo  Sanders et al   TRUE   TRUE   FALSE  0.25
11484.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11484.s1  Sanders et al   TRUE   TRUE   TRUE   0.24
11490.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11490.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11490.p1  Sanders et al   TRUE   TRUE   FALSE  0.25
11490.s1  Sanders et al   TRUE   TRUE   FALSE  0.43
11491.fa  O'Roak et al    TRUE   TRUE   FALSE  0.34
11491.mo  O'Roak et al    TRUE   TRUE   FALSE  0.37
11491.p1  O'Roak et al    TRUE   TRUE   FALSE  0.38
11491.s1  O'Roak et al    TRUE   TRUE   FALSE  0.35
11501.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
11501.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11501.p1  Sanders et al   TRUE   TRUE   FALSE  0.27
11501.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11509.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11509.mo  Sanders et al   TRUE   TRUE   FALSE  0.25
11509.p1  Sanders et al   TRUE   TRUE   FALSE  0.25
11509.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11519.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11519.mo  Sanders et al   TRUE   TRUE   FALSE  0.26
11519.p1  Sanders et al   TRUE   TRUE   TRUE   0.29
11519.s1  Sanders et al   TRUE   TRUE   FALSE  0.34
11524.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11524.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11524.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11524.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11532.fa  Sanders et al   TRUE   TRUE   FALSE  0.30
11532.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11532.p1  Sanders et al   TRUE   TRUE   TRUE   0.30
11532.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11551.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
11551.mo  Sanders et al   TRUE   TRUE   FALSE  0.28
11551.p1  Sanders et al   TRUE   TRUE   TRUE   0.25
11551.s1  Sanders et al   TRUE   TRUE   FALSE  0.29
11561.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11561.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11561.p1  Sanders et al   TRUE   TRUE   FALSE  0.23
11561.s1  Sanders et al   TRUE   TRUE   FALSE  0.35
11569.fa  O'Roak et al    TRUE   TRUE   FALSE  0.30
11569.mo  O'Roak et al    TRUE   TRUE   FALSE  0.38
11569.p1  O'Roak et al    TRUE   TRUE   TRUE   0.39
11569.s1  O'Roak et al    TRUE   TRUE   FALSE  0.35
11571.fa  O'Roak et al    TRUE   TRUE   FALSE  0.41
11571.mo  O'Roak et al    TRUE   TRUE   FALSE  0.42
11571.p1  O'Roak et al    TRUE   TRUE   FALSE  0.41
11571.s1  O'Roak et al    TRUE   TRUE   FALSE  0.38
11581.fa  Sanders et al   TRUE   TRUE   FALSE  0.25
11581.mo  Sanders et al   TRUE   TRUE   FALSE  0.28
11581.p1  Sanders et al   TRUE   TRUE   FALSE  0.27
11581.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11610.fa  O'Roak et al    TRUE   TRUE   FALSE  0.38
11610.mo  O'Roak et al    TRUE   TRUE   FALSE  0.41
11610.p1  O'Roak et al    TRUE   TRUE   FALSE  0.35
11610.s1  O'Roak et al    TRUE   TRUE   FALSE  0.38
11611.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
11611.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
11611.p1  Sanders et al   TRUE   TRUE   FALSE  0.27
11611.s1  Sanders et al   TRUE   FALSE  FALSE  0.32
11622.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11622.mo  Sanders et al   TRUE   TRUE   FALSE  0.25
11622.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
11622.s1  Sanders et al   TRUE   TRUE   FALSE  0.24
11629.fa  O'Roak et al    TRUE   TRUE   FALSE  0.30
11629.mo  O'Roak et al    TRUE   TRUE   FALSE  0.40
11629.p1  O'Roak et al    TRUE   TRUE   FALSE  0.35
11629.s1  O'Roak et al    TRUE   TRUE   FALSE  0.38
11638.fa  O'Roak et al    TRUE   TRUE   FALSE  0.48
11638.mo  O'Roak et al    TRUE   TRUE   FALSE  0.37
11638.p1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11638.s1  O'Roak et al    TRUE   TRUE   FALSE  0.39
11641.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11641.mo  Sanders et al   TRUE   TRUE   FALSE  0.24
11641.p1  Sanders et al   TRUE   TRUE   FALSE  0.33
11641.s1  Sanders et al   TRUE   TRUE   FALSE  0.24
11654.fa  Sanders et al   TRUE   TRUE   FALSE  0.35
11654.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11654.p1  Sanders et al   TRUE   TRUE   FALSE  0.32
11654.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11659.fa  O'Roak et al    TRUE   TRUE   FALSE  0.37
11659.mo  O'Roak et al    TRUE   TRUE   FALSE  0.38
11659.p1  O'Roak et al    TRUE   TRUE   FALSE  0.40
11659.s1  O'Roak et al    TRUE   TRUE   TRUE   0.32
11667.fa  Sanders et al   TRUE   TRUE   FALSE  0.34
11667.mo  Sanders et al   TRUE   TRUE   FALSE  0.33
11667.p1  Sanders et al   TRUE   TRUE   TRUE   0.34
11667.s1  Sanders et al   TRUE   TRUE   TRUE   0.27
11676.fa  Sanders et al   TRUE   TRUE   FALSE  0.22
11676.mo  Sanders et al   TRUE   TRUE   FALSE  0.22
11676.p1  Sanders et al   TRUE   TRUE   FALSE  0.22
11676.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11691.fa  O'Roak et al    TRUE   TRUE   FALSE  0.36
11691.mo  O'Roak et al    TRUE   TRUE   FALSE  0.40
11691.p1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11691.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11696.fa  O'Roak et al    TRUE   TRUE   FALSE  0.47
11696.mo  O'Roak et al    TRUE   TRUE   FALSE  0.35
11696.p1  O'Roak et al    TRUE   TRUE   FALSE  0.44
11696.s1  O'Roak et al    TRUE   TRUE   TRUE   0.39
11700.fa  Sanders et al   TRUE   TRUE   FALSE  0.26
11700.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11700.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11700.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11711.fa  O'Roak et al    TRUE   TRUE   FALSE  0.40
11711.mo  O'Roak et al    TRUE   TRUE   FALSE  0.39
11711.p1  O'Roak et al    TRUE   TRUE   FALSE  0.40
11711.s1  O'Roak et al    TRUE   TRUE   FALSE  0.38
11715.fa  O'Roak et al    TRUE   TRUE   FALSE  0.40
11715.mo  O'Roak et al    TRUE   TRUE   FALSE  0.36
11715.p1  O'Roak et al    TRUE   TRUE   FALSE  0.42
11715.s1  O'Roak et al    TRUE   TRUE   TRUE   0.38
11716.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11716.mo  Sanders et al   TRUE   TRUE   FALSE  0.26
11716.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11716.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11720.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11720.mo  Sanders et al   TRUE   TRUE   FALSE  0.42
11720.p1  Sanders et al   TRUE   TRUE   FALSE  0.33
11720.s1  Sanders et al   TRUE   TRUE   FALSE  0.36
11722.fa  O'Roak et al    TRUE   TRUE   FALSE  0.39
11722.mo  O'Roak et al    TRUE   TRUE   FALSE  0.36
11722.p1  O'Roak et al    TRUE   TRUE   FALSE  0.40
11722.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11724.fa  Sanders et al   TRUE   TRUE   FALSE  0.47
11724.mo  Sanders et al   TRUE   TRUE   FALSE  0.38
11724.p1  Sanders et al   TRUE   TRUE   FALSE  0.36
11724.s1  Sanders et al   TRUE   TRUE   FALSE  0.29
11740.fa  Sanders et al   TRUE   TRUE   FALSE  0.25
11740.mo  Sanders et al   TRUE   TRUE   FALSE  0.26
11740.p1  Sanders et al   TRUE   TRUE   FALSE  0.25
11740.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11766.fa  Sanders et al   TRUE   TRUE   FALSE  0.25
11766.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11766.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11766.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
11773.fa  O'Roak et al    TRUE   TRUE   FALSE  0.44
11773.mo  O'Roak et al    TRUE   TRUE   FALSE  0.39
11773.p1  O'Roak et al    TRUE   TRUE   FALSE  0.33
11773.s1  O'Roak et al    TRUE   TRUE   FALSE  0.34
11788.fa  O'Roak et al    TRUE   TRUE   FALSE  0.38
11788.mo  O'Roak et al    TRUE   TRUE   FALSE  0.37
11788.p1  O'Roak et al    TRUE   TRUE   FALSE  0.42
11788.s1  O'Roak et al    TRUE   TRUE   TRUE   0.37
11797.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
11797.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11797.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11797.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
11808.fa  Sanders et al   TRUE   TRUE   FALSE  0.35
11808.mo  Sanders et al   TRUE   TRUE   FALSE  0.37
11808.p1  Sanders et al   TRUE   TRUE   FALSE  0.34
11808.s1  Sanders et al   TRUE   TRUE   FALSE  0.41
11809.fa  Sanders et al   TRUE   TRUE   FALSE  0.37
11809.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11809.p1  Sanders et al   TRUE   TRUE   FALSE  0.29
11809.s1  Sanders et al   TRUE   TRUE   FALSE  0.34
11810.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
11810.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
11810.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
11810.s1  Sanders et al   TRUE   TRUE   FALSE  0.44
11824.fa  Sanders et al   TRUE   TRUE   FALSE  0.30
11824.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11824.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
11824.s1  Sanders et al   TRUE   FALSE  FALSE  0.39
11828.fa  Sanders et al   TRUE   TRUE   FALSE  0.27
11828.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11828.p1  Sanders et al   TRUE   TRUE   TRUE   0.26
11828.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11872.fa  O'Roak et al    TRUE   TRUE   FALSE  0.36
11872.mo  O'Roak et al    TRUE   TRUE   FALSE  0.35
11872.p1  O'Roak et al    TRUE   TRUE   TRUE   0.35
11872.s1  O'Roak et al    TRUE   TRUE   FALSE  0.33
11892.fa  Sanders et al   TRUE   TRUE   FALSE  0.30
11892.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
11892.p1  Sanders et al   TRUE   TRUE   FALSE  0.34
11892.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11894.fa  Iossifov et al  TRUE   TRUE   FALSE  0.44
11894.mo  Iossifov et al  TRUE   TRUE   FALSE  0.54
11894.p1  Iossifov et al  TRUE   TRUE   FALSE  0.54
11894.s1  Iossifov et al  TRUE   TRUE   FALSE  0.55
11895.fa  O'Roak et al    TRUE   TRUE   FALSE  0.40
11895.mo  O'Roak et al    TRUE   TRUE   FALSE  0.41
11895.p1  O'Roak et al    TRUE   TRUE   FALSE  0.48
11895.s1  O'Roak et al    TRUE   TRUE   TRUE   0.34
11905.fa  Sanders et al   TRUE   TRUE   FALSE  0.32
11905.mo  Sanders et al   TRUE   TRUE   FALSE  0.33
11905.p1  Sanders et al   TRUE   TRUE   FALSE  0.38
11905.s1  Sanders et al   TRUE   TRUE   FALSE  0.39
11942.fa  O'Roak et al    TRUE   TRUE   FALSE  0.34
11942.mo  O'Roak et al    TRUE   TRUE   FALSE  0.40
11942.p1  O'Roak et al    TRUE   TRUE   TRUE   0.34
11942.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11959.fa  O'Roak et al    TRUE   TRUE   FALSE  0.38
11959.mo  O'Roak et al    TRUE   TRUE   FALSE  0.39
11959.p1  O'Roak et al    TRUE   TRUE   FALSE  0.34
11959.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
11962.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
11962.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
11962.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
11962.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
11964.fa  O'Roak et al    TRUE   TRUE   FALSE  0.46
11964.mo  O'Roak et al    TRUE   TRUE   FALSE  0.41
11964.p1  O'Roak et al    TRUE   TRUE   FALSE  0.43
11964.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
12011.fa  O'Roak et al    TRUE   TRUE   FALSE  0.33
12011.mo  O'Roak et al    TRUE   TRUE   FALSE  0.40
12011.p1  O'Roak et al    TRUE   TRUE   FALSE  0.39
12011.s1  O'Roak et al    TRUE   TRUE   TRUE   0.38
12051.fa  Iossifov et al  TRUE   TRUE   FALSE  0.50
12051.mo  Iossifov et al  TRUE   TRUE   FALSE  0.56
12051.p1  Iossifov et al  TRUE   TRUE   FALSE  0.55
12051.s1  Iossifov et al  TRUE   TRUE   FALSE  0.58
12100.fa  Sanders et al   TRUE   TRUE   FALSE  0.32
12100.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
12100.p1  Sanders et al   TRUE   TRUE   FALSE  0.32
12100.s1  Sanders et al   TRUE   TRUE   FALSE  0.40
12106.fa  O'Roak et al    TRUE   TRUE   FALSE  0.39
12106.mo  O'Roak et al    TRUE   TRUE   FALSE  0.33
12106.p1  O'Roak et al    TRUE   TRUE   FALSE  0.33
12106.s1  O'Roak et al    TRUE   TRUE   TRUE   0.38
12152.fa  O'Roak et al    TRUE   TRUE   FALSE  0.37
12152.mo  O'Roak et al    TRUE   TRUE   FALSE  0.37
12152.p1  O'Roak et al    TRUE   TRUE   FALSE  0.33
12152.s1  O'Roak et al    TRUE   TRUE   FALSE  0.37
12153.fa  O'Roak et al    TRUE   TRUE   FALSE  0.38
12153.mo  O'Roak et al    TRUE   TRUE   FALSE  0.42
12153.p1  O'Roak et al    TRUE   TRUE   FALSE  0.37
12153.s1  O'Roak et al    TRUE   TRUE   FALSE  0.31
12161.fa  O'Roak et al    TRUE   TRUE   FALSE  0.31
12161.mo  O'Roak et al    TRUE   TRUE   FALSE  0.35
12161.p1  O'Roak et al    TRUE   TRUE   FALSE  0.44
12161.s1  O'Roak et al    TRUE   TRUE   TRUE   0.37
12162.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
12162.mo  Sanders et al   TRUE   TRUE   FALSE  0.33
12162.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
12162.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
12175.fa  Sanders et al   TRUE   TRUE   FALSE  0.31
12175.mo  Sanders et al   TRUE   TRUE   FALSE  0.28
12175.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
12175.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
12187.fa  Sanders et al   TRUE   TRUE   FALSE  0.32
12187.mo  Sanders et al   TRUE   TRUE   FALSE  0.34
12187.p1  Sanders et al   TRUE   TRUE   FALSE  0.32
12187.s1  Sanders et al   TRUE   TRUE   FALSE  0.28
12224.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
12224.mo  Sanders et al   TRUE   TRUE   FALSE  0.29
12224.p1  Sanders et al   TRUE   TRUE   FALSE  0.28
12224.s1  Sanders et al   TRUE   TRUE   FALSE  0.27
12228.fa  Sanders et al   TRUE   TRUE   FALSE  0.28
12228.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
12228.p1  Sanders et al   TRUE   TRUE   TRUE   0.25
12228.s1  Sanders et al   TRUE   TRUE   FALSE  0.29
12233.fa  O'Roak et al    TRUE   TRUE   FALSE  0.40
12233.mo  O'Roak et al    TRUE   TRUE   FALSE  0.39
12233.p1  O'Roak et al    TRUE   TRUE   FALSE  0.36
12233.s1  O'Roak et al    TRUE   TRUE   FALSE  0.34
12235.fa  Sanders et al   TRUE   TRUE   FALSE  0.24
12235.mo  Sanders et al   TRUE   TRUE   FALSE  0.28
12235.p1  Sanders et al   TRUE   TRUE   FALSE  0.26
12235.s1  Sanders et al   TRUE   TRUE   FALSE  0.22
12241.fa  Sanders et al   TRUE   TRUE   FALSE  0.32
12241.mo  Sanders et al   TRUE   TRUE   FALSE  0.31
12241.p1  Sanders et al   TRUE   TRUE   FALSE  0.30
12241.s1  Sanders et al   TRUE   TRUE   FALSE  0.29
12243.fa  Iossifov et al  TRUE   TRUE   FALSE  0.55
12243.mo  Iossifov et al  TRUE   TRUE   FALSE  0.60
12243.p1  Iossifov et al  TRUE   TRUE   FALSE  0.57
12243.s1  Iossifov et al  TRUE   TRUE   FALSE  0.59
12252.fa  Iossifov et al  TRUE   TRUE   FALSE  0.50
12252.mo  Iossifov et al  TRUE   TRUE   FALSE  0.50
12252.p1  Iossifov et al  TRUE   TRUE   FALSE  0.58
12252.s1  Iossifov et al  TRUE   TRUE   TRUE   0.52
12285.fa  O'Roak et al    TRUE   FALSE  FALSE  0.40
12285.mo  O'Roak et al    TRUE   FALSE  FALSE  0.36
12285.p1  O'Roak et al    TRUE   FALSE  FALSE  0.49
12285.s1  O'Roak et al    TRUE   FALSE  FALSE  0.30
12295.fa  Sanders et al   TRUE   TRUE   FALSE  0.37
12295.mo  Sanders et al   TRUE   TRUE   FALSE  0.35
12295.p1  Sanders et al   TRUE   TRUE   FALSE  0.31
12295.s1  Sanders et al   TRUE   TRUE   FALSE  0.35
12297.fa  Sanders et al   TRUE   TRUE   FALSE  0.25
12297.mo  Sanders et al   TRUE   TRUE   FALSE  0.25
12297.p1  Sanders et al   TRUE   TRUE   FALSE  0.21
12297.s1  Sanders et al   TRUE   TRUE   FALSE  0.23
12301.fa  Iossifov et al  TRUE   TRUE   FALSE  0.55
12301.mo  Iossifov et al  TRUE   TRUE   FALSE  0.55
12301.p1  Iossifov et al  TRUE   TRUE   FALSE  0.48
12301.s1  Iossifov et al  TRUE   TRUE   FALSE  0.48
12303.fa  Sanders et al   TRUE   TRUE   FALSE  0.35
12303.mo  Sanders et al   TRUE   TRUE   FALSE  0.23
12303.p1  Sanders et al   TRUE   TRUE   FALSE  0.24
12303.s1  Sanders et al   TRUE   TRUE   FALSE  0.26
12304.fa  O'Roak et al    TRUE   TRUE   FALSE  0.37
12304.mo  O'Roak et al    TRUE   TRUE   FALSE  0.42
12304.p1  O'Roak et al    TRUE   TRUE   TRUE   0.40
12304.s1  O'Roak et al    TRUE   TRUE   FALSE  0.35
12308.fa  Sanders et al   TRUE   TRUE   FALSE  0.29
12308.mo  Sanders et al   TRUE   TRUE   FALSE  0.30
12308.p1  Sanders et al   TRUE   TRUE   FALSE  0.31
12308.s1  Sanders et al   TRUE   TRUE   FALSE  0.40
12313.fa  Iossifov et al  TRUE   TRUE   FALSE  0.48
12313.mo  Iossifov et al  TRUE   TRUE   FALSE  0.50
12313.p1  Iossifov et al  TRUE   TRUE   FALSE  0.53
12313.s1  Iossifov et al  TRUE   TRUE   FALSE  0.58
12317.fa  Sanders et al   TRUE   TRUE   FALSE  0.23
12317.mo  Sanders et al   TRUE   TRUE   FALSE  0.27
12317.p1  Sanders et al   TRUE   TRUE   FALSE  0.23
12317.s1  Sanders et al   TRUE   TRUE   TRUE   0.25
12321.fa  Iossifov et al  TRUE   TRUE   FALSE  0.49
12321.mo  Iossifov et al  TRUE   TRUE   FALSE  0.50
12321.p1  Iossifov et al  TRUE   TRUE   FALSE  0.51
12321.s1  Iossifov et al  TRUE   TRUE   FALSE  0.54
12327.fa  Sanders et al   TRUE   TRUE   FALSE  0.36
12327.mo  Sanders et al   TRUE   TRUE   FALSE  0.33
12327.p1  Sanders et al   TRUE   TRUE   FALSE  0.31
12327.s1  Sanders et al   TRUE   TRUE   FALSE  0.50
12334.fa  Iossifov et al  TRUE   TRUE   FALSE  0.41
12334.mo  Iossifov et al  TRUE   TRUE   FALSE  0.48
12334.p1  Iossifov et al  TRUE   TRUE   FALSE  0.40
12334.s1
Iossifov et al TRUE TRUE FALSE 0.4112340.fa Sanders et al TRUE TRUE FALSE 0.3812340.mo Sanders et al TRUE TRUE FALSE 0.3812340.p1 Sanders et al TRUE TRUE FALSE 0.4312340.s1 Sanders et al TRUE TRUE FALSE 0.3412343.fa Sanders et al TRUE TRUE FALSE 0.3512343.mo Sanders et al TRUE TRUE FALSE 0.3912343.p1 Sanders et al TRUE TRUE FALSE 0.4012343.s1 Sanders et al TRUE TRUE FALSE 0.3712345.fa Sanders et al TRUE TRUE FALSE 0.2412345.mo Sanders et al TRUE TRUE FALSE 0.2312345.p1 Sanders et al TRUE TRUE FALSE 0.2612345.s1 Sanders et al TRUE TRUE FALSE 0.2412358.fa O?Roak et al TRUE TRUE FALSE 0.4612358.mo O?Roak et al TRUE TRUE FALSE 0.3912358.p1 O?Roak et al TRUE TRUE FALSE 0.4312358.s1 O?Roak et al TRUE TRUE FALSE 0.3812360.fa Iossifov et al TRUE TRUE FALSE 0.6012360.mo Iossifov et al TRUE TRUE FALSE 0.5412360.p1 Iossifov et al TRUE TRUE FALSE 0.57 12360.s1 Iossifov et al TRUE TRUE FALSE 0.6112368.fa Sanders et al TRUE TRUE FALSE 0.3112368.mo Sanders et al TRUE TRUE FALSE 0.2912368.p1 Sanders et al TRUE TRUE FALSE 0.3312368.s1 Sanders et al TRUE TRUE FALSE 0.2812370.fa Sanders et al TRUE TRUE FALSE 0.2712370.mo Sanders et al TRUE TRUE FALSE 0.2912370.p1 Sanders et al TRUE TRUE TRUE 0.3012370.s1 Sanders et al TRUE TRUE TRUE 0.2712373.fa O?Roak et al TRUE TRUE FALSE 0.3512373.mo O?Roak et al TRUE TRUE FALSE 0.3312373.p1 O?Roak et al TRUE TRUE FALSE 0.4412373.s1 O?Roak et al TRUE TRUE FALSE 0.3612375.fa Sanders et al TRUE TRUE FALSE 0.2812375.mo Sanders et al TRUE TRUE FALSE 0.3112375.p1 Sanders et al TRUE TRUE FALSE 0.3012375.s1 Sanders et al TRUE TRUE FALSE 0.2812383.fa Sanders et al TRUE TRUE FALSE 0.3012383.mo Sanders et al TRUE TRUE FALSE 0.3012383.p1 Sanders et al TRUE TRUE TRUE 0.3212383.s1 Sanders et al TRUE TRUE TRUE 0.2912390.fa O?Roak et al TRUE TRUE FALSE 0.3912390.mo O?Roak et al TRUE TRUE FALSE 0.3012390.p1 O?Roak et al TRUE TRUE FALSE 0.3712390.s1 O?Roak et al TRUE TRUE TRUE 0.3912394.fa Iossifov et al TRUE FALSE FALSE 0.5012394.mo Iossifov et al TRUE FALSE 
FALSE 0.5012394.p1 Iossifov et al TRUE FALSE FALSE 0.4512394.s1 Iossifov et al TRUE FALSE FALSE 0.4712396.fa Iossifov et al TRUE TRUE FALSE 0.6012396.mo Iossifov et al TRUE TRUE FALSE 0.5012396.p1 Iossifov et al TRUE TRUE FALSE 0.4512396.s1 Iossifov et al TRUE TRUE FALSE 0.4712403.fa Sanders et al TRUE TRUE FALSE 0.4012403.mo Sanders et al TRUE TRUE FALSE 0.3612403.p1 Sanders et al TRUE TRUE FALSE 0.3712403.s1 Sanders et al TRUE TRUE FALSE 0.3512409.fa Iossifov et al TRUE TRUE FALSE 0.4412409.mo Iossifov et al TRUE TRUE FALSE 0.4512409.p1 Iossifov et al TRUE TRUE FALSE 0.4212409.s1 Iossifov et al TRUE TRUE TRUE 0.4812412.fa Iossifov et al TRUE FALSE FALSE 0.4912412.mo Iossifov et al TRUE FALSE FALSE 0.5012412.p1 Iossifov et al TRUE FALSE FALSE 0.4612412.s1 Iossifov et al TRUE FALSE FALSE 0.4612420.fa Iossifov et al TRUE TRUE FALSE 0.5012420.mo Iossifov et al TRUE TRUE FALSE 0.5612420.p1 Iossifov et al TRUE TRUE FALSE 0.4412420.s1 Iossifov et al TRUE TRUE FALSE 0.4712424.fa Iossifov et al TRUE TRUE FALSE 0.4612424.mo Iossifov et al TRUE TRUE FALSE 0.3812424.p1 Iossifov et al TRUE TRUE TRUE 0.4612424.s1 Iossifov et al TRUE TRUE FALSE 0.4812438.fa Iossifov et al TRUE TRUE FALSE 0.4912438.mo Iossifov et al TRUE TRUE FALSE 0.5012438.p1 Iossifov et al TRUE TRUE FALSE 0.5112438.s1 Iossifov et al TRUE TRUE FALSE 0.5312441.fa Iossifov et al TRUE TRUE FALSE 0.3812441.mo Iossifov et al TRUE TRUE FALSE 0.4112441.p1 Iossifov et al TRUE TRUE TRUE 0.3912441.s1 Iossifov et al TRUE TRUE FALSE 0.4412445.fa Iossifov et al TRUE TRUE FALSE 0.4512445.mo Iossifov et al TRUE TRUE FALSE 0.5312445.p1 Iossifov et al TRUE TRUE FALSE 0.4112445.s1 Iossifov et al TRUE TRUE FALSE 0.4212460.fa Iossifov et al TRUE TRUE FALSE 0.5312460.mo Iossifov et al TRUE TRUE FALSE 0.4212460.p1 Iossifov et al TRUE TRUE FALSE 0.4612460.s1 Iossifov et al TRUE TRUE FALSE 0.5012462.fa Iossifov et al TRUE TRUE FALSE 0.6112462.mo Iossifov et al TRUE TRUE FALSE 0.5512462.p1 Iossifov et al TRUE TRUE FALSE 0.5012462.s1 
Iossifov et al TRUE TRUE FALSE 0.5912463.fa Iossifov et al TRUE TRUE FALSE 0.5112463.mo Iossifov et al TRUE TRUE FALSE 0.4612463.p1 Iossifov et al TRUE TRUE FALSE 0.5912463.s1 Iossifov et al TRUE TRUE FALSE 0.5812467.fa Iossifov et al TRUE FALSE FALSE 0.4812467.mo Iossifov et al TRUE FALSE FALSE 0.5012467.p1 Iossifov et al TRUE FALSE FALSE 0.4712467.s1 Iossifov et al TRUE FALSE FALSE 0.4212473.fa Iossifov et al TRUE FALSE FALSE 0.3312473.mo Iossifov et al TRUE FALSE FALSE 0.4512473.p1 Iossifov et al TRUE FALSE FALSE 0.3012473.s1 Iossifov et al TRUE FALSE FALSE 0.3012480.fa Iossifov et al FALSE TRUE FALSE 0.4612480.mo Iossifov et al FALSE TRUE FALSE 0.5312480.p1 Iossifov et al FALSE TRUE FALSE 0.4312480.s1 Iossifov et al FALSE FALSE FALSE 0.41 12481.fa Iossifov et al TRUE TRUE FALSE 0.4312481.mo Iossifov et al TRUE TRUE FALSE 0.4912481.p1 Iossifov et al TRUE TRUE TRUE 0.4812481.s1 Iossifov et al TRUE TRUE FALSE 0.4512498.fa Iossifov et al TRUE TRUE FALSE 0.3812498.mo Iossifov et al TRUE TRUE FALSE 0.4012498.p1 Iossifov et al TRUE TRUE FALSE 0.4412498.s1 Iossifov et al TRUE TRUE FALSE 0.4012507.fa Sanders et al TRUE TRUE FALSE 0.2812507.mo Sanders et al TRUE TRUE FALSE 0.2712507.p1 Sanders et al TRUE TRUE FALSE 0.2912507.s1 Sanders et al TRUE TRUE FALSE 0.3412510.fa Iossifov et al TRUE TRUE FALSE 0.4212510.mo Iossifov et al TRUE TRUE FALSE 0.4912510.p1 Iossifov et al TRUE TRUE TRUE 0.4912510.s1 Iossifov et al TRUE TRUE TRUE 0.5312512.fa Sanders et al TRUE TRUE FALSE 0.3412512.mo Sanders et al TRUE TRUE FALSE 0.3212512.p1 Sanders et al TRUE TRUE FALSE 0.4012512.s1 Sanders et al TRUE TRUE FALSE 0.3512515.fa Iossifov et al TRUE TRUE FALSE 0.5012515.mo Iossifov et al TRUE TRUE FALSE 0.5412515.p1 Iossifov et al TRUE TRUE FALSE 0.4912515.s1 Iossifov et al TRUE TRUE FALSE 0.5112518.fa Iossifov et al TRUE TRUE FALSE 0.4712518.mo Iossifov et al TRUE TRUE FALSE 0.5512518.p1 Iossifov et al TRUE TRUE FALSE 0.4612518.s1 Iossifov et al TRUE TRUE FALSE 0.4812522.fa Sanders et al 
TRUE FALSE FALSE 0.1912522.mo Sanders et al TRUE FALSE FALSE 0.1912522.p1 Sanders et al TRUE FALSE FALSE 0.2412522.s1 Sanders et al TRUE FALSE FALSE 0.2412523.fa Iossifov et al TRUE TRUE FALSE 0.5412523.mo Iossifov et al TRUE TRUE FALSE 0.5612523.p1 Iossifov et al TRUE TRUE FALSE 0.4412523.s1 Iossifov et al TRUE TRUE FALSE 0.5812524.fa Sanders et al TRUE TRUE FALSE 0.2912524.mo Sanders et al TRUE TRUE FALSE 0.2912524.p1 Sanders et al TRUE TRUE FALSE 0.3012524.s1 Sanders et al TRUE TRUE FALSE 0.2812526.fa Iossifov et al TRUE FALSE FALSE 0.5912526.mo Iossifov et al TRUE FALSE FALSE 0.6012526.p1 Iossifov et al TRUE FALSE FALSE 0.5812526.s1 Iossifov et al TRUE FALSE FALSE 0.5912534.fa Sanders et al TRUE TRUE FALSE 0.3612534.mo Sanders et al TRUE TRUE FALSE 0.3812534.p1 Sanders et al TRUE TRUE FALSE 0.3712534.s1 Sanders et al TRUE TRUE FALSE 0.3412536.fa Sanders et al TRUE TRUE FALSE 0.3612536.mo Sanders et al TRUE TRUE FALSE 0.3712536.p1 Sanders et al TRUE TRUE FALSE 0.3512536.s1 Sanders et al TRUE TRUE FALSE 0.4512552.fa Sanders et al TRUE TRUE FALSE 0.3612552.mo Sanders et al TRUE TRUE FALSE 0.3812552.p1 Sanders et al TRUE TRUE FALSE 0.3512552.s1 Sanders et al TRUE TRUE TRUE 0.4012561.fa Sanders et al TRUE TRUE FALSE 0.2712561.mo Sanders et al TRUE TRUE FALSE 0.2112561.p1 Sanders et al TRUE TRUE FALSE 0.2112561.s1 Sanders et al TRUE TRUE TRUE 0.2212578.fa O?Roak et al TRUE TRUE FALSE 0.3712578.mo O?Roak et al TRUE TRUE FALSE 0.3812578.p1 O?Roak et al TRUE TRUE FALSE 0.3512578.s1 O?Roak et al TRUE TRUE FALSE 0.3612579.fa Iossifov et al TRUE TRUE FALSE 0.5812579.mo Iossifov et al TRUE TRUE FALSE 0.5212579.p1 Iossifov et al TRUE TRUE FALSE 0.5512579.s1 Iossifov et al TRUE TRUE FALSE 0.6012581.fa O?Roak et al TRUE TRUE FALSE 0.4512581.mo O?Roak et al TRUE TRUE FALSE 0.4612581.p1 O?Roak et al TRUE TRUE FALSE 0.4412581.s1 O?Roak et al TRUE TRUE FALSE 0.3812582.fa Iossifov et al TRUE TRUE FALSE 0.6012582.mo Iossifov et al TRUE TRUE FALSE 0.5912582.p1 Iossifov et al TRUE 
TRUE TRUE 0.5512582.s1 Iossifov et al TRUE TRUE FALSE 0.5312588.fa Iossifov et al TRUE TRUE FALSE 0.5412588.mo Iossifov et al TRUE TRUE FALSE 0.5312588.p1 Iossifov et al TRUE TRUE FALSE 0.5712588.s1 Iossifov et al TRUE TRUE TRUE 0.5712616.fa Sanders et al TRUE TRUE FALSE 0.2312616.mo Sanders et al TRUE TRUE FALSE 0.2112616.p1 Sanders et al TRUE TRUE FALSE 0.2212616.s1 Sanders et al TRUE TRUE FALSE 0.2212618.fa Iossifov et al TRUE TRUE FALSE 0.4812618.mo Iossifov et al TRUE TRUE FALSE 0.5412618.p1 Iossifov et al TRUE TRUE TRUE 0.5512618.s1 Iossifov et al TRUE TRUE FALSE 0.5512620.fa Iossifov et al TRUE FALSE FALSE 0.54 12620.mo Iossifov et al TRUE FALSE FALSE 0.5412620.p1 Iossifov et al TRUE FALSE FALSE 0.5112620.s1 Iossifov et al TRUE FALSE FALSE 0.4912626.fa Iossifov et al TRUE TRUE FALSE 0.5012626.mo Iossifov et al TRUE TRUE FALSE 0.5212626.p1 Iossifov et al TRUE TRUE FALSE 0.5212626.s1 Iossifov et al TRUE TRUE FALSE 0.5212628.fa Iossifov et al TRUE TRUE FALSE 0.5212628.mo Iossifov et al TRUE TRUE FALSE 0.4912628.p1 Iossifov et al TRUE TRUE FALSE 0.5112628.s1 Iossifov et al TRUE TRUE FALSE 0.5512630.fa O?Roak et al TRUE TRUE FALSE 0.3912630.mo O?Roak et al TRUE TRUE FALSE 0.3912630.p1 O?Roak et al TRUE TRUE FALSE 0.4012630.s1 O?Roak et al TRUE TRUE FALSE 0.3712631.fa Iossifov et al TRUE TRUE FALSE 0.5012631.mo Iossifov et al TRUE TRUE FALSE 0.5212631.p1 Iossifov et al TRUE TRUE FALSE 0.5312631.s1 Iossifov et al TRUE TRUE TRUE 0.5212633.fa Iossifov et al TRUE FALSE FALSE 0.4712633.mo Iossifov et al TRUE FALSE FALSE 0.4812633.p1 Iossifov et al TRUE FALSE FALSE 0.4512633.s1 Iossifov et al TRUE FALSE FALSE 0.4712637.fa Iossifov et al TRUE TRUE FALSE 0.4012637.mo Iossifov et al TRUE TRUE FALSE 0.4112637.p1 Iossifov et al TRUE TRUE TRUE 0.4012637.s1 Iossifov et al TRUE TRUE FALSE 0.4312638.fa Iossifov et al TRUE TRUE FALSE 0.4512638.mo Iossifov et al TRUE TRUE FALSE 0.4512638.p1 Iossifov et al TRUE TRUE FALSE 0.4312638.s1 Iossifov et al TRUE TRUE FALSE 0.4712642.fa 
Iossifov et al TRUE TRUE FALSE 0.5612642.mo Iossifov et al TRUE TRUE FALSE 0.5412642.p1 Iossifov et al TRUE TRUE FALSE 0.5712642.s1 Iossifov et al TRUE TRUE FALSE 0.5312644.fa Iossifov et al TRUE TRUE FALSE 0.5412644.mo Iossifov et al TRUE TRUE FALSE 0.5112644.p1 Iossifov et al TRUE TRUE FALSE 0.5012644.s1 Iossifov et al TRUE TRUE FALSE 0.4912645.fa Iossifov et al TRUE TRUE FALSE 0.4112645.mo Iossifov et al TRUE TRUE FALSE 0.4312645.p1 Iossifov et al TRUE TRUE FALSE 0.4112645.s1 Iossifov et al TRUE TRUE FALSE 0.4212647.fa Sanders et al TRUE TRUE FALSE 0.3112647.mo Sanders et al TRUE TRUE FALSE 0.2612647.p1 Sanders et al TRUE TRUE FALSE 0.4012647.s1 Sanders et al TRUE TRUE FALSE 0.3912650.fa Sanders et al TRUE TRUE FALSE 0.2412650.mo Sanders et al TRUE TRUE FALSE 0.2112650.p1 Sanders et al TRUE TRUE FALSE 0.2012650.s1 Sanders et al TRUE TRUE TRUE 0.2612651.fa Sanders et al TRUE TRUE FALSE 0.3112651.mo Sanders et al TRUE TRUE FALSE 0.3012651.p1 Sanders et al TRUE TRUE FALSE 0.3712651.s1 Sanders et al TRUE TRUE FALSE 0.2812652.fa Iossifov et al TRUE TRUE FALSE 0.5012652.mo Iossifov et al TRUE TRUE FALSE 0.5512652.p1 Iossifov et al TRUE TRUE FALSE 0.5412652.s1 Iossifov et al TRUE TRUE FALSE 0.5412653.fa Iossifov et al TRUE TRUE FALSE 0.5312653.mo Iossifov et al TRUE TRUE FALSE 0.4912653.p1 Iossifov et al TRUE TRUE FALSE 0.4912653.s1 Iossifov et al TRUE TRUE FALSE 0.5312655.fa Iossifov et al TRUE TRUE FALSE 0.5012655.mo Iossifov et al TRUE TRUE FALSE 0.5812655.p1 Iossifov et al TRUE TRUE TRUE 0.5012655.s1 Iossifov et al TRUE TRUE TRUE 0.4612656.fa Sanders et al TRUE TRUE FALSE 0.3712656.mo Sanders et al TRUE TRUE FALSE 0.3512656.p1 Sanders et al TRUE TRUE FALSE 0.3412656.s1 Sanders et al TRUE TRUE TRUE 0.3312657.fa Sanders et al TRUE TRUE FALSE 0.3112657.mo Sanders et al TRUE TRUE FALSE 0.3512657.p1 Sanders et al TRUE TRUE FALSE 0.3412657.s1 Sanders et al TRUE TRUE FALSE 0.4012664.fa Iossifov et al TRUE TRUE FALSE 0.5012664.mo Iossifov et al TRUE TRUE FALSE 0.5212664.p1 
Iossifov et al TRUE TRUE FALSE 0.4612664.s1 Iossifov et al TRUE TRUE FALSE 0.5612680.fa Sanders et al TRUE FALSE FALSE 0.3312680.mo Sanders et al TRUE FALSE FALSE 0.3212680.p1 Sanders et al TRUE FALSE FALSE 0.3112680.s1 Sanders et al TRUE FALSE FALSE 0.4412683.fa Iossifov et al TRUE FALSE FALSE 0.5212683.mo Iossifov et al TRUE FALSE FALSE 0.5412683.p1 Iossifov et al TRUE FALSE FALSE 0.5812683.s1 Iossifov et al TRUE FALSE FALSE 0.5312685.fa Sanders et al TRUE TRUE FALSE 0.3212685.mo Sanders et al TRUE TRUE FALSE 0.29 12685.p1 Sanders et al TRUE TRUE FALSE 0.2912685.s1 Sanders et al TRUE TRUE FALSE 0.3212688.fa Iossifov et al TRUE FALSE FALSE 0.6012688.mo Iossifov et al TRUE FALSE FALSE 0.5512688.p1 Iossifov et al TRUE FALSE FALSE 0.5712688.s1 Iossifov et al TRUE FALSE FALSE 0.5612690.fa Sanders et al TRUE TRUE FALSE 0.2112690.mo Sanders et al TRUE TRUE FALSE 0.2312690.p1 Sanders et al TRUE TRUE FALSE 0.2112690.s1 Sanders et al TRUE TRUE FALSE 0.2212691.fa Iossifov et al TRUE TRUE FALSE 0.4612691.mo Iossifov et al TRUE TRUE FALSE 0.4812691.p1 Iossifov et al TRUE TRUE FALSE 0.4812691.s1 Iossifov et al TRUE TRUE FALSE 0.4912697.fa Iossifov et al TRUE FALSE FALSE 0.5012697.mo Iossifov et al TRUE FALSE FALSE 0.5012697.p1 Iossifov et al TRUE FALSE FALSE 0.4612697.s1 Iossifov et al TRUE FALSE FALSE 0.5212703.fa O?Roak et al TRUE TRUE FALSE 0.5012703.mo O?Roak et al TRUE TRUE FALSE 0.4012703.p1 O?Roak et al TRUE TRUE FALSE 0.3912703.s1 O?Roak et al TRUE TRUE FALSE 0.3112705.fa Iossifov et al TRUE FALSE FALSE 0.5712705.mo Iossifov et al TRUE FALSE FALSE 0.5512705.p1 Iossifov et al TRUE FALSE FALSE 0.5712705.s1 Iossifov et al TRUE FALSE FALSE 0.5512708.fa Iossifov et al TRUE TRUE FALSE 0.4812708.mo Iossifov et al TRUE TRUE FALSE 0.5012708.p1 Iossifov et al TRUE TRUE FALSE 0.4712708.s1 Iossifov et al TRUE TRUE FALSE 0.5312716.fa Iossifov et al TRUE FALSE FALSE 0.4812716.mo Iossifov et al TRUE FALSE FALSE 0.4312716.p1 Iossifov et al TRUE FALSE FALSE 0.4312716.s1 Iossifov et al 
TRUE FALSE FALSE 0.4312719.fa Iossifov et al TRUE FALSE FALSE 0.4512719.mo Iossifov et al TRUE FALSE FALSE 0.4612719.p1 Iossifov et al TRUE FALSE FALSE 0.4512719.s1 Iossifov et al TRUE FALSE FALSE 0.4612720.fa Iossifov et al TRUE FALSE FALSE 0.4512720.mo Iossifov et al TRUE FALSE FALSE 0.4612720.p1 Iossifov et al TRUE FALSE FALSE 0.4412720.s1 Iossifov et al TRUE FALSE FALSE 0.4912723.fa Iossifov et al TRUE TRUE FALSE 0.4612723.mo Iossifov et al TRUE TRUE FALSE 0.4312723.p1 Iossifov et al TRUE TRUE FALSE 0.4312723.s1 Iossifov et al TRUE TRUE FALSE 0.4412724.fa Iossifov et al TRUE FALSE FALSE 0.4512724.mo Iossifov et al TRUE FALSE FALSE 0.4212724.p1 Iossifov et al TRUE FALSE FALSE 0.4412724.s1 Iossifov et al TRUE FALSE FALSE 0.4112727.fa Iossifov et al TRUE FALSE FALSE 0.4212727.mo Iossifov et al TRUE FALSE FALSE 0.4112727.p1 Iossifov et al TRUE FALSE FALSE 0.4312727.s1 Iossifov et al TRUE FALSE FALSE 0.4312729.fa Sanders et al TRUE TRUE FALSE 0.3412729.mo Sanders et al TRUE TRUE FALSE 0.3712729.p1 Sanders et al TRUE TRUE FALSE 0.3412729.s1 Sanders et al TRUE TRUE FALSE 0.2712733.fa Iossifov et al TRUE FALSE FALSE 0.4512733.mo Iossifov et al TRUE FALSE FALSE 0.5012733.p1 Iossifov et al TRUE FALSE FALSE 0.4512733.s1 Iossifov et al TRUE FALSE FALSE 0.4712735.fa Iossifov et al TRUE TRUE FALSE 0.4112735.mo Iossifov et al TRUE TRUE FALSE 0.4212735.p1 Iossifov et al TRUE TRUE FALSE 0.4112735.s1 Iossifov et al TRUE TRUE FALSE 0.4112736.fa Sanders et al TRUE TRUE FALSE 0.3112736.mo Sanders et al TRUE TRUE FALSE 0.3112736.p1 Sanders et al TRUE TRUE FALSE 0.2412736.s1 Sanders et al TRUE TRUE FALSE 0.4112739.fa Iossifov et al TRUE TRUE FALSE 0.5512739.mo Iossifov et al TRUE TRUE FALSE 0.5812739.p1 Iossifov et al TRUE TRUE FALSE 0.5712739.s1 Iossifov et al TRUE TRUE FALSE 0.5912741.fa O?Roak et al TRUE TRUE FALSE 0.3812741.mo O?Roak et al TRUE TRUE FALSE 0.3812741.p1 O?Roak et al TRUE TRUE FALSE 0.4012741.s1 O?Roak et al TRUE TRUE TRUE 0.4012743.fa Iossifov et al TRUE TRUE FALSE 
0.4712743.mo Iossifov et al TRUE TRUE FALSE 0.4912743.p1 Iossifov et al TRUE TRUE FALSE 0.4712743.s1 Iossifov et al TRUE TRUE FALSE 0.4912748.fa Iossifov et al TRUE TRUE FALSE 0.5112748.mo Iossifov et al TRUE TRUE FALSE 0.5312748.p1 Iossifov et al TRUE TRUE FALSE 0.5512748.s1 Iossifov et al TRUE TRUE FALSE 0.5712758.fa Iossifov et al TRUE TRUE FALSE 0.5512758.mo Iossifov et al TRUE TRUE FALSE 0.5412758.p1 Iossifov et al TRUE TRUE FALSE 0.54 12758.s1 Iossifov et al TRUE TRUE FALSE 0.5512759.fa Iossifov et al TRUE TRUE FALSE 0.5112759.mo Iossifov et al TRUE TRUE FALSE 0.5312759.p1 Iossifov et al TRUE TRUE FALSE 0.5012759.s1 Iossifov et al TRUE TRUE FALSE 0.5412763.fa Sanders et al TRUE FALSE FALSE 0.3012763.mo Sanders et al TRUE FALSE FALSE 0.3212763.p1 Sanders et al TRUE FALSE FALSE 0.3512763.s1 Sanders et al TRUE FALSE FALSE 0.3712764.fa Iossifov et al TRUE TRUE FALSE 0.5212764.mo Iossifov et al TRUE TRUE FALSE 0.5212764.p1 Iossifov et al TRUE TRUE FALSE 0.5112764.s1 Iossifov et al TRUE TRUE FALSE 0.5512770.fa Iossifov et al TRUE FALSE FALSE 0.5212770.mo Iossifov et al TRUE FALSE FALSE 0.5612770.p1 Iossifov et al TRUE FALSE FALSE 0.5412770.s1 Iossifov et al TRUE FALSE FALSE 0.5712780.fa Sanders et al TRUE TRUE FALSE 0.2412780.mo Sanders et al TRUE TRUE FALSE 0.2212780.p1 Sanders et al TRUE TRUE FALSE 0.2312780.s1 Sanders et al TRUE TRUE FALSE 0.2112790.fa Sanders et al TRUE TRUE FALSE 0.3812790.mo Sanders et al TRUE TRUE FALSE 0.3212790.p1 Sanders et al TRUE TRUE FALSE 0.3112790.s1 Sanders et al TRUE TRUE FALSE 0.3112802.fa Sanders et al TRUE FALSE FALSE 0.3712802.mo Sanders et al TRUE FALSE FALSE 0.3512802.p1 Sanders et al TRUE FALSE FALSE 0.3512802.s1 Sanders et al TRUE FALSE FALSE 0.3312810.fa O?Roak et al TRUE TRUE FALSE 0.4912810.mo O?Roak et al TRUE TRUE FALSE 0.3912810.p1 O?Roak et al TRUE TRUE TRUE 0.4212810.s1 O?Roak et al TRUE TRUE FALSE 0.3312826.fa Iossifov et al TRUE FALSE FALSE 0.4012826.mo Iossifov et al TRUE FALSE FALSE 0.4412826.p1 Iossifov et al 
TRUE FALSE FALSE 0.4112826.s1 Iossifov et al TRUE FALSE FALSE 0.4112829.fa Iossifov et al TRUE TRUE FALSE 0.4012829.mo Iossifov et al TRUE TRUE FALSE 0.4212829.p1 Iossifov et al TRUE TRUE FALSE 0.3912829.s1 Iossifov et al TRUE TRUE TRUE 0.4112833.fa Iossifov et al TRUE FALSE FALSE 0.4612833.mo Iossifov et al TRUE FALSE FALSE 0.4512833.p1 Iossifov et al TRUE FALSE FALSE 0.4912833.s1 Iossifov et al TRUE FALSE FALSE 0.4512836.fa Iossifov et al TRUE TRUE FALSE 0.4212836.mo Iossifov et al TRUE TRUE FALSE 0.4312836.p1 Iossifov et al TRUE TRUE TRUE 0.4312836.s1 Iossifov et al TRUE TRUE TRUE 0.4312837.fa Iossifov et al TRUE TRUE FALSE 0.4512837.mo Iossifov et al TRUE TRUE FALSE 0.4512837.p1 Iossifov et al TRUE TRUE FALSE 0.4512837.s1 Iossifov et al TRUE TRUE TRUE 0.4512838.fa Iossifov et al TRUE TRUE FALSE 0.4412838.mo Iossifov et al TRUE TRUE FALSE 0.4412838.p1 Iossifov et al TRUE TRUE FALSE 0.4612838.s1 Iossifov et al TRUE TRUE TRUE 0.4412840.fa Iossifov et al TRUE FALSE FALSE 0.3912840.mo Iossifov et al TRUE FALSE FALSE 0.3812840.p1 Iossifov et al TRUE FALSE FALSE 0.3912840.s1 Iossifov et al TRUE FALSE FALSE 0.4212843.fa Iossifov et al FALSE TRUE FALSE 0.4212843.mo Iossifov et al FALSE TRUE FALSE 0.4012843.p1 Iossifov et al FALSE TRUE FALSE 0.3812843.s1 Iossifov et al FALSE FALSE FALSE 0.4112851.fa Iossifov et al TRUE TRUE FALSE 0.4112851.mo Iossifov et al TRUE TRUE FALSE 0.4012851.p1 Iossifov et al TRUE TRUE FALSE 0.4012851.s1 Iossifov et al TRUE TRUE TRUE 0.4112852.fa Iossifov et al TRUE TRUE FALSE 0.3912852.mo Iossifov et al TRUE TRUE FALSE 0.4212852.p1 Iossifov et al TRUE TRUE FALSE 0.4012852.s1 Iossifov et al TRUE TRUE FALSE 0.4012869.fa Sanders et al TRUE TRUE FALSE 0.3512869.mo Sanders et al TRUE TRUE FALSE 0.4312869.p1 Sanders et al TRUE TRUE FALSE 0.4012869.s1 Sanders et al TRUE FALSE FALSE 0.3212905.fa O?Roak et al TRUE FALSE FALSE 0.3812905.mo O?Roak et al TRUE FALSE FALSE 0.3312905.p1 O?Roak et al TRUE FALSE FALSE 0.3412905.s1 O?Roak et al TRUE FALSE FALSE 
0.3912906.fa Sanders et al TRUE FALSE FALSE 0.3512906.mo Sanders et al TRUE FALSE FALSE 0.2912906.p1 Sanders et al TRUE FALSE FALSE 0.2812906.s1 Sanders et al TRUE FALSE FALSE 0.2912937.fa Iossifov et al TRUE TRUE FALSE 0.4812937.mo Iossifov et al TRUE TRUE FALSE 0.5112937.p1 Iossifov et al TRUE TRUE FALSE 0.5012937.s1 Iossifov et al TRUE TRUE FALSE 0.42 12958.fa Sanders et al TRUE FALSE FALSE 0.3712958.mo Sanders et al TRUE FALSE FALSE 0.3612958.p1 Sanders et al TRUE FALSE FALSE 0.4412958.s1 Sanders et al TRUE FALSE FALSE 0.3612962.fa Iossifov et al TRUE TRUE FALSE 0.4312962.mo Iossifov et al TRUE TRUE FALSE 0.5112962.p1 Iossifov et al TRUE TRUE FALSE 0.5012962.s1 Iossifov et al TRUE TRUE FALSE 0.4712975.fa Iossifov et al TRUE TRUE FALSE 0.4112975.mo Iossifov et al TRUE TRUE FALSE 0.5612975.p1 Iossifov et al TRUE TRUE FALSE 0.5012975.s1 Iossifov et al TRUE TRUE FALSE 0.5212984.fa Sanders et al TRUE TRUE FALSE 0.3212984.mo Sanders et al TRUE TRUE FALSE 0.3412984.p1 Sanders et al TRUE TRUE FALSE 0.3112984.s1 Sanders et al TRUE TRUE FALSE 0.4012997.fa Iossifov et al TRUE TRUE FALSE 0.4312997.mo Iossifov et al TRUE TRUE FALSE 0.5012997.p1 Iossifov et al TRUE TRUE TRUE 0.4812997.s1 Iossifov et al TRUE TRUE TRUE 0.5713000.fa Sanders et al TRUE FALSE FALSE 0.2813000.mo Sanders et al TRUE FALSE FALSE 0.2813000.p1 Sanders et al TRUE FALSE FALSE 0.2913000.s1 Sanders et al TRUE FALSE FALSE 0.3013016.fa Iossifov et al TRUE TRUE FALSE 0.4713016.mo Iossifov et al TRUE TRUE FALSE 0.5413016.p1 Iossifov et al TRUE TRUE FALSE 0.5213016.s1 Iossifov et al TRUE TRUE FALSE 0.6013018.fa Iossifov et al TRUE TRUE FALSE 0.5813018.mo Iossifov et al TRUE TRUE FALSE 0.6113018.p1 Iossifov et al TRUE TRUE TRUE 0.4813018.s1 Iossifov et al TRUE TRUE FALSE 0.4413048.fa O?Roak et al TRUE TRUE FALSE 0.4613048.mo O?Roak et al TRUE TRUE FALSE 0.4013048.p1 O?Roak et al TRUE TRUE FALSE 0.4013048.s1 O?Roak et al TRUE TRUE FALSE 0.3313063.fa Sanders et al TRUE TRUE FALSE 0.2813063.mo Sanders et al TRUE 
TRUE FALSE 0.2813063.p1 Sanders et al TRUE TRUE FALSE 0.2913063.s1 Sanders et al TRUE TRUE FALSE 0.2713073.fa Sanders et al TRUE TRUE FALSE 0.2813073.mo Sanders et al TRUE TRUE FALSE 0.3013073.p1 Sanders et al TRUE TRUE FALSE 0.3213073.s1 Sanders et al TRUE TRUE FALSE 0.3313094.fa Iossifov et al TRUE FALSE FALSE 0.4613094.mo Iossifov et al TRUE FALSE FALSE 0.4813094.p1 Iossifov et al TRUE FALSE FALSE 0.4713094.s1 Iossifov et al TRUE FALSE FALSE 0.4513096.fa Iossifov et al TRUE TRUE FALSE 0.5113096.mo Iossifov et al TRUE TRUE FALSE 0.5213096.p1 Iossifov et al TRUE TRUE FALSE 0.4413096.s1 Iossifov et al TRUE FALSE FALSE 0.4913097.fa Iossifov et al TRUE TRUE FALSE 0.4313097.mo Iossifov et al TRUE TRUE FALSE 0.4113097.p1 Iossifov et al TRUE TRUE TRUE 0.5013097.s1 Iossifov et al TRUE TRUE FALSE 0.5113099.fa Iossifov et al TRUE FALSE FALSE 0.5413099.mo Iossifov et al TRUE FALSE FALSE 0.5813099.p1 Iossifov et al TRUE FALSE FALSE 0.5013099.s1 Iossifov et al TRUE FALSE FALSE 0.4413101.fa Iossifov et al FALSE FALSE FALSE 0.5613101.mo Iossifov et al FALSE FALSE FALSE 0.5613101.p1 Iossifov et al FALSE FALSE FALSE 0.3613101.s1 Iossifov et al FALSE FALSE FALSE 0.6113104.fa Iossifov et al TRUE TRUE FALSE 0.5113104.mo Iossifov et al TRUE TRUE FALSE 0.4813104.p1 Iossifov et al TRUE TRUE FALSE 0.4013104.s1 Iossifov et al TRUE TRUE FALSE 0.5413116.fa O?Roak et al TRUE FALSE FALSE 0.5013116.mo O?Roak et al TRUE FALSE FALSE 0.3913116.p1 O?Roak et al TRUE FALSE FALSE 0.4913116.s1 O?Roak et al TRUE FALSE FALSE 0.3013120.fa Iossifov et al TRUE TRUE FALSE 0.4913120.mo Iossifov et al TRUE TRUE FALSE 0.4713120.p1 Iossifov et al TRUE TRUE FALSE 0.5313120.s1 Iossifov et al TRUE TRUE FALSE 0.5613125.fa Iossifov et al TRUE FALSE FALSE 0.5413125.mo Iossifov et al TRUE FALSE FALSE 0.4413125.p1 Iossifov et al TRUE FALSE FALSE 0.5213125.s1 Iossifov et al TRUE FALSE FALSE 0.4013129.fa Iossifov et al TRUE FALSE FALSE 0.5813129.mo Iossifov et al TRUE FALSE FALSE 0.5213129.p1 Iossifov et al TRUE FALSE 
FALSE 0.3913129.s1 Iossifov et al TRUE FALSE FALSE 0.4813131.fa Iossifov et al TRUE FALSE FALSE 0.4813131.mo Iossifov et al TRUE FALSE FALSE 0.4613131.p1 Iossifov et al TRUE FALSE FALSE 0.4613131.s1 Iossifov et al TRUE FALSE FALSE 0.4713139.fa Iossifov et al TRUE FALSE FALSE 0.45 13139.mo Iossifov et al TRUE FALSE FALSE 0.4413139.p1 Iossifov et al TRUE FALSE FALSE 0.4313139.s1 Iossifov et al TRUE FALSE FALSE 0.4613144.fa Iossifov et al TRUE TRUE FALSE 0.4713144.mo Iossifov et al TRUE TRUE FALSE 0.4613144.p1 Iossifov et al TRUE TRUE FALSE 0.4513144.s1 Iossifov et al TRUE TRUE FALSE 0.4713146.fa Iossifov et al TRUE TRUE FALSE 0.4613146.mo Iossifov et al TRUE TRUE FALSE 0.4713146.p1 Iossifov et al TRUE TRUE FALSE 0.4613146.s1 Iossifov et al TRUE FALSE FALSE 0.4713148.fa Iossifov et al TRUE FALSE FALSE 0.4413148.mo Iossifov et al TRUE FALSE FALSE 0.4513148.p1 Iossifov et al TRUE FALSE FALSE 0.4613148.s1 Iossifov et al TRUE FALSE FALSE 0.4213152.fa Iossifov et al TRUE TRUE FALSE 0.5113152.mo Iossifov et al TRUE TRUE FALSE 0.5013152.p1 Iossifov et al TRUE TRUE FALSE 0.5213152.s1 Iossifov et al TRUE FALSE FALSE 0.5113153.fa Iossifov et al TRUE TRUE FALSE 0.4413153.mo Iossifov et al TRUE TRUE FALSE 0.4313153.p1 Iossifov et al TRUE TRUE FALSE 0.4213153.s1 Iossifov et al TRUE TRUE FALSE 0.4213154.fa Sanders et al TRUE FALSE FALSE 0.2313154.mo Sanders et al TRUE FALSE FALSE 0.2313154.p1 Sanders et al TRUE FALSE FALSE 0.2213154.s1 Sanders et al TRUE FALSE FALSE 0.2313159.fa Iossifov et al TRUE TRUE FALSE 0.4313159.mo Iossifov et al TRUE TRUE FALSE 0.4413159.p1 Iossifov et al TRUE TRUE FALSE 0.4313159.s1 Iossifov et al TRUE TRUE FALSE 0.4313162.fa Iossifov et al TRUE TRUE FALSE 0.4413162.mo Iossifov et al TRUE TRUE FALSE 0.4213162.p1 Iossifov et al TRUE TRUE FALSE 0.4213162.s1 Iossifov et al TRUE TRUE TRUE 0.4313165.fa Iossifov et al TRUE FALSE FALSE 0.4313165.mo Iossifov et al TRUE FALSE FALSE 0.4413165.p1 Iossifov et al TRUE FALSE FALSE 0.4313165.s1 Iossifov et al TRUE FALSE 
FALSE 0.4613166.fa Iossifov et al TRUE TRUE FALSE 0.4313166.mo Iossifov et al TRUE TRUE FALSE 0.4213166.p1 Iossifov et al TRUE TRUE FALSE 0.4213166.s1 Iossifov et al TRUE TRUE FALSE 0.4213168.fa Iossifov et al TRUE TRUE FALSE 0.4813168.mo Iossifov et al TRUE TRUE FALSE 0.4613168.p1 Iossifov et al TRUE TRUE FALSE 0.4813168.s1 Iossifov et al TRUE TRUE FALSE 0.4913169.fa O?Roak et al TRUE TRUE FALSE 0.3713169.mo O?Roak et al TRUE TRUE FALSE 0.4013169.p1 O?Roak et al TRUE TRUE FALSE 0.4113169.s1 O?Roak et al TRUE TRUE FALSE 0.3513171.fa Sanders et al TRUE TRUE FALSE 0.3713171.mo Sanders et al TRUE TRUE FALSE 0.3713171.p1 Sanders et al TRUE TRUE FALSE 0.4513171.s1 Sanders et al TRUE TRUE FALSE 0.3513174.fa Iossifov et al TRUE TRUE FALSE 0.5013174.mo Iossifov et al TRUE TRUE FALSE 0.5013174.p1 Iossifov et al TRUE TRUE FALSE 0.5013174.s1 Iossifov et al TRUE TRUE FALSE 0.5113176.fa Iossifov et al TRUE FALSE FALSE 0.4513176.mo Iossifov et al TRUE FALSE FALSE 0.4313176.p1 Iossifov et al TRUE FALSE FALSE 0.4513176.s1 Iossifov et al TRUE FALSE FALSE 0.4513183.fa Iossifov et al TRUE TRUE FALSE 0.5413183.mo Iossifov et al TRUE TRUE FALSE 0.5313183.p1 Iossifov et al TRUE TRUE FALSE 0.5113183.s1 Iossifov et al TRUE TRUE FALSE 0.5113187.fa Iossifov et al TRUE TRUE FALSE 0.4713187.mo Iossifov et al TRUE TRUE FALSE 0.4713187.p1 Iossifov et al TRUE TRUE FALSE 0.4813187.s1 Iossifov et al TRUE TRUE FALSE 0.5113188.fa O?Roak et al TRUE FALSE FALSE 0.3113188.mo O?Roak et al TRUE FALSE FALSE 0.4113188.p1 O?Roak et al TRUE FALSE FALSE 0.2913188.s1 O?Roak et al TRUE FALSE FALSE 0.3913193.fa Iossifov et al TRUE TRUE FALSE 0.4213193.mo Iossifov et al TRUE TRUE FALSE 0.4913193.p1 Iossifov et al TRUE TRUE FALSE 0.5313193.s1 Iossifov et al TRUE TRUE FALSE 0.4713195.fa Sanders et al TRUE TRUE FALSE 0.4013195.mo Sanders et al TRUE TRUE FALSE 0.3613195.p1 Sanders et al TRUE TRUE FALSE 0.3313195.s1 Sanders et al TRUE TRUE FALSE 0.4213196.fa Iossifov et al TRUE TRUE FALSE 0.4613196.mo Iossifov et al 
TRUE TRUE FALSE 0.47
13196.p1 Iossifov et al TRUE TRUE FALSE 0.43
13196.s1 Iossifov et al TRUE TRUE FALSE 0.52
13197.fa Iossifov et al TRUE FALSE FALSE 0.47
13197.mo Iossifov et al TRUE FALSE FALSE 0.47
13197.p1 Iossifov et al TRUE FALSE FALSE 0.49
13197.s1 Iossifov et al TRUE FALSE FALSE 0.49
13215.fa Iossifov et al TRUE FALSE FALSE 0.57
13215.mo Iossifov et al TRUE FALSE FALSE 0.53
13215.p1 Iossifov et al TRUE FALSE FALSE 0.46
13215.s1 Iossifov et al TRUE FALSE FALSE 0.57
13216.fa Iossifov et al TRUE TRUE FALSE 0.27
13216.mo Iossifov et al TRUE TRUE FALSE 0.34
13216.p1 Iossifov et al TRUE TRUE FALSE 0.27
13216.s1 Iossifov et al TRUE TRUE FALSE 0.27
13218.fa Iossifov et al TRUE TRUE FALSE 0.49
13218.mo Iossifov et al TRUE TRUE FALSE 0.47
13218.p1 Iossifov et al TRUE TRUE FALSE 0.41
13218.s1 Iossifov et al TRUE FALSE FALSE 0.52
13227.fa Iossifov et al TRUE FALSE FALSE 0.38
13227.mo Iossifov et al TRUE FALSE FALSE 0.42
13227.p1 Iossifov et al TRUE FALSE FALSE 0.37
13227.s1 Iossifov et al TRUE FALSE FALSE 0.40
13239.fa Iossifov et al TRUE FALSE FALSE 0.36
13239.mo Iossifov et al TRUE FALSE FALSE 0.40
13239.p1 Iossifov et al TRUE FALSE FALSE 0.34
13239.s1 Iossifov et al TRUE FALSE FALSE 0.39
13258.fa Iossifov et al TRUE FALSE FALSE 0.42
13258.mo Iossifov et al TRUE FALSE FALSE 0.42
13258.p1 Iossifov et al TRUE FALSE FALSE 0.42
13258.s1 Iossifov et al TRUE FALSE FALSE 0.40
13263.fa Iossifov et al TRUE FALSE FALSE 0.47
13263.mo Iossifov et al TRUE FALSE FALSE 0.48
13263.p1 Iossifov et al TRUE FALSE FALSE 0.46
13263.s1 Iossifov et al TRUE FALSE FALSE 0.48
13266.fa Iossifov et al TRUE TRUE FALSE 0.46
13266.mo Iossifov et al TRUE TRUE FALSE 0.52
13266.p1 Iossifov et al TRUE TRUE FALSE 0.48
13266.s1 Iossifov et al TRUE TRUE FALSE 0.47
13269.fa Iossifov et al TRUE FALSE FALSE 0.50
13269.mo Iossifov et al TRUE FALSE FALSE 0.45
13269.p1 Iossifov et al TRUE FALSE FALSE 0.49
13269.s1 Iossifov et al TRUE FALSE FALSE 0.50
13271.fa Sanders et al TRUE FALSE FALSE 0.30
13271.mo Sanders et al TRUE FALSE FALSE 0.31
13271.p1 Sanders et al TRUE FALSE FALSE 0.31
13271.s1 Sanders et al TRUE FALSE FALSE 0.27
13293.fa Iossifov et al TRUE FALSE FALSE 0.47
13293.mo Iossifov et al TRUE FALSE FALSE 0.51
13293.p1 Iossifov et al TRUE FALSE FALSE 0.50
13293.s1 Iossifov et al TRUE FALSE FALSE 0.56
13296.fa Iossifov et al FALSE TRUE FALSE 0.47
13296.mo Iossifov et al FALSE TRUE FALSE 0.47
13296.p1 Iossifov et al FALSE TRUE FALSE 0.46
13296.s1 Iossifov et al FALSE TRUE TRUE 0.47
13307.fa Iossifov et al TRUE FALSE FALSE 0.33
13307.mo Iossifov et al TRUE FALSE FALSE 0.33
13307.p1 Iossifov et al TRUE FALSE FALSE 0.34
13307.s1 Iossifov et al TRUE FALSE FALSE 0.34
13309.fa Iossifov et al TRUE FALSE FALSE 0.44
13309.mo Iossifov et al TRUE FALSE FALSE 0.44
13309.p1 Iossifov et al TRUE FALSE FALSE 0.45
13309.s1 Iossifov et al TRUE FALSE FALSE 0.43
13312.fa Iossifov et al TRUE FALSE FALSE 0.53
13312.mo Iossifov et al TRUE FALSE FALSE 0.50
13312.p1 Iossifov et al TRUE FALSE FALSE 0.45
13312.s1 Iossifov et al TRUE FALSE FALSE 0.50
13315.fa Iossifov et al TRUE FALSE FALSE 0.33
13315.mo Iossifov et al TRUE FALSE FALSE 0.31
13315.p1 Iossifov et al TRUE FALSE FALSE 0.35
13315.s1 Iossifov et al TRUE FALSE FALSE 0.32
13322.fa Sanders et al TRUE TRUE FALSE 0.27
13322.mo Sanders et al TRUE TRUE FALSE 0.30
13322.p1 Sanders et al TRUE TRUE FALSE 0.33
13322.s1 Sanders et al TRUE TRUE FALSE 0.26
13327.fa Iossifov et al TRUE TRUE FALSE 0.30
13327.mo Iossifov et al TRUE TRUE FALSE 0.34
13327.p1 Iossifov et al TRUE TRUE FALSE 0.50
13327.s1 Iossifov et al TRUE TRUE TRUE 0.45
13328.fa Iossifov et al TRUE FALSE FALSE 0.37
13328.mo Iossifov et al TRUE FALSE FALSE 0.38
13328.p1 Iossifov et al TRUE FALSE FALSE 0.40
13328.s1 Iossifov et al TRUE FALSE FALSE 0.49
13330.fa Iossifov et al TRUE FALSE FALSE 0.50
13330.mo Iossifov et al TRUE FALSE FALSE 0.39
13330.p1 Iossifov et al TRUE FALSE FALSE 0.55
13330.s1 Iossifov et al TRUE FALSE FALSE 0.56
13335.fa O'Roak et al TRUE FALSE FALSE 0.48
13335.mo O'Roak et al TRUE FALSE FALSE 0.43
13335.p1 O'Roak et al TRUE FALSE FALSE 0.57
13335.s1 O'Roak et al TRUE FALSE FALSE 0.47
13338.fa Iossifov et al TRUE FALSE FALSE 0.41
13338.mo Iossifov et al TRUE FALSE FALSE 0.45
13338.p1 Iossifov et al TRUE FALSE FALSE 0.54
13338.s1 Iossifov et al TRUE FALSE FALSE 0.51
13346.fa O'Roak et al TRUE FALSE FALSE 0.38
13346.mo O'Roak et al TRUE FALSE FALSE 0.37
13346.p1 O'Roak et al TRUE FALSE FALSE 0.38
13346.s1 O'Roak et al TRUE FALSE FALSE 0.34
13349.fa Iossifov et al TRUE FALSE FALSE 0.50
13349.mo Iossifov et al TRUE FALSE FALSE 0.40
13349.p1 Iossifov et al TRUE FALSE FALSE 0.38
13349.s1 Iossifov et al TRUE FALSE FALSE 0.36
13355.fa Sanders et al TRUE FALSE FALSE 0.28
13355.mo Sanders et al TRUE FALSE FALSE 0.28
13355.p1 Sanders et al TRUE FALSE FALSE 0.29
13355.s1 Sanders et al TRUE FALSE FALSE 0.26
13366.fa Iossifov et al TRUE FALSE FALSE 0.48
13366.mo Iossifov et al TRUE FALSE FALSE 0.43
13366.p1 Iossifov et al TRUE FALSE FALSE 0.37
13366.s1 Iossifov et al TRUE FALSE FALSE 0.52
13374.fa Sanders et al TRUE FALSE FALSE 0.33
13374.mo Sanders et al TRUE FALSE FALSE 0.38
13374.p1 Sanders et al TRUE FALSE FALSE 0.38
13374.s1 Sanders et al TRUE FALSE FALSE 0.34
13385.fa Sanders et al TRUE FALSE FALSE 0.25
13385.mo Sanders et al TRUE FALSE FALSE 0.24
13385.p1 Sanders et al TRUE FALSE FALSE 0.22
13385.s1 Sanders et al TRUE FALSE FALSE 0.24
13387.fa Iossifov et al TRUE FALSE FALSE 0.45
13387.mo Iossifov et al TRUE FALSE FALSE 0.46
13387.p1 Iossifov et al TRUE FALSE FALSE 0.54
13387.s1 Iossifov et al TRUE FALSE FALSE 0.51
13393.fa Sanders et al TRUE FALSE FALSE 0.28
13393.mo Sanders et al TRUE FALSE FALSE 0.31
13393.p1 Sanders et al TRUE FALSE FALSE 0.30
13393.s1 Sanders et al TRUE FALSE FALSE 0.27
13396.fa Iossifov et al TRUE FALSE FALSE 0.46
13396.mo Iossifov et al TRUE FALSE FALSE 0.45
13396.p1 Iossifov et al TRUE FALSE FALSE 0.46
13396.s1 Iossifov et al TRUE FALSE FALSE 0.45
13398.fa Iossifov et al TRUE FALSE FALSE 0.42
13398.mo Iossifov et al TRUE FALSE FALSE 0.44
13398.p1 Iossifov et al TRUE FALSE FALSE 0.44
13398.s1 Iossifov et al TRUE FALSE FALSE 0.43
13412.fa Iossifov et al TRUE FALSE FALSE 0.48
13412.mo Iossifov et al TRUE FALSE FALSE 0.49
13412.p1 Iossifov et al TRUE FALSE FALSE 0.47
13412.s1 Iossifov et al TRUE FALSE FALSE 0.44
13418.fa Iossifov et al TRUE FALSE FALSE 0.45
13418.mo Iossifov et al TRUE FALSE FALSE 0.43
13418.p1 Iossifov et al TRUE FALSE FALSE 0.45
13418.s1 Iossifov et al TRUE FALSE FALSE 0.43
13424.fa Iossifov et al TRUE FALSE FALSE 0.44
13424.mo Iossifov et al TRUE FALSE FALSE 0.42
13424.p1 Iossifov et al TRUE FALSE FALSE 0.41
13424.s1 Iossifov et al TRUE FALSE FALSE 0.44
13439.fa Iossifov et al TRUE FALSE FALSE 0.43
13439.mo Iossifov et al TRUE FALSE FALSE 0.45
13439.p1 Iossifov et al TRUE FALSE FALSE 0.45
13439.s1 Iossifov et al TRUE FALSE FALSE 0.44
13443.fa Iossifov et al TRUE FALSE FALSE 0.44
13443.mo Iossifov et al TRUE FALSE FALSE 0.44
13443.p1 Iossifov et al TRUE FALSE FALSE 0.45
13443.s1 Iossifov et al TRUE FALSE FALSE 0.45
13444.fa Iossifov et al TRUE FALSE FALSE 0.34
13444.mo Iossifov et al TRUE FALSE FALSE 0.36
13444.p1 Iossifov et al TRUE FALSE FALSE 0.35
13444.s1 Iossifov et al TRUE FALSE FALSE 0.36
13447.fa O'Roak et al TRUE FALSE FALSE 0.38
13447.mo O'Roak et al TRUE FALSE FALSE 0.34
13447.p1 O'Roak et al TRUE FALSE FALSE 0.38
13447.s1 O'Roak et al TRUE FALSE FALSE 0.32
13462.fa Iossifov et al TRUE FALSE FALSE 0.34
13462.mo Iossifov et al TRUE FALSE FALSE 0.36
13462.p1 Iossifov et al TRUE FALSE FALSE 0.35
13462.s1 Iossifov et al TRUE FALSE FALSE 0.36
13465.fa Iossifov et al TRUE FALSE FALSE 0.47
13465.mo Iossifov et al TRUE FALSE FALSE 0.44
13465.p1 Iossifov et al TRUE FALSE FALSE 0.44
13465.s1 Iossifov et al TRUE FALSE FALSE 0.42
13486.fa Iossifov et al TRUE FALSE FALSE 0.47
13486.mo Iossifov et al TRUE FALSE FALSE 0.51
13486.p1 Iossifov et al TRUE FALSE FALSE 0.50
13486.s1 Iossifov et al TRUE FALSE FALSE 0.53
13487.fa Iossifov et al TRUE FALSE FALSE 0.38
13487.mo Iossifov et al TRUE FALSE FALSE 0.40
13487.p1 Iossifov et al TRUE FALSE FALSE 0.39
13487.s1 Iossifov et al TRUE FALSE FALSE 0.40
13493.fa Iossifov et al TRUE FALSE FALSE 0.42
13493.mo Iossifov et al TRUE FALSE FALSE 0.39
13493.p1 Iossifov et al TRUE FALSE FALSE 0.38
13493.s1 Iossifov et al TRUE FALSE FALSE 0.38
13496.fa Iossifov et al TRUE FALSE FALSE 0.41
13496.mo Iossifov et al TRUE FALSE FALSE 0.40
13496.p1 Iossifov et al TRUE FALSE FALSE 0.41
13496.s1 Iossifov et al TRUE FALSE FALSE 0.43
13502.fa Iossifov et al TRUE FALSE FALSE 0.34
13502.mo Iossifov et al TRUE FALSE FALSE 0.33
13502.p1 Iossifov et al TRUE FALSE FALSE 0.34
13502.s1 Iossifov et al TRUE FALSE FALSE 0.33
13504.fa Iossifov et al TRUE FALSE FALSE 0.38
13504.mo Iossifov et al TRUE FALSE FALSE 0.37
13504.p1 Iossifov et al TRUE FALSE FALSE 0.35
13504.s1 Iossifov et al TRUE FALSE FALSE 0.37
13505.fa Iossifov et al TRUE FALSE FALSE 0.35
13505.mo Iossifov et al TRUE FALSE FALSE 0.33
13505.p1 Iossifov et al TRUE FALSE FALSE 0.33
13505.s1 Iossifov et al TRUE FALSE FALSE 0.34
13507.fa Iossifov et al TRUE FALSE FALSE 0.35
13507.mo Iossifov et al TRUE FALSE FALSE 0.36
13507.p1 Iossifov et al TRUE FALSE FALSE 0.35
13507.s1 Iossifov et al TRUE FALSE FALSE 0.36
13508.fa Iossifov et al TRUE FALSE FALSE 0.34
13508.mo Iossifov et al TRUE FALSE FALSE 0.39
13508.p1 Iossifov et al TRUE FALSE FALSE 0.38
13508.s1 Iossifov et al TRUE FALSE FALSE 0.34
13509.fa Sanders et al TRUE FALSE FALSE 0.34
13509.mo Sanders et al TRUE FALSE FALSE 0.34
13509.p1 Sanders et al TRUE FALSE FALSE 0.33
13509.s1 Sanders et al TRUE FALSE FALSE 0.35
13512.fa Iossifov et al TRUE FALSE FALSE 0.38
13512.mo Iossifov et al TRUE FALSE FALSE 0.39
13512.p1 Iossifov et al TRUE FALSE FALSE 0.39
13512.s1 Iossifov et al TRUE FALSE FALSE 0.39
13513.fa Iossifov et al TRUE FALSE FALSE 0.45
13513.mo Iossifov et al TRUE FALSE FALSE 0.44
13513.p1 Iossifov et al TRUE FALSE FALSE 0.46
13513.s1 Iossifov et al TRUE FALSE FALSE 0.43
13533.fa O'Roak et al TRUE FALSE FALSE 0.48
13533.mo O'Roak et al TRUE FALSE FALSE 0.42
13533.p1 O'Roak et al TRUE FALSE FALSE 0.42
13533.s1 O'Roak et al TRUE FALSE FALSE 0.34
13543.fa Sanders et al TRUE FALSE FALSE 0.31
13543.mo Sanders et al TRUE FALSE FALSE 0.27
13543.p1 Sanders et al TRUE FALSE FALSE 0.29
13543.s1 Sanders et al TRUE FALSE FALSE 0.30
13589.fa Iossifov et al TRUE FALSE FALSE 0.40
13589.mo Iossifov et al TRUE FALSE FALSE 0.37
13589.p1 Iossifov et al TRUE FALSE FALSE 0.42
13589.s1 Iossifov et al TRUE FALSE FALSE 0.40
13590.fa Iossifov et al TRUE FALSE FALSE 0.44
13590.mo Iossifov et al TRUE FALSE FALSE 0.43
13590.p1 Iossifov et al TRUE FALSE FALSE 0.41
13590.s1 Iossifov et al TRUE FALSE FALSE 0.41
13593.fa O'Roak et al TRUE FALSE FALSE 0.57
13593.mo O'Roak et al TRUE FALSE FALSE 0.49
13593.p1 O'Roak et al TRUE FALSE FALSE 0.44
13593.s1 O'Roak et al TRUE FALSE FALSE 0.31
13599.fa Iossifov et al TRUE FALSE FALSE 0.47
13599.mo Iossifov et al TRUE FALSE FALSE 0.48
13599.p1 Iossifov et al TRUE FALSE FALSE 0.40
13599.s1 Iossifov et al TRUE FALSE FALSE 0.46
13601.fa Iossifov et al TRUE FALSE FALSE 0.42
13601.mo Iossifov et al TRUE FALSE FALSE 0.40
13601.p1 Iossifov et al TRUE FALSE FALSE 0.44
13601.s1 Iossifov et al TRUE FALSE FALSE 0.43
13606.fa O'Roak et al TRUE FALSE FALSE 0.33
13606.mo O'Roak et al TRUE FALSE FALSE 0.37
13606.p1 O'Roak et al TRUE FALSE FALSE 0.36
13606.s1 O'Roak et al TRUE FALSE FALSE 0.32
13608.fa Sanders et al TRUE FALSE FALSE 0.25
13608.mo Sanders et al TRUE FALSE FALSE 0.24
13608.p1 Sanders et al TRUE FALSE FALSE 0.24
13608.s1 Sanders et al TRUE FALSE FALSE 0.26
13618.fa Sanders et al TRUE FALSE FALSE 0.27
13618.mo Sanders et al TRUE FALSE FALSE 0.28
13618.p1 Sanders et al TRUE FALSE FALSE 0.33
13618.s1 Sanders et al TRUE FALSE FALSE 0.28
13621.fa Sanders et al TRUE FALSE FALSE 0.33
13621.mo Sanders et al TRUE FALSE FALSE 0.41
13621.p1 Sanders et al TRUE FALSE FALSE 0.32
13621.s1 Sanders et al TRUE FALSE FALSE 0.30
13625.fa Sanders et al TRUE FALSE FALSE 0.29
13625.mo Sanders et al TRUE FALSE FALSE 0.29
13625.p1 Sanders et al TRUE FALSE FALSE 0.29
13625.s1 Sanders et al TRUE FALSE FALSE 0.29
13629.fa O'Roak et al TRUE FALSE FALSE 0.33
13629.mo O'Roak et al TRUE FALSE FALSE 0.37
13629.p1 O'Roak et al TRUE FALSE FALSE 0.38
13629.s1 O'Roak et al TRUE FALSE FALSE 0.32
13660.fa Sanders et al TRUE FALSE FALSE 0.29
13660.mo Sanders et al TRUE FALSE FALSE 0.29
13660.p1 Sanders et al TRUE FALSE FALSE 0.32
13660.s1 Sanders et al TRUE FALSE FALSE 0.33
13684.fa Iossifov et al TRUE FALSE FALSE 0.44
13684.mo Iossifov et al TRUE FALSE FALSE 0.47
13684.p1 Iossifov et al TRUE FALSE FALSE 0.45
13684.s1 Iossifov et al TRUE FALSE FALSE 0.45
13689.fa Iossifov et al TRUE FALSE FALSE 0.43
13689.mo Iossifov et al TRUE FALSE FALSE 0.44
13689.p1 Iossifov et al TRUE FALSE FALSE 0.42
13689.s1 Iossifov et al TRUE FALSE FALSE 0.44
13695.fa Iossifov et al FALSE FALSE FALSE 0.48
13695.mo Iossifov et al FALSE FALSE FALSE 0.46
13695.p1 Iossifov et al FALSE FALSE FALSE 0.46
13695.s1 Iossifov et al FALSE FALSE FALSE 0.45
13698.fa Iossifov et al TRUE FALSE FALSE 0.40
13698.mo Iossifov et al TRUE FALSE FALSE 0.45
13698.p1 Iossifov et al TRUE FALSE FALSE 0.49
13698.s1 Iossifov et al TRUE FALSE FALSE 0.42
13726.fa O'Roak et al TRUE FALSE FALSE 0.53
13726.mo O'Roak et al TRUE FALSE FALSE 0.50
13726.p1 O'Roak et al TRUE FALSE FALSE 0.41
13726.s1 O'Roak et al TRUE FALSE FALSE 0.32
13730.fa Sanders et al TRUE FALSE FALSE 0.28
13730.mo Sanders et al TRUE FALSE FALSE 0.28
13730.p1 Sanders et al TRUE FALSE FALSE 0.33
13730.s1 Sanders et al TRUE FALSE FALSE 0.29
13739.fa Sanders et al TRUE FALSE FALSE 0.30
13739.mo Sanders et al TRUE FALSE FALSE 0.29
13739.p1 Sanders et al TRUE FALSE FALSE 0.28
13739.s1 Sanders et al TRUE FALSE FALSE 0.29
13752.fa Sanders et al TRUE FALSE FALSE 0.24
13752.mo Sanders et al TRUE FALSE FALSE 0.24
13752.p1 Sanders et al TRUE FALSE FALSE 0.24
13752.s1 Sanders et al TRUE FALSE FALSE 0.24
13774.fa Sanders et al TRUE FALSE FALSE 0.29
13774.mo Sanders et al TRUE FALSE FALSE 0.32
13774.p1 Sanders et al TRUE FALSE FALSE 0.27
13774.s1 Sanders et al TRUE FALSE FALSE 0.30
13793.fa O'Roak et al TRUE FALSE FALSE 0.49
13793.mo O'Roak et al TRUE FALSE FALSE 0.52
13793.p1 O'Roak et al TRUE FALSE FALSE 0.39
13793.s1 O'Roak et al TRUE FALSE FALSE 0.36
13795.fa Sanders et al TRUE FALSE FALSE 0.28
13795.mo Sanders et al TRUE FALSE FALSE 0.29
13795.p1 Sanders et al TRUE FALSE FALSE 0.27
13795.s1 Sanders et al TRUE FALSE FALSE 0.30
13798.fa O'Roak et al TRUE FALSE FALSE 0.41
13798.mo O'Roak et al TRUE FALSE FALSE 0.37
13798.p1 O'Roak et al TRUE FALSE FALSE 0.36
13798.s1 O'Roak et al TRUE FALSE FALSE 0.39
13808.fa Sanders et al TRUE FALSE FALSE 0.21
13808.mo Sanders et al TRUE FALSE FALSE 0.21
13808.p1 Sanders et al TRUE FALSE FALSE 0.20
13808.s1 Sanders et al TRUE FALSE FALSE 0.25
13809.fa Sanders et al TRUE FALSE FALSE 0.27
13809.mo Sanders et al TRUE FALSE FALSE 0.27
13809.p1 Sanders et al TRUE FALSE FALSE 0.27
13809.s1 Sanders et al TRUE FALSE FALSE 0.27
13815.fa O'Roak et al TRUE FALSE FALSE 0.54
13815.mo O'Roak et al TRUE FALSE FALSE 0.40
13815.p1 O'Roak et al TRUE FALSE FALSE 0.37
13815.s1 O'Roak et al TRUE FALSE FALSE 0.48
13821.fa Sanders et al TRUE FALSE FALSE 0.28
13821.mo Sanders et al TRUE FALSE FALSE 0.23
13821.p1 Sanders et al TRUE FALSE FALSE 0.26
13821.s1 Sanders et al TRUE FALSE FALSE 0.26
13825.fa Sanders et al TRUE FALSE FALSE 0.29
13825.mo Sanders et al TRUE FALSE FALSE 0.29
13825.p1 Sanders et al TRUE FALSE FALSE 0.31
13825.s1 Sanders et al TRUE FALSE FALSE 0.31
13832.fa Sanders et al TRUE FALSE FALSE 0.28
13832.mo Sanders et al TRUE FALSE FALSE 0.29
13832.p1 Sanders et al TRUE FALSE FALSE 0.27
13832.s1 Sanders et al TRUE FALSE FALSE 0.28
13835.fa O'Roak et al FALSE FALSE FALSE 0.33
13835.mo O'Roak et al FALSE FALSE FALSE 0.37
13835.p1 O'Roak et al FALSE FALSE FALSE 0.36
13835.s1 O'Roak et al FALSE FALSE FALSE 0.38
13840.fa Sanders et al TRUE FALSE FALSE 0.25
13840.mo Sanders et al TRUE FALSE FALSE 0.27
13840.p1 Sanders et al TRUE FALSE FALSE 0.26
13840.s1 Sanders et al TRUE FALSE FALSE 0.24
13843.fa Sanders et al TRUE FALSE FALSE 0.22
13843.mo Sanders et al TRUE FALSE FALSE 0.24
13843.p1 Sanders et al TRUE FALSE FALSE 0.24
13843.s1 Sanders et al TRUE FALSE FALSE 0.24
13876.fa Sanders et al TRUE FALSE FALSE 0.21
13876.mo Sanders et al TRUE FALSE FALSE 0.24
13876.p1 Sanders et al TRUE FALSE FALSE 0.24
13876.s1 Sanders et al TRUE FALSE FALSE 0.30
13887.fa Sanders et al TRUE FALSE FALSE 0.29
13887.mo Sanders et al TRUE FALSE FALSE 0.27
13887.p1 Sanders et al TRUE FALSE FALSE 0.29
13887.s1 Sanders et al TRUE FALSE FALSE 0.30
13890.fa O'Roak et al TRUE FALSE FALSE 0.34
13890.mo O'Roak et al TRUE FALSE FALSE 0.34
13890.p1 O'Roak et al TRUE FALSE FALSE 0.45
13890.s1 O'Roak et al TRUE FALSE FALSE 0.37
13912.fa Sanders et al TRUE FALSE FALSE 0.24
13912.mo Sanders et al TRUE FALSE FALSE 0.24
13912.p1 Sanders et al TRUE FALSE FALSE 0.27
13912.s1 Sanders et al TRUE FALSE FALSE 0.23
13922.fa Sanders et al TRUE FALSE FALSE 0.24
13922.mo Sanders et al TRUE FALSE FALSE 0.25
13922.p1 Sanders et al TRUE FALSE FALSE 0.26
13922.s1 Sanders et al TRUE FALSE FALSE 0.26
13926.fa O'Roak et al TRUE FALSE FALSE 0.37
13926.mo O'Roak et al TRUE FALSE FALSE 0.38
13926.p1 O'Roak et al TRUE FALSE FALSE 0.37
13926.s1 O'Roak et al TRUE FALSE FALSE 0.38
13992.fa Sanders et al TRUE FALSE FALSE 0.23
13992.mo Sanders et al TRUE FALSE FALSE 0.27
13992.p1 Sanders et al TRUE FALSE FALSE 0.24
13992.s1 Sanders et al TRUE FALSE FALSE 0.26
14009.fa Sanders et al TRUE FALSE FALSE 0.27
14009.mo Sanders et al TRUE FALSE FALSE 0.23
14009.p1 Sanders et al TRUE FALSE FALSE 0.25
14009.s1 Sanders et al TRUE FALSE FALSE 0.24
14011.fa O'Roak et al FALSE FALSE FALSE 0.40
14011.mo O'Roak et al FALSE FALSE FALSE 0.35
14011.p1 O'Roak et al FALSE FALSE FALSE 0.30
14011.s1 O'Roak et al FALSE FALSE FALSE 0.33
14110.fa Sanders et al TRUE FALSE FALSE 0.29
14110.mo Sanders et al TRUE FALSE FALSE 0.30
14110.p1 Sanders et al TRUE FALSE FALSE 0.31
14110.s1 Sanders et al TRUE FALSE FALSE 0.36
14167.fa Sanders et al TRUE FALSE FALSE 0.30
14167.mo Sanders et al TRUE FALSE FALSE 0.31
14167.p1 Sanders et al TRUE FALSE FALSE 0.30
14167.s1 Sanders et al TRUE FALSE FALSE 0.30
14201.fa O'Roak et al FALSE FALSE FALSE 0.31
14201.mo O'Roak et al FALSE FALSE FALSE 0.33
14201.p1 O'Roak et al FALSE FALSE
FALSE 0.33
14201.s1 O'Roak et al FALSE FALSE FALSE 0.37

Study | Quads (before QC) | Reads, Median | Reads, Minimum | CoNIFER Standard Deviation, Average | Accession Codes
O'Roak et al. (2012) | 70 (70) | 137,828,338 | 54,895,318 | 0.39 | dbGAP: phs000482.v1.p1 or NDAR: NDARCOL0001878
Iossifov et al. (2012) | 165 (165) | 111,781,453 | 51,482,969 | 0.47 | NDAR: NDARCOL0001936
Sanders et al. (2012) | 176 (177) | 160,294,934 | 48,431,443 | 0.29 | SRA: SRP010920.1
All probands | 411 (412) | 142,557,586 | 48,431,443 | 0.38 |
All siblings | 411 (412) | 138,779,930 | 51,482,969 | 0.38 |
All mothers | 411 (412) | 134,381,120 | 51,504,433 | 0.38 |
All fathers | 411 (412) | 140,013,304 | 51,366,451 | 0.38 |
Discordant SRS | 276 | 134,837,454 | 51,482,969 | 0.38 |
Concordant SRS | 115 | 146,397,189 | 48,431,443 | 0.37 |
All | 411 (412) | 138,205,341 | 48,431,443 | 0.38 |
Reads: 36mers mapped with mrsFAST to exome targets (per sample).

cnvrID | callID | familyID | rel | Chromosome | Start (hg19) | Stop (hg19) | length (bp) | length (exons) | state | Frequency in 411 families | ESP Frequency | Inheritance | Genes | SRS discordant | Previously seen genes | de novo SNV summary in sample
16 5636 13698p1 12106 66222 34542 127880 13dup 1? 0.1 %mo_to_bot hC1orf86 , PRKCZ, SKI FALSEPR KCZ 16 8285 13698s1 12116 02121 18961 2940 5dup 1? 0.1 %mo_to_bot hPRKCZ FALSEPR KCZ 19 191 1571p1 12529 64525 38508 8863 7del 1? 0.1 %mo_to_p1 MMEL1 TRUE 21 1749 12383p1 13413 21834 17328 4110 8dup 3? 0.5 %fa_to_p1 MEGF6 FALSE 20 1548 11336p1 13424 35834 32091 7733 9del 2? 0.1 %mo_to_p1 MEGF6 TRUE SLC26A5 [mis sense] 34 105 13798p1 19063 35891 71548 108190 27dup 1? 0.1 %fa_to_p1 SLC2A7, SLC 2A5, GPR157 FALSE 39 5495 12618p1 111134 287111 55938 21651 16del 1? 0.1 %mo_to_bot hEXOSC1 0 TRUE 39 5496 12618s1 111134 287111 55938 21651 16del 1? 0.1 %mo_to_bot hEXOSC1 0 TRUE 45 5483 12498s1 112020 702120 30873 10171 8dup 1? 0.1 %mo_to_bot hPLOD1 TRUEPL OD1 GPR82 [misse nse] 45 8286 12498p1 112020 702120 27148 6446 7dup 1? 0.1 %mo_to_bot hPLOD1 TRUEPL OD1 CPA4 [missen se] 87 1822 13730p1 132084 793321 46668 61875 46dup 1?
0.1 %mo_to_p1 HCRTR1, CO L16A1, PEF1 FALSE DICER1 [miss ense] 86 7578 12647p1 132084 793321 10465 25672 12dup 2? 0.1 %mo_to_p1 HCRTR1, PEF 1 TRUE SLC30A5 [mis sense] 100 1561 11433s1 140204 572403 12969 108397 21dup 1? 0.1 %fa_to_both PPIE, TRIT1, BMP8B TRUE 100 1560 11433p1 140205 836403 13332 107496 21dup 1? 0.1 %fa_to_both PPIE, TRIT1, BMP8B TRUE 106 1581 11667s1 142693 553427 44343 50790 3dup 1? 0.1 %fa_to_both FOXJ3 TRUE CDH3 [missen se] 112 1522 11203s1 147512 138475 15846 3708 4dup 7? 2 %mo_to_s1 CYP4X1 TRUE 112 5458 12394p1 147512 138475 15846 3708 4dup 7? 2 %fa_to_p1 CYP4X1 TRUE 112 5463 12396p1 147512 138475 15846 3708 4dup 7? 2 %mo_to_bot hCYP4X1 TRUE 112 5464 12396s1 147512 138475 15846 3708 4dup 7? 2 %mo_to_bot hCYP4X1 TRUE ZNF518A [mis sense] 112 5622 13590p1 147512 138475 15846 3708 4dup 7? 2 %fa_to_p1 CYP4X1 TRUE MTHFS [fram eshift], EFCAB5 [fram eshift] 112 1831 13809p1 147512 138475 15846 3708 4dup 7? 2 %mo_to_p1 CYP4X1 TRUE 115 1508 11146p1 148865 022488 69554 4532 2del 1? 1 %mo_to_bot hSPATA6 TRUE 115 1509 11146s1 148865 022488 69554 4532 2del 1? 1 %mo_to_bot hSPATA6 TRUE DIP2B [misse nse] 123 881 3346s1 157340 621573 41882 1261 2del 1? 0.1 %mo_to_s1 C8A FALSE 125 401 1872p1 165730 593658 31879 101286 4dup 1? 0.1 %fa_to_p1 DNAJC6 TRUE CACNA1D [m issense], KATNAL2 [sp lice] 126 5555 13097p1 165849 875658 55310 5435 6dup 2? 2 %mo_to_bot hDNAJC6 FALSE 126 5557 13097s1 165849 875658 55310 5435 6dup 2? 2 %mo_to_bot hDNAJC6 FALSE 127 1499 11118p1 166837 995670 00051 162056 2dup 1? 0.1 %fa_to_p1 PDE4B, SGIP 1 TRUEPD E4B 133 1792 12906p1 174663 962748 36076 172114 21dup 1? 0.1 %mo_to_p1 FPGT-TNNI3K , FPGT TRUE SLCO1C1 [m issense], A2M [missense], M YO7B [missense] 134 5487 12518s1 184944 964849 62053 17089 8dup 1? 0.1 %mo_to_s1 RPF1 FALSE 138 691 2810p1 187029 343870 38403 9060 6del 5? 2 %mo_to_p1 CLCA4 TRUE 138 5561 13144p1 187029 343870 40438 11095 7del 5? 2 %mo_to_p1 CLCA4 TRUE 138 5596 13387s1 187029 343870 40438 11095 7del 5? 
2 %mo_to_s1 CLCA4 TRUE 141 7590 11000p1 192262 843925 73558 310715 37dup 1? 0.1 %fa_to_p1 TGFBR3, BRD T, EPHX4, BT BD8 FALSE 148 8296 13096s1 1104068 6921040 94402 25710 14dup 2? 0.1 %mo_to_s1 RNPC3 TRUE 157 251 1610s1 1111724 8081118 63088 138280 37dup 1? 0.1 %fa_to_s1 CHIA, CEPT1 , CHI3L2, DEN ND2D FALSE 160 8299 13296s1 1113245 1841132 64970 19786 13dup 1? 0.1 %fa_to_s1 RHOC, PPM1 J, FAM19A3 TRUE 165 5628 13601p1 1115316 8911153 23228 6337 5dup 1? 0.1 %fa_to_both SIKE1 TRUE AK1 [missens e] 165 5629 13601s1 1115316 8911153 22836 5945 4dup 1? 0.1 %fa_to_both SIKE1 TRUE 195 5571 13215s1 1145415 2781459 23443 508165 156del 1? 0.1 %fa_to_s1 RNF115, GPR 89A, RBM8A, PIAS3, CD16 0, HFE2, ANKRD34A, L IX1L, POLR3G L, ANKRD35, ITGA10, PEX11B, NUD T17, TXNIP, G PR89C, PDZK 1, POLR3C TRUE 187 5513 12719p1 1146715 4941467 67190 51696 23del 1? 0.1 %mo_to_p1 CHD1L TRUECH D1L 204 1809 13355p1 1150955 5811509 67175 11594 11del 1? 0.1 %fa_to_both ANXA9 TRUE 204 1810 13355s1 1150955 8131509 67175 11362 10del 1? 0.1 %fa_to_both ANXA9 TRUE 206 5509 12691p1 1151337 0181514 03317 66299 35dup 1? 0.1 %mo_to_p1 SELENBP1, P OGZ, PSMB4 FALSE 222 5627 13599p1 1156890 5951569 33386 42791 45dup 1? 0.1 %mo_to_p1 ARHGEF11, L RRC71 TRUE TMEM62 [mis sense] 225 1555 11411s1 1158323 7781583 26686 2908 6del 1? 0.1 %fa_to_s1 CD1E TRUE DCLRE1B [m issense] 252 551 2161s1 1182550 3591825 55941 5582 4del 1? 0.1 %mo_to_s1 RNASEL FALSE 255 361 1715p1 1185097 8001851 30057 32257 11dup 3? 1 %fa_to_both TRMT1L, SW T1 TRUE ASAH2 [frame shift] 255 371 1715s1 1185097 8001851 30057 32257 11dup 3? 1 %fa_to_both TRMT1L, SW T1 TRUE 255 8312 12313p1 1185097 8001851 30057 32257 11dup 3? 0.5 %mo_to_p1 TRMT1L, SW T1 TRUE 255 5623 13590p1 1185097 8001851 30057 32257 11dup 3? 0.5 %mo_to_bot hTRMT1L , SWT1 TRUE MTHFS [fram eshift], EFCAB5 [fram eshift] 255 5624 13590s1 1185106 7371851 21067 14330 9dup 3? 0.5 %mo_to_bot hTRMT1L TRUE 257 5488 12518s1 1190195 2111902 50880 55669 4del 2? 
0.1 %mo_to_s1 FAM5C FALSE 260 1863 14110p1 1197479 5891974 82083 2494 3del 1? 1 %mo_to_p1 DENND1B FALSE PHF3 [missen se] 277 1643 12175p1 1213003 5832130 09491 5908 2del 6? 2 %fa_to_p1 C1orf227 TRUE 279 116 13890p1 1220088 7902201 00447 11657 3dup 1? 0.1 %fa_to_both RNU5F TRUE DYRK1A [spli ce] 279 117 13890s1 1220088 7902201 00447 11657 3dup 1? 0.1 %fa_to_both RNU5F TRUE 296 7648 12403p1 1230313 9632303 39036 25073 2del 1? 0.1 %fa_to_p1 GALNT2 TRUE 295 1640 12162p1 1230313 9632320 94638 1780675 160del 1? 0.1 %mo_to_p1 C1orf124, PG BD5, COG2, E XOC8, EGLN1 , C1orf131, C1 orf198, TSNA X-DISC1, GAL NT2, TTC13, GNPA T, FAM89A, A RV1, AGT, CA PN9, TRIM67 TRUEAG T, GNPAT 308 1725 12308p1 1245026 9182451 33745 106827 2dup 1? 0.1 %mo_to_p1 HNRNPU TRUE 309 1544 11301p1 1245530 1352457 22314 192179 4del 1? 0.1 %fa_to_both KIF26B TRUE ZNF335 [miss ense], BRD1 [missense] 309 1545 11301s1 1245530 1352457 22314 192179 4del 1? 0.1 %fa_to_both KIF26B TRUE 311 1597 11797s1 1245912 8642468 05339 892475 27dup 2? 0.1 %mo_to_s1 TFB2M, CNS T, SMYD3 TRUE 311 1523 11203s1 1246490 5022468 11354 320852 21dup 2? 0.1 %mo_to_s1 TFB2M, CNS T, SMYD3 TRUE 312 7653 11676s1 1246927 5472469 30602 3055 3dup 1? 0.1 %mo_to_s1 SCCPDH TRUE 316 8327 12463p1 1247599 2712477 69817 170546 14dup 1? 0.1 %fa_to_p1 OR2B11, NLR P3, OR2G3, O R2G2, OR2C3 , OR2W5, C1o rf150 TRUENL RP3U NC80 [nonsen se] 318 8328 12463p1 1248023 9182481 13098 89180 8dup 1? 0.5 %fa_to_p1 OR2W3, OR2 L13, TRIM58, OR2T8 TRUE UNC80 [nons ense] 1605 5679 12837p1 21396 88315 46246 149363 28dup 2? 0.5 %mo_to_p1 TPO TRUE SH3RF3 [miss ense] 1605 5672 12733s1 21426 81615 20754 93938 19dup 2? 0.5 %mo_to_s1 TPO FALSE RANBP9 [mis sense] 1604 5709 13296s1 21437 20914 79843 42634 8dup 1? 0.5 %mo_to_s1 TPO TRUE 1628 2030 12729p1 229117 564291 64440 46876 15dup 1? 0.1 %mo_to_bot hWDR43 TRUE 1628 2032 12729s1 229117 564291 69645 52081 18dup 1? 0.1 %mo_to_bot hWDR43 TRUE 1632 5701 13176s1 232631 566332 46273 614707 87dup 1? 
0.5 %mo_to_s1 BIRC6, TTC2 7, LTBP1 TRUE 1633 5674 12826p1 244071 645440 73450 1805 2dup 1? 0.1 %mo_to_p1 ABCG8 TRUE TROVE2 [fram eshift] 1634 1898 11118p1 244527 109445 41090 13981 5dup 2? 0.1 %fa_to_both SLC3A1 TRUESL C3A1 1634 7946 11118s1 244527 109445 41090 13981 5dup 2? 0.1 %fa_to_both SLC3A1 TRUESL C3A1 1634 146 11472p1 244527 109445 41090 13981 5dup 2? 0.1 %fa_to_both SLC3A1 TRUESL C3A1 KRT80 [misse nse], SP7 [missense] 1634 147 11472s1 244527 109445 41090 13981 5dup 2? 0.1 %fa_to_both SLC3A1 TRUESL C3A1 MACC1 [miss ense], ZYG11A [mis sense] 1635 2031 12729p1 245616 448458 32580 216132 20dup 1? 0.5 %fa_to_both SRBD1 TRUE 1635 2033 12729s1 245616 448458 79587 263139 21dup 1? 0.5 %fa_to_both SRBD1, PRKC E TRUE 1644 7950 11716p1 261558 410629 98527 1440117 76dup 1? 0.1 %fa_to_p1 XPO1, FAM16 1A, EHBP1, T MEM17, B3G NT2, COMMD1, CC T4, USP34 TRUE 1652 1985 12228p1 274129 486741 66149 36663 9dup 4? 0.1 %mo_to_p1 ACTG2, DGU OK TRUE CYP4F3 [miss ense] 1661 175 11895p1 286292 396865 09365 216969 64dup 2? 0.1 %fa_to_p1 REEP1, POLR 1A, MRPL35, PTCD3, IMM T FALSE 1666 5682 12851s1 296780 544977 84254 1003710 255dup 1? 0.1 %mo_to_s1 FER1L5, CIAO 1, ANKRD36, SEMA4C, AN KRD39, SNRNP200, A STL, ADRA2B , NEURL3, LM AN2L, DUSP2, TME M127, ITPRIP L1, ANKRD23 , FAHD2B, CNN M3, CNNM4, ARID5A, LOC 285033, STARD7, NCA PH, KIAA1310 , FAM178B TRUE 1674 5735 13507p1 298192 841982 75940 83099 18dup 2? 0.5 %fa_to_both ANKRD36B, C OX5B, ACTR1 B FALSE 1674 5736 13507s1 298192 841982 74779 81938 15dup 2? 0.5 %fa_to_both ANKRD36B, C OX5B, ACTR1 B FALSE SCOC [frame shift], LAMA1 [missense] 1674 1930 11484p1 298262 549982 75940 13391 12dup 2? 0.5 %mo_to_bot hACTR1B , COX5B TRUE TMEM85 [mis sense], PPP2R1B [mi ssense] 1674 1931 11484s1 298263 529982 75940 12411 11dup 2? 0.5 %mo_to_bot hACTR1B , COX5B TRUE 1675 5695 13166s1 298866 780988 72637 5857 2del 1? 2 %fa_to_s1 VWA3B TRUE USP34 [misse nse] 1677 5740 13508s1 299858 826999 12133 53307 10dup 2? 
0.5 %mo_to_s1 LYG1, LYG2 FALSE 1677 2057 13795p1 299900 855999 09103 8248 4dup 2? 0.5 %mo_to_bot hLYG1 TRUE 1680 7964 11045p1 2102407 1811024 16105 8924 3dup 1? 0.1 %mo_to_p1 MAP4K4 FALSE 1688 5663 12691p1 2111395 5401114 38066 42526 26dup 1? 0.1 %mo_to_p1 BUB1 FALSE 1688 5656 12579p1 2111395 5401130 90065 1694525 167dup 1? 0.1 %fa_to_both ZC3H8, BCL2 L11, ACOXL, ZC3H6, MERT K, ANAPC1, BU B1, FBLN7, T MEM87B TRUE 1687 5657 12579s1 2111395 5401130 90065 1694525 167dup 1? 0.1 %fa_to_both ZC3H8, BCL2 L11, ACOXL, ZC3H6, MERT K, ANAPC1, BU B1, FBLN7, T MEM87B TRUE 1692 2045 13355p1 2113251 7191132 78002 26283 4dup 1? 0.1 %mo_to_p1 TTL TRUE 1693 2016 12552s1 2113346 4421134 04739 58297 2dup 1? 2 %mo_to_s1 SLC20A1, CH CHD5 TRUE 1694 5654 12526s1 2113537 0721135 41347 4275 4del 1? 0.1 %mo_to_s1 IL1A TRUEIL1 A 1699 5716 13418p1 2116535 3621165 99921 64559 12dup 1? 0.1 %fa_to_p1 DPP10 FALSED PP10 CGNL1 [miss ense], DENND5B [m issense], LRRC40 [miss ense] 1703 5749 13601p1 2128414 9761289 45188 530212 107dup 1? 0.1 %fa_to_both LIMS2, POLR 2D, AMMECR 1L, WDR33, S AP130, UGGT1 TRUE AK1 [missens e] 1703 5750 13601s1 2128463 8961289 45188 481292 104dup 1? 0.1 %fa_to_both POLR2D, AM MECR1L, SAP 130, WDR33, UGGT1 TRUE 1716 5675 12826p1 2135206 2191352 15744 9525 3dup 1? 0.1 %mo_to_bot hTMEM16 3, MGAT5 TRUE TROVE2 [fram eshift] 1716 8556 12826s1 2135206 2191352 15744 9525 3dup 1? 0.1 %mo_to_bot hTMEM16 3, MGAT5 TRUE 1718 1938 11519s1 2159663 5661599 22483 258917 3dup 1? 0.1 %mo_to_s1 TANC1, DAPL 1 TRUETA NC1 1719 164 11711p1 2160585 5331606 05414 19881 4dup 1? 0.1 %mo_to_bot hMARCH 7 TRUE KIAA0182 [mi ssense] 1719 166 11711s1 2160585 5331606 05414 19881 4dup 1? 0.1 %mo_to_bot hMARCH 7 TRUE 1722 5707 13269s1 2165381 5131656 00385 218872 17del 1? 0.1 %fa_to_s1 GRB14, COB LL1 FALSE 1743 5683 12851s1 2192711 1681930 59250 348082 11del 1? 0.1 %mo_to_s1 TMEFF2, SDP R TRUE 1754 1946 11622p1 2211299 2102115 42709 243499 48del 1? 
0.1 %mo_to_bot hLANCL1 , CPS1 TRUECP S1 1754 1947 11622s1 2211299 2102115 42709 243499 48del 1? 0.1 %mo_to_bot hLANCL1 , CPS1 TRUECP S1 1762 5644 12420p1 2224758 9502248 31751 72801 12dup 1? 0.1 %fa_to_both WDFY1, MRP L44 FALSE HRH2 [missen se], GOLGA4 [missense] 1762 5645 12420s1 2224758 9502248 31751 72801 12dup 1? 0.1 %fa_to_both WDFY1, MRP L44 FALSE VPS18 [misse nse] 1764 5689 12997p1 2230632 2692307 24290 92021 39dup 1? 0.1 %fa_to_p1 TRIP12 TRUE 1767 160 11659s1 2232070 9512320 72965 2014 2del 1? 0.1 %mo_to_s1 ARMC9 FALSE 1770 1967 12175s1 2234181 6112342 29469 47858 16dup 1? 0.1 %fa_to_s1 ATG16L1, SA G TRUE 1778 5746 13589p1 2241439 3742415 14045 74671 19dup 1? 0.1 %fa_to_both DUSP28, ANK MY1, RNPEP L1 TRUE WDR55 [miss ense] 1778 5747 13589s1 2241439 8682415 14045 74177 18dup 1? 0.1 %fa_to_both DUSP28, ANK MY1, RNPEP L1 TRUE 1780 5640 12394p1 2241538 0672417 09123 171056 42dup 3? 0.5 %mo_to_p1 KIF1A, GPR3 5, AQP12B, A QP12A, CAPN 10 TRUEKI F1A 1782 1894 11107p1 2242371 0932423 75953 4860 4del 1? 0.1 %mo_to_bot hFARP2 TRUE TROAP [miss ense] 1782 1895 11107s1 2242371 0932423 75953 4860 4del 1? 0.1 %mo_to_bot hFARP2 TRUE 2232 5765 12588p1 37594 80877 82093 187285 6del 1? 0.1 %mo_to_bot hGRM7 TRUE TEKT1 [misse nse] 2232 5766 12588s1 37594 80877 82093 187285 6del 1? 0.1 %mo_to_bot hGRM7 TRUE NBPF9 [nons ense] 2237 2179 11824p1 38578 87086 72615 93745 11dup 1? 0.1 %fa_to_both LMCD1, C3or f32 FALSE EBAG9 [misse nse] 2237 2180 11824s1 38578 87086 75624 96754 13dup 1? 0.1 %fa_to_both LMCD1, C3or f32 FALSE 2238 5784 12851p1 39867 48398 74929 7446 4dup 1? 0.1 %mo_to_bot hARPC4- TTLL3 TRUE SLC25A29 [m issense], HMGXB3 [mis sense], UBE3C [miss ense] 2238 8377 12851s1 39867 48398 71079 3596 3dup 1? 0.1 %mo_to_bot hARPC4- TTLL3 TRUE 2242 2182 11905p1 312475 396127 91331 315935 47dup 2? 0.1 %mo_to_p1 PPARG, MKR N2, RAF1, TS EN2, TMEM4 0 TRUE 2242 5761 12481p1 312632 296127 91331 159035 22dup 2? 
0.1 %mo_to_bot hRAF1, T MEM40 FALSE 2242 5762 12481s1 312632 296127 91331 159035 22dup 2? 0.1 %mo_to_bot hRAF1, T MEM40 FALSE 2236 2239 12534p1 312940 888129 78197 37309 13del 2? 0.1 %mo_to_p1 IQSEC1 TRUE MTMR9 [miss ense], MMP8 [missense] 2236 5778 12727p1 312940 888129 83365 42477 14dup 2? 0.1 %mo_to_bot hIQSEC1 TRUE ZMAT5 [misse nse] 2236 5779 12727s1 312942 481129 83365 40884 13dup 2? 0.1 %mo_to_bot hIQSEC1 TRUE TCF12 [frame shift] 2262 2258 13322s1 331917 924319 21322 3398 2del 1? 0.1 %mo_to_s1 OSBPL10 TRUE HPS6 [missen se] 2265 2163 11501s1 335833 873358 35450 1577 2dup 3? 0.5 %fa_to_s1 ARPP21 FALSE ANPEP [miss ense] 2265 2232 12383s1 335833 873358 35450 1577 2dup 3? 0.5 %fa_to_s1 ARPP21 FALSE 2265 2261 13393p1 335833 873358 35450 1577 2dup 3? 0.5 %mo_to_bot hARPP21 FALSE 2265 2262 13393s1 335833 873358 35450 1577 2dup 3? 0.5 %mo_to_bot hARPP21 FALSE 2266 5819 13512p1 337095 341370 96656 1315 3del 1? 0.1 %fa_to_p1 LRRFIP2 TRUE 2301 8380 12683p1 357143 564571 44339 775 2del 1? 0.1 %mo_to_p1 IL17RD FALSE KDM6B [fram eshift] 2307 5788 13094p1 378648 062787 96050 147988 27dup 1? 0.1 %fa_to_both ROBO1 FALSER OBO1 WDFY3 [nons ense] 2307 5789 13094s1 378649 262787 96050 146788 26dup 1? 0.1 %fa_to_both ROBO1 FALSER OBO1 CNOT4 [miss ense] 2308 5755 12252p1 381627 075816 40315 13240 4del 1? 0.1 %mo_to_bot hGBE1 FALSEG BE1 2308 5756 12252s1 381627 075816 40315 13240 4del 1? 0.1 %mo_to_bot hGBE1 FALSEG BE1 2309 5791 13099p1 397486 951976 34880 147929 19del 1? 0.1 %fa_to_p1 None, ARL6 TRUEAR L6 2317 5796 13162p1 3113588 3531136 19993 31640 5del 2? 0.1 %mo_to_bot hGRAMD 1C TRUE RIMS1 [frame shift] 2317 5797 13162s1 3113588 3531136 19993 31640 5del 2? 0.1 %mo_to_bot hGRAMD 1C TRUE 2326 260 13890p1 3129120 4351291 27701 7266 4del 1? 0.5 %mo_to_p1 C3orf25 TRUE DYRK1A [spli ce] 2329 2172 11676p1 3132277 8141322 80061 2247 3del 7? 1 %mo_to_p1 NPHP3-ACAD 11 TRUE 2329 237 12304p1 3132277 8141322 80061 2247 3del 7? 
1 %mo_to_p1 NPHP3-ACAD 11 FALSE STIL [missens e], PSEN1 [missense], P HF19 [missense] 2329 2266 13660s1 3132277 8141322 80061 2247 3del 7? 1 %fa_to_s1 NPHP3-ACAD 11 TRUE ZNF423 [miss ense] 2329 255 13793s1 3132277 8141322 80061 2247 3del 7? 1 %mo_to_s1 NPHP3-ACAD 11 TRUE 2331 2098 11067p1 3137781 6571378 16689 35032 14dup 2? 0.1 %mo_to_bot hDZIP1L TRUE 2331 2099 11067s1 3137781 6571378 16689 35032 14dup 2? 0.1 %mo_to_bot hDZIP1L TRUE 2331 230 12106s1 3137781 6571378 03095 21438 9dup 2? 0.1 %mo_to_s1 DZIP1L TRUE 2335 2096 11057s1 3141884 4631420 84208 199745 31dup 6? 0.5 %fa_to_s1 GK5, XRN1 FALSE 2335 2124 11108p1 3141884 4631420 75961 191498 28dup 6? 0.5 %mo_to_bot hGK5, XR N1 FALSE KANK1 [misse nse] 2335 2126 11108s1 3141884 4631420 84021 199558 30dup 6? 0.5 %mo_to_bot hGK5, XR N1 FALSE MAN2A1 [mis sense] 2335 2152 11304s1 3141884 4631420 84021 199558 30dup 6? 0.5 %fa_to_s1 GK5, XRN1 TRUE KLC2 [missen se] 2335 247 13335p1 3141884 4631420 84208 199745 31dup 6? 0.5 %fa_to_both GK5, XRN1 FALSE ZNF420 [miss ense] 2335 248 13335s1 3141884 4631420 84208 199745 31dup 6? 0.5 %fa_to_both GK5, XRN1 FALSE 2335 5768 12631p1 3141889 1661420 75961 186795 27dup 6? 0.5 %mo_to_bot hGK5, XR N1 FALSE 2335 5769 12631s1 3141889 1661420 84208 195042 30dup 6? 0.5 %mo_to_bot hGK5, XR N1 FALSE GAPVD1 [mis sense], YIF1A [missense] 2334 5800 13176s1 3141896 3231419 01891 5568 3del 2? 0.1 %fa_to_both GK5 TRUE 2334 8388 13176p1 3141896 3231419 01891 5568 3del 2? 0.1 %fa_to_both GK5 TRUE ZFYVE26 [fram eshift] 2341 2218 12297s1 3150280 3281502 86079 5751 6del 1? 0.1 %fa_to_s1 EIF2A TRUEEIF 2AG NA14 [missen se] 2342 8389 13487p1 3155481 3051554 93637 12332 3dup 2? 0.5 %fa_to_p1 C3orf33 FALSE CYP4Z1 [miss ense] 2342 2279 13922s1 3155481 3051554 93637 12332 3dup 2? 1 %fa_to_s1 C3orf33 TRUE 2356 224 11788p1 3183822 5741838 23757 1183 3del 2? 0.1 %mo_to_p1 HTR3E TRUE YTHDC2 [mis sense] 2349 2103 11075p1 3184104 8381842 89170 184332 7del 1? 
0.1 %mo_to_bot hEPHB3, CHRD FALSE 2349 2104 11075s1 3184104 8381841 07210 2372 6del 1? 0.1 %mo_to_bot hCHRD FALSE MED12L [mis sense], C4orf40 [miss ense] 2357 244 13169s1 3194947 4371954 53443 506006 38dup 1? 0.1 %mo_to_s1 C3orf21, APO D, MUC20, A CAP2, PPP1R 2 FALSE 2362 245 13169s1 3197238 7651972 60432 21667 5dup 1? 0.1 %mo_to_s1 BDH1 FALSE 2347 2154 11336p1 3197553 7481975 56544 2796 2del 1? 0.1 %mo_to_bot hLRCH3 TRUE SLC26A5 [mis sense] 2347 2155 11336s1 3197553 7481975 56544 2796 2del 1? 0.1 %mo_to_bot hLRCH3 TRUE RGS7 [missen se] 2364 2125 11108p1 3197574 2791976 40913 66634 14dup 1? 0.1 %mo_to_p1 IQCG, LRCH3 FALSE KANK1 [misse nse] 2386 5834 12645p1 4818 2798 45762 27483 5dup 1? 0.1 %mo_to_p1 CPLX1, GAK TRUECP LX1A RHGAP21 [m issense], ANK2 [nonsen se], LRRC31 [missense] 2392 2369 12340p1 41746 67417 95770 49096 2dup 1? 0.1 %fa_to_p1 TACC3, FGFR 3 TRUEFG FR3P PM1D [nonse nse], BCORL1 [mis sense] 2395 7474 11773p1 42641 46128 35561 194100 34dup 1? 0.1 %mo_to_p1 TNIP2, FAM1 93A, SH3BP2 TRUETN IP2, SH3BP2 KRBA1 [misse nse], CACNA1E [m issense] 2398 2324 11219p1 42906 49032 01666 295176 93dup 1? 0.1 %mo_to_bot hADD1, G RK4, C4orf10 , MFSD10, HT T, NOP14 FALSEH TT 2398 2325 11219s1 42906 49032 01666 295176 93dup 1? 0.1 %mo_to_bot hADD1, G RK4, C4orf10 , MFSD10, HT T, NOP14 FALSEH TT LAMA4 [miss ense] 2399 2304 11060p1 43445 76534 50003 4238 9del 4? 0.5 %fa_to_p1 HGFAC FALSE 2399 8152 11700p1 43446 99134 49762 2771 4del 4? 0.5 %mo_to_p1 HGFAC FALSE 2403 278 12161p1 46594 89966 13005 18106 10del 1? 0.1 %mo_to_bot hMAN2B2 FALSE UBR3 [frames hift], CARKD [nonsense] 2403 279 12161s1 46594 89966 13005 18106 10del 1? 0.1 %mo_to_bot hMAN2B2 FALSE 2419 2310 11075p1 440762 450407 78276 15826 5dup 1? 0.5 %mo_to_bot hNSUN7 FALSE 2419 2311 11075s1 440762 450407 78276 15826 5dup 1? 0.5 %mo_to_bot hNSUN7 FALSE MED12L [mis sense], C4orf40 [miss ense] 2420 2305 11066p1 441258 993412 59143 150 2dup 3? 
0.1% fa_to_p1 UCHL1 TRUE UCHL1 TCF7L1 [missense]
2423 8403 12962s1 4 47625587 47625965 378 2 del 1 ≤0.1% mo_to_s1 CORIN TRUE
2425 2379 12524p1 4 48165718 48178203 12485 6 dup 1 ≤0.1% fa_to_p1 TEC TRUE
2429 2302 11014p1 4 57361522 57522178 160656 9 dup 1 ≤0.1% mo_to_p1 ARL9, SRP72, HOPX FALSE
2445 5824 12420p1 4 81207478 82092958 885480 26 dup 1 ≤0.1% mo_to_both FGF5, C4orf22, BMP3, PRKG2 FALSE HRH2 [missense], GOLGA4 [missense]
2445 5825 12420s1 4 81207478 82092958 885480 26 dup 1 ≤0.1% mo_to_both FGF5, C4orf22, BMP3, PRKG2 FALSE VPS18 [missense]
2448 5821 12409p1 4 89978088 90035703 57615 2 del 1 ≤0.1% mo_to_both TIGD2, FAM13A TRUE LRP2 [nonsense]
2448 5822 12409s1 4 89978088 90035703 57615 2 del 1 ≤0.1% mo_to_both TIGD2, FAM13A TRUE
2450 292 13533p1 4 90855960 90874569 18609 3 del 2 ≤0.1% fa_to_p1 MMRN1 FALSE
2453 2341 11622p1 4 100738070 100782782 44712 5 dup 1 ≤0.1% fa_to_both DAPP1 TRUE
2453 2342 11622s1 4 100738070 100774505 36435 4 dup 1 ≤0.1% fa_to_both DAPP1 TRUE
2457 263 11190p1 4 108535404 108871579 336175 22 dup 1 ≤0.1% fa_to_both PAPSS1, SGMS2, CYP2U1 TRUE
2457 264 11190s1 4 108535404 108871579 336175 22 dup 1 ≤0.1% fa_to_both PAPSS1, SGMS2, CYP2U1 TRUE
2465 5839 12770p1 4 134071295 135122348 1051053 6 dup 1 ≤0.1% fa_to_both PCDH10 TRUE PCDH10
2465 5841 12770s1 4 134071295 135122348 1051053 6 dup 1 ≤0.1% fa_to_both PCDH10 TRUE PCDH10
2473 275 11959s1 4 151727422 152070713 343291 49 dup 1 ≤0.1% fa_to_s1 SH3D19, RPS3A, LRBA FALSE
2459 2373 12370p1 4 159590764 159616795 26031 9 dup 1 ≤0.1% mo_to_p1 C4orf46, ETFDH TRUE
2458 2390 13385p1 4 169083678 169086477 2799 3 del 1 ≤0.5% fa_to_p1 ANXA10 TRUE
2462 2348 12100p1 4 187192762 187476519 283757 15 dup 1 ≤0.1% fa_to_p1 F11, MTNR1A TRUE
2463 8168 11429s1 4 189018214 189068526 50312 12 dup 1 ≤0.1% fa_to_s1 TRIML2, TRIML1 FALSE IL6R [missense]
2483 341 12373p1 5 619104 801279 182175 15 dup 2 ≤0.1% mo_to_both TPPP, ZDHHC11, CEP72 TRUE
2483 342 12373s1 5 619104 678175 59071 14 dup 2 ≤
0.1% mo_to_both TPPP, CEP72 TRUE
2483 8408 13293p1 5 619104 644540 25436 9 dup 2 ≤0.1% fa_to_p1 CEP72 TRUE DNAH10 [missense], KIAA0240 [missense], MUC4 [missense]
2490 5918 13396p1 5 9136615 9238002 101387 8 del 1 ≤0.1% mo_to_both SEMA5A TRUE SEMA5A
2490 5919 13396s1 5 9136615 9337924 201309 10 del 1 ≤0.1% mo_to_both SEMA5A TRUE SEMA5A CDC123 [missense]
2503 313 11659p1 5 40931165 40937792 6627 4 del 1 ≤0.1% fa_to_both C7 FALSE
2503 314 11659s1 5 40931165 40937792 6627 4 del 1 ≤0.1% fa_to_both C7 FALSE
2537 2442 11456s1 5 81283389 81354421 71032 2 dup 2 ≤0.1% mo_to_s1 ATG10 TRUE EPG5 [missense]
2537 5890 12851s1 5 81283389 81354421 71032 2 dup 2 ≤0.1% fa_to_s1 ATG10 TRUE
2545 5876 12588s1 5 110427986 110446977 18991 15 dup 6 ≤0.5% mo_to_s1 WDR36 TRUE NBPF9 [nonsense]
2545 2490 12370p1 5 110430617 110446977 16360 14 dup 6 ≤0.5% mo_to_both WDR36 TRUE
2545 2492 12370s1 5 110430617 110446977 16360 14 dup 6 ≤0.5% mo_to_both WDR36 TRUE
2545 2508 12690p1 5 110430617 110446977 16360 14 dup 6 ≤0.5% mo_to_p1 WDR36 TRUE HIST1H2AE [missense]
2545 5939 13684p1 5 110430617 110446977 16360 14 dup 6 ≤0.5% mo_to_p1 WDR36 TRUE
2545 2533 13876s1 5 110430617 110446977 16360 14 dup 6 ≤0.5% fa_to_s1 WDR36 FALSE
2545 2520 13322p1 5 110432776 110446977 14201 13 dup 6 ≤0.5% fa_to_p1 WDR36 TRUE
2546 303 11469p1 5 112915282 112929080 13798 6 dup 1 ≤0.5% fa_to_both YTHDC2 TRUE
2546 304 11469s1 5 112915282 112929080 13798 6 dup 1 ≤0.5% fa_to_both YTHDC2 TRUE
2549 5911 13312s1 5 121297727 121330365 32638 4 del 2 ≤0.1% mo_to_s1 SRFBP1 FALSE TRIM37 [missense]
2549 336 12161p1 5 121309890 121358102 48212 6 dup 2 ≤0.1% mo_to_both SRFBP1 FALSE UBR3 [frameshift], CARKD [nonsense]
2549 337 12161s1 5 121309890 121362821 52931 7 dup 2 ≤0.1% mo_to_both SRFBP1 FALSE
2550 2527 13825p1 5 122161744 122163341 1597 3 del 1 ≤0.5% fa_to_p1 SNX2 FALSE
2551 5916 13387s1 5 123973590 124036962 63372 8 dup 1 ≤0.1% mo_to_s1 ZNF608 TRUE
2566 346 12390s1 5 141312823 141314151 1328 3 del 1 ≤
0.1% mo_to_s1 KIAA0141 TRUE
2568 5937 13601s1 5 147498548 147499699 1151 2 dup 1 ≤0.1% mo_to_s1 SPINK5 TRUE
2570 8183 13625p1 5 150175002 150277746 102744 4 del 3 ≤0.1% mo_to_p1 C5orf62, ZNF300 TRUE
2572 2454 11828p1 5 156456743 156479665 22922 6 dup 1 ≤0.1% fa_to_both HAVCR1 TRUE
2572 2455 11828s1 5 156456743 156479665 22922 6 dup 1 ≤0.1% fa_to_both HAVCR1 TRUE
2574 2540 13922p1 5 175992353 175995818 3465 3 del 1 ≤0.1% mo_to_both CDHR2 TRUE
2574 2541 13922s1 5 175992353 175995818 3465 3 del 1 ≤0.1% mo_to_both CDHR2 TRUE
2577 2514 12906s1 5 176314002 176318192 4190 10 dup 3 ≤0.5% mo_to_s1 HK3 TRUE
2577 5929 13504s1 5 176314451 176317942 3491 7 dup 3 ≤0.5% mo_to_both HK3 TRUE
2577 8436 13504p1 5 176314451 176317942 3491 7 dup 3 ≤0.5% mo_to_both HK3 TRUE
2611 2571 11115p1 6 2949147 3017219 68072 10 del 2 ≤0.1% fa_to_p1 NQO2, SERPINB6 TRUE
2617 2688 13876p1 6 20781375 20846409 65034 2 dup 1 ≤0.1% mo_to_p1 CDKAL1 FALSE TUBGCP5 [missense]
2618 5978 12758p1 6 24454242 24523153 68911 20 dup 1 ≤0.1% mo_to_p1 ALDH5A1, GPLD1 TRUE ALDH5A1
2652 2584 11220s1 6 33384873 33385087 214 2 dup 2 ≤0.1% mo_to_s1 CUTA FALSE
2655 5986 13153p1 6 35959435 35967885 8450 4 dup 1 ≤0.1% fa_to_p1 SLC26A8 TRUE MAPK13 [missense]
2657 5974 12743p1 6 41243862 41318602 74740 9 dup 1 ≤0.1% mo_to_p1 TREM1, NCR2 FALSE
2658 7509 13335s1 6 41709533 41710227 694 2 del 1 ≤0.1% mo_to_s1 PGC FALSE
2660 5960 12637s1 6 42883807 42892011 8204 3 del 1 ≤0.1% fa_to_s1 PTCRA FALSE CDC7 [missense]
2664 5955 12467s1 6 43495896 43501744 5848 6 del 1 ≤0.1% mo_to_both XPO5 FALSE
2664 5953 12467p1 6 43496565 43501744 5179 5 del 1 ≤0.1% mo_to_both XPO5 FALSE
2666 2685 13825s1 6 43603612 43604383 771 2 dup 1 ≤0.1% fa_to_s1 MAD2L1BP FALSE
2672 2585 11220s1 6 47471015 47567179 96164 12 del 1 ≤0.1% fa_to_s1 CD2AP FALSE
2673 7512 11013s1 6 49426794 49440571 13777 5 dup 2 ≤2% mo_to_s1 MUT, CENPQ TRUE
2674 378 11229p1 6 51747890 51752043 4153 3 del 1 ≤0.1% fa_to_p1 PKHD1 FALSE
2679 393 12106p1 6 56915571 56919661 4090 2 dup 2 ≤
0.5% fa_to_both KIAA1586 TRUE KIAA1586
2679 394 12106s1 6 56915571 56919661 4090 2 dup 2 ≤0.5% fa_to_both KIAA1586 TRUE KIAA1586
2682 2661 12869p1 6 65523270 66205303 682033 19 dup 1 ≤0.1% fa_to_p1 EYS TRUE PRCP [missense]
2683 2624 11828s1 6 65596589 65622636 26047 4 del 1 ≤0.1% mo_to_s1 TRUE
2688 2612 11551p1 6 88315634 88318947 3313 3 del 1 ≤0.5% mo_to_p1 ORC3 TRUE
2689 380 11459p1 6 88317390 88366700 49310 10 del 1 ≤0.1% fa_to_p1 ORC3 TRUE DEPDC7 [missense]
2692 5987 13153s1 6 99998868 99999771 903 2 dup 3 ≤2% fa_to_s1 CCNC TRUE
2700 5941 12334p1 6 118786567 118953774 167207 13 dup 1 ≤0.1% mo_to_both C6orf204 FALSE
2700 5942 12334s1 6 118786567 119215686 429119 14 dup 1 ≤0.1% mo_to_both C6orf204 FALSE CSNK1G3 [frameshift], PDS5A [nonsense]
2703 6010 13487p1 6 123573556 123586483 12927 5 del 3 ≤0.1% mo_to_both TRDN FALSE CYP4Z1 [missense]
2712 6021 13513p1 6 139206642 139206967 325 2 del 1 ≤0.1% mo_to_both ECT2L FALSE ZMYND11 [splice]
2712 6022 13513s1 6 139206642 139206967 325 2 del 1 ≤0.1% mo_to_both ECT2L FALSE FGD5 [missense]
2713 5989 13183p1 6 142396784 142400040 3256 2 del 1 ≤0.1% mo_to_both NMBR TRUE BCL11A [frameshift], CNOT6 [missense]
2713 5990 13183s1 6 142396784 142400040 3256 2 del 1 ≤0.1% mo_to_both NMBR TRUE
2715 2614 11581p1 6 146870599 146875741 5142 2 del 3 ≤0.5% fa_to_p1 RAB32 TRUE
2715 5962 12655s1 6 146870599 146875741 5142 2 del 3 ≤0.5% mo_to_s1 RAB32 TRUE
2715 2677 13739p1 6 146870599 146875741 5142 2 del 3 ≤0.5% mo_to_both RAB32 TRUE CD151 [missense]
2715 2678 13739s1 6 146870599 146875741 5142 2 del 3 ≤0.5% mo_to_both RAB32 TRUE COL11A1 [missense]
2717 2608 11509s1 6 151089773 151153341 63568 13 del 1 ≤0.1% mo_to_s1 PLEKHG1 FALSE
2718 6003 13396p1 6 151865706 151869624 3918 2 del 1 ≤0.1% fa_to_both C6orf97 TRUE
2718 6004 13396s1 6 151865706 151869624 3918 2 del 1 ≤0.1% fa_to_both C6orf97 TRUE CDC123 [missense]
2722 5998 13327s1 6 160575829 160577106 1277 2 del 1 ≤0.1% mo_to_s1 SLC22A1 TRUE
2723 2655 12650s1 6 162683556 162864505 180949 2 dup 1 ≤
0.1% mo_to_s1 PARK2 FALSE PARK2
2726 2643 12317p1 6 168187925 168227435 39510 3 dup 3 ≤0.1% fa_to_p1 C6orf124 FALSE
2730 2610 11519p1 6 169617915 169646376 28461 19 dup 1 ≤0.1% mo_to_p1 THBS2 TRUE SMC3 [missense], SUV420H1 [missense]
2732 6016 13504p1 6 170844307 170892835 48528 18 dup 1 ≤0.1% mo_to_both TBP, PDCD2, PSMB1 TRUE TBP
2732 6017 13504s1 6 170844307 170892835 48528 18 dup 1 ≤0.1% mo_to_both TBP, PDCD2, PSMB1 TRUE TBP
2739 6031 12334p1 7 4050597 4119216 68619 8 del 1 ≤0.1% mo_to_p1 SDK1 FALSE SDK1
2758 488 13335s1 7 7398263 7495743 97480 20 del 1 ≤0.1% mo_to_s1 COL28A1 FALSE
2767 450 11696p1 7 16839367 17382688 543321 22 dup 1 ≤0.1% fa_to_p1 AHR, AGR2, AGR3 FALSE INCENP [missense]
2766 512 14201p1 7 16900123 16913467 13344 5 del 1 ≤0.1% mo_to_p1 AGR3 TRUE
2768 2757 11437p1 7 19737938 19739903 1965 2 del 1 ≤0.1% fa_to_p1 TWISTNB FALSE MADD [missense]
2769 2724 11117s1 7 20180568 20199868 19300 3 dup 1 ≤0.1% mo_to_s1 MACC1 TRUE
2776 2853 13608p1 7 24904960 24911688 6728 5 dup 1 ≤0.1% fa_to_both OSBPL3 FALSE CSDE1 [nonsense]
2776 2854 13608s1 7 24904960 24911688 6728 5 dup 1 ≤0.1% fa_to_both OSBPL3 FALSE
2778 2721 11107p1 7 26236934 26240199 3265 3 dup 1 ≤0.1% mo_to_p1 HNRNPA2B1 TRUE TROAP [missense]
2783 6104 13412p1 7 33102179 33185976 83797 7 dup 3 ≤0.5% fa_to_p1 RP9, BBS9, NT5C3 TRUE BBS9
2783 482 13116s1 7 33134845 33185976 51131 6 dup 3 ≤0.5% mo_to_s1 RP9, BBS9 TRUE BBS9 SRRM5 [missense]
2784 6043 12473p1 7 33945225 34014396 69171 6 dup 1 ≤0.1% fa_to_p1 BMPER TRUE
2788 2784 11740p1 7 35707043 35733940 26897 4 dup 1 ≤0.1% mo_to_p1 HERPUD2 FALSE EIF2C1 [missense]
2796 454 11722p1 7 48308576 48416169 107593 20 del 1 ≤0.1% mo_to_p1 ABCA13 TRUE ABCA13 APAF1 [missense]
2815 2780 11622s1 7 73631154 73639072 7918 10 del 1 ≤0.1% fa_to_s1 LAT2 TRUE
2847 6032 12334p1 7 89556556 89583631 27075 6 del 1 ≤0.1% mo_to_both FALSE
2847 6033 12334s1 7 89556556 89583631 27075 6 del 1 ≤
0.1% mo_to_both FALSE CSNK1G3 [frameshift], PDS5A [nonsense]
2850 2813 12368s1 7 91737807 91746526 8719 4 dup 1 ≤0.1% mo_to_both AKAP9, CYP51A1 TRUE AKAP9
2850 8228 12368p1 7 91737807 91739473 1666 2 dup 1 ≤0.1% mo_to_both AKAP9 TRUE AKAP9
2839 465 12304p1 7 98628206 98633339 5133 3 dup 1 ≤0.1% mo_to_both SMURF1 FALSE STIL [missense], PSEN1 [missense], PHF19 [missense]
2839 466 12304s1 7 98628206 98633339 5133 3 dup 1 ≤0.1% mo_to_both SMURF1 FALSE
2880 6040 12463p1 7 100150969 100151881 912 2 dup 1 ≤0.1% fa_to_p1 AGFG2 TRUE UNC80 [nonsense]
2837 492 13346p1 7 107261775 107269629 7854 2 del 1 ≤0.1% mo_to_both None, BCAP29 FALSE KIAA0100 [nonsense]
2837 493 13346s1 7 107261775 107269629 7854 2 del 1 ≤0.1% mo_to_both None, BCAP29 FALSE
2841 433 11479p1 7 111127293 111161503 34210 2 del 3 ≤0.5% fa_to_p1 IMMP2L TRUE IMMP2L
2841 495 13533s1 7 111127293 111161503 34210 2 del 3 ≤0.5% fa_to_s1 IMMP2L FALSE IMMP2L ENOX2 [missense]
2830 6066 12837p1 7 133502077 133602491 100414 3 del 1 ≤0.1% mo_to_p1 EXOC4 TRUE SH3RF3 [missense]
2870 6089 13327p1 7 137206611 137597824 391213 29 dup 1 ≤0.1% fa_to_p1 CREB3L2, DGKI TRUE
2835 2753 11336s1 7 138391368 138394540 3172 2 del 1 ≤0.1% mo_to_s1 ATP6V0A4 TRUE RGS7 [missense]
2842 428 11459s1 7 140064207 140069482 5275 2 dup 1 ≤0.1% mo_to_s1 SLC37A3 TRUE
2863 6098 13338p1 7 141952040 142480066 528026 43 dup 5 ≤0.1% mo_to_both PRSS1, None, PRSS58 TRUE
2863 6101 13338s1 7 141952040 142481378 529338 44 dup 5 ≤0.1% mo_to_both PRSS1, None, PRSS58 TRUE
2868 6099 13338p1 7 142569459 142659296 89837 51 dup 1 ≤0.1% mo_to_both TRPV5, TRPV6, KEL, C7orf34 TRUE
2868 6102 13338s1 7 142571828 142659296 87468 48 dup 1 ≤0.1% mo_to_both TRPV5, TRPV6, KEL, C7orf34 TRUE
2838 470 12578p1 7 150706017 150725697 19680 23 dup 1 ≤0.1% fa_to_both ATG9B, NOS3, ABCB8 TRUE NOS3
2838 472 12578s1 7 150706017 150725697 19680 23 dup 1 ≤0.1% fa_to_both ATG9B, NOS3, ABCB8 TRUE NOS3
2872 2731 11146p1 7 151833916 151853431 19515 15 dup 1 ≤
0.1% fa_to_both MLL3 TRUE
2872 2732 11146s1 7 151833916 151853431 19515 15 dup 1 ≤0.1% fa_to_both MLL3 TRUE DIP2B [missense]
2873 6067 12837s1 7 151833916 152027824 193908 58 dup 1 ≤0.1% mo_to_s1 MLL3 TRUE
2931 2917 11196p1 8 190895 196362 5467 5 dup 1 ≤2% mo_to_both ZNF596 TRUE
2931 2918 11196s1 8 190895 382935 192040 8 dup 1 ≤2% mo_to_both ZNF596, FBXO25 TRUE
2933 6133 12727s1 8 1824736 2092905 268169 58 dup 1 ≤0.1% fa_to_s1 ARHGEF10, MYOM2 TRUE TCF12 [frameshift]
2937 2928 11336p1 8 2796106 3087753 291647 44 dup 1 ≤0.1% fa_to_both CSMD1 TRUE SLC26A5 [missense]
2937 2929 11336s1 8 2796106 3087753 291647 44 dup 1 ≤0.1% fa_to_both CSMD1 TRUE RGS7 [missense]
2944 2935 11501p1 8 15480588 16035497 554909 19 del 4 ≤0.1% mo_to_both MSR1, TUSC3 FALSE TUSC3
2944 2936 11501s1 8 15480588 16035497 554909 19 del 4 ≤0.1% mo_to_both MSR1, TUSC3 FALSE TUSC3 ANPEP [missense]
2944 543 12810s1 8 15967593 16021760 54167 7 del 4 ≤2% mo_to_s1 MSR1 TRUE
2944 6146 13196p1 8 15967593 16021760 54167 7 del 4 ≤2% mo_to_p1 MSR1 TRUE
2955 8489 12631p1 8 27378399 27380025 1626 2 del 1 ≤0.1% mo_to_both EPHX2 FALSE
2955 8490 12631s1 8 27378399 27380025 1626 2 del 1 ≤0.1% mo_to_both EPHX2 FALSE GAPVD1 [missense], YIF1A [missense]
2956 532 11722p1 8 29959413 30040689 81276 14 dup 1 ≤0.1% fa_to_p1 MIR548O2 TRUE APAF1 [missense]
2958 2940 11716p1 8 38090512 38117639 27127 16 del 1 ≤0.1% fa_to_p1 DDHD2 TRUE
2957 8491 12733p1 8 42585736 42587692 1956 2 del 1 ≤0.5% fa_to_both CHRNB3 FALSE BRD4 [3n-non-frameshifting]
2957 8492 12733s1 8 42585736 42587692 1956 2 del 1 ≤0.5% fa_to_both CHRNB3 FALSE RANBP9 [missense]
2961 6152 13293s1 8 42938260 43212038 273778 34 dup 2 ≤0.1% mo_to_s1 HGSNAT, FNTA, POTEA, SGK196 TRUE HGSNAT
2967 530 11715s1 8 57078801 57080828 2027 2 dup 7 ≤1% mo_to_s1 PLAG1 TRUE
2967 2948 12224p1 8 57078801 57080828 2027 2 dup 7 ≤1% mo_to_both PLAG1 TRUE MPHOSPH8 [nonsense]
2967 2950 12224s1 8 57078801 57080828 2027 2 dup 7 ≤
1% mo_to_both PLAG1 TRUE
2967 2954 12297s1 8 57078801 57080828 2027 2 dup 7 ≤1% fa_to_both PLAG1 TRUE GNA14 [missense]
2967 8244 12297p1 8 57078801 57080828 2027 2 dup 7 ≤1% fa_to_both PLAG1 TRUE
2967 8493 13293p1 8 57078801 57080828 2027 2 dup 7 ≤1% mo_to_both PLAG1 TRUE DNAH10 [missense], KIAA0240 [missense], MUC4 [missense]
2967 8494 13293s1 8 57078801 57080828 2027 2 dup 7 ≤1% mo_to_both PLAG1 TRUE
2967 6167 13599p1 8 57078801 57080828 2027 2 dup 7 ≤1% fa_to_both PLAG1 TRUE TMEM62 [missense]
2967 6168 13599s1 8 57078801 57080828 2027 2 dup 7 ≤1% fa_to_both PLAG1 TRUE
2972 544 13048p1 8 67790808 67791154 346 2 del 1 ≤0.1% fa_to_both C8orf45 TRUE
2972 545 13048s1 8 67790808 67791154 346 2 del 1 ≤0.1% fa_to_both C8orf45 TRUE
2978 6157 13412p1 8 86351940 86575726 223786 14 dup 1 ≤0.1% fa_to_p1 CA3, CA2, REXO1L1 TRUE CA2
2981 6139 12840p1 8 101642554 101661742 19188 4 del 1 ≤0.1% fa_to_both SNX31 TRUE ATP1B1 [nonsense], TM4SF19 [splice]
2981 6140 12840s1 8 101642554 101661742 19188 4 del 1 ≤0.1% fa_to_both SNX31 TRUE
2986 3002 14110s1 8 120638804 120859320 220516 35 del 1 ≤0.1% fa_to_both ENPP2, TAF2, DSCC1 FALSE VPS53 [missense]
2986 3001 14110p1 8 120756527 120862756 106229 31 del 1 ≤0.1% fa_to_both TAF2, DSCC1 FALSE PHF3 [missense]
2988 525 11629p1 8 124975517 124998416 22899 10 del 1 ≤0.1% fa_to_both FER1L6 TRUE FBXO10 [missense]
2988 526 11629s1 8 124975517 124998416 22899 10 del 1 ≤0.1% fa_to_both FER1L6 TRUE
2992 516 11472p1 8 128748839 128753204 4365 3 dup 1 ≤0.1% mo_to_p1 MYC TRUE KRT80 [missense], SP7 [missense]
3003 2922 11252p1 8 144295142 144450815 155673 24 dup 1 ≤0.1% fa_to_both ZFP41, GPIHBP1, TOP1MT, GLI4, ZNF696 FALSE LCN10 [missense]
3003 2923 11252s1 8 144295142 144450815 155673 24 dup 1 ≤0.1% fa_to_both ZFP41, GPIHBP1, TOP1MT, GLI4, ZNF696 FALSE NT5E [missense]
3002 2989 13825s1 8 144391610 144400256 8646 6 del 1 ≤0.1% mo_to_s1 TOP1MT FALSE
3004 2924 11252s1 8 144511514 144548018 36504 6 dup 1 ≤
0.1% fa_to_both MAFA, ZC3H3 FALSE NT5E [missense]
3019 2963 12534p1 8 145947028 146033780 86752 18 dup 1 ≤0.1% mo_to_p1 ZNF251, ZNF34, ZNF517, RPL8 TRUE MTMR9 [missense], MMP8 [missense]
3020 551 13815p1 8 146029025 146029623 598 2 del 1 ≤0.5% mo_to_both ZNF517 TRUE
3020 553 13815s1 8 146029025 146029623 598 2 del 1 ≤0.5% mo_to_both ZNF517 TRUE
3023 591 12106p1 9 172080 340321 168241 19 dup 4 ≤0.5% fa_to_both DOCK8, C9orf66, CBWD1 TRUE DOCK8
3023 6239 13366s1 9 214508 312166 97658 6 dup 1 ≤0.1% fa_to_s1 DOCK8, C9orf66 FALSE DOCK8
3022 593 12106s1 9 214508 340321 125813 14 dup 4 ≤0.5% fa_to_both DOCK8, C9orf66 TRUE DOCK8
3023 3046 11316p1 9 271626 407069 135443 27 dup 4 ≤0.5% mo_to_p1 DOCK8 TRUE DOCK8
3023 3048 11353p1 9 286460 407069 120609 26 dup 4 ≤0.5% mo_to_p1 DOCK8 TRUE DOCK8
3026 6194 12655p1 9 368017 677009 308992 35 dup 1 ≤0.5% mo_to_p1 DOCK8, KANK1 TRUE DOCK8, KANK1 EIF4A1 [missense]
3031 597 12161s1 9 5968018 6015607 47589 4 dup 1 ≤0.1% mo_to_s1 RANBP6, KIAA2026 FALSE
3035 3103 12512s1 9 12693996 12776026 82030 8 del 1 ≤0.1% mo_to_s1 TYRP1, C9orf150 FALSE
3036 6208 13094p1 9 14857550 14868975 11425 4 dup 1 ≤0.1% fa_to_p1 FREM1 FALSE WDFY3 [nonsense]
3037 6187 12480p1 9 15564086 15623411 59325 6 del 1 ≤0.1% mo_to_p1 C9orf93 TRUE
3044 633 13926s1 9 27217685 27229230 11545 5 dup 1 ≤0.1% fa_to_s1 TEK TRUE
3047 6176 12252p1 9 33135186 33261167 125981 11 dup 1 ≤0.5% fa_to_both B4GALT1, None, SPINK4, BAG1 FALSE B4GALT1
3047 6177 12252s1 9 33166755 33261167 94412 10 dup 1 ≤0.5% fa_to_both B4GALT1, None, SPINK4, BAG1 FALSE B4GALT1
3050 6214 13129s1 9 33886878 33948585 61707 29 dup 1 ≤0.1% mo_to_both UBAP2, UBE2R2 TRUE
3050 8504 13129p1 9 33886878 33948585 61707 29 dup 1 ≤0.1% mo_to_both UBAP2, UBE2R2 TRUE
3052 6251 13589p1 9 34286614 34290387 3773 2 del 2 ≤0.1% mo_to_both KIF24 TRUE WDR55 [missense]
3052 8506 13589s1 9 34286614 34290387 3773 2 del 2 ≤0.1% mo_to_both KIF24 TRUE
3054 606 12741p1 9 35228011 35237823 9812 4 del 1 ≤
0.1% mo_to_both UNC13B TRUE UNC13B EHD2 [missense]
3054 607 12741s1 9 35228011 35237823 9812 4 del 1 ≤0.1% mo_to_both UNC13B TRUE UNC13B
3057 3087 12308p1 9 35662942 35664489 1547 4 del 8 ≤1% mo_to_p1 C9orf100 TRUE
3057 3108 12534p1 9 35662942 35664489 1547 4 del 8 ≤1% fa_to_p1 C9orf100 TRUE MTMR9 [missense], MMP8 [missense]
3057 599 12578p1 9 35662942 35664489 1547 4 del 8 ≤1% fa_to_p1 C9orf100 TRUE
3057 6191 12637p1 9 35662942 35664489 1547 4 del 8 ≤1% fa_to_p1 C9orf100 FALSE RNPEPL1 [missense], RHOT2 [missense]
3057 6200 12829p1 9 35662942 35664489 1547 4 del 8 ≤1% mo_to_both C9orf100 FALSE
3057 6201 12829s1 9 35662942 35664489 1547 4 del 8 ≤1% mo_to_both C9orf100 FALSE
3057 8507 13099s1 9 35664004 35664489 485 2 del 8 ≤1% fa_to_s1 C9orf100 TRUE
3073 572 11479s1 9 88923343 88968114 44771 21 dup 1 ≤0.1% fa_to_s1 ZCCHC6 TRUE
3075 6183 12334p1 9 94984801 94985771 970 2 del 1 ≤0.5% mo_to_p1 IARS FALSE
3087 6231 13296p1 9 115407929 115580095 172166 8 dup 1 ≤1% mo_to_p1 SNX30, KIAA1958, C9orf80 TRUE
3088 3098 12507p1 9 115811641 115812152 511 2 del 2 ≤0.5% mo_to_both ZFP37 FALSE
3088 3099 12507s1 9 115811641 115818968 7327 3 del 2 ≤0.5% mo_to_both ZFP37 FALSE
3088 623 13793p1 9 115811641 115818968 7327 3 del 2 ≤0.5% mo_to_p1 ZFP37 TRUE PCDHB4 [missense]
3090 618 13629p1 9 116122786 116132422 9636 4 dup 2 ≤0.5% fa_to_both BSPRY TRUE
3090 621 13629s1 9 116124648 116132422 7774 3 dup 2 ≤0.5% fa_to_both BSPRY TRUE
3093 3057 11437s1 9 123898083 123921297 23214 17 del 1 ≤0.1% fa_to_s1 CNTRL FALSE
3097 6232 13296p1 9 125562401 125589066 26665 4 dup 1 ≤0.1% mo_to_both OR1K1, PDCL TRUE
3097 6233 13296s1 9 125562401 125589066 26665 4 dup 1 ≤0.1% mo_to_both OR1K1, PDCL TRUE
3119 6240 13418p1 9 136259417 136262421 3004 3 del 1 ≤0.1% mo_to_p1 C9orf96 FALSE CGNL1 [missense], DENND5B [missense], LRRC40 [missense]
3127 3050 11356p1 9 139634401 139651044 16643 16 dup 1 ≤0.1% mo_to_p1 LCN6, LCN10, LCN8 TRUE NAPRT1 [splice], SV2B [missense]
3129 6216 13129s1 9 139887376 139890995 3619 11 dup 1 ≤
0.1% fa_to_s1 CLIC3, C9orf142 TRUE
3135 6182 12334p1 9 140126154 140127856 1702 6 dup 1 ≤0.1% fa_to_p1 SLC34A3 FALSE
3138 3126 13808p1 9 140458885 140459606 721 3 del 1 ≤0.1% mo_to_p1 WDR85 TRUE
3139 8279 12647s1 9 140622800 140733293 110493 25 dup 1 ≤0.1% mo_to_both EHMT1, MIR602 TRUE EHMT1 TMEM218 [missense], DHCR7 [missense]
3139 3113 12647p1 9 140646782 140733293 86511 22 del 1 ≤0.1% mo_to_both EHMT1, MIR602 TRUE EHMT1 SLC30A5 [missense]
3141 3129 13825s1 9 141000150 141124256 124106 15 dup 3 ≤0.1% fa_to_s1 CACNA1B FALSE CACNA1B
323 662 11895s1 10 225952 532470 306518 50 dup 1 ≤0.1% fa_to_s1 DIP2C, ZMYND11 FALSE
324 3213 11611p1 10 858865 910210 51345 15 dup 1 ≤0.1% mo_to_both LARP4B TRUE
324 3214 11611s1 10 858865 910210 51345 15 dup 1 ≤0.1% mo_to_both LARP4B TRUE
325 8330 13144s1 10 3124579 3178030 53451 20 dup 1 ≤0.1% mo_to_s1 PFKP TRUE CACNA1H [frameshift]
326 3215 11622p1 10 3175394 3176774 1380 2 del 1 ≤0.5% fa_to_p1 PFKP TRUE
330 6333 13162p1 10 5203384 5260723 57339 12 del 1 ≤0.1% mo_to_p1 AKR1C4, AKR1CL1 TRUE RIMS1 [frameshift]
340 3249 12317p1 10 18276407 18292287 15880 6 dup 1 ≤0.1% fa_to_both SLC39A12 FALSE
340 3250 12317s1 10 18289595 18331762 42167 3 dup 1 ≤0.1% fa_to_both SLC39A12 FALSE CHST2 [missense]
344 3150 11094p1 10 27687222 27700858 13636 3 del 3 ≤1% fa_to_p1 PTCHD3 TRUE CCDC14 [missense]
344 3245 12303p1 10 27687222 27700858 13636 3 del 3 ≤1% mo_to_p1 PTCHD3 TRUE NCAPD2 [missense]
346 8331 12313s1 10 28940408 28945596 5188 3 del 2 ≤0.1% fa_to_s1 TRUE PGM3 [missense]
346 684 13793s1 10 28940408 28945596 5188 3 del 2 ≤0.1% mo_to_s1 TRUE
348 679 12741p1 10 35299238 35305322 6084 4 dup 1 ≤0.1% mo_to_p1 CUL2 TRUE EHD2 [missense]
370 6362 13465p1 10 48370532 51890838 3520306 320 dup 1 ≤
0.1% fa_to_both PARG, AGAP7, ERCC6, CHAT, VSTM4, NCOA4, GDF2, ARHGAP22, C10orf71, MSMB, FAM21A, MAPK8, OGDHL, C10orf53, None, DRGX, FRMPD2P1, RBP3, FRMPD2, ZNF488, C10orf128, WDFY4, PTPN20B, GDF10, TIMM23 TRUE CHAT, ERCC6
370 6365 13465s1 10 48370532 51890838 3520306 320 dup 1 ≤0.1% fa_to_both PARG, AGAP7, ERCC6, CHAT, VSTM4, NCOA4, GDF2, ARHGAP22, C10orf71, MSMB, FAM21A, MAPK8, OGDHL, C10orf53, None, DRGX, FRMPD2P1, RBP3, FRMPD2, ZNF488, C10orf128, WDFY4, PTPN20B, GDF10, TIMM23 TRUE CHAT, ERCC6
374 656 11696p1 10 54527896 54531395 3499 4 del 5 ≤2% mo_to_p1 MBL2 FALSE INCENP [missense]
374 8347 12997p1 10 54527896 54531395 3499 4 del 5 ≤2% fa_to_both MBL2 TRUE
374 8348 12997s1 10 54527896 54531395 3499 4 del 5 ≤2% fa_to_both MBL2 TRUE
376 673 12304s1 10 56287571 56424022 136451 4 del 1 ≤0.1% mo_to_s1 PCDH15 FALSE PCDH15
379 8349 13097s1 10 64425888 64430061 4173 2 del 1 ≤0.5% fa_to_s1 ZNF365 FALSE
380 658 11711p1 10 68280374 68381542 101168 2 del 1 ≤0.1% fa_to_both CTNNA3 TRUE CTNNA3 KIAA0182 [missense]
380 659 11711s1 10 68280374 68381542 101168 2 del 1 ≤0.1% fa_to_both CTNNA3 TRUE CTNNA3
382 3174 11267p1 10 72604229 72645686 41457 16 del 1 ≤0.1% mo_to_both SGPL1, PCBD1 TRUE
382 3175 11267s1 10 72604229 72645686 41457 16 del 1 ≤0.1% mo_to_both SGPL1, PCBD1 TRUE
384 3178 11298p1 10 75010609 75016174 5565 5 dup 1 ≤0.5% mo_to_p1 C10orf103 TRUE SLC6A13 [missense]
395 6270 12510p1 10 90703553 90707143 3590 2 dup 4 ≤2% fa_to_both ACTA2 TRUE
395 6271 12510s1 10 90703553 90707143 3590 2 dup 4 ≤2% fa_to_both ACTA2 TRUE
395 6281 12518p1 10 90703553 90707143 3590 2 dup 4 ≤2% fa_to_both ACTA2 FALSE
395 6286 12518s1 10 90703553 90707143 3590 2 dup 4 ≤2% fa_to_both ACTA2 FALSE
403 3194 11474p1 10 95537135 95557560 20425 6 dup 1 ≤0.1% mo_to_both LGI1 FALSE LGI1
403 3195 11474s1 10 95537135 95557560 20425 6 dup 1 ≤0.1% mo_to_both LGI1 FALSE LGI1 ZW10 [missense]
404 3167 11180s1 10 96466540 96495201 28661 5 del 1 ≤
0.1% mo_to_both CYP2C18 FALSE
404 3166 11180p1 10 96480152 96495201 15049 4 del 1 ≤0.1% mo_to_both CYP2C18 FALSE
400 670 12285p1 10 101594136 101596047 1911 2 dup 5 ≤2% mo_to_both ABCC2 FALSE EIF4G1 [missense]
400 671 12285s1 10 101594136 101596047 1911 2 dup 5 ≤2% mo_to_both ABCC2 FALSE ZNF780A [missense]
400 3253 12383s1 10 101594136 101596047 1911 2 dup 5 ≤2% mo_to_both ABCC2 FALSE
401 652 11629p1 10 103290993 103310617 19624 8 dup 1 ≤0.5% fa_to_both BTRC TRUE FBXO10 [missense]
401 653 11629s1 10 103290993 103310617 19624 8 dup 1 ≤0.5% fa_to_both BTRC TRUE
396 6295 12644p1 10 114427976 114496847 68871 3 dup 1 ≤0.1% mo_to_both VTI1A TRUE
396 6296 12644s1 10 114427976 114496847 68871 3 dup 1 ≤0.1% mo_to_both VTI1A TRUE
402 3201 11519p1 10 122216817 122349014 132197 7 del 1 ≤0.1% mo_to_p1 PPAPDC1A TRUE SMC3 [missense], SUV420H1 [missense]
413 8359 13512p1 10 124672277 124697675 25398 3 del 1 ≤0.1% mo_to_p1 C10orf88, FAM24A TRUE
414 6375 13512p1 10 124800724 124924572 123848 19 dup 1 ≤0.1% mo_to_p1 HMX3, HMX2, BUB3, ACADSB TRUE
416 3209 11561p1 10 134734122 134793304 59182 20 dup 2 ≤0.5% mo_to_both C10orf93 FALSE
416 3210 11561s1 10 134734122 134793304 59182 20 dup 2 ≤0.5% mo_to_both C10orf93 FALSE
397 3291 14110s1 10 135032334 135032594 260 2 dup 1 ≤0.1% mo_to_both KNDC1 FALSE VPS53 [missense]
420 6334 13162s1 10 135168873 135179599 10726 9 dup 1 ≤0.1% mo_to_both ECHS1, C10orf125 TRUE
420 8356 13162p1 10 135176371 135179599 3228 3 dup 1 ≤0.1% mo_to_both ECHS1 TRUE RIMS1 [frameshift]
449 3356 11810p1 11 193099 249057 55958 36 dup 1 ≤0.5% fa_to_both ODF3, SIRT3, RIC8A, PSMD13, SCGB1C1, BET1L FALSE
449 3358 11810s1 11 193099 249057 55958 36 dup 1 ≤0.5% fa_to_both ODF3, SIRT3, RIC8A, PSMD13, SCGB1C1, BET1L FALSE RSRC1 [nonsense]
450 3350 11766p1 11 244040 244469 429 2 dup 10 ≤2% mo_to_p1 PSMD13 FALSE KIAA0100 [missense]
464 3321 11180p1 11 4869466 4904113 34647 2 del 1 ≤0.5% fa_to_both OR51T1, OR51S1 FALSE
464 3322 11180s1 11 4869466 4904113 34647 2 del 1 ≤
0.5% fa_to_both OR51T1, OR51S1 FALSE
465 6393 12683s1 11 5009441 5021160 11719 7 del 1 ≤0.1% mo_to_s1 OR51L1, MMP26 FALSE C6orf174 [3n-non-frameshifting]
467 697 11569p1 11 5862185 5878932 16747 2 del 4 ≤2% fa_to_both OR52E6, OR52E8 TRUE TNKS [missense]
467 699 11569s1 11 5862185 5878932 16747 2 del 4 ≤2% fa_to_both OR52E6, OR52E8 TRUE
473 721 12581s1 11 7949264 7961067 11803 2 del 1 ≤0.1% fa_to_s1 OR10A3, OR10A6 TRUE MAML3 [missense]
476 714 11964p1 11 14880588 14902314 21726 7 dup 1 ≤0.1% fa_to_p1 CYP2R1, PDE3B TRUE
479 3345 11667p1 11 18727363 18729842 2479 4 del 1 ≤0.1% mo_to_both IGSF22 TRUE
479 3347 11667s1 11 18727363 18729842 2479 4 del 1 ≤0.1% mo_to_both IGSF22 TRUE CDH3 [missense]
480 3408 12512s1 11 20805225 20869299 64074 2 del 1 ≤0.1% fa_to_s1 NELL1 FALSE
481 6387 12523s1 11 22215038 22833562 618524 42 del 1 ≤0.1% mo_to_both ANO5, SLC17A6, GAS2, FANCF FALSE FANCF, SLC17A6
481 6386 12523p1 11 22232809 22833562 600753 40 del 1 ≤0.1% mo_to_both ANO5, SLC17A6, GAS2, FANCF FALSE FANCF, SLC17A6 G3BP2 [missense]
482 744 13926p1 11 31312193 31541638 229445 16 del 1 ≤0.1% mo_to_both IMMP1L, DNAJC24, DCDC1, ELP4 TRUE FBXW9 [missense]
482 745 13926s1 11 31312193 31541638 229445 16 del 1 ≤0.1% mo_to_both IMMP1L, DNAJC24, DCDC1, ELP4 TRUE
483 6436 13296p1 11 32608557 32623945 15388 10 dup 1 ≤1% mo_to_p1 EIF3M TRUE
484 728 12810p1 11 32697110 32781789 84679 8 del 1 ≤0.1% mo_to_p1 CCDC73 TRUE
485 6383 12445s1 11 32987845 33113886 126041 30 dup 1 ≤0.1% fa_to_s1 TCP11L1, QSER1, DEPDC7, CSTF3 TRUE
490 3439 13843p1 11 43772460 43775671 3211 2 del 1 ≤1% fa_to_p1 HSD17B12 TRUE AGK [missense]
498 6411 13097s1 11 55432642 55595630 162988 6 del 1 ≤1% mo_to_s1 OR5L1, OR5L2, OR5D18, OR5D13, OR4C6, OR5D14 FALSE
501 3327 11219p1 11 57256388 57327905 71517 25 dup 1 ≤0.1% fa_to_p1 UBE2L6, None, TIMM10, SLC43A1, SMTNL1 FALSE
500 3393 12303p1 11 59244902 59283330 38428 3 dup 3 ≤
0.1% mo_to_p1 OR4D9, OR4D10, OR4D11 TRUE NCAPD2 [missense]
503 6402 12836s1 11 59620447 59623531 3084 4 del 1 ≤0.5% mo_to_both TCN1 FALSE
503 6401 12836p1 11 59620675 59622308 1633 2 del 1 ≤0.5% mo_to_both TCN1 FALSE TBC1D2B [missense]
511 6379 12334s1 11 62600423 62607042 6619 11 dup 1 ≤0.5% mo_to_s1 WDR74 FALSE CSNK1G3 [frameshift], PDS5A [nonsense]
512 710 11788s1 11 63057637 63059115 1478 2 dup 1 ≤0.5% mo_to_s1 SLC22A10 TRUE
514 3451 14110s1 11 64569050 64569229 179 2 dup 1 ≤0.1% fa_to_both MAP4K2 FALSE VPS53 [missense]
529 6415 13148s1 11 76954738 76956547 1809 2 del 1 ≤0.1% mo_to_both GDPD4 TRUE KIAA1244 [missense]
529 8375 13148p1 11 76954738 76956547 1809 2 del 1 ≤0.1% mo_to_both GDPD4 TRUE
534 701 11610p1 11 85429832 85468768 38936 10 del 1 ≤0.5% mo_to_both SYTL2 FALSE DNAH5 [frameshift], HDLBP [missense]
534 702 11610s1 11 85429832 85468768 38936 10 del 1 ≤0.5% mo_to_both SYTL2 FALSE
549 7715 11115s1 11 100558409 100831720 273311 14 dup 8 ≤0.1% fa_to_s1 TRUE
549 7466 11229p1 11 100558409 100859532 301123 24 dup 8 ≤0.1% fa_to_p1 FALSE
550 6461 13502p1 11 102708028 102709441 1413 2 del 1 ≤0.1% mo_to_both MMP3 TRUE
550 6462 13502s1 11 102708028 102709441 1413 2 del 1 ≤0.1% mo_to_both MMP3 TRUE
554 3346 11667s1 11 108256646 108264105 7459 2 dup 3 ≤0.5% mo_to_both C11orf65 TRUE CDH3 [missense]
554 7717 11667p1 11 108256646 108264105 7459 2 dup 3 ≤0.5% mo_to_both C11orf65 TRUE
554 7467 11773s1 11 108256646 108264105 7459 2 dup 3 ≤0.5% mo_to_s1 C11orf65 TRUE CCDC15 [missense]
554 3433 13808s1 11 108256646 108264105 7459 2 dup 3 ≤0.5% mo_to_s1 C11orf65 TRUE
558 725 12630s1 11 112046324 112084584 38260 9 del 2 ≤0.1% fa_to_s1 BCO2 TRUE
559 6439 13307s1 11 113669961 113725057 55096 24 dup 1 ≤1% mo_to_s1 USP28 TRUE
571 3357 11810p1 11 124508437 124509763 1326 2 del 1 ≤0.1% mo_to_p1 SIAE FALSE
576 3430 13730p1 11 134151270 134215023 63753 21 del 2 ≤0.5% mo_to_p1 GLB1L2, GLB1L3 FALSE DICER1 [missense]
576 6429 13239s1 11 134162025 134214349 52324 16 dup 2 ≤
0.1% fa_to_s1 GLB1L2, GLB1L3 FALSE
575 6406 12851p1 11 134177017 134257553 80536 34 dup 1 ≤0.1% fa_to_both GLB1L2, GLB1L3, B3GAT1 TRUE B3GAT1 SLC25A29 [missense], HMGXB3 [missense], UBE3C [missense]
575 6407 12851s1 11 134177017 134257553 80536 34 dup 1 ≤0.1% fa_to_both GLB1L2, GLB1L3, B3GAT1 TRUE B3GAT1
579 7722 11282p1 12 4651049 4668159 17110 8 dup 3 ≤0.5% mo_to_both RAD51AP1 TRUE
579 7723 11519p1 12 4651049 4668159 17110 8 dup 3 ≤0.5% fa_to_p1 RAD51AP1 TRUE SMC3 [missense], SUV420H1 [missense]
579 3495 11282s1 12 4655474 4668159 12685 6 dup 3 ≤0.5% mo_to_both RAD51AP1 TRUE
579 3621 13809p1 12 4655474 4668159 12685 6 dup 3 ≤0.5% fa_to_p1 RAD51AP1 TRUE
596 3597 12802s1 12 8608707 8689862 81155 14 del 1 ≤0.1% mo_to_both CLEC4E, CLEC4D, CLEC6A TRUE
596 7730 12802p1 12 8608707 8689862 81155 14 del 1 ≤0.1% mo_to_both CLEC4E, CLEC4D, CLEC6A TRUE
602 3594 12736p1 12 10337928 10342739 4811 3 dup 1 ≤0.1% mo_to_both C12orf59 TRUE
602 7731 12736s1 12 10337928 10342739 4811 3 dup 1 ≤0.1% mo_to_both C12orf59 TRUE
614 6513 13125p1 12 21007961 21377773 369812 40 del 2 ≤0.1% mo_to_both LST-3TM12, SLCO1B1, SLCO1B3 TRUE RAD21L1 [nonsense]
614 6514 13125s1 12 21007961 21392123 384162 41 del 2 ≤0.1% mo_to_both LST-3TM12, SLCO1B1, SLCO1B3 TRUE
619 8313 12997p1 12 25264705 25267804 3099 2 del 1 ≤0.1% fa_to_both CASC1 TRUE
619 8314 12997s1 12 25264705 25267804 3099 2 del 1 ≤0.1% fa_to_both CASC1 TRUE
628 3618 13739s1 12 33532766 38712260 5179494 11 dup 2 ≤0.5% mo_to_s1 ALG10, SYT10, ALG10B TRUE COL11A1 [missense]
640 3463 11090p1 12 49688983 49691056 2073 5 dup 5 ≤2% mo_to_p1 PRPH TRUE
640 6506 12836s1 12 49688983 49691056 2073 5 dup 5 ≤2% fa_to_s1 PRPH FALSE
642 6494 12719s1 12 50475327 50484373 9046 9 dup 1 ≤0.1% mo_to_s1 ACCN2, SMARCD1 TRUE
643 3540 11828p1 12 51203238 51213562 10324 4 dup 1 ≤0.1% mo_to_p1 ATF1 TRUE
647 3473 11114p1 12 52774863 52779369 4506 6 dup 1 ≤1% mo_to_both KRT84 TRUE SCN2A [nonsense]
647 7739 11114s1 12 52774863 52779369 4506 6 dup 1 ≤
1% mo_to_both KRT84 TRUE DZANK1 [missense]
658 3479 11180s1 12 56619962 56620203 241 2 dup 1 ≤0.1% fa_to_s1 OBFC2B FALSE
663 8395 12723p1 12 57345812 57351246 5434 4 dup 4 ≤1% fa_to_both RDH16 TRUE
663 8396 12723s1 12 57345812 57351246 5434 4 dup 4 ≤1% fa_to_both RDH16 TRUE SERAC1 [missense]
663 807 13835p1 12 57345812 57348948 3136 3 del 4 ≤1% mo_to_both RDH16 TRUE
663 808 13835s1 12 57345812 57351246 5434 4 del 4 ≤1% mo_to_both RDH16 TRUE
671 6474 12334p1 12 69742188 69744052 1864 2 del 1 ≤0.1% fa_to_p1 LYZ FALSE
672 7751 12763p1 12 75436092 75905377 469285 49 dup 1 ≤0.1% mo_to_p1 KCNC2, GLIPR1, KRR1, CAPS2, GLIPR1L2, GLIPR1L1 TRUE
674 6492 12716s1 12 80190561 80211298 20737 9 dup 1 ≤0.1% mo_to_s1 PPP1R12A TRUE
675 3599 12869p1 12 80632630 80730784 98154 31 del 2 ≤0.5% mo_to_both OTOGL TRUE PRCP [missense]
675 3601 12869s1 12 80632630 80730335 97705 30 del 2 ≤0.5% mo_to_both OTOGL TRUE
675 802 13798p1 12 80632630 80730784 98154 32 del 2 ≤0.5% fa_to_p1 OTOGL FALSE
680 8398 13215p1 12 96704829 96707232 2403 2 dup 1 ≤0.1% mo_to_p1 CDK17 TRUE
684 3629 13832s1 12 104171606 104174142 2536 2 del 1 ≤0.1% fa_to_s1 NT5DC3 TRUE
688 774 11895s1 12 109290781 109293251 2470 3 del 1 ≤0.1% mo_to_s1 DAO FALSE DAO
693 786 12581p1 12 112182446 112308984 126538 28 dup 2 ≤0.5% mo_to_p1 ACAD10, MAPKAPK5, C12orf47, ALDH2 TRUE
697 3607 13322p1 12 114296595 114404032 107437 22 dup 1 ≤0.1% mo_to_p1 RBM19 TRUE
703 3489 11241p1 12 120875929 120884632 8703 7 dup 2 ≤0.5% mo_to_p1 GATC, COX6A1, TRIAP1 TRUE C18orf26 [missense]
703 6505 12836p1 12 120875929 120884632 8703 7 dup 2 ≤0.1% fa_to_p1 GATC, COX6A1, TRIAP1 FALSE TBC1D2B [missense]
716 7488 12285p1 12 132633328 132636946 3618 7 dup 1 ≤0.1% fa_to_p1 NOC4L FALSE EIF4G1 [missense]
720 3483 11196p1 12 133618053 133781116 163063 16 dup 1 ≤0.5% mo_to_p1 ZNF140, ZNF268, ZNF10, ZNF84 TRUE
724 3678 11509s1 13 20425494 20426320 826 2 dup 5 ≤1% fa_to_s1 ZMYM5 FALSE
724 7476 12011p1 13 20425494 20426320 826 2 dup 5 ≤
1 %fa_to_both ZMYM5 TRUE FAM45A [mis sense] 724 7477 12011s1 132042 549420 426320 826 2dup 5? 1 %fa_to_both ZMYM5 TRUE 724 3718 12561p1 132042 549420 426320 826 2dup 5? 1 %fa_to_p1 ZMYM5 FALSE NLRX1 [misse nse], ADAM33 [non sense] 724 3739 13322p1 132042 549420 426320 826 2dup 5? 1 %fa_to_p1 ZMYM5 TRUE 730 3683 11676p1 132130 321821 311944 8726 3dup 1? 0.1 %mo_to_p1 N6AMT2 TRUE 736 3656 11154s1 132527 607225 285660 9588 10dup 3? 0.1 %fa_to_s1 ATP12A TRUE 742 6576 13590p1 133330 623734 405491 1099254 43dup 1? 0.1 %mo_to_bot hKL, STAR D13, RFC3, P DS5B TRUE MTHFS [fram eshift], EFCAB5 [fram eshift] 742 6577 13590s1 133330 930834 540262 1230954 45dup 1? 0.1 %mo_to_bot hKL, STAR D13, RFC3, P DS5B TRUE 754 3749 13808p1 134908 476449 086310 1546 3del 4? 0.1 %fa_to_both RCBTB2 TRUE 754 3750 13808s1 134908 476449 086310 1546 3del 4? 0.1 %fa_to_both RCBTB2 TRUE 755 3666 11298s1 135012 359350 134220 10627 5dup 1? 0.5 %mo_to_s1 RCBTB1 TRUERC BTB1 759 3747 13774p1 137611 179976 123998 12199 3dup 9? 2 %mo_to_p1 COMMD6, UC HL3 TRUE DNAH11 [mis sense] 764 3671 11412p1 139650 841196 515968 7557 4del 1? 0.5 %fa_to_both UGGT2 FALSE 764 3672 11412s1 139650 841196 515968 7557 4del 1? 0.5 %fa_to_both UGGT2 FALSE C16orf62 [mis sense] 779 817 11190s1 1311117 6011111 319820 143809 21dup 1? 0.1 %mo_to_s1 RAB20, CARS 2, None, CAR KD TRUE 785 3687 11808p1 1311397 9975113 980406 431 2dup 1? 0.1 %mo_to_p1 GRTP1 FALSE 786 6555 12829s1 1311413 8154114 175048 36894 8dup 1? 0.1 %mo_to_s1 DCUN1D2, TM CO3 FALSE 789 3732 13063p1 1311451 4708114 535461 20753 7dup 2? 0.5 %mo_to_bot hGAS6, F AM70B FALSE TPK1 [missen se] 789 3733 13063s1 1311451 4708114 531684 16976 6dup 2? 0.5 %mo_to_bot hGAS6, F AM70B FALSE 790 3660 11196p1 1311500 2273115 091756 89483 26dup 1? 0.5 %mo_to_bot hUPF3A, CDC16, ZNF8 28 TRUEUP F3A 790 3661 11196s1 1311500 7595115 091756 84161 23dup 1? 0.5 %mo_to_bot hUPF3A, CDC16, ZNF8 28 TRUEUP F3A 798 6586 12460p1 142121 573921 424416 208677 6dup 4? 
0.1 %mo_to_p1 RNASE1, RNA SE3, RNASE2 , RNASE6, ED DM3A, EDDM3B TRUE 794 3811 11766s1 142123 830921 270227 31918 3dup 1? 0.1 %fa_to_s1 RNASE1, EDD M3B, RNASE 6 FALSE SYTL3 [nonse nse] 795 7780 11094p1 142135 984521 624184 264339 73dup 1? 0.1 %fa_to_p1 RNASE3, RNA SE2, RNASE7 , RNASE13, R NASE8, ZNF219, OR5 AU1, TPPP2, NDRG2, SLC 39A2, METTL17, AR HGEF40 TRUE CCDC14 [mis sense] 811 3788 11242p1 142213 329622 410148 276852 19dup 1? 0.1 %fa_to_p1 None, OR4E2 TRUE 810 6604 13139p1 142218 023522 192868 12633 2del 1? 0.1 %fa_to_p1 None FALSE 809 3851 13912s1 142289 120922 928469 37260 8del 5? 0.1 %mo_to_bot hNone TRUE 845 6606 13153p1 146375 759563 860646 103051 10dup 2? 0.5 %fa_to_p1 RHOJ, GPHB 5, PPP2R5E TRUE MAPK13 [mis sense] 846 3843 13195p1 146400 624664 066660 60414 2dup 1? 0.1 %fa_to_both PPP2R5E, WD R89 TRUE TSPYL5 [3n-n on- frameshifting] , SCARB2 [missense] 851 865 12304p1 146501 671565 019579 2864 3dup 3? 2 %fa_to_p1 C14orf50 FALSE STIL [missens e], PSEN1 [missense], P HF19 [missense] 851 877 12810p1 146501 671565 019579 2864 3dup 3? 2 %fa_to_p1 C14orf50 TRUE 854 8420 12582p1 146991 995769 969596 49639 10dup 1? 0.5 %mo_to_p1 SLC39A9 TRUE EMID2 [misse nse] 856 888 13533p1 147452 228674 541720 19434 15del 1? 0.1 %fa_to_both ALDH6A1, C1 4orf45 FALSE 856 891 13533s1 147452 228674 541720 19434 15del 1? 0.1 %fa_to_both ALDH6A1, C1 4orf45 FALSE ENOX2 [misse nse] 860 6596 12697p1 147837 414578 392304 18159 3del 1? 0.1 %mo_to_p1 ADCK1 TRUE 863 3814 11810p1 149076 758290 784457 16875 6dup 1? 0.1 %mo_to_p1 C14orf102 FALSE 870 3782 11118p1 149918 252899 183611 1083 2del 4? 2 %fa_to_p1 C14orf177 TRUE 870 3795 11356p1 149918 252899 183611 1083 2del 4? 2 %mo_to_p1 C14orf177 TRUE NAPRT1 [spli ce], SV2B [missense] 870 879 13116p1 149918 252899 183611 1083 2del 4? 2 %fa_to_both C14orf177 TRUE 870 881 13116s1 149918 252899 183611 1083 2del 4? 2 %fa_to_both C14orf177 TRUE SRRM5 [miss ense] 870 904 14201s1 149918 252899 183611 1083 2del 4? 
2 %mo_to_s1 C14orf177 TRUE 871 6580 11894p1 1410040 2365100 405664 3299 4dup 3? 0.5 %mo_to_p1 EML1 FALSEEM L1 871 3826 12317s1 1410040 2365100 405664 3299 4dup 3? 0.5 %fa_to_s1 EML1 FALSEEM L1C HST2 [missen se] 874 6583 12396p1 1410583 6177105 861009 24832 17dup 1? 0.1 %fa_to_p1 PACS2 TRUE 888 6675 13493p1 152073 949623 445762 2706266 97del 6? 0.5 %fa_to_p1 None, CYFIP1 , NIPA2, LOC 727924, GOL GA6L1, OR4N4, POTE B, NIPA1, GO LGA8E, TUBG CP5 TRUENI PA2, NIPA1, CYFIP1, TUBGCP5 888 4002 13393p1 152273 953323 062319 322786 66dup 6? 0.5 %fa_to_both NIPA2, NIPA1 , CYFIP1, GO LGA6L1, TUB GCP5 FALSEN IPA2, NIPA1, CYFIP1, TUBGCP5 888 6640 12735p1 152283 591523 062319 226404 62del 6? 0.5 %fa_to_p1 NIPA2, NIPA1 , CYFIP1, TUB GCP5 TRUENI PA2, NIPA1, CYFIP1, TUBGCP5 888 4003 13393s1 152283 591523 062319 226404 62dup 6? 0.5 %fa_to_both NIPA2, NIPA1 , CYFIP1, TUB GCP5 FALSEN IPA2, NIPA1, CYFIP1, TUBGCP5 8437 13493p1 152283 591523 062319 226404 62del 6? 0.5 %fa_to_p1 NIPA2, NIPA1 , CYFIP1, TUB GCP5 TRUENI PA2, NIPA1, CYFIP1, TUBGCP5 898 6672 13487s1 153135 691031 369124 12214 6del 4? 0.5 %fa_to_s1 TRPM1 FALSE 897 8440 12833p1 153232 310032 404100 81000 3dup 2? 1 %fa_to_both CHRNA7 TRUECH RNA7 897 8441 12833s1 153232 310032 404100 81000 3dup 2? 1 %fa_to_both CHRNA7 TRUECH RNA7 905 936 13890s1 153698 388537 002162 18277 5del 1? 0.1 %fa_to_s1 C15orf41 TRUE 909 3954 12317s1 154099 326141 001314 8053 3del 1? 0.1 %mo_to_s1 RAD51 FALSER AD51 CHST2 [misse nse] 915 934 13629s1 154206 747342 147943 80470 75dup 1? 0.1 %mo_to_s1 JMJD7-PLA2 G4B, SPTBN5 , MAPKBP1 TRUE 918 4013 13825p1 154280 743442 836322 28888 5dup 1? 0.1 %mo_to_p1 LRRC57, SNA P23 FALSE 920 6655 13148p1 154357 701143 585152 8141 5del 1? 0.1 %fa_to_both TGM7 TRUE 920 6656 13148s1 154357 701143 584295 7284 4del 1? 0.1 %fa_to_both TGM7 TRUE KIAA1244 [mi ssense] 921 6651 13018p1 154362 789343 644143 16250 9dup 1? 0.1 %mo_to_p1 ADAL TRUE PCOLCE [fram eshift], MCOLN3 [mis sense] 922 914 11479p1 154369 661043 701294 4684 5dup 1? 
0.1 %mo_to_p1 TP53BP1, TU BGCP4 TRUE 929 6664 13216s1 155073 127051 535109 803839 113dup 1? 0.1 %fa_to_s1 USP8, TNFAIP 8L3, CYP19A 1, USP50, AP 4E1, SPPL2A, TRP M7 TRUEAP 4E1 932 6670 13330p1 155399 194654 025346 33400 12del 2? 0.1 %mo_to_p1 WDR72 FALSE 934 4015 13843p1 155547 551255 497903 22391 6dup 2? 0.1 %mo_to_p1 RAB27A, RSL 24D1 TRUERA B27AA GK [missense ] 936 3938 12295p1 155773 019757 754090 23893 7dup 2? 0.5 %mo_to_bot hCGNL1 FALSE 936 6644 12837p1 155773 019757 754090 23893 7dup 2? 0.5 %fa_to_p1 CGNL1 TRUE SH3RF3 [miss ense] 936 3939 12295s1 155773 257457 746016 13442 5dup 2? 0.5 %mo_to_bot hCGNL1 FALSE 963 906 11229s1 157642 660476 641077 214473 28dup 1? 0.1 %fa_to_s1 ETFA, C15orf 27, ISL2, SCA PER FALSE 974 3878 11241s1 158546 175485 488431 26677 10del 3? 0.1 %fa_to_s1 SLC28A1 TRUE SNRNP200 [m issense] 974 6628 12420p1 158546 175485 488431 26677 10del 3? 0.1 %mo_to_p1 SLC28A1 FALSE HRH2 [missen se], GOLGA4 [missense] 974 6633 12445p1 158546 175485 488431 26677 10del 3? 0.1 %mo_to_p1 SLC28A1 TRUE 976 3867 11092p1 158602 895186 129054 100103 7dup 1? 0.5 %mo_to_p1 AKAP13 TRUE 981 3977 12651s1 159148 614191 500955 14814 15del 2? 0.1 %fa_to_s1 RCCD1, UNC 45A TRUE 981 4005 13543p1 159148 812191 520001 31880 25del 2? 0.1 %mo_to_p1 RCCD1, PRC 1, UNC45A TRUE 983 6658 13197s1 159485 751094 888393 30883 6dup 1? 0.1 %mo_to_s1 MCTP2 TRUE GMPPA [3n-n on- frameshifting] 984 911 11452s1 1510197 0191101 983865 13674 4dup 1? 0.1 %mo_to_s1 PCSK6 TRUE 985 3892 11356s1 1510222 4277102 501099 276822 15dup 2? 0.1 %mo_to_s1 OR4F6, TARS L2, OR4F4, O R4F15 TRUE DLGAP3 [mis sense], STEAP3 [miss ense], DHX30 [missense] 985 3969 12375p1 1510222 4277102 261525 37248 10dup 2? 0.1 %fa_to_p1 TARSL2 TRUE FAM205A [mi ssense] 992 6741 12826p1 1645 4955 461578 6623 7del 1? 0.1 %mo_to_bot hDECR2 TRUE TROVE2 [fram eshift] 992 6742 12826s1 1645 4955 461578 6623 7del 1? 0.1 %mo_to_bot hDECR2 TRUE 1015 6809 13398s1 16301 45213 100547 86026 53dup 1? 
0.1 %mo_to_bot hHCFC1R 1, PAQR4, PK MYT1, CCDC 64B, CLDN9, MMP25, KRE MEN2, TNFRS F12A, THOC6 , CLDN6 TRUE 1015 6808 13398p1 16301 63253 100547 84222 50dup 1? 0.1 %mo_to_bot hHCFC1R 1, PAQR4, PK MYT1, CCDC 64B, CLDN9, MMP25, KRE MEN2, TNFRS F12A, THOC6 , CLDN6 TRUE POGZ [frames hift], PYHIN1 [missense], TT N [missense] 1017 6736 12735p1 16310 70333 119356 12323 12del 1? 0.1 %fa_to_p1 MMP25, IL32 TRUE 1020 6825 13512p1 16439 03194 391505 1186 3dup 1? 0.1 %fa_to_p1 CORO7-PAM 16 TRUE 1022 6752 13018s1 16462 15614 642368 20807 10del 1? 0.1 %mo_to_bot h TRUE 1022 6751 13018p1 16462 49724 642368 17396 8del 1? 0.1 %mo_to_bot h TRUE PCOLCE [fram eshift], MCOLN3 [mis sense] 1023 4043 11118p1 16487 13804 871598 218 2dup 2? 1 %mo_to_p1 GLYR1 TRUE 1029 6796 13327s1 16872 25868 997259 274673 53dup 1? 0.1 %mo_to_bot hUSP7, M ETTL22, TME M186, ABAT, PMM2, CARHSP1 TRUEPM M2, ABAT 1028 6794 13327p1 16872 88958 740000 11105 8dup 1? 0.1 %mo_to_bot hMETTL2 2 TRUE 1039 6719 12652s1 161512 559115 168718 43127 24dup 3? 1 %fa_to_s1 NTAN1, PDXD C1, RRN3 TRUE CTSB [missen se] 1039 6756 13125p1 161512 559115 135533 9942 12dup 3? 1 %fa_to_both NTAN1, PDXD C1 TRUE RAD21L1 [no nsense] 1039 6757 13125s1 161512 671715 135533 8816 11dup 3? 1 %fa_to_both NTAN1, PDXD C1 TRUE 1045 6772 13215p1 161559 617815 609285 13107 6del 1? 0.1 %fa_to_p1 C16orf45 TRUE 1046 6829 13689s1 161614 658016 276445 129865 36dup 1? 0.1 %mo_to_s1 ABCC6, ABC C1 TRUE ODZ4 [missen se], FBN1 [missense], IA RS [nonsense] 1054 989 13593s1 161960 311119 663412 60301 19dup 1? 0.1 %fa_to_s1 C16orf62 FALSE 1057 6760 13139p1 162162 396521 742251 118286 30del 5? 1 %mo_to_bot hMETTL9 , OTOA FALSE 1057 6761 13139s1 162162 396521 763826 139861 35del 5? 1 %mo_to_bot hMETTL9 , OTOA FALSE C7orf71 [miss ense] 1057 6696 12441p1 162165 561021 763826 108216 29dup 5? 1 %mo_to_p1 METTL9, OTO A TRUE 1057 6727 12697p1 162165 561021 768598 112988 31del 5? 1 %fa_to_p1 METTL9, OTO A TRUE 1062 4115 12100s1 162307 936623 117808 38442 14dup 3? 
0.1 %fa_to_s1 USP31 TRUE 1080 7516 11629p1 162967 504930 199897 524848 192dup 7? 0.1 %mo_to_p1 DOC2A, ASP HD1, CORO1 A, TBX6, KIF2 2, CDIPT, QPRT, YPEL3 , PPP4C, MAP K3, SPN, MVP , FAM57B, ALD OA, INO80E, SEZ6L2, TAO K2, KCTD13, MAZ , PRRT2, GDP D3, C16orf92 , C16orf53, TM EM219, C16o rf54, HIRIP3 TRUEAL DOA, MAPK3, SEZ6L2 FBXO10 [miss ense] 1080 4203 13509p1 162967 504930 199408 524359 191dup 7? 0.1 %fa_to_p1 DOC2A, ASP HD1, CORO1 A, TBX6, KIF2 2, CDIPT, QPRT, YPEL3 , PPP4C, MAP K3, SPN, MVP , FAM57B, ALD OA, INO80E, SEZ6L2, TAO K2, KCTD13, MAZ , PRRT2, GDP D3, C16orf92 , C16orf53, TM EM219, C16o rf54, HIRIP3 TRUEAL DOA, MAPK3, SEZ6L2 1085 966 12152s1 163079 303830 849705 56667 5dup 1? 0.1 %mo_to_s1 ZNF629 TRUE 1088 943 11229p1 163147 717031 488897 11727 11dup 1? 0.5 %mo_to_bot hTGFB1I1 , ARMC5 FALSE 1088 945 11229s1 163147 717031 487883 10713 10dup 1? 0.5 %mo_to_bot hTGFB1I1 , ARMC5 FALSE 1098 6806 13366p1 165759 600557 604447 8442 9del 1? 0.1 %mo_to_bot hGPR114 FALSE 1098 8471 13366s1 165759 600557 604447 8442 9del 1? 0.1 %mo_to_bot hGPR114 FALSE 1101 8472 13227s1 166696 788966 972144 4255 4del 1? 0.1 %mo_to_s1 CES2, FAM96 B TRUE LCOR [missen se] 1105 4217 13825p1 166822 467068 283349 58679 19dup 1? 0.1 %mo_to_bot hESRP2, NFATC3, PLA 2G15 FALSE 1105 4218 13825s1 166822 467068 283349 58679 19dup 1? 0.1 %mo_to_bot hESRP2, NFATC3, PLA 2G15 FALSE 1107 996 14201p1 166871 028768 713877 3590 5dup 2? 0.1 %mo_to_p1 CDH3 TRUE 1114 979 13169s1 167028 662370 296427 9804 10dup 2? 0.1 %fa_to_s1 AARS FALSE 1109 4191 12790p1 167036 319470 405466 42272 16dup 1? 0.1 %fa_to_p1 DDX19A, DDX 19B TRUE 1110 4113 12100p1 167071 469670 714928 232 2dup 3? 0.5 %fa_to_p1 MTSS1L TRUE 1112 6715 12628s1 167199 710372 001906 4803 3del 2? 0.5 %fa_to_both TRUE CSDE1 [misse nse] 1112 6714 12628p1 167200 103572 001906 871 2del 2? 0.5 %fa_to_both TRUE OTUD7A [3n- non- frameshifting] 1112 986 13447s1 167200 103572 001906 871 2del 2? 
0.5 %fa_to_s1 FALSE 1125 4208 13621s1 167527 636775 690509 414142 55dup 1? 0.1 %fa_to_s1 KARS, TMEM 170A, BCAR1 , GABARAPL2 , TMEM231, AD AT1, TERF2IP , CFDP1, CHS T5, CHST6 TRUECH ST5 1128 4091 11581p1 167723 200477 769883 537879 36dup 1? 0.1 %mo_to_p1 ADAMTS18, M ON1B, NUDT 7 TRUE 1129 6814 13493p1 167789 664477 918699 22055 4del 1? 0.1 %fa_to_both VAT1L TRUE 1129 6815 13493s1 167789 664477 918699 22055 4del 1? 0.1 %fa_to_both VAT1L TRUE 1131 4104 11797p1 167805 652378 062087 5564 2del 2? 0.5 %fa_to_both CLEC3A TRUE 1131 4105 11797s1 167805 652378 062087 5564 2del 2? 0.5 %fa_to_both CLEC3A TRUE 1131 6774 13216p1 167805 652378 064738 8215 3del 2? 0.5 %fa_to_both CLEC3A TRUE ITGA7 [misse nse] 1131 6776 13216s1 167805 652378 064738 8215 3del 2? 0.5 %fa_to_both CLEC3A TRUE 1135 970 12373p1 168131 446181 396216 81755 10dup 1? 0.1 %fa_to_p1 GAN, BCMO1 TRUEGA N 1142 962 11964p1 168443 873384 459407 20674 9del 2? 0.1 %mo_to_p1 ATP2C2 TRUE 1143 7863 11412p1 168469 336584 812688 119323 15del 9? 2 %mo_to_p1 KLHL36, USP 10 FALSE 1160 1036 12578p1 17 6010 11981 5971 3dup 1? 0.1 %mo_to_p1 TRUE 1162 6906 13263s1 17 6010 636421 630411 38dup 2? 0.1 %mo_to_s1 RPH3AL, FAM 101B, C17orf 97, FAM57A, VPS53 FALSE 1161 6898 13165s1 1729 5626 729318 433692 48dup 1? 0.1 %fa_to_s1 RNMTL1, FAM 57A, GLOD4, GEMIN4, VPS 53, NXN TRUE 1159 1061 13335p1 1767 3108 695309 22201 12del 2? 0.1 %fa_to_both RNMTL1, GLO D4 FALSE ZNF420 [miss ense] 1159 1062 13335s1 1767 3108 695309 22201 12del 2? 0.1 %fa_to_both RNMTL1, GLO D4 FALSE 1159 6959 13599s1 1767 3108 695309 22201 12del 2? 0.1 %mo_to_s1 RNMTL1, GLO D4 TRUE 1166 6903 13196s1 17345 80253 486724 28699 9del 2? 0.1 %fa_to_s1 TRPV1, TRPV 3 TRUETR PV3 1168 4278 11285s1 17351 86303 561469 42839 14del 2? 0.1 %mo_to_s1 CTNS, SHPK FALSE 1168 4306 11532p1 17351 86303 561469 42839 14del 2? 0.1 %mo_to_p1 CTNS, SHPK TRUE 1167 4390 13271p1 17355 07373 552225 1488 2del 1? 0.1 %mo_to_bot hCTNS TRUE 1178 4275 11267p1 17725 85237 259951 1428 7dup 1? 
0.1 %mo_to_bot hTMEM95 TRUE 1178 4276 11267s1 17725 85237 259951 1428 7dup 1? 0.1 %mo_to_bot hTMEM95 TRUE 1199 4234 11067s1 171166 675611 713686 46930 10dup 1? 0.1 %mo_to_s1 DNAH9 TRUE 1202 6919 13328p1 171409 530515 458678 1363373 31dup 1? 0.5 %fa_to_p1 PMP22, CDR T15, TEKT3, N one, FAM18B 2- CDRT4, HS3S T3B1, COX10 FALSE SMAD2 [miss ense] 1207 6859 12579p1 171632 098216 351274 30292 18dup 1? 0.1 %fa_to_both NCRNA00188 , TRPV2 TRUE 1207 6860 12579s1 171632 098216 347445 26463 17dup 1? 0.1 %fa_to_both NCRNA00188 , TRPV2 TRUE 1218 4412 13843p1 172691 871526 920084 1369 2del 1? 0.1 %fa_to_both SPAG5 TRUE AGK [missens e] 1218 4413 13843s1 172691 871526 920084 1369 2del 1? 0.1 %fa_to_both SPAG5 TRUE DOCK10 [mis sense] 1225 4295 11484s1 172931 163429 324349 12715 3dup 1? 0.1 %fa_to_s1 RNF135 TRUE 1237 4269 11196p1 173367 937433 769305 89931 9del 4? 1 %mo_to_p1 SLFN11, SLF N12, SLFN13 TRUE 1237 6948 13507s1 173367 937433 769305 89931 9del 4? 1 %mo_to_s1 SLFN11, SLF N12, SLFN13 FALSE SCOC [frame shift], LAMA1 [missense] 1239 1006 11459p1 173418 206634 183809 1743 4del 1? 0.1 %fa_to_both C17orf66 TRUE DEPDC7 [mis sense] 1239 1007 11459s1 173418 206634 183809 1743 4del 1? 0.1 %fa_to_both C17orf66 TRUE 1264 4388 13195s1 173967 172339 680792 9069 9dup 1? 0.5 %fa_to_s1 KRT15, KRT1 9 TRUE 1300 1009 11479s1 175731 185257 430887 119035 11dup 1? 0.1 %mo_to_s1 GDPD1, YPEL 2 TRUE 1313 4256 11107p1 176707 511367 087403 12290 14del 2? 0.1 %mo_to_bot hABCA6 TRUE TROAP [miss ense] 1313 4257 11107s1 176707 511367 087403 12290 14del 2? 0.1 %mo_to_bot hABCA6 TRUE 1320 4352 12561p1 177287 445972 877385 2926 3del 1? 0.5 %fa_to_both FADS6 FALSE NLRX1 [misse nse], ADAM33 [non sense] 1320 4353 12561s1 177287 445972 877385 2926 3del 1? 0.5 %fa_to_both FADS6 FALSE ROBO3 [miss ense] 1330 4376 13000s1 177462 141174 625793 4382 8dup 1? 0.1 %mo_to_s1 ST6GALNAC1 TRUE 1331 4396 13608p1 177486 511175 398785 533674 38del 1? 
0.1 %fa_to_p1 SEPT9, NCRN A00338, MGA T5B, SEC14L 1 FALSE CSDE1 [nons ense] 1333 6962 13601p1 177619 857976 210508 11929 10dup 1? 0.1 %fa_to_both BIRC5, AFMID TRUE AK1 [missens e] 1333 6963 13601s1 177619 878376 202131 3348 7dup 1? 0.1 %fa_to_both AFMID TRUE 1336 4272 11220s1 177818 841378 188937 524 2del 1? 0.1 %fa_to_both SGSH FALSESG SH 1336 7895 11220p1 177818 841378 188937 524 2del 1? 0.1 %fa_to_both SGSH FALSESG SH 1337 6867 12691s1 177822 192878 323706 101778 36dup 1? 0.1 %mo_to_s1 SLC26A11, R NF213 FALSE PNPLA6 [mis sense], SLC43A1 [mis sense] 1345 6881 12937s1 177982 704879 827282 234 2del 1? 0.1 %fa_to_s1 ARHGDIA TRUE OR2T3 [misse nse] 1351 6838 12424p1 178015 163180 153240 1609 3del 1? 0.1 %mo_to_p1 CCDC57 FALSE 1356 4423 11008p1 1858 0408 645097 64689 8dup 1? 0.1 %mo_to_bot hCLUL1, CETN1 FALSE KATNAL2 [sp lice] 1356 4424 11008s1 1858 0408 645097 64689 8dup 1? 0.1 %mo_to_bot hCLUL1, CETN1 FALSE 1357 4435 11252p1 1868 8573 697355 8782 7del 4? 0.1 %mo_to_p1 ENOSF1 FALSE LCN10 [misse nse] 1373 4442 11429p1 181853 134118 650574 119233 32dup 1? 0.1 %fa_to_both ROCK1 FALSE 1373 4443 11429s1 181853 134118 650574 119233 32dup 1? 0.1 %fa_to_both ROCK1 FALSE IL6R [missens e] 1375 6973 12394s1 182105 696921 059388 2419 3dup 1? 0.1 %fa_to_s1 RIOK3 TRUE 1379 6981 12697p1 182443 617424 628467 192293 10dup 1? 0.1 %fa_to_p1 CHST9, AQP4 , CHST9-AS1 TRUE 1380 6979 12523p1 182896 832929 049312 80983 25dup 1? 0.5 %fa_to_p1 DSG3, DSG4 FALSE G3BP2 [misse nse] 1384 1079 11942p1 183295 341932 954256 837 2del 2? 0.5 %mo_to_p1 ZNF396 FALSE 1384 6997 13443p1 183295 341932 954256 837 2del 2? 0.5 %mo_to_p1 ZNF396 TRUE 1385 4483 12656p1 183383 693033 848648 11718 4dup 1? 0.1 %mo_to_p1 MOCOS TRUE CPZ [missens e] 1392 1082 12285s1 184700 871447 010141 1427 2dup 1? 0.1 %fa_to_s1 C18orf32 FALSE ZNF780A [mis sense] 1397 6975 12420s1 184780 134647 801814 468 3dup 1? 2 %fa_to_s1 MBD1 FALSEM BD1 VPS18 [misse nse] 1406 1087 13726p1 186421 120764 212140 933 2del 4? 
0.1 %mo_to_bot hCDH19 TRUE 1406 1088 13726s1 186421 120764 212140 933 2del 4? 0.1 %mo_to_bot hCDH19 TRUE MAPK8 [miss ense] 1412 4484 12656p1 187041 626270 417855 1593 2dup 1? 0.1 %fa_to_both NETO1 TRUE CPZ [missens e] 1412 4485 12656s1 187041 626270 417855 1593 2dup 1? 0.1 %fa_to_both NETO1 TRUE 1414 4489 12869p1 187222 928172 251798 22517 8dup 1? 0.1 %fa_to_p1 CNDP1 TRUE PRCP [missen se] 1415 7002 13508p1 187481 716674 980858 163692 4dup 1? 0.1 %mo_to_bot hGALR1, MBP FALSE 1415 7003 13508s1 187496 811374 980858 12745 2dup 1? 0.1 %mo_to_bot hGALR1 FALSE 1417 4438 11356p1 187747 034577 891075 420730 28dup 2? 0.1 %fa_to_p1 KCNG2, RBFA , CTDP1, ADN P2, TXNL4A, PQLC1 TRUECT DP1N APRT1 [splice ], SV2B [missense] 1417 4461 11810p1 187766 397578 005231 341256 23dup 2? 0.1 %fa_to_both PARD6G, LOC 100130522, R BFA, ADNP2, TXNL4A, PQL C1 FALSE 1417 4462 11810s1 187769 396878 005231 311263 21dup 2? 0.1 %fa_to_both PARD6G, LOC 100130522, R BFA, ADNP2, TXNL4A, PQL C1 FALSE RSRC1 [nons ense] 1423 1143 12358p1 19178 30261 784945 1919 2del 2? 0.5 %fa_to_p1 ATP8B3 TRUE TSR2 [missen se] 1424 1183 13815p1 19179 99451 802644 2699 4del 2? 0.5 %fa_to_p1 ATP8B3 TRUE 1429 4550 11298p1 19241 02552 418136 7881 5del 1? 0.5 %fa_to_both TMPRSS9 TRUE SLC6A13 [mis sense] 1429 4551 11298s1 19241 02552 418136 7881 5del 1? 0.5 %fa_to_both TMPRSS9 TRUE 1431 1132 12161s1 19380 59463 834981 29035 17dup 2? 0.1 %fa_to_both ZFR2 FALSE 1431 1131 12161p1 19380 71693 833776 26607 15dup 2? 0.1 %fa_to_both ZFR2 FALSE UBR3 [frames hift], CARKD [nonsense] 1431 7051 12837p1 19381 66713 825405 8734 7del 2? 0.1 %fa_to_both ZFR2 TRUE SH3RF3 [miss ense] 1431 7052 12837s1 19381 66713 825405 8734 7del 2? 0.1 %fa_to_both ZFR2 TRUE 1442 8514 13296p1 19668 19516 686913 4962 8dup 1? 0.1 %mo_to_p1 C3 TRUE 1448 7030 12626s1 19689 04916 991143 100652 27dup 6? 0.5 %fa_to_both EMR1, EMR4 P TRUE 1448 4659 12780p1 19689 04917 083755 193264 41dup 6? 
0.5 %fa_to_both EMR1, MBD3 L2, EMR4P, Z NF557 TRUE 1448 4660 12780s1 19689 04916 991143 100652 27dup 6? 0.5 %fa_to_both EMR1, EMR4 P TRUE RABEP1 [mis sense] 1448 1098 11364s1 19689 64087 083755 187347 40dup 6? 0.5 %fa_to_s1 EMR1, MBD3 L2, EMR4P, Z NF557 FALSE 1448 7020 12462s1 19689 64087 083755 187347 40dup 6? 0.5 %mo_to_s1 EMR1, MBD3 L2, EMR4P, Z NF557 TRUE EPB41L3 [non sense] 1448 7028 12626p1 19689 64086 991143 94735 26dup 6? 0.5 %fa_to_both EMR1, EMR4 P TRUE GALNT9 [mis sense] 1448 7112 13504s1 19689 64087 083755 187347 40dup 6? 0.5 %fa_to_both EMR1, MBD3 L2, EMR4P, Z NF557 TRUE 1448 7110 13504p1 19689 71596 989173 92014 24dup 6? 0.5 %fa_to_both EMR1, EMR4 P TRUE 1447 7029 12626p1 19707 50857 083755 8670 6dup 4? 0.5 %fa_to_both ZNF557 TRUE GALNT9 [mis sense] 1447 7031 12626s1 19707 50857 083755 8670 6dup 4? 0.5 %fa_to_both ZNF557 TRUE 1447 4661 12780s1 19707 50857 083755 8670 6dup 4? 0.5 %fa_to_both ZNF557 TRUE RABEP1 [mis sense] 1447 7111 13504p1 19708 13707 083755 2385 3dup 4? 0.5 %fa_to_both ZNF557 TRUE 1453 4644 12616p1 19815 93298 168571 9242 9dup 1? 0.1 %fa_to_both FBN3 FALSE 1453 4645 12616s1 19815 93298 168571 9242 9dup 1? 0.1 %fa_to_both FBN3 FALSE 1465 7006 12252s1 191143 607311 448075 12002 4dup 1? 0.1 %mo_to_s1 RAB3D, TSPA N16 FALSER AB3D 1466 1108 11479p1 191154 171811 548945 7227 6dup 1? 0.1 %mo_to_p1 PRKCSH, CC DC151 TRUE 1472 1125 12106p1 191414 268414 153601 10917 6dup 4? 1 %fa_to_p1 IL27RA TRUE 1472 7116 13512s1 191414 319714 153601 10404 5dup 4? 1 %fa_to_s1 IL27RA TRUE 1472 7101 13487s1 191415 326414 153601 337 2del 4? 1 %mo_to_s1 IL27RA FALSE 1481 4549 11298p1 191870 437518 704917 542 2dup 1? 0.1 %mo_to_p1 CRLF1 TRUE SLC6A13 [mis sense] 1482 7054 12838s1 191903 003119 033553 3522 7dup 1? 0.1 %mo_to_s1 DDX49, COPE TRUE FAM49A [fram eshift] 1519 4580 11724p1 194019 983640 225713 25877 4dup 1? 0.1 %fa_to_p1 CLC, LGALS1 4 TRUE TUBA1A [mis sense] 1528 7010 12394s1 194392 003243 922549 2517 5del 3? 
0.1 %fa_to_s1 TEX101 TRUE 1528 4678 13509p1 194392 003243 922549 2517 5del 3? 0.1 %mo_to_bot hTEX101 TRUE 1528 4679 13509s1 194392 003243 922549 2517 5del 3? 0.1 %mo_to_bot hTEX101 TRUE 1530 1191 14201s1 194528 416145 296877 12716 7dup 1? 0.1 %fa_to_s1 CBLC TRUE 1531 1091 11029p1 194549 410745 495658 1551 3del 1? 0.1 %mo_to_bot hCLPTM1 TRUE 1531 1092 11029s1 194549 410745 495658 1551 3del 1? 0.1 %mo_to_bot hCLPTM1 TRUE 1532 4510 11085s1 194584 879945 899694 50895 44dup 2? 0.5 %mo_to_s1 ERCC2, KLC3 , PPP1R13L TRUEER CC2P OGK [missen se] 1532 1158 13116p1 194584 879945 901576 52777 47dup 2? 0.5 %fa_to_both ERCC2, KLC3 , PPP1R13L TRUEER CC2 1532 1160 13116s1 194585 070445 901576 50872 45dup 2? 0.5 %fa_to_both ERCC2, KLC3 , PPP1R13L TRUEER CC2S RRM5 [misse nse] 1534 7057 12840p1 194611 905946 120111 1052 3del 1? 0.1 %mo_to_bot hEML2 TRUE ATP1B1 [nons ense], TM4SF19 [sp lice] 1534 7058 12840s1 194611 905946 120111 1052 3del 1? 0.1 %mo_to_bot hEML2 TRUE 1544 4554 11304s1 194948 118049 497180 16000 11dup 1? 0.1 %fa_to_s1 GYS1, RUVBL 2 TRUE KLC2 [missen se] 1546 7060 12843p1 194963 625749 675365 39108 37dup 1? 0.5 %mo_to_p1 TRPM4, PPFI A3, HRC FALSE 1547 7087 13296s1 194967 117349 675365 4192 6del 1? 0.5 %mo_to_s1 TRPM4 TRUE 1557 7043 12705p1 195222 717652 543836 316660 29dup 4? 0.1 %fa_to_both ZNF613, ZNF 615, ZNF614, FPR1, FPR2, FPR3, ZNF432, ZNF 350, ZNF649, ZNF577 FALSE DIP2C [frame shift] 1557 7044 12705s1 195227 191152 550121 278210 30dup 4? 0.1 %fa_to_both ZNF613, ZNF 615, ZNF614, FPR2, FPR3, ZNF432, ZNF350, ZNF 649, ZNF577 FALSE PLXNA4 [miss ense] 1557 7082 13215s1 195227 191152 579425 307514 33dup 4? 0.1 %mo_to_s1 ZNF613, ZNF 615, ZNF614, FPR2, FPR3, ZNF432, ZNF350, ZNF 649, ZNF577 TRUE 1568 4697 14110p1 195462 990254 631752 1850 3del 1? 0.1 %mo_to_p1 PRPF31 FALSE PHF3 [missen se] 1588 4674 13171p1 195587 206055 879739 7679 10del 1? 0.1 %mo_to_p1 IL11 TRUE 1596 7074 13162s1 195728 605557 293489 7434 3dup 1? 
0.1 %fa_to_s1 ZIM2 TRUE 1598 1184 13815p1 195783 504957 932849 97800 15del 1? 0.1 %mo_to_p1 ZNF547, ZNF 304, ZNF17, Z NF548, ZNF5 43 TRUE 1601 7045 12716p1 195863 924458 652113 12869 2dup 1? 0.1 %fa_to_both ZNF329 TRUE 1789 8293 12409s1 20373 47043 739325 4621 4dup 5? 0.5 %mo_to_s1 C20orf27 TRUE 1789 1219 13335s1 20373 47043 736254 1550 3dup 5? 0.5 %mo_to_s1 C20orf27 FALSE 1789 4741 12650p1 20373 50433 739325 4282 3dup 5? 0.5 %mo_to_bot hC20orf2 7 FALSE 1789 4742 12650s1 20373 50433 739325 4282 3dup 5? 0.5 %mo_to_bot hC20orf2 7 FALSE 1792 7131 13101s1 20383 52713 893281 58010 12dup 1? 0.1 %fa_to_s1 MAVS, PANK 2 TRUEPA NK2 1792 7992 12383p1 20383 52713 838456 3185 2dup 1? 0.5 %fa_to_both MAVS FALSE 1791 7993 12383s1 20383 52713 838456 3185 2dup 1? 0.5 %fa_to_both MAVS FALSE 1797 1193 11013p1 20786 42397 990962 126723 15dup 1? 0.1 %mo_to_bot hHAO1, T MX4 TRUE TMPRSS2 [m issense] 1797 1194 11013s1 20786 42397 990962 126723 15dup 1? 0.1 %mo_to_bot hHAO1, T MX4 TRUE 1799 7136 13418p1 201385 668313 869158 12475 7dup 1? 0.1 %mo_to_bot hSEL1L2 FALSE CGNL1 [miss ense], DENND5B [m issense], LRRC40 [miss ense] 1799 7137 13418s1 201385 816513 869158 10993 6dup 1? 0.1 %mo_to_bot hSEL1L2 FALSE 1806 7143 13589s1 202546 987925 472161 2282 3del 1? 0.1 %fa_to_s1 NINL TRUE 1808 1207 11872p1 203053 360930 538190 4581 4dup 1? 0.5 %mo_to_p1 PDRG1 TRUE CACNA1D [m issense], KATNAL2 [sp lice] 1824 4726 11501p1 204435 100644 354275 3269 3del 6? 2 %mo_to_p1 SPINT4 FALSE 1824 1224 13926s1 204435 100644 354275 3269 3del 6? 2 %mo_to_s1 SPINT4 TRUE 1828 7995 11304p1 204522 860945 235221 6612 2dup 2? 0.1 %fa_to_p1 SLC13A3 TRUE 1828 7122 12473p1 204522 860945 235221 6612 2dup 2? 0.1 %mo_to_p1 SLC13A3 TRUE 1830 1222 13815s1 204812 445648 166726 42270 11del 2? 0.5 %fa_to_s1 PTGIS TRUE 1842 4712 11196s1 206285 324562 904953 51708 15dup 1? 0.5 %fa_to_both PCMTD2, MY T1 TRUE 1851 7175 13396p1 211962 882519 632603 3778 3del 1? 0.1 %mo_to_p1 CHODL TRUE 1854 4778 11581p1 212829 637128 317408 21037 7dup 1? 
0.1 %fa_to_both ADAMTS5 TRUE 1854 4779 11581s1 212829 637128 317408 21037 7dup 1? 0.1 %fa_to_both ADAMTS5 TRUE 1864 4808 12507p1 213574 277735 899047 156270 8dup 2? 0.5 %mo_to_bot hKCNE2, RCAN1, KCN E1, FAM165B FALSE 1864 7173 13327p1 213574 277735 899047 156270 8dup 2? 0.5 %fa_to_p1 KCNE2, RCA N1, KCNE1, F AM165B TRUE 1864 4810 12507s1 213589 038135 899047 8666 4dup 2? 0.5 %mo_to_bot hRCAN1 FALSE 1868 1233 13593p1 213764 931837 665869 16551 10dup 2? 0.1 %mo_to_bot hDOPEY2 FALSE RGS22 [misse nse] 1868 1234 13593s1 213764 931837 665869 16551 10dup 2? 0.1 %mo_to_bot hDOPEY2 FALSE 1875 4784 11716p1 214280 399742 818068 14071 9dup 1? 0.1 %fa_to_p1 MX1 TRUE 1880 4799 12308p1 214483 662144 837654 1033 2dup 2? 2 %mo_to_p1 SIK1 TRUE 1885 7149 12481p1 214572 491145 826648 101737 48dup 1? 0.1 %fa_to_p1 PFKL, C21orf 2, TRPM2 FALSE 1887 4824 13795p1 214631 898146 323450 4469 4del 1? 0.1 %mo_to_bot hITGB2 TRUEITG B2 1887 4825 13795s1 214631 898146 323450 4469 4del 1? 0.1 %mo_to_bot hITGB2 TRUEITG B2S TX3 [missens e] 1897 4768 11196p1 214801 927548 084239 64964 12dup 1? 0.1 %mo_to_bot hPRMT2, S100B TRUE 1898 7216 12997p1 221726 450817 288963 24455 3dup 5? 1 %mo_to_p1 XKR3 TRUE 1898 7230 13418p1 221726 450817 288963 24455 3dup 5? 1 %mo_to_p1 XKR3 FALSE CGNL1 [miss ense], DENND5B [m issense], LRRC40 [miss ense] 1910 7191 12481p1 222136 946321 473538 104075 17dup 1? 0.1 %fa_to_both SLC7A4, P2R X6 FALSE 1910 7192 12481s1 222136 946321 384640 15177 15dup 1? 0.1 %fa_to_both SLC7A4, P2R X6 FALSE 1917 8016 12680p1 222212 349222 127271 3779 2dup 1? 0.1 %fa_to_both MAPK1 TRUEMA PK1 1927 4841 11090p1 222340 163924 967945 1566306 242dup 1? 0.1 %mo_to_p1 GUSBP11, M MP11, SLC2A 11, SPECC1L , UPB1, CHCHD10, ZN F70, VPREB3 , RAB36, CAB IN1, RTDR1, DDTL , C22orf13, D ERL3, C22orf 15, BCR, SUSD2, SMA RCB1, ADOR A2A, LOC284 889, DDT, GGT5, SNRP D3, GSTT2, G STT1, C22orf 43, IGLL1 TRUEUP B1, ADORA2A, SMARCB1 1930 8025 11090p1 222403 421724 325797 291580 80dup 1? 
0.1 %mo_to_p1 GUSBP11, SM ARCB1, MMP 11, LOC2848 89, DDTL, SLC2A 11, DDT, DER L3, C22orf15, CHCHD10, ZN F70, VPREB3 , GSTT2 TRUESM ARCB1 1931 4854 11242s1 222403 421724 037206 2989 6dup 1? 0.5 %fa_to_s1 GUSBP11 TRUE 1927 8024 11090p1 222443 196124 967945 535984 98dup 1? 0.1 %mo_to_p1 ADORA2A, SN RPD3, CABIN 1, SPECC1L, C22orf13, SU SD2, GGT5, U PB1 TRUEUP B1, ADORA2A 1938 7194 12498s1 222909 288829 095925 3037 2del 1? 0.1 %mo_to_bot hCHEK2 TRUE GPR82 [misse nse] 1938 8522 12498p1 222909 288829 095925 3037 2del 1? 0.1 %mo_to_bot hCHEK2 TRUE CPA4 [missen se] 1947 1263 12810p1 223254 573932 651316 105577 26dup 1? 0.1 %fa_to_p1 C22orf42, No ne, RFPL2, SL C5A4 TRUE 1950 1283 14011p1 223278 822632 794087 5861 5del 1? 0.1 %fa_to_both C22orf28 TRUE 1950 1284 14011s1 223278 822632 794087 5861 5del 1? 0.1 %fa_to_both C22orf28 TRUE 1952 1240 11364p1 223571 386935 742965 29096 13dup 1? 0.1 %mo_to_bot hTOM1 FALSE 1952 1241 11364s1 223571 386935 743202 29333 14dup 1? 0.1 %mo_to_bot hTOM1 FALSE 1954 4856 11247p1 223670 197536 907833 205858 29dup 1? 0.1 %fa_to_p1 TXN2, FOXRE D2, MYH9, EI F3D TRUE 1958 4963 13992s1 223847 789538 483271 5376 5dup 1? 0.1 %fa_to_s1 SLC16A8, BA IAP2L2 TRUE 1966 7215 12937s1 224075 486740 762526 7659 9del 1? 0.1 %fa_to_s1 ADSL TRUEAD SLO R2T3 [missen se] 1979 4893 11828p1 224278 116642 899582 118416 8dup 2? 0.1 %mo_to_p1 NFAM1, SERH L TRUE 1978 8034 11654p1 224295 619142 962344 6153 2dup 8? 2 %fa_to_both SERHL2 TRUE 1978 1267 13116p1 224295 619142 962344 6153 2del 8? 2 %fa_to_both SERHL2 TRUE 1978 4948 13625p1 224295 619142 962344 6153 2del 8? 2 %fa_to_both SERHL2 TRUE 1978 4950 13625s1 224295 619142 962344 6153 2del 8? 2 %fa_to_both SERHL2 TRUE 1984 4921 12375s1 224432 281444 377341 54527 19dup 1? 0.1 %fa_to_s1 PNPLA3, SAM M50 TRUE 1987 1255 12011s1 224488 350344 893436 9933 2dup 1? 0.1 %fa_to_both LDOC1L TRUE 1987 7445 12011p1 224488 350344 893436 9933 2dup 1? 
0.1 %fa_to_both LDOC1L TRUE FAM45A [missense] 1988 1273 13606s1 224512 112345 198044 76921 8del 1? 0.1 %fa_to_s1 PRR5-ARHGAP8, PRR5 TRUE ZFYVE26 [missense] 1994 7186 12409p1 224728 716147 308084 20923 3del 1? 0.1 %mo_to_both TBC1D22A TRUE LRP2 [nonsense] 1994 8318 12409s1 224728 716147 308084 20923 3del 1? 0.1 %mo_to_both TBC1D22A TRUE 1998 4887 11716p1 225101 620351 018700 2497 6del 2? 0.1 %fa_to_both CHKB-CPT1B, CPT1B TRUE 1998 4888 11716s1 225101 620351 018700 2497 6del 2? 0.1 %fa_to_both CHKB-CPT1B, CPT1B TRUE

cnvrID | callID | familyID | rel | Chromosome | Start (hg19) | Stop (hg19) | length (bp) | length (exons) | state | Frequency in 411 quads | Genes
2363 | 2111 | 11079 | p1 | 3 | 195778812 | 197273314 | 1,494,502 | 161 | del | 1 | PCYT1A, FBXO45, LRRC33, WDR53, NCBP2, TM4SF19-TCTEX1D2, DLG1, MFI2, SENP5, RNF168, ZDHHC19, OSTalpha, TFRC, C3orf43, CEP19, PIGX, TCTEX1D2, PIGZ, UBXN7, PAK2, BDH1
1080 | 4033 | 11090 | p1 | 16 | 29675049 | 30199897 | 524,848 | 187 | del | 7 | DOC2A, ASPHD1, CORO1A, TBX6, KIF22, CDIPT, QPRT, YPEL3, PPP4C, MAPK3, SPN, MVP, FAM57B, ALDOA, INO80E, SEZ6L2, TAOK2, KCTD13, MAZ, PRRT2, GDPD3, C16orf92, C16orf53, TMEM219, C16orf54, HIRIP3
2810 | 2737 | 11154 | p1 | 7 | 72848337 | 74120764 | 1,272,427 | 209 | dup | 1 | STX1A, WBSCR27, WBSCR22, LAT2, LIMK1, WBSCR28, RFC2, FZD9, VPS37D, ABHD11, CLIP2, CLDN3, CLDN4, BCL7B, ELN, MLXIPL, DNAJC30, GTF2IRD1, BAZ1B, TBL2, EIF4H, GTF2I
888 | 3883 | 11265 | p1 | 15 | 22845761 | 23062319 | 216,558 | 57 | del | 6 | NIPA2, NIPA1, CYFIP1, TUBGCP5
1251 | 4289 | 11353 | p1 | 17 | 34892950 | 36104875 | 1,211,925 | 159 | del | 1 | LHX1, DUSP14, MRM1, ACACA, DDX52, DHRS11, SYNRG, HNF1B, AATF, PIGW, TADA2A, GGNBP2
1080 | 4069 | 11433 | p1 | 16 | 29808213 | 30199897 | 391,684 | 175 | del | 7 | DOC2A, ASPHD1, CORO1A, TBX6, KIF22, CDIPT, YPEL3, PPP4C, MAPK3, MVP, FAM57B, ALDOA, INO80E, SEZ6L2, TAOK2, KCTD13, MAZ, PRRT2, GDPD3, C16orf92, C16orf53, TMEM219, HIRIP3
1180 | 4305 | 11532 | p1 | 17 | 6329934 | 7352107 | 1,022,173 | 366 | dup | 1 | CLEC10A, ACADVL, AIPL1, C17orf74, C17orf100, SPEM1, FAM64A, LOC100506713, EIF5A, GPS2, DLG4, CTDNEP1, MIR497HG, TEKT1, RNASEK-C17ORF49, GABARAP, DVL2, SLC16A11, SLC16A13, NEURL4, FGF11, CLDN7, C17orf81, NLGN2, C17orf61-PLSCR3, TNK1, MED31, SLC13A5, ACAP1, ASGR1, ASGR2, TMEM102, TMEM95, KIAA0753, PHF23, XAF1, PLSCR3, TXNDC17, SLC2A4, CHRNB1, KCTD11, PITPNM3, BCL6B, FBXO39, YBX2
1080 | 4114 | 12100 | p1 | 16 | 29675049 | 30199897 | 524,848 | 193 | del | 7 | DOC2A, ASPHD1, CORO1A, TBX6, KIF22, CDIPT, QPRT, YPEL3, PPP4C, MAPK3, SPN, MVP, FAM57B, ALDOA, INO80E, SEZ6L2, TAOK2, KCTD13, MAZ, PRRT2, GDPD3, C16orf92, C16orf53, TMEM219, C16orf54, HIRIP3
1057 | 4123 | 12162 | p1 | 16 | 21636251 | 21737979 | 101,728 | 26 | dup | 5 | METTL9, OTOA
1967 | 4904 | 12224 | p1 | 22 | 40521821 | 40762526 | 240,705 | 38 | del | 1 | ADSL, TNRC6B
1080 | 4144 | 12308 | p1 | 16 | 29675049 | 30199897 | 524,848 | 193 | del | 7 | DOC2A, ASPHD1, CORO1A, TBX6, KIF22, CDIPT, QPRT, YPEL3, PPP4C, MAPK3, SPN, MVP, FAM57B, ALDOA, INO80E, SEZ6L2, TAOK2, KCTD13, MAZ, PRRT2, GDPD3, C16orf92, C16orf53, TMEM219, C16orf54, HIRIP3
749 | 3705 | 12343 | p1 | 13 | 41301637 | 41910892 | 609,255 | 53 | dup | 1 | KBTBD7, KBTBD6, MTRF1, MRPS31, NAA16, MIR320D1, ELF1, SLC25A15, WBP4
1120 | 6725 | 12691 | p1 | 16 | 69971020 | 72923861 | 2,952,841 | 487 | del | 1 | WWP2, HP, SF3B3, DHX38, ZNF19, CLEC18A, CLEC18C, FUK, PMFBP1, HYDIN, TAT, ATXN1L, PDPR, MARVELD3, DDX19A, DDX19B, VAC14, ST3GAL2, None, MTSS1L, HPR, CALB2, KIAA0174, COG4, IL34, ZFHX3, ZNF23, PHLPP2, CHST4, AP1G1, FTSJD1, AARS, DHODH, TXNL4B
1682 | 5673 | 12735 | p1 | 2 | 105438016 | 109293154 | 3,855,138 | 163 | del | 1 | POU3F3, SULT1C3, ST6GAL2, NCK2, MRPS9, UXS1, SULT1C2, SULT1C4, C2orf40, FHL2, LIMS1, C2orf49, GCC2, SLC5A7, GPR45, TGFBRAP1
1079 | 4186 | 12736 | p1 | 16 | 29464953 | 30212614 | 747,661 | 222 | dup | 3 | DOC2A, ASPHD1, CORO1A, TBX6, PRRT2, CDIPT, QPRT, YPEL3, PPP4C, SLX1B, MAPK3, SPN, BOLA2B, MVP, FAM57B, ALDOA, INO80E, SEZ6L2, TAOK2, KCTD13, SLX1A-SULT1A3, MAZ, KIF22, GDPD3, C16orf92, C16orf53, TMEM219, C16orf54, HIRIP3
568 64091 2975p1 111 2134882 612 2852379 1,503,55 386d el1 SORL1, CRTAM, UBASH 3B, BSX , MIR100 HG, C11 orf63 1475 70651 3018p1 19 1505230 01 6038149 985,849 234dup 1EPHX3 , CYP4F 8, PGLY RP2, CA SP14, R ASAL3, CYP4F3 , CYP4F 2, CCDC10 5, CYP4 F12, CY P4F11, A KAP8L, OR1I1, I LVBL, SY DE1, BR D4, OR7C2, CYP4F2 2, NOTC H3, OR1 0H2, OR 10H3, O R10H1, AKAP8, OR10H5 , SLC1A 6, WIZ 143 84133 46p1 19 3545080 9571242 62 ,167,346 229del 1TMED 5, GCLM , RWDD 3, CNN3 , DR1, A LG14, M TF2, TM EM56-R WDD3, DNTTIP2 , SLC44 A3, ABC A4, F3, M IR760, T MEM56, ARHGA P29, CCDC18 , ABCD3 , BCAR3 , FNBP1 L 150 85133 46p1 1 993586 27 100983 836 1,625,20 9162 d el1 CCDC76 , LRRC3 9, AGL, HIAT1, C DC14A, PALMD, LPPR5, LPPR4, SASS6, SLC35A 3, FRRS 1, MIR54 8D1, DB T, RTCD 1 161 86133 46p1 111 2795166 1134719 30 676,764 87del 1WNT2 B, CAPZ A1, MOV 10, FAM 19A3, SL C16A1, PPM1J, RHOC, S T7L, CTTNBP 2NL 891 39971 3355p1 15 2701754 82 7188573 171,025 12dup 1GABR A5, GAB RB3 504 73613 726p1 11 5694936 76 0233630 3,284,26 3348d el1 DTX4, O R5B21, LPXN, S SRP1, O R5B2, L RRC55, UBE2L6 , OR1S1 , OR1S2, SERPIN G1, PRG 3, PRG2 , MS4A7 , P2RX3 , MS4A5 , MS4A2 , MS4A3, MS4A1 , YPEL4 , OR5A1 , MIR130 A, OR10 Q1, OR4 D10, OR 4D11, MS4A14 , TIMM1 0, OR10 V1, OR5 B3, MED 19, PATL 1, MPEG 1, None, OR4D6, OR10W 1, MRPL 16, OR4 D9, OR5 B12, OR 5B17, M S4A6A, ZFP91- CNTF, T NKS1BP 1, TMX2 , ZDHHC 5, MS4A 4A, SLC 43A3, SM TNL1, SLC43A 1, TMX2 -CTNND 1, OR9Q 2, OSBP , OR9Q1 , APLNR , PLAC1 L, GIF, CLP1, O R5A2, S TX3, GLY ATL1, TC N1, GLY ATL2, G LYAT, OR 5AN1, MS4A6E , FAM11 1A, FAM 111B, R TN4RL2 1127 99113 815p1 16 7631156 07 6513435 201,875 12del 1CNTN AP4 2962 29931 3876p1 83 6641928 4305471 26 ,412,784 598dup 1IDO2, TM2D2, IDO1, S TAR, BA G4, ASH 2L, AP3M 2, LETM 2, IKBKB , ADAM18 , ADAM3 2, C8orf 4, HTRA 4, HGSN AT, RNF 170, AN K1, ZNF 703, CHRNB 3, DDHD 2, PPAP DC1B, W HSC1L1 , NKX6-3 , GPR12 4, KCNU 1, GOLGA7 , MYST3 , SGK19 6, POLB , FNTA, HOOK3, LSM1, P LAT, CH RNA6, BRF2, A DAM2, C 8orf40, C 
8orf86, F GFR1, S LC20A2 , THAP1 , RAB11 FIP1, SFRP1, GOT1L1 , GINS4, ERLIN2 , TACC1 , DKK4, ADAM9, VDAC3, AGPAT6 , ADRB3 , EIF4EB P1, PLE KHA2, P ROSC, Z MAT4 2964 29941 3876p1 84 3197329 4998689 26 ,789,563 142dup 1None, CEBPD , PRKDC , KIAA01 46, MCM 4, SNAI2 , UBE2V 2, EFCA B1, POTEA, C8orf22 1780 19111 1241s1 224 1615961 2417091 23 93,162 40dup 3AQP1 2B, AQP 12A, KIF 1A 1195 68431 2480s1 17 1034792 01 0356645 8,725 16del 1MYH4 114 93134 47s1 14 8688408 4870520 9 16,801 12dup 1SLC5A 9 3059 62531 3601s1 93 5649844 3575412 6 104,282 134del 1CA9, C CDC107 , MSMP, CREB3 , TPM2, TLN1, C 9orf100, SIT1, G BA2, RGP1 1643 2081 3629s1 26 1505299 6152858 9 23,290 14dup 1USP34 callID sampleID chr start stop i1M_star ti1M _stop probes aCHG_m eani1 M_PASS aCGH_pa ss Notes 1749123 83.p1 chr1 3413218 3417328 260 .048704 False Pos itive 5495126 18.p1 chr1 11134287 11155938 9-0. 762897 YES 1561114 33.s1 chr1 40204572 40312969 260 .442855 YES 1581116 67.s1 chr1 42693553 42744343 42693597 42837441 250 .404631Y ES YES 4011872 .p1 chr1 65730593 65831879 65666329 65823558 690 .466047Y ES YES 5555130 97.p1 chr1 65849875 65855310 430 .530438 YES 1499111 18.p1 chr1 66837995 67000051 30-0 .057537 False Pos itive 6912810 .p1 chr1 87029343 87038403 87028669 87038695 72-0 .519149Y ES YES 8299132 96.s1 chr1 11324518 4113 264970 250 .355266 YES 5512161 .s1 chr1 18255035 9182 555941 18254901 9182 564062 43-0 .571282Y ES YES 3711715 .s1 chr1 18509780 0185 130057 450 .507937 YES 6621189 5.s1 chr10 225952 532470 60 0.54439 YES 3250123 17.s1 chr10 18289595 18331762 18276761 18415963 380 .401112Y ES YES 8347129 97.p1 chr10 54527896 54531395 54524658 54537447 26-0 .577977Y ES YES 8348129 97.s1 chr10 54527896 54531395 54524658 54536839 26 -0.2795Y ES YES 3175112 67.s1 chr10 72604229 72645686 72601813 72824456 20-0 .629198Y ES YES 6270125 10.p1 chr10 90703553 90707143 27 0.00287 PPG, ACT A2 6271125 10.s1 chr10 90703553 90707143 270 .084178 PPG, ACT A2 3253123 83.s1 chr10 10159413 6101 596047 0 FP/Not 
Te sted due t o lack of a CGH prob es 3201115 19.p1 chr10 12221681 7122 349014 12181455 3122 500375 1-0. 887886Y ES YES 6334131 62.s1 chr10 13516887 3135 179599 0 FP/Not Te sted due t o lack of a CGH prob es 6971156 9.p1 chr11 5862185 5878932 23-0 .261248 YES 3345116 67.p1 chr11 18727363 18729842 19-0 .433224 YES 3347116 67.s1 chr11 18727363 18729842 19-0 .224613 YES 7281281 0.p1 chr11 32697110 32781789 32699987 32815580 42-0 .566457Y ES YES 6402128 36.s1 chr11 59620447 59623531 23-0 .879108 YES 6401128 36.p1 chr11 59620675 59622308 11-0 .839071 YES 7101178 8.s1 chr11 63057637 63059115 120 .456157 YES 7466112 29.p1 chr11 10055840 9100 859532 108 0.047092 PPG, ARH GAP42L 7715111 15.s1 chr11 10055840 9100 831720 94-0 .060677 PPG, ARH GAP42L 3346116 67.s1 chr11 10825664 6108 264105 580 .369384 YES 7717116 67.p1 chr11 10825664 6108 264105 580 .518831 YES 6407128 51.s1 chr11 13417701 7134 257553 13416461 8134 346119 390 .494821Y ES YES 7722112 82.p1 chr12 4651049 4668159 36 0.1066 PPG, RAD 51AP1 7723115 19.p1 chr12 4651049 4668159 36-0 .003573 PPG, RAD 51AP1 3495112 82.s1 chr12 4655474 4668159 29 0.03143 PPG, RAD 51AP1 8313129 97.p1 chr12 25264705 25267804 23-0 .992749 YES 8314129 97.s1 chr12 25264705 25267804 23-0 .563687 YES 6506128 36.s1 chr12 49688983 49691056 170 .150829 Confirmed w/ manu al inspect ion 3463110 90.p1 chr12 49688983 49691056 170 .274504 YES 3540118 28.p1 chr12 51203238 51213562 14-0 .075046 PPG, ATF 1 7741189 5.s1 chr12 10929078 1109 293251 19-0 .361741 YES 3489112 41.p1 chr12 12087592 9120 884632 650 .508616 YES 6505128 36.p1 chr12 12087592 9120 884632 12087372 6120 888619 650 .390249Y ES YES 7477120 11.s1 chr13 20425494 20426320 20420175 20451410 60. 
482955Y ES YES 3671114 12.p1 chr13 96508411 96515968 59-0 .863395 YES 3672114 12.s1 chr13 96508411 96515968 59-0 .582202 YES 6555128 29.s1 chr13 11413815 4114 175048 180 .532308 YES 3661111 96.s1 chr13 11500759 5115 091756 170 0.201715 False Pos itive 8651230 4.p1 chr14 65016715 65019579 220 .321413 YES 8771281 0.p1 chr14 65016715 65019579 220 .389589 YES 8420125 82.p1 chr14 69919957 69969596 69922552 69978718 230 .504025Y ES YES 3782111 18.p1 chr14 99182528 99183611 99181386 99193587 9-0. 694542Y ES YES 3826123 17.s1 chr14 10040236 5100 405664 10039831 8100 407548 250 .274271Y ES YES 3954123 17.s1 chr15 40993261 41001314 45-0 .759746 YES 6651130 18.p1 chr15 43627893 43644143 190 .140645 PPG, ADA L 9061122 9.s1 chr15 76426604 76641077 76394419 76640658 420 .400188Y ES YES 6751130 18.p1 chr16 4624972 4642368 4614859 4643452 21-0 .710982Y ES YES 4043111 18.p1 chr16 4871380 4871598 2-0. 019551 FP/Not Te sted due t o lack of a CGH prob es 6696124 41.p1 chr16 21655610 21763826 250 .513189 YES 9431122 9.p1 chr16 31477170 31488897 31478711 31489033 120 .403372Y ES YES 9451122 9.s1 chr16 31477170 31487883 35183650 35284399 11 0.25197Y ES YES 7863114 12.p1 chr16 84693365 84812688 0 FP/Not Te sted due t o lack of a CGH prob es 1036125 78.p1 chr17 6010 11981 8547 34203 460 .124371Y ES 4306115 32.p1 chr17 3518630 3561469 3503527 3561396 68-0 .731878Y ES YES 4276112 67.s1 chr17 7258523 7259951 100 .094905 False Pos itive 4295114 84.s1 chr17 29311634 29324349 29301608 29319956 170 .592237Y ES YES 4353125 61.s1 chr17 72874459 72877385 24-0 .200099 YES 6838124 24.p1 chr17 80151631 80153240 13-0 .592442 YES 4435112 52.p1 chr18 688573 697355 691173 695030 70-0 .693998Y ES YES 1079119 42.p1 chr18 32953419 32954256 6-0. 
445434 YES 4485126 56.s1 chr18 70416262 70417855 70402753 70412150 110 .121603Y ES 1132121 61.s1 chr19 3805946 3834981 990 .136737 Confirmed w/ manu al inspect ion 1131121 61.p1 chr19 3807169 3833776 980 .206042 Confirmed w/ manu al inspect ion 7052128 37.s1 chr19 3816671 3825405 69-0 .449766 YES 7006122 52.s1 chr19 11436073 11448075 140 .008268 PPG, RAB 3D 7054128 38.s1 chr19 19030031 19033553 280 .176093 Confirmed w/ manu al inspect ion 1091110 29.p1 chr19 45494107 45495658 12-0 .270947 YES 1092110 29.s1 chr19 45494107 45495658 12-0 .421898 YES 4554113 04.s1 chr19 49481180 49497180 760 .262305 YES 7087132 96.s1 chr19 49671173 49675365 32-0 .303428 YES 7074131 62.s1 chr19 57286055 57293489 58 0.0126 False Pos itive 5709132 96.s1 chr2 1437209 1479843 1426346 1520676 23 0.44676Y ES YES 1898111 18.p1 chr2 44527109 44541090 44519142 44545576 180 .414627Y ES YES 1985122 28.p1 chr2 74129486 74166149 128 0.007104 False Pos itive 5682128 51.s1 chr2 96780544 97784254 97845100 98202258 840 .401272Y ES YES 5682128 51.s1 chr2 96780544 97784254 96731109 97577661 840 .401272Y ES YES 1931114 84.s1 chr2 98263529 98275940 180 .240511 YES 7964110 45.p1 chr2 10240718 1102 416105 700 .437141 YES 2016125 52.s1 chr2 11334644 2113 404739 300 .008806 False Pos itive 5683128 51.s1 chr2 19271116 8193 059250 19271116 2193 252329 70-0 .634688Y ES YES 5689129 97.p1 chr2 23063226 9230 724290 23063087 4230 712174 135 0.486573 YES YES 1601165 9.s1 chr2 23207095 1232 072965 16-0 .557958 YES 8293124 09.s1 chr20 3734704 3739325 350 .021807 PPG, C20 orf27 4742126 50.s1 chr20 3735043 3739325 330 .081261 PPG, C20 orf27 7992123 83.p1 chr20 3835271 3838456 240 .270166 YES 7993123 83.s1 chr20 3835271 3838456 3822148 3830474 240 .113595Y ES 1194110 13.s1 chr20 7864239 7990962 7549585 8317018 230 .636108Y ES YES 1207118 72.p1 chr20 30533609 30538190 36 0.2861 YES 4712111 96.s1 chr20 62853245 62904953 260 .368368 YES 7149124 81.p1 chr21 45724911 45826648 220 .512218 YES 7216129 97.p1 chr22 17264508 17288963 
120 .529071 YES 7191124 81.p1 chr22 21369463 21473538 21374550 21465780 2-0. 040647Y ES 4841110 90.p1 chr22 23401639 24967945 24182500 24999104 401 0.385952 YES YES 4841110 90.p1 chr22 23401639 24967945 23648009 24163081 401 0.385952 YES YES 80251109 0.p1 chr22 24034217 24325797 24182500 24999104 90.5 26062YES YES 80251109 0.p1 chr22 24034217 24325797 23648009 24163081 90.5 26062YES YES 80241109 0.p1 chr22 24431961 24967945 24182500 24999104 90.4 62384YES YES 80241109 0.p1 chr22 24431961 24967945 23648009 24163081 90.4 62384YES YES 12631281 0.p1 chr22 32545739 32651316 32530256 32703072 750 .289551Y ES YES 48931182 8.p1 chr22 42781166 42899582 42788316 42896474 0 YES 12551201 1.s1 chr22 44883503 44893436 800 .420071 YES 83181240 9.s1 chr22 47287161 47308084 47285396 47443386 10-0 .673402Y ES YES 57661258 8.s1 chr3 7594808 7782093 7516145 7802705 1-0.8 41887YES YES 83771285 1.s1 chr3 9867483 9871079 0 FP/Not Te sted due t o lack of a CGH prob es 57611248 1.p1 chr3 12632296 12791331 12633293 12806142 0 YES 22321238 3.s1 chr3 35833873 35835450 35811601 35938795 120 .366609Y ES YES 57561225 2.s1 chr3 81627075 81640315 81594581 81644911 18-0 .836112Y ES YES 57971316 2.s1 chr3 113588353 113619993 113576614 113619872 15-0 .708712Y ES YES 23712304 .p1c hr3 132277814 132280061 16-0 .510444 YES 23012106 .s1c hr3 137781657 137803095 90.4 62725 YES 21521130 4.s1 chr3 141884463 142084021 141821227 142085310 108 0.430274 YES YES 57691263 1.s1 chr3 141889166 142084208 141821227 142065934 108 0.44434Y ES YES 27812161 .p1c hr4 6594899 6613005 6594947 6611813 25-0 .279563Y ES YES 27912161 .s1c hr4 6594899 6613005 6594947 6611813 25-0 .454313Y ES YES 58221240 9.s1 chr4 89978088 90035703 89967292 90149589 28- 0.69108Y ES YES 23731237 0.p1 chr4 159590764 159616795 130 .487415 YES 31411659 .s1c hr5 40931165 40937792 40927429 40943442 53-0 .685554Y ES YES 24421145 6.s1 chr5 81283389 81354421 350 .013013 PPG, ATG1 0 58901285 1.s1 chr5 81283389 81354421 350 .001057 PPG, ATG1 0 58761258 
8.s1 chr5 110427986 110446977 260 .488264 YES 24901237 0.p1 chr5 110430617 110446977 110417428 110441533 220 .550374Y ES YES 24921237 0.s1 chr5 110430617 110446977 220 .335151 YES 33612161 .p1c hr5 121309890 121358102 250 .260764 YES 33712161 .s1c hr5 121309890 121362821 121297115 121366608 270 .424875Y ES YES 34612390 .s1c hr5 141312823 141314151 9-0.2 77458 YES 24541182 8.p1 chr5 156456743 156479665 11 0.50319 YES 75121101 3.s1 chr6 49426794 49440571 410 .415165 YES 37811229 .p1c hr6 51747890 51752043 33-0 .798963 YES 39412106 .s1c hr6 56915571 56919661 56917538 56954550 310 .496498Y ES YES 26121155 1.p1 chr6 88315634 88318947 26-0 .759539 YES 59621265 5.s1 chr6 146870599 146875741 146870930 146875322 40-0 .510074Y ES YES 26551265 0.s1 chr6 162683556 162864505 162632171 163059714 0 YES 26101151 9.p1 chr6 169617915 169646376 169243539 169795975 0 YES 46512304 .p1c hr7 98628206 98633339 0 FP/Not Te sted due t o lack of a CGH prob es 47012578 .p1c hr7 150706017 150725697 150705842 150723467 0 YES 60671283 7.s1 chr7 151833916 152027824 151826268 152055852 10.4 83768YES YES 29181119 6.s1 chr8 190895 382935 116 0.250819 YES 84901263 1.s1 chr8 27378399 27380025 13-0 .520584 YES 53011715 .s1c hr8 57078801 57080828 57055054 57098250 160 .498051Y ES YES 29221125 2.p1 chr8 144295142 144450815 147 0.374638 YES 59312106 .s1c hr9 214508 340321 174447 359712 620 .929548Y ES YES 61941265 5.p1 chr9 368017 677009 788730 864122 630 .485291Y ES YES 61941265 5.p1 chr9 368017 677009 347559 699065 630 .485291Y ES YES 59712161 .s1c hr9 5968018 6015607 5966898 6135411 230 .429288Y ES YES 61771225 2.s1 chr9 33166755 33261167 33141752 33260632 490 .519623Y ES YES 60712741 .s1c hr9 35228011 35237823 35171905 35238095 77-0 .957764Y ES YES 59912578 .p1c hr9 35662942 35664489 12-0 .196435 YES 61911263 7.p1 chr9 35662942 35664489 12-0 .457227 YES 62011282 9.s1 chr9 35662942 35664489 12- 0.39486 YES 62331329 6.s1 chr9 125562401 125589066 130 .484615 YES

False Positive Rate

                            False Pos.  Validated  Total Tested  FPR   Chi Square/Fisher Exact  P value
Dataset   Iossifov et al.   1           51         52            0.02  χ² = 7.26                0.026
          O'Roak et al.     0           44         44            0.00
          Sanders et al.    6           53         59            0.10
Affected  Probands          3           68         71            0.04  OR = 0.882               1
          Siblings          4           80         84            0.05
Size      2 exons           2           26         28            0.07  OR = 0.533               0.611
          3+ exons          5           122        127           0.04
Del/Dup   Deletions         1           54         55            0.02  OR = 0.275               0.432
          Duplications      6           89         95            0.06
Total                       7           148        155           0.05

False Negative Rate

                             Missed  Found  Total  FNR   Chi Square/Fisher Exact  P value
Dataset    Iossifov et al.   76      111    187    0.41  χ² = 22.3                < 1×10^-4
           O'Roak et al.     21      77     98     0.21
           Sanders et al.    55      201    256    0.21
Offspring  Probands          93      220    313    0.29  OR = 1.21                0.355
           Siblings          59      169    228    0.26
Size       2 exons           39      24     63     0.62  OR = 0.19                < 1×10^-8
           3+ exons          113     365    478    0.24
Total                        152     389    541    0.28

Sample    SRS Score  FSIQ  SRS Discordant  Sex     # of rare CNVs  # of de novo SNVs  # of de novo CNVs
11000.p1  70         65    FALSE           male    1               0                  0
11000.s1  48               FALSE           female  0               0                  0
11008.p1  65         129   FALSE           male    1               1                  0
11008.s1                   FALSE           male    1               0                  0
11010.p1  76         65    TRUE            male    0               0                  0
11010.s1  40               TRUE            male    0               0                  0
11013.p1  90         132   TRUE            male    1               0                  0
11013.s1  47               TRUE            male    2               0                  0
11014.p1  75         148   FALSE           male    1               0                  0
11014.s1  48               FALSE           male    0               0                  0
11029.p1  90         29    TRUE            female  1               0                  0
11029.s1  41               TRUE            male    1               0                  0
11045.p1  60         79    FALSE           female  1               0                  0
11045.s1  53               FALSE           male    0               0                  0
11057.p1  64         89    FALSE           male    0               0                  0
11057.s1  34               FALSE           male    1               0                  0
11060.p1  66         100   FALSE           male    1               0                  0
11060.s2  40               FALSE           male    0               0                  0
11066.p1  82         88    TRUE            male    1               0                  0
11066.s2  37               TRUE            male    0               0                  0
11067.p1  85         130   TRUE            male    1               0                  0
11067.s1  40               TRUE            female  2               0                  0
11074.p1  63         68    FALSE           male    0               1                  0
11074.s1  35               FALSE           male    0               0                  0
11075.p1  67         39    FALSE           male    2               0                  0
11075.s1  36               FALSE           male    2               0                  0
11077.p1  79         33    TRUE            male    0               1                  0
11077.s1  37               TRUE            male    0               0                  0
11079.p1  83         53    TRUE            male    0               0                  1
11079.s1  42               TRUE            female  0               0                  0
11085.p1  82         61    TRUE            male    0               0                  0
11085.s1  39               TRUE            female  1               0                  0
11089.p1  84         40    TRUE            male    0               0                  0
11089.s1  57               TRUE            male    0               0                  0
11090.p 1 83 56 TRUE male 4 0 1 11090.s 1 42 TRUE male 0 0 0 11092.p 1 76 109 TRUE male 1 0 0 11092.s 1 35 TRUE female 0 0 0 11094.p 1 76 87 TRUE male 2 0 0 11094.s 1 46 TRUE male 0 0 0 11107.p 1 90 30 TRUE male 3 0 0 11107.s 1 39 TRUE male 2 0 0 11108.p 1 75 104 FALSE male 2 0 0 11108.s 1 51 FALSE male 1 0 0 11114.p 1 90 40 TRUE female 1 1 0 11114.s 1 39 TRUE female 1 0 0 11115.p 1 90 37 TRUE female 1 0 0 11115.s 1 50 TRUE female 1 0 0 11117.p 1 89 121 TRUE female 0 0 0 11117.s 1 41 TRUE female 1 0 0 11118.p 1 80 93 TRUE female 4 0 0 11118.s 1 51 TRUE male 1 0 0 11132.p 1 90 47 FALSE male 0 1 0 11132.s 1 64 FALSE male 0 0 0 11146.p 1 90 85 TRUE male 2 0 0 11146.s 1 40 TRUE male 2 0 0 11154.p 1 90 90 TRUE male 0 0 1 11154.s 1 45 TRUE female 1 0 0 11172.p 1 71 63 FALSE male 0 0 0 11172.s 1 40 FALSE female 0 0 0 11180.p 1 27 FALSE female 2 0 0 11180.s 1 43 FALSE male 3 0 0 11190.p 1 80 69 TRUE male 1 0 0 11190.s 1 55 TRUE female 2 0 0 11196.p 1 80 112 TRUE male 5 0 0 11196.s 1 53 TRUE male 3 0 0 11203.p 1 84 86 TRUE female 0 0 0 11203.s 1 53 TRUE female 2 0 0 11219.p 1 51 99 FALSE male 2 0 0 11219.s 1 36 FALSE female 1 0 0 11220.p 1 75 80 FALSE female 1 0 0 11220.s 1 41 FALSE female 3 0 0 11229.p 1 73 63 FALSE male 3 0 0 11229.s 1 48 FALSE male 2 0 0 11241.p 1 90 76 TRUE male 1 0 0 11241.s 1 38 TRUE male 1 0 1 11242.p 1 79 94 TRUE male 1 0 0 11242.s 1 48 TRUE male 1 0 0 11247.p 1 87 128 TRUE male 1 0 0 11247.s 1 56 TRUE female 0 0 0 11252.p 1 86 78 FALSE male 2 0 0 11252.s 1 62 FALSE male 2 0 0 11265.p 1 90 106 TRUE male 0 0 1 11265.s 1 47 TRUE female 0 0 0 11267.p 1 90 85 TRUE female 2 0 0 11267.s 1 54 TRUE male 2 0 0 11282.p 1 89 75 TRUE male 1 0 0 11282.s 1 42 TRUE male 1 0 0 11285.p 1 13 FALSE male 0 0 0 11285.s 1 44 FALSE male 1 0 0 11290.p 1 65 119 FALSE male 0 0 0 11290.s 1 39 FALSE female 0 0 0 11291.p 1 68 86 FALSE male 0 1 0 11291.s 1 51 FALSE female 0 0 0 11298.p 1 90 141 TRUE male 3 0 0 11298.s 1 37 TRUE male 2 0 0 11301.p 1 82 101 TRUE male 1 0 0 11301.s 
1 44 TRUE female 1 0 0 11304.p 1 90 53 TRUE male 1 0 0 11304.s 1 51 TRUE female 2 0 0 11316.p 1 90 47 TRUE female 1 0 0 11316.s 1 54 TRUE female 0 0 0 11336.p 1 81 123 TRUE male 3 0 0 11336.s 1 45 TRUE female 3 0 0 11353.p 1 90 79 TRUE female 1 0 1 11353.s 1 40 TRUE male 0 0 0 11356.p 1 90 72 TRUE female 3 1 0 11356.s 1 42 TRUE male 1 0 0 11364.p 1 72 106 FALSE male 1 0 0 11364.s 1 40 FALSE female 3 0 0 11382.p 1 90 76 FALSE male 0 0 0 11382.s 1 63 FALSE female 0 1 0 11390.p 1 90 66 TRUE female 0 0 0 11390.s 1 45 TRUE female 0 0 0 11411.p 1 90 61 TRUE male 0 0 0 11411.s 1 41 TRUE female 1 0 0 11412.p 1 70 107 FALSE male 2 0 0 11412.s 1 41 FALSE female 1 0 0 11429.p 1 72 99 FALSE male 1 0 0 11429.s 1 FALSE male 2 0 0 11433.p 1 89 78 TRUE male 1 0 1 11433.s 1 45 TRUE female 1 0 0 11437.p 1 75 82 FALSE male 1 0 0 11437.s 1 35 FALSE male 1 0 0 11452.p 1 90 80 TRUE male 0 1 0 11452.s 1 35 TRUE male 1 0 0 11456.p 1 79 75 TRUE male 0 0 0 11456.s 1 54 TRUE male 1 0 0 11459.p 1 90 80 TRUE male 2 0 0 11459.s 1 42 TRUE male 2 0 0 11462.p 1 74 114 FALSE male 0 0 0 11462.s 1 50 FALSE male 0 0 0 11469.p 1 79 109 TRUE male 1 0 0 11469.s 1 49 TRUE female 1 0 0 11472.p 1 90 30 TRUE female 2 0 0 11472.s 1 41 TRUE female 1 0 0 11474.p 1 89 116 FALSE male 1 0 0 11474.s 1 FALSE male 1 0 0 11479.p 1 79 133 TRUE male 3 0 0 11479.s 1 46 TRUE female 2 0 0 11484.p 1 76 106 TRUE male 1 0 0 11484.s 1 43 TRUE male 2 0 0 11490.p 1 83 84 TRUE male 0 0 0 11490.s 1 50 TRUE female 0 1 0 11491.p 1 74 53 FALSE male 0 0 0 11491.s 1 42 FALSE male 0 0 0 11501.p 1 64 78 FALSE male 2 0 0 11501.s 1 48 FALSE male 2 0 0 11509.p 1 90 80 FALSE male 0 0 0 11509.s 1 FALSE male 2 0 0 11519.p 1 78 50 TRUE male 3 0 0 11519.s 1 47 TRUE female 1 0 0 11524.p 1 74 113 FALSE male 0 1 0 11524.s 1 40 FALSE male 0 0 0 11532.p 1 77 59 TRUE male 1 0 1 11532.s 1 39 TRUE female 0 0 0 11551.p 1 90 98 TRUE male 1 0 0 11551.s 1 39 TRUE female 0 0 0 11561.p 1 71 109 FALSE male 1 0 0 11561.s 1 42 FALSE male 1 0 0 11569.p 1 90 59 
TRUE female 1 0 0 11569.s 1 45 TRUE male 1 0 0 11571.p 1 90 100 TRUE male 1 0 0 11571.s 1 51 TRUE female 0 0 0 11581.p 1 78 64 TRUE male 3 0 0 11581.s 1 50 TRUE male 1 0 0 11610.p 1 82 127 FALSE male 1 1 0 11610.s 1 60 FALSE male 2 0 0 11611.p 1 90 32 TRUE female 1 0 0 11611.s 1 34 TRUE male 1 0 0 11622.p 1 90 97 TRUE male 3 0 0 11622.s 1 50 TRUE female 3 0 0 11629.p 1 90 50 TRUE male 3 0 0 11629.s 1 47 TRUE female 2 0 0 11638.p 1 90 55 TRUE male 0 0 0 11638.s 1 46 TRUE male 0 0 0 11641.p 1 80 93 TRUE male 0 0 0 11641.s 1 41 TRUE male 0 0 0 11654.p 1 89 40 TRUE female 1 0 0 11654.s 1 57 TRUE female 0 0 0 11659.p 1 74 88 FALSE female 1 0 0 11659.s 1 55 FALSE female 2 0 0 11667.p 1 78 53 TRUE male 2 0 0 11667.s 1 45 TRUE male 3 0 0 11676.p 1 90 78 TRUE female 2 0 0 11676.s 1 40 TRUE female 1 0 0 11691.p 1 90 59 TRUE male 0 0 0 11691.s 1 47 TRUE male 0 0 0 11696.p 1 71 95 FALSE male 2 0 0 11696.s 1 56 FALSE male 0 0 0 11700.p 1 65 88 FALSE male 1 0 0 11700.s 1 41 FALSE female 0 0 0 11711.p 1 88 94 TRUE male 2 0 0 11711.s 1 48 TRUE male 2 0 0 11715.p 1 90 96 TRUE male 1 1 0 11715.s 1 40 TRUE female 2 0 0 11716.p 1 90 49 TRUE male 4 0 0 11716.s 1 42 TRUE male 1 0 0 11720.p 1 59 68 FALSE male 0 0 0 11720.s 1 39 FALSE female 0 0 0 11722.p 1 81 97 TRUE male 2 0 0 11722.s 1 42 TRUE male 0 0 0 11724.p 1 84 59 TRUE male 1 0 0 11724.s 1 35 TRUE female 0 0 0 11740.p 1 75 98 FALSE male 1 0 0 11740.s 1 42 FALSE male 0 0 0 11766.p 1 57 104 FALSE male 1 0 0 11766.s 1 42 FALSE female 1 1 0 11773.p 1 90 43 TRUE male 1 0 0 11773.s 1 42 TRUE male 1 0 0 11788.p 1 84 84 TRUE male 1 0 0 11788.s 1 41 TRUE male 1 0 0 11797.p 1 76 118 TRUE male 1 0 0 11797.s 1 51 TRUE female 2 0 0 11808.p 1 71 84 FALSE female 1 0 0 11808.s 1 36 FALSE female 0 0 0 11809.p 1 89 92 TRUE male 0 0 0 11809.s 1 53 TRUE female 0 0 0 11810.p 1 90 92 FALSE male 4 0 0 11810.s 1 86 FALSE female 2 1 0 11824.p 1 79 81 FALSE male 1 0 0 11824.s 1 68 FALSE male 1 0 0 11828.p 1 88 74 TRUE male 3 0 0 11828.s 1 41 TRUE female 2 
0 0 11872.p 1 88 62 TRUE female 2 1 0 11872.s 1 49 TRUE female 0 0 0 11892.p 1 70 55 FALSE male 0 1 0 11892.s 1 41 FALSE female 0 0 0 11894.p 1 70 114 FALSE male 1 0 0 11894.s 1 40 FALSE male 0 0 0 11895.p 1 70 86 FALSE male 1 0 0 11895.s 1 47 FALSE male 2 0 0 11905.p 1 90 53 TRUE male 1 0 0 11905.s 1 34 TRUE male 0 0 0 11942.p 1 65 50 FALSE male 1 0 0 11942.s 1 36 FALSE male 0 0 0 11959.p 1 70 64 FALSE male 0 0 0 11959.s 1 44 FALSE female 1 0 0 11962.p 1 75 81 FALSE male 0 0 0 11962.s 1 49 FALSE male 0 0 0 11964.p 1 90 40 TRUE female 2 0 0 11964.s 1 44 TRUE female 0 0 0 12011.p 1 82 82 TRUE male 2 0 0 12011.s 1 43 TRUE male 2 0 0 12051.p 1 90 62 TRUE male 0 0 0 12051.s 1 39 TRUE male 0 0 0 12100.p 1 90 71 TRUE male 2 0 1 12100.s 1 38 TRUE female 1 0 0 12106.p 1 88 112 TRUE male 3 0 0 12106.s 1 56 TRUE female 3 0 0 12152.p 1 87 114 TRUE male 0 0 0 12152.s 1 42 TRUE male 1 0 0 12153.p 1 67 60 FALSE male 0 0 0 12153.s 1 38 FALSE female 0 1 0 12161.p 1 55 106 FALSE female 3 2 0 12161.s 1 44 FALSE female 5 0 0 12162.p 1 77 67 TRUE male 1 0 1 12162.s 1 35 TRUE female 0 0 0 12175.p 1 83 71 TRUE male 1 0 0 12175.s 1 41 TRUE female 1 0 0 12187.p 1 67 109 FALSE male 0 0 0 12187.s 1 42 FALSE female 0 0 0 12224.p 1 89 80 TRUE male 1 1 1 12224.s 1 41 TRUE male 1 0 0 12228.p 1 83 108 TRUE male 1 0 0 12228.s 1 54 TRUE female 0 0 0 12233.p 1 51 106 FALSE male 0 0 0 12233.s 1 36 FALSE female 0 0 0 12235.p 1 76 79 TRUE male 0 0 0 12235.s 1 36 TRUE male 0 0 0 12241.p 1 90 72 TRUE female 0 0 0 12241.s 1 41 TRUE male 0 0 0 12243.p 1 83 97 TRUE male 0 0 0 12243.s 1 44 TRUE male 0 0 0 12252.p 1 74 87 FALSE male 2 0 0 12252.s 1 41 FALSE male 3 0 0 12285.p 1 75 75 FALSE female 2 0 0 12285.s 1 47 FALSE male 2 0 0 12295.p 1 90 104 FALSE male 1 0 0 12295.s 1 FALSE male 1 0 0 12297.p 1 89 97 TRUE male 1 0 0 12297.s 1 45 TRUE male 2 0 0 12301.p 1 119 FALSE male 0 0 0 12301.s 1 59 FALSE female 0 0 0 12303.p 1 90 79 TRUE male 2 0 0 12303.s 1 41 TRUE female 0 0 0 12304.p 1 65 83 FALSE male 3 0 0 
12304.s 1 39 FALSE male 2 0 0 12308.p 1 90 105 TRUE female 3 0 1 12308.s 1 39 TRUE male 0 0 0 12313.p 1 90 115 TRUE female 1 0 0 12313.s 1 38 TRUE female 1 0 0 12317.p 1 53 91 FALSE male 2 0 0 12317.s 1 46 FALSE female 3 0 0 12321.p 1 80 96 TRUE male 0 0 0 12321.s 1 44 TRUE female 0 0 0 12327.p 1 82 97 TRUE male 0 0 0 12327.s 1 53 TRUE female 0 0 0 12334.p 1 61 84 FALSE male 6 0 0 12334.s 1 48 FALSE male 3 2 0 12340.p 1 90 26 TRUE female 1 1 0 12340.s 1 39 TRUE female 0 1 0 12343.p 1 30 FALSE female 0 0 1 12343.s 1 39 FALSE female 0 0 0 12345.p 1 82 91 TRUE male 0 0 0 12345.s 1 42 TRUE female 0 0 0 12358.p 1 90 36 TRUE female 1 0 0 12358.s 1 39 TRUE male 0 0 0 12360.p 1 90 104 TRUE male 0 0 0 12360.s 1 41 TRUE female 0 0 0 12368.p 1 81 47 TRUE male 1 0 0 12368.s 1 39 TRUE male 1 0 0 12370.p 1 80 130 TRUE male 2 0 0 12370.s 1 59 TRUE female 1 0 0 12373.p 1 90 TRUE male 2 0 0 12373.s 1 52 TRUE female 1 0 0 12375.p 1 89 94 TRUE male 1 0 0 12375.s 1 46 TRUE female 1 0 0 12383.p 1 80 101 FALSE male 2 0 0 12383.s 1 72 FALSE male 3 0 0 12390.p 1 90 86 TRUE male 0 0 0 12390.s 1 36 TRUE male 1 0 0 12394.p 1 77 88 TRUE male 2 0 0 12394.s 1 36 TRUE male 2 0 0 12396.p 1 90 99 TRUE male 2 0 0 12396.s 1 51 TRUE female 1 0 0 12403.p 1 90 104 TRUE male 1 0 0 12403.s 1 39 TRUE female 0 0 0 12409.p 1 77 107 TRUE male 2 1 0 12409.s 1 44 TRUE female 3 0 0 12412.p 1 90 98 TRUE male 0 0 0 12412.s 1 55 TRUE male 0 0 0 12420.p 1 48 131 FALSE male 3 0 0 12420.s 1 35 FALSE female 3 0 0 12424.p 1 63 69 FALSE male 1 0 0 12424.s 1 45 FALSE female 0 0 0 12438.p 1 87 75 TRUE male 0 1 0 12438.s 1 48 TRUE male 0 0 0 12441.p 1 90 28 TRUE male 1 0 0 12441.s 1 45 TRUE female 0 0 0 12445.p 1 81 104 TRUE male 1 0 0 12445.s 1 34 TRUE male 1 0 0 12460.p 1 83 68 TRUE male 1 0 0 12460.s 1 42 TRUE male 0 0 0 12462.p 1 76 112 TRUE male 0 0 0 12462.s 1 36 TRUE female 1 1 0 12463.p 1 83 84 TRUE male 3 1 0 12463.s 1 45 TRUE female 0 0 0 12467.p 1 70 102 FALSE male 1 0 0 12467.s 1 47 FALSE female 1 0 0 12473.p 1 
88 49 TRUE male 2 0 0 12473.s 1 37 TRUE female 0 0 0 12480.p 1 90 86 TRUE male 1 0 0 12480.s 1 36 TRUE male 0 0 1 12481.p 1 63 42 FALSE male 3 0 0 12481.s 1 34 FALSE male 2 0 0 12498.p 1 90 65 TRUE male 2 0 0 12498.s 1 38 TRUE female 2 0 0 12507.p 1 89 80 FALSE male 2 0 0 12507.s 1 FALSE female 2 0 0 12510.p 1 90 52 TRUE male 1 0 0 12510.s 1 45 TRUE male 1 0 0 12512.p 1 75 85 FALSE male 0 0 0 12512.s 1 38 FALSE female 2 0 0 12515.p 1 80 104 TRUE female 0 0 0 12515.s 1 46 TRUE male 0 0 0 12518.p 1 62 88 FALSE male 1 0 0 12518.s 1 54 FALSE female 3 0 0 12522.p 1 81 68 TRUE male 0 0 0 12522.s 1 41 TRUE male 0 0 0 12523.p 1 49 91 FALSE male 2 0 0 12523.s 1 56 FALSE female 1 0 0 12524.p 1 80 146 TRUE female 1 0 0 12524.s 1 50 TRUE female 0 0 0 12526.p 1 82 103 TRUE male 0 0 0 12526.s 1 36 TRUE female 1 0 0 12534.p 1 90 81 TRUE female 3 0 0 12534.s 1 42 TRUE female 0 0 0 12536.p 1 87 85 FALSE male 0 0 0 12536.s 1 75 FALSE female 0 0 0 12552.p 1 90 104 TRUE male 0 0 0 12552.s 1 54 TRUE male 1 0 0 12561.p 1 73 102 FALSE male 2 1 0 12561.s 1 50 FALSE female 1 0 0 12578.p 1 79 81 TRUE male 3 0 0 12578.s 1 42 TRUE female 1 0 0 12579.p 1 90 33 TRUE male 2 0 0 12579.s 1 43 TRUE female 2 0 0 12581.p 1 90 34 TRUE female 1 0 0 12581.s 1 37 TRUE male 1 0 0 12582.p 1 90 57 TRUE male 1 0 0 12582.s 1 41 TRUE male 0 0 0 12588.p 1 85 106 TRUE male 1 0 0 12588.s 1 47 TRUE male 2 1 0 12616.p 1 63 111 FALSE male 1 0 0 12616.s 1 38 FALSE male 1 0 0 12618.p 1 90 106 TRUE male 1 0 0 12618.s 1 42 TRUE male 1 0 0 12620.p 1 72 83 FALSE male 0 0 0 12620.s 1 46 FALSE female 0 0 0 12626.p 1 87 92 TRUE male 2 0 0 12626.s 1 45 TRUE female 2 0 0 12628.p 1 90 118 TRUE male 1 0 0 12628.s 1 54 TRUE female 1 0 0 12630.p 1 90 129 TRUE male 0 0 0 12630.s 1 41 TRUE female 1 0 0 12631.p 1 78 80 FALSE male 2 0 0 12631.s 1 FALSE male 2 0 0 12633.p 1 64 80 FALSE male 0 0 0 12633.s 1 37 FALSE female 0 0 0 12637.p 1 72 105 FALSE male 1 0 0 12637.s 1 40 FALSE male 1 0 0 12638.p 1 59 73 FALSE male 0 0 0 12638.s 1 42 
FALSE female 0 0 0 12642.p 1 88 99 TRUE male 0 0 0 12642.s 1 43 TRUE male 0 0 0 12644.p 1 90 106 TRUE male 1 0 0 12644.s 1 39 TRUE male 1 0 0 12645.p 1 90 86 TRUE male 1 1 0 12645.s 1 36 TRUE male 0 0 0 12647.p 1 88 72 TRUE male 2 0 0 12647.s 1 40 TRUE male 1 0 0 12650.p 1 79 104 FALSE male 1 0 0 12650.s 1 63 FALSE female 2 0 0 12651.p 1 85 24 TRUE male 0 0 0 12651.s 1 38 TRUE male 1 0 0 12652.p 1 90 73 TRUE male 0 1 0 12652.s 1 45 TRUE female 1 0 0 12653.p 1 85 77 TRUE male 0 2 0 12653.s 1 38 TRUE female 0 0 0 12655.p 1 76 48 TRUE male 1 0 0 12655.s 1 35 TRUE male 1 0 0 12656.p 1 83 40 TRUE male 2 0 0 12656.s 1 35 TRUE male 1 0 0 12657.p 1 66 80 FALSE female 0 0 0 12657.s 1 39 FALSE female 0 0 0 12664.p 1 89 93 TRUE male 0 0 0 12664.s 1 35 TRUE male 0 0 0 12680.p 1 77 47 TRUE male 1 0 0 12680.s 1 47 TRUE male 0 0 0 12683.p 1 66 89 FALSE male 1 1 0 12683.s 1 36 FALSE male 1 0 0 12685.p 1 87 100 TRUE male 0 1 0 12685.s 1 44 TRUE female 0 0 0 12688.p 1 71 116 FALSE male 0 0 0 12688.s 1 36 FALSE male 0 0 0 12690.p 1 90 110 TRUE male 1 0 0 12690.s 1 40 TRUE male 0 0 0 12691.p 1 73 34 FALSE female 2 0 1 12691.s 1 FALSE male 1 0 0 12697.p 1 80 85 TRUE male 3 0 0 12697.s 1 46 TRUE female 0 0 0 12703.p 1 72 59 FALSE male 0 0 0 12703.s 1 38 FALSE female 0 0 0 12705.p 1 75 99 FALSE male 1 1 0 12705.s 1 38 FALSE male 1 0 0 12708.p 1 90 79 TRUE male 0 0 0 12708.s 1 41 TRUE female 0 0 0 12716.p 1 88 112 TRUE male 1 0 0 12716.s 1 56 TRUE female 1 0 0 12719.p 1 90 46 TRUE male 1 0 0 12719.s 1 48 TRUE female 1 0 0 12720.p 1 59 69 FALSE male 0 0 0 12720.s 1 49 FALSE male 0 0 0 12723.p 1 82 48 TRUE male 1 0 0 12723.s 1 57 TRUE female 1 0 0 12724.p 1 87 83 TRUE female 0 0 0 12724.s 1 49 TRUE male 0 0 0 12727.p 1 78 90 TRUE male 1 0 0 12727.s 1 58 TRUE male 2 1 0 12729.p 1 90 74 TRUE female 2 0 0 12729.s 1 37 TRUE male 2 0 0 12733.p 1 66 90 FALSE male 1 0 0 12733.s 1 45 FALSE female 2 0 0 12735.p 1 90 55 TRUE male 2 0 1 12735.s 1 39 TRUE male 0 0 0 12736.p 1 85 101 TRUE male 1 0 1 
12736.s 1 53 TRUE female 1 0 0 12739.p 1 82 98 TRUE male 0 0 0 12739.s 1 41 TRUE male 0 0 0 12741.p 1 86 74 TRUE male 2 0 0 12741.s 1 38 TRUE male 1 0 0 12743.p 1 78 116 FALSE male 1 0 0 12743.s 1 62 FALSE male 0 0 0 12748.p 1 90 92 TRUE male 0 0 0 12748.s 1 43 TRUE male 0 0 0 12758.p 1 80 74 TRUE male 1 0 0 12758.s 1 40 TRUE female 0 0 0 12759.p 1 62 105 FALSE male 0 0 0 12759.s 1 39 FALSE female 0 0 0 12763.p 1 90 37 TRUE male 1 0 0 12763.s 1 42 TRUE male 0 0 0 12764.p 1 90 94 TRUE female 0 1 0 12764.s 1 40 TRUE female 0 0 0 12770.p 1 80 122 TRUE male 1 0 0 12770.s 1 47 TRUE male 1 0 0 12780.p 1 79 115 TRUE male 1 0 0 12780.s 1 49 TRUE male 2 0 0 12790.p 1 84 31 TRUE male 1 0 0 12790.s 1 41 TRUE female 0 0 0 12802.p 1 90 TRUE male 1 0 0 12802.s 1 35 TRUE male 1 0 0 12810.p 1 90 63 TRUE male 4 0 0 12810.s 1 38 TRUE female 1 0 0 12826.p 1 90 66 TRUE female 3 1 0 12826.s 1 42 TRUE male 2 0 0 12829.p 1 72 133 FALSE male 1 0 0 12829.s 1 70 FALSE male 2 0 0 12833.p 1 76 65 TRUE male 1 0 0 12833.s 1 49 TRUE female 1 0 0 12836.p 1 65 127 FALSE male 2 0 0 12836.s 1 56 FALSE male 2 0 0 12837.p 1 86 89 TRUE male 4 0 0 12837.s 1 57 TRUE female 2 0 0 12838.p 1 80 78 TRUE male 0 0 0 12838.s 1 41 TRUE female 1 1 0 12840.p 1 90 54 TRUE male 2 2 0 12840.s 1 39 TRUE female 2 0 0 12843.p 1 72 93 FALSE male 1 0 0 12843.s 1 45 FALSE female 0 0 0 12851.p 1 90 37 TRUE male 2 0 0 12851.s 1 41 TRUE male 5 0 0 12852.p 1 78 77 TRUE male 0 0 0 12852.s 1 44 TRUE male 0 0 0 12869.p 1 90 31 TRUE female 3 0 0 12869.s 1 36 TRUE female 1 0 0 12905.p 1 90 66 TRUE male 0 0 0 12905.s 1 41 TRUE male 0 0 0 12906.p 1 90 61 TRUE male 1 0 0 12906.s 1 40 TRUE female 1 0 0 12937.p 1 90 35 TRUE male 0 0 0 12937.s 1 46 TRUE male 2 0 0 12958.p 1 57 88 FALSE male 0 0 0 12958.s 1 43 FALSE male 0 0 0 12962.p 1 79 79 TRUE male 0 0 0 12962.s 1 37 TRUE female 1 0 0 12975.p 1 68 89 FALSE male 0 0 1 12975.s 1 41 FALSE female 0 0 0 12984.p 1 82 100 TRUE male 0 0 0 12984.s 1 46 TRUE male 0 0 0 12997.p 1 81 97 TRUE male 
4 0 0 12997.s 1 40 TRUE female 2 0 0 13000.p 1 79 38 TRUE male 0 0 0 13000.s 1 37 TRUE female 1 0 0 13016.p 1 52 98 FALSE male 0 0 0 13016.s 1 40 FALSE female 0 0 0 13018.p 1 76 78 TRUE male 2 1 1 13018.s 1 41 TRUE male 1 0 0 13048.p 1 90 35 TRUE female 1 0 0 13048.s 1 40 TRUE female 1 0 0 13063.p 1 74 49 FALSE male 1 0 0 13063.s 1 41 FALSE female 1 0 0 13073.p 1 90 43 TRUE male 0 0 0 13073.s 1 34 TRUE male 0 0 0 13094.p 1 67 71 FALSE male 2 1 0 13094.s 1 45 FALSE male 1 0 0 13096.p 1 85 107 TRUE male 0 1 0 13096.s 1 39 TRUE male 1 0 0 13097.p 1 73 34 FALSE male 1 0 0 13097.s 1 40 FALSE female 3 0 0 13099.p 1 86 92 TRUE male 1 0 0 13099.s 1 38 TRUE male 1 0 0 13101.p 1 90 94 TRUE female 0 0 0 13101.s 1 51 TRUE female 1 0 0 13104.p 1 87 97 TRUE male 0 0 0 13104.s 1 44 TRUE female 0 0 0 13116.p 1 90 35 TRUE female 3 0 0 13116.s 1 42 TRUE female 3 0 0 13120.p 1 90 113 TRUE male 0 0 0 13120.s 1 38 TRUE male 0 0 0 13125.p 1 86 97 TRUE male 2 1 0 13125.s 1 58 TRUE female 2 0 0 13129.p 1 78 31 TRUE male 1 0 0 13129.s 1 35 TRUE female 2 0 0 13131.p 1 80 75 TRUE male 0 0 0 13131.s 1 45 TRUE male 0 1 0 13139.p 1 73 127 FALSE male 2 0 0 13139.s 1 39 FALSE female 1 0 0 13144.p 1 77 112 TRUE male 1 0 0 13144.s 1 53 TRUE male 1 1 0 13146.p 1 86 106 TRUE male 0 0 0 13146.s 1 37 TRUE female 0 0 0 13148.p 1 78 52 TRUE male 2 0 0 13148.s 1 51 TRUE female 2 0 0 13152.p 1 90 77 TRUE male 0 0 0 13152.s 1 41 TRUE female 0 0 0 13153.p 1 76 75 TRUE male 2 0 0 13153.s 1 34 TRUE male 1 0 0 13154.p 1 80 40 TRUE male 0 0 0 13154.s 1 36 TRUE male 0 0 0 13159.p 1 75 79 FALSE male 0 0 0 13159.s 1 48 FALSE male 0 0 0 13162.p 1 90 74 TRUE male 3 1 0 13162.s 1 36 TRUE female 3 0 0 13165.p 1 90 51 TRUE male 0 0 0 13165.s 1 49 TRUE male 1 0 0 13166.p 1 90 46 TRUE male 0 0 0 13166.s 1 39 TRUE male 1 0 0 13168.p 1 70 104 FALSE female 0 1 0 13168.s 1 45 FALSE female 0 0 0 13169.p 1 49 45 FALSE male 0 0 0 13169.s 1 40 FALSE male 3 0 0 13171.p 1 90 51 TRUE female 1 0 0 13171.s 1 40 TRUE female 0 0 0 
13174.p 1 86 106 TRUE male 0 0 0
13174.s 1 38 TRUE male 0 0 0
13176.p 1 90 137 TRUE female 1 1 0
13176.s 1 42 TRUE female 2 0 0
13183.p 1 90 60 TRUE male 1 1 0
13183.s 1 42 TRUE female 1 0 0
13187.p 1 75 84 FALSE male 0 0 0
13187.s 1 39 FALSE male 0 0 0
13188.p 1 82 78 FALSE male 0 0 0
13188.s 1 FALSE female 0 0 0
13193.p 1 90 81 TRUE male 0 0 0
13193.s 1 39 TRUE male 0 0 0
13195.p 1 89 76 TRUE male 1 0 0
13195.s 1 52 TRUE male 1 0 0
13196.p 1 90 92 TRUE male 1 0 0
13196.s 1 55 TRUE male 1 0 0
13197.p 1 79 65 TRUE male 0 1 0
13197.s 1 42 TRUE female 1 0 0
13215.p 1 90 74 TRUE male 2 0 0
13215.s 1 51 TRUE female 2 0 0
13216.p 1 90 105 TRUE male 1 0 0
13216.s 1 50 TRUE male 2 0 0
13218.p 1 66 98 FALSE male 0 0 0
13218.s 1 37 FALSE male 0 0 0
13227.p 1 86 104 TRUE male 0 0 0
13227.s 1 39 TRUE female 1 0 0
13239.p 1 64 83 FALSE female 0 0 0
13239.s 1 41 FALSE male 1 0 0
13258.p 1 90 28 TRUE male 0 0 0
13258.s 1 50 TRUE female 0 0 0
13263.p 1 73 81 FALSE male 0 0 0
13263.s 1 37 FALSE female 1 0 0
13266.p 1 90 87 FALSE male 0 0 0
13266.s 1 68 FALSE female 0 0 0
13269.p 1 69 97 FALSE male 0 0 0
13269.s 1 40 FALSE male 1 0 0
13271.p 1 90 60 TRUE male 1 0 0
13271.s 1 47 TRUE female 0 0 0
13293.p 1 90 83 TRUE male 2 0 0
13293.s 1 41 TRUE female 2 0 0
13296.p 1 87 30 TRUE male 4 0 0
13296.s 1 40 TRUE female 4 0 0
13307.p 1 90 30 TRUE male 0 0 0
13307.s 1 45 TRUE male 1 0 0
13309.p 1 89 79 TRUE male 0 0 0
13309.s 1 38 TRUE male 0 0 0
13312.p 1 75 119 FALSE male 0 0 0
13312.s 1 45 FALSE female 1 0 0
13315.p 1 90 118 TRUE male 0 0 0
13315.s 1 37 TRUE male 0 0 0
13322.p 1 77 59 TRUE male 3 0 0
13322.s 1 36 TRUE female 1 0 0
13327.p 1 90 103 TRUE male 3 0 0
13327.s 1 45 TRUE female 2 0 0
13328.p 1 61 100 FALSE male 1 0 0
13328.s 1 45 FALSE male 0 0 0
13330.p 1 60 101 FALSE male 1 0 0
13330.s 1 58 FALSE female 0 0 0
13335.p 1 18 FALSE female 2 0 0
13335.s 1 36 FALSE male 5 0 0
13338.p 1 90 105 TRUE male 2 0 0
13338.s 1 51 TRUE female 2 0 0
13346.p 1 84 59 FALSE female 1 1 3
13346.s 1 60 FALSE female 2 0 0
13349.p 1 76 87 TRUE male 0 1 0
13349.s 1 42 TRUE female 0 0 0
13355.p 1 90 30 TRUE male 2 0 1
13355.s 1 39 TRUE male 1 0 0
13366.p 1 74 124 FALSE male 1 0 0
13366.s 1 56 FALSE female 2 0 0
13374.p 1 90 19 TRUE male 0 0 0
13374.s 1 39 TRUE female 0 0 0
13385.p 1 90 16 TRUE male 1 0 0
13385.s 1 52 TRUE female 0 0 0
13387.p 1 90 95 TRUE male 0 0 0
13387.s 1 42 TRUE male 2 0 0
13393.p 1 59 77 FALSE female 2 0 0
13393.s 1 47 FALSE female 2 0 0
13396.p 1 83 103 TRUE male 3 0 0
13396.s 1 43 TRUE female 2 0 0
13398.p 1 90 75 TRUE male 1 1 0
13398.s 1 42 TRUE female 1 0 0
13412.p 1 90 33 TRUE male 2 0 0
13412.s 1 36 TRUE female 0 0 0
13418.p 1 61 135 FALSE male 4 0 0
13418.s 1 46 FALSE female 1 0 0
13424.p 1 74 114 FALSE male 0 0 0
13424.s 1 42 FALSE female 0 0 0
13439.p 1 87 82 TRUE male 0 1 0
13439.s 1 42 TRUE female 0 0 0
13443.p 1 76 102 TRUE male 1 0 0
13443.s 1 45 TRUE female 0 0 0
13444.p 1 85 76 TRUE male 0 0 0
13444.s 1 51 TRUE female 0 0 0
13447.p 1 90 44 FALSE female 0 1 0
13447.s 1 FALSE female 1 0 1
13462.p 1 87 70 TRUE male 0 0 0
13462.s 1 41 TRUE male 0 0 0
13465.p 1 90 121 TRUE male 1 0 0
13465.s 1 46 TRUE female 1 0 0
13486.p 1 83 93 TRUE male 0 0 0
13486.s 1 42 TRUE male 0 0 0
13487.p 1 56 96 FALSE male 2 0 0
13487.s 1 38 FALSE female 2 0 0
13493.p 1 77 104 TRUE male 2 0 0
13493.s 1 57 TRUE male 1 0 0
13496.p 1 77 107 TRUE male 0 0 0
13496.s 1 42 TRUE male 0 0 1
13502.p 1 90 55 TRUE male 1 0 0
13502.s 1 49 TRUE female 1 0 0
13504.p 1 77 64 TRUE male 4 0 0
13504.s 1 43 TRUE female 3 0 0
13505.p 1 60 112 FALSE male 0 0 0
13505.s 1 41 FALSE female 0 0 0
13507.p 1 46 101 FALSE male 1 0 0
13507.s 1 37 FALSE male 2 1 0
13508.p 1 63 96 FALSE male 1 0 0
13508.s 1 41 FALSE female 2 0 0
13509.p 1 89 70 TRUE female 2 0 0
13509.s 1 44 TRUE female 1 0 0
13512.p 1 83 110 TRUE male 4 0 0
13512.s 1 38 TRUE female 1 0 0
13513.p 1 67 88 FALSE male 1 1 0
13513.s 1 36 FALSE female 1 0 0
13533.p 1 71 47 FALSE male 2 0 0
13533.s 1 38 FALSE male 2 0 0
13543.p 1 82 42 TRUE male 1 0 0
13543.s 1 40 TRUE female 0 0 0
13589.p 1 78 56 TRUE male 2 0 0
13589.s 1 38 TRUE male 3 0 0
13590.p 1 90 86 TRUE male 3 2 0
13590.s 1 36 TRUE male 2 0 0
13593.p 1 84 39 FALSE male 1 0 0
13593.s 1 FALSE male 2 0 0
13599.p 1 79 65 TRUE male 2 0 0
13599.s 1 39 TRUE male 2 0 0
13601.p 1 85 78 TRUE female 3 0 0
13601.s 1 40 TRUE male 4 0 1
13606.p 1 90 54 TRUE male 0 1 0
13606.s 1 49 TRUE female 1 0 0
13608.p 1 90 42 FALSE female 2 1 0
13608.s 1 FALSE male 1 0 0
13618.p 1 90 44 FALSE female 0 0 0
13618.s 1 FALSE female 0 0 0
13621.p 1 90 56 TRUE female 0 0 0
13621.s 1 40 TRUE male 1 0 0
13625.p 1 85 94 TRUE male 2 0 0
13625.s 1 42 TRUE male 1 0 0
13629.p 1 82 52 TRUE male 1 0 0
13629.s 1 44 TRUE female 2 0 1
13660.p 1 80 53 TRUE male 0 0 0
13660.s 1 46 TRUE male 1 0 0
13684.p 1 76 106 TRUE male 1 0 0
13684.s 1 49 TRUE female 0 0 0
13689.p 1 79 97 TRUE male 0 0 0
13689.s 1 50 TRUE female 1 1 0
13695.p 1 88 54 TRUE male 0 0 0
13695.s 1 37 TRUE female 0 0 0
13698.p 1 89 98 FALSE male 1 0 0
13698.s 1 64 FALSE male 1 0 0
13726.p 1 80 61 TRUE male 1 0 1
13726.s 1 45 TRUE male 1 0 0
13730.p 1 68 94 FALSE female 2 0 0
13730.s 1 41 FALSE male 0 0 0
13739.p 1 90 25 TRUE female 1 0 0
13739.s 1 37 TRUE male 2 0 0
13752.p 1 76 98 TRUE female 0 0 0
13752.s 1 47 TRUE female 0 0 0
13774.p 1 90 33 TRUE female 1 0 0
13774.s 1 44 TRUE female 0 0 0
13793.p 1 87 52 TRUE male 1 0 0
13793.s 1 45 TRUE female 2 0 0
13795.p 1 90 34 TRUE female 2 0 0
13795.s 1 38 TRUE female 1 0 0
13798.p 1 68 103 FALSE female 2 0 0
13798.s 1 37 FALSE female 0 0 0
13808.p 1 79 40 TRUE male 2 0 0
13808.s 1 44 TRUE male 2 0 0
13809.p 1 77 61 TRUE male 2 0 0
13809.s 1 51 TRUE female 0 0 0
13815.p 1 82 51 TRUE male 3 0 1
13815.s 1 44 TRUE female 2 0 0
13821.p 1 67 92 FALSE female 0 0 0
13821.s 1 43 FALSE female 0 0 0
13825.p 1 90 58 FALSE female 3 0 0
13825.s 1 71 FALSE female 4 0 0
13832.p 1 90 25 TRUE male 0 0 0
13832.s 1 38 TRUE female 1 0 0
13835.p 1 90 119 TRUE female 1 0 0
13835.s 1 52 TRUE female 1 0 0
13840.p 1 0
13840.s 1 0
13843.p 1 90 66 TRUE female 3 0 0
13843.s 1 42 TRUE female 1 0 0
13876.p 1 82 40 FALSE male 1 0 2
13876.s 1 FALSE female 1 0 0
13887.p 1 67 81 FALSE female 0 0 0
13887.s 1 43 FALSE male 0 0 0
13890.p 1 82 37 TRUE female 2 1 0
13890.s 1 47 TRUE female 2 0 0
13912.p 1 83 95 TRUE female 0 0 0
13912.s 1 44 TRUE female 1 0 0
13922.p 1 90 100 TRUE female 1 0 0
13922.s 1 41 TRUE male 2 0 0
13926.p 1 79 62 TRUE female 1 0 0
13926.s 1 41 TRUE female 3 0 0
13992.p 1 80 106 TRUE female 0 0 0
13992.s 1 45 TRUE female 1 0 0
14009.p 1 85 98 TRUE female 0 0 0
14009.s 1 35 TRUE female 0 0 0
14011.p 1 90 53 TRUE female 1 0 0
14011.s 1 36 TRUE female 1 0 0
14110.p 1 69 77 FALSE male 3 0 0
14110.s 1 39 FALSE female 3 0 0
14167.p 1 76 132 TRUE male 0 0 0
14167.s 1 42 TRUE male 0 0 0
14201.p 1 90 43 TRUE male 2 0 0
14201.s 1 45 TRUE female 2 0 0

Comparison Name               Group     # of quads  Count  Enrichment  Binomial   Paired t-test  95% CI (bootstrap)
Overall rare CNVs             Probands  411         458    1.15        0.04       0.004          1.09 - 1.29
                              Siblings              397
Overall genes in rare CNVs    Probands  411         921    1.27        < 0.00001  0.029          1.10 - 1.52
                              Siblings              726

Category                                        # of Quads  Probands Count  Siblings Count  Enrichment  Binomial  Paired t-test

# of CNVs
High FSIQ: Proband IQ score ≥ 70                267  273  236  1.16  0.11  0.018
Low FSIQ: Proband IQ score < 70                 141  157  126  1.25  0.074  0.015
SRS discordant proband-sibling pairs            276  316  251  1.26  0.0071  0.00015
SRS concordant proband-sibling pairs            115  117  113  1.04  0.84  0.7
Discordant SRS, Low IQ                          109  138  104  1.33  0.034  0.0038
Concordant SRS, Low IQ                          22  19  22  0.86  0.76  0.53
Discordant SRS, High IQ                         167  175  145  1.21  0.1  0.016
Concordant SRS, High IQ                         93  98  91  1.08  0.66  0.46

Genes Affected
High IQ                                         267  537  472  1.14  0.044  0.14
Low IQ                                          141  384  252  1.52  1.90E-07  0.084
Discordant SRS                                  276  707  510  1.39  1.80E-08  0.02
Concordant SRS                                  115  222  219  1.01  0.92  0.9
Discordant SRS, Low IQ                          109  348  207  1.68  2.30E-09  0.061
Concordant SRS, Low IQ                          22  36  45  0.8  0.37  0.52
Discordant SRS, High IQ                         167  351  298  1.18  0.041  0.18
Concordant SRS, High IQ                         93  186  174  1.07  0.56  0.55

By CNV frequency
Private CNVs only                               411  271  245  1.11  0.27  0.1
All Rare CNVs                                   411  453  394  1.15  0.046  0.0044

By expression profile
Genes with brain expression (average)           411  19/317  6/224  2.24  NT  NT
... and in discordant SRS quads only            276  15/256  2/170  #VALUE!  NT  NT
Genes with brain expression (any region)        411  73  43  1.7  NT  NT

By previous pathogenic gene association
Genes with previous association                 411  83  59  1.41  NT  0.049
... and in discordant SRS quads only            276  66  35  1.89  NT  0.006
... and in concordant SRS quads only            115  17  24  0.71  NT  0.069

By sex, family size and birth order
CNVs                    # of Quads  Probands Count  Siblings Count  Enrichment  Binomial  Paired t-test
Male Pro                335  358  313  1.14  0.089  0.012
Female Pro              76  95  81  1.17  0.33  0.19
Male Sib                191  204  186  1.1  0.39  0.2
Female Sib              220  249  208  1.2  0.061  0.007
Sex Concord.            211  217  199  1.09  0.4  0.22
Sex Discord.            200  236  195  1.21  0.054  0.0049
Both Male               163  163  152  1.07  0.57  0.37
Both Female             48  54  47  1.15  0.55  0.39
Female Pro, Male Sib    28  41  34  1.21  0.49  0.32
Male Pro, Female Sib    172  195  161  1.21  0.08  0.0083
Proband Older           224  232  209  1.11  0.29  0.12
Sibling Older           171  203  166  1.22  0.061  0.0068
1 sibling               247  264  242  1.09  0.35  0.15
2 siblings              117  131  110  1.19  0.2  0.092
3+ sibling              47  58  42  1.38  0.13  0.017
Comparison: 0.53 (chi-sq)  0.93  0.53  0.49  0.82  1  0.52

Tissue Name  Ratio  Brain?  Proband Fraction  Proband Count  Proband genes (top 5% in each tissue)  Sibling Fraction  Sibling Count  Sibling genes (top 5% in each tissue)
temporal lobe  8.6328125  TRUE  0.05078125  13  DOC2A, PGBD5, MAPK3, XKR3, CRLF1, UCHL1, DNAJC6, GAN, NIPA1, C16orf45, NDRG2, SEZ6L2, CNDP1  0.00588235  1  TMEFF2
pons  4.98046875  TRUE  0.05859375  15  FAM57B, KCNE2, PGBD5, KIF1A, GSTT1, CA2, DNAJC6, SLFN12, UCHL1, TRIM58, C16orf45, NDRG2, AGT, SEZ6L2, CNDP1  0.01176471  2  CORIN, SLC22A1
cerebellum  4.6484375  TRUE  0.0546875  14  AQP4, PGBD5, MTSS1L, UCHL1, DNAJC6, OR2W3, NFAM1, CPLX1, IQSEC1, C16orf53, NDRG2, AGT, SEZ6L2, CNDP1  0.01176471  2  MAPKBP1, PDZK1
parietal lobe  4.6484375  TRUE  0.0546875  14  RNASE1, AQP4, PGBD5, MTSS1L, CA2, DNAJC6, UCHL1, ZNF517, IQSEC1, C16orf45, NDRG2, AGT, SEZ6L2, CNDP1  0.01176471  2  TMEFF2, PPM1J
amygdala  4.31640625  TRUE  0.05078125  13  DOC2A, KCTD13, AQP4, PGBD5, CA2, DNAJC6, UCHL1, SGIP1, CPLX1, C16orf45, AGT, SEZ6L2, CNDP1  0.01176471  2  TMEFF2, MAPKBP1
hypothalamus  3.76302083  TRUE  0.06640625  17  KCTD13, AQP4, PGBD5, GSTT2, GSTT1, RNASE6, CHODL, DNAJC6, ORC3, UCHL1, C3, CA2, C16orf45, NDRG2, AGT, SEZ6L2, CNDP1  0.01764706  3  TMEFF2, MAPKBP1, TANC1
subthalamic nucleus  3.76302083  TRUE  0.06640625  17  AQP4, PGBD5, KIF1A, CRLF1, MTSS1L, UCHL1, DNAJC6, GAN, FGFR3, ATP8B3, C16orf45, CPLX1, AKR1C4, PDE4B, ZNF17, SEZ6L2, CNDP1  0.01764706  3  OR4F15, TMEFF2, ZNF613
BM-CD34+  3.48632813  FALSE  0.08203125  21  XPO1, TACC3, RNASE3, RNASE2, SNRPD3, CHD1L, RPL8, GLIPR1, PPP4C, RNASE6, VPREB3, CLC, RAD51AP1, CORO1A, TRIM58, IGLL1, MYC, ZNF84, ATF1, PRC1, UCHL3  0.02352941  4  SLC20A1, KARS, MAVS, RUVBL2
medulla oblongata  3.3203125  TRUE  0.078125  20  KCTD13, ADORA2A, AQP4, PGBD5, GSTT2, CRLF1, CA2, DNAJC6, SLFN12, HCRTR1, CPLX1, RNASE1, UCHL1, C22orf43, C16orf45, ANXA10, NDRG2, SEZ6L2, CNDP1, KIF1A  0.02352941  4  TMEFF2, SLC22A1, KRT19, ZNF608
caudate nucleus  3.15429688  TRUE  0.07421875  19  KCTD13, ADORA2A, AQP4, PGBD5, CA2, DNAJC6, DDHD2, HMX3, UCHL1, CPLX1, PTCHD3, CEP72, RNASE1, ALDH5A1, NDRG2, AGT, SEZ6L2, CNDP1, BCR  0.02352941  4  TMEFF2, ZNF608, C20orf27, STARD7
prefrontal cortex  3.09895833  TRUE  0.0546875  14  KCTD13, AQP4, PGBD5, GSTT2, GSTT1, CRLF1, UCHL1, DNAJC6, IQSEC1, C16orf45, NDRG2, AGT, SEZ6L2, CNDP1  0.01764706  3  MYOM2, TMEFF2, MAPKBP1
occipital lobe  2.87760417  TRUE  0.05078125  13  AQP4, PGBD5, CA2, DNAJC6, UCHL1, FGFR3, IQSEC1, SGIP1, C16orf45, NDRG2, AGT, SEZ6L2, CNDP1  0.01764706  3  TMEFF2, MAPKBP1, ZNF608
PB-CD19+ Bcells  2.87760417  FALSE  0.05078125  13  TACC3, SNRPD3, RPL8, GLIPR1, PPP4C, RNASE6, C22orf13, CORO1A, RSL24D1, C1orf131, MYC, ATF1, VPREB3  0.01764706  3  TXNIP, ANKRD39, ARID5A
adrenal gland  2.82226563  FALSE  0.06640625  17  RNASE1, METTL9, ZNF219, PGBD5, SIK1, GSTT2, GSTT1, QPRT, NQO2, MOCOS, ALDH2, TXN2, OR2W3, C3, ZNF70, VPREB3, ARHGEF40  0.02352941  4  RAB20, ERCC2, HGSNAT, STARD7
whole brain  2.5234375  TRUE  0.07421875  19  DOC2A, KCTD13, AQP4, ASPHD1, SEZ6L2, GSTT2, CRLF1, CA2, DNAJC6, UCHL1, PGBD5, IQSEC1, CORO1A, SGIP1, C16orf45, NDRG2, AGT, COX6A1, CNDP1  0.02941176  5  MYOM2, TMEFF2, MAPKBP1, RUVBL2, DUSP2
adipocyte  2.49023438  FALSE  0.05859375  15  HSD17B12, FAM89A, GSTT1, CRLF1, THBS2, PPARG, GLIPR1, NQO2, ETFDH, DDT, C3, GGT5, ALDOA, MYC, UCHL1  0.02352941  4  TNFAIP8L3, CLIC3, PTGIS, LIX1L
lymph node  2.21354167  FALSE  0.0390625  10  ZNF140, TACC3, GLIPR1, RNASE6, CORO1A, IL32, GGT5, C3, MEGF6, VPREB3  0.01764706  3  SLC20A1, ARID5A, DUSP2
fetal brain  2.21354167  TRUE  0.0390625  10  KCTD13, PGBD5, UCHL1, DNAJC6, ORC3, SGIP1, CORO1A, TP53BP1, C16orf45, SEZ6L2  0.01764706  3  PNPLA3, TMEFF2, MAPKBP1
cerebellum peduncles  1.9921875  TRUE  0.046875  12  AQP4, PGBD5, KCNG2, UCHL1, DNAJC6, FGFR3, C16orf45, C16orf53, NDRG2, AGT, SEZ6L2, CNDP1  0.02352941  4  MAPKBP1, TANC1, CNNM4, PDZK1
pancreatic islets  1.9921875  FALSE  0.05859375  15  RAB27A, GSTT2, GSTT1, UCHL1, KCNG2, RPL8, CHST9, HCRTR1, IL32, ACADSB, C3, ANXA10, MAPKAPK5, SHPK, SEZ6L2  0.02941176  5  CBLC, KARS, KRT15, KRT19, CCNC
cingulate cortex  1.9921875  TRUE  0.046875  12  KCTD13, PGBD5, KIF1A, CA2, DNAJC6, UCHL1, SGIP1, IQSEC1, C16orf45, AGT, SEZ6L2, CNDP1  0.02352941  4  MYOM2, ANKRD36, TMEFF2, SLC22A1
leukemia lymphoblastic(molt4)  1.89732143  FALSE  0.078125  20  QPRT, MYC, SNRPD3, CHD1L, BUB3, PPP4C, MAZ, CCT4, RAD51AP1, CORO1A, IL32, HNRNPA2B1, CEP72, RSL24D1, IGLL1, TACC3, MAPKAPK5, METTL17, PRC1, HIRIP3  0.04117647  7  KARS, CD1E, RUVBL2, SMYD3, NCAPH, ADSL, NINL
testis seminiferous tubule  1.859375  FALSE  0.0546875  14  KCTD13, RNASE1, XKR3, UPB1, CHODL, TACC3, ORC3, ARL6, SLC26A8, CAPS2, IGLL1, PRC1, C9orf93, TRIP12  0.02941176  5  TEX101, RUVBL2, PPM1J, GEMIN4, NCAPH
bone marrow  1.80245536  FALSE  0.07421875  19  ADORA2A, RNASE3, RNASE2, METTL9, CA2, PPP4C, MYH9, RNASE6, CLC, OR2W3, C22orf13, CORO1A, TACC3, TRIM58, RAB27A, IGLL1, MYC, PRC1, VPREB3  0.04117647  7  OR4F15, LAT2, HK3, FPR2, ANKRD35, NCAPH, TXNIP
lymphoma Burkitts Raji  1.77083333  FALSE  0.0625  16  ADORA2A, XKR3, RPL8, QPRT, TACC3, DDT, TNIP2, CORO1A, IL32, RSL24D1, AKR1C4, ALDH5A1, MYC, UCHL1, PRC1, VPREB3  0.03529412  6  LAT2, RUVBL2, ADRA2B, CCNC, PLAG1, DUSP2
liver  1.77083333  FALSE  0.0625  16  QPRT, SHPK, UPB1, CA2, PPP4C, MOCOS, ALDH2, DDT, C3, IL32, PQLC1, AKR1C4, ANXA10, AGT, CNDP1, NQO2  0.03529412  6  RUVBL2, HFE2, PNPLA3, CES2, SLC22A1, ABCC6
spinal cord  1.74316406  TRUE  0.08203125  21  HSD17B12, RNASE1, AQP4, ASPHD1, RNASE6, MTSS1L, CA2, DNAJC6, DDHD2, ORC3, UCHL1, PGBD5, C3, C1orf198, AKR1C4, ANXA10, NDRG2, AGT, SEZ6L2, CNDP1, PACS2  0.04705882  8  ARHGEF10, TNFAIP8L3, FNTA, GABARAPL2, TMEFF2, TANC1, SCCPDH, SEMA4C
lymphoma Burkitts Daudi  1.74316406  FALSE  0.08203125  21  XPO1, TACC3, PGBD5, SNRPD3, RPL8, BUB3, RAD51AP1, KCNG2, RBFA, CCT4, DDT, IGLL1, CORO1A, HNRNPA2B1, DGUOK, AKR1C4, ALDH5A1, MYC, RSL24D1, PRC1, VPREB3  0.04705882  8  KARS, SLC20A1, RUVBL2, CCNC, ANKRD36, NCAPH, ADSL, DUSP2
ciliary ganglion  1.7265625  FALSE  0.05078125  13  LCN6, RNASE8, PRPH, ZNF300, CHST9, OR4D10, PTCHD3, CAPS2, ANXA10, C1orf131, ADAMTS18, ZNF70, UCHL1  0.02941176  5  ALG10, PLAG1, LIX1L, USP50, OR10A6
globus pallidus  1.7265625  TRUE  0.05078125  13  AQP4, LCN6, MTSS1L, UCHL1, DNAJC6, LCN8, OR4E2, PGBD5, C16orf45, NDRG2, ZNF17, SEZ6L2, CNDP1  0.02941176  5  C3orf33, PNPLA3, SLC22A1, ST6GALNAC1, USP50
leukemia promyelocytic(hl60)  1.66015625  FALSE  0.078125  20  RPL8, ADORA2A, TACC3, SNRPD3, EIF3M, BUB3, RBFA, ORC3, CCT4, RAD51AP1, CORO1A, SHPK, HNRNPA2B1, RSL24D1, ALDH5A1, MYC, UCHL1, PRC1, UCHL3, VPREB3  0.04705882  8  KARS, RUVBL2, TANC1, EIF2A, SAMM50, NCAPH, ADSL, DUSP2
testis germ cell  1.66015625  FALSE  0.05859375  15  KCTD13, RNASE1, LCN6, XKR3, ACTG2, CHODL, TACC3, ORC3, CHST9, UPB1, PGBD5, SLC26A8, PRC1, EDDM3B, TRIP12  0.03529412  6  RUVBL2, MAPKBP1, GEMIN4, TEX101, NCAPH, ABCC1
721 B lymphoblasts  1.61272321  FALSE  0.06640625  17  QPRT, SNRPD3, BUB3, RNASE6, UCHL1, PPP4C, TACC3, DDT, RAD51AP1, CORO1A, IL32, RSL24D1, MYC, MX1, ATF1, PRC1, HIRIP3  0.04117647  7  KARS, ARID5A, RUVBL2, C15orf41, NCAPH, ANKRD39, DUSP2
BM-CD105+ endothelial  1.59375  FALSE  0.09375  24  RPL8, CRLF1, TRIM58, PRC1, VPREB3, XPO1, QPRT, CHD1L, PPP4C, RAD51AP1, C1orf131, MYC, SHPK, ARV1, UCHL3, SNRPD3, GSTT1, CA2, DNAJC6, TACC3, ATP8B3, CEP72, IGLL1, ATF1  0.05882353  10  KARS, SLC20A1, C20orf27, RUVBL2, SAMM50, NCAPH, TTC27, PLAG1, C15orf41, DUSP2
thalamus  1.4609375  TRUE  0.04296875  11  RNASE1, PGBD5, GSTT1, CA2, DNAJC6, UCHL1, C3, C16orf45, AGT, SEZ6L2, CNDP1  0.02941176  5  TMEFF2, TANC1, PPM1J, ST6GALNAC1, ZNF608
dorsal root ganglion  1.43880208  FALSE  0.05078125  13  FAM57B, MTNR1A, OR2G3, DNAJC6, OR4E2, PRPH, CLC, ATP8B3, DGKI, TBX6, ARL6, UCHL1, NIPA1  0.03529412  6  ZNF613, OR4F6, TMEFF2, ZNF649, SPINK5, STARD7
kidney  1.328125  FALSE  0.0625  16  QPRT, GSTT1, RTDR1, CA2, SLC13A3, NQO2, ALDH2, DDT, NFAM1, CGNL1, CEP72, GGT5, UPB1, IL32, CDH3, AGT  0.04705882  8  PNPLA3, TEK, ATP6V0A4, CES2, ST6GALNAC1, ABCC6, KRT19, PDZK1
cardiac myocytes  1.328125  FALSE  0.0703125  18  TACC3, RNASE3, MYH9, GSTT2, CRLF1, PPP4C, GLIPR1, THBS2, VPREB3, TUBGCP5, BMPER, AKR1C4, ANXA10, MYC, F11, UCHL1, PRC1, GALNT2  0.05294118  9  TMEM127, SLC20A1, CLIC3, PTGIS, KRT15, TEK, NCAPH, RHOC, C15orf41
testis interstitial  1.328125  FALSE  0.0390625  10  KCTD13, XKR3, CHODL, TACC3, ORC3, SLC26A8, IGLL1, PRC1, C9orf93, TRIP12  0.02941176  5  TEX101, RUVBL2, PPM1J, GEMIN4, NCAPH
olfactory bulb  1.328125  FALSE  0.0546875  14  QPRT, SLC13A3, SIK1, RNASE6, CA2, EHBP1, UCHL1, OR2W3, C1orf198, KANK1, C3, AGT, ATF1, CNDP1  0.04117647  7  ARHGEF10, ARID5A, TANC1, SAG, PIAS3, SEMA4C, STARD7
whole blood  1.23325893  FALSE  0.05078125  13  RAB27A, RNASE2, YPEL3, GLIPR1, PPP4C, MYH9, RNASE6, CLC, TACC3, CORO1A, IL32, RAF1, ARHGEF40  0.04117647  7  TMEM127, ARID5A, LAT2, HK3, FPR2, PPP1R12A, TXNIP
BM-CD71+ early erythroid  1.1953125  FALSE  0.0703125  18  TRMT1L, TACC3, MAZ, SNRPD3, CA3, CA2, PPP4C, KIF22, DDT, RAD51AP1, C22orf13, RFPL2, TRIM58, OR2W3, PRC1, ADCK1, ATF1, PQLC1  0.05882353  10  RIOK3, SLC20A1, C20orf27, GABARAPL2, RUVBL2, PRR5, ANKRD35, TERF2IP, NCAPH, ANKRD39
adrenal cortex  1.1953125  FALSE  0.03515625  9  QPRT, SIK1, RNASE6, NQO2, HCRTR1, ACADSB, GGT5, C3, ARHGEF40  0.02941176  5  PNPLA3, RAB20, KRT15, SLC22A1, USP50
BM-CD33+ myeloid  1.16210938  FALSE  0.0546875  14  METTL9, RNASE3, RNASE2, SIK1, NLRP3, GLIPR1, PPP4C, RNASE6, ATP8B3, CORO1A, TACC3, RAB27A, ATF1, MON1B  0.04705882  8  TMEM127, C20orf27, LAT2, HK3, FPR2, CARS2, HGSNAT, DUSP2
trigeminal ganglion  1.16210938  FALSE  0.0546875  14  FAM57B, RNASE3, OR2G3, NIPA1, PRPH, ARL6, ATP8B3, RAD51AP1, ZFP37, ACADSB, ANXA10, EDDM3B, UCHL1, VPREB3  0.04705882  8  OR4F15, CORIN, OR10A6, USP50, C3orf33, SLC22A1, NCAPH, ALG10
appendix  1.13839286  FALSE  0.046875  12  TACC3, ACTG2, CRLF1, CA3, RNASE6, SLFN12, CHST9, SLC39A9, RFPL2, C3, ANXA10, SLC39A2  0.04117647  7  CORIN, SDPR, SNRNP200, FPR2, ATP6V0A4, CENPQ, MCTP2
testis Leydig cell  1.10677083  FALSE  0.05859375  15  KCTD13, LCN6, ACTG2, CHODL, TACC3, LCN8, ORC3, SLC26A8, UPB1, HCRTR1, PRC1, EDDM3B, C9orf93, TRIP12, HIRIP3  0.05294118  9  OR4F15, RUVBL2, PPM1J, GEMIN4, SCCPDH, CORIN, NCAPH, TEX101, KRT19
leukemia chronic myelogenous(k562)  1.10677083  FALSE  0.05859375  15  QPRT, EIF3M, MAPKAPK5, RAB27A, TACC3, NQO2, ORC3, CCT4, RAD51AP1, XKR3, CEP72, RSL24D1, MYC, ARV1, PRC1  0.05294118  9  KARS, SLC20A1, RUVBL2, TFB2M, SMYD3, ST6GALNAC1, NCAPH, ADSL, KRT19
smooth muscle  1.10677083  FALSE  0.05859375  15  SNRPD3, RPL8, GSTT1, GLIPR1, MYH9, THBS2, C3, IL32, DGUOK, ALDOA, ANXA10, MYC, UCHL1, PRC1, GALNT2  0.05294118  9  SLC20A1, LTBP1, RUVBL2, ERCC2, PTGIS, GYS1, ANKRD23, RHOC, IL1A
PB-CD4+ Tcells  1.04352679  FALSE  0.04296875  11  RAB27A, TACC3, RPL8, BUB3, USP34, PPP4C, MYH9, CORO1A, IL32, MYC, ATF1  0.04117647  7  SLC20A1, CLIC3, ARID5A, C9orf142, TXNIP, PLAG1, DUSP2
pituitary gland  0.99609375  FALSE  0.03515625  9  RAB27A, PGBD5, UCHL1, DNAJC6, ORC3, TP53BP1, C16orf53, ZNF10, SEZ6L2  0.03529412  6  CBLC, USP8, CLIC3, LMAN2L, ERCC2, PLAG1
trachea  0.99609375  FALSE  0.046875  12  RNASE1, HMX3, SIK1, ACTG2, CA2, ZNF70, CHST9, C3, NFAM1, ATP2C2, CDH3, MYC  0.04705882  8  TANC1, TXNIP, ATP12A, KRT19, KRT15, HGSNAT, ALG10, ST6GALNAC1
tonsil  0.99609375  FALSE  0.046875  12  LCN6, RNASE6, TACC3, MX1, ZNF517, CORO1A, IL32, VPREB3, C3, CDH3, MYC, PRC1  0.04705882  8  CBLC, CLIC3, LAT2, MAPKBP1, KRT15, NCAPH, SPINK5, KRT19
placenta  0.99609375  FALSE  0.046875  12  RNASE1, QPRT, MYH9, MMP11, ACTG2, LGALS14, PPP4C, CRLF1, PPARG, TRIM58, FAM89A, BCR  0.04705882  8  SLC20A1, CLIC3, LTBP1, CYP19A1, TEK, SEMA4C, BCAR1, KRT19
lung  0.94075521  FALSE  0.06640625  17  RNASE1, QPRT, CYFIP1, GSTT1, ACTG2, CRLF1, RPL8, PPP4C, MYH9, ALDH2, DDT, PPARG, CORO1A, IL32, RNASE6, C3, SIK1  0.07058824  12  TMEM127, CLIC3, ARID5A, RUVBL2, HK3, TXNIP, ST6GALNAC1, TRPM4, NXN, BCAR1, KRT19, DUSP2
PB-BDCA4+ dentritic cells  0.91308594  FALSE  0.04296875  11  TACC3, SIK1, CHD1L, NLRP3, GLIPR1, PPP4C, RNASE6, MX1, CORO1A, ATF1, UCHL3  0.04705882  8  TMEM127, SLC20A1, CLIC3, ARID5A, HK3, PPM1J, C9orf142, DUSP2
ovary  0.91308594  FALSE  0.04296875  11  XKR3, ACTG2, PPARG, OR2G3, OR2G2, KIAA1958, C3, OR4D10, CPLX1, AKR1C4, ZNF70  0.04705882  8  OR10A3, OR4F6, ZNF615, PTGIS, SYT10, XRN1, ALG10, ASTL
fetal liver  0.91308594  FALSE  0.04296875  11  QPRT, TACC3, CLC, RAD51AP1, TRIM58, ACADSB, C3, ANXA10, PRC1, AGT, VPREB3  0.04705882  8  MUT, CYP19A1, HFE2, PNPLA3, NCAPH, ABCC6, C15orf41, PDZK1
atrioventricular node  0.88541667  FALSE  0.03125  8  SPN, XKR3, OR4E2, SLFN12, OR4D10, DGKI, PTCHD3, CAPS2  0.03529412  6  CLIC3, BAIAP2L2, STARD7, PNPLA3, TRPM7, C15orf41
heart  0.88541667  FALSE  0.046875  12  RNASE1, QPRT, GSTT1, ACTG2, RPL8, DDT, TXN2, CORO1A, IL32, CEP72, ALDOA, AGT  0.05294118  9  MYOM2, TMEM127, CNNM4, OR10A6, RUVBL2, FAM96B, GYS1, RHOC, PPP1R13L
salivary gland  0.88541667  FALSE  0.046875  12  SIK1, RNASE6, CA2, ZNF84, ATP8B3, HCRTR1, OR4D10, GGT5, C3, CDH3, EDDM3B, VPREB3  0.05294118  9  CORIN, BAIAP2L2, PPM1J, ATP6V0A4, TRPM4, KRT15, XRN1, ALG10, KRT19
testis  0.81163194  FALSE  0.04296875  11  KCTD13, RNASE1, CAPS2, RTDR1, CHODL, PPP4C, TACC3, CEP72, SLC26A8, PRC1, HIRIP3  0.05294118  9  CNNM4, RUVBL2, ERCC2, MAPKBP1, PPM1J, GEMIN4, TEX101, KRT15, NCAPH
PB-CD8+ Tcells  0.81163194  FALSE  0.04296875  11  RAB27A, BUB3, USP34, PPP4C, TACC3, ZNF84, CORO1A, IL32, MAPKAPK5, ATF1, MYC  0.05294118  9  SLC20A1, CLIC3, ARID5A, CD160, ANKRD36, C9orf142, TXNIP, PLAG1, DUSP2
bronchial epithelial cells  0.77473958  FALSE  0.0546875  14  SNRPD3, CYFIP1, RPL8, PPP4C, MYH9, CCT4, TMEM40, PPARG, UCHL3, RSL24D1, SIK1, CDH3, MYC, PRC1  0.07058824  12  CBLC, KARS, SLC20A1, CLIC3, RUVBL2, TANC1, KRT15, RHOC, IL1A, NXN, PLAG1, KRT19
skin  0.77473958  FALSE  0.0546875  14  SIK1, SPN, XKR3, OR2G3, THBS2, ARL6, GAN, OR4D10, SLC5A4, CAPS2, ADAMTS18, ZNF70, FGFR3, ZNF396  0.07058824  12  C3orf33, FAM57A, PNPLA3, USP50, ATP6V0A4, VPS53, KRT15, ZNF649, CORIN, SPINK5, PPP1R13L, KRT19
superior cervical ganglion  0.77473958  FALSE  0.02734375  7  SLFN12, RAD51AP1, PTCHD3, TRIM58, ANXA10, EDDM3B, UCHL1  0.03529412  6  ZNF613, OR4F6, BAIAP2L2, FPR2, PTGIS, SLC22A1
thymus  0.76622596  FALSE  0.05859375  15  ADORA2A, TACC3, SNRPD3, RNASE6, PPP4C, MYH9, MX1, IGLL1, CORO1A, IL32, C3, CDH3, MYC, PRC1, HIRIP3  0.07647059  13  CD1E, CLIC3, ARID5A, RUVBL2, KRT15, ANKRD36, ANKRD23, NCAPH, TXNIP, LMAN2L, NINL, KRT19, DUSP2
prostate  0.76622596  FALSE  0.05859375  15  RAB27A, CYFIP1, GSTT1, ACTG2, CA3, SEZ6L2, PPP4C, RPL8, CLC, ATP2C2, SIK1, CDH3, DDT, MEGF6, GSTT2  0.07647059  13  CBLC, CD1E, CLIC3, RUVBL2, TMEFF2, TANC1, CARS2, KRT15, ZNF432, TRPM4, NXN, KRT19, DUSP2
pancreas  0.74707031  FALSE  0.03515625  9  RNASE1, HMX3, RPL8, PPARG, GAN, OR4D10, IL32, C3, MYC  0.04705882  8  CBLC, OR10A3, OR10A6, DZIP1L, ZNF613, KRT15, STARD7, KRT19
fetal lung  0.71940104  FALSE  0.05078125  13  RNASE1, SIK1, ACTG2, CRLF1, RNASE6, MYH9, IL32, C3, SNRPD3, CDH3, AGT, MEGF6, PRC1  0.07058824  12  OR4F15, SDPR, CLIC3, RUVBL2, PLAG1, HFE2, TANC1, KRT15, TXNIP, LMAN2L, KRT19, PDZK1
uterus corpus  0.6640625  FALSE  0.0390625  10  CHST9, KCNE1, ACTG2, CA3, TEC, THBS2, ATP8B3, GGT5, C16orf45, ZNF268  0.05882353  10  SNRNP200, TNFAIP8L3, SAG, PTGIS, ATP12A, ZNF557, SLC22A1, VPS53, PLAG1, KRT19
PB-CD14+ monocytes  0.6640625  FALSE  0.03125  8  RAB27A, RNASE2, NLRP3, GLIPR1, PPP4C, RNASE6, CORO1A, TACC3  0.04705882  8  TMEM127, LAT2, EMR1, FPR2, CARS2, HK3, TXNIP, DUSP2
fetal thyroid  0.6640625  FALSE  0.0390625  10  GSTT1, CRLF1, CA3, PPARG, ATP8B3, C3, ZNF34, GGT5, AKR1C4, ARHGEF40  0.05882353  10  RAB20, CLIC3, LMAN2L, TANC1, PNPLA3, HGSNAT, TXNIP, IL1A, PLAG1, KRT19
colorectal adenocarcinoma  0.60369318  FALSE  0.0390625  10  METTL9, BUB3, GSTT1, TACC3, RAD51AP1, CEP72, CDH3, ATF1, MYC, PRC1  0.06470588  11  RAB20, SLC20A1, RUVBL2, ARHGDIA, TANC1, GEMIN4, KRT15, NCAPH, NXN, KRT19, DUSP2
tongue  0.59765625  FALSE  0.03515625  9  CRLF1, CA3, ZNF10, GAN, TRIM58, CAPS2, SLC39A2, MEGF6, FGFR3  0.05882353  10  MYOM2, CLIC3, HFE2, GYS1, KRT15, CES2, IL1A, SPINK5, PPP1R13L, KRT19
skeletal muscle  0.59027778  FALSE  0.03125  8  IGLL1, XKR3, CA3, SLC2A11, HCRTR1, IL32, ALDOA, ZNF268  0.05294118  9  MYOM2, HFE2, CD160, CYP19A1, GYS1, KRT15, ANKRD23, SLC22A1, NXN
thyroid  0.58105469  FALSE  0.02734375  7  GSTT1, CA3, PPARG, TRIM58, C16orf53, MTNR1A, ATF1  0.04705882  8  MYOM2, CNNM4, RUVBL2, MAPKBP1, TANC1, CCNC, KRT19, CLIC3
uterus  0.48697917  FALSE  0.04296875  11  RNASE3, CYFIP1, GSTT2, ACTG2, MEGF6, THBS2, OR2G2, KANK1, AKR1C4, MTNR1A, ATF1  0.08823529  15  SLC37A3, SEMA4C, TNFAIP8L3, FAM19A3, ARID5A, RUVBL2, TANC1, LIX1L, PTGIS, CORIN, PIAS3, PPP1R12A, ZNF649, NXN, KRT19
PB-CD56+ NKCells  0.44270833  FALSE  0.03125  8  RAB27A, GLIPR1, PPP4C, TACC3, CORO1A, IL32, PPP2R5E, ATF1  0.07058824  12  MYOM2, TMEM127, SLC20A1, CLIC3, ARID5A, CD160, POLR3GL, PPP1R12A, HK3, TXNIP, ANKRD39, DUSP2
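
The derived columns in the tables above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the analysis code used for this thesis. It assumes that the enrichment column is the proband count divided by the sibling count, that the binomial column is an exact two-sided binomial test of the proband share against 0.5, and that the tissue Ratio column is the proband fraction divided by the sibling fraction. The function names are illustrative only.

```python
from math import comb


def enrichment(proband_count, sibling_count):
    """Proband-to-sibling count ratio (assumed definition of 'enrichment')."""
    return proband_count / sibling_count


def binomial_two_sided(proband_count, sibling_count):
    """Exact two-sided binomial p-value under a 50/50 null (assumed test)."""
    n = proband_count + sibling_count
    k = max(proband_count, sibling_count)
    # Upper tail P(X >= k) for X ~ Binomial(n, 0.5), computed with exact integers.
    upper = sum(comb(n, i) for i in range(k, n + 1))
    # The null is symmetric, so the two-sided p-value is twice the tail, capped at 1.
    return min(1.0, 2 * upper / 2 ** n)


def fraction_ratio(pro_hits, pro_total, sib_hits, sib_total):
    """Proband fraction over sibling fraction (assumed definition of 'Ratio')."""
    return (pro_hits / pro_total) / (sib_hits / sib_total)


# SRS-discordant pairs, # of CNVs: 316 proband vs 251 sibling events
# (the table lists enrichment 1.26 and binomial 0.0071).
print(round(enrichment(316, 251), 2), binomial_two_sided(316, 251))

# Temporal lobe: 13 of 256 proband genes vs 1 of 170 sibling genes
# (the table lists Ratio 8.6328125).
print(fraction_ratio(13, 256, 1, 170))
```

Applied to the row whose enrichment cell reads #VALUE! (15/256 versus 2/170), `fraction_ratio` would give the value that the spreadsheet failed to compute, under the same assumed definition.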