The evolution and population diversity of human-specific segmental duplications

Authors

Dennis, Megan Y
Harshman, Lana
Nelson, Bradley J
Penn, Osnat
Cantsilieris, Stuart
Huddleston, John
Antonacci, Francesca
Penewit, Kelsi
Denman, Laura
Raja, Archana

Publisher

Nature Ecology & Evolution

Abstract

Segmental duplications contribute significantly to the evolution, adaptation and disease-associated instability of the human genome. The largest and most identical duplications suffer from the poorest characterization, often corresponding to genome gaps and misassembly. Here we focus on creating a framework to understand the evolution, copy number variation and coding potential of human-specific segmental duplications (HSDs). We identify 218 HSDs (>5 kbp in length) based on analysis of 322 deeply sequenced ape and human genomes. We target 268 large-insert human bacterial artificial chromosomes, 85 of which have been incorporated into the most recent human reference build (GRCh38), correcting 24 large euchromatic gaps, and 269 nonhuman primate clones for finished sequencing in order to resolve the structure and evolution of the largest, most complex regions with protein-coding potential (n = 80 genes/33 gene families). Our analyses indicate that these HSDs (28 duplications ranging in length from 11–677 kbp) are non-randomly organized (P < 1 × 10⁻⁶) and cluster in association with core duplicons (P < 1 × 10⁻⁷), and that the majority represent intrachromosomal events arranged predominantly in an interspersed inverted orientation (18/26; P = 0.014). Phylogenetic reconstruction suggests different waves of HSD, with the latest burst occurring <1.3 million years ago. These 16 duplications and 28 genes would be specific to the genus Homo, including three gene families absent in ancient Neanderthal and Denisova genomes. Of particular interest is the TCAF1/TCAF2 family, which is the most stratified of the Homo sapiens-specific duplications and has been implicated in the somatosensation of cold.
Overall, copy number variation analysis (n=2,379 genomes), RNA sequence mapping (GTEx) and targeted resequencing of the protein-coding regions (n=3,275 controls) identify ten gene families where copy number never returns to the ancestral state, there is evidence of mRNA splicing and expression, and no common gene-disruptive mutation events are observed in the general population. We propose that this subset of genes, including functional paralogs ARHGAP11B and SRGAP2C, represents excellent candidates for the evolution of human-specific adaptive traits.

Citation

Dennis MY, Harshman L, Nelson BJ, Penn O, Cantsilieris S, Huddleston J, Antonacci F, Penewit K, Denman L, Raja A, Baker C, Mark K, Malig M, Janke N, Espinoza C, Stessman HAF, Nuttle X, Hoekzema K, Lindsay-Graves TA, Wilson RK, Eichler EE. (2017). The evolution and population diversity of human-specific segmental duplications. Nat Ecol Evol Feb 17;1:69. doi:10.1038/s41559-016-0069.

DOI