TY - JOUR
T1 - Copy number variation arising from gene conversion on the human Y chromosome
AU - Shi, Wentao
AU - Massaia, Andrea
AU - Louzada, Sandra
AU - Banerjee, Ruby
AU - Hallast, Pille
AU - Chen, Yuan
AU - Bergström, Anders
AU - Gu, Yong
AU - Leonard, Steven
AU - Quail, Michael A.
AU - Ayub, Qasim
AU - Yang, Fengtang
AU - Tyler-Smith, Chris
AU - Xue, Yali
PY - 2017/12/5
Y1 - 2017/12/5
N2 - We describe the variation in copy number of a ~ 10 kb region overlapping the long intergenic noncoding RNA (lincRNA) gene, TTTY22, within the IR3 inverted repeat on the short arm of the human Y chromosome, leading to individuals with 0–3 copies of this region in the general population. Variation of this CNV is common, with 266 individuals having 0 copies, 943 (including the reference sequence) having 1, 23 having 2 copies, and two having 3 copies, and was validated by breakpoint PCR, fibre-FISH, and 10× Genomics Chromium linked-read sequencing in subsets of 1234 individuals from the 1000 Genomes Project. Mapping the changes in copy number to the phylogeny of these Y chromosomes previously established by the Project identified at least 20 mutational events, and investigation of flanking paralogous sequence variants showed that the mutations involved flanking sequences in 18 of these, and could extend over > 30 kb of DNA. While either gene conversion or double crossover between misaligned sister chromatids could formally explain the 0–2 copy events, gene conversion is the more likely mechanism, and these events include the longest non-allelic gene conversion reported thus far. Chromosomes with three copies of this CNV have arisen just once in our data set via another mechanism: duplication of 420 kb that places the third copy 230 kb proximal to the existing proximal copy. Our results establish gene conversion as a previously under-appreciated mechanism of generating copy number changes in humans and reveal the exceptionally large size of the conversion events that can occur.
AB - We describe the variation in copy number of a ~ 10 kb region overlapping the long intergenic noncoding RNA (lincRNA) gene, TTTY22, within the IR3 inverted repeat on the short arm of the human Y chromosome, leading to individuals with 0–3 copies of this region in the general population. Variation of this CNV is common, with 266 individuals having 0 copies, 943 (including the reference sequence) having 1, 23 having 2 copies, and two having 3 copies, and was validated by breakpoint PCR, fibre-FISH, and 10× Genomics Chromium linked-read sequencing in subsets of 1234 individuals from the 1000 Genomes Project. Mapping the changes in copy number to the phylogeny of these Y chromosomes previously established by the Project identified at least 20 mutational events, and investigation of flanking paralogous sequence variants showed that the mutations involved flanking sequences in 18 of these, and could extend over > 30 kb of DNA. While either gene conversion or double crossover between misaligned sister chromatids could formally explain the 0–2 copy events, gene conversion is the more likely mechanism, and these events include the longest non-allelic gene conversion reported thus far. Chromosomes with three copies of this CNV have arisen just once in our data set via another mechanism: duplication of 420 kb that places the third copy 230 kb proximal to the existing proximal copy. Our results establish gene conversion as a previously under-appreciated mechanism of generating copy number changes in humans and reveal the exceptionally large size of the conversion events that can occur.
U2 - 10.1007/s00439-017-1857-9
DO - 10.1007/s00439-017-1857-9
M3 - Article
VL - 137
SP - 73
EP - 83
JO - Human Genetics
JF - Human Genetics
SN - 0340-6717
ER -