TY - JOUR
T1 - Linked-read sequencing enables haplotype-resolved resequencing at population scale
AU - Lutgen, Dave
AU - Ritter, Raphael
AU - Olsen, Remi-André
AU - Schielzeth, Holger
AU - Gruselius, Joel
AU - Ewels, Phil
AU - García, Jesús T
AU - Shirihai, Hadoram
AU - Schweizer, Manuel
AU - Suh, Alexander
AU - Burri, Reto
PY - 2020/9
Y1 - 2020/9
AB - The feasibility of sequencing entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences - including the quantification and dating of admixture, introgression and demographic events, and inference of selective sweeps - are still limited by the lack of high-quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype-resolved genome resequencing at population scale, we investigated properties of linked-read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results based on the comparison of downsampled (25x, 20x, 15x, 10x, 7x, and 5x) with high-coverage data (46-68x) of seven bird genomes mapped to a reference suggest that phasing contiguities and accuracies adequate for most population genomic analyses can already be reached with moderate sequencing effort. At 15x coverage, phased haplotypes span about 90% of the genome assembly, with 50 and 90 percent of phased sequences located in phase blocks longer than 1.25-4.6 Mb (N50) and 0.27-0.72 Mb (N90). Phasing accuracy reaches beyond 99% starting from 15x coverage. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb (N50/N90) at 25x coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher-quality DNA may help keep sequencing costs at bay. In conclusion, even for organisms with gigabase-sized genomes like birds, linked-read sequencing at moderate depth opens an affordable avenue towards haplotype-resolved genome resequencing at population scale.
KW - admixture
KW - demography
KW - introgression
KW - phasing
KW - population genomics
KW - selective sweeps
UR - http://www.scopus.com/inward/record.url?scp=85087172327&partnerID=8YFLogxK
U2 - 10.1111/1755-0998.13192
DO - 10.1111/1755-0998.13192
M3 - Article
C2 - 32419391
VL - 20
SP - 1311
EP - 1322
JO - Molecular Ecology Resources
JF - Molecular Ecology Resources
SN - 1755-098X
IS - 5
ER -