TY - JOUR
T1 - Critical review of NGS analyses for de novo genotyping multigene families
AU - Lighten, Jackie
AU - van Oosterhout, Cock
AU - Bentzen, Paul
N1 - Funding information: Natural Sciences and Engineering Research Council of Canada (NSERC); NSERC Discovery; Earth and Life Science Alliance (ELSA); Leverhulme Trust. Grant Number: RPG-2013-305
PY - 2014/8/1
Y1 - 2014/8/1
AB - The genotyping of highly polymorphic multigene families across many individuals used to be a particularly challenging task because of methodological limitations associated with traditional approaches. Next-generation sequencing (NGS) can overcome most of these limitations, and it is increasingly being applied in population genetic studies of multigene families. Here, we critically review NGS bioinformatic approaches that have been used to genotype the major histocompatibility complex (MHC) immune genes, and we discuss how the significant advances made in this field are applicable to population genetic studies of gene families. Increasingly, approaches are introduced that apply thresholds of sequencing depth and sequence similarity to separate alleles from methodological artefacts. We explain why these approaches are particularly sensitive to methodological biases by violating fundamental genotyping assumptions. An alternative strategy that utilizes ultra-deep sequencing (hundreds to thousands of sequences per amplicon) to reconstruct genotypes and applies statistical methods on the sequencing depth to separate alleles from artefacts appears to be more robust. Importantly, the ‘degree of change’ (DOC) method avoids using arbitrary cut-off thresholds by looking for statistical boundaries between the sequencing depth for alleles and artefacts, and hence, it is entirely repeatable across studies. Although the advances made in generating NGS data are still far ahead of our ability to perform reliable processing, analysis and interpretation, the community is developing statistically rigorous protocols that will allow us to address novel questions in evolution, ecology and genetics of multigene families. Future developments in third-generation single molecule sequencing may potentially help overcome problems that still persist in de novo multigene amplicon genotyping when using current second-generation sequencing approaches.
DO - 10.1111/mec.12843
M3 - Article
VL - 23
SP - 3957
EP - 3972
JO - Molecular Ecology
JF - Molecular Ecology
SN - 0962-1083
IS - 16
ER -