TY - JOUR
T1 - Estimates of genetic differentiation measured by FST do not necessarily require large sample sizes when using many SNP markers
AU - Willing, Eva-Maria
AU - Dreyer, Christine
AU - Van Oosterhout, Cock
N1 - © Willing et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2012/8/14
Y1 - 2012/8/14
N2 - Population genetic studies provide insights into the evolutionary processes that influence the distribution of sequence
variants within and among wild populations. FST is among the most widely used measures for genetic differentiation and
plays a central role in ecological and evolutionary genetic studies. It is commonly thought that large sample sizes are
required in order to precisely infer FST and that small sample sizes lead to overestimation of genetic differentiation. Until
recently, studies in ecological model organisms incorporated a limited number of genetic markers, but since the emergence
of next generation sequencing, the panel size of genetic markers available even in non-reference organisms has rapidly
increased. In this study we examine whether a large number of genetic markers can substitute for small sample sizes when
estimating FST. We tested the behavior of three different estimators that infer FST and that are commonly used in population
genetic studies. By simulating populations, we assessed the effects of sample size and the number of markers on the various
estimates of genetic differentiation. Furthermore, we tested the effect of ascertainment bias on these estimates. We show
that the population sample size can be significantly reduced (as small as n = 4–6) when using an appropriate estimator and a
large number of bi-allelic genetic markers (k > 1,000). Therefore, conservation genetic studies can now obtain almost the
same statistical power as studies performed on model organisms using markers developed with next-generation
sequencing.
AB - Population genetic studies provide insights into the evolutionary processes that influence the distribution of sequence
variants within and among wild populations. FST is among the most widely used measures for genetic differentiation and
plays a central role in ecological and evolutionary genetic studies. It is commonly thought that large sample sizes are
required in order to precisely infer FST and that small sample sizes lead to overestimation of genetic differentiation. Until
recently, studies in ecological model organisms incorporated a limited number of genetic markers, but since the emergence
of next generation sequencing, the panel size of genetic markers available even in non-reference organisms has rapidly
increased. In this study we examine whether a large number of genetic markers can substitute for small sample sizes when
estimating FST. We tested the behavior of three different estimators that infer FST and that are commonly used in population
genetic studies. By simulating populations, we assessed the effects of sample size and the number of markers on the various
estimates of genetic differentiation. Furthermore, we tested the effect of ascertainment bias on these estimates. We show
that the population sample size can be significantly reduced (as small as n = 4–6) when using an appropriate estimator and a
large number of bi-allelic genetic markers (k > 1,000). Therefore, conservation genetic studies can now obtain almost the
same statistical power as studies performed on model organisms using markers developed with next-generation
sequencing.
U2 - 10.1371/journal.pone.0042649
DO - 10.1371/journal.pone.0042649
M3 - Article
VL - 7
JO - PLoS One
JF - PLoS One
SN - 1932-6203
IS - 8
M1 - e42649
ER -