TY - JOUR
T1 - Natural selection beyond genes
T2 - Identification and analyses of evolutionarily conserved elements in the genome of the collared flycatcher (Ficedula albicollis)
AU - Craig, Rory J.
AU - Suh, Alexander
AU - Wang, Mi
AU - Ellegren, Hans
N1 - Funding Information:
We thank Severin Uebbing, Dominik R. Laetsch and Judith E. Risse for advice on genome annotation, and Carina Farah Mugal, Ludovic Dutoit, Linnéa Smeds and Douglas Scofield for providing scripts and advice on various analyses. This work was supported by the Swedish Research Council (grant number 2013-8271) and the Knut and Alice Wallenberg Foundation (Project Grant). Computational analyses were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) through the Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX).
Publisher Copyright:
© 2017 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd
PY - 2018/1
Y1 - 2018/1
AB - It is becoming increasingly clear that a significant proportion of the functional sequence within eukaryotic genomes is noncoding. However, since the identification of conserved elements (CEs) has been restricted to a limited number of model organisms, the dynamics and evolutionary character of the genomic landscape of conserved, and hence likely functional, sequence is poorly understood in most species. Moreover, identification and analysis of the full suite of functional sequence are particularly important for the understanding of the genetic basis of trait loci identified in genome scans or quantitative trait locus mapping efforts. We report that ~6.6% of the collared flycatcher genome (74.0 Mb) is spanned by ~1.28 million CEs, a higher proportion of the genome but a lower total amount of conserved sequence than has been reported in mammals. We identified >200,000 CEs specific to either the archosaur, avian, neoavian or passeridan lineages, constituting candidates for lineage-specific adaptations. Importantly, no less than ~71% of CE sites were nonexonic (52.6 Mb), and conserved nonexonic sequence density was negatively correlated with functional exonic density at local genomic scales. Additionally, nucleotide diversity was strongly reduced at nonexonic conserved sites (0.00153) relative to intergenic nonconserved sites (0.00427). By integrating deep transcriptome sequencing and additional genome annotation, we identified novel protein-coding genes, long noncoding RNA genes and transposon-derived (exapted) CEs. The approach taken here based on the use of a progressive cactus whole-genome alignment to identify CEs should be readily applicable to nonmodel organisms in general and help to reveal the rich repertoire of putatively functional noncoding sequence as targets for selection.
KW - collared flycatcher
KW - comparative genomics
KW - conserved elements
KW - progressive cactus
KW - targets for selection
KW - transposon exaptation
UR - http://www.scopus.com/inward/record.url?scp=85040185543&partnerID=8YFLogxK
U2 - 10.1111/mec.14462
DO - 10.1111/mec.14462
M3 - Article
C2 - 29226517
AN - SCOPUS:85040185543
VL - 27
SP - 476
EP - 492
JO - Molecular Ecology
JF - Molecular Ecology
SN - 0962-1083
IS - 2
ER -