Swarm v2: Highly-scalable and high-resolution amplicon clustering

Frédéric Mahé, Torbjørn Rognes, Christopher Quince, Colomban de Vargas, Micah Dunthorn

Research output: Contribution to journal › Article › peer-review

400 Citations (Scopus)

Abstract

Previously we presented Swarm v1, a novel and open source amplicon clustering program that produced fine-scale molecular operational taxonomic units (OTUs), free of arbitrary global clustering thresholds and input-order dependency. Swarm v1 worked with an initial phase that used iterative single-linkage with a local clustering threshold (d), followed by a phase that used the internal abundance structures of clusters to break chained OTUs. Here we present Swarm v2, which has two important novel features: (1) a new algorithm for d = 1 that allows the computation time of the program to scale linearly with increasing amounts of data; and (2) the new fastidious option that reduces under-grouping by grafting low abundant OTUs (e.g., singletons and doubletons) onto larger ones. Swarm v2 also directly integrates the clustering and breaking phases, dereplicates sequencing reads with d = 0, outputs OTU representatives in fasta format, and plots individual OTUs as two-dimensional networks.
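
The idea behind clustering at d = 1 in roughly linear time can be illustrated with a minimal sketch. It is not the Swarm v2 source code, and it rests on assumptions that the abstract does not spell out: each amplicon's one-edit neighbours (called "microvariants" here) are enumerated and looked up in a hash set instead of comparing all pairs, and a neighbour joins a cluster only if it is no more abundant than the amplicon that recruited it, which stands in for the integrated breaking phase. The names `microvariants` and `cluster_d1` are hypothetical.

```python
from collections import deque

ALPHABET = "ACGT"


def microvariants(seq):
    """Yield every sequence one substitution, deletion or insertion away from seq.

    Duplicates can occur (e.g. deleting either base of "AA"); they are harmless
    because each candidate is only used for a set lookup.
    """
    for i in range(len(seq)):
        yield seq[:i] + seq[i + 1:]                  # deletion at position i
        for base in ALPHABET:
            if base != seq[i]:
                yield seq[:i] + base + seq[i + 1:]   # substitution at position i
    for i in range(len(seq) + 1):
        for base in ALPHABET:
            yield seq[:i] + base + seq[i:]           # insertion before position i


def cluster_d1(abundances):
    """Cluster dereplicated amplicons with a local threshold of one difference.

    abundances maps each unique sequence to its read count.
    Returns a list of clusters; each cluster is a list with its seed first.
    """
    unassigned = set(abundances)
    # Assumption: seeds are visited by decreasing abundance, ties alphabetically,
    # so the result does not depend on input order.
    order = sorted(abundances, key=lambda s: (-abundances[s], s))
    clusters = []
    for seed in order:
        if seed not in unassigned:
            continue
        unassigned.discard(seed)
        cluster = [seed]
        queue = deque([seed])
        while queue:
            parent = queue.popleft()
            for variant in microvariants(parent):
                # Hash lookup replaces an all-vs-all comparison. The abundance
                # test keeps a chain from climbing into a second abundance peak.
                if variant in unassigned and abundances[variant] <= abundances[parent]:
                    unassigned.discard(variant)
                    cluster.append(variant)
                    queue.append(variant)
        clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    toy = {"AAAA": 10, "AAAT": 1, "AATT": 8}
    for members in cluster_d1(toy):
        print(members)
```

In this sketch the work per amplicon depends on its length and the alphabet size, not on how many amplicons are in the dataset, apart from the initial sort by abundance. In the toy input, "AAAT" links "AAAA" and "AATT" by single differences, but the abundance rule stops "AATT" from being recruited through the rarer "AAAT", so it seeds its own cluster.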

Original language: English
Article number: e1420
Journal: PeerJ
Volume: 2015
Issue number: 12
Publication status: Published - 2015

Keywords

  • Barcoding
  • Environmental diversity
  • Molecular operational taxonomic units
