Finding sRNA generative locales from high-throughput sequencing data with NiBLS

Daniel MacLean, Vincent Moulton, David J. Studholme

Research output: Contribution to journal, Article, peer-reviewed

9 Citations (Scopus)

Abstract

Background:
Next-generation sequencing technologies allow researchers to obtain millions of sequence reads in a single experiment. One important use of the technology is the sequencing of small non-coding regulatory RNAs and the identification of the genomic locales from which they originate. Currently, there is a paucity of methods for finding small RNA generative locales.

Results:
We describe and implement an algorithm that can determine small RNA generative locales from high-throughput sequencing data. The algorithm creates a network, or graph, of the small RNAs by creating links between them depending on their proximity on the target genome. For each of the sub-networks in the resulting graph, the clustering coefficient, a measure of the interconnectedness of the sub-network, is used to identify the generative locales. We test the algorithm over a wide range of parameters using RFAM sequences as positive controls. We demonstrate that the algorithm has good sensitivity and specificity in a range of Arabidopsis and mouse small RNA sequence sets, and that the locales it generates are robust to differences in the choice of parameters.
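
The graph-based idea described above can be illustrated with a minimal sketch. This is not the NiBLS implementation: the function names, the `max_gap` and `min_coeff` parameters, the read representation as `(chromosome, start, end)` tuples, and the choice to average local clustering coefficients over each sub-network are all assumptions made for illustration only.

```python
from itertools import combinations


def build_graph(reads, max_gap):
    """Link two reads when they lie on the same chromosome and the gap
    between them is at most max_gap (hypothetical proximity rule)."""
    adj = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        (c1, s1, e1), (c2, s2, e2) = reads[i], reads[j]
        if c1 != c2:
            continue
        gap = max(s1, s2) - min(e1, e2)  # negative when the reads overlap
        if gap <= max_gap:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def connected_components(adj):
    """Return the sub-networks of the graph as lists of node indices."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def mean_clustering(adj, nodes):
    """Average local clustering coefficient over the given nodes."""
    total = 0.0
    for node in nodes:
        k = len(adj[node])
        if k < 2:
            continue  # coefficient is taken as 0 for degree < 2
        links = sum(1 for a, b in combinations(adj[node], 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(nodes)


def find_locales(reads, max_gap, min_coeff):
    """Report the span of each sub-network whose mean clustering
    coefficient reaches min_coeff (an assumed threshold)."""
    adj = build_graph(reads, max_gap)
    locales = []
    for component in connected_components(adj):
        coeff = mean_clustering(adj, component)
        if coeff >= min_coeff:
            chrom = reads[component[0]][0]
            start = min(reads[i][1] for i in component)
            end = max(reads[i][2] for i in component)
            locales.append((chrom, start, end, coeff))
    return sorted(locales)


# Toy data: three mutually overlapping reads, one isolated read,
# and a sparse chain of three reads on another chromosome.
reads = [
    ("chr1", 100, 121), ("chr1", 105, 126), ("chr1", 110, 131),
    ("chr1", 5000, 5021),
    ("chr2", 100, 121), ("chr2", 160, 181), ("chr2", 220, 241),
]
locales = find_locales(reads, max_gap=50, min_coeff=0.5)
```

In this toy input, only the three mutually overlapping reads on `chr1` form a densely interconnected sub-network. The isolated read and the sparse chain on `chr2` have a coefficient of zero and are not reported. The pairwise comparison is quadratic in the number of reads, so a real data set would need sorted coordinates or an interval index instead.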

Conclusions:
NiBLS is a fast, reliable and sensitive method for determining small RNA locales in high-throughput sequence data that is generally applicable to all classes of small RNA.
Original language: English
Article number: 93
Journal: BMC Bioinformatics
Volume: 11
Publication status: Published - 18 Feb 2010
