TY - JOUR
T1 - Variability of inverted repeats in all available genomes of bacteria
AU - Porubiaková, Otília
AU - Havlík, Jan
AU - Indu
AU - Šedý, Michal
AU - Přepechalová, Veronika
AU - Bartas, Martin
AU - Bidula, Stefan
AU - Šťastný, Jiří
AU - Fojta, Miroslav
AU - Brázda, Václav
N1 - Funding information: This research was funded by the Czech Science Foundation, grant number 22-21903S, and the SYMBIT project (registration number CZ.02.1.01/0.0/0.0/15_003/0000477) financed by the ERDF.
PY - 2023/8/17
Y1 - 2023/8/17
N2 - Noncanonical secondary structures in nucleic acids have been studied intensively in recent years. Important biological roles of cruciform structures formed by inverted repeats (IRs) have been demonstrated in diverse organisms, including humans. Using Palindrome analyser, we analyzed IRs in all accessible bacterial genome sequences to determine their frequencies, lengths, and localizations. IR sequences were identified in all species, but their frequencies differed significantly across various evolutionary groups. We detected 242,373,717 IRs in all 1,565 bacterial genomes. The highest mean IR frequency was detected in the Tenericutes (61.89 IRs/kbp) and the lowest mean frequency was found in the Alphaproteobacteria (27.08 IRs/kbp). IRs were abundant near genes and around regulatory, tRNA, transfer-messenger RNA (tmRNA), and rRNA regions, pointing to the importance of IRs in such basic cellular processes as genome maintenance, DNA replication, and transcription. Moreover, we found that organisms with high IR frequencies were more likely to be endosymbiotic, antibiotic producing, or pathogenic. On the other hand, those with low IR frequencies were far more likely to be thermophilic. This first comprehensive analysis of IRs in all available bacterial genomes demonstrates their genomic ubiquity, nonrandom distribution, and enrichment in genomic regulatory regions.
KW - Bacteria domain
KW - Bacterial genome analysis
KW - Inverted repeats
KW - Palindrome analyser
UR - http://www.scopus.com/inward/record.url?scp=85171994866&partnerID=8YFLogxK
U2 - 10.1128/spectrum.01648-23
DO - 10.1128/spectrum.01648-23
M3 - Article
VL - 11
JO - Microbiology Spectrum
JF - Microbiology Spectrum
SN - 2165-0497
IS - 4
M1 - e01648-23
ER -