TY - JOUR
T1 - SQUIRREL: Reconstructing semi-directed phylogenetic level-1 networks from four-leaved networks or sequence alignments
AU - Holtgrefe, Niels
AU - Huber, Katharina
AU - van Iersel, Leo
AU - Jones, Mark
AU - Martin, Samuel
AU - Moulton, Vincent
N1 - Data Availability Statement: The generated networks, Python scripts, sequence alignments and numerical results of the experiments in this paper are available at https://github.com/nholtgrefe/squirrel.
Funding Information: This work received funding from grants OCENW.M.21.306 (N.H., L.v.I., and M.J.) and OCENW.KLEIN.125 (L.v.I. and M.J.) of the Dutch Research Council (NWO). Part of this work was done while some of the authors were in residence at the Institute for Computational and Experimental Research in Mathematics (ICERM) in Providence (RI, USA) during the Theory, Methods, and Applications of Quantitative Phylogenomics program [supported by grant DMS-1929284 of the National Science Foundation (NSF)].
PY - 2025/4
Y1 - 2025/4
N2 - With the increasing availability of genomic data, biologists aim to find more accurate descriptions of evolutionary histories influenced by secondary contact, where diverging lineages reconnect before diverging again. Such reticulate evolutionary events can be more accurately represented in phylogenetic networks than in phylogenetic trees. Since the root location of phylogenetic networks cannot be inferred from biological data under several evolutionary models, we consider semi-directed (phylogenetic) networks: partially directed graphs without a root in which the directed edges represent reticulate evolutionary events. By specifying a known outgroup, the rooted topology can be recovered from such networks. We introduce the algorithm Squirrel (Semi-directed Quarnet-based Inference to Reconstruct Level-1 Networks) which constructs a semi-directed level-1 network from a full set of quarnets (four-leaf semi-directed networks). Our method also includes a heuristic to construct such a quarnet set directly from sequence alignments. We demonstrate Squirrel's performance through simulations and on real sequence data sets, the largest of which contains 29 aligned sequences close to 1.7 Mb long. The resulting networks are obtained on a standard laptop within a few minutes. Lastly, we prove that Squirrel is combinatorially consistent: given a full set of quarnets coming from a triangle-free semi-directed level-1 network, it is guaranteed to reconstruct the original network. Squirrel is implemented in Python, has an easy-to-use graphical user interface that takes sequence alignments or quarnets as input, and is freely available at https://github.com/nholtgrefe/squirrel.
KW - network reconstruction
KW - quarnet
KW - rooted phylogenetic network
KW - semi-directed phylogenetic network
KW - sequence alignment
KW - traveling salesman problem
UR - http://www.scopus.com/inward/record.url?scp=105002604037&partnerID=8YFLogxK
U2 - 10.1093/molbev/msaf067
DO - 10.1093/molbev/msaf067
M3 - Article
SN - 0737-4038
VL - 42
JO - Molecular Biology and Evolution
JF - Molecular Biology and Evolution
IS - 4
M1 - msaf067
ER -