The MinMax Squeeze: Guaranteeing a minimal tree for population data

B. Holland, K.T. Huber, D. Penny, V. Moulton

Research output: Contribution to journalArticlepeer-review

15 Citations (Scopus)

Abstract

We report that for population data, where sequences are very similar to one another, it is often possible to use a two-pronged (MinMax Squeeze) approach to prove that a tree is the shortest possible under the parsimony criterion. Such population data can be in a range where parsimony is a maximum likelihood estimator. This is in sharp contrast to the case with species data, where sequences are much further apart and the problem of guaranteeing an optimal phylogenetic tree is known to be computationally prohibitive for realistic numbers of species, irrespective of whether likelihood or parsimony is the optimality criterion. The Squeeze uses both an upper bound (the length of the shortest tree known) and a lower bound derived from partitions of the columns (the length of the shortest tree possible). If the two bounds meet, the shortest known tree is thus proven to be a shortest possible tree. The implementation is first tested on simulated data sets and then applied to 53 complete human mitochondrial genomes. The shortest possible trees for those data have several significant improvements from the published tree. Namely, a pair of Australian lineages comes deeper in the tree (in agreement with archaeological data), and the non-African part of the tree shows greater agreement with the geographical distribution of lineages.
Original languageEnglish
Pages (from-to)235-242
Number of pages8
JournalMolecular Biology and Evolution
Volume22
Issue number2
DOIs
Publication statusPublished - 2005

Cite this