Abstract
UPGMA (Unweighted Pair Group Method with Arithmetic Mean) is a widely used clustering method. Here we show that UPGMA is a greedy heuristic for the normalized equidistant minimum evolution (NEME) problem, that is, finding a rooted tree that minimizes the minimum evolution score relative to the dissimilarity matrix among all rooted trees with the same leaf-set in which all leaves have the same distance to the root. We prove that the NEME problem is NP-hard. In addition, we present some heuristic and approximation algorithms for solving the NEME problem, including a polynomial time algorithm that yields a binary, rooted tree whose NEME score is within O(log2n) of the optimum.
| Original language | English |
|---|---|
| Pages (from-to) | 1-15 |
| Number of pages | 15 |
| Journal | Theoretical Computer Science |
| Volume | 721 |
| Early online date | 1 Feb 2018 |
| DOIs | |
| Publication status | Published - 18 Apr 2018 |
Keywords
- UPGMA
- minimum evolution
- balanced minimum evolution
- hierarchical clustering