Distributions of cherries and pitchforks for the Ford model

Gursharn Kaur, Kwok Pui Choi, Taoyang Wu

Research output: Contribution to journal › Article › peer-review

Abstract

Distributional properties of tree shape statistics under random phylogenetic tree models play an important role in investigating the evolutionary forces underlying observed phylogenies. In this paper, we study two subtree counting statistics, the number of cherries and the number of pitchforks, under the Ford model (the alpha model introduced by Daniel Ford). This is a one-parameter family of random phylogenetic tree models that includes the proportional to distinguishable arrangements (PDA) model and the Yule model, two tree models commonly used in phylogenetics. Based on a non-uniform version of the extended Pólya urn models, in which negative entries are permitted in the replacement matrices, we obtain the strong law of large numbers and the central limit theorem for the joint distribution of these two statistics under the Ford model. Furthermore, we derive a recursive formula for computing their exact joint distribution. This leads to exact formulas for their means and higher order asymptotic expansions of their second moments, which allows us to identify a critical parameter value for the correlation between the two statistics: when the number of tree leaves is sufficiently large, they are negatively correlated for 0≤α≤1/2 and positively correlated for 1/2<α<1.
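
To make the two statistics concrete, the following is a minimal simulation sketch, not the urn-based method of the paper. It assumes the usual description of Ford's alpha model: starting from a two-leaf tree, each new leaf is attached to an existing edge chosen with weight 1−α for a pendant (leaf) edge and weight α for an internal edge, with the root edge counted as internal. The function names `ford_tree` and `count_cherries_pitchforks` are hypothetical, and the tree is built explicitly so that cherries (subtrees with two leaves) and pitchforks (subtrees with three leaves) can be counted directly.

```python
import random


def ford_tree(n, alpha, rng=None):
    """Grow a rooted binary tree with n >= 2 leaves (assumed Ford alpha rule).

    Returns a dict mapping each internal node to its pair of children;
    leaves are the nodes that never appear as keys.
    """
    rng = rng or random.Random()
    children = {0: (1, 2)}   # start from the unique 2-leaf tree (one cherry)
    parent = {1: 0, 2: 0}    # the current root has no entry here
    nodes = [0, 1, 2]
    while len(nodes) < 2 * n - 1:   # a binary tree with n leaves has 2n-1 nodes
        # Each node stands for the edge above it; the root stands for the
        # root edge, which is treated as an internal edge (assumption).
        weights = [alpha if v in children else 1.0 - alpha for v in nodes]
        v = rng.choices(nodes, weights=weights)[0]
        u, leaf = len(nodes), len(nodes) + 1   # new internal node, new leaf
        nodes += [u, leaf]
        p = parent.get(v)
        if p is not None:
            # Subdivide the edge (p, v): u replaces v as a child of p.
            a, b = children[p]
            children[p] = (u, b) if a == v else (a, u)
            parent[u] = p
        children[u] = (v, leaf)
        parent[v] = u
        parent[leaf] = u
    return children


def count_cherries_pitchforks(children):
    """Count subtrees with exactly two leaves and with exactly three leaves."""
    def is_leaf(v):
        return v not in children

    def is_cherry(v):
        return v in children and all(is_leaf(c) for c in children[v])

    cherries = sum(1 for v in children if is_cherry(v))
    # A pitchfork is an internal node with one leaf child and one cherry child.
    pitchforks = sum(
        1 for a, b in children.values()
        if (is_leaf(a) and is_cherry(b)) or (is_leaf(b) and is_cherry(a))
    )
    return cherries, pitchforks
```

Under these assumptions, calling `count_cherries_pitchforks(ford_tree(n, alpha))` repeatedly for fixed `n` and `alpha` gives an empirical joint sample of the two counts. Setting `alpha = 0` corresponds to the Yule model and `alpha = 1/2` to the PDA model.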

Original language: English
Pages (from-to): 27-38
Number of pages: 12
Journal: Theoretical Population Biology
Volume: 149
Early online date: 22 Dec 2022
Publication status: E-pub ahead of print - 22 Dec 2022

Keywords

  • Ford's alpha model
  • Pólya urn model
  • Random trees
  • Subtree statistics
  • Tree shape
