WIAS Preprint No. 1237 (2007)

Computing likelihoods for coalescents with multiple collisions in the infinitely-many-sites model

Authors


  • Birkner, Matthias
  • Blath, Jochen

2010 Mathematics Subject Classification

  • 92D15 60G09 60G52 60J75 60J85

Keywords


  • $\Lambda$-coalescent, likelihood-based inference, infinitely-many-sites, population genetics, Monte-Carlo method

Abstract




One of the central problems in mathematical genetics is the inference of evolutionary parameters of a population (such as the mutation rate) based on the observed genetic types in a finite DNA sample. If the population model under consideration is in the domain of attraction of the classical Fleming-Viot process, such as the Wright-Fisher or the Moran model, then the standard means to describe its genealogy is Kingman's coalescent. For this coalescent process, powerful inference methods are well-established. An important feature of the above class of models is, roughly speaking, that the number of offspring of each individual is small compared to the total population size, and hence all ancestral collisions are binary. Recently, more general population models have been studied, in particular those in the domain of attraction of so-called generalised $\Lambda$-Fleming-Viot processes, together with their (dual) genealogies, given by the so-called $\Lambda$-coalescents, which allow multiple collisions. Moreover, Eldon and Wakeley (2006) provide evidence that such more general coalescents might actually be more adequate to describe real populations with extreme reproductive behaviour, in particular many marine species. In this paper, we extend methods of Ethier and Griffiths (1987) and Griffiths and Tavaré (1994, 1995) to obtain a likelihood-based inference method for general $\Lambda$-coalescents. In particular, we obtain a method to compute (approximate) likelihood surfaces for the observed type probabilities of a given sample. We argue that within the (vast) family of $\Lambda$-coalescents, the parametrisable sub-family of Beta$(2-\alpha, \alpha)$-coalescents, where $\alpha \in (1,2]$, is of particular relevance. We illustrate our method using simulated datasets, thus obtaining maximum-likelihood estimators of mutation and demographic parameters.
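
To make the Beta$(2-\alpha, \alpha)$ sub-family concrete, the following minimal Python sketch evaluates the standard $\Lambda$-coalescent collision rates $\lambda_{b,k} = \int_0^1 x^{k-2}(1-x)^{b-k}\,\Lambda(dx)$, the rate at which any given $k$ out of $b$ ancestral lineages merge. For $\Lambda = $ Beta$(2-\alpha, \alpha)$ with $\alpha \in (1,2)$ this integral reduces to a ratio of Beta functions, and $\alpha = 2$ is treated as the Kingman case with binary mergers only. This is an illustration, not the authors' implementation: the function names are hypothetical, and the likelihood computation described in the abstract is not reproduced here.

```python
import math


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b), valid for a > 0 and b > 0."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_coalescent_rate(b: int, k: int, alpha: float) -> float:
    """Rate at which a fixed set of k out of b lineages merges.

    Hypothetical helper: lambda_{b,k} = B(k - alpha, b - k + alpha) / B(2 - alpha, alpha)
    for alpha in (1, 2); alpha == 2 is handled as Kingman's coalescent.
    """
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    if not 1.0 < alpha <= 2.0:
        raise ValueError("need alpha in (1, 2]")
    if alpha == 2.0:
        # Kingman limit: only pairs merge, each pair at rate 1.
        return 1.0 if k == 2 else 0.0
    return math.exp(log_beta(k - alpha, b - k + alpha) - log_beta(2.0 - alpha, alpha))


def total_merger_rate(b: int, alpha: float) -> float:
    """Total rate of any merger event while b lineages are present."""
    return sum(math.comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))


if __name__ == "__main__":
    # Rates for 5 lineages at an arbitrarily chosen alpha = 1.5.
    for k in range(2, 6):
        print(k, beta_coalescent_rate(5, k, 1.5))
    print("total:", total_merger_rate(5, 1.5))
```

Working with `math.lgamma` on the log scale avoids overflow of the Gamma function for large samples. Values of $\alpha$ close to 1 put more weight on large multiple mergers, while the rates approach those of Kingman's coalescent as $\alpha \to 2$.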

Appeared in

  • J. Math. Biol., 57 (2008) pp. 435--465.
