Improved Harmonic Mean Estimator for Phylogenetic Model Evidence

ARIMA, SERENA
2012

Abstract

Bayesian phylogenetic methods are generating noticeable enthusiasm in the field of molecular systematics. Many competing phylogenetic models are often under consideration, and different approaches are used to compare them within a Bayesian framework. The Bayes factor, defined as the ratio of the marginal likelihoods of two competing models, plays a key role in Bayesian model selection. We focus on an alternative estimator of the marginal likelihood, a quantity whose computation remains a challenging problem. Several computational solutions have been proposed, but none of them outperforms the others simultaneously in simplicity of implementation, computational burden, and precision of the estimates. Practitioners and researchers, often guided by the available software, have so far favored the simplicity of the harmonic mean (HM) estimator. However, the resulting estimates of the Bayesian evidence in favor of a model are known to be biased and often inaccurate, and they can even have infinite variance, which casts doubt on the reliability of the corresponding conclusions. We consider possible improvements based on the generalized harmonic mean (GHM) idea that recycle Markov chain Monte Carlo (MCMC) simulations from the posterior and share the computational simplicity of the original HM estimator but, unlike it, overcome the infinite-variance issue. We show the reliability and comparative performance of the improved harmonic mean estimators by comparing them with approximation techniques based on improved variants of thermodynamic integration.
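
To make the contrast between the HM and GHM ideas concrete, here is a minimal sketch on a toy problem, not on a phylogenetic model. Everything in it is an assumption made for illustration: the conjugate normal model (chosen because its marginal likelihood is known in closed form), the exact posterior draws standing in for MCMC output, the function names, and the thin-tailed normal instrumental density used in the GHM identity. It is a generic GHM estimator and should not be read as the improved estimator studied in the paper.

```python
import numpy as np

def log_sum_exp(a):
    # Numerically stable log(sum(exp(a))).
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def log_norm_pdf(x, mean, sd):
    # Log density of a normal distribution with the given mean and sd.
    return -0.5 * np.log(2 * np.pi) - np.log(sd) - 0.5 * ((x - mean) / sd) ** 2

def log_lik(mu, y):
    # Log likelihood of y_i ~ N(mu, 1) for each value in the array mu.
    return log_norm_pdf(y[None, :], mu[:, None], 1.0).sum(axis=1)

def exact_log_evidence(y, tau):
    # Closed-form log marginal likelihood under the prior mu ~ N(0, tau^2).
    n = y.size
    quad = np.sum(y ** 2) - tau ** 2 * np.sum(y) ** 2 / (1 + n * tau ** 2)
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(1 + n * tau ** 2) - 0.5 * quad

def hm_log_evidence(ll):
    # Harmonic mean: 1/m is estimated by the posterior average of 1/L(theta).
    return -(log_sum_exp(-ll) - np.log(ll.size))

def ghm_log_evidence(theta, ll, log_prior, shrink=0.5):
    # Generalized harmonic mean: 1/m is estimated by the posterior average of
    # g(theta) / (L(theta) * prior(theta)). Assumption: g is a normal density
    # fitted to the draws with its sd shrunk, so its tails are thinner than
    # the posterior's and the averaged ratio stays bounded.
    log_g = log_norm_pdf(theta, theta.mean(), shrink * theta.std())
    return -(log_sum_exp(log_g - ll - log_prior) - np.log(theta.size))

rng = np.random.default_rng(0)
tau = 10.0                                  # diffuse prior sd (assumed)
y = rng.normal(1.0, 1.0, size=20)           # synthetic data (assumed)

# Exact posterior draws stand in for MCMC output in this toy setting.
post_var = 1.0 / (y.size + 1.0 / tau ** 2)
post_mean = post_var * y.sum()
theta = rng.normal(post_mean, np.sqrt(post_var), size=20000)

ll = log_lik(theta, y)
lp = log_norm_pdf(theta, 0.0, tau)

exact = exact_log_evidence(y, tau)
hm = hm_log_evidence(ll)
ghm = ghm_log_evidence(theta, ll, lp)

print("exact log evidence:", exact)
print("HM estimate:       ", hm)
print("GHM estimate:      ", ghm)
```

Both estimators reuse the same posterior draws and cost one pass over them. The difference lies only in the averaged quantity: with a diffuse prior, the reciprocal likelihood has infinite variance under the posterior, whereas the ratio with a thin-tailed instrumental density is bounded. A log Bayes factor between two models would then be the difference of their two log-evidence estimates.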

Use this identifier to cite or link to this document: http://hdl.handle.net/11587/472117