State aggregation for fast likelihood computations in molecular evolution.

Details

Resource 1
Download: btw632.pdf (804.89 KB)
State: Public
Version: Final published version
Secondary document(s)
Download: btw632_Supp.pdf (1184.50 KB)
State: Public
Version: author
Serval ID
serval:BIB_54224013033C
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
State aggregation for fast likelihood computations in molecular evolution.
Journal
Bioinformatics
Author(s)
Davydov I.I., Robinson-Rechavi M., Salamin N.
ISSN
1367-4811 (Electronic)
ISSN-L
1367-4803
Publication state
Published
Issued date
2016
Peer-reviewed
Yes
Volume
33
Pages
354-362
Language
English
Abstract
MOTIVATION: Codon models are widely used to identify the signature of selection at the molecular level and to test for changes in selective pressure during the evolution of genes encoding proteins. The large size of the state space of the Markov processes used to model codon evolution makes it difficult to use these models with large biological datasets. We propose here to use state aggregation to reduce the state space of codon models and, thus, improve the computational performance of likelihood estimation on these models.
RESULTS: We show that this heuristic speeds up the computations of the M0 and branch-site models by up to 6.8 times. We also show through simulations that state aggregation does not introduce a detectable bias. We analysed a real dataset and show that the predictions obtained with aggregation are highly correlated with those of the full likelihood computations. Finally, state aggregation is a very general approach and can be applied to any continuous-time Markov process-based model with a large state space, such as amino acid and coevolution models. We therefore discuss different ways to apply state aggregation to Markov models used in phylogenetics.
AVAILABILITY: The heuristic is implemented in the godon package (https://bitbucket.org/Davydov/godon) and in a version of FastCodeML (https://gitlab.isb-sib.ch/phylo/fastcodeml).
CONTACT: nicolas.salamin@unil.ch
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
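As an illustration of the idea described in the abstract, the following is a minimal sketch of state aggregation on a small continuous-time Markov rate matrix. It is not taken from godon or FastCodeML. The function name `aggregate_rate_matrix`, the choice of which states to keep, and the frequency-weighted averaging of rates out of the merged block are all assumptions made for this example; the article may define the aggregation differently.

```python
def aggregate_rate_matrix(Q, pi, keep):
    """Collapse every state not in `keep` into one aggregated state.

    Q    : n x n rate matrix (rows sum to zero), as a list of lists
    pi   : stationary frequencies of the n states
    keep : indices of the states that stay separate
    Returns (Q_agg, pi_agg); the aggregated state is the last index.
    Hypothetical helper written for illustration only.
    """
    n = len(Q)
    keep = list(keep)
    merged = [s for s in range(n) if s not in keep]
    if not merged:
        return [row[:] for row in Q], list(pi)

    w = sum(pi[s] for s in merged)      # total frequency of the merged block
    m = len(keep) + 1                   # kept states plus one aggregated state
    Q_agg = [[0.0] * m for _ in range(m)]

    for a, i in enumerate(keep):
        for b, j in enumerate(keep):
            if a != b:
                Q_agg[a][b] = Q[i][j]   # rates between kept states are unchanged
        # rate of leaving kept state i into any of the merged states
        Q_agg[a][m - 1] = sum(Q[i][s] for s in merged)

    for b, j in enumerate(keep):
        # frequency-weighted average rate from the merged block into state j
        Q_agg[m - 1][b] = sum(pi[s] * Q[s][j] for s in merged) / w

    # diagonal entries make each row sum to zero
    for a in range(m):
        Q_agg[a][a] = -sum(Q_agg[a][b] for b in range(m) if b != a)

    pi_agg = [pi[i] for i in keep] + [w]
    return Q_agg, pi_agg


# Toy example: a 4-state model with equal rates, keeping states 0 and 1
# and merging states 2 and 3 into a single aggregated state.
Q = [[-3.0, 1.0, 1.0, 1.0],
     [1.0, -3.0, 1.0, 1.0],
     [1.0, 1.0, -3.0, 1.0],
     [1.0, 1.0, 1.0, -3.0]]
pi = [0.25, 0.25, 0.25, 0.25]
Q_agg, pi_agg = aggregate_rate_matrix(Q, pi, keep=[0, 1])
```

In this toy case the state space shrinks from 4 to 3 states. For a codon model with 61 sense codons the reduction can be much larger, which is where the saving in matrix operations would come from. With frequency-weighted averaging, a reversible input matrix yields an aggregated matrix that still satisfies detailed balance with respect to `pi_agg`.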
Open Access
Yes
Create date
05/09/2016 16:10
Last modification date
20/08/2019 15:09