Towards building the tree of life: a simulation study for all angiosperm genera.

Details

Serval ID
serval:BIB_415B7A05529C
Type
Article: article in a journal or magazine.
Collection
Publications
Title
Towards building the tree of life: a simulation study for all angiosperm genera.
Journal
Systematic Biology
Author(s)
Salamin N., Hodkinson T.R., Savolainen Coates V.
ISSN
1063-5157 (Print)
ISSN-L
1063-5157
Publication status
Published
Publication date
2005
Peer-reviewed
Yes
Volume
54
Issue
2
Pages
183-196
Language
English
Abstract
Comprehensive phylogenetic trees are essential tools to better understand evolutionary processes. For many groups of organisms or projects aiming to build the Tree of Life, comprehensive phylogenetic analysis implies sampling hundreds to thousands of taxa. For the tree of all life this task rises to a highly conservative 13 million. Here, we assessed the performances of methods to reconstruct large trees using Monte Carlo simulations with parameters inferred from four large angiosperm DNA matrices, containing between 141 and 567 taxa. For each data set, parameters of the HKY85+G model were estimated and used to simulate 20 new matrices for sequence lengths from 100 to 10,000 base pairs. Maximum parsimony and neighbor joining were used to analyze each simulated matrix. In our simulations, accuracy was measured by counting the number of nodes in the model tree that were correctly inferred. The accuracy of the two methods increased very quickly with the addition of characters before reaching a plateau around 1000 nucleotides for any sizes of trees simulated. An increase in the number of taxa from 141 to 567 did not significantly decrease the accuracy of the methods used, despite the increase in the complexity of tree space. Moreover, the distribution of branch lengths rather than the rate of evolution was found to be the most important factor for accurately inferring these large trees. Finally, a tree containing 13,000 taxa was created to represent a hypothetical tree of all angiosperm genera and the efficiency of phylogenetic reconstructions was tested with simulated matrices containing an increasing number of nucleotides up to a maximum of 30,000. Even with such a large tree, our simulations suggested that simple heuristic searches were able to infer up to 80% of the nodes correctly.
Keywords
Angiosperms/genetics, Base Sequence, Classification/methods, Cluster Analysis, Computer Simulation, Models, Genetic, Monte Carlo Method, Phylogeny, Reproducibility of Results, Sample Size
PubMed
Web of Science
Open Access
Yes
Record created
24/01/2008 19:41
Record last modified
08/05/2019 17:40