Comparative performance of supertree algorithms in large data sets using the soapberry family (Sapindaceae) as a case study.
Details
Download: REF.pdf (390.50 KB)
Status: Public
Version: Final published version
License: Not specified
It was possible to publish this article open access thanks to a Swiss National Licence with the publisher.
ID Serval
serval:BIB_AD8F3F9AC659
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
Comparative performance of supertree algorithms in large data sets using the soapberry family (Sapindaceae) as a case study.
Journal
Systematic Biology
ISSN
1076-836X (Electronic)
ISSN-L
1063-5157
Editorial status
Published
Publication date
2011
Peer-reviewed
Yes
Volume
60
Issue
1
Pages
32-44
Language
English
Abstract
For the last two decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree, and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test if the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: on the one hand, the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria; on the other hand, the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, the most recent approach tested here, were promising, with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data. Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with the Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and the increase in computing power to handle large data sets. The latter would prove to be particularly useful for promising approaches such as the maximum quartet fit method, which still requires substantial computing power.
Keywords
Algorithms, Angiosperms/classification, Angiosperms/genetics, Classification/methods, DNA, Plant/genetics, Phylogeny, Sapindaceae/classification, Sapindaceae/genetics
Pubmed
Web of science
Open Access
Yes
Record created
09/06/2010 14:05
Record last modified
14/02/2022 7:56