An alignment confidence score capturing robustness to guide tree uncertainty.

Details

Serval ID
serval:BIB_F734415E5D7E
Type
Article: article from a journal or magazine.
Collection
Publications
Title
An alignment confidence score capturing robustness to guide tree uncertainty.
Journal
Molecular biology and evolution
Authors
Penn O., Privman E., Landan G., Graur D., Pupko T.
ISSN
1537-1719 (Electronic)
ISSN-L
0737-4038
Publication status
Published
Publication date
08/2010
Peer-reviewed
Yes
Volume
27
Issue
8
Pages
1759-1767
Language
English
Notes
Publication types: Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
Publication Status: ppublish
Abstract
Multiple sequence alignment (MSA) is the basis for a wide range of comparative sequence analyses from molecular phylogenetics to 3D structure prediction. Sophisticated algorithms have been developed for sequence alignment, but in practice, many errors can be expected and extensive portions of the MSA are unreliable. Hence, it is imperative to understand and characterize the various sources of errors in MSAs and to quantify site-specific alignment confidence. In this paper, we show that uncertainties in the guide tree used by progressive alignment methods are a major source of alignment uncertainty. We use this insight to develop a novel method for quantifying the robustness of each alignment column to guide tree uncertainty. We build on the widely used bootstrap method for perturbing the phylogenetic tree. Specifically, we generate a collection of trees and use each as a guide tree in the alignment algorithm, thus producing a set of MSAs. We next test the consistency of every column of the MSA obtained from the unperturbed guide tree with respect to the set of MSAs. We name this measure the "GUIDe tree based AligNment ConfidencE" (GUIDANCE) score. Using the Benchmark Alignment data BASE benchmark as well as simulation studies, we show that GUIDANCE scores accurately identify errors in MSAs. Additionally, we compare our results with the previously published Heads-or-Tails score and show that the GUIDANCE score is a better predictor of unreliably aligned regions.
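The column-consistency step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formula, so the sketch assumes one plausible definition, in which a column's score is the fraction of its aligned residue pairs that are also aligned in each perturbed MSA, averaged over the set. The function names (`residue_index_columns`, `aligned_pairs`, `column_scores`) are hypothetical, alignments are assumed to be lists of equal-length gapped strings with sequences in the same order, and bootstrapping the guide tree and running the aligner are left out.

```python
from itertools import combinations

GAP = "-"


def residue_index_columns(msa):
    """For each column, give per sequence the index of the residue
    found there (None where the sequence has a gap)."""
    counters = [0] * len(msa)
    columns = []
    for col in range(len(msa[0])):
        entry = []
        for s, row in enumerate(msa):
            if row[col] == GAP:
                entry.append(None)
            else:
                entry.append(counters[s])
                counters[s] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(msa):
    """Set of ((seq_a, res_a), (seq_b, res_b)) residue pairs that share a column."""
    pairs = set()
    for entry in residue_index_columns(msa):
        present = [(s, r) for s, r in enumerate(entry) if r is not None]
        pairs.update(combinations(present, 2))
    return pairs


def column_scores(base_msa, perturbed_msas):
    """Assumed score: per base column, the mean fraction of its residue
    pairs that are also aligned in the perturbed MSAs.
    Columns with fewer than two residues get None."""
    if not perturbed_msas:
        raise ValueError("need at least one perturbed MSA")
    pair_sets = [aligned_pairs(m) for m in perturbed_msas]
    scores = []
    for entry in residue_index_columns(base_msa):
        present = [(s, r) for s, r in enumerate(entry) if r is not None]
        pairs = list(combinations(present, 2))
        if not pairs:
            scores.append(None)
            continue
        hits = sum(pair in ps for pair in pairs for ps in pair_sets)
        scores.append(hits / (len(pairs) * len(pair_sets)))
    return scores


if __name__ == "__main__":
    base = ["ACG", "ACG"]                # MSA from the unperturbed guide tree
    perturbed = [["ACG", "ACG"],         # MSAs from perturbed guide trees
                 ["AC-G", "A-CG"]]
    print(column_scores(base, perturbed))  # [1.0, 0.5, 1.0]
```

In this toy input the middle column is reproduced by only one of the two perturbed alignments, so it scores 0.5, while the flanking columns score 1.0.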
Keywords
Algorithms, Amino Acid Sequence, Animals, Base Sequence, Computer Simulation, Databases, Factual, Drosophila melanogaster/genetics, Molecular Sequence Data, Phylogeny, ROC Curve, Sequence Alignment/methods, Sequence Analysis, DNA/methods, Software
Open Access
Yes
Record created
20/01/2011 16:40
Record last modified
27/07/2023 14:29