Making sense of score statistics for sequence alignments
Details
Serval ID
serval:BIB_CD62AA985FEC
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
Making sense of score statistics for sequence alignments
Journal
Briefings in Bioinformatics
ISSN
1467-5463 (Print)
Publication status
Published
Publication date
03/2001
Volume
2
Issue
1
Pages
51-67
Notes
Comparative Study
Journal Article
Research Support, Non-U.S. Gov't --- Old month value: Mar
Abstract
The search for similarity between two biological sequences lies at the core of many applications in bioinformatics. This paper aims to highlight a few of the principles that should be kept in mind when evaluating the statistical significance of alignments between sequences. The extreme value distribution is first introduced, which in most cases describes the distribution of alignment scores between a query and a database. The effects of the similarity matrix and gap penalty values on the score distribution are then examined, and it is shown that the alignment statistics can undergo an abrupt phase transition. A few types of random sequence databases used in the estimation of statistical significance are presented, and the statistics employed by the BLAST, FASTA and PRSS programs are compared. Finally the different strategies used to assess the statistical significance of the matches produced by profiles and hidden Markov models are presented.
Keywords
Amino Acid Sequence
Animals
Computational Biology
Databases, Factual
Humans
Markov Chains
Models, Statistical
Molecular Sequence Data
Proteins/genetics
Sequence Alignment/*statistics & numerical data
Sequence Homology, Amino Acid
PubMed
Record created
24/01/2008 15:39
Record last modified
20/08/2019 15:48