Mining literature for protein-protein interactions.
Details
Serval ID
serval:BIB_6FD464E77A85
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
Mining literature for protein-protein interactions.
Journal
Bioinformatics
ISSN
1367-4803 (Print)
ISSN-L
1367-4803
Editorial status
Published
Publication date
2001
Volume
17
Issue
4
Pages
359-363
Language
English
Abstract
MOTIVATION: A central problem in bioinformatics is how to capture information from the vast current scientific literature in a form suitable for analysis by computer. We address the special case of information on protein-protein interactions, and show that the frequencies of words in Medline abstracts can be used to determine whether or not a given paper discusses protein-protein interactions. For those papers determined to discuss this topic, the relevant information can be captured for the Database of Interacting Proteins. Furthermore, suitable gene annotations can also be captured. RESULTS: Our Bayesian approach scores Medline abstracts for probability of discussing the topic of interest according to the frequencies of discriminating words found in the abstract. More than 80 discriminating words (e.g. complex, interaction, two-hybrid) were determined from a training set of 260 Medline abstracts corresponding to previously validated entries in the Database of Interacting Proteins. Using these words and a log likelihood scoring function, approximately 2000 Medline abstracts were identified as describing interactions between yeast proteins. This approach now forms the basis for the rapid expansion of the Database of Interacting Proteins.
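The scoring idea summarized in the abstract can be illustrated with a minimal Python sketch. This is not the authors' scoring function, which the abstract does not specify. It assumes a simple per-word presence model with a pseudocount, and it sums log likelihood ratios over the discriminating words that occur in an abstract. The function names (`tokenize`, `train_weights`, `score`), the pseudocount value, and the toy training sentences are all hypothetical. Only the three example words (complex, interaction, two-hybrid) come from the record.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into words, keeping internal hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def document_frequency(docs, vocabulary):
    """Count, for each vocabulary word, how many documents contain it."""
    counts = {word: 0 for word in vocabulary}
    for doc in docs:
        present = set(tokenize(doc))
        for word in vocabulary:
            if word in present:
                counts[word] += 1
    return counts


def train_weights(positive_docs, background_docs, vocabulary, pseudocount=0.5):
    """Return a log likelihood ratio per word: log P(word | topic) / P(word | background).

    The pseudocount (an assumed value) avoids taking the log of zero.
    """
    pos_df = document_frequency(positive_docs, vocabulary)
    bg_df = document_frequency(background_docs, vocabulary)
    weights = {}
    for word in vocabulary:
        p_pos = (pos_df[word] + pseudocount) / (len(positive_docs) + 2 * pseudocount)
        p_bg = (bg_df[word] + pseudocount) / (len(background_docs) + 2 * pseudocount)
        weights[word] = math.log(p_pos / p_bg)
    return weights


def score(abstract, weights):
    """Sum the weights of the discriminating words present in the abstract."""
    present = set(tokenize(abstract))
    return sum(weight for word, weight in weights.items() if word in present)


# Toy data, invented for illustration only.
positive = [
    "The two-hybrid screen shows an interaction between the proteins.",
    "The proteins form a complex; the interaction is direct.",
]
background = [
    "The gene is expressed in liver tissue.",
    "We sequenced the genome of the organism.",
]
vocabulary = {"complex", "interaction", "two-hybrid"}

weights = train_weights(positive, background, vocabulary)
on_topic = score("A two-hybrid assay detected an interaction.", weights)
off_topic = score("The genome was sequenced.", weights)
```

In this sketch an abstract would be flagged as on-topic when its score exceeds a chosen threshold. The threshold, like the pseudocount, would have to be tuned on validated data such as the training set the abstract mentions.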
Keywords
Algorithms, Bayes Theorem, Information Storage and Retrieval, MEDLINE, Proteins/metabolism
Pubmed
Web of science
Open Access
Yes
Record created
18/10/2012 9:15
Record last modified
20/08/2019 14:28