CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation.

Details

Resource 1 Download: journal.pcbi.1010075.pdf (3194.15 KB)
Status: Public
Version: Author's accepted manuscript
Licence: CC BY 4.0
Serval ID
serval:BIB_4EB542E53FA1
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
CrowdGO: Machine learning and semantic similarity guided consensus Gene Ontology annotation.
Journal
PLoS computational biology
Authors
Reijnders M.J.M.F., Waterhouse R.M.
ISSN
1553-7358 (Electronic)
ISSN-L
1553-734X
Publication status
Published
Publication date
13/05/2022
Peer-reviewed
Yes
Scientific editor
Fetrow Jacquelyn S.
Volume
18
Issue
5
Pages
e1010075
Language
English
Notes
Publication types: Journal Article
Publication Status: aheadofprint
Abstract
Characterising gene function for the ever-increasing number and diversity of species with annotated genomes relies almost entirely on computational prediction methods. These software tools are also numerous and diverse, each with different strengths and weaknesses as revealed through community benchmarking efforts. Meta-predictors that assess consensus and conflict from individual algorithms should deliver enhanced functional annotations. To exploit the benefits of meta-approaches, we developed CrowdGO, an open-source consensus-based Gene Ontology (GO) term meta-predictor that employs machine learning models with GO term semantic similarities and information contents. By re-evaluating each gene-term annotation, a consensus dataset is produced with high-scoring confident annotations and low-scoring rejected annotations. Applying CrowdGO to results from a deep learning-based, a sequence similarity-based, and two protein domain-based methods delivers consensus annotations with improved precision and recall. Furthermore, using standard evaluation measures, CrowdGO performance matches that of the community's best performing individual methods. CrowdGO therefore offers a model-informed approach to leverage strengths of individual predictors and produce comprehensive and accurate gene functional annotations.
Keywords
Computational Theory and Mathematics, Cellular and Molecular Neuroscience, Genetics, Molecular Biology, Ecology, Modeling and Simulation, Ecology, Evolution, Behavior and Systematics
Open Access
Yes
Funding
Swiss National Science Foundation / Careers / PP00P3_170664
Swiss National Science Foundation / Careers / PP00P3_202669
Record created
20/05/2022 15:38
Record last modified
24/05/2022 6:38