Quality assessment of gene repertoire annotations with OMArk.

Details

Resource 1 Download: 38383603_BIB_85062152E9B8.pdf (5731.34 KB)
Status: Public
Version: Final published version
Licence: CC BY 4.0
Serval ID
serval:BIB_85062152E9B8
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
Quality assessment of gene repertoire annotations with OMArk.
Journal
Nature Biotechnology
Authors
Nevers Y., Warwick Vesztrocy A., Rossier V., Train C.M., Altenhoff A., Dessimoz C., Glover N.M.
ISSN
1546-1696 (Electronic)
ISSN-L
1087-0156
Editorial status
Published
Publication date
01/2025
Peer-reviewed
Yes
Volume
43
Issue
1
Pages
124-133
Language
English
Notes
Publication types: Journal Article
Publication Status: ppublish
Abstract
In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.
Keywords
Software; Molecular Sequence Annotation; Proteome/genetics; Animals; Databases, Protein; Genomics/methods
Open Access
Yes
Record created
26/02/2024 9:20
Record last modified
25/02/2025 7:15