Phylogenetic approaches to identifying fragments of the same gene, with application to the wheat genome.

Details

Resource 1 (download): bty772.pdf (532.23 KB)
Status: Public
Version: Final published version
Licence: CC BY 4.0
Serval ID
serval:BIB_6469004280EE
Type
Article: article in a journal or magazine.
Collection
Publications
Institution
Title
Phylogenetic approaches to identifying fragments of the same gene, with application to the wheat genome.
Journal
Bioinformatics
Authors
Piližota I., Train C.M., Altenhoff A., Redestig H., Dessimoz C.
ISSN
1367-4811 (Electronic)
ISSN-L
1367-4803
Editorial status
Published
Publication date
01/04/2019
Peer-reviewed
Yes
Volume
35
Issue
7
Pages
1159-1166
Language
English
Notes
Publication types: Journal Article ; Research Support, Non-U.S. Gov't
Publication Status: ppublish
Abstract
As the time and cost of sequencing decrease, the number of available genomes and transcriptomes rapidly increases. Yet the quality of the assemblies and the gene annotations varies considerably and often remains poor, affecting downstream analyses. This is particularly true when fragments of the same gene are annotated as distinct genes, which may cause them to be mistaken for paralogs.
In this study, we introduce two novel phylogenetic tests to infer non-overlapping or partially overlapping genes that are in fact parts of the same gene. One approach collapses branches with low bootstrap support and the other computes a likelihood ratio test. We extensively validated these methods by (i) introducing and recovering fragmentation on chromosome 3B of bread wheat (Triticum aestivum cv. Chinese Spring); (ii) applying the methods to the low-quality 3B assembly and validating predictions against the high-quality 3B assembly; and (iii) comparing the performance of the proposed methods with that of existing methods, namely Ensembl Compara and ESPRIT. Application of this combination to a draft shotgun assembly of the entire bread wheat genome revealed 1221 pairs of genes that are highly likely to be fragments of the same gene. Our approach demonstrates the power of fine-grained evolutionary inferences across multiple species to improve genome assemblies and annotations.
An open source software tool is available at https://github.com/DessimozLab/esprit2.
Supplementary data are available at Bioinformatics online.
Keywords
Genome, Plant, Molecular Sequence Annotation, Phylogeny, Software, Triticum
Open Access
Yes
Record created
10/09/2018 13:35
Record last modified
21/11/2022 9:16