From pairwise to multiple spliced alignment.

Jammali, S.; Djossou, A.; Ouédraogo, WDD; Nevers, Y.; Chegrane, I.; Ouangraoua, A.

doi:10.1093/bioadv/vbab044

Details

Download: vbab044.pdf (958.73 KB)
Status: Public
Version: Final published version
Licence: CC BY 4.0

Serval ID

serval:BIB_F9ACA285ED46

Type

Article: article in a journal or magazine.

Collection

Publications

Institution

UNIL/CHUV

Title

From pairwise to multiple spliced alignment.

Journal

Bioinformatics Advances

Authors

Jammali S., Djossou A., Ouédraogo WDD, Nevers Y., Chegrane I., Ouangraoua A.

ISSN

2635-0041 (Electronic)

ISSN-L

2635-0041

Publication status

Published

Publication date

2022

Peer-reviewed

Yes

Volume

Issue

Pages

vbab044

Language

English

Notes

Publication types: Journal Article
Publication Status: epublish

Abstract

Alternative splicing is a ubiquitous process in eukaryotes that allows distinct transcripts to be produced from the same gene. Yet, the study of transcript evolution within a gene family is still in its infancy. One prerequisite for this study is the availability of methods to compare sets of transcripts while accounting for their splicing structure. In this context, we generalize the concept of pairwise spliced alignments (PSpAs) to multiple spliced alignments (MSpAs). MSpAs have several important purposes in addition to empowering the study of the evolution of transcripts. For instance, they are key to improving the prediction of gene models, which is important for addressing the growing problem of genome annotation. Despite their importance, a formal definition of the concept and methods to compute MSpAs are still lacking.
We introduce the MSpA problem and the SplicedFamAlignMulti (SFAM) method, to compute the MSpA of a gene family. Like most multiple sequence alignment (MSA) methods that are generally greedy heuristic methods assembling pairwise alignments, SFAM combines all PSpAs of coding DNA sequences and gene sequences of a gene family into an MSpA. It produces a single structure that represents the superstructure and models of the gene family. Using real vertebrate and simulated gene family data, we illustrate the utility of SFAM for computing accurate gene family superstructures, MSAs, inferring splicing orthologous groups and improving gene-model annotations.
The supporting data and implementation of SFAM are freely available at https://github.com/UdeS-CoBIUS/SpliceFamAlignMulti.
Supplementary data are available at Bioinformatics Advances online.

URN

urn:nbn:ch:serval-BIB_F9ACA285ED461

OAI-PMH

oai:serval.unil.ch:BIB_F9ACA285ED46

DOI

10.1093/bioadv/vbab044

Pubmed

36699392

Open Access

Yes

Record created

17/02/2023 10:05