A generalized Robinson-Foulds distance for labeled trees.
Détails
Télécharger: 33208096_BIB_2A5CE76C7989.pdf (2058.12 [Ko])
Etat: Public
Version: Final published version
Licence: CC BY 4.0
Etat: Public
Version: Final published version
Licence: CC BY 4.0
ID Serval
serval:BIB_2A5CE76C7989
Type
Article: article d'un périodique ou d'un magazine.
Collection
Publications
Institution
Titre
A generalized Robinson-Foulds distance for labeled trees.
Périodique
BMC genomics
ISSN
1471-2164 (Electronic)
ISSN-L
1471-2164
Statut éditorial
Publié
Date de publication
18/11/2020
Peer-reviewed
Oui
Volume
21
Numéro
Suppl 10
Pages
779
Langue
anglais
Notes
Publication types: Journal Article
Publication Status: epublish
Publication Status: epublish
Résumé
The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, a crucial aspect of the trees ignored by the RF metric is the type of the branching event (e.g. speciation, duplication, transfer, etc).
We extend RF to trees with labeled internal nodes by including a node flip operation, alongside edge contractions and extensions. We explore properties of this extended RF distance in the case of a binary labeling. In particular, we show that contrary to the unlabeled case, an optimal edit path may require contracting "good" edges, i.e. edges shared between the two trees.
We provide a 2-approximation algorithm which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions.Implementation and simulations available at https://github.com/DessimozLab/pylabeledrf .
We extend RF to trees with labeled internal nodes by including a node flip operation, alongside edge contractions and extensions. We explore properties of this extended RF distance in the case of a binary labeling. In particular, we show that contrary to the unlabeled case, an optimal edit path may require contracting "good" edges, i.e. edges shared between the two trees.
We provide a 2-approximation algorithm which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions.Implementation and simulations available at https://github.com/DessimozLab/pylabeledrf .
Mots-clé
Edit distance, Labeled trees, Robinson-Foulds, Tree metric
Pubmed
Web of science
Open Access
Oui
Création de la notice
28/11/2020 10:07
Dernière modification de la notice
30/04/2021 6:09