DOMINO: Using Machine Learning to Predict Genes Associated with Dominant Disorders.

Quinodoz, M.; Royer-Bertrand, B.; Cisarova, K.; Di Gioia, S.A.; Superti-Furga, A.; Rivolta, C.

doi:10.1016/j.ajhg.2017.09.001

Details

Download: 5_28985496_Postprint.pdf (4990.46 KB)
Status: Public
Version: Author's accepted manuscript

Secondary document(s)

Download: Table_S1.xlsx (42.89 KB)
Status: Public
Version: Supplementary document

Download: Table_S2.xlsx (19.53 KB)
Status: Public
Version: Supplementary document

Download: Table_S3.xlsx (10.30 KB)
Status: Public
Version: Supplementary document

Download: Table_S4.xlsx (13.90 KB)
Status: Public
Version: Supplementary document

Download: Table_S5.xlsx (12.13 KB)
Status: Public
Version: Supplementary document

Download: Table_S6.xlsx (22.73 KB)
Status: Public
Version: Supplementary document

Download: Table_S7.xlsx (15.60 KB)
Status: Public
Version: Supplementary document

Download: Table_S8.xlsx (11.06 KB)
Status: Public
Version: Supplementary document

Serval ID

serval:BIB_5037815084FE

Type

Article: article from a journal or magazine.

Collection

Publications

Institution

UNIL/CHUV

Title

DOMINO: Using Machine Learning to Predict Genes Associated with Dominant Disorders.

Journal

American Journal of Human Genetics

Author(s)

Quinodoz M., Royer-Bertrand B., Cisarova K., Di Gioia S.A., Superti-Furga A., Rivolta C.

ISSN

1537-6605 (Electronic)

ISSN-L

0002-9297

Publication status

Published

Publication date

5 October 2017

Peer-reviewed

Yes

Volume

101

Issue

Pages

623-629

Language

English

Notes

Publication types: Journal Article
Publication Status: ppublish

Abstract

In contrast to recessive conditions with biallelic inheritance, identification of dominant (monoallelic) mutations for Mendelian disorders is more difficult, because of the abundance of benign heterozygous variants that act as massive background noise (typically, in a 400:1 excess ratio). To reduce this overflow of false positives in next-generation sequencing (NGS) screens, we developed DOMINO, a tool assessing the likelihood for a gene to harbor dominant changes. Unlike commonly used predictors of pathogenicity, DOMINO takes into consideration features that are the properties of genes, rather than of variants. It uses a machine-learning approach to extract discriminant information from a broad array of features (N = 432), including genomic data, intra- and interspecies conservation, gene expression, protein-protein interactions, protein structure, etc. DOMINO's iterative architecture includes a training process on 985 genes with well-established inheritance patterns for Mendelian conditions, and repeated cross-validation that optimizes its discriminant power. When validated on 99 newly discovered genes with pathogenic mutations, the algorithm displays an excellent final performance, with an area under the curve (AUC) of 0.92. Furthermore, unsupervised analysis by DOMINO of real sets of NGS data from individuals with intellectual disability or epilepsy correctly recognizes known genes and predicts 9 new candidates, with very high confidence. In summary, DOMINO is a robust and reliable tool that can infer dominance of candidate genes with high sensitivity and specificity, making it a useful complement to any NGS pipeline dealing with the analysis of the morbid human genome.
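
As an illustration of the AUC metric quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed for a gene-level classifier. It is not DOMINO's implementation: the function name `roc_auc`, the 1/0 label encoding (dominant vs. recessive), and the toy scores are all assumptions made for this example, and the data are invented rather than taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: 1 for a dominant gene, 0 for a recessive gene (hypothetical encoding).
    scores: classifier output; a higher score means "more likely dominant".
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs that are ranked correctly.
    # A tie between a positive and a negative counts as half a correct pair.
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Invented toy data, not results from the paper.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Read this way, an AUC of 0.92 means that a randomly chosen positive gene receives a higher score than a randomly chosen negative gene in about 92% of such pairs. The pairwise loop is quadratic in the number of genes, which is acceptable for validation sets of the size mentioned in the abstract.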

Keywords

Databases, Genetic; Genes, Dominant; Genetic Diseases, Inborn/genetics; Genome, Human; Genomics; High-Throughput Nucleotide Sequencing/methods; Humans; Machine Learning; Mutation; Software

URN

urn:nbn:ch:serval-BIB_5037815084FE5

OAI-PMH

oai:serval.unil.ch:BIB_5037815084FE

DOI

10.1016/j.ajhg.2017.09.001

Pubmed

28985496

Web of Science

000412277300013

Record created

10/10/2017 14:32