Improving population scale statistical phasing with whole-genome sequencing data.
Details
Serval ID
serval:BIB_3ED15061CE9A
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
Improving population scale statistical phasing with whole-genome sequencing data.
Journal
PLoS genetics
ISSN
1553-7404 (Electronic)
ISSN-L
1553-7390
Editorial status
Published
Publication date
07/2024
Peer-reviewed
Yes
Volume
20
Issue
7
Pages
e1011092
Language
English
Notes
Publication types: Journal Article
Publication Status: epublish
Abstract
Haplotype estimation, or phasing, has gained significant traction in large-scale projects due to its valuable contributions to population genetics, variant analysis, and the creation of reference panels for imputation and phasing of new samples. To scale with the growing number of samples, haplotype estimation methods designed for population scale rely on highly optimized statistical models to phase genotype data, and usually ignore read-level information. Statistical methods excel at resolving common variants; however, they still struggle with rare variants due to the lack of statistical information. In this study, we introduce SAPPHIRE, a new method that leverages whole-genome sequencing data to enhance the precision of haplotype calls produced by statistical phasing. SAPPHIRE achieves this by refining haplotype estimates through the realignment of sequencing reads, particularly targeting low-confidence phase calls. Our findings demonstrate that SAPPHIRE significantly enhances the accuracy of haplotypes obtained from state-of-the-art methods and also provides the subset of phase calls that are validated by sequencing reads. Finally, we show that our method scales to large data sets by its successful application to the extensive 3.6 petabytes of sequencing data of the last UK Biobank 200,031-sample release.
Keywords
Haplotypes, Whole Genome Sequencing/methods, Humans, Genetics, Population/methods, Genome, Human, Polymorphism, Single Nucleotide/genetics, Genome-Wide Association Study/methods, Algorithms
Pubmed
Web of science
Open Access
Yes
Record created
11/07/2024 14:22
Record last modified
30/07/2024 6:02