Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks.

Details

Resource 1 Download: 36697501_BIB_1803C776AAF4.pdf (2168.08 KB)
Status: Public
Version: Final published version
Licence: CC BY 4.0
Serval ID
serval:BIB_1803C776AAF4
Type
Article: article in a journal or magazine.
Collection
Publications
Institution
Title
Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks.
Journal
Communications biology
Authors
Appadurai V., Bybjerg-Grauholm J., Krebs M.D., Rosengren A., Buil A., Ingason A., Mors O., Børglum A.D., Hougaard D.M., Nordentoft M., Mortensen P.B., Delaneau O., Werge T., Schork A.J.
ISSN
2399-3642 (Electronic)
ISSN-L
2399-3642
Publication status
Published
Publication date
26/01/2023
Peer-reviewed
Yes
Volume
6
Issue
1
Pages
101
Language
English
Notes
Publication types: Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
Publication Status: epublish
Abstract
Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches, using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial, and its impact on haplotype estimation (phasing) and whole genome imputation, both necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals genotyped in two stages on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests and reduce the predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks.
Keywords
Humans, Haplotypes, Multifactorial Inheritance, Biological Specimen Banks, Genome, Genotype
PubMed
Web of Science
Open Access
Yes
Record created
07/03/2023 15:33
Record last modified
23/01/2024 8:21