Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks.

Details

Resource 1: 36697501_BIB_1803C776AAF4.pdf (2168.08 KB)
State: Public
Version: Final published version
License: CC BY 4.0
Serval ID
serval:BIB_1803C776AAF4
Type
Article: article from journal or magazine.
Collection
Publications
Institution
Title
Accuracy of haplotype estimation and whole genome imputation affects complex trait analyses in complex biobanks.
Journal
Communications biology
Author(s)
Appadurai V., Bybjerg-Grauholm J., Krebs M.D., Rosengren A., Buil A., Ingason A., Mors O., Børglum A.D., Hougaard D.M., Nordentoft M., Mortensen P.B., Delaneau O., Werge T., Schork A.J.
ISSN
2399-3642 (Electronic)
ISSN-L
2399-3642
Publication state
Published
Issued date
26/01/2023
Peer-reviewed
Yes
Volume
6
Number
1
Pages
101
Language
English
Notes
Publication types: Journal Article ; Research Support, N.I.H., Extramural ; Research Support, Non-U.S. Gov't
Publication Status: epublish
Abstract
Sample recruitment for research consortia, biobanks, and personal genomics companies spans years, necessitating genotyping in batches using different technologies. As marker content on genotyping arrays varies, integrating such datasets is non-trivial, and its impact on haplotype estimation (phasing) and whole genome imputation, both necessary steps for complex trait analysis, remains under-evaluated. Using the iPSYCH dataset, comprising 130,438 individuals genotyped in two stages on different arrays, we evaluated phasing and imputation performance across multiple phasing methods and data integration protocols. While phasing accuracy varied by choice of method and data integration protocol, imputation accuracy varied mostly between data integration protocols. We demonstrate an attenuation in imputation accuracy within samples of non-European origin, highlighting challenges to studying complex traits in diverse populations. Finally, imputation errors can bias association tests and reduce the predictive utility of polygenic scores. Carefully optimized data integration strategies enhance accuracy and replicability of complex trait analyses in complex biobanks.
Keywords
Humans, Haplotypes, Multifactorial Inheritance, Biological Specimen Banks, Genome, Genotype
PubMed
Web of Science
Open Access
Yes
Create date
07/03/2023 15:33
Last modification date
23/01/2024 8:21