Improving GWAS discovery and genomic prediction accuracy in biobank data.
Details
Download: 35905320_BIB_8A23ACB54923.pdf (1028.60 KB)
Status: Public
Version: Final published version
Licence: CC BY-NC-ND 4.0
Serval ID
serval:BIB_8A23ACB54923
Type
Article: article from a journal or magazine.
Collection
Publications
Institution
Title
Improving GWAS discovery and genomic prediction accuracy in biobank data.
Journal
Proceedings of the National Academy of Sciences of the United States of America
ISSN
1091-6490 (Electronic)
ISSN-L
0027-8424
Editorial status
Published
Publication date
02/08/2022
Peer-reviewed
Yes
Volume
119
Issue
31
Pages
e2121279119
Language
English
Notes
Publication types: Journal Article ; Research Support, Non-U.S. Gov't
Publication Status: ppublish
Abstract
Genetically informed, deep-phenotyped biobanks are an important research resource and it is imperative that the most powerful, versatile, and efficient analysis approaches are used. Here, we apply our recently developed Bayesian grouped mixture of regressions model (GMRM) in the UK and Estonian Biobanks and obtain the highest genomic prediction accuracy reported to date across 21 heritable traits. When compared to other approaches, GMRM accuracy was greater than annotation prediction models run in the LDAK or LDPred-funct software by 15% (SE 7%) and 14% (SE 2%), respectively, and was 18% (SE 3%) greater than a baseline BayesR model without single-nucleotide polymorphism (SNP) markers grouped into minor allele frequency-linkage disequilibrium (MAF-LD) annotation categories. For height, the prediction accuracy R² was 47% in a UK Biobank holdout sample, which was 76% of the estimated [Formula: see text]. We then extend our GMRM prediction model to provide mixed-linear model association (MLMA) SNP marker estimates for genome-wide association (GWAS) discovery, which increased the independent loci detected to 16,162 in unrelated UK Biobank individuals, compared to 10,550 from BoltLMM and 10,095 from Regenie, a 62 and 65% increase, respectively. The average [Formula: see text] value of the leading markers increased by 15.24 (SE 0.41) for every 1% increase in prediction accuracy gained over a baseline BayesR model across the traits. Thus, we show that modeling genetic associations accounting for MAF and LD differences among SNP markers, and incorporating prior knowledge of genomic function, is important for both genomic prediction and discovery in large-scale individual-level studies.
Keywords
Bayes Theorem ; Databases, Genetic ; England ; Estonia ; Genome-Wide Association Study ; Genomics ; Genotype ; Humans ; Phenotype ; Polymorphism, Single Nucleotide ; Precision Medicine ; Quantitative Trait, Heritable ; Bayesian penalized regression ; association study ; genomic prediction
PubMed
Web of Science
Open Access
Yes
Record created
15/08/2022 14:59
Record last modified
25/01/2024 7:40