An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study.

Slieker, R.C.; Münch, M.; Donnelly, L.A.; Bouland, G.A.; Dragan, I.; Kuznetsov, D.; Elders, PJM; Rutter, G.A.; Ibberson, M.; Pearson, E.R.; 't Hart, L.M.; van de Wiel, M.A.; Beulens, JWJ

doi:10.1007/s00125-024-06105-8

Details

Download: 38374450_BIB_14192B042DAE.pdf (1208.26 KB)
State: Public
Version: Final published version
License: CC BY 4.0

Serval ID

serval:BIB_14192B042DAE

Type

Article: article from journal or magazine.

Collection

Publications

Institution

UNIL/CHUV

Title

An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study.

Journal

Diabetologia

Author(s)

Slieker R.C., Münch M., Donnelly L.A., Bouland G.A., Dragan I., Kuznetsov D., Elders PJM, Rutter G.A., Ibberson M., Pearson E.R., 't Hart L.M., van de Wiel M.A., Beulens JWJ

ISSN

1432-0428 (Electronic)

ISSN-L

0012-186X

Publication state

Published

Issued date

05/2024

Peer-reviewed

Yes

Volume

Number

Pages

885-894

Language

English

Notes

Publication types: Journal Article
Publication Status: ppublish

Abstract

People with type 2 diabetes are heterogeneous in their disease trajectory, with some progressing more quickly to insulin initiation than others. Although classical biomarkers such as age, HbA1c and diabetes duration are associated with glycaemic progression, it is unclear how well such variables predict insulin initiation or requirement and whether newly identified markers have added predictive value.
In two prospective cohort studies as part of IMI-RHAPSODY, we investigated whether clinical variables and three types of molecular markers (metabolites, lipids, proteins) can predict time to insulin requirement using different machine learning approaches (lasso, ridge, GRridge, random forest). Clinical variables included age, sex, HbA1c, HDL-cholesterol and C-peptide. Models were run with unpenalised clinical variables (i.e. always included in the model without weights) or penalised clinical variables, or without clinical variables. Model development was performed in one cohort and the model was applied in a second cohort. Model performance was evaluated using Harrell's C statistic.
Of the 585 individuals from the Hoorn Diabetes Care System (DCS) cohort, 69 required insulin during follow-up (1.0-11.4 years); of the 571 individuals in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) cohort, 175 required insulin during follow-up (0.3-11.8 years). Overall, the clinical variables and proteins were selected in the different models most often, followed by the metabolites. The most frequently selected clinical variables were HbA1c (18 of the 36 models, 50%), age (15 models, 41.2%) and C-peptide (15 models, 41.2%). Base models (age, sex, BMI, HbA1c) including only clinical variables performed moderately in both the DCS discovery cohort (C statistic 0.71 [95% CI 0.64, 0.79]) and the GoDARTS replication cohort (C 0.71 [95% CI 0.69, 0.75]). A more extensive model including HDL-cholesterol and C-peptide performed better in both cohorts (DCS, C 0.74 [95% CI 0.67, 0.81]; GoDARTS, C 0.73 [95% CI 0.69, 0.77]). Two proteins, lactadherin and proto-oncogene tyrosine-protein kinase receptor, were most consistently selected and slightly improved model performance.
Using machine learning approaches, we show that insulin requirement risk can be modestly well predicted by predominantly clinical variables. Inclusion of molecular markers improves the prognostic performance beyond that of clinical variables by up to 5%. Such prognostic models could be useful for identifying people with diabetes at high risk of progressing quickly to treatment intensification.
Summary statistics of lipidomic, proteomic and metabolomic data are available from a Shiny dashboard at https://rhapdata-app.vital-it.ch.
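
The abstract reports model performance as Harrell's C statistic for time to insulin requirement. As a reading aid, the sketch below shows one common way this statistic is computed for right-censored data. It is not the authors' code: the function name `harrell_c`, the toy data, and the decision to skip pairs with tied follow-up times are assumptions made for illustration.

```python
def harrell_c(times, events, risk_scores):
    """Illustrative Harrell's C for right-censored survival data.

    times:       follow-up time per individual
    events:      1 if the event (here: insulin requirement) was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk

    A pair (i, j) is comparable when i had an observed event strictly
    before j's follow-up time. Pairs with tied times are skipped
    (a simplifying assumption of this sketch).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored individual cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event was given the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data, not taken from DCS or GoDARTS.
toy_times = [2, 4, 6, 8]            # years of follow-up
toy_events = [1, 1, 0, 1]           # third individual is censored
toy_risk = [0.9, 0.4, 0.5, 0.1]     # predicted risk scores
print(harrell_c(toy_times, toy_events, toy_risk))  # 0.8
```

In this toy example five pairs are comparable and four of them are ranked correctly, giving C = 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.71 to 0.74 should be read.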

Keywords

Humans, Diabetes Mellitus, Type 2/metabolism, Prospective Studies, C-Peptide, Proteomics, Insulin/therapeutic use, Biomarkers, Machine Learning, Cholesterol, Machine learning, Prediction model, Progression, Type 2 diabetes

URN

urn:nbn:ch:serval-BIB_14192B042DAE2

OAI-PMH

oai:serval.unil.ch:BIB_14192B042DAE

DOI

10.1007/s00125-024-06105-8

Pubmed

38374450

Web of science

001164493400001

Open Access

Yes