Abstract

Metabolomic risk scores predict type 2 diabetes (T2D) more accurately than many established clinical risk models and are moving toward clinical application. However, the biological liability captured by these scores and their fidelity to diagnosed T2D remain unclear. Using 249 NMR-derived metabolites in UK Biobank, I trained ancestry-specific incident-T2D risk scores in European, South Asian, and African-ancestry participants, then performed genome-wide association studies of the resulting predicted-risk phenotypes. Although the scores showed strong discrimination, their genetic correlation with diagnosed T2D was consistently below one, ranging from 0.66 to 0.70. This indicates that metabolomic risk scores capture a component of T2D liability that is genetically related to, but distinct from, diagnosed disease. I developed a genetic construct validity framework to identify sources of divergence between diagnosed and metabolomically predicted T2D. The predicted-risk phenotype over-represented dyslipidaemia, lipodystrophy, and insulin-resistance pathways, while under-representing beta-cell function and direct glucose-handling mechanisms. Thus, metabolomic T2D risk scores appear to index a “metabolically visible” component of disease liability, potentially missing subtypes driven by less metabolomically visible mechanisms. More broadly, these results show that comparing the genetic architectures of observed and predicted phenotypes can reveal what clinical prediction models measure, what they miss, and why their performance may vary across populations or disease subtypes.

Biography

Daniel Malawsky is a Postdoctoral Fellow at the Wellcome Sanger Institute, where he develops statistical methods that use genetic variation to understand the causes of disease. His work focuses on the genetic and environmental contributions to human health, with applications to common and rare diseases. He studied biostatistics, mathematics, and chemistry at the University of North Carolina at Chapel Hill as a Morehead-Cain Scholar before completing an MPhil and PhD at the University of Cambridge and the Wellcome Sanger Institute. His doctoral research, supervised by Dr Hilary Martin and supported by Churchill and Gates Cambridge Scholarships, was recognised with the Genetics Society’s Sir Kenneth Mather Prize.

Abstract

Electronic health records (EHR) have become a cornerstone of contemporary population research. However, using EHR effectively is challenging because the data are heterogeneous and fragmented, and because algorithms for identifying disease cohorts are developed in non-standardised ways, which limits research reproducibility and scientific discovery. In my talk, I will present our recent work in which we developed a computational framework to create and validate disease phenotyping algorithms at scale, exemplified in the UK Biobank. Our framework provides a robust approach for identifying patient cohorts in EHR that can be used in observational studies.

Join Zoom Meeting

Link:

Meeting chat link:

Meeting ID: 813 4030 5129

Passcode: 793263