The Molecules to Health Records Driver Programme aims to unify information on genomics, other molecular traits, and electronic health records (EHRs) to gain new insights into the underlying causes and biology of diseases. The UK's lifelong EHRs, combined with the rapid advancement of genomic medicine, offer an enormous opportunity to unite these data and answer challenging questions about many different diseases.

    • Enable researchers to uncover hidden connections between seemingly unrelated health conditions by analysing genetic and molecular data alongside EHRs in long-term population studies. This work builds on the HDR UK Multi-omics Cohorts Consortium and could help accelerate the translation of research findings into therapies.
    • Continue the existing partnership with Genomics England, which focuses on rare diseases. A core aim of this work is to enrich the longitudinal health-related data within the Genomics England Trusted Research Environment (TRE) by linking additional data sources and types.
    • Enable researchers to develop and validate more equitable polygenic scores (PGSs) across a range of common diseases, addressing ethnic bias and quantifying performance in healthcare.
    • Examine worldwide evidence from cohort studies in Bangladesh, India, and Malaysia on the major causes of non-communicable diseases (NCDs), leveraging the environmental and genetic diversity in these populations to strengthen understanding of risk factors, especially in low- and middle-income countries.

Programme Co-leads

Professor John Danesh 

Professor Sarah Lewington 

Overview

The Molecules to Health Records Driver Programme aims to enhance our understanding of diseases by integrating electronic health records (EHRs) with genomic and molecular data.  

By combining these diverse sources of information, the programme seeks to uncover the underlying causes of diseases more effectively than ever before. 

This initiative brings together various fields such as medicine, genetics, and bioinformatics to analyse how different molecules contribute to disease development across populations. The integration of EHRs, which contain detailed patient records, with genomic data (genetic information) and molecular traits provides a comprehensive view of health issues. 

The programme’s goals include improving diagnosis, therapy, and prevention by leveraging insights from large-scale linkages to molecular data. 


“Over the next five years, the Molecules to Health Records team will seek to help unlock the potential of large-scale bioresources that contain multi-dimensional data from many diverse populations and patient groups. By bringing together information on genomics, other molecular traits, and electronic health records at scale, the team aims to provide major new actionable insights into the causes of important health conditions.” – Programme co-directors Professor John Danesh and Professor Sarah Lewington


Workstreams

  1. Population systems genomics and EHRs 
  2. Genomic medicine and EHRs 
  3. Molecular informatics tools and resources 
  4. Diverse and global cohorts 

Activities

  • Scale analyses to “mega-cohorts” (such as UK Biobank), with a particular emphasis on ethnic diversity
  • Build on work using widely adopted multi-omics platforms, interrogating innovative bioassays, including proteomics technologies and post-translational modifications
  • Build on existing pharmaceutical partnerships and engage with new ones
  • Work with the Genomics England and eCHILD teams to define the data and create a framework for exchanging data, including patient and public involvement and engagement (PPIE) work
  • Harmonise data deposition to EMBL-EBI’s Polygenic Score (PGS) Catalog with the Genome-Wide Association Studies (GWAS) Catalog
  • Continue to develop new software tools that will be made available to the research community, including provision of metadata, cloud infrastructure and enrichment of data diversity
  • Develop and help to implement frameworks for more sustainable computing, both nationally and internationally
  • Examine the determinants and consequences of adiposity and diabetes in environmentally and genetically diverse global cohorts 

Outputs

  • Journal publications on large genomic and multi-omic studies for precision health research in disease phenotypes and conditions such as high blood pressure, cancer, and neurodevelopmental disorders
  • Updates to the OmicsPred portal with risk scores for ~3,000 plasma proteins; analysis of the transferability of these scores to global populations is continuing
  • Green DiSC, a new sustainability certification scheme for computational research