In this month’s review of published papers and pre-prints, our Early Career Committee considered dozens of open-access articles. They were ranked against the core pillars of the Health Data Research UK (HDR UK) ethos: research quality; team science; scale; open science; patient and public involvement; impact; and equality, diversity and inclusion. This month’s winning publication was “Development and assessment of a machine learning tool for predicting emergency admission in Scotland”, led by Liley et al. on behalf of Public Health Scotland and a network of researchers from UK universities.

Liley et al. developed a model that aims to predict the risk of emergency hospital admission for patients in Scotland before it happens. With this information, general practitioners (GPs) can take early action, such as scheduling more follow-up appointments, to protect these patients against deteriorating health and the need for an emergency admission. The main goal of the project was to produce a risk prediction score called SPARRA v4, to be deployed by Public Health Scotland (PHS) and GPs across Scotland. In testing the model’s performance, the researchers found it to have high accuracy and to outperform v3, the previous version of the model, which is currently used in clinical practice.

We judged the potential impact of this research on patients and the public to be high, as this state-of-the-art model will be deployed across Scotland for the benefit of patients. Importantly, the researchers found that SPARRA v4 has notably higher accuracy and performance in previously difficult-to-predict high-risk populations with imminent hospital admissions. More generally, this study shows that sophisticated machine learning methods can have substantial real-world impact in healthcare.

The data consisted of electronic health records (EHRs) routinely collected by PHS. The overall dataset covered nearly 6 million patients and over 400 million records of interaction with the health system – including information on patient demographics, socioeconomic status, medication prescriptions, long-term conditions, and events such as emergency admissions – all of which were important in building the predictive model.

In scoring this paper against our criteria, the committee recognised the strong team science involved, with co-authors from multiple institutions across the UK. The team also demonstrates a framework for collaboration between PHS and academic health data researchers. All code was made available in a GitHub repository, so the paper scored highly on open science. Although this is a pre-print that has not yet been peer-reviewed, we judged its potential impact on patients and the public to be high. The model was designed specifically for the Scottish population, but the researchers demonstrate a commitment to reproducible research practices and suggest that their analysis could be adapted to other settings. The support of HDR UK is acknowledged for this study both generally and specifically for a group of the co-authors.

It is clear from this paper that Liley et al. recognise the need for sophisticated machine learning models in healthcare to be accurate, comprehensive and interpretable. They further demonstrate this in their commitment to collaborating with NHS and GP colleagues when deploying and further developing this risk score, with the ultimate aim of contributing to the effectiveness of the Scottish NHS.

Our Early Career Committee would like to congratulate and commend this team for their contribution to HDR UK’s vision of uniting the UK’s health data to enable discoveries that improve people’s lives.

Read the full paper