A team of early career researchers from this project applied to give a Lightning Talk at the One Institute event. The Lightning Talk Committee was so impressed with the quality of the work presented that, given it was submitted by a team, they recommended that the application be transferred to the Team of the Year competition. The applicants on the original Lightning Talk application were the early career contingent of one of HDR UK’s flagship national projects, the National Text Analytics Project (https://www.hdruk.ac.uk/projects/national-text-analytics-project/). In this nomination, made on behalf of the Lightning Talk Committee, the co-leads of the National Text Analytics Project, Angus Roberts and Richard Dobson, also of King’s College London, have been added.

The team

Angus and Richard, co-leads of the project, are senior scientists with decades of experience and hundreds of publications between them. Rebecca, Dan and Ewan are early career researchers (each within the first eight years of their career). The team also includes Rosita Zakeri, Andrew Pickles, Kevin O’Gallagher, Daniel Stahl, Amos Folarin, Lukasz Roguski, Thomas Searle, Anthony Shek, Zeljko Kraljevic, Ahay Shah and James Teo. This highly multidisciplinary team combines skills in statistics, health informatics and natural language processing (NLP), and brings academia together with the NHS in an inter-sectoral endeavour. The team has applied these skills to a range of health issues including stroke, lung cancer and serious mental illness – and now, infectious disease.

The project they are working on

When you visit your doctor or attend hospital, information collected about you, including your symptoms, tests, investigations, diagnosis, and treatments, is entered on computers as electronic health records (EHRs). This information could help us learn how to tailor treatments more accurately for individual patients and to offer better and safer healthcare. The challenge we face is that most of the information held within these records is in written form – sometimes referred to as unstructured text – which is difficult to use in research: for example, ‘the patient feels very tired and breathless, is losing weight, and says her heart is beating very fast’. We need to develop special computerised tools to process these words to ensure we have a full picture of all patient symptoms, experiences and diagnoses to use in research for patient benefit.

This team is well on the way to establishing an NLP research community that will address the complexity of clinical text by developing shared tools and standards with inbuilt patient confidence and engagement, supporting joint working across industry, academia and the NHS. They have built a community that is open and inclusive, and that involves HDR UK members from all four nations. It is developing capability for UK-wide NLP research at scale, whilst providing clear ‘quick wins’ through exemplar projects and shared material and datasets for training and implementation, with the ultimate aim of integrating with other health data analytics.

Applying the project to COVID-19

Although infectious disease was not a focus for this group at the outset, this is one of the many instances where HDR UK projects have diverted the course of their research to tackle the COVID-19 challenge. The team used tools they had already developed for their original project, such as MedCAT (part of the CogStack ecosystem of digital NLP tools), which extract data on health conditions from NHS electronic health records, and also developed new COVID-19-specific tools. They used this approach to make significant improvements to the way hospitals normally predict which hospitalised patients are most likely to become so ill that they need treatment in an intensive care unit.

Hospitals use risk scores to calculate the chance of an individual patient becoming severely unwell. The National Early Warning Score (‘NEWS2’) in particular is very widely used in England. However, this score was not designed specifically for coronavirus and may not accurately predict a patient’s risk. The team tested the performance of NEWS2 in 1464 patients with coronavirus admitted to King’s College Hospital in London. They found that it had only moderate value in predicting which patients would need intensive care or would die, but that some very simple additions to the score (in particular, taking the age of the patient into account) could significantly improve performance. These modifications are simple enough that they could readily be used in other hospitals, but before that happens it is important to test how well the improved score works outside their own hospital in London. They are currently working with several UK NHS Trusts (Guy’s and St Thomas’ Hospitals, GSTT, and University Hospitals Southampton, UHS) and two hospitals in Wuhan (China), which are testing whether the improved score still works better than the standard NEWS2. This work has featured in HDR UK’s reports to SAGE, the UK government’s Scientific Advisory Group for Emergencies.

As the COVID-19 pandemic continues, simple and performant tools to predict risk are critical. Although NEWS2 is widely used, there is little to no evidence to support its effectiveness for COVID-19. This team found that, with some simple additions, the predictive performance of NEWS2 can be meaningfully improved, producing a risk score that could be implemented very rapidly across sites, with the potential to improve the identification of patients at high risk of death or critical care requirement following COVID-19 infection. This kind of prediction empowers clinicians to intervene at an earlier stage for the most vulnerable patients, and enables more targeted care to be provided within our already over-stretched health system. None of this has been easy under the conditions imposed by the pandemic, but this team has worked exceptionally well together to reach this goal, and is persevering to ensure that the benefits of their research are felt beyond London, and indeed beyond England, across all four nations.