Most comprehensive analysis of COVID-19 data reveals previously unattributed deaths
9 June 2022
A study published in The Lancet Digital Health and supported by the BHF Data Science Centre used health data from 57 million people in England to build the most complete picture of the pandemic in a single country to date.
In this first-of-its-kind study, researchers from University College London (UCL) combined multiple NHS datasets covering national laboratory testing, primary care consultations, hospitalisations and deaths to reveal the exact trajectory of individuals through the healthcare system during the pandemic, and what impact this had on their health outcomes.
The analysis uncovered 15,486 deaths that occurred within 28 days of a COVID-19 diagnosis but did not list COVID-19 as a cause of death. A further 10,884 COVID-19 diagnoses were identified from death records alone, with no other related information recorded earlier in health records.
Researchers also found almost one third of patients received ventilatory support outside of ICU departments, and that this was associated with the highest rates of death in waves one and two of the pandemic. The authors say this demonstrates the need for planning on how to scale ICU services in the event of future pandemics and healthcare emergencies.
Assessing the impact
Dr Chris Tomlinson, UCL, co-lead researcher of the study, said, “Understanding the impact of COVID-19 requires consideration of how the infection varies in severity and time course – from asymptomatic, to cases that are unfortunately fatal.
“These different clinical presentations are captured in a patient’s digital records, but across multiple, and often unconnected, organisations – including public health bodies, GP surgeries, hospitals, intensive care units and death registries. Analysing all this data on the scale of an entire population presents a real challenge.
“In this study, we bring together eight complementary and national-level datasets from across the NHS to create the most comprehensive analysis of COVID-19 events to date, with the aim of supporting policy decision-making for COVID-19 and future health crises.”
Data linkage
For their analysis, researchers used anonymised patient data from multiple national NHS sources to identify patterns in how patients progress through the healthcare system. Linking these to demographic factors like age, sex, and ethnicity allowed another layer of analysis. For example, those from non-white ethnicities had a shorter time between infection and death, suggesting these groups may have been accessing testing facilities and healthcare later in their disease.
The research was conducted securely in a Trusted Research Environment by members of the CVD-COVID-UK consortium, a National Institute for Health Research (NIHR) and British Heart Foundation (BHF) flagship project led by the BHF Data Science Centre, part of Health Data Research UK.
Rapid and reliable access
Professor Cathie Sudlow, Director of the BHF Data Science Centre, said: “Rapid and reliable access to health data has been essential throughout the pandemic. Until now, this data has been locked away in siloed organisations where it is almost impossible to analyse in harmony.
“The BHF Data Science Centre’s CVD-COVID-UK consortium is now working to provide trusted researchers with rapid access to multiple, linked datasets from across the NHS. By collaborating with research teams like this one who are developing new approaches to analyse these datasets, we’re paving the way towards a new future of using health data to improve people’s lives.”
Professor Spiros Denaxas, UCL, an author on the study, said: “By linking electronic health records on a national scale we were able to identify patterns and patient trajectories in the pandemic which would otherwise have remained hidden in smaller datasets. Ongoing, secure access to the excellent data the NHS holds is essential for performing high-quality health data research and improving patients’ health and healthcare.”
The researchers note that although they present patterns throughout the pandemic, their focus was to analyse COVID-19-related characteristics, rather than causal relationships. The findings are important for identifying potential NHS pinch-points and informing future policies.
Enabling further research
Dr Johan Thygesen, UCL, who co-led the study, said, “This work has already enabled other research with highly relevant public health implications, like assessing the blood clotting risks of COVID-19 vaccines. By fully sharing our methods and code, we believe this research has the potential to unlock the power of linked health data for not just future COVID-19 outbreaks, but all kinds of complex health conditions.”
The full paper is available at The Lancet Digital Health.