Seven Health Data Research Hubs were launched in October 2019, each a ‘shop window’ for UK health data research, where expertise, data, tools, scientific knowledge and open innovation come together to maximise insights and improve people’s lives. So far, the Hubs have made 157 datasets available on the Innovation Gateway, ranging from public health data to clinical and genomic data, to support research into a range of disease areas.

The Hubs look to continually improve their datasets, so that researchers can discover and use the data to answer important research questions. These improvements have enabled a diverse range of impacts, including informing policy decisions on vaccine effectiveness during the COVID-19 crisis, developing tools to improve clinical decision-making in the management of patients with vascular disease, and linking routinely collected data to drive research in cancer, heart disease, and hospital care pathways.


The Hubs’ focus is on uniting, improving and using health data to enhance research potential and improve health. This is only possible by consistently reviewing and enhancing the quality of the data and how easily it can be discovered and accessed. This requires improvement of the metadata (information about a dataset, which allows users to understand what it contains before they access it), data quality (how accurate, clean and complete the data is), and data utility (how well it meets the user’s needs).

For instance, high-quality metadata ensures that researchers can determine what data exists, how to acquire and process it and, importantly, how to understand the dataset. It also ensures that data custodians are sharing reliable information, optimising their infrastructure, and making use of data beyond its initial intended purpose. The pandemic has brought into focus the importance of the Hubs’ data improvements, which make it possible to provide access to a broad range of rich datasets at scale for analysis.


To review the improvements to data over the 18 months since the Hubs’ inception, HDR UK created a first-of-its-kind data utility framework to measure the Hubs’ progress across a number of categories, including documentation (information about the data), technical quality (condition of the data), and access & provision (how users can access the data).

This revealed improvements in data across all of the Hubs, with documentation seeing the most consistent increase, driven primarily by improvements in metadata richness. The technical quality and the value and interest categories also saw large increases, as Hubs developed data management plans and increased the linkage potential within and between datasets.

Impact and Outcomes

All of the Hubs have used their data improvement efforts to support the NHS, academia, and industry in using the power of data to understand and address the challenges of COVID-19. They have supported urgent analysis that has informed government policy, linked crucial data such as testing datasets with pathology to aid forecasting, and supported commercial activities such as pharmaceutical industry work on vaccine uptake and effectiveness. They have also had a direct impact on clinical activities, such as Discover-NOW’s digital innovation testbed approach for remote monitoring, and PIONEER’s implementation of an electronic COVID-19 screening and management system and a real-time COVID-19 dashboard across four NHS hospitals.

We saw NHS DigiTrials significantly enhance recruitment to PRINCIPLE, a clinical trial of community-based treatments for people over 50 who have tested positive for COVID-19. By mapping the data on the NHS Test and Trace App, they were able to contact eligible candidates earlier, and registrations for the PRINCIPLE trial increased from an average of 87 people per week to 325 people per week. By 31 March 2021, 4,671 people had been registered to the trial. This increase enabled the inclusion of inhaled budesonide as a treatment, which was found to shorten recovery times, and the trial findings have the potential to change clinical practice globally. Their involvement in a further trial, RECOVERY, enabled the trial team to deliver a major breakthrough in the COVID-19 response: the finding that dexamethasone saves lives was rapidly adopted as part of standard hospital treatment around the world. Being able to access this data from one place improved the efficiency of the trial, ultimately allowing doctors to prescribe the treatment rapidly.

BREATHE supported the rapid set-up of EAVE II, a national COVID-19 surveillance platform covering ~99% (5.4 million) of the Scottish population. This involved linking data from 940 general practices to testing, vaccination, hospitalisation, intensive care unit, and mortality data to create the world’s only end-to-end national COVID-19 platform. Through this landmark study, BREATHE was able to report the first national estimates on the effectiveness of the Pfizer-BioNTech and Oxford-AstraZeneca vaccines in reducing COVID-19 hospitalisations.

The Hubs were also able to show the impact of the pandemic on specific disease areas, with INSIGHT supporting analysis of age-related macular degeneration (AMD), one of the leading causes of blindness. The project provided the first reliable estimates of the scale and severity of vision loss arising from delays in treatment for newly diagnosed ‘wet AMD’ (a chronic eye disorder) during the COVID-19 period, informing NHS and industry providers on strategies to optimise the care of patients.

Similarly, Gut Reaction supported patients receiving immunosuppressants for Inflammatory Bowel Disease (IBD), who are at higher risk of complications from COVID-19 because of the medication they take or their disease activity. Some of these patients were categorised as ‘clinically extremely vulnerable’. There was a time-critical need for patients and clinicians to understand their level of risk with a view to shielding. In just eight days in March 2020, their IBD Registry developed a COVID-19 IBD Risk Tool to allow patients to self-assess their risk. This was automatically sent to IBD services and was a significant support in making sure people with IBD were categorised correctly.

We also saw support for the cancer field, with DATA-CAN working with UK cancer centres to collect and analyse real-time hospital cancer service data, looking at GP referrals for people with suspected cancers and at chemotherapy appointments. Bringing these data together improved their quality. From a data improvement perspective, their expertise and research activities helped create new aggregated datasets that did not exist before, enhance existing datasets, and deploy these improved datasets to delineate precisely, for the first time, the impact of COVID-19 on cancer services and cancer patients in the UK.

The Hubs’ commitment to data improvement has enabled them to react quickly to the pandemic, answer research and clinical questions, and collect and analyse datasets that can continue to be used for new purposes.

Further information:

Read our full report 

Find out more about the Hubs