NHS Digital is a complex, multi-faceted organisation within the UK National Health Service whose primary role is to provide the national information technology and data infrastructure that supports clinicians at work, enables patients to obtain the best care, and allows the use of data to improve health and care.

Yet, despite its importance within the UK health data landscape, few truly appreciate the many roles that NHS Digital plays and the people behind these critical activities. HDR UK was therefore delighted when NHS Digital were willing to open their doors to our PhD students to give an in-depth insight into the many functions of the organisation as part of an immersion training week.

Over three days, Rupert Chaplin (Head of Data Science), and data science and technology colleagues from across NHS Digital’s many constituent teams and groups, introduced our students to a full range of capabilities and services.

Day One saw talks about NHS Digital’s role in the provision of primary and secondary care data, including the collation of the GP Data for Pandemic Planning & Research and Hospital Episode Statistics datasets and the complex data flows that underpin these services. Open data services were also introduced, with examples including NHS Digital’s Mental Health Statistics & Surveys and Summary Hospital Mortality Indicators. Students were reminded that NHS Digital has a commitment to data accessibility and transparency, and this was evident throughout the training week, not least in Chief Statistician Chris Roebuck’s discussion of the methodologies used in official statistics. Students were also given a practical data visualisation challenge using open data from NHS Digital.

“All the lectures detailing specific challenges and solutions proposed by NHS Digital were quite useful in giving an idea of how difficult it is to actually engineer the data throughout the NHS into useful datasets.” – Jose Benitez-Aurioles, HDR UK PhD student, University of Manchester

The second day saw several topics introduced that would be familiar to those linked to HDR UK: the DigiTrials programme, the National Disease Registration Service and the development of the COVID-19 Shielded Patient List.

Kate Fleming and colleagues later introduced us to the National Disease Registration Service, a national resource that records all cancers, rare diseases and congenital anomalies diagnosed each year in England. Kate explained that the data allows research questions to be answered that could not be addressed through other means, but also spoke of the many challenges. These included the need to maximise data quality, to incorporate new and emerging data types such as molecular data and patient-reported information, and to develop phenotype definitions for increasingly granular descriptions of rare conditions.

The development of the Shielded Patient List, described by Dr Kieran Baker, was a remarkable story of the incredibly pressured work undertaken by NHS Digital in the early days of the pandemic. Needing to bring the necessary data streams together quickly to identify those vulnerable members of the population at most risk from COVID-19 exposure, Kieran and others within NHS Digital worked tirelessly to develop increasingly sophisticated shielding lists that would prioritise those who required the most support during the pandemic, including vaccinations and food parcels. We thank all those involved for the extraordinary work that they did, which no doubt saved many lives.

Finally, on Day Three, students were introduced to some of the data science work undertaken within NHS Digital. This included the importance of reproducible analysis as part of the organisation’s commitment to transparent reporting, as well as NHS Digital’s interests in areas such as synthetic data. The Electronic Prescribing and Medicines Administration was also used to explain the text analytics work undertaken by NHS Digital to make use of prescription data. The immersion week was rounded off by Arjun Dhillon, Caldicott Guardian, who described the critical work required to ensure best practice and governance in the use of data for patient benefit, and who introduced the Data Access Request Service, which we hope students might make use of in the future.

“It was a pleasure hosting students from the PhD programme at NHS Digital. We enjoyed engaging in thoughtful and insightful debate around NHS Digital’s vital role across the sector, and the use of health data for research and patient care. We wish the students well with their continuing study and research, and look forward to engaging with them again as research users of our data and systems.” – Rupert Chaplin, Head of Data Science, NHS Digital

When we set out to organise this event, my objective was simple: students needed to better understand where health data comes from. Data is not just a file to be downloaded; it does not magically appear from nowhere. This unique training opportunity enabled HDR UK and non-HDR UK PhD students to understand the rich landscape of NHS data, where it comes from, and the role of NHS Digital in its curation, management, and use. Importantly, it highlighted how much of what NHS Digital does relies on highly complex and detailed work by many individuals in a multitude of different roles – a truly massive team effort.

Health Data Research UK would like to thank Rupert Chaplin and colleagues at NHS Digital for their time in putting together this unique training event and their hospitality in welcoming us.