Authors: Blythe Adamson | Arun Sujenthiran | Ben Gordon | Alison Elderfield

When it comes to healthcare data, the UK is in a privileged position: it holds a diverse and rich set of data assets that can be used for research to improve healthcare systems and ultimately save lives.

However, with this opportunity comes a substantial challenge: these datasets are fragmented across healthcare systems (primarily, but not only, the National Health Service), academia, industry and charities (such as Cancer Research UK and the British Heart Foundation), making them difficult for researchers to find, access and use. The resulting delays and barriers are a major obstacle to discovery and innovation.

Over the past two years, the UK’s world-leading research response to Covid-19 has shown what is possible when the sector works together to overcome these barriers. Research discoveries advanced our understanding of the virus and contributed to the more rapid development of treatments and vaccines, with a substantial impact on the population.

So, collaboration is key.

As the national institute for health data science, Health Data Research UK (HDR UK) has uniting the UK’s health datasets as a core part of its mission. Flatiron Health is a healthtech company dedicated to advancing oncology research and improving cancer outcomes for patients. A key part of Flatiron’s vision is to structure fragmented health data better, in order to accelerate research and improve care.

The missions of Flatiron Health and HDR UK align, and both organisations have a patient-centred approach at their heart. They partnered in 2021 specifically to support the development of the HDR UK Data Utility Framework.

Data Utility – an important part of the puzzle

As the research community starts to bring together the vast quantity of health datasets that exist across the UK and make those datasets more widely available for health research, there is a growing need to define, categorise and curate that data.

HDR UK developed the first ever framework to do this – the Data Utility Framework. This allows researchers to find, select and understand the potential usefulness of datasets for a specific research project.

The framework contains five categories and dimensions, each of which is evaluated to describe the characteristics of a dataset. Each dimension has a progressive series of criteria, allowing for a rating from ‘Bronze’ to ‘Platinum’. While quality is important, the purpose is not necessarily to achieve a ‘Platinum’ rating across all dimensions, but to enable users to exclude unsuitable datasets and narrow their selection to those that meet their specific needs and fit the research question.

In this way, the framework supports key groups across the health data research landscape:

  • Data custodians – to communicate the utility of their dataset, and improvements made in the dataset;
  • Researchers – to identify datasets that meet the minimum requirements for their specific purpose; and
  • System leaders and funders – to identify where to invest in data quality improvements, and to evaluate what improvements have happened as a result of their investments.
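
For readers who work with data, the idea of selecting datasets by minimum rating can be sketched in a few lines of Python. This is an illustration only, not HDR UK’s implementation: the dataset names, the dimension names (`completeness`, `follow_up`) and the helper functions are hypothetical, and the two intermediate levels between ‘Bronze’ and ‘Platinum’ are assumed to be ‘Silver’ and ‘Gold’.

```python
from enum import IntEnum


class Rating(IntEnum):
    """Progressive rating levels; a higher value means stricter criteria are met."""
    BRONZE = 1
    SILVER = 2    # assumed intermediate level
    GOLD = 3      # assumed intermediate level
    PLATINUM = 4


def meets_minimum(ratings, minimum):
    """Return True if every required dimension is rated at or above its minimum.

    A dimension with no rating at all counts as failing the requirement.
    """
    return all(
        ratings.get(dimension, 0) >= required
        for dimension, required in minimum.items()
    )


def shortlist(datasets, minimum):
    """Return the names of datasets that satisfy all minimum requirements."""
    return [
        name
        for name, ratings in datasets.items()
        if meets_minimum(ratings, minimum)
    ]


# Hypothetical datasets and dimension names, for illustration only.
DATASETS = {
    "dataset_a": {"completeness": Rating.GOLD, "follow_up": Rating.PLATINUM},
    "dataset_b": {"completeness": Rating.BRONZE, "follow_up": Rating.GOLD},
    "dataset_c": {"completeness": Rating.PLATINUM},  # follow_up not rated
}

# A researcher's minimum requirements for one specific project.
REQUIREMENTS = {"completeness": Rating.SILVER, "follow_up": Rating.GOLD}

if __name__ == "__main__":
    print(shortlist(DATASETS, REQUIREMENTS))
```

The sketch reflects the point made above: a dataset does not need a ‘Platinum’ rating everywhere to be useful; it only needs to clear the bar the research question sets for each dimension that matters.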

Making frameworks work

Organisations that work with and curate health data, such as Flatiron Health, can adopt this framework in their everyday practice to discover the best datasets for a specific project and to then improve the quality and completeness of their own data.

So, as part of a collaborative development process, HDR UK was delighted to have Flatiron Health’s support in reviewing the framework, particularly its application to cancer datasets, which are Flatiron’s area of focus. Flatiron’s comments and feedback were invaluable in making the framework robust, effective and user-friendly.

Flatiron’s input was also vital in iterating the framework into an even more practical tool: the Data Utility Wizard. The Data Utility Wizard applies the framework to the advanced search for datasets listed on the Health Data Research Innovation Gateway (the ‘Gateway’), the UK’s only unified portal for the discovery of, and access to, healthcare datasets.

Flatiron Health encourages its data analytics community to use the tool to find cancer datasets on the Gateway: