Fighting COVID-19 through data

Since the beginning of the COVID-19 pandemic, researchers have been trying to answer questions to help understand, predict and treat the virus. Deciphering how immunity to the virus is generated is critical to understanding the disease.

A key part of understanding immunity is measuring the presence of antibodies in the blood (serology). Although antibodies have been measured in numerous studies across the UK over the past year, researchers still face barriers: they often do not know which datasets exist and contain this information. Once they find these datasets, they can face lengthy processes to gain access to the data.

The HDR UK Innovation Gateway addresses these barriers by allowing researchers to discover the datasets that exist and by providing a single point to request access to them. There are currently 613 datasets available to request through the platform, of which 73 are related to COVID-19 (many of which support the National Core Studies, the government’s research programme into the pandemic).

CO-CONNECT will enhance the utility of a number of these datasets even further by using the latest technology to aggregate and standardise information across datasets in near real-time.

The technology of combining cohorts

CO-CONNECT is a collaborative project which will allow researchers to request access to and analyse data from cohorts across a number of different datasets. It builds on the infrastructure provided by the HDR Innovation Gateway by bringing together 44 different datasets that contain serology data, from 19 different organisations.

CO-CONNECT uses the technology developed for ATLAS, an advanced data search tool for researchers. The team of data and software engineers is streamlining and standardising the data in each dataset by mapping it to a common data model. Software that connects the datasets is then installed at each organisation that holds a dataset (the “data custodian”), which will allow the data to be searched in a safe way.
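To illustrate what mapping to a common data model can look like in practice, here is a minimal sketch in Python. It is not CO-CONNECT's actual code or schema: the CommonRecord fields, the two source layouts ("study A" and "study B") and all column names are hypothetical, chosen only to show how differently shaped serology records can end up in one shared structure.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CommonRecord:
    """Hypothetical common data model for one serology result."""
    person_id: str
    sample_date: date
    antibody_positive: bool


def map_study_a(row: dict) -> CommonRecord:
    # Assumed layout: ISO dates and a text result such as "Positive".
    return CommonRecord(
        person_id=str(row["participant"]),
        sample_date=date.fromisoformat(row["drawn_on"]),
        antibody_positive=row["igg_result"].strip().lower() == "positive",
    )


def map_study_b(row: dict) -> CommonRecord:
    # Assumed layout: DD/MM/YYYY dates and a 0/1 result flag.
    day, month, year = (int(part) for part in row["SampleDate"].split("/"))
    return CommonRecord(
        person_id=str(row["ID"]),
        sample_date=date(year, month, day),
        antibody_positive=row["Serology"] == 1,
    )


# Made-up rows standing in for two differently structured datasets.
study_a_rows = [{"participant": 101, "drawn_on": "2021-01-15", "igg_result": "Positive"}]
study_b_rows = [{"ID": "B-7", "SampleDate": "03/02/2021", "Serology": 0}]

standardised = [map_study_a(r) for r in study_a_rows] + [
    map_study_b(r) for r in study_b_rows
]
```

Once every dataset is expressed in the same structure, a single search can be written once and run against all of them, which is the point of the standardisation step.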

Facilitating discovery, safely and securely

The software allows data custodians to provide access approvals much more quickly than before while, crucially, retaining control over who has access to the data.

Most significantly, researchers will be able to search and compare data between studies without moving data around.

This means CO-CONNECT will allow researchers to answer crucial questions about COVID-19 immunity with direct implications for patient outcomes, such as:

  • How does the risk of infection change with different strains?
  • How does the immune response differ in cohorts of interest?
  • Is there an impact on disease risk for other diseases?
  • Does the immune response differ between vaccinated and naturally exposed individuals?
  • What effect did public health responses in the different regions have?
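The idea of searching across studies without moving data can be sketched as follows. This is an illustrative Python model only, not the CO-CONNECT software: the Custodian class, the record fields and the small-count threshold are all assumptions. Each custodian evaluates the query against its own records and returns only an aggregate count, so no row-level data leaves the organisation.

```python
from typing import Callable

Record = dict
Criteria = Callable[[Record], bool]


class Custodian:
    """Hypothetical data custodian that answers count queries locally."""

    def __init__(self, name: str, records: list, min_count: int = 5):
        self.name = name
        self._records = records      # stays inside the organisation
        self._min_count = min_count  # assumed disclosure-control threshold

    def count_matching(self, criteria: Criteria) -> int:
        matched = sum(1 for record in self._records if criteria(record))
        # Report small counts as 0 so individuals cannot be singled out.
        return matched if matched >= self._min_count else 0


def federated_count(custodians: list, criteria: Criteria) -> dict:
    # Only per-site totals are collected; individual records are never returned.
    return {c.name: c.count_matching(criteria) for c in custodians}


# Made-up records for two sites.
site_a = Custodian(
    "site_a",
    [{"vaccinated": True, "antibody_positive": True}] * 6
    + [{"vaccinated": False, "antibody_positive": True}] * 3,
)
site_b = Custodian(
    "site_b",
    [{"vaccinated": True, "antibody_positive": True}] * 2
    + [{"vaccinated": True, "antibody_positive": False}] * 4,
)


def vaccinated_and_positive(record: Record) -> bool:
    return record["vaccinated"] and record["antibody_positive"]


counts = federated_count([site_a, site_b], vaccinated_and_positive)
```

In a real deployment the query would travel over a network in a structured form rather than as a Python function, and the custodian would decide which queries to allow. The sketch only shows the direction of travel: the question goes to the data, and a summary comes back.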

For now, CO-CONNECT will standardise antibody data collection across the UK, aligning with the objectives of the National Core Studies programme for COVID-19. It has already captured metadata from 17 studies, which are included on the HDR Innovation Gateway.

Paving the way for a new approach to analysing health data

While the project has a focus on COVID-19 serology data, ultimately CO-CONNECT’s work will pave the way for a standardised approach to collection, storage and discovery of health datasets more widely.

Its successful implementation will have far-reaching consequences as the principles can be applied to all data that is being collected and stored for research purposes. It will allow an infrastructure to be configured which enables trustworthy, fast, de-identified, secure analysis of data from across multiple organisations.

This approach of working with the latest technology in a collaborative way will form the basis of the Innovation Gateway’s soon-to-be-launched Cohort Discovery tool. This tool will apply CO-CONNECT’s software, technology and data engineering expertise to allow researchers on the Gateway to search across the hundreds of datasets listed there to find cohorts of patients with specific, researcher-defined characteristics.

There is a strong desire among data custodians for researchers to make use of the vast quantities of data being collected. The current paradigm has been designed around restricting access to data, but this has had the unfortunate side effect of simply creating a series of vaults containing hugely useful information that relatively few researchers are even aware of, let alone able to analyse.

By opening these vaults to the research community in this more efficient but safe way, we will accelerate impactful research and reduce the effort required to carry it out. By working in partnership with the members of the HDR UK Alliance and the Innovation Gateway, we are beginning to shift this paradigm to a more open approach where we enable even greater discovery, safely and securely, from the UK’s rich collection of health data.

Further information:

Health Data Research UK is the national institute for health data science. CO-CONNECT is led by Phil Quinlan, Emily Jefferson, Susan Hopkins and Aziz Sheikh, and is being built by experts from the Universities of Nottingham, Dundee and Edinburgh, as well as Public Health England.

Project description

CO-CONNECT launch press release

CO-CONNECT website

HDR Innovation Gateway

CO-CONNECT collection on HDR Innovation Gateway