Healthcare organisations have large, rich datasets held within local – often proprietary – clinical systems. If used for research and innovation, these datasets could transform healthcare delivery, but there remain significant access barriers:

  1. Minimal NHS-based informatics capacity to prepare “research-ready” data. This is especially true for hospital data: unlocking this untapped resource is a key mission for PIONEER.
  2. Ensuring data curation and analytical environments meet end-user needs. Building bespoke data assets and analytical environments to commercial timelines requires access to large data warehouses and technical expertise.
  3. Communicating precisely what data(sets) are available.
  4. Providing joined-up costing, governance, and contracting services: fit for purpose, transparent, and easily navigated in a timely fashion.

To address this, PIONEER has worked closely with healthcare organisations, academic and industry partners and HDR UK to:

Provide training, standardised operating procedures and technical assistance, supporting data providers to prepare their own data. This includes training, developing code, sharing resources and enabling secondments with PIONEER.

PIONEER partners with national and international data providers, and works:

  • as a data partner on specific projects, with PIONEER assisting with curation (for example, ADMISSION grant with Newcastle Hospitals NHS Foundation Trust)
  • as a partner on agreed programmes of work encompassing several projects (for example, Nottingham University Hospitals)
  • as a trusted advisor, to help develop data curation pathways (for example, Birmingham Women’s and Children’s and Sandwell and West Birmingham NHS Trust)
  • as a new partner joining PIONEER (agreed with University College London Hospitals in March 2022).

Expand the number, range and modalities of available datasets: bespoke structured, free-text and image data, as well as bespoke synthetic structured data and synthetic images. This includes building demonstrator and bespoke datasets, with external linkage where needed. PIONEER has launched over 40 datasets on the Gateway since inception, each designed with clinical expertise to maximise utility (see Figure 1).

We have created a master data repository for de-identified multimodal structured and image data. Based on our customers’ needs and market engagement, we have designed and iterated the PIONEER Data Analytics Platform. PIONEER provides a working environment for researchers containing bespoke datasets accessible via a secure remote desktop. We support different technologies based on customer need, we “track and flag” resource usage to manage customer budgets, and we securely handle data from ingestion to consumption. See Impact case 1.

PIONEER links datasets which are extremely valuable but rarely discoverable. Specific examples include combining immune cell function data (an NIHR Biomedical Research Centre laboratory dataset) in patients with frailty for a Dunhill Fellowship and, in our cross-council DARE Sprint exemplar, combining asthma admissions data with meteorological data, air quality data and translational biomarkers.

Catalogue and publish high-quality comprehensive metadata to communicate dataset content and availability via the Innovation Gateway.

We supplement this with infographics, social media campaigns and webinars. We also demonstrate data utility through exemplar use cases in partnership with clinical and academic experts, with 18 academic papers and more than 20 short reports arising from data access requests to date.

Build, test and run a costing calculator.

PIONEER has developed a cost calculator, co-developed with our Patient and Public Advisory Group (PPAG), from which all quotations are generated.

During public workshops, there was support for returning financial benefits derived from health data to the NHS, and for transparency: informing the public about data access under licence as a charged service and about any resulting patient benefit. Working with the PPAG, we agreed the elements for a cost calculator, exploring models of working with SMEs, academic researchers and large industry.

The cost calculator has been iterated through a market consultancy exercise and industry engagement. The tool considers the quantity and complexity of the data extract request and the rarity of the data, and applies an appropriate mark-up to ensure sustainability and growth. An example of the cost calculator output table is provided (Figure 2). The calculator is used for all products and services provided by PIONEER, including data, consultancy and TRE hire. See Impact case 1.

Figure 2: PIONEER cost calculator output for customers

Chargeable costs are proportionate to the size and sector of the requesting organisation and take into account Fair Value (defined as the actual value of the data as an asset, agreed between the Data Controller and the Data Requestor). Full cost recovery encompasses time and effort associated with:

  • Pre-access: request assessment, review and authorisation processes;
  • Pre-access: extraction, curation, transformation, linkage, Quality Assurance/Quality Control, de-identification;
  • Access: specification of the Trusted Research Environment (e.g. size of data storage at rest, tooling for analysis requirements, administration of access rights and audit) or transfer of data via SFTP (Secure File Transfer Protocol).

PIONEER has delivered projects with SMEs, with potential service models including equity share, stock warrants, royalties, free or reduced charges for products or services, and partnership funding applications. One example is providing synthetic data on a rare cardiomyopathy for an SME to develop a risk prediction algorithm. The cost model was free use of the product for PIONEER NHS partners (a further benefit of Data Controllers joining PIONEER) and an equity share.

This model has been highly praised by Data Requestors for its transparency. PIONEER has been asked by HDR UK, other Data Controllers and other hubs (including DISCOVER-NOW and BREATHE) to advise on our costing model and the commercialisation of synthetic data, with our learning shared across the network.

Design, run and then expand a functional data governance model.

PIONEER offers:

  • An efficient data access request pipeline, providing individuals and organisations with a simple process to apply for access to research datasets.
  • A single point of contact, with a data concierge service to take requestors from initial contact through quote and contracting to data/consultancy access.
  • A patient/public review of all data requests, through our Data Trust Committee (see PPIE Impact Case), increasing the confidence of Caldicott Guardians to approve data access.
  • A robust cloud-based research platform where research data assets can be securely staged and accessed by authorised researchers (see Impact case study 1).

Template contracts for faster data access.

PIONEER has developed template Data Sharing Agreements, which have been road-tested and approved by Data Controllers and Data Requestors, to speed up contracting. These include amendable terms to enable fair value return and have been used successfully across many sectors, with positive feedback on our streamlined contracting process.


“Pioneer is collaborating with RW Health to describe the real world outcomes for acute myeloid leukaemia therapies. Pharmacy data combined with pathology and laboratory data enables the visualisation of the AML patient journey through remission and relapse in the real-world setting, helping to drive further phases and developments for this research.

The knowledgeable and responsive Pioneer Team curate high quality data, and the outcomes and information generated from this anonymised dataset is essential for research, helping to provide clinicians and researchers with decision-making insights and drive better quality patient care.”

Dr Alison Isherwood

Director of Epidemiology, RW Health

Collectively, this facilitates a highly scalable, sustainable architecture to support the rapid realisation of research-led patient benefits.

A quote from Dr Alison Isherwood (Director of Epidemiology) at RW Health is provided to illustrate a PIONEER customer view; this company has multiple requests with PIONEER.

Our approach has enabled PIONEER to support data access requests which have improved health care for patients. PIONEER is proud to provide the following examples of our work:

  • National core studies on vaccine efficacy and safety, especially focusing on underserved groups.
  • A large pharma study providing real-world data following a new product launch, identifying market share.
  • A health consultancy study building an algorithm to identify the risk of severe renal dysfunction during acute care service presentations.
  • A large pharma company working in a partnership to study lesser-known side effects of immune checkpoint inhibitors and building educational materials for clinicians.

We continue to receive exciting projects and requests which align closely to the needs of the NHS.

Examples include building digital clinical support tools for the medical management of myocardial infarctions, identifying and mitigating diagnostic delays for rare cancers, assessing biomarker performance in complex patients, examining whether Covid-19 virtual wards improve patient safety, and analysing acuity scores when deployed throughout a patient admission.

Our agile approach to solutions and our early success have attracted significant investment from our strategic NHS and academic partners, including from prestigious funders, as outlined in Impact case 1.

Impact and sustainability in a thriving data ecosystem.

PIONEER’s continuing success builds on the impact we bring to patients, our quality service offer and its timely delivery, and our innovation-led approach to new technologies, all supported by the vibrant life science community of the West Midlands, serving a connected population of over six million people.

Health data is a core pillar of the Life Sciences Industrial Strategy, featuring in the multi-stakeholder Life Sciences Recovery Roadmap, as well as the UK’s new R&D Roadmap. ‘Data-driven health and life sciences’ is central to the West Midlands Local Industrial Strategy, committing to “innovative hospital networks, platforms and citizen engagement to enable needs-driven, real-world evaluation, validation and adoption of novel technologies, and new approaches to data sharing and implementation of AI”.

The Birmingham Health Innovation Campus is one of six national Life Sciences Opportunity Zones, attracting significant new inward investment to the region, with a Precision Health Technologies Accelerator focusing on data-enabled healthcare. Birmingham Health Partners (three NHS Trusts, the University of Birmingham and the regional AHSN) have formed a strategic alliance with the ABPI to use real-world data to stratify therapies. Birmingham hosts the Commonwealth Games in 2022 with a “data-enabled” theme.

PIONEER is embedded in all these opportunities, setting the agenda for major future investment. Our expanding data sources, high data quality and detail, and ability to deliver services to industry standards with public and patient oversight and support place us in a strong position for continuing growth.