Health Data Research UK’s Early Career Researcher Committee is delighted to announce that the winner of the Open Access Publication of the Month competition for January is: ‘PhenoScanner V2: an expanded tool for searching human genotype–phenotype associations’ led by Mihir Kamat. This work aligns primarily with Health Data Research UK’s Understanding Causes of Disease Science Priority.

Amy Mizen, member of Health Data Research UK’s Early Career Researcher Committee, writes on the winning attributes of this piece of research:

This publication presents the second iteration of a database that brings together genetic information from publicly available sources. PhenoScanner V2 extends version 1 with many new features, becoming an online database tool with greater volume and detail on human genotype-phenotype associations.

With the highest score overall, this paper particularly stood out to the committee because of the scale of the study (large volume of data, detail and multiple datasets linked together) and its commitment to open science. These are two of the criteria we use to judge the research papers. Overall, through “HDR UK Open Access Publication of the Month” we want to celebrate research quality, team science, scale, open science, patient/public involvement and impact, diversity and inclusion.

This project makes exciting contributions to improving the usability of genomic data by linking multiple datasets together and will be a valuable resource for researchers and practitioners. The committee unanimously agreed that the online tool will facilitate reproducible research, and that the volume of data the project brings together will help to accelerate the interrogation of genetic associations and the understanding of the mechanisms underlying them.

PhenoScanner V2 has a Python-R web interface that connects to a series of MySQL databases. New features highlighted in the paper include additional ways to search the data, including gene, genomic region and phenotype-based queries. PhenoScanner V2 also contains information on human genotype-phenotype associations split into phenotype classes and linkage disequilibrium information for the five super-ancestries in 1000 Genomes.
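
To give a feel for what gene, genomic region and phenotype-based queries involve, here is a minimal Python sketch of how a search string might be sorted into one of those three types. It is an illustration only, not PhenoScanner's actual code: the function name `classify_query`, the region format `chr9:22000000-22100000` and the three-gene lookup set are all assumptions made for this example.

```python
import re

# Assumed region format for this sketch, e.g. "chr9:22000000-22100000".
REGION_RE = re.compile(r"^chr(\d{1,2}|X|Y):(\d+)-(\d+)$", re.IGNORECASE)

# Tiny illustrative gene set; a real tool would consult a full gene annotation.
KNOWN_GENES = {"APOE", "PCSK9", "LDLR"}


def classify_query(query, known_genes=KNOWN_GENES):
    """Return ("region", (chrom, start, end)), ("gene", symbol) or ("phenotype", text)."""
    text = query.strip()

    # 1. A genomic region has a strict shape, so test for it first.
    match = REGION_RE.match(text)
    if match:
        chrom = match.group(1).upper()
        start, end = int(match.group(2)), int(match.group(3))
        if start > end:
            raise ValueError(f"region start {start} is after end {end}")
        return "region", (chrom, start, end)

    # 2. A gene is recognised by looking the symbol up in the known set.
    if text.upper() in known_genes:
        return "gene", text.upper()

    # 3. Anything else is treated as free-text phenotype search.
    return "phenotype", text.lower()
```

For example, `classify_query("pcsk9")` is treated as a gene query, while `classify_query("Coronary artery disease")` falls through to a phenotype search. Checking the strictest pattern first keeps the free-text case as a safe fallback.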

This publication presents an exciting development for genomic research and serves as an exemplar for multi-source data platforms that will help to reduce duplication of effort and maximise the usability of data for impactful research.

This online search tool is a good example of the varied forms that research outputs can take. The committee once again congratulates Mihir Kamat and all the other team members – James Blackshaw, Robin Young, Praveen Surendran, Stephen Burgess, John Danesh, Adam Butterworth and James Staley – for their effort in turning their research into an openly available online tool. We at Health Data Research UK’s Early Career Researcher Committee highly value all types of research outputs, and we are currently developing ways to regularly celebrate researchers’ outputs in the form of open-source, reproducible algorithms and tools. Watch this space!