Many artificial intelligence algorithms are developed using large, publicly accessible datasets. But we know very little about what data is out there, where it comes from, and the people, settings and health conditions it represents. The risk is that new AI health technologies will be based on unrepresentative datasets and will therefore only work for some people in some contexts, leaving countless others behind – an issue known as ‘health data poverty’.
Eye health is one of the leading areas of digital innovation. Through global searches and analysis, this project is mapping which eye imaging datasets are publicly available and reviewing the extent to which they represent the diversity and needs of the world population. By understanding and assessing the information being used by AI algorithms, the team can identify gaps – such as a lack of basic, clinically important information about the people represented (like age, sex and ethnicity) or disparities in which people and conditions are represented. We can then work to understand why this is the case – and look for possible solutions. For example, if data isn’t publicly available, why not? Are there barriers to collecting it, and how can we overcome them? Or are there particular challenges in making more representative data visible, accessible and useable?
Only 1 of the 98 eye health datasets we identified and assessed came from sub-Saharan Africa, and none came from Australia or New Zealand; most came from populations in Asia, North America and Europe. This means the people and diseases in these datasets represent only a small part of the global population, and others are left ‘off the map’.
The impact and outcomes
The project has already uncovered major gaps in the publicly available data on eye health, highlighting a concerning lack of data on certain conditions, certain parts of the world and certain population groups.
The ambition is to extend these reviews into other health disciplines to understand the scale of the problem and to make sure that new AI health technologies are based on representative datasets so that everyone benefits from AI innovations and decision-making for better health and care.
and Pearse Keane.