Health Data Research UK fellow Dr Rosalind Eggo from the London School of Hygiene and Tropical Medicine (LSHTM) is helping to answer these questions by studying data from the Diamond Princess – a cruise ship that was struck by hundreds of cases of COVID-19 within a matter of weeks, back in February 2020.

Over the 14 days that the luxury vessel was quarantined off the Japanese coast, the 3,711 passengers and crew cooped up on board were extensively tested to measure COVID-19 infections. Around 700 people tested positive and nine died. 

The highly detailed data collected has presented scientists with what Rosalind calls a “rare and extremely valuable” opportunity to gain insights into how the virus spread within a closed population.

Whether it is seasonal flu or COVID-19, it’s always tricky to get an accurate picture of how many people infected with a virus will actually go on to die from it. If not all cases are reported or if only some of those who are infected are tested, the deadliness of the virus can be overestimated.

On the other hand, if a person dies of COVID-19 some time after testing positive and the death is not recorded as being due to the disease, it can make the virus appear less dangerous than it really is. Add to the mix the fact that being older or having underlying health conditions increases the risk of dying, and it becomes a complicated picture.

Rosalind and her colleagues used three sets of publicly available data from research studies with the Diamond Princess passengers and crew, including World Health Organization reports. The data were anonymous with no links to the identities of the people involved. 

Using these data, they were able to estimate both the share of confirmed cases that went on to die (the case fatality ratio) and the share of deaths among everyone who was infected, including people who had no symptoms (the infection fatality ratio).
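To make the difference between the two measures concrete, here is a minimal sketch using the rounded figures quoted earlier in this article (around 700 positive tests and nine deaths). It is a crude, unadjusted calculation and not the model the researchers used. The `assumed_ascertainment` value is purely hypothetical: it imagines a setting where, unlike on the ship, testing finds only half of all infections.

```python
def fatality_ratio(deaths: int, denominator: float) -> float:
    """Return deaths as a fraction of the chosen denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return deaths / denominator


# Rounded figures quoted in this article for the Diamond Princess.
deaths = 9
confirmed_cases = 700

# Hypothetical assumption for illustration only: testing finds
# half of all infections, so the true total is twice the confirmed count.
assumed_ascertainment = 0.5
assumed_total_infections = round(confirmed_cases / assumed_ascertainment)

# Share of confirmed cases that died (crude case fatality ratio).
crude_cfr = fatality_ratio(deaths, confirmed_cases)

# Share of all assumed infections that died (crude infection fatality ratio).
crude_ifr = fatality_ratio(deaths, assumed_total_infections)

print(f"Crude case fatality ratio: {crude_cfr:.1%}")
print(f"Crude infection fatality ratio (illustrative): {crude_ifr:.1%}")
```

The same number of deaths gives a lower ratio when the denominator includes infections that were never confirmed, which is why the two measures can differ so much when testing is incomplete.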

The team adjusted their model to account for any delays between a positive test and dying from the disease by including another chunk of publicly available data from 72,000 people from the initial outbreak of the disease in Wuhan, China. And because the ship’s passengers had an average age of 58, the researchers also scaled their model to get an estimate for a broader range of ages, as seen in the Wuhan outbreak.
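The idea behind the delay adjustment can be sketched as follows. This is an illustration under stated assumptions, not the team's actual model: the daily case counts, the death count, the 13-day average delay and the exponential shape of `delay_cdf` are all made up for the example. Recently confirmed cases count only partially towards the denominator, because not enough time has passed to know their outcome.

```python
import math


def delay_cdf(days: int, mean_delay: float = 13.0) -> float:
    """Hypothetical chance that a fatal case has died within `days` of confirmation.

    An exponential curve with a placeholder mean, chosen only for illustration.
    """
    if days <= 0:
        return 0.0
    return 1.0 - math.exp(-days / mean_delay)


def naive_cfr(daily_cases: list[int], deaths_so_far: int) -> float:
    """Deaths divided by all confirmed cases, ignoring delays."""
    return deaths_so_far / sum(daily_cases)


def delay_adjusted_cfr(daily_cases: list[int], deaths_so_far: int) -> float:
    """Deaths divided by the number of cases whose outcome could be known by now."""
    last_day = len(daily_cases) - 1
    # Weight each day's cases by how likely a death would already have occurred.
    known_outcomes = sum(
        cases * delay_cdf(last_day - day) for day, cases in enumerate(daily_cases)
    )
    if known_outcomes <= 0:
        raise ValueError("not enough follow-up time to estimate")
    return deaths_so_far / known_outcomes


# Made-up outbreak data for illustration only.
daily_cases = [10, 20, 40, 80, 100, 60, 30]
deaths_so_far = 3

naive = naive_cfr(daily_cases, deaths_so_far)
adjusted = delay_adjusted_cfr(daily_cases, deaths_so_far)

print(f"Naive ratio: {naive:.1%}")
print(f"Delay-adjusted ratio: {adjusted:.1%}")
```

In this toy example the adjusted ratio is higher than the naive one, because many of the cases were confirmed too recently for their outcome to be known. The age scaling described above is a separate step and is not shown here.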

When they crunched the numbers, they estimated that only 1.3% of people with confirmed coronavirus infection go on to die from the disease. That’s two thirds lower than the World Health Organization’s estimate, which did not take into account that only a fraction of infected people are actually tested. 

The study findings underline the importance of both extensive testing and a long follow-up period after a positive test result. They also show that adjusting for age is essential to get a more accurate picture.

Further research by Rosalind and her colleagues at LSHTM, published as a preprint ahead of peer review, investigates how many people can be infected without having any symptoms at all (asymptomatic infection).

They compared data from initial tests done only on people who’d had symptoms with data from later mass-testing of nearly all passengers on the Diamond Princess. Their analysis suggests that around three quarters of people infected with SARS-CoV-2 may have had no obvious symptoms. 

If these asymptomatic ‘silent carriers’ are passing on the disease unknowingly, this could have implications for controlling the spread of the virus in other closed populations, such as care homes or prisons, says Dr Rein Houben, lead author of the study.

“In these closed populations where the intensity of transmission gets very high very quickly or you have a higher number of vulnerable individuals, such as in care homes, you can’t rely on symptom screening alone. You need to do some form of testing individuals regardless of their symptoms, otherwise you’re at great risk of underestimating how many infections have happened,” he explains. 

Further exploration of health data will be essential to get answers. 

“These aren’t just numbers, these are people – the point of what we’re trying to do is to stop people getting COVID-19,” Rosalind says. “Having this kind of health data available means we can learn quickly enough about what’s happening in order to help keep everyone safe.”

Health Data Research UK is working to make health data securely and safely accessible for research to improve people’s lives. Find out more at hdruk.ac.uk, and follow on Twitter @hdr_uk and LinkedIn.
