How early was the novel coronavirus circulating?

This illustration shows a coronavirus particle in blood plasma, with Y-shaped immunoglobulin G antibodies (IgG, light blue) bound to the coronavirus's spike proteins (red). IgG antibodies are proteins produced by B-lymphocyte white blood cells as part of an immune response. Immunoglobulin M antibodies (IgM) are also shown in light blue. (Image credit: JUAN GAERTNER/SCIENCE PHOTO LIBRARY via Getty)

In late December 2019, the Wuhan Municipal Health Commission reported cases of an unidentified viral pneumonia, which, along with other reports, alerted the World Health Organization (WHO) to a potential new health threat that was identified as a coronavirus in January 2020 and was later named SARS-CoV-2. 

But it has become clear that the virus emerged before late December 2019, possibly even months before. A joint WHO study by Chinese and international researchers identified 174 SARS-CoV-2 infections throughout December, with the earliest going back to Dec. 8. Though most researchers think the virus originated sometime during the fall or winter of 2019, an exact time is hard to pinpoint without more data. Finding out when SARS-CoV-2 began spreading among people could help prevent or address future epidemics and pandemics by providing insight into the kind of disease surveillance that would have been necessary to prevent this one, experts say.

By the time the virus was identified, it had already spread significantly and was harder to contain, said Sergei Pond, a professor of biology at Temple University in Philadelphia. "You don't want to wait eight weeks until you have a cluster of cases with unusual pneumonia," Pond said. "You kind of want to have a surveillance system where you pick it up very early."

The first case of COVID-19 that has been confirmed by a laboratory test was in a man who started to experience symptoms on Dec. 8, 2019, The Washington Post reported. Though there were earlier reports suggesting the first case could be traced back to Dec. 1 or Nov. 17, as Live Science previously reported, those reports were not confirmed by the WHO-China joint study, said Joel Wertheim, an associate professor of medicine at the University of California, San Diego. Wertheim and his colleagues analyzed the virus's genetic information and conducted epidemiological computer simulations, which put the virus's origin date at between mid-October and mid-November 2019, they reported in April in the journal Science.

To draw this conclusion, the researchers analyzed genomes of SARS-CoV-2 from the first wave of the pandemic in China. Because viruses accumulate genetic changes over time, the researchers could identify a fixed rate of genetic mutation and then work backward until they found when the first person with a relatively unaltered form of the virus could have started to spread it among people. The researchers estimated that for SARS-CoV-2, that date was between Nov. 17 and Dec. 20, 2019.
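The idea of working backward from a mutation rate can be illustrated with a minimal sketch. It assumes a constant ("strict") clock, a substitution rate of roughly 0.0008 changes per site per year and a genome of about 29,900 nucleotides. These are commonly cited approximate values for SARS-CoV-2, not figures taken from the study. The sample date and mutation count in the example are hypothetical, and the study itself relied on statistical analysis of many genomes rather than this back-of-the-envelope arithmetic.

```python
from datetime import date, timedelta

# Illustrative assumptions, not values from the study.
ASSUMED_RATE = 8e-4             # substitutions per site per year
ASSUMED_GENOME_LENGTH = 29_903  # nucleotides


def estimate_ancestor_date(sample_date, mutations,
                           rate=ASSUMED_RATE,
                           genome_length=ASSUMED_GENOME_LENGTH):
    """Estimate when a sampled genome's ancestor existed under a strict clock.

    `mutations` is the number of differences between the sampled genome
    and the inferred ancestral sequence.
    """
    mutations_per_year = rate * genome_length
    years_back = mutations / mutations_per_year
    return sample_date - timedelta(days=round(years_back * 365.25))


if __name__ == "__main__":
    # Hypothetical genome sampled on Jan. 15, 2020, carrying two mutations.
    print(estimate_ancestor_date(date(2020, 1, 15), mutations=2))
```

Under these assumed numbers the virus picks up about two mutations a month, so each observed mutation pushes the estimate back by roughly two weeks. That also shows why a handful of additional early sequences could shift the estimated date noticeably.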

But that's just when the virus likely started spreading among people. Because SARS-CoV-2 originated in an animal and was passed to humans, the animal coronavirus that originally infected the first person could have genetic differences from the current virus. That version might have taken a while to become genetically recognizable as SARS-CoV-2, meaning the virus may have started spreading even earlier, the researchers said. 

Bats, particularly horseshoe bats, are most likely the original reservoir for the precursor virus to SARS-CoV-2. (Image credit: Shutterstock)

To see how long it might have taken the virus to accumulate those kinds of changes, the researchers used a computer simulation of the virus's spread. They concluded that the process likely would have taken anywhere from zero to 41 days, although the most common result was eight days. This process, they said, might have pushed back the virus's initial spread to mid-October.

Wertheim emphasized that the goal of the study was to establish how far back the virus could have started to spread, not necessarily how early it did spread. "That's as far as you can make it go, and even then, that's a lot of assumptions to get that far back," he said.

Many researchers would agree that, based on current data, the time frame that Wertheim and his colleagues proposed in the study is likely, said Pond, who co-authored a separate study examining the early evolutionary history of SARS-CoV-2, published in May in the journal Molecular Biology and Evolution. In that study, Pond and his colleagues used a kind of genetic analysis originally developed to reconstruct the evolution of human cancer cells. They determined that the version of SARS-CoV-2 that spread in December 2019 would have taken six to eight weeks to evolve from the initial human strain of the virus. Although the method they used was different, that time frame would also push the origin back to around the same time as the other study: October 2019.

But Pond said there are also ways of gaining new insight into when the virus emerged. For instance, many thought the virus emerged from an animal at the Huanan Seafood Wholesale Market in Wuhan, but later analysis found cases that couldn't be linked to the market, Live Science previously reported. Separately, Wertheim said the lack of confirmation from the WHO-China joint study for the Dec. 1 and Nov. 17 cases that his study used could affect the estimate. Frozen blood samples from early potential cases or genetic sequence records could also provide further insight, Pond said.

"You can easily imagine a scenario where you get five or 10 more sequences that are early, and they just change everything," he said. Regardless, Pond thinks it's unlikely that the virus emerged earlier than fall 2019 or, at the earliest, late summer 2019, because even events that could lead to the virus circulating undetected that early — such as it starting to spread, going extinct and then being reintroduced — are very unlikely and become increasingly so the further back you go.

Some research has suggested an origin date earlier than October, but the studies have not been peer-reviewed or published in scientific journals. In one such study, researchers at Harvard University analyzed internet searches in Wuhan, China, from 2019 and found an increase in searches for "diarrhea" in August 2019 that correlated with an increase in traffic in a Wuhan hospital parking lot the same month. Diarrhea is more common with COVID-19 than with the flu, so the researchers suggested the increase could point to the virus spreading in August. 

In a commentary in response to that study, however, other researchers pointed out that the authors used an awkward Chinese translation for "diarrhea" and that the search term increased in use all over China, not just in Wuhan. Another study, which was published to the preprint server medRxiv and was not peer-reviewed, found traces of SARS-CoV-2 in wastewater in Barcelona, Spain, in March 2019. However, the findings made little sense without any evidence of patients experiencing symptoms of COVID-19 in Barcelona at the time.

There are inherent problems with trying to find a more precise origin date. Wertheim's analyses showed that early on, case counts were likely so low that the virus went undetected. In fact, in the study's computer simulations, which modeled the spread of SARS-CoV-2 from a single human case, the virus went extinct the majority of the time, and when it didn't, it sometimes relied on a single person to spread it more widely again. Of course, in a large, densely populated city like Wuhan, that scenario doesn't present a problem: It would be easy for a single person to transmit the virus to many people. But it does make it likely that early on, few people had the virus. Given a severe flu season and SARS-CoV-2's relatively low mortality rate compared with diseases like severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS), Wertheim said it's no surprise the virus wasn't detected when it first started to spread.
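Why most simulated outbreaks fizzle can be illustrated with a toy branching process, a much simpler technique than the epidemic simulations the study describes. In this sketch, each infected person infects a random number of others. The average number of new infections (r0 = 2.5), the degree of superspreading (dispersion = 0.1) and the threshold for calling an outbreak established (500 simultaneous cases) are all illustrative assumptions, not values from the study.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson-distributed count (Knuth's method, fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def outbreak_dies_out(rng, r0=2.5, dispersion=0.1,
                      established_at=500, max_generations=100):
    """Simulate one outbreak seeded by a single case; True if it goes extinct."""
    active = 1
    for _ in range(max_generations):
        if active == 0:
            return True
        if active >= established_at:
            return False
        # Each case gets its own infectiousness (gamma-distributed, mean r0),
        # which yields a few superspreaders and many dead ends.
        active = sum(
            poisson(rng, rng.gammavariate(dispersion, r0 / dispersion))
            for _ in range(active)
        )
    return active == 0


def extinction_fraction(trials=2000, seed=0, **settings):
    """Fraction of simulated outbreaks that die out on their own."""
    rng = random.Random(seed)
    return sum(outbreak_dies_out(rng, **settings) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(f"Outbreaks that went extinct: {extinction_fraction():.0%}")
```

With these assumed settings, most runs die out even though each case infects more than two others on average, because most infected people pass the virus to no one while a few pass it to many.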

Wertheim hopes systems that allow for earlier detection could help prevent or mitigate the effects of future pandemics.

"In an ideal world, we would have a sort of systematic and interconnected way to report all unexpected illness in a way that can be seen across borders," he said. "Something like that would have given us a leg up on this pandemic and potentially have been able to stop it in its tracks."

Originally published on Live Science.

Rebecca Sohn
Live Science Contributor
