Will we ever find COVID-19's 'Patient Zero'?

A coronavirus particle binds to a human cell. (Image credit: KATERYNA KON/SCIENCE PHOTO LIBRARY via Getty Images)

Chinese officials have rejected a World Health Organization proposal to investigate the origins of the novel coronavirus that causes COVID-19, raising new questions about whether the world will ever learn when, where and how the coronavirus (SARS-CoV-2) made the leap into humans. 

China objected to the WHO plan last week because this phase of the investigation left open the possibility that the virus escaped as the result of a laboratory accident, NPR reported. Without Chinese cooperation, scientists will face frustrating gaps in the data that may keep them from identifying the moment the pandemic began. However, the virus itself does hold clues to its own origin. In the coronavirus's genetic blueprint is a history of where it came from and how long it took to cause the outbreak that led to a global catastrophe.

Even if scientists never identify a Patient Zero — the first person who fell victim and sparked a chain of infections leading to the pandemic — they may be able to determine what animals facilitated the leap and what human activities made it possible, experts told Live Science.

Defining Patient Zero

In your typical pandemic fiction, a disease outbreak begins with a single, dramatic moment: A vial of infected blood breaks, a sickly monkey escapes a lab, an alien satellite falls from the sky.

And it is sometimes possible to find a singular source for an epidemic or pandemic in the real world. Recently, epidemiologists traced the source of a devastating 2014 Ebola outbreak in Guinea, Liberia and Sierra Leone to the infection and death of a 2-year-old named Emile Ouamouno.

But this work is extremely challenging and potentially stigmatizing. For example, for many years, a single Québécois flight attendant was blamed for spreading HIV to North America. In a 2016 study in the journal Nature, however, researchers showed that the flight attendant, who died of AIDS in 1984, was just one of thousands who had become infected with the then-unknown virus. Ironically, the man was blamed for so much spread partly because he was one of the most helpful early patients to epidemiologists, providing information on his sexual contacts that other patients couldn't always recall.

Delving further into HIV's history, any notion of a "Patient Zero" becomes foggy: The virus leapt from West African primates into humans at least three times, and the major strain responsible for most infections probably emerged sometime around 1910 or 1920.

Even for diseases in the modern era, finding early cases doesn't always translate to understanding how the disease jumped from animal to human. No one knows exactly how Emile Ouamouno caught Ebola, and scientists still haven't discovered the animal reservoir for the disease, though bats are a prime suspect.

Likewise, discovering how a new virus jumped from animals to humans doesn't always require discovering a Patient Zero. SARS-CoV-1, the close relative of the current pandemic coronavirus, emerged in November 2002 with a single patient, a farmer from Guangdong who died in the hospital. But that farmer was just one of several early cases that emerged in five separate cities. Further studies revealed that SARS-CoV-1 was closely related to a virus found in horseshoe bats, which then infected animals sold in wildlife markets, particularly civet cats. A 2003 Centers for Disease Control and Prevention study found that 13% of people in the wildlife trade in the region had antibodies against SARS-1 compared with 1% to 3% of the general population, suggesting that the virus or a closely related one had been bouncing between animals and humans asymptomatically or with minimal symptoms before the major outbreak occurred. Among those who traded in civet cats, the likely bridge species between bats and humans, the likelihood of previous infection was 72%.

Ultimately, researchers found a virus in bats that was 97% identical to human SARS-1, and then a virus in civets and raccoon dogs that was 99.8% identical to the virus that infected humans, said Stephen Goldstein, a postdoctoral scholar in evolutionary virology at the University of Utah. Thus, researchers clinched the chain of animal-to-human transmission of SARS-1 without ever learning exactly when and where the virus made the leap.

A murky beginning 

SARS-CoV-2 may be particularly tricky to trace because of its inconsistency in producing disease. Somewhere between 30% and 40% of infected people are asymptomatic, and many others experience mild or moderate symptoms of COVID-19 that can be easily mistaken for a head cold or a case of the flu. Wuhan, where the first cases emerged, was in the midst of a bad flu season in fall 2019, so early cases could have been misdiagnosed.

To work within these limits, scientists are trying to rewind the history of the virus from its genetic blueprint. This can't reveal the exact moment of the first animal-to-human transmission, but it can get tantalizingly close.

"For trying to determine when HIV first arrived in the United States, our uncertainty is on the order of years or sometimes even a decade," said Joel Wertheim, an evolutionary biologist at the University of California, San Diego, who is doing this research. "For SARS-CoV-2, our uncertainty is on the order of weeks."

Wertheim and other researchers in his field depend on a powerful tool in viral evolution: a molecular clock. This "clock" is based on a constant pile-up of mutations that occurs each time the coronavirus reproduces. Most of these mutations have no effect on the function of the virus, Wertheim said, but because they occur at a predictable rate, scientists can use them to determine when certain events in the virus's history took place. Those events can include when the infection that kicked off the pandemic first occurred.
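The arithmetic behind a molecular clock can be sketched in a few lines of Python. This is only a toy illustration of the idea described above, not the method Wertheim's team used: it assumes a strict clock (one constant rate), and the function name, the genome length of 30,000 nucleotides and the rate of 0.001 substitutions per site per year are placeholder assumptions chosen for round numbers.

```python
def years_since_common_ancestor(mutations_observed, genome_length, rate_per_site_per_year):
    """Estimate elapsed time from a mutation count under a strict molecular clock.

    All inputs are illustrative assumptions, not measured values:
    - mutations_observed: differences between a sampled genome and the inferred ancestor
    - genome_length: number of nucleotide sites in the genome
    - rate_per_site_per_year: assumed constant substitution rate
    """
    if genome_length <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("genome_length and rate_per_site_per_year must be positive")
    # Expected number of mutations accumulated across the whole genome in one year.
    expected_mutations_per_year = genome_length * rate_per_site_per_year
    # Time is simply the observed count divided by the expected yearly count.
    return mutations_observed / expected_mutations_per_year


if __name__ == "__main__":
    # Hypothetical example: 3 mutations separate a sample from the ancestor.
    years = years_since_common_ancestor(3, 30_000, 1e-3)
    print(f"Estimated time since common ancestor: {years * 52:.1f} weeks")
```

Real analyses fit the rate and the dates together across many sampled genomes and report a range rather than a single number, which is why the uncertainty Wertheim describes is measured in weeks.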

This isn't the same as the first human infection with SARS-CoV-2, Wertheim cautioned. Most people who caught the earliest variants of the virus didn't pass it on, so there could have been dozens of infection chains that fizzled out.

There are parallels in human evolution. Around 200,000 years ago in Africa lived a Homo sapiens woman known as Mitochondrial Eve, because the maternal genetics of every human alive today can be traced to her. But Mitochondrial Eve wasn't the only woman around back then; she was just the one whose genetic lineage survived.

"You can think of the genetic ancestor of all of SARS-CoV-2 like that," Wertheim told Live Science. "It is the virus from which all circulating SARS-CoV-2 descends, but that doesn't mean that there may not have been other [SARS-CoV-2] viruses around at the time, potentially very closely related, that just went extinct."

Wertheim and his colleagues used the molecular clock of SARS-CoV-2 to try to figure out how much time could have passed between the first appearance of the virus in humans and the infection that sparked the pandemic.

"What we were really interested in in our study was trying to put an upper limit on how long the virus could have been in humans and still given rise to the genetic [common] ancestor," he said.

In a paper published in Science in April, Wertheim and his team reported that the earliest possible emergence of the coronavirus was October 2019, but the most likely timing was mid-November 2019. Based on the genetic changes in the virus, very few people would have been infected in mid-November, Wertheim said, suggesting that reports of early hospitalizations in Wuhan may indeed have been due to influenza, not COVID-19.

"It would have had to have been at very, very low levels in order to persist without giving rise to this genetic ancestor," Wertheim said.

Wuhan's local health authority reported the first cluster of mysterious pneumonia in the city on December 31, 2019. The WHO later determined that the first case that could be confidently identified as COVID-19 was a man who became ill on Dec. 1, 2019.

Wertheim and his colleagues are now delving deeper into the coronavirus genetics to try to understand whether the virus leapt from animals to humans just once to spark the pandemic, or whether it made multiple incursions leading to multiple infection chains. SARS-1 was genetically diverse early on, Wertheim said, suggesting a multiple-introduction scenario. SARS-CoV-2 was less diverse, which may mean the introduction happened just once, he said. But both scenarios are still possible with the data currently available.

The animal-human connection 

Unfortunately, much of the evidence of the early pandemic is now gone, or at least hidden. During the SARS-1 outbreak, the live-animal markets were not initially shut down, Goldstein told Live Science. When scientists went into the markets months later, infected animals were still present, and animal-to-animal transmission was ongoing. In contrast, soon after the SARS-CoV-2 virus began spreading among humans, wet markets were shut down, and Chinese officials initially denied any live animals were sold at the market at the center of the first superspreader event, the Huanan Seafood Market. Researchers later reported in June in the journal Scientific Reports that seven vendors had been selling live mammals, birds and reptiles at that market.

If the Chinese government tested any of the animals present in the markets when they were shut down, they're not talking.

"They haven't announced that they tested any of those animals that were in the markets in November and December 2019," Goldstein said.

Similarly, the government has refused to release early viral samples from Wuhan that might reveal more about the genetics of the first human cases and has taken a database containing early viral sequences offline.

This makes uncovering the animal-human link for SARS-CoV-2 difficult. What's clear right now is that the virus probably originated in bats. The closest known relative so far is a bat virus called RaTG13, with which SARS-CoV-2 shares 96% of its genome. Researchers discovered the virus in Yunnan province, China, in 2013, and published about its close ties to SARS-CoV-2 in March 2020. Researchers are still looking for closer relatives, but it's slow going, Goldstein said, particularly given pandemic-related travel restrictions and China's reluctance to invite in international research teams.

"You've got to find the right bats and it's like a needle in a haystack," Goldstein said.

However, comparing the bat viruses to the human virus can be illuminating. Bats are a lot like humans, said William Haseltine, the president of ACCESS Health International and a former professor at Harvard Medical School, where he studied HIV and the human genome. Like humans, bats have long life spans, travel over long distances and then cluster together in close contact. This pattern of behavior may partly explain why coronaviruses that evolve in bats tend to find fertile ground in humans.

"A bat has a chance to be infected many times in its lifetime, so these viruses have got to survive in a long-lived mammal that has many defenses against them," Haseltine said.

The proteins in SARS-CoV-2 can reveal just how the virus's evolution allowed it to break free of bats and eventually infect humans. The genes alone can't explain this step, said Ingo Ebersberger, a bioinformatician at Goethe University Frankfurt, because most of the mutations in the genome don't change the virus's function. It's the proteins that are the workhorses, as genes give instructions for making proteins and proteins carry out biological functions. In a study not yet peer-reviewed but posted Feb. 5 on the preprint server bioRxiv, Ebersberger and his colleagues studied the proteins of SARS-CoV-2 and found that most of the genetic changes between RaTG13, SARS-1 and closely related viruses translated to exactly nothing on the protein side.

"SARS-CoV-2 is not special," Ebersberger told Live Science.

In the end, the only major functional change that made SARS-CoV-2 stand out was that the virus has something called a furin cleavage site. This is a tiny sequence of four amino acids that massively improves the coronavirus's ability to fuse to the ACE2 receptors on the surface of human cells. This tiny insertion helps the spike protein on the virus to unfurl, all the better to expose its binding sites to the ACE2 receptors, which then unlock the cell for the virus's invasion.

RaTG13 doesn't have a furin cleavage site, but other coronaviruses, including some that circulate in bats, mice, camels and cats, do.

"This is something we think evolutionarily can happen very quickly," Ebersberger said. The change requires only a tiny mutation, he said, and every sick animal produces millions or billions of viral particles, each of which has a chance at accidentally acquiring that crucial mutation. 

Continued change

The acquisition of the furin cleavage site has led some to argue that the origins of COVID-19 lie not in natural animal viruses, but in deliberate manipulation in a laboratory. The researchers contacted by Live Science for this story, however, said the site is not evidence of such an origin. The original version of SARS-CoV-2 actually had a wimpy version of the furin cleavage site and was not particularly transmissible compared with what was to come, Wertheim said.

"Anyone who says they've never seen a more perfectly adapted human virus, well, they clearly hadn't met the delta variant," Wertheim said.

In January 2020, well before the word "variant" exploded into everyone's consciousness, SARS-CoV-2 acquired a spike protein mutation called D614G that made it perhaps 20% more transmissible. Coronavirus strains with this mutation quickly took over the world. And in the spike protein, evolution has marched on. The alpha variant of coronavirus was 50% more transmissible than the variants with D614G alone, according to Yale Medicine, and the delta variant is around 50% more transmissible than alpha. 
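To see how those successive gains stack up, here is a rough back-of-the-envelope calculation in Python. It uses only the approximate figures quoted above and assumes, as a simplification, that each gain multiplies the one before it; the variable names are illustrative.

```python
# Approximate relative gains in transmissibility quoted in the article.
# Each entry is (label, fractional gain over the previous dominant form).
steps = [
    ("D614G vs. original", 0.20),
    ("alpha vs. D614G", 0.50),
    ("delta vs. alpha", 0.50),
]

relative_transmissibility = 1.0  # the original virus is the baseline
for label, gain in steps:
    # Assumption: gains compound multiplicatively from one step to the next.
    relative_transmissibility *= 1.0 + gain
    print(f"{label}: {relative_transmissibility:.2f}x the original")
```

Under that assumption, delta comes out at roughly 2.7 times as transmissible as the earliest form of the virus, though the underlying estimates are themselves approximate.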

The spot on the coronavirus's genome that encodes the furin cleavage site is also evidence for a natural origin, Goldstein said. The mutation is a string of 12 nucleotides dropped right in the middle of a codon, or three-nucleotide sequence, that codes for the amino acid serine. By a stroke of evolutionary good luck for the virus, the sequence still works for coding for proteins: All amino acids are coded for by three-nucleotide codons, and because 12 is a multiple of three, the overall rhythm of the sequence remains undisturbed. But the position of the mutation smack dab in the middle of the codon for another amino acid looks far more like an accident of nature than something engineered deliberately.
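A short Python sketch can show why a 12-nucleotide insertion leaves the "rhythm" intact while an 11-nucleotide one would not. The sequences below are made-up toy strings, not the real SARS-CoV-2 genome or its actual insert, and the function names are hypothetical.

```python
def codons(seq):
    """Split a nucleotide string into complete three-letter codons."""
    usable = len(seq) - len(seq) % 3  # drop any incomplete trailing codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def downstream_codons_preserved(original, pos, insert):
    """Return True if codons after the interrupted codon read the same after insertion."""
    mutated = original[:pos] + insert + original[pos:]
    # First codon boundary in the original that lies after the interrupted codon.
    boundary = (pos // 3 + 1) * 3
    tail = codons(original[boundary:])
    if not tail:
        return True  # nothing downstream to disturb
    return codons(mutated)[-len(tail):] == tail


# Toy sequence (assumption): ATG TCA GGT AAA CCC, where TCA codes for serine.
toy_gene = "ATGTCAGGTAAACCC"
insertion_point = 4  # inside the TCA codon, after its first letter

print(downstream_codons_preserved(toy_gene, insertion_point, "GCACGTCGTGCA"))  # 12 nucleotides
print(downstream_codons_preserved(toy_gene, insertion_point, "GCACGTCGTGC"))   # 11 nucleotides
```

In the toy example, the 12-letter insert breaks up the serine codon but leaves every codon after it unchanged, while the 11-letter insert shifts the reading frame and scrambles everything downstream.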

"It's a totally bizarre thing that nobody would ever do," Goldstein said. 

Finally, Goldstein said, the amino acid sequence in the SARS-CoV-2 furin cleavage site is not one that anyone had experimented with before and is not one that anyone would have predicted would work particularly well. Some researchers have experimented with artificially inserting a different furin cleavage site from feline coronaviruses into harmless virus fragments in the lab. If someone were trying to make an animal virus transmissible in humans on purpose, Goldstein said, you'd expect them to use that proven sequence rather than a new, poorly placed string of amino acids that doesn't work that well out of the gate.

None of these structural studies can prove that SARS-CoV-2 wasn't a natural virus that was present in laboratory samples, though. The question of whether the virus could have leaked from the Wuhan Institute of Virology, a lab where studies of bat coronaviruses took place, has become a political sticking point that might sink any chance of discovering the origin of SARS-CoV-2. The Chinese government has categorically denied that the virus came from the lab, while obfuscating raw data that could prove whether it did or didn't. In recent statements, government officials have tried to steer the conversation away from China entirely, despite no evidence that the virus initially emerged elsewhere. (Indeed, Wertheim's work on early transmission dynamics suggests that the virus needed a densely populated city like Wuhan to take off; simulations mimicking rural population density led to an emerging virus that couldn't find enough hosts and went extinct.)

"In the next stage of origin studies led by the WHO, we should take a global vision and conduct research in different countries and multiple places instead of focusing on one area only," foreign ministry spokesperson Zhao Lijian said on June 16.

Scientists interested in COVID-19's origins have a different take. Both Wertheim and Goldstein said they think a lab leak is unlikely, but that the search for the virus's origins needs to focus on the animal supply chain in and around Wuhan. This search can be stigmatizing, too, Ebersberger said, as many of the news stories circulating about the markets led to the implication that Chinese people eat wild animals indiscriminately. Many wild animals are consumed as delicacies in Chinese cuisine, but much of the international chatter around these culinary traditions ignored regional differences and the rarity of these items in people's diets. Bats aren't commonly part of the menu in central China, where Wuhan is located, and bats were not present at the Huanan Seafood Market. Many animals sold at these markets aren't sold as meat, either, but as pets or for fur. One possible species that could have carried the virus from bats to humans is the raccoon dog (Nyctereutes procyonoides), which is mostly farmed for fur. The meat from raccoon dogs killed for fur then ends up in the luxury food market, Goldstein said.

Still, disparate species are held close together both during shipping and in stalls at live animal markets, creating prime conditions for viruses to mix, mingle and evolve. It wouldn't be the first time that close quarters between people, wild animals and domestic animals caused trouble. For example, the H1N1 strain of flu, also known as swine flu, is a genetic mix of influenza viruses from pigs, people and birds. Were he advising the WHO, Goldstein said, he'd recommend that scientists test the blood of people working in the animal trade for SARS-CoV-2 antibodies to see if they are more exposed than the general population.

"You can start with the farmers, you can go with the people who transport these animals from farms to cities, you can look at the people who sell these animals in the market," Goldstein said. "If these people have higher antibody positivity rate than the general population, that would be indirect but very strong evidence that this virus was present in animals that were part of the human food chain." 

Originally published on Live Science

Stephanie Pappas
Live Science Contributor

Stephanie Pappas is a contributing writer for Live Science, covering topics ranging from geoscience to archaeology to the human brain and behavior. She was previously a senior writer for Live Science but is now a freelancer based in Denver, Colorado, and regularly contributes to Scientific American and The Monitor, the monthly magazine of the American Psychological Association. Stephanie received a bachelor's degree in psychology from the University of South Carolina and a graduate certificate in science communication from the University of California, Santa Cruz.