Ancestors of coronavirus have been hiding out in bats for decades, ready to infect humans

The SARS-CoV-2 virus and its relatives emerged from horseshoe bats (Rhinolophus affinis) (Image credit: Shutterstock)

The ancestors of the novel coronavirus may have been circulating in bats unnoticed for decades. And those coronaviruses likely also had the ability to infect humans, according to a new study. 

To understand where the novel coronavirus, known as SARS-CoV-2, came from and how it spread to humans, scientists need to trace its evolutionary history through the virus’s genes, which are encoded in ribonucleic acid, or RNA. But the evolutionary history of SARS-CoV-2 is complicated, because coronaviruses are known to frequently exchange genetic material with other coronaviruses.

That gene-swapping, called genetic recombination, also makes it difficult for scientists to pin down how the coronavirus first spread to humans; some researchers propose direct bat-to-human transmission, while others hypothesize that an intermediate species, such as pangolins, was involved.

In the new study, researchers first identified the sections of RNA in the SARS-CoV-2 genome that had been evolving "as one entire piece," without genetic recombination, for as far back as they could study, said co-lead author Maciej Boni, an associate professor of biology at Penn State's Center for Infectious Disease Dynamics. 

They then compared these genetic regions with those of similar coronaviruses found in bats and pangolins. Adding evidence to support previous findings, they discovered that SARS-CoV-2 was most closely related to another bat coronavirus, known as RaTG13. 
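The comparison described above can be pictured with a very small sketch. It assumes the sequences are already aligned to the same length, and it scores similarity as simple percent identity; the study's actual method is not described here and was certainly more sophisticated. The function names and the toy sequences are hypothetical, made up only for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two aligned sequences match."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def closest_relative(query: str, candidates: dict[str, str]) -> str:
    """Name of the candidate sequence with the highest identity to the query."""
    return max(candidates, key=lambda name: percent_identity(query, candidates[name]))


# Toy RNA fragments (hypothetical, NOT real viral sequences).
toy_query = "AUGGCUUACGGAUCC"
toy_candidates = {
    "bat_virus_toy": "AUGGCUUACGGAUCA",       # differs at 1 position
    "pangolin_virus_toy": "AUGGAUUACGCAUCA",  # differs at 3 positions
}

best = closest_relative(toy_query, toy_candidates)
print(best)
```

In this toy case the "bat" fragment wins because it differs from the query at fewer positions, which is the same basic logic behind calling RaTG13 the closest known relative.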

In previous studies, scientists had looked specifically at the genes responsible for the so-called receptor-binding domain (RBD) of the coronavirus's "spike" protein, the piece that allows the virus to dock to the ACE2 receptor in human cells and infect them. That research found the RBD portion of the spike protein was genetically more similar to that of a coronavirus found in pangolins (called Pangolin-2019) than to that of RaTG13. There are two possible explanations for this finding: first, that SARS-CoV-2 had evolved its ability to spread to humans in pangolins (unlikely, given that SARS-CoV-2 is more closely related to RaTG13 than to any known pangolin virus), or second, that SARS-CoV-2 had acquired this RBD through recombination with a pangolin virus, Boni said.

But in the new analysis, the researchers did not find any evidence of recombination in the genes of the SARS-CoV-2 spike protein. Instead, the new genetic sequencing data suggests a third explanation for what happened: The genes for the spike protein, and thus the coronavirus’s ability to infect human cells, were passed down from a common ancestor that eventually gave rise to all three of the coronaviruses: SARS-CoV-2, RaTG13 and Pangolin-2019. 

The authors note that it’s still possible that pangolins “or another hitherto undiscovered species” could have acted as an intermediate host that helped the virus spread to humans. But “it’s unlikely,” Boni said. Rather, the new findings suggest that the ability to replicate in the upper respiratory tract of both humans and pangolins actually evolved in bats. From bats, SARS-CoV-2 could have spread directly to humans. 

Circling for decades

But when did the lineage that gave rise to SARS-CoV-2 first diverge from the other two virus lineages? To figure this out, the researchers identified mutations or differences in specific nucleotides — the molecules that make up the RNA of the coronavirus — among the different viruses. They then counted the number of mutations present in the regions of the SARS-CoV-2 genome that had not undergone recombination. And knowing the estimated rate at which the coronavirus mutates every year, they calculated how long it had been since the three diverged.
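The arithmetic behind that kind of estimate can be sketched as follows. This is a simplified "strict molecular clock" calculation, not the researchers' actual analysis, and every number in it is a hypothetical placeholder chosen only to show how the pieces fit together. The function and parameter names are likewise assumptions.

```python
def divergence_time_years(differences: int, sites: int,
                          rate_per_site_per_year: float) -> float:
    """Rough years since two lineages shared a common ancestor.

    differences: nucleotide positions that differ between the two sequences
    sites: number of positions compared (recombination-free regions only)
    rate_per_site_per_year: assumed substitution rate for the virus
    """
    if sites <= 0 or rate_per_site_per_year <= 0:
        raise ValueError("sites and rate must be positive")
    # Fraction of compared positions that differ.
    distance = differences / sites
    # Both lineages accumulate mutations after the split, so the observed
    # distance reflects twice the elapsed time; hence the factor of 2.
    return distance / (2 * rate_per_site_per_year)


# Hypothetical inputs, for illustration only.
years = divergence_time_years(differences=600, sites=10_000,
                              rate_per_site_per_year=5e-4)
print(f"about {years:.0f} years since divergence")
```

With these made-up inputs the result comes out to roughly 60 years. Real analyses also account for uncertainty in the mutation rate and for multiple mutations at the same site, which this sketch ignores.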

They found that over a century ago, there was a single lineage that eventually would give rise to the SARS-CoV-2, RaTG13 and Pangolin-2019 viruses. Even then, "this lineage probably had all of the necessary amino acids in its receptor-binding site to infect human cells," Boni said. (Amino acids are the building blocks of proteins such as the spike protein.)

At that time, the Pangolin-2019 virus diverged from the lineage shared by SARS-CoV-2 and RaTG13. Then, in the 1960s or 1970s, that shared lineage split in two, creating the RaTG13 lineage and the SARS-CoV-2 lineage. Sometime between 1980 and 2013, the RaTG13 lineage lost its human receptor-binding ability, but the SARS-CoV-2 lineage did not.

"The SARS-CoV-2 lineage circulated in bats for 50 or 60 years before jumping to humans," Boni said. Near the end of 2019, "someone just got very unlucky" and came into contact with SARS-CoV-2 and that set off a pandemic.

There are likely other virus lineages from the same century-old ancestor that also underwent decades of evolution, "that we have just not characterized," Boni said. "The question is, ‘Are there half a dozen of these lineages, 20, or a hundred?’ — and nobody knows." But it's likely there are others out there hiding out in bats that are able to spread to humans, he said.

"This paper provides more clues to understanding how this and other coronaviruses may emerge," said Dr. Amesh Adalja, an infectious disease expert at the Johns Hopkins Center for Health Security in Baltimore, who was not a part of the study. "We only really know the tip of the iceberg when it comes to the viruses that are harbored in bats." Seeing that relatives of the coronavirus have been around for so many years, suggests there's so much unsampled. "When it comes to pandemic preparedness, having a much more robust surveillance system is really the only way that we're going to protect against these threats in the future," Adalja said.

A lot of virus sampling is done in domestic and wild birds in east Asia, Southeast Asia and in other parts of the world in an effort to prevent potential bird flu pandemics, Boni said. "If someone gets infected with an avian influenza virus, the turnaround time to understand that would be something like 48 hours and we would immediately know that this person needs to be isolated right away and other measures would follow." But for bat coronaviruses, there are no such preventative measures in place, he added. 

It took more than a month after SARS-CoV-2 first spread to humans for scientists to have the novel coronavirus's genome in their hands — enough time for the virus to have spread to a thousand people, Boni said. "At that point it was too late."

The findings were published July 28 in the journal Nature Microbiology.

Originally published on Live Science.

Yasemin Saplakoglu
Staff Writer

Yasemin is a staff writer at Live Science, covering health, neuroscience and biology. Her work has appeared in Scientific American, Science and the San Jose Mercury News. She has a bachelor's degree in biomedical engineering from the University of Connecticut and a graduate certificate in science communication from the University of California, Santa Cruz.