
AI-driven discovery sheds light on the origin and evolution of viruses

Artificial intelligence (AI) is helping to redraw the family tree of viruses. Predicted protein structures generated by AlphaFold and chatbot-inspired “protein language models” have revealed some surprising connections in a family of viruses that includes human pathogens and emerging threats.

Much of the scientific understanding of viral evolution is based on genome comparisons. But the rapid evolution of viruses – particularly those with genomes written in RNA – and their tendency to acquire genetic material from other organisms mean that genetic sequences can hide deep and distant relationships between viruses, which can also vary depending on the gene being studied.

In contrast, the shapes or structures of proteins encoded by viral genes tend to change slowly, making it possible to uncover these hidden evolutionary connections. But until the arrival of tools such as AlphaFold, which can predict protein structures on a large scale, it was impossible to compare protein structures across an entire family of viruses, says Joe Grove, a molecular virologist at the University of Glasgow, UK.

In a paper published in Nature, Grove and his team demonstrate the power of a structure-based approach on flaviviruses – a group that includes hepatitis C, dengue and Zika viruses, as well as some important animal pathogens and species that could pose emerging threats to human health.

Viral entry

Much of researchers' understanding of the evolution of flaviviruses is based on sequences of slowly evolving enzymes that copy their genetic material. However, researchers know remarkably little about the origins of the “viral entry proteins” that flaviviruses use to enter cells and that determine the range of hosts they can infect. That gap, Grove says, has slowed the development of an effective vaccine against hepatitis C, which kills hundreds of thousands of people each year.

“At the sequence level, things are so different that we can’t say whether they’re related or not,” he says. “The advent of protein structure prediction unravels the whole question, and we can see things quite clearly.”

The researchers used DeepMind's AlphaFold2 model and ESMFold, a structure prediction tool developed at tech giant Meta, to generate more than 33,000 predicted protein structures from 458 flavivirus species. ESMFold is based on a language model trained on tens of millions of protein sequences. Unlike AlphaFold, it only requires a single input sequence and does not rely on multiple sequences of similar proteins, so it could be beneficial for studying the most mysterious viruses.

The predicted structures allowed the authors to identify viral entry proteins whose sequences are very different from those of known flaviviruses. They found some unexpected links. For example, the subgroup of viruses that includes hepatitis C infects cells using a system similar to one the researchers identified in the pestiviruses – a group that includes classical swine fever virus, which causes hemorrhagic fever in pigs, and other animal pathogens.

The AI-powered comparisons showed that this entry system is different from those of many other flaviviruses. “We don’t know where the entry system for hepatitis C and its relatives comes from. It could be that it was ‘invented’ by these viruses a long time ago,” says Grove.

Stolen from bacteria

The predicted structures also showed that the well-studied entry proteins of the Zika and dengue viruses share an origin with those of what Grove describes as “weird and wonderful” flaviviruses with huge genomes, including the Haseki tick virus, which can cause fever in humans. Another big surprise was the discovery that some flaviviruses have an enzyme that appears to have been stolen from bacteria.

“This would be unprecedented,” says virologist Mary Petrone of the University of Sydney, Australia, had her team not discovered a similar theft by another strange and wonderful species of flavivirus this year. “Genetic piracy may have played a larger role in the evolution of flavivirids than previously thought,” she adds.

David Moi, a computational biologist at the University of Lausanne in Switzerland, says the flavivirus study is the tip of the iceberg and that the evolutionary history of other viruses and even some cellular organisms is likely to be rewritten with AI. “We will retell their stories with a new generation of tools,” he says. “Now that we can look a little further, all of these things need a little update.”