An Atlas of Viral Adaptation

A comparison of adaptive evolution across a panel of
human pathogenic viruses.

Adaptive evolution is evolutionary change, driven by a selective pressure, that increases fitness. In the context of viruses, the rate at which a protein evolves adaptively speaks to its evolutionary potential to undergo antigenic drift (evade antibody recognition), cross species barriers, or escape drugs. In particular, a high rate of adaptation in a virus that has been endemic in humans for decades indicates that this virus is undergoing continuous adaptation, likely stemming from a changing selective landscape caused by an evolutionary arms race between the virus and its host. Because viral surface proteins are the primary targets of neutralizing antibodies, high rates of adaptation in a viral surface protein typically indicate antigenic drift.

Comparing rates of adaptation in the surface proteins of different viruses allows us to hypothesize which viruses evolve antigenically and which do not. From this comparison, we can also glean insight into the prevalence or rarity of antigenic evolution amongst viruses that infect humans. Additionally, we can compare lesser-studied viruses to ones that are better-understood to extrapolate real-world implications of varying rates of adaptation.

Most viruses contain a polymerase, which we expect to be relatively conserved, as well as a surface protein (or protein subunit) that binds to a host-cell receptor. This plot compares the rates of adaptation within these genes.

Hovering your mouse over a point on the plot will indicate which viral protein the rate is calculated for (e.g. for H3N2, the polymerase is PB1 and the receptor-binding protein is HA1). The calculated rate of adaptation and metadata about the virus (such as viral family, genome type, etc.) will also be shown. The controls below the plot let you change the y-axis units from "adaptive mutations per codon per year" to "adaptive mutations per year", choose whether the x-axis is ordered by viral family or by ascending rate of evolution, and select which viruses are displayed on the plot. Clicking on a point will bring you to a virus-specific page that shows the inferred rates of adaptation in other genes of this virus, as well as predicted sites of immune evasion in the receptor-binding protein.

This panel focuses on viruses that are endemic in humans and for which at least 50 high-quality sequences spanning at least 10 years are available. We have aimed to cover a diverse range of RNA and DNA viruses with genomes under ~50 kilobases in length.

Rates of adaptation here are calculated according to an adaptation of the McDonald-Kreitman test laid out in Bhatt et al., 2011. This method divides an alignment of sequences into time windows and then calculates the number of adaptive mutations that have accumulated between the ancestral sequence and each subsequent time point. A linear regression is then fit to these time points, and the rate of adaptation is its slope.
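The regression step can be illustrated with a minimal sketch. It assumes an ordinary least-squares fit and uses made-up time points and counts; the function name `adaptation_rate` is hypothetical and is not taken from the project's code, which may differ in its details.

```python
def adaptation_rate(time_points, adaptive_counts):
    """Return the least-squares slope of adaptive mutations against time.

    time_points: midpoint of each time window, in years since the
        ancestral sequence (illustrative convention, an assumption).
    adaptive_counts: accumulated adaptive mutations at each time point.
    """
    n = len(time_points)
    if n < 2 or n != len(adaptive_counts):
        raise ValueError("need at least two matched time points")

    mean_t = sum(time_points) / n
    mean_a = sum(adaptive_counts) / n

    # Sum of squared deviations in time, and the time/count cross-deviations.
    sxx = sum((t - mean_t) ** 2 for t in time_points)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(time_points, adaptive_counts))

    # The slope is the rate of adaptation (adaptive mutations per year).
    return sxy / sxx


# Made-up example: one adaptive mutation accumulates every two years.
years = [0.0, 2.0, 4.0, 6.0]
adaptive = [0.0, 1.0, 2.0, 3.0]
rate = adaptation_rate(years, adaptive)
print(rate)
```

Dividing this slope by the number of codons in the gene would give a per-codon rate, under the assumption that the per-codon units are a simple normalization by gene length.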

The number of adaptive mutations is the number of fixations and high-frequency polymorphisms (present at 75% or higher in the population) that exceed the neutral expectation. The neutral expectation is defined by synonymous and mid-frequency nonsynonymous mutations. Specifically, the number of adaptive mutations \( a \) is given by \( a = r_f + r_h - (s_f + s_h)\frac{r_m}{s_m} \), where \( r_f \), \( r_h \) and \( r_m \) are counts of replacement (nonsynonymous) fixations, high-frequency polymorphisms, and mid-frequency polymorphisms, respectively, and \( s_f \), \( s_h \) and \( s_m \) are counts of silent (synonymous) mutations in these frequency classes. These counts are obtained by walking through each nucleotide position in the sequence and comparing the outgroup to the alignment.
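As a worked illustration of the formula, the sketch below evaluates \( a \) from the six counts. The counts are invented for the example, the function name `adaptive_mutations` is hypothetical, and raising an error when \( s_m = 0 \) is an assumption made here rather than a description of how the actual analysis handles that case.

```python
def adaptive_mutations(r_f, r_h, r_m, s_f, s_h, s_m):
    """Evaluate a = r_f + r_h - (s_f + s_h) * (r_m / s_m).

    r_* are replacement (nonsynonymous) counts and s_* are silent
    (synonymous) counts for fixations (f), high-frequency polymorphisms (h)
    and mid-frequency polymorphisms (m).
    """
    if s_m == 0:
        # Assumption for this sketch: the neutral ratio is undefined
        # without any mid-frequency silent polymorphisms.
        raise ValueError("s_m must be non-zero to estimate the neutral ratio")

    # Expected number of neutral replacement changes among fixations and
    # high-frequency polymorphisms, scaled from the silent counts.
    neutral_expectation = (s_f + s_h) * r_m / s_m

    # Adaptive mutations are the observed replacements beyond that expectation.
    return r_f + r_h - neutral_expectation


# Invented counts, for illustration only.
a = adaptive_mutations(r_f=10, r_h=4, r_m=6, s_f=8, s_h=2, s_m=12)
print(a)
```

With these counts the neutral expectation is \( (8 + 2) \times 6/12 = 5 \), so \( a = 14 - 5 = 9 \).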

Following a zoonotic transmission from an animal to humans, viruses almost always undergo adaptive evolution in response to the new host environment. This usually includes adaptation in the receptor-binding domain to optimize binding of human receptors, a process that is usually accomplished within a few years. However, the receptor-binding protein is also a major target of the human immune system, putting another evolutionary pressure on the virus to fix mutations in this region that escape immune recognition. Because the humoral immune system is also adaptive, this supplies a continuous selective pressure on the virus. Thus, viruses that have been endemic in humans for decades and continue to display adaptive evolution in their receptor-binding proteins are likely evolving antigenically to escape antibody responses elicited by past infections.

This means that we will be susceptible to reinfection by viruses that evolve antigenically, and also that vaccines against these viruses will provide only temporary protection. The goal of this project is to predict which viruses undergo antigenic evolution and which do not by directly comparing rates of adaptation in the receptor-binding proteins across a panel of viruses. Measles and Influenza A/H3N2 can be thought of as benchmarks for comparison: measles is antigenically stable while H3N2 is known to undergo rapid antigenic drift. The implications of this for human disease are that the measles vaccine (which is based on a virus that was circulating in the 1960s) offers life-long protection against infection, while the H3N2 vaccine offers only transient protection, necessitating continual updates to the vaccine formulation. Between 2006 and 2021, the H3N2 component of the influenza vaccine was updated 10 times (11 different H3N2 strains).

The rate of adaptation should be related to the time it takes the virus to evolve an escape mutant and, thus, the duration of vaccine efficacy. Fittingly, the influenza B components of the vaccine were updated 4 times between 2006 and 2021; we predict that influenza B viruses evolve antigenically, but at a slower rate than H3N2. Though we hypothesize that there is a relationship between the rate of adaptation in the receptor-binding protein and the duration of immunity from natural infection or vaccination, we do not fully understand the nature of this relationship. The conversion of "rate of adaptation" to "duration of immunity" is complicated by factors such as waning immunity and differences in the exact number of adaptive mutations needed to effectively escape immunity (some viruses can escape with just one amino acid change, while others require more).

The analysis and data behind this website can be found in this repository.

The manuscript is currently posted as a bioRxiv preprint.

Built by Katie Kistler