close
close

Viral plasticity facilitates host diversity in challenging environments

To model the antagonistic coevolution of bacteria and plastic viruses, we modified an eco-evolutionary modeling framework that we successfully used in the past to study host-phage coevolution55.

Equations for the ecological dynamics

Our model describes the dynamics of uninfected host cells (C, in cells L−1), infected hosts (I, in cells L−1), extracellular viruses (V, in ind L−1), and the concentration of the most limiting nutrient for the host (N, in mol L−1), under controlled environmental conditions (chemostat). The model also explicitly includes the period of infection (latent period, L) by representing the interactions between host cells and viruses through delayed differential equations (see Supplementary Table 1 in SI for symbols and units). For a host phenotype i and a virus phenotype j:

$$\frac{d{C}_{i}
(1)

$$\frac{d{I}_{i}
(2)

$$\frac{d{V}_{j}
(3)

$$\frac{dN
(4)

where nH and nV are the total number of host and virus phenotypes at a given time, respectively, and \({{{{\mathcal{A}}}}}_{i,j}=1\) if viral phenotype j infects viral host i, or zero otherwise (i.e. \({{{\mathcal{A}}}}\) is the adjacency matrix for the system’s interaction network). The first equation (Eq. (1)) represents, for a given host phenotype, the dynamics of the population of uninfected cells, which increases at a growth rate μ that depends on the availability of the most limiting nutrient (first term), and declines due to adsorption and infection by any extracellular viruses able to infect this host (second term) or dilution from the chemostat at a rate w (fourth term). The third term accounts for potential competition among all host phenotype populations for space or resources not explicitly modeled that can affect negatively the growth of the focal phenotype, especially relevant as the eco-evolutionary framework allows for many contemporary host phenotype populations; Call = ∑iCi thus represents uninfected cells across all phenotypes (see SI.9.2). As usually assumed in host-virus models, all adsorptions lead to infection (first term in Eq. (2)); in addition, the population of infected cells declines due to dilution (third term) or due to lysis (second term), with the number of lysed cells calculated as the cells that were infected a latent period in the past (i.e. at time t − L) and avoided dilution during that latent period (which occurs at a probability given by ewL). Each lysed cell produces B new extracellular viruses (first term in Eq. (3)), whose population decreases as they enter/infect uninfected cells (second term), decay and lose infectivity (fourth term), or are diluted (fifth term). The third term represents superinfection avoidance, generically accounting for the battery of mechanisms that an infecting virus can deploy to prevent any other virus from using the same host for replication8; to this end, we used a density-dependent term where Vall = ∑jVj represents the sum of all (extracellular) virus densities across phenotypes (see SI.9.2). Finally, Eq. (4) represents the dynamics of the most limiting nutrient for the host population, whose availability changes due to inflow and dilution from the chemostat (first term), and decreases as it is taken up by uninfected hosts within the chemostat (second term). Also a typical assumption, infected cells do not grow nor replicate, and thus they do not contribute to the uptake term nor need to be considered for the third term in Eq. (1). Nutrient uptake results from the host’s requirement for growth, provided by the classic Monod formulation56:

$$\mu={\mu }_{max}\frac{N}{N+{K}_{N}},$$

(5)

where μmax is the maximum growth rate (in d−1), and KN is the half-saturation constant for growth on the focal nutrient (in mol L−1).

As commonly done in population models, we set a threshold below which either the host or the viral populations were considered to be extinct, which avoids unrealistically low values of either population as well as potential population “regeneration” from such unrealistic values. Here, we set a threshold of 1 ind L−1 for either population; in the case of the virus, the threshold applied to both extracellular and intracellular viruses together, a conservative choice aimed to prevent eliminating newly introduced viral mutant populations while still within infected cells before completing their first lytic cycle.

Although the equations above can represent the ecological dynamics of a wide variety of host-virus systems, here we focused on Escherichia coli as a host and T7 phage as the virus. The wealth of information available about this system enabled the parametrization of the equations above, including expressions for the dependence of the viral traits on host physiology (viral plasticity, see below). The focal nutrient was glucose, which E. coli uses to sustain growth, but other nutrient choices more relevant for other organisms or environments (e.g. nitrogen) can be used by modifying the nutrient-related parameters (see Supplementary Table 1 in SI). Additionally, although the equations above can be easily adapted to other settings, chemostats are highly tunable by modifying the dilution rate, w, or the input concentration, N0, which we used here to achieve a gradient of growth conditions. Moreover, chemostats can represent a variety of environments, from volumes of water in the ocean (with inflow/outflow representing advection that moves nutrients, hosts, and viruses in/out of the focal volume, and turbulence mixing the medium), to different areas of the human intestinal tract (with inflow/outflow representing directional tract flows).

Host traits and tradeoffs

Here, we focused on one single type of host receptor as viral gateway for infection; we considered that evolution can alter the physical configuration of such receptor, which can prevent infection by viruses unable to recognize and attach to this evolved form (Fig. 1a). From the wide variety of defense mechanisms that hosts can develop, such an evolutionary response is one of the quickest and most well-documented defense strategies for microbes7,14,34.

Each host phenotype was therefore characterized by a fixed number of specific, identical receptors (i.e. each phenotype showed only one receptor type), which in turn were defined by how much they departed from the pristine form of the receptor. We used a real value a [aminamax] to represent quantitatively the form of the receptor, with a = a0 representing the pristine form. Thus, a was the only host evolving trait, and host phenotypes differed in the value of a (and related traits, see Eq. (7) below) and were identical otherwise. For the results presented here, a [0, 2] and a0 = 1, but other values did not change qualitatively our results (as long as the values of amin and amax do not cap/constrain evolution, a standard caution in any model that accounts for evolutionary dynamics).

The evolution of the receptor typically leads to a decrease in its nutrient-uptake efficiency, which ultimately translates into a decrease in the overall host growth rate6,7,14,36. Here, we represented such a tradeoff with a reduction of the maximum growth rate (which links growth and nutrient uptake, Eq. (5)) proportional to how much the receptor departed from its pristine form (i.e. degree of innovation). In short, μmax decreased linearly as the receptor characterizing the phenotype departed from the pristine form; phenotypes with the pristine receptor showed full growth potential (i.e. \({\mu }_{max}={\mu }_{ma{x}_{0}}\)), whereas that potential was reduced to \({\mu }_{max}={\epsilon }_{\mu }{\mu }_{ma{x}_{0}}\) for those phenotypes whose receptors showed the highest possible degree of innovation (Fig. S1b). Here, we set ϵμ to a low but non-zero value, ϵμ = 0.05, representing the fact that even the most evolved receptor can still be somewhat functional. Mathematically, we defined the degree of innovation for a given host phenotype i as:

$${\Delta }_{i}=\left | \frac{{a}_{i}-{a}_{0}}{{a}_{0}}\right |,$$

(6)

confined between a minimum and a maximum distance to the pristine receptor, [Δmin = 0, Δmax = max(Δ(amax), Δ(amin))] (see Fig. S1a). Thus, the phenotype’s maximum growth rate was given by:

$${\mu }_{ma{x}_{i}}={\mu }_{ma{x}_{0}}\left(1-{c}_{a}{\Delta }_{i}\right)$$

(7)

where ca is a constant to map the parenthesis into [ϵμ, 1]:

$${c}_{a}=\frac{1-{\epsilon }_{\mu }}{{\Delta }_{max}}$$

(8)

Note that the second term in the parenthesis, caΔi, effectively quantifies the cost of innovation (i.e. distance between the realized μmax and pristine performance \({\mu }_{ma{x}_{0}}\)).

In our simulations, no surviving phenotype reached the proximity of the interval limits for a, and thus never showed maximal innovation (Δ = 1 with the chosen parametrization). On the other hand, receptor evolution did not affect the other host traits in the model (the Monod-related half saturation constant, KN, and yield, Y).

Viral traits and tradeoffs

As a viral evolutionary response, we focused on increases in the versatility of the viral tail fiber to recognize and attach to diverse forms of the focal receptor (Fig. 1a), a viral counter-defense response that has been shown to emerge shortly after hosts evolve modifications to their receptors6,14,33.

Viral phenotypes were here characterized by the set of receptor forms that they could recognize. Thus, we assigned to each phenotype an array r of R = 10 possible integers between 1 and R, with each integer representing forms of the receptor (i.e. values of a) that the virus could recognize. Specifically, a viral phenotype j was able to target a host phenotype i if di rj, where di is the integer resulting from mapping ai into 1, . . . , R:

$${d}_{i}={c}_{1,r}{a}_{i}+{c}_{2,r}$$

(9)

and:

$${c}_{1,r}=\frac{R-1}{{a}_{max}-{a}_{min}},\quad {c}_{2,r}=\frac{{a}_{max}-{a}_{min}R}{{a}_{max}-{a}_{min}}$$

(10)

Therefore, with the parametrization chosen here (see Supplementary Table 1 in SI), a viral phenotype with a value r0 = 5 within its r array would be able to target hosts whose a is within [0.889, 1.111], which encompasses not only the pristine form of the receptor a0 = 1 but also slightly modified versions of it (Fig. S2). The latter, necessary from a technical standpoint because a is represented using real numbers, reflects the fact that only after evolution has modified a receptor sufficiently does the tail fiber lose the ability to recognize and attach to it. Virus phenotypes thus differed only if the list of distinct integers within their r array differed (see Fig. S2). The versatility of phenotype j was defined as the number of distinct integers within \({{{{\bf{r}}}}}_{j},\,\,{n}_{dif{f}_{j}}\).

Following empirical evidence, we also considered the observed decrease in infection efficiency shown by viruses that evolve a higher versatility6,14,33,35, here represented as a decrease in the adsorption rate proportional to ndiff. In short, viruses with only one distinct integer in the array (i.e. targeting a single type of receptor) showed the maximum possible adsorption rate, k0, whereas the most versatile viruses (i.e. R distinct integers in the r array) showed a reduced adsorption rate ϵkk0 (Fig. S1c). Here, we set ϵk = 0.1, which aimed to represent a non-zero minimum adsorption efficiency even for the most versatile phenotypes. Mathematically:

$${k}_{j}={k}_{0}\left[1-{c}_{1,k}\left({n}_{dif{f}_{j}}-1\right)\right],$$

(11)

where c1,k helps map the bracket above into the interval [ϵk, 1]:

$${c}_{1,k}=\frac{1-{\epsilon }_{k}}{R-1}$$

(12)

Note that the second term in the bracket, \({c}_{1,k}({n}_{dif{f}_{j}}-1)\), effectively quantifies the cost of versatility (i.e. distance between the realized k and k0).

In our simulations, no surviving viral phenotype reached full versatility (ndiff = R, see below). We assumed that tail fiber evolution did not affect the other viral traits considered in the model. Such traits are defined by the lytic cycle57: after attaching to the host receptor and injecting their genetic material into the host, viruses hijack the host machinery and use host resources to i) synthesize the components of the new virions during the eclipse period E, and ii) assemble those components at a certain maturation rate M to form the new mature individuals released at lysis; the total infection time (from adsorption to lysis) is the latent period L, and the number of new virions released per infection is the burst size B.

Modeling viral plasticity

As described above for the lytic cycle, due to its parasitic nature the virus utilizes the host synthesis machinery and resources to replicate. This leads to an obvious link between host physiological state and viral reproduction that has been identified experimentally in a multitude of systems19,23,26,27,28,29,30, but so far only systematically studied for the E. coli – T virus system20,22,24,25.

Such a link has been characterized as a relationship between viral traits and host growth rate, termed viral plasticity because the value of viral traits changes as the host (reproductive environment of the virus) changes. In the past, we compiled existing data for E. coli and T phage and found that the eclipse period, E, and maturation rate, M, could be expressed as the following functions40:

$$E \, \left(\mu \right)={E}_{\infty }+{E}_{0}{e}^{-{\alpha }_{E}\mu /{\mu }_{ma{x}_{0}}}$$

(13)

$$M \, \left(\mu \right)=\frac{{M}_{\infty }}{1+{e}^{-{\alpha }_{M}\left(\mu /{\mu }_{ma{x}_{0}}-{M}_{0}\right)}}$$

(14)

where μ represents the host growth rate at the moment of infection, with the implicit assumption that it remains constant during the latent period20,22. In short, these functions illustrate that, as the host growth rate gets closer to its maximum potential value, \({\mu }_{ma{x}_{0}}\), the time needed to synthesize virion components decreases and the rate to assemble them into mature viruses increases. Both traits reach a lower and upper plateau, respectively, due to physiological constraints and bottlenecks. See40 and references therein for more details. We also showed that the optimal latent period under chemostat conditions can be written as40:

$$L \, \left(\mu \right)={L}_{0}+E(\mu ),$$

(15)

where L0 = 1/w, and thus is set by the chemostat dilution rate; in other words, this optimal latent period has an environmental component L0 and a component provided by the (plastic) eclipse period. Here, we assumed this optimal latent period for all viruses as a way to consider the most dominant L value expected for a given chemostat, and thus to focus solely on the evolution of versatility (Fig. S1d). See SI.9.4 for a discussion on other choices for L0. Regarding the burst size, we used a modified version of an empirically derived expression58:

$$B \, \left(\mu \right)=M(\mu ) \, \left[L(\mu )-E(\mu )\right],$$

(16)

where we only added the potential for the traits involved to depend on host physiology, ultimately leading to a dependence of burst size on host growth rate (see Fig. S1e).

Eqs. (13–16) thus link the trait values for a given viral individual to the growth rate (at time of infection) of the individual host cell it infects.

Nonplastic version of the model

To understand the role of viral plasticity in antagonistic coevolution, we contrasted the results obtained using the expressions above (Eqs. (13–16)) with those obtained with a version of the model that uses fixed values for those viral traits. The latter can represent both an example of a virus that depends minimally on the host (taken to the extreme of no dependence at all) as well as the expectations built with standard models, which neglect viral plasticity.

To obtain fixed values for the latent period and burst size that still represented the most dominant viral trait values expected at a given environment, we used Eqs. (15) and (16) with a fixed value for the eclipse period and the maturation rate, Enon and Mnon respectively. For these two traits, the typical information available is from laboratory experiments in which, as explained in ref. 20, the host is kept under maximal growth conditions. Thus, for coherence we calculated those values here by using Eqs. (13) and (14) with \(\mu={\mu }_{ma{x}_{0}}\) (i.e. \({E}_{non}=E({\mu }_{ma{x}_{0}}),{M}_{non}=M({\mu }_{ma{x}_{0}})\)), which were within the ranges reported in the literature for T viruses (e.g.20,22,59). Therefore, the resulting \({L}_{non}=L({\mu }_{ma{x}_{0}})\) and \({B}_{non}=B({\mu }_{ma{x}_{0}})\) provided values for the most dominant L and B that are expected for a given environment, but that disregard plasticity.

Modeling evolution

To model the evolution of the focal traits (host receptor and viral tail fiber), we adapted an eco-evolutionary approach that has been repeatedly used in the literature for a variety of microbial systems and questions (e.g.40,42,47,55,60).

Generically, for one evolving species the approach considers mutation events occurring at exponentially distributed times that depend on the species’ mutation rate. At each mutation time, ecological dynamics are stopped to use a roulette-wheel genetic algorithm that determines which phenotype evolves, chosen with a probability that depends on each phenotype’s relative abundance. A new mutant population is introduced then in the system, in small numbers, that is identical to the parental phenotype except for a small, random difference in the evolving trait. Thus, ecological dynamics resume and the new and existing phenotypes interact (e.g. compete), with some of such populations consequently growing in numbers and others going extinct, until a new mutation event occurs. This “innovation+selection” iterative process allows the system to explore the trait space, with the possibility of reaching an evolutionarily stable strategy (ESS, trait fixation or selective sweeping), or the continued coexistence of several dominating phenotypes with or without evolutionary oscillations (fluctuating selection or Red Queen dynamics). Because mutation times are randomly selected based on the demography of the evolving population, there is no imposed separation of ecological and evolutionary timescales, and thus the emerging evolutionary dynamics are affected by the ecological interactions and vice-versa (eco-evolutionary dynamics).

In our case, both host and virus evolved. At any given time, existing host and virus phenotype populations interacted following Eqs. (1–16), with hosts competing for the available nutrients and viruses competing for common hosts. These ecological interactions were simulated using a tailored Euler scheme to integrate the system of delayed differential equations, but any other solver accounting for the delay (i.e. latent-period-related) terms can be used. Mutation times for the host were determined by the host mutation rate (see Supplementary Table 1 in SI) and the population densities of the existing host phenotypes, the latter also influencing each phenotype’s mutation probability (i.e. probability to be chosen for mutation). New host mutants introduced in the system were identical to the parental phenotype except for the value of a, which differed by an amount randomly distributed in \({{{\mathcal{N}}}}(0,{\sigma }_{a})\) (i.e. Gaussian distribution with mean 0 and standard deviation σa). We considered two phenotypes to be identical if their a differed less than da = 10−2. On the other hand, mutation times for the virus were determined by the viral mutation rate (see Supplementary Table 1 in SI) and the population densities of the existing viral phenotypes, which also determined their respective mutation probability. Through mutations in the tail fiber, mutant viral phenotypes showed the same r array as the parental phenotype except for a randomly chosen location, where a new integer was considered that resulted from adding a random amount (from a standard normal distribution, \({{{\mathcal{N}}}}(0,1)\)) to the existing integer, then converting the result back to an integer. This rule ensured that any new form of the receptor that can be recognized as a result of the mutation was not radically different from the ones the parental phenotype could attach to, in agreement with empirical observations that evolution of a wider range results from cumulative evolutionary steps61. We considered two phenotypes to be identical if their r showed the same set of distinct (i.e. non-repeated) integers, which would mean that they targeted the same receptors (i.e. showed the same strategy). After a new mutant phenotype was introduced, the ecological dynamics resumed. Mutation times were stochastic and did not necessarily occur after ecology reached an equilibrium, thus enabling eco-evolutionary interactions.

We started our simulations with one single phenotype population with the pristine form of the receptor, i.e. characterized by a = a0, and one single viral phenotype with all locations in r set to r0 = 5, i.e. able to infect only the pristine form of the receptor.

Due to the random nature of the evolutionary algorithm, we used 100 replicates for each of the cases explored here. We stopped each replicate after t = 105d to ensure that the system had reached its asymptotic eco-evolutionary behavior regardless of the environmental conditions (the latter set by the dilution rate, w, and nutrient input concentration, N0). See Supplementary Table 1 in SI for more information regarding the chosen parametrization.

Measures of strategy effectiveness and diversity

To measure the effectiveness of a given host strategy, we used the standard measure of resistance or immunity, i.e. proportion of virus phenotypes that a given host phenotype i can elude16:

$${\nu }_{i}=1-\frac{{\sum }_{j}{{{{\mathcal{A}}}}}_{i,j}}{{n}_{V}},$$

(17)

where nV is the total number of virus phenotypes at a given time and, as above, \({{{{\mathcal{A}}}}}_{i,j}=1\) if viral phenotype j infects viral host i, or zero otherwise. Because we can monitor the abundance of all host and viral phenotype populations, we also measured a weighted version of such immunity:

$${\nu }_{{w}_{i}}=1-\frac{{\sum }_{j}{{{{\mathcal{A}}}}}_{i,j}{V}_{j}}{{V}_{all}}.$$

(18)

This expression accounts for the proportion of the total virus abundance that cannot infect the focal virus phenotype, and therefore reflects more closely the success of a host strategy: large values indicate the ability of the host phenotype to elude the majority of available extracellular viruses, regardless of whether they belong to one or many different phenotypes. We will refer to νw as realized immunity hereon.

To measure the effectiveness of a given viral strategy, we used the standard measure of range, i.e. proportion of host phenotypes that a given virus phenotype j can infect16:

$${\rho }_{j}=\frac{{\sum }_{i}{{{{\mathcal{A}}}}}_{i,j}}{{n}_{H}},$$

(19)

where nH is the total number of host phenotypes at a given time. As in the case of immunity, we also measured a weighted version:

$${\rho }_{{w}_{j}}=\frac{{\sum }_{i}{{{{\mathcal{A}}}}}_{i,j}{C}_{i}}{{C}_{all}},$$

(20)

which accounts for the proportion of the total host biomass that is infected by the focal virus phenotype. This expression thus reflects more closely the success of a viral strategy: large values indicate the ability of the virus phenotype to infect the majority of available host cells, regardless of whether they belong to one or many different phenotypes. We will refer to ρw as realized infectivity hereon.

Finally, we used the standard definition of richness as the number of phenotypes (nH for host, nV for virus), and the Shannon index44 as a measure of biodiversity, defined as:

$${S}_{H}=-{\sum}_{i} \, \left(\frac{{C}_{i}}{{C}_{all}}\right)\log \left(\frac{{C}_{i}}{{C}_{all}}\right)$$

(21)

for the host and:

$${S}_{V}=-{\sum}_{j}\left(\frac{{V}_{j}}{{V}_{all}}\right)\log \left(\frac{{V}_{j}}{{V}_{all}}\right)$$

(22)

for the virus.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.