
Supporting Information for
Complete genetic linkage subverts natural selection

Philip Gerrish,* Alexandre Colato, Alan Perelson, and Paul Sniegowski*
*To whom correspondence should be addressed. E-mail: pgerrish@lanl.gov (P.G.); paulsnie@sas.upenn.edu (P.S.)
This Supporting Information includes:
Supporting Text
Figures 5 to 20
Movies 1 and 2
Summary and schematic of mutator hitchhiking
In this paper, we model adaptive evolution. Our results show that complete linkage between replication fidelity loci and fitness loci results in the inevitable extinction of a population. Adaptation indirectly favors variants with elevated mutation rates, thereby driving up mutation rate in a ratchet-like fashion. This phenomenon of mutator hitchhiking, illustrated in Fig. 5, is well documented (1, 2) and we expected to observe this phenomenon operating in our simulations. We even speculated that this process might drive mutation rate to high levels through sequential mutator hitchhiking events. What we did not expect, however, was that this process of mutation-rate elevation would eventually drive the mutation rate to intolerable levels (3, 4), resulting in the extinction of the population.
Figure 5: Schematic of mutator hitchhiking. Lines represent genomes, green lines represent genomes that carry a defect in their replication/repair genes (a mutator allele), red dots represent deleterious mutations, and blue dots represent a beneficial mutation. Panels a through f are temporally sequential snapshots of the population. The beneficial mutation happens to occur on a mutator background and subsequently carries the mutator allele to fixation, thereby increasing the population mutation rate. It is said that the mutator hitchhikes to fixation.
Analysis of pde models.
We compactly model the process of adaptation, where fitness and mutation-rate genes are completely linked and both are subject to mutation. Mutation is represented as diffusion in log fitness and log mutation-rate space. Analysis reveals that adaptation by natural selection ultimately drives a population to extinction quite rapidly.
The models to be analyzed.
We present two complementary analyses of two general models of evolution that differ only in their assumptions about when mutation occurs during the life cycle:
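Model 1:
$$\frac{\partial u}{\partial t} = (x - \bar{x})\,u + e^{y} M u \qquad (1.1)$$
Model 2:
$$\frac{\partial u}{\partial t} = (x - \bar{x})\,(u + e^{y} M u) \qquad (1.2)$$
where:
x = log absolute fitness
y = log mutation rate
t = time
u = u(x, y, t) = density function
M = mutation operator
$\bar{x} = \iint_{x,y \in \mathbb{R}} x\,u(x,y,t)\,dx\,dy$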
We employ two models mainly because they lend themselves naturally to two qualitatively different analyses. Numerical solutions of these two models give qualitatively equivalent results. Model 1 most closely mimics organisms whose offspring may suffer mutations independently of one another (e.g., binary fission of bacteria, eukaryotes in general). Model 2 most closely mimics organisms whose offspring are produced en masse from a template genotype and for which mutation is most likely to occur in the replication event that creates the template (e.g., retroviruses).
The mutation operator M models mutation to and from other genotypes in the population. Mutation can affect either fitness, represented by a change in x, or mutation rate, represented by a change in y. Genomic mutation rate of an individual of genotype (x, y) is $e^{y}$. To clarify the mechanics of M, we give an example. A deleterious mutation may decrease fitness by an amount $\Delta x$ such that the parent's fitness is $x + \Delta x$ and the offspring's fitness is $x$. The probability density for $\Delta x$ is $g_D(\Delta x)$ and the mutational flux from fitness class $x + \Delta x$ to fitness class $x$ is $e^{y} f_D\, u(x + \Delta x, y, t)\, g_D(\Delta x)$, where $e^{y}$ is genomic mutation rate and $f_D$ is the fraction of all mutations that are deleterious. The total mutation influx of deleterious mutations into fitness class $x$ is given by
$$I_D = e^{y} f_D \int_0^\infty u(x + \Delta x, y, t)\, g_D(\Delta x)\, d\Delta x .$$
The total outflux of deleterious mutations from fitness class $x$ is $O_D = e^{y} f_D\, u(x, y, t)$, and the total net change in $u(x, y, t)$ due to deleterious mutations is therefore
$$I_D - O_D = e^{y} f_D \int_0^\infty u(x + \Delta x, y, t)\, g_D(\Delta x)\, d\Delta x - e^{y} f_D\, u(x, y, t) .$$
Similar logic gives terms for beneficial, mutator, and anti-mutator mutations, and all of these terms are then summed to construct mutation operator M, which is then defined as
$$
\begin{aligned}
e^{y} M u ={}& f_B\, e^{y}\left[\int_0^\infty u(x - \Delta x, y, t)\, g_B(\Delta x)\, d\Delta x - u\right]
 + f_D\, e^{y}\left[\int_0^\infty u(x + \Delta x, y, t)\, g_D(\Delta x)\, d\Delta x - u\right] \\
&+ f_M\left[\int_0^\infty e^{y - \Delta y}\, u(x, y - \Delta y, t)\, g_M(\Delta y)\, d\Delta y - e^{y} u\right]
 + f_A\left[\int_0^\infty e^{y + \Delta y}\, u(x, y + \Delta y, t)\, g_A(\Delta y)\, d\Delta y - e^{y} u\right]
\end{aligned}
$$
where,
$f_B$ = fraction of mutations that increase fitness (beneficial)
$f_D$ = fraction of mutations that decrease fitness (deleterious)
$f_M$ = fraction of mutations that increase mutation rate (mutator)
$f_A$ = fraction of mutations that decrease mutation rate (anti-mutator)
$g_B$ = distribution of effects of beneficial mutations
$g_D$ = distribution of effects of deleterious mutations
$g_M$ = distribution of effects of mutator mutations
$g_A$ = distribution of effects of anti-mutator mutations
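If the distributions of mutational effects, g, have means $m$ and variances $\sigma^2$, then second-order Taylor expansion of M gives
$$Mu \approx D_x \frac{\partial^2 u}{\partial x^2} + d_x \frac{\partial u}{\partial x} + D_y \frac{\partial^2 u}{\partial y^2} + (2D_y + d_y)\frac{\partial u}{\partial y} + (D_y + d_y)\,u$$
where
$$D_x = \tfrac{1}{2} f_D (m_D^2 + \sigma_D^2) + \tfrac{1}{2} f_B (m_B^2 + \sigma_B^2), \qquad d_x = f_D m_D - f_B m_B,$$
$$D_y = \tfrac{1}{2} f_A (m_A^2 + \sigma_A^2) + \tfrac{1}{2} f_M (m_M^2 + \sigma_M^2), \qquad d_y = f_A m_A - f_M m_M .$$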
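As a quick numerical illustration of the four coefficients defined above, the following minimal sketch evaluates them for the parameter set quoted in the numerical example of Analysis 1 (fractions $10^{-10}$, $10^{-1}$, $10^{-4}$, $10^{-5}$ and $m = \sigma = 0.03$ for every distribution). The function name and argument layout are our own illustrative choices, not part of the original analysis.

```python
def diffusion_coefficients(f_b, f_d, f_m, f_a, m, sigma):
    """Return (D_x, d_x, D_y, d_y) from the second-order expansion of M.

    Assumption for brevity: all four effect distributions share the same
    mean m and standard deviation sigma, as in the numerical example.
    """
    second_moment = m ** 2 + sigma ** 2
    big_d_x = 0.5 * f_d * second_moment + 0.5 * f_b * second_moment
    small_d_x = f_d * m - f_b * m
    big_d_y = 0.5 * f_a * second_moment + 0.5 * f_m * second_moment
    small_d_y = f_a * m - f_m * m
    return big_d_x, small_d_x, big_d_y, small_d_y


# Parameter values quoted in the text for the numerical example.
D_x, d_x, D_y, d_y = diffusion_coefficients(
    f_b=1e-10, f_d=1e-1, f_m=1e-4, f_a=1e-5, m=0.03, sigma=0.03
)
print(f"D_x = {D_x:.3e}, d_x = {d_x:.3e}")
print(f"D_y = {D_y:.3e}, d_y = {d_y:.3e}")
```

With these values $d_x$ is positive and $d_y$ is negative, which are the sign conditions the later analyses rely on.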
Analysis 1: Dynamic Limiting Solution of Model 1.
Our analysis begins conservatively by supposing our finding (that natural selection drives a population extinct) is wrong and that previous ideas about natural selection and mutation-rate evolution are right. To this end, we sought an asymptotic, stably traveling wave solution of the form $u(x, y, t) \xrightarrow{t \to \infty} \tilde{u}(x - ct, y)$ which, if it exists for positive wave velocity ($c > 0$), would be consistent with the notion that mutation rate converges to some optimal or stable distribution that balances adaptability and adaptedness while fitness continues to increase without bound. When mutation rate does not evolve, a stable, asymptotic traveling wave solution exists, has the form $u(x, t) \xrightarrow{t \to \infty} \tilde{u}(x - ct)$, and was previously derived by Tsimring et al. (5) and Rouzine et al. (6). We write this solitary wave solution by first defining a new variable, $z = x - ct$. We suppose that mean log fitness increases linearly in time ($\bar{x} = ct$) such that $z = x - \bar{x}$. The equation for $\tilde{u}(z, y)$ is
$$-c\,\frac{\partial \tilde{u}}{\partial z} = z\tilde{u} + e^{y} M \tilde{u},$$
where the mutation operator M remains unchanged because it acts equivalently on $u$ and $\tilde{u}$ (spatial derivatives are unchanged, e.g., $\partial u / \partial x = \partial \tilde{u} / \partial z$).
A numerical example. To begin, we show a numerical solution of the solitary wave equation for c > 0 .
The purpose of this first part is simply to demonstrate what happens to the solitary wave as the upper bound on mutation rate increases, given one biologically reasonable set of parameters (Fig. 6). Parameters chosen for this solution are: $f_B = 10^{-10}$, $f_D = 10^{-1}$, $f_M = 10^{-4}$, $f_A = 10^{-5}$, and $m = \sigma = 0.03$ for all distributions of mutational effects. To interpret these solitary wave plots intuitively, one can imagine that he or she is a moving observer of the fitness peak; this fitness peak propagates to the right along the fitness axis at the same velocity as the observer, such that the observer sees a stationary peak. The peak that one observes is what the population ultimately converges to, possibly after a long time, i.e., it is the asymptotic distribution. The parameter c is the asymptotic wave velocity and may be interpreted as specifying how fast (and in which direction) the observer must travel in order to see the distribution as stationary. Initially, the peak appears close to the maximum allowable mutation rate, as shown in panel A. Here, the maximum allowable mutation rate is not yet intolerable. Because of mutator hitchhiking, the population mutation rate will evolve to some value close to the maximum while fitness continues to increase without bound. When the upper bound on mutation rate is sufficiently large, the peak disappears altogether, as shown in panel B. Keeping the upper bound on mutation rate at this largest value, the soliton was found by allowing negative wave velocities, shown in panel C. In this case, the asymptotic wave velocity was found to be $c \approx -3$, which means that population log fitness ultimately decreases very fast without bound when the upper bound on mutation rate is large.
Figure 6. Numerical solution of the solitary wave equation. As the maximum allowable mutation rate increases, the peak disappears under the assumption of $c > 0$ (panels A and B); it reappears when c is allowed to be negative and is found at the strongly negative value of $c \approx -3$ (panel C). (Numerical solution and plots generated by Maple.)
Analysis. The solitary wave equation to be analyzed is:
$$-c\,\frac{\partial u}{\partial z} = zu + e^{y} M u$$
where:
$$Mu \approx D_x \frac{\partial^2 u}{\partial z^2} + d_x \frac{\partial u}{\partial z} + D_y \frac{\partial^2 u}{\partial y^2} + (2D_y + d_y)\frac{\partial u}{\partial y} + (D_y + d_y)\,u .$$
(To simplify notation, we have replaced $\tilde{u}$ with $u$.) If we impose the boundary conditions,
$$u(-\infty, y) = u(\infty, y) = \frac{\partial u}{\partial z}(-\infty, y) = \frac{\partial u}{\partial z}(\infty, y) = u(z, -\infty) = u(z, \infty) = \frac{\partial u}{\partial y}(z, -\infty) = \frac{\partial u}{\partial y}(z, \infty) = 0$$
then, in Mathematical details, we show that:
1) The mean of z is zero:
$$\bar{z} = \iint z\,u(z, y)\,dz\,dy = 0, \qquad (1.3)$$
as expected, and
2) The variance of z is:
$$\sigma_z^2 = \iint z^2 u(z, y)\,dz\,dy = d_x \bar{\mu} + c, \qquad (1.4)$$
where $\bar{\mu} = \overline{e^{y}}$. To determine the sign of c, an intuitive approach would be to solve for the c that keeps the peak of the distribution in the same place. If the peak of the distribution occurs at $z = \hat{z}$, then c is determined from the equation
$$-c\,\frac{\partial u}{\partial z}(\hat{z}, y) = \hat{z}\,u(\hat{z}, y) + e^{y} M u(\hat{z}, y).$$
The problem with this approach is that $\frac{\partial u}{\partial z}(\hat{z}, y) = 0$ unconditionally, leaving c undetermined. A variant of this approach, however, does yield meaningful results. Instead of looking exactly at the peak of the distribution, we look at a point that is just to the right of the peak in the fitness-deviation variable, z, and we solve for the c that will keep that point in the same place. That is, instead of solving for c at $z = \hat{z}$, we solve for c at the point $z = \hat{z} + \varepsilon$, for some arbitrarily small $\varepsilon$. (Note: if one looks at a point that is just to the left of the peak, $z = \hat{z} - \varepsilon$, the exact same result is obtained.) Thus, c is determined from the equation
$$-c\,\frac{\partial u}{\partial z}(\hat{z} + \varepsilon, y) = (\hat{z} + \varepsilon)\,u(\hat{z} + \varepsilon, y) + e^{y} M u(\hat{z} + \varepsilon, y). \qquad (1.5)$$
For small $\varepsilon$, we have
$$\frac{\partial u}{\partial z}(\hat{z} + \varepsilon, y) \approx \frac{\partial u}{\partial z}(\hat{z}, y) + \varepsilon \frac{\partial^2 u}{\partial z^2}(\hat{z}, y) = \varepsilon \frac{\partial^2 u}{\partial z^2}(\hat{z}, y)$$
and
$$(\hat{z} + \varepsilon)\,u(\hat{z} + \varepsilon, y) \approx \hat{z}\,u(\hat{z}, y) + \varepsilon\,u(\hat{z}, y).$$
From these two expansions, the equation becomes
$$-c\,\varepsilon \frac{\partial^2 u}{\partial z^2}(\hat{z}, y) = \hat{z}\,u(\hat{z}, y) + \varepsilon\,u(\hat{z}, y) + e^{y} M u(\hat{z} + \varepsilon, y). \qquad (1.6)$$
If we go back for a moment to the equation obtained exactly at the peak, the fact that $\frac{\partial u}{\partial z}(\hat{z}, y) = 0$ gives $0 = \hat{z}\,u(\hat{z}, y) + e^{y} M u(\hat{z}, y)$, or $\hat{z}\,u(\hat{z}, y) = -e^{y} M u(\hat{z}, y)$. This we now insert into (1.6), giving
$$-c\,\varepsilon \frac{\partial^2 u}{\partial z^2}(\hat{z}, y) = \varepsilon\,u(\hat{z}, y) + e^{y} M u(\hat{z} + \varepsilon, y) - e^{y} M u(\hat{z}, y).$$
Expansion of $e^{y} M u(\hat{z} + \varepsilon, y)$ gives
$$e^{y} M u(\hat{z} + \varepsilon, y) - e^{y} M u(\hat{z}, y) \approx \varepsilon\, e^{y}\left[D_x \frac{\partial^3 u}{\partial z^3}(\hat{z}, y) + d_x \frac{\partial^2 u}{\partial z^2}(\hat{z}, y)\right],$$
yielding
$$-c\,\varepsilon \frac{\partial^2 u}{\partial z^2}(\hat{z}, y) = \varepsilon\,u(\hat{z}, y) + \varepsilon\, e^{y}\left[D_x \frac{\partial^3 u}{\partial z^3}(\hat{z}, y) + d_x \frac{\partial^2 u}{\partial z^2}(\hat{z}, y)\right]. \qquad (1.7)$$
From this equation, the asymptotic wave velocity is:
$$c = -\frac{u(\hat{z}, y)}{\frac{\partial^2 u}{\partial z^2}(\hat{z}, y)} - e^{y}\left[D_x \frac{\frac{\partial^3 u}{\partial z^3}(\hat{z}, y)}{\frac{\partial^2 u}{\partial z^2}(\hat{z}, y)} + d_x\right]. \qquad (1.8)$$
At its peak, u is concave down, guaranteeing that $\frac{\partial^2 u}{\partial z^2}(\hat{z}, y) < 0$. The first term, $-u(\hat{z}, y) \big/ \frac{\partial^2 u}{\partial z^2}(\hat{z}, y)$, is therefore positive but finite. Before getting technical about it, we appeal to biological intuition to guess at the sign of the third derivative: the action of natural selection will cause a population to quickly assimilate rare new beneficial mutations (occurring to the right of the distribution on the z axis), while the more frequent deleterious mutations will be weeded out slowly. Curvature should therefore become increasingly negative as you move to the right of the peak, implying that $\frac{\partial^3 u}{\partial z^3}(\hat{z}, y) \le 0$ (confirmed numerically and by the fact that $\hat{z} > \bar{z} = 0$; see below). We know that $D_x > 0$ and again biological considerations insure that $d_x > 0$ ($f_D m_D > f_B m_B$). If y is large enough, therefore, c will be negative.
Finally, we derive some mathematical support for our claim that $\frac{\partial^3 u}{\partial z^3}(\hat{z}, y) \le 0$. The mode of the distribution in z occurs at
$$\hat{z} = -\frac{e^{y} M u(\hat{z}, y)}{u(\hat{z}, y)} = -e^{y}\left[D_x\, u(\hat{z}, y)^{-1} \frac{\partial^2 u}{\partial z^2}(\hat{z}, y) + D_y\, u(\hat{z}, y)^{-1} \frac{\partial^2 u}{\partial y^2}(\hat{z}, y) + D_y + d_y\right] = -e^{y} A .$$
From the facts that $D_x > 0$, $D_y > 0$, $u(\hat{z}, y) > 0$, $\frac{\partial^2 u}{\partial z^2}(\hat{z}, y) < 0$, and $\frac{\partial^2 u}{\partial y^2}(\hat{z}, y) \le 0$, we conclude that A is negative provided that $D_y + d_y < 0$, which roughly translates to the condition that $f_M m_M > f_A m_A$ (mutators must be more common than antimutators), a condition that biological considerations essentially guarantee. It is evident, therefore, that $\hat{z}$ will be positive. Furthermore, from the inequalities listed above, we gather that $A \le D_y + d_y < 0$, from which we derive a relation for the mode of u: $\hat{z} \ge -e^{y}(D_y + d_y)$, which translates to the approximate expression, $\hat{z} \gtrsim e^{y} f_M m_M$. Pearson's skewness coefficient in z is defined as
$$\gamma = \frac{3(\bar{z} - \hat{z})}{\sigma_z} \lesssim -\frac{3 e^{y} f_M m_M}{\sigma_z} < 0,$$
i.e., skewness is guaranteed to be negative. To validate our inferences drawn from (1.8), we must make an assumption that u is a non-pathological unimodal function (its higher derivatives are diminishing); under this reasonable assumption (employed commonly, for example, by economists), our finding that skewness must be negative implies $\frac{\partial^3 u}{\partial z^3}(\hat{z}, y) \le 0$.
Analysis 2: Static Limiting Solution of Model 2.

We now seek a static limiting distribution of the form $\tilde{u}(x, y)$ such that $u(x, y, t) \xrightarrow{t \to \infty} \tilde{u}(x, y)$, derived by setting $\partial u / \partial t = 0$. This gives the equilibrium equation $\tilde{u} + e^{y} M \tilde{u} = 0$. Letting $\tilde{u}(x, y) = g(x)h(y)$, and employing the shorthand notation, $g' = g'(x)$ and $h' = h'(y)$, the full equation is written,
$$0 = gh + e^{y}\left(D_x g'' h + d_x g' h + D_y g h'' + (2D_y + d_y)\, g h' + (D_y + d_y)\, g h\right)$$
Rearranging, we have:
$$D_x \frac{g''}{g} + d_x \frac{g'}{g} = -e^{-y} - D_y \frac{h''}{h} - (2D_y + d_y)\frac{h'}{h} - (D_y + d_y) = -\lambda$$
This rearrangement demonstrates that the equation is separable ($\lambda$ is the separation constant). We begin by solving the equation in y:
$$h(y) = e^{-By}\left[C_1 J_{A}\!\left(2\sqrt{e^{-y}/D_y}\right)\Gamma(1 + A) + C_2 J_{-A}\!\left(2\sqrt{e^{-y}/D_y}\right)\Gamma(1 - A)\right]$$
where J denotes Bessel function of the first kind, $\Gamma$ denotes the gamma function, $A = D_y^{-1}\sqrt{(2D_y + d_y)^2 - 4D_y(D_y + d_y - \lambda)}$, $B = (2D_y + d_y)/2D_y$, and the C's are constants of integration. The Bessel function oscillates at low values of y, and we therefore discard the notion that mutation rate evolves to very low values. Hence, we are interested in what happens to probabilities as mutation rate increases. If probabilities decrease above some value of y, then mutation rate is contained, i.e., it evolves to some stable, intermediate value. If probabilities increase monotonically as y increases, then mutation increases without bound. To determine what happens to probabilities as y increases, we employ the asymptotic behavior of the Bessel function: $\lim_{z \to 0} J_A(z) = (z/2)^{A} / \Gamma(1 + A)$. As y increases (i.e., as $e^{-y}$ decreases), the asymptotic equation reduces to:
$$h(y) \approx e^{-By}\left[\tilde{C}_1 e^{-yA/2} + \tilde{C}_2 e^{yA/2}\right]$$
where $\tilde{C}_1 = C_1 D_y^{-A/2}$ and $\tilde{C}_2 = C_2 D_y^{A/2}$. This function is monotonically increasing in y if either $-B - A/2 > 0$ or $-B + A/2 > 0$, which translates to $\pm\sqrt{(2D_y + d_y)^2 - 4D_y(D_y + d_y - \lambda)} > 2D_y + d_y$ (either one or the other, plus or minus, must hold). This condition reduces to the following pair of conditions: Either (1) $2D_y + d_y < 0$ (independent of $\lambda$), or (2) $2D_y + d_y > 0$ and $\lambda > D_y + d_y$. Given the infinite domain of the governing equation and the rather general conditions placed on $\tilde{u}$, we were unable to put restrictions on $\lambda$. For this reason, the constraints we derive from this analysis are limited to the following statement: A sufficient (but not necessary) condition for the unbounded increase in mutation rate (due to the mutation-rate ratchet) is that
$2D_y + d_y < 0$, which translates to
$$\frac{f_M}{f_A} > \frac{m_A + m_A^2 + \sigma_A^2}{m_M - m_M^2 - \sigma_M^2}.$$
If $m_M = m_A = m$, if $m^2 = \sigma^2$ (as would be the case if mutational effects were exponentially distributed), and if $0 < m \ll 1$, then this condition reduces to $f_M / f_A \gtrsim 1 + 4m$. Roughly speaking, as long as mutators are more common than anti-mutators, we can be assured that natural selection will drive unbounded increase in mutation rate, and hence extinction. Some of our numerical solutions of the pde, and even some of our simulations (finite populations), have shown that a population can be driven extinct even when $f_A > f_M$. These observations remind us that the condition stated above is only sufficient and not necessary. From a biological standpoint, however, the sufficient condition derived above is indeed sufficient: Mutator mutations are loss-of-function mutations and are therefore expected to be much more common than anti-mutator mutations in any conceivable biological system.
Next, we solve the equation in x:
$$g(x) = C_1 e^{A_1 x} + C_2 e^{A_2 x}, \qquad \text{where } A_{1,2} = \frac{-d_x \pm \sqrt{d_x^2 - 4D_x\lambda}}{2D_x}.$$
Here, biological considerations insure that $d_x = f_D m_D - f_B m_B > 0$. And $D_x = f_D m_D^2 + f_B m_B^2 > 0$. If only we could be certain that the separation constant was positive ($\lambda > 0$), then this solution would guarantee that fitness ultimately decreases without bound (on a log scale). Unfortunately, as stated above, we were unable to place restrictions on $\lambda$, and hence a unique behavior for this solution could not be derived. We therefore sought a solution to the equation in x by a different route. Our alternative strategy for solving the equation in x was to first integrate the equilibrium equation over y as follows:
$$0 = g\int h + D_x g'' \int e^{y} h + d_x g' \int e^{y} h + D_y g \int e^{y} h'' + (2D_y + d_y)\, g \int e^{y} h' + (D_y + d_y)\, g \int e^{y} h .$$
Again, we employ the shorthand notation, $\int h = \int_{-\infty}^{\infty} h(y)\,dy = 1$ and $\int e^{y} h = \int_{-\infty}^{\infty} e^{y} h(y)\,dy = k_y$, and integration by parts gives us the relation $\int e^{y} h = -\int e^{y} h' = \int e^{y} h'' = k_y$. From this relation, the integrated equation reduces to:
$$0 = g + k_y\left(D_x g'' + d_x g' + D_y g - (2D_y + d_y)\,g + (D_y + d_y)\,g\right)$$
whose solution is:
$$g(x) = C_1 e^{A_1 x} + C_2 e^{A_2 x}, \qquad \text{where } A_{1,2} = \frac{-d_x k_y \pm \sqrt{d_x^2 k_y^2 - 4 k_y D_x}}{2 k_y D_x}.$$
We are guaranteed that $k_y = \int_{-\infty}^{\infty} e^{y} h(y)\,dy \ge 0$ simply by the fact that $e^{y}$ can never be negative, and a unique behavior quite clearly derives from this alternative solution: As long as $d_x = f_D m_D - f_B m_B > 0$, log fitness ultimately converges to negative infinity. Roughly speaking, as long as deleterious mutations are more common than beneficial mutations, fitness eventually plunges to zero.
Combining and interpreting the results of Analyses 1 and 2.
Analysis 2 concludes that any biologically realistic population should ultimately evolve to a state of high mutation rate and low fitness. It does not say anything about the dynamics that give rise to this final state. In the mathematician's abstract world of infinite domains, this final state is expressed as $\bar{x} \xrightarrow{t \to \infty} -\infty$ and $\bar{y} \xrightarrow{t \to \infty} +\infty$: the mean log fitness converges to negative infinity (fitness converges to zero), and the mean log mutation rate converges to positive infinity (mutation rate increases without bound). The same is true for the mode of the distribution: $\hat{x} \xrightarrow{t \to \infty} -\infty$ and $\hat{y} \xrightarrow{t \to \infty} +\infty$ (because the marginals for x and y are monotonic). If we take this result about the mode and apply it to Analysis 1, a picture of the dynamics that lead to the final state begins to emerge. If we insert $\hat{y} \xrightarrow{t \to \infty} +\infty$ into equation (1.8), we obtain $c \to -\infty$: the solitary wave ultimately travels infinitely fast on the infinite domain in the direction of decreasing log fitness. Translating this picture of the dynamics back into biology and arithmetic scales, it still says nothing about what happens initially, but we now know that population fitness eventually plummets to zero pretty quickly.
Analyses 1 and 2 both arrive at the same two conditions that guarantee extinction by natural selection in an effectively infinite population of asexual organisms: (1) Mutators should be produced at a higher rate than anti-mutators ($f_M \gtrsim f_A$); this condition is sufficient but not necessary in both analyses. (2) Deleterious mutations should be produced at a higher rate than beneficial mutations ($f_D \gtrsim f_B$); this condition is sufficient but not necessary in Analysis 1 and sufficient and necessary in Analysis 2. Both of these conditions are certainly met by any biological system. To state that extinction occurs whenever both of these conditions are met is thus biologically equivalent to stating that extinction is inevitable.
Integral forms.
To gain insight into the processes underlying observed fitness and mutation rate dynamics, we obtained two integral forms of Model 1. These expressions, among other things, allow comparison with Fisher's fundamental theorem of natural selection (7) and the more general Price equation (8, 9). We impose the boundary conditions,
$$u(-\infty, y) = u(\infty, y) = \frac{\partial u}{\partial x}(-\infty, y) = \frac{\partial u}{\partial x}(\infty, y) = u(x, -\infty) = u(x, \infty) = \frac{\partial u}{\partial y}(x, -\infty) = \frac{\partial u}{\partial y}(x, \infty) = 0$$
and we define mean log fitness, $\bar{x} = \iint_{x,y \in \mathbb{R}} x\,u(x, y, t)\,dx\,dy$, and mean mutation rate, $\bar{\mu} = \iint_{x,y \in \mathbb{R}} e^{y} u(x, y, t)\,dx\,dy$. Then, multiplying Model 1 by x and integrating gives an expression for dynamics of mean log fitness:
$$\frac{\partial \bar{x}}{\partial t} = \sigma_x^2 - d_x \bar{\mu}. \qquad (1.9)$$
(See Mathematical Details for derivation of these integral expressions.) Multiplying Model 1 by $e^{y}$ and integrating gives an expression for dynamics of mean mutation rate:
$$\frac{\partial \bar{\mu}}{\partial t} = \mathrm{cov}(\mu, x) + (D_y - d_y)\,\overline{\mu^2}. \qquad (1.10)$$
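The derivation itself is deferred to Mathematical Details, which is not reproduced in this section. As a hedged sketch only, the mean-fitness expression can be recovered from Model 1 under the second-order expansion of M, writing $\mu = e^{y}$ and $v = e^{y}u$, and assuming (slightly beyond the stated boundary conditions) that $v$ and $\partial v/\partial y$ also vanish as $y \to \pm\infty$:

```latex
\begin{aligned}
\frac{\partial \bar{x}}{\partial t}
  &= \iint x\,\frac{\partial u}{\partial t}\,dx\,dy
   = \iint x\,(x-\bar{x})\,u\,dx\,dy + \iint x\,e^{y}Mu\,dx\,dy \\
  &= \sigma_x^2
   + \iint x\,e^{y}\!\left(D_x\,\frac{\partial^2 u}{\partial x^2}
       + d_x\,\frac{\partial u}{\partial x}\right)dx\,dy
   + \iint x\left(D_y\,\frac{\partial^2 v}{\partial y^2}
       + d_y\,\frac{\partial v}{\partial y}\right)dx\,dy \\
  &= \sigma_x^2 + 0 - d_x \iint e^{y}u\,dx\,dy + 0
   \;=\; \sigma_x^2 - d_x\,\bar{\mu}.
\end{aligned}
```

Here $\int x\,\partial_x^2 u\,dx = 0$ and $\int x\,\partial_x u\,dx = -\int u\,dx$ follow from integration by parts, and the $y$-terms are total derivatives in $y$ that integrate to zero. The mutation-rate expression follows the same pattern after multiplying by $e^{y}$, where two integrations by parts give $\int e^{y}\,\partial_y^2 v\,dy = \int e^{2y}u\,dy$ and $\int e^{y}\,\partial_y v\,dy = -\int e^{2y}u\,dy$.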
For $f_D m_D \gg f_B m_B$, we have $d_x = f_D m_D - f_B m_B \approx f_D m_D$. If mutator and antimutator mutations have small means and variances, and if $f_M m_M \gg f_A m_A$, then $D_y - d_y = \tfrac{1}{2} f_A (m_A^2 + \sigma_A^2) + \tfrac{1}{2} f_M (m_M^2 + \sigma_M^2) + f_M m_M - f_A m_A \approx f_M m_M$. Under these reasonable conditions, our equations are reduced to:
$$\frac{\partial \bar{x}}{\partial t} = \sigma_x^2 - f_D m_D\, \bar{\mu} \qquad (1.11)$$
and
$$\frac{\partial \bar{\mu}}{\partial t} = \mathrm{cov}(\mu, x) + f_M m_M\, \overline{\mu^2} \qquad (1.12)$$
We note that the effects of mutator and antimutator mutations may not have small variance as assumed above. Knockouts in proofreading or repair genes, for example, are fairly common among bacteria; these are single mutations that can elevate mutation rate up to three orders of magnitude. A class of large-effect mutations like these can dramatically increase variance in the effects of mutator and antimutator mutations. But equation (1.10) shows that any increase in this variance will only decrease the mutation rate at which the population shift is triggered (because $D_y - d_y = \tfrac{1}{2} f_A (m_A^2 + \sigma_A^2) + \tfrac{1}{2} f_M (m_M^2 + \sigma_M^2) + f_M m_M - f_A m_A$ will increase with increasing variance). If the resulting variance is fairly large, our conservative approximation $D_y - d_y \approx f_M m_M$ becomes invalid.
Time until extinction. Numerical solutions of the pde models and individual-based simulations both show that the time-averaged covariance between mutation rate and fitness, in the time preceding the fitness decline, is positive but can be quite small. As a first approximation
to the time until extinction, we suppose that the evolutionary dynamics are closely approximated by assuming the covariance to be some small, constant value during the period of time preceding the fitness decline. We denote this small constant value by $h = \mathrm{cov}(\mu, x)$. Furthermore, we have determined from simulations and numerical solutions that $\overline{\mu^2} = \sigma_\mu^2 + \bar{\mu}^2 \approx \bar{\mu}^2$, i.e., the variance in mutation rate is negligible. From these observations and assumptions, the equation for mutation rate dynamics becomes
$$\frac{\partial \bar{\mu}}{\partial t} = h + f_M m_M\, \bar{\mu}^2 . \qquad (1.13)$$
The solution to this equation is
$$\bar{\mu}(t) = \sqrt{\frac{h}{f_M m_M}}\; \tan\!\left(t\sqrt{f_M m_M h} + \tan^{-1}\!\left(\bar{\mu}_0 \sqrt{\frac{f_M m_M}{h}}\right)\right), \qquad (1.14)$$
which has a singularity (mean mutation rate skyrockets quite suddenly) at time
$$t_E = \frac{1}{\sqrt{f_M m_M h}}\left[\frac{\pi}{2} - \tan^{-1}\!\left(\bar{\mu}_0 \sqrt{\frac{f_M m_M}{h}}\right)\right]. \qquad (1.15)$$
As the small constant covariance, $h = \mathrm{cov}(\mu, x)$, tends toward zero, this time tends toward $(\bar{\mu}_0 f_M m_M)^{-1}$, giving the conservative upper bound,
$$t_E < (\bar{\mu}_0 f_M m_M)^{-1}. \qquad (1.16)$$
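The closed-form results (1.14) to (1.16) can be evaluated directly. The sketch below is illustrative only: the function names are ours, and the numerical values of the initial mean mutation rate and of the covariance h are arbitrary assumptions chosen to show the behavior, not values reported in this study.

```python
import math


def mean_mutation_rate(t, mu0, h, f_m, m_m):
    """Solution (1.14) of d(mu)/dt = h + f_m*m_m*mu**2 with mu(0) = mu0."""
    a = f_m * m_m
    phase = math.atan(mu0 * math.sqrt(a / h))
    return math.sqrt(h / a) * math.tan(t * math.sqrt(a * h) + phase)


def extinction_time(mu0, h, f_m, m_m):
    """Time (1.15) at which the tangent in (1.14) diverges."""
    a = f_m * m_m
    phase = math.atan(mu0 * math.sqrt(a / h))
    return (math.pi / 2 - phase) / math.sqrt(a * h)


def extinction_time_bound(mu0, f_m, m_m):
    """Upper bound (1.16), the limit of (1.15) as h tends to zero."""
    return 1.0 / (mu0 * f_m * m_m)


# Illustrative (assumed) inputs: mu0 = 1 and h = 1e-8 are hypothetical.
MU0, H, F_M, M_M = 1.0, 1e-8, 1e-4, 0.03
t_e = extinction_time(MU0, H, F_M, M_M)
t_bound = extinction_time_bound(MU0, F_M, M_M)
print(f"t_E = {t_e:.0f}, bound = {t_bound:.0f}")
```

For any positive h the singularity time stays below the bound, and it approaches the bound as h shrinks, which is why (1.16) is described as conservative.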
Numerical solutions.
To obtain numerical solution, we employed a difference equation formulation of the pde models. In this approach, the size of an individual's mutational neighborhood could be varied. Numerical solution is obtained from the difference equations,
$$u_{t+1}(i, j) = \frac{w_i}{\bar{w}_t}\, u_t(i, j) + M_{i,j}(u_t) \qquad \text{(Model 1)}$$
and
$$u_{t+1}(i, j) = \frac{w_i}{\bar{w}_t}\left(u_t(i, j) + M_{i,j}(u_t)\right) \qquad \text{(Model 2)}.$$
The solution lives on an $n_w \times n_y$ grid of fitness and mutation-rate values. $w_i$ is the fitness of an individual at position $(i, j)$ on the grid, $y_j$ is the log of its mutation rate, and $u_t(i, j)$ is the discrete probability density of individuals with fitness $w_i$ and log mutation rate $y_j$ in generation t. $\bar{w}_t = \sum_{i=1}^{n_w} \sum_{j=1}^{n_y} w_i\, u_t(i, j)\, dw\, dy$, where $dw = w_i - w_{i-1}$ and $dy = y_j - y_{j-1}$.
The operator $M_{i,j}$ models mutation within a square mutational neighborhood whose size is defined by a parameter m. The term $w_i / \bar{w}_t$ models selection.
The mutational operator $M_{i,j}$ is defined as:
$$M_{i,j}(u_t) = \sum_{k=i-m}^{i+m}\; \sum_{l=j-m}^{j+m} u_t(k, l)\, [\mu_D(l)]^{(k-i)^+}\, [\mu_M(l)]^{(j-l)^+}\, [\mu_B(l)]^{(i-k)^+}\, [\mu_A(l)]^{(l-j)^+} - \left(1 + \Phi(j)\right) u_t(i, j)$$
where
$$\Phi(j) = \sum_{k,l \in (0,m)} [\mu_D(j) + \mu_B(j)]^{k}\, [\mu_M(j) + \mu_A(j)]^{l} - 1,$$
and $\mu_B$, $\mu_D$, $\mu_M$, $\mu_A$ denote beneficial, deleterious, mutator, and anti-mutator mutation rates, respectively, and are functions of their j-coordinate (specifying mutation rate) on the fitness mutation rate grid, e.g., $\mu_D(j) = e^{y_j}$. The superscript + indicates that the argument is set to zero if it is negative, i.e., $\max(\cdot, 0)$. For example, if m = 1, then the mutation operator on the grid at point (i, j) can be visualized as follows:
[Figure 7: 3 × 3 grid of classes (i-1, j-1) through (i+1, j+1), with arrows into class (i, j) labeled by the mutation types D, B, M, and A that produce each flux.]
Figure 7. Schematic of how mutation is modeled for a mutational neighborhood of size one. All mutation rates are functions of the j-coordinate of the class from which the mutational flux (i.e., the arrow) originates.
yielding,
$$
\begin{aligned}
M_{i,j}(u_t) ={}& u_t(i+1, j-1)\,\mu_D(j-1)\,\mu_M(j-1) + u_t(i+1, j)\,\mu_D(j) + u_t(i+1, j+1)\,\mu_D(j+1)\,\mu_A(j+1) \\
&+ u_t(i, j-1)\,\mu_M(j-1) + u_t(i, j)(-\Phi) + u_t(i, j+1)\,\mu_A(j+1) \\
&+ u_t(i-1, j-1)\,\mu_B(j-1)\,\mu_M(j-1) + u_t(i-1, j)\,\mu_B(j) + u_t(i-1, j+1)\,\mu_B(j+1)\,\mu_A(j+1)
\end{aligned}
$$
where,
$$\Phi = (\mu_B(j) + \mu_D(j))(\mu_M(j) + \mu_A(j)) + (\mu_B(j) + \mu_D(j)) + (\mu_M(j) + \mu_A(j))$$
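The nine-term expression for a neighborhood of size one can be sketched in a few lines. This is not the authors' implementation (which is available from them on request); it is a minimal pure-Python illustration in which the function names, the zero-flux treatment of classes outside the grid, the choice to store probability mass per grid cell, and the example rate values are all our own assumptions.

```python
import math


def phi(b, d, mu, a):
    """Total outflux coefficient for one class when m = 1."""
    return (b + d) * (mu + a) + (b + d) + (mu + a)


def mutation_operator_m1(u, rate_b, rate_d, rate_m, rate_a):
    """Net mutational change M[i][j] for neighborhood size m = 1.

    u[i][j] is probability mass in fitness class i and mutation-rate class j;
    each rate list is indexed by the j-coordinate of the source class.
    Classes outside the grid are treated as empty (an assumption).
    """
    n_w, n_y = len(u), len(u[0])

    def src(i, j):
        return u[i][j] if 0 <= i < n_w and 0 <= j < n_y else 0.0

    def rate(r, j):
        return r[j] if 0 <= j < n_y else 0.0

    out = [[0.0] * n_y for _ in range(n_w)]
    for i in range(n_w):
        for j in range(n_y):
            inflow = (
                src(i + 1, j - 1) * rate(rate_d, j - 1) * rate(rate_m, j - 1)
                + src(i + 1, j) * rate_d[j]
                + src(i + 1, j + 1) * rate(rate_d, j + 1) * rate(rate_a, j + 1)
                + src(i, j - 1) * rate(rate_m, j - 1)
                + src(i, j + 1) * rate(rate_a, j + 1)
                + src(i - 1, j - 1) * rate(rate_b, j - 1) * rate(rate_m, j - 1)
                + src(i - 1, j) * rate_b[j]
                + src(i - 1, j + 1) * rate(rate_b, j + 1) * rate(rate_a, j + 1)
            )
            out[i][j] = inflow - phi(rate_b[j], rate_d[j], rate_m[j], rate_a[j]) * u[i][j]
    return out


def step_model1(u, w, rate_b, rate_d, rate_m, rate_a):
    """One generation of the Model 1 difference equation."""
    w_bar = sum(w[i] * u[i][j] for i in range(len(u)) for j in range(len(u[0])))
    m_u = mutation_operator_m1(u, rate_b, rate_d, rate_m, rate_a)
    return [[w[i] / w_bar * u[i][j] + m_u[i][j] for j in range(len(u[0]))]
            for i in range(len(u))]


# Hypothetical example: 5 x 5 grid, rates proportional to exp(y_j).
Y = [-2.0, -1.0, 0.0, 1.0, 2.0]
BASE = 1e-3  # assumed scale factor, not from the text
RB = [BASE * 1e-10 * math.exp(y) for y in Y]
RD = [BASE * 1e-1 * math.exp(y) for y in Y]
RM = [BASE * 1e-4 * math.exp(y) for y in Y]
RA = [BASE * 1e-5 * math.exp(y) for y in Y]
```

Because every unit of flux leaving a class arrives in exactly one neighbor, the operator conserves total probability mass as long as no mass sits on the edge of the grid; checking this is a convenient sanity test for any implementation.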
An example solution.
The computer code and/or compiled executable that implements the difference equation derived above may be obtained by writing to PG. An example solution is given here in three formats: a simple plot of average fitness and average mutation rate, an animated contour plot on a log scale, and an animated 3D plot on an arithmetic scale. The following solution lives on a grid where fitness ranges from 0 to 80 and log relative mutation rate ranges from -2 to 8. Beneficial mutation fraction is $10^{-10}$, deleterious mutation fraction is $10^{-2}$, mutator fraction is $10^{-4}$, and anti-mutator fraction is $10^{-5}$. Mutational effects are all exponential with mean 0.03. Mutational neighborhood is of size m = 1. The output of this numerical solution is plotted in Fig. 1 of the main text and is repeated here as Fig. 8. Movie S1 (viewed by opening the file S1.avi) shows the animated log-scale contour plot. In this animated plot, the center of probability mass is red and colors generally get cooler with decreasing probability. The lowest non-zero probabilities are colored yellow. The probability arm created by the upward extension of the distribution over time is seen to lean slightly to the right, reflecting the positive association between fitness and mutation rate caused by hitchhiking. Movie S2 (viewed by opening the file S2.avi) shows the corresponding animated arithmetic scale 3D plot. In this plot, the remarkable speed of the transition is apparent.
Figure 8: Numerical solution of difference equation.
Simulations.
Simulations kept track of every individual and every replication event in the population. Offspring could differ from parents in fitness or mutation rate or both. At the outset, every individual in the population had a fitness of one and the wildtype genomic mutation rate set by the user. At each time step, every individual produced a number of offspring, X, drawn at random from a Poisson distribution whose mean was $w_i / \bar{w}$, where $w_i$ is the fitness of the ith individual in the population and $\bar{w} = \frac{1}{N}\sum_{j=1}^{N} w_j$ is the mean fitness of the population. Each offspring acquired:
(1) a number $X_D$ of new deleterious mutations that each decreased fitness by a factor $(1 - M_D)$,
(2) a number $X_B$ of new beneficial mutations that each increased fitness by a factor $(1 + M_B)$,
(3) a number $X_M$ of new mutator mutations that each increased log mutation rate by an amount $M_M$,
(4) a number $X_A$ of new anti-mutator mutations that each decreased log mutation rate by an amount $M_A$.
$X_D$, $X_B$, $X_M$ and $X_A$ are all Poisson random variables with means $U\mu_i f_D$, $U\mu_i f_B$, $U\mu_i f_M$ and $U\mu_i f_A$, respectively, where U is the base genomic mutation rate, $\mu_i$ is the relative mutation rate of individual i (this is the rate that mutates and evolves), and the f's are the fractions of deleterious, beneficial, mutator, and anti-mutator mutations, parameters set by the user. To simulate mutational effects, $M_D \ge 0$, $M_B \ge 0$, $M_M \ge 0$, and $M_A \ge 0$ were continuous random variables with means $m_D$, $m_B$, $m_M$ and $m_A$, respectively. A variety of different governing distributions for these mutational effects was implemented and ranged from distributions having exponential, or normal, tail probabilities to those having power-law, or very heavy, tail probabilities. The mutation rate ratchet operated, and extinction occurred, under all distributions tested.
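The procedure described above can be sketched compactly. The following is an illustrative Python outline, not the program used for the reported results: it assumes exponentially distributed effects (one of the distributions mentioned), omits the lethal-mutation class and the population-size correction that appear in the Visual Basic listing, and uses a simple Poisson sampler suited only to small means. The parameter dictionary and its key names are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Poisson sample via Knuth's product-of-uniforms method (small means)."""
    if lam <= 0:
        return 0
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def replicate(w, mu, p, rng):
    """Return (fitness, relative mutation rate) of one offspring."""
    rate = p["U"] * mu  # genomic mutation rate of the parent
    for _ in range(poisson(rate * p["fD"], rng)):
        w *= max(0.0, 1.0 - rng.expovariate(1.0 / p["mD"]))
    for _ in range(poisson(rate * p["fB"], rng)):
        w *= 1.0 + rng.expovariate(1.0 / p["mB"])
    for _ in range(poisson(rate * p["fM"], rng)):
        mu *= math.exp(rng.expovariate(1.0 / p["mM"]))   # log rate increases
    for _ in range(poisson(rate * p["fA"], rng)):
        mu *= math.exp(-rng.expovariate(1.0 / p["mA"]))  # log rate decreases
    return w, mu


def next_generation(pop, p, rng):
    """Each individual leaves Poisson(w_i / mean fitness) offspring."""
    if not pop:
        return []
    w_bar = sum(w for w, _ in pop) / len(pop)
    if w_bar <= 0:
        return []
    offspring = []
    for w, mu in pop:
        for _ in range(poisson(w / w_bar, rng)):
            offspring.append(replicate(w, mu, p, rng))
    return offspring


# Hypothetical parameter values for illustration only.
PARAMS = {"U": 0.003, "fD": 1e-1, "fB": 1e-10, "fM": 1e-4, "fA": 1e-5,
          "mD": 0.03, "mB": 0.03, "mM": 0.03, "mA": 0.03}

if __name__ == "__main__":
    rng = random.Random(1)
    population = [(1.0, 1.0)] * 200
    for _ in range(10):
        population = next_generation(population, PARAMS, rng)
    print(len(population))
```

Without a population-size correction, the number of individuals drifts from generation to generation, which is the reason the original program includes a "nudging" term.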
Program code for simulations. What follows is the program code, in MS Visual Basic, for the simplest simulation program. The compiled, executable file is available upon request. This program was written by PG. Most of our reported results were obtained from variants of this code. A different simulation program was written, completely independently, by AC. This alternative program code is in C++ and is also available upon request.
Public gmr!, mb!, md!, mm!, ma!, fb!, fd!, lf!, fm!, fa!, IDUM&, af!, amr! Public Sub Main() 'gmr! = base genomic mutation rate 'mb! = mean effect of beneficial mutations 'md! = mean effect of deleterious mutations 'mm! = mean effect of mutator mutations 'ma! = mean effect of anti-mutator mutations 'fb! = fraction of all mutations that are beneficial 'fd! = fraction of all mutations that are deleterious 'lf! = fraction of all deleterious mutations that are lethal 'fm! = fraction of all mutations that are mutator mutations 'fa! = fraction of all mutations that are anti-mutator mutations 'cc& = starting population size (carrying capacity is twice this value) 'maxgens& = maximum number of generations to run the simulation 'sp1% = sampling period (in generations) for point data 'a! = "nudging" parameter: nudges pop size back toward the starting size; 'recommended: a = 0.3 Open "ratchet.in" For Input As #1 Input Input Input Input Input Input Input #1, #1, #1, #1, #1, #1, #1, gmr! mb!, md!, mm!, ma! fb!, fd!, fm!, fa! cc& maxgens& sp1% a!
Close (1) 'seed the random number generator with the clock: IDUM& = -Timer n& = cc& 'initial population size 'fitness and relative mutation-rate arrays 'new arrays for temporary storage
ReDim w!(2 * cc&), mr!(2 * cc&) ReDim nw!(2 * cc&), nmr!(2 * cc&) tt& = 0 af! = 1
'initialize time iterator 'population average fitness
For ii& = 1 To n& w!(ii&) = 1 mr!(ii&) = 1 Next ii&
'start with homogeneous population: fitness = 1 'start with homogeneous population: relative mutation rate = 1
Open "ratchet.out" For Output As #1 frmSimple.Show 'show output window
- S21 -
    Do
        '------
        jj& = 0
        For ii& = 1 To n&
            c! = (w!(ii&) / af!) * (1 + a! * (cc& - n&) / cc&)
            'c = mean number of progeny is fitness divided by population fitness
            'times a correction to keep population size relatively constant
            nr& = rndpssn&(c!)  'number of progeny is Poisson distributed
            For i& = 1 To nr&
                If jj& < 2 * cc& Then
                    jj& = jj& + 1
                    'individual ii& replicates and produces offspring jj& in new array:
                    Call replicate(w!(), mr!(), nw!(), nmr!(), ii&, jj&)
                End If
            Next i&
        Next ii&
        n& = jj&
        If n& = 0 Then Exit Do
        '------
        'pass "new values" from "new" arrays
        '(denoted by adding an 'n' to the front of the variable)
        'to the "regular" arrays
        sum1! = 0: sum2! = 0
        For ii& = 1 To n&
            w!(ii&) = nw!(ii&)
            mr!(ii&) = nmr!(ii&)
            sum1! = sum1! + w!(ii&)
            sum2! = sum2! + mr!(ii&)
        Next ii&
        af! = sum1! / n&    'average fitness
        amr! = sum2! / n&   'average mutation rate
        tt& = tt& + 1       'increment time
        'display current values to an output window called frmSimple:
        frmSimple.lblGeneration = Str$(tt&)
        frmSimple.lblPopSize = Str$(n&)
        frmSimple.lblFitness = Str$(af!)
        frmSimple.lblMutationRate = Str$(amr!)
        frmSimple.Refresh
        If tt& / 100 = Int(tt& / 100) Then
            frmSimple.Hide  'redisplays output window every 100 generations
            frmSimple.Show  'to keep it from freezing up
        End If
        If tt& / sp1% = Int(tt& / sp1%) Then Write #1, tt&, n&, af!, amr!   'write to file
    Loop While tt& < maxgens& And af! > 0.001
    Close (1)
End Sub
Sub replicate(w!(), mr!(), nw!(), nmr!(), ii&, jj&)
    ndm& = rndpssn&(gmr! * fd! * mr!(ii&))  'number of deleterious mutations
    prod! = 1
    For i& = 1 To ndm&
        If RAN2(IDUM&) < lf! Then
            'if it's a lethal mutation then fitness = 0
            prod! = 0
        Else
            'the effect of each new deleterious mutation is drawn
            'at random from an exponential distribution with mean md!:
            Do: R! = RAN2(IDUM&): Loop Until R! > 0
            prod! = prod! * (1 + Log(R!) * md!)
            If prod! < 0 Then prod! = 0
        End If
    Next i&
    dwd! = prod!

    nbm& = rndpssn&(gmr! * fb! * mr!(ii&))  'number of beneficial mutations
    prod! = 1
    For i& = 1 To nbm&
        'the effect of each new beneficial mutation is drawn
        'at random from an exponential distribution with mean mb!:
        Do: R! = RAN2(IDUM&): Loop Until R! > 0
        prod! = prod! * (1 - Log(R!) * mb!)
    Next i&
    dwb! = prod!

    'new fitness equals old fitness times mutational effects:
    nw!(jj&) = w!(ii&) * dwd! * dwb!

    nmm& = rndpssn&(gmr! * fm! * mr!(ii&))  'number of mutator mutations
    prod! = 1
    For i& = 1 To nmm&
        'the effect of each new mutator mutation is drawn
        'at random from an exponential distribution, on a log scale,
        'with mean mm!:
        Do: R! = RAN2(IDUM&): Loop Until R! > 0
        prod! = prod! * R! ^ (-mm!)
    Next i&
    dmmr! = prod!

    namm& = rndpssn&(gmr! * fa! * mr!(ii&))
    prod! = 1
    For i& = 1 To namm&
        'the effect of each new anti-mutator mutation is drawn
        'at random from an exponential distribution, on a log scale,
        'with mean ma!:
        Do: R! = RAN2(IDUM&): Loop Until R! > 0
        prod! = prod! * R! ^ ma!
    Next i&
    dammr! = prod!

    'new mutation rate equals old mutation rate times mutational effects:
    nmr!(jj&) = mr!(ii&) * dmmr! * dammr!
End Sub
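Notes: rndpssn&(m!) returns an integer drawn at random from a Poisson distribution with mean m!; the mean is real-valued, as in the calls above, which pass quantities such as c! and gmr! * fd! * mr!(ii&). RAN2(IDUM&) returns a random number between 0 and 1; this random number generator is a Numerical Recipes subroutine (10).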
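For readers who do not have a Visual Basic environment, the replication step can be sketched in Python. This is an illustrative port under stated assumptions, not the code used to produce the results: the names `poisson`, `exp_draw`, `replicate`, and `DEFAULTS` are ours, Python's `random` module stands in for RAN2, and Knuth's multiplication method stands in for rndpssn&. The lethal fraction `lf` is set to 0 because the listing declares lf! but never reads it from the input file.

```python
import math
import random

# Parameter values taken from the example input file; lf = 0 is an assumption.
DEFAULTS = dict(gmr=0.1, mb=0.03, md=0.03, mm=0.03, ma=0.03,
                fb=1e-5, fd=0.1, fm=1e-3, fa=1e-4, lf=0.0)

def poisson(rng, mean):
    """Poisson draw by Knuth's method (fine for small means, slow for large ones)."""
    if mean <= 0:
        return 0
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def exp_draw(rng, mean):
    """Exponential draw with the given mean; 1 - random() lies in (0, 1]."""
    return -math.log(1.0 - rng.random()) * mean

def replicate(rng, w, mr, p):
    """Return (fitness, relative mutation rate) of one offspring of a parent (w, mr)."""
    rate = p["gmr"] * mr          # genomic mutation rate of the parent
    fit = w
    for _ in range(poisson(rng, rate * p["fd"])):      # deleterious mutations
        if rng.random() < p["lf"]:
            fit = 0.0                                   # lethal
        else:
            fit *= max(0.0, 1.0 - exp_draw(rng, p["md"]))
    for _ in range(poisson(rng, rate * p["fb"])):      # beneficial mutations
        fit *= 1.0 + exp_draw(rng, p["mb"])
    new_mr = mr
    for _ in range(poisson(rng, rate * p["fm"])):      # mutator mutations
        new_mr *= math.exp(exp_draw(rng, p["mm"]))     # same as R ^ (-mm)
    for _ in range(poisson(rng, rate * p["fa"])):      # anti-mutator mutations
        new_mr *= math.exp(-exp_draw(rng, p["ma"]))    # same as R ^ ma
    return fit, new_mr
```

Calling `replicate(random.Random(1), 1.0, 1.0, DEFAULTS)` returns one offspring's fitness and relative mutation rate. As in the listing, all four mutation counts are drawn using the parent's mutation rate, and effects on the mutation rate are exponential on a log scale.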
Instructions for using the compiled simulation program.
The compiled, executable file may be obtained by writing to PG. To run the simulation program, it is advisable to create a separate folder to house the executable, input and output files. The executable should be pasted there first. Next, an input file must be created with a plain text editor such as notepad. This file must be named ratchet.in and its format is as follows.
base genomic mutation rate
mean beneficial effect, mean deleterious effect, mean mutator effect, mean anti-mutator effect
beneficial fraction, deleterious fraction, mutator fraction, anti-mutator fraction
population size
maximum number of generations to run simulations
sampling period
nudging factor to keep population size relatively constant (recommended 0.3)
(Note: If the nudging factor is too large, instability can result.) An example input file is printed out here:
.1
.03,.03,.03,.03
.00001,.1,.001,.0001
30000
100000
100
.3
Once the files ratchet.in and ratchet.exe are in the same folder, double click on ratchet.exe and an output window will pop up reporting current simulation values for generation, population size, average fitness, and average mutation rate. To stop the program, it is often necessary to press ctrl-alt-del to bring up the task manager, click on the processes tab, find and select the ratchet.exe process and then click on end task.
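The layout of ratchet.in can be checked before a run with a short script. The sketch below is a hypothetical helper in Python, not part of the distributed program; the function name `parse_ratchet_in` and the dictionary keys are our own, chosen to mirror the variable names in the listing. It assumes one record per line with comma-separated values, which is stricter than what Visual Basic's Input # statement accepts.

```python
EXAMPLE = """.1
.03,.03,.03,.03
.00001,.1,.001,.0001
30000
100000
100
.3
"""

def parse_ratchet_in(text):
    """Parse the seven-line ratchet.in layout into a dictionary of parameters."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) != 7:
        raise ValueError("expected 7 non-empty lines, found %d" % len(lines))

    def floats(line, count):
        values = [float(v) for v in line.split(",")]
        if len(values) != count:
            raise ValueError("expected %d values in %r" % (count, line))
        return values

    (gmr,) = floats(lines[0], 1)
    mb, md, mm, ma = floats(lines[1], 4)
    fb, fd, fm, fa = floats(lines[2], 4)
    params = dict(gmr=gmr, mb=mb, md=md, mm=mm, ma=ma,
                  fb=fb, fd=fd, fm=fm, fa=fa,
                  cc=int(lines[3]), maxgens=int(lines[4]),
                  sp1=int(lines[5]), a=float(lines[6]))
    # The four fractions are shares of all mutations, so they cannot exceed 1 in total.
    if fb + fd + fm + fa > 1.0:
        raise ValueError("mutation fractions sum to more than 1")
    return params
```

For example, `parse_ratchet_in(EXAMPLE)["cc"]` gives the starting population size of the example file, and a file with a missing line raises an error instead of silently shifting values into the wrong parameters.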
Wright-Fisher simulations. We designed separate simulations using a different algorithm that modeled a true Wright-Fisher population, in which population size was held exactly constant. The essential dynamics are the same, and the fitness drop is equally, if not more, precipitous. While trends seem even clearer and are a bit more consistent, these simulations were more computationally intensive and it was not possible to obtain results for very large populations.

To model a Wright-Fisher population, offspring are multinomially distributed (11): each individual $j$ in the population has $i_j$ offspring and $\sum_{j=1}^{N} i_j = N$; the probability of each such partitioning of offspring is

$$\frac{N!}{\prod_{j=1}^{N} i_j!} \prod_{j=1}^{N} \left( \frac{w_j}{N \bar{w}} \right)^{i_j}.$$

To simulate the multinomial partitioning of offspring, the interval between 0 and 1 is divided into N segments, representing the N individuals in the population; the length of each segment is proportional to an individual's fitness. A random number between 0 and 1 is generated and the individual on whose segment the random number falls is awarded an offspring for the next generation. This procedure is repeated N times to generate the N multinomial-distributed progeny for the next generation. With the population size thus fixed, it could be ruled out as a determining factor in the observed fitness drop. The Wright-Fisher algorithm was implemented by replacing the piece of code between the red lines (the dashed rules in the listing above) with:
------
For ii& = 1 To n&
    rr! = RAN2(IDUM&)
    sum1! = 0: sum2! = 0
    For kk& = 1 To n&
        sum2! = sum2! + w!(kk&) / (n& * ff!)
        If rr! >= sum1! And rr! < sum2! Then
            Call replicate(w!(), mr!(), nw!(), nmr!(), kk&, ii&)
            Exit For
        End If
        sum1! = sum1! + w!(kk&) / (n& * ff!)
    Next kk&
Next ii&
------
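The segment-based sampling described above can also be written with a cumulative-sum table and a binary search, which avoids the linear scan over the population for every offspring. The following Python sketch illustrates that variant; it is not the code used for the reported simulations, and the function name `wright_fisher_parents` is our own. It returns, for each of the N offspring slots, the index of the parent that would be passed to the replication routine.

```python
import bisect
import itertools
import random

def wright_fisher_parents(rng, fitness):
    """Draw one parent index per offspring slot, with probability proportional to fitness."""
    cum = list(itertools.accumulate(fitness))   # right edges of the segments
    total = cum[-1]
    if total <= 0:
        raise ValueError("total fitness must be positive")
    n = len(fitness)
    parents = []
    for _ in range(n):
        r = rng.random() * total
        # Individual k owns the half-open segment [cum[k-1], cum[k]);
        # zero-length segments (fitness 0) are skipped by bisect_right.
        k = bisect.bisect_right(cum, r)
        parents.append(min(k, n - 1))           # guard against rounding at the top edge
    return parents
```

Because exactly N parents are drawn, population size stays constant by construction, and the offspring counts per parent follow the multinomial distribution with probabilities proportional to fitness.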
A typical plot of fitness and mutation-rate dynamics under the Wright-Fisher model is shown in Fig. 9.
Figure 9: Fitness and mutation rate dynamics resulting from the Wright-Fisher model of evolution. We note that the large numbers for mean fitness have limited meaning as the model uses only relative fitness to compute offspring numbers. Population size is 1000 and starting beneficial mutation rate is $10^{-4}$; all other parameters are the defaults.
Comparing analytical theory with individual-based simulations. Our analytical results derived from the pde models make the implicit assumption of infinite population size. This assumption can introduce considerable error when very rare events are significant. In the process of adaptive evolution, the occurrences of beneficial mutations are rare but significant events, suggesting the possibility of a large discrepancy between the results of our analytical theory and those of individual-based simulations of finite populations. We find that theory and simulations give qualitatively similar final outcomes. In addition, from statistical properties of a population, the integral forms of Model 1 predict rates of change in mean mutation rate and mean log fitness. These predictions fit observed values quite well, supporting unexpectedly good agreement between theory and simulations.

As a benchmark comparison with established theory, we first restate results obtained in the absence of beneficial mutations. The expected dynamics were those of Muller's ratchet, quantified in Haigh (12): fitness of a finite population should decline very slowly due to the occasional loss of the currently fittest class. Indeed, our simulations behaved in exactly this way when beneficial mutations were completely absent.

Qualitatively, the discrepancy between our analytical models and individual-based simulations can be visualized by comparing panels A and B of Fig. 1 in the main text. The numerical solution of the pde model is characterized by a very sharp transition to low fitness and high mutation rate, whereas dynamics of the individual-based simulations show a slower but equally devastating transition. This single example suggests that the final outcome is unchanged by the assumption of infinite population size, and many more examples support the same conclusion (see Fig. 2 in the main text).
To make a quantitative comparison between the pde models and individual-based simulations, we employed the integral forms of Model 1, given by equations (1.11) and (1.12). From the individual-based simulations, we directly measured the covariance between mutation rate and log fitness $x$, the associated second-order moments, and the means of mutation rate and of $x$, from which we were able to compute the right-hand sides of equations (1.11) and (1.12). By computing the slopes of mean mutation rate and mean log fitness over time, we determined their time derivatives, giving us the left-hand sides of equations (1.11) and (1.12). Figs. 10 and 11 plot results from two representative simulation runs.
[Figure 10: two panels, each plotting predicted and observed values against time (generations), from 0 to 60,000. Panel A: rate of mutation-rate increase. Panel B: rate of fitness increase.]
Figure 10. Comparing theory with individual-based simulations. Data were taken from a simulation run with a population size of 10,000. In Panel A, predicted values for the rate of change of mean mutation rate were obtained by computing the right-hand side of the corresponding integral-form equation (a covariance with log fitness $x$ plus a mutator term in $f_M m_M$), and observed values were obtained by computing the slope of mean mutation rate. The parameters $f_M$ and $m_M$ were the same as in the simulations. In Panel B, predicted values for the rate of change of mean log fitness were obtained by computing the right-hand side of the corresponding integral-form equation (a second-order moment term in $x$ together with a deleterious term in $f_D m_D$), and observed values were obtained by computing the slope of mean log fitness. The parameters $f_D$ and $m_D$ were the same as in the simulations.
[Figure 11: two panels, each plotting predicted and observed values against time (generations), from 0 to 80,000. Panel A: rate of mutation-rate increase. Panel B: rate of fitness increase.]
Figure 11. Comparing theory with individual-based simulations. Same as Fig. 10, but data for these plots were taken from a different simulation run with a population size of 30,000.
Parameters. In the main text, we show that the deleterious and beneficial mutation rates employed in the simulations are supported by experimental data. Here, we discuss data bearing on the realism of the values employed for the remaining parameters in the simulations.

Mutator fraction, $f_M$. In populations of E. coli, the mutation rate from non-mutators to mutators (the mutator mutation rate) was calculated to be $5 \times 10^{-6}$ per bacterium per generation (13). This value corresponds roughly to a fraction 0.001 of the genomic mutation rate (14). Influenza A virus was shown to have considerable mutation rate heterogeneity: in one population (15) 7 out of 60 clones were shown to be mutators, and 2 of these conferred a two- to three-fold increase in mutation rate (quite significant for a virus whose base mutation rate is already very high). To convert these numbers to a rate of mutator mutation, we employed a base mutation rate estimate for Influenza A of one mutation per genome per replication (14). Because viral genomes in general carry such highly condensed genetic information, we speculate that at least one tenth of all mutations for Influenza A are deleterious. The coefficient of selection against a mutator is calculated as

$$s_M = 1 - \frac{e^{-m \mu_D}}{e^{-\mu_D}},$$

where $\mu_D$ is the deleterious mutation rate, and $m$ is the factor by which the mutator allele increases mutation rate. Influenza A has a segmented genome that undergoes reassortment, and the concept of a mutation rate equilibrium may therefore apply. The equilibrium frequency of a mutator in a population is $p_M \ge \mu_M / s_M$ (the inequality is explained in (16)), where $\mu_M$ is the rate of mutator mutation. Rearranging, we have

$$\mu_M \le p_M s_M = p_M \left( 1 - \frac{e^{-m \mu_D}}{e^{-\mu_D}} \right) = \mu_M^{*}.$$

Inserting the observed values, $p_M = 2/60$, $m = 2.5$, and $0.1 < \mu_D < 1$, we calculate that $\mu_M \le \mu_M^{*}$, where $0.0046 < \mu_M^{*} < 0.026$. This upper bound has a geometric mean of 0.011, corresponding to a fraction 0.011 of the genomic mutation rate.
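The arithmetic behind these bounds is easy to reproduce. The short Python sketch below assumes that the selection coefficient against a mutator has the form one minus the ratio of the mutator's to the wild type's mean fitness under deleterious mutation, that is, 1 - exp(-m * mu_d) / exp(-mu_d); the function and variable names are ours.

```python
import math

def selection_against_mutator(m, mu_d):
    """s_M = 1 - exp(-m * mu_d) / exp(-mu_d), for a mutator of strength m."""
    return 1.0 - math.exp(-m * mu_d) / math.exp(-mu_d)

def mutator_rate_bound(p_m, m, mu_d):
    """Bound on the mutator mutation rate: observed frequency times s_M."""
    return p_m * selection_against_mutator(m, mu_d)

p_m = 2 / 60          # two strong mutators among 60 clones
m = 2.5               # midpoint of the two- to three-fold increase
low = mutator_rate_bound(p_m, m, 0.1)    # deleterious rate at its lower limit
high = mutator_rate_bound(p_m, m, 1.0)   # deleterious rate at its upper limit
geometric_mean = math.sqrt(low * high)
```

Rounded to two significant figures, `low`, `high`, and `geometric_mean` agree with the values 0.0046, 0.026, and 0.011 quoted in the text.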
Antimutator fraction, fA. To our knowledge, there are no empirical data available bearing on the fraction of mutations that lower, as opposed to raise, the genomic mutation rate in any organism. There are theoretical reasons to suppose that such mutations will be exceedingly rare in organisms with low, wild-type genomic mutation rates (17). In cases in which a gene locus involved in repair or replication has been mutated, yielding an elevated mutation rate, basic genetic principles indicate that revertants or suppressors of that mutation are likely to arise at a much lower rate, consistent with the qualitative requirement of our analytical model.

Mean deleterious effect, mD, and mean beneficial effect, mB. Lynch et al. (18) have reviewed studies on the distribution of selection coefficients of deleterious mutations in a wide range of organisms, and conclude that there is strong evidence for the existence of a large class of mutations of slightly deleterious effect, consistent with the idea that the average selective effect of a new deleterious mutation is typically less than 5%. Imhof and Schlötterer (19) estimated the average selection coefficient of new beneficial mutations in evolution experiments with E. coli as 0.02. Using different approaches but the same organism, Gerrish and Lenski (20) estimate a value of 0.03, and Rozen et al. (21) estimate a value of 0.02. Sanjuán et al. (22) conducted a large study of the effects of single-nucleotide substitutions in vesicular stomatitis virus and estimated the mean selective effect of beneficial mutations as approximately 0.04.

Mean mutator effect, mM, and mean antimutator effect, mA. There seem to be no available data on average effects of newly arising mutations on the mutation rate, presumably because of the difficulty involved in obtaining precise measurements of mutation rates.
It is likely that there is a detection bias in favor of those mutations which alter the mutation rate by a substantial amount, but there is little reason to suppose, as noted in the main text, that mutation rates cannot be subtly altered by minor mutational changes to the enzymatic machinery of DNA replication and repair.
Dynamics of deleterious load on a hitchhiking mutator

A mutator may hitchhike to fixation with a new beneficial mutation it produces. As the beneficial mutation increases in frequency, it will acquire excess deleterious load due to its mutator background. Here we show that there is a robust time delay before increased deleterious load is established on a hitchhiking mutator. Short-sighted natural selection does not anticipate the delayed establishment of increased load, and an adapting population can therefore unwittingly evolve intolerable mutation rates.

The observation that natural selection drives an adapting population to evolve intolerable mutation rates is counter-intuitive and, as such, requires careful explanation. In the main text, we assert that this counter-intuitive result is largely due to the short-sightedness of natural selection. Yet, short-sightedness is only detrimental when there is some delayed detrimental effect that the short-sighted process cannot anticipate. Here, the delayed detrimental effect is the establishment of excess deleterious load due to an increased mutation rate. On the surface it seems this delay would be quite short and therefore not of much consequence: by the time a beneficial mutation appears on a mutator background, the mutator will have already acquired some excess deleterious mutations due to its increased mutation rate. So the excess deleterious load will already be partially established at the outset, and intuition suggests that the total excess load will be established shortly thereafter.

A very simple model of deleterious load dynamics, however, yields results that depart from intuition. Let x denote the frequency of a successful beneficial mutation on a mutator background, and let y denote the excess deleterious mutation frequency due to the mutator background.
Then the dynamics of deleterious load on a hitchhiking mutator can be compactly represented by a system of two coupled equations:

$$\frac{dx}{dt} = s_B\, x (1 - x)$$
$$\frac{dy}{dt} = \mu_D\, x - s_D\, y$$

where $s_B$ denotes the net selective advantage of the beneficial mutation that drives the mutator to fixation, $s_D$ denotes the mean selective disadvantage of deleterious mutations, and $\mu_D$ denotes the excess deleterious mutation rate; i.e., if $u$ and $u'$ denote, respectively, the wildtype and mutator deleterious mutation rates, then $\mu_D = u' - u$. To help understand exactly how $s_B$ is defined, we provide an example: if a beneficial mutation confers an intrinsic selective advantage of $s_B'$, then net selective advantage could be bounded by $s_B' - \mu_D < s_B < s_B'$.

We next write the equations in terms of a more meaningful variable, $z = y/x$. Because $y$ will usually be much smaller than $x$, this new variable is approximately the frequency of excess deleterious mutations on the mutator background, i.e., $z = y/x \approx y/(x + y)$. The new system becomes:

$$\frac{dx}{dt} = s_B\, x (1 - x)$$
$$\frac{dz}{dt} = \mu_D - (s_D + s_B)\, z + s_B\, z x$$

Fig. 12 reveals that the trajectory of $z$ appears as two steps, the first occurring soon after the beneficial mutation begins to grow exponentially and the second occurring after the beneficial mutation is fixed. This dynamic reveals that deleterious load is suppressed during the hitchhiking of the mutator allele below the equilibrium load that is eventually established once the mutator is fixed. Simple analysis of this system shows that when $z = \hat{z}$ is invariant in time, it is equal to

$$\hat{z} = \frac{\mu_D}{s_D + s_B (1 - x)}.$$

But this function is only invariant in time when $x$ is invariant in time; $x$ is invariant in time once the beneficial mutation is fixed, i.e., when $x = 1$, yielding the classical mutation-selection balance, $\hat{z}_2 = \mu_D / s_D$, which is the height of the second step in the trajectory. In addition, however, $x$ is approximately invariant in time on an arithmetic scale during the exponential growth of the beneficial mutation, i.e., when $x \approx 0$, yielding the height of the first step in the trajectory, $\hat{z}_1 = \mu_D / (s_D + s_B)$. These results imply that, during hitchhiking of a mutator, deleterious load is only a fraction $s_D / (s_D + s_B)$ of the total equilibrium load that is established once the mutator is fixed. When linkage is maintained among fitness loci, the fitness advantage of successful beneficial mutations $s_B$ can be large, due to clonal interference, meaning that the height of the first step is potentially much smaller than the height of the second step. Put differently, the suppression of deleterious load during mutator hitchhiking can be significant.

In addition to the suppression of load during hitchhiking, another feature of load dynamics is extracted from this simple model that helps to explain how an adapting population can evolve intolerable mutation rates. This feature is the observation that long-term dynamics, and hence the time lag until establishment of excess deleterious load, are very robust to initial conditions. Typically, a mutator background will have accumulated some deleterious load by the time the beneficial mutation appears, and the initial deleterious load is therefore positive and variable. Fig. 13 shows that the waiting time until establishment of excess deleterious load is, for all practical purposes, completely independent of this initial deleterious load. Furthermore, the suppression of deleterious load during mutator hitchhiking is independent of initial conditions; if the mutator background has an initial deleterious load greater than $\mu_D / (s_D + s_B)$, it will quickly be reduced to this value until the mutator achieves high frequency, as shown in Fig. 13.
Figure 12: Dynamics of deleterious load on a hitchhiking mutator. The logistic dynamics of fixation of the driving beneficial mutation (x) are shown by the green line. Deleterious load on the mutator background (z) is shown by the red line.
Figure 13: Robustness of deleterious load dynamics to variable initial conditions. The vertical axis is deleterious load on the mutator background, z. Two salient features of these dynamics are the robustness to initial conditions of 1) long-term dynamics and hence the time lag until establishment of increased deleterious load, and 2) suppression of deleterious load during mutator hitchhiking.
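The two-step trajectory and its insensitivity to the initial load can be reproduced numerically. The Python sketch below integrates the system in x and z = y/x by forward Euler; the parameter values, the step size, and the names `load_trajectory` and `load_at` are illustrative assumptions, not the settings used for Figs. 12 and 13. The symbol `mu_d` stands for the excess deleterious mutation rate u' - u.

```python
def load_trajectory(s_b, s_d, mu_d, x0, z0, t_end, dt=0.05):
    """Forward-Euler integration of
         dx/dt = s_b * x * (1 - x)
         dz/dt = mu_d - (s_d + s_b) * z + s_b * z * x
       returning a list of (t, x, z) tuples, one per step."""
    steps = int(round(t_end / dt))
    x, z = x0, z0
    traj = [(0.0, x, z)]
    for i in range(1, steps + 1):
        dx = s_b * x * (1.0 - x)
        dz = mu_d - (s_d + s_b) * z + s_b * z * x
        x += dt * dx
        z += dt * dz
        traj.append((i * dt, x, z))
    return traj

def load_at(traj, t, dt=0.05):
    """Deleterious load z at time t (t must be a multiple of dt)."""
    return traj[int(round(t / dt))][2]

# Illustrative parameter values (assumptions).
s_b, s_d, mu_d = 0.1, 0.01, 0.001
first_step = mu_d / (s_d + s_b)    # load while the beneficial mutation is still rare
second_step = mu_d / s_d           # classical mutation-selection balance after fixation

clean = load_trajectory(s_b, s_d, mu_d, x0=1e-8, z0=0.0, t_end=2000.0)
loaded = load_trajectory(s_b, s_d, mu_d, x0=1e-8, z0=0.05, t_end=2000.0)
```

With these values, `load_at(clean, 100.0)` and `load_at(loaded, 100.0)` both sit close to `first_step` even though the two runs start from very different loads, and both runs end close to `second_step`, which is the behavior described for Figs. 12 and 13.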
Finite genome effects

The theory developed in this paper implicitly makes the assumption that genomes are effectively infinite in size. With respect to fitness mutations, this assumption seems very weak. With respect to mutation-rate mutations, however, the assumption may not be as weak as it initially appears, but the conditions under which it becomes less weak appear to be relatively restricted and perhaps of questionable biological significance.

We believe that most readers will readily concur on two basic premises regarding mutations: 1) deleterious mutations are far more common than beneficial mutations, and 2) mutator mutations are far more common than antimutator mutations. For finite genomes, however, these proportions (beneficial/deleterious, and antimutator/mutator) should change over time in an adapting population. If an organism's environment is completely static, then as the population adapts, the supply of beneficial mutations available to the finite genome is slowly depleted, and the beneficial/deleterious ratio tends toward zero. If the environment is changing (however slowly or erratically), then as the population adapts, the supply of potential beneficial mutations is slowly depleted, but changes in the environment provide new opportunities to adapt and this supply is therefore slowly replenished. The balance between these rates of depletion and replenishment of the supply of potential beneficial mutations will determine the beneficial/deleterious ratio, which should remain very small but will not become zero. Furthermore, it is very hard to fathom how any genome, however small, in any situation, however dire, could result in the number of potential beneficial mutations exceeding the number of potential deleterious mutations. With respect to fitness mutations, therefore, our infinite-genome assumption is very weak.
Our work shows that, as an asexual population adapts, indirect selection on mutation-rate variants tends to cause the fixation of mutators at a higher rate than antimutators. In a finite genome, the number of potential mutator mutations slowly decreases as a population adapts because a) they are used up as more mutators are fixed, and b) as more mutators are fixed, it seems reasonable to speculate that further mutation in replication/repair apparatuses will increasingly result in non-viable genomes. Furthermore, the number of potential antimutator mutations increases disproportionately because this number will include reversions of mutator mutations as well as compensatory mutations that lower the mutation rate. As a result, it seems plausible at
least that a finite genome could reach a state in which there are more potential viable antimutator mutations than viable mutator mutations. According to our analyses and simulations, this situation is uncharted territory: our results essentially guarantee the extinction result as long as fM > fA, but they do not specify what will happen when this is not true. In simulations with fM < fA, extinction almost never occurred in the time span of the simulations; however, a few runs did result in extinction. This evidence is only anecdotal but suggests that there are conditions under which extinction will occur even when fM < fA.

In the section of the main text entitled "Model assumptions", we refer to biological evidence in support of the notion that mutations in replication, proofreading, and repair apparatuses can result in intolerable mutation rates without directly killing the organism. This evidence supports the conclusion that, effectively, fM > 0 at least long enough for extinction to occur. The question remains, however, what happens when the ratio fM / fA decreases as mutator mutations accumulate. We investigate two mechanisms by which the ratio fM / fA may decrease with increasing mutation rate:

Mechanism 1: as mutator mutations accumulate, further mutations in replication/repair genes become more likely to be directly lethal.
Mechanism 2: as a reviewer pointed out, as mutation rate increases, further opportunities to increase mutation rate are depleted while new ways to decrease mutation rate through reversion and compensatory mutation open up.

Mechanism 1
In what follows, we redefine mutator fraction, fM, as net mutator fraction, and antimutator fraction, fA, as net antimutator fraction. By net mutator (antimutator) fraction, we mean the fraction of mutations that are mutator (antimutator) mutations and are not directly lethal.
With our new definitions, the conditions for error catastrophe to occur are essentially the same: the error catastrophe is guaranteed in infinite populations when fM > fA, although it is not clear what happens when fM < fA.
To address the question of whether a decreasing ratio fM / fA could prevent the error catastrophe, we have run simulations in which fM is decreasing because of increasing mutator lethality as mutators are fixed. In these simulations, a certain fraction of mutator mutations are directly lethal, and as mutator mutations accumulate, this lethal fraction increases. In these simulations, the function describing how the lethal fraction increases with the number of mutator mutations already on the genome is plotted here:
Figure 14: Mutator lethality function.
To help interpret this function biologically, we ask the following question: given that an organism has acquired n mutator mutations at random, what's the probability that the organism is dead? This probability is calculated as the probability that the first mutation is lethal plus the probability that the first mutation isn't lethal but the second one is, etc. The complement of this probability, i.e., the probability that the organism is still alive after n mutator mutations, is plotted here:
Figure 15: Corresponding survival curve.
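The calculation that turns a lethality function into a survival curve can be sketched in a few lines of Python. The lethality function used below is a hypothetical stand-in, since the curve in Fig. 14 is given only graphically; the names `lethal_fraction`, `prob_dead`, and `prob_alive` are ours. The sketch computes the probability of death as the sum described above and the survival probability as a product, and the two should always add up to one.

```python
import math

def lethal_fraction(k, scale=20.0):
    """Hypothetical lethal fraction of a new mutator mutation on a genome
    that already carries k mutator mutations (NOT the function of Fig. 14)."""
    return 1.0 - math.exp(-k / scale)

def prob_dead(n, lethal=lethal_fraction):
    """P(first mutation lethal) + P(first not lethal, second lethal) + ... over n mutations."""
    dead, alive = 0.0, 1.0
    for k in range(n):
        dead += alive * lethal(k)
        alive *= 1.0 - lethal(k)
    return dead

def prob_alive(n, lethal=lethal_fraction):
    """Probability of surviving n successive mutator mutations."""
    alive = 1.0
    for k in range(n):
        alive *= 1.0 - lethal(k)
    return alive
```

Plotting `prob_alive(n)` against n for a chosen lethality function gives a survival curve of the kind shown in Fig. 15, which can then be compared with empirical curves of functional fraction against number of mutations.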
This function compares at least qualitatively with observations of functional fraction after a given number of mutations in the protein subtilisin and in TEM1 β-lactamase (both taken from (23)).
If anything, the function we employ (Fig. 15) decreases faster than these observed functions, suggesting that, if anything, the function we employ is conservative. Implementing the above function for mutator lethality, the output of one simulation run is plotted here:
Figure 16: One simulation run implementing the mutator lethality function plotted in Fig. 14.
In this simulation run, the fitness crash clearly occurs, in spite of the fact that an increasing fraction of mutator mutations are directly lethal (the red curve represents a population average). Other parameters are the same as those used in Fig. 1 of the main
text. Mutation rate dynamics are similar to previous simulations that did not account for the effect of mutator lethality:
[Figure 17: relative mutation rate (vertical axis, 0 to 7) plotted against time in generations (horizontal axis, 0 to 40,000).]
Figure 17: Mutation rate dynamics.
This would seem to suggest that the principal determinant of mutation rate dynamics is not the supply rate of viable mutator mutations (which effectively decreases considerably) but indirect selection on those viable mutator mutations that are produced. These simulations do not account for the effect of antimutators on the fraction of mutator mutations that are directly lethal. An antimutator mutation should decrease the fraction of mutator mutations that are directly lethal, and not including this effect would therefore reflect a conservative assumption. These results are, of course, dependent on the function plotted in Fig. 14. If one employs a function that approaches one fast enough (unrealistically fast, according to (23)), this does indeed appear to prevent error catastrophe. This stands to reason because it would either lead to a situation in which the population is physically unable to reach the error threshold, or it would eventually give rise to a situation in which fM < fA, which may (or may not) prevent the error catastrophe. Biological considerations (outlined above) together with results from protein mutagenesis (23) would seem to argue, however, that if anything the function we used is in fact quite conservative.
Mechanism 2
In addition, we have specifically addressed the idea that, as mutation rate increases, the mutator and antimutator fractions should converge. If one considers a very simple model in which mutator mutations may revert to antimutator mutations at equal per-nucleotide rates, and vice versa, then at least in this simple model, in the limit of increasing mutation rate, further mutation in a polymerase (that doesn't destroy it) becomes just as likely to increase as to decrease the mutation rate. In this simple model, mutator and antimutator fractions converge. We model the relationship between mutator fraction (x) or antimutator fraction (y) and relative mutation rate (r) as follows:

$$\frac{dx}{dr} = y - x, \qquad \frac{dy}{dr} = x - y$$

whose solution looks like this:
Figure 18: Mutator and antimutator fractions as a function of mutation rate relative to the wildtype.
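The convergence of the two fractions can be made explicit by solving the linear system dx/dr = y - x, dy/dr = x - y in closed form. The worked solution below is a sketch under assumptions that are ours: $x_0$ and $y_0$ denote the mutator and antimutator fractions at a reference mutation rate $r_0$ (for example the wildtype value $r_0 = 1$), which the text does not specify.

```latex
% Adding and subtracting the two equations decouples the system:
\frac{d(x+y)}{dr} = 0, \qquad \frac{d(x-y)}{dr} = -2\,(x-y).
% Hence the sum is conserved and the difference decays exponentially:
x(r) + y(r) = x_0 + y_0, \qquad x(r) - y(r) = (x_0 - y_0)\, e^{-2 (r - r_0)}.
% Solving for each fraction:
x(r) = \frac{x_0 + y_0}{2} + \frac{x_0 - y_0}{2}\, e^{-2 (r - r_0)}, \qquad
y(r) = \frac{x_0 + y_0}{2} - \frac{x_0 - y_0}{2}\, e^{-2 (r - r_0)}.
```

Both fractions therefore approach the common value $(x_0 + y_0)/2$ as the relative mutation rate increases, with the mutator fraction falling and the antimutator fraction rising whenever $x_0 > y_0$.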
We implemented this relationship between mutator (antimutator) fraction and relative mutation rate in simulations; here is the output from one simulation run:
Figure 19: One simulation run implementing the mutator and antimutator fractions plotted in Fig. 18.
Again, dynamics are similar to previous simulations. Finally, we note that an encompassing exploration of the phenomena presented in this subsection is in our plans for future work, but our initial simulation work would suggest that these phenomena may be qualitatively relevant only in very restricted, and perhaps biologically questionable, regions of parameter space.
Inferring extinction. The mutation rate catastrophe ultimately causes a very rapid decline in fitness. We propose that any population that experiences such a rapid decline in mean fitness will inevitably become extinct. Yet the simulated populations themselves, because they employ extreme soft selection, do not become extinct. Here, we support our claim that any real population would be driven extinct by such a rapid drop in fitness with both verbal arguments and further simulation results.

In the present manuscript, we infer extinction from the observed fitness and mutation rate dynamics. In essence, we base our inference on the observation that the fitness decline is ultimately so rapid, and the mutation rate ultimately so high, that no population could survive it. When fitness decline is slow, it is conceivable that the resulting reduction in resource competition or environmental restoration (the opposite of Fisher's environmental deterioration (7)) will compensate for the slower production of offspring, resulting in a stable population size (soft selection effects). When fitness decline is rapid, however, the stabilizing effects of increased available resource (soft selection effects) are not fast enough to save the population from extinction.

In zero-sum soft selection simulations with floating population size (in which the nudging factor equals zero; results not shown), the rapid decline in fitness caused actual extinction of the population despite the zero-sum formulation (i.e., despite very soft selection). In these simulations, extinction was not inferred; instead, it was observed. We chose, however, not to report the results of these simulations, as the observed extinction could raise concerns that are not essential to our main thesis, and the resulting fluctuations in population size could introduce complicating artifacts that only obfuscate an already complicated process.
Whether or not extinction results from the observed fitness dynamics will depend on where a population is positioned on the continuum that joins the soft-selection and hard-selection extremes. What our zero-sum simulations have shown is that a population very close to the soft-selection extreme becomes extinct as a result of the very rapid fitness decline caused by the mutation rate catastrophe. We built two versions of the mutational meltdown model (24), in which the simulations operate in soft-selection mode until absolute mean fitness dips below one. At this point, the simulations switch from the soft-selection extreme to the hard-selection extreme, awarding each individual a number of offspring drawn at random from a Poisson distribution whose mean is equal to the individual's absolute fitness. In the first version of this model, population size is held constant as long as the population remains in the soft-selection regime (Fig. 20A). In the second version, population size experiences a slow, forced decline as long as the population remains in the soft-selection regime (Fig. 20B), mimicking a population whose rate of fitness loss slightly exceeds the rate of fitness recuperation due to increased resource availability (or environmental restoration). Both of these simulations show that neither hard selection nor simulated hard selection effects (forced, slow population decline) saves the population from extinction. If anything, extinction occurs faster in these regimes than would be inferred from observed fitness dynamics in a purely soft-selection simulation.
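The switching rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation program used for Fig. 20: the function names `poisson_draw` and `next_generation` are hypothetical, mutation is omitted entirely, and the starting population (1000 individuals of absolute fitness 0.5) is an assumed post-crash state chosen for illustration.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate by Knuth's multiplication method (suited to small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def next_generation(fitnesses, rng):
    """Advance a list of absolute fitnesses by one generation (mutation omitted)."""
    if not fitnesses:
        return []  # the population is already extinct
    mean_fitness = sum(fitnesses) / len(fitnesses)
    if mean_fitness >= 1.0:
        # Soft selection: size is held constant and parents are sampled
        # in proportion to fitness.
        return rng.choices(fitnesses, weights=fitnesses, k=len(fitnesses))
    # Hard selection: each individual leaves a Poisson number of offspring
    # whose mean equals its absolute fitness.
    offspring = []
    for w in fitnesses:
        offspring.extend([w] * poisson_draw(rng, w))
    return offspring


rng = random.Random(0)
population = [0.5] * 1000  # hypothetical population after the fitness crash
sizes = [len(population)]
while population and len(sizes) < 200:
    population = next_generation(population, rng)
    sizes.append(len(population))
print(sizes[:8], "... extinct after", len(sizes) - 1, "generations")
```

Because no mutation is applied, the sketch only shows how quickly population size collapses once mean absolute fitness is below one; it does not reproduce the fitness dynamics that lead to that point.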
[Figure 20, panels A and B: population size and fitness plotted against time (generations).]
Figure 20. Soft selection / hard selection hybrid simulations, similar to previous mutational meltdown models. (A) Population size is held constant until fitness dips below one, at which time the simulation switches to the hard selection extreme, in which an individual's mean number of offspring is equal to its absolute fitness. The fitness crash is so fast that the population size plummets to zero upon entering the hard selection regime. (B) Population size experiences a slow, forced decline over time. Again, population size plummets to zero upon entering the hard selection regime.
Mathematical Details
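Conversion of Mu from integral to differential form (in Models to be analyzed). Here, we show how this:

$$e^{y} M u = f_B e^{y}\left[\int_0^\infty u(x-\Delta x, y, t)\, g_B(\Delta x)\, d\Delta x - u\right] + f_D e^{y}\left[\int_0^\infty u(x+\Delta x, y, t)\, g_D(\Delta x)\, d\Delta x - u\right] + f_M\left[\int_0^\infty e^{y-\Delta y} u(x, y-\Delta y, t)\, g_M(\Delta y)\, d\Delta y - e^{y} u\right] + f_A\left[\int_0^\infty e^{y+\Delta y} u(x, y+\Delta y, t)\, g_A(\Delta y)\, d\Delta y - e^{y} u\right]$$

becomes this:

$$e^{y} M u \approx e^{y}\left[D_x \frac{\partial^2 u}{\partial x^2} + d_x \frac{\partial u}{\partial x} + D_y \frac{\partial^2 u}{\partial y^2} + (2D_y + d_y)\frac{\partial u}{\partial y} + (D_y + d_y)\,u\right],$$

where

$$D_x = \tfrac{1}{2} f_D (m_D^2 + \sigma_D^2) + \tfrac{1}{2} f_B (m_B^2 + \sigma_B^2), \qquad D_y = \tfrac{1}{2} f_A (m_A^2 + \sigma_A^2) + \tfrac{1}{2} f_M (m_M^2 + \sigma_M^2),$$

$$d_x = f_D m_D - f_B m_B, \qquad d_y = f_A m_A - f_M m_M .$$

We start with the first integral, $\int_0^\infty u(x-\Delta x, y, t)\, g_B(\Delta x)\, d\Delta x$, describing mutational flux into fitness class x due to beneficial mutation. Taylor series expansion of $u(x-\Delta x, y, t)$ about x gives:

$$u(x-\Delta x, y, t) = u(x, y, t) - \Delta x \frac{\partial}{\partial x} u(x, y, t) + \tfrac{1}{2}\Delta x^2 \frac{\partial^2}{\partial x^2} u(x, y, t) - \ldots$$

If we truncate this series after the second derivative, multiply by $g_B(\Delta x)$, and integrate over $\Delta x$, we have:

$$\int_0^\infty u(x-\Delta x, y, t)\, g_B(\Delta x)\, d\Delta x \approx u(x, y, t)\int_0^\infty g_B(\Delta x)\, d\Delta x - \frac{\partial}{\partial x} u(x, y, t)\int_0^\infty \Delta x\, g_B(\Delta x)\, d\Delta x + \frac{\partial^2}{\partial x^2} u(x, y, t)\,\tfrac{1}{2}\int_0^\infty \Delta x^2 g_B(\Delta x)\, d\Delta x .$$

The function $g_B(\Delta x)$ is a probability density function for positive values of $\Delta x$. (For example, $g_B(\Delta x)$ could be the exponential density: $g_B(\Delta x) = \lambda e^{-\lambda \Delta x}$.) From this fact, we derive the relations:

$$\int_0^\infty g_B(\Delta x)\, d\Delta x = 1,$$

$$\int_0^\infty \Delta x\, g_B(\Delta x)\, d\Delta x = m_B, \text{ the mean fitness effect of beneficial mutations, and}$$

$$\tfrac{1}{2}\int_0^\infty \Delta x^2 g_B(\Delta x)\, d\Delta x = \tfrac{1}{2}(m_B^2 + \sigma_B^2), \text{ where } \sigma_B^2 \text{ is the variance in fitness effects of beneficial mutations.}$$

The first integral in Mu is therefore approximated as follows:

$$\int_0^\infty u(x-\Delta x, y, t)\, g_B(\Delta x)\, d\Delta x \approx u(x, y, t) - m_B \frac{\partial}{\partial x} u(x, y, t) + \tfrac{1}{2}(m_B^2 + \sigma_B^2)\frac{\partial^2}{\partial x^2} u(x, y, t),$$

and the entire first term of Mu is

$$e^{y} f_B\left[\int_0^\infty u(x-\Delta x, y, t)\, g_B(\Delta x)\, d\Delta x - u\right] \approx e^{y} f_B\left[u - m_B\frac{\partial u}{\partial x} + \tfrac{1}{2}(m_B^2 + \sigma_B^2)\frac{\partial^2 u}{\partial x^2} - u\right] = e^{y}\left[\tfrac{1}{2} f_B (m_B^2 + \sigma_B^2)\frac{\partial^2 u}{\partial x^2} - f_B m_B \frac{\partial u}{\partial x}\right].$$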
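The quality of the truncated expansion of the first integral can be checked numerically. The sketch below assumes, purely for illustration, a standard Gaussian profile for u in x and an exponential density for $g_B$ with mean $m_B = 0.05$ (so that $\sigma_B^2 = m_B^2$); the function names and parameter values are hypothetical and do not come from the original analysis.

```python
import math


def gaussian(x):
    """Assumed profile u(x): a standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def exact_inflow(x, m_B, steps=20000):
    """Midpoint-rule value of the integral of u(x - dx) g_B(dx) for exponential g_B."""
    upper = 40.0 * m_B  # the exponential tail beyond this point is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps):
        dx = (i + 0.5) * h
        total += gaussian(x - dx) * math.exp(-dx / m_B) / m_B
    return total * h


def truncated_inflow(x, m_B):
    """u - m_B u' + (1/2)(m_B^2 + var_B) u'' with var_B = m_B^2 (exponential case)."""
    u = gaussian(x)
    du = -x * u
    d2u = (x * x - 1.0) * u
    return u - m_B * du + 0.5 * (m_B ** 2 + m_B ** 2) * d2u


x, m_B = 0.5, 0.05
exact = exact_inflow(x, m_B)
approx = truncated_inflow(x, m_B)
print(exact, approx, abs(exact - approx))
```

The remaining discrepancy is of third order in $m_B$, which is why the truncation is reasonable when mutational effects are small relative to the width of the distribution.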
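In a similar fashion, we find the second term to be:

$$e^{y} f_D\left[\int_0^\infty u(x+\Delta x, y, t)\, g_D(\Delta x)\, d\Delta x - u\right] \approx e^{y}\left[\tfrac{1}{2} f_D (m_D^2 + \sigma_D^2)\frac{\partial^2 u}{\partial x^2} + f_D m_D \frac{\partial u}{\partial x}\right].$$

Approximation of the third and fourth terms of Mu is slightly different. To simplify these approximations somewhat, we define a new variable, $\phi(y) = e^{y} u(x, y, t)$. The third integral in Mu can then be written $\int_0^\infty \phi(y-\Delta y)\, g_M(\Delta y)\, d\Delta y$. Truncated expansion of $\phi(y-\Delta y)$ about y gives: $\phi(y-\Delta y) \approx \phi(y) - \Delta y\, \phi'(y) + \tfrac{1}{2}\Delta y^2 \phi''(y)$, and we have

$$\phi'(y) = e^{y}\left(u + \frac{\partial u}{\partial y}\right) \quad\text{and}\quad \phi''(y) = e^{y}\left(u + 2\frac{\partial u}{\partial y} + \frac{\partial^2 u}{\partial y^2}\right).$$

Multiplying by $g_M(\Delta y)$, putting things back in terms of u, and integrating gives

$$\int_0^\infty e^{y-\Delta y} u(x, y-\Delta y, t)\, g_M(\Delta y)\, d\Delta y \approx e^{y} u \int_0^\infty g_M(\Delta y)\, d\Delta y - e^{y}\left(u + \frac{\partial u}{\partial y}\right)\int_0^\infty \Delta y\, g_M(\Delta y)\, d\Delta y + e^{y}\left(u + 2\frac{\partial u}{\partial y} + \frac{\partial^2 u}{\partial y^2}\right)\tfrac{1}{2}\int_0^\infty \Delta y^2 g_M(\Delta y)\, d\Delta y .$$

Again, the function $g_M(\Delta y)$ is a probability density function for positive values of $\Delta y$. From this fact, we derive the relations:

$$\int_0^\infty g_M(\Delta y)\, d\Delta y = 1,$$

$$\int_0^\infty \Delta y\, g_M(\Delta y)\, d\Delta y = m_M, \text{ the mean effect of mutator mutations, and}$$

$$\tfrac{1}{2}\int_0^\infty \Delta y^2 g_M(\Delta y)\, d\Delta y = \tfrac{1}{2}(m_M^2 + \sigma_M^2), \text{ where } \sigma_M^2 \text{ is the variance in effects of mutator mutations.}$$

The entire third term thus becomes

$$f_M\left[\int_0^\infty e^{y-\Delta y} u(x, y-\Delta y, t)\, g_M(\Delta y)\, d\Delta y - e^{y} u\right] \approx e^{y} f_M\left[-m_M\left(u + \frac{\partial u}{\partial y}\right) + \tfrac{1}{2}(m_M^2 + \sigma_M^2)\left(u + 2\frac{\partial u}{\partial y} + \frac{\partial^2 u}{\partial y^2}\right)\right]$$

$$= e^{y}\left[\tfrac{1}{2} f_M (m_M^2 + \sigma_M^2)\frac{\partial^2 u}{\partial y^2} + \left(f_M (m_M^2 + \sigma_M^2) - f_M m_M\right)\frac{\partial u}{\partial y} + \left(\tfrac{1}{2} f_M (m_M^2 + \sigma_M^2) - f_M m_M\right) u\right].$$

In a similar fashion, we find the fourth term to be:

$$f_A\left[\int_0^\infty e^{y+\Delta y} u(x, y+\Delta y, t)\, g_A(\Delta y)\, d\Delta y - e^{y} u\right] \approx e^{y}\left[\tfrac{1}{2} f_A (m_A^2 + \sigma_A^2)\frac{\partial^2 u}{\partial y^2} + \left(f_A (m_A^2 + \sigma_A^2) + f_A m_A\right)\frac{\partial u}{\partial y} + \left(\tfrac{1}{2} f_A (m_A^2 + \sigma_A^2) + f_A m_A\right) u\right].$$

We can now write out the entire expression:

$$M u \approx D_x \frac{\partial^2 u}{\partial x^2} + d_x \frac{\partial u}{\partial x} + D_y \frac{\partial^2 u}{\partial y^2} + (2D_y + d_y)\frac{\partial u}{\partial y} + (D_y + d_y)\,u,$$

where

$$D_x = \tfrac{1}{2} f_D (m_D^2 + \sigma_D^2) + \tfrac{1}{2} f_B (m_B^2 + \sigma_B^2), \qquad D_y = \tfrac{1}{2} f_A (m_A^2 + \sigma_A^2) + \tfrac{1}{2} f_M (m_M^2 + \sigma_M^2),$$

$$d_x = f_D m_D - f_B m_B, \qquad d_y = f_A m_A - f_M m_M .$$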
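As a small worked example of the coefficient definitions above, the following sketch computes $D_x$, $D_y$, $d_x$ and $d_y$ from the mutation fractions, mean effects and variances. The helper name `diffusion_drift` and all numerical values are hypothetical, chosen only for illustration; the variances are set to the squared means, which corresponds to assuming exponentially distributed effects.

```python
def diffusion_drift(f, m, var):
    """Return (D_x, D_y, d_x, d_y) from dicts keyed by 'B', 'D', 'M', 'A'."""
    # Second moments m^2 + sigma^2 of each class of mutational effect.
    second = {k: m[k] ** 2 + var[k] for k in "BDMA"}
    D_x = 0.5 * f["D"] * second["D"] + 0.5 * f["B"] * second["B"]
    D_y = 0.5 * f["A"] * second["A"] + 0.5 * f["M"] * second["M"]
    d_x = f["D"] * m["D"] - f["B"] * m["B"]
    d_y = f["A"] * m["A"] - f["M"] * m["M"]
    return D_x, D_y, d_x, d_y


# Hypothetical parameter values, chosen only for illustration.
f = {"B": 0.001, "D": 0.1, "M": 0.01, "A": 0.001}
m = {"B": 0.03, "D": 0.03, "M": 0.5, "A": 0.5}
var = {k: m[k] ** 2 for k in m}  # exponential effects: variance = mean squared
D_x, D_y, d_x, d_y = diffusion_drift(f, m, var)
print(D_x, D_y, d_x, d_y)
```

With these assumed values, deleterious mutations outnumber beneficial ones and mutators outnumber antimutators, so $d_x$ is positive and $d_y$ is negative.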
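Our choice of notation, D and d, was intended to allude to diffusion and drift coefficients, respectively.

The mean of z is zero (from Analysis 1). Integrating the soliton equation over z and y directly gives the mean of z, because the first term on the right hand side is $zu$ which, when integrated, gives the mean of z. Writing $\langle\cdot\rangle$ for integration over z and y, the mean of z, denoted $\bar{z}$, is therefore:

$$\bar{z} = -c\left\langle \frac{\partial u}{\partial z}\right\rangle - \left\langle e^{y} M u\right\rangle .$$

The first term is

$$\left\langle \frac{\partial u}{\partial z}\right\rangle = \iint \frac{\partial}{\partial z} u(z, y)\, dz\, dy = \int \left[u(\infty, y) - u(-\infty, y)\right] dy = 0 - 0 = 0$$

(from boundary conditions). The second term is

$$\left\langle e^{y} M u\right\rangle = \iint e^{y} M u\, dz\, dy = \iint e^{y}\left[D_z \frac{\partial^2 u}{\partial z^2} + d_z \frac{\partial u}{\partial z} + D_y \frac{\partial^2 u}{\partial y^2} + (2D_y + d_y)\frac{\partial u}{\partial y} + (D_y + d_y)\,u\right] dz\, dy,$$

which we now integrate term by term. The first term in $\langle e^{y} M u\rangle$ is

$$D_z \iint e^{y} \frac{\partial^2}{\partial z^2} u(z, y)\, dz\, dy = D_z \int e^{y}\left[\frac{\partial}{\partial z} u(\infty, y) - \frac{\partial}{\partial z} u(-\infty, y)\right] dy = 0 - 0 = 0,$$

because of the boundary conditions imposed. The second term in $\langle e^{y} M u\rangle$ is

$$d_z \iint e^{y} \frac{\partial}{\partial z} u(z, y)\, dz\, dy = d_z \int e^{y}\left[u(\infty, y) - u(-\infty, y)\right] dy = 0 - 0 = 0,$$

again because of boundary conditions. The third term in $\langle e^{y} M u\rangle$ is

$$D_y \iint e^{y} \frac{\partial^2}{\partial y^2} u(z, y)\, dy\, dz,$$

of which the inside integral may be solved using integration by parts:

$$\int e^{y} \frac{\partial^2}{\partial y^2} u(z, y)\, dy = \left[e^{y} \frac{\partial}{\partial y} u(z, y)\right]_{y=-\infty}^{\infty} - \int e^{y} \frac{\partial}{\partial y} u(z, y)\, dy = -\int e^{y} \frac{\partial}{\partial y} u(z, y)\, dy,$$

and furthermore,

$$\int e^{y} \frac{\partial}{\partial y} u(z, y)\, dy = \left[e^{y} u(z, y)\right]_{y=-\infty}^{\infty} - \int e^{y} u(z, y)\, dy = -\int e^{y} u(z, y)\, dy .$$

From these two relations, we deduce that

$$\int e^{y} \frac{\partial^2}{\partial y^2} u(z, y)\, dy = -\int e^{y} \frac{\partial}{\partial y} u(z, y)\, dy = \int e^{y} u(z, y)\, dy .$$

When this inside integral is put back into the outside integral,

$$\iint e^{y} \frac{\partial^2}{\partial y^2} u(z, y)\, dy\, dz = \iint e^{y} u(z, y)\, dy\, dz = \bar{\mu},$$

the mean mutation rate. Following this logic, we solve for the remaining terms in $\langle e^{y} M u\rangle$:

$$D_y \iint e^{y} \frac{\partial^2}{\partial y^2} u(z, y)\, dy\, dz = D_y \bar{\mu},$$

$$(2D_y + d_y) \iint e^{y} \frac{\partial}{\partial y} u(z, y)\, dy\, dz = -(2D_y + d_y)\,\bar{\mu}, \text{ and}$$

$$(D_y + d_y) \iint e^{y} u(z, y)\, dy\, dz = (D_y + d_y)\,\bar{\mu} .$$

And now, replacing each term in $\langle e^{y} M u\rangle$ with its corresponding integrated form, we have:

$$\left\langle e^{y} M u\right\rangle = 0 + 0 + D_y \bar{\mu} - (2D_y + d_y)\,\bar{\mu} + (D_y + d_y)\,\bar{\mu} = 0 .$$

We note that this is what one would expect for a conservative system. We have established that $\langle \partial u / \partial z\rangle = 0$ and that $\langle e^{y} M u\rangle = 0$, from which we infer that $\langle zu\rangle = 0$, confirming that the mean of z is zero ($\bar{z} = 0$) as expected.

The variance of z (from Analysis 1). To obtain the variance of z, we multiply the soliton equation by z before integrating,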
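giving:

$$\sigma_z^2 = \left\langle z^2 u\right\rangle = -c\left\langle z \frac{\partial u}{\partial z}\right\rangle - \left\langle z e^{y} M u\right\rangle .$$

The first integral on the right hand side is:

$$\left\langle z \frac{\partial u}{\partial z}\right\rangle = \iint z \frac{\partial}{\partial z} u(z, y)\, dz\, dy = \int \left(\left[z\, u(z, y)\right]_{z=-\infty}^{\infty} - \int u(z, y)\, dz\right) dy = -\iint u(z, y)\, dz\, dy = -1 .$$

The second integral on the right hand side is:

$$\left\langle z e^{y} M u\right\rangle = \iint z e^{y} M u\, dz\, dy = \iint z e^{y}\left[D_z \frac{\partial^2 u}{\partial z^2} + d_z \frac{\partial u}{\partial z} + D_y \frac{\partial^2 u}{\partial y^2} + (2D_y + d_y)\frac{\partial u}{\partial y} + (D_y + d_y)\,u\right] dz\, dy,$$

which we now integrate term by term. The first term in $\langle z e^{y} M u\rangle$ is:

$$D_z \iint z e^{y} \frac{\partial^2 u}{\partial z^2}\, dz\, dy = D_z \int e^{y}\left(\left[z \frac{\partial}{\partial z} u(z, y)\right]_{z=-\infty}^{\infty} - \int \frac{\partial u}{\partial z}\, dz\right) dy = D_z \int e^{y}\left(\left[z \frac{\partial}{\partial z} u(z, y)\right]_{z=-\infty}^{\infty} - \left(u(\infty, y) - u(-\infty, y)\right)\right) dy = 0 .$$

The second term in $\langle z e^{y} M u\rangle$ is:

$$d_z \iint z e^{y} \frac{\partial u}{\partial z}\, dz\, dy = d_z \int e^{y}\left(\left[z\, u(z, y)\right]_{z=-\infty}^{\infty} - \int u\, dz\right) dy = -d_z \iint e^{y} u\, dz\, dy = -d_z \bar{\mu} .$$

The remaining three terms in $\langle z e^{y} M u\rangle$, after integration by parts, are:

$$\left[D_y - (2D_y + d_y) + (D_y + d_y)\right] \iint z e^{y} u\, dy\, dz = 0 .$$

The above integrations give our expression for the variance of z:

$$\sigma_z^2 = \left\langle z^2 u\right\rangle = -c\left\langle z \frac{\partial u}{\partial z}\right\rangle - \left\langle z e^{y} M u\right\rangle = c + d_z \bar{\mu} .$$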
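The two integral identities used for the mean and variance of z (zero net mutational flux, and a first moment equal to $-d_z\bar{\mu}$) can be verified numerically for any smooth, rapidly decaying u. The sketch below assumes a separable Gaussian u(z, y) and hypothetical coefficient values; neither the profile nor the numbers are taken from the original analysis, and the differential form of M is applied using the exact Gaussian derivatives.

```python
import math

D_z, d_z, D_y, d_y = 9.0e-5, 3.0e-3, 2.75e-3, -4.5e-3  # hypothetical coefficients
s_z, y0, s_y = 0.2, -3.0, 0.5  # hypothetical Gaussian widths and centre


def u(z, y):
    """Assumed density: independent Gaussians in z (mean 0) and y (mean y0)."""
    gz = math.exp(-0.5 * (z / s_z) ** 2) / (s_z * math.sqrt(2 * math.pi))
    hy = math.exp(-0.5 * ((y - y0) / s_y) ** 2) / (s_y * math.sqrt(2 * math.pi))
    return gz * hy


def Mu(z, y):
    """Differential form of M applied to the Gaussian u, using exact derivatives."""
    val = u(z, y)
    u_z = -z / s_z ** 2 * val
    u_zz = (z ** 2 / s_z ** 4 - 1 / s_z ** 2) * val
    u_y = -(y - y0) / s_y ** 2 * val
    u_yy = ((y - y0) ** 2 / s_y ** 4 - 1 / s_y ** 2) * val
    return (D_z * u_zz + d_z * u_z + D_y * u_yy
            + (2 * D_y + d_y) * u_y + (D_y + d_y) * val)


def integrate(func, n=200):
    """Midpoint rule over a box wide enough to hold essentially all of u."""
    z_lo, z_hi = -8 * s_z, 8 * s_z
    y_lo, y_hi = y0 - 10 * s_y, y0 + 10 * s_y
    hz, hy = (z_hi - z_lo) / n, (y_hi - y_lo) / n
    total = 0.0
    for i in range(n):
        z = z_lo + (i + 0.5) * hz
        for j in range(n):
            y = y_lo + (j + 0.5) * hy
            total += func(z, y)
    return total * hz * hy


mean_mu = integrate(lambda z, y: math.exp(y) * u(z, y))
net_flux = integrate(lambda z, y: math.exp(y) * Mu(z, y))
first_moment = integrate(lambda z, y: z * math.exp(y) * Mu(z, y))
print(net_flux, first_moment, -d_z * mean_mu)
```

The first printed value should be numerically indistinguishable from zero, and the second and third should agree, mirroring the term-by-term integrations above.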
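Derivation of integral forms. Here, we perform two integrations of Model 1 that result in expressions for the dynamics of mean fitness and mean mutation rate,

$$\frac{\partial \bar{x}}{\partial t} = \sigma_x^2 - d_x \bar{\mu} \quad\text{and}\quad \frac{\partial \bar{\mu}}{\partial t} = \operatorname{cov}(\mu, x) + (D_y - d_y)\,\overline{\mu^2} .$$

These equations are given above, under the heading Integral forms, and they are given in approximate form in the main text. The equation for dynamics of mean mutation rate is derived by multiplying Model 1 by $e^{y}$ and integrating:

$$\frac{\partial \bar{\mu}}{\partial t} = \frac{\partial}{\partial t}\iint e^{y} u(x, y, t)\, dx\, dy = \iint x e^{y} u(x, y, t)\, dx\, dy - \bar{x}\iint e^{y} u(x, y, t)\, dx\, dy + \iint e^{2y} M u(x, y, t)\, dx\, dy,$$

which is written more compactly as

$$\frac{\partial \bar{\mu}}{\partial t} = \overline{x\mu} - \bar{x}\,\bar{\mu} + \iint e^{2y} M u(x, y, t)\, dx\, dy, \text{ or}$$

$$\frac{\partial \bar{\mu}}{\partial t} = \operatorname{cov}(\mu, x) + \iint e^{2y} M u(x, y, t)\, dx\, dy . \qquad (1.17)$$

Next, we evaluate $\iint e^{2y} M u(x, y, t)\, dx\, dy$ term by term. The first term is

$$\iint e^{2y} D_x \frac{\partial^2}{\partial x^2} u(x, y, t)\, dx\, dy = D_x \int e^{2y} \int \frac{\partial^2}{\partial x^2} u(x, y, t)\, dx\, dy,$$

of which the inner integral is

$$\int \frac{\partial^2}{\partial x^2} u(x, y, t)\, dx = \frac{\partial}{\partial x} u(\infty, y, t) - \frac{\partial}{\partial x} u(-\infty, y, t) = 0 .$$

So the first term is zero. The second term is

$$\iint e^{2y} d_x \frac{\partial}{\partial x} u(x, y, t)\, dx\, dy = d_x \int e^{2y} \int \frac{\partial}{\partial x} u(x, y, t)\, dx\, dy,$$

of which the inner integral is

$$\int \frac{\partial}{\partial x} u(x, y, t)\, dx = u(\infty, y, t) - u(-\infty, y, t) = 0 .$$

So the second term is zero. The third term is

$$\iint e^{2y} D_y \frac{\partial^2}{\partial y^2} u(x, y, t)\, dx\, dy = D_y \iint e^{2y} \frac{\partial^2}{\partial y^2} u(x, y, t)\, dy\, dx,$$

of which the inner integral is

$$\int e^{2y} \frac{\partial^2}{\partial y^2} u(x, y, t)\, dy = 4\int e^{2y} u(x, y, t)\, dy$$

by integration by parts (see above). Inserting this inner integral back in the outer integral gives the third term of M:

$$4 D_y \iint e^{2y} u(x, y, t)\, dy\, dx = 4 D_y\, \overline{\mu^2},$$

where $\overline{\mu^2} = \iint e^{2y} u(x, y, t)\, dy\, dx$. The fourth term is

$$\iint e^{2y} (2D_y + d_y) \frac{\partial}{\partial y} u(x, y, t)\, dx\, dy = (2D_y + d_y) \iint e^{2y} \frac{\partial}{\partial y} u(x, y, t)\, dy\, dx,$$

of which the inner integral is

$$\int e^{2y} \frac{\partial}{\partial y} u(x, y, t)\, dy = -2\int e^{2y} u(x, y, t)\, dy$$

by integration by parts. Inserting this inner integral back in the outer integral gives the fourth term of M:

$$-2(2D_y + d_y) \iint e^{2y} u(x, y, t)\, dy\, dx = -2(2D_y + d_y)\,\overline{\mu^2} .$$

Finally, the fifth term of M is

$$\iint e^{2y} (D_y + d_y)\, u(x, y, t)\, dx\, dy = (D_y + d_y) \iint e^{2y} u(x, y, t)\, dy\, dx = (D_y + d_y)\,\overline{\mu^2} .$$

Now we evaluate:

$$\iint e^{2y} M u(x, y, t)\, dx\, dy = 0 + 0 + 4 D_y\,\overline{\mu^2} - 2(2D_y + d_y)\,\overline{\mu^2} + (D_y + d_y)\,\overline{\mu^2} = (D_y - d_y)\,\overline{\mu^2},$$

and insert this into (1.17) to obtain:

$$\frac{\partial \bar{\mu}}{\partial t} = \operatorname{cov}(\mu, x) + (D_y - d_y)\,\overline{\mu^2} . \qquad (1.18)$$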
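The combination of coefficients in (1.18) depends only on the y-part of M, so it can be checked with a one-dimensional quadrature. The sketch below assumes a Gaussian profile h(y) for the y-dependence of u and hypothetical values of $D_y$ and $d_y$; it compares the integral of $e^{2y}$ times the y-part of M applied to h against $(D_y - d_y)$ times the integral of $e^{2y} h$.

```python
import math

D_y, d_y = 2.75e-3, -4.5e-3  # hypothetical coefficients
y0, s_y = -3.0, 0.5          # hypothetical centre and width of the profile


def h(y):
    """Assumed y-profile of u: a Gaussian density."""
    return math.exp(-0.5 * ((y - y0) / s_y) ** 2) / (s_y * math.sqrt(2 * math.pi))


def My_h(y):
    """The y-part of M applied to h, using the exact Gaussian derivatives."""
    val = h(y)
    h_y = -(y - y0) / s_y ** 2 * val
    h_yy = ((y - y0) ** 2 / s_y ** 4 - 1 / s_y ** 2) * val
    return D_y * h_yy + (2 * D_y + d_y) * h_y + (D_y + d_y) * val


def integrate(func, lo, hi, n=4000):
    """Midpoint rule on [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (i + 0.5) * step) for i in range(n))


lo, hi = y0 - 12 * s_y, y0 + 12 * s_y
second_moment = integrate(lambda y: math.exp(2 * y) * h(y), lo, hi)
lhs = integrate(lambda y: math.exp(2 * y) * My_h(y), lo, hi)
rhs = (D_y - d_y) * second_moment
print(lhs, rhs)
```

The two printed values should coincide to numerical precision, which is a quick way to confirm the sign of the $d_y$ contribution.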
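The equation for dynamics of mean fitness is derived by multiplying Model 1 by x and integrating:

$$\frac{\partial \bar{x}}{\partial t} = \frac{\partial}{\partial t}\iint x\, u(x, y, t)\, dx\, dy = \iint x^2 u(x, y, t)\, dx\, dy - \bar{x}\iint x\, u(x, y, t)\, dx\, dy + \iint x e^{y} M u(x, y, t)\, dx\, dy,$$

which is written more compactly as

$$\frac{\partial \bar{x}}{\partial t} = \overline{x^2} - (\bar{x})^2 + \iint x e^{y} M u(x, y, t)\, dx\, dy, \text{ or}$$

$$\frac{\partial \bar{x}}{\partial t} = \sigma_x^2 + \iint x e^{y} M u(x, y, t)\, dx\, dy . \qquad (1.19)$$

Next, we evaluate $\iint x e^{y} M u(x, y, t)\, dx\, dy$ term by term. The first term is

$$\iint x e^{y} D_x \frac{\partial^2}{\partial x^2} u(x, y, t)\, dx\, dy = D_x \int e^{y} \int x \frac{\partial^2}{\partial x^2} u(x, y, t)\, dx\, dy,$$

of which the inner integral is

$$\int x \frac{\partial^2}{\partial x^2} u(x, y, t)\, dx = \left[x \frac{\partial}{\partial x} u(x, y, t)\right]_{x=-\infty}^{\infty} - \int \frac{\partial}{\partial x} u(x, y, t)\, dx = -\int \frac{\partial}{\partial x} u(x, y, t)\, dx,$$

where $\int \frac{\partial}{\partial x} u(x, y, t)\, dx = u(\infty, y, t) - u(-\infty, y, t) = 0$. So the first term is zero. The second term is

$$\iint x e^{y} d_x \frac{\partial}{\partial x} u(x, y, t)\, dx\, dy = d_x \int e^{y} \int x \frac{\partial}{\partial x} u(x, y, t)\, dx\, dy,$$

of which the inner integral is

$$\int x \frac{\partial}{\partial x} u(x, y, t)\, dx = \left[x\, u(x, y, t)\right]_{x=-\infty}^{\infty} - \int u(x, y, t)\, dx = -\int u(x, y, t)\, dx .$$

Putting this inner integral back into the outer integral gives the second term of M:

$$-d_x \iint e^{y} u(x, y, t)\, dx\, dy = -d_x \bar{\mu} .$$

Integration of the next three terms gives

$$\iint x e^{y}\left[D_y \frac{\partial^2}{\partial y^2} u(x, y, t) + (2D_y + d_y)\frac{\partial}{\partial y} u(x, y, t) + (D_y + d_y)\, u(x, y, t)\right] dy\, dx = 0,$$

by integration by parts.

From the above term-by-term integrations, we have $\iint x e^{y} M u(x, y, t)\, dx\, dy = -d_x \bar{\mu}$, and equation (1.19) becomes:

$$\frac{\partial \bar{x}}{\partial t} = \sigma_x^2 - d_x \bar{\mu} . \qquad (1.20)$$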
References for Supporting Online Material
1. Shaver, A. C., Dombrowski, P. G., Sweeney, J. Y., Treis, T., Zappala, R. M. & Sniegowski, P. D. (2002) Genetics 162, 557-566.
2. Sniegowski, P. D., Gerrish, P. J. & Lenski, R. E. (1997) Nature 387, 703-705.
3. Eigen, M. (2002) PNAS 99, 13374-13376.
4. Nowak, M. & Schuster, P. (1989) J. Theor. Biol. 137, 375-395.
5. Tsimring, L. S., Levine, H. & Kessler, D. A. (1996) Physical Review Letters 76, 4440-4443.
6. Rouzine, I. M., Wakeley, J. & Coffin, J. M. (2003) PNAS 100, 587-592.
7. Fisher, R. A. (1930) The Genetical Theory of Natural Selection (Clarendon, Oxford).
8. Frank, S. A. (1995) Journal of Theoretical Biology 175, 373-388.
9. Price, G. R. (1970) Nature 227, 520-521.
10. Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1992) Numerical Recipes in Fortran (Cambridge University Press).
11. Ewens, W. J. (1979) Mathematical Population Genetics (Springer-Verlag, New York, NY).
12. Haigh, J. (1978) Theoretical Population Biology 14, 251-267.
13. Boe, L., Danielson, M., Knudsen, S., Petersen, J. B., Maymann, J. & Jensen, P. R. (2000) Mutation Research 448, 47-55.
14. Drake, J. W., Charlesworth, B., Charlesworth, D. & Crow, J. F. (1998) Genetics 148, 1667-1686.
15. Suarez, P., Valcarcel, J. & Ortin, J. (1992) Journal of Virology 66, 2491-2494.
16. Johnson, T. (1999) Proc. R. Soc. Lond. Ser. B Biol. Sci. 266, 2389-2397.
17. Sniegowski, P. D., Gerrish, P. J., Johnson, T. & Shaver, A. (2000) BioEssays 22, 1057-1066.
18. Lynch, M., Blanchard, J., Houle, D., Kibota, T., Schultz, S., Vassilieva, L. & Willis, J. (1999) Evolution 53, 645-663.
19. Imhof, M. & Schlötterer, C. (2001) Proc. Natl. Acad. Sci. USA 98, 1113-1117.
20. Gerrish, P. J. & Lenski, R. E. (1998) Genetica 102/103, 127-144.
21. Rozen, D. E., de Visser, J. A. G. M. & Gerrish, P. J. (2002) Curr. Biol. 12, 1040-1045.
22. Sanjuán, R., Moya, A. & Elena, S. F. (2004) PNAS 101, 8396-8401.
23. Bloom, J. D., Silberg, J. J., Wilke, C. O., Drummond, D. A., Adami, C. & Arnold, F. H. (2005) PNAS 102, 606-611.
24. Lynch, M. & Gabriel, W. (1990) Evolution 44, 1725-1737.