Professional Documents
Culture Documents
A Statistical Test For Host-Parasite Coevolution
A Statistical Test For Host-Parasite Coevolution
A Statistical Test For Host-Parasite Coevolution
51(2):217–234, 2002
Abstract.—A new method, ParaFit, has been developed to test the signicance of a global hypothesis
of coevolution between parasites and their hosts. Individual host–parasite association links can also
Parasites generally form tight ecological 1998]), and a maximum likelihood-based test
associations with their hosts. Biologists have (Huelsenbeck et al., 1997). This more rig-
long assumed that the evolution of para- orous framework led to the publication of
sites is highly dependent on that of their several studies about host–parasite coevolu-
hosts (Barrett, 1986; Klassen, 1992). This has tion (e.g., Brooks and Glen, 1982; Hafner and
led to the establishment of several “rules,” Nadler, 1988, 1990; Klassen and Beverley-
such as Farenholz’s Rule (1913)—“parasite Burton, 1988; Demastes and Hafner, 1993;
phylogeny mirrors host phylogeny”—and Page, 1993b; Paterson et al., 1993; Hafner
Szidat’s Rule (1940)—“primitive hosts har- et al., 1994; Page and Hafner, 1996; Boeger
bour primitive parasites.” Klassen (1992) and Kritsky, 1997; Roy, 2001). This topic has
summarized the history of host–parasite gained considerable importance in evolu-
(H-P) coevolution studies through 1991. tionary biology (Futuyma and Slatkin, 1983;
The term “coevolution” was introduced by Brooks and McLennan, 1991; Thompson,
Ehrlich and Raven (1964) in a study on but- 1994; Page and Holmes, 1998).
teries and their plant hosts. In the present The above-mentioned methods treat dif-
paper, coevolution is dened as the extent ferently the different kinds of evolutionary
to which the host and parasite phylogenetic events occurring in a host–parasite associa-
trees are congruent, where congruence refers tion (Ronquist, 1997; Page and Charleston,
to the degree to which parasites and their 1998). Consequently, they can produce dif-
hosts occupy corresponding positions in the ferent results. The simultaneous evolution
phylogenetic trees. Perfect congruence is a of hosts and parasites can exhibit four
good indicator of host and parasite cospeci- main different kinds of events (Ronquist,
ation; a total absence of congruence, on the 1997; Charleston, 1998; Page and Charleston,
other hand, indicates random associations in 1998): cospeciation (simultaneous speciation
their evolutionary history. This denition of a host and its parasite), duplication (in-
corresponds to that of Brooks (1979, 1985) dependent parasite speciation), lineage sort-
and Klassen (1992), which refers to the ing (disappearance of a parasite lineage on a
macroevolutionary context. host lineage), and host switching (coloniza-
Until the work of Brooks (1977, 1981) at tion of a new host by a parasite). A draw-
the end of the 1970s, no rigorous analytical back common to all the above-mentioned
method had been developed to study H-P methods is that they are ideally designed
coevolution. Since then, several methods for for the one host-one parasite case; as the
doing so have been developed: Brooks parsi- numbers of hosts and parasites increases,
mony analysis (BPA; Brooks and McLennan, the problem becomes highly computer-
1991), component analysis (Component; intensive, making optimal solutions hard to
Page, 1993a), a method based on recon- nd. These methods aim at reconstructing a
ciled phylogenetic trees (TreeMap; Page, putative history of the host–parasite associa-
1994), event-based methods (e.g., TreeFitter tion by adequately mixing the types of events
[Ronquist, 1995, 1997]; Jungles [Charleston, and trying to minimize the overall cost of
217
218 S YSTEMATIC BIOLOGY VOL. 51
the estimated evolutionary scenario. They at- D ESCRIPTION OF THE PARAFIT T EST
tempt to answer the question, What is the S TATISTICS
most probable coevolutionary history of the ParaFit Statistic for the Global Test
host–parasite association, given the costs of of Host–Parasite Coevolution
the different events?
Statistically assessing a hypothesis of H-P
coevolution requires combining three types
PRINCIPLE OF THE T EST of information that are jointly necessary to
In the present paper, we want to know if describe the situation: the phylogeny of the
the data agree with a model of coevolution parasites, the phylogeny of the hosts, and the
observed H-P associations (Fig. 1). A phy-
parasites in rows and the hosts in columns, association between the two phyloge-
a 1 is written where a parasite has been em- nies. D can be obtained by the matrix
pirically found to be associated to a host and operation
0’s are used elsewhere.
If the reconstructed phylogeny for either D D C A’ B (1)
the hosts or the parasites, or both, is un-
certain (e.g., poorly resolved phylogeny or described by Legendre et al. (1997; see also
uncertainty among several almost equiva- Legendre and Legendre, 1998: Section 10.6),
lent trees), one can use, instead of patris- who coined the expression “fourth-corner
tic distances, a matrix of phylogenetic dis- statistics” to describe the individual values
For the test, consider the H-P link k from The second test statistic that we are propos-
matrix A. If we replace the value 1 that rep- ing for an individual link k is thus:
resents this link in matrix A with a 0 (no
link), we obtain a new matrix A(k). We now ParaFitLink2(k) D (trace ¡ trace(k))=
compute
(TraceMax ¡ trace) (6)
X¡ ¢
trace(k) D di j 2 (3)
This statistic cannot be used when the host
and parasite phylogenetic trees are identical
where the di j values now are those result-
because, in that situation, the denominator of
ing from the product D D C A(k)0 B. Using the
Eq. 6 is 0. This situation may happen in sim-
H-P links are not random but associate cor- ParaFitLink2(k)ref (Eq. 6), by using the
responding branches of the two evolution- value traceref saved from the global test of
ary trees; in that sense, they are critical coevolution.
to the overall H-P association. The testing 4. Permute at random the values within each
procedure is as follows: row of matrix A, independently from row
to row, using the same sequence of ran-
1. Compute matrix D by using Eq. 1. Com- dom permutations as were used during
pute statistic ParaFitGlobal by using Eq. the global test of H-P coevolution (above).
2, which provides the reference value Recompute matrix D by using Eq. 1 and
(ParaFitGlobal ref ) for the test. Save this trace(k)¤ by Eq. 3, and then recompute
random trees were transformed into matri- corresponding position on the tree. Next,
ces B (for the parasites) and C (for the hosts) a certain percentage of randomly located
by using principal coordinate analysis. The links were added to matrix A (without
statistics for the global test and for the indi- replication of existing links) by the same
vidual H-P links were calculated and tested procedure as in the type I error study. The
by using random permutations. simulation parameters were as follows:
A simulation study can never explore all 10 hosts, 10 parasites, 10 basic H-P links.
parameter combinations. The simulation ef- The number of random H-P links added
fort reported here for type I error involves the to the basic links was f0%, 10%, 20%, . . . ,
following parameter combinations, which 100%g, or f0, 1, 2, . . . , 10g supplementary
1. A random additive tree was generated, In all power simulations, there were 10,000
as described above, for each simulation, independent simulations per combination of
but here the same tree was used for the parameters, and 99 permutations per test.
parasites (matrix B) and the hosts (matrix Power was computed for various sig-
C); this is a condition for coevolution. A nicance levels ® D f0.01, 0.02, 0.03, 0.04,
H-P link was created between each host 0.05g. The 95% condence intervals were
and the parasite that was found in the not plotted because they would have made
2002 LEGENDRE ET AL.—HOST–PARASITE COEVOLUTION 223
the graphs difcult to read. We want to the same but the H-P links were positioned
identify the simulated situations in which at random.
power was low, medium, or high, to provide The power study is based on simulations
guidance for interpretation of real-case in which the null hypothesis is false by
studies. We also want to see if one of the construct; we are thus certain that the
statistics (ParaFitLink1 and ParaFitLink2) data used in these simulations represent
created for the tests of individual H-P links simulated cases of coevolution. When the
had higher power than the other, at least in data exactly conformed the hypothesis of
some situations. coevolution, power was maximum, the null
All the simulations described above hypothesis being rejected in nearly all cases
parasites and more links. The power of the without supplementary random links. With
global test increased with host and parasite ParaFitLink1, the probability of correctly
sample sizes, for given proportions of coevo- detecting a coevolutionary link (Fig. 5a)
lutionary links. was 1.5 to 2.5 times greater than that of
wrongly declaring a random link signicant
(Fig. 5c). The difference is not as great for
Tests of Individual Host–Parasite statistic ParaFitLink2 (compare Figs. 5b and
Association Links 5d). Accordingly, the ParaFitLink1 statistic is
In tests of individual H-P association links, preferable in this situation. (In Fig. 5d, a sin-
type I error was always correct for both statis- gle supplementary link added to a perfect
227
Downloaded from https://academic.oup.com/sysbio/article/51/2/217/1661471 by guest on 08 December 2023
228
FIGURE 6. Power of the test of individual H-P association in simulations in which a certain percentage of randomly-located links were removed from matrix A and replaced
with randomly located links. The simulations were for 10 hosts and 10 parasites. There were also 10 coevolutionary links; 0 to 10 of them were replaced with random links.
(a) Simulation results for one of the coevolutionary H-P links, ParaFitLink1 statistic. (b) ParaFitLink2 statistic. (c) Simulation results for a random H-P link replacing a
coevolutionary link, ParaFitLink1 statistic; (d) ParaFitLink2 statistic.
Downloaded from https://academic.oup.com/sysbio/article/51/2/217/1661471 by guest on 08 December 2023
FIGURE 7. Power of the test of individual H-P association in simulations in which the phylogenetic tree of the hosts and that of the parasites were partly similar and partly
random. There were 10 hosts, 10 parasites, and 10 links in this particular set of simulations. Abscissa: number of species of hosts and parasites (among 10) that shared the same
tree and were linked. (a) Simulation results for one of the coevolutionary H-P link, ParaFitLink1 statistic. (b) ParaFitLink2 statistic. (c) Simulation results for one of the random
H-P links, ParaFitLink1 statistic; (d) ParaFitLink2 statistic.
229
Downloaded from https://academic.oup.com/sysbio/article/51/2/217/1661471 by guest on 08 December 2023
230 S YSTEMATIC BIOLOGY VOL. 51
about one-half the total number of species; In the tests of individual H-P association
the effect on statistic ParaFitLink1 was less links, the null hypothesis is that the link be-
important than on statistic ParaFitLink2 ing tested is random. Results of tests of indi-
(Figs. 7c, 7d), making ParaFitLink1 seem vidual links are interpreted as follows:
preferable. The probability of correctly de-
tecting a coevolutionary link was greater ² When results of the global test and the
than that of wrongly declaring a random test of an individual link are both signif-
link signicant by about the same factor with icant, the test has detected a signicant
both statistics (compare Figs. 7a with 7c and H-P link, with a probability of type I er-
Figs. 7b with 7d); that is, the two statis- ror equal to ®.
1995; Page, 1996; Charleston, 1998). We This result is in agreement with previous
used Hasegawa–Kishino–Yano (HKY85) studies on this H-P model, which suggest an
corrected distances because of an observed important amount of cospeciation between
heterogeneity of nucleotide frequencies and pocket gophers and their chewing lice. When
the occurrence of a transition bias. Trees were we assessed the signicance of each H-P
reconstructed using the neighbor-joining association, using the ParaFitLink1 statistic
method (Saitou and Nei, 1987) available (tested at ® D 0.05), we found that 7 of the
in the program PAUP¤ (Swofford, 2001). 17 H-P links were not signicant (Fig. 8).
We used this method to quickly obtain an The signicant links display extensive co-
estimate of the phylogeny of the hosts and evolution between the hosts and parasites.
parasites; however, this should not be taken The discrepancies between the two phylo-
as an endorsement of neighbor-joining as genies are associated with the nonsignicant
the best method for exploring tree space. H-P links. If we remove those from Figure 8,
The trees that we obtained differ slightly and remove also the species that have no
from those published by Hafner et al. (1994) signicant H-P link, the result is two identi-
because we used distances corrected under a cal phylogenetic trees displaying perfect co-
different evolutionary model. This produced evolution for a subset of the gophers and lice
a topology slightly different from that pub- (Fig. 9).
lished in Hafner et al. (1994). The exact tree The ParaFit method does not intend
topology, however, is of little importance for to infer an evolutionary scenario explain-
illustrating our method in the present study. ing the observed historical association be-
Coevolution was rst tested by using the tween gophers and their lice—in contrast
global ParaFit statistic (H 0 : evolution of the to the TreeMap method, for example. It
hosts and parasites has occurred indepen- did allow us, however, to identify (1) the
dently). The result, that the null hypothesis hosts and parasites that have probably un-
must be rejected (permutational P D 0:001 dergone cospeciation and (2) the species
after 999 permutations), supports the alterna- most likely to have been subjected to host-
tive hypothesis (H1 ) of coevolution, revealed switching or sorting events (parasite extinc-
by the similarity of the two phylogenetic tion, or primary absence on daughter host
trees and the matrix of H-P association links. lineage)
232 S YSTEMATIC BIOLOGY VOL. 51
links are random. ParaFitLink2 cannot be CHAR LESTON, M. A. 1998. Jungles: A new solution to
used in perfect coevolutionary situations the host/parasite phylogeny reconciliation problem.
Math. Biosci. 149:191–223.
because its denominator is then zero. DEMASTES , J. W., AND M. S. HAFNER . 1993. Cospecia-
A FORTRAN program (PARAFIT : source tion of pocket gophers (Geomys) and their chewing lice
code, compiled versions for Macintosh and (Geomydoecus). J. Mammal. 74:521–530.
DOS, and program documentation) to carry EDG INGTON, E. S. 1995. Randomization tests, 3rd edi-
tion. Marcel Dekker, New York.
out the H-P coevolution test described in this EHR LICH, P. R., AND P. H. RAVEN. 1964. Butteries and
paper is available on the website <http:// plants: A study in coevolution. Evolution 18:586–608.
www.fas.umontreal.ca/biol/legendre/> as FARENHOLZ, H. 1913. Ectoparasiten und Abstam-
well as the site of the Society for Systematic mungslehre. Zool. Anz. 41:371–374.
FUTUYMA, D. J., AND M. SLATKIN. 1983. Coevolution.
PAGE, R. D. M. 1996. Temporal congruence revisited: RONQUIST , F. 1997. Phylogenetic approaches in co-
Comparison of mitochondrial DNA sequence diver- evolution and biogeography. Zool. Scr. 26:313–
gence in cospeciating pocket gophers and their chew- 322.
ing lice. Syst. Biol. 45:151–167. ROY, B. A. 2001. Patterns of association between cru-
PAGE, R. D. M., AND M. A. CHARLESTON. 1998. Trees cifers and their ower-mimic pathogens: Host jumps
within trees—Phylogeny and historical associations. are more common than coevolution or cospeciation.
Trends Ecol. Evol. 13:356–359. Evolution 55:41–53.
PAGE, R. D. M., AND M. S. HAFNER . 1996. Molecular phy- SAITOU, N., AND M. NEI . 1987. The neighbor-joining
logenies and host–parasite cospeciation : Gophers and method: A new method for reconstructing phyloge-
lice as a model system. Pages 255–270 in New uses for netic trees. Mol. Biol. Evol. 4:406–425.
new phylogenies (P. H. Harvey, A. J. Leigh Brown, SIDALL, M. E. 1996. Phylogenetic covariance probabil-
J. Maynard Smith, and S. Nee, eds.). Oxford Univ. ity: Condence and historical associations. Syst. Biol.
Press, New York. 45:48–66.