Oksanen Minchin 2002 Along Ecological Gradients
www.elsevier.com/locate/ecolmodel
Abstract
The shape of species’ responses along ecological gradients has important implications for both continuum theory and
community analysis. Most current theories and analytical models in community ecology assume that responses are
unimodal and symmetric. However, interactions between species and extreme environmental stress may cause skewed
or non-unimodal responses. To date, statistical tools for evaluating response shapes have been either inappropriate,
inefficient or biased. Using a data set on vascular plant distributions along an elevation gradient, we show that
Huisman–Olff–Fresco (HOF) models are an effective method for this purpose, allowing models of various forms
(skewed, symmetric, plateau, monotonic) to be tested for adequacy. HOF modeling was compared with alternative
methods for response fitting, including Gaussian responses fitted as generalized linear models (GLMs), generalized additive
models (GAMs) and Beta functions with fixed or estimated endpoints. In our data set, skewed and plateau responses
are less common than symmetric ones. Less than half of the species have skewed or plateau responses that cannot be
adequately modeled by Gaussian models. We show that Beta function models with fixed endpoints are biased,
confounding skewness and the location of the mode and should not be used to test response shapes. Beta models with
estimated endpoints are fairly consistent with other models. GAMs cannot provide clear tests of skewness or kurtosis
of response curves, though we show that GAMs, in general, confirmed the shapes chosen by HOF modeling. We
provide free software for fitting HOF models and encourage further applications to community data collected along
different types of ecological gradients.
© 2002 Elsevier Science B.V. All rights reserved.
very scanty evidence and its canonical status is unfounded. Indeed, theory predicts that response shapes should differ among gradient types (Austin and Smith, 1989) or gradient locations (Austin and Gaywood, 1994). In addition, species interactions may change the response shape even if the fundamental response were symmetric (Austin et al., 1990). Therefore, the analysis of response shape is of great theoretical interest (Austin, 1999). Moreover, it has a practical interest for ecologists because of its implications for ordination methodology. Correspondence analysis (CA) and its derivatives have been justified using the Gaussian model (ter Braak, 1985), though the validity of justification is open to debate (Austin, 1999, 2002). The main alternative to CA-based methods, non-metric multidimensional scaling, appears to be robust to variation in response shapes (Minchin, 1987a).

The symmetric Gaussian response function has been the only ecological response model for which the parameters can be estimated effectively (ter Braak and Barendregt, 1986; Jongman et al., 1987). Although the evidence for the Gaussian model can be criticized strongly (Austin, 1976, 1980, 1999, 2002) all the methods proposed so far to test the significance of skewness of the response have been either inadequate, inefficient or biased: visual judgment of skewness (Økland, 1986) is subjective; third-degree polynomials (Austin et al., 1990) have unrealistic shapes; Beta functions (Austin et al., 1994) confound the location of the maximum and the skewness if the response endpoints are fixed (Oksanen, 1997); and the shape of smoothed generalized additive models (GAM) (Hastie and Tibshirani, 1990; Yee and Mitchell, 1991; Bio et al., 1998) must be assessed subjectively. However, Huisman et al. (1993) have proposed a set of hierarchical models which include a skewed response that can be statistically tested against a symmetric response. They assume that these models cannot be estimated using maximum likelihood (Huisman et al., 1993), but we found that this can be done and so it is possible to use more realistic error distribution for the observations than the originally implied normal distribution and to test statistically whether a response is significantly skewed, or whether symmetric or monotonic responses could be used.

In this paper, we study the suitability of Huisman–Olff–Fresco models (HOF) for detecting the shape of species responses along gradients. We compare HOF models against symmetric Gaussian responses, variable Beta responses and non-parametric GAM.

2. Material and methods

2.1. Data set

We analyzed the response patterns of vascular plants along an altitude gradient on the Mt. Field Plateau, Tasmania, which was earlier used to study the shapes of response surfaces and some other properties of community patterns (Minchin, 1989). Vegetation was sampled with equally spaced sample plots along transects with random starting locations. We stratified the data by soil drainage so that interactions with this second gradient would not distort response shapes. We studied only the well-drained subset in this paper and fitted response models for all species, which occurred on 20 or more sites in this subset. Coverage of sample plots was dense and even in this subset, as is obvious in the graphics in this article. This gave us a data set of 167 sites and 52 species with an altitude range 930–1380 m. We analyzed the data as presence/absence, or binomial with denominator m = 1.

2.2. Bell-shaped or Gaussian response model

The Gaussian model defines a symmetric, bell-shaped response μ along gradient x with three interpretable parameters: location of the optimum u, height of the response h and tolerance or width of the response t (Fig. 1):

$$\mu = h \exp\left[-\frac{(x - u)^2}{2t^2}\right] \qquad (1)$$

Instead of direct estimation of Gaussian parameters, it is customary to fit an equivalent polynomial model (ter Braak and Looman, 1986):
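As a hedged illustration of this equivalence: on the log scale, Eq. (1) is a second-degree polynomial in x, log μ = b0 + b1 x + b2 x² with b2 < 0, so u, h and t can be recovered from the polynomial coefficients. The sketch below assumes a log link for simplicity; with presence/absence data a logit link would normally be used, in which case the same formulas give u and t while the height applies on the logit scale. The function names and the coefficient names b0, b1, b2 are illustrative assumptions, not taken from the paper.

```python
import math


def gaussian_response(x, u, h, t):
    """Gaussian response of Eq. (1): optimum u, height h, tolerance t."""
    return h * math.exp(-((x - u) ** 2) / (2.0 * t ** 2))


def gaussian_to_poly(u, h, t):
    """Coefficients (b0, b1, b2) of log(mu) = b0 + b1*x + b2*x**2."""
    b2 = -1.0 / (2.0 * t ** 2)
    b1 = u / t ** 2
    b0 = math.log(h) - u ** 2 / (2.0 * t ** 2)
    return b0, b1, b2


def poly_to_gaussian(b0, b1, b2):
    """Recover (u, h, t) from the polynomial; requires b2 < 0 (a maximum)."""
    if b2 >= 0:
        raise ValueError("b2 must be negative for a unimodal response")
    u = -b1 / (2.0 * b2)
    t = math.sqrt(-1.0 / (2.0 * b2))
    h = math.exp(b0 - b1 ** 2 / (4.0 * b2))
    return u, h, t


# Round trip with made-up values on an altitude-like scale (metres).
b0, b1, b2 = gaussian_to_poly(u=1100.0, h=0.8, t=60.0)
u, h, t = poly_to_gaussian(b0, b1, b2)
```

If b2 is estimated as non-negative, the fitted curve has no maximum and the Gaussian parameters are not defined, which is why the conversion raises an error in that case.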
J. Oksanen, P.R. Minchin / Ecological Modelling 157 (2002) 119–129
that α and γ are mainly parameters of response skewness (Austin and Nicholls, 1997). If α = γ, the response is symmetric (and the optimum in the middle of the range), implying the model:

$$\log(\mu) = \log(k) + \alpha[\log(x - p_1) + \log(p_2 - x)] \qquad (5)$$

with a single explanatory variable log(x − p1) + log(p2 − x). Eqs. (4) and (5) define nested GLMs and can be used to test the null hypothesis of symmetry, α = γ.

The selection of endpoints has a great effect on fitted response shapes (Fig. 2) and Oksanen (1997) suggested that they should be estimated together with other model parameters, but suspected this to be difficult. However, we found that the simultaneous fitting of all five parameters is relatively simple, but has to be done with non-linear maximum likelihood regression instead of more standard GLM.

We fitted Beta responses in two alternative ways. Firstly, we followed Austin and Nicholls (1997) and fixed the endpoints before fitting the response as a GLM (Eq. (4)). The Beta function is undefined beyond the endpoints p1 and p2 and therefore, we pushed these parameters somewhat outwards from the most extreme occurrences. The degree of pushing has a great influence in the response shapes (Fig. 2). We used 50 m, since it produced better fitting curves that were more similar to other response models than did a smaller extension. Austin and Meyers (1996), Austin and Nicholls (1997) similarly pushed the endpoints outwards by a fixed number of zero-observations. The second alternative was estimation of p1 and p2 together with other parameters. In this case, the parameters p1 and p2 are selected to give a good fit for the observations and they have nothing to do with the expected range of species (see Fig. 2). In both cases, we used Eq. (4) with logistic link function instead of the strict form (Eq. (3)), since we had binary responses.

Beta response function is unimodal only if α > 0 and γ > 0 and so it can define monotone or inverted responses (no optimum, but a minimum). In this article, we interpreted the response shape as monotone if it was strictly monotone within observations. Thus monotone responses include cases where either α or γ is negative, or where the optimum lies outside the studied range.

2.4. Huisman–Olff–Fresco or HOF model

Huisman et al. (1993) proposed a set of five hierarchic models for species response. The most complex model defines a skewed response function and can be written in slightly modified form:

$$\mu = M \, \frac{1}{1 + \exp(a + bx)} \, \frac{1}{1 + \exp(c - dx)} \qquad (6)$$

Here, μ is the expected value, which is dependent on the known values of the gradient (x), maximum possible value (M) and four estimated parameters of the function (a, b, c, d). This model can be simplified into four other models by aliasing some parameters or fixing them to constant values (Table 1). These simplified models define symmetric, plateau type or monotone responses (Fig. 3).

Huisman et al. (1993) presented only a least-squares method for fitting these models and suggested that maximum likelihood estimation cannot be used. However, the use of maximum likelihood is fairly simple (e.g. Venables and Ripley, 1999). The maximum likelihood estimates can be found by maximizing a log-likelihood function instead of minimizing the squared residuals. The maximum likelihood function used in the HOF model is non-linear in parameters a, b, c, d and so these must be estimated using iterative non-linear methods (e.g. Venables and Ripley, 1999).

Table 1
Simpler HOF models can be derived from the most complex model V (Eq. (6)) by fixing some parameters to constant values

Model             Parameters
V    Skewed       a   b   c   d
IV   Symmetric    a   b   c   b
III  Plateau      a   b   c   0
II   Monotone     a   b   0   0
I    Flat         a   0   0   0

Letters are estimated parameters and zeros are fixed parameters; in model IV the fourth parameter is aliased to b.
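The HOF response and its likelihood for presence/absence data can be sketched as follows. This is a minimal illustration and not the authors' software: it assumes the signs a + bx and c − dx in Eq. (6), takes M = 1 for binary data, and covers only model V and the symmetric model IV (d aliased to b). The function names are hypothetical, and the numerical optimizer needed to minimize the negative log-likelihood is deliberately left out.

```python
import math


def _inv_one_plus_exp(z):
    """Return 1 / (1 + exp(z)), clamping z to avoid floating-point overflow."""
    z = max(-700.0, min(700.0, z))
    return 1.0 / (1.0 + math.exp(z))


def hof_v(x, a, b, c, d, M=1.0):
    """Skewed HOF model V, following Eq. (6)."""
    return M * _inv_one_plus_exp(a + b * x) * _inv_one_plus_exp(c - d * x)


def hof_iv(x, a, b, c, M=1.0):
    """Symmetric HOF model IV: model V with d aliased to b."""
    return hof_v(x, a, b, c, b, M)


def neg_log_lik(params, xs, ys, model=hof_v):
    """Negative Bernoulli log-likelihood (binomial with denominator 1)."""
    eps = 1e-12  # keeps log() finite when mu reaches 0 or 1
    total = 0.0
    for x, y in zip(xs, ys):
        mu = min(max(model(x, *params), eps), 1.0 - eps)
        total -= y * math.log(mu) + (1 - y) * math.log(1.0 - mu)
    return total


# Made-up parameters: model IV peaks at x0 = (c - a) / (2 * b).
a, b, c = -60.0, 0.05, 50.0
x0 = (c - a) / (2.0 * b)
xs = [950.0, 1050.0, 1100.0, 1150.0, 1300.0]
ys = [0, 1, 1, 1, 0]
nll_iv = neg_log_lik((a, b, c), xs, ys, model=hof_iv)
```

Because model IV is nested in model V, twice the difference between their minimized negative log-likelihoods gives the deviance difference that can be used to test whether the extra parameter d is needed.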
Table 2
The most parsimonious HOF models cross-tabulated against the most parsimonious Beta response models

V       11     9    1    1     7    3    1
IV      22     4   17    1    12   10    -
III      4     -    3    1     -    4    -
II      12     1    2    9     -    4    8
Total   49    14   23   12    19   21    9
The endpoints were fixed 50 m beyond the most extreme presences. One species had an inverted response (no optimum but a
minimum) in all models and two species with flat response (model I) were flat or inverted in all other models and these three are not
tabulated. The total number of species before these omissions was 52.
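The fixed-endpoint Beta fit can be sketched as below, to show why fixing the endpoints ties the location of the mode to the skewness parameters. The sketch assumes the log-linear predictor log(k) + α log(x − p1) + γ log(p2 − x), inferred from Eq. (5), combined with a logistic link; all function names and the example values are hypothetical.

```python
import math


def beta_endpoints(presence_x, push=50.0):
    """Endpoints fixed `push` units beyond the most extreme presences."""
    return min(presence_x) - push, max(presence_x) + push


def beta_linear_predictor(x, log_k, alpha, gamma, p1, p2):
    """Linear predictor of the Beta response; defined only for p1 < x < p2."""
    if not p1 < x < p2:
        raise ValueError("Beta response is undefined beyond the endpoints")
    return log_k + alpha * math.log(x - p1) + gamma * math.log(p2 - x)


def beta_logit_response(x, log_k, alpha, gamma, p1, p2):
    """Probability of occurrence under a logistic link."""
    eta = beta_linear_predictor(x, log_k, alpha, gamma, p1, p2)
    return 1.0 / (1.0 + math.exp(-eta))


def beta_mode(alpha, gamma, p1, p2):
    """Location of the maximum when alpha > 0 and gamma > 0."""
    return (alpha * p2 + gamma * p1) / (alpha + gamma)


p1, p2 = beta_endpoints([1000.0, 1040.0, 1200.0])
mode_symmetric = beta_mode(2.0, 2.0, p1, p2)  # alpha = gamma: mid-range
mode_skewed = beta_mode(3.0, 1.0, p1, p2)     # alpha != gamma: mode shifts
```

With p1 and p2 fixed, the mode depends only on the ratio of α to γ, so a species whose optimum is off-centre between the endpoints is forced to have α ≠ γ whether or not its response is actually skewed.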
Fig. 5. Examples of species which were found to have symmetric bell-shaped responses by HOF models: HOF model IV, Gaussian bell, ‘skewed’ Beta and GAM (df ≤ 3) produced nearly identical responses, but ‘symmetric’ Beta differs sharply from all others.
species classified as HOF model II (monotone response) were very similar in all models.

Species classified as skewed responses (HOF model V) or plateau responses (HOF model III) are more interesting. These responses cannot be adequately fitted by Gaussian responses or Beta responses with fixed endpoints and therefore, we compare them against GAM responses and Beta with estimated endpoints. Our comparison was biased in favor of model IV, because we studied model III only if model IV failed. In fact, model III was an alternative parsimonious choice for nine species out of 22 model IV species (Table 2) and for five of these it was indeed a better choice (lower deviance) than model IV. We used this rigorous but discriminatory strategy because we think that a plateau cannot be a global response model, but valid only for a limited range on a gradient.

It seems that GAM models with df ≤ 3 are usually more symmetric than skewed HOF responses (Fig. 6). However, GAM models with df ≤ 6 usually confirm the main peak of the skewed HOF response, but deviate from it in other parts, being usually less regular (Fig. 6). Either the true response is less regular than assumed in parametric modeling, or HOF has a sharper peak than GAM is allowed to have. In general, HOF V models are within confidence limits of GAMs, so that the response looks credible (Fig. 6). Similarly, Beta responses with estimated endpoints were fairly consistent with HOF models (Table 2).

Plateau responses (HOF model III) are uncommon and were found only for four species. All these are within the confidence intervals of GAM responses (Fig. 7). GAM responses were not as regular as the simple parametric plateaux of HOF model III. In some cases the GAM response resembles the skewed HOF model V, which was not retained in backwards selection.

4. Discussion

Symmetric, bell-shaped responses were the most common type at Mt. Field and only about one fifth of the responses were clearly skewed. About 42% of species could have been fitted using Gaussian response models. At first sight, these conclusions appear to be in contrast with Minchin (1989) who wrote that “Although the majority of response surfaces appear to be unimodal, only 45% are symmetric”. Actually, despite important differences between the two studies (the original paper examined mean cover responses, rather than probabilities of occurrence and fitted two-dimensional response surfaces to both altitude and soil drainage), the observed proportion of symmetric bell responses was similar in both analyses. However, in this paper, we found a larger number of monotone responses (HOF models III, II and I). Minchin (1989) used non-parametric smoothing and had to interpret the shapes subjectively without the help of confidence intervals. Obviously his
Fig. 6. GAM (df ≤ 6) (solid line) with approximate 95% confidence intervals against GAM (df ≤ 3) (dotted line) and HOF V responses (broken line).
smoothing was not sufficient: in present parlance, he used too many degrees of freedom in fitting. Consequently, he emphasized the irregularity of many responses (regarding 22% of species as having multimodal surfaces), whereas our present analyses did stronger smoothing and found most responses to be fairly regular. Nevertheless, the proportion of non-Gaussian responses is so large that it is hardly possible to say that the Gaussian is the universal and general response type. In particular, one third (11/33) of unimodal responses (models V and IV) are skewed rather than symmetric.

Austin et al. (1994) have proposed strict rules for selecting the species that can be studied for their response shapes. For instance, there should
Fig. 7. HOF model III responses (broken line) and GAM (df ≤ 3) (solid line) with approximate 95% confidence intervals.
be at least 100 sites with zero observations beyond the extreme presence. Such rules may be applicable to huge data sets at the subcontinental scale, but they are hardly practical for ordinary vegetation surveys. Moreover, even in subcontinental data sets, the rule results in a biased selection of species, leaving out those with modes closer to the gradient extremes. Gradient limits may be natural, like the Mt. Field peak at 1380 m in our case, or they may be arbitrary, like the lower limit of 930 m in our case. Austin’s rule does not distinguish between these two cases. The main difference between our approaches is that we are mainly interested in the shape of the species response at the central area of species occurrence, whereas Austin seems more interested in species range and the shapes of the tails. An essential feature of response shape is the slope of decrease near the species optimum and this can be studied even with truncated responses.

Contrary to Austin and Nicholls (1997), we think that there is no general need to remove extreme zeros, as most response models are able to predict near-zero occurrence for these. The trimming of zero tails is necessary only for Beta responses with fixed endpoints, since the response is undefined beyond these points (Austin et al., 1994; Austin and Nicholls, 1997; Oksanen, 1997). Trimming of zeros may be necessary in GAM with several gradients, since GAMs fit the response for each dimension separately without regarding interactions (Hastie and Tibshirani, 1990). For Gaussian responses, HOF models or Beta models with freely estimated endpoints, there is no need to remove any extreme zeros, since these functions can fit these adequately.

Beta response functions with fixed endpoints failed badly in our analyses. After fixing the response endpoints, or the species range (Austin and Nicholls, 1997), the responses either fitted the data very badly or else failed to detect the appropriate shape because they preferred to find the location of the optimum instead of skewness (Oksanen, 1997). Indeed, Oksanen (1997) suggested that the response endpoints should not be fixed but estimated together with other parameters, so that the response curve could try to find both the skewness and the location of the optimum independently and in our case this really improved the applicability of Beta functions.

Fixing the endpoints at the location of the most extreme presences ignores the fact that the data are only a sample. A unimodal response with tapering tails implies that the probability of occurrence near the extremes will be very low. Even with a very large data set, it is, therefore, unlikely that a presence will be observed at the real limit of distribution. Beta response models may be very effective for simulating realistic response patterns along ecological gradients (Minchin, 1987b), but our results show that they are unsuitable for analyzing response shapes, as also concluded by Austin and Meyers (1996).

Any of the studied alternative models could be used for symmetric, bell-shaped responses, even the Beta response model with estimated ‘endpoints’. However, for other response types, Beta functions and Gaussian response models provide biased or inappropriate models. GAMs are usually very similar to the selected HOF models, but smoothness and regularity of shape are dependent on the number of degrees of freedom (Hastie and Tibshirani, 1990). We used generalized cross-validation for selecting the effective smoothness (Gu and Wahba, 1991). Hastie and Tibshirani (1990) found that this fails in several cases and produces responses that are too irregular. Therefore, we defined a fairly low maximum limit for degrees of freedom. However, for detecting skewness, a somewhat higher limit must be used, although this comes at the cost of an irregular, perhaps biased pattern in other parts of the response.

We analyzed a real data set and therefore, we cannot know the true underlying response shapes and so we do not know if HOF models actually were correct. However, we compared HOF models against GAMs and symmetric and monotone models against other alternatives and found consistent patterns. In most cases, the final selected HOF model was entirely within the approximate confidence limits of a GAM. The most notable differences were near the extremes of the gradient, where the GAM often flipped up, whereas the corresponding HOF model decreased asymptotically. However, the confidence bands in fitted
Heegaard, E., Hangelbroek, H.H., 1999. The distribution of Ulota crispa in relation to both dispersal- and habitat-related factors. Lindbergia 24, 65–74.
Huisman, J., Olff, H., Fresco, L.F.M., 1993. A hierarchical set of models for species response analysis. J. Veg. Sci. 4, 37–46.
Ihaka, R., Gentleman, R., 1996. R: a language for data analysis and graphics. J. Comput. Graph. Stat. 5, 299–314.
Jongman, R.H., ter Braak, C.J.F., van Tongeren, O.F.R., 1987. Data analysis in community and landscape ecology, Pudoc, Wageningen.
Kay, R., Little, S., 1987. Transformations of the explanatory variables in the logistic regression model for binary data. Biometrika 74, 495–501.
Leathwick, J.R., 1998. Are New Zealand’s Nothofagus species in equilibrium with their environment. J. Veg. Sci. 9, 719–732.
Lehmann, A., Overton, J.M., Leathwick, J.R., 2002. Generalized regression analysis and spatial predictions. Ecol. Model. 157 (2–3), 187–205.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models. Chapman and Hall, London, p. 125.
Minchin, P.R., 1987a. An evaluation of relative robustness of techniques for ecological ordinations. Vegetatio 69, 89–107.
Minchin, P.R., 1987b. Simulation of multidimensional community patterns: towards a comprehensive model. Vegetatio 71, 145–156.
Minchin, P.R., 1989. Montane vegetation of the Mt. Field massif, Tasmania: a test of some hypotheses about properties of community patterns. Vegetatio 83, 97–110.
Økland, R.H., 1986. Rescaling of ecological gradients. The effect of scale on symmetry of species response curves. Nordic J. Bot. 6, 671–677.
Oksanen, J., 1997. Why the beta-function cannot be used to estimate skewness of species responses. J. Veg. Sci. 8, 147–152.
Oksanen, J., Läärä, E., Huttunen, P., Meriläinen, J., 1988. Estimation of pH optima and tolerances of diatoms in lake sediments by the methods of weighted averaging, least-squares and maximum likelihood, and their use for the prediction of lake acidity. J. Paleolimnol. 1, 39–49.
Oksanen, J., Läärä, E., Meriläinen, J., Warner, B.G., 2001. Confidence intervals for the optimum in the Gaussian response function. Ecology 82, 1191–1197.
ter Braak, C.J.F., 1985. Correspondence analysis of incidence and abundance data: properties in terms of a unimodal response model. Biometrics 41, 859–873.
ter Braak, C.J.F., Barendregt, L.G., 1986. Weighted averaging of species indicator values: its efficiency in environmental calibration. Math. Biosci. 78, 57–72.
ter Braak, C.J.F., Looman, C.W.N., 1986. Weighted averaging, logistic regression and the Gaussian response model. Vegetatio 65, 3–11.
Venables, W.M., Ripley, B.D., 1999. Modern Applied Statistics with S-PLUS, third ed. Springer, Heidelberg.
Whittaker, R.H., 1978. Direct gradient analysis. In: Whittaker, R.H. (Ed.), Ordination of Plant Communities. Junk, The Hague, pp. 7–50.
Wood, S.N., 2000. Modelling and smoothing parameter estimation with multiple quadratic penalties. J. Roy. Stat. Soc. B62, 413–428.
Wood, S.N., Augustin, N.H., 2002. Improving GAMs for environmental modelling using GCV and penalized regression splines. Ecol. Mod., This Volume, in this issue.
Yee, T.W., Mitchell, N.D., 1991. Generalized additive models in plant ecology. J. Veg. Sci. 2, 587–602.
Zaniewski, A.E., Lehmann, A., Overton, J.M., 2002. Predicting species distribution using presence-only data: A case study of native New Zealand ferns. Ecol. Model. 157 (2–3), 259–278.