Performance of Species Richness Estimators Across Assemblage Types and Survey Parameters Reese 2014
1 Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, CO 80523, USA
2 US Forest Service, Rocky Mountain Research Station, Fort Collins, CO 80526, USA
ABSTRACT
Aim A raw count of the species encountered across surveys usually underestimates
species richness. Statistical estimators are often less biased. Nonparametric estimators
of species richness are widely considered the least biased, but no particular
estimator has consistently performed best. This is partly a function of estimators
responding differently to assemblage-level factors and survey design parameters.
Our objective was to evaluate the performance of raw counts and nonparametric
estimators of species richness across various assemblages and with different survey
designs.
Location We used both simulated and published field data.
Methods We evaluated the bias, precision and accuracy of raw counts and 13
nonparametric estimators using simulations that systematically varied assemblage
characteristics (number of species, species abundance distribution, total number of
individuals, spatial configuration of individuals and species detection probability),
sampling effort and survey design. Results informed the development of an esti-
mator selection framework that we evaluated with field data.
Results When averaged across assemblages, most nonparametric estimators were
less negatively biased than a raw count. Estimators based on the similarity of
repeated subsets of surveys were most accurate and their accumulation curves
appeared to reach asymptotes fastest. Number of species, species abundance distri-
bution and effort had the largest effects on performance, ultimately by affecting the
proportion of the species pool contained in a sample. Our estimator selection
framework showed promising results when applied to field data.
Main conclusions A raw count of the number of species in an area is far from the
best estimate of true species richness. Nonparametric estimators are less biased.
Newer, largely unused estimators perform better than better-known and longer-
established counterparts under certain conditions. Given that there is generally a
trade-off between bias and precision, we believe that estimator variance, which is
often not reported when presenting species richness estimates, should always be
included.
Keywords
Biodiversity, community ecology, nonparametric estimator, sample coverage, selection framework, simulation.
*Correspondence: Gordon C. Reese, Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, CO 80523, USA. E-mail: greese126@gmail.com
frequently used (Moreno et al., 2006; Drake, 2007; Lopez et al., 2012).
The number of species observed (Sobs) is the most straightforward estimate of the true number of species in an assemblage (Strue), but it has a known negative bias and is based on the naïve assumption that all species are detected with a probability equal to one (Palmer, 1990; Nichols et al., 1998). Fortunately, species distribution and abundance patterns can be used to inform statistically derived estimators of Strue. All estimators have underlying assumptions, but the statistical estimators of SR are typically less negatively biased than Sobs (Baltanás, 1992; Chiarucci et al., 2003; Walther & Moore, 2005).
Besides Sobs, there are three categories of SR estimators. One category includes approaches for extrapolating a species accumulation curve to an asymptote, often using a negative exponential model (Holdridge et al., 1971), the Michaelis–Menten equation (Michaelis & Menten, 1913) or a power model (Arrhenius, 1921; Tjørve, 2009). A second category includes parametric methods that involve: (1) interpolating under a distribution fit to abundance data or (2) applying an estimator that assumes that all species are equally detectable. A third category includes nonparametric estimators, which make no assumption about the underlying distribution of the data.
The performance of SR estimators has largely depended on whether underlying assumptions are met, making factors such as the species abundance distribution, species detection probability (p), survey effort and Strue (Baltanás, 1992; Keating & Quinn, 1998; Brose et al., 2003) important to understanding their performance. Ultimately, no estimator has consistently performed best, but the nonparametric estimators have generally performed better than the other categories (see Table 1 in Cao et al. 2004; Table 3 in Walther & Moore 2005; Table 1 in Reese 2012). We thus focus on evaluating nonparametric estimators, including several that are relatively untested (see Burnham & Overton, 1978; Pledger, 2000; Cao et al., 2001; Cao et al., 2004), by comprehensively and consistently varying factors that can compromise estimator assumptions.
Most of the nonparametric SR estimators can be categorized as those that model either heterogeneity in p (commonly denoted as model Mh; see Otis et al., 1978) or the similarity between replicate subsets of the survey data. Many in the first group were developed for population estimation using mark–recapture data under an assumption of geographic and demographic closure, an assumption that also applies to all SR estimators. Brose et al. (2003) detailed additional challenges that arise when an Mh population estimator is used to estimate Strue. First, differences in detectability between species can be larger and more difficult to model than those between the individuals composing a population of one species. Second, when an Mh estimator is used to estimate population size from encounter histories of individuals in surveys repeated at the same location, an assumption is that p will vary across individuals but remain constant over time. When those estimators are instead used to estimate Strue from encounter histories of species in surveys replicated at different locations, the comparable assumption is that detection probabilities vary among species but are constant across space. This assumption can be violated when distributions are spatially heterogeneous, which occurs regularly in natural systems (Legendre, 1993; Deblauwe et al., 2008).
Our primary objective was to evaluate nonparametric SR estimators across assemblages that were systematically varied both in their attributes (number of species, total number of individuals, species abundance distribution, spatial aggregation and species detection probability) and how they were sampled (effort and survey design). Given the large number of factors and the benefits of knowing the true factor values, we relied heavily on simulated data, but also evaluated estimators using field data (see the Supporting Information). An important secondary objective was to expand the estimator selection approach proposed by Brose et al. (2003). Their framework targeted estimator accuracy and based selection on the ratio of Sobs to the mean SR estimate, which is an estimate of sample coverage (sc) or the fraction of a species pool represented in a sample (i.e. Sobs/Strue). Our contribution included additional selection criteria based on bias, precision and an estimate of sc based on Jaccard’s similarity coefficient.
Table 1 Levels of factors simulated to evaluate species richness estimators.
Description                                                    Symbol  Levels
Total (true) number of species                                 Strue   25; 100; 500
Total abundance across all species                             N       6250; 12500
Species abundance distribution                                 Abund   Log-series; Log-normal
Spatial relationship between individuals of a species          Config  Aggregated; Hyper-dispersed; Random*
Mean detection probability of three species abundance groups   p       (0.5, 0.5, 0.5); (0.9, 0.9, 0.9); (0.5, 0.7, 0.9); (0.9, 0.7, 0.5)
Spatial arrangement of surveyed grid cells                     Design  Random; Transect (linear)†
Amount of landscape surveyed                                   Effort  1% (100 cells); 5% (500 cells)
*Individuals were spaced less regularly (aggregated) or more evenly (hyperdispersed) than expected by chance (random).
†Surveys were configured randomly or as random linear transects of 50 grid cells.
METHODS
Simulation procedure
We evaluated the performances of Sobs and 13 nonparametric SR estimators across simulated assemblages using the program
586 Global Ecology and Biogeography, 23, 585–594, © 2014 John Wiley & Sons Ltd
SimAssem (Reese et al., 2013). We systematically varied the total number of species, total assemblage abundance, type of species abundance distribution, spatial structure of individual occurrences within a species and detection probability of species abundance groups in a factorial design (Table 1). Similarly, we simulated a sampling procedure in which we varied the number of surveyed grid cells and how they were selected. Based on our review of the literature and past experience, we selected these factors and factor levels to cover a wide range of realistic scenarios. We generated 42 replicates of each factor combination (24,192 total simulations) as a compromise between computer processing time and having a sufficient number of replicates to ensure an adequate sample size after we removed replicates involving estimation issues (e.g. lack of convergence for the Mixture estimator).
We simulated assemblages with 25, 100 and 500 species (factor Strue), populating them with a total of 6250 and 12,500 individuals (factor N). These values restricted the average number of individuals encountered per species to the range of several datasets (see Williams, 1939; Lewis & Taylor, 1967; Dallmeier et al., 1991). To simulate species abundance patterns (factor Abund), we used log-normal (Preston, 1948) and log-series distributions (Fisher et al., 1943). SimAssem generated: (1) log-normal distributions by drawing a random log-normal variate (μ = 0, σ = 1) for each species, dividing each variate by the sum of all the variates and rescaling by multiplying each variate by N and (2) log-series distributions by calculating the number of species to populate with z individuals, where z = 1, 2, . . . , N (see Magurran, 2004). For comparison, we also simulated evenly distributed assemblages with particulate-niche distributions (MacArthur, 1957) where each individual was randomly assigned to a species. Given that: (1) MacArthur considered this distribution unsatisfactory (Magurran, 2004) and (2) it is infrequently referenced in the literature, we provide the results in Table S1 in Supporting Information (see also Reese, 2012).
We assigned x- and y-coordinates [0, 1] to each individual via three spatial configuration options available in SimAssem (factor Config). The random configuration located each individual with a pair of random uniform variates [0, 1]. For the hyperdispersed configuration, SimAssem assigned each individual a square territory with a linear dimension of 1/√ni, where ni is the abundance of species i. Territories were adjacent in the horizontal and vertical directions and collectively formed a grid across the entire landscape. Individuals were located a random distance and random direction from the bottom left corner of their territory. When a randomized location in an overlapping territory fell outside the top or right landscape boundary, as happened occasionally, the individual was randomly placed on the landscape. To determine the locations around which species could aggregate with the aggregated (centres equal abun) option, SimAssem generated ni random uniform variates (RUV) for each species i, and every RUV ≥ 0.98 increased by one the number of randomly placed aggregate centres, resulting in approximately one centre for every 50 individuals. Each individual of a species was then randomly allocated to a centre and located on the landscape when two conditions were met. First, 0.95d had to equal or exceed a RUV, where d is a randomly selected distance [0, 1]. Second, the individual’s location had to fall on or within landscape boundaries when placed distance d in a random direction (0–359°) from the selected centre.
Species-specific detection probabilities (factor p) were randomly drawn from beta distributions with expected means of 0.5, 0.7 and 0.9. To isolate the effect of p, we constrained variances to the range 0.010–0.015. For two factor levels, we drew each p from the same distribution, with an expected mean of 0.5 or 0.9 (α and β parameters equalled 10 and 10, or 4.5 and 0.5, respectively). We evaluated two additional factor levels where p varied as a function of abundance (see Selmi & Boulinier, 2004; Pagano & Arnold, 2009). We grouped species into thirds based on abundance and randomly selected abundance groups to accommodate the additional species since the tested levels of Strue are not divisible by three. For one factor level we assigned each species in the least, moderate and most abundant groups a p that was randomly drawn from a distribution with an expected mean of 0.5, 0.7 (α = 14, β = 6) or 0.9, respectively. We created another factor level by reversing the expected means of the abundance groups.
In SimAssem, every cell of an overlaid 100 × 100 grid is a potential survey site. We evaluated two levels of effort (factor Effort) by sampling either 100 (1%) or 500 (5%) grid cells. In addition, we used two survey designs (factor Design), random and a linear transect design. Each transect contained 50 adjacent cells in a randomly selected horizontal or vertical orientation and transects were added until the specified number of grid cells was surveyed. Previously surveyed grid cells intersected by a new transect were applied to the transect length, but were not double-counted. An individual was encountered when two conditions were met. First, a surveyed cell had to contain at least one individual. Second, a RUV, where one was generated for each individual in the cell, had to be ≤ p.
Species richness estimators
In addition to Sobs, we evaluated the performance of 13 estimators, where some are variants of others (see Table 2 for details and abbreviations). Those estimators belonging to the Mh class included two based on abundance patterns (i.e. the number of species with an exact number of individuals encountered) and nine based on incidence patterns (i.e. the number of species encountered in an exact number of surveys). We used the Rmark package (Version 1.9.5; Laake & Rexstad, 2008) in the program R (R Development Core Team, 2009) to generate Mixture estimates based on two groups using program MARK (Version 6.0; White & Burnham, 1999).
Two additional estimators, CY-1 and CY-2, are based on the similarity of two replicate subsets of surveys. A CY-1 estimate equals average SR across the replicate sets of surveys (SR̄) divided by the Jaccard coefficient (JC), JC = c/(a + b + c), where a and b are the numbers of species unique to each subset and c is the number of species common to both subsets. Thus, CY-1 is only calculable when individuals are encountered in two or more surveys. A CY-2 estimate equals the slope plus the inter-
G. C. Reese et al.
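As a rough illustration of the steps described under Simulation procedure, the following Python sketch draws log-normal abundances rescaled to N, draws species-specific detection probabilities from a beta distribution and applies the encounter rule (a random uniform variate ≤ p). It is not SimAssem code: the function names, the rounding of abundances to whole individuals and the fixed seed are assumptions made for illustration only.

```python
import random


def lognormal_abundances(n_species, total_n, rng):
    """One log-normal variate (mu = 0, sigma = 1) per species, rescaled to total_n."""
    variates = [rng.lognormvariate(0.0, 1.0) for _ in range(n_species)]
    total = sum(variates)
    # Assumption: fractional abundances are rounded to whole individuals.
    return [round(total_n * v / total) for v in variates]


def beta_mean_var(alpha, beta):
    """Mean and variance of a beta(alpha, beta) distribution."""
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, var


def detection_probabilities(n_species, alpha, beta, rng):
    """Species-specific detection probabilities drawn from one beta distribution."""
    return [rng.betavariate(alpha, beta) for _ in range(n_species)]


def encountered_species(cell_individuals, p, rng):
    """Species encountered in one surveyed cell.

    cell_individuals lists the species index of every individual in the cell;
    an individual is encountered when its random uniform variate is <= p.
    """
    return {s for s in cell_individuals if rng.random() <= p[s]}


rng = random.Random(1)  # fixed seed, for repeatability only
abundances = lognormal_abundances(25, 6250, rng)
p = detection_probabilities(25, 10, 10, rng)  # expected mean 0.5
print(sum(abundances), beta_mean_var(10, 10))
```

Calling beta_mean_var with the three parameter pairs given in the text, (10, 10), (14, 6) and (4.5, 0.5), returns means of 0.5, 0.7 and 0.9 with variances inside the stated 0.010–0.015 range.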
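The three performance measures named in the caption of Table 3 can be sketched as below. The exact formulae are not given in this part of the text, so the definitions here are assumptions: estimates are scaled by Strue, SME is the mean scaled error, SD is the (population) standard deviation of the scaled estimates and SMSE is the mean squared scaled error. Under these assumptions SMSE = SME² + SD², which agrees with the values reported in Table 3 (for ACE, 0.32² + 0.26² ≈ 0.17).

```python
from statistics import mean, pstdev


def performance(estimates, s_true):
    """Return (SME, SD, SMSE) for replicate estimates of one assemblage.

    Assumed definitions (not confirmed by the source):
      SME  = mean of (estimate - s_true) / s_true   -> bias
      SD   = population SD of estimate / s_true     -> precision
      SMSE = mean of ((estimate - s_true) / s_true)^2 -> accuracy
    """
    scaled = [e / s_true for e in estimates]
    errors = [s - 1.0 for s in scaled]
    sme = mean(errors)
    sd = pstdev(scaled)
    smse = mean([err ** 2 for err in errors])
    return sme, sd, smse


# Hypothetical replicate estimates for an assemblage with 100 species.
sme, sd, smse = performance([80, 90, 100, 110], 100)
print(sme, sd, smse)
```

Because bias and variance both enter SMSE, an estimator can rank well on bias and still rank poorly on accuracy when its estimates are highly variable.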
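The CY-1 calculation described under Species richness estimators (mean SR of two replicate subsets divided by the Jaccard coefficient) can be sketched as follows. How surveys are split into subsets and how results are combined across iterations are assumptions here: the sketch shuffles the surveys, splits them in half and averages the per-iteration ratios. The names cy1 and jaccard are hypothetical and this is not the SimAssem implementation.

```python
import random
from statistics import mean


def jaccard(subset_a, subset_b):
    """JC = c / (a + b + c) for two sets of species."""
    c = len(subset_a & subset_b)
    union = len(subset_a | subset_b)  # a + b + c
    return c / union if union else 0.0


def cy1(surveys, iterations=100, rng=None):
    """Return (CY-1 estimate, mean Jaccard coefficient).

    surveys is a list of sets, one set of encountered species per survey.
    Assumption: each iteration splits the shuffled surveys into two halves.
    Returns (0.0, 0.0) when no iteration yields shared species.
    """
    rng = rng or random.Random()
    estimates, coefficients = [], []
    for _ in range(iterations):
        order = list(surveys)
        rng.shuffle(order)
        half = len(order) // 2
        first = set().union(*order[:half])
        second = set().union(*order[half:])
        jc = jaccard(first, second)
        if jc == 0.0:
            continue  # not calculable without species common to both subsets
        mean_sr = (len(first) + len(second)) / 2
        estimates.append(mean_sr / jc)
        coefficients.append(jc)
    if not estimates:
        return 0.0, 0.0
    return mean(estimates), mean(coefficients)


estimate, mean_jc = cy1([{1, 2}, {2, 3}], rng=random.Random(0))
print(estimate, mean_jc)
```

In this two-survey example each subset holds two species and they share one of three, so every iteration gives JC = 1/3 and an estimate of 2 / (1/3) = 6.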
We estimated sc (ŝc) by averaging the Jaccard similarity coefficients across the 100 iterations used to calculate CY-1.
RESULTS
Simulations
The Mixture estimator failed to converge (i.e. it failed to produce an estimate) in 0.19% of the simulations and in as many as five replicates (i.e. simulations with identical factor levels). A small number of simulations (0.39%) resulted in less than the number of encounters needed to compute CY-1 and CY-2, so those estimates equalled zero. There were also simulations where one or more of the estimators returned an estimate < Sobs including CY-2 (0.45%), Jack2 (3.60%), Jack3 (7.73%), Jack4 (13.45%) and Jack5 (18.10%). We removed simulations in which Mixture did not converge and rebalanced the factorial by removing five or more replicates from every factor combination (i.e. resulting in 37 replicates of every factor combination); when possible, we removed replicates that also had one or more estimates < Sobs.
Averaged over all simulations, Mixture was the only positively biased estimator and the only nonparametric estimator more biased than Sobs (Table 3). The least biased estimator, CY-2, underestimated Strue by 15% on average. CY-1 was the most accurate estimator. Estimator rank depended on the evaluation metric and we noted a general trade-off between bias and precision (i.e. the least biased estimators were the least precise, and vice versa). However, Mixture was both the most biased and the least precise estimator (Table 3). Relative ranks were rarely consistent across factor levels, but estimators that performed best overall were generally among the best at individual factor levels. Below, we mainly focus on factor-level comparisons; additional estimator-specific results are presented in Reese (2012).
Table 3 Average performance of species richness estimators across all factors (see Table 1) based on bias, measured as scaled mean error (SME); precision, measured as standard deviation of scaled estimates (SD); and accuracy, measured as scaled mean square error (SMSE). Values in parentheses are estimator rank for a given performance measure.
Estimator  Bias (SME)   Precision (SD)  Accuracy (SMSE)
ACE        −0.32 (7)    0.26 (4)        0.17 (4)
Boot       −0.49 (12)   0.27 (6)        0.31 (11)
Chao1      −0.37 (10)   0.26 (2)        0.20 (8)
Chao2      −0.37 (9)    0.25 (1)        0.20 (7)
CY-1       −0.19 (3)    0.30 (9)        0.13 (1)
CY-2       −0.15 (1)    0.34 (11)       0.14 (2)
ICE        −0.32 (6)    0.26 (5)        0.17 (3)
Jack1      −0.41 (11)   0.28 (7)        0.25 (10)
Jack2      −0.33 (8)    0.29 (8)        0.19 (6)
Jack3      −0.27 (5)    0.33 (10)       0.18 (5)
Jack4      −0.22 (4)    0.41 (12)       0.22 (9)
Jack5      −0.18 (2)    0.59 (13)       0.38 (13)
Mixture    0.64 (14)    13.41 (14)      180.22 (14)
Sobs       −0.56 (13)   0.26 (3)        0.38 (12)
Except for the bias of Mixture and the accuracy of higher-order jackknife estimators (Jack3–Jack5), bias and accuracy worsened with increasing Strue (Table S1.1 in Supporting Information). The relationship between Strue and estimator precision was much less consistent, with numerous instances of both positive and negative relationships. Only three estimators, Sobs, Boot and Jack1, ranked consistently across all factor levels and only for precision. The greatest change in the relative performance of an estimator was the bias of Mixture which was the least biased estimator with Strue = 100 and one of the two most biased estimators with the other factor levels.
Variation in the other assemblage factors also affected estimator performance. Except for Jack5 and Mixture, estimator bias and accuracy improved with increasing evenness (i.e. in moving from log-series to log-normal distributions) and all estimators were more precise with log-series distributions (Table S1.2). The relative bias of Mixture was inconsistent, ranking best with log-series distributions and worst with log-normal distributions. By all measures, the nonparametric estimators performed better at the larger N level (Table S1.3). The precision of Sobs, however, decreased slightly with the increase in N. Mixture was the most biased estimator at N = 6250 and the least biased at N = 12,500. Performances varied little across configuration patterns, but estimators were generally least biased and least precise when individuals were hyperdispersed (Table S1.4). The nonparametric estimators performed better when average species detection probability (p̄) equalled 0.9 than when p̄ = 0.5 (Table S1.5). When p increased with ranked abundance, estimators were more precise and generally more biased and more accurate than when p decreased with abundance. Mixture was again either the most or least biased estimator, depending on the factor level.
The effects of the tested survey factors, Effort and Design, were relatively consistent. The nonparametric estimators performed better at the larger Effort level (Table S1.6). Only the precision of Sobs decreased with the increase in Effort. Estimators were generally, often to a small degree, less biased, more precise and more accurate when individual survey locations were selected randomly than when they were placed along randomly located linear transects (Table S1.7).
Averaged across all estimators, Strue, Effort and Abund had the largest effects on estimator performance (Table S2). The results were generally similar with individual estimators; however, N or p occasionally ranked higher. Main effects accounted for approximately 87%, 71% and 81% of the variation in bias, precision and accuracy, respectively, but explained < 50% of the variation in the performance of Mixture.
We also evaluated estimator performance against sample coverage (i.e. sc). Estimators were generally less biased and more accurate with a larger sc. There was no apparent trend in precision. Our results suggest that the best estimator for bias reduction depends largely on the sc range; only Jack5 is recommended first in more than one sc range (Table 4). By contrast, Sobs and Boot were among the three most precise estimators for all coverage ranges. CY-1 and Boot were the most accurate estimators in the two smallest and two largest coverage ranges, respectively, and one of the jackknife estimators was most accurate in the
other ranges. The CY-2 estimator was one of the three most accurate estimators in the largest number of sc ranges (six) and CY-1 was next best (four).
SME, scaled mean error; SD, standard deviation of scaled estimates; SMSE, scaled mean square error.
DISCUSSION
The case for statistical estimators
Using Sobs as an estimate of SR frequently leads to large underestimates, due to a strong dependence on sampling effort and the assumption that all species are detected (Nichols et al., 1998; Kéry & Plattner, 2007). Other than Mixture, the nonparametric estimators were generally less biased and more accurate than Sobs (Table 3), supporting other studies (see Wagner & Wildi, 2002; Brose et al., 2003; Walther & Moore, 2005). Thus, Sobs is far from the best approach for estimating SR. As effort increases, Sobs will approach Strue, but estimators generally provide a more reliable approach (see Table S1.6). A measure of precision is another advantage provided by some estimators (Nichols et al., 1998; Reese, 2012).
Statistical estimator comparisons
Averaged across all factor levels, our results suggest that ACE, CY-1, CY-2, ICE and Jack3 can provide less biased and more accurate estimates than the more popular Chao1, Chao2, Jack1 and Jack2 estimators (Table 3). The higher-order jackknife and bias-corrected Chao estimators were among the least and intermediately biased estimators, respectively, supporting Brose et al. (2003). In particular, the overall biases of Jack5, CY-1 and CY-2 were less than the ≥ 20% bias reported for other estimators (see Brose et al., 2003; Canning-Clode et al., 2008; Jobe, 2008). Bias reduction involves increased extrapolation, which could partly explain the loss of precision that accompanied the least biased estimators. As was found in our study, estimation involves a tradeoff between bias and precision (see Burnham & Overton, 1979; Brose et al., 2003; Willie et al., 2012).
Among the most accurate and least biased estimators at many factor levels, CY-1 and CY-2 could be promising newer estimators (Table 3, Appendix S1). It is noteworthy that both performed at their best with relatively uneven log-normal and log-series distributions (Table S1) which might best approximate true species abundance patterns (Sugihara, 1980; Ulrich et al., 2010). However, CY-1 and CY-2 were outperformed at some factor levels (see Appendix S1). Cao et al. (2004) stated that sample size would affect CY-2 less than other estimators and it was the least biased estimator at the smaller effort level (Table S1.6). However, CY-1 was more accurate and less affected (bias) by variation in Effort than any other estimator (Table S2; see also Drake, 2007). It is possible that the number of regression points used in the program SimAssem, five or ten, does not fully expose the performance potential of CY-2. Both estimators were relatively imprecise and prone to overestimation with particulate-niche distributions (Table S1) and larger levels of N, Effort and p (Appendix S1), factor levels generally associated with larger sc values.
Another largely untested estimator, Mixture, often performed relatively poorly. However, based on bias, Mixture ranked more favourably in assemblages with an intermediate number of species (Table S1.1), log-series distributions (Table S1.2), larger
N (Table S1.3) and either a larger p or a p that increased with abundance groups (Table S1.5). Precision improved considerably when effort increased from 100 to 500 cells (Table S1.6). These results suggest that some factor combinations did not provide enough data to estimate probabilities for the two groups of the mixture model. An inspection of the raw results (G.C.R., unpublished data) showed that the poorest performances resulted from an increased number of large estimates. In other words, Mixture occasionally produced large estimates (e.g. 211,111) in assemblages with, for example, Strue = 500, Abund = log-normal, N = 6250, and p = 0.5, possibly a result of poor convergence. The conditions under which the performance of Mixture improved would indicate that it should be further evaluated in comparatively data-rich environments.
Factor effects
The factors Strue, Effort, and Abund had the largest effects on estimator performance, supporting Brose et al. (2003), and the strongest correlations with sc (i.e. the correlation between sc and Strue = −0.66, Effort = 0.48, Abund = 0.37, N = 0.18, p = 0.04, Config = 0.01 and Design = 0.01); note that the sign of correlation is irrelevant for the nominal variables Abund, Config, Design and p. This suggests that sc is the means by which factors affect performance and supports Brose et al. (2003). The negative correlation that we found between Strue and sc, however, contradicts the positive and relatively weak correlation (i.e. r = 0.14) found by Brose et al. (2003).
The negative bias of most estimators was largest with the relatively uneven log-series distribution (Table S1), which supports Wagner & Wildi (2002) and Brose et al. (2003). Wagner & Wildi (2002) found less bias with log-normal distributions than with the distribution selected for its relative evenness, the broken-stick distribution. However, we did not find a significant difference between simulated broken-stick and log-normal distributions using Kolmogorov–Smirnov goodness of fit tests (G.C.R., unpublished analysis; for the test see Magurran, 2004), which could explain our contradictory result when using particulate-niche distributions instead (Table S1). Previous comparisons, therefore, might not have included enough variation to reveal actual relationships between estimator bias and assemblage evenness.
At larger values of Strue, estimators were generally more biased, supporting Baltanás (1992). Brose et al. (2003) reported a contradictory result where estimators were more biased at Strue = 25 than at Strue = 500; however, it is unclear whether that relationship held across their other levels of Strue (see their Fig. 1). We also found that estimators were less accurate at larger values of Strue, which contradicts Walther & Morand (1998; see Table S1.1). Better performances at the larger Effort level, which supports Brose et al. (2003) and Wagner & Wildi (2002), could be caused by more effort resulting in more data, particularly encounters (see Table S1.6). The number of encounters can also be increased with larger values of N and p. Estimator performance improved when N was increased from 6250 to 12,500 individuals (Table S1.3). Positive relationships have been similarly reported with N in the form of density (Baltanás, 1992; Walther & Morand, 1998).
When the two levels where p was a function of ranked abundance were removed, the correlation between sc and p increased from r² = 0.04 to r² = 0.18. When p generally decreased with abundance, estimators were less biased (there were exceptions with Boot, Mixture and Sobs), though often with less precision and accuracy, than when p increased with abundance (see Table S1.5). Less abundant species, particularly when p is small, are often not detected, but a larger p could result in such species being encountered, thereby reducing bias. Given the formulae of many of the nonparametric estimators, bias will be further reduced when such species are encountered in a small number of surveys. Using either a fixed p for each abundance group or larger differences in p might show more consistent trends in bias and accuracy across estimators.
Estimator performance was little affected by Config, supporting both Wagner & Wildi (2002) and Brose et al. (2003). The three configuration patterns differed considerably for a single species, but at the assemblage level the differences were minimal. This emergent property could partly explain the relatively weak effects that Config had on estimator performance and the effects of an assemblage-wide configuration pattern could be greater. Most estimators were more negatively biased with increased aggregation, which supports Baltanás (1992, their Fig. 3), but not Wagner & Wildi (2002). Furthermore, we did not find the positive relationship between accuracy and aggregation reported by Walther & Morand (1998).
Survey design also had small effects on performance. Its importance could be greater if, for example, SR varied along one or more gradients and the orientation of linear transects failed to fully represent an area. We are unaware of other investigations of the relationships between survey design and SR estimates, but differences between survey designs are important to other estimation issues (see Reese et al., 2005), including practicality due to logistics when implementing sampling regimes.
Estimator selection framework
Near the boundaries of sc, we found that accuracy can be maximized with different estimators than those proposed in Brose et al. (2003), specifically CY-1 when sc ≤ 20% and Boot when sc > 80% (Table 4). We also found the best performing estimator to vary between bias, precision and accuracy; therefore selection should be application specific. Our selection framework and the one by Brose et al. (2003) are based on sc (i.e. Sobs/Strue), a quantity that, if known, would make estimation unnecessary. Across our simulations, the correlation between sc and ŝc exceeded 0.80. Thus, we recommend using the program SimAssem to process data and the reported sc to locate the coverage range of suggested estimators. Given the inconsistent performance of Mixture, we recommend caution when considering it for bias reduction (see Table 4 and Appendix S1). We also tested our selection framework against field data and present the results in Appendix S3.
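The accuracy-based part of the selection framework can be sketched as a simple lookup on estimated sample coverage. Only the two boundary recommendations stated in the text are encoded (CY-1 when sc ≤ 20%, Boot when sc > 80%); for intermediate coverage the text says only that one of the jackknife estimators was most accurate, so the sketch returns None there rather than guessing which one. The function name is hypothetical.

```python
def suggest_for_accuracy(sc_estimate):
    """Suggest an estimator for accuracy from estimated sample coverage (0 to 1).

    Encodes only the boundary cases stated in the text; intermediate ranges
    require Table 4 and are deliberately left unresolved (None).
    """
    if not 0.0 <= sc_estimate <= 1.0:
        raise ValueError("sample coverage must lie between 0 and 1")
    if sc_estimate <= 0.20:
        return "CY-1"
    if sc_estimate > 0.80:
        return "Boot"
    return None  # a jackknife estimator, depending on the coverage range


for coverage in (0.15, 0.50, 0.90):
    print(coverage, suggest_for_accuracy(coverage))
```

Selection for bias or precision would need its own lookup, since the best performing estimator differed between the three measures.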
Colwell, R.K. & Coddington, J.A. (1994) Estimating terrestrial biodiversity through extrapolation. Philosophical Transactions of the Royal Society B: Biological Sciences, 345, 101–118.
Dallmeier, F., Foster, R.B., Romano, C., Rice, R. & Kabel, M. (1991) A user’s guide to the Beni Biosphere Reserve biodiversity plots. Smithsonian Institution, Washington, DC.
Deblauwe, V., Barbier, N., Couteron, P., Lejeune, O. & Bogaert, J. (2008) The global biogeography of semi-arid periodic vegetation patterns. Global Ecology and Biogeography, 17, 715–723.
Drake, M.T. (2007) Estimating sampling effort for biomonitoring of nearshore fish communities in small central Minnesota lakes. North American Journal of Fisheries Management, 27, 1094–1111.
Fisher, R.A., Corbet, A.S. & Williams, C.B. (1943) The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology, 12, 42–58.
Holdridge, L.R., Grenke, W.C., Hatheway, W.H., Liang, T. & Tosi, J.A. (1971) Forest environments in tropical life zones. Pergamon Press, Oxford.
Jobe, R.T. (2008) Estimating landscape-scale species richness: reconciling frequency- and turnover-based approaches. Ecology, 89, 174–182.
Keating, K.A. & Quinn, J.F. (1998) Estimating species richness: the Michaelis–Menten model revisited. Oikos, 81, 411–416.
Kéry, M. & Plattner, M. (2007) Species richness estimation and determinants of species detectability in butterfly monitoring programmes. Ecological Entomology, 32, 53–61.
Laake, J. & Rexstad, E. (2008) RMark – an alternative approach to building linear models in MARK. Program mark: ‘a gentle introduction’, 7th edn (ed. by E. Cooch and G.C. White), pp. C1–C115. Available at: http://www.phidot.org/software/mark/docs/book/ (accessed 24 October 2013).
Lam, T.Y. & Kleinn, C. (2008) Estimation of tree species richness from large area forest inventory data: evaluation and comparison of jackknife estimators. Forest Ecology and Management, 255, 1002–1010.
Lee, S.M. & Chao, A. (1994) Estimating population size via sample coverage for closed capture–recapture models. Biometrics, 50, 88–97.
Legendre, P. (1993) Spatial autocorrelation: trouble or new paradigm? Ecology, 74, 1659–1673.
Lewis, T. & Taylor, L.R. (1967) Introduction to experimental ecology. Academic Press, London.
Lopez, L.C.S., Fracasso, M.P.D.A., Mesquita, D.O., Palma, A.R.T. & Riul, P. (2012) The relationship between percentage of singletons and sampling effort: a new approach to reduce the bias of richness estimates. Ecological Indicators, 14, 164–169.
MacArthur, R.H. (1957) On the relative abundance of bird species. Proceedings of the National Academy of Sciences USA, 43, 293–295.
Magurran, A.E. (2004) Measuring biological diversity. Blackwell Publishing, Malden, MA.
Michaelis, L. & Menten, M.L. (1913) Die Kinetik der Invertinwirkung. Biochemische Zeitschrift, 49, 333–369.
Moreno, C., Zuria, I., García-Zenteno, M., Sánchez-Rojas, G., Castellanos, I., Martínez-Morales, M. & Rojas-Martínez, A. (2006) Trends in the measurement of alpha diversity in the last two decades. Interciencia, 31, 67–71.
Nichols, J.D., Boulinier, T., Hines, J.E., Pollock, K.H. & Sauer, J.R. (1998) Inference methods for spatial variation in species richness and community composition when not all species are detected. Conservation Biology, 12, 1390–1398.
Otis, D.L., Burnham, K.P., White, G.C. & Anderson, D.R. (1978) Statistical inference from capture data on closed animal populations. Wildlife Monographs, 62, 1–135.
Pagano, A.M. & Arnold, T.W. (2009) Estimating detection probabilities of waterfowl broods from ground-based surveys. Journal of Wildlife Management, 73, 686–694.
Palmer, M.W. (1990) The estimation of species richness by extrapolation. Ecology, 71, 1195–1198.
Pledger, S. (2000) Unified maximum likelihood estimates for closed capture–recapture models using mixtures. Biometrics, 56, 434–442.
Preston, F.W. (1948) The commonness, and rarity, of species. Ecology, 29, 254–283.
R Development Core Team (2009) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available at: http://www.r-project.org (accessed 24 October 2013).
Reese, G.C. (2012) Simulating species assemblages and evaluating species richness estimators. PhD thesis, Colorado State University, Fort Collins. Available at: http://hdl.handle.net/10217/68193 (accessed 24 October 2013).
Reese, G.C., Wilson, K.R., Hoeting, J.A. & Flather, C.H. (2005) Factors affecting species distribution predictions: a simulation modeling experiment. Ecological Applications, 15, 554–564.
Reese, G.C., Wilson, K.R. & Flather, C.H. (2013) Program SimAssem: software for simulating species assemblages and estimating species richness. Methods in Ecology and Evolution, 4, 891–896.
SAS Institute (1999) SAS/STAT user’s guide, version 8. SAS Institute, Cary, NC.
Selmi, S. & Boulinier, T. (2004) Distribution–abundance relationship for passerines breeding in Tunisian oases: test of the sampling hypothesis. Oecologia, 139, 440–445.
Smith, E.P. & van Belle, G. (1984) Nonparametric estimation of species richness. Biometrics, 40, 119–129.
Sugihara, G. (1980) Minimal community structure: an explanation of species abundance patterns. The American Naturalist, 116, 770–787.
Tjørve, E. (2009) Shapes and functions of species–area curves (II): a review of new models and parameterizations. Journal of Biogeography, 36, 1435–1445.
Ulrich, W., Ollik, M. & Ugland, K.I. (2010) A meta-analysis of species-abundance distributions. Oikos, 119, 1149–1155.
Wagner, H.H. & Wildi, O. (2002) Realistic simulation of the effects of abundance distribution and spatial heterogeneity on non-parametric estimators of species richness. Ecoscience, 9, 241–250.
Walther, B.A. & Moore, J.L. (2005) The concepts of bias, precision and accuracy, and their use in testing the performance of species richness estimators, with a literature review of estimator performance. Ecography, 28, 815–829.
Walther, B.A. & Morand, S. (1998) Comparative performance of species richness estimation methods. Parasitology, 116, 395–405.
White, G.C. & Burnham, K.P. (1999) Program MARK: survival estimation from populations of marked animals. Bird Study, 46, Supplement, 120–138.
Williams, C.B. (1939) An analysis of four years’ captures of insects in a light trap. Part I. Transactions of the Royal Entomological Society of London, 89, 79–132.
Willie, J., Petre, C.A., Tagg, N. & Lens, L. (2012) Evaluation of species richness estimators based on quantitative performance measures and sensitivity to patchiness and sample grain size. Acta Oecologica, 45, 31–41.

SUPPORTING INFORMATION

Additional supporting information may be found in the online version of this article at the publisher’s web-site.

Table S1 Performance of species richness estimators as a function of species abundance distribution, log-normal, log-series and particulate-niche, averaged across all factor levels.
Table S2 A variance component analysis of relative factor effects.
Appendix S1 Estimator bias, precision and accuracy across factor levels (Tables S1.1–S1.7).
Appendix S2 Estimator bias, precision and accuracy across factor levels, after accounting for estimation issues (Tables S2.1–S2.7).
Appendix S3 Case study of the performance of our proposed selection framework and the asymptotic properties of estimators.

BIOSKETCHES

Gordon C. Reese recently completed his PhD in the Graduate Degree Program in Ecology at Colorado State University. He is researching the factors affecting the performance of species richness estimators and also the relative performance of several approaches for estimating the variance of species richness estimators.

Kenneth R. Wilson is a professor in the Department of Fish, Wildlife, and Conservation Biology and Graduate Degree Program in Ecology at Colorado State University. He has an interest in wildlife management, conservation biology and ecology specifically related to: (1) impacts of human activities on wildlife, (2) population ecology, and (3) understanding patterns of species richness.

Curtis H. Flather is a research ecologist with the US Forest Service, Rocky Mountain Research Station in Fort Collins, CO. His research programme focuses on estimating the response of species assemblages to natural and human-caused disturbance to support resource planning within the agency.

Editor: Gary Mittelbach
Table S1 Performance of species richness estimators as a function of species abundance distribution, log-
normal (LN), log-series (LS), and particulate-niche (PN), averaged across all factor levels. Subscripts are
estimator rank for a given performance measure and abundance distribution. See Table 2 for estimator
abbreviations and Table 3 for specific performance metrics.
Notes: A large number of simulations with unreasonable estimates, i.e. estimates smaller than or
substantially greater than Sobs, were found in factor combinations with the particulate-niche distribution (in as
many as 33 of 37 replicates, i.e. once simulations in which Mixture failed to converge were removed and the
factorial was rebalanced by removing ≤5 replicates from each factor combination) as were the worst
performances (all measures) by CY-1, CY-2, and Mixture. Additionally, there is little empirical support for
particulate-niche distributions, compared to log-normal and log-series distributions; therefore, the results are
presented separately here.
The similarity of log-normal and broken-stick distributions (see text) could have contributed to Wagner and
Wildi (2002) finding that estimators are more negatively biased with an even distribution (i.e. a broken-stick)
than with an uneven distribution (i.e. a log-normal), which was the opposite of our finding. We found that
estimators were most negatively biased in assemblages with relatively uneven log-series distributions,
supporting both Wagner and Wildi (2002) and Brose et al. (2003).
REFERENCES
Brose, U., Martinez, N.D. & Williams, R.J. (2003) Estimating species richness: sensitivity to sample coverage
and insensitivity to spatial patterns. Ecology, 84, 2364-2377.
Wagner, H.H. & Wildi, O. (2002) Realistic simulation of the effects of abundance distribution and spatial
heterogeneity on non-parametric estimators of species richness. Ecoscience, 9, 241-250.
Table S2 A variance component analysis of relative factor effects. Each value is followed by its
within-row rank in parentheses.

Bias (SME)
Estimator  Strue      N          Abund      Config     Design     Effort     p          Residual
ACE        17.73 (2)   1.02 (6)  57.13 (1)   0.34 (7)   0.04 (8)  17.60 (3)   2.06 (5)   4.07 (4)
Boot       42.02 (1)   3.64 (5)  19.62 (3)   0.01 (7)   0.00 (8)  28.24 (2)   1.21 (6)   5.25 (4)
Chao1      22.20 (3)   2.65 (5)  42.56 (1)   0.20 (7)   0.01 (8)  26.90 (2)   1.63 (6)   3.85 (4)
Chao2      22.02 (3)   2.58 (5)  42.83 (1)   0.14 (7)   0.03 (8)  26.99 (2)   1.64 (6)   3.78 (4)
CY-1        0.72 (5)   0.00 (8)  85.38 (1)   0.33 (6)   0.16 (7)   2.35 (4)   3.20 (3)   7.86 (2)
CY-2        3.73 (5)   0.00 (8)  75.41 (1)   0.41 (6)   0.16 (7)   7.18 (3)   3.78 (4)   9.32 (2)
ICE        17.26 (2)   0.94 (6)  58.58 (1)   0.22 (7)   0.08 (8)  16.70 (3)   2.11 (5)   4.11 (4)
Jack1      38.46 (1)   3.29 (5)  23.10 (3)   0.02 (7)   0.00 (8)  27.89 (2)   1.21 (6)   6.03 (4)
Jack2      33.74 (1)   2.87 (5)  26.60 (3)   0.02 (7)   0.00 (8)  27.85 (2)   1.25 (6)   7.68 (4)
Jack3      30.69 (1)   2.55 (5)  27.96 (2)   0.02 (7)   0.00 (8)  27.83 (3)   1.28 (6)   9.67 (4)
Jack4      29.39 (1)   2.32 (5)  27.33 (2)   0.01 (7)   0.00 (8)  27.21 (3)   1.28 (6)  12.47 (4)
Jack5      29.43 (1)   2.10 (5)  24.53 (3)   0.00 (7)   0.00 (8)  25.23 (2)   1.24 (6)  17.47 (4)
Mixture     6.46 (2)   3.45 (4)   3.22 (5)   0.00 (8)   0.11 (7)   6.28 (3)   1.84 (6)  78.63 (1)
Sobs       44.37 (1)   3.90 (5)  16.77 (3)   0.01 (7)   0.00 (8)  28.63 (2)   1.23 (6)   5.09 (4)
Mean       24.16 (2)   2.24 (5)  37.93 (1)   0.12 (7)   0.04 (8)  21.21 (3)   1.78 (6)  12.52 (4)

Precision (SD)
Estimator  Strue      N          Abund      Config     Design     Effort     p          Residual
ACE         0.20 (6)   9.26 (3)   5.54 (4)   0.04 (7)   0.00 (8)  43.83 (1)   4.67 (5)  36.47 (2)
Boot       73.66 (1)   0.71 (5)   1.50 (4)   0.00 (7)   0.00 (8)   3.94 (3)   0.10 (6)  20.08 (2)
Chao1       9.67 (3)   4.98 (4)   3.51 (5)   0.09 (7)   0.00 (8)  31.46 (2)   2.46 (6)  47.83 (1)
Chao2       9.99 (3)   6.43 (4)   3.34 (5)   0.17 (7)   0.00 (8)  31.15 (2)   2.42 (6)  46.51 (1)
CY-1        3.82 (6)   7.68 (4)  12.50 (3)   0.00 (8)   0.01 (7)  45.02 (1)   4.18 (5)  26.79 (2)
CY-2        2.71 (6)   8.25 (3)   4.25 (4)   0.00 (8)   0.00 (7)  32.58 (2)   3.99 (5)  48.23 (1)
ICE         0.17 (6)   9.69 (3)   5.84 (4)   0.00 (8)   0.00 (7)  44.47 (1)   4.69 (5)  35.14 (2)
Jack1      76.73 (1)   0.77 (5)   0.87 (4)   0.00 (8)   0.00 (7)   3.46 (3)   0.08 (6)  18.09 (2)
Jack2      83.41 (1)   0.45 (5)   1.30 (4)   0.00 (8)   0.00 (7)   1.44 (3)   0.04 (6)  13.36 (2)
Jack3      87.44 (1)   0.10 (5)   2.52 (3)   0.00 (8)   0.00 (7)   0.13 (4)   0.01 (6)   9.79 (2)
Jack4      87.77 (1)   0.00 (6)   3.83 (3)   0.00 (8)   0.00 (7)   0.09 (4)   0.00 (5)   8.31 (2)
Jack5      86.55 (1)   0.00 (6)   4.61 (3)   0.00 (8)   0.00 (7)   0.94 (4)   0.00 (5)   7.90 (2)
Mixture     8.74 (3)   4.61 (4)   1.94 (6)   0.00 (8)   0.01 (7)  10.01 (2)   2.45 (5)  72.23 (1)
Sobs       74.88 (1)   0.43 (5)   3.07 (3)   0.00 (8)   0.00 (7)   2.78 (4)   0.12 (6)  18.72 (2)
Mean       43.27 (1)   3.81 (5)   3.90 (4)   0.02 (7)   0.00 (8)  17.95 (3)   1.80 (6)  29.25 (2)

Accuracy (SMSE)
Estimator  Strue      N          Abund      Config     Design     Effort     p          Residual
ACE        17.05 (3)   2.20 (5)  36.59 (1)   0.14 (7)   0.09 (8)  36.21 (2)   1.43 (6)   6.28 (4)
Boot       45.26 (1)   4.07 (5)  10.95 (3)   0.01 (7)   0.00 (8)  32.62 (2)   1.45 (6)   5.64 (4)
Chao1      18.85 (3)   3.33 (5)  32.36 (2)   0.20 (7)   0.06 (8)  38.26 (1)   1.94 (6)   5.00 (4)
Chao2      18.60 (3)   3.39 (5)  33.09 (2)   0.12 (7)   0.07 (8)  37.78 (1)   1.93 (6)   5.03 (4)
CY-1        3.50 (4)   1.05 (6)  33.18 (2)   0.00 (7)   0.00 (8)  37.98 (1)   1.71 (5)  22.58 (3)
CY-2        5.41 (4)   2.80 (5)  12.58 (3)   0.00 (7)   0.00 (8)  43.83 (1)   1.63 (6)  33.74 (2)
ICE        16.59 (3)   2.26 (5)  37.61 (1)   0.08 (8)   0.10 (7)  35.75 (2)   1.45 (6)   6.15 (4)
Jack1      40.24 (1)   3.99 (5)  11.37 (3)   0.00 (8)   0.00 (7)  34.56 (2)   1.47 (6)   8.36 (4)
Jack2      32.12 (2)   4.09 (5)  11.51 (4)   0.00 (8)   0.00 (7)  38.72 (1)   1.60 (6)  11.96 (3)
Jack3      21.89 (2)   4.63 (5)   9.72 (4)   0.00 (8)   0.00 (7)  45.99 (1)   1.95 (6)  15.81 (3)
Jack4      18.01 (3)   4.92 (4)   2.19 (6)   0.00 (8)   0.00 (7)  47.18 (1)   2.46 (5)  25.24 (2)
Jack5      61.73 (1)   1.23 (5)   2.87 (4)   0.00 (8)   0.00 (7)   7.42 (3)   0.95 (6)  25.80 (2)
Mixture     3.02 (2)   2.29 (4)   1.58 (5)   0.00 (8)   0.43 (7)   2.68 (3)   1.02 (6)  88.99 (1)
Sobs       48.36 (1)   4.19 (4)  10.22 (3)   0.01 (7)   0.00 (8)  31.72 (2)   1.45 (6)   4.05 (5)
Mean       25.05 (2)   3.18 (5)  17.56 (4)   0.04 (8)   0.05 (7)  33.62 (1)   1.60 (6)  18.90 (3)
Table S1.1 Performance of species richness estimators as a function of the true number of species, 25, 100,
and 500 species, averaged across all factor levels. Subscripts are estimator rank for a given performance
measure and true number of species. See Table 2 for estimator abbreviations and Table 3 for specific
performance metrics.
Table S1.2 Performance of species richness estimators as a function of species abundance distribution, log-
normal (LN) and log-series (LS), averaged across all factor levels. Subscripts are estimator rank for a given
performance measure and abundance distribution. See Table 2 for estimator abbreviations and Table 3 for
specific performance metrics.
Table S1.4 Performance of species richness estimators as a function of the spatial configuration of
individuals of a species, aggregated (Agg), hyper-dispersed (Hyp), and random (Rnd), averaged across all
factor levels. Subscripts are estimator rank for a given performance measure and spatial configuration. See
Table 2 for estimator abbreviations and Table 3 for specific performance metrics.
Table S1.7 Performance of species richness estimators as a function of survey design, random (Random)
and linear transect (Transect), averaged across all factor levels. Subscripts are estimator rank for a given
performance measure and survey design. See Table 2 for estimator abbreviations and Table 3 for specific
performance metrics.
Table S2.1 Performance of species richness estimators as a function of the true number of species, 25, 100,
and 500 species, averaged across all factor levels. Performances that include the substitutions described above
are identified with a (s). See Table 2 for estimator abbreviations and Table 3 for specific performance metrics.
Table S2.4 Performance of species richness estimators as a function of the spatial configuration of individuals of
a species, aggregated (Agg), hyper-dispersed (Hyp), and random (Rnd), averaged across all factor levels.
Performances that include the substitutions described above are identified with a (s). See Table 2 for estimator
abbreviations and Table 3 for specific performance metrics.
Table S2.7 Performance of species richness estimators as a function of survey design, random (Random) and
linear transect (Transect), averaged across all factor levels. Performances that include the substitutions
described above are identified with a (s). See Table 2 for estimator abbreviations and Table 3 for specific
performance metrics.
SURVEY DATA
To evaluate the performance of our estimator selection framework and the asymptotic performances of
estimators, we used field data from two studies. Fannes et al. (2008) surveyed two sites located in lowland
rainforests for arboreal spiders. They used insecticide knockdown fogging and suspended sheets to collect
specimens across 12 samples that were split between primary and secondary forests in Kakum National
Park, Ghana (November 2005) and across five samples within the primary forests of the Luki Biosphere
Reserve, Democratic Republic of the Congo (November 2006). King & Porter (2005) compared several
methods for sampling ants across five different ecosystems in north and central Florida. From June through
September 2001, they conducted surveys at three different sites in each ecosystem via a combination of
pitfall traps, baits, litter extraction, and hand collection.
There is always uncertainty about the true number of species (Strue) in a real assemblage. However, given
the asymptotic appearance of a randomized accumulation curve from the 12 samples of Kakum National
Park, Fannes et al. (2008) considered the 11 spider species a near census. The same was not true of the
accumulation curves that resulted from randomizations of the five samples of the Luki Biosphere Reserve
(see Fannes et al. 2008, their Fig. 4) and the 1,732 samples of ants in Floridian ecosystems (King & Porter,
2005, their Fig. 1). An accumulation curve that fails to reach an asymptote indicates that data collection is
incomplete, i.e. some species have not been detected (Gotelli & Colwell, 2001; Walther & Moore, 2005).
METHODS
We used randomization techniques when evaluating performances. At every sample size, we randomized
samples without replacement, 200 times for the Fannes et al. (2008) datasets and 50 times for the King & Porter
(2005) dataset, as in the original studies. We applied species richness estimators to each randomized
subset and computed bias, precision, and accuracy, where possible, and averaged over all randomizations
at each sample size. We followed Fannes et al. (2008) in assuming that Strue = 11 for evaluating performance
against their data from Kakum National Park, Ghana. Despite possibly violating the assumption that the
observed number of species (Sobs) equaled Strue, we used Strue = 8 for evaluations against the data from Luki
Biosphere Reserve. For the dataset from King & Porter (2005), we assumed that Strue = 142, which is the
number of species found in historical datasets from the region (see Deyrup 2003). Some estimators, i.e. CY-
1, CY-2, and some orders of the jackknife, could not be computed at the smallest survey sizes (see Table 2
for estimator details, including abbreviations).
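
The randomization procedure described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the study: the function names, the toy `samples` data, and the choice of estimators (Sobs, the first-order jackknife and the bias-corrected form of Chao2, in their standard incidence-based forms) are all assumptions made for the example. The metric definitions are also assumptions standing in for Table 3: error scaled by Strue for SME, the unscaled standard deviation of estimates for SD, and squared error scaled by the square of Strue for SMSE.

```python
import random
from statistics import mean, pstdev


def incidence_counts(samples):
    """Map each species to the number of samples in which it occurs."""
    counts = {}
    for sample in samples:
        for species in set(sample):
            counts[species] = counts.get(species, 0) + 1
    return counts


def s_obs(samples):
    """Raw count of species observed across the given samples."""
    return len(incidence_counts(samples))


def jack1(samples):
    """First-order jackknife: Sobs + Q1 * (m - 1) / m, Q1 = number of uniques."""
    m = len(samples)
    counts = incidence_counts(samples)
    q1 = sum(1 for c in counts.values() if c == 1)
    return len(counts) + q1 * (m - 1) / m


def chao2(samples):
    """Bias-corrected Chao2; defined even when no species occurs in exactly two samples."""
    m = len(samples)
    counts = incidence_counts(samples)
    q1 = sum(1 for c in counts.values() if c == 1)
    q2 = sum(1 for c in counts.values() if c == 2)
    return len(counts) + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


def performance(samples, k, estimator, s_true, n_rand=200, seed=0):
    """Draw k samples without replacement n_rand times and summarize the estimates."""
    rng = random.Random(seed)
    estimates = [estimator(rng.sample(samples, k)) for _ in range(n_rand)]
    errors = [e - s_true for e in estimates]
    return {
        "SME": mean(errors) / s_true,                         # bias (assumed scaling)
        "SD": pstdev(estimates),                              # precision (assumed unscaled)
        "SMSE": mean(err ** 2 for err in errors) / s_true ** 2,  # accuracy (assumed scaling)
    }


# Hypothetical incidence data: each set lists the species found in one sample.
samples = [{"a", "b"}, {"a", "c"}, {"a", "b", "d"}]
for name, est in [("Sobs", s_obs), ("Jack1", jack1), ("Chao2", chao2)]:
    print(name, performance(samples, k=2, estimator=est, s_true=5))
```

Repeating the call for every sample size k from 1 to the number of samples gives the averaged values that make up a randomized accumulation curve.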
To evaluate estimator performance with the field data, we compared the rate at which randomized
accumulation curves reached their apparent asymptotes, i.e., near the asymptotic value of Sobs and
estimated richness when all data were used. Reaching an asymptote faster than Sobs is an important quality
for a species richness (SR) estimator (Gotelli & Colwell, 2001).
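
One way to make the comparison of asymptote rates concrete is sketched below. The study judged apparent asymptotes from the curves; the tolerance rule, the function name `samples_to_plateau`, and the two curves are hypothetical and serve only to illustrate the idea of comparing how early each curve settles near its final value.

```python
def samples_to_plateau(curve, tolerance=0.05):
    """Return the smallest sample size (1-based) from which every later value
    of the accumulation curve stays within `tolerance` (as a fraction) of the
    value obtained with all data."""
    final = curve[-1]
    first = len(curve)
    # Walk backwards from the full-data value until the curve leaves the band.
    for i in range(len(curve) - 1, -1, -1):
        if abs(curve[i] - final) <= tolerance * abs(final):
            first = i + 1
        else:
            break
    return first


# Hypothetical mean estimates at sample sizes 1..6 (not data from the study).
raw_count_curve = [5.0, 8.0, 10.0, 10.8, 11.0, 11.0]
estimator_curve = [9.0, 10.7, 11.2, 11.1, 11.0, 11.0]

print(samples_to_plateau(raw_count_curve))   # sample size at which the raw count settles
print(samples_to_plateau(estimator_curve))   # an estimator that settles earlier
```

Under this rule, an estimator with a smaller returned value than the raw count would be judged to reach its apparent asymptote faster.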
The randomization of two surveys from the Luki Biosphere Reserve dataset (Fannes et al., 2008) included
>80% of the species, if it is assumed that Strue = 8. Based on that assumption, our selection framework
correctly predicted the least biased, ICE (SME = -0.01), most precise, Sobs (SD = 0.12), and most accurate,
Boot (SMSE = 0.04), estimators (Fig. S3.2). Our selection framework did not perform as well when the
average sc equaled 0.95 (four surveys), predicting that Chao2 (SME = 0.07) would be least biased when it
was actually Boot (SME = 0.03). Chao1 (SME = -0.03) was, however, correctly predicted to perform nearly
as well. The most precise estimator was correctly predicted as Sobs (SD = 0.10). The first and third most
accurate estimators, Sobs (SMSE = 0.01) and Boot (SMSE = 0.02), respectively, were predicted in the
reverse order by our selection framework, though their performances differed minimally.
When we randomized 20, 100, and 400 samples from King & Porter (2005), Jack5 produced the largest
estimates, 54.8, 87.3, and 109.1, respectively. Assuming that Strue = 142, the average sc with 20, 100, and
400 samples equaled 0.15, 0.31, and 0.46, respectively. Our framework correctly selected Jack5 to reduce
Thus far, we have evaluated the performance of our selection framework against values assumed for Strue,
information that is generally unavailable with field data. More important would be an estimate of sc that
correctly directs estimator selection without prior knowledge of Strue. We therefore used the data from Kakum
National Park (Fannes et al., 2008) to test the performance of the sc estimated by SimAssem (Reese et al.,
2013). In building the accumulation curve, the estimated sc ranged from 0.53 to nearly 0.70 with four and 10
randomized surveys, respectively. If we again assume that Strue = 11, then the estimated sc underestimated
the empirical sc which was 0.78 and 0.98 with four and 10 surveys, respectively. With that information, our
selection framework would suggest using Jack5 (with four surveys) and either Jack3 or CY-1 (with 10
surveys) to reduce bias. Since it requires at least five surveys, Jack5 could not be computed; the next
estimator recommended for bias reduction when the estimated sc = 0.53, Jack4, produced the second
largest average estimate (12.6) when four surveys were repeatedly randomized (the average CY-1 estimate
equaled 12.9), though several other estimates were closer to 11 (see Fig. S3.1, Jack4 data are not shown).
When 10 surveys were randomly selected, the average Jack3 estimate (11.4) was closer to 11 than all but
the average Sobs (10.8) and average Chao1 (11.1) estimates; furthermore, the average CY-1 estimate (13.3)
was the second largest (the average CY-2 estimate equaled 15.0).
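
As a worked check of the coverage values quoted above, the empirical sample coverage is the ratio Sobs/Strue. The short sketch below uses the figures reported in this appendix for Kakum National Park with 10 surveys (average Sobs of 10.8, assumed Strue of 11, estimated sc of nearly 0.70); the function name is an assumption made for the example.

```python
def empirical_coverage(s_obs, s_true):
    """Sample coverage as defined in the text: sc = Sobs / Strue."""
    if s_true <= 0:
        raise ValueError("s_true must be positive")
    return s_obs / s_true


# Kakum National Park, 10 randomized surveys, assuming Strue = 11.
coverage_10 = empirical_coverage(10.8, 11)
estimated_10 = 0.70  # estimated sc reported for 10 surveys ("nearly 0.70")

print(round(coverage_10, 2))                 # empirical sc
print(round(coverage_10 - estimated_10, 2))  # amount by which the estimate falls short
```

The same ratio with the reported empirical sc of 0.78 at four surveys corresponds to an average Sobs of roughly 8.6 species.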
Asymptotic performances
Fannes et al. (2008) considered the performances of the species richness estimators unsatisfactory, largely
because none of the associated accumulation curves reached an asymptote any faster than the curve for
Sobs. In addition to recreating accumulation curves for the originally compared estimators (ACE, Boot, Chao1,
Chao2, Jack1, and Jack2), we created curves for CY-1, CY-2, ICE, and Mixture (Fig. S3.1). The estimates
from CY-2, which required at least 10 surveys, appear as a line fragment in the top right corner. While it
remained true that none of the estimators used in their study reached an asymptote any faster than Sobs, the
average CY-1 estimate was 12.1 when two surveys were randomized (Sobs = 7 at that point) which is 110%
of an assumed Strue = 11, 94% of the CY-1 estimate (12.8) based on all the data, and possible evidence of
an asymptote. The ICE estimate was even larger with two surveys (15.4), but the continual decline in
estimates at larger sample sizes implies that it was an artifact of randomization, e.g. one or more surveys
had a relatively large number of uniques that would cause a large estimate when coupled with only one other
survey. The next largest estimate at two surveys was 92% of Strue (Chao2 = 10.2). Based on all of the data,
the Jack2 estimate was <Sobs and the Jack3, Jack4, and Jack5 estimates were 7.9, 4.6, and 1.0, respectively
(curves not shown).
When the Luki survey data were randomized, Sobs never reached an asymptote, but Fannes et al. (2008)
noted a possible leveling of the Chao1 estimator. In our randomizations, none of the estimators reached an
apparent asymptote as quickly as with the Kakum data and Chao2 and CY-1 changed the least between
sample sizes of four and five (Fig. S3.2). Had there been more data with which to evaluate performance,
these two estimators possibly would have shown that they had reached their respective asymptotes. As in
the simulations with Strue = 25 (see Table S2.1), these comparisons indicate that CY-1 could require less
data to reach an asymptote than the other estimators.
SR estimators appeared not to approach an asymptote faster than Sobs with the ant dataset of King and
Porter (2005; their Fig. 1). Though not reproduced here, none of the additional estimators performed any
better. This could indicate that a considerable number of species were yet to be encountered.
CONCLUSION
Evaluating estimator selection frameworks and asymptotic properties with field data is difficult because there
is rarely, if ever, an assurance that the evaluation is based on correct information. However, we believe that
the tests presented here demonstrate the promise of selection frameworks such as those proposed in Brose
et al. (2003) and in this study, particularly when the sample coverage is correctly estimated. Tests of our
proposed sample coverage estimator indicate that it underestimates the true sample coverage, but again,
assumptions that Sobs = Strue in the test datasets could have been false. For example, several estimators
indicated that Strue > 11 in Kakum National Park, which could be true given the difficulty in detecting all
species and illustrates the difficulty of evaluating the asymptotic properties of estimators with field data.
Thus, our study shows that additional development and testing of species richness estimators is warranted.
Figure S3.2 Species accumulation curves for eight species richness estimators based on arboreal spider
data from Luki Biosphere Reserve, DR Congo. At each possible survey size (five total), 200 surveys were
randomly drawn without replacement, species richness was estimated, and estimates were averaged across
the 200 randomizations. The x-axis is the average number of individuals contained in randomized surveys
(each successive symbol on a line represents an increment of one in the number of randomized surveys). To
minimize clutter, Jack3 (the estimate based on all data equaled 10.7), Jack4 (10.8), Jack5 (10.8), and ICE
(9.2) are not displayed and, with only five surveys, there were not enough data to compute CY-2.
REFERENCES
Brose, U., Martinez, N.D. & Williams, R.J. (2003) Estimating species richness: sensitivity to sample coverage
and insensitivity to spatial patterns. Ecology, 84, 2364-2377.
Deyrup, M. (2003) An updated list of Florida ants (Hymenoptera: Formicidae). Florida Entomologist, 86, 43-48.
Fannes, W., Bakker, D.D., Loosveldt, K. & Jocqué, R. (2008) Estimating the diversity of arboreal oonopid
spider assemblages (Araneae, Oonopidae) at Afrotropical sites. The Journal of Arachnology, 36,
322-330.
Gotelli, N.J. & Colwell, R.K. (2001) Quantifying biodiversity: procedures and pitfalls in the measurement and
comparison of species richness. Ecology Letters, 4, 379-391.
King, J.R. & Porter, S.D. (2005) Evaluation of sampling methods and species richness estimators for ants in
upland ecosystems in Florida. Environmental Entomology, 34, 1566-1578.
Reese, G.C. (2012) Simulating species assemblages and evaluating species richness estimators. PhD
thesis, Colorado State University, Fort Collins. Available at: http://hdl.handle.net/10217/68193
(accessed 23 October 2013).
Reese, G.C., Wilson, K.R. & Flather, C.H. (2013) Program SimAssem: software for simulating species
assemblages and estimating species richness. Methods in Ecology and Evolution, 4, 891-896.
Walther, B.A. & Moore, J.L. (2005) The concepts of bias, precision and accuracy, and their use in testing the
performance of species richness estimators, with a literature review of estimator performance.
Ecography, 28, 815-829.