
Available online at www.sciencedirect.com

Remote Sensing of Environment 112 (2008) 2000–2017 www.elsevier.com/locate/rse

Modeling distribution of Amazonian tree species and diversity using remote sensing measurements
Sassan Saatchi a,*, Wolfgang Buermann b, Hans ter Steege c, Scott Mori d, Thomas B. Smith e
a Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109 USA
b Center for Tropical Research, Institute of the Environment, University of California, Los Angeles, Los Angeles, CA 90095 USA
c Institute of Environmental Biology, Section Plant Ecology and Biodiversity, Utrecht University, Sorbonnelaan 14, 3584 CA Utrecht, The Netherlands
d New York Botanical Garden, 200th Street and Kazimiroff Blvd., Bronx, NY 10458 USA
e Center for Tropical Research, Institute of the Environment and Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA 90095 USA

Received 26 December 2006; received in revised form 8 January 2008; accepted 12 January 2008

Abstract The availability of a wide range of satellite measurements of environmental variables at different spatial and temporal resolutions, together with an increasing number of digitized and georeferenced species occurrences, has created the opportunity to model and monitor species geographic distribution and richness at regional to continental scales. In this paper, we examine the application of recently developed global data products from satellite observations in modeling the potential distribution of tree species and diversity in the Amazon basin. We use data from satellite sensors, including MODIS, QSCAT, SRTM, and TRMM, to develop different environmental variables related to vegetation, landscape, and climate. These variables are used in a maximum entropy method (Maxent) to model the geographical distribution of five commercial trees and to classify the patterns of tree alpha-diversity in the Amazon basin. Maxent simulations are analyzed using binomial tests of omission rates and the area under the receiver operating characteristics (ROC) curves to examine the model performance, the accuracy of geographic distributions, and the significance of environmental variables for discriminating suitable habitats. To evaluate the importance of satellite data, we used the Maxent jackknife test to quantify the training gains from data layers and to compare the results with model simulations using climate-only data. For all species and tree alpha-diversity, modeled distributions are in agreement with historical data and field observations. The results compare with climate-derived patterns, but provide better spatial resolution and detailed information on the habitat characteristics. Among satellite data products, QSCAT backscatter, representing canopy moisture and roughness, and MODIS leaf area index (LAI) are the most important variables in almost all cases. 
Model simulations suggest that climate and remote sensing results are complementary and that the best distribution patterns can be achieved when the two data sets are combined. © 2008 Elsevier Inc. All rights reserved.
Keywords: Species distribution; Remote sensing data; Maxent; Amazon basin; Tree diversity

1. Introduction
Recent efforts to conserve biodiversity are moving beyond preserving only its pattern, such as particular species or populations, to include the many complex processes that produce and maintain biodiversity (Cowling and Pressey, 2001; Crandall et al., 2000). The conservation of regional biodiversity

* Corresponding author. Tel.: +1 818 354 1051; fax: +1 818 393 5184. E-mail address: saatchi@congo.jpl.nasa.gov (S. Saatchi). 0034-4257/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.rse.2008.01.008

is inextricably linked with the species that occur in a region, the genes they contain, and the other biotic and abiotic features that comprise the ecosystem (Myers et al., 2000). Under pressure to make informed management decisions rapidly, conservation practitioners must increasingly rely on predictive models to provide them with information on species distributions (Ferrier, 2002; Loiselle et al., 2003). In addition, models that predict species distributions have become key elements in documenting biodiversity on the planet and are critical to understanding the effect of multiple stresses caused by climate and human-induced changes (Fjeldsa & Lovett, 1997; Pimm, 1991).


The majority of studies in biogeography use species occurrence or museum collections to map and analyze large-scale patterns of species distribution and richness (Lovett et al., 2000; Rahbek & Graves, 2001). These studies clearly indicate that species differ in the size of their geographic range. Most species, within the same assemblage, tend to have relatively small ranges that reflect how they share space (Brown et al., 1996; Gaston, 1998). Range size may depend on a variety of ecological and evolutionary processes and extrinsic factors of the physical environment such as soils, nutrients, water, and climate (Gentry, 1988; Hunter, 2003; Kreft et al., 2006; Smith et al., 2001). Capturing the interplay of these factors is fundamental to understanding the uneven distribution of diversity on regional and global scales. In most biogeographic theories, geographic distribution of species and their diversity or richness are conceived in terms of a multidimensional coordinate system, whose axes are various resource gradients (e.g. ecological and environmental variables). This coordinate system defines a hyperspace, and the range of the space that a given species occupies is its niche. The niche is an abstract characterization of the intra-community position of the species that depends on time, space, and differences in resource gradients that cause the species evolution (Whittaker, 1972). Geographic distribution of species and their diversity or richness depends on how well their ecological niche is understood. It is widely accepted that measurement of environmental requirements to quantify the range size and patterns of species distribution and richness is an important step towards this understanding (Woodward, 1987). This generalization is true at a variety of spatial scales, suggesting the importance of measurements of environmental variables at different scales. 
For example, climate variables are of increasing importance as the scale increases from regional to continental to global scales. Currently, there is an increasing urgency among conservation biologists to quantify the environmental requirements of particular species at finer spatial scales in order to better prioritize conservation efforts. This has created the need to collect spatial information over large regions using remotely sensed measurements from airborne or satellite sensors (Turner et al., 2003). In addition, the use of remote sensing data by conservation biologists has helped frame new and important research questions. Can remote sensing data identify areas of significance to biodiversity, predict species distributions, and model community responses to environmental and anthropogenic changes? Answering these questions depends on several assumptions: 1) environmental variables, and biophysical properties that characterize species habitat, and drive its distribution are detectable by existing remote sensing sensors, 2) there are sufficient and spatially representative field observations of species presence or absence and habitat characteristics, and 3) there are distribution models capable of extending the field observations to regional and global scales with the aid of environmental variables produced by remote sensing measurements. There has been an increasing interest in studying these assumptions in recent years (Turner et al., 2003; Nagendra, 2001; Guisan & Zimmermann, 2000; Peng, 2000).

This paper examines the potential use of recently developed global datasets from satellite observations for mapping distribution patterns of tree species and diversity in the Amazon basin. Unlike regions with limited species richness and strong gradients of climate variables, the Amazon basin has some of the highest species diversity and richness in the world, but comparatively little variation in climatic variables (temperature and rainfall) (Nelson et al., 1990; De Oliveira and Mori, 1999). These regional characteristics limit the use of climate variables to develop ecological and distribution models. Remote sensing data, on the other hand, provide spatially refined information on landscape and vegetation heterogeneity over the Amazon basin that can be readily incorporated in models to predict species distribution and diversity. These models are either strictly mathematical or based on certain ecological theories. A detailed discussion or review of these models and the ecological theories is beyond the scope of this paper (see Elith et al., 2006; Graham & Hijmans, 2006). Here, we model the distribution of five widespread commercial trees and tree alpha-diversity (expressed as Fisher's alpha) over the Amazon basin. We use the maximum entropy method (Maxent) (Phillips et al., 2005), which integrates remote sensing data and geographical point locality data of species in order to model distributions and provides a predictive probability to assess the contribution of remote sensing data layers. The paper is organized into four sections: 1) description of species, remote sensing, and climate data, 2) description of the Maxent model and simulations used for testing the application of remote sensing data, 3) assessment of potential range distributions, and 4) discussion on the contribution and significance of remote sensing data for characterizing suitable areas of species habitat.
2. Species data
2.1. Amazonian tree species
Five widespread and well-documented commercial timber trees were selected for distribution modeling. The geographical locations of trees were extracted from the herbarium collection of the New York Botanical Garden and included data from a variety of forest types and landscape features in northern South America (Fig. 1). The species studied were: Calophyllum brasiliense (Clusiaceae), Carapa guianensis (Meliaceae), Hura crepitans (Euphorbiaceae), Manilkara bidentata (Sapotaceae), and Virola surinamensis (Myristicaceae). The data set did not include any subspecies with strong distributional or functional characteristics or preferences that might influence the overall distribution.
C. brasiliense (Clusiaceae), known in Brazil by the common name of jacareuba, grows as a canopy tree on a variety of soils, slopes, and elevations (up to 1500 m). The tree can reach 45 m in height with a straight bole without any buttresses or branches for about 2/3 of the height. Calophyllum (Clusiaceae) is a tropical genus composed of approximately one hundred species. The natural geographical range of C. brasiliense extends from southern Mexico throughout Central America to northern parts of South America (Record & Hess, 1943). It is also found in several Caribbean islands


Fig. 1. Geographic locations of tree species inventory data for five commercial tree species, C. brasiliense, C. guianensis, H. crepitans, M. bidentata, and V. surinamensis.

(Marques & Joly, 2000). We acquired 96 point localities spread over the Amazon basin for this tree.
C. guianensis (Meliaceae) is a deciduous or semi-evergreen medium-size species, up to 35 m tall with a straight cylindrical bole up to 100–200 cm in diameter. The species, with the trade name of andiroba, is found in the West Indies, in the Caribbean islands, throughout Central America and south to the central Amazon. We acquired 88 point localities for this tree species, spread mainly along rivers. The tree establishes itself on rich soils along streams, in periodically inundated swamp forests, in upland forests along the rivers of the Amazon basin, and at elevations ranging from shorelines to 1200 m (Guariguata et al., 2002).
H. crepitans (Euphorbiaceae), known as Acacu in Brazil, is another tall tree, ranging from 25 to 50 m in height with clear boles of 15 to 30 m and with diameters ranging from 100 to 150 cm (at times to 200 cm). The tree has a native range in tropical America, but has been naturalized in other parts of the world. Its range extends from Central America and the Caribbean islands to northern South America, with larger concentrations in Colombia, Ecuador, and northern Peru and within the white water varzea floodplains along the Amazon River. H. crepitans is also found extensively in coastal Venezuela and the Guyanas on pure sand or moist sandy loam and is frequently cultivated as a shade tree elsewhere (Freiberg, 1996). However, for this study, we could only acquire 45 point localities for this species.
M. bidentata (Sapotaceae) is a large evergreen forest tree found throughout the West Indies, ranging from Mexico through Panama to northern South America, and from Venezuela to Peru, including northern Brazil and the Guyanas. The tree is extremely

shade tolerant and grows from coastal sea levels up to a few hundred meters in elevation. There were 140 point localities for this species, which included the two subspecies bidentata and surinamensis.
V. surinamensis (Myristicaceae), known as Ucuba in Brazil, has a variety of commercial and medicinal values and is found in swampy, fertile and periodically inundated riverbanks, in Amazonian varzea forests, and in degraded and secondary forests. Its geographical range in the Neotropics extends from Central America, Costa Rica, and Panama down to the northern Amazon basin and the eastern coastal region in the Guyanas. The tree grows modestly in open forest gaps and can attain a size of 30 m in height and 100 cm in diameter (Fisher et al., 1991). The tree canopy has seasonal characteristics. In French Guiana, Ucuba flowers twice a year, in March and September, but near Manaus flowering extends from August to November and fruiting from January to July (Howe, 1990; Rodriguez, 1972). We found 133 point localities for this species.
2.2. Amazonian tree diversity
The tree diversity data were from a total of 633 plots located on a variety of forest types, including terra firme, floodplains, and swamps in the Amazon basin and the Guiana Shields (Fig. 2). The data were primarily from published 1 ha plots of the ATDN database (ter Steege et al., 2003); however, a number of smaller plots with sufficient trees (more than 150 individuals with a diameter at breast height, dbh, > 10 cm), as well as larger plots, were also included in the data set (ter Steege et al., 2003). Tree alpha-diversity, expressed as Fisher's alpha (α), a measure which


Fig. 2. Geographic locations of tree alpha-diversity (Fisher's alpha) in terra firme and inundated forests. Dots indicate the maximum Fisher's alpha found at one location (n = 633).

corrects for the sample size, was calculated for each plot using $S = \alpha \ln(1 + N/\alpha)$, where S is the number of species, N is the number of individuals, and α is the diversity coefficient (Fisher et al., 1943). Plots' geographical locations, tree alpha-diversity, and forest types were among data sets acquired from the Amazon Tree Diversity Network. Detailed information about the data sets and a partial list of references for the published plot data in the Amazon can be found in ter Steege et al. (2003).
3. Environmental data
3.1. Remote sensing data
We compiled a set of remote sensing data and products from different earth observing sensors to derive metrics sensitive to vegetation and landscape variables. The data set included both optical and microwave satellite sensors. To quantify spatial and temporal patterns in canopy structure, we used the monthly 1 km LAI (Leaf Area Index) data derived from MODIS reflectance over the five-year period 2000–2004 (Myneni et al., 2002). In this study we preferred LAI over NDVI or any other vegetation index because it relates to canopy structure and seasonality and because it had undergone various quality checks before and during LAI algorithm implementation (Myneni et al., 2002). The MODIS 8-day LAI products provided the basis for these monthly composites, which improved the data quality by further reducing the impact of clouds and any possible LAI estimation errors. We produced monthly climatological means by averaging values over

5 years (2000–2004). The climatological composites were then used to generate five metrics: annual maximum (Fig. 3a), minimum, mean, standard deviation, and range (difference of maximum and minimum). These LAI metrics provide information on net primary productivity and vegetation seasonality, both important for characterizing species geographical range. We also included the MODIS-derived vegetation continuous field (VCF) product as a measure of the percentage of tree canopy cover within each 1 km pixel (Hansen et al., 2002). The VCF product is generated from the time series composites of MODIS data from year 2001 and is available from the Global Land Cover Facility at the University of Maryland. The VCF product separates open (e.g., shrub lands, savannas), fragmented, and deforested areas from those of intact old growth forests (Fig. 3b).
As part of the microwave remote sensing measurements, we included global QSCAT (Quick Scatterometer) data available in three-day composites at 2.25 km resolution (Long et al., 2001). The three-day data over 5 years (2000–2004) were used to create average monthly composites at 1 km resolution and then further processed to produce four metrics that included annual mean and standard deviation of radar backscatter at both HH and VV polarizations (H: horizontal, V: vertical). QSCAT radar measurements are at Ku band (12 GHz) and are sensitive to surface or canopy roughness, moisture, and other seasonal attributes, such as phenological changes. For areas with low vegetation biomass, such as woodlands and savanna, measurements at different polarizations correlate positively with the aboveground biomass (Long et al., 2001; Saatchi et al., 2007). For areas with dense forest, backscatter measurements are


Fig. 3. A selection of the remote sensing data layers used in this study. The panels show (a) MODIS LAI annual maximum, (b) MODIS percentage tree cover, (c) QSCAT annual mean, and (d) mean elevation from SRTM.

sensitive to canopy roughness and moisture and contribute to measuring differences in forest types and canopy structure. The long-term (5 years) averages of MODIS and QSCAT data and the metrics used in this study are assumed to approximately represent the climatological mean of the environmental variables they represent (Buermann et al., 2002). In this study, we used the annual mean (Fig. 3c) and standard deviation of QSCAT HH backscatter data over 1 year and excluded the VV backscatter data because of its high correlation with the HH backscatter over tropical forests. Finally, we included the SRTM (Shuttle Radar Topography Mission) digital elevation data, aggregated from a 100-meter resolution to 1 km. In addition to the mean elevation (Fig. 3d), the standard deviation was also included to represent surface

ruggedness or roughness. Overall, seven remote sensing data layers (2 LAI, 2 QSCAT, 1 VCF, 2 SRTM) were included in this study (Table 1). These layers were chosen after performing a correlation test and removing highly correlated layers (Buermann et al., in press).
3.2. Climate data
A series of climate metrics was obtained from WorldClim (WorldClim version 1.4; Hijmans et al., 2005). These climate metrics are derived from monthly temperature and rainfall values and represent biologically meaningful variables for characterizing species distribution (Nix, 1986). The WorldClim data layers included 11 temperature and eight precipitation metrics, expressing

Table 1. Overview of remote sensing data sets used in the Maxent predictions along with their native resolution and ecological interpretation
Data record | Instrument | Ecological variable | Native resolution
Leaf area index (LAI): maximum LAI, LAI range | MODIS | Vegetation phenology, structure, and net primary productivity | 1 km and 8 days
Percent tree cover | MODIS | Forest cover and heterogeneity | 1 km
Scatterometer backscatter: annual mean HH, annual STD HH | QSCAT | Surface (canopy) moisture and roughness (forest structure) | 2.25 km and 3 days
DEM | SRTM | Topography and ruggedness | 90 m–1 km
Rainfall | TRMM | Monthly rainfall | 0.25 × 0.25 deg.

4. Methodology
4.1. Maxent model
We used the Maxent algorithm, which was recently introduced for modeling species distributions (Phillips et al., 2005). Maxent is a general-purpose algorithm that generates predictions or inferences from an incomplete set of information. The Maxent approach is based on a probabilistic framework. It relies on the assumption that the incomplete empirical probability distribution (which is based on the species occurrences) can be approximated with a probability distribution of maximum entropy (the Maxent distribution) subject to certain environmental constraints, and that this distribution approximates a species' potential geographic distribution (Phillips et al., 2005). The input data include a set of environmental layers for a geographical region and a set of species presence data inside that region. Like most maximum likelihood estimation approaches, Maxent assumes a uniform distribution a priori and performs a number of iterations in which the weights are adjusted to maximize the average probability of the point localities (also known as the average sample likelihood), expressed as the training gain (Phillips, 2005). These weights are then used to compute the Maxent distribution over the entire geographic space. As in the case of the present study, Maxent can be applied to species presence-only geographic locations and remote sensing data to produce distributions expressing the suitability of each grid cell as a function of the environmental variables at that grid cell. A high value of the function at a particular grid cell indicates that the grid cell is predicted to have suitable conditions for that species (Phillips, 2005). Compared to other existing models, Maxent has a number of features that make it very useful for modeling species distribution (Elith et al., 2006; Phillips et al., 2005). These include a deterministic framework and, hence, stability, as well as the ability to run with presence-only point occurrences, high performance with few point localities, better computing efficiency enabling the use of large-scale high-resolution data layers, continuous output from least to most suitable conditions, and the ability to model complex responses to environmental variables. Last but not least, the newest Maxent version (2.3) is equipped with several features aimed at supporting the interpretation of the model results. For example, Maxent has a built-in jackknife

spatial variations in annual means, seasonality, and extreme or limiting climatic factors. The climate metrics were developed using long time series of a global network of more than 4000 weather stations from various sources such as the Global Historical Climatology Network (GHCN), the FAO (the United Nations Food and Agricultural Organization), the WMO (World Meteorological Organization), the International Center for Tropical Agriculture (CIAT), R-HYdronet, and additional country-based stations. The station data were interpolated to monthly climate surfaces at 5 km spatial resolution by using a thin-plate smoothing spline algorithm with latitude, longitude, and elevation (SRTM) as independent variables (Hijmans et al., 2005). In addition to the bioclimatic variables interpolated from the station data, we used remotely sensed precipitation data from the sensors onboard the Tropical Rainfall Measuring Mission (TRMM) (Kummerow et al., 1998). The TRMM products were obtained from the global rainfall algorithm (3B43), combining the estimates from the sensors with the global gridded rain gauge data from the Climate Assessment and Monitoring System (CAMS), produced by NOAA's Climate Prediction Center, and/or the global rain gauge product produced by the Global Precipitation Climatology Center (GPCC). The output is rainfall for 0.25 × 0.25 degree grid boxes for each month. Monthly rainfall data from TRMM covering the tropical region (20°N–20°S) and extended to (50°N–50°S) over a period of 9 years (1998–2006) were used to develop climatologically averaged precipitation metrics such as the total annual, driest quarter, wettest quarter, and seasonality (coefficient of variation). While developing the climatological metrics, we resampled the TRMM data to 5 km resolution using a cubic-spline routine in order to be compatible with the WorldClim data layers.
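The climatologically averaged precipitation metrics named above can be illustrated with a short sketch for a single grid cell. This is only an illustration under stated assumptions: the function name `precip_metrics`, the 12-value monthly climatology input, the wrap-around definition of a quarter (any three consecutive months, including November to January), and the use of the population standard deviation in the coefficient of variation are all assumptions, not details given in the text.

```python
from statistics import mean, pstdev

def precip_metrics(monthly_mm):
    """Summarize a 12-month rainfall climatology for one grid cell.

    monthly_mm: climatological mean rainfall for Jan..Dec (hypothetical input).
    """
    m = [float(v) for v in monthly_mm]
    if len(m) != 12:
        raise ValueError("expected 12 monthly values")
    # Rolling 3-month totals; the year is treated as circular (assumption).
    quarters = [m[i] + m[(i + 1) % 12] + m[(i + 2) % 12] for i in range(12)]
    avg = mean(m)
    # Seasonality as coefficient of variation in percent (population std, assumption).
    cv = 100.0 * pstdev(m) / avg if avg > 0 else 0.0
    return {
        "annual_total": sum(m),
        "wettest_quarter": max(quarters),
        "driest_quarter": min(quarters),
        "seasonality_cv": cv,
    }
```

Applied per pixel to the resampled monthly climatology, a routine of this kind would yield the four precipitation layers listed in the text.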
The TRMM measurements are superior to precipitation layers in the WorldClim dataset because of direct rainfall measurements from space, calibration accuracy, and coverage over areas in the tropics where no ground stations are available. After removing the correlated climate layers (Buermann et al., in press), we used only nine independent climate variables for the model runs (Table 2).
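The text does not spell out how correlated layers were removed (the procedure is deferred to Buermann et al., in press), so the following is only one plausible sketch: a greedy filter that keeps a layer when its absolute Pearson correlation with every layer already kept stays below a threshold. The threshold of 0.9, the greedy ordering, and the names `pearson` and `drop_correlated` are assumptions made for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences of pixel values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant layer carries no correlation information
    return sxy / sqrt(sxx * syy)

def drop_correlated(layers, threshold=0.9):
    """Return names of layers kept by a greedy correlation filter.

    layers: dict mapping layer name -> values sampled at the same pixels.
    threshold: hypothetical cutoff on |r|; not a value from the paper.
    """
    kept = []
    for name, values in layers.items():
        if all(abs(pearson(values, layers[k])) < threshold for k in kept):
            kept.append(name)
    return kept
```

With a filter of this kind, the result depends on the order in which layers are offered, so the layers judged most interpretable would be listed first.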

Table 2. Bioclimatic variables used in Maxent predictions (5 km resolution)
Bioclimate layer | Layer description and unit
BIO1 | Annual mean temperature
BIO2 | Mean diurnal range (mean of monthly (max temp − min temp))
BIO3 | Temperature seasonality (standard deviation × 100)
BIO4 | Max temperature of warmest month
BIO5 | Min temperature of coldest month
BIO6 | TRMM annual precipitation
BIO7 | TRMM precipitation seasonality (coefficient of variation)
BIO8 | TRMM precipitation of wettest quarter
BIO9 | TRMM precipitation of driest quarter


option, which allows the estimation of the significance of individual environmental data layers in computing the species distributions. Maxent also provides statistical measures for model performance such as omission rates and the area under the Receiver Operating Characteristic (ROC) curve (AUC), and response curves for each environmental layer showing how the Maxent prediction depends on a particular environmental variable (Phillips, 2005). In this study, we used two approaches to evaluate the model performance: the omission rates and the AUC. In general, a low omission rate of species occurrence is necessary for potentially predicting the species distribution ranges (Anderson et al., 2003).
4.2. Scenarios and quantitative analysis
We developed several experiments to test the contribution of high-resolution satellite data in modeling species distribution and diversity. First, we used Maxent to model the distribution of five tree species with remote sensing data only at 1 km resolution and evaluated the overall performance of the model and the contribution of each data layer. We compared these results with Maxent models derived from climate data only and examined the spatial details obtained from remote sensing data. For species diversity, we divided the tree alpha-diversity values into different range classes based on the histogram of tree alpha-diversity values for available sites, and used Maxent to provide distributions for each class. The selection of the diversity range for each category was based on how the values were distributed over the entire 633 sites (Fig. 2). We divided the sites into incremental groups to sample the distribution of point locations and provide ample point locations for modeling. The initial Maxent run was performed for training data with alpha < 20, corresponding to all the low diversity sites in the Amazon.
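Because the diversity classes are defined on Fisher's alpha, each plot's alpha has to be obtained first by solving $S = \alpha \ln(1 + N/\alpha)$ numerically, since the relation has no closed-form solution for α. The sketch below uses bisection and then assigns a plot to training groups. The function names, the bisection tolerance, and the default class thresholds passed to `training_classes` are assumptions for illustration; this is not the authors' code.

```python
from math import log

def fishers_alpha(n_species, n_individuals, tol=1e-10):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    S, N = n_species, n_individuals
    if not 0 < S < N:
        raise ValueError("requires 0 < S < N")

    def f(a):
        # Increasing in a: tends to 0 - S as a -> 0 and to N - S as a -> infinity.
        return a * log(1.0 + N / a) - S

    lo, hi = 1e-9, 1.0
    while f(hi) < 0:          # expand the bracket until the root is enclosed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def training_classes(alpha, thresholds=(20, 40, 60, 80, 100, 120, 180)):
    """Label the groups a site belongs to (threshold values are assumptions)."""
    if alpha < thresholds[0]:
        return ["<%d" % thresholds[0]]
    # Cumulative groups: a site joins every class whose threshold it exceeds.
    return [">%d" % t for t in thresholds if alpha > t]
```

For example, a hypothetical plot with 30 species among 500 stems falls in the low-diversity group, while one with 100 species among 500 stems exceeds the first threshold only.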
Afterward, the model runs were performed for all the sites with values greater than the Fisher's alpha threshold for each category (>20, >40, >60, >80, >100, >120, >180). We used a threshold of 25% for the predictive probabilities obtained for each class range and combined the derived distributions in a decision rule approach to develop a classification map of tree alpha-diversity for the entire range. The threshold value of 25% allowed the largest predictive area and suitability for each class range. After evaluating the contribution of remote sensing data layers to model outputs, we compared the results with a similar experiment obtained from climate-only data. The final distribution maps of tree species and diversity were produced using all point localities available for each species. Spatial accuracy of the Maxent predictions was tested closely following the procedures in Phillips et al. (2005). In detail, we created 10 random data partitions with 60% of the point localities assigned for training and 40% for testing, and ran each scenario with each of these 10 data partitions. Model performance was then tested at fixed thresholds (threshold-dependent) and across all thresholds (threshold-independent). In the threshold-dependent case, we evaluated extrinsic (test) omission rates, defined as the fraction of test localities that fall into pixels outside the predicted area, at the 10% cumulative probability threshold. The proportional predicted area is also

provided as the fraction of all the pixels predicted as being suitable for the species. A one-tailed binomial test, as a measure to assess whether the omission rate is lower or higher than random, was used to determine whether the model could significantly predict the test localities. In the threshold-independent test, we analyzed the area under the ROC curve (AUC) for both the training and the test datasets and estimated how significantly each model prediction differed from random using a ties-corrected Mann–Whitney U test (Phillips et al., 2005). The ROC curve provides a quantitative representation of the tradeoff between omission error (1 − sensitivity) and commission error (1 − specificity). The sensitivity represents the absence of the omission error, and the quantity 1 − specificity represents the commission error (Cantor et al., 1999). The ROC curve is obtained by plotting the sensitivity on the y axis and 1 − specificity on the x axis for all possible thresholds (Swets, 1988). The area under the ROC curve is an important metric to measure the model performance. The larger the AUC, the higher the sensitivity rate and the lower the 1 − specificity rate. An AUC equal to 1.0 represents an ideal diagnostic test because it achieves both 100% sensitivity and 100% specificity. An AUC of 0.5 indicates that the test performs no better than random, with 50% sensitivity and 50% specificity rates, suggesting high omission and commission errors (Cantor et al., 1999). Finally, we compared the distributions obtained from remote sensing data with Maxent distributions derived from climate variables. We repeated the same experiments with nine independent bioclimatic variables and compared the final distributions for tree species and tree alpha-diversity to illustrate the significance of spatial information in satellite observations and to explain the complementary habitat characteristics obtained from remote sensing data and products.
5. Results
5.1. Distribution of tree species
Maxent models for distribution of five tree species were generated using the georeferenced locations and remote sensing data layers at 1 km spatial resolution (Table 1) excluding the TRMM precipitation metrics. We used two indicators to examine the performance of the model: extrinsic omission evaluated at a fixed threshold and the threshold-independent area under the ROC curve (AUC) (Table 3). The indicators were obtained using 40% of the point locality data as test localities with the remainder used for training. For all species and all data partitions, the AUC values were significantly better than random (0.5). The AUCs in the training and test cases generally showed only small differences, suggesting little overfitting in the Maxent predictions. At the 10% fixed cumulative probability threshold, the extrinsic omission rates were small, associated with reasonable fractions of predicted areas, again suggesting meaningful model predictions. The overall performance of the model for all five species was high, indicating that the Maxent-derived distributions were a close approximation of the probability distribution that represents the reality. Maxent models for distribution of five tree species were generated using the georeferenced locations and remote sensing

Table 3
Results of threshold-dependent omission tests and threshold-independent ROC tests for five tree species, including fractional predicted area, test omission rates, and the area under the ROC curve (AUC)

                                        Threshold-dependent test                        Threshold-independent test
Species name (number of occurrences)    Fractional predicted area   Test omission rate   AUC test (training)
(a) C. brasiliense (96)                 0.533                       0.106                0.756 (0.821)
(b) C. guianensis (88)                  0.356                       0.153                0.843 (0.914)
(c) H. crepitans (45)                   0.289                       0.212                0.832 (0.921)
(d) M. bidentata (140)                  0.321                       0.095                0.853 (0.907)
(e) V. surinamensis (133)               0.377                       0.142                0.833 (0.896)

Test omission rates were calculated at the 10% threshold level. Values represent averages from 10 separate random training/test data partitions. For each partition, statistical evaluations of test omission rates (one-tailed binomial) and test AUC (Mann–Whitney U) indicated that the predictions were significantly better than random (p < 0.001; individual p-values not shown).
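The two kinds of test reported in Table 3 can be illustrated with a minimal Python sketch. It is not the Maxent implementation: the function names and the toy scores are hypothetical, the AUC is computed as the normalized Mann–Whitney U statistic against randomly drawn background pixels (with ties counted as one half), and the binomial null model assumes that a random prediction covering the same fractional predicted area would capture each test locality with probability equal to that fraction.

```python
from math import comb


def auc_mann_whitney(presence_scores, background_scores):
    """AUC as the normalized Mann-Whitney U statistic; a tie counts one half."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def omission_rate(presence_scores, threshold):
    """Fraction of test localities scored below the suitability threshold."""
    missed = sum(1 for p in presence_scores if p < threshold)
    return missed / len(presence_scores)


def binomial_p_value(n_test, n_omitted, predicted_area_fraction):
    """One-tailed probability of capturing at least the observed number of
    test localities if each were captured with probability equal to the
    fractional predicted area (the assumed random-prediction null model)."""
    hits = n_test - n_omitted
    a = predicted_area_fraction
    return sum(comb(n_test, k) * a ** k * (1.0 - a) ** (n_test - k)
               for k in range(hits, n_test + 1))


# Hypothetical cumulative scores (0-100), for illustration only.
test_scores = [60, 35, 80, 8, 22]
background = [5, 12, 35, 3, 40, 9, 1, 15]
auc = auc_mann_whitney(test_scores, background)
omission = omission_rate(test_scores, threshold=10)
p_value = binomial_p_value(n_test=5, n_omitted=1, predicted_area_fraction=0.5)
```

In practice the background sample would be much larger (Phillips et al., 2005, drew 10,000 random pixels), and the pairwise loop would be replaced by a rank-based computation.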

data layers at 1 km spatial resolution (Table 1) excluding the TRMM precipitation metrics. We used two indicators to examine the performance of the model: the fraction of predicted area and extrinsic omission rate as threshold-dependent tests and the area under the ROC curve (AUC) as the threshold-independent test (Table 3). The indicators were obtained using approximately 40% of the training data as test localities for evaluating the performance statistics. For all species, the AUC values were significantly better than random (0.5), with one-tailed p < 0.001. This result was obtained for both the training and the test data, with the small difference in AUC values suggesting a robust performance of the Maxent algorithm in capturing the variations in environmental variables over point localities. All omission tests were calculated at the 10% threshold value. At this threshold, the fractional predicted area shows the fraction of all the pixels that are predicted as suitable for the species. For all species, the extrinsic omission rates were small, suggesting that only a small fraction of the test locations fell into pixels not predicted as suitable for the species. The overall performance of the model for all five species was high, implying that the Maxent-derived distributions were a close approximation of the probability distribution that represents the reality.

The spatial distributions of tree species in terms of the predictive probability were segmented into five categories to represent the ranges of habitat suitability (Fig. 4). As described earlier, the value assigned to a pixel is the sum of the probabilities of that pixel and all other pixels with equal or lower probability, multiplied by 100 to give a percentage. Theoretically, any pixel with probability greater than 1% is considered suitable for the species habitat. However, here we are mainly interested in areas with higher probability (>20%).

For C. brasiliense (Fig. 4a), the areas in the central Amazon have the highest cumulative probabilities (>20%). Floodplains of the central Amazon, dominated by closed-canopy varzea forests, the tidal varzea of the Amazon estuary in the state of Para, and southern basins of the Tapajos, Itiri, and Xingu rivers all fall in the >50% probability range. C. brasiliense is considered to be one of the most exploited timber species in the varzea forests and is on the verge of extinction in these regions due to unsustainable logging practices (Higuchi et al., 1994). The model also predicts areas in fragmented forests along the Atlantic coast of Brazil in southern Bahia, the northwestern Amazon in Colombia, Ecuador, and Peru, and some areas of the Guiana Shields as suitable habitats for the species (Fisher & Dos Santos, 2001). The results of the jackknife test of variable importance showed the highest gain (>0.3) for the QSCAT mean backscatter data, suggesting areas with high moisture in floodplain and terra firme forests as the suitable habitat for C. brasiliense (Fig. 5a). Other variables, such as the maximum LAI and the percent tree cover, with moderate gains (0.1–0.2), were the next contenders in defining the habitat.

The Maxent prediction for C. guianensis pointed to the central Amazon, the states of Amazonas and Para, the Guiana Shields, the northern coast of Venezuela, and the Atlantic coastal forests and varzea floodplains as regions of highest probability for the geographical range of the species (Fig. 4b). The remote sensing variables with the highest gains were QSCAT mean backscatter (0.49) and SRTM elevation (0.6). A close examination of Fig. 5b shows that areas delineated by elevation less than 300 m and high moisture along the floodplains and coastal regions were suitable habitats. In addition, the Colombian Pacific Coast region (Chocó) (Lellinger and Sota, 1978), a stretch of land mainly between the Pacific Ocean and the Cordillera Occidental of the Andes, from west of the mouth of the Atrato River near Panama to the Mataje River in the south, bordering northwestern Ecuador, was also predicted as a suitable range for C. guianensis (known as tangare in the region) (Gentry, 1982). This result clearly shows the strength of remote sensing data and the Maxent model for predicting species range, in particular in areas where no training data were available.

H. crepitans had the fewest point localities among the five species, and they were scattered mainly in the western Amazon. Maxent predicted areas outside the central Amazon as the suitable habitat. Areas with the highest probabilities (>50%) were in the western lowlands of Peru and Ecuador, in southern Bolivia, along the Beni river basin, and in the eastern regions of the Brazilian Amazon and the coastal regions of Surinam, Guyana, and French Guiana (Fig. 4c). Along the Atlantic coast of Brazil, the model predicts small regions in southern Bahia as potentially suitable habitat for H. crepitans. Similarly, narrow regions in varzea forests along the Rio Solimoes and its tributaries are delineated as potential habitat. In general, H. crepitans is considered a semi-evergreen species found in seasonal forests which, along with other emergent trees, undergoes foliage reduction during the dry season (Condit et al., 2000; Schongart et al., 2002). Among remote sensing data, the QSCAT mean and standard deviation and the maximum and range of LAI were selected as variables with high gains for defining the species range. Mean QSCAT and maximum LAI both reached gains >0.6 in the jackknife analysis, suggesting forests with canopy moisture, roughness, and leaf area as potential habitat (Fig. 5c). The standard deviation of QSCAT and the range of LAI had gains >0.3, and both pointed to forests with seasonal canopy characteristics. Topography, on the other hand, was not


Fig. 4. Maxent Prediction of potential geographic distribution of five tree species made using all occurrence records and the remote sensing data at 1 km resolution. The predictive probability values ranging from 0 to 100 are depicted by colors. (a) C. brasiliense, (b) C. guianensis, (c) H. crepitans, (d) M. bidentata, and (e) V. surinamensis.
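The cumulative values mapped in Fig. 4 follow from the definition given in the text: each pixel receives the sum of the probabilities of all pixels with equal or lower probability, multiplied by 100. The following Python sketch shows that transform on a toy list of raw probabilities; the function names and the numbers are hypothetical and serve only to make the definition concrete.

```python
from collections import Counter


def cumulative_percent(raw_probabilities):
    """Convert raw probabilities (summing to 1) to cumulative values (0-100).

    A pixel's value is 100 times the total probability of all pixels whose
    raw probability is less than or equal to its own.
    """
    counts = Counter(raw_probabilities)
    running = 0.0
    cumulative = {}
    for value in sorted(counts):
        running += value * counts[value]
        cumulative[value] = 100.0 * running
    return [cumulative[p] for p in raw_probabilities]


def fractional_predicted_area(cumulative_values, threshold):
    """Fraction of pixels at or above a cumulative threshold."""
    suitable = sum(1 for v in cumulative_values if v >= threshold)
    return suitable / len(cumulative_values)


# Hypothetical five-pixel landscape, for illustration only.
raw = [0.1, 0.4, 0.2, 0.2, 0.1]
cumulative = cumulative_percent(raw)
area_at_50 = fractional_predicted_area(cumulative, threshold=50.0)
```

With this toy input the two lowest pixels receive a value of about 20, the two intermediate pixels about 60, and the most probable pixel 100, so three of the five pixels lie at or above a threshold of 50.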


Fig. 5. Results from the Maxent jackknife test of the importance of remote sensing variables used for five tree species. The graphs depict the training gains when a variable is used in isolation, when the variable is excluded, and when all variables are utilized. The gain is a measure of how much better the Maxent probability distribution fits the distribution of occurrence data. A variable carries useful information when its gain is high when used in isolation, and it carries unique information when excluding it reduces the gain the most.
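The reading rule stated in the caption of Fig. 5 can be written as a short Python sketch. The gains below are hypothetical numbers chosen for illustration and are not taken from Fig. 5; the function and variable names are likewise assumptions. The example is built so that the variable that is most informative alone differs from the variable whose removal costs the most gain.

```python
def read_jackknife(gain_all, gain_only, gain_without):
    """Return (most informative variable alone, most unique variable).

    gain_all: training gain with every variable included.
    gain_only: gain when each variable is used in isolation.
    gain_without: gain when each variable is excluded.
    """
    best_alone = max(gain_only, key=gain_only.get)
    loss_when_dropped = {name: gain_all - gain
                         for name, gain in gain_without.items()}
    most_unique = max(loss_when_dropped, key=loss_when_dropped.get)
    return best_alone, most_unique


# Hypothetical gains for illustration only; not values from this study.
gain_all = 1.20
gain_only = {"qscat_mean": 0.50, "lai_max": 0.40, "srtm_elevation": 0.30}
gain_without = {"qscat_mean": 1.10, "lai_max": 1.15, "srtm_elevation": 0.90}
best_alone, most_unique = read_jackknife(gain_all, gain_only, gain_without)
```

A variable can rank first in isolation yet lose little when dropped if other layers carry correlated information, which is why both bars are inspected.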


an important indicator for the habitat. Visual inspection of the predicted areas at 1 km resolution confirmed that the distribution compared well with the seasonal characteristics of H. crepitans observed in field experiments (Freiberg, 1996; Schongart et al., 2002).

M. bidentata distribution is widespread in the Amazon basin, with the highest predicted probability in the central region of Brazil, especially in the states of Amazonas, Para (heavily logged), Amapa, and Roraima, and along the northeastern Atlantic coast, the Amazon estuary, French Guiana, Surinam, and Guyana (Fig. 4d). Maxent predicts areas in upland terra firme and floodplain forests as preferred habitats for M. bidentata. The distribution extends to western South America, the state of Acre in Brazil, northern Peru, Ecuador, and Colombia. The range also covered areas in the western Chocó region of Colombia to southern Panama (Faber-Langendoen & Gentry, 1991). Several remote sensing variables help define the geographical range of M. bidentata. The training gains obtained from the jackknife statistics showed that mean QSCAT (gain > 0.5), QSCAT standard deviation (gain > 0.3), MODIS percent tree cover (gain > 0.4), SRTM elevation (gain > 0.4), and maximum LAI (gain > 0.4) are among the important remote sensing variables contributing to the predicted distribution (Fig. 5d). In general, moist (high mean QSCAT), closed-canopy forests (high percent tree cover) with low seasonality (low QSCAT standard deviation) and medium LAI with almost no seasonal variations were the best indicators for the habitat. The range was primarily limited to areas of low elevation and small variations in topography.

V. surinamensis is also predicted to be widespread in the central Amazon, extending from the eastern Atlantic coast and the Guiana Shields to the western regions of Peru, Ecuador, Colombia, and areas along the lowland Andes to southern ridges in Bolivia (Fig. 4e). Areas with high predictive probability are predominantly along the Amazon River floodplains, coastal forests extending to Venezuela, and in the Chocó region of western Colombia to Panama (Fisher et al., 1991; Gentry, 1975; Howe, 1990; Rodriguez, 1972). Areas in the southern Amazon region of Brazil along the transitional forests are also predicted as suitable habitat, but with lower probability (<20%). Analysis of Maxent results showed seasonality as less important in characterizing the distribution. This is mainly due to the fact that the point localities of the species covered a wide range of landscapes with a wide range of seasonality. However, the canopy moisture from QSCAT backscatter (gain > 0.5), maximum LAI from MODIS (gain > 0.4), and the elevation from SRTM (gain 0.4) contributed significantly in defining the species range (Fig. 5e). One of the most apparent features in the distribution is the high probability of prediction (>50%) in the Amazon floodplains.

Fig. 6. Results from Maxent predictions of tree alpha-diversity made from remote sensing data and inventory plots (n = 633). Classification map of tree alpha-diversity produced from Maxent predictions using 25% probability threshold for each class range.

5.2. Distribution of tree alpha-diversity

Using the remote sensing data layers, we ran the Maxent model for nine categories of Fisher's alpha (Fig. 6). The numbers of point localities for each category were sufficient for the model runs without encountering problems associated with over- or under-predictions. In fact, in all cases, the model performance determined by the area under the ROC curve was significantly (p < 0.001) better than random (Table 4). Similarly, the training omission rates were small and the fractional predicted area over the entire environmental space covering northern South America was reasonably large. The results in Table 4 provided confidence in Maxent prediction of spatial distribution of tree alpha-diversity for each category. The results from dividing the point localities randomly in training (60% of points) and testing (40% of points) reduced the AUC values about 5% on the average, suggesting a reliable model performance under more constrained conditions. The threshold-dependent omission tests also provided low omission rates and relatively large fractional predicted area (Table 4). After combining the distribution maps derived for all nine scenarios, a classification map of Fisher's alpha was produced
Table 4
Results from the threshold-dependent omission tests and threshold-independent ROC tests for eight range classes of tree alpha-diversity (expressed as Fisher's alpha), including fractional predicted area

                                                            Threshold-dependent test                        Threshold-independent test
Fisher's alpha-diversity class   Number of point localities   Fractional predicted area   Test omission rate   AUC test (training)
<20                              118                          0.345                       0.118                0.861 (0.897)
>20                              515                          0.278                       0.128                0.877 (0.909)
>40                              370                          0.266                       0.106                0.886 (0.915)
>60                              249                          0.226                       0.117                0.901 (0.932)
>80                              176                          0.221                       0.141                0.907 (0.940)
>100                             112                          0.207                       0.106                0.911 (0.933)
>120                             79                           0.184                       0.080                0.935 (0.951)
>180                             28                           0.144                       0.038                0.952 (0.970)

Test omission rates were calculated at the 10% threshold level. Values represent averages from 10 separate random training/test data partitions. For each partition, statistical evaluations of test omission rates (one-tailed binomial) and test AUC (Mann–Whitney U) indicated that the predictions were significantly better than random (p < 0.001; individual p-values not shown).
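For readers unfamiliar with the diversity index used to define the classes in Table 4, Fisher's alpha is the parameter of the log-series relation S = alpha ln(1 + N/alpha), where S is the number of species and N the number of individuals in a plot. The Python sketch below solves this relation numerically by bisection and lists which of the Table 4 thresholds a plot exceeds. The function names are hypothetical, and treating the ">" classes as nested thresholds is an assumption suggested by the point counts in Table 4 (118 + 515 = 633), not a procedure stated in the text.

```python
import math


def fisher_alpha(n_species, n_individuals, tol=1e-9):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    if not 0 < n_species < n_individuals:
        raise ValueError("need 0 < number of species < number of individuals")

    def excess(alpha):
        return alpha * math.log(1.0 + n_individuals / alpha) - n_species

    lo, hi = 1e-9, 1.0
    while excess(hi) < 0.0:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def classes_exceeded(alpha, thresholds=(20, 40, 60, 80, 100, 120, 180)):
    """Thresholds from Table 4 that a plot's alpha exceeds (assumed nested)."""
    return [t for t in thresholds if alpha > t]


# Hypothetical plot: 500 stems with a species count consistent with alpha = 50.
species_count = 50.0 * math.log(1.0 + 500 / 50.0)
alpha = fisher_alpha(species_count, 500)
```

The left-hand side increases monotonically with alpha and approaches N from below, so a unique solution exists whenever the number of species is smaller than the number of individuals.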


at 1 km resolution. The resulting map is a potential distribution of ranges of tree alpha-diversity over the basin (Fig. 6). Given the Maxent performance and the distribution of sampling sites over the basin, any extrapolation of diversity outside the main range of the sampled data and over other ecosystems or biomes is not warranted. Therefore, the analysis of the range map is restricted to the Amazon basin.

The largest swath of high tree diversity is in terra firme forest of the western Amazon, stretching from the foothills of the Andes in southern Colombia, Ecuador, and Peru into the central Amazon in Brazil. The Fisher's alpha associated with this region exceeded 180. The map also revealed many smaller areas of high tree diversity (alpha > 180) in areas east of the Rio Negro and north of Manaus, in northeastern Brazil in the state of Amapa, the southern basins of the Tapajos and Xingu rivers in the state of Para, areas along the southern and eastern Guiana Shields, and outside the basin in the Atlantic coastal forests in Bahia. In contrast, areas in the central Amazon between the Rio Negro and Solimoes and their tributaries are dominated by forests with lower tree alpha-diversity (alpha < 120). This region is dominated by extensive river systems and varzea and igapo floodplains, and, depending on their proximity to rivers, sediment load, and topography, the forests contain a large range of diversity. Further south, in transitional deciduous and semideciduous forests of Brazil and in Chiquitano dry forests of Bolivia, the diversity drops to its minimum (alpha < 40). Note that the separation of the high diversity Amazon basin from the surrounding woodland and grassland savanna (cerrado) and higher elevation Andean vegetation types is an artifact resulting from Maxent's extrapolation over the environmental space. As there were no point localities sampling these biomes in our database, model predictions for these regions are not warranted.
The jackknife tests of training gains for the remote sensing variables were performed for all individual runs and showed almost the same results, with slight variability in gain values. We show results for alpha > 20 to demonstrate the significance of each variable (Fig. 7). The most important variables were vegetation canopy roughness and moisture from QSCAT, percent tree cover from MODIS, and mean and seasonality of LAI from MODIS. The highest gains were achieved for the QSCAT mean backscatter (gain > 0.7), maximum LAI (gain > 0.6), and percent tree cover (gain > 0.5). SRTM elevation, although important for the overall performance of the model, had a low gain compared to other variables (gain < 0.3). LAI range and the standard deviation of QSCAT both had relatively high gains, suggesting seasonality as an important factor for the distribution of areas with high diversity. An examination of the model response curves to input variables suggested that, in general, high moisture and low seasonality were associated with areas of high diversity.

To further demonstrate the role of remote sensing data in separating the tree diversity over the basin, we plotted the Fisher's alpha for all 633 plots with respect to the mean and standard deviation of backscatter from QSCAT and the SRTM standard deviation (Fig. 8). In each case, we developed an envelope based on exponential functions to show the relationship between the maximum (equivalent to 90th quantile) tree alpha-diversity and the remote sensing variable. The envelopes showed three important trends: 1) QSCAT mean backscatter was positively correlated (R² = 0.73, p < 0.001) with the maximum alpha-diversity, suggesting that areas with higher canopy moisture and roughness were associated with higher diversity (Fig. 8a), 2) QSCAT standard deviation of backscatter was negatively correlated (R² = 0.81, p < 0.001) with maximum tree alpha-diversity, indicating that higher tree diversity was associated with areas of less seasonality and higher stability in moisture (Fig. 8b), and 3) maximum tree alpha-diversity was negatively correlated (R² = 0.62, p < 0.001) with the standard deviation of SRTM, suggesting that areas with less variation in elevation, primarily lowlands, were associated with higher tree diversity (Fig. 8c).

5.3. Comparison with climate derived models

We performed the comparison of remote sensing results with distributions derived from nine bioclimatic layers for all five tree species and the Fisher's alpha scenarios. Here, we show distributions from V. surinamensis and the tree alpha-diversity to summarize the results from bioclimatic variables (Fig. 9).
For all tree species, the distributions derived from remote sensing data were superior to climate data mainly because of the very coarse spatial resolution of the climate data and the limited variations of temperature and precipitation over the lowland Amazonian forests where species data were collected. For example, the V. surinamensis distributions from climate (Fig. 9a) and remote sensing data (Fig. 4e) have similar patterns for comparable predictive probabilities. However, there are two distinct

Fig. 7. Example of the Maxent jackknife test gains for the importance of remote sensing variables for tree alpha-diversity >20 (n = 515).


Fig. 8. Distribution of tree alpha-diversity of inventory plots as a function of selected remote sensing variables. Exponential functions are used as envelopes to show the general trends of maximum diversity (equivalent to 90th quantile) with respect to (a) QSCAT annual mean backscatter, (b) QSCAT annual standard deviation of backscatter, and (c) standard deviation of SRTM data.
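One way to construct an envelope of the kind shown in Fig. 8 is sketched below in Python. The procedure is an assumption, since the text states only that exponential functions were fitted to the maximum diversity (equivalent to the 90th quantile): the remote sensing variable is divided into equal-width bins, the 90th percentile of Fisher's alpha is taken in each bin, and an exponential a exp(b x) is fitted by least squares on the logarithm of those percentiles. All names, the bin count, and the minimum bin size are hypothetical.

```python
import math


def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def fit_exponential_envelope(x, y, n_bins=5, q=0.9, min_points=3):
    """Fit y_envelope = a * exp(b * x) to per-bin q-quantiles of y (y > 0)."""
    x_min, x_max = min(x), max(x)
    if x_max == x_min:
        raise ValueError("x must span a nonzero range")
    width = (x_max - x_min) / n_bins
    bins = [[] for _ in range(n_bins)]
    for xi, yi in zip(x, y):
        idx = min(int((xi - x_min) / width), n_bins - 1)
        bins[idx].append((xi, yi))
    centers, log_env = [], []
    for members in bins:
        if len(members) < min_points:  # skip sparsely sampled bins
            continue
        centers.append(sum(m[0] for m in members) / len(members))
        log_env.append(math.log(percentile([m[1] for m in members], q)))
    if len(centers) < 2:
        raise ValueError("need at least two populated bins")
    # Ordinary least squares of ln(envelope) on x.
    n = len(centers)
    mean_x = sum(centers) / n
    mean_l = sum(log_env) / n
    sxx = sum((c - mean_x) ** 2 for c in centers)
    sxl = sum((c - mean_x) * (l - mean_l) for c, l in zip(centers, log_env))
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b
```

A positive fitted b corresponds to the trend described for mean backscatter (Fig. 8a), and a negative b to the trends described for the two standard deviation variables (Fig. 8b and c).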

differences: 1) finer resolution remote sensing data allow prediction of range patterns along geomorphological features in the central Amazon and topographical and vegetation features along the Guiana Shields. These features disappear in climate results. 2) Models from climate data underpredict the species range in southern margins of the Amazon along the transitional forests and overpredict in savanna regions such as Roraima, the Gran Savanna, and areas in cerrado Brazil. The jackknife test (Fig. 9b) shows several temperature and rainfall variables, such as the temperature mean diurnal range (BIO2), seasonality (BIO3), temperature of the coldest month (BIO5), total annual precipitation (BIO6), and mean precipitation of the driest month (BIO9), as important habitat characteristics. If predicted correctly, these variables are complementary to remote sensing data (Fig. 5e), where canopy moisture and seasonality (QSCAT), maximum leaf area (MODIS), and low elevation (SRTM) are the dominant habitat characteristics.

The distribution for Fisher's alpha classes derived from climate data (Fig. 9c) has general patterns similar to those obtained by remote sensing data (Fig. 6). Visual comparison of the two diversity maps reveals three regions of high diversity within the Amazon basin: 1) the western Amazon basin, including western and northwestern Brazil, northern Peru, Ecuador, and the eastern Colombian Amazon, 2) the central Amazon basin, including areas east of the Rio Negro and north of Manaus to western Para, and 3) areas in northeastern Brazil and part of the Guiana Shields. Except for the western Amazon basin, the patterns do not necessarily cover the exact geographical regions. In general, the distribution from remote sensing data is spatially refined and shows patterns delineated by geomorphological and geological features of the Amazon basin, whereas the climate-derived distribution is coarse in resolution and is distinguished primarily by patterns of precipitation. Similarly, areas of high diversity (alpha > 180) in the Guiana Shields appear continuous in the remote sensing results, with the patterns following the geological and vegetation gradients, but patchy and discontinuous in the climate results. The southwestern region of the Amazon, including southern Peru and Bolivia, also appears different in the two distributions. The climate results show low diversity within the dense old growth forests of lowland Peru caused by temperature and precipitation seasonality. In contrast, the remote sensing results show higher tree alpha-diversity (alpha > 120) in the lowland old growth forests of Peru and northern Bolivia. In general, the distribution of tree alpha-diversity from the remote sensing data has much smoother variations within the Amazon basin than the climate results.
The Maxent model results from the climate data underpredict the tree alpha-diversity in most areas of the central Amazon and create patterns different from observed tree diversity in the region (ter Steege et al., 2003). The jackknife test highlights these points by choosing rainfall of the driest quarter (BIO9), total rainfall (BIO6), and temperature range (BIO2), seasonality (BIO3), and minimum temperature of the coldest month (BIO5) as important variables in predicting areas of high diversity (ter Steege et al., 2006). Given the complementary information in remote sensing and climate layers, it is expected that the best distribution for diversity may be produced from combined climate and remote sensing data (Prates-Clark et al., 2008).

6. Discussion and conclusion

6.1. Contribution of remote sensing data

For both tree species and tree alpha-diversity, remote sensing datasets provided meaningful and significant contributions in defining the distribution range and spatial patterns. We summarize these contributions in two areas: 1) improving the spatial resolution and, therefore, providing landscape-level details on potential habitat characteristics, and 2) adding to the pool of environmental variables beyond climate surfaces and hence improving the definition of habitat properties and ecological niche.


Fig. 9. Maxent predictions derived from bioclimatic variables at 5 km resolution and corresponding Maxent jackknife test results for the variable importance. (a) V. surinamensis distribution, (b) jackknife test for V. surinamensis, (c) classification of tree alpha-diversity, and (d) jackknife test for alpha >20 class (n = 515).

A subset of the tree alpha-diversity maps produced from 1 km remote sensing data (Fig. 6) and 5 km climate data (Fig. 9c) over the central Amazon in the Amazonas and Para states of Brazil demonstrated this point (Fig. 10). With the 1 km remote sensing data (Fig. 10a), landscape features due to forest fragmentation, differences in forest cover (terra firme and inundated), and landscape geomorphology were readily delineated and reasonable differences in their tree alpha-diversity were observed. In contrast, the climate-derived map (Fig. 10b), although showing a general pattern, does not provide reasonable prediction and useful patterns of tree alpha-diversity in this region. In general, climate surfaces interpolated from station data or derived from coarse resolution satellite measurements cannot capture landscape-scale variations in diversity. It is also important to note that unlike potential species distribution influenced by environmental factors, diversity depends on the size of the area sampled, climate, past history, and local influences, such as soil, geology,

and nutrients. Therefore, diversity is very much a local or regional property of a forest and cannot be readily extrapolated to other regions. Nevertheless, as demonstrated in this study, quantification of landscape heterogeneity from high-resolution satellite observations can readily improve our understanding of the biogeography and biodiversity of the lowland Amazonian rainforests from typical postulated or observed distribution barriers such as unfavorable past climates, mountains, rivers, and river floodplains (Tuomisto et al., 1995). Furthermore, remote sensing data provide measurements directly related to forest structure, species composition, gap fraction, and the overall health of the ecosystem that can collectively improve our understanding of suitable habitats for species. Maxent offers response curves for the input environmental variables that allow examining how the predictions of suitable habitat depend on each variable. To demonstrate this, we examined the QSCAT response curves, as one of the most


Fig. 10. Comparison of Maxent predictions of tree alpha-diversity classification over the Atlantic Coastal Forests of Bahia from (a) 1 km remote sensing data and (b) 5 km bioclimatic variables.

important variables in predicting the distribution of M. bidentata and V. surinamensis (Fig. 11). The response curves were derived from Maxent runs with QSCAT used in isolation in order to avoid interference from other variables. QSCAT values over most of the point localities (shown as diamonds and plotted as the log of sample frequency) are scattered within a small range of the variable, between −10.0 dB and −6.0 dB. These are typical QSCAT values measured over tropical forests. Within this range, there is a major difference in the response curves between the two species. These differences force Maxent to choose thresholds to separate suitable areas and predict different distribution patterns. Areas outside of this range, although included in the overall environmental space, will not contribute to defining the species range. We expect that the combination of climate and remote sensing data and a multiscale analysis may provide the best distributions for species range and diversity.

Fig. 11. Maxent response and sample frequency for M. bidentata and V. surinamensis as a function of QSCAT annual mean backscatter. The response curves illustrate how the contribution to the raw Maxent prediction depends on a particular environmental variable (Phillips et al., 2005). The Maxent response curves were derived from Maxent runs with QSCAT used in isolation to avoid interferences with other variables. The sample frequency (plotted as diamonds) shows the number of point localities that fall in a certain QSCAT interval.


6.2. Maxent predictions

For those species whose distribution patterns are determined primarily by evolutionary processes and gene flow and less by environmental variables, especially those detected by climate, landscape, or vegetation, predictive models are not useful. In general, models are based on algorithms that extrapolate the sensitivity developed by the training data from point localities and the environmental variables to a larger space. A true statistical test for any model prediction is how significantly and consistently it performs better than a random prediction. Maxent has internal routines that provide threshold-dependent and threshold-independent tests of its performance by examining the omission rate and the area under the ROC curve (AUC) for a set of randomly selected test localities. In case of low sensitivity to environmental variables for a set of point localities, both tests will show poor performance of the algorithm. In all cases studied here, performance consistently and significantly better than random was achieved. However, without absence data (not available in most biological data), there seems to be no source of negative instances to accurately measure the predictions over the rest of the environmental space. Phillips et al. (2005) developed non-parametric tests using the Mann–Whitney U statistic and a sample of 10,000 pixels drawn randomly from the study region to examine to what extent the prediction differs from random (AUC of 0.5). Although this approach is not as rigorous as tests with absence data, it provides more confidence in model predictions. Furthermore, it provides enough confidence in distribution patterns derived from remote sensing data. Following a similar approach, we concluded that in the examples used in our study, Maxent predictions were significantly different from random and its performance was close to optimal.
These tests suggest that model predictions derived from remote sensing data provide meaningful and reasonable distribution patterns. With the predictions related to tree alpha-diversity, the results also confirmed that the distribution is only valid within the biomes sampled by the presence data and any extrapolation of diversity to other biomes is not valid.

6.3. Future work

Spatial, spectral, and temporal diversity of recent satellite observations and improvement of algorithms to derive ecologically important variables have enhanced our capabilities in conservation biology in different areas: 1) to directly map individual species over relatively large and spatially contiguous units, 2) to map homogeneous associations dominated by few species or diversity indicators, and 3) to develop environmental requirements for species range and diversity. Results obtained from this study and similar studies using satellite data suggest that nature conservation in Amazonia and other tropical regions can benefit from recognizing ecological heterogeneity at landscape scales and the use of methods and datasets that can carefully distinguish these heterogeneities (Nagendra, 2001; Tuomisto et al., 1995). Interpolation and extrapolation of field inventory data on species presence, richness, and diversity on a spatial scale depend on how well the environmental variables

and ecological heterogeneities are characterized on that scale. Analysis of satellite images over Amazonian forests has shown that in addition to spatial information, the spectral data can be used to separate structurally and floristically distinct biotopes within the vegetation types already known (Chambers et al., 2006; Lucas et al., 2004). These results also call for new areas of research in using satellite observations in biogeography and the conservation of biodiversity.

First, the use of remote sensing data in predictive models requires new rules and protocols to be developed in order to improve the assessment of distribution patterns from regional and continental to landscape scales. Finer-resolution environmental data may not improve model predictions, as they may be incompatible with the spatial scale of the inventory data from natural history museums and herbaria and they may introduce unwanted and additional statistics in the input data and thus impact the performance of predictive models. Further research is also required to determine to what degree the spatial resolution of satellite observation can help or limit the identification of species' environmental requirements and their ecological niche.

Research is also needed regarding the utility of spectral data or remote sensing products to quantify and map ecologically important features on landscapes. Currently, remote sensing-based habitat characterization is mainly based on the relation between spectral data and the structure, chemistry, and heterogeneity of vegetation within a pixel resolution. It is not clear whether habitat suitability for most species can be defined in terms of these variables. Currently, species–habitat relationships are defined by eco-region classifications based on climate, geology, and natural barriers. In particular, for mobile taxa like birds or butterflies, the species–habitat relationships are not well defined or quantifiable.
New approaches are required to extend these relationships to physical environmental variables detectable by satellite observations. In addition, research is required to explore whether modeling distributions for a number of species or communities, or developing relationships between spectral data and species richness and diversity, may improve the utility of satellite data.

Detection and assessment of changes in landscapes, such as deforestation and land use change, have been explored extensively in the environmental sciences using the temporal diversity of remote sensing data. In addition, the availability of time series satellite observations over the past three decades has provided information on both the stability and the dynamics of ecosystems and natural habitats. In this study, we used time series data from MODIS and QSCAT to develop climatological metrics. However, further research is required to improve the utility of these datasets in predictive models and to understand the impact of land use patterns and interannual variability present in time series remote sensing data on species distribution.

Acknowledgement

We thank Donat Agosti for his help with acquiring the tree species data and Ana Paula Giorgi for her help with the GIS layers. The tree species data were provided by the herbarium of the New York Botanical Garden, and the tree diversity data are from the Amazon Tree Diversity Network. This work would not have been possible without decades of dedicated and methodical fieldwork by numerous researchers across South America. We would also like to thank the reviewers who provided us with important suggestions and criticisms of the original manuscript. This work was performed at the Jet Propulsion Laboratory, California Institute of Technology, and the UCLA Center for Tropical Research, Institute of the Environment, under a contract from the National Aeronautics and Space Administration.

References

Anderson, R. P., Lew, D., & Peterson, A. T. (2003). Evaluating predictive models of species' distributions: Criteria for selecting optimal models. Ecological Modelling, 162, 211-232.
Brown, J. H., Stevens, G. C., & Kaufman, D. M. (1996). The geographic range: Size, shape, boundaries, and internal structure. Annual Review of Ecology and Systematics, 27, 597-623.
Buermann, B., Saatchi, S., Zutta, B. R., Chaves, J., Milá, B., Graham, C. H., & Smith, T. B. (in press). Application of remote sensing data in predictive models of species' distribution. Journal of Biogeography.
Buermann, W., Wang, Y., Dong, J., Zhou, L., Zeng, X., Dickinson, R. E., Potter, C. S., & Myneni, R. B. (2002). Analysis of a multiyear global vegetation leaf area index data set. Journal of Geophysical Research, 107, 4646. doi:10.1029/2001JD000975
Cantor, S. B., Sun, C. C., Tortolero-Luna, G., Richards-Kortum, R., & Follen, M. (1999). A comparison of C/B ratios from studies using receiver operating characteristic curve analysis. Journal of Clinical Epidemiology, 52, 885-892.
Chambers, J. Q., Asner, G. P., Morton, D. C., Anderson, L. O., Saatchi, S. S., Espírito-Santo, F. D. B., Palace, M., & Souza, C., Jr. (2006). Regional ecosystem structure and function: Ecological insights from remote sensing of tropical forests. Trends in Ecology & Evolution, 22(8), 414-423.
Condit, R., Watts, K., Bohlman, S. A., Pérez, R., Hubbell, S. P., & Foster, R. B. (2000). Quantifying the deciduousness of tropical forest canopies under varying climates. Journal of Vegetation Science, 11, 649-658.
Cowling, R. M., & Pressey, R. L. (2001). Rapid plant diversification: Planning for an evolutionary future. Proceedings of the National Academy of Sciences, 98(10), 5452-5457.
Crandall, K. C., Bininda-Emonds, O. R. P., Mace, G. M., & Wayne, R. K. (2000). Considering evolutionary processes in conservation biology. Trends in Ecology and Evolution, 15(7), 290-295.
De Oliveira, A. A., & Mori, S. (1999). A central Amazonian terra firme forest. I. High tree species richness on poor soils. Conservation Biology, 8, 1219-1244.
Elith, J., Graham, C. H., Anderson, R. P., Dudík, M., Ferrier, S., Guisan, A., et al. (2006). Novel methods improve prediction of species' distributions from occurrence data. Ecography, 29, 129-151.
Faber-Langendoen, D., & Gentry, A. H. (1991). The structure and diversity of rain forests at Bajo Calima, Choco Region, Western Colombia. Biotropica, 23, 2-11.
Ferrier, S. (2002). Mapping spatial pattern in biodiversity for regional conservation planning: Where to from here? Systematic Biology, 51(2), 331-363.
Fisher, R. A., Corbet, A. S., & Williams, C. B. (1943). The relation between the number of species and the number of individuals in a random sample of an animal population. Journal of Animal Ecology, 12, 42-58.
Fisher, E., & Dos Santos, F. A. M. (2001). Demography, phenology and sex of Calophyllum brasiliense (Clusiaceae) trees in the Atlantic forest. Journal of Tropical Ecology, 17, 903-909.
Fisher, B. L., Howe, H. F., & Wright, S. J. (1991). Survival and growth of Virola surinamensis yearlings: Water augmentation in gap and understory. Oecologia, 86, 292-297.
Fjeldsa, J., & Lovett, J. C. (1997). Biodiversity and environmental stability. Biodiversity and Conservation, 6, 315-323.
Freiberg, M. (1996). Spatial distribution of vascular epiphytes on three emergent canopy trees in French Guiana. Biotropica, 28(3), 345-355.
Gaston, K. J. (1998). Species-range size distributions: Products of speciation, extinction, and transformation. Philosophical Transactions of the Royal Society B, 353, 219-230.
Gentry, A. H. (1975). Additional Panamanian Myristicaceae. Annals of the Missouri Botanical Garden, 62(2), 474-479.
Gentry, A. H. (1982). Phytogeographic patterns as evidence for a Chocó refuge. In G. T. Prance (Ed.), Biological diversification in the tropics (pp. 112-136). New York: Columbia University Press.
Gentry, A. H. (1988). Tree species richness of upper Amazonian forests. Proceedings of the National Academy of Sciences, 85, 156-169.
Graham, C. H., & Hijmans, R. J. (2006). A comparison of methods of species ranges and species richness. Global Ecology and Biogeography, 15, 578-587.
Guariguata, M. R., Claire, H. A., & Jones, G. (2002). Tree seed fate in a logged and fragmented forest landscape, Northeastern Costa Rica. Biotropica, 34, 405-415.
Guisan, A., & Zimmermann, N. E. (2000). Predictive habitat distribution models in ecology. Ecological Modelling, 135, 147-186.
Hansen, M. C., DeFries, R., Townshend, J., Sohlberg, R., Dimiceli, C., & Carroll, M. (2002). Towards an operational MODIS continuous field of percent tree cover algorithm: Examples using AVHRR and MODIS data. Remote Sensing of Environment, 83, 303-319.
Higuchi, N., Hummel, A. C., Freitas, J. V., Malinowski, J. R. E., & Stokes, R. (1994). Exploração Florestal nas Várzeas do Estado do Amazonas: Seleção de Árvore, Derrubada e Transporte. Proceedings of the VII Harvesting and Transportation of Timber Products (pp. 168-193). Curitiba, Brazil: IUFRO/UFPR.
Hijmans, R. J., Cameron, S. E., Parra, J. L., Jones, P. G., & Jarvis, A. (2005). Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology, 25, 1965-1978.
Howe, F. (1990). Survival and growth of juvenile Virola surinamensis in Panama: Effects of herbivory and canopy closure. Journal of Tropical Ecology, 6, 259-280.
Hunter, J. T. (2003). Factors affecting range size differences for plant species on rock outcrops in eastern Australia. Diversity and Distributions, 9, 211-220.
Kreft, H., Sommer, J. H., & Barthlott, W. (2006). The significance of geographic range size for spatial diversity patterns in Neotropical palms. Ecography, 29, 21-30.
Kummerow, C., Barnes, W., Kozu, T., Shiue, J., & Simpson, J. (1998). The Tropical Rainfall Measuring Mission (TRMM) sensor package. Journal of Atmospheric and Oceanic Technology, 15, 809-817.
Lellinger, D. B., & Sota, E. R. de la (1978). The phytogeography of the pteridophytes of the Departamento del Chocó, Colombia. National Geographic Society Research Reports 1969 projects (pp. 381-387).
Loiselle, B. A., Howell, C. A., Graham, C. H., Goerck, J. M., Brooks, T., Smith, K. G., & Williams, P. H. (2003). Avoiding pitfalls of using species distribution models in conservation planning. Conservation Biology, 17, 1591-1600.
Long, D. G., Drinkwater, M., Holt, B., Saatchi, S., & Bertoia, C. (2001). Global ice and land climate studies using scatterometer image data. EOS, Transactions of the American Geophysical Union, 82(43).
Lovett, J. C., Rudd, S., Taplin, J., & Frimodt-Møller, C. (2000). Patterns of plant diversity in Africa south of the Sahara and their implications for conservation management. Biodiversity and Conservation, 9, 33-42.
Lucas, R., Held, A. A., Phinn, S. R., & Saatchi, S. S. (2004). Tropical forests. In S. Ustin (Ed.), Remote Sensing of Earth Sciences: Manual of Remote Sensing, Vol. 4. John Wiley & Sons.
Marques, M. C. M., & Joly, C. A. (2000). Seed germination and growth of Calophyllum brasiliense (Clusiaceae), a typical species of flooded forests. Acta Botanica Brasilica, 14(1), 113-120.
Myers, N., Mittermeier, R. A., Mittermeier, C. G., da Fonseca, G. A. B., & Kent, J. (2000). Biodiversity hotspots for conservation priorities. Nature (London), 403, 853-858.
Myneni, R. B., Hoffman, S., Knyazikhin, Y., Privette, J. L., Glassy, J., Tian, Y., et al. (2002). Global products of vegetation leaf area and fraction absorbed PAR from year one of MODIS data. Remote Sensing of Environment, 83, 214-231.
Nagendra, H. (2001). Using remote sensing to assess biodiversity. International Journal of Remote Sensing, 22, 2377-2400.

Nelson, B. W., Ferreira, C. A. C., da Silva, M. F., & Kawasaki, M. L. (1990). Endemism centres, refugia and botanical collection density in Brazilian Amazonia. Nature, 345, 714-716.
Nix, H. A. (1986). A biogeographic analysis of Australian elapid snakes. In: Atlas of Australian Elapid Snakes (pp. 4-15). Canberra: Bureau Flora Fauna.
Peng, C. (2000). From static biogeographical model to dynamic global vegetation model: A global perspective on modeling vegetation dynamics. Ecological Modelling, 135, 33-54.
Phillips, S. (2005). A brief tutorial on Maxent. AT&T Research (from http://www.cs.princeton.edu/~schapire/maxent/tutorial/tutorial.doc).
Phillips, S., Anderson, R. P., & Schapire, R. E. (2005). Maximum entropy modelling of species geographic distributions. Ecological Modelling, 190, 231-259.
Pimm, S. L. (1991). The balance of nature? Ecological issues in conservation of species and communities. Chicago: The University of Chicago Press.
Prates-Clark, C. D. C., Saatchi, S. S., & Agosti, D. (2008). Predicting geographical distribution models of high-value timber trees in the Amazon Basin using remotely sensed data. Ecological Modelling, 211(3-4), 309-323.
Rahbek, C., & Graves, G. R. (2001). Multiscale assessment of patterns of avian species richness. Proceedings of the National Academy of Sciences of the United States of America, 98, 4534-4539.
Record, S. J., & Hess, R. W. (1943). Timbers of the new world. New Haven (USA): Yale Univ. Press.
Rodriguez, W. A. (1972). A ucuuba de várzea e suas aplicações. Acta Amazonica, 2(2), 29-47.


Saatchi, S., Houghton, R., Avala, R., Yu, Y., & Soares, J.-V. (2007). Spatial distribution of live aboveground biomass in Amazon Basin. Global Change Biology, 13, 816-837.
Schongart, J., Piedade, M. T. F., Ludwigshausen, S., Horna, V., & Worbes, M. (2002). Phenology and stem growth periodicity of tree species in Amazonian flood plain forests. Journal of Tropical Ecology, 18, 581-597.
Smith, T. B., Schneider, C. J., & Holder, K. (2001). Refugial isolation versus ecological gradients: Testing alternative mechanisms of evolutionary divergence in four rainforest vertebrates. Genetica (Dordrecht), 112-113, 383-398.
Swets, J. A. (1988). Measuring the accuracy of diagnostic systems. Science, 240, 1285-1293.
ter Steege, H., Pitman, N., Philips, O., Chave, J., Sabatier, D., Duque, A., et al. (2006). Continental-scale patterns of canopy tree composition and function across Amazonia. Nature, 443(28), 444-447.
ter Steege, H., Pitman, N., Sabatier, D., Castellanos, H., Van Der Hout, P., Daly, D. C., et al. (2003). A spatial model of tree alpha-diversity and tree density for the Amazon. Biodiversity and Conservation, 12, 2255-2277.
Tuomisto, H., Ruokolainen, K., Kalliola, R., Linna, A., Danjoy, W., & Rodriguez, Z. (1995). Dissecting Amazonian biodiversity. Science, 269, 63-66.
Turner, W., Spector, S., Gardiner, N., Fladeland, M., Sterling, E., & Steininger, M. (2003). Remote sensing for biodiversity science and conservation. Trends in Ecology and Evolution, 18, 306-314.
Whittaker, R. H. (1972). Evolution and measurement of species diversity. Taxon, 21, 213-251.
Woodward, F. I. (1987). Climate and plant distribution. Cambridge University Press, 190 pp.
