ISSN 1064-2293, Eurasian Soil Science, 2021, Vol. 54, No. 2, pp. 176–188. © Pleiades Publishing, Ltd., 2021.

Russian Text © The Author(s), 2021, published in Pochvovedenie, 2021, No. 2, pp. 168–182.



Geospatial Modeling of Nitrogen and Carbon Content and Stock in the Forest Litter Horizons Based on Sentinel-2 Multi-Seasonal Satellite Imagery
E. A. Gavrilyuk^a,*, A. I. Kuznetsova^a, and A. V. Gornov^a
^a Center for Forest Ecology and Productivity, Russian Academy of Sciences, Moscow, 117997 Russia
*e-mail: egor@ifi.rssi.ru
Received April 2, 2020; revised June 20, 2020; accepted June 22, 2020

Abstract: The capabilities of Sentinel-2 optical multispectral satellite data for modeling nitrogen (N) and carbon (C) contents, their ratio (C : N), and stocks in the litter horizons of forest soils were assessed. The study was conducted in the Bryansk Forest Nature Reserve and its buffer zone. The organic horizon was sampled on 33 plots selected with due account for the tree species diversity of the reserve’s forests. Two layers of the organic horizon, L and FH, were sampled separately. The main variables for geospatial modeling were derived from a time series of eight Sentinel-2 multi-seasonal satellite images. Basic terrain characteristics and pixel coordinates were also added to the stack of variables. We used random forest to build regression models and the corresponding standard methods to assess their performance. The best results were obtained for the C : N ratio: the coefficient of determination R² = 0.71 with a scaled root-mean-square error RMSE = 12.5% in the L layer, and R² = 0.83 with RMSE = 10.6% in the FH layer. For other models, the values of R² ranged from 0.23 to 0.61, and the RMSE ranged from 15.8 to 48.6%, with the least reliable results for the N and C stocks. Satellite-based variables were most informative for the contents of N and C and, notably, for the C : N ratio. The most significant periods in the time series were early spring, summer, and snowy winter. To conclude, Sentinel-2 satellite imagery can be successfully used for estimating and mapping the contents and stocks of N and C in the forest soil organic horizon as a free and relevant alternative to thematic data on the species composition and related properties of stands.

Keywords: forest soil, coniferous–broadleaved forests, C : N ratio, remote sensing data, random forest,
machine learning
DOI: 10.1134/S1064229321020046

INTRODUCTION

Forest litter, being a product of the functioning of forest biogeocenoses, regulates a wide range of ecosystem processes. Data on the nitrogen (N) and carbon (C) contents and their ratio (C : N) characterize the quality of forest litter and are indicative of the rate of mineralization processes, which directly affect the productivity of forest soils and their carbon sequestration capacity [10, 18, 52]. In the context of global climate change, much attention is paid to updating the estimates of the soil organic matter stocks. It has been shown that the share of litter in the total carbon stocks can reach 30% [9, 19]. Remote sensing data (RSD) are currently widely used as a basis for digital mapping of soils and geospatial modeling of their quantitative characteristics [15]. The global information system SoilGrids [34] is a conceptual example of the joint use of satellite and ground data to obtain thematic maps of major soil properties using machine learning methods. It should be noted that in most studies devoted to geospatial modeling of soil parameters, the variables obtained on the basis of remote sensing data are used in combination with climatic, orographic, geological, and other thematic maps; often, their informativeness remains relatively low [33, 44, 48].

The potential for using satellite imagery, both optical and radar, to assess the properties of soils, especially forest soils, is often limited because of the dense vegetation cover. The canopy density of more than 30% excludes the possibility of direct analysis of the spectral characteristics of the underlying surface [29]. However, individual soil properties can be closely related to the qualitative and/or quantitative characteristics of vegetation, which leaves room for an indirect assessment of such properties. In particular, woody plants of dominant species composing different types of forests significantly affect the composition of organic matter, acidity, total N content, and C : N ratio in forest soils, both in their organic and upper (0–10 cm) mineral horizons [27, 43]. Therefore, thematic characteristics of forest cover (prevailing tree species, proportion of coniferous and deciduous trees,



Fig. 1. Location of study area and composition of tree stands according to [1]: (1) pine, (2) broadleaved, (3) birch, (4) aspen, (5) black alder, (6) mixed coniferous, (7) mixed coniferous–hardwood, and (8) mixed hardwood. Other designations: (9) forestless area, (10) boundary of the Bryanskii Les Reserve, (11) water courses, and (12) location of GTPs.

forest type, canopy density, ground phytomass, etc.) are often used as variables in geospatial modeling of various properties of forest soils, including the C : N ratio and C and N stocks in litter [21, 24, 25]. At the same time, the recognition and mapping of the species structure of forests is one of the classical problems solved with the use of remote sensing data with varying degrees of success [31]; the same concerns the assessment of biometric and structural characteristics of forest stands [41]. Accordingly, satellite imagery bands and/or their derivatives are potentially capable of replacing thematic variables describing variability of the forest cover characteristics. As the properties of litter are directly related to the species composition of forest stands, the organic (litter) horizon of forest soils should potentially demonstrate indirect relationships with spectral features of the tree canopy. At the same time, although there are studies aimed at determining the coverage area [39], moisture [51], and the degree of decomposition [45] of forest litter from remote sensing data, we failed to find analogous works devoted to assessing the contents of C and N in forest litter.

We studied the possibilities of using a time series of optical multispectral satellite images of high spatial resolution together with terrain data for geospatial modeling of the C and N contents in forest litter using machine learning methods for the territory of the Bryansk Forest Reserve and its buffer zone. Litter subhorizons L (fresh or weakly decomposed litter) and FH (a layer of fermentation and decomposition of plant residues) were considered independently from one another. For each of them, five quantitative indicators were analyzed, i.e., the contents (%) and stocks (g/m²) of organic C and N and their ratio (C : N).

OBJECTS AND METHODS

Study area. The study was carried out in the southern part of the Bryansk outwash plain (polessie landscape) within the “Bryanskiy Les” State Natural Biospheric Reserve and its buffer zone (Fig. 1); 97% of this territory is under forests, the total area of which is more than 200 km².

According to geobotanic zoning, this territory belongs to the Polessie subprovince of the East European province of the European region of broadleaved forests [14]. The forests of the reserve were greatly affected by various economic activities in the past. The modern forest cover is mainly represented by early


succession communities with monodominant (pine, birch, black alder, and aspen forests) or oligodominant (mixed coniferous–deciduous and small-leaved forests) composition of stands [1, 3, 4]. At the same time, unique areas of polydominant coniferous–deciduous and broadleaved forests have been preserved in the reserve [2, 30].

The climate of the territory is moderately continental, with four pronounced phenological seasons during the year. The mean air temperature in the winter months is –5.2°C; in the summer months, +18.4°C; the mean annual precipitation is 556 mm, with its largest share (33%) in the summer period [12]. In terms of soil-geographical zoning, the reserve is included in the Central Russian province of podzolic soils [6]. In the landscape structure of the territory, there are areas of floodplains, terraces, polessie (outwash plain), and cis-polessie landscapes [5]. Gray-humus organo-accumulative soils [8] (Umbrisols according to WRB [49]) are developed from the alluvial deposits on floodplains [17]. Soddy podzols [8] (Albic Podzols (Arenic)) with a pronounced humus horizon are developed from glaciofluvial sands (or loamy sands) and from clayey eluvium of siliceous claystone [16] within the terrace, polessie, and cis-polessie landscapes. Gley features are encountered everywhere. On less disturbed territories of the reserve, brown forest soils (Cambisols) are formed [7].

Initial data. Three types of data are applied in our study: the results of ground surveys, satellite images, and a digital elevation model (DEM) of the area with its derivatives. In addition, the spatial coordinates of the centers of the image pixels (in the form of horizontal and vertical serial numbers) were added as two auxiliary independent variables for geospatial modeling, which is a common practice. It should be noted that the relatively small size of the study area makes it impractical to use climatic data as additional variables, and there are no up-to-date soil maps of sufficient detail (on a scale of about 1 : 100 000) that could be used for the territory of the reserve.

Field data and their preliminary processing. During field studies in 2016–2017, 33 ground test plots (GTPs) of 400 m² were laid in forests of the reserve and its buffer zone. These test plots (Fig. 1) were located in forest areas homogeneous in the species structure of the tree layer so that the sample contained litter from all types of forest stands typical for the study area. To assess the species composition of forests, a map (hereinafter, a map of tree species) obtained as a result of thematic processing of satellite data from Landsat [1] and materials of the forest inventory of the reserve in 2006 were used. At each GTP, a geobotanical description was performed with the identification of the full floristic composition with due account for the layered structure of forest cenoses. In each layer, the species participation was determined using the Braun-Blanquet cover-abundance scale [11]. At all GTPs, forest litter was sampled using a 25 × 25 cm frame in three or four replicates (108 samples in total). Under laboratory conditions, the samples were dried to an absolutely dry state at 105°C and weighed, and the contents of N and C were determined on an EA1110 (CHNS-O) elemental analyzer. When calculating carbon stocks, we used the guidelines for quantifying the volume of absorption of greenhouse gases [13]. As a result, five indicators were estimated for the L and FH subhorizons of the forest litter: N content (N%), C content (C%), C : N ratio, N stock (Nstk), and C stock (Cstk). The statistical analysis of the obtained results (Table 1) indicates sufficient homogeneity (coefficient of variation <30%) and representativeness (power over 80%) of the samples for N (%), C (%), and C : N. The samples are heterogeneous with respect to N and C stocks, which may be a consequence of the insufficient number of replicates on the test plots. Nevertheless, they are representative for all parameters, except for Nstk in the L subhorizon (sample power about 60%).

Satellite data and their preliminary processing. Sentinel-2 multi-season multispectral images were used as the main data source for geospatial modeling [28]. Based on the analysis of the annual global MODIS data on the dynamics of land cover (MCD12Q2, version 6 [32]), eight successive phenological periods were identified for the study area (Table 2). For each of them, composite images were formed from the Sentinel-2 scenes obtained in 2016–2018. MCD12Q2 is a set of annual global thematic images with a spatial resolution of 500 m for the period from 2000 to 2017. The images contain information at the pixel level about seven key dates in the dynamics of the curve of the spectral index EVI2 [35] and the degree of reliability of their determination. Key dates correspond to the beginning and middle of growth; reaching a plateau; peak; and beginning, middle, and end of a decrease in the index values. For local areas that are relatively homogeneous in climatic conditions, as in our case, the median values of these dates estimated over all valid image pixels can be reliably correlated with the successive change in the phenological phases of green vegetation. To determine the boundaries of the four main seasons (winter, spring, summer, autumn), we used the data of 2017, after which three additional periods (beginning, middle, and end) were analytically identified for the spring and autumn seasons. The boundaries of the additional periods were selected in such a way that they were located symmetrically with respect to the initial dates of the middle of the increasing and decreasing parts of the curve of the EVI2 index; the length of each period was at least 15 days.

For all periods, except for winter, three Sentinel-2 scenes of the corresponding time range with the minimum cloud cover were selected from the archive. Composite images of the periods were formed from the median values of the pixels of the chosen scenes



Table 1. Descriptive statistics for ground data for the L and FH subhorizons
Indicator    Subhorizon  Mean    ν       CV, %  Minimum  1st quartile  Median  3rd quartile  Maximum  Sample power*
N%           L           2.2     0.4     19.5   1.1      1.9           2.2     2.4           3.1      88.4
N%           FH          2.0     0.5     25.3   1.0      1.6           1.9     2.3           3.5      85.4
C%           L           44.4    3.1     6.9    32.8     42.9          44.9    46.3          49.5     92.3
C%           FH          37.5    7.1     18.9   18.1     32.0          39.0    43.1          49.8     80.8
C : N        L           21.3    5.4     25.4   14.5     17.5          20.3    23.5          39.4     82.6
C : N        FH          19.5    5.2     26.8   12.1     15.9          17.2    22.2          36.8     84.3
Nstk, g/m²   L           8.7     6.0     68.4   1.2      5.0           6.9     11.2          30.3     59.3
Nstk, g/m²   FH          20.8    11.3    54.2   3.1      11.3          19.9    28.4          58.8     87.2
Cstk, g/m²   L           185.4   127.2   68.6   21.1     88.9          145.5   268.3         712.8    92.1
Cstk, g/m²   FH          415.3   262.7   63.3   46.3     196.6         344.8   583.0         1425.1   90.4
ν is the standard deviation, and CV is the coefficient of variation, %.
* Sample power (%) was estimated according to a standard t-test at the significance level of α = 0.1 and the effect value of 10% of the

calculated independently for each spectral band. For the period of snowy winter, a single, completely cloudless scene was used. All selected Sentinel-2 images were preliminarily converted into L2A level products (reflectance values at the ground level) using the Sen2Cor software module [40].

DEM and its derivatives. The basic orographic characteristics of the terrain were obtained from the DEM with a spatial resolution of 10 m formed as a result of interpolation of the elevation values along the contour lines of a topographic map on a scale of 1 : 50 000 (contour interval 5 m). The indices of elevation, slope, aspect (as the sine and cosine of its angular values), and general slope curvature [46], as well as the topographic wetness index [22], were calculated using standard SAGA GIS tools [26] and served as geospatial variables.

Formation of the initial set of variables. Taking into account the high correlation between the Sentinel-2 bands in the visible, near infrared, and mid infrared regions of the spectrum, we used classical principal component analysis [36] to reduce the space of variables. In our case, the first two principal components, depending on the survey period, described from 88% (late autumn) to 98% (snowy winter) of the total variance in ten major bands of Sentinel-2 (bands 2–8, 8A, 11, and 12), which made it possible to compress data by five times without significant loss of information. Six bands with a spatial resolution of 20 m (bands 5–7, 8A, 11, and 12) were preliminarily brought to a resolution of 10 m by the nearest neighbor method. As a result, after transforming the images by the principal component method, sixteen features obtained on the basis of satellite data (two principal components for each of the eight phenological periods), in combination with six orographic characteristics of the terrain and two coordinates of the position of pixels, made up the initial set of geospatial variables for modeling. Taking into account the accuracy of georeferencing of both satellite and field data (the average error is about 10 m), the results of measurements at GTPs were compared with the median values of the variables in a 3 × 3 pixel (30 × 30 m) window around the locations of the GTPs.

Table 2. Phenological periods, for which composite Sentinel-2 images were collected in different seasons
Period               Boundaries               Length, days
Winter               October 26–March 30      166
Beginning of spring  March 31–April 30        31
Middle of spring     May 1–15                 15
End of spring        May 16–30                15
Summer               May 31–July 24           45
Beginning of autumn  July 25–September 3      41
Middle of autumn     September 4–October 4    31
End of autumn        October 5–October 25     21

Optimization of the set of variables. The original set of variables was optimized first using correlation analysis and then using the recursive feature elimination method. In the correlation analysis, the pairwise Pearson correlation (r) was assessed for all variables, after which, from the pairs with r > 0.95, the variables with the higher mean r value calculated for each variable for all its pairs were discarded. The procedure for recursive feature elimination implies the sequential construction of regression models (with fixed algorithm parameters) for the analyzed indicator with a step-by-step filtering out of the least informative variables. At each stage of screening, the performance of the model is assessed, which makes it possible to form an optimal set of the most significant variables providing the best accuracy from the point of view of the selected formal criterion. As such a criterion, we used the mean square error (MSE) of the model assessed by the repeated cross-validation method (25 repetitions


with the division of the original sample into 4 parts). According to the results of this procedure, only those variables that were not optimal for any of the considered indicators were excluded from further analysis; thus, a single set of variables was used for all models.

Development of regression models. We used the random forest machine learning algorithm [23] to build regression models, as well as the entire set of methods for the automatic selection of algorithm parameters and for assessing the informativeness of variables, the quality of training, and the performance of predictions implemented in the R environment in the caret [38] and ranger [50] packages.

Random forest is a statistical method for classification and regression problems. It is based on the use of a large number (ensemble) of decision trees, each of which is constructed from an incomplete sample obtained from the initial sample using a bootstrap (random sampling with replacement); a fixed number of variables is randomly selected from the complete set. In the basic version of the algorithm, classification is carried out using a simple vote of classifiers defined by individual trees, and regression modeling is performed by averaging the results over all trees. Nowadays, random forest is one of the most popular methods of machine learning, since it combines relative versatility, ease of configuration, and speed of work with high performance indicators of the resulting models.

We used an ensemble of 1000 trees. Other parameters of the algorithm (in particular, the number of random features for each splitting of the tree, the splitting method, and the minimum node size) were chosen individually for each modeled indicator by a simple enumeration of options, similar to the procedure for recursive feature elimination.

Evaluation of the performance and statistical significance of models. The performance of the models was assessed by standard statistical metrics: the coefficient of determination (R²), the root mean square error (RMSE), and its relative values: the percentage of the average (RMSEAVG) and of the range (RMSERNG) of the modeled indicators. We used both the estimates obtained on the basis of the full initial sample (33 measurements) by the out-of-bag (OOB) method integrated into the random forest algorithm and the estimates obtained for the training and control subsets (in a ratio of 25/8) of the full initial sample. The OOB method relies on the formation of separate bootstrap samples for training each of the trees of the ensemble, which allows using measurements that are not included in them to assess the performance of individual trees and then of the entire model by averaging the results. The measurements for the control sample were chosen analytically in such a way as to avoid their falling on the distribution edges for all analyzed parameters and to ensure the representativeness of all types of forest stands.

The statistical significance of the models as a whole (p-value) was assessed by the method described in [20], which is a multiple permutation test (in our case, 200 iterations) for the dependent variable (the value for which the modeling is performed). The permutation procedure implies a random permutation of the values of the variable, after which the model is fitted and its efficiency is assessed (in our case, by RMSE). The proportion of cases in which the permutation models are more efficient than the original one characterizes the sought p-value.

Choice of the best prediction model. On the basis of the absolute value of RMSE, three types of model prediction results (average, adjusted average, and median) were evaluated in order to determine the most effective option for each modeled indicator. The first type, which is standard for random forest regressions, is the average of the results across all trees in the ensemble. The second type is obtained from the values of the first by applying a simple linear model (the predicted values are fitted to the original measurements) to correct the common effect of overestimating low values and underestimating high values. To construct this additional model, we used the repeated median method [47] implemented in the R package mblm [37]. The third type is the median value (the 50th percentile) obtained by constructing a quantile regression in its implementation for random forests [42].

Assessment of the informativeness of variables. As a measure of the informativeness of variables in the learning process, we used the indicator standard for random forests: the mean decrease in the overall model accuracy (MDA, Mean Decrease in Accuracy) after a random permutation of the values of the estimated independent variable. In contrast to the similar permutation test for statistical significance described above, the value permutation, model refitting, and accuracy assessment take place only once for each variable. The RMSE value was chosen as the criterion for the accuracy of the model.

Geospatial modeling. To obtain thematic products characterizing the distribution of the analyzed indicators over the study area, the corresponding models with the most efficient prediction type, trained on the full sample, were applied to the optimized set of variables at the pixel-by-pixel level. At the same time, using quantile regression, the boundaries of the 80% confidence interval were estimated as the values in the 10th and 90th percentiles of the distribution of predictions of all trees of the ensemble. The differences between these values and the prediction result characterize the limiting modeling errors in the directions of underestimation and overestimation, respectively, within a given confidence interval. As a spatial measure of modeling uncertainty for the obtained thematic products, the ratio of the width of the confidence interval (the difference in values at its boundaries) to the forecast result, expressed as a percentage, was used.
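The uncertainty measure described above can be illustrated with a minimal Python sketch. It assumes that the per-tree predictions for one pixel are available as a plain list and takes the ensemble median as the prediction result. The function names, the sample values, and the linear-interpolation percentile rule are illustrative assumptions; this is not the quantile regression forest implementation [42] used in the study.

```python
def percentile(sorted_values, pct):
    """Percentile (pct in 0..100) of an ascending list, by linear interpolation."""
    if not sorted_values:
        raise ValueError("at least one value is required")
    pos = (len(sorted_values) - 1) * pct / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def interval_uncertainty(tree_predictions):
    """Return (p10, prediction, p90, relative_width_pct) for one pixel.

    Assumption: the prediction result is the ensemble median; the study
    selects the best prediction type separately for every indicator.
    """
    values = sorted(tree_predictions)
    p10 = percentile(values, 10)         # lower bound of the 80% interval
    p90 = percentile(values, 90)         # upper bound of the 80% interval
    prediction = percentile(values, 50)  # ensemble median
    # Width of the interval relative to the prediction, in percent.
    relative_width_pct = 100.0 * (p90 - p10) / prediction
    return p10, prediction, p90, relative_width_pct


# Hypothetical C : N predictions of eleven trees for a single pixel.
trees = [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
p10, pred, p90, width_pct = interval_uncertainty(trees)
print(p10, pred, p90, width_pct)
```

Applied to every pixel, the last returned value would give a map of relative uncertainty of the kind described in the text, while the differences between the prediction and the two bounds give the possible errors in each direction.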



Table 3. Summary statistics of the performance of regression models for the L and FH subhorizons
                                N%             C%             C : N          Nstk, g/m²     Cstk, g/m²
Index              Subhorizon   oob    test    oob    test    oob    test    oob    test    oob    test     (method of performance evaluation*)
Prediction type**  L            Av.    Av.     Med.   LAv.    Med.   Med.    Med.   Av.     Med.   LAv.
Prediction type**  FH           Med.   Av.     Med.   Med.    Av.    LAv.    Med.   Med.    Av.    Av.
R²                 L            0.48   0.57    0.46   0.46    0.71   0.77    0.58   0.49    0.61   0.54
R²                 FH           0.44   0.12    0.56   0.44    0.83   0.95    0.23   0.30    0.37   0.24
RMSE               L            0.3    0.3     1.8    1.8     2.7    1.9     3.4    3.5     72.3   77.8
RMSE               FH           0.4    0.4     4.2    3.8     2.1    0.8     9.2    9.1     190.9  229.8
RMSERNG, %         L            15.8   24.4    16.8   35.2    12.5   15.1    15.5   25.5    14.3   28.1
RMSERNG, %         FH           17.7   38.6    17.5   25.6    10.6   8.6     18.4   29.6    20.7   31.3
RMSEAVG, %         L            12.5   11.5    4.0    4.0     12.8   9.2     39.8   43.0    40.3   45.4
RMSEAVG, %         FH           18.1   19.1    11.1   9.8     10.9   4.3     46.0   42.9    48.6   54.9
p <                L            0.005  0.005   0.005  0.005   0.005  0.005   0.005  0.010   0.005  0.005
p <                FH           0.005  0.005   0.005  0.005   0.005  0.005   0.010  0.015   0.005  0.010
* Method of model performance evaluation: oob—out-of-bag method, learning on the total sample; test—learning with a preliminary
division of the sample into the learning and control subsamples.
** The type of model prediction according to the RMSE value: Av.—average, LAv.—average corrected by linear model, and Med.—

RESULTS

Optimal variables for modeling. According to the results of the correlation analysis of the initial variables, the principal components of late spring and early autumn were discarded, because they were strongly correlated with similar variables for the adjacent phenological periods. After the procedure of recursive feature elimination, six more variables were eliminated: the principal components of the end of autumn, indicators of orientation and curvature of slopes, and the topographic wetness index. Thus, the optimal dataset for modeling N and C contents and stocks in the forest litter consisted of fourteen geospatial variables:
– ten satellite imagery-based variables (two principal components for the periods of snowy winter, early and mid-spring, mid-summer and mid-autumn);
– two characteristics based on the DEM (elevation and slope); and
– two coordinates of the spatial position of pixels.

Model performance. Table 3 shows the performance indicators of the obtained regression models for the content and stocks of N and C in the forest litter. The estimates for models with training on the total sample (based on the OOB method) are more consistent in terms of the metrics used (high R² values are accompanied by low relative RMSE values and vice versa) than the estimates for models with a preliminary division of the sample into training and control subsamples, though the absolute values of RMSE are generally quite close for both approaches. Models for the C : N values are an exception. For them, the RMSE values for the control sample are 1.4 (subhorizon L) and 2.6 (subhorizon FH) times lower than analogous OOB indicators, which may be a random feature of measurements selected for manual control. The best results were demonstrated by models for the C : N ratio: R² = 0.71 at RMSERNG = 12.5% for the L subhorizon, and R² = 0.83 at RMSERNG = 10.6% for the FH subhorizon, which was quite expected given the known close relationship of these indicators with the proportion between conifers and hardwood species in stands. For absolute values of the contents of N and C (in percent) in both subhorizons, R² was in the range from 0.44 to 0.56 with RMSERNG from 15.8 to 17.7%, which can be regarded as a moderate degree of model efficiency. The worst results were obtained for the indicators of the stocks of N and C (Nstk and Cstk) in both subhorizons: R² from 0.23 to 0.61 with RMSEAVG from 39.8 to 48.6%, which, first of all, could be a consequence of the heterogeneity of the initial sample. Nevertheless, the random forest method, as a rule, gives reliable results even with a small volume and heterogeneity of training data, including data with a large number of independent variables. Thus, all the obtained models were statistically significant; most of them, at p < 0.005; in the worst case, at p < 0.015 (for Nstk in the FH subhorizon). At the same time, it was impossible to unambiguously give preference to any of the three types of forecasting considered: for ten models out of twenty, the median value was the best; for seven, a simple average; and for the remaining three, the adjusted average. Figure 2 shows scatter diagrams of measured and predicted values according to the best forecast type for all analyzed indicators. As can be seen


[Fig. 2 plot area. Rows: panels (a)–(d); columns: N%, %; C%, %; C : N; Nstk, g/m²; Cstk, g/m²; vertical axis: predictions.]
Fig. 2. Scatter diagrams of measured and predicted values of the contents and stocks of N and C in the forest litter for the (a, b) L
and (c, d) FH subhorizons estimated by the OOB method for the (a, c) total sample and (b, d) separate control sample.
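
The agreement between measured and predicted values in such scatter diagrams is summarized by the metrics reported in Table 3. The following Python sketch shows one way to compute them. It assumes that R² is defined as one minus the ratio of the residual to the total sum of squares (the R packages used in the study may apply a different definition, for example, the squared correlation) and that the relative RMSE values are normalized by the mean and by the range of the measured values. The function name and the sample numbers are hypothetical.

```python
import math


def regression_metrics(measured, predicted):
    """Return R2, RMSE, and RMSE relative to the mean and the range (in %)."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equally long series with at least 2 values")
    n = len(measured)
    mean_obs = sum(measured) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(measured, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in measured)
    rmse = math.sqrt(ss_res / n)
    return {
        # Assumed definition: 1 - SS_res / SS_tot.
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        # RMSE as a percentage of the mean of the measured values.
        "rmse_avg_pct": 100.0 * rmse / mean_obs,
        # RMSE as a percentage of the range of the measured values.
        "rmse_rng_pct": 100.0 * rmse / (max(measured) - min(measured)),
    }


# Hypothetical measured and predicted values, for illustration only.
metrics = regression_metrics([10, 20, 30, 40], [12, 18, 32, 38])
print(metrics)
```

Normalizing by the range is less sensitive to the absolute level of an indicator than normalizing by the mean, which may be why the two relative values in Table 3 can differ considerably for the same model.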

from the diagrams, the characteristic tendencies towards overestimating low and underestimating high values are expressed to varying degrees in all models. According to the quantitative assessments of model performance, adjustment of the results by an additional linear model did not yield any significant improvement. The most likely reason is the low sensitivity of the variables used to the variability of the analyzed litter properties after they reach certain threshold values at both ends of the distribution (i.e., when the saturation effect is observed). In general, for most of the models, the greatest uncertainty in the prediction was associated with a tendency to underestimate high values of the indicator. However, for the C content indicator, the opposite situation (overestimation in the range of low values) was observed.

Informativeness of variables. The final informativeness of the selected variables for each of the modeled indicators is shown in Fig. 3. For the contents of N and C and, particularly, for the C : N ratio, we can state a significant superiority of the predictors obtained on the basis of satellite images over the predictors obtained on the basis of the DEM, which indirectly reflects the high correlation of spectral characteristics with the species structure of forest stands. At the same time, for the indicators of N and C stocks (except for the C stock in the FH subhorizon), the elevation of the territory and the coordinates of the position of the pixels exceed all satellite variables in terms of their information content. This can be interpreted as a consequence of a lower correlation of these indicators with the species composition of forest stands, on the one hand, and of their stronger dependence on their position in the landscape, on the other hand. In addition, the stocks of N and C in the litter are associated not only with the quality of litter but also with its weight, which depends primarily on the phytomass and productivity of forest stands, determined by habitat conditions,



which are not always clearly reflected in the spectral properties of the forest canopy.

In general, the most significant periods of the year in modeling were early spring, summer, and snowy winter. However, the degree of informativeness of individual variables varies greatly from indicator to indicator. Interestingly, the images of the autumn period, despite the high potential for recognizing the species structure due to the change in foliage color, turned out to be of little informative value for modeling the characteristics of the litter (except for the C stock in the FH subhorizon). It should also be noted that, in most cases, the second principal components reflecting variability of spectral brightness between different bands for separate phenological periods are naturally more informative than the first principal components characterizing differences in the total intensity of the reflected radiation. Nevertheless, it is impossible to completely exclude the first components from the models without a significant loss of their efficiency.
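The normalization described in the caption of Fig. 3 (initial values scaled to the absolute maximum) can be illustrated with a minimal Python sketch. It assumes that the raw informativeness scores may be negative, which is consistent with the axis of Fig. 3 extending below zero, and that scaling is applied to one set of scores at a time; the function name and the numbers are hypothetical and are not taken from the study.

```python
def normalize_to_abs_max(scores):
    """Rescale raw importance scores so that the largest absolute value equals 100%."""
    peak = max(abs(value) for value in scores.values())
    if peak == 0:
        # All scores are zero: there is nothing to scale.
        return {name: 0.0 for name in scores}
    return {name: 100.0 * value / peak for name, value in scores.items()}


# Hypothetical raw scores for one indicator (illustrative, not values from the study).
raw_scores = {
    "winter II": 2.0,
    "summer II": 1.0,
    "middle of autumn I": 0.1,
    "slope": -0.2,
}
relative = normalize_to_abs_max(raw_scores)
for name, value in relative.items():
    print(f"{name}: {value:.0f}%")
```

With this scaling, the most informative variable is always shown at 100%, and a variable whose raw score is negative keeps its sign, so it appears below the zero line.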
Results of geospatial modeling. Based on the analysis of model performance and informativeness of the variables, the most reliable results of geospatial modeling based on satellite data can be obtained for the C : N ratio (Fig. 4). Visual comparison with the map of tree species (Fig. 1) shows a clear spatial consistency of the predicted C : N values with the species composition of forest stands, which is due to the high information content of satellite variables in the learning process. The lowest C : N values correspond to broadleaved forests; slightly higher values, to small-leaved forests; and the highest values, to coniferous and mixed stands. For the remaining modeled indicators (not shown), spatial differentiation according to the species composition of forest stands is also characteristic. However, the variability of their values within homogeneous forest areas is less pronounced than that for the C : N ratio because of the weaker relationships with the variables used (which is reflected in the moderate values of the coefficient of determination of the models). In addition, for C and N stock estimates, local effects of “blockiness” (pronounced vertical and horizontal boundaries between image areas) are observed due to the high information content of the variables responsible for the spatial coordinates of pixels.
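One plausible mechanism behind the “blockiness” is that tree-based models split each variable at a threshold, so a split on a pixel coordinate yields a prediction that changes abruptly along a straight vertical or horizontal line. The sketch below is a simplified illustration under that assumption: it fits a single one-split regression tree to one coordinate in plain Python. The function names and the data are hypothetical, and the code is not the random forest implementation used in the study.

```python
def fit_stump(coords, values):
    """Fit a one-split regression tree on a single coordinate by least squares."""
    pairs = sorted(zip(coords, values))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical coordinates
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((v - left_mean) ** 2 for v in left)
        sse += sum((v - right_mean) ** 2 for v in right)
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict(stump, coord):
    """Return the piecewise-constant prediction for a coordinate."""
    threshold, left_mean, right_mean = stump
    return left_mean if coord <= threshold else right_mean


# Hypothetical plots: X coordinate (km) and litter stock (illustrative numbers).
x_km = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
stock = [10.0, 12.0, 11.0, 30.0, 29.0, 31.0]
stump = fit_stump(x_km, stock)

# Every pixel west of the threshold gets one value and every pixel east of it
# gets another, whatever its Y coordinate: a straight vertical boundary on the map.
for x in (0.0, 2.4, 2.6, 5.0):
    print(x, predict(stump, x))
```

Averaging many such trees softens the steps but does not remove them, which would explain why the effect is most visible for the indicators where the coordinate variables dominate.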
The spatial consistency of results of modeling (Fig. 5) for the two litter subhorizons is clearly pronounced for the N content (Pearson’s correlation coefficient r = 0.87) and the C : N ratio (r = 0.97); the rest of the parameters are characterized by a moderate positive correlation (r from 0.41 to 0.65). The values of modeling uncertainty increase in proportion to the predicted values for the C : N ratio in both subhorizons (r = 0.86 for L, and r = 0.79 for FH), as well as for the N content in the FH subhorizon (r = 0.78). For other indicators, no pronounced dependence is observed (r varies from –0.43 to 0.66). In general, the highest uncertainty values are typical of the areas with relatively low canopy density and/or areas near forest boundaries. It is obvious that image pixels, for which “mixed” spectral brightness of signals from underlying surfaces of different types is observed, are difficult for unambiguous interpretation (unless such areas are purposefully provided with GTPs), which accordingly affects the operation of the models.

Fig. 3. Relative informativeness of variables in regression modeling of (a) N(%), (b) C(%), (c) C : N, (d) Nstk, and (e) Cstk. Initial values were normalized to the absolute maximum. Data for the L subhorizon are given in dark gray, and data for the FH subhorizon are given in light gray. Axes: variables (x) and informativeness, % (y). Variables: (1) winter I (1st principal component of the winter image), (2) winter II, (3) beginning of spring I, (4) beginning of spring II, (5) middle of spring I, (6) middle of spring II, (7) summer I, (8) summer II, (9) middle of autumn I, (10) middle of autumn II, (11) elevation, (12) slope, (13) pixel coordinate along the X axis, and (14) pixel coordinate along the Y axis.

Fig. 4. The results of geospatial modeling of the C : N ratio in the (a) L and (b) FH subhorizons of forest litter for the studied area with (c, d) pixel-by-pixel estimates of uncertainty expressed in percent. Color scales: C : N from <14 to >36 (a, b); uncertainty from <1 to 100 (c, d). Map legend: boundary of the reserve, water course, forestless area.

DISCUSSION

Two main factors that do not allow us to make unambiguous conclusions about the stability and universality of the results obtained in this work are the local character of the study area and the relatively small amount of ground data used to train regression models. In particular, the locality imposes restrictions on conclusions about the relative informativity of variables of various types, since it is obvious that the role of orographic and/or climatic data (which were not considered in the work for reasons of rationality) can significantly increase as the spatial coverage and diversity of forest growth conditions in the territory for which the simulation is carried out. In turn, the small volume of ground data limits the possibilities for reliable verification of the model prediction results, so that the model performance estimates obtained in our study are more of a comparative nature rather than accurate quantitative results. In other words, we can definitely conclude that the N and C contents, as well as the C : N ratio, are modeled more accurately than the N and C stocks. However, in order to judge how close the quantitative estimates of the model performance are to the real accuracy of thematic products, it is necessary to generate an independent set of ground control data comparable in volume to that used for training.

In addition to the sample size, the spatial location of the GTPs plays an equally important role. In this study, when planning ground-based surveys, mainly the species composition of forest stands was taken into account, and this turned out to be sufficient for successful modeling of the C : N ratio and satisfactory results for N and C contents. However, N and C stocks proved to be more sensitive to the position of forest stands in the landscape, which is confirmed in



Fig. 5. 2D histograms of the distribution of the study area (hectares) in terms of the (a) contents and stocks of N and C in the forest litter and uncertainty of predicted values for the (b) L and (c) FH subhorizons. Axes: (a) L subhorizon (x) and FH subhorizon (y); (b, c) predictions (x) and uncertainty, % (y).
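The quantities summarized in Fig. 5 (area per bin of paired values, and the Pearson correlation coefficients quoted alongside it) can be reproduced in outline with the following minimal Python sketch. It assumes that each pixel contributes a fixed area, taken here as 0.01 ha for a 10 m × 10 m pixel, which is an assumption rather than a value stated in the text; the function names, bin width, and numbers are hypothetical.

```python
import math
from collections import Counter


def pearson_r(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def histogram2d_area(xs, ys, bin_width, pixel_area_ha):
    """Count pixels per (x-bin, y-bin) cell and convert the counts to hectares."""
    counts = Counter(
        (math.floor(x / bin_width), math.floor(y / bin_width))
        for x, y in zip(xs, ys)
    )
    return {cell: n * pixel_area_ha for cell, n in counts.items()}


# Hypothetical per-pixel predictions of the C : N ratio (not values from the study).
l_pred = [30.0, 32.0, 34.0, 36.0]   # L subhorizon
fh_pred = [20.0, 22.0, 21.0, 25.0]  # FH subhorizon

r = pearson_r(l_pred, fh_pred)
# Assumed pixel size: 10 m x 10 m = 0.01 ha.
area = histogram2d_area(l_pred, fh_pred, bin_width=5.0, pixel_area_ha=0.01)
print(f"r = {r:.2f}")
print(area)
```

The same two functions apply to the panels relating predictions to uncertainty: the predicted values go on one axis and the per-pixel uncertainty on the other.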


a number of studies. Thus, the stocks of litter of automorphic and semihydromorphic soils can differ almost twofold [19]. In addition, low canopy density has been found to be a significant source of uncertainty in predictions from geospatial modeling. Thus, for further research in this direction, when choosing the sites for GTPs, it is necessary to take into account both the species and morphostructural characteristics of forest stands and the landscape features of the territory.

Despite the indicated limitations, the results obtained sufficiently characterize the potential of the proposed approach, which implies the use of a time series of satellite images as the main variables for geospatial modeling of the contents and stocks of N and C in the forest floor. As already noted, we could not find works analogous to our study in terms of the analyzed indicators and the initial data used. However, several studies close in thematic and methodological aspects can be considered as analogues for comparison. All of them use the national forest inventory materials to model the properties of forest litter at the subcontinental level. Beguin et al. used materials with 500 GTPs and geospatial data on the proportions of coniferous and hardwood species in the composition of forest stands in combination with the absolute elevation of the relief to model the C : N ratio in the organic horizon of soils of boreal forests of Canada [21]. Various machine learning methods were compared for constructing regression models, including random forests. The performance indicators of the best of the considered models were at the level of R2 = 0.4 and RMSE = 30% (the authors provide only graphical data without exact numbers). For a similar task in the forests of Europe, Carre et al. used 739 GTPs and about 40 variables, including maps of the prevailing tree species and landforms, as well as climatic data [25]. The simulation was done using simple kriging and neural networks. The results of kriging were the most accurate: R2 = 0.6, RMSE = 4.91 (relative error values were not estimated). Cao et al. used 3303 GTPs and more than 30 different variables, including thematic characteristics of soils, climate, forests, topography, and parent rock, as well as seasonal values of the spectral index NDVI, to model C stocks in the litter and upper mineral soil layer of US forests [24]. Three methods of training models were considered, and the best results were obtained for the random forest algorithm: R2 = 0.2, RMSE = 923 g/m2 (only absolute values are given). As can be seen from the above examples, the results of modeling the properties of forest litter for large areas, even using large training samples and a wide range of types of geospatial variables, are characterized by relatively low accuracies (comparable to or lower than in our local study). Therefore, the issues of assessing the possibilities of scaling the proposed approach and determining the conditions under which sufficiently reliable modeling results can be obtained are of the highest priority for further research in this area.

CONCLUSIONS

This work demonstrates the possibilities of using optical multi-season satellite images for geospatial modeling of the contents and stocks of N and C in the forest litter, without the need for preliminary recognition of the species structure of stands, which is their main predictor. This approach makes it possible to automatically take into account the variability of the share of trees of different species and/or groups of species in the composition of forest stands, as well as a number of accompanying characteristics, such as canopy density, sanitary condition, etc., in the constructed regression models. This is due to differences in the seasonal dynamics of spectral properties of the tree canopy. At the same time, it is obvious that thematic products of qualitative and/or quantitative characteristics of forest cover obtained on the basis of remote sensing data or from forest inventory materials can be successfully used as variables for modeling the properties of litter. However, given the wide availability of optical satellite images, which, as a rule, are immediately ready for thematic processing, it is more rational to directly assess the properties of the organic soil horizon without intermediate products, which may have different degrees of generalization, relevance, and reliability.

For the further development of this study, it is planned to expand ground-based surveys on the territory of the Bryansk Forest Reserve, taking into account the obtained thematic products for more reliable validation and calibration of models, as well as to verify the reliability of our conclusions and the efficiency of the approaches used for forests of the taiga biome in the European part of Russia.

ACKNOWLEDGMENTS

The authors are grateful to the director of the Center for Forest Ecology and Productivity, corresponding member of the Russian Academy of Sciences, Dr. Sci. N.V. Lukina for the proposed idea of this study, to the administration of the Bryanskii Les Reserve for assistance in field work, and to the team of the Chromatography ecoanalytical laboratory (collective use center) of the Institute of Biology, Komi Science Center, Ural Branch of the Russian Academy of Sciences (accreditation certificate no. ROSS RU.0001.511257) for quantitative determination of carbon and nitrogen in the collected samples.

FUNDING

This study was performed within the framework of state assignment of the Center for Forest Ecology and Productivity “Methodological approaches to the assessment of structural arrangement and functioning of forest ecosystems” (registration no. АААА-А18-118052590019-7) (thematic analysis of data). It was partly supported by the Program of the Presidium of the Russian Academy of Sciences “Biodiversity of Natural Systems. Rational Use of the Biological Resources of Russia” (registration no. AAAA-A18-11802199063-2) (statistical analysis of the results). The development of data bases with the results of ground surveys was supported by the Russian Science Foundation (project no. 16-17-10284).

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

REFERENCES

1. E. A. Gavrilyuk, A. V. Gornov, and D. V. Ershov, “Assessment of spatial distribution of tree species in the Bryanskiy Les Nature Reserve and its protective zone based on different season Landsat satellite data,” Byull. Bryansk. Otd., Ross. Bot. O-va, No. 3 (15), 13–23 (2018).
https://doi.org/10.22281/2307-4353-2018-3-13-23
2. A. V. Gornov, M. V. Gornova, E. V. Tikhonova, N. E. Shevchenko, A. I. Kuznetsova, E. V. Ruchinskaya, and D. N. Teben’kova, “Assessment of succession state of coniferous–broadleaved forests in European part of Russia based on population approach,” Lesovedenie, No. 4, 243–257 (2018).
https://doi.org/10.1134/S0024114818040083
3. O. I. Evstigneev, Doctoral Dissertation in Biology (Nizhny Novgorod, 2010).
4. O. I. Evstigneev, History and Nature Management in the Nerusso-Desnyanskoe Polesie (Desyatochka, Bryansk, 2009) [in Russian].
5. O. I. Evstigneev and Yu. P. Fedotov, “Assessment of vegetation cover diversity in Russian-Ukrainian cross-boundary ecological network by the example of Nerusso-Desnyanskoe Polesie,” in Proceedings of Russian-Ukrainian Meeting “Prospective Development of Ecological Network and Creation of Cross-Boundary Protected Territories in the Desna River Basin” (Moscow, 1999), pp. 27–43.
6. “A map of soil-geographic zonation, scale 1 : 15000000,” in National Soil Atlas of Russian Federation, Ed. by S. A. Shoba (Astrel’-AST, Moscow, 2011), pp. 198–201.
7. Yu. A. Kiseleva, “Specific development of woodland soils by example of the Bryanskiy Les Nature Reserve (formation of brown and podzolic soils),” in Role of Soils in Biosphere, Tr. Inst. Pochvoved., Mosk. Gos. Univ., Ross. Akad. Nauk no. 1 (Moscow, 2002), pp. 56–78.
8. L. L. Shishov, V. D. Tonkonogov, I. I. Lebedeva, and M. I. Gerasimova, Classification and Diagnostic System of Russian Soils (Oikumena, Smolensk, 2004) [in Russian].
9. A. I. Kuznetsova, N. V. Lukina, E. V. Tikhonova, A. V. Gornov, M. V. Gornova, V. E. Smirnov, A. P. Geraskina, N. E. Shevchenko, D. N. Tebenkova, and S. I. Chumachenko, “Carbon stock in sandy and loamy soils of coniferous–broadleaved forests at different succession stages,” Eurasian Soil Sci. 52, 756–768 (2019).
https://doi.org/10.1134/S1064229319070081
10. O. V. Menyailo, A. I. Matvienko, M. I. Makarov, and Sh.-K. Cheng, “Role of nitrogen in carbon balance in forest ecosystems,” Lesovedenie, No. 2, 143–159 (2018).
https://doi.org/10.7868/S0024114818020067
11. B. M. Mirkin, L. G. Rozenberg, and L. G. Naumova, Dictionary of Definitions and Terms of Modern Phytocenology (Moscow, 1989) [in Russian].
12. Climate, Bryanskiy Les Nature Reserve official website.
http://www.bryansky-les.ru/naturalconditions/klimat/.
13. Order of the Ministry of Natural Resources and Environment of Russian Federation no. 20-r of June 30, 2017 on Methodological recommendations for quantitative determination of greenhouse gas consumption volume. https://www.garant.ru/products/ipo/prime/
14. Vegetation of the European Part of USSR, Ed. by S. A. Gribov, T. I. Isachenko, and E. M. Lavrenko (Nauka, Leningrad, 1980) [in Russian].
15. I. Yu. Savin, A. V. Zhogolev, and E. Yu. Prudnikova, “Modern trends and problems of soil mapping,” Eurasian Soil Sci. 52, 471–480 (2019).
https://doi.org/10.1134/S1064229319050107
16. L. A. Sokolov, “Classification of pedogenic and littering mountain minerals in Bryansk forest massif,” in Proceedings of Scientific Conference “Input of Scientists and Professionals in National Economics” (Bryansk, 1998), Vol. 2.
17. M. V. Stefurishin, “Assessment of soil-ecological conditions of water-glacial landscapes of the Bryansk forest massif,” in Problems of Forest Science and Forestry (Bryansk State Engineering and Technological Academy, Bryansk, 2000), No. 10, pp. 48–50.
18. N. G. Fedorets and O. N. Bakhmet, Ecological Features of Transformation of Carbon and Nitrogen Compounds in Forest Soils (Karelian Scientific Center, Russian Academy of Sciences, Petrozavodsk, 2003) [in Russian].
19. O. V. Chernova, I. M. Ryzhova, and M. A. Podvezennaya, “Assessment of organic carbon stocks in forest soils on a regional scale,” Eurasian Soil Sci. 53, 339–348 (2020).
https://doi.org/10.1134/S1064229320030023
20. A. Altmann, L. Tolosi, O. Sander, and T. Lengauer, “Permutation importance: a corrected feature importance measure,” Bioinformatics 26, 1340–1347 (2010).
21. J. Beguin, G.-A. Fuglstad, N. Mansuy, and D. Pare, “Predicting soil properties in the Canadian boreal forest with limited data: comparison of spatial and non-spatial statistical approaches,” Geoderma 306, 195–205 (2017).
https://doi.org/10.1016/j.geoderma.2017.06.016
22. K. J. Beven and M. J. Kirkby, “A physically-based variable contributing area model of basin hydrology,” Hydrol. Sci. Bull. 24 (1), 43–69 (1979).
23. L. Breiman, “Random forests,” Mach. Learn. 45 (1), 5–32 (2001).
24. B. Cao, G. M. Domke, M. B. Russell, and B. F. Walters, “Spatial modeling of litter and soil carbon stocks on forest land in the conterminous United States,” Sci. Total Environ. 654, 94–106 (2019).
https://doi.org/10.1016/j.scitotenv.2018.10.359
25. F. Carre, N. Jeannee, S. Casalegno, O. Lemarchand, H. I. Reuter, and L. Montanarella, “Mapping the CN ratio of the forest litters in Europe: lessons for global digital soil mapping,” in Digital Soil Mapping, Ed. by J. L. Boettinger (Springer-Verlag, Dordrecht, 2010), pp. 217–225.
26. O. Conrad, B. Bechtel, M. Bock, H. Dietrich, E. Fischer, L. Gerlitz, J. Wehberg, V. Wichmann, and J. Böhner, “System for Automated Geoscientific Analyses (SAGA) v. 2.1.4,” Geosci. Model. Dev. 8, 1991–2007 (2015).
https://doi.org/10.5194/gmd-8-1991-2015
27. N. Cools, L. Vesterdal, B. De Vos, E. Vanguelova, and K. Hansen, “Tree species is the major factor explaining C : N ratios in European forest soils,” For. Ecol. Manage. 311, 3–16 (2014).
https://doi.org/10.1016/j.foreco.2013.06.047
28. ESA Sentinel-2. http://www.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Sentinel-2. Accessed March 20, 2019.
29. P. Escribano, T. Schmid, S. Chabrillat, E. Rodríguez-Caballero, and M. García, “Optical remote sensing for soil mapping and monitoring,” in Soil Mapping and Process Modeling for Sustainable Land Use Management, Ed. by P. Pereira, et al. (Elsevier, Amsterdam, 2017), pp. 87–125.
https://doi.org/10.1016/B978-0-12-805200-6.00004-9
30. O. I. Evstigneev and V. N. Korotkov, “Pine forest succession on sandy ridges within outwash plain (Sandur) in Nerussa-Desna Polesie,” Russ. J. Ecosyst. Ecol. 1 (3), (2016).
https://doi.org/10.21685/2500-0578-2016-3-2
31. F. Fassnacht, H. Latifi, K. Stereńczak, A. Modzelewska, M. Lefsky, L. Waser, C. Straub, and A. Ghosh, “Review of studies on tree species classification from remotely sensed data,” Remote Sens. Environ. 186, 64–87 (2016).
32. M. Friedl, J. Gray, D. Sulla-Menashe, and M. Friedl, MCD12Q2: MODIS/Terra+Aqua Land Cover Dynamics Yearly L3 Global 500m SIN Grid V006 (Land Processes Distributed Active Archive Center, Sioux Falls, SD, 2019).
33. B. Gallo, J. Demattê, R. Rizzo, J. Safanelli, W. Mendes, I. Lepsch, M. Sato, D. Romero, and M. Lacerda, “Multi-temporal satellite images on topsoil attribute quantification and the relationship with soil classes and geology,” Remote Sens. 10, 1571 (2018).
https://doi.org/10.3390/rs10101571
34. T. Hengl, J. Mendes de Jesus, G. B. M. Heuvelink, M. Ruiperez Gonzalez, M. Kilibarda, A. Blagotić, W. Shangguan, et al., “SoilGrids250m: global gridded soil information based on machine learning,” PLoS One 12 (2), e0169748 (2017).
https://doi.org/10.1371/journal.pone.0169748
35. Z. Jiang, A. R. Huete, K. Didan, and T. Miura, “Development of a two-band enhanced vegetation index without a blue band,” Remote Sens. Environ. 112, 3833–3845 (2008).
https://doi.org/10.1016/j.rse.2008.06.006
36. I. T. Jolliffe, Principal Component Analysis, 2nd ed. (Springer-Verlag, New York, 2002).
https://doi.org/10.1007/b98835
37. L. Komsta, mblm: Median-based linear models, R package version 0.12.1. https://CRAN.R-project.org/package=mblm.
38. M. Kuhn, Classification and regression training, R package version 6.0-84. https://CRAN.R-project.org/package=caret.
39. Q. Li, L. Ma, S. Liu, A. Wufu, Y. Li, S. Yang, and X. Yang, “Plant litter estimation and its correlation with sediment concentration in the Loess Plateau,” PeerJ Preprints, (2019).
https://doi.org/10.7287/peerj.preprints.27891v1
40. J. Louis, V. Debaecker, B. Pflug, M. Main-Knorn, J. Bieniarz, U. Müller-Wilm, E. Cadau, and F. Gascon, “Sentinel-2 L2A Sen2Cor Processor: L2A processor for users,” in Proceedings of Living Planet Symposium 2016, Prague, Czech Republic, May 9–13, 2016 (European Space Agency, Paris, 2016), pp. 1–8.
41. G. Matasci, T. Hermosilla, M. A. Wulder, J. C. White, N. C. Coops, G. W. Hobart, and H. S. Zald, “Large-area mapping of Canadian boreal forest cover, height, biomass and other structural attributes using Landsat composites and lidar plots,” Remote Sens. Environ. 209, 90–106 (2018).
https://doi.org/10.1016/j.rse.2017.12.020
42. N. Meinshausen, “Quantile regression forests,” J. Mach. Learn. Res. 7, 983–999 (2006).
43. Q. Quan, C. Wang, N. He, Zh. Zhang, X. Wen, H. Su, Q. Wang, and J. Xue, “Forest type affects the coupled relationships of soil C and N mineralization in the temperate forests of northern China,” Sci. Rep. 4, 6584 (2014).
https://doi.org/10.1038/srep06584
44. A. Ramcharan, T. Hengl, T. Nauman, C. Brungard, S. Waltman, S. Wills, and J. Thompson, “Soil property and class maps of the conterminous US at 100 meter spatial resolution based on a compilation of national soil point observations and machine learning,” Soil Sci. Soc. Am. J. 82, 186–201 (2018).
45. L. Sabetta, N. Zaccarelli, G. Mancinelli, S. Mandrone, R. Salvatori, M. L. Costantini, G. Zurlini, and L. Rossi, “Mapping litter decomposition by remote-detected indicators,” Ann. Geophys. 49 (1), 219–226 (2006).
46. P. A. Shary, “Land surface in gravity points classification by complete system of curvatures,” Math. Geol. 27 (3), 373–390 (1995).
47. A. F. Siegel, “Robust regression using repeated medians,” Biometrika 69 (1), 242–244 (1982).
48. S. Wang, K. Adhikari, Q. Wang, X. Jin, and H. Li, “Role of environmental variables in the spatial distribution of soil carbon (C), nitrogen (N), and C : N ratio from the north-eastern coastal agroecosystems in China,” Ecol. Indic. 84, 263–272 (2018).
https://doi.org/10.1016/j.ecolind.2017.08.046
49. IUSS Working Group WRB, World Reference Base for Soil Resources 2014, International Soil Classification System for Naming Soils and Creating Legends for Soil Maps, World Soil Resources Reports No. 106 (UN Food and Agriculture Organization, Rome, 2014).
50. M. N. Wright and A. Ziegler, “A fast implementation of random forests for high dimensional data in C++ and R,” J. Stat. Software 77 (1), 1–17 (2017).
https://doi.org/10.18637/jss.v077.i01
51. X. Yang, Y. Yu, H. Hu, and L. Sun, “Moisture content estimation of forest litter based on remote sensing data,” Environ. Monit. Assess. 190 (7), 421 (2018).
https://doi.org/10.1007/s10661-018-6792-2
52. Y. Yang, Y. Luo, and A. C. Finzi, “Carbon and nitrogen dynamics during forest stand development: a global synthesis,” New Phytol. 190, 977–989 (2011).
https://doi.org/10.1111/j.1469-8137.2011.03645.x

Translated by D. Konyushkov


