Breen Agu2018


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/321836286

Developing a surrogate model for the Soil & Water Assessment Tool (SWAT) using a deep learning algorithm

Research · December 2017


All content following this page was uploaded by Scott C. James on 05 March 2019.



Deep Learning Model Integration of Remotely Sensed and SWAT-Simulated Regional Soil Moisture
Kathy Breen¹, Scott C. James², Scott Fitzgerald³, Joseph D. White⁴
¹ Department of Geosciences, Baylor University
² Departments of Geosciences and Mechanical Engineering, Baylor University
³ Department of Computer Sciences, Baylor University
⁴ Department of Biology, Baylor University

Abstract and Objectives

Purpose:
Soil moisture can be used as a metric to assess:
• Soil health – does a given soil contain sufficient moisture to sustain plant and animal life?
• Risk potential – in the event of extreme weather, which areas are most sensitive to extreme water flux and flooding?

Need:
• Accurate, near-real-time, high-resolution estimates for key metrics (i.e., soil moisture) to assess hydrologic activity.
• Computationally efficient, lightweight tools to analyze data and make predictions in real time.

Abstract:
Data-driven deep learning models (DLMs) may be used as surrogates for physics-based models to reduce computational time and user biases. Here, a DLM was developed to make near-real-time soil-moisture estimates using the USDA's Soil & Water Assessment Tool (SWAT) predictions (uncalibrated, higher resolution) in conjunction with soil-moisture retrievals from NASA's Soil Moisture Active Passive (SMAP) satellite (calibrated, lower resolution) for use in risk assessments (e.g., flooding and crop viability). The Middle Tennessee-Elk River watershed, with predominantly agricultural land use and fairly uniform topography, was selected as the area of interest for analysis.

As a first step in preparing an operational framework, in situ and satellite soil-moisture data were evaluated with regard to their applicability in developing a DLM surrogate for SWAT simulations. The SMAP data were selected for input into the DLM because overall data variances were low and correlations with measured soil moistures were superior to those of other remotely sensed retrievals (RMSE < 0.04, R² > 0.7 for all but one site). Uncertainty associated with remotely sensed soil-moisture data decreased with increasing homogeneity of the physical characteristics of the land surface.

Data used to train the DLM were generated by running SWAT from 2015 to 2017 based on 3,200 hydrologic response units (HRUs) within the target area. The Scikit-Learn and Keras machine learning packages were used to pre-process the data and build the DLM architecture, respectively. Preliminary investigations revealed that a multilayer perceptron (MLP) model could not achieve satisfactory results, so a hybrid architecture was developed where an LSTM network processed transient SWAT model inputs (precipitation, temperature, etc.) and an MLP processed static data (curve number, soil depth, etc.) to make predictions. Loss was calculated as the RMSE between SWAT-predicted and SMAP-sensed soil moistures. The Adagrad optimizer was implemented with a dropout rate of 0.2 for all layers. The hybrid model architecture yielded predictions within ±0.03 cm³/cm³, slightly less than the target error for SMAP mission objectives (±0.04 cm³/cm³).

Objectives:
• Estimate current soil-moisture conditions using a neural-network surrogate model for SWAT daily model inputs and outputs augmented with satellite datasets for leaf area index (LAI) and soil moisture (SM), respectively, and in situ soil moisture and weather data.
• Create a data product that is scalable and produces real-time estimates that can be applied in flood risk and growing season crop viability assessments.

Area of Interest

Figure 1: Target area selection in the Middle Tennessee-Elk River watershed. Uncertainty associated with remotely sensed soil-moisture data decreases when the physical characteristics of the land surface are relatively homogeneous (Zhang et al., 2017; Burgin et al., 2017). For this reason, the selected target area for neural-network development was in a region where agriculture is a prominent land-use type and the topographic profile is relatively flat. The white square delineates the approximate size of an SMAP data pixel.

Soil Moisture Remote Sensing

To estimate soil moisture, a passive radiometer measures the natural thermal emission from the land surface as brightness temperature ($T_B$). This is a bulk measure of upwelling radiation from the land surface. Partitioning of $T_B$ into vegetation, air, and soil components requires knowledge of the landscape, such as vegetation and soil type. The $T_{B,\mathrm{soil}}$ component is used to calculate emissivity, which is then used to estimate soil moisture via a dielectric model. Uncertainty is introduced primarily via the choice of landcover and soils databases and the selection and implementation of the dielectric model.

Soil Moisture Datasets

Prior to neural-network development, soil-moisture data (in situ and satellite; Figure 2) were evaluated for use with SWAT. Linear regressions were performed using soil-moisture estimates from 11 SCAN stations in the study area vs. remotely sensed data (Table 1, Figure 3). SMAP was chosen for use in further analyses because overall data variances were low and linear model fits were superior to those using SMOS data.

Figure 2: Timeseries of soil-moisture estimates from SCAN site 2113 (black curve), SMAP (red circles), and SMOS (blue circles). Satellite data are from a single pixel geographically correlated to the SCAN station.

Figure 3: Histograms of residuals of linear regression models for in situ and satellite data comparisons (SMAP vs. SCAN: red, SMOS vs. SCAN: blue).

Table 1: Linear regression statistics from timeseries comparisons between satellite data products (SMAP, SMOS) and SCAN station observations for 2016. Satellite timeseries were collected from data pixels corresponding to the geographic location of each respective SCAN station. All data were collected at approximately 6 a.m. β₀ and β₁ are the regression intercept and slope.

SCAN          RMSE            R²              β₀              β₁
station   SMAP   SMOS    SMAP   SMOS    SMAP   SMOS    SMAP   SMOS
2053      0.03   0.06    0.86   0.63    0.02  -0.08    0.86   0.90
2055      0.03   0.09    0.90   0.55    0.08  -0.10    0.69   0.88
2056      0.03   0.08    0.87   0.11    0.08   0.03    0.58   0.20
2057      0.04   0.07    0.83   0.60    0.07  -0.05    0.84   0.84
2075      0.03   0.06    0.86   0.69    0.10  -0.01    0.68   0.85
2076      0.04   0.05    0.76   0.51    0.06  -0.03    0.76   0.65
2077      0.05   0.07    0.70   0.18   -0.06  -0.02    0.90   0.37
2078      0.04   0.06    0.80   0.65    0.02  -0.09    1.04   1.17
2113      0.03   0.09    0.81   0.33    0.15  -0.04    0.49   0.55
2173      0.04   0.09    0.73   0.00    0.14   0.06    0.58   0.06
2179      0.04   0.10    0.58   0.31    0.21  -0.05    0.47   0.78
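
The four statistics reported in Table 1 (RMSE, R², β₀, β₁) can be reproduced for any pair of timeseries with an ordinary least-squares fit. The following is a minimal sketch, not the authors' code. It assumes the in situ SCAN series is the predictor and the satellite series is the response, and that RMSE is computed on the regression residuals; the poster does not state either choice. The function name `linear_fit_stats` and the sample values are hypothetical.

```python
from math import sqrt


def linear_fit_stats(x, y):
    """Fit y ~ beta0 + beta1 * x by least squares; return fit statistics.

    Assumption (not stated on the poster): x is the in situ series,
    y is the satellite series, and RMSE is taken over fit residuals.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has zero variance; slope is undefined")

    beta1 = sxy / sxx                  # slope
    beta0 = mean_y - beta1 * mean_x    # intercept

    residuals = [yi - (beta0 + beta1 * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)

    rmse = sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"rmse": rmse, "r2": r2, "beta0": beta0, "beta1": beta1}


# Synthetic volumetric soil moistures (cm^3/cm^3), made up for illustration.
scan = [0.20, 0.25, 0.30, 0.35, 0.40]
satellite = [0.05 + 0.8 * v for v in scan]
stats = linear_fit_stats(scan, satellite)
```

A slope near 1 and an intercept near 0 indicate that the satellite product tracks the station without scaling or offset, which is how the β₀ and β₁ columns of Table 1 can be read alongside RMSE and R².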

SWAT Model and Deep Learning Architecture

• SWAT is a physics-based model that predicts hydrologic responses to changes in climate, land use, and land-management practices for any number of hydrologic response units (HRUs) (Neitsch et al., 2011).
• SWAT ran daily on a 10 × 10 km grid within the bounding rectangle of the Middle Tennessee-Elk River watershed (Figure 1).
• SWAT requires static inputs of land-use type, soil properties, and topography and dynamic inputs of weather parameters (precipitation, temperature, wind speed, etc.) to compute daily water balance.

Water balance

$SW'_i = SW_i + W_{\mathrm{perc},i} - ET_i$

where $SW'_i$ and $SW_i$ are the soil water content at the end and beginning of day $i$, respectively, $W_{\mathrm{perc},i}$ is the net soil percolation on day $i$, and $ET_i$ is the evapotranspiration on day $i$.

SWAT inputs were developed from open-source, national databases: USDA State Soil Geographic (STATSGO) and the USGS National Landcover Database (NLCD). Weather data were obtained from 33 active NOAA Integrated Surface Database (ISD; precipitation, temperature, wind speed) stations in the target area. Weather parameters not available in ISD datasets (solar radiation, relative humidity) were calculated by the SWAT weather generator using long-term averages and inverse-distance weighting.

A hybrid DLM architecture links a long short-term memory (LSTM) network with two multilayer perceptron (MLP) networks to process dynamic and static datasets, respectively.
• Dynamic SWAT inputs (weather) with a daily timestep fed data to a single LSTM layer. The output from the LSTM layer was then concatenated with SWAT static inputs (soil depth, curve number, etc.) and mapped through an MLP network to SWAT soil-moisture predictions (MLP1).
• Finally, SWAT soil-moisture predictions were mapped through a second shallow MLP network to corresponding SMAP soil moistures over the target area (MLP2).

Deep Learning Model Hybrid Architecture

[Schematic: dynamic features $X_D$ (samples × features × timesteps) feed the LSTM, annotated $X = \sigma(W_D X_D + b_D)$; the LSTM output is concatenated with static features $X_S$ (samples × features) and passed to MLP1, which outputs $Y_{\mathrm{SWAT}}$; MLP2 maps $Y_{\mathrm{SWAT}}$ to $Y_{\mathrm{SMAP}}$.]

Step 1: Extract temporal information from dynamic data (output from LSTM network).
Step 2: Concatenate temporal information with static data.
Step 3: Map SWAT input parameters to SWAT soil-moisture predictions.
Step 4: Map SWAT soil moisture to SMAP retrievals. Calculate loss.

DLM Model Architectures by Data Type

Static Data: Multilayer Perceptron (MLP)
• Simplest deep learning architecture.
• Often used for classification problems.
• Good for data that do not have temporal variance.

Figure 4: Schematic for a simple MLP network mapping inputs (X) to outputs (Y) through two hidden layers.

Dynamic Data: Recurrent Network
• Used in sequence prediction, such as natural language processing or timeseries analysis.
• RNN variants such as LSTM networks "remember" historical system states on short and long timescales.

Figure 5: Left: Recurrent neural network (RNN) schematic. Right: Long short-term memory (LSTM) cell, which adds an internal hidden state, $c_t$, within each hidden state, $h_t$.

Results

• Attempts to process static and dynamic data with a single MLP network did not yield prediction errors within SMAP mission objectives (MO).
• SMAP soil-moisture MO are accurate within ±0.04 cm³ cm⁻³ (1-σ). The hybrid network estimated SMAP soil moisture with a training accuracy of about 0.03 cm³ cm⁻³ and a test accuracy of 0.035 cm³ cm⁻³.

Figure 6: Average training loss (RMSE) between DLM predictions and SMAP soil-moisture estimates. Note that the range of training variance indicates overfitting of the DLM to training data.

Observations and Conclusions

• Overall agreement between SMAP and in situ soil-moisture estimates is high (0.55 < R² < 0.9).
• For physics-based models like SWAT whose fluxes are governed by dynamic inputs, a hybrid architecture increased predictive accuracy by a factor of 20 over an MLP.
• Losses calculated on the validation set for the hybrid model approach SMAP mission objectives within the area of interest (AOI).
• Future LSTM-MLP hybrid network development will focus on feature engineering to increase spatial generalization of DLM predictions. Specifically, experiments with overparameterization of input samples will be performed.

References

Burgin, M. S., Colliander, A., Njoku, E. G., Chan, S. K., Cabot, F., Kerr, Y. H., Bindlish, R., Jackson, T. J., Entekhabi, D., and Yueh, S. H. (2017). A comparative study of the SMAP passive soil moisture product with existing satellite-based soil moisture products. IEEE Transactions on Geoscience and Remote Sensing, 55(5):2959–2971.
Duchi, J., Hazan, E., and Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121–2159.
Leontjeva, A. and Kuzovkin, I. (2017). Combining static and dynamic features for multivariate sequence classification. Data Science and Advanced Analytics (DSAA), 2016 IEEE International Conference:21–30.
Neitsch, S., Arnold, J., Kiniry, J., and Williams, J. (2011). Soil water assessment tool theoretical documentation. Version 2009. Texas Water Resource Institute, College Station, Texas. TWRI Report. Technical report, TR-406.
Zhang, X., Zhang, T., Zhou, P., Shao, Y., and Gao, S. (2017). Validation analysis of SMAP and AMSR2 soil moisture products over the United States using ground-based measurements. Remote Sensing, 9(2):104.
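
As a supplementary illustration of the daily water balance given in the SWAT section of this poster, SW'_i = SW_i + W_perc,i − ET_i, the update can be stepped forward one day at a time. This is a minimal sketch of that single equation only, not of SWAT itself, which tracks additional terms and processes. The function names and the sample values are hypothetical, and the choice of millimetres of water as the unit is an assumption.

```python
def update_soil_water(sw_start, w_perc, et):
    """Apply SW'_i = SW_i + W_perc,i - ET_i for one day.

    All quantities are assumed to share one unit (e.g., mm of water).
    """
    return sw_start + w_perc - et


def simulate_soil_water(sw_initial, w_perc_series, et_series):
    """Step the balance through consecutive days.

    Returns the end-of-day soil water content for each day; the
    end-of-day value becomes the next day's starting value.
    """
    if len(w_perc_series) != len(et_series):
        raise ValueError("percolation and ET series must have equal length")

    sw = sw_initial
    end_of_day = []
    for w_perc, et in zip(w_perc_series, et_series):
        sw = update_soil_water(sw, w_perc, et)
        end_of_day.append(sw)
    return end_of_day


# Made-up three-day example (values in mm, for illustration only).
history = simulate_soil_water(100.0, [5.0, 0.0, 2.0], [3.0, 4.0, 1.0])
```

Chaining the update this way shows why the dynamic weather inputs dominate the soil-moisture signal: each day's state depends on the full history of preceding fluxes, which is the kind of dependence the LSTM branch of the hybrid network is meant to capture.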
