
Environ Ecol Stat (2015) 22:227–246

DOI 10.1007/s10651-014-0295-2

Non-linear time-varying stochastic models for agroclimate risk assessment

Reza Hosseini · Akimichi Takemura · Alireza Hosseini

Received: 24 November 2013 / Revised: 1 June 2014 / Published online: 11 July 2014
© Springer Science+Business Media New York 2014

Abstract This work develops a model for minimum temperature in order to assess weather-related risk in the agriculture industry. Non-linear autoregressive models with time-varying coefficients and volatility, with various seasonal components and lags, are compared in an appropriate model-selection algorithm using AIC. The optimal model is a time-varying autoregressive model which includes non-linear and seasonally-varying autoregressive terms as well as time-varying volatility. These models are then used to simulate future weather, from which the probabilities of appropriate complex hazard events are estimated.

Keywords Frost · Minimum temperature · Non-linear · Risk assessment · Time-varying · Weather derivative

Handling Editor: Pierre Dutilleul.

The first author was supported by research grants from the Japan Society for the Promotion of Science.

R. Hosseini (B)
IBM Research Collaboratory, 9 Changi Business Park 1, Singapore 486048, Singapore
e-mail: reza1317@gmail.com; rezah@sg.ibm.com

A. Takemura
Department of Mathematical Informatics, Graduate School of Information Science and Technology,
University of Tokyo, Bunkyo, Tokyo 113-8656, Japan
e-mail: takemura@stat.t.u-tokyo.ac.jp

A. Hosseini
Department of Industrial Engineering, University of Yazd, University Blvd, Safaeye,
741-89195 Yazd, Iran
e-mail: alirezahosseini65@gmail.com


1 Introduction

With the growing world population, guaranteeing sufficient and reliable food production is a major challenge in the 21st century. Weather damage at various stages of agricultural production is a non-negligible risk factor and therefore needs to be considered by investors in the industry. Measuring this risk is also of interest to governments which desire to provide appropriate insurance to investors to guarantee an efficient and stable food supply. The weather risk can take many forms depending on the plant and region. For example, low temperatures during spring (the beginning of the growing season) can damage pistachio tree leaves and blossoms, thus significantly decreasing the yield. Heavy rains or strong winds during this period are other risk factors which can interfere with the pollination process. During the past twenty years, the most important risk factor for pistachio production in the greater Rafsanjan area in the north of Kerman Province in Iran (the largest pistachio-producing region in the world) has been temperatures below zero in the spring.
Agriculture is not the only sector largely influenced by weather: energy consumption, tourism and transportation are also susceptible to weather events. Because of this major influence of weather on human welfare, it is necessary to estimate the probability (or distribution) of various events in the future. For example, Hosseini et al. (2012a) consider the hazard event "the temperature in March–April is above 5 (deg. C) for at least 3 consecutive days and is below zero after", which is defined to characterize conditions under which pistachio trees are severely damaged by frost in the flowering period. In this application and many others, the main goal is to estimate the probability and distribution of such complex events well into the future, for example in the next year or in the next 20 years. This is in contrast to the daily weather forecast for the next few days, which is suitable for short-term planning. Estimates of the probability of such events in the coming year can be used to develop financial instruments called "weather derivatives" (Richards et al. 2004; Benth et al. 2007), which can be sold to farmers and small investors (or other weather-related industries) to hedge their risk and encourage more investment in agriculture.
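Such a compound hazard event can be checked mechanically on any daily temperature series, observed or simulated. Below is a minimal sketch (the function name and argument names are ours; the thresholds follow the verbal description of the event above):

```python
def hazard_occurs(temps, warm_thresh=5.0, run_len=3, frost_thresh=0.0):
    """Return True if `temps` (daily minimum temperatures for the window
    of interest, e.g. March-April) contains a run of `run_len` consecutive
    days above `warm_thresh` followed, on some later day, by a value
    below `frost_thresh`."""
    run = 0
    warmed = False
    for t in temps:
        if warmed and t < frost_thresh:
            return True          # frost after a qualifying warm spell
        run = run + 1 if t > warm_thresh else 0
        if run >= run_len:
            warmed = True
    return False
```

Applying such a predicate to many series simulated from a fitted model gives a Monte Carlo estimate of the event probability.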
This paper focuses on developing statistical models for the minimum temperature
process which take into account various features of this process such as seasonality
and the statistical “dependence” over time. This dependence refers to the remaining
temporal correlation after accounting for the seasonality in the mean of the weather
process. The volatility of the temperature process also shows some seasonal patterns
over time and needs to be accounted for in the models. This is particularly important
because more volatility may increase the probability of the events of interest. Hosseini
et al. (2012a) compared autoregressive (AR) models of various orders for the minimum temperature using statistical model selection criteria such as AIC (Akaike 1974) and BIC (Schwarz 1978). This work extends the models considered there by: considering AR processes of very long lags (which are still tractable); allowing non-linear terms in the AR process; allowing the volatility to vary with season; and letting the volatility depend on the short-term past. Moreover, we present a model selection algorithm which searches through large classes of models in several steps, decided based on the properties of the process as well as the objective of the models. We


also present appropriate graphical methods for the exploratory analysis and model
assessment for such models.
Non-linear time series models are discussed in Tong (1990) from a dynamical-systems point of view, where several non-linear models are presented: threshold models, which include piecewise linear models; fractional autoregressive models (FAR); bilinear models (BL); non-linear moving average models (NMA); and so on. Two special cases of threshold models are: (1) ARMA models with periodic coefficients (PARMA), which were studied by Gladyshev (1961), Jones et al. (1967) and Troutman (1979); Tesfaye et al. (2006) utilized these models for modeling river flow, which shows strong seasonality similar to weather processes such as temperature. (2) Piecewise polynomial models, which include polynomials of the previous lags; for example, for a time series x_t we can consider x_t = f(x_{t−1}) with f(x_{t−1}) = a x_{t−1}(1 − x_{t−1}) for some constant a. In order to achieve stability of the time series, Tong (1990) suggests hard and soft censoring. In hard censoring the idea is to modify f to be zero outside an interval. We use both (1) and (2) in our models suggested below. We extend (1) in the sense that we allow the variance of the moving average component of our model not only to change with season but also to depend on the previous day, in a specific form appropriate for the temperature series. We extend (2) in the sense that we also allow interactions of previous lags and include terms such as x_{t−1} x_{t−2}. In order to check the stability of the series, we generate long-term future series from our model and inspect the simulated chain.
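A small simulation illustrates the hard-censoring idea together with the stability check by long-run simulation. This is a sketch under our own choices of a, noise level and censoring interval (none of these values are from the paper):

```python
import random

def simulate_censored_ar(a=3.2, sigma=0.1, n=10000, lo=-50.0, hi=50.0, seed=0):
    """Simulate x_t = f(x_{t-1}) + eps_t with f(x) = a*x*(1-x),
    hard-censored: f is set to zero when x_{t-1} falls outside [lo, hi],
    so the chain cannot run off to infinity."""
    rng = random.Random(seed)
    x, path = 0.5, []
    for _ in range(n):
        f = a * x * (1.0 - x) if lo <= x <= hi else 0.0
        x = f + rng.gauss(0.0, sigma)
        path.append(x)
    return path

# Stability check: simulate a long horizon and verify the chain stays bounded.
path = simulate_censored_ar()
```

Without the censoring, the quadratic map would amplify any excursion outside the unit interval; with it, an extreme value is followed by a reset to pure noise on the next step.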
Considering AR processes with long-term lags (for example up to a year) cannot be done by simply adding many lags, because of the over-fitting issue. Therefore we consider long-term averages of the past to prevent over-fitting. This method was proposed for the binary precipitation occurrence process in Hosseini et al. (2011). We also show that considering non-linear terms and allowing the AR coefficients to vary with season can improve the fits considerably under the AIC criterion. Moreover, we show both in the exploratory analysis and in the model selection section that the volatility should be allowed to change with season.

2 Data and statistical models

The data in this study are daily minimum temperature values collected at Rafsanjan
weather station from 1992 to 2010. In order to model frost occurrences, we introduce
statistical models for minimum daily temperature in Rafsanjan. Several features of
the temperature process should be considered in modeling: (1) seasonal trends for the
mean over time; (2) short-term autocorrelation over time; (3) long-term autocorrelation
over time; (4) seasonal variability in the volatility.
Let {Y (t)}, t = 1, 2, . . . , T denote the daily minimum temperature process in
centigrade, where t denotes the day starting from March 1st, 1992 to December 28th,
2010. Below we also use the notation Y_t to denote Y(t) in order to save space in longer expressions. Here we consider non-linear time-varying autoregressive moving average models with exogenous variables, which we formally define below.
Definition 1 (NTARMAX) Suppose {Y(t)}, t = 0, 1, 2, . . . is a discrete-time time series. Also suppose exogenous covariates {X_1(t), . . . , X_k(t)}, t = 1, 2, . . . are given.


Then Y(t) is called a "non-linear time-varying autoregressive moving average model with exogenous variables" (NTARMAX) of order p, q if

Y(t) = g_t( Y(t−1), . . . , Y(t−p), ε(t) + Σ_{i=1}^{q} b_i ε(t−i), X_1(t), . . . , X_k(t) ),

where g_t : R^{p+k+1} → R is a measurable function and we assume ε(t) is normally distributed with mean zero and standard deviation √Var(ε(t)) = σ(t, Y(t−1), . . . , Y(t−p)). Moreover we assume the standardized versions ε(t)/√Var(ε(t)) are independent of each other and of all other processes.

Note that we have allowed g_t to vary in time; this is a flexibility shown to be useful for our data using exploratory analysis and model selection criteria (such as AIC). Moreover we allow the error ε(t) to vary with time and depend on the previous lags in a particular form, which we show to be a property of the minimum temperature process in Rafsanjan.

In order to be able to fit such models to the data, one needs to specify all the components of the model, including g_t and the distribution of the error terms ε(t), t = 1, 2, . . .. In this paper we consider a special class of NTARMAX models: the ones for which g_t has a polynomial form of degree at most d in (Y_{t−1}, . . . , Y_{t−r}) and is linear in the ε(t)s:

Y_t = Σ_{0 ≤ i_1 + i_2 + · · · + i_r ≤ d} a_{i_1, i_2, . . . , i_r}(t) Y_{t−1}^{i_1} · · · Y_{t−r}^{i_r} + ε(t) + Σ_{i=1}^{q} b_i ε(t−i).

Moreover we assume that b_i = 0 and confirm this assumption is reasonable after fitting the models and inspecting the autocorrelation function of the residuals. Also we do not have access to any useful exogenous variables for these data.
As an example of the models utilized in this work, consider

Y_t = a_0(t) + Σ_{i=1}^{r} a_i(t) Y_{t−i} + ε(t),

which is a linear time-varying autoregressive model. A more complex 2nd-order model is given by

Y_t = a_0(t) + a_1(t) Y_{t−1} + Σ_{j=2}^{d} a_1^j Y_{t−1}^j + Σ_{j=1}^{d} a_2^j Y_{t−2}^j + a_{11} Y_{t−1} Y_{t−2} + a_{21} Y_{t−1}^2 Y_{t−2} + a_{12} Y_{t−1} Y_{t−2}^2 + ε(t),

where d is the bound on the degree of Y_{t−1} and Y_{t−2} in the model (e.g. d = 3). In the above example, we have only allowed the intercept a_0(t) and the coefficient of Y_{t−1} to vary with time and assumed the other coefficients to be fixed. Moreover we have


assumed non-linearity by considering various powers of Y_{t−1} and Y_{t−2}. There are still components of the model to be specified. Seasonal trends (with a period of one year) are the dominant varying factor and therefore we let

a_0(t) = α_0 + Σ_{j=1}^{k} { α_j cos(jωt) + β_j sin(jωt) },

which is a Fourier series with ω = 2π/365 (or ω = 2π/366 if t is in a leap year). Similarly we can model a_1(t) using a Fourier series. In fact the models considered in this paper also allow the higher-order lag coefficients (a_2(t), a_3(t), . . .) and interactions (e.g. a_{12}(t)) to depend on time, a flexibility that is often not considered; our exploratory analysis, model selection and model assessment indicated that it is a useful extension.
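Since a_0(t) is linear in the Fourier terms, fitting it reduces to ordinary regression once the cosine/sine columns of the design matrix are built. A minimal sketch (the function name is ours):

```python
import math

def fourier_terms(t, k, period=365.0):
    """Seasonal design-matrix columns at day t:
    [cos(w t), sin(w t), cos(2 w t), sin(2 w t), ...], with w = 2*pi/period."""
    w = 2.0 * math.pi / period
    cols = []
    for j in range(1, k + 1):
        cols.append(math.cos(j * w * t))
        cols.append(math.sin(j * w * t))
    return cols

# a_0(t) is then alpha_0 plus the dot product of the fitted (alpha_j, beta_j)
# coefficients with this row; a_1(t) etc. are handled the same way.
row = fourier_terms(90, k=2)
```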
Considering autoregressive models with a very large number of lags can result in over-fitting, while, for example, the weather patterns over the 20 days or two months prior to day t might be useful in predicting day t. To remedy this problem we define the long-term average processes L_i(t) = Σ_{j=1}^{i} Y(t−j)/i. For example, L_31 takes the average of the minimum temperature over the month prior to day t. This provides a summary of the weather patterns over the past month, and L_31 can be used as a covariate, thus creating a chain with long-term lags without the over-fitting problem, in contrast to including all the lags Y(t−1), . . . , Y(t−31).
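Computed directly, the long-term average covariate is just a trailing mean of the i days before day t; a sketch with 0-indexed data (the function name is ours):

```python
def long_term_average(y, t, i):
    """L_i(t) = (Y(t-1) + ... + Y(t-i)) / i, the mean of the i days
    preceding day t; `y` is a 0-indexed sequence, so day t is y[t]
    and the function requires t >= i."""
    window = y[t - i:t]
    return sum(window) / i
```

Note that L_1(t) reduces to the ordinary lag-one value Y(t−1).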
It still remains to specify the distribution of the independent errors ε(t), which we assume to be N(0, σ(t)^2). We allow the error distribution to change over time by letting the volatility σ(t) vary with season (as suggested by the exploratory analysis) and also depend on the previous lags, in an intuitively motivated form described later.

2.1 Exploratory analysis

This section provides an exploratory analysis of the minimum temperature process, which can guide us in selecting a realistic model for the process and therefore better assessing the risk of the events of interest. Exploratory analysis methods for investigating characteristics of time series are discussed widely in the literature. For example, the classic text Brockwell et al. (1991) includes many such techniques, and Wang et al. (2006) is a more recent work which uses global characteristics of time series for time series clustering. Here we look at appropriate characteristics of the minimum temperature, keeping in mind well-known properties of the series such as seasonality. For example, we build the autocorrelation of the series for various seasons separately and also plot the volatility of the series as a function of the day of the year (season).

We assess the assumption of the normality of the errors introduced in the models. In order to assess this assumption at an exploratory stage, sometimes the histogram of all the available data is utilized. If we opt for this method, a large deviation from normality is observed. However, this method is flawed, since we are combining the distributions of all Y(t) for t = 1, 2, . . . to represent a shifted version (except for the mean part) of the distribution of ε(t). The issue is that Y(t) includes a large seasonal mean


Fig. 1 Normal qq-plots of the daily temperature aggregated over two consecutive months, from January (denoted by 1) to December (denoted by 12)

and ε(t) itself may also include seasonal behavior. To remedy this issue we can group the data for each day of the year, from January 1st to December 31st, thus creating 365 histograms. In this data set we only have 19 years of data, so each such histogram would include only about 19 data points; therefore we use the following approach: group the data into bimonthly partitions (Jan–Feb, Mar–Apr, . . ., Nov–Dec) and create the histogram for the data aggregated across the years for each two-month period. The qq-plots for each group are depicted in Fig. 1, which indicate that while the normality assumption is reasonable, some small deviations from normality are seen in the tails. This deviation is small in contrast to other features of the model, such as the change in the volatility across the seasons as we show later, and therefore in this research we focus on those more significant features. Moreover, in the models the residuals are calculated after removing the AR predictor effects and therefore the qq-plots are slightly different. We make qq-plots for our chosen model in the model assessment section to check the normality of the errors in that specific model.
Figure 2 depicts the daily mean and standard deviation of the temperature process.
For each day of the year (for example Feb 1st) the mean and standard deviation (sd)
of the day (Feb 1st) across all available years is presented. Smooth versions of the
same curves are given in the bottom panel to better detect the seasonal patterns of both


Fig. 2 Top row Seasonal mean and standard deviation plots for minimum temperature. Bottom row
smoothed version of the top plots obtained by a moving average of length 7 days (weekly)
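The curves in Fig. 2 can be reproduced from day-of-year-aligned yearly series; a sketch using only the standard library (function names are ours, and the moving average here is a trailing window, while the exact alignment used for the figure is not specified in the paper):

```python
import statistics

def day_of_year_stats(years):
    """`years`: list of equal-length yearly series aligned by day of year.
    Returns (means, sds): the mean and standard deviation for each day
    of the year, computed across the available years (top row of Fig. 2)."""
    per_day = list(zip(*years))
    means = [statistics.mean(d) for d in per_day]
    sds = [statistics.stdev(d) for d in per_day]
    return means, sds

def moving_average(x, w=7):
    """Smooth a curve with a moving average of window length w
    (bottom row of Fig. 2 uses w = 7)."""
    return [sum(x[i:i + w]) / w for i in range(len(x) - w + 1)]
```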

the mean and sd. These smooth versions are created by applying a moving average with a window of length 7 days. The figures suggest a strong seasonal pattern in both the mean and the volatility (standard deviation at each time) of the process. Figure 3 (left and center panels) depicts the autocorrelation and partial autocorrelation plots for the minimum temperature process for different seasons. We observe that the partial autocorrelation seems to vary by season: summer has the strongest autocorrelation; winter has the weakest; spring and fall have similar values for the small lags. Figure 3 (right panel) depicts the autocorrelation in the data with respect to small lags (1–4). We have separated the year into 6 periods, each consisting of 2 months: the Jan–Feb period is denoted by 1; March–April by 2; May–June by 3; July–August by 4; September–October by 5; November–December by 6. The figure suggests that periods 3, 4, 5 (summer and fall) have stronger autocorrelation compared to periods 1, 2, 6 (winter and early spring). The autocorrelation is especially strong from July to October.

Figure 4 (left panel) depicts the annual average of minimum temperature from 1992 to 2010. We observe in the figure that the minimum temperature process shows some dependence over time even after averaging out the seasonal patterns and the short-time



Fig. 3 Left and center panels depict the autocorrelation plots for winter (bold line), spring (dashed line), fall (dotted line) and summer (dashed-dotted line): summer has the strongest autocorrelation; winter has the weakest autocorrelation; spring and fall have similar values for the small lags. Right panel depicts the autocorrelation for bimonthly data: the Jan–Feb period is denoted by 1; March–April by 2; May–June by 3; July–August by 4; September–October by 5; November–December by 6


Fig. 4 Left panel Annual average of minimum temperature from 1992 to 2010. Center and right panels
Autocorrelation and partial autocorrelation for the annual average of minimum temperature from 1992 to
2010

dependence. To investigate this further, Fig. 4 (center and right panels) depicts the autocorrelation and partial autocorrelation of the annual averages, which indicate that the process has lag-one dependence. Our strategy to model this long-term dependence is to consider long-term averages of the past process on top of modeling the short-term dependence using the AR structure.

2.2 Estimation

This subsection discusses the fitting of the models. Estimation of time series regression models is discussed extensively in the literature for a range of models. For example, Kedem et al. (2002) discuss the estimation for a class of generalized linear time series

models, and Tesfaye et al. (2006) utilize the "innovations" algorithm developed in Anderson et al. (1999) to apply PARMA models to river flow modeling for the Fraser River in British Columbia. Since none of these methods are directly applicable to our setting, in which both the autoregressive coefficients and the volatility vary with time (in a non-linear fashion), below we discuss a simple iterative estimation method which works very efficiently both in simulations and in our data analysis.
Again let Y_t, t = 1, 2, . . . , n be the daily temperature data, and denote the column vector of the whole series by Y. Also let X_t = (X_{t1}, . . . , X_{tp}) denote the available covariates at time t, and consider the n × p "design matrix" X with rows X_t, t = 1, . . . , n. Moreover, denote the error at each time by ε(t), t = 1, 2, . . . , n and the column vector of all the errors by ε. Then we can write down the model in matrix form as

Y = Xβ + ε,

where β is a vector of dimension p × 1 which includes the regression parameters. We assume ε is multivariate Gaussian and denote its variance–covariance matrix by Ω.
In this paper we allow the volatility to vary with season, as suggested by the exploratory analysis (Fig. 2). We assume the errors ε(t), t = 1, 2, . . . , n are Gaussian with mean zero and standard deviation σ(t), where

σ(t) = exp{τ_0 + τ_1 sin(2ωt) + τ_2 cos(2ωt)},

and τ = (τ_0, τ_1, τ_2) are regression parameters for the volatility. Later we will extend this to let σ(t) also depend on the past. We denote the variance–covariance matrix corresponding to the parameter τ by Ω(τ). Therefore our model in total includes the parameters β and τ.
The approach we take here for estimating the parameters is maximum likelihood (ML). Because τ is unknown, the ML estimation cannot be done in closed form. However, if we assume Ω is given, then β̂ = (X^T Ω^{-1} X)^{-1} X^T Ω^{-1} Y maximizes the likelihood with respect to β. Therefore we break the maximization into two parts in the following iterative algorithm, which is a standard technique in regression analysis.

Estimation algorithm
1. Initialize β̂ = (X^T X)^{-1} X^T Y.
2. Repeat Steps 3 and 4 until convergence:
3. Plug β̂ into the log-likelihood l(Y; Ω(τ), β) and let τ̂ = argmax_τ l(Y; Ω(τ), β̂).
4. Update β̂ = (X^T Ω(τ̂)^{-1} X)^{-1} X^T Ω(τ̂)^{-1} Y.
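The algorithm can be sketched with a diagonal Ω, writing σ(t) = exp{Z_t τ} for a volatility design matrix Z (columns such as 1, sin(2ωt), cos(2ωt)). This is our illustrative implementation, not the authors' code, and it assumes numpy and scipy are available:

```python
import numpy as np
from scipy.optimize import minimize

def fit_varying_volatility(X, y, Z, n_iter=10):
    """Iterative ML sketch for Y = X beta + eps, eps(t) ~ N(0, sigma(t)^2),
    with log sigma(t) = Z[t] @ tau.  Alternates the closed-form GLS update
    for beta with a numerical maximization of the likelihood over tau."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]         # Step 1: OLS start
    tau = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        e = y - X @ beta                                # current residuals
        def negloglik(t):
            log_sig = Z @ t
            # -log L up to constants: sum(log sigma + e^2 / (2 sigma^2))
            return np.sum(log_sig + 0.5 * e**2 * np.exp(-2.0 * log_sig))
        tau = minimize(negloglik, tau).x                # Step 3
        w = np.exp(-2.0 * Z @ tau)                      # diagonal of Omega^{-1}
        XtW = X.T * w
        beta = np.linalg.solve(XtW @ X, XtW @ y)        # Step 4: GLS update
    return beta, tau
```

Standard errors for β̂ then follow from cov(β̂) = (X^T Ω^{-1} X)^{-1}, i.e. `np.linalg.inv(XtW @ X)` at convergence.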

In our simulation analysis, this maximization was done very quickly (a few seconds) and achieved very close estimates for data sets of size 10,000, with 20 predictors for the mean and


5 predictors for the volatility. In order to get standard errors for the mean parameters
(β), the estimated Ω can be plugged into

cov(β̂) = (X^T Ω^{-1} X)^{-1}.

A Bayesian approach can also be implemented, turning the above algorithm into a Gibbs sampler by replacing Step 3 with a Metropolis–Hastings step, as shown for spatial processes in Finley et al. (2008). However, such an implementation is considerably slower computationally and needs prior specification; therefore we use the ML approach, which works well for the models developed in this paper.

3 Statistical model selection

In the above we introduced several autoregressive models with: (1) various lags; (2) various seasonal complexity (number of Fourier terms); (3) various long-term trends. Therefore we need some criterion to select an optimal model. The problem of model selection is an important one in statistical theory and application. Various criteria are suggested in the literature, for example AIC in Akaike (1974), BIC in Schwarz (1978) and AICc in Brockwell et al. (1991). Denote the likelihood of the data by L, the number of covariates by p and the sample size by n. Then

AIC = 2p − 2 log(L),  AICc = AIC + 2p(p + 1)/(n − p − 1),  BIC = p log(n) − 2 log(L).

Since n in our data is large compared to p, AIC and AICc are very close and it is sufficient to consider only AIC.
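These criteria are direct to compute from a fitted model's log-likelihood; a small sketch (function names are ours), with a numerical illustration of why AIC and AICc nearly coincide when n is much larger than p:

```python
import math

def aic(loglik, p):
    """AIC = 2p - 2 log(L); `loglik` is log(L)."""
    return 2 * p - 2 * loglik

def aicc(loglik, p, n):
    """AICc adds a small-sample correction term to AIC."""
    return aic(loglik, p) + 2 * p * (p + 1) / (n - p - 1)

def bic(loglik, p, n):
    """BIC = p log(n) - 2 log(L)."""
    return p * math.log(n) - 2 * loglik

# With n in the thousands and p a few dozen (e.g. roughly 19 years of daily
# data, n ~ 6900, p = 20), the correction 2p(p+1)/(n-p-1) is about 0.12 AIC
# units, which is negligible next to the AIC differences in Tables 1 and 2.
```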
Various related time series studies, such as Hosseini et al. (2012a, b), have shown that BIC tends to underestimate the complexity of the models. Therefore here we use AIC primarily for model selection but report the BIC for the top models. In the previous sections, we allowed the volatility to vary with season. However, fitting several such models and calculating AIC is not feasible in the model selection phase. In contrast, fitting models with fixed standard deviation can be done quickly and in closed form. Therefore, for the model selection stage, we choose the covariates using models with constant volatility. Even with that assumption, fitting all combinations of models is not feasible due to the high number of predictors. For example, if there are 50 predictors available to us, then 2^50 ≈ 10^15 models would need to be fitted! Here we partition our search into a few steps by searching for a structure for the important components (seasonal, first-order dependence, seasonal dependence, etc.) of the model in sequence. The model selection algorithm is explained more precisely below.
In the following, we use these abbreviations for the variables: s_i := sin(iωt); c_i := cos(iωt); and w_i := Y(t − i). The model selection procedure diagram is presented in Fig. 5, and more details follow.
Model search algorithm [Based on AIC]
Step 0: Instead of fitting the computationally intensive original model, assume σ 2 (t)
is fixed in Steps 1–6 and pick the predictors for the mean.


[Diagram: 0: assume constant volatility; 1: seasonal AR(0); 2: add AR; 3: add long-memory; 4: add non-linear; 5: merge with (1); 6: add seasonal AR; 7: add varying volatility]

Fig. 5 Model selection algorithm diagram

Step 1: Seasonal component a_0(t): fit models for all possible combinations of

cos(jωt), sin(jωt), j = 1, . . . , 8;

pick the one with the smallest AIC; call it opt[1]. The optimal model with this criterion and its AIC, BIC are given in Table 1.
Step 2: Linear autoregressive: consider all the models that include the opt[1] variables and all possible combinations from w_1, w_2, . . . , w_10; pick the top model by AIC; call it opt[2].
Step 3: Long-term lag autoregressive: consider all the models that include the opt[1] variables and all possible combinations from

L_1, L_2, L_3, L_4, L_5, L_10, L_15, L_20, L_30, L_60, L_120, L_180, L_360, L_720;

pick the top model by AIC; call it opt[3].
Step 4: Non-linear: add all combinations from the predictors

w_1, w_1^2, w_1^3, w_2, w_2^2, w_2^3, w_1 w_2, w_1^2 w_2, w_1 w_2^2

to opt[1]; pick the top model by AIC; call it opt[4].
Step 5: Consider all the models with the opt[1] predictors and all combinations of the predictors present in opt[2], opt[3], opt[4]; choose the optimal model using AIC; call it opt[5].


Table 1 The AIC/BIC value for the optimal model (based on AIC) in each step

Step [1], seas. AR(0). Predictors: s_1, c_1, s_2, c_2, s_3, c_3, s_4, c_4, s_5, c_6, s_7, c_8. AIC 36,405; BIC 36,500.
Step [2], seas. AR(0) + AR. Predictors: seas. AR(0) + w_1, w_4, w_9. AIC 32,936; BIC 33,052.
Step [3], seas. AR(0) + long-term lag. Predictors: seas. AR(0) + L_1, L_2, L_4, L_20, L_360. AIC 32,927; BIC 33,056.
Step [4], seas. AR(0) + non-linear. Predictors: seas. AR(0) + w_1, w_2, w_1^2, w_1^3, w_1 w_2, w_1^2 w_2, w_1 w_2^2. AIC 32,797; BIC 32,940.
Step [5], seas. AR(0) + long-term lag + non-linear. Predictors: seas. AR(0) + w_1, w_2, w_1^2, w_1^3, w_1 w_2, w_1^2 w_2, w_1 w_2^2 + L_4, L_20, L_360. AIC 32,774; BIC 32,938.
Step [6], seas. AR(0) + non-linear + seas. lags. Predictors: seas. AR(0) + w_1, w_2, w_1^2, w_1^3, w_1 w_2, w_1^2 w_2, w_1 w_2^2, w_1 s_1, w_1 c_1, w_1 s_2, w_2 c_1, w_2 s_2, w_2 c_2, L_4, L_20, L_360. AIC 32,747; BIC 32,931.

The optimal model (smallest AIC and BIC values) is that of Step [6]

Step 6: Time-varying, non-linear, long-term lag AR: to opt[1] add all combinations from the predictors

w_1 s_1, w_1 c_1, w_1 s_2, w_1 c_2, w_2 s_1, w_2 c_1, w_2 s_2, w_2 c_2

and opt[5]; pick the top model by AIC; call it opt[6]. The result is given in Table 1. To opt[6] we also further added seasonally varying terms corresponding to w_1^2, w_2^2, w_1 w_2, w_1^2 w_2, w_1 w_2^2 and used AIC to check for any improvement over opt[6]. However, no further reduction in the AIC value was observed and therefore other seasonally-varying lags or interaction terms were not considered.
Step 7: Volatility model: fix the best covariates found in Step 6; use them in models with varying volatility; pick the model for the volatility using the AIC criterion.
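Each step of the search above has the same shape: keep the predictors chosen so far, enumerate every subset of the step's candidate pool, and keep the AIC-minimizing set. A skeleton of one such step (the `fit_aic` callback, which would fit the constant-volatility model and return its AIC, is left abstract; names are ours):

```python
from itertools import combinations

def best_subset(base, pool, fit_aic):
    """One step of the stepwise search: fit `base` plus every subset of
    `pool` (constant-volatility fits are cheap and closed-form) and return
    the predictor list with the smallest AIC together with that AIC.
    `fit_aic` maps a predictor list to its AIC."""
    best, best_aic = list(base), fit_aic(list(base))
    for r in range(1, len(pool) + 1):
        for extra in combinations(pool, r):
            cand = list(base) + list(extra)
            a = fit_aic(cand)
            if a < best_aic:
                best, best_aic = cand, a
    return best, best_aic

# Usage pattern: opt1 = best_subset([], seasonal_pool, fit_aic)[0],
# then opt2 = best_subset(opt1, ar_pool, fit_aic)[0], and so on.
```

Splitting the search this way replaces one infeasible 2^50-model sweep with a handful of sweeps over pools of at most ~14 candidates each.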
We compared the models with fixed volatility using AIC in Table 1. In the table we observe that adding each of AR, non-linear AR, and long-term lag AR to AR(0) improves the AIC significantly. Then in Step 5, where we merge the predictors found in Steps 1–4, we see a further improvement in AIC. In Step 6 we allow for time-varying AR and again observe an improvement in AIC. This model includes predictors from each of AR, non-linear AR, long-term lag AR and time-varying AR.

In Step 7, we fix the optimal predictors for the mean of the time series and search for an appropriate structure for the volatility. We model the volatility σ(t) using various predictors. As we saw in the exploratory analysis, the volatility seems to vary with season and therefore we consider seasonal patterns of various complexity, as shown in Table 2, which shows that the model with s_1, c_1, s_2, c_2 performs best among the ones considered here. We also consider a temporal dependence of the volatility on the past. This is done by adding an AR structure with w_1, w_2, w_1^2, also shown in Table 2. We observe that the seasonal-only models perform better than the autoregressive models for the volatility.


Table 2 Comparison of the models for the volatility by AIC and BIC

Seasonal:
s_1, c_1: AIC 31,598; BIC 31,796
s_1, c_1, s_2, c_2: AIC 31,583; BIC 31,795
s_1, c_1, s_2, c_2, s_3, c_3: AIC 31,810; BIC 32,035

Markov:
w_1: AIC 32,075; BIC 32,267
w_1, w_1^2: AIC 32,060; BIC 32,259
w_1, w_2: AIC 31,986; BIC 32,184
w_1, w_2, w_1^2: AIC 31,967; BIC 32,172

Deviation of day before from seasonal norm:
d: AIC 32,730; BIC 32,921
|d|: AIC 32,564; BIC 32,755
d, |d|: AIC 32,555; BIC 32,753

Combination:
seas. (s_1, c_1, s_2, c_2) + w_1: AIC 31,653; BIC 31,872
seas. (s_1, c_1, s_2, c_2) + w_1, w_1^2: AIC 31,578; BIC 31,803
seas. (s_1, c_1, s_2, c_2) + w_1, w_2: AIC 31,614; BIC 31,839
seas. (s_1, c_1, s_2, c_2) + w_1, w_2, w_1^2: AIC 32,304; BIC 32,536
seas. (s_1, c_1, s_2, c_2) + d: AIC 31,574; BIC 31,792
seas. (s_1, c_1, s_2, c_2) + |d|: AIC 31,569; BIC 31,788
seas. (s_1, c_1, s_2, c_2) + d, |d|: AIC 31,593; BIC 31,819

The best AIC and BIC values within each class of models (Seasonal, Markov, etc.) mark that class's optimal model

Then we consider models motivated by the following intuition: if the current weather deviates from the "normal seasonal patterns", high volatility in the weather is expected. In order to obtain a value for the normal seasonal patterns, we use the model fitted in Step 1 and denote its output at time t by n(t). Define the "norm-deviation" variable on day t by

d(t) = Y(t) − n(t).

Then we use the predictors d(t − 1) and |d(t − 1)| in the modeling of the volatility; the results are given in Table 2. Finally, we consider models combining the seasonal terms with each of the AR or norm-deviation predictors. The final optimal model is the one with seasonal patterns and only |d| as the predictor of the volatility.
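The norm-deviation predictors are cheap to construct once the Step 1 seasonal fit n(t) is available; a sketch (names are ours, series 0-indexed):

```python
def norm_deviation(y, seasonal_norm):
    """d(t) = Y(t) - n(t): deviation of the observed minimum temperature
    from the fitted seasonal norm n(t), the Step 1 model's output."""
    return [yt - nt for yt, nt in zip(y, seasonal_norm)]

def volatility_predictors(d):
    """For day t (t >= 1) the volatility model uses d(t-1) and |d(t-1)|."""
    return [(d[t - 1], abs(d[t - 1])) for t in range(1, len(d))]
```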

3.1 Model assessment

This subsection investigates the properties of the optimal model found above, to confirm the validity of the assumptions and the ability of the model to capture the properties of the process found in the exploratory analysis.

Figure 6, left panel, depicts the estimated coefficient of the first-order autoregressive term Y(t − 1) in the optimal model. The figure suggests that the strongest



Fig. 6 Left Panel The seasonal variation of the autoregressive coefficient of Y (t − 1). Right Panel The
seasonal variation of the volatility in the exploratory analysis (circles), the seasonal volatility calculated using
the residuals of the models (grey) and the volatility during 1993 from the model with (s1 , c1 , s2 , c2 , |d|) as
predictors

coefficient occurs in the July to October period. This is consistent with our findings in
Fig. 3.
Figure 6, right panel, depicts the volatility of the optimal model, with
(s1, c1, s2, c2, |d|) as predictors of the volatility, for days in the year 1993
(black curve). We have also included the seasonal volatility calculated in the
exploratory analysis (circles) and a seasonal volatility calculated from the errors
of the model for the whole series 1993–2010. To calculate the latter, a standard
deviation is obtained by aggregating the model residuals for each day of the year
(1–366) across the years. The wiggly pattern in the model volatility for 1993
(black curve) is due to the |d| term, and we observe that its effect is rather small
compared to the seasonal patterns. Moreover, the model volatility (black curve) is
in general smaller than the seasonal volatility found in the exploratory analysis
(circles) during the year. This is because the model volatility is the volatility of
the error after removing the temporal autoregressive and long-term lag predictors,
in contrast to the exploratory volatility, which only accounts for the seasonal
effects on the mean of the series by calculating the standard deviation for
each day of the year separately.
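The per-day-of-year residual standard deviation described above can be sketched as follows (a simplified illustration in our own notation; the paper's actual computation may differ, e.g. in the handling of leap days):

```python
import numpy as np

def seasonal_volatility(day_of_year, residuals):
    """Sample standard deviation of model residuals aggregated by
    day of year (1-366) across the years, as described in the text."""
    day_of_year = np.asarray(day_of_year)
    residuals = np.asarray(residuals, dtype=float)
    vol = {}
    for doy in np.unique(day_of_year):
        r = residuals[day_of_year == doy]
        if len(r) > 1:                 # need at least two years for a std dev
            vol[doy] = r.std(ddof=1)   # sample std across the years
    return vol
```

Each key of the returned dictionary is a day of year and each value is the standard deviation of the residuals for that day pooled over all years in the record.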
Figure 7 shows the qq-plots of the errors aggregated for each two months (Jan–Feb,
Mar–Apr, . . ., Nov–Dec) and shows that the assumption of normality is quite reasonable.
Note also that the errors are closer to normal than the original values of
the series (Fig. 1), because the models have removed the autoregressive effects.
Finally, Fig. 8 shows the autocorrelation in the residuals of the model; it indicates
negligible autocorrelation left in the series and confirms the assumption of
independent errors.

4 Application to agrorisk management

In this section we discuss the application of the models developed in this paper to
agroclimate risk management. We consider the following event, which is associated
with a high risk of pistachio damage due to frost in Rafsanjan:


[Fig. 7 appears here: six normal qq-plots (sample quantiles against theoretical quantiles), one per two-month period; the panel labeled “months 11 and 12” covers Nov–Dec.]
Fig. 7 The qq-plots for the residuals aggregated bimonthly

[Fig. 8 appears here: autocorrelation (AC, 0.0–1.0) of the model residuals against lags 0–30.]
Fig. 8 The autocorrelation of the residuals of the model


Event A: During Mar–Apr 2015 the minimum temperature stays above 5 (deg C) for at
least 3 consecutive days and falls below zero afterwards.
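A detector for Event A on a simulated Mar–Apr series could be sketched as follows (our own illustrative code, not from the paper):

```python
def event_A(temps):
    """Return True if the series contains at least 3 consecutive days
    above 5 deg C followed later by a day below zero.
    temps : daily minimum temperatures for the Mar-Apr window."""
    run = 0                   # length of the current warm spell
    warm_spell_seen = False   # have we already seen a 3-day warm spell?
    for t in temps:
        if t > 5:
            run += 1
            if run >= 3:
                warm_spell_seen = True
        else:
            run = 0
        if warm_spell_seen and t < 0:
            return True       # frost after a qualifying warm spell
    return False
```

Applied to each simulated Mar–Apr window, the proportion of windows where this returns True estimates the probability of Event A.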
This event was defined as a result of our interviews with farmers, agriculture engineers
and experts in the region. Our objective in this work is to estimate the probability of this
event in a coming year, for example 2015, given the data at hand. Moreover,
we are interested in finding a confidence interval for this estimated probability. Such
probabilities can then be used by private investors in their investment strategies, by the
government in developing insurance programs, and by the private or public sector to
develop and price weather derivatives appropriate for hedging the risk of investing in
this industry.
The approach we take to calculate the probability of such complex hazard events
is to simulate future temperature series from our fitted model and calculate the
proportion of simulations in which the event of interest occurs. We start by
simulating from the optimal model found in the previous section. In that model we have

Y(t) = f(t) + ε(t),

where f(t) is a linear combination of: seasonal terms (such as sin(ωt), cos(ωt)); previous
lags and their powers or long-term averages (such as Y(t − 1), Y(t − 1)², L365(t));
and time-varying autoregressive terms (such as sin(ωt)Y(t − 1)). Moreover, the ε(t) are
independent and normally distributed with mean zero, and the log of their standard
deviation is a linear combination of: seasonal terms, sin(ωt), cos(ωt), sin(2ωt), cos(2ωt);
and the deviation-from-normal variable, |d(t)|, defined before.
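A single forward-simulation step under this model structure might look as follows; the coefficient layout here is a simplified stand-in for the fitted model, which contains more seasonal and lag terms:

```python
import numpy as np

def simulate_day(t, y_prev, n_t, beta_mean, beta_vol, rng,
                 omega=2 * np.pi / 365.25):
    """One forward-simulation step (illustrative coefficient layout).
    beta_mean : coefficients for (1, sin, cos, Y(t-1)) in the mean f(t)
    beta_vol  : coefficients for (1, sin, cos, |d(t)|) on log sigma(t)
    n_t       : seasonal norm n(t) used in d(t) = Y(t-1) - n(t)."""
    s, c = np.sin(omega * t), np.cos(omega * t)
    f_t = beta_mean @ np.array([1.0, s, c, y_prev])        # mean part f(t)
    abs_d = abs(y_prev - n_t)                              # |d(t)|
    log_sigma = beta_vol @ np.array([1.0, s, c, abs_d])    # log volatility
    return f_t + rng.normal(0.0, np.exp(log_sigma))        # Y(t) = f(t) + eps(t)
```

Iterating this step day by day, feeding each simulated value back in as `y_prev`, produces one simulated future series.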
Tong (1990) discusses that, due to the polynomial terms in the mean
part, the model may be unstable in the sense that the value of Y(t) may go to infinity.
The solution offered by Tong (1990) is to use hard or soft censoring of f(t) by
modifying it to f(t)1(−a,a)(f(t)), for some positive number a, where 1(−a,a)(f(t))
is an indicator function equal to 1 if f(t) ∈ (−a, a) and zero otherwise.
The issue is magnified in our model because we assume σ(t) also varies with time,
in particular with |d(t)| = |Y(t − 1) − n(t)|, which depends on Y(t − 1) and will
result in more instability for large magnitudes of Y(t − 1). We can easily check
whether instability occurs in our model by simulating several times from the model
over a long future period. While it was unlikely to see any instability in a small number
of simulations only a few years ahead, instability did occur in simulations over
long periods (for example 50 years) with a high number of simulations (for example
100). The difficulty with the suggestion offered by Tong (1990) is
finding an appropriate value for a. To remedy this problem we offer
two more practical solutions: (1) Find the minimum and maximum values observed in
the data set from 1982 to 2006 and denote them by m = −17 (C) and M = 29 (C)
respectively; re-assign any simulated value of Y(t − 1) less than m to m and any simulated
value larger than M to M. More formally, if the original simulated value at time t is
Y*(t), then let
Y(t) = M        if Y*(t) > M,
Y(t) = m        if Y*(t) < m,
Y(t) = Y*(t)    otherwise.
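The re-assignment rule above can be written as a small helper; the bounds −17 and 29 are the observed extremes quoted in the text:

```python
def clamp(y_star, m=-17.0, M=29.0):
    """Re-assign a simulated value outside the observed range [m, M]
    (minimum -17 C, maximum 29 C in the 1982-2006 data) to the
    nearest bound, as in solution (1)."""
    return max(m, min(M, y_star))
```

Applying `clamp` to each simulated value before feeding it back into the autoregression prevents the runaway behaviour described above.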


[Fig. 9 appears here: two panels of minimum temperature (C, about −10 to 30) against time; left panel spans 2008–2015, right panel 2008–2040.]

Fig. 9 Simulated series given in grey along with the real data in black. Left Panel from 2008 to 2011; Right
panel from 2008 to 2041

(2) Use the coefficients found in the model to calculate the volatility σ(t) in the
observed data and denote its maximum by σ̄ = 4.5; re-assign any volatility obtained
in the simulation that is larger than σ̄ to σ̄. In fact, it turned out that (1) on its own
resolves the issue: after applying (1), the volatility automatically never became larger
than σ̄.
Figure 9, left panel, depicts the simulated series from 2010, where the data end,
to 2015 and shows good agreement between the data series and the simulated series. The
right panel shows a simulated series up to the year 2041. Now, to calculate the probability
of any complex hazard event, including Event A, we can simulate several series into the
future and calculate the proportion of series for which the event occurs. Passing
this value to investors may seem sufficient, but statisticians would
recognize that this is somewhat dishonest, as we are not reporting any confidence for
our estimate of this probability, implicitly claiming total confidence in the value.
Below we describe methods to create such confidence intervals.
Here we describe a non-parametric bootstrap (BS) method to obtain confidence intervals
for the probability of Event A. If we denote the covariates for the mean part by
Xm(t) and the covariates for the volatility by Xv(t), the model can be expressed in the
form:

Y(t) = Xm(t)βm + ε(t),

where ε(t) ∼ N(0, σ²(t)) with log(σ(t)) = Xv(t)βv. Then we can do a non-
parametric bootstrap by sampling with replacement from (Y(t), Xm(t), Xv(t)) and
re-estimating the parameters. Note that even though we are working with time-series
data, such a bootstrap is valid since all the dependence over time is modeled through
(Xm(t), Xv(t)). This would not have been the case if, for example, the model
also included ε(t − 1):

Y(t) = Xm(t)βm + ε(t) + b1 ε(t − 1).


Table 3 (Left panel): The confidence interval for the probability of Event A. (Right panel): The confidence
interval for the model residual standard deviation

Method       Hazard Pr. (var. vol.)   Hazard Pr. (cons. vol.)   RSD (var. vol.)   RSD (fixed vol.)

Point est.   0.35                     0.18                      2.65              2.65
GLM          (0.29, 0.43)             (0.12, 0.23)              2.65              2.65
Nonpar. BS   (0.27, 0.49)             (0.12, 0.23)              (2.61, 2.75)      (2.59, 2.72)
Par. BS      (0.28, 0.41)             (0.12, 0.27)              (2.62, 2.71)      (2.62, 2.69)

We create 1,000 bootstrap data sets; using each data set we re-estimate the parameters
(βm, βv); using each parameter set we simulate 500 time series; we calculate the
proportion of times Event A occurs in the corresponding 500 simulations; and finally,
using the 1,000 proportions calculated, we obtain the 2.5 and 97.5 % quantiles to form a
confidence interval for the probability of Event A. We do this procedure for both cases
of time-varying volatility and fixed volatility, and the results are given in Table 3, first row.
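The nested bootstrap procedure can be sketched generically as follows; `fit`, `simulate` and `event` are placeholders for the model-fitting, series-simulation and event-detection steps described in the text:

```python
import numpy as np

def bootstrap_event_probability(data, fit, simulate, event, rng,
                                n_boot=1000, n_sim=500):
    """Nonparametric bootstrap CI for the probability of an event.
    fit(rows) -> params; simulate(params, rng) -> one future series;
    event(series) -> bool. All three are user-supplied callables."""
    props = []
    for _ in range(n_boot):
        # resample rows (Y(t), Xm(t), Xv(t)) with replacement
        idx = rng.integers(0, len(data), size=len(data))
        params = fit([data[i] for i in idx])
        # Monte Carlo estimate of the event probability for this fit
        hits = sum(event(simulate(params, rng)) for _ in range(n_sim))
        props.append(hits / n_sim)
    # 2.5 and 97.5 % quantiles of the bootstrap proportions
    return np.quantile(props, [0.025, 0.975])
```

With the paper's settings (n_boot=1000, n_sim=500) the returned pair is the percentile confidence interval reported in Table 3.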
The third row of Table 3 simply treats the obtained volatility parameters
as fixed and samples only the mean parameters, βm, using the formula:

cov(β̂) = (XᵀΩ⁻¹X)⁻¹,

and then proceeds as before.
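Sampling the mean parameters from N(β̂m, (XᵀΩ⁻¹X)⁻¹), with Ω diagonal in σ²(t) held fixed, might be sketched as follows (names and layout are illustrative):

```python
import numpy as np

def sample_mean_params(beta_hat, X, sigma, n_draws, rng):
    """Draw mean-parameter vectors from N(beta_hat, (X^T Omega^-1 X)^-1),
    treating the volatility parameters, and hence sigma(t), as fixed.
    X : design matrix of the mean part; sigma : per-observation sigma(t)."""
    omega_inv = 1.0 / np.asarray(sigma) ** 2              # diagonal of Omega^{-1}
    cov = np.linalg.inv(X.T @ (omega_inv[:, None] * X))   # (X^T Omega^-1 X)^-1
    return rng.multivariate_normal(beta_hat, cov, size=n_draws)
```

Each drawn parameter vector then plays the role of a bootstrap parameter set in the simulation step.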


In the fourth row of Table 3 we have performed a parametric bootstrap as follows:
we estimate the parameters from the original data; using the estimated
parameters we simulate 1,000 new series of the same length as the original data set;
from each such series we re-estimate the parameters; for each such parameter set we
simulate 500 time series and proceed as before.
The right panel of Table 3 shows the residual standard deviation (RSD) for the non-
parametric and parametric bootstrap methods. The RSD confidence intervals in both
cases are consistent with the model RSD and therefore show good stability of the
model. Other quantities can also be calculated from the bootstrap samples as a model
assessment technique.

5 Conclusions and discussion

This paper developed and compared several statistical models for minimum temperature
which can be used to estimate the probability of hazard events in the agriculture
industry. Assessing the probabilities of these events is useful in estimating the risk
of investing in this industry, from production to distribution and exporting. Here we
focused on minimum temperature and did not investigate other risk factors such as
extremely high temperature during summer, heavy rain during the flowering period, or slow
but long rain during the flowering time. In fact, in the spring of 2014 about 80 percent of
the pistachio yield in Rafsanjan was destroyed by a few-day-long precipitation during
the pollination period. For future studies, we plan to acquire data for precipitation
and maximum temperature and to develop models that assess these other risk factors.


A multivariate time series model which takes into account the parameter and volatility
variation over time can be developed and used to assess the risk of a hazard event such
as: “there are 3 days with temperature more than 5 degrees during the April–May months
and then either the temperature goes below zero or it rains for more than 3 consecutive
days.”
An important aspect of assessing the risk is relating the risk factors to the losses
in yield or the monetary values involved. For this study, we relied on expert knowledge
(by interviewing farmers and agriculture engineers in the region) to define the hazard
events. However, if data for yield per km² become available for a sufficient number of
years and/or locations, one can develop a statistical model that relates the weather events
to the losses in yield within the same model.
The time series models we developed here allow for a complex structure including
long-term trends as well as non-linearity and time variation in both the autoregressive
and volatility structures. Other related models that may be worth considering are:
the “exponential smoothing” methods (e.g. De Livera et al. 2011), which have been
applied widely to financial data and are powerful in particular for modeling long-
term trends; and machine learning approaches such as neural networks, support vector
machines and random forests (Hastie et al. 2009), which are powerful in modeling
non-linearity and giving good (one-day) predictions. However, when using the
machine learning methods, modeling a complex volatility structure, which is an important
component of our analysis, is not straightforward and needs to be explored further
in the literature if such models are to be successful in assessing the risk of complex
weather events.

Acknowledgments We are indebted to Mr. Islami from Rafsanjan Weather Office for providing the data
for this study. We are also thankful to Prof. Jim Zidek and Prof. Nhu Le for some fruitful discussions on the
modeling and applications.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control AC–19:716–
723
Anderson PL, Meerschaert MM, Veccia AV (1999) Innovations algorithm for periodically stationary time
series. Stoch Process Appl 83(1):149–169
Benth FE, Benth JS (2007) Volatility of temperature and pricing of weather derivatives. Quant Financ
7(5):553–561
Brockwell PJ, Davis RA (1991) Time series: theory and methods, 2nd edn. Springer, Berlin
De Livera AM, Hyndman RJ, Snyder RD (2011) Forecasting time series with complex seasonal patterns
using exponential smoothing. J Am Stat Assoc 106(496):1513–1527
Finley AO, Banerjee S, Ek AR, McRoberts RE (2008) Bayesian multivariate process modeling for prediction
of forest attributes. J Agric Biol Environ Stat 13:60–83
Gladyshev EG (1961) Periodically correlated random sequences. Sov Math 2:351–358
Hastie T, Tibshirani R, Friedman JH (2009) The elements of statistical learning, 2nd edn. Springer, Berlin
Hosseini A, Fallahnezhad MS, Zare-Mehrjardi Y, Hosseini R (2012) Seasonal autoregressive models for
estimating the probability of frost in Rafsanjan. J Nuts Relat Sci 3(2):45–52
Hosseini R, Le N, Zidek J (2012) Time-varying Markov models for binary temperature series in agrorisk
management. J Agric Biol Ecol Stat 17(2):283–305
Hosseini R, Le N, Zidek J (2011) Selecting a binary Markov model for a precipitation process. Environ
Ecol Stat 18(4):795–820
Jones RH, Brelsford WM (1967) Time series with periodic structure. Biometrika 54(3):403–8


Kedem B, Fokianos K (2002) Regression models for time series analysis. Wiley Series in Probability and
Statistics
Richards TJ, Manfredo MR, Sanders DR (2004) Pricing weather derivatives. Am J Agric Econ 86(4):1005–
1017
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Tesfaye YG, Meerschaert MM, Anderson PL (2006) Identification of periodic autoregressive moving aver-
age models and their application to the modeling of river flows. Water Resour Res 42(1):W01419
Tong H (1990) Non-linear time series, a dynamical systems approach. Oxford University Press, Oxford
Troutman BM (1979) Some results in periodic autoregression. Biometrika 66(2):219–228
Wang X, Smith KA, Hyndman RJ (2006) Characteristic-based clustering for time series data. Data Min
Knowl Discov 13(3):335–364

Reza Hosseini received his bachelors in mathematics from Amirkabir University of Technology (2003),
a masters in mathematics from McGill University (2005) and a PhD in statistics from University of British
Columbia (2009) where he wrote a PhD thesis on “statistical models for assessing agro-climate risk”.
After his PhD, he worked for Agriculture Canada where he developed in-season crop yield forecasting
methods and statistical models for remote sensing of soil moisture (2010–2011). He also worked on pre-
dicting air pollution mixtures for children’s health analysis at University of Southern California (2011–
2012) and spatial-temporal risk assessment at University of Tokyo (2012–2013). Currently (2013-present)
he is a research scientist at IBM Research Collaboratory in Singapore developing new spatial-temporal
methods for environmental processes and urban city management.

Akimichi Takemura received his bachelors in economics (1976) from University of Tokyo, masters
in statistics (1978) from University of Tokyo and PhD in statistics (1982) from Stanford University. He
was an acting assistant professor at the department of statistics, Stanford University (1982–1983); visiting
assistant professor, department of statistics, Purdue University (1983–1984); associate professor, faculty
of economics, University of Tokyo (1984–1997); professor, faculty of economics, University of Tokyo
(1997–2001); professor, graduate school of information science and technology, University of Tokyo
(2001-present). He is an associate editor of Journal of Multivariate Analysis (2002-present) and he was
the president of Japan Statistical Society (2011–2013). He published more than one hundred papers and
books in a wide range of topics in statistics including algebraic statistics, multivariate statistics and sto-
chastic processes.

Alireza Hosseini received his bachelors in mathematics from Vali-Asr University of Rafsanjan (2010)
and a masters in industrial engineering from Yazd University (2012) where he wrote a masters thesis on
frost risk analysis for pistachio industry. He is currently a PhD student in industrial engineering at Yazd
University and his research interests include forecasting and time series analysis with industrial and envi-
ronmental applications.
