Professional Documents
Culture Documents
Atmospheric Environment
Atmospheric Environment
Atmospheric Environment
journal homepage: www.elsevier.com/locate/atmosenv
a r t i c l e i n f o a b s t r a c t
Article history: Air quality time series consists of complex linear and non-linear patterns and are difficult
Received 8 January 2008 to forecast. Box–Jenkins Time Series (ARIMA) and multilinear regression (MLR) models
Received in revised form 16 July 2008 have been applied to air quality forecasting in urban areas, but they have limited accuracy
Accepted 18 July 2008
owing to their inability to predict extreme events. Artificial neural networks (ANN) can
recognize non-linear patterns that include extremes. A novel hybrid model combining
Keywords:
ARIMA and ANN to improve forecast accuracy for an area with limited air quality and
Particulate matter forecasting
meteorological data was applied to Temuco, Chile, where residential wood burning is
Hybrid
ARIMA a major pollution source during cold winters, using surface meteorological and PM10
Neural networks measurements. Experimental results indicated that the hybrid model can be an effective
Temuco tool to improve the PM10 forecasting accuracy obtained by either of the models used
separately, and compared with a deterministic MLR. The hybrid model was able to capture
100% and 80% of alert and pre-emergency episodes, respectively. This approach demon-
strates the potential to be applied to air quality forecasting in other cities and countries.
Ó 2008 Elsevier Ltd. All rights reserved.
1352-2310/$ – see front matter Ó 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.atmosenv.2008.07.020
8332 L.A. Dı́az-Robles et al. / Atmospheric Environment 42 (2008) 8331–8340
Table 1 than linear ones (Prybutok et al., 2000). Goyal et al. (2006)
Episode levels for air quality index (ICAP), Chile pointed out those linear models such as MLR and ARIMA
Air quality levels ICAP Ventilation condition Episode type fail to predict extreme concentrations (episodes). These
for PM10 [mg m3] authors found that the combined ARIMA and MLR models
0–194 0–200 Good to regular None predicted PM10 levels more accurately than either model
195–239 201–300 Bad Alert used independently for Delhi and Hong Kong. The ARIMA
240–329 301–500 Critical Pre-emergency
and ANN models have also been used for sales forecasting
>330 >501 Dangerous Emergency
in Brazil, with the non-linear ANN model showing higher
accuracy (Ansuj et al., 1996). Recent studies provide good
disparity among their RRs for mortality (Sanhueza et al., descriptions of the hybrid ARIMA–ANN models (Aburto and
2005). Some studies have found higher concentrations of Weber, 2007; Aslanargun et al., 2007; Gutiérrez-Estrada
Polycyclic Aromatic Hydrocarbons (PAHs) in Temuco than et al., 2007; Pulido-Calvo and Portela, 2007; Sallehuddin
Santiago, with concentrations that were 197 and 6.6 times et al., 2007; Gutiérrez-Estrada et al., 2008). For economic
higher, respectively, than the maximum limits allowed in time series forecasting, a study combined a seasonal ARIMA
the European Union (1 ng m3) (Tsapakis et al., 2002). model with a back propagation ANN model (Tseng et al.,
Other studies have found that the PM2.5 constitutes the 80– 2002), showing that the hybrid performed better than
90% of the total PM10 in Temuco, compared with 30–60% in ARIMA or ANN alone. Zhang (2003) tested a hybrid ARIMA
Santiago (Sanhueza et al., 2005). and ANN model over three kinds of time series, and
These studies document the need for better air quality concluded that the linear and non-linear time series
management to reduce air pollution levels. An accurate air patterns in the combined model improved forecasting
quality-forecasting model is needed to alert the population more than either of the models used independently. This
at large and to initiate preventative pollution control hybrid ARIMA–ANN approach has recently been used for
actions. This paper introduces a hybrid air quality- tourist arrival forecasting (Aslanargun et al., 2007),
forecasting model for Chile that could be applied to other hydrology (Jain and Kumar, 2007), supply chain manage-
cases with similar terrain, emission sources, and databases. ment (Aburto and Weber, 2007), freshwater phytoplankton
dynamics (Jeong et al., 2008), watersheds (Pulido-Calvo
1.1. Air quality forecasting models and Portela, 2007), fish catch (Gutiérrez-Estrada et al.,
2007) production values of the machinery industry (Chen
Autoregressive Integrated Moving Average (ARIMA) and Wang, 2007), fish community diversity (Gutiérrez-
(Box–Jenkins Time Series) (Box and Jenkins, 1970) and the Estrada et al., 2008), and in other areas (Tseng et al., 2002;
multilinear regression (MLR) models have been widely Zhang, 2003; Zhang and Qi, 2005), but all of them are not
used for air quality forecasting in urban areas, but they are related to air quality.
of variable accuracy owing to their linear representation of So far, only two hybrid models have been applied to air
non-linear systems (Goyal et al., 2006). Artificial neural quality forecasting (Chelani and Devotta, 2006; Wang and
networks (ANN) have been developed as a non-linear tool Lu, 2006). Wang and Lu (2006) used MLP trained with
for pollution forecasting, principally using multilayer per- a particle swarm optimization algorithm (MLP–PSO) and
ceptron (MLP) architecture (Pérez and Reyes, 2002; Pérez a hybrid Monte Carlo (HMC) method. This was applied for
et al., 2004; Pérez and Reyes, 2006; Schlink et al., 2006; ground level O3 forecasting in Hong Kong during 2000–
Slini et al., 2006; Sofuoglu et al., 2006; Sousa et al., 2006; 2002 over 2 typical monitoring sites, of a total of 14, with
Thomas and Jacko, 2007). The comparison among these different O3 formation patterns in order to fully examine
models is presented in Table 2. the feasibility and generality of the proposed predictive
These models have been compared to evaluate their models. One was the Tsuen Wan (TW) site, which is located
robustness in air quality forecasting performance. Fore- in the urban area and is surrounded by mountains, in which
casting daily maximum ozone (O3) concentration at O3 dynamic is mainly influenced by the high level of
Houston, a study showed that an ANN model was more primary pollutants emitted from local traffic. The other was
accurate than either the ARIMA or MLR models, mainly the Tung Chung (TC) site (Tung Chung Health Centre,
because the data presented clearer non-linear patterns located at the north of Lantau Island about 3 km southeast),
which is a suburban residential area, where the annual
Table 2 average O3 level is usually higher than that in TW site. The
Comparison among the ARIMA, ANN, and MLR forecasting models (Zhang, authors suggested that the O3 pollution at the TC site is
2003) partially subjected to a regional influence of the Pearl River
ARIMA ANN MLR Delta pollution shifting. Their results indicated that the
Better to capture Better to capture the Better to capture hybrid model produced good predictions of the maximum
the linear pattern non-linear patterns the linear pattern O3 level at both sites. However, the model did not perform
of a time series of a time series of a time series satisfactorily during an episode at the TC site, which is
Great versatility Great versatility Few versatility
influenced by both local and regional emissions; in other
Better for seasonal Better to capture Needs correction
patterns noise and extreme factors to capture words, long-range transport of O3 precursors and two
values (episodes) extreme values power plants emission of Hong Kong have more significant
Needs historical Does not need historical Does not need influence on the TC than the TW site. Chelani and Devotta
data continuity data continuity historical data (2006) included ARIMA with a non-linear dynamic tech-
continuity
nique to forecast nitrogen dioxide (NO2) at a site in Delhi,
L.A. Dı́az-Robles et al. / Atmospheric Environment 42 (2008) 8331–8340 8333
India, during 1999–2003. They found that the hybrid model concentrations as well, where the maximum concentra-
had a better performance than ARIMA and the non-linear tions were as high as 1164.0, 321.3, and 353.4 mg m3,
prediction used separately. respectively.
The dependent, MaxPM10ma variable, is log-normally
2. Methodology distributed and has one marked seasonal comportment,
and according to the United States Environmental
2.1. Study area and available data Protection Agency (USEPA) and other studies (USEPA,
2003; Stadlober et al., 2008), a log transformation can be
Hourly and daily time series of PM10 and meteorological used to normalize the data to improve the modeling
data during 2000–2006 at the Las Encinas monitoring performance. After the outlier study, a transformation
station in Temuco was used. Temuco, the capital of the analysis was performed to obtain normal distributions for
Araucanı́a region of Chile (Latitude 38 450 S; Longitude each of the variables, where the main transformation
72 400 W; 100 m.a.s.l) with a population of approximately functions were natural logarithms (USEPA, 2003; Sta-
300 000 inhabitants, is one of the most polluted cities in dlober et al., 2008). Other variables related to daily PM10
South America. The city is located in the center-south of were also created; the maximum hourly PM10 of the
Chile, equidistant between the Pacific Ocean and the Andes. previous day (L1PM10, mg m3), maximum 6-h PM10
The city placement corresponds to Cautı́n River-originated moving average concentration of the previous day
fluvial landmasses that developed in a crushed form (L6PM10, mg m3), maximum 12-h PM10 moving average
between two hills, Ñielol (350 m) and Conunhueno concentration of the previous day (L12PM10, mg m3), and
(360 m). The region has a temperate rainy climate with maximum 24-h PM10 moving average concentration of
Mediterranean influence. Precipitation occurs during all the previous day (L24PM10, mg m3). Next, the data was
months of the year, with highest precipitation during standardized to obtain the identical scale along each axis
winter and a dryer period during summer. Temuco has had in g-dimensional input space and to get constant variance
a fast urban expansion; great economic growth; increased for each variable. This analysis was performed for all
woodstove and industrial source emissions; and increasing models, as suggested for some studies (de Menezes and
vehicle exhaust. Nikolaev, 2006; Piotrowski et al., 2006; Gutiérrez-Estrada
Temuco was declared as non-attainment for PM10 in et al., 2008).
2005. Due to the abundant wood supplies from local
2.2. Variable selection and models construction
forests, almost 70% of the population uses wood for cooking
or heating in winter. It is estimated that 87% of PM10 winter
As shown in Table 4, the data set was partitioned into
emissions originate from residential wood combustion
(RWC) (Sanhueza et al., 2005). During 2006, 15 days were two sets; 92% for training and 8% for validation data, base
on the ANN model requirements for training.
reported with PM10 concentrations exceeding the Chilean
daily PM10 standard of 150 mg m3, among more than 60
2.2.1. Linear approach: multiple linear regression (MLR) model
days of exceedances in 2001–2006.
Temuco has a unique air quality monitoring station. Las The multiple regression procedure was utilized over the
training data set to estimate the significant regression
Encinas station is located in the center of a vacant lot in
a new residential area, 2.7 km west of the city center, Fig. 1. coefficients b0, b1,., bq of the linear equation:
From this site, hourly temperature, relative humidity, wind
y ¼ b0 þ b1 x1 þ / þ bq xq (1)
direction, and wind speed are acquired. A beta gauge
monitor (BAM, Environment S.A., Poissy, France) measures where the regression coefficients b0, b1,., bq represent the
hourly PM10. Daily precipitation, atmospheric pressure, and independent contributions of each independent variable x1,
radiation measurements were obtained from the meteo- ., xq to the prediction of the dependent variable y. To
rological station at the Catholic University of Temuco, examine the independency of the variables, a multi-
located w2 km north east of the site, and 1 km north west collinearity analysis was performed on the data set using
of the city center (Fig. 1). To build the models, the available the variation inflation factor (VIF) (DeLurgio, 1998). The
meteorological variables from those two stations were: global statistical significance of the relationship between y
minimum temperature (Tmin, C); maximum temperature and the independent variables was analyzed by means of
(Tmax, C); relative humidity (RH, %); wind speed (WS, analysis of variance (ANOVA, a level ¼ 0.05) to ensure the
m s1); solar radiation (R, Wh m2day); atmospheric pres- validity of the model. A stepwise procedure of the JMP 6.0.2
sure (hPa, P); and precipitation (mm, PP). The maximum (SAS Institute Inc., U.S.) tool was used for the MLR
PM10 moving averages per day (MaxPM10ma) values were calibration. The final MLR model included the following
calculated from the hourly PM10 measurements, and then normalized, independent, and standardized predictor
a statistical outlier analysis was performed on these data variables that were statistically significant: previous day
using the Cook’s distance, and DFFITS and DFBETAS coef- maximum hourly PM10 (L1PM10), WS, Tmin, and Tmax.
ficients (Belsley et al., 1980; Rawlings, 1988). Fig. 2 shows These variables were also used as predictor variables in to
the maximum 24-h PM10 moving average concentrations in the ARIMA(p,d,q)X, ANN, and hybrid models.
Temuco from July 2000 to September 2006. The statistics of
the air quality and meteorological data are presented in 2.2.2. Linear approach: Box–Jenkins ARIMA model
Table 3. This table includes 1-h maximum PM10, 24-h ARIMA linear models have dominated many areas of
average PM10, and maximum 24-h PM10 moving average time series forecasting. As the application of these models
8334 L.A. Dı́az-Robles et al. / Atmospheric Environment 42 (2008) 8331–8340
is very common, it is described here briefly. The linear Yt ¼ q0 þ f1 Yt1 þ f2 Yt2 þ / þ fp Ytp þ et (2)
function is based upon three parametric linear compo-
nents: autoregression (AR), integration (I), and moving where p is the number of the autoregressive terms, Yt is the
average (MA) method (Box and Jenkins, 1970; DeLurgio, forecasted output, Ytp is the observation at time tp, and
1998). The autoregressive or ARIMA(p,0,0) method is rep- f1, f2,., fp is a finite set of parameters. The f terms are
resented as follows: determined by linear regression. The q0 term is the
400
MaxPM10 24Hr. [µg/m3]
200
100
0
0 400 800 1200 1600 2000 2400
Time (days)
Fig. 2. Maximum 24-h PM10 moving average by beta attenuation monitor (BAM) time series at the Las Encinas site in Temuco, Chile (from 07/21/2000 to 09/30/
2006).
L.A. Dı́az-Robles et al. / Atmospheric Environment 42 (2008) 8331–8340 8335
Table 3 ANN models can recognize trends, patterns, and learn from
Statistical description of the meteorological and air quality data acquired their interactions with the environment. The most exten-
from Temuco/Las Encinas station from 07/21/2000 to 09/30/2006
sively studied and used ANN models are the multilayer feed
Variables Mean Standard Maximum Minimum forward networks (Rumelhart et al., 1986), which allow
deviation information transfer only from an earlier layer to the next
Max 1-h PM10, mg m3 166.5 168.5 1,164.0 6.0 consecutive layers. Each neuron j receives incoming signals
24-h PM10, mg m3 48.5 39.6 321.3 5.3
from external variables or every neuron i in the previous
Max 24-h PM10 moving 63.0 50.2 353.4 4.1
average, mg m3 layer and there is a synaptic weight (Wji) associated with
Minimum temperature, C 7.4 3.7 16.6 4.9 each incoming signal (xi). According to Eq. (5), the effective
Maximum temperature, C 17.4 5.3 37.4 2.8 incoming signal (Nj) to the node j is the weighted sum of all
Wind speed, m s1 1.9 0.9 6.9 0.3 the incoming signals.
Precipitation, mm 3.3 7.6 78.6 0.0
Relative humidity, % 80.5 8.7 100.0 52.0
X
m
Solar radiation, W m2 3,755 2,438 9,185 135 Nj ¼ xi Wji (5)
Pressure, mbar 1,003 5 1,018 986
i¼1
n
P 2 Portela, 2007). The optimal model is selected when the BIC
yi b
yi
is the lowest.
i
E2 ¼ 1:0 (6)
P
n
2
jyi yj 3. Results and discussions
i
a b
350 350
Observed MaxPM10 ma (µg/m3)
250 250
200 200
150 150
100 100
50 50
0 0
0 50 100 150 200 250 300 350 0 50 100 150 200 250 300 350
Modeled MaxPM10 ma (µg/m3), MLR model Modeled MaxPM ma (µg/m3), ARIMA-X model
c d
350 350
Observed MaxPM10 ma (µg/m3)
250 250
200 200
150 150
100 100
50 50
0 0
0 50 100 150 200 250 300 350 0 50 100 150 200 250 300 350
Modeled MaxPM10 ma (µg/m3), ANN model Modeled MaxPM10 ma (µg/m3), Hybrid model
Fig. 3. Models performance using the validation data set for maximum 24-h PM10 moving average and meteorological measurements taken at Temuca/Las
Encinas, Chile, from 04/01/2006 to 09/30/2006. The observed 24-h PM10 moving average concentration was compared with model performance from: a) MLR, b)
ARIMA-X, c) ANN, and d) hybrid models.
average of order 1, represented by Eq. 10 and using the ARIMAX MaxPM10ma, the errors associated with the ARI-
parameter estimated from Table 6. MAX model (earimax), L1PM10, WS, Tmin, and Tmax. The
hybrid model was trained with the Levenberg–Marquardt
Yt ¼ 0:98122Yt1 þ m 0:87057et1 þ et (15) algorithm and MLP architecture with one hidden layer,
Table 6 suggests that the autoregressive component of the three neurons, and a square activation function. The model
ARIMA model is more significant than the moving average performances are reported in Fig. 3 and Table 7.
component, the maximum hourly PM10 concentration of To evaluate the performance of these different models,
the previous day, and the meteorological variables to specifically, three groups to measure the accuracy were
predict the maximum 24-h PM10 moving average concen-
trations at Temuco. Table 7
The ANN model was built with the selected variables Model performance statistics for the validation data set for PM10 samples
acquired from Temuco/Las Encinas station from 04/01/2006 to 09/30/
found through the MLR model and trained with the Lev- 2006
enberg–Marquardt algorithm and MLP architecture with
one hidden layer, three neurons, and with a square activa- Estimator MLR ARIMAX ANN Hybrid
tion function. This model is a black box for the model R2 0.7786 0.7691 0.7770 0.9828
E2 0.7599 0.7588 0.7569 0.9770
equations and coefficients, a disadvantage compared to the
ARV 0.2401 0.2412 0.2431 0.0230
MLR or ARIMAX models. RMSE, mg m3 28.39 28.46 28.57 8.80
To build the hybrid ARIMA–ANN model, two output MAE, mg m3 20.83 19.87 20.65 6.74
variables of the ARIMAX model (Eq. 14) and the selected SEP 29.70 29.76 29.88 9.20
external meteorological variables from the MLR model PI 0.6626 0.6611 0.6585 0.9676
BIC 2198.9 2210.1 2216.7 1801.2
were used as inputs to the new ANN model; the forecasted
8338 L.A. Dı́az-Robles et al. / Atmospheric Environment 42 (2008) 8331–8340
Table 8 Table 10
Contingency table for the MLR model of the validation data set, Temuco/ Contingency table for the ANN model of the validation data set, Temuco/
Las Encinas station from 04/01/2006 to 09/30/2006 Las Encinas station from 04/01/2006 to 09/30/2006
None Alert Pre-emergency Emergency Tot. %Oa None Alert Pre-emergency Emergency Tot. %O
None 169 1 1 0 171 99 None 170 0 1 0 171 99
Alert 4 2 1 0 7 29 Alert 4 0 3 0 7 0
Pre-emergency 1 3 1 0 5 20 Pre-emergency 1 1 3 0 5 60
Emergency – – – – – Emergency – – – – –
Tot. 174 6 3 0 183 94 Tot. 175 1 7 0 183 95
%P 97 33 33 %P 97 0 43 0
a
Percentage of observed days by type of episodes that were forecast to
be in the type.
The performance in ARIMAX and ANN models is not
2 2
robust when the data meet certain behaviors, such as
considered: predictive capability (R , E , and ARV), preci- linear and non-linear patterns, that usually are found in air
sion (RMSE, MAE, and SEP), and goodness-of-fit (PI and quality time series. In this case, the 24-h PM10 moving
BIC). To estimate these coefficients, all the modeled and average concentrations time series presents those two
observed variables and their residuals were re calculated at patterns. This problem is solved with the hybrid ARIMA–
their original units, in other words, non-normalized and ANN model, which is able to capture almost all peaks in
non-standardized. For the predictive capability, the hybrid the validation data set, compared with the other models,
model beat the other three models at least in 10.4%. The R2 Fig. 3. Similar conclusions were obtained for other time
between observed and estimated maximum 24-h PM10 series (Qi and Zhang, 2001; Zhang, 2003; Zhang and Qi,
moving average concentrations in this validation phase 2005; Aburto and Weber, 2007; Gutiérrez-Estrada et al.,
indicated that a 98.28% of the explained variance was 2007; Pulido-Calvo and Portela, 2007). This detail is
captured by the hybrid model, value far better than the important to forecast PM episodes several hours in
other models, Table 7. Similar conclusions were obtained in advance. With an E2 of 0.9770 (Fig. 3d and Table 7), the
forecasting different kinds of time series (Zhang, 2003; hybrid model forecasted the maximum 24-h PM10 moving
Gutiérrez-Estrada et al., 2007; Pulido-Calvo and Portela, average concentrations at the Las Encinas monitoring
2007). The biggest difference was detected for the E2 station of Temuco.
coefficient (close to 22%), as that obtained for other study The results are also presented in the form of tables of
(Gutiérrez-Estrada et al., 2007). Regarding the accuracy, the contingency (Pérez and Reyes, 2006), Tables 8–11, which
hybrid model is consistently far the most accurate among show a summary of correct forecasts for episode types and
the four models. The SEP coefficient appears as the more models. The columns show the number of the days fore-
‘‘relaxed’’, while MAE as the most demanding among the cast to be on a given type against the type of the observed
three. In this case, this level of explained variance implied day. A good model should capture high percentages of
a SEP of 9.20%, a RMSE of 8.8 mg m3, and a MAE of alert and pre-emergency episodes. Row %P represents the
6.74 mg m3. Finally, the hybrid model appears as the model percentage of forecast days by type that were verified to
that better fit the forecasting also. In this regard, the occur. The percentage of false positives is represented by
observed differences in the BIC coefficient are distin- 100-%P. A good forecasting model should perform few
guished (Table 7), because the parameter is one of the most false positives of episode types. Bold diagonal numbers are
efficient in the study of model comparison. As the deter- the successful forecasts by episode. The MLR and ARIMAX
mination coefficient has received much criticism on models had a poor performance to capture alerts and pre-
forecasting because it is not related to the difference emergency episodes, mainly pre-emergencies. This
between predicted and observed values, the PI was used behavior can be expected, since these models are not good
complementarily (Goyal et al., 2006; Gutiérrez-Estrada for non-linear patterns (Goyal et al., 2006). On the other
et al., 2007). The PI value of the hybrid model was as high as hand, the non-linear ANN model had a better performance
0.9676. This suggests that the hybrid model had better on the most dangerous episodes; getting 60% of successes
forecasting performance than those other models. in the pre-emergencies, but for alerts none days were
Table 9 Table 11
Contingency table for the ARIMA(1,0,1)-X model of the validation data set, Contingency table for the hybrid model of the validation data set, Temuco/
Temuco/Las Encinas station from 04/01/2006 to 09/30/2006 Las Encinas station from 04/01/2006 to 09/30/2006
None Alert Pre-emergency Emergency Tot. %O None Alert Pre-emergency Emergency Tot. %O
None 170 0 1 0 171 99 None 170 1 0 0 171 99
Alert 4 2 1 0 7 29 Alert 0 7 0 0 7 100
Pre-emergency 2 3 0 0 5 0 Pre-emergency 0 0 4 1 5 80
Emergency – – – – – Emergency – – – – –
Tot. 176 5 2 0 183 94 Tot. 170 8 4 1 183 99
%P 97 40 0 0 %P 100 88 100 0
L.A. Dı́az-Robles et al. / Atmospheric Environment 42 (2008) 8331–8340 8339
180 4. Conclusions
160
A novel hybrid ARIMA–ANN model is proposed that is
140
capable of exploiting the strengths of traditional time series
PM10 [µg/m3]
Chow, J.C., Bachmann, J.D., Wierman, S.S.G., Mathai, C.V., Malm, W.C., Qi, M., Zhang, G.P., 2001. An investigation of model selection criteria for
White, W.H., Mueller, P.K., Kumar, N., Watson, J.G., 2002. Visibility: neural network time series forecasting. European Journal of Opera-
science and regulation – discussion. Journal of the Air & Waste tional Research 132, 666–680.
Management Association 52, 973–999. Rawlings, J., 1988. Applied Regression Analysis. A Research Tool. Wads-
Chow, J.C., Watson, J.G., Mauderly, J.L., Costa, D.L., Wyzga, R.E., Vedal, S., worth & Brooks, Pacific Grove, California.
Hidy, G.M., Altshuler, S.L., Marrack, D., Heuss, J.M., Wolff, G.T., Pope, C. Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. ‘‘Learning’’ represen-
A., Dockery, D.W., 2006. Health effects of fine particulate air pollu- tations by back propagation errors. Nature 323, 533–536.
tion: lines that connect. Journal of the Air & Waste Management Sallehuddin, R., Shamsuddin, S.M.H., Hashim, S.Z.M., Abraham, A., 2007.
Association 56, 1368–1380. Forecasting time series data using hybrid grey relational artificial
Coulibaly, P., Anctil, F., Bobee, B., 1999. Hydrological forecasting with neural network and auto regressive integrated moving average
artificial neural networks: the state of the art. Canadian Journal of model. Neural Network World 17, 573–605.
Civil Engineering 26, 293–304. Sanhueza, P., Vargas, C., Jiménez, J., 1999. Daily mortality in Santiago and
DeLurgio, S.A., 1998. Forecasting Principles and Applications. Tom Casson, its relationship with air pollution. Revista Medica De Chile 127,
New York. 235–242.
Goyal, P., Chan, A.T., Jaiswal, N., 2006. Statistical models for the prediction Sanhueza, P., Vargas, C., Mellado, P., 2005. Impact of air pollution by fine
of respirable suspended particulate matter in urban cities. Atmo- particulate matter (PM10) on daily mortality in Temuco, Chile. Revista
spheric Environment 40, 2068–2077. Medica De Chile 134, 754–761.
Griñó, R., 1992. Neural networks for univariate time series forecasting and Schlink, U., Herbarth, O., Richter, M., Dorling, S., Nunnari, G., Cawley, G.,
their application to water demand prediction. Neural Network World Pelikan, E., 2006. Statistical models to assess the health effects and to
2, 437–450. forecast ground-level ozone. Environmental Modelling & Software 21,
Gutiérrez-Estrada, J.C., Silva, C., Yáñez, E., Rodrı́guez, N., Pulido-Calvo, I., 547–558.
2007. Monthly catch forecasting of anchovy Engraulis ringens in the Schreuder, A.B., Larson, T.V., Sheppard, L., Claiborn, C.S., 2006. Ambient
north area of Chile: non-linear univariate approach. Fisheries woodsmoke and associated respiratory emergency department visits
Research 86, 188–200. in Spokane, Washington. International Journal of Occupational and
Gutiérrez-Estrada, J.C., Vasconcelos, R., Costa, M.J., 2008. Estimating Environmental Health 12, 147–153.
fish community diversity from environmental features in the Schwarz, G., 1978. Estimating the dimension of a model. Annals of
Tagus estuary (Portugal): multiple linear regression and artificial Statistics 6, 461–464.
neural network approaches. Journal of Applied Ichthyology 24, Shepherd, A.J., 1997. Second-Order Methods for Neural Networks New
150–162. York.
Jain, A., Kumar, A., 2007. Hybrid neural network models for hydrologic Slini, T., Kaprara, A., Karatzas, K., Moussiopoulos, N., 2006. PM10 fore-
time series forecasting. Applied Soft Computing 7, 585–592. casting for Thessaloniki, Greece. Environmental Modelling & Software
Jeong, K.S., Kim, D.K., Jung, J.M., Kim, M.C., Joo, G.J., 2008. Non-linear 21, 559–565.
autoregressive modelling by temporal recurrent neural networks for Sofuoglu, S.C., Sofuoglu, A., Birgili, S., Tayfur, G., 2006. Forecasting
the prediction of freshwater phytoplankton dynamics. Ecological ambient air SO2 concentrations using artificial neural networks.
Modelling 211, 292–300. Energy Sources Part B-Economics Planning and Policy 1,
Kitanidis, P.K., Bras, R.L., 1980. Real time forecasting with a conceptual 127–136.
hydrological model. 2: applications and results. Water Resources Sousa, S.I.V., Martins, F.G., Pereira, M.C., Alvim-Ferraz, M.C.M., 2006.
Research 16, 1034–1044. Prediction of ozone concentrations in Oporto city with statistical
Larson, T., Su, J., Baribeau, A.M., Buzzelli, M., Setton, E., Brauer, M., 2007. A approaches. Chemosphere 64, 1141–1149.
spatial model of urban winter woodsmoke concentrations. Environ- Stadlober, E., Hörmann, S., Pfeiler, B., 2008. Quality and performance of
mental Science & Technology 41, 2429–2436. a PM10 daily forecasting model. Atmospheric Environment 42,
de Menezes, L.M., Nikolaev, N.Y., 2006. Forecasting with genetically pro- 1098–1109.
grammed polynomial neural networks. International Journal of Thomas, S., Jacko, R.B., 2007. Model for forecasting expressway fine
Forecasting 22, 249–265. particulate matter and carbon monoxide concentration: application
Naeher, L.P., Brauer, M., Lipsett, M., Zelikoff, J.T., Simpson, C.D., Koenig, J.Q., of regression and neural network models. Journal of the Air & Waste
Smith, K.R., 2007. Woodsmoke health effects: a review. Inhalation Management Association 57, 480–488.
Toxicology 19, 67–106. Titov, M., Sturman, A.P., Zawar-Reza, P., 2007. Application of MM5 and
Nash, J.E., Sutcliffe, J.V., 1970. River flow forecasting through conceptual CAMx4 to local scale dispersion of particulate matter for the city of
models. I: a discussion of principles. Journal of Hydrology 10, Christchurch, New Zealand. Atmospheric Environment 41, 327–338.
282–290. Tsapakis, M., Lagoudaki, E., Stephanou, E.G., Kavouras, I.G., Koutrakis, P.,
Pérez, P., Gramsch, E., 2007. PM10 Forecasting System in Santiago, Chile. Oyola, P., von Baer, D., 2002. The composition and sources of PM2.5
In: A&WMA. 100th Annual Conference and Exhibition of the Air and organic aerosol in two urban areas of Chile. Atmospheric Environ-
Waste Management Association Pittsburgh, PA, US. ment 36, 3851–3863.
Pérez, P., Palacios, R., Castillo, A., 2004. Carbon monoxide concentration Tseng, F.M., Yub, H.C., Tzeng, G.H., 2002. Combining neural network
forecasting in Santiago, Chile. Journal of the Air & Waste Management model with seasonal time series ARIMA model. Technological Fore-
Association 54, 908–913. casting & Social Change 69, 71–87.
Pérez, P., Reyes, J., 2002. Prediction of maximum of 24-h average of PM10 USEPAU.S., Environmental Protection Agency, 2003. Guidelines for
concentrations 30-h in advance in Santiago, Chile. Atmospheric Developing an Air Quality (Ozone and PM2.5) Forecasting Program,
Environment 36, 4555–4561. EPA-456/R-03–002. Office of Air Quality Planning and Standards.
Pérez, P., Reyes, J., 2006. An integrated neural network model for PM10 Ventura, S., Silva, M., Pérez-Bendito, D., Hervás, C., 1995. Artificial neural
forecasting. Atmospheric Environment 40, 2845–2851. networks for estimation of kinetic analytical parameters. Analytical
Piotrowski, A., Napiorkowski, J.J., Rowinski, P.M., 2006. Flash-flood Chemistry 67, 1521–1525.
forecasting by means of neural networks and nearest neighbour Wang, D., Lu, W.Z., 2006. Ground-level ozone prediction using multilayer
approach – a comparative study. Nonlinear Processes in Geophysics perceptron trained with an innovative hybrid approach. Ecological
13, 443–448. Modelling 198, 332–340.
Pope, C.A., Dockery, D.W., 2006. Health effects of fine particulate air Wang, H.B., Wamura, K., Shooter, D., 2006. Wintertime organic aerosols in
pollution: lines that connect. Journal of the Air & Waste Management Christchurch and Auckland, New Zealand: contributions of residential
Association 56, 709–742. wood and coal burning and petroleum utilization. Environmental
Prybutok, V.R., Yi, J.S., Mitchell, D., 2000. Comparison of neural network Science & Technology 40, 5257–5262.
models with ARIMA and regression models for prediction of Hous- Watson, J.G., 2002. Visibility: science and regulation. Journal of the Air &
ton’s daily maximum ozone concentrations. European Journal of Waste Management Association 52, 628–713.
Operational Research 122, 31–40. Zhang, G.P., 2003. Time series forecasting using a hybrid ARIMA and
Pulido-Calvo, I., Portela, M.M., 2007. Application of neural approaches to neural network model. Neurocomputing 50, 159–175.
one-step daily flow forecasting in Portuguese watersheds. Journal of Zhang, G.P., Qi, M., 2005. Neural network forecasting for seasonal and trend
Hydrology 332, 1–15. time series. European Journal of Operational Research 160, 501–514.