Abbasi 2018
Accurate modeling of municipal solid waste (MSW) generation is vital as a reliable support for decision-making processes, ensuring the success of the future development and management of wastes. The present study aims to forecast monthly and seasonal MSW generation using a radial basis function (RBF) neural network and to assess the effect of the gender of educated people, in combination with meteorological, socioeconomic, and demographic variables, on waste generation. The study was implemented on data obtained from a megacity for the period of 1991–2013. A cross-validation technique was employed to evaluate modeling performance. The performance of the RBF model was also compared with adaptive neuro-fuzzy inference system (ANFIS) and artificial neural network (ANN) models. The results showed that the number of educated women was highly associated with MSW generation, while the number of educated men was not a significant factor. Modeling outputs demonstrated that the RBF neural network model could successfully predict both monthly and seasonal variations of MSW generation. Compared to ANFIS and ANN, RBF was the best-performing model for monthly and seasonal forecasting of MSW generation. The results suggested that soft computing methods like RBF improve the estimate of MSW generation in metropolises. Hence, the RBF network can be applied for forecasting and modeling MSW generation on a national scale. © 2018 American Institute of Chemical Engineers Environ Prog, 2018

Keywords: radial basis function network, municipal solid waste generation, machine learning, gender of educated people

INTRODUCTION
Nowadays, a huge amount of waste is created, and consequently, successful waste management is needed to ensure resource efficiency and the sustainable growth of the economy. Estimating the amount of waste is essential for designing and implementing a waste management plan, as it is the foundation of effective waste planning [1].
Municipal solid waste (MSW) generation changes by geographical location and depends on various parameters such as cultural practices, income level, the time of the year, the climate, the degree of development, the standard of living, and eating habits [2–4]. Therefore, forecasting waste generation is a challenging issue. Many studies have been conducted on waste generation modeling, from traditional to advanced modeling techniques. So far, waste generation models are classified as correlation analysis, time-series analysis, group comparison, multiple regression analysis, system dynamics modeling, and input–output analysis [5–7]. Correlation analysis was used to find the relationship between MSW generation and income level [8]. It was found that waste generation was highly correlated with income level. Average generation rates were higher in high-income nations, at around 2.1 kg per capita per day, than in lower income, upper middle, and lower middle income nations, with MSW generation rates of 0.6, 1.2, and 0.8 kg per capita per day, respectively. However, accurate waste generation prediction cannot be achieved using correlation analysis or per capita waste generation [9]. Disregarding demographic and socioeconomic factors, Rimaityte et al. [10] used conventional time-series approaches, including autoregressive integrated moving average (ARIMA) and seasonal autoregressive integrated moving average (sARIMA), to predict generation. The time series model performed poorly for long-term prediction of MSW generation. The model, by contrast, was good at short-term forecasting of MSW generation. Moreover, Xu et al. [11] combined time series analysis (the sARIMA model) with gray system theory, a methodology derived from control theory that uses differential equations to reveal the dynamic relationships in a system and in which the term gray describes the understanding of information in the system. The hybrid model had a superior performance, with lower errors than the conventional sARIMA model. A comparative study was conducted to forecast MSW quantity and quality by applying different approaches, including time series analysis and multiple regression, in the Waste Prognostic Tool for the case study of Iasi, Romania [12]. The urban life expectancy, population age, total MSW, and number of residents were used as predictors in prognostic models to predict the solid waste amount. Time series analysis resulted in the most accurate model to forecast MSW generation. In another study by Oribe-Garcia et al. [13], tourism activity, urban morphology, income, and education level were found to be the most effective drivers of MSW generation. In this study, the effect of gender and age structure on municipal waste generation was investigated. The data from a 10-yr period, from 2001 to 2010, were taken into consideration. Other MSW drivers such as
[Figure 2 diagram: input layer (population, household size, educated women, educated men, income, GDP, max temperature, rain); hidden layer of n neurons with basis functions φ1 … φn and outputs O1 … On; output layer summing the hidden outputs with weights w1 … wn; output: MSW generation.]
Figure 2. Structure of RBF network used in this study.
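As a rough illustration of the structure in Figure 2, the forward pass of a single-output RBF network can be sketched as follows. The Gaussian basis function, the per-neuron width, the optional bias term, and all numeric values are assumptions made for this sketch; the text here does not state which basis function or training settings the study used.

```python
import math

def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Output of a single-output RBF network for one input vector x.

    Assumes Gaussian basis functions: phi_j = exp(-||x - c_j||^2 / (2 * s_j^2)).
    """
    activations = []
    for center, width in zip(centers, widths):
        # Squared Euclidean distance between the input and this neuron's center.
        sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        activations.append(math.exp(-sq_dist / (2.0 * width ** 2)))
    # Output layer: weighted sum of the hidden-layer outputs.
    return bias + sum(w * a for w, a in zip(weights, activations))

# Hypothetical toy network: 2 inputs, 2 hidden neurons.
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [0.5, -0.25]
y = rbf_forward([0.0, 0.0], centers, widths, weights)
```

In the study's setting, the input vector would hold the normalized predictors and the output would be the normalized MSW generation.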
All statistical analyses were carried out using MATLAB R2013a software.

Artificial Neural Network
An ANN is a computational model that recognizes patterns buried in the data. An ANN consists of simple processing units called neurons. Each neuron in the network is connected to the other neurons by unidirectional connections of different weights. The neurons are arranged in a series of layers, namely an input layer, a variable number of hidden layers, and an output layer. The number of hidden layers depends on the complexity of the problem to be solved. First, in the input layer, the input values are assigned to the input neurons, which send their activation values to each neuron in the next layer. Then, in the hidden layer, each neuron sums the activation values received from its connected neurons and uses a transfer function to determine its output value. This process is repeated in the output layer to generate the output values of the neural network. The advantage of using an ANN is that the network is able to learn from examples, as the learnt information is stored across the network weights.

ADAPTIVE NEURO-FUZZY INFERENCE SYSTEM
The architecture of ANFIS combines a neural network with fuzzy logic, which makes it a powerful discriminator. The algorithm works with data and extracts fuzzy if-then rules from the data. To train ANFIS, the learning algorithms of neural networks are used.
The ANFIS model consists of five layers for generating the inference system. There are several nodes in each layer, and the layers are connected to one another (see Figure 3).
The square nodes, called adaptive nodes, represent parameter sets that are adjustable. Circle nodes are fixed nodes that represent the fixed parameter sets in the system. The first-order Sugeno-type model proposed by Jang [cite] is as follows:

Rule $i$: if $x$ is $A_i$ and $y$ is $B_i$, then $f_i = p_i x + q_i y + r_i$

where $x$ and $y$ are the inputs, $A_i$ and $B_i$ are the fuzzy sets, $f_i$ are the outputs within the fuzzy region identified by the rules, and $p_i$, $q_i$, and $r_i$ are the parameters determined during training.
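A first-order Sugeno system of this kind can be sketched in a few lines, with each rule output taking the linear form f_i = p_i x + q_i y + r_i. The Gaussian membership functions, the product used for the firing strength, and every numeric parameter below are assumptions chosen for illustration; they are not taken from the study.

```python
import math

def gauss_mf(v, center, sigma):
    """Gaussian membership degree of v in a fuzzy set (assumed shape)."""
    return math.exp(-((v - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_first_order(x, y, rules):
    """Weighted average of the linear rule outputs f_i = p_i*x + q_i*y + r_i."""
    num = 0.0
    den = 0.0
    for rule in rules:
        # Firing strength: product of the two membership degrees.
        w = gauss_mf(x, *rule["A"]) * gauss_mf(y, *rule["B"])
        f = rule["p"] * x + rule["q"] * y + rule["r"]
        num += w * f
        den += w
    return num / den

# Two hypothetical rules; "A" and "B" hold (center, sigma) of each fuzzy set.
rules = [
    {"A": (0.0, 1.0), "B": (0.0, 1.0), "p": 1.0, "q": 1.0, "r": 0.0},
    {"A": (1.0, 1.0), "B": (1.0, 1.0), "p": 2.0, "q": 0.0, "r": 1.0},
]
out = sugeno_first_order(0.5, 0.5, rules)
```

At x = y = 0.5 both rules fire equally, so the output is the plain average of the two rule outputs. In ANFIS, the membership parameters and p_i, q_i, r_i would be fitted during training instead of being fixed by hand.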
Figure 6. Correlation analysis of different features with seasonal waste estimation during 21 time intervals: (a) non-linear and
(b) linear.
$$x_{\text{normalized}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \times 2 - 1 \qquad (11)$$

where $x_{\text{normalized}}$ indicates the normalized value and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the variable, respectively.

Model Evaluation
We used an eightfold cross validation (CV) technique to evaluate the models' performance and estimate their out-of-sample accuracy. Generally, supervised learning algorithms are susceptible to overfitting. This means that these algorithms may produce models that are fitted too closely to the noise and details of the training data and perform poorly on new data. Therefore, using CV helps to avoid potential model overfitting.
In this study, data were randomized first and then divided into eight equal-size subsets. For each fold, the model was tested on one of the eight subsets and trained on the remaining subsets (see Figure 4). This process was repeated eight times, and the resulting statistics for assessing modeling performance were summed and then averaged. Using CV helps avoid potential overfitting, in which the model performs better on the data set used for model fitting than on data from outside the study area. The modeling performance metrics used in this study were cross validated R2 (CV-R2) and cross validated root mean square error (CV-RMSE).
In addition, the Bland–Altman analysis was employed to assess the agreement between the observations and predictions of seasonal and monthly waste generation. The Bland–Altman analysis is fully described in a study by Giavarina [37].

RESULT AND DISCUSSION

Investigation of the Correlation Coefficients between Explanatory Variables and Dependent Variable
In this study, nine independent variables, which reflect demographic, socioeconomic, and meteorological conditions, were used to predict monthly and seasonal solid waste generation. The nonlinear and linear correlations between these
| Dataset | Metric | S1: all data | S2: all data minus educated men | S3: all data minus educated men, temp | S4: all data minus educated men, temp, rain | S5: all data minus educated men, temp, rain, unemployment rate |
|---|---|---|---|---|---|---|
| Monthly | CV-R2 | 0.736 | 0.716 | 0.704 | 0.709 | 0.678 |
| Monthly | CV-RMSE | 0.106 | 0.110 | 0.112 | 0.111 | 0.116 |
| Seasonal | CV-R2 | 0.872 | 0.872 | 0.875 | 0.884 | 0.849 |
| Seasonal | CV-RMSE | 0.066 | 0.066 | 0.065 | 0.063 | 0.071 |
independent variables and the dependent variable during different time intervals were analyzed, using distance correlation and Pearson's correlation, to evaluate the influence of each of the explanatory variables on waste generation. In this regard, for each of the datasets (monthly and seasonal), the nonlinear and linear correlations between each of the independent variables and MSW generation were calculated for the first year (1992). Then the data related to the next year (1993) were added to these data (1992), and the correlations between each of the independent variables and MSW generation were calculated again. This process was repeated to cover all the data derived from the 21 yr of the study.
For the monthly and seasonal datasets, the changes of linear and nonlinear correlations between the predictors and MSW generation during the 21 time intervals are depicted in Figures 5 and 6. At a glance, the correlation between independent and dependent variables followed nonlinear patterns rather than linear ones. As expected, a higher correlation was found at the seasonal scale than at the monthly scale. The main reason is that seasonal patterns of waste generation have a key role in estimating the amount of waste generation in a city. Moreover, the variation of the correlations with time revealed that there were stronger relationships between socioeconomic, demographic, and meteorological variables and waste generation over a long period than over a short period. In other words, socioeconomic, demographic, and meteorological variables influenced the long-term generation of MSW. However, some factors, such as unemployment rate, did not follow a specific trend.
Among the independent variables, GDP, income, household size, and educated women were highly correlated with both monthly and seasonal waste generation. This confirms previous findings in the literature but introduces the number of educated women as a waste generation driver. In contrast, the number of educated men was a less effective factor in MSW generation. This may be related to the role of women and their education in waste generation.

Prediction of MSW Generation using RBF Neural Network
Nine variables from the socioeconomic, demographic, and meteorological data groups were selected and then fed to an RBF network to estimate monthly and seasonal MSW generation during 21 yr. To find the best combination of input variables, five different subsets of input data were fed to an RBF network to build an estimation model. Model S1 included all nine variables, while the number of educated men was excluded in model S2 due to its weak correlation with the dependent variable. Model S3 used the variables of model S2 except maximum temperature, and model S4 additionally excluded rainfall. Model S5 contains GDP, population, income, number of educated women, and household size. These models were built for each of the datasets separately and validated using the CV technique. In addition, CV-R2 and CV-RMSE were calculated as two performance metrics.
Table 1 summarizes the obtained CV-R2 and CV-RMSE values for the different models. Model S1 was capable of explaining 74% and 87% of monthly and seasonal waste generation variations, respectively. Excluding the number of educated men from the full set of variables (model S2) decreased CV-R2 by 2.7% and raised CV-RMSE by 3.8% in the monthly model, but no difference was observed in the performance of the seasonal model. Removing both educated men and maximum temperature (model S3) decreased CV-R2 by 1.7% in the monthly model and raised it by 0.3% in the seasonal model (0.872 to 0.875 in Table 1), compared to model S2. In model S4, CV-R2 values increased, especially in the seasonal model. The observed increase in CV-R2 is due to removing the rainfall variable from the datasets. Among the variables excluded, unemployment rate had the most significant effect on modeling performance: CV-R2 dropped by around 4% for both models. Hence, the unemployment rate could slightly improve the prediction of MSW generation. Finally, the highly correlated predictors (GDP, population, household size, income, and the unemployment rate) were included in the model.
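The cumulative-window correlation procedure described in the correlation subsection (correlation on the first year, then on the first two years, and so on up to 21 intervals) can be sketched as below. Only Pearson's correlation is shown; the distance correlation used for the nonlinear case is omitted. The function names and the toy series are hypothetical, and a window in which either series is constant would cause a division by zero in this simplified version.

```python
import math

def pearson(xs, ys):
    """Pearson's linear correlation coefficient of two equal-length series."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def expanding_correlations(predictor, target, per_year, n_years):
    """Correlation over years 1..k for k = 1..n_years (one value per interval)."""
    results = []
    for k in range(1, n_years + 1):
        end = k * per_year  # number of observations in the first k years
        results.append(pearson(predictor[:end], target[:end]))
    return results

# Hypothetical data: 3 years of seasonal (4 per year) observations.
predictor = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
target = [2 * v + 1 for v in predictor]
corrs = expanding_correlations(predictor, target, per_year=4, n_years=3)
```

For the monthly dataset the same function would be called with `per_year=12`, and with `n_years=21` for the full study period.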
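Table 2. R2 and RMSE for model fitting in each fold of cross validation.

Seasonal

| Set | Metric | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Fold 6 | Fold 7 | Fold 8 |
|---|---|---|---|---|---|---|---|---|---|
| Train and test | R2 | 0.870 | 0.896 | 0.895 | 0.876 | 0.885 | 0.888 | 0.894 | 0.894 |
| Train and test | RMSE | 0.066 | 0.061 | 0.059 | 0.062 | 0.0603 | 0.0634 | 0.0617 | 0.0584 |
| Test (R) | R2 | 0.857 | 0.591 | 0.868 | 0.929 | 0.936 | 0.847 | 0.612 | 0.858 |
| Test (R) | RMSE | 0.062 | 0.061 | 0.069 | 0.064 | 0.091 | 0.057 | 0.068 | 0.086 |

Monthly

| Set | Metric | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 | Fold 6 | Fold 7 | Fold 8 |
|---|---|---|---|---|---|---|---|---|---|
| Train and test | R2 | 0.735 | 0.742 | 0.726 | 0.754 | 0.737 | 0.752 | 0.736 | 0.728 |
| Train and test | RMSE | 0.106 | 0.105 | 0.105 | 0.102 | 0.105 | 0.104 | 0.106 | 0.108 |
| Test (R) | R2 | 0.749 | 0.721 | 0.789 | 0.533 | 0.714 | 0.568 | 0.733 | 0.815 |
| Test (R) | RMSE | 0.107 | 0.112 | 0.117 | 0.135 | 0.113 | 0.118 | 0.102 | 0.084 |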
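The per-fold statistics in Table 2 come from the eightfold procedure described under Model Evaluation. A minimal sketch of that procedure is given below: scale to [-1, 1] as in Equation 11, shuffle, split into eight folds, score each held-out fold, and average. A one-variable least-squares line stands in for the RBF network purely to keep the example self-contained; the stand-in model, the function names, the seed, and the data are all assumptions of this sketch.

```python
import math
import random

def normalize(values):
    """Scale values to [-1, 1] following Equation 11."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * 2.0 - 1.0 for v in values]

def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (stand-in model, NOT an RBF network)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def r2_rmse(obs, pred):
    """Coefficient of determination and root mean square error."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(obs))

def k_fold_cv(xs, ys, k=8, seed=0):
    """Return (CV-R2, CV-RMSE, per-fold scores) for a k-fold split."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # randomize before splitting
    folds = [idx[i::k] for i in range(k)]     # k near-equal subsets
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        obs = [ys[i] for i in test_idx]
        pred = [a * xs[i] + b for i in test_idx]
        scores.append(r2_rmse(obs, pred))
    cv_r2 = sum(s[0] for s in scores) / k     # average over the folds
    cv_rmse = sum(s[1] for s in scores) / k
    return cv_r2, cv_rmse, scores

# Hypothetical noiseless data, so every held-out fold is predicted exactly.
raw_x = [float(i) for i in range(40)]
raw_y = [3.0 * v + 2.0 for v in raw_x]
xs, ys = normalize(raw_x), normalize(raw_y)
cv_r2, cv_rmse, per_fold = k_fold_cv(xs, ys)
```

Comparing the per-fold scores, as Table 2 does, shows how much the held-out performance varies from fold to fold before averaging hides it.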
Figure 8. Prediction–observation plot of normalized values of MSW generation: (a) monthly and (b) seasonal.
The CV results for the model trained on the best combination of data are presented in Table 2. The R2 and RMSE were calculated for each fold of the eightfold CV across the model runs. The small differences in R2 and RMSE in both the monthly and seasonal models showed that the RBF model did not suffer from overfitting.
For a closer inspection of the modeling results, we used the Bland–Altman analysis and the prediction–observation plot to evaluate the agreement between the observations and predictions of the monthly and seasonal models. In Figure 7, the Bland–Altman plots demonstrated low bias in both models; however, the seasonal model had a tighter agreement than the monthly model, with fewer large residuals. We also compared the observed MSW generation to the predicted values of the RBF models. The predicted–observed plot of seasonal and monthly modeling of MSW generation indicated that the values were more equally scattered across the line of agreement at low and high waste generation (Figure 8). In addition, the predicted–observed plot showed a stronger correlation between the seasonal predictions of municipal waste generation and its actual observations.

COMPARISON OF THE PERFORMANCE OF RBF MODEL WITH ANFIS AND ANN
The same dataset was used to train and test the RBF, ANFIS, and ANN models. Table 3 compares the performance of the RBF, ANFIS, and ANN models based on CV-R2 and CV-RMSE. The CV-R2 values were 0.678, 0.53, and 0.43 for the RBF, ANN, and ANFIS models, respectively, for monthly prediction, while they were 0.85, 0.62, and 0.56 for seasonal prediction. All the models built from the seasonal dataset have higher CV-R2 and lower CV-RMSE than the models built from the monthly dataset. The results from both datasets show that RBF models are more accurate than ANN and ANFIS in predicting waste generation. Based on Table 3, RBF achieved the highest CV-R2 and lowest CV-RMSE values among all the models. In addition, the values obtained by the ANN and ANFIS models differ significantly from those of the RBF models, which highlights the greater efficiency of the RBF model in the domain of waste generation prediction.
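The Bland–Altman agreement check used earlier in this section reduces to the mean difference between predictions and observations (the bias) and the limits of agreement at bias ± 1.96 standard deviations of the differences. The sketch below computes these quantities; the function name and the sample values are hypothetical and are not data from the study.

```python
import math

def bland_altman(observed, predicted):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [p - o for o, p in zip(observed, predicted)]
    n = len(diffs)
    bias = sum(diffs) / n
    # Sample standard deviation of the differences.
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical normalized observations and model predictions.
observed = [0.10, 0.35, 0.50, 0.72, 0.90]
predicted = [0.12, 0.33, 0.52, 0.70, 0.92]
bias, lower, upper = bland_altman(observed, predicted)
```

A bias near zero with narrow limits corresponds to the "low bias" and "tighter agreement" readings of the plots discussed above.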