ISA Transactions: Xiaoyan Liu, Jiao Jin, Weining Wu, Fabian Herz

Research article
A novel support vector machine ensemble model for estimation of free lime content in cement clinkers
Xiaoyan Liu a,b,∗, Jiao Jin a, Weining Wu a, Fabian Herz c
a College of Electrical and Information Engineering, Hunan University, 410082 Changsha, China
b Hunan Key Laboratory of Intelligent Robot Technology in Electronic Manufacturing, 410082 Changsha, China
c Department of Applied Biosciences and Process Engineering, Anhalt University of Applied Sciences, 06366 Köthen, Germany
article info

Article history:
Received 25 July 2018
Received in revised form 1 September 2019
Accepted 1 September 2019
Available online xxxx

Keywords:
Free lime content
Soft sensor
Support vector machine
Fuzzy c-means clustering
Ensemble model

abstract

Free lime (f-CaO) content is a crucial quality parameter for cement clinkers in rotary cement kiln. Due to lack of hardware sensors, f-CaO content in cement clinker is mostly obtained by offline laboratory measurement, making timely control rather difficult and even impossible. In this work, a soft sensor approach named as support vector machine ensemble (ESVM) model is proposed to estimate f-CaO content. The process data employed to train and test the model were collected from a cement plant in China, covering a time span of about 30 days. The raw data were preprocessed by filters and time-series matching. The processed data were then clustered by fuzzy c-means clustering algorithm to capture process features at different operating conditions. For each individual cluster, a base SVM regressor was trained to estimate f-CaO content. Finally, an ensemble model consisting of four base SVM regressors was established to estimate f-CaO content at multifarious process conditions. The effectiveness of the proposed ESVM model was investigated by comparing it with manual measurements and other models available in literature. The results demonstrate that the proposed ESVM model achieves improvements in model accuracy as well as generalization capability. The proposed ESVM model has a broad application space in cement production process for automatic monitoring of f-CaO content.
© 2019 ISA. Published by Elsevier Ltd. All rights reserved.
∗ Corresponding author at: College of Electrical and Information Engineering, Hunan University, 410082 Changsha, China.
E-mail address: xiaoyan.liu@hnu.edu.cn (X. Liu).
https://doi.org/10.1016/j.isatra.2019.09.003

1. Introduction

In cement industries, raw materials are calcined into cement clinker at high temperature in rotary kilns. The free lime (f-CaO) content in cement clinker largely determines the cement quality. The f-CaO content is affected by many factors, for example, the combustion process, the material transport process as well as the cooling process of the clinker [1–3]. In practice, the f-CaO content is usually set at 0.5%–1.5% as required quality. However, due to the lack of hardware sensors and the harsh environment (high temperature, strong erosion, heavy dust etc.) within the rotary kiln, f-CaO content is mostly obtained offline by manual analysis in the laboratory with a long sampling interval (typically one hour), resulting in difficulties in timely control of cement clinker quality. Therefore, it is necessary to develop methods for automatic monitoring of f-CaO content.

In recent years, data-driven soft sensors have been introduced for automatic monitoring of f-CaO content. These soft sensors are characterized by deriving an input–output model based on a large amount of historical process data (for example feed rates of fuel and material, temperatures and pressures) in the plant database. The soft sensor models available in literature differ mainly in the choice of model inputs and modeling algorithms (see Table 1).

The first research on soft sensing of free lime was conducted by Lin B et al. [4]. In their work, Principal Component Regression (PCR) and Partial Least Squares Regression (PLSR) algorithms were used as f-CaO content estimation models. Thirteen process variables were chosen as model inputs. 2000 testing data were used to test the models. Their results indicated that the PLSR model is more suitable for estimation of f-CaO content than the PCR model due to its ability to capture more relevant information. However, all the values of f-CaO content to be predicted were in the range 0.5%–1.5% (corresponding to product quality in normal production conditions). As the authors themselves pointed out, the question of integrating information from irregularly sampled offline quality measurements into weighted vectors still requires study.

Pani AK et al. [5] developed and compared three different neural networks (Back Propagation neural network (BP), Radial Basis Function neural network (RBF) and Generalized Regression neural network (GR)) for f-CaO content estimation. Nine process variables including compositions of the raw material were used as model inputs. It was found that the RBF model has better performance than BP and GR. Later, Pani AK et al. [6] applied
Table 1
Soft sensor models for estimation of f-CaO content.
Authors | Model | Model inputs | Number of testing samples | Share of testing samples with f-CaO < 0.5% | Share with 0.5%–1.5% | Share with > 1.5%
Lin B et al. [4] | PCR, PLSR | 13 inputs, including: kiln current, fuel flow rates to calciner and kiln, kiln feed and several temperature measurements within the kiln system | 2100 | 0% | 100% | 0%
Pani AK et al. [5] | BP, RBF, GR | 9 inputs, including: raw meal quality (SiO2, Al2O3, Fe2O3, CaO), kiln operating variables (kiln feed rate, kiln RPM, raw meal inlet temperature, coal feed rate, kiln current) | 110 | 1.8% | 91.8% | 6.4%
Pani AK et al. [6] | T–S inference technique | same 9 inputs as [5] | 110 | 1.8% | 91.8% | 6.4%
Li W et al. [7] | RVFL ensemble model | 14 inputs, including: flame image features (color of ROI, global configuration of ROI, local configuration of ROI), kiln operating variables (coal feeding, opening degree of induced draft fan, kiln current, raw material pump current, tail temperature, head temperature, head pressure), raw material quality (lime saturation factor, silicic acid rate, alumina modulus, granularity) | 80 | 10% | 65% | 25%
Takagi–Sugeno (T–S) fuzzy inference technique to improve estimation performance. The model accuracy was verified using 110 testing samples and achieved good performance. However, 91.8% of the values of f-CaO content to be predicted range between 0.5% and 1.5%, corresponding to product quality in normal production conditions. The adaptability and prediction accuracy of this model for other cases (f-CaO > 1.5%, abnormal product quality) remain to be further investigated.

Li W et al. [7] employed a random vector functional-link (RVFL) nets ensemble model for automatic estimation of f-CaO content. The model was developed by generating 20 RVFL networks trained by the same training samples, and the final predicted result was computed as the mean of the f-CaO content values predicted by the 20 RVFL networks. Fourteen variables including features of flame images were used as inputs of the ensemble model. The model was verified by 80 testing data with a relatively wide range of f-CaO values (25% of the values are bigger than 1.5%). However, as pointed out by the authors themselves [7], the random weights assigned by this RVFL ensemble model are not optimal. Moreover, this method requires that the flame images have good quality. In the cases of heavy smoke and dust in the rotary kiln [8], however, the flame images captured by the CCD camera become very blurry or even invisible. This brings difficulties in image feature extraction and thus greatly affects the model performance.

The above literature analysis shows that data-driven soft sensor models are effective for automatically estimating f-CaO content, and it is still necessary to develop novel algorithms to improve model accuracy and model generalization capability by using representative datasets that cover a wider range of cement clinker quality.

Support Vector Machine (SVM) has been recognized as a traditional and reliable approach to evaluate the nonlinear relationship between inputs and outputs. Compared with other algorithms, SVM has the advantages of simple structure and strong generalization ability, and thus SVM is particularly suitable for modeling data with insufficient quantity and multi-collinearity [9–11]. Therefore, SVM could be a good choice for estimating the f-CaO content in clinker. However, clinker production is a complicated and 24 hour-continuous process involving many coupled sub-processes, such as combustion, chemical reactions, mass and heat transfer [12–14]. The processes are characterized by non-linearity, strong coupling, and especially big time lag [15,16]. It would be difficult to estimate the f-CaO content under various process conditions by use of a single SVM model.

Motivated by the above-mentioned problems, this work proposes a novel support vector machine ensemble (ESVM) model to estimate f-CaO content in cement clinker, aiming at improving its estimation accuracy and generalization capability. The process data used to train and test the model were collected from a local cement production line, covering a relatively long time span (30 days) and a wide range of process conditions (0.3% < f-CaO < 3.4%). The raw data were first preprocessed based on the time-lag feature of the cement production process. The processed data were then clustered by fuzzy c-means clustering (FCM) for capturing process features at different conditions. For each individual cluster, a base SVM regressor was trained using the sub-dataset in the cluster. Finally, an ensemble model consisting of several base SVM regressors was established to monitor the f-CaO content under various process conditions.

In Section 2, the cement clinker calcination process is briefly described along with the data collection and preprocessing procedure; in Section 3, the ESVM model is proposed and described in detail; in Section 4, the performance of the proposed model for f-CaO content estimation is evaluated and compared with other models, followed by conclusions in Section 5.

2. Cement clinker calcination process and data preprocessing

2.1. Cement clinker calcination process

The typical cement clinker calcination process includes four stages (preheating, calcination, sintering and cooling) and involves four main apparatuses (cyclone preheater, precalciner, rotary kiln and grate cooler, in Fig. 1). Specifications of the cement clinker production line in a local cement company are listed in Table 2. The raw material is fed into the multistage cyclone preheater (mostly a five-stage cyclone preheater). During its vertical downward movement in the cyclones, the material is pre-heated via countercurrent hot flue gas. After that, it is discharged into a precalciner (the temperature is about 800 °C). Due to the complex dynamics of gas–solid two phase flow in high temperature, it is difficult to establish mathematical models to calculate the time that the material needs to pass the cyclone preheater and the precalciner. Li and Jian [17] measured the residence time of the material in a five-stage cyclone preheater in cold tests (27 °C), by use of an on-line RTD (retention time distribution) measuring instrument using electromagnetic method. It was found that the residence time of the material in five cyclones is approximately 8 s. Chen et al. [18] performed numerical simulations to predict the gas–solid flow in the precalciner in cold tests. The calculated result showed that the material stayed in the precalciner for about 10 s. In reality, the residence time is much bigger owing
to the turbulent flow of the flue gas in high temperature field. Based on results of experiments and simulations in [17,18] as well as the technical reports from the local cement company, the total time that the material needs to pass the cyclone preheater and the precalciner is estimated to be about 30 s.

Coming out from the precalciner, the material then enters the rotary kiln. The temperature of the burning zone of the rotary kiln reaches about 1300 °C. Under such high temperature, the liquid phase of tetracalcium aluminoferrite (C4AF), tricalcium aluminate (C3A) and dicalcium silicate (C2S) in the material absorbs f-CaO to form the major component of cement clinker, namely tricalcium silicate (C3S). The mean residence time t_k of the material in the rotary kiln can be estimated using the classical empirical equation developed by Sullivan et al. [19] as below:

$$t_k = 1.77 \times \frac{L}{D} \times \frac{\sqrt{\theta}}{n_c \times \beta} \tag{1}$$

where L represents kiln length; D denotes the inner diameter of the kiln; θ is the repose angle of the material; n_c is the rotation speed of the kiln; β is the inclination of the kiln to horizontal. Based on the parameters listed in Table 2, t_k is calculated to be 21 min. The hot clinker is then discharged from the rotary kiln into a grate cooler and is cooled by the air coming from cooling fans. The residence time t_c of the clinker in the cooler is calculated as

$$t_c = \frac{m}{W} = \frac{A H \cdot \rho}{W} \tag{2}$$

where m is the total mass of the material in the grate cooler; W is the flow rate of the clinker into the cooler; A is the effective area of the grate plate; H is the thickness of bed material in the cooler; ρ is the density of the clinker in the cooler. Accordingly, t_c is calculated to be 29 min.

The cement clinker quality is confirmed offline in laboratory by detecting the f-CaO content in the cement clinker samples collected periodically from the discharge end of the grate cooler. The sampling interval is one hour.

Table 2
Parameter list of the cement clinker production process in a local cement company.
Parameters | Values
Length of the kiln (m) | 60
Inner diameter of the kiln (m) | 3.6
Inclination of the kiln to horizontal (°) | 2.29
Rotation speed of the kiln (rpm) | 3.3–3.9
Effective area of the grate plate (m2) | 63.6
Thickness of bed material in the cooler (m) | 0.6
Repose angle of the material (°) | 35
Feeding rate of the raw material (t/h) | about 180
Density of clinker in the cooler (kg m−3) | about 1500
Flow rate of the clinker into the cooler (kg min−1) | about 1973.7

2.2. Data collection and preprocessing

2.2.1. Data collection

As described above, the cement clinker production process is rather complex. There are material flow, gas flow and fuel flow, involving physical and chemical reactions, mass transfer and heat transfer. Variations of the process conditions greatly affect the f-CaO content. In practice, about 70 sensors are installed for automatic monitoring of process variables such as temperatures, pressures, flow rates etc. Due to the strong coupling of the processes, some sampled variables are correlated with one another. For example, the temperature in the precalciner is affected by the temperature in the five-stage cyclone preheater, and the electrical current increases greatly with increased rotation speed. If too many variables are chosen as inputs of the model for f-CaO content estimation, the time for model training and computing will be dramatically increased and cannot meet the real-time requirement in practical applications.

For the above reasons and considering the fact that f-CaO is mostly generated in the rotary kiln, we consider mostly the signals sampled from the rotary kiln as well as its neighboring equipment (precalciner, two chambers of the grate cooler). According to the process mechanism and experience of the kiln operators, five process variables are selected in the present work as inputs of the f-CaO content estimation model (see Table 3 and Fig. 1). Among them, the feed rate of the raw material (M) is a variable from the material flow process; the temperature in the precalciner (T) is a variable from the heat transfer process; the secondary air pressure at kiln head (Ph) and the air pressure in the second chamber of the grate cooler (Pc) are variables from the air flow process; the electrical current of the drive motor of the rotary kiln (I) is a vital parameter reflecting the material load and thermal conditions in the kiln [20,21]. Although some other variables may also be used as model inputs, we will show later with experiments that these five input variables are enough for soft sensor modeling of f-CaO content.

Table 3
Inputs and output of the proposed soft sensor model.
Model inputs | Model output
Feed rate of the raw material (M) | f-CaO content
Temperature in precalciner (T) |
Secondary air pressure at kiln head (Ph) |
Air pressure in the second chamber of grate cooler (Pc) |
Electrical current of the drive motor of the rotary kiln (I) |

In order to train the model, we collected data from a cement production line (Fig. 1 and Table 2) of a cement plant in Jiangxi Province, China. The process variables in Table 3 were measured by sensors installed on site. The values were sampled and saved in computer by the automatic monitoring system that was already installed for the production line. The sampling interval was one second. The f-CaO content was measured manually by offline laboratory analysis with an interval of one hour. The acquired data cover a long time span of 30 days (from April 25th to May 24th), reflecting various process conditions.

2.2.2. Data preprocessing

The data preprocessing procedures are illustrated in Fig. 2. The process variables were sampled with an interval of 1 s for 30 days, leading to a huge amount of data points (86 400 × 30) and redundancy in data. In the present work, a combination of mean filter and 3σ criterion (''mean filter-3σ method'') is employed to process the data, that is, the raw data are processed firstly by a mean filter with a period of 5 min to reduce redundant information and then processed by the 3σ criterion to delete contaminated data caused by sensor failure. In principle, one can also adopt the processing sequence ''3σ-mean filter'', which could however lead to more computing load since outliers have to be eliminated from 86 400 × 30 data points when using the 3σ criterion. In order to investigate the influence of the data preprocessing sequences ''3σ-mean filter'' and ''mean filter-3σ'', we compare the final processed datasets (288 data points) for precalciner temperature T on May 5th. As shown in Fig. 3, there is no big difference between the datasets obtained by use of the ''mean filter-3σ'' method and the ''3σ-mean filter'' method. Among the 288 data points, only 13 data points scatter near the centerline, and the relative differences are less than 0.6%.

As already described in Section 2.1, the cement clinker producing process involves preheating, decomposing, calcination and
Fig. 1. Schematic diagram of the cement clinker calcination process.
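The residence-time estimates of Section 2.1 can be checked numerically. The sketch below is not the authors' code: it assumes Eq. (1) has the square-root form 1.77 · (L/D) · √θ / (n_c · β) with angles in degrees, and it assumes a rotation speed of 3.6 rpm (the midpoint of the 3.3–3.9 rpm range in Table 2). Under these assumptions the Table 2 parameters give values that round to the 21 min and 29 min quoted in the text. The function names and the helper `matching_minute` are hypothetical.

```python
import math


def kiln_residence_time_min(length_m, diameter_m, repose_deg, speed_rpm, slope_deg):
    """Eq. (1), assumed form: t_k = 1.77 * (L / D) * sqrt(theta) / (n_c * beta)."""
    return 1.77 * (length_m / diameter_m) * math.sqrt(repose_deg) / (speed_rpm * slope_deg)


def cooler_residence_time_min(area_m2, bed_thickness_m, density_kg_m3, flow_kg_min):
    """Eq. (2): t_c = m / W = A * H * rho / W."""
    return area_m2 * bed_thickness_m * density_kg_m3 / flow_kg_min


def matching_minute(reference_minute, lag_min):
    """Minute of the hour at which an input should be read for a given lag (hypothetical helper)."""
    return (reference_minute - round(lag_min)) % 60


# Parameters from Table 2; 3.6 rpm is an assumed midpoint of the 3.3-3.9 rpm range.
t_k = kiln_residence_time_min(60, 3.6, 35, 3.6, 2.29)
t_c = cooler_residence_time_min(63.6, 0.6, 1500, 1973.7)
# Time to travel the last 9 m of the kiln, applying Eq. (1) with L = 9 m.
t_tail = kiln_residence_time_min(9, 3.6, 35, 3.6, 2.29)

if __name__ == "__main__":
    print(f"t_k = {t_k:.1f} min, t_c = {t_c:.1f} min, last 9 m = {t_tail:.1f} min")
    # f-CaO is sampled at minute 50 of every hour (XX:50).
    print("matching minute for M and T:", matching_minute(50, t_k + t_c))
    print("matching minute for Ph and I:", matching_minute(50, t_tail + t_c))
```

The roughly 30 s spent in the preheater and precalciner is neglected here, as in the text. The matching time of Pc also depends on the sensor position inside the cooler, so it is not derived in this sketch.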
Fig. 2. Flow chart of data preprocessing.
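The ''mean filter-3σ'' sequence of Section 2.2.2 can be illustrated with a minimal sketch. It is an illustration under assumptions, not the authors' implementation: the mean filter is taken to be a non-overlapping 5 min block average (300 samples at 1 s, which yields 288 points per day, matching the count in the text), and the 3σ criterion is taken to use the mean and population standard deviation of the filtered series. The function names are hypothetical.

```python
import statistics


def block_mean(values, window):
    """Average consecutive non-overlapping windows (e.g. 300 samples = 5 min at 1 s)."""
    return [
        statistics.fmean(values[i:i + window])
        for i in range(0, len(values) - window + 1, window)
    ]


def three_sigma_mask(values):
    """True for points within mean +/- 3 sigma, False for suspected outliers."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [abs(v - mu) <= 3 * sigma for v in values]


def mean_filter_then_3sigma(raw, window=300):
    """Apply the block mean first, then drop points rejected by the 3-sigma criterion."""
    filtered = block_mean(raw, window)
    keep = three_sigma_mask(filtered)
    return [v for v, ok in zip(filtered, keep) if ok]
```

Running the block mean first means the 3σ test only sees 288 points per day instead of 86 400, which is the computing-load argument made in the text.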
cooling. There exists much time lag between the corresponding process variables. The changes of f-CaO content in cement clinker are caused by the cumulative impacts of operating variables over the past time period [22]. It is thus important to consider the time-series matching between each input and output (f-CaO content) of the model. A useful method was proposed in our previous work [16] to solve such a problem by analyzing the residence time of the material in the four main apparatuses (Fig. 4).

Fig. 3. Comparison of the final processed datasets (288 data points) for precalciner temperature T on May 5th (the data were processed by ''3σ-mean filter'' method and ''mean filter-3σ'' method, respectively).

Fig. 4. Residence time of the material in apparatuses and time-series matching scheme for inputs–output of the model.

Table 4
Example of some input–output pairs.
Sampling date | Sample number | M (t/h) | T (°C) | I (A) | Ph (Pa) | Pc (kPa) | f-CaO content (%)
4–25 | 1 | 182.5 | 843.0 | 381.0 | −22.7 | 6.8 | 1.6
 | 2 | 182.8 | 844.1 | 340.6 | −6.6 | 6.9 | 1.1
 | 3 | 182.1 | 843.4 | 356.8 | −0.1 | 6.5 | 1.2
 | ... | ... | ... | ... | ... | ... | ...
 | 23 | 182.7 | 843.2 | 279.3 | 14.7 | 6.4 | 0.7
 | 24 | 182.8 | 841.0 | 328.1 | −17.1 | 6.0 | 1.2
4–26 | 25 | 183.0 | 840.3 | 343.9 | −29.9 | 6.1 | 1.1
 | 26 | 182.4 | 838.3 | 294.3 | −56.1 | 6.1 | 1.6
 | ... | ... | ... | ... | ... | ... | ...
 | 232 | 180.0 | 843.3 | 368.5 | −13.7 | 6.5 | 0.3
 | ... | ... | ... | ... | ... | ... | ...
 | 415 | 182.0 | 851.8 | 343.3 | −21.1 | 6.3 | 3.4
... | ... | ... | ... | ... | ... | ... | ...
5–23 | 660 | 179.8 | 851.9 | 229.3 | −9.2 | 6.4 | 1.2
Minimum | | 167.5 | 825.5 | 134.5 | −73.0 | 4.4 | 0.3
Maximum | | 185.0 | 865.5 | 461.3 | 36.3 | 7.7 | 3.4

In the local cement company, the f-CaO content was manually measured by collecting clinker samples at about the 50th minute of every hour (labeled by XX:50 in the figure). Using XX:50 as reference time, the matching time of the 5 process variables (Fig. 4) can be obtained by determining the sampling locations of the process variables and calculating the residence time of the material in the corresponding apparatuses. As described in Section 2.1, the material stays in the preheater and precalciner for about 30 s,
and the total residence time of the material in rotary kiln and cooler is about 50 min, therefore the matching time of variable M in the preheater and the variable T in the precalciner could be roughly estimated as XX:00 with respect to the reference sampling time (XX:50) of f-CaO content. In consideration of the sampling position of Pc in the grate cooler and the residence time (29 min) of the material in the cooler, the matching time of Pc is approximately XX:25. The variables Ph and I reflect the working conditions in the rotary kiln. According to the infrared image of the kiln shell, the high-temperature calcination of the material is nearly accomplished at a distance of 51 m from the kiln tail. The time for the material to pass the remaining distance (9 m) to the discharge end of the kiln is calculated to be 3 min based on Eq. (1), and the time to pass through the grate cooler is about 29 min, that is, 32 min in total. Considering the sampling positions of the two variables Ph and I, their matching time can be estimated as XX:18 with respect to the reference sampling time (XX:50) of f-CaO content.

After data preprocessing, a dataset containing 660 sets of inputs–output pairs is obtained, arranged in time sequence order. The first 560 sets of data will be used for model building (training data) and the remaining 100 sets for model validation (testing data). This means that the proposed model needs to predict the clinker quality over 4 successive days (from May 19th, 0:00 to 23rd, 9:00) of continuous operation of the rotary kiln.

Some of the preprocessed input–output pairs are shown in Table 4 for example. The collected data cover a long time span and have relatively wide ranges, being representative for various process conditions. For example, the input variable I ranges from 134.5 A to 461.5 A; the values of f-CaO content vary between 0.3% (over-burning state) and 3.4% (under-burning state), covering a relatively wider range than those used in literature. To ensure equal relevance for all variables, the inputs–output pairs are normalized before they are used for soft sensor model training.

3. Framework of the ESVM model

In this section, a support vector machine ensemble model (ESVM) is proposed for f-CaO content prediction. The basic idea is to increase the adaptability of the SVM regressor to the complexity and variety of the cement clinker calcination process, thus improving model generalization. Fig. 5 illustrates the general architecture of the proposed ESVM model. In order to capture the process features at different operating conditions, the training data is grouped into n clusters C1, C2, ..., Cn using the fuzzy c-means algorithm (FCM). Each cluster indicates similar process conditions. The corresponding data in each cluster (denoted as sub-datasets S1, S2, ..., Sn, respectively) are then used to train a base SVM regressor (denoted as SVM1, SVM2, ..., SVMn, respectively) for f-CaO content estimation under the corresponding process conditions. The established ESVM model {SVM1, SVM2, ..., SVMn} can then be employed to estimate f-CaO content under new process conditions, by performing the following procedures: (i) determine the cluster that the new input data {M, T, I, Ph, Pc} belong to; (ii) use the corresponding trained base SVM regressor to estimate f-CaO content. The procedure is presented in the following sections.

3.1. Data clustering based on FCM

Fuzzy c-means algorithm (FCM) is an objective function-based clustering tool. FCM has been successfully used due to its efficient and outstanding performance [23,24] and is thus applied in the present work. Based on the FCM algorithm, the training data {M, T, I, Ph, Pc} can be grouped into several clusters (C1, C2, ..., Cn). The inputs–output pairs {M, T, I, Ph, Pc, f-CaO} of each cluster are taken as a sub-dataset (S1, S2, ..., Sn, respectively). Each sub-dataset will be used to train a base SVM regressor.

In the usage of FCM, the number of clusters n should be appropriately selected. Too many clusters will lead to a complicated result that is difficult to interpret and analyze, while too few clusters will cause information to be lost and may misguide the final decision [25]. As a compromise, n is preset in the range 3–6. For determining the most suitable number of clusters, the sum of the Euclidean distances (denoted as d) between the testing samples and their corresponding cluster centers is employed for evaluation. As shown in Table 5, d is minimum for the case n = 4, indicating that there is a higher similarity between the testing samples and their corresponding cluster centers. Therefore, it is reasonable for us to group the training data into four clusters with four cluster centers {v1, v2, v3, v4}.

Table 5
The sum of Euclidean distances between the testing samples and their corresponding cluster centers.
Number of clusters | n=3 | n=4 | n=5 | n=6
Sum of Euclidean distances d | 3.96 | 3.17 | 3.43 | 3.55
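The clustering step and the distance-based choice of n described in Section 3.1 can be sketched as follows. This is a minimal, standard fuzzy c-means written for illustration, not the authors' implementation: the fuzzifier m = 2, the random initialisation, the stopping tolerance and all function names are assumptions, and the demo data are synthetic.

```python
import math
import random


def _dist(a, b):
    """Euclidean distance between two points given as sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Minimal fuzzy c-means; returns (cluster centers, membership matrix)."""
    rng = random.Random(seed)
    U = []
    for _ in X:  # random initial memberships, each row normalised to sum to 1
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        U.append([u / total for u in row])
    dim = len(X[0])
    centers = []
    for _ in range(n_iter):
        # Center update: membership-weighted mean of the samples.
        centers = []
        for k in range(n_clusters):
            w = [U[i][k] ** m for i in range(len(X))]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * X[i][j] for i in range(len(X))) / w_sum for j in range(dim)]
            )
        # Membership update from inverse distances to the centers.
        change = 0.0
        for i, x in enumerate(X):
            inv = [max(_dist(x, c), 1e-12) ** (-2.0 / (m - 1.0)) for c in centers]
            inv_sum = sum(inv)
            new_row = [v / inv_sum for v in inv]
            change = max(change, max(abs(a - b) for a, b in zip(new_row, U[i])))
            U[i] = new_row
        if change < tol:
            break
    return centers, U


def nearest_cluster(x, centers):
    """Index of, and distance to, the closest cluster center."""
    d = [_dist(x, c) for c in centers]
    k = min(range(len(centers)), key=d.__getitem__)
    return k, d[k]


def sum_of_distances(samples, centers):
    """Criterion d: summed distance of each sample to its nearest center."""
    return sum(nearest_cluster(x, centers)[1] for x in samples)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic normalised 5-dimensional data standing in for {M, T, I, Ph, Pc}.
    train = [[rng.random() for _ in range(5)] for _ in range(120)]
    test = [[rng.random() for _ in range(5)] for _ in range(30)]
    for n in range(3, 7):
        centers, _ = fuzzy_c_means(train, n)
        print(n, round(sum_of_distances(test, centers), 3))
```

In practice one would pick the n with the smallest printed d, mirroring the comparison in Table 5.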
Fig. 5. Architecture of the proposed ESVM model for estimation of f-CaO content.
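The two-step use of the ensemble (find the nearest cluster center, then call the matching base regressor) can be sketched as below. The cluster centers are the values listed in Table 7 and the scaling limits are the minimum and maximum rows of Table 4; using min–max scaling is an assumption, since the text only states that the data are normalized. The base regressors are hypothetical placeholders that return constants, standing in for the trained SVM1 to SVM4.

```python
import math

# Cluster centers v1..v4 from Table 7, variable order: M, T, I, Ph, Pc (normalised).
CENTERS = [
    [0.16, 0.35, 0.65, 0.51, 0.61],
    [0.79, 0.38, 0.62, 0.47, 0.57],
    [0.74, 0.57, 0.34, 0.61, 0.57],
    [0.70, 0.66, 0.47, 0.44, 0.39],
]
# Minimum and maximum rows of Table 4; min-max scaling itself is an assumption.
X_MIN = [167.5, 825.5, 134.5, -73.0, 4.4]
X_MAX = [185.0, 865.5, 461.3, 36.3, 7.7]


def normalize(x_raw):
    """Scale raw inputs {M, T, I, Ph, Pc} to [0, 1] (assumed min-max scheme)."""
    return [(x - lo) / (hi - lo) for x, lo, hi in zip(x_raw, X_MIN, X_MAX)]


def nearest_center(x_norm, centers=CENTERS):
    """Step (i): index of the cluster center with minimal Euclidean distance."""
    dists = [math.dist(x_norm, c) for c in centers]
    return min(range(len(centers)), key=dists.__getitem__)


def esvm_predict(x_raw, regressors):
    """Step (ii): route the sample to the base regressor of its cluster."""
    x_norm = normalize(x_raw)
    k = nearest_center(x_norm)
    return k, regressors[k](x_norm)


# Hypothetical stand-ins for the trained base regressors SVM1..SVM4.
dummy_regressors = [lambda x, v=v: v for v in (0.8, 1.2, 1.6, 2.0)]

if __name__ == "__main__":
    sample = [182.5, 843.0, 381.0, -22.7, 6.8]  # first row of Table 4
    cluster, estimate = esvm_predict(sample, dummy_regressors)
    print("cluster index:", cluster, "placeholder estimate:", estimate)
```

Because the routing is a hard nearest-center assignment, only one base regressor is evaluated per sample, which keeps the prediction cost equal to that of a single SVM.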
3.2. Training of base SVM regressors

Support vector machine (SVM) is a competent algorithm for dealing with regression problems and has the advantages of simple structure and strong generalization ability [26–29]. The basic notions of SVM for the case of regression are briefly introduced here. Considering a training dataset {(x_i, y_i)}, i = 1, ..., N, where input x_i ∈ X ⊆ R^d and output y_i ∈ Y ⊆ R, SVM performs regression estimation by establishing a linear model, given by:

$$f(x) = w^T \cdot \Phi(x) + b \tag{3}$$

where w defines the weight vector; b represents the bias term which is a constant; Φ(x) is a nonlinear transformation function that maps the input space into a higher dimensional feature space. This is equivalent to solving the constrained optimization problem as follows [26,30]:

$$\min_{w,\,b,\,\xi_i,\,\xi_i^*} \; \frac{1}{2} w^T w + c \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right)$$

$$\text{subject to} \quad \begin{cases} y_i - \left[ w^T \Phi(x_i) + b \right] \le \varepsilon + \xi_i \\ \left[ w^T \Phi(x_i) + b \right] - y_i \le \varepsilon + \xi_i^* \\ \xi_i,\ \xi_i^* \ge 0, \quad i = 1, 2, \ldots, n \end{cases} \tag{4}$$

where ξ_i and ξ_i^* denote slack variables which are represented as the upper and lower boundary hyperplane, respectively; c is the penalty parameter which controls the trade-off between minimization of the training error and maximization of the margin; ε is the parameter of the insensitive loss function [30] and determines the maximum error for the approximation. The solution of this constrained optimization problem is obtained by converting it to a Lagrange dual problem, and then the regression function can be simplified by solving the dual problem as [10]:

$$f(x) = \sum_{i=1}^{n} \left( \alpha_i - \alpha_i^* \right) K\left( x_i, x_j \right) + b \tag{5}$$

where α_i and α_i^* denote Lagrange multipliers; and K(x_i, x_j) = Φ(x_i)^T · Φ(x_j) is the kernel function which satisfies Mercer's condition. In this article, the Gaussian radial basis function is employed as kernel function because it has excellent capability in modeling, and its expression is [9]:

$$K\left( x_i, x_j \right) = \exp\left( -\frac{\left\| x_i - x_j \right\|^2}{\sigma^2} \right) = \exp\left( -\gamma \left\| x_i - x_j \right\|^2 \right) \tag{6}$$

where γ is the kernel parameter.

In the training of the base SVM regressors for f-CaO content prediction, the inputs and output are X = {M, T, I, Ph, Pc} and Y = {f-CaO}, respectively. The performance of SVM is directly bound up with the regularization parameter c, the kernel parameter γ and the parameter ε of the loss function. In determining the learning parameters of SVM, one can adopt the trial-and-error method [31–33] or adopt optimization methods such as grid search [34,35] and genetic algorithm (GA) [36,37]. In the present work, we combine the trial-and-error method with GA to find the learning parameters of SVM. Firstly, a trial-and-error procedure is performed to determine the rough ranges of the parameters, i.e. c ∈ [0, 10], γ ∈ [0.25, 5] and ε ∈ [0, 0.25]. Then, these learning parameters are optimized by GA in the preset ranges. We use GA instead of the grid search algorithm, because GA consumes less computing time and is suitable for solving complex parameter optimization problems using merely three genetic operations of selection, crossover and mutation [37].

The optimization algorithm can be realized by the libsvm-mat toolbox (https://www.csie.ntu.edu.tw/~cjlin/), which was developed by Chih-Jen Lin. The maximum number of generations and the population size of GA are set to 250 and 20, respectively. The probability of crossover operation is set as 0.9 and the probability of mutation operation is set as 0.01. The individual fitness values are calculated by the fitness function defined as [37]:

$$f(i) = \frac{1}{n_g} \sum_{i=1}^{l} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \tag{7}$$

where f(i) represents the fitness function value of sample i, n_g denotes the number of samples in the validation dataset (in the training subsets, one of them is taken as validation dataset in turn, others are taken as training dataset). y_i and ŷ_i represent the actual values and prediction values of the validation dataset, respectively. The selection operation is performed by calculating the individual fitness values. Chromosomes with higher fitness values get more chances to survive and are more likely to recur in the new population. The selection process is repeated until 20 (the population size) members have been selected. The optimized learning parameters of the four base SVM regressors are listed in Table 6.
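The quantities in Eqs. (5)–(7) are simple enough to write out directly. The sketch below is illustrative only and is not the libsvm-mat implementation used by the authors: the function names are hypothetical, the kernel in the regression function is evaluated between each support vector and the query point (an assumption about how K(x_i, x_j) in Eq. (5) is meant), and the fitness is read as the mean absolute relative error over the validation samples.

```python
import math


def rbf_kernel(xi, xj, gamma):
    """Eq. (6): K(xi, xj) = exp(-gamma * ||xi - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


def eps_insensitive_loss(y, y_hat, eps):
    """Loss that ignores errors smaller than eps and grows linearly beyond it."""
    return max(0.0, abs(y - y_hat) - eps)


def svr_predict(x, support_vectors, dual_coefs, b, gamma):
    """Eq. (5), with dual_coefs[i] standing for (alpha_i - alpha_i^*)."""
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + b


def fitness(y_true, y_pred):
    """Eq. (7): mean absolute relative error on the validation samples."""
    return sum(abs((y - p) / y) for y, p in zip(y_true, y_pred)) / len(y_true)
```

A GA wrapper would call `fitness` once per candidate (c, γ, ε) triple after training a regressor on the remaining training subsets; that outer loop is omitted here because its details are not given in the text.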
Table 6 Table 7
Learning parameters of the base SVM regressors obtained by genetic algorithm. Values of cluster centers.
c γ ε Cluster centers M T I Ph Pc
SVM1 5.0 1.70 0.118 v1 0.16 0.35 0.65 0.51 0.61
SVM2 3.5 4.02 0.062 v2 0.79 0.38 0.62 0.47 0.57
SVM3 2.0 1.05 0.013 v3 0.74 0.57 0.34 0.61 0.57
SVM4 1.8 0.40 0.142 v4 0.70 0.66 0.47 0.44 0.39
Table 8
Comparison of testing datasets used in literature and our work.
Researchers Number of testing Percentage of data in a certain
samples range of f-CaO content
<0.5 0.5–1.5 >1.5
Lin B et al. [4] 2200 0% 100% 0%
Pani AK et al. [5,6] 110 1.8% 91.8% 6.4%
Li W et al. [7] 80 10.0% 65.0% 25%
Our work 100 3.0% 63.0% 34%
Fig. 6. (a) Distribution of the training data; (b) Distribution of the training data after FCM clustering.

3.3. Testing the ESVM model

For new testing samples, the prediction of f-CaO content is implemented with the following steps:
(1) Determine the cluster that the new sample belongs to by calculating the Euclidean distances between the sample and the four obtained cluster centers $\{v_1, v_2, v_3, v_4\}$. The cluster with the minimal Euclidean distance is then selected for the sample. Samples within the same cluster are considered to have similar process conditions.
(2) If the new sample belongs to a certain cluster $C_i$, the trained base SVM regressor $\mathrm{SVM}_i$ is then selected to predict the f-CaO content.

4. Results and discussions

4.1. Results of FCM clustering of the training dataset

Distributions of the original training data and the FCM clustering results are visualized (Fig. 6(a) and (b)) using the Radviz visualization technique [38]. Fig. 6 shows that the original distribution of the training data is imbalanced. After FCM clustering, four clusters ($C_1$, $C_2$, $C_3$, $C_4$) located in different areas are obtained. The distribution of each cluster is separated from the other clusters. Each cluster has its own distribution characteristics, which is conducive to the modeling of the base SVM learners and enhances the diversity of the learners.

The cluster centers after normalization are given in Table 7. The values of each cluster center are different from those of the other cluster centers. In particular, the cluster center for $M$ varies greatly, from 0.16 in the first cluster to 0.79 in the second cluster. Such variations can also be observed in the cluster centers of the other four variables (0.35–0.66 for $T$, 0.34–0.65 for $I$, 0.44–0.61 for $P_h$ and 0.39–0.61 for $P_c$). This indicates that each cluster has its own characteristics, which greatly benefits the modeling of the ESVM model.

4.2. Results of f-CaO content prediction and model performance

4.2.1. Testing samples

To validate the generalization capability of the ESVM model proposed in this work, 100 new samples (or measured values) are used for model testing. The data were collected over 4 successive days (from May 19th, 0:00 to May 23rd, 9:00) during continuous operation of the rotary kiln and were not used for model training. Compared to the testing data used in literature (see Table 8), our testing samples reflect a variety of clinker quality (0.3% < f-CaO < 3.3%), with 34% of the data in the range of f-CaO > 1.5% (unsatisfactory clinker quality). Such variety in the testing samples is well suited to evaluating model performance, that is to say, how well the model performs when it is supplied with data collected under quite different process conditions.

4.2.2. Predicted f-CaO content using ESVM model and comparison with other models

To evaluate the performance of the proposed ESVM model, the results predicted by the ESVM model are compared with manually measured values as well as with the values predicted by the available models that were used for f-CaO prediction in literature, including:
(1) Single SVM model. Compared to the proposed ESVM architecture, it has only one SVM regressor. For comparison, the single SVM model was trained using the same training dataset as the ESVM model.
(2) PLSR model. PLSR was employed for estimation of f-CaO content in literature [4]. For comparison, the PLSR model was trained using the same training dataset as the ESVM model.
(3) RBF neural network model. RBF was used for f-CaO content prediction in literature [5]. For comparison, RBF was trained using the same training dataset as the ESVM model.
(4) T–S fuzzy neural network model (T–S-FNN) [39]. It combines the advantages of neural networks and the T–S fuzzy inference technique. A similar technique was used in literature [6] to predict f-CaO content. In the present work, the T–S-FNN model was trained using the same training dataset as the ESVM model.

Three commonly used performance indices are adopted in the present work to evaluate model performance, i.e. mean squared error (MSE), Theil's inequality coefficient (TIC) [6] and correlation coefficient (R):

$$\mathrm{MSE} = \frac{\sum_{i=1}^{N} (\tilde{y}_i - y_i)^2}{N} \tag{8}$$

$$\mathrm{TIC} = \frac{\sqrt{\sum_{i=1}^{N} (y_i - \tilde{y}_i)^2}}{\sqrt{\sum_{i=1}^{N} y_i^2} + \sqrt{\sum_{i=1}^{N} \tilde{y}_i^2}} \tag{9}$$

$$R = \frac{\sum_{i=1}^{N} (y_i - \bar{y})(\tilde{y}_i - \bar{\tilde{y}})}{\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2 \sum_{i=1}^{N} (\tilde{y}_i - \bar{\tilde{y}})^2}} \tag{10}$$
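The three performance indices of Eqs. (8)–(10) can be computed as in the following minimal sketch. The function names and the short example series are hypothetical and are not data from this work; the correlation function assumes that neither series is constant.

```python
import math


def mse(y, y_pred):
    """Mean squared error, Eq. (8)."""
    return sum((p - m) ** 2 for m, p in zip(y, y_pred)) / len(y)


def tic(y, y_pred):
    """Theil's inequality coefficient, Eq. (9)."""
    num = math.sqrt(sum((m - p) ** 2 for m, p in zip(y, y_pred)))
    den = math.sqrt(sum(m * m for m in y)) + math.sqrt(sum(p * p for p in y_pred))
    return num / den


def corr_coef(y, y_pred):
    """Correlation coefficient R, Eq. (10); assumes non-constant series."""
    y_mean = sum(y) / len(y)
    p_mean = sum(y_pred) / len(y_pred)
    cov = sum((m - y_mean) * (p - p_mean) for m, p in zip(y, y_pred))
    var_y = sum((m - y_mean) ** 2 for m in y)
    var_p = sum((p - p_mean) ** 2 for p in y_pred)
    return cov / math.sqrt(var_y * var_p)


# Hypothetical measured and predicted f-CaO contents (in %), for illustration.
measured = [0.8, 1.2, 1.9, 2.5]
predicted = [1.0, 1.1, 1.6, 2.2]
scores = (mse(measured, predicted), tic(measured, predicted),
          corr_coef(measured, predicted))
```

A prediction that is offset from the measurement by a constant still yields R = 1 although its MSE and TIC are non-zero, which is why the three indices are reported together.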
Table 9
Performance of different models using the same testing samples.
Modeling methods                  MSE    TIC    R
PLSR model                        0.28   0.17   0.55
RBF model                         0.28   0.18   0.56
T–S-FNN model                     0.29   0.17   0.60
Single SVM model                  0.25   0.17   0.63
ESVM model in the present work    0.21   0.15   0.69
where $y_i$ and $\tilde{y}_i$ are the measured and model-predicted f-CaO content, respectively; $\bar{y}$ and $\bar{\tilde{y}}$ are the averages of $y_i$ and $\tilde{y}_i$, respectively; $N$ denotes the total number of testing samples. MSE is a frequently used index that represents the differences between the measured and predicted f-CaO contents. The smaller the value of MSE, the smaller the mean fitting error. The value of TIC varies from 0 to 1. The closer the TIC value is to 0, the more similar the prediction series is to the measurement series. R describes the degree of correlation between prediction and measurement, ranging from −1 to 1. The closer the value of R is to 1, the stronger the relationship between the predictions and the measured values.

The prediction capability of the ESVM model proposed in the present work is compared with the single SVM model, PLSR model, RBF model and T–S-FNN model in Figs. 7 and 8. MSE, TIC and R produced by the different models are given in Table 9. It is found that the PLSR model, RBF model and T–S-FNN model produce similar results (Fig. 7; 2nd, 3rd and 4th rows of Table 9), with MSE = 0.28, 0.28 vs. 0.29 and TIC = 0.17, 0.18 vs. 0.17. In comparison, the proposed ESVM model has the smallest MSE (MSE = 0.21), followed by the single SVM model with MSE = 0.25. Since the smallest MSE indicates that the mean squared error between model prediction and measurement is smallest, it can be concluded that the ESVM model proposed in our work outperforms the other 4 models. Considering that the ESVM model is tested with data sampled over 4 successive days of clinker production and that these data were not used for model training, the prediction accuracy of the ESVM model is relatively high.

Fig. 7. Comparison of measured f-CaO content with predicted results by ESVM model, PLSR model, RBF model and T–S-FNN model.

Fig. 8. Comparison of measured f-CaO content with predicted results by ESVM model and single SVM model.

Besides model accuracy, the generalization ability is also very important for the estimation model, and it can be evaluated by use of the correlation coefficient R [40]. It is clear from Table 9 that the ESVM model proposed in this work has better generalization capability than the other four models. The highest R value (R = 0.69) of the ESVM model is due to its ensemble architecture of several base SVM regressors that were trained by clustered process data.

The above analysis demonstrates that the proposed ESVM model has advantages both in generalization capability and in model accuracy. However, it can also be observed that relatively big deviations between prediction and measurement may occur for testing samples with extremely off-grade f-CaO content (>3% or <0.5%, Fig. 8). This situation is mainly caused by the fact that the local cement company seldom produces extremely off-grade cement clinkers. Consequently, fewer training data are available in such extreme ranges. The model prediction accuracy for such extreme cases can be further improved by collecting more training data in the future.

5. Conclusions

The f-CaO content in cement clinkers largely reflects the quality of cement produced in the cement plants. Due to the lack of hardware sensors and the harsh environment (high temperature, strong erosion, heavy dust etc.), data-driven soft sensor modeling has become an attractive approach for estimation of f-CaO content. In the present article, a novel support vector machine ensemble (ESVM) model was proposed to estimate f-CaO content, with the following features and advantages:
(1) The ESVM model consists of several base SVM regressors. Each base SVM regressor was trained using a clustered dataset obtained by the fuzzy c-means clustering algorithm. Therefore, process features at different operating conditions can be captured by the model, and the prediction accuracy of the model is thereby improved.
(2) As compared to soft sensor models available in literature, the ESVM model is relatively simple and needs only five variables as model inputs, including feed rate of the raw material, temperature in precalciner, secondary air pressure at kiln head, air pressure in the second chamber of grate cooler, and electrical current of the drive motor of the rotary kiln. These five variables are common process variables that can be directly measured online. Results demonstrated that they are sufficient to build an f-CaO content estimation model.
(3) The effect of the time lag between process variables and f-CaO content was considered in building the model by processing the data with a time-series matching technique.
(4) As compared to soft sensor models available in literature, the proposed ESVM model was tested with new samples that reflect a wider range of clinker quality (f-CaO content ranged between 0.3% and 3.3%). The testing samples were collected over 4 successive days during continuous operation of the rotary kiln and were not used for model training.
(5) Compared to the single SVM model, PLSR model, RBF model and T–S-FNN model, the proposed ESVM model has the lowest mean square error and the highest correlation coefficient when tested on the validation dataset, indicating that it has improvements both in prediction accuracy and in generalization capability.

The proposed ESVM model has an excellent application prospect in cement clinker production lines for automatic monitoring of f-CaO content, enabling plant operators to maintain the quality of cement clinkers.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by National Natural Science Foundation of China [No. 61973108], Hunan Key Laboratory of Intelligent Robot Technology in Electronic Manufacturing [IRT.2018001].

References

[1] Mikulčić H, Vujanović M, Fidaros DK, Priesching P, Minić I, Tatschl R, et al. The application of CFD modelling to support the reduction of CO2 emissions in cement industry. Energy 2012;45:464–73.
[2] Chen H, Zhang X, Hong P, Hu H, Yin X. Recognition of the temperature condition of a rotary kiln using dynamic features of a series of blurry flame images. IEEE Trans Ind Inform 2016;12(1):148–57.
[3] Vogelbacher M, Waibel P, Matthes J, Keller HB. Image-based characterization of alternative fuel combustion with multifuel burners. IEEE Trans Ind Inform 2018;14(2):588–97.
[4] Lin B, Recke B, Knudsen JK, Jørgensen SB. A systematic approach for soft sensor development. Comput Chem Eng 2007;31(5):419–25.
[5] Pani AK, Vadlamudi VK, Mohanta HK. Development and comparison of neural network based soft sensors for online estimation of cement clinker quality. ISA Trans 2013;52(1):19–29.
[6] Pani AK, Mohanta HK. Online monitoring of cement clinker quality using multivariate statistics and Takagi-Sugeno fuzzy-inference technique. Control Eng Pract 2016;57:1–17.
[7] Li W, Wang D, Chai T. Multisource data ensemble modeling for clinker free lime content estimate in rotary kiln sintering processes. IEEE Trans Syst Man Cybernet: Syst 2015;45(2):303–14.
[8] Li W, Wang D, Chai T. Burning state recognition of rotary kiln using ELMs with heterogeneous features. Neurocomputing 2013;102:144–53.
[9] Lv Y, Liu J, Yang T, Zeng D. A novel least squares support vector machine ensemble model for NOx emission prediction of a coal-fired boiler. Energy 2013;55:319–29.
[10] Fan G-F, Peng L-L, Hong W-C, Sun F. Electric load forecasting by the SVR model with differential empirical mode decomposition and auto regression. Neurocomputing 2016;173:958–70.
[11] Khormali A, Addeh J. A novel approach for recognition of control chart patterns: Type-2 fuzzy clustering optimized support vector machine. ISA Trans 2016;63:256–64.
[12] Bakdi A, Kouadri A, Bensmail A. Fault detection and diagnosis in a cement rotary kiln using PCA with EWMA-based adaptive threshold monitoring scheme. Control Eng Pract 2017;66:64–75.
[13] Meyer V, Pisch A, Penttilä K, Koukkari P. Computation of steady state thermochemistry in rotary kilns: Application to the cement clinker manufacturing process. Chem Eng Res Des 2016;115:335–47.
[14] Lima RN, de Almeida GM, Braga AP, Cardoso M. Trend modelling with artificial neural networks. Case study: Operating zones identification for higher SO3 incorporation in cement clinker. Eng Appl Artif Intell 2016;54:17–25.
[15] Sharifi A, Aliyari Shoorehdeli M, Teshnehlab M. Identification of cement rotary kiln using hierarchical wavelet fuzzy inference system. J Franklin Inst 2012;349(1):162–83.
[16] Wu W, Liu X, Xu X, Jin J, Zhang M. Time series analysis method for the soft measurement of cement clinker quality. Control Theory Appl 2018;35(7):1029–36. http://dx.doi.org/10.7641/cta.2017.70501, [in Chinese].
[17] Li C, Jian M. The applications of the instrument for measuring powder retention time online in studies of preheaters. J Wuhan Univ Technol 1991;13(3):54–9, [in Chinese].
[18] Chen Z, Lu H, Peng J, Dou H, Shi L, Huang J. Numerical simulation of the gas solid two-phase flow in a reinforced suspension precalciner. J Chin Ceram Soc 2006;34(1):54–8, [in Chinese].
[19] Sullivan JD, Maier CG, Ralson OC. Passage of solid particles through rotary cylindrical kilns. US Bur Mines Tech Pap 1927.
[20] Bogiatzidis IX, Safacas AN, Mitronikas ED. Detection of backlash phenomena appearing in a single cement kiln drive using the current and the electromagnetic torque signature. IEEE Trans Ind Electron 2013;60(8):3441–53.
[21] Xu X. Cement clinker f-CaO soft sensor model based on time series analysis and support vector machine (Master thesis), Hunan University; 2017, [in Chinese].
[22] Li W, Wang D, Zhou X, Chai T. An improved multi-source based soft sensor for measuring cement free lime content. Inform Sci 2015;323:94–105.
[23] Izakian H, Pedrycz W. Agreement-based fuzzy C-means for clustering data with blocks of features. Neurocomputing 2014;127:266–80.
[24] Jie L, Liu W, Sun Z, Teng S. Hybrid fuzzy clustering methods based on improved self-adaptive cellular genetic algorithm and optimal-selection-based fuzzy c-means. Neurocomputing 2017;249:140–56.
[25] Xu R, Wunsch DC II. Survey of clustering algorithms. IEEE Trans Neural Netw 2005;16(3):645–78.
[26] Choudhury S, Ghosh S, Bhattacharya A, Fernandes KJ, Tiwari MK. A real time clustering and SVM based price-volatility prediction for optimal trading strategy. Neurocomputing 2014;131:419–26.
[27] Bai X, Wang W. Saliency-SVM: An automatic approach for image segmentation. Neurocomputing 2014;136:243–55.
[28] Pan X, Pang X, Wang H, Xu Y. A safe screening based framework for support vector regression. Neurocomputing 2018;287:163–72.
[29] Wang T, Qi J, Xu H, Wang Y, Liu L, Gao D. Fault diagnosis method based on FFT-RPCA-SVM for cascaded-multilevel inverter. ISA Trans 2016;60:156–63.
[30] Vapnik V. The nature of statistical learning theory. New York: Springer-Verlag; 1995.
[31] Borhani TNG, Bagheri M, Manan ZA. Molecular modeling of the ideal gas enthalpy of formation of hydrocarbons. Fluid Phase Equilib 2013;360(9):423–34.
[32] Gholami AR, Shahbazian M. Soft sensor design based on fuzzy c-means and rfn_svr for a stripper column. J Nat Gas Sci Eng 2015;25:23–9.
[33] Yu L, Wang S, Lai KK. Credit risk assessment with a multistage neural network ensemble learning approach. Expert Syst Appl 2008;34(2):1434–44.
[34] Zhang Z, Wen G, Chen S. Audible sound-based intelligent evaluation for aluminum alloy in robotic pulsed GTAW: mechanism, feature selection and defect detection. IEEE Trans Ind Inform 2018;14(7):2973–83.
[35] Gao X, Hou J. An improved SVM integrated GS-PCA fault diagnosis approach of Tennessee Eastman process. Neurocomputing 2016;174:906–11.
[36] Kumar A, Kumar R. Time-frequency analysis and support vector machine in automatic detection of defect from vibration signal of centrifugal pump. Measurement 2017;108:119–33.
[37] Gu J, Zhu M, Jiang L. Housing price forecasting based on genetic algorithm and support vector machine. Expert Syst Appl 2011;38(4):3383–6.
[38] Rubio-Sánchez M, Raya L, Díaz F, Sanchez A. A comparative study between RadViz and Star Coordinates. IEEE Trans Vis Comput Graph 2016;22(1):619–28.
[39] Han M, Sun Y, Fan Y. An improved fuzzy neural network based on T–S model. Expert Syst Appl 2008;34(4):2905–20.
[40] Ticknor JL, Hsu-Kim H, Deshusses MA. A robust framework to predict mercury speciation in combustion flue gases. J Hazard Mater 2014;264(2):380–5.
