International Journal of Oil, Gas and Coal Technology
net/publication/336450928
All content following this page was uploaded by Mohamed Bin Shams on 29 July 2020.
1 Introduction
Operational excellence (OE) has become a vital tool for continuous improvement in modern
industrial plants (Davis et al., 2012). At the heart of this relatively new operating
philosophy is incident prevention. The use of statistical process control tools to monitor
and troubleshoot deviations from normal operating conditions aligns with OE objectives.
With advances in computer technology, process monitoring has become much easier and
more efficient than ever before. In the chemical process industry, huge amounts of data
are collected and stored for uses such as inspection, monitoring and process improvement.
Process monitoring, fault detection and diagnosis using traditional statistical tools
become more challenging when the number of process variables is large and the variables
are cross-correlated. This growth in the number of variables makes classical inspection
approaches inefficient.
Recently, principal component analysis (PCA), partial least squares (PLS) and their
variants have been used successfully in several industrial applications (Chiang et al.,
2001; Miletic et al., 2004). What makes these multivariate techniques attractive today is
the availability of commercial software that hides most of the sophisticated mathematics
while providing the user with tools that are easy to build and interpret (MacGregor
et al., 2015). The wide spectrum of process characteristics has motivated the development
of different multivariate statistical monitoring schemes, such as static, dynamic and
adaptive monitoring and troubleshooting tools (Bin Shams et al., 2011; Albazzaz and Wang,
2007; AlGhazzawi and Lennox, 2008; Yin et al., 2002; Yuan and Wang, 2001). Bersimis et al.
(2007) provided an overview of common multivariate process monitoring techniques, e.g.,
MEWMA and MCUSUM. Although their overview is thorough, it seems to overlook the
applicability of such techniques to large industrial processes characterised by large
numbers of noisy process variables with many missing observations. As a multivariate
projection method, PCA finds the hidden (or latent) independent factors that truly drive
the observed process behaviour. Typically, the number of latent variables is small
compared with the number of original variables measured in the process; PCA therefore
provides a more compact representation that facilitates analysis of the underlying
process. Latent variables are calculated as linear combinations of the original process
variables that contribute significantly to the overall process variation. For example, it
is common to have hundreds or more measured
348 M. Bin Shams et al.
variables (both on-line and from analytical and quality control laboratories). However,
only a small number of hidden variables drive the observed behaviour. This paper aims to
provide the necessary fundamentals and to demonstrate, through commercially available
software, the proficiency of PCA-based multivariate monitoring tools for fault detection
and troubleshooting, with application to a local refinery’s hydrogen plant. This is
achieved by:
1 introducing the underlying concepts and related mathematics through a simple example
2 realising the benefits of PCA for monitoring and troubleshooting through a real case
study, namely, a hydrogen plant from a local refinery in Bahrain
3 emphasising the maturity of such techniques with regard to the availability of
user-friendly commercial software such as Aspen ProMV™ (AspenTech, 2018).
The paper is organised as follows. In Section 2, definitions and theoretical background
related to PCA are presented. The details of implementing and interpreting PCA results
are demonstrated through a simple example. To illustrate the applicability of PCA for
monitoring and troubleshooting in an industrial setting, a real case study from a
hydrogen plant at a local refinery is presented in Section 3, followed by conclusions.
Most modern chemical and petrochemical plants are operated under closed-loop control with
intensive use of recycle and heat integration schemes. Process variables generated from
these economically and optimally designed plants are auto-correlated and
cross-correlated. Most traditional statistical methods, e.g., multiple linear regression
(MLR), are based on the assumption that variables are neither auto- nor cross-correlated.
Since this assumption of independence is rarely satisfied in practice, such traditional
statistical models are rarely used by process operation engineers. The cross-correlation
problem can be addressed by methods such as principal component regression (PCR), PCA and
PLS. A small set of new variables, known as latent variables, that truly drive the
observed process behaviour replaces the original variables. The idea behind PCA is to
exploit the cross-correlation between the process variables in order to obtain a simpler
representation of the process variation. This facilitates the interpretation of the
original variables so that informative insights can be drawn. The first principal
component captures the largest share of the variability in the dataset. The second
principal component captures the largest share of the variability not accounted for by
the first, and so on. The operational data, usually collected from a data historian,
e.g., OSIsoft PI System, are stacked into a matrix of size m by n, where m is the number
of observations and n is the number of variables. A smaller number of new variables,
denoted by ‘a’, is chosen, where ‘a’ is less than ‘n’. The ‘a’ new variables contain most
of the information hidden in the data matrix regarding how the original variables are
correlated. They are found by determining the directions of the most important variation
in the data. A data matrix X is constructed by concatenating the variable vectors, e.g.,
for three variables X = [x1 x2 x3]. As shown in Figure 1,
Process monitoring and troubleshooting of a local refinery’s hydrogen plant 349
although the data are three-dimensional, most of the variation in the data can be
adequately represented using only two dimensions. The fact that the variation of three
variables can be adequately represented by two variables (or two principal components) is
what enables PCA to reduce the dimensionality of the original data. In other words, PCA
searches for new coordinate axes that explain the maximum variation in the data. As shown
in Figure 1, these two components are orthogonal and are commonly denoted as T = [t1 t2].
Figure 1 Three variables are represented adequately using two principal components (see online
version for colours)
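The dimensionality reduction described above can be illustrated with a minimal sketch. The code below is not the implementation used in this study; it assumes NumPy, a singular value decomposition of the auto-scaled data, and a synthetic three-variable data set in which two hidden drivers generate all three measurements. The function name `pca_scores` and the simulated data are hypothetical.

```python
import numpy as np

def pca_scores(X, a):
    """Auto-scale X (m x n) and return scores T (m x a), loadings P (n x a),
    all eigenvalues of the correlation matrix and the explained fractions."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    eigvals = s ** 2 / (Xs.shape[0] - 1)
    P = Vt[:a].T                 # loading matrix
    T = Xs @ P                   # score matrix T = [t1 ... ta]
    explained = eigvals / eigvals.sum()
    return T, P, eigvals, explained

# Hypothetical data: two hidden drivers z generate three measured variables.
rng = np.random.default_rng(0)
m = 500
z = rng.normal(size=(m, 2))
x1 = z[:, 0]
x2 = z[:, 1]
x3 = 0.6 * z[:, 0] + 0.8 * z[:, 1] + 0.05 * rng.normal(size=m)
X = np.column_stack([x1, x2, x3])

T, P, eigvals, explained = pca_scores(X, a=2)
print("fraction of variance per component:", explained.round(3))
```

In this synthetic case the first two components should carry nearly all of the variance, and the score vectors t1 and t2 are uncorrelated, which mirrors the geometric picture in Figure 1.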
for building the PCA model. As indicated by the authors, the best performance was
obtained when CDC was combined with multivariate trimming (MVT). In the same study, the
authors suggested the use of robust, modified scaling rather than the standard
auto-scaling (i.e., subtracting the mean and dividing by the standard deviation). Robust
scaling becomes especially important when multiple outliers are present and the mean is
no longer an accurate representative of the bulk of the data. Since CDC has been used in
the current study to extract normal operating data, it is explained briefly here. In CDC,
it is hypothesised that normal operating data are the data at which the process remains
steady for as long as possible. Therefore, CDC finds normal operating data by calculating
the distance of each observation, i.e., a single row of the X matrix, from the centre
(the mean vector of the data). The longer the period of operation at a certain
steady-state level, the more observations from this level exist in the data matrix X and
the more influence they have on the mean vector of the data. Therefore, observations
belonging to this period of operation have smaller distances to the mean vector. When
performing CDC, the data matrix X is first auto-scaled, and then the distance of each
measurement vector from the data mean vector is calculated. Either the Euclidean distance
(CDC2) or the maximum-norm distance (CDCm) can be used to quantify the distance. After
ranking the observations by distance, the first half are taken as the observations
closest to the centre of the data. MATLAB® was used to perform CDC on the historical data
before importing the data into Aspen ProMV™ (AspenTech, 2018). As mentioned earlier, when
extracting normal data in the presence of multiple operating conditions and outliers,
robust scaling is highly recommended. In robust scaling, the median is used in place of
the mean, and the standard deviation is replaced by the median absolute deviation (MAD)
from the median, defined as

$$s_{MAD} = 1.4826\,\operatorname{median}_i\{|x_i - x_{median}|\}$$

where x_median is the median of variable x. Once the normal-condition data are
identified, they are placed in the X matrix.
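The CDC procedure and the robust scaling described above can be sketched as follows. This is a minimal illustration assuming NumPy, not the MATLAB® code used in the study; the function names, the `keep` fraction argument and the two-level synthetic data are hypothetical. After scaling, the centre of the data lies at the origin, so the distance to the centre is simply the norm of each scaled row.

```python
import numpy as np

def auto_scale(X):
    """Standard auto-scaling: subtract the mean, divide by the standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def robust_scale(X):
    """Robust scaling: median in place of the mean, s_MAD in place of the std."""
    med = np.median(X, axis=0)
    s_mad = 1.4826 * np.median(np.abs(X - med), axis=0)
    return (X - med) / s_mad

def cdc_select(X, norm="euclidean", robust=False, keep=0.5):
    """Return the sorted row indices of the `keep` fraction of observations
    closest to the centre of the scaled data."""
    Xs = robust_scale(X) if robust else auto_scale(X)
    if norm == "euclidean":            # CDC2
        d = np.linalg.norm(Xs, axis=1)
    elif norm == "max":                # CDCm
        d = np.abs(Xs).max(axis=1)
    else:
        raise ValueError("norm must be 'euclidean' or 'max'")
    n_keep = int(np.ceil(keep * X.shape[0]))
    return np.sort(np.argsort(d)[:n_keep])

# Hypothetical history: 300 rows at the dominant steady state, 100 rows at another level.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(300, 3)),
               rng.normal(10.0, 1.0, size=(100, 3))])

idx_auto = cdc_select(X, norm="euclidean")
idx_robust = cdc_select(X, norm="max", robust=True)
X_normal = X[idx_robust]               # rows that would be placed in the X matrix
```

In this example the retained half of the observations comes from the longer steady-state period, which is the behaviour the CDC hypothesis relies on.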
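$$SPE_\alpha = \Theta_1\left[1 + \frac{c_\alpha h_0\sqrt{2\Theta_2}}{\Theta_1} + \frac{\Theta_2 h_0(h_0 - 1)}{\Theta_1^2}\right]^{1/h_0}$$

$$\Theta_i = \sum_{j=a+1}^{n}\lambda_j^{\,i}; \quad \text{for } i = 1, 2, 3 \qquad (1)$$

$$h_0 = 1 - \frac{2\Theta_1\Theta_3}{3\Theta_2^2}$$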
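As a worked illustration of equation (1), the sketch below evaluates the SPE limit from a set of eigenvalues. It assumes that c_α is the standard normal deviate at the (1 − α) level, which is the usual convention for this limit but is stated here as an assumption; the function name `spe_limit` and the eigenvalues are hypothetical.

```python
from statistics import NormalDist
import numpy as np

def spe_limit(eigvals, a, alpha=0.01):
    """SPE control limit from equation (1).
    eigvals: all n eigenvalues in decreasing order; a: number of retained components."""
    lam = np.asarray(eigvals, dtype=float)[a:]      # discarded eigenvalues, j = a+1..n
    theta1, theta2, theta3 = (np.sum(lam ** i) for i in (1, 2, 3))
    h0 = 1.0 - 2.0 * theta1 * theta3 / (3.0 * theta2 ** 2)
    # Assumption: c_alpha is the standard normal deviate at the (1 - alpha) level.
    c_alpha = NormalDist().inv_cdf(1.0 - alpha)
    bracket = (1.0
               + c_alpha * h0 * np.sqrt(2.0 * theta2) / theta1
               + theta2 * h0 * (h0 - 1.0) / theta1 ** 2)
    return theta1 * bracket ** (1.0 / h0)

# Hypothetical eigenvalues of a five-variable model with two retained components.
eigvals = [3.0, 1.5, 0.3, 0.15, 0.05]
limit_99 = spe_limit(eigvals, a=2, alpha=0.01)
limit_95 = spe_limit(eigvals, a=2, alpha=0.05)
print(limit_95, limit_99)
```

A tighter significance level gives a higher limit, so the 99% limit exceeds the 95% limit, and both exceed Θ1, the sum of the discarded eigenvalues.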
A PCA model is built using q − 1 parts. The model is then used to predict the omitted
part and the prediction error sum of squares (PRESS) is calculated. These steps are
repeated for each of the q parts using ‘a’ components. PRESS(a) is then the sum of the
PRESS values over all q parts for the given ‘a’ components. If PRESS(a) decreases
significantly from PRESS(a − 1), the new component is kept; otherwise, no further
components are added.
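The cross-validation loop described above can be sketched as follows. This is a simplified row-wise variant written for illustration, assuming NumPy; it is not necessarily the scheme implemented in the commercial software. The function name `press_curve`, the number of parts q = 5 and the synthetic data are hypothetical. Because the held-out rows are projected onto nested subspaces, PRESS(a) in this variant never increases with ‘a’, which is why the size of the decrease, not its sign, is examined.

```python
import numpy as np

def press_curve(X, max_a, q=5):
    """Row-wise cross-validated PRESS(a) for a = 1..max_a."""
    m = X.shape[0]
    parts = np.array_split(np.arange(m), q)
    press = np.zeros(max_a)
    for held_out in parts:
        train = np.setdiff1d(np.arange(m), held_out)
        # Scaling parameters and loadings come from the q - 1 retained parts only.
        mu = X[train].mean(axis=0)
        sd = X[train].std(axis=0, ddof=1)
        X_train = (X[train] - mu) / sd
        X_test = (X[held_out] - mu) / sd
        _, _, Vt = np.linalg.svd(X_train, full_matrices=False)
        for a in range(1, max_a + 1):
            P = Vt[:a].T
            E = X_test - X_test @ P @ P.T     # prediction error on the omitted part
            press[a - 1] += np.sum(E ** 2)
    return press

# Hypothetical data: five measured variables driven by two latent variables.
rng = np.random.default_rng(0)
Z = rng.normal(size=(400, 2))
W = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0, 1.0]])
X = Z @ W + 0.1 * rng.normal(size=(400, 5))

press = press_curve(X, max_a=4, q=5)
print(press.round(2))
```

For these data the drop from PRESS(1) to PRESS(2) is large while later drops are small, so two components would be retained under the rule stated above.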
For the T2 statistic, the contribution of variable j for ‘a’ (a < n) principal components
at each sampling instant i is given by (Bin Shams et al., 2011):
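$$\mathrm{Cont}_{ij} = \sum_{k=1}^{a}\left[\frac{p_{jk}^2}{\lambda_k}x_{ij}^2 + \frac{2p_{jk}}{\lambda_k}\sum_{\substack{r=1\\ r\neq j}}^{n}p_{rk}x_{ir}x_{ij} + \frac{1}{\lambda_k}\left(\sum_{\substack{r=1\\ r\neq j}}^{n}p_{rk}x_{ir}\right)^2\right] \qquad (2)$$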
where p_jk and λ_k are the (j, k) element of the loading matrix P and the kth eigenvalue,
respectively. The total contribution of variable j is summed over an operating-time
window. The first term in equation (2) involves only variable j, while the second term
consists of cross products between variable j and the rest of the variables. The last
term does not contain x_ij. On the other hand, the contribution of variable j to the SPE
statistic at each sampling instant i is given as:

$$\mathrm{Cont}_{ij} = e_{ij}^2$$

where e_ij is the residual of variable j at sampling instant i, i.e., the part of x_ij
not captured by the PCA model.
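The terms of equation (2) and the SPE contributions can be checked numerically with the minimal sketch below. It assumes NumPy and auto-scaled data; the function names, the synthetic five-variable data and the injected offset are hypothetical. The three terms of equation (2) are returned separately so that the part depending on x_ij can be inspected on its own; summed together they reproduce the T2 value of the sample.

```python
import numpy as np

def fit_pca(X, a):
    """Auto-scale X and return (mean, std, loadings P, eigenvalues) for a components."""
    mu, sd = X.mean(axis=0), X.std(axis=0, ddof=1)
    Xs = (X - mu) / sd
    _, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    lam = s[:a] ** 2 / (X.shape[0] - 1)
    return mu, sd, Vt[:a].T, lam

def t2_terms(x, P, lam):
    """Three terms of equation (2) for one scaled sample x, one value per variable j."""
    own = P * x[:, None]                # own[j, k]  = p_jk * x_ij
    t = own.sum(axis=0)                 # t[k]       = sum over all r of p_rk * x_ir
    rest = t[None, :] - own             # rest[j, k] = sum over r != j of p_rk * x_ir
    term1 = (own ** 2 / lam).sum(axis=1)
    term2 = (2.0 * own * rest / lam).sum(axis=1)
    term3 = (rest ** 2 / lam).sum(axis=1)
    return term1, term2, term3

def spe_contributions(x, P):
    """Squared residual e_ij^2 of each variable for one scaled sample x."""
    e = x - P @ (P.T @ x)
    return e ** 2

# Hypothetical calibration data: five variables driven by two latent variables.
rng = np.random.default_rng(1)
Z = rng.normal(size=(500, 2))
W = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0, 1.0]])
X = Z @ W + 0.1 * rng.normal(size=(500, 5))
mu, sd, P, lam = fit_pca(X, a=2)

# Hypothetical fault: an offset on the fourth variable that breaks its correlation.
x_new = (X[0] - mu) / sd
x_new[3] += 8.0

term1, term2, term3 = t2_terms(x_new, P, lam)
t2 = np.sum((x_new @ P) ** 2 / lam)
c_spe = spe_contributions(x_new, P)
print("largest SPE contribution from variable index", int(np.argmax(c_spe)))
```

In this example the SPE contributions sum to the SPE of the sample and the largest one points to the variable that received the offset.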
The steps for process monitoring and troubleshooting using PCA are summarised in
Figure 3.
where w is white noise with unit variance. The 2,000 simulated points were used to
calibrate the PCA model and represent the normal operating condition. There is therefore
no need to use CDC here, since only a single steady state exists. Using cross-validation,
two components were found adequate to capture the variation in the data. Three types of
faults were applied (from 200 to 800): case (a), exceeding the limits; case (b), broken
correlation; and case (c), exceeding the limits and broken correlation.
Figure 2 (a) Process with two measured variables (b) x1 versus x2 (see online version for colours)
[Region labels in the figure: T2; T2 & SPE; SPE]
Figure 3 Steps required to implement PCA for process monitoring and troubleshooting
(see online version for colours)
Figure 4 (a) T2 chart for case (a) (b) SPE chart for case (a) (see online version for colours)
Figure 5 (a) T2 contribution plot for case (a) (b) SPE contribution plot for case (a) (see online
version for colours)
Figure 6 (a) SPE chart for case (c) (b) SPE chart for case (c) (see online version for colours)
Figure 7 (a) T2 contribution for case (c) (b) SPE contribution for case (c) (see online version
for colours)
Owing to its significance for hydrotreating processes, the hydrogen plant is one of the
most important units in the refinery. Moreover, the smooth and safe operation of the
hydrogen plant in the local refinery is critical since it is closely connected to the
operation of other plants, e.g., the hydrocracking unit, sulphur recovery unit, lube base
oil plant and low sulphur diesel plant. The hydrogen plant produces hydrogen from natural
gas. The feed (Khuff gas) contains mainly methane (CH4) 80%, carbon dioxide (CO2) 6% and
nitrogen (N2) 11%; the rest is heavier hydrocarbons (C2-C6). The H2 plant is divided into
four sections: sulfinol, reformer, two converters and methanator. The main objective of
the sulfinol section is to remove organic sulphur compounds, CO2 and other impurities
such as H2S. First, the feed gas (Khuff) enters the plant at almost 1,150 psig and its
pressure is then dropped to 525 psig across a large valve. The feed gas enters the bottom
of the sulfinol absorber column while the regenerated sulfinol solvent is introduced at
the top. The sulfinol absorber removes almost all the sulphur compounds and CO2 from the
feed gas. Next, the gas effluent from the absorber enters the water-wash column, where
the solvent is washed from the gas, and subsequently passes into two drums containing two
types of catalyst, cobalt-molybdenum and zinc-oxide, distributed in two beds. The
cobalt-molybdenum catalyst converts residual sulphur to H2S, which is adsorbed by the
zinc-oxide bed. In the reformer, the sweetened gas and steam pass through the tubes, and
hydrogen is produced by heating the mixture in the presence of the catalyst to
approximately 1,450ºF, where the following overall reactions occur:
$$\mathrm{CH_4(g) + H_2O(g)} \xrightarrow{\text{Heat and catalyst}} \mathrm{CO(g) + 3H_2(g)} \quad [\text{Endothermic}]$$

$$\mathrm{CO(g) + H_2O(g)} \xrightarrow{\text{Heat and catalyst}} \mathrm{CO_2(g) + H_2(g)} \quad [\text{Exothermic}]$$
The product mixture resulting from these two reactions contains 70% H2, 10% CO2, 10%
carbon monoxide (CO) and 4% methane (CH4); the rest is traces of other gases. In the
presence of steam and specialised catalysts, CO is further converted to CO2 in two
reactors, namely, the high temperature shift (HTS) and low temperature shift (LTS)
converters, where the following reaction takes place:
$$\mathrm{CO(g) + H_2O(g)} \xrightarrow{\text{Catalyst}} \mathrm{CO_2(g) + H_2(g)} + \text{Heat}$$
Since it is easier to remove carbon dioxide than carbon monoxide, CO must be converted to
CO2 before its removal in the carbonate solution section. The aim of the carbonate
section is to remove CO2 from the product stream. Since the hydrogen is eventually routed
to the hydrogen desulphurisation unit (HDU), it must contain the least possible amount of
CO and CO2. This is important to prevent temperature runaway in the reactor. In addition,
the presence of CO and CO2 causes unwanted side reactions. The required purity is
achieved in the methanator section, where the small amounts of carbon oxides that have
not been absorbed in the carbonate section are converted into methane. The methanator
reactor has a nickel catalyst, over which the following highly exothermic reactions
occur:
$$\mathrm{CO(g) + 3H_2(g)} \xrightarrow{\text{Catalyst}} \mathrm{CH_4(g) + H_2O(g)} + \text{Heat}$$
$$\mathrm{CO_2(g) + 4H_2(g)} \xrightarrow{\text{Catalyst}} \mathrm{CH_4(g) + 2H_2O(g)} + \text{Heat}$$
After the methanator, the main component of the product stream is hydrogen gas. The
hydrogen-rich stream is routed through a fin-fan cooler, a heat exchanger and a knockout
drum to separate the condensed steam from the product stream. The hydrogen stream is then
routed to the hydrogen desulphurisation unit.
Figure 8 The methanator section in the hydrogen plant (see online version for colours)
[Figure annotation: valve 64HC109; valve failure (instrument air filter)]
Figure 9 (a) T2 for the hydrogen plant (b) SPE for the hydrogen plant (see online version
for colours)
Figure 10 (a) Contribution plot for T2 monitoring chart of the hydrogen plant (b) Contribution
plot for SPE monitoring chart of the hydrogen plant (see online version for colours)
Although 61 variables from the four sections of the hydrogen plant were used for
monitoring, the T2- and SPE-based contribution plots make troubleshooting, that is,
identifying the variables most correlated with the fault, an easier task. Instead of
examining 61 variables to diagnose the detected abnormality, the plant operator can now
focus on only a limited number of variables. Combined with the operators’ process
knowledge, this allows prompt and informative troubleshooting. Aspen ProMV™ (AspenTech,
2018) allows the operator to inspect every variable with a significant contribution by
double-clicking on the corresponding variable. For example, we observed that the
temperature before the methanator increased rapidly to 600ºF, stayed there for some time,
decreased to 400ºF and then returned to the normal condition (524ºF). Similar deviations
from normal operating levels were observed for 64PDI43, 64PC11, 64PC12, 64FI25 and
64FC21 (not shown here for brevity). While most of these variables are from the
methanator section, 64FC21 belongs to the sulfinol section (see Table 1). Once the
variables in Table 1 were identified, the results were discussed with the plant personnel
to validate the PCA monitoring and troubleshooting findings. The problem was identified
as a failure in the methanator feed control valve (see Figure 8). Specifically, the glass
of the instrument air filter was broken and the instrument air was vented. Consequently,
no air was available to actuate the valve (64HC109). Since the valve is air-to-open
(fail-close), the control valve went to the safe closed position (~90% closed).
Following the valve malfunction, the following symptoms were observed. The methanator
feed temperature (64TI207) rose from 521ºF to 600ºF. This can be understood by
considering Figure 8: the heat exchanger used to heat the feed to the methanator was
affected by the valve failure. Following the failure, the cold feed stream was trapped in
the methanator feed line and was continuously heated by the hot effluent from the HTS
converter, which caused 64TI207 to increase. In addition, the pressure downstream of the
methanator (64PC12) decreased sharply from 320 to 110 psig (see Figure 8) owing to the
decrease in the effluent from the methanator. The flow indicator for the methanator
outlet (64FI25) also decreased for the same reason. Upstream of the heat exchanger, the
pressure (64PC11) became very high and the relief valve opened to atmosphere to protect
the unit from damage. This was evident in the temperature time series, where the
temperature started to fall after a sharp increase. The supervisor also mentioned that
the methanator valve failure affected not only the methanator section but also the
sulfinol section. The solvent inlet flow to the sulfinol absorber (64FC21) decreased
because the solvent was unable to enter the column owing to the increased pressure inside
the column.
Table 1 Description of the variables that contributed to the valve failure problem in Figure 10
As can be seen, the process behaviour after the valve failure was very close to what the
PCA-based T2, SPE and contribution plots highlighted. Interestingly, while most of the
variables correlated with the valve failure are physically located in the methanator
section, the multivariate correlative feature of PCA made it possible to identify other
variables that are physically distant from the problem, yet correlated with the rest of
the variables. Although the simultaneous occurrence of faults has not been considered in
the current work, it is worth mentioning that the detection statistics used in the
current study, i.e., T2 and SPE, are equally applicable to the simultaneous case and will
trigger an alarm whenever an anomaly is observed. However, the effectiveness of
contribution plots is limited to simple faults, e.g., sensor and actuator faults, such as
the type of fault detected in the current work. That is, contribution plots are not
sufficient to isolate faults accurately when the measured variables behave similarly
during different faults or when simultaneous faults occur. In such cases, a fault library
built from the data historian is required to isolate the problem precisely (Bin Shams
et al., 2011).
Figure 11 (a) Production before (1) and after (2) the fault (b) Savings: early (1) versus
delayed (2) detections (see online version for colours)
4 Conclusions
The premise and preliminaries of using PCA-based tools for fault detection and diagnosis
have been introduced using a simple example. The proficiency of PCA for monitoring and
troubleshooting faults has been further demonstrated on a local refinery’s hydrogen
plant. PCA is especially powerful when large numbers of correlated variables are
encountered, which is a characteristic of most chemical process industries. While
automated monitoring and troubleshooting tools are vital for modern industries, they are
by no means a replacement for a well-trained plant engineer. Maximum benefit is realised
when multivariate statistics-based tools are used along with process knowledge.
Nevertheless, multivariate statistics are still useful on their own, especially when an
operator has little troubleshooting experience. The implementation of such tools has
become easy and quick, especially with the availability of commercial software such as
Aspen ProMV™ (AspenTech, 2018). The economic and environmental benefits associated with
applying multivariate statistics such as PCA for monitoring and troubleshooting justify
their use.
References
Albazzaz, H. and Wang, X. (2007) ‘Introduction of dynamics to an approach for batch process
monitoring using independent component analysis’, Chemical Engineering Communications,
Vol. 194, No. 2, pp.218–233.
AlGhazzawi, A. and Lennox, B. (2008) ‘Monitoring a complex refining process using multivariate
statistics’, Control Engineering Practice, Vol. 16, No. 3, pp.294–307.
AspenTech (2018) Aspen ProMV™ [online] http://www.aspentech.com (accessed 10 January
2018).
Bersimis, S., Psarakis, S. and Panaretos, J. (2007) ‘Multivariate statistical process control charts: an
overview’, Quality and Reliability Engineering International, Vol. 23, No. 5, pp.517–543.
Bin Shams, M., Budman, H. and Duever, T. (2011) ‘Fault detection, identification and diagnosis
using CUSUM based PCA’, Chemical Engineering Science, Vol. 66, No. 20, pp.4488–4498.
Chiang, L.H., Pell, R.J. and Seasholtz, M.B. (2003) ‘Exploring process data with the use of robust
outlier detection algorithms’, Journal of Process Control, Vol. 13, No. 5, pp.437–449.
Chiang, L.H., Russell, E.L. and Braatz, R.D. (2001) Fault Detection and Diagnosis in Industrial
Systems, Springer-Verlag, London.
Davis, J., Edgar, T., Porter, J., Bernaden, J. and Sarli, M. (2012) ‘Smart manufacturing,
manufacturing intelligence and demand-dynamic performance’, Computers & Chemical
Engineering, Vol. 47, No. 1, pp.145–156.
Liu, J. (2014) ‘Fault isolation using modified contribution plots’, Computers and Chemical
Engineering, Vol. 61, No. 1, pp.9–19.
MacGregor, J.F., Bruwer, M.J., Miletic, I., Cardin, M. and Liu, Z. (2015) ‘Latent variable models
and big data in the process industries’, 9th International Symposium on Advanced Control of
Chemical Processes, Whistler, British Columbia, Canada, pp.521–525.
Miletic, I., Quinn, S., Dudzic, M., Vaculik, V. and Champagne, M. (2004) ‘An industrial
perspective on implementing on-line applications of multivariate statistics’, Journal of
Process Control, Vol. 14, No. 8, pp.821–836.
Miller, P., Swanson, R.E. and Heckler, C.F. (1998) ‘Contribution plots: the missing link in
multivariate quality control’, Applied Mathematics and Computer Science, Vol. 8, No. 4,
pp.775–792.
Qin, S.J. (2003) ‘Statistical process monitoring: basics and beyond’, Journal of Chemometrics,
Vol. 54, Nos. 8–9, pp.480–502.
Yin, K., Yang, H. and Cramer, F. (2002) ‘On-line monitoring of papermaking processes’, Chemical
Engineering Communications, Vol. 189, No. 9, pp.1242–1261.
Yoon, S. and MacGregor, J. (2001) ‘Fault diagnosis with multivariate statistical models part I:
using steady state fault signatures’, Journal of Process Control, Vol. 11, No. 4, pp.387–400.
Yuan, B. and Wang, X. (2001) ‘Multilevel PCA and inductive learning for knowledge extraction
from operational data of batch processes’, Chemical Engineering Communications, Vol. 185,
No. 1, pp.201–221.
Zhang, H., Tangirala, A.K. and Shah, S.L. (1999) ‘Dynamic process monitoring using multiscale
PCA’, Proceedings of the 1999 IEEE Canadian Conference on Electrical and Computer
Engineering, pp.1579–1584.