
Multivariate Statistical Monitoring of Process Operating Performance

JAMES V. KRESTA, JOHN F. MacGREGOR and THOMAS E. MARLIN

Chemical Engineering Department, McMaster University, Hamilton, Ontario L8S 4L7


Process computers routinely collect hundreds to thousands of pieces of data from a multitude of plant sensors every
few seconds. This has caused a “data overload” and due to the lack of appropriate analyses very little is currently
being done to utilize this wealth of information. Operating personnel typically use only a few variables to monitor the
plant’s performance. However, multivariate statistical methods such as PLS (Partial Least Squares or Projection to
Latent Structures) and PCA (Principal Component Analysis) are capable of compressing the information down into
low dimensional spaces which retain most of the information. Using this method of statistical data compression a
multivariate monitoring procedure analogous to the univariate Shewhart Chart has been developed to efficiently monitor
the performance of large processes, and to rapidly detect and identify important process changes. This procedure is
demonstrated using simulations of two processes, a fluidized bed reactor and an extractive distillation column.

Keywords: statistical process control, partial least squares, projection to latent structures, principal component
analysis, performance monitoring, fault detection.

Monitoring performance and detecting faults is an integral part of the successful operation of any process. Performance can be monitored by comparing the actual results to the predictions from a mechanistic model, or by using statistical process control (SPC) charts (e.g. Shewhart charts (Shewhart, 1931), CUSUM charts (Woodward and Goldsmith, 1964) or EWMA charts (Hunter, 1986)) to compare the current state of the process against “normal operating conditions”. The challenge currently facing process monitoring is the enormous amount of correlated data being collected from a multitude of sensors every few seconds, minutes or hours. This “data overload” and the lack of appropriate analytical tools has meant that very little is being done to utilize this wealth of information.

The biggest drawback to using a mechanistic approach is the need for a detailed model. Even if a detailed mechanistic model is available its parameters are uncertain, and often they need to be updated in real-time. Using the established statistical control charting methods has the advantage that they need no model, but rather use the operational data directly; they are also easily applied and interpreted. The major drawback to these control charts is that they were developed for monitoring univariate problems or sets of independent variables, and the expansion to handle the case of many highly correlated variables is difficult (Woodall and Ncube, 1985). These methods are still being used on a one variable at a time basis on multivariate processes, either formally using Shewhart charts or informally by operators monitoring key variables. Generally, this approach has been adequate, although extremely inefficient; however, if the variables are correlated this approach can lead to erroneous results (Jackson, 1980).

This paper presents a different and more efficient approach to process monitoring based on multivariable statistical methods of Principal Component Analysis (PCA) and Partial Least Squares or Projection to Latent Structures (PLS). These methods are particularly suited to analyzing large sets of correlated data. While the analysis makes much more efficient use of the data and the dimensionality of the problem is greatly reduced, the results can still be interpreted and used in the same way as originally outlined by Shewhart (1931).

In the general problem, process measurements are collected from the plant and arranged in an ($n \times k$) matrix X consisting of n observations on k variables, while the product quality variables are collected in an ($n \times m$) matrix Y. The objectives of a multivariate SPC scheme would be to monitor the process and product quality using these observations and detect process upsets, equipment malfunctions or other special events. The final step in the procedure would be to find and remove the assignable causes for these events, thus improving process performance. In some situations one may have only a single data block X which consists of variables of the same type, either all process variables or all product variables. In other situations one may have information on both process, X, and product, Y, variables at the same time; in this case another objective of any study is to develop a predictive relationship for Y using X.

Given the current process control computer systems, on-stream analyzers, and automated quality control labs, it is not uncommon to measure hundreds of process variables on-line every few seconds or minutes, and tens of product variables every few minutes or hours. Although a large number of variables may be measured, they are almost never independent; rather, they are usually very highly correlated with one another. The true dimension of the space in which the process moves is almost always much lower than the number of measurements. In some situations this is due to underlying fundamental relationships among the variables. For example,

THE CANADIAN JOURNAL OF CHEMICAL ENGINEERING, VOLUME 69, FEBRUARY, 1991 35


in the hypothetical reaction of $A + B \rightarrow C$ where A and B are fed to the reactor in a specified ratio, although the concentrations of A, B and C are being measured (3 dimensional measurement space), the actual problem is univariate (the stoichiometric relationship and the fixed feed ratio each eliminate a degree of freedom). In other situations the placement of the measurements and the nature of the process make the measurements highly correlated. Consider a distillation column where only three variables change independently: reflux, reboil and feed composition. Originally, measurements are being made of the temperature profile using every fourth tray temperature. If the number of measurements is increased to every tray temperature to obtain a more detailed temperature profile, the dimension of the measured variable space has been greatly increased, but the actual dimension of the problem has not changed. As a final illustration consider another situation where measurements on many different variables are made, but the nature of the process and the disturbances are such that they only allow the variables to move in a much lower dimensional space. For example, in the manufacture of synthetic fibers it is not uncommon to measure more than ten quality variables such as denier, elongation under different loads, breaking strength, dye depths, etc. The physical meaning of these measurements guarantees that the process is only capable of making fibers with certain combinations of properties, and disturbances to the process will affect many of these variables in a highly correlated manner. For example, fibers with very small deniers (weight/unit length) cannot be made with very high breaking strengths, and disturbances which lead to a reduction in denier lead to a reduction in breaking strength.

Principal Components Analysis (PCA), and Partial Least Squares or Projection to Latent Structures (PLS) are multivariate statistical methods which consider all the noisy and highly correlated measurements made on the process, but project this information down onto low dimensional subspaces which contain all the relevant information about the process. Principal Component Analysis was originally developed by Pearson (1901) and is a standard multivariate statistical method described in many textbooks (e.g. Anderson, 1984; Mardia et al., 1982; S. Wold et al., 1987).

The development of PLS is more recent (Horst, 1961); some of the current interest in the method was sparked by H. Wold (1982) when he applied this approach for modelling socioeconomic trends, particularly for situations which were data rich and theory poor. PLS has found many applications in the physical sciences, especially in the chemometrics field. A review of the development of PLS is given by Geladi (1988) and will not be retraced here; however, a brief overview of the current work is provided. The basic ideas behind this method can be found in several recent papers (S. Wold et al., 1984a; Geladi and Kowalski, 1986a,b; Beebe and Kowalski, 1987); further, several papers have been published concerning the mathematical foundations of PLS (S. Wold et al., 1987; Lorber et al., 1987; Manne, 1987; Hoskuldsson, 1988) and their relationship to other statistical methods (Naes and Martens, 1985; Helland, 1988). PLS and SIMCA, a cluster analysis technique based on PLS (S. Wold et al., 1984a), have been applied in the analysis of multivariate data sets in the chemometrics field (Otto and Wegscheider, 1985; Moseholm, 1988; Kvalheim, 1988). Interest in this method is also spreading to chemical engineering; however, as yet very little has been published (Wise and Ricker, 1989; Wise et al., 1989; Kresta et al., 1990).

The next section of this paper describes the statistical methods used in this work. The conceptual ideas of dimension reduction are presented using a geometric argument, and the mathematical details of this method, including the algorithm, are presented in the appendix. In the following section a multivariate SPC procedure based on PLS is developed for monitoring the performance and detecting faults or unusual events in high dimensional processes. This procedure is demonstrated using two simulations: a fluidized bed reactor and an extractive distillation column. Finally there is a discussion concerning the implementation of the proposed method and some concluding remarks are made.

Statistical methods

The challenges for any multivariate statistical monitoring scheme are as follows:
1. The method must be able to deal with collinear data of high dimension, in both the independent (X) and the dependent (Y) variables.
2. The method must reduce the dimension of the problem substantially and allow simple graphical interpretations of the results.
3. If both process (X) and product quality (Y) variables are present, it must be able to provide good predictions of Y.

Methods for a single data block (X)

Consider first the case where the data available is of only one type: either all measurements are of process variables or only the block of quality variables is of interest. Then this data can be arranged in a single mean centered matrix, which we will call X.

Principal Components Analysis (PCA) is a procedure (Anderson, 1984; Mardia et al., 1982) used to explain the variance in a single data matrix (X). PCA calculates a vector, called the first principal component, which describes the direction of greatest variability (Figure 1). It is calculated as the orthogonal regression of a line through the data in the space spanned by X, or as that linear combination of the columns of X ($X p_1$), given by the eigenvector of $X^T X$ ($p_1$) associated with the largest eigenvalue. The second principal component is orthogonal to the first principal component and explains the greatest amount of the remaining variability. It is obtained by fitting a line through the residuals left after fitting the first principal component, or as the linear combination of the columns of X ($X p_2$), given by the eigenvector of $X^T X$ ($p_2$) associated with the next largest eigenvalue.

One can proceed in this manner until k principal components are obtained. In this case one has not reduced the dimensionality of the problem, but only rotated the axes of the X to a new orthogonal basis. Fortunately, with large data sets one often finds that, after defining the first A principal components where $A \ll k$, most of the variation in the data matrix X has been explained. By stopping at this point and summarizing the information in X using new variables ($t_i$'s) defined as those linear combinations of the x's ($t_i = X p_i$, $i = 1, 2, \ldots, A$) given by the first A principal components, one has reduced the dimensionality of the space from k to A. In geometric terms this is equivalent to approximating the k-dimensional observation space by the projections of the observations onto a much smaller A-dimensional hyperplane. PCA is illustrated for a 3-dimensional system in Figure 1: in this system the bulk of the variation is in the plane defined by the first two principal components ($A = 2$).

Figure 1 - Geometric representation of the steps in PCA: a) observations plotted in the space of the measurements; b) the first principal component; c) the plane defined by the first two principal components.

Figure 2 - The loading (a) and the score (b) plots for the system in Figure 1.

The principal components define the plane of greatest variability, and the loading vectors associated with these principal components define the location of the plane in terms of the original variables, and each observation is located on this plane via its scores. The score is the distance from the origin of the plane (X) along each principal component and is calculated as the product of the loading vector and observation. The perpendicular distance from each observation to the plane is the residual for that observation. The information contained in the original space can now be described using two 2-D plots: (1) the loading plot, showing the relationship between the original variables and the principal components (Figure 2a) and (2) the score plot, showing the relationship between the observations and the principal components (Figure 2b).

Algebraically the X matrix has been approximated by

$$X = t_1 p_1^T + t_2 p_2^T + \cdots + t_A p_A^T + E \qquad (1)$$

where E is a residual matrix. Ideally the dimension A is chosen such that there is no significant process information left in E. Rather, E should represent random error, and adding an ($A + 1$)th principal component would only be fitting some of this random error, and thereby increasing the prediction error of the principal components model in Equation (1). There are several ways for selecting this maximum significant dimension A. One can proceed until the percent of the variation explained by adding additional principal components is small, although this often leads to overfitting. A better procedure is to use cross-validation (S. Wold, 1978) whereby one holds back a certain fraction of the observations (say 1/5), performs a PCA analysis on the remaining data and then computes the prediction error sum of squares (PRESS) for those observations left out. This is repeated until every observation has been left out once. The optimal order of the principal components model, A, is taken as that order minimizing the overall PRESS.

In PCA it is very important to scale the data properly since the variance contribution of a particular x to the total variation in X is dependent upon its units of measurement. In general each variable should be scaled relative to the others in terms of its relative importance. With product quality variables the specification ranges give a sensible unit of scaling, while with process data the span of the sensors might provide a natural means of scaling. Alternatively, it is often common to scale all the variables to have unit variance, although care must be taken not to scale up the variance of variables that are almost constant nor to disturb the natural relationships among the variables of the same type.

A very interesting example of PCA applied to monitoring a desulphurization process is presented by Denney et al. (1985). That example provided some of the motivation for this present work.

Methods for two data blocks (X, Y)

In PCA we are concerned only with finding that latent vector space which explains the greatest amount of variability in a single matrix of data (X). Often, we can identify a second group of variables (Y) which are of greater importance, e.g. product quality or productivity variables, which we would like to include in the monitoring procedure. Unfortunately, these variables are often measured on a much less frequent basis than the process variables; therefore, we would like to use the information contained in the process variables (X) to predict, monitor and detect changes in the output (process performance or product quality) variables (Y).

Figure 3 - A typical Shewhart Chart: * - indicate in control; o - indicate out of control. [Axes: observations in time, with upper (UCL) and lower (LCL) control limits.]

Multiple linear regression (MLR) is the most common method for developing multivariate statistical models. Unfortunately, it is well established that MLR may have severe problems dealing with large sets of collinear data (Draper and Smith, 1981), leading to very imprecise parameter estimates and poor predictions. In many applications, subsets of the independent variables are chosen to eliminate problems due to dimensionality and collinearity; however, the aim of the present work is to extract information from all measurements available, and we will focus on methods that retain the entire data set. One method which can be used to stabilize the regression coefficients is Ridge Regression (Hoerl and Kennard, 1970; Smith and Campbell, 1980; Draper and Smith, 1981); unfortunately this method does not reduce the dimension of the problem and becomes cumbersome for high dimensional problems. Principal Component Regression (PCR) regresses each of the y's on the principal components of the X matrix. In this way it overcomes both the dimensionality and the collinearity problems. PCR treats each y variable individually, where in fact the Y space often consists of many highly correlated variables. This can lead to inconclusive results if the y variables have little or no information individually but are highly informative as a group. Furthermore, the space defined by the principal components of X is only that space exhibiting the greatest variation in the x's and is not necessarily the space that is most predictive of Y (S. Wold et al., 1984a; Geladi and Kowalski, 1986a; Lorber et al., 1987).
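
As a rough illustration of the single-block decomposition and of the principal component regression described above, the following Python sketch (NumPy only) computes the loading vectors as eigenvectors of $X^T X$, the scores $t_i = X p_i$ and the residual matrix E, and then regresses Y on the scores, using a leave-out-a-group PRESS to compare model orders. It is not the algorithm of the appendix: the function names, the synthetic data and the choice of five cross-validation groups assigned by row index are all assumptions made for the example.

```python
import numpy as np


def pca_loadings(Xc, n_components):
    """Loadings p_1..p_A: eigenvectors of Xc^T Xc with the largest eigenvalues.

    Xc must already be mean centred (and scaled, if desired).
    """
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the A largest
    return eigvecs[:, order], eigvals[order]


def pca_scores_and_residuals(Xc, P):
    """Scores t_i = Xc p_i and the residual matrix E = Xc - T P^T."""
    T = Xc @ P
    E = Xc - T @ P.T
    return T, E


def pcr_press(X, Y, max_components, n_groups=5):
    """Cross-validated PRESS of PCR for A = 1..max_components.

    Each group of observations is left out once; the model is refitted on the
    remaining rows and used to predict the Y values of the rows left out.
    """
    groups = np.arange(X.shape[0]) % n_groups          # assumed grouping rule
    press = np.zeros(max_components)
    for g in range(n_groups):
        train, test = groups != g, groups == g
        x_mean, y_mean = X[train].mean(axis=0), Y[train].mean(axis=0)
        Xtr, Ytr = X[train] - x_mean, Y[train] - y_mean
        P_all, _ = pca_loadings(Xtr, max_components)
        for a in range(1, max_components + 1):
            P = P_all[:, :a]
            B, *_ = np.linalg.lstsq(Xtr @ P, Ytr, rcond=None)  # regress the y's on the scores
            Y_hat = (X[test] - x_mean) @ P @ B + y_mean
            press[a - 1] += np.sum((Y[test] - Y_hat) ** 2)
    return press


# Synthetic illustration: six correlated measurements driven by two latent variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))
X = latent @ np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                       [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]]) + 0.05 * rng.normal(size=(100, 6))
Y = latent @ np.array([[1.0, 0.5], [-1.0, 2.0]]) + 0.05 * rng.normal(size=(100, 2))

Xc = X - X.mean(axis=0)
P, eigvals = pca_loadings(Xc, 2)
T, E = pca_scores_and_residuals(Xc, P)
explained = 1.0 - np.sum(E ** 2) / np.sum(Xc ** 2)   # fraction of variation captured by A = 2
press = pcr_press(X, Y, max_components=4)
print("fraction of X variation explained by 2 components:", round(explained, 3))
print("PRESS for A = 1..4:", np.round(press, 2))
```

With data of this kind, PRESS would be expected to drop sharply from A = 1 to A = 2 and to change little thereafter, which is the pattern used to select the order of the model.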
The Projection to Latent Structures (PLS) method appears to best address all of the above problems. A description of the mathematical algorithm for PLS is given in the appendix and can be found in several of the earlier cited references. Conceptually, PLS is similar to PCA except that it simultaneously reduces the dimensions of the X and Y spaces to find the latent vectors for the X and Y spaces which are most highly correlated. In effect the plane in the X space is tilted so that it is more predictive of Y. The latent vector spaces in PLS are usually found by an iterative method (see appendix), but it can also be shown that the loading vectors for PLS are the eigenvectors of $(X^T Y)(Y^T X)$ where $(Y^T X)$ is the covariance matrix between X and Y (Hoskuldsson, 1988).

Monitoring via multivariate SPC plots

Statistical process control charts such as the Shewhart chart, CUSUM plot and EWMA chart are well established statistical procedures for monitoring stable univariate processes. The assumption behind them is that a process subject only to its natural (“common cause”) variability will remain in a state of statistical control unless a special event occurs. The control charts represent several statistical hypothesis testing procedures aimed at detecting the occurrence of a special event as quickly as possible. Upon detecting such an event the action called for would usually be to diagnose the problem, find an assignable cause for the deviation, and then correct the process by removing this assignable cause. In practice, such an approach of continually monitoring the process, detecting events, and removing their causes leads to long term improvements in the process.

A Shewhart chart consists of plotting the observations sequentially on a graph which also contains the target value and upper and lower control limits as shown in Figure 3. If an observation exceeds the control limits a statistically significant deviation from normal operation is deemed to have occurred, which triggers the search for an assignable cause. The control limits are usually determined by analyzing the variability in a reference set of process data collected when only normal or “common cause” variability is present and acceptable operation is achieved. The limits are then usually set at plus and minus three standard deviations about the target.

The philosophy applied here in developing multivariate SPC procedures is the same as that used for the univariate systems. As with the Shewhart Charts, an appropriate reference set is chosen which defines the “normal operating conditions” (NOC) for a particular process. The choice of this reference set is critical to the successful application of the procedure and is discussed in more detail when describing the case studies. The difference in multivariate SPC is that we are usually facing the challenges set out previously in the section entitled “Statistical Methods”. The underlying premise behind the proposed procedure is that many of the measurements are highly correlated and simply represent different manifestations of the same underlying disturbances or other events occurring in the process. It is also assumed that the underlying dimensionality of the process, when it is operating normally, is quite low, and in these circumstances we can represent the most important elements of its behaviour on low dimensional plots defined by the dominant latent vectors obtained via PCA or PLS. In effect the low dimensional planes defined by these latent vectors provide a low dimensional window on the behaviour of the very high dimensional process. For a plant with an underlying dimension of two, the axes of the score plot ($T_1$ and $T_2$) form the first two axes of the monitoring chart and each observation is located on this plot via its score. Using the regression equation developed for the two latent vectors (using PLS), the predicted Y's are calculated for each new observation. These values are then used to calculate the Squared Prediction Error (SPE), $\sum_{i=1}^{m} (y_i - \hat{y}_i)^2$, for each observation and this variable could form the third dimension of the monitoring chart (Figure 4a), or could be plotted against time as in standard Range charts (Figure 4c). If, as is often the case, the Y variables are measured only infrequently, and hence are not usually available at each monitoring interval, then one should plot the Squared Prediction Error for X, i.e. $\sum_{i=1}^{k} (x_i - \hat{x}_i)^2$. This latter value represents the squared distance of the new x vector from the prediction plane ($T_1$, $T_2$). Note that in the latter case one might want to weight each $x_i$ element by its modelling power for Y, i.e. $\sum_{i=1}^{k} w_i (x_i - \hat{x}_i)^2$.

Figure 4 - The multivariate monitoring chart. The axes are defined by $T_1$, $T_2$ and SPE: a) 3-dimensional representation, b) 2-dimensional representation, c) SPE plotted vs. time.

It is worthwhile noting that the score plot ($T_1$, $T_2$) and its control limit contour is very similar to the Shewhart chart for the mean of the variable, ($\bar{x}$), while the SPE plot is similar to the range or standard deviation plot for the same variable.

The observations included in the reference set are plotted on this chart; the limits of acceptable performance are set such that they encompass a certain percentage of the variation in the reference set (analogous to the control limits of the Shewhart chart). It has been shown for PCA (Jackson, 1980; Jackson and Mudholkar, 1979), that under normality assumptions and specific scaling, the control limits on the $T_1 - T_2$ plane form an ellipse and the proper size of the control ellipse can be calculated; further, the $\chi^2$-statistic can be used to establish the control limits for the residual or SPE. Alternately, rather than basing the control limits on any hypothesized underlying distribution, they can be based on the reference distribution of the historical data contained in the reference set (Box et al., 1978). This is the method used in this work. The “normal operating conditions” should reflect well behaved plant conditions and will need to be reset if plant conditions change.

Once the chart has been set up it can be used to monitor the process. Using latent variable relationships developed for the reference set, the scores for each new observation vector can be located on the plane ($t_1 = X w_1$, $t_2 = (X - t_1 p_1^T) w_2$) and its SPE calculated ($\sum_{i=1}^{m} (y_i - \hat{y}_i)^2$ or $\sum_{i=1}^{k} w_i (x_i - \hat{x}_i)^2$). The structure of these plots reflects the two ways in which abnormalities can enter the system and provides powerful diagnostic capability to determine the cause of the abnormality. If the abnormality is caused by a larger than normal change in one or more of the process variables but the basic relationship between the process and quality variables does not change, then the abnormality will manifest itself as a shift in the $T_1 - T_2$ plane, and the SPE will remain at an acceptable level below its action limits (the * symbols in Figure 4a and 4b). If on the other hand the abnormality enters through a new event not captured in the reference set, it will change the nature and possibly the dimension of the relationship between the process and quality variables; then, the SPE will increase (o in Figure 4a and 4b). Acceptable performance, i.e. performance which matches the reference set, would fall within the envelope of normal operation (A in Figure 4a and 4b).

One difference between the standard univariate SPC charts and the multivariate SPC chart is that there is no convenient time axis to measure the age of the observations. The passage of time can be indicated in the $T_1 - T_2$ plane by the markers used. This can be done in several ways including: (1) by marking the passage of time using shading intensity, the most recent being the brightest and past values fading so that only the past p observations are visible; or (2) by using time indexed markers (such as s1, s2, s3, ...), but still showing only the past p observations on the current display. Alternately, separate univariate charts for $T_1$ and $T_2$ could be plotted against time in a manner similar to a Shewhart chart. This approach is better than plotting the original variables individually, since there are many fewer T's and they are orthogonal; however, there is still a danger that an observation will lie within the individual limits but be clearly abnormal when viewed on a bivariate chart (see Figure 7, Case 1). The SPE would usually be plotted against time similar to the standard Range or S-chart (Figure 4c).

The procedure described above and demonstrated in the simulations is for the situation where both X and Y measurements are available at regular intervals, or where the Y measurements are available at least in the reference set defining the NOC used in the PLS analysis. If, however, the quality measurements (Y) were too infrequent or totally unavailable, then the procedure could be reformulated. Two possibilities exist: (1) if certain variables in X could be considered good inferential variables for Y, then X could be partitioned into $X_1$ and $X_2$, where $X_2$ would contain the inferential variables and act as the Y in the previous discussion; or (2) if no inferential variables were available, then the procedure would be based on PCA instead of PLS and the principal component model, Equation (1), for X would be used to form an SPE ($\sum_{i=1}^{k} (x_i - \hat{x}_i)^2$) in order to detect changes which disturb the relationships among the X variables. Both of these methods have limitations. The method using PCA is less sensitive because it may include variation which is not directly related to quality variables. This method is also unable to detect disturbances which affect the relationship between the process and quality variables but are not included in X. The method using inferential variables is more sensitive to the specific disturbances which affect the particular inferential variables used. It can detect changes in the relationship between X and the inferential variables but may miss disturbances which do not affect these but affect the other unmeasured quality variables.

In order to retain the ease of interpretation the system must be chosen such that the bulk of the variation can be easily presented to the user. This means that normally the monitoring procedure would be restricted to three latent vectors which can be presented on a 3-dimensional plot and a separate SPE plot. For very complex systems, where three latent vectors may not explain the bulk of the variation, the system should be divided into logical modular sections which can be monitored separately. If this is impractical then two further possibilities exist: (1) use only three latent vectors and an SPE vs time plot. Although the three latent vectors will not explain all the major variation, the variation not captured by them will still show up in the SPE plot; (2) if it is felt that more than three latent vectors would be useful, then the number of plots can be increased, although this would quickly become cumbersome.

Monitoring a fluidized bed reactor

The multivariate monitoring procedure outlined in the last section was applied to a fluidized bed reactor (FBR). Although only simulations were used in this study, the non-linear model used was one recently identified from experimental runs on a pilot plant (Kelly, 1989). The reaction in the FBR is the hydrogenolysis of n-butane on 10% nickel on silica catalyst (Shaw et al., 1972; Kelly, 1989). Due to the much faster concentration dynamics in the reactor the concentrations can be considered at steady state with respect to temperature. Although this process is non-linear, especially when the extremes of operating conditions are included, this did not create significant problems for the monitoring method, since under normal operating conditions the process is roughly linear. This process proved to be ideal for a preliminary study of the technique because, while numerous measurements were available, the underlying process proved to be essentially two dimensional. A simplified schematic of this process is presented in Figure 5 along with the measured variables.

The first step in the procedure is to choose which variables will be considered as process variables and which will be considered as indicators of product quality. An on-line gas chromatograph (GC) was assumed to be available for the analysis of the product stream, and the concentrations of the reaction products (transformed to product selectivities and reactant conversions) were chosen as the product space (Y). The remaining measurements were included in X. If the GC measurements were not available, one alternative would be to include the outlet reactor temperature in the product space, as an inferential variable for the concentrations. The division

THE CANADIAN JOURNAL OF CHEMICAL ENGINEERING, VOLUME 69, FEBRUARY, 1991 39


of measurements into process variable (X) and product variable (Y) spaces is highly dependent on the measurements available, and the objectives of the monitoring scheme.

[Figure 5 labels: Methane, Ethane, Propane, Hydrogen, Butane, Reactor Temperature, Cooling Oil Temperature, Hydrogen, Butane, Total Volumetric Flowrate, Inlet Temperature, Conc. Hydrogen, Conc. Butane]
Figure 5 - A schematic of the fluidized bed reactor (FBR) including measured variables.

The choice of the reference set and the proper scaling are critical to the successful application of the procedure. The reference set used to develop the monitoring chart will determine the variations considered to be part of normal operation and ideally includes all variation leading to acceptable performance. If the reference set variation is too small the procedure will alarm too often, and if too great the sensitivity of the procedure for indicating abnormal operation will be impaired. In an industrial application the reference set would be chosen from past data. In the simulated example, the reference set was generated from a normal distribution about a nominal point. The variations and the nominal point chosen for this simulation correspond to the conditions used in the experimental work used to develop the model. Four case studies were used to test the procedure: (1) a change in the hydrogen to butane ratio in the feed; (2) a ramped increase in the cooling oil temperature; (3) a baseline drift in the chromatograph reading; and (4) a change in the catalyst activity.
Once a reference set has been established, the scaling of the variables must be chosen. Scaling affects the relationships between variables. Any variable with large amounts of variation will dominate the first latent vector. Unless economic or physical considerations indicate that certain variables are more important than others, the variance for each measurement should be approximately the same. This has led to the application of "autoscaling" (mean centering followed by scaling all variables to unit variance) and although this approach was successfully applied to the FBR, it is not always the preferred choice due to physical considerations, as described in the discussion of the distillation column simulation.
Once the reference set and the scaling were defined, the latent vectors of the PLS model were calculated. Table 1 presents the loading vectors (which define the latent vectors in terms of the original variables) and the cumulative percent of the variance explained for each variable. Only the first two latent vectors were used in the monitoring procedure since they explained 86.5% of the variation in Y, and because the reduction in PRESS due to the third latent vector was small and did not justify the added complexity (also it explained only an additional 3.8% of the variance of Y). The scores plotted in the monitoring chart are calculated as the product of the measurement and the loading vector ($t_1 = x_{new} w_1$; $t_2 = (x_{new} - t_1 p_1^T) w_2$); the prediction of y is calculated as

$$\hat{y} = T B Q^T \qquad (2)$$

or

$$\hat{y} = \sum_{a=1}^{A} t_a b_a q_a^T \qquad (3)$$
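
As a hedged illustration of the score and prediction calculation in Equations (2) and (3), the following Python sketch projects one scaled observation onto the latent vectors and evaluates the squared prediction error. The function names, the argument layout (columns of W, P and Q holding the vectors w_a, p_a and q_a) and the use of NumPy are assumptions made for this example, not part of the paper. The SPE is assumed here to be the sum of squared residuals over the Y variables.

```python
import numpy as np


def project_observation(x_new, W, P, b, Q):
    """Scores and prediction for one mean-centered, scaled observation.

    W, P : (k, A) arrays whose columns are the vectors w_a and p_a
    b    : (A,) array of inner regression coefficients b_a
    Q    : (m, A) array whose columns are the Y loadings q_a
    """
    x_res = np.asarray(x_new, dtype=float).copy()
    n_lv = W.shape[1]
    t = np.zeros(n_lv)
    for a in range(n_lv):
        t[a] = x_res @ W[:, a]          # t_a = (deflated x) w_a
        x_res = x_res - t[a] * P[:, a]  # remove what latent vector a explains
    y_hat = (t * b) @ Q.T               # Eq. (3): sum over a of t_a b_a q_a^T
    return t, y_hat


def squared_prediction_error(y, y_hat):
    """Sum of squared residuals over the Y variables (assumed SPE definition)."""
    residual = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return float(residual @ residual)
```

With two latent vectors, `t[0]` and `t[1]` would be the coordinates of the observation on the T1-T2 chart, and the SPE would be plotted on the companion chart.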

TABLE 1
Results of the PLS Calculation for the Fluidized Bed Reactor Showing the Loadings for Each Latent Vector (LV)
and the Cumulative Percent of Variation Explained

                                    First LV                 Second LV
Variables                     Loading   Cum. % Expl.    Loading   Cum. % Expl.
Butane concentration           0.3002       29.5        -0.461        80.5
Hydrogen concentration        -0.2957       39.2         0.4696       81.36
Ratio                         -0.3054       32.4         0.4951       88.34
Volumetric Flowrate           -0.1008        0           0.085         4.5
T - Inlet                      0.1099        9.0        -0.1893       12.0
T - Cooling Oil                0.586        56.0         0.4385       99.05
T - Reactor                    0.6416       76.5         0.3131       98.0
Total % Explained of X                      23.7                      52.7

Methane selectivity            0.4607       83.7         0.4353       91.0
Ethane selectivity            -0.4046       65.3        -0.6862       82.6
Propane selectivity           -0.4594       83.2        -0.4127       89.8
Butane conversion              0.4471       78.14        0.255        81.0
Hydrogen conversion            0.4617       84.0        -0.3229       88.0
Total % Explained of Y                      78.8                      86.5

Regression Coefficient (b)     1.265                     0.4188



[Figure 6 plot. Axis labels recovered: T1 - First Latent Vector Scores; SPE - Square of the Prediction Error]

Figure 6 - The monitoring chart for the fluidized bed reactor showing the normal operating region (the cross-hatched area) and the points in the reference set (o).

In Figure 6 the points from the reference set are plotted on the monitoring chart and the regions of acceptable performance are defined (the shaded areas). The four case studies described earlier were used to evaluate the ability of the procedure to detect upsets or faults; the results are presented on the monitoring chart in Figure 7.
The results clearly show the different types of abnormalities which can arise. In the first case, the shift in the hydrogen to butane ratio, the disturbance is measured by the process variables and is indicated by a shift of the points on the T1 - T2 plane. Since small variations in this ratio are included in the reference set, the PLS model includes this type of variation and the acceptable SPE shows that the prediction obtained is still accurate. The second case, the increase in the cooling oil temperature, indicates the same type of abnormality. Further, if the process were truly linear, the SPE would remain within the limits set by the reference set; however, due to the non-linearities in the system the SPE increases with increasing temperature until finally it exceeds the acceptable limit. The last two cases, the baseline drift in the GC and the change in the catalyst activity, have no effect on the measured process variables, but they have an extreme effect on the prediction equation. These effects cause drastic increases in the SPE, indicating that the relationship between X and Y has been altered. The data presented here show the method working on a system with variation but no sensor noise. When sensor noise was added to the system (white noise equivalent to one-half of the standard deviation of the signal from each variable was added to each variable), the same changes were detectable but less prominently.

Monitoring an extractive distillation column

The second simulation used to demonstrate the proposed monitoring procedure is an Extractive Distillation Column (EDC) separating an azeotropic mixture of acetone, methanol and water (Figure 7). The results presented are from simulation runs using a fundamental tray-by-tray model developed at McMaster University (Chin, 1989). Only steady-state monitoring was performed; however, the possibilities of dynamic monitoring are discussed below.
As with the FBR, the measurements around the distillation column must be divided into the process and product variables. The configuration chosen is described in Table 2. One question that immediately comes to mind is whether the concentrations of all components in the product streams should be included in the product variable set or only those which will ultimately be used for control. The answer to this question is case specific. However, the following rule should be followed: use only those variables in Y which are of greatest interest from a monitoring point of view. Extraneous variables in the product space do not normally affect the PLS analysis; they do, however, impair the resolution of the monitoring procedure by increasing the limits on the SPE. Significant changes in the residuals of the important variables can be hidden by the variance of the extraneous variables



[Figure 7 plot. Axis labels recovered: T1 - First Latent Vector Scores; SPE - Square of the Prediction Error]

Figure 7 - The monitoring chart for the fluidized bed reactor showing the results of the simulated case studies used to evaluate the method. The cross-hatched area indicates the normal operating region; Case 1 (o) indicates a change in the hydrogen to butane ratio; Case 2 (+) indicates an increase in the oil temperature; Case 3 (X) indicates a baseline drift on the GC; and Case 4 (A) indicates a change in the catalyst activity.

which are included in the model. For this work all the information about the product streams was included in Y. When this was done nearly all the variation was still explained in three latent vectors and therefore in no way inhibited the performance of the monitoring procedure.
Autoscaling was not used in this problem, as was done for the FBR, because the relationships between the tray temperatures in the temperature profile should be kept intact. This is done not because the final regression equation changes but because more physical interpretation can be given to the first latent vectors if the data is unscaled. Several different scaling approaches were investigated, ranging from unit scaling all variables, except those in the temperature profile, to no scaling at all, to specifically increasing the variance of certain variables. The following observations were made: if the inter-relationships between a group of variables must be maintained (and they are measured in the same units) then the same scaling factor must be used for the entire group. When the variances are almost the same size, then small adjustments in the scaling have an insignificant effect on the results.
By scaling selected variables to high variances compared to the other variables, the latent vectors used by the PLS model were altered. When the scaled variable belongs to X, then this variable usually dominates the first latent vector. When the scaled variable belongs to Y, then the variables which were highly predictive of this measurement were prominent in the first latent vectors. This demonstrates the compromise being calculated by PLS between describing the X space and predicting the Y space.
As with the FBR, the reference set for the EDC was generated from a normal distribution around a nominal point. Five case studies were used with this example, including: (1) an increase in the acetone to methanol ratio with a decrease in the water content in the feed; (2) no change in the acetone to methanol ratio, but an increase in the water content in the feed; (3) a decrease in the acetone to methanol ratio

TABLE 2
The Process and Product Variables Measured for the Extractive Distillation Column

Process Measurements        Product Measurements
Solvent Flow                Flowrate of Distillate
Solvent Temperature*        Acetone Concentration in Distillate
Feed Flowrate               Methanol Concentration in Distillate
Feed Temperature*           Water Concentration in Distillate
Pressure*                   Flowrate of Bottoms
15 Tray Temperatures        Acetone Concentration in Bottoms
Condenser Duty*             Methanol Concentration in Bottoms
Reboiler Duty               Water Concentration in Bottoms

Note: * - Indicates variables held constant during simulation.
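
The two scaling choices discussed for the FBR and the distillation column can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names are hypothetical, and using the pooled standard deviation as the single factor for a group of same-unit variables (such as the tray temperatures) is one possible choice, not necessarily the one used in the paper. Variables held constant have zero variance and would have to be removed before either function is applied.

```python
import numpy as np


def autoscale(X):
    """Mean-center each column and scale it to unit variance."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # sample standard deviation of each variable
    return (X - mean) / std, mean, std


def group_scale(X, group):
    """Autoscale, except that the columns listed in `group` share one factor.

    Dividing the whole group by a single number keeps the relative sizes of
    the variations inside the group (for example a temperature profile).
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    scale = X.std(axis=0, ddof=1)
    # Assumed choice of common factor: pooled standard deviation of the group.
    scale[group] = np.sqrt(np.mean(scale[group] ** 2))
    return (X - mean) / scale, mean, scale
```

The returned means and scale factors would be stored with the reference set so that new observations can be scaled in exactly the same way before their scores are calculated.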
[Schematic of the extractive distillation column. Labels recovered: Cooling water temperature; Distillate Flowrate, Acetone Conc., Methanol Conc., Water Conc.; Solvent (Water) Flow, Temp; Tray Temperatures (15); Feed (Acetone, 0.66; Methanol, 0.30; ...)]

to those done for the FBR were performed and the results plotted on the monitoring charts. The monitoring chart had to be modified from the one used for the FBR because three latent vectors were necessary to model the relevant information. The scores from the three latent vectors form the axes of the monitoring chart (Figures 9 and 10) and the SPE is shown on a separate chart (Figure 11). The SPE would generally be plotted sequentially in time, but can be plotted against any of the t's, if this is more convenient for the application. Since the data generated for this example was not time dependent, SPE was plotted against T2, which gave the best resolution of the events.
The loading vectors for the three latent vectors used in the EDC monitoring chart are presented in Table 3. Also presented in this table is the cumulative percent of variance explained for each variable. The loading vectors are useful in interpreting the monitoring chart, and help assign causes to abnormalities. The first loading vector indicates that the first latent vector is dominated by changes in the reboiler duty and the accompanying shifts in the temperature profile. Further evidence of this and the value of these loadings in the interpretation of the monitoring chart is seen in Figure 10, where the change in the steam temperature to the reboiler is clearly seen as a shift in the T1 direction. The second and third vectors are concerned with changes in solvent and feedstock flowrates; for the second vector these changes are in opposite directions, for the third they are in the same direction. Again this interpretation is confirmed by

Figure 9 - The monitoring chart for the extractive distillation column showing the normal operating region (the cross-hatched area) and
the points used in the reference set ( m ) .



[Figure 10 plot. Axis labels recovered: T1 - First Latent Vector Scores; T3 - Third Latent Vector Scores]

Figure 10 - The monitoring chart for the extractive distillation column showing the results of the simulated case studies used to evaluate
the method. The cross-hatched area indicates the normal operating region; Case 1 ( v ) indicates an acetone to methanol ratio of 2.2-2.92;
Case 2 ( * ) a ratio of 1.96-2.07; Case 3 (X) a ratio of 1.71-1.64; in Case 4 ( A ) the solvent flow was increased by 33% and in Case
5 ( v ) the steam temperature was increased by 10 K.

events plotted in Figure 10; a positive change in the solvent flowrate is seen as a shift in the plane of T2 and T3, in the positive direction for T3 and in the negative for T2.
One interesting thing should be noted in this simulation; although feed composition variations were included in the simulations, they were unmeasured disturbances. Since their effect on the temperature profile is similar to that caused by changes in the reboiler duty, it is difficult to detect feed composition changes solely from the information in the process measurements. However, they are easily detected once the plot of SPE is included (Figure 11).

Discussion

To obtain the best performance when using the monitoring charts the following guidelines should be observed. The product space should be restricted to those variables of interest in the monitoring procedure. Extraneous variables in the process measurement space do not inhibit the procedure, and all possibly relevant measurements can be included. Scaling should be performed in such a way that the variances of the measurements reflect their relative importance (the use of instrument ranges and engineering knowledge of the process should be very effective). Care should be exercised when using unit scaling not to disturb inter-relationships among the variables. Reference sets must be designed to reflect the purpose of the monitoring procedure. This means that the procedure is case specific, and each application should be tested to see if it performs according to the specifications.
The loading vectors contain information relating the original variables to the latent vectors. This information used in conjunction with past experience and pattern recognition can be instrumental in assigning possible causes to the abnormalities.
The work presented here has been restricted to steady-state monitoring. By incorporating time lags (in the time series sense) it should be possible to extend this procedure to dynamic systems.
Finally, the work presented here was for the situation where regular quality measurements are available; if these are not available, then one of the previously mentioned modifications needs to be implemented. Although these methods have some additional limitations they can still be effective monitoring tools; however, the details of their implementation must still be investigated.
A major area for future research pertains to the appropriate procedure for maintaining and updating the monitoring procedure under changing operating conditions.
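
The time-lag idea mentioned in the Discussion can be illustrated with a short sketch. The paper does not give a construction, so the following Python function is only one plausible reading: each observation is augmented with the measurements from the preceding sampling instants, and the augmented matrix would then take the place of X. The function name and the lag layout are assumptions made for this example.

```python
import numpy as np


def add_time_lags(X, n_lags):
    """Return the matrix with rows [x(t), x(t-1), ..., x(t-n_lags)].

    X is an (n, k) array ordered in time. The first n_lags rows have no
    complete history and are dropped, so the result has shape
    (n - n_lags, k * (n_lags + 1)).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n_lags < 0 or n_lags >= n:
        raise ValueError("n_lags must satisfy 0 <= n_lags < number of rows")
    # Block `lag` holds the measurements taken `lag` samples before row t.
    blocks = [X[n_lags - lag:n - lag] for lag in range(n_lags + 1)]
    return np.hstack(blocks)
```

The number of lags would have to be chosen from the process dynamics and the sampling interval; that choice is outside the scope of this sketch.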
TABLE 3
Results of the PLS Calculation Using the Reference Set for the Extractive Distillation Column Showing the Loadings
for the Latent Vectors (LV) and the Cumulative Percent Variation Explained After Each Vector

                               First LV           Second LV            Third LV
Variables                  Loading  Cum. %     Loading  Cum. %     Loading  Cum. %
Solvent Flow                 0.06     0.0       -0.68    44.0        0.68    99.8
Feed Flow                    0.06     0.0        0.52    50.0        0.65    99.1
Tray Temp 1                  0.37    86.4       -0.31    95.0       -0.09    99.6
Tray Temp 2                  0.15    86.4       -0.11    94.0        0.03    96.1
Tray Temp 3                  0.004    6.0       -0.397   23.0        0.002   43.7
Tray Temp 4                  0.06    68.0       -0.106   92          0.02    92.4
Tray Temp 5                  0.03    43.0       -0.11    86          0.03    90.3
Tray Temp 6                  0.15    73.0        0.16    89          0.03    90.7
Tray Temp 7                  0.15    64.7        0.21    90          0.03    92.0
Tray Temp 8                  0.11    60.5        0.16    90          0.02    91.8
Tray Temp 9                  0.06    59.5        0.09    89          0.02    91.5
Tray Temp 10                 0.03    60.9        0.039   84          0.03    91.3
Tray Temp 11                 0.03    62.8        0.028   84          0.03    97.8
Tray Temp 12                 0.02    60.4        0.18    75          0.03    96.6
Tray Temp 13                 0.01    54.6        0.008   63          0.03    94.6
Tray Temp 14                 0.006   30.5       -0.004   29.6        0.03    86.3
Tray Temp 15                 0.144   57.9        0.19    86          0.136   96.3
Steam Temp                   0.874   99.8       -0.04    99.8        0.08    99.9
Total % Explained of X               74.9                87.1                98.9

Distillate flow              0.42    57.7        0.31    86.1        0.26    96.2
x_D Acetone                 -0.4     54.5       -0.38    97.3       -0.06    97.8
x_D Methanol                 0.3     30.1        0.43    86.2       -0.24    95.1
x_D Water                    0.41    51.3        0.15    56.6        0.51    92.3
Bottoms Flow                -0.17     8.0       -0.26    26.0        0.72    99.1
x_B Acetone                 -0.51    84.5        0.2     96.0        0.16    99.5
x_B Methanol                 0.06     0.2        0.51    77.8       -0.24    86.2
x_B Water                    0.33    37.6       -0.425   94.5        0.02    94.8
Total % Explained of Y               40.4                77.0                95.1

Regression coefficient (b)   0.6325              1.485               1.128

[Figure 11 plots: panels (a) and (b), SPE plotted against T2 - Second latent vector scores]

Figure 11 - The SPE chart for the extractive distillation column showing both the plot of the reference set (a) and the results of the simulated case studies used to evaluate the method (b). The cross-hatched area indicates the normal operating region; Case 1 (o) indicates an acetone to methanol ratio of 2.2-2.92; Case 2 (+) a ratio of 1.96-2.07; Case 3 (X) a ratio of 1.71-1.64; in Case 4 (A) the solvent flow was increased by 33% and in Case 5 (v) the steam temperature was increased by 10 K.

Conclusions

A multivariate SPC procedure has been proposed for handling large numbers of process and quality variables. Multivariate statistical procedures are used to reduce the dimensionality of these large and highly correlated data sets down to a few latent variables which contain most of the information about the process behaviour under normal operating conditions. By plotting the projections (rows) of new process observations over time on this low dimensional plane one is able to detect larger than normal process variations, and by also plotting the squared prediction errors (i.e., the perpendicular distances from the plane) one is also able to detect major changes in the behaviour of the process caused by new events.
By compressing all the information on the process down to low dimensional spaces, and using simple plots of the data in these spaces, together with meaningful control limits, the essential idea and philosophy of Shewhart's (1931) SPC methods have been preserved and extended to handle the large number of variables collected in most process industries today. The tools necessary to establish the multivariate charts (PCA and PLS) may be more complex than those usually used in univariate SPC, but from the user's point of view, the presentation of the data and the interpretation of the results are almost as simple.
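
The Appendix lists the PLS calculation as a twelve-step NIPALS iteration. As a hedged illustration, the steps can be transcribed into Python roughly as follows. The function name, the convergence test on u, the default tolerance and the choice of the Y column with the largest sum of squares as the starting vector are assumptions made for this sketch; the paper only says to start from "a column of Y". The inputs are assumed to be mean centered and scaled already.

```python
import numpy as np


def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """PLS by the NIPALS steps of the Appendix; returns W, P, Q, T and b."""
    E = np.array(X, dtype=float)   # residual of X, deflated after each LV
    F = np.array(Y, dtype=float)   # residual of Y, deflated after each LV
    n, k = E.shape
    m = F.shape[1]
    W = np.zeros((k, n_components))
    P = np.zeros((k, n_components))
    Q = np.zeros((m, n_components))
    T = np.zeros((n, n_components))
    b = np.zeros(n_components)
    for a in range(n_components):
        # Step 1: start from a column of Y (assumed: the most variable one).
        u = F[:, np.argmax(np.sum(F ** 2, axis=0))].copy()
        for _ in range(max_iter):
            w = E.T @ u / (u @ u)           # step 2: regress columns of X on u
            w = w / np.linalg.norm(w)       # step 3: normalize w
            t = E @ w / (w @ w)             # step 4: scores
            q = F.T @ t / (t @ t)           # step 5: regress columns of Y on t
            q = q / np.linalg.norm(q)       # step 6: normalize q
            u_new = F @ q / (q @ q)         # step 7: new u vector
            change = np.linalg.norm(u_new - u)
            u = u_new
            if change <= tol * np.linalg.norm(u):   # step 8: convergence
                break
        p = E.T @ t / (t @ t)               # step 9: X loadings
        b[a] = (u @ t) / (t @ t)            # step 10: inner regression
        E = E - np.outer(t, p)              # step 11: residual matrices
        F = F - b[a] * np.outer(t, q)
        W[:, a], P[:, a], Q[:, a], T[:, a] = w, p, q, t
    return W, P, Q, T, b                    # step 12 is the outer loop over a
```

Two properties stated in the text can be checked numerically with such a sketch: the score vectors come out mutually orthogonal, and the first weight vector agrees (up to sign) with the dominant eigenvector of X'YY'X.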



APPENDIX

Some mathematical details of PLS

The mathematical and statistical properties of PLS have been discussed in detail by a number of workers (S. Wold et al., 1987; Lorber et al., 1987; Naes and Martens, 1985; Hoskuldsson, 1988; Helland, 1988). In these papers many variations of the basic algorithm used to do the calculation appear. The following is not intended as a review of their work but rather as a summary, highlighting those aspects which are particularly useful in obtaining a general understanding of the procedure.
PLS is a regression technique, regressing Y onto X, where X is an n x k matrix and Y is an n x m matrix. This method is especially useful when the variables within X and within Y are correlated. As with the geometric argument already presented, it is easier to start with PCA and then introduce PLS.
PCA works on a single data matrix and attempts to explain the structure of the variation. Generally the X matrix is mean centered and scaled before PCA is applied. The methodology of PCA is to decompose the data matrix into the following bilinear form:

$$X = \sum_{a=1}^{A} t_a p_a^T + E_A \qquad (A1)$$

Mathematically TP' is indeterminate; to obtain uniqueness one can impose various conditions on t_a and p_a. One way to do this is to force orthogonality among the t_a's and among the p_a's. If this is done and it is further insisted that they are orthogonal to the rows/columns of E then it can be shown that the t_a's are the eigenvectors of XX' and the p_a's are the eigenvectors of X'X (Helland, 1988). This points to the equivalency between PCA and SVD (Hoskuldsson, 1988). In SVD the eigenvalues are calculated in descending order starting with the largest; in both methods the corresponding eigenvectors are also in descending order. For a detailed discussion of PCA see Chatfield and Collins (1980) or Jackson (1980). For correlated data sets A << k, and using SVD to calculate all the principal components is inefficient. The NIPALS algorithm (S. Wold et al., 1987) calculates the PCs one at a time; the correct number of PCs can be determined from a number of stopping criteria, one of the more popular techniques being cross validation (S. Wold, 1978).
The Y matrix can be similarly decomposed into

$$Y = \sum_{a=1}^{A} u_a q_a^T + F_A \qquad (A2)$$

Performing the regression of u onto t leads to Principal Component Regression. PLS follows a similar procedure except that it performs both of the decompositions simultaneously in order to get a better prediction of Y. The method can be described by the following algorithm (S. Wold et al., 1987).
1. Start: set u equal to a column of Y
2. w' = u'X / u'u (regress columns of X on u)
3. Normalize w to unit length
4. t = Xw / w'w (calculate the scores)
5. q' = t'Y / t't (regress columns of Y on t)
6. Normalize q to unit length
7. u = Yq / q'q (calculate new u vector)
8. Check convergence: if YES to 9, if NO to 2
9. X loadings: p = X't / t't
10. Regression: b = u't / t't
11. Calculate residual matrices: E = X - tp' and F = Y - btq'
12. To calculate the next set of latent vectors replace X & Y by E and F and repeat.
Several interesting things come from this algorithm. First, the algorithm presented is for linear regression (step 10). Non-linearities can be incorporated into the model by either transforming the original variables in X and Y or by incorporating the known non-linearity in step 10 (S. Wold et al., 1989). Secondly, if the q vector is not normalized then B = I. Finally, if Y has only one variable then NIPALS is not iterative; some interesting properties of this case are discussed by Lorber et al. (1987).
It is of interest to consider what objective function is being minimized by PLS. Hoskuldsson (1988) addressed this from a geometric point of view. He showed that the objective of PLS is to find orthogonal rotations of the original matrices X & Y such that the angle between the rotated matrices is the smallest. This objective can be conversely stated that the covariance between u and t should be maximized.
Hoskuldsson (1988) showed some interesting relationships between PCA and PLS. These are of interest because PCA is conceptually easier to grasp than PLS. PCA works on the X matrix, and the loading vectors calculated using NIPALS are the eigenvectors of the covariance matrix (X'X). PLS, on the other hand, works on both X and Y; the loading vectors calculated in this case are the eigenvectors of the matrix X'YY'X. This last matrix can be thought of in two ways: (1) the covariance of X has been scaled using the "size" of the Y matrix (YY'), or (2) PLS is simply PCA performed on the covariance matrix of X and Y (Y'X).

Nomenclature

A = the optimal number of latent vectors
b_a = the regression coefficient for the ath latent vectors of X and Y measurements
B = the diagonal matrix of regression coefficients
E = the residual matrix of X
F = the residual matrix of Y
I = the identity matrix
k = the number of process/operational/input measurements
m = the number of quality/output measurements
n = the number of observations
p_a = the loading vector for the ath latent vector of X calculated by PCA or PLS
P = the loading matrix for X calculated by PCA or PLS
p_i = the loading for the ith X variable
q_a = loading vector for ath latent vector of Y
Q = loading matrix for Y
t_a = scores associated with the ath latent vector of X
T = the matrix of scores associated with X
T = the axis of the monitoring chart
u_a = scores associated with the ath latent vector of Y
U = the matrix of scores associated with Y
w_a = the loading vector used for prediction, calculated from PLS, for the ath latent vector
W = the matrix of prediction loadings for X
w_i = the prediction loading for the ith X variable
X = (n x k) matrix of the process/operational/input measurements
x-bar = the mean of a variable of X and the vector of means
x_i = the value of the ith X variable for one measurement
x-hat_i = the predicted value of the ith X variable for one measurement
x_B = the composition of the bottoms product from the EDC (Table 3)
x_D = the composition of the distillate product from EDC (Table 3)
Y = (n x m) matrix of the process performance/product quality/output measurements
y_i = the value of the ith Y variable for one measurement
y-hat_i = the predicted value of the ith Y variable for one measurement

Superscript

T = the transpose of the matrix

Abbreviations

CUSUM = Cumulative Sum
EDC = Extractive Distillation Column
EWMA = Exponentially Weighted Moving Average



FBR = Fluidized Bed Reactor
GC = Gas Chromatograph
LCL = Lower Control Limit
MLR = Multiple Linear Regression
NOC = Normal Operating Conditions
PCA = Principal Component Analysis
PCR = Principal Component Regression
PLS = Partial Least Squares or Projection to Latent Structures
PRESS = Prediction Error Sum of Squares
SPC = Statistical Process Control
SPE = Squared Prediction Error
UCL = Upper Control Limit

References

Anderson, T. W., Introduction to Multivariate Statistical Analysis, 2nd ed., Wiley, New York (1984).
Beebe, K. R. and B. R. Kowalski, "An Introduction to Multivariate Calibration and Analysis", Anal. Chem. 59, 1007a-1017a (1987).
Box, G. E. P., W. Hunter and J. S. Hunter, Statistics for Experimenters, Wiley, New York (1978).
Chatfield, C. and A. J. Collins, Introduction to Multivariate Analysis, Chapman Hall, London (1986).
Chin, A., Private Communication, Department of Chemical Engineering, McMaster University (1989).
Denney, D. W., J. McKay, T. MacHattie, C. Flora and E. Mastracci, "Application of Pattern Recognition Techniques to Process Unit Data", Presented at CSChE Conference, Sarnia, Ont., Oct. (1985).
Draper, N. R. and H. Smith, Applied Regression Analysis, Wiley, New York (1981).
Geladi, P. and B. Kowalski, "Partial Least-Squares Regression: A Tutorial", Anal. Chim. Acta 185, 1-17 (1986a).
Geladi, P. and B. Kowalski, "An Example of 2-Block Predictive Partial Least Squares Regression with Simulated Data", Anal. Chim. Acta 185, 19-32 (1986b).
Geladi, P., "Notes on the History and Nature of Partial Least Squares (PLS) Modelling", J. Chemometrics 2, 231-246 (1988).
Helland, I., "On the Structure of Partial Least Squares Regression", Commun. Statist. - Simul. Comput. 17, 581-607 (1988).
Hoerl, A. E. and R. W. Kennard, "Ridge Regression: Biased Estimation for Non-orthogonal Problems", Technometrics 12, 55-67 (1970).
Horst, P., "Relations Among m Sets of Measures", Psychometrica 26, 129-149 (1961).
Hoskuldsson, A., "PLS Regression Methods", J. Chemometrics 2, 211-228 (1988).
Hunter, J. S., "Exponentially Weighted Moving Average", J. Qual. Technol. 18, 203-210 (1986).
Jackson, J. E., "Principal Components and Factor Analysis: Part I - Principal Analysis", J. Qual. Technol. 12, 201-213 (1980).
Jackson, J. E., "Control Procedures for Residuals Associated with Principal Component Analysis", Technometrics 21, 341-349 (1970).
Kelly, J., Private Communication, Department of Chemical Engineering, McMaster University (1989).
Kresta, J., T. E. Marlin and J. F. MacGregor, "Choosing Inferential Variables Using Projection to Latent Structures (PLS) with Application to Multicomponent Distillation", Presented at Annual AIChE Meeting, Chicago, IL., Nov. (1990).
Kvalheim, O. M., "A Partial-Least-Squares Approach to Interpretative Analysis of Multivariate Data", Chem. Intell. Lab. Sys. 3, 189-197 (1988).
Lorber, A., L. A. Wagner and B. R. Kowalski, "A Theoretical Foundation for the PLS Algorithm", J. Chemometrics 1, 19-31 (1987).
Manne, R., "Analysis of Two Partial Least Squares Algorithms for Multivariate Calibration", Chem. Intell. Lab. Sys. 2, 187-194 (1987).
Mardia, K. V., J. T. Kent and J. M. Bibby, Multivariate Analysis, Academic Press, London (1982).
Moseholm, L., "Analysis of Air Pollution Plant Exposure Data: The Soft Independent Modelling of Class Analogy (SIMCA) and Partial Least Squares Modelling with Latent Variable (PLS) Approaches", Environ. Pollut. 53, 313-331 (1988).
Naes, T. and H. Martens, "Comparison of Prediction Methods for Multicollinear Data", Commun. Statist. - Simul. Comput. 14, 545-576 (1985).
Otto, M. and W. Wegscheider, "Spectrophotometric Multicomponent Analysis Applied to Trace Metal Determinations", Anal. Chem. 57, 63-67 (1985).
Pearson, K., "On Lines and Planes of Closest Fit to Systems of Points in Space", Philos. Mag. 2, 559-572 (1901).
Shaw, I. D., T. W. Hoffman, A. Orlickas and P. M. Reilly, "The Hydrogenolysis of n-Butane on Nickel on Silica Catalyst: II Fluidized Bed Studies", Can. J. Chem. Eng. 50, 637-643 (1972).
Shewhart, W. A., Economic Control of Quality of Manufactured Product, Van Nostrand, Princeton, NJ (1931).
Smith, G. and F. Campbell, "A Critique of Some Ridge Regression Methods", J. Am. Stat. Assoc. 75, 74-81 (1980).
Wise, B. M. and N. L. Ricker, "Feedback Strategies in Multiple Sensor Systems", AIChE Symp. Ser. 85, 19-23 (1989).
Wise, B. M., N. L. Ricker and D. J. Veltkamp, "Upset and Sensor Detection in Multivariate Process", paper 164b, AIChE Annual Meeting, San Francisco, CA., Nov. (1989).
Wold, H., "Soft Modeling. The Basic Design and Some Extensions", in Systems Under Indirect Observation, K. Joreskog and H. Wold, Ed., North Holland, Amsterdam (1982).
Wold, S., "Cross-Validatory Estimation of the Number of Components in Factor and Principal Component Models", Technometrics 20, 397-405 (1978).
Wold, S., C. Albano, J. Dunn III, U. Edlund, K. Esbensen, P. Geladi, S. Hellberg, E. Johansson, W. Lindberg and M. Sjostrom, "Multivariate Data Analysis in Chemistry", in Chemometrics: Mathematics and Statistics in Chemistry, B. Kowalski, Ed., Reidel Publishing Co., Dordrecht, NL (1984a).
Wold, S., A. Ruhe, H. Wold and W. Dunn III, "The Collinearity Problem in Linear Regression. The Partial Least Squares (PLS) Approach to Generalized Inverses", SIAM J. Sci. Stat. Comput. 5, 735-743 (1984b).
Wold, S., S. Hellberg, T. Lundstedt, M. Sjostrom and H. Wold, "PLS Modelling with Latent Variables in two or more Dimensions", Frankfurt PLS Meeting, Frankfurt, FRG, Sept. (1987).
Wold, S., K. Esbensen and P. Geladi, "Principal Component Analysis", Chem. Intell. Lab. Sys. 2, 37-52 (1987).
Wold, S., N. Kettaneh-Wold and B. Skagerberg, "Non-Linear PLS Modelling", Chem. Intell. Lab. Sys. 7, 53-65 (1989).
Woodhall, W. H. and M. M. Ncube, "Multivariate CUSUM Quality-Control Procedures", Technometrics 27, 285-292 (1985).
Woodward, R. H. and P. L. Goldsmith, Cumulative Sum Techniques, Oliver and Boyd, London (1964).

Manuscript received February 5, 1990; revised manuscript received September 19, 1990; accepted for publication September 26, 1990.
