Multivariate Ordinal Categorical Process Control Based On Log Linear Modeling


Journal of Quality Technology

A Quarterly Journal of Methods, Applications and Related Topics

ISSN: 0022-4065 (Print) 2575-6230 (Online) Journal homepage: https://www.tandfonline.com/loi/ujqt20

Multivariate Ordinal Categorical Process Control Based on Log-Linear Modeling

Junjie Wang, Jian Li & Qin Su

To cite this article: Junjie Wang, Jian Li & Qin Su (2017) Multivariate Ordinal Categorical Process
Control Based on Log-Linear Modeling, Journal of Quality Technology, 49:2, 108-122, DOI:
10.1080/00224065.2017.11917983

To link to this article: https://doi.org/10.1080/00224065.2017.11917983

Published online: 21 Nov 2017.

JUNJIE WANG
Xi’an Jiaotong University, Xi’an, Shaanxi, China, and
City University of Hong Kong, Kowloon, Hong Kong

JIAN LI and QIN SU


Xi’an Jiaotong University, Xi’an, Shaanxi, China

In many applications, the quality of products or services tends to be measured by multiple categorical
characteristics, each of which is classified into attribute levels such as good, marginal, and bad. There
is usually a natural order among these attribute levels, but traditional monitoring techniques ignore
it. By assuming that each ordinal categorical quality characteristic is determined by
a latent continuous variable, this paper incorporates the ordinal information into an extended log-linear
model and proposes a multivariate ordinal categorical control chart based on a generalized likelihood-ratio
test. The proposed chart is efficient in detecting location shifts and dependence shifts in the corresponding
latent continuous variables of ordinal categorical characteristics based on merely the attribute-level counts
of the ordinal characteristics.

Key Words: Contingency Table; Dependence Shift; Latent Variable; Location Shift; Statistical Process
Control.

1. Introduction

Multivariate statistical process control (SPC) has been widely exploited to monitor multiple quality characteristics simultaneously. In practice, numerical data are not always available due to various constraints such as time and cost (Wang and Tsung (2007, 2010)). For example, on a porcelain production line, the quality characteristics of the porcelain are appearance, translucence, and whiteness. These are more naturally measured by ordinal categorical levels such as perfect, good, and bad (Taleb (2009)). Note that the attribute levels of each characteristic follow a sequence from the best to the worst or the other way around, which makes it reasonable to refer to them as multivariate ordinal categorical. In contrast, attribute levels without natural order are nominal. Hereafter, for simplicity, we call a categorical quality characteristic a “factor”.

Mr. Wang is a joint Doctoral Student in the School of Management, Xi’an Jiaotong University, and the Shenzhen Research Institute, Department of Systems Engineering and Engineering Management, City University of Hong Kong. His email is wangjunjie@stu.xjtu.edu.cn.

Dr. Li is Associate Professor in the School of Management and State Key Laboratory for Manufacturing Systems Engineering. His email is jianli@xjtu.edu.cn.

Dr. Su is Professor in the School of Management and State Key Laboratory for Manufacturing Systems Engineering. She is the corresponding author. Her email is qinsu@mail.xjtu.edu.cn.

In the literature, there is a large number of control charts for monitoring multivariate continuous processes. See Lowry and Montgomery (1995) and Bersimis et al. (2007) for an overview and Zou and Tsung (2011) and Zou et al. (2012) for nonparametric monitoring methods. The same cannot be said for multivariate categorical processes. An early contribution was made by Patel (1973), who proposed a $\chi^2$ scheme applied to multivariate binomial and Poisson processes by normal approximation. Others include the mnp-chart developed by Lu et al. (1998) and the mp-chart proposed by Chiu and Kuo (2008). More work was reviewed by Topalidou and

Journal of Quality Technology 108 Vol. 49, No. 2, April 2017



Psarakis (2009). The most recent developments include a series of charting techniques developed by Li et al. (2012, 2014a). They are all based on generalized likelihood-ratio test (GLRT) statistics derived from log-linear models, which incorporate both the marginal distribution of factors and the dependence among factors.

The above-mentioned monitoring approaches for multivariate categorical processes all treat categorical data as nominal and therefore ignore the order among their attribute levels. In fact, there is a big difference between ordinal categorical data and nominal categorical data, which is non-negligible in both practice and research. First, information is lost when traditional charts are used for monitoring factors regardless of the order among their levels. Second, Agresti (2010) pointed out that there has been an increasing emphasis on distinguishing ordinal data from nominal data in categorical data analysis since 1980, and he illustrated via a simple example that ordinal analysis can give quite different and more powerful results than analysis that ignores such ordinal information.

As far as we know, there are few monitoring methods that exploit the ordinal information among the attribute levels of a factor. To tackle this problem, it is reasonable to assume that there is a latent continuous variable that determines the attribute levels of an ordinal factor by classifying its numerical value into some attribute level according to some predefined thresholds or cutting points. With this assumption, Tucker et al. (2002) employed maximum likelihood estimation (MLE) to estimate the location shift in a guessed underlying continuous distribution and then proposed a control chart for monitoring a single ordinal factor. However, this approach is computationally intensive and difficult to apply if the number of attribute levels is large. Recently, Li et al. (2014b) developed a simple ordinal categorical (SOC) chart for detecting location shifts in the latent continuous variable of a factor. The SOC chart has a very simple charting statistic, and it is efficient in detecting latent location shifts and robust to various types of latent continuous distributions.

When it comes to multiple ordinal categorical factors, to the best of our knowledge, there is only the nonparametric permutation-based control (NPC) chart proposed by Corain and Salmaso (2014). The NPC chart needs no estimation of parameters, but it is computationally intensive because control limits must be calculated at each sampling point. This may not be suitable for online monitoring and therefore limits its use in practice. To fill this research gap, based on the above latent continuous-variable assumption, this paper develops a multivariate version of the SOC chart developed by Li et al. (2014b), namely the multivariate ordinal categorical (MOC) control chart. The proposed MOC chart should be efficient in detecting both the location shifts of latent continuous variables of multiple ordinal factors and dependence deviations of these underlying continuous variables. To this end, we borrow the power of log-linear modeling in incorporating both the marginal distributions of factors and the dependence among factors and extend it to exploit the ordinal information among attribute levels of factors. Then, combined with the exponentially weighted moving average (EWMA) scheme, a GLRT statistic can be derived as the charting statistic of the MOC chart.

The rest of the article is organized as follows. First, the extended log-linear model is proposed for incorporating the ordinal information. Then the multivariate ordinal categorical chart is developed based on the extended log-linear model, which is followed by the performance evaluation of the MOC chart and its comparison with some other methods. Before the concluding remarks, one real application of the proposed MOC chart is provided to illustrate its implementation. Some numerical proofs are left to the appendix.

2. Incorporating Ordinal Information into Log-Linear Modeling

2.1. Multivariate Categorical Process

Assume a multivariate categorical process with $d$ factors of interest denoted by $\mathbf{X} = \{X_1, \ldots, X_d\}$, each of which, say factor $X_i$, takes $h_i$ ($i = 1, \ldots, d$) attribute levels $a_i = 1, \ldots, h_i$. All the level combinations form a $d$-way contingency table of dimension $h_1 \times \cdots \times h_d$ with $h = \prod_{i=1}^{d} h_i$ cells. Each observation is a level combination and falls into one of the $h$ cells as an observed count. Let $p_{a_1 \ldots a_d}$ be the probability of an observation with level combination $a_1, \ldots, a_d$ ($a_i = 1, \ldots, h_i$ and $i = 1, \ldots, d$), that is, of falling into cell $(a_1, \ldots, a_d)$, and let the count in cell $(a_1, \ldots, a_d)$ be denoted by $n_{a_1 \ldots a_d}$ in a sample of size $N$. Then the marginal sum counts of factor $X_i$ in level $v$, with $v = 1, \ldots, h_i$, can be easily calculated as

$$n_{(i)v} = \sum_{a_1}\sum_{a_2}\cdots\sum_{a_{i-1}}\sum_{a_{i+1}}\cdots\sum_{a_{d-1}}\sum_{a_d} n_{a_1 a_2 \ldots a_{i-1} v a_{i+1} \ldots a_{d-1} a_d}.$$
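As an illustration of the marginal sum counts just defined, the following is a minimal Python sketch, assuming NumPy is available. The function names `contingency_table` and `marginal_counts` and the toy observations are hypothetical and are not taken from the paper. It tabulates observed level combinations into a $d$-way contingency table and then obtains $n_{(i)v}$ by summing over every factor except $X_i$.

```python
import numpy as np

def contingency_table(obs, levels):
    """Tabulate level combinations into a d-way table of cell counts.

    obs    : (N, d) integer array, one row per observation, levels are 1-based
    levels : tuple (h_1, ..., h_d) with the number of levels of each factor
    """
    table = np.zeros(levels, dtype=int)
    # np.add.at accumulates correctly even when a cell index is repeated
    np.add.at(table, tuple((obs - 1).T), 1)
    return table

def marginal_counts(table, i):
    """Marginal sum counts n_(i)v, v = 1..h_i, for factor i (0-based axis)."""
    other_axes = tuple(ax for ax in range(table.ndim) if ax != i)
    return table.sum(axis=other_axes)

# Hypothetical toy sample: N = 5 observations of d = 3 factors
# with (h_1, h_2, h_3) = (3, 3, 4) levels.
obs = np.array([[1, 1, 2],
                [1, 2, 2],
                [3, 2, 4],
                [2, 2, 1],
                [1, 1, 2]])
table = contingency_table(obs, (3, 3, 4))

for i in range(table.ndim):
    print(f"factor X{i + 1}:", marginal_counts(table, i))
```

Each vector of marginal counts sums to the sample size $N$, which is a quick sanity check when tabulating real data.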


Obviously, given the sample size $N$, the $h_i$ marginal sum counts of $X_i$ from $n_{(i)1}$ to $n_{(i)h_i}$ jointly follow the multinomial distribution $\mathrm{MN}(N; p_{(i)1}, \ldots, p_{(i)h_i})$, where $p_{(i)v}$ ($v = 1, \ldots, h_i$) is the marginal probability of factor $X_i$ in level $v$ and can be similarly calculated as

$$p_{(i)v} = \sum_{a_1}\sum_{a_2}\cdots\sum_{a_{i-1}}\sum_{a_{i+1}}\cdots\sum_{a_{d-1}}\sum_{a_d} p_{a_1 a_2 \ldots a_{i-1} v a_{i+1} \ldots a_{d-1} a_d}.$$

When all the factors are considered, the joint distribution of the $d$ groups of marginal sum counts $n_{(i)1}, \ldots, n_{(i)h_i}$ ($i = 1, \ldots, d$) is a multivariate multinomial distribution (see Johnson et al. (1997)). In particular, it simplifies to a multivariate binomial distribution if each factor takes only two levels.

In the literature, log-linear models have proven to be convenient and useful in characterizing multivariate categorical processes. To illustrate, imagine a three-way contingency table with factors $X_1$, $X_2$, and $X_3$ taking $h_1$, $h_2$, and $h_3$ attribute levels, respectively. Let $m_{a_1 a_2 a_3}$ ($a_1 = 1, \ldots, h_1$; $a_2 = 1, \ldots, h_2$; $a_3 = 1, \ldots, h_3$) represent the expected count with the level combination of $X_1$ in level $a_1$, $X_2$ in level $a_2$, and $X_3$ in level $a_3$. Then the log-linear model is

$$\ln m_{a_1 a_2 a_3} = u^{(0)} + u^{(1)}_{a_1} + u^{(2)}_{a_2} + u^{(3)}_{a_3} + u^{(1,2)}_{a_1,a_2} + u^{(1,3)}_{a_1,a_3} + u^{(2,3)}_{a_2,a_3} + u^{(1,2,3)}_{a_1,a_2,a_3},$$

where $u^{(0)}$ is the overall mean; $u^{(1)}$, $u^{(2)}$, and $u^{(3)}$ are the main effects; $u^{(1,2)}$, $u^{(1,3)}$, and $u^{(2,3)}$ are the two-factor interaction effects; and $u^{(1,2,3)}$ is the three-factor interaction effect. Denote by $p_{a_1 a_2 a_3}$ the probability of an observation falling into cell $(a_1, a_2, a_3)$; clearly $m_{a_1 a_2 a_3} = N p_{a_1 a_2 a_3}$ given the sample size $N$. With a fixed $N$, the cell counts jointly follow a multinomial distribution. It is more convenient to concentrate on $p_{a_1 a_2 a_3}$ instead of $m_{a_1 a_2 a_3}$, and the log-linear model can be rewritten as

$$\ln p_{a_1 a_2 a_3} = u^{(0)} + u^{(1)}_{a_1} + u^{(2)}_{a_2} + u^{(3)}_{a_3} + u^{(1,2)}_{a_1,a_2} + u^{(1,3)}_{a_1,a_3} + u^{(2,3)}_{a_2,a_3} + u^{(1,2,3)}_{a_1,a_2,a_3}.$$

2.2. Extended Log-linear Models

In terms of describing ordinal information, it is natural and efficient to assume an unobservable continuous variable underlying the factor of interest (Tucker et al. (2002)). This latent continuous variable determines the ordinal attribute level of the factor by classifying its numerical value into some interval or category. For example, factor $X_i$ has three ordinal levels that are classified by its latent continuous variable $X_i^*$ according to the following thresholds:

excellent: $X_i^* \in (-\infty, 2.5]$,
acceptable: $X_i^* \in (2.5, 3]$,
unacceptable: $X_i^* \in (3, +\infty)$.

In general, the $h_i$ levels of factor $X_i$ are obtained by classifying its latent continuous variable $X_i^*$ into $h_i$ intervals according to the following thresholds:

$$-\infty = b_{i,0} < b_{i,1} < \cdots < b_{i,h_i-1} < b_{i,h_i} = +\infty.$$

Given the cumulative distribution function (CDF) $F_i(x_i^*)$ of the latent variable, the probability that a continuous observation falls into interval $(b_{i,v-1}, b_{i,v}]$ in the latent scale is $p_{(i)v} = F_i(b_{i,v}) - F_i(b_{i,v-1})$, which is equivalent to the (marginal) probability that a categorical observation pertains to level $v$ ($v = 1, \ldots, h_i$) of factor $X_i$. Given the sample size $N$, the (marginal) counts $n_{(i)v}$ ($v = 1, \ldots, h_i$) in level $v$ of factor $X_i$ jointly follow the multinomial distribution $\mathrm{MN}(N; p_{(i)1}, \ldots, p_{(i)h_i})$.

With a single factor, say $X_i$ with $h_i$ attribute levels, the test statistic proposed by Li et al. (2014b) for detecting the location shift of its latent continuous variable is

$$W_i = \left| \sum_{k=1}^{h_i} \left(c_{(i)k} + c_{(i)(k-1)} - 1\right) n_{(i)k} \right|, \tag{1}$$

where $c_{(i)k} = \sum_{j=1}^{k} p_{(i)j}$ ($k = 1, \ldots, h_i$), with $c_{(i)0} = 0$, is the cumulative probability up to level $k$ of factor $X_i$, and $n_{(i)k}$ is the observed count in level $k$. This statistic can be regarded as assigning a different score to each level to form a weighted sum of the observed ordinal-level counts. Although it is obtained under the assumption that the latent continuous variable follows a logistic distribution, the resulting chart is easy to build and is robust to various types of underlying distributions, such as normal and Gamma.

Back to multivariate ordinal categorical processes, it is intuitive to construct a separate chart for each factor $X_i$ with the test statistic $W_i$ ($i = 1, \ldots, d$) introduced above, so that the $d$ statistics together form a multi-chart. However, this method fails to incorporate the dependence between factors, and fixing its in-control (IC) average run length (ARL) is nontrivial (see Woodall and Ncube (1985)). In order to borrow the power of log-linear models in incorporating multiple categorical variables, fortunately we find


that the test statistic in Equation (1) is equivalent to the likelihood-ratio test statistic derived from a modified log-linear model with only one factor.

To be specific, consider a log-linear model involving only factor $X_i$. Originally, the log-linear model is

$$\ln p_{(i)k} = u^{(0)} + u^{(i)}_k, \quad k = 1, \ldots, h_i,$$

with $\sum_{k=1}^{h_i} p_{(i)k} = 1$. If we let $u^{(i)}_k = \beta_i s_{(i)k}$ with

$$s_{(i)k} = c_{(i)k} + c_{(i)(k-1)} - 1,$$

where $\beta_i$ is a coefficient that relates the main effect of factor $X_i$ to its marginal probability, the above log-linear model is modified into

$$\ln p_{(i)k} = \beta_0 + \beta_i s_{(i)k}, \tag{2}$$

where $\beta_0$ is the same as the intercept $u^{(0)}$. Compared with the original one, this modified log-linear model is more parsimonious and contains fewer parameters, namely only one independent parameter $\beta_i$. Replacing $u^{(i)}_k$ with $\beta_i s_{(i)k}$ exploits the ordinal information among the attribute levels of factor $X_i$, where it is the latent continuous variable $X_i^*$ that makes factor $X_i$ ordinal. In fact, the −2LRT statistic for testing if there is any deviation in $\beta_i$ of model (2) has the form $W_i^2/U$ with $U$ a constant, which is equivalent to that in Equation (1) for testing if there is any location shift in the latent variable $X_i^*$. The proof can be found in Appendix A.

Here $s_{(i)k}$ ($k = 1, \ldots, h_i$) have an intuitive meaning: they represent a type of standardized rank of the observations in level $k$ of factor $X_i$. To be specific, we arrange the $\sum_{k=1}^{h_i} n_{(i)k}$ observations of $X_i$ from level 1 to level $h_i$, and the $n_{(i)k}$ observations in level $k$ take the positions $\sum_{j=1}^{k-1} n_{(i)j} + 1, \ldots, \sum_{j=1}^{k} n_{(i)j}$. Because all the observations in the same level should have the same rank, we assign the average of the above as the rank of level $k$, which is $\left(\sum_{j=1}^{k-1} n_{(i)j} + 1 + \sum_{j=1}^{k} n_{(i)j}\right)/2$. By dividing the above rank by the total size and letting the total size tend to infinity, the standardized rank of observations in level $k$ in the population can be defined as $(c_{(i)(k-1)} + c_{(i)k})/2$, lying in $[0, 1]$, which can further be converted to $s_{(i)k} = c_{(i)(k-1)} + c_{(i)k} - 1$, lying in $[-1, 1]$ and satisfying $\sum_{k=1}^{h_i} s_{(i)k} p_{(i)k} = 0$. One can refer to Ding et al. (2016) for details.

The equivalence between testing any deviation in $\beta_i$ of model (2) and testing any latent location shift in $X_i^*$ by Equation (1) makes it convenient to consider multiple ordinal factors simultaneously. To be specific, for each factor $X_i$ ($i = 1, \ldots, d$) in a multivariate categorical setting, we may construct a novel log-linear model in a similar way, which is

$$\ln p_{a_1 \ldots a_d} = \beta_0 + \beta_1 s_{(1)a_1} + \cdots + \beta_d s_{(d)a_d} = \beta_0 + \sum_{i=1}^{d} \beta_i s_{(i)a_i}. \tag{3}$$

This model considers the main effect of each factor $X_i$ on the probability of an observation falling into cell $(a_1, \ldots, a_d)$, that is, having the attribute level combination $a_1, \ldots, a_d$.

However, model (3) considers only the main effects, which implies that the $d$ factors are independent of each other. This is not always the case. To account for the association among factors, apart from main effects, we include only two-factor interaction effects and neglect higher-order interactions. First, two-factor interaction effects represent the association between each pair of factors, such as $u^{(i,j)}$ accounting for the dependence between $X_i$ and $X_j$. Second, lower-order effects are more likely to occur and play a more important role than higher-order effects, which is a common consideration in many applications, such as design of experiments (Wu and Hamada (2000)) and SPC (Li et al. (2012)). Even if some higher-order effects should indeed be included in the model, main effects and two-factor interaction effects are sufficient for monitoring purposes. Third, in an overview of modeling ordinal categorical data, Agresti (2010) put emphasis on describing the association between only two ordinal factors.

To incorporate the ordinal information among the attribute levels of factors into the two-factor interaction effects, Haberman (1974) provided a linear-by-linear (L×L) association model to characterize the association between two ordinal factors based on log-linear modeling. Here the two-factor interaction effect $u^{(i,j)}_{a_i,a_j}$ of $X_i$ and $X_j$ is expressed by the product of one coefficient and two scores, one for each factor. We take $s_{(i)a_i}$ as the score of factor $X_i$ in level $a_i$, which satisfies the constraints on the scores suggested by Agresti (2010), and the interaction effect becomes

$$u^{(i,j)}_{a_i,a_j} = \beta_{i,j} s_{(i)a_i} s_{(j)a_j}.$$

Ultimately, the extended log-linear model that incorporates the ordinal information among the attribute levels of factors should be


$$\ln p_{a_1 \ldots a_d} = \beta_0 + \sum_{i=1}^{d} \beta_i s_{(i)a_i} + \sum_{i=1}^{d-1}\sum_{j=i+1}^{d} \beta_{i,j} s_{(i)a_i} s_{(j)a_j}, \tag{4}$$

with $\sum_{a_1,\ldots,a_d} p_{a_1 \ldots a_d} = 1$.

Compared with the original log-linear model, model (3) expresses the main effect of each factor $X_i$ by $\beta_i s_{(i)k}$ ($k = 1, \ldots, h_i$) and, therefore, reduces the number of independent parameters from $\sum_{i=1}^{d}(h_i - 1)$ to $d$, which is more parsimonious. Thus, model (3) may not be a good choice for characterizing the relationship between cell probabilities in the multi-way contingency table and the associated factor levels. Instead, it is constructed to derive the −2LRT statistic for testing if its coefficients $\beta_i$ ($i = 1, \ldots, d$) shift, which is equivalent to testing any location shift in the joint distribution of the latent continuous variables $X_1^*, \ldots, X_d^*$. Similarly, model (4) also serves to derive a −2LRT statistic, which is given later in Equation (7).

3. Multivariate Ordinal Categorical Chart

3.1. Matrix Form of the Extended Log-Linear Model

The extended log-linear model in Equation (4) incorporates both the marginal distributions of and the dependence among factors, and the ordinal information among the attribute levels of each factor is also exploited. We have seen that monitoring a main-effect coefficient, say $\beta_i$, is equivalent to detecting if there is any location shift in the latent continuous variable $X_i^*$ of factor $X_i$. Likewise, any deviation in the two-factor interaction-effect coefficient $\beta_{i,j}$ corresponds to a shift in the dependence of the latent continuous variables $X_i^*$ and $X_j^*$ of factors $X_i$ and $X_j$, respectively. Therefore, we may combine the detection of all the coefficients, including the main-effect ones $\beta_1, \ldots, \beta_d$ and the two-factor interaction ones $\beta_{1,2}, \ldots, \beta_{d-1,d}$, to monitor if there are any location shifts or dependence shifts in the latent joint continuous distribution of the underlying variables $X_1^*, \ldots, X_d^*$. In addition, it should be kept in mind that the monitoring task relies only on the cell counts, such as $n_{a_1 \ldots a_d}$ in cell $(a_1, \ldots, a_d)$, and that we have neither the specific values of the latent variables $X_1^*, \ldots, X_d^*$ nor the underlying thresholds that classify each $X_i^*$ into some attribute level $a_i$.

Before deriving the charting statistic of the proposed control chart, it is beneficial to give a matrix representation of the extended log-linear model (4). By denoting $\mathbf{p} = [p_{1 \ldots 1}, p_{1 \ldots 2}, \ldots, p_{h_1 \ldots h_d - 1}, p_{h_1 \ldots h_d}]^T$ of dimension $h \times 1$, its matrix form is

$$\ln \mathbf{p} = \mathbf{1}\beta_0 + \sum_{i=1}^{d} \mathbf{y}_i \beta_i + \sum_{i=1}^{d-1}\sum_{j=i+1}^{d} \mathbf{y}_{i,j} \beta_{i,j}, \tag{5}$$

where $\mathbf{1}$ is a column vector of appropriate dimension with all entries equal to 1, and $\mathbf{y}_i$ and $\mathbf{y}_{i,j}$, both of dimension $h \times 1$, are design vectors consisting of elements such as $s_{(i)a_i}$ and $s_{(i)a_i} s_{(j)a_j}$, respectively. To construct them, first we denote the column vector with elements all equal to 1 and of dimension $h_i \times 1$ by $\mathbf{1}_{h_i}$ and denote the score vector of factor $X_i$ by

$$\mathbf{s}_{(i)} = [s_{(i)1}, \ldots, s_{(i)h_i}]^T.$$

Then we have, e.g.,

$$\begin{aligned}
\mathbf{y}_1 &= \mathbf{s}_{(1)} \otimes \mathbf{1}_{h_2} \otimes \mathbf{1}_{h_3} \otimes \cdots \otimes \mathbf{1}_{h_{d-2}} \otimes \mathbf{1}_{h_{d-1}} \otimes \mathbf{1}_{h_d},\\
\mathbf{y}_2 &= \mathbf{1}_{h_1} \otimes \mathbf{s}_{(2)} \otimes \mathbf{1}_{h_3} \otimes \cdots \otimes \mathbf{1}_{h_{d-2}} \otimes \mathbf{1}_{h_{d-1}} \otimes \mathbf{1}_{h_d},\\
\mathbf{y}_d &= \mathbf{1}_{h_1} \otimes \mathbf{1}_{h_2} \otimes \mathbf{1}_{h_3} \otimes \cdots \otimes \mathbf{1}_{h_{d-2}} \otimes \mathbf{1}_{h_{d-1}} \otimes \mathbf{s}_{(d)},\\
\mathbf{y}_{1,2} &= \mathbf{s}_{(1)} \otimes \mathbf{s}_{(2)} \otimes \mathbf{1}_{h_3} \otimes \cdots \otimes \mathbf{1}_{h_{d-2}} \otimes \mathbf{1}_{h_{d-1}} \otimes \mathbf{1}_{h_d},\\
\mathbf{y}_{3,d-1} &= \mathbf{1}_{h_1} \otimes \mathbf{1}_{h_2} \otimes \mathbf{s}_{(3)} \otimes \cdots \otimes \mathbf{1}_{h_{d-2}} \otimes \mathbf{s}_{(d-1)} \otimes \mathbf{1}_{h_d},
\end{aligned}$$

where $\otimes$ is the Kronecker product operator. Therefore, we may summarize the rule for constructing $\mathbf{y}_i$ and $\mathbf{y}_{i,j}$. Given the basic column vector $\mathbf{1}_{h_1} \otimes \mathbf{1}_{h_2} \otimes \cdots \otimes \mathbf{1}_{h_{d-1}} \otimes \mathbf{1}_{h_d}$, replace $\mathbf{1}_{h_i}$ with $\mathbf{s}_{(i)}$ for $\mathbf{y}_i$, and replace $\mathbf{1}_{h_i}$ and $\mathbf{1}_{h_j}$ with $\mathbf{s}_{(i)}$ and $\mathbf{s}_{(j)}$, respectively, for $\mathbf{y}_{i,j}$. Furthermore, letting

$$\mathbf{Y} = [\mathbf{y}_1, \ldots, \mathbf{y}_d, \mathbf{y}_{1,2}, \mathbf{y}_{1,3}, \ldots, \mathbf{y}_{d-2,d}, \mathbf{y}_{d-1,d}],$$
$$\boldsymbol{\beta} = [\beta_1, \ldots, \beta_d, \beta_{1,2}, \beta_{1,3}, \ldots, \beta_{d-2,d}, \beta_{d-1,d}]^T,$$

the log-linear model in Equation (5) can be rewritten as

$$\ln \mathbf{p} = \mathbf{1}\beta_0 + \mathbf{Y}\boldsymbol{\beta} \quad \text{subject to } \mathbf{1}^T \mathbf{p} = 1. \tag{6}$$

3.2. Proposed Chart

Usually, there are two phases involved in the use of control charts. In phase I, the process data receive retrospective analysis to detect and correct unusual patterns. In the end, a clean dataset is obtained and used to estimate parameters that represent the IC process performance. In phase II, these IC parameters are employed as a reference to determine if the process goes out of control (OC). It is common to measure the performance of a phase II chart with the average run length (ARL), which is the average number of samples collected before an OC signal is triggered. With the same IC ARL, the chart with a smaller OC ARL performs better because it can detect the shift faster. This paper intends to propose a phase II chart for multivariate


ordinal categorical process without loss of the ordinal information among the attribute levels of factors.

Here we assume that the IC cell probability vector $\mathbf{p}^{(0)}$ of dimension $h \times 1$ is known or has been estimated from an IC dataset by dividing the IC cell count vector by the IC dataset size, where the superscript (0) represents the IC state. Furthermore, the IC marginal probabilities $p^{(0)}_{(i)a_i}$ of factor $X_i$ in level $a_i$ ($i = 1, \ldots, d$ and $a_i = 1, \ldots, h_i$) can be calculated naturally, and the IC cumulative probabilities $c^{(0)}_{(i)a_i}$, the scores $s_{(i)a_i} = c^{(0)}_{(i)a_i} + c^{(0)}_{(i)(a_i-1)} - 1$, and the design matrix $\mathbf{Y}$ are also obtained. In addition, we will see later that there is no need to know the IC coefficient vector $\boldsymbol{\beta}^{(0)}$. We intend to monitor if there is any shift in the location parameters or the dependence of the latent continuous variables of the ordinal factors based on merely the cell counts in the multi-way contingency table formed by the factors. With the above-mentioned IC parameters, we may construct the proposed chart.

To this end, it is reasonable to assume that the unobservable random vectors $\mathbf{x}_t^* = [X_{1t}^*, \ldots, X_{dt}^*]^T$ of these latent continuous variables are sequentially collected from the following change-point model

$$\mathbf{x}_t^* \overset{\text{i.i.d.}}{\sim} \begin{cases} G^{(0)}(\mathbf{x}^*, \boldsymbol{\mu}^{(0)}, \boldsymbol{\Sigma}^{(0)}), & \text{for } t \le \tau,\\ G^{(1)}(\mathbf{x}^*, \boldsymbol{\mu}^{(1)}, \boldsymbol{\Sigma}^{(1)}), & \text{for } t > \tau, \end{cases}$$

where $G^{(0)}$ and $G^{(1)}$ are the IC and OC joint distributions of the latent continuous variables $X_1^*, \ldots, X_d^*$, with $\boldsymbol{\mu}^{(0)}$ and $\boldsymbol{\mu}^{(1)}$ as their location parameter vectors and $\boldsymbol{\Sigma}^{(0)}$ and $\boldsymbol{\Sigma}^{(1)}$ as their correlation matrices, and $\tau$ is the unknown change point. At the change point, the multivariate ordinal categorical process may shift in at least one of the location parameter vector and the correlation matrix. However, in practice, we cannot observe the specific values of $\mathbf{x}_t^*$. Instead, we can collect only the attribute levels of each factor, say $X_{it} = a_i$ ($i = 1, \ldots, d$) at time point $t$. Given the phase II sample size $N$, for the $k$th sample consisting of $N$ sequential attribute-level vectors from time point $t = (k-1)N + 1$ to time point $t = kN$, we can calculate and summarize the cell counts $n_{a_1 \ldots a_d, k}$ with each level combination $a_1, \ldots, a_d$ ($a_i = 1, \ldots, h_i$) to form a cell count vector $\mathbf{n}_k$ of dimension $h \times 1$ satisfying $\mathbf{1}^T \mathbf{n}_k = N$.

In order to see how the cell counts in the $d$-way contingency table, such as $\mathbf{n}_k$, reflect the shifts from $G^{(0)}$ to $G^{(1)}$, we may take the marginals of factor $X_i$ for illustration. In the IC state, the marginal counts follow the multinomial distribution $\mathrm{MN}(N; p^{(0)}_{(i)1}, \ldots, p^{(0)}_{(i)h_i})$ with $p^{(0)}_{(i)v} = F_i^{(0)}(b_{i,v}) - F_i^{(0)}(b_{i,v-1})$. At the change point, if there is only a location shift in the underlying continuous variable $X_i^*$ that changes the location parameter from $\mu_i^{(0)}$ to $\mu_i^{(1)} = \mu_i^{(0)} + \delta$, the marginal probabilities of factor $X_i$ in the OC state would be $p^{(1)}_{(i)v} = F_i^{(0)}(b_{i,v} - \delta) - F_i^{(0)}(b_{i,v-1} - \delta)$. According to the change-point model, we are actually testing the hypothesis regarding the location parameter vector $\boldsymbol{\mu}$ and the correlation matrix $\boldsymbol{\Sigma}$ at each sampling point $k$,

$$H_0: \boldsymbol{\mu} = \boldsymbol{\mu}^{(0)} \text{ and } \boldsymbol{\Sigma} = \boldsymbol{\Sigma}^{(0)}$$
$$H_1: \boldsymbol{\mu} \ne \boldsymbol{\mu}^{(0)} \text{ or } \boldsymbol{\Sigma} \ne \boldsymbol{\Sigma}^{(0)}.$$

The extended log-linear model of Equation (6) is proposed to characterize multivariate ordinal categorical processes. The main effects $\mathbf{y}_i \beta_i$ ($i = 1, \ldots, d$) in the model reflect the marginal distributions of factors. In fact, a shift in $\beta_i$ represents a location shift in the latent continuous variable $X_i^*$ of factor $X_i$. Similarly, the two-factor interaction effects $\mathbf{y}_{i,j} \beta_{i,j}$ ($i = 1, \ldots, d-1$; $j = i+1, \ldots, d$) reflect the dependence between factors $X_i$ and $X_j$, and specifically a shift in $\beta_{i,j}$ echoes a shift in the correlation coefficient of the latent continuous variables $X_i^*$ and $X_j^*$. Therefore, to monitor shifts in both $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ of the joint distribution of the underlying variables $X_1^*, \ldots, X_d^*$ and to test the above hypothesis, it suffices to test the following hypothesis:

$$H_0: \boldsymbol{\beta} = \boldsymbol{\beta}^{(0)} \quad \text{versus} \quad H_1: \boldsymbol{\beta} \ne \boldsymbol{\beta}^{(0)},$$

at each sampling point $k$, where $\boldsymbol{\beta}^{(0)} = [\beta_1^{(0)}, \ldots, \beta_d^{(0)}, \beta_{1,2}^{(0)}, \ldots, \beta_{d-1,d}^{(0)}]^T$ is the coefficient vector in the log-linear model of Equation (6) that represents the IC state of the process.

The GLRT can be employed to develop a −2LRT statistic for the above hypothesis (Anderson (2003)), which is

$$Q_k = \frac{1}{N} (\mathbf{n}_k - N\mathbf{p}^{(0)})^T \mathbf{Y} (\mathbf{Y}^T \boldsymbol{\Lambda}^{(0)} \mathbf{Y})^{-1} \mathbf{Y}^T (\mathbf{n}_k - N\mathbf{p}^{(0)}). \tag{7}$$

Its derivation can be found in Appendix B. Here $\boldsymbol{\Lambda}^{(0)} = \mathrm{diag}(\mathbf{p}^{(0)}) - \mathbf{p}^{(0)} \mathbf{p}^{(0)T}$ represents the IC covariance matrix (Agresti (2010)), where $\mathrm{diag}(\mathbf{a})$ is the diagonal square matrix with the column vector $\mathbf{a}$ as its diagonal elements. Note that the test statistic does not contain the IC coefficient vector $\boldsymbol{\beta}^{(0)}$, and only $\mathbf{Y}$ and $\mathbf{p}^{(0)}$ are required.

To make full use of the information in both the past and current samples, the EWMA scheme is employed to construct a phase II control chart. Given


the smoothing parameter $0 < \lambda \le 1$, the exponentially weighted sum of the observation vector at sampling point $k$ is

$$\mathbf{z}_k = (1 - \lambda)\mathbf{z}_{k-1} + \lambda \mathbf{n}_k.$$

Hence, the finalized charting statistic is

$$R_k = \frac{1}{N} (\mathbf{z}_k - N\mathbf{p}^{(0)})^T \mathbf{Y} (\mathbf{Y}^T \boldsymbol{\Lambda}^{(0)} \mathbf{Y})^{-1} \mathbf{Y}^T (\mathbf{z}_k - N\mathbf{p}^{(0)}).$$

The control limit $L$ can be determined by simulation with bisection search, the framework of which can be found in the appendix of Dickinson et al. (2014). When $R_k > L$, the chart triggers an OC signal. Hereafter, the proposed chart is named the multivariate ordinal categorical (MOC) control chart. As would be expected, the advantage of the MOC chart should lie in detecting shifts occurring in the parameters of the joint latent distribution, such as shifts in the mean vector $\boldsymbol{\mu}$ and the correlation matrix $\boldsymbol{\Sigma}$. The joint latent continuous distribution reflects the ordinal information between attribute levels of each factor. If the monitoring focus is not the joint latent distribution, alternative charts should be considered. This paper focuses on monitoring only. The diagnostic task of identifying shift locations can be developed based on the framework proposed by Zou et al. (2011).

4. Performance Assessment

This section investigates the performance of the MOC chart by comparing it with alternative methods. In line with the SPC convention, the IC ARL is set to 370 for all the compared charts. All the ARLs are computed by averaging 10,000 replications, and a chart with smaller OC ARLs outperforms the others.

4.1. Alternative Methods and Simulation Settings

Here the log-linear multivariate binomial/multinomial (LMBM) control chart proposed by Li et al. (2014a) is chosen for comparison because it is also based on log-linear modeling and can handle multivariate categorical processes that consist of multivariate binomial and multivariate multinomial processes. Basically, the LMBM chart is constructed based on the IC hierarchy structure of the factors, which is derived by variable selection and represents the association pattern among these factors. In phase II, given the exponentially weighted sum of observation vectors $\mathbf{z}_k$, we may further obtain the MLE of the cell count expectation vector $\hat{\mathbf{w}}_k$ at sampling point $k$; then the charting statistic of the LMBM chart is still the −2LRT statistic

$$S_k = 2\mathbf{z}_k^T (\ln \hat{\mathbf{w}}_k - \ln \mathbf{m}^{(0)}),$$

where $\mathbf{m}^{(0)}$ is the IC cell count expectation vector. The expectation vector $\hat{\mathbf{w}}_k$ can be efficiently computed by the iterative proportional fitting (IPF) algorithm (Bishop et al. (2007)). The IPF algorithm can be easily run with the subroutine “PRPFT” in Fortran with the IMSL library, given the IC hierarchy structure of a log-linear model. However, the LMBM chart assumes that the IC hierarchy structure remains unchanged in phase II SPC. Compared with the MOC chart, the LMBM chart specializes in detecting shifts in main effects and interaction effects of original log-linear models, which do not assume latent continuous variables of categorical factors.

Without loss of generality, $d = 3$ factors with 3, 3, and 4 ordinal levels are considered. They form a three-way contingency table with $3 \times 3 \times 4 = 36$ cells, which also means there are 36 level combinations. We evaluate the performance of both charts under three types of latent three-variate joint distributions: (1) multivariate normal distribution $N_3(\mathbf{0}, \boldsymbol{\Omega})$; (2) multivariate $t$ distribution $t_3(3)$ with 3 degrees of freedom (d.f.); (3) multivariate Gamma distribution $\mathrm{Ga}_3(3, 1)$ with shape parameter 3 and scale parameter 1. According to the appendix of Stoumbos and Sullivan (2002), the multivariate $t$ distribution and multivariate Gamma distribution can be obtained based on the multivariate normal distribution. The covariance matrix $\boldsymbol{\Omega}$ associated with $N_3(\mathbf{0}, \boldsymbol{\Omega})$, $t_3(3)$, and $\mathrm{Ga}_3(3, 1)$ is selected as follows:

$$\boldsymbol{\Omega} = \begin{bmatrix} 1 & \rho_{12} & \rho_{13} \\ \rho_{12} & 1 & \rho_{23} \\ \rho_{13} & \rho_{23} & 1 \end{bmatrix},$$

where the diagonal elements, that is, the variances, are all 1, and $\rho_{ij}$ ($i \ne j$) is the correlation coefficient between factor $X_i$ and factor $X_j$. In particular, the covariance matrix $\boldsymbol{\Omega}$ is identical to the correlation matrix $\boldsymbol{\Sigma}$ of $N_3(\mathbf{0}, \boldsymbol{\Omega})$, but this does not hold for $t_3(3)$ and $\mathrm{Ga}_3(3, 1)$.

The simulation is conducted in five steps: (1) choose a multivariate distribution among the three underlying joint distributions; (2) set a group of correlation coefficients and cutting points for latent continuous variables; (3) estimate the required parameters and compute control limits satisfying the predefined IC ARL; (4) obtain OC ARLs under location shifts in the latent joint distribution; (5) obtain OC ARLs


when the dependence among the latent continuous variables specified by ρij is changed. For each case, the hierarchy structure in the IC log-linear model employed in the LMBM chart is figured out. In addition, the IC probability vector p(0) is also determined in advance for both the MOC chart and the LMBM chart.

Furthermore, with N3(0, Ω) and t3(3), the three ordinal levels of the first factor X1 are classified by (−∞, −0.7], (−0.7, 1], (1, +∞), those of the second factor X2 are classified by (−∞, −1], (−1, 0.5], (0.5, +∞), and the third factor X3's four ordinal levels are determined by (−∞, −1.2], (−1.2, 0.3], (0.3, 2], (2, +∞). For Ga3(3, 1), the classified intervals are (0, 3], (3, 6], (6, +∞) for the first factor X1; (0, 4], (4, 7], (7, +∞) for the second factor X2; and (0, 3], (3, 5], (5, 7], (7, +∞) for the third factor X3. The sample size is chosen to be N = 100 and the EWMA smoothing parameter λ = 0.1. Moreover, the IC covariance matrix Ω(0) associated with the three latent joint distributions N3(0, Ω), t3(3), and Ga3(3, 1) has elements ρ12^(0) = ρ21^(0) = 0.2, ρ13^(0) = ρ31^(0) = 0.5, ρ23^(0) = ρ32^(0) = 0.8. More comparison results with other selections of the correlation coefficients are available from the authors on request.

In the comparison, the OC ARLs with location shifts occurring in different latent continuous variables and dependence shifts occurring in different correlation coefficients in Ω are simulated. With location shifts, the location parameter vector would change from μ(0) to μ(1) = μ(0) + δ. In addition, the dependence shifts are reflected by shifts in the associated covariance matrix Ω, which change Σ(0) to Σ(1).

4.2. Comparison Results

Table 1 shows the comparison results under the latent joint distribution N3(0, Ω). We consider location shifts in each of the three latent variables X1∗, X2∗ and X3∗. For instance, with only a location shift δ = 0.02 in X1∗, the MOC chart has an OC ARL 199, whereas the LMBM chart has 266. It is seen that the MOC chart uniformly outperforms the LMBM chart. This is especially true for small shifts with |δ| ≤ 0.06. Here the LMBM chart has the IC hierarchy structure [123] of the log-linear model. Table 1 also indicates location shifts with the same magnitude but different directions (positive or negative) lead to different OC ARLs. This happens due to the asymmetry of the classification intervals with regard to 0.

The results under the latent joint distribution t3(3) listed in Table 2 also show similar patterns, which again demonstrate the superiority of the MOC chart over the LMBM chart. Table 3 lists the results under the latent joint distribution Ga3(3, 1), from which the same conclusion can be drawn. In fact, the advantage of the MOC chart in detecting location shifts in the underlying joint distribution is not surprising. Notice that the SOC chart developed by Li et al. (2014b) has powerful performance in detecting location shifts in the latent continuous variable and, in fact, the proposed MOC chart would simplify into the SOC chart if there were a single ordinal factor.
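To make the single-factor case concrete, the following is a minimal sketch of the statistic derived in Appendix A, W = |Σk (ck + c(k−1) − 1) nk| and Q = W²/U with U = N yT Λ(0) y, applied to one sample without any EWMA smoothing. The function names and the two count vectors are hypothetical; the level probabilities are computed from the cutting points −0.7 and 1 of X1 under a standard normal latent variable.

```python
from itertools import accumulate
from statistics import NormalDist


def ordinal_scores(p0):
    """Scores s_k = c_k + c_{k-1} - 1 from IC cumulative probabilities c_k."""
    c = [0.0] + list(accumulate(p0))
    return [c[k] + c[k - 1] - 1.0 for k in range(1, len(c))]


def single_factor_statistic(counts, p0):
    """Return (W, Q) with W = |sum_k s_k n_k| and Q = W^2 / U."""
    n_total = sum(counts)
    s = ordinal_scores(p0)
    w = abs(sum(sk * nk for sk, nk in zip(s, counts)))
    # Because sum_k p_k s_k = 0, s^T Lambda s reduces to sum_k p_k s_k^2.
    u = n_total * sum(pk * sk * sk for pk, sk in zip(p0, s))
    return w, w * w / u


# IC level probabilities of X1 for the intervals (-inf, -0.7], (-0.7, 1], (1, inf).
lo, hi = NormalDist().cdf(-0.7), NormalDist().cdf(1.0)
p0 = [lo, hi - lo, 1.0 - hi]

# Hypothetical samples of size N = 100 (not taken from the paper).
w_ic, q_ic = single_factor_statistic([24, 60, 16], p0)  # close to N * p0
w_oc, q_oc = single_factor_statistic([10, 60, 30], p0)  # mass moved to upper levels
print(round(q_ic, 3), round(q_oc, 3))
```

The scores are centered under the IC distribution, so a sample close to its expected counts gives a value near zero, while a sample whose mass moves toward the upper levels gives a much larger value.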

TABLE 1. OC ARL Comparison with Location Shifts Under N3(0, Ω)

X1∗ X2∗ X3∗


δ MOC LMBM MOC LMBM MOC LMBM

0.02 199 (1.93) 266 (2.68) 173 (1.68) 251 (2.45) 159 (1.54) 225 (2.27)
0.04 79.1 (0.71) 143 (1.35) 56.9 (0.48) 111 (1.01) 49.3 (0.40) 89.1 (0.81)
0.06 36.5 (0.27) 70.9 (0.61) 26.0 (0.17) 48.7 (0.38) 22.8 (0.14) 39.4 (0.29)
0.08 21.8 (0.14) 38.5 (0.28) 15.9 (0.09) 26.9 (0.17) 14.0 (0.07) 21.9 (0.13)
0.10 15.0 (0.08) 24.9 (0.16) 11.3 (0.05) 17.5 (0.09) 10.0 (0.05) 14.8 (0.07)
−0.02 224 (2.18) 289 (2.85) 172 (1.64) 245 (2.44) 152 (1.46) 234 (2.28)
−0.04 87.4 (0.78) 159 (1.49) 55.9 (0.47) 106 (0.97) 48.2 (0.40) 92.4 (0.82)
−0.06 39.7 (0.30) 77.6 (0.67) 26.1 (0.18) 46.7 (0.36) 22.5 (0.14) 39.9 (0.30)
−0.08 23.0 (0.15) 42.4 (0.31) 15.6 (0.08) 26.0 (0.16) 13.9 (0.07) 22.1 (0.13)
−0.10 15.5 (0.08) 26.4 (0.17) 11.2 (0.05) 17.1 (0.09) 10.0 (0.04) 14.9 (0.07)

NOTE: Standard errors are in parentheses. λ = 0.1. N = 100.
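As a rough illustration of how one sample behind a table of this kind could be generated, the sketch below draws latent normal vectors with the IC correlations of Section 4.1 (ρ12 = 0.2, ρ13 = 0.5, ρ23 = 0.8), discretizes them with the stated cutting points, and tabulates the level combinations. It is an assumption-laden sketch rather than the authors' simulation code: the helper names, the seed, and the dictionary layout of the contingency table are hypothetical.

```python
import math
import random
from bisect import bisect_left


def cholesky(a):
    """Lower-triangular Cholesky factor of a small positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(acc) if i == j else acc / low[j][j]
    return low


def simulate_counts(n, mean, omega, cuts, rng):
    """Draw n latent normal vectors and count the ordinal level combinations."""
    low = cholesky(omega)
    counts = {}
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in mean]
        x = [m + sum(low[i][k] * z[k] for k in range(i + 1))
             for i, m in enumerate(mean)]
        # bisect_left yields right-closed intervals such as (-0.7, 1].
        cell = tuple(bisect_left(c, xi) for c, xi in zip(cuts, x))
        counts[cell] = counts.get(cell, 0) + 1
    return counts


OMEGA0 = [[1.0, 0.2, 0.5], [0.2, 1.0, 0.8], [0.5, 0.8, 1.0]]
CUTS = [[-0.7, 1.0], [-1.0, 0.5], [-1.2, 0.3, 2.0]]

rng = random.Random(2017)  # arbitrary seed
ic_sample = simulate_counts(100, [0.0, 0.0, 0.0], OMEGA0, CUTS, rng)
oc_sample = simulate_counts(100, [0.1, 0.0, 0.0], OMEGA0, CUTS, rng)  # delta = 0.10 in X1*
print(len(ic_sample), sum(ic_sample.values()))
```

Each key is a triple of zero-based level indices, so at most 3 × 3 × 4 = 36 distinct cells can appear; cells that never occur in a sample are simply absent from the dictionary.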


TABLE 2. OC ARL Comparison with Location Shifts Under t3(3)

X1∗ X2∗ X3∗


δ MOC LMBM MOC LMBM MOC LMBM

0.02 247 (2.46) 299 (3.01) 201 (1.96) 273 (2.68) 201 (1.98) 261 (2.58)
0.04 116 (1.07) 194 (1.88) 75.6 (0.67) 143 (1.32) 71.2 (0.62) 127 (1.17)
0.06 57.1 (0.48) 110 (1.01) 34.9 (0.26) 68.7 (0.58) 32.2 (0.23) 59.0 (0.49)
0.08 32.5 (0.23) 63.1 (0.52) 20.6 (0.13) 37.7 (0.27) 18.9 (0.11) 31.9 (0.22)
0.10 21.6 (0.14) 39.4 (0.28) 14.2 (0.07) 24.1 (0.14) 13.0 (0.07) 20.4 (0.12)
−0.02 255 (2.49) 302 (3.00) 213 (2.08) 273 (2.74) 184 (1.76) 256 (2.53)
−0.04 118 (1.09) 192 (1.86) 80.8 (0.72) 144 (1.37) 65.3 (0.55) 126 (1.18)
−0.06 57.3 (0.49) 109 (0.97) 37.3 (0.28) 69.3 (0.58) 30.5 (0.22) 58.6 (0.47)
−0.08 33.2 (0.24) 63.3 (0.52) 21.6 (0.14) 38.0 (0.28) 18.1 (0.11) 32.3 (0.22)
−0.10 22.1 (0.14) 38.5 (0.28) 14.8 (0.08) 24.3 (0.15) 12.7 (0.06) 20.8 (0.12)

NOTE: Standard errors are in parentheses. λ = 0.1. N = 100.
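The standard errors in parentheses follow from averaging run lengths over 10,000 replications. The sketch below shows that computation on stand-in data: run lengths are drawn from a geometric distribution with mean 370, which is only an assumption for illustration, since the run lengths of an EWMA-type chart are not exactly geometric. All names and the seed are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def arl_with_se(run_lengths):
    """Average run length and its standard error over independent replications."""
    return mean(run_lengths), stdev(run_lengths) / math.sqrt(len(run_lengths))


def geometric_run_length(p_signal, rng):
    """Stand-in run length: first signal time when each point signals w.p. p_signal."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return max(1, math.ceil(math.log(u) / math.log(1.0 - p_signal)))


rng = random.Random(0)  # arbitrary seed
runs = [geometric_run_length(1.0 / 370.0, rng) for _ in range(10_000)]
arl, se = arl_with_se(runs)
print(f"ARL = {arl:.0f}, SE = {se:.2f}")
```

For a roughly geometric run length the standard deviation is close to the mean, so with 10,000 replications the standard error is about 1% of the ARL, which matches the order of magnitude of the tabulated values.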

The efficiency of the MOC chart in detecting dependence shifts is investigated below. The comparison results are given in Tables 4, 5, and 6 for the latent joint distributions N3(0, Ω), t3(3), and Ga3(3, 1), respectively. Throughout the three tables, the dependence shifts are induced by changes in the elements ρij of the associated covariance matrix Ω, i.e., ρ12, ρ13, and ρ23, respectively. We see that, except for shifts in ρ23 under N3(0, Ω) and t3(3), the MOC chart generally responds faster, with shorter OC ARLs, and is overall slightly better than the LMBM chart. To see why, it helps to look more closely at how the MOC and LMBM charts express dependence. Both of them employ log-linear modeling, and the difference lies in the design matrices. In the MOC chart, the interaction between two factors is determined by a single coefficient, say βi,j for factors Xi and Xj. However, in the LMBM chart the interaction between two factors Xi with hi attribute levels and Xj with hj levels is described by terms such as u^{(i,j)}_{ai,aj}, and in total there are (hi − 1)(hj − 1) independent terms, which is larger than 1 (for example, (3 − 1)(4 − 1) = 6 terms for a three-level and a four-level factor). Therefore, with a single coefficient describing the dependence, the MOC chart achieves slightly better performance in detecting dependence shifts than the LMBM chart, which uses many more parameters. This shows that the

TABLE 3. OC ARL Comparison with Location Shifts Under Ga3(3, 1)

X1∗ X2∗ X3∗


δ MOC LMBM MOC LMBM MOC LMBM

0.05 256 (2.50) 304 (3.02) 234 (2.26) 298 (2.91) 225 (2.22) 289 (2.83)
0.10 123 (1.14) 193 (1.85) 103 (0.95) 185 (1.77) 90.6 (0.82) 163 (1.54)
0.20 34.3 (0.25) 60.0 (0.49) 28.0 (0.19) 56.4 (0.45) 24.1 (0.15) 42.8 (0.32)
0.30 16.6 (0.09) 25.8 (0.15) 13.9 (0.07) 23.8 (0.14) 12.2 (0.06) 19.2 (0.10)
0.50 7.80 (0.03) 10.9 (0.04) 6.77 (0.03) 10.2 (0.04) 6.07 (0.02) 8.74 (0.03)
−0.05 238 (2.34) 296 (2.91) 224 (2.20) 292 (2.89) 205 (2.00) 277 (2.73)
−0.10 114 (1.06) 192 (1.88) 101 (0.93) 184 (1.79) 82.4 (0.73) 153 (1.47)
−0.20 33.2 (0.24) 63.2 (0.54) 28.3 (0.20) 59.9 (0.49) 23.4 (0.15) 43.8 (0.33)
−0.30 16.3 (0.09) 28.4 (0.19) 14.3 (0.07) 27.1 (0.17) 12.2 (0.06) 20.5 (0.12)
−0.50 7.98 (0.03) 13.3 (0.06) 7.08 (0.03) 12.7 (0.06) 6.28 (0.02) 10.1 (0.04)

NOTE: Standard errors are in parentheses. λ = 0.1. N = 100.


TABLE 4. OC ARL Comparison with Dependence Shifts Under N3 (0, Ω)

ρ12 ρ13 ρ23


δ MOC LMBM MOC LMBM MOC LMBM

0.02 185 (1.81) 218 (2.18) 187 (1.81) 260 (2.55) 219 (2.16) 213 (2.01)
0.04 64.6 (0.55) 92.8 (0.85) 68.9 (0.59) 94.9 (0.83) 80.2 (0.70) 47.1 (0.34)
0.06 29.4 (0.20) 42.8 (0.33) 32.4 (0.23) 37.6 (0.25) 35.4 (0.26) 18.7 (0.08)
0.08 17.4 (0.10) 24.1 (0.16) 19.3 (0.12) 20.4 (0.10) 20.0 (0.12) 11.1 (0.04)
0.10 12.2 (0.06) 16.2 (0.09) 13.3 (0.07) 13.8 (0.05) 13.3 (0.07) 7.77 (0.02)
−0.02 185 (1.79) 277 (2.76) 197 (1.91) 181 (1.74) 182 (1.77) 106 (0.99)
−0.04 66.0 (0.56) 126 (1.15) 69.7 (0.61) 67.5 (0.60) 69.4 (0.60) 31.0 (0.24)
−0.06 30.2 (0.21) 53.5 (0.42) 32.3 (0.23) 30.7 (0.23) 32.8 (0.24) 15.0 (0.09)
−0.08 18.1 (0.10) 27.7 (0.17) 19.1 (0.11) 18.0 (0.11) 19.9 (0.12) 9.60 (0.05)
−0.10 12.8 (0.06) 18.2 (0.09) 13.1 (0.07) 12.6 (0.07) 13.9 (0.07) 7.09 (0.03)

NOTE: Standard errors are in parentheses. λ = 0.1. N = 100.

MOC chart still has some superiority and is more convenient in detecting dependence deviation.

The above simulations take shifts in either location or dependence into consideration. Table 7 shows the comparison results of simultaneously monitoring location and dependence shifts under the underlying joint distribution N3(0, Ω). Here we denote a correlation shift by δ1 and a location shift by δ2. Table 7 considers different combinations of shifts in the correlation coefficients ρ12, ρ13, ρ23 and in the locations of the latent variables X1∗, X2∗, X3∗. We see that, in most cases, the MOC chart outperforms the LMBM chart, which is due to the advantage of the MOC chart in detecting location shifts. But this advantage is compromised in the case of shifts in ρ23, which is similar to Table 4 that reports OC ARLs of detecting dependence shifts only.

Apart from the above simulations, we also investigated the selection of the EWMA smoothing parameter λ. Generally, a small λ leads to quicker detection of small shifts, whereas a large λ allows detecting large shifts quickly. This is consistent with the conclusion in Lucas and Saccucci (1990). The appropriate λ is recommended to be between 0.05 and 0.2. The simulation results of the effects of different λ on OC ARLs are available from the authors on request.
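The role of λ can be illustrated with a standard EWMA recursion on count vectors, z_k = (1 − λ) z_{k−1} + λ n_k started at the IC expected counts. This is a sketch under that assumed textbook form; the three-cell count vectors and the function names are hypothetical and much smaller than the 36-cell tables used in the simulations.

```python
def ewma_update(z_prev, n_k, lam):
    """One EWMA step, elementwise: z_k = (1 - lam) * z_{k-1} + lam * n_k."""
    return [(1.0 - lam) * z + lam * n for z, n in zip(z_prev, n_k)]


def ewma_path(z0, samples, lam):
    """EWMA vectors after each sample, starting from z0."""
    path, z = [], list(z0)
    for n_k in samples:
        z = ewma_update(z, n_k, lam)
        path.append(z)
    return path


Z0 = [50.0, 30.0, 20.0]          # hypothetical IC expected counts N * p0
SHIFTED = [[40, 30, 30]] * 10    # ten identical, hypothetical OC samples

small = ewma_path(Z0, SHIFTED, 0.05)
large = ewma_path(Z0, SHIFTED, 0.2)
print([round(v, 2) for v in small[-1]])
print([round(v, 2) for v in large[-1]])
```

With the larger λ the smoothed counts move toward the shifted sample more quickly, while the smaller λ averages over more past samples, which is what makes it more sensitive to small persistent shifts.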

TABLE 5. OC ARL Comparison with Dependence Shifts Under t3 (3)

ρ12 ρ13 ρ23


δ MOC LMBM MOC LMBM MOC LMBM

0.02 200 (1.94) 227 (2.24) 195 (1.89) 268 (2.59) 199 (1.98) 195 (1.84)
0.04 70.7 (0.61) 106 (0.98) 70.9 (0.61) 107 (0.93) 67.5 (0.58) 46.0 (0.32)
0.06 31.9 (0.23) 49.1 (0.40) 32.5 (0.23) 42.8 (0.30) 29.8 (0.21) 19.0 (0.08)
0.08 18.8 (0.11) 27.5 (0.19) 19.2 (0.11) 23.0 (0.12) 17.4 (0.10) 11.4 (0.04)
0.10 13.2 (0.07) 18.0 (0.10) 13.2 (0.07) 15.3 (0.06) 11.9 (0.06) 8.02 (0.02)
−0.02 193 (1.86) 284 (2.80) 192 (1.88) 201 (1.96) 179 (1.73) 126 (1.24)
−0.04 69.9 (0.61) 137 (1.28) 68.0 (0.58) 81.0 (0.72) 62.9 (0.54) 38.9 (0.31)
−0.06 32.3 (0.23) 60.5 (0.48) 31.0 (0.22) 37.0 (0.28) 30.0 (0.22) 18.4 (0.12)
−0.08 19.2 (0.11) 31.8 (0.20) 18.5 (0.11) 21.2 (0.13) 18.0 (0.11) 11.6 (0.06)
−0.10 13.3 (0.07) 20.3 (0.10) 13.0 (0.07) 14.6 (0.08) 12.9 (0.07) 8.54 (0.04)

NOTE: Standard errors are in parentheses. λ = 0.1. N = 100.


TABLE 6. OC ARL Comparison with Dependence Shifts Under Ga3 (3, 1)

ρ12 ρ13 ρ23


δ MOC LMBM MOC LMBM MOC LMBM

0.02 333 (3.33) 337 (3.33) 190 (1.84) 267 (2.60) 104 (0.95) 137 (1.27)
0.04 259 (2.57) 278 (2.74) 71.3 (0.63) 131 (1.19) 26.5 (0.18) 29.4 (0.18)
0.06 173 (1.66) 210 (2.03) 32.4 (0.23) 58.7 (0.48) 12.6 (0.06) 13.5 (0.05)
0.08 110 (0.99) 147 (1.39) 18.4 (0.11) 29.8 (0.19) 8.09 (0.03) 8.44 (0.03)
0.10 67.8 (0.58) 100 (0.91) 12.4 (0.06) 18.7 (0.10) 5.92 (0.02) 6.10 (0.02)
−0.02 321 (3.18) 339 (3.33) 228 (2.23) 288 (2.83) 88.3 (0.80) 121 (1.13)
−0.04 266 (2.61) 289 (2.84) 95.5 (0.86) 165 (1.57) 26.0 (0.18) 34.2 (0.24)
−0.06 215 (2.10) 228 (2.25) 46.1 (0.36) 88.2 (0.80) 13.9 (0.08) 16.6 (0.09)
−0.08 175 (1.72) 165 (1.56) 27.7 (0.19) 51.3 (0.41) 9.42 (0.04) 11.0 (0.05)
−0.10 146 (1.40) 118 (1.08) 19.5 (0.12) 33.8 (0.23) 7.23 (0.03) 8.29 (0.03)

NOTE: Standard errors are in parentheses. λ = 0.1. N = 100.

Finally, we investigate how large a phase I sample size m0 is required in order to achieve the nominal IC ARL 370. Here the simulations are performed via a similar procedure to that in Zou and Tsung (2011) and under identical settings to simulating OC ARLs above. The results are listed in Table 8. It shows that the actual IC ARLs approach 370 as m0 increases, which is consistent with the result of Jones et al. (2001). Therefore, in this case, a phase I sample size of about m0 = 100,000 is recommended according to the rule that the actual IC ARL should be within ±10% of the nominal IC ARL 370 (Jones et al., 2001). In Table 8 this rule is met at m0 = 100,000 under N3(0, Ω) and Ga3(3, 1), whereas under t3(3) the IC ARL of 323 is still slightly below the lower bound of 333.

5. A Practical Example

In this section, the proposed MOC chart is implemented in a real process of porcelain manufacturing introduced by Taleb (2009). The process mainly consists of two procedures, transformation and enamel firing. The former procedure is to transform the prepared material sequentially in the pressing line, calibration line, and casting line. Next the transformed units are enameled before being put in the oven trolleys. After the oven phase, the products are finished and require quality assessment for further procedures. The product quality will be evaluated according to three characteristics, appearance (X1), translucence (X2), and whiteness (X3).

Because it is hard and costly to obtain the factors' numerical values, the quality is better assessed by ordinal levels. To be specific, each factor can be classified into three ordinal levels like perfect, good, and bad. In this way, there will be 3^3 = 27 level combinations, and a contingency table with 27 cells stores the counts under each combination. The porcelain quality is affected by three ordinal factors that may correlate with each other. It is reasonable to employ the proposed MOC chart for monitoring them simultaneously, because it does not lose the ordinal information among the attribute levels.

Based on the IC dataset comprising 4,600 observations, the IC probability vector can be estimated as

p(0) = [0.1829, 0.0574, 0.0246, 0.2358, 0.0143, 0.0050,
        0.0107, 0.0033, 0.0015, 0.1072, 0.0272, 0.0159,
        0.1360, 0.0048, 0.0037, 0.0070, 0.0022, 0.0004,
        0.0524, 0.0165, 0.0074, 0.0722, 0.0035, 0.0015,
        0.0050, 0.0007, 0.0009]T.

This IC dataset may not be large enough, but it suffices to demonstrate the implementation of the MOC chart. According to the IC cell probabilities, the marginal and cumulative probabilities of each factor and the score vectors s(1), s(2), s(3) can be calculated. Therefore, the design matrix Y and the covariance matrix Λ(0) may also be figured out. Then we choose the sample size N = 200 and the EWMA smoothing parameter λ = 0.1. Given the IC ARL 370, the control limit of the MOC chart is set as 0.9547 via binary segmentation. Note that the samples are collected as count vectors of dimension 27 × 1, such as

n1 = [37, 10, 5, 56, 2, 2, 3, 0, 0, 23, 5, 3, 27, 0, 0, 2, 0,
      0, 8, 6, 1, 9, 0, 0, 1, 0, 0]T.
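A quick consistency check of the two vectors quoted above can be sketched as follows. The values are copied from the text; the variable names are hypothetical, and because the cell ordering (which factor varies fastest) is not restated here, no marginal probabilities are derived.

```python
P0 = [0.1829, 0.0574, 0.0246, 0.2358, 0.0143, 0.0050,
      0.0107, 0.0033, 0.0015, 0.1072, 0.0272, 0.0159,
      0.1360, 0.0048, 0.0037, 0.0070, 0.0022, 0.0004,
      0.0524, 0.0165, 0.0074, 0.0722, 0.0035, 0.0015,
      0.0050, 0.0007, 0.0009]
N1 = [37, 10, 5, 56, 2, 2, 3, 0, 0, 23, 5, 3, 27, 0, 0, 2, 0,
      0, 8, 6, 1, 9, 0, 0, 1, 0, 0]
N = 200

# Expected IC counts N * p0 and the deviation vector n1 - N * p0,
# which is the quantity the quadratic-form statistic of Appendix B is built on.
expected = [N * p for p in P0]
deviation = [n - e for n, e in zip(N1, expected)]
largest = max(range(len(P0)), key=lambda j: abs(deviation[j]))

print(len(P0), round(sum(P0), 4), sum(N1))
print(largest, round(deviation[largest], 2))
```

The probabilities sum to one and the counts sum to N = 200, so the deviations sum to zero; the cell with the largest absolute deviation is the fourth entry (observed 56 against an expected 47.16).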


TABLE 7. OC ARL Comparison with Location and Dependence Shifts Under N3 (0, Ω)

(ρ12 , X1∗ ) (ρ13 , X1∗ ) (ρ23 , X1∗ )


δ1 δ2 MOC LMBM MOC LMBM MOC LMBM

0.02 0.02 125 (1.18) 174 (1.71) 126 (1.19) 203 (1.97) 144 (1.35) 168 (1.55)
0.02 −0.02 135 (1.28) 185 (1.82) 142 (1.32) 214 (2.08) 157 (1.47) 175 (1.64)
−0.02 0.02 125 (1.16) 215 (2.10) 131 (1.25) 149 (1.42) 123 (1.13) 89.5 (0.83)
−0.02 −0.02 138 (1.29) 228 (2.21) 139 (1.30) 159 (1.54) 134 (1.26) 93.0 (0.86)
0.04 0.04 37.1 (0.28) 57.6 (0.48) 37.9 (0.28) 59.5 (0.46) 41.7 (0.32) 35.7 (0.23)
0.04 −0.04 37.5 (0.28) 61.7 (0.52) 41.2 (0.32) 61.9 (0.48) 44.6 (0.35) 36.8 (0.24)
−0.04 0.04 36.8 (0.27) 71.8 (0.60) 38.5 (0.29) 45.1 (0.36) 38.2 (0.29) 25.2 (0.18)
−0.04 −0.04 39.3 (0.30) 75.9 (0.64) 39.3 (0.30) 48.8 (0.39) 39.6 (0.30) 26.3 (0.19)

(ρ12 , X2∗ ) (ρ13 , X2∗ ) (ρ23 , X2∗ )


δ1 δ2 MOC LMBM MOC LMBM MOC LMBM

0.02 0.02 112 (1.05) 163 (1.56) 113 (1.06) 192 (1.85) 132 (1.24) 161 (1.49)
0.02 −0.02 111 (1.03) 160 (1.55) 114 (1.06) 184 (1.77) 128 (1.20) 155 (1.43)
−0.02 0.02 113 (1.05) 201 (1.95) 118 (1.10) 140 (1.33) 107 (0.99) 86.2 (0.79)
−0.02 −0.02 113 (1.06) 196 (1.89) 114 (1.06) 138 (1.29) 111 (1.04) 84.3 (0.78)
0.04 0.04 31.2 (0.22) 49.8 (0.40) 31.7 (0.23) 51.8 (0.39) 35.2 (0.26) 33.1 (0.20)
0.04 −0.04 30.2 (0.21) 47.9 (0.37) 33.3 (0.24) 49.2 (0.37) 35.3 (0.26) 32.0 (0.19)
−0.04 0.04 31.2 (0.22) 60.4 (0.49) 33.1 (0.24) 40.3 (0.31) 30.9 (0.22) 23.5 (0.16)
−0.04 −0.04 31.8 (0.23) 58.3 (0.48) 31.6 (0.23) 40.6 (0.31) 32.0 (0.23) 23.4 (0.16)

(ρ12 , X3∗ ) (ρ13 , X3∗ ) (ρ23 , X3∗ )


δ1 δ2 MOC LMBM MOC LMBM MOC LMBM

0.02 0.02 102 (0.95) 153 (1.49) 110 (1.03) 172 (1.66) 125 (1.18) 147 (1.35)
0.02 −0.02 105 (0.95) 156 (1.50) 104 (0.96) 180 (1.74) 117 (1.08) 154 (1.44)
−0.02 0.02 108 (1.00) 185 (1.81) 105 (0.97) 132 (1.27) 102 (0.93) 82.0 (0.75)
−0.02 −0.02 103 (0.95) 191 (1.83) 108 (1.01) 133 (1.25) 102 (0.92) 81.6 (0.75)
0.04 0.04 27.5 (0.19) 44.5 (0.34) 31.1 (0.22) 45.6 (0.33) 33.2 (0.24) 30.5 (0.18)
0.04 −0.04 28.8 (0.20) 45.3 (0.36) 28.8 (0.20) 47.1 (0.35) 31.2 (0.22) 31.3 (0.19)
−0.04 0.04 30.3 (0.21) 53.1 (0.42) 28.6 (0.20) 37.5 (0.28) 28.7 (0.20) 22.5 (0.15)
−0.04 −0.04 28.3 (0.19) 53.6 (0.42) 30.1 (0.21) 37.2 (0.28) 28.8 (0.20) 22.2 (0.15)

NOTE: Standard errors are in parentheses. λ = 0.1. N = 100.

TABLE 8. IC ARLs with Various Phase I Sample Sizes m0

m0         N3(0, Ω)      t3(3)         Ga3(3, 1)

10,000     103 (0.90)    120 (1.07)    236 (2.25)
20,000     233 (2.27)    132 (1.20)    267 (2.52)
50,000     297 (2.88)    251 (2.41)    281 (2.65)
100,000    352 (3.39)    323 (3.16)    334 (3.19)
200,000    362 (3.50)    349 (3.40)    360 (3.50)

NOTE: Standard errors are in parentheses.

In phase II, values of the charting statistic are calculated and plotted successively. Figure 1 indicates that the MOC chart releases an OC signal at the 27th sample.

FIGURE 1. An MOC Chart for Monitoring Porcelain Manufacturing Process.

6. Conclusion

This article proposes a multivariate ordinal categorical control chart for monitoring multiple ordinal factors simultaneously. In particular, the MOC chart can powerfully detect location shifts and dependence shifts in the joint distribution of the latent continuous variables that determine the ordinal levels of the factors. For constructing the MOC chart, we first modified the ordinary log-linear model into the extended version for incorporating the ordinal information. Compared with the LMBM chart that is devised for general multivariate categorical processes, the MOC chart demonstrates more powerful performance in detecting shifts either in locations or in dependence in the latent joint distribution because it successfully exploits the ordinal information. Furthermore, given the IC cell probability vector, the MOC chart is easy to construct and simple to implement in practice. In fact, the MOC chart is a general tool for monitoring multivariate ordinal categorical processes, where all the factors are ordinal categorical. Future work may consider statistical surveillance for categorical processes involving both nominal and ordinal factors.

Acknowledgment

The authors would like to thank the editor and two anonymous referees for their many helpful comments that have resulted in significant improvements in this article. Mr. Wang's work was supported by the National Natural Science Foundation of China Grants 71371163, 71502135, and 71371151. Dr. Li's research was supported by the National Natural Science Foundation of China Grants 71402133, 71572138, and 11501209, and the Open Fund of State Key Laboratory for Manufacturing Systems Engineering (Xi'an Jiaotong University) sklms2016010. Prof. Su's work was supported by Humanity and Social Science Research Planning Foundation of Chinese Ministry of Education (No. 13YJA630078) and Major Program of the National Social Science Foundation of China (No. 15ZDB150).

Appendix

A. Proof of the Equivalence of Equation (1) and the −2LRT Statistic for Model (2)

Model (2) involves only a single factor $X_i$, which is
\[
\ln p_{(i)k} = \beta_0 + \beta_i s_{(i)k} \quad (k = 1, \ldots, h_i).
\]
It can be rewritten as
\[
\ln \mathbf{p}_{(i)} = \mathbf{1}\beta_0 + \mathbf{y}_i \beta_i,
\]
where $\mathbf{p}_{(i)} = [p_{(i)1}, \ldots, p_{(i)h_i}]^T$ with $\mathbf{1}^T \mathbf{p}_{(i)} = 1$ and $\mathbf{y}_i = \mathbf{s}_{(i)} = [s_{(i)1}, \ldots, s_{(i)h_i}]^T$. We intend to test the


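hypothesis
\[
H_0: \beta_i = \beta_i^{(0)} \quad \text{versus} \quad H_1: \beta_i \neq \beta_i^{(0)},
\]
where $\beta_i^{(0)}$ is the value of $\beta_i$ in the null hypothesis or the IC state in the SPC context. Due to the constraint $\mathbf{1}^T \mathbf{p}_{(i)} = 1$, the intercept $\beta_0$ also has its IC value $\beta_0^{(0)}$. For a general $\beta_i$, compared with the IC version, model (2) can be rewritten as
\[
\ln \mathbf{p}_{(i)} = \mathbf{1}(\beta_0^{(0)} + \delta) + \mathbf{y}_i \beta_i \quad \text{subject to } \mathbf{1}^T \mathbf{p}_{(i)} = 1,
\]
where $\delta$ is the deviation of $\beta_0$ from its IC value $\beta_0^{(0)}$ induced by the deviation of $\beta_i$ from its IC value $\beta_i^{(0)}$. In addition, $\delta$ can be derived from the above equations as
\[
\delta = -\ln[\mathbf{1}^T \exp(\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \beta_i)].
\]

Let $\mathbf{n}_{(i)} = [n_{(i)1}, \ldots, n_{(i)h_i}]^T$ be the collected cell counts in a sample of size $N$, which jointly follow the multinomial distribution $\mathrm{MN}(N; p_{(i)1}, \ldots, p_{(i)h_i})$. The log-likelihood function of $\beta_i$ can be written from the probability mass function of the multinomial distribution, which is
\[
\begin{aligned}
l(\beta_i) &= \sum_{k=1}^{h_i} n_{(i)k} \ln p_{(i)k} + \ln(N!) - \sum_{k=1}^{h_i} \ln(n_{(i)k}!) \\
&= \mathbf{n}_{(i)}^T \ln \mathbf{p}_{(i)} + \ln(N!) - \mathbf{1}^T \ln(\mathbf{n}_{(i)}!) \\
&= \mathbf{n}_{(i)}^T (\mathbf{1}\beta_0^{(0)} + \mathbf{1}\delta + \mathbf{y}_i \beta_i) + \ln(N!) - \mathbf{1}^T \ln(\mathbf{n}_{(i)}!) \\
&= \mathbf{n}_{(i)}^T \big(\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \beta_i - \mathbf{1} \ln[\mathbf{1}^T \exp(\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \beta_i)]\big) + \ln(N!) - \mathbf{1}^T \ln(\mathbf{n}_{(i)}!).
\end{aligned}
\]
The first-order derivative of $l(\beta_i)$ with respect to $\beta_i$ is
\[
t(\beta_i) = \frac{d l(\beta_i)}{d \beta_i} = \mathbf{y}_i^T \mathbf{n}_{(i)} - N \mathbf{y}_i^T \exp\big[\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \beta_i - \mathbf{1} \ln(\mathbf{1}^T \exp(\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \beta_i))\big].
\]
Let
\[
\mathbf{k}(\beta_i) = \exp\big[\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \beta_i - \mathbf{1} \ln(\mathbf{1}^T \exp(\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \beta_i))\big],
\]
then the second-order derivative of $l(\beta_i)$ with respect to $\beta_i$ will be
\[
t'(\beta_i) = \frac{d^2 l(\beta_i)}{d \beta_i^2} = -N \mathbf{y}_i^T \mathrm{diag}(\mathbf{k}(\beta_i)) \mathbf{y}_i + N \mathbf{y}_i^T \mathbf{k}(\beta_i) \mathbf{k}^T(\beta_i) \mathbf{y}_i.
\]

According to the first-order Taylor expansion of $t(\beta_i)$ at $\beta_i = \beta_i^{(0)}$, we get
\[
t(\beta_i) \approx t(\beta_i^{(0)}) + t'(\beta_i^{(0)})(\beta_i - \beta_i^{(0)}).
\]
Letting $t(\beta_i) = 0$, we obtain the MLE $\hat{\beta}_i$ of $\beta_i$, which is
\[
\hat{\beta}_i = \beta_i^{(0)} + \frac{1}{N} \big(\mathbf{y}_i^T \Lambda_{(i)}^{(0)} \mathbf{y}_i\big)^{-1} \mathbf{y}_i^T \big(\mathbf{n}_{(i)} - N \mathbf{p}_{(i)}^{(0)}\big), \tag{A.1}
\]
where $\Lambda_{(i)}^{(0)} = \mathrm{diag}(\mathbf{p}_{(i)}^{(0)}) - \mathbf{p}_{(i)}^{(0)} \mathbf{p}_{(i)}^{(0)T}$. The −2LRT statistic will be
\[
\begin{aligned}
Q(\hat{\beta}_i) &= 2[l(\hat{\beta}_i) - l(\beta_i^{(0)})] \\
&= 2 \mathbf{n}_{(i)}^T (\mathbf{1}\delta + \mathbf{y}_i \hat{\beta}_i - \mathbf{y}_i \beta_i^{(0)}) \\
&= 2 \mathbf{n}_{(i)}^T \mathbf{y}_i (\hat{\beta}_i - \beta_i^{(0)}) - 2N \ln[\mathbf{1}^T \exp(\mathbf{1}\beta_0^{(0)} + \mathbf{y}_i \hat{\beta}_i)].
\end{aligned} \tag{A.2}
\]
By the second-order Taylor expansion of $Q(\hat{\beta}_i)$ at $\hat{\beta}_i = \beta_i^{(0)}$, we have
\[
Q(\hat{\beta}_i) \approx Q(\beta_i^{(0)}) + (\hat{\beta}_i - \beta_i^{(0)}) Q'(\beta_i^{(0)}) + \frac{1}{2} (\hat{\beta}_i - \beta_i^{(0)})^2 Q''(\beta_i^{(0)}). \tag{A.3}
\]
Similar to the formulations of $t(\beta_i)$ and $t'(\beta_i)$, the first-order and second-order derivatives of $Q(\hat{\beta}_i)$ with respect to $\hat{\beta}_i$ can be obtained based on Equation (A.2). Setting $\hat{\beta}_i = \beta_i^{(0)}$, we get the following results:
\[
\begin{aligned}
Q(\beta_i^{(0)}) &= 0, \\
Q'(\beta_i^{(0)}) &= 2 \mathbf{y}_i^T (\mathbf{n}_{(i)} - N \mathbf{p}_{(i)}^{(0)}), \\
Q''(\beta_i^{(0)}) &= -2N \mathbf{y}_i^T \Lambda_{(i)}^{(0)} \mathbf{y}_i.
\end{aligned}
\]
Integrating these results together with Equations (A.1) and (A.3), we can obtain
\[
Q(\hat{\beta}_i) = \frac{1}{N} (\mathbf{n}_{(i)} - N \mathbf{p}_{(i)}^{(0)})^T \mathbf{y}_i \big(\mathbf{y}_i^T \Lambda_{(i)}^{(0)} \mathbf{y}_i\big)^{-1} \mathbf{y}_i^T (\mathbf{n}_{(i)} - N \mathbf{p}_{(i)}^{(0)}). \tag{A.4}
\]
It is easy to find that $U = N \mathbf{y}_i^T \Lambda_{(i)}^{(0)} \mathbf{y}_i$ is a constant and $(\mathbf{n}_{(i)} - N \mathbf{p}_{(i)}^{(0)})^T \mathbf{y}_i$ is a scalar. The test statistic (A.4) will be
\[
Q(\hat{\beta}_i) = \frac{[\mathbf{n}_{(i)}^T \mathbf{y}_i - N \mathbf{p}_{(i)}^{(0)T} \mathbf{y}_i]^2}{U}.
\]
Due to $\sum_{k=1}^{h_i} p_{(i)k} = 1$, it can be further found that $N \mathbf{p}_{(i)}^{(0)T} \mathbf{y}_i = 0$. Then the statistic can be simplified as
\[
Q(\hat{\beta}_i) = \frac{[\mathbf{n}_{(i)}^T \mathbf{y}_i]^2}{U} = \frac{W_i^2}{U},
\]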


where $W_i = \big|\sum_{k=1}^{h_i} (c_{(i)k} + c_{(i)(k-1)} - 1) n_{(i)k}\big|$ is the charting statistic in Equation (1).

B. Derivation of the Test Statistic (7)

According to the context, model (6) characterizes multiple ordinal factors, which is
\[
\ln \mathbf{p} = \mathbf{1}\beta_0 + \mathbf{Y}\boldsymbol{\beta} \quad \text{subject to } \mathbf{1}^T \mathbf{p} = 1.
\]
This model has a similar form to model (2). However, $\mathbf{p}$ is of dimension $h \times 1$ with $h = \prod_{i=1}^{d} h_i$. We intend to test the hypothesis
\[
H_0: \boldsymbol{\beta} = \boldsymbol{\beta}^{(0)} \quad \text{versus} \quad H_1: \boldsymbol{\beta} \neq \boldsymbol{\beta}^{(0)},
\]
where $\boldsymbol{\beta}^{(0)}$ is the null version of the coefficient vector $\boldsymbol{\beta}$. Due to the constraint $\mathbf{1}^T \mathbf{p} = 1$, let $\delta$ be the deviation of $\beta_0$ from its IC value $\beta_0^{(0)}$, which is induced by the deviation of $\boldsymbol{\beta}$ from $\boldsymbol{\beta}^{(0)}$. Then model (6) can be modified as
\[
\ln \mathbf{p} = \mathbf{1}(\beta_0^{(0)} + \delta) + \mathbf{Y}\boldsymbol{\beta} \quad \text{subject to } \mathbf{1}^T \mathbf{p} = 1. \tag{B.1}
\]
Based on the cell-count vector $\mathbf{n}_k$ at sampling point $k$, the log-likelihood function of $\boldsymbol{\beta}$ is written as
\[
l(\boldsymbol{\beta}) = \mathbf{n}_k^T (\mathbf{1}\beta_0^{(0)} + \mathbf{1}\delta + \mathbf{Y}\boldsymbol{\beta}) + \ln(N!) - \mathbf{1}^T \ln(\mathbf{n}_k!). \tag{B.2}
\]
Based on (B.1) and (B.2), the −2LRT statistic of Equation (7) can be formulated in a similar way to that in Appendix A, which is finalized as
\[
Q_k = \frac{1}{N} (\mathbf{n}_k - N \mathbf{p}^{(0)})^T \mathbf{Y} (\mathbf{Y}^T \Lambda^{(0)} \mathbf{Y})^{-1} \mathbf{Y}^T (\mathbf{n}_k - N \mathbf{p}^{(0)}).
\]

References

Agresti, A. (2010). Analysis of Ordinal Categorical Data, 2nd edition. New York, NY: Wiley.
Anderson, T. W. (2003). An Introduction to Multivariate Statistical Analysis, 3rd edition. New York, NY: Wiley.
Bersimis, S.; Psarakis, S.; and Panaretos, J. (2007). “Multivariate Statistical Process Control Charts: An Overview”. Quality and Reliability Engineering International 23, pp. 517–543.
Bishop, Y. M. M.; Fienberg, S. E.; and Holland, P. W. (2007). Discrete Multivariate Analysis: Theory and Practice. New York, NY: Springer.
Chiu, J. E. and Kuo, T. I. (2008). “Attribute Control Chart for Multivariate Poisson Distribution”. Communications in Statistics: Theory and Methods 37, pp. 146–158.
Corain, L. and Salmaso, L. (2014). “Nonparametric Permutation-Based Control Charts for Ordinal Data”. In Topics in Nonparametric Statistics, pp. 309–321. New York, NY: Springer.
Dickinson, R. M.; Roberts, D. A. O.; Driscoll, A. R.; Woodall, W. H.; and Vining, G. G. (2014). “CUSUM Charts for Monitoring the Characteristic Life of Censored Weibull Lifetimes”. Journal of Quality Technology 46, pp. 340–358.
Ding, D.; Tsung, F.; and Li, J. (2016). “Rank-Based Process Control for Mixed-Type Data”. IIE Transactions 48, pp. 673–683.
Haberman, S. J. (1974). “Log-Linear Models for Frequency Tables with Ordered Classifications”. Biometrics 36, pp. 589–600.
Johnson, N. L.; Kotz, S.; and Balakrishnan, N. (1997). Discrete Multivariate Distributions. New York, NY: Wiley.
Jones, L. A.; Champ, C. W.; and Rigdon, S. E. (2001). “The Performance of Exponentially Weighted Moving Average Charts with Estimated Parameters”. Technometrics 43, pp. 156–167.
Li, J.; Tsung, F.; and Zou, C. (2012). “Directional Control Schemes for Multivariate Categorical Processes”. Journal of Quality Technology 44, pp. 136–154.
Li, J.; Tsung, F.; and Zou, C. (2014a). “Multivariate Binomial/Multinomial Control Chart”. IIE Transactions 46, pp. 526–542.
Li, J.; Tsung, F.; and Zou, C. (2014b). “A Simple Categorical Chart for Detecting Location Shifts with Ordinal Information”. International Journal of Production Research 52, pp. 550–562.
Lowry, C. A. and Montgomery, D. C. (1995). “A Review of Multivariate Control Charts”. IIE Transactions 27, pp. 800–810.
Lu, X. S.; Xie, M.; Goh, T. N.; and Lai, C. D. (1998). “Control Chart for Multivariate Attribute Processes”. International Journal of Production Research 36, pp. 3477–3489.
Lucas, J. M. and Saccucci, M. S. (1990). “Exponentially Weighted Moving Average Control Schemes: Properties and Enhancements”. Technometrics 32, pp. 1–29.
Patel, H. I. (1973). “Quality Control Methods for Multivariate Binomial and Poisson Distributions”. Technometrics 15, pp. 103–112.
Stoumbos, Z. G. and Sullivan, J. H. (2002). “Robustness to Non-normality of the Multivariate EWMA Control Chart”. Journal of Quality Technology 34, pp. 260–276.
Taleb, H. (2009). “Control Charts Applications for Multivariate Attribute Processes”. Computers and Industrial Engineering 56, pp. 399–410.
Topalidou, E. and Psarakis, S. (2009). “Review of Multinomial and Multiattribute Quality Control Charts”. Quality and Reliability Engineering International 25, pp. 773–804.
Tucker, G. R.; Woodall, W. H.; and Tsui, K. L. (2002). “A Control Chart for Ordinal Data”. American Journal of Mathematical and Management Sciences 22, pp. 31–48.
Wang, K. and Tsung, F. (2007). “Run-to-Run Process Adjustment Using Categorical Observations”. Journal of Quality Technology 39, pp. 312–325.
Wang, K. and Tsung, F. (2010). “Recursive Parameter Estimation for Categorical Process Control”. International Journal of Production Research 48, pp. 1381–1394.
Woodall, W. H. and Ncube, M. M. (1985). “Multivariate CUSUM Quality Control Procedures”. Technometrics 27, pp. 285–292.
Wu, C. F. J. and Hamada, M. (2000). Experiments: Planning, Analysis, and Parameter Design Optimization. New York, NY: Wiley.
Zou, C.; Jiang, W.; and Tsung, F. (2011). “A LASSO-Based Diagnostic Framework for Multivariate Statistical Process Control”. Technometrics 53, pp. 297–309.
Zou, C. and Tsung, F. (2011). “A Multivariate Sign EWMA Control Chart”. Technometrics 53, pp. 84–97.
Zou, C.; Wang, Z.; and Tsung, F. (2012). “A Spatial Rank-Based Multivariate EWMA Control Chart”. Naval Research Logistics 59, pp. 91–110.
