
A. Parasuraman, Valarie A. Zeithaml, & Leonard L. Berry
Reassessment of Expectations as a
Comparison Standard in Measuring
Service Quality: Implications for
Further Research
The authors respond to concerns raised by Cronin and Taylor (1992) and Teas (1993) about the SERVQUAL instrument and the perceptions-minus-expectations specification invoked by it to operationalize service quality. After demonstrating that the validity and alleged severity of many of those concerns are questionable, they offer a set of research directions for addressing unresolved issues and adding to the understanding of service quality assessment.

Two recent articles in this journal, Cronin and Taylor (1992) and Teas (1993), have raised concerns about our specification of service quality as the gap between customers' expectations and perceptions (Parasuraman, Zeithaml, and Berry 1985) and about SERVQUAL, a two-part instrument we developed for measuring service quality (Parasuraman, Zeithaml, and Berry 1988) and later refined (Parasuraman, Berry, and Zeithaml 1991). In this article we respond to those concerns by reexamining the underlying arguments and evidence, introducing additional perspectives that we argue are necessary for a balanced assessment of the specification and measurement of service quality, and isolating the concerns that seem appropriate from those that are questionable. Building on the insights emerging from this exchange, we propose an agenda for further research. Because many of the issues raised in the two articles and in our response are still unsettled, we hope that the research agenda will encourage all interested researchers to address those issues and add to the service quality literature in a constructive manner.1

Though the concerns raised by both C&T and Teas relate to service quality assessment, by and large they focus on different issues. For this reason, we discuss C&T's concerns first, then address the issues raised by Teas. We conclude with a discussion of further research for addressing unresolved issues, including the relationship between service quality and customer satisfaction.

1Because of the frequent references to Cronin and Taylor in this article, hereafter we refer to them as C&T. For the same reason, we use initials in citing our own work, that is, PZB, PBZ, ZBP, or ZPB. Also, when appropriate, we use the abbreviations CS, SQ, and PI to refer to customer satisfaction, service quality, and purchase intentions, respectively.

A. Parasuraman is Federated Professor of Marketing, Texas A&M University. Valarie A. Zeithaml is a Partner, Schmalensee and Zeithaml. Leonard L. Berry is JCPenney Chair of Retailing Studies and Professor of Marketing, Texas A&M University.

Journal of Marketing, Vol. 58 (January 1994), 111-124

Response to Cronin and Taylor (1992)

C&T surveyed customers in four sectors (banking, pest control, dry cleaning, and fast food) using a questionnaire that contained (1) the battery of expectations and perceptions questions in the SERVQUAL instrument (PBZ 1991; PZB 1988), (2) a separate battery of questions to measure the importance of the SERVQUAL items, and (3) single-item scales to measure overall service quality, customer satisfaction, and purchase intentions. On the basis of their study, C&T conclude that it is unnecessary to measure customer expectations in service quality research. Measuring perceptions is sufficient, they contend. They also conclude that service quality fails to affect purchase intentions. We do not believe that these conclusions are warranted at this stage of research on service quality, as we attempt to demonstrate in this section. Our comments on C&T's study focus on (1) conceptual issues, (2) methodological and analytical issues, and (3) practical issues.

Conceptual Issues

The perceptions-expectations gap conceptualization of SQ. C&T contend (p. 56) that "little if any theoretical or empirical evidence supports the relevance of the expectations-performance gap as the basis for measuring service quality." They imply here and elsewhere that PZB's extensive focus group research (PZB 1985; ZPB 1990) suggests only the attributes of SQ and not its expectations-performance gap formulation. We wish to emphasize that our research provides strong support for defining SQ as the discrepancy between customers' expectations and perceptions, a point we make in PZB (1985) and ZPB (1990). C&T's assertion
also seems to discount prior conceptual work in the SQ literature (Gronroos 1982; Lehtinen and Lehtinen 1982; Sasser, Olsen, and Wyckoff 1978) as well as more recent research (Bolton and Drew 1991a, b; ZBP 1991) that supports the disconfirmation-of-expectations conceptualization of SQ. Bolton and Drew (1991b, p. 383), in an empirical study cited by C&T, conclude:

    Consistent with prior exploratory research concerning service quality, a key determinant of overall service quality is the gap between performance and expectations (i.e., disconfirmation). For residential customers, perceived telephone service quality depended on the disconfirmation triggered by perceived changes in existing service or changes in service providers. . . . It is interesting to note that disconfirmation explains a larger proportion of the variance in service quality than performance.

Therefore, C&T's use of the same Bolton and Drew article as support for their claim (p. 56) that "the marketing literature appears to offer considerable support for the superiority of simple performance-based measures of service quality" is surprising and questionable. Moreover, a second citation that C&T offer to support this contention, Mazis, Ahtola, and Klippel (1975), is an article that neither dealt with service quality nor tested performance-based measures against measures incorporating expectations.

There is strong theoretical support for the general notion that customer assessments of stimuli invariably occur relative to some norm. As far back as Helson (1964), adaptation-level theory held that individuals perceive stimuli only in relation to an adapted standard. More recently, Kahneman and Miller (1986) provide additional theoretical and empirical support for norms as standards against which performance is compared and can be measured.

Attitude formation versus attitude measurement. The empirical, longitudinal studies (e.g., Bolton and Drew 1991a, b; Churchill and Surprenant 1982; Oliver 1980) that C&T cite to support their arguments focus on the roles of expectations, performance, and disconfirmation in the formation of attitudes. However, the relevance of those arguments for raising concerns about SERVQUAL is moot because the instrument is designed merely to measure perceived SQ, an attitude level, at a given point in time, regardless of the process by which it was formed. SERVQUAL is a tool to obtain a reading of the attitude level, not a statement about how the level was developed.

Relationship between CS and SQ. An important issue raised by C&T (as well as Teas) is the nature of the link between CS and SQ. This is a complex issue characterized by confusion about the distinction between the two constructs as well as the causal direction of their relationship. SQ researchers (e.g., Carman 1990; PZB 1988) in the past have distinguished between the two according to the level at which they are measured: CS is a transaction-specific assessment (consistent with the CS/D literature) whereas SQ is a global assessment (consistent with the SQ literature). On the basis of this distinction, SQ researchers have posited that an accumulation of transaction-specific assessments leads to a global assessment (i.e., the direction of causality is from CS to SQ). However, on careful reflection, we now believe that this distinction may need to be revised, particularly because some recent studies (Reidenbach and Sandifer-Smallwood 1990; Woodside, Frey, and Daly 1989) have modeled SQ as an antecedent of CS.

In the research implications section, we propose and discuss an integrative framework that attempts to reconcile the differing viewpoints about the distinction, and the direction of causality, between CS and SQ. Here we wish to point out an apparent misinterpretation by C&T of our published work (PZB 1985, 1988). Specifically, in PZB (1985) we do not discuss the relationship between SQ and CS, and in PZB (1988) we argue in favor of the "CS leads to SQ" hypothesis. Therefore, we are surprised by C&T's attributing to us the opposite view by stating (p. 56) "PZB (1985, 1988) proposed that higher levels of perceived SQ result in increased CS" and by referring to (p. 62) "the effects hypothesized by PZB (1985, 1988) [that] SQ is an antecedent of CS." Moreover, as we discuss in the next section, C&T's empirical findings do not seem to warrant their conclusion (p. 64) that "perceived service quality in fact leads to satisfaction as proposed by PZB (1985, 1988)."

Types of comparison norms in assessing CS and SQ. In critiquing the perceptions-minus-expectations conceptualization of SERVQUAL, C&T raise the issue of appropriate comparison standards against which perceptions are to be compared (p. 56):

    [PZB] state that in measuring perceived service quality the level of comparison is what a consumer should expect, whereas in measures of satisfaction the appropriate comparison is what a consumer would expect. However, such a differentiation appears to be inconsistent with Woodruff, Cadotte, and Jenkins' (1983) suggestion that expectations should be based on experience norms: what consumers should expect from a given service provider given their experience with that specific type of service organization.

Though the first sentence in this quote is consistent with the distinction we have made between comparison norms in assessing SQ and CS, the next sentence seems to imply that the debate over norms has been resolved, the consensus being that the "experience-based norms" of Woodruff, Cadotte, and Jenkins (1983) are the appropriate frame of reference in CS assessment. We believe that such an inference is unwarranted because conceptualizations more recent than Woodruff, Cadotte, and Jenkins (1983) suggest that CS assessment could involve more than one comparison norm (Forbes, Tse, and Taylor 1986; Oliver 1985; Tse and Wilton 1988; Wilton and Nicosia 1986). Cadotte, Woodruff, and Jenkins (1987) themselves have stated (p. 313) that "additional work is needed to refine and expand the conceptualization of norms as standards." Our own recent research on customers' service expectations (ZBP 1991, a publication that C&T cite) identifies two different comparison norms for SQ assessment: desired service (the level of service a customer believes can and should be delivered) and adequate service (the level of service the customer considers acceptable).

Our point is that the issue of comparison norms and their interpretation has not yet been resolved fully. Though
recent conceptual work (e.g., Woodruff et al. 1991) and empirical work (e.g., Boulding et al. 1993) continue to add to our understanding of comparison norms and how they influence customers' assessments, more research is needed.

Methodological and Analytical Issues

C&T's empirical study does not justify their claim (p. 64) that "marketing's current conceptualization and measurement of service quality are based on a flawed paradigm" and that a performance-based measure is superior to the SERVQUAL measure. A number of serious questions about their methodology and their interpretation of the findings challenge the strong inferences they make.

Dimensionality of service quality. One piece of evidence C&T use to argue against the five-component structure of SERVQUAL is that their LISREL-based confirmatory analysis does not support the model shown in their Figure 1. Unfortunately, C&T's Figure 1 is not a totally accurate depiction of our prior findings pertaining to SERVQUAL (PZB 1988) because it does not allow for possible intercorrelations among the five latent constructs. Though we have said in our previous work that SERVQUAL consists of five distinct dimensions, we also have pointed out that the factors representing those dimensions are intercorrelated and hence overlap to some degree. We report average interfactor correlations of .23 to .35 in PZB (1988). In a more recent study in which we refined SERVQUAL, reassessed its dimensionality, and compared our findings with four other replication studies, we acknowledge and discuss potential overlap among the dimensions and arrive at the following conclusion (PZB 1991, p. 442):

    Though the SERVQUAL dimensions represent five conceptually distinct facets of service quality, they are also interrelated, as evidenced by the need for oblique rotations of factor solutions in the various studies to obtain the most interpretable factor patterns. One fruitful area for future research is to explore the nature and causes of these interrelationships.

Therefore, we would argue that the fit of C&T's SERVQUAL data to their model in Figure 1 might have been better than that implied by their Table 1 results if the model had allowed the five latent constructs to intercorrelate.

C&T use results from their oblique factor analyses to reiterate their inference that the 22 SERVQUAL items are unidimensional. They do so by merely referring to their Table 2 and indicating (p. 61) that "all of the items loaded predictably on a single factor with the exception of item 19." But a careful examination of C&T's Table 2 raises questions about the soundness of their inference.

First, except for the factor representing SERVQUAL's perceptions component (i.e., SERVPERF) in the pest control context, the percentage of variance in the 22 items captured by the one factor for which results are reported is less than 50%, the cut-off value recommended by Bagozzi and Yi (1988) for indicators of latent constructs to be considered adequate. Moreover, across the four contexts, the variance captured for the SERVQUAL items is much lower than for the SERVPERF items (mean of 32.4% for SERVQUAL versus 42.6% for SERVPERF). These results strongly suggest that a unidimensional factor is not sufficient to represent fully the information generated by the 22 items, especially in the case of SERVQUAL, for which over two-thirds of the variance in the items is unexplained if just one factor is used. The difference in the variance explained for SERVQUAL and SERVPERF also suggests that the former could be a richer construct that more accurately represents the multifaceted nature of SQ posited in the conceptual literature on the subject (Gronroos 1982; Lehtinen and Lehtinen 1982; PZB 1985; Sasser, Olsen, and Wyckoff 1978).

Second, though C&T say that they conducted oblique factor analysis, they report loadings for just one factor in each setting (Table 2). It is not clear from their article whether the reported loadings are from the rotated or unrotated factor-loading matrix. If they are prerotation loadings, interpreting them to infer unidimensionality is questionable. If they are indeed postrotation loadings, the variance explained uniquely by each factor might be even lower than the percentages shown in Table 2. Because oblique rotation allows factors to correlate with one another, part of the variance reported as being explained by one factor might be shared with the other factors included in the rotation. The possibility that the unique variance represented by the single factors might be lower than the already low percentages in Table 2 further weakens C&T's claim that the SQ construct is unidimensional.

Finally, C&T's inference that SERVQUAL and SERVPERF can be treated as unidimensional on the basis of the high alpha values reported in Table 2 is erroneous because it is at odds with the extensive discussion in the literature about what unidimensionality means and what coefficient alpha does and does not represent (Anderson and Gerbing 1982; Gerbing and Anderson 1988; Green, Lissitz, and Mulaik 1977; Hattie 1985; Howell 1987). As Howell (1987, p. 121) points out, "Coefficient alpha, as an estimate of reliability, is neither necessary nor sufficient for unidimensionality." And, according to Gerbing and Anderson (1988, p. 190), "Coefficient alpha ... sometimes has been misinterpreted as an index of unidimensionality rather than reliability .... Unidimensionality and reliability are distinct concepts."

In short, every argument that C&T make on the basis of their empirical findings to maintain that the SERVQUAL items form a unidimensional scale is questionable. Therefore, summing or averaging the scores across all items to create a single measure of service quality, as C&T have done in evaluating their structural models (Figure 2), is questionable as well.

Validity. Citing evidence from the correlations in their Table 3, C&T conclude that SERVPERF has better validity than SERVQUAL (p. 61): "We suggest that the proposed performance-based measures provide a more construct-valid explication of service quality because of their content validity ... and the evidence of their discriminant validity." This suggestion is unwarranted because, as we demonstrate subsequently, SERVQUAL performs just as well as SERVPERF on each validity criterion that C&T use.
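The point made above that a high coefficient alpha does not establish unidimensionality can be demonstrated numerically. The sketch below is illustrative only (it does not use C&T's data): it simulates a six-item scale built from two distinct but moderately correlated latent dimensions, in the spirit of the .23 to .35 interfactor correlations reported in PZB (1988), and shows that alpha for the combined scale is nonetheless high.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Coefficient alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
n = 5000
# Two distinct latent dimensions, moderately correlated.
cov = [[1.0, 0.3], [0.3, 1.0]]
f1, f2 = rng.multivariate_normal([0, 0], cov, size=n).T
# Three items load on each dimension, each with item-specific noise.
items = np.column_stack(
    [f1 + 0.5 * rng.standard_normal(n) for _ in range(3)]
    + [f2 + 0.5 * rng.standard_normal(n) for _ in range(3)]
)
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")  # high alpha despite two underlying dimensions
```

Here alpha comes out well above conventional reliability cut-offs even though, by construction, the scale contains two dimensions, which is exactly the Gerbing and Anderson point quoted above.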


C&T infer convergent validity for SERVPERF by stating (p. 61), "A high correlation between the items SERVPERF, importance-weighted SERVPERF, and service quality indicates some degree of convergent validity." However, they fail to examine SERVQUAL in similar fashion. According to Table 3, the average pairwise correlation among SERVPERF, importance-weighted SERVPERF, and overall service quality is .6892 (average of .9093, .6012, and .5572). The corresponding average correlation for SERVQUAL is .6870 (average of .9787, .5430, and .5394). Clearly, the virtually identical average correlations for SERVPERF and SERVQUAL do not warrant the conclusion that the former has higher convergent validity than the latter.

C&T claim discriminant validity for SERVPERF by stating (p. 61), "An examination of the correlation matrix in Table 3 indicates discriminant validity ... as the three service quality scales all correlate more highly with each other than they do with other research variables (i.e., satisfaction and purchase intentions)." Though this statement is true, it is equally true for SERVQUAL, a finding that C&T fail to acknowledge. In fact, though the average pairwise correlation of SERVPERF with satisfaction and purchase intentions is .4812 (average of .5978 and .3647), the corresponding value for SERVQUAL is only .4569 (average of .5605 and .3534). Because the average within-construct intercorrelations are almost identical for the two scales, as demonstrated previously, these results actually imply somewhat stronger discriminant validity for SERVQUAL.

Regression analyses. C&T assess the predictive ability of the various multiple-item service quality scales by conducting regression analyses wherein their single-item overall SQ measure is the dependent variable. On the basis of the R² values reported in Table 4, they correctly conclude that SERVPERF outperforms the other three scales. Nevertheless, one should note that the average improvement in variance explained across the four contexts by using SERVPERF instead of SERVQUAL is 6%. (The mean R² values, rounded to two decimal places, are .39 for SERVQUAL and .45 for SERVPERF.) Whether this difference is large enough to claim superiority for SERVPERF is arguable. The dependent variable itself is a performance-based (rather than disconfirmation-based) measure and, as such, is more similar to the SERVPERF than the SERVQUAL formulation. Therefore, at least some of the improvement in explanatory power achieved by using SERVPERF instead of SERVQUAL could be merely an artifact of the "shared method variance" between the dependent and independent variables. As Bagozzi and Yi (1991, p. 426) point out, "When the same method is used to measure different constructs, shared method variance always inflates the observed between-measure correlation."

The overall pattern of significant regression coefficients in C&T's Table 4 offers some insight about the dimensionality of the 22 items as well. Specifically, it can be partitioned into the following five horizontal segments corresponding to the a priori classification of the items into the SERVQUAL dimensions (PZB 1988):

    Segment 1  Tangibles       Variables V1-V4
    Segment 2  Reliability     Variables V5-V9
    Segment 3  Responsiveness  Variables V10-V13
    Segment 4  Assurance       Variables V14-V17
    Segment 5  Empathy         Variables V18-V22

The total numbers of significant regression coefficients in the various segments of Table 4 are as follows:

    Segment 1  Tangibles        6
    Segment 2  Reliability     32
    Segment 3  Responsiveness   9
    Segment 4  Assurance       14
    Segment 5  Empathy         11

As this summary implies, the items under the five dimensions do not all contribute in like fashion to explaining the variance in overall service quality. The reliability items are the most critical drivers, and the tangibles items are the least critical drivers. Interestingly, the relative importance of the five dimensions implied by this summary (reliability being most important, tangibles being least important, and the other three being of intermediate importance) almost exactly mirrors our findings obtained through both direct (i.e., customer-expressed) and indirect (i.e., regression-based) measures of importance (PBZ 1991; PZB 1988; ZPB 1990). The consistency of these findings across studies, researchers, and contexts offers strong, albeit indirect, support for the multidimensional nature of service quality. They also suggest that techniques other than, or in addition to, factor analyses can and should be used to understand accurately the dimensionality of a complex construct such as SQ. As we have argued in another article comparing several SERVQUAL-replication studies that yielded differing factor patterns (PBZ 1991, p. 442-3):

    Respondents' rating a specific company similarly on SERVQUAL items pertaining to different dimensions, rather than the respondents' inability to distinguish conceptually between the dimensions in general, is a plausible explanation for the diffused factor patterns. To illustrate, consider the following SERVQUAL items:

    • Employees at XYZ give you prompt service (a responsiveness item)
    • Employees of XYZ have the knowledge to answer your questions (an assurance item)
    • XYZ has operating hours convenient to all its customers (an empathy item)

    If customers happen to rate XYZ the same or even similarly on these items, it does not necessarily mean that customers consider the items to be part of the same dimension. Yet, because of high intercorrelations among the three sets of ratings, the items are likely to load on the same factor when the ratings are factor analyzed. Therefore, whether an unclear factor pattern obtained through analyzing company-specific ratings necessarily implies poor discriminant validity for the general SERVQUAL dimensions is debatable.

In discussing the rationale for their first proposition, which they later test by using regression analysis, C&T state (p. 59), "The evaluation [of] P1 calls for an assessment of whether the addition of the importance weights suggested by ZPB (1990) improves the ability of the
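The average correlations discussed above can be reproduced directly from the Table 3 values quoted in the text. Note that the cross-construct averages reported in the article (.4812 and .4569) appear to be truncated rather than rounded:

```python
# Correlations quoted in the text from C&T's Table 3.
servperf_within = [.9093, .6012, .5572]  # SERVPERF, weighted SERVPERF, overall SQ
servqual_within = [.9787, .5430, .5394]
servperf_cross = [.5978, .3647]          # with satisfaction and purchase intentions
servqual_cross = [.5605, .3534]

def mean(xs):
    return sum(xs) / len(xs)

print(f"SERVPERF within-construct mean: {mean(servperf_within):.4f}")  # 0.6892
print(f"SERVQUAL within-construct mean: {mean(servqual_within):.4f}")  # 0.6870
print(f"SERVPERF cross-construct mean:  {mean(servperf_cross):.5f}")   # article reports .4812 (truncated)
print(f"SERVQUAL cross-construct mean:  {mean(servqual_cross):.5f}")   # article reports .4569 (truncated)
```

The near-identical within-construct means, combined with the lower cross-construct mean for SERVQUAL, are the arithmetic behind the convergent- and discriminant-validity comparison above.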


SERVQUAL and SERVPERF scales to measure service quality." This statement is misleading because in ZPB (1990), we recommend the use of importance weights merely to compute a weighted average SERVQUAL score (across the five dimensions) as an indicator of a company's overall SQ gap. Moreover, the importance weights that we use in ZPB (1990) are weights for the dimensions (derived from customer responses to a 100-point allocation question), not for the individual SERVQUAL items. Neither in ZPB (1990) nor in our other work (PBZ 1991; PZB 1988) have we suggested using survey questions to measure the importance of individual items, let alone using weighted individual-item scores in regression analyses as C&T have done. In fact, we would argue that using weighted item scores as independent variables in regression analysis is not meaningful because a primary purpose of regression analysis is to derive the importance weights indirectly (in the form of beta coefficients) by using unweighted or "raw" scores as independent variables. Therefore, using weighted scores as independent variables is a form of "double counting."

Relationships among SQ, CS, and PI. Figure 2 in C&T's article shows the structural model they use to examine the interrelationships among SQ, CS, and PI. Their operationalization and testing of this model suffer from several serious problems. C&T's use of single-item scales to measure SQ, CS, and PI fails to do justice to the richness of these constructs. As already discussed, SQ is a multifaceted construct, even though there is no clear consensus yet on the number of dimensions and their interrelationships. Though a single-item overall SQ measure may be appropriate for examining the convergent validity and predictive power of alternative SQ measures such as SERVQUAL, it is not as appropriate for testing models positing structural relationships between SQ and other constructs such as PI, especially when a multiple-item scale is available. Therefore, in Figure 2, using SERVQUAL or SERVPERF directly to operationalize overall SQ (i.e., η2) would have been far better than using a single-item SQ measure as C&T have done. In other words, because C&T's P2, P3, and P4 relate to just the bottom part of their Figure 2, the procedure they used to operationalize ξ1 should have been used to operationalize η2 instead.

C&T's failure to use multiple-item scales to measure CS and PI, particularly in view of the centrality of these constructs to their discussion and claims, is also questionable. Multiple-item CS scales (e.g., Westbrook and Oliver 1981) are available in our literature and have been used in several customer-satisfaction studies (e.g., Oliver and Swan 1989; Swan and Oliver 1989). Multiple-item scales also have been used to operationalize purchase or behavioral intentions (e.g., Bagozzi 1982; Dodds, Monroe, and Grewal 1991).

Another serious consequence of C&T's use of single-item measures as indicators for their model's four latent constructs is the lack of degrees of freedom for a robust test of their model. The four measures yield only six intermeasure correlations (i.e., 4 × 3/2 = 6) to estimate the model's parameters. Because five parameters are being estimated as per Figure 2, only one degree of freedom is available to test the model, a fact that C&T do not report in their LISREL results summarized in Table 5 or their discussion of the results. Insufficient degrees of freedom to estimate the parameters of a structural model will tend to inflate the fit of the data to the model. For example, a model with zero degrees of freedom (i.e., the number of intermeasure correlations equal to the number of model parameters) will fit the observed data perfectly.

Therefore, the meaningfulness of C&T's observation (p. 63) that "Model 1 (SERVQUAL) had a good fit in two of the four industries ... whereas Model 2 (SERVPERF) had an excellent fit in all four industries" should be assessed carefully, keeping in mind two important points: (1) the fit values for both Model 1 and Model 2 could be inflated due to insufficient degrees of freedom and (2) the somewhat better fit values for Model 2 could be an artifact of the "shared method variance" between the SERVPERF and the "Overall Service Quality" measures in the model. Because of the questionable meaningfulness of the structural model tests, C&T's interpreting the test results (p. 63) "as additional support for the superiority of the SERVPERF approach" is debatable as well.

A comparison of the results in C&T's Tables 3 and 5 reveals several serious inconsistencies and interpretational problems that reiterate the inadequacies of their measures and structural model test. For example, the correlation between the single-item measures of SQ and CS is .8175 (Table 3), implying that the measures are highly collinear and exhibit little or no discriminant validity. Yet C&T seem to ignore this problem and rely solely on the Table 5 results (which themselves are questionable because of the problems described previously) to make strong claims pertaining to the direction of causality between the two constructs (i.e., SQ leads to CS and not vice versa) and the relative influence of the two constructs on PI (i.e., CS has a significant effect in all four contexts and SQ has no significant effect in any context).

Regarding the latter claim, it is instructive to note from Table 3 that CS and SQ have virtually identical correlations with PI (.5272 for SQ and .5334 for CS). The most logical inference on the basis of this fact is that SQ and CS have a similar effect on PI, particularly because SQ and CS themselves are highly correlated. Such an inference is much more plausible and justified than C&T's conclusion that only CS has a significant effect on PI. That the findings from C&T's structural-model test led them to this questionable inference is perhaps a manifestation of the problems with their measures (single-item measures, multicollinearity) and their model (questionable causal path, lack of degrees of freedom).

Practical Issues

Arguing in favor of performance-based measures of SQ, C&T state (p. 58), "Practitioners often measure the determinants of overall satisfaction/perceived quality by having customers simply assess the performance of the company's business processes." Though the practice of measuring only perceptions is widespread, such a practice does not necessarily
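The degrees-of-freedom arithmetic described above is simple to verify: four single-item measures yield k(k−1)/2 distinct intermeasure correlations, and subtracting the five estimated parameters leaves only one degree of freedom for the model test.

```python
# Degrees of freedom for C&T's structural model test:
# k single-item measures give k(k-1)/2 distinct intermeasure correlations.
k_measures = 4
n_correlations = k_measures * (k_measures - 1) // 2  # 4 * 3 / 2 = 6
n_parameters = 5                                      # estimated per their Figure 2
df = n_correlations - n_parameters
print(n_correlations, df)  # 6 1
```

With df = 0 the model would reproduce the observed correlations exactly, which is why fit indices computed under such slack constraints say little about the model's adequacy.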


Table 1
Mean Perceptions-Only and SERVQUAL Scores^a

Service Quality    Tel. Co.     Ins. Co. 1   Ins. Co. 2   Bank 1       Bank 2
Dimensions         P     SQ     P     SQ     P     SQ     P     SQ     P     SQ
Tangibles          5.6    .3    5.3    0     5.6    .6    5.4    .3    5.8    .7
Reliability        5.1  -1.3    4.8  -1.6    5.4  -1.0    5.1  -1.4    5.4  -1.1
Responsiveness     5.1  -1.2    5.1  -1.3    5.6   -.8    4.8  -1.6    5.4   -.9
Assurance          5.4  -1.1    5.4  -1.0    5.8   -.7    5.2  -1.4    5.7   -.8
Empathy            5.2  -1.1    5.1  -1.1    5.6   -.8    4.7  -1.6    5.2  -1.0

^a The numbers under "P" and "SQ" represent mean perceptions-only and SERVQUAL scores, respectively.
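As a minimal sketch of how scores of the kind shown in Table 1 are produced, the fragment below computes dimension-level SERVQUAL gaps (perceptions minus expectations) and an importance-weighted overall gap of the sort recommended in ZPB (1990). The expectation values and 100-point allocation weights here are illustrative placeholders, not data from the study; only the P values echo the Tel. Co. column of Table 1.

```python
# Hypothetical ratings: mean perception (P) and expectation (E) per dimension
# on a 7-point scale; weights from a (hypothetical) 100-point allocation.
dimensions = {
    "tangibles":      {"P": 5.6, "E": 5.3, "weight": 11},
    "reliability":    {"P": 5.1, "E": 6.4, "weight": 32},
    "responsiveness": {"P": 5.1, "E": 6.3, "weight": 22},
    "assurance":      {"P": 5.4, "E": 6.5, "weight": 19},
    "empathy":        {"P": 5.2, "E": 6.3, "weight": 16},
}

for name, d in dimensions.items():
    d["gap"] = d["P"] - d["E"]  # SERVQUAL score for the dimension
    print(f"{name:15s} P={d['P']:.1f}  SQ gap={d['gap']:+.1f}")

# Overall SQ gap: importance-weighted average of the dimension gaps.
total = sum(d["weight"] for d in dimensions.values())
overall = sum(d["gap"] * d["weight"] for d in dimensions.values()) / total
print(f"overall weighted SQ gap = {overall:+.2f}")
```

Note how a dimension with a high perception score can still carry a large negative gap once expectations are subtracted, which is the diagnostic point the surrounding discussion makes about Ins. Co. 1.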

mean performance-based measures are superior to disconfirmation-based measures. As demonstrated in PBZ (1990), SQ measurements that incorporate customer expectations provide richer information than those that focus on perceptions only. Moreover, executives in companies that have switched to a disconfirmation-based measurement approach tell us that the information generated by this approach has greater diagnostic value.

In a recent study we administered the SERVQUAL scale to independent samples of customers of five nationally known companies (details of the study are in PBZ 1991). Table 1 summarizes perceptions-only and SERVQUAL scores by dimension for each of the five companies. As Table 1 shows, the SERVQUAL scores for each company consistently exhibit greater variation across dimensions than the perceptions-only scores. This pattern of findings suggests that the SERVQUAL scores could be superior in terms of pinpointing areas of deficiency within a company. Moreover, examining only performance ratings can lead to different actions than examining those ratings relative to customer expectations. For example, Ins. Co. 1 might focus more attention on tangibles than on assurance if it relied solely on the perceptions-only scores. This would be a mistake, because the company's SERVQUAL scores show a serious shortfall for assurance and no shortfall for tangibles.

An important question to ask in assessing the practical value of SERVQUAL vis-a-vis SERVPERF is, Are managers who use SQ measurements more interested in accurately identifying service shortfalls or explaining variance in an overall measure of perceived service? (Explained variance is the only criterion on which SERVPERF performs better than SERVQUAL, and, as discussed previously, this could be due to shared method variance.) We believe that managers would be more interested in an accurate diagnosis of SQ problems. From a practical standpoint, SERVQUAL is preferable to SERVPERF in our judgment. The superior diagnostic value of SERVQUAL more than offsets the loss in predictive power.

Response to Teas (1993)

The issues raised by Teas (1993) fall under three main topics: (1) interpretation of the expectations standard, (2) operationalization of this standard, and (3) evaluation of alternative models specifying the SQ construct.

Interpretation of Expectations

A key issue addressed by Teas is the impact of the interpretation of the expectations measure (E) on the meaningfulness of the P-E specification invoked by the SERVQUAL framework. Specifically, on the basis of a series of conceptual and mathematical arguments, Teas concludes that increasing P-E scores may not necessarily reflect continuously increasing levels of perceived quality, as the SERVQUAL framework implies. Though his conclusion has merit, several assumptions underlying his arguments must be reexamined to assess accurately the severity of the problem.

The P-E specification is problematic only for certain types of attributes under certain conditions. As Teas's discussion suggests, this specification is meaningful if the service feature being assessed is a vector attribute-that is, one on which a customer's ideal point is at an infinite level. With vector attributes, higher performance is always better (e.g., checkout speed at a grocery store). Thus, as portrayed under Case A in Figure 1, for vector attributes there is a positive monotonic relationship between P and SQ for a given level of the expectation norm E, regardless of how E is interpreted. This relationship is consistent with the SERVQUAL formulation of SQ. Moreover, customers are likely to consider most of the 22 items in the SERVQUAL instrument to be vector attributes-a point we discuss subsequently and one that Teas acknowledges in footnote 16 of his article.

The P-E specification could be problematic when a service attribute is a classic ideal point attribute-that is, one on which a customer's ideal point is at a finite level and, therefore, performance beyond which will displease the customer (e.g., friendliness of a salesperson in a retail store). However, the severity of the potential problem depends on how the expectations norm E is interpreted. Teas offers two interpretations of E that are helpful in assessing the meaningfulness of the P-E specification: a "classic attitudinal model ideal point" interpretation and a "feasible ideal point" interpretation. If E is interpreted as the classic ideal point (i.e., E = I) and the performance level P exceeds E, the relationship between P and SQ for a given level of E becomes negative monotonic, in contrast to the positive monotonic relationship implied by the SERVQUAL formulation. As Case B in Figure 1 shows, under the classic ideal point interpretation of E the P-E specification is meaningful as long as P is less than or equal to E, but becomes a problem if P exceeds E.
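Cases A and B can be sketched as simple one-attribute functions. This is an illustrative sketch of ours, not part of the original analysis; the -(P - E) branch for P > E corresponds to the corrected specification discussed in the text:

```python
# Illustrative sketch (not from the original article) of the one-attribute
# SQ functions discussed above, for a fixed expectation standard E.

def sq_vector(p, e):
    """Case A: vector attribute -- higher performance is always better."""
    return p - e

def sq_classic_ideal(p, e):
    """Case B: E interpreted as the classic ideal point (E = I); SQ peaks
    at P = E and declines beyond it, i.e., SQ = -(P - E) when P > E."""
    return (p - e) if p <= e else -(p - e)

E = 5.0
# Case A rises monotonically; Case B peaks at P = E and then falls.
assert sq_vector(6.0, E) > sq_vector(4.0, E)
assert sq_classic_ideal(5.0, E) == 0.0
assert sq_classic_ideal(6.0, E) < sq_classic_ideal(5.0, E)
```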

116 / Journal of Marketing, January 1994


When this condition occurs, the correct specification for SQ is -(P - E).

FIGURE 1
Functional Relationship Between Perceived Performance (Pj) and Service Quality (SQ)

Case A: Attribute j is a vector attribute and, therefore, the general form of the function below holds regardless of how the expectations standard (Ej) is defined. [SQ increases monotonically with Pj.]

Case B: Attribute j is a classic ideal-point attribute and Ej is interpreted as the classic ideal point (i.e., Ej = Ij). [SQ peaks at Pj = Ej and declines thereafter.]

Case C: Attribute j is a classic ideal-point attribute and Ej is interpreted as the feasible ideal point (i.e., Ej < Ij). [SQ peaks at Pj = Ij and declines thereafter.]

To examine the P-E specification's meaningfulness under the feasible ideal point interpretation, Teas proposes a "modified one-attribute SERVQUAL model" (MQ), represented by equation 3 in his article. Dropping the subscript i from this equation for simplicity (doing so does not materially affect the expression) yields the following equation:

(1) MQ = -1[|P - I| - |E - I|]

where
MQ = "Modified" quality
I = The ideal amount of the attribute (i.e., the classic attitudinal model ideal point)
E = The expectation norm interpreted as the feasible ideal point
P = The perceived performance level

Teas defines the feasible ideal point (p. 20) as "a feasible level of performance under ideal circumstances, that is, the best level of performance by the highest quality provider under perfect circumstances." If one examines this definition from a customer's perspective-an appropriate one given that E and I are specified by customers-it is inconceivable that the feasible ideal point would exceed the classic ideal point. Because the feasible ideal point represents the level of service the customer considers possible for the best service company, exceeding that level still should be viewed favorably by the customer. In contrast, the classic ideal point is the service level beyond which the customer would experience disutility. Therefore, the feasible ideal point logically cannot exceed the classic ideal point. In other words, when customers specify the levels of E and I, the range of feasible ideal points will be restricted as follows: E ≤ I. Yet, implied in Teas's discussion of MQ is the assumption that E can exceed I (Teas makes this assumption explicit in footnote 7). Under the more realistic assumption that E ≤ I, the absolute value of the E - I difference (the second component in equation 1's right side) is equivalent to -(E - I). Making this substitution in equation 1 yields

MQ = -1[|P - I| - (-(E - I))]
   = -1[|P - I| + (E - I)]
   = -1[|P - I|] - (E - I)

Furthermore, when P ≤ I, |P - I| is equivalent to -(P - I), and the equation for MQ simplifies to

MQ = -1[-(P - I)] - (E - I)
   = (P - I) - (E - I)
   = P - E

Therefore, as long as the perceived performance (P) does not exceed the classic ideal point (I), MQ is equivalent to SERVQUAL's P-E specification, regardless of whether P is below or above the feasible ideal point (E). As such, Teas's conclusion (p. 21) that "the SERVQUAL P-E measurement specification is not compatible with the feasible ideal point interpretation of E when finite classic ideal point attributes are involved" is debatable. As Case C in Figure 1 shows, when P ≤ I, the relationship between P and SQ is positive monotonic and identical to the relationship under the vector-attribute assumption (Case A).

The P-E specification does become a problem when P > I under the classic ideal point attribute assumption, with E being interpreted as the feasible ideal point. As shown by Case C in Figure 1, perceived service quality (SQ) peaks when P = I and starts declining thereafter. An important feature of Case C not obvious from Teas's discussion is that the classic ideal point (I) in effect supersedes the feasible ideal point (E) as the comparison standard when P exceeds I. In other words, when P > I, the shape of the SQ function is the same in Cases B and C except that in Case C the decline in SQ starts from a positive base level (of magnitude I - E), whereas in Case B it starts from a base level of zero. Therefore, when P > I under the feasible ideal point interpretation of E, the expression for service quality is

SQ = (I - E) - (P - I)

Figure 2 summarizes in flowchart form how the type of attribute and interpretation of E influence the specification of SQ. The P-E specification is appropriate under three of the five final branches in the flowchart. Therefore, the inappropriateness of the P-E specification depends on the likelihood of the occurrence of the remaining two branches. This is perhaps an empirical question that also has implications for further research on the nature of the service attributes, as we discuss subsequently.
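The MQ algebra above is easy to verify numerically. The following sketch is ours, with illustrative values chosen so that E ≤ I:

```python
# Numeric check of the MQ algebra above (a sketch, not from the original
# article): with E <= I, MQ = -(|P - I| - |E - I|) reduces to P - E
# whenever P <= I, and to (I - E) - (P - I) when P > I (Case C).
def mq(p, e, i):
    """Teas's modified one-attribute model, equation 1: MQ = -1[|P - I| - |E - I|]."""
    return -(abs(p - i) - abs(e - i))

E, I = 4.0, 6.0  # feasible ideal point below classic ideal point (E <= I)

for p in [2.0, 4.0, 5.0, 6.0]:           # any P <= I
    assert mq(p, E, I) == p - E           # identical to SERVQUAL's P - E

p = 7.0                                   # beyond the classic ideal point
assert mq(p, E, I) == (I - E) - (p - I)   # quality declines from base level I - E
```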



FIGURE 2
Impact of Attribute Type and Expectation Standard on Expression for Perceived Service Quality

[Flowchart. First branch: "Is attribute j a vector attribute or a classic ideal point attribute?" Vector attribute: SQ = P - E. Classic ideal point attribute: "Is E interpreted as a classic ideal point (E = I) or a feasible ideal point (E < I)?" Classic ideal point: SQ = P - E when P ≤ E; SQ = -(P - E) when P > E. Feasible ideal point: SQ = P - E when P ≤ I; SQ = (I - E) - (P - I) when P > I.]

Operationalization of Expectations

Teas raises several concerns about the operationalization of E, including a concern we ourselves have raised (PBZ 1990) about the use of the word "should" in our original expectations measure (PZB 1988). Because of this concern, we developed a revised expectations measure (E*) representing the extent to which customers believe a particular attribute is "essential" for an excellent service company. Teas wonders if E* is really an improvement over E, our original measure (p. 21-22):

    Defining the revised SERVQUAL E* in this way, in conjunction with the P-E measurement specification, ... suggests high performance scores on essential attributes (high E* scores) reflect lower quality than high performance on attributes that are less essential (low E* scores). It is difficult to envision a theoretical argument that supports this measurement specification.

Teas's argument is somewhat misleading because high performance on an "essential" attribute may not be high enough (from the customer's standpoint) and, therefore, logically can reflect lower quality on that attribute (a key phrase missing from Teas's argument) than equally high performance on a less essential attribute. In fact, as we have argued in our response to C&T, this possibility is an important reason why measuring only perceived service performance can lead to inaccurate assessment of perceived service quality.

The operationalization issue addressed and discussed most extensively by Teas is the congruence (or lack thereof) between the conceptual and operational definitions of the expectation measures. The systematic, comprehensive, and unique approach Teas has used to explore this issue is commendable. His approach involves obtaining ratings on three different comparison norms (E, E*, and I), conducting follow-up interviews of respondents providing nonextreme ratings (i.e., less than 7 on the 7-point scales used), using multiple coders to classify the open-ended responses, and computing indexes of congruence on the basis of relative frequencies of responses in the coded categories.

Though the results from Teas's content analysis of the open-ended responses (summarized in his Appendix B and Tables 4 and 5) are sound and insightful, his interpretation of them is open to question. Specifically, "under the assumption that the feasibility concept is congruent with the excellence norm concept" (p. 25), he concludes that the index of congruence is only 39.4% for E and 35.0% for E*, though he later raises these indexes to 52.4% and 49.1%, respectively, by including several more response categories. On the basis of these values, Teas further concludes (p. 25) that "a considerable portion of the variance in the SERVQUAL E and E* measures is the result of measurement error induced by respondents misinterpreting the scales." As discussed subsequently, both conclusions are problematic.

The purpose of the follow-up interviews in Teas's study was to explore why respondents had lower than the maximum expectation levels-as Teas acknowledges (p. 24), "respondents answered qualitative follow-up questions concerning their reasons for non-extreme (non-7) responses." The issue of how respondents interpreted the expectations questions was not the focus of the follow-up interviews. Therefore, inferring incongruence between intended and interpreted meanings of expectations (i.e., measurement error) on the basis of the open-ended responses is questionable.

Pertinent to the present discussion is a study that we conducted to understand the nature and determinants of customers' service expectations (ZBP 1993, a revised version of the monograph ZBP 1991). In this study, we developed a conceptual model explicating the general antecedents of expectations and how they are likely to influence expectation levels. In the first column of Table 2, we list five general antecedents from our study that are especially likely to lower expectations. Shown in the second column are illustrative open-ended responses from Teas's study that relate to the general antecedents. The correspondence between the two columns of Table 2 is additional evidence that the items in Appendix B of Teas's article reflect reasons for expectation levels rather than interpretations of the expectation measures.

TABLE 2
Correspondence Between Antecedent Constructs in ZBP's Expectations Model and Follow-up Responses Obtained by Teas

Personal Needs: States or conditions essential to the physical or psychological well-being of the customer.
    Illustrative response: "Not necessary to know the exact need."

Perceived Service Alternatives: Customers' perceptions of the degree to which they can obtain better service through providers other than the focal company.
    Illustrative response: "All of them have pretty much good hours."

Self-Perceived Service Role: Customers' perceptions of the degree to which they themselves influence the level of service they receive.
    Illustrative response: "I've gone into the store not knowing exactly myself what I want."

Situational Factors: Service-performance contingencies that customers perceive are beyond the control of the service provider.
    Illustrative response: "Can't wait on everybody at once."

Past Experience: Customers' previous exposure to the service that is relevant to the focal service.
    Illustrative response: "You don't get perfection in these kind of stores."

A key assumption underlying Teas's indexes of congruence is that the only open-ended responses congruent with the excellence norm standard are the ones included under the "not feasible," and perhaps "sufficient," categories. The rationale for this critical assumption is unclear. Therefore, what the index of congruence really represents is unclear as well. Given that the open-ended responses are reasons for customers' nonextreme expectations, no response listed in Appendix B of Teas's article is necessarily incompatible with the "excellence norm"-that is, customers' perceptions of attribute levels essential for service excellence.

Interestingly, the coded responses from Teas's study offer rich insight into another important facet of measuring SQ, namely, the meaningfulness of the P-E specification. As already discussed, whether the P-E specification is appropriate critically hinges on whether a service feature is a vector attribute or a classic ideal point attribute. The P-E specification is appropriate if the feature is a vector attribute or it is a classic ideal point attribute and perceived performance is less than or equal to the ideal level (see Figure 2). Customers' reasons for less-than-maximum expectations ratings in Teas's study shed light on whether a service feature is a vector or classic ideal point attribute. Specifically, reasons suggesting that customers dislike or derive negative utility from performance levels exceeding their expressed expectation levels imply classic ideal point attributes.

Of the open-ended responses in Appendix B of Teas's article, only those classified as "Ideal" (including double-classifications involving "Ideal") clearly imply that the service features in question are classic ideal point attributes. None of the other open-ended responses implies that performance levels exceeding the expressed expectation levels will displease customers. Therefore, service features associated with all open-ended responses not classified as "Ideal" are likely to possess vector-attribute properties.

The response-category frequencies in Teas's Tables 4 and 5 can be used to compute a different type of "index of congruence" that reflects the extent to which reasons for nonextreme expectations ratings are compatible with a vector-attribute assumption. This index is given by

100 - Σ(percentages of responses in the "Ideal" categories)

From Table 4, this index is 97.6% for E and 96.6% for E*, implying strong support for the vector-attribute assumption. From Table 5, which shows the distribution of respondents rather than responses, this index is 91.6% for E and 90.8% for E*. These are conservative estimates because Table 5 allows for multiple responses and, therefore, some respondents could be included in more than one category percentage in the summation term of the previous expression, thereby inflating the term. Therefore, for a vast majority of the respondents, the vector-attribute assumption is tenable for all service attributes included in Teas's study.

Evaluation of Alternative Service Quality Models

Using the empirical results from his study, Teas examines the criterion and construct validity of eight different SQ models: unweighted and weighted versions of the SERVQUAL (P-E), revised SERVQUAL (P-E*), normed quality (NQ), and evaluated performance (EP) formulations. On the basis of correlation coefficients reflecting criterion and construct validity, he correctly concludes that his EP formulation outperforms the other formulations. However, apart from the statistical significance of the correlation differences reported in his Tables 7 and 8, several additional issues are relevant for assessing the superiority of the EP formulation.

In introducing these issues we limit our discussion to the unweighted versions of just the P-E, P-E*, and EP formulations for simplicity and also because (1) for all formulations, the unweighted version outperformed its weighted counterpart on both criterion and construct validity and (2) the NQ formulation, in addition to having the lowest criterion and construct validity among the four formulations, is not one that we have proposed. From Teas's equation 6 and his discussion of it, the unweighted EP formulation for any service is as follows (the service subscript "i" is suppressed for notational simplicity):

EP = -1[ Σ(j=1 to m) |Pj - Ij| ]

where "m" is the number of service attributes. This expression assumes that the attributes are classic ideal point attributes as evidenced by the fact that the theoretical maximum value for EP is zero, which occurs when Pj = Ij for all attributes j. As discussed in the preceding section, the evidence from Teas's follow-up responses strongly suggests that the service attributes included in the study are much more likely to be vector attributes than classic ideal point attributes. Therefore, the conceptual soundness of the EP formulation is open to question.

Teas's findings pertaining to nonextreme ratings on his Ij scale raise additional questions about the soundness of



the classic ideal point attribute assumption invoked by the EP formulation. As his Table 4 reveals, only 151 non-7 ratings (13% of the total 1200 ratings) were obtained for the ideal point (I) scale, prompting him to observe (p. 26), "It is interesting to note, however, that the tendency for nonextreme responses was much smaller for the ideal point measure than it was for the expectations measures." Ironically, this observation, though accurate, appears to go against the classic ideal point attribute assumption because the fact that only 13% of the ratings were less than 7 seems incompatible with that assumption. If the service features in question were indeed classic ideal point attributes, one would expect the ideal-point levels to be dominated by nonextreme ratings.

Even in instances in which nonextreme ratings were obtained for the ideal point (I) measure, a majority of the reasons for those ratings (as reflected by the open-ended responses to the follow-up questions) do not imply the presence of finite ideal-point levels for the attributes investigated. From Teas's Table 4, only 40.5% of the reasons are classified under "Ideal" (including double classifications with "Ideal" as one of the two categories). Teas does acknowledge the presence of this problem, though he labels it as a lack of discriminant validity for the ideal-point measure.

Finally, it is worth noting that a semantic-differential scale was used for the "P" and "I" measures in the EP formulation, whereas a 7-point "strongly disagree"-"strongly agree" scale was used for the three measures (P, E, and E*) in the SERVQUAL and revised SERVQUAL formulations. Therefore, the observed criterion and construct validity advantages of the EP formulation could be an artifact of the scale-format differences, especially because the correlation differences, though statistically significant, are modest in magnitude: For the unweighted formulations, the mean criterion-validity correlation difference between EP and the two SERVQUAL formulations is .077 ([.081 + .073]/2); the corresponding mean construct-validity correlation difference is .132 ([.149 + .114]/2). However, because these correlation differences are consistently in favor of EP, the semantic-differential scale format could be worth exploring in the future for measuring the SERVQUAL perceptions and expectations components.

Implications and Directions for Further Research

C&T and Teas have raised important, but essentially different, concerns about the SERVQUAL approach for measuring SQ. C&T's primary concerns are that SERVQUAL's expectations component is unnecessary and the instrument's dimensionality is problematic. In contrast, Teas questions the meaningfulness of SERVQUAL's P-E specification and wonders whether an alternate specification-the "evaluated performance" (EP) specification-is superior. We have attempted to demonstrate that the validity and alleged severity of many of the issues raised are debatable. Nevertheless, several key unresolved issues emerging from this exchange offer a challenging agenda for further research on measuring SQ.

Measurement of Expectations

Of the psychometric concerns raised by C&T about SERVQUAL, the only one that has consistent support from their empirical findings is the somewhat lower predictive power of the P-E measure relative to the P-only measure. Our own findings (PBZ 1991) as well as those of other researchers (e.g., Babakus and Boller 1992) also support the superiority of the perceptions-only measure from a purely predictive-validity standpoint. However, as argued previously, the superior predictive power of the P-only measure must be balanced against its inferior diagnostic value. Therefore, formally assessing the practical usefulness of measuring expectations and the trade-offs involved in not doing so is a fruitful avenue for additional research. More generally, a need and an opportunity exist for explicitly incorporating practical criteria (e.g., potential diagnostic value) into the traditional scale-assessment paradigm that is dominated by psychometric criteria. Perreault (1992, p. 371), in a recent commentary on the status of research in our discipline, issues a similar call for a broadened perspective in assessing measures:

    Current academic thinking tends to define measures as acceptable or not primarily based on properties of the measure. However, research that focuses on defining what is "acceptable" in terms of a measure's likely impact on interpretation of substantive relationships, or the conclusions that might be drawn from those relationships, might provide more practical guidelines-not only for practitioners but also for academics .... There is probably a reasonable 'middle ground' between the 'all or nothing' contrast that seems to have evolved.

The most appropriate way to incorporate expectations into SQ measurement is another area for additional research. Though the SERVQUAL formulation as well as the various SQ models evaluated by Teas use a difference-score formulation, psychometric concerns have been raised about the use of difference scores in multivariate analyses (for a recent review of these concerns, see Peter, Churchill, and Brown 1993). Therefore, there is a need for studies comparing the paired-item, difference-score formulation of SERVQUAL with direct formulations that use single items to measure customers' perceptions relative to their expectations.

Brown, Churchill, and Peter (1993) conducted such a comparative study and concluded that a nondifference score measure had better psychometric properties than SERVQUAL. However, in a comment on their study, we have raised questions about their interpretations and demonstrated that the claimed psychometric superiority of the nondifference score measure is questionable (PBZ 1993). Likewise, as our response to C&T's empirical study shows, the psychometric properties of SERVQUAL's difference score formulation are by and large just as strong as those of its P-only component. Because the cumulative empirical evidence about difference score and nondifference score measures has not established conclusively the superiority of one measure over the other, additional research in this area is



warranted. Such research also should explore alternatives to the Likert-scale format ("strongly agree"-"strongly disagree") used most frequently in past SERVQUAL studies. The semantic-differential scale format used by Teas to operationalize his EP formulation is an especially attractive alternative-as we pointed out previously, this scale format is a plausible explanation for the consistently superior performance of the EP formulation.

Dimensionality of Service Quality

As already discussed, C&T's conclusion that the 22 SERVQUAL items form a unidimensional scale is unwarranted. However, replication studies have shown significant correlations among the five dimensions originally derived for SERVQUAL. Therefore, exploring why and how the five service quality dimensions are interrelated (e.g., could some of the dimensions be antecedents of the others?) is a fertile area for additional research. Pursuing such a research avenue would be more appropriate for advancing our understanding of SQ than hastily discarding the multidimensional nature of the construct.

The potential drawback pointed out previously of using factor analysis as the sole approach for assessing the dimensionality of service quality implies a need for developing other approaches. As we have suggested elsewhere (PBZ 1991), an intriguing approach is to give customers definitions of the five dimensions and ask them to sort the SERVQUAL items into the dimensions only on the basis of each item's content. The proportions of customers "correctly" sorting the items into the five dimensions would reflect the degree to which the dimensions are distinct. The pattern of correct and incorrect classifications also could reveal potentially confusing items and the consequent need to reword the items and/or dimensional definitions.

Relationship Between Service Quality and Customer Satisfaction

The direction of causality between SQ and CS is an important unresolved issue that C&T's article addresses empirically and Teas's article addresses conceptually. Though C&T conclude that SQ leads to CS, and not vice versa, the empirical basis for their conclusion is questionable because of the previously discussed problems with their measures and analysis. Nevertheless, there is a lack of consensus in the literature and among researchers about the causal link between the two constructs. Specifically, the view held by many service quality researchers that CS leads to SQ is at odds with the causal direction implied in models specified by CS researchers. As Teas's discussion suggests, these conflicting perspectives could be due to the global or overall attitude focus in most SQ research in contrast to the transaction-specific focus in most CS research. Adding to the complexity of this issue, practitioners and the popular press often use the terms service quality and customer satisfaction interchangeably. An integrative framework that reflects and reconciles these differing perspectives is sorely needed to divert our attention from merely debating the causal direction between SQ and CS to enhancing our knowledge of how they interrelate.

FIGURE 3
Components of Transaction-Specific Evaluations

[Path diagram: Evaluation of Service Quality (SQ), Evaluation of Product Quality (PQ), and Evaluation of Price (P) each feed into Transaction Satisfaction (TSAT).]

Teas offers a thoughtful suggestion that could be the foundation for such a framework (p. 30):

    One way to integrate these two causal perspectives is to specify two perceived quality concepts-transaction-specific quality and relationship quality-and to specify perceived transaction-specific quality as the transaction-specific performance component of contemporary consumer satisfaction models. This implies that transaction-specific satisfaction is a function of perceived transaction-specific performance quality. Furthermore, ... transaction-specific satisfaction could be argued to be a predictor of perceived long-term relationship quality.

A key notion embedded in Teas's suggestion is that both SQ and CS can be examined meaningfully from both transaction-specific as well as global perspectives, a viewpoint embraced by other researchers as well (e.g., Dabholkar 1993). Building on this notion and incorporating two other potential antecedents of CS-product quality and price-we propose (1) a transaction-specific conceptualization of the constructs' interrelationships and (2) a global framework reflecting an aggregation of customers' evaluations of multiple transactions.

Figure 3 portrays the proposed transaction-specific conceptual model. This model posits a customer's overall satisfaction with a transaction to be a function of his or her assessment of service quality, product quality, and price.2 This conceptualization is consistent with the "quality leads to satisfaction" school of thought that satisfaction researchers often espouse (e.g., Reidenbach and Sandifer-Smallwood 1990; Woodside, Frey, and Daly 1989). The two separate quality-evaluation antecedents capture the fact that virtually all market offerings possess a mix of service and product features and fall along a continuum anchored by "tangible dominant" at one end and "intangible dominant" at the other (Shostack 1977). Therefore, in assessing satisfaction with a personal computer purchase or an airline flight, for example, customers are likely to consider service features (e.g., empathy and knowledge of the computer salesperson, courtesy and responsiveness of the flight crew) and product [...]

2 Though the term "product" theoretically can refer to a good (i.e., tangible product) or a service (i.e., intangible product), here we use it only in the former sense. Thus, "product quality" in Figure 3 and elsewhere refers to tangible-product or good quality.

FIGURE 4
Components of Global Evaluations

[Diagram: evaluations of Transaction 1 through Transaction n aggregate into Global Impressions about Firm, with four facets: Satisfaction, Service Quality, Product Quality, Price.]

[...]customer and firm (e.g., the multiple encounters that a hotel guest could have with hotel staff, facilities, and services). The SERVQUAL instrument, in its present form, is intended to ascertain customers' global perceptions of a firm's service quality. In light of the proposed framework, modifying SERVQUAL to assess transaction-specific service quality is a useful direction for further research. The proposed framework also raises other issues worth exploring. For example, how do customers integrate transaction-specific evaluations in forming overall impressions? Are some transactions weighed more heavily than others because of "primacy" and "recency" type effects or satisfaction/dissatisfaction levels exceeding critical thresholds? Do transaction-specific service quality evaluations have any direct influence on global service quality perceptions, in addition to the posited indirect influence (mediated through transaction-specific satisfaction)? How do the four facets of global impressions relate to one another? For example, are there any halo effects? Research addressing these and related issues is likely to add significantly to our knowledge in this area.

Specification of Service Quality

A primary contribution of Teas's article is that it highlights the potential problem with the P-E specification of SQ when the features on which a service is evaluated are not vector attributes. Though we believe that the collective evidence from Teas's study strongly supports the vector-attribute assumption, the concern he raises has implications for correctly specifying SQ and for further research on service attributes. Teas's EP specification, in spite of its apparent empirical superiority over the other models he tested, suffers from several deficiencies, including its being based on
features (e.g., computer memory and speed, seat comfort
the questionable assumption that all service features are clas-
and meals during the flight) as well as price. Understanding
sic ideal point attributes. A mixed-model specification that
the roles of service quality, product quality, and price eval- assumes some features to be vector attributes and others to
uations in determining transaction-specific satisfaction is an be classic ideal point attributes would be conceptually more
important area for further research. For example, do custom- appropriate (cf. Green and Srinivasan 1978). The flowchart
ers use a compensatory process in combining the three in Figure 2 implies the following mixed-model specifica-
types of evaluations? Do individual factors (e.g., past expe- tion that takes into account service-attribute type as well as
rience) and contextual factors (e.g., purchase/consumption interpretation of the comparison standard E:
occasion) influence the relative importance of the different
evaluations and how they are combined? m
SQ = L [(P.j - E)Dr -(P. - E)D2. + {(T. - E)-(P. - T.) }03.]
In Figure 4, we present our proposed global framework. j=l J j j J l J J j J j

where
It depicts customers' global impressions about a firm stem-
ming from an aggregation of transaction experiences. We D Ii = l if j is a vector attribute, or if it is a classic ideal point
posit global impressions to be multifaceted, consisting of attribute and Pi $Ii;
customers' overall satisfaction with the firm as well as their = 0 otherwise.
overall perceptions of the firm's service quality, product 0 2 . = 1 if j is a classic ideal point attribute and Ei is inter-
quality, and price. This framework, in addition to capturing j preted as the classic ideal point (i.e., Ei =I) and Pi >Ii;
the notion that the SQ and CS constructs can be examined = 0 otherwise.
at both transaction-specific and global levels, is consistent 0 3 . = 1 if j is a classic ideal point attribute and Ei is inter-
with the "satisfaction (with specific transactions) leads to j preted as the feasible ideal point (i.e., Ei < 1i) and Pi >
Ii;
overall quality perceptions'' school of thought embraced by
= 0 otherwise.
service quality researchers (e.g., Bitner 1990; Bolton and
Drew 1991a, b; Carman 1990). The term "transaction" in Operationalizing this specification has implications for
this framework can be used to represent an entire service ep- data collection and analysis in SQ research. Specifically, in
isode (e.g., a visit to a fitness center or barber shop) or dis- addition to obtaining data on customers' perceptions (Pi)
crete components of a lengthy interaction between a cus- and comparison standards (Ii and/or Ei), one should ascer-

122 I Journal of Marketing, January 1994


and comparison standards (Ij and/or Ej), one should ascertain whether customers view each service feature as a vector or classic ideal point attribute. Including a separate battery of questions is one option for obtaining the additional information, though this option will lengthen the SQ survey. An alternate approach is to conduct a separate study (e.g., focus group interviews in which customers discuss whether "more is always better" on each service feature) to preclassify the features into the two attribute categories.

The expression for SQ is implicitly an individual-level specification because a given attribute j can be viewed differently by different customers. In other words, for the same attribute j, the values of D1j, D2j, and D3j can vary across customers. This possibility could pose a challenge in conducting aggregate-level analyses across customers similar to that faced by researchers using conjoint analysis (Green and Srinivasan 1990). An interesting direction for further research is to investigate whether distinct customer segments with homogeneous views about the nature of service attributes exist. Identification of such segments, in addition to offering managerial insights for market segmentation, will reduce the aforementioned analytical challenge in that the expression for SQ can be treated as a segment-level, rather than an individual-level, specification.

Conclusion

The articles by C&T and Teas raise important issues about the specification and measurement of SQ. In this article, we attempt to reexamine and clarify the key issues raised. Though the current approach for assessing SQ can and should be refined, abandoning it altogether in favor of the alternate approaches proffered by C&T and Teas does not seem warranted. The collective conceptual and empirical evidence casts doubt on the alleged severity of the concerns about the current approach and on the claimed superiority of the alternate approaches. Critical issues remain unresolved, and we hope that the research agenda we have articulated will be helpful in advancing our knowledge.
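As a concrete illustration of the mixed-model specification, the following minimal sketch computes SQ for one customer. It is ours for exposition only: the `Attribute` record, the `sq_score` helper, and all ratings are hypothetical, and the branches simply restate the D1j, D2j, and D3j conditions (assuming, as the feasible-ideal interpretation requires, that Ej ≤ Ij for classic ideal point attributes).

```python
from dataclasses import dataclass

@dataclass
class Attribute:
    """One service feature j with the quantities the mixed model requires."""
    P: float         # perceived performance (Pj)
    E: float         # comparison standard / expectation (Ej)
    I: float         # ideal point (Ij); unused when is_vector is True
    is_vector: bool  # True if "more is always better" on this feature

def sq_score(attributes):
    """Sum the per-attribute terms selected by the D1j, D2j, D3j conditions."""
    total = 0.0
    for a in attributes:
        if a.is_vector or a.P <= a.I:
            total += a.P - a.E                  # D1j = 1
        elif a.E == a.I:
            total -= a.P - a.E                  # D2j = 1: Ej is the classic ideal point
        else:
            total += (a.I - a.E) - (a.P - a.I)  # D3j = 1: Ej is a feasible ideal (Ej < Ij)
    return total

# Hypothetical ratings for one customer on four features:
ratings = [
    Attribute(P=6, E=5, I=7, is_vector=True),   # vector attribute: contributes +1
    Attribute(P=6, E=7, I=7, is_vector=False),  # classic ideal, P <= I: contributes -1
    Attribute(P=9, E=7, I=7, is_vector=False),  # classic ideal, E = I, P > I: contributes -2
    Attribute(P=9, E=6, I=7, is_vector=False),  # feasible ideal, E < I, P > I: contributes -1
]
print(sq_score(ratings))  # -3.0
```

If the `is_vector` flags and the classic-versus-feasible interpretation of Ej are treated as constants within a customer segment rather than as individual-level data, the same function yields the segment-level form of the specification discussed above.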

REFERENCES

Anderson, James C. and David W. Gerbing (1982), "Some Methods for Respecifying Measurement Models to Obtain Unidimensional Construct Measurement," Journal of Marketing Research, 19 (November), 453-60.
Babakus, Emin and Gregory W. Boller (1992), "An Empirical Assessment of the SERVQUAL Scale," Journal of Business Research, 24, 253-68.
Bagozzi, Richard P. (1982), "A Field Investigation of Causal Relations Among Cognitions, Affect, Intentions, and Behavior," Journal of Marketing Research, 19 (November), 562-84.
——— and Youjae Yi (1988), "On the Evaluation of Structural Equation Models," Journal of the Academy of Marketing Science, 16 (Spring), 74-79.
——— and ——— (1991), "Multitrait-Multimethod Matrices in Consumer Research," Journal of Consumer Research, 17 (March), 426-39.
Bitner, Mary Jo (1990), "Evaluating Service Encounters: The Effects of Physical Surroundings and Employee Responses," Journal of Marketing, 54 (April), 69-82.
Bolton, Ruth N. and James H. Drew (1991a), "A Longitudinal Analysis of the Impact of Service Changes on Customer Attitudes," Journal of Marketing, 55 (January), 1-9.
——— and ——— (1991b), "A Multistage Model of Customers' Assessments of Service Quality and Value," Journal of Consumer Research, 17 (March), 375-84.
Boulding, William, Ajay Kalra, Richard Staelin, and Valarie A. Zeithaml (1993), "A Dynamic Process Model of Service Quality: From Expectations to Behavioral Intentions," Journal of Marketing Research, 30 (February), 7-27.
Brown, Tom J., Gilbert A. Churchill, Jr., and J. Paul Peter (1993), "Improving the Measurement of Service Quality," Journal of Retailing, 69 (Spring), 127-39.
Cadotte, Ernest R., Robert B. Woodruff, and Roger L. Jenkins (1987), "Expectations and Norms in Models of Consumer Satisfaction," Journal of Marketing Research, 24 (August), 305-14.
Carman, James M. (1990), "Consumer Perceptions of Service Quality: An Assessment of the SERVQUAL Dimensions," Journal of Retailing, 66 (1), 33-55.
Churchill, Gilbert A., Jr. and Carol Surprenant (1982), "An Investigation Into the Determinants of Customer Satisfaction," Journal of Marketing Research, 19 (November), 491-504.
Cronin, J. Joseph, Jr. and Steven A. Taylor (1992), "Measuring Service Quality: A Reexamination and Extension," Journal of Marketing, 56 (July), 55-68.
Dabholkar, Pratibha A. (1993), "Customer Satisfaction and Service Quality: Two Constructs or One?" in Enhancing Knowledge Development in Marketing, Vol. 4, David W. Cravens and Peter R. Dickson, eds. Chicago, IL: American Marketing Association, 10-18.
Dodds, William B., Kent B. Monroe, and Dhruv Grewal (1991), "The Effects of Price, Brand and Store Information on Buyers' Product Evaluations," Journal of Marketing Research, 28 (August), 307-19.
Forbes, J.D., David K. Tse, and Shirley Taylor (1986), "Toward a Model of Consumer Post-Choice Response Behavior," in Advances in Consumer Research, Vol. 13, Richard J. Lutz, ed. Ann Arbor, MI: Association for Consumer Research.
Gerbing, David W. and James C. Anderson (1988), "An Updated Paradigm for Scale Development Incorporating Unidimensionality and Its Assessment," Journal of Marketing Research, 25 (May), 186-92.
Green, Paul E. and V. Srinivasan (1978), "Conjoint Analysis in Consumer Research: Issues and Outlook," Journal of Consumer Research, 5 (September), 103-23.
——— and ——— (1990), "Conjoint Analysis in Marketing: New Developments With Implications for Research and Practice," Journal of Marketing, 54 (October), 3-19.
Green, S.B., R. Lissitz, and S. Mulaik (1977), "Limitations of Coefficient Alpha as an Index of Test Unidimensionality," Educational and Psychological Measurement, 37 (Winter), 827-38.
Gronroos, Christian (1982), Strategic Management and Marketing in the Service Sector. Helsinki, Finland: Swedish School of Economics and Business Administration.
Hattie, John (1985), "Methodology Review: Assessing Unidimensionality," Applied Psychological Measurement, 9 (June), 139-64.
Helson, Harry (1964), Adaptation Level Theory. New York: Harper and Row.



Howell, Roy D. (1987), "Covariance Structure Modeling and Measurement Issues: A Note on 'Interrelations Among a Channel Entity's Power Sources,'" Journal of Marketing Research, 24 (February), 119-26.
Kahneman, Daniel and Dale T. Miller (1986), "Norm Theory: Comparing Reality to Its Alternatives," Psychological Review, 93 (2), 136-53.
Lehtinen, Uolevi and Jarmo R. Lehtinen (1982), "Service Quality: A Study of Quality Dimensions," unpublished working paper. Helsinki, Finland: Service Management Institute.
Mazis, Michael B., Olli T. Ahtola, and R. Eugene Klippel (1975), "A Comparison of Four Multi-Attribute Models in the Prediction of Consumer Attitudes," Journal of Consumer Research, 2 (June), 38-52.
Oliver, Richard L. (1980), "A Cognitive Model of the Antecedents and Consequences of Satisfaction Decisions," Journal of Marketing Research, 17 (November), 460-69.
——— (1985), "An Extended Perspective on Post-Purchase Phenomena: Is Satisfaction a Red Herring?" unpublished paper presented at the 1985 Annual Conference of the Association for Consumer Research, Las Vegas (October).
——— and John E. Swan (1989), "Consumer Perceptions of Interpersonal Equity and Satisfaction in Transactions: A Field Survey Approach," Journal of Marketing, 53 (April), 21-35.
Parasuraman, A., Leonard L. Berry, and Valarie A. Zeithaml (1990), "Guidelines for Conducting Service Quality Research," Marketing Research (December), 34-44.
———, ———, and ——— (1991), "Refinement and Reassessment of the SERVQUAL Scale," Journal of Retailing, 67 (Winter), 420-50.
———, ———, and ——— (1993), "More on Improving Service Quality Measurement," Journal of Retailing, 69 (Spring), 140-47.
———, Valarie A. Zeithaml, and Leonard L. Berry (1985), "A Conceptual Model of Service Quality and Its Implications for Future Research," Journal of Marketing, 49 (Fall), 41-50.
———, ———, and ——— (1988), "SERVQUAL: A Multiple-Item Scale for Measuring Consumer Perceptions of Service Quality," Journal of Retailing, 64 (1), 12-40.
Perreault, William D., Jr. (1992), "The Shifting Paradigm in Marketing Research," Journal of the Academy of Marketing Science, 20 (Fall), 367-75.
Peter, J. Paul, Gilbert A. Churchill, Jr., and Tom J. Brown (1993), "Caution in the Use of Difference Scores in Consumer Research," Journal of Consumer Research, 19 (March), 655-62.
Reidenbach, R. Eric and Beverly Sandifer-Smallwood (1990), "Exploring Perceptions of Hospital Operations by a Modified SERVQUAL Approach," Journal of Health Care Marketing, 10 (December), 47-55.
Sasser, W. Earl, Jr., R. Paul Olsen, and D. Daryl Wyckoff (1978), "Understanding Service Operations," in Management of Service Operations. Boston: Allyn and Bacon.
Shostack, G. Lynn (1977), "Breaking Free from Product Marketing," Journal of Marketing, 41 (April), 73-80.
Swan, John E. and Richard L. Oliver (1989), "Postpurchase Communications by Consumers," Journal of Retailing, 65 (Winter), 516-33.
Teas, R. Kenneth (1993), "Expectations, Performance Evaluation and Consumers' Perceptions of Quality," Journal of Marketing, 57 (October), 18-34.
Tse, David K. and Peter C. Wilton (1988), "Models of Consumer Satisfaction Formation: An Extension," Journal of Marketing Research, 25 (May), 204-12.
Westbrook, Robert A. and Richard L. Oliver (1981), "Developing Better Measures of Consumer Satisfaction: Some Preliminary Results," in Advances in Consumer Research, Vol. 8, Kent B. Monroe, ed. Ann Arbor, MI: Association for Consumer Research, 94-99.
Wilton, Peter and Franco M. Nicosia (1986), "Emerging Paradigms for the Study of Consumer Satisfaction," European Research, 14 (January), 4-11.
Woodruff, Robert B., Ernest R. Cadotte, and Roger L. Jenkins (1983), "Modeling Consumer Satisfaction Processes Using Experienced-Based Norms," Journal of Marketing Research, 20 (August), 296-304.
———, D. Scott Clemons, David W. Schumann, Sarah F. Gardial, and Mary Jane Burns (1991), "The Standards Issue in CS/D Research: A Historical Perspective," Journal of Consumer Satisfaction, Dissatisfaction and Complaining Behavior, 4, 103-9.
Woodside, Arch G., Lisa L. Frey, and Robert Timothy Daly (1989), "Linking Service Quality, Customer Satisfaction, and Behavioral Intention," Journal of Health Care Marketing, 9 (December), 5-17.
Zeithaml, Valarie A., Leonard L. Berry, and A. Parasuraman (1991), "The Nature and Determinants of Customer Expectations of Service," Marketing Science Institute Research Program Series (May), Report No. 91-113.
———, ———, and ——— (1993), "The Nature and Determinants of Customer Expectations of Service," Journal of the Academy of Marketing Science, 21 (1), 1-12.
———, A. Parasuraman, and Leonard L. Berry (1990), Delivering Service Quality: Balancing Customer Perceptions and Expectations. New York: The Free Press.
