
Journal of Social Service Research

ISSN: 0148-8376 (Print) 1540-7314 (Online) Journal homepage: www.tandfonline.com/journals/wssr20

Can Likert Scales be Treated as Interval Scales?—A Simulation Study

Huiping Wu & Shing-On Leung

To cite this article: Huiping Wu & Shing-On Leung (2017) Can Likert Scales be Treated as
Interval Scales?—A Simulation Study, Journal of Social Service Research, 43:4, 527-532, DOI:
10.1080/01488376.2017.1329775

To link to this article: https://doi.org/10.1080/01488376.2017.1329775

Published online: 06 Jun 2017.

JOURNAL OF SOCIAL SERVICE RESEARCH
2017, VOL. 43, NO. 4, 527–532
https://doi.org/10.1080/01488376.2017.1329775

Can Likert Scales be Treated as Interval Scales?—A Simulation Study


Huiping Wu (a) and Shing-On Leung (b)
(a) College of Mathematics and Computer Science, Fujian Normal University, Fujian, China; (b) Faculty of Education, University of Macau, Macau, China

ABSTRACT
The Likert scale is widely used in social work research, and is commonly constructed with four to seven points. It is usually treated as an interval scale, but strictly speaking it is an ordinal scale, where arithmetic operations cannot be conducted. There are pros and cons in using the Likert scale as an interval scale, but the controversy can be handled by increasing the number of points. Several researchers have suggested bringing the number up to eleven, on the basis of empirical data. In this article the authors explore this rationale and share the same view, but simulate artificial data from both symmetrical normal and skewed distributions where the underlying metric is known in advance. Results show that more Likert scale points will result in a closer approach to the underlying distribution, and hence to normality and interval scales. To increase generalizability, social work practitioners are encouraged to use 11-point Likert scales from 0 to 10, a natural and easily comprehensible range.

KEYWORDS
Likert scale; ordinal scale; interval scale; normality

Introduction

The Likert (1932) scale is widely used in social service practice, e.g. in assessing the satisfaction of young or elderly people with services provided by frontline social workers. Tsui (1997) critically reviewed many empirical studies in the supervision of social workers, and found that most of them used a Likert scale in mailed self-administered questionnaires. A Likert scale is basically an ordinal scale measure, and there is a very long-standing and controversial question about whether it supports arithmetic operations such as addition, subtraction, multiplication and division. For instance, a four-point Likert scale with categories labeled “strongly agree”, “agree”, “disagree” and “strongly disagree” would be assigned conventional values from 1 to 4, which are then automatically treated as numbers. Strictly speaking, this violates the basic assumptions of an ordinal level measure (Jamieson, 2004). Beyond this, the next question is whether normal distribution assumptions hold, as this is the basis of many statistical tests such as t-tests, ANOVA etc.

On the other hand, there are recommendations to increase the number of Likert scale points to make it closer to continuous scales and normality. Traditionally, the number of points in a Likert scale can be as few as three or four. If it can be increased to eleven, a common metric that ranges from 0 to 10, as recommended by Hodge and Gillespie (2007) and Leung (2011), it can be treated as a continuous measure and hence arithmetic operations can be used. In this paper, we investigate via simulations whether an increased number of Likert scale points can lead to a better approximation of an interval scale and normality. The issue starts with four levels of measurement.

Stevens (1946) proposed four levels of measurement scales: nominal, ordinal, interval and ratio. The nominal scale is only considered to be a label, with nothing basically related to numerical values. The ordinal scale is order-preserving, only indicating rank and order. Strictly speaking, the Likert scale is an ordinal scale and does not support arithmetic operations. When intervals between successive values of the ordinal scale are equally spaced, an interval scale is produced. However, the zero point on an interval scale is a matter of convenience. If there is a meaningful zero, it is then a ratio scale. Theoretically, only interval and ratio scales are considered to be continuous, where arithmetic operations can be conducted, while

CONTACT Shing-On Leung soleung@umac.mo Faculty of Education, University of Macau, Taipa, Macau, China.
© 2017 Taylor & Francis Group, LLC

nominal and ordinal scales are considered to be categorical data, where arithmetic operations should not be conducted. There are many studies dealing with the disadvantages of treating ordinal as interval scales. Jamieson (2004) reviewed ways of using and abusing Likert scales, and stated that it is a common practice, but controversial, to treat a Likert scale as an interval scale. Computing means and standard deviations for Likert scale data is considered to be inappropriate. Instead, nonparametric statistics should be used. Kuzon, Urbanchek, and McCabe (1996) maintained that one of the seven deadly sins of statistical analysis is using parametric analysis for ordinal scales. There are thus arguments against the use of Likert scales as continuous measures.

On the other hand, there are arguments in favor of considering Likert scales as continuous interval scales. Historically, Stevens (1946) did agree that treating ordinal as interval scales resulted in many fruitful and meaningful findings, and this was re-stated in Knapp (1990); Lord (1953) maintained that what counted was that a statistical analysis should be meaningful. There have been numerous studies illustrating what ‘fruitful’ and ‘meaningful’ mean. One such example was undertaken by Grolnick, Benjet, Kurowski, and Apostoleris (1997), measuring parents’ involvement in children’s schooling. Another was the famous Rosenberg Self-Esteem Scale (Rosenberg, 1965), which is still widely used to measure global self-esteem. In fact, Harwell and Gatti (2001) conducted a study of three journals (American Educational Research Journal, Sociology of Education and Journal of Educational Psychology), and concluded that “educational researchers regularly employ ordinal-scaled dependent variables in analyses typically described as requiring to be interval scales”. The dilemma is then that, even though the Likert scale violates basic statistical assumptions, many studies find it useful. However, the issue may be a matter of degree rather than requiring a binary yes-or-no answer, as will be explained below.

The term ‘imperfect interval scale’ was first introduced by Borgatta and Bohrnstedt (1980), who suggested that at the manifest and observed level most measurements probably lie between perfect ordinal and perfect interval scales. Hence, the question can be re-formulated as: “How far can the ordinal be from the interval scale?” instead of “Can an ordinal be treated as an interval scale?”. One way to narrow that distance is to increase the number of points in the Likert scale.

An early conjecture was made by Knapp (1990), who stated that increasing the number of points tends to “continuise” ordinal towards interval scales. Hodge and Gillespie (2007) suggested the Phrase Completion Scale (PCS), which is essentially an 11-point Likert scale, with ends labeled as zero and 10, referring respectively to the absence and maximum amount of a construct. Leung and Xu (2013) have used this to recommend single-item scales for subjective academic performance, self-esteem and socio-economic status. Hodge and Gillespie (2007) compared PCS empirically with a five-point Likert response key by a variety of indicators: written comments, Cronbach’s alphas, inter-item correlations, factor scores and SEM coefficients, and found that PCS had higher validity and reliability. Leung (2011) also compared the psychometric properties of four-, five-, six- and 11-point Likert scales empirically, and found that there were no differences in means and standard deviations, but more points would result in less skewness and kurtosis and a closer approach to normality and hence interval scales. In this way, both studies used empirical data to show that more points would produce better results. But the problem with empirical data is that the underlying continuum or models may be debatable. An alternative is to use simulation.

Simulations have the advantage that, though artificial, the underlying continuum or models are known in advance, and can be manipulated by researchers. In this paper, we simulate artificial data from known symmetrical normal distributions, and also skewed gamma distributions, so that the underlying continuum is either continuously symmetric or skewed. The purpose of this paper is therefore to investigate whether increasing the number of Likert scale points will produce a result closer to the underlying metric, and hence interval scales. Further, since we know the underlying continuum, we can analyze whether simulated data aligns with the true continuum if the number of points increases. Details of the method are discussed in the next section.

Methodology

Simulation Design

The authors use simulation to explore the effects of the number of Likert scale points on the degree of departure from the underlying distribution. Data is generated from known underlying distributions and

hence there is a known standard to cross-check the methods employed. To explore this concept, a symmetric normal and a very skewed gamma-type distribution were used for comparison. In order to form categories from an underlying metric, a discretization of the continuous scale was applied, as follows.

Discretization

After choosing the underlying distribution, the next step is to discretize a continuous distribution, as Likert scale data is discrete. Two obvious ways are equal probability and equal interval width. For equal probability discretization, we divide the whole range into k categories with equal probability to produce k − 1 thresholds, i.e. the area percentages in each category are 1/k. Here, k represents the number of Likert scale points and takes the values 4, 5, 6, 7 and 11, with 11 representing an easily comprehensible range from 0 to 10. These five cover the numbers of Likert scale points used in most practical situations.

For equal interval discretization, we need to truncate if the end points are either −∞ or +∞. The normal range is −∞ to +∞. We have used −5 and +5 as the lower and upper limits, which will result in two negligible tails with a sum of probabilities of 0.0000317. The range (−5, 5) is then divided into k intervals of equal width. The range of a gamma distribution is (0, +∞), and the upper limit is defined as the point where an equal tail probability of 0.0000317 is neglected, making normal and gamma comparable.

Apart from equal probability and width discretization, we add two more supplementary choices. The first is a symmetric bell shape but with unequal probabilities. We start with k = 4 with probabilities in four categories: 0.187, 0.313, 0.313 and 0.187. For k > 4 categories, we keep the symmetric properties and at the same time keep the standard deviation unchanged. For this symmetric discretization, we use only symmetric normal as skewed gamma is not appropriate here. The second is a skewed discretization. We start with k = 4 and with probabilities in four categories: 0.30, 0.40, 0.25, and 0.05. For k > 4, we keep to the skewness as closely as possible, but there are practical difficulties in keeping it exactly the same. For this skewed discretization we use only the gamma, as symmetric normal is not suitable here. To summarize, we have the following six simulation conditions.

1. Equal probability discretization and normal underlying distribution
2. Equal probability discretization and gamma underlying distribution
3. Equal interval discretization and normal underlying distribution
4. Equal interval discretization and gamma underlying distribution
5. Symmetric discretization and normal underlying distribution
6. Skewed discretization and gamma underlying distribution

Simulation Procedures and Scores

We simulate N = 10,000 data points under each of the six conditions above, and record the number of cases falling into k categories, i.e., N1, N2, ..., Nk such that N1 + N2 + ... + Nk = N. We define two scores, raw and true. The raw scores take the values 0, 1, 2, ..., k − 1 and are assigned in conventional order. We start at 0 so that an 11-point scale corresponds to the popularly recognized scale of 0 to 10. The raw score is used by most social work practitioners. True scores are defined as the mid-points of the category boundaries, and are known as ‘true scores’ because they are calculated from the underlying distribution by means of which data is simulated.

Statistical Procedures

Kolmogorov-Smirnov (KS) statistics are employed to detect departures from the underlying distribution. The smaller the KS statistical value, the closer to the underlying distribution, and vice versa. Apart from the KS test, correlations are calculated under each of the six conditions to investigate the departure of raw from true scores.

Results

Table 1 reports the KS statistics for raw and true scores under six conditions for different values of k. In Table 1, the KS statistics generally decrease with k. Exceptions are found in the true scores in the equal interval discretization case, regardless of whether the underlying distribution is normal or gamma.

In this equal interval case the true scores fully represent the true distribution, whatever number of Likert points is used, since there is no loss of information even

Table 1. The KS statistics for the raw and true scores under 6 conditions for different values of k.

     Equal probability           Equal interval              Symmetric     Skewed
     Normal        Gamma         Normal        Gamma         Normal        Gamma
k    Raw    True   Raw    True   Raw    True   Raw    True   Raw    True   Raw    True
4    0.176  0.188  0.932  0.234  0.346  0.004  0.932  0.010  0.191  0.216  0.932  0.302
5    0.162  0.188  0.932  0.186  0.318  0.004  0.932  0.010  0.148  0.194  0.932  0.445
6    0.143  0.182  0.932  0.157  0.260  0.004  0.932  0.010  0.138  0.162  0.932  0.268
7    0.127  0.174  0.932  0.133  0.234  0.004  0.932  0.010  0.120  0.154  0.932  0.257
11   0.103  0.134  0.843  0.084  0.150  0.004  0.874  0.010  0.121  0.122  0.829  0.192
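As a rough illustration of how a raw-score KS value of the kind reported in Table 1 might be obtained, the following Python sketch covers only the first condition (equal probability discretization, normal underlying distribution). The function names are hypothetical, and the step that standardizes the raw scores before comparing them with N(0, 1) is an assumption of this sketch: the text does not state how scores were placed on the scale of the underlying distribution. Under that assumption the k = 4 value comes out at roughly 0.17 and the k = 11 value at roughly 0.10.

```python
import random
from bisect import bisect_right
from collections import Counter
from statistics import NormalDist, fmean, pstdev

STD_NORMAL = NormalDist()  # the known underlying N(0, 1) metric


def equal_probability_thresholds(k):
    """Return the k-1 cut points that give each category probability 1/k."""
    return [STD_NORMAL.inv_cdf(i / k) for i in range(1, k)]


def raw_scores(latent, thresholds):
    """Map each latent value to its category index 0, 1, ..., k-1."""
    return [bisect_right(thresholds, x) for x in latent]


def ks_distance_to_normal(scores):
    """One-sample KS distance between standardized scores and N(0, 1).

    Standardizing first is an assumption of this sketch, not a step
    documented in the paper.
    """
    n = len(scores)
    mean, sd = fmean(scores), pstdev(scores)
    counts = Counter(scores)
    distance, below = 0.0, 0
    for value in sorted(counts):
        model_cdf = STD_NORMAL.cdf((value - mean) / sd)
        upto = below + counts[value]
        # Compare the model CDF with the empirical CDF just before
        # and just after the jump at this score.
        distance = max(distance,
                       abs(model_cdf - below / n),
                       abs(upto / n - model_cdf))
        below = upto
    return distance


def simulate_raw_ks(k, n=10_000, seed=0):
    """Simulate n latent normal values, discretize them, return the KS distance."""
    rng = random.Random(seed)
    latent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scores = raw_scores(latent, equal_probability_thresholds(k))
    return ks_distance_to_normal(scores)


if __name__ == "__main__":
    for k in (4, 5, 6, 7, 11):
        print(k, round(simulate_raw_ks(k), 3))
```

The sketch uses only the standard library so that it can be run without extra packages; a gamma condition would need its own quantile function, which is not attempted here.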

when k is as small as 4. All KS values are very small and the same for different values of k. Another very minor exception is in the case of skewed discretization with gamma as the underlying distribution. The KS values are slightly smaller with k = 4 than k = 5, but in general KS values decrease with k. This is perhaps due to the limitation in controlling the skewness across different values of k. Apart from these two minor exceptions, all the KS values decrease with k. Hence, increasing k, and thus the number of Likert scale points, will bring the distribution closer to the underlying value. Under the skewed gamma distribution, the KS statistics of the raw scores are always very high, with a minimum of 0.83, whereas all other KS values in Table 1 are no greater than 0.45, indicating that the raw score cannot represent the underlying distribution when it is very skewed. This is because raw scores assume an equal width interval between categories, but if the underlying distribution is very skewed this assumption is invalid.

The general conclusion in Table 1 is that the raw scores cannot represent the underlying distribution when it is very skewed, and that an increase in the number of Likert scale points makes the distribution closer to the underlying value.

Table 2 reports the correlations between raw and true scores under six conditions for different values of k.

Table 2. Correlations between the raw and true scores under 6 conditions for different values of k.

     Equal probability     Equal interval        Symmetric  Skewed
k    Normal     Gamma      Normal     Gamma      Normal     Gamma
4    0.979      0.800      0.855      0.767      0.978      0.783
5    0.961      0.752      0.894      0.831      0.965      0.830
6    0.948      0.720      0.926      0.875      0.959      0.787
7    0.937      0.695      0.943      0.902      0.961      0.781
11   0.919      0.644      0.976      0.954      0.980      0.875

In Table 2, most correlations are very high. When the underlying distribution is normal, the correlations between raw and true scores are all higher than 0.85, indicating that raw scores are more suitable when the underlying distribution is symmetric normal rather than skewed gamma. It is noted that under equal probability discretization and with the underlying distribution being normal, correlations decrease slightly from 0.979 when k = 4 to 0.919 when k = 11. When k is as small as 4, the scale is relatively robust and the raw score is closer to the true score and the correlation higher. When k is as high as 11, the loss of information is lower, reflecting the reality that the raw scores deviate from the true, and that the correlations are therefore lower.

Under equal probability discretization, the same trend is observed under gamma distribution. The correlation decreases from 0.800 when k = 4 to 0.644 when k = 11. The same reasoning applies here. However, there is an additional factor: under skewed gamma distribution the equal-width raw scores deviate much further from the true scores, as the situation is more serious here.

Interestingly, when the underlying distribution is normal, the trend of correlations is reversed under equal interval discretization. The correlations increase from 0.855 when k = 4 to 0.976 when k = 11, because with equal interval discretization the raw scores align with the true, and with k increasing the loss of information will be smaller so that in this case the raw scores will be closer to the true.

When the underlying distribution is gamma, the correlations are in general lower than when it is normal. The correlations are highest under equal-interval discretization, moderate under skewed discretization and lowest under equal-probability discretization. This is because under equal-interval conditions the raw scores tend to be aligned with the true, but will deviate from them when it is a case of equal probability. The skewed discretization lies somewhere between equal-probability and equal-interval discretization.
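The raw-versus-true correlations for the equal probability, normal condition can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, and because the two outer categories are open-ended, the sketch assumes their outer boundaries are truncated at −5 and +5 (the limits the paper uses for equal interval discretization) before taking mid-points. Under that assumption the correlation comes out at roughly 0.98 for k = 4 and roughly 0.92 for k = 11, which is in line with the decreasing trend described above.

```python
import random
from bisect import bisect_right
from math import sqrt
from statistics import NormalDist, fmean

# Assumed truncation points for the two open-ended outer categories.
LOWER, UPPER = -5.0, 5.0


def cuts_and_midpoints(k):
    """Equal-probability cut points of N(0, 1) and the category mid-points."""
    cuts = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    edges = [LOWER] + cuts + [UPPER]
    midpoints = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]
    return cuts, midpoints


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def raw_true_correlation(k, n=10_000, seed=0):
    """Correlate raw scores 0..k-1 with mid-point 'true' scores."""
    rng = random.Random(seed)
    cuts, midpoints = cuts_and_midpoints(k)
    # Raw score = index of the category each simulated latent value falls in.
    raw = [bisect_right(cuts, rng.gauss(0.0, 1.0)) for _ in range(n)]
    true = [midpoints[r] for r in raw]
    return pearson(raw, true)


if __name__ == "__main__":
    for k in (4, 5, 6, 7, 11):
        print(k, round(raw_true_correlation(k), 3))
```

With equal-probability cuts the mid-points of the outer categories sit far from the inner ones, so the true scores become increasingly non-linear in the raw scores as k grows; that is one way to read the falling correlations in this column.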

Further, when the underlying distribution is skewed gamma and with equal-interval discretization, the correlations increase with k, from 0.767 when k = 4 to 0.954 when k = 11, since when k increases the raw scores will be closer to the true scores.

When the discretization is symmetric or skewed, there is no clear trend of correlations with k. However, they are highest when k = 11, with correlations of 0.980 and 0.875, respectively, for symmetric and skewed discretization. We conjecture that the correlations would be higher if there were more Likert scale points, but this is subject to further research. Otherwise, it is quite clear that correlations between raw and true scores are very high (from 0.959 to 0.980) under symmetric discretization compared with skewed discretization (with correlations ranging from 0.781 to 0.875). In this way, raw scores are always aligned more closely with true scores when the underlying distribution is symmetric.

Discussion and Conclusions

Whether the Likert scale can be considered an interval-scale continuous measurement is the subject of long-standing debate; references and discussion can be found in the Introduction. Hodge and Gillespie (2007) and Leung (2011) suggested increasing the number of points to 11 with empirical support, but the underlying metric of empirical data is always debatable. This paper complements the above findings via simulation, which has the advantage that the right model is known in advance. We simulate data from six conditions covering equal-probability, equal-interval, symmetric, and skewed discretizations. Symmetric normal and skewed gamma distributions are chosen to be the underlying distributions. The results show that KS statistics generally decrease with the number of points, indicating a move closer to the underlying metric. The raw scores are commonly used by most social work practitioners, and are closer to the underlying metric when the underlying distribution is symmetric. Overall, the important conclusion is that increasing the number of Likert scale points will bring the scale closer to the underlying distributions with lower values of KS statistics. This is in line with the view of Hodge and Gillespie (2007) and Leung (2011) that increasing the number of points will bring the scales closer to the continuous. Further, there is also theoretical support for the view that more information will be transmitted with more points and hence more categories, and the results will measure more precisely (Alwin, 1997). These results encourage social work practitioners to use Likert scales with more points.

The correlations between raw and true scores are high when the underlying distribution is symmetric normal, but low when it is skewed. The important implication is that equally spaced raw scores align with true values when the underlying distribution is symmetric, but deviate from them when the distribution is skewed, making the usual raw scores more suitable when the underlying distribution is symmetric, or closer to symmetry.

As for the effects of the number of Likert scale points on correlation trends, the directions of equal-probability and equal-interval discretization are the reverse of each other. The correlation decreases with k in the former case but increases in the latter. Increasing the number of Likert scale points increases information, while reducing it loses information. With equal-probability discretization, more Likert scale points make it closer to the true scale, the raw scores deviate further from the true and the correlations are consequently lower. On the other hand, with equal-interval discretization, the raw scores are closer to the true, and more Likert scale points will make the scale closer to the one represented by the raw scores. The trend is not so clear with symmetric and skewed discretization, and this leaves room for further research, which is also needed into the methods of discretization. This study uses equal-probability, equal-interval, symmetric and skewed discretization. We believe that such methods form a good starting point, but that there may be other ways to discretize underlying distributions, and hence supplement the results of the present paper.

For frontline social work practitioners, raw scores are obviously the easiest and most convenient to use. From this simulation study, we suggest increasing the number of points to 11, making it closer to normality and interval scales. Further, there is the added advantage of an easily comprehensible range from 0 to 10.

References

Alwin, D. (1997). Feeling thermometers versus 7-point scales. Sociological Methods and Research, 25(3), 318–340.
Borgatta, E. F., & Bohrnstedt, G. W. (1980). Level of Measurement—Once Over Again. Sociological Methods & Research, 9(2), 147–160.

Grolnick, W. S., Benjet, C., Kurowski, C. O., & Apostoleris, N. H. (1997). Predictors of parent involvement in children’s schooling. Journal of Educational Psychology, 89, 538–548.
Harwell, M. R., & Gatti, G. G. (2001). Rescaling ordinal data to interval data in educational research. Review of Educational Research, 71, 105–131.
Hodge, D. R., & Gillespie, D. F. (2007). Phrase completion scales: A better measurement approach than Likert Scales? Journal of Social Service Research, 33(4), 1–12.
Jamieson, S. (2004). Likert scales: How to (ab)use them. Medical Education, 38, 1212–1218.
Knapp, T. R. (1990). Treating ordinal scales as interval scales: An attempt to resolve the controversy. Nursing Research, 39, 121–123.
Kuzon, W. M., Urbanchek, M. G., & McCabe, S. (1996). The seven deadly sins of statistical analysis. Annals of Plastic Surgery, 37, 265–272.
Leung, S. O. (2011). A comparison of psychometric properties and normality in 4-, 5-, 6-, and 11-point Likert scales. Journal of Social Service Research, 37, 412–421.
Leung, S. O., & Xu, M. L. (2013). Single-item measures for subjective academic performance, self-esteem and socioeconomic status. Journal of Social Service Research, 39(4), 511–520.
Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 5–53.
Lord, F. M. (1953). On the statistical treatment of football numbers. American Psychologist, 8, 750–751.
Rosenberg, M. (1965). Society and the adolescent self-image. Princeton, NJ: Princeton University Press.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
Tsui, M. S. (1997). Empirical research on social work supervision: The state of the art (1970–1995). Journal of Social Service Research, 23(2), 39–54.
