
Published in the Journal of the Market Research Society (1995) 37(2), 195-202.

HANDEDNESS BIAS IN PREFERENCE RATING SCALES

Michael Kirk-Smith, University of Ulster at Jordanstown.

Abstract

Measurements involving responses on left-right scales are common in Market Research practice. This
paper investigates natural right and left-handed tendencies in ordering stimuli, and discusses the issues
they raise in the use of preference and intensity ratings using such scales.

In a first experiment, a series of 6 studies identified that in a preference rating task participants' natural
tendency was to place preferred objects on the left in a "bench scaling" task. Intensity measurement
showed no such tendency. However, when participants were asked to rate objects for the presence or
absence of an intensity-related attribute the left-handed tendency did occur. The extent of this may be due
to a preference for relative absence or presence.

A second experiment investigated how this left-handed tendency affects the use of scales involving left-
right responses. Four products were rated on a line scale for intention to purchase, with the "buy" label
on the right. Despite a visual example and verbal and written instructions, there were cases where
scaling was reversed, so that ratings were the opposite of what participants intended. This depended on
the experimenter, indicating the importance of context on error rate.

These experiments suggest that data from left-right scale methods should be checked for reversal errors,
perhaps especially if the preferred term is placed on the right. The results also illustrate that intensity
ratings may be unpredictably "contaminated" by preference factors. Theoretical arguments for the use of
behaviourally based preference tests such as "bench scaling" over more cognitively-oriented tasks
employed in current preference tests are also presented.

INTRODUCTION

Measurements involving responses placed on a left-right scale are common in Market Research practice,
e.g., Likert, semantic differential and line scales. Any "handedness" preference might influence or bias
the use of these scales. Morgan (1944) first showed that people have a tendency to attribute greater
importance to the object on the extreme left in a horizontal display. The Ziller test (Ziller, Hagey and
Smith 1965) uses this finding explicitly to find people's status hierarchy. People naturally place their most
esteemed person on the left-hand side when presented with an array of acquaintances. DeSoto, London
and Handel (1965) found that serial ordering proceeds more readily from left to right than from right to
left. Also, the word "better" is used much more often than the word "worse" in inter-relating elements
(Thorndike and Lorge 1944).

Taken together, these results imply that, in a serial ordering task, people should have a natural tendency
to put the first, or preferred, mark on the left, i.e., the "A is better than B" relation is literally ordered as
the letter position. The effect may, in part, be due to the habit of reading from left to right. Thus,
people's use of left-right scales for preference rating tasks may be highly dependent on the labelling of the
poles. One practical consequence is the possibility of reversed ratings due to any left-right tendency. Such
errors would introduce considerably more 'noise' into the data than simple inaccurate use of the scales.

Zajonc and Markus (1982) propose that the recruitment of participants' behaviour in the ordering or rating
of stimuli may provide a better test of preference than preference rating methods involving thinking (i.e.,
cognition), because preference essentially involves approach-avoidance behaviour, i.e., movement
towards or away from an object because it is perceived as being attractive or threatening, respectively. A
preference judgement may be considered qualitatively different from an intensity judgement. In the
former, especially in low-involvement decisions (i.e., low cost, low risk, routine decisions) the evaluative
response may be immediate, with low cognitive involvement; in the latter, evaluation is not involved, but
there is a computation required in ranking the stimuli. A preference judgement implies approach
behaviour, whereas a computation does not. The consequence is that a purely psychophysical intensity-
rating task should not show such handedness.

The aims of the two experiments presented here were to investigate any natural left-right tendencies in
preference and intensity ratings and how this may bias the use of left-right scales. The first experiment
employed "bench scaling", in which participants handle the stimuli and place them in their preferred order
or in the order of intensity. In the second experiment, participants first chose a preferred stimulus, which
was then rated on a line scale with the other stimuli. The initial choice allowed checking for errors in the
use of line scales.

METHOD

EXPERIMENT 1

Sample: There were six studies of 24 participants each. Participants were women aged between 24 and
65, and chosen to reflect the typical consumer of the stimuli tested. Left-handed participants were
excluded from analysis, though their data showed the same tendency as the right-handed participants.

Stimuli: Three 8 oz. powder jars filled with a green, yellow or pink pearlescent shampoo ("Colour") were
used as stimuli for preference ratings. Three 8 oz. powder jars filled with water coloured with three
intensities of a blue dye ("Dye"), originally used for calibration in a Magnitude Estimation task, were used
as stimuli for intensity ratings.

Design: Each study tested a different combination of stimulus and instruction, with each combination
being chosen on the basis of the results of the preceding study. The stimuli (a), instructions (b) and the
rationale (c) for each study were as follows:

STUDY 1
(a) Colour.
(b) "Please arrange these in order of preference".
(c) What is the ordering tendency in a preference judgement?

STUDY 2
(a) Dye
(b) "Please arrange these in order of intensity".
(c) What is the ordering tendency in an intensity judgement?

STUDY 3
(a) Dye
(b) "Please arrange these in order of preference".
(c) Study 1 showed a LHS bias, study 2 showed none. Is the difference due to the
preference (i.e., affective) task or the stimuli (i.e., the cognitive task of assessing intensity)?

STUDY 4
(a) Colour
(b) "Please arrange these in order of the likelihood that you would buy them".
(c) An intention to purchase may be a mixture of preference and a cognitively assessed trade-off
(i.e., how much money would I give up?). Will there be a LHS bias?

STUDY 5
(a) Dye
(b) "Please arrange these in order of darkness".
(c) Is any ordering bias in intensity dependent on the labels used?

STUDY 6
(a) Dye
(b) "Please arrange these in order of lightness".
(c) As for Study 5.

Procedure: In each test the experimenter placed the stimuli on the bench while standing on the right-hand
side of half of the seated participants and on the left-hand side of the other half, to avoid position bias. The
three stimuli were placed front-to-back in the middle of the bench before the participants. The order of the
stimuli was randomised between participants. The participants were then asked to arrange the stimuli
across the bench on the basis of preference, purchase intention or intensity. The hand that the participant
used to pick up the stimuli was taken as their handedness.

EXPERIMENT 2

Sample: Forty-eight female participants were tested by a female experimenter. They covered a cross
section of UK consumers in both age and socio-economic grouping. There was no reason to expect that
any segment of the population is any different in left-right tendency than any other. A second similar
study was conducted with different participants and a different experimenter.

Stimuli: Stimuli were four 8 oz. powder jars filled with white pearlescent shampoo. They were shaken
before each participant was tested to give slight variations in pattern. They were labelled R, F, M and J;
letters equi-distant in the alphabet and presented in semi-random order. In the second study, the letters of
stimuli with the highest scores were exchanged with those of the lowest scores in case there was a letter
preference.

Measures: Participants' ratings were measured in two stages. They were initially asked to write their "first
choice" on a scoring sheet (i.e., R, F, M or J). This was assumed to be their "real" first choice. They then
rated the four stimuli on a 10 cm graduated line scale with end labels "I would never buy" (left-hand side)
and "I would always buy" (right-hand side). Participants were instructed to "Mark the appropriate
position on the line and write the product code above it". This allowed the stimuli to be rated closely
together. Participants were also given, in addition to verbal and written instructions, an example of a
completed scale to indicate correct left-right use.

Procedure: Participants were tested individually at a desk placed against the opposite wall to the
entrance door of the experimental room. This was to prevent recency effects, i.e., a bias to or from the
stimulus first seen. The stimuli were presented 17" apart on the far side of the desk, against the wall. Also
on the desk was the scoring sheet and pencil. The position of the participant's chair was kept central and at
a constant distance from the desk. To keep a constant light intensity, the window blinds were drawn and
the desk was arranged so that a fluorescent light from behind the participant evenly illuminated it. After
the participant had read the scoring sheet, the experimenter read the instructions from behind the
participant, to avoid uneven illumination, and showed her an example of how to use the scales. If a
participant wrote with her left hand, her data were excluded and the stimulus sequence was repeated with
a right-handed participant.

RESULTS

EXPERIMENT 1

The results are presented in Table 1. These show how many participants out of the 24 in each study ordered
the three colours of shampoo or the three intensities of dye from left to right (LHS) or from right to left
(RHS). The main findings were a strong left-hand preference in the ordering of shampoo preference, and a
lack of handedness in the intensity ratings of the dyes.

TABLE 1: Results for Preference and Intensity Bench Ratings.

Study   Stimulus   Instruction   LHS    RHS   p

1       Colour     Preference    23     1     *
2       Dye        Intensity     11¹    13    n/s
3       Dye        Preference    24     0     *
4       Colour     Purchase      21     3     *
5       Dye        Darkness      15¹    9     n/s
6       Dye        Lightness     20²    4     *

¹ Position of darkest.
² Position of lightest.
* Significant at p < 0.001 by the Binomial Test.
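
As an illustration of the Binomial Test cited under Table 1, the following Python sketch recomputes the probability of each study's left/right split under the null hypothesis that both ordering directions are equally likely (p = 0.5). The function and variable names are hypothetical. A one-tailed test in the direction of the majority is assumed, since the tail used is not stated. Under that assumption the sketch reproduces the pattern of asterisks in the table.

```python
from math import comb

def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """One-tailed probability of observing k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# LHS counts out of 24 participants per study, taken from Table 1.
N = 24
lhs_counts = {1: 23, 2: 11, 3: 24, 4: 21, 5: 15, 6: 20}

def one_tailed_p(lhs: int, n: int = N) -> float:
    # Assumption: test in the direction of whichever side has the majority.
    majority = max(lhs, n - lhs)
    return binomial_tail(majority, n)

results = {study: one_tailed_p(lhs) for study, lhs in lhs_counts.items()}

for study, p_value in results.items():
    flag = "*" if p_value < 0.001 else "n/s"
    print(f"Study {study}: LHS={lhs_counts[study]:2d} "
          f"RHS={N - lhs_counts[study]:2d} p={p_value:.6f} {flag}")
```

With a two-tailed test the Study 6 probability doubles to about 0.0015, so that result is sensitive to the assumed tail.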

EXPERIMENT 2

Three participants (6%) reversed their scores, i.e., rated their first choice as the least likely to be bought,
and four (7%) did not rate their first choice as the most likely to be bought. These reversal and ordering
discrepancies were not observed in the second study.

DISCUSSION

The Scaling of Preference

As hypothesised, the results of studies 1, 3 and 4 show that participants have a natural tendency in a
preference or purchase intention rating task to order preferred stimuli from left to right, whether the
stimuli are qualitatively different or differ in the intensity of one attribute. Although not significantly
different here, the slight reduction in LHS bias in the purchase rating may reflect the greater
cognitive component of this task.

The Scaling of Intensity

It was originally hypothesised that the difference in results between studies 1 and 2 was due to the
involvement of affect (i.e., feeling or emotion) in the preference task and the cognitive (i.e., thinking)
involvement in the intensity task. The propositional relation "A is better than B" gives a clue to the lack
of handedness in the intensity judgement of study 2. To judge intensity, participants can judge either the
presence or absence of the stimuli, and whatever is chosen is then ordered from left to right. The reason
one is chosen over the other may be linked to a participant's preference for presence or absence of the
attribute. Thus, even if the task is to judge intensity it is possible that preference has an influence on the
rating task.

Studies 5 and 6 examined this hypothesis. If the participant is cued to order or judge "lightness" or
"darkness" rather than "intensity", the left to right tendency should appear through a preference for one
label. This is indeed the case, though the results are not as clear-cut as the strong left-right tendency in the
preference studies. We may infer that "light" may be preferred to "dark" since participants more
consistently place it on the left. This result indicates how a psychophysical measurement (in this case
"intensity") may be unavoidably "contaminated" by motivational factors, such as preference, and sets a
limit on the accuracy of any apparently "pure" psychophysical measurement.

Rating errors

The first experiment indicates that in a scaling procedure involving the serial ordering of stimuli there is a
strong tendency to place a preferred object on the left-hand side. The second experiment shows that
reversal and ordering discrepancies were evident in the first study of scaling, despite participants being
given a visual example and written and verbal instructions. These errors may, in part, be due to the
influence of this left-hand tendency.

The combination of visual example and verbal and written instructions may also have confused
participants. Why were there no reversal mistakes in the second study? A subjective impression is that the
experimenter in the second study had a more forceful personality than that of the first study, and this could
have caused participants to pay more attention to the position of the labels. The possibility of
experimenter effects in creating rating errors emphasises the importance of keeping tightly to the
experimental protocols and scripts to reduce statistical error and thus increase discriminatory power in
rating tasks. These results suggest that terms might best be presented from positive to negative in
left-right scales, especially if there is evidence of participants not paying attention to instructions. These
results also suggest that data from left-right rating scales should be routinely scanned for errors of reversal
and ordering, perhaps using a check method similar to that used in this experiment, e.g., by having
participants specify separately which product is their most preferred.
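
The check described above can be sketched in Python as follows. This is a minimal illustration, not the procedure used in the study: the class, function and field names are hypothetical, the example data are invented, and it assumes that marks are recorded in cm from the left-hand ("never buy") end of the line. A "reversal" is flagged when the stated first choice receives the lowest rating, and an "ordering" discrepancy when it is not rated highest.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class Response:
    participant: str
    first_choice: str          # product code written down before rating
    ratings: Dict[str, float]  # product code -> mark position in cm (0 = "never buy")

def check_response(r: Response) -> str:
    """Classify a response as 'consistent', 'ordering' or 'reversal'."""
    own = r.ratings[r.first_choice]
    others = [v for code, v in r.ratings.items() if code != r.first_choice]
    if own < min(others):
        return "reversal"      # first choice rated least likely to be bought
    if own < max(others):
        return "ordering"      # first choice not rated most likely to be bought
    return "consistent"

# Hypothetical data for illustration only (not taken from the study).
responses = [
    Response("P01", "R", {"R": 8.5, "F": 4.0, "M": 6.0, "J": 2.5}),
    Response("P02", "M", {"R": 7.0, "F": 5.5, "M": 1.0, "J": 9.0}),
    Response("P03", "F", {"R": 9.0, "F": 6.5, "M": 3.0, "J": 2.0}),
]

flags = {r.participant: check_response(r) for r in responses}
print(flags)
```

Flagged responses could then be inspected, excluded or followed up before analysis. Ties with the top rating are treated as consistent in this sketch.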

Psychological Aspects

Current preference testing methods are based on a model of consumers computationally or cognitively
weighing-up the various attributes of the products; finally ending up with a summed like-dislike opinion
(Fishbein and Ajzen 1975). These methods, based on attribute measures, do not appear to be very reliable.
For example, a comparison of attribute elicitation, importance ratings, information search, conjoint
measures, participative probability measures and Thurstone measures (each widely used and/or with a
substantial theoretical base) gave very different results, even when tested with a homogeneous group of
participants (Jaccard, Brinberg and Ackerman 1986).

In contrast, Zajonc (1980) suggested that preference or evaluation might be considered primarily an
affective or feeling process followed by cognition rather than a cognitive process followed by an affective
reaction. The process of evaluation is seen as a primitive approach/avoidance response to stimuli. This
response is rapid, requires little processing and has little cognitive mediation. The thinking about why a
stimulus is liked (the cognitive response) is a post-rationalisation of this immediate like-dislike response.
For example, Hoyer (1984) showed that the typical consumer is making an extremely quick decision with
only a minimal degree of cognitive effort when choosing a detergent in the supermarket. It should be
noted that though there is considerable debate on the relative involvement of affect and cognitive
processes in preference decisions (e.g., Vanhuele 1994, Anand and Sternthal 1991), the issue here is one
of improving the measurement of participants' preferences. Zajonc and Markus (1982) comment that
affect is primarily stored as the approach and interactive behaviour allied with the object. McSweeney and
Brierley (1984) have also drawn attention to the importance of behaviour when affective classical
conditioning and learned evaluative responses are involved, since these may not be reflected in cognitively
constructed evaluative judgements. In general, the involvement of behaviour in determining preference
may have a bearing on wider issues than scaling. For example, conditioning phenomena such as
"autoshaping" might take place, where the repeated experience of a signal (e.g., the pack design) coming
before or "predicting" a positive experience (e.g., a pleasant surface feel) causes the signal to be
approached and touched (Locurto, Terrace and Gibbons 1981).

The inference is that the behaviour towards a product, particularly a low-involvement product, may be
more important than what is said about it (Nisbett & Wilson 1977). The behaviour of reaching out and
physically contacting the product may be an important affective element in choice behaviour. Thus,
recruiting participants' behaviour in ranking stimuli should provide a better test of preference than the
primarily cognitive methods mentioned above. Following these arguments, "bench scaling" involves the
behaviour representing the affect associated with the product, i.e., in requiring participants to pick up
products it mimics the affective/behavioural choice of the supermarket situation better than conventional
"pencil and paper" preference measures with their emphasis on computation. In summary, the advantages
of bench scaling are that the participant:

1 Has the products in front of him/her.

2 Physically selects the products, thus directly accessing motor components of the affective response.

3 May judge the products with less introspection than when products are either absent or unavailable to
touch.

4 May judge the products without the somewhat artificial computations involved in numerical scaling or
allocation.

5 Can re-adjust the product order at will by re-arranging the products (a behavioural equivalent of
putting one product back on the shelf and taking another off).

On the other hand, bench scaling not only involves a behaviour towards the object but contact as well,
which may bring in interfering influences, e.g., of feel and weight and the participant's expectation of
these. If these aspects are taken into account, these results and discussion add support to McSweeney and
Brierley's (1984) conclusion that consumer researchers' traditional heavy reliance on pencil and paper
attitude scales as dependent measures should be supplemented with other, more behavioural, approaches
to the measurement of preference. For example, products (or markers signifying them), might be placed in
order of preference along a 1 metre line, with the distance along the line being taken as the measure.

REFERENCES

Anand P & Sternthal B (1991). Perceptual fluency and affect without recognition. Memory and Cognition.
13, 3, 293-300.

DeSoto CB, London M & Handel S (1965). Social Reasoning and Spatial Paralogic. Journal of
Personality and Social Psychology 2, 4, 513-521.

Fishbein M & Ajzen I (1975). Belief, attitude, intention and behavior: An introduction to theory and
research. Reading, MA: Addison-Wesley.

Hoyer WD (1984). An examination of consumer decision making for a common repeat purchase. Journal
of Consumer Research 11, 822-829.

Jaccard J, Brinberg D & Ackerman LJ (1986). Assessing attribute importance: A comparison of six
methods. Journal of Consumer Research 12, 463-468.

Locurto CM, Terrace HS & Gibbons J (1981). Autoshaping and Conditioning Theory. New York:
Academic Press.

McSweeney FK & Brierley C (1984). Recent developments in classical conditioning. Journal of
Consumer Research 11, 619-631.

Morgan JJ (1944). Effects of Non-rational Factors in Inductive Reasoning. Journal of Experimental
Psychology 34, 159-168.

Thorndike EL & Lorge I (1944). The Teacher's Word Book of 30,000 Words. New York: Bureau of
Publications, Columbia University.

Vanhuele M (1994). Mere exposure and the cognitive-affective debate revisited. Advances in Consumer
Research. 21, 264-269.

Nisbett RE & Wilson T (1977). Telling more than we can know: Verbal reports on mental processes.
Psychological Review 84, 3.

Zajonc RB (1980). Feeling and Thinking: Preferences need no inferences. American Psychologist 35, 2,
151-175.

Zajonc RB & Markus H (1982). Affective and Cognitive Factors in Preferences. Journal of Consumer
Research 9, 123-131.

Ziller RC, Hagey J & Smith M (1965). Self-Esteem: A self-social construct. Journal of Clinical and
Consulting Psychology 33, 1, 84-95.
