A Further Study of Estimating Averages
By J. SPENCER
University of Bristol
§ 1. INTRODUCTION
A PREVIOUS study of averaging performance (Spencer 1961) showed that people
are able to average sets of ten or twenty two-digit numbers with surprising
accuracy. The factors which affect this accuracy were found to be
(1.) amount of information presented for judgment;
(2.) scatter of the information;
(3.) presence of one item which differed markedly from the remaining values
of the set.
All numbers on the sets used were approximately normally distributed
about their respective mean values, so that modal, median and mean values
were practically indistinguishable. This prevented any assessment of the
nature of people's estimates of the average: whether for example, the averaging
process might be such as to produce median responses rather than arithmetic
mean responses. Some departure from arithmetic mean averaging was suggested
by the fact that the effect of single items which differed markedly from the
rest was unduly great.
The experiments to be described were therefore carried out to examine the
nature of averaging responses in a situation which allowed the arithmetic mean
of the presented information to be distinguished from the median and other
possible measures of central tendency.
§ 2. SYMBOLIC AVERAGING
2.2 Results
The results are shown in Table I, where it can be seen that mean errors were
remarkably low for all conditions. But it is clear that mean error was appreciably
greater with skewed material in three out of the four possible comparisons.
This provides some confirmation that averaging, in the sense of finding the
arithmetic mean, is affected by the type of distribution presented.
When the mean errors for the whole group of subjects were correlated with
the degree of skewness of distribution in the twelve skew distributed examples
used, the product-moment correlation coefficient was r = +0·7119 (for significance
at p = 0·05, n = 10, r should be ≥ 0·576). Subjects therefore overestimated
the effects of 'outliers' in a distribution to a demonstrable and significant
extent.
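The correlation check described above can be sketched as follows. This is a minimal illustration only: the function name `pearson_r` and both data lists are hypothetical and are not the values used in the experiment; only the threshold of 0·576 comes from the text.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length with at least 3 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Threshold quoted in the text for p = 0.05 with n = 10.
R_CRITICAL = 0.576

# Hypothetical skewness and group mean error for twelve skew sets
# (illustrative numbers, NOT the paper's data).
skewness = [-0.9, -0.7, -0.6, -0.4, -0.3, -0.2, 0.2, 0.3, 0.5, 0.6, 0.8, 0.9]
mean_error = [-2.1, -1.0, -1.6, -0.2, -0.9, 0.1, -0.3, 0.8, 0.4, 1.5, 0.9, 2.2]

r = pearson_r(skewness, mean_error)
print(f"r = {r:+.3f}; significant at p = 0.05: {abs(r) >= R_CRITICAL}")
```

A positive r above the threshold corresponds to errors that lean in the direction of the skew, which is the pattern the text reports.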
The results for modulus mean error are also shown in Table 1. They show
that accuracy was lower with skew distributions except for the high scatter
condition with twenty items. For normally distributed material, error increased
with increasing amounts of information, whereas for skewed material, the
effect was reduced or even reversed.
A four-factor analysis of variance (distribution × amount × scatter ×
subjects) was made on the total error scores and yielded a highly significant
second order interaction term (distribution × amount × scatter). Accordingly
three-factor analyses were made for each of the two types of distribution
separately. These showed that amount and scatter and their interaction were
significant for normal distributions but that only scatter was significant for
skew distributions. There was no indication of significant differences between
subjects.
Table 1. Mean errors made when averaging symbolic material
Number of items 10 20 10 20
Mean errors (taking sign into account) -0·31* -0·20 -0·26 -0·75 -1·32 -2·07 -1·25 +0·67
As an additional check on the possibility of systematic differences between
subjects the total errors for all normally distributed sets combined were
correlated with total errors for all skew distributed sets combined for the ten
subjects. This yielded a product-moment correlation coefficient of r= ·-0,\ UJ
which is quite insignificant (for p = 0·05 and n = 8, r must be ≥ 0·632). Since
this procedure lumps together what have already been' shown to be important
sources of variation in performance, rank correlations between subjects'
performances on the two distributions were made separately for the different
conditions of amount and scatter. None of these was significant, so it must be
concluded that there were no marked differences in general averaging ability
between subjects.
A further analysis was made of subjects' performances on skew material.
Each response was compared with the various possible measures of central
tendency that could be used to define the 'card average'. The measures
chosen were (1) arithmetic mean, (2) median, (3) maximin value (i.e. half the
sum of the lowest and highest values on the card). Responses were classified
according to which of these they most closely resembled. Where a response
was equidistant from two possible measures, or seriously in error, it was
classified as 'not known'. Table 2 shows the results of this procedure.
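The classification procedure just described can be sketched as below. It is an illustration under stated assumptions: the function name `classify_response`, the `max_error` cut-off for "seriously in error" (the text gives no numerical criterion) and the example card are all hypothetical.

```python
import math
from statistics import mean, median


def classify_response(values, response, max_error=None):
    """Label a response by the nearest of three measures of central tendency.

    Returns 'not known' when the two nearest measures are equidistant, or when
    the response is further than max_error from every measure. max_error is an
    assumed cut-off; the text does not state how 'seriously in error' was judged.
    """
    measures = {
        "arithmetic mean": mean(values),
        "median": median(values),
        # 'maximin' as defined in the text: half the sum of lowest and highest.
        "maximin": (min(values) + max(values)) / 2,
    }
    ranked = sorted((abs(response - m), name) for name, m in measures.items())
    (nearest, label), (second, _) = ranked[0], ranked[1]
    if math.isclose(nearest, second, abs_tol=1e-9):
        return "not known"
    if max_error is not None and nearest > max_error:
        return "not known"
    return label


# Hypothetical skew card of ten two-digit numbers:
# median 38, arithmetic mean 42, maximin 50.
card = [31, 33, 34, 36, 37, 39, 42, 46, 53, 69]

for estimate in (37, 43, 48, 40):
    print(estimate, "->", classify_response(card, estimate))
```

On this card the median and the maximin value fall on opposite sides of the arithmetic mean, so a response of 40, which is equally far from mean and median, is left unclassified.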
Table 2. Types of response made when averaging skew distributed symbolic information.
Type of response    Number of responses    Percentage of total
Median                      22                    18·3
Arithmetic mean             43                    35·8
Maximin                     44                    36·7
Not known                   11*                    9·2
Total                      120                   100·0
* Four of the eleven were major errors, all in the direction of outliers.
The first point that can be made about Table 2 is the low proportion of
'median' responses. The majority of judgments are biased towards the
outliers, as the positive correlation between mean error and skewness demonstrates.
If the basis of mental averaging is assumed to be a mechanism which
produces arithmetic mean responses modified by random error, then one would
expect that the majority of responses would be classified as 'arithmetic mean'
responses. The remaining responses would be approximately evenly divided
between the 'median' and the 'maximin' classes. This should occur because
the median and the maximin values for a skew set of numbers lie on opposite
sides of the arithmetic mean for that set, with the maximin value lying on the
same side of the mean as the outliers. The low proportion of 'median'
responses and the high proportion of 'maximin' responses in Table 2 make it
appear that the averaging mechanism approximates to a maximin process
rather than to an arithmetic mean process.
" But if the middle values were not even, then I took a blind swipe - nothing
very systematic". Only two subjects said that they were more confident of
their estimates based on ten rather than twenty numbers.
How well do reported methods agree with the performances achieved by
subjects? Table 3 shows the distributions of responses classified as median,
arithmetic mean or maximin for each of the three main methods reported by
subjects.
Table 3. Percentages of different types of response with different types of averaging technique.
Symbolic material.
Number of subjects        5       4
Responses classed as:
Median                 20·0    10·4    41·8
Arithmetic mean        46·7    23·0    33·2
Maximin                23·3    58·3    16·7
Not known              10·0     8·3     8·3
Total                 100·0   100·0   100·0
There is a significant difference between the distributions achieved by the
sampling and maximin groups (χ² = 14·7, which, with 3 degrees of freedom,
corresponds to p<0·01). The one person representing the synoptic method
had to be left out of the examination because expectancies calculated for his
method were too small to be reliable.
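A comparison of this kind can be sketched with a plain χ² test on a two-row contingency table. The counts below are an assumption: they are back-calculated from the percentages in Table 3 on the supposition that each subject judged twelve skew sets (five subjects giving 60 responses, four giving 48). The function name `chi_square` is likewise illustrative. Because the counts are reconstructed, the statistic need not reproduce the reported 14·7 exactly.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Columns: median, arithmetic mean, maximin, not known.
# Counts reconstructed from percentages (an assumption, not published data).
sampling_group = [12, 28, 14, 6]   # 5 subjects x 12 sets = 60 responses
maximin_group = [5, 11, 28, 4]     # 4 subjects x 12 sets = 48 responses

statistic, dof = chi_square([sampling_group, maximin_group])

# Tabulated critical value of chi-square for 3 degrees of freedom at p = 0.01.
CRITICAL_3DF_P01 = 11.345
print(f"chi-square = {statistic:.2f} on {dof} d.f.; "
      f"exceeds p = 0.01 value: {statistic > CRITICAL_3DF_P01}")
```

With these reconstructed counts the statistic still exceeds the tabulated value for p = 0·01 with 3 degrees of freedom, in line with the conclusion drawn in the text.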
It is clear from these results that type of performance and reported method
are related and that the sampling method gives a very close approach to an
arithmetic mean type.
3.2 Results
It can be seen from Table 4 that mean error with skew distributions was
greater than with normal for three of the four comparisons but the differences
were less pronounced than those obtained with symbolic averaging. The
Table 4. Mean errors made when averaging graphical material
Number of items 10 20 10 20
Mean errors (taking sign into account) -0·17* +0·38 +0·50 +1·50 -0·4~ -1·50 -0·33 -2·55
Table 5. Types of response made when averaging skew distributed graphic information.
Type of response    Number of responses    Percentage of total
Median                      17                    14·2
Arithmetic mean             68                    56·8
Maximin                     32                    26·6
Not known                    3*                    2·5
Total                      120                   100·0
* One of the three was a major error in the direction opposite to outliers.
Compared with Table 2, the proportion of 'arithmetic mean' responses was
greater than with symbolic averaging, the increase being derived from nearly
equal reductions in the proportions of 'median' and 'maximin' judgments.
of graph points was established and the centre chosen as a preliminary average.
Secondly, this preliminary average was adjusted according to the distribution
of the outliers.
It can be seen from Table 6 that the visual balance method provided a
Table 6. Percentages of different types of response with different types of averaging technique.
Graphical method.
Number of subjects        6       4
Responses classed as:
Median                 12·5    16·6
Arithmetic mean        65·2    43·8
Maximin                19·5    37·5
Not known               2·8     2·1
Total                 100·0   100·0
§ 4. DISCUSSION
The results of these experiments show good general agreement with
previously published results (Spencer 1961). Performances obtained with skew
distributed information agree with expectations derived from the previous
experiments in which a single deviant value was used. Individual differences of
performance at averaging symbolic material were not statistically significant
and it can be accepted that the same was true for graphic material since
individual differences were found by inspection to be even smaller. The
previous experiments showed that although individual differences existed
between process operators they did not exist between students, and the students
averaged more successfully than the majority of process operators. The
present results could, therefore, be expected because all the subjects were
students.
Although differences between subjects were small in terms of error there were
nevertheless significant differences in the averaging methods used, and these
were related to the type of averaging response made. From the introspective
reports that were obtained it appeared that the differences in technique were
primarily differences in the way subjects perceptually structured the presented
information. All the reports described what subjects chose to use as a basis for
estimation: none gave any clear idea about what they did with the information
when they had perceived it. In other words any computation was apparently
performed unconsciously and only the process of acquisition was recalled.
It was found that judgments were biased towards the outliers to a significant
extent with symbolic but not with graphical information. In addition,
a significantly greater number of judgments were classifiable as 'arithmetic
mean' responses with graphical as compared with symbolic information (χ² =
7·21, which with 2 degrees of freedom corresponds to p<0·05). Since the same
basic information was used for both conditions it must be concluded that the
graphical form of presentation facilitated arithmetic mean responses. Two
possible explanatory models may be suggested for these results. It may be
that 'averaging' is an inherent characteristic of all sensory systems such that
any series of similar stimuli or events gives rise to a generalised perception
which corresponds with an average for the series. Some sensory systems may
be better adapted to produce 'arithmetic mean' responses than others. The
alternative model is to suppose that averaging involves two stages or mechanisms,
one of which is perceptual and the other computational. The computing
stage is at a high level of the central nervous system and deals with information
from all the relevant sensory channels. It might be further suggested that
this central computer is so constituted that it inherently gives arithmetic
mean responses provided that the perceptual system transmits the information
without distortion. At present it is impossible to decide in favour of one of these
models rather than the other, but it seems more probable on general grounds
that the second is nearer the true facts.
A striking feature of the results is the fact that although increases in
information lead to increases in error for normally distributed information,
this is not consistently the case with skew distributed information. This was
probably due to inadequate control of the normally distributed material in
that when the amount of information increased so did the standard deviation.
This was not so with the skew distributed material. The rise in modulus mean
error which occurs with increased amounts of normally distributed information
is, therefore, almost certainly due to the rise in standard deviation. It was not
appreciated when the material was prepared how sensitive people would be to
small but consistent changes in scatter. The same explanation applies to some
of the results described in the previous paper (Spencer 1961, top of p. 318).
This does not affect the principal findings or discussions of the present paper
which are based almost entirely on the skew information results. It must
now be concluded, however, that performance does not deteriorate in a
consistent manner as the amount of information is increased over the range
studied in these experiments.
A more general question is raised by the results reported in this and the
previous paper. Why should performance deteriorate as the scatter of the
information increases? The present results show that the modulus mean error
one near by. This may, as Bartlett and Mackworth suggested, have created an
illusion of distance for the outliers, so that their deviation appeared to be
greater than it really was.
The author is grateful to P. F. Powesland for helpful discussion and to the students who
volunteered to be subjects.
The work was carried out with the financial support of the Department of Scientific and
Industrial Research.
APPENDIX
Details of the Normally Distributed Sets of Data
* Sets 1, 2, 3, 7, 8 and 9 were the 'low scatter' sets, the remainder 'high scatter'.
Set*   Mean   Median   Maximin   Standard deviation   Skewness†   Number of items
* Sets 1, 2, 3, 7, 8 and 9 were the 'low scatter' sets, the remainder 'high scatter'.
† Skewness was estimated as 3 (mean - median)/σ.
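The skewness estimate in the footnote, read here as three times the difference between mean and median divided by the standard deviation, can be computed as in the short sketch below. The function name and the example sets are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the appendix does not say which form was used.

```python
from statistics import mean, median, pstdev


def median_skewness(values):
    """Skewness estimated as 3 * (mean - median) / standard deviation."""
    sd = pstdev(values)  # population form assumed; the source does not specify
    if sd == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (mean(values) - median(values)) / sd


# Hypothetical sets: one symmetrical, one with a tail of high outliers.
symmetric_set = [36, 38, 40, 42, 44]
skew_set = [31, 33, 34, 36, 37, 39, 42, 46, 53, 69]

print(median_skewness(symmetric_set))
print(median_skewness(skew_set))
```

The sign of the result shows on which side of the median the outliers lie: positive when the mean is pulled above the median, negative when it is pulled below.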
A trial and error method was adopted in preparing the skew distributed sets of
numbers. A desired mean value and two values to define a range were chosen to
give either a high or a low scatter. This range was then divided into five sub-ranges
which diminished in width from one end to the other of the total range. Into each
sub-range two (or four) numbers were fitted from random number tables. The
mean and median of the set were calculated and the extreme numbers were adjusted
to increase their difference if it appeared to be too small.
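The first steps of this procedure can be sketched as follows. The sketch is illustrative only: the function name `make_skew_set`, the geometric `shrink` ratio for the sub-range widths and the use of a pseudo-random generator in place of random number tables are assumptions. The final manual adjustment of the extreme numbers is not modelled.

```python
import random


def make_skew_set(low, high, per_subrange=2, shrink=0.6, rng=None):
    """Draw integers from five sub-ranges whose widths diminish from low to high.

    shrink is an assumed ratio between successive sub-range widths; the
    appendix does not state how quickly the widths diminished.
    """
    if high - low < 5:
        raise ValueError("range too narrow to split into five sub-ranges")
    rng = rng or random.Random()
    weights = [shrink ** k for k in range(5)]
    total = sum(weights)
    edges = [float(low)]
    for weight in weights:
        edges.append(edges[-1] + (high - low) * weight / total)
    # Round the edges to whole numbers; neighbouring sub-ranges share an edge.
    bounds = [round(edge) for edge in edges]
    bounds[0], bounds[-1] = low, high
    numbers = []
    for lower, upper in zip(bounds, bounds[1:]):
        numbers.extend(rng.randint(lower, upper) for _ in range(per_subrange))
    return numbers


ten_items = make_skew_set(30, 70, per_subrange=2, rng=random.Random(1))
twenty_items = make_skew_set(30, 70, per_subrange=4, rng=random.Random(1))
print(ten_items)
print(twenty_items)
```

Because every sub-range receives the same number of items, the numbers crowd together where the sub-ranges are narrow and thin out where they are wide, which is what produces the skew. Reversing the order of the widths would place the tail at the other end of the range.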
This article is devoted to a study of the estimation of the average of several values of a variable
when these are presented either in symbolic or in graphical form.
The symbolic information is presented to subjects on small white cards on which either 10 or 20
two-digit numbers are printed. The graphical information is presented as 10 or 20 points plotted
on graph paper ruled in tenths of an inch. The data differ in their distributions, which may be
either symmetrical or asymmetrical about the arithmetic mean, and in their scatter about that
mean. The results, like those of earlier studies, show that the error of estimation increases as
the scatter increases, and that this increase is greater for symbolic than for graphical
information. The error is also greater for asymmetrical than for symmetrical distributions.
Comparison of the results obtained with the two types of distribution shows that error does not
increase as a function of the amount of information presented, contrary to what some earlier
studies indicated; in those studies, inadequate control of the distribution of the various
amounts of information is to blame.
Analysis of individual performances shows that there are several types of response and that the
type described by subjects on the basis of their introspection is related to their performance.
In particular, the proportion of estimates that can be classed as of the 'arithmetic mean' type
varies significantly with the tactics described by the subjects.
Some future experiments are suggested which should make it possible to determine which of the
two possible models of averaging is the correct one, and why error increases as the scatter of
the information increases.
The study examines how the mean of a series of values of a variable is estimated when these
values are presented symbolically or graphically. As symbolic information, subjects were shown
small white cards bearing 10 or 20 two-digit numbers. Graphical information was given in the form
of 10 or 20 points on millimetre graph paper. The data sets differed in being either normally or
unevenly distributed about the arithmetic mean, and in the degree of their scatter about the mean.
The results confirmed earlier findings in that errors of judgment increased with scatter, and did
so to a greater extent with symbolic than with graphical information. Errors were also greater
when the distribution about the arithmetic mean was uneven. The earlier finding that errors
increased as the amount of data presented increased proved to be mistaken. It rested on
insufficient allowance for the scatter at different amounts of presented data.
Analysis of individual performance showed that subjects' manner of estimating varied and that
their introspective reports were always related to performance. In particular, the number of
judgments that could be classified as of the 'arithmetic mean' type varied significantly with the
method of estimation.
Further experiments are proposed to determine which kind of estimation is correct, and why errors
increase with greater scatter of the information.
REFERENCES
BARTLETT, F. C., and MACKWORTH, N. H., 1950, Planned Seeing, Air Ministry A.P. 3139B
(London: H.M.S.O.).
SPENCER, J., 1961, Estimating averages. Ergonomics, 4, 317-328.