Dana Dawes Superiority Simple Alternatives Regression Social Science Predictions
American Educational Research Association and American Statistical Association are collaborating with
JSTOR to digitize, preserve and extend access to Journal of Educational and Behavioral Statistics.
The authors thank Rob Kass, Scott Moser, and John Patty for conversations helpful to this article.
Methods
Notation
Without loss of generality, we discuss data in standard score form that have been
codified such that each predictor correlates positively with the criterion. The
following notation is used: the sample correlation matrix among predictors is S; in the
population it is Σ. The sample vector of correlations between the predictors and the
criterion is r and has m (number of predictors) elements, while the population
analog is v. Sample-derived least squares coefficients are a vector b whose elements
are b coefficients, the population coefficients are β coefficients, and w is any vector
of coefficients. Finally, the sample multiple correlation is R, while the population
multiple correlation is ρ, and the population correlation between the criterion and
the predicted values resulting from sample-derived coefficients is "validated R."
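For reference, under the standard-score convention just described, the notation is tied together by the usual least squares identities. These are standard results rather than equations quoted from the article, and they assume that S and Σ are invertible:

```latex
\[
  b = S^{-1} r, \qquad
  \beta = \Sigma^{-1} v, \qquad
  R^{2} = r'b, \qquad
  \rho^{2} = v'\beta .
\]
```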
Alternative Coefficients
In addition to least squares, we consider the following:¹
1. Correlation Weights. Coefficients are each predictor's zero-order
correlation with the criterion. Previous authors (Goldberg, 1972; Marks, 1966) have found
some support for favorable cross-validation of correlation weights over regression
coefficients.
2. Unit Weights. Coefficients are either 1 or -1 on each predictor. Wainer
(1976) showed that the loss of predictable variation of a random variable using equal
weights rather than ordinary least squares is theoretically quite small (see
corrections by Wainer, 1978 and Grove, 2002), and Grove shows that this loss is smaller
when predictors are correlated. Unit weights can be chosen post hoc according to
observed correlations or a priori according to the researcher's theory. As the former
method rarely outperforms correlation weights, and the latter is less redundant, we
investigate only the latter method. Note that this choice entails the possibility of
weighting a predictor -1, even if all sample obtained correlations with the criterion
are positive. It is also possible in some samples that properly calibrated a priori unit
weights will yield negative values of R.
3. Take the Best Weights. Gigerenzer and Todd (1999) describe one-variable
decision rules that may cross validate better than regression, unit, and correlation
Sampling and Validation Procedures
A modification of the traditional cross-validation procedure was used. Rather
than creating two samples, we began with large datasets that we assume yield
population parameters. We then randomly drew smaller samples from these
datasets. From each dataset, we sampled 300 times with replacement after each
sample, including 50 each of sizes 5m, 10m, 15m, 20m, 30m, and 50m, where m
is the number of predictors. We then applied the procedures described above in
each of the samples to obtain the sample-derived coefficients. To determine the
directions of the unit weights, the researchers used the majority judgment of a
small convenience sample of colleagues (all directions were correct). All
coefficients were then applied to the "population" superset (including the sample) to
obtain values of validated R, which is computationally simplified by using the
formula:
$$\text{validated } R = \frac{w'v}{\sqrt{w'\Sigma w}}$$
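The formula and the sampling plan above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `validated_r` and `sample_size_plan` are hypothetical, the two-predictor "population" values are invented for the example, and the population vector v stands in for a sample r when forming correlation weights.

```python
import math


def validated_r(w, v, sigma):
    """Population correlation between the criterion and the composite w'x.

    w     -- any vector of coefficients
    v     -- population predictor-criterion correlations
    sigma -- population correlation matrix among predictors (list of rows)
    """
    m = len(w)
    numerator = sum(w[i] * v[i] for i in range(m))          # w'v
    variance = sum(w[i] * sigma[i][j] * w[j]
                   for i in range(m) for j in range(m))     # w'(Sigma)w
    return numerator / math.sqrt(variance)


def sample_size_plan(m, per_size=50, multiples=(5, 10, 15, 20, 30, 50)):
    """Sizes of the 300 samples per dataset: 50 each of 5m, 10m, ..., 50m."""
    return [k * m for k in multiples for _ in range(per_size)]


# Hypothetical two-predictor population (values invented for illustration).
sigma = [[1.0, 0.5],
         [0.5, 1.0]]
v = [0.6, 0.4]

unit_r = validated_r([1.0, 1.0], v, sigma)  # a priori unit weights
corr_r = validated_r(v, v, sigma)           # correlation weights (v as stand-in for r)
plan = sample_size_plan(m=len(v))           # 300 sample sizes for this dataset
```

Because the denominator rescales by the standard deviation of the composite, multiplying w by any positive constant leaves validated R unchanged, so only the relative sizes and signs of the coefficients matter.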
TABLE 1
Characteristics of the Real Datasets

Dataset   N      k   ρ    v vector                                  x_i x_j (a)
Abalone   4,177  7   .73  .63 .58 .56 .56 .54 .50 .42               .89
NFL       3,057  10  .54  .46 .43 .37 .34 .33 .27 .21 .07 .05 .05   .21
ABC       955    5   .35  .32 .20 .06 .04 .02                       .08
NES       1,910  6   .35  .26 .17 .15 .15 .13 .12                   .11
WLS       6,385  5   .20  .13 .11 .10 .10 .10                       .15

(a) x_i x_j is the mean strength of correlation among the predictors.
[Figure 1: panels for Abalone, NFL, ABC, and NES plotting mean validated R against sample size (5m, 10m, 15m, 20m, 30m, 50m); legible series labels include best cp, correlation, unit, and take the best.]
FIGURE 1. Mean validated R as a function of sample size for each public dataset.
$$u = \sqrt{1 - h^{2}}.$$
l-d
[Figure 2: mean validated R against sample size for regression, ridge regression, correlation weights, and unit weights, with curves grouped by level of ρ.]
FIGURE 2. Mean validated R's as a function of sample size for synthetic datasets, grouped
by level of prediction error.
More recently, rules of 40m for a stable cross-validity coefficient and 100m for
estimating the population regression equation have been suggested (Osborne,
2000).
We throw our own hat into the ring, arguing from a rather practical perspective.
Regression coefficients should not be used for predictions unless error is likely to
be extremely small by social science standards or sample sizes will be larger than
100 observations per predictor. In other words, regression coefficients should almost
never be used for social science predictions. Simple alternatives will usually yield
better predictions. Schmidt (1971) made a similar point about the cross-valid
superiority of unit weights:

    Since many studies employing regression weights reported in the literature are
    characterized by sample sizes below the critical values presented in Table 3, it is
    concluded that many psychologists in applied areas are routinely penalizing
    themselves by their adherence to [least squares regression]. (p. 710)

His results typically required sample sizes close to 20m for least squares to be
superior to unit weights. Our more conservative estimates regarding least squares
result in part from the consideration of correlation weight alternatives, which are
often superior to unit weights, and in part because we condition our
recommendations on ρ.
If the problem is to choose a sample size a priori to obtain effective regression
coefficients, the researcher should assume a value of ρ based on his or her theory.
Notes
1 One of us (RMD) had the idea of retaining only those off-diagonal elements of
S that could change the sign of a regression coefficient (i.e., those pertinent to a
suppression). This strategy, which we call vanishing covariances, was also
examined. However, because our data are not characterized by any strong suppressions,
we do not include its results. It usually (but not always) reduced to correlation
weighting, and, thus, usually yielded the same accuracy (but sometimes worse).
2 One simple intuition behind the increased efficiency of regression as error
References
ABC News/The Washington Post. (2002). ABC News/Washington Post Six Months After
September 11th Poll, March 2002 (Computer file). ICPSR version. Horsham, PA: Taylor
Nelson Sofres Intersearch [producer], 2002. Ann Arbor, MI: Interuniversity Consortium
for Political and Social Research [distributor].
Armstrong, J. S. (1985). Long-range forecasting: From crystal ball to computer (2nd ed.).
New York: Wiley Interscience.
Authors
JASON DANA is a doctoral student, Department of Social and Decision Sciences, 208 Porter
Hall, Carnegie Mellon University, Pittsburgh, PA 15213; jdd@andrew.cmu.edu. His
interests are in behavioral economics, social preferences, the use of clinical vs. actuarial
judgment, and applications to ethics.
ROBYN M. DAWES is the Charles J. Queenan, Jr. University Professor of Psychology,
Department of Social and Decision Sciences, 208 Porter Hall, Carnegie Mellon University,
Pittsburgh, PA 15213; rdlb@andrew.cmu.edu.