Article 1

Comparing Predictive Accuracy
Author(s): Francis X. Diebold and Roberto S. Mariano

Source: Journal of Business & Economic Statistics, Vol. 20, No. 1, Twentieth Anniversary
Commemorative Issue (Jan., 2002), pp. 134-144
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/1392155
© 1995 American Statistical Association
Journal of Business & Economic Statistics, July 1995, Vol. 13, No. 3

Comparing Predictive Accuracy

Francis X. DIEBOLD
Department of Economics, University of Pennsylvania, Philadelphia, PA 19104-6297, and National Bureau of Economic Research, Cambridge, MA 02138

Roberto S. MARIANO
Department of Economics, University of Pennsylvania, Philadelphia, PA 19104-6297
We propose and evaluate explicit tests of the null hypothesis of no difference in the accuracy of two competing forecasts. In contrast to previously developed tests, a wide variety of accuracy measures can be used (in particular, the loss function need not be quadratic and need not even be symmetric), and forecast errors can be non-Gaussian, nonzero mean, serially correlated, and contemporaneously correlated. Asymptotic and exact finite-sample tests are proposed, evaluated, and illustrated.

KEY WORDS: Economic loss function; Exchange rates; Forecast evaluation; Forecasting; Nonparametric tests; Sign test.
Prediction is of fundamental importance in all of the sciences, including economics. Forecast accuracy is of obvious importance to users of forecasts because forecasts are used to guide decisions. Forecast accuracy is also of obvious importance to producers of forecasts, whose reputations (and fortunes) rise and fall with forecast accuracy. Comparisons of forecast accuracy are also of importance to economists more generally who are interested in discriminating among competing economic hypotheses (models). Predictive performance and model adequacy are inextricably linked: predictive failure implies model inadequacy.

Given the obvious desirability of a formal statistical procedure for forecast-accuracy comparisons, one is struck by the casual manner in which such comparisons are typically carried out. The literature contains literally thousands of forecast-accuracy comparisons; almost without exception, point estimates of forecast accuracy are examined, with no attempt to assess their sampling uncertainty. On reflection, the reason for the casual approach is clear: Correlation of forecast errors across space and time, as well as several additional complications, makes formal comparison of forecast accuracy difficult. Dhrymes et al. (1972) and Howrey, Klein, and McCarthy (1974), for example, offered pessimistic assessments of the possibilities for formal testing.

In this article we propose widely applicable tests of the null hypothesis of no difference in the accuracy of two competing forecasts. Our approach is similar in spirit to that of Vuong (1989) in the sense that we propose methods for measuring and assessing the significance of divergences between models and data. Our approach, however, is based directly on predictive performance, and we entertain a wide class of accuracy measures that users can tailor to particular decision-making situations. This is important because, as is well known, realistic economic loss functions frequently do not conform to stylized textbook favorites like mean squared prediction error (MSPE). [For example, Leitch and Tanner (1991) and Chinn and Meese (1991) stressed direction of change, Cumby and Modest (1987) stressed market and country timing, McCulloch and Rossi (1990) and West, Edison, and Cho (1993) stressed utility-based criteria, and Clements and Hendry (1993) proposed a new accuracy measure, the generalized forecast-error second moment.] Moreover, we allow for forecast errors that are potentially non-Gaussian, nonzero mean, serially correlated, and contemporaneously correlated.

We proceed by detailing our test procedures in Section 1. Then, in Section 2, we review the small extant literature to provide necessary background for the finite-sample evaluation of our tests in Section 3. In Section 4 we provide an illustrative application, and in Section 5 we offer conclusions and directions for future research.

1. TESTING EQUALITY OF FORECAST ACCURACY

Consider two forecasts, $\{\hat{y}_{it}\}_{t=1}^{T}$ and $\{\hat{y}_{jt}\}_{t=1}^{T}$, of the time series $\{y_t\}_{t=1}^{T}$. Let the associated forecast errors be $\{e_{it}\}_{t=1}^{T}$ and $\{e_{jt}\}_{t=1}^{T}$. We wish to assess the expected loss associated with each of the forecasts (or its negative, accuracy). Of great importance, and almost always ignored, is the fact that the economic loss associated with a forecast may be poorly assessed by the usual statistical metrics. That is, forecasts are used to guide decisions, and the loss associated with a forecast error of a particular sign and size is induced directly by the nature of the decision problem at hand. When one considers the variety of decisions undertaken by economic agents guided by forecasts (e.g., risk-hedging decisions, inventory-stocking decisions, policy decisions, advertising-expenditure decisions, public-utility rate-setting decisions, etc.), it is clear that the loss associated with a particular forecast error is in general an asymmetric function of the error and, even if symmetric, certainly need not conform to stylized textbook examples like MSPE.
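To make the point about asymmetric loss concrete, the following minimal sketch compares two forecasts whose errors have the same magnitudes but opposite signs. The "lin-lin" loss, its weights `a` and `b`, and the error values are illustrative assumptions and do not come from the article.

```python
def squared_error_loss(e):
    """Quadratic loss, the building block of MSPE."""
    return e * e


def linlin_loss(e, a=2.0, b=1.0):
    """Hypothetical asymmetric loss: positive errors cost a per unit, negative errors b per unit."""
    return a * e if e > 0 else -b * e


def mean_loss_gap(loss, errors_i, errors_j):
    """Average difference in loss between forecast i and forecast j."""
    diffs = [loss(ei) - loss(ej) for ei, ej in zip(errors_i, errors_j)]
    return sum(diffs) / len(diffs)


# Made-up forecast errors: forecast j errs by the same amounts as i, with opposite sign.
e_i = [1.0, -0.5, 2.0, -1.5]
e_j = [-e for e in e_i]

mse_gap = mean_loss_gap(squared_error_loss, e_i, e_j)   # quadratic loss cannot tell them apart
linlin_gap = mean_loss_gap(linlin_loss, e_i, e_j)       # asymmetric loss can
```

Under quadratic loss the two forecasts are indistinguishable, whereas the asymmetric loss ranks them differently, which is why the choice of loss function matters for the comparison.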
Thus, we allow the time-t loss associated with a forecast (say i) to be an arbitrary function of the realization and prediction, $g(y_t, \hat{y}_{it})$. In many applications, the loss function will be a direct function of the forecast error; that is, $g(y_t, \hat{y}_{it}) = g(e_{it})$. To economize on notation, we write $g(e_{it})$ from this point on, recognizing that certain loss functions (like direction-of-change) do not collapse to $g(e_{it})$ form, in which case the full $g(y_t, \hat{y}_{it})$ form would be used. The null hypothesis of equal forecast accuracy for two forecasts is $E[g(e_{it})] = E[g(e_{jt})]$, or $E[d_t] = 0$, where $d_t \equiv [g(e_{it}) - g(e_{jt})]$ is the loss differential. Thus, the "equal accuracy" null hypothesis is equivalent to the null hypothesis that the population mean of the loss-differential series is 0.

1.1 An Asymptotic Test

Consider a sample path $\{d_t\}_{t=1}^{T}$ of a loss-differential series. If the loss-differential series is covariance stationary and short memory, then standard results may be used to deduce the asymptotic distribution of the sample mean loss differential. We have

$$\sqrt{T}\,(\bar{d} - \mu) \xrightarrow{d} N(0,\ 2\pi f_d(0)),$$

where

$$\bar{d} = \frac{1}{T}\sum_{t=1}^{T}\left[g(e_{it}) - g(e_{jt})\right]$$

is the sample mean loss differential,

$$f_d(0) = \frac{1}{2\pi}\sum_{\tau=-\infty}^{\infty}\gamma_d(\tau)$$

is the spectral density of the loss differential at frequency 0, $\gamma_d(\tau) = E[(d_t - \mu)(d_{t-\tau} - \mu)]$ is the autocovariance of the loss differential at displacement $\tau$, and $\mu$ is the population mean loss differential. The formula for $f_d(0)$ shows that the correction for serial correlation can be substantial, even if the loss differential is only weakly serially correlated, due to cumulation of the autocovariance terms.

Because in large samples the sample mean loss differential $\bar{d}$ is approximately normally distributed with mean $\mu$ and variance $2\pi f_d(0)/T$, the obvious large-sample N(0, 1) statistic for testing the null hypothesis of equal forecast accuracy is

$$S_1 = \frac{\bar{d}}{\sqrt{\dfrac{2\pi\hat{f}_d(0)}{T}}},$$

where $\hat{f}_d(0)$ is a consistent estimate of $f_d(0)$.

Following standard practice, we obtain a consistent estimate of $2\pi f_d(0)$ by taking a weighted sum of the available sample autocovariances,

$$2\pi\hat{f}_d(0) = \sum_{\tau=-(T-1)}^{T-1} 1\!\left(\frac{\tau}{S(T)}\right)\hat{\gamma}_d(\tau),$$

where

$$\hat{\gamma}_d(\tau) = \frac{1}{T}\sum_{t=|\tau|+1}^{T}(d_t - \bar{d})(d_{t-|\tau|} - \bar{d}),$$

$1(\tau/S(T))$ is the lag window, and $S(T)$ is the truncation lag.

To motivate a choice of lag window and truncation lag that we have often found useful in practice, recall the familiar result that optimal k-step-ahead forecast errors are at most (k - 1)-dependent. In practical applications, of course, (k - 1)-dependence may be violated for a variety of reasons. Nevertheless, it seems reasonable to take (k - 1)-dependence as a reasonable benchmark for a k-step-ahead forecast error (and the assumption may be readily assessed empirically). This suggests the attractiveness of the uniform, or rectangular, lag window, defined by

$$1\!\left(\frac{\tau}{S(T)}\right) = \begin{cases} 1 & \text{for } \left|\dfrac{\tau}{S(T)}\right| \le 1 \\ 0 & \text{otherwise.} \end{cases}$$

(k - 1)-dependence implies that only (k - 1) sample autocovariances need be used in the estimation of $f_d(0)$ because all the others are 0, so $S(T) = (k - 1)$. This is legitimate (i.e., the estimator is consistent) under (k - 1)-dependence so long as a uniform window is used because the uniform window assigns unit weight to all included autocovariances.

Because the Dirichlet spectral window associated with the rectangular lag window dips below 0 at certain locations, the resulting estimator of the spectral density function is not guaranteed to be positive semidefinite. The large positive weight near the origin associated with the Dirichlet kernel, however, makes it unlikely to obtain a negative estimate of $f_d(0)$. In applications, in the rare event that a negative estimate arises, we treat it as 0 and automatically reject the null hypothesis of equal forecast accuracy. If it is viewed as particularly important to impose nonnegativity of the estimated spectral density, it may be enforced by using a Bartlett lag window, with corresponding nonnegative Fejer spectral window, as in the work of Newey and West (1987), at the cost of having to increase the truncation lag "appropriately" with sample size. Other lag windows and truncation lag selection procedures are of course possible as well. Andrews (1991), for example, suggested using a quadratic spectral lag window, together with a "plug-in" automatic bandwidth selection procedure.

1.2 Exact Finite-Sample Tests

Sometimes only a few forecast-error observations are available in practice. One approach in such situations is to bootstrap our asymptotic test statistic, as done by Mark (1995). Ashley's (1994) work is also very much in that spirit. Little is known about the first-order asymptotic validity of the bootstrap in this situation, however, let alone higher-order asymptotics or actual finite-sample performance. Therefore, it is useful to have available exact finite-sample tests of predictive accuracy, to complement the asymptotic test presented previously. Two powerful such tests are based on the observed loss differentials (the sign test) or their ranks (Wilcoxon's signed-rank test). [These tests are standard, so our discussion is terse. See, for example, Lehmann (1975) for details.]

1.2.1 The Sign Test. The null hypothesis is a zero-median loss differential: $\mathrm{med}(g(e_{it}) - g(e_{jt})) = 0$. Note that the null of a zero-median loss differential is not the same
as the null of zero difference between median losses; that is, $\mathrm{med}(g(e_{it}) - g(e_{jt})) \neq \mathrm{med}(g(e_{it})) - \mathrm{med}(g(e_{jt}))$. For that reason, the null differs slightly in spirit from that associated with our earlier discussed asymptotic test statistic $S_1$, but it nevertheless has an intuitive and meaningful interpretation, namely, that $P(g(e_{it}) > g(e_{jt})) = P(g(e_{it}) < g(e_{jt}))$.

If, however, the loss differential is symmetrically distributed, then the null hypothesis of a zero-median loss differential corresponds precisely to the earlier null because in that case the median and mean are equal. Symmetry of the loss differential will obtain, for example, if the distributions of $g(e_{it})$ and $g(e_{jt})$ are the same up to a location shift. Symmetry is ultimately an empirical matter and may be assessed using standard procedures. We have found roughly symmetric loss-differential series to be quite common in practice.

The construction and intuition of a test statistic are straightforward. Assuming that the loss-differential series is iid (and we shall relax that assumption shortly), the number of positive loss-differential observations in a sample of size T has the binomial distribution with parameters T and 1/2 under the null hypothesis. The test statistic is therefore simply

$$S_2 = \sum_{t=1}^{T} I_{+}(d_t),$$

where

$$I_{+}(d_t) = \begin{cases} 1 & \text{if } d_t > 0 \\ 0 & \text{otherwise.} \end{cases}$$

Significance may be assessed using a table of the cumulative binomial distribution. In large samples, the studentized version of the sign-test statistic is standard normal:

$$S_{2a} = \frac{S_2 - .5T}{\sqrt{.25T}} \overset{a}{\sim} N(0, 1).$$

1.2.2 Wilcoxon's Signed-Rank Test. A related distribution-free procedure that requires symmetry of the loss differential (but can be more powerful than the sign test in that case) is Wilcoxon's signed-rank test. We again assume for the moment that the loss-differential series is iid. The test statistic is

$$S_3 = \sum_{t=1}^{T} I_{+}(d_t)\,\mathrm{rank}(|d_t|),$$

the sum of the ranks of the absolute values of the positive observations. The exact finite-sample critical values of the test statistic are invariant to the distribution of the loss differential (it need be only zero-mean and symmetric) and have been tabulated. Moreover, its studentized version is asymptotically standard normal,

$$S_{3a} = \frac{S_3 - \dfrac{T(T+1)}{4}}{\sqrt{\dfrac{T(T+1)(2T+1)}{24}}} \overset{a}{\sim} N(0, 1).$$

1.3 Discussion

Here we highlight some of the virtues and limitations of our tests. First, as we have stressed repeatedly, our tests are valid for a very wide class of loss functions. In particular, the loss function need not be quadratic and need not even be symmetric or continuous.

Second, a variety of realistic features of forecast errors are readily accommodated. The forecast errors can be nonzero-mean, non-Gaussian, and contemporaneously correlated. Allowance for contemporaneous correlation, in particular, is important because the forecasts being compared are forecasts of the same economic time series and because the information sets of forecasters are largely overlapping so that forecast errors tend to be strongly contemporaneously correlated.

Moreover, the asymptotic test statistic $S_1$ can of course handle a serially correlated loss differential. This is potentially important because, as discussed earlier, even optimal forecast errors are serially correlated in general. Serial correlation presents more of a problem for the exact finite-sample test statistics $S_2$ and $S_3$ and their asymptotic counterparts $S_{2a}$ and $S_{3a}$ because the elements of the set of all possible rearrangements of the sample loss differential series are not equally likely when the data are serially correlated, which violates the assumptions on which such randomization tests are based. Nevertheless, serial correlation may be handled via Bonferroni bounds, as suggested in a different context by Campbell and Ghysels (1995). Under the assumption that the forecast errors and hence the loss differential are (k - 1)-dependent, each of the following k sets of loss differentials will be free of serial correlation: $\{d_{ij,1}, d_{ij,1+k}, d_{ij,1+2k}, \ldots\}$, $\{d_{ij,2}, d_{ij,2+k}, d_{ij,2+2k}, \ldots\}$, ..., $\{d_{ij,k}, d_{ij,2k}, d_{ij,3k}, \ldots\}$. Thus, a test with size bounded by $\alpha$ can be obtained by performing k tests, each of size $\alpha/k$, on each of the k loss-differential sequences and rejecting the null hypothesis if the null is rejected for any of the k samples. Finally, it is interesting to note that, in multistep forecast comparisons, forecast-error serial correlation may be a "common feature," in the terminology of Engle and Kozicki (1993), because it is induced largely by the fact that the forecast horizon is longer than the interval at which the data are sampled and may therefore not be present in loss differentials even if present in the forecast errors themselves. This possibility can of course be checked empirically.

2. EXTANT TESTS

In this section we provide a brief description of three existing tests of forecast accuracy that have appeared in the literature and will be used in our subsequent Monte Carlo comparison.

2.1 The Simple F Test: A Naive Benchmark

If (1) loss is quadratic and (2) the forecast errors are (a) zero mean, (b) Gaussian, (c) serially uncorrelated, and (d) contemporaneously uncorrelated, then the null hypothesis of equal forecast accuracy corresponds to equal forecast error variances [by (1) and (2a)], and by (2b)-(2d), the ratio of sample variances has the usual F distribution under the null hypothesis. More precisely, the test statistic

$$F = \frac{e_i' e_i}{e_j' e_j}$$
is distributed as F(T, T), where the forecast error series have been stacked into the (T x 1) vectors $e_i$ and $e_j$.

Test statistic F is of little use in practice, however, because the conditions required to obtain its distribution are too restrictive. Assumption (2d) is particularly unpalatable for reasons discussed earlier. Its violation produces correlation between the numerator and denominator of F, which will not then have the F distribution.

2.2 The Morgan-Granger-Newbold Test

The contemporaneous correlation problem led Granger and Newbold (1977) to apply an orthogonalizing transformation due to Morgan (1939-1940) that enables relaxation of Assumption (2d). Let $x_t = (e_{it} + e_{jt})$ and $z_t = (e_{it} - e_{jt})$, and let $x = (e_i + e_j)$ and $z = (e_i - e_j)$. Then, under the maintained Assumptions (1) and (2a)-(2c), the null hypothesis of equal forecast accuracy is equivalent to zero correlation between x and z (i.e., $\rho_{xz} = 0$) and the test statistic

$$MGN = \frac{\hat{\rho}_{xz}}{\sqrt{\dfrac{1 - \hat{\rho}_{xz}^{2}}{T - 1}}}$$

is distributed as Student's t with T - 1 df, where

$$\hat{\rho}_{xz} = \frac{x'z}{\sqrt{(x'x)(z'z)}}$$

(e.g., see Hogg and Craig 1978, pp. 300-303).

Let us now consider relaxing the Assumptions (1) and (2a)-(2c) underlying the Morgan-Granger-Newbold (MGN) test. It is clear that the entire framework depends crucially on the assumption of quadratic loss (1), which cannot be relaxed. The remaining assumptions, however, can be weakened in varying degrees; we shall consider them in turn.

First, it is not difficult to relax the unbiasedness Assumption (2a), while maintaining Assumptions (1), (2b), and (2c). Second, the normality Assumption (2b) may be relaxed, while maintaining (1), (2a), and (2c), at the cost of substantial tedium involved with accounting for the higher-order moments that then enter the distribution of the sample correlation coefficient (e.g., see Kendall and Stuart 1979, chap. 26). Finally, the no-serial-correlation Assumption (2c) may be relaxed in addition to the no-contemporaneous-correlation Assumption (2d) while maintaining (1), (2a), and (2b), as discussed in Subsection 2.3.

2.3 The Meese-Rogoff Test

Under Assumptions (1), (2a), and (2b), Meese and Rogoff (1988) showed that

$$\sqrt{T}\,\hat{\gamma}_{xz} \xrightarrow{d} N(0, \Sigma),$$

where $\hat{\gamma}_{xz} = x'z/T$, $\Sigma = \sum_{\tau=-\infty}^{\infty}\left[\gamma_{xx}(\tau)\gamma_{zz}(\tau) + \gamma_{xz}(\tau)\gamma_{zx}(\tau)\right]$, $\gamma_{xz}(\tau) = \mathrm{cov}(x_t, z_{t-\tau})$, $\gamma_{zx}(\tau) = \mathrm{cov}(z_t, x_{t-\tau})$, $\gamma_{xx}(\tau) = \mathrm{cov}(x_t, x_{t-\tau})$, and $\gamma_{zz}(\tau) = \mathrm{cov}(z_t, z_{t-\tau})$. This is a well-known result (e.g., Priestley 1981, pp. 692-693) for the distribution of the sample cross-covariance function, $\mathrm{cov}(\hat{\gamma}_{xz}(s), \hat{\gamma}_{xz}(u))$, specialized to a displacement of 0.

A consistent estimator of $\Sigma$ is

$$\hat{\Sigma} = \sum_{\tau=-S(T)}^{S(T)}\left[1 - \frac{|\tau|}{S(T)}\right]\left[\hat{\gamma}_{xx}(\tau)\hat{\gamma}_{zz}(\tau) + \hat{\gamma}_{xz}(\tau)\hat{\gamma}_{zx}(\tau)\right],$$

where

$$\hat{\gamma}_{xz}(\tau) = \begin{cases} \dfrac{1}{T}\displaystyle\sum_{t=\tau+1}^{T} x_t z_{t-\tau} & \tau \ge 0 \\ \hat{\gamma}_{zx}(-\tau) & \text{otherwise,} \end{cases}$$

$$\hat{\gamma}_{zx}(\tau) = \begin{cases} \dfrac{1}{T}\displaystyle\sum_{t=\tau+1}^{T} z_t x_{t-\tau} & \tau \ge 0 \\ \hat{\gamma}_{xz}(-\tau) & \text{otherwise,} \end{cases}$$

$$\hat{\gamma}_{xx}(\tau) = \frac{1}{T}\sum_{t=|\tau|+1}^{T} x_t x_{t-|\tau|}, \qquad \hat{\gamma}_{zz}(\tau) = \frac{1}{T}\sum_{t=|\tau|+1}^{T} z_t z_{t-|\tau|},$$

and the truncation lag S(T) grows with the sample size but at a slower rate. Alternatively, following Diebold and Rudebusch (1991), one may use the closely related covariance matrix estimator,

$$\hat{\Sigma} = \sum_{\tau=-S(T)}^{S(T)}\left[\hat{\gamma}_{xx}(\tau)\hat{\gamma}_{zz}(\tau) + \hat{\gamma}_{xz}(\tau)\hat{\gamma}_{zx}(\tau)\right].$$

Either way, the test statistic is

$$MR = \frac{\hat{\gamma}_{xz}}{\sqrt{\hat{\Sigma}/T}}.$$

Under the null hypothesis and the maintained Assumptions (1), (2a), and (2b), MR (Meese-Rogoff) is asymptotically distributed as standard normal.

It is easy to show that, if the null hypothesis and Assumptions (1), (2a), (2b), and (2c) are satisfied, then all terms in $\Sigma$ are 0 except $\gamma_{xx}(0)$ and $\gamma_{zz}(0)$ so that MR coincides asymptotically with MGN. It is interesting to note also that reformulation of the test in terms of correlation rather than covariance would have enabled Meese and Rogoff to dispense with the normality assumption because the sample autocorrelations are asymptotically normal even for non-Gaussian time series (e.g., Brockwell and Davis 1992, pp. 221-222).

2.4 Additional Extensions

In Subsection 2.2, we considered relaxation of Assumptions (2a)-(2c), one at a time, while consistently maintaining Assumption (1) and consistently relaxing Assumption (2d). Simultaneous relaxation of multiple assumptions is possible within the MGN orthogonalizing transformation framework but much more tedious. The distribution theory required for joint relaxation of (2b) and (2c), for example, is complicated by the presence of fourth-order cumulants in the distribution of the sample autocovariances, as shown, for example, by Hannan (1970, p. 209) and Mizrach (1991). More importantly, however, any procedure based on the MGN orthogonalizing transformation is inextricably wed to the assumption of quadratic loss.
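As a compact illustration of the orthogonalizing transformation in Subsection 2.2, the sketch below computes the MGN statistic from two forecast-error series. It is a minimal illustration rather than the authors' code; the function name `mgn_statistic` and its return convention are assumptions, and the statistic is undefined when the sample correlation equals plus or minus 1.

```python
import math


def mgn_statistic(e_i, e_j):
    """Return (MGN statistic, degrees of freedom) for two forecast-error series.

    x = e_i + e_j and z = e_i - e_j; under quadratic loss, equal accuracy
    corresponds to zero correlation between x and z.
    """
    if len(e_i) != len(e_j) or len(e_i) < 2:
        raise ValueError("need two error series of equal length T >= 2")
    x = [a + b for a, b in zip(e_i, e_j)]
    z = [a - b for a, b in zip(e_i, e_j)]
    T = len(x)
    xz = sum(a * b for a, b in zip(x, z))
    xx = sum(a * a for a in x)
    zz = sum(b * b for b in z)
    rho_hat = xz / math.sqrt(xx * zz)           # sample correlation (no mean correction)
    stat = rho_hat / math.sqrt((1.0 - rho_hat ** 2) / (T - 1))
    return stat, T - 1                          # compare with Student's t, T - 1 df


# Made-up errors: forecast i has the larger squared errors, so the statistic is positive.
stat, df = mgn_statistic([2.0, -2.0, 1.0, -3.0], [1.0, 1.0, -1.0, 0.0])
```

The returned statistic would be compared with critical values of Student's t with the returned degrees of freedom; no p-value is computed here because the Python standard library has no t distribution.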
3. MONTE CARLO ANALYSIS

3.1 Experimental Design

We evaluate the finite-sample size of test statistics F, MGN, MR, $S_1$, $S_2$, $S_{2a}$, $S_3$, and $S_{3a}$ under the null hypothesis and various of the maintained assumptions. The design includes a variety of specifications of forecast-error contemporaneous correlation, forecast-error serial correlation, and forecast-error distributions. To maintain applicability of all test statistics for comparison purposes, we use quadratic loss; that is, the null hypothesis is an equality of MSPE's. We emphasize again, however, that an important advantage of test statistics $S_1$, $S_2$, $S_{2a}$, $S_3$, and $S_{3a}$ in substantive economic applications, and one not shared by the others, is their direct applicability to analyses with nonquadratic loss functions.

Consider first the case of Gaussian forecast errors. We draw realizations of the bivariate forecast-error process, $\{e_{it}, e_{jt}\}_{t=1}^{T}$, with varying degrees of contemporaneous and serial correlation in the generated forecast errors. This is achieved in two steps. First, we build in the desired degree of contemporaneous correlation by drawing a (2 x 1) forecast error innovation vector $u_t$ from a bivariate standard normal distribution, $u_t \sim N(0_2, I_2)$, and then premultiplying by the Choleski factor of the desired contemporaneous innovation correlation matrix. Let the desired correlation matrix be

$$R = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}, \qquad \rho \in [0, 1).$$

Then the Choleski factor is

$$P = \begin{bmatrix} 1 & 0 \\ \rho & \sqrt{1 - \rho^{2}} \end{bmatrix}.$$

Thus, the transformed (2 x 1) vector $v_t = P u_t \sim N(0_2, R)$. This operation is repeated T times, yielding $\{v_{it}, v_{jt}\}_{t=1}^{T}$.

Second, (moving average) MA(1) serial correlation (with parameter $\theta$) is introduced by taking

$$\begin{bmatrix} e_{it} \\ e_{jt} \end{bmatrix} = \begin{bmatrix} \dfrac{1 + \theta L}{\sqrt{1 + \theta^{2}}} & 0 \\ 0 & \dfrac{1 + \theta L}{\sqrt{1 + \theta^{2}}} \end{bmatrix} \begin{bmatrix} v_{it} \\ v_{jt} \end{bmatrix}, \qquad t = 1, \ldots, T.$$

We use $v_0 = 0$. Multiplication by $(1 + \theta^{2})^{-1/2}$ is done to keep the unconditional variance normalized to 1.

We consider sample sizes of T = 8, 16, 32, 64, 128, 256, and 512, contemporaneous correlation parameters of $\rho$ = 0, .5, and .9, and MA parameters of $\theta$ = 0, .5, .9. Simple calculations reveal that $\rho$ is not only the correlation between $v_i$ and $v_j$ but also the correlation between the forecast errors $e_i$ and $e_j$ so that varying the correlation of $v_i$ and $v_j$ through [0, .9] effectively varies the correlation of the observed forecast errors through the same range.

We also consider non-Gaussian forecast errors. The design is the same as for the Gaussian case described previously but driven by fat-tailed variates $(u_{it}^{*}, u_{jt}^{*})'$ [rather than $(u_{it}, u_{jt})'$], which are independent standardized t random variables with 6 df. The variance of a t(6) random variable is 3/2. Thus, standardization amounts to dividing the t(6) random variable by $\sqrt{3/2}$.

Throughout, we perform tests at the $\alpha$ = .1 level. When using the exact sign and signed-rank tests, restriction of nominal size to precisely 10% is impossible (without introducing randomization), so we use the obtainable exact size closest to 10%, as specified in the tables. We perform at least 5,000 Monte Carlo replications. The truncation lag is set at 1, reflecting the fact that the experiment is designed to mimic the comparison of two-step-ahead forecast errors, with associated MA(1) structure.

3.2 Results

Results appear in Tables 1-6, which show the empirical size of the various test statistics in cases of Gaussian and non-Gaussian forecast errors as the degree of contemporaneous correlation, the degree of serial correlation, and sample size are varied.

Let us first discuss the case of Gaussian forecast errors. The results may be summarized as follows:

1. F is correctly sized in the absence of both contemporaneous and serial correlation but is missized in the presence of either contemporaneous or serial correlation. Serial correlation pushes empirical size above nominal size, but contemporaneous correlation pushes empirical size drastically below nominal size. In combination, and particularly for large $\rho$ and $\theta$, contemporaneous correlation dominates and F is undersized.
2. MGN is designed to remain unaffected by contemporaneous correlation and therefore remains correctly sized so long as $\theta$ = 0. Serial correlation, however, pushes empirical size above nominal size.
3. As expected, MR is robust to contemporaneous and serial correlation in large samples, but it is oversized in small samples in the presence of serial correlation. The asymptotic distribution obtains rather quickly, however, resulting in approximately correct size for T > 64.
4. The behavior of $S_1$ is similar to that of MR. $S_1$ is robust to contemporaneous and serial correlation in large samples, but it is oversized in small samples, with nominal and empirical size converging a bit more slowly than for MR.
5. The Bonferroni bounds associated with $S_2$ and $S_3$ work well, with nominal and empirical size in close agreement throughout. Moreover, the asymptotics on which $S_{2a}$ and $S_{3a}$ depend obtain quickly.

Now consider the case of non-Gaussian forecast errors. The striking and readily apparent result is that F, MGN, and MR are drastically missized in large as well as small samples. $S_1$, $S_{2a}$, and $S_{3a}$, on the other hand, maintain approximately correct size for all but the very small sample sizes. In those cases, $S_2$ and $S_3$ continue to perform well. The results are well summarized by Figure 1, p. 261, which charts the dependence of F, MGN, MR, and $S_1$ on T for the non-Gaussian case with $\rho$ = $\theta$ = .5.
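The two-step Gaussian design of Subsection 3.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function name `simulate_forecast_errors`, the use of Python's `random.Random`, and the seed are all choices made here for the example.

```python
import math
import random


def simulate_forecast_errors(T, rho, theta, rng):
    """Draw T pairs (e_it, e_jt) with contemporaneous correlation rho and MA(1) parameter theta."""
    scale = 1.0 / math.sqrt(1.0 + theta ** 2)   # keeps the unconditional variance at 1
    v_prev = (0.0, 0.0)                         # v_0 = 0, as in the text
    e_i, e_j = [], []
    for _ in range(T):
        # Step 1: v_t = P u_t, with P the Choleski factor of [[1, rho], [rho, 1]].
        u1 = rng.gauss(0.0, 1.0)
        u2 = rng.gauss(0.0, 1.0)
        v = (u1, rho * u1 + math.sqrt(1.0 - rho ** 2) * u2)
        # Step 2: MA(1) filter, e_t = (v_t + theta * v_{t-1}) / sqrt(1 + theta^2).
        e_i.append(scale * (v[0] + theta * v_prev[0]))
        e_j.append(scale * (v[1] + theta * v_prev[1]))
        v_prev = v
    return e_i, e_j


rng = random.Random(0)                          # arbitrary seed for reproducibility
e_i, e_j = simulate_forecast_errors(T=64, rho=0.5, theta=0.5, rng=rng)
```

Repeating such draws many times and recording how often each test rejects at the chosen level gives an empirical size of the kind reported in the tables. The fat-tailed design would replace the Gaussian draws with standardized t(6) variates, which this sketch does not implement.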
Table 1. Empirical Size Under Quadratic Loss, Test Statistic F
Gaussian Fat-tailed
T ρ θ=.0 θ=.5 θ=.9 θ=.0 θ=.5 θ=.9

8 .0 9.85 12.14 14.10 14.28 15.76 17.21
8 .5 7.02 9.49 11.42 9.61 11.64 13.02
8 .9 .58 1.26 1.86 .57 1.13 1.79
16 .0 9.83 12.97 14.85 16.47 18.59 19.78
16 .5 7.30 10.11 11.89 11.14 13.55 14.94
16 .9 .47 .99 1.55 .34 .70 1.13
32 .0 9.88 12.68 14.34 18.06 19.55 20.35
32 .5 6.98 9.50 11.22 21.30 21.00 21.37
32 .9 .23 .55 1.00 .01 .07 .23
64 .0 9.71 13.05 14.62 29.84 29.72 29.96
64 .5 6.48 9.25 10.62 23.48 23.93 24.15
64 .9 .16 .47 .79 .02 .12 .29
128 .0 10.30 13.41 14.99 30.34 30.95 31.26
128 .5 7.01 10.13 11.64 24.89 25.01 25.16
128 .9 .16 .50 .74 .11 .44 .73
256 .0 10.01 13.05 14.65 31.07 31.12 31.24
256 .5 7.37 10.31 11.78 25.48 25.45 25.70
256 .9 .19 .51 .80 .51 1.13 1.44
512 .0 10.22 13.51 15.25 31.45 32.38 32.60
512 .5 7.53 10.16 11.49 26.35 26.92 16.95
512 .9 .18 .50 .85 .81 1.58 2.06
NOTE: T is sample size, ρ is the contemporaneous correlation between the innovations underlying the forecast errors, and θ is the coefficient of the MA(1) forecast error. All tests are at the 10% level. 10,000 Monte Carlo replications are performed.
Table 2. Empirical Size Under Quadratic Loss, Test Statistic MGN
Gaussian Fat-tailed
T ρ θ=.0 θ=.5 θ=.9 θ=.0 θ=.5 θ=.9
8 .0 10.19 14.14 17.94 18.10 21.89 25.65

8 .5 9.96 14.66 18.61 16.00 20.51 24.19
8 .9 9.75 14.53 18.67 11.76 16.31 20.00
16 .0 10.07 14.34 17.54 20.33 24.54 27.08
16 .5 9.56 14.37 17.95 37.15 36.18 25.66
16 .9 10.02 14.70 18.20 12.01 16.76 19.81
32 .0 9.89 15.04 18.00 22.94 26.32 28.72
32 .5 10.08 15.11 17.95 20.23 23.76 26.20
32 .9 9.59 15.32 18.25 12.75 17.78 20.54
64 .0 10.09 15.37 17.99 24.56 28.15 30.00
64 .5 9.95 15.18 18.15 21.10 25.18 27.28
64 .9 10.26 15.67 18.49 12.98 18.09 20.53
128 .0 9.96 15.09 17.59 26.47 29.50 30.94
128 .5 10.23 15.07 17.48 23.62 26.82 28.51
128 .9 10.11 15.05 18.05 14.34 18.89 21.56
256 .0 10.28 15.62 18.37 27.39 30.74 32.46
256 .5 10.60 16.02 18.44 23.81 28.38 30.31
256 .9 10.11 15.48 17.91 14.15 19.43 22.03
512 .0 10.12 15.34 17.68 27.64 30.55 32.14
512 .5 10.05 14.96 17.66 24.10 27.40 29.28
512 .9 9.90 15.09 17.53 14.78 19.16 21.49
NOTE: T is sample size, ρ is the contemporaneous correlation between the innovations underlying the forecast errors, and θ is the coefficient of the MA(1) forecast error. All tests are at the 10% level. 10,000 Monte Carlo replications are performed.
Table 3. Empirical Size Under Quadratic Loss, Test Statistic MR
Gaussian Fat-tailed
T ρ θ=.0 θ=.5 θ=.9 θ=.0 θ=.5 θ=.9
8 .0 9.67 19.33 22.45 16.16 25.26 27.62
8 .5 9.50 19.00 22.07 14.81 24.50 26.99
8 .9 9.66 19.51 22.85 11.23 21.28 24.14
16 .0 9.62 13.92 14.72 19.94 22.56 23.06
16 .5 10.02 13.88 14.96 17.70 21.04 21.26
16 .9 10.04 13.82 14.94 11.76 15.68 16.70
32 .0 9.96 10.98 11.12 22.78 22.86 21.72
32 .5 9.68 11.46 11.66 19.78 20.32 20.14
32 .9 9.86 11.62 11.96 12.42 13.54 13.46
64 .0 10.32 11.02 11.04 24.50 22.60 21.58
64 .5 9.84 10.56 10.64 21.44 19.48 18.84
64 .9 9.58 10.58 10.34 13.38 13.38 13.20
128 .0 9.78 10.54 10.44 25.86 22.90 21.54
128 .5 10.02 11.04 11.18 22.76 20.26 19.44
128 .9 10.76 11.28 11.38 13.44 13.52 12.92
256 .0 10.04 9.90 9.58 27.16 23.74 22.70
256 .5 10.32 9.92 9.82 24.00 20.50 19.18
256 .9 9.92 10.16 10.34 13.38 12.70 12.24
512 .0 9.94 10.48 10.56 26.92 23.40 21.78
512 .5 9.52 10.56 10.48 23.56 20.52 19.36
512 .9 9.80 9.82 9.88 13.96 12.98 12.74
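NOTE: T is sample size, ρ is the contemporaneous correlation between the innovations underlying the forecast errors, and θ is the coefficient of the MA(1) forecast error. All tests are at the 10% level. At least 5,000 Monte Carlo replications are performed.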
Table 4. Empirical Size Under Quadratic Loss, Test Statistic S1
Gaussian Fat-tailed
T ρ θ=.0 θ=.5 θ=.9 θ=.0 θ=.5 θ=.9

8 .0 31.39 31.10 31.03 31.62 29.51 29.07
8 .5 31.37 30.39 29.93 31.21 29.71 29.36
8 .9 31.08 30.19 30.18 31.18 30.12 29.75
16 .0 20.39 19.11 18.94 19.26 18.50 18.32
16 .5 20.43 19.52 18.86 19.57 17.67 17.63
16 .9 20.90 19.55 19.59 20.15 18.38 18.16
32 .0 12.42 12.28 12.18 11.30 11.64 11.56
32 .5 13.32 13.22 12.94 11.54 10.66 10.84
32 .9 12.60 13.38 13.22 11.16 11.22 11.50
64 .0 12.47 12.11 11.94 12.44 11.62 11.36
64 .5 12.76 12.49 12.35 12.10 12.26 12.10
64 .9 12.21 12.23 12.03 13.00 12.36 12.16
128 .0 11.72 11.94 12.04 11.48 10.72 10.28
128 .5 11.44 11.72 11.60 10.84 10.96 10.96
128 .9 11.76 11.26 11.34 11.50 10.66 10.86
256 .0 11.11 10.65 10.66 12.06 11.67 11.79
256 .5 10.90 10.39 10.48 12.16 11.46 11.60
256 .9 10.69 10.79 10.75 11.51 11.59 11.16
512 .0 11.15 10.67 10.63 10.06 9.46 9.62
512 .5 10.90 10.39 10.49 9.94 9.66 9.76
512 .9 10.31 10.09 10.05 10.12 10.12 10.06
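NOTE: T is sample size, ρ is the contemporaneous correlation between the innovations underlying the forecast errors, and θ is the coefficient of the MA(1) forecast error. All tests are at the 10% level. At least 5,000 Monte Carlo replications are performed.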
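For readers who want to see how the statistic S1 of Subsection 1.1 might be computed, the following minimal sketch uses the rectangular lag window with truncation lag k - 1 for k-step-ahead forecasts. It is an illustration, not the authors' implementation; the function name `dm_s1`, the two-sided p-value, and the way a nonpositive variance estimate is reported as an automatic rejection are assumptions made here.

```python
import math
from statistics import NormalDist


def dm_s1(d, k):
    """Return (S1, two-sided asymptotic p-value) for a loss-differential series d.

    k is the forecast horizon; autocovariances up to lag k - 1 receive unit weight.
    """
    T = len(d)
    d_bar = sum(d) / T

    def gamma_hat(tau):
        # Sample autocovariance at displacement tau, with divisor T.
        return sum((d[t] - d_bar) * (d[t - tau] - d_bar) for t in range(tau, T)) / T

    # Estimate of 2*pi*f_d(0): gamma(0) plus twice the included autocovariances.
    long_run_var = gamma_hat(0) + 2.0 * sum(gamma_hat(tau) for tau in range(1, k))
    if long_run_var <= 0.0:
        # Following the text, a nonpositive estimate is treated as 0 and the null is rejected.
        return math.copysign(math.inf, d_bar), 0.0
    s1 = d_bar / math.sqrt(long_run_var / T)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(s1)))
    return s1, p_value


# Made-up loss differentials for a one-step-ahead comparison (k = 1).
s1, p = dm_s1([1.0, 2.0, 3.0, 4.0], k=1)
```

With k = 2, as in the Monte Carlo design, only the first sample autocovariance enters the variance estimate in addition to the sample variance.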
Table 5. Empirical Size Under Quadratic Loss, Test Statistics S2 and S2a
Gaussian Fat-tailed
T ρ θ=.0 θ=.5 θ=.9 θ=.0 θ=.5 θ=.9

S2, nominal size = 25%
8 .0 22.24 22.48 22.38 23.94 23.46 23.34
8 .5 22.14 23.46 22.16 23.08 24.80 23.06
8 .9 22.24 23.02 22.66 22.92 23.26 22.86
S2, nominal size = 14.08%
16 .0 13.46 13.26 13.14 13.62 13.06 13.76
16 .5 14.22 13.46 12.92 13.70 13.24 13.62
16 .9 13.08 13.84 13.28 12.86 13.06 13.20
S2, nominal size = 15.36%
32 .0 14.36 14.52 14.28 14.54 14.32 14.30
32 .5 14.36 14.06 13.94 15.08 14.36 15.02
32 .9 14.68 14.62 13.46 14.94 14.76 14.52
S2a, nominal size = 10%
64 .0 9.72 9.92 9.42 9.68 10.36 10.44
64 .5 9.66 10.34 9.68 9.52 10.06 10.00
64 .9 10.84 9.46 10.34 9.40 8.98 10.02

128 .0 11.62 11.62 11.84 12.22 12.20 11.42
128 .5 11.66 11.62 11.90 12.06 11.94 11.44
128 .9 11.22 11.72 11.28 12.06 10.76 11.40

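NOTE: T is sample size, ρ is the contemporaneous correlation between the innovations underlying the forecast errors, and θ is the coefficient of the MA(1) forecast error. At least 5,000 Monte Carlo replications are performed.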
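The obtainable exact sizes in the table headings can be illustrated with a short calculation. The sketch below assumes, and the table does not state this, that each heading is a Bonferroni bound formed from k = 2 two-sided sign tests, one on each subsample of T/2 loss differentials; the function names and the critical values c are hypothetical choices made here. Under that assumption the calculation gives 25% for T = 8 and about 15.36% for T = 32.

```python
from math import comb


def sign_test_exact_size(n, c):
    """Size of the two-sided sign test that rejects when S2 <= c or S2 >= n - c.

    Under the null, S2 is binomial with parameters n and 1/2; requires c < n / 2.
    """
    if not 0 <= c < n / 2:
        raise ValueError("c must satisfy 0 <= c < n / 2")
    lower_tail = sum(comb(n, s) for s in range(c + 1)) / 2 ** n
    return 2.0 * lower_tail


def bonferroni_size_bound(T, k, c):
    """Bonferroni bound on size when k subsample tests of equal exact size are combined."""
    return k * sign_test_exact_size(T // k, c)


bound_T8 = bonferroni_size_bound(T=8, k=2, c=0)     # subsamples of 4, reject at S2 in {0, 4}
bound_T32 = bonferroni_size_bound(T=32, k=2, c=4)   # subsamples of 16, reject at S2 <= 4 or >= 12
```

Because the binomial distribution is discrete, only a coarse grid of sizes is attainable in small samples, which is why the nominal size cannot be set to exactly 10% without randomization.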
Table 6. Empirical Size Under Quadratic Loss, Test Statistics S3 and S3a
T ρ Gaussian: θ=.0 θ=.5 θ=.9 Fat-tailed: θ=.0 θ=.5 θ=.9

S3, nominal size = 25%

8 .0 22.50 22.92 22.90 23.26 23.34 21.96
8 .5 22.98 22.26 23.06 23.42 23.86 22.88
8 .9 23.16 22.36 24.24 24.26 23.32 23.34
S3, nominal size = 10.92%
16 .0 10.62 10.06 10.40 10.16 10.42 9.84
16 .5 10.38 10.92 10.32 10.54 10.94 10.34
16 .9 10.64 10.18 9.62 10.58 10.96 10.64
S3, nominal size = 10.12%
32 .0 10.72 10.28 9.30 9.90 10.00 9.98
32 .5 10.56 10.00 10.02 10.40 10.64 10.30
32 .9 10.92 10.44 10.30 10.46 9.96 10.70
S3a, nominal size = 10%

64 .0 9.38 9.54 9.16 9.64 9.24 8.84
64 .5 9.80 10.02 9.66 9.58 8.82 8.78
64 .9 9.90 9.24 9.68 9.92 9.78 10.00
128 .0 9.94 9.70 9.12 9.82 9.04 8.46
128 .5 9.52 10.00 9.32 10.08 9.24 9.20
128 .9 9.46 9.64 9.42 9.28 9.22 9.26

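NOTE: T is sample size, ρ is the contemporaneous correlation between the innovations underlying the forecast errors, and θ is the coefficient of the MA(1) forecast error. At least 5,000 Monte Carlo replications are performed.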
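
For concreteness, the signed-rank statistic S3 (the sum of the ranks of the absolute loss differentials over observations with a positive loss differential) and its studentized version S3a can be computed as in the short Python sketch below. The function name is hypothetical, and the sketch assumes there are no ties and no zeros among the absolute loss differentials. The centering T(T+1)/4 and variance T(T+1)(2T+1)/24 are the standard null moments of the signed-rank statistic.

```python
import math


def signed_rank_statistics(d):
    """Return (S3, S3a) for a loss-differential series d.

    S3 sums rank(|d_t|) over observations with d_t > 0. S3a centers and
    scales S3 by its null mean and standard deviation and is compared with
    the standard normal. Assumes no ties and no zeros among the |d_t|.
    """
    T = len(d)
    order = sorted(range(T), key=lambda t: abs(d[t]))
    rank = [0] * T
    for r, t in enumerate(order, start=1):
        rank[t] = r  # rank 1 goes to the smallest |d_t|
    s3 = sum(rank[t] for t in range(T) if d[t] > 0)
    mean = T * (T + 1) / 4.0
    var = T * (T + 1) * (2 * T + 1) / 24.0
    return s3, (s3 - mean) / math.sqrt(var)
```

With d = [0.5, -1.0, 2.0, -0.25], the positive entries carry ranks 2 and 4, so S3 = 6 against a null mean of 5.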
[Figure 1. Empirical Size, Four Test Statistics: Fat-Tailed Case; Theta = Rho = .5. Horizontal axis: sample size, 8 to 512.]

4. AN EMPIRICAL EXAMPLE

We shall illustrate the practical use of the tests with an application to exchange-rate
forecasting. The series to be forecast, measured monthly, is the three-month change in the
nominal dollar/Dutch guilder end-of-month spot exchange rate (in U.S. cents, noon, New York
interbank), from 1977.01 to 1991.12. We assess two forecasts, the "no change" (0) forecast
associated with a random-walk model and the forecast implicit in the three-month forward rate
(the difference between the three-month forward rate and the spot rate).

The actual and predicted changes are shown in Figure 2. The random-walk forecast, of course, is
just constant at 0, whereas the forward market forecast moves over time. The movements in both
forecasts, however, are dwarfed by the [...] forecast; as one hears so often, "The random walk
wins." The loss-differential series is shown in Figure 3, in which no obvious nonstationarities
are visually apparent. Approximate stationarity is also supported by the sample autocorrelation
function of the loss differential, shown in Figure 4, which decays quickly.

[Figure 2. Actual and Predicted Exchange-Rate Changes. The solid line is the actual exchange-rate change. The short dashed line is the predicted change from the random-walk model, and the long dashed line is the predicted change implied by the forward rate. Horizontal axis: time, 1977 to 1991.]

[Figure 3. Loss Differential (forward - random walk). Horizontal axis: time.]

[Figure 4. Loss Differential Autocorrelations. The first eight sample autocorrelations are graphed, together with Bartlett's approximate 95% confidence interval. Horizontal axis: displacement.]

Because the forecasts are three-step-ahead, our earlier arguments suggest the need to allow for
at least two-dependent forecast errors, which may translate into a two-dependent loss
differential. This intuition is confirmed by the sample autocorrelation function of the loss
differential, in which sizable and significant sample autocorrelations appear at lags 1 and 2
and nowhere else. The Box-Pierce χ² test of jointly zero autocorrelations at lags 1 through 15
is 51.12, which is highly significant relative to its asymptotic null distribution of χ²₁₅.
Conversely, the Box-Pierce χ² test of jointly zero autocorrelations at lags 3 through 15 is
12.79, which is insignificant relative to its null distribution of χ²₁₃.

We now proceed to test the null of equal expected loss. F, MGN, and MR are inapplicable because
one or more of their
maintained assumptions are explicitly violated. We therefore focus on our test statistic S1,
setting the truncation lag at two in light of the preceding discussion. We obtain S1 = -1.3,
implying a p value of .19. Thus, for the sample at hand, we do not reject at conventional levels
the hypothesis of equal expected absolute error: the forward rate is not a statistically
significantly worse predictor of the future spot rate than is the current spot rate.

5. CONCLUSIONS AND DIRECTIONS FOR FUTURE RESEARCH

We have proposed several tests of the null hypothesis of equal forecast accuracy. We allow the
forecast errors to be non-Gaussian, nonzero mean, serially correlated, and contemporaneously
correlated. Perhaps most importantly, our tests are applicable under a very wide variety of loss
structures.

We hasten to add that comparison of forecast accuracy is but one of many diagnostics that should
be examined when comparing models. Moreover, the superiority of a particular model in terms of
forecast accuracy does not necessarily imply that forecasts from other models contain no
additional information. That, of course, is the well-known message of the forecast combination
and encompassing literatures; see, for example, Clemen (1989), Chong and Hendry (1986), and Fair
and Shiller (1990).

Several extensions of the results presented here appear to be promising directions for future
research. Some are obvious, such as generalization to comparison of more than two forecasts or,
perhaps most generally, multiple forecasts for each of multiple variables. Others are less
obvious and more interesting. We shall list just a few:

1. Our framework may be broadened to examine not only whether forecast loss differentials have
nonzero mean but also whether other variables may explain loss differentials. For example, one
could regress the loss differential not only on a constant but also on a "stage of the business
cycle" indicator to assess the extent to which relative predictive performance differs over the
cycle.
2. The ability to formally compare predictive accuracy afforded by our tests may prove useful as
a model-specification diagnostic, as well as a means to test both nested and nonnested
hypotheses under nonstandard conditions, in the tradition of Ashley, Granger, and Schmalensee
(1980) and Mariano and Brown (1983).
3. Explicit account may be taken of the effects of uncertainty associated with estimated model
parameters on the behavior of the test statistics, as shown by West (1994).

Let us provide some examples of the ideas sketched in 2. First, consider the development of a
test of exclusion restrictions in time series regression that is valid regardless of whether the
data are stationary or cointegrated. The desirability of such a test is apparent from works like
those of Stock and Watson (1989), Christiano and Eichenbaum (1990), Rudebusch (1993), and Toda
and Phillips (1993), in which it is simultaneously apparent that (a) it is difficult to
determine reliably the integration status of macroeconomic time series and (b) the conclusions
of macroeconometric studies are often critically dependent on the integration status of the
relevant time series. One may proceed by noting that tests of exclusion restrictions amount to
comparisons of restricted and unrestricted sums of squares. This suggests estimating the
restricted and unrestricted models using part of the available data and then using our test of
equality of the mean squared errors of the respective one-step-ahead forecasts.

As a second example, it would appear that our test is applicable in nonstandard testing
situations, such as when a nuisance parameter is not identified under the null. This occurs, for
example, when testing for the appropriate number of states in Hamilton's (1989) Markov-switching
model. In spite of the fact that standard tests are inapplicable, certainly the null and
alternative models may be estimated and their out-of-sample forecasting performance compared
rigorously, as shown by Engel (1994).

In closing, we note that this article is part of a larger research program aimed at doing model
selection, estimation, prediction, and evaluation using the relevant loss function, whatever
that loss function may be. This article has addressed evaluation. Granger (1969) and
Christoffersen and Diebold (1994) addressed prediction. These results, together with those of
Weiss and Andersen (1984) and Weiss (1991, 1994) on estimation under the relevant loss function,
will make feasible recursive, real-time, prediction-based model selection under the relevant
loss function.

ACKNOWLEDGMENTS

We thank the editor, associate editor, and two referees for constructive comments. Seminar
participants at Chicago, Cornell, the Federal Reserve Board, London School of Economics,
Maryland, the Model Comparison Seminar, Oxford, Pennsylvania, Pittsburgh, and Santa Cruz
provided helpful input, as did Rob Engle, Jim Hamilton, Hashem Pesaran, Ingmar Prucha, Peter
Robinson, and Ken West, but all errors are ours alone. Portions of this article were written
while the first author visited the Financial Markets Group at the London School of Economics,
whose hospitality is gratefully acknowledged. Financial support from the National Science
Foundation, the Sloan Foundation, and the University of Pennsylvania Research Foundation is
gratefully acknowledged. Ralph Bradley, José A. Lopez, and Gretchen Weinbach provided research
assistance.

[Received March 1994. Revised December 1994.]

REFERENCES

Andrews, D. W. K. (1991), "Heteroskedasticity and Autocorrelation Consistent Covariance Matrix Estimation," Econometrica, 59, 817-858.
Ashley, R. (1994), "Postsample Model Validation and Inference Made Feasible," unpublished manuscript, Virginia Polytechnic Institute, Dept. of Economics.
Ashley, R., Granger, C. W. J., and Schmalensee, R. (1980), "Advertising and Aggregate Consumption: An Analysis of Causality," Econometrica, 48, 1149-1167.
Brockwell, P. J., and Davis, R. A. (1992), Time Series: Theory and Methods (2nd ed.), New York: Springer-Verlag.
Campbell, B., and Ghysels, E. (1995), "Is the Outcome of the Federal Budget Process Unbiased and Efficient? A Nonparametric Assessment," Review of Economics and Statistics, 77, 17-31.
Chinn, M., and Meese, R. A. (1991), "Banking on Currency Forecasts: Is Change in Money Predictable?" unpublished manuscript, University of California, Berkeley, Graduate School of Business.
Chong, Y. Y., and Hendry, D. F. (1986), "Econometric Evaluation of Linear Macroeconomic Models," Review of Economic Studies, 53, 671-690.
Christiano, L., and Eichenbaum, M. (1990), "Unit Roots in Real GNP: Do We Know, and Do We Care?" Carnegie-Rochester Conference Series on Public Policy, 32, 7-61.
Christoffersen, P., and Diebold, F. X. (1994), "Optimal Prediction Under Asymmetric Loss," Technical Working Paper 167, National Bureau of Economic Research, Cambridge, MA.
Clemen, R. T. (1989), "Combining Forecasts: A Review and Annotated Bibliography" (with discussion), International Journal of Forecasting, 5, 559-583.
Clements, M. P., and Hendry, D. F. (1993), "On the Limitations of Comparing Mean Square Forecast Errors" (with discussion), Journal of Forecasting, 12, 617-676.
Cumby, R. E., and Modest, D. M. (1987), "Testing for Market Timing Ability: A Framework for Forecast Evaluation," Journal of Financial Economics, 19, 169-189.
Diebold, F. X., and Rudebusch, G. D. (1991), "Forecasting Output with the Composite Leading Index: An Ex Ante Analysis," Journal of the American Statistical Association, 86, 603-610.
Dhrymes, P. J., Howrey, E. P., Hymans, S. H., Kmenta, J., Leamer, E. E., Quandt, R. E., Ramsey, J. B., Shapiro, H. T., and Zarnowitz, V. (1972), "Criteria for Evaluation of Econometric Models," Annals of Economic and Social Measurement, 1, 291-324.
Engel, C. (1994), "Can the Markov Switching Model Forecast Exchange Rates?" Journal of International Economics, 36, 151-165.
Engle, R. F., and Kozicki, S. (1993), "Testing for Common Features," Journal of Business & Economic Statistics, 11, 369-395.
Fair, R. C., and Shiller, R. J. (1990), "Comparing Information in Forecasts From Econometric Models," American Economic Review, 80, 375-389.
Granger, C. W. J. (1969), "Prediction With a Generalized Cost of Error Function," Operational Research Quarterly, 20, 199-207.
Granger, C. W. J., and Newbold, P. (1977), Forecasting Economic Time Series, Orlando, FL: Academic Press.
Hamilton, J. D. (1989), "A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle," Econometrica, 57, 357-384.
Hannan, E. J. (1970), Multiple Time Series, New York: John Wiley.
Hogg, R. V., and Craig, A. T. (1978), Introduction to Mathematical Statistics (4th ed.), New York: MacMillan.
Howrey, E. P., Klein, L. R., and McCarthy, M. D. (1974), "Notes on Testing the Predictive Performance of Econometric Models," International Economic Review, 15, 366-383.
Kendall, M., and Stuart, A. (1979), The Advanced Theory of Statistics (Vol. 2, 4th ed.), New York: Oxford University Press.
Lehmann, E. L. (1975), Nonparametrics: Statistical Methods Based on Ranks, San Francisco: Holden-Day.
Leitch, G., and Tanner, J. E. (1991), "Econometric Forecast Evaluation: Profits Versus the Conventional Error Measures," American Economic Review, 81, 580-590.
Mariano, R. S., and Brown, B. W. (1983), "Prediction-Based Test for Misspecification in Nonlinear Simultaneous Systems," in Studies in Econometrics, Time Series and Multivariate Statistics, Essays in Honor of T. W. Anderson, eds. T. Amemiya, S. Karlin, and L. Goodman, New York: Academic Press, pp. 131-151.
Mark, N. (1995), "Exchange Rates and Fundamentals: Evidence on Long-Horizon Predictability," American Economic Review, 85, 201-218.
McCulloch, R., and Rossi, P. E. (1990), "Posterior, Predictive, and Utility-Based Approaches to Testing the Arbitrage Pricing Theory," Journal of Financial Economics, 28, 7-38.
Meese, R. A., and Rogoff, K. (1988), "Was it Real? The Exchange Rate-Interest Differential Relation Over the Modern Floating-Rate Period," Journal of Finance, 43, 933-948.
Mizrach, B. (1991), "Forecast Comparison in L2," unpublished manuscript, Wharton School, University of Pennsylvania, Dept. of Finance.
Morgan, W. A. (1939-1940), "A Test for Significance of the Difference Between the Two Variances in a Sample From a Normal Bivariate Population," Biometrika, 31, 13-19.
Newey, W., and West, K. (1987), "A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix," Econometrica, 55, 703-708.
Priestley, M. B. (1981), Spectral Analysis and Time Series, New York: Academic Press.
Rudebusch, G. D. (1993), "The Uncertain Trend in U.S. Real GNP," American Economic Review, 83, 264-272.
Stock, J. H., and Watson, M. W. (1989), "Interpreting the Evidence on Money-Income Causality," Journal of Econometrics, 40, 161-181.
Toda, H. Y., and Phillips, P. C. B. (1993), "Vector Autoregression and Causality," Econometrica, 61, 1367-1393.
Vuong, Q. H. (1989), "Likelihood Ratio Tests for Model Selection and Non-nested Hypotheses," Econometrica, 57, 307-334.
Weiss, A. A. (1991), "Multi-step Estimation and Forecasting in Dynamic Models," Journal of Econometrics, 48, 135-149.
Weiss, A. A. (1994), "Estimating Time Series Models Using the Relevant Cost Function," unpublished manuscript, University of Southern California, Dept. of Economics.
Weiss, A. A., and Andersen, A. P. (1984), "Estimating Forecasting Models Using the Relevant Forecast Evaluation Criterion," Journal of the Royal Statistical Society, Ser. A, 137, 484-487.
West, K. D. (1994), "Asymptotic Inference About Predictive Ability," SSRI Working Paper 9417, University of Wisconsin-Madison, Dept. of Economics.
West, K. D., Edison, H. J., and Cho, D. (1993), "A Utility-Based Comparison of Some Models of Exchange Rate Volatility," Journal of International Economics, 35, 23-46.
