Arnaud Costinot and Dave Donaldson : American Economic Review: Papers & Proceedings 2012, 102 (3) : 453-458


American Economic Review: Papers & Proceedings 2012, 102(3): 453-458

http://dx.doi.org/10.1257/aer.102.3.453

Ricardo's Theory of Comparative Advantage: Old Idea, New Evidence

By Arnaud Costinot and Dave Donaldson*

*Costinot: MIT, Department of Economics, MIT, 50 Memorial Drive, Cambridge, MA 02142 and NBER (e-mail: costinot@mit.edu). Donaldson: MIT Department of Economics, MIT, 50 Memorial Drive, Cambridge, MA 02142 and NBER (e-mail: ddonald@mit.edu). We thank Pol Antràs, Chang-Tai Hsieh, and Esteban Rossi-Hansberg for comments and Meredith McPhail and Cory Smith for excellent research assistance.
To view additional materials, visit the article page at http://dx.doi.org/10.1257/aer.102.3.453.

The anecdote is famous. A mathematician, Stan Ulam, once challenged Paul Samuelson to name one proposition in the social sciences that is both true and nontrivial. His reply was: "Ricardo's theory of comparative advantage"; see Samuelson (1995, p. 22). "Truth," however, in Samuelson's reply refers to the fact that Ricardo's theory of comparative advantage is mathematically correct, not that it is empirically valid. The goal of our paper is to assess the empirical performance of Ricardo's ideas.

To bring Ricardo's ideas to the data, one must overcome a key empirical challenge. Suppose, as Ricardo's theory of comparative advantage predicts, that different factors of production specialize in different economic activities based on their relative productivity differences. Then, following Ricardo's famous example, if English workers are relatively better at producing cloth than wine compared to Portuguese workers, England will produce cloth, Portugal will produce wine, and at least one of these two countries will be completely specialized in one of these two sectors. Accordingly, the key explanatory variable in Ricardo's theory, relative productivity, cannot be directly observed.

This identification problem is emphasized by Deardorff (1984, p. 476) in his review of empirical work on the Ricardian model of trade:

    Problems arise, however, most having to do with the observability of [productivity by industry and country]. The ... problem is implicit in the Ricardian model itself ... [because] the model implies complete specialization in equilibrium .... This in turn means that the differences in labor requirements cannot be observed, since imported goods will almost never be produced in the importing country.

A similar identification problem arises in the labor literature in which the self-selection of individuals based on comparative advantage is often referred to as the Roy model. As Heckman and Honoré (1990) have shown, if general distributions of worker skills are allowed, the Roy model, and hence Ricardo's theory of comparative advantage, has no empirical content. Econometrically speaking, the Ricardian model is not nonparametrically identified.

How can one solve this identification problem? One possibility consists in making untestable functional form assumptions about the distribution of productivity across different factors of production and economic activities. These assumptions can then be used to relate productivity levels that are observable to those that are not. In a labor context, a common strategy is to assume that workers' skills are log-normally distributed. In a trade context, building on the work of Eaton and Kortum (2002), Costinot, Donaldson, and Komunjer (forthcoming) have shown how the predictions of the Ricardian model can be tested by assuming that productivity levels are independently drawn from Fréchet distributions across countries and industries.

This paper proposes an alternative empirical strategy that does not rely on identification by functional form. Our basic idea, as in Costinot and Donaldson (2011), is to focus on agriculture, a sector of the economy in which scientific knowledge of how essential inputs such as water, soil, and climatic conditions map into outputs is uniquely well understood. As a consequence of this knowledge, agronomists are able to predict how productive a given parcel of
454 AEA PAPERS AND PROCEEDINGS MAY 2012

land, which we will refer to as a "field," would be were it to be used to grow any one of a set of crops. In this particular context, the econometrician therefore knows the productivity of a field in all economic activities, not just those in which it is currently employed.

Our strategy can be described as follows. We first establish how, according to Ricardo's theory of comparative advantage, total output of various crops should vary across countries as a function of: (i) the vector of productivity of the fields that countries are endowed with, and (ii) the producer prices that determine the allocation of fields across crops.1 We then combine these theoretical predictions with productivity and price data from the Food and Agriculture Organization (FAO). Our dataset consists of 17 major agricultural crops and 55 major agricultural countries. Using this information, we can compute predicted output levels for all crops and countries in our sample and ask: How do predicted output levels compare with those that are observed in the data?

1 In line with Ricardo's theory of comparative advantage, the focus of our paper is on the supply-side of the economy, not the demand-side considerations that would ultimately pin down prices around the world.

Our empirical results show that the output levels predicted by Ricardo's theory of comparative advantage agree reasonably well with actual data on worldwide agricultural production. Despite all of the real-world considerations from which Ricardo's theory abstracts, a regression of log output on log predicted output has a (precisely estimated) slope of 0.21. This result is robust to a series of alternative samples and specifications.

The rest of the article is organized as follows. Section I derives predicted output levels in an economy where factor allocation is determined by Ricardian comparative advantage. Section II describes the data that we use to construct measures of both predicted and actual output. Section III compares predicted and observed output levels, and Section IV offers some concluding remarks.

I. Ricardian Predictions

The basic environment is the same as in Costinot (2009). We consider a world economy comprising $c = 1, \ldots, C$ countries, $g = 1, \ldots, G$ goods, and $f = 1, \ldots, F$ factors of production. In our empirical analysis, a good will be a crop and a factor of production will be a parcel of land or "field." Factors of production are immobile across countries and perfectly mobile across sectors. $L_c^f \ge 0$ denotes the inelastic supply of factor $f$ in country $c$. Factors of production are perfect substitutes within each country and sector but vary in their productivity $A_c^{fg} \ge 0$. Total output of good $g$ in country $c$ is given by

$$Q_c^g = \sum_{f=1}^{F} A_c^{fg} L_c^{fg},$$

where $L_c^{fg}$ is the quantity of factor $f$ allocated to good $g$ in country $c$. The variation in $A_c^{fg}$ is the source of Ricardian comparative advantage. If two factors $f_1$ and $f_2$ located in country $c$ are such that $A_c^{f_2 g_2} / A_c^{f_1 g_2} > A_c^{f_2 g_1} / A_c^{f_1 g_1}$ for two goods $g_1$ and $g_2$, then field $f_2$ has a comparative advantage in good $g_2$.2

2 The present model, like the Roy model in the labor literature, features multiple factors of production. In international trade textbooks, by contrast, Ricardo's theory of comparative advantage is associated with models that feature only one factor of production, labor. In our view, this particular formalization of Ricardo's ideas is too narrow for empirical purposes. The core message of Ricardo's theory of comparative advantage is not that labor is the only factor of production in the world, but rather that relative productivity differences, and not absolute productivity differences, are the key determinant of factor allocation. As argued below, the present model captures exactly that idea.

Throughout this article, we focus on the supply-side of this economy by taking producer prices $p_c^g \ge 0$ as given. We assume that the allocation of factors of production to each sector in each country is efficient and solves

$$\max_{L_c^{fg}} \left\{ \sum_{c=1}^{C} \sum_{g=1}^{G} p_c^g Q_c^g \;\middle|\; \sum_{g=1}^{G} L_c^{fg} \le L_c^f \right\}.$$

Since there are constant returns to scale, a competitive equilibrium with a large number of profit-maximizing firms would lead to an efficient allocation. Because of the linearity of aggregate output, the solution of the previous maximization problem is easy to characterize. As in a simple Ricardian model of trade with two goods and two countries, each factor should be employed in the sector that maximizes $A_c^{fg} p_c^g$,
VOL. 102 NO. 3 A New Test of Ricardo's Ideas 455

independently of where other factors are being employed.

Assuming that the efficient allocation is unique,3 we can express total output of good $g$ in country $c$ at the efficient allocation as

$$Q_c^g = \sum_{f \in \mathcal{F}_c^g} A_c^{fg} L_c^f, \qquad (1)$$

where $\mathcal{F}_c^g$ is the set of factors allocated to good $g$ in country $c$:

$$\mathcal{F}_c^g = \left\{ f = 1, \ldots, F \;\middle|\; \frac{A_c^{fg}}{A_c^{fg'}} > \frac{p_c^{g'}}{p_c^g} \ \text{if } g' \ne g \right\}. \qquad (2)$$

Equations (1) and (2) capture Ricardo's idea that relative rather than absolute productivity differences determine factor allocation and, in turn, the pattern of international specialization.

3 In our empirical analysis, two out of 101,757 grid cells, the empirical counterparts of factors $f$ in the model, are such that the value of their marginal products $A_c^{fg} p_c^g$ is maximized in more than one crop. Thus, the efficient allocation is unique only up to the allocation of these two grid cells. Dropping these two grid cells has no effect on the coefficient estimates presented in Table 1.

II. Data

To assess the empirical performance of Ricardo's ideas we need data on actual output levels, which we denote by $\tilde{Q}_c^g$, as well as data to compute predicted output levels, which we denote by $Q_c^g$ in line with Section I. According to equations (1) and (2), $Q_c^g$ can be computed using data on productivity, $A_c^{fg}$, for all factors of production $f$; endowments of different factors, $L_c^f$; and producer prices, $p_c^g$. We describe our construction of such measures here. Since the predictions of Ricardo's theory of comparative advantage are fundamentally cross-sectional in nature, we work with the data from 1989 only; this is the year in which the greatest overlap in the required measures is available.

We use data on both agricultural output ($\tilde{Q}_c^g$) and producer prices ($p_c^g$) by country and crop from FAOSTAT. Output is equal to quantity harvested and is reported in metric tons. Producer prices are equal to prices received by farmers net of taxes and subsidies and are reported in local currency units per tonne. Imperfect data reporting to the FAO means that some output and price observations are missing. We first work with a sample of crops and countries that is designed to minimize the number of unreported observations. This sample comprises 55 countries and 17 crops.4 In the remaining sample, whenever output data are missing we assume that there is no production of that crop in that country. Similarly, whenever price data are unreported for a given observation, both quantity produced and area harvested are also reported as zero in the FAO data. In these instances, we therefore replace the missing price entry with a zero.5

4 The countries are: Argentina, Australia, Austria, Bangladesh, Bolivia, Brazil, Bulgaria, Burkina Faso, Cambodia, Canada, China, Colombia, Democratic Republic of the Congo, Denmark, Dominican Republic, Ecuador, Egypt, El Salvador, Finland, France, Ghana, Honduras, Hungary, Iceland, Indonesia, Iran, Ireland, Israel, Jamaica, Kenya, Laos, Lebanon, Malawi, Mozambique, Namibia, Netherlands, Nicaragua, Norway, Paraguay, Peru, Poland, Romania, South Africa, Spain, Suriname, Sweden, Togo, Trinidad and Tobago, Tunisia, Turkey, USSR, United States, Venezuela, Yugoslavia, and Zimbabwe. The crops are: barley, cabbages, carrots and turnips, cassava, coconuts, seed cotton, groundnuts (with shell), maize, onions (dry), rice (paddy), sorghum, soybeans, sugar cane, sweet potatoes, tomatoes, wheat, potatoes (white).

5 We have also experimented with replacing missing prices with their world averages across producing countries adjusted for currency differences. The empirical results in Table 1 are insensitive to this alternative.

Our data on productivity ($A_c^{fg}$) come from version 3.0 of the Global Agro-Ecological Zones (GAEZ) project run by the International Institute for Applied Systems Analysis (IIASA) and the FAO (IIASA/FAO 2012). We describe this data in detail in Costinot and Donaldson (2011) but provide a brief description here; see also Nunn and Qian (2011). The GAEZ project aims to make agronomic predictions about the yield that would obtain for a given crop at a given location for all of the world's major crops and all locations on Earth. Data on natural inputs (such as soil characteristics, water availability, topography and climate) for each location are fed into an agronomic model of crop production with distinct parameters for each variety of each crop. These models condition on a level of variable inputs, and GAEZ makes available the output from various scenarios in which different levels of variable inputs are applied. We use the scenario that corresponds to a "mixed" level of

inputs, where the farmer is assumed to be able to apply inputs differentially across subplots within his or her location, and in which irrigation is available. It is important to stress that the thousands of parameters that enter the GAEZ model are estimated from countless field and lab experiments, not from statistical relationships between observed country-level output data (such as that from FAOSTAT which we use here to construct $\tilde{Q}_c^g$) and natural inputs.

The spatial resolution of GAEZ outputs is governed by the resolution of the natural input whose resolution is most coarse, the climate data. As a result the GAEZ productivity predictions are available for each five arc-minute grid cell on Earth. The land area of such a cell varies by latitude but is 9.2 by 8.5 km at the Tropics. The median country in our dataset contains 4,817 grid cells, but a large country such as the United States comprises 157,797 cells. Since the grid cell is the finest unit of spatial heterogeneity in our dataset we take each grid cell to be a distinct factor of production $f$ and the land area of each grid cell to be the associated endowment, $L_c^f$. Hence, our measure of the productivity of factor $f$ if it were to produce crop $g$ in country $c$, $A_c^{fg}$, corresponds to the GAEZ project's predicted "total production capacity (metric tons/ha)." We match countries (at their 1989 borders) to grid cells using GIS files on country borders from the Global Administrative Areas database.

A sample of the GAEZ predictions can be seen in Figure 1. Here we plot, for each grid cell on Earth, the predicted relative productivity in wheat compared to sugar cane (the two most important crops by weight in our sample). As can be seen, there exists a great deal of heterogeneity in relative productivity throughout the world, even among just two of our 17 crops. In the next section we explore the implications of this heterogeneity, which is at the core of Ricardo's theory of comparative advantage, for determining the pattern of international specialization across crops.

Figure 1. An Example of Relative Productivity Differences
(World map; legend: relative wheat-to-sugar cane productivity, from Low: 0 to High: 12,033.)
Notes: Ratio of productivity in wheat (in metric tons/ha) relative to productivity in sugar cane (in metric tons/ha). Areas shaded white have either zero productivity in wheat, or zero productivity in both wheat and sugar cane. Areas shaded dark, with the highest value (12,033), have zero productivity in sugar cane and strictly positive productivity in wheat.
Source: IIASA/FAO (2012)

III. Empirical Results

We are now ready to bring Ricardo's ideas to the data. To overcome the identification problem highlighted by Deardorff (1984) and Heckman and Honoré (1990), we take advantage of the GAEZ data, together with the other data described in Section II, to predict the amount of output ($Q_c^g$) that country $c$ should produce in crop $g$ according to Ricardo's theory of comparative advantage, i.e., according to equations (1) and (2). We then compare these predicted output levels to those that are observed in the data ($\tilde{Q}_c^g$).

In the spirit of the "slope tests" in the Heckscher-Ohlin-Vanek literature, see Davis and Weinstein (2001), we implement this comparison by simply regressing, across countries and crops, data on actual output on measures of predicted output. Like Davis and Weinstein (2001), we assess the empirical performance of Ricardo's ideas by studying whether (i) the slope coefficient in this regression is close to unity; and (ii) the coefficient is precisely estimated. Compared to these authors, however, we have little confidence in our model's ability to predict absolute levels of output. The reason is simple: the model presented in Section I assumes that the only goods produced (using land) in each country are the 17 crops for which GAEZ productivity data are available. In reality there are many other uses of land, so the aggregate amount of land used to grow the 17 crops in our study is considerably lower than that assumed in our analysis. To circumvent this problem, we simply estimate our regressions in logs.6 Since the core aspect of Ricardian comparative advantage lies in how relative productivity predicts relative quantities, we believe that a comparison of logarithmic slopes captures the essence of what the model described in Section I can hope to predict in this context.

6 In order to measure the gains from the economic integration of US agricultural markets between 1880 and 2000, Costinot and Donaldson (2011) have developed a methodology that uses additional data on aggregate land use to correct for this problem. Applying that correction is beyond the scope of the present paper.

Our empirical results are presented in Table 1. All regressions include a constant and use standard errors that are adjusted for clustering by country to account for potential within-country (across crop) correlation in data reporting and model misspecification. Column 1 contains our baseline regression. The estimated slope coefficient is 0.212 and the standard error is small (0.057).7 While the slope coefficient falls short of its theoretical value (one), it remains positive and statistically significant.

Table 1. Comparison of Actual Output to Predicted Output

Dependent variable: log (output)

                          (1)        (2)        (3)        (4)               (5)
log (predicted output)    0.212***   0.244***   0.096**    0.143***          0.273***
                          (0.057)    (0.074)    (0.038)    (0.062)           (0.074)
Sample                    All        All        All        Major countries   Major crops
Fixed effects             None       Crop       Country    None              None
Observations              349        349        349        226               209
$R^2$                     0.06       0.26       0.54       0.04              0.07

Note: Standard errors clustered by country are in parentheses.
***Significant at the 1 percent level.
**Significant at the 5 percent level.
*Significant at the 10 percent level.

7 In our logarithmic specification all observations in which either output or predicted output are zero must be omitted. Out of the total of 935 potential observations (55 countries and 17 crops), 296 have zero output and 581 have zero predicted output; that is, our Ricardian model predicts more complete specialization than there is in the data. This should not be surprising given the potential for more spatial heterogeneity to exist in agricultural reality than can be modeled (due to data limitations) by GAEZ. In all, 349 observations have both nonzero output and nonzero predicted output and are, hence, included in the regression in column 1. We have explored a number of potential adjustments to correct the results in column 1 for these missing observations, including a Tobit regression (where the coefficient is 0.213 and the s.e. is 0.057) and adding one to all observations prior to taking logs (coefficient 0.440; s.e. 0.031).

The fact that Ricardo's theory of comparative advantage does not fit the data perfectly should not be surprising. First, our empirical exercise focuses on land productivity and abstracts from all other determinants of comparative costs (such as factor prices that differ across countries and factor intensities that differ across crops) that are likely to drive agricultural specialization throughout the world. Second, the fit of our regressions does not only depend on the ability of Ricardo's theory to predict relative output levels conditional on relative productivity levels, but also on the ability of agronomists at the GAEZ project to predict productivity levels in each of 17 crops at five arc-minute grid cells throughout the world conditional on the (counterfactual) assumption that all countries share a common agricultural technology. Third, while the spatial resolution of the GAEZ predictions is considerably finer than the typical approach to cross-country data in the trade literature (in which countries are homogeneous points), five arc-minute grid cells are still very coarse in an absolute sense. This means that there is likely to be a great deal of potential within-country heterogeneity that is being smoothed over by the GAEZ agronomic modeling. Yet despite these limitations of our analysis, Ricardo's theory of comparative advantage still has significant explanatory power in the data, as column 1 illustrates.

Columns 2 and 3 explore the robustness of our baseline estimate in column 1 to the inclusion of crop and country fixed effects, respectively. The rationale for these alternative specifications is that there may be crop- or country-specific tendencies for misreporting or model error. Such errors may be economic in nature if, say, some countries had higher intranational price distortions, or agronomic in nature if, say, the GAEZ model predictions were relatively more accurate for some crops than others. Including such fixed effects can reduce the slope coefficient (to as low as 0.096, in column 3), but these estimates are still statistically significantly different from zero. Thus, the results in columns 2 and 3 show that Ricardo's theory of comparative advantage continues to have explanatory power whether focusing on the across-country variation, as in column 2, or the across-crop variation, as in column 3.

Finally, columns 4 and 5 investigate the extent to which our estimates are driven by particular components of the sample. Column 4 estimates the slope only among the 28 countries that are at or above the median in terms of agricultural production (by weight). And column 5 estimates the slope only on the 9 crops that are the most important (by weight) in global production. In both cases the estimated slope coefficient is similar (within one standard error) to our baseline estimate in column 1.

IV. Concluding Remarks

Ricardo's theory of comparative advantage is one of the oldest and most distinguished theories in economics. But it is a difficult theory to bring to the data. To do so using conventional data sources, one needs to make untestable functional form assumptions about how productive a given factor of production would be at the activities it is currently, and deliberately, not doing. In this paper we have argued that the predictions of agronomists (i.e., the scientists who specialize in modeling how agricultural crops would fare under a wide range of possible growing conditions) can be used to provide the data whose absence makes Ricardo's ideas untestable in conventional settings.

We have combined the data from a particular group of agronomists, those working on the GAEZ project as part of the FAO, along with producer price data from the FAO, to assess the empirical performance of Ricardo's ideas across 17 agricultural crops and 55 major agriculture-producing countries in 1989. We have asked a simple question: How do output levels predicted by Ricardo's theory compare to those that are observed in the data? Despite all of the real-world considerations from which Ricardo's theory abstracts, we find that a regression of log output on log predicted output has a (precisely estimated) slope of 0.21. Ricardo's theory of comparative advantage is not just mathematically correct and nontrivial; it also has significant explanatory power in the data, at least within the scope of our analysis.

REFERENCES

Costinot, Arnaud. 2009. "An Elementary Theory of Comparative Advantage." Econometrica 77 (4): 1165-92.

Costinot, Arnaud, and Dave Donaldson. 2011. "How Large Are the Gains from Economic Integration? Theory and Evidence from US Agriculture, 1880-2002." Unpublished.

Costinot, Arnaud, Dave Donaldson, and Ivana Komunjer. Forthcoming. "What Goods Do Countries Trade? A Quantitative Exploration of Ricardo's Ideas." Review of Economic Studies.

Davis, Donald R., and David E. Weinstein. 2001. "An Account of Global Factor Trade." American Economic Review 91 (5): 1423-53.

Deardorff, Alan V. 1984. "Testing Trade Theories and Predicting Trade Flows." In Handbook of International Economics, Volume 1. No. 3, Handbooks in Economics, edited by Ronald W. Jones and Peter Kenen, 467-517. New York: North-Holland.

Eaton, Jonathan, and Samuel Kortum. 2002. "Technology, Geography, and Trade." Econometrica 70 (5): 1741-79.

Heckman, James J., and Bo E. Honoré. 1990. "The Empirical Content of the Roy Model." Econometrica 58 (5): 1121-49.

IIASA/FAO. 2012. Global Agro-Ecological Zones (GAEZ v3.0). IIASA, Laxenburg, Austria and FAO, Rome.

Nunn, Nathan, and Nancy Qian. 2011. "The Potato's Contribution to Population and Urbanization: Evidence from a Historical Experiment." Quarterly Journal of Economics 126 (2): 593-650.

Samuelson, P. A. 1995. "The Past and the Future of International Trade Theory." In New Directions in Trade Theory, edited by Jim Levinsohn, Alan V. Deardorff, and Robert M. Stern, 17-22. Ann Arbor: The University of Michigan Press.
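
ILLUSTRATIVE SKETCH

The allocation rule in equations (1) and (2) of Section I, and the log-log comparison of Section III, can be illustrated with a short computation. What follows is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`predicted_output`, `log_slope`), the array names (`A`, `L`, `p`), and the toy numbers are hypothetical. Ties are broken by taking the first maximizing crop, whereas the paper assumes the efficient allocation is unique. The slope is a plain OLS fit without the country-clustered standard errors reported in Table 1.

```python
import numpy as np

def predicted_output(A, L, p):
    """Predicted output by crop for one country, in the spirit of equations (1)-(2).

    A : (F, G) array, productivity of field f in crop g (tons per hectare)
    L : (F,) array, land endowment of each field (hectares)
    p : (G,) array, producer price of each crop
    All names and shapes are assumptions made for this illustration.
    """
    A = np.asarray(A, dtype=float)
    L = np.asarray(L, dtype=float)
    p = np.asarray(p, dtype=float)
    value = A * p                          # value of marginal product A^{fg} p^g
    best = value.argmax(axis=1)            # crop maximizing A*p per field (first max on ties)
    field_output = A[np.arange(A.shape[0]), best] * L
    # Sum each field's output into the crop it was allocated to (equation (1)).
    Q = np.bincount(best, weights=field_output, minlength=A.shape[1])
    return Q, best

def log_slope(actual, predicted):
    """OLS slope of log(actual) on log(predicted), dropping zero observations."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    keep = (actual > 0) & (predicted > 0)  # logs are undefined at zero
    slope, _intercept = np.polyfit(np.log(predicted[keep]), np.log(actual[keep]), 1)
    return slope, int(keep.sum())

# Hypothetical toy country: three fields, two crops, equal prices.
A = np.array([[3.0, 1.0],
              [2.0, 4.0],
              [1.0, 1.5]])
L = np.array([10.0, 5.0, 8.0])
p = np.array([1.0, 1.0])
Q, best = predicted_output(A, L, p)
print(best, Q)
```

In this toy example the first field is allocated to the first crop and the other two fields to the second crop, so each field's allocation depends only on its own productivities and the prices, independently of where other fields are employed. Applying `log_slope` to stacked country-by-crop vectors of actual and predicted output gives the kind of slope reported in column 1 of Table 1, with the zero observations omitted as described in footnote 7.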
