Supplementary Online Appendix For Trade, Farmers Heterogeneity, and Agricultural Productivity

Supplementary Online Appendix for “Trade, Farmers Heterogeneity, and
Agricultural Productivity: Evidence from Colombia”

Margarita Gafaro and Heitor S. Pellegrina
(Not for publication)
A Data
This section describes the construction of the dataset used in our analysis. Specifically,
we describe: (1) the output and revenue data; (2) the measurement of farm capital and
farm labor, which we use in the construction of our measures of farm productivity; (3) the
geographic variables for each municipio; and (4) the construction of the instruments that we
use to evaluate the impact of distance to urban centers.
Output and Revenue. The agricultural census of Colombia, conducted by DANE, col-
lected information on the production volume for all crops planted in each farm plot during
2013. We compute the physical output per crop for each farm as the sum across all farm
plots, which incorporates the total production in all harvest seasons of 2013. We gather
information on crop prices from different sources to compute the total farm revenue. Our
final price data includes 258 crops and covers 85% of the total planted area in the country.
We only lack price data for timber trees and aromatic herbs that are usually consumed at
the farm. Our main source of price data is the System of Price Monitoring (SIPSA). SIPSA
collects weekly information on prices for food items in 61 wholesale markets. For each crop
and price, we first apply a discount factor used by government officials to convert wholesale
prices to producer prices. We then compute annual prices as the average of weekly prices
across all wholesale markets. We also bring in data from the producers' associations for coffee,
African palm, sugarcane, rice, cotton, and cocoa. In each case, the producers' association
provides average producer prices across the country. We use conversion factors provided by
DANE to obtain units of measurement consistent between the census and the price data.
Panel A in Appendix Figure A.2 presents the distribution of our measure of farm revenue.
We exclude outliers with extremely high or low values representing less than 1 percent of
the data. In our final data, the median farm revenue is 2564 USD and the average is 12945
USD.
Farm capital. The census contains information on the type and number of machines
available on the farm for agriculture and livestock production. Since our primary interest is in
agricultural productivity, we focus on machines typically used for agriculture. This accounts
for 22 different machines. To compute a measure of farm capital, we combine the information
on machinery from the census with data that we gathered on machinery prices. We collect
price information online from 17 national suppliers of machinery and from advertisements
published in one specialized newspaper.42 We complement this information with data on
machinery values from FAOSTAT. Since the census does not provide information on specific
product references, we take as a proxy for each machine’s value the minimum price among
42 Source: https://www.agronegocios.co/
those prices found for machines of the same type.43 We were able to recover price information
for 76% of the machines that we selected. This allows us to construct a capital valuation
measure for 98% of the farms that report some type of machinery. Nevertheless, only 20%
of farms report having any machinery. For the rest of the farms, we make an imputation of
capital value by fitting a linear regression of the log of farm capital value on log farm size,
log workers, a dummy that indicates if the farm has more than 80% of the land in a single
crop, and dummy variables for departamento, for the main crop in the farm, and for several
types of livestock. To ensure that our results are not driven by this imputation, we test
the robustness of our main results to several different approaches, as reported in Appendix
Tables A.5 and A.10. Appendix Figure A.3 presents the distribution of our observed values
of capital and of the imputed values.
Farm labor. The census collects information on the number of permanent farmworkers and
the number of additional man-days hired on the farm the month before the interview. About
77% of farms have between 1 and 2 permanent farmworkers and 63% of farms did not hire
any additional labor the month before the interview. We use this information to compute
the total number of man-days of work employed at the farm per year. First, we convert
the number of permanent farmworkers on the farm to annual man-days by multiplying this
number by six days a week, four weeks a month, and 12 months a year. Second, we multiply
the number of additional man-days hired the previous month by 12 to measure annual
seasonal man-days.44
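The labor conversion described above can be sketched in a few lines of Python. The function and constant names are ours and purely illustrative; the multipliers (six days a week, four weeks a month, 12 months a year) are the ones stated in the text.

```python
# Multipliers stated in the text for converting permanent workers to man-days.
DAYS_PER_WEEK = 6
WEEKS_PER_MONTH = 4
MONTHS_PER_YEAR = 12


def annual_man_days(permanent_workers, extra_man_days_last_month):
    """Annual man-days implied by the two census labor questions."""
    permanent = (permanent_workers * DAYS_PER_WEEK
                 * WEEKS_PER_MONTH * MONTHS_PER_YEAR)
    # Additional man-days are reported for one month and scaled to a year.
    seasonal = extra_man_days_last_month * MONTHS_PER_YEAR
    return permanent + seasonal


# Hypothetical farm: 2 permanent workers, 10 additional man-days last month.
print(annual_man_days(2, 10))  # 2 * 288 + 10 * 12 = 696
```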
In addition, for our calibration of the model, we use data from the cost module of the
Municipality Agricultural Evaluation (EVA) of 2018. This data contains detailed information
on farms’ input use by production stage (land preparation, planting, maintenance, harvest,
and post-harvest) for the 33 crops selected according to their importance in the country. The
data was collected with a survey instrument applied to a sample of about 1,000 farmers and
local extensionists. The sample was designed to provide cost estimates that are representative
at the crop level. We used the microdata from this survey, which includes information on
where farmers usually sell their produce, and categorical variables that classify farms as
small, medium, or large depending on their land use. For the calibration of the model,
we construct a measure of total man-days hired by each farm. This measure includes all
seasonal workers hired during the production cycle and the labor hired to perform clerical
work, which is not available in the agricultural census.
43 We use minimum prices for three reasons. First, it is possible that the information that we were able to
gather online has an upward bias because providers of the cheapest and least sophisticated machines might
be less likely to advertise their products online. Second, given the low levels of technology adoption that
have been documented in Colombian farms, the machines with the minimum values in our sample might be
the best proxies for the types of machines in most farms in the census. Third, using minimum prices allows
us to take into account capital depreciation. We obtain a similar distribution of capital values when we use
median prices instead.
44 For farmers interviewed off-peak, we underestimate the use of labor. To the extent that peak seasons
vary within crops across regions and within regions and across crops, and the timing of the interviews is
not correlated with crop cycles, we expect this measurement error to be uncorrelated with our variables of
interest.
Geographic characteristics. The geographic variables that we use are the altitude of
the municipality, a measure of the ruggedness of the terrain, total rainfall in the year of the
survey, proximity to rivers, and 6 variables capturing the share of the area in a municipality
under 6 different climatological zones. To compute these variables, we perform the geometric
intersection between the municipality boundaries and the raster and shapefiles containing
information on each of these variables. For elevation and terrain roughness, we use ASTER
GDEM rasters and compute the weighted average of all pixel values on each variable within
a municipality. For climatological zones, we use estimations from the Caldas-Lang model
for climatological zoning. This model classifies geographic areas according to temperature,
annual precipitation and elevation, ranging from warm and dry to extremely cold and humid.
We use the shapefiles with this information provided by the National Institute of Hydrology
and Meteorology (IDEAM). In addition, we use maps on the location of rivers to compute the
distance from the centroid of each municipality to the closest river and measures of annual
rainfall provided by IDEAM to compute a measure of water availability in the municipality
in the year of the survey. Finally, we use data from FAO-GAEZ to measure agricultural
suitability across regions in Colombia. Specifically, we measure the agricultural suitability
in coffee, sugarcane, cocoa, banana and oil palm for rain-fed and low-input technology use.
Colonial gold mines and Indigenous settlements. We use historical records on
gold production during the Spanish empire in different regions that make up the current
Colombian territory. These historical records are presented in Urrutia and Ortiz (2015) and
Colmenares (1972), and compile information collected by several historians. Specifically,
we took information from Colmenares (1972), and complemented it with the information
presented in Urrutia and Ortiz (2015) to identify the municipalities that had mines between
the 16th and 18th centuries. We did this by looking at the names and locations of the mines
from both sources. Overall, Urrutia and Ortiz (2015) present 48 different locations of gold
mines between the 16th and the 18th centuries; among those, 15 have names that do not
match current municipality names. By looking at the maps from Colmenares (1972)
and cross-checking the location of these sites with geographic landmarks, we were able to
assign 11 of those mines to current municipalities. We were not able to assign 4 historical
sites with gold mines (Zea, Cajon, Quindio, Ibagupe) to current municipalities because we
did not have enough information.
To identify municipalities with pre-Hispanic settlements, we use information on the presence
of indigenous population between 1510 and 1561 in current municipalities of Colombia.
This information is available at the CEDE-Data Center of Universidad de los Andes and was
compiled from historical records presented in Melo (1995).
B Measuring Farm Productivity

Combining our data from the agricultural census with information on output and machinery
prices, we compute measured farm productivity, denoted by $\hat{A}_i$, based on

$$\hat{A}_i \equiv \frac{\bar{y}_i}{\left( (l_i)^{\alpha} (n_i)^{\beta} (m_i)^{1-\alpha-\beta} \right)^{\gamma}},$$

where $\bar{y}_i$ represents the total output of farm $i$ in value based on the average price of each
crop at the country level, $l_i$ the sown area, $n_i$ the annual man-days of work, and $m_i$ the
measure of machinery use. Specifically, we calculate $\bar{y}_i = \sum_k q_{ik} \bar{p}_k$, where $q_{ik}$ is the quantity
produced by farm $i$ of crop $k$ and $\bar{p}_k$ is the average price of crop $k$ in the country. This
production function is consistent with the span-of-control function that we use in our model.
To compute Âi , we need values for the parameters of the production function, which we
pick from the literature. We follow estimates from Avila and Evenson (2010), which are
estimated based on data from Latin American countries. For Colombia, they estimate a
factor share of 0.55 for labor, 0.23 for land, 0.08 for fertilizers and chemicals, 0.06 for seeds,
0.07 for machinery and 0.04 for animals. For our baseline empirical results, we set labor
share to 0.55, capital to 0.10 (based on the summation of machinery and animals), and the
rest we attribute to land. Since we do not have data on fertilizer and chemical use, we
incorporate the application of these inputs into the factor share of land intensity.
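As an illustration of how the productivity measure could be computed from farm-level data, the sketch below evaluates the formula for measured farm productivity with the baseline shares described above (labor 0.55, capital 0.10, and the remaining 0.35 attributed to land). The function names and the example farm are hypothetical, and the default value of the span-of-control parameter `gamma` is a placeholder that we assume only for the example; it is not a value reported in this appendix.

```python
def farm_output_value(quantities, avg_prices):
    """y_bar: sum over crops of quantity times country-level average price."""
    return sum(q * avg_prices[crop] for crop, q in quantities.items())


def measured_productivity(y_bar, land, labor, machinery,
                          alpha=0.35, beta=0.55, gamma=0.7):
    """y_bar / (land^alpha * labor^beta * machinery^(1-alpha-beta))^gamma.

    alpha is the land share, beta the labor share, and 1 - alpha - beta
    the capital share. gamma=0.7 is an assumed placeholder.
    """
    inputs = land**alpha * labor**beta * machinery**(1.0 - alpha - beta)
    return y_bar / inputs**gamma


# Hypothetical farm with two crops (quantities and prices are made up).
y_bar = farm_output_value({"potato": 10.0, "corn": 4.0},
                          {"potato": 200.0, "corn": 150.0})
print(measured_productivity(y_bar, land=2.0, labor=400.0, machinery=1500.0))
```

Note that when all three inputs are doubled, measured productivity falls by the factor 2^gamma for a given output value, which reflects the decreasing returns implied by the span-of-control specification.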
As discussed in Appendix Section A, we use imputed data on machinery use for some of
the farms and we aggregate both permanent and temporary labor in our final measure of
labor employment. To test the robustness of our results to these choices, Appendix Table
A.10 reports estimates of our main results using several different measurement approaches.
Panel A shows that our results for the effect of market access on agricultural productivity are
robust to an estimation that restricts the sample to farms that report machinery. Similarly,
Panel B shows that our results are robust to an estimation of farm productivity assuming
a production function without machinery. In both panels, we include as a control the share
of farmers in the municipality that report machinery ownership. Panel D shows that the
share of farmers reporting machinery ownership falls with distance to urban centers. Panel
C presents results with an alternative measure of farm productivity that only takes into
account the number of permanent farmworkers to compute the use of labor in the production
function. In sum, our main conclusions hold across these different measurement approaches.
C Overview of Agricultural Production in Colombia

This section provides an overview of the agricultural sector in Colombia. First, we discuss key
characteristics of crop production in Colombia. Second, we describe the different varieties
of crops that are sold in Colombia and how they are associated with different destination
markets. For the particular case of rice and tomatoes, we present price data that supports
our modelling approach in the paper.
C.1 Crop Production and Agricultural Exports

Coffee, banana and African palm oil are the most important exportable crops in the country.
Colombia is among the world's top four exporters of coffee, African palm oil and bananas.
Owing to the diversity of elevations, climates and soil compositions, several regions
of the country produce coffees of diverse qualities. Most varieties are suitable for hilly
terrains, and production usually takes place in small-scale farms. Cooperatives play a key
role in the distribution of coffee. Some cooperatives also make contracts with international
buyers to sell special coffees at higher prices.
The production of bananas and African oil palm takes place on large landholdings typ-
ically farther from urban centers. Due to its ecological conditions of humidity and tem-
perature, banana cultivation is concentrated in the Northern region, while the cultivation
of African oil palm is widely spread across several regions. Exports of cocoa have recently
gained importance in Colombia. Like coffee, cocoa is grown on small farms. There are two
large processing companies that buy the raw product from agricultural intermediaries in
the local market. Producers’ cooperatives have emerged recently seeking to reduce inter-
mediation costs and facilitate contracts with international buyers. Similar to African palm,
cooperatives play a smaller role in the distribution of cocoa relative to coffee.
Tubers, including potatoes, yams and yucca, and plantains are the most important food
crops in the country. Production of these crops usually takes place on small farms. The soil
and climate of the mountains are suitable for growing potatoes, while plantains, yams and
yucca are suited to lower elevations and warmer temperatures.
C.2 Varieties of Crops and Agricultural Exports

Several crops can be divided into different varieties in terms of quality grades and production
methods. These attributes can have important implications about the destination market
of each variety. Chowdhury et al. (2005) document this duality for the case of Indonesian
fruits and vegetables in supermarkets and traditional retailers. In this section we discuss a
few concrete examples for Colombia. For the particular case of tomatoes and rice, we have
price data by crop-variety that we can use to study the price premia of varieties with higher
quality that are produced for non-local markets.
Let us start with sugarcane and corn, which are key crops in Colombian agriculture.
These two crops have important differentiation by variety. For the production of sugarcane,
medium- to large-scale landholdings specialize in the production of a specific type of sug-
arcane that industrial plants can use for the production of sugar, while small-family farms
tend to grow a type of sugarcane called “caña panelera”, which is used for the production
of “panela” in small rustic plants. (Panela is a concentrated product of cane juice without
separation of the molasses and crystals.) Across regions, we can observe some clear regional
patterns of specialization. The departamento of Valle del Cauca in the southwest of the
country, a region near the big urban center of Cali, concentrates most of the sugarcane for
sugar, while panela sugarcane is spread across the country. Similarly, for the case of corn, we
have the production of white corn, which takes place on small farms that are geographically
spread across the country and is mostly used for human consumption. In contrast, we also
have the production of yellow corn, which takes place on large mechanized farms and tends
to be used for livestock feeding.
For two cases, tomatoes and rice, we have data on prices by variety. Tomatoes come in
two major varieties, “larga-vida” (long-lasting) and “chonto”. The “larga-vida” is a variety
of tomato of higher quality which is more often exported to consumers abroad or purchased
by consumers of higher income in large urban centers. The taste of the two tomatoes
is almost indistinguishable. They are also similar in size, with larga-vida being slightly
larger. However, larga-vida tomatoes last longer before they ripen and have a more uniform
shape and shell. Also, larga-vida tomatoes go through a pre-shelf cleaning and packaging.
(Appendix Figure A.6 presents pictures with these types of tomatoes being sold in Bogota
and the unit price of each.) Data on price by variety is available for these two types of
tomatoes in key consumer markets between 2016 and 2019 for eight municipalities. We find
that the average price premium for larga-vida tomatoes relative to chonto was 18%.45
Similarly, rice is graded according to the percentage of broken grains. First grade rice
has up to 10% of broken grains, while second grade rice has between 10% and 20% of broken
grains. The share of whole grains obtained after milling, and therefore the quality of rice,
depends on farming and milling practices. We have data on rice prices by quality type for
the period between 2016 and 2019 in seven municipalities. In this case, we find that the
average price premium equals 9%.
In the calibration of our model, we find that regions with larger urban centers have,
on average, a higher price premium for exportable varieties of agricultural goods, since these
regions tend to have a larger share of farmers selecting into non-local markets. Appendix
Figure A.5 presents the relationship between the total population in an urban center and
the price premia. We find a positive relationship between these. We have also checked
the relationship between the price premium and distance to urban centers. We find that
the correlation between the log of the price premium and the log of distance to the major
departamento capitals (Cali, Bogota and Medellin) is -0.63, which is consistent
with farmers farther from urban centers having lower incentives to produce these types of varieties.
D Robustness Checks of Structural Results

We check the robustness of our structural results to two assumptions of the model. First,
we study the assumption that the distribution of individuals’ ability is log-normal. We
calibrate the model using a Pareto distribution $G_s(s) = 1 - s^{-\beta}$, which is another common
parametrization adopted in the literature. We calibrate β according to the productivity
gap between farmers producing exportable varieties of agricultural goods relative to farmers
producing conventional varieties, assuming that the level parameter of the Pareto equals one.
Appendix Table A.15 reports the impact of changes in δi with this parametrization of the
distribution of ability.
Second, we verify the robustness of our results to the introduction of bilateral migration
costs. Specifically, in the counterfactuals in which we allow individuals to change their
location preference, we assume that there exists a bilateral migration cost between regions
$\tau_{ij}$. In that case, optimal location decisions of individuals in $i$ are given by

$$\mu_{ij} = \frac{(\tau_{ij} y_i)^{\kappa}}{\sum_{i} (\tau_{ij} y_i)^{\kappa}}.$$

We parametrize based on $\tau_{ij} = (\mathrm{dist}_{ij})^{-\tau_1}$. We pick $\tau_1 = 0.05$, which is in line with values
obtained in Bryan and Morten (2019) and Pellegrina and Sotelo (2021). Appendix Table
A.16 reports results for this case. In both tests, our main conclusions are similar if we use
this version of the model.
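To make the role of the migration cost concrete, the following sketch computes location shares that are proportional to (τ y)^κ with τ = dist^(-τ1), following the expression above. The indexing convention is our simplification: for one reference region, `distances[k]` and `incomes[k]` refer to candidate region k and the shares are normalized over candidates. The value of `kappa` in the example, the distances, and the incomes are hypothetical, and distances must be strictly positive (we assume the own-region distance is normalized to 1).

```python
def location_shares(distances, incomes, kappa, tau1=0.05):
    """Shares proportional to (dist^(-tau1) * income)^kappa, summing to one."""
    weights = [((d ** -tau1) * y) ** kappa
               for d, y in zip(distances, incomes)]
    total = sum(weights)
    return [wgt / total for wgt in weights]


# Hypothetical example: three regions with equal income, increasing distance.
shares = location_shares([1.0, 10.0, 100.0], [1.0, 1.0, 1.0], kappa=2.0)
print(shares)
```

With equal incomes, the nearer regions receive larger shares, and setting `tau1=0` removes the role of distance entirely, so shares depend only on relative incomes.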
45 These data are provided by the price monitoring unit of the Colombian Statistical Unit, SIPSA.
E Details of the Model
This section provides details about the model. Sections E.1 to E.4 derive the key expressions
presented in the main body of the paper. Section E.5 proves the parameter conditions
that are necessary and sufficient for the existence of each type of local equilibrium. Section
E.6 proves the propositions in the main body of the paper. Lastly, Section E.7 presents
the algorithm that we apply to solve for the spatial equilibrium in the model. To save on
notation, in what follows, we often drop the index for locality, unless otherwise indicated.
E.1 The maximization problem of farmers

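Given the production function in equation (3) and factor prices w and r, the cost minimization
problem of a farmer with ability s is

$$\min_{l,n} \; rl + wn - \lambda\left[ (As)^{1-\gamma}\left(l^{\alpha} n^{1-\alpha}\right)^{\gamma} - q \right].$$

First order conditions give

$$r = \lambda (As)^{1-\gamma}\gamma\alpha\, l^{\gamma\alpha-1} n^{\gamma(1-\alpha)},$$

$$w = \lambda (As)^{1-\gamma}\gamma(1-\alpha)\, l^{\gamma\alpha} n^{\gamma(1-\alpha)-1},$$

which can be written as

$$l = \lambda q\gamma\alpha / r, \tag{13}$$

$$n = \lambda q\gamma(1-\alpha) / w. \tag{14}$$

Substitute equations (13) and (14) into equation (3) to get

$$q = (As)^{1-\gamma}\left(\lambda q\gamma\alpha / r\right)^{\alpha\gamma}\left(\lambda q\gamma(1-\alpha)/w\right)^{\gamma(1-\alpha)}.$$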
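Isolate the Lagrange multiplier λ in the expression above and substitute it into equations (13) and
(14) to get

$$n = \frac{(1-\alpha)}{\alpha_0}[As]^{\frac{\gamma-1}{\gamma}} (r)^{\alpha} (w)^{-\alpha} (q)^{\frac{1}{\gamma}}, \tag{15}$$

$$l = \frac{\alpha}{\alpha_0}[As]^{\frac{\gamma-1}{\gamma}} (r)^{\alpha-1} (w)^{1-\alpha} (q)^{\frac{1}{\gamma}}, \tag{16}$$

where $\alpha_0 \equiv (\alpha)^{\alpha}(1-\alpha)^{(1-\alpha)}$. Combining the two expressions above, we get

$$\frac{n}{l} = \frac{1-\alpha}{\alpha}\frac{r}{w}, \tag{17}$$
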
which we will use later. Using equations (15) and (16), we can write the cost function as

$$C = wn + rl = \frac{1}{\alpha_0}[As]^{\frac{\gamma-1}{\gamma}} (r)^{\alpha} (w)^{1-\alpha} (q)^{\frac{1}{\gamma}}.$$
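The conditional factor demands and the cost function can be checked numerically. The sketch below is a minimal illustration that assumes the production function q = (As)^(1-γ) (l^α n^(1-α))^γ implied by the minimization problem above; the function names and all parameter values are hypothetical and chosen only for the check.

```python
def alpha0(alpha):
    """alpha_0 = alpha^alpha * (1 - alpha)^(1 - alpha)."""
    return alpha**alpha * (1.0 - alpha)**(1.0 - alpha)


def _scale(q, A, s, alpha, gamma):
    # Common term [As]^((gamma-1)/gamma) * q^(1/gamma) / alpha_0.
    return (A * s)**((gamma - 1.0) / gamma) * q**(1.0 / gamma) / alpha0(alpha)


def conditional_demands(q, A, s, r, w, alpha, gamma):
    """Land l and labor n from equations (16) and (15)."""
    k = _scale(q, A, s, alpha, gamma)
    n = (1.0 - alpha) * k * r**alpha * w**(-alpha)
    l = alpha * k * r**(alpha - 1.0) * w**(1.0 - alpha)
    return l, n


def production(l, n, A, s, alpha, gamma):
    """Assumed production function (As)^(1-gamma) * (l^alpha n^(1-alpha))^gamma."""
    return (A * s)**(1.0 - gamma) * (l**alpha * n**(1.0 - alpha))**gamma


def cost(q, A, s, r, w, alpha, gamma):
    """Cost function C = wn + rl in closed form."""
    return _scale(q, A, s, alpha, gamma) * r**alpha * w**(1.0 - alpha)


# Hypothetical parameter values.
q, A, s, r, w, alpha, gamma = 5.0, 2.0, 1.5, 3.0, 4.0, 0.4, 0.7
l, n = conditional_demands(q, A, s, r, w, alpha, gamma)
print(production(l, n, A, s, alpha, gamma), n / l, r * l + w * n)
```

The three printed numbers should match, respectively, the target output q, the ratio in equation (17), and the closed-form cost.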
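The profit maximization problem of a farmer with ability s is thus

$$\max_{q} \; pq - \frac{1}{\alpha_0}[As]^{\frac{\gamma-1}{\gamma}} (r)^{\alpha} (w)^{1-\alpha} (q)^{\frac{1}{\gamma}},$$

where $p = \delta p_0$ if $v = 1$ and $p = p_0$ if $v = 0$. The first order condition gives

$$q = (\gamma\alpha_0)^{\frac{\gamma}{1-\gamma}} As \left(\frac{p}{r^{\alpha} w^{1-\alpha}}\right)^{\frac{\gamma}{1-\gamma}}. \tag{18}$$
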
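Substitute equation (18) into equations (15) and (16) to get the optimal land and
variable labor use

$$l^{*}(s) = \kappa_L As \left(\frac{p}{(r)^{1-\gamma(1-\alpha)} (w)^{\gamma(1-\alpha)}}\right)^{\frac{1}{1-\gamma}}, \tag{19}$$

$$n^{*}(s) = \kappa_N As \left(\frac{p}{(r)^{\gamma\alpha} (w)^{1-\gamma\alpha}}\right)^{\frac{1}{1-\gamma}}, \tag{20}$$

where $\kappa_L \equiv \alpha\gamma^{\frac{1}{1-\gamma}}(\alpha_0)^{\frac{\gamma}{1-\gamma}}$ and $\kappa_N \equiv (1-\alpha)\gamma^{\frac{1}{1-\gamma}}(\alpha_0)^{\frac{\gamma}{1-\gamma}}$. Substitute (18) into the
maximization problem to get the optimal profit

$$\pi^{*}(s) = \kappa_{\pi} As \left(\frac{p}{\left((r)^{\alpha}(w)^{1-\alpha}\right)^{\gamma}}\right)^{\frac{1}{1-\gamma}}, \tag{21}$$

where $\kappa_{\pi} \equiv (\alpha_0)^{\frac{\gamma}{1-\gamma}}\left[(\gamma)^{\frac{\gamma}{1-\gamma}} - (\gamma)^{\frac{1}{1-\gamma}}\right]$. In what follows, we define

$$\tilde{\pi} = \kappa_{\pi} A \left(\frac{p_0}{\left((r)^{\alpha}(w)^{1-\alpha}\right)^{\gamma}}\right)^{\frac{1}{1-\gamma}},$$

which is the payment to farmers per unit of ability s.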
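A quick numerical check of equations (18) to (21) is sketched below. It verifies that the closed-form profit equals revenue minus factor payments, and that land and labor payments are the shares γα and γ(1-α) of revenue. The function names and the parameter values are hypothetical.

```python
def alpha0(alpha):
    """alpha_0 = alpha^alpha * (1 - alpha)^(1 - alpha)."""
    return alpha**alpha * (1.0 - alpha)**(1.0 - alpha)


def optimal_output(p, A, s, r, w, alpha, gamma):
    """Equation (18)."""
    e = gamma / (1.0 - gamma)
    return (gamma * alpha0(alpha))**e * A * s * (p / (r**alpha * w**(1.0 - alpha)))**e


def optimal_land(p, A, s, r, w, alpha, gamma):
    """Equation (19)."""
    k_l = alpha * gamma**(1.0 / (1.0 - gamma)) * alpha0(alpha)**(gamma / (1.0 - gamma))
    denom = r**(1.0 - gamma * (1.0 - alpha)) * w**(gamma * (1.0 - alpha))
    return k_l * A * s * (p / denom)**(1.0 / (1.0 - gamma))


def optimal_labor(p, A, s, r, w, alpha, gamma):
    """Equation (20)."""
    k_n = (1.0 - alpha) * gamma**(1.0 / (1.0 - gamma)) * alpha0(alpha)**(gamma / (1.0 - gamma))
    denom = r**(gamma * alpha) * w**(1.0 - gamma * alpha)
    return k_n * A * s * (p / denom)**(1.0 / (1.0 - gamma))


def optimal_profit(p, A, s, r, w, alpha, gamma):
    """Equation (21)."""
    e = gamma / (1.0 - gamma)
    k_pi = alpha0(alpha)**e * (gamma**e - gamma**(1.0 / (1.0 - gamma)))
    return k_pi * A * s * (p / (r**alpha * w**(1.0 - alpha))**gamma)**(1.0 / (1.0 - gamma))


# Hypothetical parameter values.
args = (1.3, 2.0, 1.5, 3.0, 4.0, 0.4, 0.7)  # p, A, s, r, w, alpha, gamma
p, r, w = args[0], args[3], args[4]
revenue = p * optimal_output(*args)
payments = r * optimal_land(*args) + w * optimal_labor(*args)
print(revenue - payments, optimal_profit(*args))
```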
E.2 Occupational choices

An individual chooses to become a farmer instead of a worker if

$$s\tilde{\pi} \geq w.$$

An individual who is indifferent between becoming a worker or a farmer must thus satisfy

$$\underline{s} = \frac{w}{\tilde{\pi}}. \tag{22}$$

A farmer chooses the exportable variety over the conventional one if

$$s\delta^{\frac{1}{1-\gamma}}\tilde{\pi} - fw \geq s\tilde{\pi}.$$

A farmer who is indifferent between the two varieties must thus satisfy

$$\bar{s} = \frac{fw}{\rho\tilde{\pi}}, \tag{23}$$

where $\rho \equiv \delta^{\frac{1}{1-\gamma}} - 1$ is a profit premium for producing exportable varieties.
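The two cutoffs in equations (22) and (23) and the implied occupational choice can be illustrated as follows. This is a sketch under hypothetical parameter values; the function names are ours. Ties are broken toward farming and toward the exportable variety, in line with the tie-breaking assumption stated in Section E.5.

```python
def profit_premium(delta, gamma):
    """rho = delta^(1/(1-gamma)) - 1."""
    return delta**(1.0 / (1.0 - gamma)) - 1.0


def cutoffs(w, pi_tilde, f, delta, gamma):
    """Lower cutoff (22) and upper cutoff (23); requires rho > 0."""
    rho = profit_premium(delta, gamma)
    if rho <= 0.0:
        raise ValueError("the upper cutoff is only defined for rho > 0")
    s_low = w / pi_tilde
    s_bar = f * w / (rho * pi_tilde)
    return s_low, s_bar


def occupation(s, w, pi_tilde, f, delta, gamma):
    """Occupation with the highest payoff for an individual with ability s."""
    conventional = s * pi_tilde
    exportable = s * delta**(1.0 / (1.0 - gamma)) * pi_tilde - f * w
    if exportable >= conventional and exportable >= w:
        return "exportable farmer"
    if conventional >= w:
        return "conventional farmer"
    return "worker"


# Hypothetical values: here rho is about 0.44 and f = 1, so f > rho > 0.
params = dict(w=1.0, pi_tilde=0.5, f=1.0, delta=1.2, gamma=0.5)
print(cutoffs(**params))
for ability in (1.9, 3.0, 5.0):
    print(ability, occupation(ability, **params))
```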
E.3 Aggregates: revenue, land, variable labor, and ability

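With the expressions derived in subsections (E.1) and (E.2), we are now equipped to derive
expressions for the aggregate output and use of land, labor, and ability in each region.
First, multiplying equation (18) by p, we get total sales y(s) of an individual with ability
s

$$y(s) = (\gamma\alpha_0)^{\frac{\gamma}{1-\gamma}} sA \left(\frac{p}{\left(r^{\alpha}w^{1-\alpha}\right)^{\gamma}}\right)^{\frac{1}{1-\gamma}} \quad \text{if } s > \underline{s}.$$

Aggregate revenue is

$$Y = \int_{\underline{s}} y(s)\, dG_s(s).$$

Integrating over the total sales of individuals gives

$$Y = N S \gamma^{\frac{\gamma}{1-\gamma}} (\alpha_0)^{\frac{\gamma}{1-\gamma}} A \left(\frac{p_0}{\left(r^{\alpha}w^{1-\alpha}\right)^{\gamma}}\right)^{\frac{1}{1-\gamma}}, \tag{24}$$

where we defined

$$S \equiv \left(1 - G_s(\underline{s})\right) E(s \mid s > \underline{s}) + \rho\left(1 - G_s(\bar{s})\right) E(s \mid s > \bar{s}).$$

Using similar steps, integrate over the optimal land use in equation (19) to get

$$L = N S \alpha\gamma^{\frac{1}{1-\gamma}} (\alpha_0)^{\frac{\gamma}{1-\gamma}} A \left(\frac{p_0}{(r)^{1-\gamma(1-\alpha)} (w)^{\gamma(1-\alpha)}}\right)^{\frac{1}{1-\gamma}}. \tag{25}$$

Combining equations (24) and (25), we can write

$$rL = \gamma\alpha Y. \tag{26}$$
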
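Following steps analogous to those that we took to derive equation (26), we get

$$wN_V = \gamma(1-\alpha)Y, \tag{27}$$

where $N_V$ is the employment of variable labor in a region. In what follows, we write variable
labor as

$$N_V = NV,$$

where we defined

$$V \equiv G_s(\underline{s}) - f\left(1 - G_s(\bar{s})\right).$$

Lastly, the total profit of farmers is

$$\Pi = \int_{\underline{s}} \pi^{*}(s)\, dG_s(s).$$

Integrating over abilities s gives

$$\Pi = NS\tilde{\pi} = NS(\alpha_0)^{\frac{\gamma}{1-\gamma}}\left[(\gamma)^{\frac{\gamma}{1-\gamma}} - (\gamma)^{\frac{1}{1-\gamma}}\right] A \left(\frac{p_0}{\left((r)^{\alpha}(w)^{1-\alpha}\right)^{\gamma}}\right)^{\frac{1}{1-\gamma}}.$$

Combining the equation above with equation (24) gives

$$\tilde{\pi}NS = (1-\gamma)Y. \tag{28}$$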
E.4 Aggregate Agricultural Productivity

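Insert equations (26) and (27) into equation (24) to get

$$Y = p_0^{\frac{1}{1-\gamma}} (\gamma\alpha_0)^{\frac{\gamma}{1-\gamma}} NSA \left(\frac{1}{(\gamma\alpha Y/L)^{\alpha}\left(\gamma(1-\alpha)Y/(NV)\right)^{1-\alpha}}\right)^{\frac{\gamma}{1-\gamma}}.$$

After several manipulations, we get

$$Y = p_0 (A)^{1-\gamma}(S)^{1-\gamma}(V)^{\gamma(1-\alpha)}(L)^{\gamma\alpha}(N)^{1-\gamma\alpha}.$$

Dividing the expression above by N gives the value added per capita

$$y = p_0 (A)^{1-\gamma}(S)^{1-\gamma}(V)^{\gamma(1-\alpha)}\left(\frac{L}{N}\right)^{\gamma\alpha}.$$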
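The "several manipulations" can be verified numerically: the closed-form expression for Y should solve the implicit equation obtained by substituting (26) and (27) into (24). The sketch below performs this check under hypothetical values for all arguments; the function names are ours.

```python
def alpha0(alpha):
    """alpha_0 = alpha^alpha * (1 - alpha)^(1 - alpha)."""
    return alpha**alpha * (1.0 - alpha)**(1.0 - alpha)


def aggregate_output(p0, A, S, V, L, N, alpha, gamma):
    """Closed form: Y = p0 A^(1-g) S^(1-g) V^(g(1-a)) L^(g a) N^(1-g a)."""
    return (p0 * A**(1.0 - gamma) * S**(1.0 - gamma)
            * V**(gamma * (1.0 - alpha)) * L**(gamma * alpha)
            * N**(1.0 - gamma * alpha))


def value_added_per_capita(p0, A, S, V, L, N, alpha, gamma):
    """y = Y / N in closed form."""
    return (p0 * A**(1.0 - gamma) * S**(1.0 - gamma)
            * V**(gamma * (1.0 - alpha)) * (L / N)**(gamma * alpha))


def implicit_rhs(Y, p0, A, S, V, L, N, alpha, gamma):
    """Right-hand side of (24) with r and w replaced using (26) and (27)."""
    r = gamma * alpha * Y / L
    w = gamma * (1.0 - alpha) * Y / (N * V)
    e = gamma / (1.0 - gamma)
    return (p0**(1.0 / (1.0 - gamma)) * (gamma * alpha0(alpha))**e
            * N * S * A * (1.0 / (r**alpha * w**(1.0 - alpha)))**e)


# Hypothetical values for p0, A, S, V, L, N, alpha, gamma.
vals = (1.2, 2.0, 1.5, 0.3, 10.0, 4.0, 0.4, 0.7)
Y = aggregate_output(*vals)
print(Y, implicit_rhs(Y, *vals), value_added_per_capita(*vals))
```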
E.5 Types of Local Equilibria

In Section 4, we described three types of local equilibria: (i) a fully integrated equilibrium in
which all farmers produce exportable varieties of agricultural goods, (ii) a partially integrated
equilibrium in which some farmers produce exportable varieties and others conventional ones,
and (iii) an autarky equilibrium in which all farmers produce conventional varieties. This
section proves the conditions on the parameters for each equilibrium to hold.
Before we turn to the proof of each type of local equilibria, we notice that, for any value
of ρ and f > 0, there is a positive mass of individuals who choose to become a farmer.
This result comes directly from an inspection of equation (21). Since s is unbounded from
above, for any value of w, we can always pick a value of s that is large enough so that
π̃s > w is satisfied. Therefore, any local equilibrium must have some individuals choosing to
become a farmer. In addition, we also highlight that we assume that individuals who are
indifferent between becoming a farmer who produces exportable varieties and conventional
ones will choose to produce exportable varieties, and individuals who are indifferent between
becoming farmers or farmworkers will choose to become farmers.
Fully Integrated Equilibrium. We now prove that a fully integrated equilibrium in
which all farmers produce exportable varieties holds if and only if ρ ≥ f . We first prove
the “only if” part. Suppose ρ ≥ f is true but that there exists a mass of individuals who
choose to become producers of the conventional variety. For these individuals, we must have
π̃s > δπ̃s − wf (which becomes wf > ρπ̃s) and π̃s ≥ w (which becomes π̃sf ≥ wf after
multiplying both sides by f ). Combining these two inequalities, we get π̃sf ≥ wf > ρπ̃s,
which implies f > ρ, a contradiction with our initial assumption that ρ ≥ f . Let us now
prove the “if” part. Assume that we have a fully integrated equilibrium, so that all farmers
select into non-local markets. In that case, we must have δπ̃s − wf ≥ w and δπ̃s − wf ≥ π̃s
for individuals with s ≥ s̄ and w ≥ δπ̃s − wf and w ≥ π̃s for individuals with s̄ > s. For
s = s̄, we have δπ̃s̄ = w and δπ̃s̄ − wf ≥ π̃s̄, which implies ρπ̃s̄ ≥ wf ⇒ ρw ≥ wf ⇒ ρ ≥ f.
Partially Integrated Equilibrium. Here, we prove that a partially integrated equilibrium
in which some farmers produce exportable varieties and others conventional ones holds
if and only if f > ρ > 0. We also first prove the “only if” part here. Suppose f > ρ > 0 is
true but that there exists no farmer selecting into non-local markets. In that case, for any s,
we must have either π̃s > δπ̃s − wf and w > δπ̃s − wf or π̃s < δπ̃s − wf and w > δπ̃s − wf
or π̃s < δπ̃s − wf and w > δπ̃s − wf . In all three cases, we can find a sufficiently large s that
does not satisfy the combination of inequalities. Therefore, if f > ρ > 0 then we must have
some farmers producing exportable varieties. Alternatively, assume that there are no farm-
ers producing the conventional variety, then we are back to the fully integrated equilibrium
and we must have ρ ≥ f . Let us now prove the “if” part. Assume that we have a partially
integrated equilibrium. In that case, we must have δπ̃s − wf ≥ π̃s and δπ̃s − wf ≥ w for
s ≥ s̄; π̃s > δπ̃s − wf and π̃s ≥ w for s̄ > s ≥ s̲; and w > π̃s and w > δπ̃s − wf for
s̲ > s. For s = s̲, we must have π̃s = w and π̃s > δπ̃s − wf, which combined gives f > ρ.
For s ≥ s̄, we have δπ̃s − wf ≥ π̃s ⇒ ρ ≥ wf /π̃s > 0 ⇒ ρ > 0, which completes the proof.
Autarky Equilibrium. Lastly, we prove that an autarky equilibrium in which all farmers
produce conventional varieties holds if and only if 0 ≥ ρ. We know that for ρ ≥ f we
have a fully integrated equilibrium and for f > ρ > 0 we have a partially integrated one.
Therefore, given that in any local equilibrium some individuals become farmers since s is
unbounded from above, we know that for 0 ≥ ρ all farmers must choose to produce the
conventional variety.
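The three parameter regions derived in this section can be summarized in a small helper. The function name and return labels are illustrative only.

```python
def classify_local_equilibrium(rho, f):
    """Type of local equilibrium implied by the profit premium rho and fixed cost f."""
    if f <= 0.0:
        raise ValueError("the fixed cost f is assumed to be strictly positive")
    if rho >= f:
        return "fully integrated"      # all farmers produce exportable varieties
    if rho > 0.0:
        return "partially integrated"  # both varieties are produced
    return "autarky"                   # all farmers produce conventional varieties


for rho in (1.5, 0.4, 0.0, -0.2):
    print(rho, classify_local_equilibrium(rho, f=1.0))
```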
E.6 Proof of Proposition

This section proves the two parts of Proposition (1) in the main body of the paper. The
first part of the proposition focuses on relationships that emerge within regions. The second
part describes relationships that hold when we compare regions with different levels of δi .
E.6.1 Part 1 of Proposition (1)

The first part of the proposition comes directly from a few results. First, a direct inspection of
equation (19) shows that farmers with higher management ability will also have a larger de-
mand for land, given input and output prices. Second, in the partially integrated equilibrium,
we have positive sorting of farmers, with those having higher s ≥ s̄ selecting into non-local
markets. Finally, farmers with higher s will have higher measured farm productivity, which
comes directly from equation (7).
E.6.2 Part 2 of Proposition (1)

We prove the second part of Proposition (1). Our proof assumes no variable labor in the
production function (α = 1), which allows us to obtain sharp predictions of the model. In
this case, notice that in the autarky equilibrium in which no farmer pays for the fixed cost,
all individuals become farmers. Therefore, our analyses focus on the case in which we have
the partially integrated equilibrium. Also, instead of evaluating the impact of δ, the price
premium, we analyze the impact of changes in ρ, the profit premium, because it gives simpler
expressions.
Step 1: Impact of ρ on s̲. Let us start by establishing the following relationship between
ρ, s̲ and s̄, which can be derived using equations (22) and (23):

$$\bar{s} = \frac{f}{\rho}\,\underline{s}. \tag{29}$$

Since there is no variable labor, full employment implies that the total mass of individuals
becoming farmworkers must equal the total demand for farmworkers from farmers producing
exportable varieties:

$$G_s(\underline{s}) = f\left(1 - G_s(\bar{s})\right). \tag{30}$$

Insert equation (29) into equation (30) and differentiate the resulting expression, treating s̲ as
a function of ρ:

$$g_s(\underline{s})\frac{d\underline{s}}{d\rho} = -f g_s(\bar{s})\left(-\frac{f}{\rho^2}\underline{s} + \frac{f}{\rho}\frac{d\underline{s}}{d\rho}\right),$$

where $g_s(\cdot)$ is the pdf of the distribution of ability. Rearrange the expression above to get

$$\frac{d\underline{s}}{d\rho} = \frac{\left(\frac{f}{\rho}\right)^2 \underline{s}\, g_s(\bar{s})}{g_s(\underline{s}) + \frac{f^2}{\rho} g_s(\bar{s})} > 0. \tag{31}$$

Therefore, an increase in ρ leads to an increase in the cutoff s̲. Because the average measured
farm productivity is given by $A_i E(s \mid s > \underline{s})$, we know that the average farm productivity rises
with increases in ρ.
Step 2: Impact of ρ on s̄. To show the impact of ρ on s̄, we will take advantage of the
result derived in step 1. First, differentiate equation (30), making s̄ a function of s̲:

$$g_s(\underline{s}) = -f g_s(\bar{s})\frac{d\bar{s}}{d\underline{s}}. \tag{32}$$

Rearrange to get

$$\frac{d\bar{s}}{d\underline{s}} = -\frac{1}{f}\frac{g_s(\underline{s})}{g_s(\bar{s})} < 0. \tag{33}$$

Combining equation (33) with equation (31) shows that an increase in
ρ generates a drop in s̄. Therefore, the share of farmers selecting into non-local markets
increases.
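Steps 1 and 2 can be illustrated with the Pareto distribution G(s) = 1 - s^(-β) used in Appendix D. Under that assumption, equations (29) and (30) admit a closed-form solution for the lower cutoff; this closed form is our own algebra for the example and is not stated in the appendix. The sketch compares the derivative in equation (31) with a finite difference and shows that the lower cutoff rises and the upper cutoff falls with ρ. All parameter values are hypothetical.

```python
def pareto_cdf(s, beta):
    """G(s) = 1 - s^(-beta) for s >= 1."""
    return 1.0 - s**(-beta)


def pareto_pdf(s, beta):
    """g(s) = beta * s^(-beta - 1) for s >= 1."""
    return beta * s**(-beta - 1.0)


def cutoffs_no_variable_labor(rho, f, beta):
    """Solve (29)-(30) under Pareto ability: s_low^beta = 1 + f^(1-beta) rho^beta."""
    s_low = (1.0 + f**(1.0 - beta) * rho**beta)**(1.0 / beta)
    s_bar = f / rho * s_low
    return s_low, s_bar


def ds_low_drho(rho, f, beta):
    """Derivative of the lower cutoff from equation (31)."""
    s_low, s_bar = cutoffs_no_variable_labor(rho, f, beta)
    num = (f / rho)**2 * s_low * pareto_pdf(s_bar, beta)
    den = pareto_pdf(s_low, beta) + f**2 / rho * pareto_pdf(s_bar, beta)
    return num / den


# Hypothetical parameters with f > rho > 0 (partially integrated case).
f, beta = 1.0, 3.0
for rho in (0.3, 0.4, 0.5):
    print(rho, cutoffs_no_variable_labor(rho, f, beta), ds_low_drho(rho, f, beta))
```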
Step 3: Impact of ρ on average farm size. If we define farm size as the ratio of land
relative to the number of farms, our average farm size will be given by
L
FS = .
N (1 − G (s))
From the previous steps, we know that s̲ increases when ρ increases. Therefore, if N is fixed,
then farm size should necessarily increase with ρ, since there is a reduction in the total mass
of farmers. If we consider individuals' location choices, then both N and s̲ change with ρ.
Intuitively, N should rise with ρ, since the region becomes more attractive. Recall that
the mass of farmworkers in region i is given by
\[ N_i = \frac{(y_i)^{\theta}}{\Xi^{\theta}}\, N. \]
As such, the impact of ρ on FS depends on yi, which is a function of Si and Vi (defined in
Section 4.4). The impact of ρ on farm size will depend on how much individuals will move
to a region in reaction to an increase in δ, which is directly related to θ. When θ is larger,
more individuals will move to the region when ρ increases, which tends to reduce farm size.
E.6.3 Impact of ρ on S
To study the impact of ρ on S, it is useful to work with the following expression for S
\[ S = \int_{\underline{s}(\rho)}^{\infty} s\, g(s)\, ds + \rho \int_{\bar{s}(\underline{s}(\rho))}^{\infty} s\, g(s)\, ds. \]
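Using the Leibniz rule, differentiate with respect to ρ to get
\[
\begin{aligned}
\frac{dS}{d\rho} &= -\underline{s}\, g(\underline{s})\,\frac{d\underline{s}}{d\rho} + \int_{\bar{s}}^{\infty} s\, g(s)\, ds - \rho\,\bar{s}\, g(\bar{s})\,\frac{d\bar{s}}{d\underline{s}}\,\frac{d\underline{s}}{d\rho} \\
&= \left[ -\underline{s}\, g(\underline{s}) - \rho\,\bar{s}\, g(\bar{s})\,\frac{d\bar{s}}{d\underline{s}} \right] \frac{d\underline{s}}{d\rho} + \int_{\bar{s}}^{\infty} s\, g(s)\, ds.
\end{aligned}
\]
Equation (32) gives $d\bar{s}/d\underline{s} = -g(\underline{s})/(f g(\bar{s}))$ and equation (29) gives
$\rho \bar{s} = f \underline{s}$. Using these two equations, we get
$-\underline{s}\, g(\underline{s}) - \rho\,\bar{s}\, g(\bar{s})\, d\bar{s}/d\underline{s} = 0$, which we can use to write
\[ \frac{dS}{d\rho} = \int_{\bar{s}}^{\infty} s\, g(s)\, ds. \]
Since $\int_{\bar{s}}^{\infty} s\, g(s)\, ds > 0$, we know that an increase in ρ always increases S.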
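The result above can be checked numerically. The following sketch assumes a Pareto ability distribution with scale 1 and shape `k > 1`, for which the integral of s g(s) from x to infinity equals k/(k-1) x^(1-k) and the cutoff condition in equations (29) and (30) has a closed-form solution. The parameter values and function names are hypothetical and only serve the illustration.

```python
def cutoffs(rho, f, k):
    """Cutoffs from equations (29)-(30) under an assumed Pareto(1, k) distribution.

    With G(s) = 1 - s**(-k), equation (30) with (29) substituted reads
    1 - s**(-k) = f * (f * s / rho)**(-k), which solves in closed form.
    """
    s_low = (1.0 + f ** (1.0 - k) * rho ** k) ** (1.0 / k)
    s_bar = f * s_low / rho
    return s_low, s_bar


def tail_integral(x, k):
    """Integral of s * g(s) from x to infinity for Pareto(1, k), with k > 1 and x >= 1."""
    return k / (k - 1.0) * x ** (1.0 - k)


def aggregate_ability(rho, f, k):
    """S as written at the start of Section E.6.3."""
    s_low, s_bar = cutoffs(rho, f, k)
    return tail_integral(s_low, k) + rho * tail_integral(s_bar, k)


f, rho, k = 2.0, 1.2, 3.0  # hypothetical values with f > rho > 0
h = 1e-6
dS_numeric = (aggregate_ability(rho + h, f, k) - aggregate_ability(rho - h, f, k)) / (2 * h)
_, s_bar = cutoffs(rho, f, k)
dS_formula = tail_integral(s_bar, k)  # the claimed derivative
print(dS_numeric, dS_formula)
```

The finite-difference derivative of S and the tail integral above s̄ should coincide up to numerical error, which is the sense in which the cutoff terms cancel.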
E.7 Solving the Model

Given the structure of the model, we can solve the equilibrium in two steps. In the first step,
we solve for s̄i and si within every municipality given price premium δi and the parameters
of the model. In the second step, we solve for Ni , which is the mass of individuals choosing
to live in each location.
Step 1: Local Equilibrium. Given ρi, f, α, γ, and a parametrization of the distribution
of management ability Gs(·), we first check the type of local equilibrium that we have in a
municipality. If f > ρi > 0, we have a partially integrated equilibrium. In that case, we
solve for s̄i and s̲i using two equations. First,
\[ \underline{s}_i = \frac{\gamma(1-\alpha)}{1-\gamma}\,\frac{S_i}{V_i}, \tag{34} \]
which can be derived by combining equations (22), (27) and (28). Second,
\[ \bar{s}_i = \frac{f}{\rho_i}\,\underline{s}_i, \tag{35} \]
which comes from equations (22) and (23). In addition, we use the previous definitions for
Si and Vi .
For the fully integrated equilibrium, we need a single equation to solve for the equilibrium,
given by
\[ \bar{s}_i = \frac{\gamma(1-\alpha)}{1-\gamma}\,\frac{1+f}{1+\rho_i}\,\frac{S_i}{V_i}, \]
where we used s̄i = (1 + f ) w/ (1 + ρi ) π̃, which must hold for the fully integrated equilib-
rium. Notice that here Si = (1 + ρi ) (1 − Gs (s̄i )) E (s|s > s̄i ) and Vi = Gs (s̄i )−f (1 − Gs (s̄i )).
Lastly, for autarky equilibrium, we use
\[ \underline{s}_i = \frac{\gamma(1-\alpha)}{1-\gamma}\,\frac{S_i}{V_i}, \]
where we used the condition that s̲i = w/π̃, which must hold for the autarky equilibrium.
Notice that in this final case we have Si = (1 − Gs(s̲i)) E(s | s > s̲i) and Vi = Gs(s̲i).
Step 2: Spatial Equilibrium. Second, we compute the spatial equilibrium across regions.
Given Ai, Si, Vi, N, pi0, and pi, we solve for the value of Ni using the following expressions:
\[ y_i = p_{i0}\,(A_i)^{1-\gamma} (S_i)^{1-\gamma} (V_i)^{\gamma(1-\alpha)} \left(\frac{L_i}{N_i}\right)^{\gamma\alpha}, \tag{36} \]
\[ N_i = \frac{(y_i/p_i)^{\theta}}{\sum_{j\in I} (y_j/p_j)^{\theta}}\, N, \tag{37} \]
\[ N = \sum_{i\in I} N_i. \tag{38} \]
Equations (36) and (37) give the relative employment of labor across regions and equation
(38) pins down the level.
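One possible way to compute this second step is a damped fixed-point iteration on equations (36) and (37), followed by the normalization in equation (38). The sketch below is only an illustration under assumed inputs: the three regions, their parameter values, the damping factor, and the function names are all hypothetical, and the paper does not state which numerical method it uses.

```python
import math


def value_added_per_worker(N_i, r, gamma, alpha):
    """Equation (36) for one region described by the dict r."""
    return (r["p0"] * r["A"] ** (1 - gamma) * r["S"] ** (1 - gamma)
            * r["V"] ** (gamma * (1 - alpha)) * (r["L"] / N_i) ** (gamma * alpha))


def solve_spatial_equilibrium(regions, N_total, theta, gamma, alpha,
                              damping=0.5, tol=1e-10, max_iter=10_000):
    """Iterate on equations (36)-(37) in logs until the allocation stops moving."""
    N = [N_total / len(regions)] * len(regions)  # start from an equal split
    for _ in range(max_iter):
        weights = [(value_added_per_worker(N_i, r, gamma, alpha) / r["p"]) ** theta
                   for r, N_i in zip(regions, N)]
        total = sum(weights)
        target = [N_total * w / total for w in weights]  # equation (37)
        # Damped update in logs: geometric average of old and implied allocation.
        new_N = [math.exp((1 - damping) * math.log(a) + damping * math.log(b))
                 for a, b in zip(N, target)]
        gap = max(abs(a - b) for a, b in zip(N, new_N))
        N = new_N
        if gap < tol:
            break
    scale = N_total / sum(N)  # equation (38) pins down the level
    return [scale * x for x in N]


# Hypothetical inputs; S and V would come from the local equilibrium in Step 1.
regions = [
    {"p0": 1.0, "p": 1.00, "A": 1.0, "S": 1.2, "V": 0.5, "L": 100.0},
    {"p0": 0.9, "p": 1.00, "A": 1.3, "S": 1.0, "V": 0.6, "L": 150.0},
    {"p0": 0.8, "p": 0.95, "A": 0.9, "S": 1.5, "V": 0.4, "L": 80.0},
]
N_total, theta, gamma, alpha = 1000.0, 2.0, 0.5, 0.5
N_eq = solve_spatial_equilibrium(regions, N_total, theta, gamma, alpha)
print(N_eq)
```

Because yi falls with Ni through the term (Li/Ni)^(γα), the undamped map can overshoot when θγα is large, which is why the sketch averages the old and implied allocations in logs.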
E.8 Extension of the Model with Multiple Crops

In the main body of the paper, we developed a model without crop choice, so as to focus on
the selection of farmers into non-local markets and its impact on agricultural productivity.
In this section, we show how the model can incorporate crop choices.
In this extension, there are k = 1, ..., K crops in the economy. For each crop, there is a
conventional variety v = 0 and an exportable variety v = 1. Each crop-variety pair is homogeneous.
The price of the conventional variety in urban centers is pk0 and the premium is δi. There is
a trade cost τik. No arbitrage ensures that the price of agricultural produce in region i is
pik0 = τik pk0 for the conventional variety and pik1 = δi τik pk0 for the exportable one.
In addition to being heterogeneous in terms of their location preferences and management
ability, individuals are also heterogeneous in terms of their productivity in producing a given crop
k. The crop-specific productivity is drawn from a Fréchet distribution, $G_K(z) = e^{-z^{-\xi}}$.
Individuals observe this productivity term after observing their location preferences and
management abilities, and after making their location and occupation choices.
The technology to produce a good is given by
\[ q_{ik}(s) = A_{ik}\,\bigl(z_k(s)\, s\bigr)^{1-\gamma} \left[ \bigl(l_{ikv}(s)\bigr)^{\alpha} \bigl(n_{ikv}(s)\bigr)^{1-\alpha} \right]^{\gamma}. \]
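Notice that, relative to the main formulation, we rearranged the productivity term Aik in
the production function; this facilitates the algebra in our derivations. The optimal land use
is given by
\[ l^{*}_{ikv}(s) = \kappa_L\, s\, z_k(s) \left( \frac{A_{ik}\, p_{ikv}}{(r_i)^{1-\gamma(1-\alpha)} (w_i)^{\gamma(1-\alpha)}} \right)^{\frac{1}{1-\gamma}}, \tag{39} \]
the use of variable labor is
\[ n^{*}_{ikv}(s) = \kappa_N\, s\, z_k(s) \left( \frac{A_{ik}\, p_{ikv}}{(r_i)^{\gamma\alpha} (w_i)^{1-\gamma\alpha}} \right)^{\frac{1}{1-\gamma}}, \tag{40} \]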
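and the net profit of farmers producing activity k variety v is
\[ \pi^{*}_{ikv}(s) = \kappa_{\pi}\, s\, z_k(s) \left( \frac{A_{ik}\, p_{ikv}}{\bigl(r_i^{\alpha} w_i^{1-\alpha}\bigr)^{\gamma}} \right)^{\frac{1}{1-\gamma}} - \mathbb{1}(v = 1)\, w_i f. \tag{41} \]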
From equation (39), we can see that farmers with higher management ability s will also
have, on average, larger farms. This will occur even within a crop choice (that is, conditional
on k being the choice that maximizes $\pi^{*}_{ikv}(s)$).
In this extension of the model, farmers make discrete choices about which crop to produce
on their farm. We can integrate these discrete choices over the continuum of farmers using our
distributional assumption that zk(s) is Fréchet. Similar to derivations in Sotelo (2016), we
get the following share of farmers producing crop k:
\[ \lambda_{ik} = \frac{(A_{ik}\, p_{ik})^{\frac{\xi}{1-\gamma}}}{\Phi_i^{\theta}}, \]
where $\Phi_i \equiv \kappa \left[ \sum_k (A_{ik}\, p_{ik0})^{\frac{\xi}{1-\gamma}} \right]^{\frac{1}{\xi}}$. Here, Φi is a term that captures the role of comparative
advantage forces in land use. For example, an increase in pik0 of a single crop changes Φi.
The extent to which this price affects overall productivity depends on ξ, which controls the
heterogeneity in crop productivity across farmers. When ξ is low, so that there is high crop
heterogeneity, changes in pik0 generate smaller changes in the share of farmers λik dedicated
to a given crop. Notice that, because s and δ operate as a profit shifter that is proportional
to all crops, they do not affect crop-decisions.
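To illustrate the role of ξ, the sketch below computes crop shares using the standard Fréchet discrete-choice result, in which the share of crop k is proportional to (Aik pik)^(ξ/(1−γ)). As a simplifying assumption, the shares are normalized to sum to one, which abstracts from the constant κ. The crop productivities, prices, parameter values, and function names are hypothetical.

```python
def crop_shares(A, p, xi, gamma):
    """Shares proportional to (A_k * p_k) ** (xi / (1 - gamma)), normalized to sum to one."""
    weights = [(a * price) ** (xi / (1.0 - gamma)) for a, price in zip(A, p)]
    total = sum(weights)
    return [w / total for w in weights]


def share_response(xi, gamma=0.5, bump=1.10):
    """Change in the share of crop 0 after a 10 percent increase in its price."""
    A = [1.0, 1.0, 1.0]  # hypothetical crop productivities
    p = [1.0, 1.0, 1.0]  # hypothetical crop prices
    before = crop_shares(A, p, xi, gamma)[0]
    after = crop_shares(A, [p[0] * bump] + p[1:], xi, gamma)[0]
    return after - before


low_xi = share_response(xi=1.5)   # high heterogeneity in crop productivity
high_xi = share_response(xi=6.0)  # low heterogeneity in crop productivity
print(low_xi, high_xi)

# A shifter that scales the price of every crop leaves the shares unchanged.
base = crop_shares([1.0, 1.2, 0.8], [1.0, 0.9, 1.1], xi=3.0, gamma=0.5)
scaled = crop_shares([1.0, 1.2, 0.8], [1.3, 0.9 * 1.3, 1.1 * 1.3], xi=3.0, gamma=0.5)
print(base, scaled)
```

In this example the same price increase moves the crop share less when ξ is low, and a proportional shifter common to all crops cancels out of the shares.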
Lastly, one can show that the average value added per worker becomes
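\[ y_i = \underbrace{(\Phi_i)^{1-\gamma}}_{\text{Comparative Advantage}} \times \underbrace{(S_i)^{1-\gamma}}_{\text{Ability}} \times \underbrace{(V_i)^{\gamma(1-\alpha)}}_{\text{Variable Labor}} \times \underbrace{\left(\frac{L_i}{N_i}\right)^{\gamma\alpha}}_{\text{Congestion}}. \]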
In the equation above, unlike in equation (10), the natural advantage becomes an endogenous
term that captures the role of prices and selection of farmers into different crops.
E.9 First-Order Approximation to Counterfactuals
This section derives the first-order approximation of a change in value added (that is, pay-
ments to farmworkers, managers and land) in response to a change in the price premium for
the production of exportable agricultural goods (δi ).
The problem of individuals in a region i is equivalent to that of a planner in that region
that chooses land use, labor use, and occupational choices to maximize total revenues, subject
to the endowment of land and skill distribution. Dropping the region index for simplicity,
we get:
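\[ VA = \max_{l_v(s),\, n_v(s),\, \Sigma_v,\, \Sigma_w} \sum_v \int_{s\in\Sigma_v} p_v\,(As)^{1-\gamma} \left[ \bigl(l_v(s)\bigr)^{\alpha} \bigl(n_v(s)\bigr)^{1-\alpha} \right]^{\gamma} ds \]
subject to
\[ N \sum_v \int_{s\in\Sigma_v} l_v(s)\, dG(s) = L, \]
\[ \sum_v \int_{s\in\Sigma_v} n_v(s)\, dG(s) = \int_{s\in\Sigma_w} s\, dG(s), \]
\[ \sum_v \int_{s\in\Sigma_v} s\, dG(s) + \int_{s\in\Sigma_w} s\, dG(s) = 1. \]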
Recall that p1 = p0 δ and denote by lv∗ (s) , n∗v (s) , Σ∗v and Σ∗w the optimal choices of the
central planner. Now consider a change in δ, holding these optimal choices fixed. The
envelope theorem implies
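\[
\begin{aligned}
dVA &= p_0\, d\delta \int_{s\in\Sigma_1^{*}} (As)^{1-\gamma} \left[ \bigl(l_1^{*}(s)\bigr)^{\alpha} \bigl(n_1^{*}(s)\bigr)^{1-\alpha} \right]^{\gamma} ds \\
&= \frac{d\delta}{\delta} \int_{s\in\Sigma_1^{*}} p_1\,(As)^{1-\gamma} \left[ \bigl(l_1^{*}(s)\bigr)^{\alpha} \bigl(n_1^{*}(s)\bigr)^{1-\alpha} \right]^{\gamma} ds.
\end{aligned}
\]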
The expression above implies
\[ \Delta VA = Y \lambda_1 \Delta\delta, \]
which is equation (12) in the main body of the paper.
F Additional Tables and Figures
Figure A.1: Average Farm Productivity and Average Distance to Urban Centers across
Municipalities of Colombia
[Figure: binscatter plot. Y-axis: Log(avg farm productivity). X-axis: Log(avg distance to urban center).]
Notes: This figure presents a binscatter plot illustrating the conditional relationship between average farm productivity and
average distance to urban centers across municipalities of Colombia. The y-axis represents the log of average of measured
productivity across farms in each municipality and the x-axis represents the log of distance between the municipality and the
nearest urban center. Urban centers are defined as departamento capitals. Measured farm productivity assumes a Cobb-Douglas
function with land, labor, and machinery as inputs (see Appendix Section B for details). The X and Y variables are residualized
and the mean of each variable is added back to its residuals before plotting. Control variables include: total rainfall, average
terrain slope, ruggedness, altitude, and the average distance from each municipality to the closest river. The line represents a
linear fit.
Figure A.2: Distribution of Farm Output in Value and Measured Farm Productivity
(a) Farm Output in Value (b) Measured Farm Productivity
[Figure: two density plots. Panel (a) x-axis: Log of farm output in value. Panel (b) x-axis: Log of measured farm productivity. Y-axis in both panels: Density.]
Notes: Panel A shows the distribution of the log of farm output in value, computed with the information on volume of
production and crop prices. Panel B shows the distribution of our baseline measured farm productivity, which incorporates
information on land use, labor use, and machinery. See text for details on the construction of these variables.
Figure A.3: Distribution of Farm Capital Before and After Imputations
(a) Sample with information (b) Whole sample after imputation
[Figure: two density plots. Panel (a) x-axis: Log of the value of farm capital. Panel (b) x-axis: Log of value of farm capital. Y-axis in both panels: Density.]
Notes: Panel A shows the distribution of the log of the farm capital computed with the information on farm machinery and
machinery prices for the farms that report machines. Panel B shows the distribution of this variable after the imputation for
capital value in farms that do not report any machinery. See text for details on the construction of these variables.
Figure A.4: Current Departamento Capitals and Historical Origins of Urban Centers
(a) Colonial Gold Mines (b) Indigenous Settlements
Notes: Panel A shows the location of colonial gold mines (triangles) and departamento capitals (dots) by year of establishment
(lighter colors represent earlier years). Panel B shows with stars the location of indigenous settlements between 1510 and
1561. Data on mines was compiled from historical records presented in Urrutia and Ortiz (2015) and Colmenares (1972). Data
on indigenous settlements is available in the CEDE-Data Center from Universidad de los Andes and was compiled from historical
records presented in Melo (1995). We compute the distance from the centroid of every municipality to the nearest indigenous
settlements and colonial gold mines. We then use these distances as instrumental variables to evaluate the impact of the
distance of a municipality to the nearest urban center nowadays.
Figure A.5: Price Premium and Population for Selected Crops
[Figure: scatter plot with series for Tomato and Rice. Y-axis: Price Premium. X-axis: Log of Population.]
Notes: This figure shows the relationship between the price premium for exportable (high-quality) types of tomatoes and rice
and the population of the market. Specifically, for tomatoes, data on prices is available for two types: “tomate larga vida”
and “tomate chonto”; the former is export-oriented. For rice, data is available for high- and low-quality categories. On the
y-axis, we plot the average ratio of the price of the exportable variety relative to the conventional variety between 2016
and 2019 in a given market. On the x-axis, we plot the log of the population in that given market. The red line presents the
slope of a regression of the price premia against the log of the population, which equals 0.046.
Figure A.6: Examples of Types of Tomatoes

(a) Larga-Vida (b) Chonto
Notes: This figure shows the two types of tomatoes with measures of price premia. Tomatoes “larga-vida” are considered of
higher quality and more often exported, whereas tomatoes “chonto” are of lower quality and tend to be consumed domestically.
Prices measured in June 2021.
Figure A.7: Distribution of the Impact of Endogenous Reallocation of Farmers
(a) Within Region Integration (b) Between Region Integration
Notes: This figure shows the additional impact of changes in the price premia when we allow farmers to re-optimize their
occupational choices, which changes the pool of farmers in the market. Panel A presents the additional impact in the first set
of counterfactuals, in which we change the price premia within the catchment area of different urban centers. Panel B shows
the additional impact in the second set of counterfactuals, in which we equalize the price premia between the catchment area
of different urban centers.
Figure A.8: Changes in Si and Vi in the First Set of Counterfactuals
Notes: This figure shows the distribution of changes in Si and Vi when we change the exportable price premia in our first set
of counterfactuals.
Table A.1: Gap in Value Added between Municipalities
Quintile Ratio value added per worker
Q1 1.0
Q2 2.1
Q3 3.1
Q4 4.8
Q5 12.1
Notes: This table presents the gap in the municipality’s average value added per worker across quintiles of the value added
distribution. The gaps are computed relative to average value added in municipalities in Q1.
Table A.2: Initial Inspection of the Relationship between Farm Productivity and Distance
to Departamento Capital
(1) (2) (3) (4) (5) (6) (7)
Log(Avg dist) -0.269* -0.261* -0.230 -0.232 -0.228 -0.253* -0.269*
(0.143) (0.147) (0.140) (0.144) (0.144) (0.148) (0.156)
R2 0.148 0.155 0.187 0.168 0.172 0.157 0.214
Obs 32 32 32 32 32 32 32
Rainfall Y Y
Slope Y Y
Altitude Y Y
Ruggedness Y Y
Dist river Y Y
Notes: This table provides an initial inspection of the relationship between distance to urban centers and measured farm
productivity in Colombia. The dependent variable is the average of the measured farm productivity across municipalities in
Colombia—weighted by the production area— and the explanatory variable is the average distance to the departamento capital.
Each column controls for a different geographic control, which is the average of that geographic variable across municipalities
within departamentos, with the exception of rainfall, for which we take the sum. Robust standard errors in parenthesis. ***
denotes significance at 1% level, ** significance at 5% and * at 10%.
Table A.3: First Stage Estimates

(1) (2) (3) (4)
Panel A: Dep var is DistUC
Dist to Mines 0.319*** 0.166* 0.162**
(0.053) (0.086) (0.074)
Dist to Ind. Settlements 0.310*** 0.178** 0.167**
(0.057) (0.084) (0.077)
R2 0.641 0.643 0.649 0.667
Panel B: Dep var is log(DistUC)

Dist to Mines 87.309*** 58.119*** 55.874***
(12.274) (20.269) (15.560)
Dist to Ind. Settlements 80.066*** 34.016 33.066*
(14.159) (21.567) (19.011)
R2 0.797 0.785 0.804 0.829
Obs 1082 1082 1082 1082
Nearest UC FE Y Y Y Y
Geo+Suitability Y Y Y Y
Population 1760-1808 Y
Non-agricultural labor Y
Notes: This table presents the first stage regressions associated with the IV results in the paper. Robust standard errors
in parenthesis. *** denotes significance at 1% level, ** significance at 5% and * at 10%. See Table 1 for details about the
specifications.
Table A.4: Relationships between Farm Size, Measured Farm Productivity, and Selection
into Non-local Markets within Veredas
(1) (2) (3)
Panel A: Log farm productivity
Log(Farm Size) 0.334*** 0.357*** 0.339***
(0.011) (0.007) (0.008)
R2 0.181 0.434 0.664
Obs 854946 854946 854946
Panel B: Log farm productivity

Non-local markets 0.410*** 0.358*** 0.465***
(0.054) (0.035) (0.050)
R2 0.018 0.338 0.591
Obs 854946 854946 854946
Panel C: Farmer sells to non-local markets

Log(Farm Size) 0.362 0.908*** 2.016***
(0.443) (0.244) (0.215)
R2 0.000 0.399 0.760
Obs 854946 854946 854946
Vereda FE Y
Vereda-Crop FE Y
Notes: This table shows the main results from Table 2 within veredas. Standard errors clustered by municipality in parenthesis.
*** denotes significance at 1% level, ** significance at 5% and * at 10%. The unit of observation is the farm. The dependent
variable in Panels A and B is the log of farm productivity. The dependent variable in Panel C and the independent variable
in Panel B is a dummy variable for whether the farmer sells to non-local markets. Column (2) includes fixed effects for each
vereda. Column (3) includes fixed effects for the vereda-crop pair, so that we compare only farmers producing the same crop
within the vereda.
Table A.5: Relationships between Alternative Measures of Farm productivity and Farm size
within Municipalities and Crop-Choices
(1) (2) (3)
Panel A: Log farm productivity (only farms with machinery)
Log(Farm Size) 0.249*** 0.312*** 0.283***
(0.002) (0.002) (0.002)
R2 0.096 0.317 0.548
Obs 163929 163929 163929
Panel B: Log farm productivity (no machinery)

Log(Farm Size) 0.334*** 0.344*** 0.325***
(0.011) (0.008) (0.008)
R2 0.181 0.357 0.581
Obs 854946 854946 854946
Panel C: Log farm productivity (permanent workers)

Log(Farm Size) 0.336*** 0.346*** 0.327***
(0.011) (0.008) (0.008)
R2 0.182 0.358 0.581
Obs 854946 854946 854946
Municipality FE Y
Municipality-Crop FE Y
Notes: This table shows the main results from Table 2 for alternative measures of farm productivity. See text for more details
on the measures of productivity. Robust standard errors in parenthesis. *** denotes significance at 1% level, ** significance at
5% and * at 10%.
Table A.6: The Impact of Access to Urban Centers - Flexible IV Specification
OLS OLS IV IV IV IV
(1) (2) (3) (4) (5) (6)
Panel A: Dep var log of average agricultural productivity
log(DistUC) -0.192*** -0.140*** -0.374*** -0.432*** -0.370*** -0.400***
(0.048) (0.048) (0.138) (0.116) (0.124) (0.145)
R2 or F 0.416 0.474 18.044 9.684 14.266 8.502
Panel B: Dep var is the share of farms selling to non-local markets

log(DistUC) -0.121*** -0.072*** -0.214*** -0.351*** -0.260*** -0.252***
(0.024) (0.018) (0.063) (0.072) (0.063) (0.067)
R2 or F 0.447 0.600 18.044 9.684 14.266 8.502
Panel C: Dep var log of average farm size

log(DistUC) 0.995*** 0.998*** 1.443*** 2.151*** 1.819*** 1.346***
(0.177) (0.154) (0.373) (0.358) (0.352) (0.289)
R2 or F 0.696 0.753 18.044 9.684 14.266 8.502
IV Mines Ind Both Both
Obs 1082 1082 1082 1082 1082 1082
Nearest UC FE Y Y Y Y Y Y
Geo+Suitability Y Y Y Y Y
Notes: This table shows results for Table 1 using a flexible specification for our instrumental variables. In particular, we divide
our instruments into six bins (0-100, 100-200, 200-300, 300-400, 400-500, 600 or more) in terms of distance to colonial mines
and distance to indigenous settlements. See Table 1 for details on specifications.
Table A.7: The Impact of Access to Urban Centers - Specification with Distance to Nearest
Urban Center in Levels
OLS OLS IV IV IV IV
(1) (2) (3) (4) (5) (6)
Panel A: Dep var is log of average measured farm productivity
log(DistUC) -0.106*** -0.081** -0.207*** -0.194*** -0.198*** -0.207***
(0.029) (0.032) (0.063) (0.058) (0.058) (0.059)
R2 or F 0.417 0.475 31.023 59.948 27.704 43.626
Panel B: Dep var is the share of farms selling to non-local markets

log(DistUC) -0.090*** -0.055*** -0.121*** -0.128*** -0.126*** -0.126***
(0.015) (0.015) (0.026) (0.025) (0.025) (0.026)
R2 or F 0.499 0.616 31.023 59.948 27.704 43.626
Panel C: Dep var log of average farm size

log(DistUC) 0.720*** 0.723*** 0.664*** 0.734*** 0.712*** 0.656***
(0.105) (0.090) (0.137) (0.122) (0.123) (0.129)
R2 or F 0.754 0.792 31.023 59.948 27.704 43.626
Obs 1082 1082 1082 1082 1082 1082
Notes: In Table 1, included in the main body of the paper, we estimated the impact of distance to urban centers using a
specification in logs. This table presents an alternative to that specification in which we use distance to urban centers in levels
(in units of 100 km). Robust standard errors in parenthesis. *** denotes significance at 1% level, ** significance at 5% and *
at 10%. See Table 1 for a full description of the specifications.
Table A.8: The Impact of Access to Urban Centers - Unweighted Estimations
OLS OLS IV IV IV IV
(1) (2) (3) (4) (5) (6)
Panel A: Dep var is log of average measured farm productivity
log(DistUC) -0.109*** -0.092*** -0.153 -0.155 -0.153 -0.182
(0.029) (0.029) (0.129) (0.148) (0.129) (0.133)
R2 or F 0.253 0.345 132.443 82.296 66.081 67.824
Panel B: Dep var is the share of farms selling to non-local markets

log(DistUC) -0.044*** -0.029*** -0.159*** -0.128** -0.159*** -0.165***
(0.010) (0.010) (0.053) (0.061) (0.053) (0.055)
R2 or F 0.224 0.302 132.443 82.296 66.081 67.824
Panel C: Dep var log of average farm size

log(DistUC) 0.303*** 0.296*** 0.917*** 0.519* 0.912*** 0.894***
(0.052) (0.060) (0.215) (0.277) (0.215) (0.215)
R2 or F 0.290 0.367 132.443 82.296 66.081 67.824
Obs 1082 1082 1082 1082 1082 1082
Notes: This table shows the main results from Table 1 without weights. Robust standard errors in parenthesis. *** denotes
significance at 1% level, ** significance at 5% and * at 10%. See Table 1 for a full description of the specifications.
Table A.9: Labor Employment and Sales to non-Local Markets

(1) (2)
Non local 0.245*** 0.428***
(0.050) (0.072)
Farm size controls Y
Notes: Robust standard errors in parenthesis. *** denotes significance at 1% level, ** significance at 5% and * at 10%.
This table shows the additional employment of labor by farms who sell their produce to external markets. Data comes from
Municipality Agricultural Evaluation of 2018. Controls for farm size include categories of farm size and the employment of
machineries.
Table A.10: The Impact of Access to Urban Centers - Alternative Measures of Productivity
OLS OLS IV IV IV IV
(1) (2) (3) (4) (5) (6)
Panel A: log of avg ag productivity (only include farms with machinery)
log(DistUC) -0.094 -0.052 -0.428** -0.122 -0.281 -0.306**
(0.058) (0.058) (0.213) (0.230) (0.218) (0.148)
R2 or F 0.310 0.388 30.205 39.354 16.756 23.633
Panel B: log of avg ag productivity (no use of data on machinery)

log(DistUC) -0.145*** -0.108** -0.369** -0.338* -0.353** -0.371**
(0.042) (0.045) (0.180) (0.178) (0.171) (0.174)
R2 or F 0.459 0.504 28.071 41.861 17.409 24.275
Panel C: log of avg ag productivity (only permanent workers)

log(DistUC) -0.194*** -0.141*** -0.570*** -0.559*** -0.564*** -0.594***
(0.048) (0.048) (0.176) (0.173) (0.168) (0.177)
R2 or F 0.418 0.474 31.721 48.348 19.590 22.534
Panel D: Share of farms not reporting machinery

log(DistUC) 0.053*** 0.037** 0.264*** 0.288*** 0.276*** 0.286***
(0.020) (0.016) (0.066) (0.062) (0.061) (0.066)
R2 or F 0.460 0.608 31.721 48.348 19.590 22.534
Obs 1082 1082 1082 1082 1082 1082
Notes: This table shows the results from Table 1 using alternative measures of farm productivity and for the share of farmers
reporting machinery in the municipality. Panels A and B control, in addition, for the share of farms in the municipality
reporting machinery. See text for more details on the measures of productivity. See Table 1 for details on specifications.
Table A.11: The Impact of Access to Urban Centers - Alternative Measures of Participation
in Non-local Markets
OLS OLS IV IV IV IV
(1) (2) (3) (4) (5) (6)
Panel A: Dep var is the share of farms selling to non-local markets (broader measure)
log(DistUC) -0.145*** -0.093*** -0.344*** -0.382*** -0.362*** -0.375***
(0.022) (0.016) (0.071) (0.078) (0.072) (0.074)
R2 or F 0.549 0.678 29.594 36.338 16.381 18.971
Panel B: Dep var is the share of farms selling to non-local markets (more restrictive measure)
log(DistUC) -0.066*** -0.033*** -0.172*** -0.176*** -0.174*** -0.183***
(0.017) (0.012) (0.045) (0.045) (0.044) (0.046)
R2 or F 0.470 0.630 29.594 36.338 16.381 18.971
Panel C: Dep var is the share farms producing for subsistence or barter trade
log(DistUC) 0.053*** 0.061*** 0.122** 0.115* 0.119** 0.135**
(0.019) (0.018) (0.055) (0.060) (0.055) (0.063)
R2 or F 0.406 0.494 29.594 36.338 16.381 18.971
Obs 1082 1082 1082 1082 1082 1082
Notes: This table shows the results from Table 1 with alternative measures of non-local markets. Robust standard errors in
parenthesis. *** denotes significance at 1% level, ** significance at 5% and * at 10%. The dependent variable in Panel A adds
to our main definition of non-local markets farmers who sell their produce to the industry and in local food markets. Panel B
excludes from our main definition farmers who sell to food wholesalers and farm-gate sales, which excludes farmers who use
part of their produce for household consumption or barter trade. The dependent variable in Panel C is the share of farms in
the municipality that produce for household consumption or barter trade. See Table 1 for a description of the control variables.
OLS columns report R2 and IV columns report F from the Kleibergen-Paap weak instrument test.
Table A.12: The Impact of Access to Urban Centers - Robustness to Alternative Mechanisms
IV IV IV IV IV
(1) (2) (3) (4) (5)
Panel A: Dep var log of average agricultural productivity
log(DistUC) -0.582*** -0.522*** -0.582*** -0.582*** -0.416***
(0.176) (0.176) (0.176) (0.176) (0.125)
R2 or F 22.534 20.218 22.534 22.534 33.252
Panel B: Dep var is the share of farms selling to non-local markets

log(DistUC) -0.349*** -0.332*** -0.349*** -0.349*** -0.268***
(0.083) (0.085) (0.083) (0.083) (0.055)
R2 or F 22.534 20.218 22.534 22.534 33.252
Panel C: Dep var log of average farm size

log(DistUC) 1.801*** 1.723*** 1.801*** 1.801*** 1.792***
(0.399) (0.372) (0.399) (0.399) (0.305)
R2 or F 22.534 20.218 22.534 22.534 33.252
IV Both Both Both Both Both
Obs 1082 1082 1082 1082 1082
Nearest UC FE Y Y Y Y Y
Pop 1760-1808 Y Y Y Y Y
Non-agricultural labor Y Y Y Y Y
Institutions Y
Violence Y
Current mines Y
Current pop and educ Y
Notes: This table shows the results from Table 1 with additional control variables. Robust standard errors in parenthesis. ***
denotes significance at 1% level, ** significance at 5% and * at 10%. Institutions variables include year of foundation of the
municipality, investments in sewage and piped water, and presence of the public agricultural bank in the municipality. Violence
controls include years of presence of non-state armed actors and the number of hectares planted with coca. Education variables
include the share of farmers with a high school degree and with an associate degree. Current mines refers to indicator variables for
active mines of gold and other minerals in 2013. See notes in Table 1 for additional explanation.
Table A.13: The Impact of Access to Urban Centers on the Share of Individuals who are
Farmworkers relative to Farm Managers and on Soil Improvements
OLS OLS IV IV IV IV
(1) (2) (3) (4) (5) (6)
Panel A: Dep var is the share of farmworkers relative to farmers
log(DistUC) -0.021*** -0.014* -0.068*** -0.065** -0.066*** -0.067**
(0.007) (0.008) (0.025) (0.025) (0.024) (0.027)
R2 or F 0.264 0.369 31.721 48.348 19.590 22.534
Panel B: Dep var is the share of area w/ soil improvements

log(DistUC) -0.128*** -0.113*** -0.174*** -0.139*** -0.156*** -0.161***
(0.012) (0.011) (0.027) (0.023) (0.023) (0.025)
R2 or F 0.479 0.597 31.721 48.348 19.590 22.534

Obs 1082 1082 1082 1082 1082 1082
Notes: This table provides complementary evidence of the predictions of the model about the spatial patterns of agricultural
production. Specifically, the model indicates that regions farther away from urban centers should have a larger share of
individuals choosing to become farmers (relative to becoming farmworkers), and smaller average costs per farm that can be
related to production costs of exportable varieties of agricultural goods. In this table, we use share of area in a municipality
with soil improvements as a proxy for such costs. Robust standard errors in parenthesis. *** denotes significance at 1% level,
** significance at 5% and * at 10%. See notes in Table 1 for a full description of the specifications.
Table A.14: The Impact of Access to Urban Centers on the Price and Profit Premia of
Exportable Varieties of Agricultural Goods
Price Premium (δi )
OLS IV
(1) (2)
log(DistUC) -0.008*** -0.042***
(0.002) (0.013)
R2 or F 0.297 67.337
IV Both
Obs 1082 1082
Nearest UC FE Y Y
Geo+Suitability Y Y
Notes: This table shows the impact of access to urban centers on the log of the model-implied price premium. Specifications in
this table replicate columns (2) and (5) from Table 1. Robust standard errors in parenthesis. *** denotes significance at 1%
level, ** significance at 5% and * at 10%.
Table A.15: Aggregate Impacts of Changes in the Price Premia of Exportable Varieties of
Agricultural Goods - Pareto distribution
Types of Market Integration
Within Regions Between Regions Between and Within
Endogenous Endogenous Endogenous
Reallocations Reallocations Reallocations
FOA Partial Full FOA Partial Full FOA Partial Full
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Panel A: Aggregate Effects
Aggregate VA 4.29 4.73 4.77 1.56 1.68 1.71 5.93 6.69 6.76
Non-local sales - 35.16 35.16 - 22.89 22.89 - 41.03 41.03
Avg farm size - 23.14 23.95 - 8.02 8.37 - 39.94 39.77
Panel B: Above the Median Increase in Price Premium

Aggregate VA 5.35 5.96 6.69 1.97 2.15 2.43 6.86 7.82 8.53
Non-local sales - 42.58 42.58 - 37.73 37.73 - 46.57 46.57
Avg farm size - 24.87 25.53 - 8.55 8.81 - 43.50 43.03
Panel C: Below the median Increase in Price Premium

Aggregate VA 3.06 3.31 2.54 0.95 0.97 0.59 4.85 5.38 4.70
Non-local sales - 28.26 28.26 - 12.74 12.74 - 35.88 35.88
Avg farm size - 11.43 13.25 - 5.13 5.94 - 15.88 17.72
Notes: This table replicates results from Table 4 using a Pareto distribution for the distribution of farmers.
Table A.16: Aggregate Impacts of Changes in the Price Premia of Exportable Varieties of
Agricultural Goods - Bilateral Migration Costs
Types of Market Integration
Within Regions Between Regions Between and Within
Endogenous Endogenous Endogenous
Reallocations Reallocations Reallocations
FOA Partial Full FOA Partial Full FOA Partial Full
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Panel A: Aggregate Effects
Aggregate VA 9.13 9.76 9.95 3.21 3.39 3.51 12.65 13.75 13.99
Non-local sales - 34.48 34.48 - 22.21 22.21 - 40.38 40.38
Avg farm size - 51.21 57.10 - 15.08 19.96 - 88.11 91.40
Panel B: Above the Median Increase in Price Premium

Aggregate VA 11.58 12.48 14.15 4.14 4.41 4.91 14.88 16.30 17.93
Non-local sales - 42.29 42.29 - 36.79 36.79 - 46.21 46.21
Avg farm size - 56.24 62.19 - 16.50 21.82 - 97.51 100.45
Panel C: Below the median Increase in Price Premium

Aggregate VA 6.34 6.68 5.17 1.91 1.94 1.55 10.12 10.85 9.51
Non-local sales - 27.22 27.22 - 12.13 12.13 - 34.97 34.97
Avg farm size - 17.17 22.64 - 7.19 9.62 - 24.58 30.23
Notes: This table replicates results from Table 4 using bilateral migration cost for the impact of migration in the counterfactual.
Specifically, we assume that farmworkers have to pay a migration cost to move between the location in the baseline calibration
and the one in the counterfactual.

Panel B: Dep var is the share of area w/ soil improvements

IV Mines Ind Both Both

Panel B: Above the Median Increase in Price Premium

Panel C: Below the median Increase in Price Premium