MLF Growth Notes Net PDF

11/06/2014
Introduction to Economic Growth
Miguel Lebre de Freitas (afreitas@ua.pt)

Universidade de Aveiro, Campus de Santiago, Aveiro, Portugal
Contents
Introduction: The growth question
Part I – Basic models
1. The Malthus model

2. The basic Solow model
3. Exogenous Growth
4. The Neoclassical model with Human Capital
5. The AK model
Part II – Technology and its diffusion
6. Learning by Doing
7. Excludable knowledge
8. Creative destruction
9. Technology adoption
Part III – Policies, geography and Institutions
10. Government inputs

11. Distortions
12. Economies of scale
13. Corruption and rent seeking
Epilogue: what have we learned?
http://sweet.ua.pt/afreitas/growthbook/capa.htm iii
Detailed Contents
0. Introduction: the growth question
1. The Malthusian model
1.1. Introduction
1.2. The Malthus Model
1.3. Technological change in the Malthus model
1.4. Demographic Transition
1.5. Globalization, fertility and the Great Divergence
1.6. Discussion
2. The basic Solow model
2.1. Introduction
2.2. The Solow model
2.3. The Solow model and the facts of economic growth
2.4. Transitional Dynamics
2.5. The Golden Rule
2.6. The model with endogenous savings
2.7. The Solow Residual
2.8. Discussion
Appendix 2.1: The optimal consumption path in a simple 2-period model
3. Exogenous Growth
3.1. Introduction
3.2. Perfect technological diffusion
3.3. The extended Solow Model
3.5. The extended Solow model meeting the real world facts
3.6. Growth accounting revisited
3.7. Discussion: what we have achieved?
Appendix 3.1. Transition dynamics in the Solow model
4. The Neoclassical model with Human Capital
4.1. Introduction
4.2. The Lucas Paradox
4.3. Human capital
4.4. The augmented Solow model (MRW)
4.5. Empirical controversies
4.6. Discussion: two directions for our quest
5. The AK model
5.1 Introduction
5.2. The simple AK model
5.3. The Harrod-Domar model
5.4. The AK model with endogenous savings
5.5. The AK model with Physical and Human Capital
5.6. A two sector model of endogenous growth
5.7. Neoclassical models of endogenous growth
5.8. Empirical controversies
5.9. Discussion
Appendix 5.1. Unbalanced growth in the HD model
6. Learning by Doing
6.1. Introduction
6.2. Externalities on capital accumulation
6.3. The market failure and optimal intervention
6.4. The case with external economies of scale
6.5. The Learning by doing model
6.6. Learning by doing and comparative advantages
6.7. Discussion
Appendix 6.1. Strong versus weak scale effects in endogenous growth models
7. Excludable knowledge
7.1. Introduction
7.2. R&D Taxonomy
7.3. A simple model with horizontal and vertical innovations
7.4. The role of excludability and market size
7.5. Making knowledge excludable
7.6. Too little R&D
7.7. Discussion
8. Creative destruction
8.1. Introduction
8.2. Creative destruction
8.3. The optimal level of R&D
8.4. Multiple sector considerations
8.5. Competition and innovation
8.6. Discussion
9. Technology adoption
9.1. Introduction
9.2. Vehicles of technological diffusion
9.3. Barriers to technology diffusion
9.4. Matching specific needs
9.5 A simple model of technological adoption
9.6. Discussion
Part III – Getting the prices right
10. Government inputs
10.1. Introduction
10.2. The role of government in the economy
10.3. Public inputs
10.4. A simple growth model with government spending
10.5. Intervention trade-offs
10.6. Discussion
Appendix 10.1. The case with a pure public good.
11. Distortions
11.1. Introduction
11.2. Distortions in the consumption-saving decisions
11.3. Financial deepening and economic growth
11.4. Distortions in factor markets
11.5. Tax cum subsidy schemes
11.6. Tax evasion
11.7. Monopoly
11.8. Externalities again
11.9. The Washington Consensus
11.10 Discussion
12. Traps and cycles
12.1. Introduction
12.2. The Big Push theory
12.3. The extent of the market and the division of labour
12.4. The division of labour and the extent of the market
12.5. Transport costs and economic geography
12.6. Geography and economic development
12.7 Discussion
13. Corruption and rent seeking
13.1. Introduction
13.2. Corruption
13.3. A model of centralized corruption
13.4. The model with decentralized corruption
13.5. Coping with decentralized corruption
13.6. When institutions become dysfunctional
13.7. Discussion
Organization
General
Part I introduces the basic models of economic growth, namely the Malthusian
model, the Solow model and the AK model, as well as some of their variants. These
models focus mainly on the contribution of inputs to production, while technology is in
general assumed exogenous. This part also makes the point that factor accumulation
alone cannot explain economic growth.
Part II is dedicated to productivity in the “engineering sense”. The main question
is how the technological level is determined and why it differs across space. The
theories addressed include learning by doing, the Schumpeterian model of economic
growth and a model of technological catch-up.
Part III is devoted to productivity in the sense of resource allocation. The main
concern is the effectiveness with which factors of production and a given technology are
combined to produce valuable output. The chapters discuss the role of the government
in providing public goods and infrastructure, as well as the pervasive role of government
policies that distort the allocation of resources. This part of the book also addresses the
problem of government failures and stresses the key role of institutions in shaping the
incentives of policymakers.
The proposed structure is an attempt to organize the narrative, given the models
we selected. To take full advantage of the analytical tools and concepts as they are
introduced in each chapter, we often cross the thematic boundaries the chapter is said
to address. For instance, questions like endogenous technological change, cumulative
causation and static vs. dynamic efficiency arise as early as Chapter 1. In Chapter 6,
where the main topics are knowledge spillovers and learning by doing, we take
the opportunity to discuss other externalities and the implications for international trade.
For this reason, some readers may find the sequence odd.
Another implication of pursuing an essentially model-based narrative is that some
important topics are scattered throughout the book, instead of being systematically
addressed in dedicated chapters. This includes, for instance, the role of international
trade, income inequality and institutions. Hopefully, in a later version, I will include
a final chapter summarizing the main conclusions for each of these questions.
Chapter 1 focuses on the relationship between population and economic
development. This chapter offers a first approach to the problem of diminishing returns
and explains the importance of technological progress in overcoming it. The chapter
briefly introduces the mutual causality between the size of the population and
technological change. Finally, the chapter reviews the theory of demographic transition
and presents a theory to explain why some countries entered modern growth later
than others.
Chapter 2 is devoted to the basic Solow model. It is shown that, once physical
capital is introduced in the production function, wages no longer need to converge to a
low-income equilibrium trap. Still, because physical capital is itself subject to diminishing
returns, this model is not capable of generating sustained economic growth. The chapter
includes a discussion on the optimality of the saving rate and shows that making the
saving rate endogenous does not alter the conclusions qualitatively. Finally, with the
help of a simple growth accounting exercise, it is argued that assuming an invariant
technology is at odds with real-world facts.
Chapter 3 extends the basic Solow model to the case where technology expands
over time. It is shown that this modification rescues the model from its main limitation
and makes it capable of describing the main stylized facts of economic growth. It is
explained why technological progress has to be exogenous in this model.
Chapter 4 explains the failure of the Solow model in accounting for the magnitude
of cross-country differences in per capita income. It then extends the
model so as to account for human capital accumulation, and shows that this extension
improves the explanatory power of the model. Even so, it is argued that accounting
for human capital is not enough to explain the large cross-country income disparities we
observe in the real world. Thus, one needs to account for the possibility that productivity
differs across countries. The chapter ends with a discussion on the need to better
understand what drives total factor productivity, distinguishing the two main
components: “efficiency in resource allocation”, which we relate to the level of
productivity (to be addressed in Part III); and “technology”, which is assumed to drive the
growth rate of productivity (to be addressed in Part II).
Chapter 5 reviews different incarnations of the AK model to show that, once
diminishing returns are removed, factor accumulation can lead to unceasing growth, without
the need to postulate exogenous technological progress. It is argued that the empirical
evidence has not been very favourable to the view that factor accumulation alone is
enough to induce sustained growth. It is pointed out, however, that the linear structure
of the AK model underlies the models of endogenous technological change that we
address later in the book.
Chapter 6 motivates the AK model with the theory that externalities related to
capital accumulation can be strong enough to overcome diminishing returns. The
chapter reviews the case with static Marshallian externalities and basically argues that a
similar case holds with learning by doing. The chapter introduces the concept of
cumulative causation and addresses briefly the localized-versus-global knowledge
spillovers controversy. The implications of learning by doing for international trade
policy are also discussed.
Chapter 7 addresses the question of why some people devote time and effort to
develop new products and production processes. The chapter explains the role of
knowledge excludability in providing economic agents with market incentives to invent
new technology and explains the Schumpeterian trade-off between static efficiency and
dynamic efficiency. The role of patents and other mechanisms of knowledge exclusion
in securing the benefits of research is addressed. It is argued that, even with patent
coverage, inventors do not in general fully appropriate the benefits of their inventions,
so a case may exist for government intervention.
Chapter 8 brings to the analysis the competitive dimension of innovations. The
analysis focuses on vertical innovations, which come along with the destruction of
existing rents. A simplified Schumpeterian model is introduced, to illustrate the optimal
allocation of time between working time and time devoted to research. The chapter ends
with a discussion on the relationship between product market competition and
innovation.
Chapter 9 deals with the problem of a developing country whose main challenge
is to adopt technologies developed abroad. The chapter starts with a discussion on the
factors that delay the pace of cross-country technological diffusion. Then a simple
model of technological catch-up is presented. In this model, there is a world
technological frontier and the country’s policies determine how close the country gets to
that frontier. Backwardness provides the poor country with the potential
to catch up, but whether this advantage materializes or not depends on the country’s
economic and political circumstances.
Part III – Getting the prices right
Chapter 10 focuses on the role of government in providing essential inputs to
production, such as the rule of law and public infrastructure. A simple model with a
non-excludable good is used to explain the trade-off between the benefits of public
provision and the costs of taxation. The model is also used to illustrate the pervasive
role of government failures.
Chapter 11 addresses the effect of different types of distortions, in a context of
constant returns to scale. The chapter starts with distortions affecting the
consumption-savings decision, using transport costs and financial market
imperfections as examples. Then, it addresses the case of misallocation in factor markets caused by
distortionary taxation, tax evasion, monopoly, externalities, high inflation, and human
rural migration. Intervention trade-offs are again discussed, now adding the idea of
second-best decision-making. The chapter ends with a brief introduction to the
debate on economic reform in emerging economies, and the controversy surrounding
the Washington Consensus.
Chapter 12 returns to the case with economies of scale and imperfect
competition, to explore different types of coordination failures. The discussion covers
horizontal complementarities, vertical complementarities and briefly addresses the
challenges posed by international trade and international factor mobility. The chapter
ends with a discussion of the role of geography as a fundamental determinant of
economic performance.
Chapter 13 addresses the implications of corruption, using an extended Solow
model with public inputs. Three cases of corruption are distinguished: centralized
corruption, decentralized corruption and the case of generalized corruption. In the latter
case, strategic complementarities in corruption give rise to virtuous and vicious cycles.
The chapter describes the importance of institutions in getting the incentives right.
How to use this book?
The book allows some flexibility in terms of reading. However, some chapters
that depend on preceding chapters should not be read in isolation. The core chapters are
the Solow model and the AK model (chapters 2, 3 and 5). In the table that follows, next
to each chapter is indicated the chapter that should be read before:
The book is designed to cover a one-semester course fully devoted to economic
growth. In principle, the lecturer should be able to cover all chapters, skipping
non-essential sections in each chapter at his own discretion.
For shorter courses, possible options are the following:
One semester course
The book is basically designed for a course on economic growth with
development concerns. For a semester course on Economic Growth, the following
sequence is appropriate:
2. The Basic Solow model
3. Exogenous Growth (2)
4. The Neoclassical model with Human Capital (3)
5. Endogenous Growth (2)
6. Learning by doing (5)
8. Creative destruction (7)
9. Technology adoption (3)
A course more concerned with development issues could try the following:

3. Exogenous Growth (2)
5. Endogenous Growth (2)
9. Technology adoption (3)
10. Government Services (2)
11. Distortions (10)
13. Corruption and rent seeking (10)
Half term course in economic growth
The book can also be used in half term courses, skipping non-core sections.
A possible selection is the following:
3. Exogenous Growth
5.4. The AK model with endogenous savings

8.3. The optimal level of R&D
Other courses
Specific chapters can also be adopted in other courses. Chapters 2, 3, and parts
of 5 and 6 can be adopted in intermediate courses in macroeconomics. Policy-oriented
courses on macro or on public economics may find Chapters 10, 11 and 13 useful.
Chapters 12, 7 and 8 may be of interest for undergraduate courses on Industrial
Economics. Chapter 12 may be useful for a course on international trade. Chapter 1
alone may be of interest for a course on Economic History. The book can also provide a
brief introduction to growth models, for students engaged in more advanced studies.
Symbols and notation
Symbols used
Y = Output
Yd= Households’ disposable income
T = Land
N = Labour, population
y = Per-capita output
C = Private consumption
c = Private consumption per capita
 Physical capital-output elasticity
 Human capital-output elasticity
K = Physical Capital
I = Gross investment in Physical Capital
k = Capital per worker
 = Profits
L = Labour measured in efficiency units
 = Effective labour input per worker.
$\tilde{k}$ = Capital per unit of efficiency labour
$\tilde{y}$ = Output per unit of efficiency labour
H = Human Capital
$I_H$ = Gross investment in Human Capital
h = Human Capital per worker
$\tilde{h}$ = Human capital per unit of efficiency labour
s = Fraction of disposable income devoted to physical capital accumulation
$s_H$ = Fraction of disposable income devoted to human capital accumulation

k = Physical capital per unit of Human Capital

y = Output per unit of Human Capital
$s_R$ = Fraction of disposable income devoted to Research and Development
Speed of adjustment to the steady state in the neoclassical growth model 
 Depreciation rate
 Growth rate of per capita income/Growth rate of Harrod Neutral TFP
g = Hicks-neutral rate of technological progress
 Externality
 External effect of public inputs
 Subjective discount rate
 Fraction of working time devoted to rent-seeking
b = Effectiveness of the rent seeking effort
 Fraction of public expenditures which are unproductive
1-u = Fraction of human capital devoted to human capital accumulation
Fraction of the labour force devoted to R&D
r = Real Interest rate
w = Real wage-rate
G = Productive government expenditures
 = Production tax / income tax
 H = Tax on human capital income
 K = Tax on physical capital income

$x_j$ = Production of intermediate input j
X = Composite measure of intermediate inputs
$N_j$ = Raw labour used in production of intermediate input j
$N_Y$ = Labour used in the production of Y
F = Fixed cost
t = Time index
Mathematical notation
A dot over a variable denotes time variation:
$$\dot{X} = \partial X / \partial t.$$
The time variation divided by the level is the growth rate:
$$\hat{X} = \dot{X} / X.$$
When a variable grows at a constant rate, say g, over time, the relationship
between the value of X at time zero and at time t is:
$$X_t = X_0 e^{gt}.$$
In logs, a linear equation arises:
$$\ln X_t = \ln X_0 + gt.$$
In many figures, economic variables are represented in logs, so that we can read
the growth rate off the slope of a linear regression.
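As a minimal sketch of this log-linear relationship, the Python snippet below builds a series that grows at a constant rate and recovers that rate as the least-squares slope of ln X on t. The initial level (100) and the growth rate (2% per period) are hypothetical values chosen only for illustration, and the helper name is not part of the original notes.

```python
import math

def growth_rate_from_logs(times, values):
    """Least-squares slope of ln(X_t) on t, i.e. the constant growth rate g."""
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    cov = sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
    var = sum((t - mean_t) ** 2 for t in times)
    return cov / var

# Hypothetical initial level and growth rate (illustration only).
X0, g = 100.0, 0.02
times = list(range(51))
values = [X0 * math.exp(g * t) for t in times]   # X_t = X_0 * e^(g t)

g_hat = growth_rate_from_logs(times, values)
print(f"estimated growth rate: {g_hat:.4f}")
```

Because the series is exactly exponential, the estimated slope coincides with g up to rounding error. With real data the slope would instead give an average growth rate over the sample.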
Graphical illustration
Many figures throughout the book will describe steady states, some of which are
stable and others unstable. Visually, we will distinguish unstable and stable
steady states by drawing, respectively, a ball on the top of a hill and a ball lying in a
valley, as follows:
[Figure: unstable equilibrium (ball on the top of a hill); stable equilibrium (ball lying in a valley)]
Introduction to Economic Growth , 12/03/2014
“The most decisive mark of the prosperity of any country is the increase in the
number of its inhabitants”. [Adam Smith].
Learning Goals:
• Understand the challenges raised by the Law of Diminishing Returns for economic growth.
• Acknowledge the importance of the Malthus theory to explain historical facts.
• Understand the critical role of technology in overcoming diminishing returns.
• Explain the double role of population size in the race with technology.
• Understand the factors that drive the changing attitude towards fertility with economic development.
• Use the learned models to interpret the Great Divergence.
1.1. Introduction
The world population has been expanding at impressive rates. Over the last two
centuries, it increased from 1 billion to more than 7 billion.
Although population growth is decelerating, population is still increasing and is
expected to reach 9 billion in 2050. A question that arises is whether the continuing
population expansion will overwhelm the existing resources, posing a threat to living
standards.
This question was first formulated by the British philosopher Reverend Thomas
Malthus, in his famous book “An Essay on the Principle of Population”, published in
1798. Malthus contended that a fixed amount of natural resources could not feed a
constantly increasing population. Thus, the population explosion that was already
becoming evident in the 18th century should face some kind of barrier. Malthus
observed that societies throughout history had experienced at one time or another
different types of checks on excessive population growth, including epidemics, famines,
and wars, that masked the fundamental problem of populations overstretching their
resource limitations. These checks act as natural devices to prevent population from
expanding further.
As we will see in this chapter, Malthus’s ideas provide a useful tool to interpret
the almost constant population and living standards that characterized the pre-industrial
era. The Malthusian model might also be thought to provide a reasonable narrative for
some of today’s poorest countries and regions. But fortunately, Malthus’s
pessimism regarding the future of humankind did not materialize: somewhat
ironically, at the time Malthus was writing his book, a set of countries in Western Europe
entered a new phase of economic development, in which population and living
standards were expanding together. Malthus overlooked the role of technological
progress and its interaction with fertility choices, in overcoming the constraints posed
by natural resources.
To start a course on economic growth with the Malthusian model is most useful.
First, the narrative remains valid to describe a long period of human history. Second,
understanding the forces that triggered the changes in the demographic behaviour in the
now developed countries may shed some light on the challenges posed to the poorest
countries that are still to make such change. Finally, for the purposes of our study, the
Malthus model provides a good starting point to understand the mechanics of growth
models and the pervasive role of diminishing returns. It also provides a good
opportunity to introduce ideas to be explored later in the book, such as the role of
technological progress, static and dynamic efficiency and poverty traps.
Section 1.2 introduces the Malthus theory in its basic formulation. Section 1.3
discusses the implications of adding technological change to the Malthus model.
Section 1.4 presents some historical data to illustrate the change in demographic
behaviour over the last two centuries, as well as some explanations for this
phenomenon. Section 1.6 presents a theory to explain why some countries
were able to escape the Malthusian trap sooner than others. Section 1.7 concludes.
1.2. The Malthus model
When Thomas Malthus presented his theory, in 1798, his exposition was mainly descriptive.
A simple model will, however, help us understand the critical role played by some
assumptions as well as the mechanics underlying the argument. This section introduces
a simple model capturing the main drivers of the Malthus theory.
The Production Function
Consider a closed economy (i.e. one with no international trade) without
government, and basically devoted to agriculture. Output takes the form of a single
homogeneous good (Z), which is produced with labour inputs (N) and land inputs (T).
The relationship between inputs and output is described by an aggregate
production function of the form1:
$$Z_t = B\,T_t^{\beta} N_t^{1-\beta}. \tag{1.1}$$
In equation (1.1) the subscript t is a time index. The term B is called Total
Factor Productivity (TFP) and measures the state of technology (or “efficiency” in
production).
1 Specification (1.1) corresponds to a well-known class of production functions, named Cobb-Douglas.
The main properties of the Cobb-Douglas production function are diminishing returns and a unit elasticity
of substitution between inputs. The Malthus theory is consistent with any production function exhibiting
diminishing returns to labour, but we stick with this particular formulation, for simplicity.
In the following, it is assumed that population and the workforce are the same
and that wages are fully flexible, so employment will be equal to population. To capture
the existence of physical limits to land expansion, we assume that the amount of land
available to agriculture is fixed ($T_t = \bar{T}$).
If land is fixed and technology is given, the only way to increase production in
this model is by increasing the amount of labour used. However, this does not lead to an
increase in output per worker (or per capita income): as the number
of workers increases, output increases less than proportionally. This is the Law of
Diminishing Returns.
The Law of Diminishing Returns
The Law of Diminishing Returns (LDR) is one of the oldest and most important
postulates in Economics. In short, it states that increasing one input to production
while keeping all other inputs constant has a decreasing marginal impact on output.
At the time Malthus wrote his book, the most important factor of production
other than labour was land. Since the availability of land to a nation is difficult to
change, the LDR implies a negative relationship between employment, N, and the
average product of labour (or per capita income), defined as $y = Z/N$.
In terms of the production function (1.1), the LDR may be checked by dividing
both sides by N, which gives:
$$y = B \left(\frac{\bar{T}}{N}\right)^{\beta}. \tag{1.3}$$
The LDR is illustrated in Figure 1.1. In the figure, the vertical axis measures the
level of output (Z) and the horizontal axis measures the labour input, N (remember that
in this model employment and population are the same).
The average product of labour is measured by the slope of the ray that departs
from the origin and crosses the production function at each point. For instance, when
the workforce is equal to $N_0$, output will be $Z_0$. The corresponding average product of
labour is $Z_0/N_0$ (i.e., the slope of OP). Given the shape of the production function,
when the number of workers rises to $N_1$, output expands less than proportionally, so that
the average product of labour declines to the slope of OP′.
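Figure 1.1: Output and employment in the Classical Model
[Figure: output Z (vertical axis) against labour N (horizontal axis); the production function $Z = B T^{\beta} N^{1-\beta}$ passes through P at $(N_0, Z_0)$ and P′ at $(N_1, Z_1)$, with rays from the origin through each point.]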
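The Law of Diminishing Returns can also be checked numerically. The sketch below evaluates the production function (1.1) and the average product of labour (1.3) for a growing workforce, holding land and technology fixed. The parameter values (B = 1, T = 1 and a land exponent of 0.3, called `beta` here) are hypothetical and serve only to illustrate the shape of the curve.

```python
def output(N, B=1.0, T=1.0, beta=0.3):
    """Cobb-Douglas output Z = B * T**beta * N**(1 - beta), as in (1.1)."""
    return B * T ** beta * N ** (1 - beta)

def average_product(N, B=1.0, T=1.0, beta=0.3):
    """Output per worker y = Z / N, which equals B * (T / N)**beta, as in (1.3)."""
    return output(N, B, T, beta) / N

# Doubling the workforce repeatedly, with land and technology held fixed
# (all parameter values are hypothetical).
for N in (1.0, 2.0, 4.0, 8.0):
    print(f"N = {N:4.1f}   Z = {output(N):.3f}   y = {average_product(N):.3f}")
```

Each doubling of N raises total output by less than a factor of two, so output per worker falls along the way. This is the declining slope of the ray from the origin described above.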
The Malthus theory of population
Malthus formulated his theory of population observing that, in nature, animals
and plants are “impelled by a powerful instinct to the increase of their species”. He also
pointed out that “superabundant effects” are repressed by “want of room and
nourishment”2.
In his theory, Malthus argued that a similar mechanism holds for human beings:
the “passion between sexes” compels humans to increase the species, but the world has
limited resources. Hence, if human population expands without control, a point will
come when human population reaches the limit up to which food sources can support it.
At that point, any further increase in population would result in food shortage and death
due to starvation and disease.
To this natural barrier, he added a more conscious prevention mechanism
that operates through births: Man differs from other animals in that he may deliberately
reduce fertility in the face of a resource shortage. Thus, Malthus distinguished two types of
checks holding population within resource limits: “positive checks”, like epidemics,
famines and wars, which cause death; and “preventive checks”, like abortion, birth
control and postponement of marriage, which restrain births.
In terms of our model, the Malthus population dynamics may be formulated by
introducing a “subsistence level of per capita income” ($\bar{y}$), defined as the minimum
income necessary to sustain the life of a human being. According to the Malthus theory,
whenever per capita income rises above this subsistence level, population expands;
when per capita income falls below the subsistence level, population shrinks3.
2 Malthus (1798), Chapter 2.
3 Formally, the dynamics of population (N) in the Malthusian model may be described by the following
equation: $n = \dot{N}/N = \theta (y - \bar{y})$, where $\theta$ is some positive parameter.
Dynamics and equilibrium
Summing up, the model has two basic ingredients: the Law of Diminishing
Returns (LDR) and the Malthusian theory of population. With these two ingredients, the
theory follows in an intuitive manner: in the absence of technological progress, the LDR
implies that a growing labour force will lead to a more intensive use of land and thereby
to a decline in output per worker. As output per worker declines, population expansion
decelerates. Once output per worker has fallen to the “subsistence” level, both
population and output stop expanding. Hence, given the land availability and the level
of technology, the size of the population is self-equilibrating. Any technological
progress will be offset, in the long run, by an increase in the size of the population,
without having any positive impact on real income per capita.
Figure 1.2 illustrates the dynamics of the model. The exogenous subsistence
level of per capita income ($\bar{y}$) is represented by the slope of the line OS. Suppose that
initially there are $N_0$ workers producing $Z_0$. According to the Malthus theory of
population, since per capita income $Z_0/N_0$ is higher than the subsistence level, in this
case there will be a tendency for population to expand. The expansion of population, in
turn, will lead to more intensive use of land and a declining output per worker. This
process ends at point R, where output per worker is equal to the subsistence level. At
this point, there is no tendency for population to expand. Any further increase in
population would result in famine, disease and death. This is the equilibrium of the
model.
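Figure 1.2: Dynamics and equilibrium in the Classical Model
[Figure: output Z (vertical axis) against labour N (horizontal axis); the production function $Z = B \bar{T}^{\beta} N^{1-\beta}$ and the subsistence ray OS intersect at R, where population is $N^*$ and output is $Y^*$; point P lies on the production function at $(N_0, Z_0)$; a further point Q is marked.]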
Box 1.1: Stable Steady State
Technically, point R in Figure 1.2 is an equilibrium, because once it is reached,
there is no tendency for the economy to move away from it. In dynamic models, such an
equilibrium is called a steady state.
It is important to note that, in this model, the dynamic forces are such that this
equilibrium will be reached, irrespective of the starting point. For instance, if population
is initially smaller than N*, per capita income will be higher than the subsistence level
and population will expand. Conversely, if population is initially larger than N*, then
the level of per capita income will be less than the subsistence level and population will
decline. Because the economy will end up in the equilibrium whatever the starting
point, this equilibrium is said to be stable.
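The stability argument can be illustrated with a small simulation. The sketch below uses a discrete-time approximation of the population dynamics, in which population grows each period at a rate proportional to the gap between per capita income and the subsistence level, and starts once below and once above the steady state. This discretization and all parameter values (B = 1, T = 1, a land exponent `beta` of 0.3, subsistence income 0.5, adjustment parameter `theta` = 1) are assumptions made for illustration and are not taken from the text.

```python
def steady_state_population(B, T, beta, y_bar):
    """Population at which output per worker B * (T / N)**beta equals y_bar."""
    return (B / y_bar) ** (1 / beta) * T

def simulate_population(N0, B=1.0, T=1.0, beta=0.3, y_bar=0.5,
                        theta=1.0, periods=500):
    """Iterate N_{t+1} = N_t * (1 + theta * (y_t - y_bar)), a discrete-time
    approximation of the Malthusian population dynamics (an assumption)."""
    path = [N0]
    for _ in range(periods):
        N = path[-1]
        y = B * (T / N) ** beta          # per capita income at the current N
        path.append(N * (1 + theta * (y - y_bar)))
    return path

# Hypothetical parameter values, chosen only for illustration.
N_star = steady_state_population(B=1.0, T=1.0, beta=0.3, y_bar=0.5)
from_below = simulate_population(N0=2.0)    # starts with N < N*
from_above = simulate_population(N0=20.0)   # starts with N > N*

print(f"N* = {N_star:.3f}")
print(f"final N, starting below: {from_below[-1]:.3f}")
print(f"final N, starting above: {from_above[-1]:.3f}")
```

Both paths end at the same population level, one rising towards it and the other falling towards it, which is what stability means here.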
Box 1.2: The Black-Death
The Black Death in the European Middle Ages constituted a large and exogenous
shock that reduced populations significantly below trend for an extended period of time.
One implication of this calamity was a significant increase in the availability of land per
worker. According to the Malthus model, a sudden decline in the size of population
(a move from R to P in Figure 1.2) should be accompanied by an increase in per capita
income.
In fact, there is considerable evidence confirming this prediction of the
Malthus model. In the case of England, for instance, the Black Death
(1348-1349) led to a decline in population from about 6 million to 3.5 million people,
causing real wages to triple in the 150 years that followed. The data also reveal that
most of the increase in per capita income was channelled to a new phase of population
expansion. By the middle of the 16th century, real wages had returned to the
pre-plague level4.
Smith’s mark of prosperity
The model just described reveals, in a simple manner, the dramatic implications
of the LDR: for any given state of technology, the economy settles at a
point where income per person is constant at the very low subsistence level.
Formally, the equilibrium size of population in this model is obtained when per
capita output (1.3) is equal to the subsistence income, $\bar{y}$. Solving the resulting equation
for N, one obtains:
$$N^* = \left(\frac{B}{\bar{y}}\right)^{1/\beta} T \qquad (1.4)$$
This equation states that the equilibrium level of population is larger the larger
the availability of land and the higher the level of technology. Put another way,
equation (1.4) states that countries with superior technology, B, should exhibit higher
population densities, defined as the number of inhabitants per unit of available land
(N/T). Note how well this prediction of the model fits Adam Smith's claim
quoted at the beginning of this chapter!
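For readers who want the intermediate steps, equation (1.4) can be recovered as follows. This is a sketch that assumes (1.3) takes the form $y = B(T/N)^{\beta}$, where $\beta$ is used here as the name of the exponent on land:

```latex
\begin{align*}
B\left(\frac{T}{N^*}\right)^{\beta} &= \bar{y} \\
\frac{T}{N^*} &= \left(\frac{\bar{y}}{B}\right)^{1/\beta} \\
N^* &= \left(\frac{B}{\bar{y}}\right)^{1/\beta} T
\qquad\Longrightarrow\qquad
\frac{N^*}{T} = \left(\frac{B}{\bar{y}}\right)^{1/\beta}
\end{align*}
```

Under this assumption, population density depends only on technology relative to the subsistence income, while steady-state per capita income stays at $\bar{y}$ whatever the value of B.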
The model's prediction that differences in technology should give rise to
differences in population density but not to differences in living standards has been
investigated empirically by many authors5. The main conclusion of this research is that,
prior to the Industrial Revolution, differences in standards of living across regions of the
world were indeed small, even though differences in technology were large. Box 1.4
provides one example of these findings.
4
Clark (2001). Other authors analysing this period include Livi-Bacci (1997) and Hansen and Prescott
(2002).
5
For instance, Easterlin (1981), Kremer (1993), Lucas (2002).
1.3. Technological change in the Malthus model
What happens when technology improves?
So far, we have assumed that the level of technology (B) is constant over time.
In this section, we explore the implications of technological change. Technological
progress means that inputs become more productive, so it may overcome the
limitation of having a fixed amount of land.
Suppose, for instance, that people in this economy discover the plough and
start using animal power and water power. In terms of our model, these innovations
may be thought of as an increase in the productivity parameter B (equation 1.1).
Figure 1.3 describes the impact of a technological change in this economy.
Suppose that the economy starts out at point R, with per capita income equal to the
subsistence level. The new production function is represented by the dashed curve that
crosses the OS locus at point V. If the economy is initially at R, after the technological
change, production jumps to point U. Since at U per capita income (slope of OU) is
higher than the subsistence level (slope of OS), population in this economy starts
expanding (this is the Malthus theory of population). Then, as population expands,
diminishing returns drive per capita income back to the subsistence level. This happens
when the new equilibrium, V, is reached. In the long run, the gains from technological
progress are entirely channelled into an increase in the size of population.
Similar results hold when the availability of land increases. Suppose, for
instance, that a swampy stretch of land is drained. In terms of equation (1.1), such a
change is captured by a rise in the parameter T. In terms of Figure 1.3, the story is the
same as with the increase in B (after all, what matters is that the “carrying capacity” of
the economy has changed): in the long run, the initial improvement in living standards
is completely offset by population expansion.
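The two experiments (a rise in B and a rise in T) can be compared with a few lines of code. This is a minimal sketch, assuming the functional form y = B(T/N)^beta with beta = 1/3 and a subsistence income normalised to 1; the parameter values and function names are illustrative only.

```python
def steady_state(B, T, beta=1/3, y_bar=1.0):
    """Return (N*, y*) when y = B * (T / N) ** beta and y* must equal y_bar."""
    n_star = (B / y_bar) ** (1 / beta) * T
    y_star = B * (T / n_star) ** beta
    return n_star, y_star


def income_on_impact(B_new, T_new, N_old, beta=1/3):
    """Per capita income right after the shock, before population adjusts (point U)."""
    return B_new * (T_new / N_old) ** beta


base = steady_state(B=1.0, T=100.0)          # initial steady state (point R)
better_tech = steady_state(B=1.2, T=100.0)   # 20% improvement in technology
more_land = steady_state(B=1.0, T=120.0)     # 20% more land

impact_tech = income_on_impact(1.2, 100.0, base[0])
impact_land = income_on_impact(1.0, 120.0, base[0])

for name, (n, y) in [("base", base), ("better technology", better_tech), ("more land", more_land)]:
    print(f"{name:18s} N* = {n:7.1f}   y* = {y:.3f}")
print(f"income on impact: technology shock {impact_tech:.3f}, land shock {impact_land:.3f}")
```

In both experiments per capita income rises on impact and returns to the subsistence level in the new steady state; only the population differs.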
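Figure 1.3: A technological improvement in the Classical Model
[Figure: output Z (vertical axis) against population N (horizontal axis). The ray OS is the subsistence locus. The production function $Z = B T^{\beta} N^{1-\beta}$ crosses OS at R (population N_R*); the dashed, shifted production function crosses OS at V (population N_V*). Point U lies on the dashed curve at the initial population N_R*.]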
Box 1.3: Transition dynamics vs. change in the steady state
Both Figure 1.2 and Figure 1.3 describe how the Malthusian economy evolves
over time until it reaches the steady state. This adjustment process is called the
transition dynamics.
There is, however, a critical distinction between the case in Figure 1.2 and that of
Figure 1.3. In Figure 1.2, the economy is not initially in the steady state and approaches
the steady state, R. In Figure 1.3, the steady state of the model itself changes: because an
exogenous parameter of the model changed, the initial equilibrium (R) no longer holds.
So the economy engages in transition dynamics until the new steady state (V) is reached.
Race between technological progress and diminishing returns
In Figure 1.3, we examined the implications of a “once and for all” improvement
in technology. In short, an invention brings with it an increase in the land “carrying
capacity”, meaning that living standards increase temporarily. Then, as time goes by,
population expands and diminishing returns show up, implying that the amount of food
available per person falls back to the subsistence level.
The fact that population is slow to adjust raises the question as to whether a
continuous and fast enough pace of technological change could allow the economy to be
permanently engaged in transition dynamics, with per capita income permanently above
the subsistence level.
To illustrate this, let’s consider again Figure 1.3, but assume that a second
invention tilted the production function while the economy was still on its way from R
to V. And again, before the new long run equilibrium was reached, a third technological
change took place, and so on. Clearly, if the economy was continuously hit by
technological improvements and population expansion was never fast enough, then per
capita income would never fall back to the subsistence level, even if population was
expanding according to the Malthusian rule6.
Formally, suppose that technology expands continuously at rate $\dot{B}/B$. Instead of
forcing the economy to be permanently in equilibrium (as described by 1.4), assume
that the (endogenous) rate of population growth, n, is never fast enough for the
economy to reach the steady state. In that case, the growth rate of per capita income will
be:
$$\hat{y} = \frac{\dot{B}}{B} - \beta n > 0 \qquad (1.7)$$
Equation (1.7) is obtained by log-differentiating (1.3) and describes the change in
per capita income as a race between technological change and diminishing returns7.
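The log-differentiation step can be spelled out as follows. The sketch assumes that (1.3) has the form $y = B(T/N)^{\beta}$, with $\beta$ naming the exponent on land, and that land is fixed over time:

```latex
\begin{align*}
y &= B\left(\frac{T}{N}\right)^{\beta} \\
\ln y &= \ln B + \beta \ln T - \beta \ln N \\
\frac{\dot{y}}{y} &= \frac{\dot{B}}{B} + \beta\,\frac{\dot{T}}{T} - \beta\,\frac{\dot{N}}{N}
 = \frac{\dot{B}}{B} - \beta n
 \qquad \text{(since } \dot{T} = 0 \text{ and } n = \dot{N}/N\text{)}
\end{align*}
```

The term $\beta n$ is the drag that population growth exerts on per capita income through diminishing returns to labour on a fixed amount of land.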
As shown in (1.7), the likelihood of such a benign outcome depends on the
degree of diminishing returns. In particular, the smaller the role of land in production
(that is, the lower $\beta$), the more likely it is that an acceleration of technological progress will
cause a departure from the Malthusian trap8. So, you may also use this model to think
about the implications of an economy moving away from agriculture towards manufacturing.
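To get a feel for the magnitudes involved, the race in (1.7) can be evaluated for different land shares. The following is a minimal sketch: the function names and the numbers (1% technological progress, land exponents of 1/3 and 0.1) are illustrative assumptions, not estimates.

```python
def income_growth(tech_growth, pop_growth, beta):
    """Growth of per capita income as in (1.7): technology minus the drag beta * n."""
    return tech_growth - beta * pop_growth


def max_population_growth(tech_growth, beta):
    """Population growth above which per capita income stops rising."""
    if beta == 0:
        return float("inf")  # no diminishing returns: population can never win the race
    return tech_growth / beta


tech = 0.01  # hypothetical 1% technological progress per year
threshold_agrarian = max_population_growth(tech, beta=1/3)
threshold_industrial = max_population_growth(tech, beta=0.1)

print(f"land exponent 1/3: income rises as long as n < {threshold_agrarian:.1%}")
print(f"land exponent 0.1: income rises as long as n < {threshold_industrial:.1%}")
print(f"income growth with n = 2%, land exponent 1/3: {income_growth(tech, 0.02, 1/3):.2%}")
```

The smaller the exponent on land, the faster population would have to grow to exhaust a given rate of technological progress.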
Endogenous technical change
The Malthus model implies that population expansion exerts a negative
influence on living standards. The underlying mechanism is diminishing returns: since
the amount of arable land is fixed, for any given level of technology, a larger population
implies that less output is available per person.
One may argue, however, that the size of population also exerts a positive effect
on living standards, through its influence on the state of technology. The main idea is
that a larger population should, in principle, contain a larger number of potential
inventors: if each person has a given probability of inventing something, then, all else
equal, a larger and more diverse population should, in principle, be capable of
generating more inventions per unit of time9 10.
6
This example characterizes the “Post-Malthusian regime”, defined in the next section.
7
Note that land is a resource that, in general, can be used every period without suffering depletion. But
one could easily adapt the model to a case where T denotes a non-renewable resource, that is, a
resource that exists in finite supply and is depleted when used in production. This could be, for instance,
oil and natural gas. As you may guess, in that case one would need faster technological progress, to
overcome both the diminishing returns and the depletion of the natural resource. For a formal explanation,
see Jones (2002) chapter 9.
8
The lower $\beta$, the lower the role of land in production and hence the less significant diminishing
returns are. Note that if there were no diminishing returns at all ($\beta = 0$) it would be impossible for
population expansion to offset technological progress.
9
William Petty, a 17th century expert on the economics of taxation, once stated: “It is more likely that
one ingenious curious man may rather be found among 4 million than 400 persons”.
10
Another channel through which the size of population may enhance productivity is through market size
effects. This includes the benefit of specialization (Adam Smith, 1776) and break-even effects related to
economies of scale. In that case, however, it is the level of productivity (as captured by B), rather than the
growth rate, that is affected. We will address specifically this channel in Chapter 12.
It is important to note that, in contrast to many other goods, technology is
non-rival: that is, many people can share an invention without its losing effectiveness.
Hence, as long as technology is free to spill over across people, each single agent will
benefit from everybody else's discoveries. Thus, with a larger population, everyone
would enjoy the benefits of a faster rate of technological progress11.
A simple way of modelling this is to assume that the rate of technological
progress is a linear function of the population size:
$$\frac{\dot{B}}{B} = bN, \qquad (1.5)$$
where b captures the probability of somebody inventing something.
Adding a positive relationship between technological change and the size of
population to the simple Malthusian model has an interesting implication: population
and technological progress reinforce each other.
To see this, consider two economies, say R and V, completely isolated from
each other, which differ only in terms of the quality of the land available for agriculture.
Say V has land of superior quality. If technology is initially the same in both
economies, which one will end up with a larger population, R or V? Obviously V,
you will conclude: better land means a larger carrying capacity and therefore
more population. Then, if a larger population implies faster technological progress,
what is going to happen? Clearly, economy V, because it started out with better
geography, will enjoy faster technological progress and thereby faster population
expansion, in a virtuous cycle.
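The virtuous cycle between population and technology can be illustrated with a small simulation. The sketch below rests on several assumptions that are not in the text: better land quality is represented as a larger effective T, population is taken to sit at its Malthusian steady state (1.4) in every period, time is discrete, and the values of `b`, `beta` and the land endowments are arbitrary.

```python
def simulate_economy(T, periods=50, b=1e-5, B0=1.0, beta=1/3, y_bar=1.0):
    """Return per-period population and technology growth for one isolated economy."""
    B = B0
    populations, tech_growth = [], []
    for _ in range(periods):
        N = (B / y_bar) ** (1 / beta) * T  # Malthusian steady state, as in (1.4)
        g = b * N                          # technological progress, as in (1.5)
        populations.append(N)
        tech_growth.append(g)
        B *= 1 + g                         # technology available next period
    return populations, tech_growth


pop_R, growth_R = simulate_economy(T=100.0)  # economy R
pop_V, growth_V = simulate_economy(T=150.0)  # economy V: more effective land

ratio_start = pop_V[0] / pop_R[0]
ratio_end = pop_V[-1] / pop_R[-1]

print(f"population ratio V/R: start {ratio_start:.3f}, end {ratio_end:.3f}")
print(f"technology growth in V: start {growth_V[0]:.4%}, end {growth_V[-1]:.4%}")
print(f"technology growth in R: start {growth_R[0]:.4%}, end {growth_R[-1]:.4%}")
```

In this sketch the initial advantage of V widens over time: its population ratio relative to R grows, and its rate of technological progress stays ahead.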
At this stage, you may argue, the model seems to depart from reality: in
our days, knowledge has the potential to diffuse across country borders, so one does not
need to have a large population to enjoy fast technological progress. Belgium, for
instance, is a rich country despite its small size, because it developed the social capabilities
to adopt the best technologies invented elsewhere. Bangladesh, in turn, is much more
populous than Belgium and is not particularly famous for its scientific achievements.
So, in order to understand why technology improves faster in some countries than in
others, one certainly needs a much more comprehensive theory than the one just
sketched out above.
Still, if we go back in history far enough to analyse economies that were
effectively isolated from each other, we will find that this simple theory relating
technological progress to the size of population fits the historical facts rather well.
Box 1.4 presents a historical experiment due to a Harvard professor, Michael
Kremer: based on historical evidence, this author showed that, among societies without
contact, those with greater land areas and hence with larger initial populations managed
to achieve faster technological progress. This evidence suggests that the size of
population and technological change do indeed go together.
11
This avenue was explored by Kuznets (1960) and Simon (1977, 1981). Note that population is assumed to
be diverse: if individuals were all alike, a larger population could translate into more individuals
inventing exactly the same piece of knowledge, without mutual benefit. The possibility of overlapping
research was labelled “stepping on toes” by Charles Jones (1995), leading to modifications of equation
(1.5). These complications are addressed in Box 8.4.
Box 1.4: Technology and population density: an historical experiment
The Malthusian prediction that the technological level of a region affects the
size of its population but not its living standards was investigated by Michael Kremer
(1993). Kremer added, however, to the simpler Malthusian formulation a mechanism of
reverse causality from population to technology: he argued that regions with larger
populations should observe faster technological progress than regions with smaller
populations. The reason is intuitive: if the probability of inventing something is the
same for any single person, then a region with more inhabitants should, in principle, be
better endowed to generate ideas and to enjoy fast technological progress than a region
with fewer inhabitants.
Of course, one may argue that knowledge does not recognize borders: in
principle, the mechanism of technological diffusion should help mitigate technological
differences across regions, blessing the laggard regions with the opportunity to catch up.
To abstract from this possibility, Kremer examined a particular period of human
history in which populations in different areas were effectively isolated from each other.
The author first observed that, before the end of the last ice age (about 10,000
B.C.), ocean levels were so low that humans could easily migrate across continents,
including through the Bering Strait, which connects Asia to the Americas. Hence, at that
time, technology had the potential to diffuse across regions. It is thus plausible to
assume that, say, by 12,000 B.C., the known technologies were pretty similar across
humanity. Note that in those times humans were basically hunter-gatherers.
With the melting of the polar ice caps, around 10,000 B.C., land bridges were
flooded. As a consequence, the Old World (Europe, Asia, Africa), the Americas, Australia,
Tasmania and Flinders Island became isolated from each other. If Kremer’s
conjecture was right, one would expect that, at the time connections were re-established
(with the European explorations of the 15th century), technological levels, population
densities and land sizes were all positively correlated. Why? Because larger regions
would have built bigger populations and therefore technology would have developed
and diffused more quickly, causing in turn faster population expansions.
For instance, medieval Islam, centrally located in Eurasia, was able to acquire
inventions from India, China and Greece. In contrast, the Aboriginal Tasmanians, who
remained isolated, could not have adopted new technology other than what they
invented themselves. With faster technological improvement, Eurasia should have
experienced a faster population expansion and therefore a larger increase in population
density than Tasmania.
Kremer showed that the data confirm these conjectures. By the year 1500,
population densities were much higher in Eurasia-Africa (4.85/km2), the region with the
largest area, than in the Americas (0.36/km2), Australia (0.026/km2), Tasmania
(0.018/km2 to 0.074/km2) and Flinders Island (0.0/km2). Accordingly, the Old
World had the highest level of technological sophistication, followed by the Americas
(the Aztec and the Mayan civilizations had already discovered agriculture). Australia
was in an intermediate stage, having developed some artefacts like the boomerang, but
with a population that was still made up of hunters and gatherers. Tasmania registered
technological regression: its inhabitants lacked basic techniques such as fire-making and had lost
the ability to make bone tools. Finally, Flinders Island, with 680 square kilometres
of land and only 500 inhabitants initially, lost all its inhabitants by around 4,700 B.C.
1.4. The demographic transition
Population and technological change over the last twenty centuries
Figure 1.4 plots the evolution of per capita income and of population in the
world economy from year 1 to 2000. As shown in the figure, the size of population and
per capita income increased very little until the 17th century. They then started
accelerating slowly, before expanding very sharply in the last two hundred years.
This pattern has a natural interpretation in terms of the ideas sketched out in
Section 1.3: initially, with a smaller (and segmented) population, the arrival of new ideas was
slow and the slow pace of technological change was basically matched by
population expansion. This is basically the Malthusian story. As population expanded,
however, the arrival of new ideas accelerated, allowing technological progress to
outpace population growth: in the figure, this is captured by the sharp increase in per
capita income over the last two hundred years. Hence, at a certain moment, it looks as if
population lost the race with technological change.
To further investigate this question, let’s look at Table 1.1. The table describes
the evolution of GDP, population and per capita GDP in Western Europe and in Asia over
the last two thousand years. A rough estimate of the pace of technological progress is
also displayed in line (4), using equation (1.1) and postulating a value for $\beta$ equal to 1/3
(details in the table).
According to the table, in Western Europe and in Asia, GDP and population
expanded very slightly from year 1 to 1000 (lines 1 and 2), reflecting an almost
stagnant technology (line 4). During this period, technological change was fully
matched by population expansion, in accordance with the Malthus theory.
In the centuries that followed, up to the Industrial Revolution (1820),
technological progress was slow by modern standards. In Western Europe, TFP growth
stood at rates that ranged from 0.15% to 0.29% per year (line 4). In Asia, technological
progress was even slower. During this period, however, population responded less than
proportionally to technological change. Although per capita GDP was improving very
slowly (0.11%-0.16% in Western Europe, 0.02%-0.17% in Asia), in this period
population was already losing the race.
Figure 1.4 Population and per capita GDP over the last two thousand years
[Figure: world population (millions) and world per capita GDP (1990 International Geary-Khamis dollars), plotted for the years 1 to 2000, with interpolation between benchmark years.]
Source: Angus Maddison.
Between 1820 and 1900, the growth rates of both per capita income and
population accelerated significantly. This suggests that the Malthusian mechanism
whereby a higher income translates into faster population expansion was still in
operation. However, the proportion of technological change that was matched by
increasing population declined sharply (line 5).
Finally, the positive relationship between the standard of living and population
growth vanished in twentieth-century Europe. As shown in Table 1.1, after picking
up during 1820-1900, population growth rates started declining in Europe, even though
GDP per capita was growing faster than ever. This means that the Malthusian theory of
population no longer applies in this period: the faster increase in standards of living did
not translate into faster population expansion.
In Asia, a similar phenomenon occurred, though with a time lag with respect to
Western Europe: the acceleration of TFP growth and of per capita income occurred only in
the first half of the twentieth century. As in Europe, this process was first
accompanied by rapid population expansion, though not fast enough to prevent the
acceleration of per capita income (race between technology and population). The
decline in the growth rate of population in Asia only occurred in the last quarter of the
twentieth century.
Table 1.1. GDP, Population and per Capita GDP, 1-2000

Year                                                          1   1000   1500   1600   1700   1820   1900   1960   2000

29 Western European Countries
(1) GDP
    Billions of 1990 International Geary-Khamis dollars      11     10     44     66     81    160    676  2,251  7,430
    Growth rate (% per annum)                                 -  -0.01   0.29   0.40   0.21   0.57   1.82   2.02   3.03
(2) Population
    Millions                                                 25     25     57     74     81    133    234    326    391
    Growth rate (% per annum)                                 -   0.00   0.16   0.25   0.10   0.41   0.71   0.56   0.45
(3) Per Capita GDP
    1990 International Geary-Khamis dollars                 450    400    771    890    998  1,204  2,893  6,896 19,002
    Growth rate (% per annum)                                 -  -0.01   0.13   0.14   0.11   0.16   1.10   1.46   2.57
Memo:
(4) Total Factor Productivity (% per annum)                   -  -0.01   0.19   0.23   0.15   0.29   1.35   1.65   2.73
(5) Population growth divided by GDP growth                   -      -   0.55   0.64   0.46   0.72   0.39   0.28   0.15

Asia
(1) GDP
    Billions of 1990 International Geary-Khamis dollars      78     82    161    217    230    413    557  1,736 13,762
(2) Population
    Millions                                                174    183    284    379    402    710    873  1,687  3,605
(3) Per Capita GDP
    1990 International Geary-Khamis dollars                 449    449    568    572    571    581    638  1,029  3,817
Memo:
(4) Total Factor Productivity (% per annum)                   -   0.00   0.08   0.10   0.02   0.17   0.20   1.18   4.03
(5) Population growth divided by GDP growth                   -      -   0.65   0.98   1.03   0.97   0.69   0.58   0.36

Source: (1) and (2): Maddison (1995). (3)=(1)/(2). Total Factor Productivity growth (4) is a measure of
technological progress and is computed as a residual using equation (1.7), $\dot{B}/B = \dot{y}/y + \beta\,\dot{N}/N$,
postulating $\beta = 1/3$, that is: (4)=(3)+(1/3)*(2). Growth rates refer to the period ending in the column
year; a dash marks a cell with no reported value. Note: Since only the labour input is accounted for in this
decomposition, the term B captures the contribution of all other inputs.
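The memo lines of Table 1.1 can be reproduced from the growth rates reported for Western Europe. The sketch below applies the rule stated in the table notes, (4) = (3) + (1/3)*(2), and the ratio in line (5); the variable names are ours, and the small gaps between computed and published figures presumably reflect rounding in the published growth rates.

```python
BETA = 1 / 3  # land exponent postulated in the table notes

# Western Europe, growth rates in % per annum, for the periods ending in
# 1000, 1500, 1600, 1700, 1820, 1900, 1960 and 2000 (from Table 1.1).
gdp_growth = [-0.01, 0.29, 0.40, 0.21, 0.57, 1.82, 2.02, 3.03]
pop_growth = [0.00, 0.16, 0.25, 0.10, 0.41, 0.71, 0.56, 0.45]
pc_growth = [-0.01, 0.13, 0.14, 0.11, 0.16, 1.10, 1.46, 2.57]

published_tfp = [-0.01, 0.19, 0.23, 0.15, 0.29, 1.35, 1.65, 2.73]
published_ratio = [0.55, 0.64, 0.46, 0.72, 0.39, 0.28, 0.15]  # from 1500 onwards

# Line (4): TFP growth as a residual, per capita growth plus BETA times population growth.
tfp = [y + BETA * n for y, n in zip(pc_growth, pop_growth)]

# Line (5): share of GDP growth matched by population growth (first period omitted).
ratio = [n / g for n, g in zip(pop_growth[1:], gdp_growth[1:])]

for computed, published in zip(tfp, published_tfp):
    print(f"TFP growth: computed {computed:5.2f}   published {published:5.2f}")
for computed, published in zip(ratio, published_ratio):
    print(f"pop/GDP growth: computed {computed:4.2f}   published {published:4.2f}")
```

The declining ratio in line (5) is the numerical counterpart of population progressively losing the race with technology.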
For the rest of the world, the story was not different. Figure 1.5 describes the
evolution of population growth rates in different regions of the world over the last
three centuries. As shown in the figure, population growth rates started declining by the
end of the 19th century in the Western Offshoots and by the beginning of the 20th century
in Western Europe. In Asia and Latin America, the
Malthusian mechanism seems to have vanished only in the last quarter of the 20th century.
In Africa, population growth rates were still increasing by the end of the 20th
century.
Figure 1.5. Population growth along 1700-2000
[Figure: rate of population growth (vertical axis, 0.0% to 3.5%) for 12 Western European countries, the Western Offshoots, Latin America, Asia and Africa, over the periods 1700-1820, 1820-1870, 1870-1913, 1913-1950, 1950-1975 and 1975-2000.]
Source: Maddison (1995). Notes: The 12 Western European Countries are Austria, Belgium, Denmark,
Finland, France, Germany, Italy, Netherlands, Norway, Sweden, Switzerland, United Kingdom. The
Western Offshoots are Australia, New Zealand, Canada and the United States.
The three phases of economic development
Based on the above evidence, Galor and Weil (2000) proposed a categorization
of economic development into three phases:
- The Malthusian regime, characterized by slow technological progress and by
population responding positively to per capita income. In this regime, most
technological progress is matched by population expansion (Europe before the
Industrial Revolution, Asia until the end of the nineteenth century);
- The Modern Growth regime, characterized by steady growth of per capita
income and in which the relationship between income and population growth is
reversed: the acceleration of per capita income translates into slower population
growth (Western Offshoots by the end of the 19th century, Europe at the beginning of the
20th century, Asia after the third quarter of the twentieth century);
- The Post-Malthusian regime, an intermediate stage between the Malthusian
and the Modern Growth regimes. This regime shares one characteristic with each
of the other two: the Malthusian relationship between income per capita and population
still holds, but technology takes a clear lead in the race with population, so per capita
income accelerates as well.
A question that naturally arises is what fundamental changes occurred after
the Industrial Revolution so as to reverse the relationship between per capita income
and population. To answer this question, one has to go deeper into
the microeconomics of demography, rather than simply appealing to the Malthusian rule
sketched out above.
Birth rates and death rates
The process by which a country’s demographic characteristics are transformed as
it develops is labelled “demographic transition”.
To study the process of demographic transition, it is useful to introduce two
demographic indicators: the birth rate and the death rate. The birth rate is defined as the
number of newborns each year per thousand inhabitants. The death rate is defined as
the number of people who die each year per thousand inhabitants. The difference
between birth rates and death rates gives the growth rate of population.
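As a quick arithmetic illustration, the conversion from rates per thousand to a percentage growth rate can be written as below. The figures are hypothetical and the calculation ignores migration, which the definition above also leaves aside.

```python
def population_growth_rate(birth_rate, death_rate):
    """Natural population growth in percent per year.

    Both arguments are expressed per thousand inhabitants, so the
    difference is divided by 10 to obtain a percentage.
    """
    return (birth_rate - death_rate) / 10.0


# Hypothetical economy: 40 births and 30 deaths per thousand inhabitants each year.
growth = population_growth_rate(birth_rate=40, death_rate=30)
print(f"population growth: {growth:.1f}% per year")
```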
In order to understand how economic development alters the relationship
between population growth and economic development, we have to examine how
economic development impacts on birth rates and on death rates.
The impact on death rates is straightforward: death rates decline monotonically
with economic development. In poor economies, death rates are high, especially among
children, due to malnourishment, deficient sanitation and disease. When income
increases, better nutrition, improvements in housing, public health, modern sewage,
clean water, etc. cause death rates to decline (note that this is fully consistent with the
Malthusian idea of “positive checks”).
Since death rates decline with economic development, they cannot help explain
why, in the Modern Growth Regime, population growth declines with per capita
income. Thus, the explanation for the demographic transition has to be found in birth
rates12: the decline in birth rates that accompanies the move towards the modern growth
regime is the most important feature of the demographic transition.
Stages of demographic transition
Based on the observation of historical facts, demographers and development
economists have identified three stages over which the demographic transition unfolds:
1) The first stage, pre-industrial, is characterized by high birth rates and high
death rates.
2) The second stage is characterized by a steady decline of death rates, while
birth rates remain high.
In England, for instance, the decline in the mortality rate preceded that of the
birth rate by 140 years13. This means that, during a certain period, there was
a tendency for population growth to accelerate.
3) In the third stage, the continuing decline in death rates is accompanied by an
even faster decline in birth rates, so population expansion decelerates. In the
case of Western Europe, for instance, fertility rates declined sharply by the
turn of the 19th century. In Latin America and Asia this happened only in the
second half of the 20th century.14
12
As we will see in a minute, death rates also have a role in the demographic transition, but their influence
will be “indirect”, that is, mediated by the birth rate.
13
Actually, in England as well as in most of Western Europe, the decline in mortality rates was
accompanied by an initial increase in fertility rates (Coale and Treadway, 1986).
These stylized facts of the demographic transition imply that the rate of
population growth has an inverted-U-shaped relationship with per capita income. This is
illustrated in Figure 1.6. When the level of per capita income is very low, both birth
rates and mortality rates are very high, so the net population growth rate is low (bottom
panel). As living standards increase, with better nutrition and health care, death rates
start falling. Initially, however, the decline in mortality rates is not accompanied by an
equally fast decline in fertility rates. Hence, in the intermediate stage, the gap between
fertility and mortality widens and population growth shoots up. As the country becomes
more developed, birth rates start declining faster than death rates. This causes the
growth rate of population to fall back and the economy enters the Modern Growth
regime.
Looking across history, one may also observe that the speed of demographic
transitions accelerated over time. In 19th century Europe, birth and death rates fell
gradually, accompanying the slow progress in medicine, in sanitation, etc. Birth rates
declined much later than death rates, but population growth never
reached very high rates. In more recent transition processes, in contrast, populations
benefited from the fact that advances in medicine were already available. Thus, once the
conditions in place allowed a country to take advantage of these advances, death rates
fell sharply. As a result, the gap between birth rates and death rates shot up sharply,
leading populations to explode.
14
Note that this fall in birth rates occurred long before modern contraception became available. Hence, it
was not technological reasons but instead changes in attitude towards fertility that brought about these
decreases in birth rates. Of course, in developing countries, where the decline in birth rates occurred later,
contraception helped families to better plan their offspring. But in any case, it is the willingness to have
fewer children that we need to explain.
Figure 1.6. The stylized facts of demographic transition
[Figure: birth rate and death rate (top panel) and the population growth rate (bottom panel), plotted against per capita income.]
Before concluding the description of the stylized facts, one should note that
these tell us nothing about causality: one should not conclude that it is the increase in per
capita income that directly causes the demographic transition. In the next section we
will discuss theories according to which some variables correlated with per capita
income cause the demographic transition.
Why do birth rates decline with economic development?
The Malthus theory of population provides an explanation for why birth rates
decline when per capita income falls below a critical level: in order to escape poverty,
people refrain from having more children (the “preventive check”). Malthus did not address,
however, the possibility that birth rates decline when per capita income rises above a critical
level.
Recent theories aiming to understand the changing attitude towards fertility have
investigated the economic incentives underlying fertility choices. These theories replace
the mechanical Malthusian rule by a framework where households optimally choose the
number of children15. The common feature of these theories is that they provide an
explanation for the demographic transition: at some point in the development process,
further increases in per capita income lead people to have fewer children, not more.
Below, we present some ideas from this research.
15
These theories include: the increasing value of time (Schulz, 1981, Galor and Weil, 1992), education
(Becker, 1981), and a change in the pattern of intergenerational transfers (Willis, 1982). For a survey, see
Ray (1998, chapter 9).
The missing institutions theory
A strand in the literature considers offspring as substitutes for missing
institutions, such as insurance and social security. The main idea is that children may be
a source of income, both now and in the future (they have an “asset role”). Thus, in
economic environments lacking financial institutions that support people in old age
or in case of bad luck (such as disability or theft), children play the role of these
institutions. People invest in the future and buy insurance and retirement pensions in the
form of children. So, when parents' income increases, the number of children will
increase (as Malthus predicted), as a vehicle for lifetime consumption smoothing.
Note, however, that “investment” in the form of children involves a lot of
uncertainty: a child may move to another village and decide not to look after his
parents; there is uncertainty regarding the ability of each child to generate a decent
income; in some societies, the earning potential depends on gender; etc. Hence, one
should expect risk-averse parents, caring about the income-generating potential of their
offspring, to decide in advance to raise more children than they would actually need, just
in case. Note that this effect will be more significant in a context of high mortality rates:
parents decide to have more children to increase the chance of reaching a minimum
number of survivors. That is, death rates have an indirect impact on birth rates through a
risk premium effect: when mortality rates decline, birth rates will decline as well
because parents will become more certain regarding the number of survivors.
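The risk premium effect can be illustrated with a small calculation. The sketch below is not part of the original argument: it assumes that each child survives independently with the same probability, and that parents want a target number of survivors with a given level of confidence. The function names and all the numbers are illustrative assumptions.

```python
from math import comb

def prob_at_least(k: int, n: int, s: float) -> float:
    """Probability that at least k of n children survive (binomial, survival prob. s)."""
    return sum(comb(n, j) * s**j * (1 - s)**(n - j) for j in range(k, n + 1))

def births_needed(k: int, s: float, confidence: float) -> int:
    """Smallest number of births giving at least k survivors with the required confidence."""
    n = k
    while prob_at_least(k, n, s) < confidence:
        n += 1
    return n

# Hypothetical target: at least 2 surviving children with 90% confidence.
for s in (0.6, 0.8, 0.95):
    print(f"survival probability {s:.2f}: births needed = {births_needed(2, s, 0.90)}")
```

Under these assumptions, the number of births required falls as the survival probability rises, which is the sense in which lower mortality may reduce birth rates through the risk premium.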
In the Modern Growth Regime, the “asset role” of children declines
significantly. Economic development brings about institutions that specialize in
covering risks and in providing protection in old age. Workers have the opportunity to buy
protection from insurance companies and are often required to contribute to social
security systems. In this context, children become an expensive way of transferring
income to the future. With the arrival of superior technologies, the asset role of children
vanishes and people naturally respond by reducing the number of births.
Cost of child rearing
An important determinant of fertility is the cost of child rearing: children need to
be fed, clothed and schooled. Most importantly, looking after children takes time, and
time has an opportunity cost: in traditional societies, where social norms establish that
women should stay at home and look after their kids, and where job opportunities for
women are scarce, the opportunity cost of bringing up children is naturally low. But in
modern societies, women are engaged in the labour force, so the opportunity cost of
raising children is higher.
In addition, the technological progress that comes along with economic
development brings an increasing demand for human capital, leading parents to spend
more resources on their children’s education.
To understand this, think first of a traditional economy - say in the European
Middle Ages - basically devoted to agriculture, with a technology that is relatively
simple and stable. In such a context, children can start helping their parents at a young
age (so they act as an “asset”), and acquire their skills by observing what their parents
are doing. In such an economy, returns to formal education are low. Moreover, in an
environment where many children die before adulthood, parents will be reluctant to
spend valuable resources educating a single child; they will instead invest in
quantity: whenever a productivity improvement eases the household budget constraint,
parents will tend to have more children, just as Malthus predicted.
In the Modern Growth Regime, in contrast, the knowledge required to operate
complex machinery cannot be acquired by observing what the parents are doing.
Technological sophistication creates a demand for technical skills, raising the returns to
formal education. In such a context, the relative wage of child labour is low, making
investment in child quantity less attractive for parents. Thus, parents will invest more in
child quality, sending their children to school. Not surprisingly, the reversal of the
Malthusian relation between income and population growth by the turn of the 19th
century in Western Europe was accompanied by an increase in the average years of
schooling16.
All in all, because in the Modern Growth regime the cost of rearing children is
higher, a natural response for parents is to have fewer children.
Income and substitution effects
An alternative to the “asset view” of the demand for children is to assume that
parents derive “intrinsic pleasure” from rearing children17. That is, children enter the
households’ utility function as normal goods, so when per capita income increases,
everything else constant, fertility will increase.
In this framework, technological progress influences fertility decisions through
two different channels. On the one hand, it eases the household’s budget constraint,
allowing parents to spend more resources on raising children (positive income effect).
This is the pure Malthusian mechanism. On the other hand, economic development
increases the cost of rearing children (negative substitution effect): the opportunity cost
of the mother’s time increases, and parents are required to provide their children with
expensive education. Through this second channel, households will optimally reduce the
number of children.
This framework provides an interpretation of the demographic transition based
on the changing balance between these two effects over time. In earlier stages of
development, the first effect dominated, so population growth responded positively to
the income generated by technological change, as Malthus predicted. However, the
arrival of more demanding technologies gradually changed the balance between the two
effects: as returns to human capital increased, parents gradually shifted from child quantity
to child quality. Then, a more educated population became more likely to develop and adopt
new technologies, accelerating the pace of technological progress in a virtuous cycle.
This allowed technology to win the race with population, and the economy entered the
Post-Malthusian regime. At some stage, returns to education became so high that the second
effect dominated the first in fertility decisions and the economy entered the Modern
Growth regime18.
Why do birth rates take more time to fall than death rates?
16
A number of authors have argued that the phenomenon of demographic transition is inherently linked
to technological change, which leads parents to invest more in child quality rather than in child quantity.
The trade-off between quality and quantity of children was first suggested by Becker (1960). Authors
stressing the changing preferences over the number of children include Becker, Murphy and Tamura
(1990), Galor and Weil (1999, 2000), Galor and Mountford (2006, 2008), Galor and Moav (2002) and
Lucas (2002).
17
Becker (1981).
18
Galor and Weil (2000).
A stylised fact of the demographic transition is that death rates
respond faster to improvements in living standards than birth rates do.
There are several explanations for this.
A first one is that the economic incentives for birth rates to decline arrive
with a lag relative to the initial improvement in living standards that causes death
rates to fall. According to this view, the arrival of a welfare state, the deepening of
financial markets, the integration of women in the labour force, and the rise in the value of
education tend to materialize only after critical improvements in nutrition and in health
care take place.
A more elaborate explanation relates to the age structure of the population. Note
that birth rates (number of newborns each year as a fraction of total population) are
jointly determined by fertility rates (number of children per woman of reproductive
age) and the structure of the population (the percentage of women of reproductive age in
the total population). Thus, even if fertility rates are already declining, overall birth rates
may remain high just because more and more young girls reach reproductive age.
The implication for the demographic transition is straightforward. Suppose that a
given country starts out with high birth rates and high death rates. Then, suddenly, living
standards improve, causing infant mortality to decline. This means that more babies will
survive childhood, causing the age structure to change. With a younger age structure,
the number of potential mothers in the future exceeds the existing number of mothers.
In this case, even if the new mothers decide to have fewer offspring (that is, even if
individual fertility rates respond to economic incentives), birth rates will not decline
immediately because more women are reaching reproductive age. This
phenomenon is known as population momentum: whatever a country does, the
future growth rate of the population is largely determined by the existing age structure,
and this takes generations to change.
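Population momentum can be illustrated with a stylised cohort projection. The sketch below is only an illustration under strong assumptions that are not in the text: five age groups of equal length, no deaths before the last age group, fertility measured as surviving children per person (rather than per woman), and an initial age structure chosen by hand to be young. Fertility is set at replacement level from the first period onwards.

```python
def simulate(pop, fertility, periods):
    """Project an age-structured population forward.

    pop: cohort sizes, youngest first (each age group spans one period).
    fertility: surviving children per person, per period, in each age group.
    Returns a list of (total population, births, birth rate), one per period.
    """
    pop = list(pop)
    path = []
    for _ in range(periods):
        births = sum(f * n for f, n in zip(fertility, pop))
        total = sum(pop)
        path.append((total, births, births / total))
        # Everyone ages one group, the oldest group dies, newborns enter.
        pop = [births] + pop[:-1]
    return path

# Hypothetical young age structure: each cohort is 1.5 times the next older one.
initial_pop = [810, 540, 360, 240, 160]
# Replacement fertility: two reproductive age groups with 0.5 children each.
replacement = [0, 0.5, 0.5, 0, 0]

path = simulate(initial_pop, replacement, periods=6)
for t, (total, births, rate) in enumerate(path):
    print(f"period {t}: population = {total:7.1f}, births = {births:6.1f}, birth rate = {rate:.3f}")
```

In this example the population keeps growing for several periods, and the birth rate initially rises, even though fertility is at replacement level throughout: the large young cohorts are still moving into the reproductive age groups.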
A further reason for birth rates to remain high despite the economic incentives
is that family-level fertility decisions are not entirely driven by private considerations: to
a large extent, fertility decisions are influenced by the need to conform to social
norms. If society expects families to have many children, families wishing to
conform to what is socially acceptable will refrain from reducing fertility, even when
it becomes economically convenient to do so. Of course, social norms evolve over time
in response to economic incentives, but this process takes time19.
19
Parente and Prescott (2005) related the evolution of social attitudes towards fertility to the changing
needs regarding defence. According to this theory, prior to the modern growth regime a small group of
people could not defend a large territory from outside expropriators. Hence, there was a trade-off between
labour productivity (requiring lower population) and the risks of being invaded. Social norms then
emerged, creating incentives for populations to be such as to allow for the highest possible living
standards, taking into account the defence needs. In modern regimes, defending the territory became less
important, so social norms evolved so as to prioritize labour productivity. This is also a theory of
demographic transition.
Box 1.5 Lant Pritchett and the Theory of States and Transitions
Throughout this chapter, we referred to a Malthusian Regime and to a Modern Growth
Regime, we distinguished the different attitudes towards fertility in these regimes, and
we outlined some theories attempting to explain the conditions under which a country
can move from one regime to the other (Demographic Transition).
Lant Pritchett (2006) offers a nice analogy for this way of thinking about
development economics:
“Suppose you have a pot of water and you pick it up and turn it over. Where will
the water go? The answer, that it will spill out onto the ground, is so obvious that the
astute reader already realizes it is a trick question. If the water is frozen, it may stay
right in the pot. If the water is vapor, then turning the pot over will trap the steam in the
pot. The obvious point is that the equations of motion of water (or any other substance)
depend on the state—solid, liquid, or vapor—it is in. What determines the transitions of
water between states? Well, applying heat will cause water to change states, but only in
a discontinuous way—water at 35° F and water at 95° F behave almost the same, while
water at 32° F and at 102° F behave nothing like each other. The equations of motion of
water in one state do not work at all when water is in another state, and the response of
water to heat applied within a state does not work at all well when applied to transitions
from one phase to another”.
Likewise – Pritchett argues – “If France and Nepal can both be treated as water
in a liquid state, then it is conceivable that a theory and empirics of growth that treat
France and Nepal as both generic countries is adequate. I regard it as much more likely
that growth dynamics are characterized as equations of motions within states and
equations that determine transitions across states” (…). “The key idea in my proposal is
that economies are in different "states," and, therefore, the dynamics of output are
different for economies in different states, and the dynamics of transitions between
states are different from the dynamics within states”.
1.5. Globalization, fertility and the Great Divergence
Divergence, Big Time
Over the last two centuries, there was a dramatic divergence in living standards
across the globe. The development economist Lant Pritchett, professor at Harvard
University, dubbed this period “Divergence, Big Time”20. The author observed
that between 1870 and 1994, one small set of countries - consisting of 12 West
European countries, 4 Western offshoots (United States, Canada, Australia and New
Zealand) and Japan - managed to sustain fast economic growth, leaving the remaining
regions behind.
20
Pritchett (1999).
Table 1.3: Divergence Big Time

Per capita GDP (1990 International Geary-Khamis dollars)
Region                            1    1000    1500    1600    1700    1820    1960    2000
Western Europe                  450     400     771     890     998   1.204   6.896  19.002
Western Offshoots               400     400     400     400     476   1.202  10.961  27.065
Latin America                   400     400       -       -       -     692   3.133   5.838
Former USSR                     400     400     499     552     610     688   3.945   4.351
7 East European Countries       400     400     496     548     606     683   3.070   5.804
Asia                            449     449     568     572     571     581   1.029   3.817
16 Asian countries                -       -       -       -       -     581     962   3.794
26 East Asian countries           -       -       -       -       -     556     862   1.467
15 West Asian countries           -       -       -       -       -     607   2.492   5.706
Africa                          430     425     414     422     421     420   1.066   1.464
World                           445     436     566     595     615     667   2.777   6.012
Western Europe/Africa           1,0     0,9     1,9     2,1     2,4     2,9     6,5    13,0

Average growth rates (% per year)
Region                      1000-1700  1700-1820  1820-1960  1960-2000
Western Europe                   0,13       0,16       1,25       2,57
Western Offshoots                0,02       0,77       1,59       2,29
Latin America                       -          -       1,08       1,57
Former USSR                      0,06       0,10       1,26       0,24
7 East European Countries        0,06       0,10       1,08       1,61
Asia                             0,03       0,01       0,41       3,33
16 Asian countries                  -          -       0,36       3,49
26 East Asian countries             -          -       0,31       1,34
15 West Asian countries             -          -       1,01       2,09
Africa                           0,00       0,00       0,67       0,80
World                            0,05       0,07       1,02       1,95
Source: Maddison, 2001.
Notes: Western Offshoots: Australia, New Zealand, Canada, United States; 7 East European Countries: Albania, Bulgaria, Czechoslovakia,
Hungary, Poland, Romania, Yugoslavia; 16 Asian countries: China, India, Indonesia, Japan, Philippines, South Korea, Thailand, Taiwan,
Bangladesh, Burma, Hong Kong, Malaysia, Nepal, Pakistan, Singapore, Sri Lanka; 26 East Asian countries: Afghanistan, Cambodia, Laos,
Mongolia, North Korea, Vietnam, and 20 other Small Asian Countries; 15 West Asian countries: Bahrain, Iran, Iraq, Israel, Jordan, Kuwait,
Lebanon, Oman, Qatar, Saudi Arabia, Syria, Turkey, United Arab Emirates, Yemen, Palestine and Gaza.
Table 1.3 illustrates the Great Divergence. The table describes the evolution of
per capita incomes in some regions of the world between year 1 and year 2000.
According to these data, between year 1 and year 1000, per capita incomes remained
almost unchanged around the world and regional income disparities remained small.
Between the 10th and the 15th century, most regions achieved modest progress. This
period coincides with the rise of Western Europe. Per capita income disparities
remained, however, moderate: by 1700, the ratio of per capita incomes between
Western Europe and Africa was only 2.4.
Between 1820 and 2000, regional income disparities increased dramatically. For
instance, over this period, per capita GDP increased 23-fold in the Western Offshoots
and only about 3.5-fold in Africa. By the year 2000, per capita income in Western Europe was
13 times higher than in Africa. This is the Great Divergence.
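The figures quoted above can be checked directly against Table 1.3. The sketch below copies three rows of the table and assumes that the growth rates reported there are compound annual rates; this is an assumption, since the text does not state the formula, but it reproduces the table's values to two decimals. The function name annual_growth is ours.

```python
# Per capita GDP levels copied from Table 1.3 (1990 Geary-Khamis dollars).
gdp = {
    "Western Europe":    {1820: 1204, 1960: 6896, 2000: 19002},
    "Western Offshoots": {1820: 1202, 1960: 10961, 2000: 27065},
    "Africa":            {1820: 420, 1960: 1066, 2000: 1464},
}

def annual_growth(y0, y1, years):
    """Average compound growth rate, in percent per year (assumed formula)."""
    return ((y1 / y0) ** (1 / years) - 1) * 100

for region, y in gdp.items():
    fold = y[2000] / y[1820]
    g = annual_growth(y[1960], y[2000], 40)
    print(f"{region}: {fold:.1f}-fold since 1820, {g:.2f}% per year in 1960-2000")

ratio_2000 = gdp["Western Europe"][2000] / gdp["Africa"][2000]
print(f"Western Europe / Africa in 2000: {ratio_2000:.1f}")
```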
Figure 1.7 provides a graphical illustration. The figure relates the growth rates of
per capita GDP to the initial levels of per capita GDP, for the period 1820-2000 (the
data are the same as in Table 1.3). The positive correlation between growth and per capita
incomes indicates that, over this period, initially rich countries tended to grow faster
than poor countries.
It is important to note that the Great Divergence is no longer materializing.
Indeed, in the second half of the twentieth century, a set of highly populated countries in
Asia managed to achieve fast growth in living standards. As shown in Table 1.3,
between 1960 and 2000, Asia grew much faster than Europe and the Western Offshoots.
Given the population size of Asia, the global picture is no longer one of divergence21.
It should be noted that, at the country level, growth experiences differ quite
significantly: while some countries manage to converge, many poor countries remain
very poor. This is particularly true in Africa: as illustrated in Table 1.3, in the second
half of the 20th century, the income gap between Europe and Africa was still widening.
21
Sala-i-Martin (2002) measured income differences across people in the World (that is, abstracting from
national boundaries) and observed that the last decades of the 20th century were already of convergence.
Figure 1.7 Divergence Big Time
[Figure: scatter plot of the growth rate of GDP per capita, 1820-2000 (vertical axis, 0,4 to 2,0) against per capita GDP in 1820 (horizontal axis, 400 to 1.300). Regions plotted: Western Offshoots, Western Europe, 15 West Asian Countries, 7 East European Countries, Latin America, 16 Asian Countries, USSR, Africa, 26 East Asian Countries.]
Of course, the explanations for the Great Divergence are multiple, and reviewing
the main theories that explain why some countries managed to grow faster than others is
pretty much the scope of this book. For the aim of this chapter, however, it is useful to
relate the Great Divergence to the different timing with which countries went through
their demographic transitions. Indeed, while technological leaders entered the Modern
Growth regime first, laggard regions prolonged their Malthusian stage, experiencing fast
population expansion and sluggish per capita income growth22. As a result of these different
timings, cross-country disparities in per capita incomes increased. The following sections
develop this idea.
Industrialization and the demographic transition
The entry into the Modern Growth Regime in both developed and less developed
regions has been associated with a reallocation of resources from agriculture to
manufactures. In the particular case of Western Europe and the Western Offshoots
(Australia, Canada, New Zealand, United States), this coincided with the Industrial
Revolution, which began at the end of the 18th century in England and spread during
the 19th century. Other countries that managed to catch up also experienced fast
industrialization (for instance, Japan, South Korea, and Singapore). In general, the
empirical evidence points to a declining share of agricultural employment in total
employment along the process of economic development23.
22
According to Parente and Prescott (2005), Mexico started the transition to modern economic growth
during the first half of the 19th century; Japan initiated the transition in the second half of the 19th century;
Brazil started in the early twentieth century, and India started its transition sometime between 1950 and
1980.
Because technological progress tends to be faster in manufactures than in
agriculture, it should now be intuitive for the reader why a move towards manufactures
may have played a key role in the demographic transition: the acceleration of technological
progress that comes along with industrialization generates an increased demand for
skilled labour and theoretical knowledge, raising the returns to education and leading
parents to alter their choices over their children’s education. In response, societies press
their governments to introduce universal schooling and to condemn child labour. As
educational reforms induce more children to engage in formal education, fertility rates
decline.
From the Agricultural Revolution to the Industrial Revolution
A natural question that arises is why some countries were able to industrialize
earlier than others. This question is, of course, too complex to be answered with a simple
theory. At this stage, however, it is useful to introduce a conventional wisdom, largely
inspired by 18th century England, that points to the modernization of agriculture as a
key determinant of industrialization.
Positive links between agriculture and industry may arise at different levels:
first, rising productivity in agriculture makes it possible to release workers from
agriculture to industry; second, food surpluses are necessary to feed a large urban
population; third, rising income in agriculture creates a natural market for industrial
products; finally, the savings generated in agriculture may be used to finance investment
in industry. Based on this idea, classical development theorists argued that an
agricultural (green) revolution is a precondition for industrialization24.
The link between agriculture and manufactures can be illustrated with the help
of a simple model25. Assume that there are two goods, Manufactures (Y) and
Agriculture (Z), and that labour is the only input to production. Moreover, assume that
the total labour force in the economy is equal to 1, wages are flexible, and that both
production functions are linear:
$Y = A N_Y$ (1.12)
$Z = B N_Z = B(1 - N_Y)$ (1.13)
where $N_Y$ and $N_Z$ denote the fractions of the labour force employed in
manufactures and agriculture, respectively. The last term in equation (1.13) incorporates
the resource constraint of the economy and stresses the trade-off between agricultural
production and manufactures production: given the productivity parameters, A and B,
the only way to expand production in agriculture is by reallocating employment from
manufactures.
23
Clark (1940), Kuznets (1966), Chenery and Syrquin (1975). According to Galor (2005), the share of
agricultural employment in England declined from 40% in 1790 to 7% in 1910. The same author
documents that, over the period from 1800 to 1860, the volume of industrial production per capita
quadrupled in the United Kingdom and doubled in other countries, such as the United States, Germany,
France, Canada, Belgium, Sweden and Switzerland.
24
Nurkse (1953), Rostow (1960).
25
This model is a simple version of Matsuyama (1992).
Moving one worker from manufactures to agriculture causes agricultural
output to expand by B and manufactures output to fall by A. Hence, the opportunity
cost of expanding the production of Z by one unit is A/B units of Y.
In such a framework, a productivity improvement in agriculture (a rise in B)
causes an expansion of agricultural output, Z. It may also lead to an expansion of
manufactures output, Y, but this will happen only if some workers are reallocated from
agriculture to manufactures. Such reallocation underlies the argument that
modernization of agriculture is a pre-condition for industrialization.
The reallocation of labour from agriculture to manufactures is not, however,
entirely obvious: in a closed economy, when agriculture production expands, the
relative price of agriculture goods falls. Whether this induces consumers to demand
more agriculture goods, or instead to release some income to expand the consumption
of manufactures, depends ultimately on preferences.
If the demand for agriculture goods were sufficiently elastic, the
fall in agricultural prices could induce consumers to increase the demand for agriculture
goods more than proportionally. In that case, employment in manufactures would even
fall. In the real world, however, one does not expect the demand for agriculture goods to
increase that much. The reason is a robust statistical regularity, which tells us
that, as households’ income increases, the fraction of households’ income spent on food
tends to decline. This is Engel’s Law, named after the 19th century German
statistician Ernst Engel. This law can be incorporated in the model by postulating a
utility function of the form:
$U = \ln(Z - \bar{Z}) + \ln Y$, (1.14)
where $\bar{Z}$ refers to a minimum subsistence level of agricultural consumption. When
positive, this parameter implies that the income elasticity of agriculture goods is less
than one. Substituting (1.12) and (1.13) in (1.14) and maximizing with respect to $N_Y$,
one obtains the level of manufactures employment that matches demand and supply:
$N_Y = \frac{1}{2}\left(1 - \frac{\bar{Z}}{B}\right)$. (1.15)
This equation states that, when $\bar{Z} > 0$ (and only in this case)26, a productivity
increase in agriculture leads to an increase in the share of employment in manufactures.
This is the result one wanted to obtain27.
26
When $\bar{Z} = 0$ the share of expenditure devoted to each good is constant. In that case, an increase in
agriculture productivity would lead to a proportional fall in the corresponding price, so that employment
shares would be unaffected.
27
Note that employment is unaffected by changes in manufactures’ productivity: an increase in A leads to
an increase in Y, but the relative price of manufacture goods falls in such a manner that the demand for Z
remains unchanged.
The intuition is the following: an exogenous increase in agricultural productivity
leads to an increase in per capita income; then, because of Engel’s Law, the relative
demand for manufactures rises, implying a reallocation of labour from agriculture to
manufactures. This captures the conventional wisdom that “modernization in agriculture
is a precondition for industrialization”28.
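The closed-economy result in (1.15) can be checked numerically. The sketch below is only an illustration: it evaluates the utility function (1.14), after substituting (1.12) and (1.13), on a grid of feasible values of the manufactures employment share and compares the maximizer with the closed form. The function names, the grid size and the parameter values are assumptions chosen for the example.

```python
import math

def utility(n_y, A, B, z_bar):
    """U = ln(Z - z_bar) + ln(Y), with Y = A*n_y and Z = B*(1 - n_y)."""
    return math.log(B * (1 - n_y) - z_bar) + math.log(A * n_y)

def n_y_closed_form(B, z_bar):
    """Equation (1.15): share of labour employed in manufactures."""
    return 0.5 * (1 - z_bar / B)

def n_y_grid_search(A, B, z_bar, steps=10_000):
    """Numerical check: maximize utility over a grid of feasible n_y."""
    upper = 1 - z_bar / B  # at this share, Z would fall to the subsistence level
    grid = (upper * i / steps for i in range(1, steps))
    return max(grid, key=lambda n: utility(n, A, B, z_bar))

# Hypothetical parameters: subsistence level 1, agricultural productivity 2 and then 4.
for B in (2.0, 4.0):
    print(f"B = {B}: closed form = {n_y_closed_form(B, 1.0):.3f}, "
          f"grid search = {n_y_grid_search(1.0, B, 1.0):.3f}")
```

With a positive subsistence level, the share of employment in manufactures rises with B; with a subsistence level of zero, the closed form gives one half regardless of B, and the value of A plays no role in either case.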
International trade and industrialization
The model above implies that an increase in agriculture productivity leads to an
expansion of employment in manufactures. This is indeed what happened in Britain
prior to the Industrial Revolution: the innovations introduced in agriculture caused
agriculture output to expand, giving rise to positive income effects that led to an
increase in the share of consumption devoted to manufactures, allowing the economy to
industrialize and undergo its demographic transition.
History is, however, full of examples of countries with strong agriculture, such as
Argentina, that failed to industrialize and of countries with poor natural resources, such
as Japan, that successfully industrialized. An explanation for this apparent paradox is
that the experiences of these countries differed from that of Britain with respect to their
timing relative to globalization29.
Indeed, from the Seven Years War in 1756 until the beginning of the nineteenth
century, England could be seen mainly as a closed economy. In that case, the model
above applies. During the nineteenth century, however, there was a significant
expansion of international trade. Between 1800 and 1913, the ratio of world trade to
output increased from 2% to 21%30. Even though imports at that time were highly restricted, the
expansion of agriculture in Argentina and the industrialization of Japan occurred at a
time when globalization was already under way.
The key point is that, when an economy is open to international trade, the
relationship between agricultural productivity and employment in manufactures is the
inverse of that in a closed economy: high productivity in agriculture increases the
likelihood of a country having a comparative advantage in agriculture, in which case
trade openness implies specialization in agriculture goods and a reallocation of the
labour force away from manufactures.
To see the argument formally, note that the main difference between the closed
economy case and the open economy case concerns the determination of output prices:
in an open economy, prices are determined by world demand and
supply. If the domestic economy is small, prices are exogenous.
28
Note that this is a demand side story. Hansen and Prescott (2002) focus instead on the supply side.
These authors assumed that, at early stages of development, the industrial sector is not productive enough
to attract workers, so the economy lies in an agricultural/Malthusian regime. Exogenous technological
change, however, makes the industrial sector progressively more attractive. At a certain point, the
industrial sector becomes viable and starts absorbing labour from agriculture.
29
Matsuyama (1991, 1992).
30
Galor and Mountford (2008).
In terms of our model, let $p = P_Z / P_Y$ be the relative price of agriculture goods
in terms of manufactures in the world market. To maximize its consumption
opportunities through international trade, an economy should allocate labour to
agriculture or to manufactures according to the Law of Comparative Advantages (see
Box 1.6). This is equivalent to maximizing the value of domestic income at world
prices, given by:
$pZ + Y = pB(1 - N_Y) + A N_Y$ (1.16)
Taking a simple derivative with respect to $N_Y$, you may verify that this expression
is an increasing function of $N_Y$ when $p < A/B$ and a decreasing function of $N_Y$ when
$p > A/B$. In other words: if the relative price of the agriculture good in
the world economy is lower than the opportunity cost of producing the agriculture good
in the domestic economy (that is, the domestic economy has a comparative advantage in
manufactures), then it is optimal to expand employment in manufactures until
$N_Y = 1$; when instead the relative price of the agriculture good in the world economy is
higher than the opportunity cost of producing the agriculture good domestically (the
domestic economy has a comparative advantage in agriculture), then it is optimal to
reduce employment in manufactures until $N_Y = 0$.
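The corner solutions implied by (1.16) can be sketched in a few lines. The example below is only an illustration of the small open economy case: the function names and the parameter values are assumptions, and the two hypothetical economies are not calibrated to any actual country.

```python
def income_at_world_prices(n_y, p, A, B):
    """Equation (1.16): p*Z + Y, with Z = B*(1 - n_y) and Y = A*n_y."""
    return p * B * (1 - n_y) + A * n_y

def optimal_n_y(p, A, B):
    """Income is linear in n_y with slope A - p*B, so the optimum is a corner."""
    if p < A / B:
        return 1.0   # comparative advantage in manufactures
    if p > A / B:
        return 0.0   # comparative advantage in agriculture
    return None      # p equals A/B: every allocation yields the same income

p_world = 1.0  # hypothetical world relative price of the agriculture good

# Hypothetical economy with very productive agriculture (A/B = 0.5 < p).
print(optimal_n_y(p_world, A=1.0, B=2.0))

# Hypothetical economy less productive in both sectors, but relatively
# less bad at manufactures (A/B = 2 > p).
print(optimal_n_y(p_world, A=0.5, B=0.25))
```

Under these assumed numbers, the first economy specializes completely in agriculture, while the second specializes in manufactures even though its productivity is lower in both sectors: only the ratio A/B, compared with the world price, matters.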
Using this reasoning, it is easy to understand why a country like Argentina failed
to industrialize: in a context of trade openness, Argentina’s high productivity in
agriculture induced the country to specialize in agriculture goods, retarding the
industrialization process. In contrast, for a country like Japan, with very low agriculture
productivity, it became profitable to specialize in manufactures: the low productivity in
agriculture endowed the country with an abundant supply of “cheap labour” that the
manufactures sector could use31.
Trading population for productivity
Putting the pieces together, the discussion above suggests that the rapid
expansion of international trade in the 19th century led some countries to specialize in
manufactures while others specialized in agriculture goods. This, in turn, may have
influenced the different timing of the demographic transition across countries, persistently
affecting the distribution of the world population, human capital and technology. In
other words, this provides an explanation for the Great Divergence32.
The argument runs as follows. By the end of the 19th century, England and
Northwest Europe had become net exporters of manufactured goods and net importers of
primary products, whereas the exports of Asia, Oceania, Latin America and Africa were
overwhelmingly composed of primary products33.
Then, once some countries got a head start in terms of manufactures production,
their comparative advantage in manufactures reinforced their specialization pattern,
while most other countries reinforced their specialization in primary products. Why was
that so?
31
The model implies that countries lacking the appropriate conditions for agriculture (low B) may have
comparative advantage in manufactures and specialize in manufactures. This will happen even if
countries are less efficient in manufactures in absolute terms (lower A) than the rest of the world:
according to the theory of comparative advantages, what matters is relative productivities, as summarized
in the ratio A/B: provided the disadvantage in agriculture is larger than that in manufactures, a country
will still have comparative advantage in manufactures.
32
Galor and Mountford (2006, 2008).
33
Evidence in Findlay and O’Rourke (2003).
In Western countries, the increasing demand for skilled labour induced the
respective societies to press governments for educational reforms, expediting the
demographic transition. With more educated populations, Western countries experienced faster
technological progress, which further enhanced their comparative advantages in
skill-intensive products. Since these countries had escaped the Malthusian trap, the fast
technological development translated into rising living standards.
In non-industrial economies, on the contrary, international trade induced
specialization in unskilled-intensive goods. This generated weak incentives to invest in
child quality, delaying the demographic transition. In these countries, the gains from
trade were channelled towards increasing populations, without impacting significantly
on living standards. Moreover, the growing abundance of unskilled labour reinforced
the comparative advantages in unskilled-intensive products, in a vicious cycle.
Galor and Mountford (2008) dubbed the emergence of North-South trade in this
period as “trading population for productivity”34.
Static and dynamic gains from international trade
The story above suggests that a poor country with an initial comparative advantage
in agriculture, instead of engaging in free trade, would do better to impose restrictions on
manufacture imports. If that helped the country to industrialize, the policy would also
accelerate its demographic transition and help it escape the Malthusian trap.
In other words, the discussion above suggests that the “static” gains from trade
(that is, the increase in efficiency achieved through specialization, as demonstrated in
Box 1.6) do not necessarily go along with the “dynamic” gains from trade. This model
thus provides a first approach to the “infant industry” argument, which will be
further addressed later in this book.
Box 1.6: The Law of Comparative Advantages
One of the oldest propositions in Economics, and one whose validity has remained
unchallenged for more than two centuries, is the Law of Comparative Advantages,
usually attributed to the British political economist and stock trader David Ricardo35.
34
An interesting example of such North-South trade occurred between England and India in the
19th century: between 1813 and 1850, India significantly increased its trade openness, notably with respect
to England. This opening process turned India from an exporter of manufactured goods (mainly textiles)
into a supplier of primary commodities (according to Bairoch, 1982, between 1800 and 1913,
industrialization in India declined by 2/3). In India, this implied a low demand for skilled workers,
reducing the incentives for investment in education and delaying the demographic transition. Thus, the
gains from trade were mostly channelled towards increasing population, without a significant impact on
living standards. In England, on the contrary, the gains from trade were channelled towards investment in
education stimulating faster technological change and faster economic growth.
35
In fact, the theory was independently discovered by Ricardo (1817) and the Royal Marines Officer,
Colonel Robert Torrens (1815).
The theory states that, regardless of the technological differences observed between
countries, provided these differences are not uniform across sectors (in which case trade
openness would be indifferent), openness to trade is, in general, efficiency enhancing.
The main idea is that trade allows nations and individuals to specialize in activities in
which they are relatively more efficient, abandoning the activities in which they are
relatively less efficient.
To illustrate the argument, consider two countries, Portugal (P) and England (R),
and two goods, Y and Z, with production functions of the form (10.5) and (10.6). Since
we now have two countries, P and R, and two sectors, Y and Z, there are four different
productivity parameters. To keep the story simple, we refer to a numerical example:
$B_P = 0.25$, $A_P = 0.15$, $B_R = 0.40$ and $A_R = 0.50$. In this example, England is more
efficient than Portugal (productivity is higher) in both sectors. Still, we will see that
trade is mutually beneficial.
Suppose that the total amount of labour available in Portugal is N=400. Without
trade, Portugal could produce Z=100 and Y=0, or Z=0 and Y=60, or
any combination between these two extreme cases. If, for instance, consumers wished to
consume the two goods in equal amounts, the optimal production plan without trade
would allocate 150 units of N to the production of Z and 250 units of N
to the production of Y, yielding Z=37.5 and Y=37.5.
With trade, each country specializes in the good in which it is relatively more
efficient. Portugal is less efficient in both sectors, but its disadvantage is more
pronounced in sector Y: to produce one extra unit of Z, Portugal needs N=4. If this
amount of labour is diverted from the production of Y, the output loss in that sector is
0.15*4=0.6. This means that the opportunity cost of Z in terms of Y ($A/B$) in Portugal
is equal to 0.6. In England, the opportunity cost of Z in terms of Y is 1.25. Since
producing Z implies a lower sacrifice of Y in Portugal than in England, Portugal has
a comparative advantage in Z. By the same token, since in England the production of Y
implies a lower sacrifice of Z than in Portugal, England has a comparative advantage in
Y.
To illustrate the gains from trade, assume, without loss of generality, that in
the global economy one could trade one unit of Y for one unit of Z (that is, p=1). With
trade, instead of wasting resources producing Y, Portugal would produce only Z. With
all resources allocated to Z, Portugal would be able to produce Z=0.25*400=100. Then,
it could import 50 units of Y in exchange for 50 units of Z, obtaining a consumption
bundle in free trade of Y=50 and Z=50. Clearly, Portugal would be better off with trade
than without trade (Y=37.5 and Z=37.5). By the same token, England would benefit by
importing Z from Portugal at the price of one unit of Y instead of producing it at the
cost of 1.25 units of Y. Thus, England would also gain from the possibility of trade.
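The arithmetic of the Portugal and England example can be checked with a short script. This is only an illustrative sketch: it assumes linear technologies Z = B·N_Z and Y = A·N_Y, as suggested by the numerical example, and the function and variable names are ours rather than part of the text.

```python
# Productivity parameters from the numerical example (output per unit of labour).
B = {"Portugal": 0.25, "England": 0.40}   # sector Z
A = {"Portugal": 0.15, "England": 0.50}   # sector Y


def opportunity_cost_of_z(country):
    """Units of Y forgone to produce one extra unit of Z (the ratio A/B)."""
    return A[country] / B[country]


def autarky_equal_consumption(country, labour):
    """Labour allocated to Z, and output of each good, when Z and Y must be equal."""
    # Solve B * n_z = A * (labour - n_z) for n_z.
    n_z = A[country] * labour / (A[country] + B[country])
    return n_z, B[country] * n_z


N_PORTUGAL = 400
cost_pt = opportunity_cost_of_z("Portugal")
cost_en = opportunity_cost_of_z("England")
n_z, autarky_bundle = autarky_equal_consumption("Portugal", N_PORTUGAL)

# With trade at p = 1, Portugal produces only Z and exchanges half of it for Y.
z_specialised = B["Portugal"] * N_PORTUGAL
trade_bundle = z_specialised / 2

print(f"Opportunity cost of Z: Portugal {cost_pt:.2f}, England {cost_en:.2f}")
print(f"Autarky: N_Z = {n_z:.0f}, Z = Y = {autarky_bundle:.1f}")
print(f"Free trade at p = 1: Z = Y = {trade_bundle:.1f}")
```

Because the opportunity cost of Z is lower in Portugal than in England, any world price p between the two opportunity costs would leave both countries at least as well off as in autarky; p = 1 is just one convenient value in that range.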
With this simple reasoning, David Ricardo showed that international trade acts
like a technology through which nations “obtain” (import) goods that otherwise they
would produce less efficiently, devoting their resources to the production of goods in
which they are relatively more efficient36.
36
This does not mean that with free trade per capita incomes will be the same in both countries. In the
above example, it is easy to show that real wages in England will be higher than in Portugal. That is,
people in the technologically more advanced country will enjoy higher wages.
1.6. Discussion
The Law of Diminishing Returns (LDR) has an important role in the theory of
economic growth. The Malthus model puts this in a simple manner. In this model, a
growing labour force leads to a more intensive use of land and thereby to a decline in
labour productivity and wages. At the moment wages fall below a given subsistence
level, both population and output stop growing.
Despite its simplicity, the Malthusian model provides an insightful tool to
interpret the almost constant living standards that characterized the pre-industrial era.
The model also provides a useful tool to think about major contemporary problems,
such as environmental sustainability and the challenges posed by exhaustible
resources37.
The model fails, however, to describe modern economic growth. On the one hand,
the model conflicts seriously with the stylised fact that, in modern economies, per capita
incomes exhibit a tendency to grow over time, rather than to remain constant at a very low
subsistence level. On the other hand, its predictions regarding the relationship between
population growth and per capita income no longer hold in modern societies.
To understand why fertility choices change along the process of economic
development, we looked at the microeconomics of fertility. These theories point to the
critical role of technological change and the increasing demand for human capital in
explaining demographic transitions. Then, we addressed the question of why
some regions of the world achieved their demographic transition earlier than others. This
discussion constituted a first approach to the problem of interdependence and to the
question of why technology does not evolve equally everywhere.
37
Modern writings drawing on Malthus's ideas include Ehrlich (1968) and Meadows (1972), who
stressed the impact of population growth on natural resources and on pollution. The World Summit
on Sustainable Development in 2002 was also much inspired by Malthus's ideas.
Key ideas of Chapter 1
• According to Malthus, the size of the population should increase whenever per capita income rises
above a given subsistence level.
• Because, with everything else constant, a larger work force implies lower labour productivity,
Malthus predicted that the size of the human population would be self-equilibrating.
• The Malthusian prediction that technological gains should translate into higher population densities
rather than higher living standards describes the history of humankind in the pre-industrial era
quite well.
• Over the centuries, however, technology started winning the race against
population: population was still expanding with per capita income, but not enough to prevent
per capita income from rising. This pattern is labelled the post-Malthusian regime.
• Following the industrial revolution, more and more countries entered the Modern growth regime,
where the Malthusian mechanism linking per capita income to population expansion no longer holds.
• The change in human behaviour towards fertility along the process of economic development is
labelled the “Demographic Transition”.
• In the Malthusian regime, fertility rates tend to be high, because children play an asset role and
because of risk aversion.
• In the Modern growth regime, the asset role of children is dominated by superior alternatives. On the
other hand, the cost of rearing children and preparing them to enter the labour force is considerably
higher. Thus, parents’ choices move from “quantity” to “quality”.
• The fall in birth rates entails some inertia, both because of social norms and because of the
structure of the population.
• The rise in cross-country income disparities over the last two centuries is known as “The Great
Divergence”.
• Some authors have argued that the interaction between globalization, industrialization and attitudes
towards fertility helps explain the Great Divergence: countries with comparative advantages in
agriculture remained basically in the Malthusian regime, with technological improvements matched by
population expansions. In countries with comparative advantages in manufactures, societies felt the
pressure to switch from child quantity to child quality, investing more in education and achieving
faster technological change, in a virtuous cycle.
Problems and Exercises
Key concepts
• Positive checks vs. preventive checks.
• Stable vs. unstable steady state.
• Race between technology and diminishing returns.
• Post-Malthusian regime vs. Modern Growth Regime.
• Demographic transition.
• Asset role of children.
• Population momentum.
• The Great Divergence.
• Trading population for productivity.
Essay questions:
1. Comment: “The most decisive mark of the prosperity of any country is the
increase in the number of its inhabitants”. [Adam Smith].
2. Comment: “Technology and population reinforce each other”.
3. By the 15th century, population density in Eurasia was much higher than in
Australia. Explain.
4. What drives the fall in fertility rates in the transition to the Modern Growth
regime?
5. Explain how birth rates can remain high despite the fall in fertility rates.
6. Explain why high productivity in agriculture favoured the demographic
transition in England but not in Argentina.
Exercises
1.1.
Consider a closed economy with no government and basically devoted to
agriculture. Output takes the form of a single homogeneous good (Y), which is
produced using labour (N) and land (T). The relationship between inputs and output is
described by an aggregate production function of the form:
$$Y_t = B\,T_t^{0.5} N_t^{\alpha}.$$
Assume that the availability of land is fixed, with T=1. The dynamics of
population (N) is described by the following equation:
$$\dot N = \theta\,(y - \bar y)$$
where $\theta$ is a positive parameter within the unit circle, y=Y/N and $\bar y = 2$ is the subsistence level of per
capita income. Assume initially that B=18 and $\alpha = 0.5$.
a) Explain the equation that describes the dynamics of population in this
economy.
b) Find the steady state of the model and represent it in a graph. Is this
steady state stable?
c) Suppose now that the discovery of a new fertilizer improves B from 18
to 20. Following this change, will the economy expand indefinitely?
Why? What happens to the population density, N/T?
d) Suppose now that B expands continuously at a rate of 2% per
year. Would population expand at 2% per year as well? Why? What if $\theta$
were very small?
e) Consider now that the introduction of a new technology based on the
division of labour dramatically alters the shape of the production
function. In particular, assume that in the new production function
B=0.02 and $\alpha = 2$. Describe the new relationship between output and
labour in a graph. Starting with N=100, how would the economy
evolve? Explain what would happen if the economy were pushed away from the
steady state by a small exogenous change in TFP (say, to B=0.025).
Compare with (d).
1.2.
Consider an economy where people live two periods. In the first period, people
are young, they work, they have children and they support their parents. In the second
period, people are too old to work or to have children, so they need assistance from their
children to sustain their consumption needs.
Each family is only concerned with its lifetime utility function, given by
$U = \ln c_t + \ln c_{t+1}$. Further assume that: family income during the working-age period is
equal to 10 monetary units; the cost of raising children is equal to 1; each child delivers 1
unit of his or her income to the parents in their old age. There are no social security or capital
markets.
a) Formalize the utility maximization problem of a period-1 individual.
Write down the intertemporal budget constraint.
b) Describe the welfare gains associated to the fact that people can have
children. Use a graph to illustrate your answer.
c) What happens to the optimal choice when the family income increases
from 10 to 12? Show in a graph. Explain how this relates to the
Malthusian theory.
d) What happens if the cost of raising children increases from 1 to 1.25?
e) Show that the problem above is equivalent to that of a static optimization
in which individuals derive an intrinsic utility from having children.
f) Suppose now that banking services become available, so that households
could borrow or lend any amount of money at a zero interest rate.
Would children still be a profitable investment?
1.3.
The following table illustrates the “demographic momentum”. Initially, the population
is stable with a fertility rate equal to 2. The number of fertile women in each generation
is equal to half of the newborns 30 years before, and the death rate is stable at 1/3.
In year zero the fertility rate jumps temporarily from 2 to 3.
a) Explain how the birth rate is determined in the model
b) Explain why the temporary shock in fertility produces lasting effects in the birth rate
for more than two centuries.
Year                  -60    -30      0     30     60     90    120    150    180    210    240
Initial population  100.0  100.0  100.0  116.7  127.8  135.2  140.1  143.4  145.6  147.1  148.0
Fertile women        16.7   16.7   16.7   25.0   25.0   25.0   25.0   25.0   25.0   25.0   25.0
Fertility rate          2      2      3      2      2      2      2      2      2      2      2
Birth rate            33%    33%    50%    43%    39%    37%    36%    35%    34%    34%    34%
Death rate            33%    33%    33%    33%    33%    33%    33%    33%    33%    33%    33%
Population growth    0.0%   0.0%  16.7%   9.5%   5.8%   3.7%   2.3%   1.5%   1.0%   0.7%   0.4%
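The rows of the table can be regenerated with a few lines of code, which may help to see how the columns relate to each other. The sketch below rests on assumptions inferred from the table rather than stated in the text: births equal the fertility rate times the number of fertile women, deaths are one third of the beginning-of-period population, and each step lasts 30 years. The function name `simulate_momentum` is hypothetical.

```python
def simulate_momentum(periods=9, shock_fertility=3.0, base_fertility=2.0,
                      death_rate=1 / 3, initial_population=100.0):
    """Iterate the generational accounting behind the table in 30-year steps."""
    population = initial_population
    # Births one generation before year 0, in the initial stationary state
    # (births equal deaths, so the population is constant).
    previous_births = death_rate * initial_population
    rows = []
    for step in range(periods):
        year = 30 * step
        fertile_women = previous_births / 2   # half of the newborns 30 years before
        fertility = shock_fertility if year == 0 else base_fertility
        births = fertility * fertile_women
        deaths = death_rate * population
        rows.append({
            "year": year,
            "population": population,
            "fertile_women": fertile_women,
            "birth_rate": births / population,
            "growth": (births - deaths) / population,
        })
        population += births - deaths
        previous_births = births
    return rows


rows = simulate_momentum()
for row in rows:
    print(f"{row['year']:>4} {row['population']:7.1f} {row['fertile_women']:6.1f} "
          f"{row['birth_rate']:6.1%} {row['growth']:6.1%}")
```

Under these assumptions the number of births stays at its new level after the shock, while the population, and hence the number of deaths, catches up only gradually.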
1.4.
Consider an economy where two goods, Manufactures (Y) and Agriculture (Z),
are produced using labour as the only input. For simplicity, assume that the total labour force
in the economy is equal to 100 and that both production functions are linear in labour:
$$Y = B N_Y$$
$$Z = A N_Z$$
a) Find out the expression for the production possibilities frontier. Display
it in a graph, assuming that the productivity parameters are A=1/2 and
B=1. In that case, what will be the opportunity cost of expanding one
unit of manufactures output?
b) Assume now that we are dealing with a closed economy and that the
utility function is given by $U = \ln(Z - \bar Z) + \ln Y$, with $\bar Z = 40$. (b1)
Interpret the parameter $\bar Z$. (b2) Find an expression relating the
equilibrium level of employment in manufactures to the parameter A.
(b3) Assume that this country experiences an agricultural revolution,
with the productivity parameter A shifting from A=1/2 to A=1. Explain
what happens to employment in manufactures.
c) Suppose that productivity in manufactures evolves according
to $B_t = 0.1\,N_{Y,t-1}$. (c1) Interpret this rule. (c2) What happens to
productivity in manufactures after the agricultural revolution? (c3) Does
employment in manufactures change at all? Why?
d) Assume now that two countries, say England and India, engage in
international trade. Assume that, before opening up, England
experienced an agrarian revolution as described in b) and the implied
transformation, as described in c). India, on the contrary, was still in the
first stage (A=1/2 and B=1): (d1) Does England have an absolute advantage in
agriculture? (d2) Does England have a comparative advantage in agriculture?
(d3) Assuming that both economies open up to international trade, how
will employment evolve in both countries? (d4) Taking into account rule
(c), are comparative advantages likely to change in the future?
“A thrifty society will, in the long run, be wealthier than an impatient one, but it
will not grow faster” [Robert Lucas Jr.]
Learning Goals:
• Acknowledge the distinctive features of capital, as compared to those of
the inputs labour and land.
• Understand how capital accumulation can overcome the diminishing
returns to labour.
• Understand the mechanics of the Solow model, and the distinction
between transition dynamics and equilibrium.
• Evaluate the Solow model in light of the Kaldor facts.
• Understand how technological progress allows per capita income and
wages to grow without altering the real interest rate.
• Discuss the optimality of the saving rate in the context of the Solow
model.
• Derive a simple optimal consumption (Ramsey) rule.
2.1. Introduction
The neoclassical theory of economic growth was pioneered by two independent
authors, the American economist and Nobel Laureate Robert Solow (1956) and the
Australian economist Trevor Swan (1956). The main innovation of the Neoclassical
model with respect to the Malthusian model is the replacement of land by “capital” in the
production function. By “capital”, we mean machinery, buildings, and other equipment.
This modification is more than a mere change in form: contrary to land, capital
can be produced and accumulated. That is, people can choose the stock of capital they
want. Capital accumulation, in turn, increases a country’s productive capacity and
thereby the productivity of labour. This opens an avenue to overcome the diminishing
returns to labour: by allowing the capital stock to expand, the Solow model avoids the
negative relationship between productivity and the size of population that plagues the
Malthus model. Thus, in the Solow model, the long run is no longer characterized
by a low-wage equilibrium trap. Still, because capital itself faces diminishing returns,
capital accumulation alone cannot generate long-term growth. In the Solow model, a
higher investment rate leads to a higher level of per capita income, but it does not
generate long-term growth.
This chapter describes the Solow model in its simplest formulation. In Section
2.2, we describe the basic model and its equilibrium. In Section 2.3, we compare the
predictions of the model with the main stylized facts of economic growth. In Section 2.4
we discuss how the main endogenous variables of the model respond to changes in the key
exogenous parameters. Section 2.5 discusses the optimality of the saving rate, from a
social point of view. Section 2.6 extends the model to the case in which the saving rate
is the result of optimizing decisions by individual agents. Section 2.7 introduces a
growth accounting exercise to illustrate a fundamental limitation of this simple version
of the Solow model. Section 2.8 concludes.
2.2. The Solow model
The production function
In the Malthus model, with everything else constant, an expansion in the size of
the labour force leads to a decline in per capita income. This is a direct consequence of
the Law of Diminishing Returns. The Solow model retains the assumption of
diminishing returns, but explores another property of the neoclassical production
function: the property of Constant Returns to Scale (see Box 2.1). When the production
function exhibits CRS, if one sets capital and labour to expand at the same rate, then
output will grow at the very same rate. Hence, per capita income will remain constant,
rather than declining towards a low level subsistence trap.
To see how the Solow model works in the long run, let’s assume that the
aggregate production function in the economy is the following:
$$Y_t = A K_t^{\beta} N_t^{1-\beta}, \qquad 0 < \beta < 1 \tag{2.1}$$
where K denotes the economy’s capital stock, N the size of the labour force, Y
output and A Total Factor Productivity.
Since we are interested in the well being of the average person, we focus on per
capita income. Dividing the production function (2.1) by the size of the workforce (N),
we obtain a new expression, which relates per capita output to the availability of capital
per worker:
$$y_t = A\left(\frac{K_t}{N_t}\right)^{\beta} = A k_t^{\beta} \tag{2.2}$$
In (2.2), y=Y/N denotes per capita income and k=K/N is the capital-labour ratio.
Equation (2.2) is called the production function in the intensive form and stresses the
role of the capital-labour ratio as a main driver of per capita income.
Recall that the Malthusian model assumes that the input other than labour in
the production function (land in that case) remains constant, so per capita income is
doomed to decline as the labour force expands. In terms of equation (2.1), this happens
when N increases while K is held constant. This is illustrated in Figure 2.1, with a move
from point A to B: all else constant, an increase in the use of labour from $N_0$ to $N_1$
implies a decline in output per worker from $y_0$ to $y_1$.
In the Solow model, in contrast, the second input, capital, is allowed to increase
along with the labour force. Actually, the steady state of the model will be such that the
capital stock increases in exactly the same proportion as the labour force. In that
case, we see from (2.2) that per capita income will remain constant. In terms of Figure
2.1, this is illustrated by a move from A to C: if, when labour expands from $N_0$ to $N_1$,
the capital stock also increases, and in exactly the same proportion (from $K_0$ to $K_1$),
then per capita income remains constant at $y_0$.
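Figure 2.1. Diminishing returns versus constant returns to scale
[Figure: output per worker y = Y/N (vertical axis) against labour N (horizontal axis), showing the curves $y = A(K_0/N)^{\beta}$ and $y = A(K_1/N)^{\beta}$, with point A at ($N_0$, $y_0$), point B at ($N_1$, $y_1$) and point C at ($N_1$, $y_0$).]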
The interesting feature of the Solow model is that we do not need to postulate K
to grow at the same rate as the labour force. As we will see next, the properties of the
model are such that the capital stock, despite being endogenous to the model, will end
up growing at the same rate as the labour force in the long run, assuring a constant level
of per capita income.
Box 2.1 Constant returns to scale
The key assumption of the neoclassical growth model is that of constant returns
to scale (CRS). CRS means that if one increases the use of all inputs by a positive
proportional factor, output will rise in the same proportion. For instance, doubling the
use of labour and capital will double output. The CRS property can easily be checked
in equation (2.1): for any $q > 0$, $A (qK)^{\beta} (qN)^{1-\beta} = qY$.
Note that this is not inconsistent with the LDR: the Law of Diminishing Returns
states that if one expands the use of one input while holding the other input constant,
output will grow less than proportionally. The Constant Returns to Scale property
applies when all inputs increase in the same proportion at the same time.
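The difference between the two properties can be seen numerically. The sketch below evaluates the Cobb-Douglas function (2.1) under two experiments; the parameter values (A = 1, β = 0.3) and the input levels are illustrative assumptions of ours, not values from the text.

```python
def output(capital, labour, tfp=1.0, beta=0.3):
    """Cobb-Douglas production function (2.1): Y = A * K**beta * N**(1 - beta)."""
    return tfp * capital**beta * labour**(1 - beta)


K, N, q = 50.0, 100.0, 2.0
base = output(K, N)

# Experiment 1: both inputs are multiplied by q (constant returns to scale).
both_scaled = output(q * K, q * N)

# Experiment 2: only labour is multiplied by q (diminishing returns to labour).
labour_only = output(K, q * N)

print(both_scaled / base)   # equals q, up to rounding
print(labour_only / base)   # equals q**(1 - beta), which is smaller than q
```

In the second experiment output still rises, but by less than labour does, so output per worker falls; this is the Malthusian case of Figure 2.1.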
In general, production functions may also exhibit decreasing returns to scale (in
which case output grows less than proportionally to inputs) and increasing returns to
scale (when output grows more than proportionally to inputs). It is believed that
decreasing returns to scale are unlikely: if we manage to increase all inputs in a given
proportion, there is no reason to believe that output will not respond at least
proportionally. Increasing returns to scale may occur in certain circumstances. We will
address this case later in this book.
Main assumptions
Consider a closed economy with no government and with a large number of small
firms producing a single homogeneous good, Y, using two inputs: labour (N), which
grows over time at an exogenous rate (n), and capital (K), which can be produced and
accumulated. Population and the labour force are the same. Inputs are hired from
households, who are also the owners of the firms and the consumers in this economy.
Households save a fraction of their income, s, by spending less on current consumption
to buy new capital. The capital stock depreciates at a constant rate, $\delta$. Perfect
competition and flexible prices are assumed, so that full employment holds at each
moment in time.
The production function of each firm i takes the following form38:
$$Y_{it} = A_t K_{it}^{\beta} N_{it}^{1-\beta} \tag{2.3}$$
This production function exhibits Constant Returns to Scale in labour and capital
and decreasing marginal returns to each of those inputs (i.e. the LDR applies to each of
them).
The parameter A stands for the level of technology and is assumed exogenous to
the firm. Throughout this chapter it will be assumed that the level of technology is
constant over time:
$$A_t = A \tag{2.4}$$
Assuming that all firms are equal, the aggregate production function in this
economy becomes (2.1), where $Y = \sum_i Y_i$, $K = \sum_i K_i$ and $N = \sum_i N_i$ stand,
respectively, for the aggregate levels of output, capital and labour in the economy.
Households use their income either for consumption expenditures or for savings.
In the aggregate, this implies that:
$$Y_t = C_t + S_t \tag{2.5}$$
where C and S denote aggregate consumption and aggregate savings, respectively.
The Solow model combines the neoclassical features of perfect competition and
flexible prices with a Keynesian consumption function. In particular, it is assumed that
savings are proportional to current income:
$$S_t = sY_t \tag{2.6}$$
It should be noted that, in the real world, savings depend on a number of factors
other than income. For instance, the age structure of population, income inequality,
financial deepening, taxation, macroeconomic and political stability, culture, all these
factors influence the saving rate. So, although we take the saving rate as exogenous in
this model, one should take into account that this parameter is influenced by other
factors, when referring to this model to interpret real world facts.
Given the constant saving rate, s, the flow equilibrium in this economy is given
by:
$$sY_t = I_t \tag{2.7}$$
where I denotes gross investment39.
38
We stick to the Cobb-Douglas production function, for simplicity. The Solow model is, however,
consistent with more general specifications of the production function. What we need is constant returns
to scale, and positive but diminishing returns to both inputs.
With the passage of time, capital wears out or becomes obsolete. This process is
called depreciation and implies that some investment is needed every year just to
replace the depreciating capital. In this model, it is assumed that the depreciation rate
($\delta$) is exogenous and constant over time:
$$\dot K_t = I_t - \delta K_t \tag{2.8}$$
Equation (2.8) states that the change in the capital stock (net investment) is
equal to gross investment minus depreciation.
In this model, the ability to accumulate capital (via savings) prevents output per
capita from declining when population increases. Hence, one no longer needs to worry
about Malthusian barriers and subsistence wages that limit population growth. Instead, it
is assumed that population expands at an exogenous and constant rate, n:
$$n = \frac{\dot N_t}{N_t} \tag{2.9}$$
The above equations describe the basic Solow model. The flow income chart of
this economy is displayed in Figure 2.2.
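Figure 2.2: The flow income chart in the basic Solow model
[Figure: circular flow between Households and Firms. Firms pay households the income $Y = wN + (r+\delta)K$; households consume $C = (1-s)Y$ and channel savings $sY$ through financial markets (F. Markets), which finance gross investment $I = \dot K + \delta K$ by firms.]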
39
Equation (2.7) implicitly postulates the price of the capital good to be the same as that of output. As an
example, think that the only output in this economy was potatoes: potatoes can be either consumed or
planted (invested) to grow more potatoes. If however total investment included a plough, in that case a
given amount of saved output would translate into more or less capital accumulation, depending on how
many potatoes would be necessary to buy a plough. In that case, equation (2.7) should be divided by the
relative price of capital. We will analyse the implications of changes in the relative price of capital in
Chapter 11.
Factor prices and factor income shares
In this model, firms are price-takers both in the product market and in factor
markets. When this is so, we know that profit maximization implies that each input is
hired up to the point where its marginal product equals its real price.
Formally, the profit function of each individual firm at each moment in time is
given by:
$$\pi_{it} = Y_{it} - (r_t + \delta) K_{it} - w_t N_{it} \tag{2.10}$$
In this equation, w is the real wage rate, r denotes the real interest rate and $\delta$ is the
rate of depreciation of the capital stock (note that the “user cost” of capital is the sum of
two terms).
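The first order conditions of profit maximization imply:
$$\frac{\partial \pi_i}{\partial N_i} = (1-\beta) A K_i^{\beta} N_i^{-\beta} - w_t = (1-\beta)\frac{Y_{it}}{N_{it}} - w_t = 0$$
$$\frac{\partial \pi_i}{\partial K_i} = \beta A K_i^{\beta-1} N_i^{1-\beta} - (r_t + \delta) = \beta\frac{Y_{it}}{K_{it}} - (r_t + \delta) = 0$$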
Since firms are all alike, this leads to the following aggregate demand functions
for labour and capital, respectively:
$$w_t = (1-\beta)\frac{Y_t}{N_t} \tag{2.11}$$
$$r_t + \delta = \beta\frac{Y_t}{K_t} \tag{2.12}$$
Equations (2.11) and (2.12) imply that the income shares of capital and labour,
$(r+\delta)K/Y$ and $wN/Y$, are constant and equal to $\beta$ and $1-\beta$, respectively. That is, even
though the prices and quantities of capital and labour may vary, changes are such that
the shares of national income paid out to each factor of production remain constant.
This is a direct implication of assuming perfect competition and a Cobb-Douglas
production function.
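The constancy of factor shares can be verified numerically. The sketch below approximates the marginal products in (2.11) and (2.12) by finite differences and computes the implied income shares at three different input combinations. The parameter values (A = 1, β = 0.3), the input levels and the helper `marginal_product` are illustrative assumptions of ours.

```python
def output(capital, labour, tfp=1.0, beta=0.3):
    """Cobb-Douglas production function (2.1)."""
    return tfp * capital**beta * labour**(1 - beta)


def marginal_product(f, capital, labour, wrt, h=1e-6):
    """Central finite-difference approximation of a marginal product."""
    if wrt == "capital":
        return (f(capital + h, labour) - f(capital - h, labour)) / (2 * h)
    return (f(capital, labour + h) - f(capital, labour - h)) / (2 * h)


shares = []
for K, N in [(50.0, 100.0), (400.0, 100.0), (400.0, 25.0)]:
    Y = output(K, N)
    user_cost = marginal_product(output, K, N, "capital")   # r + delta in (2.12)
    wage = marginal_product(output, K, N, "labour")          # w in (2.11)
    capital_share = user_cost * K / Y
    labour_share = wage * N / Y
    shares.append((capital_share, labour_share))
    print(f"K={K:6.1f} N={N:6.1f}  w={wage:.3f}  r+delta={user_cost:.3f}  "
          f"capital share={capital_share:.3f}  labour share={labour_share:.3f}")
```

Factor prices differ across the three cases, but the shares stay at β and 1−β, as the text states.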
The Fundamental Dynamic Equation
To understand how the model works, note that the main determinant of per
capita output (2.2) is the capital-labour ratio, which is given (pre-determined) at each
moment in time. The capital-labour ratio does not jump, but may change continuously
over time, depending on investment, capital depreciation and population growth.
Formally, taking the time derivative of k, we obtain:
$$\dot k = \frac{\dot K N - \dot N K}{N^2} = \frac{\dot K}{N} - \frac{\dot N}{N}\,k \tag{2.13}$$
After some substitutions using (2.2) and (2.7)-(2.9), we obtain the so-called
Fundamental Dynamic Equation of the Solow model:
$$\dot k_t = sAk_t^{\beta} - (n+\delta)k_t \tag{2.14}$$
This equation states that the capital-labour ratio (i.e. the amount of capital
available to each worker) increases with per capita saving ($sy = sAk_t^{\beta}$) and decreases
with the depreciation rate ($\delta$) and the population growth rate (n).
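The substitutions leading to (2.14) can be spelled out step by step. The lines below are our own expansion of that step, using only equations (2.2), (2.7), (2.8), (2.9) and (2.13):

```latex
\begin{aligned}
\dot k_t &= \frac{\dot K_t}{N_t} - \frac{\dot N_t}{N_t}\,k_t
          && \text{by (2.13)} \\
         &= \frac{I_t - \delta K_t}{N_t} - n\,k_t
          && \text{by (2.8) and (2.9)} \\
         &= \frac{sY_t}{N_t} - \delta k_t - n\,k_t
          && \text{by (2.7) and } k_t = K_t/N_t \\
         &= sAk_t^{\beta} - (n+\delta)\,k_t
          && \text{by (2.2)}
\end{aligned}
```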
The term $(n+\delta)$ in (2.14) may be interpreted as the rate of depreciation of the
capital-labour ratio: on one hand, the depreciation rate reduces k by causing the capital
stock K to wear out; on the other hand, population growth results in less capital being
available to each worker (this negative effect of population growth on capital per
worker is often called capital dilution). According to equation (2.14), the change in the
capital-labour ratio is positive whenever per capita savings exceed the depreciation of
the capital-labour ratio, and conversely.
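To see equation (2.14) at work, one can iterate it forward from different starting points. The sketch below uses a simple Euler approximation with a small time step. The parameter values (s = 0.2, A = 1, β = 0.3, n = 0.01, δ = 0.05), the step size and the function names are illustrative assumptions of ours, not values from the text.

```python
def k_dot(k, s=0.2, tfp=1.0, beta=0.3, n=0.01, delta=0.05):
    """Right-hand side of (2.14): per capita savings minus break-even investment."""
    return s * tfp * k**beta - (n + delta) * k


def simulate(k0, steps=3000, dt=0.1):
    """Euler iteration of (2.14); returns the path of the capital-labour ratio."""
    path = [k0]
    for _ in range(steps):
        path.append(path[-1] + dt * k_dot(path[-1]))
    return path


low_start = simulate(k0=1.0)     # savings exceed break-even investment: k rises
high_start = simulate(k0=20.0)   # break-even investment exceeds savings: k falls

for t in (0, 100, 500, 1000, 3000):
    print(f"t={t * 0.1:6.1f}  k from below: {low_start[t]:7.3f}  "
          f"k from above: {high_start[t]:7.3f}")
```

With these parameters both paths approach the same value of k, one from below and one from above, and the speed of adjustment slows down as the gap between savings and break-even investment narrows.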
A graphical illustration
Figure 2.3 describes the dynamics and the equilibria of the model, as implied by
equation (2.14). The uppermost curve is the production function in per capita terms
(2.2). The figure also depicts the two terms on the right-hand side of (2.14): per capita
savings (sy), and the locus $(n+\delta)k$. The latter is known as the break-even investment line:
it depicts, for each level of capital per worker, the exact amount of gross savings that
will be necessary to offset the corresponding capital depreciation and capital dilution.
To see how the capital-labour ratio evolves over time, assume first that initially
the capital-labour ratio was equal to k0. In that situation, per capita income would be y0,
of which QR is devoted to consumption and k0Q is devoted to savings. Since in this case per
capita savings exceed the “break even investment” (given by k0P), from equation (2.14)
it follows that the change in the capital-labour ratio will be positive. In words, since the
economy generates savings (and hence new investment) larger than the amount needed
to keep the amount of capital per worker constant, the capital-labour ratio will increase.
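Figure 2.3. Dynamics and equilibria in the Solow Model
[Figure: horizontal axis $k = K/N$; curves $y = Ak^{\beta}$, $sy$ and $(n+\delta)k$; points O, Q,
R and S; capital-labour ratios $k_0$, $k^*$ and $k_1$; income levels $y_0$ and $y^*$.]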
By the same token, as long as savings per capita exceed the break-even
investment line, the capital-labour ratio will keep increasing. However, as k
progressively approaches the point $k = k^*$, the distance between the two loci decreases.
The reason is again diminishing returns: since income per worker grows less than
proportionally with the stock of capital per worker, savings cannot grow as fast as
the depreciation of the capital-labour ratio. Eventually the two loci cross: at $k = k^*$,
the amount of savings per capita is just enough (and no more) to equip the new
entrants into the labour force and to replace the depreciating capital. This is the steady
state (equilibrium) of the model [40].
The steady state
Formally, the equilibria of the model are obtained by solving (2.14) for $\dot{k} = 0$.
This equation has only two solutions, the trivial one ($k = 0$) and:
$$k^* = \left(\frac{sA}{n+\delta}\right)^{\frac{1}{1-\beta}}. \qquad (2.16)$$
Because the model predicts that the economy gravitates to the steady state (2.16)
from any departing point in its neighbourhood, this equilibrium is said to be stable [41].
Substituting (2.16) in (2.2), one obtains the steady state level of per capita
income:
$$y^* = A^{\frac{1}{1-\beta}} \left(\frac{s}{n+\delta}\right)^{\frac{\beta}{1-\beta}} \qquad (2.17)$$
Since parameters A, s, n and $\delta$ are all constant, equations (2.16) and (2.17) imply
that, in the steady state, capital per worker and per capita income are also constant.
Note that this outcome is in full conformity with the CRS property: if labour and
capital grow at rate n (so that the capital-labour ratio remains constant), then
output will also grow at rate n (implying a constant level of per capita income).
This is why the CRS property is key to this model.
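The convergence argument can be illustrated with a small simulation. The sketch below iterates (2.14) forward from a point below and a point above the steady state and compares the result with the closed form (2.16). The parameter values, the Euler step size and the function names are illustrative assumptions, not part of the original model description.

```python
def steady_state_k(s, A, beta, n, delta):
    """Closed-form steady state of capital per worker, equation (2.16)."""
    return (s * A / (n + delta)) ** (1.0 / (1.0 - beta))

def simulate_k(k0, s, A, beta, n, delta, steps=3000, dt=0.1):
    """Iterate (2.14) forward with small Euler steps (steps * dt = 300 periods)."""
    k = k0
    for _ in range(steps):
        k += dt * (s * A * k ** beta - (n + delta) * k)
    return k

params = dict(s=0.2, A=1.0, beta=0.3, n=0.02, delta=0.05)  # illustrative values
k_star = steady_state_k(**params)
y_star = params["A"] * k_star ** params["beta"]            # per capita income at k*

from_below = simulate_k(0.5 * k_star, **params)
from_above = simulate_k(2.0 * k_star, **params)
print(f"k* = {k_star:.4f}, y* = {y_star:.4f}")
print(f"after 300 periods: from below {from_below:.4f}, from above {from_above:.4f}")
```

Both paths end up at (approximately) the same capital-labour ratio, which is what local stability of the steady state means.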
[40] We invite the reader to use similar reasoning to explain why the capital-labour ratio
converges to the steady state when departing from $k_1$.
[41] Formally, the equilibrium described by $k^*$ is locally stable because the condition
$\partial \dot{k} / \partial k < 0$ holds for any point in its neighbourhood. The reader may
verify that the same condition does not hold in the neighbourhood of the trivial steady
state, $k = 0$. The latter is an unstable equilibrium.
2.3. The Solow model and the facts of economic growth
Solow’s quest
Robert Solow developed his famous model with the main purpose of better
understanding the growth performance of the US economy in the twentieth century.
He was particularly interested in explaining the long-run tendency for output and capital
to grow at the same rate, a statistical regularity first documented for the US economy
by Simon Kuznets, a Nobel Laureate Russian-American economist.
Solow also wrote his model with the so-called “Kaldor’s facts” in mind. These
are six “remarkable historical constancies” (empirical regularities) that the British
economist (born in Budapest) Nicholas Kaldor identified as characterizing modern
economic growth [42]. In particular, the Kaldor stylized facts are:
1. Output per worker grows over time at a sustained rate;
2. The capital stock per worker grows over time at a sustained rate;
3. The capital-output ratio exhibits no clear trend over time;
4. The real return to capital is relatively constant over time;
5. The shares of labour and of capital in national income are roughly
constant over time;
6. There are wide differences in the growth rate of productivity across
countries.
Kaldor did not claim that these facts hold at each moment in time. For instance, per
capita output falls during recessions and the interest rate fluctuates significantly in the
short run. Over long periods of time, however, these facts tend to show up in the
statistical data, so they provide a natural benchmark against which to confront growth
theories that try to explain long-term trends. In what follows, we will see how the
Solow model conforms to these stylized facts.
Confronting the model with the stylized facts
As we just saw, equations (2.16) and (2.17) imply that, in the long run, per
capita income and the capital-labour ratio do not grow at all. This means that the Solow
model, as presented so far, does not account for Kaldor’s stylized facts 1, 2, and 6.
To check whether fact 3 is met, let’s divide (2.17) by (2.16), to obtain the
output-capital ratio in the steady state:
$$\left(\frac{Y}{K}\right)^* = \frac{n+\delta}{s} \qquad (2.18)$$
[42] Kaldor (1957, 1961).
This ratio is constant, because all three parameters on the right-hand side are
themselves constant (in Figure 2.3, this ratio is measured by the slope of the ray OS).
Thus, the model accounts for Kaldor’s fact 3, the regularity documented by Simon
Kuznets: the long-run tendency for capital (K) and output (Y) to grow at the same rate.
An important implication of a constant average product of capital in the steady
state is that the interest rate will also be constant in the steady state (remember equation
2.12). Hence, Kaldor’s stylized fact 4 is also verified. Regarding fact 5, we already
verified that it holds in this model (equations 2.11 and 2.12).
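As a quick numerical check of (2.18), the sketch below computes the steady state for several levels of technology and confirms that the output-capital ratio always equals $(n+\delta)/s$. The parameter values are illustrative assumptions.

```python
def steady_state(s, A, beta, n, delta):
    """Return (k*, y*) from equations (2.16) and (2.2)."""
    k_star = (s * A / (n + delta)) ** (1.0 / (1.0 - beta))
    y_star = A * k_star ** beta
    return k_star, y_star

s, beta, n, delta = 0.2, 0.3, 0.02, 0.05   # illustrative values
ratios = []
for A in (1.0, 2.0, 5.0):
    k_star, y_star = steady_state(s, A, beta, n, delta)
    ratios.append(y_star / k_star)          # Y/K equals y/k
    print(f"A = {A:.1f}: k* = {k_star:8.3f}, y* = {y_star:7.3f}, Y/K = {y_star / k_star:.4f}")

print(f"(n + delta) / s = {(n + delta) / s:.4f}")
```

Capital per worker and income per capita differ across the three cases, but the ratio between them does not.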
Savings, population and per capita income in the real world
According to equation (2.17), countries with high saving rates and with slow
population expansions should enjoy higher standards of living than countries with low
saving rates and fast population expansions.
Figures 2.4 and 2.5 check how these two predictions of the model fit
the real world facts. The figures plot the level of GDP per capita in the year 2000 against,
respectively, (i) gross investment as a share of GDP and (ii) population growth rates.
Figure 2.4 reveals a positive correlation between investment rates and per capita
incomes. Figure 2.5 reveals a negative correlation between population growth and per
capita incomes. Both figures are in broad accordance with the Solow model.
Note, however, that these figures tell us nothing about the direction of causality.
For example, it could be that the low saving rates in the poorest countries are
explained by the fact that people living at the margin of subsistence cannot afford to save.
Likewise, the poorest countries may exhibit faster rates of population expansion
simply because they have not yet made their demographic transition (see Box 2.2). Thus,
while these data are in accordance with the Solow model, they do not prove that the Solow
model is the right one.
Figure 2.4: Per capita GDP and gross Investment 1950-2000
[Scatter plot. Vertical axis: real GDP per capita in 2000 (logs, 1996 US dollars).
Horizontal axis: investment as a percentage of GDP (average 1950-2000).]
Source: Penn World Table 6.1, Heston, Summers and Aten (2002). The sample
includes 169 countries and average data over the 50 year period from 1950 to 2000.
Figure 2.5: Per capita GDP and population growth rates, 1950-2000
[Scatter plot. Vertical axis: log of GDP per capita in 2000 (1996 US dollars).
Horizontal axis: average population growth (1950-2000).]
Source: same as Figure 2.4
Box 2.2 Population dynamics and poverty traps
The model outlined above assumes that population grows at an exogenous rate,
n. This departure from the Malthusian assumption looks sensible for explaining growth in
economies that have already made their demographic transition. However, it may be
interesting to investigate whether the model’s predictions change if one allows the
rate of population expansion to depend on per capita income.
As explained in Section 1.5, a country’s population is expected to expand at
moderate rates both in cases of extreme poverty (high birth rates and high death rates)
and when living standards are high (low birth rates and low death rates). Yet at
intermediate levels of development, the growth rate of population is expected to be
high, because the birth rate is still high while the death rate is already low (Figure
1.6).
Adding a non-linear relationship between population growth and per capita
income to the Solow model, one obtains a break-even investment curve that is
non-linear, as depicted in Figure 2.6. In this case, the model may display multiple
equilibria. In the figure, there are four equilibria: the origin, L, A, and H. The
equilibrium represented by H corresponds to a low population growth rate and a high
level of per capita income. Equilibrium A is an intermediate equilibrium, characterised
by fast population growth. Equilibrium L is characterised by a low rate of population
growth and a low level of per capita income.
The equilibria described by H and L are both stable, just like in the basic Solow
model: departing from any nearby point to the left (right), per capita savings exceed
(are less than) the break-even investment and hence the capital-labour ratio increases
(decreases) until reaching the steady state.
The equilibrium described by A is unstable: if, departing from this equilibrium,
the capital stock per worker decreases (rises) by a small amount, then per capita savings
become lower (higher) than the break-even investment and the economy starts shrinking
(growing), until reaching the low (high) income steady state.
Because the equilibrium L is stable and dominated by another possible outcome
(equilibrium H), it is called a “poverty trap”.
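To make the multiplicity of equilibria concrete, the sketch below replaces the constant n in (2.14) by a hump-shaped function n(y), scans a grid of capital-labour ratios for sign changes in $\dot{k}$, and classifies each positive equilibrium as stable or unstable. The functional form of `pop_growth` and all parameter values are hypothetical, chosen only so that three positive equilibria exist; they are not taken from the text or calibrated to data.

```python
import math

S, A, BETA, DELTA = 0.2, 1.0, 0.5, 0.03   # illustrative assumptions

def pop_growth(y, n_low=0.01, hump=0.03, y_mid=3.2, width=0.6):
    """Hypothetical hump-shaped population growth rate n(y)."""
    return n_low + hump * math.exp(-((y - y_mid) / width) ** 2)

def k_dot(k):
    """Right-hand side of (2.14) with n replaced by n(y)."""
    y = A * k ** BETA
    return S * y - (pop_growth(y) + DELTA) * k

def bisect(lo, hi, tol=1e-10):
    """Locate the root of k_dot between lo and hi, where its sign changes."""
    f_lo = k_dot(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (k_dot(mid) > 0) == (f_lo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

grid = [0.5 + 0.25 * i for i in range(160)]   # k from 0.5 to 40.25
equilibria = []
for lo, hi in zip(grid, grid[1:]):
    if (k_dot(lo) > 0) != (k_dot(hi) > 0):
        # k rises to the left of a stable equilibrium and falls to its right
        kind = "stable" if k_dot(lo) > 0 else "unstable"
        equilibria.append((bisect(lo, hi), kind))

for k_eq, kind in equilibria:
    print(f"k = {k_eq:6.2f}, y = {A * k_eq ** BETA:5.2f}: {kind}")
```

Under these assumptions the scan returns three positive equilibria, in the order stable, unstable, stable, which correspond to L, A and H.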
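Figure 2.6. Illustration of a Poverty trap
[Figure: horizontal axis $k = K/N$, vertical axis $y$; curves $y = Ak^{\beta}$, $sy$ and the
non-linear break-even investment curve $(n(y)+\delta)k$; equilibria L, A and H.]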
A characteristic of many models with multiple equilibria is that they give history a
role in equilibrium selection. That is, if an economy starts out in the bad equilibrium, L,
it will remain in the bad equilibrium. If the economy starts out in the good equilibrium, H,
it will remain in the good equilibrium.
Moreover, a policy designed to move the economy out of the poverty trap may
fail to do so, unless it is powerful enough to push the economy beyond the unstable
equilibrium A. To see this, suppose that the economy was initially at point
L and then received some form of external aid to improve its capital stock. As you can
easily check in the figure, unless this aid was large enough to move the economy to the
right of the threshold point A, the effect of expanding the level of capital per worker
would be merely temporary: the initial rise in output per capita would lead to an
acceleration of population expansion, which in turn would require a higher saving rate
just to stand still. Since the saving rate is constant, the capital-labour ratio would fall
back, driving the economy again to the poverty trap. In that case, the “demographic
transition” does not occur because the saving rate is too low to induce the required
increase in the capital-labour ratio during the intermediate stage. In the end, the extra
capital obtained is “spent” on an increasing population, rather than on improving living
standards.
If, however, the donation was large enough to carry the economy past the critical
point A, then the economy would be able to escape the trap, moving towards the high
income steady state, H.
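The aid experiment can be sketched numerically as well. The code below uses a hypothetical hump-shaped n(y) and illustrative parameters (none of them taken from the text), for which the low equilibrium lies near k = 9, the threshold near k = 14 and the high equilibrium near k = 25. It then follows the capital-labour ratio after a small and after a large one-off addition to the capital stock.

```python
import math

S, A, BETA, DELTA = 0.2, 1.0, 0.5, 0.03   # illustrative assumptions

def pop_growth(y):
    """Hypothetical hump-shaped n(y): about 1% at low and high incomes, 4% at the peak."""
    return 0.01 + 0.03 * math.exp(-((y - 3.2) / 0.6) ** 2)

def k_after(k_start, years=600, dt=0.5):
    """Capital per worker reached after iterating (2.14), with n = n(y), for `years`."""
    k = k_start
    for _ in range(int(years / dt)):
        y = A * k ** BETA
        k += dt * (S * y - (pop_growth(y) + DELTA) * k)
    return k

small_aid = k_after(12.0)   # aid lifts k from about 9 to 12, still below the threshold
large_aid = k_after(16.0)   # aid lifts k to 16, beyond the threshold
print(f"after small aid: k returns to {small_aid:.2f}")
print(f"after large aid: k converges to {large_aid:.2f}")
```

In this sketch the small transfer is entirely dissipated, while the large one moves the economy permanently to the high income steady state.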
This version of the model suggests that a relatively small level of
international assistance to a poor country trapped in a Post-Malthusian regime will not
deliver long-run economic performance. In contrast, a sufficiently large temporary boost
in savings may have long-run effects.
This conclusion has important policy implications: it suggests that external aid
to a developing country may be useless when small and permanent, while it can produce
long-lasting effects if large and lasting only long enough for the economy to overcome
the trap.
The idea that some poor economies may be locked in poverty traps, out of which
they need ambitious investment programs financed by foreign aid, is known as the “Big
Push”. This argument will be discussed in more detail later in this book. For the
moment, and sticking with the case at hand, just note that a policy alternative to the “Big
Push” would be to tackle directly the source of the problem, which is the demographic
behaviour: if a policy of birth control and family planning were successful in reducing
the gap between the birth rate and the death rate in the intermediate stage, then the
break-even investment curve would approach a straight line, making the model more
similar to the basic Solow one and eventually eliminating the poverty trap.
2.4. Transitional Dynamics
An important feature of the Solow model is that, if the economy is not in the
steady state, it will converge to the steady state. But the economy cannot jump
instantaneously from one steady state to another: since capital accumulation is
bounded by the availability of savings, there will be an adjustment period, during which
the economy approaches the new steady state.
This point is very important because any real world economy that we
study over a particular period may be precisely in that adjustment process, rather than in
the steady state: for example, because the savings rate has risen recently but this has
not yet been fully reflected in higher per capita income. In the light of the Solow model,
that economy will be experiencing transitory growth, reflecting its adjustment
from one steady state to the other. The following sections specifically address
the issue of transition dynamics in the face of changes in the exogenous (fundamental)
parameters.
What Happens if the Savings Rate Rises?
Figure 2.7 shows how the model adjusts to a rise in the saving rate. When the
saving rate rises from $s_0$ to $s_1$, the curve measuring per capita savings shifts upwards,
moving the steady state from $k_0^*$ to $k_1^*$.
Thus, the economy engages in an expansion process until the new equilibrium is
reached. In the long run, the rise in the saving rate produces a level effect: that is, in
the new steady state, the economy will enjoy a higher level of output per worker than in
the previous steady state. During the transition from one steady state to the other, per
capita output grows. But, because of diminishing returns, this growth in output per capita
is no more than a temporary phenomenon.
It is worth mentioning that the average product of capital (Y/K) in the new
steady state is lower than in the old steady state. This can be checked by reference to
equation (2.18), where the saving rate enters in the denominator. Visually, in Figure 2.7
you see that the slope of the ray that departs from the origin and crosses the production
function at point 1 is lower than the one corresponding to the ray that crosses the
production function at point 0. Referring to (2.12), this also means that the interest rate
has declined: intuitively, one consequence of there being relatively more capital per
worker available in the economy is that capital becomes relatively cheaper.
The implication of what we just learned is that, in low-income countries with
low savings (say 10%), a growth surge could be achieved by raising the
saving rate. However, once the economy reached the new steady state, per capita
income would stagnate again. Thus, in order to achieve further increases in output per
worker, one would need to raise the saving rate again and again. And clearly, there are
limits to exploring this avenue: the maximum level of the savings rate observed in
countries in the real world is around 30-35%. Rates much higher than this eat
into available consumption and so into the current standard of living. The conclusion is
that it is impossible to achieve continuous growth of per capita income by increasing
the saving rate.
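The temporary nature of the growth surge can be illustrated with a simulation. The sketch below starts in the steady state for a saving rate of 10%, raises the saving rate to 20% and records the yearly growth rate of per capita output. All parameter values and the Euler discretisation are illustrative assumptions.

```python
def steady_state_k(s, A=1.0, beta=0.3, n=0.02, delta=0.05):
    """Steady-state capital per worker, equation (2.16)."""
    return (s * A / (n + delta)) ** (1.0 / (1.0 - beta))

def output_path(s0, s1, years, A=1.0, beta=0.3, n=0.02, delta=0.05, substeps=10):
    """Per capita output, year by year, after the saving rate moves from s0 to s1."""
    k = steady_state_k(s0, A, beta, n, delta)
    dt = 1.0 / substeps
    path = [A * k ** beta]
    for _ in range(years):
        for _ in range(substeps):
            k += dt * (s1 * A * k ** beta - (n + delta) * k)
        path.append(A * k ** beta)
    return path

path = output_path(s0=0.10, s1=0.20, years=200)
growth = [path[t + 1] / path[t] - 1.0 for t in range(200)]
y1_star = 1.0 * steady_state_k(0.20) ** 0.3      # new steady-state income (A = 1, beta = 0.3)

for year in (1, 10, 50, 200):
    print(f"year {year:3d}: growth of y = {growth[year - 1]:.4%}")
print(f"y after 200 years = {path[-1]:.4f}, new steady state y* = {y1_star:.4f}")
```

Growth is fastest right after the change and then fades towards zero as income approaches its new, higher level.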
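Figure 2.7. A higher saving rate raises the steady state level of per capita income but only
boosts growth rates temporarily
[Figure: horizontal axis $k = K/N$, vertical axis $y = Y/N$; curves $y = Ak^{\beta}$, $s_0 y$,
$s_1 y$ and $(n+\delta)k$; rays from the origin with slopes $(Y/K)_0$ and $(Y/K)_1$; steady
states 0 and 1 at $(k_0^*, y_0^*)$ and $(k_1^*, y_1^*)$.]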
What happens if Population Growth or the Depreciation rate Decline?
From equation (2.17) we see that a fall in the population growth rate (or in the
depreciation rate) has an effect similar to that of a rise in the savings rate. In terms of
Figure 2.3, the difference is that the change in the steady state is caused by a downward
rotation of the break-even investment line. Thus, after a decline in the population growth
rate, both output per worker and capital per worker will increase, but this will happen
only during the transition from one steady state to the other. Note that in the new steady
state, the average product of capital, and hence the interest rate, will be lower than in
the initial steady state.
What happens if the level of technology improves?
Figures 2.8 and 2.9 describe how the different variables of the model adjust
following a once-and-for-all improvement in Total Factor Productivity. In terms of
Figure 2.8, when this parameter rises from $A_0$ to $A_1$, both the production function and
the curve measuring per capita savings shift upwards, moving the steady state from $k_0^*$
to $k_1^*$.
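Figure 2.8. An increase in technology raises the steady state level of per capita income but
leaves the output capital ratio unchanged
[Figure: horizontal axis $k = K/N$, vertical axis $y = Y/N$; curves $y = A_0 k^{\beta}$ and
$y = A_1 k^{\beta}$, the two corresponding savings curves $sy$ and the line $(n+\delta)k$; a ray
from the origin with slope $(Y/K)_0$; steady states 0 and 1 at $(k_0^*, y_0^*)$ and
$(k_1^*, y_1^*)$.]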
In contrast to the case of an increase in the saving rate, in this case there is an
initial jump in per capita output: the reason is that a productivity increase means that more
output is obtained from the same inputs.
The paths of output per worker, the average product of capital, and the
interest rate are described in Figure 2.9. As shown in the figure, at the time of the shock
($t_0$), all three variables jump up. During the adjustment to the new steady state, the
average product of capital declines again (diminishing returns show up) and so does
the interest rate. In the new steady state (after $t_1$), the average product of capital
and the interest rate are the same as before the shock (you may confirm this by
observing that the long run level of Y/K, given by equation (2.18), does not depend on
the level of technology, A).
All in all, the improvement in technology allows the economy to move from
one steady state to a new one with more capital per worker, without any decline in the
average product of capital or in the interest rate. Changes in TFP face no
diminishing returns.
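The jump and subsequent decline of the output-capital ratio can be reproduced with a short simulation. The sketch below starts in the steady state for $A_0$, raises technology to $A_1$ and tracks Y/K year by year. The size of the shock, the other parameter values and the Euler discretisation are illustrative assumptions.

```python
def tfp_shock(A0=1.0, A1=1.5, s=0.2, beta=0.3, n=0.02, delta=0.05,
              years=300, substeps=10):
    """Y/K before a permanent rise in A, and year by year afterwards."""
    k = (s * A0 / (n + delta)) ** (1.0 / (1.0 - beta))   # old steady state (2.16)
    before = A0 * k ** (beta - 1.0)                       # Y/K = y/k before the shock
    dt = 1.0 / substeps
    after = [A1 * k ** (beta - 1.0)]                      # jump at t0: k has not moved yet
    for _ in range(years):
        for _ in range(substeps):
            k += dt * (s * A1 * k ** beta - (n + delta) * k)
        after.append(A1 * k ** (beta - 1.0))
    return before, after

before, after = tfp_shock()
print(f"Y/K before the shock:  {before:.4f}")
print(f"Y/K on impact:         {after[0]:.4f}")
print(f"Y/K after 300 years:   {after[-1]:.4f}")
```

The ratio jumps on impact and then drifts back to its initial value, $(n+\delta)/s$, while capital per worker settles at a higher level.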
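Figure 2.9. The time paths of per capita output, the capital-labour ratio, the
output-capital ratio and the interest rate, following an improvement in technology
[Figure: time paths, including that of Y/K, plotted against time, with the shock
occurring at $t_0$ and the adjustment ending at $t_1$.]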
2.5 The Golden Rule
The Golden Rule of capital accumulation
In the Solow model, the steady state level of per capita income depends
positively on the saving rate (Eq. 2.17). Does this mean that any increase in the saving
rate is welcome?
To answer this question, remember that, from the perspective of the household,
savings represent foregone consumption: since the household’s major concern is the
amount of consumption it can afford, a higher saving rate does not necessarily deliver
higher utility.
Remember however that, in the context of this model, a higher saving rate will
deliver a higher level of per capita income in the steady state. Hence, while a lower
proportion of income is devoted to consumption, income itself will rise. The final
impact on consumption will depend on the balance between these two opposing effects.
Mathematically, the saving rate that maximizes the level of per capita
consumption in the steady state can be found in the following manner [43]:
$$\max_{k} \; c = y_t - s y_t, \quad \text{subject to } \dot{k} = 0. \qquad (2.19)$$
where $c = C/N$ denotes per capita consumption. In the steady state, this is equivalent
to [44]:
$$\max_{k} \; c = Ak_t^{\beta} - (n+\delta)k_t.$$
The first order condition of this problem leads to:
$$\frac{\partial y}{\partial k} = n + \delta \qquad (2.20)$$
This condition is called the "Golden Rule of Capital Accumulation" [45]. It states
that the steady state level of per capita consumption is maximised when the slope of the
production function (i.e. the marginal product of capital) is equal to the slope of the
break-even investment line.
Geometrically, this problem is illustrated in Figure 2.10. The Golden Rule is met
at point $k_G$. Whenever the steady state level of capital per worker is below this level
(that is, when $k^* < k_G$), the rise in output that results from an increase in $k^*$
more than offsets the rise in the amount of savings that is necessary to sustain such an
equilibrium, implying that more resources are available for consumption. Conversely,
whenever the steady-state capital-labour ratio is higher ($k^* > k_G$), the rise in output that
would result from any further rise in $k^*$ is less than the required increase in savings,
implying that fewer resources become available for consumption.
Algebraically, the golden rule level of $k^*$ is given by:
$$k_G = \left(\frac{\beta A}{n+\delta}\right)^{\frac{1}{1-\beta}}. \qquad (2.21)$$
The value of s that turns $k_G$ into a steady state is called the “golden rule” saving
rate. Comparing (2.21) with the general solution for steady states (2.16), we conclude
that the golden rule saving rate is:
$$s_G = \beta \qquad (2.22)$$
That is, the golden-rule saving rate is equal to the share of capital in total
income.
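The result (2.22) can be confirmed by brute force. The sketch below evaluates steady state consumption, $(1-s)y^*$ with $y^*$ from (2.17), on a grid of saving rates and picks the maximizer. It also checks that the marginal product of capital at $k_G$ equals $n+\delta$, as required by (2.20). Parameter values are illustrative assumptions.

```python
def steady_state_consumption(s, A=1.0, beta=0.3, n=0.02, delta=0.05):
    """c* = (1 - s) * y*, with y* taken from equation (2.17)."""
    y_star = A ** (1.0 / (1.0 - beta)) * (s / (n + delta)) ** (beta / (1.0 - beta))
    return (1.0 - s) * y_star

grid = [i / 1000 for i in range(1, 1000)]             # saving rates 0.001 ... 0.999
s_gold = max(grid, key=steady_state_consumption)

beta, A, n, delta = 0.3, 1.0, 0.02, 0.05
k_gold = (beta * A / (n + delta)) ** (1.0 / (1.0 - beta))   # equation (2.21)
mpk_at_k_gold = beta * A * k_gold ** (beta - 1.0)            # slope of y = A k^beta

print(f"consumption-maximizing saving rate on the grid: {s_gold:.3f}")
print(f"marginal product of capital at k_G: {mpk_at_k_gold:.4f} (n + delta = {n + delta:.4f})")
```

With a capital exponent of 0.3, the grid search selects a saving rate of 0.3, in line with (2.22).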
[43] This problem was first investigated by the Nobel Laureate Edmund Phelps.
[44] Note that in the steady state $sy^* = (n+\delta)k^*$. The symbol “*”, which refers to the
steady state, is suppressed to simplify the algebra. An alternative avenue is to substitute
(2.17) into the maximization problem (2.19) and take the derivative with respect to s,
obtaining directly the saving rate that maximizes per capita consumption in the steady
state.
[45] Phelps (1961).
Using taxes and subsidies to achieve the golden rule
Now suppose you were a benevolent central planner wanting to maximize the
steady state level of per capita consumption of your citizens. How would you achieve
this objective?
One possibility would be to use taxes and subsidies. To illustrate this, assume
that you had the ability to impose a tax $\tau$ (a subsidy if negative) on production and
that tax proceeds were returned to households in the form of a lump-sum transfer, T (that
is, a transfer made after households decided the amount of consumption and savings; if
negative, this implies a confiscation of part of household consumption). The
government budget is assumed to be balanced, that is $T = \tau Y$. The income flow chart of
this economy is described in Figure 2.11.
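Figure 2.10: Illustration of the Golden Rule
[Figure: horizontal axis $k = K/N$, vertical axis $y = Y/N$; curves $y = Ak^{\beta}$, $s_G y$ and
$(n+\delta)k$; at $k_G$, output $y_G$ is split into savings $s_G y_G$ and consumption
$(1-s_G)y_G$.]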
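Figure 2.11: The flow income chart with government intervention
[Flow chart with boxes labelled Households, Government, C.Market and Firms. Flows:
after-tax income $(1-\tau)Y$; tax revenue $\tau Y$; transfer $T$; household savings
$s(1-\tau)Y$; consumption $C = (1-s)(1-\tau)Y + T$; investment $I = \dot{K} + \delta K$.]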
Assuming, as before, a constant saving rate s, total savings in this economy will
be given by:
$$S = s(1-\tau)Y = \dot{K} + \delta K.$$
To solve the model in the new version, just note that $s(1-\tau)$ shall be used
instead of s in the fundamental dynamic equation. Proceeding as before, the steady state
level of per capita income will now be given by:
$$y^* = A^{\frac{1}{1-\beta}} \left(\frac{s(1-\tau)}{n+\delta}\right)^{\frac{\beta}{1-\beta}}. \qquad (2.23)$$
The corresponding steady state level of per capita consumption is:
$$c^* = (1-s)(1-\tau)y^* + \frac{T}{N} = \left[1 - s(1-\tau)\right]y^* = \left[1 - s(1-\tau)\right] A^{\frac{1}{1-\beta}} \left(\frac{s(1-\tau)}{n+\delta}\right)^{\frac{\beta}{1-\beta}}. \qquad (2.24)$$
Maximizing (2.24) with respect to $\tau$, one obtains an intuitive result:
$$\tau = 1 - \frac{\beta}{s}. \qquad (2.25)$$
According to (2.25), the golden rule tax rate on output depends on the gap
between the actual saving rate s and the golden rule saving rate $\beta$: the optimal tax will
be negative (subsidy) if the saving rate falls below the golden rule; it will be positive
(tax) if the saving rate is higher than the golden rule; and it will be zero if the saving
rate satisfies exactly the golden rule.
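Formula (2.25) can be verified numerically for different saving rates. The sketch below maximizes the steady state consumption level (2.24) over a grid of tax rates and compares the result with $1 - \beta/s$. The grid bounds and the parameter values are illustrative assumptions.

```python
def consumption_with_tax(tau, s, A=1.0, beta=0.3, n=0.02, delta=0.05):
    """Steady-state per capita consumption, equation (2.24)."""
    effective_s = s * (1.0 - tau)          # share of output that ends up saved
    y_star = A ** (1.0 / (1.0 - beta)) * (effective_s / (n + delta)) ** (beta / (1.0 - beta))
    return (1.0 - effective_s) * y_star

def best_tax(s, beta=0.3):
    """Grid search over tau in [-1.00, 0.99] for the rate that maximizes (2.24)."""
    grid = [i / 100 for i in range(-100, 100)]
    return max(grid, key=lambda tau: consumption_with_tax(tau, s, beta=beta))

for s in (0.2, 0.3, 0.4):
    print(f"s = {s:.1f}: grid optimum tau = {best_tax(s):+.2f}, "
          f"formula (2.25) gives {1.0 - 0.3 / s:+.2f}")
```

With a capital exponent of 0.3, a saving rate of 20% calls for a subsidy, a saving rate of 40% calls for a tax, and a saving rate of 30% calls for neither.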
Dynamic inefficiency
The “golden rule” saving rate is the one that delivers the highest level of per
capita consumption in the steady state. This is not to say that the society will always be
better off approaching the golden rule.
To see this, consider the case in which the economy starts out in a steady state
to the left of the golden rule (that is, initially $k^* < k_G$). In this case, reaching the golden
rule will require the saving rate to rise. In other words, agents will have to sacrifice
consumption today to enjoy more consumption in the future.
This case raises an important policy question: if the decentralized economy
delivers a saving rate that is lower than the golden rule, should a benevolent planner
intervene, forcing the economy’s saving rate to increase (for instance, through
subsidies, as illustrated in equation 2.25)?
As a general principle, as long as saving rates are decided by optimizing agents,
altering their choices will make them worse off. So, unless there are good reasons to
believe that some kind of impediment prevents consumers from optimally deciding their
saving rates, or that some kind of market failure makes individual decisions socially
unacceptable, there will be no case for intervention [46].
A different case occurs when the initial steady state lies beyond the golden rule
(that is, if initially $k^* > k_G$). In that case, by reducing savings today, consumption
would increase both today and tomorrow. Since a “free lunch” is readily available, this
case is labelled “dynamically inefficient” [47]. By contrast, the case in which the saving
rate is lower than the golden rule saving rate is "dynamically efficient", because no
“free lunch” is readily available. Under dynamic inefficiency, there would be a gain for
the society as a whole if a central planner induced the current generation to save less.
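The difference between the two cases can be seen by simulating the consumption path after the saving rate is moved to the golden rule. In the sketch below, one economy starts with a saving rate above the golden rule and the other with a saving rate below it. Parameter values and the Euler discretisation are illustrative assumptions.

```python
def consumption_path(s_old, s_new, years=300, A=1.0, beta=0.3, n=0.02,
                     delta=0.05, substeps=10):
    """Consumption in the old steady state and year by year after switching to s_new."""
    k = (s_old * A / (n + delta)) ** (1.0 / (1.0 - beta))   # old steady state (2.16)
    c_old = (1.0 - s_old) * A * k ** beta
    dt = 1.0 / substeps
    path = [(1.0 - s_new) * A * k ** beta]                   # consumption on impact
    for _ in range(years):
        for _ in range(substeps):
            k += dt * (s_new * A * k ** beta - (n + delta) * k)
        path.append((1.0 - s_new) * A * k ** beta)
    return c_old, path

beta = 0.3   # golden rule saving rate under these assumptions, equation (2.22)
c_old_high, path_high = consumption_path(s_old=0.5, s_new=beta)   # saving too much
c_old_low, path_low = consumption_path(s_old=0.1, s_new=beta)     # saving too little

print(f"s = 0.5 -> 0.3: old c = {c_old_high:.3f}, lowest c afterwards = {min(path_high):.3f}")
print(f"s = 0.1 -> 0.3: old c = {c_old_low:.3f}, c on impact = {path_low[0]:.3f}, "
      f"c in the long run = {path_low[-1]:.3f}")
```

When the economy starts out saving too much, consumption is higher at every date after the change. When it starts out saving too little, consumption first falls and only later exceeds its initial level, so there is a genuine trade-off between present and future.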
A case of dynamic inefficiency looks at odds with the principle that agents are
optimizers: if households were saving too much, they should realize that, by reducing the
saving rate today, they would be increasing their consumption both today and
tomorrow. However, at least theoretically, it is possible to think of cases in which
individuals end up saving more than they desire: forced savings occur, for instance,
when individuals have income available to spend but no goods to buy (some authors
contend that this was the case with the rationing policies of the former Soviet Union).
Another possibility is that individuals optimally decide on a saving rate that proves
excessive from the social point of view: for instance, individuals saving for
retirement may opt to accumulate too much capital (e.g. yielding very low returns),
simply because this is the only way of transferring resources to the future, while
society would be better off if the current generation consumed more today and the
future generation transferred some of the implied gain to the current generation in the
future [48]. Thus, at least theoretically, it is possible to find examples in which inducing
the current generation to save less would constitute a Pareto improvement.
In the real world, we observe that the share of capital in national income varies
between 0.3 and 0.4. According to the model formulation, this corresponds to the
contribution of capital to output, $\beta$. Since real world saving rates are, in general, lower
than 30%, one may conclude that “dynamic inefficiency” is not at all a general case.
[46] A tricky question arises, in that private choices influence the inter-generational distribution of income:
in a world where individuals have finite lives, the impatience of the current generation (reflected in low
saving rates) may be seen as a kind of selfish behaviour, which comes at the cost of future generations. In
principle, there is nothing wrong with the fact that individuals are impatient: if individuals are willing to
pay a cost in terms of future consumption to consume more today, they are within their rights. Still, a
planner could see reasons to force the current generation to save more, so as to make future generations
better off. Such a policy would be equivalent to a transfer between generations, a balance between
conflicting interests about which economic theory has little to say. What we know for sure is that such an
intervention would not be a Pareto improvement.
[47] Phelps (1965). Formally, a capital path is said to be dynamically inefficient if the path of savings can be
changed so as to strictly increase consumption at some point in time without lowering it at any point in
time.
[48] The “overlapping generations model” accounts for the fact that people have finite lives and a
productive phase during which they save for the retirement phase. In that model, it is theoretically
possible for the competitive equilibrium to be dynamically inefficient, with too much capital accumulation.
The reason is that capital is used to transfer income to the future, so individuals will save even at very low
interest rates. In this case, a central planner could improve the welfare of both current and future
generations with an appropriate transfer policy. For a discussion, see Romer (2001), pp 85-86.
2.6. The model with endogenous savings
An optimal consumption rule
So far, the saving rate has been assumed exogenous. The neoclassical model can,
however, be extended to account for the case in which individuals optimally decide their
saving rates [49].
It is not in the scope of this book to solve complicated dynamic optimization
models. So, in the following, and throughout the book, we will refer to a very simple
particular case, in which the optimal consumption rule is given by:
$$\gamma_t = r_t - \rho \qquad (2.26)$$
where $\gamma_t = \dot{c}/c$ denotes the growth rate of per capita consumption, and $\rho$ is the rate
of time preference (that is, the rate at which individuals are willing to trade one unit of
utility today for one unit of utility in the future).
According to (2.26), as long as the interest rate is higher than the rate of time
preference, individuals will optimally decide to increase consumption over time. If
however the interest rate falls below the rate of time preference, individuals will
optimally reduce consumption over time. When $r_t = \rho$, the optimal level of
consumption will be constant.
Formally, the consumption rule (2.26) can be obtained assuming that individuals
in the economy are all alike and infinitely lived, that their instantaneous utility function
is logarithmic, and that they all have full access to a frictionless financial market,
whereby they can borrow or lend any amount of income at a given interest rate r (in
Appendix 2.1, we illustrate this in a simple 2-period framework) [50].
What happens when the rate of time preference decreases?
To see the implication of replacing an exogenous saving rate by (2.26) in the
neoclassical growth model, remember that in the competitive equilibrium the interest
rate is determined by the marginal product of capital (equation 2.12). The latter, in turn,
is a negative function of the capital-labour ratio (as implied by the Law of Diminishing
Returns).
[49] The problem of how much an individual should save was first addressed in an inter-temporal
optimizing framework by a mathematician from Cambridge, UK, called Frank Ramsey (Ramsey, 1928).
Ramsey died at the age of 26 and his seminal contribution remained obscure to the economics
profession for a long time, because at that time most economists were not familiar with dynamic
optimization. His work was rediscovered four decades later by Cass (1965) and the Nobel Laureate
Tjalling Koopmans (1965), who used it to characterize the optimal saving paths in the context of the
neoclassical growth model.
[50] Acknowledging these assumptions is very important, for qualification purposes. For instance, we all
know that financial markets are far from perfect, especially in poor countries. Households without
collateral, in particular, will find it difficult to borrow from the banking system against future incomes.
And whenever consumers face borrowing constraints, they are fated to consume at most their current
income, no matter how impatient they are. This means that a consumption function depending only on
current income, as assumed in the basic Solow model, may be quite appropriate in many circumstances.
Referring to Figure 2.12, suppose that initially the rate of time preference was
equal to the real interest rate ($r_0 = \rho_0$), so that per capita consumption was constant over
time (in the Solow model, we know that a constant level of per capita consumption
holds in a steady-state; so you may interpret this initial situation as corresponding to
point 0 in Figure 2.7). In Figure 2.12, this initial situation is described by point A, with
the steady state capital labour ratio being equal to $k_0^*$.
Now suppose that the rate of time preference falls to $\rho_1$. This means that
individuals will demand a lower return to postpone consumption. Hence, at an
unchanged interest rate, savings will increase and the consumption level will fall
instantaneously at the time of the shock.
Since more savings translate into more investment, the implication is that the
capital labour ratio starts increasing, inducing a temporary growth of per capita income
and of consumption51. Then, the economy will move slowly from A to C. As the stock
of capital per worker increases, the marginal product of capital declines, and so does
the interest rate. Once the interest rate becomes equal to the rate of time
preference again, desired consumption becomes constant over time (eq. 2.26) and
the process of capital accumulation stops (point C). From C, any further investment in
physical capital would bring a return that is lower than the new rate of time preference,
so the individual consumer will prefer not to save. In the new steady state, both
consumption and per capita income are higher than in the old steady state, but they will
be constant again (just like in point 1 of Figure 2.7).
Figure 2.12. Transition dynamics following a fall in the rate of time preference
[Figure not reproduced. Horizontal axis: k = K/N, with the steady states $k_0^*$ and $k_1^*$ marked; points A, B and C as referred to in the text.]
51
In terms of equation (2.26), because the rate of time preference falls below the interest rate (which is
determined by the capital stock), the growth rate of per capita consumption jumps initially to
$\gamma_1 = r_0 - \rho_1$ (distance AB), declining slowly to zero afterwards.
This example reveals why the neoclassical model cannot generate sustained
growth of per capita income, even when savings result from unrestricted optimization
decisions: as the stock of capital per worker increases, its marginal product declines and
so does the interest rate. Once the interest rate equals the discount rate,
desired consumption becomes constant over time (eq. 2.26) and the process of capital
accumulation stops52.
The modified golden rule
A question that arises is how the endogenous saving rate, determined in a
competitive equilibrium where agents face no borrowing constraints (as implied by
2.26), compares to the golden rule saving rate.
Before addressing this question, it is important to note that in the model with
endogenous savings, the saving rate is not in general constant along the transition to the
steady state (that is, the transition from B to C in Figure 2.12 is not exactly the same as
the transition from 0 to 1 in Figure 2.7). When the economy reaches the steady state,
however, both the per capita consumption and the per capita income become constant
over time, so the saving rate will become constant as well. It is the value of the
endogenous saving rate in the steady state that we want to compare with the golden rule
of the Solow model.
To obtain the steady state saving rate in the model with endogenous savings, let us
first substitute the market interest rate (2.12) in (2.26), to obtain the growth rate of per
capita consumption at each moment in time (including during the transition dynamics):
$$\gamma_t = \beta \frac{Y_t}{K_t} - \delta - \rho$$
In the steady state the growth rate of consumption is zero and the capital output
ratio is given by (2.18). Imposing these conditions in the equation above, and solving
for the saving rate, one obtains:
$$s = \beta \left[ \frac{n + \delta}{\rho + \delta} \right] \tag{2.27}$$
The saving rate (2.27) is often labelled the “modified golden rule”. It can be
shown that the term inside brackets is less than one, so this saving rate is in general
lower than the “golden rule” saving rate, $s = \beta$ (2.22)53. Intuitively, the impatience
reflected in the rate of time preference means that an infinitely lived consumer will, in
general, prefer a steady state consumption level that is lower than the maximum
possible.
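As a rough numerical illustration of the comparison above, the sketch below evaluates the modified golden rule against the golden rule. It assumes the symbols β (capital share), n (population growth), δ (depreciation) and ρ (rate of time preference); the parameter values are hypothetical and chosen only for illustration.

```python
def golden_rule_saving(beta: float) -> float:
    """Golden-rule saving rate of the basic Solow model: s = beta."""
    return beta


def modified_golden_rule_saving(beta: float, n: float, delta: float, rho: float) -> float:
    """Steady-state saving rate with endogenous savings: s = beta * (n + delta) / (rho + delta)."""
    if rho <= n:
        # The comparison in the text relies on the condition rho > n.
        raise ValueError("this sketch assumes rho > n")
    return beta * (n + delta) / (rho + delta)


# Hypothetical parameter values (not taken from the text).
beta, n, delta, rho = 1 / 3, 0.01, 0.05, 0.04

s_gold = golden_rule_saving(beta)
s_mod = modified_golden_rule_saving(beta, n, delta, rho)

print(f"golden rule saving rate:          {s_gold:.4f}")
print(f"modified golden rule saving rate: {s_mod:.4f}")
```

With these illustrative numbers the term in brackets is 0.06/0.09 = 2/3, so the modified golden rule saving rate is two thirds of the golden rule one.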
52
Note the similarity with the Malthus model: instead of a model where population expands whenever
labour productivity is higher than a subsistence wage, you now have a capital stock that expands
whenever its productivity is higher than the rate of time preference. In both cases, the growth process
stops because of diminishing returns.
53
Technically, the saving rate in the “modified golden-rule” is lower than the golden-rule saving rate
because $\rho > n$. The reader is not supposed to guess this. Intuitively, the condition is imposed to prevent
consumers from choosing an infinite consumption level financed with an explosive debt (demanding
students are invited to read a technical discussion in Romer 1996, p. 40).
2.7. The Solow Residual
Before leaving this chapter, let’s introduce another seminal contribution of
Robert Solow, which came out only one year after his famous model was published.
This contribution is basically a technique to estimate the (non-observable) rate of
technological progress using observable data.
The technique is labelled “growth accounting”, and starts from a production
function like (2.1). Log-differentiating (2.1), we obtain:
$$\frac{\dot{Y}}{Y} = \frac{\dot{A}}{A} + \beta \frac{\dot{K}}{K} + (1-\beta) \frac{\dot{N}}{N} = \hat{A} + \beta \hat{K} + (1-\beta) n \tag{2.28}$$
This equation states that the growth rate of output equals the growth rate of “A”
(TFP) plus a weighted average of the growth rates of physical capital and labour, where
the weights are the corresponding elasticities in the production function.
If factor markets are competitive, as assumed by the model, then these
elasticities can be measured directly by the factor income shares calculated from
national income accounts (remember 2.11 and 2.12). In the real world, it has been
observed that the share of capital in national income ranges from 0.3 to 0.4.
Solow proposed estimating TFP as a residual, i.e. as the difference between the
actual growth of output and the growth implied by factor accumulation (hence the label
Solow residual):
$$\hat{A} = \hat{Y} - \beta \hat{K} - (1-\beta) n \tag{2.29}$$
By definition, the Solow residual measures the part of actual output growth that
is not accounted for by factor accumulation. In the intensive form, the Solow residual is
obtained as:
$$\hat{A} = \hat{y} - \beta \hat{k}, \tag{2.30}$$
with $\hat{y} = \hat{Y} - n$ and $\hat{k} = \hat{K} - n$.
As an example, the growth rate of GDP in the US during the first half of the
twentieth century was roughly 3% per annum, on average. Its capital stock also
expanded at about 3% per annum in that same period, whereas its labour input (hours
worked) expanded at only about 1% per annum. Assuming a capital share in national
income of one third and a labour share of two thirds, the implied Solow residual is:
$$\hat{A} = 3\% - \frac{1}{3} \times 3\% - \frac{2}{3} \times 1\% \approx 1.3\%.$$
That is, labour and capital together accounted for about 1.7 percentage points of
the total GDP growth of 3 percent per annum. The residual balance of 1.3 percentage points per annum is
accounted for by “technological change”.
Using the intensive form (2.30), the conclusion is even more startling: 2/3 of the
change in per capita income is accounted for by TFP:
$$\hat{A} = 2\% - \frac{1}{3} \times 2\% \approx 1.3\%.$$
This evidence points to a fundamental limitation of the basic Solow model: by
assuming that A is constant, this model ignores a critical ingredient of economic
growth: technological progress54.
Note that, computed in this way, the TFP term captures much more than
“technological progress” in a narrow sense (that is, changes in the “efficiency” with which
the existing technology and inputs are combined): it also captures unmeasured changes
in the quality of inputs (skill increments in the labour force, quality of land, climate) and
aggregation errors. In any case, the accounting exercise points to the unrealism of
assuming that the state of technology is constant over time.
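The growth accounting arithmetic of this section can be reproduced with a few lines of code. The sketch below is only illustrative: the function names are hypothetical, and the inputs are the rounded US figures quoted in the text (output and capital growing at 3%, labour at 1%, capital share of one third).

```python
def solow_residual(g_Y: float, g_K: float, g_N: float, capital_share: float) -> float:
    """Residual in levels: output growth minus share-weighted input growth."""
    return g_Y - capital_share * g_K - (1 - capital_share) * g_N


def solow_residual_intensive(g_y: float, g_k: float, capital_share: float) -> float:
    """Residual in intensive form: per capita output growth minus the capital-deepening term."""
    return g_y - capital_share * g_k


# Rounded figures quoted in the text for the US, first half of the 20th century.
g_Y, g_K, g_N = 0.03, 0.03, 0.01
capital_share = 1 / 3

a_hat = solow_residual(g_Y, g_K, g_N, capital_share)
a_hat_intensive = solow_residual_intensive(g_Y - g_N, g_K - g_N, capital_share)

factor_contribution = g_Y - a_hat            # part of growth explained by inputs
tfp_share_per_capita = a_hat_intensive / (g_Y - g_N)

print(f"Solow residual (levels):         {a_hat:.4f}")
print(f"Solow residual (intensive form): {a_hat_intensive:.4f}")
print(f"Contribution of inputs:          {factor_contribution:.4f}")
print(f"TFP share of per capita growth:  {tfp_share_per_capita:.2f}")
```

Both forms return the same residual, as they should, since the intensive form is just the level equation with n subtracted from both output and capital growth.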
2.8. Discussion
The basic Solow model rightly accounts for the role of capital in production and
stresses the key role of saving in generating the resources that are necessary to invest in
new capital. It also provides a sensible story about why historical ratios of capital to
output and real interest rates appear to be relatively stable in the long run. Finally, it
offers the credible suggestion that countries with high savings rates and low population
growth rates should expect higher levels of per capita income in the long run than
countries with low saving rates and rapid population expansion.
In its current form the model fails, however, to explain the most basic fact of
modern economic growth, that per capita income tends to increase over time: in the
Solow model, any growth in per capita income has a merely transitory nature, reflecting
the adjustment of the economy from one steady state to the other. The reason is
diminishing returns on physical capital.
Of course, continuous growth of per capita income would be obtained in the
context of the Solow model if saving rates rose continuously over time. If, however,
sustained growth of per capita income in the real world were really accounted for by
successive rises in the saving rate, interest rates should exhibit declining trends. Since
this is not a real world fact, the conclusion is that the sustained growth of per capita
incomes we observe in the real world is not accounted for by successive rises in the
saving rate.
The key to overcoming this limitation of the basic Solow model follows from our
discussion around Figure 2.8: if we allow the level of technology to expand over time,
then capital per worker (and per capita output) will increase over time while the capital-
output ratio (and the interest rate) remain unchanged. In that case, the model will be
consistent with all the stylized facts reported by Kaldor. This extension of the model
will be addressed in the next chapter.
54
The figures above are from the World Bank, World Development Report (1991). In his original paper,
Solow (1957) found that only one eighth of the growth rate of output per hour worked in the U.S.
economy over 1909-1949 could be attributed to the increase in capital intensity, k=K/N. The remaining
seven-eighths were attributed to “technical change”. Other classical papers quantifying the sources of
growth using growth accounting include Denison (1962, 1967) and Maddison (1982, 1991).
Appendix 2.1: The optimal consumption path in a simple 2-period model
Consider an individual who lives only two periods and whose life-time utility
function is given by
$$U = u(c_1) + \frac{u(c_2)}{1+\rho},$$
where $u(c_t)$ is a concave function, $c_t$ is real consumption in period t=1,2 and $\rho$ is a
given rate of time preference.
Assume that this individual has full access to financial markets, so he can
borrow or lend any amount of income at the interest rate r. His problem is to maximize
the lifetime utility function, subject to $c_1 + c_2/(1+r) = \Omega$, where $\Omega$ denotes lifetime
wealth.
From the first order conditions of the maximization problem one obtains the so-called
Euler equation:
$$u'(c_2) = u'(c_1) \frac{1+\rho}{1+r}.$$
This equation states that the marginal utility of consumption in the next period
must be equal to the marginal utility of consumption in the current period, weighted by
the ratio of the subjective discount factor, $1+\rho$, to the market discount factor, $1+r$. In other words, this
rule implies that the consumption level each period must be such that an extra unit of
consumption would make the same contribution to lifetime utility no matter to which
period it is allocated.
In the main text, we stick with the convenient assumption of logarithmic
preferences, that is $u(c_t) = \ln c_t$. In this very simple case, the Euler equation simplifies
to:
$$\frac{c_2}{c_1} = \frac{1+r}{1+\rho}.$$
Denoting by $\gamma$ the growth rate of per capita consumption, the latter expression
becomes equal to:
$$1 + \gamma = \frac{1+r}{1+\rho},$$
which by approximation gives:
$$\gamma \approx r - \rho.$$
This is the optimal consumption rule we will use throughout the book, whenever
the saving rate is not assumed exogenous.
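To see the rule at work, the sketch below solves the two-period problem with logarithmic utility in closed form and compares the exact consumption growth rate with the approximation r − ρ. The closed-form plan follows from combining the Euler equation with the budget constraint; the function names and the numerical values are hypothetical.

```python
import math


def two_period_log_plan(wealth: float, r: float, rho: float) -> tuple[float, float]:
    """Optimal (c1, c2) with log utility.

    Combines c2/c1 = (1 + r)/(1 + rho) with c1 + c2/(1 + r) = wealth.
    """
    c1 = wealth * (1 + rho) / (2 + rho)
    c2 = wealth * (1 + r) / (2 + rho)
    return c1, c2


def lifetime_utility(c1: float, c2: float, rho: float) -> float:
    """U = ln(c1) + ln(c2) / (1 + rho)."""
    return math.log(c1) + math.log(c2) / (1 + rho)


# Hypothetical values, for illustration only.
wealth, r, rho = 100.0, 0.05, 0.02

c1, c2 = two_period_log_plan(wealth, r, rho)
gamma_exact = c2 / c1 - 1          # exact growth rate of consumption
gamma_approx = r - rho             # approximation used in the main text

print(f"c1 = {c1:.3f}, c2 = {c2:.3f}")
print(f"exact growth rate:  {gamma_exact:.5f}")
print(f"approximation r-rho: {gamma_approx:.5f}")
```

The approximation is close when r and ρ are small, which is the case of interest when periods are interpreted as years.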
• The Solow model explores the assumption of Constant Returns to Scale to
overcome the limitations imposed by diminishing returns. The model
combines the neoclassical assumptions of perfect competition and
flexible prices with a Keynesian consumption function, whereby
consumption is a linear function of current income.
• The properties of the model are such that per capita income and capital
per worker converge to a steady state, where both are constant.
• The model predicts that economies with higher saving rates or with
lower population growth rates should enjoy a higher level of per capita
income in the steady state than economies with low saving rates or with
fast population expansions.
• The model basically accords with the real world facts that the capital-output
ratio, the interest rate and the shares of labour and capital in
income are roughly constant over time, but it fails to deliver the most
basic of the stylized facts of economic growth, namely that output per
capita (and real wages) tends to grow over time: in this model, any
growth in per capita income is merely temporary, reflecting the
adjustment in the economy from one steady state to the other.
• In this model, an exogenous improvement in technology leads to a higher
level of per capita income without causing any decline in consumption or
in the interest rate. This suggests an avenue to overcome the main
limitations of the basic model.
• In the context of the Solow model, there is a saving rate that maximizes
the level of per capita consumption in the steady state. This is not to say
that a change in the saving rate towards that level represents a net gain
for the society as a whole: this will only be the case if the initial saving
rate was higher than the “golden-rule” saving rate. When, alternatively,
the saving rate is lower than the golden rule saving rate, raising it will
not be, in general, a Pareto improvement.
• Making saving endogenous does not rescue the model from its main
limitation, that the long run growth rate of per capita output is zero.
• Measuring a country’s productivity change by the difference between
output growth and the “contribution” of inputs to this growth is called
“growth accounting”. In general, growth accounting exercises reveal that
technology expands over time. This evidence points to the need to
enrich the model so as to allow technology to expand over time.
Key concepts
• Gross investment vs. net investment
• Break even investment
• The Kaldor facts
• Poverty trap
• Dynamic inefficiency
• The modified golden rule
• The Solow residual
Essay questions:
a) Explain how the steady state in the Solow model relates to the CRS
property
b) To what extent is the basic Solow model capable of describing the real
world facts?
c) Why can’t the Solow model generate a sustained growth of per capita
income?
d) Is the Golden Rule saving rate an optimal saving rate?
Exercises
2.1.
Consider an economy where the aggregate production function Y=AF(K,N) exhibits
Constant Returns to Scale, positive and decreasing marginal productivity and unit
elasticity of substitution between factors. Assuming that the saving rate, the
population growth rate, technology and the rate of capital depreciation are all
constant and exogenous:
a- Describe in a graph the steady state of this economy. Is it a stationary steady
state? Why?
b- Describe in a graph the effects of the following changes on the long run level of
per capita output:
i. An increase in the population growth rate;
ii. An earthquake that destroys part of the capital stock.
c- Describe the effects of a rise in the saving rate in the time paths of the following
variables:
iii. Capital per worker;
iv. Per capita income;
v. Per capita consumption.
d- Describe the effects of a rise in the level of technology on the time paths of the
following variables:
vi. Per capita output;
vii. Capital per worker;
viii. Interest rate.
e- In light of the Solow model, is there a tendency for per capita output levels in
different countries to approach each other in the long term? Why?
2.2.
Consider an economy where the production function is given by: $Y_t = 20 K_t^{1/3} N_t^{2/3}$,
where $N_t$ is the number of workers in period t. In this economy, 25% of income is saved,
the labour force grows at 2.5% and capital depreciates at 2.5%. We also know that in
this economy there is perfect competition, and wages and prices are fully flexible.
a) Compute the steady state values of capital and output per worker.
Represent in a graph and describe the stability of the equilibrium.
b) Suppose that this economy was affected by a hurricane, which reduced its
capital stock. Discuss the subsequent dynamic adjustment of this
economy with the help of a graph.
2.3.
Consider an economy where the production function is given by $Y = AK^{0.5}N^{0.5}$.
a) Write down the main assumptions of the Solow model and find out the
expression that describes the dynamics of the capital-labour ratio.
b) Assume that: A = 1, the saving rate is 20%, the capital depreciation rate
is 8% and that the population grows at 2% per year. Find out the steady
state levels of: output per capita, capital per worker, real wages and
interest rate.
c) What is the growth rate of output in the steady state?
d) Suppose that the productivity level increased to A = 2. Describe the
impact on per capita output, wages and the real interest rate.
2.4.
Consider two economies, A and B, sharing the same technology, given by
$Y = K^{0.5} N^{0.5}$. Assume that the saving rates in A and B are, respectively, 10% and 20%
and that the sum $n+\delta$ is equal to 10% in both countries.
a) Suppose that initially the capital-labour ratio was equal to 2 in both
countries. What will be the corresponding initial levels of per capita
consumption and per capita income?
b) Starting from the position described in a), compare the evolution of per
capita income in both economies as time goes by. Discuss.
2.5.
Consider an economy where the labour income share is 75%. What would be the Solow
residual, if both output and capital were growing at 3% per year and the labour force
was expanding at 1.5%?
2.6.
Consider an economy where the production function is given by $Y_t = 0.2 K_t^{1/3} N_t^{2/3}$. In
that economy, 25% of income is saved, capital depreciation is 5% and population is
constant and equal to 1000 inhabitants.
a) Find out the steady state values of per capita income, per capita
consumption, real wages and the interest rate.
b) Find out the saving rate that would maximize C/N in steady state, where
C is consumption. Illustrate with the help of a graph the adjustment
dynamics of Y/N and C/N admitting that the saving rate actually
changed to that level.
c) Suppose you were a benevolent planner who could coerce firms to pay a
tax on production, $\tau$, and transfer the proceeds to consumers without
distorting the saving-consumption decisions. What would be the level of
$\tau$ if you wanted to maximize the steady state level of per capita
consumption?
3 Exogenous Growth
“The Solow model did not assume that technical progress was exogenous—that
is, determined outside the model. Rather, the model made the assumptions necessary to
produce a model of an economy with a dynamic equilibrium, a path to which, in the
long run, the economy would settle down. The implication of those assumptions was
that technical progress had to be exogenous to the model”. [Lant Pritchett]
Learning Goals:
• Solve the Solow model with exogenous technological change
• Acknowledge the extent to which the modified model helps explain real
world facts
• Explain the model implications regarding convergence
• Discuss the implications for growth accounting of using alternative
definitions of technological progress
3.1. Introduction
As shown in Chapter 2, the basic Solow model does not account for the most
basic stylized fact of Modern Economic Growth: that output per capita tends to grow
over time. This limitation was noted by Robert Solow himself in his original article, where
he also provided a brief indication of how technological progress could be incorporated
into the model.
This chapter shows how the Solow model can be adapted to account for the
possibility of technological progress. As we will see, this modification rescues the
model from its main limitation and renders it capable of describing most stylized facts
of economic growth.
The Chapter is organized as follows: in Section 3.2, we explain why the Solow
model cannot account for endogenous technological progress. Section 3.3 presents the
extended version of the Solow model, assuming exogenous technological progress.
Section 3.4 discusses how the main variables of the model adjust to a change in an
exogenous parameter. In Section 3.5, we show how this extended version is helpful to
understand the main facts of modern economic growth. Section 3.6 discusses the
implications for growth accounting. Section 3.7 concludes.
3.2 Perfect technological diffusion
Technology is different from most other goods, in that it is composed of ideas,
rather than of objects. One implication of technology’s non-physical nature is that its
use is nonrival: it can be used by more than one person at the same time without losing
its effectiveness. This is in sharp contrast to physical capital as we have so far been
defining it: if someone is using a machine, no one else can use that equipment at the
same time. In other words, equipment is “rival” in its use. This is not true for ideas and
knowledge, even if they do come packaged up in bits of capital equipment: the fact that
a given company uses some software to manage its operations does not preclude other
firms from using the same software55.
Technology may vary, however, in its degree of excludability. Excludability is
the degree to which an owner of something can prevent others from using it without
consent. Because of its nature, knowledge is often non-excludable: it is difficult to
prevent an agent from using a good idea, once he or she becomes aware of it. Still, there
are ways of preventing others from using particular pieces of knowledge: for instance,
trade secrets, patents and copyrights are mechanisms through which agents try to keep
competitors away from their inventions.
In what follows, we will stick with the assumption of perfect technological
diffusion: that is, once new technology becomes available, it becomes equally available
to all agents at the same time. The reason for doing so is that we are dealing with the
Solow model, which assumes perfect competition. Under perfect competition,
information is completely available, so technological secrets are ruled out56.
The implication of assuming perfect technological diffusion is that technology
becomes a pure public good: no user will be willing to pay for it and no self-interested
agent will engage in a deliberate effort to produce it. In terms of our model, this
implication is rather convenient: we don’t need to worry about returns to innovation or to
model the research activity. The other side of the coin is that technological progress can
only enter the model exogenously. Since there are no profit opportunities in
technology creation, we are bound to assume that all technological progress takes
place for non-economic reasons (such as unintended discoveries that come out through
the passage of time or by chance).
55
On the nonrivalrous nature of knowledge, Jones (2005) recalls a famous quote from Thomas
Jefferson, in a letter to Isaac McPherson, in 1813: “Its peculiar character…is that no one possesses the
less, because every other possesses the whole of it. He who receives an idea from me, receives instruction
himself without lessening mine; as he who lights his taper at mine, receives light without darkening me”.
56
The reader may argue that even ideas that are available in books and academic journals do not spill
over at zero cost. Reading books, for example, requires appropriate skills and is time-consuming. At this
stage, however, we abstract from such complications. Later we will enrich the model so as to incorporate
the possibility of imperfect technological diffusion.
3.3. The extended Solow model
Labour augmenting technological progress
In the model of Chapter 2, the state of technology, A, was assumed constant over
time (cf. equation 2.2). In this section, it is assumed instead that technology improves
over time at the constant rate, g:
$$A_t = A e^{gt} \tag{3.1}$$
Technological progress specified in such a way is labelled “Hicks Neutral”57.
The constant term A may be interpreted as capturing the influence of factors that
affect the level of productivity on a “once and for all” basis. For instance, a country’s
climate may influence the overall relationship between inputs and output, for each level
of technology. We will label this component as measuring “efficiency” in resource use.
The second component – which grows over time – captures the role of
technological progress. The rate of growth of the second term, g, corresponds to the
concept of “Solow residual” introduced in Section 2.7.
The remaining assumptions of the model are the same as in the basic Solow
model. For your convenience, we reproduce the main equations here:
$$Y_{it} = A_t K_{it}^{\beta} N_{it}^{1-\beta} \tag{2.3}$$
$$sY_t = I_t \tag{2.7}$$
$$\dot{K}_t = I_t - \delta K_t \tag{2.8}$$
$$n = \dot{N}_t / N_t \tag{2.9}$$
Labour in efficiency units
Since technological progress (3.1) causes the production function (2.3) to shift
upwards continuously over time, solving the model in this new version is not as
straightforward as it was in the basic formulation. But with the help of a small trick, we
will see that the complication is not that great. The trick is to rewrite the model in
terms of a new variable, L, which we will label as “labour in efficiency units”.
Substituting (3.1) in (2.3) and aggregating across firms, we can rewrite the
aggregate production function in the following convenient form:
$$Y_t = A K_t^{\beta} L_t^{1-\beta}, \tag{3.2}$$
where:
$$L_t = N_t \lambda_t \tag{3.3}$$
57
Note that, with such specification, technological progress has the effect of “renumbering” the isoquants
– with each isoquant corresponding to a higher level of output than before – but it does not alter the
“shape” of the isoquants: for each relative factor price, the optimal proportion in which inputs are used
remains unchanged. Technically, technological progress is said to be “Hicks Neutral” if the Marginal
Rate of Technical Substitution remains unchanged, for each given capital-labour ratio.
$$\lambda_t = e^{\gamma t}, \quad \text{with } \gamma = \frac{g}{1-\beta}. \tag{3.4}$$
In (3.2), the term L measures labour in “efficiency” units (i.e., the number of
workers adjusted for their time-varying efficiency level). The term $\lambda$ refers to the
“effective labour input per worker”.
Under the assumptions above, the “effective labour input per worker” grows at
an exogenous rate, $\gamma$. That is, as time goes by, the typical worker becomes more
efficient because new skills/abilities are costlessly bestowed upon him at the rate $\gamma$. The
rate $\gamma$ is labelled the “Harrod neutral” or “Labour Augmenting rate of technological
progress”. It is called Labour Augmenting because, analytically, it produces the same
effect in production as an increase in raw labour, N58.
Note that, with the transformation above, the production function (3.2) gets a
form similar to that of (2.1): the main difference is that we replaced N by L. This
similarity is not just a coincidence: actually, this was the trick we needed to return to the
“previous problem”, which we already know how to solve (see Box 3.2).
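For readers who want to check the transformation step by step, the substitution of (3.1) in (2.3) can be written out as follows. The notation is an assumption of this sketch: β stands for the capital share, g for the Hicks-neutral rate and γ for the labour-augmenting rate of technological progress.

```latex
\begin{align*}
Y_t &= A e^{gt} K_t^{\beta} N_t^{1-\beta} \\
    &= A K_t^{\beta} \left( e^{\frac{g}{1-\beta} t} N_t \right)^{1-\beta} \\
    &= A K_t^{\beta} \left( \lambda_t N_t \right)^{1-\beta}
     = A K_t^{\beta} L_t^{1-\beta},
\qquad \lambda_t = e^{\gamma t}, \quad \gamma = \frac{g}{1-\beta}.
\end{align*}
```

The second line simply moves the exponential term inside the bracket raised to the power 1 − β, which is why the equivalent labour-augmenting rate is g divided by the labour share.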
The Fundamental Dynamic Equation revisited
To solve the model, we make use of a new variable, $\tilde{y} = Y/L$, which will be
labelled as output “per unit of efficiency”. Remember, however, that our variable of
interest (the one that measures economic progress) is per capita income, y=Y/N.
Using (3.3), the relationship between the two variables is:
$$y_t = \tilde{y}_t \lambda_t = \tilde{y}_t e^{\gamma t} \tag{3.6}$$
Dividing all terms in (3.2) by L, one obtains the production function in the
intensive form:
$$\tilde{y}_t = A \tilde{k}_t^{\beta}, \tag{3.7}$$
where $\tilde{k} = K/L$ denotes physical capital “per unit of efficiency”.
After some manipulation, the modified version of the Fundamental Dynamic
Equation results as follows59:
$$\dot{\tilde{k}}_t = sA\tilde{k}_t^{\beta} - (n + \delta + \gamma)\tilde{k}_t \tag{3.8}$$
The two terms in the right hand side of (3.8) are depicted in Figure 3.1, together
with equation (3.7). The first term measures gross investment per unit of efficiency
58
Technically, technological progress is said to be “Harrod Neutral” if it does not alter the shares of labour
and capital in income, for each given capital-output ratio. Note however that, when the production
function is a Cobb-Douglas, the shares of labour and capital are constant and given by their elasticities in
production, $1-\beta$ and $\beta$, respectively. Hence, any Hicks neutral technological progress will also be Harrod
neutral. For a given rate of Hicks neutral technological progress ($g$), the equivalent rate of Harrod neutral
technological progress ($\gamma$) is larger. The reason is that, in the latter case, the burden of technological
progress is carried by only one factor. This distinction is important for growth accounting, as we will see in
Section 3.6.
59
The method is the same as used in Chapter 2: take the time derivative of $\tilde{k}$ and use (2.7), (2.8) and (2.9).
Note that $\dot{L}/L = \dot{\lambda}/\lambda + \dot{N}/N = \gamma + n$.
labour. The second term gives the “break-even investment”, that is, the one that would
be necessary to compensate for the "depreciation" of $\tilde{k}$. Note that the latter includes the
depreciation of physical capital and the growth rate of the effective labour force, $n+\gamma$.
Apart from the way the endogenous variable is defined, the interpretation of
equation (3.8) is the same as that of the corresponding equation in the basic Solow
model, (2.14): in brief, $\tilde{k}$ rises whenever gross investment per unit of efficiency is
larger than the break-even investment (as in $\tilde{k}_1$ of Figure 3.1) and conversely. The
model of Chapter 2 is a particular case of the model developed in this section, with $\gamma=0$.
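The manipulation behind equation (3.8) can be sketched as follows, using (2.7), (2.8), (2.9) and the definition of labour in efficiency units. The symbols β (capital share), δ (depreciation) and γ (labour-augmenting rate) are notational assumptions of this sketch.

```latex
\begin{align*}
\frac{\dot{\tilde{k}}_t}{\tilde{k}_t}
  &= \frac{\dot{K}_t}{K_t} - \frac{\dot{L}_t}{L_t}
   = \frac{sY_t - \delta K_t}{K_t} - (n + \gamma)
   = s\,\frac{\tilde{y}_t}{\tilde{k}_t} - (n + \delta + \gamma), \\
\dot{\tilde{k}}_t
  &= s\tilde{y}_t - (n + \delta + \gamma)\tilde{k}_t
   = sA\tilde{k}_t^{\beta} - (n + \delta + \gamma)\tilde{k}_t .
\end{align*}
```

The first line uses the fact that the growth rate of a ratio equals the difference between the growth rates of its numerator and denominator; the last equality substitutes the intensive production function (3.7).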
Box 3.2 The joke of the mathematician
One morning, a mathematician wanting to make tea finds the teakettle on a
chair. What do you think he will do? Obviously, you say, he takes the teakettle,
places it on the stove and boils the water to make tea. The next day, however, the same
mathematician finds the teakettle on a table. What do you think he will do? The right
answer is: he takes the teakettle and brings it back to the chair, so as to return to the
previous problem.
This joke is about the use of algorithms to solve mathematical problems: often
the easiest way to solve a problem is not necessarily the shortest way. Adapting the
problem so that it fits a well known algorithm is often much less time consuming than
trying to invent a solution specially adapted to the specific case. This is precisely what
we did in Section 3.3.
The steady state
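The steady state of the model is obtained setting $\dot{\tilde{k}} = 0$ in (3.8). Using
$k_t = \tilde{k}_t e^{\gamma t}$, $y_t = \tilde{y}_t e^{\gamma t}$, we find new equations for the paths of capital per worker
and per capita income in the steady state:
$$k_t^* = \left( \frac{sA}{n + \delta + \gamma} \right)^{\frac{1}{1-\beta}} e^{\gamma t} \tag{3.9}$$
$$y_t^* = A^{\frac{1}{1-\beta}} \left( \frac{s}{n + \delta + \gamma} \right)^{\frac{\beta}{1-\beta}} e^{\gamma t}. \tag{3.10}$$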
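Figure 3.1. The Solow model with technological progress
[Figure not reproduced. Vertical axis: $\tilde{y} = Y/L$; horizontal axis: $\tilde{k} = K/L$. Curves: $\tilde{y} = A\tilde{k}^{\beta}$ and $s\tilde{y}$; straight line: $(n+\delta+\gamma)\tilde{k}$. The steady state ($\tilde{k}^*$, $\tilde{y}^*$) and an initial level $\tilde{k}_1$ are marked.]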
Equation (3.10) states that, in the steady state, per capita income grows
continuously at the rate $\gamma$. How can that be? The labour augmenting technological
progress has the effect of neutralising the diminishing returns to capital that would
otherwise constrain per capita income growth: by economising progressively on the input
whose supply cannot be changed (labour), technological progress allows effective
labour to increase along with the number of machines (capital), so that the number of
machines per worker increases at rate $\gamma$.
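A small simulation can help to check the steady state formulas. The sketch below integrates the Fundamental Dynamic Equation in efficiency units with a simple Euler scheme and compares the result with the closed-form steady state. It assumes the notation β (capital share), δ (depreciation) and γ (labour-augmenting rate); the function names and parameter values are hypothetical.

```python
def steady_state(s, A, n, delta, gamma, beta):
    """Closed-form steady state in efficiency units (k_tilde*, y_tilde*)."""
    k_tilde = (s * A / (n + delta + gamma)) ** (1 / (1 - beta))
    y_tilde = A * k_tilde ** beta
    return k_tilde, y_tilde


def simulate_k_tilde(k0, s, A, n, delta, gamma, beta, years, dt=0.01):
    """Euler integration of d(k_tilde)/dt = s*A*k^beta - (n + delta + gamma)*k."""
    k = k0
    for _ in range(int(round(years / dt))):
        k += dt * (s * A * k ** beta - (n + delta + gamma) * k)
    return k


# Hypothetical parameter values, for illustration only.
params = dict(s=0.2, A=1.0, n=0.01, delta=0.05, gamma=0.02, beta=1 / 3)

k_star, y_star = steady_state(**params)
# Start with half the steady-state capital per efficiency unit.
k_end = simulate_k_tilde(k0=0.5 * k_star, years=300, **params)

print(f"closed-form k_tilde*: {k_star:.4f}")
print(f"simulated k_tilde after 300 years: {k_end:.4f}")
print(f"closed-form y_tilde*: {y_star:.4f}")
# In the steady state, per capita income is y_tilde* times exp(gamma * t),
# so it grows at the rate gamma even though y_tilde itself is constant.
```

The Euler scheme is the crudest possible choice; it is used here only because its fixed point coincides with the steady state of the continuous-time equation.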
What happens if the Savings Rate increases?
The implications of a once-and-for-all increase in the saving rate may be
assessed with the help of Figure 3.2. This figure is similar to Figure 2.5 of the previous chapter,
the only difference being that the variables in the axes are defined in a different way.
As before, a rise in the saving rate shifts the steady state to the right and lowers
the average product of capital, Y/K. Thus, the interest rate declines from one steady state
to the other (equation 2.12).
Because capital accumulation is bounded each moment in time by the
availability of savings, the economy does not jump to the new steady state: it slowly
converges to it. How slow? Box 3.4 addresses the length of the adjustment period.
Figure 3.3 describes the time paths of the main variables of the model following
a rise in the saving rate. The top panel depicts the evolution of output per unit of
efficiency ($\tilde{y} = Y/L$) and capital per unit of efficiency ($\tilde{k} = K/L$). The middle
panel depicts the paths of output per capita (y=Y/N) and of per capita consumption (c=C/N);
these are displayed in logs, so that constant growth rates appear as straight lines. The
bottom panel depicts the paths of the interest rate (r) and of the growth rate of per
capita consumption.
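Figure 3.2. A higher saving rate raises the steady state level of output per unit of efficiency
[Figure: horizontal axis $\tilde{k} = K/L$, vertical axis $\tilde{y} = Y/L$. Curves: the production function $\tilde{y} = A\tilde{k}^{\beta}$, the break-even investment line $(n+\gamma+\delta)\tilde{k}$ and the saving curves $s_0\tilde{y}$ and $s_1\tilde{y}$. The steady state moves from $(\tilde{k}_0^*, \tilde{y}_0^*)$ to $(\tilde{k}_1^*, \tilde{y}_1^*)$, with a lower average product of capital, $(Y/K)_1$.]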
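Figure 3.3. Implications of a rise in the saving rate
[Figure: three panels plotted against time, with the shock at t0 and the end of the adjustment at t1. Top panel: $\tilde{y}$ and $\tilde{k}$. Middle panel: ln y and ln c. Bottom panel: the interest rate r and the growth rate of per capita consumption.]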
Let t0 be the moment at which the saving rate rises to the new level. Assume
that, just before that moment, the economy was in a steady state, with a constant level
of output per unit of efficiency and with per capita income growing at the exogenous
rate $\gamma$ (remember equation 3.10). Since capital and the labour force are both
pre-determined, at the time of the shock (t0) per capita income remains initially unchanged.
Per capita consumption, however, falls to a lower level, because a higher proportion of
income is now devoted to savings.
The adjustment process takes place between t0 and t1. Since savings become
temporarily greater than break-even investment, both K/L and Y/L start increasing.
This means that the growth rates of output per capita and of per capita consumption
jump temporarily above their steady state level, $\gamma$. As the economy approaches the new
steady state, diminishing returns show up, implying that the growth rate of output per
capita falls back, approaching the exogenous rate, $\gamma$. The new steady state is reached
when the growth rate of per capita consumption is again equal to $\gamma$.
As in the case without technological progress, in the new steady state (after t1),
the consumption path may be above or below the original consumption path. Referring
to our discussion in Section 2.5, this will depend on how the new and the old saving
rates compare to the Golden Rule saving rate.[60] Figure 3.3 depicts the special case in
which the rise in the saving rate leads to a higher level of per capita consumption in the
steady state.
Box 3.3. Growth effects and level effects
At this stage, it is important to introduce the distinction between level effects and
growth effects:
- A level effect occurs when changing a model's parameter changes the steady
state without affecting the growth rate of the economy in the steady state.
- A growth effect occurs when a change in a parameter alters the growth rate
of the economy in the steady state.
In the Solow model, changes in parameters like the saving rate and the
population growth rate produce level effects only. A growth effect could only occur if
the exogenous rate of technological progress changed to a different value.
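The level effect of a higher saving rate can be illustrated with a small simulation. The sketch below assumes that capital per unit of efficiency follows the law of motion "saving minus break-even investment", discretised with a simple Euler step; the function names, the step size and all parameter values are hypothetical choices made for illustration.

```python
import math

def simulate(s_old=0.2, s_new=0.3, A=1.0, n=0.01, gamma=0.02, delta=0.03,
             beta=1/3, years=300, dt=0.1):
    """Euler simulation of a rise in the saving rate at t = 0.

    The economy starts in the steady state associated with s_old.
    Returns a list of (t, ln y_tilde, ln y), where y_tilde is output per
    unit of efficiency and y = y_tilde * exp(gamma * t) is per capita output.
    """
    breakeven = n + gamma + delta
    k = (s_old * A / breakeven) ** (1 / (1 - beta))  # old steady state
    path = []
    for i in range(round(years / dt) + 1):
        t = i * dt
        ln_y_tilde = math.log(A * k ** beta)
        path.append((t, ln_y_tilde, ln_y_tilde + gamma * t))
        # Saving minus break-even investment drives capital per efficiency unit.
        k += dt * (s_new * A * k ** beta - breakeven * k)
    return path

def average_growth(path, i, j):
    """Average annual log growth of per capita output between two path indices."""
    return (path[j][2] - path[i][2]) / (path[j][0] - path[i][0])

path = simulate()
early = average_growth(path, 0, 10)                          # first year after the shock
late = average_growth(path, len(path) - 11, len(path) - 1)   # last simulated year
level_gain = path[-1][1] - path[0][1]                        # permanent rise in ln y_tilde
```

In this sketch the growth rate of per capita output is above gamma right after the shock and returns to gamma in the long run, while the level of output per unit of efficiency ends up permanently higher: a level effect, not a growth effect.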
3.5 The extended Solow model meeting the real world facts
Revisiting Kaldor's facts
As discussed in Chapter 2, the main drawback of the simpler version of the
Solow model is that it cannot account for two stylized facts of economic growth,
namely that per capita output and capital per worker tend to grow over time at a
sustained rate (Kaldor's facts 1 and 2). As shown in equations (3.9) and (3.10), the
assumption that technology expands over time brings the model into compliance with
these two facts.
[60] You may verify that the Golden Rule saving rate in this version of the model is the same as before: it
corresponds to the share of capital in total income, $\beta$.
The model is also consistent with the evidence that the shares of labour and
capital in national income tend to be constant over time (Kaldor's fact 5).[61]
Dividing (3.10) by (3.9) we obtain the expression for the average product of
capital:
$$\left(\frac{Y}{K}\right)^* = \frac{n+\gamma+\delta}{s} \tag{3.11}$$
Since all parameters on the right-hand side of (3.11) are constant, the average
product of capital is constant in the steady state. Hence, the Simon Kuznets
fact that Robert Solow wanted to explain still holds in this version of the model. As a
by-product, and using equation (2.12), it follows that the interest rate is constant in the
steady state. This is Kaldor's fact number 4.
A novelty of the extended model is that it predicts a real wage that increases over
time: since the wage rate is proportional to per capita output (equation 2.11), in the
steady state they will both increase at the rate $\gamma$. This is another feature of the
extended model that makes it more compliant with the facts of Modern Growth.
Finally, we turn to fact number 6 ("There are wide differences in the growth rate
of productivity across countries"). As long as we stick with the assumption of perfect
technological diffusion, the model predicts that in the long run all countries should be
growing at the same rate, $\gamma$. That is, the steady states of the different countries should be
characterized by per capita incomes evolving in parallel over time. Note, however, that
this conclusion only applies to the long run: since equation (3.10) refers to the steady
state, it is expected to hold only for countries that have already adjusted fully to changes in
their exogenous parameters. For countries that are engaged in transitional dynamics,
the current growth rate of per capita income may be higher or lower than $\gamma$, depending
on whether the starting point is below or above the corresponding steady state: countries
that start out below (to the left of) their respective steady states are expected to grow
faster than countries that start out above their steady states. Thus, at each moment in time,
the growth rates of per capita income may vary considerably across countries. This
property of the model makes it consistent with Kaldor's fact number 6.
Absolute convergence
A question that has received considerable attention in the economics profession is
whether there is a general tendency for poor countries to grow faster than rich countries.
At least since David Hume (1758), economists have been arguing that technological
spillovers, diminishing returns and capital mobility provide poor countries with an
impetus to "catch up". The hypothesis that poor economies tend to grow faster than rich
economies is known as absolute convergence.
A simple way to investigate the convergence hypothesis using cross-sectional
data is to plot growth rates of per capita income against initial levels of per
capita income, for different countries over a period of time: if there were a general
tendency for per capita incomes to approach each other, then poorer countries should
grow faster than richer countries.
[61] Firms take technology as given, so they maximize profits with respect to K and N, as in the simple
Solow model. Hence, conditions (2.9)-(2.12) also hold in this more sophisticated version of the model.
Stylized fact 5 is a direct consequence of assuming CRS and perfect competition.
Figure 3.5 illustrates such an exercise, using a sample of 98 non-oil countries.
The figure relates the growth rate of GDP per working age person from 1960 to 1985
(vertical axis) to the corresponding 1960 level (horizontal axis). If there were a general
tendency for poor countries to grow faster than rich countries, the slope of the regression
line would be negative. However, this is not the case. The conclusion is that "absolute
convergence" does not apply to this broad cross section of countries during this time
period.
Note, however, that this evidence does not challenge the Solow model: the Solow
model does not imply that countries should converge to the same level of per capita output.
As stated in equation (3.10), countries differing in terms of the fundamental parameters
(those that determine the steady state, such as the saving rate and the population growth
rate) are expected to reach different steady states.
Figure 3.5: Evidence of Non-Convergence in 98 Countries
[Figure: scatter plot. Horizontal axis: GDP/adult 1960 (logs). Vertical axis: growth of GDP/adult 1960-1985. Fitted line: y = 0.0943x - 0.2666, R2 = 0.0363.]
Source: Mankiw et al., 1992.
Figure 3.6: Evidence of Absolute Convergence among 22 OECD Countries
[Figure: scatter plot of growth against the initial level of income (in logs). Fitted line: y = -0.3411x + 3.6863, R2 = 0.4855.]
Of course, countries that are similar in terms of the fundamental parameters,
such as the saving rate and the population growth rate, are expected to approach steady
states that are close to each other. Thus, if one restricts the sample to countries that are
very similar in terms of the fundamental parameters, one may well observe countries
with lower levels of per capita income growing faster than those with higher per capita
incomes. An example of this is displayed in Figure 3.6. This figure restricts the sample
of Figure 3.5 to OECD countries only. In this particular sub-sample, we are able to
identify a negative relationship between growth rates and initial per capita incomes.[62]
Summing up, in light of the Solow model, it is not possible to predict
whether a country will grow faster or slower simply by observing its initial income relative
to other countries. It is not the initial income that determines a
country's growth rate, but instead its distance relative to the steady state: economies
with per capita incomes that fall behind their steady states should grow faster than
economies with per capita incomes that are above the respective steady states. This
property of the model is known as conditional convergence and will be subject to
further discussion in the following chapter.
[62] This is not to say that all these countries are converging exactly to the same steady state: actually, in
this particular sub-sample, departures from the steady state (i.e., transitional dynamics) account for a larger
share of the cross-country variation of per capita incomes than differences in the steady states. Note that
the data refer to the post-WWII period, during which many European countries were rebuilding their
capital stocks. Evidence of a negative relationship between per capita income growth and the initial level
of per capita income is often found in samples containing industrial countries or their regions (Baumol,
1986, Dowrick and Nguyen, 1989, Barro and Sala-i-Martin, 1991, 1992, 1995, Mankiw et al., 1992). But in
general, the evidence of absolute convergence has been found to be fragile and sensitive to small sample
modifications (see De Long, 1988).
Box 3.4. Explaining the convergence test
Formally, the hypothesis of "absolute convergence" can be assessed empirically
by running the following regression equation:
$$\ln y_{it} - \ln y_{i0} = a + b \ln y_{i0} + \varepsilon_i \tag{3.13}$$
where the dependent variable is the growth rate of per capita GDP between period zero
and period t in country i and the regressors are a constant (a) and the initial level of per
capita GDP in country i ($\ln y_{i0}$). The term $\varepsilon_i$ is a random disturbance.
The "absolute" convergence hypothesis is assessed by investigating the sign and
significance of b: if b<0, there is a general tendency for initially poor
economies to grow faster than rich economies.
To see how this test relates to the Solow model, consider the following equation,
which describes the dynamics of per capita income as it approaches the steady state (see
Appendix 3.1 for a mathematical discussion):
$$\ln y_t - \ln y_0 = \gamma t + \left(1 - e^{-\nu t}\right)\left(\ln \tilde{y}^* - \ln y_0\right), \tag{3.14}$$
with $\nu = (1-\beta)(n+\gamma+\delta) > 0$.
Equation (3.14) states that the growth rate of an economy depends on its
distance relative to the steady state: if the economy starts out in the steady state, the
expected growth rate is $\gamma$; if the economy is below (above) the steady state, its growth
rate will be higher (lower) than $\gamma$. In general, this equation states that per capita income
converges to a steady state path and that it does so faster the farther the economy
starts from that path. This property of the model is known as conditional
convergence.
The relationship between the parameters of the regression equation (3.13) and
those of the structural relationship (3.14) is straightforward:
$$a = \gamma t + \left(1 - e^{-\nu t}\right) \ln \tilde{y}^*, \qquad b = -\left(1 - e^{-\nu t}\right) \tag{3.15}$$
This correspondence reveals that regression equations of the form (3.13), by
imposing the same intercept on all countries, implicitly impose the same steady state.[63]
Not surprisingly, tests for absolute convergence perform very poorly in worldwide
samples.
To overcome this limitation, many studies have allowed the intercept (the steady
state) to differ across countries. This is done by adding to the regression model variables
that are thought to determine $\ln \tilde{y}^*$, such as the saving rate, the population growth rate,
and efficiency, A (equation 3.10). We will return to these tests of conditional
convergence in Chapter 14.
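To see why a common intercept amounts to a common steady state, the following Python sketch generates growth rates from the convergence equation (3.14) for a handful of hypothetical countries and fits the regression (3.13) by ordinary least squares. The data are synthetic and noise-free, and the parameter values (a technological progress rate of 0.02, a speed of convergence of 0.04, a 25-year horizon) are assumptions made only for this illustration.

```python
import math

def log_growth(ln_y0, ln_y_star, t, gamma, nu):
    """Change in ln y between 0 and t implied by the convergence equation (3.14)."""
    return gamma * t + (1 - math.exp(-nu * t)) * (ln_y_star - ln_y0)

def ols(x, y):
    """Slope and intercept of a simple least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

gamma, nu, t = 0.02, 0.04, 25          # hypothetical values
ln_y0 = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5]  # synthetic initial log incomes

# Case 1: all countries share the same steady state (ln y* = 8.0).
same = [log_growth(v, 8.0, t, gamma, nu) for v in ln_y0]
b_same, a_same = ols(ln_y0, same)

# Case 2: every country starts exactly on its own steady state path.
own = [log_growth(v, v, t, gamma, nu) for v in ln_y0]
b_own, a_own = ols(ln_y0, own)
```

In the first case the fitted slope equals minus one plus the exponential of minus the speed times the horizon, so poorer countries grow faster. In the second case every country grows at the same rate and the slope is zero, even though each economy obeys the same convergence equation.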
[63] Strictly speaking, the slope b also varies across countries, because population growth rates, which
enter in $\nu$, differ across countries. This is, however, a small, second-order effect.
Box 3.5. How Long?
In (3.14), the term $\nu = (1-\beta)(n+\gamma+\delta) > 0$ measures the speed of adjustment
of per capita income to the steady state. To quantify this, let's calibrate the equation
with reasonable parameters:
- According to the Solow model, the elasticity of labour in the production
function can be assessed by the (observable) share of labour in national
income. National accounts data for different countries reveal that the labour
share in national income varies from 60% to 70%. Thus, a reasonable
assumption for the elasticity of capital in the production function is $\beta = 1/3$.
- In the Solow model, the rate of technological progress, $\gamma$, is equal to the
growth rate of per capita output in the steady state. The long-term evidence
for industrialised countries reveals that per capita incomes have evolved at
an average rate that is close to $\gamma$ = 2%. A popular assumption in the literature
is to consider $\gamma + \delta = 0.05$.
- The population growth rate varies considerably across countries. As an
example, consider a country whose population growth rate is equal to 1% per
year.
Under the above assumptions, the speed of convergence becomes $\nu = 0.04$.
With such a value, how long will it take for that country to get "halfway" to its
balanced growth path? Using equation (3.14), the answer solves $e^{-0.04t} = 0.5$, which
gives t = 17 years, approximately.
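The arithmetic of this box can be reproduced with a few lines of Python. This is only a sketch: the capital share of 1/3 and the sum of technological progress and depreciation of 0.05 are assumed values, chosen because together with a 1% population growth rate they deliver the speed of 0.04 used in the halfway condition above; the function names are hypothetical.

```python
import math

def convergence_speed(beta, n, gamma_plus_delta):
    """Speed of convergence: (1 - beta) * (n + gamma + delta)."""
    return (1 - beta) * (n + gamma_plus_delta)

def time_to_close(fraction, nu):
    """Years needed to close a given fraction of the initial log gap,
    i.e. the t that solves exp(-nu * t) = 1 - fraction."""
    return -math.log(1 - fraction) / nu

# Assumed calibration: capital share 1/3, n = 1%, gamma + delta = 5%.
nu = convergence_speed(beta=1/3, n=0.01, gamma_plus_delta=0.05)
half_life = time_to_close(0.5, nu)      # time to get halfway
ninety_pct = time_to_close(0.9, nu)     # time to close 90% of the gap
```

The same function shows how sensitive the adjustment period is to the calibration: a lower capital share or faster population growth raises the speed and shortens the half-life.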
Interpreting growth patterns using the Solow model
So far, we have been confronting the Solow model with stylized facts referring
to samples of countries. A different question is whether the model is helpful in interpreting
the specific growth patterns of individual countries. To address this question, let's consider
Figure 3.4, which depicts the evolution of per capita incomes in some advanced
economies, namely the United States, France, Japan and Germany, over the period from
1871 to 2001.
We observe that:
1. In all cases, per capita GDP exhibits an upward trend.
This is accounted for by the Solow model with technological progress.
2. The growth rates of per capita output appear to be pretty stable in the long
run and similar across these countries.
This suggests that the draconian assumptions of technology evolving at a
constant rate g and of all countries drawing from a common technological pool may
not be at odds with reality in this very particular sample, composed of a group of
advanced countries.[64]
3. The long-term paths of per capita incomes are parallel but not coincident.
The Solow model does not imply that steady states should be the same:
according to equation (3.10), the long run path of per capita income in a given country
may be higher or lower, depending on the saving rate (s), the population growth rate (n)
and efficiency (A).
[64] Note that this set of countries is very specific: they share a set of characteristics that make them
permeable to technological innovations discovered by each other. Many other countries in the world can
hardly be thought of as taking full advantage of this "technological pool".
4. During the long period from 1871 to 2001, some major disruptions pushed
the US, France and Germany away from their respective long-term paths.
These events included the First World War (1914-1918), the Great Depression
in the 1930s and the Second World War (1939-1945). As time went by,
per capita incomes appear to have returned to their earlier paths.
According to the Solow model, the steady state levels of per capita output are
independent of a country's initial capital endowments. So, if some disaster destroys part
of the capital stock, per capita GDP will fall initially, but then it will recover until
returning to the steady state. In Figure 3.4, we see that this prediction of the model fits
quite well the cases of the US, Germany and France. For instance, during WWII, per capita
output in Germany and France dropped significantly, but this was followed by a fast
recovery that brought these economies back to the earlier path.
5. In the case of Japan, a "level effect" is likely to have occurred after the
Second World War.
The path of Japan indeed points to a case distinct from those of the US, Germany
and France: after WWII this country seems to have moved from a lower steady state to a
higher one, closer to that of the United States. In light of the Solow model, such a move could
be explained by a change in a fundamental parameter, such as the saving rate, the
population growth rate or other country-specific effects, as captured by the country
efficiency parameter, A. Any change in one of these exogenous parameters implies a
change in a country's steady state.[65]
Figure 3.4. Per Capita GDP in Japan, France, Germany and US, 1871-2001
[Figure: line chart. Horizontal axis: years, 1871 to 2000. Vertical axis: per capita GDP, with ticks from 6 to 10.5. Series: France, Germany, Japan, United States.]
Source: Maddison (1995).
[65] Remember (from our discussion of Figures 2.7 and 3.3) that, by observing the real interest rate, you could
disentangle whether a change in the steady state is attributable to a change in A or to a change in the other
exogenous parameters.
3.6 Growth accounting revisited
Autonomous versus induced contributions
With the extended version of the Solow model in mind, we now revisit the
growth accounting exercise introduced in Section 2.6. In particular, consider again the
figures for the US economy during the first half of the twentieth century that we already
used in Section 2.6: GDP growing at 3%, the capital stock growing at 3% and labour input
growing at 1%.
The fact that both capital and output grew on average at about 3% in the US is
consistent with the view that this economy evolved along a balanced growth path (with
Y/K constant, and hence with a constant saving rate).
But if the US economy was indeed evolving along a balanced growth path, why
should a growth accounting exercise like (2.29) indicate a contribution of capital to
GDP growth equal to 1%?
To answer this question, note that the Solow model with exogenous growth
predicts that the capital stock grows along the steady state at the rate $\dot{K}/K = \gamma + n$. This
growth, however, is not autonomous (i.e., it is not implied by a change in the saving
rate); it is instead an endogenous response to the expansion of the effective labour force:
if there were no population growth or technological progress ($\gamma = n = 0$), then the capital
stock would be constant, unless the saving rate had recently increased.
The problem with the growth accounting exercise based on equation (2.29) is
that it measures the contribution of capital to growth without disentangling whether the
observed growth in capital is induced by an increase in the saving rate (transitional
dynamics) or is a mere response to growing effective labour. When the economy is in
the steady state, for instance, such an exercise will attribute $\beta(\gamma + n)$ to the growth of
capital, $(1-\beta)n$ to the growth of population and $g = \gamma(1-\beta)$ to technological
progress. And yet, all these contributions would be zero if there were no population change
or technological progress.
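The traditional decomposition can be checked numerically with the US figures quoted above. The Python sketch below assumes a capital share of 1/3, which is the value implied by a capital contribution of 1% when capital grows at 3%; the function name and the dictionary layout are hypothetical.

```python
def growth_accounting(g_Y, g_K, g_N, beta):
    """Traditional decomposition: g_Y = g + beta * g_K + (1 - beta) * g_N.

    Returns the contributions of capital and labour and the residual g.
    """
    capital = beta * g_K
    labour = (1 - beta) * g_N
    residual = g_Y - capital - labour
    return {"capital": capital, "labour": labour, "tfp": residual}

beta = 1 / 3  # assumed capital share
us = growth_accounting(g_Y=0.03, g_K=0.03, g_N=0.01, beta=beta)

# Steady state benchmark with technological progress at 2% and n = 1%:
gamma, n = 0.02, 0.01
benchmark = {
    "capital": beta * (gamma + n),
    "labour": (1 - beta) * n,
    "tfp": gamma * (1 - beta),
}
```

Under these assumptions the measured contributions coincide with the steady state benchmark, so the 1% attributed to capital is exactly what an economy on its balanced growth path would display without any change in the saving rate.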
An alternative approach
To disentangle whether a country's growth process is mainly driven by
transitional dynamics or by steady technological change, many authors prefer to
implement growth accounting exercises based on the following re-arrangement of the
production function (3.7):[66]
$$y_t = A_t^{\frac{1}{1-\beta}} \left(\frac{K_t}{Y_t}\right)^{\frac{\beta}{1-\beta}} \tag{3.16}$$
[66] David (1977), Mankiw et al. (1992), Klenow and Rodriguez-Clare (1997), Hall and Jones (1999). You
may easily obtain this expression by manipulating (3.7) and taking logs.
This rearrangement emphasizes the role of the capital-output ratio as a determinant
of per capita income. According to the Kaldor stylized facts, this ratio should be
roughly stable in the long run. In light of the Solow model, in a steady state, this ratio is
influenced by the saving rate, but is independent of the technological level, A (equation
3.11).
Log-differentiating this, you get
$$\hat{y} = \frac{1}{1-\beta}\hat{A} + \frac{\beta}{1-\beta}\left(\hat{K} - \hat{Y}\right) \tag{3.17}$$
In (3.17) the contribution of capital to the growth rate of per capita GDP is now
evaluated by the extent to which its growth rate exceeds that of output. This
decomposition removes the part of the growth of the capital stock that is
induced by the exogenous parameter ($\gamma$). Thus, growth accounting based on equation
(3.17) will capture the contribution of capital only to the extent that the country is
involved in a process of transition dynamics.
In technical terms, the first term in (3.17) is the Harrod-neutral rate of
technological progress, $\gamma = g/(1-\beta)$, while decomposition (2.30) emphasizes the
Hicks-neutral rate of technological progress (g).
Using this new approach and the same figures for the US economy, one obtains:
$$2\% = 2\% + \frac{1}{2}\left(3\% - 3\%\right)$$
That is, in the US economy during the first half of the last century, capital
accumulation by itself did not account for the growth of per capita GDP: the only
exogenous source of per capita income growth in this period was (Harrod-neutral)
technological progress, which averaged 2% per year. The capital stock grew at a rate
of 3% per year, but this was merely a response to population growth and TFP
growth.
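The alternative decomposition can be reproduced in the same way. The Python sketch below is a minimal illustration: it assumes a capital share of 1/3 and uses the US figures quoted earlier in this section (output and capital growing at 3%, labour at 1%); the function name is hypothetical.

```python
def alternative_accounting(g_Y, g_K, g_N, beta):
    """Decompose per capita output growth into a technology term and a
    capital-output ratio term.

    Returns (g_y, technology, capital_deepening), where
    g_y = technology + capital_deepening.
    """
    g_y = g_Y - g_N                                    # per capita output growth
    g_A = g_Y - beta * g_K - (1 - beta) * g_N          # Hicks-neutral residual
    technology = g_A / (1 - beta)                      # Harrod-neutral rate
    capital_deepening = beta / (1 - beta) * (g_K - g_Y)
    return g_y, technology, capital_deepening

g_y, technology, capital_deepening = alternative_accounting(
    g_Y=0.03, g_K=0.03, g_N=0.01, beta=1/3)  # beta is an assumed value
```

With these numbers the whole of the 2% growth in per capita output is assigned to technology and nothing to the capital-output ratio. A positive second term would only show up if capital had grown faster than output, as happens during a transition to a higher steady state.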
Box 3.6. The TFP controversy in East Asia
There are many examples in the literature of practical applications of growth
accounting. A particularly controversial application refers to the so-called Asian Tigers
(Hong Kong, Singapore, South Korea and Taiwan). Economists have for many years
turned to these successful countries for clues that can explain the successes of
development and so provide examples of effective development to the rest of the world.
The World Bank in particular had come down strongly in favour of the idea that the
East Asian economies, by operating sounder domestic policies than other developing
countries, relied on TFP growth for a substantial part of their total growth: almost 2
percentage points of an overall growth rate of almost 7 per cent per annum.[67] By contrast,
TFP growth rates in both Africa and Latin America were almost zero in that same
period.
[67] World Development Report 1991.
In an influential article, Young (1995) dissented from this view. The author
found that the bulk of the impressive growth achieved in the four East Asian miracle
countries in the period 1966-1990 could be attributed much more narrowly to their fast
rates of factor accumulation, and not to exceptional rates of TFP growth. These
countries, he concluded, achieved very high growth rates because of their ability to
sustain high investment rates (in physical capital and in human capital) and a drastic
increase in the fraction of the population at work (largely via the increased labour force
participation of women).
Columns (1), (2) and (3) of Table 3.1 present a summary of the Young (1995)
estimates. These are obtained using a decomposition similar to (2.29), with the
difference that the author aggregated raw labour and education levels into a single
measure of labour. The data in column (3) indicate that TFP growth accounts for only a
small proportion of GDP growth in these countries. Singapore for example was a
particularly bad performer, with TFP growing at a rate close to zero.
Based on Young's evidence, some authors disputed the World Bank view and
concluded that the East Asian episodes illustrate the importance of neoclassical
transition dynamics (that is, old-fashioned capital accumulation and hard work!)
rather than productivity change. Paul Krugman, who helped popularise these results,
contended that the key issue was "perspiration" rather than "inspiration".
The implication of this finding for development economics is somewhat
disappointing: if technological catch-up played only a minor role in East Asia's growth,
this means that belt-tightening and policies that address the issues of low initial savings
and investment and poor education are the more critical ones to promote economic
growth. Thus, the quest for policies that encourage greater aggregate efficiency, which
many authorities have emphasised in the East Asian context, appeared to be relatively
less important.
Not surprisingly, such a controversial conclusion became subject to further
scrutiny in the years that followed.
One avenue that was explored relates to our discussion in Section 3.6: traditional
growth accounting tends to overstate the role of factor accumulation because it does not
control for the component of factor accumulation that is merely induced by
technological change.
Klenow and Rodriguez Clare (1997) stressed this point and computed the
Harrod neutral rates of technological progress implied by the Young (1995) estimates of
Total Factor Productivity. Their results are reproduced in column (4) of Table 3.1.
Clearly, the adjusted measures of technological change are higher than those in Column
(3)!
Table 3.1. Productivity growth in East Asia

             Young (1995)          Klenow and Rodríguez-Clare (1997)          Hsieh (1999)
             Y      Y/N    TFP     Harrod    Y/N    Harrod    Rank (98        R      w      Dual
                                   neutral          neutral   countries)                    TFP
             (1)    (2)    (3)     (4)       (5)    (6)       (7)             (8)    (9)    (10)
Hong Kong    7.3    4.7    2.3     3.7       5.5    4.4       4               0.3    4.0    2.7
Korea        10.3   4.9    1.7     2.5       5.4    2.5       17              -4.8   4.4    1.6
Singapore    8.7    4.2    0.2     0.3       5.1    3.3       6               1.2    2.7    2.0
Taiwan       9.4    4.8    2.6     3.5       5.3    3.0       7               -1.7   5.3    3.5

Source: Columns (1)-(3) are from Young (1995) and display average growth rates for the period 1966-
1990 (1966-1991 in Singapore). Columns (4)-(6) are from Klenow and Rodríguez-Clare (1997) and refer
to the period 1960-85. Column (7) displays the rank order of the estimates in Column (6) in a sample of 98
countries. Columns (8)-(10) are from Hsieh (1999). In Column (8), R denotes the "rental price of
capital", i.e., the real interest rate plus the depreciation rate, multiplied by the relative price of capital.
Following Hsieh (1999), this is based on the "average lending rate" for Singapore (1968-1990), the
"secured loan rate" for Taiwan (1966-1990), the "best lending rate" for Hong Kong (1966-1991) and the
"curb market loan rate" for Korea (1966-1990).
A second problem mentioned by Klenow and Rodríguez-Clare relates to data
measurement. In general, growth accounting exercises are very sensitive to key
assumptions, such as the weights assumed in the production function and the treatment
of human capital. Differences in data definitions across countries also create problems
of international comparison. To illustrate how sensitive the estimates are to changes in
methodology, we display in columns (5) and (6) of Table 3.1 the estimates proposed by
Klenow and Rodríguez-Clare for output per capita and for (Harrod-neutral) productivity
growth in these countries. Although the estimates for South Korea and Taiwan roughly
match those of Young, the estimates for Singapore and Hong Kong are much higher. In the
case of Singapore, the difference is qualitatively important.[68]
The third caveat to the Young results is that the contribution of TFP growth
should be evaluated per se and not as a proportion of output growth.[69] In general,
estimates of TFP growth using comparable data for a large set of countries reveal that
East Asian productivity growth is relatively high when compared to other regions.
Column (7) displays the rank order of the Harrod-neutral rate of technological progress
estimated by Klenow and Rodríguez-Clare (1997) in their 98-country sample. Clearly, the
TFP growth rates in the four East Asian economies are quite respectable.
[68] Klenow and Rodríguez-Clare (1997) attributed the bulk of the difference to different assumptions
regarding the capital income share (the authors used 0.30, instead of 0.48 in Young, 1995) and also to the
different data set used for employment growth.
[69] In light of the neoclassical model, for any two countries enjoying the same rate of technological
progress, the one that starts with a lower capital-labour ratio should exhibit faster growth. Thus, because
of the transition dynamics, a lower proportion of its output growth will be accounted for by improvements in
TFP, even though both countries share the same rate of technological progress.
More recently, Hsieh (1999) observed that, if East Asian growth was mostly
driven by transition dynamics, with little technological progress, then the return to
capital should have fallen dramatically, due to diminishing returns. However, the
interest rate in Singapore did not fall accordingly. Manipulating the condition that output
equals factor incomes, the author proposed a new (dual) estimate of TFP growth, based
on factor prices:
$$\frac{\dot{A}}{A} = \beta \frac{\dot{R}}{R} + (1-\beta)\frac{\dot{w}}{w},$$
where R is the "user cost of capital" and w is the real wage (obtained as a weighted
average across workers of different qualities). The dual estimate of TFP growth has the
advantage of being based on market prices (namely wages and interest rates), rather
than on national accounts. Some of the Hsieh (1999) results are reported in columns (8)-
(10) of Table 3.1. As the table shows, in the case of Singapore there is a significant
difference between the Young (1995) primal TFP estimates and the Hsieh (1999) dual
TFP estimates. The author suggested that national accounts data in Singapore are
probably wrong. A possible reason is that the private sector tends to overstate the
investment effort so as to take advantage of tax allowances.
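The dual formula is a share-weighted average of factor price growth, which the following Python sketch makes explicit. The rental price and wage growth rates are the Hong Kong entries of columns (8) and (9) in Table 3.1; the capital share of 0.35 is a purely hypothetical value chosen for illustration, not the share used by Hsieh (1999), and the function name is an assumption.

```python
def dual_tfp_growth(capital_share, rental_growth, wage_growth):
    """Dual TFP growth: share-weighted average of the growth rates of the
    rental price of capital and of real wages."""
    return capital_share * rental_growth + (1 - capital_share) * wage_growth

# Hong Kong, growth rates in per cent per year from Table 3.1, columns (8)-(9).
# The capital share below is hypothetical.
hong_kong = dual_tfp_growth(capital_share=0.35, rental_growth=0.3, wage_growth=4.0)

# With a falling rental price and the same wage growth, dual TFP growth is lower.
falling_rental = dual_tfp_growth(capital_share=0.35, rental_growth=-3.0, wage_growth=4.0)
```

The second call illustrates the logic of the argument: if growth had been driven mainly by capital accumulation, the rental price of capital would have fallen sharply and the dual measure would be correspondingly lower for the same wage growth.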
All in all, despite the initial controversy, the most recent evidence suggests
that TFP growth (technology and aggregate efficiency) has indeed played a much
greater role in the economic transformation of East Asia than the original Young
estimates suggest. In short, the World Bank was right.
3.7. Discussion: What have we achieved?
In Chapter 2 we saw that, because of diminishing returns, the basic Solow model
is not capable of describing the real world fact that economies engaged in modern
growth tend to grow over time.
In this chapter, we saw that, by assuming an exogenous rate of technological
progress, we make the model capable of describing most stylised facts of economic
growth. In particular, the model is able to reconcile sustained growth of per capita
income with an interest rate that remains constant over time. With this refinement, the
model becomes compliant with all 6 stylized facts of economic growth identified by
Kaldor. In addition, we saw that the Solow model is consistent with the evidence that poor
countries do not tend to grow faster than rich countries. In a word, the Solow model can
fairly be said to provide the correct answers to the set of questions it was intended to
address.
The main drawback of the model is that it cannot address the causes of
technological progress. The model describes how economies evolve over time, and can
be extended to account for the role of technological progress in delivering long term
growth. However, the model fails to explain why such technological progress occurs.
The reason for this was already discussed at the beginning of this chapter: by assuming
perfect competition, the model implicitly assumes that technology is freely available to
everybody, so no profit-making economic agent would find an incentive to invent a new
technology and sell it. Moreover, because in the model the payments made to inputs
exhaust the total output, nothing would be left to reward any potential innovator.
Without accounting for the incentives to invent new technologies, the model is forced to
assume that technology grows exogenously. Questions such as: "who produces
technological progress and why" cannot be addressed by the Solow growth model.
Despite its limitations, the Solow model provides a framework that should be seen
as the centrepiece for describing the process through which per capita income grows over
time. As such, it became the workhorse of growth theory and is the basis of many
advanced models in macroeconomics.
Appendix 3.1 Transition dynamics in the Solow model
The stability properties of the neoclassical growth model ensure that the
economy converges to its steady state and that the speed of adjustment depends on how
far the economy is from the steady state.
Formally, the speed of convergence may be assessed using a first-order Taylor-series
approximation of (3.8) around the steady state (3.9). This gives:
$$\dot{\tilde{k}} \simeq \left.\frac{\partial \dot{\tilde{k}}}{\partial \tilde{k}}\right|_{\tilde{k}=\tilde{k}^*}\left(\tilde{k}-\tilde{k}^*\right) = \left[s\beta\frac{Y}{K}-(n+\delta+\gamma)\right]\left(\tilde{k}-\tilde{k}^*\right) = -(1-\beta)(n+\delta+\gamma)\left(\tilde{k}-\tilde{k}^*\right)$$
or
$$\dot{\tilde{k}} = -\phi\left(\tilde{k}-\tilde{k}^*\right), \quad \text{with } \phi=(1-\beta)(n+\delta+\gamma)>0,$$
where we used equation (3.11) to eliminate Y/K. The last equation is a first-order,
non-homogeneous differential equation, whose solution is given by⁷⁰:
$$\tilde{k}_t = \tilde{k}^* + e^{-\phi t}\left(\tilde{k}_0-\tilde{k}^*\right)$$
This equation states that the change in $\tilde{k}$ each moment in time declines as the
economy approaches its steady state. When the economy is in the steady state, the
second term on the right hand side is zero, implying a constant level of $\tilde{k}$. When the
economy is below the steady state, the second term on the right hand side is negative
but shrinks towards zero as time goes by, implying that $\tilde{k}$ is rising. This means the
model exhibits local stability.
Since $\tilde{y}$ is a continuous function of $\tilde{k}$, as a linear approximation, $\tilde{y}$ approaches
the steady state at the same rate as $\tilde{k}$. Then, it can also be shown that the dynamics of $\tilde{y}$
in natural logarithms is given by⁷¹:
$$\ln\tilde{y}_t = \ln\tilde{y}^* + e^{-\phi t}\left(\ln\tilde{y}_0-\ln\tilde{y}^*\right)$$
Thus, the speed of adjustment of per capita income to the steady state is given
by $\phi=(1-\beta)(n+\delta+\gamma)>0$. This equation suggests a natural regression to study the rate
of convergence in the context of the Solow model: subtracting $\ln\tilde{y}_0$ from both sides,
rearranging and using the identity $\ln y_t = \ln\tilde{y}_t + \gamma t$ to eliminate $\tilde{y}_t$ and $\tilde{y}_0$, one obtains
(3.14).
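To get a feel for the magnitudes involved, the convergence coefficient and the implied half-life of the gap to the steady state can be computed directly. The following Python sketch is illustrative only: the function names are ours, and the parameter values (a capital share of 1/3, population growth of 1%, depreciation of 3% and technological progress of 2%) are assumptions chosen for the example, not estimates.

```python
import math


def convergence_speed(beta, n, delta, gamma):
    """Coefficient on the gap in the linearised law of motion:
    (1 - beta) * (n + delta + gamma)."""
    return (1.0 - beta) * (n + delta + gamma)


def remaining_gap_share(speed, years):
    """Share of the initial gap to the steady state still open after `years`."""
    return math.exp(-speed * years)


def half_life(speed):
    """Years needed to close half of the initial gap."""
    return math.log(2.0) / speed


# Assumed, illustrative parameter values (not estimates).
speed = convergence_speed(beta=1 / 3, n=0.01, delta=0.03, gamma=0.02)
print(f"convergence coefficient: {speed:.3f}")
print(f"half-life of the gap: {half_life(speed):.1f} years")
print(f"gap still open after 10 years: {remaining_gap_share(speed, 10):.2f}")
```

Under these assumed values the coefficient is 0.04, so half of any initial gap would be closed in roughly 17 years.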
70
See, for instance, Chiang (1984), pp. 472-474.
71
The student will thank us for skipping the tedious mathematical derivation.
• Because it assumes perfect competition and rules out market failures, the
Solow model can only account for exogenous technological progress.
• The exogenous rate of technological progress allows effective labour to
expand faster than population. If, as implied by the model, the capital
stock expands proportionally to effective labour, then output will also
expand at the same rate as effective labour. The implication is that capital
per worker and per capita income will both increase at the rate of
labour-augmenting technological progress.
• With this small amendment, the Solow model can account for the fact
that most countries see output per capita increasing, not constant.
• As before, changes in the saving rate can only produce “level effects”:
that is, the growth rate of per capita income increases temporarily, but in
the long run it falls back to the level given by the rate of technological
progress.
• The Solow model does not imply that in the long run all countries should
have the same level of per capita income (“absolute convergence”).
According to the Solow model, per capita incomes differ in the steady
state depending on country characteristics, such as the saving rate and
the population growth rate.
• The Solow model implies that countries that lag behind their respective
steady states should grow faster than countries that are close to their
steady states. This property of the model is known as “conditional
convergence”.
• The Solow model is capable of describing many stylized facts of
economic growth. It fails, however, to explain economic growth, because
the parameter that ultimately determines the rate at which per capita
incomes are expanding is exogenous to the model.
• Growth accounting exercises have different interpretations depending on
whether the technological parameter is specified as Harrod neutral or as
Hicks neutral. If one wants to capture the contribution of capital net of
the capital accumulation that is induced by technological
change, the Harrod neutral measure should be used.
Key concepts
• Perfect technological diffusion
• Labour in efficiency units
• Harrod neutral vs Hicks neutral technological progress
• Level effect vs. growth effect
• Absolute Convergence
• Conditional Convergence
Essay questions:
a) Comment: “In the Solow model, technology has to be assumed
exogenous”.
b) Comment: “The Solow model does not explain economic growth: it
describes it”.
Exercises
3.1.
Consider an economy where the production function is given by:
$Y_t = A_t K_t^{1/3} N^{2/3}$, where $A_t = 16e^{0.02t}$ describes the technology and N is the (constant)
number of workers. In this economy, 25% of income is saved and the capital depreciation
rate is 1%.
a) Describe the main equations of the model and find out the fundamental
dynamic equation for K/L, where L is labour in efficiency units.
b) Find out the equilibrium values of K/L, Y/L and K/Y.
c) Describe the time-paths of per capita income (Y/N), the wage rate, the
interest rate and the factor income shares in the steady state. Are these
paths in accordance with the real world facts?
d) Suppose that a war destroyed part of the stock of capital of that
economy. Describe the subsequent evolution of per capita income
(Y/N).
e) How did the growth rate of per capita income and the interest rate
evolve during the transition path? Explain.
3.2.
Consider an economy (Oldland) where the production function is given by the
following expression: $Y_t = A_t K_t^{1/3} N_t^{2/3}$, where N measures the number of workers. It is
known that, in this country, 25% of income is saved, population grows at 0.5% per year,
the capital stock depreciates at 3% and $A_t = 20e^{0.01t}$.
a) Find the equilibrium values of K/L, Y/L and K/Y of this economy,
where L represents labour in efficiency units. Discuss the stability of the
equilibrium and represent it in a graph.
b) Describe the short and long run effects of a rise in the saving rate on the
following variables: per-capita income, growth rate of per-capita
income, per-capita consumption and interest rate.
c) Assume that in Oldland per-capita income is ten times higher than in
Newland. Under what conditions could you state that Newland was growing
faster than Oldland?
d) Knowing that technology was the same in both countries, find out what
the interest rate in Newland should be. Discuss.
3.3.
In Micronia the production function of each individual firm is given by:
$Y_i = A_t K_i^{1/3} N_i^{2/3}$, where N measures the number of workers. In this country, 25% of
income is saved, the population does not grow, the capital stock depreciates 3% each
year and $A_t = 5e^{\frac{0.04}{3}t}$.
a) Find out the equilibrium values of K/L, Y/L and K/Y of this economy,
where L denotes labour in efficiency units.
b) Find the saving rate that would maximize per-capita consumption in the
steady state.
c) Assume that the saving rate jumped to the level found in b). Describe in a
graph the time paths of y=Y/N, c=C/N and of the interest rate during the
adjustment period.
d) Consider an economy with a level of per capita income in the steady
state corresponding to 10% of the one in Micronia. Find out the
corresponding saving rate, assuming that this was the only difference
between the two economies. Discuss.
3.4
Consider an economy composed of a large number of small and identical firms.
The available technology for each of them is given by $Y_i = 0.5K_i^{1/2}L_i^{1/2}$, where $L=\lambda N$
measures labour in efficiency units. Population does not grow, the depreciation rate is
3.5% and the saving rate is 20%. Assume also that $\lambda_t = e^{0.015t}$.
a) Find out the equilibrium values of K/L, Y/L and K/Y of this economy.
b) Describe the long run behaviour of per-capita output, wages and the
interest rate.
c) Assume now that the saving rate was determined according to the following
rule: $\dot{c}/c = r - \rho$, where $\rho$ is the rate of time preference and r the
interest rate. Explain this rule. Find out the value of $\rho$ that is consistent
with s=20%.
3.5
Consider an economy where capital and output grow at 3% per year and the labour
force grows at 1%. Assuming that the weight of capital in the production function
($\beta$) is one third, compute the contribution of capital to output growth, using: (a)
conventional growth accounting; (b) growth accounting based on the following
re-parameterisation of the production function: $\hat{y} = A^{\frac{1}{1-\beta}}\left(K/Y\right)^{\frac{\beta}{1-\beta}}$. Interpret the
differences. Repeat the exercise assuming that output growth is 4.5%.
4 The Neoclassical model with Human Capital
Why doesn’t capital flow from rich countries to poor countries?
[Robert Lucas Jr.]
Learning Goals:
• Understand why the Solow model is not capable of accounting for large
per capita income gaps.
• Understand why accounting for Human Capital makes the model more
consistent with real world facts.
• Acknowledge that, even accounting for human capital, much of the
cross-country variance of per capita incomes remains to be explained.
• Acknowledge the main achievements as well as the main limitations of
the neoclassical growth model.
4.1. Introduction
We just saw that the Solow model is capable of describing very reasonably a
wide range of real world facts of economic growth. Not surprisingly, the Solow model
rapidly became the workhorse model in the theory of economic growth and it still is.
During the last 30 years, however, various researchers have subjected the model to
further and more demanding empirical scrutiny. One direction explored in this further
investigation relies on the fact that the model specification implies not only the expected
signs of certain parameters but also their approximate magnitudes. In particular, the
model implies broad orders of magnitude for the coefficients linking per capita income
to the savings rate and to the population growth rate. It happens that these magnitudes
do not conform too well to the empirical evidence.
The key parameter in this further investigation is the elasticity of output with
respect to physical capital, $\beta$. If, according to the model, there is perfect competition and
factors of production are paid their marginal products, then the elasticity of output with
respect to physical capital should correspond to the share of capital in domestic income.
As we will see, the empirical observation that the latter stands at around 30% to 40%
makes the model incapable of describing the large differences in per capita incomes that
we observe in the real world. Reconciling the two would require, for example, differences
in savings rates between countries much larger than those actually observed.
This chapter addresses these inconsistencies and explores one avenue that was
proposed to mitigate the problem: the introduction of a second reproducible factor in
the production function, Human Capital. It is argued, however, that such an
improvement is not enough to make the model fully consistent with the real world facts.
Section 4.2 reviews the above mentioned limitations of the Solow model.
Section 4.3 introduces the concept of Human Capital and presents a human capital
augmented production function. Section 4.4 shows a version of the neoclassical growth
model extended with Human Capital. Section 4.5 turns to the empirical evidence to
evaluate how far the extended version can go in accounting for cross-country income
differences. The conclusion is that augmenting the production function with human
capital improves the predictability of the model but it does not allow it to explain fully
the existing per capita income differences. The chapter concludes with the need to have
a better understanding of what is behind the productivity term, and outlines the
subsequent directions of our search.
4.2. The Lucas Paradox
Explaining cross-country income differences
In the Solow model, the steady-state level of per capita income is given by:
$$y^* = A^{\frac{1}{1-\beta}}\left(\frac{s}{n+\delta+\gamma}\right)^{\frac{\beta}{1-\beta}}e^{\gamma t} \qquad [3.10]$$
In this section, we investigate whether calibrating this expression with sensible
values for the main parameters makes it capable of explaining large per capita income
differences.
First, we need an estimate for $\beta$. Because the model assumes perfect
competition and rules out market failures (such as externalities and public goods), it
predicts that factors are paid according to their marginal products. This is reflected in
equations (2.11) and (2.12), which state that the shares of capital and of labour in
national income should correspond to the respective elasticities in the production
function, $\beta$ and $1-\beta$. In real life, we observe that the share of labour in national
income varies between 60% and 70%, depending on the country. Given this, one may
reasonably calibrate the Solow model by setting the parameter $\beta$ equal to 1/3.
With such a calibration, the elasticity of income with respect to the savings rate
($\beta/(1-\beta)$ in equation 3.10) becomes equal to 0.5: that is, a 1% increase in the saving
rate implies a 0.5% increase in per capita income. The inconsistency with the empirical
facts arises from the fact that this elasticity is too small to account for the large
cross-country per capita income differences we observe in the real world.
To get a sense of this, consider two countries: say a Rich Country (R), and a
Developing Country (L). From (3.10), and abstracting from differences in A, the ratio of
per capita incomes in the steady state will be:
$$\frac{y_R}{y_L} = \left(\frac{s_R}{s_L}\right)^{\frac{\beta}{1-\beta}}\left(\frac{n_L+\delta+\gamma}{n_R+\delta+\gamma}\right)^{\frac{\beta}{1-\beta}} \qquad (4.1)$$
In the following exercise, assume that $\beta$=1/3, $\delta$=3% and $\gamma$=2% (the last
corresponding to the trend growth rate of per capita GDP in the U.S.).
Assume further the following values for the remaining parameters: $s_R = 0.2$,
$s_L = 0.24$, $n_R$=1% and $n_L$=3%. These values roughly match those of the US and Tanzania
between 1960 and 2000. With these parameter values, equation (4.1) would predict per
capita incomes in the rich country and in the poor country differing only by a factor of
1.05 (that is, the average income of a US citizen should exceed that of a
Tanzanian citizen by 5% only). Obviously this is too little: by 2000, per capita
income in the US was 32 times higher than that of Tanzania.
Now, as an extreme case, assume that the poor country had instead $s_L$=1.5% and
$n_L$=5%. Clearly, these assumptions are extreme, even for developing countries (no
country has sustained a population growth rate as high as 5% and few save as little as
1.5% of income). Still, these figures would imply a ratio of per capita incomes of only
4.7. Again, this is too small to account for the observed differences in per capita
incomes⁷².
The conclusion is that, even using drastic assumptions concerning the
differences in the saving rate and in the population growth rate, there is no way of
generating cross country income gaps as large as we observe in reality in the context of
the Solow model as it is.
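The two calibrations above are easy to reproduce. The following Python sketch evaluates the steady-state income ratio implied by the Solow model for the baseline and for the extreme scenario, using the parameter values quoted in the text. The function and variable names are ours and are introduced for illustration only.

```python
def steady_state_income_ratio(s_rich, s_poor, n_rich, n_poor, beta, delta, gamma):
    """Ratio of steady-state per capita incomes (rich over poor),
    abstracting from differences in A."""
    exponent = beta / (1.0 - beta)
    saving_term = (s_rich / s_poor) ** exponent
    dilution_term = ((n_poor + delta + gamma) / (n_rich + delta + gamma)) ** exponent
    return saving_term * dilution_term


# Baseline: values roughly matching the US and Tanzania, as quoted in the text.
baseline = steady_state_income_ratio(
    s_rich=0.20, s_poor=0.24, n_rich=0.01, n_poor=0.03,
    beta=1 / 3, delta=0.03, gamma=0.02,
)

# Extreme scenario: very low saving and very fast population growth in the poor country.
extreme = steady_state_income_ratio(
    s_rich=0.20, s_poor=0.015, n_rich=0.01, n_poor=0.05,
    beta=1 / 3, delta=0.03, gamma=0.02,
)

# Same extreme scenario with a capital share of 0.4 instead of 1/3.
extreme_high_share = steady_state_income_ratio(
    s_rich=0.20, s_poor=0.015, n_rich=0.01, n_poor=0.05,
    beta=0.4, delta=0.03, gamma=0.02,
)

print(f"baseline ratio: {baseline:.2f}")
print(f"extreme ratio: {extreme:.1f}")
print(f"extreme ratio with a capital share of 0.4: {extreme_high_share:.1f}")
```

Even the most favourable of the three cases delivers a ratio below 8, far from the 32:1 gap observed between the US and Tanzania.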
Using the production function
By making use of equation (3.10), the exercise based on (4.1) implicitly postulates that
economies are in the steady state. But it could be that the poor country was still on its
way to the steady state. In that case, part of the poor country’s income gap vis-a-vis the
rich country could be due to the fact that its income potential (as determined by n and
s) was not yet fully materialized.
An alternative approach is to look at the production function. This approach has
the advantage of relying on a relationship that should hold each moment in time,
regardless of whether a country is in the steady state or in the transition dynamics.
Let’s consider the production function of the Solow model, (2.2):
$$y_t = A_t k_t^{\beta} \qquad [2.2]$$
Now, take a developing country, say Peru. By 2000, per capita income in Peru
was roughly 17.9% of the corresponding level in the United States⁷³. Could such a
difference be explained by the capital labour ratio alone? To answer this question, let’s
assume again that $\beta$ is equal to 1/3 and that A is the same in both countries, and solve
(2.2) for k, to find out the level of capital per worker in Peru, relative to the US, that
would be needed to match this income gap. The answer is
$k = 0.179^3 = 0.0057$. That is, for the income difference between Peru and the US to be
explained by the capital labour ratio only, one would need a capital labour ratio in Peru
equivalent to 0.57% of that in the United States. This is clearly unrealistic: actually, by
2000, the capital labour ratio in Peru stood at about 18% of the corresponding level in the
US.
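The Peru calculation can be checked in a couple of lines. The Python sketch below inverts the production function under the assumption of a common A and a capital share of 1/3, and also runs the exercise in the opposite direction, asking what relative income the observed capital ratio of about 18% would deliver on its own. The function names are ours, introduced for illustration.

```python
def implied_capital_ratio(income_ratio, beta):
    """Relative capital per worker needed to explain a given relative income,
    when y = A * k**beta and A is the same in both countries."""
    return income_ratio ** (1.0 / beta)


def implied_income_ratio(capital_ratio, beta):
    """Relative income delivered by a given relative capital per worker,
    under the same assumptions."""
    return capital_ratio ** beta


beta = 1 / 3

# Peru's income relative to the US, as quoted in the text.
needed_capital = implied_capital_ratio(0.179, beta)

# Peru's observed capital per worker relative to the US, as quoted in the text.
predicted_income = implied_income_ratio(0.18, beta)

print(f"capital ratio needed to explain the income gap: {needed_capital:.4f}")
print(f"income ratio implied by the observed capital ratio: {predicted_income:.2f}")
```

Read this way, a capital labour ratio of 18% of the US level would, by itself, leave Peru with more than half of the US income per capita, well above the 17.9% actually observed.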
72
If one used instead $\beta$=0.4, which is also a reasonable assumption, the ratio of per capita incomes in the
extreme scenario would rise to 7.9. This is still too low.
73
The following figures are from Caselli and Feyrer (2007).
Why doesn't capital flow from rich to poor countries?
A third way of formulating the same problem was proposed by the Nobel
Laureate Robert Lucas (1990): if cross-country differences in per capita incomes were
only related to differences in the ratio of capital per worker, then poor countries should
have much higher interest rates than rich countries.
How much higher? To answer this question, Lucas solved equation (2.2) for k,
and used (2.12), to get:
$$r + \delta = \beta A_t^{1/\beta}\, y_t^{-(1-\beta)/\beta}. \qquad (4.2)$$
Now consider a Rich country, (R), and a Less Developed Country, (L), operating
along the same production function, (2.2), that is, with an equal A. In that case, the
ratio of capital returns in the two countries should be equal to:
$$\frac{(r+\delta)^L}{(r+\delta)^R} = \left(\frac{y_t^R}{y_t^L}\right)^{\frac{1-\beta}{\beta}} \qquad (4.3)$$
As an example, Lucas considered the cases of the USA and of India, whose per
capita incomes in 2000 differed by a factor of 15:1⁷⁴. Assuming $\beta$=0.4, the exponent
in equation (4.3) equals 1.5. Thus, to explain a 15:1 gap in per capita incomes, the
ratio of capital returns should be 15^1.5 = 58. That is, the interest rate in India would
need to be 58 times greater than that of the USA. In plain language, if capital were
generating a return of around 4% in the USA, then the corresponding average return in
India should be 233%. Clearly, this is wholly unrealistic, even admitting that poorer
countries have very high risk premia⁷⁵.
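The arithmetic behind Lucas's example is compact enough to be sketched in Python. The function below computes the ratio of capital returns implied by a given income gap when both countries share the same A; its name is ours, and the two capital shares are the ones discussed in the text and in the accompanying footnote.

```python
def implied_return_ratio(income_ratio, beta):
    """Ratio of capital returns (poor over rich) implied by a given income
    ratio (rich over poor), when both countries share the same A."""
    return income_ratio ** ((1.0 - beta) / beta)


# Income gap between the USA and India used in the text.
income_gap = 15.0

ratio_generous = implied_return_ratio(income_gap, beta=0.4)
ratio_one_third = implied_return_ratio(income_gap, beta=1 / 3)

print(f"return ratio with a capital share of 0.4: {ratio_generous:.0f}")
print(f"return ratio with a capital share of 1/3: {ratio_one_third:.0f}")
```

The lower the capital share, the stronger the diminishing returns, and hence the larger the return differential needed to rationalise the same income gap.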
Lucas pointed out that such high interest differentials cannot hold in a world
with capital mobility. Thus, in the face of such return differentials, capital goods should be
flowing from rich countries to poor countries. Moreover one would expect almost no
investment to occur in capital-abundant countries until capital-labour ratios - and hence
interest rates - were more or less equalised across the globe. This, in turn, would be
good news for poor countries: if their problem was lack of capital and if this was
reflected in high interest rates, then capital should flow in and convergence of per capita
incomes would be just a matter of time.
Clearly this story does not conform to today’s realities: capital does not flow in
huge amounts to poor countries, interest differentials are nowhere near as large as 58
times and there is no systematic tendency for poor countries to grow faster than rich
countries⁷⁶.
74
According to Maddison (2001), in 2000, per capita incomes in the USA and in India in comparable units
(PPP) were, respectively, $28,129 and $1,910.
75
Note that $\beta$=0.4 is a generous assumption. With $\beta$=1/3 the ratio of interest rates would jump to 225!
76
Lucas observed that during the 18th and 19th centuries, at a time when the production function was
better described as a function of labour and land, labour actually moved from labour abundant countries
(Europe) to labour scarce countries (New World). In the 20th century, capital replaced land as the main
factor in the production function and became the potentially mobile one, as strong restrictions on labour
mobility were erected.
Figure 4.1 – GDP per worker and capital per worker
[Scatter plot of GDP per worker (vertical axis) against capital per worker (horizontal axis). Labelled countries include USA, HKG, BEL, NOR, IRL, CHE, GBR, FIN, JAP, GRE, PRT, MUS, MYS, CHL, MEX, VEN, EGY, PAN, PER and BOL.]
Source: Caselli and Feyrer (2007).
Box 4.1 The return to physical capital in poor countries
The question as to whether the marginal product of capital is higher or lower in
poor countries than in rich countries has important policy implications: if one concludes
that the marginal product of capital differs substantially across the globe, this means
that some form of market friction is preventing physical capital from being more
efficiently allocated across the world. In that case, there would be scope for expanding
world output by promoting a better allocation of capital between rich countries and
poor countries (for instance, through external aid).
First, let’s investigate whether rich countries do indeed tend to have more capital per
worker than poor countries. Figure 4.1 addresses this question. The figure plots data
on per capita income against capital per worker for 53 countries. The figure confirms
that capital per worker varies considerably across countries (by a factor of 100:1) and
that it is positively related to per capita income. The non-linearity in the relationship
between per capita income and capital per worker is also suggestive of diminishing
returns, as assumed in the Solow model⁷⁷. Thus, one would expect the country with less
capital per worker to have a higher real interest rate than the country with more capital
per worker.
Caselli and Feyrer (2007) investigated whether the low levels of capital per
worker in poor countries are associated with higher marginal products of capital or not.
The authors estimated the marginal product of capital for 53 countries, assuming an
aggregate production function exhibiting constant returns to scale.
77
We have to be careful with such interpretation: for instance, Hong Kong achieves a higher level of
output per worker than Japan, with a lower level of capital per worker. This reveals that other factors
apart from capital per worker (actually, captured by parameter A!) are driving cross-country income
differences. However, the discussion above abstracts from the influence of this factor, so as to stress the
limitations of the original Solow model.
In the case of a Cobb-Douglas production function, we know that $\partial Y/\partial K = \beta Y/K$. The authors noted,
however, that the share of capital in national income (the proxy for $\beta$) should not be
computed as one minus the share of labour in national income. The reason is that, apart
from labour and physical capital, there are other, non-reproducible factors, such as
“natural capital”, which includes land and natural resources.
In the case of the United Kingdom, for instance, the authors estimate a share of
labour in national income equal to 75% and a share of physical capital equal to only
18%. The 7% difference corresponds to the share of natural capital. In Bolivia, the share
of labour is estimated to be 67% while the share of physical capital is only 8%, which
implies a 25% share for natural capital. Since the share of income rewarding natural
capital tends to be significant in countries where agricultural and natural-resource
industries are sizeable, ignoring it would imply an overestimation of the contribution
of capital to output, especially in poor countries.
A second caveat pointed out by Caselli and Feyrer (2007) relates to the relative
price of capital. If the price of output and the price of capital were the same – as
assumed in the Solow model – then in a world with perfect capital mobility one should
observe an equalization of marginal products of capital across countries. However, in
the real world, a unit of capital does not cost the same as one unit of output. In general
the relative price of capital is higher in poor countries than in rich countries, due to
tariffs, transport costs and other distortions. Since in poor countries, firms have to spend
more units of output to buy one unit of capital, they need to achieve a higher
productivity on the invested capital, all else constant.
To better understand the argument, note that the marginal product of capital is a
physical measure: it gives the increment in real output obtained per unit of capital
invested. From the investor’s point of view, however, what matters is how many additional
units of output she will obtain per unit of output sacrificed today (foregone
consumption). When the price of capital is different from the price of output, these two
perspectives will differ.
Formally, let $p_K$ be the relative price of capital goods (in units of Y). In that
case, when one unit of output is saved, the investor will be able to buy only $1/p_K$ units of
physical capital. With a marginal product of physical capital equal to
$\partial Y/\partial K = \beta Y/K$, the change in output obtained by saving one unit of output will be
$(\beta Y/K)(1/p_K)$. Taking this into account, Caselli and Feyrer corrected the usual estimate
of the Marginal Product of Capital, dividing it by the relative price of capital.
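The two corrections can be put side by side in a short Python sketch. The income shares used below are those quoted above for Bolivia; the output-capital ratio and the relative price of capital are purely hypothetical numbers, chosen only to show the mechanics, and the function names are ours. This is an illustration of the formula, not a reproduction of the Caselli and Feyrer estimates.

```python
def uncorrected_mpk(labour_share, output_capital_ratio):
    """Marginal product of capital when the capital share is taken to be one
    minus the labour share and the price of capital is ignored."""
    return (1.0 - labour_share) * output_capital_ratio


def corrected_mpk(physical_capital_share, output_capital_ratio, relative_price_of_capital):
    """Marginal product of capital using the share of physical capital only,
    divided by the relative price of capital goods."""
    return physical_capital_share * output_capital_ratio / relative_price_of_capital


# Shares quoted in the text for Bolivia.
labour_share = 0.67
physical_capital_share = 0.08

# Hypothetical values, for illustration only.
output_capital_ratio = 0.5
relative_price_of_capital = 1.5

before = uncorrected_mpk(labour_share, output_capital_ratio)
after = corrected_mpk(physical_capital_share, output_capital_ratio, relative_price_of_capital)

print(f"uncorrected estimate: {before:.3f}")
print(f"corrected estimate: {after:.3f}")
```

With these illustrative numbers, most of the fall in the estimate comes from replacing one minus the labour share by the share of physical capital, with the price correction reducing it further.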
Figure 4.2 compares the “corrected estimates” with the “naïve estimates” (that is
without price correction and without accounting for the share of natural capital). As the
figure reveals, the naïve estimates suggest that the marginal product of capital is high
and variable in poor countries and low in rich countries. If true, this would
support the idea that capital does not flow from rich countries to poor countries due to
some form of capital market frictions. In that case, a massive investment in poor
countries would be the key for economic convergence.
After the corrections, however, the story is different. As shown in Figure 4.2,
when corrections for the natural capital and for the relative prices are implemented, the
marginal product of capital is largely equalized across countries. In other words, from
the investor’s point of view, the return to investment is not higher in poor countries than
in rich countries. This evidence suggests that the large variation in capital per worker
across countries cannot be attributed to capital market frictions: a reallocation of capital
across countries so as to exactly equalize the marginal products of capital would not
make the world significantly more equal than what we observe today.
Figure 4.2 – The marginal product of capital (naïve and corrected estimates)
[Scatter plot of the naïve and corrected estimates of the marginal product of capital (vertical axis, 0% to 60%) against real GDP per worker (horizontal axis).]
Source: Caselli and Feyrer (2007).
Figure 4.3 – The price-corrected marginal product of capital in rich countries and in poor countries
[Line chart of the price-corrected MPK (vertical axis, 10% to 24%) for rich countries and for poor countries, yearly from 1970 to 2000.]
Source: Mello (2008). Note: the author did not correct for natural capital.
This is not to say that differences in the marginal product of capital have always
been small. Extending the work of Caselli and Feyrer to the period
1970-2000, Mello (2008) showed that the rough equalization of marginal products of
capital across the world is a rather recent phenomenon. This author’s evidence is
summarized in Figure 4.3. The figure displays the price-corrected marginal product of
capital for a group of rich and a group of poor countries, over the period from 1970 to
2000. As shown in the figure, the marginal product of capital differed substantially
between rich and poor countries in the 1970s: at that time, large efficiency gains could have been
achieved by improving cross-country capital mobility. However, globalization and the
elimination of capital controls across the globe through the 1980s and 1990s led to a
rough cross-country equalization of capital returns thereafter⁷⁸. So, in today’s world, the
gains from further capital market integration are expected to be small.
So, how can we explain these inconsistencies?
The discussion above reveals that capital accumulation cannot account for the
large differences in per capita income we observe in the real world. If these differences
were accounted for by differences in the availability of capital per worker only, then
the marginal product of capital should be very high in poor countries and capital should
flow from rich countries to poor countries. However, this is not the case in our
days: capital is not flowing massively from rich countries to poor countries and there is
no evidence that the marginal product of capital is substantially higher in poor countries
than in rich countries.
The conclusion is that we can hardly explain the cross-country variation of per
capita incomes with differences in the level of capital per worker, only.
In order to solve this puzzle, two avenues can be explored:
1. One is to account for the role of other inputs in production, in particular
Human Capital. This avenue will be addressed in the remainder of this
chapter.
2. The second is to focus on the factor that has been silent so far and that
we loosely related to the state of “technology”. This avenue will be
explored in parts II and III of this book.
4.3. Human capital
What is human capital?
Human capital is the term used to describe the stock of knowledge and health
embodied in labour. The notion of human capital comes from the observation that
people invest in knowledge and in health (through schooling, on-the-job training,
exercise and healthcare) with the aim of obtaining a return, just as people invest in
physical capital.
78
Mello (2008). According to this author, the price-corrected marginal product of capital in poor and rich
countries differed, on average, by 5.52 p.p. in the 1970s, 1.85 p.p. in the 1980s and 0.37 p.p. in the 1990s.
Human capital is also similar to physical capital in that it depreciates over
time. As an example, think of the know-how needed to distinguish poisonous
mushrooms from edible ones: certainly, in our days such knowledge is much less
relevant for most of us than it was thousands of years ago, when most people lived in
the woods. As another example, think of what happens to your old knowledge each
time you need to catch up with a new release of a piece of software: you have to learn
how to operate the new version, throwing away part of the time invested in
learning how to operate the old version. In general, as time goes by, part of the
“accumulated” knowledge becomes outdated and useless.
A similar reasoning holds for health: health depreciates over time (and at an
increasing rate with age!). Consequently, continuous investment in health is required
(exercise, safe nutrition, health care), so as to prevent health capital from eroding too
fast.
Note that the two components of human capital tend to be correlated with each
other: more educated people tend to be more aware of the advantages of healthy
nutrition and of exercise, so they will tend to be healthier too; by the same token,
healthier individuals, with longer life expectancies, are likely to invest more in
education, because they have longer payback periods. Thus, improving one dimension of
human capital is expected to deliver improvements in the other dimension too.
Human capital as an input to production
In order to account for the role of human capital in production, let’s consider a
production function where both “raw labour” and “human capital” enter as inputs to
production:
$$Y_t = A_t K_t^{\beta} H_t^{\alpha} N_t^{1-\alpha-\beta}, \quad \text{with } \alpha + \beta < 1 \tag{4.4}$$
In (4.4), the new variable “H” stands for Human Capital (in the form of
knowledge and health) and all other variables are defined as before.
In light of this specification, production needs both “bodies” and “brains” and it
is not possible to substitute completely “bodies” for “brains” or “brains” for “bodies” 79.
This production function exhibits CRS in all these inputs: that is, one would be able to
double Y if one could simultaneously double the use of the three factors, K, H and
N. As in the basic Solow model, however, we will be constrained by the fact that one
input (N) evolves at an exogenous rate. Hence, expanding K and H faster than N will
deliver diminishing returns.
To see the link between human capital and per capita income, let’s write the
production function in the intensive form:
$$y_t = A_t k_t^{\beta} h_t^{\alpha},$$
where $h = H/N$ is the level of human capital per worker.
79
Alternatively, we will see formulations where human capital and raw labour are merged together into a
single “composite” input called “human capital”.
Why does human capital impact positively on per capita output?
There are several reasons to believe that human capital impacts positively on
output per worker:
- First, more knowledgeable workers will be able to accomplish more complex
tasks with a minimum outlay of time.
- Second, healthy and well nourished workers are expected to have more
physical and mental energy to learn and undertake their tasks than sick and
starving workers80.
- Third, health capital increases the amount of healthy time available for work,
by reducing incapacity, disability and absenteeism.
- Fourth, individuals with longer life expectancy have an incentive to work
harder, because they need to save more for their longer retirement period.
All in all, these various effects imply that a higher level of human capital should
impact positively on output and, hence, on the average product of labour.
In addition to the direct effect of human capital on y, human capital can also
impact positively on y via effects that are mediated through the productivity parameter,
A. For instance, more skilled and educated workers are more likely to adopt new
technologies and to contribute themselves to technological change than less educated
people. Also, more educated people are more likely to press their governments for
reforms and good policies than people who are less educated. These effects run from H
to A and then to Y. For expositional convenience, however, at this stage we abstract
from any impact on growth that is mediated by the parameter A (actually, we are still
abstracting from cross-country differences in A).
Human capital and the productivity of physical capital
As you may remember, whenever two inputs are complementary in production,
increasing the use of one input has a positive impact on the marginal product of the
other. To see this in terms of (4.4), let’s take the partial derivative with respect to physical
capital:
$$\frac{\partial Y}{\partial K} = \frac{\partial y}{\partial k} = \beta A_t k_t^{\beta-1} h_t^{\alpha} \tag{4.5}$$
According to (4.5), the marginal product of physical capital is a decreasing
function of physical capital per worker (this is the law of diminishing returns), but an
increasing function of human capital per worker (h). Thus, at least in theory, a higher
level of human capital per worker has the potential to offset the diminishing returns to
physical capital, and possibly to prevent the interest rate from falling as the
capital-labour ratio increases.
80
A classic study on this dimension is Fogel (1991, 1994). The author analysed the relationship
between work effort and caloric intake in France and England since the 18th century. He concluded that
in 18th-century France, individuals in the bottom decile of consumption had daily caloric
intakes so low that they did not have enough energy to work. In the centuries that
followed, improvements in nutrition impacted significantly on workers’ effort (some 30% among British
workers, he estimated), and also on the “participation rate” (that is, the fraction of the working-age population
that is actually able to work).
Figure 4.4 provides a graphical illustration. The figure displays two curves
describing the marginal product of capital (4.5) as a function of capital per worker, in
two countries which differ in terms of human capital per worker: a “rich”
country (upper curve) and a “poor” country (lower curve). These functions are downward
sloping because of diminishing returns: in either country, more capital per
worker will translate into a lower productivity of capital (and hence, lower interest
rates), everything else constant. However, the fact that the rich country is endowed with
a higher level of human capital per worker may prevent the interest rate from being lower than
in the poor country.
In the figure, the marginal product of capital in the rich country is represented
by point C, which is roughly similar to that of the poor country (A). Hence, in this
constructed example, there would be no reason for capital to flow from the rich
country to the poor country. Whether in the real world adding human capital to the
production function is enough to solve the Lucas paradox is a different story: as we will see
in the remainder of this chapter, the answer is “no”.
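The offsetting mechanism behind equation (4.5) can be illustrated numerically. The following is a minimal sketch, not part of the original text: the two economies are hypothetical, A is normalised to one in both, and the values β = 0.30 and α = 0.28 are borrowed from the calibration used later in this chapter.

```python
def marginal_product_of_capital(k, h, A=1.0, beta=0.30, alpha=0.28):
    """Equation (4.5): dY/dK = beta * A * k**(beta - 1) * h**alpha."""
    return beta * A * k ** (beta - 1.0) * h ** alpha


def offsetting_human_capital_ratio(k_ratio, beta=0.30, alpha=0.28):
    """Ratio h_rich / h_poor that equalises the marginal product of capital
    when k_rich / k_poor = k_ratio and A is the same in both countries.
    It follows from setting (4.5) equal across the two countries."""
    return k_ratio ** ((1.0 - beta) / alpha)


# Hypothetical economies: the rich country has 4 times more capital per worker.
k_poor, h_poor = 1.0, 1.0
k_rich = 4.0 * k_poor
h_rich = offsetting_human_capital_ratio(k_rich / k_poor) * h_poor

mpk_poor = marginal_product_of_capital(k_poor, h_poor)
mpk_rich_same_h = marginal_product_of_capital(k_rich, h_poor)  # no extra human capital
mpk_rich = marginal_product_of_capital(k_rich, h_rich)         # with offsetting human capital

print(f"MPK, poor country:                 {mpk_poor:.3f}")
print(f"MPK, rich country with h = h_poor: {mpk_rich_same_h:.3f}")
print(f"MPK, rich country with h = {h_rich:.0f}:      {mpk_rich:.3f}")
```

Under these assumed parameters, human capital per worker would have to be 32 times higher in the rich country to fully offset a fourfold difference in physical capital per worker, which gives a first sense of how demanding the offsetting condition is.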
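Figure 4.4 – What happens to the marginal product of physical capital when human capital
per worker increases?
[Figure: the marginal product of capital, $\partial Y/\partial K$, plotted against capital per worker, $k = K/N$, with one downward-sloping curve for the rich country and one for the poor country; points A, B and C and the capital levels $k_{poor}$ and $k_{rich}$ are marked.]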
4.4. The augmented Solow model (MRW)
An extension of the Solow model to account for human capital as an input to
production was first proposed by Mankiw, Romer and Weil (1992). In this section, we
briefly describe the implications of such an extension, focusing on the steady state only.
The main assumptions
As in the Solow model, let’s assume that the technological parameter in (4.4)
increases continuously, at an exogenous rate g:
$$A_t = A e^{gt} \qquad [3.1]$$
Rearranging the production function (4.4) in terms of “efficiency labour”, we
get:
$$Y_t = A K_t^{\beta} H_t^{\alpha} L_t^{1-\alpha-\beta}, \tag{4.6}$$
where $L_t = N_t e^{\gamma t}$ and $\gamma = g/(1-\alpha-\beta)$.
Note that in this model, workers become more productive both because of labour
augmenting technological progress, and because of investment in human capital. The
critical distinction between these two factors is that the former is strictly exogenous,
while the latter is produced and accumulated, in exactly the same manner as physical
capital.
As in the Solow model, it is assumed that one unit of output can be transformed
at no cost into either one unit of human capital or into one unit of physical capital81. The
resource constraint of the economy is given by:
$$Y_t = C_t + I_t + I_t^H, \tag{4.7}$$
where $I^H$ refers to investment in human capital and all other variables are defined as in
the Solow model.
It is also assumed that people invest constant fractions of their incomes in
physical and human capital. Let $s_H$ be the fraction of income invested in human capital and $s$ the
fraction of income invested in physical capital. The dynamics of K and H are given,
respectively, by:
$$s Y_t = I_t = \dot{K}_t + \delta K_t \tag{4.8}$$
$$s_H Y_t = I_t^H = \dot{H}_t + \delta H_t. \tag{4.9}$$
For simplicity, the model assumes that the stocks of physical and of human capital
depreciate at the same rate, $\delta$. The depreciation of human capital may be interpreted as
the erosion of knowledge (obsolescence, forgetting) net of the benefits from
experience.
The steady state in the MRW model
To solve the model in the simplest possible manner, there is a helpful clue: in
the steady state, Physical Capital and Human Capital must grow at the same rate. At this
stage, the reason should be intuitive: under diminishing returns, it would not be efficient
to have one input growing faster than the others: if, for instance, physical capital were
growing faster than human capital, then the return to physical capital would decrease
relative to the return to human capital and there would be no reason for people to keep
investing more in physical capital than in human capital.
81
Alternatively, one could specify a second sector for this economy, specifically devoted to the production of
human capital (like schools, universities and hospitals). This alternative approach is examined in
Chapter 5.
Imposing $\dot{K}/K = \dot{H}/H$ in the two equations above, one obtains a critical
condition:
$$\frac{H}{K} = \frac{s_H}{s} \tag{4.10}$$
This condition states that the ratio of human to physical capital in the steady
state equals the ratio of the corresponding investment rates. Solving for H, and
substituting in the production function (4.4), written in the form (4.6), one obtains 82:
$$Y_t = B K_t^{\eta} L_t^{1-\eta} \tag{4.11}$$
where
$$\eta = \alpha + \beta \quad \text{and} \quad B = A \left( \frac{s_H}{s} \right)^{\alpha}. \tag{4.12}$$
Note that, because we used (4.10), equation (4.11) holds in the steady state only.
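The claim that K and H end up growing at the same rate, with H/K settling at s_H/s, can be checked with a simple simulation of (4.8) and (4.9). This is only a sketch under stated assumptions: the Euler time step, the starting values and all parameter values below are illustrative choices, not taken from the text (g is set so that γ = g/(1−α−β) = 0.02).

```python
import math


def simulate_human_to_physical_ratio(s=0.20, s_h=0.10, alpha=0.28, beta=0.30,
                                     delta=0.03, g=0.0084, n=0.01,
                                     years=400.0, dt=0.1):
    """Euler simulation of the accumulation equations (4.8)-(4.9).

    Returns the ratio H/K at the end of the simulation."""
    # Arbitrary starting values, with H/K deliberately far from s_h/s.
    K, H, N, A0 = 1.0, 5.0, 1.0, 1.0
    t = 0.0
    for _ in range(round(years / dt)):
        A_t = A0 * math.exp(g * t)                               # technology, as in [3.1]
        Y = A_t * K**beta * H**alpha * N**(1.0 - alpha - beta)   # production function (4.4)
        # Both stocks are updated using the same (old) level of output.
        K, H = K + (s * Y - delta * K) * dt, H + (s_h * Y - delta * H) * dt
        N *= math.exp(n * dt)                                    # exogenous population growth
        t += dt
    return H / K


ratio = simulate_human_to_physical_ratio()
print(f"simulated H/K = {ratio:.6f}, s_H/s = {0.10 / 0.20:.6f}")
```

Whatever the initial stocks, the simulated ratio approaches s_H/s, which is why condition (4.10) can be imposed when solving for the steady state.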
Given the similarities between (4.11) and (2.1), you may guess that the steady
state of this model will be very similar to that of the Solow model. And this conjecture
happens to be true. Adapting equation (3.10) to the parameters in (4.11), one obtains:
$$y_t^* = B^{\frac{1}{1-\eta}} \left( \frac{s}{n+\gamma+\delta} \right)^{\frac{\eta}{1-\eta}} e^{\gamma t} \tag{4.13}$$
Substituting back $\eta = \alpha + \beta$ and (4.12), one obtains the steady state level of per
capita income in this augmented model:
$$y_t^* = \left[ \frac{A \, s_H^{\alpha} \, s^{\beta}}{(n+\gamma+\delta)^{\alpha+\beta}} \right]^{\frac{1}{1-\alpha-\beta}} e^{\gamma t} \tag{4.14}$$
where $\gamma = g/(1-\alpha-\beta)$. Note that (3.10) is no more than a particular case of (4.14),
with $\alpha = 0$.
Equation (4.14) states that the steady state level of income per capita depends
positively on the saving rate and negatively on the rate of population growth (as before).
It also states that per capita income rises in the long run at a constant rate $\gamma$ (as before).
The novelty here is that the level of per capita income also depends on the share of
income devoted to Human Capital Accumulation, $s_H$: an increase in $s_H$ gives rise to a
level effect (output per capita will expand temporarily until the new steady state is
reached). This is similar to what happens with investment in physical capital.
A property of this model is that the two saving rates impact on per capita income
individually and together, reinforcing each other: for any given rate of human capital
accumulation, a higher saving rate leads to a higher level of per capita income in the
steady state, which in turn leads to a higher level of human capital. Hence, small
differences in the saving rates may explain large differences in per capita income,
allowing the model to fit the real world facts much better.
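The reinforcing effect of the two saving rates can be seen by evaluating equation (4.14) directly. The sketch below is illustrative only: it assumes A = 1, t = 0, α = β = 1/3 and γ + δ = 0.05, and the saving and population growth rates are hypothetical. Setting α = 0 recovers the Solow case for comparison.

```python
import math


def steady_state_income(s, s_h, n, A=1.0, alpha=1/3, beta=1/3,
                        gamma=0.02, delta=0.03, t=0.0):
    """Steady state per capita income, as in equation (4.14)."""
    base = A * s_h**alpha * s**beta / (n + gamma + delta) ** (alpha + beta)
    return base ** (1.0 / (1.0 - alpha - beta)) * math.exp(gamma * t)


# Augmented model: double s alone, then double both s and s_H.
base_case = steady_state_income(s=0.10, s_h=0.10, n=0.01)
higher_s = steady_state_income(s=0.20, s_h=0.10, n=0.01)
higher_both = steady_state_income(s=0.20, s_h=0.20, n=0.01)

# Solow special case (alpha = 0): human capital plays no role.
solow_base = steady_state_income(s=0.10, s_h=0.10, n=0.01, alpha=0.0)
solow_higher_s = steady_state_income(s=0.20, s_h=0.10, n=0.01, alpha=0.0)

print(f"Doubling s, augmented model:   x{higher_s / base_case:.2f}")
print(f"Doubling s and s_H, augmented: x{higher_both / base_case:.2f}")
print(f"Doubling s, Solow (alpha = 0): x{solow_higher_s / solow_base:.2f}")
```

With these assumed shares, doubling the saving rate doubles steady state income in the augmented model, against a factor of about 1.41 in the Solow case, and doubling both rates multiplies income by four.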
82
In a footnote to their paper (footnote 12), Mankiw et al. (1992) noted that the properties of the model
change dramatically in the case in which $\eta = \alpha + \beta = 1$. In that case, equation (4.11) becomes linear in K,
delivering “endogenous growth”. This alternative case will be addressed in the next chapter.
Factor income shares in the MRW model
The production function (4.4) applies to the economy as a whole. Assuming that
all firms in this economy are identical, firm level profit maximization leads to the
following aggregate demands for physical capital, human capital and raw labour 83:
$$\frac{\partial Y}{\partial K} = \beta \frac{Y}{K} = r + \delta \tag{4.15}$$
$$\frac{\partial Y}{\partial H} = \alpha \frac{Y}{H} = r + \delta \tag{4.16}$$
$$\frac{\partial Y}{\partial N} = (1-\alpha-\beta) \frac{Y}{N} = w$$
As expected, these equations imply that factor income shares are equal to the
corresponding elasticities in the production function (remember that this is a direct
consequence of CRS and no market failures).
Taken together, these equations imply that the total share of labour in national
income is $\alpha + (1-\alpha-\beta) = 1-\beta$. Note that, although the production function distinguishes
two types of labour input, in the real world these two inputs are paid in the same wage
bill. If, for instance, $\beta = 1/3$, this means that the labour share in income will be around
2/3, which accords with the empirical evidence.
An arbitrage (efficiency) condition
Absence of arbitrage opportunities implies that investment in the two types of
capital shall be such that their marginal products are equal. Combining the demands for
physical and human capital (4.15) and (4.16), one obtains:
$$\frac{H}{K} = \frac{\alpha}{\beta} \tag{4.17}$$
This equation implies that physical and human capital evolve proportionally, as assumed
in (4.10) 84. For future reference, the “efficiency condition” (4.17) will be taken as a
benchmark, namely to analyse departures from the well-functioning economy case, in
Part III of this book.
Cross-country income differences revisited
To get a sense of whether augmenting the neoclassical model with human
capital helps explain the observed cross-country differences in per capita output, let’s
first re-consider the “rich country” - “poor country” example of Section 4.2.
83
Note that, because in this model one unit of output can be transformed at no cost into either one unit of
physical capital or one unit of human capital (equation 4.7), the user costs of these two forms of capital
are the same.
84
It is worth noting that this efficiency condition is consistent with the “golden rule”. As in the basic
Solow model, the golden rule for capital accumulation can be obtained by maximizing the steady state level
of per capita consumption: $\max_{s, s_H} c_t^* = (1 - s - s_H) \, y_t^*$, where $y_t^*$ is given by (4.14). The solution to this
problem is as expected: $s = \beta$ and $s_H = \alpha$. Using (4.10), one then obtains (4.17). Note, however, that the
golden rule is more restrictive than (4.17), because it also pins down the level of savings.
Using (4.14), the ratio of per capita incomes in the steady state becomes:
$$\frac{y_R}{y_L} = \left( \frac{s_R}{s_L} \right)^{\frac{\beta}{1-\alpha-\beta}} \left( \frac{s_{HR}}{s_{HL}} \right)^{\frac{\alpha}{1-\alpha-\beta}} \left( \frac{n_L+\gamma+\delta}{n_R+\gamma+\delta} \right)^{\frac{\alpha+\beta}{1-\alpha-\beta}}$$
Comparing to (4.1), we see that the exponents are now larger:
$\frac{\beta}{1-\alpha-\beta} > \frac{\beta}{1-\beta}$. This means that the impact of each parameter on the steady state
level of per capita income is larger than in the simple Solow formulation.
To exemplify, let’s consider a case with $\alpha = \beta = 1/3$, $\delta = 3\%$ and $\gamma = 2\%$. This
gives:
$$\frac{y_R}{y_L} = \left( \frac{s_R}{s_L} \right) \left( \frac{s_{HR}}{s_{HL}} \right) \left( \frac{n_L+0.05}{n_R+0.05} \right)^{2}$$
Now consider, for instance, a case where $s_R = s_{HR} = 0.2$, $n_R = 0.01$,
$s_L = s_{HL} = 0.10$ and $n_L = 0.03$. In this case, the ratio of per capita incomes between
the rich country and the poor country is 7.1:1. This is more than in the Solow model
with equivalent assumptions (1.92:1), but it is still only half-way to what we need!
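The arithmetic of this example can be reproduced with a few lines of code. The function name and argument names below are illustrative; the exponents are those of the steady state income ratio implied by (4.14), evaluated with α = β = 1/3 and γ + δ = 0.05 as in the example.

```python
def income_ratio(s_r, sh_r, n_r, s_l, sh_l, n_l,
                 alpha=1/3, beta=1/3, gamma_plus_delta=0.05):
    """Steady state ratio of per capita incomes (rich over poor) implied by (4.14),
    assuming that both countries share the same technology level A."""
    d = 1.0 - alpha - beta
    return ((s_r / s_l) ** (beta / d)
            * (sh_r / sh_l) ** (alpha / d)
            * ((n_l + gamma_plus_delta) / (n_r + gamma_plus_delta)) ** ((alpha + beta) / d))


# Parameter values of the rich country / poor country example in the text.
ratio = income_ratio(s_r=0.2, sh_r=0.2, n_r=0.01, s_l=0.10, sh_l=0.10, n_l=0.03)
print(f"y_R / y_L = {ratio:.1f}")  # 7.1
```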
In general, explaining real world cross-country income differences using
equation (4.14) delivers better results than using equation (3.10) (see Box 4.2). But the
empirical evidence also reveals that a significant part of the cross-country variation of
per capita incomes remains to be explained. In other words: accounting for the role of
Human Capital certainly improves the explanatory power of the model, but it is not
enough: one definitely needs to depart from the assumption that A is equal across
countries.
Box 4.2. Empirical test on the MRW model – steady state
Instead of calibrating equation (4.14) to compare pairs of countries, one may use
regression analysis to investigate more systematically how well the model accounts for
observed cross-country income disparities in the real world.
This was done by MRW in their paper. In order to run a linear regression, the
authors took logs in equation (4.14). The implied regression equation is:
$$\ln y_i^* = a + \frac{\beta}{1-\alpha-\beta} \left[ \ln s_i - \ln(n_i+\gamma+\delta) \right] + \frac{\alpha}{1-\alpha-\beta} \left[ \ln s_{Hi} - \ln(n_i+\gamma+\delta) \right] + u_i \tag{4.18}$$
with $a = \gamma t + \frac{\ln A}{1-\alpha-\beta}$.
Equation (4.18) implicitly assumes that all countries are in their steady states or,
more generally, that deviations from the steady state are random. The random
disturbance $u_i$ captures these deviations, as well as country specific effects determining
differences in the level of A, such as differences in resource endowments or in
climate 85. The Solow model corresponds to the particular case in which $\alpha = 0$.
Table 4.1 describes the estimation results obtained by Mankiw et al (1992). The
authors used a sample of 98 countries for the year 1985. The proxies used were: the
growth rate of the working-age population for n; the ratio of real investment to GDP for
s (period averages); the fraction of the eligible population enrolled in secondary
education for $s_H$; the real GDP per working age person for y; the authors also
postulated $\gamma + \delta = 0.05$ for all countries.
The estimates of the Solow model (equation 4.18 imposing $\alpha = 0$) are displayed
in the first column of Table 4.1 (standard errors are reported below each estimate). According to these estimates,
the Solow model accounts for 59% of cross-country variation of per capita incomes.
The estimated coefficients have the predicted signs and are significant. Nonetheless, the
implied value for $\beta$ is as high as 0.60 (using $1.48 = \beta/(1-\beta)$). This is just another
incarnation of the Lucas puzzle: in order for the observed differences in per capita
incomes to be accounted for by differences in physical capital endowments, one would
need a contribution of capital to production much larger than the corresponding
observed income shares. The coefficient on physical capital is biased upwards because
human capital is omitted from the regression equation.
85
Note that the possibility of country specific effects leads to a potential econometric problem: to the
extent that negative country specific effects (e.g., poor resource endowments) discourage capital
accumulation (as is likely), the error term will be correlated with the saving rate and therefore estimates will
be biased. This is one of the main objections to the empirical implementation of this model. To get around
this problem, some authors proposed panel data estimation, which allows for the control of country
(“fixed”) effects (Islam, 1995; Caselli et al., 1996).
Table 4.1. Estimation of the Solow model and the Augmented Solow model
Dependent variable: log GDP per working age person in 1985
Sample: non-oil countries. Observations: 98.

                            Solow model    Augmented Solow model
constant                        6.87               7.86
                               (0.12)             (0.14)
ln s - ln(n + 0.05)             1.48               0.73
                               (0.12)             (0.12)
ln s_H - ln(n + 0.05)                              0.67
                                                  (0.07)
Implied beta                    0.60               0.31
                               (0.02)             (0.04)
Implied alpha                                      0.28
                                                  (0.03)
R-squared                       0.59               0.78
S.E.E.                          0.69               0.51

Source: Mankiw et al (1992), Tables I and II. Notes: Standard errors are in parentheses.
The results using the augmented model (equation 4.18) are displayed in the
second column of Table 4.1. According to these estimates, this model explains 78% of
the cross-country variation in per capita incomes. All coefficients have the expected
signs and are highly significant. Moreover, the new estimate for $\beta$ (0.31) accords much
more closely with the observed facts. Taken together, the evidence presented is more
favourable to the augmented version of the neoclassical model than to its basic
version 86.
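The implied values of β and α follow from inverting the two slope coefficients of (4.18), which equal β/(1−α−β) and α/(1−α−β) respectively. The sketch below uses the rounded coefficients reported in Table 4.1, so the results match the implied shares in the table only up to rounding; the function name is illustrative.

```python
def implied_shares(coef_s, coef_sh=0.0):
    """Recover (beta, alpha) from the coefficients of regression (4.18).

    coef_s  = beta  / (1 - alpha - beta)
    coef_sh = alpha / (1 - alpha - beta)
    Adding one to their sum gives 1 / (1 - alpha - beta)."""
    scale = 1.0 + coef_s + coef_sh
    return coef_s / scale, coef_sh / scale


# Solow model (alpha = 0): only the coefficient on ln s - ln(n + 0.05) is used.
beta_solow, _ = implied_shares(1.48)

# Augmented model: both coefficients from the second column of Table 4.1.
beta_mrw, alpha_mrw = implied_shares(0.73, 0.67)

print(f"Solow:     implied beta = {beta_solow:.2f}")
print(f"Augmented: implied beta = {beta_mrw:.2f}, implied alpha = {alpha_mrw:.2f}")
```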
Conditional convergence
In Figure 3.5, we showed that there is no general tendency for poor countries to
grow faster than rich countries. In Section 3.5, it was argued that this finding is fully
consistent with the neoclassical growth model: the model allows countries to reach
different steady states. The model also implies that countries grow faster the farther they
are from their steady states. This property of the neoclassical model is called
conditional convergence.
86
Mankiw et al (1992) also estimated the model for a sample of OECD countries only. The results with
the MRW specification were, however, disappointing: the R-squared was 0.28 and the coefficients on
investment and population growth were not significant. A natural explanation is that WWII caused
significant departures from the steady state in this sub-sample. Since the regression model (4.18) does not
account for transition dynamics, it cannot isolate the fact that, in these economies, the investment rates
and population growth rates have not yet delivered their full impact on per capita income.
Empirically, the conditional convergence hypothesis may be tested by observing
the relationship between growth and the initial level of per capita income, after the
variables determining the steady state are controlled for (remember equation 3.14): if a
significantly negative partial association between growth and initial per capita income is
found, this is taken as evidence of conditional convergence.
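Figure 4.5 Evidence of Conditional Convergence (98 countries)
[Figure: scatter plot of growth, conditional on saving, population growth and human capital (vertical axis, from -1 to 1.5), against initial per capita income (horizontal axis, from 5 to 10), with fitted line y = -0.3x + 2.4747 and R² = 0.4144.]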
Figure 4.5 shows the results of a test on conditional convergence performed by
MRW using their theoretical model (details in Box 4.3). The vertical axis measures
the growth rate of per capita income, after controlling for the effects of the saving rate,
investment in human capital and the population growth rate (e.g., using equation 4.14).
The horizontal axis measures the initial per capita income.
Clearly, the figure reveals a strong negative association between growth and
initial incomes, after controlling for the different steady states. This evidence contrasts
strongly with the finding in Figure 3.5, which uses the same data but without controlling
for differences in the steady state.
The conclusion is that the worldwide evidence is not supportive of absolute
convergence, but it is supportive of the neoclassical proposition of conditional
convergence.
Box 4.3. Empirical implementation of the MRW model – Conditional Convergence
Estimating equation (4.18), one implicitly assumes that countries are already in
their steady states. To overcome this limitation, Mankiw et al (1992) also estimated a
version of the model accounting for the possibility of countries being engaged in
transition dynamics. To understand their test, let’s go back to equation (3.14), which
describes the transition dynamics in the Solow model. This equation also holds in the
MRW model, after adapting the parameter measuring the “speed of convergence”, $\nu$, to the
existence of human capital: $\nu = (1-\alpha-\beta)(n+\gamma+\delta)$ 87.
Then, one can test for conditional convergence by investigating the sign and
significance of parameter b (3.15), without forgetting to control for the determinants of
the steady state in the MRW model (that is, replacing $y^*$ by 4.18).
Formally, the following empirical model is obtained:
$$\ln y_t - \ln y_0 = \theta - b \ln y_0 + \frac{b\,\beta}{1-\alpha-\beta} \left[ \ln s - \ln(n_i+\gamma+\delta) \right] + \frac{b\,\alpha}{1-\alpha-\beta} \left[ \ln s_H - \ln(n_i+\gamma+\delta) \right] \tag{4.19}$$
where $\theta = \gamma t + b \, \frac{\ln A}{1-\alpha-\beta}$, $b = 1 - e^{-\nu t}$ and $\nu = (1-\alpha-\beta)(n+\gamma+\delta)$.
Table 4.2 illustrates how the test on conditional convergence was implemented
by Mankiw et al. (1992), for a worldwide sample and for the OECD economies. The
third column of Table 4.2 also shows the results of a similar estimation, for a sample of
37 African countries88.
As shown in the table, in the three samples, all coefficients have signs in
accordance with (4.19) and are statistically significant. The estimated coefficients of $\ln y_0$
are negative and significant at the 5% level. This means that the conditional
convergence hypothesis holds in the three samples. The partial association between the
growth rate of per capita GDP and the initial level of per capita GDP, as implied by the
first column of Table 4.2, is displayed in Figure 4.5.
With respect to the parameter estimates, Table 4.2 points to some differences
between the OECD and African countries. In particular, the estimated average speed of
conditional convergence ($\nu$) is 1.7% for African countries and 2.1% for OECD
countries. This means that the time required to eliminate half of the initial gap from
their steady states in Africa is about 42 years, which compares to 35 years in the OECD
sample 89. Possibly, the lower speed of convergence in Africa is due to a lower ability
to attract capital, probably due to bad economic policies, political instability, etc. (but
remember the model we are using is silent with respect to the effect of these factors!).
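The link between the estimated coefficient on ln y_0 and the speed of convergence can be sketched as follows. The sketch assumes that the coefficient equals −(1 − e^{−νt}), with t = 25 years taken from the 1960-1985 sample period; it uses the rounded coefficients of Table 4.2, so the implied speeds agree with the table only up to rounding. The half-life formula is the one given in footnote 89, and half-lives computed from rounded speeds need not coincide exactly with the figures quoted in the text.

```python
import math


def implied_speed(coef_ln_y0, years=25.0):
    """Speed of convergence nu implied by a coefficient -(1 - exp(-nu * t)) on ln y0."""
    return -math.log(1.0 + coef_ln_y0) / years


def half_life(nu):
    """Years needed to close half of the gap to the steady state: exp(-nu * t) = 0.5."""
    return math.log(2.0) / nu


# Rounded coefficients on ln y0 from Table 4.2.
speeds = {sample: implied_speed(coef)
          for sample, coef in {"Non-oil": -0.30, "OECD": -0.40, "Africa": -0.35}.items()}

for sample, nu in speeds.items():
    print(f"{sample:8s} implied speed of convergence = {nu:.4f}")

# Illustrative value: a speed of 2% per year.
print(f"half-life at nu = 0.02: {half_life(0.02):.1f} years")
```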
Table 4.2 – Tests for conditional convergence
Dependent variable: log difference GDP per working age person 1960-1985

Sample                           Non-oil      OECD      Africa
Observations                        98         22         37
constant                           2.46       3.55       2.16
                                  (0.48)     (0.63)     (0.65)
ln y_0                            -0.30      -0.40      -0.35
                                  (0.06)     (0.07)     (0.10)
ln s - ln(n + 0.05)                0.50       0.40       0.44
                                  (0.08)     (0.15)     (0.16)
ln s_H - ln(n + 0.05)              0.24       0.24       0.31
                                  (0.06)     (0.14)     (0.17)
Implied beta                       0.48       0.38       0.40
                                  (0.07)     (0.13)
Implied alpha                      0.23       0.23       0.28
                                  (0.05)     (0.11)
Implied speed of convergence       0.014      0.021      0.017
                                  (0.00)     (0.00)
R-squared                          0.46       0.66       0.41
S.E.E.                             0.33       0.15

Source: Non-oil countries and OECD: Mankiw et al (1992), Table VI. African countries: Murthy and
Ukpolo (1999), Table 1. Standard errors in parentheses.
87
Note that $\nu$ in the MRW model is slightly different from that in the Solow model (Appendix 3.1). The
latter shall be seen as a particular case, with $\alpha = 0$. This means that the inclusion of human capital generates
a larger role for transitional dynamics than in the Solow model. It is also worth noting that by imposing a
common convergence parameter $\nu$ (while population growth rates differ across countries), there is a
potential bias (see the discussion in Lee et al., 1997). Attempts to address this limitation include Evans (1997)
and Arnold et al. (2007).
88
Murthy and Ukpolo (1999). These authors used the same dataset as Mankiw, Romer and Weil (1992)
except that their measure of investment in human capital also includes primary school attainment.
89
To assess how long it takes an economy to get halfway to its balanced growth path, the reader is
referred to equation (3.5), though adjusting for the new definition of $\nu$ in equation (4.19): the answer is
$e^{-\nu t} = 0.5$, which solves for $t = -\ln(0.5)/\nu$. Note that, with the inclusion of human capital, the
parameter $\nu$ is now lower, implying a slower convergence than in the Solow model (Box 3.5).
4.5. Productivity matters!
The contribution of MRW was criticized by a number of authors, on the
grounds that it emphasizes the role of factor accumulation as the explanation for cross-country
income differences, disregarding differences in technology 90. Such an approach
is in line with the traditional Solow model (which focuses on physical capital
accumulation), and also with the results of Alwyn Young, that the East Asian growth
miracles were fuelled by old-style factor accumulation, rather than by technological
change (Box 3.5).
A seminal contribution to this criticism was formulated by Klenow and
Rodriguez-Clare (1997). These authors first pointed out that the proxy used by MRW
for human capital, the secondary school attainment rate, is likely to overestimate the
cross-country variation of human capital. True, richer countries not only have more
schooling but also better schooling than poor countries, so by the quality factor one
would expect proxies based on educational attainment to underestimate the cross-country
disparities of human capital. However, secondary school attainment varies much more
across countries than primary school attainment. Because MRW did not include
primary school attainment in their proxy, the balance was likely to imply an
overestimation of the cross-country differences in human capital stocks.
90
“This paper takes Robert Solow seriously”, wrote the authors in the first paragraph of their seminal
article (Mankiw, Romer and Weil, 1992, p. 407).
Using a more sophisticated approach to estimate proxies for human capital,
Klenow and Rodriguez-Clare concluded that the role of human capital in explaining
cross-country differences in per capita incomes is substantially lower than what MRW
made us believe. In their reasoning, the authors calibrated a production function to
assess the contribution of inputs and of productivity, and displayed results both in levels
and in changes. The following two sections summarize some of their results.
Development accounting
As a starting point, Klenow and Rodriguez-Clare considered a production
function equal to (4.6). They focused, however, on a transformation of it, which you can
easily obtain by dividing both sides by $Y^{\alpha+\beta}$ and rearranging:
$$y_t = A_t^{\frac{1}{1-\alpha-\beta}} \left( \frac{K_t}{Y_t} \right)^{\frac{\beta}{1-\alpha-\beta}} \left( \frac{H_t}{Y_t} \right)^{\frac{\alpha}{1-\alpha-\beta}} \tag{4.20}$$
As explained in Section 3.6, transforming the production function so as to have
capital-output ratios on the right-hand side has the advantage of abstracting from the
capital accumulation that is induced by technological change, to measure only the
capital accumulation induced by changes in the behavioural parameters. The same
consideration applies in this extended model. Accordingly, the first term on the right-hand
side measures the Harrod Neutral level of technology.
Klenow and Rodriguez-Clare calibrated equation (4.20) for 98 countries over the
period 1960-85, assuming $\beta = 0.30$ and $\alpha = 0.28$. Then, they expressed all country
variables as a percentage of the corresponding values in the United States (that is,
US=1.00).
Their results for a selection of countries are displayed in Table 4.3. In the table,
take Tanzania, for instance. In 1985, per capita income in this country was only 3% of
that in the United States. How much of that difference could be explained by physical
capital alone? In 1985, the K/Y ratio in Tanzania was 59% of that in the United States.
If human capital played no role in the production function (i.e., if $\alpha = 0$), then the
contribution of the capital-output ratio would be $(K/Y)^{\frac{0.3}{1-0.3}} = 0.79$. That is, with equal
productivity and no role for human capital, Tanzania should have a per capita income
equal to 79% of that in the US! Now, let’s include human capital in the production
function. In the table, we see that the ratio H/Y in Tanzania was 37% of the
corresponding variable in the United States. Hence, the joint contribution of physical and
human capital to relative per capita income was equal to:
$$X = 0.59^{\frac{0.30}{1-0.28-0.30}} \times 0.37^{\frac{0.28}{1-0.28-0.30}} = 0.35.$$
Thus, adding human capital improved the estimate, but we are still very far from
reality. In light of (4.20), the remaining difference can only be accounted for by the
productivity parameter:
$$A^{\frac{1}{1-\alpha-\beta}} = \frac{0.03}{0.35} \approx 0.08.$$
Thus, in the case of Tanzania, the main reason for per capita income to be so
small relative to that of the United States is productivity, not human or physical capital
accumulation.
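The Tanzania calculation can be reproduced by applying (4.20) to variables expressed relative to the United States. This is a sketch using the rounded inputs quoted in the text (Y/L = 0.03, K/Y = 0.59, H/Y = 0.37, β = 0.30, α = 0.28); the function names are illustrative, and the results match the published figures only up to rounding.

```python
def factor_contribution(k_y, h_y, alpha=0.28, beta=0.30):
    """Joint contribution X of K/Y and H/Y to relative income, from (4.20)."""
    d = 1.0 - alpha - beta
    return k_y ** (beta / d) * h_y ** (alpha / d)


def productivity_term(y_rel, x_rel):
    """Relative Harrod neutral productivity, the residual A**(1/(1-alpha-beta)) in (4.20)."""
    return y_rel / x_rel


# Tanzania in 1985, relative to the United States (US = 1.00).
y_rel, k_y, h_y = 0.03, 0.59, 0.37

k_only = factor_contribution(k_y, 1.0, alpha=0.0)   # physical capital alone (alpha = 0)
x = factor_contribution(k_y, h_y)                   # physical and human capital
a = productivity_term(y_rel, x)                     # what is left for productivity

print(f"Physical capital alone:      {k_only:.2f}")
print(f"Factor contribution X:       {x:.2f}")
print(f"Productivity (Harrod neutral): {a:.2f}")
```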
Table 4.3 – Development Accounting (1985, US=1.00)

                   Output per                       Factor          Productivity
                   working-age                      contribution    (Harrod neutral)
                   person
                   Y/L        K/Y       H/Y         X
SOUTH AFRICA       0.29       0.84      0.45        0.52            0.57
TANZANIA           0.03       0.59      0.37        0.35            0.08
BRAZIL             0.32       0.70      0.40        0.42            0.77
INDIA              0.08       0.71      0.38        0.41            0.20
INDONESIA          0.13       0.59      0.45        0.40            0.32
MALAYSIA           0.31       0.68      0.51        0.48            0.63
FRANCE             0.80       1.47      0.45        0.77            1.04
NETHERLANDS        0.85       1.28      0.61        0.86            0.98
PORTUGAL           0.34       1.21      0.34        0.56            0.60
TURKEY             0.21       0.79      0.37        0.44            0.48
U.K.               0.68       1.23      0.64        0.86            0.79
PAPUA N.GUINEA     0.10       1.08      0.26        0.43            0.23

Source: Klenow and Rodriguez-Clare (1997).
Inspecting Table 4.3 for other countries, we see that, with the exception of
France, factor accumulation alone does not fully account for per capita income
differences vis-à-vis the United States: productivity differences are a very important
source of divergence.
To look at the 98 countries in the sample at the same time, we refer to Figure
4.6. The figure plots the combined contribution of physical and human capital (X)
against per capita incomes, with all variables being measured as a percentage of the
corresponding values in the United States. As the figure reveals, most observations
fall on the right hand side of the 45 degree line, meaning that, in general,
factor accumulation alone tends to underestimate the observed income gaps.
Figure 4.6 – Development Accounting: contributions of factor accumulation versus per capita income (US=1.00)
[Scatter plot. Horizontal axis: factor contribution (labour, human capital, physical capital), US=1.00. Vertical axis: per capita income, US=1.00.]
Source: Klenow and Rodriguez-Clare (1997).
The technique of calibrating a production function to measure productivity
differences vis-à-vis a reference country was named “development accounting” by King
and Levine (1994). Development accounting shares with growth accounting the feature
that it uses national accounts data to calibrate a production function and disentangle the
relative contributions of inputs and TFP to economic performance. Instead of measuring
growth rates, however, the method focuses on variables in levels, relative to a reference
country. In general, “development accounting” exercises reveal that a large proportion
of cross country income disparities cannot be explained by factor accumulation, human
capital included. Productivity differences play a major role.
Growth accounting
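In addition to development accounting, Klenow and Rodriguez-Clare analysed
equation (4.20) in changes, that is:
$$\hat{y} = \frac{1}{1-\alpha-\beta}\hat{A} + \frac{\alpha}{1-\alpha-\beta}\left(\hat{K}-\hat{Y}\right) + \frac{\beta}{1-\alpha-\beta}\left(\hat{H}-\hat{Y}\right) \tag{4.21}$$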
In this formulation, the contributions of K and H are measured only to the extent
that their growth rates exceed output growth. When the economy is in the steady state,
K/Y and H/Y are both constant, so all growth of per capita income is accounted for by
the Harrod neutral rate of technological progress, $\gamma = g/(1-\alpha-\beta)$. When, in contrast,
there is a change in the propensity to invest in physical or human capital, this will show
up in the decomposition, because the ratios K/Y and H/Y will be in transitory moves.[91]
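To make the decomposition concrete, the sketch below applies it to made-up growth rates, backing out the productivity term as a residual. The function name, the dictionary keys and the numbers are ours and purely illustrative; the shares 0.30 and 0.28 are assumed as defaults.

```python
def growth_accounting(output_growth, labour_growth, capital_growth,
                      human_capital_growth, alpha=0.30, beta=0.28):
    """Split growth of output per worker into the contributions of K/Y,
    H/Y and a residual productivity term, as in the decomposition above."""
    denom = 1.0 - alpha - beta
    per_worker_growth = output_growth - labour_growth
    k_term = alpha / denom * (capital_growth - output_growth)
    h_term = beta / denom * (human_capital_growth - output_growth)
    tfp_term = per_worker_growth - k_term - h_term  # residual: A-hat / (1 - alpha - beta)
    return {"y": per_worker_growth, "K/Y": k_term, "H/Y": h_term, "TFP": tfp_term}

# Hypothetical economy: Y grows 4%, N 2%, K 5% and H 4% a year.
example = growth_accounting(0.04, 0.02, 0.05, 0.04)
for name, value in example.items():
    print(f"{name}: {value:.4f}")

# On a balanced path (K, H and Y growing at the same rate) the factor terms
# vanish and all per worker growth is attributed to productivity.
balanced = growth_accounting(0.03, 0.01, 0.03, 0.03)
```

In the hypothetical example, capital grows faster than output, so K/Y contributes positively. Human capital grows exactly as fast as output, so its contribution is zero.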
Figures 4.7 to 4.9 illustrate the implementation of equation (4.21) by Klenow
and Rodriguez-Clare (1997) for the sample of 98 countries over the period 1960-85,
assuming α = 0.30 and β = 0.28. The figures plot the growth rates of Y/N against the growth
rates of, respectively, H/Y, A and K/Y.
[91] Of course, these decompositions are plagued by the fact that they ignore causal relationships between
the different variables. For instance, an increase in the level of schooling can boost growth through its
effect on technology adoption. Any attempt to isolate the contribution of the different factors using
growth accounting is always a limited exercise.
Figure 4.7 - Growth rates of output per worker versus human capital per unit of output, 98 countries, 1960-1985
[Scatter plot. Horizontal axis: H/Y (% change, 1960-85). Vertical axis: Y/N (% change, 1960-85). R² = 0.0779.]
Source: Klenow and Rodríguez-Clare (1997).
Figure 4.8 - Growth rates of output per worker versus total factor productivity, 98 countries, 1960-1985
[Scatter plot. Horizontal axis: TFP (% change, 1960-85). Vertical axis: Y/N (% change, 1960-85). R² = 0.759.]
Figure 4.9 - Growth rates of output per worker versus physical capital per unit of output, 98 countries, 1960-1985
[Scatter plot. Horizontal axis: K/Y (% change, 1960-85). Vertical axis: Y/N (% change, 1960-85). R² = 0.0018.]
It is worth noting that all graphs exhibit positive correlations, suggesting that
both TFP and factor accumulation are important for growth. The correlations differ
substantially, however: the correlation is much stronger in the case of TFP
(R² = 0.759) than in the cases of human capital (R² = 0.0779) and physical capital
(R² = 0.0018). In other words, the driver that best explains the cross-country growth
differences is TFP change, not human and physical capital accumulation.
This evidence not only confirms the importance of TFP, as we already found
in similar exercises with the Solow model, but also shows that differences in TFP growth
play an important role in discriminating among growth performances.
4.6. Discussion.
The basic formulation of the Solow model stresses the role of physical capital
(as implied by saving rates and population growth rates) in explaining cross-country
income differences. Although this prediction accords broadly with the empirical evidence
in terms of signs, it does not in terms of magnitudes. Further investigating this
relationship, one concludes that the weight given to physical capital in the neoclassical
production function is too low to account for the existing per capita income disparities.
One solution to this problem is to increase the weight of reproducible inputs in
the production function. Such an avenue was proposed by Mankiw, Romer and Weil in
their extension of the Solow model.
The MRW model inherits an important drawback from the Solow model: while
it does quite a good job of describing economic growth, it is not capable of explaining
it. Because it assumes perfect competition and the absence of market failures,
the model is doomed to take as exogenous the key factor that drives long run growth:
technological progress.
Without question, augmenting the model so as to include human capital allows it
to better capture real world facts. However, further empirical scrutiny of the augmented
model revealed that a significant proportion of per capita income differences cannot be
accounted for by physical and human capital accumulation: “technology” plays a very
important role in explaining cross-country differences in economic performance.
All in all, we definitely need a much better understanding of what's behind this
parameter that we have so far called TFP or “technology”. In doing so, one should take
into account that differences in A are not necessarily due to technology in the narrow
sense: if two countries have access to the same knowledge but make use of that
knowledge with different organization schemes, these differences will show up as TFP
differences. The notion of “technology” embodied in TFP not only reflects differences
in the quality of technology used in production but also differences in the organization
of production and policies affecting the efficiency with which existing factors are
utilised.
In the following chapters, we categorize changes in TFP in light of the two
components identified in equation (3.1):
- Differences in the effectiveness with which factors of production and a given
  technology are combined to produce output (parameter A in equation 3.1).
  We dub these differences as differences in “efficiency”. This direction of
  analysis will be explored in chapters 6, 10, 11 and 13.
- Differences in the pace of adoption of more advanced technologies
  (parameter g in equation 3.1). This direction of analysis will be explored in
  chapters 7, 8, 9 and 12.
- The Solow model is capable of predicting cross-country differences in
  per capita income in terms of signs, but it performs less well in
  predicting the existing magnitudes. The reason is that the fundamental
  parameter through which cross-country differences are mediated (the
  capital-output elasticity) is, according to the model's assumptions, too
  small to capture the real world facts.
- This limitation of the model was put in a simple way by the Nobel
  laureate Robert Lucas: if differences in per capita income were basically
  explained by differences in the availability of capital per worker, then the
  return to capital should be higher in capital scarce countries and capital
  should flow from rich countries to poor countries. This doesn't happen in
  reality.
- The empirical evidence suggests that before the financial globalization of
  the 1990s the marginal product of capital was indeed higher in capital
  scarce countries, but the evidence for the most recent years reveals that
  marginal products are now roughly equalized across the world. The
  conclusion is that the different availability of physical capital per worker
  is not enough to explain why some countries are richer than others.
- In order to solve these inconsistencies, two avenues were proposed in the
  literature: one is to enlarge the concept of capital, so as to include other
  reproducible factors, such as human capital. The other avenue is to
  abandon the idea that TFP is the same across countries. This chapter
  addressed the first avenue.
- Human capital includes both education and health. Like physical capital,
  human capital can be accumulated through investment, and it depreciates
  over time. The return to human capital is combined with the reward to
  raw labour, in the form of wage compensation.
- Theoretically, when the neoclassical production function is augmented
  with human capital, physical capital abundance does not necessarily
  imply low returns on physical capital: a high endowment of human
  capital can fix this.
- An implication of augmenting the neoclassical growth model with
  human capital is that the saving rate and the investment rate in human
  capital reinforce each other in the determination of per capita output.
  Thus, the augmented model is much more capable of predicting real
  world cross-country income differences.
- Still, empirical assessments using the extended model reveal that
  physical capital and human capital together are not enough to account
  fully for the existing cross-country per capita income differences.
- This means that, in order to have a complete picture of why some
  countries are richer than others, one should learn more about this
  parameter that we label TFP or “technology”.
Problems and Exercises
Key concepts
- Human Capital
- The Lucas (1990) paradox
- Development accounting
Essay questions:
a) Comment: “Because poor countries have less physical capital per
worker, returns to capital should be higher there”.
Exercises
4.1.
Consider the following production function: $Y_t = 3K^{1/3}N^{1/3}H^{1/3}$. Assume that
N=1 and H=8. Compute the marginal product of physical capital and display it in a graph.
Explain what happens to the marginal product of capital when H increases to H=27.
4.2.
In Micronia the production function of each individual producer is given by
$Y_i = A_t K_i^{1/3} N_i^{2/3}$, where N is the number of workers. In this country the saving rate
is 25%, population is constant and the capital stock depreciates at 3% a year.
a) Assume that $A_t = Ae^{\frac{0.04}{3}t}$ and A=5.
i. Find the equilibrium values of K/L, Y/L and K/Y, where L is
labour in efficiency units. Plot them in a graph and explain the
stability of the equilibrium.
ii. Compute the interest rate and N/Y. Are these values consistent
with the empirical evidence?
iii. Consider another economy with the same structure as the one above,
except for the saving rate. What would the saving rate of that
economy have to be for its income to be 1/10 of the level found in
a(i)? Explain the result.
b) What are the advantages of the specification of Mankiw, Romer and
Weil (1992) compared to the model of a(i)?
4.3.
Consider economy P, where the production function takes the following form,
$Y_t = AK^{\alpha}N^{1-\alpha}$, and where the share of labour in national income is 2/3. It is also
known that per capita income in this economy is about 20% of the corresponding level
in economy R.
a) If the only difference between P and R were the level of capital per worker,
how large should k be in P, as compared to R?
b) In that case, how high should the interest rate be in P as compared to R?
Could such a difference be explained by risk premiums?
c) Suppose the data revealed k in P to be roughly 51.2% of the corresponding
level in R. If that were the case: (c1) which parameter in the model should
capture the remaining difference in per capita income? How much should
that parameter differ between the two countries? (c2) would the marginal products
of capital still differ between the countries? By how much? Could that difference
be explained by risk premiums and other impediments to capital mobility?
5. The AK model
“…a level effect can appear as a growth effect for long periods of time, since
adjustments in real economies may take place over decades”. [Sachs and Warner].
Learning Goals:
- Understand why, by getting rid of diminishing returns, one can obtain
  unceasing growth through factor accumulation
- Review the models of endogenous growth based on factor accumulation
- Understand why, in these models, making the saving rate endogenous
  enhances the relationship between TFP and economic growth.
5.1. Introduction
In the previous chapters, we learned that, if there are diminishing returns to
the reproducible factors, factor accumulation cannot, by itself, explain the long-term
growth of per capita income. For this reason, growth in the neoclassical model is
achieved by postulating an exogenous rate of technological progress.
In this chapter it is shown that, by getting rid of diminishing returns, one can
obtain continuous growth of per capita output without the need to postulate an
exogenous rate of technological progress. The basic model introduced in this chapter is
the AK model. The AK model differs critically from the Solow model in that it relies on
a production function that is linear in the stock of capital. With this change, the model
implies a continuous growth of per capita income without any tendency for the
economy to approach a steady state. Moreover, in this model, a rise in the investment
rate has a permanent effect on the growth rate of per capita income. In contrast, the
Solow model predicts that a developing country succeeding in raising its saving rate
will achieve a higher level of income per capita in the long run, but it will experience
faster growth only temporarily.
The pitfall of the AK model is that the assumption of diminishing returns plays a
very central role in economic thinking. Because of this, economists have remained
suspicious about its validity. Some of the sections below and in the following chapter
describe alternative theories and models that have been proposed to motivate the
abandonment of diminishing returns to capital. This chapter also explains how
endogenous growth can be obtained in the context of a neoclassical model with
diminishing returns to capital. Although none of the models described in this chapter should
be seen as the true model, they all offer avenues to think about the mechanics of
economic development.
Section 5.2 describes the AK model in its simplest formulation. Section 5.3
describes the predecessor of the AK model, the Harrod-Domar model. Section 5.4
extends the AK model to the case of endogenous savings, to motivate the distinction
between proximate and fundamental causes of economic growth. Sections 5.5 through
5.7 describe alternative emulations of the AK model. Section 5.8 reviews some
empirical controversies on the Solow/AK debate. Section 5.9 concludes.
Getting rid of diminishing returns
Consider a closed economy similar to that described by the simple Solow model,
but assume instead that production is a linear function of the capital stock (K). For your
convenience, the main equations of the model are reproduced here:
$$Y_t = AK_t, \quad A > 0 \tag{5.1}$$
$$Y_t = C_t + S_t \tag{2.5}$$
$$sY_t = I_t \tag{2.7}$$
$$\dot{K}_t = I_t - \delta K_t \tag{2.8}$$
$$n = \dot{N}_t / N_t \tag{2.9}$$
As in the basic Solow model, the parameter A, which stands for technology (or
aggregate efficiency), is assumed constant.
Comparing (5.1) with (2.1), we see that now production depends only on the
reproducible factor and that there are no diminishing returns to this factor. The reader
may get suspicious about this formulation. Doesn’t labour have any role in production?
To keep things simple, for the moment, just ignore this question. We’ll return to it in a
minute.
Dividing (5.1) by N, one obtains a linear relationship between per capita income
and capital per worker:
$$y_t = Ak_t \tag{5.2}$$
Using (2.7), (2.8) and (2.9), the fundamental dynamic equation takes the
following form:
$$\dot{k}_t = sAk_t - (n+\delta)k_t \tag{5.3}$$
This equation is similar to (2.14), with the only difference that now α = 1. This
small difference has, however, an important implication: since both the saving
schedule, sAk, and the break-even investment line, (n+δ)k, are linear in k, only by an
exceptional coincidence of parameters (sA = n+δ) would the two loci coincide. Hence, the
general case in the AK model is one of no steady state. In particular, if sA > n+δ, the
capital labour ratio will expand forever, at a constant growth rate. Note that this conformity with real
world experience is now achieved without the need to postulate any exogenous
technological progress.
Dividing (5.3) by k, one obtains the equation that describes the growth rate of
capital per worker in this economy:
$$\frac{\dot{k}}{k} = sA - n - \delta$$
Since output is linear in K, the growth rate of capital per worker is also the
growth rate of per capita income. That is:
$$\gamma = sA - n - \delta \tag{5.4}$$
This equation states that the growth rate of per capita income rises with total
factor productivity (A) and the saving rate (s) and declines with the depreciation of the
capital-labour ratio (n and δ). Because the growth rate in this model is determined by the
other parameters, instead of being pre-determined by an exogenous assumption, the model is
labelled as one of endogenous growth.
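As a quick numerical check of this result, the sketch below integrates the law of motion of capital per worker in small time steps and compares the outcome with growth at the constant rate sA - (n+δ). The parameter values and function names are hypothetical, chosen only for illustration.

```python
import math

def ak_growth_rate(s, A, n, delta):
    """Growth rate of per capita income in the AK model."""
    return s * A - (n + delta)

def simulate_ak(k0, s, A, n, delta, years, dt=0.01):
    """Euler integration of dk/dt = s*A*k - (n + delta)*k, starting from k0."""
    k = k0
    for _ in range(int(round(years / dt))):
        k += (s * A * k - (n + delta) * k) * dt
    return k

# Hypothetical parameters: s = 20%, A = 0.5, n = 2%, delta = 5%.
gamma = ak_growth_rate(0.20, 0.5, 0.02, 0.05)
k_after_10 = simulate_ak(1.0, 0.20, 0.5, 0.02, 0.05, years=10)
k_exact = 1.0 * math.exp(gamma * 10)  # constant exponential growth
print(f"growth rate = {gamma:.3f}, simulated k = {k_after_10:.4f}, exact k = {k_exact:.4f}")
```

Since the growth rate does not depend on k, the simulated path shows no slowdown: capital per worker (and hence income per capita) keeps growing at about 3% a year under these assumed values.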
A Graphical Illustration
Figure 5.1 describes the dynamics of the AK model. The horizontal axis
measures the capital labour ratio (k). The vertical axis measures output per capita (y).
The top line shows the production function in the intensity form, (5.2); the middle line
corresponds to gross savings per capita (the first term in the right hand side of 5.3); the
lower of the three lines is the break-even investment line (the second term in the right
hand side of 5.3).
Since the production function is now linear in k, the locus representing gross
savings never crosses the break-even investment line (compare with Figure 2.3). This
means that there is no steady state: as long as sA > n+δ, per capita output will grow
forever.
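Figure 5.1. The AK model
[Diagram. Horizontal axis: capital per worker, k = K/N, with the initial value k0 marked. Vertical axis: output per capita, y = Y/N. Three lines through the origin, from top to bottom: y = Ak, sy and (n+δ)k.]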
What happens if the saving rate increases?
To see how changes in the exogenous parameters affect the growth rate of per
capita income, consider first a rise in the saving rate. In terms of Figure 5.1, this leads to
an upward shift of the saving schedule. Since all the remaining parameters of equation
(5.4) are unchanged, this means that the growth rate of per capita income rises
permanently. Also note that in this model there is no transition dynamics.
Figure 5.2 compares the paths of per capita income in the Solow model with
exogenous growth and in the AK model, following a once-and-for-all rise in the
saving rate at time t0 (the case with the Solow model was already
discussed in detail in Figure 3.3). The top part of the diagram shows levels and the
bottom part shows growth rates.
Comparing the two models, one concludes that:
- In the Solow model, the rate of growth of per capita income jumps initially to a
higher level, but then it declines slowly over time, until it returns (after t1) to the
previous level, given by the exogenous rate of technological progress. Because of
diminishing returns, the long run growth rate of per capita income is independent of the
saving rate. Remember that the model without exogenous growth (Chapter 2) is just a
particular case, with γ = 0.
- In the AK model, the rise in the saving rate has a permanent effect on growth:
there is no tendency for the growth rate of per capita income to decline as time goes by.
The growth rate of per capita output is proportional to the saving rate.
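The contrast between the two models can be illustrated with a small simulation. The sketch below assumes a Cobb-Douglas Solow economy with Harrod neutral technological progress at rate γ, starting from the steady state of the old saving rate. All parameter values and function names are hypothetical.

```python
def solow_growth_path(s_old, s_new, alpha, n, delta, gamma, years, dt=0.01):
    """Growth rate of per capita income in a Solow economy that starts at the
    steady state of s_old and saves s_new from time zero on (Euler steps)."""
    break_even = n + delta + gamma
    k = (s_old / break_even) ** (1.0 / (1.0 - alpha))  # capital per efficiency unit
    path = []
    for _ in range(int(round(years / dt))):
        k_growth = s_new * k ** (alpha - 1.0) - break_even
        path.append(gamma + alpha * k_growth)  # growth of income per capita
        k += k_growth * k * dt
    return path

def ak_growth_rate(s, A, n, delta):
    """Growth rate of per capita income in the AK model."""
    return s * A - (n + delta)

# Hypothetical values: the saving rate rises from 20% to 30%.
solow_path = solow_growth_path(0.20, 0.30, alpha=1/3, n=0.01, delta=0.05,
                               gamma=0.02, years=300)
ak_before = ak_growth_rate(0.20, A=0.4, n=0.01, delta=0.05)
ak_after = ak_growth_rate(0.30, A=0.4, n=0.01, delta=0.05)
print(f"Solow: {solow_path[0]:.4f} on impact, {solow_path[-1]:.4f} in the long run")
print(f"AK: {ak_before:.4f} before, {ak_after:.4f} ever after")
```

In the Solow economy the growth rate jumps on impact and then drifts back to γ. In the AK economy it moves to the new level and stays there.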
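Figure 5.2: The AK model and the Solow model compared for a rise in the saving rate
[Two stacked diagrams against time, with t0 and t1 marked on the time axis. Top panel: ln y, with the paths for the AK and Solow models and slopes labelled γ0 and γ1. Bottom panel: the growth rate of per capita income, with the levels γ0 and γ1 marked for the AK and Solow paths.]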
The Harrod Domar equation
A useful way to compare the AK model with the Solow model is to look at the
long run relationship between the average product of capital and the growth rate of per
capita income, in light of the two models (equations 3.11 and 5.4). For convenience, let's
use (5.1) in (5.4), to obtain a formula that applies to both models in the long run:
$$\gamma = s\frac{Y}{K} - n - \delta \tag{5.5}$$
This equation is known as the Harrod-Domar equation. The difference between
the two models refers to the variables that are exogenous and endogenous in this
equation. In both models, s, n and δ are exogenous. But the two models differ in respect
to the exogeneity of γ and Y/K: in the AK model, Y/K is exogenous and γ is
endogenous. By contrast, in the Solow model, γ is exogenous and Y/K is endogenous.
Hence:
- In the Solow model, a rise in the saving rate leads to a lower average
productivity of capital in the steady state. That is, Y/K declines from one steady state to
the other (Figure 3.2).
- In the AK model, Y/K is constant (equal to A). Hence, a rise in the saving rate
can only be accommodated in the model by an increase in the growth rate of per capita
income, γ.
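The two readings of the Harrod-Domar equation can be written as two small functions, one solving for Y/K and the other for the growth rate. This is a sketch with hypothetical names and parameter values.

```python
def solow_output_capital_ratio(s, n, delta, gamma):
    """Solow reading: the growth rate gamma is given, so Y/K adjusts."""
    return (gamma + n + delta) / s

def ak_growth_rate(s, A, n, delta):
    """AK reading: Y/K is fixed at A, so the growth rate adjusts."""
    return s * A - n - delta

# Hypothetical values: the saving rate rises from 20% to 30%.
n, delta = 0.01, 0.05
yk_low_s = solow_output_capital_ratio(0.20, n, delta, gamma=0.02)
yk_high_s = solow_output_capital_ratio(0.30, n, delta, gamma=0.02)
g_low_s = ak_growth_rate(0.20, 0.4, n, delta)
g_high_s = ak_growth_rate(0.30, 0.4, n, delta)
print(f"Solow: Y/K falls from {yk_low_s:.3f} to {yk_high_s:.3f}, growth stays at 0.020")
print(f"AK: Y/K stays at 0.400, growth rises from {g_low_s:.3f} to {g_high_s:.3f}")
```

The same equation is used in both cases. What changes is which variable is left free to absorb the higher saving rate.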
In sum, the AK model goes far beyond the neoclassical model in stressing the
relationship between economic policies and economic growth: government policies,
such as taxes and subsidies, that affect the consumption-saving decisions will also affect
the accumulation of physical capital and, hence, long term economic growth.
No convergence
Another important feature of the AK model is that it does not predict
convergence of per capita incomes, even among similar economies. According to (5.4),
two economies having the same technology and saving rates will enjoy the same
growth rate of per capita income, regardless of their starting position. This means that
their per capita incomes will evolve in parallel and there will be no tendency for the
poorer economy to “catch up” with the richer economy.
Moreover, since changes in technology (A) and in the saving rate (s) affect the
growth rates permanently, the time series of per capita income of any two
countries with different parameters will drift apart. In a world where policies differ
substantially across countries, the rule should be that of divergence of per capita
incomes, rather than of convergence.
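The absence of convergence can be seen directly from the closed-form path of per capita income implied by (5.2) and (5.4). The sketch below uses made-up parameter values, and the function name is ours.

```python
import math

def ak_income(k0, s, A, n, delta, t):
    """Per capita income at time t in the AK model: y = A * k0 * exp(gamma * t)."""
    gamma = s * A - (n + delta)
    return A * k0 * math.exp(gamma * t)

# Two hypothetical economies with identical parameters; the rich one starts
# with four times the capital per worker of the poor one.
common = dict(s=0.20, A=0.5, n=0.02, delta=0.05)
gap_start = ak_income(4.0, t=0, **common) / ak_income(1.0, t=0, **common)
gap_later = ak_income(4.0, t=50, **common) / ak_income(1.0, t=50, **common)

# Same starting point, but one economy saves 25% instead of 20%.
thrifty = dict(common, s=0.25)
drift = ak_income(1.0, t=50, **thrifty) / ak_income(1.0, t=50, **common)
print(f"income gap: {gap_start:.2f} at t=0, {gap_later:.2f} at t=50")
print(f"high saver relative to low saver after 50 years: {drift:.2f}")
```

With identical parameters the income ratio stays at its initial value. With different saving rates the ratio grows without bound as time passes.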
5.3. The Harrod Domar model
Historical context
The true predecessor of the AK model is the Harrod-Domar (HD) model. This
model was developed independently by a British economist, Sir Roy Harrod, and a
Russian American economist, Evsey Domar.[92] Their work preceded that of Solow by
several years and obviously it was not motivated by any explicit intention to improve on
the Solow model. Actually, the HD model was developed in the aftermath of the Great
Depression, as a dynamic extension of Keynes' General Theory, with the aim of discussing
the business cycle in the U.S. economy. Since unemployment was very high at that
time, the focus of the model was on the relationship between investment in physical
capital and output growth.
A similar story was proposed by another British economist, Sir Arthur Lewis
(Nobel Laureate in 1979), for the context of poor countries. In most developing countries,
availability of labour is not a problem, but lack of physical capital acts as a barrier to
economic development (see Box 5.2).
[92] Harrod (1939), Domar (1946).
The HD model
Instead of assuming a production function where capital and labour can
substitute for each other as inputs to production, assume that the production function
takes the following form:
$$Y_t = \min\{AK_t, BN_t\}, \tag{5.6}$$
where A and B are positive constants. This production function is known as the Leontief
production function (see Box 5.1). Under this technology, capital and labour cannot
substitute for each other in production, so changes in factor prices do not help in
promoting full employment. With such technology, there will be unemployment of
labour or unemployment of capital, depending on whether the proportion of labour
relative to capital in the economy as a whole is more or less than A/B.
The case of interest is the one with excess supply of labour. In that case, the
relevant branch of the production function (5.6) is the first one, implying a linear
relationship between output and K (just like in 5.1). In other words, since labour cannot
substitute for capital, the existence of excess labour implies that the only way to expand
production is by increasing the stock of physical capital. The assumption of surplus
labour turns the model into the AK model.
The rest of this story you already know: given an exogenous saving rate
and a population growth rate, using (2.5)-(2.9), you'll obtain the growth rate of per
capita income as described by (5.4).
The main limitation of the Harrod-Domar model is that factor prices play no role. The
parameters determining economic growth (s, A, B, n and δ) are all exogenous in this
model. Hence, the model entails no mechanism to drive the economy towards full
employment of capital and labour (see Appendix 5.1 for details). This contrasts with the
Solow model, where capital and labour can substitute for each other and price flexibility
ensures that full employment is achieved at each moment in time.
Box 5.1 The Leontief production function
The Leontief production function rules out the possibility of substitution
between inputs. This contrasts with the Cobb-Douglas production function, where
inputs are also essential to production, but they can substitute for each other: for
instance, one may substitute capital for labour while keeping the output level constant.
As an illustration, suppose that you were producing meals (Y), each one
consisting of one steak (K) and two eggs (N). That is, to produce any amount of this
output you would need steaks and eggs in a proportion of two eggs per steak and
you could not get around this rule: if you did, you would not be producing meals of
one steak with two eggs.
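Figure 5.3: The Leontief production function
[Diagram. Horizontal axis: N (values 2, 4 and 8 marked). Vertical axis: K (values 1 and 2 marked). A ray from the origin labelled k = B/A = 1/2, two isoquants labelled Y=1 and Y=2, and the points R, S and T.]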
Figure 5.3 describes this technology. To be consistent with the meal example,
the figure postulates A=1 and B=0.5. Thus, to produce one meal (Y=1), you need at
least one steak and two eggs (Point R). If you had one steak and 8 eggs, your maximum
production would still be equal to one meal (point S).
Now think that this production function applies to the economy as a whole and
that K and N refer to capital and labour. If the economy's endowments were K=1 and
N=8, the economy would be producing Y=1 only and there would be unemployment of
labour (in point S, 6 units of labour are wasted). Note that with such technology, factor
prices do not help driving the economy to full employment: even if labour was very
cheap, since labour cannot substitute for capital, unemployment would not be absorbed,
unless more capital became available.
Clearly, in this model, expanding the amount of labour does not deliver
economic development. If however we managed to increase the stock of capital to K=2,
the output level would jump to Y=2 (point T). Moreover, unemployment would be
reduced to 4 units of labour.
Raising production by incrementing the stock of capital (K) in an economy with
surplus labour (N) is basically how the Harrod-Domar model works.
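The meal example can be checked with a few lines of Python. The function names are ours, and the defaults A=1 and B=0.5 are the values postulated in the box.

```python
def leontief_output(K, N, A=1.0, B=0.5):
    """Output under the Leontief technology Y = min(A*K, B*N)."""
    return min(A * K, B * N)

def unemployed_labour(K, N, A=1.0, B=0.5):
    """Labour left idle: endowment N minus the labour actually needed, Y/B."""
    return N - leontief_output(K, N, A, B) / B

for K, N in [(1, 2), (1, 8), (2, 8)]:
    print(f"K={K}, N={N}: Y={leontief_output(K, N)}, idle labour={unemployed_labour(K, N)}")
```

With K=1 and N=8, output is one meal and six units of labour are idle. Raising K to 2 doubles output and cuts idle labour to four units, as described in the box.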
Box 5.2 Surplus labour
An alternative model stressing the key role of physical capital for economic
development was proposed by Sir Arthur Lewis.[93] Lewis was concerned with the
reallocation of labour from traditional agriculture to modern manufacturing, a process
that underlies the transition of poor economies towards modern economic growth. This
process is labelled structural change.
[93] A famous quote from Lewis is that “the central fact of economic growth is rapid capital accumulation”
[Lewis (1954)].
Lewis modelled a developing economy as one with two coexisting sectors:[94] a
“subsistence” sector, where labour productivity is low and capital plays little role (say,
agriculture, Z); and a “modern” sector (industry, Y), that needs both labour and physical
capital to expand. In such an economy, structural change involves the expansion of the
modern sector, by drawing labour from the traditional sector.
The similarity with the HD model arises in that the modern sector is constrained
by capital availability. However, Lewis did not postulate a Leontief production
function: he assumed instead that the wage rate in the modern sector is determined
by the subsistence wage in the traditional sector, through an arbitrage condition
(workers would not migrate for lower pay).[95] Hence, there is no way of expanding
employment in the modern sector by setting lower wages: the only way of expanding
employment in the modern sector is by shifting upwards the demand for labour, which
in the model happens through capital accumulation. The mechanics of the model are as
follows: capitalists in the modern sector save out of their income (given by the
difference between production and wage payments) to buy more machinery; this, in
turn, shifts the demand for labour in the modern sector upwards. In response, some
workers migrate from the traditional sector to the modern sector (to operate the new
machines) and diminishing returns never show up.
In the model, the reason why the reallocation of labour to the modern sector
does not cause wages to increase is that the marginal product of labour in the traditional
sector is zero: the number of workers in the traditional sector is assumed to be so large
relative to the available land that the decrease in the number of workers there does not
affect the traditional output. This situation is labelled as one of surplus labour.[96]
Box 5.3. The ghost of financing gap
One of the reasons why the Harrod Domar equation (5.4) became so popular is
that it offers a simple and appealing formula to forecast economic growth. This formula
was also extensively used by international organizations, such as the World Bank, to
calculate a country's financing needs.
[94] Economies of this type are called dual economies.
[95] Note that wages in the modern sector need not be exactly equal to the subsistence wage in the rural
sector. If, for instance, there is some probability of the migrant worker not finding a job in the modern
sector, then the arbitrage condition will hold with expected wages. That is, the wage rate in the modern
sector has to be higher than in the traditional sector, to compensate for the probability of the migrating
worker remaining unemployed (a seminal contribution on this avenue is Harris and Todaro, 1970).
[96] Note that the existence of surplus labour in the traditional sector does not imply that wages should be
zero. By “traditional” sector is meant not only a sector where capital accumulation plays little role, but
also a sector where the economic organization is based on family and local communities. This contrasts with
the “modern” sector, where production is carried out for economic profits: in a modern sector, the
entrepreneur is expected to dismiss any worker producing less than his pay. In a family farm, in contrast,
dismissals are less likely. Even when the community size is so large that some members could be
dismissed without changing total output, the prevailing “social norms” would condemn such an extreme
solution. Lewis (1954): “And even in the severest slump the agricultural or commercial employer is
expected to keep his labour force somehow or other – it would be immoral to turn them out, for how
would they eat, in countries where the only form of unemployment assistance is the charity of relatives?”.
Still, communities may encourage the youngest to migrate to the modern sector.
In light of (5.4), in order to forecast a country's economic growth, one needs a
saving rate, a depreciation rate and a value for the average product of capital, A. Since
the latter is not readily available in national accounts, a common procedure is to proxy its
inverse by the ratio of net investment to the change in real GDP over two consecutive years.
This is the so-called "Incremental Capital-Output Ratio", ICOR. Specifically:
$$\text{ICOR} = \frac{\Delta K}{\Delta Y} = \frac{\text{net investment}}{\text{change in GDP}}$$
As an example, consider a poor economy where the ICOR = 3 and the observed
investment ratio (s) is 15%. Assuming a depreciation rate equal to 4%, equation (5.4)
implies that output will grow at 0, 15 3  0,04  0,01 .
Now suppose you were a World Bank economist for that economy, charged with
advising on poverty alleviation. You could well regard the predicted growth rate as too
low. If, for instance, population in that economy was growing at 2%, this would mean
that per capita income was actually declining. You could, then, use the HD equation the
other way around: how much should the investment rate be for this country to achieve
some desired rate of output growth? Suppose you wanted income per capita to grow at
2%. With the population rising at 2%, output would have to grow at 4%. With a measured
ICOR equal to 3 and a depreciation rate of 4%, the model implies that you would need an
investment rate of 3 x (0.04 + 0.04) = 24% of GDP. Since domestic
savings were only 15%, you could request the international donors to fill the "financing
gap", amounting to 9% of GDP.
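The forecasting and financing-gap arithmetic above can be reproduced in a few lines of Python. This is only a sketch: it assumes that equation (5.4) takes the form used in the numerical example (output growth equals s/ICOR minus the depreciation rate), and the function names are illustrative, not part of the original text.

```python
def hd_growth(s, icor, delta):
    """Output growth implied by the Harrod-Domar equation (assumed form: s/ICOR - delta)."""
    return s / icor - delta


def required_investment_rate(target_output_growth, icor, delta):
    """Investment rate needed to reach a target rate of output growth."""
    return (target_output_growth + delta) * icor


s, icor, delta, n = 0.15, 3.0, 0.04, 0.02  # values from the example in the text

growth = hd_growth(s, icor, delta)           # forecast of output growth
per_capita_growth = growth - n               # negative: income per capita declines
target = 0.02 + n                            # 2% per capita growth plus 2% population growth
s_required = required_investment_rate(target, icor, delta)
financing_gap = s_required - s               # share of GDP to be filled by donors

print(f"output growth: {growth:.2%}")
print(f"per capita growth: {per_capita_growth:.2%}")
print(f"required investment rate: {s_required:.2%}")
print(f"financing gap: {financing_gap:.2%}")
```

Using the same function in both directions makes the logic of the "financing gap" explicit: the gap is simply the HD equation inverted for s, minus observed savings.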
Economists in international institutions, such as the World Bank, the IMF, the
Inter-American Development Bank and the European Bank for Reconstruction and
Development, used and still use models based on the HD equation to estimate the
amount of savings (and/or aid) necessary for poor countries to achieve a minimum rate
of economic growth97. This philosophy is supported by the understanding that people
living near the subsistence level cannot save the same as rich people. In theory, foreign
aid could fix this: if the external aid succeeded in raising per capita output above a
critical level, it could be that the domestic saving responded, allowing the country to
engage in a self-sustained growth path. In this case, foreign aid would need only to be
temporary98.
97
An open economy version of the Harrod-Domar model is the World Bank’s Revised Minimum
Standard Model, which became the World Bank’s workhorse. Other extensions include the “two gap
model” (Chenery and Bruno, 1962, Chenery and Strout, 1966), which focuses on a “foreign exchange
gap”, and the “three-gap model” (Bacha, 1990, Taylor, 1990), which brings government savings into the
analysis. All these models imply that foreign aid can supplement domestic savings, to deliver faster
capital accumulation and economic growth (for a brief survey, see Agenor and Montiel, 1996).
98
You are invited to demonstrate that replacing the assumption of a constant saving rate by a saving rate
that depends positively on per capita income in the context of the AK model raises the possibility of a
bifurcated growth pattern, whereby per capita income rises or decreases forever, depending on the initial
level of capital per worker. Recent proponents of this idea include Sachs (2005) and the United Nations
Millennium Development Goals Project. Sachs (2005): “(…) if the foreign assistance is substantial
enough, and lasts long enough, the capital stock rises sufficiently to lift households above
subsistence…growth becomes self-sustained through households savings and public investments
supported by taxation of households” (p. 246). United Nations (2005, p. 19): “The key to escape the trap
is to raise the economy’s capital stock to the point where the downwards spiral ends and self-sustained
economic growth takes over”.
This interpretation of the HD equation was seriously criticised by William
Easterly in an insightful paper called “The ghost of financing gap”99. The author noted
that, over the past four decades, large amounts of international financial assistance to
the developing world did not translate into faster economic growth. Using a sample of
146 countries over the period from 1950 to 1992, the author failed to find a robust
positive linear relationship between aid and economic growth (see also Box 5.4).
Does this mean that the HD equation is wrong? Not necessarily. But one should
probably not place too much trust in historical values of the incremental capital-output ratio to
guess the marginal impact of new investments: when, for example, part of the external
aid is diverted into unproductive uses (frivolous expenses, corruption fees), then much
of the higher saving rate will be offset by a lower A. Large amounts of financial aid to
countries with poor institutions may have perverse effects, such as encouraging
aid-dependence and helping perpetuate bad governments in power100.
Box 5.4 The external aid controversy
The question as to whether or not external aid helps a country achieve faster
economic growth is obviously very important from the policy point of view. Not
surprisingly, this question has been subject to empirical investigation.
A branch of the literature has explored the possibility that the impact of aid is
conditional on the recipient country's characteristics101. A particularly influential study was
a background paper to the 1998 World Bank Assessing Aid report, by Burnside and
Dollar (2000). Working with a sample of 56 developing countries over the period from
1970 to 1993, the authors distinguished two sub-groups of countries: those that pursued
“sound policies” and those that pursued “poor policies”. Policy soundness was assessed
by a composite index of trade openness, fiscal discipline and low inflation. Focusing
on the “sound policies” sub-sample, the authors found that those countries receiving
large amounts of aid grew, on average, 3.5%, while those receiving small amounts of aid
grew, on average, 2%. In the “poor policies” sub-sample, the authors found virtually no
growth, irrespective of the amount of aid received. These findings suggest that
external aid boosts growth, but only if domestic policies are sound.
To further investigate this hypothesis, Burnside and Dollar (2000) ran some
regressions trying to explain the growth rates of per capita income achieved by the
countries in the sample. Their central results are reproduced in columns (1) to (3) of
Table 5.1. The dependent variable is the average growth rate of per capita GDP over
the period 1970-93. In equation (1), the growth rate of per capita GDP is correlated
with: the logarithm of initial per capita GDP (capturing conditional convergence);
the degree of ethnic fractionalisation, the rate of political assassinations and the product
of these two variables (to capture political instability); an index of institutional quality;
the ratio of money to GDP (to capture financial development); two regional dummies,
for sub-Saharan Africa and East Asia; and the government budget surplus, inflation and
an index of openness to international trade (to capture the quality of domestic policies).
99
Easterly (1999).
100
Alesina and Weder (2002) suggested that foreign aid favours corruption by increasing the size
of resources that organized groups compete for. A similar argument is explored in Chapter 13.
101
This “conditional” avenue was pioneered by Isham et al. (1995), who found that World Bank projects
tend to deliver higher rates of return in countries with stronger civil liberties. This investigation was
followed by Boone (1996), who contended that the effectiveness of aid is contingent on the level of a
country's democracy. But it was the publication of the Assessing Aid report (World Bank, 1998) that
marked the surge in the literature investigating the relationship between aid and development.
Burnside and Dollar used the coefficients of the last three variables in regression
(1) to compute a “policy index”. Then, they estimated equation (2), replacing the
variables capturing the quality of domestic policies by this “policy index”. In this
equation, they also added external aid as a percentage of GDP. Since the t-ratio of
the latter was found to be very low (0.28 in column 2), the authors concluded that aid, by
itself, does not explain growth. In column (3), a similar regression is performed, but
including an “interaction term”, given by the product of the variables AID/GDP and the
Policy Index. Because this last variable was found to be significant while AID/GDP
alone is not, the authors concluded that aid only leads to more growth in a sound policy
framework102.
102
The authors also tested the possibility of aid being detrimental to policies. However, no significant
relationship was found between the amount of aid received and the quality of the domestic policies.
Table 5.1. Growth regressions explaining the growth rate of per capita GDP in 56
developing countries
Aid-growth regressions with and without policy and geographical interaction

                                                   (1)       (2)       (3)       (4)
Estimation method                                  OLS       OLS       OLS       OLS
Initial GDP                                      -0.65     -0.61     -0.60     -0.54
                                                (1.18)    (1.09)    (1.05)    (0.96)
Ethnic fractionalization                         -0.58     -0.54     -0.42      0.12
                                                (0.79)    (0.75)    (0.58)    (0.16)
Assassinations                                  -0.44*    -0.44*    -0.45*     -0.38
                                                (1.63)    (1.69)    (1.73)    (1.55)
Ethnic fractionalization * Assassinations        0.81*     0.82*     0.79*      0.70
                                                (1.80)    (1.86)    (1.80)    (1.63)
Institutional quality                           0.64**    0.64**    0.69**    0.69**
                                                (3.76)    (3.76)    (4.06)    (4.02)
M2/GDP (lagged)                                  0.015     0.014     0.012     -0.02
                                                (1.00)    (1.08)    (0.86)    (1.54)
Sub-Saharan Africa                             -1.53**   -1.60**   -1.87**   -1.58**
                                                (2.10)    (2.19)    (2.49)    (2.04)
East Asia                                         0.89     0.91*    1.31**    1.57**
                                                (1.59)    (1.69)    (2.26)    (2.63)
Budget surplus                                  6.85**
                                                (2.02)
Inflation                                      -1.40**
                                                (3.41)
Openness                                        2.16**
                                                (4.24)
Burnside-Dollar policy index                              1.00**    0.71**    0.78**
                                                          (7.14)    (3.74)    (4.05)
Aid/GDP                                                    0.034    -0.021    1.49**
                                                          (0.28)    (0.13)    (3.92)
(Aid/GDP) * policy index                                            0.19**      0.09
                                                                    (2.71)    (1.34)
Fraction of land in tropics                                                    -0.70
                                                                              (1.32)
(Aid/GDP) * fract. of land in tropics                                        -1.52**
                                                                              (4.02)
Observations                                       275       275       270       270
Countries                                           56        56        56        56
R2                                                0.35      0.36      0.36

Sources: Burnside and Dollar (2000) for columns (1), (2) and (3) (regressions (1), (3) and (4) in
the original paper); Dalgaard et al. (2004) for column (4) (column 5 in the original).
Notes: The dependent variable is real per capita GDP growth. All regressions include time dummies.
Robust t-statistics in parentheses. * significant at 10%; ** significant at 5%.
The Burnside and Dollar (2000) results caused a significant reaction in the
economics profession, as they implied that foreign aid is useless in countries pursuing bad
policies. Not surprisingly, their results were subject to intense scrutiny by other
researchers. This further investigation revealed that the conclusions of
Burnside and Dollar were, in general, fragile to alternative specifications of the
regression model103. Burnside and Dollar (2004b) then shifted their focus from the
quality of policies to the quality of institutions. Using a cross section of 124 countries
over the 1990s, they found that, while aid alone is not significantly related to growth,
institutional quality interacted with aid is.
On a different avenue, Dalgaard et al. (2004) tested the impact of a variable that
cannot be changed by policy: geography. They measured the fraction of a country’s land
located in the tropics and interacted this variable with aid, to evaluate the aid-growth
relationship, using the same data-set as Burnside and Dollar. Column (4) in Table 5.1
displays some of their estimation results. As shown in the table, the policy-aid
interaction variable becomes insignificant, while aid and aid interacted with the tropics
variable become significant. These results suggest that aid has a positive impact on growth, but
the impact decreases for countries located in the tropics. This last result is, of course,
disappointing because it points to a critical role of geography (which cannot be changed
by human actions) rather than of policy, which people can change.
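One way to read the interaction terms in Table 5.1 is to compute the marginal effect of aid on growth that each column implies. The sketch below uses the point estimates of columns (3) and (4) only; the policy index and tropical land values fed to the functions are hypothetical, chosen purely for illustration, and the calculation ignores the statistical uncertainty around the coefficients.

```python
def aid_effect_col3(policy_index):
    """Marginal effect of Aid/GDP on growth implied by column (3) of Table 5.1."""
    return -0.021 + 0.19 * policy_index


def aid_effect_col4(policy_index, tropics_fraction):
    """Marginal effect of Aid/GDP on growth implied by column (4) of Table 5.1."""
    return 1.49 + 0.09 * policy_index - 1.52 * tropics_fraction


# Column (3): the effect of aid rises with the policy index (hypothetical values).
for policy in (0.0, 1.0, 2.0):
    print("policy index", policy, "->", round(aid_effect_col3(policy), 3))

# Column (4): the effect of aid falls with the share of land in the tropics.
for tropics in (0.0, 0.5, 1.0):
    print("tropics share", tropics, "->", round(aid_effect_col4(1.0, tropics), 3))
```

In column (4), the coefficients on aid and on aid interacted with tropical land almost cancel out for a fully tropical country, which is the sense in which the estimated impact of aid "decreases for countries located in the tropics".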
In recent years, a number of other papers have come out, addressing the question as
to whether the impact of aid on growth is conditional on third variables. The literature
still has no definitive answer regarding the variable that best interacts with aid: some
papers say it is policy, others say it is institutions; some others say it is geography and
others say there is no interaction at all. This evidence suggests that the interplay
between aid, local circumstances and growth is possibly more complex than that
captured by the initial Burnside and Dollar (2000) estimation.
In any case, the evidence that aid may impact differently on growth depending
on a country's political, institutional and geographical circumstances points to new
directions in our quest.
5.4 The AK model with endogenous savings
Thus far, the saving rate in the AK model has been assumed exogenous. In this
section we show that, when the model is modified so as to allow individuals to
optimally trade consumption today for consumption in the future, a second channel
linking efficiency to growth is opened up.
Adding an optimal consumption rule to the AK model
In what follows, let’s recall the simplest possible optimal consumption rule,
introduced in Section 2.6:
$$\gamma = r - \rho \tag{5.7}$$
This equation states that, as long as the interest rate is higher than the rate of
time preference, there will be incentive for economic agents to increase consumption
over time. This, in turn, is achieved through a higher saving rate.
103
Easterly et al. (2004), using the same specification and data as Burnside and Dollar, but extending
the sample by four more years, failed to find a significant interactive term. Islam (2005), using annual
data for a sample of 33 countries over the period 1968-1997, found that aid alone has little impact on
economic growth, but has a significant and positive effect when interacted with a variable capturing
political stability.
Since the AK model has no transitional dynamics, consumption and income
evolve in parallel at each moment in time. Hence, (5.7) can be seen as describing
simultaneously the growth rate of per capita consumption and the growth rate of per
capita income.
To find out how the growth rate of per capita income relates to the remaining
parameters of the AK model, one needs to solve for the interest rate. As before, we
assume that firms are perfectly competitive and maximize profits. In this case, capital
will be paid its marginal product, A. That is:
$$r + \delta = A \tag{5.8}$$
Substituting (5.8) in (5.7) and rearranging, one obtains:
$$\gamma = A - \delta - \rho \tag{5.9}$$
This equation describes the growth rate of per capita income in an AK model
where consumers are allowed to trade consumption today for consumption in the future
at a given interest rate, r. Comparing to (5.4), you see that now it is the rate of time
preference, instead of the saving rate, that determines the rate of economic growth.
Transpiration responds to inspiration!
From the qualitative point of view, equation (5.9) brings no novelty relative to
the case with exogenous savings, (5.5): a lower rate of time preference (that is, a change
in consumption preferences in favour of more consumption in the future and less
consumption today), by raising the saving rate, leads to a higher rate of capital
accumulation and a higher growth rate of per capita income.
However, comparing (5.9) to (5.5), we observe that the impact of A on growth is
now much larger than in the case with exogenous savings. For instance, with a saving
rate equal to 20%, the impact of a unit change in A on growth according to equation
(5.5) is 0.2. In equation (5.9), the impact of a unit change in A on growth is one, that
is, five times larger.
What makes the assumption of consumption smoothing so powerful that it can
alter dramatically the relationship between the efficiency parameter and growth? The
point is that, when A rises, there are two effects:
- On one hand, when A rises for a given s, the growth rate of per capita income
rises, just like in (5.5);
- On the other hand, when A rises, there is now an additional effect through the
interest rate, r: a higher marginal productivity of capital translates into a higher return
on capital and this, in turn, will induce a higher saving rate, for each rate of time
preference104. Then, with a higher saving rate, the economy will grow faster.
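The two effects can be checked numerically. The sketch below assumes that (5.5) reads growth = sA - n - delta and that (5.9) reads growth = A - delta - rho, as implied by the surrounding text; the parameter values are hypothetical and the function names are illustrative. (With these AK-type numbers the implied saving rate is very high, which is a feature of the simple model, not a calibration.)

```python
def growth_exogenous(s, A, n, delta):
    """Per capita growth with an exogenous saving rate (assumed form of 5.5)."""
    return s * A - n - delta


def growth_endogenous(A, delta, rho):
    """Per capita growth under the optimal consumption rule (assumed form of 5.9)."""
    return A - delta - rho


def saving_rate(A, n, rho):
    """Saving rate that makes (5.5) and (5.9) consistent: s = 1 - (rho - n)/A."""
    return 1.0 - (rho - n) / A


A, n, delta, rho = 0.30, 0.02, 0.04, 0.05  # hypothetical values, with rho > n
dA = 0.01                                   # small increase in efficiency

s0 = saving_rate(A, n, rho)
s1 = saving_rate(A + dA, n, rho)

direct = s0 * dA                       # effect of A on growth, holding s fixed
indirect = (s1 - s0) * (A + dA)        # effect through the higher saving rate
total = growth_endogenous(A + dA, delta, rho) - growth_endogenous(A, delta, rho)

print("saving rate before and after:", round(s0, 4), round(s1, 4))
print("direct effect:", round(direct, 5))
print("indirect effect:", round(indirect, 5))
print("total effect:", round(total, 5))
```

The direct and indirect effects add up to the total change in growth, which equals the change in A itself, in line with a unit impact of A in (5.9).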
104
Formally, one may substitute (5.9) in (5.5) and solve for s, to obtain the (endogenous) saving rate:
$s = 1 - (\rho - n)/A$ (the restriction $\rho > n$ is necessary for the problem to be well-behaved). Taking the partial
derivative with respect to A, we verify that the impact of a change in A on the saving rate
is $\partial s/\partial A = (\rho - n)/A^2$. The total impact of a change in A is the sum of the direct impact of A on $\gamma$ with
the indirect impact, through s: $d\gamma/dA = \partial\gamma/\partial A + (\partial\gamma/\partial s)(\partial s/\partial A) = s + (\rho - n)/A = 1$.
This finding is of the utmost importance to understand the mechanics of many
endogenous growth models. A typical assessment based on equation (5.5) is that a
country may either grow through “inspiration” (A) or through “transpiration” (s). But
we just saw that “transpiration” responds to “inspiration”: that is, a more efficient
resource allocation, leading to a higher marginal product of capital, implies a higher
return on investment. Thus, agents will be willing to forego a higher proportion of their
consumption to save more.
With this finding, one may rewrite equation (5.5) in the following form:
$$\gamma = s(\rho, A)\,A - n - \delta \tag{5.10}$$
It depends!
It is important to distinguish the circumstances in which one should refer to
equation (5.10) from the case in which (5.5) is more appropriate.
Remember that the rule (5.9) presumes that the financial system is well
developed, so that households are able to smooth their consumption over time. When
instead the financial system is underdeveloped and households face borrowing
constraints, equation (5.5) is without question more appropriate.
The implication is that bad economic policies (as reflected in low levels of A)
are likely to impact more severely on countries with developed financial systems than on
countries with underdeveloped financial systems. Put another way, in countries with
inefficient financial systems, people are more likely to tolerate bad government policies.
This, in turn, may help perpetuate the bad policies!
This discussion adds to the general point that questions like “what happens to
per capita income (or to economic growth) when some parameter increases” do not have
a unique answer: it depends on a country's economic conditions.
Proximate versus fundamental causes of economic growth
According to equation (5.4), a low rate of economic growth can be explained
either because a country does not invest enough (s) or because it does not achieve a
minimum productivity of capital (A). Dealing with the development question at a deeper
level, however, one may like to understand why some countries save and invest more
than others and why some countries reach higher returns on their investment than
others. In other words, one would like to take as endogenous the parameters that the
models so far treat as exogenous.
To some extent, equation (5.10) is a step in that direction: according to this
equation, individuals will save more wherever the productivity of capital is higher105.
The following chapters will be devoted to a better understanding of what is behind
parameter A, which we have so far loosely related to “efficiency” and “technology”.
105
Equation (5.10) stresses the causality from “inspiration” to “transpiration”. However, the reverse may
also be true. In Chapter 6 we’ll address precisely some theories according to which the level of A is
enhanced by capital accumulation. The possibility of mutual causation implies that savings and efficiency
may reinforce each other, both positively and negatively, raising the possibility of multiple equilibria and
poverty traps.
In this quest, we will relate the level of A to the quality of economic policies and
institutions. We will argue that countries with sound economic policies are expected to
achieve higher efficiency levels and to employ better technologies than countries with
poor economic policies.
But another question will immediately arise: why do some countries implement
better policies than others? To answer this question, we need to address the incentives
of policymakers. These, in turn, depend on the quality of political institutions, which
are themselves grounded in even deeper factors underlying human societies, such as social
norms, culture and geography.
In a word, as we deepen the analysis, we move from the proximate causes of
economic growth, which are captured by the parameter values in equation (5.4), to the
fundamental causes of economic growth, which ultimately determine why in a given
country the parameter values are what they are. These fundamental causes are essential
to understand why some societies make choices that lead them to benefit from better
policymaking and to adopt more modern technologies.
This is not to say that simple models like (5.4) are useless. On the contrary, they
are essential to understand the mechanics of economic growth and, in particular, the role of
investment and technology as mediators between a country's fundamental characteristics
and its economic performance. But dealing with the development question at a deeper
level, one may want to understand what is behind the parameters that the simple models
we are using take as exogenous.
5.5. The AK model with Physical and Human Capital
A main limitation of the basic AK model, as described in Section 5.2, is that it
implies a share of national income accruing to capital equal to 1. Thus, the model has
nothing to say concerning resource allocation or income distribution.
There is however a simple way to overcome this limitation: the idea is to
consider that constant returns apply to a broad concept of capital, which includes both
physical and human capital106.
To see this formally, consider the following production function:
$$Y = A K^{\beta} H^{1-\beta} \tag{5.11}$$
In this production function, there are diminishing returns to physical capital and
to human capital in isolation, but there are constant returns to scale in reproducible
factors. This contrasts with the Solow and the MRW models, where returns to
reproducible factors are decreasing.
Also note that this production function does not necessarily exclude raw labour.
Indeed, one may think of human capital, H, as measuring quality adjusted labour, that is,
the number of workers, N, multiplied by the human capital of the typical worker (h):
$$H = hN. \tag{5.12}$$
106
This avenue was pioneered by Sergio Rebelo (1991).
The implied assumption in (5.12) is that the quantity of workers, N, and the
quality of workers, h, are substitutes. With such a specification, raw labour no
longer needs to be a source of diminishing returns: multiplying the use of h and K by two
causes the production level Y to double, even if N is held constant. Hence, provided the
two capital inputs grow at the same rate, the CRS property will assure the linearity
between production and reproducible factors that characterizes the AK class of models.
To see this formally, let’s proceed with the model specification. As in the MRW
model, we retain the convenient assumption that one unit of output can be transformed
at no cost into either one unit of physical capital or one unit of human capital. This was
stated in equation (4.7), which we reproduce here:
$$Y_t = C_t + I_t + I_t^H, \tag{5.13}$$
where $I^H$ refers to investment in human capital and $I$ refers to investment in physical
capital. Assuming, as before, that the depreciation rate for the two types of capital is the
same, profit maximization leads to the following conditions:
$$\frac{\partial Y}{\partial K} = \beta\,\frac{Y}{K} = r + \delta \tag{5.14}$$
$$\frac{\partial Y}{\partial H} = (1-\beta)\,\frac{Y}{H} = r + \delta \tag{5.15}$$
These conditions imply that the proportion of total income accruing to physical
capital is $\beta$, and that the proportion of income accruing to human capital is $1-\beta$. From
(5.14) and (5.15), one obtains:
$$\frac{H}{K} = \frac{1-\beta}{\beta} \tag{5.16}$$
Equation (5.16) implies that both types of capital will evolve at the same rate
over time. Substituting (5.16) in (5.11), the following variant of the AK production
function obtains:
$$Y = A\left(\frac{1-\beta}{\beta}\right)^{1-\beta} K \tag{5.17}$$
Comparing with (5.9), we see that now the marginal product of capital embodies
the relative weight of physical to human capital in the production function (that is, you
can look at A in equation 5.1 as including this effect)107.
Substituting (5.17) in (5.14), solving for the interest rate and using the optimal
consumption rule (5.7), one obtains the growth rate of per capita income in this
extended AK model:
$$\gamma = B - \delta - \rho, \quad \text{with } B = A\,\beta^{\beta}(1-\beta)^{1-\beta} \tag{5.18}$$
107
The implication is that, if two countries differ in terms of these weights, then the one that uses
relatively more human capital (that is, the one with a lower $\beta$) will suggest, in terms of equation (5.1), a
higher marginal product of physical capital, A.
This equation shows that it is perfectly possible to have diminishing returns to
physical capital alone and yet have sustained growth of per capita income. What we
need is to have constant returns to all types of capital (or reproducible inputs) when
considered together. Note that the MRW model differs from this one in that the
non-reproducible factor (N) cannot be replaced by human capital: in the MRW model, the returns to
K and H when considered together are decreasing.
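The steps from (5.14) to (5.18) can be verified numerically. The sketch below assumes the Cobb-Douglas form $Y = AK^{\beta}H^{1-\beta}$ for (5.11), with the exponent written here as beta; the parameter values are hypothetical and serve only to check that, at the ratio H/K given by (5.16), the two marginal products coincide and equal the constant B.

```python
def output(A, K, H, beta):
    """Production function (5.11), assumed Cobb-Douglas in K and H."""
    return A * K ** beta * H ** (1 - beta)


def broad_capital_productivity(A, beta):
    """Constant B in (5.18): B = A * beta^beta * (1 - beta)^(1 - beta)."""
    return A * beta ** beta * (1 - beta) ** (1 - beta)


A, beta, delta, rho = 0.5, 1 / 3, 0.04, 0.05   # hypothetical values
K = 100.0
H = K * (1 - beta) / beta                       # ratio implied by (5.16)

Y = output(A, K, H, beta)
mpk = beta * Y / K                              # marginal product of physical capital
mph = (1 - beta) * Y / H                        # marginal product of human capital
B = broad_capital_productivity(A, beta)
growth = B - delta - rho                        # equation (5.18)

print("MPK:", round(mpk, 4), "MPH:", round(mph, 4), "B:", round(B, 4))
print("growth rate of per capita income:", round(growth, 4))

# Doubling both K and H doubles output: constant returns to reproducible factors.
print("Y(2K, 2H) / Y(K, H):", round(output(A, 2 * K, 2 * H, beta) / Y, 4))
```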
The model in the previous section assumes that one unit of output can be
transformed into either one unit of physical capital or into one unit of human capital, on
a one-to-one basis (equation 5.13). It may be argued, however, that the
production function for human capital should be different from the production function
for other goods. This section explores an alternative model, due to Uzawa (1965) and
popularised by Lucas (1988), where the education sector has a different production
function.
Main assumptions
The economy has two sectors, the production sector and the education sector.
The production sector employs both human and physical capital and produces goods
and services, which are used for consumption and for investment in physical capital.
The education sector employs human capital only, and its output consists in the
expansion of the stock of human capital.
In the model, it is assumed that workers devote a fraction u of their working
time to production of goods and the remaining 1-u to human capital accumulation. The
production function for goods is given by:
$$Y = A K^{\beta} (uH)^{1-\beta}, \quad \text{with } H = hN \tag{5.21}$$
The production function for human capital is as follows:
$$\dot{h} = b(1-u)h \tag{5.22}$$
The parameter b shall be interpreted as the productivity in the education sector.
The production function (5.22) has a critical property: a constant fraction of time
devoted to education produces a constant growth rate of human capital that is
independent of the existing level of human capital (in other words, there are no
diminishing returns to human capital in human capital accumulation). With such an
assumption, a policy change that successfully increases the proportion of time devoted to
human capital accumulation (1-u) or that improves the effectiveness of the education
system (b) impacts positively and permanently on the growth rate of h and thereby, as
we will see, on the growth rate of per capita income108.
As far as physical capital accumulation is concerned, the equation of motion
(4.8) is maintained:
$$sY_t = I_t = \dot{K}_t + \delta K_t \tag{4.8}$$
The fundamental dynamic equation once again
This model can be solved in the same manner as the Solow model. For mathematical
convenience, let’s rewrite the production function (5.21) as follows:
$$\tilde{y} = A u^{1-\beta}\, \tilde{k}^{\beta} \tag{5.23}$$
where $\tilde{y} = Y/H$ and $\tilde{k} = K/H$.
Proceeding as usual, the fundamental dynamic equation becomes:
$$\dot{\tilde{k}} = sAu^{1-\beta}\,\tilde{k}^{\beta} - \left[n + \delta + b(1-u)\right]\tilde{k} \tag{5.24}$$
Comparing with (3.8), you can verify how similar this model is to the Solow
model. The main difference is that the parameter determining the effectiveness of
labour, rather than growing exogenously, now depends on other parameters in the
model.
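The fundamental dynamic equation can be explored numerically, much as one would do for the Solow model. The sketch below assumes that (5.24) takes the form "saving per unit of human capital minus break-even investment", with the capital exponent written as beta; all parameter values are hypothetical, and the Euler step is a simple approximation rather than an exact solution.

```python
def k_dot(k, s, A, u, beta, n, delta, b):
    """Right-hand side of the fundamental dynamic equation for k = K/H."""
    return s * A * u ** (1 - beta) * k ** beta - (n + delta + b * (1 - u)) * k


def steady_state_k(s, A, u, beta, n, delta, b):
    """Value of k = K/H at which k_dot equals zero."""
    return (s * A * u ** (1 - beta) / (n + delta + b * (1 - u))) ** (1 / (1 - beta))


def simulate_k(k0, years, dt, **params):
    """Follow k = K/H over time with small Euler steps of length dt."""
    k = k0
    for _ in range(round(years / dt)):
        k += dt * k_dot(k, **params)
    return k


# Hypothetical parameter values, chosen only for illustration.
params = dict(s=0.2, A=1.0, u=0.8, beta=1 / 3, n=0.02, delta=0.04, b=0.1)

k_star = steady_state_k(**params)
k_end = simulate_k(k0=1.0, years=300, dt=0.1, **params)

print("steady state K/H:", round(k_star, 4))
print("K/H after 300 years, starting from 1.0:", round(k_end, 4))
print("long run growth of per capita income:", params["b"] * (1 - params["u"]))
```

Starting below the steady state, K/H rises and settles at the analytical value, while the long run growth rate of per capita income is pinned down by b(1-u) alone.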
Figure 5.4: the steady state in the two-sector model
108
As noted by Lucas (1988), if instead there were diminishing returns to the accumulation of human
capital (that is, if $\dot{h} = b(1-u)h^{\phi}$ with $\phi < 1$), then the growth rate of human capital would tend to zero, no
matter how much effort was devoted to accumulation of human capital. In that case, growth could not be
sustained. Lucas (1988) argued that, although individuals accumulate more human capital early in life –
suggesting diminishing returns – one should interpret (5.22) as applying to the society as a whole, with
human capital being accumulated through individual decisions, but also passed on to younger generations.
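[Figure 5.4: the vertical axis measures $\tilde{y} = Y/H$ and the horizontal axis $\tilde{k} = K/H$. The figure plots the
production function $\tilde{y} = Au^{1-\beta}\tilde{k}^{\beta}$, the saving curve $s\tilde{y}$ and the break-even investment line
$[n + \delta + b(1-u)]\tilde{k}$; the last two cross at the steady state $(\tilde{k}^*, \tilde{y}^*)$.]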
A hybrid model
This model is hybrid, in the sense that it shares characteristics with the AK
model and with the Solow model.
It shares with the neoclassical model the feature that it has transitional
dynamics and a stable steady state (to find the steady state, just solve for $\dot{\tilde{k}} = 0$). Figure
5.4 describes this. In this model, the saving rate, s, determines the steady state and
changes in s produce “level effects” (just like in the Solow model). Because the levels
of $\tilde{y} = Y/H$ and $\tilde{k} = K/H$ are both constant in the steady state, the output-capital ratio
is constant and so is the interest rate. Hence, in the steady state, output grows at
the same rate as human capital: $\hat{Y} = \hat{H} = \hat{h} + n$ and $\hat{y} = \hat{Y} - n = \hat{h} = b(1-u)$.
In contrast to the Solow model, the long run growth rate of per capita income
($b(1-u)$) is explained in the model: it depends on (1-u), the proportion of working time
people allocate to education; and on b, the effectiveness of investment in human capital.
For instance, a policy that is successful in inducing an increase in the proportion
of time devoted to education raises the growth rate of the economy on a permanent basis
(growth effect).
Figure 5.5 describes the path of per capita income in this economy following an
increase in the time devoted to education: on impact, there is a negative effect on per
capita income, because less time is devoted to production. As time goes by, however,
the growth rate of per capita output accelerates, due to the faster rate of human capital
accumulation109.
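Figure 5.5 – The path of output per capita following a decrease in u
[Figure: ln y plotted against time. Before the change in u the path has slope $b(1-u_0)$; at the change in u
the level of ln y drops, and thereafter the slope approaches $b(1-u_1)$.]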
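The path sketched in Figure 5.5 can be reproduced with a small simulation. The code below is only an illustration under stated assumptions: the goods production function is taken to be $Y = AK^{\beta}(uH)^{1-\beta}$, human capital follows (5.22), all parameter values are hypothetical, and the economy starts in the steady state associated with the initial value of u.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
S, A, BETA, N, DELTA, B_EDU = 0.2, 1.0, 1 / 3, 0.02, 0.04, 0.1


def steady_state_k(u):
    """Steady-state K/H, from setting the fundamental dynamic equation to zero."""
    return (S * A * u ** (1 - BETA) / (N + DELTA + B_EDU * (1 - u))) ** (1 / (1 - BETA))


def log_income(u, k, h):
    """ln of per capita output, y = Y/N = A * u^(1-beta) * k^beta * h, with k = K/H."""
    return math.log(A * u ** (1 - BETA) * k ** BETA * h)


def simulate(u, k, h, years, dt=0.01):
    """Euler steps for k = K/H and h; returns ln y at the start of each year."""
    steps_per_year = round(1 / dt)
    path = []
    for step in range(years * steps_per_year + 1):
        if step % steps_per_year == 0:
            path.append(log_income(u, k, h))
        k_dot = S * A * u ** (1 - BETA) * k ** BETA - (N + DELTA + B_EDU * (1 - u)) * k
        h_dot = B_EDU * (1 - u) * h
        k, h = k + dt * k_dot, h + dt * h_dot
    return path


u0, u1 = 0.8, 0.7                      # more time devoted to education after the change
k0 = steady_state_k(u0)                # K/H does not jump when u changes
before = log_income(u0, k0, 1.0)       # ln y just before the change (h normalised to 1)
path = simulate(u1, k0, 1.0, years=300)

print("impact effect on ln y:", round(path[0] - before, 4))
print("growth in the first year:", round(path[1] - path[0], 4))
print("growth in the last year:", round(path[-1] - path[-2], 4))
print("old and new steady state growth:", B_EDU * (1 - u0), B_EDU * (1 - u1))
```

On impact ln y falls, because less time is devoted to production; during the transition growth is below its new long run value, and it eventually converges to b(1-u1).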
An AK representation
Technically, persistent growth is obtained in this model because the production
function for human capital is free of diminishing returns. In other words, the model
overcomes diminishing returns to physical capital by postulating a linear production
function for human capital. Physical capital can then be accumulated without its
productivity declining, because it will be increasing at the same rate as human capital.
Rewriting the production function (5.21), we see that:
$$Y = A\left(\frac{u}{\tilde{k}}\right)^{1-\beta} K \tag{5.25}$$
Since in the steady state the levels of u and $\tilde{k} = K/H$ are both constant, this means
that in the long run we have no more than another incarnation of the AK model. In the
short run, however, $\tilde{k} = K/H$ is not in general constant, so the model also displays
transition dynamics.
109
It is worthwhile to observe that the two-sector model introduced in this section differs from the
“extended” AK model (Section 5.5) in that the ratio of physical to human capital is not always constant:
for instance, following a fall in u, the ratio $\tilde{k} = K/H$ starts declining (in terms of Figure 5.4, note that
the production function and the break-even investment line shift in opposite directions). Because $\tilde{k}$ is
declining, during the transition period the growth rate of per capita output will be lower
than in the steady state. As the economy approaches the new steady state, the fall in $\tilde{k}$ becomes smaller,
so that the growth rate of per capita income approaches the new steady state growth rate, $b(1-u_1)$.
In sum, because the model has CRS in reproducible factors, it is capable of
generating persistent growth without the need to assume exogenous shifts in the
production function. This is why the model belongs to the general category of
endogenous growth models. However, as long as the parameter u is held constant, this
model behaves like the Solow model: a change in the saving rate will produce only a level
effect.
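The path sketched in Figure 5.5 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the production function to be Y = A K^β (uH)^(1−β), human capital to grow at rate b(1 − u), a constant saving rate s, a depreciation rate delta, constant population and discrete time. All parameter values are hypothetical and are not taken from the text.

```python
# Discrete-time sketch of the two-sector model (assumed functional forms):
#   Y  = A * K**beta * (u*H)**(1 - beta)   final goods
#   H' = H * (1 + b*(1 - u))               human capital (linear technology)
#   K' = K + s*Y - delta*K                 physical capital
# Population is normalised to 1, so Y is also output per capita.

A, beta, s, delta, b = 1.0, 1 / 3, 0.2, 0.05, 0.05  # illustrative values


def steady_state_k(u):
    """K/H ratio at which K and H both grow at the rate b*(1 - u)."""
    g = b * (1 - u)
    return u * (s * A / (g + delta)) ** (1 / (1 - beta))


def simulate(u0, u1, switch, periods):
    """Output path when u changes from u0 to u1 in period `switch`."""
    H = 1.0
    K = steady_state_k(u0) * H  # start on the initial balanced growth path
    path = []
    for t in range(periods):
        u = u0 if t < switch else u1
        Y = A * K**beta * (u * H) ** (1 - beta)
        path.append(Y)
        K = K + s * Y - delta * K
        H = H * (1 + b * (1 - u))
    return path


u0, u1, switch = 0.8, 0.6, 50  # more time in education from period 50 on
y = simulate(u0, u1, switch, 400)
growth = [y[t + 1] / y[t] - 1 for t in range(len(y) - 1)]

print("growth before the change:", round(growth[0], 4))
print("growth on impact:", round(growth[switch - 1], 4))
print("growth just after the change:", round(growth[switch], 4))
print("growth in the long run:", round(growth[-1], 4))
```

Under these assumptions output falls when u is reduced, grows during the transition at a rate below the new steady state rate (because k = K/H is falling), and eventually grows at b(1 − u1).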
Discussion
The model above emphasises the importance of investment in education as the
engine of growth. A question that naturally arises is how people decide the optimal level
of u.
Intuitively, the optimal investment in education should depend on a number of
factors110. For instance, one would expect people to invest more in education in a
context where the productivity of education effort (b) is higher: the more valuable the
time spent in education, the faster will be the rate of human capital accumulation and
economic growth.
By the same token, one expects people to invest more in education the less
impatient they are. That is, if we allowed individuals to optimally choose u and s, a
higher degree of impatience would translate not only into less investment in physical
capital (and hence a negative “level effect”), but also into a lower proportion of time
devoted to human capital accumulation (giving rise to a negative “growth effect”).
In general, to the extent that government policies influence the returns to
investment in human capital, they may have, in light of this model, an influence on
economic growth111. Note however that the model, as presented above, is silent with
respect to the role of government in the economy.
5.7. Neoclassical models of Endogenous growth
The models we considered in this chapter generate endogenous growth by
abandoning the assumption of diminishing returns. This is not, however, a necessary
condition: one may generate endogenous growth even without departing from the
assumption of diminishing returns to capital112.
110
To answer this question formally, one would need a more complete specification of the model,
including the consumer side. For the sake of simplicity, however, we skip this complication. Intuitively, since
human capital is free to move between the final good sector and the education sector, an arbitrage
condition should hold, implying that at the margin individuals should be indifferent between allocating
their time to one sector or to the other. At a later stage we’ll have the opportunity to solve similar problems,
though in different contexts (chapters 7 and 13).
111
Whether this prediction fits the empirical evidence is a different question. Jones (2005), for instance,
contends that investment rates in human capital rose significantly in the US economy over the 20th
century, while the growth rate of per capita output remained basically the same.
To see this, let’s consider again the optimal consumption rule (5.7): as already
explained in Section 2.6 (Figure 2.12), the Solow model cannot deliver long-run growth
because the marginal product of capital falls towards zero as the capital-labour ratio
increases: once the interest rate equals the discount rate, desired consumption
becomes constant over time and the process of capital accumulation stops.
These considerations suggest an avenue to generate endogenous growth: what
we need is to prevent the interest rate from falling below the rate of time preference. In
the AK model, this is possible because the marginal product of capital is a constant
parameter. Thus, as long as A − δ > ρ (that is, the return on capital net of depreciation
exceeds the rate of time preference), per capita income will grow forever.
The same can be achieved in the context of the neoclassical model. Note that the
assumption of diminishing returns only requires the marginal product of capital to be a
decreasing function of the capital stock. The Solow model goes a bit further, by
postulating an aggregate production function (as exemplified by the Cobb-Douglas)
with marginal returns falling asymptotically to zero. If however the marginal product of
capital never approached zero, the model could display endogenous growth.
Thus, the only requirement to generate sustained growth of per capita income in
the neoclassical framework is to postulate that the marginal product of capital is
bounded below by a positive constant. In that case, as the amount of capital per worker
increases, the marginal product of capital approaches that constant. If it happens that
this constant is higher than the rate of time preference, then the economy will expand
without bound113.
As an example, consider the production function $Y = AK + BK^{\beta}N^{1-\beta}$. This
production function exhibits diminishing marginal returns to capital. It converges, however,
asymptotically to the AK form. In other words, the average product of capital is
bounded below by the parameter A. Thus, as long as A − δ > ρ, the model will display
unceasing growth114.
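A quick numerical check makes the point concrete. The sketch below assumes the production function Y = AK + BK^β N^(1−β) written in per worker terms (k = K/N), so that the marginal product of capital is A + βB k^(β−1) and the average product is A + B k^(β−1). The parameter values are hypothetical.

```python
A, B, beta = 0.1, 1.0, 0.5  # illustrative parameter values (assumptions)


def mpk(k):
    """Marginal product of capital of Y = A*K + B*K**beta * N**(1-beta)."""
    return A + beta * B * k ** (beta - 1)


def apk(k):
    """Average product of capital, Y/K."""
    return A + B * k ** (beta - 1)


ks = [1, 10, 100, 10_000, 1_000_000]
for k in ks:
    print(f"k = {k:>9}: MPK = {mpk(k):.4f}, APK = {apk(k):.4f}")
```

Both products fall as k rises, so returns are diminishing, but neither ever falls below A: the interest rate stays above the rate of time preference whenever A − δ > ρ.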
A neoclassical growth model suitably modified along these lines is capable of
generating at the same time endogenous growth (as in the AK model) and transition
dynamics. In such a model, two economies differing only in terms of their initial per
capita incomes will exhibit a tendency to approach each other, with the one with less
capital per worker growing faster. At the same time, any government policy that was
successful in raising the saving rate would have a permanent effect on the growth rate of
per capita income. This is why this class of models is labelled neoclassical models of
endogenous growth.
112
This avenue was explored by Jones and Manuelli (1990).
113
Actually, this is what we did by introducing exogenous technological progress (Chapter 3): the effect
of technological progress is to raise the productivity of capital, offsetting the diminishing returns. In the
long run the interest rate is kept above the discount rate and the economy displays positive growth
forever.
114
Barro and Sala-i-Martin (1995) show that an average productivity of capital bounded below by a
positive constant is also featured by CES production functions with high substitutability between labour
and capital.
5.8 Empirical controversies
Levels or changes?
The AK model differs dramatically from the exogenous growth model, in terms
of the relationship it establishes between the investment rate and economic growth. This
prediction suggests an obvious avenue to find out which model conforms better to the
real world facts: to investigate whether shifts in the investment rate have permanent or
temporary effects on economic growth.
Along this avenue, critics of the AK model have pointed out that, among OECD
countries for instance, richer economies tend to exhibit higher investment rates in
physical and in human capital than poorer economies, and yet they do not enjoy
faster growth (actually, the data in Figure 3.6 point to the opposite case). A well
known contribution along this line of reasoning is that of Charles Jones (1995). The author used
time-series analysis for a sample of 15 OECD economies to test whether shifts in the
investment rate have permanent effects on GDP growth. In this investigation, he used two
definitions of investment: a broad one, defined as gross investment as a share of
GDP, and a narrow one, defined as durable investment as a share of GDP. In both cases,
he found no evidence of a permanent effect of investment on economic growth.
Not surprisingly, Jones’ findings were subject to close scrutiny. Among the
critics, Li (2002) argued that Jones’ case against the AK model is fragile to sample
modifications and, more importantly, to changes in the definition of capital. The AK
model, he argued, should be understood as applying to a broad concept of capital. Hence,
its validity should be tested including both human and physical capital in the estimation,
not a narrow definition of capital as used by Jones. Li estimated two different models,
the AK model and the Uzawa-Lucas model. He concluded that the latter fits the data on
22 OECD economies much better than the AK model.
Similarly, Arnold et al. (2007) tested whether growth patterns in a sample of 21
OECD countries over the 1971-2004 period were better accounted for by the Solow
model augmented with human capital (MRW) or by the Uzawa-Lucas model. In the
estimation, the authors allowed different countries to display different speeds of
adjustment to their respective steady states. Their results were also more favourable to
the Uzawa-Lucas model.
In practice, a major difficulty in disentangling whether the true model is the
Solow model or the AK model is that the two models are observationally equivalent for
long periods of time: remember that the Solow model predicts a positive relationship
between investment and growth while the economy moves from one steady state to the
other. Since this transition period can be quite long (in the MRW formulation, for
instance, half of the transition dynamics takes as long as 35 years) and because most
reliable datasets with comparable data start after 1950, it is not easy to assess whether
changes in the investment rate have long run level effects or long run growth effects.
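The 35-year figure can be reproduced with a back-of-the-envelope calculation. The sketch below assumes that the gap to the steady state shrinks at a constant proportional rate of about 2% per year (an illustrative convergence speed, not a value stated in the text), so that the remaining gap after t years is exp(−speed × t).

```python
import math


def half_life(speed):
    """Years needed to close half of the gap to the steady state when
    the gap shrinks at the constant proportional rate `speed`."""
    return math.log(2) / speed


def gap_remaining(speed, years):
    """Share of the initial gap still open after `years`."""
    return math.exp(-speed * years)


speed = 0.02  # assumed convergence speed of 2% per year

print(round(half_life(speed), 1))          # 34.7
print(round(gap_remaining(speed, 60), 2))  # 0.3
```

With this speed, roughly 30% of the initial gap is still open after 60 years of data, which illustrates why a long transition in the Solow model is hard to tell apart from a permanent growth effect.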
Tests on conditional convergence
The most common approach to assess whether the true model is the AK model
or the neoclassical growth model is testing for conditional convergence. The conditional
convergence hypothesis states that countries tend to approach their respective steady
states, and that the speed of adjustment varies in direct proportion to the distance to the
steady state. This property of the neoclassical model contrasts with the AK model,
according to which parameter shifts alter the growth rate of per capita income
permanently, without any tendency for per capita income to return to a previous path.
The most common approach to test for conditional convergence is to run cross-
country growth regressions (see Box 5.5). Basically, the method consists in estimating
the growth rate of per capita income as a function of a range of explanatory variables,
including the initial level of per capita GDP. Then, conditional convergence is assessed
by investigating the significance of the coefficient on the initial level of per capita
income.
Table 5.2 reproduces a pioneering study on cross-country growth regressions, due
to Robert Barro (1991) (another example is given in Box 4.5). In the table, the dependent
variables are average growth rates of real per capita GDP over the periods 1960-1985
(GR6085) and 1970-1985 (GR7085), in a cross-section of 98 countries. The explanatory
variables include the initial rates of secondary-school and primary-school enrolment
(SEC60 and PRIM60)115, the average ratio of government consumption to GDP
(G/Y)116, the number of revolutions and coups per year (REV), the number of yearly
assassinations per million inhabitants (ASSASS), a measure of the relative price of
capital goods (PPI60DEV), investment as a percentage of GDP (I/Y) and the initial
level of real per capita GDP (Initial PCGDP).
On the estimation results, one may observe the following:
First, holding fixed the other variables, the two measures of educational
attainment (SEC60 and PRIM60) exhibit a strong positive correlation with per capita
GDP growth. This accords with both the MRW model and the AK model with a broad
interpretation of capital.
Second, the author found a negative and significant coefficient for the relative
price of investment goods (PPI60DEV). This suggests an important role for taxation and
other distortions that impact on the relative price of capital as determinants of the
efficiency level, A.
Third, the ratio of real government consumption to GDP has a negative and
significant coefficient. Barro (1991) interprets this, not as capturing a direct effect of
government consumption on GDP, but instead indirect effects: high government
consumption tends to be associated with distortions caused by high tax rates and large
expenditure programmes. In additional regressions (not displayed in Table 5.2), the
author found a positive correlation between growth and the share of public investment
in GDP.
Fourth, political instability, as captured by the variables ASSASS and REV, is
negatively correlated with growth. The interpretation is that political instability
leads to unpredictable changes in laws and government policies, thus
creating uncertainty and having a negative impact on investment. This result stresses the
important role of the government in providing a sound social environment, public order
and protection of property rights.
115
Note that the human capital proxies refer to the beginning of the period. This is a simple way of
avoiding a well known econometric problem arising with mutual causality.
116
The author removed expenditures on education and defence on the grounds that these components of
public expenditure are more likely to play the role of public investment than that of consumption.
Table 5.2 – Regressions for per capita output growth

                        (1)         (2)         (3)
Dependent variable      GR6085      GR7085      GR6085
No. observations        98          98          98
Const.                  0.0302      0.0287      0.0229
                        (0.0066)    (0.0080)    (0.0073)
SEC60                   0.0305      0.0331      0.0225
                        (0.0079)    (0.0137)    (0.0090)
PRIM60                  0.0250      0.0276      0.0181
                        (0.0056)    (0.0070)    (0.0060)
G/Y                     -0.119      -0.142      -0.119
                        (0.028)     (0.034)     (0.027)
REV                     -0.0195     -0.0236     -0.0159
                        (0.0063)    (0.0071)    (0.0062)
ASSASS                  -0.0333     -0.0485     -0.0315
                        (0.0155)    (0.0185)    (0.0182)
PPI60DEV                -0.0143     -0.0171     -0.0119
                        (0.0053)    (0.0078)    (0.0058)
I/Y                     -           -           0.068
                                                (0.032)
Initial PCGDP           -0.0075     -0.0089     -0.0072
                        (0.0012)    (0.0016)    (0.0009)
R²                      0.56        0.49        0.59
σ                       0.0128      0.0168      0.0123

Notes: Standard errors of coefficients appear in parentheses.
Source: Barro (1991).
To assess whether the different variables have a direct influence on
growth or, instead, affect growth only through their effect on private investment
(remember the discussion in Section 5.4), Barro (1991) repeated the estimations
including the ratio of investment (private plus public) to GDP (I/Y). The corresponding
results are displayed in Column (3) of Table 5.2. Basically, the results suggest that all
variables remain significant, meaning that they all (or the determinants they are
measuring) have direct links to economic growth (through A).
Finally, and most importantly, in the three equations the coefficient on the initial level of per capita
GDP (Initial PCGDP) was found to be negative and significant. That is, controlling for the other
variables, a lower initial level of per capita GDP is associated with faster subsequent
growth. This evidence is favourable to the conditional convergence hypothesis and
suggests the rejection of the AK model, at least in its basic formulation.
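The significance of the convergence coefficient can be checked directly from Table 5.2 by dividing each estimate by its standard error. The sketch below does this for the Initial PCGDP row; the threshold of 2 in absolute value is used here only as a common rule of thumb for significance at about the 5% level, and is an assumption of this illustration rather than the criterion used by Barro (1991).

```python
# (coefficient, standard error) on Initial PCGDP, columns (1)-(3) of Table 5.2
initial_pcgdp = {
    "(1) GR6085": (-0.0075, 0.0012),
    "(2) GR7085": (-0.0089, 0.0016),
    "(3) GR6085": (-0.0072, 0.0009),
}


def t_ratio(coef, se):
    """Ratio of an estimated coefficient to its standard error."""
    return coef / se


for column, (coef, se) in initial_pcgdp.items():
    t = t_ratio(coef, se)
    print(f"{column}: t = {t:.2f}, |t| > 2: {abs(t) > 2}")
```

In all three columns the ratio is well below −2, in line with the reading of the table given in the text.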
In general, the evidence with cross-country growth regressions using large
samples of countries has been favourable to the conditional convergence hypothesis.
That is, the coefficient on the initial level of per capita income has been found to be, in
general, negative and significant in cross-country growth regressions. This evidence is
unfavourable to the AK model. Taking this into account, Robert Barro concluded: “It is
surely an irony that one of the lasting contributions of endogenous growth theory is that
it stimulated empirical work that demonstrated the explanatory power of the
neoclassical growth model” (Barro (1997), p. x).
Box 5.5. Cross-country growth regressions
Cross-country growth regressions are the workhorse of empirical research on
economic growth117. Basically, the approach consists in estimating equations relating
the growth rate of per capita GDP to a range of possible determinants. The latter may
include variables capturing factor accumulation (such as the investment rate and the
population growth rate), and variables that are more likely to exert their influence on
growth through the productivity parameter, A. Examples of cross-country growth
regressions are presented in Tables 4.2, 5.1 and 5.2.
Growth regressions have a natural interpretation in terms of equation (5.5): in
that model, growth appears as a function of the investment rate and of the productivity
term, A.
The advantage of cross-country growth regressions relative to simple growth
accounting or development accounting, is that, rather than estimating A as a residual,
they try to identify the policies and other factors that underlie the cross-country
differences in A. In practice, it has been common to find indicators capturing country
idiosyncrasies that are strongly correlated with growth. These include both variables
measuring the quality of policies and institutions (such as trade openness, the rule of law,
political risk, inflation, financial depth) and geographical conditions.
Often, the researcher compares the coefficient of the variable of interest in two
different regressions, one controlling for the investment rate and the other not.
The idea is that any particular variable may have a direct
influence on growth, or an indirect influence, via its effect on the rate at
which individuals accumulate capital (remember equation 5.10). Thus, for instance, if
the variable of interest has a smaller coefficient or loses significance when the
investment rate is included among the regressors, this suggests that its influence on
growth occurs mainly through its impact on the investment rate118.
Cross-country growth regressions may also have an interpretation in light of the
neoclassical growth model. In its basic formulation, the conditional convergence test
(Box 4.4) consists in estimating the growth rate of per capita income as a function of
investment rates in human and physical capital, population growth rates, and initial
income. This test intends to control for differences in the steady states (remember that
the regression equation is obtained by substituting the steady state level of per capita
income (4.18) into (3.14)). The MRW formulation fails, however, to control for policies
impacting on the level of A.
117
Growth regressions were first explored by Robinson (1971) and Kormendi and Meguire (1985), but
were popularized by Robert J. Barro (Barro, 1991, Barro and Sala-i-Martin, 1991, 1992).
118
A notable example along this avenue is Levine and Renelt (1992), who found that trade openness is an
important determinant of economic growth when investment is not in the equation, but that it loses
significance when the investment rate is included. They concluded that international trade is good for
growth, but only through its impact on investment.
Putting all the pieces together, new research on conditional convergence turned
to extended versions of the MRW test, by adding other variables to the regression
equation. Formally, the equation to be estimated in an extended neoclassical framework
is:
$$\ln y_t - \ln y_0 = a + b \ln y_0 + \phi X + \psi Z + u_t, \qquad (5.25)$$
where X is a vector of variables capturing factor accumulation that are present in the
MRW model (propensities to invest in physical and human capital, and the population
growth rates) and Z is a vector of other variables determining the level of A.
As in the simpler model, conditional convergence is assessed by investigating the
significance of b: if it is found to be zero, then changes in the other explanatory variables
impact on the growth rate permanently, supporting the endogenous growth model (5.5);
if, instead, b is found to be significant and negative, this suggests that growth rates are
proportional to the distance to steady states, which accords with the idea of conditional
convergence.
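The logic of the test can be illustrated with a small synthetic example. The sketch below is not an estimation on real data: the incomes and steady states are made-up numbers, growth is generated without noise from the assumed rule ln y_T − ln y_0 = (1 − e^(−λT))(ln y* − ln y_0), and the helper `ols` is a hypothetical function written for this illustration. It compares a regression that controls for the steady state with one that omits the control.

```python
import math


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) c = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n = len(X[0])
    M = [[sum(r[i] * r[j] for r in X) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


lam, T = 0.02, 25             # assumed convergence speed and horizon (years)
adj = 1 - math.exp(-lam * T)  # share of the gap closed after T years

# Synthetic data: richer countries also tend to have higher steady states.
ln_y0 = [6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5]
ln_ystar = [7.0, 6.8, 8.0, 7.9, 8.6, 9.4, 9.1, 10.0]
growth = [adj * (ys - y0) for y0, ys in zip(ln_y0, ln_ystar)]

# Conditional regression: constant, initial income and steady-state control.
a_c, b_c, c_c = ols([[1.0, y0, ys] for y0, ys in zip(ln_y0, ln_ystar)], growth)
# Unconditional regression: the control is omitted.
a_u, b_u = ols([[1.0, y0] for y0 in ln_y0], growth)

print("b with the control:   ", round(b_c, 3))
print("b without the control:", round(b_u, 3))
```

In this example the coefficient on initial income is clearly negative only when the determinants of the steady state are held fixed, which is why the test is one of conditional rather than absolute convergence.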
Cross-country growth regressions face a number of limitations.
First, because the theory does not provide an unambiguous guide to the choice
of elements of Z, there is a lot of uncertainty regarding the right model specification. In
practice researchers have proposed more and more variables to complement the baseline
MRW specification, each one stressing a causal relationship between a particular
variable and growth. This, in turn, brings a familiar econometric problem: because
explanatory variables tend to be correlated with each other (countries performing badly in
a given indicator also tend to perform badly in other indicators), there is a large scope
for multicollinearity: the significance of each variable in the equation is influenced by the
particular combination of variables included in the regression.
In practice, although many variables have been found to be correlated with
growth, most of them lose significance when other variables are included in the same
equation. This problem makes it very difficult to assess empirically which variable is
more correlated with growth and by how much (e.g., if inflation rates, exchange rate volatility
and political instability go wrong together, how can we disentangle their various
contributions to growth?)119.
Second, there is a problem of endogeneity: although it may appear natural that
the parameter estimates (the coefficients on X and Z in equation 5.25) contain information on causal effects
on economic growth, this is not necessarily true. Some right-hand-side variables may be
econometrically endogenous in the sense that they are jointly determined with the rate
of economic growth: for instance, the same factors that make a country invest little in
physical capital may also have a direct effect on its growth rate. In that case, the
estimated parameter will be biased and will provide little information regarding the
direction of causality.
119
Attempts to resolve this problem in a systematic way include Levine and Renelt (1992), Doppelhofer
et al. (2004) and Sala-i-Martin (1997). Levine and Renelt (1992) found only four variables robustly
correlated with growth: the share of investment in GDP, the rate of population growth, the initial level of
real GDP per capita and a proxy for human capital. The remaining variables capturing the quality of
policy and political instability did not pass the robustness tests proposed by these authors (still, they
found a measure of openness to international trade to be positively correlated with investment). These
results are favourable to the MRW model. Durlauf et al. (2005) contended, however, that the robustness
tests proposed by Levine and Renelt are too stringent.
Third, even if all variables on the right hand side were actually exogenous, many
of them could be “symptoms”, rather than “syndromes”. For instance, consider the
measurement of human capital. Should we choose secondary school enrolment or
primary school enrolment? Since these tend to be correlated with each other, they render
one another insignificant when both are included in the regression equation. So which
one should we choose? Moreover, a given symptom may be interpreted as capturing
different syndromes. For example, does a negative correlation between inflation and growth
reflect bad macroeconomic management, or large-scale tax evasion that forces the
government to rely on revenues from money creation?
Fourth, there is a problem of parameter heterogeneity: parameter values
estimated with cross-section exercises that pool together countries as different as
France and Nepal may fail to accurately describe either of them120.
Fifth, the lack of a structural model stating how much the parameter A depends
on each policy variable makes it difficult to go beyond general statements on observed
correlations and to provide a convincing interpretation of the results.
Other problems of cross-country growth regressions include the presence of
outliers, measurement errors, the assumption of linearity and, most importantly, a tendency to
overestimate the impact of policies on growth. Despite the extensive econometric
improvements that have been adopted to overcome these limitations (see Durlauf et al.,
2005, for a survey), the results of cross-country growth regressions must always be
taken with caution.
120
Pritchett (2006) observes that the problem is even more serious with qualitative
variables: “is the effect of corruption proportional to corruption as measured? Is the effect stronger in
democracies than in non-democracies? In poorer than in richer countries? In more open than in less open
countries? In reality, the growth regression is only a crude approximation that indicates the average
impact of corruption, but it does not provide the information policymakers really want—the specific
impact in a particular country”. In this respect, Temple (1999) quotes Harberger (1987): “What do
Thailand, the Dominican Republic, Zimbabwe, Greece and Bolivia have in common that merits their
being put in the same regression analysis?”
5.9 Discussion
Despite the empirical controversy, most growth theorists are now converging on
the idea that policy changes have “level effects”. Authors sharing this view argue that
country characteristics, such as the saving rate and aggregate efficiency, are more likely
to influence the levels of per capita income than growth rates. This trend in the literature
was dubbed the neoclassical revival. This view is supported by an extensive empirical
literature favourable to the conditional convergence hypothesis.
Does this mean that we should abandon the AK model? The answer is no.
First, remember that the important link between efficiency and growth is also
present in the neoclassical model: the difference is that in the latter the growth effect will
be transitory. That is, you may interpret the AK model as a short-run version of the
neoclassical growth model. With half of the transition period between steady states in
the neoclassical model taking as long as 35 years, whatever the true model is, we are
bound to accept that policy actions may influence economic growth for a considerable
period of time121.
Second, the AK model is much easier to solve than the Solow model. Because of
this, from the expositional point of view, it is often more convenient to study the impact
of particular policies in the context of the AK model than in the context of the Solow
model, especially when the math becomes too complex. Of course, in doing so, one
should take into account that any conclusion regarding the impact of the policy at hand on
growth has to be spelled out in terms of level effects when adapted to the context of the
Solow model. In some of the upcoming chapters, we will follow this approach.
Last but not least, the AK model illustrates how linearity avoids the basic
problem of diminishing returns, generating long-term growth. Linearity is a basic
feature of most endogenous growth models, including those focusing on technological
change. The AK model can therefore be interpreted as a toy version of more complex
endogenous growth models, whereby knowledge expands through investments in R&D.
Note that knowledge shares with capital the characteristic that it can be built over time
by sacrificing some of today’s consumption. So, interpreting investment as foregone
consumption in a broad sense (that is, including physical assets, human capital and
R&D), one can see the AK model as a general framework to think about the mechanics of
economic growth. Having said this, one should state that physical capital and
knowledge have quite different natures. For this reason, in what follows, we will enrich
the model so as to distinguish the accumulation of new knowledge from simple
accumulation of physical and human capital.
121
Easterly (2005) calibrated a simple neoclassical growth model with a share of total capital equal to 2/3
(which accords to MRW) and with other reasonable values for the remaining parameters. He found that a
tax decrease from 30% to zero raises per capita income by a factor of 2.25. The author also showed
that immediately after the change in policy, the growth rate of the economy shoots up by almost 8
percentage points relative to its steady state. Only in the very long run (more than 5 decades later) does the
growth effect wear off and the growth rate return to its long run level. The author concluded that
policies have significant effects in the neoclassical model, too (pp. 1024-1026).
Appendix 5.1 Unbalanced growth in the HD model
We argued that, because the Harrod Domar model has no role for prices, only by
an exceptional coincidence of parameters will the economy evolve along the full
employment locus.
To see this, consider first the case in which parameters are such that population
and capital grow exactly at the same rate (that is, sA − δ = n). In that case, both the
capital-labour ratio and output per capita remain constant. In terms of Figure 5.1, this
exceptional case occurs when the break-even investment line coincides with the saving
line. In this case, any starting point is a steady state.
Note however that sA − δ = n does not imply that the economy will grow along
the full employment locus k = B/A in Figure 5.4. For instance, if the economy started
out with labour surplus (point S), then it would move along a path with a constant
capital-labour ratio (kS in the figure), but with increasing unemployment. The only case
in which the economy evolves along the full employment locus is when sA − δ = n and
simultaneously the economy starts out without surplus labour (point R).
Figure 5.4. Unbalanced growth in the Harrod-Domar Model
[Figure: vertical axis K. Lines shown: the full employment locus k = B/A and the paths k* = sB/(n + δ), kS and kU.]
A less fortunate scenario occurs when sA − δ < n. In that case, the economy
does not save enough to keep the capital-labour ratio unchanged. In (5.5), we see
that in this case the growth rate of per capita income is negative, implying a rising
labour surplus and chronic underproduction (path kU in Figure 5.4).
The best scenario in the HD model occurs when the parameters in the economy
are such that the capital stock grows faster than population (that is, when sA − δ > n).
In this case, per capita income increases until the surplus labour is completely
eliminated. Still, the mechanics of the model is such that per capita income cannot
grow indefinitely. The reason is that once the full employment line is crossed
(point R in Figure 5.4), the binding constraint in production becomes the availability of
labour (that is, the relevant segment of the production function in (5.6) shifts to
Y = BN). Hence, beyond this point output will be bound to expand at the same rate as
population, implying a constant level of per capita income thereafter. In Figure 5.4, this
case is represented by path k*, with surplus of capital in the steady state122.
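The two unbalanced cases can be traced numerically. The sketch below assumes a Leontief technology Y = min(AK, BN), so that output per worker is min(Ak, B), and the accumulation rule dk/dt = sy − (n + δ)k, integrated with simple Euler steps. The function names and all parameter values are hypothetical choices made for this illustration.

```python
A, B = 0.5, 1.0        # output per unit of capital and per worker (illustrative)
n, delta = 0.02, 0.05  # population growth and depreciation (illustrative)


def output_per_worker(k):
    """Leontief technology Y = min(A*K, B*N), expressed per worker."""
    return min(A * k, B)


def simulate(k0, s, periods=2000, dt=0.1):
    """Euler steps for dk/dt = s*y - (n + delta)*k; returns the final k."""
    k = k0
    for _ in range(periods):
        k += dt * (s * output_per_worker(k) - (n + delta) * k)
    return k


k_full = B / A  # full employment locus, k = B/A

# Case sA - delta > n: k crosses B/A and settles at k* = sB/(n + delta).
s_high = 0.3
k_star = s_high * B / (n + delta)
k_end_high = simulate(1.0, s_high)

# Case sA - delta < n: k (and per capita income) keeps falling.
s_low = 0.1
k_end_low = simulate(1.0, s_low)

print("full employment locus:", k_full)
print("k* and simulated k:", round(k_star, 3), round(k_end_high, 3))
print("k with low saving:", round(k_end_low, 3))
```

With the high saving rate the economy ends with idle capital (k above B/A) and constant output per worker equal to B; with the low saving rate the capital-labour ratio declines from its starting value.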
• The AK model reveals in a simple manner that, once diminishing
returns are removed, factor accumulation alone can generate continuous growth of per
capita income.
• In the context of the AK model, changes in the saving rate produce
“growth effects” rather than “level effects”.
• The predecessor of the AK model was the Harrod-Domar model. Since
this model assumes unemployment of labour, the main constraint to
economic growth is capital availability. A similar story was formulated
by Sir Arthur Lewis for the context of developing countries.
• The Harrod-Domar model inspired the idea that complementing low
domestic savings in poor countries with foreign aid would be a key to
generating economic growth. In practice, however, the impact of external
aid on growth varied significantly across countries, depending on the
quality of domestic policies, institutions, and geography. This suggests,
once again, a key role for aggregate efficiency in determining growth
performances.
• Extending the AK model to the case with endogenous savings, the direct
effect of aggregate efficiency on growth is reinforced by an indirect
effect via a higher return on savings. The implication is that, wherever
financial markets are more developed so that households can smooth
consumption over time, the impact of policy changes on growth is more
dramatic.
• The model with endogenous savings appeals to the distinction between
proximate causes of growth and fundamental causes of growth. As for
the proximate causes, we are explaining differences in economic
performance by differences in saving rates and differences in aggregate
efficiency. As for fundamental causes, one wants to deepen the analysis
so as to understand why some countries have higher saving rates and
better efficiency than others.
122 Dividing both sides of $Y = BN$ by $K$, and substituting for $Y/K$ in the Harrod-Domar equation (5.4),
the growth rate of the economy in this segment becomes: $\gamma_t = sB/k_t - (n + \delta)$. Since this expression
depends negatively on $k$ (that is, as $k$ rises, its growth rate declines), this segment of the model has a
stable equilibrium. Solving for $\gamma = 0$, one obtains the steady state level of capital per worker:
$k^* = sB/(n + \delta) > B/A$ (see Barro and Sala-i-Martin, 1995, pp. 46-49, for details). The implication is
that, after crossing the full employment locus, the economy will evolve along the path $k \to k^*$, with
unemployment of capital.
• Extending the model to include human as well as physical capital does
not change its properties, as long as there are constant returns to scale on
reproducible factors. This extension suggests that one can interpret K in
the simple AK model in a broad sense, including different types of
capital and even technology.
• An alternative extension consists in specifying a production function for
the change in human capital that depends on the stock of human capital.
As long as this production function is linear, the model will display
unceasing growth in the steady state. The interesting feature of this
model, first proposed by Uzawa, is that changes in the saving rate
produce level effects, just like in the Solow model.
• A branch of the literature explored the possibility of obtaining unceasing
growth within the neoclassical framework. For this to happen,
one only needs to assume that the marginal product of capital is bounded
below by a positive constant that is higher than the rate of time
preference. This prevents the interest rate from falling to a point
where households desire consumption to be constant, as happens in the
Solow model.
• The empirical evidence of conditional convergence suggests that the
neoclassical model, with its implied hypotheses of level effects and
conditional convergence, fits reality better than the simple AK
model.
Key concepts
• Surplus labour
• Proximate versus fundamental causes of economic growth
• Broad concept of capital
• Neoclassical models of endogenous growth
• Two sector model of endogenous growth
• Cross-country growth regressions
Essay questions:
a) Referring to the Harrod-Domar equation, compare the AK model and the
Solow model with respect to the variables that are exogenous and
endogenous. In particular, examine the impact of an increase in the
saving rate in light of the two models.
b) Comment: “The proof that the AK model is not true is that foreign
assistance to poor countries failed to deliver faster economic growth”.
c) Comment: “Poor countries, with underdeveloped financial markets, are
more likely to tolerate bad policies than rich countries with developed
capital markets”.
d) Explain why the Uzawa model is hybrid. In the context of this model,
which policies could influence the rate of economic growth?
Exercises
5.1.
Consider an economy where the production function is given by $Y = AK$. In this
economy, the saving rate is s, the population grows at rate n and the capital
depreciation rate is δ.
a) Does this production function satisfy the usual neoclassical properties?
Why?
b) Describe analytically and graphically the dynamics of per capita income
in this economy. Is there any stable equilibrium?
c) Does this model predict convergence of per capita incomes across
economies?
d) Describe, comparing with the Solow model, the impact of: (i) a fall in the
population growth rate; (ii) an increase in A.
5.2.
Consider an economy where the production function is given by $Y = 0.2K$, the
population grows at 2% per year, capital depreciates at 3% and the saving rate is
25%.
a) Find the growth rate of per capita income in this economy.
b) What will be the effect of A increasing to 0.25?
c) Now assume that the saving rate is endogenous, as implied by the
following optimal consumption rule: $\gamma_t = r_t - 0.17$. Analyse in this case
the implications of an increase in efficiency from 0.2 to 0.25.
d) Comparing the two models, find the expression that relates the
saving rate to efficiency (A). Explain why a change in the efficiency
parameter (A) has a larger impact when savings are endogenous.
5.3.
In Micronésia, the aggregate production function is given by $Y = K^{0.5} H^{0.5}$, where
$H = hN$, $N$ is the number of workers, and $h$ measures the amount of human capital
per worker. In this economy, the saving rate is given by s=25%, the population is
constant and the rate of depreciation of physical capital is equal to 5%.
a) Assume for the moment that $h = 1$. Find the equilibrium values of
$k = K/N$ and $y = Y/N$. Explain the dynamics of the model with the help of a
graph.
Assume now that $\dot{h} = s_h y - \delta h = 0.25y - 0.05h$.
b) Explain this specification.
c) In this case, are the properties of the neoclassical model satisfied? Why?
d) Find the growth rate of per capita income in this economy.
e) Compare, in the light of both models: (i) the short run and the long run
effects of a rise in the saving rate; (ii) the convergence hypothesis.
5.4.
In Landowr, the production function is given by: $Y = 0.81 K^{1/3} H^{2/3}$, where
$H = hN$ measures labour in units of human capital. In this country, the investment rate in
physical capital is 20%, the population does not grow ($n = 0$) and the depreciation rate of
physical capital is 4.49%.
a) Consider, for a moment, that $\dot{h}/h = 0.0271$. Which model is consistent
with this assumption? Find the corresponding equilibrium values of
K/H and Y/H.
b) Now assume that h h  b 1     0.0271 .To which model does this
specification apply? Taking into account the production function,
compute the implied values of b e . Find out the steady state of this
model and represent it in a graph. Describe the effects on Y/N and Y/H
of an increase in b to 0.306643. Interpret.
c) Finally, consider, as an alternative, the following specification:
$\dot{h} = s_h y - \delta h = 0.05926y - 0.0449h$. Explain this model. Compute the
dynamics of per capita income in this case.
5.5.
Consider the following production function and law of motion of per capita
consumption:
Yt  At K t N t H t1   , with  ,   1
 r.
Assume that the depreciation rate is identical for the two capital types and that
population does not grow over time.
a) Suppose that   ,   0 .
i. Explain whether it is possible to obtain sustained growth of per capita income in
the long run through factor accumulation.
ii. Describe the impact of an increase in  in the interest rate and in per
capita income.
b) Suppose that   ,   0 . Discuss the advantages of this
parameterization, comparing them to the results obtained in (a).
c) Finally, suppose that .
iii. Explain whether it is possible to obtain sustained growth in the long run through
factor accumulation.
iv. Describe the impact of an increase of  in the interest rate and in per-
capita income.
6. External economies and learning by doing
“One for all! All for one!” [Alexandre Dumas]
Learning Goals:
• Distinguish between internal and external economies of scale
• Acknowledge the different types of external economies related to capital
accumulation
• Distinguish the implications of non-decreasing versus increasing returns
to scale for economic growth and convergence
• Explain why increasing returns are a source of cumulative causation
• Discuss the role of external economies in shaping comparative
advantages and the pattern of international trade
6.1 Introduction
As mentioned in Chapter 4, one way of accounting for a larger role of
reproducible factors in production than that implied by the corresponding shares in
national income is to assume the existence of externalities. This chapter shows how
externalities associated with capital accumulation have indeed the potential to overcome
the limitation imposed by diminishing returns, leading to unceasing growth.
This avenue was first explored by Marvin Frankel, as early as 1962. Frankel's aim
was to reconcile the convenient properties of the Cobb-Douglas production
function regarding factor allocation and income distribution with the
non-decreasing returns and endogenous growth implied by the Harrod-Domar
model. The key he proposed for such a reconciliation was to assume the existence of
externalities associated with physical capital, so that aggregate productivity becomes a
positive function of the number of firms in the economy. Frankel’s contribution
remained, however, unnoticed by the literature until Paul Romer came out with a similar
idea: in a famous article written in 1986, Romer argued that externalities in knowledge
accumulation may give rise to increasing returns at the economy-wide level, implying
unceasing growth. In this model, knowledge and physical capital are assumed to move
together, an assumption that was previously explored by Kenneth Arrow in his model of
“learning by doing”. A related work by Robert Lucas Jr., in 1988, emphasised the role
of externalities associated with investment in human capital. The theories of endogenous
growth based on externalities in capital accumulation marked the first wave of the
so-called “new growth theory”.
This chapter reviews these theories and explores the policy implications. Section
6.2 describes the model introduced by Marvin Frankel. Section 6.3 explains why the
competitive equilibrium with externalities is not efficient and discusses the possible role
of the government in addressing the market failure. Section 6.4 addresses the specific
case with increasing returns. Section 6.5 focuses on the model of learning by doing.
Section 6.6 discusses the implications of learning by doing for comparative advantages
and international trade. Section 6.7 concludes.
Modelling the externality
In his 1962 paper, Frankel first observed that the Cobb-Douglas production
function is capable of describing factor allocation and income distribution but is not
capable of generating sustained growth of per capita income. In turn, the AK production
function is capable of generating long-run growth, but it does not offer a satisfactory
theory for factor allocation and income distribution.
Frankel then proposed a method to reconcile the two production functions, so
that the desirable properties of each, but none of their limitations, are retained: the key was
to introduce a production externality, whereby the “overall level of development of a
region” impacts positively on the productivity of each private firm (footnote 123). Frankel related
the externality to “various external effects” related to capital accumulation, such as
“improvements in the level of organization, technical change, better social overhead
facilities in the form of transport and communication networks, etc”.
To capture this idea, Frankel assumed that each individual firm faces a Cobb-Douglas
production function, where TFP is a positive function of the economy-wide
capital stock. Formally, let the production function for each individual firm i be given
by:
$$Y_i = B K_i^{\beta} N_i^{1-\beta}, \qquad (6.1)$$
where $Y_i$, $K_i$ and $N_i$ denote, respectively, output, capital and labour employed by firm
i. The TFP parameter, B (the “development modifier”, as coined by Frankel), was
assumed to depend positively on the aggregate level of capital per worker:
$$B = A \frac{K^{\eta}}{N^{\eta'}}, \quad \text{with } \eta \le 1 - \beta \text{ and } 0 \le \eta' \le \eta, \qquad (6.2)$$
where $K = \sum_i K_i$ and $N = \sum_i N_i$ stand, respectively, for the aggregate levels of capital
(human, physical) and labour in the economy.
According to (6.2), an increase in the aggregate stock of capital impacts
positively on the productivity of each firm. Thus, whenever a firm accumulates capital
for private reasons, it will be “indirectly” contributing to the productivity of all other
firms. Because each firm is small relative to the economy, it will ignore this external
effect. Production externalities specified in this manner are labelled “Marshallian
externalities” (see Box 6.3).
123
Frankel (1962): “Enterprises in relatively developed or advanced economies are able to produce more
with given inputs of capital and labour than enterprises in relative underdeveloped economies. This is the
essence of economic development”.
The productivity term (6.2) also accounts for a negative externality on aggregate
labour, in case the exponent on aggregate labour is positive ($\eta' > 0$). This effect captures the possibility of the positive externality
related to capital accumulation being partially or totally diluted by the size of the labour
force. When, for instance, the exponents on capital and labour coincide ($\eta = \eta'$), firm productivity will depend on the aggregate
stock of capital per worker, rather than on the aggregate stock of capital in absolute
terms. When instead $\eta' = 0$, what matters is the absolute level of capital in the
aggregate, not the capital-labour ratio. In the following, we’ll discuss cases in which
one or the other assumption makes sense.
The aggregate production function
Because of the externality, the aggregate production function differs from the
individual production functions, even if all firms are alike.
The aggregate production function is obtained by substituting (6.2) in (6.1) and
summing across all firms. This gives:
$$Y = A K^{\beta+\eta} N^{1-\beta-\eta'}, \qquad (6.5)$$
where $Y = \sum_i Y_i$.
The novelty of production function (6.5) is that it may deviate from the
neoclassical assumptions of constant returns to scale and diminishing marginal returns.
For instance, when $\beta + \eta = 1$, returns to capital are non-decreasing. As we already
know, this is the condition we need for a model to display unceasing growth through
capital accumulation. On the other hand, whenever $\eta > \eta'$, the aggregate production
function will display increasing returns to scale: that is, raising capital and labour by a
given proportion will lead to a more than proportional increase in output. As we will
discuss next, this property makes size an advantage, causing the model to display
circular causation.
Note that at the individual level, the production functions retain the neoclassical
properties of constant returns and diminishing returns to capital. The aggregate
production function departs from these properties because of the externality, which
individual firms – because they are too small - do not take into account.
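The aggregation step can be checked numerically. The following Python sketch is an illustration, not part of Frankel's exposition: it sums the output of identical firms that each take the productivity term B as given and compares the result with the closed form (6.5). The function names, the parameter names (`beta` for the capital exponent in (6.1), `eta` and `eta_p` for the exponents on aggregate capital and aggregate labour in (6.2)) and all numerical values are assumptions chosen for illustration.

```python
import math

def development_modifier(K, N, A, eta, eta_p):
    """Productivity term (6.2): rises with aggregate capital, diluted by aggregate labour."""
    return A * K**eta / N**eta_p

def firm_output(K_i, N_i, B, beta):
    """Firm-level Cobb-Douglas output (6.1); the firm treats B as a constant."""
    return B * K_i**beta * N_i**(1 - beta)

def aggregate_output(K, N, A, beta, eta, eta_p, n_firms=100):
    """Sum the output of n_firms identical firms that share K and N equally."""
    B = development_modifier(K, N, A, eta, eta_p)
    return n_firms * firm_output(K / n_firms, N / n_firms, B, beta)

def closed_form(K, N, A, beta, eta, eta_p):
    """Aggregate production function (6.5)."""
    return A * K**(beta + eta) * N**(1 - beta - eta_p)

# Illustrative (assumed) parameter values with eta > eta_p
A, beta, eta, eta_p = 1.5, 0.3, 0.4, 0.1
K, N = 50.0, 20.0

Y = aggregate_output(K, N, A, beta, eta, eta_p)
# Ratio of outputs when both inputs are doubled
scale_factor = aggregate_output(2 * K, 2 * N, A, beta, eta, eta_p) / Y

print("summed firms:", Y, "closed form:", closed_form(K, N, A, beta, eta, eta_p))
print("output ratio after doubling K and N:", scale_factor)
```

Under these assumed values the exponents in (6.5) sum to more than one, so doubling both inputs should more than double output, while each firm still faces constant returns in its own inputs.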
Box 6.1. Externalities
The most basic type of market failure is an externality. Externalities are present
whenever an individual takes an action that directly affects the environment of others
but for which it neither pays nor is paid in compensation.
In a consumption externality, the utility of one agent is directly affected by the
consumption decisions of another agent. For instance, you may benefit from the fact that
your neighbour hires private security: your house will be safer, even if you don’t pay
for it.
In a production externality, the utility of one agent is directly affected by the
production decisions of another agent. A textbook example is the steel mill and the
laundry: smoke emissions by a steel mill may directly affect the production of clean
clothes by a laundry. In that case, there is a negative externality on production.
In the presence of externalities, the market mechanism does not deliver an
efficient allocation of resources. In the case of negative externalities, individuals do not
bear the full cost of their actions, so they will engage in socially excessive activity.
Conversely, in the case of positive externalities, individuals do not enjoy the full
benefits of their activities, so they will engage in too little activity. In both cases, a
careful use of discriminatory taxation by the government may improve, at least
theoretically, economic performance.
Box 6.2. Internal and external economies of scale
The distinction between “internal” and “external” economies of scale dates back
to Scitovsky (1954). “Internal” economies of scale refer to the case in which a single
firm faces a downward sloping average cost curve when increasing its own output level.
In this case, there is a tendency for the firm to become larger and larger and to become
a monopolist in the market. Internal economies of scale are inherently linked to
imperfect competition.
The concept of “external” economies of scale refers to the case in which scale
economies arise at the aggregate (spatial or industry) level. In that case, average costs
for the individual firm decline with aggregate output, but not with the individual firm
output. “External economies of scale” in the aggregate may co-exist with constant
returns to scale and declining marginal productivities at the firm level. Hence, one does
not need to abandon the assumption of perfect competition.
Factor prices in the competitive equilibrium
Irrespective of the shape of the aggregate production function, each firm will
see its own (individual) production function (6.1) as having the standard neoclassical
properties of constant returns to scale and diminishing returns to capital. The reason is
that each firm is small relative to the economy: since the impact of a firm's investment
decisions on the aggregate is negligible, each firm will take parameter B as exogenous.
Thus, profit maximization by price-taking firms will lead to the usual conditions
stating that firms employ labour and capital until their marginal products equal the
respective factor prices:
$$r + \delta = \frac{\partial Y_i}{\partial K_i} = \beta \frac{Y_i}{K_i}, \qquad (6.3)$$
and
$$w_t = \frac{\partial Y_i}{\partial N_i} = (1 - \beta) \frac{Y_i}{N_i}. \qquad (6.4)$$
Because all firms are equal, we have $Y_i/K_i = Y/K$ and $Y_i/N_i = Y/N$. Hence, in
the competitive equilibrium, the shares of capital and of labour in domestic income will
be given, respectively, by $\beta$ and $1 - \beta$. This is the very convenient result Frankel wanted
to stick with, for the model to be in accordance with the Kaldor stylized facts.
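The factor share result can be illustrated with a small numerical check. The sketch below, with assumed parameter values and hypothetical function names, approximates the firm's marginal products by central differences while holding B fixed, and then computes the implied income shares. Here `beta` denotes the capital exponent of the firm-level Cobb-Douglas function (6.1).

```python
import math

def firm_output(K_i, N_i, B, beta):
    """Firm-level production function (6.1), with B taken as given by the firm."""
    return B * K_i**beta * N_i**(1 - beta)

def marginal_product_capital(K_i, N_i, B, beta, h=1e-6):
    """Central-difference approximation to dY_i/dK_i, holding B and N_i fixed."""
    return (firm_output(K_i + h, N_i, B, beta)
            - firm_output(K_i - h, N_i, B, beta)) / (2 * h)

def marginal_product_labour(K_i, N_i, B, beta, h=1e-6):
    """Central-difference approximation to dY_i/dN_i, holding B and K_i fixed."""
    return (firm_output(K_i, N_i + h, B, beta)
            - firm_output(K_i, N_i - h, B, beta)) / (2 * h)

# Illustrative (assumed) values
K_i, N_i, B, beta = 4.0, 9.0, 2.0, 0.3

Y_i = firm_output(K_i, N_i, B, beta)
# Factor payments as a fraction of the firm's output
capital_share = marginal_product_capital(K_i, N_i, B, beta) * K_i / Y_i
labour_share = marginal_product_labour(K_i, N_i, B, beta) * N_i / Y_i

print("capital share:", capital_share)
print("labour share:", labour_share)
```

Because the firm ignores its own effect on B, the shares depend only on the private exponent, whatever the size of the externality.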
The AK model again
Equation (6.5) is general enough to account for all types of external economies.
A particular case occurs when $\eta = \eta' = 1 - \beta$. In this very special case (on which Frankel
focused), the size of the positive externality in K is exactly enough to offset the
normal process of diminishing returns to capital, and, at the same time, the negative
externality on labour exactly matches the externality on capital, implying that returns to
scale remain constant. When this is so, the aggregate production function (6.5) becomes
exactly linear in K:
$$Y_t = A K_t, \quad A > 0 \qquad (5.1)$$
That is, the production function takes the AK form at the aggregate level, but it
retains the neoclassical properties at the individual firm level. Each firm perceives its
production function as having diminishing returns to capital, so it will employ capital
and labour according to (6.3) and (6.4). In the aggregate, the production function will be
linear in K, so the marginal product of capital will never decline and the economy will
never stop growing.
The advantage of this model when compared to the simple AK model is that it
does not rely on the peculiar assumption that labour plays no role in production. Like in
the Solow model, both factors are used in production. Moreover, the model accords with
the main Kaldor stylized facts: the share of capital in income is equal to $\beta$; the wage rate
and per capita income will grow steadily over time and the user cost of capital is
constant and equal to $r + \delta = \beta A$ (equation 6.3).
6.3. The market failure and optimal intervention
The social return to capital
A novelty in the model with externalities is that the competitive equilibrium will
no longer be efficient. The reason is that each firm, being small relative to the economy,
will decide its capital stock taking into account the impact of that decision on its own
profits, only. The positive contribution of its investment decisions to the overall capital
stock will be considered negligible and hence ignored. Still, the investment decisions of
all firms taken together will impact positively on the profits of each individual firm.
Thus, the competitive equilibrium will deliver a suboptimal level of investment.
Formally, the marginal contribution of aggregate capital to aggregate production
(i.e., taking into account the externalities) as stated in (6.5) is:
$$\left( \frac{\partial Y}{\partial K} \right)_{social} = (\beta + \eta) \frac{Y}{K} \qquad (6.6)$$
However, in its profit maximization problem, the firm considers only the narrow
private return to capital (equation 6.3).
Hence, as long as there is an externality on capital accumulation ($\eta > 0$), the
competitive equilibrium will deliver an allocation of resources in which the private
return to capital and the marginal contribution of capital to production differ.
Note that this constitutes an important novelty relative to the Solow model. By
assuming away externalities and other market failures, the Solow model implies that, in
a competitive equilibrium, factor prices are equal to their respective contributions to
production. So, a central planner concerned with efficiency would choose an allocation
of resources matching exactly the competitive equilibrium.
This model, by introducing externalities, implies that the equilibrium allocation
in the decentralized economy will not be efficient. The wedge between private returns
and social returns to capital implies that incentives are misaligned: in the decentralized
economy, investment will be too low.
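The wedge between the two returns can be made concrete with a short numerical sketch. The code below is only an illustration under assumed parameter values: it differentiates the aggregate production function (6.5) numerically with respect to aggregate capital and compares the result with the return a firm perceives when it treats B as given. The names `beta`, `eta` and `eta_p` (private capital exponent, and the externality exponents on aggregate capital and labour) are labels chosen here for illustration.

```python
import math

def aggregate_output(K, N, A, beta, eta, eta_p):
    """Aggregate production function (6.5)."""
    return A * K**(beta + eta) * N**(1 - beta - eta_p)

def private_return(K, N, A, beta, eta, eta_p):
    """Marginal product perceived by a firm that takes B as given: beta * Y / K, as in (6.3)."""
    return beta * aggregate_output(K, N, A, beta, eta, eta_p) / K

def social_return(K, N, A, beta, eta, eta_p, h=1e-4):
    """Central-difference derivative of aggregate output with respect to aggregate capital."""
    up = aggregate_output(K + h, N, A, beta, eta, eta_p)
    down = aggregate_output(K - h, N, A, beta, eta, eta_p)
    return (up - down) / (2 * h)

# Illustrative (assumed) values
A, beta, eta, eta_p = 1.0, 0.3, 0.2, 0.2
K, N = 100.0, 50.0

Y = aggregate_output(K, N, A, beta, eta, eta_p)
private = private_return(K, N, A, beta, eta, eta_p)
social = social_return(K, N, A, beta, eta, eta_p)
# The part of capital's contribution that no individual firm internalizes
wedge = social - private

print("private return:", private)
print("social return:", social)
print("wedge:", wedge)
```

The wedge is proportional to the externality exponent on capital, so it vanishes only when that exponent is zero, which is the Solow case.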
Growth accounting revisited
The existence of a wedge between private returns and social returns to capital
has important implications for growth accounting. Conventional growth accounting (as
exemplified in Section 2.6) typically uses the share of capital in national income as a
proxy for the contribution of capital to production. The discussion in Chapter 4
revealed, however, that this delivers an estimate that is too small to account for the
observed differences in per capita incomes across countries. In order to account for such
large differences, one would need a contribution of capital to production much larger
than that implied by the observed income shares.
This puzzle was actually solved by Frankel as early as 1962: Frankel argued
that a much larger contribution of capital to production than that implied by the
observed shares in national incomes could be explained by externalities.
Formally, equation (6.5) reveals that, as long as the externality parameter $\eta$ is
positive, the actual contribution of capital to production ($\beta + \eta$) is larger than that implied
by its share in income ($\beta$). Log-differentiating (6.5), one obtains:
$$\hat{Y} = (\beta + \eta) \hat{K} + (1 - \beta - \eta') n \qquad (6.7)$$
In (6.7), input changes have two effects, a direct one and an external effect. The
external effect may amplify or diminish the direct effect, depending on the sign and
magnitude of the respective parameter. For instance, when $\eta = 1 - \beta$, a one percent
increase in the capital stock will result in a one percent increase in
output, a result that conforms with the AK model (and that Frankel argued to conform
as well to the U.S. data).
Equation (6.7) suggests that conventional growth accounting, by
underestimating the effective contribution of capital to production, overestimates the
Solow residual.
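A small worked example may help to see the size of the bias. The sketch below uses assumed numbers and hypothetical function names: it generates output growth from (6.7) with the efficiency parameter A held constant, and then computes the residual that a conventional exercise would report when it weights capital and labour by their income shares.

```python
import math

def output_growth(k_hat, n, beta, eta, eta_p):
    """Output growth implied by (6.7), with the efficiency parameter A held constant."""
    return (beta + eta) * k_hat + (1 - beta - eta_p) * n

def conventional_residual(y_hat, k_hat, n, beta):
    """Solow residual computed with income shares as weights."""
    return y_hat - beta * k_hat - (1 - beta) * n

# Illustrative (assumed) values: the AK case, where both externality
# exponents equal one minus the capital share
beta = 1 / 3
eta = eta_p = 2 / 3
k_hat, n = 0.04, 0.01   # growth rates of capital and labour

y_hat = output_growth(k_hat, n, beta, eta, eta_p)
residual = conventional_residual(y_hat, k_hat, n, beta)

print("output growth:", y_hat)
print("measured residual:", residual)
```

In this illustration the measured residual is positive even though A does not change: it simply picks up the external effects of capital and labour growth, which conventional weights leave out.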
Optimal intervention
The wedge between social returns and private returns to capital constitutes a
market distortion. Firms tend to under-invest in physical capital relative to what would
be considered optimal by a benevolent planner. The government, given this sort of
diagnosis, may have a role in using policy to achieve the optimal allocation of
resources.
How might government policy be used to establish the efficient allocation? An
obvious avenue is to subsidize capital accumulation. To see this, let’s rewrite the
individual firm profit function, but allowing for a tax rate $\tau_K$ (a subsidy, when negative)
on capital incomes:
$$\pi_i = B K_i^{\beta} N_i^{1-\beta} - (r + \delta)(1 + \tau_K) K_i - w N_i \qquad (6.8)$$
In light of this specification, the cost of one unit of capital (the cost to firms) is
$(r + \delta)(1 + \tau_K)$. What households, the owners of capital, receive as net income is
$(r + \delta)$.
From the first order conditions of profit maximization, one obtains (instead of
6.3):
$$\frac{\partial Y_i}{\partial K_i} = \beta \frac{Y}{K} = (r + \delta)(1 + \tau_K) \qquad (6.9)$$
To remove the distortion, the government should set the tax rate so that the (net)
rental price of capital, $(r + \delta)$, fully reflects the marginal contribution of capital to
aggregate output, as given by (6.6). That is, the tax rate $\tau_K$ should be such that:
$$r + \delta = \frac{\beta}{1 + \tau_K} \frac{Y}{K} = (\beta + \eta) \frac{Y}{K} \qquad (6.10)$$
Solving for $\tau_K$, the optimal (first best) policy is to set:
$$\tau_K = -\frac{\eta}{\beta + \eta} < 0 \qquad (6.11)$$
Thus, the optimal policy in this model involves subsidizing physical capital
accumulation (footnote 124). This result is intuitive: if the contribution of capital to production is
given by (6.6) and private firms only perceive it to be equal to (6.3), then a subsidy
filling the gap will achieve the aim of getting private returns aligned with the social
interest.
In the particular case in which $\eta = 1 - \beta$ (the AK model), the optimal tax rate
will be $\tau_K = \beta - 1$ (note, however, that in this extreme case all income in the economy
would accrue to capital owners and nothing would be left to raw labour; this would
only be possible if K referred to a broad concept of capital, including human capital).
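The optimal subsidy can be verified with a few lines of arithmetic. The following sketch is an illustration under assumed values, with hypothetical function names: it computes the tax rate that closes the gap between the private and the social return and checks that, once the tax is applied, the net rental price received by households equals the social marginal product of capital. `beta` stands for the private capital exponent and `eta` for the externality exponent on aggregate capital.

```python
import math

def optimal_capital_tax(beta, eta):
    """Tax rate on capital income that aligns private and social returns (negative = subsidy)."""
    return -eta / (beta + eta)

def net_rental_price(beta, tau_k, output_capital_ratio):
    """Net rental price implied by the firm's first order condition under tax rate tau_k."""
    return beta / (1 + tau_k) * output_capital_ratio

# Illustrative (assumed) values
beta, eta = 0.3, 0.2
output_capital_ratio = 0.5      # Y/K

tau_k = optimal_capital_tax(beta, eta)
rental_with_subsidy = net_rental_price(beta, tau_k, output_capital_ratio)
# Social marginal product of capital: (beta + eta) * Y / K
social_return = (beta + eta) * output_capital_ratio

# AK special case: externality exponent equal to one minus the capital share
tau_k_ak = optimal_capital_tax(beta, 1 - beta)

print("optimal tax rate:", tau_k)
print("rental price with subsidy:", rental_with_subsidy, "social return:", social_return)
print("optimal tax rate in the AK case:", tau_k_ak)
```

The subsidy grows with the externality exponent; in the AK case it is large enough that the net rental price equals the whole average product of capital.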
Growth effects
An important implication of this model is that removing the distortion leads to
greater efficiency and, thereby, to a higher rate of economic growth. To see this,
consider again the optimal consumption rule $\gamma = r - \rho$ and let’s focus on the particular
case in which $\eta = \eta' = 1 - \beta$ (the AK model).
124
Of course, a question arises on how this subsidy will be financed. For the moment, just assume that
lump sum taxation is available, so that the policy will not imply further distortions. The issue of
distortionary taxation and second best decision-making will be addressed in Chapter 11.
In the competitive equilibrium, the interest rate is determined according to (6.3).
Substituting r in the optimal consumption rule, one obtains the growth rate of per capita
income in the decentralized economy:
$$\gamma = \beta \frac{Y}{K} - \delta - \rho = \beta A - \delta - \rho \qquad (6.12)$$
If, however, the government managed to influence the interest rate so as to reflect
the social contribution of capital, the user cost of capital would become equal to (6.6).
Thus, the growth rate of the economy would be:
$$\gamma^* = (\beta + \eta) \frac{Y}{K} - \delta - \rho = A - \delta - \rho \qquad (6.13)$$
Comparing (6.12) and (6.13), we see that the growth rate of this economy under central
planning will be higher than under laissez-faire.
This example illustrates how judicious government intervention might be used
to establish the “right” prices and thereby stimulate growth. Note however that such a
“perfect” intervention requires a high level of confidence in the size of the external
effect, as well as the availability of non-distortionary taxation. Whenever these conditions
are not met, the government may well fail to do better than the
market.
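To get a feel for the magnitudes involved, the two growth rates can be compared numerically. The sketch below is illustrative only: the function names and parameter values (`A` for the efficiency parameter, `beta` for the capital share, `delta` for depreciation, `rho` for the rate of time preference) are assumptions, and the AK special case is imposed, so that the output-capital ratio equals A.

```python
import math

def growth_laissez_faire(A, beta, delta, rho):
    """Growth rate when the interest rate reflects only the private return to capital."""
    return beta * A - delta - rho

def growth_planner(A, delta, rho):
    """Growth rate when the interest rate reflects the full social return (AK case)."""
    return A - delta - rho

# Illustrative (assumed) values
A, beta, delta, rho = 0.25, 0.4, 0.05, 0.02

g_market = growth_laissez_faire(A, beta, delta, rho)
g_planner = growth_planner(A, delta, rho)
# Growth forgone because firms ignore the externality
growth_gap = g_planner - g_market

print("laissez-faire growth:", g_market)
print("planner growth:", g_planner)
print("gap:", growth_gap)
```

In this special case the gap equals the labour share times A, so the growth cost of ignoring the externality is larger the smaller the share of output that accrues to capital owners.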
6.4 The case with increasing returns to scale
So far, the analysis focused on the case with $\eta = \eta'$. This is, however, a very
special case. In that case, the positive effect arising from a larger stock of physical
capital is exactly offset by the “dilution” effect resulting from a larger number of
workers. In this version of the model the aggregate production function exhibits
constant returns to scale, even though returns to capital are non-decreasing.
A quite distinct situation occurs when $\eta > \eta'$ (footnote 125). In that case, expanding the use
of capital and labour by a given proportion has a more than proportional impact on
output: the aggregate production function exhibits increasing returns to scale.
Remember that these increasing returns do not arise at the individual firm level, but
instead at the aggregate level: it is because the productivity of each individual firm
depends on aggregate variables that returns to scale arise. Because of this, increasing
returns are said to be external to the firm (Box 6.2).
When the aggregate production function displays increasing returns, there will
be a tendency for the region to become larger and larger. To see this, just note that the
average product of labour in (6.5) becomes equal to:
$$y = Y/N = A k^{\beta+\eta} N^{\eta-\eta'} \qquad (6.14)$$
125 In case $\eta = \eta' < 1 - \beta$, the aggregate production function exhibits constant returns to scale and
diminishing returns to capital. As you may easily check, in that case the steady state growth rate of output
is equal to the growth rate of population, just like in the basic Solow model. Still, because of the
externality, private returns to capital in laissez-faire are too low. The case with $\eta' = 0$ is formally
addressed in Appendix 6.1.
This means that, in a competitive equilibrium, the wage rate, determined
according to (6.4), will also be an increasing function of the size of the workforce.
The implication is that a larger region will be a more attractive place to work
than a smaller region. This will generate a tendency for employment to move to the
larger region, further expanding the larger region and depressing the smaller region.
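The size advantage can be illustrated by comparing two regions with the same capital per worker but different workforces. The sketch below uses assumed parameter values and hypothetical function names: output per worker follows (6.14) and the wage is the labour share of output per worker, as in (6.4). `eta` and `eta_p` denote the externality exponents on aggregate capital and aggregate labour.

```python
import math

def output_per_worker(k, N, A, beta, eta, eta_p):
    """Average product of labour as in (6.14), with k = K/N."""
    return A * k**(beta + eta) * N**(eta - eta_p)

def wage(k, N, A, beta, eta, eta_p):
    """Competitive wage from (6.4): the labour share times output per worker."""
    return (1 - beta) * output_per_worker(k, N, A, beta, eta, eta_p)

# Illustrative (assumed) values: same capital per worker, different region sizes
A, beta, k = 1.0, 0.3, 10.0
N_small, N_large = 100.0, 200.0

# Increasing returns: the externality on capital exceeds the dilution by labour
wage_ratio_irs = (wage(k, N_large, A, beta, 0.3, 0.1)
                  / wage(k, N_small, A, beta, 0.3, 0.1))
# Constant returns: the two exponents coincide, so size does not matter
wage_ratio_crs = (wage(k, N_large, A, beta, 0.3, 0.3)
                  / wage(k, N_small, A, beta, 0.3, 0.3))

print("wage ratio (large/small), increasing returns:", wage_ratio_irs)
print("wage ratio (large/small), constant returns:", wage_ratio_crs)
```

With increasing returns the larger region pays a higher wage at the same capital per worker, which is the incentive for labour to migrate described in the text.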
Cumulative causation
This discussion illustrates why increasing returns are a source of divergence: if,
for whatever reason, a region starts out bigger, increasing returns will ensure that it
becomes a more attractive place to work and invest. With free factor movements, labour
and capital will tend to move from depressed areas to the more dynamic region and the
latter will get bigger and richer, absorbing resources from the rest of the world.
The idea that development brings more development in a virtuous cycle is
central in the development literature and is labelled “cumulative causation” (footnote 126).
Box 6.3 Alfred Marshall and the theory of external economies
The theory of external economies was pioneered by one of the founders of
modern economics, the British economist Alfred Marshall. In his book “Principles of
Economics” (1920, first published in 1890), Marshall was concerned with the question
of why there is a tendency for some industries to concentrate in a few areas within a
country (“industrial districts”). Examples of this at that time included the cutlery
manufacturers in Sheffield and hosiery firms in Northampton. This type of spatial
concentration of industry could not easily be explained by natural resources. In our days, similar
examples include Silicon Valley, Hollywood and Las Vegas.
To explain the tendency for firms of the same industry to cluster together,
Marshall conjectured that the productivity of each firm in a given location may depend
positively on the general progress of the corresponding industry in the same location,
via three types of external effects:
First, the availability of specialized suppliers: in many industries, production
requires the use of specialized inputs, such as intermediate products and specific
supporting activities, that cannot be acquired at a distance because of high transport costs.
For instance, the production of a motion picture requires a variety of services, such as
casting services, sound effects, costume design, choreography, catering, etc. Many of
these services are better purchased from specialized firms, because specialized firms can
spread the fixed costs of their activity across different customers. If, in a given region,
there is only one film producer, it will not pay for upstream suppliers to locate in that
region. Eventually, the latter will prefer locations where there are already many
moviemakers, so that they have a market large enough to break even. By the same
token, moviemakers will find it profitable to join locations where other moviemakers
126
The term was coined by Veblen (1898). It was, however, the Nobel Laureate Gunnar Myrdal (1957)
who popularized the concept. This author contended that labour migration, capital movements and trade
may lead to cumulative expansion of the favoured regions and retard the development of backward
regions, leading to persistent or even divergent spatial differences in per capita income.
are already located, because this will imply a larger market for – and hence a greater
availability of – specialized services, competing with each other127.
Second, labour market pooling: when many firms and specialized workers
are located in a given region, both sides of the market will be less exposed to events
affecting a small number of firms or workers. For instance, the failure of one firm will
be less problematic for a specialized worker located in a region with many firms than if
located in a region with one employer only. The same holds for firms. By clustering
together, both firms and workers will benefit from the law of large numbers, being
therefore less exposed to specific shocks affecting particular agents.
Third, technological spillovers: technological spillovers occur because people
have an incentive to observe what others are doing and imitate the best practices.
Arguably, the process of technological diffusion takes place more effectively when
various firms of the same industry are concentrated in a given location, so that workers
belonging to different firms have the opportunity to meet and discuss technical
problems face-to-face. The mobility of workers across neighbouring firms is also a
channel through which knowledge diffusion accelerates.
All in all, these three types of external effects (often called “Marshallian
externalities”) imply that each firm will become more productive, the more firms of the
same type are located nearby. Formally, this is usually modelled assuming that the
technological parameter of an individual firm’s production function depends positively
on a variable measuring the size of the industry in that location (as done in equation
6.2). In that case, the aggregate production function may display increasing returns to
scale, creating the incentive for firms to cluster together128.
Box 6.10 Externalities on Human Capital
External economies may also show up in human capital accumulation. The main
idea is that people who get educated benefit more in a knowledge abundant society than
in a society with little knowledge.
To understand this, ask yourself why the best graduate economists prefer to
work in the City of London or on Wall Street – where economics graduates are plentiful
– rather than in, say, Mongolia, where they are in very short supply. The economist
working in the City earns a high income in part because his or her own
knowledge is enhanced by that of fellow well-educated economists. This happens
because individuals benefit from interacting with each other. Exchange of ideas with
other professionals enhances individual capabilities.
127
We will address this argument more formally in Chapter 12.
128
In the real world, location decisions also depend on centrifugal (dispersion) forces. These include
congestion effects, whereby the cost of a firm adopting a location rises with the number of adopters. For
instance, the concentration of activities in a small area leads to higher land prices, high commuting costs,
pollution and other sociological factors. Another dispersion force arises due to transport costs: to the
extent that some activities have to be undertaken in the periphery (for instance, agriculture, exploitation of
natural resources), being close to the centre implies higher transport costs in transactions with the
periphery. The allocation of economic activities across the space is therefore determined by the tension
between centripetal forces and centrifugal forces. Classical contributions accounting for these centrifugal
forces in the context of “Marshallian externalities” include Henderson (1974) and Fujita and Ogawa
(1982).
Thus, just like in an assembly line, where the value of each worker's effort
depends on the other workers' efforts, this creates an incentive for the best workers to
match up with each other: if the best economists are assembled together, they will have
better ideas and will get a higher payoff from their skills. If, instead, they are partnered
with lazy or incompetent economists, they will have a lower reward for any effort that
they might individually provide.
Note that this is exactly the opposite of the LDR: with diminishing returns, skills
substitute for each other, so they become more valuable where they are scarcer – in
Mongolia and not in the City. Under diminishing returns, skilled labour would move from
rich countries to poor countries. By contrast, when externalities are present and this
effect is strong enough to overwhelm the conventional diminishing returns, then skilled
labour will be more valuable where it is more abundant: returns to skills for each
individual are an increasing function of the existing skill level in the society129.
This story explains why we see immigration of skilled labour at maximal
allowable rates and beyond from poor countries to wealthy ones and not the other way
around (remember the Lucas paradox in Section 4.2). For instance, the stock of
immigrant workers in the US is, on average, better educated than the average worker in the
home countries. Moreover, for most developing countries, the highest migration rates
are observed in the group of individuals with tertiary education. That is, skilled workers
tend to move to where skilled workers are.130
The same story applies to the other component of human capital, health: a
healthy society impacts positively on individual health through lower contagion of
diseases. Thus, an individual’s health will be a positive function of the average health in
the society. This, in turn, impacts positively on individual productivity, as healthy
societies are expected to spend fewer resources on personal-care services, releasing
working time to production.
As in the case of physical capital, complementarities in human capital imply
cumulative causation and vicious cycles: for instance, in a nation where skill levels are
already deep and well established, people in that nation will have strong incentives to
invest in their human skills. But in poorer economies where the skill base is thin, the
incentive of individuals to invest in human skills is low. Thus, a country will be rich if it
started out rich, and poor if it started out poor131.
129
Also note that a more educated population is more likely to press their governments for good policies
and better governance. We will examine the role of government policies in Part III of this book.
130
Carrington and Detragiache (1998, 1999).
131
Note that the same mechanism applies to regions, cities, families and ethnic groups (Lucas, 1988).
Living in cities, people have more opportunity to work near the highly skilled. This helps explain why
wages for similar skills and education levels are higher in cities than in rural areas and also why people
are able to pay higher rents and property prices there. At the family level there is a tendency for literate
parents (especially literate mothers) to raise healthier and more literate children. This gives rise to vicious
circles: a low human capital generation is succeeded by another low human capital generation, while an
initially high human capital generation would give rise to another high human capital generation. As for
ethnic groups, to the extent that social segregation increases the probability of people of the same ethnic
group to match and work together, there will be a tendency for education levels to converge within each
group: people belonging to the low education ethnic group will not invest in education because working
with people with low education implies a low return to education. On the contrary, people belonging to
the highly educated ethnic group will have an incentive to invest in education, because the chances of
being matched with well-educated people are high.
6.5 Learning by doing
The benefits of experience
A particular type of external economies arises from the benefits of experience.
The main idea is that firms engaged in a specific production process tend to become
more productive as time goes by, because workers, by undertaking similar actions,
perfect their routines and learn to solve minor problems. This benefit occurs through
practice - hence the label “learning by doing” - and is often summarized by a “learning
curve”, that relates the average cost of producing a given good to the cumulative
experience in producing that good132.
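As a minimal sketch of such a learning curve, the snippet below uses the power-law form usually associated with Wright (1936), in which unit cost falls by a constant percentage each time cumulative output doubles. The first-unit cost of 100 and the 80% progress ratio are illustrative assumptions, not figures taken from the text.

```python
import math


def unit_cost(cumulative_units, first_unit_cost=100.0, progress_ratio=0.8):
    """Cost of the n-th unit under a power-law learning curve.

    progress_ratio is the share of cost that remains each time cumulative
    output doubles (0.8 means a 20% fall per doubling). Both default
    values are illustrative assumptions.
    """
    learning_exponent = -math.log2(progress_ratio)  # positive when ratio < 1
    return first_unit_cost * cumulative_units ** (-learning_exponent)


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(f"unit {n:>2}: cost = {unit_cost(n):.2f}")
```

Under these assumed numbers, each doubling of cumulative experience multiplies the unit cost by 0.8, which is the sense in which cost depends on accumulated output rather than on current output.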
Kenneth Arrow used the idea to build a model of endogenous technological
change133. Arrow modelled learning-by-doing at the individual firm level, assuming that
investment in physical capital impacts proportionally on the firm’s stock of knowledge.
The rationale is that, when firms buy a new capital good, they also acquire a new
production technique: learning how to use the new equipment and adapting their
production processes so as to extract full profit from the opportunities opened up takes
time. As workers become more accustomed to the new capital good, their common
stock of technical knowledge increases134.
A key assumption of the learning by doing model is that technological change
occurs as a mere by-product of capital accumulation, so it does not involve deliberate
economic decisions.
Knowledge spillovers again
Another key assumption of the learning-by-doing model is that knowledge
leaks: that is, firms tend to imitate the improvements achieved by fellow firms, so they
all end up benefiting from the accumulated experience of each other.
Thus, when one firm invests in new capital, it adds to its own stock of
knowledge and at the same time to the common stock of knowledge. Formally, an
equation similar to (6.2) arises, whereby total factor productivity at the firm level is an
increasing function of the economy-wide accumulated stock of capital.
With this assumption, the model follows in an intuitive manner: each firm,
perceiving its production function as exhibiting constant returns to scale (CRS), buys new capital until the private marginal
product of capital equals the user cost of capital (eq. 6.3). Buying the state-of-the-art
capital, the firm inadvertently increases its own stock of knowledge, but this effect is
small. Since knowledge leaks, however, the acquisition of physical capital by each one
132
The first person to describe the “learning curve” was the German psychologist Hermann Ebbinghaus
(1850-1909), in a series of tests consisting of memorizing nonsense syllables. In economics, the concept
was first described by an aeronautical engineer called Theodore Wright. Wright (1936) observed that, as
more aircraft of a given type are produced, the costs of production fall dramatically, and proposed a
mathematical model to describe it.
133
Arrow (1962).
134
Arrow (1962), p. 157: “each new machine produced and put into use is capable of changing the
environment in which production takes place, so that learning takes place with continuous new stimuli”.
firm adds to the common stock of knowledge, which impacts positively on the
productivity of all firms. Thus, each firm will be more productive, the higher the
productive experience (measured by the stock of capital) in the economy as a whole (eq.
6.2).
Note that the assumption of knowledge spillovers is critical for the model to be
consistent with perfect competition: if the knowledge created did not leak, the
individual firm accumulating capital would become more productive than its
competitors; its returns would be higher and higher and the conditions would exist for
this firm to grow alone and capture the entire market.
Another critical assumption of the learning-by-doing model is that there is no
negative externality associated with the number of workers. Formally, the parameter governing
the labour externality in (6.2) is assumed to be zero. The reasoning is that knowledge is non-rival: that is, once knowledge is
acquired, many workers and firms can use it without reducing its effectiveness. Thus,
the stock of knowledge is better described as an increasing function of the total capital
stock in the economy, rather than as a function of the economy’s capital per worker. The
implication is that the learning by doing model unequivocally displays increasing
returns.
The aggregate production function is
$Y = AK^{\beta+\eta}N^{1-\beta}$, (6.5a)
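To see why an aggregate function of this kind displays increasing returns, one can scale both inputs by a common factor. The sketch below assumes the notation β for the private capital share and η for the strength of the capital externality, with no labour externality; these symbol names are assumptions made for illustration and may differ from those used in equations (6.1) and (6.2).

```latex
% Assumed notation: \beta = private capital share, \eta = capital externality.
F(K,N) = A K^{\beta+\eta} N^{1-\beta}
\quad\Longrightarrow\quad
F(\lambda K, \lambda N)
  = A (\lambda K)^{\beta+\eta} (\lambda N)^{1-\beta}
  = \lambda^{1+\eta}\, F(K,N).
```

For any λ > 1 and η > 0, output rises more than proportionally with inputs. Returns to capital alone remain diminishing only as long as the exponent on K stays below one.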
Two cases
The critical question in the model is whether the externality on capital
accumulation is strong enough to offset the diminishing returns. Thus, two cases shall be
distinguished:
The first, introduced by Kenneth Arrow, retains the neoclassical assumption of
diminishing returns to capital, even after accounting for the aggregate externality. In
terms of equations (6.1), (6.2), (6.5) and (6.14), the Arrow model corresponds to the
case in which there is no labour externality and the aggregate exponent on capital is below one. In this version of the model there are increasing
returns to capital and labour altogether, but there are diminishing returns to capital
alone. Hence, the model accounts for agglomeration effects but cannot generate
endogenous growth (details in the Appendix 6.1).
The second version of the model, explored by Paul Romer, assumes that
the aggregate exponent on capital is exactly one135, so the model delivers unceasing growth. Moreover, under this assumption,
the interest rate becomes an increasing function of the population size, and so does the
growth rate of per capita income: that is, a larger economy should grow faster than a
smaller economy, a case often referred to as a “strong scale effect” (details in
the appendix).
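The contrast between the two cases can be illustrated numerically. The sketch below is not the model of the appendix: it swaps the endogenous saving decision for a fixed saving rate, and it assumes the notation beta (private capital share) and eta (capital externality) together with purely illustrative parameter values. It only shows how the sum beta + eta governs whether growth fades out or persists, and how population size enters in the second case.

```python
def output_growth_path(beta, eta, population, periods=200,
                       saving_rate=0.2, depreciation=0.05, tfp=1.0):
    """Growth rates of output when Y = tfp * K**(beta + eta) * N**(1 - beta).

    Assumes a fixed saving rate and a constant population, so output growth
    equals per capita income growth. All values are illustrative assumptions.
    """
    capital = 1.0
    output = tfp * capital ** (beta + eta) * population ** (1.0 - beta)
    growth_rates = []
    for _ in range(periods):
        # Capital accumulation: gross saving minus depreciation.
        capital += saving_rate * output - depreciation * capital
        new_output = tfp * capital ** (beta + eta) * population ** (1.0 - beta)
        growth_rates.append(new_output / output - 1.0)
        output = new_output
    return growth_rates


arrow = output_growth_path(beta=0.3, eta=0.3, population=1.0)        # beta + eta < 1
romer = output_growth_path(beta=0.3, eta=0.7, population=1.0)        # beta + eta = 1
romer_large = output_growth_path(beta=0.3, eta=0.7, population=2.0)  # larger workforce

print(f"Arrow case: first-period growth {arrow[0]:.3f}, last-period growth {arrow[-1]:.4f}")
print(f"Romer case: first-period growth {romer[0]:.3f}, last-period growth {romer[-1]:.3f}")
print(f"Romer case, double population: last-period growth {romer_large[-1]:.3f}")
```

In the first run growth dies out as capital accumulates; in the second it stays constant; in the third the larger workforce raises the constant growth rate, which is the scale effect described above.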
Box 6.4 presents a popular illustration for the argument that externalities on
capital accumulation combined with knowledge spillovers may lead to unceasing
growth.
135
In Romer (1986), K stands primarily for knowledge rather than for physical capital. However, the author
assumed that firms invest in physical capital and in knowledge in fixed proportions, so K could also be
interpreted as a composite capital good, making the approach similar to that of Arrow.
Endogenous technological change
Compared to the Solow model, the learning-by-doing model retains the
assumption that knowledge is a public good: in this model, knowledge is non-rival and
spills over instantaneously across firms at zero cost (at least locally). Hence, like in the
Solow model, there are no economic incentives to produce new knowledge.
In contrast to the Solow model, the level of technology is endogenous:
technology arises without purposeful efforts, but comes out as a by-product of capital
accumulation, which is driven by economic decisions. Thus, policies influencing the
rate of capital accumulation will also influence the level of technology and therefore
economic development.
Box 6.4. The Noorul Quader's Desh Factory
A popular illustration of how learning by doing combined with knowledge
spillovers may lead to economic growth is due to Rhee (1990).
The story dates back to 1979, when Daewoo Corporation, a major world textile
producer from South Korea, was looking for a new base to evade the U.S. and European
import quotas against Korean products. Since these quotas did not cover Bangladesh,
the company created a joint venture there to produce shirts, with a former government
official called Noorul Quader. The new company, called Desh Garment Ltd, started
producing in April 1980.
Because Bangladesh had no experience in garment production, 130 Bangladeshi
workers were trained in Korea. This familiarisation with modern production and
techniques allowed the Daewoo collaborative agreement to have long lasting effects in
Bangladesh: Bangladesh became itself a strong exporter of textiles.
The mechanism through which knowledge leaked around was labour mobility.
During the 1980s, 115 workers trained by Daewoo left the company to start their own
businesses. The new firms not only produced garments, but also gloves, coats and
trousers. All in all, an entire exporting sector emerged, just through learning by doing
and knowledge spillovers.
William Easterly, in his famous book “The Elusive Quest for Growth” (2001)
concludes: "The story of the birth of the Bangladeshi garment industry illustrates the
principle that investment in knowledge does not remain with the original investor.
Knowledge leaks". ... "Why hadn't Bangladeshi already been making shirts on their
own, before Daewoo volunteered its service? The answer is that Daewoo had learned
something about how to produce shirts and how to sell them on the world market. ... and
transmitted this knowledge to Desh workers". ... "Creating Knowledge does not
necessarily mean inventing new technologies from scratch. Some aspects of garment
manufactory were probably several centuries old. The relevant technological ideas
might be floating out there in the ether, but only those who apply them can really learn
them and can teach them to others". (pp 148-150).
Localized versus global technological spillovers
A problem with the learning-by-doing model is that it gives rise to a “scale
effect”, whereby the productivity in a given economy rises with the size of its
workforce. This is a direct implication of the non-rival nature of knowledge: since
sharing knowledge does not involve loss of its effectiveness, the larger the population
being served with that knowledge, the better. This prediction does not square well,
however, with the real world facts: in general, there is no systematic tendency for large
countries to be better off than small countries.
As already argued in Chapter 1, one way of reconciling the idea that knowledge
leaks with the real world facts is to delink knowledge spillovers from country borders.
Indeed, when one looks at the effects of changes in policies, such as fiscal policy or
monetary policy, the nation is a natural unit. But from the viewpoint of knowledge,
there are no reasons to believe that France and Luxembourg are isolated countries that
grow solely based on the knowledge created by their own workers. National economies
are embedded in an interdependent global system, so firms belonging to different
economies learn from each other new production methods, models of organisation,
marketing and product design.
Taking this into account, one may interpret the model with technological
spillovers as describing, not the path of a single country along one or two decades, but
instead collections of interdependent economies in the long run. According to this
interpretation, France and Luxembourg should grow at similar rates because they share
the same body of technological knowledge.
The idea that knowledge leaks across borders linking the growth rates of
interdependent economies has to be taken seriously. However, it also has to be
qualified. A well-documented fact in our days is that technological levels are not
uniform across space. Despite all the progress in telecommunications and the internet,
we are far from the neoclassical assumption that knowledge spills over instantaneously
at any distance at no cost. In general, the empirical evidence lends support to the idea
that proximity matters for technological diffusion (see Box 6.5).
Thus, while keeping in mind that technology has the potential to flow
across space, thereby promoting economic convergence, one has to recognize that
geographical distance and other factors may create barriers to technological diffusion,
thereby creating incentives for economic activities to cluster in a given territory136.
This discussion suggests that we need to deepen our understanding of the
mechanics of knowledge diffusion and the role of factors such as geographical distance
and economic policies in delaying that diffusion. In that discussion, one should take into
account that knowledge is not all alike: while some knowledge travels well around the
globe, much knowledge tends to be geographically localized. This question will be
addressed in more detail in Chapter 9.
Box 6.5: Proximity and technological diffusion: empirical evidence
The question as to whether knowledge spillovers tend to be bounded in space or
not is of crucial importance for economic growth and convergence: if most knowledge
spillovers are localized, companies operating nearby benefit more from each other’s
136
This is not to say that geographical distance is the only variable influencing the pace of technological
diffusion. As we will see in Chapter 9, technology diffuses at different speeds across the space, depending
on a number of factors, including the recipient region’s economic, political and social conditions. For the
moment, just hold on to the idea that the possibility of “imperfect technological diffusion” may give rise
to asymmetries in economic development.
innovations than companies located elsewhere. In this case, there will be an incentive
for firms to operate in the same location, giving rise to cumulative causation and
divergence. If, in contrast, knowledge spillovers are mostly global, there will be a
tendency for laggard economies to catch up and to converge.
A strand in the literature has examined technological diffusion in its
geographical dimension, and the general conclusion is that proximity indeed matters.
Jaffe et al. (1993) and Eaton and Kortum (1999), using data on patent citations, found
that technological diffusion is stronger within countries than across countries. Keller
(2002), using intra-industry data, found that with every additional 1200 kilometres
distance there is a 50-percent drop in technological diffusion (irrespective of country
borders). Ciccone and Hall (1996) found that employment density increases labour
productivity, supporting the existence of knowledge spillovers across workers in the
same locations. Audretsch and Feldman (1996) found that innovation structures in the US
tend to be geographically more concentrated than production structures, suggesting that
agglomeration advantages are more prevalent in R&D. Other authors pointed out that,
although proximity matters for technological diffusion, the advantage of proximity has
declined in recent years, suggesting an impact of communication technologies (see
Keller, 2004, for a survey).
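To get a feel for the magnitude cited from Keller (2002), the sketch below reads "a 50-percent drop with every additional 1200 kilometres" as geometric decay with distance. That functional form is an assumption made here for illustration (the function name is hypothetical too); it is not a description of the estimation in the original study.

```python
HALVING_DISTANCE_KM = 1200.0  # figure quoted in the text from Keller (2002)


def remaining_diffusion(distance_km, halving_distance_km=HALVING_DISTANCE_KM):
    """Share of technological diffusion left at a given distance.

    Assumes the 50-percent drop compounds geometrically with distance.
    """
    return 0.5 ** (distance_km / halving_distance_km)


if __name__ == "__main__":
    for km in (0, 600, 1200, 2400, 4800):
        print(f"{km:>5} km: {remaining_diffusion(km):.3f}")
```

Under this reading, diffusion at 2400 km is a quarter of its level at the source, and at 4800 km roughly 6 percent.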
6.6 Learning by doing and international trade
A simple model with cumulative causation
External economies of scale and learning by doing have important implications
for international trade. These may be summarised in two main ideas:
• First, external economies and learning by doing intensify the comparative advantages that led each
country to specialize in the first place. The reason is that external economies and cumulative
experience make the home firms progressively more productive in each of the goods initially
produced at home, while foreign firms become progressively more productive in each of the goods
initially produced abroad.
• Second, trade openness may involve a trade-off between static efficiency and dynamic efficiency: if
different goods differ in terms of their learning potential, the pattern of specialization of a given
country with free trade is not necessarily the one that delivers faster economic growth137.
To illustrate this, let’s return to the two goods model introduced in Section 1.6.
In that model, the home economy is small relative to the world economy, the total
labour force is equal to 1, and there are two consumption goods, agriculture (Z) and
manufactures (Y), produced using labour only (equations 1.12 and 1.13):
$Y = A N_Y$ (1.12)
$Z = B N_Z = B(1 - N_Y)$ (1.13)
137
The idea that, in the presence of Marshallian externalities, the static gains from trade and the dynamic
gains from trade may not go together was first formulated by Graham (1923). Authors revisiting this idea
include Krugman (1981, 1987), Lucas (1988), Young (1991), Stokey (1991) and Matsuyama (1992).
Instead of assuming that the productivity parameters A and B are exogenous,
however, let us now assume that they evolve over time as a positive function of the
country’s cumulative experience in the respective sectors138:
$\dot{A} = \phi_Y Y = \phi_Y A N_Y$, with $\phi_Y > 0$ (6.16)
$\dot{B} = \phi_Z Z = \phi_Z B(1 - N_Y)$, with $\phi_Z > 0$ (6.17)
As before, assume that learning-by-doing takes place as a pure external effect:
each producer ignores the effect of its decisions in the aggregate.
Now, let p be the relative price of the agriculture good in terms of manufactures
in the world economy. If, at the time of trade openness, $p > A/B$ (that is, if the
opportunity cost of producing the agriculture good at home is lower than the relative
price of agriculture goods in the world economy), then the home country has
comparative advantage in agriculture. If international trade is free of impediments, the
home country will specialize in agriculture.
According to (6.16) and (6.17), countries accumulate skills by doing what they
are already doing. This mechanism intensifies whatever comparative advantage
countries begin with. Thus, once a pattern of specialization is established, changes in
relative productivity will act to further lock the pattern in. In terms of the example
above, home firms will be progressively more productive in agriculture goods while
firms abroad will be progressively more productive in manufactures.
Of course, if the learning potential of both industries were the same (i.e., if
$\phi_Y = \phi_Z$), then the growth rate of per capita incomes would be independent of the
specialization pattern. If however the two industries differ in terms of learning
opportunities, then growth rates will depend on which good the country specializes in.
For instance, in the extreme case in which $\phi_Z = 0$ (i.e., if there is no
learning-by-doing in agriculture), then by specializing in agriculture the home country will achieve
no growth at all. In other words, openness to trade will lead to lower productivity
growth than the average achieved under autarky.
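The lock-in mechanism can be sketched with a discrete-time version of the two learning equations. In the snippet below, the names phi_y and phi_z for the learning coefficients, the 50/50 labour split under autarky and all numerical values are assumptions chosen for illustration; in the model itself the autarky allocation would follow from preferences, which are not specified here.

```python
def productivity_after(periods, labour_in_manufactures, phi_y, phi_z,
                       a0=1.0, b0=1.0):
    """Productivity levels (A, B) after a number of periods of learning.

    Each period, A grows in proportion to manufactures output A * N_Y and
    B in proportion to agriculture output B * (1 - N_Y). The labour share
    is held fixed, which is a simplifying assumption.
    """
    a, b = a0, b0
    n_y = labour_in_manufactures
    for _ in range(periods):
        a += phi_y * a * n_y            # learning by doing in manufactures
        b += phi_z * b * (1.0 - n_y)    # learning by doing in agriculture
    return a, b


# Extreme case from the text: no learning in agriculture (phi_z = 0).
autarky = productivity_after(50, labour_in_manufactures=0.5, phi_y=0.04, phi_z=0.0)
specialized = productivity_after(50, labour_in_manufactures=0.0, phi_y=0.04, phi_z=0.0)

print(f"Autarky (half of labour in manufactures): A = {autarky[0]:.2f}, B = {autarky[1]:.2f}")
print(f"Specialized in agriculture:              A = {specialized[0]:.2f}, B = {specialized[1]:.2f}")
```

With these assumed numbers, productivity in manufactures keeps rising under autarky, while under full specialization in agriculture neither productivity parameter moves.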
This model suggests that, in some cases, restricting imports may lead to faster
economic growth. By closing the economy to imports of manufactures, the home country
could nurture its own manufacturing sector, accumulating productive experience and
eventually developing comparative advantages in manufactures in the future. Based on
138
Krugman (1987) discussed an expanded version of this model, accounting as well for the possibility of
cross-border technological spillovers: that is, of firms learning with the productive experience of firms
located abroad. This is an important assumption, because it opens a channel through which free trade may
promote economic convergence, by increasing a country’ exposure to foreign technologies. The model
above ignores however such complication.
this idea, some authors have argued that laggard countries should use temporary import
protection to catch up139.
It should be noted that the argument relies on the assumption that the economy
is small relative to the world economy, so that world prices remain unchanged. If
instead one assumed that the home economy was large, the conclusion could be
different140. To see this, suppose there were two large economies, say, North and South,
the North being specialized in manufactures and the South being specialized in
agriculture. If learning by doing opportunities only occur in manufactures, then
manufactures production will grow over time, while agriculture production remains
constant. When both goods are normal, this implies that the world relative price of
manufactures declines over time, so the North faces an adverse terms of trade effect. In
the South, agriculture production remains constant, but its purchasing power in terms of
manufacture goods increases over time. Whether the terms of trade effect is enough to
compensate for the diverging output depends on demand conditions:
• Suppose, first, that the two goods are close substitutes (that is, the
elasticity of substitution in consumption is greater than one). In this
case, the fall in manufactures prices leads to a more than proportional
increase in the world demand for manufactures, so the South’s terms of
trade do not improve enough. In this case, a comparative advantage in
the good with high learning potential leads to faster growth in real
income.
• If, however, the elasticity of substitution were equal to one (as in the case
of Cobb-Douglas preferences), then the terms of trade effect would
exactly offset the differential productivity growth and countries would
grow at the same rate.
• Finally, if the elasticity of substitution between the two goods were lower than one,
then the terms of trade effect would dominate the learning effect and real
income in the North would grow at a slower pace despite this country having
faster technological progress. That would be an (unlikely) case of
immiserizing growth.
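The three cases can be checked with a small calculation. The sketch below assumes identical CES preferences with equal weights in both regions, complete specialization (North produces only manufactures, South only agriculture) and market clearing, so that the world relative price of manufactures equals the relative quantity raised to the power -1/sigma. These assumptions, the function name and the numbers are illustrative and are not taken from the text.

```python
def north_relative_income(manufactures_output, agriculture_output, sigma):
    """North's income divided by the South's, both valued at world prices.

    Assumes CES preferences with equal weights, so the market-clearing
    relative price of manufactures is (Y / Z) ** (-1 / sigma).
    """
    relative_quantity = manufactures_output / agriculture_output
    relative_price = relative_quantity ** (-1.0 / sigma)
    return relative_price * relative_quantity


# Manufactures output doubles through learning; agriculture output is unchanged.
for sigma in (2.0, 1.0, 0.5):
    before = north_relative_income(1.0, 1.0, sigma)
    after = north_relative_income(2.0, 1.0, sigma)
    print(f"sigma = {sigma}: North/South income ratio goes from {before:.2f} to {after:.2f}")
```

Because both regions face the same prices, the ratio of incomes at world prices is also the ratio of real incomes: it rises when sigma exceeds one, stays constant when sigma equals one, and falls when sigma is below one.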
Box 6.6. Learning by doing and the European fears of globalization
In light of the conventional theory of international trade, countries should
specialize according to their comparative advantages. In the real world, however, many
139
In a model with many goods, Krugman (1987) argued that a country could improve its economic
performance by protecting an industry until this industry gets strong enough to survive in
international markets and then moving protection to another industry. The author argued that such a strategy
was followed by Japan during its industrialization process. Young (1991) introduces a model where
learning by doing opportunities in each good are bounded. In his model, goods are ranked
hierarchically according to their productivity (learning) potential. Hence, as “knowledge” accumulates in
a given economy, the economy becomes progressively more endowed to produce goods with higher
productivity. Trade openness impacts asymmetrically across countries, because it leads some countries to
specialize in goods in which learning by doing opportunities are exhausted, while other countries
specialize in goods in which learning by doing still proceeds apace.
140
This point was made in Lucas (1988).
policymakers and think tanks believe that giving up a country’s manufacturing sector is
a bad thing. The reason is that the country loses productive experience.
A recent intervention by the European Commissioner, Jacques Barrot (2008),
illustrates this. Barrot contended that allowing the low skill labour intensive
components of the production chain to migrate to emerging economies, taking
advantage of the lower labour costs there, may benefit the European consumer in the
short run, but raises the risk of Europe losing its accumulated knowledge: “it is not
possible to maintain the knowledge accumulated through learning by doing if not
supported by a production activity”, the author argues. According to Barrot, giving up
the industrial base will imply sooner or later the loss of the accumulated knowledge, so
it will not be possible to explore the potential synergies between universities, research
centres and firms, as envisaged by the European leaders.
Article for discussion: Barrot, J., 2008. “Les illusions d’une Europe sans
industries”, Les Echos, 28/4/2008 (http://www.lesechos.fr/info/analyses/4697530.htm).
Box 6.7 Trade openness and convergence
The question as to whether trade openness is good or bad for growth has been
subject to intensive debate by economists of all times. The general case in models with
a widely accepted set of assumptions is that international trade is good for growth. Still,
one may find models stressing less common but equally realistic assumptions showing
that trade can be detrimental to growth. Models with learning by doing are typically in
the second category. Thus, the question as to whether trade openness is good or bad for
growth is to a large extent an empirical one.
Empirically, most evidence suggests that trade openness is indeed good
for growth. A seminal contribution is from Jeffrey Sachs and Andrew Warner (1995 and
1997). The authors first constructed an “index of trade openness” according to which a
country was classified as “open” if it satisfied five requirements at the same time [141]. Using
this index, the authors found that, over the period 1970-1989, open economies
outperformed closed economies in different dimensions.
Table 6.1 summarizes some of the authors’ results. According to the table, 11
out of the 15 “open economies” in the sample expanded above 3.0% per year, while
only four of the 74 “closed economies” achieved such a fast rate of economic growth.
Controlling for other explanatory variables, the authors found that, on average, open
economies grew 2-2.5 percentage points faster than closed economies. The authors also concluded
that open economies tend to exhibit higher investment rates than closed economies (a
similar conclusion was not found for investment in human capital).
[141] These are: average tariff rates below 40 percent; average quota and licensing coverage of imports of
less than 40 percent; a black market exchange rate premium that averaged less than 20 percent during the
1970s and 1980s; a non-socialist economic system; no extreme controls (taxes, quotas, state
monopolies) on exports.
Table 6.1
Developing countries: growth and openness, 1970-1989

Growth rate               Always open    Not always open
Average growth > 3.0          11                4
Average growth < 3.0           4               70

Source: Sachs and Warner (1995), p. 36.
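The shares quoted in the text can be recomputed directly from the counts in Table 6.1. The short sketch below does only that arithmetic; the dictionary layout and the function name `fast_share` are illustrative choices, not part of the original study.

```python
# Counts copied from Table 6.1 (Sachs and Warner, 1995).
table_6_1 = {
    "always open": {"fast": 11, "slow": 4},
    "not always open": {"fast": 4, "slow": 70},
}

def fast_share(group):
    """Share of countries in the group with average growth above 3.0% per year."""
    counts = table_6_1[group]
    return counts["fast"] / (counts["fast"] + counts["slow"])

for group in table_6_1:
    print(f"{group}: {fast_share(group):.1%} grew above 3.0% per year")
```

Roughly three out of four always-open economies were fast growers, against about one in twenty of the others.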
The authors then investigated how these results change with a country's level of
economic development. They found that within the group of “developing countries”,
those that were considered as “open economies” expanded at 4.49% per year, while
“closed economies” expanded at only 0.69%. Among “developed economies”, those
that were open expanded at 2.29%, while closed economies expanded at
0.74%. The authors also concluded that poor countries tend to grow faster than richer
countries as long as they are linked together by international trade. Closed economies,
in contrast, do not display any tendency towards convergence. This suggests that
international trade may be an important channel for international technological diffusion.
Finally, the authors investigated whether trade openness helps improve the
quality of economic policies [142]. The authors found that, among the 73 closed
economies, 59 experienced a severe macroeconomic crisis. In contrast, only one open
economy experienced a serious economic crisis (Table 6.2).
In general, this evidence supports the claim that trade openness is good
for economic performance.
Table 6.2
Developing countries: growth and macroeconomic crisis

                                    Open in 1970s    Not open in 1970s
Macroeconomic crisis in 1980s             1                 59
No macroeconomic crisis in 1980s         16                 14

Source: Sachs and Warner (1995), p. 56.
Note: “Macroeconomic crisis” is defined by one of the following occurrences: a rescheduling of foreign
debt; arrears on external payments; an inflation rate in excess of 100 percent per year.
Box 6.8. “You become what you export”
An implication of learning by doing models is that some specialization patterns
are more favourable to economic growth than others. This proposition is, however,
difficult to test, because a measure summarizing the “quality” of a country's
specialization pattern is not easy to define.
[142] Sachs and Warner (1995): “(...) the international opening of the economy is the sine qua non of the
overall reform process. Trade liberalization not only establishes direct linkages between the economy and
the world system, but also effectively forces the government to take actions on the other parts of the
reform program under the pressures of international competition”.
In a recent article, Hausmann et al. (2007) proposed a methodology to overcome
this limitation. In particular, they proposed an index to rank goods in terms of their
implied “income content”. This index (PRODY) is estimated as a weighted average of
the per capita GDPs of the countries exporting a product, where the weights are
proportional to each country's specialization level in that product. Using the PRODY
indexes, the authors then constructed a measure of the average sophistication level of a
country's export basket (EXPY).
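The two indexes can be illustrated with a toy computation. The sketch below assumes the usual reading of the definitions: PRODY weights each exporter's per capita GDP by the share of the product in that country's export basket, normalised so that weights sum to one, and EXPY is the export-share-weighted average of the PRODY values. The two countries, two products and all figures are hypothetical and serve only to show the mechanics.

```python
# Hypothetical export values by country and product (not real data).
exports = {
    "Rich": {"chips": 80.0, "textiles": 20.0},
    "Poor": {"chips": 10.0, "textiles": 90.0},
}
# Hypothetical per capita GDPs (not real data).
gdp_pc = {"Rich": 30000.0, "Poor": 3000.0}

def export_shares(country):
    """Share of each product in the country's total exports."""
    total = sum(exports[country].values())
    return {good: value / total for good, value in exports[country].items()}

def prody(good):
    """Weighted average of exporters' per capita GDP, weights proportional
    to each country's specialization (export share) in the product."""
    shares = {country: export_shares(country)[good] for country in exports}
    norm = sum(shares.values())
    return sum(shares[c] / norm * gdp_pc[c] for c in exports)

def expy(country):
    """Export-share-weighted average of PRODY over the country's basket."""
    return sum(share * prody(good)
               for good, share in export_shares(country).items())

for good in ("chips", "textiles"):
    print(good, round(prody(good), 1))
for country in exports:
    print(country, round(expy(country), 1))
```

In this toy example the product in which the rich country is specialized receives the higher PRODY, and the rich country ends up with the higher EXPY.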
Figure 6.1 replicates the results obtained by Hausmann et al. (2007). The figure
displays the relationship between EXPY indexes and per capita incomes, as of 2005.
The figure confirms a high correlation between EXPY and per capita incomes, giving
support to the idea that “poor countries tend to export poor country goods, while rich
countries tend to export rich country goods”. Hausmann et al. (2007) also found that
EXPY indexes are a good predictor of future growth, after controlling for the standard
covariates. The authors conclude that the “quality” of a country's specialization pattern
matters for growth: “You become what you export”.
Figure 6.1 EXPY indexes and per capita incomes
[Scatter plot by country. Horizontal axis: EXPY 05 (logs). Vertical axis: PC GDP 05 (PPP, logs). Fitted line: y = 2.5103x - 14.706, R2 = 0.8148.]
Source: own calculations based on Hausmann et al. (2007) and using COMTRADE data.
6.7 Discussion
This chapter reviews the so-called “first wave” of endogenous growth theories,
which stressed the role of externalities related to capital accumulation. The main feature
of these models is that production externalities lead to an aggregate production function
exhibiting non-diminishing returns, even if at the firm level the production function is
perceived to have diminishing returns. This allows the model to stick with the
convenient assumption of perfect competition.
Depending on the size of the external effect, this class of models may lead to
different conclusions. If the size of the external effect is enough to overwhelm the
diminishing returns to capital, then the economy displays unceasing growth. In the
particular case in which the external effect exactly offsets the diminishing returns, the
production function assumes the AK form.
Aggregate externalities may be a source of cumulative causation. Irrespective
of whether returns to the reproducible factor are increasing or decreasing, whenever
the production function exhibits increasing returns in all factors, there will be a
tendency for agglomeration of economic activities and for the self-reinforcement of
economic disparities: everything else constant, richer economies will be more attractive
to new investment than poorer economies, so they will get richer while poor economies
will remain poor. In a sense, this is what happened across the world after the
Industrial Revolution.
However, externalities may also be a source of convergence. If knowledge spills
over across borders, poor economies will have the opportunity to catch up by importing
foreign technology. Such a pattern has also been observed in the real world: over the last
couple of centuries, many countries managed to join the rich countries' club by
importing or adapting technologies and institutions developed abroad. This discussion
suggests that we need to improve our understanding of the factors that influence the
pattern of international technological diffusion.
An important dimension of knowledge is its non-rival nature: once a particular
technology is discovered, many firms can make use of it without reducing its
effectiveness. To account for this important property, models where the accumulation of
public knowledge is increasing in aggregate investment typically rule out dilution
effects through the size of population. In terms of equation (6.2), this implies restricting
attention to the case in which $\eta' = 0$.
With all firms identical, the aggregate production function becomes:
$$Y = A K^{\beta+\eta} N^{1-\beta} \qquad (6.18)$$
Thus, whenever $\eta > 0$, the production function exhibits increasing returns to
scale in capital and labour taken together. Hence, this model displays a tendency for
agglomeration and cumulative causation. Note however that increasing returns in
capital and labour taken together is not a sufficient condition for endogenous growth. For
the growth rate to depend on the saving rate, you need a model without diminishing
returns on the reproducible factors, that is, $\beta + \eta \geq 1$.
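As a quick numerical check of the increasing-returns property, the sketch below reads (6.18) as Y = A K^(β+η) N^(1-β), doubles both inputs and compares the resulting output ratio with 2^(1+η). The parameter values and the function name are illustrative assumptions, not taken from the text.

```python
def aggregate_output(K, N, A=1.0, beta=0.3, eta=0.2):
    """Aggregate production function, read as Y = A * K**(beta + eta) * N**(1 - beta)."""
    return A * K ** (beta + eta) * N ** (1 - beta)

# Doubling capital and labour together multiplies output by 2**(1 + eta),
# which exceeds 2 whenever eta > 0 (increasing returns to scale).
K, N, eta = 10.0, 5.0, 0.2
scale_factor = aggregate_output(2 * K, 2 * N, eta=eta) / aggregate_output(K, N, eta=eta)
print(scale_factor, 2 ** (1 + eta))
```

With η = 0 the same check returns exactly 2, the constant-returns benchmark of the Solow model.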
In the literature, two versions of the model emerged. The first, due to Arrow
(1962), assumes $\beta + \eta < 1$. The second, due to Romer (1986), assumes $\beta + \eta \geq 1$.
In the Arrow (1962) model there are diminishing returns to capital, so the
economy converges to a steady state. But the steady state is different from the one in the
Solow model, because of increasing returns to capital and labour taken together. To see this
formally, let’s take log differences in (6.18), obtaining:
$$\hat{Y} = (\beta+\eta)\,\hat{K} + (1-\beta)\,n \qquad (6.19)$$
In the steady state, capital and output should grow at the same rate. Imposing
this restriction in the equation above, one obtains:
$$\hat{Y}\,(1-\beta-\eta) = (1-\beta)\,n$$
Now using $\gamma = \hat{Y} - n$, you obtain the growth rate of per capita income in the
Arrow (1962) model:
$$\gamma = \frac{\eta\, n}{1-\beta-\eta}. \qquad (6.20)$$
Equation (6.20) shows that, as long as population growth is positive, per capita
income in this economy will grow over time, without the need to assume an exogenous
rate of technological progress. Still, because the growth rate of per capita income in the
steady state is determined by the growth rate of population, which is an exogenous
parameter, this model displays exogenous growth: changes in policy influencing the
saving rate or the efficiency parameter A will alter the steady state level of per capita
income, but not its growth rate (only level effects). Moreover, long run growth will only
obtain if the growth rate of the labour force is positive: whenever n=0, diminishing
returns will force the growth rate of per capita income to decline to zero, as in the
neoclassical growth model.
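The comparative statics just described can be seen numerically. The sketch below reads the steady-state growth rate as γ = ηn/(1-β-η); the function name and the parameter values (a capital share of 0.3 and an externality of 0.2) are illustrative assumptions.

```python
def arrow_steady_state_growth(n, beta, eta):
    """Per capita income growth in the Arrow case: eta * n / (1 - beta - eta)."""
    if beta + eta >= 1:
        raise ValueError("the Arrow case requires beta + eta < 1")
    return eta * n / (1 - beta - eta)

# Growth is proportional to population growth and vanishes when n = 0.
# The saving rate does not appear in the formula: it has level effects only.
for n in (0.0, 0.01, 0.02):
    print(n, arrow_steady_state_growth(n, beta=0.3, eta=0.2))
```

Doubling population growth doubles per capita income growth, while a zero population growth rate brings it to zero.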
The second version of the model, explored by Romer (1986), assumes $\beta + \eta \geq 1$.
Substituting (6.18) in (6.3), the user cost of capital in this case becomes:
$$r + \delta = \beta A k^{\beta+\eta-1} N^{\eta} \qquad (6.21)$$
The endogenous growth rate of per capita income is:
$$\gamma = \beta A k^{\beta+\eta-1} N^{\eta} - \delta - \rho \qquad (6.22)$$
In the particular case in which $\beta + \eta > 1$, both the interest rate and the growth
rate of per capita income become increasing functions of the capital-labour ratio, so the
model displays explosive growth.
Even in the case $\beta + \eta = 1$, the growth rate of per capita income will still be a
positive function of the size of the labour force, N, displaying a “strong” scale effect: if
the workforce grows at a constant rate, n, there will be ever-accelerating growth. This
possibility raises a fundamental problem: one does not observe in the data a general
tendency for growth rates to be explosive [143].
Note that the model with $\beta + \eta < 1$ also displays a kind of “scale effect”:
according to (6.20), the growth rate of per capita income depends on how fast
population is growing. Since this “scale effect” is of a second order, as compared to the
one in (6.22), it has been dubbed a “weak scale effect”.
[143] To avoid explosive growth arising from increasing returns, Romer (1986) assumed that the growth rate
of knowledge is bounded above, due to diminishing returns. With this assumption, the interest rate becomes
bounded above too, and the same happens to the growth rate of per capita income.
• An avenue to account for a larger role of capital in production than that
implied by the share of capital in national incomes is to assume the
existence of externalities.
• The rationale for externalities on capital accumulation can be static
(availability of specialized suppliers, labour market pooling,
technological spillovers, complementarities) or dynamic (learning by
doing).
• With externalities, one may observe increasing returns without the need
to depart from the perfect competition paradigm.
• In the presence of positive externalities on capital, conventional growth
accounting overestimates the Solow residual.
• If externalities are significant enough to imply non-decreasing returns in
capital, then the model will display unceasing growth.
• Even if externalities are not strong enough to induce unceasing growth,
they may well cause the aggregate production function to display
increasing returns in capital and labour taken together, giving rise to a scale
effect. This scale effect implies that bigger economies will be more
attractive places to produce and invest, favouring economic
agglomeration.
• In order to evaluate whether technological spillovers are a source of
agglomeration and divergence or, instead, of convergence, it is necessary
to assess how localized spillovers tend to be. In this discussion one should
take into account that technology is not all alike. While some
technologies and ideas diffuse well across space, others tend to
diffuse slowly. This calls for a deeper understanding of the process of
technological diffusion, which will be specifically addressed in Chapter
9.
• With externalities, the competitive equilibrium will no longer be a social
optimum. So, at least theoretically, government intervention has the
potential to improve economic performance.
• With externalities, openness to international trade may imply a
trade-off between the static gains from free trade and the potential
dynamic gains from temporarily restricting imports so as to
achieve a critical mass in sectors with significant spillover and learning
by doing effects.
• Empirically, the evidence suggests that, indeed, some industries have
higher growth potential than others. However, there is also evidence
that closing an economy to international trade tends to result in lower
permeability to technological diffusion from abroad, poorer economic
policies and slower growth.
Key concepts
• External vs. internal economies of scale
• Central Planner vs. Competitive Equilibrium
• Knowledge spillovers
• Agglomeration effects
• Cumulative causation
• Learning by doing
• Strong vs. weak scale effect
Essay questions:
a) Comment: “Perfect competition and increasing returns cannot hold
together”.
b) Explain why the competitive equilibrium fails to deliver the first best
allocation in the presence of externalities.
c) To what extent do external economies help explain why capital doesn’t
flow from rich countries to poor countries?
d) What are the implications of knowledge spillovers being localized or
global in scope?
e) Explain why, with learning by doing, the static gains from trade may not
go along with the dynamic gains.
Exercises
6.1.
Consider an economy whose production function is given by $Y_t = A_t K_t^{1/3} N^{2/3}$. In
this economy, 16% of income is saved, the population is constant and capital does not
depreciate.
a) Consider for a moment that $A_t = 16\,e^{\frac{0.04}{3}t}$.
i. Identify the theory that fits this specification.
ii. Write down the main equations of the model and find the
Fundamental Dynamic Equation.
iii. Find the equilibrium values of K/L, Y/L and K/Y of this
economy, where L represents labour in efficiency units. Discuss
the stability of the equilibrium and represent it in a graph.
iv. Describe the behaviour of per capita income, Y/N, wages and the
interest rate in the steady state, as well as the capital and labour
shares in national income. Are these results consistent with the
empirical evidence?
v. In country B, per capita output is twice that of country
A. Could that difference be explained by differences in the
saving rate only? Explain.
b) Assume now that $A_t = 0.125\left(\frac{K_t}{N}\right)^{2/3}$.
vi. Explain the theory that fits this specification.
vii. Does this aggregate production function satisfy the neoclassical
properties? Explain.
viii. Find the dynamics of per capita income in this model and
represent it in a graph.
6.2.
In Na-Hava the production function of each individual worker is given by: $Y_i = K_i^{0.5} L_i^{0.5}$,
where $L_i = \lambda N_i$ measures labour in efficiency units. In this country 25% of the income
is saved, the population grows at 1% a year and the capital stock depreciates at 1% a
year. Show how the model solves under the following assumptions:
a) $\lambda = e^{0.005t}$.
b) $\lambda = 0.01\,K/N$.
c) $\lambda = 0.01\,K$.
d) $\lambda = 0.01\,K^{0.5}$.
“A system – any system economic or other – that at every given point in time
utilizes its possibilities to the best advantage may yet in the long run be inferior to a
system that does so at no given point in time, because the latter’s failure to do so may be
a condition for the level or speed of long-run performance”. Joseph Schumpeter.
Learning Goals:
• Understand why knowledge excludability plays a critical role in the
private incentives to innovate
• Acknowledge the importance of the extent of the market for innovation
• Identify the main tools that real-world firms use to keep competitors
away from their inventions
• Identify the market failures underlying R&D and the types of
government intervention
7.1. Introduction
In the Solow model, the assumption of perfect technological diffusion implies
that there are no economic incentives to innovate: since knowledge is assumed freely
available, no researcher would be able to reap a return on the time spent in developing
an invention. In the Learning by Doing model, knowledge is assumed to spill over too,
so in this model technological change arises as a mere by-product of investment decisions.
In the real world, however, most technological progress is driven by purposeful
efforts of economic agents to discover new products or cheaper ways of producing
existing products. In doing so, agents spend time and resources that could otherwise be
employed in other activities.
This chapter addresses the question of why selfish economic agents devote time
and energy to search for new ideas. The main idea is that people invest in R&D because
they expect to obtain a reward. The chapter focuses on market incentives for R&D, so
most of the discussion addresses the profit-seeking motive. However, the discussion
also applies to other types of reward, including social recognition and prizes.
The view that technological progress is driven by the prospect of profits is
the basis of the so-called Schumpeterian paradigm of economic growth, which owes its label
to the economist who first theorized this mechanism, Joseph Schumpeter. In light of this
theory, entrepreneurs engage in R&D with the aim of achieving market power and,
thereby, reaping a return on their research effort. The argument presumes that technology can
be made excludable, at least during a certain period of time.
Section 7.2 introduces some essential concepts in R&D. Section 7.3 introduces a
basic model with intermediate products, to be used in this and in the following chapter.
Section 7.4 explains why private R&D is inherently linked to imperfect competition.
Section 7.5 describes alternative mechanisms on which real-world firms rely to
preserve ownership of their inventions. Section 7.7 addresses the policy problem raised
by the fact that the decentralized economy tends to produce too little R&D. Section 7.8
concludes.
7.2. R&D taxonomy
Basic research, Applied Research and Development
Research and Development activities may be categorized into different types:
Basic Research, Applied Research and Development.
- Basic research includes studies that aim to improve fundamental knowledge
for its own sake, in a manner that may be subsequently helpful across a
range of activities. Most basic research is carried out in universities and non-
profit institutions or by the government.
- Applied research is aimed at generating specific applications for existing
knowledge, so as to make it useful. Private firms are primarily engaged in
applied research, with the aim of using knowledge for commercial purposes.
- In general, inventions result in prototypes not ready for consumer use. The
process of further improving the invention and its production process so as
to make it marketable is called development.
Horizontal innovations and vertical innovations
In considering the output of R&D, it is convenient to distinguish between
“horizontal innovations” and “vertical innovations”:
- Horizontal innovations consist in expanding the range of available goods.
This type of innovation may be seen as applying to breakthrough inventions,
such as electricity, the telephone, the bicycle, the automobile, the train
and the airplane. Note that, even though several of these horizontal inventions
address the same basic problem (e.g., transportation), the fact that they do not
solve this problem in exactly the same manner implies that consumers will
tend to use each newly invented product alongside the previous ones.
- Vertical innovations are those that make existing goods or varieties obsolete.
For example, the personal computer has displaced the typewriter as a text
processing tool. Also more modern automobiles, computers and software
tend to displace older vintages of automobiles, computers and software. So
when a vertical invention is achieved, consumers tend to replace the old
vintages by the new vintages.
Product Innovation and Process Innovation
An alternative distinction is between:
- Innovations that lead to the introduction of new products (product
innovations), and
- Innovations that lead to the introduction of less costly methods of producing
existing products (process innovations).
Note that both product innovations and process innovations may be achieved
through horizontal innovations or through vertical innovations: for instance, an
improvement in operations management in a factory producing shirts (process
innovation) can be achieved either by introducing higher-quality versions of existing
inputs (for instance, better software - vertical innovation) or by expanding the pool of
intermediate inputs (for instance, inventing a new algorithm to solve a new challenge in
operations management).
General Purpose Technologies
A special category of horizontal innovations are inventions that have
efficiency-enhancing effects across many sectors. For instance, breakthrough inventions such as the
steam engine, the telephone, the railroad, electric power, the microchip and the
internet changed the organization of production and triggered a chain of complementary
innovations that transformed the entire nature of the economy at their time. These are
called “General Purpose Technologies”.
Production function in the final good sector
Suppose that aggregate output (Y) is assembled with m intermediate inputs,
according to the following production function:
$$Y = B \sum_{j=1}^{m} x_j^{1-\alpha} \qquad (7.1)$$
In this formulation, each new intermediate product j is a horizontal innovation,
because it helps expand the range of available intermediate inputs to production [144]. B is
a productivity parameter. Each intermediate input $x_j$ is assumed to depreciate fully
after use.
Production functions of intermediate inputs
We assume that, once invented, intermediate inputs are produced using labour
only. Let $N_j$ denote the amount of labour used in the production of intermediate good j.
The production function of each intermediate input is given by:
$$x_j = \lambda_j N_j \qquad (7.2)$$
[144] Since intermediate inputs enter additively in the production function, the marginal product of
intermediate product j is independent of the marginal product of intermediate product j+1. Still, firms
may substitute one input for another while keeping the level of output constant. In this specification,
intermediate inputs are neither “direct” substitutes nor complements of each other. Technically, they are
said to be additively separable.
The parameter $\lambda_j$ measures the state of technology in the activity of producing
and selling the intermediate input j.
When $\lambda_j$ increases, the cost of producing intermediate input j
decreases. Since there is no point in producing an existing good at a higher cost,
whenever such a technological improvement is achieved, the old process tends to be
abandoned. Technological changes leading to increases in parameter $\lambda_j$ are labelled
vertical innovations.
Trade-off between production and R&D
In the real world, the activity of researching and developing new products
consumes valuable resources, such as labs, equipment and researchers. Thus, there is a
trade-off between allocating resources to R&D and to goods production.
The simplest way to model this trade-off is to assume that labour can be used either
in goods production or in R&D. Thus, the total labour force in the economy (N) is
split in two groups: workers engaged in the production of intermediate inputs
($N_Y$) and workers engaged in R&D ($N - N_Y$). Denoting by $\mu$ the fraction of the
total labour force (N) devoted to R&D, we have:
$$N_Y = (1-\mu)\,N \qquad (7.4)$$
With:
$$N_Y = \sum_{j=1}^{m} N_j \qquad (7.5)$$
According to this formulation, there is a trade-off between allocating time to
R&D and to the production of intermediate inputs: if the economy commits a larger
share of the labour force to R&D ($\mu$ rises), the first impact will be a fall in output,
because less working time will be devoted to the production of intermediate inputs. But,
as long as the research effort translates into faster technological change (more
intermediate inputs or more efficient ways of producing these inputs), the pace of
productivity growth will increase, allowing output to grow faster over time.
For a moment, assume that labour productivity is the same across all
intermediate inputs:
$$\lambda_j = \lambda, \quad \forall j. \qquad (7.3)$$
Given the symmetry in (7.1) and (7.3), arbitrage opportunities in the labour
market will imply that each intermediate input will be used in exactly the same amount
($x_j = x$, $\forall j$) by final good producers. Using (7.5), (7.1) simplifies to:
$$Y = B m x^{1-\alpha} = B m \left(\frac{\lambda N_Y}{m}\right)^{1-\alpha} = B m^{\alpha} \left(\lambda N_Y\right)^{1-\alpha} \qquad (7.6)$$
The last term in (7.6) shows the two possible sources of technological change in
this model:
- Horizontal innovations (increases in m): expanding the pool of basic inputs
available for use in production.
- Vertical (process) innovations (increases in $\lambda$): efficiency enhancements along
a product line, allowing each variety to be produced at lower cost.
The division of labour
With the arrival of a horizontal innovation, some workers are reallocated from
the production of old varieties to the production of the new variety. This movement
happens to be productivity enhancing due to diminishing returns: marginal workers are
released from old varieties, where the marginal product of labour is low, to the new
variety, where the marginal product of labour is initially high. This process causes
marginal productivity to increase in the old sectors and will proceed until the marginal
product of labour is equal across all sectors. In the new equilibrium, because labour is
spread across a wider range of varieties, it is used less intensively in each variety and
hence its marginal product will be higher.
As an example, consider again equation (7.6) and suppose that output (Y)
refers to the number of meals produced by a restaurant with four workers. For
simplicity, assume that B=1, α=0.5 and that the productivity of each worker is equal to
λ=1. Suppose that initially there was only one task (m=1), consisting of cooking,
collecting the customers’ orders, serving and washing the dishes. In that case, the total
number of meals served would be equal to Y=2. Now consider a technological
improvement, consisting in splitting the initial task into four sub-tasks (m=4), each
worker becoming specialized in one task. As you may easily check, output would
increase to Y=4.
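The restaurant arithmetic can be verified with a few lines of code. The sketch below reads the last term of (7.6) as Y = B m^α (λ N_Y)^(1-α); the function name `final_output` is a hypothetical label, and the default parameter values are those of the example (B=1, α=0.5, λ=1).

```python
def final_output(m, n_y, B=1.0, alpha=0.5, lam=1.0):
    """Output with m symmetric varieties: Y = B * m**alpha * (lam * n_y)**(1 - alpha)."""
    return B * m ** alpha * (lam * n_y) ** (1 - alpha)

# Four workers, first doing a single task, then split across four sub-tasks.
one_task = final_output(m=1, n_y=4)
four_tasks = final_output(m=4, n_y=4)
print(one_task, four_tasks)
```

With the same four workers, moving from one task to four doubles the number of meals, from 2 to 4, purely through the division of labour.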
Splitting production processes into specialized sub-tasks is a form of horizontal
innovation. The process by which production processes are split into specialized sub-
tasks allowing workers to become more efficient was coined by Adam Smith (1977) as
“division of labour”.
In this section we use the simple model outlined above to illustrate the role of
excludability and of market size in providing incentives for R&D.
Demand for intermediate inputs
In the following, let’s assume that the final good sector is perfectly competitive.
That is, the final good sector is composed of a large number of identical firms, which
maximize profits taking the price of each intermediate input $p_j$ as given.
Under perfect competition, the total demand for each intermediate input is such
that its price $p_j$ equals the marginal product, $\partial Y / \partial x_j$. From (7.1), this gives:
$$p_j = (1-\alpha)\,B\,x_j^{-\alpha} \qquad (7.7)$$
Graphically, the demand for the intermediate input j is described in Figure 7.1
by the downward sloping curve crossing M and C.
Market power and horizontal innovation
Suppose that a firm devoting the fraction $\mu$ of its labour resources to R&D
invents a new intermediate input (say j), which can be produced according to (7.2).
Because this innovation results in the expansion of the number of inputs available to
production, it is labelled a horizontal innovation.
Without question, the fact that a new intermediate input becomes available
constitutes an improvement for the economy as a whole: as we already know, a larger
pool of inputs to be used in production allows the economy to take advantage of the
division of labour, improving aggregate efficiency. Hence, final good producers will be
keen to start using the new product.
A different question is whether the invention will be beneficial for the inventor
himself.
To analyse this question, we refer to Figure 7.1. The figure represents the market
for the new intermediate input j. The downward sloping curve crossing M and C is the
demand for the input j by the final good sector (equation 7.7). The marginal cost of
producing this intermediate input with technology (7.2) is represented in the figure by
the horizontal line crossing T and C (it is assumed that the innovator is price taker in
the labour market, so the wage rate w is given).
Let us now examine the profits of the innovator.
First, consider the case where the new technology becomes freely available to all
agents in the economy. In that case, it is natural to guess that the new variety would be
produced under perfect competition. Thus, the innovator would face the competition of
a large number of firms, all with marginal costs equal to $w/\lambda_j$. Each firm would be
a price taker and the profits of all firms would be driven down to zero. This case is
represented in Figure 7.1 by point C, where $p_j = w/\lambda_j$. Of course, since in this case the
innovator has no profits, he will not be able to recover the (sunk) cost consisting of the
valuable time spent in searching for the innovation.
Now, consider the case in which the inventor is the only one that can produce
the new variety. In this case, he will be able to exploit the fact of being a monopolist in
the market for j. Sticking with the assumption that this sector is small in the labour
market, his profits from producing the new design (operating profits) will take the
following form:
$$\max_{x_j} \ \pi_j = p_j x_j - \frac{w}{\lambda_j}\,x_j = (1-\alpha)\,B\,x_j^{1-\alpha} - \frac{w}{\lambda_j}\,x_j. \qquad (7.8)$$
The last expression in (7.8) is obtained by substituting (7.7) for $p_j$ and captures the
fact that the innovating firm is a price maker in the market for this variety.
The solution of the maximization problem is the well-known rule stating that the
monopolist's optimal price is a mark-up over the marginal cost:
$$p_j = \frac{1}{1-\alpha}\,\frac{w}{\lambda_j}. \qquad (7.9)$$
In (7.9), the optimal mark-up depends negatively on the price elasticity of the
demand curve, $1/\alpha$: the lower the elasticity, the higher the mark-up.
The optimal production level is obtained by substituting (7.9) in (7.7):
$$x_j^{M} = \left[\frac{B\,(1-\alpha)^2\,\lambda_j}{w}\right]^{1/\alpha}. \qquad (7.10)$$
Using (7.2), the demand for labour in sector j becomes:
$$N_j = \lambda_j^{\frac{1-\alpha}{\alpha}} \left[\frac{B\,(1-\alpha)^2}{w}\right]^{1/\alpha} \qquad (7.11)$$
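Figure 7.1. Ex-post monopoly profits in the case of a horizontal innovation
[Diagram. Horizontal axis: quantity $x_j$, with $x_j^M$, $x_j^F$ and $x_j^C$ marked. Vertical axis: price $p_j$, with the levels $w/[\lambda_j(1-\alpha)]$, $w/\lambda_j^F$ and $w/\lambda_j$ marked. Demand curve $p_j = (1-\alpha)Bx_j^{-\alpha}$; points S, M, R, C and T; areas (a), (b) and (c).]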
Finally, substituting (7.10) in (7.8) one obtains the optimal operating profits:
$$\pi_j = \alpha\,(1-\alpha)\,B \left[\frac{B\,(1-\alpha)^2\,\lambda_j}{w}\right]^{\frac{1-\alpha}{\alpha}}. \qquad (7.12)$$
The case in which the innovating firm becomes a monopolist is represented in
Figure 7.1 by points R and M, which correspond to the intersection of marginal costs
with marginal revenues (the latter represented by the dashed curve). The shaded area (b)
measures the firm's operating profits.
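As a numerical illustration of the monopoly solution, the following minimal sketch evaluates rules (7.9) and (7.10) and the resulting operating profits. The parameter values are arbitrary assumptions chosen for convenience, and the function names are hypothetical. The demand curve is taken to be $p_j = (1-\beta) B x_j^{-\beta}$, as in Figure 7.1.

```python
# Illustrative parameter values (assumptions, not taken from the text).
B, beta, lam, w = 10.0, 0.5, 2.0, 1.0


def demand_price(x, B, beta):
    """Inverse demand: p = (1 - beta) * B * x**(-beta)."""
    return (1.0 - beta) * B * x ** (-beta)


def operating_profit(x, B, beta, lam, w):
    """Objective (7.8): revenue minus labour cost, with marginal cost w / lam."""
    return demand_price(x, B, beta) * x - (w / lam) * x


def monopoly_solution(B, beta, lam, w):
    """Price from (7.9), quantity from (7.10), profits evaluated with (7.8)."""
    price = (w / lam) / (1.0 - beta)
    quantity = (B * (1.0 - beta) ** 2 * lam / w) ** (1.0 / beta)
    profit = operating_profit(quantity, B, beta, lam, w)
    return price, quantity, profit


price, quantity, profit = monopoly_solution(B, beta, lam, w)

# Crude numerical check: search a grid of quantities around the closed-form
# solution and keep the one with the highest operating profit.
grid = [quantity * (0.5 + 0.01 * k) for k in range(101)]
best = max(grid, key=lambda x: operating_profit(x, B, beta, lam, w))
print(price, quantity, profit, best)
```

The grid search is only a sanity check: the quantity it selects should coincide (up to the grid step) with the closed-form solution.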
Operating profits as the reward of innovation
Note that the operating profits (7.12) do not take into account the time spent in
R&D. The reason is that, after the innovation is achieved, the time spent in R&D
becomes basically a sunk cost: once incurred by the agent, it should no longer influence
his decisions.
The operating profits can be interpreted as the reward to the time spent in R&D.
If, for instance, the innovator decided to sell to a third party the right to produce the new
design (“the patent”), a reasonable price for it would be the present value of future
operating profits over the life cycle of the intermediate product.
Whether the reward is enough to compensate the researcher for the time spent
(and the risk incurred) in R&D or not is a different question: it may well be the case that
the researcher finds the operating profits too small, as compared to expectations.
This discussion points to the need to reformulate the problem one step back: if
one wants to understand how much time individuals decide to devote to R&D, one
needs to formulate the problem from an ex ante perspective, that is, before the research
takes place. In that case, the entrepreneur has to balance the expected profits of
inventing against the opportunity cost of the time he expects to be necessary to achieve
an innovation. This ex ante problem will be examined in detail in the next chapter.
Limit pricing
It should be noted that rule (7.9) only applies if there is no risk of other firms
entering the market. In real life, incumbents often face the potential competition of
less efficient producers or suppliers of similar products. The possibility of these
competitors entering the market may force the incumbent to set a limit price, so as to
prevent the fringe from stealing its customers.
In terms of our model, suppose that the incumbent in the market for j was
challenged by a large number of imitators that could not replicate exactly the
technology of the incumbent, but could produce the same input with $\lambda_j^F < \lambda_j$. When
the imitators' disadvantage is not too large (formally, if $\lambda_j^F > \lambda_j(1-\beta)$, as exemplified
in Figure 7.1), then the best the incumbent can do is to charge a price just marginally
below $w/\lambda_j^F$ (the limit price): if it set a higher price, it would be driven out of the
market.
Setting the limit price, the incumbent is still able to undercut its rivals and
capture the entire market. But its operating profits will be smaller than in the
unconstrained case. Consumers, of course, will be better off: prices will be lower and
the quantity supplied ($x_j^F$) will be higher than in the case with full monopoly.
In sum, potential competitors may constrain the pricing behaviour of the
entrepreneur, even if they don’t actually operate. When this is so, it is the competitive
fringe that sets the market price, not the monopolist.
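The pricing rule just described can be sketched as follows. This is only an illustration under the assumptions of this section (constant marginal costs and the demand curve $p_j = (1-\beta) B x_j^{-\beta}$); the function names and the numbers used in the example are hypothetical.

```python
def incumbent_price(w, lam, lam_fringe, beta):
    """Return (price, regime) for an incumbent facing a less efficient fringe.

    lam        -- incumbent's productivity (marginal cost w / lam)
    lam_fringe -- imitators' productivity, assumed lower than lam
    """
    monopoly_price = (w / lam) / (1.0 - beta)   # rule (7.9)
    limit_price = w / lam_fringe                # marginal cost of the fringe
    if limit_price < monopoly_price:
        # The fringe binds: the incumbent prices (just below) w / lam_fringe.
        return limit_price, "limit pricing"
    return monopoly_price, "unconstrained monopoly"


def quantity_demanded(p, B, beta):
    """Invert the demand curve p = (1 - beta) * B * x**(-beta)."""
    return ((1.0 - beta) * B / p) ** (1.0 / beta)


# Example with assumed values: the fringe is only mildly less productive.
p_limit, regime = incumbent_price(w=1.0, lam=2.0, lam_fringe=1.5, beta=0.5)
x_limit = quantity_demanded(p_limit, B=10.0, beta=0.5)
profit_limit = (p_limit - 1.0 / 2.0) * x_limit
print(regime, p_limit, x_limit, profit_limit)
```

With these numbers the fringe constraint binds, so the price is lower and the quantity higher than under full monopoly, while operating profits remain positive but smaller.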
The case with a vertical innovation
We now turn to the case of a vertical innovation. This case occurs when the
invention leads to a more efficient way of producing an existing product.
Assume that prior to innovation there was perfect competition in the market for
j: that is, a large number of firms were producing j with a given technology $\lambda_0$ (the
suffix j is omitted to simplify the notation). In Figure 7.2, the equilibrium prior to the
innovation is described by point $C_0$, where the price is equal to the marginal cost
($w/\lambda_0$), profits are zero and the total demand for this variety is $x_0^C$.
Now suppose that one particular firm escapes competition by achieving a process
innovation. In terms of Figure 7.2, the vertical innovation is described by the fall in the
(horizontal) marginal costs curve from $w/\lambda_0$ to $w/\lambda_1$.
As in the case with a horizontal innovation, the innovation may translate into an
effective competitive advantage to the innovating firm or not, depending on the
innovator’s ability to maintain exclusive control over the technology created:
- If competitors had immediate access to the new design, the market price would
fall to $w/\lambda_1$ and the total demand for the good would increase from $x_0^C$ to $x_1^C$. In that
case, there would be no monopoly profits and consequently no reward to the time spent
in R&D.
- If, alternatively, only the innovating firm had access to the new design, this
would constitute an advantage against its competitors: the innovating firm could charge
a price lower than the previous competitive price, driving all competitors out of
business and becoming the monopolist in this particular sector. In this case, it will be possible
for the firm to generate profits to reward the previous research effort.
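Figure 7.2. Ex-post monopoly profits in the case of a drastic vertical innovation
[Figure: price $p$ against quantity $x$, showing the demand curve $p = (1-\beta) B x^{-\beta}$, the price levels $w/\lambda_0$, $w/[\lambda_1(1-\beta)]$ and $w/\lambda_1$, the points Q, $C_0$, S, M, U, $C_1$, T and R, and the quantities $x_0^C$, $x^M$ and $x_1^C$.]
Figure 7.3. The case with a non-drastic (vertical) process innovation
[Figure: price $P$ against quantity $X$, showing the demand curve $P = (1-\beta) B X^{-\beta}$, the price levels $w/[\lambda_1(1-\beta)]$, $w/\lambda_0$ and $w/\lambda_1$, the points M, $C_0$, $C_1$ and R, and the quantities $X^M$, $X_0^C$ and $X_1^C$.]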
In terms of Figure 7.2, assume that the new marginal costs curve ($w/\lambda_1$) is such
that it crosses the locus of marginal revenues at point R, implying a monopoly price
(point M) that is lower than the original competitive price ($w/\lambda_0$). In this case, the firm's
operating profits correspond to the shaded area in the figure. Note that, because the
consumer price falls, there will be an increase in production to $x^M$ (of course, this
increase in production is much less than in the case in which technology becomes freely
available, point $C_1$).
Note however that the case depicted in Figure 7.2 is not a general one: if the cost
reduction fails to reach a critical minimum, the innovator would be unable to set the
monopoly price (see the discussion in Box 7.1).
Box 7.1 Drastic versus non-drastic process innovations
A previously competitive firm that beats its competitors through a process
innovation and achieves a monopolist position in the market is said to have escaped
competition. The firm that escapes competition and becomes a monopolist does not
always choose the standard profit-maximizing price of a monopolist, as implied by
(7.9). Setting a full mark-up is only possible if the resulting price does not exceed the
minimum price that the potential competitors can set.
In the case described in Figure 7.2, the innovating firm is able to set the full
monopoly price, because the intersection of the marginal cost and marginal revenue at
point R implies a price that is lower than the pre-innovation competitive price. This
case is known as a drastic process innovation.
Figure 7.3 illustrates the alternative case. In the figure, the equality between the
new marginal costs curve ($w/\lambda_1$) and marginal revenues (point R) implies a monopoly
price (point M) exceeding the original competitive price ($w/\lambda_0$). Hence, the best the
innovating firm can do is to set the price just marginally below the previous competitive
price ($w/\lambda_0$). In doing so, it will be able to undercut its rivals and capture the entire
market, pocketing the difference between this price and the new marginal cost, $w/\lambda_1$.
The innovating firm will not choose a higher price because it would be driven out of
the market itself.
The case described by Figure 7.3 is labelled a non-drastic process innovation.
Note that, since in this case prices do not fall, the total demand for product j remains
unchanged, at $x_0^C$.
Summing up, a drastic innovation corresponds to a sufficiently large
improvement in technology so that the innovator becomes an effective monopolist. In
the case of a non-drastic innovation, the previous producers constrain the pricing
behaviour of the entrepreneur, even if they don’t actually operate145.
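The threshold separating the two cases can be written out explicitly. The short derivation below is a sketch that uses only the mark-up rule (7.9) and the pre-innovation competitive price; $\lambda_0$ and $\lambda_1$ denote productivity before and after the innovation.

```latex
% Drastic innovation: the unconstrained monopoly price does not exceed
% the pre-innovation competitive price.
\[
\frac{1}{1-\beta}\,\frac{w}{\lambda_1} \;\le\; \frac{w}{\lambda_0}
\quad\Longleftrightarrow\quad
\frac{\lambda_1}{\lambda_0} \;\ge\; \frac{1}{1-\beta}.
\]
% Non-drastic innovation: the inequality is reversed, the price stays
% (marginally below) w/\lambda_0 and demand stays at x_0^C, so operating
% profits are approximately
\[
\pi \;\approx\; \left(\frac{w}{\lambda_0}-\frac{w}{\lambda_1}\right) x_0^C .
\]
```

For instance, with $\beta = 0.5$ the mark-up factor is 2, so under these assumptions an innovation is drastic only if it at least doubles productivity.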
R&D, imperfect competition and market size
The discussion above illustrates the key role of excludability in providing
market incentives for R&D. Technology is non-rival, but economic rents are rival.
When competitors have instantaneous access to the knowledge created and the right to
use the new technology, the innovating firm will not be able to raise the required
operating profits to reward its initial research effort. In that case, entrepreneurs will
prefer to free ride on the research efforts of others, and hence there will be no R&D
at all. Note that this outcome may be rather inefficient, as it holds even when the cost of
inventing something is very small. When, alternatively, some mechanism prevents other
firms from using the new design, at least during a certain period of time, then the
innovating firm will be able to raise operating profits to reward its R&D effort.
145
Formally, the innovation will be non-drastic if the proportional fall in marginal costs is smaller than the ex post
monopoly mark-up, that is, if $\lambda_1/\lambda_0 < 1/(1-\beta)$. Thus the likelihood of a drastic innovation declines
when this mark-up is very large. The reason is intuitive: when the demand for the intermediate good is
very rigid (large $\beta$), the full monopoly price is very high relative to marginal costs and therefore is more
likely to exceed the previous (competitive) price.
To see whether the R&D effort turns out to be worthwhile or not from the
innovating firm's point of view, the firm should compare the sunk cost of R&D against the
discounted value of the ex post monopoly profits, over the period during which the
technological advantage materialises. Whether the net present value is positive or
negative depends on a number of factors, including: the initial investment in R&D;
the operating profits generated each period; the discount rate; and how long the
excludability over the new technology will last. In the next chapter, we will address
specifically the choice of research intensity by an individual firm.
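The comparison described above amounts to a simple net present value calculation. The sketch below assumes, for simplicity, a constant operating profit received at the end of each period while excludability lasts, and an R&D cost paid upfront; the function names and the numbers in the example are illustrative assumptions.

```python
def present_value(profit_per_period, discount_rate, periods):
    """Discounted sum of a constant profit received at the end of each period."""
    return sum(profit_per_period / (1.0 + discount_rate) ** t
               for t in range(1, periods + 1))


def rd_net_present_value(rd_cost, profit_per_period, discount_rate, periods):
    """NPV of an R&D project: discounted operating profits minus the upfront cost."""
    return present_value(profit_per_period, discount_rate, periods) - rd_cost


# Assumed example: the same project under three and four periods of excludability.
npv_short = rd_net_present_value(300.0, 100.0, 0.05, 3)
npv_long = rd_net_present_value(300.0, 100.0, 0.05, 4)
print(npv_short, npv_long)
```

In this example the project is not worthwhile when the technological advantage lasts three periods, but becomes worthwhile when it lasts four, which illustrates why the duration of excludability matters.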
For the moment, just hold on to the idea that private R&D is inherently linked to
imperfect competition: devoting resources to innovation results in a fixed cost. To
recover this fixed cost, firms must achieve some excludability over the technology
created. Moreover, the size of this fixed cost determines how competitive the industry
can be: the more relevant R&D expenditures are, the more likely it is that there will
be few firms and limited competition. This helps explain the high concentration in
industries such as pharmaceuticals, where the research costs of new discoveries are
extremely large. The computer-games industry, by contrast, where new games may be
developed with relatively low investment, has a much more open and competitive
structure.
Another important idea to hold on to is that the incentives to R&D depend on the
size of the market (for the moment, you may interpret the size of the market as captured
by parameter B). The bigger the size of the market, the lower will be the R&D cost per
unit of output, and hence the quicker the initial investment will be recovered. A
corollary is that openness to international trade, by enlarging the extent of the market, is
good for innovation146.
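The role of market size can be made explicit by collecting the terms in $B$ in the expression for monopoly operating profits. The following is a sketch that takes the monopoly quantity (7.10) and the objective (7.8) as given:

```latex
\[
\pi_j
= \beta(1-\beta)\,B\left[\frac{B(1-\beta)^2\lambda_j}{w}\right]^{\frac{1-\beta}{\beta}}
= \beta\,(1-\beta)^{\frac{2-\beta}{\beta}}\,
  B^{\frac{1}{\beta}}\left(\frac{\lambda_j}{w}\right)^{\frac{1-\beta}{\beta}} .
\]
% Elasticity of operating profits with respect to market size:
\[
\frac{\partial \ln \pi_j}{\partial \ln B} = \frac{1}{\beta} > 1 .
\]
```

Under these assumptions, operating profits rise more than proportionally with $B$, so a given fixed R&D cost is recovered faster in a larger market.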
Static and dynamic efficiency again
We just saw that, from the entrepreneur's point of view, the less (or the slower)
technology leaks to competitors, the greater the market incentives to invent. A question
arises, however, as to whether society will also be better off with a high degree of
knowledge excludability.
Classical economic theory tells us that monopolies are bad for welfare. In
terms of Figure 7.1, for instance, the welfare gain of the innovation in the case with
monopoly is given by the area (a)+(b), corresponding to the gains in consumer surplus
and the monopolist's profits, respectively. Under perfect competition, with an
immediate fall in consumer prices to $w/\lambda_j$, the welfare gain is given by the area
(a)+(b)+(c), all accruing to consumers. Hence, the monopoly involves a transfer from
consumers to the innovating firm (b) and a deadweight loss to the economy as a whole
equal to (c). A similar reasoning applies to the case of a vertical innovation (Figure
7.2).
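The three areas of Figure 7.1 can be computed numerically. The sketch below assumes that the area under the demand curve $p_j = (1-\beta) B x_j^{-\beta}$ measures the gross benefit to buyers, which integrates to $B x_j^{1-\beta}$; the function name and the parameter values are illustrative assumptions.

```python
def welfare_areas(B, beta, lam, w):
    """Areas (a), (b), (c) of Figure 7.1 for the iso-elastic demand curve."""
    mc = w / lam                                           # marginal cost
    x_m = (B * (1.0 - beta) ** 2 / mc) ** (1.0 / beta)     # monopoly quantity
    x_c = (B * (1.0 - beta) / mc) ** (1.0 / beta)          # competitive quantity
    p_m = mc / (1.0 - beta)                                # monopoly price

    def gross_benefit(x):
        # Integral of the demand curve from 0 to x.
        return B * x ** (1.0 - beta)

    area_a = gross_benefit(x_m) - p_m * x_m        # consumer surplus under monopoly
    area_b = (p_m - mc) * x_m                      # monopoly operating profits
    total_competitive = gross_benefit(x_c) - mc * x_c
    area_c = total_competitive - area_a - area_b   # deadweight loss
    return area_a, area_b, area_c


a, b, c = welfare_areas(B=10.0, beta=0.5, lam=2.0, w=1.0)
print(a, b, c)
```

In this example the deadweight loss (c) is strictly positive: the welfare gain under monopoly, (a)+(b), falls short of the gain under immediate competition, (a)+(b)+(c).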
In order to avoid the welfare losses caused by monopolies, governments are
typically advised to intervene, for instance by regulating prices. In the case of
knowledge, there is a further consideration: since the social cost of having an extra individual
sharing a given idea is zero, does it make sense to exclude other people from using that
idea?
146
Scherer (1984, p. 13) quotes James Watt’s partner, Matthew Boulton: “It is not worth my while to
manufacture your engine for three countries only, but I find it very well worth my while to make it for all
the world”.
The other side of the coin is that, without excludability, there are no market
incentives for R&D. As with many other problems in economics, there is a trade-off
here: some excludability is inefficient from the static point of view, but it may provide
the incentives for private agents to develop more ideas. This trade-off led one of the
pioneers of modern development economics, Joseph A. Schumpeter (1883-1950) to
claim that “static” efficiency and “dynamic” efficiency do not necessarily go together (see
the quote at the beginning of this chapter).
7.5. Making knowledge excludable
Excludability sources
We just saw that the entrepreneurs’ incentive to innovate relies on their ability to
capture a rent from their ideas. This, in turn, requires some form of appropriation of the
technology created. This section deals specifically with the mechanisms on which
innovating firms rely today to maintain control over their technologies without
seeing these technologies leaking out to competitors147.
The first and most obvious mechanism of knowledge excludability is the trade
secret. By not disclosing the details of an invention, its owner may manage to keep
competitors out of the business. This has been the case, for instance, of the famous
formula of Coca-Cola, for more than one hundred years. Not all inventions, however,
are suitable to be protected as trade secrets. On one hand, some ideas are so simple that
they are very easy to replicate (think, for instance, of the “Post-it”, the ingenious adhesive note that
we stick on our desk to remember things). On the other hand, the passage of time makes
even complex ideas very difficult to hide. As an example, take the battle against the
diffusion of the atomic bomb since it was invented at the time of WWII.
In some industries, an important source of excludability is lead-time. Knowledge
leaks only gradually. So in many industries the problem of competitors free riding on
one’s ideas is circumvented by achieving a faster rate of technological change:
innovating firms try to keep the lead by continuously developing new sources of
differentiation against their competitors. The length of time that competing firms take to
assimilate new ideas and incorporate them into their own business provides the
innovating firm with a first-mover advantage.
Other advantages for first movers include the time to build up customer loyalty,
reputation, and the benefits of experience. As for the latter, many industries (notably,
shipbuilding, aircraft manufacturing, semiconductors) are characterized by a steep
learning curve. Thus, accumulated experience gives incumbents a cost advantage over
their competitors. This cost advantage does not leak instantaneously to competitors.
Often, the innovator devotes specific efforts to further design the product so as
to make it very hard for competitors to replicate. An example of this is encrypting CDs
to prevent unauthorized copies of a computer game. Note however that spending
resources to make reverse engineering more difficult is a source of static inefficiency, in
the sense that it implies that fewer resources are employed in production (see the discussion in
Box 7.4).
147
Of course, the discussion presumes that the technology created is useful for competitors: if the
invention was so specific that it only served the innovating firm, its diffusion would not be a problem.
A common problem in innovative industries is that key ideas are embodied in
workers hired and trained by the firm. Thus, there is an obvious risk of workers leaving
the innovating firm to join rival firms or to start a competing business independently. In
order to avoid this, firms may design compensation schemes that give key employees an
incentive to stay together (for instance, by sharing profits). In addition, they can introduce
non-disclosure and non-compete clauses in employment contracts. In some cases, fellow
firms working in a given location set agreements limiting the exchange of skilled labour
between them148.
Whenever the mechanisms above fail to provide the required protection,
inventors still have the option to buy legal protection, registering their property rights.
Patents, copyrights and trade marks
Patents are a legal mechanism that establishes private claims on intellectual
property rights, permitting innovators to restrict unauthorized use of their ideas. To buy
a patent, an inventor must demonstrate that the invention is novel and non-obvious. A
patent grants the inventor exclusive right to its discovery for a definite time length (20
years in Europe and in the US; from 14 to 15 years in the UK). During this period, other
producers are precluded from making use of the invention without permission of the
patent holder. Patent holders may however license (permit) others to use their invention
in exchange for a payment called royalty. When the patent expires, other firms are
allowed to enter the industry (note that this will not happen with a well maintained trade
secret).
When applying for a patent, the inventor has to disclose the details of the
invention. If the application succeeds, the patent will cover the patented output only. That is, the
knowledge revealed is protected during the lifetime of the patent in the sense that only
its owner can use it to produce the patented output. Yet the information in the patent
(the technical details of the invention) can be used freely by other firms to improve
their own research projects. New inventions that do not compete directly with the
patented output, even when built on the patent information, are in general considered
legal.
An instrument related to patents is copyrights. Copyrights apply to art works and
works of authorship when these are attached to a tangible medium, such as a book or a
CD. This contrasts to patents, which apply to products, processes, designs and
substances. An important distinction between patents and copyrights is that the latter
protect the particular expression of an idea, whereas patents protect any tangible
embodiment of the idea itself. Therefore, patents allow greater exclusivity than
copyrights. In compensation, society sets copyright terms longer than patents (in the
United States copyrights to business last 75 years and copyrights to individuals last for
life plus 50 years).
148
Whether this is socially good or bad is a different question. Gilson (1999), for instance, contends that
the weak enforcement of non-compete covenants in California may have contributed to the success of
Silicon Valley.
Another instrument that provides protection of intellectual property is the
trademark. The possibility of registering a trademark encourages firms to develop
customer loyalty and reputation. Unlike copyrights and patents, trademarks last forever.
Although legally distinct, patents, copyrights and trademarks can all be viewed
as serving the same purpose: they all provide mechanisms of intellectual property
protection, preventing others from using an existing idea.
The economics of patents
The regulation of property rights has a difficult task in weighing the benefits of
providing adequate incentives for researchers against the cost of creating legally
enforced monopolies. In that choice, there are two key dimensions. The first is the
patent length: for how long should the patent apply? The second is the breadth of patent
protection: to what range of products should the patent apply?
The optimal length of the patent reflects a balance between the need to provide
adequate ex ante incentives to researchers and the benefits that will accrue to consumers
once the patent expires. The longer the duration of the patent, the longer the
innovator earns monopoly profits (area b in Figure 7.1), and hence the greater will be
the incentive to engage in costly R&D. However, a long patent length also implies a
long-lasting monopoly power, which comes along with a static deadweight loss (area c).
If the life of the patent is too short, the innovating firm may not be able to accumulate
enough profits to reward the research effort; if the life of the patent is too long, there are
more incentives to innovate, but consumers have to wait too long for open competition.
The 20-year patent period is intended to strike a balance between static efficiency and
the long run objective of stimulating research and innovation149.
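The two sides of this trade-off can be illustrated with a stylised calculation. The sketch below is not the model cited in the footnote; it simply assumes a constant flow of monopoly profits and of deadweight loss during the life of the patent, continuous discounting at a positive rate r, and zero profits once the patent expires. All names and numbers are hypothetical.

```python
import math


def pv_over_patent_life(flow, r, T):
    """Present value of a constant flow received continuously for T years (r > 0)."""
    return flow * (1.0 - math.exp(-r * T)) / r


def minimum_patent_length(rd_cost, profit_flow, r):
    """Shortest patent life at which discounted profits cover the R&D cost.

    Returns math.inf when even a perpetual patent would not pay for the R&D.
    """
    if profit_flow <= r * rd_cost:
        return math.inf
    return -math.log(1.0 - r * rd_cost / profit_flow) / r


# Assumed example: R&D cost of 100, profit flow (area b) of 12.5 per year,
# deadweight loss flow (area c) of 12.5 per year, discount rate of 5%.
T_min = minimum_patent_length(100.0, 12.5, 0.05)
dwl_at_T_min = pv_over_patent_life(12.5, 0.05, T_min)
dwl_at_20 = pv_over_patent_life(12.5, 0.05, 20.0)
print(T_min, dwl_at_T_min, dwl_at_20)
```

In this example a patent life of roughly ten years would already be enough to induce the innovation; any additional protection only adds to the discounted deadweight loss.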
A similar trade-off applies to the breadth of patent protection. If an inventor
comes up with a product that is similar to one already patented, should a patent be given
to the new variant? If yes, the first inventor will reap less of the returns of her invention.
Excessive coverage, on the other hand, limits competition through innovation in the
neighbourhood of the protected idea: other firms will see their returns to further
developing the idea squeezed by the royalties they must pay to the original inventor.
The optimal choice involves a balance between the need to stimulate R&D and
competition through innovation.
In practice, the breadth of patent protection is a matter of dispute in the patent
office, with later entrants claiming the right to introduce slightly different innovations or
new applications of the original idea without paying the royalties. Because litigation
results are not always as desired by established firms, the latter often protect the
invention against other firms “inventing around”, by establishing property rights on
related ideas, even if never used (“sleeping patents”).
Some authors argue that the optimal patent breadth and the optimal patent length
are not independent. For instance, it has been argued that, because imitators can often
get around the patent protection, engaging in a socially costly free ride, and because the
incentives to do so increase with the patent length (if the patent duration is short,
imitators will find it cheaper to wait for the patent to expire), a “short and fat” patent
system may be preferable to a “long and thin” one150.
149
A famous model analysing this question is due to Nordhaus (1969).
The case against patents
Many historians have emphasized the role of institutions governing intellectual
property rights as a main driver of economic growth. According to this view, the
Industrial Revolution was only possible after governments established a proper
regulation and enforced property rights, granting inventors the necessary ex ante
incentives to stimulate research and development151.
Other authors have argued, however, that the monopoly distortions imposed by
the patent and copyright systems are too costly for what they achieve. According to this
view, purely private excludability mechanisms, such as first-mover advantages, lead
time, secrecy and imitation delays, provide enough protection for innovation and lead to
a better allocation of resources than patent and copyright systems152. These authors
propose restricting patents and copyrights severely or even eliminating them altogether.
An argument that has been put forward is that patents are less efficient than trade
secrets. At first sight, the opposite looks true: when the inventor buys the patent, he
has to reveal the “secret”, allowing other inventors to build on it for other uses. In
addition, consumers will have the opportunity to enjoy the benefits of the innovation
when the patent expires. However, the inventor will only prefer to buy patent protection
if he foresees that it will be impossible to keep the trade secret much longer. If the trade
secret could be maintained for more than 20 years, he would never buy the patent. Thus,
secrets that without a patent would have been kept for a short period of time will, with the
patent, be maintained for 20 years.
A mechanism that helps reduce the inefficiencies generated by patent
protection is licensing: patent holders can permit others to use their invention in
exchange for a payment called royalty. Licensing is welfare enhancing for two reasons:
on one hand, it avoids competing innovation and imitation efforts, which are socially
costly. On the other hand, because knowledge is non-rival, sharing it, even at a positive
cost, is socially better than not sharing it at all. Furthermore, from the innovating firm's
point of view, licensing allows the idea to be used in markets in which the inventor
might not have a competitive advantage (for instance, in a foreign country, in a different
use, etc.). In the real world, licensing is a primary way of transferring know-how across
country borders.
150
Gallini (1992). Denicolò (1996) argues that this conclusion is conditional on the market structure.
151
Douglass North (1981), p. 164: “The failure to develop systematic property rights in innovation up until
fairly modern times was a major source of the slow pace of technological change”. Other authors
stressing the protection of property rights as the prime factor of the European take-off include Landes
(1998) and Mokyr (2002).
152
Among others, Kremer (1998), Boldrin and Levine (2002, 2004) and Kelly and Quah (1998).
Merges and Nelson (1994), for example, argued: “…we believe that the granting and
enforcing of broad pioneer patents is dangerous social policy. It can, and has hurt in a number of ways. It
has made the entry of creative and energetic newcomers difficult…. There are many cases in which
technical advance has been very rapid under a regime where intellectual property rights were weak or not
strongly enforced.”
The social cost of a patent may be especially high in the case of critical
medicines that could otherwise be sold cheaper and save lives. Because of this, critics
of the patent systems have argued that the government (taxpayers) should purchase
patents for particular innovations and release them to the public. This would eliminate
the ex-post distortion and keep the incentives right. However, this policy would lead to
another failure: if the innovation could be produced in other countries, there would be a
free ride on the effort of the first country’s taxpayers.
In practice, patents are not equally necessary across industries. While some
inventions only become available with an enforced patent system, many others become
available just as quickly without a patent system. In other words, some inventions are
“patent dependent” and others are not. In many industries, sufficient economic
incentives for invention and innovation result from secrecy and first-mover advantages.
In these cases, the patent system is inefficient, in that the same innovation would be
obtained without the cost of granting monopoly power.
Box 7.3 How effective are patents?
A famous study on the effectiveness of patents is Levin et al. (1987), who
surveyed 650 R&D managers representing 130 lines of business. The executives were
asked to rate the effectiveness of patents, as well as of other mechanisms, in protecting
their competitive advantages, on a scale from 1 (“not at all effective”) to 7 (“very
effective”). Their findings are displayed in Table 7.1.
Table 7.1. Effectiveness of alternative means of protecting advantages of new or improved
processes and products (sample means)

Method of appropriation                      Processes   Products
Patents to prevent duplication                  3.52        4.33
Patents to secure royalty income                3.31        3.75
Secrecy                                         4.31        3.57
Lead time                                       5.11        5.41
Moving quickly down the learning curve          5.02        5.09
Sales or service efforts                        4.55        5.59

Note: Range: 1 = not at all effective; 7 = very effective. Source: Levin et al. (1987), p.
794. Adapted from ......
Interestingly enough, for process innovations, patent protection was considered the
least effective method of protection. In the case of product innovations, the average
score obtained by patents was slightly higher than in the case of process innovations.
Still, only secrecy was rated lower than patents.
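The ranking implied by Table 7.1 can be checked directly. The sketch below simply sorts the sample means reported in the table; the dictionary layout and the function name are illustrative choices, not part of the original study.

```python
# Sample means from Table 7.1 (Levin et al., 1987): (processes, products).
scores = {
    "Patents to prevent duplication": (3.52, 4.33),
    "Patents to secure royalty income": (3.31, 3.75),
    "Secrecy": (4.31, 3.57),
    "Lead time": (5.11, 5.41),
    "Moving quickly down the learning curve": (5.02, 5.09),
    "Sales or service efforts": (4.55, 5.59),
}


def ranking(column):
    """Methods sorted from least to most effective (0 = processes, 1 = products)."""
    return sorted(scores, key=lambda method: scores[method][column])


process_rank = ranking(0)
product_rank = ranking(1)

for name, rank in (("Processes", process_rank), ("Products", product_rank)):
    print(name)
    for method in rank:
        print("  ", method)
```

For processes, the two patent-related items occupy the bottom two places; for products, only secrecy ranks below them, while lead time and sales or service efforts come out on top.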
The authors also investigated how the answers differed across industries. Among
18 industries, only in one industry, pharmaceutical drugs – and for the particular case of
product innovations - did the majority of the respondents rate patents as strictly more
effective than other means of appropriation.
A related study, conducted by Mansfield (1996), analyzed a sample of 100
firms from twelve broadly defined industries. The author surveyed the firms’ R&D
directors to ascertain what proportion of their companies’ inventions were “patent
dependent”. The results reveal a quite unimportant role for patents (proportions of 1%
or less) in six out of twelve industries, and a moderate role in another three (from 11% to
17%). The three industries reported as most patent dependent were “petroleum”
(25%), “other chemicals” (38%) and “pharmaceuticals” (60%).
A more recent study, by Cohen et al. (2000), analyzed survey responses from
1478 R&D labs in the U.S. manufacturing sector. The authors found that patents were
the least emphasized mechanism of protection in most industries. In the pharmaceutical
industry, however, patents were considered an effective protection mechanism for more
than 50% of all product innovations. In the case of chemicals, the authors also indicate
an important role of patents as a mechanism to deter the patenting of close substitutes
by rivals (patent blocking). They also found an important role for patents as a device to
force rivals into negotiations, namely in the telecommunications equipment and
semiconductor industries.
Box 7.4 The Battle of Hybrid Seeds
“Since the beginning of the XIX century, when the production of new seed and
plant varieties took a central place in the development of modern agriculture, and until
the 1960s, many new seeds were introduced but very few were patented and enjoyed
legal protection. The reason for this was relatively simple: new seeds were technically
not patentable because seeds coming from natural reproduction could not be
distinguished from those coming from plant breeders (the same did apply, and
apparently still applies, to cattle). This state of affair continued until during the 1940s,
after 50 years of research and thanks to a lot of private and public research money, the
hybridisation became available. To make a long story short, this technique allows for
the production of patentable seeds, as the hybrid seeds cannot be reproduced (they are
sterile), and only people that control the original pure kinds of seeds can produce the
hybrid through a monitorable fertilization technique. From then on, lobbying from
companies producing hybrid seeds for new and special legislation for plant patents
intensified, and in 1970 The Plant Varieties Protection Act was enacted. This is the
most stringent patent legislation for agricultural products in the whole world; it is this
legislation that American chemical monopolies are trying to impose on the agricultural
products of less developed economies. Hybrid seeds, which cost billions of private and
public dollars to be developed, are neither particularly more productive or socially (as
opposed to privately) valuable than traditional ones. They are instead patentable, which
allows their producers to establish and maintain a monopoly power. Notice, in
particular, that if the option of eventually purchasing patents for the hybrid seeds had
not been available, resources would have not been wasted in the first place to develop
the hybridisation technique. This is a good example of socially damaging reinforcement
between public and private rent-seeking”. Boldrin and Levine (2004), pp 129-130.
7.6. Too little R&D
The appropriability effect
Throughout this chapter we have stressed the role of knowledge excludability in
providing market incentives to innovate. A different question is whether knowledge
excludability, even when fully achieved, is sufficient to induce socially desirable
innovations. In fact, this is not always the case.
To see this, let’s refer to Figure 7.1 again. In that example, the monopoly profit
(area b) falls short of the social benefit of the invention (area a+b). The latter accounts for
the impact on “consumer surplus”, which, in the case of a horizontal innovation,
measures the efficiency-enhancing effect in production, due to the arrival of a new
intermediate input. The fact that the monopolist does not fully appropriate the
benefits of its innovations to society is called the appropriability effect 153 154.
The appropriability effect implies that a socially beneficial innovation may fail
to occur, even if perfectly excludable. In terms of Figure 7.1, this will be the case when
the fixed cost of the innovation lies between the areas (b) and (a)+(b).
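The condition can be illustrated numerically. The sketch below is not taken from Figure 7.1 itself: it assumes the iso-elastic inverse demand $p = (1-\beta)Bx^{-\beta}$ used elsewhere in the chapter, a constant marginal cost $w/\lambda$, and the parameter values of Exercise 7.2 (B=100, β=1/3, w=50, λ=2). The function names and the three fixed-cost values are hypothetical, chosen only for illustration.

```python
def monopoly_outcome(B, beta, w, lam):
    """Monopoly price, quantity, profit (area b) and consumer surplus (area a).

    Assumes inverse demand p = (1 - beta) * B * x**(-beta) and marginal cost w / lam.
    """
    mc = w / lam                                  # constant marginal cost
    price = mc / (1 - beta)                       # monopoly markup over marginal cost
    x = ((1 - beta) * B / price) ** (1 / beta)    # quantity demanded at that price
    profit = (price - mc) * x                     # area (b)
    surplus = B * x ** (1 - beta) - price * x     # area (a): area under demand, above price
    return price, x, profit, surplus


def innovation_decision(fixed_cost, profit, surplus):
    """Return (privately profitable?, socially desirable?) for a given fixed cost."""
    privately_profitable = profit >= fixed_cost
    socially_desirable = profit + surplus >= fixed_cost
    return privately_profitable, socially_desirable


price, x, b, a = monopoly_outcome(B=100, beta=1 / 3, w=50, lam=2)
for fixed_cost in (50, 100, 200):                 # hypothetical fixed costs
    print(fixed_cost, innovation_decision(fixed_cost, b, a))
```

Under these assumed numbers, a fixed cost of 100 lies between (b) and (a)+(b): the innovation would raise social welfare, yet no firm would undertake it.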
Box 7.5. Standing on shoulders
An important property of knowledge is its cumulative nature: inventors develop

new ideas learning from old ideas. This property is often labelled the “standing on
shoulders” effect, owing this name to a famous quotation from Isaac Newton (“If I have
been able to see further, it was only because I stood on the shoulders of giants”).
The following examples illustrate the “standing on shoulders effect”.
The steam engine was invented by James Watt in 1769. The idea was not
however inspired by watching steam rise from a teakettle’s spout, as it is commonly
said. The idea was actually born while James Watt was repairing an earlier steam
engine invented 57 years before by Thomas Newcomen. The latter, in turn, was an
improvement of a steam engine patented in 1698 by the Englishman Thomas Savery,
which followed another designed by the Frenchman Denis Papin around 1680, which in
turn had precursors in the ideas of the Dutchman Christiaan Huygens, and so on155.
Another famous example of “standing on shoulders” is the Toyota “lean
production system”. This was achieved by Toyota after carefully studying the Ford
production system and improving upon it. Some of Toyota's developments were later
imitated by American firms156 157.
153
Empirically, many studies have found social returns to innovation largely exceeding private returns.
For a survey, see Griliches (1991).
154
In the case of a drastic vertical innovation, the increase in consumer surplus also implies a social gain
larger than the private gain. In Figure 7.2, we see that ex post profits are given by the area [SMTR] while
the increase in social welfare is the area [QC0MRT].
155
Diamond (1997).
156
Mukoyama (2003).
It should be noted that “standing on shoulders” does not always require access to the
technical details of the previous idea. Even when the new technology is not well
assimilated by followers, the simple propagation of the idea may inspire people to
develop alternative ways of achieving similar results. The difference is that in this
alternative model of technological diffusion, all the details have to be reinvented. As an
example, consider Microsoft Windows158: the first graphical operating system was
introduced by Apple in 1984. Microsoft then adapted the idea and came out with its
own graphical operating system, Windows 1.0, in 1985. The underlying idea was the
same, but the programming language was completely different.
The fact that simple ideas may induce independent efforts to achieve similar
results implies that even well-maintained trade secrets face the threat of competitors free
riding on the diffusion of the idea. An obvious example is the case of Coca-Cola: even though
its formula is a trade secret, the concept is not. Unsurprisingly, other firms, such as
PepsiCo and Canada Dry, have entered the market with close substitutes. Another
example is the atomic bomb: historians are still debating whether the Russian atomic
bomb was based on detailed blueprints stolen from the Americans or whether it was the
diffusion of the idea that induced the Soviets to engage in an independent project where
the principles of the bomb were reinvented.
The “standing on shoulders” effect also applies across industries. Technical
innovations originating in a particular industry often find important applications, and
instigate further technological change, very far from the industry in which the invention
originated. A classic example is the invention of the transistor by the telephone
company AT&T. Although this firm was rewarded for its R&D effort through higher
margins in the production of telephone devices, many other firms took the opportunity
offered by the new invention, namely to develop better radios and better television sets.
The standing on shoulders effect implies that innovating firms will not, in
general, appropriate all the benefits of their innovations to the society. Thus, in a
decentralized economy, they will not innovate as fast as it would be socially desirable.
Subsidies to private R&D
We have just argued that, in general, the ex post monopoly rents that the innovator
can capture fall short of the social welfare created by the invention. Hence, private firms
will not innovate as fast as would be socially desirable. The implication is that there is
a role for the government to support innovation.
A common policy instrument is the subsidy. Government subsidies can be
attributed either to specific innovation projects, to particular innovation activities or
more generally to particular industries.
Some authors argue that government subsidies should treat different industries
differently, on the grounds that they are more needed in more competitive environments
(because there are no profits) and where the potential for technological spillovers is larger.
157
This externality often creates the incentives for firms to cooperate: coordinated research and joint
research ventures allow the costs and the benefits of R&D to be shared by different firms, internalizing
part of the externality. To account for such possibility, some domestic competition laws, including in the
U.S. and in the E.U., have been relaxed in respect to collaborative R&D projects.
158
Mukoyama (2003).
Designing different subsidies for different industries involves, however, a large degree of
discretion. In a world where governments face important information failures, a
question of level playing field arises: firms or sectors benefiting from a government
subsidy may obtain an undesirable competitive advantage over their competitors at
the expense of the taxpayers. For this reason, international agreements and some
domestic competition laws (such as in the U.S. and in the E.U.) have been limiting
narrowly focused subsidies and state aid to particular firms. By contrast, broad-based
subsidisation mechanisms for particular activities, such as R&D tax credits, because they
do not depend on government selection of particular projects or industries, are
inherently less distortionary and hence more tolerated by domestic competition laws
and international trade agreements.
The case for subsidies is not free of controversy. Some authors argue that
technological spillovers can be largely internalized by the innovating firm: for instance,
when innovators become aware that there is a chance that their invention will leak out, they
can sell part of their business to competitors before the idea actually leaks. Alternatively,
they can charge the employees (deducting an implicit fee from the workers’
wages) for the knowledge they are acquiring and that will become useful to them159.
Financing constraints
The discussion above focuses on the ex ante incentives to R&D. Appropriate

incentives are not, however, a sufficient condition for firms to engage in R&D: even
when expected profits are high, valuable research projects may fail to be implemented,
due to lack of finance. In the real world, many firms face difficulties in raising capital to
finance their R&D projects.
There are many reasons for this. First, research projects may fail, either because
nothing much of relevance is discovered or because the invention is beaten by a
competing firm in the patent office. Research projects thus involve a significant probability of
ex post returns being insufficient to cover the loan, raising the likelihood of involuntary
default.
Second, financing R&D projects typically involves asymmetric information:
either because of the technical complexity of the project or simply because researchers
do not want to disclose the technical details, investors do not in general fully understand
what is envisaged by the researcher. This raises a typical problem of moral hazard:
because it is difficult to monitor the true effort and the quality offered by the
researchers, there is ample scope for low levels of commitment and for hiding
relevant aspects of new ideas in the event of success.
Third, R&D projects do not provide obvious collateral: in the case of a
mortgage loan, if the borrower defaults, the bank gets the real asset. If the bank lends
for R&D and the research project fails, the bank may end up with nothing. This problem
may, of course, be bypassed by the borrower offering other assets as collateral, but this
mechanism may not be available for many start-ups, especially in poor countries.
159
Becker (1971): “Firms introducing innovations are alleged to be forced to share their knowledge with
competitors through the bidding away of employees who are privy of their secrets. This may well be a
common practice, but if employees benefit from access to sellable information about secrets, they would
be willing to work more cheaply than otherwise”.
Because capital markets are not efficient, R&D-intensive firms tend to use their own
capital to finance their research efforts. This is not a big issue for large companies
already established in the market. Established firms can raise capital for new R&D out
of their profits on past R&D. But for new entrants, especially small firms, lack of
financing may constitute a significant barrier to entry and a source of market
imperfection160.
A mechanism that specifically addresses these problems is venture capital.
Venture capital firms invest in promising R&D projects, demanding, in compensation
for their risk taking, an ownership stake in the new company. By allowing the research risk
to be shared, venture capital can have a positive impact on the level of R&D. In
addition, because they have the right to appoint a manager, venture capital firms can
mitigate the problem of moral hazard. However, venture capital firms rarely meet all the
existing needs of R&D finance, especially at the smaller end of the market, where the
transaction costs are high relative to the expected returns.
In general, financial development is favourable to R&D: as new and more
complex securities become available, risk-spreading opportunities for investors
increase, with the consequence of increasing the availability of funds to risky projects.
In countries with a low level of financial development, on the contrary, the inability to
diversify idiosyncratic risks leads agents to choose inferior but safer strategies, such as
relying more on imitation than on invention161.
Government funded R&D
We argued that governments have a role in complementing private incentives for
R&D. By establishing and enforcing a system of property rights and by subsidizing
innovative activities, government may help firms appropriate more of the social benefits
of their innovations, thereby inducing an R&D effort more aligned with the social
interest.
Not all research, however, is driven by market concerns. For example, advances
in basic sciences, such as geography, economics, mathematics and physics cannot be
patented, so there are no rents to extract. And yet, because of the large externalities
involved, advances in basic science are of great importance for the progress of
humankind.
On the other hand, even when particular types of knowledge are suitable for
exclusion, it may be socially preferable to make them freely available. Remember that,
because knowledge is non-rival, the social cost of having more agents sharing the same
idea is zero. Given the cumulative nature of knowledge (i.e., new discoveries build on
old discoveries), there is a good case to let relevant knowledge become freely available,
even when patenting is possible.
As with public goods in general, governments may have a role in directly supporting
the creation of knowledge. One possibility is to reward with prizes and research
grants the creation of knowledge that becomes public property. For instance, academic
and government scientists do not work with the primary objective of profit
maximisation. Their main incentive is to disclose the product of their research in order
to receive rewards. This kind of support is known as “patronage”162.
160
The fact that a significant fraction of R&D investments are self-financed led Schumpeter to argue
that large firms have an advantage because they have more resources to invest and set up expensive
laboratories. Schumpeter also conjectured that large firms are more likely to engage in R&D because they
can exploit economies of scale and spread risks across projects.
161
Acemoglu and Zilibotti (1997).
Governments may also promote research and development through
“procurement”. In this case, a public body contracts out in advance for a specific piece
of research to be undertaken. With this mechanism, the government absorbs some or all
the risks that the private firm would otherwise have to address. Depending on the
interests of the government, the findings of the research undertaken under procurement
may become publicly available or not. In the case of military research and big space
programmes, such as those managed by NASA in the USA, disclosure is not in general
allowed.
Government-funded R&D accounts for between one-third and one-half of total
R&D expenditures in the US and Europe163.
7.7. Discussion
A basic flaw of the Solow model is that, because it assumes perfect
technological diffusion, it cannot account for private incentives for R&D. This chapter
stresses the critical role of knowledge excludability in providing market incentives for
firms to engage in R&D. This implies, however, a departure from the perfect
competition model.
By protecting and enforcing property rights, governments have a role in shaping
the market incentives for R&D. The regulation of property rights involves, however, a
trade-off between static efficiency and dynamic efficiency. Some authors have argued
that the existing patent system involves too much static inefficiency, so that it should be
relaxed or even abolished. In many industries, natural excludability mechanisms, such
as lead-time and first-mover advantage, provide enough incentives for R&D.
Private researchers do not in general fully appropriate the social benefits of their
inventions, even when property rights are fully enforced. This means that the
government may have a role in stimulating the research activity through subsidies and
research grants. Investment in R&D is also affected negatively by frictions in financial
markets.
The model analysed in this chapter suggests that the incentives for R&D depend
on the costs of achieving an innovation, on the size of the market, on the existence of
competitors and on how long the excludability will last. The question of the optimal
level of R&D will be specifically addressed in the next chapters.
162
David (1992).
163
Kelly and Quah (1998).
Key ideas of chapter 7
• Private agents dedicate valuable resources to the development of new technologies because they
expect to be rewarded in case of successful innovation.
• Private incentives for R&D depend on the ability of the innovating firm to keep competitors away
from its invention. A technology can be made exclusive by legal mechanisms, such as patents,
copyrights and trademarks, but also by natural mechanisms, such as secrecy, lead-time, and customer
loyalty. In practice, patents are likely to be an important source of excludability in only a few industries,
such as chemicals and pharmaceuticals.
• Monopoly rents made possible by R&D depend positively on the size of the market and on how long the
technological advantage will last, and negatively on the existence of competing technologies.
• The fact that incentives for R&D depend on making excludable a good that is non-rival (knowledge)
implies a trade-off between static and dynamic efficiency. Some authors contend that the dynamic
gains achieved with the patent system are not enough to offset the static costs.
• In general, because the social benefits of R&D exceed the private gains, there is scope for
government intervention. This can be achieved through simple subsidies or, in some cases, by fully
supporting the research activity, through procurement or patronage.
• Even when private incentives for R&D are high enough, a researcher may face borrowing
constraints, because the output of R&D is uncertain and immaterial. Financing R&D is easier in
countries with well-developed financial markets, where venture capital firms are able to spread risks
across a large range of activities. The implication is that financial market development is good for
growth.
Key concepts
• Horizontal vs. vertical innovation
• Division of labour
• Drastic vs. non-drastic innovation
• Limit pricing
• The appropriability effect
• Standing on shoulders
Essay questions:
a) In a small economy, are incentives for R&D higher under autarky or under
free trade? Explain.
b) Comment: “Patents are inefficient. One could abolish them and still have
incentives for R&D”.
c) Comment: “Even with perfectly protected property rights, the market
mechanism would deliver too little R&D”.
Exercises
7.1.
Consider a carpenter who produces chairs as a final good (Y). The aggregate
output is obtained through $Y = \sum_{j=1}^{m} x_j^{1/2}$, where $m$ refers to the number of tasks used in
the production process. Assume that there are 4 employees ($N_Y$), each one with
productivity equal to λ=9.
a) Assume first that there is no division of labour (that is, each worker does all the
tasks). Find the number of chairs produced by the carpenter.
b) Now assume that the task of producing a chair is split into four sub-tasks, each
one attributed to one worker only. Explain the impact on aggregate production.
c) What would be the impact if the productivity of each worker increased from λ=9
to λ=25?
d) Referring to the exercise, explain the difference between vertical innovations
and horizontal innovations. Is the example referring to process innovations or to
product innovations?
7.2.
Consider an economy where the aggregate output is assembled with m
intermediate inputs: $Y = B \sum_{j=1}^{m} x_j^{1-\beta}$. The production function of each intermediate input
is given by $x_j = \lambda N_j$, and the total labour force employed in the intermediate input
sector as a whole is expressed as $N_Y = (1-\mu)N = \sum_{j=1}^{m} N_j$, where μ is the constant fraction of
the labour force devoted to R&D. The wage rate (w) is 50, β=1/3, B=100 and λ=2.
a) If the final good sector was perfectly competitive, what would be the
demand for input j?
b) If only one producer had the right to produce j, what would be the
corresponding price? Represent in a graph.
c) If competitors became licensed to produce this variety, what would
happen?
d) If, alternatively, imitators could produce this variety with a marginal
product equal to $\lambda_F$=1.6, what would be the equilibrium? Explain, with
the help of a graph.
e) Assume now that a firm escaping competition developed a more efficient
technique to produce good j (λ=2.5). Would this innovation be drastic
or non-drastic? Explain with the help of a graph.
The essential point to grasp is that in dealing with capitalism we are dealing with
an evolutionary process [Joseph Schumpeter].
Learning Goals:
• Understand the process through which innovations destroy existing rents
• Understand the analogy between creative destruction and Darwin's
theory of evolution
• Identify the factors that influence the market value of a discovery and the
optimal investment in R&D
• Acknowledge the alternative approaches that have been proposed to
remove the scale effect from endogenous growth models
8.1 Introduction
In today’s world, much competition between firms takes the form of firms trying
to develop new and better products or less costly methods of producing existing
products. This competition forces incumbents to continuously revise their plans and
production techniques, in a process of permanent adaptation. In this process, there are
winners and losers. Firms that fail to adapt experience losses, and some are forced out
of business.
The view that technological change comes along with the destruction of existing
businesses is at the basis of the Schumpeterian paradigm of economic growth. In light
of this paradigm, the disappearance of old activities and firms and the emergence of
new activities and firms is an important vehicle through which technological progress
materializes. Joseph Schumpeter labelled this process “creative destruction”.
Creative destruction is a form of competition through innovation that delivers rapid
productivity growth.
This chapter examines the argument, focusing on the competitive nature of
R&D. Section 8.2 explains what is meant by creative destruction. Section 8.3 presents a
simplified version of the basic Schumpeterian model. Section 8.4 gives the intuition of
extending the analysis to more than one sector. Section 8.5 analyses the question as to
whether more product market competition is good or bad for innovation. Section 8.6
concludes.
Destroying a monopolist’s rents
A distinctive feature of the Schumpeterian model is that monopoly rents last
only until the arrival of a competing innovation. Hence, in choosing their research
effort, firms have to balance the potential benefit of gaining market power against the cost
of losing it to a competing innovation.
To illustrate the concept of creative destruction, let’s consider again the case of a
vertical innovation, with a small novelty. In Figure 7.2 it was assumed that prior to the
innovation the market was under perfect competition. Hence, in that example, there
were no economic rents to destroy. In Figure 8.1, on the contrary, it is assumed that
prior to innovation the market was run by an incumbent with full monopoly power.
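Figure 8.1. Creative Destruction
[Figure: price p (vertical axis) against quantity x (horizontal axis), with the demand curve $p = (1-\beta)Bx^{-\beta}$. Point M0 marks the incumbent's monopoly equilibrium, at price $p_0 = w/[\lambda_0(1-\beta)]$ and quantity $x_0^M$; point M1 marks the entrant's equilibrium, at price $p_1 = w/[\lambda_1(1-\beta)]$ and quantity $x_1^M$. The marginal cost lines are $w/\lambda_0$ and $w/\lambda_1$. Area (a) is the incumbent's profit and area (b) is the entrant's profit.]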
The equilibrium prior to innovation is described in Figure 8.1 by point M0. This
equilibrium corresponds to the intersection of the incumbent’s marginal cost curve
($w/\lambda_0$) with the locus of marginal revenues (the dashed curve), implying a price equal
to $p_0 = w/[\lambda_0(1-\beta)]$ and a total demand equal to $x_0^M$ (the suffix j is omitted to simplify
notation). The incumbent’s operational profits (7.12) correspond to the area (a).
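The price at M0 follows from the monopolist's first-order condition. A short sketch, assuming the inverse demand $p = (1-\beta)Bx^{-\beta}$ drawn in Figure 8.1 and the constant marginal cost $w/\lambda_0$:

$$
R(x) = p\,x = (1-\beta)B x^{1-\beta}
\quad\Rightarrow\quad
R'(x) = (1-\beta)^2 B x^{-\beta} = (1-\beta)\,p .
$$

Setting marginal revenue equal to marginal cost, $R'(x) = w/\lambda_0$, gives

$$
p_0 = \frac{w}{\lambda_0(1-\beta)},
\qquad
x_0^M = \left[\frac{(1-\beta)^2 B \lambda_0}{w}\right]^{1/\beta}.
$$

The same steps with $\lambda_1$ in place of $\lambda_0$ give the price and quantity at a point such as M1.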
Now assume that an entrepreneur achieves a drastic vertical innovation,
corresponding to a productivity gain from $\lambda_0$ to $\lambda_1 > \lambda_0$. With this innovation, the
entrepreneur achieves a marginal cost equal to $w/\lambda_1$ (it is assumed that the market for
this variety is small relative to the labour market, so the wage rate remains constant).
The innovation allows the entrepreneur to undercut its rival and still capture the whole
market. The new monopoly price falls to $p_1 = w/[\lambda_1(1-\beta)]$, implying an increase in
production of this variety from $x_0^M$ to $x_1^M$.
With the innovation, the entrepreneur achieves operational profits equal to area
(b). The incumbent monopolist, in turn, loses area (a) to consumers. Hence, the arrival
of a new rent (b) comes along with the destruction of an old rent (a). This is why the
process is called Creative Destruction.
Note that, since the innovation is drastic, the consumer gain (area [$p_0 M_0 M_1 p_1$])
more than offsets the incumbent's loss. Hence, as long as the innovation is profitable for
the entrant (that is, if (b) is greater than the sunk cost of the innovation), there will be a
gain for society as a whole164.
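The rents created and destroyed can be computed in a small numerical sketch. It assumes the demand curve of Figure 8.1, $p = (1-\beta)Bx^{-\beta}$, and borrows B=100, β=1/3, w=50 and $\lambda_0$=2 from Exercise 7.2. The value $\lambda_1$=4 is hypothetical, as are the function names, and the test for a drastic innovation (the entrant's monopoly price does not exceed the incumbent's marginal cost) is stated here as an assumption.

```python
def monopoly(B, beta, w, lam):
    """Unconstrained monopoly outcome for inverse demand p = (1 - beta) * B * x**(-beta)."""
    mc = w / lam                                   # constant marginal cost
    price = mc / (1 - beta)                        # markup 1 / (1 - beta) over marginal cost
    x = ((1 - beta) * B / price) ** (1 / beta)     # quantity demanded at that price
    profit = (price - mc) * x                      # operational profit (the rent)
    surplus = B * x ** (1 - beta) - price * x      # area under demand, above price
    return {"mc": mc, "price": price, "x": x, "profit": profit, "surplus": surplus}


def is_drastic(beta, lam_old, lam_new):
    """Assumed criterion: entrant's monopoly price <= incumbent's marginal cost."""
    return lam_new * (1 - beta) >= lam_old


B, beta, w = 100, 1 / 3, 50
old = monopoly(B, beta, w, lam=2.0)                # incumbent, point M0
new = monopoly(B, beta, w, lam=4.0)                # entrant, point M1 (hypothetical lambda_1)

rent_destroyed = old["profit"]                     # area (a)
rent_created = new["profit"]                       # area (b)
consumer_gain = new["surplus"] - old["surplus"]    # rise in consumer surplus

print("drastic:", is_drastic(beta, 2.0, 4.0))
print("area (a):", round(rent_destroyed, 2), "area (b):", round(rent_created, 2))
print("consumer gain:", round(consumer_gain, 2))
```

With these assumed values the entrant's price falls below the incumbent's marginal cost, output of the variety rises, and the consumer gain exceeds the rent lost by the incumbent, in line with the argument above.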
Box 8.1. Joseph Schumpeter
Joseph A. Schumpeter (1883-1950) was one of the most prominent economists
of the 20th century and is sometimes referred to as the “father of economic
development”. In two famous books, The Theory of Economic Development, published
in 1911, and Capitalism, Socialism and Democracy, first published in 1942, Schumpeter
argued that the process of economic development resembles Charles Darwin’s
theory of evolution.
Schumpeter theorized that the introduction of new products, new production
processes and new forms of industrial organization by innovating firms undermines the
marketability and the value of existing designs and production techniques. This allows
inventing firms to achieve a new dominant position in the market and thereby to reap a
return on their research effort. According to Schumpeter, that dominant position should
not last for long: sooner or later other firms will come up with new and better
technologies, causing the incumbent’s rents to erode.
The process through which firms bringing new technology enter the market
and undercut incumbents, destroying their rents and “revolutionizing the economic structure
from within”, was labelled by Schumpeter “creative destruction”. According to the
author, creative destruction allows the market economy to incessantly revitalize itself,
reallocating resources from old and failing businesses to newer and more promising areas.
Creative destruction, the author argued, is the essential mechanism through which the
market economy adapts to technological change.
Box 8.2. Peas, dark moths and the theory of natural selection
In its primitive form, the pea plant carries a gene that makes its pods explode
when peas are ready for germination. This mechanism allows peas to be scattered on the
ground, ensuring the survival of the species. In each generation of pea plants, however,
a number of mutants grow by accident lacking this key genetic ingredient: pods of
mutant peas fail to pop open. In the wild, mutant peas die entombed in their pods.
Natural selection ensures that only the healthy pods pass on their genes.
When humans invented agriculture, however, the direction of natural selection
was changed. Humans were not interested in the primitive version of the pea plant,
because it is much more convenient to gather pods with peas enclosed than to search for
scattered peas on the ground. Thus, once humans became farmers, they started growing
the mutant version. Today, the pea plants we see in our fields are the mutant version,
not the primitive one. Farmers reversed the direction of natural selection: the formerly
successful gene became lethal and the formerly lethal mutant became successful.
By the end of the 19th century a darker variant of moths became far more abundant
than the paler varieties, in regions of England with carbon-intensive manufactures. The
reason is that, as the environment became dirtier, dark moths resting on dirty trees were
more likely to escape the attention of predators than the pale moths. The sudden
change in environment caused a significant evolutionary change within a time period
corresponding to only hundreds of generations.
164
Note that this is not a general case: if the sunk cost of R&D was too large, the net welfare gain could
end up being negative. Also note that, in case the innovation is non-drastic, there will be no consumer
gain at all (we will examine the latter case in Box 8.3).
These two examples, described by Jared Diamond in his famous book Guns,
Germs and Steel165, illustrate Darwin’s concept of “natural selection”: in nature,
each new generation of a species produces a number of mutants. Because in general
mutants are not endowed with the same genetic information that their ancestors
developed over thousands of years, they are in principle more vulnerable to
environmental challenges. The natural process of differential survival and
reproduction does the selecting. At certain moments, however, the mutant
“competencies” may turn out to be an advantage instead of a threat: changes in the
natural environment may cause a mutant variety to become naturally selected. In these
cases, the population undergoes an evolutionary change.
Like living species, economic agents respond to changes in the economic
environment. At each moment in time, agents tend to use strategies that they have observed or
learned to be successful in the past. The behaviour of each economic agent at each
moment in time thus reflects a learning process and an interaction between its
competences and the economic environment. Occasionally, agents experiment with new
strategies. This is innovation. When the new strategy fails, agents retreat to the old
strategies. Whenever the new strategy succeeds, the innovating agent gains a
competitive advantage. This advantage will render the previous strategies obsolete. As
time goes by, other agents copy the more efficient strategy, until it becomes dominant in
the market. This is Creative Destruction.
8.3 The optimal level of R&D
The Schumpeterian model of economic growth focuses on vertical innovations.
Vertical innovations are, by definition, those that improve productivity along a given
product line, rendering previous technologies obsolete. A key feature of growth models
with vertical innovations is that entrepreneurs bringing new technologies drive the
previous incumbents out of the market, destroying their rents, until they are themselves
displaced by other entrepreneurs bringing newer and even more efficient technologies.
The analytical model in this chapter builds on the one already introduced in
Chapter 7 - described by equations (7.1) to (7.12) 166 . The main features are the
following: first, chance matters for innovation; second, inventions are achieved by
newcomers; third, successful innovations allow the innovator to displace the previous
incumbent (that is, all innovations are assumed to be drastic); fourth, future inventions
displace current inventions, so monopoly rents are temporary.
165
Diamond (1997).
166
The first successful model to describe the Schumpeterian argument is due to Aghion and Howitt
(1992). Earlier attempts to formalize R&D as a rent seeking activity include Nordhaus (1969) and Shell
(1973). Both authors faced, however, difficulties in dealing with increasing returns in a general
equilibrium framework.
For simplicity, we start out with the case in which there is only one intermediate
input (that is, m=1). Later (section 8.4) we’ll discuss the implications of extending the
analysis to multiple intermediate inputs.
A production function for knowledge
So far, we have abstracted from the question of how the arrival of a new
idea relates to the R&D effort. This is, however, a very important question: when firms
invest in R&D, they expect to achieve some innovation in the future, and the
relationship between resources employed and the expected output in terms of new ideas
is critical to find out the optimal level of research.
The problem is that the relationship between R&D effort and technological
change is not easy to model:
- First, knowledge is something that we don’t know how to measure: shall we count ideas?
Shall all ideas be counted as having the same value? If not, how should each particular idea be evaluated?
- Second, there is an element of risk: researchers may not succeed in inventing a new
technology.
- Third, the likelihood of an agent discovering a particular idea may depend on the success of
other agents discovering complementary ideas (we labelled this the “standing on
shoulders” effect).
- Fourth, the research effort at the individual level may prove useless if a competing
researcher independently discovers a similar idea (this problem is labelled “stepping on
shoes”).
- Finally, even if the key ingredients of a “knowledge production function” were well known,
a question would remain as to the choice of its functional form: shall output knowledge vary
linearly with the research effort, or shall the production function for new ideas exhibit
diminishing returns? That is, in order to sustain a given rate of technological progress (and
therefore a given rate of per capita output growth), will it be sufficient to have a constant
number of researchers or do we need instead an increasing number of researchers over time?
All these questions mean that the relationship between the production of ideas
and the resources allocated to such endeavours is much more difficult to formalize than
for other goods. And yet, the shape of such a production function is an essential
ingredient to determine the optimal level of R&D. Unsurprisingly, the choice of an
appropriate specification for the production function of knowledge became a matter of
dispute in the research arena. In Box 8.4, we’ll discuss alternative specifications that
have been proposed in the literature. In this section, we stick with the basic formulation of
the Schumpeterian model.
The Schumpeterian approach assumes that innovations arise randomly, with an
“arrival rate” that is proportional to the amount of working time devoted to R&D.
Formally, it is assumed that, when one unit of labour is devoted to the search for
technology $\lambda_1$, that technology will be discovered with probability b. For the economy
as a whole, when $\theta N$ units of labour are allocated to R&D (where $\theta$ is the proportion of
working time devoted to research and N is the size of the labour force), the probability of the
next vintage being discovered is $b\theta N$. Thus, the higher the number of researchers, the more
ideas will be produced. The parameter b is assumed exogenous and should be interpreted
as capturing the productivity of the research effort.
An arbitrage condition
To find out the optimal level of R&D, let’s recall our earlier assumption that
working time can be split into two basic functions only: final good production and
R&D (equation 7.4). The implication is that the opportunity cost of devoting one unit of
time to R&D is the wage rate that the worker forgoes by not engaging in final good
production.
With this ingredient, the model develops in an intuitive manner: labour is
diverted away from production with the aim of obtaining rents. Depending on how
expected rents compare with the wage rate, workers allocate their time to R&D or to
output production. At the margin, workers must be indifferent between devoting one
unit of time to output production or to research167. Formally, the following arbitrage
condition should hold:
$$w_0 = bV_1, \qquad (8.1)$$
where $w_0$ denotes the wage rate and $V_1$ denotes the “market value” of technology
$\lambda_1$. Both variables are evaluated prior to the innovation. Condition (8.1) states that the
expected gain of an individual researcher allocating one unit of time to research (the
probability b of an innovation times the value of the innovation, $V_1$) shall be equal to the
wage rate.
Note that the suffixes 0 and 1 do not refer to time, but instead to the moments
before and after the innovation: because of the stochastic nature of innovations, the
period of time between two successive innovations in this model has a random length.
The market value of a drastic innovation under creative destruction
To find out the value of the license to produce with technology $\lambda_1$, one shall take
into account not only the implied profits, but also the length of time during which these
profits materialize. In the Schumpeterian model, each new innovation is fated to become
obsolete at a given point in the future, when a superior technology (say $\lambda_2$) is
discovered by a competitor.
In the following, let’s assume that the license to produce with technology $\lambda_1$ can
be sold in an auction to whoever makes the highest bid. The question is how much a
potential bidder will be willing to pay for that license. To make the story interesting, let’s
assume that investors also have the possibility of investing in a capital good, earning the
interest rate r168.
In deciding whether to buy the license or not, investors shall compare two
options:
- One, they can buy the license to produce with technology $\lambda_1$ at the price $V_1$,
earning the corresponding ex-post monopoly profits $\pi_1$ per unit of time, but
facing the threat of the next vintage ($\lambda_2$) being discovered, which will
happen with probability $b\theta N$169;
- Two, they invest the amount $V_1$ in capital, earning an income equal to $rV_1$
per unit of time.
167 Corner solutions are ignored, for simplicity.
168 The interest rate could be made endogenous, extending (7.1) so as to account for the role of physical
capital. For such an extension, you are invited to read Aghion and Howitt (1998), chapter 3.
From an investor’s point of view, the optimal allocation of money shall obey
an arbitrage condition stating that, at the margin, the reward of holding the license must
be equal to its opportunity cost:
$$\pi_1 - b\theta N V_1 = rV_1 \qquad (8.2)$$
Condition (8.2) states that the interest income generated by the value of the
license per unit of time, $rV_1$, shall be equal to the ex-post monopoly profit minus the
expected loss resulting from the arrival of $\lambda_2$. The latter is equal to the value of the
license ($V_1$) times the probability of the next vintage being discovered, $b\theta N$.
Rearranging (8.2) one obtains:
$$V_1 = \frac{\pi_1}{r + b\theta N} \qquad (8.3)$$
The denominator of (8.3) can be interpreted as the “obsolescence-adjusted
interest rate” and captures the effect of creative destruction: if there was no threat of
competing innovations ($\theta = 0$), the value of the license would be given by the perpetuity
formula, $V_1 = \pi_1 / r$. When, however, $\theta$ is positive, the expected duration of the monopoly
rent is finite and this lowers the expected discounted value of the monopoly rents.
In (8.3), the value of innovation $\lambda_1$ declines with b and $\theta$. Thus, current research
is discouraged by the intensity and the productivity of future research. This captures the
negative externality of new innovations on incumbents.
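As an illustration of (8.3), the sketch below computes the value of the license and checks it against a simulation. It relies on one added assumption that the text does not spell out: the next innovation is treated as a Poisson event with arrival rate $b\theta N$, so that the monopoly lasts for an exponentially distributed length of time. The function names and the parameter values are hypothetical and serve only as an example.

```python
import math
import random


def license_value(profit, r, b, theta, N):
    """Closed-form value of the license, equation (8.3)."""
    return profit / (r + b * theta * N)


def simulated_license_value(profit, r, b, theta, N, draws=100_000, seed=0):
    """Average discounted profit over randomly drawn monopoly durations.

    Assumption (not in the text): the next vintage arrives as a Poisson
    event with rate b*theta*N, so the monopoly duration is exponential.
    """
    rng = random.Random(seed)
    arrival_rate = b * theta * N
    total = 0.0
    for _ in range(draws):
        duration = rng.expovariate(arrival_rate)
        # Present value of a constant profit flow received over [0, duration].
        total += profit * (1.0 - math.exp(-r * duration)) / r
    return total / draws


# Illustrative (hypothetical) parameter values.
with_threat = license_value(profit=1.0, r=0.05, b=0.01, theta=0.1, N=100)
no_threat = license_value(profit=1.0, r=0.05, b=0.01, theta=0.0, N=100)
print(with_threat, no_threat)
```

With no research ($\theta = 0$) the function returns the perpetuity value $\pi_1 / r$; any positive research intensity lowers the value of the license, which is the creative destruction effect described above.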
The ex post monopoly profits
The expression for monopoly profits was already obtained in (7.12):
$$\pi_j = \beta\left[B(1-\beta)^{2-\beta}\right]^{\frac{1}{\beta}}\left(\frac{\lambda_j}{w}\right)^{\frac{1-\beta}{\beta}}$$
According to this expression, profits increase with productivity and decrease
with the wage rate. Thus, a vertical innovation, leading to a productivity increase,
impacts positively on profits.
Note however that technological progress in general exerts a negative effect on
profits, through its influence on wages. The reason is that productivity improvements
raise the demand for labour, causing wages to increase. This indirect effect of
technological change has to be taken into account when assessing the ex post monopoly
profits170.
169 For the sake of simplicity, it is assumed that R&D intensity, $\theta$, is constant over time. In the
Schumpeterian model, this will happen in the steady state, as long as technological improvements are
proportional (that is, $\lambda_1/\lambda_0 = \lambda_2/\lambda_1 = \lambda_3/\lambda_2$, etc.). The model above
implicitly assumes this, and it can only be used to compare steady states.
With m=1, equations (7.2) and (7.5) imply:
$$x = N_Y. \qquad (8.4)$$
Substituting this in (7.1) and using (7.11) one obtains a simple expression for the
(aggregate) labour demand in the intermediate input sector:
$$w = (1-\beta)^2 \frac{Y}{N_Y} \qquad (8.5)$$
This equation differs from (2.11) in that the term $(1-\beta)^2$ instead of $(1-\beta)$
appears in the numerator. This is an implication of having imperfect competition in the
market for the intermediate product171.
Substituting (8.5) in (7.12), the expression for profits becomes
$$\pi = \beta(1-\beta)Y. \qquad (8.6)$$
Now, we make use of equation (7.6) of the earlier chapter, to obtain the
following relationship between production prior to innovation and after the innovation:
$Y_1 = (\lambda_1/\lambda_0)^{1-\beta} Y_0$, holding constant the supply of labour. Given this, the ex post
monopoly profits (8.6) become:
$$\pi_1 = \left(\frac{\lambda_1}{\lambda_0}\right)^{1-\beta}\beta(1-\beta)Y_0. \qquad (8.7)$$
The equilibrium level of research
The optimal allocation of labour to R&D is determined by equation (8.1).
Substituting the left hand side by (8.5), using (8.3) and (8.7) in the right hand side and
solving for the equilibrium level of R&D, one obtains:
$$\theta = 1 - \frac{1 + \dfrac{r}{bN}}{1 + \left(\dfrac{\lambda_1}{\lambda_0}\right)^{1-\beta}\dfrac{\beta}{1-\beta}} \qquad (8.8)$$
Equation (8.8) states that the optimal proportion of time devoted to R&D is
higher, the lower the interest rate, the larger the size of the labour force N, the higher the
productivity of R&D, b, and the bigger the technological jump, $\lambda_1/\lambda_0$.
It is also apparent that $\theta$ is an increasing function of $\beta$: the lower the elasticity
of the demand curve faced by the intermediate monopolist, the larger the monopoly
rents that will be appropriated by successful innovators and hence the larger the
incentives to innovate. This accords with the Schumpeterian view that market power is good
for innovation.
170 Although we are assuming one sector only, one wants the model to be meaningful for the case with
many intermediate inputs, where each intermediate input is small relative to the economy. Thus, while it
is reasonable to assume that the innovator takes the wage rate as given (eq. 7.7), the general equilibrium
of the model implies that technological change impacts on wages through its influence on the demand for
labour.
171 In other words, monopoly profits come at the cost of lower wages. For each employment level,
monopoly profits are equal to the difference between the wage rate that would prevail under perfect
competition and the wage rate under monopoly, multiplied by the employment level. That is:
$\pi = \left[(1-\beta)Y/N_Y - (1-\beta)^2 Y/N_Y\right]N_Y = \beta(1-\beta)Y$ (cf. equation 7.6).
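To make the equilibrium concrete, the following sketch evaluates (8.8) and verifies that, at the resulting $\theta$, the two sides of the arbitrage condition (8.1) coincide once both are divided by per capita income. The function names and the parameter values are hypothetical. Corner solutions are ignored, as in the text, so the formula is only meaningful when it returns a value between zero and one.

```python
def research_share(r, b, N, jump, beta):
    """Equilibrium share of time devoted to R&D, equation (8.8).

    jump is the technological jump lambda_1 / lambda_0.
    """
    gain = jump ** (1.0 - beta) * beta / (1.0 - beta)
    return 1.0 - (1.0 + r / (b * N)) / (1.0 + gain)


def scaled_wage(theta, beta):
    """Wage rate (8.5) divided by per capita income."""
    return (1.0 - beta) ** 2 / (1.0 - theta)


def scaled_research_value(theta, r, b, N, jump, beta):
    """Expected return b*V1, from (8.3) and (8.7), divided by per capita income."""
    return b * N * jump ** (1.0 - beta) * beta * (1.0 - beta) / (r + b * theta * N)


# Illustrative (hypothetical) parameter values.
params = dict(r=0.05, b=0.01, N=100, jump=1.2, beta=0.5)
theta = research_share(**params)
gap = scaled_wage(theta, params["beta"]) - scaled_research_value(theta, **params)
print(theta, gap)
```

Re-running `research_share` with a lower `r`, or with a higher `b`, `N` or `jump`, is a quick way to confirm the comparative statics stated after (8.8).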
Graphical illustration
To illustrate the trade-offs involved in the choice between allocating time to
production and to R&D, let’s refer to Figure 8.2. The curves in the figure do not exactly
correspond to the two sides of equation (8.1), but rather to the two sides of equation
(8.1) divided by per capita income. This small modification is rather convenient, as it
allows the horizontal axis to be expressed in terms of $\theta$. That is, in Figure 8.2, from left
to right we measure the proportion of time devoted to output production, $1-\theta$; from
right to left we measure the proportion of time devoted to R&D, $\theta$. The length of the
horizontal axis is equal to one.
The downward sloping curve (from left to right) describes the demand for
labour and is equal to equation (8.5) divided by per capita output:
$$\frac{w}{y} = \frac{(1-\beta)^2}{1-\theta} \qquad (8.9)$$
This equation states that, as the proportion of time devoted to production, $1-\theta$,
increases, diminishing returns translate into lower wages per unit of per capita output.
The upward sloping curve (actually, downward sloping, from right to left)
describes the demand for research labour, as implied by the right-hand side of (8.1).
This is obtained using (8.3) and (8.7) and dividing by per capita income:
$$\frac{bV_1}{y} = \frac{bN\left(\lambda_1/\lambda_0\right)^{1-\beta}\beta(1-\beta)}{r + b\theta N} \qquad (8.9a)$$
This is a negative function of $\theta$ because of creative destruction: the greater the
fraction of labour devoted to R&D, the more likely is the arrival of a competing
innovation. Thus, a larger proportion of workers in the society devoted to R&D reduces
the incentives to engage in R&D.
Solving together equations (8.9) and (8.9a), one obtains the equilibrium level of
research and development (8.8).
To see how the model works, assume that initially the allocation of labour was
as described by points A and B: in that case, the marginal benefit of working time (B)
would be lower than the marginal benefit of R&D (point A). Since in that allocation
workers had an incentive to devote more time to R&D than they actually do, such
allocation cannot be an equilibrium.
The equilibrium level of R&D (as described by equation 8.8) occurs at point E
in the figure. In this allocation, wages are higher than in A because there are fewer
workers in production (this reflects diminishing returns) and the expected benefit of
research time is lower because there are more researchers in the economy (this reflects
the negative effect of creative destruction). In E, the arbitrage condition (8.1) holds.
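Figure 8.2. The allocation of labour in a laissez faire Schumpeterian economy
[Figure: the horizontal axis has length one, with work effort ($1-\theta$) measured from left to right and
R&D effort ($\theta$) from right to left. Left vertical axis: marginal benefit of working time (scaled),
$w/y$. Right vertical axis: expected value of R&D (scaled), $bV_1/y$. The two curves cross at the
equilibrium point E; points A and B mark an initial allocation out of equilibrium.]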
What are the implications of a higher productivity in R&D?
Consider the impact of an increase in the productivity of R&D, as captured by
parameter b. Such a change has no effect on the curve describing the demand for labour
by the productive sector (equation 8.9).
It has however two effects on the curve describing the marginal benefit of R&D
(equation 8.9a): on one hand, it improves the probability of innovation and hence the
incentives to innovate (numerator); on the other hand, it increases the likelihood of
creative destruction (denominator), reducing the marginal benefit of research. It is easy
to check that the former effect turns out to dominate, so when b increases, the curve
shifts up and left (Figure 8.3).
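The claim that the first effect dominates can be checked by differentiating the right-hand side of (8.9a) with respect to b, holding $\theta$ fixed. In the sketch below, $\Gamma$ is a shorthand introduced here (it is not part of the original notation) for the terms that do not depend on b, N or $\theta$:

```latex
\Gamma \equiv \left(\frac{\lambda_1}{\lambda_0}\right)^{1-\beta}\beta(1-\beta),
\qquad
\frac{\partial}{\partial b}\left[\frac{bN\Gamma}{r+b\theta N}\right]
= \frac{N\Gamma\,(r+b\theta N) - bN\Gamma\,\theta N}{(r+b\theta N)^{2}}
= \frac{rN\Gamma}{(r+b\theta N)^{2}} > 0 .
```

The derivative is positive for any positive interest rate, so the curve shifts up at every value of $\theta$. The same computation with respect to N gives $rb\Gamma/(r+b\theta N)^2 > 0$.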
Thus, a higher effectiveness of R&D translates in this model into a higher
research effort and hence into a faster pace of technological progress.
What are the implications of a larger labour force?
Figure 8.3 can also be used to analyse the implication of having a larger labour
force.
Since the curve describing the demand for labour by the productive sector does
not depend on N, it remains unchanged. The curve describing the demand for research
labour, in turn, is hit by two effects (equation 8.9a): on one hand, an increase in
population increases the size of the market and hence the monopoly profits
(numerator); on the other hand, a larger population also implies a higher number of
researchers and therefore a higher probability of the monopoly rents being eroded
through creative destruction (denominator). As before, the first effect dominates, so the
curve shifts up and left.
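Figure 8.3. The effect of an increase in the effectiveness of R&D
[Figure: same axes as Figure 8.2, with work effort ($1-\theta$) measured from left to right and R&D effort
($\theta$) from right to left. The curve for the expected value of R&D (scaled), $bV_1/y$, shifts up, moving
the equilibrium along the $w/y$ curve from E to E’.]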
This means that a larger population, by raising the size of the market for a
successful entrepreneur, increases the incentives to R&D, leading to higher research
intensity.
A corollary is that a large economy should grow faster than a small economy. In
other words, this model is plagued with the same type of scale effect that is common to
many other models of endogenous growth.
8.4 Multiple sector considerations
Extending the model to m intermediate inputs
The model above was solved assuming one intermediate input, only (m=1). This
simplification hides some interesting aspects. In this section we discuss the
consequences of having more than one intermediate sector.
So, consider the model described by (7.1)-(7.12), with m>1. Each variety is
supposed to have its own research sector, with firms competing to invent the next
generation of the corresponding technology. A successful entrepreneur in sector j will
displace the current incumbent and will become the incumbent of sector j until being
displaced itself. As before, it is assumed that innovations in each variety imply
productivity improvements that are proportional to each other.
Because innovations arise randomly, productivity improvements are not
synchronized across sectors. Thus, in contrast to what was assumed in (7.6), the level of
technology will in general be different across sectors. With a large number of sectors,
the implication of asynchronous technological progress is that aggregate productivity
and wages will evolve in a much smoother way than in the case with one sector only.
Conditional on technology and wages, the price level, production and profits in
each intermediate sector j are given by (7.9), (7.10) and (7.12), respectively. With more
than one sector, profit opportunities in each sector depend on developments in the other
sectors: it is the combined effect of all technological improvements that
determines the wage rate and hence expected profits and the incentives to
innovate.
To see this, let’s first compute the aggregate demand for labour. Substituting
(7.11) in (7.5) and solving for the wage rate, one obtains:
$$w = (1-\beta)^2 B\left(\frac{m}{N_Y}\right)^{\beta}\bar{\lambda}^{\,1-\beta}, \qquad (8.10)$$
where $\bar{\lambda} = \left(\frac{1}{m}\sum_j \lambda_j^{\frac{1-\beta}{\beta}}\right)^{\frac{\beta}{1-\beta}}$ is the average technological level in the economy.
Equation (8.10) states that vertical innovations ($\bar{\lambda}$) impact positively on wages.
These are defined in average terms, so as to account for asynchronous technological
progress across sectors. Because we now have various intermediate sectors, equation
(8.10) also accounts for the impact of horizontal innovations (m) on wages.
Substituting (8.10) in (7.11), one obtains the demand for labour in each variety
as a function of aggregate productivity:
$$N_j = \left(\frac{\lambda_j}{\bar{\lambda}}\right)^{\frac{1-\beta}{\beta}}\frac{N_Y}{m} \qquad (8.11)$$
According to this equation, the share of sector j in aggregate employment,
$N_j/N_Y$, is higher or lower than 1/m depending on how sector j’s productivity ($\lambda_j$)
compares to the economy average, $\bar{\lambda}$. In particular, the share of sector j in manufacturing
employment is higher than the average 1/m if its own technological level is higher than
the average.
Equation (8.11) also reveals that, with a fixed labour supply, as more and more
varieties are introduced in the economy (horizontal innovations), the employment level
in each sector declines. This is nothing more than the division of labour effect.
Substituting in (7.2) and then in (7.1) one obtains, after some manipulation, the
output level in the economy:
$$Y = Bm^{\beta}\left(\bar{\lambda}N_Y\right)^{1-\beta} \qquad (8.12)$$
This equation replicates (7.6), with one difference only: we are not imposing that
productivity be the same across sectors. Using (8.10) and (8.12) you’ll find an
aggregate demand for manufacturing labour exactly equal to (8.5).
Finally, substituting (8.11) in (7.12) and using (8.12), one obtains an expression
for the monopoly profits in each sector:
$$\pi_j = \beta(1-\beta)\frac{Y}{m}\left(\frac{\lambda_j}{\bar{\lambda}}\right)^{\frac{1-\beta}{\beta}} \qquad (8.13)$$
Comparing to (8.6), we see that the monopoly profits in each individual sector
depend positively on that sector’s productivity ($\lambda_j$) relative to the economy’s average ($\bar{\lambda}$).
Equation (8.13) also shows that horizontal innovations, by expanding the
number of varieties and reducing the demand for each variety, impact negatively on
individual profits.
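The multi-sector expressions (8.10) to (8.13) can be evaluated together in a few lines. The sketch below is only illustrative: the function name, the variable `lam_bar` (standing for the average technological level) and the numerical values are hypothetical. It compares an economy with three identical sectors with the same economy after one sector has innovated.

```python
def sector_allocation(lams, B, beta, N_Y):
    """Wage, output, and sectoral labour and profits from (8.10)-(8.13).

    lams holds the technology level of each of the m sectors.
    """
    m = len(lams)
    e = (1.0 - beta) / beta
    # Average technological level (a generalised mean of the sectoral levels).
    lam_bar = (sum(lam ** e for lam in lams) / m) ** (1.0 / e)
    wage = (1.0 - beta) ** 2 * B * (m / N_Y) ** beta * lam_bar ** (1.0 - beta)  # (8.10)
    labour = [(lam / lam_bar) ** e * N_Y / m for lam in lams]                   # (8.11)
    output = B * m ** beta * (lam_bar * N_Y) ** (1.0 - beta)                    # (8.12)
    profits = [beta * (1.0 - beta) * (output / m) * (lam / lam_bar) ** e
               for lam in lams]                                                 # (8.13)
    return wage, output, labour, profits


# Three sectors; sector 0 then innovates (hypothetical numbers).
before = sector_allocation([1.0, 1.0, 1.0], B=1.0, beta=0.5, N_Y=90.0)
after = sector_allocation([1.2, 1.0, 1.0], B=1.0, beta=0.5, N_Y=90.0)
print(before[2], after[2])
```

In this example sectoral employment always adds up to the labour devoted to production, and total profits equal $\beta(1-\beta)Y$, as in (8.6). After the innovation in sector 0, the wage rises, while employment and profits fall in the two sectors that did not innovate.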
The crowding out effect
Equations (8.11) and (8.13) reveal a form of creative destruction that was not
accounted for in the model with one sector: asynchrony in innovation implies a
continual reallocation of labour and profits between sectors. In particular, employment
and profits will increase in innovating sectors and will decline in non-innovating
sectors.
This “crowding out” effect is another negative externality from
innovators to incumbents that reduces the incentives to innovate: monopoly profits in
each variety not only erode with the arrival of a superior variety in the same product
line (the “creative destruction effect” in equation 8.2), they also erode through the rising
wages implied by technological improvement in other product lines.
An implication is that an increase in b not only shortens the duration of the ex-post
monopoly rents along the corresponding variety, it also acts to reduce profits in
non-innovating sectors over time, through successive increases in the wage rate172.
Note that the crowding out effect applies both to vertical and horizontal innovations: an
increase in the number of varieties, m, impacts positively on aggregate output and
productivity, raising real wages and depressing profits (equations 8.10 and 8.13).
172 Still, this crowding out effect is never large enough to invert the relationship between b and $\theta$,
which remains positive. For a formal explanation, see Aghion and Howitt (1998), pp. 87-92.
Removing the scale effect
In equation (8.8), we saw that the size of the market impacts positively on the
incentives to innovate. Thus, the larger the population, the higher will be the optimal
proportion of time devoted to R&D and hence the faster will be the rate of
technological progress and of economic growth. Thus, the model displays a scale effect.
It is important to note that this property of the model does not change in the
multiple sector case: according to equation (8.13), monopoly profits in each variety are
still a positive function of aggregate output, Y. However, and this is the key issue, in
the model with many varieties, profits decline with the number of varieties. The reason
is that what determines the size of the market for a typical sector is not aggregate output
(Y), but rather its market share (Y/m). Thus, when the number of varieties increases, the
market share of a typical variety declines and so do profits.
This property of the model suggests a natural avenue to get rid of the scale
effect: if the size of the market and the number of varieties were set to evolve in the
same proportion, the size of the market available to each variety would remain constant
and so would the incentives to innovate.
In fact, this is precisely the avenue explored by some Schumpeterian models of
economic growth to get rid of the scale effect. In brief, these models account for two
types of technological progress: increases in the total number of varieties, m (horizontal
innovations) and productivity gains along a given product line (vertical innovations). In
these models, R&D efforts are basically aimed at vertical innovations, while the number
of varieties increases proportionally to the size of the workforce173. If the number of
varieties increases in direct proportion to the size of the market, the size of the market
for each variety remains constant and so do profits and the incentives to innovate.
Hence, the research intensity in each industry does not change when the population
expands. Moreover, for each given research intensity, a rising population does not
translate into more researchers in each sector: if the number of varieties and the size of
the labour force are proportional, the average firm size (and the number of researchers
per variety) remains unchanged. Thus, technological progress in each variety will be
unaffected by the population size. All in all, the proliferation of product varieties dilutes
the effect of population expansion, both on research intensity and on the number of
workers per variety, removing the “strong” scale effect174.
173 Note that a proportional relationship between the size of the market and the number of varieties is a
conventional property of models with monopolistic competition. This direction was first explored by
Young (1998). Other authors include Dinopoulos and Thompson (1998), Peretto (1998), Aghion and
Howitt (1998, 2005), Peretto and Smulders (2002). For a survey, see Jones (1999, 2005).
174 Still, a “weak scale effect” arises in this class of models, because aggregate productivity depends
positively on the number of varieties (horizontal innovations). That is, while productivity growth through
vertical innovations becomes independent of the size of the labour force, the proliferation of varieties
leads to a division of labour effect, through which the level of per capita income increases with the size of
population.
Box 8.4. The “Fishing out” theory
A main difficulty in models with endogenous technological change is that they
give rise to a counterfactual scale effect, according to which the growth rate of per
capita income becomes an increasing function of the size of the economy’s workforce.
To see this, let’s first consider the basic formulation of the knowledge
production function introduced in this chapter, but instead of assuming that innovations
arrive stochastically, assume that technological improvements follow a deterministic
rule in continuous time175:
$$\dot{\lambda}_t = b\theta N_t\lambda_t. \qquad (8.14)$$
As before, the exogenous parameter b measures the productivity of research
labour in the production of knowledge.
The critical assumption in (8.14) is that the creation of new knowledge is
proportional to the existing stock of knowledge in the economy ($\lambda$)176. As for the
rationale, you may interpret this as capturing the cumulative nature of knowledge or the
“standing on shoulders effect” (Box 7.5): if new ideas build on old ideas, a larger stock
of current knowledge is likely to increase the productivity of researchers seeking
new ideas. The “standing on shoulders” effect is a positive externality, whereby each
individual researcher contributes to the common pool of knowledge and hence to
other researchers’ productivity.
Dividing (8.14) by $\lambda_t$, one obtains the (endogenous) rate of technological
progress:
$$\frac{\dot{\lambda}_t}{\lambda_t} = b\theta N_t \qquad (8.15)$$
This model implies a linear relationship between the R&D effort and the rate of
productivity growth: when the proportion of workers engaged in R&D increases, the
growth rate of per capita income also increases. This model is a cousin of the AK model in
that a policy change (e.g., a subsidy to R&D) affects long-term growth. It is a model of
endogenous growth.
It is important to stress the critical role of the “standing on shoulders” effect in
this model. Analytically, the assumption that the knowledge production function is a
linear differential equation in $\lambda$ is what we need to ensure that the productivity of
researchers grows over time, even when the number of researchers remains constant. If,
alternatively, the knowledge production function did not depend on $\lambda$ (that is, if
$\dot{\lambda} = b\theta N$), then with constant parameters b and $\theta$ and with a constant population, the
flow of new ideas ($\dot{\lambda}$) would be constant over time. Therefore, the growth rate of per
capita income ($\gamma_t = \dot{\lambda}/\lambda$) would fall down to zero as the stock of knowledge $\lambda$
increased. So the model would be incapable of generating sustained growth.
175 In the model analysed in this chapter, subscripts 0, 1, 2 in the variables do not refer to real time, but
rather to a sequence of innovations. Thus, the time interval between each two innovations is random. In
the formulation above, the subscript t refers to time and proportional improvements in technology take a
constant time interval. With large numbers, the two specifications are basically equivalent.
176 Models consistent with (8.14) (e.g., with inventions generating proportional improvements in
productivity) include Romer (1990), Grossman and Helpman (1991) and Aghion and Howitt (1992).
The other side of the coin of assuming linearity is the scale effect: as long as
population is constant, (8.15) implies that the growth rate of per capita income will be
constant; but with exponential population growth, the growth rate of per capita income
will itself grow exponentially.
As explained in the main text, the Schumpeterian model gets rid of the scale
effect by linking the number of varieties, m, to the size of population: this allows the
number of researchers per variety to remain constant and the same will happen to the
rate of technological progress along each variety. This neutralizes the scale effect. Also
note that in this model the removal of the scale effect does not change the “endogenous
growth” nature of the model: because the knowledge production function is linear in
knowledge, a higher R&D intensity translates into faster production of ideas and faster
growth. This property strongly contrasts with the competing attempt to remove the scale
effect, by Charles Jones.
Jones177 abandoned the assumption of linearity in the knowledge production
function, arguing that new discoveries are increasingly difficult to find. That is, as the
stock of accumulated knowledge increases, researchers will find it more difficult to
invent new technologies, because the easiest ideas have already been discovered. On
the other hand, as technology becomes more complex, it takes more time and effort for
a researcher to learn everything he or she needs just to catch up with the cutting edge.
As an illustration, note that many breakthrough inventions in the 18th and 19th
centuries were achieved by hobbyists or by single individuals. Thomas Edison, for
instance, invented the light bulb, the phonograph and the motion picture on his own. Today,
advances in technology are mostly achieved by scientists engaged in research teams and
focusing on very narrow problems. The assumption that new discoveries become
increasingly more difficult became known as the “Fishing out effect”178.
Formally, the “Fishing out effect” is modelled by allowing past discoveries to
impact on the productivity of current researchers with declining marginal returns:
$$\dot{\lambda}_t = b\theta N_t\lambda_t^{\phi}, \quad \text{with } 0<\phi<1. \qquad (8.16)$$
In this model, the sign and magnitude of parameter $\phi$ capture the net effect of
two opposing externalities on productivity growth: the “standing on shoulders” effect
(positive), whereby the productivity of current research increases with the accumulated
knowledge in the society; and the “fishing out effect” (negative), whereby past
discoveries make new ideas more difficult to find. Jones conjectured that the net effect of these
two externalities may lead to $0<\phi<1$: that is, new researchers benefit from previous
ideas, but there are diminishing returns to knowledge in knowledge production.
Another novelty in the Jones formulation is a negative externality from
researchers to other researchers, due to overlapping research: that is, the waste resulting
from independent researchers trying to achieve the same piece of knowledge. This
externality (labelled “stepping on shoes”) is accounted for by raising research labour in
(8.16) to a power smaller than one (that is, doubling the number of researchers less than
doubles the production of new ideas). This assumption is not, however, necessary to remove
the scale effect.
177 Jones (1995).
178 The label “fishing out” arises from the classical example of the fishing pond for the Tragedy of the
Commons: if the pond is stocked with a fixed number of fish, then it becomes increasingly difficult to
catch each new fish. Followers of this approach include Kortum (1997) and Segerstrom (1998).
When the knowledge production function takes the form (8.16), the rate of
technological progress becomes:
$$\frac{\dot{\lambda}_t}{\lambda_t} = b\theta N_t\lambda_t^{\phi-1}. \qquad (8.17)$$
With $0<\phi<1$, (8.17) implies a negative relationship between the growth rate of $\lambda$
and the level of $\lambda$. This means that the model converges to a steady state: like in the
Solow model, there is a balanced growth path, in which the growth rate of technology is
equal to the growth rate of per capita income, $\gamma = \dot{\lambda}/\lambda$.
The steady state growth rate may be obtained by log-differentiating both sides of
(8.17), which gives $\hat{\gamma} = n - (1-\phi)\gamma$, where n is the population growth rate, and
imposing $\hat{\gamma} = 0$. This gives:
$$\gamma = \frac{n}{1-\phi}. \qquad (8.18)$$
In (8.18), the growth rate of per capita income is not a direct function of the
population size, so the strong scale effect is removed. Still, a weak scale effect shows
up, as the growth rate of per capita income is a direct function of the population growth
rate179.
Another important distinction between (8.15) and (8.18) is that, in the latter
formulation, the steady state growth rate of per capita income is invariant to the
fraction of the population engaged in R&D, μ. Hence, changes in government policy
leading to changes in research intensity have no long run effects on economic growth.
Because of this, the model is categorised as one of exogenous growth. Still, in this model,
changes in research intensity μ do affect the long run level of per capita income (in
other words, there is a transitory effect on growth)180. The policy implication is that a
subsidy to the research activity would affect the level of income but not its long term
growth rate.
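The level-versus-growth result can be checked with a small simulation. The sketch below assumes a knowledge production function of the form dλ/dt = b(μN)^θ λ^φ, with population growing at rate n; the exponent names θ and φ, the function name and all parameter values are hypothetical choices made for illustration only. Under these assumptions, the growth rate of technology should approach θn/(1−φ) whatever the research intensity μ, while a higher μ should leave the economy on a higher technology path.

```python
import math


def simulate_growth(mu, b=0.05, theta=0.5, phi=0.5, n=0.02, steps=10_000, dt=0.1):
    """Euler simulation of d(lambda)/dt = b * (mu * N)**theta * lambda**phi.

    All parameter values are hypothetical. Population N grows at rate n.
    Returns the final growth rate of lambda and the final log of lambda.
    """
    log_pop = 0.0   # N_0 = 1
    log_tech = 0.0  # lambda_0 = 1
    gamma = 0.0
    for _ in range(steps):
        # Growth rate of technology implied by the assumed production function
        gamma = b * mu**theta * math.exp(theta * log_pop + (phi - 1.0) * log_tech)
        log_tech += gamma * dt
        log_pop += n * dt
    return gamma, log_tech


if __name__ == "__main__":
    target = 0.5 * 0.02 / (1.0 - 0.5)  # theta * n / (1 - phi)
    for mu in (0.1, 0.2):
        gamma, log_tech = simulate_growth(mu)
        print(f"mu={mu}: growth={gamma:.5f} (target {target:.5f}), log(lambda)={log_tech:.2f}")
```

Running the script for two values of μ gives a way to compare the long-run growth rates and the technology levels reached in each case.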
Jones (1995, 2005) contended that this prediction of the model fits well with the
cross-country empirical evidence. The author observed that countries with high R&D
intensity do not grow systematically faster than other countries, but they do exhibit
higher levels of per capita income (similar evidence is reported in Klenow and
Rodriguez-Clare, 2005).
179
Note that in this model population growth is necessary to obtain sustained growth of output per
worker: the assumption that new ideas become increasingly difficult to discover implies that the growth
rate of λ falls to zero over time when the population is constant (in other words, once linearity is
removed from the knowledge production function, a constant research effort is no longer sufficient
to sustain the continuing proportional increase in the stock of knowledge that is necessary to sustain long
run growth). Thus, only with an ever-increasing research community will it be possible to maintain a
constant rate of technological progress.
180
When the fraction of population devoted to knowledge accumulation (μ) increases, there is an initial
fall in output (labour is diverted away from production) but then the rate of technological progress
accelerates (8.16). However, such acceleration is only temporary. Because of diminishing returns in
knowledge production, the rate of technological progress falls back until reaching its previous (long-run)
level, (8.18).
8.5. Competition and innovation
The discussion in this and in the previous chapter stressed the idea that
innovations impact on the market structure: by introducing a new product or a cheaper
way of producing an existing product, the innovating firm acquires market power.
A different question is whether the existing market structure affects the
incentives to innovate. This section addresses precisely the question as to whether more
product market competition is good or bad for innovation.
The replacement effect
The conventional wisdom is that established monopolies, because they are already
earning profits, have less incentive to innovate than newcomers. Kenneth Arrow labelled
this idea the replacement effect181.
To see this, let’s refer again to Figure 8.1. In that figure, the equilibrium
prior to innovation is described by M0, with the incumbent monopolist having
profits equal to area (a). The innovation described in Figure 8.1 corresponds to an
increase in productivity from $\lambda_0$ to $\lambda_1 > \lambda_0$.
Consider first the case in which the innovation is achieved by a newcomer: since
the innovator moves from a situation with no profits to a full monopoly, its net gain will
be area (b) minus whatever fixed costs are associated with the innovation. If instead the
same innovation was achieved by the incumbent, its net gain would be (b)-(a) minus the
fixed cost.
Comparing the two cases, we see that the monopolist benefits less from the
innovation than the newcomer182. The reason is that the latter jumps from a situation of
zero profits to one with full monopoly profits, while the monopolist was already earning
a monopoly profit prior to innovation. When the monopolist innovates, it replaces old
profits with new (larger) profits.
The replacement effect thus suggests that leaders have less incentive to innovate
than outsiders.
There are, however, some caveats in the analysis above:
- First, the analysis presumes that without innovation the incumbent
preserves its profits, (a): however, if the incumbent does not innovate, an
outsider will most probably replace it. If one assumes that (a) is lost
anyway, then the incentives for the incumbent to innovate and prevent
entry will exactly match those of an outsider with equal R&D costs.
- Second, the analysis abstracts from the possibility of the incumbent
having lower costs in achieving the innovation: if learning by doing or
any other information advantage translated into lower R&D costs for the
181
Arrow (1962).
182
A different question is whether there will be a difference for the society as a whole. As you may easily
check, the social gain of the process innovation does not depend on who is the new monopolist.
incumbent, the latter could end up with a greater incentive to innovate than
the outsider183.
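The comparison of incentives above, together with the first caveat, can be sketched with a few lines of arithmetic. This is only an illustration: the profit areas and the fixed cost are hypothetical numbers, the function name is invented for this example, and the flag `incumbent_keeps_old_profit` is an assumed way of encoding whether area (a) survives when the incumbent does not innovate.

```python
def innovation_gains(profit_old, profit_new, fixed_cost, incumbent_keeps_old_profit=True):
    """Net gain from the same innovation for a newcomer and for the incumbent.

    profit_old: incumbent's profit before the innovation (area (a)), hypothetical
    profit_new: monopoly profit after the innovation (area (b)), hypothetical
    fixed_cost: R&D cost, assumed equal for both firms
    """
    # The newcomer starts from zero profits
    newcomer_gain = profit_new - fixed_cost
    # The incumbent only gains the increase over what it would earn anyway
    fallback = profit_old if incumbent_keeps_old_profit else 0.0
    incumbent_gain = profit_new - fallback - fixed_cost
    return newcomer_gain, incumbent_gain


if __name__ == "__main__":
    print(innovation_gains(4.0, 10.0, 3.0))          # replacement effect
    print(innovation_gains(4.0, 10.0, 3.0, False))   # old profit lost anyway
```

With the old profit preserved, the incumbent's gain is smaller than the newcomer's by exactly the old profit; when the old profit would be lost anyway, the two gains coincide.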
In the real world, innovations are often carried out by industry leaders, which
remain leaders for long periods of time.
Neck and neck competition
So far, we have been assuming that outsiders always undercut incumbents. The
implication is that, at each moment in time, there is only one incumbent in each sector.
Alternatively, one may assume that followers have first to catch up with the leader before
becoming monopolists themselves. With such a modification, the model will account for
the possibility that firms in a sector have equal technologies, competing
neck-and-neck at the frontier184.
The implication of neck-and-neck competition is that it provides an incentive for
incumbents to innovate: since their profits are constrained by the existence of other
competitors with the same level of technology, the larger the number of firms
competing neck-and-neck, the larger the incentive for an incumbent to innovate, in
order to acquire a technological advantage and escape competition, becoming the leader.
To illustrate this, we refer to Figure 8.4. Suppose that all innovations are non-drastic
and that technological spillovers are such that no firm can get more than one
technological step ahead of its competitors: that is, if the technological leader innovates
(say to $\lambda_1$), a competitive fringe automatically learns to copy the leader’s previous
technology ($\lambda_0$). It is however possible for a laggard firm to escape the fringe and catch
up with the leader.
Hence, at any point in time, there will be only two possible market structures in
the industry: “neck-and-neck”, in which more than one firm competes using the frontier
technology; and “unlevel”, in which only one firm holds the frontier technology and
supplies the entire market.
In the “unlevel” case, the leader’s marginal cost is equal to $w/\lambda_1$ and the
competitive fringe’s marginal cost equals $w/\lambda_0$. In this case, the leader sets the price just
marginally below $w/\lambda_0$, capturing all the market, and pocketing the difference between
this price and the marginal cost $w/\lambda_1$ 185. The leader’s profits are equal to the shaded area
in the figure (Π). All its competitors are priced out of the market, so their profits are
zero.
Now suppose that an entrepreneur from the fringe successfully innovates and
reaches the frontier technology, $\lambda_1$. This means that, from now on, two firms will be
183
For a discussion, see Barro and Sala-i-Martin (1995, pp. 254-259) or Mukoyama (2003).
184
The following explanation is adapted from Aghion and Howitt (2009), chapter 2.2. The main references
are Aghion et al. (1997) and Aghion et al. (2001).
185
Note that the equality between the leader’s marginal cost ($w/\lambda_1$) and marginal revenue occurs at point
R, implying a monopoly price (point M) exceeding the competitive price ($w/\lambda_0$). This means that the
leader’s innovation is non-drastic.
operating in this market, competing neck-and-neck. The profits earned by each firm will
depend on how fiercely they compete with each other: at one extreme, if they engage in
open price competition, the equilibrium price will fall to $w/\lambda_1$, resulting in zero profits
for both; at the other extreme, if they collude, they can set the profit-maximizing price
($w/\lambda_0$) and share the profits equally, obtaining $\Pi/2$ each (in this case, the newcomer is
said to have “stolen” part of the leader’s business; see Box 8.3).
If more firms catch up to the frontier technology, the share of Π obtained by
each one declines further and the collusive solution becomes more difficult to
implement. Thus, a laggard firm achieving a successful innovation (from $\lambda_0$ to $\lambda_1$) will
gain something between $\Pi/2$ and zero, depending on the degree of competition in the
neck-and-neck state.
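Figure 8.4. Neck-and-neck competition
[Figure: price pj on the vertical axis and quantity x on the horizontal axis, showing the demand curve,
the marginal cost levels $w/\lambda_0$ and $w/\lambda_1$, the points C0 and R, and the quantities xM and x0C.]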
Thus, for laggard firms, the higher the degree of competition in the market, the
lower the incentives to achieve a successful innovation and join the incumbents in the
neck-and-neck state. This captures the conventional “Schumpeterian effect”, according
to which increased competition discourages innovation.
For firms already in the neck-and-neck state, however, the incentive to innovate
is stronger, the higher the level of competition. The reason is that the more
competition there is in the neck-and-neck state, the lower the firm’s profits and hence the higher
the benefit of discovering a superior technology, undercutting its rivals and becoming
a monopolist. Thus, through this “escape competition effect”, there will be a positive
relationship between product market competition and innovation.
In sum, once we account for the possibility of neck-and-neck competition, the
relationship between product market competition and innovation need no longer be
negative: true, in industries where competition is better described as “unlevel”, more
competition should come along with lower R&D effort, because joining the leader at the
frontier becomes less attractive; but in industries better described as “neck-and-neck”,
more competition should be associated with a higher research effort, as firms try to
“escape competition” and become market leaders.
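The two opposing effects can be illustrated with a minimal sketch. It assumes two firms at the frontier and a hypothetical competition index between 0 (full collusion) and 1 (open price competition) that scales each neck-and-neck firm's profit linearly between half of the leader's profit and zero. The linear form, the index and the function name are assumptions made for this example, not part of the model described in the text.

```python
def innovation_incentives(leader_profit, competition):
    """Gains from a successful innovation for a laggard and for a neck-and-neck firm.

    leader_profit: profit of a sole technological leader (hypothetical number)
    competition:   assumed index in [0, 1]; 0 = collusion, 1 = open price competition
    """
    if not 0.0 <= competition <= 1.0:
        raise ValueError("competition must lie between 0 and 1")
    # Profit of each of the two firms sharing the frontier technology
    neck_and_neck_profit = (1.0 - competition) * leader_profit / 2.0
    # A laggard earns zero before catching up (Schumpeterian effect)
    laggard_gain = neck_and_neck_profit
    # A neck-and-neck firm that moves one step ahead becomes the sole leader
    escape_gain = leader_profit - neck_and_neck_profit
    return laggard_gain, escape_gain


if __name__ == "__main__":
    for c in (0.0, 0.5, 1.0):
        print(c, innovation_incentives(10.0, c))
```

As the index rises, the laggard's gain from joining the frontier shrinks while the gain from escaping the neck-and-neck state grows, which is the pattern described in the preceding paragraphs.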
Box 8.3. The Business Stealing Effect
When an entrepreneur from the fringe successfully innovates and enters the
market, joining the leader in neck-and-neck competition, there will be a partial
diversion of rents from the leader to the newcomer. In this case, the innovator is said to
steal business from the incumbent.
The business stealing effect implies that some of the rents earned by the
innovator are simply diverted from the previous incumbents: they do not correspond to
a gain from the social point of view. This fact introduces the possibility of the R&D
efforts being excessive under laissez-faire.
In terms of Figure 8.4, consider the case in which an entrepreneur from the fringe
achieves an innovation that exactly matches the leader’s technology, $\lambda_1$. In this case, the
market price and the total demand for the good will not change, so the only difference is
that two incumbents - instead of one - will now share the market. If, for instance, they
collude and share equally the profits, then the “business stealing effect” will correspond
to half of the shaded area describing profits in Figure 8.4.
Clearly, in this case the innovation comes along with a social loss: the consumer
surplus does not change at all, and all the return reaped by the innovating firm will be a
mere transfer from the incumbent. As long as the innovation involves a fixed cost, this
will be a pure loss for the society as a whole, even if the innovator itself gets a profit.
Note however that this is not a general case: if the innovator achieved some cost
advantage relative to the leader, then there will be scope for social gains, even if the
previous incumbent was not driven out of the market: there would be a cost saving in
the units produced by the newcomer and consumer prices could fall. In this case, the
innovation could have a positive or negative social value depending on how these two
benefits compared with the fixed cost of the innovation.
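The gap between the private and the social return in the pure business stealing case can be made concrete with a short calculation. The sketch below is illustrative only: the numbers are hypothetical, the function name is invented for this example, and it assumes that the market price and total profit are unchanged by entry and that the colluding firms split profits equally.

```python
def business_stealing(total_profit, fixed_cost, firms_after_entry=2):
    """Private and social net value of an innovation that only matches the leader.

    Assumes an unchanged price, unchanged total profit and equal profit sharing.
    Returns (private_gain, rents_taken_from_incumbent, social_gain).
    """
    share = total_profit / firms_after_entry
    private_gain = share - fixed_cost   # what the innovator pockets
    stolen_rents = share                # pure transfer from the incumbent
    social_gain = -fixed_cost           # no new surplus is created, only the cost
    return private_gain, stolen_rents, social_gain


if __name__ == "__main__":
    print(business_stealing(10.0, 3.0))
```

In this hypothetical case the innovation is privately profitable even though its social value is negative, since the whole private return is a transfer.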
Box 8.4. The inverted-U puzzle
On the empirical front, the relationship between competition and innovation has
not been free of controversy. Some authors found a positive correlation between
competition and innovation186. Other authors, allowing for a non-linear relationship
between competition and R&D effort, came out with a new stylized fact, according to
which the relationship between R&D and competition follows an inverted U: that is, at
low levels of market competition, increasing competition comes along with more
innovation; but at high levels of competition the relationship turns out to be negative:
that is, more competition is associated with less innovation187.
An inverted-U relationship between competition and innovation is consistent
with the story outlined above. The only thing one needs to take into account is that the
186
Nickell (1996), Geroski (1995), Blundell et al. (1999).
187
Comanor (1967) found that R&D is smaller when technical entry barriers are too high or too low.
Scherer (1967) found an “inverted U” relationship between industry concentration and employment of
scientists and engineers. Aghion et al. (2005) found an inverted U relationship between product market
competition and R&D output, measured by citation-weighted patents. The latter also provide evidence that
the positive relationship between competition and innovation is more pronounced in firms that are close
to the technological frontier than in laggard firms.
steady state proportion of firms that are in the fringe or in the neck-and-neck state
depends on the level of product market competition:
- When competition is very low, there is little incentive for firms in the
neck-and-neck state to innovate. Hence, most firms will remain in the
neck-and-neck state. In this case, the “escape competition effect” dominates: an
increase in competition increases the incentives to innovate.
- On the other hand, when competition is very high, there is little incentive for
firms in the fringe to enter the market. Hence, the market structure will
be “unlevel”: in this case, the Schumpeterian effect dominates: more market
competition reduces the incentives to innovate.
Summing up, as the degree of competition increases, the relationship between
competition and innovation changes from positive to negative. This captures the
stylized fact of an inverted-U-shape relationship between R&D and product market
competition.
8.6 Discussion
This chapter discussed the competitive dimension of R&D. Competition through
innovation subjects agents in the market to a process of permanent adaptation that
resembles Darwin’s theory of natural selection.
The Schumpeterian paradigm implies that the introduction of a new technology
comes along with the destruction of existing rents. In each sector, when a firm manages
to discover a technology that renders an older technology obsolete, it obtains a
competitive advantage and destroys its rivals’ rents. As time goes by, other firms will
imitate the leader or will develop even better technologies, causing the leader’s rents to
erode. Sectors that innovate faster are more likely to expand and to absorb the workers
released from non-innovating sectors.
The conventional Schumpeterian paradigm implies that product market
competition, by eroding the rents that reward successful innovations, discourages R&D.
Thus, the higher the intensity of competition in a given market, the lower the incentive
for an outsider to innovate and join that market. Creative destruction, however, amounts
to a form of dynamic competition, according to which incumbents, seeing their rents
being eroded by competing innovations, try to “escape competition” with faster
innovation. This effect is more likely when firms compete neck-and-neck at the
technological frontier: in this case, an increase in product market competition leads
incumbents to increase their research effort, in an attempt to achieve a technological
advantage and become market leaders.
Most endogenous growth models are plagued by a scale effect whereby the
growth rate of per capita output becomes a function of the size of population.
The Schumpeterian model gets rid of this scale effect by linking R&D to vertical
innovations and allowing the number of varieties to expand along with the size of
population. This allows the increased population to be diluted by a larger number of
varieties, so the number of researchers per variety does not increase. However, a “weak”
scale effect arises, due to the “division of labour effect”: since the number of varieties
increases proportionally to the size of population, the growth rate of output will be itself
a function of the growth rate of population.
- Nowadays, many firms compete through innovation. When entrepreneurs achieve a successful
innovation, they reap a return that often comes at the cost of losses for non-innovating firms. This
competitive nature of R&D is labelled Creative Destruction and resembles Darwin’s theory of
natural selection.
- Because the potential gains of successful innovations materialize after the sunk cost of R&D is
incurred, the optimal level of R&D has to be determined “ex ante”. R&D expenditures are decided
depending on the expected profits achieved with the innovation compared to the opportunity cost of
the resources employed.
- In this assessment, entrepreneurs have to take into account both the potential profit in case the
innovation succeeds and how long this profit will last. This, in turn, will depend on the research effort by
others: the higher the R&D activity in an economy, the higher the likelihood that a successful
innovation will be short-lived, and hence the lower the incentives to innovate. In contrast, when
society devotes only few resources to innovation, the opportunity cost of R&D is low.
- In the Schumpeterian model, the optimal level of R&D depends positively on the size of the
workforce: a larger workforce implies a larger market and hence more profits, so the optimal R&D
intensity increases. This, in turn, leads to higher growth, giving rise to a scale effect.
- Extending the model to many sectors, another source of creative destruction is identified: successful
innovations across different sectors translate into higher real wages, shrinking the profits and employment in
non-innovating industries.
- The model with many sectors offers a natural framework to remove the scale effect: the key
assumption is to link the size of population to the number of product varieties. As long as the size of
population and the number of varieties are proportional, the fraction of the workforce employed in
each variety remains constant. With a constant number of researchers per variety, the arrival rate of
vertical innovations will be constant as well, despite the expanding population. Still, the model will
display a weak scale effect.
- An alternative avenue to remove the scale effect is to assume that ideas become more difficult to
achieve as the level of technology increases (the Fishing out effect). With such an assumption, a
larger population will not imply a faster rate of technological progress.
- The fact that the reward to innovation comes through monopoly profits does not necessarily imply
that less competition is good for innovation. True, a market with low competition will be more
attractive for newcomers, so through this “Schumpeterian effect”, less competition is good for
innovation. However, high product market competition also makes it more attractive for firms in that
market to “escape competition” by innovating. The total effect is ambiguous.
Key concepts
- Creative destruction
- The crowding out effect
- Stepping on shoes
- Fishing out effect
- The replacement effect
- The business stealing effect
- Neck and neck competition
Essay questions:
a) Comment: “The larger the number of researchers in an economy, the lower the value of a
patent”.
b) Comment: “Firms escaping competition have more incentives to innovate than incumbent
monopolies”.
c) In empirical studies, some authors identified an inverted-U shape in the relationship
between the degree of competition and innovation. Explain the theory that was proposed to
explain this stylized fact.
d) “Competition improves the static efficiency but is bad for growth”.
Exercises
8.1.
Consider a product whose production is carried out by an incumbent that is a
monopolist in the product market and a price taker in the labour market. The demand
curve for this product is given by $p = 2x^{-1/2}$ and the production function is
$x = \lambda N_Y$, where $N_Y = (1 - \mu)N$ is the working time that workers in this
sector devote to production. The workers’ remaining time is devoted to private R&D, in
an attempt to achieve a vertical innovation and displace the incumbent. When 1 unit of
labour is devoted to R&D, the probability of achieving a vertical innovation, consisting
of multiplying the previous λ by four, is b=1%. The total working time in this sector
is constant and given by N=5.
a) Consider first the problem of the incumbent monopolist, who achieved a
productivity level (λ) equal to 4. Taking into account that the wage rate
(w) is equal to 1, find out the selling price and the production level that
maximize the incumbent’s profits. Compute these profits and represent
the monopolist’s optimal solution in a graph.
b) Taking now the wage rate and the productivity parameter as unknowns,
find out the general expression of the demand for labour in this industry.
Assume that the total working time devoted to formal production in this
sector is equal to $N_Y = (1 - \mu)N = 4$. Find out the equilibrium wage rate
when λ=4, λ=16 and λ=64. Describe the successive equilibria with the
help of a graph.
c) Consider now the problem of a research worker who is trying to discover
technology λ=16. If he succeeded and became a monopolist (displacing
the incumbent), how much would his profits be? Taking into account the
probability of achieving a vertical innovation (b=1%) and assuming that
the discount rate is equal to r=7%, what proportion of his time should he
devote to R&D?
d) Returning to (b) and leaving μ, N and b unknown, solve again the
researcher problem.
e) With the help of a graph, explain how changes in the different
parameters affect that equilibrium.
8.2.
Consider the market of an intermediate input, whose individual production
function and market demand are described by $x_i = \lambda_i N_i$ and $p = 1.5x^{-1/3}$, respectively.
Also assume that this market is small relative to the rest of the economy, with W=1.
Initially, this product is produced under perfect competition, with $p = W/\lambda = 0.75$.
a) Assume that an entrepreneur achieved a technology to produce this good with
λ=1.6. Is this innovation vertical or horizontal?
b) Is the innovation just described drastic or non-drastic? Explain the optimal
strategy of the entrepreneur and the corresponding profit. Represent in a graph.
c) For this strategy to materialize, which condition must hold? Identify real
world mechanisms that help this condition to hold.
d) Assume now that more entrepreneurs were able to achieve the same technology
λ=1.6, and divided equally the implied profits. Discuss the effect of the
increasing competition in this market on the incentives to innovate,
distinguishing entrepreneurs from the fringe and entrepreneurs already using the
leading technology.
“…it is a matter not of individual inventiveness but of the receptivity of whole
societies to innovation”. [Jared Diamond].
Learning Goals:
- Understand the heterogeneity of knowledge regarding its diffusion
potential
- Understand the critical role of economic openness for technology
diffusion
- Acknowledge how country characteristics may delay the pace of
technology diffusion
- Understand why adoption of foreign technologies may involve
adaptation efforts
- Understand the basic functioning of growth models with technological
interdependence
9.1 Introduction
How to improve the state of technology is a policy question that confronts all
modern societies. For an industrial country, keeping the lead requires a continuous
effort to invent new products and processes. For an emerging economy, however, the
issue is not so much pushing forward the world technological frontier as
benefiting from the world technological diffusion.
The advantage of adopting foreign technologies is that these do not need to be
invented again. New technologies do not flow, however, instantaneously from rich
countries to poor countries. Technologies have the potential to be transferred across
firms and country borders, but whether they are implemented or not in each particular
environment depends on the prevalent incentives to do so. These incentives, in turn,
differ across space, depending on economic, political, cultural and geographical
circumstances.
This chapter addresses the question of why available technologies do not flow
uniformly across space. Section 9.2 describes the critical role of economic openness
in technological diffusion. Section 9.3 explains how recipient country characteristics
influence the absorptive capacity. Section 9.4 addresses the costs involved in selecting
and adapting the technology that best matches the recipient country’s needs. Section 9.5
presents a model of an emerging economy faced with the challenge of adopting
technologies developed abroad. In this model, there is a World technological frontier
and the country’s characteristics and policies determine how close it gets to that
frontier188. Section 9.6 concludes.
9.2. Vehicles of technological diffusion
The advantage of backwardness
The view that poor countries may improve their living standards by imitating
successful technologies and practices from rich countries dates back to David Hume
(1758). Since the inventing process does not have to be repeated, there is a potential
advantage for those who adopt frontier technologies without the need to learn from the
beginning. This idea was popularised by Alexander Gerschenkron (1952), who coined
the term “advantage of backwardness”.
Taking advantage of ideas developed abroad is not, however, an automatic
process. Ideas have the potential to be transferred across space, but whether they are
actually implemented in each particular environment depends on the incentives to
do so. Although in today’s world it is possible to store an invention in a little file and
send it through the Internet to any country in the World, the truth is that technology
differs considerably across space. Even within single countries, there are large and
persistent productivity differences across plants in the same narrowly defined
industry189.
The conclusion is that, although backwardness carries with it the potential for a
country to catch up, the degree to which this potential materializes in each particular
country depends on the country’s economic, political and social circumstances. Factors
such as the availability of human skills, infrastructures and the quality of the business
environment, by shaping the economic incentives, may accelerate or retard the adoption
of new technologies. Moses Abramovitz (1979, 1986) labelled these as the “social
capability” of a country to absorb the available technologies.
Box 9.1 Imperfect technological diffusion
In the neoclassical model, it is assumed that technology spills over
instantaneously across firms and country borders at no cost. In real life, however,
many factors prevent technology from diffusing across space. To motivate this,
consider the following examples, consisting of four of the most important inventions of
humankind:
The wheel: the wheel was discovered in the region of the Black Sea around the year
3400 B.C. You may think of this idea as almost a public good: once an agent becomes
aware of the concept, nothing much prevents him from using it for his own benefit.
Nevertheless, this simple idea took centuries to spread around Europe and Asia. In the
188
The chapter draws extensively from Klenow and Rodriguez-Clare (2005), Comin and Hobijn (2004)
and Keller (2004).
189
A classic example of this is the adoption of hybrid corn in the U.S (Griliches, 1957): the diffusion of
the more productive hybrid corn across the territory was rather asymmetrical and dependent on local
circumstances.
New World, people had to wait until the 15th century before enjoying the benefits of the
wheel in transportation (actually, the Mayans and the Aztecs had the “wheel”, but had
only used it in toys). This example suggests that geographical distance has a role in
determining the pace of technological diffusion.
Making fire: without question, this technology is easier to hide from imitators
than the wheel. Probably, the hominids who first discovered how to make a campfire,
around 1.4 million years ago, tried to keep it secret, so as to have an advantage over
their competitors. But, either through disclosure or through independent discoveries, the
fact is that the ability to make fire became universally known long ago in human
history. This example suggests that the passage of time has a role in eroding the barriers to
technological diffusion.
Writing: this technology was invented in Mesopotamia around the year 3000
B.C. It was also independently discovered in Central America before 600 B.C. Writing
is a powerful tool that fuels human interactions, but it is a rather complex technology,
which requires considerable individual effort to be transmitted across people. Not
surprisingly, after writing was invented, it spread rapidly through organized
societies, but it was unable to penetrate hunter-gatherer societies, where the
economic incentives to adopt it were non-existent. Today, even though governments spend
large amounts of resources to make this technology universally available, many people
do not acquire it. This example shows that some knowledge is only assimilated when
people perceive it to be worthwhile, even when publicly available.
Democracy: democracy was first implemented in the ancient Greek city-state of
Athens, in the year 508 B.C. At that time, it was invented so as to provide peasants
engaged in highly productive long-term investments (preparing the fields to cultivate
olives) with a political system that minimised the expropriation risk190. In general,
however, democracy is a technology that proves difficult to implement. The reason is
that it depends on collective actions, and requires a minimum set of complementary
institutions (the rule of law, for instance). Moreover, even when the conditions exist to
implement democracy, interest groups who have more to gain from a non-democratic
status quo may block it. This example stresses the fact that lack of complementary
inputs and vested interests may delay the pace of technological diffusion.
Taken together, these examples remind us that, although technology is, in
principle, infinitely expansible, its diffusion across space is far from automatic.
Because of various combinations of geographical distance, secrecy, lack of
complementary skills, vested interests or cultural idiosyncrasies, technology tends to be
differently assimilated across space.
The critical role of trade
Technology does not spread instantaneously across firms and country borders. It
rather flows through specific mechanisms of human interaction. These include trade and
factor mobility.
In complete isolation, it would be virtually impossible for a country to learn
from abroad. As an extreme example, remember the history of the Aboriginal
Tasmanians in Box 1.4: since they had no contact with other societies for more than
190
Fleck and Hansen, 2006.
10,000 years, they could not acquire new technology other than what they invented
themselves.
Nowadays, on-going interaction with foreign firms and consumers provides a
fundamental base for learning and matching best performances, in a way that cannot be
replicated by interacting with domestic agents only. Openness to the global economy is a
critical ingredient for technological diffusion.
There are different mechanisms through which international trade increases an
economy’s permeability to the world technological diffusion. First, importing equipment
from more advanced countries is a direct way of using the embodied technology without
the need to replicate the research effort. Second, opening the domestic market to the
competition of foreign firms bringing newer and more sophisticated products compels
domestic firms to improve their products and to seek more efficient ways of
producing them. Third, competition in foreign markets provides exporting firms with
the discipline of interacting with highly demanding customers, inducing them to meet
high quality standards191. Fourth, access to external markets may favour the
establishment of new exporting industries that would otherwise not emerge. Fifth, a
society more exposed to foreign ideas tends to be more demanding with respect to the
quality of domestic policies and institutions. Protectionism, in contrast, creates the
conditions for interest groups to become organized and to spend resources in pressing
the government for more protection, instead of devoting their time to searching for
better technologies.
In the real world, there are plenty of examples of a positive impact of trade
openness on productivity. For instance, some authors argue that one reason why the
United States emerged in the 1865-1929 period and surpassed England as the world
technological leader is that it operated as a “free trade club”, whereby member states
were not allowed to impose restrictions on imports from (or on technology developed
by) other member states. By the same token, after 1957, the advent of European
economic integration helped Western Europe to catch up with the United States192.
Empirically, studies pointing to the critical role of international trade as a
vehicle for technological diffusion include, among others, Sachs and Warner (1995,
1997) and Comin and Hobijn (2004). Sachs and Warner (Box 6.9) showed that trade
openness tends to be associated with faster productivity growth and sounder economic
policies. Comin and Hobijn (2004) analysed the diffusion of 25 specific technologies
across 23 industrial economies over the period 1788-2001, and reported an important role
of trade openness in determining the speed of technology adoption.
A different question is whether the identity of the trading partner matters. That is,
countries importing primarily from technological leaders might be expected to benefit more
than countries importing primarily from laggard countries. Empirically, it has been found
that TFP levels in developing countries tend to increase with the R&D effort of their
main trading partners193.
191
These learning-by-exporting effects are often quoted as a key ingredient for the success of East Asian
countries. Keller (2004), however, reports that the econometric evidence for these effects has been weak.
192
Parente and Prescott (2005). In a similar vein, Ferreira-Cavalcanti and Rossi (2003) document a
large and widespread improvement in output per worker in 16 Brazilian industries, once barriers to trade
were drastically reduced, in the early 1990s.
Foreign Direct Investment
Like international trade, FDI is usually seen as a vehicle for cross-border
transfer of technology. Specific mechanisms through which FDI promotes technological
diffusion include: bringing new machinery and production techniques to the host
country; demonstration effects that induce imitation by local firms; increased
competition in the domestic market; creation of a demand for high quality or specific
intermediate inputs. To these, one should add the important role of foreign investors in
promoting face-to-face contacts between skilled workers in the headquarters and in the
subsidiaries. These face-to-face contacts are essential to diffuse the so-called tacit
knowledge, which is difficult to transfer at a distance (see Box 9.2).
On the empirical front, however, knowledge spillovers related to FDI have been
difficult to demonstrate. That is, in general, it has been difficult to prove that high FDI
levels are associated with high productivity growth in the recipient country. An
explanation that has been proposed for this result is that country characteristics matter:
FDI is important for technological diffusion, but only when the conditions are in place
for technology to diffuse.
A widely quoted study in this direction is Borensztein et al. (1998). The authors
investigated the effect of direct investment from industrial countries in 69 developing
countries. They found that FDI contributes relatively more to growth than does
domestic investment, suggesting that FDI is indeed a vehicle for technological change.
However, the higher productivity associated with FDI is conditional on a minimum
threshold stock of human capital in the host country. The authors concluded that FDI
contributes to economic growth only when the host country has sufficient “absorptive
capability”194.
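The threshold result can be read as an interaction effect: the contribution of FDI to growth depends on the stock of human capital in the host country. The sketch below is only a stylised illustration of that logic. The linear form and both coefficients are assumptions chosen for the example, not the specification or the estimates of the study.

```python
# Stylised illustration only: the coefficients are hypothetical, not the
# estimates reported by the authors.
B_FDI = -1.0      # hypothetical direct effect of FDI on growth
B_INTERACT = 2.0  # hypothetical effect of the FDI x human-capital interaction


def fdi_growth_effect(human_capital, b_fdi=B_FDI, b_interact=B_INTERACT):
    """Marginal effect of FDI on growth at a given stock of human capital."""
    return b_fdi + b_interact * human_capital


def human_capital_threshold(b_fdi=B_FDI, b_interact=B_INTERACT):
    """Stock of human capital above which the effect of FDI turns positive."""
    return -b_fdi / b_interact


if __name__ == "__main__":
    print("threshold:", human_capital_threshold())
    for h in (0.25, 0.5, 1.0):
        print(h, fdi_growth_effect(h))
```

With these illustrative numbers, FDI lowers growth in a host country below the threshold and raises it above, which is the sense in which “absorptive capability” conditions the effect.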
Box 9.2 Tacit knowledge
A major factor preventing technological knowledge from spilling over across the
board is that it often requires face-to-face interactions to be transmitted.
Indeed, if all knowledge could be codified in simple formulas like the Pythagorean
theorem, its transmission across space would not be a problem. Not all knowledge,
however, is suitable for codification. In many cases, only the broad lines of a technology
are codified. The remaining knowledge stays non-codified, embodied in the skills of
practitioners, and is better transmitted through face-to-face contacts. Such knowledge is
said to be “tacit”195.
193
This hypothesis was first proposed by Grossman and Helpman (1991), and was subject to statistical
scrutiny by Coe and Helpman (1995) and Coe et al. (1997). Other authors testing the relationship between
international trade and technological diffusion include Lichtenberg and de la Potterie (1998), Bayoumi et
al. (1999), and Savvides and Zachariadis (2005). For a survey, see Keller (2004).
194
Along the same lines, Xu (2000) analysed U.S. outward FDI in manufactures to forty countries,
over the period 1966 to 1994. He found a positive relationship between FDI and productivity growth.
The author also reported that rich countries benefit more from hosting multinational subsidiaries than
poorer countries, suggesting that the “absorptive capacity” matters.
Tacit knowledge may leak unintentionally. For instance, when workers trained
by one firm move to another firm, they will be agents of knowledge diffusion
(remember the case of the Desh Factory in Box 6.4)196. This diffusion tends to be,
however, localized in space. Since tacit knowledge transmits better through face-to-face
contacts, it will tend to act as a local public good, giving rise to agglomeration effects.
In contrast, transmitting tacit knowledge at a distance may be very costly.
Multinational firms spend considerable amounts of resources in organizing meetings,
workshops, demonstrations and seminars, just to transfer knowledge across different
locations197. Because multinationals find it profitable to do so, they are an important
vehicle of technological diffusion.
9.3. Barriers to technological diffusion
Complementarities
While international trade and factor mobility may be seen as vehicles for
technological diffusion, other forces push in the opposite direction. Often, a given
technology is well known and freely available, but the incentives to use it in a particular
environment are missing.
A major reason for the slow adoption of new technologies is the lack of an
appropriate set of complementary inputs. The productivity of new equipment does not
depend only on its intrinsic efficiency but also on the abundance and adequacy of
complementary inputs in the host economy. The better the new technology matches
the country’s endowments, the more likely it is to be profitable and hence
adopted.
An obvious complementary input to new technologies is human capital. Poor
and unequal countries with low levels of literacy will find it more difficult to adopt
sophisticated technologies than countries with high levels of human capital.
Empirically, an extensive literature points to an important role of human capital in
determining the “absorptive capacity” of countries198. Some authors argue that the main
role of human capital in economic growth is not its direct effect as an input to production
(as captured by the MRW model, for instance), but instead its role in shaping the ability
of a country to innovate and adopt new technologies. Hence, conventional growth
accounting, by assessing the contribution of human capital to economic growth by its
elasticity in production only, will understate the actual role of human capital.
195
Polanyi (1958), p. 53: “Tacit knowledge can be passed only by example from master to apprentice”.
Studies documenting the importance of personal contacts for international technology transfer include
Kerr (2008) and Agrawal et al. (2006).
196
A formal model is in Fosfuri et al. (2001).
197
Teece (1977), analysing 26 projects involving the transfer of manufacturing capability from multinational
firms with headquarters in the United States to other countries, estimated the cost of within-firm transfers
of technology to be, on average, almost 20 percent of the total project cost.
Complementary inputs other than human capital include physical infrastructure
(ports, telecommunication networks, power supply), business services (accountancy,
machinery repairs), financial services, government services (property rights protection,
regulation), and so on.
The existence of complementary inputs suggests that the slow adoption of new
technologies in developing countries may be an optimal response to differences in
endowments, which translate into differences in the efficiency with which new
technologies can be used. This means that governments may have a role in shaping a
country’s absorptive capability: by promoting education, building essential
infrastructure and promoting a balanced development of the different capabilities,
governments may help a country to overcome the coordination failures that impair the
adoption of new technologies.
Changing a country’s economic conditions is, however, a slow process. Some
capabilities, by their nature, evolve slowly over time (e.g., human capital, culture), others
are very expensive (infrastructure) and others cannot be changed at all (geography)199.
Thus, in some cases, rather than simply trying to adopt the foreign technology, it is
more appropriate to adapt the foreign technology so as to make it more suitable to the
conditions of the recipient country (Box 9.6 offers a real world example).
198
Benhabib and Spiegel (1994) investigated the relationship between human capital endowments and the
speed of technological adoption. Caselli and Coleman (2001) found that high levels of educational
attainment are determinants of computer-technology adoption. Comin and Hobijn (2004) found a
significant role of secondary education in explaining technology adoption in the period up to 1970
(tertiary education after 1970). Comin and Hobijn (2004) also examined the importance of education for
the adoption of specific technologies, finding an important role in technologies related to electricity, mass
communication and personal computers, and a negligible role in textiles, steel and shipping. Other
empirical studies documenting positive correlations between human capital endowments and
technological diffusion include Griliches (1957), Eaton and Kortum (1996), Doms, Dunn and Troske
(1997), Borensztein et al. (1998) and Caselli and Wilson (2004). At the theoretical level, Nelson and
Phelps (1966) built a Schumpeterian model where technological diffusion is mediated through human
capital (see also Acemoglu et al., 1996, Aghion et al., 2002). Aghion and Howitt (2005) argued that the
importance of higher education increases as the country approaches the world technological frontier, that
is, as innovation becomes more important than imitation.
199
Young (1928): “An industrial dictator, with foresight and knowledge, could hasten the pace somewhat
but he could not achieve the Aladdin-like transformation of a country’s industry so as to reap the fruits of
a half century ordinary progress in few years. The obstacles are of two sorts. First, the human material
which has not been used is resistant to change. New trades have to be learnt and new habits have to be
acquired (…). Second, the accumulation of the necessary capital takes time (…). An acceleration of the
rate of accumulation encounters increasing costs, into which both technical and psychological elements
enter”. (p. 534).
The old blocking the new
In light of the neoclassical model, whenever a more efficient vintage of a given
equipment is invented, there should be no additional investment in the vintage that
has become obsolete. Thus, the share of the old vintage in the total capital stock should
decrease gradually over time200. This theory is labelled the “vintage capital theory”.
In the real world, however, there are many examples in which investment in
frontier technologies only becomes dominant after a period during which investment in
non-frontier technologies continues to dominate. An interesting one occurred in
Germany after WWII. With the war, Germany lost a significant part of its merchant
fleet. If the “vintage capital theory” were correct, Germany should have rebuilt its
merchant fleet with state-of-the-art ships only. However, Germany’s impressive
investment rates after the war did not translate into a higher proportion of motor-ships
(relative to sail-ships and steamships) than in other European countries201.
The persistence of investment in old technologies looks like a paradox.
Why should firms insist on buying technologies that are technically dominated by newer
technologies?
One possible reason for firms not to adopt frontier technologies immediately is
that the use of a technology often requires technology-specific experience (the so-called
“Vintage Human Capital”). When relevant productive experience has already been acquired
working with the old technology, workers may have little incentive to update to a more
advanced technology, as that would imply a decline in the value of their experience.
Consequently, firms may “hang on” to the old technology and even continue to invest in
it, even though a superior one becomes available. In this case, the old technology is said
to “block” the new technology. A classic example of how experience in working with
an old technology may block a superior technology is given in Box 9.4202.
A similar story holds in the presence of network externalities. A network
externality arises when the benefit of using a given technology increases with the
number of users. The classic example is the telephone: the more people use telephones,
the more valuable the telephone is to each user. Thus, when many users hang on to
the old technology, it is difficult for a newer technology to take hold.
Network effects in technology adoption also arise through social learning.
Suppose, for instance, that you buy a new piece of computer software: the cost of learning how
to use this software will certainly be lower if you have a friend using the same software
nearby. To the extent that some of the required knowledge is tacit or flows more
easily through personal contacts, the more people in a given location use the same
technology, the higher the likelihood that each new user will interact with a potential
teacher and hence the lower the learning cost. When this is so, the conditions
exist for a widely used technology to block a new and more efficient technology. To a
large extent, network effects and social learning have protected Microsoft Office
against its main competitors.
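The blocking mechanism can be illustrated with a minimal sketch in which each new user compares the payoffs of the two technologies. The payoff is assumed to be the intrinsic quality of the technology plus a network benefit proportional to its current share of users. The functional form, the parameter names and the numbers are assumptions made for illustration only.

```python
def simulate_adoption(q_old, q_new, network_benefit, old_users, new_users, entrants):
    """Sequential adoption by `entrants` new users (hypothetical payoff rule).

    Payoff of a technology = intrinsic quality + network_benefit * user share.
    Each entrant picks the technology with the higher payoff (ties go to old).
    """
    for _ in range(entrants):
        total = old_users + new_users
        share_old = old_users / total if total else 0.0
        share_new = new_users / total if total else 0.0
        payoff_old = q_old + network_benefit * share_old
        payoff_new = q_new + network_benefit * share_new
        if payoff_new > payoff_old:
            new_users += 1
        else:
            old_users += 1
    return old_users, new_users


if __name__ == "__main__":
    # Strong network effects: the installed base blocks the better technology.
    print(simulate_adoption(1.0, 1.5, 1.0, old_users=100, new_users=0, entrants=50))
    # Weak network effects: the better technology is adopted by all entrants.
    print(simulate_adoption(1.0, 1.5, 0.2, old_users=100, new_users=0, entrants=50))
```

In this sketch the new technology is intrinsically better in both runs. What changes the outcome is the size of the network benefit relative to the quality gap.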
200
Johansen (1959), Solow (1960).
201
Comin and Hobijn (2004). The authors examined the diffusion of 25 technologies covering different
industries across 23 industrial countries over the period 1788-2001, and identified substantial delays in
the adoption of new technologies, not consistent with the vintage capital theory.
202
Another popular example happened in the shipbuilding industry. According to Harley (1973), wooden
shipbuilding persisted much longer in the U.S. than in the UK, because the specific skills involved in
wooden production were abundant there, while those needed to build iron ships were lacking.
Box 9.4. Locked to QWERTY
The QWERTY keyboard that you probably find on your computer was created in
1873, with a series of tricks designed to slow down typists: the commonest letters are
scattered over all rows and concentrated on the left side of the keyboard, so that
right-handed people have to use their weaker hand to reach them.
Why was this keyboard designed with such unhelpful and unproductive
features? The answer is that mechanical typewriters in 1873 jammed easily if two keys
were struck in very quick succession. The QWERTY key layout emerged to slow typing
speeds and so reduce the frequency of jams. It was therefore a purposeful inefficiency
created to avoid problems of jamming.
What makes this case interesting is that when improvements in typewriting
eliminated the problem of jamming, the QWERTY keyboard was already installed in
most of the world’s typing machines and the secretarial profession was trained to use it.
New keyboards, allowing for faster writing and lower effort, were launched, but they did
not succeed, because people were already locked in to the less efficient technology
and refused to change203.
Leapfrogging
Lock-in effects related to vintage human capital and network externalities
should, at first sight, favour income convergence: countries that are intensive users
of the old technology should be the countries that have more to lose by switching to the
new technology. Backward economies, in contrast, because they are not heavily
committed to any technology, could in principle jump to the frontier by investing in
skills and in infrastructure targeting the frontier technologies. The opportunity to jump
ahead of the leaders is dubbed “leapfrogging”.
In real life, examples of leapfrogging abound. For instance, many developing
countries in the past decade have adopted cellular phones faster than developed
countries, because operators were not already locked in to the technology of
conventional telephones.
Leapfrogging is not, however, the general case: on the contrary, there is now
significant evidence pointing to the fact that more advanced economies are not only
those that invent new technologies, they are also those that adopt the new technologies
the earliest204. A possible explanation for this pattern is that part of the experience acquired
with the old technology is transferable to the new technology. For instance, it is
probably easier to become a computer typist for someone with experience in mechanical
typewriting than for someone with no experience at all. By the same token, adopting
more sophisticated equipment to produce t-shirts may be easier for workers who are
used to producing t-shirts with older technologies than for workers who never worked in
the t-shirt industry.
203
David (1985).
204
Comin and Hobijn (2004) examined the diffusion of 25 technologies across 23 industrial countries for
the period 1788-2001, and found no evidence of a dominant leapfrogging effect in the data. The authors
found that most technologies are first adopted in advanced economies and then trickle down to
countries that lag economically. In a panel regression controlling for other factors, the authors found that
the pace of technological adoption is positively related to per capita GDP, suggesting that richer countries
adopt newer technologies first. The authors also found that leaders in the adoption of a predecessor
technology tend to be the leaders in the adoption of the successive technology.
When this is so, the accumulated experience in dealing with an old technology
not only reduces the user costs in that technology (giving rise to lock-in effects), it also
reduces the costs of adopting the new and more efficient technology. When the second
effect dominates, countries that are heavy users of the old technology will have an
advantage instead of a disadvantage relative to countries that have no relevant
experience in the field205. Box 9.5 presents an argument along these lines.
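The two opposing effects of experience can be made concrete with a stylised sketch. Output is assumed to be the productivity of the technology times one plus the worker's usable experience, and only a fraction of the experience carries over to the new technology. The functional form, the transferability parameter and the numbers are illustrative assumptions, not a model taken from the literature cited in the text.

```python
def output(productivity, experience, transfer=1.0):
    """Output with a technology, given experience and the share that is usable."""
    return productivity * (1.0 + transfer * experience)


def switches(a_old, a_new, experience, transfer):
    """True if moving to the new technology raises the worker's output."""
    return output(a_new, experience, transfer) > output(a_old, experience)


if __name__ == "__main__":
    a_old, a_new = 1.0, 1.5  # hypothetical productivities of the two technologies
    # Low transferability: the experienced worker stays (lock-in) ...
    print(switches(a_old, a_new, experience=1.0, transfer=0.2))
    # ... while a newcomer with no experience switches (leapfrogging).
    print(switches(a_old, a_new, experience=0.0, transfer=0.2))
    # High transferability: the experienced worker switches and out-produces
    # the newcomer with the new technology.
    print(switches(a_old, a_new, experience=1.0, transfer=0.8))
    print(output(a_new, 1.0, 0.8) > output(a_new, 0.0, 0.8))
```

Under these assumptions, leapfrogging arises only when little experience carries over. When most of it does, the experienced user both switches and keeps an advantage over the newcomer.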
Box 9.5. The Hausmann-Klinger forest
Hausmann and Klinger (2006, 2007) contend that the ability of a country to start
producing more sophisticated (rich country) goods depends on the usefulness of the
industry-specific experience generated by the particular basket of goods in which the
country is currently specialized.
The authors illustrate the argument with the metaphor of a forest, where each
tree represents a product. In that forest, each tree is placed at some distance from the other
trees, the distance capturing the degree to which the productive capabilities of one
product can be used in another product. Because some industries use skills that are
common to a large number of industries, some parts of the forest are denser than others.
In this metaphor, firms are monkeys that live on trees and the process of
structural transformation involves the monkeys jumping from tree to tree.
Moving to trees at larger distances involves the need for productive capabilities that
have not been previously accumulated. Because some trees generate more income than
others, each monkey would like to move to high productivity trees. However, because
smaller jumps are less costly than larger jumps, the ability of the “tribe” to engage in
superior technologies depends on having a path of nearby trees of increasingly
higher productivity. If the move towards high productivity trees requires larger jumps,
the tribe may find itself in a poverty trap, jumping around lower income trees.
With this metaphor, the authors contend that the process of technological change
is path dependent: when the accumulated experience is less valuable, a developing
country may find itself stuck in traditional industries, unable to embark on modern
productions that generate new knowledge and spur economic development. This
interpretation is consistent with a broad notion of experience, including non-tradable
intermediate inputs, common infrastructure, labour skills, country-specific technical
knowledge, specific regulations and so on.
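The forest metaphor can be sketched as a reachability problem: starting from its current tree, the tribe can reach any tree connected to it by a sequence of jumps no longer than some maximum distance. The sketch below places the trees on a line for simplicity. The product names, positions and productivity levels are hypothetical, and it is not the empirical product-space measure used by the authors.

```python
from collections import deque


def best_reachable(trees, start, max_jump):
    """Most productive tree reachable from `start` through jumps <= max_jump.

    `trees` maps a product name to a (position, productivity) pair.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        position = trees[current][0]
        for name, (other_position, _) in trees.items():
            if name not in seen and abs(other_position - position) <= max_jump:
                seen.add(name)
                queue.append(name)
    return max(seen, key=lambda name: trees[name][1])


# Hypothetical forest: a dense low-income cluster and a distant high-income one.
FOREST = {
    "garments": (0.0, 1.0),
    "textiles": (1.0, 2.0),
    "footwear": (2.0, 3.0),
    "electronics": (6.0, 10.0),
    "machinery": (7.0, 12.0),
}

if __name__ == "__main__":
    print(best_reachable(FOREST, "garments", max_jump=1.5))  # stuck in the cluster
    print(best_reachable(FOREST, "garments", max_jump=4.0))  # path to the rich cluster
```

With short jumps the tribe never leaves the low-income cluster, which is the poverty trap described in the box. A larger jump capacity, or a denser forest, opens a path to the high productivity trees.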
205
Jovanovic and Nyarko (1996) build a model of individual decisions, whereby learning by doing
provides an agent with information that improves its productivity in the old technology (vertical shifts). In
this model, agents may also switch to new technologies (horizontal shifts). The degree of similarity of the
new technology to the old one determines how transferable the accumulated knowledge is. The lower the
possibility of transferring the accumulated knowledge to use with the new technology, the larger will be
the productivity loss faced by those workers being asked to move to the new technology. When the
technological leap is too large, the expertise loss may be such that a highly skilled agent prefers not to
switch, becoming therefore locked in the old technology.
Barriers to technology adoption
Innovations not only have the potential to generate rents, they also have the
potential to destroy existing rents. To the extent that technological progress brings about
more efficient machinery and production methods, owners of the old machinery will
lose. Skilled workers attached to the old technology may also see the new technology as
reducing the value of their accumulated knowledge.
Powerful elites and organized groups seeing their economic, political and social
interests threatened by the adoption of new technologies may try to place obstacles to their
diffusion. Parente and Prescott (1994) coined the term “barriers to technology adoption”
to describe the obstacles put in the path of innovators by established interests. These include
regulatory and legal constraints, bribes, violence or threat of violence, and worker
strikes206.
Barriers to technology adoption also arise from social norms: the
implementation of new technologies often requires organizational changes and
complementary reforms that challenge beliefs and traditions within a country. Leaders
in laggard countries often lack the political power or the political will to confront these
traditions.
All in all, the arrival of new technologies often faces the resistance of groups
who have much to gain from the preservation of the status quo. When these groups are
powerful enough to block the adoption of new technologies, society as a whole will
lose.
Because of this, many authors argue that the most important element
influencing the pace of technological diffusion is the quality of political institutions: the
less dependent they are on the established elites, the easier it will be to find
policymakers committed to reforms and able to accept the underlying changes that
the new technologies are likely to bring about.
9.4 Matching specific needs
The fact that different technologies perform differently in different environments
implies that different countries should optimally adopt different technologies207. Thus,
rather than simply importing any foreign technology, follower countries have to pick
those technologies that best suit their particular set of capabilities, and eventually
adapt them so as to make them more effective in their particular environment. This
process brings about further costs, adding to the general costs related to technology
adoption.
206
An interesting example occurred in Austria-Hungary and in Russia during the nineteenth century.
According to Acemoglu (2003), the elites in these countries “blocked industrialization and even the
introduction of railways” because they realized industrialization would “reduce their power and
privileges”.
207
This is basically what the “theory of appropriate technology” states (Atkinson and Stiglitz, 1969,
Diwan and Rodrik, 1991, Basu and Weil, 1998, Acemoglu and Zilibotti, 2001).
Self discovery
Because countries differ in terms of endowments, capabilities and culture,
different technologies match differently the needs of different countries. If information
were perfect, investors would always pick the technology that best matches the target
environment. In a world with uncertainty, however, finding out which of the many
potential technologies best fits a country’s specific circumstances is a process of trial
and error. Ricardo Hausmann and Dani Rodrik dubbed this process “self-discovery”208.
The process of “self-discovery” involves different types of externalities from the
innovator to the followers.
For instance, the entrepreneur that first adopts a new technology provides
valuable information to its competitors: if the entrepreneur succeeds, other
entrepreneurs who opted to wait and see will imitate it, eroding the innovator’s rents; if it
fails, the innovator will bear the costs alone. Because the entrepreneur that first adopts
the new technology provides valuable information to other potential entrepreneurs
without being compensated for that, an information externality arises.
A similar reasoning holds for the activity of training workers in the use of a new
technology: once the innovating firm incurs the training costs, there is a risk that
competitors free-ride on worker mobility, thereby beating the innovator with
lower training costs.
These externalities clearly reduce the incentives to innovate. If the expected gain
from moving first is not large enough to compensate the innovating firm for its risk
taking, the firm will optimally prefer to wait and see, postponing the adoption of the
new technology, even if adoption would be socially valuable. This discussion adds to the general
case that private returns to innovation tend to fall short of the social returns, calling for
government intervention.
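The wedge between private and social returns can be illustrated with a small expected-value sketch. The first mover is assumed to pay the full discovery cost and, if successful, to keep only a fraction of the rent because imitators capture the rest. All parameter names and numbers are hypothetical and serve only to illustrate the argument.

```python
def private_return(p_success, rent, cost, share_kept):
    """Expected return to the first mover, who keeps only part of the rent."""
    return p_success * share_kept * rent - cost


def social_return(p_success, rent, cost):
    """Expected return to society: the whole rent counts, whoever earns it."""
    return p_success * rent - cost


if __name__ == "__main__":
    # Hypothetical project: 50% chance of success, rent of 100, discovery cost 30,
    # and the innovator keeps half of the rent after imitation.
    p, rent, cost, kept = 0.5, 100.0, 30.0, 0.5
    print("private:", private_return(p, rent, cost, kept))
    print("social: ", social_return(p, rent, cost))
```

With these numbers the project is worth undertaking from society's point of view, but the first mover expects a loss and prefers to wait and see. Raising the share of the rent kept by the innovator, or subsidising the discovery cost, closes the gap in this sketch.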
Directed technological change
A problem for emerging economies is that most technologies are invented
targeting the conditions of advanced countries. The reason is that markets in advanced
countries are large and offer higher return prospects than those in emerging markets.
In agriculture, for instance, most innovations relate to crops of temperate
zones – where rich countries are – and are thus not suitable for
developing countries located in the tropics. In manufacturing, many process innovations
tend to economize on labour and are specially designed to improve the productivity of
skilled labour, thus not matching the abundance of unskilled labour that is typical in
poor countries.
To the extent that innovations do not fit well with the characteristics of
developing countries, their adoption “tout court” would result in productivity gaps that
could not be eliminated over time209. Hence, an effective technology transfer may
require the recipient country to spend real resources in order to master
the foreign technology and adapt it to the local environment, preferences and beliefs.
208
Hausmann and Rodrik (2006).
209
Atkinson and Stiglitz (1969) and Basu and Weil (1998).
To the extent that mastering a foreign technology involves a fixed cost, an
entrepreneur in a developing country will only engage in such an effort if he is able to
protect the mastered technology from his competitors. The problem with many
developing countries is that the enforcement of property rights there is so weak that it
does not pay for innovators to spend resources adapting foreign technologies to make
them suitable to the local environment. Instead, engineers in developing countries may
find it more profitable to design new products targeting the needs of industrial
countries, where patented inventions can be sold at the minimum scale needed to recover the
R&D costs210. A corollary of this discussion is that the enforcement of intellectual
property rights in developing countries is an essential prerequisite to expand the size of
their markets and thereby induce investments in technology adaptation.
Institutions do not travel well
The idea that “technology” needs to be adapted to match the conditions of the
recipient country does not only apply to machinery, but also to policies and institutions.
Indeed, just like the effectiveness of a given machine in a particular location depends on
the availability of labour skills, the effectiveness of a given policy or institution may
depend on how this new policy or institution interacts with existing policies, institutions
or other deep characteristics of the country, including culture, beliefs and social
norms.
Examples of complementarities involving institutions and policies abound. For
instance: financial liberalization may be efficiency enhancing, but it may also lead to
crises in the absence of effective supervision; privatization of utilities can be a good
thing, but it can also be welfare reducing if there is no competition authority protecting
consumers from price abuses; implementing a private property system is in general
favourable to long term investment, but it will fail to do so if it lacks an effective
judiciary to enforce the property claims.
Because the effectiveness of policies and institutions is largely conditional on
this type of complementarities, the choice of an appropriate sequence in the reform
process is a matter of great concern for policymakers and international organizations.
Along these lines, some authors have argued that optimal policies are not
independent of how far a country is from the technological frontier. The main argument
is that, as a country develops, the main source of technological change shifts from
“imitation” to “invention”. Thus, institutions that are important to support imitation in
the first stage (such as long-term banking finance, selective import protection,
targeted industrial support, eventually coupled with low enforcement of property
rights), turn out to be insufficient or inadequate to sustain economic growth in the
second stage. In the second stage, free entry, open competition, trade openness, and
flexible labour markets are critical ingredients to provide a selection mechanism to
weed out unprofitable projects211.
210
Acemoglu and Zilibotti (2001) stress the role of intellectual property rights in explaining why most
technological change is directed towards the needs of rich countries rather than those of poor countries.
211
Using cross-country data over the 1960-2000 period, Acemoglu et al. (2004) found that trade openness
and a high degree of product market competition are more important in frontier economies than in
countries that lag behind. See also Acemoglu et al. (2002) and Aghion and Howitt (2005).
Sometimes, the simple imitation of foreign institutions may turn out to have
adverse effects. A suggestive example happened in the Bombay Deccan, in
nineteenth-century colonial India. The reform consisted of the introduction of civil courts, to
improve the effectiveness of contracts in general. These courts, however, interacted
adversely with the credit market for agriculture. Before courts were introduced, a
traditional practice existed of lenders subsidizing farmers’ investments during bad
harvests. The newly established civil courts were able to enforce simple debt contracts,
but not the complex informal risk-sharing arrangements that had proved to
work well in the past. The implication of the reform was to make farmers more
vulnerable to bad harvests [212].
This discussion points to the idea that “institutional innovations do not
necessarily travel well” [213]: simply copying institutions that perform well in a given
context does not necessarily deliver the highest possible economic performance in a
different context. The optimal policy may involve adapting the institutional
arrangements to fit a country's set of characteristics.
An example of a successful adaptation is described in Box 9.6.
Box 9.6. Islamic Finance
The Islamic laws (Sharia), which rule the social, political and economic aspects
of Islamic societies, encourage hard work, fair dealing, property rights, and the honouring
of contracts. They also approve the earning of profits, because profits reward successful
entrepreneurship and reflect the creation of additional wealth. The Sharia, however,
prohibits interest payments. The reason is that interest is a predetermined cost that is due
irrespective of the business outcome.
By banning interest payments, the Sharia precludes the use of bonds and the
development of banking, at least with the same design as in industrial economies. This,
in turn, leads to insufficient savings, low investment and low growth.
However, because the Islamic doctrine advocates profit sharing (qirad), a
window is open for financing mechanisms alternative to credit. A widely accepted
scheme is called Mudarabah. This is a form of “venture capital”, under which one party
provides the capital for a project and the other party provides labour effort. The
principle is that providers of funds become partners instead of creditors: if the enterprise
succeeds, they share profits; if it fails, they lose the capital and the working time
invested. Such a contract reflects the ideal cooperative spirit of Islam: borrowers and
lenders share losses as well as rewards.
Another popular mechanism is the Murabaha: this consists of a purchase and
resale contract, in which the bank purchases goods from the producer with the promise
to re-sell them at an agreed-upon date at an agreed-upon marked-up price. Although it
looks like a debt instrument, the Murabaha is viewed as legitimate by the Islamic laws,
because the financier bears risk during the period in which he owns the goods.
[212] Kranton and Swami (1999).
[213] Rodrik (2005). Along the same line of reasoning, Douglass North (1994, p.8): “(…) transferring the formal
political and economic rules of successful western market economies to third world and Eastern European
economies is not a sufficient condition for good economic performance”.
The principles of Islamic Finance were already practiced in Muslim societies
throughout the Middle Ages, but it was after its inception in Egypt in 1963 and in Dubai in
1975, and the opening of the first Islamic bank subsidiary of a Western bank
(Citibank) in Bahrain in 1996, that Islamic Finance flourished around the world. Today,
there are more than three hundred Islamic financial institutions operating in more than
75 countries, including in Europe and the United States. These institutions offer a wide
set of instruments targeting the needs of providers and users of funds: murabaha (trade
with mark-up financing), bay’ salam (forward sale), bay’ mu’ajjal (deferred payment
sale), ijara (leasing), mudaraba (profit-sharing), musharaka (partnership). These
instruments can then be combined to build a wider range of complex financial
instruments.
The emergence of Islamic banking is creating big challenges for policymakers.
These include developing a framework to implement monetary policy and adapting the
Western institutions of supervision and regulation. Without question, however, Islamic
Finance has proved to be a successful way of providing funds to entrepreneurs in
compliance with a specific set of beliefs. It offers a good example of the principle that
sometimes it is better to have a well-adapted institution than simply to imitate the original
one without taking into account the local circumstances [214].
9.5. A simple model of technology adoption
This section presents a growth model specifically designed to illustrate the
problem of a technological follower aiming to implement technologies developed
elsewhere. In this model, country characteristics and innovation efforts determine how
far the country stays from the world technological frontier, while its long run growth is
linked to the world rate of technological progress [215].
Modelling technology adoption
Consider a small emerging economy, which instead of inventing its own
technology, adopts (and adapts) technology invented elsewhere.
Output consists of a homogeneous good, Y, produced with a Cobb-Douglas
technology:
$$Y = A K^{\beta} (\lambda N)^{1-\beta}, \tag{9.1}$$
where K includes both human and physical capital and $\lambda$ measures the efficiency of
labour.
Output can be spent in consumption (C), investment (I = sY) or in deliberate
efforts to adopt new technologies (R). The latter includes the spending of resources to
discover or to adapt the technologies that better match the country's needs. In a broad
interpretation, you may also interpret R as including government expenditures, such as
subsidies to innovation and the provision of public infrastructures that are
complementary to private investments. Although these expenditures do not fit the
conventional definition of R&D, they play a similar role, in that they imply the use of
resources with valuable alternative uses with the aim of raising the country's
technological level.
[214] For more on Islamic banking, see El Qorchi (2005) and Iqbal (1997).
[215] The model adapts from Klenow and Rodriguez-Clare (2005). Other models in this class include
Howitt (2000), Parente and Prescott (1994), Eaton and Kortum (1996), Barro and Sala-i-Martin (1997),
and Nelson and Phelps (1996).
The capital stock evolves as in the basic Solow model with an exogenous
saving rate:
$$\dot{K}_t = sY_t - \delta K_t$$
Similarly, it is assumed that a constant fraction of GDP is devoted to adopting and
adapting the foreign technologies:
$$R = s_R Y \tag{9.2}$$
The level of technology is assumed to evolve according to:
$$\dot{\lambda} = b\,\frac{R}{N}\left(\frac{\bar{\lambda}}{\lambda}\right)^{1/\sigma}, \quad \text{with } \sigma > 1, \tag{9.3}$$
where b is a positive parameter measuring the “productivity of the adoption efforts”,
$\bar{\lambda}$ represents the world technological frontier and $\sigma$ is a parameter measuring the
strength of technological diffusion into the country.
The first term in (9.3) captures the impact of the technology adoption efforts [216].
With this specification, a deliberate effort to adopt (or adapt) foreign technologies is
necessary for technology to diffuse into the country: when no resources are spent, there
is no assimilation of foreign technologies and the economy does not grow at all. The
impact of the research effort on technological adoption is mediated through the
productivity parameter b. In a narrow interpretation, this term may be seen as capturing
the skills of engineers and scientists. In a broad interpretation, parameter b may be
interpreted as capturing the influence of barriers to technological adoption, such as
licensing, legal restrictions to the establishment of firms and low enforcement of
property rights. These barriers increase the costs of adopting foreign technologies,
implying that more resources R are needed to achieve a given improvement in
technology. In terms of our model, b gets smaller.
The second term in (9.3) captures the “benefits of backwardness”: other things
equal, the more knowledge remains to be absorbed by the country (the larger
$\bar{\lambda}/\lambda$), the higher will be its rate of technological progress. The rationale is that,
as the country approaches the frontier, fewer and more complex ideas will be available for
copying, so the cost of achieving a given improvement in technology increases. Countries
that are backward relative to the frontier technology will achieve greater improvements in
technology for each unit of output spent in technology adoption. The parameter
$\sigma > 1$ imposes diminishing returns on the benefits of backwardness.
[216] In (9.3), the country's expenditure in technology adoption, R, is divided by population, N, so as to avoid
scale effects. Since R depends linearly on Y, the change in technology will depend on per capita income
and, thereby, on the level of $\lambda$. Hence, the model captures the “standing on shoulders effect”, without the
drawback of a scale effect.
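Substituting (9.2) in (9.3) and dividing both sides by $\lambda$, one obtains:
$$\frac{\dot{\lambda}}{\lambda} = b\, s_R\, \tilde{y} \left(\frac{\bar{\lambda}}{\lambda}\right)^{1/\sigma}, \tag{9.4}$$
where
$$\tilde{y} = \frac{Y}{\lambda N} = A^{\frac{1}{1-\beta}} \left(\frac{K}{Y}\right)^{\frac{\beta}{1-\beta}}. \tag{9.5}$$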
Equation (9.4) reveals that both the technology adoption effort, $s_R$, and the
respective productivity, b, affect the country's rate of technological progress, conditional
on output per unit of efficiency labour, $\tilde{y}$. This property of the model captures the
interactions between the innovation effort, capital availability (human, physical) and
efficiency (A). Thus, policies leading to a higher investment in physical or in human
capital and policies aiming to improve static efficiency A will enhance the activity of
technological adoption, translating into a faster technological catch up. The model
therefore establishes a causal relationship running from aggregate efficiency and capital
intensity to the pace of technology adoption.
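To make the interaction concrete, here is a minimal numerical sketch of (9.4) and (9.5). The function names and all parameter values are hypothetical and chosen only for illustration; `beta` stands for the capital share in the production function, `sigma` for the diffusion parameter in (9.3), and `gap` for the ratio of the world frontier to the country's technology level.

```python
def output_per_efficiency_unit(A, capital_output_ratio, beta):
    """Equation (9.5): y~ = A^(1/(1-beta)) * (K/Y)^(beta/(1-beta))."""
    return A ** (1.0 / (1.0 - beta)) * capital_output_ratio ** (beta / (1.0 - beta))


def adoption_growth_rate(b, s_R, y_tilde, gap, sigma):
    """Equation (9.4): growth rate of the country's technology level.

    gap is the frontier-to-country technology ratio (above 1 when the country lags).
    """
    return b * s_R * y_tilde * gap ** (1.0 / sigma)


# Illustrative (hypothetical) values, not taken from the text.
beta, sigma = 0.5, 2.0
b, s_R, gap = 0.1, 0.05, 9.0

# Same adoption effort and same gap, but two different capital-output ratios.
y_low = output_per_efficiency_unit(A=0.5, capital_output_ratio=2.0, beta=beta)
y_high = output_per_efficiency_unit(A=0.5, capital_output_ratio=3.0, beta=beta)

g_low = adoption_growth_rate(b, s_R, y_low, gap, sigma)
g_high = adoption_growth_rate(b, s_R, y_high, gap, sigma)

print(f"K/Y = 2: y~ = {y_low:.3f}, technology growth = {g_low:.4f}")
print(f"K/Y = 3: y~ = {y_high:.3f}, technology growth = {g_high:.4f}")
```

With the adoption effort and the gap held fixed, the more capital-intensive economy absorbs technology faster, which is the interaction described above.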
The Steady state
The benefits of backwardness imply that technology absorption is faster when
the country's technological gap is larger. Thus, there is a force pulling the country
towards the frontier. In the steady state, this force is powerful enough to ensure that its
productivity level will grow at the same rate as the world technological frontier. In other
words, the country will evolve along a balanced growth path that is parallel to that of
the world economy.
The world technological frontier is assumed to expand at the exogenous rate
$\gamma$ [217]:
$$\bar{\lambda} = e^{\gamma t} \tag{9.6}$$
The steady state of the model is obtained by setting $\dot{\lambda}/\lambda = \gamma$ in (9.4). Solving for
the technological gap, this implies:
$$\frac{\bar{\lambda}}{\lambda} = \left(\frac{\gamma}{b\, s_R\, \tilde{y}}\right)^{\sigma} \tag{9.7}$$
Equation (9.7) states that the steady state technological gap is positively affected
by the world rate of technological progress ($\gamma$) and negatively affected by the country's
innovation effort ($s_R$) and the productivity of the innovation effort (b).
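These comparative statics can be checked numerically. The sketch below evaluates (9.7) for a few hypothetical parameter combinations, taking output per unit of efficiency labour (`y_tilde`) as given, and verifies that the resulting gap indeed makes the country's technology grow at the world rate in (9.4). All names and numbers are illustrative assumptions.

```python
def adoption_growth_rate(b, s_R, y_tilde, gap, sigma):
    """Equation (9.4): growth rate of the country's technology level."""
    return b * s_R * y_tilde * gap ** (1.0 / sigma)


def steady_state_gap(gamma, b, s_R, y_tilde, sigma):
    """Equation (9.7): frontier-to-country technology ratio in the steady state."""
    return (gamma / (b * s_R * y_tilde)) ** sigma


# Illustrative (hypothetical) values; y_tilde is held fixed across cases.
gamma, sigma, y_tilde = 0.02, 2.0, 0.5
cases = {
    "baseline": dict(gamma=gamma, b=0.1, s_R=0.05),
    "higher adoption effort": dict(gamma=gamma, b=0.1, s_R=0.10),
    "lower barriers (higher b)": dict(gamma=gamma, b=0.2, s_R=0.05),
    "faster world frontier": dict(gamma=0.03, b=0.1, s_R=0.05),
}

gaps = {
    name: steady_state_gap(y_tilde=y_tilde, sigma=sigma, **p)
    for name, p in cases.items()
}
for name, gap in gaps.items():
    print(f"{name:28s} steady-state gap = {gap:.2f}")

# Consistency check: at the baseline steady-state gap, (9.4) returns gamma.
check = adoption_growth_rate(0.1, 0.05, y_tilde, gaps["baseline"], sigma)
print(f"growth at the baseline steady state = {check:.4f}")
```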
[217] Klenow and Rodriguez-Clare (2005) extend the model for the world technological frontier to be driven
by the research efforts of all countries in the model (see also Howitt, 2000, and Eaton and Kortum, 1999).
The student will thank us for skipping that complication.
The steady state level of per capita output in this model can be obtained by referring
to equation (3.10) in the Solow model, which is repeated here:
$$y_t^* = A^{\frac{1}{1-\beta}} \left(\frac{s}{n+\delta+\gamma}\right)^{\frac{\beta}{1-\beta}} \lambda_t. \tag{9.8}$$
Solving together (9.7) and (9.8), using (9.6) and the definition $\tilde{y} = y/\lambda$, one
obtains the steady state level of per capita income in this extended version of the
neoclassical model:
$$y_t^* = A^{\frac{1+\sigma}{1-\beta}} \left(\frac{s}{n+\delta+\gamma}\right)^{\frac{\beta(1+\sigma)}{1-\beta}} \left(\frac{b\, s_R}{\gamma}\right)^{\sigma} e^{\gamma t}. \tag{9.9}$$
As in the Solow model, country characteristics determine the level of income per
capita, but not its steady state growth rate: the long run growth rate is exogenous and
given by the world rate of technological progress.
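For readers who want the intermediate steps, the algebra behind (9.9) can be sketched as follows. The notation is this sketch's own reading of the equations above: $\beta$ is the capital share, $\delta$ the depreciation rate, $\gamma$ the world rate of technological progress and $\sigma$ the diffusion parameter.

```latex
\begin{align*}
\lambda_t &= \left(\frac{b\, s_R\, \tilde{y}^*}{\gamma}\right)^{\sigma} e^{\gamma t}
  && \text{from (9.7) and (9.6)} \\
\tilde{y}^* &= \frac{y_t^*}{\lambda_t}
  = A^{\frac{1}{1-\beta}} \left(\frac{s}{n+\delta+\gamma}\right)^{\frac{\beta}{1-\beta}}
  && \text{from (9.8)} \\
y_t^* &= \tilde{y}^* \lambda_t
  = \left(\frac{b\, s_R}{\gamma}\right)^{\sigma} \left(\tilde{y}^*\right)^{1+\sigma} e^{\gamma t} \\
  &= A^{\frac{1+\sigma}{1-\beta}}
     \left(\frac{s}{n+\delta+\gamma}\right)^{\frac{\beta(1+\sigma)}{1-\beta}}
     \left(\frac{b\, s_R}{\gamma}\right)^{\sigma} e^{\gamma t}
\end{align*}
```

The term $(\tilde{y}^*)^{1+\sigma}$ is where the Solow parameters pick up the extra exponent: they matter once directly, through output per unit of efficiency labour, and $\sigma$ times more through the technological gap.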
Also like the Solow model, this model does not predict absolute convergence of
per capita incomes, but rather conditional convergence: differences in parameters translate
into level differences across countries. In the long run, countries will evolve along
parallel growth paths, an implication that is supported by the general evidence on
conditional convergence (see Box 9.8).
There are, however, three main differences relative to the Solow model. First, the
exogenous rate of technological progress now applies to the world as a whole and the
model determines how close the country gets to the world technological frontier.
Second, the country’s ability to approach the world technological frontier depends on
the proportion of income spent in the technology adoption efforts (innovation,
adaptation, addressing specific market failures), $s_R$, and the productivity of these
efforts, b, which depends on political, social and economic factors. Third, the influence of
the “old” parameters, namely aggregate efficiency (A), the propensity to invest in
physical (human) capital (s), and the population growth rate (n), is amplified by a factor
of $(1+\sigma)$: their exponents in (9.9) are $(1+\sigma)$ times those in (9.8). This reflects the
above mentioned interaction between capital availability, TFP and innovation efforts.
Transition dynamics
As in the basic Solow model, the principle of transition dynamics applies to
changes in the exogenous parameters: a favourable change in a parameter will produce a
level effect and a transitory period during which the country approaches the World
technological frontier. During this period, the country will exhibit faster growth than the
world average, but this will be temporary. In the long run, despite the differences in the
behavioural parameters, the growth rate of per capita income in the country will be
equal to that of the World frontier.
To see how the model works in the short run, let’s depict in Figure 9.1 the
country's rate of technological adoption, $\dot{\lambda}/\lambda$, as a function of its technological gap,
$\bar{\lambda}/\lambda$ (curve CC). According to equation (9.4), this curve is upward sloping (a higher
technological gap implies a faster rate of technology absorption) because of the benefits of
backwardness. The figure also depicts the growth rate of the world technological
frontier, $\gamma$, which is independent of the country's technological gap and hence horizontal
(curve WW).
Now assume that the country is initially at point R, with a technological gap
equal to $(\bar{\lambda}/\lambda)_0$. With such a gap, the rate of technological expansion in the economy is
smaller than the world rate, that is, $\gamma_0 < \gamma$. This means that the country will be
diverging relative to the world economy. In the figure, the country will move
rightwards, from R to S (higher technological gap). As the technological gap widens,
the advantage of backwardness shows up more strongly, so the rate of technological
adoption (and, thereby, per capita output growth) increases. When point S is reached,
the growth rate of the economy is exactly equal to the growth rate of the world
technological frontier and the income gap stabilizes. By the same token, if the country
starts out on the right hand side of S, it will converge to S.
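The convergence to S from either side can be illustrated with a small simulation. The sketch below is only a rough illustration under a simplifying assumption that is not in the text: output per unit of efficiency labour (`y_tilde`) is held constant, so that the gap, g, follows dg/dt = g × (world rate − country rate from (9.4)). The function name, the Euler step and all parameter values are hypothetical.

```python
def simulate_gap(gap0, gamma, b, s_R, y_tilde, sigma, years=1000.0, dt=0.1):
    """Euler integration of the technological gap (frontier / country).

    Assumes y_tilde is constant along the path (a simplification).
    """
    gap = gap0
    path = [gap]
    for _ in range(int(round(years / dt))):
        country_growth = b * s_R * y_tilde * gap ** (1.0 / sigma)  # equation (9.4)
        gap += dt * gap * (gamma - country_growth)  # gap widens if country is slower
        path.append(gap)
    return path


# Illustrative (hypothetical) values.
params = dict(gamma=0.02, b=0.1, s_R=0.1, y_tilde=1.0, sigma=2.0)

# Steady-state gap from equation (9.7), i.e. point S.
gap_star = (
    params["gamma"] / (params["b"] * params["s_R"] * params["y_tilde"])
) ** params["sigma"]

from_left = simulate_gap(2.0, **params)   # starts to the left of S (like point R)
from_right = simulate_gap(8.0, **params)  # starts to the right of S

print(f"steady-state gap (point S): {gap_star:.3f}")
print(f"starting at 2.0 -> gap after 1000 years: {from_left[-1]:.3f}")
print(f"starting at 8.0 -> gap after 1000 years: {from_right[-1]:.3f}")
```

Starting to the left of S the gap widens; starting to the right it narrows; in both cases the path settles at the gap given by (9.7).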
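Figure 9.1: Transition dynamics and the steady state in the technology adoption model
[Figure: rate of technological adoption (vertical axis) against the technological gap $\bar{\lambda}/\lambda$ (horizontal axis); upward-sloping curve CC, horizontal line WW at $\gamma$, initial point R at $(\bar{\lambda}/\lambda)_0$ with growth $\gamma_0$, steady state S at $(\bar{\lambda}/\lambda)^*$.]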
What happens if the technological adoption effort increases?
Figure 9.2 describes the effect of a rise in the country's adoption effort. From
(9.4), an increase in $s_R$ causes the CC locus to shift upwards. This means that the steady
state moves to a different point (from S to S’).
If the country is initially in the original steady state S, on impact there will be
an acceleration in the growth rate of technology adoption (from $\gamma$ to $\gamma_R$, at point R).
Since the country's technology is now expanding faster than the world frontier, the
technological gap starts decreasing, meaning that the country moves leftwards, from R
to S’. As the technological gap decreases, the benefit of backwardness decreases,
implying a declining growth rate. When the new steady state is reached (S’), the country's
growth rate is the same as that of the world economy and the technological gap stabilizes.
Thereafter, the economy will evolve in parallel to the rest of the world, but with a lower
income gap than in the initial state.
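The size of the jump at R and of the fall in the gap can be worked out directly from (9.4) and (9.7). The sketch below uses notation that is not in the text: $g \equiv \bar{\lambda}/\lambda$ for the gap, $s_R'$ for the new adoption effort, and subscripts 0 and 1 for the old and new steady states. It relies on two observations: on impact $\tilde{y}$ cannot jump, and by (9.8) its steady state value does not depend on $s_R$.

```latex
\begin{align*}
\gamma &= b\, s_R\, \tilde{y}\, (g_0^*)^{1/\sigma}
  && \text{old steady state (point S)} \\
\gamma_R &= b\, s_R'\, \tilde{y}\, (g_0^*)^{1/\sigma}
  = \gamma\, \frac{s_R'}{s_R} > \gamma
  && \text{growth on impact (point R)} \\
g_1^* &= \left(\frac{\gamma}{b\, s_R'\, \tilde{y}}\right)^{\sigma}
  = g_0^* \left(\frac{s_R}{s_R'}\right)^{\sigma} < g_0^*
  && \text{new steady state (point S')}
\end{align*}
```

For instance, doubling the adoption effort with $\sigma = 2$ doubles the rate of technology adoption on impact and eventually cuts the technological gap to one quarter of its initial value.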
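Figure 9.2: The effect of an increase in adoption effort or of a decrease in barriers to
technology adoption
[Figure: the CC curve shifts up to CC’; the economy jumps from S to R (growth rate $\gamma_R$) and then moves leftwards along CC’ to S’ on the horizontal line WW, with the steady state gap falling from $(\bar{\lambda}/\lambda)^*_0$ to $(\bar{\lambda}/\lambda)^*_1$.]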
What happens when barriers to technology adoption decline?
Assume now that the economy opened to international trade, enabling the
economy to absorb foreign technologies faster than before, for each level of adoption
effort. In our model, this is captured by an increase in parameter b. In terms of Figure
9.2, the locus CC shifts up, just as in the earlier section. The adjustment mechanism is
similar to the one before: the economy will grow temporarily faster than the rest of the
world, but as the productivity gap declines, the growth rate approaches the world rate of
technological progress (point S’).
Note however that the two cases are distinct in one aspect: an increase in the
adoption effort involves the expenditure of resources, while an improvement in
productivity is, by definition, free. Hence, although s R and b have similar effects in
terms of Figure 9.2, the path of per capita consumption differs in the two cases. Figure
9.3 shows the difference. In the case of an increase in s R , there is an initial fall in per
capita consumption that may or may not be compensated by the temporary acceleration
that follows. In the figure, it is assumed that the acceleration effect dominates, but this
is not necessarily true. As in the Solow model, there is a golden rule for the optimal
spending on technology adoption [218].
[218] To see what the golden rule looks like in this model, you may investigate which values of $s_R$ and $s$
maximize the steady state level of per capita consumption, given by $(1 - s - s_R)\, y_t^*$. The solution is
$s = \beta$ and $s_R = \sigma(1-\beta)/(1+\sigma)$.
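The solution reported in footnote 218 can be verified with a short derivation. The sketch takes logs of steady state consumption, $c^* = (1 - s - s_R)\, y_t^*$, using the steady state income in (9.9); $\beta$ and $\sigma$ denote the capital share and the diffusion parameter, as assumed in this sketch's reading of the equations.

```latex
\begin{align*}
\ln c^* &= \ln(1 - s - s_R) + \frac{\beta(1+\sigma)}{1-\beta}\,\ln s + \sigma \ln s_R
  + \text{terms independent of } s \text{ and } s_R \\
\frac{\partial \ln c^*}{\partial s} = 0
  &\;\Rightarrow\; \frac{1}{1 - s - s_R} = \frac{\beta(1+\sigma)}{(1-\beta)\, s} \\
\frac{\partial \ln c^*}{\partial s_R} = 0
  &\;\Rightarrow\; \frac{1}{1 - s - s_R} = \frac{\sigma}{s_R} \\
\Rightarrow\quad s &= \beta, \qquad s_R = \frac{\sigma(1-\beta)}{1+\sigma},
  \qquad 1 - s - s_R = \frac{1-\beta}{1+\sigma}
\end{align*}
```

Substituting the last line back into the two first-order conditions, both sides equal $(1+\sigma)/(1-\beta)$, which confirms the solution.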
Figure 9.3: The effect of an increase in innovation effort or innovation productivity
[Figure: time paths of ln c following a change in b and following a change in $s_R$.]
What happens when the world rate of technological progress increases?
We now analyse the impact of a shock that is outside the control of an emerging
country: a change in the world rate of technological progress.
To analyse this, let’s refer again to Figure 9.1. Assume that initially the world
technological frontier was expanding at rate $\gamma_0$ and that our emerging economy was in
the corresponding steady state, with a constant technological gap equal to $(\bar{\lambda}/\lambda)_0$. Then
suppose that the world rate of technological progress accelerated once and for all to $\gamma$.
This change implies a shift in the country's steady state, from point R to point S.
As explained before, the adjustment to the new steady state is not instantaneous: the
faster expansion of the world technology frontier leads to a widening of the country's
technological gap. This, in turn, impacts positively on the country's rate of technology
adoption (due to the benefit of backwardness). When the technological gap is
sufficiently large, the country's rate of technological absorption is equal to the world rate
of technological progress $\gamma$ and the technological gap stabilizes at its new steady state
level, $(\bar{\lambda}/\lambda)^*$.
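Figure 9.4: The widening of income gaps
[Figure: time paths of ln $\bar{\lambda}$ (World) and ln $\lambda$ (Country); both grow at rate $\gamma_0$ before the shock and at rate $\gamma$ in the new steady state, with the gap widening from $(\bar{\lambda}/\lambda)_0$ to $(\bar{\lambda}/\lambda)^*$.]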
Figure 9.4 displays the path of technology in the emerging country and in the
world (in logs, so as to stick with linearity): before the shock, the two technologies were
growing in parallel, with a gap equal to $(\bar{\lambda}/\lambda)_0$; after the shock, the two growth rates
depart from each other, resulting in an episode of temporary divergence; in the new
steady state, the two technologies evolve again in parallel, but the new technological
gap $(\bar{\lambda}/\lambda)^*$ is larger than the initial one.
All in all, the implication of the increase in the world rate of technological
progress is a widening of the emerging country's technological gap. In the long run, the
emerging country will grow as fast as the world economy, but only because it has fallen
sufficiently behind.
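A rough numerical illustration of this widening is sketched below. It combines (9.7) with the steady state value of output per unit of efficiency labour implied by (9.8), so that a faster world frontier affects the gap through two channels: directly, and by lowering steady state output per efficiency unit. Function names and parameter values are hypothetical and chosen only for illustration.

```python
def steady_state_y_tilde(A, s, n, delta, gamma, beta):
    """Output per unit of efficiency labour implied by equation (9.8)."""
    return A ** (1.0 / (1.0 - beta)) * (s / (n + delta + gamma)) ** (beta / (1.0 - beta))


def steady_state_gap(gamma, b, s_R, y_tilde, sigma):
    """Equation (9.7): frontier-to-country technology ratio in the steady state."""
    return (gamma / (b * s_R * y_tilde)) ** sigma


# Illustrative (hypothetical) country characteristics.
country = dict(A=0.5, s=0.2, n=0.01, delta=0.04, beta=0.5)
b, s_R, sigma = 0.1, 0.05, 2.0

gamma_before, gamma_after = 0.01, 0.02  # world frontier accelerates

y_before = steady_state_y_tilde(gamma=gamma_before, **country)
y_after = steady_state_y_tilde(gamma=gamma_after, **country)

gap_before = steady_state_gap(gamma_before, b, s_R, y_before, sigma)
gap_after = steady_state_gap(gamma_after, b, s_R, y_after, sigma)

print(f"before: y~ = {y_before:.4f}, steady-state gap = {gap_before:.2f}")
print(f"after:  y~ = {y_after:.4f}, steady-state gap = {gap_after:.2f}")
print(f"the gap is multiplied by {gap_after / gap_before:.2f}")
```

In this example the country ends up growing at the new, faster world rate, but at a much greater distance from the frontier than before.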
The great divergence revisited
The model just sketched offers an interpretation for the episode of the Great
Divergence: when Western Europe and the Western Offshoots entered modern growth,
the world technological frontier started growing faster than before. However, the rest of
the world did not enter modern growth immediately. Because of domestic
idiosyncrasies, countries such as India and China entered modern economic growth
only two centuries later.
According to the discussion in the previous section, the acceleration in the world
rate of technological progress should have caused the technological gap between the
leader countries and the rest of the world to increase. In fact, in the case of India and
China, their per capita incomes relative to the leader fell from 44% and 48%,
respectively, in 1700 to only 7.9% and 7.2% in 1965. As the income gaps got larger,
these countries started benefiting from faster technological diffusion, so at some point in
time (by the middle of the twentieth century) these countries stopped diverging (see Box
9.9).
Proximate causes versus ultimate causes once again
The model stresses the role of exogenous parameters, such as the saving rate (s),
the innovation effort ( s R ), the productivity of innovation (b) and total factor
productivity (A) in determining how close a country gets to the technological frontier.
One should remember, however, that these parameters can hardly be taken as
independent from each other. For instance, an increase in the technology adoption
effort, $s_R$, may induce organizational and political changes in the country, paving the
way for the easing of existing barriers to technological adoption: that would be a case of
perspiration bringing more inspiration.
In contrast, consider an economy that is closed to international trade, where
property rights are not enforced and where a privileged elite is able to block any attempt
to introduce new technologies, hence a country where productivity parameters b and A
are very low. In that case, one would expect people to save less and to dedicate fewer
resources to the adoption of new technologies (low perspiration because of low
inspiration) [219].
This brings us back to the distinction between the proximate causes and the
ultimate causes of economic growth. A reduced-form model such as the one described
above is helpful to understand the mechanics of economic growth and to discuss links
between critical behavioural parameters and economic performance. But if one really
wants to dig deeper and ask why some countries save more or invest more
in R&D than others, one has to depart from the simple equations outlined above and
ask what determines the parameters that are taken as exogenous in the model. In that
inquiry, the basic principle is that people respond to incentives: they will invest more
if they perceive this to be worthwhile. These perceptions, in turn, are influenced by
other factors that we need to focus on, including the quality of policymaking and of
institutions.
Box 9.9 Miracles and Disasters
Figure 9.5 plots the evolution of per capita incomes in two leader countries, the
United Kingdom and the United States, and six followers, Argentina, Portugal, China,
India, Botswana and Chad.
The facts in the figure are as follows:
- The two leader countries, the United Kingdom and the United States, have evolved
mostly in parallel for a long time, though with slightly different levels, which you may
relate to differences in the efficiency with which resources are used. You may think of
these two countries as defining the world technological frontier and sharing equally the
benefits of the world technological diffusion.
[219] Howitt (2000) proposed a Schumpeterian model of economic growth, where the innovation effort is
endogenously chosen by profit maximizing firms. The author shows that only in countries with a
minimum level of R&D productivity and with enough protection of property rights will firms find it
profitable to innovate. When these minimum conditions are not in place, firms prefer not to innovate and
the economy stagnates. This case is intended to capture the situation of very poor countries which have not
been able to achieve conditional convergence.
- In the figure, we see that China and India diverged relative to the leader
countries from the beginning of the sample up to the mid-twentieth century. According
to the model above, such divergence could be explained by an acceleration of
technological progress in the leader countries combined with the existence of country
specific barriers to the adoption of new technologies that prevented them from joining
the innovating club.
- By 1970, India appears to have settled into a parallel growth path vis-à-vis the
leader countries, without being able to catch up. This path is consistent with the idea that,
when the laggard economy gets sufficiently behind, the benefits of backwardness
prevent further divergence.
- In the second half of the twentieth century, some countries started approaching
the leader countries: Portugal in the early 1950s, China in the early 1960s, Botswana in
the mid-1960s. According to the model above, improvements in political, social and
economic environments may have helped increase permeability to technological
diffusion, leading people to invest more and to adopt foreign technologies. In light of
the model, such changes lead to a level effect in terms of per capita income and to a
transition period during which the country approaches the world technological frontier.
In the long run, each country is expected to stabilize in a parallel growth path vis-à-vis
the leader country.
- Over the period, Argentina has diverged relative to the technological frontier.
This case is symmetrical to the earlier one: something in the Argentinean polity has
evolved in the wrong direction, moving this country to a higher-level income gap.
- Per capita income in Chad has stagnated and even declined in some years: the
performance of this country suggests that extremely adverse local conditions prevented
the country from investing in the adoption of new technologies and enjoying the benefits of
backwardness.
Figure 9.5: Miracles and Disasters
[Figure: per capita GDP (vertical axis), 1870-2000 (horizontal axis), for the United Kingdom, United States, Argentina, China, India, Botswana, Chad and Portugal.]
9.6. Discussion
This chapter is devoted to the question of why economies do not always adopt
the most efficient technologies, even when these are readily available.
It was argued that international trade and factor mobility help diffuse
technologies across the board. The fact that imitation is cheaper than invention makes
backwardness potentially an advantage. The absorptive capability of a country depends,
however, on local characteristics, such as human capital endowments, complementary
inputs, infrastructure, institutions, and geography. Because countries differ with respect to
this set of characteristics, the technologies that better suit each country have to be
discovered and often adapted. All in all, although there are forces that create the
potential for a poor economy to catch up, this will not happen automatically. For a
laggard country, the critical challenge is how to adjust domestic policies so as to take
better advantage of the world technological diffusion.
A model was presented capturing these dilemmas. According to the model,
international technological diffusion prevents countries from drifting indefinitely apart
from each other: in the steady state, they will all grow at the same rate. How close each
country gets to the world frontier depends, however, on the quality of domestic policies
and institutions.
• The view that poor countries may catch up with rich countries by
imitating successful technologies without the need to invent everything
from scratch is labelled the “advantage of backwardness”.
• However, the transfer of technology is not automatic.
• In general, openness to trade and international factor mobility increase a
country's exposure to outside innovations, speeding up technological
change.
• Technological diffusion may be enhanced or slowed down depending on
a country's set of characteristics. This includes the availability of
complementary inputs and infrastructure, as well as the costs of
switching to the new technology (learning costs, network externalities).
Accumulated experience with an old technology may help or retard the
process, depending on how useful the inherited knowledge is to operate
with the new technology.
• Interest groups may block the adoption of foreign technologies,
whenever they have more to gain in maintaining the old technology.
• The fact that countries differ in the set of complementary inputs implies
that the appropriate technology differs from country to country. The
process of discovering which technology better serves a given country
involves positive externalities.
• The fact that many technologies are developed to match the
characteristics of industrial countries implies that the adoption of the
appropriate technology may involve some adaptation effort.
• The model of technological interdependence analysed in the chapter
distinguishes a world technological frontier, driven by the innovation
effort of all countries, and the technological level of an individual
country. The latter depends on its innovation efforts and also on how far
it is from the frontier.
• In light of the model, in the long run the country will evolve in parallel
to the world frontier. Its technological gap will depend on domestic
policies and innovation effort.
• In light of the model, an acceleration in the world rate of technological
progress translates into faster growth in the catching up country too, but
this will come with a lag, so during the adjustment process the income
gap vis-à-vis the leading countries increases. This model provides an
interpretation for the Great Divergence.
Key concepts
• The advantage of backwardness
• Tacit knowledge
• Leapfrogging
• Self-discovery
• Directed technological change
Essay questions:
a) Explain how the adoption of a new technology may be retarded by complementarities
relative to other factors and substitutability relative to older vintages.
b) Comment: “Institutions do not travel well”.
c) Explain why in many poor countries the simple adoption of foreign technologies results in
productivity gaps that cannot be eliminated over time.
d) Comment: “The advantage of backwardness implies that laggard countries are doomed
to grow faster than rich countries”.
Exercises
9.1
Consider a small emerging economy, which instead of producing its own technology,
adopts technology produced elsewhere. Output consists of a homogeneous good Y
produced according to the following production function: $Y = AK^{0.5}(\lambda N)^{0.5}$, where K
includes both human and physical capital and $\lambda$ measures the efficiency of labour.
Assume additionally that population is constant and that the depreciation rate is equal to 2%.
a) Find out the expression for the steady state as a function of A and s.
b) Now assume that the savings rate is 25% and A=0.4. Find out the steady state
and the corresponding level of output per capita as a function of $\lambda$.
From now on, consider that technology evolves according to:
$$\frac{\dot{\lambda}}{\lambda} = b\, s_R\, \tilde{y} \left(\frac{\bar{\lambda}}{\lambda}\right)^{1/\sigma},$$
with $\sigma = 2$, and with the world technological frontier
expanding at $\gamma = 0.02$ per year.
c) Interpret the expression above.
d) Find an expression for the technological gap in the steady state.
e) Now calculate the steady state technological gap for the following parameters
and interpret comparatively.
i. b=0.2, $s_R$=0.1, s=0.25, A=0.4.
ii. b=0.125, $s_R$=0.08, s=0.25, A=0.4.
iii. b=0.2, $s_R$=0.1, s=0.125, A=0.4.
iv. b=0.2, $s_R$=0.1, s=0.25, A=0.2.
v. b=0.2, $s_R$=0.1, s=0.25, A=0.4 and $\gamma = 0.03$.
9.2
Consider a closed economy where firms perceive the production function to be of the
form $Y = A K^{0.5} (\lambda N)^{0.5}$. In this economy the population is constant, the saving rate is
equal to s=0.2, the depreciation rate is equal to δ=0.03 and A=0.25.
a) Assume first that technology in this economy expands at 2% per year.

i. Find the steady-state in this economy and discuss the stability of
the equilibrium.
ii. Examine the implications of an increase in the efficiency
parameter from A=0.25 to A=0.5. Compute the new steady
state and explain with the help of a graph. Draw the time paths
of income per capita (y) and of the interest rate (r).
b) This economy does not produce its own technology and therefore adopts
technology produced elsewhere. Assume that technology in this
economy evolves according to:
1
   
 bss ~
y   , with   2 , b=0.1, sr = 0.08 and   0.02 .
 
iii. Interpret the expression above.
iv. Assuming that A=0.25, find out the steady-state value of the
technological gap and represent it graphically.
v. Assume that A increases to 0.5. Compute the technological gap in
the new steady state and explain graphically the adjustment
process. Compute the new steady state level of per capita
income. Explain.
vi. Starting out in a steady state where A=0.5, assume that the
foreign rate of technological progress decelerates to 0.01.
Compute the new steady state level of the gap. Draw the time
paths of income per capita (y) and of the efficiency of labour
($\lambda$) following this change.
Part III – Getting the prices right
“Economic history is overwhelmingly a story of economies that failed to
produce a set of economic rules of the game (with enforcement) that induce sustained
economic growth”. [Douglass North].
Learning Goals:
• Understand why we need a government
• Understand why the market fails in the presence of public goods
• Understand the optimal intervention rule
• Acknowledge the main sources of government failures and their
implications for economic performance
• Discuss the trade-off between efficiency and equity in a growth
perspective.
10.1. Introduction
This chapter focuses on the role of government in providing essential goods and
services that competitive markets do not generally produce. These include a physical
dimension (infrastructure, communications systems) and an institutional dimension (for
instance, the rule of law and regulatory agencies). Without a minimum provision of
these goods, the private economy will fail to operate efficiently, giving rise to
distortions and a poor allocation of resources. By providing these services, governments
enhance productivity, raising the incentives to produce and invest.
This does not mean that the larger the public provision the better. Government
activities are financed with taxes, which crowd out private investment. Hence, a
well-balanced intervention should weigh the efficiency-enhancing potential of public
expenditures against the distortionary effects of taxation. In this judgement, one also has
to take into account the government’s own limitations: because of different types of
inefficiencies, there is waste in the process of transforming tax proceeds into public
services.
This chapter addresses the trade-offs involved in the public provision of goods
and services that are essential to economic activity. Section 10.2 briefly reviews the role
of government in the economy. Section 10.3 describes the types of goods that
governments are thought to provide. Section 10.4 extends the basic Solow model by
adding a government sector that collects taxes and provides a public input. Section 10.5
analyses the trade-offs involved in government intervention. Section 10.6 concludes.
10.2. The role of government in the economy
In his masterpiece, The Wealth of Nations, Adam Smith (1776) used the metaphor of
the “invisible hand” to argue that the public interest would be best served if
governments allowed selfish individuals to pursue their own interest. The “profit
motive” would lead individuals, competing against each other, to supply the goods other
individuals wanted at the lowest possible price. Because only agents producing at the
lowest possible cost would survive, in a free-market system resources would not be
wasted and the economy would operate at its maximum level of efficiency [220].
Smith’s ideas influenced nineteenth-century economists, like John Stuart
Mill, who advocated the doctrine of laissez-faire. According to this doctrine, the
government should not interfere with the private sector by regulating or controlling
production. Free competition would serve the best interest of society. At the other
extreme of economic thinking, the nineteenth-century economist Karl Marx argued
that capitalism leads to grave income inequalities, and advocated a greater role for the
state in controlling the means of production.
Economics has progressed a lot since then. Without question, the “invisible
hand” argument still has great appeal. In general, there is a widely supported proposition
that greater economic freedom is related to better economic performance. But the
economics profession is well aware that government plays an important role as a
complement to the market. Although there is now widespread agreement that markets
and private entrepreneurship are at the heart of a successful economy, there is also a
recognition that an economy without government intervention will hardly work at all.
So, full-fledged laissez-faire is definitely ruled out.
An area where there is a broad consensus in favour of government intervention
is the need to provide public order and to set up a coherent system of (individual and
corporate) property rights and enforcement of contracts. The free-market system
requires that entrepreneurs who are investing in risky businesses have a high probability
of making a profit that rewards their investment and risk. The legal system must protect
the right to own property and defend that property against theft and other offences. Having a stake
in the future, agents will take a long-term perspective and will produce and invest.
Moreover, enforceability of contracts is a necessary condition for individuals to engage
in beneficial exchange. With insecure property and contract rights, agents will have no
incentive to engage in complex long-term operations and to take full advantage of the
benefits of specialization. They will instead tend to adopt shorter-term horizons,
investing in inexpensive technologies and relying on bribery and corruption to enforce
transactions. As argued by Stiglitz (2000), property rights and contract enforcement
may be seen as the foundations on which the market economy rests.
In general, as is well known, the free market is likely to produce too much of
some undesirable outcomes, such as pollution, and too little of some essential goods and
services, such as roads and public infrastructure. Economists refer to these problems
collectively as “market failures”. Market failures include inadequate provision of public
goods, externalities, imperfect competition, missing markets, information failures
and persistent unemployment. When there is a market failure, the market mechanism
does not produce the most efficient outcome.
220
Remember that, in terms of the AK model, a higher efficiency parameter (A) leads to faster growth. In
terms of the neoclassical growth model, this would mean a higher level of per capita income in the steady
state.
Governments can obviously do things that private agents cannot do. For
example, they have the right to force citizens to pay taxes; if citizens fail to do so,
governments can confiscate their property. Governments may also manipulate prices,
regulate markets and undertake production themselves. All in all, governments have the power
to interfere with and profoundly influence economic outcomes. If the intervention is
successful, private incentives will become more aligned with the social interest. This, in
turn, will induce a more efficient allocation of resources. As claimed by the Nobel
Laureate Douglass North and his co-author Robert Thomas (1973), “getting the prices
right” (that is, making individuals capture the social returns to their actions as private
returns) is good for growth [221].
10.3. Public inputs
Institutions
A fundamental function that underlies the origin of the state is the establishment
of otherwise missing but essential institutions.
Broadly speaking, institutions are social, humanly devised constraints that
govern human interactions [222]. Institutions are inherent to human societies: human
beings compete with each other for scarce resources. For that competition to result in
mutual gains, societies need to set up and enforce some supportive framework.
Institutions provide such a framework. Institutions are created to set out the “rules of the
game”, and thereby reduce uncertainty and the costs of transacting.
Some institutions are typically provided by the state, such as the judiciary, the
competition authority, international agreements and money. Others emerge
spontaneously from civil society. These include, for instance, codes of conduct,
professional associations and language.
Institutions have become increasingly complex over the course of human history. In
primitive hunter-gatherer societies the “rules of the game” were mostly encoded in
simple traditions, and enforced by a leader with extensive discretionary power (the “Big
Chief”), often under the supervision of some form of tribal council. At the time, simple
societal structures like that were enough to resolve most internal conflicts. However,
they could not, in general, support contractual arrangements among their members.
When, 10,000 years ago, humans moved to agriculture, the emergence of larger
and more structured societies made it necessary to create new supporting
institutions. In these societies, some individuals specialized in agriculture, some in
artefact production, some in trade and some others in defending the territory. In order to
support investment and to organize exchange and the division of labour in these more
complex communities, it became necessary to secure property rights and enforce
contracts: if private property was not protected and contracts freely entered into could
not be enforced, the incentives to invest and to conduct many types of beneficial trade
simply would not exist.
221
North and Thomas (1973).
222
Avner Greif defines an institution as “a system of rules, beliefs, norms, and organizations exogenous
to each individual whose behaviour they influence that together generate a regularity of behaviour in a
social situation” (Greif, 2009, p. 12). Douglass North defines institutions as “(…) the humanly-devised
constraints that structure human interaction. They are composed of formal rules (statute law, common
law, regulations), informal constraints (conventions, norms and self imposed codes of conduct), and the
enforcement characteristics of both” (North, 1993, pp 5-6).
Initially, these services were delivered by coalitions of privileged elites, who had
a stake in providing a stable framework within the community and who, at the same
time, had the power to assign and enforce their own privileges and property rights [223]. As
societies progressed, more open competition and the rise of new elites seeking
control of the state made necessary an increasingly complex network of social
arrangements. On the political front, it became necessary to regulate the election of
leaders and control their power. On the policy front, it became necessary to regulate
markets, to build resilience to economic shocks and to provide socially acceptable
income distributions [224]. In civil society, complementary institutions emerged, in the
form of social networks, providing group enforcement, punishment, relevant
information and even protection of property rights, wherever the formal mechanisms
were less effective or absent (Box 10.1 offers an example).
By rewarding certain types of behaviour and punishing others, institutions shape
the incentives to produce and invest and decisively influence the outcome of the
competition game over limited resources. Many authors believe that the Rise of the
West is largely attributable to the development of institutions that have allowed economic
agents to reduce transaction costs and uncertainty, thereby exploiting more fully the
potential gains from exchange [225].
Of course, institutions may also retard growth. For instance, regulatory agencies
may be used to prevent entry; courts may be commanded to resolve disputes
dishonestly; the police may be used to extract bribes from honest citizens, and so on.
Not surprisingly, many empirical studies have found a positive relationship between
economic growth and measures of institutional quality (see the example in Box 5.4).
223
North et al, 2006.
224
Rodrik et al (2004) distinguish four categories of (economic) institutions: “Market-creating
institutions”: those that protect property rights and ensure that contracts are enforced (without them,
markets either do not exist or perform very poorly); “Market-regulating”: those that deal with
externalities, economies of scale and imperfect information (regulatory agencies in telecommunications,
transport and financial services); “Market stabilizing”: those that assure low inflation, minimize
macroeconomic volatility and avert financial crises (central banks, exchange rate regimes, budgetary and
fiscal rules); “Market legitimizing”: those that provide social protection and insurance, redistribution and
manage conflict (pension systems, unemployment insurance schemes and other social funds). This
concept includes many dimensions of government intervention.
225
Acemoglu and Johnson (2005) investigated the type of institutions that are more important for
economic growth. The authors distinguish those institutions that regulate the (vertical) relationships
between political elites and citizens (“property rights institutions”) from those that facilitate (horizontal)
exchange between citizens, firms and financial intermediaries, such as laws, courts and regulations
(“contracting institutions”). Using historical data, the authors found that “property rights institutions” are
much more important to explain economic development than “contracting institutions”. The authors
interpreted this, arguing that private agents can overcome weak contracting institutions, by developing
alternative private mechanisms to enforce their contracts or to cover their risk (see Box 10.1).
Box 10.1 “Don Peppe”
Avinash Dixit, from Princeton University, offers a nice illustration of how
institutions that are essential to support economic transactions may emerge
spontaneously in society, if the legal system administered by the state is inefficient
or unable to provide them.
“In most economic transactions that can create economic gains for all parties,
some or all of them can gain an extra private benefit while hurting the others, by
violating the terms of their explicit or implicit agreement. The fear of such exploitation
by the other party may deter each from entering into the agreement in the first place.
This was brilliantly illustrated by Diego Gambetta in his ethnographic sociological
study of the Sicilian Mafia (1993, p. 15). In the course of his interviews, a cattle breeder
told him: “When the butcher comes to buy an animal, he knows that I want to cheat him
[by supplying a low-quality animal]. But I know that he wants to cheat me [by reneging
on payment]. Thus we need … Peppe [the Mafioso] to make us agree. And we both pay
Peppe a commission.” By providing a mechanism of contract enforcement, Peppe
makes it possible for the two to enter into a mutually beneficial transaction. And he
does this with a profit motive, exactly as would any businessperson providing any
service for which others are willing to pay”. (Dixit, 2007, p. 3).
Goods that governments are thought to provide
A first category of goods that governments ought to provide is public
goods. Public goods are a particular category of goods, which differ from private goods
in two main respects:
(a) Public goods are non-excludable: if they are provided at all, it will be
technically impossible to preclude anyone from consuming them. For instance, it will be
virtually impossible to preclude somebody crossing a street from having access to that
street’s public lighting. Other goods that are non-excludable include clean air, radio
broadcasts, and low inflation. The implication of non-excludability is that the benefits of public
goods cannot be confined to those who have paid for them.
(b) Public goods are non-rival: one person’s consumption does not diminish the
amount available to others. By contrast, if you eat an apple nobody else can eat the
same apple: the apple is a rival good. In the case of public goods, extending
provision to an additional user does not diminish the quantity available to other users.
For instance, one person’s enjoyment of a clean environment does not diminish the enjoyment of
others. A clean environment is non-rival. Other goods that are non-rival include cable
TV, macroeconomic stability, and the rule of law.
A good that is both non-rival and non-excludable is called a pure public good. A
classic example of a public good is national defence: once a country is protected from
foreign invasion, there is no extra cost in protecting a new citizen. So national defence
is a non-rival good. Furthermore, it will be impossible to exclude anyone from that
protection. Other pure public goods include radio broadcasts, street lighting, clean air,
macroeconomic stability and so on.
Private markets do not work at all well when goods are not excludable
(characteristic a): whenever it is impossible or difficult to preclude anyone from using a
service or consuming a good, it will be difficult to find someone voluntarily paying for
its production. This is because each individual will prefer not to pay and instead to take
a free ride on any eventual production that does appear. This means that provision of a
non-excludable good is not in general profitable [226].
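The free-rider logic can be illustrated with a minimal sketch of a linear public-goods game. The function name `payoff`, the number of agents, the contribution cost and the multiplier are hypothetical choices made only for illustration; they do not come from the text.

```python
def payoff(contributes: bool, others_contributing: int, n: int,
           cost: float = 1.0, multiplier: float = 3.0) -> float:
    """Payoff of one agent in a linear public-goods game (illustrative numbers).

    Each contribution costs `cost` and creates `multiplier * cost` of total
    benefit, which is shared equally by all n agents because the good is
    non-excludable.
    """
    total_contributions = others_contributing + (1 if contributes else 0)
    shared_benefit = multiplier * cost * total_contributions / n
    return shared_benefit - (cost if contributes else 0.0)


n = 10
everyone_pays = payoff(True, n - 1, n)   # all ten agents contribute
free_rider = payoff(False, n - 1, n)     # nine others pay, this agent does not
nobody_pays = payoff(False, 0, n)        # no one contributes
lone_payer = payoff(True, 0, n)          # only this agent contributes

print(f"everyone pays: {everyone_pays:.2f}")
print(f"free rider:    {free_rider:.2f}")
print(f"nobody pays:   {nobody_pays:.2f}")
print(f"lone payer:    {lone_payer:.2f}")
```

Under these assumed numbers, not paying is individually better whatever the others do, yet everyone is better off when all pay than when nobody does. This is the sense in which voluntary provision tends to fall short of the social optimum.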
The implication is that public goods will be under-supplied in a competitive
equilibrium, even if they are socially very important. To the extent that the whole
economy benefits from public goods, there is scope for government intervention. The
government clearly has an advantage over private markets, in that it has the power to
coerce citizens to pay taxes. With the tax proceeds, governments can finance the
non-excludable goods.
The second characteristic of public goods (non-rivalry) creates a second type of
inefficiency. In particular, it makes exclusion inefficient, even if achievable: if the
social marginal cost of one’s consumption is zero, why should consumption of this
good be charged for at all?
Some goods are excludable, but non-rival in consumption. As an example,
consider encoded TV. Encoded TV is a non-rival good, because one person’s
consumption doesn’t reduce another person’s. But it is excludable, since only people
who have access to a decoder can enjoy the service. Goods of this sort are called Club
Goods. Many public infrastructures fall in this category. These include highways,
railways, airports, ports, telephone networks and electricity systems. Consider, for
instance, an uncongested bridge. An uncongested bridge is non-rival in consumption.
But it is possible to preclude people from (or charge people for) crossing a bridge. This
property makes private provision of bridge crossing entirely possible. Still, to the extent
that the social cost of having an extra individual crossing the bridge is zero (i.e., crossing
the bridge is non-rivalrous), preventing it will not be, in general, efficient. The
government can fix this by publicly providing the bridge.
A number of government activities are motivated by information failures.
Information (or knowledge) is, in many respects, a public good, because it is non-rival.
However, information is not always perfectly available to all users. In some cases,
governments help make information (or its use) excludable, turning it into a club
good. This is the case of patents. In other cases, however, it is desirable to help
information diffuse faster. Governments may do this by helping consumers get the
information they need: for instance, by forcing firms to label their food products with
the true caloric content, or by forcing banks to indicate explicitly the effective rate of
interest on their loans. Other examples where the market may undersupply
information include weather forecasts and national statistics. By publicly providing this
information, governments may improve both the welfare and the productivity of their
constituencies.
Some goods share with public goods the characteristic that they are equally
available to all members of a group, but they are not purely non-rival: in the example of
the bridge, as more and more individuals cross the bridge, the facility may become
congested. With congestion, for a given quantity of available bridge crossings, the
quantity (or the quality) available to any one individual declines as other users congest
the facility. The marginal cost of an extra individual crossing the bridge (defined in
terms of the time lost in attempting to cross an overcrowded bridge) becomes positive,
implying that charging for its use becomes desirable. Many governmental activities,
such as highways, water systems, fire services, police and courts, are subject to
congestion.
226
Sometimes, spontaneous voluntary associations emerge to collectively assure the provision of public
goods. For instance, groups of neighbours pay voluntarily for local security patrols at night. But these
associations work better within small communities, as they are in general vulnerable to the free-rider problem.
A different category of goods refers to those that are rival in consumption but
whose consumption cannot be prevented. Goods in this category are called “common
goods”. A classic example is the stock of fish in international waters: the fish is a rival
good, but non-excludability in fishing may lead to a coordination failure, called the
“tragedy of the commons”: people with access to the “common pool” will try to extract
as much as possible without taking into account (because each individual is small) the
impact of their actions on the aggregate. This will eventually lead to over-fishing and the
depletion of the resource. In this case, excludability would be desirable. Governments
can fix this by coordinating the extraction activity, setting limits for each fisherman, so
as to assure its sustainability.
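The four categories discussed so far follow from two yes/no properties, rivalry and excludability. The sketch below encodes that two-by-two taxonomy; the function name `classify_good` and the dictionary of examples are illustrative choices, with the examples taken from the text above.

```python
def classify_good(rival: bool, excludable: bool) -> str:
    """Return the category of a good given its rivalry and excludability."""
    if rival and excludable:
        return "private good"
    if rival and not excludable:
        return "common good"
    if not rival and excludable:
        return "club good"
    return "public good"


# (rival, excludable) for examples mentioned in the text
examples = {
    "apple": (True, True),
    "fish stock in international waters": (True, False),
    "encoded TV": (False, True),
    "national defence": (False, False),
}

for name, (rival, excludable) in examples.items():
    print(f"{name}: {classify_good(rival, excludable)}")
```

Real goods often sit between these corners: a bridge moves from the club-good cell towards rivalry as it becomes congested, so the boolean flags are a simplification.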
Goods like education and health services are private goods in technical terms,
because the cost of extending the supply to more users is positive (they are rival) and
exclusion is relatively easy. Still, due to the positive externalities involved (the
community as a whole benefits from a higher education level and from a lower
incidence of diseases) these services will be under-supplied under laissez-faire. In these
cases, private provision is feasible (the good is excludable), but government
intervention is desirable in order to boost usage to a level closer to the social optimum.
Other goods that are entirely private, in the sense that they are rivalrous in consumption
and exclusion is feasible, but that tend to be undersupplied under laissez-faire are those
involving large economies of scale. Consider, for example, the case of postal services.
Because delivering letters involves costs (time, fuel) that depend on the distance
between the sender and the receiver, the closer the customers are to each other, the
lower the unit costs. A profit-seeking mail company will then prefer not to operate
in areas where there are only a few users. The government may however determine that
the provision of mail services should be equally available to all citizens, irrespective
of their residence. To assure this, it may decide to run the post office itself. The same
applies to other utilities, like water, sanitation, and electricity provision.
Along these lines, a category of market failure is labelled missing markets. In
general, private markets may fail to provide goods and services even when the cost of provision is
less than what individuals are willing to pay. For example, private markets do a poor
job in providing unemployment insurance and loans for research and development. The
government may then extend its intervention to boost these markets.
In practice, government spending covers different areas, including health,
education, infrastructure and communication networks, environmental management,
water and sanitation, information and communication, and scientific research. Because there
is no clear-cut distinction between goods that should only be provided privately and goods
that can be provided publicly, the adequate amount of public intervention is a matter of
dispute in the economics profession.
Intervention options
When markets fail to produce the first best allocation of resources, there is scope
for government intervention. This, in turn, may be achieved through different
instruments.
One option governments have is to take direct action, providing the goods and
services themselves. For example, if a government believes there is insufficient supply
of education services, it can decide to provide them itself, running public schools.
Public provision does not necessarily imply state ownership. For example,
governments can purchase goods and services from the private sector. This solution is
feasible, for example, with education services, garbage collection, and healthcare
services. A difficulty with this approach is that procurement contracts must be properly
designed, so as to avoid unnecessary waste and undesirable transfers from taxpayers to
contractors.
In many countries, infrastructure provision is private, in the sense that the
government assigns rights on highways, ports or airports. Still, the location and design
of the infrastructure is decided by the government, because the market fails to do this
properly.
Some market failures may be corrected at a distance. Through taxes, subsidies,
and rewards, the government has the potential to manipulate relative prices so that
private incentives become aligned with the public interest. For instance, governments
may promote the use of energy-efficient cars by taxing less efficient cars more heavily. In
education, the government can subsidize private institutions providing educational
services or directly support students with education vouchers.
Finally, the government can intervene using regulation and legal sanctions.
Governments have the right to create rules that regulate or otherwise restrict private
activities, so as to minimise the incidence of undesirable market outcomes. In most
countries, government agencies regulate what people can eat and drink, what kind of
houses they can live in, how many hours an employee can work at most, and how much
pollution a factory can produce. Regulation has no direct impact on the government budget,
but it imposes costs on economic agents by restricting their choices.
10.4. A simple growth model with government spending
The discussion above made the point that markets fail to provide some goods at
the optimal level. Governments have the potential to fix this, by publicly providing
these goods or by subsidizing their acquisition. This section extends the Solow model so
as to account for the role of government in providing essential services, and to capture the
policy trade-offs involved.
The model is formulated assuming diminishing returns to reproducible factors,
so the relationship between optimal intervention and efficiency translates into level
effects. Similar conclusions can, however, be drawn in terms of the AK model, with
the difference that efficiency will affect growth rates (this alternative formulation is
developed in Appendix 10.1).
Private productivity and public inputs
Consider an economy with a large number of identical firms. Each firm produces a
homogeneous consumption good according to the following production function:
$$Y_{it} = A_t K_{it}^{\beta} N_{it}^{1-\beta}, \qquad (10.1)$$
where N refers to labour and K refers to private capital (which may include human
capital).
Sticking with the assumption of exogenous growth, let’s consider again equation
(3.1):
$$A_t = A e^{gt}$$
In (3.1), we now distinguish two components: one related to “efficiency”, the
constant A, and the other related to “technological change”, which in this model is
assumed to evolve exogenously.
Public provision will be thought of as directly affecting the efficiency parameter, A.
The rationale is that an insufficient provision of public inputs results in
higher transaction costs. To the extent that a larger provision of government services
reduces these costs, resources will be freed to be invested in production.
Formally, we consider the following specification for the TFP term:
$$A_t = A e^{gt}, \quad \text{with } A = \left(\frac{G_t}{Y_t}\right)^{\alpha} \text{ and } \alpha > 0. \qquad (10.2)$$
In (10.2), G refers to the amount of services provided by the government. You
may think of it as including expenditures related to the enforcement of property rights, the
provision of public infrastructure, etc. Because the public good is non-excludable, it
enters individual production functions in the form of an externality.
Equation (10.2) states that the public input is essential to production: if G is
zero, private output will be zero (for instance, without a minimum protection of
property rights, production cannot take place). An increase in G raises the marginal
products of capital and labour. In this model, G has a non-cumulative nature (that is,
public spending must be renewed each period to sustain the private economy).
Note that in this model government services impact on the productivity of
individual firms through an external effect. That is, firms do not take into account the
impact of their own actions on the level of G. The relevant production function for
private decisions is (10.1), where At is taken as given. Also note that profit maximizing
firms will decline any invitation to share the costs of a spontaneous provision of G:
since G is non-excludable, each firm will prefer to free-ride on any eventual production.
Congestion versus non-congestion
According to (10.2), the output of the average firm i rises with the provision of
the public input relative to the size of the economy (Y). The implicit assumption is that
the public input is rival or subject to congestion: for a given level of public provision,
G, the amount of public input available to each firm declines as output (Y) increases. In
other words, when an individual firm expands its production, this acts as a negative
externality on other firms [227].
227
Equation (10.2) refers to the public input getting congested with the income level. A different question
is whether there is a congestion effect through income per capita: the rationale is that, as countries
become wealthier, more complex regulation is needed, calling for the provision of public goods to rise more
than proportionally. This hypothesis, known as Wagner’s Law (Adolf Wagner, 1883), is ignored here.
An alternative specification would be to assume that G, instead of G/Y, affects
TFP. In that case (addressed in Appendix 10.1), G is non-rival (an increase in the
number of users does not reduce the amount available to existing beneficiaries), so it
becomes a pure public good.
To distinguish the two cases, consider constitutional law and its
enforcement. Constitutional law is purely non-rival, in the sense that it will serve a
country’s population irrespective of its size. Producing a constitutional law entails large
economies of scale: the larger the country, the less each citizen is coerced to pay in the
form of taxes to finance its provision. The enforcement of
constitutional law, in turn, is subject to congestion: the larger the population, the more
courts and the more police services will be needed to enforce the law, everything else
constant.
To see how aggregate output relates to the availability of public input, just sum
the production function (10.1) across firms (remember they are all equal) and substitute
(10.2). You’ll get:
$$Y_t = G_t^{\frac{\alpha}{1+\alpha}} K_t^{\frac{\beta}{1+\alpha}} L_t^{\frac{1-\beta}{1+\alpha}}. \tag{10.3}$$
An interesting feature of this (aggregate) production function is that it still
exhibits constant returns to scale (CRS) on private inputs and government services
together (note that the sum of the three exponents is equal to one).
Moreover, the public input exhibits decreasing marginal returns: that is, each
extra road or mains water pipe has a positive impact on total factor productivity that is
lower than that of the road or the water pipe before. This raises the question as to whether
expanding public provision too much might be inefficient. In particular, if the public
input crowds out private capital, beyond a certain point, additional provision will have a
negative impact on output. The following discussion clarifies this.
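The two properties just described can be checked numerically. The sketch below assumes the congestion specification A = (G/Y)^α behind (10.3), writes alpha for the elasticity of TFP with respect to G/Y and beta for the capital exponent in (10.1), and uses made-up input levels; none of these numbers come from the text.

```python
# Sketch: the aggregate production function (10.3), assuming the congestion
# specification A = (G/Y)**alpha. All parameter values are hypothetical.

def aggregate_output(G, K, L, alpha, beta):
    """Closed form (10.3) for aggregate output."""
    scale = 1.0 + alpha
    return (G ** (alpha / scale)
            * K ** (beta / scale)
            * L ** ((1.0 - beta) / scale))

alpha, beta = 0.25, 0.3          # hypothetical elasticities
G, K, L = 2.0, 10.0, 5.0         # hypothetical input levels

Y = aggregate_output(G, K, L, alpha, beta)

# (10.3) must be consistent with the firm-level technology once A = (G/Y)**alpha.
implicit_rhs = (G / Y) ** alpha * K ** beta * L ** (1.0 - beta)
print("closed form:", Y, " implicit equation:", implicit_rhs)

# Constant returns to scale: doubling G, K and L doubles Y.
crs_ratio = aggregate_output(2 * G, 2 * K, 2 * L, alpha, beta) / Y
print("output ratio after doubling all inputs:", crs_ratio)

# Diminishing returns to the public input: each extra unit of G adds less.
gains = [aggregate_output(G + i + 1, K, L, alpha, beta)
         - aggregate_output(G + i, K, L, alpha, beta) for i in range(3)]
print("successive gains from one more unit of G:", gains)
```

The successive gains are all positive but shrinking, which is the diminishing-returns property discussed above.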
Factor income shares
In what follows, let’s assume that the public provision is financed with a tax on
output levied at rate τ. Each firm maximizes:
$$\pi_{it} = (1-\tau)Y_{it} - (r_t+\delta)K_{it} - w_t N_{it}. \tag{10.4}$$
The first order conditions of profit maximization are:
$$\frac{\partial \pi_i}{\partial N_i} = (1-\tau)(1-\beta)\frac{Y_{it}}{N_{it}} - w_t = 0, \text{ and} \tag{10.5}$$
$$\frac{\partial \pi_i}{\partial K_i} = (1-\tau)\beta\frac{Y_{it}}{K_{it}} - (r_t+\delta) = 0. \tag{10.6}$$
Since all firms are equal, this leads to the following factor income shares:
$$\frac{w_t N_t}{Y_t} = (1-\tau)(1-\beta), \text{ and} \tag{10.7}$$
$$\frac{(r_t+\delta)K_t}{Y_t} = \beta(1-\tau). \tag{10.8}$$
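A small numerical sketch of (10.5)-(10.8) follows. It assumes the Cobb-Douglas firm technology Y = A K^β N^(1-β) with A taken as given, approximates the marginal products by central differences, and uses hypothetical values for A, K, N, alpha and beta. The tax rate is set at α/(1+α) only to illustrate how the after-tax shares then line up with the exponents of the aggregate function (10.3).

```python
# Sketch: after-tax factor income shares implied by the firm's first-order
# conditions. Functional form and numbers are assumptions for illustration.

def firm_output(A, K, N, beta):
    """Firm-level technology (10.1), with A taken as given by the firm."""
    return A * K ** beta * N ** (1.0 - beta)

def factor_shares(A, K, N, beta, tau, h=1e-6):
    """Return (labour share, capital share) of output under an output tax tau."""
    Y = firm_output(A, K, N, beta)
    # Central differences approximate the marginal products of N and K.
    mp_labour = (firm_output(A, K, N + h, beta)
                 - firm_output(A, K, N - h, beta)) / (2.0 * h)
    mp_capital = (firm_output(A, K + h, N, beta)
                  - firm_output(A, K - h, N, beta)) / (2.0 * h)
    wage = (1.0 - tau) * mp_labour        # w from (10.5)
    user_cost = (1.0 - tau) * mp_capital  # r + delta from (10.6)
    return wage * N / Y, user_cost * K / Y

alpha, beta = 0.25, 0.3                   # hypothetical elasticities
tau_golden = alpha / (1.0 + alpha)        # candidate tax rate

no_tax = factor_shares(A=1.5, K=10.0, N=5.0, beta=beta, tau=0.0)
with_tax = factor_shares(A=1.5, K=10.0, N=5.0, beta=beta, tau=tau_golden)

print("shares without tax (labour, capital):", no_tax)
print("shares with tax    (labour, capital):", with_tax)
print("exponents in (10.3):", (1.0 - beta) / (1.0 + alpha), beta / (1.0 + alpha))
```

Without the tax the two shares add up to one, leaving nothing for the public input; with the tax they fall to the values implied by the aggregate production function.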
The market failure
Because government services arise as an externality, private firms do not
consider them as an input to production. They perceive the contribution of physical
capital to output as equal to β (as implied by 10.1), which is higher than the actual
contribution, β/(1+α) (as implied by 10.3). This means “prices are not right”: without
intervention (e.g., without the tax on production), factor rewards will not be aligned with
their effective productivities.
Note that, if no tax were collected, the capital and labour income shares would
be, respectively, β and 1−β. This means that all output would be exhausted on the
rewards to these two factors, with nothing left to finance the public input. Since,
according to (10.2), the public input is essential to production, without government
intervention this economy would not exist at all.
Getting the prices right
As equations (10.7) and (10.8) suggest, the government can use the tax rate so as
to get the factor rewards aligned with the public interest. The optimal tax rate, you may
guess, is the one that makes the after-tax factor income shares, (10.7) and (10.8), equal to
the actual contributions of capital and labour to output, as stated in (10.3).
Analytically, you may obtain the optimal tax rate τ^G (where the superscript G
refers to the Golden Rule) by solving the following equation:
$$\frac{\beta}{1+\alpha} = \beta\left(1-\tau^{G}\right)$$
This gives:
$$\tau^{G} = \frac{\alpha}{1+\alpha} \tag{10.9}$$
This discussion reveals that not all taxes have adverse effects: in some cases, an
appropriate choice of the tax rate constitutes an effective tool to get incentives right. The
mechanism is simple: we saw that a firm expanding its output imposes a negative
externality on other producers, via lower availability of public inputs. By setting a tax that
is proportional to output, the government has a perfect mechanism to deal with the
congestion problem: a rise in the level of production by an individual firm suffers a
penalty equal to the cost it imposes on others.
Moreover, as we will see next, this penalty raises government
revenues by exactly the amount needed to finance the increase in G that is necessary to
compensate the rest of the economy for the erosion of public services per unit of output.
10.5. Intervention trade-offs
Unproductive public expenditures
In order to make the model more interesting, let’s assume that a constant
fraction φ of the tax proceeds is lost in unproductive uses. Thus, total “unproductive
expenditures”, denoted by Ω, will be given by:
$$\Omega_t = \phi\tau Y_t \tag{10.10}$$
For the moment, just take φ as an exogenous and constant parameter. In Chapter
13 we will work out a model where this parameter is endogenous. The government
budget is assumed to be balanced at each moment in time: the government can neither
finance deficits by issuing debt nor run surpluses by accumulating assets. That is:
$$G_t - \tau(1-\phi)Y_t = 0 \tag{10.11}$$
Equation (10.11) shows that, when φ is positive, taxes are higher than the
minimum needed to finance a given level of public provision. Because of excess
bureaucracy, badly designed contracts, corrupt misappropriation of public resources, or
other inefficiencies, part of the tax proceeds will not translate into the provision of
public input.
It should be noted that the distinction between productive and unproductive
government expenditures has nothing to do with the distinction between government
consumption and government investment. Much of the government expenditures that
are classified as consumption in national accounts are, in our model, “productive”. For
instance, policemen’s wages and the electricity bill are classified as consumption, but
can be highly productive. By the same token, not all investment expenditures should be
classified as productive. Many public investment programmes give rise to “white
elephants” and many others become too expensive for what they achieve.[228]
A typical example of over-spending occurs with large-scale public investment
projects. The reason is that the risks involved - in case the project turns out to be more
costly than expected - are too big to be borne by private sector contractors. Given
the difficulty in finding firms willing to bear such a risk, contracts often leave
governments with part or all of the risk. This, in turn, generates an incentive
problem: the contractor may argue that costs are increasing and the government most
probably does not have enough information to argue against this. In addition, contractors know
that, facing the alternative of leaving the project incomplete, politicians will not in
general resist the pressure. This is a typical problem of moral hazard that leads to cost
overruns in government spending. Box 10.2 describes different reasons why
government actions may result in waste. Box 10.3 shows an attempt to measure
government waste in a sample of OECD countries.
[228] Of course, there are other sources of “unproductive” expenses not necessarily related to waste. These
include, for example, buying art for a national museum. Society may however prefer to bear the cost
in exchange for a more cultured society. Because this trade-off involves considerations other than those
related to efficiency, we skip that discussion.
Income and expenditure
In our model, output has four different uses: private consumption (C), private
investment (I), government productive expenditures (G) and government waste (Ω):[229]
$$Y_t = C_t + \Omega_t + I_t + G_t. \tag{10.12}$$
Equation (10.12) states that one unit of public input costs the same as one unit of
physical capital. Contrary to other models, however (for instance, the MRW model), in the
decentralized economy there is no arbitrage condition forcing the marginal products of
physical capital and of public inputs to be the same. Since G is not excludable, no
private agent will find it profitable to produce it. So it is up to the government to ensure
that this condition will hold after intervention.
The flow income chart
To make a long story short, all flow identities of the model are displayed in
Figure 10.1, which describes the flow income chart of this economy.[230]
[229] To keep the model simple, it is assumed that one unit of output can be either consumed or transformed
into one unit of capital or into one unit of public input. An equivalent assumption is that the production
functions for public input, capital goods and consumption goods are all equal.
[230] In this model, taxes are paid by firms. But it would be equivalent if a uniform tax were levied on factor
incomes. The only difference would be in who delivers the money to the government. We’ll return
to this discussion in Chapter 11.
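Figure 10.1: The flow income chart
[Figure: circular flow between Firms, Households, the Government and the capital market. Firms produce Y; the Government collects τY and spends it on G and Ω (τY = G + Ω); Households receive (1−τ)Y, consume C = (1−s)(1−τ)Y and save s(1−τ)Y; savings finance investment, I = K̇ + δK.]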
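As a quick consistency check on the flows in Figure 10.1 and identity (10.12), the sketch below splits one period's output among its four uses. The names tau (tax rate), phi (share of tax proceeds wasted) and s (saving rate out of disposable income), as well as all numerical values, are assumptions chosen for illustration.

```python
# Sketch: income and expenditure flows for one period, following (10.10)-(10.12).
# Parameter values are hypothetical.

def expenditure_flows(Y, tau, phi, s):
    """Split output Y into consumption, investment, public input and waste."""
    taxes = tau * Y                  # collected by the government
    disposable = (1.0 - tau) * Y     # left to households
    return {
        "C": (1.0 - s) * disposable,   # private consumption
        "I": s * disposable,           # private investment (= saving)
        "G": (1.0 - phi) * taxes,      # productive public input, (10.11)
        "waste": phi * taxes,          # unproductive expenditures, (10.10)
    }

flows = expenditure_flows(Y=100.0, tau=0.2, phi=0.25, s=0.3)
for name, value in flows.items():
    print(name, value)
print("sum of uses:", sum(flows.values()))
```

The four uses add up to total output, and the public input plus waste add up to tax revenue, as the balanced budget requires.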
The steady state
With the tax, households’ disposable income declines and so does the
amount of investment per unit of output. Comparing with the Solow model, you may guess
that the steady state level of per capita income is similar to (3.10), except that s is
replaced by s(1−τ) and A is defined as in (10.2). That is:
$$y_t^{*} = A^{\frac{1}{1-\beta}}\left[\frac{s(1-\tau)}{n+\delta+\gamma}\right]^{\frac{\beta}{1-\beta}}e^{\gamma t} = \left(\frac{G}{Y}\right)^{\frac{\alpha}{1-\beta}}\left[\frac{s(1-\tau)}{n+\delta+\gamma}\right]^{\frac{\beta}{1-\beta}}e^{\gamma t}. \tag{10.13}$$
Comparing with (3.10), you see that the government affects the steady state level
of per capita income through two different channels:
(i) First, a higher provision of public input raises the “efficiency term”, that is, the
productivity of private inputs, raising the steady state level of per capita income;
(ii) Second, a higher tax rate, by reducing disposable income and hence
the investment rate, impacts negatively on the steady state level of per capita income.
The golden rule
Equation (10.13) reveals a trade-off between the benefits and the costs of
intervention: on one hand, a larger provision of public input raises the productivity of
private capital, inducing a higher level of capital per worker in the economy; on the
other hand, taxes impact negatively on factor incomes and, through them, on savings and
investment. This section examines the optimal balance between these two effects.
Suppose you are a benevolent planner who wants to maximize the steady state
level of per capita consumption in this economy.[231] To do this, first you need to find
the expression for per capita consumption. Using (10.11) in (10.13) and the equation for
consumption in Figure 10.1, you get:
$$c_t^{*} = (1-s)(1-\phi)^{\frac{\alpha}{1-\beta}}\,\tau^{\frac{\alpha}{1-\beta}}(1-\tau)^{\frac{1}{1-\beta}}\left[\frac{s}{n+\delta+\gamma}\right]^{\frac{\beta}{1-\beta}}e^{\gamma t} \tag{10.14}$$
This equation re-states the above-mentioned trade-off, but now in terms of the
tax rate only. It also shows that a rise in government inefficiency, φ, by diverting funds
to unproductive uses, is equivalent to a decline in the saving rate: it crowds out private
investment without any positive impact on productivity. This has an unambiguous
negative impact on private consumption per capita.
If you choose the tax rate, τ, so as to maximize (10.14) - a simple but tedious
exercise - you’ll have the opportunity to confirm our previous guess, (10.9). An
interesting result is that this “golden rule tax rate” does not depend on φ.[232] Substituting
(10.9) in (10.13) and (10.14) you obtain the corresponding golden rule paths of per
capita income and per capita consumption.
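The maximization can also be checked numerically instead of by hand. The sketch below evaluates steady-state consumption per unit of efficiency labour, as in (10.14) with the time trend left out, on a grid of tax rates. The symbol names (tau, phi, alpha, beta) and every parameter value are assumptions made for this illustration; the grid search is a stand-in for the analytical first-order condition.

```python
# Sketch: numerical search for the golden rule tax rate implied by (10.14).
# All parameter values are hypothetical.

def steady_state_consumption(tau, phi, alpha, beta, s, n, delta, gamma):
    """Consumption per unit of efficiency labour in the steady state."""
    public_input_ratio = (1.0 - phi) * tau                  # G/Y from (10.11)
    investment_term = s * (1.0 - tau) / (n + delta + gamma)
    income = (public_input_ratio ** (alpha / (1.0 - beta))
              * investment_term ** (beta / (1.0 - beta)))   # (10.13)
    return (1.0 - s) * (1.0 - tau) * income

def golden_rule_tax(phi, steps=10000, **params):
    """Tax rate on a grid over (0, 1) that maximizes steady-state consumption."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda tau: steady_state_consumption(tau, phi, **params))

params = dict(alpha=0.25, beta=0.3, s=0.2, n=0.01, delta=0.05, gamma=0.02)

tax_efficient = golden_rule_tax(phi=0.0, **params)
tax_wasteful = golden_rule_tax(phi=0.3, **params)
closed_form = params["alpha"] / (1.0 + params["alpha"])

print("grid optimum with phi = 0.0:", tax_efficient)
print("grid optimum with phi = 0.3:", tax_wasteful)
print("alpha / (1 + alpha):        ", closed_form)
print("consumption at the optimum, phi = 0.0:",
      steady_state_consumption(tax_efficient, 0.0, **params))
print("consumption at the optimum, phi = 0.3:",
      steady_state_consumption(tax_wasteful, 0.3, **params))
```

Under these assumed values the maximizing tax rate is the same for both levels of waste, while the consumption level reached is lower when part of the proceeds is wasted.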
The bureaucracy right!
As mentioned before, raising government revenues by the amount needed to
provide the optimal level of public inputs - as implied by (10.9) - does not necessarily
imply that the government will actually provide the optimal level of public inputs:
remember from (10.11) that, if φ is positive, some fraction of the tax proceeds will be
wasted in unproductive uses.
With the tax rate satisfying (10.9), the amount of public input as a percentage of
total output is:
$$\frac{G}{Y} = (1-\phi)\frac{\alpha}{1+\alpha} \tag{10.15}$$
This means that, if you forgot to use φ as a control variable to maximize private
consumption (10.14), you did not act as a genuine benevolent planner. If you were really
a benevolent planner you should care for your constituents’ money, rather than allowing
it to be wasted by your bureaucrats in unproductive uses. You should pay close
attention to the way contracts with the public sector are designed, in order to keep the
incentives aligned with the public interest.
[231] Note that maximization of per capita consumption in the steady state does not need to correspond to
the maximization of social welfare. We abstract, however, from this complication by assuming that the
social value of a growth path depends only on the consumption stream associated with it in the steady state.
[232] Note that, by abstracting from the consumption-leisure choice, this model does not account for the
impact of taxation on the labour supply (this question is addressed in Mendoza et al., 1997). This
observation illustrates the general claim that the optimal tax structure is highly sensitive to the
formulation of the model. Taking this into account, one should take the model above as a mere illustration
of possible intervention trade-offs.
Thus, if you were really a benevolent planner, you should also set φ = 0, leading
to:
$$\left(\frac{G}{Y}\right)^{G} = \frac{\alpha}{1+\alpha} \tag{10.16}$$
This corresponds to the golden rule government size in this economy.
To interpret (10.16), note that producing one unit of public services costs the
same as one unit of output (equation 10.11). This means that the natural efficiency
condition for the size of the government is ∂Y/∂G = 1. According to (10.3), the
marginal contribution of G to aggregate output is ∂Y/∂G = [α/(1+α)](Y/G).
Substituting (10.15), this gives 1/(1−φ). Hence, only in case φ = 0 will each resource used
in the economy, either in the private sector or in the public sector, be worth the equivalent
of its opportunity cost and the economy will be operating efficiently.
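The efficiency condition can be illustrated numerically. The sketch below assumes the aggregate function (10.3) in its congestion form, picks the level of G that satisfies (10.15) for given private inputs, and differentiates output with respect to G by central differences. The names alpha, beta, phi and the values of K and L are hypothetical.

```python
# Sketch: marginal product of the public input at the golden rule tax rate.
# Functional form follows (10.3); all numbers are hypothetical.

def aggregate_output(G, K, L, alpha, beta):
    """Aggregate production function (10.3)."""
    scale = 1.0 + alpha
    return (G ** (alpha / scale)
            * K ** (beta / scale)
            * L ** ((1.0 - beta) / scale))

def marginal_product_of_G(phi, alpha, beta, K=10.0, L=5.0, h=1e-6):
    """dY/dG evaluated where G/Y = (1 - phi) * alpha / (1 + alpha)."""
    ratio = (1.0 - phi) * alpha / (1.0 + alpha)        # G/Y from (10.15)
    private_part = aggregate_output(1.0, K, L, alpha, beta)
    # Since Y = G**(alpha/(1+alpha)) * private_part, the level of G that
    # delivers the required ratio is (ratio * private_part)**(1 + alpha).
    G = (ratio * private_part) ** (1.0 + alpha)
    return (aggregate_output(G + h, K, L, alpha, beta)
            - aggregate_output(G - h, K, L, alpha, beta)) / (2.0 * h)

print("phi = 0.0:", marginal_product_of_G(0.0, alpha=0.25, beta=0.3))
print("phi = 0.5:", marginal_product_of_G(0.5, alpha=0.25, beta=0.3))
```

With no waste the marginal product of G equals its unit cost; with half of the proceeds wasted, too little G is provided and its marginal product is twice its cost.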
A Graphical illustration
Figure 10.3 plots the steady state level of per capita consumption (per unit of
efficiency labour), according to equation (10.14), as a function of the tax rate, for two
different levels of φ. The upper curve corresponds to a government without failures
(φ = 0).
For each given value of φ, there is a curve representing the relationship between
the government size and the steady state level of per capita consumption (per unit of
efficiency labour, L). At lower values of τ, the positive effect (i) described after equation
(10.13) dominates the negative effect (ii), so increasing the size of the government raises
per capita consumption. As the size of the government rises, the benefits of expanding
further the provision of public services decline, while the negative impact of taxation
(ii) rises. At higher levels of taxation, effect (ii) dominates (i), so a further rise in τ
decreases the steady state level of per capita consumption and the curve slopes
negatively.
As stated in equation (10.9), the golden rule tax rate does not depend on φ. This
means that, choosing the golden rule tax rate, you’ll achieve different levels of per
capita consumption in the steady state, depending on how efficient your government
is. The case with φ = 0 is the one that leads to the most consumption. The dashed curve in
Figure 10.3 corresponds to a case in which 0 < φ < 1, with a lower level of per capita
consumption for each tax rate than in the optimal case. When φ = 1, the provision of
public input is zero, so per capita income and per capita consumption will be zero too,
irrespective of the tax rate (the curve is flat and coincides with the horizontal axis).
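Figure 10.3. The trade-off between public provision and taxation
[Figure: steady state consumption per unit of efficiency labour (c̃, vertical axis) against the tax rate (τ, horizontal axis, from 0 to 1). The upper curve (φ = 0) reaches the first-best level c̃^FB; the dashed curve (0 < φ < 1) lies below it; the case φ = 1 coincides with the horizontal axis. The curves peak at τ^G = α/(1+α).]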
Efficiency and equity
As is well known, the price mechanism produces too much income
inequality. Without intervention, free market forces will turn some individuals
extremely wealthy and many others extremely poor. Most economists see an important
role for the government in income redistribution, namely by taxing the rich and creating
welfare programmes for the poor.
In general, policies aiming at reducing inequalities weaken economic incentives
and impact negatively on efficiency and on the saving rate. This gives rise to the well
known trade-off between efficiency and equity.
In terms of our model, a redistributive policy may be interpreted as an increase
in φ, causing the steady state level of per capita consumption to decline, while improving
a dimension of welfare that is not captured by the model. Whether this is
good or bad does not have a clear answer. Such an assessment would depend on how
society values efficiency versus equity, and economics has little to say about this. What
we know is that, in practice, developed societies deliberately choose to sacrifice some
consumption in order to obtain a more equal society.[233]
When inequality is extreme, however, there might be no trade-off between
efficiency and equity. The reason is that an extremely unequal society will also be a
society with social tensions. This, in turn, generates political instability, policy
unpredictability and threat of violence.
The United Nations (2005) makes the point: “Poverty increases the risk of
conflict through multiple paths. (…) Without productive alternatives, young people may
turn to violence for material gain, or feel a sense of hopelessness, despair and rage. (…) The
lack of economically viable options other than the criminal activity creates the seedbed
of instability – and increases the potential for violence” (pp. 6, 8).
Hence, in cases of extreme poverty, an effective redistributive policy may
actually impact positively on aggregate productivity. By levying taxes on the rich to
reduce social inequalities, the government will indirectly improve the enforcement of
the law and the protection of property rights. With a lower incidence of crime and
property offenses, there will be a friendlier economic environment for investment and
job creation. In a safer social environment, fewer private and public resources will be
needed to secure property and maintain public order. In that case the redistributive
policy may be seen as included in the public input that leads to higher productivity.[234]
[233] Note that this may give rise to a causal effect from high inequality to low growth: in an unequal
society, there will be social pressure for income redistribution, forcing the government to rely more on
distortionary taxation.
Box 10.2. Governments fail, too
The discussion in Section 10.2 suggests that the government has a role in
altering the working of private markets in desirable ways. A great deal of controversy
exists, however, on the extent to which government intervention can do better than
markets. The reason is that governments have their own failures in achieving their stated
objectives. Even assuming that decision makers really want to maximize social welfare,
there are good reasons to believe that they may not be able to reach the most efficient
outcome.
The Nobel Laureate Joseph Stiglitz, in his Economics of the Public Sector,
distinguishes four categories of government failures:[235]
1- Limited information: the optimal intervention requires a correct assessment by
the government of the nature and the size of the market failure. However, the
decision-maker’s perception may differ from the real world. Because the capacity
to process information is limited, governments do not in general have the information
required to do what they would like to do. Limited information may prevent the
government from correctly determining whether its actions are really needed and to
what extent. For example, the government would like to make sure that only disabled
people receive social assistance. But it is often costly to prevent free riding by
healthy individuals pretending to be disabled. Spending more resources on screening
may improve the information available to the government, but at the cost of fewer
resources being available to the social programme.
2- Limited control over private market response: the success or failure of
programmes in the public sector depends not only on public actions but also on how the
private sector responds. For example, by introducing an unemployment benefit, the
government does not know the extent to which individuals will adjust, spending more
time in unemployment, searching for better jobs. Because the links between policy and
outcomes (e.g., multipliers) are not well known, the design and magnitude of the
intervention often fail to be adequate.
3- Limited control over bureaucracy: bureaucrats don’t face the same kind of
pressure to cut costs that firms operating in competitive markets do. To the
extent that their expenses cannot be perfectly monitored, they may well become
prodigal, spending more than is strictly necessary to implement their programmes.
Moreover, in many countries, public servants cannot easily be dismissed and are not
rewarded for good performance, so that there are neither the carrots nor the sticks to
provide strong individual incentives. Often, the success of a policy relies on the ability
and the honesty of the entrusted officials.
4- Limitations imposed by political processes: even if the decision maker
perceived the world as it really was, the political process through which decisions are
made would make her deviate from the public interest. Representatives often have
incentives to act in favour of particular groups or to adopt (populist) policies that the
majority of the electorate perceives to be correct, even if they know they aren’t. State
ownership, subsidized loans and agricultural supports, for example, are often used to serve
the political goals of governments, at the cost of the social interest.
All in all, while market failures provide a motivation for government
intervention, governments should in each case assess the extent to which they can do
better than the market. In some cases, such an assessment may lead to the conclusion
that the costs of intervention exceed the benefits, so it is better not to intervene after all.
Government actions should be directed only to those market failures where there is a clear
understanding that government intervention can make a significant difference.
[234] Empirically, the relationship between inequality and growth is difficult to test, because inequality itself
is difficult to measure. Alesina and Perotti (1996) report a positive relationship between income
inequality and an index of socio-political instability which, in turn, tends to be negatively correlated with
economic growth (see also Perotti, 1996). Alesina and Rodrik (1994) found a negative relationship
between inequality and economic growth, after controlling for other variables. The empirical case for a
relationship between inequality and growth is not however very strong (see Helpman, 2004, chapter 6 for
a summary).
[235] Stiglitz (2000).
Box 10.3 The public services efficiency frontier
The question as to whether governments could do more with the same resources
or do the same with fewer resources has always attracted the interest of academics and
practitioners. A recent contribution is Afonso et al. (2005), who computed a “revealed
efficiency frontier” for public services, using a sample of 23 OECD countries.
The authors first computed, for each country, a measure of “public sector
performance”. This measure is defined as an average of seven sub-indicators, measuring
the outcomes of intervention in key policy dimensions: the quality of public
administration (confidence in the administration of justice, the size of the shadow
economy, red tape and corruption); education achievements (secondary school
enrolment, scores obtained by students in international tests); health (infant mortality,
life expectancy at birth), public infrastructure (quality of communication and transport
infrastructures), income distribution (income share of the poorer 40%), macroeconomic
stability (volatility of GDP, inflation) and macroeconomic performance (per capita
GDP, GDP growth and unemployment rate).
The authors then compared the estimated “Public Sector Performance Index”
with total public expenditures, which they considered as the input in the analysis (the output
is public sector performance). Figure 10.2 reports the authors’ results. The vertical axis
measures the “Public Sector Performance Index”. According to this figure, the
countries with the highest public sector performance are Luxembourg, Japan, Norway and
Austria. The countries with the lowest indexes are Greece, Portugal and Italy. The
horizontal axis measures total government expenditure as a percentage of GDP. In the
figure, we see that the countries with the largest spending are the Nordic countries (Sweden,
Denmark and Finland) while those with the smallest spending are the United States, Japan,
Australia and Ireland.
The efficiency frontier is defined by the observed combinations of public sector
performance and expenditure that are not dominated by other observed combinations.
Take, for instance, the case of Portugal. This country exhibits roughly the same level of
spending as Luxembourg, but achieves a much lower public sector performance index.
Sweden, on the other hand, obtains the same performance index as the US, but with
much more spending. So these two cases are inside the “efficiency frontier” (in terms
of equation (10.15), this means that these countries should exhibit a large value of φ).
Note, however, the limitations of the exercise: the public sector performance
index is computed as a simple average of sub-indexes. If different weights were
assigned to the seven dimensions, different efficiency frontiers would be obtained.
Figure 10.2 Public Sector efficiency in 23 OECD Countries (2000)
[Figure: scatter plot titled “Public Sector Efficiency Frontier”. Vertical axis: Public Sector Performance (0.75 to 1.25). Horizontal axis: Total public expenditure (1990s, % of GDP; 30.0 to 70.0). Countries plotted: LUX, JAP, NOR, AUT, HOL, CHE, IRL, DEN, USA, AUS, ICE, SWE, CAN, FIN, GER, BEL, NZE, FRA, UK, SPA, ITA, POR, GRE.]
Source: Afonso et al. (2005).
10.5. Discussion
This chapter presented an extended version of the Solow model that describes
the role of government in providing services that are essential to the functioning of a
market economy. The model emphasizes the trade-off between the benefits of public
provision and the cost of taxation, which in this model lowers private savings. Since
government services exhibit diminishing returns, there is an optimal scale of public
provision. Beyond that level, intervention becomes counter-productive.
The model also illustrates how government failures, leading to waste of resources
in the transformation of tax proceeds into valuable public expenditures, impact
negatively on the steady state levels of per capita income and consumption. For
simplicity, the model above abstracts from an essential problem of taxation, which is the
distortions it may cause in the relative prices of different inputs. This question will be
tackled in the following chapter. In Chapter 13, we will further explore the
inefficiencies in public provision.
Appendix 10.1. The case with a pure public good
This appendix shows how the model changes when one assumes that public
services are pure public goods (that is, non-rival).
In that case, instead of (10.2), one shall specify the impact of public
expenditures on TFP as:
$$A_t = G_t^{\alpha} \tag{10.17}$$
The aggregate production function becomes:
$$Y_t = G_t^{\alpha} K_t^{\beta} N_t^{1-\beta} \tag{10.18}$$
The implication of this change is that the aggregate production function now
displays increasing returns to scale (α + β + 1 − β > 1). As we already know from
Chapter 6, in this case there will be cumulative causation and divergence.
Whether or not the model displays endogenous growth will depend on the
returns to the two reproducible factors, K and G, taken together.
In the following discussion, two cases shall be considered: the case in which
α + β < 1 and the case in which α + β = 1 (the case with α + β > 1 leads to explosive growth).
Case A: α + β < 1
Consider first the case with α + β < 1. Since in this case there are diminishing
returns to reproducible inputs, there is no endogenous growth. Like Arrow’s
Learning by Doing model (Chapter 6), this version of the model with public goods
exhibits a steady state, where all inputs grow at the same rate.
Log-differentiating (10.18), and imposing the long run condition that K, G and Y
all grow at the same rate (remember that, due to (10.11), the ratio of public spending to
output is constant), one obtains:
$$\gamma = \hat{Y} - n = \frac{\alpha}{1-\alpha-\beta}\,n \tag{10.19}$$
This equation states that the growth rate of per capita income depends on the
growth rate of the population, which is exogenous. Thus, there is a weak scale effect.
Intuitively, because the public good is not subject to congestion, when the size of the
workforce increases, the economy will benefit from sharing the public expenditure among a
larger number of users. In case the population does not grow, diminishing returns on
aggregate capital will force the growth rate of per capita income to decline to zero, just
like in the Solow model.
It is important to observe that the growth rate of per capita output in (10.19)
does not depend on policy parameters: in this version of the model, changes in policy
produce level effects only.
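The step behind (10.19) can be spelled out as follows. This is a sketch in which α and β denote the exponents of G and K in (10.18), hats denote growth rates, and g (a symbol introduced here only for the derivation) is the common growth rate of Y, K and G:

```latex
\begin{aligned}
\hat{Y} &= \alpha\,\hat{G} + \beta\,\hat{K} + (1-\beta)\,n
  && \text{(log-differentiating (10.18))} \\
g &= (\alpha+\beta)\,g + (1-\beta)\,n
  && \text{(imposing } \hat{Y}=\hat{K}=\hat{G}=g\text{)} \\
g &= \frac{(1-\beta)\,n}{1-\alpha-\beta} \\
g - n &= \frac{(1-\beta)-(1-\alpha-\beta)}{1-\alpha-\beta}\,n
       = \frac{\alpha}{1-\alpha-\beta}\,n
\end{aligned}
```

The last line is the per capita growth rate: it is positive only if n is positive, and it is well defined only when α + β < 1.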
Case B: α + β = 1
In this case, the (aggregate) production function exhibits constant returns with
respect to the reproducible factors. Hence, if public expenditures grow at the same rate
as physical capital, the economy will evolve along a balanced growth path. Using
(10.11) with φ = 0 (we leave to the reader the solution with a positive φ) and substituting in
(10.18) with α = 1 − β, one obtains the AK version of this model:
$$Y = (\tau N)^{\frac{1-\beta}{\beta}} K \tag{10.20}$$
The interest rate in the decentralized economy is obtained from profit
maximization at the firm level. Since firms do not take into account the impact of their
decisions on the aggregate, the relevant production function is (10.1). Thus, the user
cost of capital will be equal to:
$$r + \delta = (1-\tau)\frac{\partial Y_i}{\partial K_i} = (1-\tau)\beta(\tau N)^{\frac{1-\beta}{\beta}} \tag{10.21}$$
Using the optimal consumption rule γ = r − ρ, the growth rate of per capita
income becomes:
$$\gamma = (1-\tau)\beta(\tau N)^{\frac{1-\beta}{\beta}} - \delta - \rho \tag{10.22}$$
Comparing with the model with congestion in the main text, we see that now
public actions impact on growth rates, rather than on levels. So this version of the model
displays endogenous growth.
The rest of the story is very similar to the one in the main text. As before, there
are two opposite effects: (i) a higher provision of public services raises the productivity
of private investment, inducing a faster capital accumulation; (ii) a higher tax rate
reduces the net worth of private investment, inducing a slower rate of capital
accumulation. Since in this model returns to capital are constant, these effects impact on
the growth rate of per capita output.
Since this version of the model has no steady state, there is no meaning in
maximizing per capita consumption. But the government may well want to maximize
the growth rate of per capita consumption.[236] The tax rate that achieves this target is
obtained by setting the derivative of γ in (10.22) with respect to τ equal to zero. This
gives:
$$\tau^{*} = 1-\beta \tag{10.23}$$
Again, the optimal policy corresponds to setting the size of government
provision proportional to its impact on aggregate production (cf. 10.18).
In this version of the model, the growth rate of output per worker is an
increasing function of the workforce, N (eq. 10.22): if the growth rate of the population
happens to be positive, then the growth rate of per capita income will be explosive. This
model displays a strong scale effect.
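Both results for Case B can be illustrated with a short numerical sketch of (10.22). It assumes α = 1 − β and no waste, writes tau for the tax rate, rho for the discount rate and delta for depreciation, and uses hypothetical parameter values; the grid search stands in for the analytical first-order condition.

```python
# Sketch: growth-maximizing tax rate and scale effect in the AK version (10.22).
# All parameter values are hypothetical.

def growth_rate(tau, beta, N, delta, rho):
    """Per capita growth rate from (10.22)."""
    return (1.0 - tau) * beta * (tau * N) ** ((1.0 - beta) / beta) - delta - rho

def growth_maximizing_tax(beta, N, delta, rho, steps=10000):
    """Tax rate on a grid over (0, 1) that maximizes the growth rate."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda tau: growth_rate(tau, beta, N, delta, rho))

beta, delta, rho = 0.4, 0.05, 0.02

best_tax = growth_maximizing_tax(beta, N=1.0, delta=delta, rho=rho)
print("grid optimum:", best_tax, " 1 - beta:", 1.0 - beta)

# Strong scale effect: at a given tax rate, a larger workforce grows faster.
growth_small = growth_rate(best_tax, beta, N=1.0, delta=delta, rho=rho)
growth_large = growth_rate(best_tax, beta, N=2.0, delta=delta, rho=rho)
print("growth with N = 1:", growth_small)
print("growth with N = 2:", growth_large)
```

Because the level of N enters the growth rate directly, a population that keeps growing implies an ever-increasing growth rate, which is the explosive behaviour noted above.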
[236] In this case, maximizing the growth rate corresponds to maximizing welfare (see Barro and
Sala-i-Martin, 1995, p. 156).
• Because of different types of market failures, laissez-faire does not in general deliver an efficient
resource allocation.
• A fundamental function that underlies the origin of the state is the establishment of essential
institutions, such as the rule of law, protection of property rights, and money.
• In general, market failures include externalities, public goods, common goods, coordination failures,
missing markets, high unemployment, information failures, and imperfect competition.
• The essential role of government can be accounted for in a growth model augmented by an essential
non-excludable (public) input. The implication is that under laissez-faire there would be no provision of
this input and hence no economy at all.
• In the model, the government solves the market failure by coercing people to pay taxes. The optimal
intervention involves setting the tax rate so that private prices become aligned with the social interest.
• With the tax proceeds, the government provides the public input. Because in this model the public
input and capital cost the same, the optimal provision will be such that the marginal product of the
public input is the same as the marginal product of capital.
• The first best policy presumes, however, that there are no losses in the process of transforming tax
proceeds into public inputs. In practice, governments are not that efficient, due to different types of
“government failures”. The implication of government failures in the model is that some of the tax
proceeds will be wasted in unproductive uses, so per capita consumption in the steady state will be
lower.
• The model can also be used to discuss the trade-off between efficiency and equity. To the extent that
a redistributive policy implies the use of distortionary taxation without a corresponding increase in
public provision, it will enter the model as “waste”, leading to lower per capita consumption in the
steady state. However, when inequality is very high, a redistributive policy may also play the role of
a public input, in the sense that some redistribution will contribute to a more peaceful social
environment and hence to lower transaction costs and higher productivity of private businesses.
Key concepts
• Institutions
• Public goods
• Congestion
• The tragedy of the commons
• Government failures
Essay questions:
a) Comment: “When the government fails to provide basic public inputs, the market
mechanism will assure its provision with the same level of efficiency”.
b) Explain why in the first best allocation the marginal products of the public input and of
physical capital ought to be equal.
c) Comment: “There is a trade off between efficiency and equity: so more equity will mean
lower per capita income”.
d) Comment: “Governments fail, too”.
Exercises
10.1
Consider an economy with a large number of equal firms. Each firm i produces a
homogeneous consumption good according to $Y_i = A K_i^{1/2} N_i^{1/2}$. Although each firm
considers A as an exogenous parameter, in the aggregate the following condition holds:
$$A = \left(\frac{G}{Y}\right)^{1/2}.$$
a) Compute the aggregate production function in this economy and explain
why there is a market failure. What would happen if there was no
government?
b) Assume that the provision of public inputs is financed with a production
tax τ, but that a fraction ø of the tax revenues is wasted in unproductive
uses. Write down the government budget constraint.
Assume that in this economy the population is constant (n=0), there is no
technological progress (γ=0), the saving rate is 27% and the capital depreciation rate (δ)
is equal to 2%.
i. Find out the expression for the steady state levels of per capita
income and per capita consumption in terms of the fundamental
parameters. Clue: remember that $c = (1-s)(1-\tau)y$.
ii. Explain, with the help of a graph, the dual effect of the tax rate on
the steady state level of per capita consumption.
iii. How does ø affect the steady-state of private per capita
consumption? Explain.
iv. Find out the benevolent planner solution. Is this solution
intuitive?
v. Examine the implications of a positive waste ø for per capita
consumption and G/Y. Discuss.
10.2
Consider a closed economy where firms perceive the production function to be
of the form Y=AK. In this economy the population is constant, the saving rate is equal
to s=0.24 and the depreciation rate is equal to δ=0.04. Assume also that the government
levies a tax $\tau$ on households’ capital incomes.
a) From the firm’s maximization problem, find out the expression for the
interest rate in this economy as a function of the tax rate.
b) Find out the expression for the households’ disposable income. Place the
main income identities of this economy in a flow income chart.
c) Consider for the moment that A=1/3 and $\tau = 1/8$. Using the equality
between savings and gross investment, find out the growth rate of per
capita income in this economy.
d) Discuss, with the help of a graph the dynamic properties of this model.
Does this model predict convergence among similar economies?
e) Examine the implications of a change in the tax rate from $\tau = 1/8$ to
$\tau = 1/3$.
f) Keeping the tax rate equal to $\tau = 1/3$, examine now the implications of a
change in the efficiency parameter from A=1/3 to A=1/2. Compare the
implications of such a change with a similar change in the context of the
Solow model and explain.
g) Assume now that $A = (G/Y)^{0.5}$, where G is a public good. Find out the
expression for aggregate output. Explain why in this economy there is a
market failure.
h) Assuming that the government budget constraint is G   1   Y ,
compute the growth rate of this economy when $\tau = 1/8$ and
when $\tau = 1/3$. Compare with c) and discuss.
i) In light of this new interpretation for A, how would you explain the
equilibrium described in e? Compare the 3 solutions with the help of a
figure relating the growth rate of the economy with the tax rate (Laffer
curve). Do any of these correspond to the first best?
11. Distortions
“People do what they get paid to do; what they don’t get paid to do, they don’t
do”. [William Easterly].
Learning Goals:
• Understand the pervasive role of distortions for economic performance.
• Identify real world examples of market and government imposed
distortions.
• Understand how taxes and subsidies can be manipulated to “get the
prices right”.
• Understand the intervention dilemmas in a second best context.
11.1 Introduction
In a well-functioning economy, each resource is valued in the market according
to its contribution to social welfare. In real life, however, different types of
impediments drive wedges between market prices and social returns. These cases are
referred to as distortions. Whether government-imposed (e.g., taxes) or inherent to
certain markets (e.g., externalities, natural monopolies), distortions cause a misallocation
of resources, keeping the economy below its attainable productivity frontier.
This chapter provides a systematic view of the distortions that interfere in capital
accumulation and in the allocation of inputs to production. For mathematical
convenience, the underlying framework is the AK model. This means that efficiency
losses caused by distortions will translate into lower growth. A similar analysis could be
spelled out in terms of the neoclassical model, with the difference that distortions in that
case would have level effects. Sticking with the AK model, however, we gain in
mathematical simplicity [237].
In Section 11.2 we discuss the case of distortions that affect consumption-saving
decisions. Section 11.3 illustrates with the particular case of financial market
inefficiencies. In Section 11.4 we address the case where the distortion affects the
relative use of two inputs to production. Section 11.5 extends the analysis in the
previous section to the case in which a policy to subsidize one input comes at the cost of
a tax in other input. Section 11.6 concludes.
[237] The chapter draws on William Easterly (1993, 2005).
11.2. Distortion in the consumption saving decisions
The simplest way to model a market distortion is to think of it as an unjustified
tax that introduces a wedge between private returns and social returns. This section
gives an example, whereby a distortion affecting the relative price of capital leads to a
suboptimal rate of capital accumulation.
Assume that the aggregate production function takes the AK form:
$$Y_t = A K_t, \quad A > 0 \qquad (11.1)$$
where K denotes private (excludable, rival) inputs only [238].
It is also assumed that households have full access to financial markets, so they
are able to smooth consumption according to:
$$\gamma = r - \rho \qquad (11.2)$$
Because in this model K is a purely private good and consumption-saving
decisions are optimal there are no market failures. Thus, there is no role for government
intervention. The market mechanism delivers the first best outcome and any
government attempt to interfere in the price mechanism will be welfare worsening.
Assume now that the government imposes a tax, $\tau_K$, on capital incomes. Profit
maximization by an individual firm i takes the following form:
$$\pi_{it} = A K_{it} - (r_t + \delta)(1 + \tau_K) K_{it} \qquad (11.3)$$
The first order condition of profit maximization is:
$$\frac{\partial \pi_{it}}{\partial K_{it}} = A - (r_t + \delta)(1 + \tau_K) = 0$$
With all firms equal, the economy’s interest rate will be:
$$r + \delta = \frac{A}{1 + \tau_K} \qquad (11.4)$$
From (11.4), the households’ disposable income ($Y_d$) will be equal to:
$$Y_d = (r + \delta) K = \frac{Y}{1 + \tau_K} \qquad (11.5)$$
To see how much the tax rate affects growth, let’s first solve the model as if the
saving rate was exogenous. Because the saving rate applies to $Y_d$, the equality between
savings and gross investment becomes:
$$\frac{sY}{1 + \tau_K} = \dot{K} + \delta K \qquad (11.6)$$
Dividing both sides of (11.6) by K, rearranging and subtracting n on both
sides, one obtains the growth rate of capital per worker (and of per capita income), for
each level of the saving rate:
$$\gamma = \frac{\dot{K}}{K} - n = \frac{sA}{1 + \tau_K} - n - \delta \qquad (11.7)$$
[238] You may interpret K as including both human and physical capital. As long as the tax rate is uniform
across capital inputs, there is no gain in modelling the different types of capital separately. Later in this
chapter we will address specifically the case with non-uniform taxation.
This equation shows that an increase in the tax rate, by reducing the households’
disposable income, impacts negatively on capital accumulation and thereby on growth,
for each level of the saving rate.
When the saving rate is endogenous, it will depend on the reward of capital.
Because the distortion impacts negatively on capital returns (equation 11.4) consumers
will optimally decide to save less.
Replacing (11.4) in the optimal consumption rule (11.2), one obtains the growth
rate of per capita income in the economy with endogenous savings:
$$\gamma = \frac{A}{1 + \tau_K} - \delta - \rho \qquad (11.8)$$
Equation (11.8) stresses the growth implications of policies that distort the
relative price of capital, in the context of the AK model when households face no
borrowing constraints. Comparing with (11.7), we see that the impact of taxation is much
larger in this version of the model. The reason is that, when savings are endogenous, an
increase in the tax rate decreases not only the productivity of capital but also the saving
rate.
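A minimal numerical sketch of this comparison, assuming the growth rates take the forms $sA/(1+\tau_K) - n - \delta$ in (11.7) and $A/(1+\tau_K) - \delta - \rho$ in (11.8). The function names and parameter values are illustrative assumptions.

```python
def growth_exogenous_saving(s, A, tau_k, n, delta):
    """Per capita growth with a fixed saving rate s, as in (11.7)."""
    return s * A / (1.0 + tau_k) - n - delta


def growth_endogenous_saving(A, tau_k, delta, rho):
    """Per capita growth with optimal saving, as in (11.8)."""
    return A / (1.0 + tau_k) - delta - rho


# Illustrative values (assumptions, not from the text).
A, s, n, delta, rho = 0.3, 0.24, 0.01, 0.04, 0.02
low_tax, high_tax = 0.0, 0.25

# Growth lost when the distortion rises from low_tax to high_tax.
drop_exogenous = (growth_exogenous_saving(s, A, low_tax, n, delta)
                  - growth_exogenous_saving(s, A, high_tax, n, delta))
drop_endogenous = (growth_endogenous_saving(A, low_tax, delta, rho)
                   - growth_endogenous_saving(A, high_tax, delta, rho))
print(drop_exogenous, drop_endogenous)
```

For the same tax increase, the loss with a fixed saving rate is the fraction s of the loss with endogenous savings, so the endogenous case always shows the larger drop when s < 1.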
To disentangle the two effects, let’s equate (11.8) and (11.7) and solve for the
(endogenous) saving rate:
$$s = 1 - \frac{\rho - n}{A}(1 + \tau_K) \qquad (11.9)$$
This equation shows that an increase in the tax rate leads households to save less
out of their disposable income. Thus, equation (11.8) accounts for two effects: on one
hand, by reducing the households’ disposable income, $\tau_K$ reduces the total amount of
savings, and thereby investment, for a given saving rate; on the other hand, by
reducing the return on investment, $\tau_K$ induces a lower saving rate.
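The decomposition can be verified with a short sketch: plugging the saving rate from (11.9), read here as $s = 1 - (\rho - n)(1+\tau_K)/A$, into the exogenous-saving growth rate (11.7) should reproduce (11.8). Names and parameter values below are illustrative assumptions.

```python
def implied_saving_rate(A, tau_k, n, rho):
    """Endogenous saving rate out of disposable income, as in (11.9)."""
    return 1.0 - (rho - n) * (1.0 + tau_k) / A


def growth_exogenous_saving(s, A, tau_k, n, delta):
    """Per capita growth for a given saving rate, as in (11.7)."""
    return s * A / (1.0 + tau_k) - n - delta


def growth_endogenous_saving(A, tau_k, delta, rho):
    """Per capita growth with optimal saving, as in (11.8)."""
    return A / (1.0 + tau_k) - delta - rho


# Illustrative values (assumptions, not from the text); rho > n.
A, n, delta, rho = 0.3, 0.01, 0.04, 0.03
tax_rates = [0.0, 0.25, 0.5]
saving_rates = [implied_saving_rate(A, t, n, rho) for t in tax_rates]

for t, s in zip(tax_rates, saving_rates):
    print(t, round(s, 4), growth_exogenous_saving(s, A, t, n, delta))
```

With $\rho > n$, the implied saving rate falls as the tax rate rises, which is the second of the two effects described above.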
An interesting feature of (11.8) is that the lower the A, the smaller (in absolute value)
the derivative of growth with respect to the tax rate, $\tau_K$. This means that the distortion is less
damaging to growth when TFP is low. In the words of William Easterly, bad policies are more
likely to be tolerated in low-A countries than in high-A countries [239].
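As a worked step, and assuming (11.8) reads $\gamma = A/(1+\tau_K) - \delta - \rho$, the sensitivity of growth to the distortion can be written out explicitly:

```latex
\frac{\partial \gamma}{\partial \tau_K} = -\frac{A}{(1+\tau_K)^{2}} < 0,
\qquad
\left|\frac{\partial \gamma}{\partial \tau_K}\right| = \frac{A}{(1+\tau_K)^{2}}
\quad \text{is increasing in } A .
```

For instance, evaluated at $\tau_K = 0$, the growth loss per unit of tax is 0.1 when A = 0.1 and 0.5 when A = 0.5 (illustrative values, not from the text).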
Example: Transport costs
Remember that $\tau_K$ does not necessarily stand for an income tax: you can
interpret this parameter as capturing the effect of any policy or institution that
artificially alters the return on investment.
An example is transport costs. A developing country that is landlocked or surrounded
by forests and mountains will face, everything else constant, higher costs of
international trade than a country with a coastal area or with access to navigable rivers.
If, as is often the case, the developing country exports agricultural goods (Y) in
exchange for equipment (K), its location will imply a higher relative price of capital as
compared to a similar country located in the centre. In that case, high transport costs
will act like a tax on physical capital, $\tau_K$: everything else constant, this is expected to
reduce the amount of investment that can be obtained out of a given saving rate and also
to impact negatively on the saving rate.
[239] Easterly (2005).
High transport costs are one of the vehicles through which an adverse geography
impacts negatively on economic development.
Other distortions that alter the relative price of capital include tariffs on imported
equipment, licensing fees and borrowing constraints. The following section addresses in
more detail one particular distortion of this class: financial market imperfections.
11.3. Financial deepening and economic growth
Transaction costs
When we describe the financial system in the flow income chart, we ignore the
frictions that underlie the transfer of funds between savers and borrowers. In the real
world, however, financial trade is affected by pervasive transaction costs.
A main source of frictions in financial markets relates to the fact that
information is incomplete. In particular, two kinds of information problems arise. First,
in a heterogeneous world, where individuals differ regarding their financial needs and
risk characteristics, searching and matching the different needs involves collecting
information that is not readily available. Second, the relationship between borrowers
and lenders is characterised by asymmetric information: in general, the borrower is
better informed than the lender about the risk and other relevant characteristics of his
project. As long as the lender cannot monitor perfectly the activities of the borrower,
this creates the conditions for the borrower to adopt opportunistic behaviour, for instance
by putting the money to uses other than those agreed at the time the loan was granted.
These information failures translate into significant transaction costs in financial
trade. These include costs related to the activities of: searching and matching lenders and
borrowers (brokerage costs), gathering information on the borrowers to evaluate their
risk characteristics (evaluation costs), negotiating and designing contracts (negotiation
costs), monitoring the implementation of projects (agency costs) and assuring the
enforcement of the contract’s obligations (enforcement costs).
These costs imply that the gross return paid by the borrower exceeds the net
return received by the lender. The wedge between the costs of capital to borrowers and
the reward to lenders is known as the external finance premium. The external finance
premium arises precisely to compensate the lender for the transaction costs in financial
intermediation.
Financial markets and institutions
In a world without frictions there would be no need for financial institutions.
However, in a world with asymmetric information and where financial contracts cannot
be costlessly enforced, there is scope for the emergence of economic agents specialized
in reducing the high transaction costs that characterize financial trade.
Among these, banks are the most prominent, especially in less developed
financial systems. By making use of their economies of scale and expertise, banks
specialize in collecting and processing information, in scrutinizing potential borrowers,
in designing financial products to match the different needs, in pooling risks, in
monitoring project implementation and in enforcing contracts.
More generally, financial institutions that create value by reducing transaction
costs in financial transactions include those specialized in matching borrowers
and lenders (brokers), agents that take a profit by bundling the funds of many savers to
lend to big borrowers at their own risk (dealers, banks, insurance companies), agents that
are paid to produce and release relevant information for financial decisions (rating
agencies), and institutions that organize and rule the functioning of financial markets,
forcing the disclosure of relevant information and restricting the activities of market
participants (exchange commissions, financial supervisors).
All these activities help reduce substantially the costs of financial
intermediation, making investment more attractive for borrowers and savings more
attractive for lenders.
Financial underdevelopment
By allowing risk to be spread across different assets and reallocated from risk-averse
agents to risk takers, by pooling the small and short-term savings of many
households and channelling them to large long-term loans to finance big projects, by
scrutinizing risks, and by collecting and releasing relevant information, the financial system
helps deliver more favourable risk-return-maturity combinations for savers while
creating incentives for entrepreneurs to engage in long-term and riskier projects, which
otherwise would not be under consideration. Without question, a well-developed
financial system helps improve the allocation of resources and creates more incentives
to save and invest, thereby increasing economic performance.
In less developed countries, however, different factors prevent the financial
system from operating that well.
First, weak legal systems make loan contracts more difficult to enforce. When
this is so, lenders become more demanding in terms of collateral requirements. In poor
countries, many talented entrepreneurs fail to obtain credit, because they lack the
necessary collateral [240].
Second, weak accounting standards significantly reduce the quality of
information, making lending riskier.
Third, smaller market sizes may prevent banks from fully exploiting their scale
economies, giving rise to low competition and higher intermediation margins.
Fourth, governments in less developed countries often intervene in the process
of credit allocation, by directing credit to special (often loss-making) borrowers.
Last but not least, governments in countries with less developed financial
markets often turn to domestic banks for deficit financing at below market rates,
crowding out the private sector and raising its funding costs.
[240] This is especially true for the poor, who have no tangible assets to provide collateral for their loans.
Credit constraints among the poor are a main reason why income inequality affects a country’s overall
investment rate and, in particular, human capital accumulation.
A model with costs in financial intermediation
The links between finance and growth can be examined in terms of the model of
Section 11.2. Instead of interpreting $\tau_K$ as an income tax, however, let it measure the
costs in the process of channelling savings to investment (you may relate these to the
external finance premium).
That is: for each euro saved, only $1/(1+\tau_K)$ translates into acquisition of new
capital; the remaining $\tau_K/(1+\tau_K)$ is retained in the financial system. Hence, the equality
between savings and investment becomes:
$$s Y_d = (1 + \tau_K)\left(\dot{K} + \delta K\right) \qquad (11.10)$$
Since in this new version of the model there are no taxes, the disposable income
is equal to:
$$Y_d = (r + \delta) K = Y \qquad (11.11)$$
Thus, (11.10) becomes equal to (11.6), and the growth rate of this economy will
be given by (11.8).
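The split of each euro saved can be sketched as follows, assuming (11.10) reads $sY_d = (1+\tau_K)(\dot{K} + \delta K)$ with $Y_d = Y = AK$. The function names and parameter values are illustrative assumptions.

```python
def split_one_euro_saved(tau_k):
    """Return (share turned into new capital, share retained by intermediaries)."""
    invested = 1.0 / (1.0 + tau_k)
    retained = tau_k / (1.0 + tau_k)
    return invested, retained


def capital_growth(s, A, tau_k, delta):
    """Growth rate of K implied by (11.10) when Y_d = Y = A*K and s is given."""
    return s * A / (1.0 + tau_k) - delta


# Illustrative values (assumptions, not from the text).
s, A, delta = 0.24, 0.3, 0.04
for tau_k in (0.0, 0.25, 0.5):
    invested, retained = split_one_euro_saved(tau_k)
    print(tau_k, round(invested, 3), round(retained, 3), capital_growth(s, A, tau_k, delta))
```

A higher intermediation cost leaves the saving effort unchanged in this sketch but lowers the amount of capital it buys, which is why growth falls with $\tau_K$.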
In this model, a less efficient transformation of savings into investment is
captured by a higher $\tau_K$. For instance, imperfect competition in the banking sector may
translate into economic rents in the intermediation process, meaning that more savings
will be required to achieve a certain level of investment. The same applies to some types
of government intervention in the banking system, such as high reserve requirements or
intervention in credit allocation.
The link between finance and economic growth also runs through the efficiency
parameter A: for instance, the function of evaluating and selecting the most profitable
projects and of monitoring their implementation tends to enhance the average
productivity of capital. Sometimes, governments try to influence banks’ decisions.
Whenever this translates into the diversion of credit to socially inefficient projects or to
loss-making companies, there will be a negative impact on efficiency and thereby on the
rate of economic growth.
Box 11.1. The research of King and Levine
In a series of research papers, Robert King and Ross Levine analysed the
relationship between financial development and economic performance. In one of these
papers [241], the authors assessed whether the level of financial development affects (1) real
per capita GDP growth, (2) capital accumulation and (3) productivity growth. Their
study involves 80 countries over the period 1960-1989.
[241] King and Levine (1993a).
Table 11.1 reproduces some of the authors’ findings. In the table, dependent
variables are in columns and independent variables are in rows. The latter group includes
the log of initial per capita GDP (capturing conditional convergence), the log of
secondary school enrolment, government consumption divided by GDP, inflation, a
measure of trade openness and a measure of financial depth, consisting of the amount
of a country’s liquid liabilities (currency plus demand and interest bearing liabilities of
financial intermediaries) divided by GDP.
As shown in the table, the three regressions suggest that financial development
is a good predictor of growth, capital accumulation and TFP. Another conclusion of the
exercise is that initial income, the education level and government consumption are
correlated with growth, but inflation and trade openness are not. In a related paper, the
authors found a positive correlation between economic growth and the strength of the
legal system in terms of creditor rights, contract enforcement and accounting practices [242].
Table 11.1: Growth and initial financial depth, 1960-1989

                                            Per capita GDP      Per capita capital   Per capita productivity
                                            growth, 1960-1989   growth, 1960-1989    growth, 1960-1989
Constant                                    0,035***            0,002                0,034***
                                            [0,001]             [0,682]              [0,001]
Log(real GDP per person in 1960)            -0,016***           -0,004*              -0,015**
                                            [0,001]             [0,068]              [0,001]
Log(secondary school enrollment in 1960)    0,013***            0,007***             0,011***
                                            [0,001]             [0,001]              [0,001]
Government Consumption/GDP in 1960          0,07*               0,049*               0,056*
                                            [0,051]             [0,064]              [0,076]
Inflation in 1960                           0,037               0,02                 0,029
                                            [0,239]             [0,0238]             [0,292]
(Imports plus exports)/GDP in 1960          -0,003              -0,001               -0,003
                                            [0,604]             [0,767]              [0,603]
DEPTH (Liquid Liabilities in 1960)          0,028***            0,019***             0,022***
                                            [0,001]             [0,001]              [0,001]
R^2                                         0,61                0,63                 0,58

*significant at the 0,10 level, **significant at the 0,05 level, ***significant at the 0,01 level.
[p-values in brackets]
Observations = 57
Source: King and Levine (1993a)
[242] Levine, Loayza and Beck (2000). In a third paper, Beck, Levine and Loayza (2000) contend that the
main channel through which financial development influences growth is TFP, rather than physical capital
accumulation and savings. This result is at odds with the model with endogenous savings. One possible
explanation is that the level of precautionary saving declines when the financial system becomes more
developed. Another possible explanation is that financial development comes along with the elimination of
borrowing constraints, thereby causing the saving rate to fall (an explanation in Pagano, 1993).
11.4. Distortions in factor markets
The efficient allocation
The model in Section 11.2 describes the growth effects of policies that impact on
the consumption-saving decisions. By pooling together all forms of capital, however,
the analysis misses an important category of distortions, namely those that affect the
relative price of different inputs. This section and the section that follows address
specifically this case.
In the following, assume that there are two types of private capital: human
capital (H) and physical capital (K), so that the aggregate production function takes the
following form:
$$Y = A K^{\beta} H^{1-\beta} \qquad (11.12)$$
Also assume that one unit of output can be converted at no cost into either one
unit of physical capital or one unit of human capital: that is, the opportunity costs of
investing one unit of output in human capital and in physical capital are the same. This is
not necessarily a realistic assumption, but it is convenient for the analysis at hand, in
order to abstract from other effects, related to differences in the costs of producing the two
types of capital. For the same reason, it is assumed that the depreciation rates for the two
types of capital are the same.
The implication is that, in an economy without distortions, profit maximization
and the price mechanism will ensure that human and physical capital will be employed
so that their marginal products are equal. That is [243]:
$$\frac{H}{K} = \frac{1-\beta}{\beta} \qquad (11.13)$$
Since with this condition aggregate productivity is maximized, growth is
expected to be maximized too [244].
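The claim that the efficient mix of the two capital types also maximizes growth can be checked numerically. The sketch assumes the production function $Y = AK^{\beta}H^{1-\beta}$, an exogenous saving rate, and total capital $K + H$ growing at $sY/(K+H) - \delta$; the names and parameter values are illustrative assumptions.

```python
def total_capital_growth(k_ratio, s, A, beta, delta):
    """Growth of total capital K + H for a given ratio k_ratio = K / H.

    Uses Y / (K + H) = A * k_ratio**beta / (1 + k_ratio).
    """
    return s * A * k_ratio ** beta / (1.0 + k_ratio) - delta


# Illustrative values (assumptions, not from the text).
s, A, beta, delta = 0.24, 1.0, 0.4, 0.04

# Grid search over K/H between 0.001 and 5.
grid = [i / 1000 for i in range(1, 5001)]
k_best = max(grid, key=lambda k: total_capital_growth(k, s, A, beta, delta))
print(k_best, beta / (1 - beta))
```

The grid maximum should sit next to $\beta/(1-\beta)$, that is, at $H/K = (1-\beta)/\beta$ as in (11.13).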
Figure 11.1 illustrates the efficient allocation. Consider one particular moment
in time. The horizontal axis measures the total amount of capital available in the
economy, with the endowment of physical capital being measured from left to right and
the endowment of human capital being measured from right to left. The vertical axes
measure the marginal products of the two types of capital.
[243] Note that this corresponds to equation (5.16). In terms of the MRW, the corresponding efficiency
condition is (4.16).
[244] To see this formally, let $\tilde{k} = K/H$ denote the ratio of physical to human capital. The question is to
find the value of $\tilde{k}$ that delivers the fastest growth. To solve this problem, rewrite the production
function as $Y = A\tilde{k}^{\beta}(1+\tilde{k})^{-1}\tilde{K}$, where $\tilde{K} = K + H$ denotes the total capital endowment of the
economy at each moment in time. Assuming for a moment an exogenous saving rate, the change in total
capital will be given by $\dot{\tilde{K}}/\tilde{K} = sY/\tilde{K} - \delta = sA\tilde{k}^{\beta}(1+\tilde{k})^{-1} - \delta$. Maximizing this with respect to $\tilde{k}$, you
will obtain $\tilde{k} = \beta/(1-\beta)$.
The equilibrium described by allocation E corresponds to that of a decentralized
economy without distortions: if both types of capital cost the same and if their
depreciation rates are equal, then profit maximization and absence of arbitrage
opportunities will ensure that their marginal products are the same.
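Figure 11.1 Marginal products of physical and human capital
[Figure: box diagram. Horizontal axis: total capital, with K measured rightward from origin O_K and H measured leftward from origin O_H. Vertical axes: marginal products ∂Y/∂K (left) and ∂Y/∂H (right), with levels r + δ marked. Labelled points: E, F and D on the horizontal axis; T, U and S above it.]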
Now imagine that the economy was operating at allocation D. In allocation D
there is more physical capital and less human capital than in E. Due to diminishing
returns, the marginal product of physical capital in D (DS) is lower than in E (ET). The
marginal product of human capital, in turn, is higher in D (RD) than in E (ET).
Allocation D is inefficient, because a move from D to E would translate into an output
gain, equivalent to the area [RTS] [245]. In other words, a move from D to E would allow
the economy to approach its efficiency frontier [246].
In a well-functioning economy, the market mechanism should prevent
allocations like D: since in D the marginal products of the two types of capital differ,
agents will only invest in H and no investment will be made in K until the two marginal
products are the same. As the stock of H rises relative to that of K, the marginal
productivity of the former decreases and the marginal productivity of the latter increases.
This movement stops in an allocation like E, where the two marginal products are
equal [247].
[245] Remember that the area below each curve of marginal product measures the output level: a move from
D to E, by reducing the use of physical capital, would impact negatively on output by the area [DSTE];
but by increasing the use of human capital, production would rise by the area [DRTE].
[246] It is important to note that, in light of the model outlined above, a move from D to E cannot happen: as
long as people invest less in one type of capital and more in the other, the two curves shift up and down
and the efficient point E moves accordingly. The “deadweight loss” described above presumes that the
two marginal products of capital are independent of each other, which is not the case. Thus, any reference
to this deadweight loss, here and below, shall be taken as illustrative only.
Now suppose that, due to some imperfection, the economy was stuck at point D.
In that case, the wedge RS between the marginal product of human capital and the
marginal product of physical capital would remain unchanged and the economy would
keep operating below its production possibilities frontier. In other words, there would
be a deadweight loss corresponding to the area [RTS].
Why should such a distortion persist? In real life, many factors prevent factor
markets from clearing in allocations like E. These include taxes, tariffs, import quotas, price
controls, interest rate controls, dual exchange rates, employment protection rules,
nominal rigidities, corruption fees and imperfect competition. The following sections
address some examples.
Discriminatory taxation
The most obvious source of factor price distortions is government taxation.
Consider an economy composed of a large number of identical firms, with production
functions of the form:
$$Y_i = A K_i^{\beta} H_i^{1-\beta}. \qquad (11.14)$$
Suppose that the government has the ability to coerce citizens to pay taxes out of
their factor incomes. Let $\tau_K$ be the tax rate on physical capital incomes and $\tau_H$ the tax
rate on human capital incomes. These taxes create a wedge between the user cost of
capital, paid by firms, and the net worth to households.
The individual firm’s profits become:
$$\pi_i = Y_i - (1 + \tau_K)(r + \delta) K_i - (1 + \tau_H)(r + \delta) H_i. \qquad (11.15)$$
Profit maximization in that case leads to:
$$\frac{\partial Y_i}{\partial K_i} = \beta \frac{Y_i}{K_i} = (r + \delta)(1 + \tau_K); \qquad (11.16)$$
$$\frac{\partial Y_i}{\partial H_i} = (1-\beta) \frac{Y_i}{H_i} = (r + \delta)(1 + \tau_H). \qquad (11.17)$$
Due to arbitrage, the net rental prices of human and physical capital ought to be
equal, implying:
$$\frac{H_i}{K_i} = \frac{H}{K} = \frac{1-\beta}{\beta} \cdot \frac{1 + \tau_K}{1 + \tau_H} \qquad (11.18)$$
Equation (11.18) illustrates how discriminatory taxation, affecting the relative
use of the two types of capital, impacts on the marginal rate of technical substitution,
moving the economy away from allocation E. If both tax rates were zero, the economy
would achieve the most efficient outcome.
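A small sketch of how the tax wedge moves the input mix, assuming (11.18) reads $H/K = \frac{1-\beta}{\beta}\cdot\frac{1+\tau_K}{1+\tau_H}$. The function name and the values used are illustrative assumptions.

```python
def human_to_physical_ratio(beta, tau_k, tau_h):
    """H/K chosen by firms under factor-income taxes, as in (11.18)."""
    return (1.0 - beta) / beta * (1.0 + tau_k) / (1.0 + tau_h)


beta = 0.4  # illustrative capital share (assumption, not from the text)

undistorted = human_to_physical_ratio(beta, 0.0, 0.0)   # efficient mix (11.13)
uniform_tax = human_to_physical_ratio(beta, 0.2, 0.2)   # equal taxes: same mix
tax_on_h = human_to_physical_ratio(beta, 0.0, 0.2)      # H taxed: less H per unit of K
tax_on_k = human_to_physical_ratio(beta, 0.2, 0.0)      # K taxed: more H per unit of K
print(undistorted, uniform_tax, tax_on_h, tax_on_k)
```

Only the relative tax rate matters for the input mix: a uniform tax leaves H/K at its efficient value, while a tax on one input tilts the mix toward the other.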
[247] In the model with government services, G replaces H. Because in that model G is non-excludable, the
market mechanism does not deliver allocation E. Instead, it is the government policy (as stated in
condition 10.9) that drives the economy to an allocation like E.
Summing (11.14) across firms and substituting (11.18), one obtains the
aggregate production function as a function of the tax rates:
$$Y_t = A \left(\frac{1 + \tau_K}{1 + \tau_H}\right)^{1-\beta} \left(\frac{1-\beta}{\beta}\right)^{1-\beta} K_t, \qquad (11.19)$$
This model is another incarnation of the AK model. The novelty here is that the
efficiency component (the average productivity of capital) now depends on two policy
parameters, $\tau_K$ and $\tau_H$.
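The intermediate step can be spelled out as follows, assuming identical firms with production function $Y_i = AK_i^{\beta}H_i^{1-\beta}$ and the input ratio in (11.18) written as $H/K = \frac{1-\beta}{\beta}\cdot\frac{1+\tau_K}{1+\tau_H}$:

```latex
\begin{aligned}
Y &= \sum_i A K_i^{\beta} H_i^{1-\beta}
   = \sum_i A \left(\frac{H_i}{K_i}\right)^{1-\beta} K_i
   = A \left(\frac{H}{K}\right)^{1-\beta} K \\
  &= A \left(\frac{1-\beta}{\beta}\right)^{1-\beta}
       \left(\frac{1+\tau_K}{1+\tau_H}\right)^{1-\beta} K .
\end{aligned}
```

The second equality uses the fact that every firm chooses the same ratio $H_i/K_i = H/K$, so the term in brackets can be taken out of the sum.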
Solving (11.16) for the interest rate, substituting (11.19) and using the optimal
consumption rule (11.2), one obtains the growth rate of per capita income in this
economy:
$$\gamma = \frac{B}{P_I} - \delta - \rho \qquad (11.20)$$
with $B = A\beta^{\beta}(1-\beta)^{1-\beta}$ and $P_I = (1 + \tau_K)^{\beta}(1 + \tau_H)^{1-\beta}$.
The term $P_I$ is the average price of capital.
As a benchmark, consider first the case in which $\tau_K = 0$ and $\tau_H = 0$. In this
case, $P_I = 1$, the marginal products of the two types of capital are equal and the first best
outcome is achieved (equation 11.6, allocation E in Figure 11.1).
Now assume that $\tau_K = \tau_H > 0$, i.e., both taxes are positive and equal. In this
case, equation (11.18) becomes equal to (11.13). This means that no distortion is
imposed on the relative use of the two types of capital. Still, a distortion exists in that
the average price of capital goods rises relative to consumption: because $P_I > 1$, there
will be less capital accumulation and lower growth than in the first best case:
$$\gamma = \frac{B}{1+\tau_H} - \delta - \rho$$ (11.21)
Note that this case corresponds to that already examined in Section 11.2.
Now assume that there is a tax on one type of capital only (say, human capital):
$\tau_H > 0$ and $\tau_K = 0$. In this case, there are two distortions: one on the relative price of
capital, $P_I$; the second, on the relative price of the two types of capital: the user cost of
human capital rises relative to that of physical capital, inducing firms to use
proportionally more physical capital than the corresponding weight in the production
function. In terms of Figure 11.1, the economy will be in an allocation like D 248.
The growth rate of per capita income becomes:
$$\gamma = \frac{B}{(1+\tau_H)^{1-\beta}} - \delta - \rho$$ (11.22)
248
Actually, because the production function is a Cobb-Douglas, there is an equivalence between
distortions affecting consumption-saving decisions and those affecting the relative use of the two inputs,
in the sense that they both translate into a certain level of $P_I$. Thus, for each magnitude of one distortion,
there is an equivalent magnitude of the other distortion, leading to the same growth rate of per capita
income. Easterly (1993, 2005) uses a CES production function to better distinguish the two types of
distortions.
Since $1-\beta < 1$, you may be tempted to conclude that (11.22) implies a higher
growth rate than (11.21). Note however that, for any given tax rate, government
revenues will be lower in this second case. Thus, in order to generate the same tax
proceeds, the government would need to set a higher tax rate in (11.22) than in (11.21),
further lowering growth.
Also note that the impact of taxation depends on the relative contribution of the
two inputs to production: for each level of $\tau_H$, the lower the weight of human capital
in production, $1-\beta$, the lower the impact on $P_I$ and hence the lower the effect of taxation
on efficiency and growth. But, again, the impact on government revenues will also be lower,
so the chosen level of the tax rate will probably not be independent of $\beta$. We will return
to this discussion in Section 11.5.
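The comparison between (11.21) and (11.22) can be illustrated numerically. The sketch below is not part of the original text: the parameter values are arbitrary assumptions, `beta` stands for the exponent on physical capital in (11.14), and tax revenue is measured as a share of output using the first order conditions (11.16) and (11.17), which give revenue/Y = beta*tau_K/(1+tau_K) + (1-beta)*tau_H/(1+tau_H).

```python
def price_of_capital(beta, tau_k, tau_h):
    """Average price of capital P_I, as defined below (11.20)."""
    return (1 + tau_k) ** beta * (1 + tau_h) ** (1 - beta)


def growth_rate(A, beta, tau_k, tau_h, delta, rho):
    """Growth rate of per capita income, equation (11.20)."""
    B = A * beta ** beta * (1 - beta) ** (1 - beta)
    return B / price_of_capital(beta, tau_k, tau_h) - delta - rho


def revenue_share(beta, tau_k, tau_h):
    """Tax revenue over output implied by (11.16) and (11.17)."""
    return beta * tau_k / (1 + tau_k) + (1 - beta) * tau_h / (1 + tau_h)


def human_capital_tax_matching(beta, tau_uniform):
    """tau_H (with tau_K = 0) that raises the same revenue share as a uniform tax."""
    x = tau_uniform / (1 + tau_uniform)
    if x >= 1 - beta:
        raise ValueError("revenue target cannot be met by taxing H alone")
    return x / (1 - beta - x)


# Illustrative values only (assumptions, not calibrated to any economy).
A, beta, delta, rho = 0.4, 0.5, 0.05, 0.03
tau = 0.2

g_first_best = growth_rate(A, beta, 0.0, 0.0, delta, rho)   # tau_K = tau_H = 0
g_uniform = growth_rate(A, beta, tau, tau, delta, rho)      # case (11.21)
g_h_only = growth_rate(A, beta, 0.0, tau, delta, rho)       # case (11.22), same rate
tau_h_eq = human_capital_tax_matching(beta, tau)            # same revenue share
g_h_same_revenue = growth_rate(A, beta, 0.0, tau_h_eq, delta, rho)

print(g_first_best, g_uniform, g_h_only, tau_h_eq, g_h_same_revenue)
```

With these particular numbers, taxing human capital alone at the same rate gives higher growth than the uniform tax, but once the rate is raised so as to collect the same share of output, growth falls below the uniform-tax case, in line with the argument in the text.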
Several studies have analysed the implications of distortionary taxation on
economic growth. Jones et al. (1993), for instance, calibrated a general equilibrium
model in which human and physical capital are produced using the same technology and
where the labour supply is inelastic, so as to fit the U.S. data. They found that the
potential growth achieved by reducing drastically all forms of distortionary taxation is
quite high (3% in the benchmark case). 249
Example: Import protection
In many developing countries, import tariffs are used to raise government
revenues. In terms of the model outlined above, one may examine the effects of import
protection by interpreting $\tau_K$ and $\tau_H$ as tariffs instead of taxes. Suppose that H refers
to imported equipment while K refers to locally produced equipment. If a tariff is levied
on the imported capital, this will lead to a distortion in the relative use of the two types
of capital. The same applies to other import protection schemes, such as import quotas,
licenses, custom procedures and technical requirements. All these instruments will
cause the relative price of inputs to depart from the competitive equilibrium, causing the
economy to move away from E.
Example: High Inflation
In many developing countries, governments use money creation to finance their
budget deficits. This leads to the erosion of the value of money through inflation.
High inflation leads to efficiency losses at different levels. First, the institution
of money (not the private input “cash”) is an important public input to production. As
pointed out in the nineteenth century by the British economist William Jevons 250 ,
money avoids the double coincidence of wants problem that arises from direct
exchange. On the other hand, by providing a unit of account, money helps clarify the
relative prices, reducing uncertainty. By lowering the costs of transacting, money
favours the division of labour, impacting positively on resource allocation. So, an
effective money can contribute to a higher efficiency level, A.
249
This result was however questioned by other authors. For instance, Stokey and Rebelo (1995)
contended that there was a substantial increase in the average level of taxation in the U.S. after WWII and
this did not translate into lower growth. For a discussion, see Jones and Manuelli (2005).
250
Jevons (1875).
When people face very high inflation rates, they tend to move away from
money: the erosion of the purchasing power of money, and the uncertainty regarding it,
leads agents to replace it with other devices (foreign banknotes, “liquid” consumption
goods such as cigarettes or sugar). People also rely more on the financial system to protect
their wealth. The implication is that people will spend more resources on transactions,
cash management and hedging activities, devoting fewer resources to productive uses.
Finally, lack of synchronization in price revisions may lead to distortions in relative
prices, confounding economic agents. All in all, high inflation reduces the effectiveness
of money, leading to a lower efficiency, A.
Another implication of inflation is that it taxes different types of capital
differently 251. To see this in terms of the model above, just interpret K as physical capital
and H as working capital. Working capital, which includes cash, short-term financial
applications and credit to customers, is an essential input to production, just like physical
and human capital. High inflation, by affecting nominal variables but not real variables,
drives a wedge between the cost of holding working capital (except inventories)
and the cost of other inputs. This, in turn, induces firms to work with less working capital
than the ideal. In other words, the economy departs from allocation E, achieving lower
growth 252.
Empirically, various cross-country studies have found a negative correlation
between inflation and growth, but only for inflation rates exceeding a critical level. For
instance, Bruno and Easterly (1998), using a sample of 127 countries between 1960 and
1992, identified a negative association between inflation and economic growth at
inflation rates above 40%. Other authors reached similar conclusions, though with
some disagreement concerning where the threshold is 253. The causality from inflation to
economic growth remains, however, controversial. Many authors argue that inflation is
more a symptom of bad policies than a syndrome in itself. That is, a significant
correlation between inflation and growth may capture the influence of an omitted third
variable that is correlated with inflation 254. This is an example of the “one symptom for
different syndromes” problem that plagues cross-country growth regressions.
Example: Monopoly
A common source of efficiency loss is imperfect competition: whenever one
side of the market is characterized by a small number of players, these may have the
251
This point was made by Gylfason (1998).
252
Yet the impact of high inflation on savings is ambiguous. The utility function leading to (11.2)
implicitly assumes a unit elasticity of inter-temporal substitution. Using a more general formulation,
Gylfason (1998) finds that the impact of inflation on savings may be either positive or negative.
253
Barro (1995, 1997), Sarel (1996). Barro (1998) finds a threshold of 20%, but contends that “there is
not enough information in the low-inflation experiences to isolate precisely the effect of inflation on
growth” (p. 98).
254
Along these lines, Roubini and Sala-i-Martin (1995) argue that inflation is highly correlated with
financial repression. Hence, a negative correlation between inflation and growth may be actually
capturing the distortionary effects of financial markets restrictions. De Gregorio (1993), on the other
hand, argues that high inflation normally arises to overcome the inability of governments to collect formal
taxes, so it is mostly an indicator of tax inefficiency.
ability to influence prices. The implication is that, under laissez faire, production will
fall short of the efficient level 255.
To see this in the context of the model above, assume that the production
function in the final good sector is given by (11.12) and that one unit of capital can be
converted into either one unit of physical capital or one unit of human capital. Assume
that there are no taxes.
As in the basic model, let each firm i in the final good sector be a price taker in
factor markets. Factor prices, however, are allowed to depart from the competitive case.
To account for this possibility, let $P_K$ be the price of physical capital and $P_H$ the price
of human capital. Each firm i in the final good sector maximizes its profits, given by:
$$\pi_i = A K_i^{\beta} H_i^{1-\beta} - P_K K_i - P_H H_i$$ (11.23)
Proceeding as before, if all firms are equal, profit maximization leads to the
following optimal condition:
$$\frac{H}{K} = \frac{1-\beta}{\beta}\,\frac{P_K}{P_H}$$ (11.24)
Consider first the case in which both factor markets are competitive. In that case,
each firm producing capital (either physical or human) will be a price taker. A firm j
producing physical capital, for instance, will maximize the following profit
function, $\pi_j^K = P_K K_j - (r+\delta)K_j$, taking $P_K$ as given. The solution to this problem
is $P_K = r+\delta$, implying zero profits in the activity of producing K. If the same were true
for producers of human capital, then $P_K = P_H = r+\delta$ and (11.24) would mimic the
optimal efficiency condition (11.16).
Now assume that the supply of human capital is unionized and that the only
concern of the union is to set $P_H$ so as to maximize its monopoly rents. The
difference with respect to the competitive case is that, because the union is large, it faces a
downward sloping demand curve. The latter is given by the aggregation of individual
demands for human capital, as implied by the corresponding first order condition in the
problem of maximizing (11.23). That is:
$$P_H = (1-\beta) A K^{\beta} H^{-\beta}.$$ (11.25)
The union problem is then to choose H so as to
maximize $\pi^H = P_H H - (r+\delta)H$, but instead of taking $P_H$ as exogenous, it will
consider the downward sloping demand curve (11.25). The solution to this problem is the
well known optimal mark-up pricing, given by:
$$P_H = \frac{r+\delta}{1-\beta}.$$ (11.26)
This condition implies that the price of human capital will be set above its
marginal cost. Using (11.26) and $P_K = r+\delta$, (11.24) becomes:
255
This discussion focuses on static efficiency only. As we have already pointed out a number of times, static
efficiency and dynamic efficiency do not necessarily go together.
H 1   
2
 (11.27)
K 
As expected, this expression reveals a departure from allocation E. In particular,
the relative use of human capital is lower than in the efficient allocation 256.
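The mark-up result can be checked numerically. The following is a minimal sketch under stated assumptions: `beta` denotes the exponent on physical capital in (11.23), the aggregate stock K is held fixed at an arbitrary value while the union chooses H, and all parameter values are illustrative. The brute-force search over a grid is used only to confirm the closed form (11.26); it is not how the text derives it.

```python
def union_price(r, delta, beta):
    """Mark-up price of human capital, equation (11.26)."""
    return (r + delta) / (1 - beta)


def relative_use(beta, p_k, p_h):
    """H/K chosen by final good firms, equation (11.24)."""
    return (1 - beta) / beta * p_k / p_h


def demand_price(h, k, A, beta):
    """Inverse demand for human capital, equation (11.25)."""
    return (1 - beta) * A * k ** beta * h ** (-beta)


def union_profit(h, k, A, beta, r, delta):
    """Union rents: revenue along the demand curve minus the cost (r + delta) * H."""
    return demand_price(h, k, A, beta) * h - (r + delta) * h


# Illustrative values only (assumptions).
A, beta, r, delta, k = 1.0, 0.4, 0.05, 0.05, 1.0

# Brute-force search for the rent-maximizing H on a grid from 0.01 to 100.
grid = [i / 100 for i in range(1, 10001)]
h_star = max(grid, key=lambda h: union_profit(h, k, A, beta, r, delta))
p_h_numeric = demand_price(h_star, k, A, beta)

p_h_markup = union_price(r, delta, beta)
ratio_union = relative_use(beta, r + delta, p_h_markup)   # equation (11.27)
ratio_efficient = relative_use(beta, r + delta, r + delta)

print(p_h_numeric, p_h_markup, ratio_union, ratio_efficient)
```

The price found on the grid agrees with (11.26) up to the grid resolution, and the implied H/K ratio equals (1-beta)^2/beta, below the competitive ratio (1-beta)/beta.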
Example: Income inequality
Human capital and physical capital differ dramatically in two key dimensions.
One is the amount of capital one can accumulate at the individual level: while
individuals can accumulate physical capital without bound, there are limits to the
amount of human capital one can acquire for oneself. The reason is that investment in
human capital is time consuming and the time available to each individual is limited 257.
The implication is that extremely rich people are bound to invest most of their extra
wealth in the form of physical capital.
The second dimension refers to the possibility of lending the acquired capital to
other users: while physical capital can be rented to someone else, the human capital
one buys can only be used by its owner. Thus, for instance, each new computer a single
investor purchases can be used to equip a different worker, without losing
effectiveness (assuming constant returns to scale). So the rent the investor expects to
obtain from each additional computer does not decline. Yet each extra year of schooling an
investor buys can only be installed in himself, so he will be subject to the law of
diminishing returns: that is, he expects the return to 12 years of schooling to be less (in
terms of wages) than twice the return to 6 years of schooling.
Thus, even if it were feasible for a rich investor to accumulate a gigantic amount
of education, this would hardly prove as profitable as investing the same amount of
resources in physical capital to rent. The conclusion is that rich people are better off holding
most of their wealth in the form of physical capital. Poor people, in turn, are more likely
to devote a large fraction of their wealth to human capital, because at low levels of
human capital its marginal return is higher than that of physical capital.
What are the implications of this for the relationship between inequality and
growth? When the distribution of income becomes very asymmetric, one expects poor
people to invest less in human capital, because they cannot afford to do so, and this will
not be offset by an equal investment in human capital by the rich: as we just argued, rich
people will accumulate the extra wealth mostly in the form of physical capital. In terms
of Figure 11.1, as the income distribution becomes more asymmetric, the economy
moves away from E towards a point like D, with underinvestment in human capital and
overinvestment in physical capital, resulting in a less efficient allocation of capital and
lower growth 258.
256
Comparing to (11.18), we see that a monopolized market for input H is equivalent to a subsidy to
physical capital accumulation. This suggests that the government can use taxes and subsidies to remove
the distortion.
257
Note that this is different from what we assume for human capital in the aggregate: that is, the stock of
human capital embodied in the society is allowed to increase without bound, while human capital
embodied in individuals with finite lives is bounded by some feasible level.
258
Galor and Zeira (1993).
Example: Segmented labour markets
A problem that is common to many developing as well as to industrial countries
is the lack of unified factor markets in general and of labour markets in particular.
When otherwise identical workers get paid different wages in different
employment sectors and this is not corrected by labour mobility, the labour market is said to
be segmented. This presumes the existence of some friction that prevents workers in
the “low-wage” segment from having full access to a job in the “high-wage” segment.
Sources of labour market segmentation include differences in the institutional nature of
the employer (public vs. private), restrictive labour laws (high severance payments,
restrictive conditions for dismissals), dual productive structures (modern vs. traditional,
formal vs. informal), barriers to geographical mobility (housing rents, high licensing
costs) or the rationale for employers to pay wages above the competitive level
(efficiency wages). All these factors may prevent real wages from being equalized across
workers with similar qualifications.
In terms of Figure 11.1, a segmented labour market can be described by point D.
Suppose that K and H refer to homogeneous workers, employed in two different
sectors: H refers to a modern/formal sector and K refers to a traditional/informal sector.
In the modern sector, wages are higher than in the traditional sector either because of
government regulations (for instance, a minimum wage that is enforced in one sector
but not in the other) or as a strategy to deal with imperfect information 259. In D, firms in
the modern sector pay a wage rate above the market clearing level ([DR]), while
workers in the traditional sector receive a wage rate that is lower than the market
clearing level ([DS]). Of course, workers in the traditional sector would like to move to
the modern industry to get a higher wage, but they can’t. As a result, the suboptimal
allocation of labour will persist over time, giving rise to an efficiency loss equal to the
area [RTS].
So far, the discussion abstracted from the possibility of unemployment. But one
can easily extend the analysis to the case in which some workers fail to be allocated to
either of the two industries. This will occur, for instance, when the wage rate in the
modern sector is equal to [RS] and the wage rate in the traditional sector is equal to
[UF]. In that case, [FD] workers will be unemployed and the total efficiency loss will be
equal to [RTUFDS].
Of course, for such unemployment to persist, one would need an explanation.
After all, why should informal workers be so hard-nosed that they would prefer not to
work at all rather than accept an informal wage equal to [DS]?
A famous model that accounts for this possibility was proposed by a pioneer in
development economics, Michael Todaro, in a joint paper with John Harris 260 . The
259
In broad lines the “efficiency wages” case goes as follows. In the informal sector, workers are
basically self-employed or work in small units where their work effort can be well monitored. Hence,
they will tend to exert a fair working effort. In the modern sector, however, workers are engaged in
plants, where individual work effort is hard to monitor. This gives rise to what economists call an “agency
problem”: as long as employers cannot observe the work effort of each worker, workers will have a
tendency to “shirk”. This makes it optimal for employers in the modern sector to pay wages above the
competitive level: this will create incentives for employees to work harder (they will suffer more if fired)
and will allow employers to attract better workers and save on training and turnover costs.
260
Harris and Todaro (1970).
authors wanted to explain the persistence of migration from traditional/rural areas to
modern/urban areas, despite the existence of widespread urban unemployment. The key
element of their model is that workers decide whether to migrate by comparing the wage rate they
get in the traditional sector with an expected wage in case of migration ($W^e$), which is
less than the wage actually paid in the urban sector (W=[RD]). The reason is that
migrating workers face a chance of becoming unemployed there. So workers will only
migrate if the expected wage $W^e = qW$ (where $q = [DO_H]/[FO_H] < 1$ is the probability
of being hired) is more than what they get for sure in rural employment. In
equilibrium, workers will be indifferent between migrating and not migrating, so the
wage rate in the traditional sector [FU] is exactly equal to the expected wage. The
interesting feature of this model is that it displays an excess supply of labour in
equilibrium that cannot be eliminated through labour migration.
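The equilibrium condition just described (traditional wage equal to the expected urban wage qW) can be sketched numerically. Everything specific in the code below is an assumption made for illustration and does not come from Harris and Todaro (1970) or from Figure 11.1: the urban wage and urban employment are taken as fixed, the traditional wage is given the hypothetical form a * L**(-theta), and the numbers are arbitrary. The equilibrium is found by bisection.

```python
def rural_wage(l_rural, a=10.0, theta=0.5):
    """Hypothetical traditional-sector wage, decreasing in traditional employment."""
    return a * l_rural ** (-theta)


def harris_todaro(n_total, l_modern, w_modern):
    """Solve rural_wage(L_r) = q * W, with q = L_m / (N - L_r), by bisection."""

    def gap(l_rural):
        q = l_modern / (n_total - l_rural)   # probability of being hired
        return rural_wage(l_rural) - q * w_modern

    lo, hi = 1e-9, n_total - l_modern        # hi corresponds to zero unemployment
    if gap(hi) >= 0:
        raise ValueError("modern wage is not above market clearing: no unemployment")
    for _ in range(200):                     # gap() is strictly decreasing in L_r
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    l_rural = 0.5 * (lo + hi)
    q = l_modern / (n_total - l_rural)
    return {
        "rural_employment": l_rural,
        "hiring_probability": q,
        "rural_wage": rural_wage(l_rural),
        "expected_urban_wage": q * w_modern,
        "unemployed": n_total - l_modern - l_rural,
    }


# Illustrative numbers only (assumptions).
eq = harris_todaro(n_total=100.0, l_modern=30.0, w_modern=2.0)
print(eq)
```

In this example the traditional wage settles below the modern wage and a positive number of workers remain unemployed in equilibrium, even though nobody gains from migrating in either direction.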
11.5. Tax cum subsidy schemes
A balanced budget condition
In light of (11.22), while taxes on capital are harmful to growth, subsidies to
capital lead to faster growth. Consider, for instance, the case when $\tau_H = 0$ and
$\tau_K < 0$ (subsidy to physical capital accumulation). In this case, the average price of
capital declines below one, implying a higher rate of per capita income growth. True,
because the user cost of human capital rises relative to that of physical capital, firms are
induced to use a higher proportion of physical capital than the efficient level (in terms
of Figure 11.1, the economy will operate at a point like D). However, because the net
return to (total) capital increases, investment per unit of output increases and this effect
dominates the negative one.
A question might be raised then: should the government give large subsidies to
capital accumulation, so as to induce faster growth?
One answer is that subsidies need to be financed. Since lump-sum taxes are not,
in general, available, the positive effect on growth due to a higher subsidy on capital
accumulation has to be compared with the negative effects of distortionary taxation
anywhere else.
To analyse this problem in the context of our model, suppose that the
government launches a subsidy to human capital accumulation financed with the
proceeds of a tax on physical capital. For simplicity, let’s assume that there are no
government services (G=0). The government budget constraint is:
$$(r+\delta)(\tau_H H + \tau_K K) = 0$$ (11.28)
At first sight, a tax on K and a subsidy on H look like having an uncertain
impact on the relative price of capital, $P_I$ in (11.20). However, when the subsidy to
human capital is financed with the proceeds of taxation on physical capital (i.e., when
(11.28) holds), $P_I$ rises unambiguously, leading to lower growth. The reason is that the policy
makes the tax base shrink and the subsidized capital expand, through substitution
effects. Hence, when the subsidy rate increases, this requires a more than one-for-one
increase in the tax rate in order to keep the budget balance unchanged. Thus, the price of
capital rises unambiguously 261 262.
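The claim that a balanced-budget tax-cum-subsidy scheme raises the average price of capital can be illustrated as follows. This is a sketch under assumptions: `beta` denotes the exponent on physical capital in (11.14), and the break-even tax rate used here, tau_K = -(1 - beta) * tau_H / (beta + tau_H), is obtained by combining the budget constraint (11.28) with the relative-use condition (11.18). The subsidy rates are arbitrary illustrative values.

```python
def break_even_tax(beta, tau_h):
    """tau_K that satisfies (11.28) given tau_H, with H/K taken from (11.18)."""
    if tau_h <= -beta:
        raise ValueError("no finite tax on K can finance a subsidy this large")
    return -(1 - beta) * tau_h / (beta + tau_h)


def price_of_capital(beta, tau_k, tau_h):
    """Average price of capital P_I, as defined below (11.20)."""
    return (1 + tau_k) ** beta * (1 + tau_h) ** (1 - beta)


def budget_per_unit_of_k(beta, tau_k, tau_h):
    """(tau_H * H + tau_K * K) / K, using H/K from (11.18)."""
    h_over_k = (1 - beta) / beta * (1 + tau_k) / (1 + tau_h)
    return tau_h * h_over_k + tau_k


beta = 0.5                                   # illustrative value (assumption)
subsidies = [-0.05, -0.10, -0.20, -0.30]     # tau_H < 0 means a subsidy to H

results = []
for tau_h in subsidies:
    tau_k = break_even_tax(beta, tau_h)
    results.append(
        (tau_h, tau_k, budget_per_unit_of_k(beta, tau_k, tau_h),
         price_of_capital(beta, tau_k, tau_h))
    )

for row in results:
    print(row)
```

In every row the budget is balanced and $P_I$ exceeds one, and $P_I$ increases with the size of the subsidy, which is the sense in which the scheme lowers growth in (11.20).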
In any case, even if lump-sum taxation were available, a question arises as to why
the government should try to maximize the growth rate of the economy: if there are no
distortions and private agents optimally decide their consumption paths, there is no
point in trying to modify the competitive growth rate. Policies that artificially increase
the growth rate of the economy are in this context detrimental to welfare and can be
compared to immiserizing growth 263.
Examples of tax cum subsidy schemes
In the real world, many government policies can be interpreted as tax-cum-subsidy
schemes.
One example is price controls. In many countries, governments set
price ceilings on some goods, to promote their consumption. Such a policy calls for
budgetary transfers so as to compensate firms for the implied losses. If taxes on other
goods are levied to finance the policy, this is equivalent to a tax-cum-subsidy scheme
that impacts negatively on efficiency and therefore on growth.
A common form of price controls in developing countries is the imposition of
ceilings on banks’ lending rates. Whenever this is the case, an excess demand
for loans will arise. Thus, the policy is often complemented with credit rationing and
with government instructions for banks to direct credit to priority areas or to
state-owned enterprises. The existence of an excess demand for credit also induces the
development of parallel financial markets, which tend to operate at very high interest
rates. This segmentation of the credit market acts as a tax-cum-subsidy scheme: all in
all, the policy implicitly subsidises some types of credit while it imposes a penalty
(parallel market interest rates) on other types of credit.
Another example is dual exchange rates. Many developing countries have
non-convertible currencies. Inconvertibility means that some sort of restriction on
international capital movements prevents agents from exploiting arbitrage opportunities
arising from possible disparities in the exchange rate value at home and abroad. In these
cases, central banks have a greater ability to manipulate the home value of the exchange
rate. Often, they simply set it by decree. As long as the official rate differs from the
261
Easterly (1993). Formally, substitute the balanced budget requirement (11.28) in (11.18), obtaining the
"break even" tax rate, $\tau_K = -(1-\beta)\tau_H/(\beta+\tau_H)$. Taking the partial derivative of this with respect to $\tau_H$ and
using again (11.18), one obtains a relationship between changes in the break-even tax rate and changes in
the subsidy: $\partial\tau_K/\partial\tau_H = \tau_K(1+\tau_K)/[\tau_H(\tau_H+1)]$. Totally differentiating $P_I$ with respect to $\tau_K$ and
$\tau_H$, it is easy to verify that the change in the tax rate that keeps the relative price of investment goods unchanged (i.e.,
the growth rate of the economy unchanged), $\partial\tau_K/\partial\tau_H = \tau_K/\tau_H$, is lower than that required to keep
the government budget unchanged.
262
Actually, because in this model the supply of labour is inelastic, the subsidy could be financed by a tax
on consumption without lowering growth. In a more elaborate version of the model, however, the same
argument applies: a tax-cum subsidy scheme would have a negative impact on the labour supply, thereby
reducing growth (Mendoza et al., 1997).
263
Jones and Manuelli (1990), Easterly (1993, 2005).
market rate, a black market rate will arise. This implies a wedge between the official
exchange rate, which is used by the government to buy foreign exchange from exporters
and promote certain imports, and the black market exchange rate, which is used by
those who have no access to the official reserves. Those who are allowed to import
goods at the official exchange rate receive an implicit subsidy, while exporters and
those who are forced to pay the black market rate pay an implicit tax. In terms of the
model, this is equivalent to a tax-cum-subsidy scheme.
Other examples of tax-cum-subsidy schemes include unanticipated inflation
(which acts as a tax on creditors and a subsidy to debtors) and an overvalued real exchange
rate (which acts as a tax on tradable goods producers and a subsidy to producers of
non-tradable goods). The following section addresses another example.
Tax evasion
A major problem in government finance is tax evasion. When a significant share
of economic activity is informal and doesn’t pay taxes, the burden of financing
government activities falls on a narrower base, giving rise to unfair competition, wrong
incentives, and an implicit transfer from those who pay taxes to those who benefit from
the public good without paying taxes. This phenomenon is particularly pervasive in
developing countries.
To analyse this phenomenon, assume that a non-excludable input G is essential
to production:
$$A = \left(\frac{G}{Y}\right)^{\alpha}$$ (11.29)
Substituting (11.29) in (11.12) and rearranging, one obtains the economy’s
production function in terms of the three inputs:
$$Y = G^{\frac{\alpha}{1+\alpha}} K^{\frac{\beta}{1+\alpha}} H^{\frac{1-\beta}{1+\alpha}}$$ (11.30)
This equation reveals that the actual contributions of physical capital and human
capital to production are less than what individual firms perceive them to be in (11.14).
Hence, prices are not right.
The question now is what combination of $\tau_K$ and $\tau_H$ the government should use
to finance the provision of government services. For the sake of simplicity, assume that the
government budget is to be balanced and that there are no other government revenues or
expenditures. With public provision, the government budget constraint becomes:
$$(r+\delta)(\tau_H H + \tau_K K) = G$$ (11.31)
The rest of the model is equal to that in section 11.4. Profit maximization at the
firm level leads to (11.16) and (11.17) and the growth rate of the economy is just as
described by (11.20). The only difference is that now the parameter A (inside B) is
related to public provision, according to (11.29). Using (11.16) and (11.17), the
government balanced budget constraint simplifies to:
$$\frac{G}{Y} = \beta\frac{\tau_K}{1+\tau_K} + (1-\beta)\frac{\tau_H}{1+\tau_H}$$ (11.32)
Substituting this in (11.29), one obtains the expression for A in terms of the two
tax rates:
$$A = \left[\beta\frac{\tau_K}{1+\tau_K} + (1-\beta)\frac{\tau_H}{1+\tau_H}\right]^{\alpha}$$ (11.33)
The growth rate of this economy is given by (11.20), with the only difference that A
shall now be replaced by (11.33).
A benevolent planner in this economy would choose $\tau_K$ and $\tau_H$ so as to
maximize the growth rate of per capita income, as given by (11.20), taking into account
that the parameter A (inside B) is determined according to (11.33). As usual, this
problem is solved setting the partial derivatives $\partial\gamma/\partial\tau_K$ and $\partial\gamma/\partial\tau_H$ equal to zero. The
algebra of the exercise is rather tedious, but the solution should be, according to our
previous findings, intuitive: from (11.16), the share of capital returns (net of taxes) in
income is $K(r+\delta)/Y = \beta/(1+\tau_K)$. Thus, the tax rate that makes this share equal to the
contribution of capital to production in (11.30) is $\tau_K = \alpha$. Using the same reasoning for
human capital, the first best solution will be:
$$\tau_K = \tau_H = \alpha$$ (11.34)
Not surprisingly, the first best policy is the one that sets a uniform tax rate (this,
in turn, is equivalent to a tax on production, as specified in Chapter 10). Substituting the
optimal taxation rule (11.34) in (11.32), one obtains the corresponding (optimal) level
of public provision:
$$\frac{G}{Y} = \frac{\alpha}{1+\alpha}$$ (11.35)
Note that this corresponds to the contribution of government services to
production, as captured by equation (11.30).
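Since the algebra of the planner's problem is tedious, a numerical check may help. The sketch below is an illustration under assumptions: `alpha` denotes the exponent on G/Y in (11.29) and `beta` the exponent on physical capital in (11.14), both set to arbitrary values. It maximizes only the policy-dependent part of the growth rate (11.20), namely A from (11.33) divided by $P_I$, over a grid of tax rates; the remaining terms of (11.20) do not depend on the tax rates.

```python
def public_share(beta, tau_k, tau_h):
    """G/Y under a balanced budget, equation (11.32)."""
    return beta * tau_k / (1 + tau_k) + (1 - beta) * tau_h / (1 + tau_h)


def policy_term(alpha, beta, tau_k, tau_h):
    """A / P_I: the part of B / P_I in (11.20) that depends on the tax rates."""
    A = public_share(beta, tau_k, tau_h) ** alpha                 # (11.33)
    p_i = (1 + tau_k) ** beta * (1 + tau_h) ** (1 - beta)         # P_I
    return A / p_i


# Illustrative values only (assumptions).
alpha, beta = 0.25, 0.4

# Grid of tax rates from 1% to 100% in steps of one percentage point.
grid = [i / 100 for i in range(1, 101)]
best_tau_k, best_tau_h = max(
    ((tk, th) for tk in grid for th in grid),
    key=lambda pair: policy_term(alpha, beta, pair[0], pair[1]),
)
best_share = public_share(beta, best_tau_k, best_tau_h)

print(best_tau_k, best_tau_h, best_share)
```

For these values the grid search selects the same rate on both types of capital, equal to `alpha`, and the implied G/Y equals alpha / (1 + alpha), matching (11.34) and (11.35).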
To analyse the implications of tax evasion in this model, suppose you could not
raise taxes on one type of capital. Let K denote the formal sector (the one that pays
taxes) and H denote the informal sector (the one that does not pay taxes). To find the
benevolent planner solution in this case, let’s impose $\tau_H = 0$ in (11.20) and (11.33)
and maximize again the growth rate of the economy (this time with respect to $\tau_K$ only).
This leads to:
$$\tau_K = \frac{\alpha}{\beta}$$ (11.36)
Comparing to (11.34), we see that the tax rate on physical capital is now higher
than in the first best: since the government cannot tax human capital, it sets a higher tax
rate on physical capital. Also note that the lower the $\beta$, that is, the smaller the size of
the formal sector, the higher the tax rate has to be in order to finance the government
expenditures.
Substituting (11.36) in (11.32), you’ll get another interesting result:
$$\frac{G}{Y} = \frac{\alpha\beta}{\alpha+\beta} < \frac{\alpha}{1+\alpha}$$ (11.37)
This equation states that the optimal provision of government services under tax
evasion is lower than the contribution of government services to production, as stated in
(11.30). The reason is intuitive: because raising revenues forces the government to
impose a distortion in the factor markets, a benevolent planner will balance the benefits
of providing one extra unit of public input against the extra cost resulting from a further
move away from E. Equation (11.37) shows that the optimal balance between these two
effects translates into a lower provision of government services than in the case where
uniform taxation is available.
Note also that, according to (11.37), a smaller role of the formal sector in
production (a lower $\beta$) translates into a lower provision of the public good: because in this case
the tax rate has to be set at a higher level (equation 11.36), the distortion in resource
allocation will also be larger and the benevolent planner will take this into account,
reducing further the size of public provision 264.
Now you see the pervasive implications of having a large informal sector in the
economy: those in the formal sector pay very high tax rates and the economy as a whole
will enjoy a very low level of government services. No wonder economies with
large informal sectors find it difficult to attract foreign direct investment! 265
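The second-best results (11.36) and (11.37) can also be checked numerically. As before, this is a sketch under assumptions: `alpha` is the exponent on G/Y in (11.29), `beta` the exponent on (formal) capital K in (11.14), the values are arbitrary, and only the policy-dependent part of the growth rate, A divided by $P_I$ with tau_H = 0, is maximized over a grid of tax rates.

```python
def public_share_evasion(beta, tau_k):
    """G/Y from (11.32) when the informal sector cannot be taxed (tau_H = 0)."""
    return beta * tau_k / (1 + tau_k)


def policy_term_evasion(alpha, beta, tau_k):
    """A / P_I with tau_H = 0: the tax-dependent part of the growth rate (11.20)."""
    return public_share_evasion(beta, tau_k) ** alpha / (1 + tau_k) ** beta


def second_best(alpha, beta):
    """Grid search for the growth-maximizing tau_K; returns (tau_K, G/Y)."""
    grid = [i / 1000 for i in range(1, 3001)]     # tax rates from 0.1% to 300%
    tau_k = max(grid, key=lambda t: policy_term_evasion(alpha, beta, t))
    return tau_k, public_share_evasion(beta, tau_k)


alpha = 0.2                                  # illustrative value (assumption)
first_best_share = alpha / (1 + alpha)       # equation (11.35)

tau_large, share_large = second_best(alpha, beta=0.40)   # larger formal sector
tau_small, share_small = second_best(alpha, beta=0.25)   # smaller formal sector

print(tau_large, share_large, tau_small, share_small, first_best_share)
```

In both cases the grid search returns tau_K = alpha / beta and G/Y = alpha * beta / (alpha + beta), below the first-best share; with the smaller formal sector the tax rate is higher and public provision lower, as the text argues.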
Box 11.2. Second-best decision-making
The case with optimal provision of the public input under tax evasion illustrates
a problem of second-best decision-making. The Theory of Second Best concerns what
happens in the presence of unavoidable distortions in an economy.
A well-known proposition in the second best theory is that when there is more
than one distortion in an economy, eliminating one of them does not necessarily lead to
higher efficiency 266 . The reason is that the alleviation of one distortion may impact
negatively on the distortions that cannot be removed. Thus, whenever some restriction
prevents the policymaker from completely eliminating at least one distortion, the
optimal policy shall involve a balance between the benefits of alleviating one distortion
against the costs of increasing the size of another distortion.
As an example, consider a final good whose production impacts negatively on
the environment, through carbon emissions. Also suppose that this final good is
264
Note that the firms’ larger or smaller ability to substitute taxed capital by non-taxed capital may
exacerbate or attenuate the described effects. The Cobb-Douglas production function (11.14) implicitly
postulates a unit elasticity of substitution between H and K. But you could assume instead a CES
production function, with a very high elasticity of substitution between the two inputs. In that case, the
effects just described would be amplified: the size of the formal sector would shrink even more, the tax
rates would be set even higher and the level of public provision would be even lower. The opposite case
occurs with a low elasticity of substitution between inputs.
265
Easterly (1993) makes the point: “Tax systems in developing countries often have a very narrow base,
because of widespread tax evasion and the small size of the formal sector (World Bank, 1988).
Generation of revenues from this narrow base often implies very high taxes. A few examples help
illustrate this. More than 80 percent of income is said to go unreported in Argentina (ibid). Employment
in the private formal sector in Cote D’Ivoire amounts to only 1.4% of the population. Repeated attempts
to increase tax revenue from Cote D’Ivoire have met with failure, as there is large scale evasion of taxes –
which have an effective tax rate of 48 percent (…)”. P188.
266
Lipsey and Lancaster (1956).
produced using an imported raw material subject to a customs tariff. In that case,
eliminating the tariff (that is, eliminating one distortion) would cause production of the
final good to expand, further damaging the environment. If the government has no other
instrument available to tackle the latter problem directly (that would be the first best
policy), the optimal policy may involve some import protection.
In the example of tax evasion, providing more of the public input implies taxing
further the formal sector, exacerbating the distortion in the factor markets. As long as
informality cannot be eliminated (that would be the first-best), the second best policy
(as summarized in 11.37) involves a balance between the benefits of providing the
economy with more of the public input and the cost of further distorting the factor
markets.
These examples illustrate that the second best policy may involve steps away
from what is usually assumed to be optimal in a first-best assessment.
Externalities again
In Chapter 6 we already analysed the market failure resulting from externalities
arising from investment in physical capital. In that case, the decentralized economy
deviates from the optimal allocation because there is a distortion in the consumption-
saving decision. This section extends the analysis, by considering two types of capital.
Assume that there is a positive externality associated with Human Capital267:
$$A = C\left(\frac{H}{Y}\right)^{\phi} \qquad (11.38)$$
Summing the production function (11.14) across firms and substituting (11.38),
one obtains the aggregate production function for this economy:
$$Y_t = B K_t^{1-\alpha} H_t^{\alpha} \qquad (11.39)$$
where
$$B = C^{\frac{1}{1+\phi}} \quad \text{and} \quad 1-\alpha = \frac{\beta}{1+\phi} \qquad (11.40)$$
Comparing (11.14) and (11.39) we see that there is again a divergence between
the perceived contributions (betas) and the actual contributions (alphas) of physical and
human capital to production. Since $1-\alpha = \beta/(1+\phi) < \beta$, this means that the actual contribution of
physical capital to output is lower than that perceived by firms. In turn, the elasticity of
human capital, $\alpha$, is larger than that perceived by firms, $1-\beta$.
From (11.39), the first best resource allocation (point E) is:
$$\frac{H}{K} = \frac{\alpha}{1-\alpha} \qquad (11.41)$$
However, because the perceived contribution of the two factors is as given in the
individual production functions (11.14), without intervention the market will deliver a
resource allocation given by (11.13).
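The aggregation step behind (11.39) can be spelled out as follows. This is only a sketch: it assumes that the firm-level technology (11.14) has the Cobb-Douglas form $Y_i = A K_i^{\beta} H_i^{1-\beta}$ with identical firms, and it writes $\phi$ for the exponent on $H/Y$ in (11.38). Both the functional form and the symbol $\phi$ are assumptions made here for illustration.

```latex
\begin{align*}
Y &= A K^{\beta} H^{1-\beta}
   = C\left(\frac{H}{Y}\right)^{\phi} K^{\beta} H^{1-\beta} \\
Y^{1+\phi} &= C\, K^{\beta} H^{1-\beta+\phi} \\
Y &= C^{\frac{1}{1+\phi}}\, K^{\frac{\beta}{1+\phi}}\, H^{\frac{1-\beta+\phi}{1+\phi}}
\end{align*}
```

Under these assumptions the two exponents still sum to one, but the exponent on K falls below $\beta$ whenever $\phi > 0$. Equalizing the actual marginal products of the two types of capital then gives $H/K = (1-\beta+\phi)/\beta$, which exceeds the ratio $(1-\beta)/\beta$ that firms choose when they take A as given.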
267
Note that (11.38) implicitly assumes that the externality is subject to congestion: human capital
generates a positive externality, but you’ll need 10 thousand skilled workers in UK to have the same
external effect as one thousand skilled workers in Ireland.
To illustrate the market failure, we refer to Figure 11.2. In the figure, the solid
curves refer to the actual marginal products of physical and human capital at a given
moment in time. The dashed curves refer to the perceived marginal products.
The decentralized equilibrium is described by allocation D, where the perceived
marginal productivities cross each other (equation 11.6). In D, however, the actual
marginal productivity of human capital is higher than that perceived by firms and the
actual marginal productivity of physical capital is lower.
The loss relative to the optimal allocation is given by the area [RTS]. If you
work out the solutions for the interest rates in the two cases you will realize that the
interest rate in the decentralized economy is lower than in the social optimum.
Because we are using an AK model, this means that, without intervention, the economy
will not be growing at its potential.
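The comparison of interest rates can be checked numerically. The sketch below is not the author's calculation: it assumes a firm-level technology of the form Y_i = A K_i^beta H_i^(1-beta), an externality A = C (H/Y)^phi, and illustrative values for `beta`, `phi` and `C` that do not come from the text. It computes the gross return r + delta paid to physical capital at the decentralized allocation D and at the first best E.

```python
# Illustrative values only: beta, phi and C are assumptions, not numbers from the text.
beta = 0.4   # elasticity of physical capital perceived by firms
phi = 0.2    # assumed exponent of the human-capital externality
C = 1.0      # assumed scale constant of the externality

# Aggregate technology once the externality is taken into account:
# Y = B * K**k_share * H**(1 - k_share)
k_share = beta / (1 + phi)   # actual elasticity of physical capital
B = C ** (1 / (1 + phi))


def gross_return(perceived_k_share: float) -> float:
    """Return r + delta when inputs are chosen as if K had elasticity
    `perceived_k_share`, so that H/K = (1 - share) / share, and each
    unit of K is paid that share of output per unit of K."""
    h_over_k = (1 - perceived_k_share) / perceived_k_share
    output_per_k = B * h_over_k ** (1 - k_share)
    return perceived_k_share * output_per_k


r_decentralized = gross_return(beta)    # allocation D: firms ignore the externality
r_first_best = gross_return(k_share)    # allocation E: actual elasticities are used

print(f"H/K at D: {(1 - beta) / beta:.3f}, r + delta = {r_decentralized:.4f}")
print(f"H/K at E: {(1 - k_share) / k_share:.3f}, r + delta = {r_first_best:.4f}")
```

With these assumed numbers the return at D is below the return at E, in line with the claim that the decentralized economy grows below its potential.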
From (11.18), we know that a careful choice of taxes and subsidies can be used
to push the economy towards the first best equilibrium, given by (11.41). Setting
(11.18) equal to (11.41), the following rule is obtained:
$$\frac{1+\tau_K}{1+\tau_H} = 1 + \frac{\phi}{1-\beta} \qquad (11.42)$$
where $\phi$ denotes the exponent of the externality in (11.38).
If the two tax rates are set according to (11.42), this means that private firms,
following their maximization problem – that is, deciding according to (11.18) – will be
induced to choose a relative employment of H and K satisfying the efficiency condition
(11.41).
Figure 11.2 Actual and perceived marginal productivities of physical and human capital in
the presence of an externality in human capital
[Figure: actual (solid) and perceived (dashed) marginal product curves of K and H, showing the decentralized allocation D, the first best allocation E, the corresponding levels of r + δ, and the loss area RTS.]
Equation (11.42) shows that there is an infinite range of possibilities to achieve
the first best. Writing $\phi$ for the exponent of the externality in (11.38):
- One possibility would be to set $\tau_K = \phi/(1-\beta)$ and $\tau_H = 0$. In that case, the
increase in the relative use of human capital would be achieved by a tax on physical
capital, only.
- Alternatively, one could set $\tau_K = 0$ and $\tau_H = -\phi/(1-\beta+\phi)$. In that case, the
increase in the relative use of human capital would be achieved through a subsidy on
human capital.
This case illustrates a famous prediction from Tinbergen: one should have at
least as many instruments as targets (see Box 11.3). When there are more instruments
than targets, the same target can be met with different combinations of the two
instruments.
Since the distortion can be corrected using both taxes and subsidies, one may
ask whether a careful choice of these two instruments could deliver a self-financed
policy. In fact, imposing the balanced budget constraint (11.28) in (11.42) and
using (11.41), one obtains the solution $\tau_K^* = \phi$ and $\tau_H^* = -\beta\phi/(1-\beta+\phi)$,
where $\phi$ denotes the exponent of the externality in (11.38). This means
that a now unique combination of the two instruments satisfies the two targets: the
correction of the externality and a balanced budget. Note that this result also illustrates
the Tinbergen theory: since now we have two instruments (taxes and subsidies) and
two targets (removing the externality and balanced budget), the policy mix is unique.
This example also reveals that tax-cum-subsidy schemes are not necessarily
negative: as long as the implied transfer is exactly sized so as to offset existing
distortions in relative prices, the policy may be efficiency enhancing.
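The three policy mixes discussed above can be verified with a few lines of arithmetic. The sketch below rests on stated assumptions rather than on the text: rule (11.42) is taken to be (1 + tau_K)/(1 + tau_H) = 1 + phi/(1 - beta), the balanced budget constraint (11.28) is taken to be tau_K K + tau_H H = 0 (both types of capital earning the same pre-tax return), and the values of `beta` and `phi` are purely illustrative.

```python
beta, phi = 0.4, 0.2   # illustrative values (assumptions, not from the text)

# Wedge required by the efficiency rule: (1 + tau_K) / (1 + tau_H)
required_wedge = 1 + phi / (1 - beta)
# First best ratio H/K implied by the aggregate technology
first_best_h_over_k = (1 - beta + phi) / beta


def tau_h_given(tau_k: float) -> float:
    """Tax rate on human capital that satisfies the rule for a given tau_K."""
    return (1 + tau_k) / required_wedge - 1


# (i) tax on physical capital only
tax_only = (phi / (1 - beta), 0.0)
# (ii) subsidy on human capital only (a negative tax rate is a subsidy)
subsidy_only = (0.0, tau_h_given(0.0))
# (iii) self-financed mix: assumed budget constraint tau_K * K + tau_H * H = 0
balanced = (phi, tau_h_given(phi))
budget_per_unit_of_k = balanced[0] + balanced[1] * first_best_h_over_k

for label, (tau_k, tau_h) in [("tax only", tax_only),
                              ("subsidy only", subsidy_only),
                              ("balanced budget", balanced)]:
    wedge = (1 + tau_k) / (1 + tau_h)
    print(f"{label}: tau_K = {tau_k:.3f}, tau_H = {tau_h:.3f}, wedge = {wedge:.4f}")
print(f"net revenue per unit of K under the balanced mix: {budget_per_unit_of_k:.2e}")
```

All three mixes deliver the same wedge, which is the sense in which one target can be met by many combinations of two instruments; only the third also satisfies the assumed budget constraint.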
Box 11.3. The Tinbergen framework for economic policy
The Dutch economist Jan Tinbergen shared the first Nobel Prize in Economics,
in 1969, with Ragnar Frisch. In his “framework" for economic policy, Tinbergen proposes a
step sequence for policymaking: first, policymakers should specify the goals and the
corresponding policy targets; second, they should specify the policy
instruments; third, they must have a model of the economy, linking the
instruments to the targets. Finally, they should select the optimal value of
the policy instruments.
In his work, Tinbergen referred to a linear model to describe the relationship
between instruments and targets. Using linear algebra, he found that if the policymaker
has N targets, he should have at least N linearly independent policy instruments to reach
these targets. When there are more instruments than targets, there are alternative
combinations of the instruments to reach the targets. When there are fewer linearly
independent instruments than targets, the policymaker cannot achieve all the desired
goals and has to accept a trade-off between the different targets. This doesn’t mean that
there is no optimal policy: in that case, the policymaker has to specify a relationship
representing the costs to society of deviations from the optimal values (the “social loss
function”) and set the instruments so as to minimize that loss.
A criticism of the Tinbergen theory is that it abstracts from uncertainty. As other
authors pointed out later, when the relationship between policy instruments and targets
is uncertain, it is advisable to use a portfolio of instruments (i.e., more instruments than
targets), so as to diversify the risks of failing.
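The counting rule can be illustrated with a small linear example. The sketch below is a hypothetical illustration, not Tinbergen's own model: the matrix of policy effects, the target values and the equal weights in the loss function are all made-up numbers. With two independent instruments both targets are hit exactly; with a single instrument the best the policymaker can do is minimize a quadratic social loss.

```python
def solve_two_instruments(effects, targets):
    """Solve the 2x2 linear system effects @ x = targets by Cramer's rule."""
    (a, b), (c, d) = effects
    det = a * d - b * c
    if det == 0:
        raise ValueError("the two instruments are not linearly independent")
    y1, y2 = targets
    return (d * y1 - b * y2) / det, (a * y2 - c * y1) / det


def best_single_instrument(effects, targets, weights=(1.0, 1.0)):
    """Setting of one instrument that minimizes sum_i w_i * (e_i * x - y_i)**2."""
    numerator = sum(w * e * y for w, e, y in zip(weights, effects, targets))
    denominator = sum(w * e * e for w, e in zip(weights, effects))
    return numerator / denominator


# Hypothetical effects: row i gives the impact of each instrument on target i.
effects = ((1.0, 0.5),
           (0.2, 1.0))
targets = (2.0, 1.0)

# Two targets, two independent instruments: both targets are met exactly.
x1, x2 = solve_two_instruments(effects, targets)

# Two targets, one instrument (the first column only): a trade-off is unavoidable.
single_effects = (effects[0][0], effects[1][0])
x_single = best_single_instrument(single_effects, targets)


def loss(x):
    """Equal-weight quadratic social loss when only one instrument is available."""
    return sum((e * x - y) ** 2 for e, y in zip(single_effects, targets))


print(f"two instruments: x1 = {x1:.4f}, x2 = {x2:.4f}")
print(f"one instrument:  x = {x_single:.4f}, remaining loss = {loss(x_single):.4f}")
```

The remaining loss in the one-instrument case is strictly positive, which is the trade-off between targets described in the box.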
11.6. Discussion: policies and growth
This chapter reviewed different types of market or government-imposed
distortions that impact negatively on economic performance. The chapter also illustrated
how government policies can be used to “get the prices right”. It is suggested that a
careful intervention using taxes and subsidies has the potential to improve economic
efficiency and drive the economy to a faster growth path (or, alternatively, to a higher
level of per capita income, assuming diminishing returns to capital).
The view that policy reforms should address market failures and deadweight
triangles in the first place has been at the core of the policy prescription adopted by the
main international institutions, since the 1990s. This approach was coined “The
Washington Consensus” (see Box 11.4). The experience with the implementation of the
Washington Consensus revealed, however, that similar policy packages delivered
different results in different countries. This evidence led economists to recognize that
more attention should be given to each country’s specific circumstances, when designing
policy reforms.
In this debate, some authors argued that, instead of “trying to reform as much as
possible”, policymakers should focus on a few areas of intervention. The
reasoning draws on the theory of second best explained in Box 11.2: in a complex world, with
many distortions and complex interaction effects between them, finding out the
appropriate sequence of reforms is very difficult. Because of this, instead of
implementing any reform and risking making things worse, policymakers should study
carefully the one or two most important constraints to economic growth and tackle them
(see Box 11.5).
Another area of contention regards the choice of an appropriate balance
between static and dynamic inefficiency. Some economists argued that the Washington
Consensus failed in that it placed too much emphasis on static inefficiencies and
deadweight triangles, overlooking the role of dynamic inefficiencies. According to this
view, policymakers should focus more on old-fashioned industrial policies, which come at
the cost of extra inefficiencies today (import protection, less competition) but may deliver
faster growth in the future268.
Finally, it has been recognized that, although good policies are critical for
economic performance, implementing the right policies is not within reach for many
developing countries, because they lack the necessary institutions. The reasoning is that
policies are embedded in institutions, so when the institutional framework is not
supportive, policies are doomed to failure. Moreover, to the extent that good institutions
tend to deliver good policies, improving the quality of institutions should be the top
priority of any reform agenda269.
268
A prominent economist along this avenue is Dani Rodrik (Rodrik, 2006).
269
This claim is supported by extensive evidence in cross-country growth regressions showing that policy
variables tend to lose significance when variables measuring the quality of institutions enter as
explanatory variables. Most influentially, Easterly and Levine (2003) found that policy variables, such as trade
openness, inflation and exchange rate overvaluation fail to explain economic development once the
quality of institutions is accounted for (another example is in Box 5.4). This is not to say that policies are
not important: simply, because good institutions tend to generate good policies, in cross-country
regressions variables capturing the quality of institutions also capture the quality of economic policies in
general.
The following two chapters will explore some of these ideas. In the next chapter,
we analyse the argument for active industrial policies. In the chapter that
follows, we will depart from the benevolent planner assumption, to motivate the
importance of institutions for the quality of decision-making.
Box 11.4 The Washington Consensus
The view that economic reforms should target existing market failures and “get
the prices right” became mainstream among economists at the World Bank, the IMF,
and Washington think tanks since the late 1980s.
This view was coined “the Washington Consensus” by John Williamson, in a
conference held at the Institute for International Economics, in 1990. In that conference,
the author listed 10 policy reforms that synthesized the view: (1) fiscal discipline; (2)
reorientation of public expenditures from non-merit subsidies to basic health care,
education and infrastructure; (3) tax reform, so as to enlarge the tax base and reduce
marginal tax rates; (4) liberalization of interest rates; (5) unified and competitive
exchange rates; (6) trade liberalization; (7) openness to inward foreign direct
investment; (8) privatisation; (9) deregulation, to ease barriers to entry and exit; (10)
secure property rights.
The Washington consensus constituted a departure from the thinking in the
1950s and the 1960s, according to which economic development was a complex process
of economic, social, political and historical transformation, requiring a specific
diagnosis for each particular country.
Throughout the 1990s, the policy prescription of the Washington Consensus was
implemented in many developing countries, namely in Latin America, in Sub-Saharan
Africa, and in former socialist republics. By the turn of the century, many developing
countries had achieved more open economies, sounder public finance, lower inflation,
fewer restrictions on private business, and more efficient financial sectors.
However, the experience with the implementation of the Washington consensus
was not impressive270. In Latin America, for instance, growth rates in the 1990s were on
average lower than during 1960-1980, a period characterized by extensive state
intervention and import substitution policies. At the same time, countries like Thailand,
Malaysia, China, Vietnam and India were experiencing fast growth, despite their
insistence on industrial policies. China and India, in particular, made significant market-
based reforms, but maintained high levels of state intervention, import restrictions and
targeted industrial policies. All in all, some countries that followed the Washington
Consensus didn’t achieve faster growth, while some countries following less orthodox
approaches achieved fast improvements in their living standards. Moreover, similar
reforms delivered different growth performances in different countries.
In 2005 the World Bank launched a study, Economic Growth in the 1990s:
Learning from a Decade of Reform, where it is explicitly recognised that no one single
prescription shall be viewed as applying to all countries at the same time. The document
argues that, while good policies are in general important for growth, their effects may be
offset or reinforced by other factors, including the cultural, institutional, social and
270
Zagha et al. (2006).
political environment271. These factors imply that similar policies may lead to different
results if embedded in different social, political and economic contexts. The report
concludes that a correct assessment of each country’s particular circumstances is
essential, in order to define the optimal sequencing of reforms. This report marked a
step back from the “one policy fits all” approach underlying the Washington consensus,
in the direction of the “this country is different” claim, which dominated Development
Economics in the 1950s and in the 1960s.
Article for discussion: “Rethinking Growth”, Zagha, R., Nankani, G.,
Gill, I. Finance and Development 43(1), March 2006.
Box 11.5. Growth diagnostics
The theory of second best tells us that a policy reform that looks efficiency
enhancing when considered in isolation may end up being counterproductive, due to
adverse interaction effects with other distortions. Since in the real world, policymakers
do not have the complete knowledge of all prevailing distortions in an economy nor a
precise quantification of all possible adverse second-best interactions, the strategy of
tackling all distortions at the same time can be a rather risky exercise.
This point was made by Hausmann et al. (2008), in a critical assessment of the
approach to economic reform followed by the main international institutions (the so-called
“Washington Consensus”)272. Given this, the authors argue that, instead of trying
to tackle all distortions at the same time (“laundry list approach to economic reform”),
policymakers should carefully identify the one or two most important constraints in a
given economy and tackle these constraints.
To help identify the relevant binding constraints, the authors propose a
diagnostic analysis, using a conceptual model that can be interpreted in light of the
simple AK model. According to that model, economic activity in a given developing
country may be constrained by: (a) low A (low return on private investment); or (b) high r
(high cost of finance). When the main problem is a low A (in which case the interest
rate is expected to be low and capital is expected to be flowing out), this should be
related to low social returns (low human capital, poor infrastructure, bad geography) or
to a large gap between social and private returns (market failures, distortionary taxation,
government failures). When the main problem is instead a high cost of finance, then the
country should exhibit high interest rates, and the distortions should be related to
inefficiencies in the financial market, such as low enforcement of loan contracts, low
savings, restrictions to capital inflows, or imperfect competition in domestic banking.
271
The recognition that sound policies work better if embedded in strong institutions had already
materialized in 2001, with a new list of priority reforms that became known as the Monterrey Consensus.
This list complemented the original (the so-called first generation reforms) with a number of “action
points” addressing issues such as governance, corruption and human rights (second generation reforms).
272
Hausmann et al. (2008): “This [The Washington Consensus] is a laundry-list approach to reform that
implicitly relies on the notions that (1) any reform is good; (2) the more areas reformed, the better; and
(3) the deeper the reform in any area, the better”. However: “(…) the principle of second-best indicates
that we cannot be assured that any given reform taken on its own can be guaranteed to be welfare
improving, in the presence of multiple economic distortions”. (pp. 329-330).
Using this conceptual model, policymakers should identify the one or two most
binding constraints to economic growth and adjust the policy to tackle this small
number of constraints, only.
Article for discussion: “Getting the Diagnosis Right”. Hausmann, R.,
Rodrik, D., Velasco, A., 2006. Finance and Development 43(1), March
2006.
Box 11.6. Easterly: Policies can destroy growth!
To illustrate the role of policies in economic growth, William Easterly (2005)
estimated a number of cross-country growth regressions, using a panel of 5-year
averages along 1960-2000. The author considered 6 variables capturing distinct
dimensions of policy (details in Table 11.3): the inflation rate (INFL), the government
budget balance (BB), a measure of real exchange rate overvaluation (LREALOVR), the black
market premium on foreign exchange (LBMP), a measure of financial development
(M2/GDP) and a measure of trade openness (TRADE). The author then regressed the
growth rate of GDP per capita (five-year averages, 1960-2000) on all these six variables.
Table 11.3 describes Easterly’s findings. Column (1) presents the estimated
coefficients of a simple regression where the 6 explanatory variables are included. All
signs are in accordance with the theory and most coefficients are significant273.
At first sight, the results in Column (1) suggest that improving the quality of
economic policies may have a significant impact on growth. To quantify this, Easterly
computed the implied effects on growth resulting from a one standard deviation
improvement in each of the policy variables. These effects are displayed in Column (2);
the one standard deviation changes in the policy variables are displayed in Column (4).
For instance, a reduction in the inflation rate by one standard deviation (that is, by 32
p.p., as shown in Column 4) would raise per capita income growth by 0.6 p.p.
(Column 2). According to this table, if all variables were improved at the same time, the
overall impact on per capita GDP growth would be as much as 3 p.p. This finding suggests that
getting the policies right is good for growth.
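The figures in Column (2) appear to be the product of the coefficient in Column (1) and the one standard deviation improvement in Column (4), expressed in percentage points. The short sketch below reproduces that arithmetic with the numbers reported in Table 11.3; the variable names are chosen here for illustration.

```python
# Column (1): estimated coefficients; Column (4): one-standard-deviation improvements.
coefficients = {"INFL": -0.018, "BB": 0.092, "M2": 0.01,
                "LREALOVR": -0.014, "LBMP": -0.012, "TRADE": 0.01}
improvements = {"INFL": -0.325, "BB": 0.054, "M2": 0.253,
                "LREALOVR": -0.387, "LBMP": -0.558, "TRADE": 0.454}

# Implied change in per capita growth, in percentage points.
effects_pp = {name: 100 * coefficients[name] * improvements[name]
              for name in coefficients}
total_pp = sum(effects_pp.values())

for name, effect in effects_pp.items():
    print(f"{name:9s} {effect:.1f} p.p.")
print(f"{'Total':9s} {total_pp:.1f} p.p.")
```

Because every improvement enters with the sign that raises growth (lower inflation, a higher budget balance, and so on), all six contributions are positive and simply add up.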
Easterly pointed out, however, that such an exercise has enormous limitations.
First, he observed that a one standard-deviation change in any of these six policy
variables (Column 4) is outside the experience of most countries: reducing the inflation
rate by 32 p.p., improving the budget balance by 5 p.p., improving the M2/GDP ratio by
25 p.p., etc., are certainly not easy to achieve for most countries. The reason, he argues, is
that these large standard deviations are related to the presence of very extreme
observations (extreme inflation, extremely high deficits and extremely high black
market premiums “on the bad side”, extremely high monetization ratios and extreme
degree of openness “on the good side”). These extreme observations influence critically
the statistical significance of the policy variables in growth regressions.
To abstract from the presence of these outliers, the author ran the regression
again, restricting the sample to observations where all six policy variables lie in the
range of “moderate” policies, that is, removing the extreme values from the sample (this
reduces the sample to roughly one half). Column 3 of Table 11.3 displays the
273
M2 and TRADE perform less well than the other variables, but the author shows that they become
significant once any one of the other five variables is dropped out of the regression equation.
corresponding results. Excluding observations where any of the six policy variables are
extreme, all policy variables become insignificant!
Table 11.3 – Easterly regressions on policies and growth
Regressions of per capita growth on basic set of 6 policy variables.
Dependent variable: LGDPG (log per capita growth, five-year averages, 1960-2000)

| Variable | (1) Coefficient in growth regression | (2) Change in growth from one standard deviation change in policy (%) | (3) Sample: moderate policies, only | (4) Memo: improvement of one standard deviation in policy variable |
|---|---|---|---|---|
| INFL | -0.018 | 0.6 | -0.064 | -0.325 |
| | (2.61)** | | -1.23 | |
| BB | 0.092 | 0.5 | 0.018 | 0.054 |
| | (2.81)** | | 0.22 | |
| M2 | 0.01 | 0.3 | -0.004 | 0.253 |
| | 1.37 | | 0.27 | |
| LREALOVR | -0.014 | 0.5 | 0.001 | -0.387 |
| | (2.97)** | | 0.06 | |
| LBMP | -0.012 | 0.7 | -0.038 | -0.558 |
| | (2.33)* | | -0.95 | |
| TRADE | 0.01 | 0.5 | 0.01 | 0.454 |
| | 1.92 | | 1.09 | |
| Constant | 0.016 | | 0.027 | |
| | (3.62)** | | (2.52)* | |
| Observations | 422 | | 193 | |
| R-squared | 0.18 | | 0.03 | |

Notes: Robust t statistics are reported below each coefficient. *Significant at 5%; **Significant at 1%.
(3) Restrictions under moderate policies: INFL between -0.05 and 0.3, BB between -0.12 and 0.02, M2 < 1.0, LREALOVR between -0.5 and 0.5, TRADE < 1.20, LBMP between -0.05 and 0.3.
Variables used: LGDPG: log per capita growth rate; INFL: log(1 + inflation rate); BB: government budget balance/GDP; M2: M2/GDP; LREALOVR: log(overvaluation index/100), above zero indicates overvaluation; LBMP: log(1 + black market premium on foreign exchange); TRADE: (exports + imports)/GDP.
Source: Easterly (2005)
This finding is telling. It suggests that the results in column (1) are
mainly driven by extreme values. In other words, the results in column (1) may be
reflecting mainly the potential for destruction of bad policies. Thus, countries with
moderate values of these variables are not expected to obtain large gains by achieving
moderate improvements in their policies. Based on this, Easterly concluded that:
“Although extremely bad policy can probably destroy any chance of growth, it does not
follow that good macroeconomic or trade policy alone can create the conditions for high
steady state growth” (p. 1017).
- A market distortion can be thought of as a tax that drives a wedge between private returns and social
returns, reducing the level of market activity.
- A first type of distortion analysed in the chapter is that which influences the consumption-savings
decision. By affecting the marginal product of capital, these distortions impact on the interest rates
and thereby on saving rates, causing capital accumulation to slow down.
- An example of such a distortion relates to financial markets. Financial market frictions give rise
to transaction costs that drive a wedge between the marginal product of capital paid by firms and the
net return actually received by savers. The larger this intermediation margin, the lower the return
on savings. Financial development, by reducing the costs of intermediation, impacts positively on
capital accumulation and economic performance.
- A second type of distortion analysed in this chapter alters the relative price of two inputs, inducing
firms to change the proportions in which they are used. Examples of this type of distortion occur with
import protection, segmented labour markets, high inflation, monopoly and externalities.
- Distortions in factor markets may also arise as a result of inequalities in income distribution. Because
at the individual level there are physical limits to human capital accumulation while physical capital
can be accumulated without bound, an unequal income distribution will deliver an over-investment in
physical capital and under-investment in human capital.
- When governments try to subsidize some goods, this often comes at the cost of an explicit or implicit
tax on other goods. William Easterly labelled these “tax-cum-subsidy schemes”. Examples of these
schemes include interest rate ceilings, directed banking credits, and dual exchange rates. In the model
we analysed, a budget neutral tax-cum-subsidy scheme affecting the two types of capital leads to a
higher price of capital on average and therefore less capital accumulation.
- A dramatic case of distortion in factor markets occurs as a consequence of tax evasion. When the
government is unable to tax equally two types of capital, there will be an incentive for firms to use
more of the type of capital that escapes taxation. When this is so, the optimal intervention will
involve a lower provision of the public input. This case provides an illustration of second best
decision-making.
- The view that reform agendas should focus on market failures and deadweight losses was well
reflected in the Washington Consensus. The experience with the implementation of the Washington
consensus led some authors to argue that more attention should be given to the quality of domestic
institutions. Other authors argued that instead of addressing all types of market failures, policymakers
should target only a few of them, or even accept less static efficiency to achieve faster
economic growth.
Key concepts
- Distortion
- Tax-cum-subsidy schemes
- Second best decision-making
- The Washington consensus
Essay questions:
- Comment: “bad policies are more likely to be tolerated in poor countries
where financial markets are underdeveloped and A is low than in rich
countries”
- Comment: “Financial development is good for growth”
- Explain the mechanisms through which high inflation could affect
economic growth
- Explain why income inequality may lead to a suboptimal accumulation
of human capital.
- Comment: “Under tax evasion, tax rates tend to be too high and public
provision too little”
- Comment: “Policies can destroy growth”
- Comment: “The emphasis on deadweight-loss triangles and with seeking
the efficiency gains from their elimination is an incomplete agenda to
foster economic growth”
Exercises
11.1.
Consider an economy, where the production function is given by Y=0,2K, the
population grows at 1% per year and physical capital depreciates at 4%.
a) Assume for the moment that the saving rate is constant and equal to
30%. Describe the main equations of this model and find out the growth
rate of per capita income in this economy. Discuss, with the help of a
graph the dynamic properties of this model.
b) Consider now that the government imposes a tax on production, whose
proceeds are used to finance unproductive government consumption.
Describe the main income identities of this economy and place them in a
flow income chart. Find out the growth rate of per capita income in this
economy when τ=0% and when τ=20%. Explain.
c) Now assume that the saving rate is endogenous, so as to satisfy the
following inter-temporal consumption rule: $\gamma_t = r_t - 0.15$.
i. Explain this equation.
ii. Departing from the identity $\dot{K} + \delta K = s(1-\tau)Y$, compute the
endogenous saving rate in this economy, as a function of the tax
rate. Explain.
iii. Compute again the growth rates of per capita income when
τ=0% and when τ=20%.
11.2.
Consider an economy with N=90 workers, a traditional sector that produces good X and
an urban sector that produces good Y, where the respective production functions are
X=ln(Nx) and Y=2ln(Ny).
a) Find the equilibrium of the labour market, assuming full flexibility of
wages.
b) Now assume that the urban sector faces a legal minimum wage equal to
1/20. Sticking with the assumption of full wage flexibility in the
traditional sector, find out the labour market equilibrium and represent it
in a graph. Identify the implied distortion.
c) Now assume that the probability of finding a job in the urban sector was
p<1. What would be the equilibrium in this case? Identify the
corresponding deadweight loss.
11.3
Consider an economy where aggregate output is produced using two types of
capital, according to: $Y = K_1^{0.5} K_2^{0.5}$. The total capital available each moment in time
evolves according to $sY = (\dot{K} + \delta K)(1 + \tau_K)$, where $\tau_K$ refers to financial market
imperfections.
a) Interpret the equation describing capital accumulation
b) Draw the income flow chart of this economy.

c) Show that, as long as the ratio of the two types of capital is constant, the
production function has an AK representation.
d) Find out the ratio $K_1/K_2$ that maximizes the efficiency level in this
economy. Interpret.
e) Suppose that investment decisions for each type of capital were
undertaken by two independent firms, which take the other firm’s
decision as given. Show that, as long as both firms have access to credit at
the same interest rate, the maximum efficiency will be achieved.
f) Describe the growth rate of this economy as a function of the exogenous
parameters. Interpret.
11.4.
Consider an economy composed of a large number of identical firms, with
production functions of the form: $Y_i = 0.5 K_i^{1/3} H_i^{2/3}$, where H=hN, N is the number
of workers and h measures the quality of labour. We also know that the saving rate
is 12,5%, population is constant and the depreciation of both physical and human
capital is 4%. In this economy, the government has the ability to coerce citizens to
pay taxes out of their factor incomes. Let τK be the tax rate on the physical capital
and τH the tax rate on human capital.
i. Solve the individual firm problem and find out the implied factor
income shares.
ii. Describe, with the help of a graph, the effects of taxation on the
relative use of the two types of capital. Compare the cases
where τK= τH=0 , τK= τH>0 and τK> 0, τH=0.
iii. Consider the following optimal consumption rule (γ = r – 0.05).
Find out the growth rate of per capita income in this economy,
depending on the tax rates. Explain.
g) Now assume that $A = (G/Y)^{1/2}$, where G are public inputs.
iv. Compute the aggregate production function of this economy.
Explain the market failure.
v. Obtain an expression for A in terms of the two tax rates.
vi. Compute the benevolent planner solution. Graph the equilibrium
and explain.
vii. Now assume that the government could not tax Human Capital.
Find out the optimal tax on physical capital and the
corresponding provision of public inputs. Compare with the
first best outcome and explain.
h) Now, return to the case without public inputs, but assume instead that
there was a positive externality associated to the use of Physical capital,

K
according to: A    .
Y 
viii. Compute the aggregate production function of this economy.

Explain the market failure.
ix. Suppose that the government wants to solve the market failure
by imposing a tax on human capital only. What would be the
optimal intervention? And the corresponding growth rate of the
economy?
x. Now suppose that the government wanted a balanced budget, so
it would use the tax proceeds to finance a subsidy on human
capital. What would be the optimal intervention?
xi. Compare the two cases in light of the Tinbergen framework.
11.5.
In Unevenland, the aggregate production function can be described as $Y = EK$, where E
denotes aggregate efficiency and K denotes physical capital. The latter
depreciates at the rate δ=0.05. Capital markets are perfect and there is no uncertainty, so
households are able to smooth consumption inter-temporally, according to $\gamma = r - 0.1$.
i) Find out the growth rate of per capita income in this economy in terms of
E. Describe the dynamic properties of the model.
j) Elaborating a bit more, suppose that the efficiency term is better
described as a ratio of two terms, $E = 0.5\,A / P_I$, where A is constant and
$P_I$ denotes the relative price of capital goods. Which policies or
country circumstances may be captured by $P_I$?
k) Assume that initially A  1 2 . Compute the growth rates of per capita
income in this economy when PI  1 2 and when PI  3 2  . With
12
the help of a graph, compare with the impact of a similar change in the
context of the Solow model.
l) The economy of Unevenland is actually more complex than it appears at first
sight. In particular, each individual firm i faces a production function of
the form: $Y_i = A K_i^{1/2} H_i^{1/2}$, where H is human capital and $A = (G/Y)^{1/2}$,
where G denotes a public good. The government budget constraint is

given by G  1 2 K 1   K    H 1   H Y , where  K is the tax rate
on physical capital incomes and  H the tax rate on human capital
incomes. PI becomes PI  1   K 1   H  .
12
m) Compute the aggregate production function and explain why there is a

market failure.
n) The benevolent planner of Unevenland chooses the intervention level
G/Y so as to maximize the growth rate of per capita income or - which
is the same - the efficiency term, E. For the moment, however, assume
that the government can only levy taxes on physical capital, $\tau_K$ (that is,
$\tau_H = 0$). Find out the optimal level of $\tau_K$ and the corresponding level of
government intervention.
o) Now assume that the government is able to impose a uniform tax rate
(that is, $\tau_H = \tau_K$). Find out the optimal level of $\tau_K$ and the
corresponding level of government intervention. Explain the effect on
$P_I$ and compare this case to the one in question c.
p) Compare the solutions of f and g and discuss.
11.6.
Consider an economy composed of a large number of identical firms, with
production functions in the final good sector of the form: $Y_i = 0.5 K_i^{1/3} H_i^{2/3}$, where K and
H are physical and human capital. We also know that one unit of output can be
transformed into one unit of physical capital or into one unit of human capital and that the
depreciation rate for both types of capital is 3%. Consider that each firm i in the final
good sector is a price taker in factor markets. Let PK be the price of physical capital and
PH be the price of human capital.
a) From the profit maximization problem of firms in the final good sector,
find out the optimal relation between H and K as a function of factor
prices.
b) Now consider the case in which both factor markets are competitive, that
is, each firm producing physical and human capital is price taker.
Describe the profit maximization problem of a representative firm
producing (human or physical) capital, find out the optimal price and the
corresponding profits.
c) Now assume that the supply of physical capital is undertaken by a
monopolist. Solve its profit maximization problem, finding out its
optimal price and the corresponding profits.
d) Compare the relative employment of physical and human capital in the
case of perfect competition and in the case of a monopoly. Compare it
with the efficient allocation.
e) Using the optimal consumption rule γ = r – p, compare the growth rates
of this economy in the two cases.
f) How could the government remove this distortion?
“…I do not mean the Big Push is really the right story of how development
takes place (…). What I do mean is that the unconventional themes put forth by the high
development theorists – their emphasis on strategic complementarity in investment
decisions and on the problem of coordination failure – did in fact identify important
possibilities that are neglected in competitive models”. [Paul Krugman]
Learning Goals:
• Understand why economies of scale are a source of multiple equilibria
and of divergence.
• Understand the various arguments underlying the big push idea.
• Understand the implications of economies of scale for international trade
and factor mobility.
• Acknowledge the key role of transport costs in shaping economic
geography.
• Understand the fundamental role of geography for economic
development.
12.1 Introduction
An argument that has been put forward by many economists is that poor
countries are unable to adopt modern technologies due to the small size of their
domestic markets. This reasoning has a long tradition in economic thinking. Dating
back to Adam Smith (1776), economists have argued that industrialization involves
the use of modern technologies, which are potentially more productive than traditional
technologies, but which entail some form of economies of scale. Because the adoption of
these technologies requires a minimum level of sales, countries with small domestic
markets or with limited access to international trade may fail to industrialize. This
reasoning led development economists in the 1950s and the 1960s to argue that
governments in poor countries should promote massive investment plans, so as to boost
demand and achieve self-sustained industrialization. This idea, dubbed “the big push”
by the Austrian economist Rosenstein-Rodan, inspired many development economists
at that time and remains influential today.
This chapter departs from the perfect competition paradigm to focus on a model
with internal economies of scale. The model is then used to review some theories
according to which complementarities in investment decisions may give rise to
coordination failures and economic divergence. The chapter does not address
technological change in the sense of explaining why new technologies are invented. It
does, however, review different theories that have been put forward to explain why some
countries are able to industrialize while others are not, and the potential role of the
government in overcoming the underlying circularities. Because in this literature the
initial conditions play a key role in determining the equilibrium, we take the opportunity
to review the argument that geography has played a key historical role in determining
the initial conditions.
Section 12.2 reviews the original big push argument. In this model, the number
of product varieties is fixed, so the critical question is whether the market is large
enough for firms to break even. Section 12.3 introduces a different paradigm, where free
entry implies that all firms will exactly break even. This model introduces the
relationship between the size of the market and the division of labour. Section 12.4
introduces the possibility of reverse causation from the division of labour to the extent
of the market. Section 12.5 addresses the implication of factor mobility in models with
economies of scale, to briefly explain the circularities involved in the so-called New
Economic Geography. Section 12.6 reviews the Geography Hypothesis and the
institutions vs geography debate. Section 12.7 concludes.
12.2. The Big Push model
Traditional vs. modern technologies
Consider a closed economy, where the only primary input is labour, whose total
supply is fixed and equal to N 274. Output (Y) is produced using a composite
intermediate good (X), which in turn is assembled with m intermediate inputs:
$$Y = X, \tag{12.1}$$
$$X = \left( \sum_{i=1}^{m} x_i^{1-\sigma} \right)^{\frac{1}{1-\sigma}}, \quad \text{with } \sigma < 1. \tag{12.2}$$
This technology exhibits constant returns to scale, because increasing the use of
all intermediate inputs in the same proportion leads to a proportional increase in Y275.
Throughout this section, it is assumed that each intermediate input can be produced
with one of two technologies: a “traditional” technology with constant returns, and a
“modern” technology with increasing returns.
The traditional technology is assumed to be the same for all intermediate inputs and
equal to:
$$N_j = x_j, \tag{12.3}$$
where $N_j$ is the amount of labour used in the production of the intermediate input j.
274
The model below follows Murphy et al. (1989), drawing also from Sachs and Warner (1999) and
Krugman (1995).
275
Another important property is that the marginal product of each variety increases with the number of
varieties. Also note that the direct partial elasticity of substitution between every pair of varieties is equal
to $1/\sigma$. The restriction $\sigma < 1$ implies that no intermediate input is essential to production.
The modern technology can be thought of as a factory that involves a fixed cost (F
units of labour) and a marginal cost of production. The corresponding production
function (also equal across sectors j) is:
$$N_j = F + \frac{x_j}{\alpha}, \quad \text{with } \alpha > 1. \tag{12.4}$$
The efficiency locus
From (12.3) and (12.4), it follows that modern production will economize on labour
relative to cottage production in sector j if $x_j > F + x_j/\alpha$, that is, if:
$$x_j > \frac{F}{1 - 1/\alpha} \equiv x_j^Q, \tag{12.5}$$
where $x_j^Q$ stands for the critical level above which production will better take
place in a plant.
Symmetry in (12.1)-(12.4) implies that, in equilibrium, all varieties will be
produced in the same amount ($x_j = x, \forall j$). Thus, the employment level in each sector
will be:
$$N_j = \frac{N}{m}, \quad j=1,\dots,m. \tag{12.6}$$
Clearly, whether this economy is better served with modern production or with
cottage production depends on the size of its population, N276: if the economy were run
by a central planner concerned with aggregate productivity (or per capita income), he
would command all sectors to industrialize if the size of the population exceeded the
breakeven level:
$$N^* = m\left(\frac{F}{1 - 1/\alpha}\right) = m\,x_j^Q, \tag{12.7}$$
and all sectors to remain traditional otherwise.
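As a numerical illustration of (12.3)-(12.7), the following minimal Python sketch compares the labour requirements of the two technologies and computes the critical output and population levels. The function names and the parameter values (F = 2, α = 2, m = 10) are hypothetical, chosen only for illustration; α denotes the productivity parameter of the modern technology in (12.4).

```python
def labour_traditional(x):
    """Labour needed to produce x units with the traditional technology (12.3)."""
    return x


def labour_modern(x, F, alpha):
    """Labour needed with the modern technology (12.4): fixed cost F plus x/alpha."""
    return F + x / alpha


def critical_output(F, alpha):
    """Output level x^Q in (12.5) above which the factory saves labour."""
    return F / (1.0 - 1.0 / alpha)


def critical_population(m, F, alpha):
    """Population threshold N* in (12.7): m sectors, each producing x^Q."""
    return m * critical_output(F, alpha)


# Hypothetical parameter values, for illustration only.
F, alpha, m = 2.0, 2.0, 10

x_q = critical_output(F, alpha)
n_star = critical_population(m, F, alpha)
print(x_q, n_star)  # 4.0 40.0

# Below x^Q the cottage technology uses less labour; above it the factory does.
for x in (2.0, 4.0, 6.0):
    print(x, labour_traditional(x), labour_modern(x, F, alpha))
```

With these values the two technologies require the same labour at x = 4, so a planner would industrialize only if each sector can employ more than 4 workers, that is, if N exceeds 40.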
Figure 12.1 provides a graphical illustration277. The figure plots the production
function (12.3) as OL and production function (12.4) as FH. As shown in the figure,
because of the fixed costs, F, at low levels of production (to the left of the critical point
Q), labour is more productive if employed in traditional units than in modern factories.
Beyond the critical level, the higher productivity of labour in modern factories ($\alpha > 1$)
more than offsets the fixed cost of building a factory.
In Figure 12.1, the central planner's optimal technological choice for each level of
population coincides with the (efficiency) locus [OQH]. Note that if the central planner
276
Remember that this model refers to a closed economy.
277
The figure is borrowed from Krugman (1995).
could choose the size of the population, he would choose it to be as large as possible, so
as to take advantage of increasing returns278.
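Figure 12.1 – The Big Push Model
[Figure: output per sector $x_j$ (vertical axis) against labour per sector $N_j$ (horizontal axis), showing the traditional technology $N_j = x_j$ (line OL), the modern technology $x_j = \alpha(N_j - F)$ (line FH), the wage bill curve $wN_j$, the critical point Q at $x_j = F/(1 - 1/\alpha)$, and the points R, L, U and H. Employment $N/m$ is marked on the horizontal axis and the output levels $x_j^L$ and $x_j^H$ on the vertical axis.]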
Why the competitive equilibrium may fail
We just saw that switching from cottage production to modern production is
efficient when the population exceeds the threshold N* in (12.7). Thus, a central planner concerned with aggregate efficiency
would command the economy to adopt modern production if the population size was
large enough and to remain traditional otherwise.
A different question is whether the best allocation will actually be reached in a
decentralized economy. In other words, will market forces alone be sufficient to induce
industrialization when it is desirable?
Some authors have argued that this is not always the case. A well-known
argument is from Rosenstein-Rodan (see Box 12.1). The author contended that, while
lack of industrialization may be a consequence of an insufficient market size, an
insufficient market size might itself be a consequence of low industrialization.
To understand the argument, one must de-link the concept of market size from
that of population size. Indeed, if a country's market size were defined by the size of its
population, then economies with larger populations, such as China and India, should be
278
In Figure 12.1, the average product of labour can be measured by the slope of the ray that departs from
the origin and crosses the locus [OQH] in each point. You may easily check that the average product of
labour is invariant with the size of the workforce along the segment OQ, but becomes an increasing
function of the workforce in the segment QH. This is an obvious implication of moving from constant
returns to increasing returns.
more industrialized than, for instance, Germany and France. As we know, this is not the
case. In fact, it is not the size of population that matters, but the size of aggregate
demand, which in turn depends on income. A given population size will translate into
more or less aggregate demand, depending on the employment level combined with
productivity. An economy with a large population and low productivity may not generate
enough income to cover the fixed costs of industrialization.
Box 12.1. Rosenstein-Rodan and the shoe factory
The term Big Push was coined by Rosenstein-Rodan. In his seminal 1943
article, Problems of industrialization of Eastern and South-Eastern Europe, the author
imagined a country in which 20,000 workers are taken from agriculture and put into a
modern shoe factory, earning wages higher than their previous income. Rosenstein-Rodan
argued that such an investment would not enlarge the overall market size, because
workers of the shoe factory are not expected to spend all their income on shoes.
Hence, the investment in the shoe factory would most probably be unprofitable in
isolation. It could, however, become profitable if accompanied by simultaneous
investments in many other industries: according to the author, a coordinated investment
effort, spread over a large number of industries (the “Big Push”), could solve the
problem of insufficient demand, because industries would act as each other’s buyers.
In that case, each entrepreneur would find it profitable to incur the fixed costs of
industrialization, even if no sector could break even when industrializing alone.
Rosenstein-Rodan was largely inspired by the Marshall Plan in post-WWII
Europe and used this idea to call for large-scale external assistance to Eastern Europe.
The argument was echoed by other development economists at that time, like Ragnar
Nurkse and W. W. Rostow, and has been a workhorse in development economics ever
since. The theory had, however, to wait until 1989 to be properly formalized in an
economic model, by Murphy, Shleifer and Vishny279.
The market structure in the big push model
In the following analysis, let’s assume that equation (12.7) holds, so that
industrialization would be socially desirable. The question is whether industrialization
will be naturally achieved under laissez faire. To analyse this question, one needs to
assume a market structure for this economy.
Since the production function in the final good sector (12.1-12.2) exhibits
constant returns to scale, a natural assumption to make is that the final good sector
operates under perfect competition. The same holds for intermediate-good sectors using
the traditional technology. Yet production with the modern technology, because it
involves economies of scale that are internal to the firm, should be operated by a
monopoly.
For convenience, the wage rate in traditional production will be the numeraire of
the model (that is, equal to 1)280. In what follows, it is assumed that traditional and
279
Nurkse (1953), Rostow (1960), Murphy et al. (1989).
280
In contrast to all other models we have used so far, all prices in this model are defined in units of
cottage labour, instead of in units of Y.
modern productions do not pay the same wage. Although labour is homogeneous,
workers must be paid a premium (a compensating differential) to work in a factory. So,
the wage rate in a modern plant will be w>1. A way of rationalizing this is to assume
that working in a factory involves some disutility, so workers require a higher wage in
compensation. As we will see shortly, this wage premium is essential for the big push
story to fit in the model.
In the final good sector, Y-producers operate under perfect competition, so they
take both the output price $p_Y$ and the price of each intermediate input $p_j$ as given.
Profit maximization in the final good sector then implies the following demand for each
intermediate input j:
$$x_j = \left(\frac{p_Y}{p_j}\right)^{\frac{1}{\sigma}} Y, \quad j=1,\dots,m. \tag{12.8}$$
These demand functions capture a critical aspect of the model: they are
proportional to production in the final good sector (Y).
Producers using the traditional technology operate under perfect competition, so
they take both the price of their intermediate input ($p_j$) and the wage rate (equal to 1) as
given. Given (12.3), profit maximization in cottage industries implies:
$$p_j = 1. \tag{12.9}$$
An entrepreneur who escapes competition by investing in a plant becomes a
monopolist in his market. It is assumed, however, that this move is a non-drastic one: the
technological advantage of modern production is assumed too small to allow the
entrepreneur to charge the full monopoly price. Hence, the best he can do is to set the
limit price (12.9) and undercut the traditional producers.
Figure 12.2 illustrates this. Point C represents the market equilibrium before the
monopolist's entry (i.e., with price and marginal cost both equal to one and zero profits).
Adopting the modern technology, the monopolist achieves a marginal cost equal to
$w/\alpha$ (the entrepreneur is assumed to be small in the labour market). Given the demand
curve (12.8), the conventional profit maximization rule corresponds to point R, where
the line describing the marginal cost intersects the locus of marginal revenues. In this
case, however, the corresponding monopoly price (point M) is higher than the
competitive price, (12.9). Hence, the best the monopolist can do is to set the price just
marginally below 1 to undercut its rivals and still capture the whole market, pocketing a
gross profit equal to the shaded area in the figure281. Total profits will be positive or
negative, depending on how the shaded area compares with the fixed costs wF.
281
Formally, profit maximization in the intermediate-good sector j taking into account the demand
function (12.8) leads to the well known pricing rule: $p_j = w/[\alpha(1-\sigma)]$. For the innovation to be
non-drastic, this price needs to be higher than the competitive price (12.9), which will happen when
$\alpha(1-\sigma) < w$. This condition states that the monopolist is more likely to be constrained by the competitive
fringe when her productivity advantage (as captured by $\alpha$) is not too high. If, alternatively, the cost
reduction were large enough so that the new marginal cost (horizontal) curve crossed the marginal
revenue curve below point S, then the innovation would be drastic. For more on drastic vs non-drastic
innovations, the reader is referred to Section 7.4.
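Figure 12.2. Non-drastic innovation
[Figure: price $p_j$ (vertical axis) against quantity $x_j$ (horizontal axis), showing the demand curve $p_j = p_Y (Y/x_j)^{\sigma}$, the marginal revenue curve, the marginal cost $w/\alpha$, the unconstrained monopoly price $w/[\alpha(1-\sigma)]$ at point M (quantity $x_j^M$), the competitive price 1 at point C (quantity $x_j^C$), and the points Q, R and S.]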
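The pricing rule quoted in the footnote on non-drastic innovation can be recovered in two steps. The sketch below assumes that the monopolist takes Y and $p_Y$ as given, and writes σ for the parameter in (12.2) and α for the productivity of the modern technology in (12.4).

```latex
% Monopolist's problem, gross of the fixed cost wF:
\max_{p_j}\ \Big(p_j - \frac{w}{\alpha}\Big)\,x_j
\quad \text{s.t.} \quad
x_j = \Big(\frac{p_Y}{p_j}\Big)^{1/\sigma} Y .

% First-order condition, using \partial x_j / \partial p_j = -\frac{1}{\sigma}\frac{x_j}{p_j}:
x_j - \frac{1}{\sigma}\Big(p_j - \frac{w}{\alpha}\Big)\frac{x_j}{p_j} = 0
\;\Longleftrightarrow\;
\sigma\,p_j = p_j - \frac{w}{\alpha}
\;\Longleftrightarrow\;
p_j = \frac{w}{\alpha\,(1-\sigma)} .

% The limit price p_j = 1 binds (non-drastic case) when this exceeds 1:
\frac{w}{\alpha\,(1-\sigma)} > 1
\;\Longleftrightarrow\;
\alpha\,(1-\sigma) < w .
```

The markup $1/(1-\sigma)$ is the usual one for a demand curve with constant price elasticity $1/\sigma$.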
The size of the market
The convenient implication of assuming that the innovation is non-drastic is that
the price charged by an entrepreneur operating a factory will be independent of his output
level. Given the wage rate (w>1), the profit function of an entrepreneur engaging in
modern production will be:
$$\pi_j = \left(1 - \frac{w}{\alpha}\right) x_j - wF. \tag{12.10}$$
With the price of all intermediate goods equal to 1, from (12.8) we see that the
final good sector will demand identical quantities of each variety:
$$x_j = p_Y^{1/\sigma}\, Y, \tag{12.11}$$
or:
$$x_j = x, \quad \forall j. \tag{12.12}$$
Using (12.12) in (12.1)-(12.2), production of the final good becomes:
$$Y = m^{\frac{1}{1-\sigma}}\, x. \tag{12.13}$$
These equations imply that a larger output of each variety x leads to a larger
aggregate income (12.13) and that this feeds back into the demand for each variety
(12.11). Thus, there is much to gain if intermediate inputs are produced with the
modern technology282.
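The step from (12.2) to (12.13), and the output gain from industrialization, can be sketched as follows. The labels $Y^L$ and $Y^H$ for aggregate output in the all-traditional and all-modern allocations are introduced here only for illustration; σ and α denote the parameters of (12.2) and (12.4).

```latex
% Symmetric quantities x_i = x in (12.2):
X = \Big(\sum_{i=1}^{m} x^{1-\sigma}\Big)^{\frac{1}{1-\sigma}}
  = \big(m\,x^{1-\sigma}\big)^{\frac{1}{1-\sigma}}
  = m^{\frac{1}{1-\sigma}}\,x
\quad\Longrightarrow\quad
Y = m^{\frac{1}{1-\sigma}}\,x .

% Output per sector is N/m with cottage production and \alpha(N/m - F) with factories:
Y^L = m^{\frac{1}{1-\sigma}}\,\frac{N}{m}, \qquad
Y^H = m^{\frac{1}{1-\sigma}}\,\alpha\Big(\frac{N}{m} - F\Big), \qquad
\frac{Y^H}{Y^L} = \alpha\Big(1 - \frac{mF}{N}\Big).
```

The ratio exceeds one exactly when $N > mF/(1 - 1/\alpha)$, which is the threshold $N^*$ in (12.7).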
Multiple equilibria
282
This illustrates well the following statement by Young (1928), p. 533, 534: “the size of the market is
determined and defined by the volume of production. (…) an increase in the supply of one commodity is
an increase in the demand for other commodities”.
We now turn to the question of technological choices by individual
entrepreneurs. Since all entrepreneurs face the same technology and demand conditions,
if any one of them finds it profitable to invest in a plant, all will find it profitable. So there are
only two possible equilibria in this economy: one in which all industries remain
traditional (and aggregate demand is low); and one where all industries adopt the
modern technology (and aggregate demand is high). In the following, we investigate the
conditions under which these two equilibria actually exist.
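Figure 12.4 – Multiple equilibria in the big-push model
[Figure: price $p_j$ (vertical axis) against quantity $x_j$ (horizontal axis), with the limit price 1 and the marginal cost $w/\alpha$ drawn as horizontal lines. Point L lies at $x^L = N/m$ and point H at $x^H = \alpha(N/m - F)$. Area (a) is the operating profit at $x^L$, areas (a)+(b) the operating profit at $x^H$, and wF is the fixed cost.]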
For an allocation to be an equilibrium, all agents must be happy with their
choices, so that an economy starting out with that allocation will remain there.
Consider first the case in which all x-producers start out with the traditional
technology, (12.3). In that case, output in each industry is equal to $x_j^L = N/m$. Such an
allocation will be an equilibrium if no entrepreneur finds it profitable to set up a
factory when all other sectors remain traditional. This possibility is illustrated by point
L in Figure 12.4: in the figure, because all other industries remain traditional, the
demand for industry j is low and the operating profits from running a plant (shaded area a) are
insufficient to cover the assumed fixed cost (wF) 283.
Now consider the case where all sectors are already industrialized. In that case,
each firm will be producing $x_j^H = \alpha(N/m - F)$. This choice will be an equilibrium if all
entrepreneurs find it profitable to remain with the modern technology. In Figure 12.4,
this possibility is illustrated by point H: because all other industries are modern, the
demand for industry j is high, so the operating profits from running a plant (the shaded areas
a+b) exceed the fixed cost (wF).
283
The analysis presumes that each sector is small relative to the economy (m is large), so that the
aggregate demand effect resulting from one single innovation is negligible. In other words, the demand
for each input remains equal to $x_j^L$.
In sum, for both L and H to be equilibria, one needs the fixed costs wF to lie
between the areas a and a+b in Figure 12.4. Formally, the following condition must
hold: $(1 - w/\alpha)(N/m) < wF < (1 - w/\alpha)\,\alpha\,(N/m - F)$. Solving for the wage rate, the
condition for multiple equilibria is284:
$$\frac{N/m}{F + N/(\alpha m)} < w < \frac{\alpha\,(N/m - F)}{N/m}. \tag{12.14}$$
When the wage rate lies in this interval, two outcomes are possible. If all firms are engaged in modern
production, productivity is high and the market is large enough to make it profitable
for all firms to run a factory (point H); if all firms remain traditional, the market
is small and no firm will find it profitable to invest in a plant (point L). Thus, the
decentralized economy will remain traditional if it starts out traditional and will remain
industrialized if it starts out industrialized.
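The two bounds in (12.14) follow from rearranging the two profit inequalities one at a time. A sketch of the algebra, writing α for the productivity parameter of the modern technology in (12.4):

```latex
% Lower bound: a lone factory facing the low demand N/m must not cover its fixed cost.
\Big(1 - \frac{w}{\alpha}\Big)\frac{N}{m} < wF
\;\Longleftrightarrow\;
\frac{N}{m} < w\Big(F + \frac{N}{\alpha m}\Big)
\;\Longleftrightarrow\;
w > \frac{N/m}{F + N/(\alpha m)} .

% Upper bound: a factory facing the high demand \alpha(N/m - F) must cover its fixed cost.
wF < (\alpha - w)\Big(\frac{N}{m} - F\Big)
\;\Longleftrightarrow\;
w\,\frac{N}{m} < \alpha\Big(\frac{N}{m} - F\Big)
\;\Longleftrightarrow\;
w < \frac{\alpha\,(N/m - F)}{N/m} .
```

Comparing the two bounds shows that the interval is non-empty only if $N/m > F/(1 - 1/\alpha)$, that is, only if population exceeds the threshold $N^*$ in (12.7).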
Note however that when the wage premium does not lie in the range
(12.14), the equilibrium of the model will be unique and well determined. If the wage
premium were less than the lower bound of the interval, the economy would naturally
approach the equilibrium with full industrialization, so there would be no coordination
failure. If the wage premium were higher than the upper bound, the economy would
naturally converge to traditional production285 286.
Clearly, the equilibrium with industrialization is superior, as it allows for more
production using the same amount of labour, N. Since the equilibrium without
industrialization (L) is inferior to the equilibrium with industrialization (H), it can be
interpreted as an underdevelopment trap.
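The three cases just described can be checked numerically. The following minimal Python sketch computes the bounds in (12.14) and classifies the outcome for a given factory wage. The function names and the parameter values (N = 100, m = 10, F = 2, α = 2) are hypothetical, chosen only for illustration, and the sketch assumes, as the text does, that population exceeds N* in (12.7).

```python
def wage_bounds(N, m, F, alpha):
    """Lower and upper bounds on the factory wage w in condition (12.14)."""
    n = N / m                      # labour per sector, (12.6)
    lower = n / (F + n / alpha)    # below this, a lone factory is already profitable
    upper = alpha * (n - F) / n    # above this, factories lose money even when all are modern
    return lower, upper


def classify(N, m, F, alpha, w):
    """Classify the decentralized outcome for a given factory wage w."""
    if N / m <= F / (1.0 - 1.0 / alpha):
        # The text assumes N > N*, so that industrialization is efficient.
        raise ValueError("population does not exceed N* in (12.7)")
    lower, upper = wage_bounds(N, m, F, alpha)
    if w < lower:
        return "industrialization"
    if w > upper:
        return "traditional"
    return "multiple equilibria"


# Hypothetical parameter values, for illustration only.
N, m, F, alpha = 100.0, 10, 2.0, 2.0

lower, upper = wage_bounds(N, m, F, alpha)
print(round(lower, 3), round(upper, 3))  # 1.429 1.6

for w in (1.2, 1.5, 1.8):
    print(w, classify(N, m, F, alpha, w))
```

With these values the underdevelopment trap arises only for factory wages between roughly 1.43 and 1.6 times the cottage wage, which illustrates how narrow the range for a coordination failure can be.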
The coordination failure
The Big Push model is about a coordination failure 287 : when the level of
economic activity is too low, no individual firm finds it profitable to invest, because
other complementary investments are not made. If however a sufficiently large number
of firms invested at the same time, the investment of each firm would expand the
market of all other firms, allowing them to break even and making industrialization
self-sustained.
284
Murphy et al., (1989).
285
As argued by Murphy et al (1989), the conditions for a coordination failure are actually “much more
stringent” than those loosely expressed by Rosenstein-Rodan.
286
The case with multiple equilibria can also be illustrated using Figure 12.1. In the figure, the curve
describing the factory wage bill is $wN_j$. You can visually check that as long as this curve
passes above R and below H, it will not pay for an individual entrepreneur to go modern when the
economy starts out in L, and it pays for an entrepreneur to remain modern if the economy starts out in H.
A unique equilibrium would obtain if the wage bill curve passed below R or above H.
287
Technically, the big push is possible when industrialized firms capture only a fraction of the total
contribution of their investment to the profits of other industrializing firms. In the model, this occurs
because of the wage premium. Alternatively, you could assume that factory workers live in cities, so they
have a demand more biased towards manufactures than cottage workers living in the countryside. Other
mechanisms are discussed in Murphy et al. (1989).
The problem is that in a decentralized economy no such coordination occurs:
because each single investment is too small to influence the size of the market, no
individual firm will find it profitable to invest unilaterally and the economy remains in
the trap. In the light of this model, the government may solve the coordination problem
by stimulating aggregate demand or by convincing a large number of big players to
invest at the same time.
Expectations
The model just described suggests an important role for expectations in solving
the coordination failure. Suppose you belong to a society stuck in equilibrium L. If, at a
certain moment, everybody expected everyone else to invest, it could become
individually worthwhile to invest, because the new investment would be matched by
the larger market resulting from everyone else’s investment. So expectations can
move the economy out of the trap288. However, no entrepreneur will invest if he
believes that the others will not invest. In both cases, expectations are
self-fulfilling.
This discussion suggests that the government does not need to intervene
directly to get the economy out of the trap: what it needs is to convince firms to
invest simultaneously.
Infrastructures
Another source of economies of scale is infrastructure, such as ports,
railroads, power supply, and training facilities. Infrastructures generally involve
large fixed costs, so a minimum demand from potential users is required for them
to be profitable289.
Assuming that each potential user shares the fixed cost of the infrastructure, the
larger the number of potential users, the higher the probability of a given location being
served with the infrastructure. This in turn, may induce the establishment of more users
in that location, in a virtuous cycle.
Although in principle industrialization and infrastructures go together, coordination
failures may prevent essential infrastructures from springing up. Indeed, as long as there
is the possibility of a “bad equilibrium” with no industrialization (such as point L), a
risk-averse infrastructure builder may prefer not to build if he is not sure whether
the economy will actually industrialize. Thus, the infrastructure may not be built - and
hence industrialization will not take place - even if the conditions for
industrialization exist.
The failure of an efficient infrastructure to be built suggests a scope for
government intervention. However, subsidizing the infrastructure may not be sufficient:
if the economy does not industrialize, then there will be no users and the infrastructure
will become a classic “white elephant”. Hence, a coordinated move towards
288
Nurkse (1961) referred to this as “the infectious influence of business psychology” (p. 249).
289
Note the contrast with the model in Chapter 11, where it is assumed that G can be provided in any
continuous amount.
industrialization may require both the subsidy to the infrastructure and a coordinated
investment in modern factories290.
Again, the argument is circular and presumes the existence of multiple
equilibria: an economy under cottage production (bad equilibrium) could move to
factory production (good equilibrium) but some kind of coordination failure prevents
entrepreneurs from making such a move, and the economy gets trapped in the inferior
equilibrium.
Box 12.2. (key concept) Strategic Complementarities
A strategic complementarity occurs when an action taken by one agent makes
it more profitable for another agent to take a related action. Strategic
complementarities imply that individuals have incentives to do what the others are
doing.
Complementarities belong to the general class of externalities. However, they
are not the same. Positive and negative externalities refer to the case in which individual
actions impact positively or negatively on the welfare of others. This does not
necessarily induce others to take a similar action or a complementary activity. For
instance, when one individual emits pollution, this does not necessarily induce the
others to emit pollution. This is a simple externality. Alternatively, if someone builds a
railroad ending at a beautiful beach, this may induce an entrepreneur to build a hotel.
Building the railroad and building the hotel are complementary actions.
Box 12.3 Paul Krugman and “The evolution of ignorance”
The Nobel Laureate Paul Krugman argued that economists tend to disregard
what they don’t know how to model, and that this may create “blind spots”.
To illustrate the argument, he recalled how European maps of the African
continent evolved from the 15th to the 19th centuries:
“You might have supposed that the process would have been more or less linear:
as European knowledge of the continent advanced, the maps would have shown both
increasing accuracy and increasing levels of detail. But that's not what happened. In the
15th century, maps of Africa were, of course, quite inaccurate about distances,
coastlines, and so on. They did, however, contain quite a lot of information about the
interior, based essentially on second- or third-hand travelers’ reports. Thus the maps
showed Timbuktu, the River Niger, and so forth. Admittedly, they also contained quite
a lot of untrue information, like regions inhabited by men with their mouths in their
stomachs. (…). Over time, the art of mapmaking and the quality of information used to
make maps got steadily better. The coastline of Africa was first explored, then plotted
with growing accuracy (…). On the other hand, the interior emptied out. The weird
mythical creatures were gone, but so were the real cities and rivers. In a way, Europeans
had become more ignorant about Africa than they had been before. It should be obvious
what happened: the improvement in the art of mapmaking raised the standard for what
was considered valid data. (…). Only features of the landscape that had been visited by
290
Murphy et al. (1989). The novelty of the case with an infrastructure is that it does not require the
economy to be closed.
reliable informants equipped with sextants and compasses now qualified. And so (…)
there was an extended period in which improved technique actually led to some loss in
knowledge”. (…)
Krugman argued that something similar happened in economics. The ideas of
complementarity, circular causation and poverty traps were evident in the work of the
founders of Development Economics in the 1950s and 1960s, like Myrdal (1957),
Hirschman (1958), Rosenstein-Rodan (1943), and Nurkse (1953). In the decades after,
however, economics became progressively more orthodox, relying more and
more on formal models. And a problem arose in that nobody knew how to incorporate
internal economies of scale in a general equilibrium framework: if bigger firms face
lower costs, then there should be a tendency for firms to get bigger and bigger until
they capture the whole market, and this is not what we see in reality. This inconsistency with
real life led mainstream economics to abandon increasing returns in formal models,
devoting more attention to the case of perfect competition, because this was the
model economists knew how to build.
It was only when Avinash Dixit and Joseph Stiglitz came along with their
seminal article showing how to incorporate Chamberlin’s monopolistic competition
model in a general equilibrium framework, in 1977, that economies of scale returned to
the top of the research agenda. The main innovation of the Dixit-Stiglitz model is that it
introduces a mechanism offsetting the tendency towards industry concentration: the
taste for variety.
The Dixit-Stiglitz model revolutionized economic theory in a number of
branches, including industrial organization, international trade, economic geography,
economic growth and business cycles. In the development literature, Murphy et al.
(1989) used the model to finally formalize the Big Push argument. Other authors who came
up with similar ideas include Ciccone and Matsuyama (1996), Matsuyama (2006),
Rodriguez-Clare (1996) and Rodrik (1996).
12.3 The extent of the market and the division of labour
So far, we have assumed that the number of intermediate inputs (m) in
the economy is fixed. The implication of this assumption is that monopoly profits are
not dissipated by free entry.
In the remainder of this chapter, let’s assume instead that positive profits in the
intermediate input sector create the incentive for new firms to enter the market,
bringing new varieties. Thus, the “division of labour” will be driven by the size of the
market.
It will also be assumed that condition (12.7) holds and that there is no wage
premium for working in a factory. The implication is that there will be a unique
equilibrium with full industrialization.
Expanding varieties
Consider again the model described by (12.1)-(12.13), but assume that w=1 (no
wage premium) and that condition (12.7) holds (so, you don’t need the technology
12.3). It is also assumed that all entrepreneurs are constrained by a competitive fringe at
the limit price (12.9).
When this is so, profits in each sector j are:
$$\pi_j = \left(1 - \frac{1}{\alpha}\right) x_j - F \tag{12.15}$$
Free entry implies that profits are zero at each moment in time. Imposing this on
(12.5) and solving for x, you obtain the level of output in each industry:
$$x_j = \frac{F}{1 - 1/\alpha} \tag{12.16}$$
Using (12.4) in (12.16), you verify that the total labour use in each variety will
be exactly:
$$N_j = \frac{F}{1 - 1/\alpha} \tag{12.17}$$
These equations state that production in each sector will be exactly at the break-even
level (as described by point Q in Figure 12.1). In contrast to the model with a fixed
number of varieties, in this model the expansion of the size of the market (as captured
by the labour force) does not translate into more production in each sector and lower
unit costs. Instead, production in each sector remains constant (point Q), while the
number of varieties increases.
Thus, in this model (using 12.6):
$$x_j = N_j = \frac{N}{m} \tag{12.18}$$
The (endogenous) number of varieties can be found using (12.18) in (12.17), to
obtain291:
$$m = \frac{N}{F}\left(1 - \frac{1}{\alpha}\right) \tag{12.19}$$
This equation establishes an important link between the division of labour and
the size of the market, as captured by the size of the population: a larger population
incipiently raises the incumbents’ profits, inducing entry. Put the other way around,
the number of intermediate inputs (the division of labour) is determined (limited) by the
extent of the market292.
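The free-entry logic of (12.15)-(12.19) can be checked numerically. The sketch below is only illustrative: `alpha` is the name used here for the factory productivity parameter in (12.15), and the values chosen for `F`, `alpha` and `N` are hypothetical, not taken from the text.

```python
# Illustrative sketch of (12.15)-(12.19); all parameter values are hypothetical.

def break_even_output(F, alpha):
    """Output per variety at which profits (1 - 1/alpha) * x - F are zero (12.16)."""
    if alpha <= 1:
        raise ValueError("alpha must exceed 1 for the margin (1 - 1/alpha) to be positive")
    return F / (1.0 - 1.0 / alpha)


def number_of_varieties(N, F, alpha):
    """Free-entry number of varieties, m = (N / F) * (1 - 1/alpha) (12.19)."""
    return N / break_even_output(F, alpha)


F, alpha = 2.0, 2.0
for N in (100.0, 200.0, 400.0):
    x = break_even_output(F, alpha)
    m = number_of_varieties(N, F, alpha)
    # Output per variety stays at the break-even level; only m responds to N.
    print(f"N={N:5.0f}  x_j={x:.1f}  m={m:.1f}")
```

With these numbers, doubling the labour force doubles the number of varieties while output per variety stays at the break-even level, which is the sense in which the division of labour is limited by the extent of the market.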
Substituting (12.13) in (12.11) and using (12.12), one obtains an expression for
the price of the final good:
291
You may observe that equation (12.19) is analogous to (12.7), except that in this case the
number of varieties is endogenous. This is so because the requirement of zero profits implies that
all factories operate on a unique scale of output, corresponding to point Q in Figure 12.1.
292
You may remember from the discussion in Section 8.4 that a link between horizontal innovations and
the population size is how Schumpeterian growth models remove the scale effect. In fact, you may
interpret the model in Section 12.3 as complementing the model described in Chapter 8. That is, while the
model in Chapter 8 basically relates vertical innovations to the R&D effort, this model offers a
complementary way to understand what drives horizontal innovations.
$$p_Y = m^{-\frac{\theta}{1-\theta}}. \tag{12.20}$$
This condition also holds in the model of Section 12.2, but in this section it gains
a more important dimension, because m is endogenous.
This equation states that an increased availability of intermediate inputs leads to
a lower output price, even though the price of each input j remains constant (equal to
one). This effect captures the pecuniary externality stemming from the division of
labour: when a new variety arises, the productivity of all existing varieties increases, so
the cost of producing one unit of output decreases. The implication is that a larger
population, by inducing an expansion in the number of varieties and thereby a higher
productive efficiency, will enjoy lower output prices and therefore higher real wages
(remember that the nominal wage is the numeraire in this model).
Finally, (12.13) and (12.18) imply that per capita income in this economy is a
positive function of the number of varieties293:
$$y = \frac{Y}{N} = m^{\frac{\theta}{1-\theta}} \tag{12.21}$$
This equation shows the link between the division of labour and per capita
income. So, the larger an economy is, the better. Note that this is just another
incarnation of the weak scale effect described before: a growing population in this
model will produce exogenous growth.
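A small numerical sketch may help to see the weak scale effect at work. It assumes, for illustration only, that per capita income is a power function of the number of varieties, y = m**e, where `e` stands for the positive exponent in (12.21); `alpha`, `F`, the population growth rate `n` and all numbers are hypothetical.

```python
# Illustrative sketch of the weak scale effect; parameter values are hypothetical.

def income_per_capita(N, F=2.0, alpha=2.0, e=0.5):
    m = (N / F) * (1.0 - 1.0 / alpha)  # number of varieties, as in (12.19)
    return m ** e                      # per capita income, y = m**e with e > 0 assumed


n = 0.02    # assumed population growth rate
N0 = 100.0  # assumed initial labour force
path = [income_per_capita(N0 * (1.0 + n) ** t) for t in range(4)]
growth = [path[t + 1] / path[t] - 1.0 for t in range(3)]

for t, g in enumerate(growth):
    # Per capita income grows at the constant rate (1 + n)**e - 1.
    print(f"period {t}: growth of y = {g:.4%}")
```

Because m is proportional to N, per capita income inherits a constant growth rate from population growth, which is why growth in this model is exogenous.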
Box 12.4: (key concept) Pecuniary externalities
A critical feature of the model with increasing returns is that it accounts for the
presence of externalities: investment by each firm expands the market for other
firms. These externalities are, however, transmitted through the price mechanism. For
this reason, they are called pecuniary externalities, and are distinct from the pure
(technological or Marshallian) externalities that we have used in chapters 6, 10 and 11.
The distinctive feature of pecuniary externalities is that they do not alter the
technological relationship between inputs and output (the A) at the firm level. Still, they
also impact on the average costs of each firm, creating the incentives for firms to cluster
together.
An often-cited example of a pecuniary externality is the access to a large pool
of specialized inputs. In the real world, some production processes are highly
sophisticated and require specialized inputs or specific support services. These are not
available everywhere. If an individual firm does not provide enough market to attract
these specialized inputs, the only solution is to locate where the required inputs are
already available. Having a large pool of specialized inputs nearby is a pecuniary
externality for a single firm, because the firm will be able to hire these inputs at lower
costs. Hence, the externality is mediated by the market mechanism. Conversely, there
are incentives for these specialized inputs to move to (or to develop in) areas where
more potential employers are located. This brings “cumulative causation”: specific
inputs go (or develop) where firms are and firms go where specific inputs are.
293
Note that because all profits are zero, income consists only of wages. So per capita output is equal to
real wages.
Some authors have argued that what we normally assume to be technological
externalities may in fact be pecuniary externalities. For instance, it may be that most
“knowledge spillovers” taking place across firms in a given location occur through
inter-firm labour mobility (that is, the leaking knowledge is embodied in workers that
switch between jobs), rather than through demonstration effects or occasional face-to-
face contacts. Thus, what may appear to be a pure knowledge externality may in fact be
mediated by the labour market294.
The new trade theory
Long ago, Adam Smith argued that a main advantage of international trade is
that, by enlarging the size of the market, it allows firms to take advantage of the division
of labour, achieving higher productive efficiency. This theory was first formalized by
Paul Krugman, in his seminal 1979 paper.
To illustrate the argument in terms of the model in this section, suppose that,
instead of one economy, there are two economies, say East and West. These economies
have equal technologies to produce the same final good, but may differ in terms of
population ($N_E$ and $N_W$, respectively). The question is: what happens if the two
economies are initially isolated from each other and then become able to trade
intermediate goods freely?
To analyse this question, let’s assume that labour is immobile between
economies. In autarky, the number of varieties in each region is determined by the
corresponding labour force. From equation (12.19), the number of varieties in the East
and the West will be, respectively, $m_E = N_E(\alpha - 1)/(\alpha F)$ and $m_W = N_W(\alpha - 1)/(\alpha F)$.
Without trade, the larger economy will produce a wider range of varieties than the
smaller one and therefore will enjoy a higher level of per capita output (equation
12.21).
With trade (and assuming away transport costs), the same set of intermediate
inputs will be available in both regions at the same price. This means that a final good
firm operating in an open economy will be able to use a higher number of varieties (and
hence will achieve higher productive efficiency) than in a closed economy.
From (12.21), per capita income in the free trade area will be equal to:
$$y = \frac{Y}{N} = \left(m_W + m_E\right)^{\frac{\theta}{1-\theta}}. \tag{12.22}$$
This is more than each country can achieve in autarky.
In this model, the gains from trade do not arise from differences in technology or
in endowments giving rise to comparative advantages. Trade and gains from trade may
occur even if the two economies are exactly equal. The reason is that they are realized
through the enlargement of the market and the division of labour. Moreover, because
the smaller economy has a lower output per capita in autarky, it is the one that has more
to gain with trade openness.
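The gains from trade in this two-region example can be illustrated with a few lines of code. This is a sketch under stated assumptions: `alpha` names the factory productivity parameter of (12.15), `e` stands for the positive exponent linking per capita income to the number of varieties in (12.21), and the populations and parameter values are hypothetical.

```python
# Illustrative sketch of autarky versus free trade; all numbers are hypothetical.

def varieties(N, F, alpha):
    """Number of varieties supported by a labour force N, as in (12.19)."""
    return (N / F) * (1.0 - 1.0 / alpha)


def income_per_capita(m, e):
    """Per capita income as a power function of available varieties (e > 0 assumed)."""
    return m ** e


F, alpha, e = 2.0, 2.0, 0.5
N_E, N_W = 400.0, 100.0  # East is assumed to be the larger economy

m_E, m_W = varieties(N_E, F, alpha), varieties(N_W, F, alpha)
y_E, y_W = income_per_capita(m_E, e), income_per_capita(m_W, e)

# With free trade in intermediates, final-good firms in both regions
# can use the combined range of varieties m_E + m_W.
y_trade = income_per_capita(m_E + m_W, e)

gain_E = y_trade / y_E - 1.0
gain_W = y_trade / y_W - 1.0
print(f"autarky: y_E={y_E:.2f}, y_W={y_W:.2f}; free trade: y={y_trade:.2f}")
print(f"proportional gain: East {gain_E:.1%}, West {gain_W:.1%}")
```

Both regions gain, and the smaller region gains proportionally more, because it moves from a narrow autarkic range of varieties to the same combined range as its larger partner.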
294
Breschi and Lissoni (2001): “the rationale for co-localization may have less to do with knowledge
spillovers mediated by physical proximity, than with the need to access a pool of skilled workers and to
establish transaction-intensive relationships with suppliers and customers”.
This model with trade also sheds some light on the apparent inconsistency of the
weak scale effect implied by (12.21) with real-world facts: the scale effect suggests
that countries with large populations, like India and China, should enjoy higher levels of
per capita income than countries with smaller populations, like the UK and France.
However, according to equation (12.22), it is not a country’s population that matters, but
rather the size of the market. Because the UK and France are well integrated in the world
economy, they may well enjoy larger markets and higher levels of specialization than
countries like China and India, less integrated in the world economy and with
segmented internal markets295.
Some authors have argued that the Industrial Revolution took place when a
series of reductions in trade costs (between some British regions, first, and then
between Britain and other countries) allowed previously autarkic regions to
be combined, enlarging the market and allowing these regions to expand through the
division of labour296.
12.4 The division of labour and the extent of the market
In industrial countries, firms use roundabout production methods, making use of
many different specialized intermediate inputs, including machinery and producer
services. Many developing countries, however, have not exploited the benefits of the
division of labour. In poor countries, producers tend to rely on traditional technologies
that are more intensive in “raw” labour, with little use of specialized inputs. A question
that naturally arises is: why don’t developing countries produce more intermediate inputs
and adopt more indirect methods of production?
This section will explore two reasons. First, in a closed economy, an insufficient
demand for the final good may prevent intermediate inputs from springing up, which in
turn will deliver low productivity in the final product and low demand, in a vicious
cycle. Second, in an open economy, comparative advantages may dictate a
specialization pattern that is less favourable to the development of intermediate inputs
and the benefits of the division of labour.
Vertical complementarities and circular causation
An explanation for why poor countries exploit the division of labour less than
rich countries follows directly from the theory of circular causation formulated by Allyn
Young (Box 12.5): in poor countries, the low demand for final goods implies that the
economy will produce only a small number of intermediate inputs (the division of
labour is limited by the extent of the market). The lack of local support industries leads
to the adoption of relatively simple (labour intensive) methods of production in
downstream industries. Low productivity in the final goods sector, in turn, implies a low
demand for intermediate inputs (the extent of the market is limited by the division of
labour). Thus, an economy that inherits a narrow range of intermediate inputs may find
295
Matsuyama (1992, p. 323), “...a large country does not necessarily mean a large economy. It may
simply consist of a large number of regional economies”.
296
See, for instance, Ventura (2005).
itself trapped in a lower stage of economic development, with limited incentives for
new varieties to spring up.
To see this formally, consider a model similar to that presented in Section 12.3,
with two novelties. First, assume that both raw labour and intermediate inputs can be
used in final good production. That is, the production of the final good, instead of being given
by (12.1), is given by:
$$Z = F(X, N_Z), \tag{12.23}$$
where Z refers to the final good, X is defined as in (12.3) and $N_Z$ is the amount of raw
labour used in the production of the final good (note that in this version of the model,
the labour market equilibrium condition is given by $N = N_Z + m N_j$). Second, assume
that the elasticity of substitution between raw labour and X is greater than one.
In this case, it can be proved that there is a critical number of varieties (m) below
which it pays more for a firm to rely on raw labour than on intermediate inputs297. The
reason is that, with a low number of varieties, production efficiency will be low and so
will be real wages. The price of X, in turn, will be high (remember equation 12.20).
Hence, firms will optimally use raw labour intensively. Thus, if the economy inherits
fewer varieties than the critical level, it will tend to use production techniques more
intensive in raw labour.
If, however, the economy achieved a critical mass in support industries, a
virtuous cycle would take place: as the number of varieties increased, the relative price
of X would decline and real wages would increase, inducing producers to substitute
intermediate inputs for raw labour, adopting more indirect methods of production. This
movement increases the size of the market for new varieties, and the economy achieves
a higher division of labour and higher living standards.
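The substitution mechanism behind this argument can be sketched numerically. The example below is not the model of the text: it assumes, for illustration, that F in (12.23) is a CES function with elasticity of substitution `eps` greater than one and distribution weight `b`, that the wage is the numeraire, and that the price of X falls with the number of varieties as m**(-e) with `e` positive. All names and values are hypothetical.

```python
# Illustrative sketch: cost share of the composite input X under an assumed CES technology.

def intermediate_cost_share(m, e=0.5, eps=3.0, b=0.5, w=1.0):
    """Share of X in unit cost when p_X = m**(-e) and the technology is CES."""
    p_X = m ** (-e)                            # price of X falls as varieties expand
    weight_X = b * p_X ** (1.0 - eps)          # CES cost-share term for X
    weight_N = (1.0 - b) * w ** (1.0 - eps)    # CES cost-share term for raw labour
    return weight_X / (weight_X + weight_N)


for m in (1, 4, 16, 64):
    share = intermediate_cost_share(m)
    print(f"m={m:3d}  cost share of intermediate inputs = {share:.3f}")
```

With an elasticity of substitution above one, the cost share of intermediate inputs rises as the number of varieties grows, so the demand for intermediates expands with m. The sketch shows only this substitution effect; the multiplicity of equilibria discussed in the text also requires the feedback from that demand to the entry of new varieties.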
The model explains why in developing countries the lack of local support
industries induces the use of relatively simple production methods in downstream
industries and why such a situation is self-sustaining. Like the Big-Push model, this model
exhibits multiple equilibria: a good equilibrium with a wide range of intermediate
inputs available and high productivity of labour and wages, and a bad equilibrium with
low wages and production methods more intensive in raw labour. In this model,
however, pecuniary externalities arise through factor substitution, not by income effects,
as in the Big-Push model.
Circular causation in this model also differs from that of the Big Push model in
that complementarities operate between upstream and downstream industries. This is
called “vertical complementarity”. The model of Section 12.2 is one of “horizontal
complementarities”, because introducing modern methods in one industry enhances the
profitability of investment in other industries operating at the same level.
Box 12.5. The division of labour, from Adam Smith to Allyn Young
In his masterpiece, Adam Smith contended that the move from traditional
agriculture to manufactures entails an efficiency gain that arises through the splitting of
297
Ciccone and Matsuyama (1996).
productive operations into smaller and more specialized operations. Adam Smith called
this process the “division of labour”.
The division of labour improves productivity for several reasons. On the one hand,
specialization saves the time it takes a worker to switch from one task to another and
allows workers to practice and perfect a particular skill. On the other hand, the division
of labour stimulates innovation, as workers engaged in specialized routine operations
come to see “better ways of accomplishing the same results”298.
In his reasoning, Smith was mostly concerned with the division of labour across
different tasks within a firm. Allyn Young, in his seminal 1928 article, extended the
idea. Young contended that an important mechanism through which the division of
labour materialises is the introduction of new intermediate inputs. That is,
splitting production processes into a succession of simpler tasks typically involves the
use of machinery, whose provision leads to the specialization of industries. For instance,
producing a chair can be done more efficiently if a blacksmith provides intermediate
inputs like hammers and nails. These inputs in turn can be made more efficiently with
machinery designed specifically to produce them and so on.
Setting-up a factory to produce the intermediate input may not be profitable
however, in the presence of fixed costs299. Hence, the proposition: the larger the extent
of the market (in upstream industries), the greater the scope for the division of labour
(the springing up of downstream industries). This is the type of effect we have analysed
so far.
Another novelty in the Young contribution is that he added a two-way causality:
“the division of labour depends upon the extent of the market, but the extent of the
market also depends on the division of labour” (p. 539).
The argument runs like this: when the market is too narrow, it doesn’t pay for a
producer of intermediate inputs (e.g. nails) to invest, because he will not sell enough to
recover the fixed cost. Therefore, insufficient demand will prevent a network of
supporting industries from springing up. But a narrow range of specialized inputs will also
prevent the final goods sector (e.g., chairs) from expanding through the division of
labour. That is, the size of the market for upstream industries limits the development of
downstream industries, and then the absence of downstream industries limits the extent
of the market for upstream industries, in a vicious cycle. This is the argument analysed
in section 12.4.
Box 12.6. Backward and forward linkages
The Big Push, as initially advocated by Rosenstein-Rodan, basically consisted
of a broadly based investment programme. Albert Hirschman, at the time a professor at
Yale, refined the idea, arguing that government intervention to rescue an economy out
298
Smith (1776), Book 1, Ch 1: “Men are much more likely to discover easier and readier methods of
attaining any object when the whole attention of their minds is directed towards that single object than
when it is dissipated among a great variety of things”.
299
Young (1928), p. 530: “it would be wasteful to make a hammer to drive a single nail”.
of a poverty trap should take into account the complex structure of vertical relations in
an economy300.
Hirschman distinguished backward linkages (which occur when expanding the
production of one good raises the demand for an upstream industry, enabling it to
break even) and forward linkages (when expanding production in one sector reduces
the costs of potential downstream users of its products, pushing them over the
threshold). The impact on existing industries induced by a newly established industry
would then be given by the sum of its backward linkage effects with the forward linkage
effects (the total linkage effect). Note that the story of backward linkages and forward
linkages is not different from the circular causation mechanism of Allyn Young: input
producers do not invest where there is no downstream demand for them and assembly
does not take place where there is no upstream supply. So, there is scope for a Big Push.
Hirschman contended that, in a heterogeneous world, instead of engaging in
broad-based investment programs, governments should take into account the complex
relationship of backward linkages and forward linkages across industries. Thus, the
government should first select a small number of key sectors in the economy (that is,
those with strong linkages to other sectors) and then tackle other sectors to correct the
imbalances generated by the previous investments, and so on. According to the author, a
policy of “engineered scarcities and imbalances”, creating shortages and tensions to
which the market is expected to respond, should provide a sound basis for promoting
growth.
Traps and trade
Another reason why a country may find itself locked in a low productivity
trap is an unfavourable specialization pattern. To see this in terms of the
model with pecuniary externalities, let’s turn again to the open economy case.
Section 12.3 examined the implications of trade openness in a world with a
unique final good and free trade among intermediate inputs. This section considers the
opposite case, in which intermediate inputs are non-tradable and trade occurs between
two different final goods.
In the real world, many intermediate inputs that are critical for the development
of downstream tradable goods have low international mobility301. For instance, many
producer services like banking, auditing, accounting, advertising, engineering, legal
support, wholesale services, transport and communication services, equipment repair
and maintenance are mostly non-tradable or have to be supplied near the final
producer. The likelihood that these services develop in a given economy depends,
however, on the existence of downstream industries, which, in turn, may spring up or not,
depending on the country’s specialization pattern. This raises the possibility of circular
causation between the specialization pattern and the development of upstream
industries.
300
Hirschman (1958).
301
Porter (1992) argued that the domestic presence of suppliers is an important determinant of the
comparative advantage of nations. The following argument draws on Rodriguez-Clare (1996), and Rodrik
(1996).
To examine this argument, assume that, instead of one, there are two final
goods, Y and Z produced according to (12.2) and (12.23) respectively. The difference
between Y and Z is that the production of Y does not use raw labour, while to produce
Z some raw labour is required.
Of course, if a central planner could decide the country specialization pattern, he
would prefer the country to export Y and to import Z. The reason is that Y uses
intermediate goods intensively. Hence, a specialization in Y implies a larger demand for
intermediate inputs, leading to a wider availability of intermediate inputs in the
domestic economy and hence more production efficiency through the division of labour
and higher wages than when the economy is specialized in Z.
Changing the specialization pattern is not, however, a simple stroke-of-the-pen
policy: in this model, the specialization pattern and comparative advantages are
mutually reinforcing. Remember that, as the number of varieties m increases, the price of
X declines (equation 12.20). Since Y uses X more intensively than Z, when m increases
the cost of producing Y decreases relatively more than the cost of producing Z. Hence, a
country that starts out specialized in Y will enjoy a relatively lower cost of producing
Y. A country that starts out producing Z, in turn, will support a smaller number of
varieties and is not likely to develop comparative advantage in Y. Such a country will be
locked in a low-level equilibrium trap.
This does not mean that trade openness is bad for the country with comparative
advantage in Z: in static terms, the economy may be better off in the low-level
equilibrium trap with trade than without trade at all. There is however a dynamic
argument for restricting international trade: by forcing a country to produce both goods,
a temporary import restriction could induce an expansion in the range of available non-
tradable intermediate inputs: eventually, if a critical mass of these intermediate inputs
was achieved that way, the economy could escape the trap and become specialized in Y,
after openness. This reasoning is no more than another incarnation of the infant-industry
argument302 303.
Multinationals and linkages
Most governments in the world spend large amounts of resources trying to
attract FDI by multinational firms. One of the reasons to do so is the belief that
multinationals may create important linkages in the host economy. These linkages are
easy to describe in terms of the model discussed in this section. In that model, there is
no trade in intermediate inputs and there are two tradable final goods, which differ in
terms of their intermediate input requirements. Depending on initial conditions, a
country may be stuck in a low-level equilibrium trap, exporting the good that is less
intensive in intermediate inputs and hence taking little advantage of the division of
302
A model exploring this avenue is Rodrik (1996). Remember that a similar argument was already made
in the context of the learning by doing model, with technological externalities.
303
Trindade (2005) extends the analysis allowing for trade in intermediate inputs too. This reinforces the
potential vertical complementarities. The author examined a case where trade openness allows the poor
country (South) to become more competitive and specialize in intermediate inputs production. If this
allows the South to produce a critical mass of intermediate inputs, then final good assemblers will find it
profitable to move to the South, too.
labour. In such a context, multinationals may have a role in tilting the economy from
one equilibrium to the other304.
Box 12.7. The Big push revival
During the second half of the twentieth century, the Big Push argument lost a lot
of its initial popularity. The view that governments may have a role in shaping the
production structure of economies by electing and promoting particular industries was
much discredited.
A basic argument is that governments lack the necessary knowledge about the
economy to design an appropriate balance between investments in different sectors. On
the other hand, planners and bureaucrats may lack the incentives to implement the
policy without making things even worse. A reason is that, once a government starts
providing support to particular industries, the incentives structure changes: it pays for
entrepreneurs to spend resources trying to influence the political decisions.
In the 1990s, the disappointment with state-led industrialization and the collapse
of centrally planned economies created the sense that huge government interventions
are more likely to lead to waste of resources than to sustained growth. Thus, in light of
the Washington Consensus, governments should disengage from policies that target
particular sectors, and provide instead broad-based support to all activities in a sector
neutral way. This includes the provision of public goods, sound money, openness to
trade, the rule of law and protection of property rights.
In the last decade, however, the Big-Push argument was brought back to the
development agenda. Most notably, the 2005 Millennium Development Project Report
(United Nations, 2005) says:
"The key to escape the poverty trap is to raise the economy’s capital stock to the
point where the downward spiral ends and self-sustaining economic growth takes place.
This requires a big-push of basic investments between now and 2015 in public
administration, human capital (nutrition, health, education), and key infrastructure
(roads, electricity, ports, water and sanitation, accessible land for affordable housing,
environmental management)”(p.18).
Supporters of the Big Push contend that this argument fits many
historical episodes well. These include the industrialization of continental Europe in the
nineteenth century (e.g., France, Germany), and the recent Southeast Asian growth
304
This question was analysed by Rodriguez-Clare (1996a). The key assumption in their model is that
multinationals have the possibility of establishing plants in the poor country (thus benefiting from the
lower wages there) while using intermediate inputs produced at home (thus overcoming the insufficient
supply of upstream industries in the host country). For instance, accountancy and engineering services
may be supplied by the multinational’ headquarters, while production takes place in a foreign country.
Communications costs between the headquarter and the factory imply however some efficiency loss.
Hence, there will be some incentive for intermediate inputs to start springing in the host country. If these
positive linkages pushed the number of varieties in the host country above a threshold, this could trigger a
specialization in the more complex final product, shifting the economy from the bad equilibrium to the
good equilibrium. 304 In related seminal paper, Markusen and Venables (1999) add the effect of
multinationals on domestic competition, rising the possibility of FDI having a negative effect.
miracles (South Korea, Taiwan), that relied heavily on government intervention to catch
up. Leading advocates of this view include Dani Rodrik and Jeffrey Sachs305.
The Big Push idea remains, however, very controversial. Many economists
remain suspicious about the ability of governments to implement successful industrial
policies. William Easterly, for instance, argued recently that: “Implementing the plan
requires the design of proper incentives which may not be an easy task in corrupt
bureaucracies”. (…) “the recent stagnation of the poorest countries appears to have
more to do with awful government than with a poverty trap”306.
Box 12.8. Doomed to choose
Advocates of the Washington Consensus claim that governments should abstain
from selecting particular industries and should instead focus on broad-based,
sector-neutral policies, such as the rule of law, macroeconomic stability and the provision
of general infrastructure. To this view, Ricardo Hausmann and Dani Rodrik responded that
industrial policy is unavoidable: in the real world, it is impossible for governments to be
sector neutral307.
The reasoning is as follows. The adoption of many new technologies depends on
the provision of critical complementary inputs by the government. This includes
specific pieces of regulation and specific infrastructures that serve a narrow range of
activities only. For instance, accreditations, safety rules, pollution restrictions,
regulation to clarify roles and responsibilities, can be very industry specific. Since some
of these inputs are hard to design, providing them involves significant costs for the
government. And yet, absence of these critical ingredients will deter investors, either
because private returns without these complementary investments are too small or
simply because the risks of an adverse regulation in the future are too high.
A problem arises in that government resources are limited. Even though
governments have ministries of agriculture, industry and energy, and specialized
agencies to regulate food safety, financial markets, professional accreditations, and so
on, they have neither the resources nor the technical capacity to fix all the standards, to
generate all the regulatory pieces and to address all the infrastructures needed to
accommodate any potential new activity. Hence, Hausmann and Rodrik contend that
public policy cannot be sector neutral: by deciding to provide pieces of regulation that
are specific to some activities and useless to others, governments will end up benefiting
some economic activities more than others. Governments are “doomed to choose”, they
conclude.
305
Rodrik (2005), Sachs et al. (2004), Sachs (2005).
306
Easterly (2006).
307
Hausmann and Rodrik (2006).
12.5. Centripetal and centrifugal forces
International factor mobility
It has long been recognized that factor mobility and trade may act as substitutes
for one another. The same is true in the presence of economies of scale. The difference
is that, with factor mobility, the economies of scale are realized through a process of
agglomeration.
To see this, let’s return to the trade model introduced in Section 12.3. Assume
however that transport costs are prohibitive, so there is no international trade. Labour,
however, is free to migrate across economies.
If the two economies were of equal size, then, with all the rest equal, real wages
would be equal in the two regions. Hence there would be no incentive for labour to
migrate.
Now assume that one of the regions experiences a small increase in the number
of its inhabitants (for instance, $N_W > N_E$). In that case, the larger region will achieve a
larger variety of locally produced intermediate inputs (eq. 12.19). This, in turn,
translates into higher productivity and higher real wages, by lowering the price of the
final good (equations 12.20 and 12.21). Thus, there will be incentives for workers to
migrate from the smaller region to the larger region. With free labour movements, a
process of cumulative causation takes place: as labour moves from the smaller region to
the larger region, there is an expansion in the number of varieties in the larger region,
a contraction in the number of varieties in the smaller region and a further widening
of the real wage gap. The larger region will become richer and richer in a virtuous cycle
and the smaller region will shrink and get poorer, in a vicious cycle. In the limit, the
whole population will end up in the larger region and the world per capita income will
be equal to (12.22).
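The cumulative process described above can be illustrated with a minimal numerical
sketch. It is not the model of Section 12.3: equations (12.19) to (12.21) are replaced
here by an assumed reduced form in which the real wage is simply an increasing function
of the local labour force (exponent gamma), and the migration rule, its speed parameter
and all function names are illustrative assumptions.

```python
def real_wage(n, gamma=0.5):
    """Assumed reduced form: the real wage rises with the local labour force."""
    return n ** gamma


def simulate_migration(n_w, n_e, speed=0.2, periods=200, gamma=0.5):
    """Each period, workers move towards the region paying the higher real wage.

    The flow is proportional to the wage gap and to the population of the
    region that workers are leaving (an illustrative migration rule).
    """
    path = [(n_w, n_e)]
    for _ in range(periods):
        gap = real_wage(n_w, gamma) - real_wage(n_e, gamma)
        if gap > 0:
            # Region W pays more: workers leave E, at most all of them.
            flow = min(n_e, speed * gap * n_e)
        else:
            # Region E pays more (or wages are equal): workers leave W.
            flow = -min(n_w, speed * (-gap) * n_w)
        n_w, n_e = n_w + flow, n_e - flow
        path.append((n_w, n_e))
    return path


if __name__ == "__main__":
    final_w, final_e = simulate_migration(1.01, 1.0)[-1]
    print(f"Region W: {final_w:.4f}, region E: {final_e:.4f}")
```

Under these assumptions, a one percent initial advantage is enough for region W to
absorb almost the whole population, whereas two regions of exactly equal size stay put.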
Note that the agglomeration process may be triggered solely by an initial
difference in market size. When the two regions have the same productivity levels, it
does not matter which region ends up with the whole population. But if the two regions
differed in terms of technology (say region W has higher fixed and variable costs), it
could be that the whole population moved to that economy, just because it started out with a
larger population size. In that case, the migration process would have delivered an
undesirable outcome.
Also note that the welfare effects of labour mobility are in sharp contrast to
the case with free trade and no factor mobility: with free trade, workers in both regions
gain from trade openness and those in the smaller region gain more. With factor
mobility and no trade, the workers of the bigger region gain and the workers that remain
in the smaller region lose.
The core-periphery model
So far we have made extreme assumptions regarding trade openness and
factor mobility. A more complicated story arises when one looks at intermediate cases,
where some international trade and some labour mobility are allowed.
A particularly interesting case occurs when only some labour is free to migrate
across regions and international trade is subject to transport costs308. Suppose, for
instance, that some workers are tied to location-specific activities, such as agriculture or
mining. In that case, location decisions for firms have to trade off the benefits of staying
in the larger region (agglomeration) against the cost of being too far from peripheral
customers. This problem was first investigated by the Nobel Laureate Paul Krugman
(1991) and resulted in a new literature that became known as the New Economic
Geography309. This section briefly describes the original idea, which became known as
the “core-periphery model”.
Assume that there are two regions with identical preferences and two kinds of
goods, agricultural and manufactured. The agricultural good is homogeneous and
produced under constant returns to scale (CRS) and perfect competition. Manufactures exist in a large number of
varieties and are produced under increasing returns and monopolistic competition.
There are also two types of workers: “blue collars”, who are free to move to the region
offering higher wages, and “farmers”, who are tied to specific locations. Both types of
workers demand agricultural goods and manufactured goods and have equal tastes.
Agricultural goods are traded costlessly, whereas manufactured goods are subject to
transport costs. The geographical distribution of farmers is taken as fixed, with half
of the farmers in each location. The problem is to find out how blue collars, and hence
manufacturing production, will be allocated between the two regions.
From what we have learned so far, it should be more or less intuitive how the
core-periphery model works. In this model, there are two “centripetal” forces and one
“centrifugal” force, which are realized through migration of blue collars. The centripetal
forces are the desire of firms to locate close to the larger market and the desire of blue
collars to have access to a larger number of consumer varieties at low transport costs.
The centrifugal force is the incentive of firms to move out to serve the peripheral
demand. Depending on the parameters of the model, the spatial equilibrium may be
more or less concentrated. Intuitively, the stronger the agglomeration effects, the more
concentrated will be the economic activities. In this model, transport costs act as a
dispersion force, counteracting the agglomeration effect of economies of scale.
Because the model is non-linear, the interplay between centripetal and
centrifugal forces is very complicated and may lead to different equilibria, depending on
the parameter values and on initial conditions. Krugman (1991) calibrated the model
with specific functional forms and ran some simulations, reaching the following
conclusions. First, when transport costs are very high, there is little trade in
manufactures among regions. In that case, the agglomeration advantages are outweighed
by the need to serve the peripheral markets and the economy converges to an
equilibrium where manufacturing is equally split between the two regions. Second,
when transport costs are very low, there are no significant costs in serving the peripheral
demand, so the agglomeration forces dominate. In that case, if one of the regions starts
out with a larger scale it will be a more attractive location for blue collars to work.
Thus, a process of agglomeration will take place until only agriculture workers are left
in the periphery310. Third, at intermediate transport costs, different equilibria may
emerge, depending on the initial conditions.
308
When both trade and labour mobility are allowed without frictions, the proposition that workers in
both regions gain and those in the smaller region gain more is recovered. In that case, however, it is not
possible to determine location: any geographical distribution of production is an equilibrium. For an
integrated model with international trade, factor mobility and economic growth, see Ventura (2005), pp.
1427-1442.
309
The starting point of this theory is the final section of Krugman’s (1979) famous article, where he
argues that patterns of labour migration can be analysed within the same framework. One year later,
Krugman (1980) extended the model so as to account for the role of transport costs. In that paper, he
showed that, with positive transport costs, there is an incentive for production to cluster close to the
largest markets, because this allows economies of scale to be realized with minimum transport costs. But
it was his seminal 1991 article that launched the New Economic Geography.
An implication of this model is that a decline in transport costs may lead to
divergence: if, starting from a situation where economic activity is equally split
across regions, transport costs fall below a critical level, then it will pay for any single
worker to move from one region to the other, so that the latter becomes the larger region.
Then, a cumulative process takes place leading to a core-periphery spatial structure.
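The three transport-cost regimes of the core-periphery model, and the divergence
triggered by falling transport costs, can be mimicked with a deliberately stylized
sketch. The wage-gap function below is a made-up reduced form, not Krugman's (1991)
equations: it is only built so that the symmetric allocation is stable when the transport
cost exceeds a hypothetical break_point, and full concentration is sustainable when the
transport cost is below break_point + curvature / 4. All names and parameter values are
assumptions chosen for illustration.

```python
def wage_gap(share, transport_cost, break_point=1.0, curvature=4.0):
    """Assumed real wage gap (region 1 minus region 2) for blue collars.

    `share` is the fraction of blue collars living in region 1.
    """
    x = share - 0.5
    return (break_point - transport_cost) * x + curvature * x ** 3


def long_run_share(initial_share, transport_cost, speed=0.5, steps=5000):
    """Iterate a simple migration rule: blue collars drift towards higher wages."""
    share = initial_share
    for _ in range(steps):
        share += speed * share * (1.0 - share) * wage_gap(share, transport_cost)
        share = min(1.0, max(0.0, share))  # keep the share inside [0, 1]
    return share


if __name__ == "__main__":
    for cost in (3.0, 1.5, 0.5):  # high, intermediate and low transport costs
        for start in (0.6, 0.9):
            end = long_run_share(start, cost)
            print(f"transport cost {cost}: share {start} -> {end:.3f}")
```

In this toy version, high transport costs send any starting allocation back to an even
split, low transport costs turn any small initial advantage into full concentration, and
intermediate costs make the outcome depend on the initial allocation.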
The Krugman-Venables theory
The core-periphery model was extended in different directions. An important
case arises when labour is immobile across regions311. This setup looks more
appropriate for discussing centripetal and centrifugal forces at the global level nowadays:
today most labour mobility takes place within countries, with labour mobility between
countries playing only a minor role.
A novelty in the Krugman and Venables’ specification is that manufactured
varieties can be either used as intermediate inputs to produce other varieties or
consumed as final goods. So the model accounts for backward and forward linkages
within the manufactures sector. Since in that model labour cannot migrate
internationally, agglomeration effects leading to a rise in the demand for labour in the
centre translate into higher wages there. This, in turn, acts as a centrifugal force: as
wages increase in the centre, there is an incentive for manufactures production to move
towards the periphery. The functioning of the model is rather intuitive: when
transportation costs are prohibitive, each region will be basically self-sufficient and the
economic activity will be equally split across the two regions. As transportation costs
fall, there is a tendency for manufactures to cluster in the centre, taking advantage of
the better access to markets and manufactured inputs (linkages) and satisfying the
peripheral demand through exports. Thus, a process of agglomeration takes place, with
one region becoming industrialized and the other region relegated to primary
production. As transport costs continue to fall, however, the agglomeration advantages
vanish (varieties can easily be imported from abroad). In that case, some manufactures
production will eventually move to the periphery (drawing workers from agriculture) so
as to benefit from the lower wages there. Thus, when transport costs fall below a critical
level, a symmetric equilibrium emerges again, with half of manufactures located in each
region.
The model with immobile labour offers an interpretation for the Great
Divergence, as well as for its reversal starting in the last decades of the twentieth
century: over the last centuries, there has been a steady decline in transport costs.
According to the model, as transport costs declined, there should initially have been a tendency
for manufacturing production to cluster in the region that was initially more developed
(the “North”). As a result, a significant wage (income) gap emerged between North and
South. Later, as transport costs continued to fall, the agglomeration advantages lost
importance relative to the centrifugal effect of labour costs. According to this
interpretation, the world should now be engaged in a process of convergence, with
increasing international trade in manufactures and reallocation of manufacturing
production from countries with high labour costs to countries with low labour costs.
310
Of course, if the two regions initially have the same size, blue-collar wages will be equal and there
will be no incentive to migrate. This will be, however, an unstable equilibrium.
311
Krugman and Venables (1995), Venables (1996).
12.6. Geography and economic development
Searching for the initial conditions
Throughout this chapter we have examined different models in which pecuniary
externalities arising from economies of scale led to virtuous and vicious cycles. We
learned that, with economies of scale, advantages and disadvantages tend to be
self-sustaining.
Thus, if by historical accident, a country is blessed with a small “initial
advantage”, this may trigger a process of cumulative causation, through which the
initial advantage is magnified over time. A less lucky country, on the contrary, may
become poorer and poorer, with its resources being attracted to the richer country.
Economies of scale are inherently linked to divergence.
A theory of inequality purely based on cumulative causation looks however
rather incomplete. To say that a country is poor because it started out poor and that a
country is rich because it started out rich is an unsatisfactory explanation. At a deeper
level, one may want to understand why some countries started out rich while others
started out poor. So, a critical question is to understand where initial conditions came
from.
Some authors argue that Geography played a fundamental role in shaping the
initial conditions. According to this view, physical factors, such as climate, availability
of natural resources, access to navigable waters and endemic diseases played a critical
historical role in the very beginning. The geography hypothesis contends that the
different incidence of geographical factors materialized into different incentives to
produce and invest, triggering a process of cumulative causation and increasing
economic disparities312.
The Geography hypothesis is backed by empirical facts. If one looks at the
world map, it is easy to verify that the most developed nations are geographically
concentrated and tend to be located in temperate areas, while the poorest countries are
located in the tropics. Figure 12.3 illustrates this. The figure plots per capita
GDP for 151 countries as of 1988 against their distance to the equator. If initial
conditions were a matter of “pure chance”, then per capita GDP should be randomly
distributed across space. In that case, there would be no systematic relationship
between per capita GDP and distance to the equator. The figure reveals, however, that
countries located close to the equator tend to be poorer than countries
located at high latitudes. This suggests a role for Geography in explaining economic
development today313.
312
Galor (2005, p. ): “Variations in the economic performance across countries and regions (e.g., earlier
industrialization in England than in China) reflect initial differences in geographical factors and historical
accidents and their manifestation in variations in institutional, demographic, and cultural factors, trade
patterns, colonial status, and public policy”.
Figure 12.3: Per capita GDP and distance to equator
[Scatter plot. Vertical axis: per capita GDP (logs). Horizontal axis: latitude (absolute value). Fitted line with R² = 0.43.]
Data source: Hall and Jones (1999).
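For readers who wish to see how a statistic such as the R² reported in Figure 12.3 is
obtained, the following sketch fits a least-squares line of log per capita GDP on absolute
latitude. The data points are made up for illustration only (they are not the Hall and
Jones data), and the function name is an assumption. Under “pure chance” the fitted
slope and the R² would both be close to zero.

```python
def ols_r_squared(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, R squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With one regressor, R squared equals the squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical observations, NOT the data behind Figure 12.3:
# absolute latitude (degrees) and log of per capita GDP.
latitude = [2, 8, 15, 23, 30, 38, 45, 52, 60]
log_gdp = [7.1, 6.8, 7.9, 7.4, 8.6, 8.1, 9.4, 9.0, 9.8]

slope, intercept, r_squared = ols_r_squared(latitude, log_gdp)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R2 = {r_squared:.2f}")
```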
Geography and economic incentives
There are different mechanisms through which geography can influence
productivity and human choices.
The most basic mechanism is agro-climatic conditions: a country with low
rainfall, poor conditions for irrigation and nutrient-poor soil will obviously face
more difficulties in feeding a large population than a country blessed with extensive
arable areas and fertile land. Along these lines, it has been argued that the fundamental
reason why by the 15th century the region of Eurasia was technologically more advanced
than the other regions in the world was that this region enjoyed a favourable
selection of native plants and animal species that could easily be domesticated to
produce high yields314. This advantage translated into storable food surpluses, which in
turn allowed the expansion of trade and the division of labour, triggering the
development of different skills, technology and the emergence of organized, hierarchical
and politically structured societies.
313
There is extensive empirical evidence pointing to the significant role of geographical
variables in explaining the cross-country variation of per capita incomes. These variables include the share of the
country located in the tropics, the incidence of malaria, and the location of a country relative to the sea or to
navigable rivers. A seminal empirical contribution in this area is Gallup et al. (1998).
314
Diamond (1998).
Second, Geography may influence economic performance through high
transport costs. Landlocked economies, small islands, economies surrounded by
mountains or at long distances from major world markets face higher transport costs
than economies in the centre315. High transport costs are like tariffs on international
trade, reducing the extent of the market: a region facing high transport costs will benefit
less from division of labour and technological diffusion than a region highly engaged in
foreign trade.
A third factor is disease. Diseases reduce the availability and the quality of the
main asset of the poor, which is working time. In addition, by reducing
individuals’ life expectancy, a high incidence of disease implies a shorter payback
period for investments in human capital, so people will respond optimally by investing less
in education. Moreover, in cases of extreme poverty, high mortality rates caused by
disease may induce higher fertility rates, delaying the demographic transition.
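The payback argument can be made concrete with a small present-value calculation. The
setup is an assumption made for illustration: education has an upfront cost and yields a
constant annual wage premium over the remaining working life, discounted at a constant
rate. The function name and the numbers are hypothetical.

```python
def education_npv(cost, premium, years, rate=0.05):
    """Net present value of schooling under the assumptions stated above.

    cost:    upfront cost of education
    premium: constant annual wage gain from being educated
    years:   number of working years over which the premium is earned
    rate:    annual discount rate
    """
    # Present value of receiving 1 per year for `years` years.
    annuity = (1.0 - (1.0 + rate) ** (-years)) / rate
    return premium * annuity - cost


if __name__ == "__main__":
    for horizon in (40, 12):  # long versus short expected working life
        npv = education_npv(cost=10.0, premium=1.0, years=horizon)
        print(f"working life of {horizon} years: NPV = {npv:.2f}")
```

With these illustrative numbers, the investment pays off over a 40-year working life but
not over a 12-year one, which is the sense in which a heavier disease burden discourages
education.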
Although one may argue that economics drives the pattern of disease, many
diseases are determined by geographical factors, such as temperature, rainfall and soil
quality. In particular, malaria, which is endemic in the tropics and cannot survive
elsewhere, has been referred to as a major factor explaining why the world’s poorest
countries are located in tropical areas.
Thus, according to the Geography hypothesis, exogenous factors related to a
country’s specific location play a central role in determining its growth potential. Since
geographical factors are invariant over time, this theory embodies a large scope for
determinism.
Long lasting effects
Geographical factors are invariant over time, but their importance is not. In fact,
the sources of geographical advantage in the past are not necessarily the same as those of
today.
For instance, at the time agriculture was invented, availability of arable land was
the critical ingredient. Since at that time transport and communications were too
costly to support interregional trade, geographical advantage came mainly from location
close to highly fertile areas, such as those around the Tigris and Euphrates. With the
progress in transportation, however, the nature of geographical advantage changed
dramatically: location advantage became related to proximity to coastal areas, such as
the Mediterranean and the North Atlantic. Centuries later, the industrial revolution
marked a move from geography-sensitive farming to geography-insensitive
manufacturing. Still, at the outset of the industrial revolution, proximity to key inputs
such as coal, or to transport hubs, such as harbours, made a key difference.
Nowadays, railroads, automobiles, air transport and progress in medicine are
making location much less important than before. However, to the extent that past
geographical advantages materialized into different initial conditions, which in turn
triggered processes of cumulative causation, it is natural to observe today a heavy role
of geography in explaining the cross-country differences in per capita incomes. Today
most advanced nations have a wide agricultural base, even though one can now set up
a modern society without the need to step through an agricultural stage. By the same
token, most of today’s great cities are ports, even though the importance of ports is now
mitigated by the existence of highways, railroads and airports.
315
“The great rivers in Africa are too great a distance from one another to give occasion to any
considerable inland navigation” [Adam Smith].
The long lasting effects of geographical factors help explain the data in Figure
12.3: Geography still plays an important role in explaining the current location of
economic activities, even though the underlying initial advantages of geography are no
longer that important316.
Box 12.9. The tragedy of Moriori
A nice essay on the role of geography in economic development is from Jared
Diamond, in his famous book Guns, Germs and Steel. The author contends that
availability of suitable conditions for agriculture was the main determinant of the
asymmetric development of the different regions in the world until the 15th century.
To motivate this idea, the author starts with a historical example: the spread
of the ancestral Polynesians through the Pacific, 3,200 years ago. In that odyssey, the
ancestral Polynesians encountered thousands of islands differing greatly in respect to
their area, isolation, elevation, climate, productivity and geological and biological
resources. Within a few millennia, that single ancestral Polynesian society had spawned
on those islands a range of diverse societies, from hunter-gatherer tribes to
proto-empires. One of these groups colonized New Zealand around A.D. 1000, to become the
Maori people. A dissident group of Maoris colonized the Chatham Islands, 500 miles east
of New Zealand, to become the Moriori.
Although the ancestors of the two groups shared essentially the same culture,
language, technology and set of domesticated plants and animals, in the following centuries
they evolved in opposite directions. Those that occupied the northern island of New
Zealand found suitable conditions for agriculture. There, they developed new
agricultural techniques and invented new tools to grow their crops. The food surpluses
allowed the population to expand and the society to exploit the benefits of specialization,
with the emergence of craft specialists, chiefs, soldiers and political leaders.
Those that occupied the Chatham Islands found a climate that was too cold for the
Maori tropical crops to grow. Hence, the Moriori had no alternative but to revert to
being hunter-gatherers. As hunter-gatherers, they did not produce crop surpluses for
redistribution and storage. Hence, they could not support and feed armies, bureaucrats
and political organization. Hunter-gatherer societies typically organize themselves in
small groups with a primitive political structure and do not need to develop
sophisticated technologies. Moreover, because the Chatham Islands are relatively small,
they could support at most a population of 2000 hunter-gatherers. The result was a small
population with simple technologies lacking leadership and organization.
In conclusion, the Moriori and Maori societies developed from the same
ancestral society but along very different lines. The Moriori reverted to being
hunter-gatherers. The North Island Maori turned to more intensive farming and developed a
complex political organization. These two societies lost awareness of each other’s
existence and did not come into contact again for roughly 500 years. Finally, in
November 1835, a group of 900 Maori sailed to the Chatham Islands, attacked and
exterminated the Moriori people in only one month.
316
Gallup et al. (1998, p. 132): “a city might emerge because of cost advantages arising from
differentiated geography but continue to thrive because of agglomeration economies even when the cost
advantage has disappeared”.
The geography vs institutions debate
In the theory of economic growth, two main driving forces have been proposed
as fundamental determinants of economic performance in the long run: Institutions and
Geography.
Institutions refer to the (formal and informal) norms that constrain human
behaviour (the “rules of the game” in a society). The fundamental role of institutions
has been stressed, among others, by the Nobel Laureate Douglass North and Daron
Acemoglu317. Geography, in contrast, refers to “forces of nature”, such as climate,
natural resources and location, which impact on agricultural productivity, disease
burden, transport costs and technological diffusion. According to this view, countries
located in favourable regions were blessed with initial conditions that triggered
economic development. The role of geography has been emphasized, among others, by
Jeffrey Sachs and Jared Diamond318.
Of course, at this stage the reader should be aware that economic development is
a very complex phenomenon, so no single factor should be capable of explaining all the
observed economic disparities in the world. Still, understanding the real weight of these
two factors has important policy implications: while Geography refers to conditions that
societies cannot change, institutions are human devised and have at least the potential to
be changed by collective action. Unsurprisingly, an empirical literature has emerged
trying to disentangle whether the most important factor influencing economic
development is geography or institutions.
Reversals of Fortune
Since Geography is an exogenous and unchanging variable, an obvious way of
assessing its importance is to look at history: if geography played a prominent role,
then some regions should be doomed to be rich while others should be doomed to
remain poor.
Fortunately, this is not always the case: in real life, there are examples of
countries that managed to change their fortune. A popular example is Botswana. This
country is tropical and landlocked and it remained poor until very recently. However,
thanks to a well-functioning democracy that has been successful in preserving the
legacy of the laws and contract enforcement inherited from the British colonial period,
this country has enjoyed a fast convergence towards the developed world.
This avenue was explored by Daron Acemoglu and different co-authors in a
series of papers. The authors focused on particular historical episodes, which they
labelled as “natural experiments”: episodes where, “while other fundamental causes of
economic growth were held constant, institutions changed because of potentially-exogenous
reasons”319.
317
Key references are North and Thomas (1973), North (1990), Acemoglu et al. (2001).
318
Key references include Diamond (1997), Gallup et al. (1998), Sachs (2001).
One of these “natural experiments” is the history of the two Koreas. The split of
Korea into South Korea and North Korea (the Democratic People’s Republic of Korea)
occurred after World War II. The two countries were born out of the same people, share
the same culture and climate and had the same per capita income just before separation.
The two countries adopted, however, different economic systems: while South Korea
adopted a market economy, North Korea abolished private property and implemented a
centralized command system. Geography, by definition, remained immutable. Three
decades later, South Korea was a growth miracle, while North Korea
was amongst the poorest nations in the world. This suggests that institutions can
overwhelm geography as a driver of economic development.
Another “natural experiment” was the colonization of much of the world by
Europeans, starting in the fifteenth century. This colonization transformed exogenously
and abruptly the institutions in the colonized regions. If climate, ecology, or disease
environments had condemned some of these regions to poverty and other regions to
prosperity, those regions that were poor should have remained poor and those that were
rich should have remained rich. However, after the European colonization, many
regions experienced “reversals of fortune”: regions that were very rich and influential in
the past, such as the Incas in America and the Mughals in India, became poor, while
some other regions that were poor, like Australia and North America, became rich. Box
12.10 describes this natural experiment in greater detail.
Box 12.10 The colonization experiment
The European colonization of much of the world starting in the fifteenth century
triggered many episodes of “reversals of fortune”: regions that were initially rich
became poor, while some regions that were poor became rich.
A number of authors led by Daron Acemoglu320 contended that the main
explanation for the “reversals of fortune” after colonization was the change in the
quality of institutions. Indeed, European colonization transformed exogenously and
abruptly the institutions in the colonized regions. The authors observed that Europeans
followed different colonization models and implemented different institutional setups
around the globe. In some regions, Europeans implemented “extractive institutions”,
such as the slave plantations of the Caribbean, Congo and Central America. In these
cases, institutions were not designed to protect the property rights of the majority of
citizens or to constrain the power of elites. In other regions, Europeans founded settler
societies, replicating European institutions in areas such as North America. According to
the authors, the fact that countries that implemented settler institutions achieved higher
economic performance than those that implemented extractive institutions provides
support for the institutional hypothesis.
A different question is why did the Europeans implement different institutional
models in different colonies. Here, geographical conditions and factor endowments
319
Acemoglu (2009), p. 149. The discussion below follows Acemoglu (2003), Acemoglu, Johnson and
Robinson (2011, 2002, 2006).
320
Acemoglu et al. (2001, 2002).
played a critical role321. Indeed, European colonialists did not set up institutions for the
sake of society as a whole: they created settler societies wherever it was in their
interest to do so and “extractive” institutions wherever that served them better.
Thus, in places where the climate and the soil quality made it more effective to
produce crops using large plantations and where the disease environment was not
favourable to European settlement, the colonialists established plantation systems based
on slavery and erected political and legal institutions to protect the few landholders
from the majority of the population.
In places where the climate and the soil quality made small-scale farming more
effective, where most of the land was empty and where the climate and disease
environment were hospitable, Europeans settled in large numbers and developed laws
and institutions protecting the property rights of the regular citizen and imposing constraints
on the elites. In these colonies, institutions were much more favourable to growth, broad
public education and innovation.
The conclusion is that, although the reversals of fortune were triggered by major
changes in the institutional setup, geography had a critical influence in shaping the
quality of institutions.
The indirect influence of geography
In sum, the historical evidence suggests that geography condemns a country
neither to success nor to poverty. The fact that many historical “reversals of fortune”
were preceded by abrupt changes in the institutional setup points to a prominent role of
institutions as a determinant of economic performance.
To the extent that many institutional setups around the globe are an inheritance
of the European colonization and have emerged in response to the existing geographical
conditions, there has been an “indirect” effect of geography on economic performance,
through the quality of institutions. That is, although geography is not the determining
factor, it critically influenced the determining factor: the quality of institutions.
This reasoning suggests that the correlation between latitude and per capita
income observed in Figure 12.3 may not be due to a direct influence of geography on
economic performance, but rather to an indirect one: to the extent that the historical
creation of institutions was correlated with latitude, a statistical relationship between
geography and economic performance emerges in the data, irrespective of any direct
causality from geography to economic performance322.
321
Sokoloff and Engerman (2000) and Acemoglu et al (2001).
322
Empirical tests implemented by Acemoglu et al (2001) reveal that, after accounting for the indirect
effect of geography on the quality of institutions, geographical variables exert no influence on economic
performance. Rodrik et al (2004) also tested empirically the institutions-versus-geography hypothesis,
using cross-country data for the year of 1995. They found that the quality of institutions (as measured by
a composite indicator capturing the protection of property rights and the rule of law) is the only positive
and significant determinant of per capita income. Controlling for institutions, they observed that
geography has, at best, weak direct effects on per capita income, although it has a strong indirect effect,
through its influence on the quality of institutions (similar result in Easterly and Levine, 2003). The
authors also found that openness to trade is an important determinant of the quality of institutions.
This is not to say that geography is the only determinant of the quality of
institutions. Institutions and policies do change over time, and sometimes drastically in
response to major political, social or economic disruptions. In any case, it is in the
hands of people to change their own future.
12.7 Discussion
In the neoclassical paradigm, the expansion of one activity comes only at the
expense of the others. In light of that model, the price mechanism ensures that the
economy is self-adjusting. With increasing returns, in contrast, the expansion of one
activity does not necessarily come at the expense of the others: there is room for
complementarities. When investments are complementary, the profitability of two
different investments depends on each other and there is no market mechanism to assure
that both investments will take place. Hence, critical investments fail to occur because
other complementary investments are missing and the latter fail too because the former
do not happen. This is a coordination failure.
In this chapter, two versions of the model with increasing returns were
described. The first assumes that the number of varieties is fixed. In that case, the
expansion of output by one firm may lead to an increase in profits by other firms. The
second version of the model is that of monopolistic competition. In that model, there is
free entry, so profits are driven down to zero in the long run. In both cases, we found
that a larger market size impacts positively on per capita income. This, in turn, leads to
cumulative causation: either through a decline in average costs of producing each
variety or through an expansion in the number of varieties, the more an economy
produces, the more attractive it will be for new investment.
Coordination failures and cumulative causation may have a role in explaining
why some countries are rich while others remain poor. Lack of industrialization and
low division of labour may be self-sustaining. Following this reasoning, many authors
contend that governments should have an active policy towards industrialization, either
through a coordinated effort with domestic entrepreneurs or through temporary import
protection. Other authors contend, however, that governments have neither the technical
ability nor the political strength to implement successful industrial policies.
Economies of scale are a “centripetal force”. In the real world, however, one
does not observe the concentration of economic activity in one unique geographical site.
So, in order to understand the dispersion of economic activity across the territory, one
needs to account as well for the role of “centrifugal forces”. The so-called “New
economic geography” has focused on transport costs. According to this theory, transport
costs create a tension between the agglomeration advantages arising from economies of
scale and the advantage of staying close to peripheral markets.
The chapter illustrates the difficulty in obtaining general propositions in the
presence of economies of scale. From model to model, the conclusions differ
dramatically with small changes in parameter values or in initial conditions. This is a
general feature of economies of scale: sometimes, small changes in the parameters
produce small effects, sometimes they trigger a process of cumulative causation that
changes the nature of the equilibrium. With increasing returns, it is much more difficult
to obtain general propositions than in the case with perfect competition and constant
returns.
In models with increasing returns, small initial advantages can lead to
persistently different growth rates and divergence of per capita incomes. A question that
arises is why some countries were blessed with an initial advantage while others were not.
Advocates of the geography hypothesis argue that temperate-zone coastal countries
have higher income levels today because their geographical attributes once conferred
advantages, even if these initial advantages are no longer important. History is,
however, full of examples of “reversals of fortune”, that is, countries that were
initially poor and became rich or countries that were initially rich and became poor.
Many of these episodes were preceded by dramatic changes in the quality of
institutions. Thus, while recognizing the critical importance of geography, the
conclusion is that human-devised institutions are the main
fundamental cause of economic development.
• The Big Push argument points to the possibility of coordination failures
arising from horizontal complementarities: a single investment may not
be profitable even if it would be profitable when coordinated with other
investments. In the Big Push model, the size of the market does not
necessarily go along with the size of population: the choice of
technology matters.
• With free entry, the number of varieties becomes an increasing function
of the size of population (horizontal innovations). The implied division
of labour gives rise to productivity gains that translate into a positive
relationship between per capita income and the population size (weak
scale effect). A corollary of this is that openness to international trade, by
enlarging the extent of the market, results in productivity gains.
• As the enlargement of the market translates into higher productivity,
higher productivity may feed back on the size of the market. This mutual
causation opens a channel through which a poor country that did not reach
a critical stage in the process of division of labour finds itself trapped in
a low level equilibrium, with little division of labour and small market
size. This is another version of the Big Push argument.
• To the extent that traded goods differ with respect to their division of labour
potential, the specialization pattern that arises under free trade is not
necessarily the one that delivers the highest possible income. In
particular, a country specialized in a good that does not favour the
division of labour will fail to achieve the implied productivity gains. In
this case, it may pay to promote an “infant industry”. By the same token,
government efforts to attract multinational firms may help trigger the
development of upstream industries that in turn will favour the springing
up of new downstream industries, in a virtuous cycle.
• The Big Push idea inspired many economists to argue that governments
should play an active role in industrialization, by coordinating private
investments. Critics of the Big Push argue, however, that
governments lack the knowledge and the appropriate incentives to
implement successful industrialization policies.
• When economies of scale are coupled with factor mobility, cumulative
causation takes the form of agglomeration economies, whereby mobile
factors tend to move from peripheral regions to the centre. Transport
costs and immobile inputs may help counteract this force. The location of
economic activities across space reflects a tension between
centripetal forces and centrifugal forces. When labour is immobile across
regions, higher wages in the centre act as a centrifugal force.
• A common feature of models with increasing returns is that the
equilibrium is very dependent on the initial conditions and on parameter
values. Thus, it is very difficult to obtain general propositions.
• History dependence in the presence of economies of scale raises the
question as to why the “initial” conditions differed across countries. A
branch of the literature contends that, in real life, the initial conditions
were pretty much determined by geographical circumstances. This is the
so-called “Geography Hypothesis”.
• There are many reasons to believe that geography indeed played a key
role in shaping the initial conditions. Regions with abundant arable land
and access to navigable waters emerged as more attractive places to
produce and invest, feeding large populations and achieving productivity
gains through the division of labour and technological progress, in a
virtuous cycle. Regions that benefitted from favourable geographical
conditions in the past tend to be richer regions today, even though the
geographical characteristics that delivered the initial advantage are no
longer important.
• Although geography plays a key role in explaining why some countries
today are rich and other countries are poor, this does not mean that living
standards cannot be changed by human actions. The debate on
“Geography vs Institutions” has revealed that regions that were initially
rich became poor and regions that were initially poor managed to
become rich. These “reversals of fortune” were in general associated with
changes in the quality of their institutions. The conclusion is that
institutions are the most important determinant of economic
development. Still, historically, geography has played an important role
in influencing the quality of institutions.
Key concepts
• Big push
• Pecuniary externalities
• Strategic complementarities
• Horizontal vs. vertical complementarities
• Backward and forward linkages
• Coordination failure
• The division of labour
• Reversals of fortune
• The geography hypothesis
Essay questions:
a) Comment: “the case for big push is much more stringent than that loosely expressed by
Rosenstein-Rodan”.
b) In light of the big push model, public provision of an essential infrastructure is not a sufficient
condition to escape the trap. Explain why.
c) Explain: “The division of labour depends upon the extent of the market, but the extent of the
market also depends on the division of labour”
d) Explain how multinationals can help a country escape a low level equilibrium trap.
e) Comment: “Industrial policy is not an option: governments are doomed to choose”.
f) Comment: “According to the core-periphery theory, a fall in transport costs may lead to
divergence”.
g) Discuss: “The factor which ultimately best explains economic growth is geography”.
h) The colonization of much of the world by Europeans, starting in the fifteenth century, delivered
different economic performances around the globe. Does this “natural experiment” favour the
“geography hypothesis” or the “institutions hypothesis”?
Exercises
12.1.
Consider a closed economy, where aggregate output (Y) is obtained using 100
(m) intermediate inputs, according to the following production function:
$Y = \left( \sum_{j=1}^{100} x_j^{1/2} \right)^{2}$.
a) Admitting that the final output sector is perfectly competitive, determine
the demand function for each intermediate input.
b) Knowing that intermediate sectors are all alike and that there is free entry
in the final output sector, find out the relative price of Y, Py/pj.
Each intermediate input can be produced with one of two technologies: (i) a
traditional CRS technology that converts one unit of labour input into one
unit of intermediate input, $N_j = x_j$; (ii) a modern technology with increasing returns,
$N_j = F + x_j/\lambda$, where F=10 and λ=2. When technology is traditional, production will be
competitive. When a plant is installed, the entrepreneur becomes a monopolist in the
market for the corresponding variety.
c) What is the volume of employment that makes industrialization desirable?
d) Now assume that N=2500. Under these conditions, would it be better for
the economy as a whole to remain traditional or to become modern?
e) Assume that the blue collar wage (w) is higher than 1. Why should
that be? Verify that, in this case, installing a plant is bound to be a
non-drastic innovation.
f) Bearing in mind the answer to (e), compute the quantity demanded of
each intermediate input when:
i. All producers remain traditional;
ii. All producers become modern.
g) Find the profit of an entrepreneur who decides to go modern when:
iii. All other producers remain traditional;
iv. All other producers are modern.
h) Assume that w=1.1. If all other sectors remain traditional, is it worthwhile
for an individual entrepreneur in a given sector j to go modern? What
if all other sectors were modern?
i) If instead w=1.6, would it pay for an entrepreneur to go modern when
all other sectors remain traditional? And if all other sectors were modern?
j) And what if w=1.16? Repeat the exercise and conclude.
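The aggregate quantities in Exercise 12.1 can be checked numerically. The Python sketch below is only an illustration and not part of the original exercise: it assumes that labour is split evenly across the 100 sectors and that all sectors use the same technology, and the function names are hypothetical. It evaluates the production function and the two labour requirements stated in the exercise, so that answers to (c) and (d) can be verified for any employment level N.

```python
# Illustrative check for Exercise 12.1 (the symmetric allocation is an assumption).

def final_output(quantities):
    """Y = (sum over j of x_j^(1/2))^2, the production function of the exercise."""
    return sum(x ** 0.5 for x in quantities) ** 2

def traditional_quantity(labour):
    """Traditional CRS technology: N_j = x_j, so x_j equals the labour used."""
    return labour

def modern_quantity(labour, fixed_cost=10.0, productivity=2.0):
    """Modern technology: N_j = F + x_j / lambda, so x_j = lambda * (N_j - F)."""
    return max(0.0, productivity * (labour - fixed_cost))

def symmetric_output(total_labour, technology, sectors=100):
    """Aggregate output when labour is split evenly and every sector uses `technology`."""
    per_sector = total_labour / sectors
    return final_output([technology(per_sector)] * sectors)

if __name__ == "__main__":
    for n in (1500, 2000, 2500):
        y_trad = symmetric_output(n, traditional_quantity)
        y_mod = symmetric_output(n, modern_quantity)
        print(f"N={n}: traditional Y={y_trad:.1f}, modern Y={y_mod:.1f}")
```

Comparing the two columns for different values of N shows where the fixed cost F stops outweighing the higher marginal productivity of the modern technology.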
13. Corruption, rent seeking and institutions
“How deep is your love?” [Bee Gees]
Learning Goals:
• Acknowledge the importance of institutions in aligning private
incentives with the public interest.
• Distinguish the case with a non-benevolent planner from the case in
which the planner is benevolent but civil servants are not.
• Understand the main components of a well-designed institutional setup.
• Understand why strategic complementarities may lead to endemic
corruption.
• Acknowledge the critical role of institutions as a fundamental factor
explaining economic development.
13.1. Introduction
In the previous chapters, we stressed the key role of government policies for
economic development. So far, however, we have assumed that policies are designed
and implemented by benevolent planners whose interests are aligned with the social
interest. This chapter departs from this perspective.
In real life, public programs are instituted through complicated political processes
and implemented through complex bureaucracies. Rather than acting as benevolent planners,
political leaders often pursue their own selfish objectives, using their powers to keep
themselves in office or to direct resources to their political supporters. In many
countries, government officials are primary agents of diversion, seeking to maximize
their own benefit through extortion, corruption fees or undue appropriation of public
assets. As a by-product, in societies with high levels of corruption, individuals tend to
spend valuable resources seeking fast money and special favours, instead of
devoting them to production and innovation. When corruption is very high, institutions
become dysfunctional, paving the way for corruption to become self-sustaining.
In this chapter, we enrich the neoclassical growth model to examine the
implications of corruption and rent seeking for economic performance. This will also
provide an opportunity to discuss the key role of institutions in keeping decision-makers
more aligned with the public interest. In this discussion, three models of corruption will
be considered. In the first model, a non-benevolent despot (the kleptocrat) empowered
with perfect control over his bureaucracy uses his discretionary power with the aim of
maximizing his personal theft from the government budget. The only limits he faces are
the political, administrative or legal institutions he cannot change. The second model
(decentralized corruption) examines the case in which a benevolent leader delegates
discretionary power to a large number of non-benevolent public officials whose
corrupt activity cannot be coordinated. In this case, the level of corruption will
depend on the ability of the planner to design incentive-compatible institutions. Finally,
we discuss the implications of corruption becoming generalized, affecting the majority
of the population and all levels of the public administration. In this case, there is no
benevolent planner seeking to design optimal institutions. The likelihood of detection
and punishment decreases dramatically, institutions and policies become highly
dysfunctional and corruption becomes endemic.
The chapter proceeds as follows. Section 13.2 defines corruption and gives some
real life examples. Section 13.3 introduces the model with centralized corruption.
Section 13.4 addresses the case where corruption is undertaken by a large number of
public officers whose activity the benevolent planner cannot control. Section 13.5
briefly reviews how societies in the real world deal with the corruption problem and the
key role of institutions in shaping the incentives of civil servants. Section 13.6 analyses
the case where institutions become dysfunctional and corruption becomes endemic.
Section 13.7 concludes.
13.2. Corruption
What is corruption?
Corruption may be defined as an “act in which the power of public office is used
for personal gain in a manner that contravenes the rule of the game”323. This includes
embezzlement, the appropriation of public assets for personal use, the awarding of
lucrative contracts to businesses owned by the public officer’s relatives (actions that the
public officer can carry out alone), bribery and extortion (actions that necessarily
involve two parties).
According to Aidt (2003), three conditions are necessary for the existence of
corruption: (i) the public official must have the authority to design or administer
policies and regulations; (ii) this discretionary power must allow the extraction or
creation of economic rents; (iii) the incentives embodied in political, administrative and
legal institutions must be such that the official has an incentive to use his discretionary
power to extract or create rents.
The incidence of corruption varies widely across countries. Corruption is more
pervasive in the developing world, but is a matter of major concern all over the world.
Corruption may affect both the lower level of administration and the top levels in the
government. Where corruption emerges, it is not because people there are different, but
because there are economic or social incentives for it324.
What is rent seeking?
323
Jain (2001).
324
Douglass North (1993): “If the institutional matrix rewards piracy more than productive activity then
learning will take the form of learning to be better pirates” (p. 6). Hall and Jones (1999): “If a farm cannot
be protected from theft, then thievery will be an attractive alternative to farming” (p.95).
Government actions influence the profitability of private agents. To the extent
that government officers have the discretion to set tariffs and subsidies, to buy goods
and services, to license industrial activities and to regulate monopolies, they will face
pressures from economic agents seeking to obtain favours and special regulations.
Devoting potentially productive resources to persuade politicians and civil servants to
take actions that generate income transfers or rents to particular individuals or groups at
the cost of the general interest is called “rent seeking”325.
Rent seeking can be either legal or illegal. At the legal level, organized lobbies
such as trade organizations and unions influence public decisions by exerting
pressure and giving financial or electoral support to those parties that better serve their
interests. Such influence exists because politicians need votes and financial
contributions to their campaigns. At the illegal level, agents may influence the decisions
of bureaucrats and policymakers by offering them a bribe. Bribery is a form of
pecuniary corruption: public servants are induced to take actions that deviate from the
public interest in exchange for monetary benefit or gifts.
What are bribes for?
There are many things private parties can buy from public officials with bribes
or other forms of influence. These include326:
- Time savings and regulatory avoidance: in many developing countries excess
bureaucracy and red tape rank very high as an obstacle to doing business. Often firms
are given the opportunity to pay bribes to bureaucrats so as to “speed up” the bureaucratic
process of obtaining the required permits327.
- Government revenues: bribes can be used to escape taxes or other payments to
the government. A typical case is when a tax inspector accepts bribes in exchange for
lower collection.
- Government benefits: bribes can influence the allocation of benefits to the
private sector. These include monetary benefits (subsidies, pensions) and in-kind benefits (food
supplies, access to medical care, access to courts, housing, privatizations). For instance,
a policeman who is supposed to protect all citizens may be given a bribe to look after
only a particular interest. This comes at the cost of unfair competition, because when
officials can privately sell the protection of property rights to individual firms, they
have little interest in providing the public at large with open access to this essential
service328.
325
More generally, the term rent seeking refers to efforts to obtain wealth transfers without creating any
value. A cartel of firms agreeing to raise prices, for instance, is a form of rent seeking that does not
involve bribery or pressures on civil servants. In this chapter we are interested in a sub-category, relating
to persuading public officers to deviate from the public interest.
326
The following classification is adapted from Gray and Kaufman (1998).
327
The EBRD (1999, p. 124) estimates the “time tax” imposed on managers by bureaucrats (i.e., the time
spent dealing with the public administration) to be about 10% of senior managers’ time, which
compares to a “bribe tax” of about 6% of firms’ revenues.
328
According to Hellman et al. (2003) this mechanism of purchasing individualized protection of property
rights became very popular in Russia, as a natural response to the general weakness in the rule of law
after the collapse of the Soviet Union.
- Government contracts: bribes can be given to influence the choice of private
suppliers to the public sector. That is, contracts may be allocated to the firm that pays
the largest bribe, instead of to the one that submits the lowest bid.
- Influencing legal and regulatory outcomes: bribes can influence how existing
laws, rules or regulations are implemented with respect to the bribe payers. For instance,
bribery can be used to prevent the government from stopping illegal activities, such as
pollution and drug dealing, or for a firm to obtain the consent of the competition
authorities to charge a monopoly price. At a higher level, bribes can be directed to
influence how laws, rules and regulations are designed. Examples include bribes to
parliamentarians to “buy” important pieces of legislation and bribes to government
officials to enact favourable regulation (a case labelled “state capture”)329.
The grease in the wheels argument
Many people believe that corruption can be efficiency enhancing: by providing
bureaucrats with a pecuniary incentive (speed money), corruption helps overcome excess
bureaucracy and red tape. As long as bureaucrats give priority to the individuals paying
the highest bribes and those who offer the highest bribes are those who are carrying out the
most promising projects, bribery will be an efficient mechanism to allocate the
bureaucrats’ scarce time.
The grease in the wheels argument fails, however, for various reasons330. First,
this is second-best reasoning: that is, given the rigidity created by the bureaucracy,
corruption helps relax this rigidity. Clearly, the first-best policy would be to address
the rigidity itself. Second, since bribery is hidden, it is not equivalent to a competitive
bid. Third, the implied “contracts” cannot be enforced by law, so nothing ensures that
the highest bids will actually be attended to first. Fourth, even if bribery could effectively
allocate faster government decisions to those who value them most, society would be
better off if the corresponding revenues were appropriated by the government, rather
than by corrupt bureaucrats.
Last but not least, corruption is often what causes bureaucratic processes
to be slowed down, not the other way around: if public officers are rewarded by the
existence of red tape and regulations, they will tend to create extra red tape and
regulations, just to increase their opportunities for predation331.
Box 13.1 Corruption in the real world
“One of the most extreme real-world examples of theft of productive public
infrastructure, according to Abbott (1988, p. 172) involves Luckner Cambronne, a
member of the elite that ruled Haiti under the Duvaliers. He apparently had his
workmen pull up and carefully store the entire rail system linking Port-au-Prince to
329
Hellman and Kaufman (2001).
330
Aidt (2003).
331
Tanzi (1998): “when rules can be used to extract more bribes, more rules will be created”. Evidence
that corruption increases red tape is provided by Gray and Kaufman (1998) and Kaufman and Wei (1999).
For a model relating corruption to red tape, see Guriev (2004).
Verrettes via St. Marc; he then sold the 150 kilometres of railroad as scrap metal and
pocketed the money for himself” [Mauro, 2004, p. 5].
“In May 2000, 950 people were injured and 22 killed, when a fireworks factory
in Enschede, the Netherlands, burst into flames. The explosion reached such
catastrophic levels because government regulators turned a blind eye to grave security
breaches with regard to storing explosives on the factory premises. In return for
remaining silent, the officials are said to have received free fireworks for years. Even an
illegal enlargement of the factory was legalised by the authorities a posteriori. The local
government official in charge of monitoring fireworks factories in the area admitted to
not knowing the specific regulations on the storage of explosives. Though considered an
expert, he hadn't read the relevant literature, nor had he taken part in any training
seminars. He only followed the instructions of his superiors, one of whom was arrested
on corruption charges two years ago.” [Transparency International].
“A Swiss activist for the rights of the Penan, a nomadic people in the Malaysian
rainforest, has been missing since May 2000, after he successfully drew international
attention to the problem of the unscrupulous logging of Borneo's woods. Turning
rainforest into palm plantations, the logging companies and government officials
destroy the habitat of the indigenous rainforest nomads. In addition to threatening the
lives of the Penan and those who fight for them, the excessive logging in Borneo
contributes to the worldwide problem of deforestation, affecting the earth's climate
(…)” [Transparency International].
“(…) The bank owned a large stake in one of the country’s most profitable
companies. But when the management attempted to sell the stake to the biggest bidder,
it was advised by the government to sell the shares to the company’s founder at a
quarter of the market price instead. The founder turned out to be a close friend of the
country’s president. Where is this bank? It happens to be Crédit Lyonnais in France
(…)” [Shleifer and Vishny, 1998, p. 1].
Box 13.2 The TI Corruption Perceptions Index
The recognition that fighting against corruption and monitoring its progress
requires some form of measurement motivated the development of various measures of
corruption. A famous indicator is the Corruption Perceptions Index (CPI), produced by
Transparency International. The CPI measures corruption perceptions as seen by
business people and country analysts. The index ranges from 0 (highly corrupt) to 10
(highly clean).
Figure 13.1 correlates the results of the 2005 survey (159 countries are
included) with per capita income. The positive correlation in the figure reveals a general
tendency for the incidence of corruption to be larger in poorer economies. For instance,
the Scandinavian bureaucracies rank as the cleanest in the World, while most of the sub-
Saharan African bureaucracies rank at the bottom. Note however that this correlation
may also entail some form of reverse causality: a high incidence of corruption gives rise
to waste of resources and other inefficiencies that cause per capita output to shrink332.
Figure 13.1 – Corruption perceptions and per capita GDP
[Scatter plot of per capita GDP (2005), vertical axis, against the Corruption perceptions index (2005), horizontal axis; R² = 0.799]
Source: Transparency International, http://www.transparency.org/.
13.3. A model of centralized corruption
This section examines the case of a non-benevolent leader (the Kleptocrat),
whose only aim is to maximize his personal expenditure. The analysis assumes that the
Kleptocrat faces no electoral constraints and is blessed with perfect information and
perfect control over the bureaucracy.
The underlying model is the Solow model augmented with public inputs already
examined in Chapter 10.
Main assumptions
In the private sector, the individual firm production function is given by:
$$Y_i = A_t K_i^{\beta} N_i^{1-\beta}. \qquad (13.1)$$
The productivity term has two components: an exogenous rate of technological
progress and an efficiency term related to the ratio of (productive) government
expenditures to GDP:
332
Mauro (1995) investigated the empirical relationship between corruption and economic performance,
controlling for the possible endogeneity of corruption. Using cross-section data for 70 countries in the
early 1980s, the author found a strong negative association between corruption and economic growth. The
author also found corruption to be strongly correlated to other indices of bureaucratic and institutional
inefficiency, including “political instability” and “inefficiency of the legal system”.

$$A_t = A e^{\gamma t}, \quad \text{with } A = \left(\frac{G}{Y}\right)^{\alpha} \text{ and } \alpha > 0. \qquad (13.2)$$
It is assumed that government revenues are raised through a production tax ($\tau$).
In this model, corruption takes the form of theft from government revenues, i.e.,
the planner decides to spend a proportion $\theta$ of the government revenues “for political
reasons”. This translates into a lower provision of productive public inputs:
$$G = (1-\theta)\tau Y, \quad \text{with } \theta \ge 0. \qquad (13.3)$$
The total amount of theft is therefore333:
$$\Pi = \theta\tau Y. \qquad (13.4)$$
The clever Kleptocrat
Assume that, instead of a benevolent planner, you were a kleptocrat whose only
objective was to use the government budget for personal expenses. Would you divert all
tax proceeds to yourself? Clearly, the answer is no: if you were a clever kleptocrat, you
would realize that failing to provide essential inputs such as public order and basic
infrastructure would impact too badly on private activity, and therefore on your tax
base.
As an extreme case, just think what would happen if you set $\theta = 1$: in that case,
there would be no public provision at all (G=0). Since government services are essential
to production (equation 13.2), there would be no private economy either and you would
end up as a bankrupt dictator. Thus, if you were a clever Kleptocrat you should take into
account that by stealing too much you may end up eating both the egg and the chicken.
Formally, the Kleptocrat’s problem is to maximize the amount of theft, taking into
account how this impacts on per capita income and therefore on government revenues.
To illustrate the problem in the simplest manner, just remember the formula for
the steady state level of per capita income in the Solow model and adapt it for the
existence of a public input (this is equation 10.13):
$$y_t^* = \left(\frac{G}{Y}\right)^{\frac{\alpha}{1-\beta}} \left[\frac{s(1-\tau)}{n+\delta+\gamma}\right]^{\frac{\beta}{1-\beta}} e^{\gamma t}. \qquad (13.5)$$
Substituting (13.5) and (13.3) into our variable of interest, (13.4), you get:
$$\Pi = \theta\tau \left[(1-\theta)\tau\right]^{\frac{\alpha}{1-\beta}} \left[\frac{s(1-\tau)}{n+\delta+\gamma}\right]^{\frac{\beta}{1-\beta}} L_t. \qquad (13.6)$$
333
It should be noted that, in real life, corruption does not necessarily take the form of theft from the
government budget. Instead of spending out of the government budget, corrupt leaders may confiscate
assets or impose bribes on firms. However, the distinction between extortion, bribes and taxes on
production is no more than an accountancy detail: from the individual firm’s point of view what really
matters is the total amount it is coerced to pay. For the economy as a whole, what matters is the
proportion of these payments that is diverted to unproductive uses. Modelling corruption as theft from the
government budget avoids accountancy complications.
Now, if you choose $\tau$ and $\theta$ so as to maximize $\Pi$ (this is a bit tedious, though not
difficult), you will obtain (the superscript K stands for the kleptocrat solution):
$$\tau^K = \frac{\alpha + 1 - \beta}{\alpha + 1}; \qquad (13.7)$$
$$\theta^K = \frac{1 - \beta}{\alpha + 1 - \beta}. \qquad (13.8)$$
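The maximization behind (13.7) and (13.8) can be checked numerically. The following is only a minimal sketch, under stated assumptions: the symbol names (tau for the tax rate, theta for the stolen share, alpha for the exponent on G/Y in the productivity term, beta for the capital share) and the parameter values are illustrative choices, not taken from the text, and the objective is the steady-state theft in (13.6) with all factors that do not depend on tau or theta dropped.

```python
def theft_objective(theta, tau, alpha, beta):
    """Steady-state theft up to a positive constant.

    Proportional to theta * tau * y*, where y* is steady-state income.
    Factors depending on neither theta nor tau are dropped (assumption:
    they do not affect the maximiser).
    """
    x = alpha / (1.0 - beta)  # exponent on the public-input share G/Y
    z = beta / (1.0 - beta)   # exponent on the net-of-tax saving term
    return theta * tau * ((1.0 - theta) * tau) ** x * (1.0 - tau) ** z


def kleptocrat_closed_form(alpha, beta):
    """(theta_K, tau_K) implied by the two first-order conditions."""
    theta_k = (1.0 - beta) / (alpha + 1.0 - beta)
    tau_k = (alpha + 1.0 - beta) / (alpha + 1.0)
    return theta_k, tau_k


def kleptocrat_grid_search(alpha, beta, steps=600):
    """Brute-force maximiser of the objective on a (theta, tau) grid."""
    best_value, best_theta, best_tau = -1.0, 0.0, 0.0
    for i in range(steps + 1):
        theta = i / steps
        for j in range(steps + 1):
            tau = j / steps
            value = theft_objective(theta, tau, alpha, beta)
            if value > best_value:
                best_value, best_theta, best_tau = value, theta, tau
    return best_theta, best_tau


ALPHA, BETA = 0.2, 0.4  # illustrative values only
theta_k, tau_k = kleptocrat_closed_form(ALPHA, BETA)
theta_num, tau_num = kleptocrat_grid_search(ALPHA, BETA)
public_input_share = (1.0 - theta_k) * tau_k  # G/Y implied by (13.3)

print("closed form:", theta_k, tau_k)
print("grid search:", theta_num, tau_num)
print("implied G/Y:", public_input_share)
```

Under these assumptions the grid search should land within one grid step of the closed-form values, and the implied share of output spent on public inputs equals alpha/(1+alpha).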
Compared with the benevolent planner’s case (10.9), you see that the tax rate is
now higher. The fraction of unproductive expenditures is not zero, because these are
precisely the expenditures the kleptocrat wants to maximize. In terms of Figure 10.3,
the equilibrium when the ruler is a fully empowered kleptocrat is represented by point
K.
Interestingly enough, if you substitute (13.7) and (13.8) into (13.3), you’ll realize
that there is an agreement between the kleptocrat and the benevolent planner regarding
the proportion of output to be spent on public inputs334:
$$\left(\frac{G}{Y}\right)^{K} = \left(\frac{G}{Y}\right)^{G} = \frac{\alpha}{\alpha + 1}. \qquad (13.9)$$
Remember that this fraction corresponds exactly to the contribution of the public
input to production, as stated in equation (10.3). The conclusion is that, once you act as
a clever kleptocrat, you want resources in your economy to be allocated efficiently. That
is, you want your chicken conveniently fat so that it can produce more eggs. In the
words of Easterly (2001), you become “solicitous of your victim’s prosperity”!
Of course, consumers in this economy will be worse off than in the benevolent
planner case. This can easily be checked by substituting (13.7) and (13.8) in (10.13) and
(10.14).
Dynamic considerations
The above analysis assumes that the kleptocrat maximizes the steady state level
of theft. However, in the real world, despots do not stay in power forever. Either
through democratic elections or through revolutions, non-benevolent leaders may lose
their power. Thus, the optimal misappropriation policy from the kleptocrat’s point of
view should also take into account the transition dynamics and the time horizon of his
leadership. Solving such a problem is however beyond the scope of this book335.
Also note that incorporating inter-temporal considerations into the model opens
the door for another trade-off: the re-election probability may itself depend on the extent
of the theft. That is, if the kleptocrat steals too much today, he may face a higher
probability of losing the chicken that lays the golden eggs tomorrow (either through
democratic elections or through a coup d’état). In this case, the kleptocrat has to balance
the benefits of more extraction during a shorter period of time with those of less
334
Barro (1990).
335
Note that, since this model has a transition dynamics, any change in a parameter (say ) will give rise
to an adjustment period during which the amount of theft approaches its steady state level. The path of
these transition dynamics should obviously influence the kleptocrat’s choice.
extraction during a longer period of time. Intuitively, this decision will depend on his
subjective discount rate: if the kleptocrat is very impatient, he will tend to increase
current misappropriation336.
Of course, the probability of dismissal will be more or less sensitive to the extent
of the theft, depending on how strong the political regime is: when the kleptocrat leads a
strong dictatorship supported by the military leadership, the likelihood of dismissal is
lower - and hence theft opportunities are higher - than when the regime is democratic or
when the generals do not have a share of the cake. In a democratic system, the planner may
improve the probability of re-election by directing transfers to groups of voters with
political influence.
In this calculation it may be wise to find some foreign allies. For instance,
suppose you were running an economy endowed with an important mineral resource,
such as petroleum. In that case, it would be a good idea to buy extra political stability
by sharing the cake with a foreign nation with strong military power. You could do so by
allowing foreign companies to extract some oil in your territory, in exchange for a
military cooperation agreement. If you forgot to do so - and if indeed your mineral
resources were significant - then you would most probably face an internal guerrilla
movement, supported by a foreign nation. Being solicitous of your chicken’s longevity
may have a foreign affairs dimension, too.
Shaping the leader’s incentives
The theft opportunities of a non-benevolent leader depend on the incentives
embodied in institutions he cannot change. Wherever political, administrative and legal
institutions are not strong enough to dissuade politicians from pursuing their own
interests, they may take actions that deviate from their constituencies’ interests.
Thus, it is in the interest of the public to design institutions that reduce the corruption
opportunities of non-benevolent leaders.
The most basic instrument to reduce the discretionary power of political leaders
is the law. The law determines what politicians are allowed to do and what they are not
allowed to do. The law also determines how politicians are chosen. Of course, for the
law to be effective, it has to be designed so that the leader cannot change it at his own
discretion and it has to be properly enforced. In a word, you need separation of powers.
You need an executive power to implement the policies, a legislative power to produce
the laws and courts to enforce them. If the various political institutions are
independently appointed and remain uncoordinated, this will favour mutual control. In
addition, you can set some laws to be harder to change than others. In order to protect the
citizens from the arrival of selfish political leaders backed by full majorities in the
parliament, societies need Constitutional Laws. These can be changed only with a larger
majority of votes in the parliament or sometimes with a referendum.
336
Because government expenditures depend on contemporaneous taxation, the model does not allow the
kleptocrat to “take all the money and run” ($\theta = 1$). But the model could easily be adapted, postulating a
one-year delay in the transformation of taxes into government services (e.g., government inputs need one
period to enter the production function). In that case, increasing misappropriation today would impact
on economic performance only tomorrow, giving the Kleptocrat time to take all the money before
leaving his post. Such a possibility was considered by Ventelou (2002), who analyses the choices of the
planner (who maximizes sequential flows of budget misappropriations) and of private agents (who seek to
maximize consumption and try to control politicians with voting assessment).
Although political institutions are of the utmost importance, the role of civil society
should not be neglected. A strong civil society backed by a free press that brings
watchdogs to the fore helps to monitor the implementation of public policies and to
maintain continuous pressure on governments to follow policies that best address
the people’s needs. In some countries, civil society has an explicit consultation role in
the decision-making process.
Surveillance by civil society will be more effective if there is transparency in the
decision-making process. If government decisions and expenditure programs are
publicly known, there will be a further source of social scrutiny over policymakers,
inducing them to remain honest. In many countries, government officials are required to
make periodic declarations of assets and income sources. This makes it more difficult for
them to hide illegal revenues.
In general, democracies where public decisions are transparent and civil society
is strong tend to be less permeable to corruption than dictatorships where the decision-
making is hidden from public view and civil society is repressed.
No Natural Gravitation
If the quality of institutions is so important, why don’t poor countries just
change their institutions so as to achieve better economic performance?
A problem is that societies do not naturally gravitate towards good institutions.
Institutions are not simple stroke-of-the-pen choices. Institutions are social choices that
emerge as outcomes of complex games between the different groups in society.
Hence, in order to understand how a particular institutional arrangement arises, one
must take into account the motivations and the bargaining powers of those individuals
and groups that participate in the political game.
The key issue is that institutions not only influence the level of a country’s income
but also its distribution among individuals and groups in a society. In practice, those
who hold the power to change the rules are often those who benefit most from the status
quo. Hence, rather than designing institutions that maximize social welfare, leaders
often favour institutional designs that maximize their own interests, subject to some
social constraint337.
Institutional reforms are easier to accomplish when it is in the interest of the
political elites to do so. This, in turn, may reflect the interest of existing political
supporters or the emergence of new groups in society with enough power to impose
the change. When the leaders fail to perceive these movements in civil society, a
disruption may occur338.
337
Azariadis and Stachurski (2005) offer a real world example: “Consider, for example, the current
situation of Burundi, which has been mired in civil war since its first democratically elected president was
assassinated in 1993. The economic consequences have not been efficient. Market-based economic
activity has collapsed along with income. Life expectancy has fallen from 54 years in 1992 to 41 in 2000.
Household final consumption is down 35% from 1980. Nevertheless, the military elite has much to gain
from continuation of the war. The law of the gun benefits those who have guns. Curfew and identity
checks provide opportunity for extortion. Military leaders continue to subvert a peace process that would
lead to reform of the army.”
338
North (1993):“Revolutions occur when the fundamental conflict between organizations over
institutional change cannot be mediated within the existing institutional framework”.
Box 13.3. Normative versus positive economics
When economists evaluate alternative policies, weighing up their various
benefits and costs, they are engaged in normative economics. When they describe the
economy and construct models to predict effects of policies or how governments will
behave, they are engaged in positive economics.
Normative economics is concerned with what “should be”, or how governments
should act. Should they intervene? What are the most effective means? What are
the optimal policies? Positive economics is concerned with “what is”, or why
policymakers do what they do. It incorporates the role of political pressures,
institutional constraints, and ideological issues. Normative economics makes use of
positive economics.
The market failure approach to the role of government is largely a normative
approach. It provides a basis for identifying situations where the government ought to
do something. A positive analysis should also describe the consequences of
government actions. In this assessment, a critical question is the extent to which the
government can do better than the market.
13.4. The model with decentralized corruption
Decentralized corruption and the “tragedy of the commons”
We learned that, under centralized (or organized) corruption, the Kleptocrat
looks after the prosperity of his constituencies. In his maximization problem, he takes
into account that stealing too much will drive the economy down along the Laffer curve,
reducing the tax base. He therefore has an incentive to coordinate all the extraction
activity, defining the shares each official can have339, so that the overall level of
corruption does not affect the economy too badly.
A different case occurs when the leader has no control over his bureaucracy.
When corruption is undertaken by a large number of uncoordinated civil servants, the
overall level of corruption will be much higher. The reason is that each corrupt official,
being too small to influence the overall outcome, has the incentive to impose as many
bribes as he can, without taking into account the shape of the Laffer curve.
To some extent, the case with decentralized corruption looks like the “tragedy of
the commons”: when law enforcement becomes too weak, it becomes virtually
impossible to preclude any public servant from entering the extraction activity
(non-exclusion). However, as more and more people engage in bribery, the amount extracted
by each corrupt officer decreases (rivalry). Thus, competition over the common
resource would lead to its depletion: in the limit, $\theta$ would approach 1 and the economy
would disappear.
Fortunately, rent seeking is not a costless activity. Bribing, lobbying, matching
corrupt officials and corruption opportunities require time and effort. These costs imply
339
Wade (1982) found an interesting example of organized corruption in South India: the author observed
that each level of the hierarchy in the administration of the irrigation system obtained a fixed percentage
of the total bribe.
that individuals will devote time to rent seeking only to the extent that the reward
exceeds the opportunity cost. This mechanism will in general prevent the economy from
being totally “exterminated”.
The other side of the coin is that rent seeking diverts valuable resources away
from production. Private agents, instead of competing through innovation, will invest
part of their talents in seeking special favours and easy profits. So the economy will
be working below its productivity frontier, not only because the provision of public
inputs will be suboptimal, but also because time and resources are wasted in
unproductive rent seeking.
The following model addresses these ideas formally.
Rent seeking as a diversion activity
To examine the consequences of people devoting part of their effort to rent
seeking, let’s go back to the Solow model augmented with a public good340. As in the
Kleptocrat case, the extraction activity targets the government revenues, $\tau Y$. The
Kleptocrat is however replaced by a benevolent leader who cannot control his
bureaucracy.
Let $\varphi$ be the fraction of time each individual devotes to rent seeking and $1-\varphi$
the time devoted to legal work. Output will be determined according to:
$$Y = A_t K^{\beta} N_Y^{1-\beta}, \qquad (13.10)$$
where A is defined as in (13.2) and
$$N_Y = (1-\varphi) N \qquad (13.11)$$
is the total work-time devoted to production. Since rent seeking takes time, the
opportunity cost of rent seeking will be the wage rate.
The difference between (13.10) and (13.1) is that the production function is now
parametric in the proportion of labour devoted to rent seeking, $\varphi$. In the extreme case in
which $\varphi = 0$, a benevolent planner optimally deciding the tax rate would be able to
achieve the first-best outcome, as given by (10.16). Our quest is to find out how far
the production function will shift down when the planner has no control over his
bureaucracy.
In this model, the proportion $\theta$ of resources diverted from public provision
accrues to households, as a reward for rent seeking. The income flow chart of the
economy is displayed in Figure 13.2341.
340
The following is an adaptation of the AK-type model proposed by Mauro (2004).
341
Note that now the households’ disposable income is $[1-(1-\theta)\tau]Y$. Because the proportion $\theta$ of tax
proceeds flows back to households, there will be a positive effect on savings and investment that does
not occur in the kleptocrat case.
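Figure 13.2: The income flow chart with decentralized corruption
[Flow chart linking Firms, Households, Government and the Capital Market. Labelled flows:
net income $(1-\tau)Y$; taxes $\tau Y$; public inputs $G = (1-\theta)\tau Y$; rents $\theta\tau Y$;
consumption $C = (1-s)[1-(1-\theta)\tau]Y$; savings $s[1-(1-\theta)\tau]Y$;
investment $I = \dot{K} + \delta K$.]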
A production function for rent-seeking
To find out the equilibrium level of rent seeking, one needs to specify a
“production function” relating the time spent in rent seeking to the amount of extraction
achieved. For simplicity, let’s assume that the proportion of government resources
extracted by rent-seekers is a linear function of the proportion of time devoted to rent
seeking, $\varphi$:
$$\theta = b\varphi \qquad (13.12)$$
where b is an exogenous parameter measuring the effectiveness of the rent seeking time.
In a minute we will discuss how this parameter relates to the quality of an
economy’s institutions.
The total amount of resources diverted from the government budget by rent
seekers will therefore be:
$$\Pi = \theta\tau Y = b\varphi\tau Y. \qquad (13.13)$$
Optimal rent seeking at the individual level
Households’ income in this economy is the sum of the wage bill and total theft:
$$\left[(1-\varphi) w + b\varphi\tau y\right] N, \qquad (13.15)$$
where $y = Y/N$.
Each household is endowed with a unit amount of time. Thus, the household will
allocate its time to legal work or to rent seeking (i.e., it chooses $\varphi$) so as to maximize
the term inside brackets in (13.15), taking the wage rate as given. Ruling out corner
solutions, this leads to the following arbitrage condition:
$$w = b\tau y. \qquad (13.16)$$
This condition states that in the optimal allocation of time, spending one extra
hour in formal work has to pay the same as one extra hour in rent seeking.
The equilibrium level of rent seeking
To determine the impact of individual decisions in the aggregate, we need to
determine the wage rate. The wage rate is determined by the labour supply and the
labour demand. The labour demand is implied by the firms’ profit maximization
problem, given the production function (13.10) and the proportional tax on production
($\tau$). Given (13.11), net wages will be equal to:
$$w = (1-\tau)(1-\beta)\frac{Y}{N_Y} = \frac{(1-\tau)(1-\beta)}{1-\varphi}\, y. \qquad (13.14)$$
Substituting the wage rate (13.14) in (13.16) and dividing (for convenience) both
sides by y, the (macro-level) arbitrage condition becomes:
$$b\tau = \frac{(1-\tau)(1-\beta)}{1-\varphi}. \qquad (13.17)$$
The left-hand side of (13.17) gives the marginal product of rent seeking per unit
of per capita income. The right-hand side gives the marginal benefit of working time
(that is, wages) per unit of per capita income. The equilibrium level of $\varphi$ is the one that
leaves workers indifferent between the two activities at the margin and solves equation
(13.17)342.
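The arbitrage condition can be solved explicitly for the share of time spent in rent seeking. The sketch below is illustrative only: it assumes the condition takes the form b·tau = (1 − tau)(1 − beta)/(1 − phi), where the names phi (rent-seeking time), tau (tax rate), beta (capital share) and theta (share of revenues extracted, equal to b·phi) are labels chosen here. The text rules out corner solutions; clipping work effort at 1 is an added assumption to keep the function well defined when rent seeking never pays.

```python
def rent_seeking_equilibrium(b, tau, beta):
    """Solve b*tau = (1 - tau)*(1 - beta)/(1 - phi) for phi.

    Returns (phi, theta), where theta = b*phi is the extracted share
    of government revenues. Assumption added here: if the wage exceeds
    the return to rent seeking even when everybody works full time,
    the equilibrium is the corner phi = 0.
    """
    if not (b > 0 and 0 < tau < 1 and 0 < beta < 1):
        raise ValueError("need b > 0, 0 < tau < 1 and 0 < beta < 1")
    work_effort = (1.0 - tau) * (1.0 - beta) / (b * tau)
    work_effort = min(1.0, work_effort)  # corner case: no rent seeking
    phi = 1.0 - work_effort
    theta = b * phi
    return phi, theta


# Illustrative parameter values, not taken from the text.
phi_star, theta_star = rent_seeking_equilibrium(b=1.0, tau=0.5, beta=0.4)
print("rent-seeking time:", phi_star)
print("extracted share of revenues:", theta_star)
```

Note that the sketch does not check that theta stays below 1; parameter values for which b·phi exceeds 1 fall outside the range where the linear extraction technology makes sense.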
Figure 13.3 illustrates the trade-off between formal work and rent seeking, as
implied by this model. The downward sloping curve WR gives the marginal benefit of
working time (the wage rate, scaled by per capita income). The curve named
MPRS gives the marginal product of rent seeking per unit of per capita income. The
equilibrium level of $\varphi$ is obtained at the intersection of the two curves.
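Figure 13.3. The optimal proportion of time devoted to formal work
[Diagram: the horizontal axis measures work effort, $1-\varphi$ (rent seeking increases towards
the left); the vertical axis measures the wage rate scaled by per capita income, $w/y$, and
the marginal product of rent seeking. The MPRS line, at height $b\tau$, crosses the
downward-sloping curve WR at the equilibrium work effort $1-\varphi^*$.]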
342
This equation determines $\varphi$ and, thereby, $\theta$. To solve for the steady state, you only need equation
(3.10) and a little help from the income flow chart in Figure 13.2 to remember that s shall be replaced by
$[1-(1-\theta)\tau]s$. Due to (13.11), (13.2) and (13.3), the term A shall now be replaced by
$[(1-\theta)\tau]^{\alpha}(1-\varphi)^{1-\beta}$. All the rest is our good old Solow model.
In general, the equilibrium level of $1-\varphi$ is positive. The reason is that, as more
resources are allocated to rent seeking (the economy moves to the left), the reward of
formal work rises, because the aggregate production function exhibits diminishing
returns. Diminishing returns to labour in the formal sector prevent the scenario of
“extermination”.
What happens if the effectiveness of rent seeking time declines?
In this model, any policy or institution affecting the effectiveness of a given
intensity of rent seeking is captured by a change in the parameter b. For instance,
suppose that the government succeeds in replacing fifty per cent of its corrupt
bureaucrats with honest citizens. In this case, one expects the effectiveness of rent seeking
to decline: because rent seekers have a lower chance of being matched to corrupt
officials, the time necessary to find a corrupt official increases, on average (or the
chance of being caught increases).
The implication of a fall in parameter b is analysed in Figure 13.4. The fall in
the marginal product of rent seeking makes formal work more attractive at the margin, so
more time will be dedicated to formal work. Then, because of diminishing returns, the
equilibrium wage rate falls until it becomes equal to the marginal product of rent seeking.
Note that, because individuals devote more time to production, per capita
income will rise. In terms of Figure 13.4, the equilibrium moves from 0 to 1. In terms of
Figure 3.4, the country switches to a higher parallel growth path, approaching the
technological frontier.
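Figure 13.4. The effect of a fall in the effectiveness of rent seeking
[Diagram with work effort on the horizontal axis and, on the vertical axis, the wage rate
scaled by per capita income ($w/y$) and the marginal product of rent seeking. The MPRS line
shifts down from $b_0\tau$ to $b_1\tau$ (MPRS’); the equilibrium moves along WR from point 0 to
point 1, with work effort rising from $1-\varphi_0^*$ to $1-\varphi_1^*$.]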
The model thus stresses the relationship between rent seeking and the quality of
domestic institutions: the greater the ability of public servants to interpret the law
creatively, to apply regulations selectively, to repudiate contracts, to confiscate property or to
delay decisions, the more private agents will be compelled to bribe officials to
guarantee some special treatment. All else constant, these countries are expected to
observe a larger diversion of talent away from production and to enjoy lower levels of
per capita income in the steady state.
What happens if the tax rate increases?
The tax rate is chosen by a benevolent leader. To see the implications of setting
different levels of taxation, let’s see what happens when the leader increases the tax rate
from $\tau_0$ to $\tau_1$. This case is analysed in Figure 13.5.
In light of equation (13.17), there are two distinct effects:
On the one hand, the increase in the tax rate increases the marginal return of rent
seeking (as given in 13.16)343. In Figure 13.5, this effect is represented by an upward
shift in the marginal product of rent seeking. On the other hand, the tax increase causes
a decline in the proportion of income that accrues to households in the form of (net)
wages. In Figure 13.5 this is represented by a downward shift of the WR locus. Both
effects induce individuals to move further towards informality. In Figure 13.5, the
equilibrium moves from 0 to 1.
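The two comparative-statics exercises discussed above (a fall in b and a rise in the tax rate) can be reproduced with a few lines of code. This is a hedged sketch, not the author's procedure: it assumes the two curves are MPRS = b·tau and WR = (1 − tau)(1 − beta)/(work effort), uses illustrative parameter values, and adds a cap on work effort at 1 that the text does not discuss.

```python
def mprs(b, tau):
    """Marginal product of rent seeking per unit of per capita income."""
    return b * tau


def wr(work_effort, tau, beta):
    """Net wage scaled by per capita income, at a given work effort."""
    return (1.0 - tau) * (1.0 - beta) / work_effort


def equilibrium_work_effort(b, tau, beta):
    """Work effort at which WR equals MPRS (capped at 1, an added assumption)."""
    return min(1.0, (1.0 - tau) * (1.0 - beta) / (b * tau))


BETA = 0.4  # illustrative capital share

baseline = equilibrium_work_effort(b=1.0, tau=0.5, beta=BETA)
lower_b = equilibrium_work_effort(b=0.8, tau=0.5, beta=BETA)     # less effective rent seeking
higher_tau = equilibrium_work_effort(b=1.0, tau=0.6, beta=BETA)  # higher production tax

print("baseline work effort:", baseline)
print("after a fall in b:", lower_b)
print("after a rise in tau:", higher_tau)

# Curve shifts caused by the tax increase, evaluated at the baseline work effort.
print("MPRS before/after:", mprs(1.0, 0.5), mprs(1.0, 0.6))
print("WR before/after:", wr(baseline, 0.5, BETA), wr(baseline, 0.6, BETA))
```

In this parameterisation a lower b raises equilibrium work effort, while a higher tax rate lowers it through both channels: the MPRS line moves up and the WR curve moves down.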
A different question is whether the increase in the tax rate is welfare improving
or welfare worsening. In this model, taxes are necessary to finance the public input,
which is essential to production. Hence, a benevolent planner would choose a positive
tax rate for sure. The problem is that a higher tax rate will also induce a higher level of
rent seeking.
As usual, the optimal policy must strike a balance between the benefit of a
higher public provision and the negative impact of taxation. In the simple model with
public inputs (chapter 10), the latter consists of lower savings and slower capital
accumulation. With corruption, taxes also have the effect of inducing more rent seeking.
Given this, the question of the optimal tax rate should be intuitive: to the extent that a
higher tax rate implies a diversion of talent away from production, a benevolent
planner will opt to tax less (and hence to provide fewer public inputs) than in the case
without rent seeking. In order to deter corruption, the optimal policy is to set the tax rate
and the level of public provision below the first best case (10.9)344.
343
Note that individuals do not internalise the adverse effect of their appropriative activities on the tax
base.
344
Park et al (2003).
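Figure 13.5. The effect of a rise in the production tax
[Diagram with work effort on the horizontal axis and, on the vertical axis, the wage rate
scaled by per capita income ($w/y$) and the marginal product of rent seeking. The MPRS line
shifts up from $b\tau_0$ to $b\tau_1$ (MPRS’) and the WR curve shifts down to WR’; the equilibrium
moves from point 0 to point 1, with work effort falling from $1-\varphi_0^*$ to $1-\varphi_1^*$.]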
Box 13.4 Centralized versus decentralized corruption in the Elusive Quest
William Easterly (2001, p. 247-248):

“Under decentralized corruption there are many bribe takers, and their
imposition of bribes is not coordinated among them. Under centralized corruption, a
government leader organizes all corruption activity in the economy and determines the
shares of each official in the ill-gotten proceeds.” (…) “More generally, a strong
dictator will choose a level of corruption that does not harm growth too badly, because
he knows his rake-off depends on the size of the economy. A weak state with
decentralized corruption doesn’t have this incentive to preserve growth. Each individual
bribe taker is too small to affect the overall size of the economy, so he feels little
restraint on getting the most out of his victim”. (…) “This tale gives us insight why
corruption was more damaging to growth in Zaire than in Indonesia. Zaire is a weak
state with many independent official entrepreneurs. Indonesia under Suharto was a
strong state that imposed bribes from the top down. Zaire had negative per capita
growth, while Indonesia had exceptional per capita growth (…).”
Other costs of corruption
The models outlined above illustrate the pervasive effect of corruption in
diverting resources away from production. Like any model, however, they cannot address all
the negative implications of corruption. Other costs of corruption include:
- Uncertainty: with corruption, the enforcement of the law and regulations
becomes more uncertain, making investment decisions riskier.
- Social unfairness: bribery tends to act as a regressive tax, favouring elites
against the general public.
- Distortions: to the extent that corruption fees are not uniformly paid across all
agents in an economy, price incentives will be distorted. Corruption affects particularly
activities with high fixed costs and long-term profitability, because once the fixed costs
become sunk, investors are more vulnerable to additional extortion. It also tends to
affect more innovative projects, as these are more likely to require licenses, specific
regulation, and so on. Established firms, on the contrary, may use their lobbying efforts
to block innovative entrants.
- Low investment in human capital: Since corruption comes along with more
inequality and less public provision, poor people will have less opportunity to
accumulate education and health. On the other hand, to the extent that corruption
discourages investment in new technologies, expected wages for skilled workers will
decline, decreasing the education effort and the skill level of the population. More
talented workers will tend to emigrate.
- Wrong policies: competition policies, environmental controls, building codes,
safety rules, and prudential regulations are in principle created to serve the public
interest. Through bribes and corruption fees, public servants are induced to alter the way
these rules are implemented or even designed in a society, at the cost of the general
interest.
- Political instability: The use of public offices and public institutions for private
advantage undermines the state legitimacy, leading to political instability and violence.
13.5 Coping with decentralized corruption
The problem of a benevolent planner dealing with a large administration he
cannot control is to design institutions that keep the incentives as right as possible. This
will include a combination of rules and discretion, checks and balances and a little help
from social norms.
Corruption as a case for rules
An obvious way of limiting corruption opportunities is by restricting the
discretionary power of government officials. In the real world, public servants tend to
be constrained by a large number of rules, such as those that set the conditions for hiring
and promoting public personnel, rules for selecting private suppliers, rules to support
private investments, and so on.
However, this avenue should not be taken too far. The other side of the coin is that
these constraints introduce rigidity: it may become too difficult to fire an incompetent
civil servant, to reward a competent one, or to efficiently select a private supplier.
Moreover, with excessive rules, government officials will have limited capacity to react to
new circumstances (i.e., those not accounted for in the rule). In order to deal with
unexpected circumstances, some discretion is desirable.
A well-designed institutional framework involves therefore an appropriate
balance between rules and discretionary power. The later has to be complemented with
other ingredients, such as checks and balances, incentive compatible contracts, and
ethical principles. All these ingredients shall be part of the institutional package.
Sticks and carrots
The potential for corruption arises whenever discretionary power is delegated to
a bureaucracy. Therefore, it is in the interest of the benevolent planner to shape the system
of incentives so that it becomes in the interest of public servants to carry out the public
interest. Incentive structures that reward good performance and punish bad
performance represent a most effective way of aligning the incentives of bureaucrats
with the public interest.
To illustrate the different components of an incentive structure, consider the case
of a tax inspector whose job is to investigate whether a firm is liable for taxation or
not345. If the firm is liable for taxation, it must pay a tax $\tau$. However, the tax
inspector may report that the firm is not liable for taxation in exchange for a corruption
fee (a bribe), $\theta$.
The tax inspector's incentive to misreport will depend on the size of the bribe and
also on the penalties he will face in case he gets caught, which will happen with
probability p. Moreover, when caught, the tax inspector is dismissed.
Let g and f be the penalties applying to the firm and to the tax inspector,
respectively. The firm's expected gain from misreporting is $\tau - pg - \theta$. A
risk-neutral firm will be willing to pay the bribe only if:
$0 \le \theta \le \tau - pg$. (13.18)
The actual value of $\theta$ will be a matter of bargaining between the firm and the tax
inspector.
The interesting question is whether wages in the public sector could be used to
deter corruption. To analyse this, let w be the tax inspector's wage premium, defined as
the difference between the tax inspector's wage and the wage he would receive in the
private sector in case of dismissal. The expected return of the tax inspector in case of
misreporting is $\theta + (1-p)w - pf$, which he compares to the return of being honest, w.
He will accept the bribe if $\theta + (1-p)w - pf \ge w$, that is, if:
$\theta \ge p(w + f)$. (13.19)
This last equation shows that the incentive to corruption depends on the
institutional design, as captured by the following instruments: the wage premium (w),
the quality of monitoring (p) and the legal punishments (f, g).
The equation shows that wages paid to the tax inspector can be used to deter
corruption: by increasing the cost of being dismissed, higher wages make the inspector
less willing to accept bribes. The efficiency wage (the one that makes the tax inspector
honest) is:
$w^e = \frac{\theta}{p} - f$ (13.20)
This model points to complementarity between different instruments to deter
corruption. According to (13.20), the efficiency wage will be lower when the
monitoring system is more effective (higher p) and the legal sanctions (f, g) are more
severe. In the limit, with an effective monitoring system and sufficiently high penalties,
the wage premium could be set equal to zero.
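The incentive structure described above can be sketched numerically. The following is a minimal illustration, assuming a risk-neutral firm and inspector as in the text; the function names and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def firm_max_bribe(tax, p, g):
    """Largest bribe a risk-neutral firm would pay: the tax saved minus
    the expected penalty p*g (never below zero)."""
    return max(0.0, tax - p * g)


def inspector_accepts(bribe, w, p, f):
    """Inspector's condition: accept when the bribe covers the expected
    loss from being caught, p*(w + f)."""
    return bribe >= p * (w + f)


def efficiency_wage(bribe, p, f):
    """Wage premium at which the inspector is just indifferent,
    bribe/p - f, floored at zero."""
    return max(0.0, bribe / p - f)


if __name__ == "__main__":
    # Hypothetical numbers: a tax of 30, a firm penalty of 40,
    # an inspector penalty of 20 and a negotiated bribe of 10.
    tax, g, f, bribe = 30.0, 40.0, 20.0, 10.0
    for p in (0.25, 0.5):
        print("p =", p,
              "| max bribe:", firm_max_bribe(tax, p, g),
              "| efficiency wage:", efficiency_wage(bribe, p, f))
```

With these illustrative values, doubling the detection probability is enough to bring the required wage premium down to zero, which is the limiting case mentioned in the text.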
345 The example is adapted from Aidt (2003).
Box 13.5 China and Russia
Dixit (2007, pp. 5-6) illustrates the key role of monitoring and punishment in an
incentive structure, taking as examples the cases of China and Russia:
“The research concerning property rights and corruption has yielded some useful
conceptual distinctions and implications. The first is the distinction between de jure and
de facto effectiveness of governance. The distinction is most vividly seen by contrasting
China and Russia. China, at least until recently, had very little formal legal protection of
property rights, especially those of foreign investors. However, in practice it has been
able to deliver sufficient security to continue to attract large foreign investments. Russia
has a much better legal framework on paper, but reality seems much worse. What
explains the difference?” (…) “High officials in Deng Xiaoping’s government
understood enough about economics to recognize that growth requires markets and
markets require assured property rights. The Communist Party had retained its highly
disciplined organization and so was able to prevent self-seeking behaviour by low-level
officials. The top level in Yeltsin’s Russia may have had the same understanding, but
presumably lacked the disciplined organization”. (…) “The top level of government,
even if itself well-intentioned, needs sufficiently drastic punishments at its disposal to
keep the lower and middle-level agents in check” (…) “Stalin had, and used,
punishments as drastic as one could imagine, and yet could not get his officials to
perform efficiently. What went wrong? Gregory and Harrison (2005) argue that Stalin’s
harsh incentives did not work well because his methods for detecting shirking were
arbitrary, imprecise, and themselves open to corruption. People found that they ran
almost the same risk of being denounced and punished when they worked hard as when
they shirked or cheated. Therefore they did not have the incentive to work hard after all.
An accurate detection procedure is important for the success of any incentive scheme,
including an anti-corruption one”.
Optimal corruption
The discussion above suggests that it is possible to design an incentive-compatible
system that fully resolves the agency problems of delegating discretionary
power to a potentially corrupt public officer.
However, implementing such a system in the real world is not that easy. First,
public officials often have no ability to dismiss incompetent bureaucrats or to set
performance bonuses. Second, setting wages high enough to guarantee the honesty of
bureaucrats would be too expensive, requiring high distortionary taxes elsewhere346.
Third, setting bureaucrats' wages above private sector wages also leads to a
misallocation of talent, as individuals with entrepreneurial skills may feel attracted to
become bureaucrats347. Finally, increasing monitoring does not necessarily have linear
impacts on corruption: anti-corruption agents may themselves turn out to be corrupt.
These difficulties raise the question of whether designing a corruption-free
bureaucracy would be desirable. As in any second-best reasoning, the optimal policy
involves a balance between costs and benefits. In general, the fact that achieving a
corruption-free bureaucracy would be too expensive implies that even a benevolent
planner would optimally allow some corruption to persist. In other words, paying
bureaucrats wages below the efficiency wage may deliver, on balance, higher welfare
than a system that makes all bureaucrats honest.
346 United Nations (2005, p.16): “Many poor countries without adequate resources for decent salaries - or
the checks on political abuse that provide the incentives for performance and the ability to weed out the
inept and corrupt – are unable to afford an effective public sector (…)”.
347 Acemoglu and Verdier (1998, 2000).
Social norms
A critical ingredient in any institutional setup is the quality of social norms.
Social norms are human-devised tools that promote the coordination of actions among
individuals in a society, reducing the need for law enforcement. The underlying
mechanism in the emergence of social norms is that of strategic complementarities: by
establishing what is socially acceptable, social norms impose a psychological cost on
those who take actions that deviate from the common behaviour. This leads individuals
to choose actions that conform to what is acceptable to society.
In the context of the model above, it is easy to account for the role of social
norms: wherever societies erect social norms against thievery and corruption, the
punishment parameters f and g will include a social stigma attached to the event of
being caught. Moreover, in a clean society, the probability of a corrupt official
being matched with honest citizens increases, so the probability p of being caught will
increase too. Hence, the official will feel more constrained in accepting bribes.
In general, wherever ethical principles in government are well developed, the wage
premium necessary to make public officers honest is lower.
Box 13.6. Development accounting by Hall and Jones
The technique of Development Accounting consists of investigating the relative
contributions of inputs and of TFP to per capita income, measuring the variables in
levels, relative to a reference country. The method became famous following the very
influential papers of Klenow and Rodríguez-Clare (1997) and Hall and Jones (1999).
The latter assumed the following production function:
$Y = K^{1/3} (\lambda H)^{2/3}$, with $H = hN$. (13.21)
In (13.21), K denotes physical capital, H human capital, h human
capital per worker, N the number of workers and $\lambda$ is a measure of productivity.
They then proposed the following re-specification, with $y = Y/N$:
$y = \left[ \left( \frac{K}{Y} \right)^{1/2} h \right] \lambda = X \lambda$ (13.22)
This equation breaks down differences in output per capita into differences in
the capital-output ratio, differences in human capital per worker and differences in
productivity. The first two terms (in the square brackets) account for the contribution of
inputs (transpiration, X) and the last term measures TFP (inspiration, $\lambda$). As in standard
growth accounting, the latter is obtained as a residual.
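The step from the production function to the decomposition can be written out explicitly. The lines below are a sketch under two assumptions: that productivity enters in labour-augmenting form, and that it is denoted by $\lambda$ (the symbol is chosen here for illustration).

```latex
\begin{align*}
Y &= K^{1/3} (\lambda h N)^{2/3} \\
Y^{2/3} &= \left( \frac{K}{Y} \right)^{1/3} (\lambda h N)^{2/3}
  && \text{(dividing both sides by } Y^{1/3}\text{)} \\
Y &= \left( \frac{K}{Y} \right)^{1/2} \lambda h N
  && \text{(raising both sides to the power } 3/2\text{)} \\
y \equiv \frac{Y}{N} &= \left[ \left( \frac{K}{Y} \right)^{1/2} h \right] \lambda
\end{align*}
```

Writing the decomposition in terms of the capital-output ratio, rather than capital per worker, means that the exponent 1/2 equals the capital share divided by the labour share, (1/3)/(2/3).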
Hall and Jones (1999) implemented this decomposition for the year 1988. In
their calculations, all variables are measured relative to the United States. Human
capital per worker h was computed as a function of the (observed) average years of
schooling. Figure 13.7 summarises the Hall and Jones (1999) calculations of X and
of the productivity term λ. For each country, the vertical axis measures the estimated
value of λ relative to the corresponding value in the US, in logs. For instance,
according to the figure, Italy has a positive inspiration gap vis-à-vis the world leader,
while the USSR has a negative inspiration gap.
The horizontal axis measures the extent to which a country accumulated more or
less human and physical capital (as captured by X) than the US. For instance, according
to the figure, Norway has a positive transpiration gap while Argentina has a negative
transpiration gap.
In the figure, the positively sloped line is a 45º line, describing the points for
which the two gaps are of the same size. For example, Argentina and Turkey exhibit
patterns that are “balanced” in terms of inspiration gap and transpiration gap. On the
contrary, the USSR had a large inspiration gap and a very small transpiration gap,
reflecting a massive investment in physical and human capital in a poor innovative
environment and low ability to implement foreign technology.
According to the accounting in equation (13.22), a lower level of inspiration
may be “offset” by a higher level of transpiration to deliver a similar level of per capita
income. To capture this, Figure 13.7 displays three negatively sloped lines describing
the different combinations of inspiration gap and of transpiration gap that deliver a
given level of output per capita. For example, the dashed line crossing the origin gives
the combinations of transpiration gap and of inspiration gap that generate the same level
of per capita output as the United States. As shown in the figure, Luxembourg achieved
in 1988 a level of per capita income equal to that of the US with less transpiration and
more inspiration. Among the emerging economies, Czechoslovakia and Guatemala
achieved a level of per capita income equal to that of Turkey, but the former used more
transpiration and the latter used more inspiration.
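The two gaps plotted in the figure can be computed with a few lines of code. The sketch below assumes the decomposition y = X times a productivity residual, with X = (K/Y)^(1/2) h, and uses natural logs; the function names and the data for the two economies are hypothetical, not taken from Hall and Jones (1999).

```python
import math


def development_accounting(y, k_over_y, h):
    """Split output per worker y into inputs X = (K/Y)**0.5 * h and a
    productivity term obtained as the residual y / X."""
    x = math.sqrt(k_over_y) * h
    productivity = y / x
    return x, productivity


def log_gaps(country, leader):
    """Return (transpiration gap, inspiration gap) in logs, relative to
    the reference economy."""
    x_c, a_c = development_accounting(**country)
    x_l, a_l = development_accounting(**leader)
    return math.log(x_c / x_l), math.log(a_c / a_l)


# Hypothetical data (assumption): output per worker, capital-output
# ratio and human capital per worker for a leader and a follower.
leader = dict(y=100.0, k_over_y=2.5, h=2.0)
country = dict(y=25.0, k_over_y=1.6, h=1.25)

if __name__ == "__main__":
    transpiration_gap, inspiration_gap = log_gaps(country, leader)
    print("transpiration gap:", round(transpiration_gap, 3))
    print("inspiration gap:  ", round(inspiration_gap, 3))
    print("income gap (sum): ", round(transpiration_gap + inspiration_gap, 3))
```

Because the decomposition is multiplicative, the two log gaps add up to the log income gap. This is why combinations delivering the same income lie on a line with slope minus one, and why an economy with equal gaps, such as the hypothetical one here, sits on the 45º line.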
Figure 13.7: Inspiration gap versus transpiration gap for 127 countries (US=0.00)
[Scatter plot omitted. Vertical axis: inspiration gap (USA = 0.00). Horizontal axis: transpiration gap (USA = 0.00).]
Source: The data is from Hall and Jones (1999).
An interesting pattern in Figure 13.7 is that countries with large inspiration gaps
also tend to exhibit large transpiration gaps. In other words, the two variables are
strongly correlated across countries.
Hall and Jones (1999) emphasised the causality from productivity (λ) to inputs (X). They first
documented that differences in λ have a larger role in explaining cross-country
differences in per capita output than differences in physical capital intensity and
educational attainment. They then conjectured that cross-country differences in physical
and human capital accumulation and productivity are fundamentally related to cross-
country differences in “social infrastructure”, that is, “the institutions and government
policies that determine the economic environment within which individuals accumulate
skills, and firms accumulate capital and produce output” (p.84). A social infrastructure
will be favourable to economic development if it provides an environment that is
supportive of investment and innovation, rather than promoting theft, corruption or
confiscatory taxation.
To test this proposition, the authors constructed a measure of “social
infrastructure”, as a combination of two indexes. The first is a measure of government
anti-diversion policies (defined as an average of five indexes: law and order,
bureaucratic quality, corruption, risk of expropriation and government repudiation of
contracts). The second is a measure of openness to trade. They found a “powerful and
close association” between output per worker and this measure of social infrastructure,
after controlling for feedback effects. They also show that countries with a good social
infrastructure tend to have high capital intensities, high human capital per worker and
high productivity.
The Hall and Jones (1999) results point to a distinction between the proximate
causes of growth (as captured by capital accumulation) and the deep causes, which
ultimately determine individuals' willingness to produce and invest. According to
this interpretation, policy choices such as the size of the government, the rate of
inflation and innovation are all best thought of as outcomes (that is, proximate causes)
rather than as determinants. This work points to the critical role of institutions as drivers
of long run economic performance.
13.6. When institutions become dysfunctional
The discussion in this chapter has pointed to the critical role of institutions that
impose constraints on political elites, limiting their ability to prey on common citizens.
A problem arises, however, in that the ability of a society to design good institutions is
significantly weakened when the level of corruption is already high. This fact brings a
new source of vicious cycle: wherever corruption is high because the quality of
institutions is low, the quality of institutions tends to remain low because the level of
corruption is high.
Why does corruption bring more corruption?
There are many reasons to believe that the social pressure against corruption
declines with the level of corruption.
First, when corruption is generalized, matching rent seekers and corrupt
bureaucrats is less of a problem. The search costs for new corruption opportunities
are low, increasing the relative reward of rent seeking time. Thus, individuals will have
little incentive to remain honest.
Second, when the number of corrupt civil servants is very large, the probability
of any one of them being caught and prosecuted is low: detection does not entail the
same social stigma as in a corruption-free society (it is hard to punish someone severely
when everyone else is doing the same), auditors may themselves be corrupt and senior
officials have the incentive to make sure that corrupt officials are not punished348.
Third, when corruption is widespread, policies envisaging the strengthening of
institutions are more likely to be blocked, because those who have the power to create
rules may be more interested in making sure that corruption keeps paying off.
Finally, corruption undermines people's trust in the political institutions,
resulting in a weak civil society, and clearing the way for corruption to flourish.
In terms of the model above, these considerations suggest that some parameters
in equations (13.18)-(13.20), rather than exogenous, should depend on the level of
corruption. In particular, the parameters measuring the likelihood of detection (p) and
the legal punishments (f, g), instead of constant, could be modelled as decreasing
functions of the economy-wide incidence of corruption.
348 Mauro (2004) makes the point: “Consider, for example, the case of a civil servant in an administration
where everybody including his superiors, is very corrupt. It would be difficult for this civil servant to
decline offers of bribes in exchange for favours, because his superiors may expect a portion of the bribe
for themselves. By contrast, in bureaucracies that are generally honest, a real threat of punishment deters
individual civil servants from behaving dishonestly”.
Thus, just as we concluded that corruption tends to flourish in countries where
institutions are poorly designed, checks on discretionary power are missing and
punishment is weak, we now conclude the reverse: countries where corruption is
generalized tend to have poorly designed institutions, weaker checks on discretionary
power and weaker punishment, making corruption more attractive.
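The idea that detection and punishment weaken as corruption spreads can be illustrated with a small sketch. The linear forms p(c) = p0 (1 - c) and f(c) = f0 (1 - c), the incidence variable c and all numbers are assumptions made here for illustration only; the wage formula is the one implied by the inspector's acceptance condition, bribe >= p (w + f).

```python
import math


def detection_probability(c, p0=0.5):
    """Assumed form: detection becomes less likely as the economy-wide
    incidence of corruption c (between 0 and 1) rises."""
    return p0 * (1.0 - c)


def penalty(c, f0=20.0):
    """Assumed form: the effective penalty (legal sanction plus stigma)
    shrinks as corruption becomes generalized."""
    return f0 * (1.0 - c)


def efficiency_wage_given_corruption(bribe, c):
    """Wage premium that keeps an official honest, bribe/p(c) - f(c),
    floored at zero; infinite when detection is impossible."""
    p = detection_probability(c)
    if p == 0.0:
        return math.inf
    return max(0.0, bribe / p - penalty(c))


if __name__ == "__main__":
    for c in (0.0, 0.5, 0.8):
        print("incidence", c, "-> efficiency wage",
              round(efficiency_wage_given_corruption(10.0, c), 2))
```

Under these assumed forms, the wage premium needed to keep an official honest rises quickly with the incidence of corruption, so an incentive scheme that is affordable in a clean society may become prohibitively expensive in a corrupt one.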
Multiple equilibria
A dramatic feature of generalized corruption is that, once installed, it becomes
very difficult to eradicate. An explanation for this fact relates to the endogeneity of
institutions: when the level of corruption is very high, the disciplinary role of
institutions declines dramatically.
To analyse this in terms of the model outlined in Section 13.3, remember that
the quality of institutions in that model is captured by the parameter measuring the
effectiveness of rent seeking time, b. So far, this parameter was assumed exogenous.
In the following, let’s assume instead that this parameter is an increasing function of ψ,
the fraction of time devoted to rent seeking: that is, as the society devotes more time to
rent seeking, the quality of institutions decreases and hence the effectiveness of the rent
seeking effort increases.
This case is depicted in Figure 13.7, where the line describing the marginal
product of rent seeking is downward sloping, instead of horizontal. In that case, the
model may have two equilibria, one with low corruption (L) and another with high
corruption (H):
- If the economy starts out in L, the corruption level is low and the probability
of corrupt officials being caught is very high. Because the marginal product
of rent seeking is low, people dedicate a small fraction of their time to
corruption. In this case, the economy will evolve along a high per capita
income virtuous cycle.
- If the economy starts out in H, the probability of someone being charged for
corruption is low. This, in turn, induces people to engage in rent-seeking,
self-validating the high corruption level and the equilibrium path with low
per capita income.
As is generally the case in models with multiple equilibria, history plays a role in
determining which equilibrium will actually occur: the economy will be poor and
corrupt if it starts out poor and corrupt; the economy will be rich and clean if it starts
out rich and clean. Depending on the initial conditions, two otherwise identical
economies may end up with different levels of per capita income349.
Escaping the trap
349 According to Hall and Jones (1999), the idea of a good equilibrium with little diversion and a high
probability of punishment and of a bad equilibrium where enforcement is ineffective dates back to the 17th
century philosopher Thomas Hobbes. An alternative explanation for multiple equilibria in corruption
was proposed by Gradstein (2004). In his model, law enforcement is indivisible, so societies can only
choose between two corner regimes, (a) full enforcement of the law and (b) minimal enforcement. Since
full enforcement requires a considerable investment effort, such investment will only take place in
rich economies. When the economy is poor, individuals cannot afford to pay for the full enforcement
equilibrium, so that the economy will remain poor.
To see how difficult it is to escape the bad equilibrium H, suppose that the fraction
of time devoted to rent-seeking decreased slightly to a point between H and L. In that
case, the marginal product of rent seeking would be temporarily higher than real wages,
and the incentives would exist for corruption to increase. Hence, the economy would
return to point H. Because the equilibrium H is stable and is dominated by another
possible outcome (equilibrium L), it is a poverty trap350.
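The pull back towards H can be illustrated with a stylized numerical sketch. Everything in it is an assumption made for illustration: the effectiveness of rent seeking is taken to be linear, b(psi) = B0 + B1*psi; the return to rent seeking scaled by per capita income is b(psi) times the tax rate; the after-tax wage scaled by per capita income is 0.5 (1 - t) / (1 - psi), as under a Cobb-Douglas technology with a labour exponent of 0.5; and time moves towards the better-paid activity at a constant speed.

```python
import math

TAX = 0.3            # hypothetical tax rate t
LABOUR_SHARE = 0.5   # assumed labour exponent in production
B0, B1 = 1.0, 3.0    # hypothetical coefficients of b(psi) = B0 + B1*psi


def mprs(psi):
    """Return to rent-seeking time, scaled by per capita income."""
    return (B0 + B1 * psi) * TAX


def wage(psi):
    """After-tax wage scaled by per capita income when a share
    1 - psi of time is devoted to production."""
    return LABOUR_SHARE * (1.0 - TAX) / (1.0 - psi)


def equilibria():
    """Interior solutions of mprs(psi) == wage(psi). The condition
    (B0 + B1*psi)*(1 - psi) = c is a quadratic in psi:
    B1*psi**2 - (B1 - B0)*psi + (c - B0) = 0."""
    c = LABOUR_SHARE * (1.0 - TAX) / TAX
    disc = (B1 - B0) ** 2 - 4.0 * B1 * (c - B0)
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([((B1 - B0) - root) / (2.0 * B1),
                   ((B1 - B0) + root) / (2.0 * B1)])


def simulate(psi, steps=500, speed=0.5):
    """Assumed adjustment rule: rent-seeking time rises while it pays
    more than the wage and falls otherwise."""
    for _ in range(steps):
        psi += speed * (mprs(psi) - wage(psi))
        psi = min(max(psi, 0.0), 0.99)
    return psi


if __name__ == "__main__":
    psi_L, psi_H = equilibria()
    print("low-corruption equilibrium: ", round(psi_L, 4))
    print("high-corruption equilibrium:", round(psi_H, 4))
    print("after a modest clean-up to psi = 0.45:", round(simulate(0.45), 4))
```

In this sketch, any starting point between the two equilibria drifts back to H, because rent seeking pays more than production throughout that range. The sketch is only meant to illustrate the local stability of H; it makes no claim about the behaviour of the economy in the neighbourhood of L.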
The model provides an explanation of why it is very difficult to eradicate
endemic corruption once installed. In light of this model, modest anti-corruption
initiatives (like using the police, the courts or even creating ethics offices) are more
likely to move the economy somewhere between H and L, without long lasting effects.
This model resembles the Big Push, in that for the economy to escape the trap, one
would need a huge anti-corruption effort351.
The model also explains why sometimes countries hit by a historical accident
(like a war, a terms of trade deterioration or a natural catastrophe) fall into poverty traps,
being unable to get out of the vicious circles they are stuck in. For example, temporary
political instability may lead to an increase in the corruption level, sliding the economy
from the good equilibrium L to the bad equilibrium H. Once institutions, informal rules
and norms of behaviour have already been adapted to a corrupted system, the high
corruption equilibrium will be self-sustained. The economy will be locked in the bad
equilibrium and will not move back to the original equilibrium, even though the original
shock has dissipated352.
The self-sustained aspect of institutions helps understand why many developing
countries have failed to improve their living standards despite considerable external
support.
Figure 13.7. Corruption and vicious cycles
350 Mauro (2004) discusses other possibilities regarding the existence and stability of different equilibria.
351 Skidmore (1996) and Aidt (2003) point to post-WWII Hong Kong as a successful example: at the
beginning of the 1970s, when corruption was generalized, the Hong Kong authorities created the
Independent Commission Against Corruption, with extensive powers to investigate and prosecute
corruption. This commission, properly empowered by the authorities, managed to reduce corruption
drastically within a single decade.
352 According to Murphy et al. (1993), this kind of reasoning helps understand what happened during
military instability in Africa or during the collapse of communism in Russia.
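[Figure omitted. Vertical axis: w/y (wage scaled by per capita income) and the marginal product of rent seeking. The curves WR and MPRS cross at the equilibria H and L, with the corresponding points marked on the horizontal axis.]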
Box 13.7 Measuring Governance
The understanding that institutions are a crucial ingredient for economic
performance led different authors to search for alternative measures of institutional
development. A major contribution is Kaufmann et al. (1999a). The authors use a broad
concept of governance, which they define as the “traditions and institutions that
determine how authority is exercised in a particular country”.
The authors identified several hundred cross-country indicators on various
dimensions of governance from many different sources and used them to compute six
composite indicators of governance, measuring, respectively: “(i) Government
Effectiveness: the quality of public service delivery and competence and political
independence of the civil service; (ii) Regulatory Quality: the relative absence of
government controls on goods markets, banking systems and international trade; (iii)
Rule of Law: the protection of persons and property against violence and theft,
independent and effective judges and contract enforcement; (iv) Control of Corruption:
public power is not abused for private gain or corruption; (v) Voice and Accountability:
the extent to which citizens can choose their governments and have political rights, civil
liberties and independent press; (vi) Political Stability and absence of
Violence/Terrorism: the likelihood that the government will not be overthrown by
unconstitutional or violent means”.
Table 13.1 shows how different countries have ranked in terms of these six
indexes as of 2007. Not surprisingly, countries like Finland, Switzerland and Australia
rank at the top, while countries like Chad, Sudan, Somalia, the Democratic Republic of
Congo and Zimbabwe rank at the bottom, for all dimensions. Kaufmann et al. (1999b)
found a strong causal effect running from improved governance to development
indicators.
Table 13.1 – Indicators of Governance
Columns report percentile ranks (%Rank): (i) Government Effectiveness; (ii) Regulatory
Quality; (iii) Rule of Law; (iv) Control of Corruption; (v) Voice and Accountability;
(vi) Political Stability & Absence of Violence/Terrorism.
Country (i) (ii) (iii) (iv) (v) (vi)
ARGENTINA 52 22 39 43 57 50
AUSTRALIA 97 96 95 95 93 79
BOTSWANA 73 65 70 80 62 78
BRAZIL 53 53 43 52 59 37
CAPE VERDE 66 48 67 74 75 83
CHAD 4 12 6 5 9 6
CHILE 86 91 88 90 77 66
CHINA 61 46 42 31 6 32
DEM. REP. OF CONGO 1 8 1 4 9 2
EGYPT 39 43 52 36 12 22
FINLAND 97 96 97 100 98 99
GERMANY 92 93 94 93 95 81
HAITI 8 20 5 3 26 11
INDIA 57 46 56 47 59 18
INDONESIA 42 44 27 27 43 15
ISRAEL 84 83 73 75 70 13
ITALY 65 74 61 71 87 62
SOUTH KOREA 86 79 75 68 67 62
MAURITIUS 72 68 74 70 73 72
MEXICO 60 64 34 49 49 25
MOROCCO 55 51 51 53 29 27
NEPAL 22 27 31 30 23 3
PAKISTAN 28 29 20 21 19 1
PERU 38 58 27 48 49 20
POLAND 67 72 59 61 72 67
PORTUGAL 80 83 82 84 90 73
RUSSIA 42 35 17 16 20 23
SAUDI ARABIA 51 52 59 58 7 25
SINGAPORE 100 99 95 96 35 90
SOUTH AFRICA 75 66 57 67 69 51
SOMALIA 0 0 0 0 3 0
SUDAN 11 9 4 5 5 2
SWITZERLAND 100 93 100 98 100 99
THAILAND 62 56 53 44 30 17
TURKEY 64 60 53 59 42 21
UNITED KINGDOM 94 98 93 94 94 66
UNITED STATES 91 91 92 91 85 56
ZIMBABWE 3 1 2 4 8 12
Source: World Bank, Governance Matters 2008.

Note: The figures refer to the percentage of countries in the sample with lower performance than the country at hand in each specific dimension.
13.7. Discussion
Throughout this book, we have stressed the key role of human-devised government
policies for economic development. To a large extent, the analysis has been optimistic
regarding the growth opportunities of laggard nations: after all, if achieving economic
development was only a matter of adjusting the policy mix, then this should be within
reach for any poor country in the world.
This chapter traces a less optimistic view of economic development. The key
issue is that policies are embedded in institutions and institutions are not always shaped
so as to promote the social welfare. In societies where property rights are not well
enforced and where controls over the political elites are lacking, there will be deviation
of resources away from public infrastructure, misallocation of talent and bad economic
policies.
The discussion in this chapter emphasized the critical role of institutions in
shaping the incentives of policymakers, bureaucrats and economic agents in general. A
well-balanced institutional design has to include the separation of political powers,
transparency in decision-making, a strong civil society, checks and balances, rewards
and punishment, and a balance between rules and discretionary power. A well-balanced
institutional design should also take into account that achieving zero corruption would be
too expensive. Sound social norms play a critical role in the incentive structure,
enhancing the effectiveness of formal institutions and reducing the costs of law
enforcement.
A critical problem is that institutions are not easy to change. Institutions emerge
as a result of complex games involving the different groups in a society. When
corruption becomes so generalized that institutions, informal rules and norms of
behaviour become adapted to it, the economy will find itself locked in a low-level poverty
trap. Poor countries will hardly have the financial means and the political strength to escape
this trap, so corruption becomes endemic.
- Central planners and civil servants are not necessarily benevolent. Whenever the institutional
environment allows it, they may deviate from the social interest to pursue their own objectives, at the
cost of the quality of the policy environment.
- When corruption affects the high level of the administration only, it is in the interest of the corrupt
leader to look after aggregate efficiency and to organize the extraction activity, in order to maximize
theft opportunities.
- The theft opportunities of a non-benevolent leader depend on the constraints imposed by the political
and legal institutions that he cannot change.
- When corruption affects all levels of the administration, each corrupt official will try to maximize his
corruption fees without taking into account the impact on the economy as a whole. This gives rise to
a coordination failure that resembles the “tragedy of the commons”. The deviation of resources to
corruption comes at a cost of low production in the formal goods sector. As long as there are
diminishing returns, the higher the corruption activity in the economy, the higher will be the wage
rate in formal production, creating the incentives for some fraction of the labour force to remain
productive. This prevents the economy from being “exterminated”.
- When corruption is decentralized, a benevolent leader concerned with the social welfare will provide
less of the public good than in the case of centralized corruption. The reason is that the required
increase in taxation would induce an increase in corruption activity. This case provides another
example of second-best decision-making.
- In order to control corruption, benevolent leaders have to strike an ideal balance between the rigidity
cost of rules and the opportunities raised with discretion. In the optimal institutional setup,
discretionary decisions should be complemented with efficiency wages and checks and balances, in order
to keep the incentives right. The optimal corruption level is not, however, zero, as that would be
socially too costly to achieve.
- Social norms and civil society play a critical role in deterring corruption. However, societies do not
naturally gravitate towards good institutions: whenever corruption becomes generalized, strategic
complementarities create the incentives for each individual in the society to remain corrupt. In
this case, a society may find itself locked in a low-level poverty trap, with high corruption and
dysfunctional institutions that nobody has an interest in changing.
Key concepts
- Corruption
- Rent seeking
- The “Grease in the wheels” argument
- Centralized vs. decentralized corruption
- Incentive compatible contracts
- No-natural gravitation
- Strategic complementarities in corruption
- Development accounting
Essay questions
a) Comment: “Corruption is an efficient mechanism to allocate the scarce
time of bureaucrats to the most productive uses”.
b) Comment: “The theft opportunities of a non-benevolent leader depend on
the incentives embodied in political, administrative and legal institutions
he cannot change”.
c) Explain why the provision of key public inputs is lower under
decentralized corruption than in the kleptocrat case.
d) Comment: “Corruption is bad for growth”.
e) Comment: “A well-designed institutional framework involves an
appropriate balance between rules, discretionary power, incentive
compatible contracts, and well founded ethical principles”.
f) Why does it pay more to be corrupt when others are corrupt too?
g) Explain why sometimes countries hit by an historical accident get locked
in high corruption poverty traps.
Exercises
13.1.
Consider an economy with a large number of identical firms. Each firm i at time t
produces a homogeneous consumption good according to:
$Y_i = A_t K_i^{1/2} N_i^{1/2}$, where $A_t = A e^{0.015t}$ and $L_t = e^{\gamma t} N_t$.
In this economy the saving rate is 20%, the population is constant and the capital
depreciation rate is 3%.
a) Find out the steady state level of per capita income in this economy.
Consider A=0.3.
b) Assume now that $A = (G/Y)^{1/3}$, where G denotes public inputs.
i. Explain this specification. To which type of public inputs does it
apply?
ii. Compute the aggregate production function of this economy.
iii. Explain why there is a market failure in this model.
c) Consider that the provision of the public input is financed with a
production tax, but ø percent of the tax revenues are wasted in
unproductive uses. Write down the government budget constraint.
iv. Compute the benevolent planner solution. Explain the trade-off
involved.
v. Now suppose that you were a kleptocrat, whose aim was to
maximize your theft on government expenditures. Which
solution would you choose?
vi. Compare the results of questions iv) and v) in a graph.
13.2.
Consider an economy with a large number of identical firms. Each firm i at time t
produces a homogeneous consumption good according to $Y = A K^{0.5} N_Y^{0.5}$, where
$N_Y = (1-\psi)N$ is the total work-time devoted to production and ψ is the fraction of
time devoted to rent-seeking. Assume also that the proportion of government
resources extracted by rent-seekers (ø) is a linear function of the seeking effort
(ψ), that is, ø = bψ, where b is an exogenous parameter that captures the marginal
product of rent-seeking time.
a) Considering that the government imposes a tax on income, derive the
expression for the wage rate in this economy.
b) Write down the expression for the total amount of resources diverted
from the government by rent-seekers. Explain it.
c) Find the equilibrium level of ψ.
d) With the help of a graph, analyse the implications of:
i. A fall in the tax rate.
ii. A decline in the productivity of rent seeking.
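A possible numerical companion to parts c) and d) of Exercise 13.2, offered as a sketch and not as the intended solution. It rests on assumptions that the exercise does not state: firms are competitive and pay the tax rate τ on production, so the wage is w = (1 − τ)·0.5·Y/N_Y; one unit of rent-seeking time yields bτY/N; and workers are indifferent between the two activities in equilibrium. Under these assumptions 1 − ψ = 0.5(1 − τ)/(bτ). The function name and the parameter values are illustrative only.

```python
def equilibrium_psi(tau, b, labour_share=0.5):
    """Share of time spent rent seeking under the assumed arbitrage condition.

    Assumed condition (an assumption of this sketch, not taken from the text):
        (1 - tau) * labour_share / (1 - psi) = b * tau
    The result is clamped to [0, 1] to cover corner solutions.
    """
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    if b <= 0:
        raise ValueError("b must be positive")
    psi = 1 - labour_share * (1 - tau) / (b * tau)
    return min(1.0, max(0.0, psi))


if __name__ == "__main__":
    baseline = equilibrium_psi(tau=0.5, b=1.0)   # hypothetical baseline values
    lower_tax = equilibrium_psi(tau=0.4, b=1.0)  # part d) i: a fall in the tax rate
    lower_b = equilibrium_psi(tau=0.5, b=0.8)    # part d) ii: less productive rent seeking
    print(baseline, lower_tax, lower_b)
```

In this sketch both a lower tax rate and a lower b reduce ψ, which is the comparative-statics pattern that part d) asks to illustrate in a graph.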
13.3
Consider an economy where workers are free to decide the time they allocate to
formal work and to rent seeking. In this economy, the production function is given by
$Y = A N_Y^{0.16}$, where $N_Y = (1-\psi)N$ is the time allocated to formal work. The tax on
production is equal to $\tau = 1/2$.
a) Show that individual choices lead to the following condition:
$1 - \psi = 0.16/b$, where b measures the effectiveness of rent seeking
effort. (Hint: find the demand for labour and use the arbitrage
condition $w = by$.)
b) Explain the relationship above, illustrating with the following values for
b: b = 0.2 and b = 0.8.
c) Now assume that the effectiveness of rent seeking is itself a positive
function of the level of rent seeking: $b = \psi$. Why should that be?
Describe the equilibria of the model, discuss their stability and explain
the implications of this model.
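The multiplicity of equilibria asked about in part c) of Exercise 13.3 can be explored numerically. The sketch below is not the author's solution. It assumes that b = ψ, that the after-tax wage is (1 − τ)·0.16·Y/N_Y, that one unit of rent-seeking time yields bτY/N, and that workers drift towards whichever activity pays more. The helper names are illustrative.

```python
import math

LABOUR_SHARE = 0.16  # exponent on N_Y in Y = A * N_Y**0.16
TAU = 0.5            # tax rate on production


def return_gap(psi, labour_share=LABOUR_SHARE, tau=TAU):
    """Rent-seeking return minus after-tax wage, per unit of Y/N, with b = psi (assumed)."""
    wage = (1 - tau) * labour_share / (1 - psi)
    rent = psi * tau
    return rent - wage


def interior_equilibria(labour_share=LABOUR_SHARE, tau=TAU):
    """Roots of psi * (1 - psi) = labour_share * (1 - tau) / tau, if any exist."""
    c = labour_share * (1 - tau) / tau
    disc = 1 - 4 * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return [(1 - root) / 2, (1 + root) / 2]


def is_stable(psi_star, eps=1e-4):
    """Stable if rent seeking pays more just below psi_star and less just above it."""
    return return_gap(psi_star + eps) < 0 < return_gap(psi_star - eps)


if __name__ == "__main__":
    for psi_star in interior_equilibria():
        print(round(psi_star, 4), "stable" if is_stable(psi_star) else "unstable")
    print("gap at psi = 0:", return_gap(0.0))
```

Under these assumptions the interior equilibria are ψ = 0.2 and ψ = 0.8, the same values that part b) uses for b. Only the higher one passes the stability check. At ψ = 0 rent seeking pays less than formal work, so the corner without rent seeking is also self-sustaining, and which outcome prevails depends on where the economy starts.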
13.4
Consider a tax collector whose job is to investigate whether a firm is liable for
taxation or not. If so, the firm has to pay a tax τ. However, the tax collector may report
that the firm is not liable for taxation in exchange for a bribe b. Assume also that, if the
government discovers the corrupt act (which happens with probability 0.5), the firm pays a penalty g
= 1 and the tax collector pays a penalty f = 0.5.
a) If τ = 2, what would be the maximum bribe amount the firm would be
willing to offer?
b) Write down the expression for the expected return of the tax collector in
case of misreporting.
c) What would be the minimum wage rate that would keep the tax collector
honest?
References
Abramovitz, M., 1956. “Resource and output trends in the United States since 1870”. American
Economic Association Papers and Proceedings, 46(2): 5-23, May.
Abramovitz, M., 1979. Rapid growth potential and its realization. Reprinted in Abramovitz, 1989,
Thinking about Growth and Other Essays on Economic Growth and Welfare (Cambridge University
Press, Cambridge), 187-219.
Abramovitz, M., 1986. Catching up, forging ahead and falling behind. Reprinted in Abramovitz, 1989,
Thinking about Growth and Other Essays on Economic Growth and Welfare (Cambridge University
Press, Cambridge), 220-244.
Acemoglu, Daron 2008. Introduction to Modern Economic Growth.
Acemoglu, Daron (2003b) “Labor- and Capital-Augmenting Technical Change.” Journal of the European
Economic Association, 1(1), pp. 1-37.
Acemoglu, D. 2003. "Root causes: a historical approach to assessing the role of institutions in economic
development". Finance and Development , 27-30, June.
Acemoglu, D., 1995. Reward structure and the allocation of talent. European Economic Review 39, 17-
33.
Acemoglu, Daron and Simon Johnson (2005) “Unbundling Institutions.” Journal of Political Economy,
113, pp. 949-995
Acemoglu, D., Johnson, S. and Robinson, J., 2001. "Colonial origins of comparative development: an
empirical investigation". American Economic Review 91, 1369-1401.
Acemoglu, D., Johnson, S. and Robinson, J., 2002. "Reversal of fortune: geography and institutions in
the making of the modern world income distribution". Quarterly Journal of Economics CXVII
1231-94.
Acemoglu, Daron, Simon Johnson and James Robinson (2005a) “Institutions as a Fundamental Cause of
Long-Run Growth.” in Philippe Aghion and Steven Durlauf (editors) Handbook of Economic
Growth, North Holland, Amsterdam, pp. 384-473.
Acemoglu, D., Robinson, J., “Economic Backwardness in Political Perspective”, NBER Working Papers
8831, National Bureau of Economic Research.
Acemoglu, D., and Verdier, T., 1998. Property rights, corruption and the allocation of talent: a general
equilibrium approach. Economic Journal 108, 1381-403.
Acemoglu, D., and Verdier, T., 2000. The choice between market failure and corruption. American
Economic Review 90 (1), 194-211.
Acemoglu, D., Zilibotti, F., 2001. Productivity differences. Quarterly Journal of Economics 116, 563-606.
Acemoglu, D., Zilibotti, F., 1997. Was Prometheus unbounded by chance? Risk, diversification and
growth. Journal of Political Economy 105, 710-751.
Acemoglu, D., Aghion, P., Zilibotti, F., 2006. Distance to frontier, selection and economic growth,
Journal of the European Economic Association, March 2006, Vol. 4, No. 1, Pages 37-74.
Adams, W., Dirlam, J., 1966. Big steel, invention and innovation. Quarterly Journal of Economics 80,
167-189.
Ades, A., Glaeser, E., 1994. Evidence on growth increasing returns and the extent of the market. NBER
Working Paper series 4714, April.
Ades, A., Di Tella, R., 1997. National Champions and Corruption: some unpleasant interventionist
arithmetic. Economic Journal, 107, 1023-42.
Ades, A., Di Tella, R., 1999. Rents, competition and corruption, American Economic Review 89, 982-93.
Afonso, A., Schuknecht, L., Tanzi, V., 2005. Public Sector Efficiency: an International Comparison.
Public Choice 123, 321-347.
Agell, J., Lindh, T. and Ohlsson, H., 1997. Growth and the public sector: a critical review essay.
European Journal of Political Economy 13, 33-52.
Aghion, P., Bloom, N., Blundell, R., Griffith, R., Howitt, P., 2005. “Competition and Innovation: An
inverted-U relationship”. Quarterly Journal of Economics 120, 701-728.
Aghion, P., Harris, C., Howitt, P., Vickers, J., 2001. Competition, imitation and growth with step-by-step
innovation. Review of Economic Studies 68, 467-492.
Aghion, P., Harris, C., Vickers, J., 1997. Competition and growth with step-by-step innovations: an
example. European Economic Review, Papers and Proceedings, 771-782.
Aghion, P. and Howitt, P., Violante, G., 2002. General purpose technology and wage inequality. Journal
of economic growth 7, 315-345.
Aghion, P. and Howitt, P., 2009. The economics of growth. Cambridge, MA: MIT Press.
Aghion, P., Howitt, P. 2005. Growth with quality improving innovations: an integrated approach. In
Aghion, P., and Durlauf, S. (eds), Handbook of Economic Growth, North Holland, Amsterdam,
Chapter 2, 67-110.
Aghion, P. and Howitt, P., 1998. Endogenous growth theory. Cambridge, MA: MIT Press.
Aghion, P. and Howitt, P., 1992, “A model of growth through creative destruction”. Econometrica, 323-
51.
Agrawal, A., Cockburn, I., McHale, J., 2006. “Gone but not forgotten: knowledge flows, labour mobility,
and enduring social relationships”. Journal of Economic Geography 6 (5), 571-591.
Aidt, T., 2003. "Economic Analysis of Corruption: a survey". The Economic Journal 113, November,
632-653.
Alesina, A., Perotti, R., 1996. “Income distribution, political instability and investment”. European
Economic Review 40, 1203-1228.
Alesina, A., Weder, B., 2002. “Do corrupt governments receive less foreign aid?”. American Economic
Review 92, 1126-1137.
Alesina, A. and Tabellini, G., 1987. A positive theory of fiscal deficits and government debt in a
democracy. NBER Working Paper 2308, July.
Arnold, J., Bassanini, A., Scarpetta, S., 2007. “Solow or Lucas?: testing growth models using panel data
from OECD countries”. OECD Economics Department Working Papers Nº 592, OECD
Publishing.
Arrow, K., 1962. “Economic welfare and the allocation of resources for invention”, in National Bureau of
Economic Research, The Role and Direction of Inventive Activity, Princeton, Princeton University
Press.
Arrow, K., 1962. “The economic implications of Learning by Doing”, Review of Economic Studies 29:
155-173.
Atkinson, A., and J. Stiglitz, 1980. Lectures on Public Economics. McGraw Hill, New York, NY.
Atkinson, A., and J. Stiglitz, 1969. A new view of technological change. Economic Journal, pp. 573-578.
Audretsch and Feldman (1996)
Azariadis, C., Stachurski, J., 2005. Poverty traps. in: Philippe Aghion & Steven Durlauf (ed.), Handbook
of Economic Growth, edition 1, volume 1, chapter 5.
Bairoch, P., 1982. International Industrialization levels from 1750-1980. Journal of European Economic
History 11 (2), 269-333.
Balassa, B. (1965), “Trade Liberalization and ‘Revealed’ Comparative Advantage”, The Manchester
School of Economic and Social Studies, vol. 33(2), pp. 99-123.
Balassa, B., 1985. Exports, policy choices and economic growth in developing countries after the 1973 oil
shock. Journal of Development Economics 18, 23-35.
Bardhan, P., 1997. Corruption and development: a review of the issues. Journal of Economic Literature
35, 1320-1346.
Barro, R., 1990. “Government spending in a simple model of endogenous growth”. Journal of Political
Economy 98, 103-125.
Barro, R. J. , 1991. “Economic growth in a cross-section of countries”, Quarterly Journal of Economics
106:2, 407-43.
Barro, R., 1997. Determinants of economic growth: a cross country empirical study. The MIT Press,
London England.
Barro, R. and Sala-i-Martin, X., 1991. “Convergence across states and regions”. Brookings Papers on
Economic Activity 1, 107-158.
Barro, R. and X. Sala-i-Martin, 1992. “Convergence”, Journal of Political Economy, 100 (2), 223-251.
Barro, R. and Sala-i-Martin, X., 1992. “Public Finance in models of Economic Growth”. Review of
Economic Studies 59, 645-61.
Barro, R. and Sala-i-Martin, X., 1995. Economic Growth. McGraw Hill.
Barro, R. and Sala-i-Martin, X., 1997. Technological Diffusion, Convergence and Growth, Journal of
Economic Growth 2, 1-26.
Bartelsman, E., Doms, M., 2006. Understanding productivity: lessons from longitudinal microdata.
Journal of Economic Literature 38, pp. 569-594.
Bacha, E., 1990. “A three gap model of foreign transfers and the GDP growth rate in developing
countries”. Journal of Development Economics 32, 279-96.
Basu, S., Weil, D., 1998. Appropriate technology and growth. Quarterly Journal of Economics 113(4),
pp. 1025-1054.
Baumol, W., 1986. “Productivity Growth, Convergence and Welfare: What the Long Run Data Show”,
American Economic Review 76 (5), 1072-1085.
Bayoumi, T., Coe, D. and Helpman, E., 1999. R&D spillovers and global growth. Journal of
International Economics 47, 399-428.
Blanchard, O., Fischer, S., 1989. Lectures on Macroeconomics. MIT Press, Cambridge Massachusetts,
London England.
Beck, T., Levine, R. and Loayza, N., 2000. “Finance and the sources of growth”. Journal of Financial
Economics 58, 261-300.
Becker, 1983 on rent seeking.
Becker, G., 1981. A Treatise on the Family. Harvard University Press.
Becker, G., 1971. Economic Theory. Knopf Publishing Co., New York.
Becker, G., 1960. An economic analysis of fertility. In Easterlin, R. (ed), Demographic and economic
change in developed countries. Universities-National Bureau Conference Series nº 11, Princeton:
Princeton University Press, pp. 209-40.
Becker, G., Stigler, G., 1974. Law enforcement, malfeasance and the compensation of enforcers. Journal
of Legal Studies 3, 1-19.
Becker, G., Murphy, K and Tamura, R., 1990. “Human Capital, fertility and economic growth”. Journal
of Political Economy 98 (5): S12-S37.
Becker, G., Murphy, K, 1993. Human capital and specialization. Quarterly Journal of Economics.
Bencivenga, V. and Smith, B., 1991. Financial intermediation and endogenous growth. Review of
Economic Studies 58, 195-209.
Bencivenga, V. and Smith, B., 1993. “Some consequences of credit rationing in an endogenous growth
model”. Journal of Economic Dynamics and Control 17, 97-122.
Benhabib, J., Spiegel, M., 1994. “The role of human capital in economic development: evidence from
aggregate cross-country data”. Journal of Monetary Economics 34, 143-173.
Bernanke, B., Gurkaynak, R., 2001. Is growth exogenous? Taking Mankiw, Romer and Weil, seriously.
NBER working paper 8365.
Bernard, A.; Durlauf, S. “Convergence in International Output”, Journal of Applied Econometrics, 10, 2,
April-June 1995, pp. 97-108.
Blundell, R., Griffith, R., Van Reenen, J., 1999. Market share, market value and innovation in a panel of
British manufacturing firms. Review of Economic Studies 66, 529-554.
Boldrin, M. and Levine, D., 2004. Rent seeking and innovation. Journal of Monetary Economics 51, 127-
160.
Boldrin, M. and Levine, D., 2002a. “The case against intellectual property”. The American Economic
Review (Papers and Proceedings) 92, 209-212.
Boldrin, M. and Levine, D., 2002. “Perfectly competitive innovation”, Federal Reserve Bank of
Minneapolis Staff Report 303.
Bond, E., Jones, R., Wang, P., 2005. Economic takeoffs in a dynamic process of Globalization, Review
of International Economics, 13, 1-19.
Bond, S., Leblebicioglu, A., Schiantarelli, F., 2004. Capital accumulation and growth: a new look at the
empirical evidence. IZA Discussion paper 1174.
Boone, P., 1996. Politics and the effectiveness of aid. European Economic Review 40 (2), 289-329.
Borensztein, E., De Gregorio, J., Lee, J., 1998. How does foreign direct investment affect economic
growth? Journal of International Economics 45, 115-35.
Brainard, W., 1967. Uncertainty and the effectiveness of policy. American Economic Review, May.
Brakman, S., Garretsen, H., Marrewijk, C., 2001. An introduction to geographical economics, Cambridge
University Press, Cambridge UK.
Breschi, S., Lissoni, F., 2001. Knowledge Spillovers and local innovation: a critical survey, Liuc Papers
N. 84, Serie Economia e Impresa 27, Marzo.
Brezis, E., Krugman, P., Tsiddon, D., 1993. Leapfrogging in international competition: a theory of cycles
in national technological leadership. American Economic Review 83, 1211-1219.
Bruno, M. and W. Easterly, 1998. “Inflation, crises and long run growth”. Journal of Monetary
Economics, 41 (1), 3-26.
Burnside, C. and Dollar, D., 2000. “Aid, policies and growth”, American Economic Review 90 (4), 847-
868.
Burnside, C. and Dollar, D., 2004a. "Aid, Policies, and Growth: Reply," American Economic Review,
vol. 94(3), 781-784, June.
Burnside, C. and Dollar, D., 2004b. "Aid, Policies, and Growth: Revisiting the evidence", World Bank
Policy Research Working Paper 3251, March.
Cabral, L., 2000. Introduction to Industrial Organization, The MIT Press, Cambridge, Massachusetts.
Caselli, F., Coleman, W., 2005. The world technology frontier. American Economic Review.
Carlino, G.; Mills, L. “Are US regional incomes converging? A time series analysis”, Journal of
Monetary Economics, 32, 2, November 1993, pp. 335-46.
Carrington, W. Detragiache, E. 1998. "How Big Is the Brain Drain?" IMF Working Paper 98/102
(Washington).
Carrington, W. Detragiache, E., “How Extensive Is the Brain Drain?”, Finance and Development, June
1999.
Caselli, F., Wilson, D., 2004. Importing technology. Journal of monetary economics 51, 1-32.
Caselli, F., Feyrer, J. 2007. The marginal product of capital. Quarterly Journal of Economics vol. 122(2),
535-568.
Caselli, F., Esquivel, G. and Lefort, F. 1996. Reopening the convergence debate: a new look at cross-
country growth empirics. Journal of economic growth, 40, pp 363-389.
Caselli, F., Coleman, W., 2001. “Cross-country technological diffusion: the case of computers”. American
Economic Review, 91(2), pp. 328-335.
Cass, D. , 1965. Optimum growth in an aggregative model of capital accumulation. Review of Economic
Studies 32 (3), 233-240.
Chari, V. , Hopenhayn, H., 1991. Vintage Human Capital, growth, and the diffusion of technology.
Journal of political economy 99, 1142-1165.
Chenery, H., Bruno, M., 1962. Development alternatives in an open economy: the case of Israel.
Economic Journal 77, 79-103.
Chamley, C. and Ghanem, H., 1991. Fiscal Policy with fixed nominal exchange rates: Cote D’Ivoire,
Working Paper Nº 658 (World Bank, Washington).
Chaudhury, K., 1983. Foreign trade and balance of payments (1757-1947), In Kumar (ed.) The
Cambridge Economic History of India. Cambridge University Press, Cambridge, MA.
Chenery, H., Robinson, S., Syrquin, M., 1986. Industrialization and growth: a comparative study. Oxford
University Press, New York.
Chenery, H., Syrquin, 1975. “Patterns of Development, 1950-1970”. London: Oxford University Press.
Chiang, A., 1984. Fundamental Methods of Mathematical Economics. McGraw-Hill, 3rd edition.
Ciccone, A., and Hall, R., 1992. Productivity and the density of economic activity. American Economic
Review 86 (1), 54-70.
Ciccone, A., and Matsuyama, K., 1996. Efficiency and equilibrium with dynamic increasing returns due to
demand complementarities, Econometrica 67, 499-525.
Ciccone, A., and Matsuyama, K., 1996a. “Start-up costs and pecuniary externalities as barriers to
economic development”, Journal of Development Economics 49, 33-59.
Clark, C., 1940. The Conditions of Economic Progress. London: Macmillan.
Clark, G., 2001. “The Secret History of the Industrial Revolution”. UC Davis.
Clemens, M., Radelet, S., Bhavnani, R., 2004. "Counting Chickens When They Hatch: the Short-Term
Effect of Aid on Growth," Center for Global Development Working Paper 44 (Washington: Center
for Global Development).
Coale, A., Treadway, R., 1986. A summary of the changing distribution of overall fertility, marital
fertility, and the proportion married in the provinces of Europe. In: Coale, A., Watkins, S. (eds),
The Decline of Fertility in Europe. Princeton University Press. Princeton.
Coe, D., Helpman, E., 1995. “International R&D spillovers”, European Economic Review 39, 859-887
Coe, D., Helpman, E., and Hoffmaister, A., 1997. North South R&D spillovers, Economic Journal 107 :
134-149.
Cohen, W., Nelson, R, Walsh, J. (2000). Protecting Their Intellectual Assets: Appropriability Conditions
and Why U.S. Manufacturing Firms Patent (or Not), NBER Working Paper No. W7552
Cohen, D. and M. Soto, 2002. "Why are some countries so poor? Another look at the evidence and a
message of hope", OECD Development Centre Technical paper No. 197, October.
Cohen, W., and Levin, R., 1989. “Empirical Studies of Innovation and Market Structure”, in R.
Schmalensee and R. Willig (eds.), Handbook of Industrial Organization, Amsterdam, Vol. 2, North
Holland, 1059-89.
Cohen, W., Arora, A., Ceccagnoli, M., Goto, A., Nagata, A., Nelson, R., Walsh, J., 2002. Patents: their
effectiveness and Role. Carnegie-Mellon University.
Collins, S. and B. Bosworth, 1996. "Economic Growth in East Asia: Accumulation versus Assimilation".
Brookings Papers on Economic Activity Nº2, 135-203.
Comanor, W., 1967. Market structure, product differentiation and industrial research. Quarterly Journal of
Comin, D., and Hobijn, B., 2004. “Cross-country technology adoption: making the theories face the
facts”. Journal of Monetary Economics 51, 39-83.
Dalgaard, C., Hansen, H., 2001. On aid, growth and good policies. Journal of Development Studies, 37,
17-41.
Dalgaard, C., Hansen, H., Tarp, F., 2004. On the empirics of foreign aid and growth. Economic Journal
114(496): F191-F216.
David, P., 1985. Clio and the Economics of QWERTY. American Economic Review 75, 332-337.
David, P., 1992. Knowledge, property, and the system dynamics of technological change. Proceedings of
the World Bank Annual Conference on Development Economics, 215-248.
David, P., Invention and Accumulation in America’s Economic Growth: a nineteenth-century parable,
Journal of Monetary Economics, Special Supplement VI, 176-228.
De Gregorio, J., 1993. “Inflation, taxation and long run growth”. Journal of Monetary Economics 31,
271-98.
De Long, J., 1988. “Productivity Growth, Convergence and Welfare: Comment”. American Economic
Review 78 (5), 1138-1154.
Denicolò, V., 1996. Patent races and optimal patent breadth and length. Journal of Industrial Economics 44,
March, 249-65.
Denison, E., 1962. “The sources of economic growth in the United States and the alternatives before us”.
Committee for Economic Development, Washington DC.
Denison, E., 1967. “Why growth rates differ: postwar experience in nine western countries”. The Brookings
Institution, Washington DC.
Diamond, J., 1998. Guns, Germs and Steel: a short history of everybody for the last 13,000 years.
Vintage, Surrey, UK.
Dinopoulos, E. and Thompson, P., 1998. Schumpeterian growth without scale effects. Journal of Economic
Growth 3(4), 313-35.
Djankov, S., La Porta, R., Lopez-de-Silanes, F., Shleifer, A., 2002. The regulation of entry. Quarterly
Journal of Economics 117, 1-37.
Dixit, A., 2007. “Governance, Institutions and Development.” P. R. Brahmananda Memorial Lecture,
Bank of India, June.
Dixit, A., 1985. Tax policy in open economies. In: Handbook of Public Economics 1, Chap. 6, North
Holland, Amsterdam.
Dixit, A., Stiglitz, J., 1977. Monopolistic competition and optimum product diversity. American Economic
Review 67 (3), 297-308.
Diwan, I., Rodrik, D., 1991. Patents, appropriate technology, and north south trade. Journal of
International Economics 30, 27-47.
Dowrick, S. and D-T. Nguyen, 1989. “OECD comparative economic growth 1950-85: catch-up and
convergence”, The American Economic Review 79(5), 1010-1030.
Domar, E., 1946. "Capital expansion, rate of growth and unemployment", Econometrica 154, 137-47.
Doms, M., Dunne, T., Troske, K., 1997. Workers, wages and technology. Quarterly Journal of Economics,
112, pp. 253-290.
Doppelhofer, G., Miller, R., Sala-i-Martin, X., 2004. “Determinants of long term growth: a bayesian
averaging of classical estimates (BACE) approach”. American Economic Review, 94, 4, 813-835.
Drazen, A. Political Economy in Macroeconomics. Princeton University Press, Princeton, New Jersey.
Durlauf, S., Johnson, P., 1995. Multiple regimes and cross-country growth behaviour. Journal of Applied
Econometrics, 10, 365-384.
Easterlin, R., 1981. “Why isn’t the Whole World Developed?”, Journal of Economic History , 41 (1), 1-
19.
Easterly, W. 1993. "How much do distortions affect growth?". Journal of Monetary Economics 32, 187-
212.
Easterly, W., 1999. "The ghost of financing gap: testing the growth model of the international financial
institutions". Journal of Development Economics 60 (2), 423-438.
Easterly, W., 2001. The Elusive Quest for Growth: Economists’ Adventures and Misadventures in the
Tropics, Massachusetts Institute of Technology.
Easterly, W. , 2001a. “The lost decades: explaining developing country’s stagnation in spite of policy
reform 1980-1998”. Journal of Economic Growth 6: 135-157.
Easterly, W. 2005. In Aghion, P., and Durlauf, S. (eds), Handbook of Economic Growth, North Holland,
Amsterdam.
Easterly, W., 2006. Reliving the ‘50s: the big push, poverty traps and takeoffs in economic development.
Journal of economic growth 11(4), 289-318.
Easterly, W., Levine, R., 2001. “It’s not factor accumulation: stylized facts and growth models”, World
Bank Economic Review 15: 177-219.
Easterly, W., Levine, R., 2003. “Tropics, Germs and Crops: the role of endowments in economic
development”. Journal of Monetary Economics, 50 (1).
Easterly, W., King, R., Levine, R., Rebelo, S., 1994. “Policy, Technology Adoption and Growth”, NBER
Working Paper No. 4681.
Easterly, W., Kremer, M. , Pritchett, L. and Summers, L, 1993. “Good policy or good luck?: Country
growth performance and temporary shocks?”. Journal of Monetary Economics 32(3), 459-483.
Easterly, W., Levine, R. Roodman, D. 2004. "Aid, Policies, and Growth: Comment," American Economic
Review, vol. 94(3), pages 774-780, June.
Easterly, W., Rebelo, S., 1993. Fiscal policy and economic growth: an empirical investigation. Journal of
Monetary Economics 32, 417-458.
Easterly, W., Rebelo, S., 1993. Marginal income tax rates and economic growth in developing countries.
European Economic Review 37, 409-417.
Eaton, J., and Kortum, S., 1991. Technology, trade and growth: a unified framework. European Economic
Review 45, 742-755.
Eaton, J., and Kortum, S., 1996. Trade in ideas: patenting and productivity in the OECD. Journal of
International Economics, 40: 251-278.
Eaton, J., and Kortum, S., 1999. International patenting and technological diffusion: theory and
measurement, International Economic Review 40, 537-570.
EBRD, 1999. Transition report: ten years of transition. European Bank for Reconstruction and
Development, London.
Ellison, G., Glaeser, E., 1999. The geographic concentration of industry: does natural advantage explain
agglomeration? American Economic Review, Papers and Proceedings, 89, 311-316.
Ehrlich, P. The Population Bomb,
El Qorchi, M., Islamic Finance Gears Up, Finance and Development December 2005.
Evans, P., 1996. Using cross-country variances to evaluate growth theories. Journal of economic
dynamics and control 20, 1027-1049.
Evans, P., 1997. How fast do economies converge? “The review of Economics and statistics”, 79 (2),
219-25.
Evans, P.; Karras, G. “Convergence revisited”, Journal of Monetary Economics, 37, 2, April 1996, pp.
249-65.
Ferreira-Cavalcanti, P., and Rossi, J., 2003. New evidence from Brazil on trade liberalization and
productivity growth. International Economic Review, vol.
44(4), pages 1383-1405, November.
Fleck, R., Hansen, A., 2006. The Origins of Democracy: A Model with Application to Ancient Greece.
The Journal of Law and Economics, 49, pages 115–146.
Fogel, R., 1991. The conquest of high mortality and hunger in Europe and America: timing and
mechanisms. In Favorites of Fortune: Technology, Growth and Economic Development Since the
Industrial Revolution. Landes, D., Higonnet, P., Rosovsky, H. (eds.), Cambridge MA: Harvard
University Press.
Fogel, R., 1994. Economic Growth, Population Theory and Physiology: the bearing of long term
processes on the making of economic policy. NBER Working paper 4638. Washington DC.
Fosfuri, A., Motta, M., Ronde, T., 2001. “Foreign direct investment and spillovers through workers’
mobility”. Journal of International Economics 53, pp. 205-22.
Foster, L., Haltiwanger, J., Krizan, C., 2000. Aggregate productivity growth: lessons from microeconomic
evidence. NBER Working Paper Nº 6803.
Frankel, J. and Romer, D. (1999), Does trade cause growth? American Economic Review 89, 379-399.
Fu, J. , 1996. “The effect of asymmetric information on economic growth”. Southern Economic Journal
63, 312-26.
Fujita, M., Ogawa, H., 1982. “Multiple equilibrium and structural transition on non-monocentric urban
configurations”. Regional Science and Urban Economics 12, 161-196.
Galor, O., 2005. From stagnation to growth: unified growth theory. In Aghion, P., and Durlauf, S. (eds),
Handbook of Economic Growth, North Holland, Amsterdam, Chapter 4, 171-293.
Galdon-Sanchez, J., Schmitz, J., 2002. Tough markets and labour productivity: world iron-ore markets in
the 1980s. American economic review 92, 1222-35.
Galor, O. and Weil, D., 2000. “Population, Technology and Growth: From Malthusian Stagnation to the
Demographic Transition and Beyond”. The American Economic Review 90(4), 806-828.
Galor, O. and Weil, D., 1999. From Malthusian stagnation to modern growth. American Economic
Review 89, 150-154.
Galor, O., Moav (2002). Natural selection and the origin of economic growth. Quarterly Journal of
Economics 115, 469-498.
Galor, O., Mountford, A., 2006. "Trade and the Great Divergence: The Family Connection," American
Economic Review, American Economic Association, vol. 96(2), pages 299-303, May.
Galor, O., Mountford, A., 2008. "Trading Population for Productivity: Theory and Evidence," Review of
Economic Studies, Blackwell Publishing, vol. 75(4), 1143-1179, October.
Galor, O., Zeira, J., 1993. Income distribution and macroeconomics. Review of economic studies 60, 35-
62.
Gallini, N., 1992. Patent policy and the costly imitation. Rand Journal of Economics 23, 52-63.
Gallup, J., Sachs, J. and Mellinger, A. , 1999. "Geography and Economic Development". International
Regional Science Review 22 (2), 179-232.
Gambetta, D., 1993. The Sicilian mafia: the business of private protection. Cambridge MA: Harvard
University Press.
Geroski, P. , 1995. Market Structure, Corporate Performance and Innovative Activity, Oxford University
Press.
Gerschenkron, A., 1952. “Economic backwardness in historical perspective”. In Bert F. Hoselitz (ed.) The
Progress of Underdeveloped Areas, Chicago: The University of Chicago Press, 3-29.
Gerschenkron, A., Nimitz, N., 1952. A dollar index of Soviet petroleum output, 1927-28 to 1937. Project
RAND Research Memorandum RM-804. Santa Monica, California, RAND Corporation.
Gilson, R., 1999. The legal infrastructure of high technology industrial districts: Silicon Valley, Route
128, and covenants not to compete. New York University Law Review 74, 575-629.
Giovannini, A., 1985. Saving and the real interest rate in LDCs, Journal of Development Economics 18,
197-217.
Glomm, G. and Ravikumar, B., 1994. “Public Investment in Infrastructure in a simple growth model”,
Journal of Economic Dynamics and Control 18, 1173-87.
Glomm, G. and Ravikumar, B., 1997. “Productive government expenditures and long run growth”, Journal of
Economic Dynamics and Control 21, 183-204.
Gradstein, M., 2004. "Governance and Growth". Journal of Monetary Economics 73, 505-518.
Graham, F., 1923, Some aspects of protection further considered, Quarterly Journal of Economics 37,
199-227.
Gray, C. and Kaufman, D., 1998. “Corruption and Development”, Finance and Development, March.
Greif, A. (2009). Institutions: Theory and History. Forthcoming, Cambridge: Cambridge University
Press.
Griffith, R., Redding, S., Van Reenen, J., 2000. Mapping the two faces of R&D: productivity growth in a panel of
OECD industries, Review of Economics and Statistics.
Griliches, Z., 1957. Hybrid corn: an exploration in the economics of technological change. Econometrica,
25, pp. 501-522.
Grossman, G. and Helpman, E., 1991. Innovation and growth in the global economy. Cambridge, MA:
MIT Press.
Gupta, S., de Mello, L., Sharan, R., 2001. Corruption and military spending. European Journal of
Political Economy 17, 749-77.
Guriev, S., 2004. Red Tape and Corruption, Journal of Development Economics 73, 489-504.
Gylfason, T., 1998. Output gains from economic stabilization, Journal of Development Economics 56,
81-96.
Gylfason, T., 1999. Principles of Economic Growth. Oxford University Press.
Hall, R., and C. Jones, 1999. "Why do some countries produce so much more output per worker than
others?", The Quarterly Journal of Economics 114 (1), 83-116
Hansen, G. and Prescott, E., 1999. “Malthus to Solow”, American Economic Review 92 (2002), 1205-17.
Hansen, H., Tarp, F., 2000. Aid effectiveness disputed. Journal of International Development 12, 375-
398.
Hansen, H., Tarp, F., 2001. Aid and growth regressions. Journal of Development Economics 64, 547-
570.
Harberger, A., 1964. Taxation, resource allocation and welfare. In: the role of direct and indirect taxes in
the Federal Revenue System. The National Bureau of Economic Research and The Brookings
Institution. Princeton University Press, Princeton NJ.
Harberger, A., 1971. Three basic postulates for basic welfare economics: an interpretive essay. Journal of
Economic Literature 9, 785-797.
Harley, C., 1973. On the persistence of old technologies: the case of North American Wooden
shipbuilding. Journal of Economic History 33, 372-398.
Harris, J. and Todaro, M., 1970. “Migration, unemployment and development: a two sector analysis”.
American Economic Review 60, 126-142.
Harrod, R. , 1939. "An essay in dynamic theory". Economic Journal 49, 14-33.
Hausmann, R., L. Pritchett and D. Rodrik (2004) "Growth Accelerations", NBER Working Paper 10566.
Hausmann, R., Rodrik, D., 2006. Doomed to choose: industrial policy as a predicament, Center for
International Development at Harvard University, September.
Hausmann, R., Rodrik, D., 2003. "Economic Development as Self-Discovery", Journal of Development
Economics 72(2), 603-633.
Hausmann, R. and B. Klinger, (2007), “The structure of the product space and the evolution of
comparative advantage”, CID Working Paper no. 146, April.
Hausmann, R. J. Hwang and D. Rodrik, (2007), "What you export matters", Journal of Economic Growth
12, 1-25.
Hausmann, R., Rodrik, D., Velasco, A., 2008. Growth Diagnostics. In Serra, N. and Stiglitz, J. (eds),
The Washington Consensus Reconsidered: Towards a New Global Governance (Initiative for
Policy Dialogue Series), Oxford University Press, 324-354.
Hellman, Joel S., Geraint Jones, and Daniel Kaufmann (2003), Seize the State, Seize the Day: State
Capture and Influence in Transition Economies, Journal of Comparative Economics, 31, pp. 751-
773.
Helpman, E., 2004. The Mystery of Economic Growth. The Belknap Press, Harvard.
Helpman, E., 1993: Innovation, imitation and intellectual property rights. Econometrica, 61 (6), 1247-
1280.
Henderson, 1974. “The sizes and types of cities”. American Economic Review 64, 640-656.
Hicks, J., 1969. A theory of Economic History. Oxford: Clarendon Press.
Hirschman, A., 1958, Strategy of economic development, New Haven: Yale University Press.
Ho, W. and Wang, Y., 2005. “Public Capital, asymmetric information and economic growth”, Canadian
Journal of Economics, 38(1), 57-80.
Howitt, P., 1999. “Steady endogenous growth with population and R&D inputs growing”, Journal of
Political Economy 107, 715-730.
Howitt, P., 2000. “Endogenous growth and cross-country income differences”, American Economic
Review 90, 829-846.
Hsieh, C. , 1999. "Productivity growth and factor prices in East Asia", American Economic Review,
Papers and Proceedings, May, 133-138.
Hume, D., 1758. Essays and Treatises on Several Subjects. London: A. Millar.
Iqbal, Z., 1997. Islamic Financial Systems, Finance and Development, June.
Islam, 1995. quoted in Aghion and Howitt pp 35

Islam, N., 2005. Regime changes, economic policies and the effects of aid on growth. The Journal of
Development Studies 41 (8), 1467-1492.
Isham, J., Kaufmann, D., Pritchett, L., 1995. “Governance and returns on investment: an empirical
investigation”. World Bank Policy Research Working paper 1550, World Bank, Washington.
Jain, A., 2001. Corruption: a review. Journal of Economic Surveys 15(1), 71-121.
Jaffe, A., 1986. “Technological opportunity and spillovers of R&D: evidence from firms’ patents, profits
and market value”. American Economic Review 76(5): 984-1001.
Jaffe, A., 1989, Real effects of academic research. American Economic Review 79(5), 957-970.
Jaffe, A., Trajtenberg, M. and Henderson, R., 1993. “Geographic localization of knowledge spillovers
as evidenced by patent citations”. Quarterly Journal of Economics 108(3): 577-598.
Jevons, W. , 1875. Money and the Mechanism of Exchange. London: Appleton.
Johansen, L., 1959. “Substitution versus fixed production coefficients in the theory of economic growth: a
synthesis”. Econometrica 27, 157-176.
Jones, C. , 1995. "R&D based models of endogenous growth", Journal of Political Economy 103, 759-84.
Jones, C., 1995. "Time series tests of endogenous growth models", Quarterly Journal of Economics, 110,
495-525.
Jones, C. , Introduction to Economic Growth, Norton, New York, 2nd edition, 2002.
Jones, C. , 1999. “Growth: with or without scale effects?”. American Economic Association Papers and
Proceedings, 89, May, 139-144.
Jones, C., 2005. “Growth and ideas”, in Aghion, P., and Durlauf, S. (eds), Handbook of Economic
Growth, North Holland, Amsterdam, Chapter 16.
Jones, L. and Manuelli, R., 1990. A convex model of equilibrium growth: theory and policy implications,
Journal of political economy 98, 1008-1038.
Jones, L. , Manuelli, R. and Rossi, P., 1993. Optimal taxation in models of endogenous growth. Journal of
Political Economy 101 (3), 485-517.
Jones, L. and Manuelli, R., 2005. Neoclassical models of endogenous growth: the effects of fiscal policy,
innovation and fluctuations, in Aghion, P., and Durlauf, S. (eds), Handbook of Economic Growth,
North Holland, Amsterdam, Chapter 1, 14-65.
Jovanovic, B., Nyarko, Y., 1996, Learning by doing and the choice of technology, Econometrica 64,
1299-1310.
Kaldor, N., 1970. “The case for regional policies”, Scottish Journal of Political Economy 17: 337-348.
Kaldor, N, 1966. "Causes of the Slow Rate of Growth of the United Kingdom", Cambridge University
Press.
Kaldor, N, 1961. Capital Accumulation and Economic Growth. In F.A. Lutz and D.C. Hague (eds.), The
theory of Capital. New York: St Martin Press.
Kaldor, N., 1957. A model of Economic Growth. The Economic Journal 67 (268), 591-624.
Kaufmann, D. and Wei, S., 1999. Does “grease money” speed up the wheels of commerce? NBER Working Paper 7093.
Kaufmann, D., Kraay, A., Zoido-Lobatón, P. 1999a. "Aggregating Governance Indicators," World Bank
Policy Research Working Paper No. 2195 (Washington), www.worldbank.org/wbi/governance/.
Keller, W. , 2000. "Do Trade Patterns and Technology Flows Affect Productivity Growth?," World Bank
Economic Review, Oxford University Press, vol. 14(1), pages 17-47, January.
Keller, W., 2002. Geographic localization of International Technological Diffusion. American Economic
Review 92, 120-142.
Keller, W., 2004. “International Technological Diffusion”, Journal of Economic Literature 42: 752-782.
Keely, L, and Quah, D., 1998. “Technology and Growth”, Centre for Economic Performance Discussion
Paper Nº 391, London School of Economics, May.
Kerr, W., 2008. “Ethnic scientific communities and international technology diffusion”. The Review of
Economics and Statistics 90(3), 518-537.
King, R., Levine, R., 1993a. Finance and growth: Schumpeter might be right. Quarterly Journal of
Economics, 108 (3), 717-37.
King, R., Levine, R., 1993. Finance, entrepreneurship and growth: theory and evidence. Journal of
Monetary Economics 32, 513-542.
King, R., Levine, R., 1994. Capital fundamentalism, economic development and economic growth.
Carnegie Rochester Conference Series on Public Policy 40: 259-92.
Klenow, P. and Rodriguez-Clare, A., 1997. "The Neo-Classical Revival in Growth Economics: Has it Gone Too
Far?", NBER Macroeconomics Annual.
Klenow, P. and Rodriguez-Clare, A., 2005. “Externalities and Growth”. In Aghion, P., and Durlauf, S.
(eds), Handbook of Economic Growth, North Holland, Amsterdam, Chapter 11, 817-866.
Kneller, R., Bleaney, M. and Gemmell, N., 1999. “Fiscal Policy and Growth: Evidence from OECD
countries”. Journal of Public Economics 74, 171-90.
Knight, F., 1925. “On decreasing cost and comparative cost”. Quarterly Journal of Economics 39, 331-33.
Kormendi, R., Meguire, P., 1985. Macroeconomic determinants of growth: cross country evidence,
Journal of Monetary Economics, 16 (2), 141-63.
Kortum, S., 1997. Research, patenting and technological change. Econometrica 65(6), 1389-1419.
Koopmans, T., 1965. On the concept of optimal growth. In The Econometric Approach to Development
Planning. Rand-McNally, Chicago.
Kranton, R., Swamy, A., 1999. The hazards of piecemeal reform: British civil courts and the credit market
in colonial India. Journal of Development Economics 58, 1-24.
Kraay, A., Raddatz, C., 2007. Poverty traps, aid and growth. Journal of Development Economics 82(2),
315-347.
Kremer, M. 1998. “Patent buyouts: a mechanism for encouraging innovation”. Quarterly Journal of
Economics, 1998, pp. 1137-1167, November.
Kremer, M. 1993. "Population growth and technological change: one million B.C. to 1990". Quarterly
Journal of Economics, 108, 681-716.
Krueger, A.O., 1978. Foreign trade regimes and economic development: liberalization attempts and
consequences. NBER, Cambridge MA.
Krueger, A.O., 1983, Trade and employment in developing countries, 3. Synthesis and conclusions,
University of Chicago Press, Chicago IL.
Krueger, A, 1985. The experiences and lessons of Asia’s super exporters. In: Corbo, V., Krueger, A,
Ossa, F. (eds), Export Oriented Development Strategies: The success of five newly industrialized
countries. Westview Press, Boulder, CO, pp. 57-58.
Krueger, A, 1974, “The political economy of the rent seeking society”, American Economic Review
LXIV, 291-303.
Krueger, A., 1993. “Virtuous cycles and vicious cycles in economic development”. American Economic
Review, Papers and Proceedings 83(2), 351-56.
Krugman, P., 1979. “Increasing returns, monopolistic competition and international trade”, Journal of
International Economics 9(4), 469-79.
Krugman, P., 1980. Scale economies, product differentiation and the pattern of trade. American
Krugman, P., 1981. “Trade, accumulation and uneven development”, Journal of Development Economics
8, 149-161.
Krugman, P., 1987. “The narrow moving band, the Dutch disease and the competitive consequences of
Mrs. Thatcher”, Journal of Development Economics 27, 41-55.
Krugman, P., 1991. “Increasing returns and economic geography”, Journal of Political Economy 99, 483-
499.
Krugman, P. , 1995. Development, Geography, and Economic Theory. MIT Press, Cambridge,
Massachusetts.
Krugman, P. , 1994. "The myth of Asia's Miracle", Foreign Affairs 73(6), Nov-Dec, pp.62-78.
Krugman, P. and Venables, A., 1995. “Globalization and the Inequality of Nations”. Quarterly Journal of
Kuznets, S., 1966. Modern Economic Growth. New Haven, CT: Yale University Press.
Kuznets, S., 1960. Population change and aggregate output. In Demographic and economic change in
developed countries. Princeton, NJ. Princeton University Press.
Landes, D., 1998. The Wealth and poverty of Nations. Abacus.
Lau, L., Qian, Y., Roland, G., 2000. Reform without losers: an interpretation of China’s dual track
approach to transition. The Journal of Political Economy 108 (1), 120-143.
Lee, R., 1997. “Population Dynamics: equilibrium, disequilibrium and consequences of fluctuations”. In
Mark Rosenzweig and Oded Stark (eds), Handbook of Population and Family Economics Vol. 1B.
Amsterdam: North Holland, 1063-115.
Lee, R., 1988. “Induced Population Growth and Induced Technological Progress: their interaction in the
accelerating stage”, Mathematical Population Studies 1 (3), 265-288.
Lee, K., Pesaran, M., Smith, R., 1997. Growth and convergence in a multi-country empirical stochastic
Solow model. Journal of applied econometrics 12, 357-392.
Leff, 1964.
Lensink, R., White, H, 2001. Are there negative returns to aid? Journal of Development Studies 37 (6),
42-65.
Levin, R., Klevorick, A., Nelson, R., Winter, S., 1987. “Appropriating the returns from industrial research
and development”. Brookings Papers on Economic Activity 3, 783-820.
Levine, R. and D. Renelt (1992), “A sensitivity analysis of cross-country growth regressions”, American
Economic Review 82(4), 942-63.
Levine, R., Loayza, N. and Beck, T., 2000. “Financial Intermediation and growth: causality and causes”.
Journal of Monetary Economics 46, 31-77.
Lewis, W., 1967. Unemployment in developing countries. World Today 23, 13-22.
Lewis, W., 1954. “Economic Development with Unlimited Supplies of Labour”. The Manchester School
of Economic and Social Studies 22, 139-191.
Li, D. , 2002. "Is the AK model still alive? The long run relation between growth and investment re-
examined", Canadian Journal of Economics 35(1), 92-114.
Li, Q. and Papell, 1999. “Convergence of international output: time series evidence for 16 OECD
countries”. International Review of Economics and Finance 8, 267-280.
Lichtenberg, F., de La Potterie, B., 2001. "Does Foreign Direct Investment Transfer Technology Across
Borders?," The Review of Economics and Statistics, 83(3), 490-497.
Lichtenberg, F., de La Potterie, B., 1998. International R&D spillovers: a comment. European Economic
Review 42, 1483-1491.
Lipsey, R., Lancaster, K. The General Theory of Second Best, The Review of Economic Studies, Vol. 24,
No. 1. (1956 - 1957), pp. 11–32
Livi-Bacci, M., 1997. A concise history of world population. Oxford: Blackwell.
Loewy, M.; Papell, D. “Are US regional incomes converging? Some further evidence”, Journal of
Monetary Economics, 38, 3, December 1996, pp. 587-98.
Loury, G., 1979. Market structure and innovation. Quarterly Journal of Economics 93, 395-410.
Lucas, R., 1988. “On the mechanics of economic development”. Journal of Monetary Economics 22, 3-
42.
Lucas, R., 1990, “Why doesn’t capital flow from rich countries to poor countries?”, American Economic
Review 80, 92-96.
Lucas, R., 2002. Lectures on Economic Growth. MIT Press.
Lucas, R., (2002a). The industrial revolution: past and future. Harvard University Press.
Maddison, A., 1982. Phases of capitalist development. Oxford University Press.
Maddison, A., 1991. Dynamic forces in capitalist development. Oxford University Press.
Maddison, A., 2001. The World Economy: a Millennial Perspective. Development Centre, Paris.
Malthus, T., 1797. First essays on population. Reprints of economic classics, Augustus Kelley, New York
1965.
Malthus, T., 1798. An Essay on the Principle of Population as it Affects the Future Improvement of the
Society. London: J. Johnson.
Mankiw, G., D. Romer and D. Weil, 1992. “A contribution to the empirics of economic growth”,
Quarterly Journal of Economics, 107 (2), 407-38.
Mansfield, E., 1986. Patents and Innovation, Management Science, 1986, February,
Markusen, J., Venables, T., 1999, Foreign direct investment as a catalyst for industrial development.
European Economic Review, 335-356.
Markusen, J., 2002. Multinational firms and the theory of international trade. Cambridge, MA: MIT Press.
Marshall, A., 1920. “Principles of Economics”, London: Macmillan.
Matsuyama, K., 1991. Increasing returns, industrialization, and indeterminacy of equilibrium. Quarterly
Journal of Economics, Vol. CVI, No. 425, pp. 617-650,
Matsuyama, K., 1992. “Agriculture productivity, comparative advantage and economic growth”. Journal
of Economic Theory 58, 317-334.
Matsuyama, K., 1995. “Complementarities and cumulative processes in models of monopolistic
competition”. Journal of Economic Literature XXXIII, 701-729.
Matsuyama, K., 1996. “Economic Development as Coordination Problems”. In: The Role of Government in
East Asian Development: A Comparative Institutional Analysis, M. Aoki, H. Kim and M. Okuno-Fujiwara
(eds). New York: Oxford University Press.
Mauro, P., 2004, “The persistence of corruption and slow economic growth”. IMF Staff Papers 51 (1), 1-
18.
Mauro, P., 1995. Corruption and Growth. Quarterly Journal of Economics 110 (3), 681-712.
McGillivray, M., Feeny, S., Hermes, N., Lensink, R., 2005. It works; it doesn’t; it can, but that
depends. United Nations University Research Paper No 2005/54, Helsinki, Finland.
Meadows, 1972 limits to growth (reference in jones)
Mello, M., 2009. Estimates of the Marginal Product of Capital, 1970-2000. The B.E. Journal of
Macroeconomics 9(1), article 16.
Mendoza, E., Milesi-Ferretti, G., Asea, P., 1997. On the effectiveness of tax policy in altering long run
growth: Harberger’s super neutrality conjecture. Journal of Public Economics 66 (1), 99-126.
Menger, K., 1892. ‘On the origin of money.’ Economic Journal, Vol. 2, pp. 239-255.
Merges, R., Nelson, R., 1994. On limiting or encouraging rivalry in technological progress: the effect of
patent-scope decisions. Journal of Economic Behaviour and Organization 25, 1-24.
Mokyr, J., 1985. The economics of the industrial revolution. Rowman and Littlefield.
Mokyr, J., 2002. The gifts of Athena: historical origins of the knowledge economy. Princeton university
press, Princeton.
Myrdal, G., 1957. Economic theory and underdeveloped regions, Duckworth, London.
Mukoyama, T. 2003, Innovation, imitation and growth with cumulative technology, Journal of Monetary
Murphy, K., Shleifer, A., Vishny, R., 1989. Industrialization and the big push, Journal of Political
Economy 97, 1003-1026.
Murphy, K., Shleifer, A, Vishny, R., 1993. Why is rent seeking so costly to growth? American Economic
Review Papers and Proceedings 83 nº 2, May.
Murthy, N. and Ukpolo, V. (1999), A test of the conditional convergence hypothesis: econometric
evidence from African countries, Economics Letters 65, 249-253.
Myerson, R., 1993. Effectiveness of the electoral system for reducing government corruption: a game
theoretical analysis. Games and Economic Behaviour 5, 118-32.
Nelson, R., 1959. The simple economics of basic scientific research. Journal of Political Economy, 67,
297-306.
Nelson, R., Phelps, E., 1966. Investment in humans, technological diffusion and economic growth.
American Economic Review 56, 69-75.
Nickell, S., 1996. Competition and corporate performance. Journal of political economy 104 (4), 724-746.
Nordhaus, W., 1969. An economic theory of technological change. American economic review 59, 18-28.
Nordhaus, W., 1969. “Invention, growth and welfare: a theoretical treatment of technological change”,
Cambridge, Mass, Harvard University Press.
North, D., Wallis, J., Weingast, B., 2006. A conceptual framework for interpreting recorded human
history. NBER working paper series 12795, December.
North, D., 1994. “Economic performance through time”. The American Economic Review 84(3), 359-368.
North, D., 1993. “The new institutional economics and development”, mimeo Washington University, St.
Louis. WUSTL Economic Working Paper Archive.
North, D., 1991. Institutions. Journal of Economic Perspectives 5(1), 97-112.
North, D., 1990. Institutions, Institutional Change and Economic Performance. Cambridge UK:
Cambridge University Press.
North, D. 1981. Structure and Change in Economic History. New York: Norton.
North, D., Thomas, P., 1973. The rise of the Western World, Cambridge UK: Cambridge University
Press.
Nurkse, R., 1953. “Problems of capital formation in underdeveloped countries”. New York: Oxford
University Press.
(Olson, 1965; on rent seeking
Olson, M., 1996. Big bills left on the sidewalk: why some nations are rich and others are poor. Journal of
Economic Perspectives 10, 3-24.
Pagano, M., 1993. “Financial markets and growth: an overview”. European Economic Review 37, 613-
22.
Parente, S. and Prescott, E., 2005. “A unified theory of the evolution of international income levels”, in
Aghion, P., and Durlauf, S. (eds), Handbook of Economic Growth, Volume 1, Chapter 21, pp.
1371-1416, Elsevier.
Parente, S. and Prescott, E., 1994. "Barriers to technology adoption and development", Journal of
Political Economy 102(2), 298-321.
Park, H., Philippopoulos, A., Vassilatos, V., 2003. The optimal size of public sector under rent seeking
competition from state coffers. CESifo Working Paper Nº 991, Ifo Institute for Economic
Research, Munich.
Pepall, L., Richards, D., Norman, G., 2005. Industrial Organization: Contemporary Theory and Practice,
Third Edition, South-Western Thomson.
Peretto, P., 1998. “Technological change and population growth”, Journal of Economic Growth, 3(4), 283-
311.
Peretto, P. and Smulders, S., 2002. Technological Distance, Growth and Scale Effects, The Economic
Journal 112, 603-624.
Perotti, R., 1996. Growth, income distribution and democracy: what the data say. Journal of Economic
Growth, 1(2): 149-187.
Persson, T., Roland, G., Tabellini, G., 1997. Separation of powers and political accountability. Quarterly
Journal of Economics 112, 1163-202.
Persson, T., Roland, G., Tabellini, G., 2000. Comparative politics and public finance. Journal of Political
Economy 108, 1121-41.
Phelps, E., 1961. "The Golden Rule of Accumulation: a Fable of Growthman". American Economic
Review, 51, 638-643.
Phelps, E., 1965. Second Essay on the Golden Rule of Accumulation, American Economic Review 55,
793-814.
Phelps, E., Pollak, R., 1968. “On second best national savings and game equilibrium growth”. Review of
Economic Studies 35, 185-199.
Polanyi, M., 1958. Personal knowledge: towards a post-critical philosophy. Chicago: U. Chicago Press.
Porter, M., 1992. “The competitive advantage of nations”. The Free Press, New York.
Prescott, E., 1998. Needed: a theory of total factor productivity. International economic review 39, 529-
549.
Pritchett, L., 1997. “Divergence, Big Time”. Journal of Economic Perspectives 11(3), 3-17.
Pritchett, L., 2006. The Quest Continues, Finance and Development, March,
http://www.imf.org/external/pubs/ft/fandd/2006/03/pritchet.htm.
Prebisch, R., 1950, United Nations, The Economic Development of Latin America and its Principal
problems, Lake Success, New York.
Quah, D., 1993. Empirical cross-section dynamics in economic growth. European economic review 37,
426-434.
Quah, D., 1997. Empirics for growth and distribution: stratification, polarization and convergence clubs,
Journal of Economic Growth 2, 27-59.
Ramsey, F., 1928. A mathematical theory of savings. Economic Journal, 38, Nº 152, 543-559.
Ray, D., 1998. “Development Economics”, Princeton.
Rebelo, S. 1991. Long run policy analysis and long run growth, Journal of Political Economy, 99, 500-
521.
Rebelo, S., Stokey, N., 1995. Growth effects of flat-rate taxes, Journal of Political Economy 103, 519-550.
Rhee, Y. W., 1990. The catalyst model of development: lessons from Bangladesh’s success in garment
exports. World Development, 18, 333-346.
Ricardo, D. 1817. Principles of Political Economy and Taxation.
Rivera-Batiz, L., Romer, P., 1991. Economic Integration and Endogenous Growth. Quarterly Journal of
Economics 106: 531-555.
Robinson, S., 1971. Sources of growth in Less Developed Countries: a cross-section study, Quarterly
Journal of Economics 85 (3), 391-408.
Rodriguez-Clare, A., 1996. The division of labour and economic development, Journal of Development
Economics, 49, 3-32.
Rodriguez-Clare, A., 1996a. Multinationals, linkages and economic development. American Economic
Review 86, 852-73.
Rodrik, D., 1996. Coordination failures and government policy: a model with applications to East Asia
and Eastern Europe. Journal of International Economics 40, 1-22.
Rodrik, D. , 1997. “TFPG controversies, institutions and economic performance in East Asia”, Working
Paper NBER 5914.
Rodrik, D., 1999. “Where did all the growth go? External shocks, social conflict and growth collapses”.
Journal of Economic Growth 4 (4), 385-412.
Rodrik, D., 2003. “Institutions, Integration and Geography: in search of the deep determinants of economic
growth”, in In Search of Prosperity: Analytical Country Studies on Growth, Dani Rodrik (ed),
Princeton, New Jersey.
Rodrik, D., 2006. Goodbye Washington Consensus, Hello Washington Confusion?, Journal of Economic
Literature 44 (4), 973-987.
Rodrik, D., Arvind Subramanian, 2003. The primacy of institutions (and what this does not mean).
Finance and Development, June, 31-34.
Rodrik, D., Arvind Subramanian, Francesco Trebbi, 2004. "Institutions Rule: The Primacy of Institutions
Over Geography and Integration in Economic Development," Journal of Economic Growth,
Springer, vol. 9(2), pages 131-165.
Romer, P. 1986. “Increasing returns and long run growth”. Journal of Political Economy 94, 1002-37.
Romer, P., 1987. “Crazy explanations for the productivity slowdown”, in Fisher, S. (ed). NBER
Macroeconomics Annual. Cambridge Mass: MIT Press.
Romer, P. , 1990. “Endogenous technological change”. Journal of Political Economy 98, s71-s102.
Romer, D., 1996. Advanced Macroeconomics, McGraw-Hill.
Rosenstein-Rodan, P., 1943. Problems of industrialization in eastern and south-eastern Europe,
Economic Journal, 53, 202-211.
Rosenstein-Rodan, P., 1961. Notes on the theory of the Big Push. In: Ellis, H.S., Wallich, H.C. (eds),
Economic Development for Latin America. St. Martin’s Press, New York.
Rosser, J., Rosser, M., 1996. Comparative Economics in a Transforming World Economy. McGraw-Hill.
Rostow, W., 1960. “The stages of economic growth: a non communist manifesto”. Cambridge:
Cambridge University Press.
Roubini, N., and Sala-i-Martin, X., 1995. A growth model of inflation, tax evasion and financial
repression, Journal of Monetary Economics, 35, 275-301.
Roubini, N., and Sachs, J., 1989. “Political and Economic Determinants of Budget Deficits in the
Industrial Democracies”. European Economic Review, May.
Sachs, J., 2001. Tropical underdevelopment. NBER Working Paper 8119, February.
Sachs and Larraine, Macroeconomics, …
Sachs, J. D. and Warner, A. M. , 1995. “Economic reform and the process of economic integration”,
Brookings Papers of Economic Activity 1, 1-95.
Sachs, J. D. and Warner, A. M. , 1997. "Fundamental sources of long-run growth". American Economic
Review, Papers and Proceedings, May.
Sachs, J. D. and Warner, A. M. , 1999. The big push, natural resource booms and growth, Journal of
Development Economics 59, 43-76.
Sachs, J., 2005. The end of poverty: economic possibilities for our times. New York: Penguin Press.
Sachs, J., 2003. Institutions matter, but not for everything: the role of geography and resource endowments
in development shouldn’t be underestimated. Finance and Development, June, 38-41.
Sachs, Jeffrey D., “Tropical Underdevelopment,” NBER Working Paper No. w8119, February 2001.
Sachs, J.D., McArthur, J.W., Schmidt-Traub, G., Kruk, M., Bahadur, C., Faye, M., McCord, G., 2004.
Ending Africa's Poverty Trap. Brookings Papers on Economic Activity, no 2, pp 117–216 (Brookings
Institution, Washington, DC).
Sala-i-Martin, X., 1997. “I just ran two million regressions”, American Economic Review 87 (2), 178-83.
Sala-i-Martin, X., 2002. The disturbing “rise” in world income distribution. Columbia University.
Samuelson, P., 1954. “The pure theory of public expenditure”, Review of Economics and Statistics 36,
387-89.
Savvides, A., Zachariadis, M., 2005. International technology diffusion and the growth of TFP in the
manufacturing sector of developing economies. Review of Development Economics 9 (4).
Slemrod, J., what do cross-country studies teach about government involvement, prosperity and economic
growth? Brookings paper on economic activity 0 (2) 373-415.
Scitovsky, T., 1954. Two concepts of external economies. Journal of political economy, 62, 143-151.
Scherer, F., 1967. Market Structure and the Employment of scientists and engineers. American Economic
Review 57, 524-531.
Scherer, F., 1984. Innovation and growth: Schumpeterian perspectives. Cambridge, MA: MIT Press,
1984, pp. 120-129.
Schumpeter, J., 1912. Theorie der Wirtschaftlichen Entwicklung [The theory of economic development].
Leipzig: Duncker & Humblot. Translated by Redvers Opie, Cambridge MA, Harvard University
Press, 1934.
Schumpeter, J., 1950. Capitalism, Socialism and Democracy. New York: Harper.
Schultz, T., 1960. Capital formation by education, Journal of Political Economy 68, 571-583.
Schultz, T., 1961. Investment in Human Capital, American Economic Review 51, 1-17.
Scotchmer, S. 1991. Standing on the shoulders of giants: cumulative research and the patent law. Journal
of Economic Perspectives 5(1), 29-41.
Segerstrom, P., 1998. Endogenous growth without scale effects. American Economic Review 88(5),
1290-1310.
Shell, K., 1966. Towards a theory of inventive activity and capital accumulation, American Economic
Review, Papers and Proceedings 56, 62-68.
Shleifer, A., Vishny, R., 1998. The Grabbing Hand. Harvard University Press.
Simon, J., 1977. The Economics of Population Growth. Princeton, NJ Princeton University Press.
Simon, J., 1981. The Ultimate Resource. Princeton, NJ Princeton University Press.
Smith, A., 1776. The Wealth of Nations. London: W. Strahan and T. Cadell. Reprinted in Cannan, E.
(ed.), 1961, London: Methuen.
Singer, H., 1950. The distribution of trade between investing and borrowing countries. American
Sokoloff, K., Engerman, S., 2000. Institutions, factor endowments and paths of development in the new
world, Journal of Economic Perspectives 14, 217-232.
Solow, R., 1956. “A contribution to the theory of economic growth”, Quarterly Journal of Economics 70,
65-94.
Solow, R., 1957. "Technical change and the aggregate production function", Review of Economics and
Statistics 39, 312-320.
Solow, R., 1960. “Investment and technical progress”, in Arrow, K. (ed.), Mathematical methods in the
social sciences, Stanford University Press.
Solow, R. 1994. Perspectives on growth theory. Journal of Economic Perspectives, 8, 45-54.
Solow, R., 2000. “Growth Theory: an Exposition”, Oxford University Press, Oxford.
Solow, R., 2005. Reflections on growth theory, in Aghion, P., and Durlauf, S. (eds), Handbook of
Economic Growth, North Holland, Amsterdam, 3-10.
Stiglitz, J., 2000. “The economics of the public sector”, 3rd edition, WW Norton & Co, London.
Stiglitz, J. and Driffill, J., 2000. “Economics”, Norton, New York.
Stokey, N., 1988. Learning by doing and the introduction of new goods. Journal of Political Economy 96,
701-717.
Stokey, N., Rebelo, S., 1995. Growth effects of flat-rate taxes. Journal of Political Economy 103, 519-550.
Sudekum, J, 2003. “Agglomeration and regional unemployment disparities”, Peter Lang, Frankfurt
Heston, Alan, Summers, Robert, and Aten, Bettina, 2002. Penn World Table Version 6.1, Center for
International Comparisons at the University of Pennsylvania (CICUP), October.
Swan, T., 1956. “Economic growth and capital accumulation”, Economic Record 32, 334-61.
Tanzi, V., 1998. “Corruption around the world: causes, consequences, scope and cures”. IMF Staff Papers
45(4), 559-594.
Taylor, L., 1990. Foreign resource flows and developing country growth: a three-gap model. In
McCarthy, F.D. (ed), Problems of Developing Countries in 1990, World Bank Discussion Paper
97, Washington DC: World Bank.
Teece, D., 1977. Technology transfer by multinational firms: the resource cost of transferring
technological know-how. Economic Journal 87, pp. 242-261.
Thirlwall, A. P., (1979), "The Balance of Payments constraint as an explanation of international growth
differences", Banca Nazionale del Lavoro Quarterly Review, vol. 3, 245-253.
Tinbergen, J., 1952. “On the theory of economic policy”, Amsterdam: North Holland.
Toye, J., Moore, M., 1998. Taxation, Corruption and Reform, European Journal of Development
Research, 10 (1), 60-84.
Trindade, V. 2005, The Big Push, Industrialization and International Trade: the role of exports, Journal of
Development Economics 78 (1), 22-48.
Tsiddon, D., 1992. “Moral Hazard trap to growth”, International Economic Review 33, 299-321.
Turnovsky, S. and Fisher, W., 1995. “The composition of government expenditures and its consequence
for macroeconomic performance”. Journal of Economic Dynamics and Control 19, 747-86.
United Nations, 2005. Millennium Development, Project Report, United Nations, New York.
Uzawa, H., 1965. “Optimum technical change in an aggregative model of economic growth”, International
Economic Review, 6, 19-31.
Veblen, T., 1898, Why is economics not an evolutionary science? Quarterly Journal of Economics, 12,
373-97.
Ventelou, B., 2002. Corruption in a model of growth: political reputation, competition and shocks. Public
Choice 110, 23-40.
Ventura, J., 2005. A Global View of Economic Growth. In Aghion, P., and Durlauf, S. (eds), Handbook
of Economic Growth, North Holland, Amsterdam, chapter 22, 1419-1497.
Wade, R., 1982. The system of administrative and political corruption: canal irrigation in South India.
Journal of Development Studies 18, pp. 287-328.
Wade, R., 1990. “Governing the market”. World Bank Publications. Washington DC.
Weil, D. 2005, Economic Growth, Pearson Addison-Wesley.
Williamson, J., 1990. “What Washington means by policy reform”, in Latin American Adjustment: How
much has happened?, edited by John Williamson (Washington: Institute for International
Economics).
World Bank, 1988. World Development Report 1988. Oxford University Press, New York.
World Bank, 1991. World Development Report. Washington.
World Bank, 1998. Assessing aid: what works, what doesn’t and why. Washington DC: World Bank.
World Bank, 2003. World Development Report: sustainable development in a dynamic world,
Washington DC.
World Health Organization, 1999. WHO on Health and Income Productivity. Population and
Development Review 25 (2), 396-402.
Wright, B., 1983. The economics of invention incentives: patents, prizes and research contracts.
American Economic Review 73 (4), 691-707.
Wright, T., 1936. “Factors affecting the cost of airplanes”. Journal of the Aeronautical Sciences 3, 122-128.
Xu, B., 2000. Multinational enterprises, technology diffusion and host country productivity growth.
Journal of Development Economics 62 (2), 477-93.
Young, A., 1991. Learning by doing and the dynamic effects of international trade. Quarterly Journal of
Economics 106, 369-405.
Young, A., 1998. “Growth without scale effects”, Journal of Political Economy 106 (1), 41-63.
Young, A., 1995. “The tyranny of numbers: confronting the statistical realities of the East Asian growth
experience”, Quarterly Journal of Economics CX (3), 641-680.
