Micro 2Decision2014LecturesPart5
P.G. Babu
Indira Gandhi Institute of Development Research
Gen.A.K.Vaidya Marg
Goregaon (East), Mumbai 400 065.
email: babu@igidr.ac.in
Inequality
These are merely lecture notes. No originality, other than the organization of the material is claimed.
person who appears exactly at the twenty percent way along the parade of the people, and x0.8
gives the income of the person who appears at eighty percent way along the parade of the entire
population. In this way of representing income distribution, we gain the advantage of borrowing
the concepts from Statistics in our inequality analysis.
Both of the above methods are useful. For example, if one is more interested in developing
individualistic welfare criteria, we would use the first approach. However, given our intent to develop
a general theory based on distributions, we would focus more on the latter, viz., distributional
approach to inequality measurement.
Needless to state, a good definition of inequality would require us to come up with implementable definitions of income and the income recipient. In case we think we need to convert the
standard accounting measure of total family income into a measure that captures individual
welfare, we need to introduce an equivalence scale which defines the rate of exchange between
conventionally defined accounting income and an adjusted notion of equivalised income, which
works as a money metric of utility. For the moment, we understand that our notion of income
incorporates all such concerns.
With this preamble, let us turn to the details.
One way of illustrating inequality in the distribution of income is to divide the total population
into several equal sized income classes or so called fractiles from the poorest to the richest, and
then to evaluate the percentage of income accruing to each fractile or income class. Think of the
following example.
Table 1:
Income Class (Fifth of the Population)
Bottom
Second
Third
Fourth
Top
The top fifth gained at the expense of the lowest three income classes.
The distribution in Year X has less inequality than the distribution in Year Y, as the share of total
income held by the poorest classes is larger. Of course, this measure is somewhat crude as it ignores
the distribution of income within each income class or fractile. A way around is given by the
Lorenz curve, formulated by Max Otto Lorenz in 1905. It graphs the cumulative share of the total
income accruing to each cumulative share of the total population, when incomes are ordered from
the poorest to the richest. As the population share p varies over [0, 1], the Lorenz curve
indicates the share L(p) of total income received by the poorest fraction p of the population.
If all incomes are equal, the lowest p receives exactly p of the total income: its Lorenz curve
will be the diagonal from (0, 0) to (1, 1).
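As a minimal sketch of the construction just described, the following computes the points (p, L(p)) of an empirical Lorenz curve from a list of incomes. The function name `lorenz_points` and the two sample income lists are illustrative assumptions, not part of the notes.

```python
def lorenz_points(incomes):
    """Return (p, L(p)) pairs for an empirical income list, starting at (0, 0)."""
    xs = sorted(incomes)                 # order incomes from poorest to richest
    n, total = len(xs), sum(xs)
    points, running = [(0.0, 0.0)], 0.0
    for i, x in enumerate(xs, start=1):
        running += x                     # cumulative income of the poorest i persons
        points.append((i / n, running / total))
    return points

# Hypothetical data: one perfectly equal and one unequal distribution.
equal = lorenz_points([10, 10, 10, 10])
unequal = lorenz_points([1, 2, 3, 14])
```

For the equal distribution every point lies on the diagonal; for the unequal one every point lies on or below it.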
INSERT FIGURE ABOUT HERE.
As income becomes less equal, the associated Lorenz curve bows outward, away from the diagonal. Distribution 1 is said to have less inequality than Distribution 2 in the diagram below by
the Lorenz criterion if $L_1(p) \ge L_2(p)$ for all $p \in [0, 1]$, with strict inequality for some p. If two Lorenz
curves intersect, then they are not comparable by the Lorenz criterion.
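The Lorenz criterion can be sketched for two samples of equal size by comparing their Lorenz ordinates point by point. The helper names and the sample data below are hypothetical, and restricting attention to equal population sizes is a simplifying assumption of this sketch.

```python
def lorenz_ordinates(incomes):
    """Lorenz ordinates L(k/n) for k = 1..n."""
    xs = sorted(incomes)
    total = sum(xs)
    out, running = [], 0.0
    for x in xs:
        running += x
        out.append(running / total)
    return out

def lorenz_compare(a, b, tol=1e-12):
    """Return 'a' or 'b' for the less unequal list, 'equal', or 'not comparable'."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal population sizes")
    la, lb = lorenz_ordinates(a), lorenz_ordinates(b)
    a_ge = all(x >= y - tol for x, y in zip(la, lb))   # curve of a never below b
    b_ge = all(y >= x - tol for x, y in zip(la, lb))   # curve of b never below a
    if a_ge and b_ge:
        return "equal"
    if a_ge:
        return "a"
    if b_ge:
        return "b"
    return "not comparable"                            # the curves cross

dominated_case = lorenz_compare([2, 4, 6, 8], [1, 2, 3, 14])
crossing_case = lorenz_compare([1, 6, 6, 7], [3, 3, 3, 11])
```

In the second pair the curves cross, so the criterion gives no ranking, which is the partial-ordering problem in miniature.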
INSERT FIGURE ABOUT HERE.
In the above figure, $L_3$ has more inequality than $L_1$ and $L_2$. But $L_1$ and $L_2$ are not comparable.
The Lorenz curve offers only a partial ordering of income distributions with respect to inequality.
To obtain complete ordering, you need to find a numerical inequality measure. The Gini Coefficient is the most widely used numerical inequality measure, partly due to its interpretation in
terms of the Lorenz curve. Consider the area A between the diagonal and Lorenz curve of a given
distribution, as below.
INSERT FIGURES ABOUT HERE.
When all incomes are equal, the Lorenz curve coincides with the diagonal, and the area A is
then zero. At the other extreme, when only a single income is positive, the area tends to B, the
entire area below the diagonal. The Gini coefficient is the ratio of A to B, or equivalently, 2A
(given that B = 0.5). The potential range of the Gini coefficient is between 0 and 1; typical values
are between 0.20 and 0.60.
You can also consider the coefficient of variation. If we consider the distribution of income
as a probability distribution, where the probability of a specific income is the proportion of the
population holding it, we may define the mean, variance and standard deviation as usual. Then,
the coefficient of variation is simply the standard deviation divided by the mean, while its square
is the variance over the squared mean. Measuring income in Euros or thousands of Euros leaves
the coefficient of variation unchanged, while this is not the case for the variance. The problem with
the coefficient of variation is that it is transfer neutral: a redistribution of income at
the lower end of the income distribution has the same effect as an equivalent redistribution at the
upper end of the income distribution. One would expect inequality measures to weigh transfers at
the lower end more heavily.
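The mean is $\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$; the variance is $\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \mu)^2$; the Standard Deviation, denoted by $\sigma$, is the square root of $\sigma^2$, and the Coefficient
of Variation, CV, is given by $\sigma / \mu$.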
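The two properties of the coefficient of variation discussed above can be checked numerically. This is only a sketch: it uses the sample standard deviation from Python's `statistics` module (divisor n - 1), which is an assumption about the intended convention, and all income figures are made up.

```python
import statistics

def coefficient_of_variation(incomes):
    """Standard deviation divided by the mean (sample version, divisor n - 1)."""
    return statistics.stdev(incomes) / statistics.mean(incomes)

# Unit change: the same incomes in Euros and in thousands of Euros.
euros = [12000, 18000, 25000, 40000, 90000]
thousands = [x / 1000 for x in euros]

# Transfer neutrality: move 2 units from a richer to a poorer person,
# once at the bottom (20 -> 10) and once at the top (90 -> 80).
base = [10, 20, 50, 80, 90]
low_end = [12, 18, 50, 80, 90]
high_end = [10, 20, 50, 82, 88]

cv_low = coefficient_of_variation(low_end)
cv_high = coefficient_of_variation(high_end)
```

Both transfers lower the CV relative to `base` by the same amount, so the index cannot tell a transfer among the poor from one among the rich.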
Instead of thinking in terms of specific inequality indexes as above, let us think about general
principles.
Let us quickly look at the key concepts that we need in order to compare income distributions in
the context of measuring inequality. An inequality ordering, denoted by $\succeq_I$, means a complete and
transitive binary relation on D, the set of all distributions. If it is continuous, we can represent it
as $I : D \to \mathbb{R}$. A Social Welfare Function (SWF), W, can be defined as $W : D \to \mathbb{R}$. The strict
ordering and indifference relations are defined in the usual manner. The properties of I and W
are determined by the fundamental distributional axioms or ethical principles. Let us turn to a
quick recap of the axioms now.
Axiom 1: Anonymity $(x_1, x_2, \ldots, x_n) \sim_I (x_2, x_1, x_3, \ldots, x_n) \sim_I (x_1, x_3, x_2, \ldots, x_n) \sim_I \ldots$
This axiom states that permutations of names or labels of persons are regarded as distributionally equivalent. It implies that we only make use of the information about the income variable and
not any other characteristic which might be discernible in a sample.
Axiom 2: The Population Principle (Dalton) $(x_1, x_2, \ldots, x_n) \sim_I (x_1, x_1, x_2, x_2, \ldots, x_n, x_n) \sim_I (x_1, x_1, x_1, x_2, x_2, x_2, \ldots, x_n, x_n, x_n) \sim_I \ldots$
The Population Principle says that an income distribution is to be regarded as distributionally
equivalent to a distribution formed by replications of it. That is, it is immune to replication.
Axiom 3: Principle of Transfers (Pigou-Dalton) $G \succ_I F$ if distribution G can be obtained
from F by a mean preserving spread.
How do we understand this axiom? Consider an arbitrary distribution, say, $x_A = (x_1, \ldots, x_i, \ldots, x_j, \ldots, x_n)$
and a number $\delta$ such that $0 < \delta < x_i \le x_j$; from $x_A$ let us form another distribution
$x_B = (x_1, \ldots, x_i - \delta, \ldots, x_j + \delta, \ldots, x_n)$. The axiom of transfers then ranks $x_B$ as more unequal than $x_A$.
The rest of the components of $x_A$ and $x_B$ are kept identical.
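The transfer described in this axiom can be sketched as a small function that moves an amount delta from person i to a person j who is at least as rich, leaving everyone else untouched. The function names and the data are hypothetical; the Gini coefficient (in its pairwise mean absolute difference form) is used here only as one example of an index that respects the axiom.

```python
def regressive_transfer(x, i, j, delta):
    """Move delta from person i to person j, requiring 0 < delta < x[i] <= x[j]."""
    if not (0 < delta < x[i] <= x[j]):
        raise ValueError("need 0 < delta < x[i] <= x[j]")
    y = list(x)          # all other components stay identical
    y[i] -= delta
    y[j] += delta
    return y

def gini(incomes):
    """Normalized mean absolute difference over all pairs of incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(abs(a - b) for a in incomes for b in incomes) / (2 * n * n * mu)

x_a = [10, 20, 30, 40]
x_b = regressive_transfer(x_a, 0, 3, 5)   # poorest gives 5 to the richest
```

Total income is unchanged by the transfer, while the index rises from `gini(x_a)` to `gini(x_b)`.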
Axiom 4: Monotonicity $G \succ_W F$ if distribution G can be obtained from F by a rightward
translation of some probability mass.
That is, let $x_A = (x_1, \ldots, x_i, \ldots, x_n)$. Get $x_B$ from $x_A$ as $x_B = (x_1, \ldots, x_i + \delta, \ldots, x_n)$ with $\delta > 0$. Then
this axiom says that welfare is higher in $x_B$ than in $x_A$. You can also think of a uniform rightward
translation of the whole distribution instead of the Monotonicity axiom.
Axiom 5: Scale Invariance If all incomes go up or down by a scalar, then the ordering is
unaffected.
That is, for $F, G \in D$, if $G \succeq_I F$, then $G(x/k) \succeq_I F(x/k)$ for a scalar multiple $k \in \mathbb{R}_+$. You
could also think of translation invariance instead of scale invariance.
Axiom 6: Decomposability Given $F, G, K \in D$ and $\delta \in [0, 1]$, $G \succeq_I F$ implies
$(1 - \delta)G + \delta K \succeq_I (1 - \delta)F + \delta K$.
If we combine distribution K in the same way with both G and F , where all the distributions
F, G, K have the same mean, the original ordering should still be preserved. This is analogous to
the independence axiom of the N-M utility theory.
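Three of the axioms above (Anonymity, the Population Principle and Scale Invariance) are easy to check numerically for a given index. The sketch below does so for the Gini coefficient in its pairwise form; the choice of index, the data and the scale factor k = 3 are all illustrative assumptions.

```python
def gini(incomes):
    """Normalized mean absolute difference over all pairs of incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(abs(a - b) for a in incomes for b in incomes) / (2 * n * n * mu)

x = [10, 20, 30, 40]
permuted = [30, 10, 40, 20]       # Anonymity: relabel the persons
replicated = x + x                # Population Principle: clone the population
rescaled = [3 * v for v in x]     # Scale Invariance: multiply every income by k = 3

values = [gini(d) for d in (x, permuted, replicated, rescaled)]
```

All four entries of `values` coincide, as the three axioms require of a relative inequality index.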
Dalton argued that underlying any inequality measure, there ought to be some concept of social
welfare. Let us follow him to assume an additively separable and symmetric function of individual
incomes; then we would rank distributions according to:
$$W = \int_a^b u(x) f(x)\,dx = \int_a^b u(x)\,dF(x)$$
3.1
For all $F \in D$, and for all $0 \le q \le 1$, the quantile functional is defined by:
$$Q(F; q) = \inf\{x \mid F(x) \ge q\} = x_q.$$
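For an empirical distribution the quantile functional reduces to picking an order statistic. The sketch below is one possible implementation; the function name `quantile`, the sample incomes, and the convention of returning the lowest income at q = 0 are assumptions made for illustration.

```python
import math

def quantile(incomes, q):
    """Smallest income x whose empirical CDF value F(x) is at least q."""
    if not 0 <= q <= 1:
        raise ValueError("q must lie in [0, 1]")
    xs = sorted(incomes)
    n = len(xs)
    # F at the k-th smallest income is at least k/n, so we need the smallest k with k/n >= q.
    # The small offset guards against floating point noise in q * n.
    k = max(1, math.ceil(q * n - 1e-12))   # convention: q = 0 returns the lowest income
    return xs[k - 1]

parade = [5, 10, 20, 40, 100]              # hypothetical incomes
x_20 = quantile(parade, 0.2)               # income 20 percent of the way along the parade
x_80 = quantile(parade, 0.8)               # income 80 percent of the way along the parade
```

Plotting `quantile(parade, q)` against q traces out the parade in ascending order of income.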
INSERT FIGURE ABOUT HERE.
For any distribution of income F , the graph Q describes the ascending order of income, where
x0.2 gives the income of the person who appears exactly twenty percent along the parade of people.
It helps to think of the above diagram as an inverse of the one that appears below.
INSERT FIGURE ABOUT HERE.
Theorem 1
$G \succeq_Q F$ if and only if $W(G) \ge W(F)$ for all $W \in W_1$, where $W_1$ is the class of increasing SWFs.
If each quantile in distribution G is no less than the corresponding quantile in distribution F ,
and at least one quantile is strictly greater, then distribution G will be assigned higher welfare level
by every SWF in the class W1 (that is, the class of increasing evaluation functions of income, viz.,
u(.)).
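A quick numerical illustration of Theorem 1, under the simplifying assumption that both samples have the same size so that quantiles can be compared by sorting. The function names, the data, and the particular increasing functions u used in the check are hypothetical choices.

```python
import math

def quantile_dominates(g, f):
    """True if every quantile of g is at least that of f, and at least one is strictly greater."""
    if len(g) != len(f):
        raise ValueError("this sketch assumes equal sample sizes")
    gs, fs = sorted(g), sorted(f)
    return all(a >= b for a, b in zip(gs, fs)) and any(a > b for a, b in zip(gs, fs))

def welfare(incomes, u):
    """Additively separable SWF: the average of u over all incomes."""
    return sum(u(x) for x in incomes) / len(incomes)

F = [10, 20, 30, 40]
G = [12, 20, 35, 40]      # weakly higher at every position of the parade
H = [5, 25, 30, 45]       # lower at the bottom, higher above: no dominance either way

increasing_us = [math.log, math.sqrt, lambda x: x, lambda x: x ** 2]
g_beats_f = all(welfare(G, u) > welfare(F, u) for u in increasing_us)
```

G is ranked above F by each increasing u tried, while F and H cannot be ranked by quantile dominance at all.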
First order dominance has two limitations. First, as we have seen in the first order stochastic dominance part of the lectures, in practical applications it is very often the case that neither distribution first order stochastically dominates
the other. Second, it may not incorporate all the standard principles of social welfare analysis; for
example, it does not incorporate the principle of transfers. For this reason, it is good to introduce
the idea of Second order distributional dominance.
3.1.1
$$\int_a^b s(x)\,dx = \int_a^b x\,s(x)\,dx = 0.$$
Suppose we obtain the probability density function g(x) from another density function, say, f (x)
by the Mean Preserving Spread s(x); that is:
g(x) = f (x) + s(x).
We will then get:
$$\int_a^b g(x)\,dx = \int_a^b f(x)\,dx + \int_a^b s(x)\,dx = 1,$$
$$\int_a^b x\,g(x)\,dx = \int_a^b x\,f(x)\,dx + \int_a^b x\,s(x)\,dx = \int_a^b x\,f(x)\,dx.$$
Defining $S(x) = \int_a^x s(u)\,du$, we have $S(b) = \int_a^b s(u)\,du = 0 = S(a)$.
There exists $z \in (0, 1)$ such that $S(x) \ge 0$ if $x \le z$, and $S(x) \le 0$ for $x > z$.
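The properties of a mean preserving spread can be checked on a discrete analogue, with probability masses in place of densities and sums in place of integrals. Everything in this sketch (the five income levels, the masses f and the spread s) is a made-up example, not taken from the notes.

```python
xs = [1, 2, 3, 4, 5]                     # hypothetical income levels
f = [0.10, 0.20, 0.40, 0.20, 0.10]       # original probability masses
s = [0.05, 0.00, -0.10, 0.00, 0.05]      # spread: mass moves from the centre to the tails
g = [fi + si for fi, si in zip(f, s)]    # g = f + s

def mean(p):
    """Sum of x times mass; applied to s it gives the discrete analogue of the integral of x s(x)."""
    return sum(x * pi for x, pi in zip(xs, p))

def variance(p):
    m = mean(p)
    return sum(pi * (x - m) ** 2 for x, pi in zip(xs, p))

# Running sum of s: the discrete counterpart of S(x).
S, running = [], 0.0
for si in s:
    running += si
    S.append(running)
```

Here s sums to zero and has zero first moment, so g is again a probability distribution with the same mean as f but a larger variance; the running sum S is non-negative at low incomes, non-positive at high incomes, and returns to zero at the top.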
With this quick recall of the idea of Mean Preserving Spread, let us get back to our main argument.
3.2
$$C(F; q) = \int_a^{Q(F; q)} x\,dF(x)$$
Then, note carefully that $C(F; 0) = 0$ and $C(F; 1) = \mu(F)$. You can take the income to range
from a to b, as we have done before in the stochastic dominance section. The graph of C(F; q)
against q gives the Generalized Lorenz curve.
INSERT FIGURE ABOUT HERE.
Theorem 2
For all $F, G \in D$, $G \succeq_C F$ if and only if $W(G) \ge W(F)$ for all $W \in W_2$; that is, the class of
u(.) that is increasing and concave.
INSERT FIGURE ABOUT HERE.
The distribution G second order dominates F in the above diagram. The Generalized Lorenz curve
(GLC) is a fundamental tool for drawing inferences about welfare from individual income data.
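Theorem 2 can be illustrated with two small samples where quantile (first order) dominance fails but the generalized Lorenz curves are ordered. The sketch assumes equal sample sizes; the function names, the data and the two concave functions used in the check are hypothetical.

```python
import math

def generalized_lorenz(incomes):
    """Ordinates C(F; k/n): the sum of the k lowest incomes divided by n, for k = 1..n."""
    xs = sorted(incomes)
    n = len(xs)
    out, running = [], 0.0
    for x in xs:
        running += x
        out.append(running / n)
    return out

def second_order_dominates(g, f, tol=1e-12):
    """True if the GLC of g lies nowhere below that of f and somewhere strictly above."""
    if len(g) != len(f):
        raise ValueError("this sketch assumes equal sample sizes")
    cg, cf = generalized_lorenz(g), generalized_lorenz(f)
    return (all(a >= b - tol for a, b in zip(cg, cf))
            and any(a > b + tol for a, b in zip(cg, cf)))

def welfare(incomes, u):
    """Additively separable SWF: the average of u over all incomes."""
    return sum(u(x) for x in incomes) / len(incomes)

F = [10, 20, 30, 40]
G = [15, 20, 25, 40]   # same mean as F; lower third income, so no quantile dominance

glc_G = generalized_lorenz(G)
glc_F = generalized_lorenz(F)
```

The last ordinate of each curve equals the mean income, and both increasing concave functions tried (square root and logarithm) rank G above F.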
3.3 Lorenz Curve
A closely linked concept is the traditional Lorenz curve, also known as the Relative Lorenz Curve:
$$L(F; q) = \frac{C(F; q)}{\mu(F)}.$$
Result: An inequality measure $I : D \to \mathbb{R}$ is Lorenz consistent if and only if it satisfies the axioms
of Anonymity, Population Principle, Homogeneity, and the Principle of Transfers.
One direction is quite easy to see. Let $I : D \to \mathbb{R}$ be a measure of inequality. Suppose that I is Lorenz
consistent. Then, we know that L satisfies all the above axioms. Hence, by Lorenz consistency, the
measure I too satisfies all the above axioms. The reverse is slightly more involved, which we skip.
This result underscores the point that the class of Lorenz consistent inequality measures is
exactly the same as the class of relative inequality measures. This leads us to conclude that:
Result: $I(F) > I(G)$ for every relative inequality measure I if and only if $G \succ_L F$; similarly,
$I(F) \ge I(G)$ for every relative inequality measure I if and only if $G \succeq_L F$.
However, it is in the nature of the general ranking principles that in many practical situations,
they tend to yield an indecisive answer. In such cases, the specific indices come in handy. Let us take
a quick look at two such measures. The most popular one is the Gini coefficient.
3.4 Gini Coefficient
$$G(F) = \frac{1}{2\mu(F)} \iint |x - x'|\,dF(x)\,dF(x')$$
This definition is the normalized average absolute difference between all pairs of incomes in the
population. It captures the idea of average distance between incomes in the population according
to the $L_1$ metric on $\mathbb{R}^n$: $\sum_{j=1}^{n} |x_j - x'_j|$ for $x, x' \in \mathbb{R}^n$.
The other way to define the Gini Coefficient is:
$$G(F) = 1 - 2\int_0^1 L(F; q)\,dq$$
That is, the Gini coefficient is twice the area between the Lorenz curve and the diagonal, or equivalently,
the normalized area between the Lorenz curve and the diagonal.
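The two definitions of the Gini coefficient can be compared on a small sample. In this sketch the Lorenz curve is taken to be piecewise linear between the sample points, so the integral is computed exactly by the trapezoid rule; the function names and the data are illustrative assumptions.

```python
def gini_pairwise(incomes):
    """Normalized mean absolute difference over all pairs of incomes."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum(abs(a - b) for a in incomes for b in incomes) / (2 * n * n * mu)

def gini_lorenz(incomes):
    """One minus twice the area under the piecewise linear Lorenz curve."""
    xs = sorted(incomes)
    n, total = len(xs), sum(xs)
    area, prev, running = 0.0, 0.0, 0.0
    for x in xs:
        running += x
        cur = running / total            # Lorenz ordinate at the next population share
        area += (prev + cur) / (2 * n)   # trapezoid of width 1/n
        prev = cur
    return 1 - 2 * area

sample = [10, 20, 30, 40]
g1 = gini_pairwise(sample)
g2 = gini_lorenz(sample)
```

Both routes give the same number for this sample, and both give zero when all incomes are equal.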
3.5
Let point F in the diagram below represent an income distribution in a two-person economy. Then,
we can find the mean income $\mu$ as the abscissa of the point M where the 45 degree line through
F intersects the equality ray. The equally distributed equivalent, $\xi$, is the abscissa of the point E
where the W-contour through F intersects the equality ray.
INSERT FIGURE ABOUT HERE.
The farther F is on the constant total income line from the perfect equality point M, the lower is
$\xi$. The normalized gap between $\xi$ and $\mu$ then provides a natural basis for an inequality index:
$$I_A(F) = 1 - \frac{\xi(F)}{\mu(F)}.$$
For any given income distribution, the more sharply convex to the origin is the W-contour, the
greater is the gap between $\xi$ and $\mu$. In the extreme case, you will get the following, where the SWF
is L-shaped.
INSERT FIGURE ABOUT HERE.
This value, (F ) is interpreted as aversion to inequality. Put this way, there is a strong link
between the ideas here and the Economics of Uncertainty. For example, the equally distributed equivalent, viz., $\xi(F)$, is nothing but the Certainty Equivalent in the N-M theory.
Of course, the equally distributed equivalent income level will be different for different social
welfare functions, and so the inequality measure will be dependent on the choice of the social welfare
function.
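To see how the index depends on the choice of social welfare function, the sketch below assumes an isoelastic evaluation function, u(x) = x^(1 - epsilon) / (1 - epsilon), with u(x) = log x at epsilon = 1. This functional form and the parameter name `epsilon` are assumptions introduced for illustration; the notes themselves do not fix a particular u. Incomes must be strictly positive.

```python
import math

def equally_distributed_equivalent(incomes, epsilon):
    """Income which, if received by everyone, gives the same welfare as `incomes`."""
    n = len(incomes)
    if epsilon == 1:
        # Log utility: the equivalent income is the geometric mean.
        return math.exp(sum(math.log(x) for x in incomes) / n)
    return (sum(x ** (1 - epsilon) for x in incomes) / n) ** (1 / (1 - epsilon))

def atkinson_index(incomes, epsilon):
    """One minus the ratio of the equally distributed equivalent to the mean."""
    mu = sum(incomes) / len(incomes)
    return 1 - equally_distributed_equivalent(incomes, epsilon) / mu

sample = [10, 20, 30, 40]
by_aversion = [atkinson_index(sample, e) for e in (0, 1, 2)]
```

With epsilon = 0 the contours are straight lines and the index is zero; as epsilon rises the contours become more sharply convex and the measured inequality of the same distribution increases.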
Poverty Measures
Even though we will not go into poverty measures, let us understand certain subtle differences
between inequality measurement and poverty measurement. Any practical approach to poverty
has to specify a poverty line, say, x. This value could be a unique exogenously given value, or
alternatively, some functional of the distribution F, or a set of possible values.
Given such a value, there is a clear partition of the population into poor and non-poor. A simple poverty
measure is then the Head Count Ratio, viz., the proportion of individuals with incomes below the poverty line. Now, you
cannot use Anonymity axiom for example in the poverty context without additional changes. You
may want to apply the anonymity axiom to only the partition of the non-poor: a permutation of
the non-poor alone should leave the poverty index unaltered. With such suitable modifications, you
would then be able to apply the stochastic dominance structure to poverty comparisons. However,
it is beyond our scope. For those interested in the poverty measures, a good place to begin would
be: Atkinson, A.B., 1987, On the Measurement of Poverty, Econometrica, Vol. 55, pp. 749-764.
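As a small illustration of the Head Count Ratio and of the restricted anonymity idea mentioned above, the sketch below uses a hypothetical poverty line and made-up incomes; the function name is an assumption.

```python
def head_count_ratio(incomes, poverty_line):
    """Share of the population with income strictly below the poverty line."""
    return sum(1 for x in incomes if x < poverty_line) / len(incomes)

z = 10                                     # hypothetical, exogenously given poverty line
incomes = [4, 7, 12, 30, 55]
non_poor_permuted = [4, 7, 55, 12, 30]     # same poor, non-poor relabelled

hcr = head_count_ratio(incomes, z)
hcr_permuted = head_count_ratio(non_poor_permuted, z)
```

Permuting the incomes of the non-poor leaves the ratio unchanged, in line with the restricted form of anonymity discussed above.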