Essays on Education and the Marriage Market

Danyan Zha

Submitted in partial fulfillment of the


requirements for the degree of
Doctor of Philosophy
in the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2019
© 2019
Danyan Zha
All rights reserved
ABSTRACT

Essays on Education and the Marriage Market

Danyan Zha

Chapter one of this thesis examines one of the largest primary school construction programs, INPRES SD,

in the late 1970s in Indonesia. Using the variation across regions in the number of schools constructed and the

variation across birth cohorts, I show that in densely populated areas, primary school construction did not

affect the primary school attainment rate. More surprisingly, the program decreased the secondary school

attainment rate for both men and women due to a crowding out of teacher resources.

Chapter two of this thesis examines how education distribution affects the marriage market, in particular,

female marriage age. I first develop a two-to-one dimensional matching model with transferable utility in an

OLG framework, in which the marital surplus allows complementarity between men’s education and both

characteristics of women: education and youth, to understand how female marriage age is affected by others’

education. I then use INPRES SD as a quasi-natural experiment and find that a woman marries earlier and

the spousal age gap increases when fewer women in her birth cohort graduate from secondary school and

the education distribution of their potential husbands does not change. The empirical finding suggests that

men’s education and women’s young age are complementary in generating the marital surplus in the current

setting.

Chapter three of this thesis examines how the hukou system affects the marriage market in China. I build a

bidimensional matching model in which individuals are characterized by a continuous attribute (that indicates

socioeconomic status) and a discrete attribute (hukou status, either rural or urban). Urban hukou is more

valuable for men than women since it is more likely for a woman to move to her husband’s location upon

marriage in a patrilocal society. The model gives predictions on the matching patterns which are validated

using the China 2000 0.095% sample census.


Contents

List of Figures

List of Tables

Acknowledgements

1 Chapter 1. Schooling Expansion and Education
1.1 Introduction
1.2 School Construction and Education in Indonesia
1.3 Data
1.4 Empirical Strategy
1.5 Results
1.6 Mechanism
1.7 Conclusion
1.8 Figures
1.9 Tables
1.10 Appendix

2 Chapter 2. Schooling Expansion and the Female Marriage Age
2.1 Introduction
2.2 Model
2.3 The Marriage Market in Indonesia
2.4 Data
2.5 Main Results
2.6 Discussion
2.7 Conclusion
2.8 Figures
2.9 Tables
2.10 Appendix

3 Chapter 3. Multidimensional Matching: Hukou status in the Marriage Market
3.1 Introduction
3.2 Model
3.3 Empirical Application
3.4 Conclusion and Discussion
3.5 Figures
3.6 Tables
3.7 Appendix

References
List of Figures

1.1 Map of primary school construction intensity
1.2 Construction intensity for sparsely and densely populated areas
1.3 Coefficients of the interactions: age in 1974 * program intensity in the region of birth in the education equation for completing primary school
1.4 Coefficients of the interactions: age in 1974 * program intensity in the region of birth in the education equation for completing secondary school
1.5 Effect of the program on completing primary school (top) and completing secondary school (bottom) in sparsely populated areas
1.6 Effect of the program on completing primary school (top) and completing secondary school (bottom) in densely populated areas
1.7 Coefficients of the interactions: census year * program intensity in the regency in average number of secondary school teachers equation
1.8 Effect of the program on primary teacher education
1.A.1 Number of newly appointed primary school teachers, 1974-1998
2.1 Marriage frequencies (left) and marriage proportions (right) by education for females
2.2 Effect of the program on female age at first marriage (top) and the spousal age gap (bottom) in sparsely populated areas
2.3 Effect of the program on female age at first marriage (top) and the spousal age gap (bottom) in densely populated areas
3.1 Predicted matching pattern with symmetric population
3.2 Matching patterns under different parameter values
3.3 Predicted matching pattern with asymmetric population
3.4 Predicted matching pattern with asymmetric population (Detailed)
3.5 Utility in the quadratic example
3.6 Comparative statics
3.7 Heatmap for husbands’ and wives’ education
List of Tables

1.1 Summary statistics
1.2 Effect of school construction on education
1.3 Heterogeneous effect of school construction on education
1.4 Effect of school construction on number of teachers in secondary and primary education
1.A1 Inpres Sekolah Dasar
1.A2 Development grant to regions (in billion Rp.)
2.1 Summary statistics
2.2 First marriage age
2.3 Reduced-form effect of school construction on female marriage outcomes
2.4 Results of female education distribution on female marriage outcomes
3.1 Summary Statistics
3.2 Matching patterns by hukou status
3.3 Positive assortative matching in each matching category
3.4 Selection into four matching categories
3.5 Explain individuals’ education using spousal characteristics, by hukou status
3.6 Unconditional and conditional correlation between hukou and SES, by spousal type
Acknowledgements

I am deeply grateful and indebted to my three advisors, Pierre-André Chiappori, Bernard Salanié and Cristian

(Kiki) Pop-Eleches for their continuous encouragement, guidance and support. Pierre-André sparked

my research interest in using matching theories to understand marriage markets. Our discussions and his

sharp intuition have helped me develop my own skills to tackle complicated issues with simple models.

Bernard taught me the importance of details and perseverance in research. He has spent enormous time

reading my drafts and helping make them better. Kiki helped me come up with the idea of using Indonesia

as a setting for the empirical analysis in the first two chapters. He encouraged me ceaselessly and was always

there to help.

I also thank other committee members, Suresh Naidu and Jack Willis. Suresh has helped me since my

second year and discussions with him always broadened my thinking. Jack has helped me improve the first

two chapters a lot since he joined Columbia. I have also learned a lot from other development faculty at

Columbia: Alex Eble, Jonas Hjort, Rodrigo Soares and Eric Verhoogen. I would also like to thank Esther

Duflo for generously sharing her original school construction data to make the first two chapters possible. I

also benefited a lot from email exchanges with Natalie Bau.

I have also benefited greatly from my fellow colleagues at Columbia, both in research and in life, including

So Yoon Ahn, Ashna Arora, Yoon Joo Jo, Sun Kyoung Lee, Xuan Li, Yifeng Luo, Anh Nguyen, Qiuying

Qu, Anurag Singh, Yue Yu, Qing Zhang and Weijie Zhong.

Lastly, I am immensely grateful to my family and Lihong for their unconditional love and support.

Dedicated to the memory of my mother, who has always believed in me.

Chapter 1

Schooling Expansion and Education

1.1 Introduction

Human capital is key to individuals’ lifetime outcomes and countries’ economic growth.¹ This provides

a rationale for world-wide schooling expansion, especially in low- and middle-income countries in the past

three decades (World Bank, 2018). Many papers have documented the positive effect of these education

policies on individuals’ education, wage, income, wealth and health outcomes (Malamud et al., 2018; Jürges

et al., 2011). Careful evaluations are necessary to help governments and international organizations

find the most cost-effective policies.²

The potential existence of externalities makes program evaluation even harder. A program that targets

one particular population may affect another population through information transmission, resource allocation

or other channels. Untreated individuals may benefit if there is a positive social externality, or may

be affected negatively if their resources decrease because of the program.

This chapter evaluates INPRES SD in Indonesia, one of the largest and most successful primary school

construction programs to date. I find that, surprisingly, building primary schools has an unintended consequence

for secondary education. It actually has a negative impact on the secondary school attainment rate for both

men and women in more densely populated areas due to a crowding-out of teacher resources.³

This chapter builds upon earlier studies using the same schooling expansion program (Duflo, 2001; Ashraf

et al., 2016). I replicate some of their findings but also find some surprising results not mentioned in the

previous literature. Consistent with previous findings, there is a positive effect on primary school attainment

1 A large literature in economic growth (see Mankiw et al., 1992; Young, 1994,9; Barro and Sala-i-Martin, 1995) documents

the importance of endogenous human capital to economic growth.


2 See Glewwe and Kremer (2006), Glewwe et al. (2013), McEwan (2015) and Glewwe and Muralidharan (2016) for good reviews

of research on the effectiveness of education policies.


3 The secondary school attainment rate is defined as the percentage of people completing at least secondary school for a

given birth cohort in one place. Similarly, the primary school attainment rate is defined as the percentage of people who
complete primary school or above.

rate for men but not for women. However, I find a negative effect on secondary school attainment rate for

women in the full-sample analysis.⁴ As suggested in Duflo (2001), the program may have different effects

in sparsely populated and densely populated regions. Exploring the heterogeneity of the effects depending

on population densities of the regions considered, I find that in sparsely populated regions, the school

construction program had a positive effect on both primary school and secondary school attainment rates

for men but not for women; in densely populated regions, the school construction program did not affect

primary school attainment rate but had a negative effect on secondary school attainment rate for both men

and women.

I then investigate two potential mechanisms leading to the negative secondary school attainment result:

(1) a decrease in secondary school quality due to resources being crowded out; (2) a decrease in primary

school quality due to the massive scale of construction. The analysis supports the first mechanism. Building

a primary school increases the total demand for teachers in a region. In the 1970s, teacher hiring was very

centralized, hence a huge demand from primary schools may affect the availability of potential new teachers

in secondary education. Moreover, competition for teachers can be stronger in densely populated

regions than in sparsely populated regions, since it is easier for teachers to relocate within the former. I show

that the total number of teachers and the average number of teachers in secondary school increase less in the

regions where more primary schools were constructed after the launch of the school construction program.

The negative effect on teacher availability in secondary education in future years only exists in densely

populated regions, not in sparsely populated regions. Moreover, rapidly constructing primary schools could

also decrease the quality of primary education.

Using the education level of primary school teachers in the censuses as a proxy for school quality, I show

that teacher education increases less in regions where more primary schools were constructed. However, I

do not find a difference between sparsely and densely populated regions. In summary, the negative result

on secondary school attainment rate is due primarily to the crowding out of teacher resources in secondary

education because of the primary school construction.

This chapter is closely related to other papers that have studied the effect of INPRES SD program since

the seminal paper by Duflo (2001), which focuses on men and finds a positive effect on male years of schooling

4 This is consistent with other papers evaluating this program using different datasets (Akresh et al., 2018).

and their wages in 1995 using SUPAS 1995. Using the same dataset, Ashraf et al. (2016) instead look at

women and find that the program increases years of schooling for women, but only for the ethnicities that

practice bride price. Breierova and Duflo (2004) show that mother’s and father’s education are equally

important factors in reducing child mortality. Using the fifth wave of the Indonesian Family Life Survey

(IFLS, 2014), Bharati et al. (2018) show that school construction increases schooling for individuals

who experienced a negative shock (low rainfall) in the first year of life but not for those who did not experience the

adverse rainfall shock, partly due to deteriorating school infrastructure and increased competition. Akresh

et al. (2018) and Rosales-Rueda et al. (2019) examine the long-term and intergenerational effects of this

program on individuals’ work choice, household behavior, and children’s education. Martinez-Bravo (2017)

shows that local public good provision increases in the villages where the education of the village heads

increases due to this program. Dominguez (2014) uses structural estimation to show that an increase in

the number of primary school graduates increases the single rate and decreases the marital utility of primary school

graduates.

This chapter also contributes to the literature studying the effect of teacher availability and quality on

student learning outcomes. Many papers have shown that teacher quality matters more to students’

achievement than other education inputs, including class size and school infrastructure (Rivkin et al., 2005;

Hanushek, 2011).

Finally, this chapter provides evidence for the existence of another type of externality in large-scale

interventions. Miguel and Kremer (2004) show a large positive externality of deworming for untreated

children in the treatment and neighboring schools. Bobonis and Finan (2009) find a positive externality

of the PROGRESA program for program-ineligible children’s secondary school participation in the program

communities. Castro and Esposito (2018) find a negative externality of the bonus paid to incentivize

teachers on nearby rural schools.

1.2 School Construction and Education in Indonesia

INPRES primary school construction program in Indonesia

The Indonesian government has consistently sought to broaden educational opportunity since the country’s

independence in 1945. However, due to financial difficulties and political conflict, in the country’s early

years, Indonesia remained backward relative to neighboring countries and to countries with similar levels

of income. As late as the 1971 population census, only 62% of primary school-aged children (ages 7-12

inclusive) were enrolled in any kind of school, while only 54% appeared on the rolls of public and private

schools reporting to the Ministry of Education (see Snodgrass, 1984). Due to increased oil production

and the first OPEC-engineered price rise in 1972-1973, which unexpectedly raised government revenue,

a primary school construction aid program (Program Bantuan Pembangunan Sekolah Dasar), known as

INPRES Sekolah Dasar⁵ and more informally as INPRES SD, was inaugurated in 1973 to help increase

the primary school enrollment rate, which had been stagnant before 1973.

Between 1973/74 and 1978/79, 62,000 primary schools were scheduled to be built. Each school consists

of three classrooms, and each classroom has one teacher and can accommodate 40 pupils. The allocation rule

every year was as follows: (a) each district (kecamatan in Indonesian, one level below the regency

and two levels below the province) was allocated at least one school and each province at least 50; (b)

the remainder was distributed according to the estimated population of unenrolled 7-12 year old children

(Snodgrass et al., 1980). This creates variation in the construction intensity that I exploit in my empirical

analysis.
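The allocation rule described above can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the procedure the Indonesian government actually followed: it handles a single level of units at a time (for example provinces with a floor of 50, or districts with a floor of 1), and it assumes largest-remainder rounding, which the source does not specify. The function name, unit labels and example numbers are hypothetical.

```python
def allocate_schools(total, floors, weights):
    """Give every unit its floor, then split the remainder in proportion to weights.

    floors  -- minimum number of schools per unit (e.g. 50 per province)
    weights -- estimated unenrolled children aged 7-12 per unit
    Rounding uses the largest-remainder method (an assumption of this sketch).
    """
    base = sum(floors.values())
    weight_sum = sum(weights.values())
    if total < base:
        raise ValueError("total is too small to honor every floor")
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")
    remainder = total - base
    shares = {u: remainder * weights[u] / weight_sum for u in floors}
    extra = {u: int(shares[u]) for u in floors}  # integer part of each share
    leftover = remainder - sum(extra.values())
    # Hand the schools lost to rounding to the units with the largest fractions.
    by_fraction = sorted(floors, key=lambda u: shares[u] - extra[u], reverse=True)
    for u in by_fraction[:leftover]:
        extra[u] += 1
    return {u: floors[u] + extra[u] for u in floors}


# Hypothetical example: three provinces, each guaranteed 50 schools.
example = allocate_schools(
    total=1000,
    floors={"A": 50, "B": 50, "C": 50},
    weights={"A": 100_000, "B": 300_000, "C": 600_000},
)

# Planned capacity implied by the text: schools x classrooms x pupils per classroom.
planned_places = 62_000 * 3 * 40
```

Because the floors do not depend on the number of unenrolled children, units with few unenrolled children receive relatively more schools per child, which is one source of the variation in construction intensity.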

In addition to school construction, the government also provided textbooks and teacher training to ensure

that the buildings were used for education purposes. Moreover, the primary school fee was abolished in 1977.

By 1983, nearly all Indonesian children had at least begun to enroll in primary school, while the percentage

of 7-12 year olds enrolled exceeded 90%. INPRES SD has been a successful case of education policies in

developing countries.

This program spanned 1973/74-1988/89, as shown in Table 1.A1. However, in the empirical analysis,

only the first six years (1973/74-1978/79) were used because the region-specific construction targets were

5 INPRES stands for Instruksi Presiden (Presidential Instruction) and Sekolah Dasar means primary school.

only available for these years. For later years, only the aggregate number of school construction targets

was available.⁶ In the empirical work, I compare older cohorts who would not have been affected

by the program with younger cohorts who were at least age 7 in 1979. Not observing school construction

data after 1979 does not invalidate this comparison. However, it still creates two issues. First, the

effect of one additional school may be overestimated if schools built after 1979 had a positive

effect on students who had enrolled before 1979. Second, it makes the mechanism behind the negative

effect on secondary school attainment harder to test. If school construction had ended in

1979, I could test whether a short-run teacher shortage contributed to the negative effect on

secondary school education by comparing the children who were exposed to school construction between

1973 and 1979 with even younger children. If the shortage were short-run, the negative

effect should diminish for the younger children.

Education system in Indonesia

In Indonesia, the education system consists of six years of primary school (sekolah dasar, SD), three years

of middle school (sekolah menengah pertama, SMP) and three years of high school (sekolah menengah atas,

SMA), followed by various kinds of higher education. Children generally begin primary school at age 7. Two

ministries are responsible for managing the education system, with 84 percent of schools being under the

Ministry of National Education and the remaining 16 percent being under the Ministry of Religious Affairs.

In the 2000 census, although 86.1 percent of the Indonesian population was registered as Muslim, only 15

percent of school-aged individuals attended religious schools (Frederick and Worden, 1993).

INPRES 1973 initiated Indonesia’s program of compulsory education, but six-year compulsory education

for primary school-aged children (7-12 age group) was not fully implemented until 1984. In May 1994, nine-

year compulsory education for the 7 to 15 age group was introduced. Of all pupils, 92% were enrolled in

public schools for primary education, and 50% were enrolled in public schools for secondary education. The

Indonesian government focused more on primary education than on the secondary level. In 1985, of public

spending on education, 62% went to primary education, while 27% went to secondary education. (see Tan

6 New primary school entrants kept increasing between 1973 and 1979, then fluctuated around 4.3 million (World Bank,

1989, page 16).

and Mingat, 1992, table 3.1, table 6.5)

In the 1980s, although all children began primary school, only approximately 62% of pupils entering

primary school actually graduated from grade 6. The transition rate between primary school and junior secondary

school was also low, at approximately 60% (see Jones and Hagul, 2001, table 1, figure 2). The transition rate

between junior secondary and senior secondary school was also low, at 53%. However, the survival rates of junior

secondary school and senior secondary school are fairly high in Indonesia, at more than 90% (see Tan and

Mingat, 1992, table 4.5, table 4.6, Table A.1).
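To see what these rates jointly imply, the short calculation below chains them into cumulative shares of a cohort entering primary school. It is only a back-of-the-envelope sketch: it assumes the rates apply sequentially to the same cohort, although they come from different sources and years, and it enters "more than 90%" as exactly 0.90, so the later figures are rough lower bounds. The stage labels are illustrative.

```python
# Stage-to-stage rates quoted in the text (approximate, 1980s).
STAGES = [
    ("completes primary school", 0.62),
    ("enters junior secondary school", 0.60),
    ("completes junior secondary school", 0.90),  # "more than 90%"
    ("enters senior secondary school", 0.53),
    ("completes senior secondary school", 0.90),  # "more than 90%"
]


def cumulative_shares(stages):
    """Multiply the stage rates in order and record the running share."""
    share = 1.0
    result = []
    for label, rate in stages:
        share *= rate
        result.append((label, share))
    return result


pipeline = cumulative_shares(STAGES)
for label, share in pipeline:
    print(f"{label}: {share:.1%} of primary school entrants")
```

Under these assumptions, roughly one third of primary school entrants complete junior secondary school and about one in six complete senior secondary school.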

Teacher in Indonesia

Teachers used to be of high quality, and the profession used to be regarded as highly prestigious, before the early

1970s. However, with rapid school construction, there were not enough trained teachers, and teachers were

prepared in a rush, which diluted teacher quality in Indonesia in the 1970s and 1980s (Jalal et al., 2009).

Figure 1.A.1 plots the number of newly appointed primary school teachers between 1974 and 1998. The

number of new hires kept increasing after 1974 and followed a trend similar to that of the funding in the last

column of Table 1.A1. The Indonesian government has made considerable efforts to upgrade the teacher

profile, including the implementation of Law No. 14/2005 on Teachers and Lecturers, known as the Teacher

Law, which contains certification requirements for teachers (World Bank, 2016).

In 2004/05, teacher salaries were on average 6.5% higher in junior secondary school than in primary school, and

on average 15% higher in senior secondary school than in junior secondary school.⁷ Compared to others

with similar education levels, teachers with high education are paid less, while teachers with low education

are overpaid.

Before the decentralization under Education Law 20/2003, teacher hiring, like the delivery of other public

services, was highly centralized. Central government agencies, the Ministry of National Education (MONE)

and the Ministry of Religious Affairs (MORA), were responsible for hiring teachers and paying salaries. Public

teachers have always been trained by centrally accredited teacher training institutions through public examinations.

In the 1970s, primary school teachers were prepared in the teacher education school called Sekolah

7 In 2004/05, salaries (in US dollars) range from 2,733 to 3,941 for primary school teachers, from 2,913 to 4,281 for junior

secondary school teachers, and from 3,373 to 4,756 for senior secondary school teachers (Jalal et al., 2009, Table 1.5).

Pendidikan Guru (SPG) after completing junior secondary school. Junior secondary school teachers were

prepared in the institutes and faculties of teacher education (IKIP/FKIP) with Diploma 1 qualification after

completing senior secondary school. Senior secondary school teachers were prepared in the institutes and

faculties of teacher education (IKIP/FKIP) with Diploma 2 qualification after completing senior secondary

school. (See Jalal et al., 2009, Table 1.11)

1.3 Data

Indonesian census data

For the main analysis, I use information from the 10% sample of the Indonesian Population Census 2010 and

the 0.51% sample of the Indonesian Intercensal Population Survey (SUPAS) 2005 downloaded from IPUMS

International.⁸ The two censuses were designed to be representative of the whole country. Moreover, the

birthplace (regency level) of individuals is recorded in both censuses, which can be used to proxy for their

exposure to the primary school construction program when they were of primary school age.

Table 1.1 displays the descriptive statistics of individuals’ education using the 2010 census for the older

cohort born between 1950 and 1961 (not exposed to the program) and younger cohort born between 1962

and 1972 (exposed to the program). The number of individuals with secondary school degrees is fairly small,

even for the younger cohort. An increase in education is observed for the younger cohort. Men are more

educated than women.

School construction data

The number of schools planned to be constructed in each regency was collected by Duflo (2001) from each

year’s presidential instruction. Intensity is defined as the average number of primary schools planned to

be constructed between 1973 and 1978 (inclusive) per 1000 children aged 5-14 at the regency level in the

1971 census. INPRES school construction began in 1973/74, the last year of Repelita I (the first five-year

development plan) and continued through Repelita II (1974/75-1978/79), Repelita III (1979/80-1983/84)

8 Minnesota Population Center. Integrated Public Use Microdata Series, International: Version 7.1 [dataset]. Minneapolis,

MN: IPUMS, 2018. https://doi.org/10.18128/D020.V7.1. I would also like to acknowledge the statistical agency that originally
produced the data: Statistics Indonesia

and Repelita IV (1984/85-1988/89).⁹ However, regency-specific plan data are available only for 1973/74-

1978/79; hence I limit my sample to individuals born in or before 1972, who were at least age 7 in 1979.

For those born after 1972, I am unable to identify the primary school construction intensity they were

exposed to at age 7.
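The intensity measure and the sample restriction can be written out as a small sketch. It rests on two assumptions that go beyond the text: "average number" is read here as planned schools summed over the six program years in a regency (dividing by the number of years instead would only rescale the measure), and the function names and example figures are hypothetical.

```python
PROGRAM_YEARS = range(1973, 1979)  # fiscal years 1973/74 through 1978/79


def program_intensity(planned_by_year, children_5_to_14_in_1971):
    """Planned primary schools per 1,000 children aged 5-14 in the 1971 census."""
    total_planned = sum(planned_by_year[year] for year in PROGRAM_YEARS)
    return total_planned / (children_5_to_14_in_1971 / 1000)


def in_sample(birth_year):
    """Keep individuals who were at least age 7 in 1979 (born in or before 1972)."""
    return 1979 - birth_year >= 7


# Hypothetical regency: 240 planned schools and 120,000 children aged 5-14.
planned = {1973: 30, 1974: 40, 1975: 40, 1976: 40, 1977: 45, 1978: 45}
intensity = program_intensity(planned, children_5_to_14_in_1971=120_000)
```

In this hypothetical regency, the measure equals 2 planned schools per 1,000 children.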

Link regency code between censuses

School intensity data are available for 290 unique regencies in Duflo (2001), which were coded using 1995

labels. There were 304 regencies in 1995, and the 14 missing regencies were in East Timor, which became part

of Indonesia as the 27th province in 1976.

Indonesia has experienced a substantial increase in the number of regions (Pemekaran Daerah) since the

enactment of Law No. 22 of 1999 concerning regional autonomy. The number of regencies increased from 271

in 1971, to 304 in 1995, to 437 in 2005 and to 494 in 2010. To tackle this issue, I use the GIS shapefiles

provided by IPUMS across census years to link regencies of birth in 2005 and 2010 back to the regency

of birth variable in 1995 to assign the proper program intensity to each individual. Since most of this

expansion is in the form of dividing existing regencies into several small regencies, I can link the majority of

the regencies.¹⁰

School infrastructure and teacher quality data

The number of schools and teachers at different levels is available from the Ministry of Education, which is

also collected in the original dataset used in Duflo (2001). As for teacher quality, I adapt the method used

in Behrman and Birdsall (1983) and Bharati et al. (2018): as a proxy, I calculate the percentage of self-reported teachers

who completed secondary school or some college, by region, in the Indonesian censuses of 1971, 1980, and

1990 and the intercensal surveys of 1976 and 1985.
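The teacher quality proxy amounts to a share computed within each region and survey year. The sketch below illustrates the calculation on made-up records; the education labels, the record layout and the function name are assumptions for illustration, not the coding used in the census files.

```python
from collections import defaultdict


def teacher_quality_proxy(teacher_records, qualifying=("secondary", "some_college")):
    """Percentage of self-reported teachers with a qualifying education level.

    teacher_records -- iterable of (region, survey_year, education) tuples
    Returns a dict keyed by (region, survey_year).
    """
    counts = defaultdict(lambda: [0, 0])  # [qualified teachers, all teachers]
    for region, year, education in teacher_records:
        cell = counts[(region, year)]
        cell[1] += 1
        if education in qualifying:
            cell[0] += 1
    return {key: 100.0 * qualified / total for key, (qualified, total) in counts.items()}


# Hypothetical records for one region in two census years.
records = [
    ("regency_A", 1971, "primary"),
    ("regency_A", 1971, "secondary"),
    ("regency_A", 1971, "some_college"),
    ("regency_A", 1980, "secondary"),
    ("regency_A", 1980, "secondary"),
]
proxy = teacher_quality_proxy(records)
```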

9 See Table 1.A2 for the number of grants given to primary school building in each Repelita.
10 Between 1995 and 2010, I can link all regencies except Sarmi regency (9419) in Papua Province. Between 1995 and 2000,
I can link all regencies except Ternate city (8271) in Maluku Utara province.

1.4 Empirical Strategy

To analyze how education is affected across regencies and birth cohorts, the empirical strategy is difference-

in-differences, as used in Duflo (2001). One difference comes from the school construction intensity, defined

as the average number of primary schools built between 1973 and 1978 in one regency per 1000 children

aged 5 to 14 in 1971. The other difference comes from birth cohorts. In Indonesia, children normally attend

primary school between ages 7 and 12. Those aged 13 or above in 1974 would not have been affected by the program

because they were already out of primary school. For those aged 12 or younger in 1974, the younger

they were, the more exposed they were to this school construction program. A valid difference-in-differences

strategy requires that, absent the program, education would have followed parallel trends across birth cohorts in all regions.

The quantitative effect of the school construction program on individuals born in birth cohort k and

regency j can be estimated with the following specification:


\[
y_{jk} = \alpha_j + \beta_k + \sum_{l=2}^{12} (P_j d_{kl})\gamma_l + \sum_{l=14}^{21} (P_j d_{kl})\gamma_l + \sum_{l=2}^{21} (C_j d_{kl})\delta_l + \varepsilon_{jk}
\]

where yjk is the percentage of individuals completing primary school (secondary school) born in regency j,

and in birth cohort k, dkl is a dummy that indicates whether birth cohort k individuals are age l in 1974

(year-of-birth dummy). αj denotes the regency fixed effect, and βk denotes the birth cohort fixed effect. Pj

is the school construction intensity in regency j. εjk is the error term. Cj represents other region-specific

variables.
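
To see what the interaction coefficient captures, it helps to strip the specification down to two regencies and two cohorts. The sketch below is an illustration only, not the estimation code used in this chapter: it assumes a single Post indicator in place of the full set of age dummies, omits the controls, and uses made-up numbers.

```python
def did_per_school(y, intensity):
    """Two-regency, two-cohort version of the Post x Intensity coefficient.

    y[region][cohort] is the attainment rate, with cohort in {"old", "young"};
    intensity[region] is schools built per 1000 children. Expects exactly
    two regions. With regency and cohort fixed effects, the model
        y_jk = a_j + b_k + gamma * P_j * Post_k
    implies gamma = (change_high - change_low) / (P_high - P_low).
    """
    high, low = sorted(intensity, key=intensity.get, reverse=True)
    change_high = y[high]["young"] - y[high]["old"]
    change_low = y[low]["young"] - y[low]["old"]
    return (change_high - change_low) / (intensity[high] - intensity[low])


# Made-up numbers, constructed so that the true gamma is 0.01.
intensity = {"A": 2.5, "B": 1.5}
y = {
    "A": {"old": 0.70, "young": 0.70 + 0.05 + 0.01 * 2.5},
    "B": {"old": 0.60, "young": 0.60 + 0.05 + 0.01 * 1.5},
}
gamma = did_per_school(y, intensity)
```

The common cohort change (0.05 here) and the level difference between regencies both drop out, which is the role played by the cohort and regency fixed effects in the full specification.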

The coefficients γl are the coefficients of interest. They represent the effect of one additional primary

school constructed per 1000 children on the dependent variable for individuals of age l in 1974. There is a testable restriction

on coefficients γl . A valid identification strategy would require that γl = 0 if l > 13, i.e., the variation in the

outcome variable is not correlated with the primary school available starting in 1974 for the children who

were already out of primary school in 1974. I should expect that for l ≤ 12, γl > 0, and that γl decreases

with l as the effects should be larger in the younger cohorts.
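
The testable restriction amounts to a placebo check on the cohorts that had already left primary school. A minimal sketch of such a check is given below; the estimates and standard errors are hypothetical, and the normal critical value of 1.96 is an assumption made for illustration.

```python
def violates_pretrend(coefs, z=1.96):
    """Return the ages l > 13 whose 95% interval for gamma_l excludes zero.

    `coefs` maps age in 1974 to a (point estimate, standard error) pair.
    """
    flagged = []
    for age, (estimate, std_err) in sorted(coefs.items()):
        if age > 13 and abs(estimate) > z * std_err:
            flagged.append(age)
    return flagged


# Hypothetical estimates, for illustration only.
example = {
    8: (0.012, 0.004),    # treated cohort: not part of the placebo check
    15: (0.001, 0.005),   # untreated cohort, interval covers zero
    18: (-0.015, 0.005),  # untreated cohort, interval excludes zero
}
flagged = violates_pretrend(example)
```

An empty list is what the identification strategy requires; any flagged age signals a differential trend among cohorts that should not have been affected.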

1.5 Results

In this section, I present my empirical results on education. I first present the results for the full sample, then

show the results for two subsamples depending on population density. Finally, I provide further evidence for

the mechanisms behind the different results observed in the subsamples.

Full sample

In Figure 1.3, I plot γl when the dependent variable is the percentage of individuals who complete at least

primary school for men (or women), i.e., the effect of one additional primary school constructed per 1000

children on primary school attainment rate for men (or women) with age l in 1974. To simplify the graph,

I group birth cohorts into three-year bins.

Two important results stand out from Figure 1.3. First, γl is not significantly different from 0 for l larger

than 13 for both men and women. This lends confidence in the identification assumption: the birth cohort

trend in the primary school attainment rate does not differ across regions with different school construction

intensities. Second, γl is positive for men with age l ≤ 12 in 1974, indicating a positive effect on the primary

school attainment rate for men; γl is close to zero for women except in the youngest cohorts (l ≤ 3), indicating a

lagged effect on the female primary school attainment rate. Both results are consistent with previous findings

in Duflo (2001) and Ashraf et al. (2016).

Difference-in-differences estimates are provided in columns (1)-(3) in Table 1.2. Following Duflo (2001),

the sample includes individuals born between 1950 and 1961 who are older than 12 in 1974, and individuals

born between 1968 and 1972, who are younger than 7 in 1974. "Post" indicates individuals born between

1968 and 1972. Column (3) suggests that one additional school increases the male primary school attainment

rate by 0.6 percentage points. This is smaller than the estimate in Figure 2 in Duflo (2001), which shows

that approximately 1.5% more individuals had at least 6 years of schooling in high program regions

(where on average 2.44 schools were built) than in low program regions (where on average 1.54 schools were

built). One potential reason for this divergence is the inclusion of more controls in

my analysis compared to Equation (4) in Duflo (2001).
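
The comparison can be put on a per-school basis with a short back-of-the-envelope calculation. The sketch below uses only the figures quoted above; reading the 1.5% gap as percentage points and scaling it linearly by the difference in schools built are both assumptions made for illustration.

```python
# Figures quoted in the text above.
duflo_gap = 1.5        # gap in attainment, read here as percentage points
schools_high = 2.44    # average schools per 1000 children, high program regions
schools_low = 1.54     # average schools per 1000 children, low program regions
own_estimate = 0.62    # Table 1.2, column (3): 0.0062 in percentage points

# Implied effect of one additional school under a linear scaling.
duflo_per_school = duflo_gap / (schools_high - schools_low)
ratio = own_estimate / duflo_per_school

print(f"Implied effect per school from the quoted gap: {duflo_per_school:.2f} pp")
print(f"Own estimate as a share of that figure: {ratio:.2f}")
```

Under these assumptions the estimate in column (3) is well under half of the per-school effect implied by the quoted comparison, which is the divergence discussed above.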

In Figure 1.4, I plot the coefficients of the interactions of age in 1974 and program intensity for completing

secondary school or above. I find a negative impact on secondary school attainment rate, especially for

women. This is surprising because, if anything, one should expect positive spillover effects from primary

school completion to secondary school completion. This finding is also mentioned for men in Duflo (2001)

but not discussed in detail there. The difference-in-differences estimates in columns (4)-(6) in Table 1.2

suggest that one additional school being built decreases women’s secondary school attainment rate by 0.53

percentage points.

Heterogeneity results on education

Further insight into the effect of the program can be obtained by examining its impact on different types

of regions. In this section, I repeat the previous exercise on two subsamples divided by population density:

sparsely populated regions with densities below the median density and densely populated regions with

densities above the median. Population density is calculated as the population in the 1971 census divided

by the area of each region in 1971. The median density (the density for the region of birth for the median

person in the weighted sample) is 470 inhabitants per square kilometer. There are 183 regions in the sparsely

populated subsample, and the average number of schools constructed per 1000 children is 2.1. There are

91 regions in the densely populated subsample, and the average number of schools constructed per 1000

children is 1.67, which is somewhat lower than that in the sparsely populated subsample. Figure 1.2 shows

the distribution of the school construction intensity for the two subsamples.
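
The split uses a person-weighted median rather than the median across regions, which is why the two subsamples contain different numbers of regions. A minimal sketch of such a cutoff follows; the densities and weights are hypothetical, and assigning the boundary region to the dense group is an assumption.

```python
def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("empty input")


# Hypothetical regions: density (inhabitants per square km) and sample weight.
densities = [50, 120, 470, 900, 3000]
persons = [10, 15, 30, 25, 20]

cutoff = weighted_median(densities, persons)
sparse = [d for d in densities if d < cutoff]
dense = [d for d in densities if d >= cutoff]
```

Because densely populated regions hold more people each, fewer of them are needed to account for half of the weighted sample.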

In Figure 1.5 and Figure 1.6, I plot the coefficients on education γl for both sparsely populated and

densely populated subsamples. The difference-in-differences estimates are shown in Table 1.3.

As Figure 1.5 shows, in sparsely populated areas, the program increased the primary school attainment

rate (top) and secondary attainment rate (bottom) for men but did not affect women’s education. Difference-

in-differences estimates are provided in Panel A of Table 1.3. For men, one additional school constructed

per 1000 children increased the percentage completing primary school or above by 1 percentage point and

the percentage completing secondary school or above by 0.69 percentage points.

As Figure 1.6 shows, in densely populated areas, the program did not affect the primary school attainment

rate (top) but decreased the secondary school attainment rate (bottom) for both men and women. Difference-

in-differences estimates in Panel B of Table 1.3 suggest that one additional school being built per 1000

children decreased the secondary school attainment rate by 2.3 percentage points for both men and women.

These heterogeneous effects are consistent with the finding in Duflo (2001) that the program increased

years of schooling in sparsely populated areas but not in densely populated areas for men. Duflo (2001)

interprets this as evidence that the program increased men’s education mainly by decreasing the average

distance to a school. This could explain the difference in the results on the primary school attainment rate

across the two subsamples, but has no explanatory power for the negative result on the secondary school

attainment rate in densely populated regions.

1.6 Mechanism

In this section, I investigate further the surprising finding of a negative effect on secondary school attainment

rate in densely populated regions. There are at least two possibilities: (1) building primary schools crowds

out resources available to secondary schools and deteriorates secondary school quality and (2) a sudden

increase in primary school availability may decrease primary education quality and hence the quality of

primary school graduates. I explore the heterogeneity in the results for sparsely and densely populated

regions and show that the first conjecture is more consistent with the data.

Deterioration in secondary education quality?

Teacher scarcity has long been a challenge in Indonesia’s education system. Building primary schools increases

the aggregate demand for teachers. This could affect the availability of secondary school teachers. To test

this conjecture, I use the total number of teachers and the average number of teachers per school in secondary education

across regions in the years after the INPRES-SD program to check whether there is a differential change

in regions where more primary schools were constructed. Specifically, I estimate the following specification:


\[
y_{jt} = \alpha_j + \beta_t + \sum_{l=2}^{6} (P_j d_{tl})\gamma_l + \sum_{l=2}^{6} (C_j d_{tl})\delta_l + \varepsilon_{jt}
\]

where j denotes region, and t denotes the survey year: 1 indicates year 1973/74, 2 indicates year 1978/79,

3 indicates year 1983/84, 4 indicates year 1988/89, 5 indicates year 1993/94, and 6 indicates 1995/96. yjt

indicates the total or average number of secondary school teachers in year t in region j. dtl is a year dummy

indicating whether t = l. αj denotes the regency fixed effect, and βt denotes the year fixed effect. Pj is the school

construction intensity in regency j. εjt is the error term. Cj represents other region-specific variables. The

baseline year is 1973/74 (t = 1).

The results are presented in Table 1.4. The negative coefficients

in columns (1) and (2) suggest that in regions where more schools were constructed, a smaller increase

is observed in the total number and the average number of teachers per school in secondary education in

later years. Reassuringly, column (3) shows a positive effect of the program on the total number of teachers

in primary school education, which is consistent with the teacher crowding-out story.

Moreover, since the negative effect on secondary school attainment is only observed in densely populated

regions, this negative effect on the number of teachers in secondary school should also only exist in such

regions. Figure 1.7 separately plots the coefficients on the interaction of the year dummy and school

construction intensity from the previous specification for sparsely and densely populated regions. A negative

effect on the average number of teachers in secondary education appears for densely populated regions but

not for sparsely populated regions. This supports my conjecture that primary school construction increases

the demand for teachers, which crowds out teacher resources available for secondary school education and

leads to a negative effect on the secondary school attainment rate. Moreover, this phenomenon exists only

in densely populated regions.

Deterioration in primary education quality?

A second conjecture is that the deterioration in primary school quality leads to a decrease in student quality

among primary school graduates, and this in turn induces a lower secondary school attainment rate. To

meet the surge in demand for teachers created by the school expansion, primary school teacher quality may

have been sacrificed (Jalal et al., 2009; Bharati et al., 2018).

To test this conjecture, I use a similar empirical specification. The outcome variable is the percentage of primary

school teachers who completed secondary school (or some college) in one regency in that census year. The

baseline year is 1971, before the school expansion program started.

Figure 1.8 shows the coefficients on the interaction between the year dummies and school construction

intensity, separately for sparsely and densely populated regions, for my two proxies of teacher

quality: the percentage of teachers completing secondary school (top) and completing some college (bottom).

Consistent with the results in Bharati et al. (2018), I observe a negative impact of the program on

teacher quality in 1976, but not in later years. However, I do not find different patterns between sparsely

and densely populated regions. This suggests that deterioration in primary education quality is

not the main reason for the negative impact on the secondary school attainment rate.

1.7 Conclusion

In this chapter, I reevaluate the INPRES primary school construction program in Indonesia and show that

it had an unintended consequence for secondary school education. In densely populated regions,

the secondary school attainment rate declined for both men and women because primary school construction

crowded out teacher resources in secondary education.

1.8 Figures


Figure 1.1: Map of primary school construction intensity


[Plot omitted. Axis label: Construction intensity. Series: Sparsely Populated Areas, Densely Populated Areas.]

Figure 1.2: Construction intensity for sparsely and densely populated areas

Note: This figure shows the distribution of the school construction intensity for the sparsely and densely populated regencies.
Density is calculated using population in 1971 divided by the total area using 1995 maps.

[Plot omitted. Y-axis: complete primary school. X-axis: Age at 1974, in 3-year groups from 24-22 to 3 to 2. Series: males, females.]

Figure 1.3: Coefficients of the interactions: age in 1974 * program intensity in the region of birth in the
education equation for completing primary school

Note: This figure reports estimates of the effect of school construction on primary school completion for 3-year cohorts, separately
for males and females born in one regency. The dependent variable is the percentage of individuals completing primary school when
observed in 2010. The x-axis reports the age range (in 1974) for each cohort and the y-axis reports the estimated coefficient,
which can be interpreted as the effect of one additional primary school built per 1000 children on the primary school attainment rate in
that regency. The sample consists of individuals born between 1950 and 1972 observed in the 2010 Indonesian census. The vertical
line indicates the youngest cohort that did not receive any treatment from school construction, since they were out of primary
school in 1974, when the first round of constructed primary schools became available. 95% confidence intervals are plotted.
The figure shows zero effect for individuals older than 13 in 1974, but an increasing positive effect for males younger than 13.
For females, the effect is smaller.

[Plot omitted. Y-axis: complete secondary school. X-axis: Age at 1974, in 3-year groups from 24-22 to 3 to 2. Series: males, females.]

Figure 1.4: Coefficients of the interactions: age in 1974 * program intensity in the region of birth in the
education equation for completing secondary school

Note: This figure is built like Figure 1.3 but considers the secondary school attainment rate. It reports estimates of the effect of
school construction on secondary school completion for 3-year cohorts, separately for males and females born in one regency.
The dependent variable is the percentage of individuals completing secondary school when observed in 2010. The x-axis reports the
age range (in 1974) for each cohort and the y-axis reports the estimated coefficient, which can be interpreted as the effect of
one additional primary school built per 1000 children on the secondary school attainment rate in that regency. The sample consists of
individuals born between 1950 and 1972 observed in the 2010 Indonesian census. The vertical line indicates the youngest cohort that
did not receive any treatment from school construction, since they were out of primary school in 1974, when the first round of
constructed primary schools became available. 95% confidence intervals are plotted.

[Plots omitted. Top panel: complete primary school. Bottom panel: complete secondary school. X-axis in both panels: Age at 1974, in 3-year groups from 24-22 to 3 to 2. Series: males, females.]

Figure 1.5: Effect of the program on completing primary school (top) and completing secondary school
(bottom) in sparsely populated areas

Note: This figure is similar to Figure 1.3 and Figure 1.4 but focuses on a subgroup: sparsely populated regions. It reports
estimates of the effect of school construction on primary school completion (top) and secondary school completion (bottom) for
3-year cohorts separately for males and females in this subgroup. Sparsely populated regions are defined as those regions with
population density smaller than the weighted median density in 1971.

[Plots omitted. Top panel: complete primary school. Bottom panel: complete secondary school. X-axis in both panels: Age at 1974, in 3-year groups from 24-22 to 3 to 2. Series: males, females.]

Figure 1.6: Effect of the program on completing primary school (top) and completing secondary school
(bottom) in densely populated areas

Note: This figure is built like Figure 1.5 but focuses on the other subgroup: densely populated regions. It reports estimates of
the effect of school construction on primary school completion (top) and secondary school completion (bottom) for 3-year cohorts
separately for males and females in this subgroup. Densely populated regions are defined as those regions with population density
larger than the weighted median density in 1971.

[Plot omitted. X-axis: Year (1973/74, 1978/79, 1983/84, 1988/89, 1993/94, 1995/96). Series: Sparsely, Densely.]

Figure 1.7: Coefficients of the interactions: census year * program intensity in the regency in average number
of secondary school teachers equation

Note: This figure reports estimates of the effect of school construction on the average number of teachers in secondary school across
different years in sparsely populated areas and densely populated areas. The baseline year is 1973/74. The data were provided by the
Indonesian Ministry of Education and are included in the dataset used in Duflo (2001). The dependent variable is the average number of teachers
in secondary school across different regencies. This figure supports the argument that the negative effect on secondary school
attainment is due to teacher resource crowding out in densely populated regions because of primary school construction.

[Plots omitted. Top panel: seniorhigh. Bottom panel: somecollege. X-axis in both panels: Year (1971, 1976, 1980, 1985, 1990). Series: Sparsely, Densely.]

Figure 1.8: Effect of the program on primary teacher education

Note: This figure reports estimates of the effect of school construction on the education level of primary school teachers in the two
subsamples: sparsely and densely populated regions. The dependent variable for the top panel is a dummy indicating that the teacher
completed secondary school; for the bottom panel, it is a dummy indicating that the teacher has some post-secondary education. The
baseline year is 1971. Primary school teacher information for each region is obtained by identifying those individuals who report
primary school teacher as their occupation in the census year.

1.9 Tables

Table 1.1: Summary statistics

                        Old Cohorts                   Young Cohorts
2010 Census:            Born between 1950 and 1961    Born between 1962 and 1974
                        Males        Females          Males        Females

Education Attainment
  Some School           0.22         0.33             0.10         0.16
  Primary School        0.56         0.54             0.53         0.57
  Secondary School      0.17         0.11             0.30         0.23
  University or above   0.05         0.02             0.07         0.05
Observations            1,275,648    1,231,961        2,148,572    2,128,266

Source: Indonesian Census 2010.

Table 1.2: Effect of school construction on education

All sample: Indicator for Completing at least:

                      Primary School                      Secondary School
                      (1)         (2)         (3)         (4)         (5)         (6)
Males:
Post × Intensity      0.022***    0.0079*     0.0062**    -0.0077**   -0.0012     -0.0014
                      (0.0060)    (0.0042)    (0.0025)    (0.0033)    (0.0030)    (0.0018)
Dep. var. mean        0.847                               0.306
Observations          6509        6302        6302        6509        6302        6302
Clusters              283         274         274         283         274         274
Adjusted R-squared    0.917       0.951       0.974       0.961       0.959       0.974
Duflo Controls:       No          Yes         Yes         No          Yes         Yes
Log-linear trend:     No          No          Yes         No          No          Yes

Females:
Post × Intensity      0.020***    0.0027      0.0041      -0.021***   -0.0064*    -0.0053**
                      (0.0065)    (0.0052)    (0.0034)    (0.0049)    (0.0033)    (0.0023)
Dep. var. mean        0.766                               0.211
Observations          6509        6302        6302        6509        6302        6302
Clusters              283         274         274         283         274         274
Adjusted R-squared    0.934       0.956       0.979       0.949       0.960       0.976
Duflo Controls:       No          Yes         Yes         No          Yes         Yes
Log-linear trend:     No          No          Yes         No          No          Yes

Notes: This table displays results on the effect of school building on educational attainment (completing primary school and completing secondary school) for males and
females. Following the strategy of Duflo (2001), the sample consists of individuals born either between 1968 and 1972 or between 1950 and 1961. Post refers to the treated cohort,
born between 1968 and 1972, while the untreated cohort was born between 1950 and 1961. Educational attainment data are taken from the Indonesian 2010 Census.
Intensity is the number of schools built in a region per 1,000 children in the school-aged population. All columns include district fixed effects, school year fixed effects, and school
year interacted with the number of children in 1971. Duflo Controls consist of school year interacted with the enrollment rate in 1971 and school year interacted with the water
and sanitation program. Standard errors are clustered at the birthplace district level. Significance levels: * 10%, ** 5%, *** 1%.
Source: Indonesian Census 2010
Table 1.3: Heterogeneous effect of school construction on education

                        Indicator for Completing at least:
                        Primary School            Secondary School
                        (1)         (2)           (3)         (4)
                        Male        Female        Male        Female
Panel A: Density < Median
Post × Intensity        0.010**     0.0066        0.0069**    -0.00066
                        (0.0051)    (0.0062)      (0.0032)    (0.0037)
Dep. var. mean          0.820       0.736         0.270       0.183
Observations            4209        4209          4209        4209
Clusters                183         183           183         183
Adjusted R-squared      0.949       0.952         0.937       0.941

Panel B: Density > Median
Post × Intensity        0.0054      -0.0060       -0.023***   -0.023***
                        (0.0066)    (0.0073)      (0.0055)    (0.0073)
Dep. var. mean          0.873       0.795         0.341       0.239
Observations            2093        2093          2093        2093
Clusters                91          91            91          91
Adjusted R-squared      0.957       0.967         0.975       0.975

Duflo Controls:         Yes         Yes           Yes         Yes
Log-linear trend:       Yes         Yes           Yes         Yes

Notes: This table is similar to Table 1.2 and displays the heterogeneous effect of school building on educational attainment
in sparsely and densely populated regions. All columns include district fixed effects, school year fixed effects, and school year
interacted with the number of children in 1971. Duflo Controls consist of school year interacted with the enrollment rate in 1971
and school year interacted with the water and sanitation program. Standard errors are clustered at the birthplace district level.
Significance levels: * 10%, ** 5%, *** 1%.
Source: Indonesian Census 2010

Table 1.4: Effect of school construction on number of teachers in secondary and primary education

                            Secondary School                   Primary School
                            (1)             (2)                (3)             (4)
                            Total number    Average number     Total number    Average number
INPRES Intensity ×
  year=1978/79              -13.9           -0.19*             28.0***         -0.16**
                            (8.48)          (0.11)             (9.91)          (0.063)
  year=1983/84              -43.5           -0.14              61.1**          -0.069
                            (31.8)          (0.18)             (29.2)          (0.11)
  year=1988/89              -65.1           -0.20              89.4*           -0.031
                            (49.1)          (0.21)             (45.5)          (0.066)
  year=1993/94              -59.3           -0.086             95.1*           -0.062
                            (52.0)          (0.22)             (57.0)          (0.075)
  year=1995/96              -51.7           -0.074             177.4**         0.29*
                            (60.0)          (0.19)             (68.6)          (0.17)
Dep. var. mean in 1973/74   555.996         14.723             1529.996        6.762
Dep. var. mean in 1995/96   2583.821        22.989             4207.180        8.345
Observations                1,656           1,656              1,664           1,664
R-squared                   0.928           0.929              0.942           0.829
Duflo Controls:             Yes             Yes                Yes             Yes

Notes: This table displays the effect of school construction on the number of teachers in secondary and primary education
in later years. The baseline year is 1973/74. All columns include district fixed effects, school year fixed effects, and school year
interacted with the number of children in 1971. Duflo Controls consist of school year interacted with the enrollment rate in 1971
and school year interacted with the water and sanitation program. Standard errors are clustered at the birthplace district level.
Significance levels: * 10%, ** 5%, *** 1%.
Source: Indonesian Education Ministry

1.10 Appendix

Table 1.A1: Inpres Sekolah Dasar

Primary School Investment Program


1973/74-1988/89
New
Primary Total
Class- Primary Primary
New principal Primary allocation
rooms for schools to school
Financial Primary and school (billions
existing be reha- books
year schools teacher sport kits of current
primary bilitated (mln)
housing rupiah)
schools

1973/74 6,000 - - - 6.6 - 17.2


1974/75 6,000 - - - 6.9 - 19.7
1975/76 10,000 - 10,000 - 7.3 - 49.9
1976/77 10,000 - 16,000 - 8.6 - 57.3
1977/78 15,000 - 15,000 - 7.3 - 85.0
1978/79 15,000 15,000 15,000 - 8.5 - 111.8
1979/80 10,000 15,000 15,000 5,000 12.5 - 155.8
1980/81 14,000 20,000 20,000 7,500 14.0 - 249.8
1981/82 15,000 25,000 25,000 9,500 15.0 - 374.5
1982/83 22,600 35,000 25,000 20,000 30.0 50,000 267.4
1983/84 13,140 15,700 21,000 50,000 32.0 96,000 549.3
1984/85 2,200 12,500 31,000 60,000 32.0 157,799 526.1
1985/86 3,200 12,500 31,000 60,000 32.0 157,799 526.1
1986/87 2,200 10,000 95,000 44,070 32.6 120,000 495.9
1987/88 660 2,200 157,500 2,400 22.9 - 100.8
1988/89 250 1,350 6,000 2,650 18.5 - 112.5


Note: For the first time in 1980/81 new first-phase units were started while second-phase units were still being added to
first-phase units built in the preceding year. The 1980/81 targets were 4,000 first-phase units and 10,000 second-phase
units (Snodgrass et al., 1980, table 2).
Original source: Republik Indonesia, "Nota Keuangan dan Rancangan Anggaran Pendapatan dan Belanja Negara, Tahun
1988/1989"
Source: Annex Table 4 of World Bank (1989), page 109.

Table 1.A2: Development grant to regions (in billion Rp.)

Grant to:
Period          Villages   District   Provinces   Primary school   Health    Reforestation   Market place   Rural road   Total
                                                  building         improv.                   dev.
REPELITA I      26.8       46.3       -           17.2             -         -               -              -            90.3
REPELITA II     94.8       304.0      317.4       323.6            94.6      76.5            12.5           -            1,223.4
REPELITA III    332.2      760.3      989.8       1,939.1          355.8     333.8           35.2           254.2        5,000.4
REPELITA IV     501.3      1,131.8    1,417.0     1,828.3          494.9     156.8           39.6           607.6        6,177.3


Original source: Bappenas
Source: Table 4.4 of Hady (1989), page 151.
Figure 1.A.1: Number of newly appointed primary school teachers, 1974-1998

[Bar chart omitted. Y-axis: number of teachers, 0 to 160,000. X-axis: financial years 1974/75 to 1998/99. Bar labels range from 4,100 to 141,324.]

Original source: Ministry of National Education, Indonesia, 2005

Source: Jalal et al. (2009), Figure 1.1

Chapter 2

Schooling Expansion and the Female Marriage Age

2.1 Introduction

Female marriage age is important because it could affect fertility, maternal health and even wives’ autonomy

within households (Jensen and Thornton, 2003; Kirbas et al., 2016). All these aspects could have

long-term economic consequences. For example, fertility affects the future labor force structure, maternal health

affects individuals’ long-term achievements, and wives’ bargaining power affects within-household investment

decisions, including children’s education (Fergusson and Woodward, 1999; Chari et al., 2017; Sekhri and

Debnath, 2014).

What could affect female marriage age? We know that a woman’s own education is a key factor that

affects her marriage age (Carmichael, 2011; Ozier, 2018), but what about others’ education? Individuals of the

same gender compete among themselves for potential spouses on the other side of the market. Therefore, a

change in others’ education could potentially have a spillover effect. Understanding this is also important

because education levels have changed worldwide in recent decades due to schooling expansion

policies.

In this chapter, I first build a two-to-one dimensional matching model in which men differ in education but

women differ in both education and age in a two-period overlapping generations framework, to understand

how female marriage age reacts to changes in the education distributions of men and women across

cohorts. Then I exploit the setting of primary school construction in Indonesia in the late 1970s introduced

in Chapter 1 as a quasi-natural experiment to answer the question empirically. The model gives different

predictions depending on how male education and female youth interact in generating the marital

surplus. In the empirical analysis, I find that for a given woman, when other women’s education decreases

holding everything else constant, including their potential husbands’ education, she is induced to

marry earlier and the spousal age gap, defined as the age difference between husbands and wives, increases.

Combined with the theoretical model, the empirical finding that women marry earlier when other women’s

education decreases suggests that in Indonesia, male education and female young age are complementary in

generating the marital surplus. One way to interpret this complementarity is the following: if young age

is valued in marriage, then more educated men value wives’ young age even more than less educated

men.¹

The theoretical model is a two-period OLG model in which women can choose to seek partners either in

the first period or the second, so that marriage age is a choice, but men all marry in the second period

to keep the model tractable. In any given year, the marriage market unfolds as in Choo and Siow (2006), where

the marital surplus generated by a couple depends on their types and some idiosyncratic draws modeled by

random vectors. Women differ in two dimensions (education and age) while men only differ in one dimension

(education). In a stationary equilibrium, a woman’s expected return from the marriage market should be

equalized between choosing to marry in the first period or the second.

How the percentage of women choosing to marry in the first period changes with respect to the education

distributions of men and women will depend on how male education interacts with female age and female

education in generating marital surplus. If there is no interaction between male education and female age

in generating marital surplus, female marriage age choice does not depend on the education distribution of

men or women. Intuitively, an individual’s gain from marriage comes from his/her marginal contribution to

the marital surplus. Hence, in this case, women will fully capture the contribution of their age to the marital

surplus in their own utilities. Therefore, the marriage age decision would be fully determined by how young

age and old age contribute differently to marital surplus.
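
The role of the interaction can be stated compactly. The notation below is assumed for illustration and is not taken from the model in this chapter: let $\Phi(e_m; e_f, a)$ denote the systematic marital surplus of a man with education $e_m \in \{L, H\}$ and a woman with education $e_f$ who marries at age $a \in \{\text{young}, \text{old}\}$, with $g$ and $h$ arbitrary functions.

```latex
% No interaction between male education and female age (separable surplus):
\Phi(e_m; e_f, a) = g(e_m, e_f) + h(e_f, a)

% Complementarity between male education and female youth
% (positive cross difference, for any given e_f):
\bigl[\Phi(H; e_f, \text{young}) - \Phi(L; e_f, \text{young})\bigr]
  - \bigl[\Phi(H; e_f, \text{old}) - \Phi(L; e_f, \text{old})\bigr] > 0
```

Under the separable form the cross difference is exactly zero, since the terms in $h$ cancel within each bracket and the terms in $g$ cancel across brackets. This corresponds to the case in which the marriage age choice does not depend on the education distributions.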

Suppose that women marrying at a young age is "good" for marital surplus; then, in the case in which

there is complementarity between men’s education and women’s young age, the model predicts that an

increase in the proportion of educated men would decrease the female marriage age (i.e., increase the

percentage of women marrying in the first period). If there is also complementarity between men’s education and

women’s education, an increase in the proportion of educated women would have the opposite effect: an

increase in the female marriage age (i.e., a decrease in the percentage of women marrying in the first period).

Intuitively, an increase in the share of educated women would create a relative shortage of educated men

when there is complementarity between men’s education and women’s education. Hence it would have the

opposite effect of an increase in educated men.

1 Of course, it can also be interpreted as female preference: if we think husband’s education is valuable in marriage, then
younger women would value the education even more than older women.

In the empirical analysis, I use the INPRES SD program as an instrumental variable for a change in others’ education. The massive size of the primary school construction program in Indonesia provides a large exogenous shock to other women’s education.

However, two main obstacles remain in the empirical analysis.

First, since the program affects both men and women, it is usually difficult to separate the effect of the change in male education from that of the change in female education. In the current setting, however, the very large spousal age gap in Indonesia, about 5 years, helps tackle this challenge: for the first few cohorts of women who were impacted by the school construction, their potential husbands’ education remained the same. By comparing these female cohorts with the older cohorts who were not impacted by the program, I am able to observe how the female marriage age reacts to the change in the female education distribution while holding the male education distribution unchanged.

In sparsely populated regions where there is no effect on female education, as expected, I do not observe

any effect on female age at first marriage or the spousal age gap. In densely populated regions where there

is a negative effect on secondary school attainment rate for women, I find a decrease in female age at first

marriage and an increase in the spousal age gap. A 10 percentage point decrease in the percentage of secondary graduates led to a decrease of 1.1 years in average female marriage age and an increase of 0.35 years in the spousal age gap.

Secondly, since the program affects both a woman’s own education and other women’s education in her birth cohort, I have to net out the effect of a change in her own education to capture the spill-over effect. In Indonesia, female secondary school graduates marry on average four years later than female primary school graduates, both before and after the school expansion program. Hence one might think that a decrease in female education would mechanically lead to a decrease in marriage age and an increase in the spousal age gap. Empirically, I find the overall effect to be much larger than the mechanical effect. In the reduced-form analysis, a 10 percentage point decrease in the share of secondary graduates led to a decrease of 1.1 years in average female marriage age, whereas the mechanical effect would be 10 percentage points times 4 years, which is 0.4 years. Hence the additional 0.7 years is the effect of the change in the education distribution on female marriage age. A similar logic applies to the spousal age gap. A 10 percentage point decrease in the share of secondary graduates led to an increase of 0.35 years in the average spousal age gap, whereas the mechanical effect would be 10 percentage points times 1.5 years, which is 0.15 years. Hence the additional 0.20 years is the effect of the change in the education distribution on the spousal age gap.
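The decomposition into a mechanical part and a spill-over part is simple arithmetic. A minimal sketch, using only the figures quoted in the text (the function name `spillover` and its arguments are illustrative and do not come from the thesis):

```python
def spillover(total_effect, own_education_gap, share_change=0.10):
    """Split a total effect into a mechanical part and a spill-over part.

    total_effect:      change in the cohort average for the given share change
    own_education_gap: outcome difference between secondary and primary graduates
    share_change:      fall in the share of secondary graduates (0.10 = 10 pp)
    """
    mechanical = share_change * own_education_gap
    return mechanical, total_effect - mechanical


# Female age at first marriage: total fall of 1.1 years; graduates marry 4 years later.
age_mechanical, age_spillover = spillover(1.1, 4.0)

# Spousal age gap: total rise of 0.35 years; the gap differs by 1.5 years across groups.
gap_mechanical, gap_spillover = spillover(0.35, 1.5)

print(round(age_mechanical, 2), round(age_spillover, 2))
print(round(gap_mechanical, 2), round(gap_spillover, 2))
```

The calculation treats the composition effect as linear in the share of graduates, which is the same simplification the text makes.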

The empirical result is consistent with the model prediction when there exists complementarity both

between male education and female education, and between male education and female youth in generating

marital surplus.

This chapter is related to several distinct literatures. The modeling approach builds on previous research studying marriage age using OLG models (Bhaskar, 2015; Iyigun and Lafortune, 2016; Zhang, 2018) and on models of matching with TU and separable idiosyncratic preferences in the marital surplus (Choo and Siow, 2006; Chiappori et al., 2017; Galichon and Salanié, 2015). Of the OLG papers, some focus only on age (Bhaskar, 2015), while others simultaneously study individuals’ educational and marriage age decisions.

It also contributes to a growing literature studying the impact of education reforms on the marriage market. Hener and Wilson (2018) study a compulsory schooling reform in the UK and find that women decrease the marital age gap to avoid marrying less-qualified men. André and Dupraz (2018) study school construction in Cameroon and find that education increases the likelihood of being in a polygamous union for both men and women. In contrast to both of these papers, the present paper analyzes the effect via a general equilibrium framework.

This chapter complements a large literature on the impact of marriage market conditions on individuals’ outcomes. Most of the existing literature focuses on the sex ratio in the marriage market (e.g. Abramitzky et al., 2011; Angrist, 2002; Charles and Luoh, 2010). I focus on a distinct but equally important dimension of marriage market conditions: the education distributions of men and women.

2.2 Model

In this section, I develop a two-period OLG matching model with Transferable Utility (TU) to study how a

change in the education distribution across birth cohorts may affect marriage market outcomes, in particular,

female marriage age. There are several important features:

• Individuals get utility from participating in the marriage market.

• Individuals’ education affects the marital surplus, for both men and women.

• Age plays an asymmetric role for men and women: women’s age matters in the surplus function, but men’s does not. Much research has documented that female youth is more important than male youth in the marriage market. This could be due either to a fundamental difference between female and male age in the household production function related to fertility, or to a stronger male preference for youth-related beauty (Low, 2017; Siow, 1998; Edlund, 2006; Dessy and Djebbari, 2010; Zhang, 2018; Arunachalam and Naidu, 2006).

• Women are allowed to choose to participate in the marriage market either early or late. However, a

woman who participated in period 1 cannot enter into the marriage market in period 2, whether she

remains married or single. This can be rationalized as the existence of a stigma associated with women

who have tried to seek partners in an early period.

• Each marriage market is modeled as a matching model with TU and idiosyncratic random preference draws. The random preference draws allow couples of all types to exist with respect to male education, female education and female age, which fits reality better than the static model. In each marriage market, women differ in both education and age, while men differ only in education.

Two-period OLG

There is an infinite number of periods, r=1,2.... At the beginning of each period, a unit mass of men and

a unit mass of women enter the economy. Assume people can only make marriage decisions in the first two

periods, therefore the problem is simplified to a two-period OLG problem. Furthermore, to focus on female

marriage age decision, I assume that women choose whether they want to seek partners in period 1 (when

they are young) or delay this process to period 2 (when they are old). Men always seek partners in period

2. Individuals differ in their education type, L or H. In the model, let’s focus on the utilities individuals

obtain from the marriage market.

Marriage market at one period

I will first discuss how the marriage market unfolds in any given period, taking women’s marriage timing choices as given.

Individual types

Women can choose to participate in one of the two periods; hence, in any period, there are at most four types of women: Low education and Young ($L_1$), Low education and Old ($L_2$), High education and Young ($H_1$), and High education and Old ($H_2$). Men only participate in period 2; hence there are two types of men in any period: Low education ($L$) and High education ($H$).

Utilities and matching surplus

Denote by $x$ the type of a woman and by $X$ the set of female types, i.e. $x \in X = \{L_1, L_2, H_1, H_2\}$. Similarly, denote by $y$ the type of a man and by $Y$ the set of male types, i.e. $y \in Y = \{L, H\}$. To include the possibility of being single, denote $X_0 = X \cup \{\emptyset\}$ and $Y_0 = Y \cup \{\emptyset\}$. Suppose that a woman $i$ of type $x$ and a man $j$ of type $y$ form a couple. I assume their lifetime utilities are as follows:

$$\text{woman } i\text{’s utility:} \quad u_{ij} = \alpha_{xy} + \tau_{ij} + \varepsilon_{iy}$$

$$\text{man } j\text{’s utility:} \quad v_{ij} = \gamma_{xy} - \tau_{ij} + \eta_{xj}$$

$\alpha_{xy}$ and $\gamma_{xy}$ indicate the systematic part of the utility each individual gets from the marriage, depending on their types. $\tau_{ij}$ represents the transfer² between $i$ and $j$, which is determined in equilibrium. $\varepsilon_{iy}$ and $\eta_{xj}$ represent the individuals’ idiosyncratic tastes over partner types. Notice that they depend only on the partners’ types.

For singles, utilities are:

$$u_{i\emptyset} = \alpha_{x\emptyset} + \varepsilon_{i\emptyset}$$

$$v_{\emptyset j} = \gamma_{\emptyset y} + \eta_{\emptyset j}$$

2 $\tau$ can be either positive or negative.

Without loss of generality, we can normalize $\alpha_{x\emptyset} = 0$ and $\gamma_{\emptyset y} = 0$.³ Then $\alpha_{xy}$ and $\gamma_{xy}$ can be interpreted as the net systematic gain from marriage.

There are three important assumptions underlying my specification of individuals’ utility:⁴

• There exists a transfer technology among a couple to transfer their utilities one to one without loss,

which is the basic feature of a matching model with TU.

• Both transfer and the random taste terms are additive to the systematic part.

• The random terms are individual specific but only depend on the partner’s type.

This utility specification may seem restrictive, but it allows for "matching on unobservables" and keeps the model tractable. What it rules out is a "chemistry" term between two individuals conditional on their types, i.e., an unobserved preference of one individual for some unobserved characteristic of a particular partner.

Stable Matching

Given the population and type distributions $G_x$, $G_y$ in a marriage market, a matching is defined as a measure $\mu$ on the set $X \times Y$ and a set of payoffs $\{u_i, v_j,\ i \in I,\ j \in J\}$ such that $u_i + v_j = \alpha_{xy} + \gamma_{xy} + \varepsilon_{iy} + \eta_{xj}$ for any matched couple $(i, j)$. In other words, a matching specifies who marries whom and how each matched couple divides the surplus. Notice that the female type distribution $G_x$ is endogenously determined by female marriage timing choices and the exogenous education distribution, denoted $E_f = (n_L, n_H)$, while the male type distribution $G_y$ coincides with the exogenous education distribution, denoted $E_m = (m_L, m_H)$.

In a stable matching, there are two requirements:

• (Individual rationality) Any matched individual is weakly better off than being single:

$$u_i \ge \varepsilon_{i\emptyset}, \quad v_j \ge \eta_{\emptyset j}, \quad \forall i \in I,\ j \in J$$

• (No blocking pair) There do not exist two individuals, woman $i$ and man $j$, who are currently not matched to each other but would both rather match with each other than remain in their current situation:

$$u_i + v_j \ge \alpha_{xy} + \gamma_{xy} + \varepsilon_{iy} + \eta_{xj}, \quad \forall i \in I,\ j \in J$$

3 Because we can always define $\tilde{\alpha}_{xy} = \alpha_{xy} - \alpha_{x\emptyset}$ and $\tilde{\gamma}_{xy} = \gamma_{xy} - \gamma_{\emptyset y}$ as the systematic utility surplus an individual obtains from marriage compared to being single.

4 This is the "Separability" assumption in Galichon and Salanié (2015). As noted in that paper, what matters in the model is the surplus a couple can jointly achieve, i.e. $\alpha_{xy} + \gamma_{xy} + \varepsilon_{iy} + \eta_{xj}$ in our case here. How we attribute this surplus to male preference or female preference doesn’t matter. For example, it can be the case that women don’t have any random taste for men and their utilities without any transfer are $\alpha_{xy}$. Men’s utilities are $\gamma_{xy} + \varepsilon_{iy} + \eta_{xj}$, indicating that man $j$ not only has a random draw $\eta_{xj}$ depending on the woman’s type, but also has an own-type-specific random taste for a particular woman $i$, represented by $\varepsilon_{iy}$. The solution to the model is the same no matter how we attribute the joint surplus to people’s preferences. The same assumption is also imposed in Choo and Siow (2006) and Chiappori et al. (2017).

Therefore, in any stable matching and given equilibrium transfers $\tau_{ij}$, the following conditions hold:

$$\text{Woman } i \text{ chooses } j^*(i) \in \arg\max_{j \in J_0} u_{ij}$$

$$\text{Man } j \text{ chooses } i^*(j) \in \arg\max_{i \in I_0} v_{ij}$$

where $J_0$ represents all men together with the option of being single, and $I_0$ represents all women together with the option of being single.

Lemma 2.1. For any stable matching, there exist two vectors $U^{xy}$ and $V^{xy}$ such that:

(i) Woman $i$ of type $x$ achieves utility

$$\tilde{u}_i = \max_{y \in Y_0} (U^{xy} + \varepsilon_{iy})$$

and she matches with some man whose type $y$ achieves the maximum;

(ii) Man $j$ of type $y$ achieves utility

$$\tilde{v}_j = \max_{x \in X_0} (V^{xy} + \eta_{xj})$$

and he matches with some woman whose type $x$ achieves the maximum;

(iii) If there exist women of type $x$ matched with men of type $y$ at equilibrium, then

$$U^{xy} + V^{xy} = \alpha_{xy} + \gamma_{xy}$$

This lemma has been proved in Chiappori et al. (2017) and Galichon and Salanié (2015). I give a short version of the proof in the appendix. With TU, the additive structure and type-specific heterogeneity, this two-sided matching problem simplifies to a one-sided discrete choice problem.

Solutions with Gumbel distribution

If we further assume a Gumbel distribution for $\varepsilon$ and $\eta$, a closed-form solution for the stable matching and the expected utilities of each type can be derived. From now on, let’s assume the random terms $\varepsilon_{iy}$, $\eta_{xj}$ follow independent Gumbel distributions $G(-k, 1)$, with $k \simeq 0.5772$ being the Euler constant. With the properties of the Gumbel distribution and Lemma 2.1, for a given woman $i$ of type $x$,

$$\mu_{y|x} := \Pr(\text{woman } i \text{ of type } x \text{ is matched with a man of type } y) = \frac{\exp(U^{xy})}{1 + \sum_{y' \in Y} \exp(U^{xy'})}$$

$$\mu_{\emptyset|x} := \Pr(\text{woman } i \text{ of type } x \text{ is single}) = \frac{1}{1 + \sum_{y' \in Y} \exp(U^{xy'})}$$

Therefore,

$$\frac{\mu_{y|x}}{\mu_{\emptyset|x}} = \exp(U^{xy}), \quad \forall x \in X$$

The same logic applies to men:

$$\frac{\mu_{x|y}}{\mu_{\emptyset|y}} = \exp(V^{xy}), \quad \forall y \in Y$$
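The choice probabilities above have the familiar multinomial logit form. A minimal sketch for one female type, with the utility of singlehood normalized to zero (the function name and the numerical values of the systematic utilities are hypothetical, chosen only for illustration):

```python
import math


def choice_probabilities(U):
    """Logit choice probabilities for one woman type x.

    U maps each male type y to the systematic utility U^{xy};
    the systematic utility of staying single is normalized to zero.
    """
    denom = 1.0 + sum(math.exp(u) for u in U.values())
    probs = {y: math.exp(u) / denom for y, u in U.items()}
    probs["single"] = 1.0 / denom
    return probs


U_x = {"L": 0.3, "H": 1.0}  # hypothetical values, not estimates from the thesis
probs = choice_probabilities(U_x)

# The log odds of marrying type y relative to staying single recover U^{xy}.
log_odds = {y: math.log(probs[y] / probs["single"]) for y in U_x}
print(probs)
print(log_odds)
```

The last step mirrors the ratio $\mu_{y|x} / \mu_{\emptyset|x} = \exp(U^{xy})$: taking logs of the odds returns the systematic utilities that were fed in.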

Denote by $n_x$, $m_y$ the population of each type. Note that $n_x$ depends on women’s participation choices. Denote by $\mu_{xy}$ the mass of matched couples between women of type $x$ and men of type $y$ (note that $\mu_{xy} = \mu_{yx}$ by construction, since matching is one-to-one), by $\mu_{x0}$ the mass of single women of type $x$, and by $\mu_{0y}$ the mass of single men of type $y$. Then we have:

$$\frac{\mu_{xy}^2}{\mu_{x0}\,\mu_{0y}} = \exp(U^{xy} + V^{xy}) = \exp(\alpha_{xy} + \gamma_{xy})$$

Denote $\Phi_{xy} = \alpha_{xy} + \gamma_{xy}$. Then, given $\Phi_{xy}$, the previous equation provides a matching function linking the mass of any couple type to the probabilities of singlehood. Together with the following feasibility constraints, it yields a system of $|X| + |Y|$ equations in $|X| + |Y|$ unknowns (the probabilities of singlehood for each type). Decker et al. (2013) show the existence and uniqueness of the solution to this system.

$$\mu_{x0} + \mu_{xL} + \mu_{xH} = n_x, \quad \forall x \in \{L_1, L_2, H_1, H_2\}$$

$$\mu_{0y} + \mu_{L_1 y} + \mu_{L_2 y} + \mu_{H_1 y} + \mu_{H_2 y} = m_y, \quad \forall y \in \{L, H\}$$

Moreover, we can recover the expected utilities each type gets from participating in this marriage market. With the properties of Gumbel distributions,

$$u_x := E[\tilde{u}_i] = E\left[\max_{y \in Y_0} (U^{xy} + \varepsilon_{iy})\right] = \ln\left(1 + \sum_{y \in Y} \exp(U^{xy})\right) = -\ln(\mu_{\emptyset|x}) = -\ln\left(\frac{\mu_{x0}}{n_x}\right)$$

$$v_y := E[\tilde{v}_j] = E\left[\max_{x \in X_0} (V^{xy} + \eta_{xj})\right] = \ln\left(1 + \sum_{x \in X} \exp(V^{xy})\right) = -\ln(\mu_{\emptyset|y}) = -\ln\left(\frac{\mu_{0y}}{m_y}\right)$$

In this case, the expected utility has a one-to-one correspondence with the single rate: the smaller the single rate, the larger the expected utility.⁵
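To make the matching system concrete, the sketch below solves it numerically for given surpluses and type masses and then recovers expected utilities from single rates. The alternating update used here (solving each side’s quadratic feasibility constraint in turn) is one standard way to solve such systems and is an assumption of this sketch, not necessarily the method used in the thesis; the surplus values and masses are hypothetical.

```python
import math


def solve_matching(phi, n, m, tol=1e-13, max_iter=100_000):
    """Solve mu_xy^2 = exp(phi_xy) * mu_x0 * mu_0y under the feasibility constraints.

    phi[x][y] is the joint surplus; n[x] and m[y] are the type masses.
    Works with a_x = sqrt(mu_x0) and b_y = sqrt(mu_0y) and updates each side
    in turn by solving its quadratic feasibility constraint.
    """
    K = [[math.exp(p / 2.0) for p in row] for row in phi]
    a = [math.sqrt(val) for val in n]
    b = [math.sqrt(val) for val in m]
    for _ in range(max_iter):
        a_new = []
        for x, row in enumerate(K):
            s = sum(k * by for k, by in zip(row, b))
            # positive root of a^2 + s*a - n_x = 0
            a_new.append((math.sqrt(s * s + 4.0 * n[x]) - s) / 2.0)
        b_new = []
        for y in range(len(m)):
            s = sum(K[x][y] * a_new[x] for x in range(len(n)))
            # positive root of b^2 + s*b - m_y = 0
            b_new.append((math.sqrt(s * s + 4.0 * m[y]) - s) / 2.0)
        change = max(abs(p - q) for p, q in zip(a + b, a_new + b_new))
        a, b = a_new, b_new
        if change < tol:
            break
    mu = [[K[x][y] * a[x] * b[y] for y in range(len(m))] for x in range(len(n))]
    mu_x0 = [val * val for val in a]
    mu_0y = [val * val for val in b]
    return mu, mu_x0, mu_0y


# Hypothetical surplus values: rows are L1, L2, H1, H2; columns are L, H.
phi = [[1.0, 0.5], [0.6, 0.3], [0.8, 2.0], [0.5, 1.2]]
n = [0.3, 0.2, 0.3, 0.2]  # masses of the four female types
m = [0.5, 0.5]            # masses of the two male types
mu, mu_x0, mu_0y = solve_matching(phi, n, m)

# Expected utilities are minus the log of each type's single rate.
u = [-math.log(s / nx) for s, nx in zip(mu_x0, n)]
v = [-math.log(s / my) for s, my in zip(mu_0y, m)]
print(u, v)
```

Working with the square roots of the masses of singles turns each feasibility constraint into a quadratic with a single positive root, which keeps every iterate feasible.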

Stationary equilibrium with OLG

Before participating in any marriage market, each woman in the model chooses when to enter the marriage market, given the predetermined education distributions of women and men, denoted by $(E_f, E_m)$. If a woman with education $e$ chooses to enter in period 2 instead of period 1, this increases the expected marital return of all women in the period 1 marriage market and decreases the expected return of all women in the period 2 marriage market.⁶ In a stationary equilibrium, the percentage of women who choose to wait until period 2 equates women’s expected returns in the two marriage markets. Denote the percentage of women with education $e \in \{L, H\}$ who choose to seek partners in period 1 (or period 2) by $q_e^1$ (or $q_e^2$). Of course, $q_e^1 + q_e^2 = 1$ for all $e$.

5 This is a specific property of the Gumbel distribution.


6 As proved in Galichon and Salanié (2017), an addition of one woman hurts all women and benefits all men; an addition of
one man hurts all men and benefits all women.

We say that the marriage market with distribution of female and male types $(G_x, G_y)$ is the induced marriage market of a strategy vector $q$ if the distribution of female types (four) and male types (two) in the marriage market is $(G_x, G_y)$ when women adopt strategy $q$. Note that for the male distribution, $G_y = E_m$ for all $q$.

Definition 2.1. Strategy vector $q = \{q_H^1, q_H^2, q_L^1, q_L^2\}$ forms a stationary equilibrium if $u_{H_1} = u_{H_2}$ and $u_{L_1} = u_{L_2}$ in the induced marriage market, where $u_{e_1}$ ($u_{e_2}$) is the expected marriage payoff of women with education $e$ who choose to enter the marriage market in period 1 (period 2).

Denote $\Phi_{xy} = \alpha_{xy} + \gamma_{xy}$, with woman’s type $x \in \{L_1, L_2, H_1, H_2\}$ and man’s type $y \in \{L, H\}$.

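Proposition 2.1. There exists a unique stationary equilibrium, and the equilibrium strategy $q$ satisfies:

$$\min(\Phi_{L_1 L} - \Phi_{L_2 L},\ \Phi_{L_1 H} - \Phi_{L_2 H}) \le \ln\left(\frac{q_L^1}{q_L^2}\right) \le \max(\Phi_{L_1 L} - \Phi_{L_2 L},\ \Phi_{L_1 H} - \Phi_{L_2 H})$$

$$\min(\Phi_{H_1 L} - \Phi_{H_2 L},\ \Phi_{H_1 H} - \Phi_{H_2 H}) \le \ln\left(\frac{q_H^1}{q_H^2}\right) \le \max(\Phi_{H_1 L} - \Phi_{H_2 L},\ \Phi_{H_1 H} - \Phi_{H_2 H})$$

Proof. See the appendix.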

Intuitively, the equilibrium percentage of women who decide to participate in period 1 depends on the

marital surplus difference between marrying in period 1 and period 2 given any partner type. The larger

the difference, the higher the percentage of women seeking partners in period 1.

One corollary of Proposition 2.1 is that the equilibrium strategy $q$ satisfies $0 < q_e^1 < 1$ and $0 < q_e^2 < 1$ for all $e \in \{L, H\}$. In equilibrium, it never happens that all women of the same education type choose to participate in period 1, or all in period 2, as long as the surplus terms $\Phi$ are bounded. Intuitively, if all women of one education type chose to participate in period 1, a woman could benefit by participating in period 2 instead, which would make her the only older woman with that education. The scarcity of this type would earn her large marital returns. Since the support of the Gumbel distribution is $\mathbb{R}$, the potential return could be large enough that being the only woman of the older type in period 2 is more rewarding than participating in period 1, no matter how large the surplus difference $\Phi_{e_1 y} - \Phi_{e_2 y}$ is, as long as it is finite.⁷

7 One can also understand this in terms of the probability of singlehood. In the model, single probability has one-to-one

correspondence with the expected utility: the lower the single probability, the higher the expected marital return. For a woman

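Proposition 2.2. If, for a given education type $e \in \{L, H\}$, $\Phi_{e_1 H} - \Phi_{e_2 H} = \Phi_{e_1 L} - \Phi_{e_2 L}$, then $q_e^1$ and $q_e^2$ are uniquely pinned down by:

$$q_e^1 = \frac{\exp(\Phi_{e_1 L})}{\exp(\Phi_{e_1 L}) + \exp(\Phi_{e_2 L})}, \qquad q_e^2 = \frac{\exp(\Phi_{e_2 L})}{\exp(\Phi_{e_1 L}) + \exp(\Phi_{e_2 L})}$$

Proof. See the appendix.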

The condition $\Phi_{e_1 H} - \Phi_{e_2 H} = \Phi_{e_1 L} - \Phi_{e_2 L}$ indicates that the gain from female youth in the surplus is independent of men’s education.⁸ This means that male education and female youth do not interact in the marital surplus; hence the marginal contribution of female youth to the surplus does not depend on the partner’s education type either. In a matching model, individuals’ marital gains come from their marginal contributions to the surplus. In this case, women get all the benefit (or cost) of female youth if they choose to participate in period 1. Their choice of marriage market is fully pinned down by this difference in marital surplus, independent of the education distribution on both sides.
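Under the no-interaction condition, the equilibrium shares in Proposition 2.2 are a two-option logit in the surplus with a low-education partner. A minimal sketch (the function name and the surplus values are hypothetical and only illustrate the formula):

```python
import math


def period_shares(phi_young, phi_old):
    """Equilibrium shares (q1, q2) when male education and female age do not interact.

    phi_young, phi_old: joint surplus with a low-education man for a woman of a
    given education level who marries young (period 1) or old (period 2).
    """
    shift = max(phi_young, phi_old)  # subtract the maximum for numerical stability
    w1 = math.exp(phi_young - shift)
    w2 = math.exp(phi_old - shift)
    return w1 / (w1 + w2), w2 / (w1 + w2)


q1, q2 = period_shares(1.0, 0.6)  # hypothetical surplus values
print(q1, q2, math.log(q1 / q2))
```

Neither the female nor the male education distribution enters the function, which is the content of the proposition: the log odds of marrying in period 1 equal the surplus gain from youth and nothing else.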

Comparative statics

School construction would lead to a dynamic change in the population education. However, unlike in Bhaskar

(2015), the current model doesn’t focus on the transitory period, which is of less interest in this paper. I

will concentrate instead on how the stationary equilibrium changes in response to the change in population

education. For simplicity, let’s assume male population and female population are equal. Without loss of

generality, I can also normalize the population of each side to 1 since the model has constant returns to

scale. Let us analyze how female marriage age decision would change when the education distribution of

men or women changes, respectively.

Proposition 2.3. Denote the female education distribution by $E_f = (n_L, 1 - n_L)$ and the male education distribution by $E_m = (m_L, 1 - m_L)$. Keeping $n_L$ constant, for all $e \in \{L, H\}$, a decrease in $m_L$ would

• increase $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} > \Phi_{e_1 L} - \Phi_{e_2 L}$;

• decrease $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} < \Phi_{e_1 L} - \Phi_{e_2 L}$.

Proof. See the appendix.

7 (continued) who is the only one of the older type in period 2, she would almost surely get married, since the men who have very large draws for this particular older type would compete fiercely among themselves to marry her.

8 It can depend on female education, $e$. For example, the return to female youth may be larger for less educated women than for more educated women, or the other way around. The empirical observation that less educated women marry earlier supports the case that the gain is larger for less educated women.

If the percentage of more-educated men increases, the equilibrium percentage of women marrying in period 1 increases if male education and female youth are complementary in the marital surplus;⁹ it decreases if instead male education and female maturity are complementary in the marital surplus. Notice that whether the marital surplus is super-modular in male education and female education does not matter.

A stable matching maximizes the total social surplus in a TU framework (Shapley and Shubik, 1971). When male education and female youth are complementary, the social surplus is larger if we pair more educated men with younger women. Hence, when $m_L$ decreases, the presence of more educated men induces more women to marry in period 1 to take advantage of the higher social surplus, and vice versa.

Proposition 2.4. Denote the female education distribution by $E_f = (n_L, 1 - n_L)$ and the male education distribution by $E_m = (m_L, 1 - m_L)$. Further assume super-modularity in men’s education and women’s education. Holding $m_L$ constant, for all $e \in \{L, H\}$, a decrease in $n_L$ would

• decrease $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} > \Phi_{e_1 L} - \Phi_{e_2 L}$;

• increase $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} < \Phi_{e_1 L} - \Phi_{e_2 L}$.

Proof. See the appendix.

A change in the female education distribution affects women’s equilibrium choice by changing the potential gain from female youth through the distribution of men a woman can marry. If $n_L$ decreases, then, for a given woman, other women are more educated. They are more likely to marry more educated men due to the complementarity in education. Therefore, more educated men become scarcer on the market, which discourages all women from participating in period 1, as predicted by Proposition 2.3, if male education and female youth are complementary.

9 There are at least four ways to interpret the complementarity between male education and female youth. For example, (1) all men prefer female youth and more educated men value female youth more than less educated men. (2) All women prefer more educated men and younger women value male education more than older women. (3) All men dislike female youth and more educated men dislike female youth less than less educated men. (4) All women dislike more educated men and younger women dislike more educated men less than older women. Of course, the first and second seem to be more plausible than the last two.

2.3 The Marriage Market in Indonesia

Marriage traditions differ across Indonesia’s hundreds of ethnolinguistic groups. However, under the influence of national policies, certain commonalities have also emerged (Frederick and Worden, 1993).

With more than 87% of the population being Muslim (according to the 2010 census), polygamy is legal. However, only 2% of marriages are polygamous (Jones, 1994).

Arranged marriage still exists, but its share is decreasing. Most marriages require the consent of the children, especially for the groom’s family (Malhotra, 1991). In Indonesia, the average female marriage age is about 19. It is low but similar to that in other Southeast Asian countries.

The divorce rate used to be very high (about 60%) around the 1960s; however, it has been decreasing since the 1970s and is now less than 40%. The fertility rate has also been declining since the 1970s, as average education has increased. There is no evidence of son preference in Indonesia (Frederick and Worden, 1993).

2.4 Data

To avoid truncation problems, which arise because young men and women who are single in the survey year may marry in later years, I choose the latest censuses available from IPUMS. I use information from the 10% sample of the Indonesian Population Census 2010 and the 0.51% sample of the Indonesian Intercensal Population Survey (SUPAS) 2005, downloaded from IPUMS International.¹⁰ Education, current marital status and current spousal information are available in both censuses. However, only SUPAS 2005 records detailed lifetime marital outcomes such as age at first marriage and number of marriages; moreover, only women were surveyed on those questions. Men’s marriage age can be proxied using the wife’s information if both spouses are in their first marriage.

10 Minnesota Population Center. Integrated Public Use Microdata Series, International: Version 7.1 [dataset]. Minneapolis,

MN: IPUMS, 2018. https://doi.org/10.18128/D020.V7.1. I would also like to acknowledge the statistical agency that originally
produced the data: Statistics Indonesia

Table 2.1 displays the descriptive statistics of individuals’ marital outcomes and characteristics of married

people using the 2010 census. For married couples, on average, husbands are 5 years older than wives, and

this is much larger than the gap of 2 years observed in the US.

Table 2.2 presents detailed summary statistics of female age at first marriage by education. More educated women marry later. Younger cohorts tend to marry later, but the difference is very small. More importantly, the difference in age at first marriage between women with different education levels is very stable across old and young cohorts.

Figure 2.1 displays the matching patterns with respect to education for observed married couples in which wives were aged 40-50 in 2010. The left panel shows the raw frequencies and the right panel shows the weighted percentages. For all four education levels, the share of people whose spouses have the same education is the largest, a universal phenomenon documented in the literature.

2.5 Main Results

In this section, I present my empirical results on female marriage age and the spousal age gap. I first

show reduced-form event study results on the impact of school construction on female marriage age and the

spousal age gap for the treated female cohorts, separately for sparsely and densely populated regions. I then

provide the 2SLS estimate of how female marriage age and the spousal age gap change with respect to the female education distribution, using the school construction program as an instrumental variable for the female education distribution.

Reduced-form results

The empirical specification for the reduced-form results is the same as the previous specification for the

education results.

Figure 2.2 presents the coefficients on the interaction between the birth cohort dummy and school construction intensity for female age at first marriage (top) and the spousal age gap (bottom), by female age group in 1974, in sparsely populated regions. None of these interaction coefficients is significantly different from zero. This is expected, since female education was not substantially affected by the school construction program in sparsely populated regions. The results for densely populated regions are presented in Figure 2.3. The top panel shows a negative effect on female age at first marriage of one additional primary school being built in the region. Correspondingly, the bottom panel shows a positive effect on the spousal age gap.

Difference-in-differences estimates are presented in Table 2.3. The sample includes women born between

1953 and 1961 who were older than 12 in 1974, and women born between 1965 and 1970. "Post" indicates

women born between 1965 and 1970. Columns (1) and (2) show the estimates for sparsely populated regions.

Neither female age at first marriage nor the spousal age gap was impacted. Columns (3) and (4) present the

estimates for densely populated regions and suggest that one additional school being constructed decreased

the average female age at first marriage by 0.25 years and increased the spousal age gap by 0.075 years.

2SLS estimate

In chapter 1, I showed that:

• Result 1: The program has a positive effect on primary school attainment rate for men and a surprising

negative effect on secondary school attainment rate for women.

• Result 2: In sparsely populated regions, there is a positive effect on primary school attainment and

secondary school attainment rate for men but zero effect for women.

• Result 3: In densely populated regions, for both men and women, there is no effect on primary school

attainment rate, but negative effect on secondary school attainment rate.

In light of the different effects on education in sparsely and densely populated regions, I should expect

different results on marriage market outcomes in sparsely and densely populated regions. Moreover, I should

expect zero effect on female marriage age or spousal age gap in sparsely populated regions since female

education is not impacted.

Since there is no first stage for female education in sparsely populated regions, in this subsection I focus on densely populated regions. Consider the following equation, which characterizes how own education and the education distribution may affect an individual’s marriage age and spousal age gap:

$$y_{ijk} = \alpha_j + \beta_k + D_{ijk}\,c + E_{jk}\,b + \nu_{ijk}$$

where $\alpha_j$ is a region fixed effect and $\beta_k$ is a birth cohort fixed effect. $y_{ijk}$ denotes the marriage age or spousal age gap of a woman $i$ born in year $k$ in region $j$, $D_{ijk}$ is a dummy variable denoting whether woman $i$ completed secondary school, and $E_{jk}$ denotes the female secondary school attainment rate for birth cohort $k$ in region $j$.

The coefficient of interest is $b$, indicating the impact of an increase in the proportion of educated women on female marriage age and the spousal age gap. However, ordinary least-squares (OLS) estimates of this equation may be biased if $E_{jk}$ or $D_{ijk}$ is correlated with $\nu_{ijk}$. Unobserved individual characteristics such as ability or family attitudes could affect both a woman’s educational attainment and her marriage decisions, leading to a correlation between $D_{ijk}$ and $\nu_{ijk}$. Unobserved region-cohort-specific characteristics, such as the construction of entertainment facilities or the promotion of family planning policies, could affect the educational attainment and marriage decisions of a few cohorts in the region, leading to a correlation between $E_{jk}$ and $\nu_{ijk}$.

To address this issue, let us take the average across individuals $i$ within birth cohort $k$ and region $j$. Since the cohort average of $D_{ijk}$ is $E_{jk}$, this gives:

$$\bar{y}_{jk} = \alpha_j + \beta_k + E_{jk}\,(b + c) + \bar{\nu}_{jk}$$

The school construction program provides a good instrumental variable for $E_{jk}$, and hence I can obtain a valid estimate of $(b + c)$. OLS and 2SLS estimates of this specification are shown in Panel A of Table 2.4 for female age at first marriage and the spousal age gap. The IV estimate for female age at first marriage, although imprecisely estimated, indicates that increasing the share of female secondary graduates by 10 percentage points would increase the average female marriage age by 1.09 years. The IV estimate for the spousal age gap indicates that increasing the share of female secondary graduates by 10 percentage points would decrease the average spousal age gap by 0.35 years.
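The logic of instrumenting the cohort-level attainment rate can be illustrated with the single-instrument (Wald) form of the IV estimator, $\mathrm{cov}(z, \bar{y}) / \mathrm{cov}(z, E)$. The sketch below uses a tiny constructed data set, not the thesis’s data: the fixed effects are omitted, the unobserved shock `u` is built to be uncorrelated with the instrument `z`, and the true slope is set to 10.9 purely so that it corresponds to 1.09 years per 10 percentage points. All names and numbers are assumptions for illustration.

```python
def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


z = [0.0, 1.0, 2.0, 3.0]    # instrument: schools built (toy values)
u = [1.0, -1.0, -1.0, 1.0]  # unobserved shock, uncorrelated with z by construction

# First stage: attainment rate falls with school construction and moves with u.
E = [0.5 - 0.02 * zi + 0.05 * ui for zi, ui in zip(z, u)]

# Outcome: average marriage age with true slope (b + c) = 10.9, also moved by u.
y = [19.0 + 10.9 * Ei + 1.0 * ui for Ei, ui in zip(E, u)]

iv_estimate = cov(z, y) / cov(z, E)    # uses only variation in E driven by z
ols_estimate = cov(E, y) / cov(E, E)   # contaminated by the correlation of E with u

print(iv_estimate, ols_estimate)
```

In this constructed example the IV ratio recovers the slope because the instrument is unrelated to the shock, while the OLS slope is pushed away from it by the common shock; the direction and size of the OLS bias here are artifacts of the toy numbers.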

Separating the Effects of Own Education and the Education Distribution. From the previous

specification, we know that:

E(yijk |Dijk = 0) = αj + βk + Ejk b

E(yijk |Dijk = 1) = αj + βk + c + Ejk b

Hence, c = E(yijk |D = 1) − E(yijk |D = 0), which can be empirically estimated as the difference in the

outcome variables conditional on education level. From the summary statistics, we know that the difference

in age at first marriage between female secondary school graduates and female primary school graduates

is 4 years, while the difference in the spousal age gap between secondary school graduates and primary

school graduates is (-1.5) years. Comparing this with previous estimates indicates that when controlling for

a woman’s education, increasing the percentage of female secondary graduates by 10 percentage points in

her birth cohort would increase her first marriage age by 0.69 years and decrease the spousal age gap by 0.2

years.
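The arithmetic behind this decomposition is b = (b + c) - c. The sketch below reproduces it from the IV estimates in Table 2.4 and the raw differences by education quoted above; the variable names are illustrative, and treating the raw difference in means as c is the assumption made in the text.

```python
# IV estimates of (b + c) from Table 2.4, Panel A, per unit change in the share.
B_PLUS_C_MARRIAGE_AGE = 10.9
B_PLUS_C_AGE_GAP = -3.49

# Own-education effect c, approximated by the raw difference between
# secondary and primary school graduates quoted in the text.
C_MARRIAGE_AGE = 4.0
C_AGE_GAP = -1.5

def distribution_effect(b_plus_c: float, c: float, share_change: float = 0.1) -> float:
    """Effect of a change in the cohort share, holding own education fixed."""
    return (b_plus_c - c) * share_change

effect_marriage_age = distribution_effect(B_PLUS_C_MARRIAGE_AGE, C_MARRIAGE_AGE)
effect_age_gap = distribution_effect(B_PLUS_C_AGE_GAP, C_AGE_GAP)
print(round(effect_marriage_age, 2), round(effect_age_gap, 2))
```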

Interpretation

My findings on the marriage market are consistent with the model when there is complementarity between

higher education of husbands and younger age of wives in the marital surplus. A decrease in the percentage

of female secondary school graduates creates a relative abundance of secondary school graduate men, which

would encourage more women to marry earlier. In the Indonesian setting, the regions with more schools

constructed experienced a smaller increase in female secondary graduates, which created a relative abundance

of male secondary graduates in the marriage market, and this encouraged even more women to marry earlier.

2.6 Discussion

Since places where people get married are not recorded in the data, the birth place is used to define a

marriage market. This could create measurement error if many people marry people from outside their birth

place. In the sample, the percentage of couples with different birth regencies is 77%. In densely populated

regencies, it’s about 73%. And in sparsely populated regencies, it’s a bit larger, about 81%.

The previous model is one way to rationalize the empirical result. There could be other potential stories.

For example, women might prefer to marry earlier when others do, irrespective of marriage market

conditions. Then, when there are more less-educated women, who tend to marry early, all women marry

earlier. However, this sort of story relies on an ad hoc preference that should itself be rationalized by

some micro foundation. In this regard, I argue that the current model provides one possible

micro foundation for these alternative stories.

As for the model, endogenizing both education choice and marriage age choice would be very rewarding

in future work.

2.7 Conclusion

I have shown that women adjust their marriage age when the average education of women changes in the local

marriage market. Exploiting a massive school construction program in the late 1970s in Indonesia, I analyze

the age at first marriage of the first few cohorts of women who were exposed to the school construction

program. Since the spousal age gap is on average 5 years, these women’s potential husbands’ educations

were minimally impacted. I find that women decrease their marriage age when there is a decrease in the

average secondary school attainment rate of other women in the same cohort. To explain this, I construct

a two-to-one dimensional matching model embedding female choice of marriage age into a two-period OLG

framework and show that if with respect to marital surplus, (1) there exists complementarity between

male education and female education, (2) there exists complementarity between male education and female

youth, then women will decrease their marriage age in response to a decrease in other women’s education.

Intuitively, when the education of other women decreases, they tend to marry less-educated men due to

the complementarity with education, and this creates an abundance of more-educated men. Due to the

complementarity between men’s education and women’s young age, the abundance of more-educated men

induces women to marry earlier.

This study is a step toward further understanding the effect of market conditions on individuals’ marriage

decisions and outcomes. Education expansion policies have been observed around the world. The empirical

finding that female marriage age responds to other women’s education has direct policy implications. When

evaluating education policies with potential market-level impacts, we as researchers should consider both

the direct effect on individuals and the indirect effect via changing market conditions.

2.8 Figures


Figure 2.1: Marriage frequencies (left) and marriage proportions (right) by education for females
[Figure: two panels of estimated coefficients by 3-year cohort. Top panel: female first marriage age. Bottom panel: spousal age gap. x-axis: age in 1974, from 24-22 down to 3-2.]

Figure 2.2: Effect of the program on female age at first marriage (top) and the spousal age gap (bottom) in
sparsely populated areas

Note: This figure reports estimates of the effect of school construction on female first marriage age (top) and spousal age gap
(bottom) for 3-year cohorts females in sparsely populated areas. The x-axis reports the age range (in 1974) for each cohort and
the y-axis reports the estimated coefficient, which can be interpreted as the effect of one additional primary school built per 1000
kids on primary school attainment rate in that regency.

[Figure: two panels of estimated coefficients by 3-year cohort. Top panel: female first marriage age. Bottom panel: spousal age gap. x-axis: age in 1974, from 24-22 down to 3-2.]

Figure 2.3: Effect of the program on female age at first marriage (top) and the spousal age gap (bottom) in
densely populated areas

Note: This figure reports estimates of the effect of school construction on female first marriage age (top) and spousal age gap
(bottom) for 3-year cohorts females in densely populated areas. The x-axis reports the age range (in 1974) for each cohort and
the y-axis reports the estimated coefficient, which can be interpreted as the effect of one additional primary school built per 1000
kids on primary school attainment rate in that regency.

2.9 Tables

Table 2.1: Summary statistics

                                             Old Cohorts                   Young Cohorts
2010 Census:                          Born between 1950 and 1961    Born between 1962 and 1974
                                        Males        Females          Males        Females

Marriage outcomes
  Never Married (=1)                     0.01          0.02            0.04          0.03
  Separated (=1)                         0.01          0.04            0.02          0.04
Married Couples
  Husband Age minus Wife Age             5.83          4.84            4.45          4.60
  Husband More Educated (=1)             0.26          0.27            0.27          0.25
  Same Education (=1)                    0.62          0.63            0.57          0.59
  Wife More Educated (=1)                0.13          0.11            0.16          0.16
  Years of Schooling Gap                 0.57          0.68            0.44          0.37
    (Husband's minus Wife's)
Observations                        1,275,648     1,231,961       2,148,572     2,128,266

Source: Indonesian Census 2010.

Table 2.2: First marriage age

mean sd p10 p25 p50 p75 p90 count


Women born between 1950 and 1961:
Some School 17.91 (4.19) 14.00 15.00 17.00 20.00 23.00 26,550
Primary School 18.87 (4.21) 15.00 16.00 18.00 21.00 24.00 27,509
Secondary School 22.19 (4.55) 17.00 19.00 22.00 24.00 27.00 6,971
University or above 24.39 (4.45) 19.00 21.00 24.00 27.00 30.00 938
Women born between 1962 and 1974:
Some School 17.75 (3.79) 14.00 15.00 17.00 20.00 22.00 22,872
Primary School 18.73 (3.74) 15.00 16.00 18.00 20.00 24.00 51,265
Secondary School 22.58 (4.00) 18.00 20.00 22.00 25.00 28.00 22,624
University or above 25.43 (3.97) 20.00 23.00 25.00 28.00 30.00 3,080

Source: Indonesian Census 2005.
Table 2.3: Reduced-form effect of school construction on female marriage outcomes

Density < Medium Density > Medium


Panel A:
Female First Marriage Age:
(1) (2) (3) (4)
Post × Intensity -0.054 0.024 -0.27∗∗ -0.25
(0.076) (0.078) (0.12) (0.16)
Dep. var. mean 19.231 19.153
Observations 2664 2664 1365 1365
Clusters 183 183 91 91
Adjusted R-squared 0.673 0.702 0.710 0.717
Panel B:
Spousal age gap
Post × Intensity 0.030 0.057 0.18∗∗∗ 0.075∗
(0.025) (0.040) (0.044) (0.040)
Dep. var. mean 4.838 4.776
Observations 2745 2745 1365 1365
Clusters 183 183 91 91
Adjusted R-squared 0.858 0.879 0.895 0.917
Duflo Controls: Yes Yes Yes Yes
Log-linear trend: No Yes No Yes

Notes: This table displays the reduced-form effect of school construction on female first marriage age (top) and spousal age
gap (bottom) in sparsely populated regions (left) and densely populated regions (right). Post refers to the first few treated
cohorts that were affected by the school construction program, i.e., those born between 1965 and 1970, while the untreated
cohort was born between 1953 and 1961. Female first marriage data is taken from Indonesian SUPAS 2005. Spousal age
gap data is taken from the Indonesian 2010 Census. All columns include district fixed effect, school year fixed effect, school
year interacted with number of children at 1971. Duflo Controls consist of school year interacted with enrollment rate at
1971 and school year interacted with water sanitization program. Standard errors are clustered at the birthplace district
level. Significance levels: * 10%, ** 5%, *** 1%.
Source: Indonesian SUPAS 2005, Indonesian Census 2010

Table 2.4: Results of female education distribution on female marriage outcomes

Female first marriage age Spousal age gap


Panel A:
OLS and IV
(1) OLS (2) IV (3) OLS (4) IV
Percentage of females 1.92 10.9∗ -2.71∗∗∗ -3.49∗
with secondary degree
(1.37) (6.53) (0.38) (2.03)
First Stage F statistics 12.929 12.929
Dep. var. mean 19.647 4.550
Observations 1365 1365 1365 1365
Clusters 91 91 91 91
Adjusted R-squared 0.763 0.754 0.920 0.920
Duflo Controls: Yes Yes Yes Yes
Log-linear trend: Yes Yes Yes Yes

Panel B:
First stage and Reduced form
                      Complete Secondary    Female age at first marriage    Spousal age gap
Post × Intensity -0.021∗∗∗ -0.23∗ 0.075∗
(0.0059) (0.14) (0.040)
Dep. var. mean 0.261 19.647 4.550
Observations 1365 1365 1365
Clusters 91 91 91
Adjusted R-squared 0.988 0.763 0.917
Duflo Controls: Yes Yes Yes
Log-linear trend: Yes Yes Yes

Notes: This table displays the OLS and IV estimates of the effect of female education distribution on marriage market
outcomes. All columns include district fixed effect, school year fixed effect, school year interacted with number of children
at 1971. Post refers to the first few treated cohorts that were affected by the school construction program, i.e., those born
between 1965 and 1970, while the untreated cohort was born between 1953 and 1961. Duflo Controls consist of school year
interacted with enrollment rate at 1971 and school year interacted with water sanitization program. Standard errors are
clustered at the birthplace district level. Significance levels: * 10%, ** 5%, *** 1%.
Source: Indonesian SUPAS 2005, Indonesian Census 2010

2.10 Appendix

A. Proof for Lemma 2.1

Proof. Denote ũi , v˜j the equilibrium utility individuals get. We know that if woman i and man j match in

equilibrium, then ũi + v˜j = αxy + γxy + εiy + ηxj .

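For woman i of type x,

$$\tilde{u}_i = \max\left\{ \max_{j \in J} \left( \alpha_{xy} + \gamma_{xy} + \varepsilon_{iy} + \eta_{xj} - \tilde{v}_j \right),\ \varepsilon_{i0} \right\}
= \max\left\{ \max_{y \in Y} \left[ \max_{j:\, y_j = y} \left( \alpha_{xy} + \gamma_{xy} + \eta_{xj} - \tilde{v}_j \right) + \varepsilon_{iy} \right],\ \varepsilon_{i0} \right\}$$

Define $U_{xy} = \max_{j:\, y_j = y} \left( \alpha_{xy} + \gamma_{xy} + \eta_{xj} - \tilde{v}_j \right)$ and $U_{x0} = 0$; then we get:

$$\tilde{u}_i = \max_{y \in Y_0} \left( U_{xy} + \varepsilon_{iy} \right)$$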

Moreover,

ũi ≥ U xy + εiy , ∀y ∈ Y0

and it achieves equality when the set of women of type x matched with men of type y is nonempty.

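With similar notation, define $V_{xy} = \max_{i:\, x_i = x} \left( \alpha_{xy} + \gamma_{xy} + \varepsilon_{iy} - \tilde{u}_i \right)$ and $V_{0y} = 0$; then:

$$\tilde{v}_j = \max_{x \in X_0} \left( V_{xy} + \eta_{xj} \right)$$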

v˜j ≥ V xy + ηxj , ∀x ∈ X0

and it achieves equality when the set of men of type y matched with women of type x is nonempty.

If there exist women of type x matched with men of type y,

$$\tilde{u}_i = U_{xy} + \varepsilon_{iy}$$

$$\tilde{v}_j = V_{xy} + \eta_{xj}$$

Hence $U_{xy} + V_{xy} = \alpha_{xy} + \gamma_{xy}$.

B. An important lemma

To prove the propositions, I’ll first establish an important lemma related to how probabilities of singlehood

change related to the shift of marginals in types.

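Lemma 2.2. Assume the idiosyncratic tastes follow Gumbel distributions. Assume there are two types on each side. Denote the female marginal as $n = (x, 1 - x)$, the male marginal as $m = (y, 1 - y)$, and the surplus matrix as

$$\Phi = \begin{pmatrix} \Phi_{LL} & \Phi_{LH} \\ \Phi_{HL} & \Phi_{HH} \end{pmatrix}.$$

Denote the equilibrium masses of single females (males) as $\mu_{L0}, \mu_{H0}$ ($\mu_{0L}, \mu_{0H}$). Then:

(a)
$$\frac{\partial \mu_{L0}}{\partial x} > 0, \qquad \frac{\partial \mu_{H0}}{\partial x} < 0$$

(b) If the marital surplus function is super-modular, i.e., $\Phi_{LL} + \Phi_{HH} > \Phi_{LH} + \Phi_{HL}$, then

(b1)
$$\frac{\partial \mu_{0L}}{\partial x} > 0 \Rightarrow \frac{\partial \mu_{0H}}{\partial x} > 0$$

$$\frac{\partial \mu_{0H}}{\partial x} < 0 \Rightarrow \frac{\partial \mu_{0L}}{\partial x} < 0$$

(b2) There exist some $\underline{\delta}_x, \bar{\delta}_x, \underline{\delta}_y, \bar{\delta}_y$ such that if $\underline{\delta}_x < x < \bar{\delta}_x$ and $\underline{\delta}_y < y < \bar{\delta}_y$, then:

$$\frac{\partial \left( \mu_{0L} / \mu_{0H} \right)}{\partial x} < 0$$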

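Proof. Denote $a = \exp(\Phi_{LL}/2)$, $b = \exp(\Phi_{LH}/2)$, $c = \exp(\Phi_{HL}/2)$, $d = \exp(\Phi_{HH}/2)$;

denote $s_{L0} = \sqrt{\mu_{L0}}$, $s_{H0} = \sqrt{\mu_{H0}}$, $s_{0L} = \sqrt{\mu_{0L}}$, $s_{0H} = \sqrt{\mu_{0H}}$;

denote $D_{L0} = \partial s_{L0}/\partial x$, $D_{H0} = \partial s_{H0}/\partial x$, $D_{0L} = \partial s_{0L}/\partial x$, $D_{0H} = \partial s_{0H}/\partial x$.

Then we can rewrite the feasibility constraints with the matching function as:

$$s_{L0}^2 + s_{L0} s_{0L}\, a + s_{L0} s_{0H}\, b = x$$

$$s_{H0}^2 + s_{H0} s_{0L}\, c + s_{H0} s_{0H}\, d = 1 - x$$

$$s_{0L}^2 + s_{L0} s_{0L}\, a + s_{H0} s_{0L}\, c = y$$

$$s_{0H}^2 + s_{L0} s_{0H}\, b + s_{H0} s_{0H}\, d = 1 - y$$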

In the four equations above, taking the derivative with respect to x, we get:

(2sL0 + as0L + bs0H )DL0 + sL0 (aD0L + bD0H ) = 1 (2.1)

(2sH0 + cs0L + ds0H )DH0 + sH0 (cD0L + dD0H ) = −1 (2.2)

(2s0L + asL0 + csH0 )D0L + s0L (aDL0 + cDH0 ) = 0 (2.3)

(2s0H + bsL0 + dsH0 )D0H + s0H (bDL0 + dDH0 ) = 0 (2.4)

Hence we can express D0L , D0H using DL0 , DH0 from Equation 2.3 and Equation 2.4:

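$$D_{0L} = -\frac{s_{0L}\,(a D_{L0} + c D_{H0})}{2 s_{0L} + a s_{L0} + c s_{H0}} \qquad (2.5)$$

$$D_{0H} = -\frac{s_{0H}\,(b D_{L0} + d D_{H0})}{2 s_{0H} + b s_{L0} + d s_{H0}} \qquad (2.6)$$

Plugging these into Equation 2.1 and Equation 2.2, we get:

$$\left( 2 s_{L0} + \frac{a s_{0L} (2 s_{0L} + c s_{H0})}{2 s_{0L} + a s_{L0} + c s_{H0}} + \frac{b s_{0H} (2 s_{0H} + d s_{H0})}{2 s_{0H} + b s_{L0} + d s_{H0}} \right) D_{L0} - \left( \frac{a c\, s_{L0} s_{0L}}{2 s_{0L} + a s_{L0} + c s_{H0}} + \frac{b d\, s_{L0} s_{0H}}{2 s_{0H} + b s_{L0} + d s_{H0}} \right) D_{H0} = 1 \qquad (2.7)$$

$$\left( 2 s_{H0} + \frac{c s_{0L} (2 s_{0L} + a s_{L0})}{2 s_{0L} + a s_{L0} + c s_{H0}} + \frac{d s_{0H} (2 s_{0H} + b s_{L0})}{2 s_{0H} + b s_{L0} + d s_{H0}} \right) D_{H0} - \left( \frac{a c\, s_{H0} s_{0L}}{2 s_{0L} + a s_{L0} + c s_{H0}} + \frac{b d\, s_{H0} s_{0H}}{2 s_{0H} + b s_{L0} + d s_{H0}} \right) D_{L0} = -1 \qquad (2.8)$$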

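Adding Equation 2.7 and Equation 2.8, we get:

$$\left( 2 s_{L0} + \frac{2 a s_{0L}^2}{2 s_{0L} + a s_{L0} + c s_{H0}} + \frac{2 b s_{0H}^2}{2 s_{0H} + b s_{L0} + d s_{H0}} \right) D_{L0} + \left( 2 s_{H0} + \frac{2 c s_{0L}^2}{2 s_{0L} + a s_{L0} + c s_{H0}} + \frac{2 d s_{0H}^2}{2 s_{0H} + b s_{L0} + d s_{H0}} \right) D_{H0} = 0 \qquad (2.9)$$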

Hence DL0 and DH0 have opposite signs. With Equation 2.7, we know:

DL0 > 0, DH0 < 0

This completes the proof for (a).

For part (b1) of the lemma, with super-modularity, we know:

$$a d > b c$$

Since $D_{L0} > 0$:

$$\frac{a}{c} D_{L0} > \frac{b}{d} D_{L0} \ \Rightarrow\ \frac{a}{c} D_{L0} + D_{H0} > \frac{b}{d} D_{L0} + D_{H0}$$

Hence:

aDL0 + cDH0 < 0 ⇒ bDL0 + dDH0 < 0

bDL0 + dDH0 > 0 ⇒ aDL0 + cDH0 > 0

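Recalling Equation 2.5 and Equation 2.6, we have:

$$\frac{\partial \mu_{0L}}{\partial x} > 0 \Rightarrow \frac{\partial \mu_{0H}}{\partial x} > 0$$

$$\frac{\partial \mu_{0H}}{\partial x} < 0 \Rightarrow \frac{\partial \mu_{0L}}{\partial x} < 0$$

Proof for (b1) is complete.

Now let's prove part (b2):

$$\frac{\partial \left( s_{0L} / s_{0H} \right)}{\partial x} = \frac{D_{0L} s_{0H} - D_{0H} s_{0L}}{s_{0H}^2}$$

Using Equation 2.5 and Equation 2.6,

$$D_{0L} s_{0H} - D_{0H} s_{0L} = -\frac{s_{0L} s_{0H} (a D_{L0} + c D_{H0})}{2 s_{0L} + a s_{L0} + c s_{H0}} + \frac{s_{0L} s_{0H} (b D_{L0} + d D_{H0})}{2 s_{0H} + b s_{L0} + d s_{H0}}$$

$$= \frac{s_{0L} s_{0H} \left( [\,b (2 s_{0L} + c s_{H0}) - a (2 s_{0H} + d s_{H0})\,] D_{L0} + [\,d (2 s_{0L} + a s_{L0}) - c (2 s_{0H} + b s_{L0})\,] D_{H0} \right)}{(2 s_{0L} + a s_{L0} + c s_{H0})(2 s_{0H} + b s_{L0} + d s_{H0})}$$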

It has the same sign as:

[2bs0L − 2as0H + (bc − ad)sH0 ]DL0 + [2ds0L − 2cs0H + (ad − bc)sL0 ]DH0

= 2s0L (bDL0 + dDH0 ) − 2s0H (aDL0 + cDH0 ) − (ad − bc)(DL0 sH0 − DH0 sL0 )

We know that (ad − bc)(DL0 sH0 − DH0 sL0 ) > 0, since ad − bc > 0, DL0 > 0, DH0 < 0.

According to (b1), there are only three cases:

(Case 1): $a D_{L0} + c D_{H0} > 0$, $b D_{L0} + d D_{H0} < 0$; it's straightforward to show:

$$\frac{\partial \left( s_{0L} / s_{0H} \right)}{\partial x} < 0$$

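(Case 2): $a D_{L0} + c D_{H0} > 0$, $b D_{L0} + d D_{H0} > 0$

In this case, from Equation 2.9, we know $s_{L0} D_{L0} + s_{H0} D_{H0} < 0$, hence:

$$\frac{a}{c} > \frac{b}{d} > \frac{s_{L0}}{s_{H0}}$$

Since we know $s_{L0}/s_{H0}$ increases with x, to satisfy the previous inequality, x must be relatively small in this case.

There exist some $\underline{\delta}_x, \bar{\delta}_y$ such that for $x > \underline{\delta}_x$, $y < \bar{\delta}_y$,

$$\frac{\partial \left( s_{0L} / s_{0H} \right)}{\partial x} < 0$$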

(Intuition: we need x to be away from 0 and y to be away from 1 to avoid large value of s0L and small value

of s0H .)

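(Case 3): $a D_{L0} + c D_{H0} < 0$, $b D_{L0} + d D_{H0} < 0$

In this case, from Equation 2.9, we know $s_{L0} D_{L0} + s_{H0} D_{H0} > 0$, hence:

$$\frac{s_{L0}}{s_{H0}} > \frac{a}{c} > \frac{b}{d}$$

x is relatively large in this case. There exist some $\bar{\delta}_x, \underline{\delta}_y$ such that for $x < \bar{\delta}_x$, $y > \underline{\delta}_y$,

$$\frac{\partial \left( s_{0L} / s_{0H} \right)}{\partial x} < 0$$

(Intuition: we need x to be away from 1 and y to be away from 0 to avoid a small value of $s_{0L}$ and a large value

of $s_{0H}$.)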

Proof for part (b2) is complete.
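A numerical check can complement the proof. The sketch below solves the four feasibility equations of the two-by-two case by repeatedly solving each quadratic for its own unknown (an IPFP-style iteration, which is an approach chosen here for illustration and not the method used in the proof), and then compares the solution at two nearby values of x. The function names and the example surplus matrix are hypothetical.

```python
import math

def _positive_root(B: float, n: float) -> float:
    """Positive root s of s^2 + B*s = n."""
    return (-B + math.sqrt(B * B + 4.0 * n)) / 2.0

def solve_singles(x, y, phi, iters=500):
    """Return (sL0, sH0, s0L, s0H), the square roots of the masses of singles.

    phi = ((Phi_LL, Phi_LH), (Phi_HL, Phi_HH)) is the surplus matrix.
    """
    a, b = math.exp(phi[0][0] / 2), math.exp(phi[0][1] / 2)
    c, d = math.exp(phi[1][0] / 2), math.exp(phi[1][1] / 2)
    s0L = s0H = 1.0  # arbitrary positive starting values
    for _ in range(iters):
        # Women's equations, given the current values for single men.
        sL0 = _positive_root(a * s0L + b * s0H, x)
        sH0 = _positive_root(c * s0L + d * s0H, 1.0 - x)
        # Men's equations, given the updated values for single women.
        s0L = _positive_root(a * sL0 + c * sH0, y)
        s0H = _positive_root(b * sL0 + d * sH0, 1.0 - y)
    return sL0, sH0, s0L, s0H

def max_residual(x, y, phi, s):
    """Largest absolute violation of the four feasibility equations."""
    a, b = math.exp(phi[0][0] / 2), math.exp(phi[0][1] / 2)
    c, d = math.exp(phi[1][0] / 2), math.exp(phi[1][1] / 2)
    sL0, sH0, s0L, s0H = s
    return max(
        abs(sL0**2 + sL0 * s0L * a + sL0 * s0H * b - x),
        abs(sH0**2 + sH0 * s0L * c + sH0 * s0H * d - (1 - x)),
        abs(s0L**2 + sL0 * s0L * a + sH0 * s0L * c - y),
        abs(s0H**2 + sL0 * s0H * b + sH0 * s0H * d - (1 - y)),
    )

# Example with a super-modular surplus: Phi_LL + Phi_HH > Phi_LH + Phi_HL.
phi = ((2.0, 0.0), (0.0, 2.0))
base = solve_singles(0.50, 0.5, phi)
more_L = solve_singles(0.51, 0.5, phi)
```

Comparing `base` with `more_L` gives finite-difference counterparts of the derivatives in parts (a) and (b2): the first two entries correspond to single L and H women, and the ratio of the last two entries corresponds to the ratio of single L men to single H men (in square roots).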

Lemma 2.3. An extension of Lemma 2.2:

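Suppose there are two types on one side and K > 2 types on the other side. Denote the marginals

as $n = (x_1, x_2, \ldots, x_K)$, $m = (y, 1 - y)$, where $\sum_k x_k = r$ and r is a constant. The surplus matrix is:

$$\Phi = \begin{pmatrix} \Phi_{11} & \Phi_{12} \\ \vdots & \vdots \\ \Phi_{K1} & \Phi_{K2} \end{pmatrix}$$

Denote the masses of singles in equilibrium as $\mu_{k0}, \mu_{01}, \mu_{02}$. Then:

(a)
$$\frac{\partial \mu_{01}}{\partial y} > 0, \qquad \frac{\partial \mu_{02}}{\partial y} < 0$$

(b) For any two types $k_1, k_2$, if we increase $x_{k_1}$ by decreasing $x_{k_2}$, then $\mu_{k_1 0}$ increases and $\mu_{k_2 0}$ decreases.

(c) For any two types $k_1, k_2$, if $\Phi_{k_1 1} + \Phi_{k_2 2} > \Phi_{k_2 1} + \Phi_{k_1 2}$, then there exist values $\underline{\delta}_{x_1}, \bar{\delta}_{x_1}, \underline{\delta}_{x_2}, \bar{\delta}_{x_2}, \underline{\delta}_y, \bar{\delta}_y$ with

$$x_{k_1} \in (\underline{\delta}_{x_1}, \bar{\delta}_{x_1})$$

$$x_{k_2} \in (\underline{\delta}_{x_2}, \bar{\delta}_{x_2})$$

$$y \in (\underline{\delta}_y, \bar{\delta}_y)$$

such that $\mu_{01}/\mu_{02}$ decreases if we shift some mass from type $k_2$ to type $k_1$, i.e.:

$$\left. \frac{\mu_{01}}{\mu_{02}} \right|_{(n = (\ldots, x_{k_1} + \Delta,\, x_{k_2} - \Delta, \ldots),\, m)} < \left. \frac{\mu_{01}}{\mu_{02}} \right|_{(n = (\ldots, x_{k_1},\, x_{k_2}, \ldots),\, m)}, \qquad \forall \Delta > 0$$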

Proof. The proof is very similar to the proof of Lemma 2.2. WLOG, assume we shift the mass from type

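2 to type 1 and denote $x_1 = x$, $x_2 = \gamma - x$; then $n = (x, \gamma - x, x_3, \ldots, x_K)$ and $m = (y, 1 - y)$. Denote $s_{i0} = \sqrt{\mu_{i0}}$, $s_{0j} = \sqrt{\mu_{0j}}$. First, write down the feasibility conditions:

$$s_{10}^2 + s_{10} s_{01} \phi_1 + s_{10} s_{02} \tilde{\phi}_1 = x$$

$$s_{20}^2 + s_{20} s_{01} \phi_2 + s_{20} s_{02} \tilde{\phi}_2 = \gamma - x$$

$$\vdots$$

$$s_{K0}^2 + s_{K0} s_{01} \phi_K + s_{K0} s_{02} \tilde{\phi}_K = x_K$$

$$s_{01}^2 + s_{10} s_{01} \phi_1 + s_{20} s_{01} \phi_2 + \cdots + s_{K0} s_{01} \phi_K = y$$

$$s_{02}^2 + s_{10} s_{02} \tilde{\phi}_1 + s_{20} s_{02} \tilde{\phi}_2 + \cdots + s_{K0} s_{02} \tilde{\phi}_K = 1 - y$$

To prove part (a), take the derivative with respect to y of all K + 2 equations and denote $D_{i0} = \partial s_{i0}/\partial y$, $D_{0j} = \partial s_{0j}/\partial y$:

$$D_{10} (2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}) + s_{10} (\phi_1 D_{01} + \tilde{\phi}_1 D_{02}) = 0 \qquad (2.10)$$

$$D_{20} (2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}) + s_{20} (\phi_2 D_{01} + \tilde{\phi}_2 D_{02}) = 0 \qquad (2.11)$$

$$\vdots$$

$$D_{K0} (2 s_{K0} + \phi_K s_{01} + \tilde{\phi}_K s_{02}) + s_{K0} (\phi_K D_{01} + \tilde{\phi}_K D_{02}) = 0 \qquad (2.12)$$

$$D_{01} (2 s_{01} + \phi_1 s_{10} + \phi_2 s_{20} + \cdots + \phi_K s_{K0}) + s_{01} (\phi_1 D_{10} + \phi_2 D_{20} + \cdots + \phi_K D_{K0}) = 1 \qquad (2.13)$$

$$D_{02} (2 s_{02} + \tilde{\phi}_1 s_{10} + \tilde{\phi}_2 s_{20} + \cdots + \tilde{\phi}_K s_{K0}) + s_{02} (\tilde{\phi}_1 D_{10} + \tilde{\phi}_2 D_{20} + \cdots + \tilde{\phi}_K D_{K0}) = -1 \qquad (2.14)$$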

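We can rearrange Equation 2.10 - Equation 2.12 to express $D_{k0}$ as a function of $D_{01}, D_{02}$:

$$D_{k0} = -\frac{s_{k0} (\phi_k D_{01} + \tilde{\phi}_k D_{02})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}}, \qquad \forall k = 1, 2, \ldots, K \qquad (2.15)$$

Substituting Equation 2.15 into Equation 2.13 and Equation 2.14:

$$D_{01} \left( 2 s_{01} + \sum_{k=1}^{K} \frac{\phi_k s_{k0} (2 s_{k0} + \tilde{\phi}_k s_{02})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) - D_{02} \sum_{k=1}^{K} \frac{s_{01} \phi_k s_{k0} \tilde{\phi}_k}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} = 1 \qquad (2.16)$$

$$D_{02} \left( 2 s_{02} + \sum_{k=1}^{K} \frac{\tilde{\phi}_k s_{k0} (2 s_{k0} + \phi_k s_{01})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) - D_{01} \sum_{k=1}^{K} \frac{s_{02} \tilde{\phi}_k s_{k0} \phi_k}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} = -1 \qquad (2.17)$$

Adding Equation 2.16 and Equation 2.17,

$$D_{01} \left( 2 s_{01} + \sum_{k=1}^{K} \frac{2 \phi_k s_{k0}^2}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) + D_{02} \left( 2 s_{02} + \sum_{k=1}^{K} \frac{2 \tilde{\phi}_k s_{k0}^2}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) = 0 \qquad (2.18)$$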

Therefore D01 and D02 should have opposite signs. Moreover, with Equation 2.16, we know:

D01 > 0, D02 < 0

Part (a) is proved.

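Now let's prove part (b). With a slight abuse of notation, for the proof of part (b) denote $D_{i0} = \partial s_{i0}/\partial x$, $D_{0j} = \partial s_{0j}/\partial x$. Take the derivative with respect to x of all K + 2 feasibility equations:

$$D_{10} (2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}) + s_{10} (\phi_1 D_{01} + \tilde{\phi}_1 D_{02}) = 1 \qquad (2.19)$$

$$D_{20} (2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}) + s_{20} (\phi_2 D_{01} + \tilde{\phi}_2 D_{02}) = -1 \qquad (2.20)$$

$$D_{30} (2 s_{30} + \phi_3 s_{01} + \tilde{\phi}_3 s_{02}) + s_{30} (\phi_3 D_{01} + \tilde{\phi}_3 D_{02}) = 0 \qquad (2.21)$$

$$\vdots$$

$$D_{K0} (2 s_{K0} + \phi_K s_{01} + \tilde{\phi}_K s_{02}) + s_{K0} (\phi_K D_{01} + \tilde{\phi}_K D_{02}) = 0 \qquad (2.22)$$

$$D_{01} (2 s_{01} + \phi_1 s_{10} + \phi_2 s_{20} + \cdots + \phi_K s_{K0}) + s_{01} (\phi_1 D_{10} + \phi_2 D_{20} + \cdots + \phi_K D_{K0}) = 0 \qquad (2.23)$$

$$D_{02} (2 s_{02} + \tilde{\phi}_1 s_{10} + \tilde{\phi}_2 s_{20} + \cdots + \tilde{\phi}_K s_{K0}) + s_{02} (\tilde{\phi}_1 D_{10} + \tilde{\phi}_2 D_{20} + \cdots + \tilde{\phi}_K D_{K0}) = 0 \qquad (2.24)$$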

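Rearrange Equation 2.21 - Equation 2.22 to express $D_{k0}$ as a function of $D_{01}, D_{02}$ for k > 2:

$$D_{k0} = -\frac{s_{k0} (\phi_k D_{01} + \tilde{\phi}_k D_{02})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}}, \qquad \forall k = 3, \ldots, K \qquad (2.25)$$

Substituting Equation 2.25 into Equation 2.23 and Equation 2.24:

$$D_{01} \left( 2 s_{01} + \phi_1 s_{10} + \phi_2 s_{20} + \sum_{k=3}^{K} \frac{\phi_k s_{k0} (2 s_{k0} + \tilde{\phi}_k s_{02})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) - D_{02} \sum_{k=3}^{K} \frac{s_{01} \phi_k s_{k0} \tilde{\phi}_k}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} + s_{01} (\phi_1 D_{10} + \phi_2 D_{20}) = 0 \qquad (2.26)$$

$$D_{02} \left( 2 s_{02} + \tilde{\phi}_1 s_{10} + \tilde{\phi}_2 s_{20} + \sum_{k=3}^{K} \frac{\tilde{\phi}_k s_{k0} (2 s_{k0} + \phi_k s_{01})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) - D_{01} \sum_{k=3}^{K} \frac{s_{02} \tilde{\phi}_k s_{k0} \phi_k}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} + s_{02} (\tilde{\phi}_1 D_{10} + \tilde{\phi}_2 D_{20}) = 0 \qquad (2.27)$$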

Then (Equation 2.19 + Equation 2.20) - (Equation 2.26 + Equation 2.27) gives us:

$$2 s_{10} D_{10} + 2 s_{20} D_{20} - D_{01} \left( 2 s_{01} + \sum_{k=3}^{K} \frac{2 \phi_k s_{k0}^2}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) - D_{02} \left( 2 s_{02} + \sum_{k=3}^{K} \frac{2 \tilde{\phi}_k s_{k0}^2}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) = 0 \qquad (2.28)$$

Moreover, from Equation 2.26 and Equation 2.27, we can express $D_{01}$ and $D_{02}$ as linear combinations of

$D_{10}$ and $D_{20}$. We can also show that the coefficients are all negative. Combining this with Equation 2.28, $D_{10}$

and $D_{20}$ should have opposite signs. Therefore $D_{10} > 0$, $D_{20} < 0$. Part (b) is proved.

Now let's prove part (c). Following the notation of the proof for part (b): $D_{i0} = \partial s_{i0}/\partial x$, $D_{0j} = \partial s_{0j}/\partial x$.

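Rearrange Equation 2.19 - Equation 2.22 to express $D_{k0}$ as a function of $D_{01}, D_{02}$:

$$D_{10} = \frac{1 - s_{10} (\phi_1 D_{01} + \tilde{\phi}_1 D_{02})}{2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}} \qquad (2.29)$$

$$D_{20} = \frac{-1 - s_{20} (\phi_2 D_{01} + \tilde{\phi}_2 D_{02})}{2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}} \qquad (2.30)$$

$$D_{k0} = -\frac{s_{k0} (\phi_k D_{01} + \tilde{\phi}_k D_{02})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}}, \qquad \forall k = 3, \ldots, K \qquad (2.31)$$

Substituting Equation 2.29 - Equation 2.31 into Equation 2.23 and Equation 2.24:

$$D_{01} \left( 2 s_{01} + \sum_{k=1}^{K} \frac{\phi_k s_{k0} (2 s_{k0} + \tilde{\phi}_k s_{02})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) - D_{02} \sum_{k=1}^{K} \frac{s_{01} \phi_k s_{k0} \tilde{\phi}_k}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} = s_{01} \left( \frac{\phi_2}{2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}} - \frac{\phi_1}{2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}} \right) \qquad (2.32)$$

$$D_{02} \left( 2 s_{02} + \sum_{k=1}^{K} \frac{\tilde{\phi}_k s_{k0} (2 s_{k0} + \phi_k s_{01})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} \right) - D_{01} \sum_{k=1}^{K} \frac{s_{02} \tilde{\phi}_k s_{k0} \phi_k}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}} = s_{02} \left( \frac{\tilde{\phi}_2}{2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}} - \frac{\tilde{\phi}_1}{2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}} \right) \qquad (2.33)$$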

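Denote:

$$A = 2 \frac{s_{01}}{s_{02}} + \sum_{k=1}^{K} \frac{2 \phi_k s_{k0} \, (s_{k0}/s_{02})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}}$$

$$B = \sum_{k=1}^{K} \frac{\tilde{\phi}_k s_{k0} \phi_k}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}}$$

$$C = 2 \frac{s_{02}}{s_{01}} + \sum_{k=1}^{K} \frac{2 \tilde{\phi}_k s_{k0} \, (s_{k0}/s_{01})}{2 s_{k0} + \phi_k s_{01} + \tilde{\phi}_k s_{02}}$$

$$F = s_{01} \left( \frac{\phi_2}{2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}} - \frac{\phi_1}{2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}} \right) \qquad (2.34)$$

$$G = s_{02} \left( \frac{\tilde{\phi}_2}{2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}} - \frac{\tilde{\phi}_1}{2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}} \right) \qquad (2.35)$$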

we know A > 0, B > 0, C > 0, moreover:

D01 s02 (A + B) − D02 s01 B = F (2.36)

D02 s01 (C + B) − D01 s02 B = G (2.37)

Therefore:

D01 s02 − D02 s01 < 0 ⇐⇒ C ∗ F − A ∗ G < 0
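The equivalence follows from solving the two linear equations above. A short derivation, using the shorthand $u = D_{01} s_{02}$ and $v = D_{02} s_{01}$ (introduced here only for this step):

```latex
\begin{align*}
(A + B)\,u - B\,v &= F, \\
-B\,u + (C + B)\,v &= G, \\[4pt]
\Delta &:= (A + B)(C + B) - B^2 = AC + AB + BC > 0, \\
u &= \frac{(C + B)F + BG}{\Delta}, \qquad
v = \frac{(A + B)G + BF}{\Delta}, \\
u - v &= \frac{CF - AG}{\Delta}.
\end{align*}
```

Since $\Delta > 0$ whenever A, B, C > 0, the sign of $D_{01} s_{02} - D_{02} s_{01}$ is the sign of $CF - AG$.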

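One sufficient condition for $CF - AG < 0$ is that $F < 0$, $G > 0$. One sufficient condition for $F < 0$, $G > 0$ when $\tilde{\phi}_2/\tilde{\phi}_1 > \phi_2/\phi_1$ is that:

$$\frac{\phi_2}{\phi_1} < \frac{s_{20}}{s_{10}} < \frac{\tilde{\phi}_2}{\tilde{\phi}_1}$$

since we can rearrange Equation 2.34 and Equation 2.35:

$$F = s_{01} \left( \frac{1}{2 \frac{s_{20}}{\phi_2} + s_{01} + \frac{\tilde{\phi}_2}{\phi_2} s_{02}} - \frac{1}{2 \frac{s_{10}}{\phi_1} + s_{01} + \frac{\tilde{\phi}_1}{\phi_1} s_{02}} \right) \qquad (2.38)$$

$$G = s_{02} \left( \frac{\tilde{\phi}_2}{2 s_{20} + \phi_2 s_{01} + \tilde{\phi}_2 s_{02}} - \frac{\tilde{\phi}_1}{2 s_{10} + \phi_1 s_{01} + \tilde{\phi}_1 s_{02}} \right) \qquad (2.39)$$

Hence there exist $\underline{\delta}_{x_1}, \bar{\delta}_{x_1}, \underline{\delta}_{x_2}, \bar{\delta}_{x_2}$ such that when

$$x \in (\underline{\delta}_{x_1}, \bar{\delta}_{x_1})$$

$$(\gamma - x) \in (\underline{\delta}_{x_2}, \bar{\delta}_{x_2})$$

we have:

$$\frac{\partial \left( s_{01} / s_{02} \right)}{\partial x} < 0$$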

C. Proof for the Propositions
Proof for Proposition 2.1

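Proof. To prove the existence of a stationary equilibrium, we need to show that there is a solution to the

following equilibrium conditions given $G_f$, $G_m$, $\Phi$, where $G_f = (n_L, n_H)$ and $G_m = (m_L, m_H)$:

$$\mu_{0y} + \sqrt{\mu_{L_1 0}\, \mu_{0y}} \exp\left( \frac{\Phi_{L_1 y}}{2} \right) + \sqrt{\mu_{L_2 0}\, \mu_{0y}} \exp\left( \frac{\Phi_{L_2 y}}{2} \right) + \sqrt{\mu_{H_1 0}\, \mu_{0y}} \exp\left( \frac{\Phi_{H_1 y}}{2} \right) + \sqrt{\mu_{H_2 0}\, \mu_{0y}} \exp\left( \frac{\Phi_{H_2 y}}{2} \right) = m_y, \qquad \forall y \in \{L, H\} \qquad (2.40)$$

$$\mu_{e_1 0} + \sqrt{\mu_{e_1 0}\, \mu_{0L}} \exp\left( \frac{\Phi_{e_1 L}}{2} \right) + \sqrt{\mu_{e_1 0}\, \mu_{0H}} \exp\left( \frac{\Phi_{e_1 H}}{2} \right) = q_e^1 \, n_e, \qquad \forall e \in \{L, H\} \qquad (2.41)$$

$$\mu_{e_2 0} + \sqrt{\mu_{e_2 0}\, \mu_{0L}} \exp\left( \frac{\Phi_{e_2 L}}{2} \right) + \sqrt{\mu_{e_2 0}\, \mu_{0H}} \exp\left( \frac{\Phi_{e_2 H}}{2} \right) = q_e^2 \, n_e, \qquad \forall e \in \{L, H\} \qquad (2.42)$$

$$q_e^1 + q_e^2 = 1, \qquad \forall e \in \{L, H\} \qquad (2.43)$$

$$\exp(-u_{e_1}) = \frac{\mu_{e_1 0}}{q_e^1 \, n_e}, \qquad \forall e \in \{L, H\} \qquad (2.44)$$

$$\exp(-u_{e_2}) = \frac{\mu_{e_2 0}}{q_e^2 \, n_e}, \qquad \forall e \in \{L, H\} \qquad (2.45)$$

$$u_{e_1} = u_{e_2}, \qquad \forall e \in \{L, H\} \qquad (2.46)$$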

Equation 2.40-Equation 2.42 characterize the equilibrium conditions of marriage market stability for given

q strategy under the assumption of Gumbel distribution. Equation 2.44-Equation 2.45 characterize the

expected marital utilities of females. Equation 2.43 comes from the property of stationarity. Equation 2.46

guarantees that women are indifferent between choosing to marry at period 1 or period 2.

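Rearranging Equation 2.41 and Equation 2.42, we get:

$$\frac{\mu_{e_1 0}}{q_e^1} + \sqrt{\mu_{0L}} \sqrt{\frac{\mu_{e_1 0}}{q_e^1}} \frac{1}{\sqrt{q_e^1}} \exp\left( \frac{\Phi_{e_1 L}}{2} \right) + \sqrt{\mu_{0H}} \sqrt{\frac{\mu_{e_1 0}}{q_e^1}} \frac{1}{\sqrt{q_e^1}} \exp\left( \frac{\Phi_{e_1 H}}{2} \right) = n_e$$

$$\frac{\mu_{e_2 0}}{q_e^2} + \sqrt{\mu_{0L}} \sqrt{\frac{\mu_{e_2 0}}{q_e^2}} \frac{1}{\sqrt{q_e^2}} \exp\left( \frac{\Phi_{e_2 L}}{2} \right) + \sqrt{\mu_{0H}} \sqrt{\frac{\mu_{e_2 0}}{q_e^2}} \frac{1}{\sqrt{q_e^2}} \exp\left( \frac{\Phi_{e_2 H}}{2} \right) = n_e$$

Combining with Equation 2.44-Equation 2.46, we get:

$$\sqrt{\frac{q_e^1}{q_e^2}} = \frac{\sqrt{\mu_{0L}} \exp\left( \frac{\Phi_{e_1 L}}{2} \right) + \sqrt{\mu_{0H}} \exp\left( \frac{\Phi_{e_1 H}}{2} \right)}{\sqrt{\mu_{0L}} \exp\left( \frac{\Phi_{e_2 L}}{2} \right) + \sqrt{\mu_{0H}} \exp\left( \frac{\Phi_{e_2 H}}{2} \right)}$$

$$= \exp\left( \frac{\Phi_{e_1 L} - \Phi_{e_2 L}}{2} \right) \frac{\sqrt{\mu_{0L}} + \sqrt{\mu_{0H}} \exp\left( \frac{\Phi_{e_1 H} - \Phi_{e_1 L}}{2} \right)}{\sqrt{\mu_{0L}} + \sqrt{\mu_{0H}} \exp\left( \frac{\Phi_{e_2 H} - \Phi_{e_2 L}}{2} \right)} \qquad (2.47)$$

$$= \exp\left( \frac{\Phi_{e_1 L} - \Phi_{e_2 L}}{2} \right) \frac{1 + \sqrt{\frac{\mu_{0H}}{\mu_{0L}}} \exp\left( \frac{\Phi_{e_1 H} - \Phi_{e_1 L}}{2} \right)}{1 + \sqrt{\frac{\mu_{0H}}{\mu_{0L}}} \exp\left( \frac{\Phi_{e_2 H} - \Phi_{e_2 L}}{2} \right)}$$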

There are three cases:

1. Φe1 H − Φe1 L = Φe2 H − Φe2 L

2. Φe1 H − Φe1 L > Φe2 H − Φe2 L

3. Φe1 H − Φe1 L < Φe2 H − Φe2 L

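Case one: In the first case, the ratio in Equation 2.47 equals one, so we have:

$$\sqrt{\frac{q_e^1}{q_e^2}} = \exp\left( \frac{\Phi_{e_1 L} - \Phi_{e_2 L}}{2} \right) \qquad (2.48)$$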

Hence equilibrium strategy q is pinned down by Equation 2.48 and Equation 2.43. Moreover, we know that

given q, Equation 2.40-Equation 2.42 has a unique equilibrium solution according to Decker et al. (2013).

Hence stationary equilibrium exists in this case and is unique.

Case two: In the second case, $q_e^1/q_e^2$ is an increasing function of $\mu_{0H}/\mu_{0L}$ in Equation 2.47. Moreover, according

to Lemma 2.3, we know that when $\Phi_{e_1 H} - \Phi_{e_1 L} > \Phi_{e_2 H} - \Phi_{e_2 L}$, indicating that there is a complementarity

between male High type and female marrying at period 1, an increase in $q_e^1/q_e^2$ would lead to a decrease in $\mu_{0H}/\mu_{0L}$

from Equation 2.40-Equation 2.43.

Moreover, from Equation 2.47, we know that:

$$\sqrt{\frac{q_e^1}{q_e^2}} \to \exp\left( \frac{\Phi_{e_1 L} - \Phi_{e_2 L}}{2} \right), \qquad \text{as } \frac{\mu_{0H}}{\mu_{0L}} \to 0$$

$$\sqrt{\frac{q_e^1}{q_e^2}} \to \exp\left( \frac{\Phi_{e_1 H} - \Phi_{e_2 H}}{2} \right), \qquad \text{as } \frac{\mu_{0H}}{\mu_{0L}} \to +\infty$$

while from Equation 2.40 - Equation 2.43, we know $\mu_{0H}/\mu_{0L}$ is bounded by finite positive numbers when

$$\exp\left( \frac{\Phi_{e_1 L} - \Phi_{e_2 L}}{2} \right) \le \sqrt{\frac{q_e^1}{q_e^2}} \le \exp\left( \frac{\Phi_{e_1 H} - \Phi_{e_2 H}}{2} \right).$$

Hence equilibrium exists and is unique.

Case three: In the third case, $q_e^1/q_e^2$ is a decreasing function of $\mu_{0H}/\mu_{0L}$ in Equation 2.47. Moreover, according

to Lemma 2.3, we know that when $\Phi_{e_1 H} - \Phi_{e_1 L} < \Phi_{e_2 H} - \Phi_{e_2 L}$, indicating that there is a complementarity

between male L type and female marrying at period 1, an increase in $q_e^1/q_e^2$ would lead to an increase in $\mu_{0H}/\mu_{0L}$

from Equation 2.40 - Equation 2.43. Applying the same logic as in case two, equilibrium exists and is unique.

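Moreover, we know that the equilibrium strategy satisfies:

$$\min(\Phi_{L_1 L} - \Phi_{L_2 L},\ \Phi_{L_1 H} - \Phi_{L_2 H}) \le \ln\left( \frac{q_L^1}{q_L^2} \right) \le \max(\Phi_{L_1 L} - \Phi_{L_2 L},\ \Phi_{L_1 H} - \Phi_{L_2 H})$$

$$\min(\Phi_{H_1 L} - \Phi_{H_2 L},\ \Phi_{H_1 H} - \Phi_{H_2 H}) \le \ln\left( \frac{q_H^1}{q_H^2} \right) \le \max(\Phi_{H_1 L} - \Phi_{H_2 L},\ \Phi_{H_1 H} - \Phi_{H_2 H})$$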

Proof for Proposition 2.2

Proof. This is the first case in the previous proof of Proposition 2.1. Hence from Equation 2.47, we know:

$$\frac{q_e^2}{q_e^1} = \exp(\Phi_{e_2 L} - \Phi_{e_1 L})$$

With $q_e^1 + q_e^2 = 1$, we have:

$$q_e^2 = \frac{\exp(\Phi_{e_2 L})}{\exp(\Phi_{e_2 L}) + \exp(\Phi_{e_1 L})}, \qquad q_e^1 = \frac{\exp(\Phi_{e_1 L})}{\exp(\Phi_{e_2 L}) + \exp(\Phi_{e_1 L})}$$
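The closed form in this proof is a two-alternative logit share. A minimal sketch of evaluating it, with a hypothetical function name and made-up surplus values; subtracting the larger surplus before exponentiating is a numerical-stability choice made here and does not change the result.

```python
import math

def period_shares(phi_e1L: float, phi_e2L: float) -> tuple[float, float]:
    """Return (q_e^1, q_e^2): shares marrying in period 1 and period 2."""
    shift = max(phi_e1L, phi_e2L)       # guards against overflow in exp
    w1 = math.exp(phi_e1L - shift)
    w2 = math.exp(phi_e2L - shift)
    return w1 / (w1 + w2), w2 / (w1 + w2)

# Illustrative surplus values only; they are not estimates from the thesis.
q1, q2 = period_shares(1.0, 0.5)
print(q1, q2)
```

A higher surplus from marrying in period 1 raises the share marrying in period 1, and the ratio of the two shares depends only on the difference between the two surplus terms.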

Proof for Proposition 2.3 and Proposition 2.4

Proof. From the proof of Proposition 2.1, we know that the equilibrium strategy is pinned down by both Equation 2.47

and Equation 2.40-Equation 2.43. Hence how equilibrium strategies change depends on whether

$\Phi_{e_1 H} - \Phi_{e_2 H} > \Phi_{e_1 L} - \Phi_{e_2 L}$ or $\Phi_{e_1 H} - \Phi_{e_2 H} < \Phi_{e_1 L} - \Phi_{e_2 L}$, and on how $\mu_{0H}/\mu_{0L}$ changes in equilibrium.

Let's first prove Proposition 2.3. According to Lemma 2.3 result (a), an increase in $m_H$ would increase

$\mu_{0H}$ and decrease $\mu_{0L}$, which increases $\mu_{0H}/\mu_{0L}$ given any strategy $q_y$. Hence an increase in $m_H$ would

• increase $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} > \Phi_{e_1 L} - \Phi_{e_2 L}$

• decrease $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} < \Phi_{e_1 L} - \Phi_{e_2 L}$

Then let's prove Proposition 2.4. According to Lemma 2.3(b), an increase in $n_H$ would decrease $\mu_{0H}/\mu_{0L}$ given

any strategy $q_y$ if the following condition holds:

$$\exp\left( \frac{\Phi_{e_1 H} - \Phi_{e_2 H}}{2} \right) \le \frac{\mu_{e_1 0}}{\mu_{e_2 0}} \le \exp\left( \frac{\Phi_{e_1 L} - \Phi_{e_2 L}}{2} \right)$$

Moreover, we know that:

$$\frac{\mu_{e_1 0}}{\mu_{e_2 0}} = \frac{q_e^1}{q_e^2}$$

and

$$\exp\left( \frac{\Phi_{e_1 H} - \Phi_{e_2 H}}{2} \right) \le \frac{q_e^1}{q_e^2} \le \exp\left( \frac{\Phi_{e_1 L} - \Phi_{e_2 L}}{2} \right)$$

from Equation 2.47. Therefore the condition always holds in the neighborhood of the equilibrium. Hence

an increase in $n_H$ would

• decrease $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} > \Phi_{e_1 L} - \Phi_{e_2 L}$

• increase $q_e^1$, if $\Phi_{e_1 H} - \Phi_{e_2 H} < \Phi_{e_1 L} - \Phi_{e_2 L}$

Chapter 3

Multidimensional Matching: Hukou status in the Marriage Market

3.1 Introduction

Mating patterns affect social equality. For example, the empirical pattern that people marry partners with

similar socioeconomic status (SES), called positive assortative matching (PAM), increases social inequality

(Greenwood et al., 2014). In practice, people consider multiple characteristics when seeking partners. How

people match along all these dimensions affects social structure. In China, one particularly important characteristic

is the hukou status, which classifies people into either the urban or the rural population. Urban hukou enjoys

better amenities including education, health care and working opportunities than rural hukou. Understanding

how people match along hukou status is the first steppingstone to understand how this particular system

affects individuals’ welfare. Moreover, multiple hukou reforms have taken place in the past two decades to

increase internal migration and to decrease social inequality. A matching framework is essential to analyze

how these reforms jointly affect people’s marital choices and labor migration in order to have a more complete

evaluation of their effect on social welfare.

This paper adopts the recently developed two-dimensional matching model under Transferable Utility

(TU) (Chiappori et al., 2017). In the model, each agent differs in two attributes: a continuous one that in-

dicates SES and one discrete one that indicates the hukou status, either agriculture (rural) or non-agriculture

(urban). A key assumption of the TU model is that there exists a marital surplus a couple can jointly produce

and then bargain among themselves how to divide. The marital surplus depends on the two characteristics.

The higher either the husband’s SES or the wife’s SES is, the higher the surplus. However, wives’ and

husbands’ hukou status play asymmetric roles in the surplus. I assume that husbands’ urban hukou status

is more valuable than the wives’, on the rationale that in a patrilocal society, it’s much more likely

for a wife to move to her husband’s location than the other way around.

I first analyze the case where rural women, urban women, rural men and urban men share identical

SES distributions. I then analyze the case where the four groups have SES distributions suggested by the

empirical data, i.e., the urban population has higher SES than the rural population and the rural population is three

times as large as the urban population. The predictions have very similar qualitative properties. First, mixed couples are

less frequent than non-mixed couples. Second, PAM exists in each couple category by hukou status. Third,

for all four population categories (rural women, urban women, rural men and urban men), those married to

urban spouses have higher SES than those married to rural spouses. Fourth, when two urban men with the same

SES marry an urban woman and a rural woman respectively, the rural woman has higher SES than the

urban woman. When two urban women with the same SES marry an urban man and a rural man respectively,

the rural man has higher SES than the urban man. When two rural men with the same SES marry an urban

woman and a rural woman respectively, the rural woman and the urban woman have the same SES.

I then use the China 2000 0.095% sample Census to test the predictions. Education (years of schooling)

is used as a proxy for a person’s SES. I show that first, the frequency of mixed couples is much lower than

that predicted by random matching. Second, among all four marriage categories, own education is always

positively correlated with spousal education. Third, using both simple summary statistics and regression

analysis of explaining a person’s own education using spousal characteristics, a rural spouse is negatively

correlated with own education. Fourth, the correlation between a person’s own education and a rural hukou

conditional on spousal education is less negative than the unconditional correlation between a person’s own

education and a rural hukou status.

The theoretical analysis of matching with TU dates back to Shapley and Shubik (1971) and Becker

(1973). This paper contributes to a set of papers that consider multidimensional matching instead of

one-dimensional matching, which has received much more attention due to its tractability. Chiappori

et al. (2017) consider a bidimensional matching model with SES and smoking status as the discrete

variable. Ahn (2018) considers SES and a person’s nationality as the discrete variable to analyze cross-border

marriages. Low (2017) analyzes a one-to-two dimensional matching model in which fertility is the

other continuous dimension for women.

From a more applied perspective, this paper also contributes to a large literature that studies how the

hukou system affects rural to urban migration and migrant workers’ welfare. Research on its impact on

the marriage market is limited. An exception is Han et al. (2015), who analyze the 1998 policy change under which new-born

babies could freely take their hukou status from either the father or the mother instead of inheriting it from the mother

automatically. They find that this policy change increases inter-provincial marriage and benefits urban men

but hurts urban women.

More broadly, studying the hukou system in China also allows us to link the labor market with the marriage market

and to study how migration decisions are made. Dupuy et al. (2014) build a marriage matching model of two

locations, each with a distinct labor market and marriage market, to analyse people’s decisions of migration

to work and migration to wed. Empirically, there are also many papers that document how the marriage market

influences individuals’ migration choices. Edlund (2005) argues that the attractiveness of high-income men

in urban areas contributes to the stylized fact that young women outnumber young men in urban

areas, even though a high-skilled urban labor market may predict the opposite under the assumption that

there are more skilled men than women. Weiss et al. (2013) show that young women migrate from mainland

China to Hong Kong for better marriage prospects after the lifting of the migration ban, which causes more women

to emigrate from Hong Kong. Using Danish data, Gautier et al. (2010) argue that cities serve as

dense marriage markets, which influences the location choices of singles and couples.

The model is presented in Section 3.2, and Section 3.3 reports the empirical results. Section

3.4 concludes and discusses possible future work.

3.2 Model

The basic framework

Populations and surplus

There are two populations: women’s and men’s, denoted by X and Y. We normalize the size of female

population to 1, and denote the size of male population as r. Both men and women differ in two dimensions.

First, they are characterized by a continuous attribute: their socioeconomic status which is a proxy for

income, education, prestige and so on. Second, agents differ in terms of hukou status; it can be either

Agriculture (A) or Non-agriculture (N). A woman (man) thus is formally characterized by a pair (x, X)

( (y, Y )) where x, y is the individual’s continuous socioeconomic index, and X, Y defines the individual’s

hukou status.

For simplicity, assume that the continuous index x for women with agriculture hukou, denoted by xA, is

uniformly distributed over the interval [0, 1]; the continuous index x for women with non-agriculture hukou,

denoted by xN, is uniformly distributed over the interval [a, 1 + a], where a represents the average difference

between the agriculture and non-agriculture populations. Similarly, for men, yA ∼ U[0, 1], yN ∼ U[b, 1 + b].

Assume a ≥ 0, b ≥ 0. Assume the ratio of the agriculture to the non-agriculture population is α, the same for both men and women.

I then consider the workhorse model of the marriage market: a frictionless matching model with

transferable utility (TU) as in Becker (1973) and Shapley and Shubik (1971). The key assumption in this model

is that for any couple, there exists a marital surplus that the couple can decide how to divide upon marriage.

In this model, the marital surplus depends on both the socioeconomic status and hukou status of each

partner. Moreover, given the rationale that (1) in a patrilocal society, it is most likely that wives move to

the husbands’ place, (2) couples enjoy larger benefit in a urban place compared to a rural place, I assume

that the surplus function Σ has the form:

\[
\Sigma((x, X), (y, Y)) =
\begin{cases}
f(x, y), & \text{if } X = N,\ Y = N \\
\lambda f(x, y), & \text{if } X = A,\ Y = N \\
\mu f(x, y), & \text{if } Y = A
\end{cases}
\]

where the function f is strictly increasing and supermodular, and satisfies f (0, 0) = 0. Here, λ < 1 represents

the cost of changing wife’s hukou status upon marriage, µ < 1 indicates a lower welfare of living in a rural

area compared to urban area. I furthermore assume that λ > µ because of the patrilocality. Notice that the

cost relative to rural hukou is assumed to be multiplicative rather than additive, because it is very likely that

people with higher SES value the additional benefits of urban hukou more, for example, children’s education

and health care. Moreover, I do not allow a fixed cost of migration in the surplus. This is because compared

to the role that hukou status plays in the marital surplus, the migration cost is trivial. What keeps rural

people from migrating to urban areas is the difficulty of living in urban areas without a local hukou rather

than the migration cost (either physical or cultural).
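
As a minimal sketch of the surplus function above, the three cases can be written as a small Python function. The function name, the string codes "A"/"N" and the default choice f(x, y) = xy are illustrative assumptions, not part of the original text; the parameter values in the usage line are also only examples.

```python
from typing import Callable


def surplus(x: float, X: str, y: float, Y: str,
            lam: float, mu: float,
            f: Callable[[float, float], float] = lambda x, y: x * y) -> float:
    """Marital surplus for a woman (x, X) and a man (y, Y).

    X and Y are hukou codes: "A" (Agriculture) or "N" (Non-agriculture).
    The default f(x, y) = x * y is an illustrative assumption.
    """
    if not (mu < lam < 1):
        raise ValueError("the model assumes mu < lam < 1")
    if Y == "A":              # rural husband, whatever the wife's hukou
        return mu * f(x, y)
    if X == "A":              # rural wife, urban husband
        return lam * f(x, y)
    return f(x, y)            # both urban


# Same SES pair under the three hukou configurations.
for X, Y in [("N", "N"), ("A", "N"), ("N", "A")]:
    print(X, Y, surplus(0.5, X, 0.8, Y, lam=0.95, mu=0.6))
```

The asymmetry of the model shows up in the fact that the wife's hukou matters only when the husband is urban.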

A stable matching

A matching is defined as a measure µ on the set ([0, 1] × {A, N }) × ([b, 1 + b] × {A, N }) and four functions

(uA (x), uN (x), vA (y), vN (y)) that capture the utilities of individuals of different types. The only constraint

on the measure µ is that its marginals should be equal to the initial male and female distributions. A matching

is stable if it satisfies individual rationality and the no blocking pair condition. Individual rationality requires

that no matched individual would be better off remaining single. The no blocking pair condition requires

that no two individuals would prefer being matched together to their current situation. Hence stability

would require for any (x, X), (y, Y ) we have that

\[
u_X(x) + v_Y(y) \ge
\begin{cases}
f(x, y), & \text{if } X = N,\ Y = N \\
\lambda f(x, y), & \text{if } X = A,\ Y = N \\
\mu f(x, y), & \text{if } Y = A
\end{cases}
\]

where the equality is satisfied on the support of the matching measure µ, i.e. where we observe a positive

probability of matching for that couple type.

Existence of a stable matching is guaranteed by the property of a TU model. Stability in a TU framework

is equivalent to the maximization of aggregate surplus over all possible assignments; therefore the problem

boils down to the existence of a solution to a simple linear programming problem, for which one can readily

check that the standard conditions are satisfied.
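
The equivalence between stability and aggregate surplus maximization can be illustrated on a small finite market. The sketch below is only an illustration under stated assumptions: it replaces the continuum of agents with three women and three men, uses f(x, y) = xy, and finds the surplus-maximizing assignment by brute force rather than by linear programming. The names `surplus` and `best_assignment` and all numbers are hypothetical.

```python
from itertools import permutations


def surplus(woman, man, lam=0.95, mu=0.6):
    """Surplus with f(x, y) = x * y (an assumed functional form)."""
    (x, X), (y, Y) = woman, man
    if Y == "A":
        return mu * x * y
    if X == "A":
        return lam * x * y
    return x * y


def best_assignment(women, men, lam=0.95, mu=0.6):
    """Brute-force the one-to-one assignment with the largest total surplus."""
    best_total, best_perm = float("-inf"), None
    for perm in permutations(range(len(men))):
        total = sum(surplus(w, men[j], lam, mu) for w, j in zip(women, perm))
        if total > best_total:
            best_total, best_perm = total, perm
    return best_total, [(women[i], men[j]) for i, j in enumerate(best_perm)]


# Hypothetical market in which everyone holds an urban hukou.
women = [(0.2, "N"), (0.5, "N"), (0.9, "N")]
men = [(0.8, "N"), (0.1, "N"), (0.6, "N")]
total, pairs = best_assignment(women, men)
for w, m in pairs:
    print(w, "marries", m)
```

With a single hukou category the maximizer is the positive assortative assignment, which is the one-dimensional benchmark the text refers to. Brute force is only feasible for very small markets; it is used here to keep the example dependency-free.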

As for pureness, as defined in Chiappori et al. (2017), a matching is pure if almost all women with the

same attributes (x, X) are matched with probability one to exactly one type of agent (y, Y ) = ρ(x, X)

and the same applies to men. In the one-dimensional case, where supermodularity yields positive assortative

matching, the stable matching is one-to-one and hence is pure. However, in the current

model, we would need the "twisted" condition (Chiappori et al., 2010): for almost all (x0 , X), the partial

derivative of the surplus Σ((x, X), (y, Y )) with respect to x at two different points ((x0 , X), (y1 , Y1 )) and

((x0 , X), (y2 , Y2 )) are equal to each other if and only if (y1 , Y1 ) = (y2 , Y2 ). This property is unlikely to hold in the

current setting. If a woman with index x0 has an agriculture hukou and marries an urban man with

index y1 , the partial derivative of the surplus with respect to x is \(\lambda\, \partial f(x_0, y_1)/\partial x\). If she is matched with a rural man with

index y2 , the partial derivative is \(\partial f(x_0, y_2)/\partial x\). We may still have:

\[
\lambda \frac{\partial f(x_0, y_1)}{\partial x} = \frac{\partial f(x_0, y_2)}{\partial x}
\]

with y1 > y2 since λ < 1. Therefore, the stable matching may not be pure in the current setting.

Stable matching equilibrium: general properties

The stable matching depends on the following parameters: the sex ratio r; the ratio of the agriculture-hukou to the

non-agriculture-hukou population, α; the average difference in socioeconomic status between people with Agriculture hukou and

Non-agriculture hukou, a and b; and the surplus function parameters, λ, µ, f (x, y). Let me first document

some general properties.

Let me denote pA (x) as the probability of marrying a rural husband for a rural woman with SES x. I

define similarly pN (x′ ), qA (y), qN (y ′ ) as the probability of marrying a rural partner for an urban woman, a

rural man and an urban man.¹

Proposition 3.1. In any stable matching, consider two couples with same hukou status, ((x, X), (y, Y )) and

((x′ , X), (y ′ , Y )). Then x ≥ x′ if and only if y ≥ y ′ .

Proposition 3.1 says that couples are positively matched within each hukou category.

Proof. Suppose not: there exist two couples in the same hukou category with x ≥ x′ and y < y ′ . Then we can exchange partners

between these two couples and achieve a higher social surplus because of the supermodularity of

f (x, y):

f (x, y ′ ) + f (x′ , y) > f (x, y) + f (x′ , y ′ )
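
For the quadratic case f(x, y) = xy used later in the chapter, the gain from exchanging partners can be written out explicitly. The following worked step holds under that functional-form assumption only:

```latex
\[
f(x, y') + f(x', y) - f(x, y) - f(x', y')
  = xy' + x'y - xy - x'y'
  = (x - x')(y' - y) \ge 0 ,
\]
```

with strict inequality whenever x > x′ and y′ > y. Because both couples belong to the same hukou category, the common multiplier (1, λ or µ) scales both sides equally and does not affect the comparison.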

Proposition 3.2. In a stable matching, suppose there exists an open set Ox ⊂ X such that for any x ∈ Ox , a rural

woman x marries a rural man y or an urban man y ′ with positive probability. Then the rural man y

marries a rural woman with probability one. Moreover, y > y ′ .

Similarly, suppose there exists an open set Oy ⊂ Y such that for any y ∈ Oy , a rural man y marries a rural

woman x or an urban woman x′ with positive probability. Then the rural woman x marries a rural man

with probability one. Moreover, x = x′ .

¹ Note that from now on, for simplicity of notation, I use "rural" to indicate people with Agriculture hukou and "urban"

to indicate people with Non-Agriculture hukou.

Proposition 3.2 states that certain types of randomization can’t exist together.

Proof. Suppose a rural woman x marries a rural man y or an urban man y ′ with positive probability. I

first show that y > y ′ .

Denote u(x) as the rural woman’s utility. By stability,

\[
u(x) = \max_{s} \{\mu f(x, s) - v_A(s)\} = \max_{s'} \{\lambda f(x, s') - v_N(s')\}
\]

where vA (s) is the utility of a rural man with SES s, and vN (s′ ) is the utility of an urban man with SES s′ .

The maximization is achieved for s = y, s′ = y ′ respectively. By the envelope theorem,

\[
u'(x) = \mu \frac{\partial f(x, y)}{\partial x} = \lambda \frac{\partial f(x, y')}{\partial x}
\]

Since µ < λ and ∂f (x, y)/∂x is a non-decreasing function of y due to supermodularity, we have y > y ′ .

Next, I argue by contradiction. Suppose that the rural man y also marries an urban woman x′

with positive probability; we know that x′ = x. Then we have a couple ((x, A), (y ′ , N )) and another couple

((x′ , N ), (y, A)) where y > y ′ , x = x′ . However,

\[
\begin{aligned}
\Sigma((x, A), (y, A)) + \Sigma((x', N), (y', N)) &= \mu f(x, y) + f(x', y') \\
&> \mu f(x', y) + \lambda f(x, y') \\
&= \Sigma((x, A), (y', N)) + \Sigma((x', N), (y, A))
\end{aligned}
\]

where the inequality uses x = x′ and λ < 1. This violates the property that a stable matching maximizes the total social surplus. The proof of the

second statement is similar.

Case one: Identical distribution

Now let’s first solve the case where SES and hukou status are independent, i.e., a = b = 0. All four types

(rural women, urban women, rural men and urban men) have the same SES distribution: U (0, 1). Let me

further assume that r = 1, α = 1: there are equal numbers of men and women, and of the two hukou types.

Moreover, assume that λ ≈ 1, µ ≪ λ. The main result is as follows:

Proposition 3.3. There exists a function δ(λ), such that when µ < δ(λ), λ < 1, in a stable matching

outcome, there exist x̃′, ỹ′, x̃ such that:

• Very top urban women and very top urban men marry each other:

∀x′ ≥ x̃′, pN (x′ ) = 0; ∀y ′ ≥ ỹ′, qN (y ′ ) = 0

• Top rural women only marry urban men:

∀x ≥ x̃, pA (x) = 0

Proof is in the appendix. Figure 3.1 shows an example of the matching patterns when f (x, y) = xy. The

four colors indicate the four matching categories with respect to hukou. Blue indicates rural-women-rural-

men, purple indicates urban-women-urban-men, yellow indicates rural-women-urban-men and pink indicates

urban-women-rural-men.

Since λ = 0.99 < 1, the very top urban men (above ỹ₁′) will only marry urban women (above x̃′).

However, λ is large enough to incentivize other top urban men (y ′ ∈ (ỹ₂′, ỹ₁′)) to marry both urban women

(x′ ∈ (x̃₁′, x̃₂′)) and rural women (x ∈ (x̃, 1)). Rural women with x ∈ (0, x̃) marry up to rural men. Because

some urban men marry rural women, some urban women (x′ ∈ (0, x̃′)) have to marry rural men. They either

marry up to a rural man or marry down to an urban man with positive probability.

The cutoffs depend on µ and λ. In particular, in the case where f (x, y) = xy:

• When λ increases, ỹ₁′ increases; x̃, x̃′, ỹ, ỹ₂′ decrease;

• When µ increases, x̃, x̃′, ỹ₂′ increase, ỹ decreases, ỹ₁′ doesn’t change.

In the extreme case where λ = 1, ỹ₁′ = 1. Moreover, in this case, urban women and rural women can not be

differentiated in the marriage market.

Fix µ; there exists a threshold λ̲ such that when λ ≤ λ̲, the matching is pure, i.e., rural people marry rural people

and urban people marry urban people. Fix λ; there exists a threshold µ̄ such that when µ ≥ µ̄, the matching is also

pure. This pattern is illustrated in Figure 3.2. Intuitively, when the cost of changing hukou status is too

high (i.e. λ is too small), we should expect no cross hukou type marriage. When the penalty of Agriculture

hukou is very small relative to the cost of changing hukou status (i.e., λ − µ is very small) , we should also

expect no cross hukou type marriage.

Case two: Different distributions

Empirically, urban populations on average have higher SES than rural populations. Moreover, the rural

population is much larger than the urban population in China. To match the empirical observations, I

assume that α = 3 since in China 2000 Census, 76% population have an agriculture hukou and 24% have a

non-agriculture hukou. Further assume that a = b = 0.5, hence x ∼ U (0, 1), y ∼ U (0, 1), x′ ∼ U (0.5, 1.5), y ′ ∼

U (0.5, 1.5). This corresponds to the empirical statistic that on average, the urban populations have 50% more

years of schooling than the rural populations. The main result is very similar to Proposition 3.3, where SES

and hukou status are independent and the urban and rural populations are of the same size. Figure 3.3

shows an example of the matching patterns when f (x, y) = xy. Because of the difference in average SES,

top urban men and top urban women marry each other, and bottom rural men and bottom rural women

marry each other. The only difference is that for urban women, the lowest type marries both urban and

rural husbands in the first case but the lowest type only marries rural husbands in the second case. This is

because the smallest SES an urban woman may have is b > 0, instead of 0 as in the first case, and it’s also

specific to this specification of surplus function f (x, y) = xy.

A quadratic example

As stated above, the exact form of the stable matching depends on the distributions and surplus function.

To have a better picture of the matching, the utilities and the comparative statics, let me now consider a

simple example in which the gain generated from marriage is quadratic. Specifically, I assume that:

f (x, y) = xy

which is the simplest form that captures PAM.

With this quadratic specification and the previous assumptions (1) r = 1, α ≥ 1, a = b ≥ 0, λ < 1,

µ < 1, it’s possible to completely solve the matching model in closed form.

With the following additional regularity conditions: (2) \(\mu < \lambda^2\), \(b \le \frac{\mu(\lambda - \mu/\lambda)(1+\alpha)}{(1-\mu)(\alpha+\lambda)}\); the main result is as

follows:

Proposition 3.4. In a stable matching outcome, there exist thresholds:

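\[
\tilde{x} = \frac{\alpha + \mu/\lambda}{\alpha + \lambda}
\]

\[
\tilde{x}'_1 = \frac{\alpha + \mu}{1 - \mu} \cdot \frac{\lambda - \mu/\lambda}{\alpha + \lambda},
\qquad
\tilde{x}'_2 = \frac{\alpha + \mu}{\alpha\mu + \mu}\, b
\]

\[
\tilde{y}_1 = \frac{1 + \alpha}{1 - \mu} \cdot \frac{\lambda - \mu/\lambda}{\alpha + \lambda},
\qquad
\tilde{y}_2 = b
\]

\[
\tilde{y}'_1 = \lambda,
\qquad
\tilde{y}'_2 = \frac{\mu}{\lambda}
\]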

such that:

• All agents marry;

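• ∀x ≥ x̃, a rural woman with SES x marries with probability 1 an urban man with SES:

\[
y' = \frac{\lambda - \mu/\lambda}{1 - \tilde{x}}\, x + \frac{\mu/\lambda - \lambda\tilde{x}}{1 - \tilde{x}}
\]

• ∀x < x̃, a rural woman with SES x marries with probability 1 a rural man with SES:

\[
y =
\begin{cases}
x + (1 - \tilde{x}), & \text{if } x \in (\tilde{x} - (1 - \tilde{y}_1),\ \tilde{x}) \\[4pt]
\dfrac{\tilde{y}_1 - b/\mu}{\tilde{x} - (1 - \tilde{y}_1) - \tilde{x}'_2}\, x
 + \dfrac{(b/\mu)(\tilde{x} - 1 + \tilde{y}_1) - \tilde{y}_1 \tilde{x}'_2}{\tilde{x} - (1 - \tilde{y}_1) - \tilde{x}'_2},
 & \text{if } x \in (\tilde{x}'_2,\ \tilde{x} - (1 - \tilde{y}_1)) \\[4pt]
\dfrac{b/\mu - b}{\tilde{x}'_2 - b}\, x + b\, \dfrac{\tilde{x}'_2 - b/\mu}{\tilde{x}'_2 - b},
 & \text{if } x \in (b,\ \tilde{x}'_2] \\[4pt]
x, & \text{if } x \in [0, b]
\end{cases}
\]

• ∀x ≥ x̃₁′, an urban woman with SES x marries with probability 1 an urban man with SES:

\[
y' =
\begin{cases}
x, & \text{if } x \ge \lambda \\[4pt]
\dfrac{\lambda^2 - \mu}{\lambda^2 (1 - \tilde{x})}\, x - \dfrac{\lambda^2 \tilde{x} - \mu}{\lambda (1 - \tilde{x})},
 & \text{if } x \in [\lambda\tilde{x},\ \lambda) \\[4pt]
x + \dfrac{\mu}{\lambda} - \lambda\tilde{x}, & \text{if } x \in [\tilde{x}'_1,\ \lambda\tilde{x})
\end{cases}
\]

• ∀x ∈ (x̃₂′, x̃₁′), an urban woman with SES x marries with probability

\[
p = \frac{\alpha(1 - \tilde{x}) - (\tilde{x}'_2 - b)}{\tilde{x}'_1 - \tilde{x}'_2}
\]

a rural man with SES:

\[
y = \frac{1}{\mu} \left( \frac{\mu\tilde{y}_1 - b}{\tilde{x}'_1 - \tilde{x}'_2}\, x
 + \frac{b\,\tilde{x}'_1 - \mu\tilde{y}_1 \tilde{x}'_2}{\tilde{x}'_1 - \tilde{x}'_2} \right)
\]

and marries with probability (1 − p) an urban man with SES:

\[
y' = \frac{\mu\tilde{y}_1 - b}{\tilde{x}'_1 - \tilde{x}'_2}\, x
 + \frac{b\,\tilde{x}'_1 - \mu\tilde{y}_1 \tilde{x}'_2}{\tilde{x}'_1 - \tilde{x}'_2}
\]

• ∀x ≤ x̃₂′, an urban woman with SES x marries with probability 1 a rural man with SES:

\[
y = \frac{b(1 - \mu)}{\mu(\tilde{x}'_2 - b)}\, x + b - \frac{b(1 - \mu)}{\mu(\tilde{x}'_2 - b)}\, b
\]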

Figure 3.4 shows the detailed matching pattern in this case. It shows how individuals marry among each

hukou category besides the four matching types as in Figure 3.3. The proof of this proposition is in the

appendix.
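
The thresholds of Proposition 3.4 can be evaluated numerically. The sketch below is a hedged illustration: the formulas are transcribed from the proposition as I read it, the function and key names are hypothetical, and the parameter values are those of the numerical example used for Figure 3.5 (α = 3, λ = 0.95, µ = 0.6, b = 0.4).

```python
def thresholds(alpha, lam, mu, b):
    """Cutoffs of the quadratic example; keys ending in 'p' denote primed symbols."""
    gap = (lam - mu / lam) / (alpha + lam)
    return {
        "x": (alpha + mu / lam) / (alpha + lam),     # x-tilde
        "x1p": (alpha + mu) / (1 - mu) * gap,        # x-tilde'_1
        "x2p": (alpha + mu) / (alpha * mu + mu) * b, # x-tilde'_2
        "y1": (1 + alpha) / (1 - mu) * gap,          # y-tilde_1
        "y2": b,                                     # y-tilde_2
        "y1p": lam,                                  # y-tilde'_1
        "y2p": mu / lam,                             # y-tilde'_2
    }


def regularity_holds(alpha, lam, mu, b):
    """Regularity conditions (2): mu < lam^2 and the upper bound on b."""
    bound = mu * (lam - mu / lam) * (1 + alpha) / ((1 - mu) * (alpha + lam))
    return mu < lam ** 2 and b <= bound


def urban_husband_of_rural_wife(x, alpha, lam, mu):
    """SES of the urban husband of a rural woman with SES x >= x-tilde."""
    x_t = (alpha + mu / lam) / (alpha + lam)
    slope = (lam - mu / lam) / (1 - x_t)
    intercept = (mu / lam - lam * x_t) / (1 - x_t)
    return slope * x + intercept


params = dict(alpha=3, lam=0.95, mu=0.6, b=0.4)
t = thresholds(**params)
print(regularity_holds(**params))
for name, value in t.items():
    print(name, round(value, 4))
```

A useful consistency check is that the rural woman at the cutoff x̃ is matched to the urban man at µ/λ and the top rural woman (x = 1) to the urban man at λ, so this matching function maps [x̃, 1] onto the interval between the two urban-men cutoffs.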

Individual utilities

Moreover, we can solve individual utilities from this system. Denote uA (x) as the utility of a rural woman

with SES x; similarly, define uN (x′ ), vA (y), vN (y ′ ) as the utility of an urban woman, a rural man and an

urban man. Consider a rural woman with x < x̃; her husband is a rural man with SES y = ϕA (x). The

stability condition implies that:

\[
u_A(x) = \max_{y}\, \bigl(\mu x y - v_A(y)\bigr)
\]

with the maximum achieved at y = ϕA (x). From the envelope theorem,

\[
u_A'(x) = \mu\, \phi_A(x)
\]

Then,

\[
u_A(x) = \int_0^x \mu\, \phi_A(s)\, ds + K, \quad \forall x \in [0, \tilde{x}]
\]

Since f (0, 0) = 0 in this case, we know K = 0. For x ≥ x̃, her husband is an urban man with SES y ′ = φA (x).

The stability condition implies that:

\[
u_A(x) = \max_{y'}\, \bigl(\lambda x y' - v_N(y')\bigr)
\]

with the maximum achieved at y ′ = φA (x). From the envelope theorem,

\[
u_A'(x) = \lambda\, \varphi_A(x)
\]

Then,

\[
u_A(x) = \int_{\tilde{x}}^{x} \lambda\, \varphi_A(s)\, ds + u_A(\tilde{x})
       = \int_{\tilde{x}}^{x} \lambda\, \varphi_A(s)\, ds + \int_0^{\tilde{x}} \mu\, \phi_A(s)\, ds,
       \quad \forall x \in [\tilde{x}, 1]
\]

The other utilities uN (x′ ), vA (y), vN (y ′ ) can be obtained in a similar way given the matching function.
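
The integral formula for utilities is easy to evaluate numerically once a matching function is given. The sketch below is an illustration only: it covers the bottom segment x ∈ [0, b], where Proposition 3.4 gives the matching function y = x for rural women, and it uses a simple trapezoid rule. The helper names and the values µ = 0.6, b = 0.4 are assumptions taken from the numerical example.

```python
def integrate(g, lo, hi, n=1000):
    """Composite trapezoid rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    inner = sum(g(lo + i * h) for i in range(1, n))
    return h * (0.5 * g(lo) + inner + 0.5 * g(hi))


def rural_woman_utility(x, mu, phi):
    """u_A(x) = integral from 0 to x of mu * phi(s) ds, with K = 0."""
    return integrate(lambda s: mu * phi(s), 0.0, x)


mu, b = 0.6, 0.4
# On [0, b] a rural woman with SES s marries a rural man with SES s.
u_at_b = rural_woman_utility(b, mu, phi=lambda s: s)
print(u_at_b)
```

On this segment the integral has the closed form µx²/2, so the numerical value at x = b can be compared with µb²/2. For higher SES the same routine applies with the corresponding piece of the matching function and the multiplier λ above x̃.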

A numerical example is given in Figure 3.5 when f (x, y) = xy, α = 3, λ = 0.95, µ = 0.6, b = 0.4. There

are several interesting patterns:

• Utilities increase with respect to SES for everyone.

• For urban women, those with (low) SES that marry a rural husband with positive probability obtain

the same utility as rural women with same SES. Because urban women and rural women have the

same contribution when they marry rural men.

• For urban women, those with (high) SES that only marry urban husbands have higher utility than the

rural women with same SES. Because urban women contribute more than rural women to the surplus

when they marry urban men.

• Urban men enjoy the largest utility compared to rural men, urban women and rural women with same

SES. This is because of the positive value a urban hukou gains for urban men.

• For rural men, those with (very low) SES that only marry rural wives have the same utility as their

rural wives. All other rural men with (high) SES have lower utility than the rural women with same

SES. This is because very top rural women marry urban men, which creates a scarcity of rural women

that helps them enjoy higher utilities.

Comparative statics

Figure 3.6 shows an example of the utility change for the four populations when we

change the parameters. The comparative statics in this case can be summarized as follows:

• A larger λ, by reducing the cost of transferring the wife’s hukou from Agriculture to Non-agriculture,

decreases the thresholds x̃, ỹ₂′ and increases x̃₁′, ỹ₁, ỹ₁′. It benefits urban men and rural women, but hurts

urban women and top rural men by reducing their advantage. In Figure 3.6, the impact on the urban

population is larger than that on the rural population.

• A larger µ, by reducing the cost of living in a rural area, increases the thresholds x̃, ỹ₂′ and decreases x̃₁′, ỹ₁. It

benefits rural men, rural women and urban women, but hurts urban men by reducing their advantage.

• A smaller α, by decreasing the share of the rural population, decreases the thresholds x̃, x̃₁′ and increases ỹ₁.

It hurts urban men by decreasing their relative scarcity and benefits rural women by increasing their

relative scarcity. Urban women also gain and rural men lose.

• A smaller b, by reducing the difference in average SES between the urban and rural populations,

hurts urban men but benefits rural men, urban women and rural women.

Model predictions

The frictionless model illustrated above is only an idealized benchmark. Because some characteristics are

unobserved by econometricians, observed matching patterns are mostly stochastic. However, the previous

equilibrium solution can still shed light on the general properties of the empirical matching patterns, which

I summarize as follows:

Prediction 3.1. Cross-type couples, in which wives and husbands have different hukou status, should be

less frequent. More precisely, both the share of A-N couples (rural wives and urban husbands) and the share of N-A

couples (urban wives and rural husbands) should be much smaller than implied by random matching, i.e., than

\(\frac{1}{1+\alpha} \cdot \frac{\alpha}{1+\alpha}\). In the data, α ≈ 3, hence the threshold is about 0.1875.
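
The random-matching benchmark in Prediction 3.1 can be reproduced in a few lines. This is a minimal sketch, assuming that α is the ratio of the rural to the urban population for both sexes and that spouses are drawn independently of hukou; the function name and the couple labels are illustrative.

```python
def random_matching_shares(alpha):
    """Shares of the four couple types (wife-husband) under random matching."""
    rural = alpha / (1 + alpha)   # share of agriculture hukou
    urban = 1 / (1 + alpha)       # share of non-agriculture hukou
    return {
        "A-A": rural * rural,
        "A-N": rural * urban,
        "N-A": urban * rural,
        "N-N": urban * urban,
    }


shares = random_matching_shares(3)
for couple_type, share in shares.items():
    print(couple_type, share)
```

The A-N and N-A entries give the benchmark against which the observed frequency of each mixed couple type is compared.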

Prediction 3.2. Matching within each category:

Among couples in the same hukou category, matching is positively assortative on SES.

Prediction 3.3. Selection into the four matching categories:

• Rural women with urban husbands have higher SES than rural women with rural husbands.

• Urban women with urban husbands have higher SES than urban women with rural husbands.

• Rural men with urban wives have higher SES than rural men with rural wives.

• Urban men with urban wives have higher SES than urban men with rural wives.

Prediction 3.4. Rural and urban partners for the same agent:

• When two urban women with the same (low) SES marry respectively a rural husband and a urban

husband, the rural husband should on average have higher SES than the urban husband.

• When two rural men with the same SES marry respectively a rural wife and a urban wife, the rural

wife and the urban wife should on average have the same SES.

• When two urban men with the same (low) SES marry respectively a rural wife and a urban wife, the

rural wife should on average have higher SES.

3.3 Empirical Application

Hukou system in China

The hukou system has been the family registration system in China since the early 1950s. A person’s hukou status

contains two parts, (1) residential location (hukou suozaidi) that indicates the place of hukou registration and

(2) hukou type (hukou leibie) that categorizes people into either Agriculture or Non-agriculture population.

Hukou status is determined upon birth. A child’s hukou was automatically inherited from the mother before

1998 but it can be inherited from either the mother or the father after 1998.

Between 1960 and 1984, internal migration in China was very limited. Changing registration place

required a permit to move and a migration certificate. Changing hukou type from Agriculture to Non-

agriculture (nongzhuanfei) was even more difficult and the conversion quotas were limited. Conversion

quotas for personal reasons including family reunion were exceedingly small, fixed at about 0.15 to 0.2

percent of the non-agriculture hukou population of each locale. As a result, it was very common to observe separation

of spouses (Chan and Zhang, 1999).

This strict migration policy could not be maintained during the economic reforms in China. After 1984,

it became possible for people to work outside their hukou registration places. Urban areas have attracted

more and more rural migrant workers. However, without local hukou, migrants encounter various difficulties

including housing, health care and children’s education. To alleviate these problems, local governments

implemented various hukou reforms since 1997 to make it easier for migrants to change their hukou status.

Data

Estimates of the matching patterns are based on China 2000 1% sample census, a household survey conducted

by China National Bureau of Statistics in 2000. Hukou status, education, marriage status are available in this

dataset. The 2000 population census is used instead of the 1990 population census or the 1995 mini census

for the following two reasons. (1) Before 1990, migration was still very restricted; hence the benefit of a

match between a rural woman and an urban man was very small, so that λ should be small. (2) Between 2000 and

2005, switching hukou status from Agriculture to Non-agriculture became much easier. Because historical hukou

status isn’t recorded in the census, with later data I might misclassify rural people who have switched hukou as urban

population. However, before early 2000, converting a rural hukou to an urban one was still highly restricted.

Therefore, the 2000 population census is the best choice. Furthermore, I restrict my sample to women

between 18 and 33, men between 20 and 35.

Results

In this section, I first show the basic summary statistics and then present the empirical results on the

matching pattern in the order of the four predictions.

Summary statistics

In the sample, 51% are male, 73% have agriculture hukou. 73.43% of male population have agriculture

hukou and 73.18% of female population have agriculture hukou. These basic statistics confirm the parameter

assumptions that r = 1, α = 3. The average years of schooling are 12.2, 12.1, 8.3 and 7.8 for urban men,

urban women, rural men and rural women respectively. 32% are never married.

Table 3.1 presents the summary statistics of the married and singles in the sample. Women and men

have similar years of schooling. Rural people are more likely to be married at the survey time, for both men

and women. Not surprisingly, never married people are younger than married couples.

Frequencies of mixed couples

Table 3.2 provides support for Prediction 3.1. There are far more rural-rural and urban-urban couples than

random matching would imply. Mixed couples with different hukou are much less frequent than predicted

by random matching.

Positive assortative matching within each category

Table 3.3 provides support for Prediction 3.2. Among all four panels of the table that represent the four

types of couples, own education is positively correlated with spousal education after controlling own age and

prefecture fixed effect. Figure 3.7 shows the heat map of couples’ years of schooling for the four types.

Selection into mixed matching

Table 3.4 provides support for Prediction 3.3. Comparing rural wives in Panel A and Panel B, we can see

that rural women with urban husbands on average have 1.7 more years of schooling than rural women with

rural husbands. Similarly, by comparing Panel C and Panel D, Panel A and Panel C, Panel B and Panel D,

we can find that urban women with urban husbands on average have 1.79 more years of schooling than urban

women with rural husbands, rural men with urban wives on average have 1.28 more years of schooling than

rural men with rural wives, and urban men with urban wives on average have 1.6 more years of schooling

than urban men with rural wives.

Table 3.5 provides further support for this selection pattern. In this table, I divide married people into

four types by sex and hukou status. Individuals’ education is explained by spousal characteristics including

hukou and SES. All regressions control for own age and prefecture fixed effects. Columns (1) and (3) in

Panels (A) and (B) replicate the finding in Table 3.4. Columns (2) and (4) present the results with spousal

education as additional explanatory variable. For individuals married to all four types, conditional on SES,

those with a rural hukou tend to have a spouse with lower SES.

Moreover, in Column (4) of Panel B in Table 3.5, the small and insignificant coefficient of rural spouse,

-0.099, is also supportive of our assumption that λ is very close to 1. For those married to urban husbands,

rural women and urban women are married to husbands with similar SES, conditional on the wives’ SES.

This shows that wife’s rural hukou and wife’s urban hukou contribute similarly to the marital surplus.

Conditional and unconditional correlation between SES and hukou status.

In Table 3.6, I divide married people into four types by sex and spousal hukou status.

In the theoretical prediction, for two urban men who marry a rural woman and a urban woman respec-

tively, the rural woman has on average a higher SES than the urban woman. For two rural men who marry

a rural woman and a urban woman respectively, the rural woman has on average a similar SES to the urban

woman. For two urban women who marry a rural man and a urban man respectively, the rural man has

on average a higher SES than the urban man. This would predict that a rural hukou and SES should be

positively correlated for the spouses of urban men and urban women, and not correlated for the spouses

of rural men. However, in practice, people match on multiple characteristics including many unobserved

ones. Hence, to turn the prediction into a testable hypothesis, I restate it as follows:

the correlation between a rural hukou and SES conditional on spousal SES should be less negative

than the unconditional one, for the spouses of urban men, urban women and rural men. This is

supported by Columns (1) and (2) in Panels (A), (B), (D) in Table 3.6.
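
The comparison between the unconditional and the conditional coefficient can be sketched with a small regression example. Everything below is hypothetical: the six couples are made-up data constructed so that own education equals 2 + spousal education − 0.1 × (rural spouse), and the `ols` helper is a plain normal-equations solver written for this illustration. The regressions in Table 3.6 additionally control for own age and prefecture fixed effects, which this sketch omits.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    n = len(y)
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented system (X'X | X'y).
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                factor = a[r][c] / a[c][c]
                a[r] = [v - factor * w for v, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical couples: spouse's rural indicator and spouse's years of schooling.
rural_spouse = [1, 1, 1, 0, 0, 0]
spouse_edu = [6, 8, 9, 10, 12, 14]
own_edu = [2 + s - 0.1 * r for r, s in zip(rural_spouse, spouse_edu)]

unconditional = ols([rural_spouse], own_edu)[1]
conditional = ols([rural_spouse, spouse_edu], own_edu)[1]
print(unconditional, conditional)
```

In this constructed example the coefficient on the rural-spouse indicator is strongly negative without controls, because rural spouses have less schooling, and close to −0.1 once spousal education is held fixed. This is the pattern the restated hypothesis looks for: a conditional coefficient that is less negative than the unconditional one.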

3.4 Conclusion and Discussion

In this chapter, I build a bidimensional matching model to understand the marriage matching

patterns along SES and hukou status in China. I test the model’s predictions using the China 2000 0.095%

sample Census. The model can be a building block to evaluate the effect of the hukou policy reforms. In

future work, I plan to use the prefecture-level Hukou reform data between 1997 and 2010 collected in Fan

(2019), to test the comparative statics of the model. More specifically, since hukou reform increased λ, we can

test how the share of mixed couples and the SES of individuals in mixed couples change. Another

potential direction is to explore the details of the hukou reforms. Reforms in different prefectures at different

times may focus on different public goods; for example, some emphasized housing policies, while others

emphasized children’s education. Empirical analysis combined with the model may be used

to estimate how people value different public goods.

Moreover, rural-to-urban migration is not limited to China. Almost all societies have experienced ur-

banization in history and/or are still in the process of urbanization. A cross-country comparison analysis of

how the matching pattern among rural and urban population changes over time will be interesting future

work.

3.5 Figures

Figure 3.1: Predicted matching pattern with symmetric population

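[Figure: four vertical bars, from left to right rural women, urban women, rural men and urban men. Marked cutoffs: x̃ and x̃′ on the women's bars; ỹ, ỹ₁′ and ỹ₂′ on the men's bars.]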

This figure illustrates an example of the equilibrium matching pattern with the asymmetric surplus function and symmetric
population. Each bar represents population of rural women, urban women, rural men and urban men. The height indicates
SES, the higher, the larger SES the agent has. The width indicates the mass of one particular SES type, the wider, the
more populated this particular type is. Area of same color indicates matching in equilibrium. Agents match positively
assortatively among each color category.

Figure 3.2: Matching patterns under different parameter values

[Figure: parameter space in λ and µ, divided into regions labeled "N.A.", "Pure", and "Exist Mixed Couples".]

This figure illustrates the type of the equilibrium matching pattern with different values of λ and µ in the surplus function
and symmetric population. Cases where λ < µ are not applicable in the current setting. When µ is relatively large, there
is no mixed couple and the matching is pure. In the lower area, mixed couples are observed and the matching pattern is
similar to that depicted in Figure 3.1.

Figure 3.3: Predicted matching pattern with asymmetric population

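[Figure: four bars showing rural women and urban women (left) and rural men and urban men (right), with the cutoffs $\tilde{x}$, $\tilde{x}'_1$, $\tilde{x}'_2$, $\tilde{y}_1$, $\tilde{y}_2$, $\tilde{y}'_1$ and $\tilde{y}'_2$ marked.]
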
This figure illustrates an example of the equilibrium matching pattern with the asymmetric surplus function and asymmetric
population. Each bar represents population of rural women, urban women, rural men and urban men. The height indicates
SES, the higher, the larger SES the agent has. The width indicates the mass of one particular SES type, the wider, the
more populated this particular type is. Area of same color indicates matching in equilibrium. Agents match positively
assortatively among each color category.

Figure 3.4: Predicted matching pattern with asymmetric population (Detailed)

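[Figure: four bars showing rural women and urban women (left) and rural men and urban men (right), with the cutoffs $\tilde{x}$, $\tilde{x}_1$, $\tilde{x}_2$, $\tilde{x}_3$, $\tilde{x}'_1$, $\tilde{x}'_2$, $\tilde{x}'_3$, $\tilde{x}'_4$, $\tilde{y}_1$, $\tilde{y}_2$, $\tilde{y}_3$, $\tilde{y}'_1$, $\tilde{y}'_2$ and $\tilde{y}'_3$ marked.]
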
This figure is a detailed version of Figure 3.3. It shows in detail how rural-rural and urban-urban couples match.
Different shades of cyan and different shades of blue are used to indicate the different subsets of the rural population and the urban
population that marry spouses with the same hukou type.

Figure 3.5: Utility in the quadratic example
[Figure: utility (vertical axis) plotted against SES (horizontal axis), with one line each for rural women, urban women, rural men, and urban men.]

This figure illustrates the individuals’ utilities in the quadratic example when f (x, y) = xy. The parameter values
are: α = 3, λ = 0.95, µ = 0.6, b = 0.4. The four lines represent how utilities depend on SES for the four populations:
rural women, urban women, rural men and urban men.

Figure 3.6: Comparative statics

[Figure: four panels titled "Increase λ", "Increase µ", "Decrease α", and "Decrease b", each plotting utility change (vertical axis) against SES (horizontal axis) for rural women, urban women, rural men, and urban men.]


This figure illustrates the comparative statics of individuals’ utilities with respect to the theoretical parameters in the quadratic example.
Baseline case is: α = 3, λ = 0.95, µ = 0.6, b = 0.4. Each panel changes one parameter at one time.
The four lines indicate the utility change at different SES levels for the four populations: rural women, urban women, rural men and urban men.
In the upper-left panel, λ is increased from 0.95 to 0.99.
In the upper-right panel, µ is increased from 0.6 to 0.7.
In the bottom-left panel, α is decreased from 3 to 2.
In the bottom-right panel, b is decreased from 0.4 to 0.3.
Figure 3.7: Heatmap for husbands’ and wives’ education

[Figure: four heatmap panels, with rows for urban husbands and rural husbands and columns for rural wives and urban wives; both axes run from 0 to 15 years of schooling.]

The four panels of this figure show a heatmap of the wives’ and husbands’ years of schooling for the four marriage types.
Red indicates higher frequencies while blue indicates lower frequencies. The dashed diagonal line has slope 1.
Source: China 2000 Census, men aged 22-35 and women aged 20-33.

3.6 Tables

Table 3.1: Summary Statistics

A. Married                  Husbands      Wives
Age (years)                 29.52         28.10
                            (3.04)        (3.05)
Education (years)           9.15          8.43
                            (2.69)        (2.99)
Agriculture Hukou (=1)      0.76          0.77
                            (0.43)        (0.42)
Observations                73,940        73,940

B. Never Married            Men           Women
Age (years)                 24.35         21.38
                            (3.49)        (2.81)
Education (years)           9.81          10.25
                            (3.27)        (3.14)
Agriculture Hukou (=1)      0.68          0.66
                            (0.47)        (0.47)
Observations                50,505        47,088

Note: The sample consists of men aged 22-35 and women aged 20-33 in the China 2000
Census. Married couples in Panel A are limited to those in which both husband
and wife can be observed in the data.

Table 3.2: Matching patterns by hukou status.

A. Observed matching        Rural husband     Urban husband
Rural wife                  74.03%            3.29%
                            (54,737)          (2,431)
Urban wife                  1.83%             20.86%
                            (1,350)           (15,422)

B. Random Matching          Rural husband     Urban husband
Rural wife                  58.52%            18.48%
Urban wife                  17.48%            5.53%

Note: Panel A shows the empirical frequencies of the four marriage types by
hukou status. Panel B shows the counterfactual matching frequencies when
people marry across hukou type randomly.
Source: China 2000 Census, men aged 22-35 and women aged 20-33.
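
The random-matching benchmark in Panel B can be illustrated with a short calculation. The sketch below is only an assumption about how such a benchmark is typically built: it multiplies the wives' and husbands' marginal hukou shares implied by the Panel A counts. The function name `random_matching_shares` is hypothetical. The resulting shares come out close to, but not identical to, Panel B (about 58.6% rather than 58.52% for rural-rural couples); one possible reason is that Panel B was computed from rounded marginal shares, since 0.77 × 0.76 = 0.5852.

```python
# Observed couple counts from Table 3.2, Panel A, keyed by
# (wife's hukou, husband's hukou).
observed = {
    ("rural", "rural"): 54_737,
    ("rural", "urban"): 2_431,
    ("urban", "rural"): 1_350,
    ("urban", "urban"): 15_422,
}


def random_matching_shares(counts):
    """Cell shares under independent matching, holding the marginals fixed.

    This is an illustrative independence benchmark, not necessarily the
    exact procedure used to build Panel B.
    """
    total = sum(counts.values())
    wife_share = {}
    husband_share = {}
    for (wife, husband), n in counts.items():
        wife_share[wife] = wife_share.get(wife, 0.0) + n / total
        husband_share[husband] = husband_share.get(husband, 0.0) + n / total
    # Under independence, each cell share is the product of its marginals.
    return {
        (w, h): wife_share[w] * husband_share[h]
        for w in wife_share
        for h in husband_share
    }


shares = random_matching_shares(observed)
for (wife, husband), share in shares.items():
    print(f"{wife} wife, {husband} husband: {100 * share:.2f}%")
```

Comparing these shares with Panel A makes the point of the table concrete: mixed couples are far rarer in the data than independence across hukou types would imply.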

Table 3.3: Positive assortative matching in each matching category

                     A. Rural wife-Rural husband     B. Rural wife-Urban husband
                     Wife’s        Husband’s         Wife’s        Husband’s
                     education     education         education     education
Spousal education    0.44***       0.33***           0.29***       0.41***
                     (0.0056)      (0.0047)          (0.020)       (0.027)
N                    54,737        54,737            2,431         2,431
R2                   0.325         0.260             0.182         0.164

                     C. Urban wife-Rural husband     D. Urban wife-Urban husband
                     Wife’s        Husband’s         Wife’s        Husband’s
                     education     education         education     education
Spousal education    0.50***       0.36***           0.62***       0.62***
                     (0.036)       (0.028)           (0.0065)      (0.0064)
N                    1,350         1,350             15,422        15,422
R2                   0.195         0.227             0.422         0.414
Note: This table shows the correlation between husbands’ and wives’ education (years of schooling) among the four marriage
categories by hukou status.
Source: China 2000 Census, men aged 22-35 and women aged 20-33.

Table 3.4: Selection into four matching categories

                     A. Rural wife-Rural husband     B. Rural wife-Urban husband
                     Wives         Husbands          Wives         Husbands
Age (years)          28.07         29.41             27.37         29.30
                     (3.12)        (3.08)            (3.14)        (3.05)
Education (years)    7.44          8.23              9.14          10.57
                     (2.33)        (1.93)            (2.15)        (2.53)

                     C. Urban wife-Rural husband     D. Urban wife-Urban husband
                     Wives         Husbands          Wives         Husbands
Age (years)          27.64         28.84             28.38         29.99
                     (3.02)        (3.13)            (2.75)        (2.80)
Education (years)    9.92          9.51              11.71         12.17
                     (2.50)        (2.16)            (2.77)        (2.76)

Note: This table presents the average characteristics of men and women in each marriage category by hukou status.
Source: China 2000 Census, men aged 22-35 and women aged 20-33.

Table 3.5: Explain individuals’ education using spousal characteristics, by hukou status.

Panel A: Rural Population
                         Rural wives’              Rural husbands’
Dependent variable:      own education             own education
                         (1)         (2)           (3)         (4)
Rural spouse             -1.48***    -0.53***      -1.11***    -0.40***
                         (0.045)     (0.043)       (0.060)     (0.054)
Spousal education                    0.43***                   0.33***
                                     (0.0053)                  (0.0046)
N                        57,168      57,168        56,087      56,087
R2                       0.221       0.332         0.143       0.265

Panel B: Urban Population
                         Urban wives’              Urban husbands’
Dependent variable:      own education             own education
                         (1)         (2)           (3)         (4)
Rural spouse             -1.83***    -0.17**       -1.71***    -0.099*
                         (0.072)     (0.068)       (0.056)     (0.055)
Spousal education                    0.61***                   0.61***
                                     (0.0064)                  (0.0063)
N                        16,772      16,772        17,853      17,853
R2                       0.084       0.423         0.079       0.404

Note: This table shows regression results of using spousal characteristics to explain own education, separately for individuals
of different genders with different hukou status.
Notes: Significance levels: * 10%, ** 5%, *** 1%.
Source: China 2000 Census, men aged 22-35 and women aged 20-33.

Table 3.6: Unconditional and conditional correlation between hukou and SES, by spousal type

Dependent variable: Own education

                         A. Women with rural husbands    B. Women with urban husbands
                         (1)         (2)                 (3)         (4)
Rural                    -2.20***    -1.70***            -2.65***    -1.64***
                         (0.070)     (0.064)             (0.050)     (0.049)
Husband’s education                  0.44***                         0.58***
                                     (0.0055)                        (0.0062)
N                        56,087      56,087              17,853      17,853
R2                       0.224       0.337               0.146       0.446

                         C. Men with rural wives         D. Men with urban wives
                         (1)         (2)                 (3)         (4)
Rural                    -2.23***    -1.72***            -2.70***    -1.55***
                         (0.053)     (0.050)             (0.064)     (0.062)
Wife’s education                     0.33***                         0.61***
                                     (0.0046)                        (0.0063)
N                        57,168      57,168              16,772      16,772
R2                       0.173       0.290               0.105       0.437
Note: This table presents the correlation between education and hukou status for individuals by gender and spousal hukou
status, both unconditionally (Column 1) and conditionally on spousal education (Column 2).
Notes: Significance levels: * 10%, ** 5%, *** 1%.
Source: China 2000 Census, men aged 22-35 and women aged 20-33.

3.7 Appendix

Proof for Proposition 3.3

Proof. Let’s prove the first part of this proposition: Suppose x̄′ =

supx′ {Ox′ : those urban women who marry a rural man with positive probability}; y¯′ =

supy′ {Oy′ : those urban men who marry a rural woman with positive probability}. We want to show

that x̄′ < 1, and y¯′ < 1. Let’s prove by contradiction. There are three cases.

• Suppose x̄′ = 1 and y¯′ = 1, let’s denote the two couples as (x′ = 1, yA ) and (xA , y ′ = 1). Their total

surplus is:

Σ = µf (1, yA ) + λf (xA , 1)

while exchanging the partners will give us:

Σ1 = f (1, 1) + µf (xA , yA )

Σ1 − Σ = f (1, 1) − λf (xA , 1) + µ(f (xA , yA ) − f (1, yA ))

> λ(f (1, 1) − f (xA , 1)) − µ(f (1, yA ) − f (xA , yA ))

> λ((f (1, 1) − f (xA , 1)) − (f (1, yA ) − f (xA , yA ))) > 0

the last inequality is due to the supermodularity assumption. This violates the surplus maximization

property of a stable matching.

• Suppose x̄′ = 1 and y¯′ < 1, let’s denote the two couples as (x′ = 1, yA ) and (xN , y ′ = 1). The total

surplus is:

Σ = µf (1, yA ) + f (xN , 1)

while exchanging the partners will give us:

Σ1 = f (1, 1) + µf (xN , yA )

Σ1 − Σ = f (1, 1) − f (xN , 1) + µ(f (xN , yA ) − f (1, yA ))

= (f (1, 1) − f (xN , 1)) − µ(f (1, yA ) − f (xN , yA ))

> µ((f (1, 1) − f (xN , 1)) − (f (1, yA ) − f (xN , yA ))) > 0

the last inequality is due to the supermodularity assumption. This violates the surplus maximization

property of a stable matching.

• Suppose x̄′ < 1 and y¯′ = 1, let’s denote the two couples as (x′ = 1, yN ) and (xA , y ′ = 1). The total

surplus is:

Σ = f (1, yN ) + λf (xA , 1)

while exchanging the partners will give us:

Σ1 = f (1, 1) + λf (xA , yN )

Σ1 − Σ = f (1, 1) − f (1, yN ) + λ(f (xA , yN ) − f (xA , 1))

= (f (1, 1) − f (1, yN )) − λ(f (xA , 1) − f (xA , yN ))

> λ((f (1, 1) − f (1, yN )) − (f (xA , 1) − f (xA , yN ))) > 0

the last inequality is due to the supermodularity assumption. This violates the surplus maximization

property of a stable matching.
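
The three swap arguments above can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the proof: it takes f(x, y) = xy (the surplus used in the quadratic example), sets λ = 0.95 and µ = 0.6 as in Figure 3.5, and evaluates Σ1 − Σ for each case on a grid of partner types strictly below 1. The function name `swap_gains` and the grid are hypothetical choices.

```python
import itertools


def f(x, y):
    # Illustrative supermodular surplus; assumed here, as in the
    # quadratic example f(x, y) = xy.
    return x * y


def swap_gains(x_low, y_low, lam, mu):
    """Return Sigma_1 - Sigma for the three cases of the contradiction.

    x_low and y_low are the SES of the partners other than the two
    top types (x' = 1 and y' = 1).
    """
    # Case 1: (x'=1, y_A) earns mu*f and (x_A, y'=1) earns lam*f.
    case1 = (f(1, 1) + mu * f(x_low, y_low)) - (mu * f(1, y_low) + lam * f(x_low, 1))
    # Case 2: (x'=1, y_A) earns mu*f and (x_N, y'=1) earns f.
    case2 = (f(1, 1) + mu * f(x_low, y_low)) - (mu * f(1, y_low) + f(x_low, 1))
    # Case 3: (x'=1, y_N) earns f and (x_A, y'=1) earns lam*f.
    case3 = (f(1, 1) + lam * f(x_low, y_low)) - (f(1, y_low) + lam * f(x_low, 1))
    return case1, case2, case3


lam, mu = 0.95, 0.6
grid = [i / 20 for i in range(20)]  # 0.00, 0.05, ..., 0.95 (all below 1)
min_gain = min(
    min(swap_gains(x, y, lam, mu))
    for x, y in itertools.product(grid, grid)
)
print("smallest gain from re-pairing on the grid:", min_gain)
```

A strictly positive minimum on the grid is consistent with the claim that, in each case, re-pairing the two top urban types with each other raises total surplus.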

Therefore, x̄′ < 1 and y¯′ < 1. According to Proposition 3.1, we also know that all urban women with

SES larger than x̄′ and all urban men with SES larger than y¯′ marry each other positively assortatively.

Moreover, in our case here, x̄′ = y¯′ . Define h(λ) as x̄′ = y¯′ = h(λ).

Now let’s move on to prove the second part. Denote x̄ =

supx {Ox : those rural women who marry a rural man with positive probability}. We want to show that

x̄ < 1. Let’s prove by contradiction. Suppose x̄ = 1, hence pA (x = 1) > 0, there are two cases:

• 0 < pA (x = 1) < 1, for the partners of rural women with x = 1, denote the rural husband’s SES as

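yA , and the urban husband’s SES as yN . Moreover, we know that yN = y¯′ = h(λ). And

$$\mu \frac{\partial f(x, y)}{\partial x}(1, y_A) = \lambda \frac{\partial f(x, y)}{\partial x}(1, y_N) = \lambda \frac{\partial f(x, y)}{\partial x}(1, h(\lambda)) \tag{3.1}$$

If $\mu < \lambda \frac{\partial f(x, y)}{\partial x}(1, h(\lambda)) \Big/ \frac{\partial f(x, y)}{\partial x}(1, 1)$, there doesn’t exist yA ∈ [0, 1] satisfying Equation 3.1.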

• pA (x = 1) = 1, denote the SES of the rural man she marries as yA . Then yA = 1. Otherwise, the rural man

with yA = 1 must marry an urban woman with xN < 1, and couples (x = 1, yA ) and (xN , yA = 1) have smaller

total surplus than couples (x = 1, yA = 1) and (xN , yA ).

At the same time, the urban man with y¯′ = h(λ) marries some rural woman with xA < 1. Couples

(x = 1, yA = 1) and (xA , y ′ = y¯′ = h(λ)) have joint surplus:

Σ = µf (1, 1) + λf (xA , h(λ))

while exchanging the partners will give us:

Σ1 = µf (xA , 1) + λf (1, h(λ))

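$$\Sigma_1 - \Sigma = \lambda \big( f(1, h(\lambda)) - f(x_A, h(\lambda)) \big) - \mu \big( f(1, 1) - f(x_A, 1) \big)$$

$$= \lambda \int_{x_A}^{1} \frac{\partial f(x, y)}{\partial x}(t, h(\lambda)) \, dt - \mu \int_{x_A}^{1} \frac{\partial f(x, y)}{\partial x}(t, 1) \, dt \tag{3.2}$$

If $\mu < \min_{t \in [0,1]} \lambda \frac{\partial f(x, y)}{\partial x}(t, h(\lambda)) \Big/ \frac{\partial f(x, y)}{\partial x}(t, 1)$, the RHS of Equation 3.2 is positive.

Therefore, we show that if $\mu < \min_{t \in [0,1]} \lambda \frac{\partial f(x, y)}{\partial x}(t, h(\lambda)) \Big/ \frac{\partial f(x, y)}{\partial x}(t, 1)$, then x̄ < 1.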

Proof for Proposition 3.4

Proof. Let’s break the proofs into two parts. In the first part, I’ll show the equality conditions to pin down

the cutoffs. In the second part, I’ll prove that this matching is a stable matching.

To pin down the cutoffs, besides the original seven cutoffs in the proposition, $\tilde{x}, \tilde{x}'_1, \tilde{x}'_2, \tilde{y}_1, \tilde{y}_2, \tilde{y}'_1, \tilde{y}'_2$, let

me add seven auxiliary cutoff parameters as shown in Figure 3.4: $\tilde{x}_1, \tilde{x}_2, \tilde{x}_3, \tilde{x}'_3, \tilde{x}'_4, \tilde{y}_3, \tilde{y}'_3$. Therefore we have

14 unknown parameters to be determined in the equilibrium. There are two types of equality conditions.

The first set requires that, for a person who marries both rural and urban spouses with positive probability, his/her

marginal contribution be the same in the two types of marriages. The second set is the feasibility

constraint that, for any type of marriage, the total mass of women should be equal to the total mass of men.

Here are eight equality conditions in the first set:

• Rural woman $(\tilde{x}, A)$ marries both rural man $(1, A)$ and urban man $(\tilde{y}'_2, N)$ with positive probability:

$$\mu \cdot 1 = \lambda \tilde{y}'_2 \tag{3.3}$$

• Urban woman $(\tilde{x}'_1, N)$ marries both rural man $(\tilde{y}_1, A)$ and urban man $(\tilde{y}'_3, N)$ with positive probability:

$$\mu \tilde{y}_1 = \tilde{y}'_3 \tag{3.4}$$

• Urban woman $(\tilde{x}'_2, N)$ marries both rural man $(\tilde{y}_3, A)$ and urban man $(b, N)$ with positive probability:

$$\mu \tilde{y}_3 = b \tag{3.5}$$

• Rural man $(\tilde{y}_1, A)$ marries both rural woman $(\tilde{x}_1, A)$ and urban woman $(\tilde{x}'_1, N)$ with positive probability:

$$\mu \tilde{x}_1 = \mu \tilde{x}'_1 \tag{3.6}$$

• Rural man $(\tilde{y}_2, A)$ marries both rural woman $(\tilde{x}_3, A)$ and urban woman $(b, N)$ with positive probability:

$$\mu \tilde{x}_3 = \mu b \tag{3.7}$$

• Rural man $(\tilde{y}_3, A)$ marries both rural woman $(\tilde{x}_2, A)$ and urban woman $(\tilde{x}'_2, N)$ with positive probability:

$$\mu \tilde{x}_2 = \mu \tilde{x}'_2 \tag{3.8}$$

• Urban man $(\tilde{y}'_1, N)$ marries both rural woman $(1, A)$ and urban woman $(\tilde{x}'_3, N)$ with positive probability:

$$\lambda \cdot 1 = \tilde{x}'_3 \tag{3.9}$$

• Urban man $(\tilde{y}'_2, N)$ marries both rural woman $(\tilde{x}, A)$ and urban woman $(\tilde{x}'_4, N)$ with positive probability:

$$\lambda \tilde{x} = \tilde{x}'_4 \tag{3.10}$$

Here are the additional equality conditions in the second set:

• Rural women with $x \in (\tilde{x}_1, \tilde{x})$ should have the same mass as rural men with $y \in (\tilde{y}_1, 1)$:

$$\tilde{x} - \tilde{x}_1 = 1 - \tilde{y}_1 \tag{3.11}$$

• Rural women with $x \in (0, \tilde{x}_3)$ should have the same mass as rural men with $y \in (0, \tilde{y}_2)$:

$$\tilde{x}_3 - 0 = \tilde{y}_2 - 0 \tag{3.12}$$

• Urban women with $x' \in (\tilde{x}'_3, 1 + b)$ should have the same mass as urban men with $y' \in (\tilde{y}'_1, 1 + b)$:

$$1 + b - \tilde{x}'_3 = 1 + b - \tilde{y}'_1 \tag{3.13}$$

• The mass of urban women with $x' \in (\tilde{x}'_4, 1 + b)$ should be equal to the mass of urban men with $y' \in (\tilde{y}'_2, 1 + b)$ minus the total mass of rural women with $x \in (\tilde{x}, 1)$:

$$(1 + b - \tilde{x}'_4) = (1 + b - \tilde{y}'_2) - \alpha (1 - \tilde{x}) \tag{3.14}$$

• The mass of urban women with $x' \in (b, \tilde{x}'_1)$ should be equal to the mass of urban men with $y' \in (b, \tilde{y}'_3)$ plus the mass of rural women with $x \in (\tilde{x}, 1)$:

$$\tilde{x}'_1 - b = \tilde{y}'_3 - b + \alpha (1 - \tilde{x}) \tag{3.15}$$

• The mass of urban women with $x' \in (b, \tilde{x}'_2)$ should be equal to the mass of rural men with $y \in (0, \tilde{y}_3)$ minus the mass of rural women with $x \in (0, \tilde{x}_2)$:

$$(\tilde{x}'_2 - b) = \alpha (\tilde{y}_3 - \tilde{x}_2) \tag{3.16}$$

There are 14 unknowns and 14 conditions.

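From Equation 3.3: $\tilde{y}'_2 = \frac{\mu}{\lambda}$.

From Equation 3.4: $\tilde{y}'_3 = \mu \tilde{y}_1$.

From Equation 3.5: $\tilde{y}_3 = \frac{b}{\mu}$.

From Equation 3.6 and Equation 3.11: $\tilde{x}'_1 = \tilde{x}_1 = \tilde{x} - (1 - \tilde{y}_1)$.

From Equation 3.7: $\tilde{x}_3 = b$.

From Equation 3.8 and Equation 3.16: $\tilde{x}'_2 = \tilde{x}_2 = \frac{b + \alpha \tilde{y}_3}{1 + \alpha} = \frac{\alpha + \mu}{\alpha \mu + \mu} b$.

From Equation 3.9 and Equation 3.13: $\tilde{x}'_3 = \tilde{y}'_1 = \lambda$.

From Equation 3.10: $\tilde{x}'_4 = \lambda \tilde{x}$.

From Equation 3.12: $\tilde{y}_2 = \tilde{x}_3 = b$.

From Equation 3.14: $\tilde{x} = \frac{\alpha + \mu / \lambda}{\alpha + \lambda}$.

From Equation 3.15: $\tilde{y}_1 = \frac{1 + \alpha}{1 - \mu} (1 - \tilde{x}) = \frac{1 + \alpha}{1 - \mu} \cdot \frac{\lambda - \mu / \lambda}{\alpha + \lambda}$.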
λ

With the cut-offs pinned down and the positive assortativeness property as shown in Proposition 3.1, the

matching function can be easily written down. The matching function is linear due to the nice property of

the uniform distributions.
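
The closed-form cutoffs can be evaluated and checked against the feasibility conditions. The following sketch is illustrative only: the function name `cutoffs` and the dictionary keys (with `p` standing for a prime, so `xp1` is the cutoff x̃′₁) are hypothetical, and the parameter values are those of the quadratic example in Figure 3.5 (α = 3, λ = 0.95, µ = 0.6, b = 0.4).

```python
def cutoffs(alpha, lam, mu, b):
    """Closed-form cutoffs derived from Equations 3.3-3.16.

    Key names are illustrative: "x" is x-tilde, "x1" is x-tilde_1,
    "xp1" is x-tilde-prime_1, and similarly for the y cutoffs.
    """
    x = (alpha + mu / lam) / (alpha + lam)       # from (3.14)
    y1 = (1 + alpha) / (1 - mu) * (1 - x)        # from (3.15)
    y3 = b / mu                                  # from (3.5)
    x2 = (b + alpha * y3) / (1 + alpha)          # from (3.8) and (3.16)
    x1 = x - (1 - y1)                            # from (3.6) and (3.11)
    return {
        "x": x, "x1": x1, "x2": x2, "x3": b,
        "xp1": x1, "xp2": x2, "xp3": lam, "xp4": lam * x,
        "y1": y1, "y2": b, "y3": y3,
        "yp1": lam, "yp2": mu / lam, "yp3": mu * y1,
    }


alpha, lam, mu, b = 3.0, 0.95, 0.6, 0.4
c = cutoffs(alpha, lam, mu, b)

# Residuals of the three feasibility conditions that involve alpha;
# each should be zero up to floating-point error.
residuals = {
    "3.14": (1 + b - c["xp4"]) - ((1 + b - c["yp2"]) - alpha * (1 - c["x"])),
    "3.15": (c["xp1"] - b) - (c["yp3"] - b + alpha * (1 - c["x"])),
    "3.16": (c["xp2"] - b) - alpha * (c["y3"] - c["x2"]),
}

for name, value in c.items():
    print(f"{name}: {value:.4f}")
print(residuals)
```

Changing one parameter at a time in the call to `cutoffs` gives a quick way to see how the cutoffs move, in the spirit of the comparative statics in Figure 3.6.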

Now let’s proceed to the second part: proving this matching is indeed stable. We have to show that

for all women and men, the sum of their current utilities is not less than the surplus they can produce

together. Define ∆((x, X), (y, Y )) = u(x, X) + v(y, Y ) − Σ((x, X), (y, Y )), we have to show that ∆ ≥ 0.

Let’s start with the rural women with $x \in [\tilde{x}, 1]$ who currently marry urban men with $y' \in [\tilde{y}'_2 = \frac{\mu}{\lambda}, \tilde{y}'_1]$:

• Urban men with $y' \in (\tilde{y}'_1, 1 + b]$ who currently marry urban women with $x' \in (\tilde{x}'_3 = \lambda, 1 + b]$:

$$\Delta(x, y') = u_A(x) + v_N(y') - \lambda x y'$$

$$\frac{\partial \Delta}{\partial x} = u'_A(x) - \lambda y' = \lambda y^*(x) - \lambda y' < 0$$

$$\frac{\partial \Delta}{\partial y'} = v'_N(y') - \lambda x = x^*(y') - \lambda x > 0$$

Hence ∆ achieves its infimum when $x = 1$ and $y' = \tilde{y}'_1$, and we know that:

$$\Delta(1, \tilde{y}'_1) = 0$$

Therefore stability holds for this case.

• Urban men with $y' \in [\tilde{y}'_2, \tilde{y}'_1]$, stability holds trivially since the matching is positively assortative.

• Urban men with $y' \in [b, \tilde{y}'_2)$ who currently marry urban women with $x' \in [\tilde{x}'_2, \tilde{x}'_4 = \lambda \tilde{x})$:

$$\Delta(x, y') = u_A(x) + v_N(y') - \lambda x y'$$

$$\frac{\partial \Delta}{\partial x} = u'_A(x) - \lambda y' = \lambda y^*(x) - \lambda y' > 0$$

$$\frac{\partial \Delta}{\partial y'} = v'_N(y') - \lambda x = x^*(y') - \lambda x < 0$$

Hence ∆ achieves its infimum when $x = \tilde{x}$ and $y' = \tilde{y}'_2$, and we know that:

$$\Delta(\tilde{x}, \tilde{y}'_2) = 0$$

Therefore stability holds for this case.

• All rural men with $y \in [0, 1]$:

$$\Delta(x, y) = u_A(x) + v_A(y) - \mu x y$$

$$\frac{\partial \Delta}{\partial x} = u'_A(x) - \mu y = \lambda y^*(x) - \mu y \geq \lambda \cdot \frac{\mu}{\lambda} - \mu y \geq 0$$

$$\frac{\partial \Delta}{\partial y} = v'_A(y) - \mu x = \mu x^*(y) - \mu x \leq 0$$

Hence ∆ achieves its infimum when $x = \tilde{x}$ and $y = 1$, and we know that:

$$\Delta(\tilde{x}, 1) = 0$$

Therefore stability holds for this case.

In all, we prove that no rural woman with $x \in [\tilde{x}, 1]$ can form a blocking pair with any man. For other

rural women and all urban women, we can apply the same logic to prove the stability.

References

Abramitzky, R., A. Delavande, and L. Vasconcelos (2011). Marrying up: the role of sex ratio in assortative
matching. American Economic Journal: Applied Economics 3(3), 124–57.

Ahn, S. Y. (2018). Matching across markets: Theory and evidence on cross-border marriage.
Akresh, R., D. Halim, and M. Kleemans (2018). Long-term and intergenerational effects of education:
Evidence from school construction in Indonesia. Technical report, National Bureau of Economic Research.
André, P. and Y. Dupraz (2018). Education and polygamy: Evidence from Cameroon. Technical report.

Angrist, J. (2002). How do sex ratios affect marriage and labor markets? Evidence from America’s second
generation. The Quarterly Journal of Economics 117 (3), 997–1038.

Arunachalam, R. and S. Naidu (2006). The price of fertility: marriage markets and family planning in
Bangladesh. University of California, Berkeley.
Ashraf, N., N. Bau, N. Nunn, and A. Voena (2016). Bride price and female education. Technical report,
National Bureau of Economic Research.
Barro, R. and X. Sala-i-Martin (1995). Economic Growth. New York: McGraw-Hill.

Becker, G. S. (1973). A theory of marriage: Part i. The Journal of Political Economy, 813–846.
Behrman, J. R. and N. Birdsall (1983). The quality of schooling: quantity alone is misleading. The American
Economic Review 73(5), 928–946.
Bharati, T., S. Chin, and D. Jung (2018). Recovery from an early life shock through improved access to
schools: Evidence from Indonesia.
Bhaskar, V. (2015). The demographic transition and the position of women: A marriage market perspective.

Bobonis, G. J. and F. Finan (2009). Neighborhood peer effects in secondary school enrollment decisions.
The Review of Economics and Statistics 91(4), 695–716.

Breierova, L. and E. Duflo (2004). The impact of education on fertility and child mortality: Do fathers really
matter less than mothers? Technical report, National bureau of economic research.

Carmichael, S. (2011). Marriage and power: Age at first marriage and spousal age gap in lesser developed
countries. The History of the Family 16(4), 416–436.

Castro, J. F. and B. Esposito (2018). The effect of bonuses on teacher behavior: A story with spillovers.
Technical report, Discussion Paper 104, Peruvian Economic Association, August.

Chan, K. W. and L. Zhang (1999). The hukou system and rural-urban migration in China: Processes and
changes. The China Quarterly 160, 818–855.
Chari, A., R. Heath, A. Maertens, and F. Fatima (2017). The causal effect of maternal age at marriage on
child wellbeing: Evidence from India. Journal of Development Economics 127, 42–55.
Charles, K. K. and M. C. Luoh (2010). Male incarceration, the marriage market, and female outcomes. The
Review of Economics and Statistics 92(3), 614–627.

Chiappori, P.-A., R. J. McCann, and L. P. Nesheim (2010). Hedonic price equilibria, stable matching, and
optimal transport: equivalence, topology, and uniqueness. Economic Theory 42(2), 317–354.
Chiappori, P.-A., S. Oreffice, and C. Quintana-Domeque (2017). Bidimensional matching with heteroge-
neous preferences: education and smoking in the marriage market. Journal of the European Economic
Association 16(1), 161–198.
Chiappori, P.-A., B. Salanié, and Y. Weiss (2017). Partner choice, investment in children, and the marital
college premium. American Economic Review 107 (8), 2109–67.
Choo, E. and A. Siow (2006). Who marries whom and why. Journal of political Economy 114(1), 175–201.
Decker, C., E. H. Lieb, R. J. McCann, and B. K. Stephens (2013). Unique equilibria and substitution effects
in a stochastic model of the marriage market. Journal of Economic Theory 148(2), 778–792.
Dessy, S. and H. Djebbari (2010). High-powered careers and marriage: can women have it all? The BE
Journal of Economic Analysis & Policy 10(1).
Dominguez, C. (2014). Aggregate effects on the marriage market of a big increase in educational attainment.
Dissertation, Yale University.
Duflo, E. (2001). Schooling and labor market consequences of school construction in Indonesia: Evidence
from an unusual policy experiment. American Economic Review 91(4), 795–813.
Dupuy, A., A. Galichon, and L. Zhao (2014). Migration in China: to work or to wed? Technical report,
working paper.
Edlund, L. (2005). Sex and the city. The Scandinavian Journal of Economics 107 (1), 25–44.
Edlund, L. (2006). Marriage: past, present, future? CESifo Economic Studies 52(4), 621–639.
Fan, J. (2019). Internal geography, labor mobility, and the distributional impacts of trade. American
Economic Journal: Macroeconomics. Forthcoming.
Fergusson, D. M. and L. J. Woodward (1999). Maternal age and educational and psychosocial outcomes in
early adulthood. The Journal of Child Psychology and Psychiatry and Allied Disciplines 40(3), 479–489.
Frederick, W. H. and R. L. Worden (1993). Indonesia: A country study, Volume 550. Washington, DC:
Federal Research Division, Library of Congress.
Galichon, A. and B. Salanié (2015). Cupid’s invisible hand: Social surplus and identification in matching
models.
Galichon, A. and B. Salanié (2017). The econometrics and some properties of separable matching models.
American Economic Review 107 (5), 251–55.
Gautier, P. A., M. Svarer, and C. N. Teulings (2010). Marriage and the city: Search frictions and sorting of
singles. Journal of Urban Economics 67 (2), 206–218.
Glewwe, P., E. A. Hanushek, S. Humpage, and R. Ravina (2013). School resources and educational outcomes
in developing countries: A review of the literature from 1990 to 2010. Education Policy in Developing
Countries 4(14,972), 13.
Glewwe, P. and M. Kremer (2006). Schools, teachers, and education outcomes in developing countries.
Handbook of the Economics of Education 2, 945–1017.
Glewwe, P. and K. Muralidharan (2016). Improving education outcomes in developing countries: Evidence,
knowledge gaps, and policy implications. In Handbook of the Economics of Education, Volume 5, pp.
653–743. Elsevier.
Greenwood, J., N. Guner, G. Kocharkov, and C. Santos (2014). Marry your like: Assortative mating and
income inequality. American Economic Review 104(5), 348–53.

Hady, H. (1989). Chapter IV: Regional development review of policies and achievements.
Indonesia, two decades of economic development, 142–168.
Han, L., T. Li, and Y. Zhao (2015). How status inheritance rules affect marital sorting: Theory and evidence
from urban China. The Economic Journal 125(589), 1850–1887.
Hanushek, E. A. (2011). The economic value of higher teacher quality. Economics of Education review 30(3),
466–479.
Hener, T. and T. Wilson (2018). Marital age gaps and educational homogamy-evidence from a compulsory
schooling reform in the UK. Technical report, Ifo Working Paper.
Iyigun, M. and J. Lafortune (2016). Why wait? a century of education, marriage timing and gender roles.

Jalal, F., M. Samani, M. C. Chang, R. Stevenson, A. B. Ragatz, and S. D. Negara (2009). Teacher cer-
tification in Indonesia: A strategy for teacher quality improvement. Departemen Pendidikan Nasional,
Republik Indonesia.
Jensen, R. and R. Thornton (2003). Early female marriage in the developing world. Gender & Develop-
ment 11(2), 9–19.
Jones, G. W. (1994). Marriage and divorce in Islamic South-East Asia.
Jones, G. W. and P. Hagul (2001). Schooling in Indonesia: crisis-related and longer-term issues. Bulletin of
Indonesian Economic Studies 37 (2), 207–231.
Jürges, H., S. Reinhold, and M. Salm (2011). Does schooling affect health behavior? evidence from the
educational expansion in Western Germany. Economics of Education Review 30(5), 862–872.
Kirbas, A., H. C. Gulerman, and K. Daglar (2016). Pregnancy in adolescence: is it an obstetrical risk?
Journal of pediatric and adolescent gynecology 29(4), 367–371.
Low, C. (2017). A “reproductive capital” model of marriage market matching. Manuscript, Wharton School
of Business.
Malamud, O., A. Mitrut, and C. Pop-Eleches (2018). The effect of education on mortality and health: Evi-
dence from a schooling expansion in Romania. Technical report, National Bureau of Economic Research.
Malhotra, A. (1991). Gender and changing generational relations: Spouse choice in Indonesia. Demogra-
phy 28(4), 549–570.
Mankiw, N. G., D. Romer, and D. N. Weil (1992). A contribution to the empirics of economic growth. The
quarterly journal of economics 107 (2), 407–437.
Martinez-Bravo, M. (2017). The local political economy effects of school construction in indonesia. American
Economic Journal: Applied Economics 9(2), 256–89.
McEwan, P. J. (2015). Improving learning in primary schools of developing countries: A meta-analysis of
randomized experiments. Review of Educational Research 85(3), 353–394.
Miguel, E. and M. Kremer (2004). Worms: identifying impacts on education and health in the presence of
treatment externalities. Econometrica 72(1), 159–217.
Ozier, O. (2018). The impact of secondary schooling in Kenya: A regression discontinuity analysis. Journal
of human resources 53(1), 157–188.
Rivkin, S. G., E. A. Hanushek, and J. F. Kain (2005). Teachers, schools, and academic achievement.
Econometrica 73(2), 417–458.

Rosales-Rueda, M., B. Mazumder, and M. Triyana (2019, May). Intergenerational human capital spillovers:
Indonesia’s school construction and its effects on the next generation. AEA Papers and Proceedings 109.

Sekhri, S. and S. Debnath (2014). Intergenerational consequences of early age marriages of girls: Effect on
children’s human capital. The Journal of Development Studies 50(12), 1670–1686.
Shapley, L. S. and M. Shubik (1971). The assignment game i: The core. International Journal of game
theory 1(1), 111–130.
Siow, A. (1998). Differential fecundity, markets, and gender roles. Journal of Political Economy 106(2),
334–354.
Snodgrass, D. (1984, October). Development Program Implementation Studies No.5: Inpres Sekolah Dasar.
Cambridge, MA.: Harvard Institute for International Development.
Snodgrass, D., L. Hutagalung, and S. Dasar (1980). Inpres Sekolah Dasar: An Analytical Study. Economics
and Human Resources Research Center, Faculty of Economics,Padjadjaran University.
Tan, J. P. and A. Mingat (1992). Education in Asia: a comparative study of cost and financing (English).
World Bank regional and sectoral studies. Washington, DC: The World Bank.
Weiss, Y., J. Yi, and J. Zhang (2013). Hypergamy, cross-boundary marriages, and family behavior.
World Bank (1989). Indonesia - Basic education study (English). Washington, DC: World Bank.

World Bank (2016, January). Indonesia Teacher Certification and Beyond. World Bank, Jakarta.
World Bank (2018). World Development Report, LEARNING to Realize Education’s Promise.

Young, A. (1994). Lessons from the East Asian NICs: a contrarian view. European economic review 38(3-4),
964–973.

Young, A. (1995). The tyranny of numbers: confronting the statistical realities of the East Asian growth
experience. The Quarterly Journal of Economics 110(3), 641–680.

Zhang, H. (2018, October). Human Capital Investments, Differential Fecundity, and the Marriage Market.
Working Papers 2018-7, Michigan State University, Department of Economics.
