MICS Booklet M2 Unit2 Unit3 Final

Short Course on Use of MICS data for education sector (or situation) analysis and monitoring | Module 2

Analyzing and interpreting key education indicators using MICS | Units 2 & 3

The designations employed and the presentation of material in this publication do not imply the
expression of any opinion whatsoever on the part of UNICEF, UNESCO or of IIEP-UNESCO concerning
the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation
of its frontiers or boundaries.

The ideas and opinions expressed in this publication are those of the authors and do not necessarily
reflect the views of UNICEF, UNESCO or IIEP-UNESCO.

PUBLISHED IN JUNE 2022 BY


The Office for Africa of the International Institute for Educational Planning in Dakar
(IIEP-UNESCO Dakar)
Almadies – Route de la plage Ngor
BP 3311 Dakar – Senegal
https://dakar.iiep.unesco.org/en

LICENCE

This work is licensed under Attribution – Non-commercial use – ShareAlike 4.0 International. To see a
copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.

You are free to share and reproduce the work under the following conditions:


● Attribution — You must credit the work, include a link to the licence and indicate whether any
modifications have been made to the work. You must indicate this information by all reasonable means,
but without suggesting that the Licensor endorses you or the way you have used their work.
● Non-commercial use — You may not make any commercial use of this work, or any part of the material
it contains.
● ShareAlike — If you remix, transform, or build upon the material of the original work, you must
distribute the modified work under the same conditions, i.e. with the same licence under which the
original work was distributed.

CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ACRONYMS
Unit 2 Computing education indicators using MICS data
INTRODUCTION TO UNIT 2
2.1 MICS6: INTRODUCTION, SAMPLING, QUESTIONNAIRES AND DATA FORMAT
2.1.1. MICS6 Sampling
2.1.2. MICS6 Questionnaires and Modules
2.1.3. MICS6 Data Format
2.2. SDG 4 INDICATORS CALCULABLE FROM MICS
2.2.1. Key education indicators calculable from MICS
2.2.2. Methodology for computation of the selected indicators
- Net Attendance Rate (NAR)
- Adjusted net attendance rate (ANAR)
- Completion Rate (CR)
- Participation in organized learning
- Out of school children rate (OSR)
- Parity indices
- Repetition Rate
- Dropout rate
- Early Childhood Development Index (ECDI)
- Foundational Learning Skills
- ICT Skills
- Literacy rate
- Share of children under 5 living in a positive and stimulating home environment
Unit 3 Data analysis using MICS data for monitoring SDG 4
INTRODUCTION TO UNIT 3
3.1 DATASETS IN MICS6
- HL dataset
- MN and WM dataset
- FS dataset
- CH dataset
3.2 USE OF SAMPLING WEIGHTS
3.3 HOUSEHOLD DATA ANALYSIS AND INTERPRETATION

LIST OF FIGURES
Figure 1: Thematic areas that can be analyzed using MICS-EAGLE data
Figure 2: Questionnaires used in MICS and the various topics covered in each questionnaire
Figure 3: Visual differences between GAR, NAR and ANAR in Sierra Leone (2017)
Figure 4: Example of MICS Data (Processed after collection)
Figure 5: Primary Completion in Sierra Leone, 2017
Figure 6: Out of school rates for children aged 6-11 in Sierra Leone, 2017

LIST OF TABLES
Table 1: MICS modules and questionnaires used in collecting education related information
Table 2: Key education indicators from MICS and linkage with SDG 4 indicators
Table 3: Questionnaires and datasets in MICS6
Table 4: Population of completers and children who complete primary and secondary education
Table 5: Official school ages and children from those ages attending school in Sierra Leone

LIST OF ACRONYMS
CAPI Computer Assisted Personal Interviews

EMIS Education Management Information System

ICT information and communications technology

IIEP International Institute for Educational Planning

MICS Multiple Indicator Cluster Survey

MICS-EAGLE MICS-Education Analysis for Global Learning and Equity

SDG Sustainable Development Goals

UIS UNESCO Institute for Statistics

UN United Nations

UNESCO United Nations Educational, Scientific and Cultural Organization

UNICEF United Nations Children’s Fund

UNIT 2 COMPUTING EDUCATION INDICATORS USING MICS DATA

INTRODUCTION TO UNIT 2
In Module 1, we covered the intricacies of the planning and monitoring processes of an education system
and gave an overview of MICS, its coverage and evolution, including its complementarity
to administrative data. In Unit 1 of Module 2, we then covered some concepts that will be helpful to
analyze and interpret education indicators calculable from MICS.

In this unit, we will focus on the education indicators that can be computed from MICS6, assessing the
linkage between these indicators and the SDG 4 indicators, and demonstrating how the indicators can
be calculated from the raw data.

Specific objectives of Unit 2:


At the end of this Learning Unit, participants will be able to:
- Identify the key education indicators calculable from MICS;
- Assess the linkage between indicators calculable from MICS and SDG 4 indicators;
- Explain how education indicators are calculated from raw data.

2.1 MICS6: INTRODUCTION, SAMPLING, QUESTIONNAIRES AND DATA FORMAT

Surveys follow a cyclic process that begins with the design of the survey, which essentially is the
definition of the goal, objectives or use of the survey. Once this is settled, the next main step of the
survey design is the development of survey instruments, which refer to questionnaires used as the
primary data collection tool from anticipated respondents. These survey instruments may also include
other documents such as the household record cards that collect information on members of the
respondent's household, instructions to the interviewers, etc. The next process is the selection of a
sample on which the survey would be conducted. We mentioned in the previous unit some methods
of selecting a sample and the constraints associated with sample data. Above all, a survey sample
should be reliable: however many times the survey is run on the selected sample, the estimates
should be similar. Data collection, the next step in the survey cycle, can be conducted through
Computer Assisted Personal Interviews (CAPI). The final step is the documentation and
dissemination of the survey, which involves the definition of key attributes of the survey (definition of
the survey, description of the tools, definition of variables, glossary of terminologies etc.) and provides
directions on where such information can be found in the survey repository. We highlight these steps
here so that participants who have never taken part in a survey can appreciate the entire
survey process. In this training, our focus is on education indicators that can be calculated using
MICS6, which is our first learning objective.

As a large-scale survey that measures progress over time, MICS6 provides important insights into the
well-being of children and women. In education, the focus is on access and participation in education,
as well as on foundational skills, parental involvement, and ICT skills, for instance. Given this wealth of
data, the MICS-EAGLE initiative has grouped the data into eight broad categories, as highlighted in Figure
1 below.

Figure 1: Thematic areas that can be analyzed using MICS-EAGLE data

Source: UNICEF. April 2020. MICS. Toward achieving inclusive and equitable quality education for all.

2.1.1. MICS6 Sampling


MICS surveys are representative household surveys; the selected samples reflect the populations from
which they are drawn. The surveys are typically designed to have sufficient sample size to be
representative at the national level and first subnational level (e.g., region). A growing number
of surveys, however, cover only subnational areas, such as counties and provinces (e.g., Turkana
County in Kenya, and Sindh Province in Pakistan), or special populations such as the Roma
(e.g., Bosnia and Herzegovina, Montenegro, and Serbia) and Palestinians in Lebanon (UNICEF 2019b),
without covering the entire country.

The MICS approach to sampling is that countries define a set of key indicators for the survey, the
domains of reporting, and the desired levels of precision for these domains, and then determine a
range of sample sizes to achieve these parameters. These sample sizes are then considered in
conjunction with several additional factors related to implementation, such as costs, length of
fieldwork, and amount of time available to work on the MICS survey. A final sample size carefully
balances these considerations. Sampling tools, which are online (UNICEF 2019c), consider a 95 percent
level of confidence and a relative margin of error of 12 percent (which may be relaxed at the
subnational level).

The MICS surveys use a multistage sample design based on an existing sample frame, such as the latest
population and housing census or a suitable master sample frame. Sample frames are first evaluated
for quality (e.g., there are no duplicates, they are complete and up to date). Further, the sample frame
must have clearly defined area units, and must include geographic codes, measure of size (households
or population), and auxiliary information for stratification.

2.1.2. MICS6 Questionnaires and Modules
Surveys collect information through questionnaires, and questionnaires can be comprised of different
modules. In surveys, a module consists of one or more pages of questions that collect information on
a particular subject. The different modules may be administered to all or only part of the sample,
depending on the subject of interest. The subject could be labor and employment, disability and
special needs, education and training, agriculture and food security, income sources, household
equipment, etc.

Operationally, the 6th generation of MICS is implemented using eight questionnaires: (i) a
household questionnaire, (ii) a questionnaire for individual women aged 15-49, (iii) a questionnaire for
individual men aged 15-49, (iv) a questionnaire for children aged 5-17, (v) a questionnaire for
children under five, (vi) a questionnaire for water quality, (vii) a questionnaire for GPS data collection,
and (viii) a questionnaire for collection of vaccination records. Countries can customize the
questionnaires and modules according to their needs. Figure 2 illustrates the questionnaires used in
MICS and the various topics covered in each questionnaire.

Figure 2: Questionnaires used in MICS and the various topics covered in each questionnaire

Source: UNICEF. April 2020. MICS. Toward achieving inclusive and equitable quality education for all.

As a data analyst or education stakeholder, it is important to note that although modules are meant
for the collection of data on specific subjects, it is not uncommon to find several modules used to
collect data on the same subject, due to the nature of the sub-samples in the survey. It is therefore
important for all analysts to know which modules and which questionnaires contain the data they want
to analyze.

Depending on the analysis to be undertaken, data from different questionnaires will
have to be merged while applying the appropriate weights. For example, education information should
always be taken from the household questionnaire, while information on foundational skills comes
from the 5-17 questionnaire.

Table 1 highlights the modules and questionnaires used in collecting education-related information in
MICS; the respective datasets are given in parentheses.

Table 1: MICS modules and questionnaires used in collecting education related information
Questionnaire (dataset) | Modules
Household (hh/hl) | Education background (highest level attended, current education, previous education); Social transfer module
Children under 5 (ch) | Early childhood development; Child discipline; Child functioning
Children 5 to 17 (fs) | Foundational Learning Skills; Child discipline; Child functioning; Child labor; Parental participation
Women (wm) & men (mn) – 18 to 49 | Literacy; Use of ICT (ICT skills); Marriage union (early marriage)
Source: UNICEF. April 2020. MICS. Toward achieving inclusive and equitable quality education for all.

2.1.3. MICS6 Data Format


As explained in the previous section, MICS6 uses several questionnaires: household, women aged 15–
49 years, men aged 15–49 years, children under age 5, children 5–17 years, and water-quality testing.
Each questionnaire generates at least one microdata file. MICS surveys containing all questionnaire
modules produce 10 secondary microdata files: household, household listing, insecticide-treated nets,
women, birth history, female genital mutilation, maternal mortality, men, children aged 5–17 years, and
children under age 5. Each microdata file can be linked to other files based on relationships found in
the household listing. Microdata undergo secondary editing that includes structural and consistency
checks. No values in the microdata are imputed. Cross-country comparison is preserved by ensuring
that country customization uses standard MICS question numbers, even if survey-specific questions
are added or removed.

Microdata contain raw and derived variables. Raw variables directly match questions in individual
country questionnaires. Derived variables are calculated from various raw variables, or by using
external information in the case of sample weights. Sample weights are provided in datasets and are
calculated to account for nonresponse, oversampling (where implemented), and subsampling, and are
normalized, so that the total number of weighted households is equal to the total number of
unweighted households.
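
As a rough illustration of what normalization means here, the short Python sketch below rescales a set of made-up design weights so that they sum to the number of households. The raw weight values and the variable names are assumptions chosen for illustration; the actual MICS weight calculation includes adjustment steps that are not shown.

```python
# Hypothetical raw design weights for four sampled households (illustrative values).
raw_weights = [150.0, 300.0, 150.0, 200.0]

# Dividing each weight by the mean weight makes the weights sum to the sample size.
mean_weight = sum(raw_weights) / len(raw_weights)
normalized = [w / mean_weight for w in raw_weights]

print(normalized)                         # [0.75, 1.5, 0.75, 1.0]
print(sum(normalized), len(raw_weights))  # 4.0 4
```

After normalization the weighted and unweighted household counts coincide, while the relative importance of each household is unchanged.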

Each dataset has anonymous identifiers (cluster number, household number, and line number of the
respondent) that are necessary to merge different files (e.g., merge household data into children's
datafiles). All cases are provided in datasets, though for replication of figures in MICS tables only
completed interviews should be used in analysis. The ‘HL’ dataset has the education information used to
calculate the key access and internal efficiency indicators; for these indicators, hhweight is to be
applied. The ‘FS’ dataset, on the other hand, includes the variables needed to calculate learning outcomes,
and fsweight is the weight to be applied when calculating foundational reading and numeracy skills.
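
The merge-and-weight logic described above can be sketched in plain Python as follows. This is a minimal illustration, not the official processing routine: the identifier names (HH1 for cluster, HH2 for household, HL1 and LN for the line number), the indicator columns and all values are assumptions, and real datasets would normally be read from the SPSS files with a statistical package.

```python
# Hypothetical miniature extracts; real files hold many more variables.
hl_rows = [
    {"HH1": 1, "HH2": 3, "HL1": 2, "attends_school": 1},
    {"HH1": 1, "HH2": 3, "HL1": 4, "attends_school": 0},
    {"HH1": 2, "HH2": 1, "HL1": 3, "attends_school": 1},
]
fs_rows = [
    {"HH1": 1, "HH2": 3, "LN": 2, "fsweight": 1.2, "reads": 1},
    {"HH1": 2, "HH2": 1, "LN": 3, "fsweight": 0.8, "reads": 0},
]

def merge_on_ids(fs, hl):
    """Attach household-listing fields to each child record (left join)."""
    index = {(r["HH1"], r["HH2"], r["HL1"]): r for r in hl}
    merged = []
    for child in fs:
        key = (child["HH1"], child["HH2"], child["LN"])
        merged.append({**index.get(key, {}), **child})
    return merged

def weighted_share(rows, flag, weight):
    """Weighted percentage of rows where the 0/1 flag equals 1."""
    total = sum(r[weight] for r in rows)
    return 100 * sum(r[weight] * r[flag] for r in rows) / total

merged = merge_on_ids(fs_rows, hl_rows)
reading_share = weighted_share(merged, "reads", "fsweight")
print(round(reading_share, 1))  # 60.0
```

The key point is that the weight belongs to the dataset that defines the unit of analysis: fsweight for learning outcomes of children aged 5-17, hhweight for indicators computed on the household listing.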

Datasets are currently made available in SPSS and can be converted easily into other datafile types,
such as STATA and R. All data are anonymized and do not include names, locations, and other
identifying information.

In addition, as described in the previous section, there is often interest in analyzing education
indicators in relation to other variables. Many geographical, socio-economic, cultural and family
context variables can be drawn from the other modules to describe, explain or predict the educational
outcomes measured in the education-related modules.

2.2. SDG 4 INDICATORS CALCULABLE FROM MICS


The adoption of the Sustainable Development Goals in 2015 and the subsequent ratification by
different sectors, including the adoption of SDG 4 at Incheon, saw the UNESCO Institute for Statistics (UIS)
appointed as the official data source for SDG 4–Education 2030 indicators. As education stakeholders,
it is important to know that UIS does not itself collect any of the data used for monitoring SDG 4
indicators; they are compiled from countries’ EMISs or from other agencies that carry out primary data
collection, such as through MICS. In this section, we identify the indicators that can be
sourced from MICS as well as their linkage to SDG 4 indicators.

2.2.1. Key education indicators calculable from MICS

We saw earlier that education-related indicators in MICS-EAGLE are clustered around the thematic areas of
access and completion, skills (learning outcomes, ICT skills and literacy rates), inclusive education, early
learning, out-of-school children, grade repetition and dropout, child protection, and remote learning.
Within these are several key education indicators that are linked to the SDG 4 indicators.

Below is the list of selected indicators used in MICS-EAGLE, with their operational definitions:
1. Net attendance rate (NAR) and Adjusted Net Attendance Rate (ANAR): NAR measures the percentage
of children of a given age group that are attending an education level compatible with their age.
ANAR measures the percentage of children of a given age group that are attending an education
level compatible with their age or attending a higher education level. The ratio is termed
"adjusted" since it includes not only children attending the level they are expected to, but also
those in higher levels of education.
2. Completion Rate (CR) reflects the percentage of a cohort of children or young people three to five
years older than the intended age for the last grade of each level of education (primary, lower
secondary, or upper secondary) who have completed that level of education.
3. Participation in organized learning measures the share of children one year younger than the
official age to start primary school who are attending ECE or primary education.
4. Out of school children (OOSC) rate measures the share of the population in the official age range
for a given level of education that is not attending school. OOSC are children and young people
in the official age range for a given level of education who are not attending pre-primary,
primary, secondary, or higher levels of education.
5. Parity indices are calculated as the ratio of two categories of one indicator (most commonly
male/female, urban/rural, wealthiest/poorest quintile, disabled/non-disabled).
6. Repetition rate measures the share of children in a given grade in a given school year who
repeated that grade as a percentage of total number of children who attended the grade in the
previous year.
7. Dropout rate measures the proportion of children from a cohort attending a given grade in a given
school year who are no longer attending school in the following year.
8. Early Childhood Development Index (ECDI) measures the percentage of children under 5 years of
age who are developmentally on track in literacy-numeracy, physical, social-emotional, and
learning domains.
9. Foundational Learning Skills measure learning outcomes expected for Grades 2 and 3 in numeracy
and reading.
10. ICT Skills measures the proportion of youth and adults who used at least one of nine ICT skills in
the three months leading up to the survey.
11. Literacy rate measures the share of population that can both read and write a short, simple
statement about their everyday life. Youth literacy rate measures the share of population aged 15-
24 that can both read and write a short, simple statement about their everyday life.
12. Positive and stimulating home environment is an environment that supports young children by
providing them with opportunities to expand and apply their skills and knowledge.

Indicators that can be measured from MICS make a significant contribution to monitoring progress
towards education development commitments. A stocktaking of indicators and of the data available
to measure them indicates that MICS accounts for between 2% and 35% of the increase in the coverage of
SDG 4 indicators¹ (2% on ICT skills, 15% on numeracy assessment, 16% on reading assessment, 19% on
organized learning, 28% on equity and disability disaggregation, 35% on ECDI). As shown in Table 2,
most of the identified indicators map onto an existing SDG 4 indicator.

1 UNICEF. April 2020. MICS. Toward achieving inclusive and equitable quality education for all.

Table 2: Key education indicators from MICS and linkage with SDG 4 indicators
MICS education indicator | SDG 4 indicator
Adjusted net attendance rate | No specific indicator
Completion Rate | Indicator 4.1.2: Completion rate (primary education, lower secondary education, upper secondary education)
Participation in organized learning | Indicator 4.2.2: Participation rate in organized learning (one year before the official primary entry age)
Out of school children rate | Indicator 4.1.4: Out-of-school rate (primary education, lower secondary education, upper secondary education)
Parity indices | Indicator 4.5.1: Parity indices (female/male, rural/urban, bottom/top wealth quintile, and others such as disability, indigenous peoples and conflict-affected, as data become available) for all education indicators on this list that can be disaggregated
Repetition and Dropout rates | No specific indicator
Early Childhood Development Index | Indicator 4.2.1: Proportion of children under 5 years of age who are developmentally on track in health, learning and psychosocial well-being
Foundational Learning Skills | Indicator 4.1.1: Proportion of children and young people achieving at least a minimum proficiency level in (i) reading and (ii) mathematics
ICT Skills | Indicator 4.4.1: Proportion of youth and adults with information and communications technology (ICT) skills, by type of skill
Youth literacy rate | Indicator 4.6.1: Percentage of population in a given age group achieving at least a fixed level of proficiency in functional (a) literacy and (b) numeracy skills
Positive and stimulating home environment | Indicator 4.2.3: Percentage of children under 5 years experiencing positive and stimulating home learning environments

2.2.2. Methodology for computation of the selected indicators


In the previous section, we discussed sampling and why a sample selected
for a survey needs to be reliable. In the computation of indicators, we introduce the notion of validity:
the method for computing an indicator must remain exactly the same regardless of where
the indicator is computed or who computes it. This is essential for standardizing results that come
from different countries or different rounds of the survey, so that they are comparable. The
formulas are presented so that you can see how the indicators are calculated.

All indicators can be disaggregated by gender, socio-economic status and region, in line with
the sample design of the survey, which differs from country to country.

- Net Attendance Rate (NAR)
NAR measures the percentage of children of a given age group that are attending an education level
compatible with their age. It can be divided into three indicators:
• NAR primary – percentage of children of primary school age currently attending primary
school.
• NAR lower secondary – percentage of children of lower secondary school age currently
attending lower secondary school
• NAR upper secondary – percentage of children of upper secondary school age currently
attending upper secondary school

Calculation
For example, NAR for primary level is calculated by dividing the total number of students in the official
primary school age range who attended primary education at any time during the reference academic
year by the population of the same age group. The following formula is used:

$$NAR_n = \frac{AL_n}{SAP_n} \times 100$$

Where:
NAR_n = net attendance rate for level n of education
AL_n = population of the official age for level n of education attending that level of education
SAP_n = population of the official age for level n of education

- Adjusted net attendance rate (ANAR)


ANAR measures the percentage of children of a given age group that are attending an education level
compatible with their age or attending a higher education level. The ratio is termed "adjusted" since it
includes not only children attending the level they are expected to, but also those in higher levels of
education. It can be divided into three indicators:
• ANAR primary – percentage of children of primary school age currently attending primary or
secondary school
• ANAR lower secondary – percentage of children of lower secondary school age currently
attending lower secondary school or higher
• ANAR upper secondary – percentage of children of upper secondary school age currently
attending upper secondary school or higher

Calculation
For example, the ANAR for primary education is calculated by dividing the total number of students in
the official primary school age range who attended primary or secondary education at any time during
the reference academic year by the population of the same age group. The following formula is used:

$$ANAR_n = \frac{ALH_n}{SAP_n} \times 100$$

Where:
ANAR_n = adjusted net attendance rate for level n of education
ALH_n = population of the official age for level n of education attending that level of education or
higher
SAP_n = population of the official age for level n of education
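
To make the difference between the two numerators concrete, here is a minimal Python sketch that computes a weighted primary NAR and ANAR from a handful of invented records. The age range (6-11), the level labels and the record layout are assumptions for illustration; only the weight name hhweight comes from the text.

```python
# Hypothetical records: age, level attended and household sample weight.
children = [
    {"age": 7,  "level": "primary",         "hhweight": 1.0},
    {"age": 9,  "level": "none",            "hhweight": 1.5},
    {"age": 11, "level": "lower_secondary", "hhweight": 0.5},
    {"age": 10, "level": "primary",         "hhweight": 1.0},
    {"age": 13, "level": "primary",         "hhweight": 2.0},  # outside the age range
]
PRIMARY_AGES = range(6, 12)  # assumed official primary ages 6-11
HIGHER_LEVELS = {"lower_secondary", "upper_secondary", "higher"}

def attendance_rate(rows, ages, levels, weight="hhweight"):
    """Weighted share of children of the given ages attending one of `levels`."""
    in_age = [r for r in rows if r["age"] in ages]
    population = sum(r[weight] for r in in_age)
    attending = sum(r[weight] for r in in_age if r["level"] in levels)
    return 100 * attending / population

nar = attendance_rate(children, PRIMARY_AGES, {"primary"})
anar = attendance_rate(children, PRIMARY_AGES, {"primary"} | HIGHER_LEVELS)
print(nar, anar)  # 50.0 62.5
```

The 11-year-old already in lower secondary is counted only in the ANAR numerator, and the over-age 13-year-old in primary is excluded from both because the denominator is restricted to the official age group.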

Box 1: Gross attendance ratio, net attendance rate and adjusted net attendance rate

Besides NAR and ANAR, we can also calculate the Gross Attendance Ratio (GAR). GAR measures the number of
children attending a given level of education, regardless of their age, as a percentage of the population of
the official age for that level. All three indicators (GAR, NAR and ANAR) use the same denominator, which is
the number of children of the official age for a given level of education. However, the numerator differs for
each of these indicators. For example, the GAR is often above 100 per cent in primary education in developing
countries because the numerator includes every child attending primary education. In some contexts, children
who should be attending secondary education based on their age are still in primary, so they are counted in
the numerator, but not in the denominator. In Sierra Leone (2017) for example, the primary GAR is 119 per cent.

Figure 3: Visual differences between GAR, NAR and ANAR in Sierra Leone (2017)
[Bar chart: GAR, NAR and ANAR (per cent) at the primary, lower secondary and upper secondary levels]

Source: Statistics Sierra Leone. (2018). Sierra Leone Multiple Indicator Cluster Survey 2017, Survey Findings Report.
Freetown, Sierra Leone: Statistics Sierra Leone

When only children attending the correct level are considered in the numerator (NAR), the figures drop
sharply. However, NAR does not consider children who are already attending higher levels of education, such
as children who are primary school age, but are already in secondary school. To include those children in the
numerator, it is necessary to use ANAR, which considers all children attending the level of education designed
for them or higher. As a result, ANAR is never lower than NAR, although the difference is usually small.

It is important to note that the ANAR calculation excludes children attending lower levels of education
from its numerator. For example, lower secondary school-age children attending primary schools will
be counted as in school, but they will not be counted as attending the education level designed for
their age group and hence, they will be excluded from the ANAR numerator. As a result, in many
developing countries, ANAR will underestimate access to education because although in school, many
children are attending levels lower than expected for their age group.

- Completion Rate (CR)


It is computed as the number of children or young people aged 3-5 years above the intended age for
the last grade of each level of education (primary, junior/lower secondary, senior/upper secondary)
who have completed that grade divided by the population of children aged 3-5 years older than the
intended age for the grade in question and expressed as a percentage. This is the formula to be used:

$$CR_n = \frac{EAP_{n,\,a+3\text{-}5}}{P_{a+3\text{-}5}} \times 100$$

Where:
CRn = Completion Rate for level n of education
EAPn, a+3-5 = Population aged three to five years above the official entrance age a into the last grade of
level n of education who completed level n
Pa+3-5 = Population aged three to five years above the official entrance age a into the last grade of level
n of education.

For example, the junior secondary completion rate is given by the formula below, noting
that the conventional age for completing junior secondary is 14 years. The reference age range is then
obtained by adding 3 and 5 years to the official completion age to get the lower and upper bounds
respectively:

$$CR_{\text{junior secondary}} = \frac{\text{Population aged 17-19 who have completed junior secondary}}{\text{Population aged 17-19}} \times 100$$

- Participation in organized learning


Participation in organized learning is calculated as the percentage of children one year younger than
the official primary school entry age attending ECE or primary school. Both the numerator and
denominator include only children aged one year younger than the official entry age for primary
school. The indicator is computed using the following formula:

$$PiOL = \frac{A_{a-1}}{P_{a-1}} \times 100$$

Where:
PiOL = participation in organized learning
A_{a-1} = children attending early childhood education or primary school aged one year younger than the
official entry age for primary school
P_{a-1} = children aged one year younger than the official entry age for primary school.

- Out of school children rate (OSR)
This indicator is calculated as the number of children of the official age for a given level of education
who are not attending pre-primary, primary, secondary, or higher levels of education, expressed as a
percentage of the total population of the same age group. The following formula is used:

$$OSR_n = \frac{SAP_n - AAG_n}{SAP_n} \times 100$$

Where:
OSRn = out-of-school rate for children and young people of the official age for level n of education;
SAPn = population of the official age for level n of education;
AAGn = children and young people of the official age for level n of education attending any level of
education
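
As a small worked illustration, the Python sketch below computes a weighted out-of-school rate for an invented group of primary-age children. The age range (6-11), the level labels and the values are assumptions; the point is only that a child attending any level, including pre-primary, counts as in school.

```python
# Hypothetical records for children around the official primary ages (assumed 6-11).
children = [
    {"age": 6,  "level": "pre_primary",     "hhweight": 1.0},
    {"age": 8,  "level": "primary",         "hhweight": 2.0},
    {"age": 10, "level": "none",            "hhweight": 1.0},
    {"age": 11, "level": "lower_secondary", "hhweight": 1.0},
    {"age": 9,  "level": "none",            "hhweight": 3.0},
]

def out_of_school_rate(rows, ages, weight="hhweight"):
    """Weighted share of children of the given ages attending no level at all."""
    in_age = [r for r in rows if r["age"] in ages]
    official_age_pop = sum(r[weight] for r in in_age)
    attending_any = sum(r[weight] for r in in_age if r["level"] != "none")
    return 100 * (official_age_pop - attending_any) / official_age_pop

osr_primary = out_of_school_rate(children, range(6, 12))
print(osr_primary)  # 50.0
```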

- Parity indices
They are computed by dividing the value of an indicator for one group by its value for another group.
Usually, group 1 (numerator) is presumed to have the lowest GAR and group 2 (denominator) to have
the highest, but it is not always the case. For gender parity, girls are often treated as group 1 even
when their GAR is higher than boys’ GAR. The formula for computing the parity index
between two groups in terms of GAR is:

$$PI_{1/2} = \frac{GAR_1}{GAR_2}$$

Where:
PI_{1/2} = parity ratio of group 1 to group 2;
GAR_1 = GAR of group 1;
GAR_2 = GAR of group 2.
Note: In the case of gender parity indices in particular, the education indicator for girls is not always
lower than that of boys. To facilitate the comparison of gender differences in favour of girls or
boys, the Adjusted Gender Parity Index (GPIA) is often used. In the case of parity on GAR, with girls (F)
as group 1 and boys (M) as group 2:

$$GPIA = \frac{GAR_F}{GAR_M} \quad \text{if } GAR_F \le GAR_M$$

$$GPIA = 2 - \frac{GAR_M}{GAR_F} \quad \text{if } GAR_F > GAR_M$$
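
The following Python sketch contrasts the simple ratio with the adjusted index on made-up GAR values. The function names and the input values are illustrative assumptions; the adjustment follows the two-case rule described in the note above.

```python
def parity_index(group1, group2):
    """Unadjusted parity index: indicator value of group 1 over group 2."""
    return group1 / group2

def adjusted_parity_index(group1, group2):
    """Adjusted index: symmetric around 1 and bounded between 0 and 2."""
    if group1 <= group2:
        return group1 / group2
    return 2 - group2 / group1

# Hypothetical GAR values (per cent) for girls (group 1) and boys (group 2).
print(round(parity_index(80, 100), 2), round(adjusted_parity_index(80, 100), 2))  # 0.8 0.8
print(round(parity_index(100, 80), 2), round(adjusted_parity_index(100, 80), 2))  # 1.25 1.2
```

With the adjustment, a disadvantage of the same relative size gives values equally distant from 1 (0.8 and 1.2), whichever group is disadvantaged; the unadjusted ratio gives 0.8 and 1.25.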

- Repetition Rate
The repetition rate measures the share of children in a given grade in a given school year who repeated
that grade as a percentage of total number of children who attended the grade in the previous year.
The repetition rate is calculated as the number of repeaters in a given grade in a school year divided
by the number of students from the same cohort enrolled in the same grade in the previous school
year. The following formula is used:

$$RR_{g,t} = \frac{R_{g,t}}{E_{g,t-1}} \times 100$$

Where:
RR_{g,t} = repetition rate in a given grade g in a given school year t
R_{g,t} = number of students repeating grade g in a given school year t
E_{g,t-1} = number of students from the same cohort enrolled in the same grade g in the previous school
year t-1

- Dropout rate
The dropout rate measures the proportion of children from a cohort attending a given grade in a given
school year who are no longer attending school in the following year. It is worth clarifying that children
who repeat are still considered to be in school and are therefore not included in the calculation of the
dropout rate.
The dropout rate is calculated as the number of children who were in a grade in the previous year (t-1)
and are no longer enrolled in school in year t, divided by the total number of students in that grade in
the previous year (t-1). The following formula is used:

$$DR_{g,t} = \frac{D_{g,t}}{E_{g,t-1}}$$

Where:
DRg,t = dropout rate for grade g at time t
Dg,t = children in grade g at time t-1 no longer enrolled in school at time t
Eg,t-1 = number of students enrolled in grade g in the previous school year t-1
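
The repetition and dropout rates share the same denominator, which a short Python sketch can make concrete. The function names and the counts below are hypothetical and serve only as an illustration of the two ratios.

```python
def repetition_rate(repeaters_t: float, enrolled_prev: float) -> float:
    """Repeaters in grade g in year t over enrolment in grade g in year t-1."""
    if enrolled_prev <= 0:
        raise ValueError("previous-year enrolment must be positive")
    return repeaters_t / enrolled_prev


def dropout_rate(dropouts_t: float, enrolled_prev: float) -> float:
    """Children of grade g (year t-1) no longer in school in year t, over the same enrolment."""
    if enrolled_prev <= 0:
        raise ValueError("previous-year enrolment must be positive")
    return dropouts_t / enrolled_prev


# Hypothetical cohort: 1,000 children in a grade last year,
# of whom 80 repeat the grade this year and 50 have left school.
enrolled_last_year = 1000
rr = repetition_rate(80, enrolled_last_year)
dr = dropout_rate(50, enrolled_last_year)
print(f"repetition rate = {rr:.1%}, dropout rate = {dr:.1%}")
```

In this sketch repeaters are counted only in the repetition rate, consistent with the point above that children who repeat are still in school.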

- Early Childhood Development Index (ECDI)


To compute the percentage of children under 5 years of age who are developmentally on track in at
least three of the four domains (literacy-numeracy, physical, social-emotional, and learning), the
Early Childhood Development Index (ECDI) must first be computed.2
ECDI = 1 if LN + P + SE + L ≥ 3
= 0 otherwise
Where:
LN is a binary variable where 1 represents children developmentally on track in the literacy-
numeracy domain
P is a binary variable where 1 represents children developmentally on track in the physical
domain
SE is a binary variable where 1 represents children developmentally on track in the social-
emotional domain
L is a binary variable where 1 represents children developmentally on track in the learning
domain

Once the ECDI is computed, the percentage of children under 5 years of age who are developmentally
on track in literacy-numeracy, physical, social-emotional, and learning domains can be computed as:

$$SDT_{3to4} = \frac{ECDI_{3to4}}{T_{3to4}}$$

Where:
SDT3to4= share of children aged 3 to 4 years old developmentally on track
ECDI3to4= number of children aged 3 to 4 years old who have ECDI equal to 1
T3to4= total number of children aged 3 to 4 years old

2 ECDI has recently been updated, and this is the former calculation. Moving forward with MICS7, ECDI2030 will be applied.
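
The ECDI rule and the resulting share can be sketched as follows. This is a minimal illustration: the records and the dictionary keys are hypothetical, and the use of chweight as the weight follows the description of the CH dataset in Unit 3 rather than any official syntax file.

```python
def ecdi(ln: int, p: int, se: int, learning: int) -> int:
    """Return 1 when the child is on track in at least three of the four domains."""
    return 1 if ln + p + se + learning >= 3 else 0


def share_on_track(children) -> float:
    """Weighted share of children with ECDI equal to 1."""
    total = sum(c["chweight"] for c in children)
    on_track = sum(
        c["chweight"] * ecdi(c["LN"], c["P"], c["SE"], c["L"]) for c in children
    )
    return on_track / total


# Hypothetical children aged 3 to 4 years (not real survey records).
sample = [
    {"LN": 1, "P": 1, "SE": 1, "L": 0, "chweight": 1.0},  # 3 domains: on track
    {"LN": 0, "P": 1, "SE": 1, "L": 0, "chweight": 2.0},  # 2 domains: not on track
    {"LN": 0, "P": 1, "SE": 1, "L": 1, "chweight": 1.0},  # 3 domains: on track
    {"LN": 0, "P": 0, "SE": 1, "L": 0, "chweight": 1.0},  # 1 domain: not on track
]
sdt = share_on_track(sample)
print(sdt)
```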

- Foundational Learning Skills.


This indicator is computed separately for reading and numeracy. The share of children with
foundational reading skills is computed using the following formula:

$$FRS_{ng} = \frac{readsk_{ng}}{T_{ng}}$$

Where:
FRSng= share of children aged n and attending a school grade g who have foundational reading skills
readskng= number of children aged n and attending a school grade g who have readsk equal to 1 (i.e., the
intersection of reading 6 words correctly, answering a literal question correctly, and answering an
inferential question correctly)
Tng= total number of children aged n and attending school grade g

The share of children with foundational numeracy skills is computed in the same way:

$$FNS_{ng} = \frac{numbskill_{ng}}{T_{ng}}$$

Where:
FNSng= the share of children aged n and attending school grade g who have foundational numeracy
skills
numbskillng= number of children aged n and attending school grade g who have numbskill equal to 1 (i.e.,
the intersection of correctly answering all the number discrimination questions, correctly answering all
the addition questions, and correctly answering all the number pattern tasks)
Tng= total number of children aged n and attending school grade g
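
The "intersection" logic behind readsk, and the share by age and grade, can be sketched in Python. The field names words_ok, literal_ok and inferential_ok are hypothetical placeholders for the three reading conditions, and the records are invented. Weighting by fsweight follows the description of the FS dataset in Unit 3.

```python
from collections import defaultdict


def has_reading_skills(child) -> int:
    """readsk = 1 only when all three reading conditions are met."""
    return int(child["words_ok"] and child["literal_ok"] and child["inferential_ok"])


def reading_skill_shares(children):
    """Weighted share of children with readsk = 1, for each (age, grade) group."""
    skilled = defaultdict(float)
    total = defaultdict(float)
    for c in children:
        key = (c["age"], c["grade"])
        total[key] += c["fsweight"]
        skilled[key] += c["fsweight"] * has_reading_skills(c)
    return {key: skilled[key] / total[key] for key in total}


# Hypothetical children (not real survey records).
sample = [
    {"age": 8, "grade": 3, "words_ok": True, "literal_ok": True, "inferential_ok": True, "fsweight": 1.0},
    {"age": 8, "grade": 3, "words_ok": True, "literal_ok": True, "inferential_ok": False, "fsweight": 1.0},
    {"age": 9, "grade": 4, "words_ok": True, "literal_ok": True, "inferential_ok": True, "fsweight": 1.5},
    {"age": 9, "grade": 4, "words_ok": False, "literal_ok": False, "inferential_ok": False, "fsweight": 0.5},
]
frs = reading_skill_shares(sample)
print(frs)
```

The same structure would apply to numbskill by swapping in the three numeracy conditions.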

- ICT Skills
They are calculated as the number of individuals in a demographic group who used a certain ICT skill,
divided by the total number of people in that demographic group. The following formula is used:

$$ICT_{s,d} = \frac{D_{s,d}}{P_{s,d}}$$

Where:
ICTs, d = Share of individuals in a specific demographic group d who possess ICT skill S;
Ds, d = Number of individuals in a specific demographic group d who used a certain ICT skill S; and
Ps, d = Total number of people in that specific demographic group.

- Literacy rate
It is calculated by dividing the total number of literate individuals in an age group (those who are able
to read and write), by the population of the age group. The following formula is used:

$$LR_a = \frac{L_a}{P_a}$$

Where:
LRa = literacy rate for population in age group a;
La = number of literate individuals in age group a;
Pa= population in age group a.

- Share of children under 5 living in a positive and stimulating home environment


This is computed by dividing the number of children under five living in such an environment by the
total population of children under five, as shown in the formula below. A child is considered to be living
in such an environment if they meet at least four of the following six conditions: (i) an adult member
of the household and the child engage in reading books or looking at picture books together; (ii) an
adult member of the household and the child engage in storytelling; (iii) an adult member of the
household and the child sing songs and lullabies together; (iv) the child is taken out of the home by an
adult member of the household; (v) an adult member of the household engages in playful activities
with the child; and (vi) an adult member of the household and the child play, draw and name things
together.

$$PSHE_n = \frac{CS_n}{C_n}$$

Where:
PSHEn = share of children aged n who live in a positive and stimulating home environment
CSn= number of children aged n with whom an adult engaged in at least four activities
Cn= total number of children aged n

Comprehension Check 1

1. All education indicators in MICS have linkages with the SDG 4 indicators.
a. Yes
b. No

2. Check all that apply with respect to Gross Attendance Ratio (GAR), Net Attendance Rate (NAR),
and Adjusted Net Attendance Rate (ANAR):
a. GAR, NAR and ANAR all use the same denominator of the number of children of the
official age for a given level of education who are attending that level.
b. GAR, NAR and ANAR all use the same numerator of the number of children attending a
level of education.
c. NAR and ANAR include every child attending primary education.
d. GAR measures the percentage of children of a given age group who are attending any
level of education.
e. NAR measures the percentage of children of a given age group that are attending an
education level compatible with their age.
f. ANAR measures the percentage of children of a given age that are attending an
education level compatible with their age or attending a higher education level.

3. Repetition and dropout rate are measured the same way.


a. Yes
b. No

UNIT 3 DATA ANALYSIS USING MICS DATA FOR MONITORING
SDG 4

INTRODUCTION TO UNIT 3
The present unit will cover analysis of some of the education indicators introduced in Unit 2 and discuss
how these indicators are useful in informing education policy decisions and policy formulation. We will
also learn the steps involved in transforming variables from MICS and applying the formulae discussed
in Unit 2 as part of the analysis of the selected indicators. Some concepts introduced in Unit 1 of this
Module will be of use in this unit. Participants are called upon to have a quick review of the first unit
so that analysis may be completed seamlessly.

It is essential to correctly interpret the indicators, so that the data can be used to steer education
systems. Beyond the monitoring of SDG 4, education indicators produced from MICS data are central
in the three main phases of educational planning:
- Situation and needs analysis, typically through an Education Sector Analysis (ESA).
- Strategic and operational planning: policy formulation and impact simulation, target setting,
to develop the Education Sector Plan (ESP).
- Monitoring and evaluation of the ESP’s implementation.

In addition, as explained previously, one of the key advantages of household surveys such as MICS is
the possibility to disaggregate chosen indicators along various dimensions of inequity, which allows
analysts to identify, design and monitor pro-equity policies.

Specific objectives of Unit 3:


At the end of this Learning Unit, participants will be able to:
- Identify attributes of a household dataset
- Create relevant indicator variables from MICS data
- Analyze key education indicators
- Interpret results of indicators with associated disaggregation
- Understand how education indicators can be disaggregated.

3.1 DATASETS IN MICS6


Before beginning the processing or preparation of variables to be used in the analysis, it is important
to have an idea of the MICS datasets. You will recall from Unit 2 that the education indicators calculable
from MICS are based on data collected through multiple questionnaires, including the HH, HL, CH, FS,
WM, and MN questionnaires. The datasets for countries implementing MICS6 are made publicly
available in SPSS format and can be easily converted to Stata or R formats.

Once collected and processed, the secondary data available to analysts are two-dimensional data
tables with variables and cases. The variables, organized in columns, will be as many as the number of
questions posed in the survey, while the cases, shown in rows, will be as many as the units at which a
question is posed (depending on the questionnaire, the household, or the individual). This means that
a question posed for a household will yield as many cases as the number of households surveyed, while
a question posed to (or about) household members will yield as many cases as the respondents of the
target group of the corresponding questionnaire (e.g., women, men, children) in the households
surveyed. In some cases, more than one dataset can be produced from a single questionnaire, such as
the HH questionnaire. Table 3 below shows how the questionnaires correspond to the datasets that
are made publicly available.

Socio-economic and demographic variables such as age, sex, location, and region are found across
datasets, except for the hh.sav dataset, which holds household-level rather than individual-level
information.

With respect to education indicators, the following variables are relevant across datasets:

- HL dataset
Dataset with list of household members and information about individuals in the household. To the
extent possible, education information should be taken from the HL dataset. The weight variable in
this dataset is called hhweight. Variables covering education are:

For everyone aged 3 years and above:

• ever attended school or ECE (ED4);
• attended school in the current school year (ED9);
• highest level of education (ED5A) and highest grade attended (ED5B) by everyone in the
household, including adults; and
• whether the highest grade attended was completed (ED6).

For those aged 3 to 24 years:

• level of education attending in current school year (ED10A) and grade attending in current
school year (ED10B);
• type of education institution being attended (ED11);
• school tuition and other kinds of material support (ED12, ED13 and ED14); and
• level of education attended in the previous school year (ED16A) and grade attended in the
previous school year (ED16B).

- MN and WM dataset
MN corresponds to data on men aged 15 to 49 years and WM corresponds to data on women aged 15
to 49 years. The weight variable in the MN dataset is mnweight and in the WM dataset is wmweight.
Variables covering education are:
• ability to read a sentence (WB14 for women and MWB14 for men)
• ICT skills (MT6A- MT6I for women and MMT6A- MMT6I for men)

- FS dataset
FS corresponds to data on children aged 5 to 17 years. The weight variable is fsweight. Variables
covering education are:

• Variables for foundational reading skills (questions FL10 to FL22); and
• Variables for foundational numeracy skills (questions FL23 to FL27)

- CH dataset
CH corresponds to data on children under 5 years of age. The weight variable is chweight. Variables
covering education are:
• early childhood education (ECE) attendance (question UB8).
• Information on early schooling can also be connected to data on how children have developed,
which is present in the Early Childhood Development module (EC). The module asks questions
about parental involvement and the home environment (questions EC1 to EC5), as well as on
the child’s health and development (EC7 to EC15).

When carrying out an analysis, an analyst will find that analyzing datasets independently may not allow
the full depth of analysis intended. For example, an education analyst may want to analyze education
attributes of individuals within the context of their household environment. This calls for merging the
various datasets into a master dataset from which most (if not all) required analyses can be conducted.
This training program does not cover merging functions in depth, and participants are encouraged to
look up the merging functions of their statistical software. Merging requires that unique identifiers be
contained in all the datasets, so that the software can accurately match the information from the
various sources. In the case of MICS, the number of the household cluster, the household number
within that cluster and the number of each respondent within that household3 are included in all
datasets and are generally used for merging.
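
The matching logic behind a merge can be sketched in plain Python, without any statistical package. This is an illustration only: the records and the columns area, age and can_read are invented, and it assumes that every file carries the identifiers under the names HH1, HH2 and HL1, which may not hold for every MICS dataset. In practice the merge command of the chosen software would be used instead.

```python
# Hypothetical extracts of three datasets (not real MICS records).
households = [
    {"HH1": 1, "HH2": 1, "area": "Urban"},
    {"HH1": 1, "HH2": 2, "area": "Rural"},
]
members = [
    {"HH1": 1, "HH2": 1, "HL1": 1, "age": 34},
    {"HH1": 1, "HH2": 1, "HL1": 2, "age": 9},
    {"HH1": 1, "HH2": 2, "HL1": 1, "age": 12},
]
women = [
    {"HH1": 1, "HH2": 1, "HL1": 1, "can_read": True},
]

# Index each source by its unique identifier: cluster and household number
# for household-level data, plus the respondent number for individual-level data.
hh_index = {(h["HH1"], h["HH2"]): h for h in households}
wm_index = {(w["HH1"], w["HH2"], w["HL1"]): w for w in women}

# Build the master file: one row per household member, enriched with the
# matching household record and, where one exists, the matching woman record.
master = []
for m in members:
    row = dict(m)
    row.update(hh_index.get((m["HH1"], m["HH2"]), {}))
    row.update(wm_index.get((m["HH1"], m["HH2"], m["HL1"]), {}))
    master.append(row)

for row in master:
    print(row)
```

Members without a matching individual record simply keep their household member variables, which mirrors a one-to-one merge that retains unmatched cases.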

Figure 4 below illustrates a master file obtained from merging the HH, HL, WM, MN and CH datasets.
The main panel shows the variables and the cases, while the right side shows the variables alongside
their labels, as well as some information on the dataset, e.g., that it contains 136 variables and 75,015
observations (respondents), and the memory it occupies.

3 In the example featured in Figure 4, these three identifiers are respectively HH1, HH2 and HL1. For more information on
merging MICS datasets, please refer to the document here:
https://mics.unicef.org/files?job=W1siZiIsIjIwMTkvMDQvMDEvMTQvMDAvMjQvNzM0L0ZBUV9NZXJnaW5nX01JQ1NfZGF0YV9maWxlc18yMDE5MDIyOC5kb2N4Il1d&sha=c3117037366d1001.

Table 3: Questionnaires and datasets in MICS6

HOUSEHOLD QUESTIONNAIRE (datasets: hh.sav, hl.sav, tn.sav)
- Dataset panels: HH Household Information Panel (hh.sav); HL List of Household Members (hl.sav); TN Insecticide Treated Nets (tn.sav)
- Modules: HC Household Characteristics; ED Education; ST Social Transfers; EU Household Energy Use; TN Insecticide Treated Nets; WS Water and Sanitation; HW Handwashing; SA Salt Iodisation; WQ Water quality

QUESTIONNAIRE FOR INDIVIDUAL WOMEN AGE 15-49 YEARS (datasets: wm.sav, bh.sav, fg.sav, mm.sav)
- Dataset panels: WM Woman’s Information Panel (wm.sav); BH Birth History (bh.sav); FG Female Genital Mutilation (fg.sav); MM Maternal Mortality (mm.sav)
- Modules: WB Woman’s Background; MT Mass Media and ICT; CM Fertility; DB Desire for Last Birth; MN Maternal and Newborn Health; PN Post-natal Health Checks; CP Contraception; UN Unmet Need; DV Attitudes Toward Domestic Violence; VT Victimisation; MA Marriage/Union; AF Adult Functioning; SB Sexual Behaviour; HA HIV/AIDS; TA Tobacco and Alcohol Use; LS Life Satisfaction

QUESTIONNAIRE FOR CHILDREN UNDER FIVE (dataset: ch.sav)
- Dataset panel: UF Under-Five Child Information Panel
- Modules: UB Under-Five’s Background; BR Birth Registration; EC Early Childhood Development; UCD Child Discipline; UCF Child Functioning; BD Breastfeeding and Dietary Intake; IM Immunisation; CA Care of Illness; AN Anthropometry; HF Vaccination records at health facility

QUESTIONNAIRE FOR INDIVIDUAL MEN AGE 15-49 YEARS (dataset: mn.sav)
- Dataset panel: MWM Man’s Information Panel
- Modules: MWB Man’s Background; MMT Mass Media and ICT; MCM Fertility; MDV Attitudes Toward Domestic Violence; MVT Victimisation; MMA Marriage/Union; MAF Adult Functioning; MSB Sexual Behaviour; MHA HIV/AIDS; MMC Circumcision; MTA Tobacco and Alcohol Use; MLS Life Satisfaction

QUESTIONNAIRE FOR CHILDREN AGE 5-17 YEARS (dataset: fs.sav)
- Dataset panel: FS 5-17 Child Information Panel
- Modules: CB Child’s Background; CL Child Labour; FCD Child Discipline [5-14]; FCF Child Functioning; PR Parental Involvement; FL Foundational Learning Skills

Figure 4: Example of MICS Data (Processed after collection)

Source: Sierra Leone MICS 6 data set


https://mics-surveys-prod.s3.amazonaws.com/MICS6/West%20and%20Central%20Africa/Sierra%20Leone/2017/Datasets/Sierra%20Leone%20MICS6%20Datasets.zip

3.2 USE OF SAMPLING WEIGHTS
As explained in Module 1, some household surveys, because of their sampling design, need to be
analysed with sampling weights for the analysis and its generalization to the whole population to be
accurate. This is the case for MICS surveys. The weights are needed to adjust for the selection bias that
occurs when the sample is drawn, so that the estimates produced are representative of the survey
population. These weights are associated with each individual and represent the “importance” of
each individual in the calculation. An individual will be given a greater weight if their characteristics
(age, region, sex, etc.) are under-represented in the sample compared to the whole population.

The use of weights slightly modifies the calculation of the main statistics described in Unit 1 and of the
indicators in Unit 2: instead of counting individuals (children enrolled in an education level, or children
of a certain age group), the analyst needs to add up the weights of these individuals (not forgetting to
do this for both the numerator and the denominator).

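For instance, the Adjusted Net Attendance Rate (ANAR) retains its definition and formula:

$$ANAR_n = \frac{\text{children of the official age for level } n \text{ attending level } n \text{ or higher}}{\text{population of the official age for level } n}$$

But when calculating it with weights:

Numerator = weighted sum of students of the official age for level n of education attending that level of education or higher
Denominator = sum of the weights of children in the official age range for level n of education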
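
The weighted calculation can be sketched as follows. This is a minimal illustration under stated assumptions: the records, the key level_attending and the level labels are hypothetical, the weight is assumed to be hhweight as in the HL dataset, and the primary age range of 6 to 11 is only an example.

```python
# Order of education levels, used to test "that level or higher".
LEVEL_ORDER = {
    "pre_primary": 0,
    "primary": 1,
    "lower_secondary": 2,
    "upper_secondary": 3,
    "higher": 4,
}


def weighted_anar(records, level, age_min, age_max):
    """Sum of weights of children of official age attending `level` or higher,
    divided by the sum of weights of all children of official age."""
    numerator = 0.0
    denominator = 0.0
    for r in records:
        if not age_min <= r["age"] <= age_max:
            continue  # outside the official age range for this level
        denominator += r["hhweight"]
        attending = r["level_attending"]  # None when the child is not attending
        if attending is not None and LEVEL_ORDER[attending] >= LEVEL_ORDER[level]:
            numerator += r["hhweight"]
    if denominator == 0:
        raise ValueError("no children in the official age range")
    return numerator / denominator


# Hypothetical household members (not real MICS records).
members = [
    {"age": 7, "level_attending": "primary", "hhweight": 2.0},
    {"age": 10, "level_attending": "lower_secondary", "hhweight": 1.0},
    {"age": 9, "level_attending": None, "hhweight": 0.5},
    {"age": 11, "level_attending": "pre_primary", "hhweight": 0.5},
    {"age": 13, "level_attending": "primary", "hhweight": 2.0},
]
anar_primary = weighted_anar(members, "primary", 6, 11)
print(anar_primary)
```

In this invented sample a simple count would give 2 children out of 4, whereas the weighted sums give 3.0 out of 4.0, which shows why the weights must be applied to both the numerator and the denominator.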

In practice, these calculations with weights are generally done in statistical software. However, for
simple indicator calculations (not calculating variances or conducting statistical tests, for instance),
Pivot Tables in Excel can be used, by placing the weights in the Values field and selecting Sum instead
of Count under “Summarize values by…”.
The following section highlights the general steps to carry out these calculations, whether in
statistical software or in Excel.

3.3 HOUSEHOLD DATA ANALYSIS AND INTERPRETATION


Once all relevant data sets are merged4 and an analyst is satisfied that all data required for the specific
analysis are ready, the next step is to conduct the calculations of the indicators according to the
definitions and formulas presented in Unit 2. In this subsection, we will go through the steps for
analyzing two indicators: Completion Rate (CR) and Out-of-School Children Rate (OSR) and follow that
with a snapshot of results from the MICS6 survey from Sierra Leone. We will use the results to discuss
how to interpret the data obtained from the analysis. These examples will be useful as you perform
the group work, where you will be asked to create charts and provide interpretation of the results for
three indicators: completion rates, foundational skills, and out of school rates.

4 In practice, the various datasets are not merged in their entirety into the master dataset, as this would create too
large a file, which would slow down computations. Only the relevant variables are included from each dataset.

The document entitled “MICS_GoFurther_M2_Unit3”5 contains the steps for analyzing a number of
other education indicators. We encourage you to practice these steps, including filling out the
interpretations for each of the charts.

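1. Completion Rate (CR). To compute the proportion of a cohort three to five years older than
the intended age for the last grade of each level of education (primary, lower secondary, or
upper secondary) who have completed that level of education, we first need to interpret the
formula given below and derive the steps from there:

$$CR_n = \frac{\text{population aged 3 to 5 years above the intended age for the last grade of level } n \text{ who have completed level } n}{\text{population aged 3 to 5 years above the intended age for the last grade of level } n}$$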

a. Step 1: Define the completion age range for the different levels of education. The
completion age range is based on the theoretical age a child would have in the last grade
of a level if they started school at the prescribed age and did not repeat a grade. For
instance, in a system with an official school entrance age of 6 and a primary cycle of 6
years, the theoretical primary completion age is 11 years. With lower/junior and
upper/senior secondary cycles of 3 years each, the theoretical completion ages are 14 and
17 years respectively. The completion rate would therefore be computed among children
aged between 11+3=14 and 11+5=16 in primary; 14+3=17 and 14+5=19 in lower/junior
secondary; and 17+3=20 and 17+5=22 in upper/senior secondary. This step yields the
denominator of the completion rate formula above.

b. Step 2: Compute the number of children/adolescents aged 3-5 years above the official
age for the last grade of the cycle in question (see footnote 6) who have completed that
cycle. According to the age ranges created in Step 1, we would be looking for
children/adolescents aged 14-16 who have completed Grade 6 of primary;
children/adolescents aged 17-19 who have completed Grade 9 of junior secondary; and
children/adolescents aged 20-22 who have completed Grade 12 of senior secondary.
These groups of children are identified by combining:
i. ED5A and ED5B: highest education level and grade attended
ii. ED6: whether that grade was completed.
Note that in this case we are not using ED9, since we are not interested in whether
children are still attending school. This step yields the numerator of the completion
rate formula above.

c. Step 3: Divide the weighted sum of children in the reference age groups who have
completed the different levels of education (Step 2) by the sum of weights of the
population of children or youth in those age groups (Step 1). Because survey data can be
disaggregated, this step can be repeated for different groups, i.e., by sex (male, female),
location (urban, rural), and socioeconomic status (poorest, second, middle, fourth,
richest). Table 4 below presents the sum of weights of the numerators and denominators
in each disaggregation level.

5 See the document MICS_GoFurther_M2_Unit3 in the “Selected Readings” section of Module 2.

6 The reason why the completion rate is calculated on children aged 3-5 years older than the official completion age rather
than the official completion age itself is to recognize the fact that in many countries children may enter school later than the
official age, repeat, and/or drop out and re-enrol, resulting in an actual completion happening (when it does) several years
after it theoretically should. It is however limited in range (3-5 years, and not 3-10 years, for instance), to capture the situation
of children at a given time rather than grouping together too many age cohorts.

Table 4: Population of completers and children who complete primary and secondary education
           Population expected to have completed     Expected completers who have completed     Completion Rate
           Primary   Lower sec.   Upper sec.         Primary   Lower sec.   Upper sec.          Primary   Lower sec.   Upper sec.
Overall    4,457     4,627        3,461              2,861     2,046        682
Male       2,146     1,974        1,507              1,358     934          349
Female     2,312     2,653        1,954              1,504     1,112        334
Urban      2,289     2,537        2,083              1,896     1,638        614
Rural      2,169     2,090        1,378              965       407          68
Poorest    659       602          406                216       50           7
Second     766       778          499                308       121          18
Middle     984       905          580                607       319          62
Fourth     964       1,064        856                787       587          193
Richest    1,084     1,277        1,121              943       969          403

Source: Sierra Leone Multiple Indicator Cluster Survey, 2017

d. Step 4: Prepare appropriate visualization and interpret the results. Figure 5 illustrates
a bar chart that can be used in visualizing this indicator, computed from the data
presented in Table 4.

Figure 5: Primary Completion in Sierra Leone, 2017

[Bar chart of primary completion rates: Overall 64%; Male 63%; Female 65%; Urban 83%; Rural 44%;
Poorest 33%; Second 40%; Middle 62%; Fourth 82%; Richest 87%]

• Overall, 64% of children from Sierra Leone complete primary education.
• Primary completion rates are very similar for male and female students.
• Nearly twice as many children from urban areas complete primary school as children from rural
areas.
• Primary completion rates are closely aligned to household wealth, as only one-third of children
from the poorest households complete primary, compared to 87% of children from the
wealthiest households.

Source: Sierra Leone Multiple Indicator Cluster Survey, 2017
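
The completion rate steps can be retraced with a short Python sketch. It is an illustration only: the function name is hypothetical, the cycle lengths (6 years of primary, then 3 and 3) are those of the worked example in Step 1, and the figures are the weighted sums for primary copied from Table 4.

```python
def completion_age_range(entry_age: int, cumulative_years: int):
    """Step 1: ages 3 to 5 years above the theoretical age in the last grade."""
    theoretical_completion_age = entry_age + cumulative_years - 1
    return (theoretical_completion_age + 3, theoretical_completion_age + 5)


# Cumulative years of schooling up to the end of each level (example system).
age_ranges = {
    level: completion_age_range(6, years)
    for level, years in {"primary": 6, "lower_secondary": 9, "upper_secondary": 12}.items()
}
print(age_ranges)

# Steps 2 and 3: (weighted completers, weighted population) for primary, from Table 4.
TABLE4_PRIMARY = {
    "Overall": (2861, 4457),
    "Male": (1358, 2146),
    "Female": (1504, 2312),
    "Urban": (1896, 2289),
    "Rural": (965, 2169),
    "Poorest": (216, 659),
    "Second": (308, 766),
    "Middle": (607, 984),
    "Fourth": (787, 964),
    "Richest": (943, 1084),
}
primary_cr = {
    group: round(100 * completers / population)
    for group, (completers, population) in TABLE4_PRIMARY.items()
}
for group, rate in primary_cr.items():
    print(f"{group}: {rate}%")
```

The rounded percentages match the values shown in Figure 5, and the same division can be applied to the lower and upper secondary columns of Table 4.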

2. Out-of-school children rate (OSR). To compute the proportion of the population in the official
age range for a given level of education who are not attending school, we start by examining
the formula to identify the elements needed to establish this rate. Although the formula has
three terms, only two distinct components need to be computed (as one appears twice): the
first is the population of the official age range for a given level of education (primary,
lower/junior secondary and senior/upper secondary), and the second is the number of
children of those ages attending any level of education.

$$OSR_n = \frac{SAP_n - AAG_n}{SAP_n}$$

a. Step 1: To start off, we define the official age ranges for the three levels of education
and compute the population in each range. In most African countries, the age range for
primary is 6-11 years, 12-14 for lower secondary, and 15-17 for senior secondary.

b. Step 2: Compute, in respect of each age range, the number of children attending any
level of education. For example, for the primary age group, compute the number of
children aged 6-11 who are attending (based on variables ED9 and ED10A) pre-primary,
primary, secondary, or higher levels of education. Repeat this for the lower and upper
secondary age ranges.

c. Step 3: Subtract the sums obtained in Step 2 from those obtained in Step 1, to end up
with the number of children out of school in the three respective age ranges. It is
important not to forget sampling weights when computing the sums and differences,
as this will influence the results obtained in the end.

d. Step 4: Divide the difference obtained in Step 3 for the respective levels of education by
the respective official population. For example, divide the number of children out of
school in the primary age range by the official population of children aged 6-11 to
obtain the out-of-school rate for primary. Repeat this step for lower and upper
secondary levels. This step can be repeated for different groups, i.e., by sex (male,
female), location (urban, rural), and socioeconomic status (poorest, second, middle,
fourth, richest). Table 5 below presents the numbers of children in the different age
groups and the number of children attending school within those age groups.

Table 5: Official school ages and children from those ages attending school in Sierra Leone
           Population of children aged     Children attending school drawn from     OSR
           6-11     12-14    15-17         6-11     12-14    15-17                  6-11   12-14   15-17
Overall    12,727   5,092    5,728         10,505   4,134    3,652
Male       6,391    2,590    2,541         5,096    2,066    1,744
Female     6,336    2,501    3,187         5,405    2,068    1,907
Urban      5,386    2,474    3,110         5,049    2,311    2,548
Rural      7,341    2,617    2,618         5,271    1,736    931
Poorest    2,670    852      770           1,560    451      131
Second     2,652    925      956           1,969    619      340
Middle     2,676    1,117    1,179         2,282    912      637
Fourth     2,395    1,058    1,282         2,223    973      988
Richest    2,334    1,140    1,541         2,241    1,079    1,355
Source: Sierra Leone Multiple Indicator Cluster Survey, 2017

e. Step 5: Prepare appropriate visualization and interpret the results. Figure 6 illustrates
a bar chart that can be used in visualizing this indicator, computed from the data
presented in Table 5.

Figure 6: Out of school rates for children aged 6-11 in Sierra Leone, 2017

[Bar chart of out of school rates for children aged 6-11: Overall 17%; Male 20%; Female 15%;
Urban 6%; Rural 28%; Poorest 42%; Second 26%; Middle 15%; Fourth 7%; Richest 4%]

• Overall, 17% of children aged 6-11 in Sierra Leone are out of school.
• A greater share of male children are out of school than female children.
• Rural children are far more likely to be out of school than urban children.
• There is a correlation between wealth of the household and out of school rates. Whereas 42% of
children from the poorest households are out of school, the share drops to just 4% for
children from the richest households.

Source: Sierra Leone Multiple Indicator Cluster Survey, 2017
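
Steps 3 and 4 of the out-of-school rate can be retraced in the same way. The sketch below is illustrative: the function name is hypothetical, and the figures are the weighted sums for children aged 6-11 copied from Table 5.

```python
def out_of_school_rate(population: float, attending: float) -> float:
    """OSR = (population of official age - those attending any level) / population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return (population - attending) / population


# (weighted population aged 6-11, weighted number attending any level), from Table 5.
TABLE5_AGE_6_11 = {
    "Overall": (12727, 10505),
    "Male": (6391, 5096),
    "Female": (6336, 5405),
    "Urban": (5386, 5049),
    "Rural": (7341, 5271),
    "Poorest": (2670, 1560),
    "Second": (2652, 1969),
    "Middle": (2676, 2282),
    "Fourth": (2395, 2223),
    "Richest": (2334, 2241),
}
osr_primary_age = {
    group: round(100 * out_of_school_rate(population, attending))
    for group, (population, attending) in TABLE5_AGE_6_11.items()
}
for group, rate in osr_primary_age.items():
    print(f"{group}: {rate}%")
```

The rounded percentages match the values shown in Figure 6, and the same function can be applied to the 12-14 and 15-17 columns of Table 5.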
