

Worldwide Governance Indicators 2022 Update:

Changes in Underlying Source Data


September 23, 2022

This note describes revisions to the source data used in the 2022 update of the Worldwide Governance
Indicators (WGI) covering the period 1996-2021.

• Part 1 of this note provides a detailed description of a noteworthy one-time revision to the
underlying source data in the 2022 WGI update: the removal from the WGI source dataset of
confidential Country Policy and Institutional Assessment (CPIA) data produced by the World Bank,
the Asian Development Bank, and the African Development Bank.
• Part 2 describes a small number of corrections and revisions to other data sources in the 2022
WGI update relative to the 2021 WGI update.

Because of these revisions to data from previous years, this update of the entire WGI dataset
supersedes previous versions for all years – as is the case with each annual update of the WGI. For
reference, previous years’ versions of the full WGI dataset can be downloaded at www.govindicators.org
(Documentation tab, section on WGI data sources).

The Worldwide Governance Indicators (WGI) are a research dataset summarizing the views on the
quality of governance provided by a large number of enterprise, citizen and expert survey respondents in
industrial and developing countries. These data are gathered from a number of survey institutes, think
tanks, non-governmental organizations, international organizations, and private sector firms. The WGI
do not reflect the official views of the World Bank, its Executive Directors, or the countries they represent.
The WGI are not used by the World Bank Group to allocate resources. For questions about the 2022 WGI
Update, please contact Aart Kraay (akraay@worldbank.org).

Part 1: Removal of Confidential CPIA Data
1.1 Background

The Worldwide Governance Indicators (WGI) combine data from over 30 existing data sources into six
composite indicators of governance, covering over 200 countries and territories since 1996. The 2022
WGI update includes one noteworthy revision to the underlying source data: the removal of
confidential data from the Country Policy and Institutional Assessments (CPIA) produced by the World
Bank (WB), Asian Development Bank (ADB), and the African Development Bank (AfDB) from the WGI
source data. Publicly-available CPIA data will continue to be used as part of the source data for the WGI.

The CPIA are quantitative assessments of country policy and institutional quality prepared by country
economists working at the WB, ADB and AfDB following a structured methodology. The primary
purpose of the CPIA is to inform the resource allocation decisions of these institutions. CPIA data are
publicly available for countries eligible for concessional lending since 2004 (AfDB) and 2005 (ADB and
WB).1

CPIA data have been used in the WGI as components of four of the six aggregate indicators:
Government Effectiveness, Regulatory Quality, Rule of Law, and Control of Corruption. In addition to
the publicly-available CPIA data noted above, the WGI have used confidential CPIA data prior to
2004/2005 from all three institutions, as well as confidential CPIA data produced by the AfDB and the
WB since 2004/2005 for their non-concessional borrowers.2 All CPIA data, both confidential and public,
was used to construct the aggregate WGI indicators using the same methodology as applied to all other
WGI data sources. However, in order to respect the disclosure policies of the AfDB, ADB and WB, the
confidential CPIA data itself was not publicly disclosed. Data from all other WGI data sources are
publicly available through the WGI website.

1.2 Removal of Confidential CPIA Data for Greater Transparency

As part of the 2022 update of the WGI, all of the confidential CPIA data have been removed from the
source data for the WGI since 1996, and the full historical dataset of the six aggregate indicators has
been recalculated excluding these data points. Publicly-available CPIA data for concessional borrowers
from the three institutions continues to be used as source data for the WGI. As a result of the removal
of the confidential data, the 2022 release of the full historical WGI dataset supersedes previous
releases.3

The main reason for this change is to further enhance the transparency and replicability of the WGI.
Although the WGI methodology, together with the WGI source data (other than the confidential CPIA
data), has long been publicly available through the WGI website, the confidential CPIA data has not been
shared publicly, and its inclusion among the data sources for the WGI has prevented the publication of a
complete replication dataset and code for the WGI. Removing the confidential CPIA data makes it
possible to fully share the WGI source data as well as a full replication dataset and code for the WGI.
This replication package will be available on the WGI website at www.govindicators.org.

1 Information on the CPIA and its use in resource allocation is available here (ADB), here (AfDB), and here (WB).
2 The ADB does not produce CPIA data for its non-concessional borrowers.
3 Previous releases of the full WGI dataset are archived for reference at www.govindicators.org.

1.3 Consequences of Removal of Confidential CPIA Data

The remainder of this note briefly summarizes the changes in the aggregate indicators in the historical
WGI dataset covering the period 1996-2020 that are due to the removal of confidential CPIA data.
Removing the confidential CPIA data results in two types of changes to the data in four of the six
aggregate WGI indicators (Government Effectiveness, Regulatory Quality, Rule of Law, and Control of
Corruption):

• Country-year observations for which confidential CPIA data was used: For these country-year
observations, removal of the confidential CPIA data will result in changes in estimates of
governance and associated standard errors. The estimates of governance can increase or
decrease depending on how the CPIA data scored the country-year relative to other data
sources. The standard errors increase for all country-year observations for which confidential
CPIA data is removed, as the aggregate WGI scores are now based on less information than
before. Finally, the country-year’s percentile rank, which reflects its relative position on the
estimates of governance in that year, will also change.
• Country-year observations for which only public data was used: These country-year observations
are not directly affected by the removal of confidential CPIA data since such data was not used
to construct their aggregate scores. However, estimates of governance and standard errors for
these country-year observations change very slightly due to the fact that the aggregate WGI
indicators are standardized to have a mean of zero and standard deviation of one in each year.
The changes to the data for country-year observations where confidential CPIA data was
previously used change the coefficients used to standardize the aggregate indicators, resulting
in very slight changes in the published data for countries for which confidential CPIA data was
not used. Since the standardization affects all countries in the same way in a given year,
countries’ relative positions in that year are not affected by the change in the standardization
coefficients. However, these countries’ percentile ranks may change to the extent that other
countries (in the first bullet above) with “nearby” scores move up or down due to the removal
of their confidential CPIA data.
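
The standardization effect described in the second bullet can be illustrated with a minimal sketch. The scores below are hypothetical, and the use of the population standard deviation is an assumption, since this note does not spell out the standardization formula. With only five countries the shifts are much larger than the "very slight" changes in the actual dataset, but the mechanism is the same: unaffected countries' scores move, yet old and new scores lie on one straight line.

```python
from statistics import mean, pstdev

def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 (population sd assumed)."""
    m = mean(scores)
    s = pstdev(scores)
    return [(x - m) / s for x in scores]

# Hypothetical unstandardized scores for five countries in one year.
raw_with = [-1.2, -0.4, 0.1, 0.6, 1.5]
# Suppose removing confidential data changes only the first country's raw score.
raw_without = [-0.7] + raw_with[1:]

z_with = standardize(raw_with)
z_without = standardize(raw_without)

# Countries 1-4 had no data removed, yet their standardized scores shift
# because the yearly mean and standard deviation have changed.
changes = [b - a for a, b in zip(z_with[1:], z_without[1:])]

# The old and new scores of the unaffected countries fall on a single
# upward-sloping straight line, so their ordering among themselves is preserved.
slope = (z_without[2] - z_without[1]) / (z_with[2] - z_with[1])
intercept = z_without[1] - slope * z_with[1]
on_line = all(
    abs(slope * a + intercept - b) < 1e-9
    for a, b in zip(z_with[1:], z_without[1:])
)
```

Percentile ranks can still change for these countries, because the first country's new score may now fall on the other side of one of theirs.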

The effects of these changes are illustrated in this note by comparing the 2021 vintage of the WGI
dataset (published in September 2021) with and without the confidential CPIA data.4 Figure 1 reports
data from one of the four affected aggregate WGI indicators, Control of Corruption, for 2015 (top panel)
and 2000 (bottom panel). In each panel, the horizontal (vertical) axis reports the aggregate Control of
Corruption indicator including (removing) the confidential CPIA data. The red triangles indicate
countries where confidential CPIA data has been removed, i.e. countries in the first bullet point above.
The blue circles indicate countries where no confidential CPIA data were used, but whose scores are
affected by the updated standardization, i.e. countries in the second bullet point above. Note that for
these countries, the data points fall along a straight line, indicating no changes in countries’ relative
positions. The green line indicates the 45-degree line, with points above (below) the line corresponding
to countries for which removing confidential CPIA data increased (decreased) the country score on the
aggregate indicator.

4 This comparison isolates the effect of removing the confidential CPIA data. The 2022 update of the WGI dataset
also includes a few other minor data corrections and revisions that are described in Part 2 of this note. Readers
interested in the overall effects of all of the source data revisions described in this note may compare the 2022
vintage of the WGI dataset with the 2021 vintage of the WGI dataset – both of which are available at
www.govindicators.org. Since the effects on the aggregate indicators of the changes in Part 2 of this note are very
small, the comparison of the 2022 vintage of the WGI dataset with the 2021 vintage is quite similar to the
comparison of the 2021 vintage with and without confidential CPIA data described in this note.

Overall, removing confidential CPIA data has minimal effects on country scores in the top panel of Figure
1 with data for 2015. The red triangles are clustered very close to the 45-degree line, indicating only
minimal changes due to the removal of confidential CPIA data. Since the scores for these countries
change only minimally, the same is true of the standardization coefficients, and therefore the scores for
all other countries also barely change at all. The changes due to removal of confidential CPIA data are
on average somewhat larger (in absolute value) in 2000 and affect more countries, as shown in the
bottom panel of Figure 1. However, the aggregate WGI indicators constructed with and without
confidential CPIA data remain highly correlated.

The difference between the two panels of Figure 1 reflects two factors:

• Fewer affected countries post-2004. The top panel with data from 2015 illustrates the post-
2004 period where confidential CPIA data was used only for non-concessional borrowers from
the AfDB and WB. Data for concessional borrowers from all three institutions was publicly
available during this period and therefore is not affected by the change. In contrast, the
bottom panel with data from 2000 illustrates the pre-2004 period where the CPIA data for all
countries for all three institutions was confidential – this accounts for the much larger number
of countries affected in the earlier period.
• More data sources post-2004. The number of data sources on which the aggregate WGI
indicators are based has expanded significantly over time. For example, for Control of
Corruption, the median country-year observation is based on 6 underlying data sources
between 1996 and 2004, but 10 underlying data sources between 2005 and 2020. When more
data sources are available for a country, removing the confidential CPIA data for that country
will on average have less effect on the aggregate indicator.
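
The second factor can be made concrete with a simplified arithmetic sketch. This note does not describe the WGI aggregation formula, so the example substitutes a plain equal-weighted average; the function name and the `gap` value are hypothetical. It only shows why dropping one source matters less when more sources remain.

```python
def shift_from_removing_one(n_sources, gap):
    """Change in an equal-weighted mean when one of n_sources is dropped.

    ASSUMPTION: the dropped source scores `gap` above the remaining sources,
    which all agree on a score of 0. This is not the WGI aggregation method.
    """
    others = [0.0] * (n_sources - 1)
    mean_with = (sum(others) + gap) / n_sources
    mean_without = sum(others) / (n_sources - 1)
    return mean_without - mean_with

# Median source counts quoted above for Control of Corruption: 6 and 10.
for n in (6, 10):
    print(n, round(shift_from_removing_one(n, gap=1.0), 3))
```

Under these assumptions the aggregate moves by one sixth of the gap with 6 sources, but only one tenth of the gap with 10.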

Moving beyond the illustrative example in Figure 1, Table 1 systematically reports the average effect of
removing confidential CPIA data for the four affected aggregate WGI indicators. For each of the four
indicators, and for every year between 1996 and 2020, it reports six statistics:

• The total number of countries in the 2021 vintage of the WGI data
• The number of countries for which confidential CPIA data has been removed
• The number of cases where the resulting change in the aggregate indicator is significant at the
10% level
• The correlation between the aggregate WGI indicator with and without confidential CPIA data,
for all countries
• The correlation between the aggregate WGI indicator with and without confidential CPIA data,
only for those countries for which confidential CPIA data has been removed
• The correlation between the aggregate WGI indicator with and without confidential CPIA data,
only for countries for which confidential CPIA data has been removed and that had three or
fewer data sources in the original WGI dataset. This correlation is reported only if there are at
least five countries in this group for a given indicator and year, and usually involves only a small
number of countries.
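
As a rough illustration of how the three correlation statistics could be computed for one indicator and year, the sketch below uses hypothetical records and a hand-written Pearson correlation. The record layout, the function names, and the use of `None` for an unreported value are assumptions made for illustration, not the authors' replication code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical records for one indicator-year:
# (score with CPIA, score without CPIA, CPIA data removed?, sources in original dataset)
records = [
    (-1.10, -0.95, True, 3),
    (-0.60, -0.70, True, 5),
    (-0.20, -0.05, True, 2),
    (0.30, 0.25, True, 6),
    (0.90, 0.92, False, 9),
    (1.40, 1.43, False, 11),
    (-0.80, -0.78, False, 8),
    (0.10, 0.12, False, 7),
]

def table1_correlations(rows, min_group=5):
    """Correlations for all countries, affected countries, and sparse affected countries."""
    def corr(pairs):
        return pearson([p[0] for p in pairs], [p[1] for p in pairs])

    all_pairs = [(w, wo) for w, wo, _, _ in rows]
    removed = [(w, wo) for w, wo, r, _ in rows if r]
    sparse = [(w, wo) for w, wo, r, n in rows if r and n <= 3]
    return {
        "all": corr(all_pairs),
        "removed": corr(removed),
        # Reported only when at least `min_group` countries are in the group.
        "removed_few_sources": corr(sparse) if len(sparse) >= min_group else None,
    }

result = table1_correlations(records)
```

Here only two affected countries have three or fewer sources, so the last statistic is left unreported, mirroring the five-country threshold described above.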

The patterns in Table 1 are consistent with those illustrated in Figure 1. Overall, the correlations
between the aggregate indicators calculated with and without the confidential CPIA data are very high.
Focusing on the group of countries with confidential CPIA data removed, the correlations average to
0.98 in the 1998-2004 period and 0.99 in the 2005-2020 period. As expected, the correlations are
somewhat lower in the last column which focuses on countries that start out with three or fewer data
sources. However, there also are very few countries in this group, which consists primarily of small
island states.

While these high correlations indicate that on average the changes in the data due to the removal of
confidential CPIA data are small, for some countries the changes are non-trivial – particularly in the
1998-2004 period. However, as with all other comparisons involving the aggregate WGI indicators, it is
important to take into account the margins of error that accompany the estimates of governance in
order to assess the significance of the change. The third column of Table 1 reports the number of cases
in each year where the point estimate of governance including confidential CPIA data is more than 1.64
standard errors away from the point estimate of governance excluding confidential CPIA data. This
corresponds to a change that is significant at the 10% level. Looking across all four indicators and all
years, there are just 26 such cases, representing just 0.4% of the total of 6816 cases where confidential
CPIA data has been removed. These 26 cases are listed in Table 2. Notably, over half of them (18) are in
just one country, Seychelles, for which relatively few other data sources are available.
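
The significance check can be sketched as a two-sided normal test on the gap between the two point estimates. The note does not state which standard error is used as the yardstick; the sketch assumes it is the standard error of the estimate excluding confidential CPIA data, which appears consistent with the p-values in Table 2 but remains an assumption. The function name is hypothetical.

```python
from math import erfc, sqrt

def two_sided_p(est_with, est_without, se):
    """Two-sided normal p-value for the gap between two estimates, measured in units of `se`."""
    z = abs(est_without - est_with) / se
    return erfc(z / sqrt(2))

# Table 2, Government Effectiveness, Seychelles 2008:
# estimate 0.09 with and 0.72 without confidential CPIA data.
# ASSUMPTION: the yardstick is the standard error without CPIA data (0.33).
p = two_sided_p(0.09, 0.72, se=0.33)

# A p-value below 0.10 corresponds to a gap of more than about 1.64 standard errors.
significant = p < 0.10

# Share of flagged cases among all cases where confidential data was removed.
share = 26 / 6816
```

Under this assumption the computed p-value is roughly 0.056, close to the 0.06 shown for this row in Table 2, and the share of flagged cases rounds to 0.4%.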

***********************

Overall, for the vast majority of country-year observations, removal of confidential CPIA data has
minimal effects on the four affected aggregate WGI indicators. The few cases of larger changes in the
aggregate WGI indicators are primarily among small economies with relatively few other data sources.
Removing confidential CPIA data makes it possible to disclose all of the WGI source data to all users.
Removing confidential CPIA data also makes it possible to publish a complete replication dataset and
code for the WGI, which is available at www.govindicators.org.

Figure 1: Removing Confidential CPIA Data from the WGI: Two Illustrative Cases

[Figure: two scatter plots, Control of Corruption, 2015 (top panel) and Control of Corruption, 2000 (bottom panel). Horizontal axis: Including Confidential CPIA Data. Both axes run from -2 to 3.]

Table 1a: Government Effectiveness – Consequences of Removal of Confidential CPIA Data

Columns, left to right: Year; Number of Countries; Number of Countries with Confidential CPIA Data; Number of Significant Changes at 10% Level Due to Removal of Confidential CPIA Data; Correlation: All Countries; Correlation: Countries with Removed CPIA Data; Correlation: Countries with Removed CPIA Data and 3 or Fewer Sources.

1998 194 136 0 0.995 0.983 0.977
2000 196 138 2 0.992 0.974 0.959
2002 197 139 0 0.996 0.986 0.955
2003 197 141 0 0.994 0.980 0.957
2004 204 138 1 0.993 0.975 0.953
2005 205 62 0 0.998 0.989 ..
2006 206 62 0 0.997 0.982 ..
2007 207 66 1 0.997 0.983 ..
2008 207 69 1 0.996 0.978 ..
2009 210 62 1 0.997 0.977 ..
2010 210 61 0 0.997 0.978 ..
2011 212 60 0 0.997 0.980 ..
2012 212 58 0 0.998 0.984 ..
2013 212 58 0 0.999 0.989 ..
2014 209 62 0 0.998 0.989 ..
2015 209 63 0 0.998 0.989 ..
2016 209 67 0 0.998 0.992 ..
2017 209 67 0 0.998 0.990 ..
2018 209 62 0 0.998 0.986 ..
2019 209 67 0 0.998 0.982 ..
2020 209 66 0 0.998 0.991 ..

Averages
1998-2004 198 138 0.6 0.994 0.979 0.960
2005-2020 209 63 0.2 0.998 0.985 ..

Table 1b: Regulatory Quality – Consequences of Removal of Confidential CPIA Data

Columns, left to right: Year; Number of Countries; Number of Countries with Confidential CPIA Data; Number of Significant Changes at 10% Level Due to Removal of Confidential CPIA Data; Correlation: All Countries; Correlation: Countries with Removed CPIA Data; Correlation: Countries with Removed CPIA Data and 3 or Fewer Sources.

1998 194 136 1 0.989 0.971 0.928
2000 196 138 1 0.990 0.972 0.933
2002 197 139 1 0.994 0.982 0.846
2003 197 141 1 0.997 0.991 0.944
2004 204 138 1 0.996 0.988 0.944
2005 205 62 1 0.999 0.992 ..
2006 205 62 1 0.997 0.986 ..
2007 207 66 1 0.997 0.986 ..
2008 207 69 1 0.997 0.987 ..
2009 210 62 1 0.998 0.987 ..
2010 210 61 1 0.997 0.984 ..
2011 212 60 0 0.998 0.987 ..
2012 212 58 0 0.998 0.987 ..
2013 212 58 0 0.998 0.989 ..
2014 209 62 0 0.999 0.992 ..
2015 209 63 0 0.998 0.991 ..
2016 209 67 0 0.998 0.991 ..
2017 209 67 0 0.998 0.991 ..
2018 209 62 0 0.998 0.991 ..
2019 209 67 0 0.999 0.992 ..
2020 209 66 0 0.998 0.992 ..

Table 1c: Rule of Law – Consequences of Removal of Confidential CPIA Data

Columns, left to right: Year; Number of Countries; Number of Countries with Confidential CPIA Data; Number of Significant Changes at 10% Level Due to Removal of Confidential CPIA Data; Correlation: All Countries; Correlation: Countries with Removed CPIA Data; Correlation: Countries with Removed CPIA Data and 3 or Fewer Sources.

1998 201 136 0 0.991 0.979 0.695
2000 203 138 1 0.989 0.973 0.124
2002 203 139 0 0.989 0.971 0.665
2003 203 141 1 0.987 0.967 0.624
2004 210 138 1 0.994 0.983 ..
2005 210 62 1 0.998 0.984 ..
2006 210 62 0 0.998 0.985 ..
2007 210 66 0 0.997 0.980 ..
2008 209 69 1 0.998 0.990 ..
2009 212 62 0 0.998 0.986 ..
2010 212 61 0 0.998 0.985 ..
2011 214 60 0 0.999 0.990 ..
2012 214 58 0 0.999 0.994 ..
2013 214 58 0 0.999 0.995 ..
2014 209 62 0 0.998 0.986 ..
2015 209 63 0 0.999 0.996 ..
2016 209 67 0 0.999 0.995 ..
2017 209 67 0 0.999 0.996 ..
2018 209 62 0 0.999 0.995 ..
2019 209 67 0 0.999 0.990 ..
2020 209 66 0 0.999 0.997 ..

Averages
1998-2004 204 138 0.6 0.990 0.975 0.527
2005-2020 210.5 63 0.1 0.999 0.990 ..

Table 1d: Control of Corruption – Consequences of Removal of Confidential CPIA Data

Columns, left to right: Year; Number of Countries; Number of Countries with Confidential CPIA Data; Number of Significant Changes at 10% Level Due to Removal of Confidential CPIA Data; Correlation: All Countries; Correlation: Countries with Removed CPIA Data; Correlation: Countries with Removed CPIA Data and 3 or Fewer Sources.

1998 195 136 0 0.995 0.983 0.953
2000 198 138 1 0.991 0.969 0.845
2002 199 139 2 0.992 0.970 0.759
2003 199 141 1 0.994 0.979 0.689
2004 206 138 0 0.996 0.986 0.777
2005 206 62 0 0.999 0.994 ..
2006 206 62 0 0.999 0.992 ..
2007 207 66 0 0.999 0.992 ..
2008 207 69 0 0.999 0.993 ..
2009 210 62 0 0.999 0.990 ..
2010 211 61 0 0.999 0.989 ..
2011 212 60 0 0.999 0.991 ..
2012 212 58 0 0.999 0.992 ..
2013 212 58 0 0.999 0.992 ..
2014 209 62 0 0.999 0.995 ..
2015 209 63 0 0.999 0.994 ..
2016 209 67 0 0.999 0.993 ..
2017 209 67 0 0.999 0.994 ..
2018 209 62 0 0.999 0.994 ..
2019 209 67 0 0.999 0.995 ..
2020 209 66 0 0.999 0.997 ..

Averages
1998-2004 199 138 0.8 0.994 0.977 0.805
2005-2020 209.125 63 0.0 0.999 0.993 ..

Table 2: Cases with Statistically Significant Changes in Estimates of Governance Due to Removal of
Confidential CPIA Data

Columns, left to right: Country; Year; Estimate with Confidential CPIA Data; Estimate without Confidential CPIA Data; Standard Error with Confidential CPIA Data; Standard Error without Confidential CPIA Data; Number of Sources with Confidential CPIA Data; Number of Sources without Confidential CPIA Data; P-Value for Difference.

Government Effectiveness
Equatorial Guinea 2000 -1.51 -1.06 0.23 0.27 4 2 0.09
Seychelles 2000 0.06 0.62 0.26 0.33 3 1 0.09
Seychelles 2004 0.04 0.50 0.21 0.26 5 3 0.08
Seychelles 2007 0.08 0.61 0.25 0.32 5 3 0.10
Seychelles 2008 0.09 0.72 0.24 0.33 4 2 0.06
Seychelles 2009 0.13 0.66 0.23 0.30 5 3 0.08

Regulatory Quality
Seychelles 1998 -0.46 0.30 0.31 0.44 3 1 0.08
Seychelles 2000 -0.81 0.28 0.29 0.42 3 1 0.01
Seychelles 2002 -0.65 0.46 0.30 0.37 3 1 0.00
Seychelles 2003 -0.12 0.68 0.24 0.28 3 1 0.00
Seychelles 2004 -0.87 -0.17 0.21 0.26 5 3 0.01
Seychelles 2005 -0.03 0.45 0.18 0.20 5 3 0.01
Seychelles 2006 -0.62 0.19 0.22 0.27 5 3 0.00
Seychelles 2007 -0.85 0.01 0.22 0.30 5 3 0.00
Seychelles 2008 -0.66 -0.05 0.21 0.29 5 3 0.03
Seychelles 2009 -0.55 -0.01 0.20 0.26 6 4 0.04
Seychelles 2010 -0.50 -0.02 0.20 0.27 6 4 0.07

Rule of Law
Micronesia, Fed. Sts. 2004 0.44 1.01 0.26 0.33 4 2 0.08
Equatorial Guinea 2008 -1.27 -0.87 0.17 0.20 8 6 0.04
Solomon Islands 2000 -0.58 0.57 0.39 0.54 4 2 0.03
Seychelles 2003 0.11 0.76 0.27 0.32 5 3 0.04
Seychelles 2005 0.03 0.40 0.19 0.21 7 5 0.08

Control of Corruption
Bhutan 2002 1.04 0.45 0.29 0.35 4 2 0.09
Ethiopia 2000 -0.45 -0.91 0.22 0.25 6 4 0.07
Solomon Islands 2002 -0.71 0.18 0.36 0.54 3 1 0.10
Solomon Islands 2003 -0.63 0.21 0.35 0.51 3 1 0.10

Part 2: Other Corrections/Revisions to WGI Data Sources
The 2022 WGI update also includes the following revisions to the underlying source data relative to the
2021 WGI update:

1. Afrobarometer (AFR). The 2022 update of the WGI uses data from Round 8 of Afrobarometer
for 2019, 2020, and 2021. Afrobarometer Round 8 surveys were in the field during these three
years, and the data was not available at the time of the 2021 WGI update. In the 2021 WGI
update, data from Afrobarometer Round 7 was used for 2017, 2018, 2019 and 2020. Replacing
the 2019 and 2020 data with Afrobarometer Round 8 data introduces changes in the WGI
dataset for Voice and Accountability, Government Effectiveness, Rule of Law, and Control of
Corruption for 2019 and 2020.
2. Freedom House (FRH). In the 2021 WGI update, data for West Bank and Gaza was inadvertently
excluded from the WGI dataset for 2020. This has been corrected in the 2022 update, with data
reflecting the average of scores provided for West Bank and Gaza. This affects Voice and
Accountability only.
3. Global Integrity Index (GII). In the 2021 WGI update, data for 2008 was incorrectly used for
Serbia in 2009 and 2010. This has been corrected in the 2022 WGI update, affecting the 2009
and 2010 data for Voice and Accountability, Rule of Law, and Control of Corruption.
4. Political Terror Scale (HUM). At the time of the 2021 WGI update, data from the Political Terror
Scale was only available through 2019, and so we used the 2019 data for 2019 and 2020. At the
time of the 2022 WGI update, data from this source for 2020 is available. We use this data for
2020, and also carry it forward for 2021. This revision affects only the Political Stability and
Absence of Violence/Terrorism indicator for 2020. In addition, data for West Bank and Gaza was
inadvertently excluded from the data for 1996-2009. This has been corrected in the 2022
update.
5. Africa Electoral Index (IRP). At the time of the 2021 WGI update there was no new data
available from this source for 2020, so the 2019 data were used for 2019 and 2020. At the time
of the 2022 WGI update a new dataset is available with data for 2020 and 2021. This has been
used to revise the previous WGI data for 2020 and as new data for 2021. This introduces
changes in the data for 2020 relative to the previous version of the WGI dataset. In addition,
the latest version of the IRP data includes some corrections made by the producing organization
to data for earlier years, which affect the data for Botswana (2014-2018), Cameroon (2019),
Mali (2016-2017), Mauritius (2019), and Tunisia (2019). All these changes affect Voice and
Accountability only.
6. Latinobarometer (LBO). The 2021 update of the WGI used data from the 2018 round of
Latinobarometer for 2018, 2019, and 2020. The 2022 update of the WGI uses data from the
2020 round of Latinobarometer for both 2020 and 2021, corresponding to the period when this survey
was in the field. This results in changes to the 2020 data relative to the 2021 WGI update. In
addition, the data for 2018 and 2019 were revised to correctly account for sampling weights
using Latinobarometer’s Online Data Analysis tool. This introduces small changes to the data for
this source in these two years. These changes affect Voice and Accountability, Government
Effectiveness, Rule of Law, and Control of Corruption.
7. Trafficking in Persons Report (TPR). The data for Israel in 2020 was recorded incorrectly. This
has been corrected in the 2022 update and affects the data for Rule of Law.

These revisions have minimal effects on the aggregate indicators for the affected countries and years.
