
Building indexes of Vulnerability: a sensitivity analysis of the Social Vulnerability Index


Authors:
Mathew C. Schmidtlein, Roland Deutsch, Walter W. Piegorsch, Susan L. Cutter

Abstract
In 2003, the Social Vulnerability Index (SOVI), which utilizes Principal
Component Analysis (PCA), was created by Cutter, Boruff, and Shirley to examine
spatial patterns of social vulnerability at the county level in the United States. This paper
seeks to identify the sensitivity of this approach to changes in its construction, the scale at
which it is applied, and to various geographic contexts. To determine the impact of
scalar changes, the SOVI was calculated for multiple aggregation levels in the state of
South Carolina. To examine the sensitivity of the algorithm to changes in construction,
and determine if that sensitivity was constant in various geographic contexts, census data
was collected at a sub-metropolitan level for portions of three U.S. cities. Fifty-four
unique variations of the SOVI were calculated for each area. Each set of indexes was
then evaluated using factorial analysis to see if substantial changes in assigned values
occurred. These results were then compared across study areas to evaluate the impact of
changing geographic context. While decreases in the scale of aggregation were found to
result in decreases in the variability explained by the PCA, and increases in the variance
of the resulting index values, the subjective interpretations yielded from the SOVI
remained fairly stable. The algorithm was found to be sensitive to certain changes in
index construction, which differed somewhat between study areas. Understanding the
impacts of changes in index construction and scale is crucial in increasing confidence in
attempts to represent the extremely complex phenomenon of social vulnerability.

Introduction
The impact of Hurricane Katrina on the Gulf Coast, and in New Orleans in
particular, was perhaps the most visible recent demonstration of the need for an
integrative vulnerability science approach to hazards research in the U.S. (Cutter 2003).
While the pre-hurricane evacuation from New Orleans was judged by some to be a
success when measured by the volume of evacuees who were able to flee in a limited
time and over limited highways (Wolshon, Catarella-Michel, and Lambert 2006), it was a
devastating failure for the socially vulnerable populations that were unable to leave (Cutter
and Emrich 2006). It is estimated that some 250,000 residents in New Orleans had no
access to personal vehicles; even if regional buses had been used in the evacuation, they
could have accommodated less than 10% of this number in a single trip (Wolshon et al.
2005). Moreover, these figures do not address the special needs and institutionalized
populations, portions of which accounted for about 10% of the fatalities from Katrina in
New Orleans (Schmidlin 2006). These problems highlight a need to better integrate
social science research concerning social vulnerability into emergency and risk
management approaches. This will allow planners to better identify what and where
problems exist before an event occurs, and provide insight as to what steps may be useful
in remedying them (Chakraborty, Tobin, and Montz 2005).
Towards this end, one approach for identifying the locations of populations with
high levels of social vulnerability is the Social Vulnerability Index, or SOVI (Cutter,
Boruff, and Shirley 2003). SOVI can be used to effectively quantify variations in the
relative levels of social vulnerability across an area. But as with other comparable
indicators, it is difficult to systematically assess its veracity. Because of the complexity
of factors contributing to vulnerability, no variable has yet been identified against which
to fully validate such indices. An alternative approach to assessing the robustness of the
index is to identify how sensitive its outcome is to changes in its construction.
Factors that may have a large influence on the outcome
of the index include changes in the set of variables used for index construction,
differences in scale of analysis, and changes in the subjective decisions made in the index
algorithm. When we have a greater understanding of the way the index responds to these
changes, we can more confidently interpret and implement the results. The purpose of
this paper is to conduct such an assessment. Three research questions are asked. First,
what is the impact of changes in variable set and analytical scale on the index results?
Next, how robust is the social vulnerability index to changes in its algorithmic approach?
Finally, does the index behave similarly in all settings, or do changes in geographic
context result in differing behavior? The first question will be answered by comparing
indices calculated with varying variable sets and at differing scales within the state of
South Carolina. The final two questions will be answered by constructing a set of social
vulnerability indices calculated with varying algorithm criteria in three locations: the
Charleston-North Charleston Metropolitan area in South Carolina, in Los Angeles
County, California, and finally in Orleans Parish, Louisiana.

Social Vulnerability
While hazards and disasters researchers have long understood that human
decisions have an influence on the outcome of hazard events (Kates 1971; Mileti 1980), it
has only been within the past decade or so that vulnerability as an explicit concept has
begun to be broadly recognized (Blaikie et al. 1994). While multiple definitions of
vulnerability have been proposed (Cutter 1996), here we define it as the likelihood of
sustaining losses from some actual or potential hazard event, as well as the ability to
recover from that loss.
Contributions to vulnerability can be divided into two broad categories:
biophysical vulnerability, or those characteristics of events and physical contexts that
influence the likelihood of losses and ability to recover, and social vulnerability, those
characteristics of society that influence these outcomes (Cutter 1996). Hazards, risk, and
disaster researchers have often focused primarily on elements related to biophysical
vulnerability, because they are relatively less complex than those related to social
vulnerability (Cutter, Boruff, and Shirley 2003). Social vulnerability, however, provides
greater insight into the manner in which the decisions we make as a society influence our
differential experience of hazard events. Social vulnerability stems from limited access
to resources and political power, social capital, beliefs and customs, physical limitations
of the population, and characteristics of the built environment such as building stock and
age, and the type and density of infrastructure and lifelines (Cutter, Boruff, and Shirley
2003).
Cutter, Boruff, and Shirley (2003) identify social indicators work in the 1960s and
1970s, and later quality-of-life indicators research, as the antecedents of current efforts to
model social vulnerability. More recent examples of social vulnerability modeling have
been based on limited representations of the social characteristics involved (Cutter,
Mitchell, and Scott 2000; Wu, Yarnal, and Fisher 2002; Chakraborty, Tobin, and Montz
2005; Wood and Good 2004) or considered only particular aspects of vulnerability (Luers
et al. 2003). Others center on international indicators of human development not
available at sub-national levels (Bohle, Downing, and Watts 1994; Inter-American
Development Bank 2005), or focus primarily on novel methodological approaches
(Rashed and Weeks 2003).
The social vulnerability modeling approach that we assess in this paper, the Social
Vulnerability Index (Cutter, Boruff, and Shirley 2003), was developed to address many
of these shortcomings. It originally was applied as an analysis of social contributions to
vulnerability at the county level in the U.S. for 1990. It was created by first identifying
social characteristics that were consistently identified within the research literature as
contributing to vulnerability, shown in Table 1. These target variables were used to
identify a set of 42 normalized independent variables which influenced vulnerability
(Table 2). These variables were then entered into a Principal Component Analysis
(PCA), from which 11 components were selected, explaining a total of 76.4 percent of
the variance in the original dataset.
These components were interpreted to identify what element of vulnerability they
represented, and scaled to ensure that they contributed to the final index in an appropriate
manner (positive values added to vulnerability, negative mitigated vulnerability). The 11
factors were then summed with equal weights to create the final vulnerability index. The
index was mapped, with counties shaded according to the SOVI values, to allow for
identification of the spatial patterns of social vulnerability within the U.S.
Table 1. Variables influencing social vulnerability (Cutter, Boruff, and Shirley 2003)
Socio-economic status (income, political power, prestige)
Commercial and industrial development
Race and ethnicity
Age
Gender
Employment loss
Rural/urban
Residential property
Infrastructure and lifelines
Renters
Occupation
Family Structure
Education
Population growth
Medical services
Social dependence
Special needs populations

Table 2. Social Vulnerability Index variables (Cutter, Boruff, and Shirley 2003)
Social Variables
Median age, 1990
Per capita income (in dollars), 1989
Median dollar value of owner-occupied housing, 1990
Median rent (in dollars) for renter-occupied housing units, 1990
Number of physicians per 100,000 population, 1990
Vote cast for president, 1992 - percent voting for leading party (Democratic)
Birth rate (number of births per 1,000 population), 1990
Net international migration, 1990-1997
Land in farms as a percent of total land, 1992
Percent African American, 1990
Percent Native American, 1990
Percent Asian, 1990
Percent Hispanic, 1990
Percent of population under five years old, 1990
Percent of population over 65 years, 1990
Percent of civilian labor force unemployed, 1991
Average number of people per household, 1990
Percent of households earning more than $75,000, 1989
Percent living in poverty, 1990
Percent renter-occupied housing units, 1990
Percent rural farm population, 1990
General local government debt to revenue ratio, 1992
Percent of population 25 years or older with no high school diploma, 1990
Percent of the population participating in the labor force, 1990
Percent females participating in civilian labor force, 1990
Percent employed in primary extractive industries (farming, fishing, mining, and forestry), 1990
Percent employed in transportation, communications, and other public utilities, 1990
Percent employed in service occupations, 1990
Per capita residents in nursing homes, 1991
Percent population change, 1980/1990
Percent urban population, 1990
Percent females, 1990
Percent female-headed households, no spouse present, 1990
Per capita Social Security recipients, 1990
Built Environment Variables
Percent of housing units that are mobile homes
Per capita number of community hospitals
Number of housing units per square mile, 1990
Number of housing permits per new residential construction per square mile, 1990
Number of manufacturing establishments per square mile, 1992
Earnings (in $1,000) in all industries per square mile, 1990
Number of commercial establishments per square mile, 1990
Value of all property and farm products sold per square mile, 1990

This approach has been further used in a number of published and unpublished
studies using varying study areas (Boruff, Emrich, and Cutter 2005; Borden et al. 2006),
in different countries (Boruff 2005), at various spatial scales (Cutter et al. 2006; Borden
et al. 2007), and in differing time periods (Finch 2006; Cutter and Emrich 2006). Indeed,
the method may be best viewed as more of an algorithm for quantifying social
vulnerability, than as a simple numerical index. Via these approaches, SOVI has
consistently illustrated its value by revealing both anticipated and unanticipated spatial
patterns that conform to expert interpretations of social vulnerability. But no study to
date, including the original SOVI paper, has included a sensitivity analysis of the
underlying PCA-based approach. It is such an analysis that this paper provides.

Study Area
The analysis in this paper was conducted in three separate study areas:
Charleston, South Carolina; Los Angeles, California; and New Orleans, Louisiana. To
address changes in vulnerability across these locations, we operate at the U.S. Census
tract level. For Charleston, South Carolina, social vulnerability will be calculated within
the Charleston-North Charleston metropolitan area, which consists of Charleston,
Dorchester, and Berkeley Counties, South Carolina. Because of the much larger number
of census tracts within the Los Angeles, California metropolitan area, the analysis will be
carried out only for Los Angeles County. Orleans Parish, Louisiana will be used to
represent the New Orleans study area.
While the SOVI algorithm has been applied at multiple scales in the existing
literature (e.g. Cutter, Boruff, and Shirley 2003; Borden et al. 2006), there has been no
explicit consideration of the impact scalar changes have on the analysis, beyond visual
interpretations of patterns of vulnerability. Because this analysis will be conducted at the
tract level, rather than at the county level as in the original SOVI index, it is important to
consider how this change in scale may impact the analysis. Openshaw (1983) identified
two types of problems that can occur as part of the Modifiable Areal Unit Problem when
dealing with areally aggregated data at different scales.
The first issue, termed the scale problem, is related to the ecological fallacy. As
scales of analysis change, the relationships between variables aggregated to those levels
also change. Thus without access to the original, un-aggregated data, it is impossible to
determine how severe this problem is, although several studies indicate that correlations
tend to increase with increasing scales (Openshaw 1983; Clark and Avery 1976).
Clark and Avery (1976) used an approach in which the correlations between
variables were calculated for the same data and study area at multiple scales. While this
did not give insight into the extent of the problem created by the initial aggregation of
observation units, it did provide a method to assess the impact of the ecological fallacy on
subsequent changes between aggregation scales. Combining this approach with an
explicit limitation of the application of analysis results to the scale at which they were
derived provides a simple means of addressing this issue and assessing its impact between
aggregation scales. The problem then becomes one of picking an appropriate scale of
analysis. In this study, the desire is to address changes in vulnerability across a
metropolitan area, so census tracts seem an appropriate measure. The impact of scalar
changes from the county to the tract level of aggregation will be examined using an
approach based on Clark and Avery's (1976) analysis, and will be discussed in further
detail later on in this paper.
The second issue, termed by Openshaw to be the aggregation problem, is that the
relationships between areally aggregated variables may result as much from the
aggregation scheme as from the fundamental relationships between the variables
themselves. Indeed, dramatic differences in correlations between variables may be
produced by varying the aggregation scheme (Openshaw 1983). This forces one to apply
results from analyses of areal data only to the study units at which they were conducted.
Short of creating new aggregation units, the problem here becomes determining whether
the study units used in an analysis are truly meaningful.
While it would be naive to view any pre-existing areal aggregation as ideal for a
given study, census tracts do seem to be a fairly meaningful spatial unit for our analysis.
Tracts are defined by the U.S. Census Bureau in conjunction with local committees of
census data users, and are meant to represent areas with fairly stable population sizes and
to be relatively homogeneous with respect to population characteristics, economic
status, and living conditions (U.S. Department of Commerce Bureau of the Census
2006). Additionally, it has been observed that, while not perfect, census tracts have
provided a relatively meaningful proxy for neighborhoods within urban areas (Sampson,
Morenoff, and Gannon-Rowley 2002). Census tracts therefore seem to be a fairly
meaningful set of spatial units for modeling social vulnerability at the sub-urban level.

Data
The list of variables from the original SOVI, created using 1990 census variables,
was used to guide the selection of variables for our analysis. Changes in census variable
availability at the county and tract level necessitated that a smaller set of variables be
used for this analysis. Additionally, following the approach taken by Borden et al.
(2007), built environment variables were removed from the analysis to focus more
specifically on characteristics of the populations themselves that contributed to
vulnerability. This resulted in a total of 26 variables obtained from the GeoLytics
Neighborhood Change Database (GeoLytics 2006) for use in this analysis, shown in
Table 3.


Table 3. Social vulnerability variables for Charleston, SC; Orleans, LA; and Los Angeles, CA study areas
Percent Female Participation In Civilian Labor Force
Civilian Labor Force Participation
Average Family Income
Percent Female Headed Households
Median Dollar Value Of Owner Occupied Housing Units
Percent Native American Population (American Indian, Eskimo, Or Aleut)
Median Gross Rent ($) For Renter-Occupied Housing Units
Percent Population Under 5 Years
Percent of Population who are Immigrants
Percent Population 65 Years Or Older
Percent Institutionalized Elderly Population
Percent Living In Poverty
Average Number Of People Per Household
Percent Renter Occupied Housing Units
Percent Employed In Primary Industry: Farming, Fishing, Mining, Forestry
Percent Rural Farm Population
Percent Asian Or Pacific Islander
Percent Hispanic Persons
Percent Employed In Transportation, Communications, And Other Public Utilities
Percent Black Population
Percent Of The Civilian Labor Force Unemployed
Percent of the Population Living In Urban Areas
Percent Population Over 25 Years Old With Less Than 12 Years Of Education
Percent Employed In Service Occupations
Percent Households that receive Social Security Benefits
Percent Female
Methodology
The algorithm used to construct indices of vulnerability in this paper follows that
used by Cutter, Boruff, and Shirley (2003), with the inclusion of data standardization for
the input variables and the final index scores. The computations are carried out using the
following steps:
1. Standardize all input variables to mean 0 and standard deviation 1.
2. Perform the PCA with the standardized input variables.
3. Select the number of components to be further used based on the unrotated
solution.
4. If desired, rotate the initial solution.
5. Interpret the resulting components on how they may influence social vulnerability
and assign signs to the components accordingly. (For this step, an output of the
loadings of each variable on each factor was used to determine if high levels of a
given factor tend to increase or decrease social vulnerability. If a factor tends to
show high levels for low social vulnerability, the corresponding factor scores are
multiplied by -1. In some cases high and low levels may increase social
vulnerability in which case the absolute value of the corresponding factor score
was taken for calculating SOVI.)
6. Combine the selected component scores using a predetermined weighting scheme.
7. Standardize the resulting scores to mean 0 and standard deviation 1.
Because PCA is sensitive to the scale of the input variables, the data
standardization step is necessary so that all variables are on a common scale. With the
standardized data set the PCA can be performed in the second step. It returns a set of
orthogonal components which are all linear combinations of the original variables. By
construction the first component is the linear combination that explains the greatest
variation among the original variables, the second component the greatest remaining
variation, and so on. Based on the results of the performed PCA, it is desirable to select a
parsimonious subset of components that explain the underlying data set as closely as
possible. In Cutter, Boruff, and Shirley's (2003) work, the Kaiser criterion was used to
select components (Step 3), a Varimax rotation was used (Step 4), and the interpreted
components were summed with equal weights (Step 6).
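
The seven steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the function name `sovi_sketch`, the `signs` argument, the omission of rotation (Step 4), and the rescaling of component scores to unit variance are all choices made for the example, and the absolute-value case mentioned in Step 5 is left out.

```python
import numpy as np

def sovi_sketch(X, signs=None):
    """Illustrative SOVI-style index for an (n_units, n_variables) array X."""
    # Step 1: standardize each input variable to mean 0, standard deviation 1.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Step 2: PCA via eigendecomposition of the correlation matrix.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Step 3: Kaiser criterion, keep components with eigenvalue > 1.
    k = int(np.sum(eigvals > 1.0))

    # Step 4 (rotation) is skipped here; the unrotated solution is used.
    scores = Z @ eigvecs[:, :k]
    scores = scores / scores.std(axis=0, ddof=1)   # assumed unit-variance scores

    # Step 5: analyst-supplied signs (+1 raises, -1 lowers vulnerability).
    if signs is None:
        signs = np.ones(k)
    scores = scores * np.asarray(signs, dtype=float)

    # Step 6: equal-weight sum; Step 7: standardize the final index.
    raw = scores.sum(axis=1)
    return (raw - raw.mean()) / raw.std(ddof=1), k


# Usage with synthetic data (not the census variables used in the paper).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 6))
X_demo[:, 1] += X_demo[:, 0]      # induce some correlation between variables
index, n_components = sovi_sketch(X_demo)
```

In practice the `signs` vector would come from the interpretation of the component loadings in Step 5, which is a judgment that cannot be automated in a sketch like this.
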
The sensitivity analysis of this approach to creating social vulnerability indices was carried out
in two main phases. First, because the analysis involved only a subset of the social
variables found in the original SOVI, and because it was conducted at a different scale,
the first analytical step was to determine the impact of these two differences on the
constructed indices. To assess the influence of using the subset of variables, and the
influence of changes in spatial scale, the variables used here were collected at the Census
tract level for the entire state of South Carolina.
To determine the impact of using the subset of variables, these data were
aggregated to the county level. A social vulnerability index was calculated at the county
level using the 33 social vulnerability variables in the original SOVI, and another using
the 26 variables used in this analysis. The correlation between the county level indices
was calculated to determine how closely the index constructed with the subset of
variables matched the index with the full set of social variables.
The impact of changing the level of aggregation on the analysis was assessed via
the approach used by Clark and Avery (1976). They demonstrated that as the level of
aggregation increases, the correlations between variables increase as well. Because the
social vulnerability index approach is based on PCA, which itself relies on the covariance
or correlation between variables to determine the components representing the maximum
dimensions of variability in the dataset, it seems reasonable that decreasing the level of
aggregation from the county level -- the aggregation scale used in Cutter, Boruff, and
Shirley's (2003) work -- to the tract level would result in a decrease in the amount of
variance explained by the PCA used to construct the index. To test this, as well as to
determine any other effects of downscaling the SOVI approach, social vulnerability
indices were constructed at the county and tract levels for the state of South Carolina, as
well as at an arbitrarily assigned intermediate level of aggregation. Four of the counties
in the state had only three tracts, meaning that no intermediate aggregation could be
created which would result in a set of units completely unique from the tract and county
levels of aggregation. As such, these counties were removed from this analysis, as well
as from the previous comparison of indices at the county level.
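
The tendency described by Clark and Avery (1976) can be illustrated with synthetic data. The sketch below is a toy example, not a reproduction of their analysis or of the South Carolina data: two hypothetical variables share a common area-level effect plus unit-level noise, and their correlation is computed before and after averaging the units within each area. All names and parameter values are assumptions chosen for the illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_areas, units_per_area = 50, 20

# Each area has a shared effect; each unit adds its own independent noise.
area_effect = rng.standard_normal(n_areas)
area_id = np.repeat(np.arange(n_areas), units_per_area)
x = area_effect[area_id] + 2.0 * rng.standard_normal(area_id.size)
y = area_effect[area_id] + 2.0 * rng.standard_normal(area_id.size)

# Correlation at the fine (unit) level.
r_fine = np.corrcoef(x, y)[0, 1]

# Aggregate to the coarse (area) level by averaging units within areas.
x_area = x.reshape(n_areas, units_per_area).mean(axis=1)
y_area = y.reshape(n_areas, units_per_area).mean(axis=1)
r_coarse = np.corrcoef(x_area, y_area)[0, 1]

print(f"unit-level r = {r_fine:.2f}, area-level r = {r_coarse:.2f}")
```

Averaging cancels much of the unit-level noise while the shared area effect remains, so the aggregated correlation is higher. This is one mechanism consistent with the expectation that a PCA at coarser scales explains more variance.
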
The second step in our sensitivity analysis was to consider the influence of
subjective options applied in the construction of the index. The algorithm used for
constructing the original SOVI was reviewed to identify the types of subjective options
made that seemed likely to have some influence on the assignment of index values.
These fell into three categories: PCA component selection (Step 3), PCA rotation (Step
4), and the weighting scheme used to combine the components to create the final index
(Step 6). Logical alternatives to each of these approaches were considered, and index
values for each study area were calculated based on all possible combinations of these
alternatives. This approach resulted in a collection of indices for the Charleston, Los
Angeles, and Orleans study areas. The sensitivity of the index could then be compared
between study areas to determine if the results of the analysis remained stable across a
variety of regional locations.
For calculating the SOVI index, the following methods for PCA component
selection were used:
1. Kaiser Criterion (Kaiser 1960): Select only those components whose eigenvalues
are greater than 1.
2. Percent Variance Explained: Retain as many components as are needed to account for a
pre-specified amount of variation in the original data. For the SOVI algorithm, the
smallest number of components was chosen such that at least 80% of the variation
in the original data was accounted for.
3. Horn's Parallel Analysis (Horn 1958): This selection criterion is similar to the
Kaiser Criterion. However, instead of using a fixed threshold, one retains those
factors whose eigenvalues are larger than the expected eigenvalue for that
component. Since the expected eigenvalue is arduous to compute, Horn's parallel
analysis uses 100 randomly generated datasets on which a PCA is performed and
then averages over the resulting eigenvalues.
4. Expert Choice: Another approach for selecting components relies upon
identifying a set of components that have meaningful, subject area interpretations.
We term this selection criterion expert choice.
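
The three automated selection rules listed above can be sketched as follows; expert choice is a judgment call and is not coded. This is an illustration under assumptions rather than the procedure as run for this paper: the function names are hypothetical, eigenvalues are assumed to be sorted in descending order, the random datasets for the parallel analysis are drawn from a standard normal distribution, and the parallel analysis counts only the leading run of components that exceed their expected eigenvalue.

```python
import numpy as np

def kaiser(eigvals):
    """Number of components with eigenvalue greater than 1."""
    return int(np.sum(np.asarray(eigvals) > 1.0))

def percent_variance(eigvals, threshold=0.80):
    """Fewest leading components whose cumulative share reaches the threshold."""
    eigvals = np.asarray(eigvals, dtype=float)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(cumulative, threshold) + 1)

def sorted_eigvals(data):
    """Eigenvalues of the correlation matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]

def horn_parallel(Z, n_sims=100, seed=0):
    """Leading components whose eigenvalue exceeds the simulated average."""
    rng = np.random.default_rng(seed)
    n, p = Z.shape
    observed = sorted_eigvals(Z)
    simulated = np.array([sorted_eigvals(rng.standard_normal((n, p)))
                          for _ in range(n_sims)])
    exceeds = observed > simulated.mean(axis=0)
    # argmin on a boolean array returns the position of the first False.
    return p if exceeds.all() else int(np.argmin(exceeds))


# Usage with made-up eigenvalues and a synthetic one-factor dataset.
demo_eigvals = [3.0, 1.5, 0.8, 0.4, 0.3]
rng = np.random.default_rng(1)
factor = rng.standard_normal(300)
Z_demo = factor[:, None] + 0.5 * rng.standard_normal((300, 6))
n_kaiser = kaiser(demo_eigvals)
n_percent = percent_variance(demo_eigvals)
n_parallel = horn_parallel(Z_demo)
```
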
A total of six PCA rotation methods were considered:
1. Unrotated solution: in order to use the components that explain the greatest
percentage of the original variation, no rotation is applied.
2. Varimax Rotation (Kaiser 1958): this rotation tends to load each variable highly
on just one component. This often leads to easier component interpretation.
3. Quartimax Rotation (Neuhaus 1954; Carroll 1953; Ferguson 1954; Saunders
1953): this rotation tends to increase large loadings and decrease small ones, so that
each variable will load only on a few factors. This should lead to fewer relevant
components than other rotations.
4. Promax Rotation (Hendrickson and White 1964): In contrast to the other
rotations, the Promax rotation represents an oblique rotation. Thus, the resulting
rotated components are no longer orthogonal. By allowing the resulting
components to be correlated with each other, one may hope to achieve even easier
interpretability. The Promax rotation also requires specification of a power
parameter, which is typically taken between 2 and 4. We chose the values 2, 3
and 4 for the algorithm.
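
The two orthogonal rotations listed above belong to the same family and can be sketched with one iterative routine. The code below is an illustrative NumPy version of a commonly used SVD-based scheme, not the SAS routine used for this paper; the function name `orthomax` and its parameters are assumptions, and the oblique Promax rotation is not covered. Setting `gamma=1.0` gives Varimax and `gamma=0.0` gives Quartimax.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, max_iter=500, tol=1e-8):
    """Orthogonally rotate a (p, k) loading matrix.

    Returns the rotated loadings and the k-by-k rotation matrix.
    """
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Target matrix derived from the orthomax criterion.
        target = rotated**3 - (gamma / p) * rotated * np.sum(rotated**2, axis=0)
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                      # nearest orthogonal matrix
        new_objective = s.sum()
        if objective != 0.0 and new_objective < objective * (1.0 + tol):
            break
        objective = new_objective
    return loadings @ rotation, rotation


# Usage: hide a simple structure behind a 30-degree rotation, then recover it.
simple = np.array([[0.9, 0.0], [0.6, 0.0], [0.3, 0.0],
                   [0.0, 0.9], [0.0, 0.6], [0.0, 0.3]])
theta = np.deg2rad(30.0)
mix = np.array([[np.cos(theta), np.sin(theta)],
                [-np.sin(theta), np.cos(theta)]])
unrotated = simple @ mix
rotated, rot = orthomax(unrotated, gamma=1.0)
```

Because the rotation is orthogonal, each variable's communality (its row sum of squared loadings) is unchanged; only the way variance is distributed across components differs.
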
Three approaches for weighting the selected and interpreted components were
considered:
1. Sum the component scores: Since each PCA component captures a different
aspect of social vulnerability, a simple approach to combining the components is
to sum the scores, thus assigning equal contributions to each component of the
SOVI value.
2. First component only: Mathematically, the first extracted component from a PCA
is the linear combination that explains the largest amount of variation in the
original data. Therefore, selecting just the first component will give the
mathematically optimal value to summarize all the input variables in a single
combination.
3. Weighted sum using explainable variance to weight each component: This is a
compromise between the first two methods. Since each successive component
contributes less and less to the explainable variation, it seems reasonable to give
the first PCA component the most weight and to decrease the weights accordingly
for the following components.
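
The three weighting schemes above differ only in how the signed component scores are collapsed into one value. A minimal sketch follows, assuming the scores have already been selected and assigned signs; the function name `combine_scores`, the scheme labels, and the normalization of the weights to sum to 1 are illustrative assumptions (the normalization has no effect once the result is standardized).

```python
import numpy as np

def combine_scores(scores, explained, scheme):
    """Collapse (n_units, k) component scores into one standardized index.

    explained: length-k variance explained by each retained component.
    scheme: "sum", "first", or "weighted".
    """
    scores = np.asarray(scores, dtype=float)
    explained = np.asarray(explained, dtype=float)
    if scheme == "sum":
        raw = scores.sum(axis=1)                        # equal weights
    elif scheme == "first":
        raw = scores[:, 0]                              # first component only
    elif scheme == "weighted":
        raw = scores @ (explained / explained.sum())    # variance weights
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return (raw - raw.mean()) / raw.std(ddof=1)


# Usage with made-up scores for three units and two components.
scores_demo = np.array([[1.0, 0.0], [-1.0, 2.0], [0.0, -2.0]])
explained_demo = [3.0, 1.0]
by_sum = combine_scores(scores_demo, explained_demo, "sum")
by_first = combine_scores(scores_demo, explained_demo, "first")
by_weight = combine_scores(scores_demo, explained_demo, "weighted")
```
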
Finally, since the computed SOVI value does not itself have any absolute
interpretation, it is standardized to mean 0 and variance 1 in order to map the values over
space or to compare different methods. Positive values suggest high social vulnerability,
whereas negative values suggest low social vulnerability. In total, 72 different versions
of social vulnerability indices were possible for each study area. In the end, only 54
unique versions were calculated, because the Expert Choice component selection
coincided with the Kaiser Criterion selection, and was therefore dropped from the analysis.
The approach for this segment of the analysis was implemented using SAS Software (The
SAS Institute Inc. 2005).
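
The counts of 72 possible and 54 unique versions follow from crossing the options: 4 selection methods, 6 rotations, and 3 weighting schemes, with one selection method dropped. A short enumeration makes this explicit; the option labels are placeholders chosen for the example.

```python
from itertools import product

selection = ["kaiser", "percent_variance", "parallel_analysis", "expert_choice"]
rotation = ["unrotated", "varimax", "quartimax", "promax_2", "promax_3", "promax_4"]
weighting = ["sum", "first_component", "variance_weighted"]

# Every combination of one selection rule, one rotation, one weighting scheme.
all_variants = list(product(selection, rotation, weighting))

# Expert choice coincided with the Kaiser criterion, so it is dropped.
unique_variants = [v for v in all_variants if v[0] != "expert_choice"]

print(len(all_variants), len(unique_variants))  # prints: 72 54
```
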
To determine the impact of these subjective options on the final index values, we
assessed the changes in SOVI statistically using a three-way factorial analysis. The
factors in this analysis were based on the three categories of subjective options described
above: component selection, PCA rotation, and weighting scheme. In this setting, we
also considered the tract i.d. within each study area as a blocking factor. Because the
computed index values do not represent a truly random sample drawn from some broader
population, this operation is not intended to make any statistical inferences. Rather, we
use this calculation simply to reveal if substantial differences in the index values within
each study area occurred as a result of the choice(s) of subjective option.
In the factorial analysis, we employed Type III Sums of Squares to assess the
importance of each subjective option. The associated p-values were seen as measures of
the influence each subjective option has on the final index value. Small p-values would
suggest that changes in the choice of that subjective option have a large impact on the
final index value, whereas large p-values would suggest that choices within that
subjective option do not substantially impact the final value.
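
When every tract has an index value for every combination of options, the layout is balanced, and in a balanced layout the Type III sum of squares for a main effect reduces to the familiar between-level sum of squares. The sketch below computes that quantity with NumPy on synthetic values. It is an illustration under that balance assumption, not the SAS analysis reported here; the function name and array layout are hypothetical, and interaction terms and F tests are omitted.

```python
import numpy as np

def main_effect_ss(y, axis):
    """Sum of squares and degrees of freedom for one factor.

    y holds one index value per cell, with one array axis per factor
    (balanced layout assumed).
    """
    grand_mean = y.mean()
    other_axes = tuple(a for a in range(y.ndim) if a != axis)
    level_means = y.mean(axis=other_axes)
    cells_per_level = y.size // y.shape[axis]
    ss = cells_per_level * np.sum((level_means - grand_mean) ** 2)
    return ss, y.shape[axis] - 1


# Synthetic layout: (tract, selection, rotation, weighting) = (4, 3, 6, 3).
effect = np.array([-1.0, 0.0, 1.0])         # only the weighting scheme matters
y_demo = np.zeros((4, 3, 6, 3)) + effect    # broadcast along the last axis
ss_combine, df_combine = main_effect_ss(y_demo, axis=3)
ss_select, df_select = main_effect_ss(y_demo, axis=1)
```

In this toy layout all of the variation is attributed to the weighting factor and none to the selection factor, which is the pattern a large versus a small sum of squares is meant to reveal.
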
To gain an understanding of how changes in each specific option affected the
final index values, we further assessed differences among the levels within each
subjective option, using multiple comparison techniques. Here, a compared difference
between two choices of a subjective option that exhibits a small p-value suggests that
changing from one choice of the option to the other can produce quite different final
index values, whereas a p-value close to 1 would suggest no substantial difference in how
the two choices for the option affect the final index.

Results
The comparison between the vulnerability indices derived using the original
variables at the county level, and the subset of 26 variables used in this analysis showed
strong similarities. The algorithm used in this stage of the analysis followed the original
approach, using the Kaiser criterion for component selection, a Varimax rotation, and
equal weighting to sum the collected components. The PCA performed on the original
set of 33 social variables resulted in 8 components which explained 85.8 percent of the
variance of the original data. The PCA performed on the subset of 26 variables resulted
in 6 components, explaining 85.0 percent of the data variance. Both PCAs resulted in
sets of components with broadly similar subject interpretations. The correlation between
the two indices showed a fairly strong positive relationship (r = 0.79).
The results of the PCAs conducted at multiple aggregation scales are shown in
Table 4. As expected, as the scale at which the PCA was conducted decreased, the
variance explained decreased, and the number of components selected using the Kaiser
criterion increased. Figure 1 shows a graph of the percent variance explained by the
rotated components selected for each PCA. From this graph, we can see that decreasing
the aggregation scale flattens the curve: the variance explained by the first rotated
component decreases, and the rate at which the variance explained declines from one
component to the next decreases as well. Returning to Table 4, we see
that while there are differences among the subject interpretation of the components
between the scales, they are broadly similar. Finally, as shown in Table 5, as the
aggregation scale was decreased, the variance and range of the indices computed from the
PCAs increased.
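
The quantity plotted in Figure 1 can be computed directly from a rotated loading matrix: for standardized variables, the share of total variance attributed to a component is its sum of squared loadings divided by the number of variables. The sketch below assumes an orthogonal rotation and uses a small hypothetical loading matrix, not values from this study.

```python
import numpy as np

def percent_variance_by_component(loadings):
    """Percent of total variance explained by each column of a loading matrix."""
    n_vars = loadings.shape[0]
    return 100.0 * (loadings**2).sum(axis=0) / n_vars

# Hypothetical loadings for 4 standardized variables on 2 rotated components.
example_loadings = np.array([
    [0.8, 0.0],
    [0.6, 0.0],
    [0.0, 0.9],
    [0.0, 0.5],
])
pct = percent_variance_by_component(example_loadings)
print(pct, pct.sum())
```

A flatter curve in this representation means a smaller first entry and smaller gaps between successive entries.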
Table 4. Results for County, Intermediate, and Tract level PCAs

County Level

Intermediate Level

# Components
Variance
Explained: 6
Components
Variance
Explained: 7
Components
Component
Interpretation
Wealthy,
Urban
Race and
Poverty
Hispanic
Immigrants
Age

# Components
Variance
85.0
Explained: 6
Components
Variance
Explained: 7
Components
Variance Component
Explained Interpretation
Race and
29.5
Poverty
Urban Renters,
21.8
Race

10.8
9.3

Gender

7.7

Race

6.1

Tract Level

# Components
Variance
79.3
Explained: 6
Components
Variance
83.0
Explained: 7
Components
Variance Component
Explained Interpretation
Race and
22.6
Poverty

16.3

Rural/Urban

11.7

Wealth

13.6

Wealth

10.9

Age
Hispanic
Immigrants
Gender

11.3

Elderly
Hispanic
Immigrants
Kids
Gendered
Labor

9.6

8.6
6.9

69.1

73.2
Variance
Explained
17.6

8.8
7.8
6.8

Table 5. Variance and range of indices at County, Intermediate, and Tract aggregations

Aggregation Scale     Index Variance   Index Range
County Level          4.45             10.38
Intermediate Level    5.09             11.33
Tract Level           6.66             29.25


Table 6 Charleston-North Charleston Factorial Analysis Results

Source           DF    Type III SS   Mean Square   F Value   Pr > F
ID               116   2171.347630   18.718514     28.43     <.0001
select           2     4.904791      2.452396      3.72      0.0242
rotate           5     0.521309      0.104262      0.16      0.9776
select*rotate    10    5.463944      0.546394      0.83      0.5998
combine          2     29.343456     14.671728     22.28     <.0001
select*combine   4     41.875727     10.468932     15.90     <.0001
rotate*combine   10    1.941932      0.194193      0.29      0.9826

[Figure 1: line graph titled "Comparison of PCAs at Varying Aggregations". Vertical axis: Percent Variance Explained; horizontal axis: Component Number; series: County Level, Intermediate Level, Tract Level.]

Figure 1. Variance explained by component for three aggregation levels

The results from the factorial analysis reveal the impact of the changes in the
algorithm construction on the index values for each study area. In this analysis, p-values
less than 0.10 are considered to indicate substantial differences resulting from a particular
set of subjective decisions (selection, rotation, and combination), or in the case of the
multiple comparison results, differences between the options within a set. Table 6 lists
the Type III sum of squares results from the factorial analysis of the Charleston-North
Charleston area. Both component selection and combination result in substantial changes
in this study area. Table 7 and Table 8 show the multiple comparison results for the
options within each of these categories. Table 7 shows that the Horn and Kaiser
selection criteria are substantially different (p = 0.0241), while a smaller difference
occurs between the Horn and Variance Explained selection criteria (p = 0.1180).
Table 8 shows that the weighted sum approach for component combination is
substantially different from the other two approaches.
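
As a quick arithmetic check on how Table 6 is read, each mean square should equal its sum of squares divided by its degrees of freedom, and each F value should be the mean square divided by a common error mean square. The error mean square is not reported in the table, so the sketch below only infers it from the rounded F values; the row assignments follow the table as read here and should be treated as an assumption.

```python
# Rows of Table 6 as (DF, Type III SS, Mean Square, F Value).
table6 = {
    "ID": (116, 2171.347630, 18.718514, 28.43),
    "select": (2, 4.904791, 2.452396, 3.72),
    "rotate": (5, 0.521309, 0.104262, 0.16),
    "select*rotate": (10, 5.463944, 0.546394, 0.83),
    "combine": (2, 29.343456, 14.671728, 22.28),
    "select*combine": (4, 41.875727, 10.468932, 15.90),
    "rotate*combine": (10, 1.941932, 0.194193, 0.29),
}

implied_error_ms = {}
for source, (df, ss, ms, f_value) in table6.items():
    # Mean square is the sum of squares divided by its degrees of freedom.
    assert abs(ss / df - ms) < 1e-6, source
    # F = MS_effect / MS_error, so MS_error is roughly MS_effect / F.
    implied_error_ms[source] = ms / f_value

for source, value in implied_error_ms.items():
    print(f"{source:15s} implied error mean square = {value:.3f}")
```

The implied values agree to within the rounding of the reported F statistics, which supports reading each label as belonging to the numbers that precede it in the extracted text.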
Table 7 P-values for pair-wise differences in component selection for Charleston-North Charleston

                     Horn     Kaiser   Variance Explained
Horn                 -        0.0241   0.1180
Kaiser               0.0241   -        0.7974
Variance Explained   0.1180   0.7974   -
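
The 0.10 threshold used in this analysis can be applied mechanically to a table of pair-wise p-values. The sketch below uses the three distinct values from Table 7; the dictionary layout and function name are illustrative assumptions.

```python
THRESHOLD = 0.10  # cutoff for a "substantial" difference, as stated in the text

# Pair-wise p-values for component selection in Charleston-North Charleston (Table 7).
pairwise_p = {
    ("Horn", "Kaiser"): 0.0241,
    ("Horn", "Variance Explained"): 0.1180,
    ("Kaiser", "Variance Explained"): 0.7974,
}

def substantial_pairs(p_values, threshold=THRESHOLD):
    """Return the pairs whose p-value falls below the threshold, smallest first."""
    flagged = [(pair, p) for pair, p in p_values.items() if p < threshold]
    return sorted(flagged, key=lambda item: item[1])

for pair, p in substantial_pairs(pairwise_p):
    print(pair, p)
```

Only the Horn and Kaiser pair falls below the cutoff; the Horn and Variance Explained pair sits just above it.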

Table 8 P-values for pair-wise differences in factor combination for Charleston-North Charleston

               First Factor   Sum      Weighted Sum
First Factor   -              0.2575   <.0001
Sum            0.2575         -        <.0001
Weighted Sum   <.0001         <.0001   -

Index values in Los Angeles were substantially influenced by changes in the PCA
rotation and component combination criteria (Table 9). From the results in Table 10, we
see that the unrotated approaches were substantially different from all other rotations, and
that the Promax (k=3) rotation was different from all but the Promax (k=4) rotation. As
was the case in the Charleston-North Charleston study area, the weighted sum
combination approach was substantially different from both of the other combination
approaches (Table 11).
Table 9 Los Angeles Factorial Analysis Results

Source           DF     Type III SS   Mean Square   F Value   Pr > F
ID               2046   45312.28276   22.14677      37.47     <.0001
Select           2      0.58835       0.29418       0.50      0.6079
Rotate           5      50.85990      10.17198      17.21     <.0001
select*rotate    10     2.71381       0.27138       0.46      0.9168
Combine          2      101.94634     50.97317      86.23     <.0001
select*combine   4      2.01019       0.50255       0.85      0.4931
rotate*combine   10     124.04739     12.40474      20.99     <.0001

Table 10 P-values for pair-wise differences in factor rotation for Los Angeles

            Unrotated   Varimax   Quartimax   Promax 2   Promax 3   Promax 4
Unrotated   -           <.0001    <.0001      <.0001     <.0001     <.0001
Varimax     <.0001      -         1.0000      1.0000     0.0773     0.1367
Quartimax   <.0001      1.0000    -           1.0000     0.0773     0.1367
Promax 2    <.0001      1.0000    1.0000      -          0.0773     0.1367
Promax 3    <.0001      0.0773    0.0773      0.0773     -          1.0000
Promax 4    <.0001      0.1367    0.1367      0.1367     1.0000     -

Table 11 P-values for differences in factor combination for Los Angeles

               First Factor   Sum      Weighted Sum
First Factor   -              0.1611   <.0001
Sum            0.1611         -        <.0001
Weighted Sum   <.0001         <.0001   -

Finally, all three subjective decision categories were influential for Orleans Parish
(Table 12). The Kaiser selection criterion was substantially different from the Variance
Explained approach, and also modestly different from the Horn selection criterion
(Table 13). The only substantial difference found in the rotation category was between
the Promax (k=3) and Promax (k=4) rotations (Table 14). As in Charleston-North
Charleston and Los Angeles, the weighted sum combination approach was substantially
different from the other two approaches (Table 15).
Table 12 Factorial Analysis Results for Orleans

Source    DF    Type III SS   Mean Square   F Value   Pr > F
ID        180   3424.320015   19.024000     31.32     <.0001
select    2     4.213846      2.106923      3.46      0.0314
rotate    5     6.916796      1.383359      2.27      0.0447
combine   2     79.252498     39.626249     65.10     <.0001

Table 13 P-values for pair-wise differences in component selection for Orleans

                     Horn     Kaiser   Variance Explained
Horn                 -        0.1201   0.7398
Kaiser               0.1201   -        0.0413
Variance Explained   0.7398   0.0413   -

Table 14 P-values for pair-wise differences in factor rotation for Orleans

            Unrotated   Varimax   Quartimax   Promax 2   Promax 3   Promax 4
Unrotated   -           0.8934    0.8512      0.9966     0.1801     0.9273
Varimax     0.8934      -         1.0000      0.9877     0.9280     0.3968
Quartimax   0.8512      1.0000    -           0.9841     0.8536     0.3471
Promax 2    0.9966      0.9877    0.9841      -          0.4361     0.7288
Promax 3    0.1801      0.9280    0.8536      0.4361     -          0.0315
Promax 4    0.9273      0.3968    0.3471      0.7288     0.0315     -

Table 15 P-values for pair-wise differences in factor combination

               First Factor   Sum      Weighted Sum
First Factor   -              1.0000   <.0001
Sum            1.0000         -        <.0001
Weighted Sum   <.0001         <.0001   -

Discussion and Conclusions


Given these results, it is possible to reach a general set of conclusions regarding
the research questions asked at the beginning of this analysis. The first question
concerned the adequacy of the subset of variables used in this analysis, and the
impact of scalar changes on the PCA and resulting index. We find that the subset of
variables used in this analysis provided a representation of vulnerability very similar to
the one derived using the full set of social variables employed in the original SOVI. With
regard to the impact of scalar changes on the analysis, while our ability to explain the
variability in the data through the use of PCA decreases with decreasing scale, the index
values themselves become more spread out. Additionally, the subjective interpretations
of the PCA components remained fairly stable between scales. Because these components
are understood as relating to the drivers of vulnerability within the study area, this
suggests that while scalar changes affect the PCA and the numeric properties of the
index, the identification of the drivers of vulnerability within a study area, based on a
constant variable set, is not strongly dependent on the scale of aggregation used within
the study area.
While the performance of the SOVI algorithm does not appear to be substantially
influenced by scalar changes, it is sensitive to variations in its construction. Further, it
appears to be sensitive to different changes in different study areas, suggesting that the
context in which the analysis is performed has an important impact on the behavior of the
index. There were, however, some general conclusions that could be made. First, the
only factor that was found to have a substantial impact in all three study areas was the
manner in which the components were combined to create the final index values. For all
three study areas, the variance-weighted approach to combining the components was
substantially different from the first-component-only and equal-weights approaches.
Second, when the selection approach was important, the Kaiser criterion was
substantially different from one of the others. Finally, differences in the impact of the
six rotation methods were not uniform across study areas.
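
The three combination approaches compared above can be written side by side. This is a sketch under stated assumptions: the score matrix and variance percentages are hypothetical, and normalizing the weights to sum to one is an illustrative choice rather than something the text specifies.

```python
import numpy as np

def combine_components(scores, pct_variance):
    """Return three candidate indices from an (observations x components) score matrix."""
    scores = np.asarray(scores, dtype=float)
    weights = np.asarray(pct_variance, dtype=float)
    weights = weights / weights.sum()   # normalizing to sum to 1 is an assumption
    return {
        "first_component": scores[:, 0],      # first component only
        "equal_sum": scores.sum(axis=1),      # equal-weight sum
        "weighted_sum": scores @ weights,     # variance-weighted sum
    }

# Hypothetical component scores for 3 areas and 3 components.
example_scores = np.array([
    [1.0, -0.5, 0.2],
    [-1.0, 0.5, 0.4],
    [0.0, 1.0, -0.6],
])
example_pct = [50.0, 30.0, 20.0]   # hypothetical percent variance explained
indices = combine_components(example_scores, example_pct)
for name, values in indices.items():
    print(name, np.round(values, 2))
```

Even in this small example the three approaches rank the areas differently, which is the kind of disagreement the factorial analysis is designed to detect.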
While this analysis seems to indicate that the algorithm may perform well with
varying datasets and at differing scales, and gives some understanding of the effect of
varying some of the subjective decisions made in index construction, it also highlights the
importance of expert judgment in the process of index creation in varying geographic
contexts. The importance of expert judgment in the index creation process is not limited
to this area, however. Expert judgment is also a critical element in the subjective
interpretation of the components generated by the PCA (Step 5 of the algorithm). These
components must be interpreted to determine whether each will be assigned a
positive, negative, or absolute value before it is combined with the other components
to create the index. Additionally, the adequacy of the representation of vulnerability
produced by the index can at this time only be judged with reference to expert knowledge
of the characteristics of the area. Future research on the SOVI algorithm could be
designed to assess the impact of changes in the interpretation of components on the final
index and provide more concrete guidance on this crucial element. For example, the same
set of PCA components could be shown to a panel of experts, and the consistency of their
judgments could be measured. Input from a panel of experts would also be beneficial in
determining not just the sensitivity of the algorithm to changes in the subjective decisions
made, but also the adequacy of the representations of vulnerability produced. This could
be done by having a panel of experts familiar both with vulnerability concepts and with
the study area develop an expected representation of vulnerability. This
representation could be compared against the outcome of various approaches to index
creation to aid in the selection of an optimal approach to creating the index. If this were
repeated for several study areas, it might be possible not only to reach conclusions about
the sensitivity of the algorithm, but perhaps also to make recommendations of optimal
approaches to index construction.
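
The directional adjustment described for Step 5 (assigning each component a positive, negative, or absolute value before combination) can be sketched as follows. The direction codes, function name, and scores are hypothetical; in practice the directions come from expert interpretation of each component.

```python
import numpy as np

def apply_directions(scores, directions):
    """Adjust each component column by '+', '-', or 'abs' before combination."""
    scores = np.asarray(scores, dtype=float)
    adjusted = np.empty_like(scores, dtype=float)
    for j, direction in enumerate(directions):
        column = scores[:, j]
        if direction == "+":        # component increases vulnerability
            adjusted[:, j] = column
        elif direction == "-":      # component decreases vulnerability
            adjusted[:, j] = -column
        elif direction == "abs":    # both extremes increase vulnerability
            adjusted[:, j] = np.abs(column)
        else:
            raise ValueError(f"unknown direction: {direction!r}")
    return adjusted

# Hypothetical scores for 2 areas on 3 components.
raw_scores = np.array([[1.0, -2.0, -3.0], [-1.0, 2.0, 3.0]])
adjusted_scores = apply_directions(raw_scores, ["+", "-", "abs"])
print(adjusted_scores)
```

Changing a single direction code changes every index value that depends on that component, which is why the consistency of expert judgments matters.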
Understanding the sensitivity of the SOVI algorithm to changes in variable sets,
scale, and index construction is crucial. There is no obvious avenue through which
indices of social vulnerability may be validated. That being the case, we must strive at
least to ensure that we understand the limitations of our methodology. Knowing the
limitations will allow us to apply these indices more appropriately, and to have greater
confidence in our interpretations of their results.

References:
Blaikie, Piers, Terry Cannon, Ian Davis, and Ben Wisner. 1994. At risk: natural hazards,
people's vulnerability and disasters. New York: Routledge.
Bohle, Hans G., Thomas E. Downing, and Michael J. Watts. 1994. Climate change and
social vulnerability: toward a sociology and geography of food insecurity. Global
Environmental Change 4 (1):37-48.
Borden, Kevin, Mathew C. Schmidtlein, Christopher T. Emrich, Walt Piegorsch, and
Susan L. Cutter. 2006. Vulnerability of U.S. cities to environmental hazards.
Journal of Homeland Security and Emergency Management In review.
Borden, Kevin, Mathew C. Schmidtlein, Christopher T. Emrich, Walt Piegorsch, and
Susan L. Cutter. 2007. Vulnerability of U.S. cities to environmental hazards.
Journal of Homeland Security and Emergency Management 4 (2):Article 5.
Boruff, B. J., C. Emrich, and S. L. Cutter. 2005. Erosion hazard vulnerability of US
coastal counties. Journal Of Coastal Research 21 (5):932-942.
Boruff, Bryan J. 2005. A multiple hazards assessment of two Caribbean nations:
Barbados and St. Vincent. Doctoral Dissertation, Department of Geography,
University of South Carolina, Columbia, SC.
Carroll, J.B. 1953. Approximating simple structure in factor analysis. Psychometrika
18:23-38.
Chakraborty, Jayit, Graham A. Tobin, and Burrell Montz. 2005. Population evacuation:
assessing spatial variability in geophysical risk and social vulnerability to natural
hazards. Natural Hazards Review 6 (1):23-33.
Clark, W.A.V., and Karen L. Avery. 1976. The effects of data aggregation in statistical
analysis. Geographical Analysis 8 (4):428-438.
Cutter, S. L., and C. T. Emrich. 2006. Moral hazard, social catastrophe: The changing
face of vulnerability along the hurricane coasts. Annals Of The American
Academy Of Political And Social Science 604:102-112.
Cutter, S. L., C. T. Emrich, J. T. Mitchell, B. J. Boruff, M. Gall, M. C. Schmidtlein, C. G.
Burton, and G. Melton. 2006. The long road home: Race, class, and recovery
from Hurricane Katrina. Environment 48 (2):8-20.
Cutter, Susan L. 1996. Vulnerability to environmental hazards. Progress in Human
Geography 20 (4):529-539.
Cutter, Susan L. 2003. The vulnerability of science and the science of vulnerability.
Annals of the Association of American Geographers 93 (1):1-12.
Cutter, Susan L., Bryan J. Boruff, and W. Lynn Shirley. 2003. Social vulnerability to
environmental hazards. Social Science Quarterly 84 (2):242-261.
Cutter, Susan L., Jerry T. Mitchell, and Michael S. Scott. 2000. Revealing the
vulnerability of people and places: a case study of Georgetown County, South
Carolina. Annals of the Association of American Geographers 90 (4):713-737.
Ferguson, G.A. 1954. The concept of parsimony in factor analysis. Psychometrika
19:281-291.
Finch, Christina. 2006. Spatial and Temporal Analysis of Social Vulnerability to
Environmental Hazards in the United States. Masters Thesis, Department of
Geography, University of South Carolina, Columbia, SC.
GeoLytics. CensusCD neighborhood change database (NCDB): tract data from 1970-2000
(Long form release). GeoLytics 2006 [cited.
Hendrickson, A.E., and P.O. White. 1964. Promax: a quick method for rotation to
oblique simple structure. British Journal of Statistical Psychology 17:65-70.
Horn, J.L. 1958. A rationale and test for the number of factors in factor analysis.
Psychometrika 30:179-185.
Inter-American Development Bank. 2006. Indicators of disaster risk and risk
management: program for Latin America and the Caribbean 2005 [cited 7
October 2006]. Available from
http://idbdocs.iadb.org/wsdocs/getdocument.aspx?docnum=465922.
Kaiser, H.F. 1958. The varimax criterion for analytic rotation in factor analysis.
Psychometrika 23:187-200.
Kaiser, H.F. 1960. The application of electronic computers to factor analysis.
Educational and Psychological Measurement 20:141-151.
Kates, Robert W. 1971. Natural hazard in human ecological perspective: hypothesis and
models. Economic Geography 47 (3):438-451.
Luers, Amy L., David B. Lobell, C. Lee Addams, and Pamela A. Matson. 2003. A
method for quantifying vulnerability, applied to the agricultural system of the
Yaqui Valley, Mexico. Global Environmental Change 13:255-267.
Mileti, Dennis S. 1980. Human adjustment to the risk of environmental extremes.
Sociology and Social Research 64 (3):327-347.
Neuhaus, Wrigley. 1954. The quartimax method: An analytical approach to orthogonal
simple structure. British Journal of Statistical Psychology 7:81-91.
Openshaw, Stan. 1983. The Modifiable Areal Unit Problem. Vol. 38, Concepts and
Techniques in Modern Geography. Norwich: Geo Books.
Rashed, Tarek, and John Weeks. 2003. Assessing vulnerability to earthquake hazards
through spatial multicriteria analysis of urban areas. International Journal of
Geographical Information Science 17 (6):547-576.
Sampson, R. J., J. D. Morenoff, and T. Gannon-Rowley. 2002. Assessing "neighborhood
effects": Social processes and new directions in research. Annual Review Of
Sociology 28:443-478.
Saunders. 1953. An analytical method for rotation to orthogonal simple structure. In
Research Bulletin 53-10. Princeton, NJ: Educational Testing Service.
Schmidlin, T. W. 2006. On evacuation and deaths from Hurricane Katrina. Bulletin Of
The American Meteorological Society 87 (6):754-756.
The SAS Institute Inc. 2005. SAS Online Doc 9.1.3. Internet address:
http://support.sas.com/onlinedoc/913/docmainpage.jsp.
U.S. Department of Commerce Bureau of the Census. 2006. Question & Answer Center
2006 [cited 4 October 2006]. Available from http://ask.census.gov/cgibin/askcensus.cfg/php/enduser/std_adp.php?p_faqid=245&p_created=107712247
3&p_sid=l2jFYgji&p_lva=245&p_sp=cF9zcmNoPTEmcF9zb3J0X2J5PSZwX2d
yaWRzb3J0PSZwX3Jvd19jbnQ9NzUmcF9wcm9kcz0mcF9jYXRzPSZwX3B2PS
ZwX2N2PSZwX3BhZ2U9MSZwX3NlYXJjaF90ZXh0PWRlZmluZSB0cmFjdA
**&p_li=&p_topview=1.
Wolshon, B., A. Catarella-Michel, and L. Lambert. 2006. Louisiana highway evacuation
plan for Hurricane Katrina: Proactive management of a regional evacuation.
Journal Of Transportation Engineering-Asce 132 (1):1-10.
Wolshon, Brian, Elba Urbina Hamilton, Chester Wilmot, and Marc Levitan. 2005.
Review of policies and practices for hurricane evacuation I: transportation
planning, preparedness, and response. Natural Hazards Review 6 (3):129-142.
Wood, Nathan J., and James W. Good. 2004. Vulnerability of port and harbor
communities to earthquake and tsunami hazard: the use of GIS in community
hazard planning. Coastal Management 32:243-269.
Wu, Shang-Ye, Brent Yarnal, and Ann Fisher. 2002. Vulnerability of coastal
communities to sea-level rise: a case study of Cape May County, New Jersey,
USA. Climate Research 22:255-270.
