A new approach for computing a flood vulnerability index using cluster analysis
Article info
Article history:
Received 12 February 2015
Received in revised form 1 April 2016
Accepted 5 April 2016
Available online xxx
1. Introduction
From 2001 to 2010, hydrological disasters in Europe (flood and mass movements)
represented the largest share of total disaster victims (55.1%) and millions of
Euros worth of damages (Guha-Sapir et al., 2012). Flood risk assessment
entails understanding vulnerability, which is an important issue at present,
because
http://dx.doi.org/10.1016/j.pce.2016.04.003
1474-7065/© 2016 Elsevier Ltd. All rights reserved.
Please cite this article in press as: Fernandez, P., et al., A new approach for
computing a flood vulnerability index using cluster analysis, Physics and
Chemistry of the Earth (2016), http://dx.doi.org/10.1016/j.pce.2016.04.003
P. Fernandez et al. / Physics and Chemistry of the Earth xxx (2016) 1-9
Flood vulnerability is assessed for the municipality of Vila Nova de Gaia (Fig.
1), situated in Northern Portugal, where several floods have occurred (Brandão et
al., 2014; Zêzere et al., 2014). According to the DISASTER Project (Zêzere et
al., 2014), any flood event is stored in the database if it led to casualties
or injuries, and missing, evacuated or homeless people, independent of the
number of people affected. Between 1865 and 2010, 57 floods were reported in Vila
Nova de Gaia municipality, accounting for a total of four deaths, as well as
the evacuation of 123 people and the displacement of 2930. The municipality
ranks fourth in the Portuguese ranking of flood disasters. Between 1999 and 2009,
1275 flood events were reported by the national civil protection service.
The municipality has an area of 168.46 km² and is divided into 24 civil parishes
and 3076 neighbourhoods. It is the third most populous municipality in Portugal,
with 302,295 inhabitants in 2011, approximately 180,000 of whom are urban
residents. From 2001 to 2011 the number of residents increased by approximately
15,000. The population density is about 1795 inhabitants per km² and the building
density is about 386 buildings per km².
The predominant land uses are urban (43%) and forest (39%).
In this study two data sets were used: the Geographic Information of Portuguese
Statistics (2011 Census) provided by the National Statistics Institute and the
land use maps for 1990 and 2007 provided by the Portuguese Geographic Institute.
The 2011 Census data are geo-referenced information based on small territorial
units. The neighbourhood is the territorial unit which identifies the smallest
homogeneous area, whether built-up or not, that exists in the statistical section.
It represents a block in urban areas.
A vulnerability study has to take into account several variables within an area
such as: age, gender, race, ethnicity, social class, unemployment rate, immigrant
status, density and quality of the built environment, land use, housing tenancy
and presence of informal support networks (Borden et al., 2007; Burton and
Cutter, 2008; Cutter et al., 2003; Cutter et al., 2000; Fekete, 2009a; Finch et
al., 2010; Lein and Abel, 2010; Masozera et al., 2007; Rygel et al., 2006;
Schmidtlein et al., 2011; Schneiderbauer and Ehrlich, 2006; Simpson and Katirai,
2006; Tapsell et al., 2002; Wisner et al., 2004).
The variables for the social, economic and physical dimensions were derived from
the information available in the 2011 Census at the neighbourhood level.
Additionally, environmental variables were added from the land use data. The
criteria for variable selection were taken from the literature and are
presented in Table 1.
PCA reduces the number of variables and determines some components that summarise
different vulnerability characteristics. The proposed FloodVI is estimated
according to the following steps, as illustrated in Fig. 2:
Table 1
Considerations for variable selection.

Variables: Considerations

Building density: In urban areas with high population density, the rescue process is often rather complicated. In some cases, high population density is related to a relative number of lower income families (Masozera et al., 2007). Building density is a factor which influences vulnerability in inundated areas because of the potential increase of buildings' exposure to flooding (Cardona, 2005; Tapsell et al., 2002).

Number of floors: The upper floors of buildings may be used to protect people and their belongings (Schneiderbauer, 2007).

Construction period: The more recent building constructions are based on structure safety regulations. Therefore, these buildings are often more resistant. Older neighbourhoods also have older sewage systems which may be more susceptible to flooding (Simonovic et al., 2007).

Building structure: The main floor and wall construction materials determine the building's physical fragility to a flood event and indicate resistance to damage, as well as the social status of the residents (Müller et al., 2011; Schneiderbauer, 2007).

Housing occupancy: Landlords are more likely to pursue construction changes in their buildings and to have insurance or increase their insurance coverage than tenants (Tapsell et al., 2002). People that rent a house usually do not have the financial resources for home ownership and often do not have access to information about financial support during recovery (Cutter et al., 2003; Fekete, 2009a).

Gender: Women have a higher perception of risk and are better prepared for action (Fekete, 2009a). However, women can have more difficulty during recovery, often due to lower income and greater family care responsibilities (Cutter et al., 2003; Fekete, 2009a; Hewitt, 1997).

Education level: The level of education and illiteracy rate are clear factors of socioeconomic vulnerability, because there is a direct relationship between these and economic capacity, social status and job opportunities (Cutter et al., 2003; Fekete, 2009a).

Age: The elderly have limited mobility and physical difficulties in evacuations. They are more reluctant to leave their homes, have health-related problems and longer recovery time (Rygel et al., 2006). The very young also have high physical fragility and dependency (Cutter et al., 2003; Fekete, 2009a; Hewitt, 1997; Kuhlicke et al., 2011).

Unemployment: The unemployed comprise a special group that is more dependent on other family members and on the government (Fekete, 2009a). The unemployed potentially have lower financial assets, so their houses are of lower quality and are most probably not insured (Balica, 2012).

Household composition
analysis and the elimination of redundant data. This procedure measures if the
correlations are appropriate to carry out PCA.
3) Analysis of PCA output results. The Kaiser-Meyer-Olkin test (KMO) measures
the sampling adequacy and shows the extent to which the data fits factor analysis,
thus determining the level of confidence that can be expected when using factor
analysis (Hair et al., 2009). KMO values above 0.6 indicate an acceptable
level and above 0.8 a good compatibility level of variables (Hutcheson and
Sofroniou, 1999). Those components whose eigenvalues are greater than one are
selected (Kaiser, 1960). Communalities measure the extent to which the variance
of the original variables is accounted for by the observed components. The
communalities values should all be greater than 0.5.
4) Analysis of variance explained. The number of components needed in order
to account for a pre-specified amount of original data variation should be
retained. The smallest number of components is chosen such that at least 80% of
the original data variation is explained.
5) Rotation of the initial PCA solution using the Varimax rotation.
This is a popular orthogonal factor rotation method and the factors are
extracted so that their axes are maintained at 90°. This generally simplifies
the relationships among the variables and clarifies the interpretation of the
factors.
6) Calculation of the component scores for each neighbourhood.
The component score is a composite measure created for each observation of each
extracted factor in the factor analysis. The component scores are standardized to
a z-score.
7) Aggregation method, assigning a vulnerability class to each neighbourhood.
The novel aggregation method proposed here is based on CA, which is regarded as
the most practical method of establishing regions with similar characteristics
from large data sets (Hosking
normalized PC using the Euclidean distance measure. The K-means is a
partition clustering approach by which each point is assigned to the cluster with
the closest centre, with a pre-defined number of clusters. The K-means algorithm
aims at minimizing an objective function (equation (1)), in this case a
squared error function.
$$J = \sum_{j=1}^{k} \sum_{i=1}^{n} \left\| x_i^{(j)} - c_j \right\|^2 \qquad (1)$$
where k is the number of clusters, n is the number of cases and
$\left\| x_i^{(j)} - c_j \right\|^2$ is a distance function between case i and
the centre of cluster j ($c_j$) (Jain and Dubes, 1988).
The cluster centre is defined as a parameter set that has the minimum average
Euclidean distance to each of the members in the cluster. This distance is
weighted by the explained variance obtained for the PC. In this way, the
dimensions that are considered more important will have more impact on the
clustering process. The number of clusters used in clustering is five, matching
the five vulnerability classes (very low, low, medium, high, and very high).
Each cluster is then characterized by a single value, which reduces the
four-dimensional space to one dimension during the classification process. This
value is the mean of the coordinates of the cluster centre. The cluster with the
smallest value is classified as the least vulnerable and the one with the
highest value as the most vulnerable.
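The clustering and ranking just described can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the explained-variance weights multiply each squared coordinate difference, uses a plain Lloyd-style K-means with random initial centres, and runs on synthetic component scores. The names `weighted_sq_dist`, `kmeans`, `rank_clusters` and the demo data are hypothetical.

```python
import random

def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with one weight per component (assumed form)."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))

def kmeans(points, k, weights, n_iter=100, seed=0):
    """Lloyd-style K-means on component scores with a weighted distance."""
    rng = random.Random(seed)
    centres = [list(p) for p in rng.sample(points, k)]  # random initial centres
    labels = None
    for _ in range(n_iter):
        # Assignment step: each case joins the cluster with the closest centre.
        new_labels = [
            min(range(k), key=lambda j: weighted_sq_dist(p, centres[j], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centre becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous centre
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

def rank_clusters(centres):
    """Map cluster index to a class using the mean of the centre coordinates."""
    classes = ["very low", "low", "medium", "high", "very high"]
    order = sorted(range(len(centres)),
                   key=lambda j: sum(centres[j]) / len(centres[j]))
    return {j: classes[rank] for rank, j in enumerate(order)}

# Synthetic stand-in for standardized scores on four components (hypothetical data).
demo_rng = random.Random(1)
scores = [[lvl + demo_rng.gauss(0, 0.1) for _ in range(4)]
          for lvl in (-2.0, -1.0, 0.0, 1.0, 2.0) for _ in range(20)]
# Explained-variance shares of the four components, used here as weights.
weights = [0.444, 0.240, 0.112, 0.065]

labels, centres = kmeans(scores, k=5, weights=weights)
class_of = rank_clusters(centres)
vulnerability = [class_of[lab] for lab in labels]
```

K-means depends on the initial centres, so in practice several seeds would normally be tried and the run with the smallest objective value kept.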
To assess FloodVI sensitivity three other aggregation methods are considered:
Sum of components (Aggregation 1): this is a simple approach that adds the
component scores (CS), assigning equal weight to each component of the index
(equation (2)).
First component (Aggregation 2): the first extracted component is the linear
combination of variables that explains the largest amount of variation in the
original data. Therefore, selecting only the first component (CS1) will give
the mathematically optimal value that summarizes all the input variables in a
single combination (equation (3)).
$$\text{Aggregation 2} = CS_1 \qquad (3)$$
Weighted sum of components (Aggregation 3): This is a compromise between
the first two methods, where each component's weight (vi) is the proportion
between the explainable variance and the total variation (equation (4)).
$$\text{Aggregation 3} = \sum_{i=1}^{n} v_i \, CS_i \qquad (4)$$
to z-scores. These z-score values are then used to classify the corresponding
neighbourhood into a vulnerability class. These classes correspond to intervals
whose width is determined by the standard deviation of the z-score values
(which is 1, since the z-scores are normalized). Hence, the corresponding
neighbourhoods are classified as having very high vulnerability for z-score
values > 1.5, high vulnerability for 0.5 < z-score values < 1.5, medium
vulnerability for -0.5 < z-score values < 0.5, low vulnerability for
-1.5 < z-score values < -0.5, and finally very low vulnerability for z-score
values < -1.5. Selection of these values is supported by the existing
literature (Cutter et al., 2003; Dunning and Durden, 2013; Schmidtlein et al.,
2008) and assumes that the z-scores have a normal distribution. In this way, and
considering the 5 vulnerability classes, the probability of having a
neighbourhood with medium vulnerability class is 38.292% against 24.173% for the
low and high vulnerability classes and 6.681% for the other two classes. This
form of classification forces the neighbourhoods into a certain class, depending
on the thresholds used. In contrast, and more accurately, CA does not
impose any kind of constraint on the distribution of the areas to be
classified. Henceforth, the cluster aggregation method will be referred to as
Aggregation 4.
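The three threshold-based aggregations and the z-score classification can be sketched as below. This is an illustrative reading of the text, not the authors' code. The component scores are hypothetical, the weights reuse the explained-variance shares as an assumed form of vi, the population standard deviation is an assumed choice, and values falling exactly on a threshold are assigned arbitrarily because the text only gives strict inequalities.

```python
from statistics import mean, pstdev

def aggregate(component_scores, weights=None):
    """Weighted sum of component scores per neighbourhood (one row each)."""
    if weights is None:
        weights = [1.0] * len(component_scores[0])  # Aggregation 1: equal weights
    return [sum(w * cs for w, cs in zip(weights, row)) for row in component_scores]

def zscores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)  # population SD: an assumption
    return [(v - m) / s for v in values]

def classify(z):
    """Five classes from the +/-0.5 and +/-1.5 thresholds; ties are arbitrary."""
    if z > 1.5:
        return "very high"
    if z > 0.5:
        return "high"
    if z >= -0.5:
        return "medium"
    if z >= -1.5:
        return "low"
    return "very low"

# Hypothetical component scores (CS1..CS4) for six neighbourhoods.
cs = [
    [1.9, 0.2, -0.4, 0.1],
    [0.8, -0.3, 0.5, 0.0],
    [0.1, 0.1, 0.0, -0.2],
    [-0.2, 0.4, -0.1, 0.3],
    [-0.9, -0.5, 0.2, -0.1],
    [-1.7, 0.1, -0.2, -0.1],
]
v = [0.444, 0.240, 0.112, 0.065]  # assumed weights v_i (explained-variance shares)

agg1 = aggregate(cs)                  # sum of components
agg2 = aggregate(cs, [1, 0, 0, 0])    # first component only
agg3 = aggregate(cs, v)               # weighted sum of components

classes = {name: [classify(z) for z in zscores(vals)]
           for name, vals in (("agg1", agg1), ("agg2", agg2), ("agg3", agg3))}
```

Writing all three aggregations as one weighted sum makes the relationship explicit: they differ only in the weight vector applied before standardization.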
After PCA, a KMO test value of 0.882 was achieved, which can be considered good.
The relationship between the selected variables can be described by 4
components which explain 86.1% of the variance. The first component explains
44.4% of the variance and can be related to the social and economic dimensions of
vulnerability. It includes the following variables: male inhabitants; female
inhabitants; age; unemployment; education level; economic activity sector;
housing occupancy and household composition. The second component, which explains
24.0% of the variance, addresses building features and includes: building density;
number of floors; construction period and building structure. The third and fourth
components include the environmental dimension, where component three (which
explains 11.2%) is related to the urban aspects: urban land use and urban land
use change (1990 and 2007) and component four (which explains 6.5%) comprises the
rural variables: agricultural land use and forest land use.
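As a quick check of the retention rule (the smallest number of components explaining at least 80% of the variance), the explained-variance figures quoted above can be accumulated directly. The helper name `n_components_for` is hypothetical; the percentages are those given in the text.

```python
# Explained variance (%) of components 1 to 4, as reported in the text.
explained = [44.4, 24.0, 11.2, 6.5]

def n_components_for(explained_pct, threshold=80.0):
    """Smallest number of leading components whose cumulative share meets the threshold."""
    total = 0.0
    for n, pct in enumerate(explained_pct, start=1):
        total += pct
        if total >= threshold:
            return n
    return None  # threshold not reached with the components supplied

retained = n_components_for(explained)   # number of components needed for 80%
total_explained = sum(explained)         # cumulative share of all four components
beyond_first = sum(explained[1:])        # share carried by components two to four
```

With these figures the first three components reach 79.6%, just short of the threshold, so the fourth component is needed to pass 80%.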
The rotated component matrix (Table 2) shows the variable
Table 2
Variable loads in the rotated component matrix.
Variables    Components: 1  2  3  4
The four aggregation methods (Section 2.3) were applied to compute the FloodVI.
With the first three aggregation methods, among the 3076 neighbourhoods, 205
have very low vulnerability, 744 have low vulnerability, 1178 have medium
vulnerability, 744 have high vulnerability and 205 have very high vulnerability.
Noteworthy is the artificial distribution of the number of neighbourhoods per
class due to the classification procedure. Indeed, the commonly used thresholds
(Section 2.3) force, for example, the medium vulnerability class to have a
greater number of neighbourhoods. The method using CA provides the following
results: 1354 neighbourhoods have very low vulnerability, 369 have low
vulnerability, 215 have medium vulnerability, 861 have high vulnerability and
277 have very high vulnerability. The spatial distribution of these classified
neighbourhoods per aggregation method is illustrated in Fig. 3.
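The class sizes reported for the threshold-based methods appear consistent with what a standard normal distribution implies for 3076 neighbourhoods. The short check below derives the class probabilities from the thresholds at ±0.5 and ±1.5 and rounds the expected counts. It illustrates the arithmetic only and says nothing about how the authors obtained their figures.

```python
from statistics import NormalDist

phi = NormalDist().cdf          # standard normal cumulative distribution
n_neighbourhoods = 3076

# Probability mass of each class under the +/-0.5 and +/-1.5 thresholds.
probs = {
    "very low": phi(-1.5),
    "low": phi(-0.5) - phi(-1.5),
    "medium": phi(0.5) - phi(-0.5),
    "high": phi(1.5) - phi(0.5),
    "very high": 1.0 - phi(1.5),
}

# Expected number of neighbourhoods per class if the z-scores were exactly normal.
expected = {name: round(n_neighbourhoods * p) for name, p in probs.items()}

# Class sizes reported in the text for the cluster-based method (Aggregation 4).
cluster_counts = {"very low": 1354, "low": 369, "medium": 215,
                  "high": 861, "very high": 277}
```

The cluster-based counts also sum to 3076 but follow no fixed proportions, which is the contrast the text draws between the two classification procedures.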
Fig. 3 shows that the spatial distributions of the 5 vulnerability classes produced
by aggregation methods 1 and 2 are very different, especially in the
southeast part of the municipality. This indicates that it is important to also
take into account components two, three and four, which represent 41.7% of the
variance of the variables involved. Fig. 3 also shows that the results produced
with aggregation method 3 are very similar to those produced with method 1, since
the aggregation measures used by both methods are very similar. As the first
component is valued more in aggregation method 3 than in method 1, the
results obtained with methods 3 and 2 are less dissimilar than those obtained
with methods 1 and 2, with most of the dissimilarities also located in the
southeast part of the municipality. The major differences are between the results
of aggregation methods 1, 2 and 3 and those of method 4, where there is a
predominance of areas with high vulnerability.
Correlation of the results produced by the four different aggregation methods
confirms the aforesaid (Table 3).
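A pairwise comparison of this kind can be sketched as follows. The text does not say which correlation coefficient was used, so Pearson's r is assumed here, and the values per method are hypothetical. `pearson` and `lower_triangle` are illustrative names.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def lower_triangle(columns):
    """Correlations between every pair of columns, laid out as a lower triangle."""
    return [[pearson(columns[i], columns[j]) for j in range(i + 1)]
            for i in range(len(columns))]

# Hypothetical index values per neighbourhood for four aggregation methods.
methods = [
    [0.2, 1.4, -0.7, 0.9, -1.1, 0.3],
    [0.5, 1.0, -0.2, 0.1, -1.5, 0.8],
    [0.3, 1.3, -0.5, 0.6, -1.2, 0.5],
    [1.0, -1.0, 0.0, 2.0, 1.0, -2.0],
]
table = lower_triangle(methods)
```

Each row of `table` corresponds to one method and holds its correlation with every earlier method, ending with 1 on the diagonal, which is the same layout as Table 3.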
Analysing the percentage of the area classified according to each aggregation
method (Fig. 4a), it can be concluded that aggregation methods 1 and 3
present similar results. According to aggregation method 2, 52.5% of the area is
classified as having medium vulnerability, which strongly disagrees with the
7.1% (lowest percentage) obtained by aggregation method 4, which classifies
45.6% of the area as having high vulnerability.
Although we have endorsed 5 vulnerability classes, the sensitivity analysis is
more straightforward if we combine the Low and Very low vulnerability classes into
one single class. The same is true for the High and Very high vulnerability classes
(Fig. 4b). From Fig. 4b one can conclude that aggregation methods 1 and 3
provide similar results, although the latter is sounder from the conceptual point
of view. Considering aggregation method 2, the medium and low vulnerability
classes still have the highest percentage of area, which can be considered
conservative since the physical and the environmental dimensions are not
accounted for. Aggregation method 4 now classifies 92.9% of the area as having
high and low vulnerability.
Intersecting the results obtained with the four aggregation methods, it is
noted that 36 neighbourhoods maintain their classification as high or very high,
none keep the medium vulnerability classification and 49 neighbourhoods maintain
their classification of low or very low vulnerability. Aggregation methods 1, 2
and 3 have been used and reported in the literature, yet no attention has been
paid to their somewhat discordant results.
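The intersection described above can be sketched as a count of neighbourhoods whose merged class (low, medium or high) is the same under every method. The labels below are hypothetical and `stable_counts` is an illustrative name. The merging follows the three-class grouping used in the sensitivity analysis.

```python
# Merge the five classes into three, as in the sensitivity analysis.
MERGE = {"very low": "low", "low": "low", "medium": "medium",
         "high": "high", "very high": "high"}

def stable_counts(labels_by_method):
    """Count neighbourhoods that keep the same merged class under all methods.

    labels_by_method holds one list of class labels per aggregation method,
    with neighbourhoods in the same order in every list.
    """
    counts = {"low": 0, "medium": 0, "high": 0}
    for labels in zip(*labels_by_method):
        merged = {MERGE[label] for label in labels}
        if len(merged) == 1:            # all methods agree for this neighbourhood
            counts[merged.pop()] += 1
    return counts

# Hypothetical labels for five neighbourhoods under four methods.
labels_by_method = [
    ["high", "low", "medium", "very low", "very high"],
    ["very high", "low", "medium", "low", "medium"],
    ["high", "very low", "high", "low", "high"],
    ["very high", "low", "low", "very low", "high"],
]
agreement = stable_counts(labels_by_method)
```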
4. Conclusions
Table 3
Correlation values between aggregation methods.

Aggregation      1       2       3       4
1                1       -       -       -
2                0.46    1       -       -
3                0.79    0.74    1       -
4               -0.19    0.19    0.07    1
The developed and implemented FloodVI describes how social and economic
characteristics of the population, building features and environmental issues
behave in terms of resistance and resilience to flood impact. This study
integrates mathematical analysis (PCA and CA) and Geographic Information System
(GIS) techniques to estimate several vulnerability dimensions. It is also a
contribution to flood risk assessment.
FloodVI has been proven to be sensitive to the aggregation model, with both sum
of components and weighted sum of components
Fig. 4. Percentage of area for each vulnerability class according to the four
different aggregation methods: a) five classes b) three classes.
References
Abdi, H., Williams, L.J., 2010. Principal component analysis. Wiley Interdiscip. Rev. Comput. Stat. 2, 433-459.
Abson, D.J., Dougill, A.J., Stringer, L.C., 2012. Using principal component analysis for information-rich socio-ecological vulnerability mapping in southern Africa. Appl. Geogr. 35, 515-524.
Adger, W.N., 2006. Vulnerability. Global Environ. Change 16, 268-281.
Alexander, D., 2000. Confronting Catastrophe. Oxford University Press, New York.
Armas, I., Gavris, A., 2013. Social vulnerability assessment using spatial multi-criteria analysis (SEVI model) and the social vulnerability index (SoVI model) - a case study for Bucharest, Romania. Nat. Hazards Earth Syst. Sci. 13, 1481-1499.
Balica, S., Wright, N.G., 2010. Reducing the complexity of the flood vulnerability index. Environ. Hazards 9, 321-339.
Balica, S.F., 2012. Applying the Flood Vulnerability Index as a Knowledge Base for Flood Risk Assessment. Delft University of Technology and Academic Board of the UNESCO-IHE, Delft, Netherlands, p. 152.
Barnett, J., Lambert, S., Fry, I., 2008. The Hazards of Indicators: Insights from the Environmental Vulnerability Index. Ann. Assoc. Am. Geogr. 98, 102-119.
Borden, K.A., Schmidtlein, M.C., Emrich, C.T., Piegorsch, W.W., Cutter, S.L., 2007. Vulnerability of U.S. Cities to Environmental Hazards. J. Homel. Secur. Emerg. Manag. 4, 1547-7355.
Böhringer, C., Jochem, P., 2007. Measuring the immeasurable - a survey of sustainability indices. Ecol. Econ. 63 (1), 1-8.
Brandão, C., Saramago, M.M., Ferreira, T., Cunha, S., Costa, S., Alvarez, T., Carvalho, F.F.d., Silva, M., Duarte, C., Braunschweig, F., Brito, D., Fernandes, L., Jauch, E., Silva, R.P., 2014. Elaboração de Cartografia Específica sobre Risco de Inundação para Portugal Continental. Relatório Final, Volume 1-Memória Descritiva. Agência Portuguesa do Ambiente, Lisbon, Portugal, p. 260.
Burton, C., Cutter, S., 2008. Levee Failures and Social Vulnerability in the Sacramento-San Joaquin Delta Area, California. Nat. Hazards Rev. 9, 136-149.
Cardona, O.D., 2005. Indicators of Disaster Risk and Risk Management: Summary Report. Inter-American Development Bank, Washington, D.C.
Chakraborty, J., Tobin, G., Montz, B., 2005. Population evacuation: assessing spatial variability in geophysical risk and social vulnerability to natural hazards. Nat. Hazards Rev. 6, 23-33.
Cutter, S.L., Boruff, B.J., Shirley, W.L., 2003. Social vulnerability to environmental hazards. Soc. Sci. Q. 84, 242-261.
Cutter, S.L., Emrich, C.T., Morath, D.P., Dunning, C.M., 2013. Integrating social vulnerability into federal flood risk management planning. J. Flood Risk Manag. 6, 332-344.
Cutter, S.L., Emrich, C.T., Webb, J.J., Morath, D., 2009. Social Vulnerability to Climate Variability Hazards: a Review of the Literature. Hazards and Vulnerability Research Institute, Department of Geography - University of South Carolina, Columbia.
Cutter, S.L., Mitchell, J.T., Scott, M.S., 2000. Revealing the vulnerability of people and places: A case study of Georgetown County, South Carolina. Ann. Assoc. Am. Geogr. 90 (4), 713-737.
Dunning, C.M., Durden, S., 2013. Social Vulnerability Analysis: A Comparison of Tools. U.S. Army Corps of Engineers Institute for Water Resources (IWR), Alexandria, VA.
Emori, S., Brown, S.J., 2005. Dynamic and Thermodynamic Changes in Mean and Extreme Precipitation Under Changed Climate. Geophysical Research Letters 32.
Fekete, A., 2009a. Assessment of Social Vulnerability for River-floods in Germany. Institute for Environment and Human Security. United Nations University, Bonn.
Fekete, A., 2009b. Validation of a social vulnerability index in context to river-floods in Germany. Nat. Hazards Earth Syst. Sci. 9, 393-403.
Fekete, A., 2012. Spatial disaster vulnerability and risk assessments: challenges in their quality and acceptance. Nat. Hazards 61, 1161-1178.
Felsenstein, D., Lichter, M., 2014. Social and economic vulnerability of coastal communities to sea-level rise and extreme flooding. Nat. Hazards 71, 463-491.
Finch, C., Emrich, C., Cutter, S., 2010. Disaster disparities and differential recovery in New Orleans. Popul. Environ. 31, 179-202.
Groisman, P.Y., Knight, R.W., Easterling, D.R., Karl, T.R., Hegerl, G.C., Razuvaev, V.N., 2005. Trends in intense precipitation in the climate record. J. Clim. 18, 1326-1350.
Guha-Sapir, D., Vos, F., Below, R., Ponserre, S., 2012. Annual Disaster Statistical Review 2011: the Numbers and Trends. CRED, Brussels.
Hair, J.F., Black, W.C., Babin, B.J., Anderson, R.E., 2009. Multivariate Data Analysis: a Global Perspective. Pearson Education, London.