A SIMULATION STUDY OF HAIL RESISTANT ROOF STRATEGIES
Robert L. McPherson
Simulation for Management Decision Making, CSCI E-135
Harvard University Extension
Professor: Stephan Kolitz, Ph.D.

Chief Teaching Assistant: Atul Mohindra
Teaching Assistant: Dawna Wu
May 12, 2010
Abstract
An initial evaluation of the published statistics concerning the benefits of hail-resistant roofs might lead insurance
executives, state insurance regulators, and building code advocates to pursue incentives and laws that would strongly
promote investing extra money to install hail-resistant roofs when conventional roofs are destroyed as a result of hail-
damage. However, an evaluation of the long-term benefits with the use of simulation modeling technology reveals
the answer may not be so clear cut. Hail storm events that are sufficiently strong to cause significant damage (over
$1 million per event, which is a low threshold by insurance company standards) are relatively infrequent, and the
variance in the severity of these events is such that it could take an extremely long time to realize a clear payoff from
such efforts.
This paper is to fulfill the requirements for a final project for Simulation Modeling for Decision Making, ISMT
E-135, Harvard University Graduate Extension School.
Robert McPherson
ISMT E-135, Spring 2010
CONTENTS Final Project
Contents
1 Problem Formulation
2 System Analysis and Requirements
3 Developing the Model and Initial Validation
4 Data Collection
4.1 Examining the Data
4.2 Data Analysis of Hail Storm Property Damage
4.3 Data Analysis of Hail Storm Inter-Arrival Times
4.4 Data on Standard and Hail Resistant Roofs
4.5 Simulating the Data
5 Model Implementation, Testing, Verification, and Validation
5.1 Model Implementation
5.2 Simplifying Assumptions in the Models
5.3 Testing, Verification, and Validation
6 Design Experiments and Make Model Runs
7 Data Analysis and Results Documentation
7.1 Comparisons over 50 Years
7.2 Comparisons over 100 Years
7.3 Comparisons over 250 Years
7.4 Sensitivity Testing
8 Conclusion
List of Figures
1 Basic design of proposed model
2 Histogram of monthly hail property damage losses over $1 million, January 1993 - March 2010
3 Histogram of monthly inter-arrival times in days for hail property damage losses over $1 million, January 1993 - March 2010
4 Scatter-plot matrix of all hail loss data and variables
5 Decomposition of seasonality and trend of hail property damage data
6 Decomposition of seasonality and trend for inter-arrival between hail events, in days
7 GEV exponential distribution fitted to property damage data
8 Generalized Pareto fitted with exponential distribution, on property damage data with a threshold of 200 million dollars
9 Scatter plot of remaining nine extreme property damage values above the 200 million dollar threshold
10 Exponential distribution is best fitting from Arena Input Analyzer for property damage over 200 million dollar threshold
11 Plot of autocorrelation function indicating little evidence of non-stationarity
12 Lognormal distribution fitted to untransformed inter-arrival data
13 Configuration of the “Create Hail Storm Event” object
14 Configuration of the lognormal distribution for “Storm Severity” object
15 Configuration of the Hail Size Magnitude object
16 Configuration of gamma distribution random generator for hail stone size
17 Configuration of property damage object (i.e., “Storm Severity”)
18 Configuration of log-Weibull random generator behind “Storm Severity” object
19 Model structure for conventional asphalt roofing replacements
20 Model structure for hail resistant roofing replacements
21 A single year of simulated hail losses shows extreme variability inherent in losses from actual hail events
22 Histograms of inter-arrival times in days (left), and loss severity in dollars (right)
23 Average losses for each of the 365 days in a year over a single year (i.e., one replication)
24 Average losses for each of the 365 days in a year over 200 years (i.e., 200 replications)
25 Basic setup for comparing hail-resistant roof replacement strategies in Process Analyzer software
26 Box plot of 50 replications indicating no single scenario is significantly better
27 Box plot of 100 replications indicating no single scenario is significantly better, even after 100 years
28 Box plot of 250 replications shows all three hail-resistant scenarios are finally significantly better than non-hail resistant
29 50-year comparison of exponential inter-arrival with mean of 16.2 days for sensitivity testing
30 100-year comparison of exponential inter-arrival with mean of 16.2 days for sensitivity testing
1 Problem Formulation
Installation of hail-resistant roofs could save insurance companies significant claims costs, and reduce premiums for
consumers. However, hail storm events are extremely variable in terms of arrivals, as well as the amount of damage
incurred. Additionally, hail-resistant roofing is more expensive than regular roofing, and this added cost is also variable,
depending on price differences among contractors and suppliers. Because of these factors, determining whether
it is worthwhile to implement various strategies to install hail-resistant roofs is not as simple as calculating average
loss amounts for various roof types and examining average installation costs. It could take a great many years to
realize a significant return on such efforts.
A simulation model could help answer whether it would be worthwhile for insurance companies to offer policyholders
incentives to replace hail-damaged roofs with upgraded hail-resistant shingles. It could also provide information for
legislators considering whether to enact building codes requiring roof replacements with hail-resistant shingles.
As we shall see, studies indicate that hail-resistant roofs can reduce damageability of roofs by about half, and the
added cost to install these roofs is only a fraction of this amount. On the surface, it would appear that spending the
extra money to install these roofs is obviously a good thing for the insurance companies. We shall test this assumption
in the analysis to follow. Specifically, the analysis will explore whether the reduction in losses for hail-resistant roofing
is truly significant enough to outweigh the added cost, and whether the rarity of hail-storm events, as well as the high
degree of variability in losses, causes any potential payoffs to fail to be fully realized in a reasonable amount of time.
2 System Analysis and Requirements

In its simplest form, the basic system should account for the following.
• Inter-arrival times between hail storm events

• A means of determining the amount of loss from each event
• A mechanism for estimating the number of conventional, non-hail resistant roofs that have been replaced with
hail-resistant roofs after each hail storm event
• A method to determine the added cost of installing hail-resistant roofs
• A means for tallying the losses from both roof types (hail-resistant, and conventional)
• Various mechanisms for collecting data for validation and analysis
Also, there should be a means for accounting for the extreme amount of uncertainty inherent in natural catastrophe
events, such as large hail storms. Additionally, uncertainty should be accounted for in the amount of loss reduction
that is actually experienced from hail-resistant roofs, as well as in the added cost to install them.
3 Developing the Model and Initial Validation

A simplified view of the proposed logic for the model is shown in figure 1.
The model includes random variables for the following.
• Creation of hail storm events
• Total dollar loss severity of each storm event
• Loss reduction amount from hail-resistant roofing
• Increased cost for installing hail resistant roofing
Figure 1: Basic design of proposed model
An initial, very simple model was built to validate that simulating the process in this manner could produce plausible
results. Several scenarios were developed with varying proportions of conventional roof replacements to hail-resistant
roofing. The Process Analyzer software module that is packaged with the Arena simulation software by Rockwell was
utilized to compare the results.
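The basic logic described above can be sketched in a few lines of Python. This is only an illustrative stand-in for the Arena model, not the model itself: the function name and parameters are hypothetical, both distributions are assumed to be exponential purely for simplicity, the 16.2-day mean gap and 31.1 mean loss (in millions of dollars) are borrowed from figures quoted elsewhere in this paper, and the added installation cost of hail-resistant roofs is left out.

```python
import random


def simulate_total_loss(years, share_resistant, seed=0,
                        mean_gap_days=16.2,   # assumed mean days between events
                        mean_loss=31.1,       # assumed mean loss per event ($ millions)
                        loss_reduction=0.5):  # assumed damage reduction for resistant roofs
    """Total simulated hail loss ($ millions) over a horizon of `years`.

    share_resistant is the (hypothetical) fraction of exposed roofs that are
    hail resistant; it is held constant here, unlike in the full model.
    """
    rng = random.Random(seed)
    horizon_days = years * 365
    clock = 0.0
    total = 0.0
    while True:
        # Draw the time until the next hail storm event.
        clock += rng.expovariate(1.0 / mean_gap_days)
        if clock > horizon_days:
            break
        # Draw the event's loss severity, then apply the roof-type reduction.
        loss = rng.expovariate(1.0 / mean_loss)
        total += loss * (1.0 - share_resistant * loss_reduction)
    return total
```

Calling the function with the same seed and different values of `share_resistant` compares scenarios on identical storm histories, which is one simple way to isolate the effect of the roof strategy from the randomness of the storms.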
The initial proposed models are not shown here, since they differ from the final models, and could cause confusion
with the final result. The models implemented in this analysis were largely shaped by the data that were available, and
they evolved as discoveries about the data unfolded. For instance, the initial models included hail storm events
only for the state of Colorado, and for a very limited time frame. Through later data analysis, it was discovered
that there were far too few data points with which to find a well-fitting distribution. Thus, the resulting model encompassed
hail events and losses for the entire U.S. over a seven year period (the maximum available). Also, the initial models
added a layer of complexity that greatly broadened the scope by considering claim paying operations as a submodel.
While this may be desirable for future models, it goes beyond the main focus of this analysis. However, despite the
changes in the final models, the basic model logic, as shown in figure 1, remains the same.
4 Data Collection
Weather data for this study were collected from the National Oceanic and Atmospheric Administration’s free, on-line
data resource, the National Climatic Data Center (http://www.ncdc.noaa.gov/oa/ncdc.html). As of the
time of this study, hail data were available for the entire U.S. on a monthly basis, from January 1993 through January
2010. (Although this is being written in April 2010, data after January 2010 are not yet posted on the website.) All
of the available monthly hail data from January 1993 through January 2010 were collected and stored in a database.
Since the data can only be downloaded in smaller increments, one year had to be downloaded at a time and assembled
in the database.
Initially, the intent for this study was to utilize Colorado data only. However, it became apparent upon analysis
of the Colorado data, that there were not sufficient observations to get a good distribution fit using the Arena Input
Analyzer. Therefore, all the following analysis is on consolidated data for all of the states in the continental U.S. Also,
a threshold of one million dollars was selected as the point above which data would be collected. This limited the
amount of the data so that it would be feasible to conform to the website’s limitation of returning no more than 1,000
records per query. The website required running a separate query for each year. All queries returned fewer than 1,000
records when limiting results to events with property damage of one million dollars or more.
Utilizing thresholds in this manner for analyzing extreme weather data is common [14]. Also, insurance companies
are particularly interested in analyzing “tail events.” Variance for high frequency, low severity events is predictable
enough that they are adequately handled by conventional pricing models. However, the low frequency, high severity
events can cause losses not contemplated in conventional pricing models, and premium charges may not be adequate
to cover them. A typical threshold for catastrophic events is $250 million, such as that utilized in AIR’s Classic/2
catastrophe modeling software. This threshold is too high for this analysis, however, since we are interested in the
difference that hail resistant roofs make to general hail losses. On the other hand, a threshold below $1 million would
likely be too low, as many of these claims would turn out to be less than the homeowner policy deductibles. Thus, due
to both practicalities of the website limitations, and the nature of the question at hand, a $1 million threshold was
considered to be adequate for this analysis.
There were three variables that were collected from the NOAA website. For each hail storm event, data were
provided as to the state and county where the event occurred; the month, day, and year of occurrence; the magnitude
of the typical hail stone size; as well as the amount of property damage in U.S. dollars.
After the data were imported into a newly created database, all data attributes were aggregated to the national level
for each day that an event occurred. The hail stone sizes were aggregated to represent the maximum size on a given
day. The property damage represents the sum of all losses for a given day. The inter-arrival times were calculated and
put into a new data column by taking the difference in days between consecutive events. Structured Query Language (SQL)
code is exhibited below, to further demonstrate how the data were assembled. Lastly, the data were grouped by month
to produce a monthly time series. Since this step was done with a pivot table, there is no SQL code to exhibit.
The hail data were aggregated and grouped by event date, as shown in this SQL snippet:
SELECT
"Date",
MAX( "Mag" ) AS "Magnitude",
SUM( "PrD" ) AS "PropertyDamage",
MAX( "InterArrival" ) AS "InterArrival"
FROM "AllStatesHail_1993_2009" GROUP BY "Date"
Note that since the data were supplied by county and state, there were many duplicate dates. Hence, this is the
reason for using the maximum (i.e., “MAX”) function when grouping the inter-arrival times, as shown in the SQL
snippet. To produce the inter-arrival time in terms of days, the data were initially sorted by date, and each preceding
date was subtracted from the subsequent date. This produced an inter-arrival time of zero for each row of duplicate
dates, while the first row in the block of duplicates contained the only amount greater than zero. Therefore, using the
maximum function was an effective way to group by date, and only show inter-arrival values that are larger than zero.
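The sorting, differencing, and grouping steps just described can be reproduced as a short Python sketch. It uses a handful of rows copied from the raw NOAA sample, so the daily totals it produces cover only those rows; the function names are hypothetical and the first row is given an inter-arrival time of zero only because its predecessor is not in the sample.

```python
from datetime import date
from itertools import groupby

# (location, event date, hail magnitude, property damage in $ millions)
rows = [
    ("Madras", date(2009, 2, 18), 4.25, 4.0),
    ("Jonestown", date(2009, 3, 25), 3.0, 160.0),
    ("Locust Grove", date(2009, 3, 28), 1.75, 1.0),
    ("South Lake", date(2009, 3, 30), 2.75, 35.0),
    ("Grapevine", date(2009, 3, 30), 1.75, 5.0),
    ("Keller Alta Vista Ar", date(2009, 3, 30), 2.75, 1.0),
]


def add_inter_arrival(records):
    """Sort by date and append the gap in days since the preceding row."""
    ordered = sorted(records, key=lambda r: r[1])
    result = []
    previous = None
    for location, day, magnitude, damage in ordered:
        gap = 0 if previous is None else (day - previous).days
        result.append((location, day, magnitude, damage, gap))
        previous = day
    return result


def group_by_date(records_with_gap):
    """Mirror the SQL: MAX(Mag), SUM(PrD), MAX(InterArrival) per date."""
    grouped = []
    for day, group in groupby(records_with_gap, key=lambda r: r[1]):
        group = list(group)
        grouped.append((
            day,
            max(r[2] for r in group),   # largest hail size that day
            sum(r[3] for r in group),   # total damage that day
            max(r[4] for r in group),   # the one non-zero gap in the block
        ))
    return grouped


daily = group_by_date(add_inter_arrival(rows))
```

Duplicate dates receive a gap of zero in `add_inter_arrival`, so taking the maximum within each date keeps the single positive value, which is the same reasoning given for the `MAX` function in the SQL.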
Table 1 shows a small sample of collected hail data from NOAA, prior to grouping and summing by date.
A sample of data after it has been grouped by date at a national level is shown in table 2.
Sample of Raw Data from NOAA Website, with Calculated Inter-Arrival in Days
(PrD = property damage, in millions of U.S. dollars)

Row  Location or County    Date      Type  Mag   PrD   InterArrival
6    Madras                02/18/09  Hail  4.25  4     0
7    Hampton               02/18/09  Hail  1.75  1.4   0
8    Loganville            02/18/09  Hail  3     2.5   0
9    Chamblee              02/18/09  Hail  1.75  2.5   0
10   Lilburn               02/18/09  Hail  2     2.5   0
11   Tyrone                02/18/09  Hail  3     8     0
12   Bill Arp              02/18/09  Hail  1.75  1     0
13   Jonesboro             02/18/09  Hail  3     6     0
44   Jonestown             03/25/09  Hail  3     160   35
14   Locust Grove          03/28/09  Hail  1.75  1     3
45   South Lake            03/30/09  Hail  2.75  35    2
46   Grapevine             03/30/09  Hail  1.75  5     0
47   Keller Alta Vista Ar  03/30/09  Hail  2.75  1     0

Table 1: Small sample of hail data from the National Oceanic and Atmospheric Administration

Sample of Data Grouped by Date at a National Level

Date      Magnitude  PropertyDamage  InterArrival
02/18/09  4.25       27.9            0
03/25/09  3          160             35
03/28/09  1.75       1               3
03/30/09  2.75       94              2
04/10/09  2.75       7.5             11
04/11/09  2          161.2           1
04/16/09  2.5        40              5
04/23/09  1.75       12.8            7
06/05/09  2          1.5             43
06/06/09  0.5        1               1
06/07/09  3          162.5           1
06/30/09  2.75       65              23

Table 2: Small sample of hail data that has been grouped by date, and aggregated nationally

4.1 Examining the Data

Prior to doing analysis, it is important to get familiar with the characteristics of the data. Extreme event weather data
can be especially challenging to analyze, due to issues such as seasonality, extreme outliers, special distributions, and
dependencies among the variables. We can see from the histograms in figures 2 and 3, that these data are highly
skewed. It will, in fact, prove quite challenging to fit distributions to these data, as we will later see.
Figure 2: Histogram of monthly hail property damage losses over $1 million, January 1993 - March 2010 (x-axis: PropertyDamage; y-axis: Frequency)
Examining independence. The scatter-plot matrix in figure 4 indicates that there is little to no visible dependence between
the inter-arrival times and the property loss data. However, there appears to be some dependence between the
magnitude of the hail (i.e., hail size) and the property losses, as evidenced by the upward sloping dotted regression line.
This indicates a positive correlation, as one might expect: losses tend to be higher for larger hail sizes. Hail stone
magnitude data were initially evaluated and fitted with distributions along with inter-arrival and loss data. However,
due to the correlation between loss severity and hail stone size, as well as the fact that not enough data were available to
convert hail stone size into equivalent losses, it was decided to omit this variable from the models as being unnecessary
for the goals of this analysis.
In addition to the dotted linear regression line, the solid line in the plot is produced by a locally weighted regression
(LOESS) smoother, and reveals any non-linear relationships. The plots along the diagonal of the matrix
show the probability density for each variable.
Although visualizing the data relationships is useful, the data are so highly skewed, that appearances may be
deceiving. A nonparametric, Spearman rank order correlation matrix was run in R to further test for dependencies in
the data in a manner that is not dependent upon the data having any particular distribution assumption. The results
indicate that the strongest of these correlations is between the magnitude of the hail stone sizes, and the property
damage severity. (Note that the variables in the correlation analysis were detrended and deseasonalized, utilizing a
method called “seasonal, trend, and LOESS decomposition,” or “STL.”)
Although the correlation between property damage and magnitude is not high, with a correlation coefficient of
r = 0.21, the p-value for this Spearman rank order correlation is 4.579e-05. This is well below the 0.05 level, which
causes us to reject the null hypothesis that this correlation coefficient is equal to zero at the 95% confidence level.
Thus, while small, this correlation is statistically significant.

Figure 3: Histogram of monthly inter-arrival times in days for hail property damage losses over $1 million, January
1993 - March 2010 (x-axis: InterArrival; y-axis: Frequency)

Correlation Matrix: Deseasonalized, Detrended Hail Storm Data

                 Inter-Arrival  Magnitude  Property Damage
Inter-Arrival    1              -0.07      -0.08
Magnitude        -0.07          1          0.21
Property Damage  -0.08          0.21       1

Table 3: Spearman rank order correlation matrix on deseasonalized, detrended, hail storm data

Also, to check that property damage and inter-arrival times can indeed be treated as independent,
the correlation significance for these two variables was tested in a similar manner. The Spearman rank order correlation
of r = −0.076 has a p-value of 0.1435, which is not statistically significant at the 95% confidence level. This supports
our assumption of independence between the inter-arrival times variable and the property damage variable.
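The rank correlation and its significance test can be illustrated with a small Python sketch. It is not the R routine used in this analysis: the function names are hypothetical, the p-value uses a large-sample normal approximation to the usual t statistic, and the sample size of 371 is an assumption taken from the number of property damage data points reported later in this paper.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def approx_two_sided_p(rho, n):
    """Approximate two-sided p-value for H0: rho = 0 (normal approximation)."""
    t = rho * math.sqrt((n - 2) / (1.0 - rho ** 2))
    return math.erfc(abs(t) / math.sqrt(2.0))


p_magnitude = approx_two_sided_p(0.21, 371)      # magnitude vs. property damage
p_interarrival = approx_two_sided_p(-0.076, 371)  # inter-arrival vs. property damage
```

Under these assumptions, `p_magnitude` falls far below 0.05 while `p_interarrival` stays above it, which agrees with the conclusions drawn from the p-values reported above.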
Examining stationarity. It is important to examine the data for the possible presence of seasonality and trend. We
may or may not want to reflect seasonality and trend in a model, depending on the objectives of the model. However,
autocorrelation produced by seasonality and trend can greatly impact the results of a model that is based on time series
data. For instance, if there is a queue in the model, a particularly active season may produce a longer queue than quiet
seasons. Also, non-stationary (i.e., auto-correlative) data can make it quite difficult to fit distributions as well, as we
shall see in the upcoming section regarding distribution fitting. The purpose of the initial examination is simply to
“eyeball” the data to understand it at a high level. More in-depth statistical analysis concerning stationarity will be
performed later in the distribution fitting process, so as to determine whether to utilize the raw data, or data that has
been transformed to decompose and remove any seasonality and trend.
Figure 4: Scatter-plot matrix of all hail loss data and variables (panels: Magnitude, PropertyDamage, InterArrival)

The exhibit in figure 5 shows a seasonality and trend decomposition (STL) plot for the property damage data. The
raw data are in the top panel (titled simply, “data”), the seasonal component is in the second panel, the trend
component is in the third panel, and the remainder values, after removing the seasonality and trend effects, are in the
bottom panel. The length of the vertical gray bar on the right hand side
of each panel indicates the visual scale of each exhibit (i.e., the amount by which the exhibit was magnified compared to
the others). Looking at the seasonal panel, for example, it is helpful to imagine the large vertical bar in the seasonal
panel being shrunk down to the size of the very short gray bar in the raw data panel, and to mentally
shrink the entire seasonal graph along with it. The amounts on the y-axis represent millions of dollars per event (i.e.,
occurrence date) nationally.
The plot seems to exhibit some seasonality, as judged by the regular patterns that are visible in the seasonal panel,
although this panel is magnified significantly as judged by the gray visual scaling bar. However, trend is nearly
non-existent. The trend panel shows a widely varying non-linear trend, but the extremely large size of its gray scaling
bar indicates that this visual is greatly magnified.
Figure 6 exhibits an STL decomposition plot for the inter-arrival data. It also reveals little if any seasonality
and trend. Although the seasonality panel appears to show some seasonality, a look at the vertical scaling reference
indicates that this panel is greatly magnified in comparison to the raw data in the top panel. Also, the pattern in the
seasonal panel does not appear to be as regular with the inter-arrival data, as with the property damage data.
Figure 5: Decomposition of seasonality and trend of hail property damage data (panels: data, seasonal, trend, remainder; x-axis: time)
4.2 Data Analysis of Hail Storm Property Damage

Property damage in dollars. The two major components of modeling claims data are frequency (i.e., event data),
and severity (i.e., loss amount). After much experimentation with this hail data set, it turns out that the most difficult
component to fit is severity, as represented by property damage denominated in U.S. dollars. Therefore, more attention
is given to property damage data in this analysis than to the inter-arrival time, which was used to determine frequency.
The output below from the Arena Input Analyzer software shows the result of an attempt to find the best fitting
distribution for the property damage data. Although the Weibull had the best fit overall, the p-values are well below the
0.05 level, so we reject the null hypothesis that the data are consistent with a Weibull distribution
at the 95% confidence level.
Distribution Summary
Distribution: Weibull
Expression: 0.999 + WEIB(9.05, 0.379)
Square Error: 0.000287

Chi Square Test
Number of intervals = 3
Degrees of freedom = 0
Test Statistic = 0.455
Corresponding p-value < 0.005

Kolmogorov-Smirnov Test
Corresponding p-value < 0.01

Data Summary
Number of Data Points = 371
Min Data Value = 1
Max Data Value = 1.35e+003
Sample Mean = 31.1
Sample Std Dev = 93.7

Histogram Summary
Histogram Range = 0.999 to 1.35e+003
Number of Intervals = 19

Figure 6: Decomposition of seasonality and trend for inter-arrival between hail events, in days (panels: data, seasonal, trend, remainder; x-axis: time)
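To see what the Arena expression 0.999 + WEIB(9.05, 0.379) implies, it can be evaluated and sampled with the Python standard library. This is a sketch under one stated assumption: the two WEIB arguments are read as scale first and shape second. The constant and function names are hypothetical.

```python
import math
import random

SHIFT = 0.999   # offset from the Arena expression
SCALE = 9.05    # assumed to be the Weibull scale parameter
SHAPE = 0.379   # assumed to be the Weibull shape parameter


def shifted_weibull_mean(shift=SHIFT, scale=SCALE, shape=SHAPE):
    """Theoretical mean of shift + Weibull(scale, shape)."""
    return shift + scale * math.gamma(1.0 + 1.0 / shape)


def sample_losses(count, seed=0):
    """Draw simulated losses ($ millions) from the fitted expression."""
    rng = random.Random(seed)
    # random.weibullvariate takes the scale first, then the shape.
    return [SHIFT + rng.weibullvariate(SCALE, SHAPE) for _ in range(count)]


fitted_mean = shifted_weibull_mean()
simulated = sample_losses(371)
```

Comparing `fitted_mean` with the sample mean of 31.1 in the data summary gives a quick sanity check on the fit. A shape parameter well below one produces a very heavy right tail, which is in keeping with the skewness seen in the histogram of property damage.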
Various transformations were performed on the data to see whether they would improve the fit. Utilizing the Seasonal,
Trend, and LOESS (STL) decomposition function in the R Environment for Statistical Computing, the inherent
autocorrelation due to seasonality was removed, and the data were detrended as well. The illustration in figure 5 was
produced by the STL function, and it also shows the remainder after adjusting out the seasonality and trend. This
remainder represents a more stationary time series that is more suitable for analysis. However, a trade-off in utilizing
this approach is that we would have to add a seasonality component back into the model if we wish to preserve the
time dependent aspects inherent in real life.
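The idea of splitting a series into trend, seasonal, and remainder parts can be sketched with a classical additive moving-average decomposition. This is a deliberately simpler stand-in, not the LOESS-based STL procedure used in this analysis, and the function name is hypothetical. It assumes the series covers at least two full seasonal cycles.

```python
from statistics import mean


def decompose_additive(series, period=12):
    """Split a series into trend, seasonal, and remainder lists (additive model).

    Trend and remainder are None near the two ends, where the centered
    moving average cannot be computed.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            # Even period: half weights on the two end points.
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        else:
            total = sum(window)
        trend[t] = total / period

    # Average the detrended values at each position in the seasonal cycle.
    by_position = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            by_position[t % period].append(series[t] - trend[t])
    raw_effects = [mean(values) for values in by_position]
    centre = mean(raw_effects)  # force the seasonal effects to average zero
    seasonal = [raw_effects[t % period] - centre for t in range(n)]

    remainder = [
        None if trend[t] is None else series[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, remainder
```

A distribution would be fitted to the remainder, and the `seasonal` values are the factors that would need to be added back to simulated values to restore the time-dependent pattern.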
Further analysis was done to determine whether the one million dollar threshold was sufficient for distribution
fitting purposes. Often, a better fit can be found, and the results more useful, if the data are broken into segments
within thresholds, based on shifts in the distribution at certain points in the data. For example, it would not make
sense to bias our results with commonly occurring events, when we are really mainly concerned with those that cause
significant damage.
An R software package called “extRemes” [15] was utilized to examine the effects of only including data above
various thresholds, and also to try fitting various catastrophe related distributions that are not included in the Arena
Input Analyzer software. Of particular interest were the Gumbel distribution and the generalized Pareto distribution,
as these are often useful in fitting extreme, infrequent weather data, such as hurricane events.
Fitting Distribution to Deseasonalized and Detrended Property Damage Data. An alternative to the threshold
method is to utilize the STL decomposition process that was shown in figure 5 to remove the seasonality and trend
from the data. A distribution could be fitted to the remaining, deseasonalized and detrended data, and this distribution
could be utilized to generate simulated events. Removing the autocorrelative features of seasonality and trend from
the time series data should make it easier to find a distribution that fits with some statistical significance (i.e., p-value
greater than 0.05 at a 95% confidence level), while avoiding the need to utilize a threshold that would omit a large
portion of the data. When producing the simulated values from the fitted distribution, the seasonality could be added
back to the process by utilizing calculated seasonality factors.
The “extRemes” package in R was utilized to fit a generalized extreme value (GEV) distribution to the
deseasonalized, detrended (i.e., remainder) property damage data. The output from this analysis is shown below, and indicates
that the Gumbel distribution is an adequate simplification of the GEV. With a p-value of 0.686 for the likelihood ratio
test of a zero shape parameter, we would not reject the null hypothesis that the Gumbel distribution fits these data.
Gumbel distribution fits deseasonalized, detrended property damage data
************
GEV fit
-----------------------------------
Response variable: V1
"Nelder-Mead" optimization method selected.
L-moments (stationary case) estimates (used to initialize MLE optimization routine):

Location (mu): 146.3331
Scale (sigma): 43.08564
Shape (xi): 0.03098401
Likelihood ratio test (5% level) for xi=0 does not reject Gumbel hypothesis.
likelihood ratio statistic is 0.1637489 < 3.841459 1 df chi-square critical value.
p-value for likelihood-ratio test is 0.6857282
Convergence successful![1] "Convergence successful!"

[1] "Maximum Likelihood Estimates:"
                    MLE        Stand. Err.
MU: (identity)      144.48008  2.98646
SIGMA: (identity)   53.98324   1.92657
Xi: (identity)      0.00528    0.01335
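The likelihood ratio p-value in this output can be checked by hand, since the upper tail of a chi-square distribution with one degree of freedom has a closed form in terms of the complementary error function. The sketch below uses only the statistic and critical value printed above; the function and variable names are hypothetical.

```python
import math


def chi_square_upper_tail_1df(x):
    """P(X > x) for a chi-square random variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


lr_statistic = 0.1637489     # likelihood ratio statistic from the output
critical_value = 3.841459    # 5% critical value for 1 degree of freedom

p_value = chi_square_upper_tail_1df(lr_statistic)
reject_gumbel = lr_statistic > critical_value
```

Here `p_value` reproduces the reported 0.6857 to four decimal places and `reject_gumbel` is `False`, so the simpler Gumbel form (shape parameter of zero) is retained.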
Diagnostics, along with a histogram fitted with the Gumbell distribution are shown in figure 7. The first of the four
graphs in this illustration (top left graph) is a probability plot. If the Gumbell distribution was a perfect fit for the data,
the points in this graph would form up directly on the diagonal line. However, it can be seen by the slight ‘S’ curve,
that the fit is not exact.
The graph at the top right of this illustration shows a quantile plot. The quantile plot indicates that, while the
Gumbel distribution fits well for the lower property damage values, it does not fit the more extreme values as well,
beginning at a property damage loss amount of between $300 million and $400 million.
The graph at the lower left of figure 7 is called a “Return Level Plot”. It is important to extreme value analysis,
since it indicates the return period up to which the distribution is reliable. In this case, it indicates that the Gumbel
distribution fits well up to the point just below the 100 on the x-axis, which represents a one-in-one-hundred-year event.
This tells us that the distribution does not fit as well in the extreme tail, for events with roughly a 1/100 or smaller
annual chance of occurrence.
A density plot for the Gumbel distribution is shown in the bottom right of figure 7, and illustrates with a histogram
of the ‘z’ statistic how the distribution fits well for the bulk of the data, except for the extreme values on the right
hand side of the histogram.
All of these diagnostics confirm that the Gumbel distribution fits the bulk of the property damage data well.
However, the extreme values at the tail are problematic, and might be better fitted as a separate group through a
method known as threshold analysis. This method establishes a threshold, above which the data are dealt with by
fitting a different distribution than the data below the threshold amount.
Based on the diagnostics in figure 7, it would appear logical to establish a threshold somewhere
between $300 million and $400 million in property damage losses. However, this would result in a sample size of
only about five data points. A possible option is to reduce the threshold to below $300 million, which
increases the sample size slightly. It turns out that establishing a threshold such that only events with property
damage above $200 million are retained results in a best fit of an exponential distribution for the extreme tail values.
The Gumbel was not a good fit for these tail values, based on a p-value of nearly zero. The output from the “extRemes”
package in R is shown below, and includes the parameters of the distribution, as well as a p-value that is well above
the 0.05 level required to avoid rejecting, at the 95% confidence level, the null hypothesis that the exponential
distribution is a good fit for these data.
[Figure 7 panels: Probability Plot and Quantile Plot (axes: Empirical, Model); Return Level Plot (Return Level by Return Period, log scale from 0.1 to 1000); Density Plot (f(z) by z)]
Figure 7: Diagnostic plots for the Gumbel (GEV) distribution fitted to property damage data
Output of best fit from generalized Pareto distribution analysis:

L-moments estimates for (stationary) GPD are:
scale: 119.5762
shape: 0.5480488
These L-moments estimators were used as initial parameter estimates.
Likelihood ratio test (5% level) for xi=0 does not reject Exponential hypothesis.
likelihood ratio statistic is 1.521590 < 3.841459 1 df chi-square critical value.
p-value for likelihood-ratio test is 0.2173791
Convergence successful!
[1] "Threshold = 200"
[1] "Number of exceedances of threshold = 9"
[1] "Exceedance rate (per year)= 0.291105121293801"
[1] "Maximum Likelihood Estimates:"

MLE Std. Err.
Scale (sigma): 107.4765993 112.403277
Shape (xi): 0.8167595 1.028090
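To make the link between these parameters and a return level plot concrete, the sketch below applies the standard return level formula for a generalized Pareto tail, u + (sigma/xi)((m·rate)^xi − 1), with the exponential limit u + sigma·ln(m·rate) when xi is zero. It is written in Python rather than R, the function name is illustrative, and the parameter values are simply copied from the output above. Given the very large standard errors reported, the results should be read as rough indications only.

```python
import math

def gpd_return_level(years, threshold, scale, shape, rate_per_year):
    """Level exceeded on average once every `years` years under a GPD tail."""
    m = years * rate_per_year  # expected number of threshold exceedances in the period
    if m <= 0:
        raise ValueError("years and rate_per_year must be positive")
    if abs(shape) < 1e-9:
        # Exponential special case (shape xi = 0).
        return threshold + scale * math.log(m)
    return threshold + (scale / shape) * (m ** shape - 1.0)

# Values copied from the extRemes output above (damage in $ millions).
THRESHOLD = 200.0
SCALE = 107.4765993
SHAPE = 0.8167595
RATE = 0.291105121293801  # exceedances per year

for years in (10, 100, 1000):
    level = gpd_return_level(years, THRESHOLD, SCALE, SHAPE, RATE)
    print(f"{years:>5}-year return level: {level:,.0f}")
```

Setting `shape=0.0` gives the return levels under the exponential hypothesis that the likelihood ratio test does not reject, although the scale would then need to be re-estimated under that restriction.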
The graphical output associated with the statistical output shown above is exhibited in figure 8. This illustration
shows how well the exponential special case of the generalized Pareto distribution fits these extreme values. It also
shows a comparison of “returns” over time periods on an x-axis that increments base-10 logarithmically. This
illustration shows the expected property damage value over periods of 10, 100, and 1,000 years, respectively, as well
as the confidence interval. It can be seen from this illustration that the curve is fairly linear up to the 1,000-year
point, and then curves upward from there. Although not shown here for space reasons, this curve is much more gradual
and well behaved than when including the values below the $200 million threshold.
[Figure 8 panels: Probability Plot and Quantile Plot (axes: Empirical, Model); Return Level Plot (Return level by Return period in years, log scale from 0.1 to 1000); Density Plot (f(x) by x)]
Figure 8: Generalized Pareto fitted with exponential distribution, on property damage data with a threshold of 200
million dollars
There are only nine data points that are this extreme, however. While this is not an uncommon scenario
with extreme weather data, it would obviously be preferable to have more data points. However,
such historical data on extreme events are often not available, and are not available in this case either. Figure 9
shows a scatter plot of the nine points that are over the $200 million threshold.
[Figure 9 plot: PropertyDamage (y-axis, 200 to 1200) by Date (x-axis; tick labels 03/31/08, 04/16/98, 05/15/98, 06/25/06, 10/04/06)]
Figure 9: Scatter plot of remaining nine extreme property damage values above the 200 million dollar threshold
When fitting distributions to extreme weather events it is sometimes necessary to estimate the distribution param-
eters, by either fitting to a small number of points, such as we will do in this case, or by utilizing Bayesian methods of
parameter estimation, such as with Markov Chain Monte Carlo methods (MCMC) [14].
Figure 10 shows the best fitting distribution for the hail property damage data over the extreme value threshold of
$200 million, utilizing the Arena Input Analyzer software. The output from this software for these data is shown below.
It indicates that the exponential distribution is the best fitting among the limited collection of distributions available
in this software package. Arena reports the p-value only as greater than 0.15, so it cannot be compared directly with
that of the generalized Pareto distribution (GPD) fit, but it is high enough to avoid rejecting the null hypothesis at the
95% confidence level, based on the Kolmogorov-Smirnov test. Utilizing the exponential distribution in Arena with the
parameters shown in the Input Analyzer output would be an easier alternative than creating an external sample file
utilizing the GPD, although the GPD produces a better fit overall.
Output of best fit analysis from Arena Input Analyzer of property damage:
=================================================================
Fit All Summary
Data File: PropDmg200MillionThreshold.txt
Function Sq Error
-----------------------
Erlang 0.0297
Exponential 0.0297
Gamma 0.063
Weibull 0.0848
Lognormal 0.121
Triangular 0.143
Normal 0.151
Uniform 0.232
Beta 0.403
Distribution Summary
Distribution: Exponential
Expression: 201 + EXPO(0)
Square Error: 0.029666
Kolmogorov-Smirnov Test
Corresponding p-value > 0.15
Data Summary
Number of Data Points = 9

Min Data Value = 201
Max Data Value = 1.35e+003
Sample Mean = 465
Sample Std Dev = 363
Histogram Summary
Histogram Range = 201 to 1.35e+003

Number of Intervals = 5
Although the threshold method is a possibility for future analysis, it was not utilized in the models for this analysis.
The Gumbel (i.e., log-Weibull) distribution that we explored earlier fits most of the data well, except for the tail
values. The tails were presumed to be outliers for the purposes of this study, since they were so few. Also, fitting a
distribution to so few data points is unreliable at best. Nevertheless, this analysis could serve as a basis
for further study, where one could attempt to fit two distributions together in a piecewise fashion, where one
distribution would be implemented for the bulk of data below the extreme value threshold, and another distribution
utilized to mimic rare, extreme events.
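The piecewise idea described above could be prototyped along the following lines. This is only a sketch in Python and was not part of the models in this study. The Gumbel parameters and the threshold come from the fits reported in this section. The tail mean excess and the tail probability are assumptions chosen for illustration and would need to be estimated properly.

```python
import math
import random

# Taken from the fits reported in this section (values in $ millions).
GUMBEL_MU = 144.48008
GUMBEL_SIGMA = 53.98324
THRESHOLD = 200.0
# Assumptions for illustration only, not fitted values from this study:
TAIL_MEAN_EXCESS = 265.0   # roughly the tail sample mean of 465 minus the threshold
TAIL_PROBABILITY = 0.05    # hypothetical share of events that fall in the tail

def gumbel_cdf(x):
    """Cumulative distribution function of the fitted Gumbel distribution."""
    return math.exp(-math.exp(-(x - GUMBEL_MU) / GUMBEL_SIGMA))

def sample_piecewise(rng):
    """Draw one value: exponential tail above the threshold, Gumbel below it."""
    if rng.random() < TAIL_PROBABILITY:
        return THRESHOLD + rng.expovariate(1.0 / TAIL_MEAN_EXCESS)
    # Inverse-transform sampling from the Gumbel, restricted to the range below the threshold.
    u = rng.uniform(1e-12, gumbel_cdf(THRESHOLD))
    return GUMBEL_MU - GUMBEL_SIGMA * math.log(-math.log(u))

rng = random.Random(2010)  # arbitrary seed for repeatability
draws = [sample_piecewise(rng) for _ in range(10_000)]
tail_share = sum(d > THRESHOLD for d in draws) / len(draws)
print(f"share of draws above the threshold: {tail_share:.3f}")
```

Because the Gumbel was fitted to the remainder series, draws from the lower piece can be negative; trend and seasonality would have to be added back before such values are read as dollar losses.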
Figure 10: Exponential distribution is best fitting from Arena Input Analyzer for property damage over 200 million
dollar threshold
4.3 Data Analysis of Hail Storm Inter-Arrival Times

There were no good fitting distributions for the inter-arrival hail data that were deseasonalized and detrended utilizing
the STL function that was described earlier. The Arena Input Analyzer was utilized, along with various distribution
fitting tools in R, such as the “extRemes” package and the “fitdistr” function. It turns out, however, that the
untransformed inter-arrival data can be fitted more readily.
Comparing the STL plots that were exhibited earlier in figures 6, and 6, it appears that the seasonality and trend
components of the inter-arrival data are much less pronounced than for the property damage data. (Note again the
vertical, gray scaling bar on the right side of the STL plots. The large size difference in this bar between the seasonal
component and the raw data component brings the true seasonality down to size.) Thus, it may not be as necessary
to adjust the inter-arrival data for seasonality and trend. To test this hypothesis, an autocorrelation test was run on
the untransformed inter-arrival data to see whether the data are sufficiently free of serial dependence without applying
any detrending or deseasonalizing transformations. The R code and output below indicate that we fail to reject
the null hypothesis of no autocorrelation at lag 1, which is consistent with a stationary data set. The p-value produced
by the Box-Ljung test is 0.3124, which is well above the 0.05 level. An autocorrelation plot is also shown in figure 11.
This plot supports the finding of stationarity, since there are no visible systematic patterns where the vertical lines
pierce the dotted significance lines. There are only a few very minor cases where this dotted line is pierced, and there
are no noticeable patterns.
> Box.test(InterArrivRaw$RawInterArriv, type = "Ljung-Box")
Box-Ljung test
data: InterArrivRaw$RawInterArriv
X-squared = 1.0205, df = 1, p-value = 0.3124
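For readers who want to see what the test computes, the following Python sketch implements the Ljung-Box statistic for a single lag, which corresponds to the one degree of freedom in the output above. The example series is hypothetical and is not the NOAA inter-arrival data. The last lines simply recompute the p-value from the reported statistic.

```python
import math

def ljung_box_lag1(series):
    """Return (Q, p-value) for the Ljung-Box test using only lag 1."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - 1] - mean) for t in range(1, n))
    r1 = num / denom                          # lag-1 sample autocorrelation
    q = n * (n + 2) * r1 ** 2 / (n - 1)       # Ljung-Box statistic with one lag
    p_value = math.erfc(math.sqrt(q / 2.0))   # chi-square upper tail, 1 df
    return q, p_value

# Hypothetical inter-arrival gaps in days, for illustration only.
example_gaps = [3, 12, 1, 40, 7, 2, 19, 5, 66, 8, 4, 23, 1, 9, 15, 2]
q_stat, p_val = ljung_box_lag1(example_gaps)
print(f"Q = {q_stat:.3f}, p-value = {p_val:.3f}")

# Cross-check of the reported result: X-squared = 1.0205 with 1 df.
reported_p = math.erfc(math.sqrt(1.0205 / 2.0))
print(f"p-value implied by the reported statistic: {reported_p:.4f}")
```

A test on one lag only detects dependence between consecutive events, so the autocorrelation plot remains a useful complement for longer lags.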
Lognormal distribution is best fit for untransformed inter-arrival data
Running the Arena Input Analyzer on the untransformed inter-arrival data results in the selection of a lognormal
distribution as the best fit. The fit is exhibited in figure 12.
The p-value for this distribution is 0.134, based on the chi-square test. While this is not as strong a fit as for the
other hail data variables, we would not reject the null hypothesis at the 95% confidence level. The p-value is above
0.05.
[Figure 11 plot: Autocorrelation Plot on Inter-Arrival Data; ACF (y-axis, 0.0 to 1.0) by Lag (x-axis, 0 to 25)]
Figure 11: Plot of autocorrelation function indicating little evidence of non-stationarity
The equation output from Arena, however, did not produce a usable result. This output showed the equation as
-0.001 + LOGN(0,0), and of course we cannot simulate lognormal distribution values with all the parameters
equaling zero. The “fitdistr” function in R software was utilized to produce the correct parameters for the log-normal
distribution on the inter-arrival data. The R output below reveals a usable pair of parameters, as well as the respective
standard error value under each.
R output showing parameters for log-normal distribution fitted to inter-arrival data

> fitdistr(InterArrivRaw$RawInterArriv,"lognormal")
meanlog sdlog
1.82452117 1.33642090
(0.06947721) (0.04912781)
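The two values reported by fitdistr are the mean and standard deviation of the logarithm of the inter-arrival time, not of the inter-arrival time itself. Some simulation packages parameterize the lognormal by the mean and standard deviation on the natural scale instead, so it is worth checking which convention a given tool expects before entering these numbers. The Python sketch below shows the standard conversion and draws a few values directly from the log-scale parameters. The function name is illustrative.

```python
import math
import random

MEANLOG = 1.82452117   # from the fitdistr output above
SDLOG = 1.33642090

def lognormal_mean_sd(meanlog, sdlog):
    """Mean and standard deviation of a lognormal variable on its natural scale."""
    mean = math.exp(meanlog + sdlog ** 2 / 2.0)
    sd = mean * math.sqrt(math.exp(sdlog ** 2) - 1.0)
    return mean, sd

mean_days, sd_days = lognormal_mean_sd(MEANLOG, SDLOG)
print(f"implied mean = {mean_days:.2f} days, sd = {sd_days:.2f} days")

# Drawing simulated gaps directly from the log-scale parameters.
rng = random.Random(2010)  # arbitrary seed for repeatability
gaps = [rng.lognormvariate(MEANLOG, SDLOG) for _ in range(5)]
print([round(g, 1) for g in gaps])
```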
4.4 Data on Standard and Hail Resistant Roofs

Most of the data regarding roofing characteristics, costs, and counts, are only available as sample statistics or estimates,
rather than as complete sample sets. For instance, estimates of roof damageability are available from studies, such as
by the Institute for Business and Home Safety (IBHS). However, although the estimates are based on research data,
the data samples are not publicly available. This section will focus on providing the sample statistics that were found,
as well as the calculations and assumptions that were used when estimations were necessary.
Figure 12: Lognormal distribution fitted to untransformed inter-arrival data
Part of the aim of this analysis is to determine how long it may take to see a return on the investment by insurance
companies in replacing hail damaged conventional roofs with hail resistant roofs. The cost to replace a roof with
hail-resistant shingles is 10% to 20% higher than the expense to install a conventional asphalt shingle roof, including
materials and labor [11]. To account for this, a triangular distribution was implemented in the Arena simulation model,
with a low estimate of 10%, modal estimate of 15%, and a high estimate of 20%. No modal statistics were available, so
15% was chosen as the estimate for the mode, simply because it is midway between the low and high range limits.
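As a rough illustration of the triangular distribution just described, the Python sketch below draws the extra installation cost as a fraction of the conventional roof cost. It stands in for the Arena expression and is not the model itself. The $5,000 conventional roof cost in the example is the point estimate adopted in this section, and the function name is illustrative.

```python
import random

LOW, MODE, HIGH = 0.10, 0.15, 0.20  # extra cost as a fraction of conventional cost

def extra_installation_cost(rng, conventional_cost):
    """Simulated extra dollars needed to install a hail-resistant roof."""
    # Note the argument order of random.triangular: (low, high, mode).
    fraction = rng.triangular(LOW, HIGH, MODE)
    return fraction * conventional_cost

rng = random.Random(135)  # arbitrary seed for repeatability
costs = [extra_installation_cost(rng, 5000.0) for _ in range(20_000)]
average_cost = sum(costs) / len(costs)
print(f"average extra cost per roof: ${average_cost:,.0f}")
```

With a symmetric triangle the mean equals the mode, so the average extra cost settles near 15% of the conventional cost.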
With each hail event in the simulation, the total proportion of roofs that remain with conventional roofing must be
estimated. With each simulated hail event, the conventional roofs are being replaced with hail resistant roofs, causing
the proportion of conventional roofs in the stock of roofs for single family houses in the U.S. to diminish. Table 4
shows how the total value of roofs in the U.S. was estimated. Bibliography references to data sources are provided in
the table, and include the U.S. Census Bureau [9], and the Cost Helper website [1].
Other sources were also consulted for the roof cost, and a range of estimates can be found, depending on variables
such as roof size, and quality of materials and labor. Rather than add another stochastic variable to the model in this
case, it was decided to choose a midrange point estimate of $5,000, since this study is not concerned so much with
variability in roof costs in general, but is more concerned with variability in the cost difference between conventional
and hail resistant roofs. As previously mentioned, a triangular distribution was used to model this variability. The
$5,000 point estimate, however, can simply be thought of as a constant in the equation, and should not have a significant
impact on the comparison between hail resistant and non-resistant roofing costs.
Data were not found to determine how many roofs presently have hail resistant roofing. However, some sources
mentioned that most roofs for single family houses are asphalt shingle. For purposes of the modeling, it will be
presumed that a low percentage of roofs are currently hail resistant, beginning at 10%. A ratio of hail resistant to
conventional roofs will be calculated after each event. This gradually increasing ratio will be used to determine the
amount of losses for each subsequent event, by blending the higher loss costs for conventional roofs, with the lower
loss costs for hail resistant roofs.
Estimated Total Value of Insurable Roofs

Step | Attribute | Sample Statistic / Estimate
A | Housing units in 2008 [9] | 129,065,264
B | Homeownership rate in 2000 [9] | 66.20%
C | Estimated number of insurable homes (A · B) | 85,441,205
D | Estimated average roof value [1] | $5,000
E | Estimated total roof value (C · D) | $427,206,023,840
F | In $millions (E / 1,000,000) | $427,206
Table 4: Calculation steps for estimating total roof value in the U.S.
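The arithmetic in table 4 can be reproduced in a few lines. The Python sketch below follows steps A through F, and the variable names are illustrative. Step E in the table appears to carry the unrounded result of step C forward, which the sketch also does.

```python
housing_units_2008 = 129_065_264   # step A
homeownership_rate = 0.662         # step B
average_roof_value = 5_000         # step D, in dollars

insurable_homes = housing_units_2008 * homeownership_rate    # step C
total_roof_value = insurable_homes * average_roof_value      # step E
total_roof_value_millions = total_roof_value / 1_000_000     # step F

print(f"C: {insurable_homes:,.0f} homes")
print(f"E: ${total_roof_value:,.0f}")
print(f"F: ${total_roof_value_millions:,.0f} million")
```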
Loss costs are based on the hail property damage data from NOAA. These data are adjusted by the typical loss
reduction from hail resistant roofing, based on research. The research indicates that hail resistant roofs are 50% less
likely to be damaged in a hail storm than conventional roofs.
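One way to read this adjustment is as a blend: the share of the roof stock that is still conventional bears the full simulated loss, while the hail-resistant share bears about half of it. The Python sketch below expresses that reading. It is an interpretation for illustration rather than the Arena logic itself, and the function and parameter names are hypothetical.

```python
def blended_loss(conventional_loss, resistant_share, damage_ratio=0.5):
    """Adjust a loss simulated for an all-conventional roof stock.

    conventional_loss: loss if every roof were conventional
    resistant_share:   fraction of the roof stock that is hail resistant (0 to 1)
    damage_ratio:      loss on resistant roofs relative to conventional roofs
    """
    conventional_part = conventional_loss * (1.0 - resistant_share)
    resistant_part = conventional_loss * resistant_share * damage_ratio
    return conventional_part + resistant_part

# Example: a $100 million event when 10% of roofs are hail resistant.
print(blended_loss(100.0, 0.10))
```

The default of 0.5 reflects the 50% figure quoted above; a range of values could be substituted to test how sensitive the results are to it.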
4.5 Simulating the Data

When initially designing this model, it was not certain whether property damage, hail size magnitude, or both variables
would be needed to determine the amount of loss to various hail resistant grades of roofing. Much depends upon what
kinds of roof data can be located that might be combined with hail size magnitude to determine a dollar loss. If it
is not possible to convert hail size directly into losses, then property damage might be utilized as a proxy for roof
losses. However, it should be noted that this is certainly not an exact proxy. Property losses can include automobiles,
although based on personal experience in working for an insurance company, most losses are indeed roof related.
Property losses in these data do not include agricultural losses, since NOAA reports these separately. Rather than
guess which variables would be most important to the modeling process ahead of time, both were included.
Figure 13 exhibits the setup configuration for the creation of hail storm events. The lognormal expression for the
setup is shown in figure 14.
Figure 13: Configuration of the “Create Hail Storm Event" object
The setup for the random generation of hail stone size is shown in figure 15. The configuration of the random
generator for the gamma distribution behind the “Hail Stone Size” object is exhibited in figure 16.
Lastly, the setup of the object to generate property damage in dollars, called “Storm Severity”, is shown in figure
17. The setup of the log-Weibull distribution (i.e., Gumbel distribution) that is used to generate the property damage
amount for each storm is illustrated in figure 18.
5 Model Implementation, Testing, Verification, and Validation

5.1 Model Implementation
Two model versions were constructed.
1. The first model, shown in figure 19, is the simplest, and assumes all roof replacements for hail damage are done
with conventional, asphalt shingles, as is customary in the insurance industry today.
2. The second model, illustrated in figure 20, adds to the conventional roofing model by assuming some or all of
the roof replacements for hail damage are done with hail resistant roofing.
Figure 14: Configuration of the lognormal distribution for “Storm Severity” object
Figure 15: Configuration of the Hail Size Magnitude object
Figure 16: Configuration of gamma distribution random generator for hail stone size
Figure 17: Configuration of property damage object (i.e., “Storm Severity”)
Figure 18: Configuration of log-Weibull random generator behind “Storm Severity” object
Both models are designed to compare estimates as to how many years it could take to realize a statistically
significant benefit, if insurance companies replace hail damaged roofs with hail resistant shingles, instead of the usual
conventional asphalt roofing shingles. The main factor in this evaluation is the very high amount of variability in hail
arrivals and resulting damage levels, given the nature of the extreme value distributions that are inherent in real-life
hail storm events. Thus, much of the effort in this research was given to researching and fitting distributions to these
extreme data. Model implementation mainly involves the incorporation of distribution generators that mimic real-life
hail storm events. Thus, there are no resources or servers required in either of the models. Most of the modules shown
in figures 19 and 20 serve the purpose of updating variables, simulating cost estimates, and tracking the results from
various hail-resistant roof replacement strategies.
In the case of the hail-resistant model version, the modules labeled I, J, and K, as well as N through S, were added
to estimate the proportion of existing roofs that have been converted from conventional to hail-resistant roofing. It is
presumed in this model that a certain portion of conventional roofs are replaced with hail-resistant roofs, as a result of
replacements after each hail storm event. Two variables, one containing the percentage of hail-resistant roofs, and one
containing the percentage of conventional roofs, are updated after each simulated hail storm event.
5.2 Simplifying Assumptions in the Models

Following are a number of simplifying assumptions that were made in the models.
• The simulated inter-arrival distribution fits well everywhere except the extreme tails. Rather than create a sepa-
rate entity to model the extreme tail values, it was decided to allow the distribution to underestimate the tail, so
as to shorten the number of years required to realize a statistically significant difference between the alternative
roof replacement strategies that were modeled. As we shall see, this did not turn out to make any difference in
the conclusions derived from the experiments.
• Since there are no data available as to an inventory of individual houses that could be exposed to hail storm
damage, except that which is proprietary to insurance companies, the models assume that a certain proportion
of houses are affected after each hail storm event, rather than attempt to estimate damage to individual homes
based on hail stone size (which is what a commercial catastrophe modeling software package, such as EQECAT
or AIR, would do).
• All simulated damages are presumed to be valued in current U.S. dollars.
• Each replication consists of 365 days, so that each replication represents a full year’s worth of hail storm events.
• No seasonal adjustments were made, as the purpose of the models is to evaluate results across many years,
rather than to evaluate results within any given year. Since each replication represents an entire year, seasonal
adjustments are not very useful.
• In these models, it is assumed that conventional roofs only get replaced with hail-resistant roofs when a roof is
damaged by hail. There is no provision made for the possibility that some home owners may install hail-resistant
roofs on their own, prior to any hail damage. The object of these models is to examine the potential impact of
replacing hail damaged roofs with hail-resistant roofing. Such action would likely occur through laws requiring
insurance companies to pay for replacing roofs with hail-resistant roofing, and by changing building codes to require
such installations so that homeowners will spend their claims dollars on hail-resistant roofing as well. (Insurance
companies pay claims, but do not actually make roof replacements.) Homeowners generally determine the contractors
who will do the work, as well as the materials that will be used. Thus, a change in building codes would be necessary
to assure that roofs are replaced with the upgraded resistant roofing.
• It is assumed that all of the loss data provided by NOAA for hail storm events pertain to residential roof damage.
In fact, having personally worked as a claims manager for a major insurance carrier, I can attest that by far the
greatest losses we encountered in hail storm events were due to residential roof damage. Most residential roofs
are asphalt shingles, and these can be destroyed merely by knocking off some of the top layer of pebbles on the
shingles.
Conventional roof replacement model
The conventional roof replacement model in figure 19 is explained step-by-step in table 5, according to module
letter name.
Figure 19: Model structure for conventional asphalt roofing replacements
Hail resistant roof replacement model
Figure 20 illustrates the module layout and logic for the hail-resistant replacement version of the model. The
illustration also shows the histograms from a single run, as well as a textbox indicating the proportion of roof
replacements that should be done with hail resistant roofing. In this example, the 0.50 in the textbox indicates that
50% of all losses attributable to conventional roofing should be replaced with hail-resistant roofing. The model
multiplies this percentage by the total amount of conventional roofs remaining from the previous hail-storm event.
The model also incorporates a ratio of the number of hail-resistant roofs to the number of all single family roofs
in the U.S. housing stock. The numerator, consisting of the number of hail resistant roofs, is increased after each hail
storm event, which increases the ratio. The ratio can be applied to the total number of roofs to determine the number
of hail-resistant roofs. The number of conventional roofs is then determined after each event by subtracting the
newly adjusted number of hail-resistant roofs from the total number of roofs.
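The bookkeeping described above might be sketched as follows in Python. This is a simplified reading and not the Arena implementation: it assumes that the dollar loss on conventional roofs stands in for the value of roofs needing replacement, and the function name and example event losses are hypothetical. The total roof value comes from table 4, and the 10% starting share and the 0.50 replacement proportion are the values mentioned in the text.

```python
TOTAL_ROOF_VALUE = 427_206.0   # $ millions, from table 4
START_RESISTANT_SHARE = 0.10   # starting assumption stated in section 4.4

def update_roof_stock(resistant_value, conventional_loss, replacement_share):
    """Move part of the damaged conventional stock into the hail-resistant stock."""
    conventional_value = TOTAL_ROOF_VALUE - resistant_value
    converted = replacement_share * conventional_loss
    converted = min(converted, conventional_value)  # cannot convert more than remains
    resistant_value += converted
    ratio = resistant_value / TOTAL_ROOF_VALUE
    return resistant_value, TOTAL_ROOF_VALUE - resistant_value, ratio

resistant = START_RESISTANT_SHARE * TOTAL_ROOF_VALUE
ratios = []
for loss in [150.0, 90.0, 400.0]:   # hypothetical event losses, $ millions
    resistant, conventional, ratio = update_roof_stock(resistant, loss, 0.50)
    ratios.append(ratio)
    print(f"hail-resistant share after event: {ratio:.4%}")
```

Because individual event losses are tiny relative to the total roof value, the ratio rises very slowly, which is consistent with the long payoff horizon discussed in this paper.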
Figure 20: Model structure for hail resistant roofing replacements
Table 6 provides the details behind each of the nodes in the hail-resistant roof replacement model that was
illustrated in figure 20. (The table is continued on another page for modules O through S.) A few of the modules were
mainly used for testing and validation purposes, and most of these are noted within the table.
Module Explanations for Conventional Roof Replacement Model

ID | Module Name | Module Description | Variable | Expressions and Details
A | Create Hail Storm Event | Lognormal random process to generate hail storm events | NA | LOGN(1.82452117, 1.33642090)
B | Separate for Inter Arrival Tracking | Duplicates hail event entity so that data can be collected on inter-arrivals | NA | Provided for visual separation only; no cost assigned to duplicates
C | Track Inter Arrival Times | Calculates difference between current simulated event date and previous event date | InterArrivalDays | NewDay - PreviousDay
D | Previous Event Date | Stores current event date; not used until next event, whereupon value is retrieved as previous event date | PreviousDay | NewDay (Stores the NewDay value)
E | Current Event Date | Gets current simulated event date | NewDay | TNOW (Function to get new event date)
F | Delay Between Events | Process delay only used to summarize inter-arrival data as a model check; does not impact model | NA | InterArrivalDays
G | Dispose Inter Arrival Tracking | Ends section devoted to tracking and summarizing inter-arrival times | NA | NA
H | Storm Severity | Generates a dollar loss value based on a log-Weibull distribution, for each hail storm event | Loss Severity | LN(WEIB(53.98324,.00528)+173)
I | Losses for Non Hail Resistant Roofing | Simply stores the Loss Severity variable under the descriptive name: Non Hail Resistant Losses | Non Hail Resistant Losses | Loss Severity
J | Total Accum Conventional Losses | Accumulates losses across yearly replications; does not reset after each year | Total Cummulative Conventional Losses | Non Hail Resistant Losses / (Total Value of Roof Population)
K | Total Hail Losses | Stores value under a name that is comparable to the hail-resistant version of the model | Total Loss Cost for Both Roof Types | Non Hail Resistant Losses
L | Hail Event Resolution | Ends section that generates and tracks losses for conventional, non-hail resistant roofs | NA | NA
Table 5: Description of modules within the conventional roof replacement model
5.3 Testing, Verification, and Validation

Validation of the hail storm event generation process
Figure 21 shows the variability of losses from a single year of simulated hail losses. This variability mimics the
extreme variance inherent in losses from actual hail storm events. Also, this graph changes shape from year to year,
further validating our hail-storm loss generation process.
Figure 21: A single year of simulated hail losses shows extreme variability inherent in losses from actual hail events
Table 8 includes basic statistics from the Arena output after running the simulated distributions for 500 replications.
As previously noted, these amounts are less than the actual amounts in the data. Although the fit of the distributions
was seen to be acceptable in terms of high p-values, as previously demonstrated, the quantile plots that were
reviewed earlier illustrated that the tails were not adequately captured in the fitted distributions. Although the outliers
in the tail are few, they can be enough to shift the mean. Truncating the distributions in this manner can result in the
understated averages we see in table 8. This should not be a serious problem for the purposes of these experiments,
however. We will just need to be aware that the inter-arrivals are likely to happen a bit more frequently than in real
life, and we should keep in mind that the actual losses can, in rare circumstances, be higher than what is modeled. This
is not unlike what has been observed in some commercial catastrophe model results, when comparing them to real
events. Mother Nature sometimes has a way of dealing out more than our laboratories and stock distributions can
reproduce.
In fact, sensitivity analysis was conducted on the inter-arrival distribution to test the impact on the outcome. The
result, as will be discussed later, does not change the findings for this analysis. Table 8 also includes a final row that
compares the mean and maximum of an exponential distribution with a mean of 16.2 days (the same as the actual
inter-arrival mean). The shape of the distribution will not match the actual, of course, but the resulting average is
closer to the actual, as is the maximum inter-arrival, although it is still understated. The shape of the exponential
version that was used for sensitivity testing is more gradually sloping than the real distribution. Again, more will be
discussed on this result later.
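The exponential variant used for sensitivity testing can be mimicked outside Arena with a short Python sketch. It assumes exponential inter-arrival times with the 16.2-day mean quoted above, 365-day replications, and 500 replications, as in the text. The function name and the random seed are arbitrary choices for illustration.

```python
import random

def events_in_year(rng, mean_gap_days=16.2, horizon_days=365.0):
    """Count simulated hail events in one replication with exponential gaps."""
    clock, count = 0.0, 0
    while True:
        clock += rng.expovariate(1.0 / mean_gap_days)  # time until the next event
        if clock > horizon_days:
            return count
        count += 1

rng = random.Random(135)  # arbitrary seed for repeatability
counts = [events_in_year(rng) for _ in range(500)]
average_events = sum(counts) / len(counts)
print(f"average events per simulated year: {average_events:.1f}")
```

With a 16.2-day mean gap, one would expect roughly 365 / 16.2, or about 22.5, events per year on average, which gives a simple check on the simulated counts.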
Module Explanations for Hail-Resistant Roof Replacement Model

ID | Module Name | Module Description | Variable | Expressions and Details
A | Create Hail Storm Event | Lognormal random process to generate hail storm events | NA | LOGN(1.82452117, 1.33642090)
B | Separate for Inter Arrival Tracking | Duplicates hail event entity so that data can be collected on inter-arrivals | NA | Provided for visual separation only; no cost assigned to duplicates
C | Track Inter Arrival Times | Calculates difference between current simulated event date and previous event date | InterArrivalDays | NewDay - PreviousDay
D | Previous Event Date | Stores current event date; not used until next event, whereupon value is retrieved as previous event date | PreviousDay | NewDay; (Stores the NewDay value)
E | Current Event Date | Gets current simulated event date | NewDay | TNOW; (Function to get new event date)
F | Delay Between Events | Process delay only used to summarize inter-arrival data as a model check; does not impact model | NA | InterArrivalDays
G | Dispose Inter Arrival Tracking | Ends section devoted to tracking and summarizing inter-arrival times | NA | NA
H | Storm Severity | Generates a dollar loss value based on a log-Weibull distribution, for each hail storm event | Loss Severity | LN(WEIB(53.98324, 0.00528) + 173)
I | Separate to Compare Resistant and Non Resistant | Duplicates storm severity for visual effect to separate hail resistant and non-resistant calculations | NA | Provided for visual separation only; no cost assigned to duplicates
J | Losses for Non Hail Resistant Roofing | Simply stores the Loss Severity variable under the descriptive name, Non Hail Resistant Losses, to parallel similar names in other models | Non Hail Resistant Losses | Loss Severity
K | Conventional Costs Divided By Value of Conventional Roofs | For model check: normalizes conventional losses per dollar of conventional roofs in population | Conventional Losses Divided By Value of Conventional Roofs | Non Hail Resistant Losses / (Total Value of Roof Population - Total Value of Hail Resistant Roofs)
L | Total Hail Losses | Combines loss costs for both roof types and includes extra installation costs for hail resistant roofs | Total Loss Cost for Both Roof Types | Hail Resistant Losses + Non Hail Resistant Losses + Hail Resistant Increased Installation Cost
M | Hail Event Resolution | Ends section that generates and tracks losses for all roof types | NA | NA
N | Losses for Hail Resistant Roofing | Calculates proportion of hail-resistant roofs that suffered loss, based on triangular distribution | Hail Resistant Losses | TRIA(0.4, 0.5, 0.56) * Loss Severity * Percentage of Hail Resistant Roofs in Total
Table 6: Description of modules within the hail-resistant roof replacement model – continued on next page
Module Explanations for Hail-Resistant Roof Replacement Model – continued from previous page
ID | Module | Description | Variable | Value or Expression
O | Increased Installation Cost for Hail Resistant | Calculates extra installation cost of hail-resistant roofs based on triangular distribution | Hail Resistant Increased Installation Cost | TRIA(0.1, 0.15, 0.2) * Hail Resistant Losses
P | Hail Losses Plus Higher Installation Costs | Calculates total cost of installing hail-resistant roof by adding extra installation cost to loss cost | Hail Resistant Losses Plus Added Installation Cost | Hail Resistant Increased Installation Cost + Hail Resistant Losses
Q | Total Accum Hail Resistant Losses Including Installation | Accumulates losses across yearly replications; does not reset after each year | Hail Resistant Losses Plus Installation Divided By Value of Hail Resistant Roofs | (Hail Resistant Losses + Hail Resistant Increased Installation Cost) / Total Value of Hail Resistant Roofs
R | Increase Total Number of Hail Resistant Roofs | Increase the number of hail resistant roofs in the population by the number of hail resistant roofs replaced in event | Total Value of Hail Resistant Roofs | Total Value of Hail Resistant Roofs + (Non Hail Resistant Losses * InputProportionToRepairWithHailResistant)
S | Percentage of Total Hail Resistant Value to Total Roof Value | For model check: normalize conventional losses per dollar of conventional roofs in population | Percentage of Hail Resistant Roofs in Total | Total Value of Hail Resistant Roofs / Total Value of Roof Population
Table 7: Continuation of description of modules within the hail-resistant roof replacement model
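The loss and roof-stock expressions listed for modules J, L, N, O, R, and S can be read as a single per-event update. The following Python sketch is one possible reading, not the Arena model itself. The class name, function name, and the example dollar figures are hypothetical, and the order in which the modules are evaluated is an assumption. The module table states no cap on the hail-resistant roof value, so none is applied here.

```python
import random
from dataclasses import dataclass


@dataclass
class RoofState:
    total_roof_value: float      # "Total Value of Roof Population"
    hail_resistant_value: float  # "Total Value of Hail Resistant Roofs"


def process_hail_event(state, loss_severity, proportion_to_repair, rng):
    """Apply one hail event using the expressions of modules S, J, N, O, L, R."""
    # S: share of the roof population (by value) that is hail resistant
    pct_resistant = state.hail_resistant_value / state.total_roof_value
    # J: conventional losses are stored as the raw loss severity
    non_resistant_losses = loss_severity
    # N: TRIA(0.4, 0.5, 0.56) * Loss Severity * Percentage of Hail Resistant Roofs
    # (random.triangular takes low, high, mode in that order)
    resistant_losses = rng.triangular(0.4, 0.56, 0.5) * loss_severity * pct_resistant
    # O: TRIA(0.1, 0.15, 0.2) * Hail Resistant Losses
    extra_install = rng.triangular(0.1, 0.2, 0.15) * resistant_losses
    # L: total loss cost for both roof types
    total_loss = resistant_losses + non_resistant_losses + extra_install
    # R: grow the hail-resistant stock by the share of damaged roofs replaced
    state.hail_resistant_value += non_resistant_losses * proportion_to_repair
    return total_loss


# Hypothetical figures: a roof population worth 1000 units, 10% hail resistant
state = RoofState(total_roof_value=1000.0, hail_resistant_value=100.0)
loss = process_hail_event(state, loss_severity=50.0, proportion_to_repair=1.0,
                          rng=random.Random(0))
print(loss, state.hail_resistant_value)
```

With a 100% replacement proportion, each event moves the full conventional loss amount into the hail-resistant stock, which is why the hail-resistant lines are expected to rise as replications accumulate.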
Basic Statistics on Simulated Distribution Output

Variable | Average | Half-Width | Maximum
Days Inter-Arrival | 1.8 | < 0.27 | 24
Property Damage | 46.48 | < 0.72 | 441
Actual Inter-Arrival | 16.22 | | 252
Actual Property Damage | 31.13 | | 1,350
Alternate Inter-Arrival (with mean of 16.2 days for sensitivity testing) | 14.1 | < 0.75 | 103.6
Table 8: Arena output statistics on simulated distributions
Nevertheless, the histograms shown in figure 22 show a strong resemblance to the distributions that were shown in
the previous histograms of the actual data.
Figure 22: Histograms of inter-arrival times in days (left), and loss severity in dollars (right)
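The gap between the simulated and actual inter-arrival averages depends in part on how the two arguments of the LOGN expression are interpreted. The sketch below is a hedged aside, not part of the original analysis. It compares two readings of LOGN(1.82452117, 1.33642090): the arguments as the mean and standard deviation of the inter-arrival time itself, or as the mean and standard deviation of its logarithm. Which reading a given software package uses is an assumption to be checked against that package's documentation. Under the second reading the implied mean is about 15 days.

```python
import math


def lognormal_mean_from_log_params(mu, sigma):
    """Mean of a lognormal variable whose logarithm has mean mu and std sigma."""
    return math.exp(mu + sigma ** 2 / 2.0)


def log_params_from_moments(mean, std):
    """Convert the mean and std of a lognormal variable to (mu, sigma) of its log."""
    sigma_sq = math.log(1.0 + (std / mean) ** 2)
    return math.log(mean) - sigma_sq / 2.0, math.sqrt(sigma_sq)


A, B = 1.82452117, 1.33642090  # the two arguments of LOGN(...) in module A

# Reading 1: A and B are the mean and std of the inter-arrival time itself.
mean_if_moments = A
# Reading 2: A and B are the mean and std of the log of the inter-arrival time.
mean_if_log_params = lognormal_mean_from_log_params(A, B)

print(round(mean_if_moments, 2), round(mean_if_log_params, 2))
```

The helper `log_params_from_moments` goes the other way, for the case where a fitted mean and standard deviation need to be expressed on the log scale.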
Testing and validation of the hail-resistant roof replacement incrementer, and total losses. If the formula to
increment the number of hail-resistant roofs in the U.S. is working properly, we should expect to see the lines pertaining
to hail-resistant roof losses and installation costs move upward as the number of replications increases. That is indeed
what is seen in figures 23 and 24. These graphs compare the total losses from conventional and hail-resistant roofs after
one year and after 200 years. The models were set to replace 100% of hail-damaged conventional roofs with hail-resistant
roofs to speed up the replacement process. Even so, the replacements can only occur as fast as the frequency and severity
of hail storms will allow.
Note that severity in a real event is determined not only by hail stone size, but also by geographical coverage.
Since these models are national in scope, the severity distribution encompasses this consideration. Since the frequency
generator and the severity generator are based on actual data, the speed at which the roof replacements occur should
mimic real life. However, as pointed out previously, the speed will likely be a bit faster in the models, since there
are tail events that cannot be adequately accounted for in these distributions without resorting to modeling the tails
separately.
The graph in figure 23 indicates the average daily conventional roof losses (top line), the average daily hail-resistant
roof losses, and the average daily hail-resistant losses plus slightly increased installation costs, over a single replication
(i.e., one year). The model was started with only 10% of the roofs being hail-resistant. (While no available data
were found to indicate the total number of hail-resistant roofs in the U.S., 10% was used as a starting point, as the number
of hail-resistant roofs is considered to be quite low. While it is probably lower than 10%, changing this number does
not impact the findings from the experiments that were done.) In any case, figure 23 indicates that the losses related
to hail-resistant roofs remain low relative to conventional roof losses after only one year. This is, of course, what we
should expect if the incrementing formula is working properly.
Figure 24 shows the same graph as in figure 23, except that it covers 200 replications. We can see from this
graph, that after 200 years, the losses from hail resistant roofs account for nearly all of the losses. The line for the
conventional losses is near zero, and is difficult to see in the graph. After a time as long as 200 years, this seems a
plausible result.
6 Design Experiments and Make Model Runs

It might seem obvious that insurance companies should start installing hail-resistant roofs, just based on the fact that
we already know that hail-resistant roofs are 50% less likely to be damaged than conventional roofing. Also, we know
that the cost to install hail-resistant roofs is only a little more than the cost to install conventional roofs: 10% to 20% more.
However, given the wide variance in extreme events such as severe hail storms, things that seem certain may not be so.
Or, it may take a great many years to see the benefits from such loss-mitigating efforts. This experiment is intended to
determine whether the apparent advantage of the various options for installing hail-resistant roofs is real, and if so,
to determine the minimum number of years it might take to fully realize these benefits.
Figure 23: Average losses for each of the 365 days in a year over a single year (i.e., one replication)
There are several scenarios that will be compared.
1. Do not replace any hail-damaged roofs with hail-resistant roofing. This extreme scenario is more of a baseline
for comparison purposes. It could help answer the question of whether it is worthwhile to even consider
embarking on any hail-resistant roofing strategies.
2. Replace all hail-damaged roofs with hail-resistant roofing. This scenario represents the opposite extreme case to
the first scenario, and is likewise mainly included for comparison purposes. It would likely take some extreme
insurance and building code legislation to accomplish anything even close to this scenario.
3. Replace 50% of hail-damaged roofs with hail-resistant roofing. This may also be a bit optimistic, but might be
plausible. For example, it might be possible to accomplish this kind of rate eventually, if enough insurance com-
panies can find a way to get the public’s attention through attractive incentives, combined with some legislative
changes in hail prone areas, such as improved building codes.
4. Replace 10% of hail-damaged roofs with hail-resistant roofing. This replacement rate is perhaps a bit more
realistic.
5. Share costs for installing hail-resistant roofing with homeowners, resulting in a 10% take-up rate (i.e., 10% of
hail-damaged roofs replaced with hail-resistant roofing). This is perhaps the most realistic scenario, as some
insurance companies have begun to offer incentives, such as reduced deductibles, and/or reduced premiums
for residents with hail-resistant roofs. The take-up rate is not known, but does not appear to be getting much
attention in the industry. Hence, the take-up rate would appear to be rather low, so that 10% may well be an
overly optimistic estimate. This scenario is also intended to test whether the extra installation cost is much of an
offsetting factor as to the efficacy of installing hail-resistant roofing from a loss standpoint.
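The five scenarios differ mainly in the proportion of hail-damaged roofs that are replaced with hail-resistant roofing, which corresponds to the InputProportionToRepairWithHailResistant input named in the module table. A minimal Python sketch of that configuration follows. The `Scenario` class and the `cost_sharing` flag are hypothetical names used for illustration; how cost sharing alters the cost expressions is not specified here, so the flag is only a marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    # Maps to InputProportionToRepairWithHailResistant in the module table
    proportion_to_repair_with_hail_resistant: float
    # Hypothetical marker only; the cost-sharing mechanics are not modeled here
    cost_sharing: bool = False


SCENARIOS = [
    Scenario("All conventional roof replacements", 0.0),
    Scenario("All hail resistant roof replacements", 1.0),
    Scenario("Fifty percent hail resistant replacements", 0.5),
    Scenario("Ten percent hail resistant replacements", 0.1),
    Scenario("Cost sharing with ten percent hail resistant replacements", 0.1,
             cost_sharing=True),
]

for s in SCENARIOS:
    print(f"{s.name}: {s.proportion_to_repair_with_hail_resistant:.0%}")
```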
Figure 24: Average losses for each of the 365 days in a year over 200 years (i.e., 200 replications)
Figure 25 illustrates the basic initial setup for comparing the scenarios.
Figure 25: Basic setup for comparing hail-resistant roof replacement strategies in Process Analyzer software
7 Data Analysis and Results Documentation

The primary variable of interest is the total loss cost for both roof types. The primary perspective for this analysis is
that of the insurance industry, and total loss costs will give us the best indication as to the efficacy of each
strategy in terms of profitability potential.
7.1 Comparisons over 50 Years

Figure 26 shows a box plot comparing the total loss results for each scenario for 50 replications. The overlap in the
red (or light gray, if viewing this in black and white print) boxes, representing the confidence intervals, indicates no
significant differences between these scenarios. If any scenario had turned out to have a significantly better mean
result than the others, its box would be shaded blue (or dark gray, in black and white print). All of the boxes for this
50-year experiment are shaded red.
Figure 26: Box plot of 50 replications indicating no single scenario is significantly better
The numbers behind the box plot for the 50 replications are shown in table 9. This table indicates the minimum,
maximum, and confidence interval ranges for the mean of each scenario across all the replications (50 years, in
this case). We can see confirmation of the overlapping confidence intervals, despite the seemingly large differences
between the mean values. This is an indication of the obscuring effect that extreme events can have on seemingly obvious
courses of action. It turns out that embarking on a program to replace hail-damaged conventional roofs with
hail-resistant ones may not be such an obviously correct course of action after all, or at least, not within a 50-year timeframe.
Fifty years is obviously a long time to wait to see definitive business results.
Total Loss Cost for Both Roof Types by Scenario 50 Replications

(Standard Error Tolerance = 1.26791)
Scenario | Min | Max | 95% CI Low | 95% CI High | Mean
All Conventional Roof Replacements | 5.15 | 293.74 | 23.3 | 68.81 | 46.06
All Hail Resistant Roof Replacements | 3.24 | 192.84 | 17.9 | 48.85 | 33.38
Fifty Percent Hail Resistant Replacements | 3.84 | 212.12 | 19.32 | 53.05 | 36.18
Ten Percent Hail Resistant Replacements | 4.61 | 231.39 | 20.83 | 57.47 | 39.15
Cost Sharing with Ten Percent Hail Resistant Replacements | 4.58 | 230.09 | 20.7 | 57.12 | 38.91
Table 9: Data behind the box plot for 50 replications
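The overlap described for the 50-replication box plot can be checked directly from the confidence interval bounds in table 9. The Python sketch below is an illustrative check only. It tests whether any pair of 95% intervals fails to overlap, which is a simple visual-style criterion; the exact comparison procedure used by the Process Analyzer software may differ, and the scenario labels are shortened here for readability.

```python
from itertools import combinations

# (95% CI low, 95% CI high) for total loss cost, copied from table 9
table_9 = {
    "All conventional": (23.30, 68.81),
    "All hail resistant": (17.90, 48.85),
    "Fifty percent hail resistant": (19.32, 53.05),
    "Ten percent hail resistant": (20.83, 57.47),
    "Cost sharing, ten percent": (20.70, 57.12),
}


def intervals_overlap(a, b):
    """True when closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


non_overlapping = [
    (x, y)
    for x, y in combinations(table_9, 2)
    if not intervals_overlap(table_9[x], table_9[y])
]
print(non_overlapping)  # prints [] because every pair of intervals overlaps
```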
It was considered that there may be a potential need for a warm-up period. However, this did not make a difference
in the results, even after allowing for a 100-year (3,650 days) warm-up period. There are no significant
interactions within the model, such as between resources or entities, as there are no resources, and there is only
one entity. Thus, it is unlikely that a warm-up would make much difference in most cases with these rather simple
model structures. Although increasing the number of replications does reduce the confidence intervals, warm-up
replications are not counted toward the sample size, n, when the Arena software package calculates a
confidence interval.

Figure 27 exhibits the box plot of the results after 100 replications. After this many additional replications, we can see
that the boxes are narrower than in the 50 replication box plot shown in figure 26. Table 10 also bears this out.
Figure 27: Box plot of 100 replications indicating no single scenario is significantly better, even after 100 years
In table 10, we can see that the 95% confidence interval low values are not quite as low as the results that were shown
for 50 replications in table 9. Conversely, the high values are not as high. Thus, we see confirmation of the confidence
interval shrinking with increasing numbers of years, as expected. However, even with the shrinking confidence
intervals, none of the scenario outcomes is statistically significant. Again, all of the boxes are shaded red in the box
and whiskers plot that was shown in figure 27. None of them are shaded blue, which would have indicated a result that
would have allowed us to reject the null hypothesis that the scenario mean outcomes are statistically equal, given the
amount of variance in the results. Again, we see the effect that extreme events can have on such comparisons.
Scenario | Max | Min | 95% CI Low | 95% CI High | Mean
Fifty Percent Hail Resistant Replacements | 212.12 | 3.42 | 23.78 | 44.66 | 34.22
Cost Sharing with Ten Percent Hail Resistant Replacements | 230.09 | 4.41 | 27.18 | 50.54 | 38.86

The results are the same as above when running simulations for higher numbers of replications, in increments of 50 years,
until reaching 250 years. At this point, it becomes more of an academic exercise to see how many replications are needed to
find a statistically significant difference between any of these scenarios, but it is a question that begs to be answered.
Figure 28 shows that the confidence intervals are finally narrow enough after 250 replications that all three
hail-resistant roof replacement strategies are significantly better than the conventional roofing scenario.
In addition to the impact of extreme event variability, another factor affecting these results is that it takes a while
to build up enough of an “inventory” of hail-resistant roofing stock to make a difference. This gets at the issue of low
frequency of occurrence, which is another characteristic of extreme loss events.
Figure 28: Box plot of 250 replications shows all three hail-resistant scenarios are finally significantly better than
non-hail resistant
Table 11 shows just how narrow the confidence intervals have become after 250 replications. We can see how
differences in scenario outputs can begin to appear significant at this point.
Scenario | Min | Max | 95% CI Low | 95% CI High | Mean
Fifty Percent Hail Resistant Replacements | 2.56 | 212.12 | 23.91 | 36.09 | 30
Cost Sharing with Ten Percent Hail Resistant Replacements | 3.8 | 251.55 | 29.25 | 44 | 36.62
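The narrowing of the confidence intervals across 50, 100, and 250 replications can be compared with the usual rule of thumb that the half-width shrinks roughly in proportion to one over the square root of the number of replications. The Python sketch below applies that rule to the fifty-percent scenario, using the interval bounds reported for that scenario at each run length. The column assignment for the 100-replication values is inferred from their magnitudes, and the square-root scaling is an approximation that assumes a stable variance, so this is an illustration only.

```python
import math

# 95% CI bounds for the fifty-percent scenario at each number of replications
ci_by_replications = {
    50: (19.32, 53.05),
    100: (23.78, 44.66),
    250: (23.91, 36.09),
}


def half_width(ci):
    """Half the distance between the lower and upper confidence bounds."""
    return (ci[1] - ci[0]) / 2.0


base_n = 50
base_h = half_width(ci_by_replications[base_n])

# Compare each observed half-width with the 1/sqrt(n) prediction from n = 50
for n, ci in ci_by_replications.items():
    predicted = base_h * math.sqrt(base_n / n)
    print(n, round(half_width(ci), 2), round(predicted, 2))


def replications_needed(n0, h0, target_h):
    """Approximate replications for a target half-width, assuming h ~ 1/sqrt(n)."""
    return math.ceil(n0 * (h0 / target_h) ** 2)
```

Under this approximation, halving a half-width requires about four times as many replications, which is consistent with the long horizons needed before the scenarios separate.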
7.4 Sensitivity Testing

As was mentioned previously, there were difficulties in fitting a realistic distribution to the actual severe hail storm
arrival times. The preceding experiments were repeated with a different arrival distribution assumption. It was assumed
that the arrival time could be modeled by an exponential distribution with a mean of 16.2 days, which is what the
simple mean of the actual arrival data suggests. Repeating the experiments with this distribution tests whether this change
is enough to significantly shift the number of years (i.e., replications) required for the confidence intervals to become
small enough to show a significant difference between the scenarios.
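As a rough illustration of the alternate arrival assumption, the following Python sketch draws exponential inter-arrival times with a mean of 16.2 days and counts the events that fall within a 365-day year. It is a stand-alone sketch, not the Arena implementation. The function name, the seed, and the choice of 1,000 simulated years are arbitrary and hypothetical.

```python
import random

MEAN_INTER_ARRIVAL_DAYS = 16.2  # mean used in the sensitivity test


def simulate_event_days(horizon_days, mean_gap, rng):
    """Return event times (in days) within one horizon, with exponential gaps."""
    days = []
    t = rng.expovariate(1.0 / mean_gap)  # expovariate takes the rate, 1 / mean
    while t <= horizon_days:
        days.append(t)
        t += rng.expovariate(1.0 / mean_gap)
    return days


rng = random.Random(2010)  # arbitrary seed, for repeatability only
counts = [len(simulate_event_days(365, MEAN_INTER_ARRIVAL_DAYS, rng))
          for _ in range(1000)]
# The average count per year should be near 365 / 16.2
print(sum(counts) / len(counts))
```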
Figure 29 shows that the results with the exponential arrival time are the same in the 50-year experiment, as for the
originally used, fitted distribution. The confidence intervals overlap for all the scenarios, and all are red, indicating no
statistically significant difference between them. Thus, this provides additional evidence that it may take a very long
time to realize any significant benefit from implementing roof replacement strategies involving hail-resistant roofing.
Figure 29: 50-year comparison of exponential inter-arrival with mean of 16.2 days for sensitivity testing
However, figure 30 shows a result in 100 years with the new distribution that is similar to what we saw in 250 years with the
previous, fitted distribution. Thus, it appears that the exponential distribution tends to narrow the confidence intervals
at a faster rate. However, for business purposes, 100 years is still a very long time. This sensitivity
test appears to confirm our initial findings that hail-resistant roof replacement strategies may not be as appealing as they
might first appear.
8 Conclusion
An initial evaluation of the published statistics concerning the benefits of hail-resistant roofs might lead insurance
executives, state insurance regulators, and building code advocates to pursue incentives and laws that would strongly
promote investing extra money to install hail-resistant roofs when conventional roofs are destroyed as a result of hail-
damage. However, an evaluation of the long-term benefits with the use of simulation modeling technology reveals
the answer may not be so clear cut. Hail storm events that are sufficiently strong to cause significant damage (over
$1 million per event, which is a low threshold by insurance company standards) are relatively infrequent, and the
variance in the severity of these events is such that it could take an extremely long time to realize a clear payoff from
such efforts.
Figure 30: 100-year comparison of exponential inter-arrival with mean of 16.2 days for sensitivity testing
However, further research should be conducted, as this study was fairly limited in scope. Other possible
avenues of research could include cost savings for wind damage combined with hail damage, as hail-resistant roofs are
purported to hold up better than conventional roofs in high wind as well. Other possibilities include doing a more
extensive study that would seek to overcome some of the simplifying assumptions that were listed for this study. One
such extension of this study could involve fitting additional distributions to account for the extreme tails. It may be
possible to do a piecewise combination of distributions at different thresholds.
In addition to modeling the impact of hail-resistant roof replacement strategies, much attention was given to the
extreme value distributions behind hail storm events in general. This study also revealed the high challenge of trying
to mimic mother nature, as she does not like to conform to the rather domestic varieties of distributions that are
commonly available in most software packages. The natural world can be much more variable and extreme than we
may tend to imagine.
References
[1] Cost of a new roof - get prices and estimates - CostHelper.com. http://www.costhelper.com/cost/home-
garden/roof.html, December 2008.
[2] Energy information administration - residential energy consumption survey. http://www.eia.doe.gov/emeu/recs/,
2009.
[3] NCDC: * national climatic data center (NCDC) *. http://www.ncdc.noaa.gov/oa/ncdc.html, August 2009.
[4] NCDC: query output. http://www4.ncdc.noaa.gov/cgi-win/wwcgi.dll?wwevent~storms, December 2009.
[5] Arena simulation software by rockwell automation: Home. http://www.arenasimulation.com/, 2010.
[6] Dia - GNOME live! http://live.gnome.org/Dia, March 2010.

[7] Hail loss fact sheet. http://www.disastersafety.org/publications/view.asp?id=13478&cid=1085, 2010.
38 of 39
Robert McPherson
REFERENCES Final Project
[8] Summary of state hail maps. http://www.disastersafety.org/publications/view.asp?id=8854&cid=1085, 2010.

[9] USA QuickFacts from the US census bureau. http://quickfacts.census.gov/qfd/states/00000.html, April 2010.
[10] WikiAnswers - how many single family homes are there in the united states.
http://wiki.answers.com/Q/How_many_single_family_homes_are_there_in% t he_U nited_States, 2010.
[11] Terry Blinton. Let it hail, let it hail, let it hail. http://www.insurancejournal.com/magazines/southcentral/2003/04/07/features/28144.
April 2003.
[12] NAHB Research Center. High wind- and Impact-Resistant asphalt roofing shingles.
http://www.toolbase.org/Technology-Inventory/Roofs/wind-resistant-asphalt-shingles, 2001.
[13] Stanley A. Chagnon and Tamara G. Creech. AMS online journals - sources of data on freezing rain and resulting
damages. Journal of Applied Meteorology, page 5, 2003.
[14] Stuart Coles. An Introduction to Statistical Modeling of Extreme Values. Springer Series in Statistics. Springer-
Verlag, London, 2001.
[15] Eric Gilleland, Rick Katz, and Greg Young. extRemes: Extreme value toolkit., 2010. R package version 1.62.
[16] Patricia Grossi and Howard Kunreuther. Catastrophe Modeling: a new approach to managing risk. Huebner
International Series on Risk Insurance and Economic Security. Springer Science+Business Media, Inc., New
York, NY, 2005.
[17] Ryan Jewell and Julian Brimelow. EVALUATION OF an alberta hail growth model using severe hail proximity
soundings in the united states. 2004.
[18] David W. Kelton, Randall P. Sadowski, and Nancy B. Swets. Simulation with Arena. McGraw Hill, New York,
NY, international edition, 2010.
[19] Inc. Pearson Education. U.S. home size infoplease.com. http://www.infoplease.com/askeds/us-home-size.html,
2007.
[20] Inc. Professional Investigative Engineers. Insurance-Canada.ca other claims information: Hail resistant roofs:
Fact or fiction? I-ENG-A report. http://www.insurance-canada.ca/claims/other/IENGAHail407.php, 2010.
[]
39 of 39
