Hadm 4200 Case 5

CASE 5

SARAH HU

DECEMBER 2021
CLUSTERS (CLARITAS)
DETAILED DNA (CLARITAS)
CLUSTERS (ESRI)
LOCATION TYPE (PANERA)
COTENANTS (PANERA)
ECONVAR (PANERA)
CORRELATION MATRIX
REGRESS (SEGMENTS)
PREDICT OPEN (MODEL 1)
PREDICT OPEN (MODEL 2)
PREDICT OPEN (MODEL 3)
PREDICT OPEN (MODEL 4)
SUMMARY
HOLDOUT (MODEL 2)
HOLDOUT (MODEL 3)

CLUSTERS (CLARITAS)

The PRIZM segments that have at least a 20% relative frequency associated with OPEN stores are Upper Crust, Movers & Shakers, Cruising to Retirement, Middleburg Managers, Connected Bohemians, and Aspiring A-Listers. This indicates that Panera Bread appears to favor these PRIZM segments, based on the number of stores it has chosen to open in these types of areas. However, what is also interesting is that these same areas have higher relative frequencies of closed stores. This dichotomy leads us to question whether there are other variables we should look at in addition to the type of cluster (such as location within the United States) to determine the performance of stores within a certain area. In regards to location strategy, we can analyze the similarities across these PRIZM segments to determine this. The overlap in the majority of these segments occurs in the following categories: income/income-producing assets, household technology, presence of kids, employment levels, homeownership, and education levels. Most of these segments are wealthy, use modern technology, are mostly without kids, are homeowners, have a graduate plus level of education, and are employed at the management and professional level.

It should also be noted that 3 segments are relatively more important because their relative frequency is at or above 25%, meaning Panera Bread has opened 12+ stores in these cluster types. These 3 segments are Aspiring A-Listers (25%), Movers & Shakers (25%), and Middleburg Managers (27.1%). Across these 3 segments, we can determine the profile of the type of consumer that Panera Bread is trying to attract. Movers & Shakers and Middleburg Managers are definitely more similar to each other in terms of profile: both groups are wealthy in regards to income/income-producing assets, are mostly homeowners, are employed at the management and professional level, and have graduate plus education. Aspiring A-Listers represent a slightly worse-off group of people with lower income/income-producing assets, renters as opposed to homeowners, and only some college education. Panera does not seem to discriminate in regards to urbanicity or age range: suburban, metro mix, and urban areas are all represented, as well as a wide range of ages from 25 all the way to 55+. Overall, we can say that the type of consumer Panera Bread is trying to attract is on the wealthier end, is a savvy technology user, and pursued post-secondary education. In terms of employment, their target consumer mainly works at the management and professional level, with potentially some consumers working for management.

The Claritas segments that have the highest absolute value correlation with Dum_OPEN are Money & Brains, Fast-Track Families, Urban Elders, and Mayberry-ville. All of these clusters have a p-value of less than or equal to 0.1, meaning that the correlation between these clusters and Dum_OPEN is significant. For all four of these segments, there is a negative correlation ranging from -0.16 to -0.234. Because there is a negative correlation with the open store variable for all of these segments, there would be a positive correlation between these segments and the closed stores variable. Thus, we can ascertain that these closed stores are located in neighborhoods having the wrong DNA, and therefore Panera Bread should avoid these areas. The commonalities between all of these clusters are that they are comprised of people with higher incomes/income-producing assets, are located in rural and urban areas, are mostly aged 55+, are mostly without kids, and have graduate plus levels of education.

The areas in which Panera Bread should look into building more stores are the ones where there is a positive correlation between Dum_OPEN and the clusters. These areas did not have a p-value less than 0.1, but they include Empty Nests, Park Bench Seniors, Bedrock America, and Kids & Cul-De-Sac, with positive correlation values ranging from 0.146 to 0.151. These areas are typically comprised of either low-income, less-educated renters from the Kids & Cul-De-Sac and Park Bench Seniors groups or wealthier, retired homeowners from the other two groups.

The key difference between the clusters that have a positive correlation and those that have a negative correlation is their urbanicity: the clusters that have a positive correlation are mostly wealthy suburban homeowners, whereas the clusters that have a negative correlation are mostly urban/rural homeowners. Across the two categories of clusters, the differences in age, education, technology use, occupation, and presence of kids are minimal. Therefore, the main difference is in the urbanicity.

For restaurants that have relocated within the same zip code, there are 7 Claritas segments that have at least a 25% relative frequency, meaning that these segments represent Panera Bread's inferred target market segments. These segments are Movers & Shakers, Country Squires, Gray Power, Cruising to Retirement, Middleburg Managers, Toolbelt Traditionalists, and Generation Web. Of these segments, Middleburg Managers has the highest relative frequency at 45.2%. The

next four highest segments are Movers & Shakers, Cruising to Retirement, Toolbelt Traditionalists, and Generation Web, which are all tied at 29%. Across these segments, the primary overlap occurs in their income/income-producing assets, which show that these are wealthy populations; in household technology use, which is either average or better; and in that they are mostly without kids, are mostly homeowners, and have a graduate plus level of education. The majority of these segments reside in either suburban or metro mix areas. The exception on income, homeownership, and education levels is Generation Web, because this is a group that mostly rents, has lower income, and is less educated, with only a high school education.

DETAILED DNA (CLARITAS)
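
The relative frequencies quoted for the Claritas segments are counts divided by the number of stores in the group. The following is a minimal sketch of that screen, not the case's actual workbook: the text equates 25% with 12 stores, which implies a base of 48 open stores, but that base, the 13-store count behind 27.1%, and the helper name `relative_frequency` are inferences made for illustration.

```python
def relative_frequency(segment_count: int, total_stores: int) -> float:
    """Share of stores in a group whose zip code contains the segment."""
    if total_stores <= 0:
        raise ValueError("total_stores must be positive")
    return segment_count / total_stores


# Assumption: 48 open stores, inferred from "25% ... 12+ stores" in the text.
OPEN_STORES = 48

# Assumed store counts per segment (12 -> 25%, 13 -> 27.1%).
open_counts = {
    "Aspiring A-Listers": 12,
    "Movers & Shakers": 12,
    "Middleburg Managers": 13,
}

THRESHOLD = 0.25  # the "relatively more important" cutoff used in the text

# Keep only segments at or above the cutoff, rounded for display.
favored = {}
for name, count in open_counts.items():
    share = relative_frequency(count, OPEN_STORES)
    if share >= THRESHOLD:
        favored[name] = round(share, 3)

print(favored)
```

The same function applies to the closed and relocated groups once their own store totals are substituted for `OPEN_STORES`.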
CLUSTERS (ESRI)

Based on ESRI Tapestry, there are no ESRI segments that Panera Bread seems to favor, in the sense of having at least a 20% relative frequency with open stores. The three highest relative frequency segments are Top Tier, City Lights, and Enterprising Professionals. The City Lights population is comprised of tech-savvy individuals who reside in a mix of single-family homes (SFH), townhouses, or multifamily properties. There is also a wide range of single/married, renters/homeowners, and middle/working class. Top Tier is a population comprised of wealthier married couples living in the suburbs. These are property owners with some of the highest home values, and they are highly educated, with 1 in 3 having a postgraduate degree. Enterprising Professionals is a population that is well educated and made up of early adopters of new technology. They are mostly married couples, with around 30% single-person households. As with City Lights, there is a wide mix of types of homes, from suburban SFH to larger multifamily properties. There is also a fairly even split between renters and homeowners.

Based on these common segments, the profile of the type of consumer Panera is trying to attract does not seem to be tied down to a certain set of characteristics. As mentioned before, there is quite a range of backgrounds across these three highest relative frequency segments, including whether or not they are homeowners, whether they are married or single, what type of house they live in, income levels, etc. However, some general trends are that these people are tech-savvy and have at least some post-secondary education.

Based on the ESRI Tapestry segments that have at least a 20% relative frequency with closed stores, Panera Bread appears to be moving away from Urban Chic and Metro Renters. Urban Chic is a group of people who are mostly homeowners and are primarily married couples with older children. Financially, they are very stable, and they are well educated, with 65% of residents holding a bachelor's degree or higher. They are environmentally aware, employed in white-collar occupations, and know how to use technology. Metro Renters, on the other hand, are well-educated college students and are also well acquainted with technology. They are not homeowners, and typically the households are composed of singles. Most of the time they take public transportation, and they are very well versed in modern social trends, as socializing and social status are very important to them.

Based on the common segments that have high relative frequencies with closed stores, we cannot necessarily say that these numbers support the conclusion that Panera is trying to move away from these areas completely, because their relative frequencies with open stores (for Urban Chic and Metro Renters) are also quite high, at 12.5% and 10.4% respectively. This means that there are still a good number of stores operating in these segments compared to other segments (given that the highest relative frequency within the open stores column was 14.6%). However, it can be noted that the rate at which Panera is closing these stores is a lot higher than the rate at which it is opening them. Commonalities between these two segments include using technology and being well educated. They are around the same age group (30s to 40s) and reside in urban and suburban areas.

The only ESRI segment that has a statistically significant correlation (positive or negative) with the variable of interest, Dum_OPEN, is Urban Chic, which has a p-value of 0.06. The correlation with Dum_OPEN is -0.19, which means that stores in Urban Chic segments are more likely to be closed than open. Urban Chic is a segment that represents prosperous older married couples that live in the suburbs. They are primarily homeowners who know how to use technology and are environmentally conscious. This segment is growing slowly but steadily. The characteristics of this particular segment coincide with those of the segments we found in the preceding question that also had higher relative frequencies with closed stores, because they are comprised of older, married, wealthier, suburban individuals. Stores in the Urban Chic neighborhoods have the wrong DNA. Thus, we once again determine that Panera should stay out of these types of areas.

The neighborhoods that Panera should look into entering include those that had a positive correlation with Dum_OPEN. Although the segments Down the Road and City Lights had p-values that were not statistically significant, they had some of the highest positive correlation coefficients with Dum_OPEN, at 0.14 and 0.15. City Lights is a densely populated urban market comprising a range of individuals. There is a mix of married couples and singles, a blend of homeowners and renters, and SFH, townhouses, apartments, multifamily units, etc. Down the Road is also a mix of individuals, in rural areas, with a mix of homeowners and renters and a mix of college and high school education, primarily comprised of families. In terms of income, the families are on the lower side, as they mostly work in land occupations like agriculture and live in affordable housing. In general, we can say that Panera should consider breaking into these segments with lower to middle income, varying levels of education, varying levels of homeownership, and varying types of homes.

The ESRI segment neighborhoods are useful in helping us understand the DNA of neighborhoods where Panera is either closing restaurants or continuing to keep them open. They provide a more detailed profile of each type of segment, with not only lifestyle tendencies but also race, age by sex, average household budget index, occupation by earnings, income and net worth, housing, etc. From these profile characteristics, we can determine that the neighborhoods Panera should avoid tend to be wealthier rural and suburban areas. Panera is looking to break into lower to middle income rural areas and urban areas with a mix of demographics.
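
The significance screen applied to these correlations can be checked by hand: a correlation r over n observations has test statistic t = r * sqrt((n - 2) / (1 - r^2)). The sketch below is illustrative only. The sample size n = 96 is a hypothetical value (the case does not state it), the function names are invented here, and the p-value uses a normal approximation to the t distribution, so it is approximate.

```python
import math
from statistics import NormalDist


def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for testing whether a Pearson correlation differs from zero."""
    if n <= 2:
        raise ValueError("need more than 2 observations")
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


def approx_two_sided_p(t: float) -> float:
    """Two-sided p-value using a normal approximation (reasonable for large n)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


N_STORES = 96          # hypothetical sample size, not given in the case
r_urban_chic = -0.19   # correlation of Urban Chic with Dum_OPEN, from the text

t = correlation_t_stat(r_urban_chic, N_STORES)
p = approx_two_sided_p(t)
print(f"t = {t:.2f}, approximate p = {p:.3f}")
print("significant at the 0.1 level:", p <= 0.1)
```

Under these assumed inputs the approximate p-value falls close to the 0.06 reported for Urban Chic; a different sample size would shift it.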

LOCATION TYPE (PANERA)

Free Standing Stores, Power Centers, and Community/Neighborhood Shopping Centers all have a positive correlation with the variable Dum_OPEN. Therefore, it seems that these types of locations increase the probability that Panera will keep its restaurants open in a given zip code.

Super-Regional Malls, Lifestyle Centers, Strip Centers, Residential/Retail, and Office/Retail spaces have negative correlation coefficients with Dum_OPEN. This suggests that these types of properties decrease the probability that Panera will keep a restaurant open. Additionally, we can look at the ratio of the number of times a Panera has opened to the number of times it has closed within a certain type of center to support this conclusion. For instance, for Lifestyle Centers there are no Paneras open, and for Office/Retail, 10 stores closed out of the 11 that were developed, leaving only one open. A general trend that we see here is that centers that do not perform so well have lower ratios.

Some economic intuition as to why a given location type matters comes from the characteristics of certain properties. Panera prefers locations that are near major shopping centers that have a prominent regional trade area and are close to university and college campuses as well as business centers. The types of properties that seem to have a positive correlation with the open store variable are ones that are anchored by a large tenant, like a supermarket for a Community Shopping Center or large specialty stores for Power Centers (as opposed to the usual department stores for retail centers and malls), and which are also a bit more upscale. These differences in anchors, as well as in those who shop there, could provide insight into why these location types matter.

Using the arbitrary cutoff of 1.5 for the ratio of the number of times Panera opened to the number of times it closed within a type of center, Free Standing Stores and Community/Neighborhood Shopping Centers are desirable. For these two types of centers, this means that for every store closed, there were at least 1.5 new stores opened. For Free Standing Stores, this number was 2.3, while for the latter type of center it was 1.6. The other types of shopping centers had ratios less than 1.5, and so they are undesirable. This ratio analysis supports our preceding correlation analysis, where we found that Free Standing Stores and Community/Neighborhood Shopping Centers have positive correlation coefficients with Dum_OPEN, which increases the probability that Panera will keep its restaurant open in this type of center. Thus, both analyses conclude that Free Standing Stores and Community/Neighborhood Shopping Centers are preferred location choices.

However, there are cons/limitations with a preferred location choice, since it limits the type of consumer that you may find. For instance, the consumers at regional malls and those at a free-standing store may have different tendencies, like how often they visit a Panera, why they visit a Panera, the type of food that they order, etc. Specifically, in malls, customers may visit primarily because of convenience, whereas those who visit a stand-alone store may do so because they are looking to have a casual lunch and want healthier food options. There may also be patterns in regards to income, race, etc. that would limit the type of consumer that eats there. This could be a problem if Panera were to try to implement a new strategy to target different demographics. If you only focus on one type of center, this switch would be very hard for Panera's business strategy team to make down the line, since there would primarily be one type of consumer that eats at these locations.
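
The open-to-closed ratio screen with the 1.5 cutoff can be sketched as follows. Only the Office/Retail counts (1 open, 10 closed) come from the text; the other counts are hypothetical, chosen so the ratios match the 2.3 and 1.6 quoted above, and the function name `open_closed_ratio` is illustrative.

```python
import math


def open_closed_ratio(n_open: int, n_closed: int) -> float:
    """Stores opened per store closed; infinite when nothing has closed."""
    if n_open < 0 or n_closed < 0:
        raise ValueError("counts must be non-negative")
    if n_closed == 0:
        return math.inf if n_open > 0 else 0.0
    return n_open / n_closed


CUTOFF = 1.5  # the arbitrary cutoff used in the case

# (open, closed) counts per center type.
centers = {
    "Free Standing": (23, 10),           # hypothetical counts, ratio 2.3
    "Community/Neighborhood": (16, 10),  # hypothetical counts, ratio 1.6
    "Office/Retail": (1, 10),            # from the text: 10 of 11 closed
}

desirable = [
    name
    for name, (n_open, n_closed) in centers.items()
    if open_closed_ratio(n_open, n_closed) >= CUTOFF
]
print(desirable)
```

Because the cutoff is arbitrary, it is worth rerunning the screen with nearby values (for example 1.25 or 2.0) to see whether the set of desirable center types changes.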

COTENANTS (PANERA)

Having one or more of these big box tenants in the same zip code as a Panera Bread restaurant increases the probability that the restaurant will remain open. This can be seen in the positive correlation coefficients between all of the big box tenants and the open stores variable. Additionally, 9 of these big box tenants had statistically significant p-values: Dick's, Ross, Walmart, TJ Maxx, HomeGoods, Target, Petco, Home Depot, and Kohl's.

Panera Bread does take advantage of these clusters. In our relative frequency analysis, we see that Panera locations in the same zip code as some of these big box tenants have fewer closed stores than open stores. In fact, for every big box tenant, the percentage of open Panera stores is a lot higher than the percentage of closed stores. This indicates that Panera is taking advantage of these clusters by opening stores in the same zip codes as these big box tenants and keeping them open more frequently than it is closing them. Additionally, we see that there is more to the big box tenants that have high relative frequencies with closed stores. Panera may be closing stores in some of these zip codes, but it is also opening new Paneras in the same locations (relocating). For instance, Chipotle had the highest relative frequency of closed stores at 54.2%. However, Chipotle also had the highest relative frequency for relocated stores at 87.1%, meaning that Panera is actually adding stores at a faster rate in the zip codes near Chipotles. Panera should continue to take advantage of these clusters because of the concept of economies of agglomeration, which is when people and businesses benefit from being close to one another in cities and industrial clusters.

Additionally, we can analyze the ratio of the number of Paneras open for every store that closed. If we use the arbitrary cutoff of 1.5 again, we see that the majority of the big box tenants have ratios greater than 1.5 (besides Bed Bath & Beyond, Whole Foods, and Chipotle). This means that there are more Paneras being opened and staying open in these locations than there are closed, which points to the success of these restaurants in these shared zip codes with big box tenants.

Using our constructed variable, Stat PosCorr Agglom (Cluster), we can explain why there is a positive correlation between this variable and the open stores variable. Having these co-tenants in the same zip code is a good thing from a location perspective because it is similar to the idea of anchor tenants in malls driving foot traffic. Since these big box stores attract a large consumer population, Panera Bread is able to feed off of the foot traffic to these stores to gain customers as well. In the case of other restaurants, Panera is able to benefit because the consumers are already looking to get a meal. Panera Bread can thus position itself in an attractive way from a marketing perspective to win over those consumers looking to eat something as well.

ECONVAR (PANERA)

Comparing Panera Bread's open vs closed restaurants, there are no obvious telltale differences between these two groups of stores.

In our two-sample t-test, there were no statistically significant p-values because they were all greater than or equal to 0.1. However, the more prominent differences had higher absolute value t-scores, and these variables were traffic volume and walk score. While traffic volume had a positive correlation with the open stores variable, walk score did not. To explain why this might be the case, Panera might be more popular in areas where there are major highways/roads and rely more on customers who visit the restaurant in a vehicle and use the drive-through. However, to explain why there might not be any telltale differences between these two groups of stores, we may be led to conclude that this pool of variables was not necessarily the major reason why Panera was unsuccessful at a certain location and more successful in another, especially since none of the variables had a statistically significant p-value with the open stores variable either. Therefore, regardless of the differences in these variables across locations, Panera's success may not be dependent on them individually.

For the stores that Panera relocated within the same zip code, there is one telltale difference between stores that are operating and stores that closed. This variable is age, with a p-value of 0.002, indicating that the difference is statistically significant. The average age of a consumer at a closed location is higher than the age of a consumer at an open location (37 years old vs 21 years old). This would indicate that Panera should probably target an area with a lower average age or market itself to younger people in their twenties vs those who are older, since this older population was associated with closed stores.

Additionally, we see that this variable along with traffic volume were both statistically significant in a simple regression against the open stores variable, because the p-value is less than 0.1 for both. For age, the correlation with open stores is negative, while for traffic volume, the correlation is positive. For traffic volume, this makes sense because, as we mentioned in the preceding part, traffic volume could be important for drive-through sales and may garner more customers than foot traffic from walking. For age, this makes sense because, as we saw in the preceding part, a higher age was associated with a higher probability of closed stores. Thus, it would be in Panera's best interest to target a younger demographic and to open stores near more traffic.
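
The two-sample comparison of open and closed stores can be sketched as below. This is an assumption-laden illustration: the case does not say which t-test variant was run, so Welch's unequal-variance form is used here, and the two samples are hypothetical values constructed only to have the 37 and 21 averages quoted above.

```python
import math
from statistics import mean, variance


def welch_t_statistic(sample_a, sample_b) -> float:
    """Welch's t statistic for the difference in means of two samples."""
    if len(sample_a) < 2 or len(sample_b) < 2:
        raise ValueError("each sample needs at least 2 observations")
    # Standard error of the difference, without assuming equal variances.
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    if standard_error == 0:
        raise ValueError("both samples are constant; t is undefined")
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical age values (not data from the case).
closed_ages = [35, 39, 36, 38, 37]  # mean 37
open_ages = [20, 22, 21, 23, 19]    # mean 21

t_age = welch_t_statistic(closed_ages, open_ages)
print(f"t = {t_age:.1f}")
```

A large absolute t value, as with the age variable here, is what drives a small p-value; variables such as walk score with t values closer to zero would not clear the 0.1 threshold.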

CORRELATION MATRIX

In order to determine why Panera has been successful in certain locations and not in others, we can look at different variables to help explain the success or the lack thereof. There are some variables that we have determined to have a high correlation with one another (greater than 0.4 or less than -0.4). The relationship between each pair of variables is outlined and explained below.

Cluster 7 and Cluster 27 are positively correlated because they represent very similar populations of people. These populations are quite wealthy, without kids, mostly homeowners, work at the management/professional level, and are college educated. Thus, we can expect Paneras to perform about the same in these clusters.

StatNegClus and Clus7 are positively correlated because StatNegClus represents the clusters that have a negative correlation with the open stores variable. Because we found that Cluster 7 also had a negative correlation with the open stores variable, it makes sense ex-ante that these two variables have a positive correlation.

Walk Score and Clus7 are positively correlated because populations in Cluster 7 are located in urban neighborhoods. In urban settings, the walk score would be quite high because everything is in closer proximity given the nature of urban neighborhoods.

Cluster 11 and Cluster 18 are positively correlated because they represent very similar populations of people. The individuals are located in rural areas, have upscale/high income-producing assets, are mostly homeowners, and have some level of college education. Thus, we can expect Paneras to perform the same in these two types of clusters.

Cluster 17 and Cluster 7 are positively correlated because they represent very similar populations of people. They are mainly 55+ years old, live in urban neighborhoods, are mostly without kids, and have graduate plus levels of education. Thus, we can expect Paneras to perform about the same in these clusters.

StatNegClus and Clus17 are positively correlated because StatNegClus represents the clusters that have a negative correlation with the open stores variable. Because we found that Cluster 17 also had a negative correlation with the open stores variable, it makes sense ex-ante that these two variables have a positive correlation.

Avg HH Size and Clus17 are negatively correlated because this cluster mainly represents older people aged 55+ without any kids. Thus, their average household size would be smaller.

Walk Score and Clus17 are positively correlated because populations in Cluster 17 are located in urban neighborhoods. In urban settings, the walk score would be quite high because everything is in closer proximity given the nature of urban neighborhoods.

Middleburg Managers and Clus27 are positively correlated because they represent very similar populations of people. They are wealthy, mostly homeowners, work at the management and professional level, have some sort of college education, and have an average use of technology. Thus, we can expect Paneras to perform about the same in these clusters.

StatNegClus and Avg HH Size are negatively correlated because, as we saw in the preceding analysis, Panera Bread should target demographics with younger populations, as the average age was much higher for closed stores. Thus, it makes sense ex-ante that there is a negative relationship between these two variables.

Urban Chic and Edu Undergrad+ have a positive correlation because the population in Urban Chic is well educated, with over 65% holding a bachelor's degree or higher. Therefore, it makes sense that there is a positive relationship ex-ante.

CT_Walmart, CT_TJMaxx, CT_Barnes&Noble, and CT_BedBathBeyond each have a positive correlation with StatPosAgglom (Cluster) because StatPosAgglom (Cluster) represents the big box tenants that have a statistically significant correlation with open Panera stores. Since we used the number of Paneras near each of these big box tenants in the calculation for StatPosAgglom (Cluster), it makes sense that there is a positive correlation ex-ante.

CT_Walmart and Walk Score have a negative correlation because Walmarts are usually stand-alone or one of the largest big box tenants in a strip mall, and they are not located in walkable areas: they are usually found near highways and other roads so that people from nearby can travel to the supercenter. Thus, it makes sense that there is a negative correlation ex-ante.

Edu Undergrad+ and 2020 Median HH Inc have a positive correlation because those who have pursued higher education are more likely to be employed at higher-level jobs which pay more. Thus, this relationship makes sense ex-ante.

Edu Undergrad+ and Pct Asian (5 mi) have a positive correlation because this demographic tends to pursue higher education as a result of cultural values. Thus, this positive relationship makes sense ex-ante, since post-secondary education in this population is common.
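
The screen for highly correlated pairs (greater than 0.4 or less than -0.4) can be sketched as follows. The column values are hypothetical toy data, not the case's dataset, and the helper names `pearson_r` and `highly_correlated_pairs` are illustrative.

```python
import math
from itertools import combinations


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant column has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(columns, threshold=0.4):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    pairs = []
    for (name_a, xs), (name_b, ys) in combinations(columns.items(), 2):
        r = pearson_r(xs, ys)
        if abs(r) > threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


# Hypothetical toy data for six zip codes.
data = {
    "Walk Score": [80, 65, 30, 20, 55, 90],
    "CT_Walmart": [0, 0, 1, 1, 1, 0],
    "Avg HH Size": [2.4, 2.6, 2.4, 2.6, 2.5, 2.5],
}

pairs = highly_correlated_pairs(data)
print(pairs)
```

In this toy data only Walk Score and CT_Walmart clear the threshold, with a negative sign, which mirrors the direction of the relationship described above.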

REGRESS (SEGMENTS)

In Model 1, we find that all of the independent variables are significant. These include Stat PosCorr Agglom (Cluster), Clus18 (Mayberry-ville), Clus27 (Big Sky Families), Dum_OfcRtl, and Total GLA (ShopCtr). The only variable that has a positive correlation coefficient is Stat PosCorr Agglom (Cluster). All other variables have a negative correlation with the open stores variable.

In Model 2, we find that all of the independent variables are statistically significant except for CT_BedBathBeyond, Traffic Volume, and Walk Score. The variables that are statistically significant are: Clus07 (Money & Brains), Clus18 (Mayberry-ville), Clus27 (Big Sky Families), Urban Chic, Stat PosCorr Agglom (Cluster), Dum_OfcRtl, Total GLA (ShopCtr), 2020 Median HH Inc, Avg HH Size, Edu (Undergrad+)(5 Mi), and Pct Asian (5 Mi). Out of the significant variables, the ones that have a positive correlation with the Dum_OPEN variable are Stat PosCorr Agglom (Cluster), 2020 Median HH Income, and Pct Asian (5 mi).

In Model 3, we find that all of the independent variables are statistically significant. These variables are: Clus07 (Money & Brains), Clus18 (Mayberry-ville), Clus27 (Big Sky Families), Urban Chic, Stat PosCorr Agglom (Cluster), Dum_OfcRtl, Total GLA (ShopCtr), 2020 Median HH Inc, and Avg HH Size. The variables that are both significant and have a positive correlation with the Dum_OPEN variable are once again Stat PosCorr Agglom (Cluster), 2020 Median HH Income, and Pct Asian (5 mi).

In Model 4, we find that all of the independent variables are statistically significant except for Edu (Undergrad+)(5 Mi). The variables that are statistically significant are: Clus27 (Big Sky Families), StatNegClus, Urban Chic, Stat PosCorr Agglom (Cluster), Total GLA (ShopCtr), Avg HH Size, Edu (Undergrad+)(5 Mi), and Pct Asian (5 Mi). The two significant variables that have a positive correlation with Dum_OPEN are Stat PosCorr Agglom (Cluster) and Pct Asian (5 mi).

Not all of the independent variables are statistically significant, even though these variables were significant in the simple regressions, because of the type of regression we are running. In a simple regression, we are only looking at two variables: Dum_OPEN and one of the independent variables. However, when we add other variables to make a multivariable regression, the significance and weight of each independent variable on Dum_OPEN will change, either increasing or decreasing, which sometimes leads to a variable whose impact is no longer statistically significant.

The signs of the statistically significant variables (positive or negative) continue to be correct from an economic standpoint. To summarize, these variables are Stat PosCorr Agglom (Cluster), StatNegClus, Clus07 (Money & Brains), Clus18 (Mayberry-ville), Clus27 (Big Sky Families), Dum_OfcRtl, Urban Chic, 2020 Median HH Inc, Avg HH Size, Edu (Undergrad+)(5 Mi), Pct Asian (5 Mi), and Total GLA (ShopCtr).

For the clusters (7, 18, and 27) as well as Urban Chic, the correlation coefficient with Dum_OPEN is negative. This makes sense considering that these clusters were the wealthier urban and rural areas in which Panera Bread had a high number of stores close down (which we saw in preceding parts of this case).

Office Retail and shopping centers with large square footage should also have a negative correlation, given the type of consumer that would pass by and the lack of opportunity Panera has to attract customers in these areas.

Traffic volume is still positively correlated and walk score is still negatively correlated because of the accessibility and type of consumer that eats at Panera, with the drive-through option as well as being closer to major roads as opposed to sidewalks.

For average HH size, there is a negative correlation with Dum_OPEN, which coincides with what we found previously, where there were more open stores with younger consumers and the smaller households that come with young individuals.

An analysis of the four regressions taken together suggests that Panera Bread's location strategy is to develop stores in areas with high median HH income, high traffic volume, and high percentages of Asians within the nearby population, as these variables all had a positive correlation with the Dum_OPEN variable in the models. Panera Bread should avoid areas with office retail, large retail spaces, and areas that have high walk scores.

The F-statistic is simply a ratio of two variances; in a regression F-test it compares the variation explained by the model with the unexplained (residual) variation, and the null hypothesis is that the model explains no more than chance (an F-statistic near 1). A high F-statistic indicates that the variability between group means is large relative to the variability within the groups, which corresponds to a low p-value for the F-test (Significance F). We should also look at the F-test in choosing which regression models to use in constructing a model to predict which Panera Bread restaurants to keep open and which to close. The Significance F values for our models are all quite low, which means that our models do a good job of explaining why Panera stores remain open. Across our 4 models, Model 3 has the lowest Significance F and incorporated the most variables that were all statistically significant in explaining the dependent variable (open stores).

PREDICT OPEN (MODEL 1)

Model 1 correctly predicts that a store should close 91.67% of the time, Model 2 93.75% of the time, Model 3 91.67%, and Model 4 87.5%.

Model 1 correctly predicts that a store should stay open 58.33% of the time, Model 2 66.67% of the time, Model 3 68.75%, and Model 4 64.58%.

Model 1's overall success rate is 75%, Model 2's is 80.21%, Model 3's is 80.21%, and Model 4's is 76.04%.

Across all of the models, the accuracies are better than flipping a coin because they are all greater than 50%. However, the models do a better job of predicting closed stores than open stores: each model's percent correct is considerably higher for closed stores than for open stores.
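The overall success rates above can be reproduced from the two per-class rates. The following is a minimal sketch, assuming the closed and open groups contain the same number of stores (the reported figures are consistent with that, but the group sizes are not stated here); the function name and the `n_closed` / `n_open` parameters are illustrative only.

```python
# Per-class accuracies (in percent) as reported in the text.
reported = {
    "Model 1": {"closed": 91.67, "open": 58.33},
    "Model 2": {"closed": 93.75, "open": 66.67},
    "Model 3": {"closed": 91.67, "open": 68.75},
    "Model 4": {"closed": 87.50, "open": 64.58},
}


def overall_accuracy(closed_pct, open_pct, n_closed=1, n_open=1):
    """Weighted average of the two per-class accuracies.

    n_closed and n_open are ASSUMED group sizes (equal by default);
    they are not taken from the case data.
    """
    total = n_closed + n_open
    return (closed_pct * n_closed + open_pct * n_open) / total


overall = {
    name: round(overall_accuracy(rates["closed"], rates["open"]), 2)
    for name, rates in reported.items()
}

for name, value in overall.items():
    print(f"{name}: {value:.2f}% overall")
```

With equal group sizes the overall rate is just the midpoint of the two class rates, which is why a model can look acceptable overall while still being much weaker at predicting open stores.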

PREDICT OPEN (MODEL 2)

PREDICT OPEN (MODEL 3)

PREDICT OPEN (MODEL 4)

HOLDOUT (MODEL 2)

Using our Model 2 to determine whether Panera should have closed or kept open the 8 stores in the holdout sample, we suggested that 6 out of the 8 stores remain open while 2 of the stores close. In reality, 4 of the stores remained open and the other 4 closed. Thus, we were 75% correct in determining the open stores and 25% correct in determining the closed stores, with an overall accuracy of 50%. Based on this sample set, our model does a better job of correctly predicting restaurants to keep open than restaurants to close, since we were right 3 out of 4 times with open stores and only 1 out of 4 times with closed stores. This overall accuracy is quite low and is similar to flipping a coin or guessing randomly with each choice having equal probability of being chosen. Thus, this model is not very accurate.

For restaurants that Panera Bread chose to relocate in the same zip code, Model 2 was correct 9.67% of the time in suggesting to close the existing restaurant and choose another place in the same zip code to open a new restaurant. Model 2 suggested 41.94% of the time to exit the zip code completely, where the existing store was closed and no new store was opened. It was incorrect 48.39% of the time, where the model predicted that the existing restaurant should close even though the store was kept open in reality. Lastly, our model suggests operating in either the new location, the old location, or both locations 58.07% of the time.
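The holdout percentages follow directly from the four counts described above. A minimal sketch of that arithmetic, using only the counts stated in the text (the variable names are illustrative):

```python
# Holdout sample as described in the text: 4 stores stayed open, 4 closed.
# The model predicted "open" for 6 stores and "close" for 2.
open_predicted_open = 3      # actually open, predicted open (correct)
open_predicted_closed = 1    # actually open, predicted closed
closed_predicted_closed = 1  # actually closed, predicted closed (correct)
closed_predicted_open = 3    # actually closed, predicted open

n_open = open_predicted_open + open_predicted_closed
n_closed = closed_predicted_closed + closed_predicted_open

open_accuracy = open_predicted_open / n_open
closed_accuracy = closed_predicted_closed / n_closed
overall_accuracy = (open_predicted_open + closed_predicted_closed) / (n_open + n_closed)

predicted_open_total = open_predicted_open + closed_predicted_open
predicted_closed_total = open_predicted_closed + closed_predicted_closed

print(f"Open stores correct:   {open_accuracy:.0%}")
print(f"Closed stores correct: {closed_accuracy:.0%}")
print(f"Overall accuracy:      {overall_accuracy:.0%}")
```

With only 8 stores, each misclassified store moves the overall accuracy by 12.5 percentage points, so the 50% figure should be read as a rough indication rather than a precise estimate.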

HOLDOUT (MODEL 3)

Using our Model 3 to determine whether Panera should have closed or kept open the 8 stores in the holdout sample, we suggested that 6 out of the 8 stores remain open while 2 of the stores close. In reality, 4 of the stores remained open and the other 4 closed. Thus, we were 75% correct in determining the open stores and 25% correct in determining the closed stores, with an overall accuracy of 50%. Based on this sample set, our model does a better job of correctly predicting restaurants to keep open than restaurants to close, since we were right 3 out of 4 times with open stores and only 1 out of 4 times with closed stores. This overall accuracy is quite low and is similar to flipping a coin or guessing randomly with each choice having equal probability of being chosen. Thus, this model is not very accurate either.

For restaurants that Panera Bread chose to relocate in the same zip code, Model 3 was correct 9.7% of the time in suggesting to close the existing restaurant and choose another place in the same zip code to open a new restaurant. Model 3 suggested 35.5% of the time to exit the zip code completely, where the existing store was closed and no new store was opened. It was incorrect 54.8% of the time, where the model predicted that the existing restaurant should close even though the store was kept open in reality. Lastly, our model suggests operating in either the new location, the old location, or both locations 64.5% of the time.
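As a quick consistency check on the relocation figures, the three outcome shares for each model should sum to roughly 100%, and the "operate somewhere in the zip code" share appears to be the complement of the "exit the zip code" share. The sketch below works under that reading, which is an assumption rather than something the case states explicitly; the dictionary keys and function name are hypothetical labels.

```python
# Relocation outcome shares (in percent) as reported for each model.
relocation_shares = {
    "Model 2": {"relocate_in_zip": 9.67, "exit_zip": 41.94, "incorrect": 48.39},
    "Model 3": {"relocate_in_zip": 9.7, "exit_zip": 35.5, "incorrect": 54.8},
}


def summarize(shares):
    """Return (sum of all shares, share where the model keeps a store in the zip).

    ASSUMPTION: 'operate in the new location, the old location, or both'
    is read as every outcome except exiting the zip code.
    """
    total = sum(shares.values())
    operate = total - shares["exit_zip"]
    return round(total, 2), round(operate, 2)


summary = {name: summarize(shares) for name, shares in relocation_shares.items()}

for name, (total, operate) in summary.items():
    print(f"{name}: shares sum to {total}%, operate-in-zip share is {operate}%")
```

Under this reading, Model 2's operate share comes out within 0.01 of the reported 58.07%, a gap small enough to be rounding in the source figures.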
