Stavrou 2019 PDF

Entry Decisions in Oligopoly

The case of Greek isolated markets

By

STAVROU ATHANASIOS

Department of Economics
ATHENS UNIVERSITY OF ECONOMICS & BUSINESS

A dissertation submitted to the Athens University of Economics & Business in accordance with
the requirements of the degree of Master of Science in Applied Economics & Financial Analysis.

SEPTEMBER 2018

ABSTRACT

Entry decisions by firms are one of the fundamental topics in Industrial Organization, raising
issues concerning the timing, positioning, and characteristics of firms entering a market. This
dissertation studies the entry decisions of gas stations in the unique oligopolistic environment
of the Greek islands. The aim is to examine which factors determine the number of firms
competing in a market and how a firm decides whether to buy a franchise brand license or not.
The Greek islands environment is interesting from an Industrial Organization perspective
because of the unique exogenous structure of these markets. Each island is a separate market
with different characteristics (such as size, number of visitors and distance from Piraeus) and
a different population of consumers (differing, for example, in education and income).
This dissertation examines the influence of these characteristics, and the analysis is divided
into two parts. In the first part, I examine what affects the number of firms in a market, assuming
that gas stations are homogeneous. The homogeneity is assumed with respect to the variety of
products they offer and the services they provide. In the second part, I allow for heterogeneous
gas stations and investigate whether the number of operating gas stations in a market is affected
by these characteristics. I also try to find out whether these characteristics can, to some extent,
be determined by island or gas station characteristics. For instance, does the percentage of
branded gas stations in a market affect the percentage of gas stations offering shopping services?
I define markets both at the island level and according to a variety of criteria related
to the geographic location of the gas stations and their proximity. For example, I geolocated all
gas stations using Google Maps and companies’ websites and defined different local markets based
on driving distance (in both kilometers and minutes) or “absolute” distance (a radius on the map).
I take 3 km, 5 km, 5 minutes, 10 minutes and a 3 km radius to be reasonable and representative
choices of these distances.
The results for the homogeneous analysis indicate that the number of competing gas stations
in a market is affected by the island’s location, population and the number of ports or airports, as
well as the consumers’ education.
Correspondingly, for the heterogeneous part I found that, in addition to the previously significant
variables, the structure of the market affects competition as well. For instance, the percentage
of foreign-branded gas stations ("BP" and "SHELL") affects competition positively.
Moreover, I found that the firm’s decision to offer several services is affected by the island’s
characteristics as well. For example, the number of ports and airports and the educational level
of each island seemingly affect the percentage of branded gas stations in a market.
Future research could answer questions related to the effect of prices on the market structure
and the gas stations’ characteristics, potential (observed and unobserved) entrants, as well as the
expected payoffs of the firms when there is heterogeneity in the costs (both fixed and variable)
with or without perfect information.

DEDICATION AND ACKNOWLEDGEMENTS

I would like to express my deep and sincere gratitude to my supervisor, Genakos Christos,
who not only accepted my request to be my supervisor, but also suggested the subject of my
dissertation. I could not have possibly managed to finish the analysis without his insightful
comments, as well as his reference guide. His availability and help in general are worth as much
as my own work and are genuinely appreciated.

Furthermore, I would like to thank Katsoulakos Ioannis and Zacharias Eleftherios for agreeing
to be my examiners. I hope they find my work at least interesting, as well as informative.

Last but not least, I would like to thank all of my Microeconomics and Econometrics teachers,
in both my undergraduate and postgraduate studies, since all of my knowledge and understanding
on the specific subject are due to their personal efforts.

TABLE OF CONTENTS

List of Tables

List of Figures

1 Literature Review

2 Data Collection

3 Homogeneous Analysis
3.1 Descriptive Statistics
3.2 Analysis for the island level and the homogeneous markets

4 Heterogeneous Analysis
4.1 Descriptive Statistics
4.2 Analysis for the heterogeneous markets
4.2.1 Branded gas stations
4.2.2 Gas stations’ services
4.2.3 Gas stations’ ownership status

5 Conclusions

Bibliography

LIST OF TABLES

3.1 Ports’ distribution
3.2 Airports’ distribution
3.3 Education’s summary
3.4 Significance of the variables for the island levels
3.5 Significance of the variables for all three markets

4.1 Descriptive statistics for the heterogeneous characteristics
4.2 Significance of the variables for all three markets (gr_brand)
4.3 Significance of the variables for all three markets (f_brand)
4.4 Significance of the variables for all three markets (shop)
4.5 Significance of the variables for all three markets (services)
4.6 Significance of the variables for all three markets (carwash)
4.7 Significance of the variables for all three markets (lubricants)
4.8 Significance of the variables for all three markets (vulcanisateur)
4.9 Significance of the variables for all three markets ("do")
4.10 Significance of the variables for all three markets ("coco")
4.11 Significance of the variables for all three markets (independent)
4.12 Significance of the variables for all three markets (heterogeneous)

LIST OF FIGURES

2.1 Competition illustration example

3.1 Markets’ Structure by kilometers
3.2 Markets’ Structure by minutes
3.3 "Smaller" Market clusters
3.4 "Bigger" Market clusters
3.5 Regressions’ outcomes for the Island level
3.6 Population’s correlation coefficients
3.7 Regressions’ results for the differently defined markets

4.1 Probit : branded = f (characteristics of 3km market)
4.2 Regressions’ results for the differently defined markets (gr_brand regressor)
4.3 Regressions’ results for the differently defined markets (gr_brand regressand)
4.4 Regressions’ results for the differently defined markets (f_brand regressor)
4.5 Regressions’ results for the differently defined markets (f_brand regressand)
4.6 Regressions’ results for the differently defined markets (shop regressand)
4.7 Regressions’ results for the differently defined markets (services regressand)
4.8 Regressions’ results for the differently defined markets (carwash regressand)
4.9 Regressions’ results for the differently defined markets (lubricants regressand)
4.10 Regressions’ results for the differently defined markets (vulcanisateur regressand)
4.11 Regressions’ results for the differently defined markets (do regressor)
4.12 Regressions’ results for the differently defined markets (coco regressor)
4.13 Regressions’ results for the differently defined markets (ind regressor)
4.14 Regressions’ results for the differently defined markets (ind regressand)
4.15 Regressions’ results for the differently defined markets (total)

CHAPTER 1
LITERATURE REVIEW

There is a wide literature concerning entry decisions: which factors drive potential entrants
to make their decisions, and what impact entry has on market structure, prices, quality, R&D, or
even the total welfare of society, an impact corroborated by the fact that there are strict
regulations on the matter. Like this dissertation, the literature is intimately connected to
monopolistic and oligopolistic markets, and several papers and articles in particular inspired me
to work in the direction I have taken.
One of these articles is the one by T.F. Bresnahan and P.C. Reiss [5]. In an article that
connects the classic and modern literature on the subject, they extend the study of the competitive
effects of entry by building on Chamberlin’s [6] and Panzar and Rosse’s [13] more classical models.
They also take geographically isolated monopolies and oligopolies and study, among other things,
the number of competitors in a market, the size of the market and the state of competition, which
is very similar to what I do in the following chapters. Trying to determine how quickly
variable profits drop with the entry of new competitors in a market, they introduced
the concept of entry thresholds, a scale-free measure of the market size needed
to support any given number of firms, used as a replacement for oligopoly price-cost margins,
which are rarely observed. As the authors argue, this measure has two advantages over
previous ones: it can be estimated without any price or quantity data, and
it makes explicit some assumptions that were unclear before.
They studied five different industries with cross-sectional data, which is a main difference
from my data set. I had no time variation in my data set, since my inputs were not dynamic.
Another difference from the previous similar literature is that they used a few more realistic
assumptions, such as U-shaped average costs for the firms and entry barriers.
The use of the entry threshold concept mentioned above is simple and is best explained
through an example. Suppose that a market with an operating monopoly has 2000
consumers at the demand-supply equilibrium and that it takes 4000 consumers per firm to
reach the perfectly competitive level; this gives the range of consumers per firm we expect to find
when there is a new entry in the market. If, for example, there is a new entry and there are 5000
total consumers (2500 for each firm), the structure of the market remains oligopolistic, while
as the number of consumers per firm grows (say, to 3800 per firm) the structure of the market
becomes more and more competitive. Moreover, it is possible that, even though a monopoly needed
4000 consumers to operate, there are 10000 consumers in total (5000 for each firm) when a second
competitor enters the market. This would be an indicator that the monopoly was preventing a second
firm from entering the market even though demand could support it (as I will discuss
later on, these thresholds are widely used in the papers that I review).
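The threshold arithmetic can be made concrete with a short sketch. The market sizes below are hypothetical and chosen only to echo the numbers in the example; the function names are assumptions for illustration and are not taken from Bresnahan and Reiss [5]. The sketch computes the per-firm threshold (total market size needed to support N firms, divided by N) and the ratio of successive per-firm thresholds. A ratio well above 1 suggests that an additional entrant makes the market more competitive, while a ratio close to 1 suggests that it changes little.

```python
def per_firm_thresholds(total_thresholds):
    """Map {N: S_N} (market size needed to support N firms) to {N: S_N / N}."""
    return {n: size / n for n, size in total_thresholds.items()}


def threshold_ratios(total_thresholds):
    """Return {N: s_(N+1) / s_N} for consecutive firm counts N and N + 1."""
    per_firm = per_firm_thresholds(total_thresholds)
    ratios = {}
    for n in sorted(per_firm):
        if n + 1 in per_firm:
            ratios[n] = per_firm[n + 1] / per_firm[n]
    return ratios


# Hypothetical thresholds: 2000 consumers support a monopoly,
# 5000 support a duopoly and 7800 support three firms.
example = {1: 2000, 2: 5000, 3: 7800}
print(per_firm_thresholds(example))
print(threshold_ratios(example))
```

In this hypothetical case the per-firm threshold rises sharply with the second firm and only slightly with the third, which is the pattern of quickly intensifying and then levelling competition discussed in this chapter.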
To calculate these thresholds, they estimated an ordered probit
model with the number of firms as the regressand, which is the same type of model that I
use in this dissertation. Even though the objective of their paper is not the same as mine,
they found that the population of their sample towns is positively related to the number of firms
in most professions. They did not look into it thoroughly, because their variables affect both
demand and cost (professions such as dentists and doctors see their average costs drop with higher
values of “population” etc.). They also took as given some variables that previous literature
(Ernst and Yett [8]) suggests affect the number of firms, like expected population growth and
changing economic conditions.
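For readers less familiar with the ordered probit model, the following is a minimal sketch of how it turns a latent index into probabilities for the number of firms. It is a generic textbook formulation under assumed inputs, not the specification of Bresnahan and Reiss [5] nor the one estimated later in this dissertation: the index value, the cutpoints and the function names are all hypothetical. Only the Python standard library is used.

```python
import math


def normal_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit_probs(index, cutpoints):
    """Probabilities of each ordered outcome given a latent index x'beta.

    With increasing cutpoints c_1 < ... < c_K the outcomes are 0, 1, ..., K and
    P(y = k) = Phi(c_(k+1) - index) - Phi(c_k - index),
    with c_0 = -infinity and c_(K+1) = +infinity.
    """
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cutpoints must be increasing")
    cdf = [0.0] + [normal_cdf(c - index) for c in cutpoints] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


def log_likelihood(indices, outcomes, cutpoints):
    """Sum of log P(y_i) over markets; estimation would maximise this."""
    return sum(
        math.log(ordered_probit_probs(idx, cutpoints)[y])
        for idx, y in zip(indices, outcomes)
    )


# Hypothetical market: outcomes are 0, 1, 2 and "3 or more" firms.
probs = ordered_probit_probs(index=0.8, cutpoints=[0.0, 1.0, 2.0])
print([round(p, 3) for p in probs])
```

A larger index (for example, a more populous market with a positive population coefficient) shifts probability mass towards higher numbers of firms.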
The results of the paper indicate that competition intensifies after the entry of new competitors
in a market, but at a diminishing rate as the number of incumbents increases. More
specifically, for the five professions that they studied they found that competition becomes more
intense with the entry of a second or third firm, while the convergence to perfectly competitive
profits becomes more gradual thereafter, a result that they found quite surprising, as they
initially expected this convergence to be gradual from the beginning. Finally, as the authors
discuss, the results are less clear in cases where the markets are not well distinguished, i.e.
when they overlap, or when the timing of entry and exit decisions is taken into account.
A second influential paper is the one by M. J. Mazzeo [12], “Product choice and oligopoly
market structure”. In this paper, Mazzeo proposes a model to study the effects of competitors
in a market where firms offer differentiated products. He studies motel markets along U.S.
highways, where the products are differentiated by the quality of the services offered, and tries
to identify the factors that affect the choice of product (service quality). He builds on the
equilibrium models of Bresnahan and Reiss (discussed earlier) and S.T. Berry (1992) [1] and
extends them by endogenizing the firms’ product choice. To define the equilibrium, he assumes
that the motels play a two-stage game, deciding on entry and service quality in the first stage
and making decisions regarding competition in the second stage.
The market definition is not as clear as in my work: even though there appear to be
geographical clusters of motels, the motels serve both residents of nearby towns
and individuals travelling long distances, who can choose any motel along the
highway they are travelling on. His analysis takes into consideration both geographic
and demographic data, similar to the analysis that I conduct, including the average income of
nearby towns and traffic data (correspondingly, I use average island income and port and airport
arrivals). To avoid multiple equilibria, he suggests two ways to move forward with the analysis.
The first is to impose sequential moves on the firms; the second is that both firms
choose simultaneously whether to enter or not and then choose the quality of the product,
also simultaneously.
The results of his regressions are largely as expected. In general, high quality motels are
expected to have higher payoffs on average, although this can vary depending on the other factors.
For example, in areas with a population below the data set’s average, low quality motels earn
higher payoffs, while motels operating in areas with above-average
population tend to earn higher payoffs by offering higher quality services.
Another result is that the entry of a same-type competitor reduces the payoffs
of the incumbent more than the entry of an opposite-type competitor. In fact, the negative effect
of a same-type firm entering a market is double that of a different-type
firm. For example, if the incumbent in a market is a motel offering low quality services and
another motel enters the market offering low quality services as well, the incumbent’s payoffs
are expected to be lower than the payoffs it would have if the entrant
offered higher quality services. The same is true for high quality incumbents as well.
This result indicates that firms should want to offer differentiated products, but
it pulls against the previous result, where motels operating in densely populated areas
want to offer high quality services and vice versa, so there are two forces driving the
product-type choice.
Finally, he studies the payoffs of the motels when there are three levels of quality: low,
medium and high. The outcome is the same: the payoffs of the motels are lower
when they face same-type competitors, and the incentive to differentiate is higher than before.
Another influential paper is the one by O. Toivanen and M. Waterson [15], in which they study
the effects of market structure on entry decisions. The papers discussed previously are
some of the papers that influenced their work. They try to answer questions such as whether an
incumbent is more likely than other potential entrants to open a new store in a market,
or whether a potential entrant might enter a market with an incumbent instead
of a market where it would be the monopolist. To answer these questions, they conduct their
static analysis for the fast food (burger) market of the U.K. for the years 1991 to 1995, a market
characterized as a duopoly (McDonald’s and Burger King having over 60% of the market).
Once again, the model that they use is built on two-stage game models, but it differs in
that they try to parameterize both strategic and learning effects, so as to answer
the question of whether firms enter markets with or without an incumbent. The first stage
determines the entry decisions, and in the second stage the firms decide whether to compete in
quantities or prices. They analyze their data with both reduced form estimations and a structural
model. Moreover, they assume that the game being played is of the Stackelberg type, with a leader
and a follower.
Using reduced form profit functions for estimation, they find that the aggregate of
business-ratable for the district, the real estate costs and the proportion of pensioners in a
market affect the entry decisions. They also find that the follower is more likely to enter
a market where the leader is larger than the potential follower and that the leader is more likely
to enter markets that are not monopolistic. They attribute the fact that the follower is keener
on entering a market where there is already a bigger player, instead of opening new stores in
another market, to spillovers or learning effects, which make
the follower revise its view of the profitability of the market.
For the structural model they build on the Bresnahan and Reiss paper that I referred to
previously, including the previous period’s number of competitors in a market in the
function for the current period’s market size, in an attempt to parameterize the learning
effects in the market. Their model becomes more heterogeneous when they add age variables, in
order to capture the effect of different age groups (because of the nature of the product). The
results of the regressions for the follower indicate that the expected market size decreases
as the proportion of pensioners or younger people increases, while it is expected to increase as
the number of both the leader’s and its own stores increases. The average wage of the area and
the number of own stores seemingly increase the variable profits, while the number of the
competitor’s stores decreases them. Lastly, they find that in this particular market the
follower has learning effects only when it has no stores of its own in the market.
For the leader, the results remain the same when it comes to the proportion of pensioners
and younger people, while the market size is expected to increase as the number of the follower’s
stores rises, indicating learning effects on the market size. The number of own stores increases
the variable profits, while the competitor’s number of stores decreases them, an outcome
consistent with the results for the follower discussed above.
The conclusion of their analysis is that even though having competitors in a market reduces the
variable profits of both firms, it can make sense to open new stores where a rival is already
operating. This happens because there are two forces: the first pushes the firms not to open
new outlets, because expected profits drop due to competition, and the second motivates
them to expand in these markets, since the market size is expected to increase. Both follower
and leader use the positive effect of the competitor’s stores on the market size to revise their
expectations upwards, which answers the initial question: there are indeed positive spillover
effects from the competitor’s presence in a market.

The next paper was written by K. Seim [14], who estimates an empirical model of firm entry
that endogenizes product-type choices. The model she proposes comes from the
analysis of the video retail market in cities (or groups of cities) across the US. She extends the
framework in which Bresnahan and Reiss, Berry (1992) and Mazzeo worked by allowing firms
to have profit components unknown to their rivals (incomplete information), coming from sources
such as managerial talent or customer service.
This means that there are three characteristics firms take into consideration when calculating
expected payoffs in a market: their location (product characteristics), the competitors’ locations
and their own idiosyncratic term (the competitors’ idiosyncratic terms are assumed unobserved).
Lacking data on demand in the video retail industry, she used proxies to determine it.
She used variables such as the population of the area, its average income, as well as the business
distribution of every location. The results of her regressions are not surprising: business
density negatively affects expected payoffs, while the average income of the area and
its population affect them positively. However, as the author notes, this effect weakens with
distance, meaning that if the increase in population occurs at a significant distance (but still in
the area), its effect is expected to be smaller than that of an increase closer to the location.
This is true for both average income and population. In line with the previous papers,
the presence of rival outlets in the market reduces the expected payoffs of the firm, an effect that
decreases with distance as well, indicating strong incentives to differentiate.
Regarding the idiosyncratic term, she found that it is affected by the pool of potential
entrants: if the pool is small, the entrants with the highest payoffs are
expected to have lower unobserved payoff shocks than if the pool were bigger.
This holds under certain assumptions, such as the term not being correlated with the idiosyncratic
terms of adjacent locations.
The next topic she tries to shed light on is the effect of spatial differentiation on entry. As she
notes, there is correlation between the size of a market and its population: when an area
grows, its population grows as well. She tries to separate and study the area’s effect when
the population is constant and the population’s effect when the area remains unchanged. In the
latter scenario, she finds that there are two forces acting on entry decisions: the
first discourages firms from entering a market, since the population that corresponds to each
firm decreases with each new entry, while the second encourages them to enter,
since the distribution of the population becomes denser, increasing the population in the immediate
neighborhood of the potential outlet. Thus, firms decide whether to enter or not by
weighing the net effect of these two forces.
For the first case, where the population remains unchanged and the size of the area varies, she
infers that it is difficult to conjecture about its effect on competition in the market, probably
because demand in this market is local and consumers are unwilling to travel
longer distances (as the space of the market widens).


The general conclusion of this paper is that spatial differentiation does play a significant
role. For markets with denser population distributions firms have higher expected payoffs when
they differentiate the product, while as the distribution becomes sparser the incentive to relocate
(spatially differentiate) lessens as well, since the market is characterized as local – potential
consumers are not willing to travel long distances for the product.
The final paper that influenced my work is the one by S. Berry and P. Reiss [2]. In this paper,
the two economists study “empirical models of entry and market structure”. They conduct a
literature review in two parts, a homogeneous and a heterogeneous one, similarly to my work.
Their data sets cover various industries, such as airlines, lodging and broadcasting.
The framework they work in is similar to that of many of the papers discussed previously, namely
a two-stage game where firms decide whether to enter or not in the first stage and the type of
competition in the second stage.
They discuss some problems the literature has faced over the past years (such as the lack of
information about potential entrants), and the general framework that industrial organization
economists work in for this literature: not endogenizing market structure, not
estimating costs (fixed, sunk, entry and exit costs in general), as well as lacking information on
price and quantity. As a note, I should comment on the fact that my analysis uses the number of
competitors in a market as a proxy for demand as well, even though price data were available.
However, making inferences from static models is impossible in this case, as the price of fuel
varies on a daily basis. Even though conducting a dynamic analysis is feasible, many economists
tend to conduct static analyses, as a dynamic one forces them to confront various problems that
arise.
Following this logic, they proceed to review the homogeneous analyses (mainly the
one by Bresnahan and Reiss), noting the caution needed in the
choice of valid instruments, especially if the number of competitors in a market is not highly
correlated with the market size (a necessary assumption if the number of firms is used
as a proxy). To connect this to an example, they focus on Berry and Waldfogel’s [3] application,
a study of the radio market in which “private and social returns to
entry” have to be computed, which verifies their claims. On that matter, they conclude that there
is an entry surplus relative to consumer demand.
For the heterogeneous review (heterogeneous firms with respect to costs, product
characteristics, location etc.), they point out the differences between the models of the
literature that affect the estimation significantly. The first major difference is that
econometricians may or may not observe the characteristics that make the firms heterogeneous,
while the second is the certainty (or uncertainty) of the rivals’ expected payoffs. Depending on
each researcher’s assumptions, the estimation may lead to no entry or to a multiplicity of
equilibria. When perfect information is assumed, possible solutions to this problem include
reviewing and reevaluating the assumptions about the heterogeneity, making additional
assumptions related to the model, and modeling the probabilities of the aggregated outcomes.

The latter solution requires turning attention to the market’s outcome as a
whole instead of each firm’s individual decision. Uniqueness of the equilibrium can be reached
by making additional assumptions about the model, such as changing the model from static to
dynamic. For instance, the multiplicity is overcome when the researcher assumes that the firms
move sequentially. However, this assumption is questioned by many economists, as the
first mover could deter the entry of a more efficient competitor, while others assume that the
firm that moves first has earned this privilege by being the most efficient. If the researcher has
some information on who makes the first move in the game, then a good approach is one
where she assigns probabilities to the order of moves. A final approach to dealing with multiple
equilibria is the one suggested by C. Manski [11], i.e. using bounds to narrow down the possible
outcomes.
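The multiplicity problem and the sequential-move remedy can be illustrated with a two-firm entry game. The sketch below is a stylised illustration under assumed profits, not a model from Berry and Reiss [2]: the payoff numbers, the function names and the tie-breaking rule (a firm stays out when indifferent in the sequential game) are hypothetical. It enumerates the pure-strategy Nash equilibria of the simultaneous game and then solves the game by backward induction when firm 1 moves first.

```python
from itertools import product


def payoff(enters, rival_enters, monopoly_profit, duopoly_profit):
    """Payoff of one firm: zero if it stays out, else monopoly or duopoly profit."""
    if not enters:
        return 0.0
    return duopoly_profit if rival_enters else monopoly_profit


def pure_nash_equilibria(mono, duo):
    """All pure-strategy Nash equilibria of the simultaneous entry game.

    mono and duo are (firm 1, firm 2) profit pairs. A profile is an
    equilibrium if no firm gains by unilaterally switching its action.
    """
    equilibria = []
    for a1, a2 in product([False, True], repeat=2):
        best1 = payoff(a1, a2, mono[0], duo[0]) >= payoff(not a1, a2, mono[0], duo[0])
        best2 = payoff(a2, a1, mono[1], duo[1]) >= payoff(not a2, a1, mono[1], duo[1])
        if best1 and best2:
            equilibria.append((a1, a2))
    return equilibria


def sequential_outcome(mono, duo):
    """Outcome when firm 1 moves first and firm 2 best-responds (backward induction)."""
    def firm2_reply(a1):
        # Firm 2 enters only if entry is strictly profitable given firm 1's move.
        return payoff(True, a1, mono[1], duo[1]) > 0.0

    enter_value = payoff(True, firm2_reply(True), mono[0], duo[0])
    a1 = enter_value > 0.0
    return (a1, firm2_reply(a1))


# Hypothetical profits: the market supports one firm but not two.
mono, duo = (10.0, 8.0), (-2.0, -3.0)
print(pure_nash_equilibria(mono, duo))
print(sequential_outcome(mono, duo))
```

With these hypothetical profits the simultaneous game has two equilibria (either firm as the sole entrant), so the identity of the entrant is not pinned down, while fixing the order of moves selects a single outcome. The number of entrants is the same in both equilibria, which is the intuition behind modeling aggregated outcomes instead of individual decisions.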
As the authors argue, another factor that could help with multiple equilibria is
the assumption of imperfect information on the part of the entrants. These games can result in a
unique equilibrium because the firms now form expectations about their payoffs, even though
imposing uncertainty does not necessarily guarantee uniqueness. Aside from this type of model,
quantal response equilibrium (QRE) may solve the multiplicity problem, i.e. an equilibrium of
rational expectations, where players assign probabilities to competitors’ possible moves that are
consistent with their actual moves.
Moreover, there are also models with asymmetric information. This line of work was initially
introduced by K. Seim, whose paper I discussed earlier. In her work, potential entrants hold
information about their own location of entry and make assumptions about their rivals’
characteristics.
To wrap up the cases where there is heterogeneity among firms, the authors mention entry in
auctions and some cases where endogenizing the scale of operations, product characteristics in a
continuous space and product quality changes the results of market competition. Finally,
there is work in progress on dynamic heterogeneous models, where computational and
econometric techniques currently lag behind.
This is a short review of the papers that had the most impact on my work, directly
and indirectly. Although I merely try to figure out what affects the size of the market, there is
still room to apply some of the models used by the previously mentioned authors in the future.
Taking prices and time into consideration opens up many new potential routes this dissertation
could take, but it would require more time and effort.

CHAPTER 2
DATA COLLECTION

The data were collected from various sources. The main document that was used as a basis
came from the Ministry of Economy, Development and Tourism: a file consisting
of all the gas stations that reported their prices to the ministry in 2010. The file contained
various variables: the gas station’s ID number, the owner’s name, the island, the address, the
prefecture, the municipality and the company’s name (which is actually a variable that has 14
categories, 13 company brand names and a category for independent gas stations). Also, there
were data on the ownership characteristics: a gas station would be either Company Owned -
Company Operated ("coco"), Dealer Operated ("do") or independent ("ind"). "Coco" gas stations
are outlets completely developed and run by the company, while "do" outlets are outlets whose
owner has bought the brand name (franchise).
The next step was to enrich the data. Twelve extra variables were created, giving information
about the gas stations’ characteristics as well as the market structure. Firstly, I created two new
variables (“longitude” and “latitude”) to get the coordinates of every gas station. To do this, I had
to find every gas station on the map (the "Google Maps" 1 application was mostly used for this one)
and get the coordinates. Next, I created five dummies concerning the gas stations’ characteristics,
“shop” (a dummy indicating whether a gas station has a shop or not), “carwash” (a dummy
indicating whether the gas station has a car wash service or not), “lubricants”, “vulcanisateur”
(dummies concerning the respective services), as well as “services” (which is a dummy that takes
the value “1” if the gas station offers any of the services previously mentioned or any
other service not worth creating a new variable for (because of its rare frequency), such as
“Café”, “Public WC” etc.).

1 https://www.google.com/maps

At this point, as a side note, I should explain how I logged the values of the gas stations'
characteristics variables ("shop", "carwash" etc.). For most of the brand-named gas stations,
this kind of information can be found on their respective websites, which offer interactive maps
where anyone can look up these variables. For the remaining gas stations, the
values were filled in by using the "Google Maps" application, and more specifically the "Street View"
mode, where pictures of the gas stations can be found. Since most gas stations
have signs indicating whether they offer specific services, and services
such as "Shop" are directly visible, I filled in most of the remaining observations' characteristics
this way. In a few rare cases (mostly on smaller islands where "Google Maps" is not available) neither
of the methods mentioned above applied, so I had to personally contact the gas stations' owners
and ask for the information needed. Last but not least, I consulted the "fuelGR" 2 web application to
pin down the locations of many gas stations (needed for the "longitude" and "latitude" variables).
After that, some variables that depict the structure of the market were needed. The variables
“gs3km” and “gs5km” indicate how many competitors a gas station has within 3 and 5 kilometers
respectively. The distance is measured as driving distance (shortest route). This is also the
case for the next two variables, “gs5mins” and “gs10mins”, which indicate the number of
competitors within 5 and 10 minutes of driving respectively. Lastly, a “gs3rad”
variable was created, giving the number of competitors a gas station has within a 3-kilometer
radius (computed from the “longitude” and “latitude” variables). In contrast to the previous four
variables, this one is not measured in driving distance but in straight-line distance (the distance
between two points on the map, in kilometers).
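The radius variable can be illustrated with a short sketch. The text does not state which formula was used for the straight-line distance, so the great-circle (haversine) distance below is an assumption, as are the function names and the coordinates, which are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def competitors_within_radius(stations, radius_km=3.0):
    """Return {station_id: number of OTHER stations within radius_km}."""
    counts = {}
    for sid, (lat, lon) in stations.items():
        counts[sid] = sum(
            1
            for other, (olat, olon) in stations.items()
            if other != sid and haversine_km(lat, lon, olat, olon) <= radius_km
        )
    return counts


# Hypothetical coordinates (latitude, longitude), not taken from the data set.
stations = {
    "A": (37.000, 25.000),
    "B": (37.010, 25.000),  # roughly 1.1 km north of A
    "C": (37.100, 25.000),  # roughly 11 km north of A
}
gs3rad = competitors_within_radius(stations)
print(gs3rad)
```

Because this count ignores roads and coastlines, two stations on neighboring islands can fall within the same radius, which is one reason a driving-distance definition may be preferable.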
The idea behind creating so many variables is that two gas stations that are next
to each other on the map may not actually be competitors, because there may be no
direct route connecting them. The radius variable was created first as a logical benchmark, but
along the way various issues arose, such as two gas stations on nearby islands being counted as
competitors, so I created the variables measured in driving distance. Moreover, I created the
5-kilometer and 10-minute markets so that I have at least two market sizes, a smaller and a bigger
one. What is more, the three market types (kilometers, minutes and radius) are defined differently,
so I figured that it would be interesting to see whether the results change with the unit of
measurement. Lastly, note that the distances chosen were arbitrary and not based on any
theoretical analysis.
Next, I had to merge this data set with another file containing the islands’ characteristics. The
merged file included variables such as “size”, “population”, “population_density” (the ratio of
“population” to “size”), “ports” (the total number of ports), “airports” (total number of airports),
“big_ports”, “airport_arrivals”, “port_arrivals”, “gas_stations” (the total number of gas stations
operating on the island, updated with my data from the previous data set), “distance_peiraeus”,
“dist_land” (distance from the mainland), “arrivals” (port and airport in total), “income” (the
average income per island), “university” (the percentage of residents with a university degree),
“iek” and “secondary” (the percentages of residents with IEK and secondary education respectively).

2 https://fuelgr.gr/web/

Now that I had all the variables that were needed, I had to split the data set into the multiple
data sets needed for my analysis. I needed 6 different data sets, one for each market (3km, 5km,
3rad, 5mins and 10mins) and one for the island level. For the island level I just had to collapse
my data to the island level, so that I could regress the number of gas stations per island on
the island’s characteristics. After this collapse, my island-level data set consisted of 41
observations.

Since the point of interest is the factors that affect the number of competing gas stations
in a market, I needed to convert the data from individual observations to market
observations. Thus, I transformed my data so that each observation is a separate market.
For example, for a market that consisted of three gas stations (triopoly) I kept only one
observation, so that it represents one market. After doing that, the gas stations’ characteristics
were no longer useful in their previous form, because they no longer described all three
gas stations but only the gas station that I kept in the data set, so they
had to be transformed as well. For each market of the data set, I changed the characteristics’
variables into percentages. If in a duopolistic market one of the two gas stations offered the
"carwash" service, then the new "carwash" variable took the value "0.5". If in a quadropoly three
out of the four competing gas stations had a "shop", then the new "shop" variable took the value
"0.75" and so on. This had to be done for all the data sets of the differently defined markets. Now,
if the variable "gs3km" takes the value "10", that means that 11 gas stations were competing in
this particular market (recall that "10" is the number of competitors of a gas station).
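The collapse from station-level dummies to market-level shares can be sketched as follows. This is only an illustration of the arithmetic described above; the function name and the toy records are hypothetical and are not the code used for the dissertation.

```python
def collapse_market(stations, dummies=("shop", "carwash")):
    """Collapse the station records of ONE market into a single market record.

    Each dummy becomes the share of stations in the market that have it, and
    the market size is the number of competitors of any one station plus one.
    """
    n_firms = len(stations)
    market = {"n_firms": n_firms, "competitors": n_firms - 1}
    for name in dummies:
        market[name] = sum(station[name] for station in stations) / n_firms
    return market


# Toy markets (hypothetical values), mirroring the two examples in the text.
duopoly = [{"shop": 1, "carwash": 1}, {"shop": 1, "carwash": 0}]
quadropoly = [
    {"shop": 1, "carwash": 0},
    {"shop": 1, "carwash": 1},
    {"shop": 1, "carwash": 0},
    {"shop": 0, "carwash": 1},
]

duo = collapse_market(duopoly)
quad = collapse_market(quadropoly)
print(duo["carwash"], quad["shop"])
```

Keeping both the firm count and the competitor count in the record makes the off-by-one relation between the two explicit.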

The procedure above was easy to carry out for the smaller islands: if an island had only 2 gas
stations, I just dropped one. For the bigger islands (e.g. Kefallonia), I had to go on the map and
manually decide which gas stations to drop, because there were many markets consisting of
numerous gas stations and I wanted to avoid dropping a gas station that should not be dropped.
However, dropping some gas stations did not mean that the markets were "perfectly" defined.

FIGURE 2.1. Competition illustration

Take for example Figure 2.1. Let’s suppose that there are four gas stations, X1 through X4.
For X2, this is an oligopoly consisting of 4 gas stations (black circle), so when I am required to
drop gas stations to keep only one, I drop X1, X3 and X4. However, when I am looking at either
X3 or X4, they only have two competitors (red circle). This means that in the end, I will
end up having a duopoly (X1 and X2, when looking at X1), a quadropoly (when looking at
X2) and a triopoly (X2, X3 and X4, when looking at either X3 or X4). This happens because no real
clustering technique was used, which would result in well-defined clusters, such as a single
quadropoly, a monopoly (X1) and a triopoly (the remaining gas stations) or even 4 monopolies
(depending on how much the algorithm would be abused). Despite this disadvantage, the technique
used has an advantage over other techniques: it ensures that these 4 gas
stations are on the same island, and not on neighboring islands.
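The overlap in the X1 to X4 example can be reproduced with a small sketch. The `reach` mapping below encodes which stations each station sees as competitors, as read off the example (X1 only reaches X2); the names are hypothetical helpers for illustration.

```python
# Competitors seen by each station, following the X1..X4 example in the text.
reach = {
    "X1": {"X2"},
    "X2": {"X1", "X3", "X4"},
    "X3": {"X2", "X4"},
    "X4": {"X2", "X3"},
}


def market_of(station):
    """The market as seen from one station: itself plus its competitors."""
    return frozenset(reach[station] | {station})


# Distinct station-centred markets; X3 and X4 see the same one.
markets = {market_of(station) for station in reach}
sizes = sorted(len(market) for market in markets)
print(sizes)
```

The three distinct markets overlap (X2 belongs to all of them), which is exactly why they are not clusters in the usual sense.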
Next, I should point out that I dropped the following big islands from my data set:
Crete, Lesvos, Evia, Corfu, Chios and Rhodes. I took that decision because I believe that they
tend to lose the characteristics of a small, isolated market that I initially wanted to research. In
the end, three additional islands were dropped because values were missing after the
merge of the files, bringing the final number of islands to 41. This is important because, when I
regress the number of gas stations under the assumption of homogeneity, I do not want to regress
it on many variables at once, especially at the island level, where I have the fewest observations.
To overcome this, I ran my regressions in two ways: I either regressed the dependent variable on
each variable separately and then ran another regression with only the statistically significant
ones to see which variables "survive", or I started from a regression on a single variable and, if it
was statistically significant, kept adding variables one after another, so that in the end I had a
model with only statistically significant variables. This way, I avoided running a regression of 35+
variables with only 41 observations. Using methods like LASSO (least absolute shrinkage and
selection operator), ridge regression, step-wise forward regression or even Principal Component
Analysis (PCA) to determine the ideal number of predictors was not a desired solution, because I
wanted to figure out which variables actually affect the number of gas stations operating in a
market and not to find optimal ways to predict this number based on my variables.
The firms that I study are homogeneous with respect to the product they offer. I only kept the
variables related to "Unleaded 95", so the analysis is differentiated only with respect to the
services they offer. Unlike the literature that I reviewed, I also do not study anything related to
the potential entrants of the market; I simply try to determine the factors that define the
market structure.

CHAPTER 3
HOMOGENEOUS ANALYSIS

As first stated in the abstract, in the homogeneous part I try to identify the factors that
have an effect on the number of firms in a market, under the assumption that the gas
stations in my data are homogeneous. This means that I do not include the factors that
differentiate them, such as operating under a brand and offering various services other
than the fuel itself. After displaying some descriptive statistics, I run numerous regressions. The
regressions start from the island level and then move on to the driving-distance and radius-distance
levels. For the island level, I run a regression for all the islands and two separate regressions
for the islands with fewer than 10 gas stations and the islands with 10 or more gas stations. I do
this to check whether the factors that determine the number of competing gas stations differ
between smaller and bigger islands.
After running the regressions for the island level, I run the regressions for the differently
defined markets. Comparing the results of these regressions led me to present
only the results of the 3km, 3rad and 10mins markets. I came to this decision after noticing
that the results for the 3km and 5mins markets, as well as those for the 5km and 10mins markets,
are quite similar. Thus, I kept both small and big markets, as well as markets from all three
definitions (in kilometers, minutes and radius).

3.1 Descriptive Statistics

In this section some descriptive statistics are presented. These statistics are noteworthy, as
they show the geography of our markets, the competition structure as well as the gas stations’
characteristics.
The following four figures display the number of gas stations that compete in every possible
kind of oligopolistic market, from monopolies to markets with 21 competitors. Note that even
though putting all of these figures together was possible, the result was incomprehensible.
Furthermore, it seemed reasonable to compare the markets with respect to their size, so I compared
the 3km market with the 5mins one, as well as the 5km with the 10mins market. The 3km radius
market was used as a yardstick to measure the scale of all these different markets. The term
"clusters" is used for the different kinds of markets, even though they are not real clusters, as
the number of competitors varies with respect to every individual gas station (it is as if a k-means
clustering technique was used but without any "machine learning" process being followed).

FIGURE 3.1. Bar chart for the 3km, 5km and radius clusters

Figure 3.1 gives information about the markets measured in kilometers (as well as the one
measured in a 3km radius). For example, there are 76 monopolies across all islands when we
define the market as 3 km driving distance and 49 monopolies when we define it as 5 km and as a
3 km radius. As expected, the number of operating oligopolies decreases as the number of competing
gas stations increases, mainly because of the geography of our markets. Not only is the geographical
space limited, but the number of consumers per island is also bounded.
Figure 3.2 is the same bar chart for the clusters measured in driving time. The interpretation
is no different from that of the previous bar chart, and the 3km radius variable is kept in this
chart as well, for comparison. As one would notice, there is something peculiar in this graph: the
number of operating oligopolies behaves erratically when there are 14 competitors or more
in a single market. This could have a simple explanation, though. Remember that the "10mins
competitors" cluster defines the market in time traveled (by car), and it is reasonable to assume
that the observations with many competitors come from sizable islands. With this in
mind, this variable is expected to count gas stations from different villages / towns, which are not
counted in the "5mins competitors" variable. For example, if a town on a particular island has an
oligopoly consisting of 7 competitors and another town 8 minutes away has an oligopoly consisting
of 7 competitors, then, if the market is defined as "10mins competitors", this would be counted as
an oligopoly with 14 competitors 1.

FIGURE 3.2. Bar chart for the 5mins, 10mins and radius clusters

FIGURE 3.3. Bar chart for the 3km, 5mins and radius clusters

1 Under the assumption that the 2 most distant gas stations are within a 10 minute driving distance; otherwise,
the number of competitors would range from 7 to 13.


Figures 3.3 and 3.4 are not of much importance, but they are displayed here to compare
the smaller and the bigger market clusters. Figure 3.3 presents the smaller markets,
where we can only conclude that this is a case of one market defined in two different
ways: the numbers of operating oligopolies are clearly highly correlated across the two definitions
for every type of oligopoly. Again, the radius cluster is kept in these graphs too, for
comparison.
Secondly, Figure 3.4 is the bar chart for the bigger markets. This does not look like a single
market defined in two different ways; the number of operating oligopolies varies between the two
definitions for different types of oligopolies, at least in comparison to the previous figure.

FIGURE 3.4. Bar chart for the 5km, 10mins and radius clusters

Next, two tables are presented, Table 3.1 and Table 3.2, which give the distributions
of ports and airports across all islands respectively. These tables show that most islands have
one port and no airport. This could be a useful piece of information further on, as these two
variables are likely to determine the number of competitors for the markets examined.


Ports   Freq.   Percent   Cum.
1       23       56.10     56.10
2        7       17.07     73.17
3        3        7.32     80.49
4        6       14.63     95.12
5        2        4.88    100.00
Total   41      100.00

Table 3.1: Ports’ distribution

Airports   Freq.   Percent   Cum.
0          25       60.98     60.98
1          16       39.02    100.00
Total      41      100.00

Table 3.2: Airports’ distribution

Lastly, Table 3.3 displays information about the islands’ education. For the 41 islands of my
data set, the average percentage of residents with higher education is about 15.5%, with 11% coming
from universities and about 4.5% from iek (private institutions). Moreover, of the three levels
reported, secondary education is the most common: about 20% of residents stop their education
after high school.

Variable Obs. Mean Std. Dev. Min Max


iek 41 .044878 .0095189 .03 .07
university 41 .1102439 .0190378 .07 .15
secondary 41 .1990244 .0320004 .1 .27
Table 3.3: Education’s summary

3.2 Analysis for the island level and the homogeneous markets

In this section I present the analysis conducted under the assumption that the gas stations are
homogeneous. I start with some linear regressions on an island level, trying to figure out the
factors that affect the number of gas stations operating on an island. Since I am interested in the
number of gas stations, it makes more sense to try out a different model than linear regression
(I explain later on why), but I will firstly run a set of linear regressions for the island level for
comparison reasons.
Table 3.4 gives information about the significance of the variables for the three island levels
when I regress the number of operating gas stations on each variable individually. The first
column is for all the islands in the data set, the second one for the islands with fewer than 10 gas
stations and the third column for the bigger islands (10 or more gas stations).
We notice that far more variables are statistically significant in the regressions of
the first column, but keep in mind that the other two columns are regressions with 29 and
12 observations respectively. The results indicate that islands tend to have more gas stations
when they have a larger population, a bigger size, and more ports, airports and arrivals (both
airport and port). Also, islands that are further away from the mainland are expected to have fewer
gas stations, as are islands with higher income (a less expected outcome). However, these last two
results are not significant in the separate regressions for small and big islands.


Variable gas_stations (Islands) gssmall (Small Islands) gasbig (Big Islands)


size 0.0444*** 0.0139*** 0.0326***
(0.00514) (0.00453) (0.00979)
population 0.000685*** 0.000366*** 0.000539**
(6.01e-05) (6.01e-05) (0.000208)
airports 9.665***
(2.532)
ports 2.142* 0.830**
(1.071) (0.343)
big_ports 14.02*** 4.680*** 10.56*
(2.153) (0.935) (5.544)
airport_arrivals 3.22e-05***
(7.12e-06)
airpt_arrivals_depts 1.59e-05***
(3.54e-06)
port_arrivals 2.10e-05*** 1.31e-05***
(7.09e-06) (3.01e-06)
dist_land -0.0213**
(0.0101)
income -0.00183**
(0.000740)
arrivals 2.16e-05*** 9.59e-06***
(4.06e-06) (2.56e-06)
constant 7.732*** 2.966*** 19.25***
(1.429) (0.440) (2.672)

Table 3.4: Significance of the variables for the island levels

Figure 3.5 displays the results of the island-level regressions when the variables are included
jointly. Column (1) displays the outcome of the regression for all the islands in the data set. As
mentioned previously, the data set contains information about 41 islands. Column (2) contains
the results of the regression for the islands with 9 or fewer gas stations, while column (3)
displays the results of the regression for the islands with 10 or more gas stations.
All three regressions are linear. The stars next to the coefficients’ values represent the
statistical significance of these values at the three most commonly used levels: 1 star for the 10%
significance level, 2 stars for 5% and 3 stars for 1%. In what follows these are written as
confidence levels, α = 90%, α = 95% and α = 99% respectively.
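As a small illustration of the star notation, the mapping from a p-value to the number of stars can be written as below. This assumes the usual convention that the 90%, 95% and 99% levels correspond to p-values below 0.10, 0.05 and 0.01; the function name is hypothetical and the p-values in the example are made up.

```python
def significance_stars(p_value):
    """Map a p-value to the star notation used in the regression tables.

    Assumed convention: *** for p < 0.01, ** for p < 0.05, * for p < 0.10.
    """
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie in [0, 1]")
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# Made-up p-values, for illustration only.
labels = [significance_stars(p) for p in (0.003, 0.03, 0.08, 0.4)]
print(labels)
```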


We notice that when I regress the "gas_stations" variable for the island level as a whole, there
are plenty of variables that are statistically significant individually (Table 3.4) but do not
appear in the results in Figure 3.5. This is because these variables are related to each other, so
when I run a regression on all of them the effect of one variable neutralizes the effect of another,
and in the end only three of them "survive" ("size", "population" and "arrivals"). The same happens
for the small and big islands in the next two columns, where "ports", "port_arrivals", "big_ports"
and "arrivals" are statistically insignificant when regressed together.

FIGURE 3.5. Regressions’ outcomes for the Island level

The results of the first regression are very satisfying. A very high Adjusted R² of almost 0.9
shows an almost perfect fit, while all the variables are statistically significant at all three
significance levels. Moreover, it is the only model out of these three that has a significant
constant for α = 95%.

In contrast, the constant becomes statistically insignificant at all three levels in the following
regressions. Also, in both regressions, the variables mentioned previously become statistically
insignificant too. For the regression of the bigger islands, the significance of the variable
"population" is now weaker, becoming insignificant for α = 99%, while in both these regressions
the significance of the variable "size" drops as well (for α = 99%). Ultimately, the overall fit of
the model drops too, since there are now fewer variables than in the first regression, with the
value of the Adjusted R² dropping by almost one third.

The coefficients’ signs are not surprising: the results indicate that the larger the number of
residents of an island, the more gas stations are expected to be competing in a market. An
analogous interpretation fits the "arrivals" coefficient as well, since a high value of "arrivals"
is expected to yield higher levels of competition. Finally, the sign of "size" is ambiguous. On
the one hand, the variable "size" is to some degree correlated with the variable "population", so
the bigger the island, the larger the expected number of competitors in a market on
average. On the other hand, one could argue that ceteris paribus, i.e. when the level of
"population" remains unchanged, a higher value of "size" means that the population could be
more sparsely distributed. In this regression, the correlation effect seems to be stronger, so the
coefficient takes a positive value.
Figure 3.6 shows the correlation coefficients of the variables. As we can see, the "population"
variable is moderately to strongly correlated with the other two variables. A value of
corr(pop,size) = 0.65 supports the reasoning behind the neutralization of some effects in the joint
regressions. The value of corr(pop,arr) (0.593) is also high. A possible explanation is
that, at least for small islands, residents have to travel more often for supplies,
entertainment etc., so they are counted in the "arrivals" variable too (since "arrivals"
is the sum of port and airport arrivals). Residents of bigger islands (e.g. Crete) do not
have to do that, but the number of tourists is expected to be higher than the respective
number for smaller islands, so in both cases a high value of corr(pop,arr) is expected.
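For reference, a correlation coefficient of this kind can be computed as in the sketch below, assuming the reported values are ordinary Pearson correlations (the text does not say so explicitly). The function name and the numbers are hypothetical and are not the island data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Made-up values for four islands, for illustration only.
population = [1000, 2000, 3000, 4000]
size = [20, 10, 40, 30]
corr_pop_size = pearson(population, size)
print(round(corr_pop_size, 3))
```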
FIGURE 3.6. Population’s correlation coefficients

Other than that, from Figure 3.5 we notice that the coefficients of the third regression
have much higher standard errors than the respective coefficients of the other two
regressions (almost 200% higher), an expected result since the third regression has only 12
observations.
All in all, there’s no doubt that the first regression yielded the best results, having the
most observations, the best adjusted R² and the most and strongest statistically significant
variables.
However, linear regression is not the preferred model when the dependent variable is a
discrete (multinomial) one. This is because the assumption of independent, identically distributed
errors is violated, so the estimates are no longer BLU (Best Linear Unbiased). The
most appropriate model for this type of regression is the ordered probit 2, but there are several
other models that generate reliable coefficients as well. Among the numerous models that
estimate the coefficients better than linear regression, the multinomial logistic regression
and the ordered probit are the two most used in the economics literature 3.
These models are usually estimated by Maximum Likelihood rather than Ordinary
Least Squares.
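To make the ordered probit concrete, the sketch below computes the category probabilities implied by the standard textbook formulation, P(y = j) = Φ(κ_j − xβ) − Φ(κ_{j−1} − xβ), where Φ is the standard normal CDF and the κ are increasing cutpoints. The function names, the index values and the cutpoints are illustrative assumptions, not estimates from this dissertation.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb, cutpoints):
    """Probabilities of the ordered categories 0..len(cutpoints).

    xb is the linear index x'beta; cutpoints must be strictly increasing.
    """
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [norm_cdf(bound - xb) for bound in bounds]
    return [cdf[j + 1] - cdf[j] for j in range(len(bounds) - 1)]


# Illustrative cutpoints separating four ordered outcomes
# (e.g. increasing numbers of firms in a market).
cutpoints = [-1.0, 0.0, 1.0]
probs_low = ordered_probit_probs(0.0, cutpoints)
probs_high = ordered_probit_probs(1.0, cutpoints)
print([round(p, 3) for p in probs_low])
print([round(p, 3) for p in probs_high])
```

A larger index shifts probability mass toward the higher categories. Maximum Likelihood estimation then chooses the coefficients and cutpoints that maximize the sum of the log probabilities of the observed categories.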

2 C. Winship, R.D. Mare (1984) [16]
3 S. J. DeCanio, 1986 [7]; J. Boex, 2000 [4]; G. Chan, P. W. Miller and M. J. Tcha, 2005 [9]

With that said, the ordered probit regression will be used for the remaining regressions of the
homogeneous analysis, starting with the regression displayed in Figure 3.7. For ordered regressions,
the statistic that indicates the overall fit of the model is the "Pseudo R²" value, displayed at the
bottom of the figure. However, Pseudo R² is not equivalent to the usual R² used in linear
regression: it is calculated using estimated probabilities (or the estimated likelihood) and not the
predicted values. As McFadden notes in his contribution 4 to "Behavioural travel modelling",
Pseudo R² tends to take lower values than the traditional R², while "values of 0.2 to 0.4 for
rho-squared represent excellent fit".
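As an illustration of a likelihood-based fit measure, the sketch below computes McFadden's rho-squared, 1 − lnL(model) / lnL(null), where the null model contains only the constants (cutpoints). Whether the figures report exactly this definition is an assumption based on the McFadden quote; the function name and the log-likelihood values are made up.

```python
def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden's pseudo R-squared: 1 - lnL(model) / lnL(null).

    Both arguments are log-likelihoods (negative numbers); the null model
    has no explanatory variables.
    """
    if loglik_null >= 0:
        raise ValueError("the null log-likelihood is expected to be negative")
    return 1.0 - loglik_model / loglik_null


# Made-up log-likelihoods, for illustration only.
rho_squared = mcfadden_pseudo_r2(loglik_model=-300.0, loglik_null=-400.0)
print(rho_squared)
```

A model that does not improve on the null model gives a value of 0, and the value approaches 1 only as the fitted log-likelihood approaches 0.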
Starting off with the regressions on each variable individually, the results are displayed in
Table 3.5. These regressions are for the three markets defined as 3km driving distance ("gs3km"),
3km radius distance ("gs3rad") and 10 minutes driving distance ("gs10mins"). The variables that
are statistically significant do not differ dramatically from the results of Table 3.4 for the
linear regression. The variables "ports" and "income" are now statistically insignificant in all
three regressions, but the percentage of high school graduates of the island is now significant in
all three regressions, at least for α = 90%.

Variable gs3km gs3rad gs10mins


size 0.000850**
(0.000384)
population 2.57e-05*** 3.85e-05*** 5.26e-05***
(6.35e-06) (6.56e-06) (6.76e-06)
airports 0.580*** 0.869***
(0.175) (0.179)
big_ports 0.434** 0.670*** 0.897***
(0.169) (0.176) (0.179)
airport_arrivals 1.12e-06*** 1.33e-06*** 2.13e-06***
(3.43e-07) (3.48e-07) (3.41e-07)
airpt_arrivals_depts 5.59e-07*** 6.61e-07*** 1.06e-06***
(1.71e-07) (1.73e-07) (1.70e-07)
port_arrivals 9.61e-07** 1.08e-06*** 1.11e-06***
(3.88e-07) (4.00e-07) (4.20e-07)
dist_land -0.00210*** -0.00212*** -0.00315***
(0.000699) (0.000703) (0.000708)
distance_peiraeus -0.00265*
(0.00158)
secondary 5.897* 8.401** 7.711**
(3.333) (3.378) (3.330)
arrivals 1.06e-06*** 1.25e-06*** 1.79e-06***
(2.60e-07) (2.66e-07) (2.70e-07)

Table 3.5: Significance of the variables for all three markets

4 D. Hensher, P.R. Stopher, 1979 [10]


Also, the size of the island is statistically insignificant for two of the regressions, and the
distance from Piraeus is now significant for the third regression. We notice that none of these
markets’ results are comparable to the results of Table 3.4 (small and big islands’ regressions).
I conclude that, due to the lack of observations, the results of that table are indeed weak, making
any conclusion drawn from it weak as well.
The coefficients’ signs are once again the expected ones. As for the two new variables, we see
that the distance from Piraeus has a negative effect on the number of operating gas stations,
while the percentage of inhabitants with secondary education seems to have a positive effect on
it.
Despite all these variables, when they are included jointly, the effects of most variables get
neutralized once again. As Figure 3.7 indicates, the variables that remain statistically
significant are the same variables that survived the regressions of Figure 3.5: "population",
"size" and "arrivals". The first regression is the one run for the 3-kilometer driving distance
market, the second one for the 3-kilometer radius and the last one for the 10-minute driving
distance market. These markets are sorted from smallest to biggest, a fact verified by the number
of observations, which decreases from 185 to 163 (since more widely defined markets yield fewer
markets, each containing more gas stations).
FIGURE 3.7. Regressions’ results for the differently defined markets

An obvious difference is that now the size of the island is significant only in the third
regression, where it has a negative coefficient. Its effect is neutralized by the population and
probably the arrivals of the island in the other regressions and, as discussed before, it can have
a negative effect on the number of operating gas stations because of the sparser distribution of
the population, ceteris paribus. Other than this variable and "arrivals" in the second regression,
which are statistically significant for α = 95%, the remaining coefficients are statistically
significant for α = 99%.


As we can see, the effects of the variables "population" and "arrivals" are larger in the third
regression than in the other two regressions, but the standard errors
are also higher. In spite of the higher standard errors, though, the overall fit of the model is
better for the third regression, where the Pseudo R² value has more or less doubled. Even
though the results are not as good as the results for the island level, keep in mind that these
results are more consistent than the previous coefficients.
The results of these regressions are not unexpected, but the truth is that variables such as
the distance from the mainland, the average income or even the distance from Piraeus could be
variables that affect the number of operating gas stations on the island as a whole and therefore
in a market as well. However, at least for the homogeneous analysis, only variables that are
related to population seem to affect this number when they are included jointly.

CHAPTER 4
HETEROGENEOUS ANALYSIS

The next part of the analysis is the heterogeneous one. In this chapter, after I present some
descriptive statistics for the heterogeneous gas stations, I try to answer two major questions.
First, I check whether the characteristics of the gas stations affect the competition
in the market: for example, whether the number of competitors in a market is affected by the
percentage of Greek or foreign brand-named gas stations. The second question is whether, and
which, variables affect these characteristics: for example, whether the number of branded gas
stations in a market is affected by the population of the island.

The characteristics that differentiate the heterogeneous from the homogeneous analysis are
the percentages of gas stations in a market that own a shop, offer car washing, lubricants and
vulcanisateur services, offer services in general (car washing, lubricants, vulcanisateur, WC, Café,
ATM etc.), are branded or not, are Greek or foreign branded, and are "COCO", "DO" or
independent.

At this point I should clarify a few things. First, in my data set about 95% of the gas stations
did not offer services other than car washing, lubricants and vulcanisateur. Thus, I did not create
separate variables for the remaining services, but I included the variable "services" so that I
take them into account in some way. Second, in 2009 and 2010, BP and Shell respectively
sold their operations to Hellenic Petroleum (HELPE) and Motor Oil (Hellas), which means that
they should be considered Greek branded gas stations. However, I believe that consumers do not
see them this way and still regard them as foreign. Therefore, "SHELL" and "BP"
gas stations are considered "foreign branded" in my data set and the rest "Greek branded".
Lastly, I should note that I counted "independent" gas stations as "Greek branded" gas stations too.
The first reason for doing this is that I believe that consumers think of the independent
gas stations as domestic brands, and the second is that if I think of the "independent" variable as
"unbranded", then I already study the effects of this variable when I study the "branded" variable
(since they are complementary). So, in the end, the "gr_brand" variable is actually a variable
that gives the percentage of gas stations in a market that have a Greek outlet
name and not the percentage of gas stations in a market owning a Greek brand name (such as
"EKO", "ELIN" etc.).
I also changed the regression method, not for the regressions that test whether these
characteristics affect the number of operating gas stations in a market, but for those in
which I regress these characteristics on the remaining factors, since the dependent
variables are now shares (x ∈ [0, 1]).
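
The text does not name the estimator used for these share variables. Figure 4.1 is labelled a probit, so one plausible reading is a fractional probit, in which the conditional mean of a share y in [0, 1] is Φ(x'β) and β maximizes a Bernoulli quasi-log-likelihood. The sketch below only illustrates that assumed specification; the function names and all numbers are hypothetical and are not taken from the data set.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def predicted_share(beta, x):
    """Conditional mean of a share: Phi(x'beta), always inside (0, 1)."""
    index = sum(b * xi for b, xi in zip(beta, x))
    return norm_cdf(index)

def quasi_loglik(beta, X, y):
    """Bernoulli quasi-log-likelihood for shares y in [0, 1]."""
    total = 0.0
    for x, share in zip(X, y):
        p = predicted_share(beta, x)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += share * math.log(p) + (1.0 - share) * math.log(1.0 - p)
    return total

# Hypothetical data: a constant and one regressor for three markets.
X = [(1.0, 0.2), (1.0, 0.5), (1.0, 0.9)]
y = [0.60, 0.75, 1.00]  # shares, including a boundary value of 1

ll_zero = quasi_loglik((0.0, 0.0), X, y)    # all predicted shares equal 0.5
ll_better = quasi_loglik((0.0, 1.0), X, y)  # a positive slope fits these shares better
```

Unlike a linear regression, this mean function keeps predicted shares inside the unit interval and still accepts observations at exactly 0 or 1.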

4.1 Descriptive Statistics

In Table 4.1 below I present some statistics for the characteristics mentioned above. My data
set contains 317 gas stations in total. The first column is the variable name, the second the
frequency, the third the percentage of the total, and the last three the respective percentages
for the three market definitions.

Variable        Freq.   Percentage (%)   gs3km (%)   gs3rad (%)   gs10mins (%)
branded         305     96.22            96.07       96.29        94.13
gr_brand        253     79.81            75.70       80.37        80.04
f_brand         64      20.19            24.30       19.71        19.27
shop            226     71.29            71.15       70.13        69.34
services        206     64.98            65.48       65.41        63.55
carwash         177     55.84            54.22       55.73        54.17
lubricants      150     47.32            46.15       48.17        44.92
vulcanisateur   31      9.78             10.14       13.30        10.40
DO              292     92.11            97.60       92.67        91.71
COCO            13      4.11             1.12        3.04         3.32
IND             12      3.79             1.26        3.71         5.23
Table 4.1: Descriptive statistics for the heterogeneous characteristics

For example, 64 gas stations carry a foreign brand name (either "BP" or "SHELL"). They
account for 20.19% of all gas stations, a share that becomes 24.3% when I define the market
by a 3km driving distance, 19.71% for the 3km radius and 19.27% for the 10mins market. Recall
that the "gr_brand" variable includes independent gas stations as well. I find it noteworthy
that there are only 12 independent gas stations; I would expect the number to be higher,
especially since most islands have fewer than 10 gas stations operating. This is even more
striking given that only 13 of the 305 branded gas stations are "Company Owned - Company
Operated", which means that the remaining owners chose to buy a brand's rights. Other than
that, one can see that most gas stations own a shop (71.29%) and offer various services
(almost 2 out of 3). More than half of them offer car washing services and almost half offer
lubricants, while only about 10% offer vulcanisateur services.
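
The full-sample percentages in Table 4.1 can be recomputed from the frequencies and the total of 317 stations. The sketch below does this and checks three accounting identities as I read the variable definitions in the text (Greek plus foreign covers all stations, the three ownership types cover all stations, and branded stations are the non-independent ones); the dictionary and variable names are only illustrative.

```python
TOTAL = 317  # total gas stations in the data set (Table 4.1)

freq = {
    "branded": 305, "gr_brand": 253, "f_brand": 64, "shop": 226,
    "services": 206, "carwash": 177, "lubricants": 150,
    "vulcanisateur": 31, "DO": 292, "COCO": 13, "IND": 12,
}

# Share of each characteristic in the full sample, in percent.
share = {name: round(100.0 * n / TOTAL, 2) for name, n in freq.items()}

# Accounting identities implied by the definitions in the text.
assert freq["gr_brand"] + freq["f_brand"] == TOTAL        # Greek (incl. independent) + foreign
assert freq["DO"] + freq["COCO"] + freq["IND"] == TOTAL   # ownership types
assert freq["DO"] + freq["COCO"] == freq["branded"]       # branded = not independent

for name, pct in share.items():
    print(f"{name:14s} {freq[name]:4d} {pct:6.2f}")
```

Recomputed this way, the shares for "branded" and "COCO" come out 0.01 percentage points below the printed values, which looks like a rounding difference; all other entries of the percentage column match.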


4.2 Analysis for the heterogeneous markets

4.2.1 Branded gas stations

In this section I run the regressions for the "branded", "gr_brand" and "f_brand" variables, to
see whether these variables affect the number of operating gas stations in a market and whether
there are factors explaining how gas stations choose whether to buy a brand name.
Starting with the "branded" variable, I conclude that it does not affect the number of
operating gas stations, and that no variable is statistically significant when I regress it
on the remaining factors in any market, with the exception of the 3km market, where the
variable "port_arrivals" is statistically significant.

FIGURE 4.1. Probit: branded = f(characteristics of 3km market)

As we can see in Figure 4.1, the coefficient is negative and statistically significant
at all three significance levels (p-value = 0.008), meaning that, in the 3km market,
islands with a higher number of port arrivals tend to have a lower percentage of branded
gas stations. However, since this result is not reproduced in the other two markets,
it is questionable to a certain extent.
The next variable tested is "gr_brand", which yields different results. This time,
the variable is statistically significant when I insert it as a regressor in the ordered probit
for the "gs3km" market, without interfering with the significance of the two variables that
were significant in the previous chapter (Figure 3.7). The result is displayed in Figure 4.2.
We see that the number of operating gas stations in a market is expected to drop as
the percentage of gas stations with a Greek brand name (including independent ones) increases.
The value of "Pseudo-R²" increases, and the variable is significant for α = 0.90 and 0.95.
This result holds for the 10mins market as well, but not for the 3km radius one. This time the
results are clear; the relation between the percentage of Greek brand named gas stations and the
expected number of operating gas stations in a market is negative and statistically significant.
But is there any relation between the percentage of Greek gas stations in a market and other
characteristics of the island (such as the population, size etc.)?
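
To make the direction of this effect concrete, the sketch below evaluates ordered probit category probabilities, P(N = k) = Φ(c_k − xβ) − Φ(c_{k−1} − xβ), for two values of the Greek-brand share. The coefficient -0.519 is the 3km value quoted later in this chapter; the cutpoints are hypothetical (the estimated thresholds are not reported in this section), and all other regressors are ignored, so the numbers are purely illustrative.

```python
import math

def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordered_probit_probs(index, cutpoints):
    """P(N = k) for each ordered category, given a linear index and increasing cutpoints."""
    bounds = [-math.inf] + list(cutpoints) + [math.inf]
    cdf = [norm_cdf(b - index) for b in bounds]
    return [cdf[k + 1] - cdf[k] for k in range(len(bounds) - 1)]

BETA_GR_BRAND = -0.519        # 3km coefficient quoted in the text
CUTPOINTS = [-1.0, 0.0, 1.0]  # hypothetical thresholds, giving 4 ordered categories

def expected_category(gr_share):
    """Expected category index when only the Greek-brand share enters the index."""
    probs = ordered_probit_probs(BETA_GR_BRAND * gr_share, CUTPOINTS)
    return sum(k * p for k, p in enumerate(probs))

low_share = expected_category(0.5)
high_share = expected_category(1.0)  # lower than low_share: mass shifts to smaller categories
```

With a negative coefficient, raising the share lowers the index, which moves probability mass towards the categories with fewer operating gas stations.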

FIGURE 4.2. Regressions' results for the differently defined markets (gr_brand regressor)

With this question in mind, I regress this characteristic on both the island's and the gas
stations' characteristics. Table 4.2 below presents the findings; the first column is the name
of the variables and the next three columns refer to the "gr_brand" variable for the three
markets.

Table 4.2 displays the results of the regressions when run for each variable individually. For
the "gr_brand" variable of the 3km market the only statistically significant variables are
significant for α = 0.90 ("big_ports", "dist_land" and, for the first time, "lubricants"). The
signs of these variables' coefficients are, once again, positive for having big ports and
negative for the distance from the mainland. As for the "lubricants" service, the coefficient
is positive, suggesting that gas stations offering this kind of service are more likely to be
Greek branded than foreign branded. However, it is only weakly significant and of no
significance for the other two markets. Moreover, it is more reasonable to think that "Greek
brand named gas stations tend to offer lubricant services more often than foreign ones" than
that "offering lubricant services affects the decision to buy a Greek brand name (or no brand
name at all)". Since this table does not establish a causal effect, I prefer not to insist on
the interpretation of this variable.

As I widen the definition of the market, I see that car washing services, distance from
Piraeus, arrivals and level of education affect the percentage of Greek and independent gas
stations in the 3km radius market, while for the widest of the three markets the variables that
affect this percentage are the population of the island, the number of big ports, arrivals and
the distance from the mainland. The effects are mostly in the expected direction: population,
big ports, arrivals and education have a positive impact, while the two distance variables and
car washing services have a negative one.


Table 4.2: Significance of the variables for all three markets (gr_brand)
Variable              gr_brand 3km        gr_brand 3rad       gr_brand 10mins
population                                                    3.18e-05**
                                                              (1.61e-05)
big_ports             0.431*                                  0.736**
                      (0.231)                                 (0.371)
arrivals                                  1.09e-06*           1.50e-06*
                                          (6.23e-07)          (8.18e-07)
distance_peiraeus                         -0.0107**
                                          (0.00484)
dist_land             -0.00175*                               -0.00274**
                      (0.000901)                              (0.00132)
secondary                                 14.90**
                                          (6.173)
carwash                                   -0.971**
                                          (0.477)
lubricants            0.461*
                      (0.273)
constant              1.054***            1.565***            1.717***
                      (0.113)             (0.154)             (0.174)
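
The stars in Table 4.2 can be cross-checked from each coefficient and its standard error. The sketch below assumes the conventional coding (* for p < 0.10, ** for p < 0.05, *** for p < 0.01) and a two-sided test based on the standard normal distribution; the text does not state either convention explicitly, so both are assumptions, as are the function names.

```python
import math

def two_sided_p(coef, se):
    """Two-sided p-value of z = coef / se under a standard normal."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))

def stars(p):
    """Conventional star coding (assumed, not stated in the source)."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""

# (coefficient, standard error) pairs copied from Table 4.2.
table_4_2 = {
    "big_ports (first value)": (0.431, 0.231),
    "population": (3.18e-05, 1.61e-05),
    "secondary": (14.90, 6.173),
    "constant (first value)": (1.054, 0.113),
}

for name, (coef, se) in table_4_2.items():
    p = two_sided_p(coef, se)
    print(f"{name:26s} z = {coef / se:6.2f}  p = {p:.4f}  {stars(p)}")
```

Under these assumptions the recomputed stars agree with the printed ones for the entries checked here, which also fits the text's usage of α = 0.90, 0.95 and 0.99 as confidence levels.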

When I regress the percentage of Greek brand named and independent gas stations on all of
the aforementioned variables, the number of statistically significant variables narrows
considerably. For the first market, all variables become insignificant when run together,
cancelling each other's effect. For the second market, two variables stand out: the distance
from Piraeus and the car washing services. Both have a negative effect on the percentage of
Greek branded gas stations; this percentage tends to be lower on islands that are further away
from Piraeus and in markets where more gas stations offer car washing services. Also, neither
of these variables is directly related to the population of the island.
The next set of regressions concerns the variable "f_brand". When I include "f_brand" in the
set of independent variables, the results do not differ from the previous regressions; the
variable is statistically significant in the same markets (3km and 10mins).
This is shown in Figure 4.4, where we can see that the percentage of foreign gas stations in a
market has a positive effect on the number of operating gas stations. This is not surprising,
since "f_brand" and "gr_brand" together make up the full set of gas stations: as the number of
operating gas stations is expected to decline when the percentage of Greek branded gas stations
rises, lowering this percentage (or, equivalently, raising the percentage of foreign branded
gas stations) yields the opposite result.
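
The mirror-image result follows from the algebra of complementary shares: if f = 1 − g, then β·g = β − β·f, so the slope changes sign while its standard error is unchanged and the constant absorbs the shift. The sketch below demonstrates this with a one-regressor OLS on hypothetical data. OLS is used here only because it fits in a few lines; the thesis itself uses an ordered probit, where the same reparameterization argument applies to the linear index.

```python
import math

def ols_slope(x, y):
    """Slope and its standard error from a one-regressor OLS with intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

# Hypothetical market-level data: Greek-brand share and number of stations.
gr_brand = [1.0, 0.8, 0.75, 0.6, 0.5, 0.9]
stations = [1, 2, 3, 4, 4, 2]
f_brand = [1.0 - g for g in gr_brand]  # complementary share

b_gr, se_gr = ols_slope(gr_brand, stations)
b_f, se_f = ols_slope(f_brand, stations)
# b_f equals -b_gr and se_f equals se_gr, up to floating-point error.
```

The identity is exact only where the two shares sum to one, as the 3km percentages in Table 4.1 do (75.70 + 24.30).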


FIGURE 4.3. Regressions' results for the differently defined markets (gr_brand regressand)

FIGURE 4.4. Regressions' results for the differently defined markets (f_brand regressor)

The effects are equal in size and opposite in sign to those of Figure 4.2; the coefficient of
the variable "f_brand" for the 3km market is 0.519, while for the same market the coefficient
of "gr_brand" takes the value -0.519. The standard errors are the same as in the previous
regression as well.
The following set of regressions was run to identify the factors that determine the percentage
of foreign brand names in a market. We draw information regarding this matter from Table 4.3.
Once again, these coefficients are for the regressions run individually for each variable. Some
variables appear in the table for the very first time, like "vulcanisateur" and
"airpt_arr_depts". After running these regressions, I run an "aggregate" regression for each
differently defined market.
The results differ significantly from the previous ones. For the smallest market, the variables
that are statistically significant this time are the number of port arrivals and the total
arrivals, while for the second market the population of the island and the number of big ports
are the two variables that matter.
Nevertheless, quite a few variables affect the percentage of foreign brand named gas stations
in the third market, from variables that are commonly significant (such as the population and
the distance from the mainland) to variables not seen in any results before ("vulcanisateur"
and "shop"). "Income" and "vulcanisateur" both have a negative impact on the foreign
percentage, while "shop" has a positive one.


Table 4.3: Significance of the variables for all three markets (f_brand)
Variable f_brand 3km f_brand 3rad f_brand 10mins
population 1.98e-05*** 3.18e-05***
(7.52e-06) (7.66e-06)
big_ports 0.368* 0.474**
(0.206) (0.209)
port_arrivals 9.16e-07**
(4.56e-07)
airport_arrivals 8.85e-07**
(4.06e-07)
airpt_arr_depts 4.41e-07**
(2.02e-07)
arrivals 5.65e-07* 5.56e-07*
(3.02e-07) (3.11e-07)
dist_land -0.00255***
(0.000851)
distance_peiraeus 0.00453**
(0.00193)
income -0.000111*
(6.02e-05)
shop 0.552*
(0.303)
carwash 0.533*
(0.294)
vulcanisateur -1.239**
(0.520)
constant -0.212**
(0.0929)

Also, notice that the distance from Piraeus now has a positive effect. This is not so
surprising if we consider that the variable is statistically insignificant when "branded" is
the dependent variable: since the Greek and foreign gas stations are complementary within the
branded set (not perfectly, as 12 stations are independent), their effects cancel out.
Other than that, the significant coefficients have unsurprising signs. The population, the
number of big ports, the airport arrivals (and departures) and the total arrivals increase the
expected percentage of foreign brand named gas stations in a market. However, we know that many
of these variables are correlated, so when they are run together most of them are expected to
be statistically insignificant.
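
The point about correlated regressors can be illustrated with a variance inflation factor. With two regressors whose correlation is r, the variance of each coefficient is inflated by 1 / (1 − r²) relative to the uncorrelated case, which is why individually significant variables can lose significance when entered together. The island-level numbers below are hypothetical and chosen only to be strongly correlated; they are not taken from the data set.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_regressors(r):
    """Variance inflation factor for a model with exactly two regressors."""
    return 1.0 / (1.0 - r * r)

# Hypothetical island-level values, not taken from the data set.
population = [1000, 3000, 5000, 12000, 30000]
arrivals = [20000, 50000, 110000, 240000, 650000]

r = pearson(population, arrivals)
vif = vif_two_regressors(r)
print(f"correlation = {r:.3f}, VIF = {vif:.1f}")
```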


FIGURE 4.5. Regressions' results for the differently defined markets (f_brand regressand)

The results of the regressions on all significant variables are displayed in Figure 4.5. There
is nothing of importance in the first two regressions; the effects of "arrivals" in the first
and "big_ports" in the second regression are neutralized by "port_arrivals" and "population"
respectively. For the third regression, when I regress "f_brand" on every previously
significant variable, only two remain: the population of the island and the vulcanisateur
services. In line with the previous regressions, the effect of the population remains positive;
a higher population increases the probability of a higher percentage of foreign brand named gas
stations in a market, an effect that also holds for the port arrivals in the 3km market.
However, the effect is the opposite for the vulcanisateur services: markets with a higher
percentage of gas stations offering vulcanisateur services tend to have a lower percentage of
gas stations with foreign brand names. Despite the statistical significance, this table does
not show a causal effect, and it would be rather naive to conclude that gas stations with
vulcanisateur services have a lower probability of buying a foreign brand name instead of a
Greek one. It makes more sense to claim the reverse: gas stations with foreign brand names tend
to offer vulcanisateur services less often than gas stations with a Greek brand name (or
independent ones), on average. I will not pursue this further here, but I will discuss it when
I run the regressions for the "vulcanisateur" variable later on.


4.2.2 Gas stations’ services

In this subsection I present the results of regressions similar to those of the previous
subsection, but for the characteristics concerning the gas stations' services ("shop",
"services", "carwash", "lubricants" and "vulcanisateur").
I start with the variable "shop". After running the regressions with the variable in the set
of regressors, I found that "shop" is statistically insignificant for all three markets.
Therefore, the results below are for the regressions where "shop" is the dependent variable.

Table 4.4: Significance of the variables for all three markets (shop)
Variable              shop 3km            shop 3rad           shop 10mins
population            2.11e-05**          2.71e-05***         2.82e-05***
                      (8.99e-06)          (1.00e-05)          (1.01e-05)
port_arrivals                                                 1.43e-06*
                                                              (8.01e-07)
airports              0.441**             0.662***            0.605**
                      (0.223)             (0.246)             (0.257)
airport_arrivals      1.96e-06**          2.16e-06**          2.02e-06**
                      (8.11e-07)          (9.84e-07)          (9.50e-07)
airpt_arr_depts       9.73e-07**          1.07e-06**          1.00e-06**
                      (4.02e-07)          (4.87e-07)          (4.70e-07)
dist_land             -0.00177**                              -0.00210**
                      (0.000879)                              (0.000995)
secondary             12.48***            12.79***            10.65**
                      (4.614)             (4.933)             (5.066)
iek                   23.52*              31.25*
                      (13.70)             (16.20)
arrivals              7.44e-07*           1.14e-06**          1.46e-06***
                      (3.87e-07)          (4.65e-07)          (5.27e-07)
carwash               0.679***            0.864***            0.634*
                      (0.259)             (0.315)             (0.346)
gr_brand              -0.703**                                -0.989*
                      (0.344)                                 (0.578)
f_brand               0.703**                                 1.210**
                      (0.344)                                 (0.607)
constant              0.942***            1.075***            1.132***
                      (0.109)             (0.119)             (0.125)

Table 4.4 shows that there are plenty of variables affecting the dependent variable in all
three markets. To begin with, for the 3km driving distance market there are variables related
to the total population of the island (airport arrivals, total arrivals, airport arrivals and
departures, having an airport and the variable "population" itself), to the gas stations'
characteristics (car washing services and ownership status), as well as to the education level
of the island (percentage of secondary and iek graduates). The signs of these coefficients are
not surprising; the variables related to the population of the island have a positive impact,
as do the education levels of the island and the car washing services. By contrast, the
distance from the mainland and the percentage of Greek and independent gas stations in a market
have a negative impact on the percentage of gas stations owning a shop in this market. Notice
that the coefficients of "gr_brand" and "f_brand" have exactly opposite values (-0.703 and
0.703 respectively), meaning that the 12 independent gas stations' effect is minimal.

The second market's coefficients are not very different. This time, the coefficients of
"dist_land" and "f_brand" / "gr_brand" become statistically insignificant. The remaining
variables have a positive effect, and all of these effects are larger than in the previous
market. However, the standard errors of the coefficients are also higher for all the
statistically significant variables.

FIGURE 4.6. Regressions' results for the differently defined markets (shop regressand)

Finally, when I regress the percentage of gas stations owning a shop for the third market, the
outcome is slightly different. Port arrivals and the percentages of Greek / foreign brand named
gas stations are statistically significant, while the percentage of iek graduates is
insignificant. All of these coefficients have a positive sign, excluding the coefficient of the
"gr_brand" variable, as was the case for the first market as well. Compared to the respective
values of the previous markets, both the sizes and the standard errors of these coefficients
are mixed; some effects are larger with higher standard errors and vice versa.

When I regress the percentage of shop owning gas stations on all of the respective
statistically significant variables for all three markets, I get the results of Figure 4.6.
For the first market, only 3 variables remain statistically significant: the distance from the
mainland, car washing services and ownership status. There are no changes in the signs of the
coefficients. Furthermore, note that instead of "gr_brand" it could be "f_brand" with a
positive impact; including both variables results in one of the two being omitted. For the
second market the population and the car washing services remain significant, for at least
α = 0.95. Lastly, the last market is explained by a single variable (car washing services).
Since only one variable is statistically significant, I could just as well regress the
percentage of shop owners on any significant variable of Table 4.4. The overall fit is better
for the second model, with Pseudo R² reaching 0.107, while the fit of the third model is
negligible.
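
The text does not say which pseudo-R² is reported. A common choice for probit-type models is McFadden's measure, 1 − lnL(model) / lnL(constant only), and the sketch below assumes that definition. The log-likelihood values are hypothetical and were picked only so that the result reproduces the 0.107 quoted above; they are not taken from the regression output.

```python
def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden's pseudo-R2: 1 - lnL(model) / lnL(constant-only model)."""
    return 1.0 - loglik_model / loglik_null

# Hypothetical log-likelihoods (both negative), chosen to give 0.107.
example = mcfadden_pseudo_r2(-89.3, -100.0)
print(round(example, 3))
```

A value this low means the regressors improve the log-likelihood by only about a tenth relative to a constant-only model, which puts the fit of the other two models in perspective.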
After the "shop" variable I run the same regressions for "services". Obviously, I did not
regress "services" on any of the "carwash", "lubricants" and "vulcanisateur" variables, since,
as mentioned in Chapter 2, it is the sum of these three variables plus various other services
(WC etc.).
Yet again, the variable has no effect on the number of operating gas stations (when used as a
regressor in the previous ordered probit regressions). However, there are plenty of variables
that can explain the percentage of gas stations offering any kind of services, as shown in
Table 4.5. As we can see, no variables are significant for the first market, so no model can be
built for it. This changes for the next two markets. Port arrivals, airport arrivals, airport
arrivals and departures and total arrivals are significant for the second market, all of them
having a positive impact on the percentage of gas stations offering any kind of services
in a market. The coefficients themselves are not so strong, two of them being significant only
for α = 0.90.

Table 4.5: Significance of the variables for all three markets (services)
Variable              services 3km        services 3rad       services 10mins
population                                                    1.89e-05**
                                                              (9.56e-06)
port_arrivals                             1.40e-06**          1.69e-06**
                                          (6.67e-07)          (8.07e-07)
airports                                                      0.483*
                                                              (0.256)
airport_arrivals                          1.35e-06*           1.70e-06**
                                          (7.43e-07)          (8.59e-07)
airpt_arr_depts                           6.74e-07*           8.46e-07**
                                          (3.69e-07)          (4.26e-07)
dist_land                                                     1.45e-06***
                                                              (5.18e-07)
secondary                                                     13.17**
                                                              (5.202)
iek                                                           30.98*
                                                              (16.96)
arrivals                                  1.23e-06***         1.46e-06***
                                          (4.55e-07)          (5.27e-07)
constant              0.767***            1.024***            1.132***
                      (0.103)             (0.117)             (0.125)


The picture is quite different for the third market. Variables such as the population, airports,
distance from the mainland and secondary and iek graduates become significant. Most of these
variables are significant for at least α = 0.95, except "iek" and "airports". The sizes of the
effects are bigger than in both the 3km radius and, obviously, the 3km markets, but the
standard errors are higher in this case as well.

FIGURE 4.7. Regressions' results for the differently defined markets (services regressand)

When I regress "services" on all significant variables, one variable stands out: "arrivals".
Since these end up being regressions on a single variable, any other significant variable could
replace "arrivals", but the overall fit would drop, since this is the only variable that is
significant at all three significance levels, alongside "dist_land" for the third regression.
However, the overall fit is not that great anyway, reaching a maximum value of 0.0782 for the
third market.

Thus, I move forward with the variable "carwash", the percentage of gas stations offering car
washing services in a market. This variable is not statistically significant either when
included in the regressor set of the ordered probit of the previous chapter. Table 4.6 gives
information on the regressions with "carwash" as the dependent variable for all three markets.
For the first market, port arrivals, total arrivals, the number of big ports of the island, the
level of secondary and iek education, owning a shop, offering lubricants services and, for the
first time, being "dealer operated (do)" are the variables that affect the percentage of gas
stations offering car washing services in a market. The relation between "do" and "carwash" is
negative, meaning that dealer operated gas stations have lower chances of offering car washing
services. The remaining variables have a positive impact on this percentage. For example, gas
stations offering lubricants and shopping services tend to offer car washing services more
often than the rest.


For the second market, "do" and "big_ports" become insignificant. By contrast, airport
arrivals, airport arrivals and departures and population become significant (all three for
α = 0.95). Now all of the variables have positive signs, and all of them are significant for at
least α = 0.95. Comparing the coefficients of the two regressions, we notice that all
coefficients have higher values and higher standard errors (with the exception of the
"lubricants" coefficient, which is lower).

Table 4.6: Significance of the variables for all three markets (carwash)
Variable              carwash 3km         carwash 3rad        carwash 10mins
population                                1.67e-05**          2.13e-05**
                                          (8.35e-06)          (8.62e-06)
port_arrivals         1.00e-06**          1.56e-06**          1.60e-06**
                      (4.96e-07)          (6.13e-07)          (6.75e-07)
big_ports             0.373*                                  0.608***
                      (0.199)                                 (0.233)
airports                                                      0.535**
                                                              (0.233)
airport_arrivals                          1.46e-06**          1.65e-06**
                                          (6.58e-07)          (7.05e-07)
airpt_arr_depts                           7.29e-07**          8.18e-07**
                                          (3.27e-07)          (3.50e-07)
secondary             8.865**             10.85**             11.69**
                      (4.114)             (4.510)             (4.762)
iek                   20.14*              31.60**             26.21*
                      (11.76)             (14.18)             (14.66)
arrivals              6.53e-07**          1.36e-06***         1.43e-06***
                      (3.24e-07)          (4.12e-07)          (4.40e-07)
shop                  0.668***            0.834***            0.738**
                      (0.250)             (0.295)             (0.318)
lubricants            1.572***            1.412***            1.476***
                      (0.262)             (0.311)             (0.355)
do                    -2.284*
                      (1.361)
constant              0.501***            0.760***            0.877***
                      (0.0965)            (0.107)             (0.113)

Finally, for the third market, all previously significant variables are significant once again,
with the exception of "do" and with the addition of "airports". This time, all coefficients are
positive. Gas stations operating on islands with higher values for these characteristics, as
well as gas stations with these kinds of characteristics (shop and lubricants), have a higher
probability of offering car washing services than others. Even though the changes in the sizes
and the standard errors of the coefficients are mixed, they mostly take higher values.


FIGURE 4.8. Regressions' results for the differently defined markets (carwash regressand)

Figure 4.8 shows the results for the aggregate regressions. From this figure one thing is clear:
three variables are informative about the percentage of gas stations offering car washing
services, no matter how one defines the market. Port arrivals, owning a shop and offering
lubricant services play an important role in this percentage. They all have a positive impact
on it, and the effects and standard errors vary across markets. However, they are all highly
significant (α = 0.99), except "port_arrivals" for the first market, which is significant only
at a lower level. We see that the higher the arrivals from ports, the higher this percentage
is, perhaps because port arrivals imply car arrivals as well. The same reasoning could lead to
higher percentages of gas stations with shops and lubricant services (we already saw the
positive relation between "shop" and "carwash" in Figure 4.6). The overall fit of the models
does not differ significantly, with the model of the first market taking the highest Pseudo R²
value (0.247) and the model of the third market the lowest (0.211).
I move forward with the variable "lubricants", the percentage of gas stations offering
lubricants services in a market. As with the previous services regressions, the lubricants
variable is insignificant in the ordered probit regressions of the homogeneous analysis; it
does not affect the number of operating gas stations in a market. Consequently, Table 4.7
displays the results of the individual regressions with "lubricants" on the left hand side of
the equations. For the 3km market (first column), there are six statistically significant
variables in total. Port arrivals, education level (secondary and iek), total arrivals,
distance from the mainland and car washing services make up the set of regressors for this
market, with car washing services being the only one significant at all three significance
levels. All variables have a positive impact on the percentage of gas stations that offer
lubricants services, with the exception of the distance from the mainland.

Table 4.7: Significance of the variables for all three markets (lubricants)
Variable              lubricants 3km      lubricants 3rad     lubricants 10mins
population                                1.73e-05**          2.29e-05***
                                          (8.01e-06)          (8.18e-06)
port_arrivals         1.01e-06**
                      (4.72e-07)
big_ports                                                     0.441**
                                                              (0.222)
ports                                     -0.146*             -0.143*
                                          (0.0755)            (0.0771)
airports                                  0.675***            0.884***
                                          (0.214)             (0.225)
airport_arrivals                          8.94e-07*           1.12e-06**
                                          (5.09e-07)          (5.20e-07)
airpt_arr_depts                           4.44e-07*           5.59e-07**
                                          (2.53e-07)          (2.59e-07)
secondary             7.503*              14.93***            10.60**
                      (3.948)             (4.414)             (4.433)
iek                   21.67*              25.60**
                      (11.35)             (13.06)
university                                12.57**             12.47**
                                          (5.678)             (6.135)
arrivals              5.90e-07*           7.19e-07**          9.11e-07**
                      (3.09e-07)          (3.46e-07)          (3.63e-07)
dist_land             -0.00202**          -0.00162*           -0.00188**
                      (0.000792)          (0.000834)          (0.000858)
distance_peiraeus                         -0.00476**
                                          (0.00221)
carwash               1.551***            1.430***            1.627***
                      (0.246)             (0.290)             (0.336)
vulcanisateur                             1.539***            1.195*
                                          (0.554)             (0.623)
constant              0.296***            0.576***            0.689***
                      (0.0936)            (0.102)             (0.107)

The second market has more statistically significant variables; population, ports, airports,
airport arrivals, airport arrivals and departures, university, distance from Piraeus and
vulcanisateur services are added relative to the previous regression, while port arrivals are
now insignificant. Once again, most variables have a positive effect on the percentage, while
the two variables related to distance (from the mainland and from Piraeus) have a negative
effect. The number of ports has a negative effect as well, which, in my opinion, is an
unexpected outcome given the reasoning previously discussed (more ports mean more car arrivals,
which in turn results in a higher need for various services).
The significant variables for the third market are just about the same as for the second market,
with "iek" and "distance_peiraeus" becoming insignificant and "big_ports" significant. The
positive sign on the "big_ports" coefficient supports the reasoning I discussed, even though
the coefficient of "ports" remains negative. Other than that, the distance from the mainland
has a negative sign while the remaining variables keep their positive signs. On average, the
effects and the standard errors of the variables increase as the definition of the market
widens, a result observed in nearly every previous regression.
FIGURE 4.9. Regressions' results for the differently defined markets (lubricants regressand)

When I regress "lubricants" on all significant variables for each of the three markets, I get
the results of Figure 4.9. For the first market, car washing services and the distance from the
mainland are the two significant variables, with the expected sign on each: positive for
"carwash" and negative for "dist_land". The second market has only variables that affect the
outcome positively; the number of airports and the percentages of gas stations offering car
washing and vulcanisateur services all have a positive effect on the percentage of gas stations
offering lubricant services. Finally, the third market is a mix of the previous two; all four
variables mentioned before are significant, for at least α = 0.95. The overall fit varies from
0.218 to 0.292, with the third market having the maximum value.
Finally, the set of regressions for the "vulcanisateur" variable follows. Unsurprisingly, the
variable is insignificant when it is included in the regressor matrix for the number of
operating gas stations in a market, which leads to the regressions of Table 4.8, where the
significant variables of the individual regressions are displayed.


The percentage of gas stations offering vulcanizing services in the 3km markets is affected
only by the percentage of gas stations offering lubricants services, which is an understandable
outcome. Having vulcanizing installations lowers the cost of offering lubricants services,
since vulcanizing stations usually stock lubricants and have the know-how to offer this
service.

Table 4.8: Significance of the variables for all three markets (vulcanisateur)
Variable              vulcanisateur 3km   vulcanisateur 3rad  vulcanisateur 10mins
population                                                    2.88e-05***
                                                              (7.82e-06)
port_arrivals                             1.59e-06***         1.51e-06***
                                          (5.00e-07)          (5.24e-07)
big_ports                                                     0.466**
                                                              (0.217)
airports                                                      0.685***
                                                              (0.224)
airport_arrivals                                              2.21e-06***
                                                              (4.62e-07)
airpt_arr_depts                                               1.10e-06***
                                                              (2.30e-07)
dist_land                                 -0.00192**          -0.00344***
                                          (0.000916)          (0.000946)
arrivals                                  7.86e-07**          1.93e-06***
                                          (3.24e-07)          (3.45e-07)
lubricants            0.746***            0.921***            0.965***
                      (0.271)             (0.294)             (0.316)
f_brand                                                       -0.881**
                                                              (0.449)
do                                                            -1.012*
                                                              (0.569)
constant              -0.964***           -0.508***           -0.305***
                      (0.110)             (0.101)             (0.0998)

For the second market, port arrivals, total arrivals and distance from the mainland are
added to the list of statistically significant variables. All of them have a positive impact on the
regressand, except for the distance from the mainland, which has the usual negative effect. The
coefficient of lubricants becomes larger, as does its standard error.
The third market has various variables that are statistically significant. In addition to the
previous ones, the population of the island, the number of big ports, the airports, the airport
arrivals, airport arrivals and departures and the percentages of foreign branded and
dealer operated gas stations in a market become statistically significant. The variables related
to the population of the island ("population", "big_ports", "airports", "airport_arrivals" and
"airpt_arr_depts") all have a positive impact on the percentage of gas stations offering
vulcanizing services. On the contrary, the percentages of foreign branded and dealer operated gas
stations have negative signs. As discussed in Figure 4.5, the positive relation of "vulcanisateur"
and "f_brand" is already known, but since we do not get the causal effects from these tables, I
believe that it makes more sense to say that foreign branded gas stations tend to offer vulcanizing
services more often than to say that the percentage of vulcanizing gas stations in a market
determines the percentage of foreign branded gas stations. Leaving this aside, the percentage of
dealer operated gas stations decreases the percentage of gas stations offering vulcanizing services.
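
The stars in Table 4.8 can be sanity-checked from the reported coefficients and the values in parentheses. The sketch below assumes that the parenthesised values are standard errors, that significance is judged with a two-sided normal (z) test, and that the stars follow the common 1% / 5% / 10% convention; none of these is stated explicitly in the text, and the function names are illustrative only.

```python
import math


def two_sided_p_value(coef, std_err):
    """Two-sided p-value of z = coef / std_err under a standard normal."""
    z = coef / std_err
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def stars(p_value):
    """Map a p-value to stars (assumed convention: 1%, 5%, 10%)."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# (coefficient, standard error) pairs copied from Table 4.8;
# for "lubricants" the first listed pair is used.
table_4_8 = {
    "lubricants": (0.746, 0.271),
    "big_ports": (0.466, 0.217),
    "do": (-1.012, 0.569),
}

for name, (coef, se) in table_4_8.items():
    p = two_sided_p_value(coef, se)
    print(f"{name}: z = {coef / se:.2f}, p = {p:.3f} {stars(p)}")
```

Under these assumptions the three rows reproduce the stars printed in the table (three, two and one star respectively), which suggests the assumed convention is at least consistent with the reported results.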

FIGURE 4.10. Regressions’ results for the differently defined markets (vulcanisateur regressand)

Figure 4.10 above displays the aggregate regressions for every market separately. For the
first market, only one variable was statistically significant, so the outcome is the obvious one.
Next, the port arrivals and the percentage of lubricant offering gas stations are the two variables
significant for the aggregate model of the second market and, finally, "lubricants" and "arrivals"
are the two variables defining the percentage of vulcanisateur gas stations in the third market.
We see that the number of total arrivals has a positive impact as well, an intuitively expected
result. In every case, all variables are statistically significant for all three levels. However,
the best model in terms of overall fit is the third one, with a Pseudo R² of 0.2.
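
The text does not say which Pseudo R² is reported. Assuming it is McFadden's measure, a common choice for probit-type models, it compares the log-likelihood of the fitted model with that of a constant-only model. The sketch below uses hypothetical log-likelihood values, picked only so that the result equals 0.2; they are not taken from the dissertation.

```python
def mcfadden_pseudo_r2(loglik_model, loglik_null):
    """McFadden's pseudo R^2: 1 - lnL(model) / lnL(constant-only model)."""
    if loglik_null >= 0:
        raise ValueError("the constant-only log-likelihood must be negative")
    return 1.0 - loglik_model / loglik_null


# Hypothetical log-likelihoods (assumption, for illustration only).
example = mcfadden_pseudo_r2(loglik_model=-80.0, loglik_null=-100.0)
print(round(example, 3))
```

Read this way, a value of 0.2 means that the fitted model raises the log-likelihood by 20% of the distance between the constant-only model and zero.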


4.2.3 Gas stations’ ownership status

In this section I look into the gas stations’ ownership status (being "do", "coco" or independent),
how it affects the number of competing gas stations in a market and whether any factors can
determine this status to some extent. I begin with some regressions with the "do" variable (dealer
operated) as a regressor.
In addition to the variables that were statistically significant in the homogeneous analysis,
the percentage of dealer operated gas stations in a market negatively affects the number of
competing gas stations, but only for the first market. The overall fit changes from the 0.037 of
Figure 3.7 to 0.0461, which is not a great fit but is nevertheless an increase of roughly 25%.

FIGURE 4.11. Regressions’ results for the differently defined markets (do regressor)

Next, the results of Table 4.9 follow, where the variable "do" is the dependent variable. In this
case, I did not regress the ownership status on "branded", since it is obvious that markets with
high percentages of branded gas stations will have higher percentages of "do" gas stations
(especially since from Table 4.1 we see that "do" gas stations constitute the majority of gas
stations, about 95%).
The only variable that is significant in this case is the percentage of shop owning gas stations
for the second market, a not so strong result (significant only at the 10% level). No safe
conclusions can be drawn from this table, which means that even though we know that this status
affects the number of competing gas stations to some extent, we do not know what variables affect
the choice of purchasing a brand name by the owners of these gas stations.

Table 4.9: Significance of the variables for all three markets ("do")

Variable   do 3km     do 3rad    do 10mins
shop                  1.082*
                      (0.563)
constant   2.549***   1.986***   2.088***
           (0.348)    (0.209)    (0.233)


The "coco" variable does not affect the number of competing gas stations in any market, thus
I present only the results of the probit regressions with "coco" as the dependent variable (Table
4.10).

Table 4.10: Significance of the variables for all three markets ("coco")

Variable            coco 3km      coco 3rad     coco 10mins
population          3.72e-05**    5.29e-05***   5.62e-05***
                    (1.48e-05)    (1.15e-05)    (1.11e-05)
ports               0.279**                     0.203**
                    (0.126)                     (0.0920)
airports                          0.538*        0.467*
                                  (0.288)       (0.262)
airport_arrivals    2.47e-06***   1.73e-06***   1.11e-06***
                    (5.81e-07)    (4.69e-07)    (4.16e-07)
airpt_arr_depts     1.23e-06***   8.54e-07***   5.48e-07***
                    (2.90e-07)    (2.33e-07)    (2.07e-07)
arrivals            1.64e-06***   9.09e-07**    7.94e-07**
                    (5.43e-07)    (3.82e-07)    (3.54e-07)
secondary           41.35***      29.22***      21.10***
                    (11.97)       (7.114)       (5.974)
iek                 46.36***
                    (17.92)
university                        12.93*
                                  (7.399)
dist_land                         -0.00349***   -0.00476***
                                  (0.00134)     (0.00144)
distance_peiraeus                               -0.00401*
                                                (0.00214)
income                                          0.000113*
                                                (6.39e-05)
shop                                            0.790*
                                                (0.429)
lubricants          0.769*        0.973***      0.830**
                    (0.430)       (0.375)       (0.373)
constant            -1.658***     -1.071***     -0.900***
                    (0.157)       (0.119)       (0.114)

In contrast to the previous status, this time there are plenty of statistically significant
variables. For the first market, the population of the island, the number of ports, the airport
arrivals, the airport arrivals and departures, the total arrivals, the percentage of iek graduates
and the percentage of gas stations offering lubricant services all have a positive impact on the
percentage of "coco" gas stations in a market.
The number of airports replaces the number of ports and the university graduates replace the iek
graduates in the second set of regressions, while the distance from the mainland is also added
to the matrix of independent variables. All variables besides "dist_land" have positive signs, a
pattern existing in most regressions so far.
The third market’s percentage of "coco" gas stations is determined by all variables related
to the population of the island from the previous two markets and some additions of variables
concerning the location of the island (distance from Piraeus), the gas stations’ characteristics
("shop"), as well as the average income of the island. As for the island’s education characteristics,
the percentage of secondary graduates is the only percentage remaining significant. Except for
the two variables related to the location of the island, all variables have positive effects on the
percentage of "coco" gas stations.
The aggregate regressions are displayed in Figure 4.12. When I regress the percentage of "coco"
gas stations on all statistically significant variables for the first market, only one variable
remains significant, and only at the 10% level: the percentage of gas stations offering lubricant
services. The overall fit of this model is low as well, a foreseeable consequence.

FIGURE 4.12. Regressions’ results for the differently defined markets (coco regressand)

The two following regressions yield different results. The factors that affect the percentage of
"coco" gas stations are related to the population, education and location of the island. The
population, secondary graduates and distance from the mainland are the three variables for the 3km
radius market, with an overall fit almost 6 times better than that of the 3km driving distance
market. As usual, the only negative effect comes from the distance of the island from the mainland.
This is the case for the 10-minute driving distance market as well, with the number of
ports now having a negative impact on the percentage. This is the first time this particular
variable "survived" any combined regression, but the sign of the coefficient may be due to the
fact that fuel companies decide to operate their own gas stations on islands with lower
levels of competition and, perhaps, sell the brand name on islands with higher existing levels
of competition. On the other hand, this is not verified by the sign of "population", a variable
having a positive impact on the competition (as we previously saw in Chapter 3), but it is not
contradicted either. Nonetheless, the overall fit of this model is greater than that of the other
two, with Pseudo R² = 0.354.
The last ownership status concerns the independent gas stations. Trying to answer the
question of whether this kind of status affects the competition in a market gave the outcome of
Figure 4.13.
In similar fashion to the "do" variable, it only seems to affect the number of operating gas
stations in the first market (3km driving distance). The coefficient itself is only weakly
significant (at the 10% level), but the overall fit of the model has once again increased
considerably relative to that of the homogeneous analysis.

FIGURE 4.13. Regressions’ results for the differently defined markets (ind regressor)

Running regressions to determine what variables influence the percentage of independent gas
stations in a market, I end up with Table 4.11. The population, airport arrivals, airport
arrivals and departures, total arrivals and distance from the mainland of an island are the
variables that are statistically significant for the 3km driving distance market. Yet again,
excluding the distance from the mainland, all other variables have positive signs. Markets with
higher percentages of independent gas stations are expected to exist on islands with higher airport
traffic, larger populations and a shorter distance from the mainland.
In the second market, however, three additional variables affect this percentage negatively.
The university graduates percentage and average income decrease independent percentages in
3km radius markets. In contrast with the regressions for the "coco" percentages of Table 4.10, the
coefficient of university graduates now has the opposite sign. It could be that, knowing the
benefits of a brand named gas station (such as higher product quality, positive advertising
externalities etc.), people tend to buy brand names more often, or simply that on these
islands companies tend to operate their own gas stations (as previously shown), limiting
the numbers / percentages of independent gas stations. Along with university graduates, the
average income of an island has a negative impact on the dependent variable as well, but given
the connection between income and education, the interpretation of the effect of income could
be related to that of the university graduates.


Furthermore, the percentage of secondary level graduates affects the percentage of independent
gas stations in the 10-minute market as well. By the same token, "income" and "university"
remain unchanged, all of them affecting the dependent variable in a negative way. In contrast,
arrivals (port, airport and total) and population boost the percentages of independent
gas stations.

Table 4.11: Significance of the variables for all three markets (independent)
Variable indep. 3km indep. 3rad indep. 10mins
population 2.58e-05** 2.09e-05** 3.35e-05***
(1.17e-05) (9.54e-06) (8.74e-06)
port_arrivals 1.53e-06*** 2.33e-06***
(5.84e-07) (5.74e-07)
big_ports 0.660**
(0.258)
airport_arrivals 2.19e-06*** 1.44e-06*** 2.29e-06***
(5.34e-07) (4.67e-07) (4.34e-07)
airpt_arr_depts 1.09e-06*** 7.14e-07*** 1.14e-06***
(2.66e-07) (2.32e-07) (2.16e-07)
university -14.99** -25.25***
(7.148) (6.761)
secondary -8.166*
(4.451)
dist_land -0.00524** -0.00397***
(0.00244) (0.00144)
arrivals 1.59e-06*** 1.72e-06*** 2.73e-06***
(5.00e-07) (4.23e-07) (4.16e-07)
income -0.000276*** -0.000499***
(0.000104) (0.000115)
constant -1.559*** -1.158*** -0.769***
(0.147) (0.124) (0.109)

Again, the variables surviving an overall regression are far fewer than the respective numbers
in the individual regressions. In column (1) of Figure 4.14, we see that for the 3km driving
distance market only the number of airport arrivals remains significant (for all three levels).
The 3km radius market is likewise affected by a single variable, the total arrivals, while the
10-minute driving distance one is affected by the population, number of big ports, secondary
education levels and total arrivals. First of all, since total arrivals is the aggregation of port
and airport arrivals, we notice that airport arrivals are significant in all three cases. In
harmony with the previous table, they all have positive signs. The third regression’s "population"
and "secondary" keep their signs as well, while "big_ports" has changed sign, because of the total
effect of all variables run together. The interpretation of the sign that I would attempt to give
is similar to that of the "ports" coefficient of the "coco" regressions in Figure 4.12. The overall
fit of the model is the highest one yet, with a Pseudo R² of 0.446.

FIGURE 4.14. Regressions’ results for the differently defined markets (ind regressand)

Variables such as the population of an island, the total arrivals, the distance from the mainland
and education all seem to have an impact on every kind of ownership status. The sign of the
education levels alternates, but the remaining variables point in the same direction. The
population and total arrivals both have positive effects on the outcome, while the distance
(either from the mainland or from Piraeus) has the opposite effect.
After running regressions for the number of competing gas stations in a market for every gas
station’s characteristics, I get the results of Table 4.12.

Table 4.12: Significance of the variables for all three markets (heterogeneous)
Variable 3km market 3radius market 10mins market
gr_brand
f_brand
do
ind

For the first market, the brand name and the ownership status are of importance when
defining the number of competitors, while for the third market only the brand status matters.
However, the effects of these characteristics are once again neutralized when run together, as
is visible in Figure 4.15 below.


FIGURE 4.15. Regressions’ results for the differently defined markets (total)

As we can see, only the brand status is statistically significant in the last regression, when
these characteristics are put in the regressors set for the number of competitors in the market.
The percentage of Greek branded gas stations has a negative impact on the number of competitors,
while the variable can be replaced with "f_brand" for the exact opposite results.

CHAPTER 5

CONCLUSIONS

This dissertation’s objective was to highlight the factors that affect the number of operating
gas stations in an oligopolistic market. This topic was approached through two separate analyses
for two different segments, the homogeneous and the heterogeneous. In the homogeneous segment I
examine these factors for gas stations that do not offer differentiated services. I run regressions
for the island level as a whole, and then I separate the islands that have fewer than 10 gas
stations from the islands with 10 or more gas stations. The results of these regressions indicate
that in all three cases the population and size of the island have a positive effect on the number
of operating gas stations, while when run for the whole island the number of arrivals (both port
and airport) plays a significant role as well.

For the heterogeneous segment there were mainly two questions to be answered: what other
characteristics affect the aforesaid number, and which variables explain the values of
these characteristics. For example, does owning a brand named gas station affect the number of
competitors in a market? Moreover, are there any variables affecting the percentage of brand
owned gas stations in a market? The answers varied, depending on the type of market and
characteristic. I run ordered probit regressions to answer the first question, and I found that
variables related to the population of the island, its location with respect to the mainland and
Piraeus, as well as the education levels of the islands and some characteristics of the gas stations
determine the market structure to some extent.
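
To make the ordered probit idea concrete, the sketch below computes the probability of each market-size category from a linear index and a set of cut-points, using P(N = k) = Φ(μ_k − xβ) − Φ(μ_{k−1} − xβ). All numbers in it (the coefficients, the cut-points, the example island and the category labels) are hypothetical and serve only to illustrate the mechanics; they are not estimates from this dissertation.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def ordered_probit_probs(index, cutpoints):
    """Category probabilities of an ordered probit for a given linear index."""
    # Cumulative probabilities at each cut-point, padded with 0 and 1.
    cumulative = [0.0] + [normal_cdf(c - index) for c in sorted(cutpoints)] + [1.0]
    return [cumulative[k + 1] - cumulative[k] for k in range(len(cumulative) - 1)]


# Hypothetical coefficients and island characteristics (assumptions).
beta = {"population": 3.0e-05, "arrivals": 1.0e-06}
island = {"population": 10_000, "arrivals": 200_000}
index = sum(beta[name] * island[name] for name in beta)

# Hypothetical cut-points separating four categories of market size.
cutpoints = [0.0, 1.0, 2.0]
labels = ["1 station", "2 stations", "3 stations", "4 or more"]

for label, prob in zip(labels, ordered_probit_probs(index, cutpoints)):
    print(f"{label}: {prob:.3f}")
```

A positive coefficient shifts the index to the right, moving probability mass from the low categories towards the high ones, which is how a positive effect on the number of competing gas stations should be read in such a model.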

More specifically, for the first market, owning a Greek brand named gas station (merged
with independent gas stations), a foreign brand name ("BP" and "SHELL"), a dealer operated or
an independent gas station ostensibly affects the number of operating gas stations. For the
10-minute driving distance market, this is true only for the first two variables, while there are
no statistically significant heterogeneous variables for the radius market. As Figure 4.15 above
shows, in total, when run altogether, there are four variables determining the number of operating
gas stations in a market: the size, population and total arrivals of an island, as well as whether
or not it is Greek or foreign branded (obviously, "f_brand" would be significant as well if it
replaced "gr_brand"). Having a Greek brand name and a bigger island size decrease this number,
while higher numbers of population and arrivals boost it.
The fact that the remaining variables are not significant in these regressions does not imply that
they play no role in the market definition; rather, their effects get neutralized due to
collinearity. For the second question, variables related to the islands’ population, number of
ports / airports and education levels seemingly affect the characteristics of the gas stations in
a positive way, in general. In contrast, distance from the mainland and Piraeus has a negative
impact on them. These are not constant effects, however, since any characteristic could vary from
island to island or market to market. The population and total arrivals of an island increase the
percentages of gas stations offering car washing, lubricant and vulcanizing services, as well as
the percentages of "coco" and "do" gas stations. Education and gas stations’ characteristics had
mixed results, either increasing or decreasing the percentages in various markets.
This dissertation is meant to serve as a stepping stone for future research, leaving many questions
to be answered. The most obvious one is taking prices into consideration. It is a logical
assumption that prices could have an effect on either the characteristics of the gas
stations (brand names, offering various services etc.) or the market structure (the number of
competing firms). An interesting application would be to use the entry thresholds of
Bresnahan and Reiss (explained in Chapter 1) as a starting point to study the
(potential) entrants and make different assumptions regarding the costs of the firms. Also, by
making the assumptions more heterogeneous, one could add the remaining products offered by the
firms (this dissertation deals only with firms offering "Unleaded 95" fuels). Ultimately, making
the models dynamic by including time would yield some interesting results.

BIBLIOGRAPHY

[1] S. Berry, Estimation of a model of entry in the airline industry, Econometrica, 60 (1992),
pp. 889–917.

[2] S. Berry and P. Reiss, Empirical models of entry and market structure, Handbook of
Industrial Organization, 3 (2007), pp. 1845–1886.

[3] S. Berry and J. Waldfogel, Social inefficiency in radio broadcasting, The RAND Journal
of Economics, 30 (1999), pp. 397–420.

[4] L. Boex, Attributes of effective economic instructions: An analysis of student evaluations,
The Journal of Economic Education, 31 (2000), pp. 211–227.

[5] T. Bresnahan and P. Reiss, Entry and competition in concentrated markets, The Journal
of Political Economy, 99 (1991), pp. 977–1009.

[6] E. Chamberlin, Theory of Monopolistic Competition, Harvard University Press, 1933.

[7] S. DeCanio, Student evaluations of teaching: A multinomial logit approach, The Journal of
Economic Education, 17 (1986), pp. 165–176.

[8] R. Ernst and D. Yett, Physician Location and Specialty Choice, Health Administration
Press, 1985.

[9] G. Chan, P. Miller, and M. Tcha, Happiness in university education, International
Review of Economics Education, 4 (2005), pp. 20–45.

[10] D. Hensher and P. Stopher, Behavioural Travel Modeling (Ch. 15), Croom Helm, 1979.

[11] C. Manski, Identification Problems in the Social Sciences, Harvard University Press, 1995.

[12] M. Mazzeo, Product choice and oligopoly market structure, The RAND Journal of Economics,
33 (2002), pp. 221–242.

[13] J. Panzar and J. Rosse, Testing for ’monopoly’ equilibrium, The Journal of Industrial
Economics, 35 (1987), pp. 443–456.


[14] K. Seim, An empirical model of firm entry with endogenous product-type choices, The RAND
Journal of Economics, 37 (2006), pp. 619–640.

[15] O. Toivanen and M. Waterson, Market structure and entry: Where’s the beef?, The RAND
Journal of Economics, 36 (2005), pp. 680–699.

[16] C. Winship and R. Mare, Regression models with ordinal variables, American Sociological
Review, 49 (1984), pp. 512–525.
