Optimal Dynamic Clustering Through Relegation
Abstract
Related Literature
Competitive balance within sports leagues has been studied in the economics
literature. Moreover, because North American sports leagues are seen as
monopolies or cartels due to their closed structure, individual league decisions
such as the reserve clause, free agency, salary cap, revenue sharing, and win-
oriented owners have been closely examined for their effects on competitive
balance. Also, the competitive balance of open league systems that utilize
relegation and promotion has been analyzed and compared to North American
sports leagues (Szymanski and Smith, 2002; Buzzacchi et al., 2003; Szymanski
and Valletti, 2005).
However, these analyses do not isolate relegation and promotion as the
only independent variable, and are highly specific to the sport and other league
characteristics. Theoretical models (Noll, 2002) suggest that the introduction of
relegation and promotion will increase spending, but will not directly translate to
competitive balance.
Medcalfe (2003), in an unpublished paper, studied the effects of
implementing relegation and promotion and found evidence that the introduction
of relegation and promotion does increase competitive balance. However, he did
not study the effect of the number of teams relegated and promoted on
competitive balance of the league, and he did not track team skill level dynamics.
In a surprising article in Nature, Juhas et al. (2006) (see also the comment
by Rehr (2006)) point out that the problem of reconstructing three-dimensional
non-crystalline structures from 'pair distribution function' (PDF) data has
similarities to sports relegation and promotion rules. Juhas et al. refer to their
approach for inverting PDF data as the 'Liga algorithm', because it is modelled on
the rules of relegation and promotion used in most of the world's football leagues,
including La Liga in Spain. Teams correspond to trial clusters of atoms; 'winning'
clusters (those with the smallest errors between the model and the experiment) are
iteratively promoted, whereas losing ones (those with the largest errors) are
relegated, so that an optimal structure is quickly found. The authors show that
their algorithm can determine a number of nanoscale structures with perfect
success rates.
To answer our questions and in order to provide a framework that lets us extend
our analysis to other sports, we propose a mathematical model of an abstract
hierarchical sports league managed by a relegation and promotion system. In
order to generate results that can reasonably reflect a real life sports league, the
following elements are fundamental:
Our proposed model follows. Consider a set of sports teams. Each
team possesses an intrinsic “power” or “skill” rating, its intrinsic skill level
(ISL), throughout season t, where t = 1, 2, …, N. This ISL is not observable but manifests itself
through outcomes of games or matches that take place throughout a season
consisting of a fixed number of games. Teams are divided into a hierarchy of divisions where
division 1 contains the strongest teams and the lowest division the weakest teams, with the
relative strength of teams in divisions declining in between. A schedule
determines which teams play each other during a season. Possible schedules
include playing only divisional opponents, playing all teams in the league, and
some mix of the two. We assume for simplicity that all teams in the same division
play the same opponents and that each pair of teams play an even number of
games equally divided between home and away games (this avoids home
advantage effects, a point we elaborate on below). We now expand on our
description of the key model ingredients.
The ISL
The ISL for each team varies from year to year. This variability provides the
motivation for relegation and promotion because if ISLs remained constant, no
relegation would be necessary provided the initial assignment of teams to
divisions was correct. In reality, the main sources of variability are player specific
factors including within- and between-season trades, player acquisitions, injuries,
and skill progression or decline; and team factors including coaching,
management changes, and team chemistry plus other intangibles. Unfortunately
such factors cannot be easily modelled, especially in the context of a simulation
model, so instead we propose using a dynamic probabilistic model for the team
ISLs which relates next season’s ISL to this season’s ISL plus other factors that
can be modelled in a simulation. We show below that for NBA data, an AR1
model (Granger and Newbold, 1986) describes the dynamics well, but in other
settings other models may be more appropriate, especially in individual sports
which must take player career trajectories into account. One parameter we will
control for in our study is the variability in the ISDM.
We make the strong assumption that intrinsic skill level changes only after
all games in a season have been played and that it remains constant within a
season. Because both the skill level and the AR1 model are generated through
end-of-season results, the end-of-year skill level change is actually an average of
skill level changes within a season. Therefore, in the long run, we should not
expect skill level changes within a season to affect the overall outcome of our
model.
The MDM
The ISLs of the two teams competing in a match determine the probability of a
result (say, that a given team wins), which we restrict to be either a win or a loss. The MDM
provides an estimate of this probability. We propose using logistic regression to
model the probability of this outcome although other functional forms may also
be appropriate.
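As a minimal sketch of such a match decision model, the code below maps the difference between two teams' ISLs to a win probability through a logistic curve, in line with the later remark that the MDM depends on the difference of ISLs. The function names and the slope `beta` are hypothetical and chosen only for illustration; they are not fitted values from this study.

```python
import math
import random

def win_probability(isl_a: float, isl_b: float, beta: float = 5.0) -> float:
    """Probability that team A beats team B under a logistic model.

    beta is a hypothetical slope (an assumption, not a fitted coefficient).
    The probability depends only on the ISL difference, so equal ISLs give 0.5.
    """
    return 1.0 / (1.0 + math.exp(-beta * (isl_a - isl_b)))

def play_match(isl_a: float, isl_b: float, rng: random.Random,
               beta: float = 5.0) -> bool:
    """Simulate one match; return True if team A wins (no draws allowed)."""
    return rng.random() < win_probability(isl_a, isl_b, beta)
```

With this form the two teams' win probabilities always sum to one, which matches the restriction of results to a win or a loss.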
Managerial Levers
League management has three levers to use to achieve competitiveness goals: the
division size, the number of teams to relegate and promote, and the schedule.
Division Size
The teams can be divided into any number of divisions but the divisions should
be large enough to allow a reasonable season length based only on a within-
division schedule and to maintain between-season stability, but small enough to
be competitive. If there are too many teams in a division, the team ISLs will vary
considerably and some matches will become rather one-sided. In a league with 30
teams, we believe three 10-team divisions would be reasonable.
We assume for simplicity that all divisions are the same size and the same number
of teams will be relegated and promoted from each division. Of course there is no
promotion from the top division and no relegation from the lowest division. As
noted earlier, if team ISLs were constant from year to year, then no relegation and
promotion would be necessary provided the initial assignment of teams to
divisions was correct. On the other hand, if the ISLs varied greatly from year to
year, then it might be optimal to relegate and promote a large number of teams
each year. Reality is somewhere in between.
Schedule
In North American sports leagues, scheduling often has a profound effect on team
performance. In the NFL, the divisional winners in the previous season play
against other division winners in their inter-divisional matches in the subsequent
season. A schedule composed of stronger opponents will naturally impact a
team’s performance making it difficult for that team to perform at the level it did
in the previous season. Meanwhile, weak teams will play against other weak
teams in their inter-divisional matches, allowing for a better chance to achieve
better results. Since the playoff system determines the eventual champion, it
ultimately offsets biases that result from these unbalanced schedules.
The English football league does not use a playoff system to determine the
champion; instead, it uses a within-division schedule in which teams play each
other twice. This way, teams play the same opponents, so that the schedule does
not have a considerable impact on seasonal results. This is important because the
teams that get relegated should be the weakest teams in the division, and the
teams that get promoted should be the best. One drawback of this schedule is that
fans of teams that are not in the higher division cities do not get to see the top
teams in person. This deficiency is lessened due to England’s size. There are 20
teams in the English Premier League, where the population is slightly over 50
million and the area is approximately 130,000 square km. Compare that to the US
where population is about 300 million and the area is about 9.8 million square km.
Therefore, the ratio of Premier League teams to the population and area of
England is much greater, so fans have better accessibility to the top teams.
We will consider three different balanced schedules where by the
expression balanced we mean that each team within the same division will play
against the same opponents. Further, in practice we would prefer an even number
of games against each opponent to balance home advantage. If an odd number of
games are to be played, the schedule should ensure an equal number of home and
away games. We classify the schedules as follows:
Two commonly used measures of competitive balance in a sports league are the
standard deviation of winning percentage of its teams and the ratio of actual to
idealized standard deviation of winning percentages (Zimbalist, 2002). Since in
empirical data analysis the team winning percentage is affected by the nature of
the sport and the league (for example the NBA has greater variation than MLB)
independent of league rules, it is difficult to compare competitive balance
between different leagues, leaving the appropriate form of measurement open to much
debate. However, in our model the intrinsic skill levels are observable so the
following performance metrics are available.
For each division in each simulated season, we record the mean and standard
deviation of ISLs within each division and the average of these quantities over the
data generated after the simulation warm up period. Ideally, we would like the
average mean divisional ISLs to be ordered in the same way as the divisions and
the average standard deviation of within-division ISLs to be small. Our
assumption is that low within-division ISL variability corresponds to more
competitive divisions. Since the MDM was found empirically (see below) to
depend on the difference of ISLs, this adds further support to this observation
since if two teams have equal ISLs, each has a probability of .5 of winning a
match. Hence, if all other parameters are constant and relegating and promoting
two teams per division produces a lower within-division ISL variation than
relegating and promoting three teams, we conclude that relegating and promoting
two teams produces a more competitive league. Note that we compute these
measures for each division separately but we could also average them over
divisions to get a composite measure of variability.
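The variability measures described above could be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, each division is represented as a plain list of ISLs, and the population standard deviation is used (the text does not say which form was applied).

```python
from statistics import mean, pstdev

def division_isl_summary(divisions):
    """Return a (mean ISL, standard deviation of ISLs) pair per division.

    divisions: list of lists of ISLs for one season, top division first.
    pstdev (population form) is an assumption made for this sketch.
    """
    return [(mean(d), pstdev(d)) for d in divisions]

def average_after_warmup(season_summaries, warmup):
    """Average the per-division summaries over seasons after the warm-up.

    season_summaries: one division_isl_summary() result per simulated season.
    warmup: number of initial seasons to discard.
    """
    kept = season_summaries[warmup:]
    n_div = len(kept[0])
    return [
        (mean(s[d][0] for s in kept), mean(s[d][1] for s in kept))
        for d in range(n_div)
    ]
```

A composite measure, as mentioned in the text, would simply average the second element of each pair over divisions.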
Correct Rankings
Another metric we will use is the number of teams in a division that are assigned
to the appropriate division on the basis of the ISL. For example, if Division 1 has
ten teams and only six of them are among the top ten in the league in terms of the
ISL, then we say that six teams are correctly assigned. The purpose of this metric
is to measure how well the relegation and promotion system clusters teams into
divisions. In an ideal system, the best teams will be in the top division and the
weakest in the lowest division. We will focus primarily on values of this metric
for Division 1.
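The correct-rankings metric can be sketched as follows. The function name and data layout are hypothetical: ISLs are held in a dictionary keyed by team, divisions are indexed from 0 (so Division 1 is index 0), and ties in ISL are broken by dictionary order, which is an assumption of this sketch.

```python
def correctly_assigned(isl_by_team, division_members, division_index,
                       division_size):
    """Count teams in a division whose ISL rank places them in that division.

    isl_by_team: dict mapping team name -> ISL for every team in the league.
    division_members: names of the teams currently in the division.
    division_index: 0 for Division 1, 1 for Division 2, and so on.
    """
    # Rank all teams in the league from highest to lowest ISL.
    ranked = sorted(isl_by_team, key=isl_by_team.get, reverse=True)
    start = division_index * division_size
    rightful = set(ranked[start:start + division_size])
    return len(rightful & set(division_members))
```

For the example in the text, a ten-team Division 1 containing only six of the league's top ten teams by ISL yields a value of six.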
Applying the Model: National Basketball Association
Deriving reliable skill level models is challenging and the subject of extensive
debate and press. For example http://espn.go.com/nba/hollinger/powerrankings is
one of many such rankings for the NBA. To objectively measure a team’s skill
level requires comparisons between individual players, team chemistry,
performance, schedules, etc. Development and critique of such ratings are beyond
the scope of our study, but nonetheless such ratings are fundamental to our model.
Since a team’s demonstrated performance (measured by its winning
percentage) is correlated with its ISL, we can view the ISL as a function of
performance. Therefore we equate the ISL with the team winning percentage.
Figure 2 summarizes average winning percentages for the 30 NBA teams between
1979 and 2009. Together with Table 1 below, it shows that the average historical win
percentage (divided by 100) varies between .35 and .66 and that the league can
naturally be clustered into three divisions of 10 teams each based on these
percentages. As noted above, we will use relegation and promotion to modify
cluster membership each season.
Year-to-year team skill level dynamics are arguably the most important part of
building an accurate simulation. Skill level dynamics differ for different sports,
and can be highly complex. For individual sports such as golf and tennis, skill
level dynamics must take into account a player’s progression and career cycle.
[Figure: Histogram of Average NBA Winning Percentages; x-axis: ave_win_pct, y-axis: Count]
However, for professional team sports, many other factors come into play such as
player acquisition and trades, player progression/career cycle, player injuries,
coaching changes, team chemistry, and ownership/philosophy changes. Especially
in a sport following competition balancing rules of reverse order drafting or the
salary cap, events such as an injury to a star player could have dramatic
consequences. For example, the season ending injury to superstar David Robinson
in 1996-1997 resulted in the San Antonio Spurs going from a top team with a .72
winning percentage to an extremely poor team with a .24 winning percentage, but
also resulted in the drafting of Tim Duncan, extending their run of excellence to
13+ years including four NBA league championships.
Modelling these dynamics realistically matters because of the
effects that they can have on league competition. For example, if we set the variation
of dynamics too high (suggesting that it is common for a team to go from a .72 winning
percentage to .24), no amount of relegation and promotion will be able to impact league
competition because quality in a given season will not be an indicator of quality in
the subsequent season.
To build a year-to-year team skill level dynamic model, we gathered
winning percentage data (expressed as decimals between 0 and 1) for all NBA
teams from the 1979-1980 season to the 2009-2010 season. As noted above we
used this as the basis for the ISL. Our statistical analysis showed that year-to-year
variability in individual team winning percentages was well described by the
following first order autoregressive (AR1) model:
$$X_{i,t+1} - \mu_i = \alpha_i \left( X_{i,t} - \mu_i \right) + \varepsilon_{i,t+1}$$
where:
$X_{i,t}$ = winning percentage of team $i$ in year $t$
$\mu_i$ = mean winning percentage for team $i$
$\alpha_i$ = autoregressive parameter with values between -1 and 1
$\varepsilon_{i,t}$ = a term representing unexplained variation in year $t$. It is assumed
to be normally distributed with mean 0 and standard deviation σ. Note that
σ is a parameter we will vary in our simulations.
We fit time series models to this data using the statistical software package
NCSS. In all cases after fitting the AR1, there was insufficient evidence to reject
the hypothesis that the residuals were white noise (based on the residual
autocorrelation functions, the Box-Ljung test statistic and also fitting either MA1
or AR2 terms). Estimates of the model parameters (where σ is estimated by the
RMSE) appear in Table 1. They show that historical average winning percentages
varied between 0.346 (LA Clippers) and 0.662 (LA Lakers) with a mean of 0.503.
The estimates of the autoregressive parameter α varied between 0.087 (Dallas Mavericks) and 0.882 (Utah
Jazz) with a mean of 0.552 and the RMSEs varied between 0.081 (Utah Jazz) and
0.133 (Chicago Bulls) with a mean of 0.110. This analysis suggests considerable
variability in team underlying quality as measured by the historical average
winning percentage, streakiness or persistence as measured by α, and deviations
from the model as measured by the RMSE. For example, the Utah Jazz had a
historical mean winning percentage of 0.573, an estimate of α of 0.882, and an
RMSE of 0.081. This means that when their winning percentage in one season
exceeds 0.573 by a large amount, there is a high probability that the next
season’s win percentage will also exceed 0.573 because of the large estimate for
α and the small RMSE.
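To illustrate how a mean, an autoregressive estimate, and an RMSE might be obtained from one team's series of winning percentages, the sketch below uses a simple least-squares fit on the demeaned series. This is not the NCSS procedure used in the study and may give somewhat different numbers; the function name `fit_ar1` and the choice to divide the residual sum of squares by the number of residuals are assumptions.

```python
import math

def fit_ar1(series):
    """Least-squares AR1 fit for one team's yearly winning percentages.

    Returns (mu, alpha, rmse), where mu is the sample mean, alpha is the
    slope from regressing next year's deviation on this year's deviation,
    and rmse is the root mean squared residual of that regression.
    """
    n = len(series)
    if n < 3:
        raise ValueError("need at least three seasons")
    mu = sum(series) / n
    dev = [x - mu for x in series]
    num = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    den = sum(dev[t] ** 2 for t in range(n - 1))
    if den == 0:
        raise ValueError("series has no variation")
    alpha = num / den
    resid = [dev[t + 1] - alpha * dev[t] for t in range(n - 1)]
    rmse = math.sqrt(sum(r * r for r in resid) / len(resid))
    return mu, alpha, rmse
```

A large positive `alpha` with a small `rmse` corresponds to the persistence described for the Utah Jazz above.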
To use the fitted AR1 models in the simulation, we must address the
following subtle issue: do we generate next year’s ISL by substituting the
previous year’s ISL estimate, or the actual winning percentage, into the above
model? We believe the former may be more appropriate because the actual
winning percentage may not reflect the true intrinsic skill level because of the
interaction between the schedule and the divisional structure. For example, if all
matches are played within the division, then a weak team in a strong division
could have a lower winning percentage than a strong team in a weak division, so
that in this case the actual winning percentage might not reflect the ISL.
Conversely, when each team plays all other teams, either method would be
appropriate.
Table 1: Estimates of the AR1 model parameters by team

Team              Mean    Alpha    RMSE
Boston            0.575   0.648    0.13
Philadelphia      0.516   0.834    0.097
Washington        0.42    0.419    0.092
New York Knicks   0.49    0.603    0.11
New Jersey        0.404   0.496    0.116
Miami Heat        0.4     0.8      0.126
Orlando           *       *        *
Toronto           *       *        *
Charlotte         *       *        *
Atlanta           0.5     0.596    0.107
Milwaukee         0.507   0.702    0.094
Chicago           0.509   0.693    0.139
Cleveland         0.484   0.69     0.114
Indiana           0.494   0.622    0.0975
Detroit           0.542   0.712    0.121
Denver            0.465   0.594    0.116
Utah              0.573   0.882    0.081
Dallas            0.496   0.0867   0.114
Houston           0.541   0.161    0.113
San Antonio       0.603   0.411    0.133
New Orleans       *       *        *
Minnesota         *       *        *
Memphis           *       *        *
Seattle           0.54    0.559    0.113
Phoenix           0.592   0.311    0.115
Portland          0.564   0.634    0.088
LA Clippers       0.346   0.403    0.105
Golden State      0.41    0.135    0.12
LA Lakers         0.662   0.498    0.094
Sacramento        0.449   0.759    0.095

Mean              0.503   0.552    0.109
Std Dev           0.074   0.216    0.0149
The other challenge in using this model is that over the long run it will
generate winning percentages below 0 and above 1. This can be corrected
analytically by using a logistic transformation of the winning percentage but there
are other problems too. Historically, there has never been a team with a winning
percentage of below 0.10 or above 0.90 in any season (the worst record was the
1972-1973 Philadelphia 76ers who finished 9-73, and the best record was the
1995-1996 Chicago Bulls who finished 72-10). We do not expect more extreme
results than this because of the nature of the business. There is no motivation for a
team that has already secured first place in its division to compete aggressively
and risk injury. Also, professional pride will motivate players on teams with poor
records to compete harder.
From an economic perspective, it would be inefficient for a
team to invest so much that it is significantly better than every other team when
the league success is measured by winning the division and not by dominating
every game. If winning a championship can be achieved with a 0.75 win
percentage, there is no need for a team to try for a 0.90 win percentage. Therefore,
we added into our model limits that prevent a team from ever obtaining a skill
level of less than 0.10 or greater than 0.90. To achieve this, when generated skill
levels were below 0.10 or above 0.90 we simply generated new realizations of
the unexplained-variation term ε until the value fell within this range.
We use the above model as follows. Suppose team i has ISL 0.70 in year
t, its historical mean µ equals 0.55, its autoregressive parameter α equals 0.6, and the sample from a normal
distribution with mean 0 and standard deviation equal to the RMSE (which is
assumed to be 0.08) is 0.04. Then the ISL in year t+1 becomes 0.6 * (0.70 – 0.55)
+ 0.04 + 0.55 = 0.68. By applying this equation to our simulation for each team at
the end of a season and resampling when necessary, we have a model that we
believe is a reasonable reflection of skill level dynamics in the NBA.
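The update and resampling rule described above can be sketched as follows. The function names, the `max_tries` safeguard, and the treatment of the bounds 0.10 and 0.90 as inclusive are assumptions made for this illustration and are not taken from the study's simulation code.

```python
import random

def ar1_update(isl, mu, alpha, eps):
    """Next-season ISL from the AR1 relation, given a noise draw eps."""
    return mu + alpha * (isl - mu) + eps

def next_isl(isl, mu, alpha, sigma, rng, lo=0.10, hi=0.90, max_tries=10000):
    """Draw next season's ISL, redrawing the noise until it lies in [lo, hi].

    sigma is the standard deviation of the noise (estimated by the RMSE).
    max_tries is a hypothetical safeguard against a very long loop.
    """
    for _ in range(max_tries):
        eps = rng.gauss(0.0, sigma)
        candidate = ar1_update(isl, mu, alpha, eps)
        if lo <= candidate <= hi:
            return candidate
    raise RuntimeError("no ISL within bounds after max_tries draws")
```

Calling `ar1_update(0.70, 0.55, 0.6, 0.04)` reproduces the worked example, giving 0.68 up to floating-point rounding.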
As noted above, in practice, team ISL dynamics are affected by many
factors. In the AR1 model, the unexplained-variation term ε combines a wide range of
intangible and unpredictable factors that cause teams to differ from year to year
including retirements, trades, improvement or deterioration in individual players’
skills, new coaches or new management. In our experience and in light of our
purpose in developing an ISDM that can be applied in the simulation, the AR1
model is better suited to our needs and is probably better in forecasting future
ISLs than a model which takes team characteristics into account.
We divided the NBA into 3 divisions of 10 teams each because it makes the most
sense from the perspective of the goals of this study, is supported historically by
Figure 2, and reflects reality in 2011. Larger divisions would make scheduling
challenging and be less competitive, while smaller divisions would be more
competitive but less interesting to fans, and would vary highly from year to year under RP rules.
From a scheduling perspective, several options are available in agreement with
the above three scheduling principles. We consider two possibilities depending on
whether or not we wish to have an equal number of home and away games for
each team. This will not affect the simulation model where home advantage is not
considered in the MDM but will have significant practical implications when the
home advantage is real.
Since we are considering a three-tiered league with ten teams per division, we
vary the number of teams to relegate and promote between zero and five per
division and assume further that all divisions relegate and promote the same
number of teams. Further, we assume no playoffs, so that the end of the regular
season standings provide the basis for relegation and promotion.
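One way to apply such a rule at the end of a season is sketched below. It assumes that each division is given as a list of team names already ordered by final standing (best first), that all divisions have the same size, and that at most half of a division moves in each direction; the function name is hypothetical.

```python
def relegate_and_promote(divisions, k):
    """Swap the bottom k teams of each division with the top k of the next.

    divisions: list of lists of team names, top division first, each list
               ordered by end-of-season standing (best first).
    k: number of teams relegated and promoted per division boundary.
    Returns new division lists; the input is left unchanged.
    """
    new = [list(d) for d in divisions]
    if k == 0:
        return new  # no movement between divisions
    if any(2 * k > len(d) for d in divisions):
        raise ValueError("k may be at most half the division size")
    for upper in range(len(divisions) - 1):
        lower = upper + 1
        # Selections use the original standings, not partially updated lists.
        relegated = divisions[upper][-k:]
        promoted = divisions[lower][:k]
        new[upper] = [t for t in new[upper] if t not in relegated] + promoted
        new[lower] = relegated + [t for t in new[lower] if t not in promoted]
    return new
```

With three ten-team divisions, values of `k` from zero to five cover the range of rules considered here; the top division never promotes and the lowest never relegates.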
Results
Figures A7-A18 in the Appendix provide similar results to those in Figures A1-
A6 for the stable case in which the ISDM standard deviation has been halved and
the unstable case in which the ISDM standard deviation has been doubled. In the
stable case, it is optimal to relegate and promote two teams while in the unstable
case it is optimal to relegate and promote four teams for all schedules. Further, in
the stable case relegating between one and three teams results in similar levels of
competitiveness while in the unstable case relegating between three and five
teams provides similar levels of competitiveness. These results suggest that as the
year-to-year variability increases, it becomes optimal to relegate and promote
more teams.
Figures A19 and A20 in the Appendix show how variability impacts the
Division 1 competitiveness measures for the different RP rules. They show that
for each RP rule, as year-to-year variability increases, the number of teams
correctly classified decreases and the within-division standard deviation increases
suggesting that the divisions are becoming less competitive. They also show how
the optima vary with changing year-to-year variability.
Scheduling Effects
The scheduling system had little effect on competitiveness. This was expected
because all three schedule types required teams to play against the same
opposition.
Discussion
We note also that when relegation and promotion was featured in the 2010
American football video game Backbreaker, fans on discussion boards regarded it
as a refreshing change to the existing NFL league structure.
Future Research
Figures A1-A20
References
Connolly, R.A. and Rendleman, Jr., R.J. (2008). “Skill, Luck and Streaky Play on
the PGA Tour”, Journal of the American Statistical Association 103, 74-88.
Connolly, R.A. and Rendleman, Jr., R.J. (2010). “Going for the Green: A
Simulation Study of Qualifying Success Probabilities in Professional
Golf”, Unpublished manuscript, September 20, 2010.
Juhas, P., Cherba, D.M., Duxbury, P.M., Punch, W.F., and Billinge, S.J.L. (2006).
“Ab Initio determination of solid state nanostructure”, Nature, 440, pp.
655-658 (30 March).