Nolan - Dominance Matrices Project

Tyler Nolan

Professor Hummel

Math 210

April 20th, 2020

The Application of Dominance Matrices to the NCAA Men’s Basketball Tournament

Statistical rating systems are useful in any competition. These systems often use complex

and intricate mathematical calculations. The use of mathematical rating systems to rank

competitors against each other became widely popular with the use of the Harkness system in

professional chess tournaments back in the 1950’s. In 1960 Arpad Elo [1], Professor of Physics

at Marquette University and master level chess player, introduced his own system which was

soon adopted by the United States Chess Federation and the FIDE, which governs chess

worldwide [6]. The rating system is still used in chess tournament play today and has more

recently been popularized further in competitive online multiplayer games such as DOTA 2,

League of Legends, and World of Warcraft [6]. The Elo system rates each player based on

their performance against other players with a number on a normal distribution from 0 to 3000 [6].

On a basic level, wins against an opponent with a higher Elo rating have a greater impact on

a player’s overall rating than wins against a player with an equal or lower Elo rating.

The Elo and Harkness rating systems are useful in chess and video games because they are

considered “zero-sum games.” The win condition of these games is based on an objective that

has no score directly tied to it, such as destroying the enemy base in popular online games or

putting the opponent in checkmate in the case of chess. There are no scores tied directly to

winning in these competitions which is what makes them “zero-sum games.”
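
The weighting described above, where a win over a higher-rated opponent counts for more, can be illustrated with the commonly used logistic form of the Elo update. This is only a sketch: the paper does not give the formula, and the scale of 400 and the K-factor of 32 are conventional choices assumed here, not values from the source.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (logistic Elo model)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> float:
    """New rating for A after one game; score_a is 1 (win), 0.5 (draw) or 0 (loss).

    k = 32 is an assumed K-factor, not a value taken from the paper.
    """
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# A 1500-rated player gains more for beating a 1700-rated opponent
# than for beating a 1300-rated opponent.
gain_vs_stronger = updated_rating(1500, 1700, 1.0) - 1500
gain_vs_weaker = updated_rating(1500, 1300, 1.0) - 1500
print(round(gain_vs_stronger, 2), round(gain_vs_weaker, 2))
```

Only the win, loss, or draw enters the update; the margin of the result plays no role in this form of the formula.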

The rating systems needed for sporting events are much more complex, as almost all sports

are not considered “zero-sum games.” Very simply, the games are ultimately decided by one

team scoring more points than another. A basic understanding of statistics will tell you that a

competition that presents a higher total score between two teams could, in theory, yield a more

stable statistical rating of a team. That is, in soccer a bad team could score a “fluke” goal in the

final minute to win a game that they statistically shouldn’t have, while in basketball a bad team

must score consistently to keep up with the opposition in a game where one team alone can score

upwards of 100 points. Even so, statisticians, fans, sports bettors, and even oblivious spectators

have been continuously stumped by the statistical powerhouse of the NCAA March Madness

Tournament. In the tournament 68 teams from across the country are ranked and pitted against

each other in a single elimination tournament. And each year people across the world attempt to

correctly predict the outcome of all 67 games by filling out their own bracket before the

tournament begins. With a total of 9.2 quintillion (9.2 * 10^18) bracket combinations, it’s

obvious why no one has yet perfectly predicted the outcome of every game (a feat that has a

$1,000,000,000 prize set on it by Warren Buffett) [5].
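
The size of the bracket space follows from each game having two possible outcomes, so g games give 2^g possible brackets. The short check below is a sketch under one assumption not stated in the text: that the 9.2 quintillion figure counts only the 63 games of the 64-team main bracket, since 2^63 is the power of two that matches it.

```python
games_main_bracket = 63   # assumed: 64-team bracket, play-in games excluded
games_full_field = 67     # every game in the 68-team tournament

brackets_main = 2 ** games_main_bracket
brackets_full = 2 ** games_full_field

print(f"{brackets_main:,}")     # exact count for 63 games
print(f"{brackets_main:.2e}")   # the same count in scientific notation
print(f"{brackets_full:.2e}")   # count if all 67 games are predicted
```

Counting all 67 games multiplies the total by 2^4 = 16.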

While a perfect bracket may never be found, it could be useful to employ a rating system

to try to predict as many games as possible. In 2003 Charles Redmond of Mercyhurst University

published a paper in Mathematics Magazine that presented a system for rating teams in a round

robin tournament by way of dominance matrices and eigenvalues [2]. This system is nearly as

simple as the Elo rating system but is better suited to sporting events because it

takes margin of victory into account for each game. Using Redmond’s system we will

attempt to rank each team and fill out a bracket for the 2019 NCAA Men’s March Madness

Bracket.

Mathematical Background: In his paper Redmond uses a simple example tournament between

four teams to explain his rating system. I will give an overview of that example here [2]. He

denotes four games between teams A, B, C, and D with each team playing two games.

Opponents    Score
A vs. B      5-10
A vs. D      57-45
B vs. C      10-7
C vs. D      3-10

We see that team B has a record of 2-0, teams A and D are 1-1, and C is 0-2. Thus, team B

has a winning percentage of 1.00, A and D both have 0.50, and C has 0.00. Redmond revises

this rating for mathematical simplicity by assigning a score of 1 for each win, 0 for a tie, and -1

for a loss, and then dividing a team’s total score by the number of games played to find their

rating. The new ratings are B: 1, A: 0, D: 0, and C: -1. The rankings of each team have not

changed. We still see B in first, A and D tied for second, and C at last in the rankings. Using this

same scoring system we can define a team’s “dominance” over another individual team. B has a

dominance of 1 over A and conversely A has a dominance over B of -1. Redmond then makes a

point to note the major flaw in taking this zero-sum approach to games with a score tied to them.

In this example we see team B beating both teams A and C. Team A loses by 5 points and C by 3

points. Even so, A is rated above C even though team C looks to have fared better when

the two are compared through their games against team B. Conversely, B has a dominance of 1 over both teams even though the

scores would tell us that B is more dominant over A than C. Redmond states that this approach

“reflects imperfectly what has really happened.” He makes another change to the system by

redefining a team’s dominance over another by their net score against that team. Therefore B has

a dominance of 5 over A and A a dominance of -5 over B. We can then define a team’s average

dominance as their net score in all games divided by the number of games played. The average

dominance of each team is now as follows:

Team    Average Dominance
A       3.5
B       4
C       -5
D       -2.5

Redmond then brings up an important factor of any rating system: strength of schedule. A

bad team could win every game they play because they are consistently playing against even

worse teams while a better team could lose constantly due to their schedule of games

consistently being against better teams. This factor is addressed in the Elo system by the

intricacy described above where a player is given more rating points for beating a stronger

opponent. To accomplish this in his system, Redmond adds the revision that each team plays

itself in a hypothetical game that ends in a tie. This seems inconsequential at the moment but

will be important to note later. We then see the ratings revised by dividing the total dominance

by an additional game.

Team    Rating
A       7/3 = 2.33
B       8/3 = 2.67
C       -10/3 = -3.33
D       -5/3 = -1.67
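
The average dominance and the self-game revision described above can be reproduced directly from the four example scores. The following is a minimal sketch; the variable names are illustrative and not taken from Redmond's paper.

```python
# Results from the four-team example: (team, opponent, team_points, opponent_points)
games = [
    ("A", "B", 5, 10),
    ("A", "D", 57, 45),
    ("B", "C", 10, 7),
    ("C", "D", 3, 10),
]

net_score = {}     # total point margin per team
games_played = {}  # number of real games per team

for team, opponent, team_points, opponent_points in games:
    margin = team_points - opponent_points
    # Each game counts once for each side, with opposite signs.
    for name, signed_margin in ((team, margin), (opponent, -margin)):
        net_score[name] = net_score.get(name, 0) + signed_margin
        games_played[name] = games_played.get(name, 0) + 1

# Net score divided by real games played.
average_dominance = {t: net_score[t] / games_played[t] for t in sorted(net_score)}

# Redmond's revision: one extra tied game against itself for every team.
rating_with_self_game = {
    t: net_score[t] / (games_played[t] + 1) for t in sorted(net_score)
}

print(average_dominance)
print({t: round(r, 2) for t, r in rating_with_self_game.items()})
```

The tied self-game adds nothing to a team's net score, so it only changes the divisor from 2 to 3.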

We are now able to get to the crux of this system: we can compare two teams that have not

played each other directly. This is easy to do at a small scale. Redmond asks us to consider A

and C once again who have never played each other, but note that both teams have played D. We

can then create a path from A to C through team D. A beats D by 12 points and D beats C by 7

points. We can then imagine a game between A and C in which A beats C by 12 + 7 = 19 points.

Redmond notes team C as a “second-generation opponent” of team A. Using this system we can

imagine each team playing 9 games instead of the previous 3. Redmond gives a diagram to show

the pathing of each of A’s 9 games. Keep in mind that we are also considering team A as playing

a game against itself.

We retain the first-generation score of A’s game with B through the paths

A→B→B and A→A→B, where the scores are calculated as -5 + 0 = -5 and 0 + (-5) = -5.

This is why it is important to have each team play a game against itself: so that these scores are

retained. It is also important to realize that the first-generation game is counted twice in the

second-generation rating. This makes sense, as an actual game between two teams should

carry more weight in their rating than an imaginary second-generation game. Also note

that a team plays itself three times in the second generation.

From a statistical standpoint the second-generation rating is more accurate because it

provides more data points for us to consider. Additionally, from a sports standpoint, this creates a

more accurate rating because it accounts for strength of schedule. Team A is rewarded 31 points

for beating team D; 12 from beating them directly and 19 additional points due to D’s strong

performance against C.

Redmond calculates the “Average Second-Generation Dominance” as follows:

Team    Average Second-Generation Dominance
A       3.44
B       3.22
C       -4.11
D       -2.56

It is interesting to see that A has now moved into first place by considering strength of schedule

and an increased number of games.
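
The second-generation averages in the table can be checked by listing every two-step path for each team and averaging the combined margins. This is a sketch of that enumeration; the dictionary layout is an illustrative choice rather than Redmond's notation.

```python
# dominance[x][y] is x's net score against y in the example,
# including the tied game each team plays against itself.
dominance = {
    "A": {"A": 0, "B": -5, "D": 12},
    "B": {"B": 0, "A": 5, "C": 3},
    "C": {"C": 0, "B": -3, "D": -7},
    "D": {"D": 0, "A": -12, "C": 7},
}


def second_generation_average(team):
    """Average margin over every two-step path team -> x -> y (9 paths here)."""
    path_margins = [
        dominance[team][x] + dominance[x][y]
        for x in dominance[team]
        for y in dominance[x]
    ]
    return sum(path_margins) / len(path_margins)


for team in "ABCD":
    print(team, round(second_generation_average(team), 2))
```

For team A the nine path margins total 31, which gives 31/9, or about 3.44.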

Redmond goes on to explain that it would make sense that making additional subsequent

generations would yield increasingly accurate results. This would suggest that a limit exists to

how accurate the results can become. He states that it is important to ask at this point if a limit

exists. If it does, the ratings at that limit would provide the most accurate description of each

team. To find this limit and to expand this system to a larger tournament, like the NCAA

tournament, we must apply principles of linear algebra. Redmond continues his explanation with

the same tournament of the four teams. He illustrates the structure of the tournament as follows.

We can see each team is connected to each team they played (including themselves). He then

creates a 4x4 matrix in which a row and column is assigned to each team. A 1 or 0 is entered

depending on whether or not a team has played another. With rows and columns ordered A, B, C, D, the first-generation matrix (M) is

$$M = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 1 \end{pmatrix}$$

He then asks us to consider $M^2$, which is

$$M^2 = \begin{pmatrix} 3 & 2 & 2 & 2 \\ 2 & 3 & 2 & 2 \\ 2 & 2 & 3 & 2 \\ 2 & 2 & 2 & 3 \end{pmatrix}$$

Each entry of $M^2$ is the number of two-step paths between the corresponding pair of teams. He asks the reader to think this

through by looking at the illustration above and the diagram of A’s second generation games. We

notice that A is a second-generation opponent of itself three times, which is reflected in $M^2$, just

as each of the other teams is a second-generation opponent of A twice. $M^2$ can be understood as the

second-generation matrix of the tournament structure. It is interesting to see that simply squaring

the matrix results in the second generation. This makes sense looking back, however, as the

teams all played 3 games in their first generation and $3^2 = 9$ games in the second generation.
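
The claim that squaring the games matrix counts second-generation opponents can be verified by hand or with a few lines of code. The sketch below builds M from the example schedule (self-games included) and multiplies it by itself using a small helper written for this illustration.

```python
teams = ["A", "B", "C", "D"]
# Opponents of each team in the example, including the game against itself.
opponents = {"A": "ABD", "B": "ABC", "C": "BCD", "D": "ACD"}

# M[i][j] is 1 if team i played team j, else 0.
M = [[1 if col in opponents[row] else 0 for col in teams] for row in teams]


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


M2 = matmul(M, M)
for name, row in zip(teams, M2):
    print(name, row, "second-generation games:", sum(row))
```

Each row of M sums to 3 and each row of the square sums to 9, matching the 3 first-generation and 9 second-generation games per team.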

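Redmond then defines the vector S as

$$S = \begin{pmatrix} 7 \\ 8 \\ -10 \\ -5 \end{pmatrix}$$

where the coordinates are the net points scored by each team. He explains how to find the

coordinates of the vector for the first generation ratings of our teams.

$$\frac{1}{3} M^0 S = \frac{1}{3} S = \begin{pmatrix} 7/3 \\ 8/3 \\ -10/3 \\ -5/3 \end{pmatrix}$$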

From here we can move on to see how to go about computing second-generation ratings.

To do this we must add first-generation dominances in three times. This will be given by $3M^0 S$.

We also consider the second-generation dominances, but only adding them in once. This will

be given by $M^1 S$. Redmond then reveals the coordinates of the second-generation ratings as

being calculated by

$$\frac{1}{9}\left(3M^0 S + M^1 S\right) = \frac{1}{3}\left(S + \frac{M}{3} S\right)$$

We now come to our final equation for finding the ratings for the n-th generation.

$$\text{Ratings}_n = \sum_{j=1}^{n} \frac{1}{3}\left(\frac{M}{3}\right)^{j-1} S$$

Redmond continues on to explain how to apply a limit to infinity for this summation

through eigenvalue decomposition. This is a useful tool when trying to obtain the most precise

rating for a team, but it is unnecessary for our application. Additionally, taking a limit of this

particular summation to infinity for a 70x70 data set will require a large amount of

computational power. For the sake of my computer we will simply take the summation to 100.

While this may not give the most accurate rating, the ranking of each team relative to each other

should settle into place by the 100th generation.
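
The truncated summation described above can be sketched in a few lines. The function below accumulates the terms (1/g)(M/g)^(j-1) S one at a time, where g is the number of games per team (3 in the example). The function name and the loop structure are illustrative assumptions, not the MATLAB code used for the project.

```python
def generation_ratings(M, S, games_per_team, generations):
    """Sum over j = 1..generations of (1/g) * (M/g)**(j-1) * S."""
    size = len(S)
    term = [s / games_per_team for s in S]  # j = 1 term: S / g
    ratings = term[:]
    for _ in range(generations - 1):
        # The next term is (M / g) applied to the previous term.
        term = [
            sum(M[i][k] * term[k] for k in range(size)) / games_per_team
            for i in range(size)
        ]
        ratings = [r + t for r, t in zip(ratings, term)]
    return ratings


# Four-team example, teams ordered A, B, C, D.
M = [[1, 1, 0, 1], [1, 1, 1, 0], [0, 1, 1, 1], [1, 0, 1, 1]]
S = [7, 8, -10, -5]

for n in (1, 2, 100):
    print(n, [round(r, 3) for r in generation_ratings(M, S, 3, n)])
```

For n = 1 and n = 2 this reproduces the first- and second-generation ratings of the example. Written this way, each additional generation costs one matrix-vector product. For this small example the ratings stop changing to three decimal places well before the 100th generation.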

The Application: With the necessary equation at our disposal, rating the teams in the

2019 March Madness tournament should be straightforward (but nevertheless tedious). So long

as every team in the bracket can be tied to at least one other team by a regular season matchup

the calculation will scale from a 4 team tournament to any size. A compilation of the scores of

these regular season matchups can be found here [3].

There are a few things to note in this spreadsheet. Firstly, it should be explained that only

68 teams are entered into the tournament each year: the teams that won their conference

tournament plus the teams that are selected by the NCAA because of their overall performance

during the regular season [4]. There are 32 conferences in the NCAA, so that leaves 36 teams

who are granted an “at large” bid. These are the teams hand-picked by a selection committee. In

the scores matrix there are 70 teams. This is because two teams from smaller conferences,

Bradley and Fairleigh Dickinson University (FDU), had not played anyone else in the tournament. By

adding UIndy’s neighbor IUPUI we can link Bradley to the other teams in the bracket, as IUPUI

played both Bradley and Northern Kentucky. Similarly, adding Princeton ties FDU to Duke,

Iona, Yale, Arizona State, and St. John’s. The addition of these teams will have a negligible

impact on our results and their results will be meaningless in the application of filling out our

bracket.

From this scores spreadsheet we can make matrix (M) from our equation by replacing the

scores in each box with the number of games played between any two teams. We must also

include at least one imaginary game that each team plays against itself. One complication that

arises in this application is that the teams have not all played the same number of games, as they did in the

example Redmond provides. Redmond explains that to combat this, we can simply add as many

imaginary games as we need to equalize the total number of games [2]. Kansas played more games

than any other team with 21 total regular season games against other tournament teams. Once we

add the one imaginary game they play against themselves we have them down as playing 22 total

games. For each team we will add as many imaginary games as is necessary to get their total to

22 as well. We notice that this matrix is symmetric across the diagonal. This makes sense as, for

example, if Duke played North Carolina 3 times then that should be reflected in the (Duke, North

Carolina) cell as well as the (North Carolina, Duke) cell. Matrix M can be found here.
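
The construction of M described above, including the padding with imaginary self-games, can be sketched as follows. The game counts in the example are hypothetical and chosen only to illustrate the padding; they are not taken from the project's spreadsheet, and the function name is an assumption.

```python
def build_games_matrix(teams, matchups):
    """Build a symmetric games-played matrix padded with self-games.

    matchups maps (team_a, team_b) -> number of real games between them.
    Returns the matrix and the common number of games per team.
    """
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    M = [[0] * n for _ in range(n)]
    for (a, b), count in matchups.items():
        M[index[a]][index[b]] = count
        M[index[b]][index[a]] = count  # symmetric across the diagonal
    # The busiest team gets exactly one self-game; that sets the common total.
    target = max(sum(row) for row in M) + 1
    for i in range(n):
        M[i][i] = target - sum(M[i])  # pad every other team up to the total
    return M, target


# Hypothetical counts for illustration only.
teams = ["Duke", "Kansas", "North Carolina"]
matchups = {("Duke", "North Carolina"): 3, ("Duke", "Kansas"): 1}
M, games_per_team = build_games_matrix(teams, matchups)
for name, row in zip(teams, M):
    print(name, row)
print("games per team:", games_per_team)
```

In the project itself the busiest team is Kansas with 21 real games, so the common total is 22.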

The only remaining variable in our equation is (S). We can find this by taking the net

score (no pun intended) of each team across their row and compiling those scores into a 70x1

vector. We notice that the elements of this vector sum to zero. This makes sense

because every margin of victory is counted once as a positive value for the winner and once as a negative value for the loser. Vector S can be found here.
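
The row-sum construction of S and its zero total can be illustrated on the four-team example from the background section. The sketch below stores the net margins in a matrix whose (i, j) entry is team i's net score against team j; the layout is an assumption made for illustration, not the format of the project's spreadsheet.

```python
teams = ["A", "B", "C", "D"]

# margin[i][j]: net points of team i over team j in the four-team example.
margin = [
    [0, -5, 0, 12],
    [5, 0, 3, 0],
    [0, -3, 0, -7],
    [-12, 0, 7, 0],
]

# Every game appears twice with opposite signs.
is_antisymmetric = all(
    margin[i][j] == -margin[j][i] for i in range(4) for j in range(4)
)

# S is the vector of row sums: each team's net score over all of its games.
S = [sum(row) for row in margin]

print(dict(zip(teams, S)))
print("antisymmetric:", is_antisymmetric, "sum of S:", sum(S))
```

Because the entries cancel in pairs, the sum of S is zero regardless of how many teams or games are entered.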

The last edit we must make before making our calculations is a slight alteration to our

equation. Since teams play 22 games in our tournament we will replace the 3 in Redmond’s

equation with 22 [2]. Our new equation is as follows.

$$\text{Results} = \lim_{n \to \infty} \left( \sum_{j=1}^{n} \frac{1}{22} \left( \frac{M}{22} \right)^{j-1} S \right)$$

Using this altered equation, M, S, and some help from MATLAB we can calculate the final

rankings of each team (found here). We can then use these rankings to fill out a bracket. The

calculated bracket can be found here.
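
Filling out a bracket from a set of ratings can be sketched as repeatedly advancing the higher-rated team in each matchup. The example below is an illustration under stated assumptions: it uses the second-generation ratings of the four-team example, a hypothetical bracket order, a field whose size is a power of two, and an arbitrary tie-break in favor of the first-listed team. None of these details come from the project's actual bracket.

```python
def fill_bracket(bracket_order, rating):
    """Return the predicted winners of each round.

    bracket_order lists teams so that adjacent pairs meet in the first round;
    the field size is assumed to be a power of two.
    """
    rounds = []
    field = list(bracket_order)
    while len(field) > 1:
        # Pair up neighbors and advance the higher-rated team
        # (ties go to the first-listed team, an arbitrary choice).
        winners = [
            a if rating[a] >= rating[b] else b
            for a, b in zip(field[0::2], field[1::2])
        ]
        rounds.append(winners)
        field = winners
    return rounds


# Second-generation ratings from the four-team example.
rating = {"A": 3.44, "B": 3.22, "C": -4.11, "D": -2.56}
# Hypothetical first-round pairings: A vs. D and B vs. C.
predicted_rounds = fill_bracket(["A", "D", "B", "C"], rating)
print(predicted_rounds)
```

The last list in the result holds the predicted champion.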

The Results: So how well did Redmond’s ranking system work? By comparing the calculated

bracket to the results of the 2019 tournament (found here) we see a 60% success rate at correctly

picking games [4]. Brackets, however, are usually scored by awarding more points for correctly

picking games in later rounds, like so.

Round     Points Awarded
First     1
Second    2
Third     4
Fourth    8
Fifth     16
Sixth     32

Scored this way, the calculated bracket earns 89 of 192 possible points. We can also

compare our calculated bracket to the average bracket entered into ESPN’s Bracket Challenge,

the ESPN People’s Bracket (found here): a representation of the most common picks made by

the millions of brackets submitted to ESPN. The People’s Bracket scored only 75 points.

Therefore, by this standard we can say that applying Redmond’s method to selecting teams in the

2019 March Madness tournament has provided a more accurate prediction than the average

bracket maker on ESPN.com.
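
The round-weighted scoring used above can be sketched as follows. The sketch assumes the play-in games are not scored, so the first round has 32 games, which is what makes the maximum come out to 192. The tiny example picks are hypothetical and only exercise the function.

```python
ROUND_POINTS = [1, 2, 4, 8, 16, 32]      # points per correct pick, by round
GAMES_PER_ROUND = [32, 16, 8, 4, 2, 1]   # assumed: play-in games not scored

max_points = sum(p * g for p, g in zip(ROUND_POINTS, GAMES_PER_ROUND))


def score_bracket(predicted, actual):
    """Score a bracket given the predicted and actual winners of each round.

    A pick is correct if the predicted team actually won a game in that round.
    """
    return sum(
        points * len(set(predicted_winners) & set(actual_winners))
        for points, predicted_winners, actual_winners in zip(
            ROUND_POINTS, predicted, actual
        )
    )


print("maximum score:", max_points)

# Hypothetical two-round example: one correct first-round pick, wrong champion.
example_score = score_bracket([["A", "B"], ["A"]], [["A", "D"], ["D"]])
print("example score:", example_score)
```

Every round is worth 32 points in total under this scheme, so a single correct champion pick is worth as much as the entire first round.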

Discussion: Our results still leave room for improvement. There are some inherent flaws in

applying this system to the NCAA tournament. Firstly, it should be noted that there is so much

more that goes into evaluating an individual team than their net score. Teams match up

differently against other teams based on a variety of factors. For example, UCF’s team in 2019

featured 7’5” center Tacko Fall. As you can imagine any team that is unable to match Tacko’s

size will struggle to score against him or stop him from scoring himself. This means that UCF

matches up well against smaller teams that aren’t great at shooting because Tacko can shut down

their inside scoring. But as soon as UCF ran into Duke in the tournament Tacko was matched up

against 6’6”, 284 lb Zion Williamson. While Zion is almost a foot shorter than Tacko, he is

incredibly strong and was known as one of the best players in the tournament in 2019. Zion’s

strength and sheer skill were able to shut down Tacko in UCF and Duke’s second round matchup.

These insights are lost in any mathematical equation.

Another flaw in this application is the opportunity for outliers. For some teams we have a

very small data set to go by. Old Dominion only played two games against tournament teams in

their 2019 regular season and racked up a net score of 16 points. This is why we see them ranked

9th after our final calculation, only one spot above their first round opponent Purdue. This is just like

the soccer team scoring a “fluke” goal as discussed earlier. In our calculated bracket we see Old

Dominion make what would be a historic run to the fourth round. In actuality Purdue beat them

by 13 points in the first round and made a run of their own into the fourth round. Upsets are a

major part of March Madness, but this application allows for some lapses in judgement when it

comes to picking teams. While any one team theoretically has a chance to beat any other team on

any given gameday, I would imagine no professional sports analyst is projecting Old Dominion

to go to the fourth round with any level of confidence. It should be noted that our results could be

made more accurate by taking into account all games played in the 2018-2019 regular season,

even games between non-tournament teams. We could then use the calculated rankings of only

the teams chosen to participate in the tournament to make our bracket. This would require a

tremendous amount of data entry and processing power that I did not have at my disposal.

The application is not all bad though. It is excellent at identifying teams with good

records against a weak schedule. Wofford came into the tournament with an impressive 29-4

record, but all four losses were against tournament teams. Wofford’s calculated ranking was

57th, a more fitting rank than one based simply on win percentage.

So while I doubt this system is winning anyone any prize money, it is without a doubt

interesting in the way it simplifies ranking. It has already been shown to predict games at a better rate

than the average ESPN bracket. This system could be used in some capacity to aid in an

individual’s bracket selection or even revised to calculate based on other important statistics like

rebounds or assists. And as previously stated, it could surely be made more accurate by taking

into account the statistics of each team in the NCAA and by taking the limit to infinity. Whatever

the case, we should see the equation’s successes as an example of mathematics and statistics’

ability to predict in the real world, and the equation’s failures as a celebration of the randomness

that comes with sports and human competition. Everyone loves an underdog and so long as

underdogs are defying the odds in the March Madness Tournament, no one equation will ever be

able to predict their unlikely success.

References

[1] Glickman, M., Jones, A. Rating the Chess Rating System.

http://glicko.net/research/chance.pdf

[2] Redmond, C. (2003, April). A Natural Generalization of the Win-Loss Rating System.

Mathematics Magazine, Vol. 76.

[3] Sports Reference CBB (2020). College Basketball Stats and History.

https://www.sports-reference.com/cbb/

[4] Staats, W. NCAA (March 21st, 2019). NCAA bids 2019: Bracket for March Madness.

https://www.ncaa.com/news/basketball-men/2019-03-21/ncaa-bids-2019-bracket-march-madness

[5] Wile, R. Business Insider. (Jan 21st, 2014). Warren Buffett will Give you $1 Billion if you

Fill Out a Perfect ‘March Madness’ Bracket.

https://www.businessinsider.com/warren-buffett-billion-dollar-bracket-2014-1

[6] World Chess Hall of Fame. (2020). Arpad Emrick Elo.

https://worldchesshof.org/hof-inductee/arpad-emrick-elo

Document Links

Scores Matrix

Games Played Matrix (M)

Net Score Matrix (S)

Results

Calculated Bracket

Actual Bracket

ESPN Tournament Challenge People’s Bracket
