Professional Documents
Culture Documents
The Advantage of Lefties in One-On-One Sports: Francois Fagan, Martin Haugh and Hal Cooper
The Advantage of Lefties in One-On-One Sports: Francois Fagan, Martin Haugh and Hal Cooper
Unauthenticated
Download Date | 4/22/19 2:08 PM
2 | F. Fagan et al.: The advantage of lefties in one-on-one sports
prominence of left-handers in a given sport was due to the modern professionalism and training are acting to counter
strategic advantages of being left-handed in that sport. the advantage. Deliberate training was shown in (Schorer
Indeed the prevailing² explanation today for the et al. 2012) to improve the performance of handball goalies
over-representation of left-handers in professional inter- against players of specific handedness while (Ullén, Ham-
active sports is the negative frequency-dependent selec- brick, and Mosing 2016) explores the issue of deliberate
tion (NFDS) effect. This effect is also assumed to underlie training vs innate talent in depth. A recent article (Liew
the so-called “fighting hypothesis” (Raymond et al. 1996) 2015) in the Telegraph newspaper in the UK, for example,
which explains why there is long-lasting handedness poly- noted how seven of the first seventeen Wimbledon cham-
morphism in humans despite the fitness costs that appear pions in the open era were left-handed men while there
to be associated with left-handedness. The NFDS effect were only two left-handers among the top 32 seeds in the
is best summarized as stating that right-handed players 2015 tournament. Some of this variation is undoubtedly
have less familiarity competing against left-handed play- noise (see Section 6) but there do appear to be trends in
ers (because of the much smaller percentage of lefties in the value of left-handedness. For example, the same Tele-
the population) and therefore perform relatively poorly graph article noted that a reverse effect might be taking
against them as a result. Key evidence supporting this place in women’s tennis. Specifically, the article noted that
hypothesis was the demonstration of mechanisms for how 2015 was the first time in the history of the WTA tour that
NFDS effects might arise (Daems and Verfaillie 1999; Stone there were four left-handed women among the top 10 in
1999; Grossman et al. 2000; Grouios et al. 2000b). The dif- tennis. In most sports the lefty advantage appears to be
ficulty of playing elite left-handed players in one-on-one weaker in women than in men (Loffing and Hagemann
interactive sports has long been recognized. For example, 2016). The issue of gender effects of handedness in pro-
Breznik (2013) quotes Monica Seleš, the former women’s fessional tennis is discussed in (Breznik 2013) where it is
world number one tennis player: shown through descriptive statistics and a PageRank-style
analysis that women do indeed have a smaller lefty advan-
“It’s strange to play a lefty (most players are right-handed)
tage than men although it’s worth noting their data only
because everything is opposite and it takes a while to get used to
extends to 2011. It is also suggested in (Breznik 2013) that
the switch. By the time I feel comfortable, the match is usually
over.” the lefty advantage in tennis is weaker in Grand Slams than
on the ATP and Challenger tours. They conjecture that pos-
A more general overview and discussion of NFDS effects sible explanations for this are that the very best players are
can be found in the recent book chapter of Loffing and more able to adjust to playing lefties and they may also be
Hagemann (2016) who also provide extensive statistics in a better position to tailor their training in anticipation
regarding the percentage of top lefties across various of playing lefties.
sports. It is also perhaps worth mentioning that the debate Many other researchers have studied the extent of
between the ISH and the NFDS mechanism is not quite set- the leftie advantage and how it might arise. For exam-
tled and some research, e.g. (Gursoy 2009) in boxing and ple, (Goldstein and Young 1996) determines a game the-
(Breznik 2013) in tennis, still argue that the ISH has a role oretic evolutionary stable strategy from payoff matrices of
to play. summarized performance, whereas (Billiard, Faurie, and
Recent analyses of combat sports (such as judo Raymond 2005) explicitly uses frequency dependent inter-
(Sterkowicz, Lech, and Blecharz 2010), mixed martial arts actions between left- and right-handed competitors. The
(Dochtermann, Gienger, and Zappettini 2014), and box- work of Abrams and Panaggio (2012) has some similar-
ing (Loffing and Hagemann 2015)) also support the exis- ity to ours as they also model professionals as being the
tence of NFDS effects on performance, although they sug- top performers from a general population skill distribu-
gest that alternative explanations must still be considered tion. They use differential equations to define an equilib-
and that the resulting advantage is small. This agrees with rium of transitions between left- and right-handed popu-
(Loffing, Hagemann, and Strauss, 2012a) which suggests lations. These papers rely then on the NFDS mechanism
that although left-handedness provides an advantage, to generate the lefty advantage. We note that equilibrium-
style models suggest the strength of the lefty advantage
might be inversely proportional to the proportion of top
2 We do note, however, that there are other hypotheses for explain-
lefties. Such behavior is not a feature of our modeling
ing the high proportion of lefties in elite sports. They include higher
testosterone levels, personality traits, psychological advantages and framework but nor is it inconsistent with it as we do not
early childhood selection. See (Loffing and Hagemann 2016) and the model the NFDS mechanism (and resulting equilibrium).
references therein. Instead our main goal is to measure the size of the lefty
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 3
advantage rather than building a model that leads to this where M ij is the number of matches where player i beats
advantage. player j, and T ij := M ij +M ji is the total number of matches
Several researchers have considered how the lefty between players i and j. The inferred skills can then be
advantage has evolved with time. In addition to their used to predict the outcome of future matches.
aforementioned contributions, (Breznik 2013) also plot the There are a few notable extensions to the BTL and TM
mean rank of top lefties and righties over time in tennis models including ELO (Elo 1978), Glicko (Glickman 1999)
and they obtain broadly similar results to those obtained and TrueSkillTM (Herbrich, Minka, and Graepel 2007). ELO
by our Kalman filtering approach. Other researchers have models the performance of each player in a match as hav-
also analyzed the proportion of top lefties in tennis ing a Gaussian distribution centered around their respec-
over time. For example, (Loffing, Hagemann, and Strauss tive skill. Glicko and TrueSkillTM extend the ELO model
2012b) fit linear and quadratic functions to the data and by putting a Gaussian prior on the skill of each player.
then extrapolate to draw conclusions on future trends. These models have been widely applied to various compe-
Their quadratic fit for the proportion of lefties in men’s ten- tition settings. For example, ELO was developed as a chess
nis uses data from 1970 to 2010 and predicts a downwards ranking system, and TrueSkillTM has been used for online
trend from 1990 onwards. This is contradicted by our data match making for video games on Xbox Live. These models
from 2010 to 2015 in Section 6 which suggests that the allow one to infer the skill level of each player and thereby
number of top lefties may have been increasing in recent construct player rankings.
years. They also perform a separate analysis for amateur
players, showing that the lefty advantage increases as the
quality of players improves. It is also worth noting that 1.3 Contributions of this work
Ghirlanda, Frasnelli, and Vallortigara (2009) introduce a
model suggesting the possibility of the lefty advantage In this paper we propose a Bayesian latent ability model
remaining stable over time. for identifying the advantage of being left-handed in one-
on-one interactive sports but with the additional compli-
cation of having a latent factor, i.e. the advantage of left-
1.2 Latent ability and competition models handedness, that we need to estimate. Inference is fur-
ther complicated by the truncated nature of data-sets that
Our work in this paper builds on the extensive³ latent
arise from only observing data related to the top play-
ability and competition models literature. The origi-
ers. The resulting pattern of data “missingness” therefore
nal two competition models are the Bradley-Terry-Luce
depends on the latent factor and so it is important that we
(BTL) model (Bradley and Terry 1952; Luce 1959) and
model it explicitly. We show how to infer the advantage
the Thurstone-Mosteller (TM) model (Thurstone 1927;
of left-handedness when only the proportion of top left-
Mosteller 1951). BTL assumes each player i has skill S i , so
handed players is available. In this case we show that the
that the probability of player i beating player j is a logis-
distribution of the number of left-handed players among
tic function of the difference in skills. Specifically BTL
the top n (out of N) converges as N → ∞ to a binomial
assumes
distribution with a success probability that depends on
⧸︀ (︁ )︁
p(i ◁ j | S i , S j ) = 1 1 + e−(S i −S j ) (1) the tail-length of the innate skill distribution. Since this
result would not be possible if we used short- or long-tailed
where i ◁ j denotes the event of i beating j. TM is defined skill distributions, we also argue for the use of a medium-
similarly, but with the probability of i beating j being a pro- tailed distribution such as the Laplace distribution when
bit function of the difference in their skills. Given match- modeling the “innate”⁴ skills of players. We also use this
play data, the skill of each player may be inferred using result to develop a simple Kalman filtering model for infer-
maximum likelihood estimation (MLE) where the proba- ring how the lefty advantage has varied through time in
bility of the match-play results is assumed to satisfy a given sport. Our Kalman filter/smoother enables us to
(︃ )︃
∏︁ T ij
p(M | S1 , . . . , S N ) = p(i ◁ j)M ij · p(j ◁ i)M ji (2)
M ij 4 Throughout this paper we will use the term “innate skill” to
i<j
refer to all components of a player’s “skill” apart from the advan-
tage/disadvantage associated with being left-handed. We acknowl-
3 See also the related literature on item response theory (IRT) from edge that the term “innate” may be quite misleading – see the dis-
psychometrics and the Rasch model (Fischer and Molenaar 2012) cussion below – but will continue with it nonetheless for want of a
which is closely related to the BTL model below. better term.
Unauthenticated
Download Date | 4/22/19 2:08 PM
4 | F. Fagan et al.: The advantage of lefties in one-on-one sports
smooth any spurious signals over time and should lead to where H i is the handedness indicator with H i = 1 if the
a more robust inference regarding the dynamics of the lefty player is left-handed and H i = 0 otherwise. The generative
advantage. framework of our model is:
We also consider various extensions of our model. For – Left-handed advantage: L ∼ N(0, σ2L ) where σL is
example, in order to estimate the innate skills of top play- assumed to be large and N denotes the normal distri-
ers we consider the case when match-play data among bution.
the top n players is available. This model is a direct gen- – For players i = 1, 2, . . . , N
eralization of the Glicko model described earlier. Unlike – Handedness: H i ∼ Bernoulli(q) where q is the
other models, this extension learns simultaneously from proportion of left-handers in the overall popula-
(i) the over-representation of lefties among top players tion.
and (ii) match-play results between top lefties and right- – Innate skill: G i ∼ G for some given distribution G
ies. Previously these phenomena were studied separately. – Skill: S i = G i + H i L.
We observe that including match-play data in our model
makes little difference to the inference of the lefty advan- The joint probability distribution corresponding to the
tage and therefore helps justify our focus on the simpli- generative model then satisfies
fied model that only considers the proportion of lefties
in the top n players. This extension does help us to iden- N
∏︁
tify the innate skills of players, however, we acknowledge p(S, L, H) = p(L) p (H i ) p(S i | H i , L) (3)
i=1
that these so-called innate skills may only be of interest
to the extent that the NFDS mechanism is responsible for
where we note again that N is the number⁵ of players in
the lefty advantage. (To the extent that the innate superior-
our population universe. We assume we know (from pub-
ity hypothesis holds, it’s hard to disentangle the notion of
lic results of professional competitions etc.) the identity of
innate skill or talent from the lefty advantage and using the
the top n < N players as well as the handedness, i.e. left
phrase “innate skills” would be quite misleading in this
or right, of each of them. Without loss of generality, we let
case.)
these top n players have indices in {1, . . . , n} and define
The remainder of this paper is organized as follows.
the corresponding event
In Section 2 we describe our skill and handedness model
and also develop our main theoretical results here. In {︂ }︂
Section 3 we introduce match-play results among top play- Topn,N := min {S i } ≥ max {S i } . (4)
i=1,...,n i=n+1,...,N
ers into the model while in Section 4 we consider a vari-
ation where we only know the handedness and external Note, however, that even when we condition on Topn,N ,
rankings of the top players. We present numerical results the indices in {1, . . . , n} are not ordered according to
in Section 5 using data from men’s professional tennis in player ranking so for example it is possible that S1 < S2
2014. In Section 6 we propose a simple Kalman filtering or S n < S3 etc.
model for inferring how the lefty advantage in a given sport
varies through time and we conclude in Section 7 where
possible directions for future research are also outlined. 2.1 Medium-tailed priors for the innate skill
Various proofs and other technical details are deferred to distribution
the appendix.
Thus far we have not specified the distribution G from
which the innate skill levels are drawn in the generative
model. Here we provide support for the use of medium-
2 The latent skill and handedness tailed distributions such as the Laplace distribution for
model modeling these skill levels. We do this by investigating
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 5
the probability of top players being left-handed as the p(n l | L; n, N) → Binomial (n l ; n, q) as N → ∞. This too is
population size N becomes infinitely large. Consider then undesirable, since the probability of a top player being
left-handed does not depend on L in the limit and agrees
lim p(n l | L; n, N) (5) with the probability of being left-handed in the general
N→∞
population. As a consequence, such a distribution would
where nl denotes the number of left-handers among the
be unsatisfactory for modeling in those sports where we
top n players. For the skill distribution to be plausible,
typically see left-handers over-represented among the top
the probability that top players are lefthanded should be
players.
increasing in L and be consistent with what we observe
We therefore argue that the ideal distribution for mod-
in practice for a given sport. Letting Binomial (x; n, p)
eling the innate skill distribution is a medium-tailed dis-
denote the probability of x successes in a Binomial (n, p)
tribution such as the standard Laplace distribution which
distribution, we have the following result.
has PDF
Proposition 1. Assume that g has support R. Then 1
(︂ )︂ g(x) = exp (−|x|/b) (7)
q 2b
lim p(n l | L; n, N) = Binomial n l ; n,
N→∞ q + (1 − q)c(L) √
where b = 1/ 2 is the standard scale parameter of the
(6)
where distribution. It is easy to see in this case that c(L) =
exp(−L/b) and substituting this into (6) yields
g(s)
c(L) := lim
s→∞ g(s − L) lim p(n l | L; n, N)
N→∞
(︂ )︂
if the limit c(L) exists. q
= Binomial n l ; n, (8)
q + (1 − q) exp(−L/b)
Proof. See Appendix A.1.
which is much more plausible. For very small values of L,
The function c(L) characterizes the tail length of the skill
the probability of top players being left-handed is approx-
distribution. If c(L) = 0 for all L > 0 as is the case for
imately q which is what we would expect given the small
example with the normal distribution, then g is said to
advantage of being left-handed. For large positive values
be short-tailed. In contrast, long-tailed distributions such
of L we see that the probability approaches 1 and for
as the t distribution, have c(L) = 1 for all L ∈ R. If a
intermediate positive values of L we see that the proba-
distribution is neither short- nor long-tailed, we say it is
bility of top players being left handed lies in the interval
medium-tailed.
(q, 1). This of course is what we observe in many sports
If we use a short-tailed innate skill distribution
such as fencing, table-tennis, tennis etc. For this reason,
and L > 0, then c(L) = 0 and limN→∞ p(n l | L; n, N) =
we will restrict ourselves to medium-tailed distributions
Binomial (n l ; n, 1). That is, if there is any advantage to
and specifically the Laplace⁷ distribution for modeling the
being left-handed and the population is sufficiently large,
innate skill levels in the remainder of this paper.
then the top players will all be left-handed almost surely.
This property is clearly unrealistic even for sports with
a clear left-handed advantage such as fencing, since it
is not uncommon to have top ranked players that are
2.2 Large N inference using only aggregate
right-handed. This raises questions over the heavy use handedness data
of the short-tailed normal distribution in competition
models (Elo 1978; Glickman 1999; Herbrich et al. 2007) Following the results of Proposition 1, we assume here
and suggests⁶ that other skill distributions may be more that we only know the number nl of the top n players who
appropriate. As an alternative, consider a long-tailed dis- are left-handed. We shall see later that only knowing nl
tribution. In this case we have c(L) = 1 for all L ∈ R and results in little loss of information regarding L compared
to the full information case of Section 3 where we have
Unauthenticated
Download Date | 4/22/19 2:08 PM
6 | F. Fagan et al.: The advantage of lefties in one-on-one sports
lim p(L | n l ; n, N) Figure 1: The value of limN→∞ p(L | n l ; n, N) as given by the r.h.s.
N→∞
of (9) for different values of n. We assume an N(0, 1) prior for p(L),
a value of q = 11% and we fixed n l /n = 25%. The dashed vertical
∝ lim p(L)p(n l | L; n, N)
N→∞ line corresponds to L* from (10) and the Laplace approximations are
(︂ )︂ from (11).
q
= p(L) Binomial n l ; n,
q + (1 − q) exp(−L/b)
fixed. The value⁸ of L* provides an easy-to-calculate point
estimate of L for large values of n.
(︃ )︃ (︂ )︂n l
n q
= p(L) The bell-shaped posteriors in Figure 1 suggest that we
nl q + (1 − q) exp(−L/b)
might be able to approximate the posterior of L as a Gaus-
)︂n r
sian distribution. This can be achieved by first approxi-
(︂
q
1−
q + (1 − q) exp(−L/b) mating (9) as a Gaussian distribution over L via a Laplace
(︂ )︂n l approximation (Barber 2012, Sec. 28.2) to the second term
1 on the r.h.s. of (9). Specifically, we set the mean of the
∝ p(L)
q + (1 − q) exp(−L/b) Laplace approximation equal to the mode, L* , of (6) and
(︂
exp(−L/b)
)︂n r then set the precision to the second derivative of the loga-
q + (1 − q) exp(−L/b) rithm evaluated at the mode. This yields:
)︂n (︂ )︂n (︂ )︂
exp(n l /n · L/b) n
(︂
exp(n l /n · L/b) ∝
∼ N L; L* , b2 (11)
= p(L) q exp(L/b) + 1 − q nr nl
q exp(L/b) + 1 − q
(9)
where nr := n − nl is the number of top righthanded play- Note that this use of the Laplace approximation is non-
ers, all factors independent of L were absorbed into the standard as the left side of (11) is not a distribution over
constant of proportionality and the binomial distribution L, but merely a function of L. However if 0 < n l < n then
term on the second line is now written explicitly as a func- the left side of (11) is a unimodal function of L and, up
tion of nl . We shall verify empirically in Section 5 that (9) to a constant of proportionality, is well approximated by
is a good approximation of p(L | n l ; n, N) if the population a Gaussian. We can then multiply the normal approxima-
size N is large. As the number of top players n increases tion in (11) by the other term, p(L), which is also Gaussian
while keeping n l /n fixed, the nth power in (9) causes the to construct a final Gaussian approximation for the r.h.s. of
distribution to become more peaked around its mode. This (9). This is demonstrated in Figure 1 where (9) is plotted for
effect can be seen in Figure 1 where we have plotted the different values of n and where the likelihood factor was
r.h.s of (9) for different values of n. If n is sufficiently large set⁹ to the exact value of (6) or the Laplace approximation
then the data will begin to overwhelm the prior on L and of (11). It is evident from the figure that the Laplace approx-
the posterior will become dominated by the likelihood fac- imation is extremely accurate. This gives us the confidence
tor, i.e. the second term on the r.h.s. of (9). This likelihood to use the Laplace approximation in Section 6 when we
term achieves its maximum at build a dynamic model for L.
(︂ )︂
n l /n 1−q
L* := b log · (10)
1 − n l /n q
which we plot as the dashed vertical line in Figure 1. We 8 As a sanity check, suppose that n l /n = q. Then it follows from (10)
can clearly see from the figure that the density becomes that L* = 0, as expected.
more peaked around L* as n increases while keeping n l /n 9 In each case, we normalized so that the function integrated to 1.
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 7
2.3 Interpreting the posterior of L left-handed players were to simultaneously give up the
advantage of being left-handed. In this latter case, the drop
In the aggregate data regime we do not know the posterior in rank would not be as steep as that given in (14) since that
distributions of the skills and thus cannot directly infer the player would still remain higher-ranked than the other left-
effect of left-handedness on match-play outcomes. How- handed players who were below him in the original rank-
ever, we can still infer this effect in aggregate. As we shall ing. In fact, we can argue that the approximate absolute
see, the relative ranking of left-handers is governed by drop in ranking for a left-handed player of rank x when
the value of L. In particular, if we continue to assume all left-handers give up the benefit of being left-handed
the Laplace distribution for innate skills, then we show in is given by the number of right-handed players between
Appendix A.3 that for any fixed and finite value of L, the ranks x and exp(L/b)x
difference in skills between players at quantiles λj and λi
satisfies (1 − n l /n) × (x exp(L* /b) − x). (15)
(︁ )︁
lim S[Nλ j ] − S[Nλ i ] = b log(λ i /λ j ) (12) We can argue for (15) by noting that on average a frac-
N→∞
tion n l /n of the players between ranks x and x exp(L* /b)
where the convergence is in probability and we use S[Nλ i ] will be left-handed and they will also fall and remain
to denote the [Nλi ]th order statistic of S1 , . . . , S N . Sup- ranked below the original player when the lefty advantage
pose now¹⁰ that Nλ i = x and Nλ j = exp(−k/b)x. Assum- is stripped away. Therefore the new rank of the left-handed
ing (12) continues to hold approximately for large (but player originally ranked x will be
finite) values of N it immediately follows that
x + (1 − n l /n) × (x exp(L* /b) − x) (16)
S[Nλ j ] − S[Nλ i ] ≈ k. (13)
when all left-handed players simultaneously give up the
Consider now a left-handed player of rank x with innate advantage of being left-handed. Simplifying (16) using (14)
skill, G. All other things being equal, if this player was yields a new ranking of
instead right-handed then his skill would change from S = nl
G + L to S = G. This corresponds to a value of k = −L x.
nq
and so that player’s ranking would therefore change from
x to exp(L/b)x which of course would correspond to an We therefore refer to the r.h.s. of (14) and n l /(nq) as
inferior ranking for positive values of L. We can therefore Dropalone and Dropall , respectively. Results for various
interpret the advantage of being left-handed as improv- sports are displayed in Table 1. For example, the table sug-
ing, i.e. lowering, one’s rank by a multiplicative factor of gests that left-handed table-tennis players would see their
exp(−L/b). ranking drop by a factor of approximately 2.91 if the advan-
We can use this result to infer the improvement in rank tage of left-handedness could somehow be stripped away
for lefties due to their left-handedness in various sports from all of them.
(Flatt 2008). We do so by substituting the fraction of top These results, while pleasing, are not very surprising.
players that are left-handed, n l /n, into (10) to obtain L* , After all, a back-of-the-envelope calculation could come
our point estimate of L. Following the preceding discus-
sion, the (multiplicative) change in rank due to going from Table 1: Proportion of left-handers in several interactive one-on-one
sports (Flatt 2008; Loflng and Hagemann 2016) with the relative
left-handed to right-handed can then be approximated by
changes in rank under the Laplace distribution for innate skills with
nl 1−q q = 11%.
exp(L* /b) = · (14)
n − nl q
Sport Approx % left-handed Dropalone Dropall
which again follows from (10). It is important to interpret (n l /n)
(14) correctly. In particular, it represents the multiplica- Tennis 15% 1.43 1.36
tive drop in rank if a particular left-handed player were Badminton 23% 2.41 2.09
somehow to give up the advantage of being left-handed. Fencing (épée) 30% 3.47 2.73
Table-tennis 32% 3.81 2.91
It does not represent his drop in rank if he and all other
nl
Dropalone = n−n · 1−q
q
represents the drop in rank for a left-hander
l
who alone gives up the advantage of being left-handed while
nl
10 There is a slight abuse of notation here since we first need to round Dropall = nq represents the drop in rank of a left-hander when all
Nλi to the nearest integer. left-handers give up the advantage of being left-handed.
Unauthenticated
Download Date | 4/22/19 2:08 PM
8 | F. Fagan et al.: The advantage of lefties in one-on-one sports
to a similar conclusion as follows. The proportion of top In contrast with the win probability in (1), our win prob-
table-tennis players who are left-handed is ≈ 32% but the ability in (17) has a hyperparameter σM that we use to
proportion¹¹ of left-handers in the general population is ≈ adjust for the predictability of each sport. In less pre-
11%. Assuming the top left-handers are uniformly spaced dictable sports, for example, weaker players will often beat
among the top right-handers, we would therefore need to stronger players and so even when S i ≫ S j we have p(i ◁ j |
reduce the ranking of all left-handers in fencing by a factor S i , S j ) ≈ 1/2. On the other hand, more predictable sports
of ≈ 32/11 ≈ 2.91 = Dropall to ensure that the proportion would have a larger value of σM which accentuates the
of top-ranked left-handed fencers matches the proportion effects of skill disparity. Having an appropriate σM allows
of left-handers in the general population. the model to fit the data much more accurately than if we
While these results and specifically the interpretation had simply set σ M = 1 as is the case in BTL. It’s worth not-
of exp(L* /b) described above are therefore not too surpris- ing that instead of scaling by σM in (17), we could have
ing, we can and do interpret them as a form of model vali- scaled the skills themselves so that S i = σ M (G i + H i L) and
dation. In particular, they validate the choice of a medium- then used the win probability in (1). We decided against
tailed distribution to model the innate skill level of each this approach in order to keep the scale of L consistent
player. We note from Proposition 1 and the following dis- across sports.
cussion that it would not be possible to obtain these results
(in the limit as N → ∞) using either short- or long-tailed
distributions to model the innate skills. 3.1 The posterior distribution
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 9
are not player rankings but are merely¹² player indicators will use a Gaussian proposal distribution with mean vec-
from the universe of N players. We know the form of p(L), tor equal to the current state of the Markov chain and
p(S i | L, H i ) and p(M ij | S i , S j ) from the generative model covariance matrix, λ2 Σ/n, where¹³ λ = 2.38 and Σ is an
but need to determine p(Topn,N | S1:n , L). The prior den- approximation to the covariance of the posterior distribu-
sity on the skill of a player in the general population (using tion. In the numerical results of Section 5, we will run mul-
the generative framework with Bernoulli(q) handedness) tiple chains in order to properly diagnose convergence to
satisfies stationarity using the well-known Gelman-Rubin R ̂︀ diag-
[︀ ]︀ nostics. Towards this end, the starting points of the chains
f (S | L) := p(S | L) = EH p(S | H, L) will be generated from an N(µ, γI) distribution where µ is
= qg(S − L) + (1 − q)g(S) (21) an approximation to the mean of the posterior distribu-
tion and where γ is set sufficiently large so as to ensure the
where we recall that g denotes the PDF of the innate skill starting points are over-dispersed.
distribution, G . Letting F(S | L) denote the CDF of the
density in (21), it follows that
3.2.1 Approximating the mean and covariance of the
p(Topn,N | S1:n , L) posterior
N
∏︁ The posterior of the skills S1:n can be approximated by con-
= p(S i ≤ min{S1:n } | S1:n , L) sidering the posterior distribution of its order statistics,
i=n+1
S[1:n] := (S[1] , S[2] , . . . , S[n] ), where S[i] is the ith largest of
= F(min{S1:n } | L)N−n . S1:n . If we were to reorder the player indices so that the
index of each player equaled his rank, then we would
We can now simplify (20) to obtain have S1:n = S[1:n] . Unfortunately we don’t a priori know
the ranks of the players. Indeed their ranks are uncer-
p(S1:n , L | M1:n , H1:n , Topn,N ) tain and will follow a distribution over permutations of
n
1, . . . , n that is dependent on the data. However, it is pos-
g(S i − LH i ) F(min{S1:n } | L)N−n sible to construct a ranking of players that should have a
∏︁
∝ p(L)
i=1 high posterior probability by running BTL on their match-
∏︁ play results. If we order the player indices according to
p(M ij | S i , S j ). (22) their BTL ranks then we would expect
i,j≤n
p(S1:n = s1:n , L | M1:n , H1:n , Topn,N )
≈ p(S[1:n] = s1:n , L | M[1:n] , H[1:n] ) (23)
3.2 Inference via MCMC
where M[1:n] , H[1:n] are the match-play results and hand-
We use a Metropolis-Hastings (MH) algorithm to sample edness of the now ordered top n players.
from the posterior in (22). By virtue of working with the We can estimate the mean and covariance of the dis-
random variables S1:n conditional on the event Topn,N , tribution given by (23) via Monte Carlo. First let us apply
the algorithm will require taking skill samples that are Bayes’ rule to separate L, S[n] and S[1:n−1] and obtain
far into the tails of the skill prior, G . In order to facilitate
fast sampling from the correct region, we develop a tai- p(S[1:n] , L | M[1:n] , H[1:n] )
lored approach for both the initialization and the proposal = p(S[1:n−1] | S[n] , L, M[1:n] , H[1:n] )
distribution of the algorithm. The key to our approach p(S[n] | L, M[1:n] , H[1:n] ) p(L | M[1:n] , H[1:n] ). (24)
is to center and de-correlate the posterior distribution, a
process known as “whitening” (Murray and Adams 2010; We would like to jointly sample from this distribution
Barber 2012). This leads to good proposals in the MH sam- by first sampling L, then sampling S[n] conditional on L,
pler and is also useful in setting the skill scaling hyper-
parameter, σM , as we discuss below. More specifically, we
13 The value of λ = 2.38 is optimal under certain conditions; see
(Roberts and Rosenthal 2001). We also divide the covariance matrix
by the dimension, n, to help counteract the the curse of dimen-
12 It is only through the act of conditioning on Topn,N that we can sionality which makes proposing a good point more difficult as the
infer something about the rankings of these players. dimension of the state-space increases.
Unauthenticated
Download Date | 4/22/19 2:08 PM
10 | F. Fagan et al.: The advantage of lefties in one-on-one sports
and finally sampling S[1:n−1] conditional on both L and proposal distribution for the MH algorithm while we use µ
S[n] . The empirical mean and covariance of such samples as the mean of the over-dispersed starting points for each
would approximate the true mean and covariance of pos- chain. The accuracy of the approximation is empirically
terior of L and S[1:n] , which in turn would approximate the investigated in Section 5 and is found to be very close to
mean and covariance of L and S1:n — our ultimate object of the true mean and covariance of L and S1:n . We note that
interest. Unfortunately sampling from (24) is intractable, the approximate mean, µ, and covariance, Σ, could also
but we can approximately sample from it as follows: be used in other MCMC algorithms such as Hamiltonian
1. As discussed in Section 2.2, for large populations the Monte Carlo (Neal 2011, Sec 4.1) or elliptical slice sampling
posterior of L can be approximated as (Murray, Adams, and Mackay 2010).
(︂ )︂n
∝ exp(n l /n · L/b)
p(L | M[1:n] , H[1:n] ) ∼ p(L) .
q exp(L/b) + 1 − q
3.2.2 Setting σ M via an empirical Bayesian approach
(25)
We can easily simulate from the distribution on the
The hyperparameter σM was introduced in (17) to adjust for
r.h.s. of (25) by computing its CDF numerically and
the predictability of the sport and we need to determine
then using the inverse transform approach. It can be
an appropriate σM in order to fully specify our model. A
seen from Figure 2 in Section 5 that this approximation
simple way to do this is to set σM to be the maximum like-
is very accurate for large N.
lihood estimator over the match-play data where the skills
2. It is intractable to simulate S[n] directly according to
are set to be S1:n = µ S , the approximate posterior mean of
the conditional distribution on the r.h.s. of (24). It
the skills derived above. That is,
seems reasonable to assume, however, that
∏︁
p(S[n] | L, M[1:n] , H[1:n] ) ≈ p(S[n] | L), (26) σ M = argmaxσ>0 p(M ij | S i = µ i , S j = µ j ; σ). (28)
i,j≤n
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 11
for the professional athletes in a given sport, e.g. the offi- 3. Update L via a Metropolis-Hastings¹⁴ (MH) step with
cial world rankings maintained by the World ATP Tour for n
men’s tennis (Stefani 1997). We will assume without loss g(S i − LH i ) F(S n | L)N−n .
∏︁
p(L | S1:n , H1:n ) ∝ p(L)
of generality that the player indices are ordered as per the i=1
given rankings so that the ith ranked player has index i
in our model and S i ≥ S i+1 for all i < n. We are there- In the absence of match-play data, we believe this model
fore assuming that the external rankings are “correct” and should yield slightly more accurate inference regarding L
so we can condition our generative model on these rank- than the base model of Section 2 when the left-handers
ings. Specifically, we tighten the assumption made in (4) are not evenly spaced among the top n players. For exam-
that players indexed 1 to n are the top n players to assume ple, it may be the case that all left-handers in the top n
that the ith indexed player is also the ith ranked player are ranked below all the right-handers in the top n. While
for i = 1, . . . , n. Then a similar argument that led to (22) such a scenario is of course unlikely, it would suggest that
implies that the posterior of interest satisfies the value of L is not as large as that inferred by the base
model which only considers nl and not the relative ranking
p(S1:n , L | H1:n , S i ≥ S i+1 ∀ i < n, S n ≥ max{S n+1:N }) of the nl players among the top n. The model here accounts
for the relative ranking and as such, should yield a more
n
accurate inference of L to the extent that the lefties are not
g(S i − LH i ) F(S n | L)N−n
∏︁ ∏︁
∝ p(L) I[S i ≥ S i+1 ]
evenly spaced among the top n players.
i=1 i<n
(29)
where I denotes the indicator function. We can simulate
from the posterior distribution in (29) using a Gibbs sam- 5 Numerical results
pler. The conditional marginal distribution (required for
the Gibbs sampler) of each player’s skill is then a simple We now apply our models and results to Mens ATP tennis.
truncated distribution so that Specifically, we use handedness data as well as match-
play results from ATP Tennis Navigator (Tennis Navigator
p(S i | S−i , H1:n , L) 2004), a database that includes more than seven thousand
players from 1980 until the present and hundreds of thou-
∝ g(S i − LH i ) I[S i+1 ≤ S i ≤ S i−1 ] (30) sands of match results at various levels of professional
and semi-professional tennis. We restrict ourselves to play-
for 1 < i < n and where S−i := {S j : j = / i, 1 ≤ j ≤ n}. ers for whom handedness data is available and who have
Similarly the conditional distributions of S1 and S n satisfy played a minimum number of games (here, set at thirty).
This last restriction is required because we run BTL as a
p(S1 | S−1 , H1:n , L) ∝ g(S1 − LH1 ) I[S1 ≤ S2 ] preprocessing step in order to extract the top n = 150 play-
ers before applying our methods and because BTL can be
and susceptible to large errors if the graph of matches (with
players as nodes, wins as directed edges) is not strongly¹⁵
p(S n | S−n , H1:n , L) connected. Using data on numbers of recreational tennis
players (Tennis Europe 2015; The Physical Activity Council
∝ g(S n − LH n ) F(S n | L)N−n I[S n−1 ≤ S n ]. 2016), we roughly estimate a universe of N = 100, 000
advanced players but we note that our results were robust
Conveniently, the skills of the odd ranked players can be to the specific value of N that we chose. Specifically, we
updated simultaneously since they are all independent of also considered N = 1 and N = 50 million and obtained
each other conditional on the skills of the even ranked very similar results regarding L. We used data from 2014
players. Similarly, the even-ranked players can also be
updated simultaneously conditional on the skills of the
odd-ranked players. This makes the sampling paralleliz- 14 In the numerical results of Section 5, the MH proposal distribution
able and efficient to implement when using Metropolis- for L will be a Gaussian with mean equal to the current value of L and
where the variance is set during a tuning phase to obtain an accep-
within-Gibbs. Our algorithm therefore updates the vari-
tance probability ≈ 0.234. (This value is theoretically optimal under
ables in three blocks: certain conditions; see (Roberts, Gelman, and Gilks 1997)).
1. Update all even skills simultaneously. 15 Otherwise there may be players that have played and won only one
2. Update all odd skills simultaneously. match who are impossible to meaningfully rank.
Unauthenticated
Download Date | 4/22/19 2:08 PM
12 | F. Fagan et al.: The advantage of lefties in one-on-one sports
for all of our experiments and our results were based on player’s rank by a factor of approximately 1.36 on average.
the model of Section 2 with a Laplace prior on the skills Of course, these results were based on 2014 data only and
and an uninformative prior on L with σ L = 10. The MCMC as we shall see in Section 6, there is substantial evidence
chains for the extended model of Section 3 were initialized to suggest that L has varied through time.
by a random perturbation from the approximate mean as While match-play data and ranked handedness there-
outlined in Section 3.2, and convergence checked using the fore provide little new information on L over and beyond
Gelman-Rubin diagnostic (Gelman and Rubin 1992). knowing the proportion of left-handers in the top n play-
ers, we can use the match-play model to answer hypothet-
ical questions regarding win probabilities when the lefty
5.1 Posterior distribution of L advantage is stripped away from the players’ skills. We
discuss such hypothetical questions in Section 5.3.
In Figure 2, we display the posterior of L obtained using
each of the different models of Sections 2 to 4 using data
from 2014 only. We observe that these inferred posteriors
are essentially identical. This is an interesting result and it 5.2 On the importance of conditioning on
suggests that for large populations there is essentially no Topn,N
additional information conveyed to the posterior of L by
match-play data or ranked handedness if we are already We now consider if it’s important to condition on Topn,N
given the proportion of left-handers among the top n play- as we do in (22) when evaluating the lefty advantage.
ers. The posterior of L can be interpreted in terms of a This question is of interest because previous papers in
change in rank as discussed Section 2.3. Since the poste- the literature have tried to infer the lefty advantage with-
rior of the aggregate handedness with the Laplace approx- out conditioning on the players in their data-set being the
imation agrees with the posterior obtained from the full top ranked players. For example, (Del Corral and Prieto-
match-play data, we would argue the results of Table 1 Rodríguez 2010) built a probit model for predicting match-
are valid even in the light of the match-play data. These play outcomes and their factors included a player’s rank,
results suggest that being left-handed in tennis improves a height, age and handedness. Using data from 2005 to
2008 they did not find a consistent statistically significant
advantage to being lefthanded in this context. That said,
their focus was not inferring the lefty advantage, but rather
in seeing how useful an indicator it is for predicting match-
play outcomes. Moreover, it seems reasonable to assume
that any lefty advantage would already be accounted for
by a player’s rank. In contrast, (Breznik 2013) attempted
to infer the lefty advantage by comparing the mean rank-
ing of top left- and right-handed players as well as the fre-
quency of matches won by top left- vs top right-handed
players. Using data from 1968 to 2011 they found a statisti-
cally significant advantage for lefties. The analysis in these
studies do not account for the fact that the players in their
data-sets were the top ranked players. In our model, this is
equivalent to removing the Topn,N condition from the left
–1 –0.5 0 0.5 1 side of (22) and F(min{S1:n } | L)N−n from its right side.
It makes sense then to assess if there is value in
Figure 2: Posteriors of the left-handed advantage, L, using the infer- conditioning on Topn,N when we use match-play data
ence methods developed in Sections 2, 3 and 4. “Match-play” (and handedness) among top players to estimate the lefty
refers to the results of MCMC inference using the full match-play advantage. We therefore re-estimated the lefty advantage
data, “Handedness and Rankings” refers to MCMC inference
by considering the same match-play model of Section 3
using only individual handedness data and external skill rankings,
“Aggregate handedness” refers to (9) and “Aggregate handedness
but where we did not condition on Topn,N . Inference was
with Laplace Approximation” refers to (9) but where the Laplace again performed using MCMC on the posterior of (22) but
approximation in (11) substitutes for the likelihood term. “Match- with the F(min{S1:n } | L)N−n factor ignored. Based on this
play without conditioning on Topn,N ” is discussed in Section 5.2. model we arrive at a very different posterior for L and this
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 13
Unauthenticated
Download Date | 4/22/19 2:08 PM
14 | F. Fagan et al.: The advantage of lefties in one-on-one sports
players match up especially well against other players. Table 3: Changes in rank of prominent left-handed players when the
Nadal, for example, is famous for matching up particularly left-handed advantage is excluded.
Table 2: Probability of match-play results with and without the advantage of left-handedness.
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 15
6 A dynamic model for L The likelihood of observing the handedness data under
the generative model is:
Thus far we have only considered inference based on data
collected over a single time period, but it is also interest-
ing to investigate how the advantage of left-handedness l
p(n1:T ; n1:T , σ2K )
has changed over time in a given sport. Towards this end,
we assume that the advantage of left-handedness, Lt , in ∫︁
l
= p(n1:T , L0:T ; n1:T , σ2K ) dL0:T
period t follows a Gaussian random walk for t = 1, . . . , T.
We also assume that Lt is latent and therefore unobserved. RT+1
Unauthenticated
Download Date | 4/22/19 2:08 PM
16 | F. Fagan et al.: The advantage of lefties in one-on-one sports
0.25
0.2
0.15
0.1
0.05
1985 1990 1995 2000 2005 2010 2015 2020
0.6
0.4
0.2
–0.2
198 5 199 0 199 5 200 0 200 5 201 0 201 5 2020
Figure 5: Posteriors of the left-handed advantage, Lt , computed using the Kalman filter/smoother. The dashed red horizontal lines (at 11%
in the upper figure and 0 in the lower figure) correspond to the level at which there is no advantage to being left-handed. The error bars in
the marginal posteriors of Lt in the lower figure correspond to 1 standard deviation errors.
inference significantly¹⁸ more difficult. As we observed in the value of L when only the proportion of top left-handed
Section 5.1, including match-play results does not change players is available. In the latter case we showed that the
the posterior distribution of L significantly and so we are distribution of the number of left-handed players among
losing very little information regarding Lt when our model the top n (out of N) converges as N → ∞ to a binomial
and inference is based only on the observed number of top distribution with a success probability that depends on
left-handed players. the tail-length of the innate skill distribution. We also use
this result to develop a simple dynamic model for infer-
ring how the value of L has varied through time in a given
sport. In order to estimate the innate skills of top players
7 Conclusions and further we also considered the case when match-play data among
research the top n players was available. We observed that including
match-play data in our model makes little or no difference
In this paper we have proposed a model for identifying the to the inference of the left-handedness advantage but it
advantage, L, of being left-handed in one-on-one interac- did allow us to address hypothetical questions regarding
tive sports. We use a Bayesian latent ability framework but match-play win probabilities with and without the benefit
with the additional complication of having a latent factor, of left-handedness.
i.e. the advantage of left-handedness, that we needed to It is worth noting that our framework is somewhat
estimate. Our results argued for the use of a medium-tailed coarse by necessity. In tennis for example, there are impor-
distribution such as the Laplace distribution when model- tant factors such as player ability varying across different
ing the innate skills of players. We showed how to infer surfaces (clay, hard court or grass) that we don’t model.
We also attach equal weight to all matches in our model
estimation despite the fact that some matches and tour-
18 Recent techniques have been developed for MCMC on multinomial naments are clearly (much) more important than others.
linear dynamical systems (Linderman, Johnson, and Adams 2015)
Moreover, and as we shall see below, we assume (for a
that significantly improve the efficiency of inference of large state
space models. However these methods will still be orders of magni- given sport) that there is a single latent variable, L, which
tude slower than performing inference on the reduced state space measures the advantage of being left-handed in that sport.
containing only L. We therefore assume that the total skill of each left-handed
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 17
player benefits to the same extent according to the value of for measuring Lt . This can only aid with identifying the
L. This of course would not be true in practice as it seems explanation(s) for the benefits of left-handedness and how
likely that some lefties take better advantage of being it varies across sports and time.
left-handed than others. Alternatively, our model assumes It would also be of interest to consider more com-
that all righties are disadvantaged to the same extent by plex models that can also account for interactions in the
being right-handed. Again, it seems far more likely that skill levels between players and/or different match-play
some right-handed players are more adversely affected circumstances. As discussed above, examples of the lat-
playing lefties than other right-handed players. Finally, ter would include distinguishing between surfaces (clay,
we don’t allow for interaction effects between two play- grass etc.) and grand-slam/regular tour matches in ten-
ers in determining the probability of one player beating nis. Given the flexibility afforded by a Bayesian approach
the other. Again, this seems unlikely to be true in practice it should be straightforward to account for such features in
where some players are known to “match up well” against our models. Given limited match-play data in many sports,
other players. Nonetheless, we do believe our model cap- however, it is not clear that we would be able to learn much
tures the value of being left-handed in an aggregate sense about such features. In tennis, for example, even the very
and can be reasonably interpreted in that manner. While best players may only end up playing each other a couple
it is tempting to use the model to answer questions such of times a year or less. As mentioned earlier, Nadal and
as “What is the probability that Federer would beat Nadal Federer only faced each other once in 2014. It would there-
if Nadal was right-handed?” (and we do ask and answer fore be necessary to consider data-set spanning multiple
such questions in Section 5!), we do acknowledge that the years in which case it would presumably also be necessary
answers to such specific questions should be taken lightly to include form as well as the general trajectory of career
for the reasons outlined above. arcs in our models. We note that such modeling might be
There are several directions of interest for future of more general interest and identifying the value(s) of L
research. First, it would be interesting to apply our model might not be the main interest in such a study.
to data-sets from other one-on-one sports and to esti- Continuing on from the previous point, there has
mate how Lt varies with time in these sports. The Kalman been considerable interest in recent years in the so-called
filtering/smoothing approach developed in Section 6 is “interacting performances theory” O’Donoghue (2009).
straightforward to implement and the data requirements This theory recognizes that the performance (and outcome
are very limited as we only need the aggregate data of a performance) is determined by both the skill level or
(n lt , n t ) for each time period. While trends in Lt across quality of an opponent as well as the specific type of an
different sports would be interesting in their own right, opponent. Indeed, different players are influenced by the
the cross-sport dynamics of Lt could be used to shed same opponent types in different ways. Under this theory,
light on the potential explanations behind the benefit of it is important²⁰ to be able to identify different types of
left-handedness. For example, there is some evidence in players. Once these types have been identified we can then
Figure 5 suggesting that the benefit of left-handedness in label each player as being of a specific type. It may then
men’s professional tennis has decreased with time. If such be possible to accommodate interaction effects between
a trend could be linked appropriately with other devel- specific players (as outlined in the paragraph immediately
opments in men’s tennis such as the superior strength above) by instead allowing for player-type interactions.
and speed of the players, superior racket and string tech- Such a model would require considerably fewer param-
nology, time pressure etc. then it may be possible to eters to be estimated than a model which allowed for
attach more or less weight to the various hypotheses specific player-interactions.
explaining the benefit of left-handedness. The recent¹⁹ Returning to the issue of the left-handedness advan-
work of Loffing (Loffing 2017), for example, studies the tage, we would like to adapt these models and apply
link between the lefty advantage and time pressure in elite them to other sports such as cricket and baseball where
interactive sports. While these ideas are clearly in the lit- one-on-one situations still arise and indeed are the main
erature already, the Kalman filtering approach provides aspect of the sport. It is well-known, for example, that left-
a systematic, straightforward and consistent approach handed pitchers in Major League Baseball (approx. 25%)
19 This paper was also discussed in a recent New York Times arti- 20 See for example Cui et al. (2017a), Cui et al. (2017b) and
cle (Yin 2017) reflecting the general interest in the lefty advantage O’Donoghue (2005), all of which discuss techniques for identifying
beyond academia. different types of tennis players.
Unauthenticated
Download Date | 4/22/19 2:08 PM
18 | F. Fagan et al.: The advantage of lefties in one-on-one sports
and left-handed batsmen in elite cricket (approx 20%) are the innate skills have support R we have F(k | L) < 1 for
over-represented. While the model of Section 2 that only all k ∈ R, where F denotes the conditional CDF of a
uses aggregate handedness could be directly applied to player’s total skill given L. Note that for any values of k,
these sports, it would be necessary to adapt the match-play L and ϵ > 0, we can find N k,L,ϵ ∈ N such that N n F(k |
model of Section 3 to handle them. This follows because L)N−n < ϵ for all N ≥ N k,L,ϵ . It therefore follows that for
the one-on-one situations that occur in these sports do not such k, L, ϵ and N we have
have binary outcomes like win/lose, but instead have mul- ∫︁
tiple possible outcomes whose probabilities would need to p(S1:n | Topn,N , L) dS1:n
be linked to the skill levels of the two participants. S[n] ≤k
We hope to consider some of these alternative direc- n
∫︁ (︃ )︃
N
F(S[n] | L)N−n
∏︁
tions in future research. = f (S i | L) dS1:n
n
S[n] ≤k i=1
n
(︃ )︃ ∫︁ ∏︁
N N−n
≤ F(k | L) f (S i | L) dS1:n
A Appendix n
Rn i=1
n N−n
≤ N F(k | L)
A.1 Proof of proposition 1
<ϵ (37)
Proof. We begin by observing that the exchangeability of where f denotes the PDF of F. The conditional distribution
players implies of player handedness in (36) factorizes as
p(n l | L; n, N) = p(n l | Topn,N , L; n). (34) n
∏︁
p(H1:n = h1:n | S1:n , L) = p(H i = h i | S i , L) (38)
Integrating (34) over the handedness of the top n players i=1
yields and consider now the ith term in this product. We have
S[i] for the ith order statistic of the skills S1:N and write H[i] = ⃒p(H1:n = h1:n | S1:n = s1:n , L) − α(L)n l (1 − α(L))n r ⃒ .
⃒ ⃒
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 19
(︁ ϵ )︁
We are now in a position to prove that = α(L)n l (1 − α(L))n r −
3
∫︁ ⎛ ⎞
lim p(H1:n = h1:n | S1:n , L) p(S1:n | Topn,N , L) dS1:n ∫︁
N→∞
⎝1 − p(S1:n | Topn,N , L) dS1:n ⎠
⎜ ⎟
Rn
S[n] ≤k ϵ/3
= α(L)n l (1 − α(L))n r . (41)
(︁ ϵ )︁ (︁ ϵ )︁
≥ α(L)n l (1 − α(L))n r − 1−
For any ϵ > 0, for all N > N k ϵ/3 ,L,ϵ/3 we have 3 3
(︁ ϵ )︁ ϵ (︁ ϵ )︁
⃒ ≥ α(L)n l (1 − α(L))n r − − 1−
⃒ 3 3 3
⃒α(L)n l (1 − α(L))n r
⃒
2
≥ α(L)n l (1 − α(L))n r −
⃒
⃒ ϵ (44)
3
⃒
∫︁ ⃒
⃒
− p(H1:n = h1:n | S1:n , L) p(S1:n | Topn,N , L) dS1:n ⃒⃒
⃒ where the first inequality follows from (40) and the sec-
Rn
ond inequality follows from (37). Combining the upper and
⃒
⃒
⃒ ∫︁ lower bounds of (43) and (44) implies that the first term
≤ ⃒α(L)n l (1 − α(L))n r − p(H1:n = h1:n | S1:n , L) on the r.h.s. of (42) is bounded above by 2ϵ/3. The second
⃒
⃒
⃒ S[n] ≥k ϵ/3
term on the r.h.s. of (42) satisfies
⃒ ∫︁
⃒
⃒ p(H1:n = h1:n | S1:n , L) p(S1:n | Topn,N , L) dS1:n
p(S1:n | Topn,N , L) dS1:n ⃒
⃒
⃒ S[n] ≤k ϵ/3
⃒
∫︁
∫︁ ≤ p(S1:n | Topn,N , L) dS1:n ≤ ϵ/3.
+ p(H1:n = h1:n | S1:n , L)
S[n] ≤k ϵ/3
S[n] ≤k ϵ/3
S[n] ≥k ϵ/3
(︃ )︃
n
= α(L)n l (1 − α(L))n r
nl
∫︁ (︁ ϵ )︁
≥ α(L)n l (1 − α(L))n r −
3
S[n] ≥k ϵ/3 (︀ )︀
so the limiting distribution is Binomial n l ; α(L), n as
p(S1:n | Topn,N , L) dS1:n desired.
Unauthenticated
Download Date | 4/22/19 2:08 PM
20 | F. Fagan et al.: The advantage of lefties in one-on-one sports
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 21
for all s if L > 0. We then obtain Integrating both sides of (48) w.r.t. x from −∞ to c
yields
E[d(S[1] )] − E[d(X[1] + L)]
∫︁c
∫︁∞ Φ(x)N−1 Nϕ(x) exp(−Lx) dx
= d(s)NF(s)N−1 f (s) ds −∞
−∞
∫︁c2
∫︁∞ ≤ Φ(c)N−1 Nϕ(x) exp(−Lx) dx
N−1
− d(x + L)NΦ(x) ϕ(x) dx −∞
−∞
from which we obtain
∫︁∞
= d(s)NF(s)N−1 f (s) ds E[exp(−LX[1] )I{X[1] ≤c} ]
−∞
≤ NΦ(c)N−1 E[exp(−LX)I{X≤c} ] (49)
∫︁∞
N−1
− d(s)NΦ (s − L) ϕ (s − L) ds
where X is a standard normal random variable. The state-
−∞
ment of the Lemma now follows by noting
∫︁∞ [︁ E[exp(−LX[1] )I{X[1] )≥c} ]
= d(s) NF(s)N−1 f (s)
−∞
= E[exp(−LX[1] ))] − E[exp(−LX[1] )I{X[1] <c} ]
]︁
−NΦ (s − L)N−1 ϕ (s − L) ds
≥ E[exp(−LX[1] ))] − NΦ(c)N−1 E[exp(−LX)I{X<c} ]
[︁ ]︁ ⃒⃒∞
= d(s) F(s)N − Φ (s − L)N ⃒⃒ = E[exp(−LX[1] ))] − aNΦ(c)N−1
−∞
⏟ ⏞
=0 where the inequality follows from (49).
∫︁∞ The following Lemma is well known but we include it here
∂d(s) [︁ ]︁
− F(s)N − Φ (s − L)N ds for the sake of completeness.
∂s
−∞ ⏟ ⏞ ⏟ ⏞ √︀
<0 >0 Lemma 3. E[X[1] ] ≤ 2 log N.
Unauthenticated
Download Date | 4/22/19 2:08 PM
22 | F. Fagan et al.: The advantage of lefties in one-on-one sports
√︀
where we chose β = 2 log N to obtain the final where the second equality follows from (51) with S =
inequality. F −1 (1 − λ | L). Simplifying (52) now yields
F −1 (1 − λ | L)
A.3 Difference in skills at given quantiles
with Laplace distributed innate skills = −b log(λ) + b log([q exp(L/b) + 1 − q]/2). (53)
Proposition 3. If the innate skills are IID Laplace dis- From (David and Nagaraja 2003, pp. 288) we know that
tributed with mean 0 and scale parameter b > 0, then for
any fixed finite L and sufficiently small quantiles λ i , λ j ∈
(︁ )︁
lim S[Nλ j ] − S[Nλ i ]
(0, 1) N→∞
= F −1 (1 − λ j | L) − F −1 (1 − λ i | L)
(︁ )︁
lim S[Nλ j ] − S[Nλ i ] = b log(λ i /λ j ) (54)
N→∞
where the convergence is in probability and we use S[Nλ i ] to where convergence is understood to be in probability. Sub-
denote²¹ the [Nλi ]th order statistic of the skills S1 , . . . , S N . stituting (53) into (54) then yields (for λi and λj sufficiently
small)
Proof. Since the innate skills are Laplace distributed with
mean 0 and scale b > 0 they have the CDF (︁ )︁
lim S[Nλ j ] − S[Nλ i ]
N→∞
1
⎧
⎪
⎪ exp (S/b) if S < 0
⎨2
⎪
= −b log(λ j ) + b log([q exp(L/b) + 1 − q]/2)
G(S) = (50)
⎩1 − 1 exp (−S/b)
⎪
⎪
if S ≥ 0
⎪
2 − (−b log(λ i ) + b log([q exp(L/b) + 1 − q]/2))
while the total skill distribution for someone from the = b log(λ i /λ j )
mixed population of left- and right-handers has CDF F(S |
L) = qG(S − L) + (1 − q)G(S). For skills S ≥ max{L, 0} it as claimed.
therefore follows from (50) that
(︂ )︂
1 (︀ )︀
F(S | L) = q 1 − exp −(S − L)/b
2 A.4 Estimating the Kalman filtering
(︂ )︂ smoothing parameter
1
+ (1 − q) 1 − exp (−S/b)
2
Here we explain how to compute the MLE for σ2K as dis-
1 [︀ ]︀ cussed in Section 6 where we developed a Kalman fil-
=1− exp (−S/b) q exp(L/b) + 1 − q . (51) ter/smoother for estimating the lefty advantage Lt through
2
time. The likelihood of the observed handedness data
Since the Laplace distribution has domain R, for any fixed over the time interval t = 1, . . . , T is given in (33) and
finite L we have limλ↓0 F −1 (1 − λ | L) = ∞. Thus for suffi- satisfies
ciently small quantiles λ ∈ (0, 1) we have F −1 (1 − λ | L) >
max{L, 0}. Consider such a value of λ. Since F(F −1 (1 − λ | l
p(n1:T ; n1:T , σ2K )
L) | L) = 1 − λ it that
∫︁
∝
λ = 1 − F(F −1
(1 − λ | L) | L) ∼ N(L0 ; 0, θ2 )
RT+1
1 (︁ )︁
= exp −F −1 (1 − λ | L)/b T
2
N(L t ; L t−1 , σ2K ) N(L t ; L*t , σ2t ) dL0:T
∏︁
[︀ ]︀
q exp(L/b) + 1 − q (52) t=1
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 23
We therefore obtain
l
p(n1:T ; n1:T , σ2K )
)︂ T
L2 ∏︁ (L t − L t−1 )2 (L t − L*t )2
∫︁ (︂ (︂ )︂ (︂ )︂
∝ 1 1 1
∼ √ exp − 02 √ exp − √ exp − dL0:T
2πθ 2θ
t=1
2πσ K 2σ2K 2πσ t 2σ2t
RT+1
T
(︃ )︃
L2 ∑︁ (L t − L t−1 )2 (L t − L*t )2
∫︁
∝ σ−T
K exp − 02 − 2
− dL0:T
2θ
t=1
2σ K 2σ2t
RT+1
T
(︃ (︃ )︃)︃
L20 ∑︁ L2t − 2L t L t−1 + L2t−1 L2 − 2L t L*t
∫︁
1
∝ σ−T
K exp − + 2
+ t dL0:T
2 θ 2
t=1
σK σ2t
RT+1
T
∫︁ (︃ (︃ )︃)︃
1
σ−T L20 (θ−2 σ−2 L2T σ−2 L2t (2σ−2 σ−2 2σ−2 2σ−2
∑︁
*
= K exp − + K ) − K + K + t ) − K L t L t−1 − t Lt Lt dL0:T
2
t=1
RT+1
∫︁ (︂ )︂
1
= σ−T
K exp − L⊤0:T Σ
−1
L0:T + v⊤ L0:T dL0:T (55)
2
RT+1
Unauthenticated
Download Date | 4/22/19 2:08 PM
24 | F. Fagan et al.: The advantage of lefties in one-on-one sports
Performance During Four Grand Slams.” International Journal Herbrich, R., T. Minka, and T. Graepel. 2007. “TrueskillTM : A
of Performance Analysis in Sport 17:783–801. Bayesian Skill Rating System.” in Advances in Neural
Cui, Y., M.-Á. Gómez, B. Gonçalves, and J. Sampaio. 2017b. “Identi- Information Processing Systems, 569–576.
fying Different Tennis Player Types: An Exploratory Approach to Holtzen, D. W. 2000. “Handedness and Professional Tennis.”
Interpret Performance Based on Player Features.” in Complex International Journal of Neuroscience 105:101–119.
Systems in Sport, International Congress Linking Theory and Liew, J. 2015. “Wimbledon 2015: Once they were Great – but where
Practice 97. have all the Lefties Gone?”. The Telegraph, June 27, 2015.
Daems, A. and K. Verfaillie. 1999. “Viewpoint-Dependent https://www.telegraph.co.uk/sport/tennis/wimbledon/
Priming Effects in the Perception of Human 11703777/Wimbledon-2015-Once-they-were-great-but-where-
Actions and Body Postures.” Visual Cognition 6: have-all-the-lefties-gone.html.
665–693. Linderman, S., M. Johnson, and R. P. Adams. 2015. “Dependent
David, H. A. and H. N. Nagaraja. 2003. Order Statistics. 3rd ed. Multinomial Models Made Easy: Stick-Breaking with the pólya-
Hoboken, N.J: Wiley-Interscience. Gamma Augmentation.” in Advances in Neural Information
Del Corral, J. and J. Prieto-Rodríguez. 2010. “Are Differences in Processing Systems, 3456–3464.
Ranks Good Predictors for Grand Slam Tennis Matches?” Loflng, F. 2017. “Left-Handedness and Time Pressure
International Journal of Forecasting 26:551–563. in Elite Interactive Ball Games.” Biology Letters 13.
Dochtermann, N. A., C. Gienger, and S. Zappettini. 2014. “Born to DOI: 10.1098/rsbl.2017.0446.
Win? Maybe, but Perhaps only Against Inferior Competition.” Loflng, F. and N. Hagemann. 2015. “Pushing Through Evolution?
Animal Behaviour 96:e1–e3. Incidence and Fight Records of Left-Oriented Fighters in Profes-
Elo, A. E. 1978. The Rating of Chessplayers, Past and Present. sional Boxing History.” Laterality: Asymmetries of Body, Brain
Batsford: Arco Pub. and Cognition 20:270–286.
Fischer, G. H. and I. W. Molenaar. 2012. Rasch Models: Foundations, Loflng, F. and N. Hagemann. 2016. “Chapter 12 – Performance Dif-
Recent Developments, and Applications. Springer-Verlag, NY, ferences between Left- and Right-Sided Athletes in One-on-One
USA: Springer Science & Business Media. Interactive Sports.” in Laterality in Sports, edited by F. Loflng,
Flatt, A. E. 2008. “Is Being Left-Handed a Handicap? The Short N. Hagemann, B. Strauss, and C. MacMahon, pp. 249–277.
and Useless Answer is “Yes and No.” Proceedings of Baylor San Diego: Academic Press. https://www.sciencedirect.com/
University Medical Center 21:304–307. science/article/pii/B9780128014264000122.
Gelman, A. and D. B. Rubin. 1992. “Inference from Iterative Sim- Loflng, F., N. Hagemann, and B. Strauss. 2012a. “Left-Handedness
ulation Using Multiple Sequences.” Statistical Science in Professional and Amateur Tennis.” PLoS One 7.
7:457–472. DOI: 10.1371/journal.pone.0049325.
Geschwind, N. and A. M. Galaburda. 1985. “Cerebral Lateraliza-
Loflng, F., N. Hagemann, and B. Strauss. 2012b. “Left-
tion: Biological Mechanisms, Associations, and Pathology:
Handedness in Professional and Amateur Tennis.” PLoS One 7:
I. A Hypothesis and a Program for Research.” Archives of
e49325.
Neurology 42:428–459.
Luce, D. R. 1959. Individual Choice Behavior: A Theoretical Analysis.
Ghirlanda, S., E. Frasnelli, and G. Vallortigara. 2009. “Intraspecific
Courier Corporation, New York: Wiley Publishing.
Competition and Coordination in the Evolution of Lateral-
Mosteller, F. 1951. “Remarks on the Method of Paired Compar-
ization.” Philosophical Transactions of the Royal Society of
isons: I. The Least Squares Solution Assuming Equal Standard
London B: Biological Sciences 364:861–866.
Deviations and Equal Correlations.” Psychometrika 16:3–9.
Glickman, M. E. 1999. “Parameter Estimation in Large Dynamic
http://dx.doi.org/10.1007/BF02313422.
Paired Comparison Experiments.” Applied Statistics
48:377–394. Murphy, K. P. 2012. Machine Learning: A Probabilistic Perspective.
Cambridge, MA: MIT press.
Goldstein, S. R. and C. A. Young. 1996. “Evolutionary” Stable Strat-
egy of Handedness in Major League Baseball.” Journal of Murray, I. and R. P. Adams. 2010. “Slice Sampling Covariance Hyper-
Comparative Psychology 110:164–169. parameters of Latent Gaussian Models.” in Advances in Neural
Grossman, E., M. Donnelly, R. Price, D. Pickens, V. Morgan, Information Processing Systems, 1732–1740.
G. Neighbor, and R. Blake. 2000. “Brain Areas Involved Murray, I., R. P. Adams, and D. Mackay. 2010. “Elliptical Slice Sam-
in Perception of Biological Motion.” Journal of Cognitive pling.” in International Conference on Artificial Intelligence and
Neuroscience 12:711–720. Statistics, 541–548.
Grouios, G., H. Tsorbatzoudis, K. Alexandris, and V. Barkoukis. Nass, R. and M. Gazzaniga. 1987. “Cerebral Lateralization and
2000a. “Do Left-Handed Competitors have an Innate Superi- Specialization in Human Central Nervous System.” Hand-
ority in Sports?” Perceptual and Motor Skills 90:1273–1282. book of Physiology. Vol. 5. New York: Oxford University Press.
Grouios, G., H. Tsorbatzoudis, K. Alexandris, and V. Barkoukis. https://doi.org/10.1002/cphy.cp010518.
2000b. “Do Left-Handed Competitors have an Innate Superi- Neal, R. M. 2011. “MCMC using Hamiltonian Dynamics.” Handbook
ority in Sports?” Perceptual and Motor Skills 90:1273–1282. of Markov Chain Monte Carlo 2:113–162.
Gursoy, R. 2009. “Effects of Left-or Right-Hand Preference on O’Donoghue, P. 2005. “Normative Profiles of Sports Perfor-
the Success of boxers in Turkey.” British Journal of Sports mance.” International Journal of Performance Analysis in Sport
Medicine 43:142–144. 5:104–119.
Unauthenticated
Download Date | 4/22/19 2:08 PM
F. Fagan et al.: The advantage of lefties in one-on-one sports | 25
O’Donoghue, P. 2009. “Interacting Performances Theory.” Taddei, F., M. P. Viggiano, and L. Mecacci. 1991. “Pattern Reversal
International Journal of Performance Analysis in Sport Visual Evoked Potentials in Fencers.” International Journal of
9:26–46. Psychophysiology 11:257–260.
Raymond, M., D. Pontier, A.-B. Dufour, and A. P. Moller. 1996. Tennis Navigator. 2004. “Tennis Software – Tennis Navigator.”
“Frequency-Dependent Maintenance of Left Handedness http://www.tennisnavigator.com/, accessed: 2015-02-03.
in Humans.” Proceedings of the Royal Society of London B: Tennis Europe. 2015. “About Tennis Europe”.
Biological Sciences 263:1627–1633. https://www.tenniseurope.org/page/12173, accessed:
Roberts, G. O., A. Gelman, W. R. Gilks. 1997. “Weak Conver- 2017-01-01.
gence and Optimal Scaling of Random Walk Metropo- The Physical Activity Council. 2016. “2016 Participation Report The
lis Algorithms.” The Annals of Applied Probability 7: Physical Activity Council’s Annual Study Tracking Sports, Fit-
110–120. ness, and Recreation Participation in the US.” https://cdn4.
Roberts, G. O., J. S. Rosenthal. 2001. “Optimal Scaling for Vari- sportngin.com/attachments/document/0112/0253/2016_
ous Metropolis-Hastings Algorithms.” Statistical Science Physical_Activity_Council_Report.pdf.
16:351–367. Thurstone, L. L. 1927. “A Law of Comparative Judgment.”
Schorer, J., F. Loflng, N. Hagemann, and J. Baker. 2012. “Human Psychological Review 34:273–286.
Handedness in Interactive Situations: Negative perceptual Fre- Ullén, F., D. Z. Hambrick, and M. A. Mosing. 2016. “Rethinking
quency Effects Can be Reversed!” Journal of Sports Sciences Expertise: A Multifactorial Gene–Environment Interaction
30:507–513. Model of Expert Performance.” Psychological Bulletin
Stefani, R. T. 1997. “Survey of the Major World Sports 142:427–446.
Rating Systems.” Journal of Applied Statistics 24: Witelson, S. F. 1985. “The Brain Connection: The Corpus Callosum is
635–646. Larger in Left-Handers.” Science 229:665–668.
Sterkowicz, S., G. Lech, and J. Blecharz. 2010. “Effects of Wood, C. and J. Aggleton. 1989. “Handedness in Fast Ball Sports:
Laterality on the Technical/Tactical Behavior in View of the Do Lefthanders have an Innate Advantage?” British Journal of
Results of Judo Fights.” Archives of Budo 6: Psychology 80:227–240.
173–177. Yin, S. 2017. Do Lefties have an Advantage in Sports? It Depends.
Stone, J. V. 1999. “Object Recognition: View-Specificity and The New York Times. https://www.nytimes.com/2017/11/21/
Motion-Specificity.” Vision Research 39:4032–4044. science/lefties-sports-advantage.html
Unauthenticated
Download Date | 4/22/19 2:08 PM