SSRN Id2509740

Quantifying Victory: Napoleon’s Armies’ Victories and Losses
Max Lenk1, Ron Lenk1, Dr. Chris Tofallis2

1 Santa Fe, NM 87507
2 Statistical Services and Consultancy Unit, University of Hertfordshire, Hatfield, UK
Abstract
Napoleonic France won a great many of the more than 150 battles in which it engaged. There
has been much dispute about which, if any, of the many qualitative theories as to why it
was so successful is correct; many of them centered on the personal characteristics of
Napoleon himself. However, none of these theories appears amenable to statistical
analysis. To examine this question quantitatively we take a new direction. We leave aside
questions of generalship and instead analyze the sizes of both of the opposing armies in
battles of the Napoleonic Wars, analyzing French wins and losses separately. We find the
best-fit linear models for these data sets using the Geometric Mean Functional
Relationship. The coefficients of determination for both of these results were 71%,
implying that our best fits model the data unexpectedly well. The difference between
these two models has high statistical significance. Napoleonic France won even though
outnumbered on average by 9%, whereas their opposition won only when they
outnumbered the French by typically 83%. We conclude that absolute sizes of armies -
and not just their relative size - are important factors in determining the result of a battle,
and that Napoleonic France and its opponents were very different in their ability to win
for given army sizes.
Key words: mathematical history, quantitative history, Napoleon, French history
Electronic copy available at: https://ssrn.com/abstract=2509740
Introduction
Determining the reason for the outcome of a battle often comes down to non-quantifiable
aspects, such as superiority of the commanding general or strategic use of geography.
Sometimes, the reason for a victory seems intuitively clear, such as when one army is
numerically much larger than the other is. In this paper, we take a new direction. We look
at battles of the Napoleonic wars, both those in which France won and those in which it
lost, and statistically analyze the sizes of both armies assuming that they are independent
variables [i].
It is common knowledge that Napoleon conquered much of Europe in the early
19th century. From an early age, he showed great abilities in military command, and at
only 24 years of age became a general. He led French armies to victory in 1796 and 1797
against the First Coalition, which consisted of Austria and Prussia, and then the Second
Coalition (1798 - 1802) facing Austria and Russia. In between these, he commanded
French troops in the Egypt-Syria Campaign (during which the Rosetta Stone was
discovered). After this string of successes, Napoleon began to fight in the French Revolution,
achieving victories over the Directory and Louis XVIII that made him well known
throughout France. The Revolution culminated in the overthrow of the monarchy, and
Napoleon rose to supreme power in France. During the battles with the Third Coalition,
while fighting Austria and Russia again, Napoleon crowned himself emperor (1804). The
Fourth Coalition (1806) consisted of Prussia, Russia, and the UK among others. After
Napoleon’s success against all of these enemies, the French were left facing only Sweden
and the UK, and had formed treaties with, or destroyed, all other opposing states. In the
Fifth Coalition (1809), France fought against Austria and the UK (again), and Austria
suffered a catastrophic defeat, losing territory encompassing a fifth of its population.
Finally, during a campaign against the Sixth Coalition (1812), Napoleon was forced to
retreat from Moscow, with his army starved and freezing. Later in the campaign, a
coalition consisting of the UK, Austria, Prussia, Sweden, and Russia, among others,
finally defeated him. The victors banished Napoleon to Elba, an island off the coast of
Italy. He managed to return to France to face the Seventh Coalition (1815). However, he
was defeated once more at the famous battle of Waterloo by the same group of countries
that had formed the Sixth Coalition. They once again banished him, this time to the
distant island of Saint Helena in the South Atlantic, where he died.
There have been many explanations proffered as to why Napoleonic France was
so successful in warfare, many of them naturally centered on the characteristics of
Napoleon himself. The most common reason given is that Napoleon was a ‘genius’ (e.g.,
Kagan). This is obviously impossible to analyze quantitatively. Napoleon was reputed to
have an ‘exceptional talent for motivating his soldiers’ (Hughes) and
‘enormous powers of concentration and of memory’ (van Creveld). Some have suggested
that the victories were due to Napoleon’s introduction of new tactics (Wilkinson-Latham),
while his opponents were still stuck in the older ways of conducting battle. For example,
Napoleon utilized fast-moving flanking infantry to achieve a tactical advantage through
greater maneuverability. Carlyle says that Napoleon had ‘a faith in democracy, yet hatred
of anarchy … that carries [him] through’. None of these explanations appears testable,
and testability is the sine qua non of a scientific theory [ii].

Sometimes the successes of Napoleonic France are not attributed to Napoleon, but
to the French armies themselves, which were composed of ‘men who were genuinely
willing soldiers’ and ‘officers of outstanding personal qualities’ (Platt). In this
explanation, French generals rose through merit, as opposed to the wealth-based system
in most of the monarchies of the various Coalitions, so that the men more accomplished
and better suited for commanding rose more quickly to the top. None of these
explanations seems readily quantifiable, and so it is unclear how they could be tested.
Finally, Tolstoy asserts that the events of history (and in particular of Napoleonic battles)
are composed of the actions of the thousands of individuals involved, and that, to the
extent that these individuals have free will, those events are not causally explicable at all [iii].
It has thus proven impossible to determine which if any, or all, or none, of these
differing explanations is correct. In this paper, we took a different approach. We left
aside questions of generalship, and instead analyzed statistically battles of the Napoleonic
wars based on available numerical data. Our goal was to determine if there were
statistically significant differences in the outcomes of the battles based on this data,
without proposing a priori hypotheses. We began by collecting data on 156 battles in
which the Napoleonic French armies participated (102 wins and 41 losses, along with 13
battles in which it was unclear which side was victorious). A variety of numerical data is
available for many of these battles, such as the army sizes of both sides, the numbers of
guns, horses, casualties, prisoners etc.
After an initial analysis of the various sets of data in pairs, it appeared that there
were separate groupings of French wins and losses based on opposing army sizes.
Following this up, we examined separately French wins and losses. Linear logistic

regression, both univariate and multivariate, did not reveal any combination of variables
that did better than the null hypothesis that France always won. Instead, we used the
Geometric Mean Functional Relationship to determine a best fit, separately for both wins
and losses. The two datasets turned out to have very well separated best fits, with a
probability of this being due to chance of only $4 \times 10^{-10}$. The coefficients of determination
($R^2$) of both of these results were an unexpectedly high 71%. This implies that the sizes
of both of the armies - not just their relative size - have a large effect on battle outcome.
Most interestingly, it also implies that the French and their opponents were very different
in their ability to win with given army sizes.
Data Selection
The decision to model Napoleonic battles rather than battles of other historical periods
had several reasons. We wanted to model a series of battles occurring in a period during
which technology did not substantially evolve. Technological evolution would give an
advantage to the side with the newer technology, and we were interested in removing
technological factors involved in wins or losses. This ruled out the battles of the 20th
Century, as new weapons were constantly being introduced. Reliable data on the battles
also had to be available in order to get adequate statistics, meaning that we could not
model battles in or before the Middle Ages. We chose the Napoleonic wars not only
because they occurred during a time when most of Europe was at approximately the same
level of technology, but also because they had very well recorded battles. An additional
reason for the selection was that over 150 battles were fought, which is a large enough
number that we could hope to achieve reliable statistics.

The variables we collected were army sizes; the number of guns on each side; the
number of guns lost for each side; and the casualties on each side. Not all of these data were
available for every battle, but in general enough battles had complete records to support a
statistical analysis.
A few comments are in order concerning the data sources. Even though the
Napoleonic battles occurred relatively recently, sources of numerical data are not as
abundant as could be wished, and those have disagreements amongst themselves. The
main source of the data used was the set of Wikipedia articles on each of the battles.
While this was convenient and almost comprehensive, we discovered a number of errors,
for example, sidebars stating different numbers of combatants than the text. In these
cases, we used other sources (e.g. Bodart) to try to ascertain the correct numbers. In the
end, the errors of the types we encountered were not sufficient to significantly affect the
results [iv].
Turning now to the types of data collected and used, the army sizes varied by
more than two orders of magnitude, from only 800 French troops in the Capitulation of
Stettin to 380,000 Russian, Prussian, Austrian and Swedish troops in the Battle of Leipzig.
We did not include sieges, as these did not reflect actual fighting. We also did not include
naval engagements. In these, the troops were combined into larger units (naval vessels),
and if the ship went down, so did everyone on board. For these battles, it might have been
more appropriate to analyze numbers of vessels engaged, rather than number of soldiers.
However, we left this for a future study.
One of the issues that arose in analyzing the data was how to count the number of
troops in a battle. Some sources listed the number of troops present as being the army

size, while others listed only the number of troops actually engaged in the battle. Most of
the sources did not distinguish between the two possibilities. It appeared that this
distinction was the cause of many of the differences in estimates of army sizes in the
literature. It seemed to us that the number of soldiers merely present at the battle was
irrelevant to the outcome. We felt that the number of troops actually engaged in combat
determined the outcome. We thus used the number of troops actually engaged in fighting
as the data in our analysis.
A second issue that arose with the data set was how to mathematically deal with
what historians call a campaign. Should a campaign be counted as one battle, or be split
into individual battles? During our analysis, we tested both possibilities, and found that
the fit was better with individual battles. We thus decided our dataset would use what
historians deemed as battles, rather than amalgamating them into larger units.
A final issue was what to do with battles that were indecisive. For example, the
battle of Schoengrabern was called a Russian strategic victory, but a French tactical one.
The Russians retreated, but claimed that they did this intentionally in order to lure the
French further on into Russia. For the nine cases in which one side retreated, we listed the
tactical victor as the actual victor, since we were interested in the outcome of the battle,
not of the campaign. In the final mathematical analysis, we also checked the results when
the strategic victor, rather than the tactical victor, was recorded as the actual victor. The
results changed by less than 2%. Finally, neither side was victorious in four of the battles.
We omitted these from our analysis entirely. After dealing with these various issues, the
final data set consisted of 152 battles.

Given the nature of the available data, and the nuances of the interpretation of this
data, a few words are in order about what could reasonably be expected from a statistical
analysis of historical data. In hard sciences such as physics, a hypothesis is not
considered to be confirmed unless the correlation coefficient between the variables and
the predicted values is at least 0.975, which gives a coefficient of determination of 95%.
In areas such as medicine, where phenomena may have many unmeasured determinants,
it is typically expected that significant data should have a correlation coefficient of at
least 0.50 (coefficient of determination = 25%). In history, not only are there many
unmeasured variables, but there are also large errors in the measurements (in both the
sense that different sources disagree, as well as in not having clear definitions of the
variables themselves). Further, there is no a priori reason to suppose that this ‘noise’, or
the explanatory variables themselves, are normally distributed. We thus came into this
study with the anticipation that finding even a low correlation coefficient, say 0.25
(corresponding to a coefficient of determination of only 6%) would be historically
interesting.
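The thresholds quoted above are related by squaring: the coefficient of determination is the square of the correlation coefficient. The short sketch below only reproduces that arithmetic; the dictionary keys are labels chosen here for illustration.

```python
# Correlation coefficients quoted in the text, keyed by illustrative labels.
thresholds = {
    "physics": 0.975,
    "medicine": 0.50,
    "history (anticipated)": 0.25,
}


def coefficient_of_determination(r):
    """Return R^2 for a correlation coefficient r."""
    return r * r


for field, r in thresholds.items():
    r_squared = coefficient_of_determination(r)
    print(f"{field}: r = {r:.3f}, R^2 = {100 * r_squared:.1f}%")
```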
Preliminary Analysis
We began by analyzing the various sets of data in pairs. All the data had high correlation
coefficients with the army sizes. For example, bigger armies tended to have more horses
and more guns. Army size was thus a natural choice on which to concentrate. We
separated all the army size data into two groups, French victories and their opposition’s
victories. For an initial look at the two groups, we plotted the French army size on the
x-axis and their opposition’s army size on the y-axis; see Figure 1. Inspection of the plot
suggested drawing two ellipses that encompassed much of the data for each of the two

groups. The ellipses had notably different angles of their major axes, suggesting that
there might be a significant difference between the victories of the French and those of their
opposition.
Figure 1 Based on Army Sizes, Napoleonic Victories and Defeats Appear to Fall into Two
Roughly Separate Groups
Based on this apparent grouping of data on the scatter plot, we looked more
carefully at the army size data. In linear regression the goal is to make a best prediction
(in the sense of least squares) of the values of one variable as a function of the other(s).
We began with logistic regression, using win/lose as the value to be predicted. For the
explanatory variable, we first tried shrinking the dataset to a single variable. We tried the

difference between French and opposition army sizes; the ratio of French to opposition
army sizes; the French percentage of the total troops engaged; and the difference as a
proportion of the total. Using the null hypothesis that France won all of the battles (it
actually won 70%), none of these single-variable logistic regressions performed better
than the null hypothesis in a statistically significant manner. We measured the
improvement from the null model to the fitted model using the Nagelkerke $R^2$ statistic, as
the logistic regression is a discrete model. For the logistic models that we tried, the best
value of Nagelkerke $R^2$ obtained was only 0.2 [v].
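As an illustration of the statistic mentioned above, the following is a minimal sketch of how a Nagelkerke $R^2$ can be computed from the log-likelihoods of a null model and a fitted model. It assumes an intercept-only null model (predicting the overall win rate for every battle), which is the conventional choice and not necessarily the null used in this paper. The outcomes and fitted probabilities are made-up values for illustration, not data from the study.

```python
import math


def bernoulli_log_likelihood(outcomes, probabilities):
    """Log-likelihood of 0/1 outcomes given predicted win probabilities."""
    return sum(
        math.log(p) if won else math.log(1.0 - p)
        for won, p in zip(outcomes, probabilities)
    )


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke R^2: the Cox-Snell R^2 rescaled so that its maximum is 1."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# Hypothetical outcomes (1 = French win) for ten battles.
outcomes = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
n = len(outcomes)

# Assumed null model: every battle gets the overall win rate.
win_rate = sum(outcomes) / n
ll_null = bernoulli_log_likelihood(outcomes, [win_rate] * n)

# Hypothetical fitted probabilities from some logistic model.
fitted = [0.8 if won else 0.4 for won in outcomes]
ll_model = bernoulli_log_likelihood(outcomes, fitted)

r2_example = nagelkerke_r2(ll_null, ll_model, n)
print(f"Nagelkerke R^2 = {r2_example:.3f}")
```

A value of 0 means the fitted model is no better than the null model, and a value of 1 corresponds to a model that predicts every outcome with certainty.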
We next tried multi-variable logistic regressions, using two and then three of the
variables. Again, the improvement from the null hypothesis was not statistically
significant. We also considered a standard linear regression of one army size against the
other for wins and losses separately, but this also had problems. In this dataset, the ratio
of largest to smallest army size was very large. Standard linear regression gave an
unreasonable bias towards the largest values of the data, poorly representing the small-
valued data.
Given these poor results with standard analyses, we looked instead for an
alternative method with which to characterize statistically the battle outcomes. We chose
a statistical method that treats both variables on an equal footing, the Geometric Mean
Functional Relationship (GMFR). This is called a functional or structural relationship, as
opposed to a regression. GMFR is scale-invariant, and allows both variables to be treated
symmetrically and to have measurement error without any a priori assumptions about the
distribution. In our case, we were not trying to predict either one of the two variables

(army sizes). What we were attempting to do was to model the relationship between the
variables, and GMFR provided a suitable method to do so [vi].
The Geometric Mean Functional Relationship (GMFR)
The approach we adopted goes by various names including the geometric mean
functional relationship and the reduced major axis. One of its advantages over other
approaches is that it is scale invariant. This means that changing the units of
measurement leads to an equation that is equivalent to the original equation. It also treats
all variables on the same basis – there is no need to distinguish between a dependent and
an independent variable, which would lead to two different models under ordinary least
squares according to the choice of dependent variable [vii].
In normal regression, the regression of y on x minimizes the sum of squared deviations in
the vertical direction, and the regression of x on y does the same in the horizontal
direction. The first of these regression lines has slope $r\,\sigma_y/\sigma_x$, where $r$ is the correlation
in the data and $\sigma$ refers to the standard deviation. The second of these regression lines has
slope $(1/r)\,\sigma_y/\sigma_x$.
As the correlation $r$ approaches unity, these two slopes approach the same value $\sigma_y/\sigma_x$,
which is precisely the slope given by the approach we are using. This slope is seen to
equate to the geometric mean of the two regression slopes and so the resulting line is
called the geometric mean functional relationship. All three lines always pass through the
point of means of the data, known as the centroid. This fact together with the slope value
enables us to fit the line to the data and deduce its equation.
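The construction described above can be sketched in a few lines. The function name `gmfr_fit` and the sample army sizes are assumptions made for illustration; they are not the paper's code or data. The sketch takes the slope as the ratio of standard deviations (with the sign of the correlation), passes the line through the centroid, and then checks that the slope equals the geometric mean of the two ordinary least squares slopes.

```python
import numpy as np


def gmfr_fit(x, y):
    """Fit a line by the geometric mean functional relationship.

    Returns (slope, intercept, r), where the slope is sign(r) * s_y / s_x
    and the line passes through the centroid (mean of x, mean of y).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    slope = np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    return slope, intercept, r


# Made-up army sizes (thousands of troops), for illustration only.
french = np.array([8.0, 15.0, 22.0, 40.0, 65.0])
opposition = np.array([10.0, 14.0, 27.0, 41.0, 75.0])

slope, intercept, r = gmfr_fit(french, opposition)

# The two ordinary regression slopes, both expressed as dy/dx.
ratio = np.std(opposition, ddof=1) / np.std(french, ddof=1)
slope_y_on_x = r * ratio
slope_x_on_y = ratio / r
geometric_mean = np.sqrt(slope_y_on_x * slope_x_on_y)

print(f"GMFR slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"geometric mean of the OLS slopes = {geometric_mean:.4f}")
```

Because the slope depends only on the two standard deviations, rescaling either variable rescales the slope by the same factor, which is the scale invariance noted earlier.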

It is worth noting that the alternative name, reduced major axis, arises
from the fact that for data with normally distributed errors in both variables, the
calculation leads to a distribution centered on the centroid, surrounded by contours of
equal probability in the shape of ellipses. The GMFR provides the correct theoretical line
in such instances [viii].
Mathematical Analysis
We separated the data into two sets, French wins and opposition wins. We first noted that
the coefficients of determination were very similar, and both were unexpectedly high: for the
French victories $R^2 = 0.712$, while for opposition victories $R^2 = 0.715$. Using GMFR, we
found the slope and intercept of the best fit for each; see Figures 2 & 3 respectively. The
intercepts were small (for the French, -4300 out of a mean of 30,000 troops = -14%; for
the opposition, +4600 out of a mean of 35,300 troops = +13%). However, the slopes were
very different: 1.83 for opposition wins vs. 1.09 for French wins. Plotting all of the data,
along with both best fits, on a single plot (see Figure 4), we see how the GMFR best fit
quantifies our original notion of the ellipses’ major axes pointing in different directions.
Given that there was considerable spread in the data, we investigated the
possibility that the difference in slopes was a chance effect. A hypothesis test was carried
out where the null hypothesis was that the GMFR slopes were the same. Details of the
GMFR two-sample test calculations can be found in (Clarke), where it is assumed that
errors in measurement (in this case, of army size) are normally distributed. The result was
that the probability of obtaining such a difference by chance, assuming the slopes were the
same, is $4 \times 10^{-10}$. Hence, we could safely reject the null hypothesis, and concluded that the
difference in slopes was highly statistically significant [ix].
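For readers who want to experiment with a comparison of this kind, the sketch below uses a label-permutation test. This is a different, assumption-light technique swapped in for illustration; it is not the two-sample test of Clarke used in this paper, and strictly it tests whether the win/loss labels are exchangeable rather than slope equality alone. The function names and the synthetic data (generated around slopes of 1.09 and 1.83) are assumptions, not the study's data.

```python
import numpy as np


def gmfr_slope(x, y):
    """GMFR slope: sign of the correlation times s_y / s_x."""
    r = np.corrcoef(x, y)[0, 1]
    return np.sign(r) * np.std(y, ddof=1) / np.std(x, ddof=1)


def slope_permutation_test(x, y, is_win, n_perm=2000, seed=0):
    """Permutation p-value for the absolute difference in GMFR slopes."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    is_win = np.asarray(is_win, dtype=bool)

    def slope_gap(mask):
        return abs(gmfr_slope(x[mask], y[mask]) - gmfr_slope(x[~mask], y[~mask]))

    observed = slope_gap(is_win)
    # Count shuffled labelings whose slope gap is at least as large.
    exceed = sum(slope_gap(rng.permutation(is_win)) >= observed
                 for _ in range(n_perm))
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic battles: 100 "wins" near slope 1.09, 40 "losses" near slope 1.83.
rng = np.random.default_rng(1)
x_win = rng.uniform(5_000, 80_000, size=100)
x_loss = rng.uniform(5_000, 80_000, size=40)
x_all = np.concatenate([x_win, x_loss])
y_all = np.concatenate([
    1.09 * x_win + rng.normal(0, 5_000, size=100),
    1.83 * x_loss + rng.normal(0, 5_000, size=40),
])
labels = np.concatenate([np.ones(100, dtype=bool), np.zeros(40, dtype=bool)])

observed_gap, p_value = slope_permutation_test(x_all, y_all, labels)
print(f"observed slope gap = {observed_gap:.3f}, permutation p = {p_value:.4f}")
```

The smallest p-value such a test can report is 1/(n_perm + 1), so it cannot resolve probabilities as small as the one quoted above without a very large number of permutations.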

Historical Analysis
We attempt here to give some historical interpretation to the results of our
modeling. Looking at the slopes, the values mean that when the French won battles, their
opposition actually had on average 9% more troops than they did! On average, when the
French lost it was because their army was only 55% (=1/1.83) the size of their opposition.
That is, the French typically lost when they were very seriously out-numbered, almost 2
to 1. This suggests that the best representation of the effect of the French army on the
battlefield is not as having more soldiers than the opposition. (Wellington famously stated
that the effect of having Napoleon on the battlefield was equivalent to the French having
an additional 40,000 troops (Stanhope).) Rather, the best representation of the French
army is as an increased probability of winning a battle, given the particular actual French
and opposition army sizes [x].
We found additional support for these results in a model-independent way. For
opposition victories, the French army had an average size of 35,320, vs. 56,119 for the
victors. On the other hand, when the French were victorious their average army size
was 29,748, vs. 27,982 for the opposition. In good approximate agreement with our
model’s results, using averages shows that the French typically won with an advantage of
just 6%, whereas their opposition typically needed superiority of 59% to achieve their
victories.
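The percentages quoted in this section follow directly from the fitted slopes and the mean army sizes given above. The sketch below reproduces that arithmetic; the variable names are chosen here for illustration.

```python
# Slope for opposition victories, as quoted in the text.
opposition_win_slope = 1.83

# French army size as a fraction of the opposition's when the French lost.
french_share_when_losing = 1 / opposition_win_slope

# Mean army sizes quoted in the text.
french_mean_in_wins, opposition_mean_in_wins = 29_748, 27_982
french_mean_in_losses, opposition_mean_in_losses = 35_320, 56_119

# Relative numerical advantage of the victor, computed from the means.
french_advantage = french_mean_in_wins / opposition_mean_in_wins - 1
opposition_advantage = opposition_mean_in_losses / french_mean_in_losses - 1

print(f"French size relative to opposition in losses: {french_share_when_losing:.1%}")
print(f"French advantage in French wins: {french_advantage:.1%}")
print(f"Opposition advantage in opposition wins: {opposition_advantage:.1%}")
```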

Figure 2 Best fit line for French victories

Figure 3 Best fit line for opposition victories

Figure 4 Best fit lines for Napoleonic battles, both wins and losses
Conclusions
We have examined quantitatively the difference between Napoleonic French
victories and losses in battle. Using the Geometric Mean Functional Relationship, we
analyzed the sizes of both of the opposing armies in these battles, for wins and losses
separately. We found the best-fit linear models for each of these two data sets, and
determined that the two fits differ with high statistical significance. The French won
even though typically outnumbered by 9%, whereas their opposition typically won only
when they outnumbered the French by 83%. The coefficients of determination for both of

these results were 71%, implying that our best fits model the data very well. We conclude
that absolute sizes of armies, and not just their relative size, are important factors in
determining the result of a battle, and that the Napoleonic French and their opponents
were very different in their ability to win with given army sizes. We plan to determine if
similar differences exist for conflicts in other historical periods.
REFERENCES
[i] Brauer, Jurgen and van Tuyll, Hubert, Chicago 2008, Castles, Battles & Bombs: How Economics Explains
Military History; Dupuy, Col. Trevor Nevitt, Fairfax 1979, Numbers, Predictions & War: Using History to
Evaluate Combat Factors and Predict the Outcome of Battles.
[ii] Kagan, Frederick, Philadelphia 2006, The End of the Old Order: Napoleon and Europe, 1801-1805 (v. 1);
Hughes, Michael, New York 2012, Forging Napoleon’s Grande Armee: Motivation, Military Culture, and
Masculinity in the French Army, 1800-1808; van Creveld, Martin, Cambridge 1987, Command in War;
Wilkinson-Latham, Robert, Oxford 1975, Napoleon’s Artillery; Carlyle, Thomas, On Heroes and
Hero-Worship and the Heroic in History, London 1840.
[iii] Platt, Piers, 2014, From the Arquebus to the Breechloader: How Firearms Transformed Early Infantry
Tactics, Google eBook; Tolstoy, Leo, 1869, translated by Garnett, Constance and Wilson, A. N.,
republished New York 2004, War and Peace.
[iv] Bodart, Gaston and Kellogg, Vernon Lyman, Oxford 1916, Losses of Life in Modern Wars,
Austria-Hungary: France, Carnegie Endowment for International Peace Division of Economics and History.
[v] Nagelkerke, N. J. D., 1991, A Note on a General Definition of the Coefficient of Determination,
Biometrika, LXXVIII(3), 691-2.
[vi] Tofallis, Chris, 2002, Model fitting for Multiple Variables by Minimising the Geometric Mean Deviation,
from Total least squares and errors-in-variables modeling: algorithms, analysis and applications, Eds. S.
Van Huffel and P. Lemmerling, Kluwer Academic
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1077322
[vii] Barker, F., Soh, Y. C., and Evans, R. J., 1988, Properties of the Geometric Mean Functional
Relationship, Biometrics 44(1), 279-281.
[viii] Webster, R, 1989, Is regression what you really want?, Soil Use and Management, 5(2), 47-53.
[ix] Clarke, MRB. 1980, The reduced major axis of a bivariate sample. Biometrika, 67(2), 441-6.
[x] Stanhope, Earl Philip Henry Stanhope and Wellesley, Arthur, Duke of Wellington, London 1888, Notes
of Conversations with the Duke of Wellington, 1831-1851.
