Why Euler's Number Is Just The Best Quanta Magazine PDF


INSIGHTS PUZZLE

Why e, the Transcendental Math Constant, Is Just the Best
By Pradeep Mutalik

November 24, 2021

The solution to our puzzle about Euler’s
number explains why e pops up in
situations that involve optimality.


James Round for Quanta Magazine

Last month, we presented three puzzles
that seemed ordinary enough but
contained a numerical twist. Hidden below the
surface was the mysterious transcendental
number e. Most familiar as the base of natural
logarithms, Euler’s number e is a universal
constant with an infinite decimal expansion
that begins with 2.7 1828 1828 45 90 45…
(spaces added to highlight the quasi-pattern in
the first 15 digits after the decimal point). But
why, in our puzzles, does it seemingly appear
out of nowhere?

Before we attempt to answer this question, we
need to learn a little more about e’s properties
and aliases. Like its transcendental cousin π, e
can be represented in countless ways — as the
sum of infinite series, an infinite product, a
limit of infinite sequences, an amazingly
regular continued fraction, and so on.

I still remember my first introduction to e. We
were studying common logarithms in school,
and I marveled at their ability to turn
complicated multiplication problems into
simple addition just by representing all
numbers as fractional powers of 10. How, I
wondered, were fractional and irrational
powers calculated? It is, of course, easy to
calculate integer powers such as $10^2$ and $10^3$,
and in a pinch you could even calculate $10^{2.5}$ by
finding the square root of $10^5$. But how did they
figure out, as the log table asserted, that 20
was $10^{1.30103}$? How could a complete table of
logarithms of all numbers be constructed from
scratch? I just couldn’t imagine how that could
be done.


Later I learned about the magic formula that
enables this feat. It gives a hint of where the
“natural” in “natural logarithms” came from:

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots.$$

For negative powers, alternate terms are
negative as expected:

$$e^{-x} = 1 - \frac{x}{1!} + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \frac{x^5}{5!} + \cdots.$$

These powerful formulas enable the
calculation of any power of the mysterious e
for any real number, integer or fraction from
negative infinity to infinity, to any desired
precision. They allow the construction of a
complete table of natural logarithms and, from
that, common logarithms, from scratch.

The special case of this formula for x = 1 gives
this famous representation of e:

$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} + \cdots.$$
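As a rough illustration of how the series above can be used in practice, here is a minimal Python sketch. The function name `exp_series` and the choice of 30 terms are assumptions made for this example, not part of the original column; 30 terms is ample for small x but not for large ones.

```python
import math

def exp_series(x, terms=30):
    """Approximate e**x by summing the first `terms` terms of the power series."""
    total = 0.0
    term = 1.0  # the k = 0 term, x**0 / 0!
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # turn x**k / k! into x**(k+1) / (k+1)!
    return total

print(exp_series(1.0), math.e)          # series value next to the library constant
print(exp_series(-1.0), math.exp(-1))   # negative power: the signs alternate
# A common logarithm follows from natural logarithms: log10(20) = ln(20) / ln(10)
print(math.log(20) / math.log(10))
```

Each term is built from the previous one, so no factorial or large power is ever computed directly.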

In addition, e has many amazing properties,
some of which we’ll uncover in the solutions to
our problems. But the one property that goes to
the essence of e and makes it so natural for
logarithms and situations of exponential
growth and decay is this:

$$\frac{d}{dx} e^x = e^x.$$

This says that the rate of change of $e^x$ is equal
to its value at all points. When x represents
time, it signifies a rate of growth (or decay, for
negative x) that is equal to the size or quantity
that has accumulated thus far. There are
myriad phenomena in the real world that do
exactly this for stretches of time, and we know
them as examples of exponential growth or
decay. But, utility apart, there is an element of
aesthetic perfection and naturalness in this
property of e that can truly inspire wonder. It
even carries a moral lesson; I like to think of it
as a Zen-like function that, in its quest for
growth, is always in perfect balance, never
reaching out for more or less than what it has
earned.

A word of warning: In the puzzle solutions
below, we will get into math that’s a bit more
advanced and formidable-looking than is
normal for this puzzle column. Don’t worry if
the equations make your eyes glaze over; just
try to follow the general argument and
concepts. My hope is that anyone can come
away with some insight, however hazy, about
how and why e appears in our puzzles. In the
BBC TV series The Ascent of Man, Jacob
Bronowski said of John von Neumann’s
mathematical writing that it is important when
reading math to follow the tune of the
conceptual argument — the equations are
merely the “orchestration down in the bass.”

Now let us try and track down how e appears in
our puzzles.

Puzzle 1: Partition

Let’s take any number, such as 10. Divide it
into some number of identical pieces, such
as two 5s, and multiply them together: 5 ×
5 = 25. Now, we could have divided 10 into
three, four, five or six identical pieces and
done the same. Here’s what happens to our
product when we do so:

2 pieces: 5 × 5 = 25
3 pieces: 3.33 × 3.33 × 3.33 = 37.04
4 pieces: 2.5 × 2.5 × 2.5 × 2.5 = 39.06
5 pieces: 2 × 2 × 2 × 2 × 2 = 32
6 pieces: 1.67 × 1.67 × 1.67 × 1.67 × 1.67 × 1.67 = 21.43

You can see that the product increases,
reaches what seems to be a maximum and
then starts decreasing. Try doing the same
with some other numbers such as 20 and
30. You’ll notice that the same thing
happens in every case. This has nothing to
do with the numbers themselves but is
caused by a unique property of the
number e.

a. See if you can figure out when the
product reaches a maximum for a given
number and what this has to do with e.

As I mentioned in the hint for 1a, the product
reaches a maximum when the value of each
piece is closest to e. To be more accurate, the
two highest products will be obtained when the
values of the pieces lie on either side of e. For
the small, everyday-size numbers we are
considering here, the highest value is obtained
for the piece whose difference from e is the
smallest.
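The claim is easy to check numerically. The sketch below is an illustration only: `best_piece_count` is a hypothetical helper name, and it assumes the number is always split into at least two pieces. It compares logarithms of the products, k ln(n/k), which rank the same way as the products themselves.

```python
import math

def best_piece_count(n):
    """Return the number of equal pieces (at least 2) that maximizes the product."""
    # Compare logarithms of the products, k * ln(n / k), to keep the numbers small.
    return max(range(2, 2 * n + 1), key=lambda k: k * math.log(n / k))

for n in (10, 20, 30):
    k = best_piece_count(n)
    # Columns: the number, the best piece count, the piece value, the product
    print(n, k, round(n / k, 3), round((n / k) ** k, 2))
```

In each case the winning piece value should land close to e ≈ 2.718.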

b. For the number 10, the largest product
(39.06) is about 5.5% larger than the next
largest (37.04). Without calculating the
actual difference, can you guess which
number less than 100 has the smallest
percentage difference between the largest
product and the next largest? Why should
this be?

From above, it is easy to see that two products
will be closest together when the values of the
two adjacent pieces are almost equidistant
from e, one lower than e and the other higher.
(This is strictly true only if the function is
symmetric around e, which it is not, but in this
range it is close enough, as Michel Nizette
explained excellently.) If the original number is
N, this will tend to happen when the fractional
part of the ratio $\frac{N}{e}$ is close to 0.5, that is,
when $\frac{N}{e}$ lies close to the midpoint between two
integers. So if you construct a table of $\frac{N}{e}$ for N
up to 100 and look for the fractional part
closest to 0.5, you will get the required integer:
53. Dividing 53 by e gives 19.4976 and a
difference of only 0.0013% for the products
yielded by 19 and 20 pieces.
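For readers who would rather compute than guess, here is a brute-force sketch of the search. The helper names are invented for this example, and it assumes whole numbers from 1 to 99, each split into at least two pieces. Rather than tabulating N/e, it measures the gap between the two largest products directly.

```python
import math

def top_two_gap_percent(n):
    """Percentage by which the largest product exceeds the runner-up for n."""
    # Work with logarithms of the products, k * ln(n / k), for k >= 2 pieces.
    logs = sorted((k * math.log(n / k) for k in range(2, n + 3)), reverse=True)
    return (math.exp(logs[0] - logs[1]) - 1) * 100

def smallest_gap_number(limit=100):
    """Whole number below `limit` with the smallest gap between its top two products."""
    return min(range(1, limit), key=top_two_gap_percent)

n = smallest_gap_number(100)
print(n, n / math.e, top_two_gap_percent(n))
```

Allowing a single "piece" would change the answer, because 4 as one piece and 4 as two 2s both give exactly 4, which is why the sketch starts at two pieces.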

c. Can you explain why e arises in this
apparently simple problem?

As explained by readers Lazar Ilic, Ashok
Khatri, Alan Olson, Kurt Godel, TG, Atul Kumar
and Michel Nizette, the answer involves some
elementary calculus: specifically, you need to
find the maximum of a function by setting its
derivative to zero. If we divide n into x pieces,
the value of each piece is $\frac{n}{x}$ and our
function is $\left(\frac{n}{x}\right)^x$. The logarithm of the
function is $x \ln\frac{n}{x}$, and we can maximize that
instead, which is somewhat easier. If you can
solve this manually, kudos to you! But if doing
calculus is not your cup of tea, you can just type
“d/dx(x ln n/x) = 0” into Wolfram Alpha. The
derivative is $\ln\frac{n}{x} - 1$; setting it to zero
gives $\ln\frac{n}{x} = 1$, and out pops
the solution $x = \frac{n}{e}$ for positive n and x. Thus,
the value of the optimal piece is $\frac{n}{x} = e$. Voila!
That’s how e arises and gives the maximum
product.

This teaches us that e has a property of
optimality. It can pop up in situations that
involve finding a maximum or minimum, as we
will see in puzzle 2. The most basic version of
this property of e is seen if you calculate the
value of the function $x^{1/x}$ for all positive real
numbers (this is known as Steiner’s
problem). Of all the infinitely many real numbers, the x
that yields the highest value for this function is
e. Maximizing $x^{1/x}$ is equivalent to maximizing
its logarithm, $\frac{\ln x}{x}$, whose derivative
$\frac{1 - \ln x}{x^2}$ is zero at ln x = 1,
that is, when x = e.
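Steiner’s problem can also be checked numerically without calculus. The following is a minimal sketch using a ternary search, which is a generic technique chosen for this example rather than anything from the column. The search interval [1, 10] and the function name are assumptions.

```python
import math

def argmax_unimodal(f, lo, hi, iterations=200):
    """Ternary search for the maximizer of a function with a single peak on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1  # the peak cannot lie to the left of m1
        else:
            hi = m2  # the peak cannot lie to the right of m2
    return (lo + hi) / 2

x_star = argmax_unimodal(lambda x: x ** (1 / x), 1.0, 10.0)
print(x_star, math.e)
```

Because the curve is very flat near its peak, floating-point rounding limits how precisely the search can pin down the maximizer, so expect agreement with e to several decimal places rather than to full machine precision.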

Puzzle 2: Union

As readers pointed out, this was a restatement
of the well-known secretary problem. The
essential points are summarized below.

An heir has to choose the best of 10
potential candidate spouses under the
following rules. The candidates are
interviewed one after another, and either
accepted (if suspected to be the best) or
rejected before the next candidate is
considered. A rejected candidate cannot be
recalled, and once a candidate is accepted
the process stops. The last candidate must
be accepted by default if the process has
not ended by then.

a. How can the heir maximize his chances
of choosing the best candidate, assuming
there are no ties?

This situation requires the heir to reject a
specific number of candidates unconditionally
(the “rejection” phase) followed by a
“selection phase” in which he selects the first
one among the remaining candidates who
ranks higher than all of the previously rejected
ones. The chances of choosing the best
candidate are maximized when the rejection
phase is a specific length. The probability falls
off if the rejection phase is longer (the best
candidate may be more likely to be rejected) or
shorter (he doesn’t have enough experience to
rank the candidates properly, resulting in
acceptance of lower-ranked candidates).

This is known as an “optimal stopping”
problem, and e appears in its solution because
of its property of optimality. For a large
number of candidates n, the optimal number of
candidates rejected initially should be equal to
n divided by e.

Here are the probability calculations for n = 10
if the rejection phase (r) = 3. First, note that the
best candidate could turn up at any point in the
series of 10 interviews, with a 1-in-10
probability ($\frac{1}{n}$) of being in any particular
position. For each interviewee’s position (i),
we multiply this $\frac{1}{10}$ by the probability that the
best candidate will be selected at that position.
Then we sum the probabilities for all the
positions and build our general expression.

• If the best candidate is at positions 1 to 3,
they will be automatically rejected.
Probability (p) of selecting the best
candidate = $\frac{1}{10} \times 0 = 0$.

• If the best candidate is at position 4, they
will always be selected. Probability p of this
outcome = $\frac{1}{10} \times 1 = \frac{1}{10} = \frac{1}{n}$.

• At position 5, the candidate will be selected
if the previous highest-ranking candidate
is in positions 1 to 3, but not 4. So
p = $\frac{1}{10} \times \frac{3}{4} = \frac{1}{n} \times \frac{r}{i-1} = \frac{r}{n} \times \frac{1}{i-1}$.

• At position 6, the candidate is selected if
the previous highest-ranking candidate is
in positions 1 to 3, but not 4 or 5. So
p = $\frac{1}{10} \times \frac{3}{5} = \frac{1}{n} \times \frac{r}{i-1} = \frac{r}{n} \times \frac{1}{i-1}$.

• …

• At position 10,
p = $\frac{1}{10} \times \frac{3}{9} = \frac{1}{n} \times \frac{r}{i-1} = \frac{r}{n} \times \frac{1}{i-1}$.

As you can see, we get the same expression for
each position in the selection phase:
$\frac{r}{n} \times \frac{1}{i-1}$.
(The expression for position 4 can be written
as $\frac{1}{10} \times \frac{3}{3} = \frac{r}{n} \times \frac{1}{i-1}$ to fit the pattern.)

Taking $\frac{r}{n}$ out of the summation sign, this sum
(the probability of finding the best candidate)
can be written:

$$P(r) = \frac{r}{n} \sum_{i=r+1}^{n} \frac{1}{i-1}.$$

For n = 10 and r = 3, this sum evaluates to about 39.9%.
The corresponding probability of finding the
best candidate for r = 4 initial candidates
rejected is 39.8%. These are the two highest
values, and the probability falls away for
greater or fewer initial rejections. Does this
remind you of the partition puzzle? That’s no
coincidence, as we’ll see below.
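The position-by-position reasoning above can be tabulated in a few lines of Python. This is a sketch under stated assumptions: the function names are invented for this example, the formula is the sum of (r/n) × 1/(i−1) over the selection-phase positions i = r+1 to n, and r = 0 is taken to mean that the first candidate is always accepted. A brute-force count over every interview order of a smaller group is included as a sanity check.

```python
from fractions import Fraction
from itertools import permutations

def success_probability(n, r):
    """Exact chance of picking the best of n candidates after rejecting the first r."""
    if r == 0:
        return Fraction(1, n)  # no rejection phase: the first candidate is taken
    return Fraction(r, n) * sum(Fraction(1, i - 1) for i in range(r + 1, n + 1))

def brute_force(n, r):
    """Same probability, found by trying every possible interview order."""
    wins = total = 0
    for order in permutations(range(n)):  # a higher number means a better candidate
        total += 1
        benchmark = max(order[:r], default=-1)
        chosen = order[-1]  # the last candidate is accepted by default
        for candidate in order[r:]:
            if candidate > benchmark:
                chosen = candidate
                break
        wins += chosen == n - 1
    return Fraction(wins, total)

for r in range(10):
    print(r, round(float(success_probability(10, r)), 4))
print(brute_force(6, 2) == success_probability(6, 2))
```

Exact fractions are used so that the comparison between the formula and the brute-force count is not blurred by rounding.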

b. How do the heir’s chances change if
there is a 10% chance of a tie for first place?

Since the heir now has two equally ranked
candidates 10% of the time, the chances of
finding the best candidate increase.

c. This problem is a classical one whose
solution has something to do with e. Can
you explain how e enters the picture?

It turns out e enters the picture twice in this
problem! As the number n becomes large,
Euler’s number appears both in the probability
of finding the best choice and in the proportion
of initial cases to reject.

The probability expression we derived above
can be represented, as n grows to infinity, by
an integral by substituting x for the limit of $\frac{r}{n}$
(the rejection fraction), p for $\frac{i-1}{n}$ (the
incremental probability at each n) and dp for $\frac{1}{n}$
(the rate of change from one integer to the
next). This gives the limit of the probability as:

$$x \int_x^1 \frac{1}{p}\,dp = -x \ln x.$$

Again, the expression on the right-hand side is
similar to the ones we looked at in the partition
problem. Setting the derivative to zero, we find
that both the optimal fraction of candidates
that should be rejected and the probability of
making the best choice are $\frac{1}{e}$, or about 36.8%.
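To see the limit emerge, one can evaluate the finite sum for increasingly large n and watch both the best rejection fraction and the success probability drift toward 1/e. The sketch below is illustrative only. The function name is an assumption, and the formula used is (r/n) times the sum of 1/(i−1) for i from r+1 to n.

```python
import math

def best_rejection_count(n):
    """Return (r, probability) maximizing (r/n) * sum of 1/(i-1) for i = r+1..n."""
    best_r, best_p = 0, 1.0 / n   # r = 0: the first candidate is always accepted
    tail = 0.0                    # running value of the sum for the current r
    for r in range(n - 1, 0, -1):
        tail += 1.0 / r           # adds the term for position i = r + 1
        p = r / n * tail
        if p > best_p:
            best_r, best_p = r, p
    return best_r, best_p

for n in (10, 100, 1000, 10000):
    r, p = best_rejection_count(n)
    print(n, r, round(r / n, 4), round(p, 4), round(1 / math.e, 4))
```

Walking r downward lets each step reuse the previous partial sum, so the whole scan is linear in n.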

d. How can the heir achieve the highest
expected rank of his chosen candidate in
this more practical choosing scenario?

In the classical scenario above, the heir adopts
an all-or-nothing strategy by rejecting the
first few candidates and then selecting the first
one that betters all the rejected ones. While this
does maximize the probability of finding the
best candidate, it may also result in his being
stuck with a low-ranked candidate if the best
candidate was among the initial rejects. To
avoid this, his best practical strategy is to be
very picky to begin with and look for the very
best candidates, and then reduce pickiness by
settling for merely good candidates as the
number of candidates runs out.

This strategy can be made precise by starting
backward from when only the last candidate is
left. This person could slot in anywhere in the
ranking with equal probability. So the
expectation is that their final rank will be
average (5.5 in this case). So you should accept
the second-to-last candidate if they would be
in your top five up to this point, even though
they may not be the absolute best. When two
candidates are left, the expectation for the
higher-ranked one of the two will be at about
3.67, so for the third-to-last candidate you
should be satisfied with a candidate who is
within the top three. By calculating exactly how
picky you need to be at every stage, you have a
better chance of getting a very good candidate
than by using the classical algorithm.
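The two expectations quoted above, 5.5 and about 3.67, can be reproduced with a small enumeration. This sketch makes a simplifying assumption that is not spelled out in the column: the candidates still to come are treated as a uniformly random subset of the overall ranks 1 to 10, with rank 1 the best. The function name is hypothetical, and the code only checks these two thresholds; it does not implement the full backward-induction strategy.

```python
from fractions import Fraction
from itertools import combinations

def expected_best_rank(n, k):
    """Average best (lowest) rank among k candidates drawn at random from ranks 1..n."""
    subsets = list(combinations(range(1, n + 1), k))
    return Fraction(sum(min(s) for s in subsets), len(subsets))

print(float(expected_best_rank(10, 1)))  # one candidate left
print(float(expected_best_rank(10, 2)))  # two candidates left
```

Under this assumption the averages come out to (n + 1)/(k + 1) for k remaining candidates among n.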

Puzzle 3: Togetherness

A large auditorium is staging a show that
admits couples only. When a couple enters
the auditorium, they pick at random a pair
of seats next to each other. Each new
couple does the same, and in many cases
this results in empty single seats between
couples. The seating continues until only
single seats are left. Then the auditorium is
declared full, and the show starts.

a. What proportion of seats are expected to
be left unfilled when the seating is
stopped?

The answer, as the number of seats increases,
approaches $\frac{1}{e^2}$, or about 13.5%.

b. How does e enter this theater of
togetherness?

Let’s see what happens when there are a small
number of seats labeled alphabetically. Let’s
call the numbers of expected empty seats $E_1$,
$E_2$, $E_3$, $E_4$, and so on, where the subscripts are
the number of empty chairs in a row.

• A single seat is empty: A couple cannot
occupy it, so $E_1 = 1$.

• For two seats, there is no empty seat, so $E_2$
is 0.

• For three seats, the couple can occupy AB
or BC, leaving one seat empty in each case,
so $E_3 = 1$.

• For four seats, a couple may occupy BC,
leaving two empty seats, or they can
occupy AB or CD, leaving none. This gives
three configurations yielding a total of two
empty seats, giving an average expectation
of $E_4 = \frac{2}{3}$.
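The small cases above follow a pattern that can be turned directly into code: the first couple picks one of the adjacent pairs at random, which splits the row into two shorter rows that then fill up independently. The sketch below is an illustration under that reading of the puzzle. The function name and the convention that a row of zero seats has zero empty seats are assumptions.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def expected_empty(n):
    """Expected number of seats left empty in a row of n seats."""
    if n < 2:
        return Fraction(n)  # 0 seats: none empty; 1 seat: it stays empty
    # The first couple picks one of the n - 1 adjacent pairs uniformly at random,
    # leaving independent runs of j seats and n - 2 - j seats on either side.
    total = sum(expected_empty(j) + expected_empty(n - 2 - j) for j in range(n - 1))
    return total / (n - 1)

print([str(expected_empty(n)) for n in range(1, 9)])
```

Exact fractions keep the small cases recognizable, so they can be compared by eye with the values worked out by hand.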

By playing with the relationships between
these and the next few numbers, Lazar Ilic and
Michel Nizette were able to derive a recurrence
relation that allowed them to predict the
unfilled seats for the current number of seats
(n) using previous results for n – 1 and n – 2
seats. The formula for the recurrence relation
is (for n ≥ 2, taking $E_0 = 0$):

$$(n - 1)E_n = 2E_{n-2} + (n - 2)E_{n-1}.$$

The sequence goes: 1, 0, 1, $\frac{2}{3}$, 1, $\frac{16}{15}$, $\frac{11}{9}$, $\frac{142}{105}$, …

These numbers divided by the number of seats
give the proportion of unfilled seats, which
Nizette calculated as being 16.24% for 10 seats,
13.804% for 100, 13.561% for 1,000, and
13.538% for 6,000. You can see that the
numbers are approaching $\frac{1}{e^2}$, or 13.5335…%. But
how can we know that’s exactly where they are
headed for huge numbers for which the
relation will take too long to calculate?
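The proportions quoted above can be reproduced with a short loop. This sketch assumes the recurrence in the form (n − 1)E_n = 2E_{n−2} + (n − 2)E_{n−1}, which is the form that reproduces the listed sequence 1, 0, 1, 2/3, 1, 16/15, …, together with the starting values E_0 = 0 and E_1 = 1. The function name is made up for this example.

```python
from fractions import Fraction
import math

def empty_seat_fraction(n):
    """Expected proportion of empty seats for n >= 1 seats, via the two-term recurrence."""
    prev2, prev1 = Fraction(0), Fraction(1)  # E_0 and E_1
    for m in range(2, n + 1):
        # (m - 1) * E_m = 2 * E_(m-2) + (m - 2) * E_(m-1)
        prev2, prev1 = prev1, (2 * prev2 + (m - 2) * prev1) / (m - 1)
    return float(prev1) / n

for n in (10, 100, 1000):
    print(n, round(100 * empty_seat_fraction(n), 3))
print(round(100 * math.exp(-2), 4))  # 1/e^2 as a percentage, for comparison
```

Exact fractions are used inside the loop so that rounding does not accumulate over many steps. The conversion to a decimal happens only at the end.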

A recurrence relation is good, but it is like
