GermanTankProblem Talk Hampshire2019
Introduction
Introduction Mathematical Preliminaries GTP: Original GTP: New Regression Implementation Issues References Appendix
Approaches
Importance
Mathematical Preliminaries
Basic Combinatorics
$\binom{n}{k} = \frac{n!}{k!(n-k)!}$: Number of ways to choose $k$ items from $n$ items when order doesn’t matter.
Binomial Theorem:
\[
(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}.
\]
Pascal’s Identity
Pascal’s identity:
\[
\binom{n+1}{k} = \binom{n}{k} + \binom{n}{k-1}.
\]
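As a small illustration (not part of the original slides), Pascal's identity is enough to build the rows of Pascal's triangle without computing any factorials. The function name `pascal_rows` is hypothetical.

```python
def pascal_rows(num_rows: int) -> list[list[int]]:
    """Return rows 0 .. num_rows-1 of Pascal's triangle.

    Interior entries come from Pascal's identity
    C(n+1, k) = C(n, k) + C(n, k-1); edge entries are 1.
    """
    rows: list[list[int]] = []
    for n in range(num_rows):
        if n == 0:
            rows.append([1])
            continue
        prev = rows[-1]
        # Each interior entry is the sum of the two entries above it.
        row = [1] + [prev[k] + prev[k - 1] for k in range(1, n)] + [1]
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in pascal_rows(6):
        print(row)
```

Row $n$ of the result lists $\binom{n}{0}, \dots, \binom{n}{n}$.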
Pascal’s Triangle
https://www.youtube.com/watch?v=tt4_4YajqRM
Needed Identity
\[
\sum_{m=k}^{N} \binom{m}{k} = \binom{N+1}{k+1}.
\]
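The identity is easy to spot-check numerically. The sketch below uses Python's `math.comb`; the helper names `column_sum` and `closed_form` are hypothetical, and a finite check like this is evidence rather than a proof.

```python
from math import comb


def column_sum(N: int, k: int) -> int:
    """Left-hand side: sum of C(m, k) for m = k, ..., N."""
    return sum(comb(m, k) for m in range(k, N + 1))


def closed_form(N: int, k: int) -> int:
    """Right-hand side: C(N+1, k+1)."""
    return comb(N + 1, k + 1)


if __name__ == "__main__":
    # Compare both sides over a small grid of (N, k).
    for N in range(1, 40):
        for k in range(0, N + 1):
            assert column_sum(N, k) == closed_form(N, k)
    print("identity holds for all 1 <= N < 40, 0 <= k <= N")
```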
GTP: Original
Answer:
\[
\widehat{N} = m\left(1 + \frac{1}{k}\right) - 1;
\]
is this reasonable?
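Two natural checks, offered as a sketch and not necessarily the ones made in the talk: if every tank is observed ($k = N$) the maximum must be $m = N$, and since $k$ distinct serial numbers are seen we always have $m \ge k$.

```latex
\[
k = N:\quad \widehat{N} = N\left(1 + \frac{1}{N}\right) - 1 = N,
\qquad\qquad
m \ge k:\quad \widehat{N} = m + \frac{m}{k} - 1 \ \ge\ m .
\]
```

So the estimate is exact when all tanks are seen, and it never falls below the largest serial number observed.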
Lemma
Let M be the random variable for the maximum number
observed, and let m be the value we see. There is zero
probability of observing a value smaller than k or larger
than N, and for k ≤ m ≤ N
\[
\mathrm{Prob}(M = m) = \frac{\binom{m}{k} - \binom{m-1}{k}}{\binom{N}{k}} = \frac{\binom{m-1}{k-1}}{\binom{N}{k}}.
\]
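For small cases the lemma can be compared against a direct enumeration of all $\binom{N}{k}$ samples. This is a minimal sketch; the function names are hypothetical and the brute-force version is only practical for small $N$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def prob_max_formula(m: int, N: int, k: int) -> Fraction:
    """Prob(M = m) from the lemma: C(m-1, k-1) / C(N, k) for k <= m <= N."""
    if m < k or m > N:
        return Fraction(0)
    return Fraction(comb(m - 1, k - 1), comb(N, k))


def prob_max_bruteforce(m: int, N: int, k: int) -> Fraction:
    """Prob(M = m) by listing every k-subset of {1, ..., N}."""
    samples = list(combinations(range(1, N + 1), k))
    hits = sum(1 for sample in samples if max(sample) == m)
    return Fraction(hits, len(samples))


if __name__ == "__main__":
    N, k = 8, 3
    for m in range(1, N + 1):
        print(m, prob_max_formula(m, N, k), prob_max_bruteforce(m, N, k))
```

Exact rational arithmetic (`fractions.Fraction`) is used so the two columns can be compared for equality without rounding concerns.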
Substituting:
\[
E[M] = \sum_{m=k}^{N} m \, \frac{\binom{m-1}{k-1}}{\binom{N}{k}}.
\]
\begin{align*}
E[M] &= \sum_{m=k}^{N} m \, \frac{\binom{m-1}{k-1}}{\binom{N}{k}} \\
&= \sum_{m=k}^{N} m \, \frac{(m-1)!}{(k-1)!\,(m-k)!} \cdot \frac{k!\,(N-k)!}{N!} \\
&= \sum_{m=k}^{N} \frac{m!\,k}{k!\,(m-k)!} \cdot \frac{k!\,(N-k)!}{N!} \\
&= \frac{k \cdot k!\,(N-k)!}{N!} \sum_{m=k}^{N} \binom{m}{k} \\
&= \frac{k \cdot k!\,(N-k)!}{N!} \binom{N+1}{k+1} \\
&= \frac{k \cdot k!\,(N-k)!}{N!} \cdot \frac{(N+1)!}{(k+1)!\,(N-k)!} \\
&= \frac{k(N+1)}{k+1}.
\end{align*}
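The closed form $E[M] = k(N+1)/(k+1)$ can be confirmed by summing the definition directly in exact arithmetic. A minimal sketch with hypothetical function names:

```python
from fractions import Fraction
from math import comb


def expected_max_by_sum(N: int, k: int) -> Fraction:
    """E[M] computed term by term: sum of m * C(m-1, k-1) / C(N, k)."""
    total = Fraction(0)
    for m in range(k, N + 1):
        total += m * Fraction(comb(m - 1, k - 1), comb(N, k))
    return total


def expected_max_closed_form(N: int, k: int) -> Fraction:
    """E[M] from the final line of the derivation: k (N + 1) / (k + 1)."""
    return Fraction(k * (N + 1), k + 1)


if __name__ == "__main__":
    for N, k in [(10, 1), (10, 10), (100, 5), (250, 12)]:
        print(N, k, expected_max_by_sum(N, k), expected_max_closed_form(N, k))
```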
Step 3: Finding $\widehat{N}$
We showed $E[M] = \frac{k(N+1)}{k+1}$; solving for $N$:
\[
N = E[M]\left(1 + \frac{1}{k}\right) - 1.
\]
Substituting $m$ (the observed value of $M$) as our best guess for $E[M]$, we obtain our estimate for the number of tanks produced:
\[
\widehat{N} = m\left(1 + \frac{1}{k}\right) - 1,
\]
completing the proof.
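A quick Monte Carlo run shows how the estimate behaves on simulated captures. This is a sketch under stated assumptions: serial numbers run from 1 to $N$, the $k$ observed tanks are a uniform sample without replacement, and the names `estimate_total` and `average_estimate` (and the seed) are hypothetical choices.

```python
import random


def estimate_total(max_seen: int, k: int) -> float:
    """Estimate m (1 + 1/k) - 1 from the largest serial number seen."""
    return max_seen * (1 + 1 / k) - 1


def average_estimate(N: int, k: int, trials: int, seed: int = 0) -> float:
    """Average the estimate over many simulated samples of k tanks out of N."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(range(1, N + 1), k)  # k distinct serial numbers
        total += estimate_total(max(sample), k)
    return total / trials


if __name__ == "__main__":
    print(estimate_total(60, 5))              # single observation: m = 60, k = 5
    print(average_estimate(1000, 10, 20000))  # should land close to 1000
```

Because $E[M] = k(N+1)/(k+1)$, the average of the estimates should settle near the true $N$ as the number of trials grows.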
GTP: New
Let $N = N_2 - N_1 + 1$, $s = m_2 - m_1$.
Answer:
\[
\widehat{N} = s\left(1 + \frac{2}{k-1}\right) - 1;
\]
note sanity checks at $k = 2$ and $k = N$.
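The two cases can be worked out explicitly; this is a sketch of what the checks presumably look like, using that the spread satisfies $s \ge k - 1$ and that $s = N - 1$ when all $N$ tanks are observed.

```latex
\[
k = N:\quad s = N - 1,\qquad
\widehat{N} = (N-1)\left(1 + \frac{2}{N-1}\right) - 1 = (N - 1) + 2 - 1 = N ,
\]
\[
k = 2:\quad \widehat{N} = 3s - 1 \ \ge\ 2 \quad\text{since } s \ge 1 .
\]
```

With every tank seen the estimate is exact, and with the minimum of two observations it is still at least the two tanks actually seen.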
Lemma
Probability is 0 unless $k - 1 \le s \le N_2 - N_1$; then
\[
\mathrm{Prob}(S = s) = \frac{\sum_{m=N_1}^{N_2 - s} \binom{s-1}{k-2}}{\binom{N_2 - N_1 + 1}{k}} = \frac{(N - s)\binom{s-1}{k-2}}{\binom{N}{k}}.
\]
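As with the maximum, the distribution of the spread can be checked by enumerating every sample for a small range of serial numbers. A minimal sketch; the function names and the example range 5 to 13 are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def spread_pmf_bruteforce(N1: int, N2: int, k: int) -> dict[int, Fraction]:
    """Distribution of S = max - min over all k-subsets of {N1, ..., N2}."""
    counts: dict[int, int] = {}
    total = 0
    for sample in combinations(range(N1, N2 + 1), k):
        s = max(sample) - min(sample)
        counts[s] = counts.get(s, 0) + 1
        total += 1
    return {s: Fraction(c, total) for s, c in counts.items()}


def spread_pmf_formula(N1: int, N2: int, k: int) -> dict[int, Fraction]:
    """Lemma: Prob(S = s) = (N - s) C(s-1, k-2) / C(N, k), k-1 <= s <= N-1."""
    N = N2 - N1 + 1
    return {
        s: Fraction((N - s) * comb(s - 1, k - 2), comb(N, k))
        for s in range(k - 1, N)
    }


if __name__ == "__main__":
    print(spread_pmf_bruteforce(5, 13, 3) == spread_pmf_formula(5, 13, 3))
```

Only the length $N = N_2 - N_1 + 1$ of the range enters the formula, which is why the unknown starting number $N_1$ does not matter.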
Outline of Proof
Regression
See https://web.williams.edu/Mathematics/sjmiller/public_html/probabilitylifesaver/MethodLeastSquares.pdf
Overview
Linear Regression
\[
b = \frac{\sum_{n=1}^{N} x_n \sum_{n=1}^{N} x_n y_n - \sum_{n=1}^{N} x_n^2 \sum_{n=1}^{N} y_n}{\sum_{n=1}^{N} x_n \sum_{n=1}^{N} x_n - \sum_{n=1}^{N} x_n^2 \sum_{n=1}^{N} 1}.
\]
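The formula translates directly into code. In this sketch the fitted line is assumed to be $y = ax + b$, so $b$ is the intercept; the slope formula for $a$ is the standard least-squares one and is not taken from the slide, and the name `best_fit_line` is hypothetical.

```python
def best_fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Return (a, b) for the least-squares line y = a*x + b."""
    n = len(xs)                               # sum of 1 over the data
    sx = sum(xs)                              # sum of x_n
    sy = sum(ys)                              # sum of y_n
    sxx = sum(x * x for x in xs)              # sum of x_n^2
    sxy = sum(x * y for x, y in zip(xs, ys))  # sum of x_n y_n
    denom = sx * sx - sxx * n                 # shared denominator
    if denom == 0:
        raise ValueError("need at least two distinct x values")
    b = (sx * sxy - sxx * sy) / denom         # intercept, as displayed above
    a = (sx * sy - n * sxy) / denom           # slope (standard formula)
    return a, b


if __name__ == "__main__":
    # Points lying exactly on y = 2x + 3.
    print(best_fit_line([0, 1, 2, 3], [3, 5, 7, 9]))
```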
Statistics
Birthday Problem
A straightforward analysis shows the answer is approximately $D^{1/2}\sqrt{\log 4}$.
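The approximation can be compared with an exact computation. The sketch assumes the question is the usual one: with $D$ equally likely days, how many people are needed before the probability of a shared birthday reaches $1/2$? The function names are hypothetical.

```python
from math import log, sqrt


def birthday_threshold(D: int) -> int:
    """Smallest group size whose probability of a shared birthday is >= 1/2."""
    p_no_match = 1.0
    n = 0
    while p_no_match > 0.5:
        # Person n+1 must avoid the n birthdays already taken.
        p_no_match *= (D - n) / D
        n += 1
    return n


def birthday_approximation(D: int) -> float:
    """The approximation D^(1/2) * sqrt(log 4), with log the natural logarithm."""
    return sqrt(D) * sqrt(log(4))


if __name__ == "__main__":
    for D in (365, 1000, 10000):
        print(D, birthday_threshold(D), round(birthday_approximation(D), 2))
```

For $D = 365$ the exact threshold is 23 people, while the approximation gives about 22.49.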
Implementation Issues
See https://web.williams.edu/Mathematics/sjmiller/public_html/probabilitylifesaver/MethodLeastSquares.pdf
[Figure: plot with horizontal axis MaxObs; ticks on both axes run from 2000 to 10 000.]
Statistics and German Tank Problem: $m = \frac{k}{k+1} N + \frac{k}{k+1}$
[Figure: MaxObs (vertical axis) against NumTanks (horizontal axis); ticks on both axes run from 2000 to 10 000.]
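The linear relation between the maximum observed and the number of tanks, which follows from $E[M] = k(N+1)/(k+1)$, can be recovered by simulating data and fitting a line. This is a sketch under assumptions, not the procedure used for the plots in the talk: the range of $N$, the value of $k$, the number of trials, the seed and all function names are hypothetical choices.

```python
import random


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom
    b = (sy * sxx - sx * sxy) / denom
    return a, b


def mean_max_observed(N: int, k: int, trials: int, rng: random.Random) -> float:
    """Average largest serial number over repeated samples of k tanks out of N."""
    total = 0
    for _ in range(trials):
        total += max(rng.sample(range(1, N + 1), k))
    return total / trials


def simulate_slope(k: int = 20, trials: int = 50, seed: int = 2019) -> tuple[float, float]:
    """Fit MaxObs against NumTanks for N = 1000, 1100, ..., 10000."""
    rng = random.Random(seed)
    num_tanks = list(range(1000, 10001, 100))
    max_obs = [mean_max_observed(N, k, trials, rng) for N in num_tanks]
    return fit_line(num_tanks, max_obs)


if __name__ == "__main__":
    slope, intercept = simulate_slope()
    print("fitted slope:", slope, "predicted k/(k+1):", 20 / 21)
```

The fitted slope should sit near $k/(k+1)$; the intercept is much noisier at this scale and is not a reliable check.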
[Figure: plot with horizontal axis MaxObs; vertical ticks run from 20 000 to 100 000.]
[Figure: Best Fit Slope against NumObs.]
[Figure: plot with axis label NumObs.]
References
Smoots
Thank you!
Questions?
Work supported by NSF Grants DMS1561945 and DMS1659037.
1 S. J. Miller, The Probability Lifesaver, Princeton University Press, Princeton, NJ, 2018. https://web.williams.edu/Mathematics/sjmiller/public_html/probabilitylifesaver/index.htm.
2 Probability and Statistics Blog, How many tanks? MC testing the GTP, https://statisticsblog.com/2010/05/25/how-many-tanks-gtp-gets-put-to-the-test/.
3 Statistical Consultants Ltd, The German Tank Problem, https://www.statisticalconsultants.co.nz/blog/the-german-tank-problem.html.
4 WikiEducator, Point Estimation - German Tank Problem, https://wikieducator.org/Point_estimation_-_German_tank_problem.
5 Wikipedia, German Tank Problem, Wikimedia Foundation, https://en.wikipedia.org/wiki/German_tank_problem.
Notation
Proof
We argue similarly as before. In the algebra below we use our second binomial identity; relabeling the parameters, it is
\[
\sum_{\ell=a}^{b} \binom{\ell}{a} = \binom{b+1}{a+1}. \tag{1}
\]
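\begin{align*}
E[S] &= \sum_{s=k-1}^{N-1} s \, \mathrm{Prob}(S = s) \\
&= \sum_{s=k-1}^{N-1} s \, \frac{(N - s)\binom{s-1}{k-2}}{\binom{N}{k}} \\
&= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} s (N - s) \binom{s-1}{k-2} \\
&= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{N \, s!\,(k-1)}{(s-k+1)!\,(k-1)!} - \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{s \, s!\,(k-1)}{(s-k+1)!\,(k-1)!} = T_1 - T_2.
\end{align*}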
Proof (cont)
We first simplify $T_1$; below we always try to multiply by 1 in such a way that we can combine ratios of factorials into binomial coefficients:
\begin{align*}
T_1 &= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{N \, s!\,(k-1)}{(s-k+1)!\,(k-1)!} \\
&= \binom{N}{k}^{-1} N(k-1) \sum_{s=k-1}^{N-1} \binom{s}{k-1} \\
&= \binom{N}{k}^{-1} N(k-1) \binom{N}{k} = N(k-1),
\end{align*}
Proof (cont)
Turning to $T_2$ we argue similarly, at one point replacing $s$ with $(s + 1) - 1$ to assist in collecting factors into a binomial coefficient:
\begin{align*}
T_2 &= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{s \, s!\,(k-1)}{(s-k+1)!\,(k-1)!} \\
&= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{(s + 1 - 1)\, s!\,(k-1)}{(s-(k-1))!\,(k-1)!} \\
&= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{(s+1)!\,(k-1)k}{(s+1-k)!\,(k-1)!\,k} - \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{s!\,(k-1)}{(s-(k-1))!\,(k-1)!} \\
&= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \frac{(s+1)!\,k(k-1)}{(s+1-k)!\,k!} - \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} (k-1)\binom{s}{k-1} \\
&= \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} k(k-1)\binom{s+1}{k} - \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} (k-1)\binom{s}{k-1} = T_{21} - T_{22}.
\end{align*}
Proof (cont)
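We can immediately evaluate $T_{22}$ by using (1) with $a = k - 1$ and $b = N - 1$, and find
\[
T_{22} = \binom{N}{k}^{-1} (k-1) \binom{N}{k} = k - 1.
\]
For $T_{21}$ we have
\[
T_{21} = \binom{N}{k}^{-1} \sum_{s=k-1}^{N-1} \binom{s+1}{k} k(k-1).
\]
Reindexing with $w = s + 1$ and then applying (1) with $a = k$ and $b = N$ gives
\[
T_{21} = \binom{N}{k}^{-1} k(k-1) \sum_{w=k}^{N} \binom{w}{k} = \binom{N}{k}^{-1} k(k-1) \binom{N+1}{k+1}.
\]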
Proof (cont)
Thus, substituting everything back yields
\[
E[S] = N(k-1) + (k-1) - k(k-1) \binom{N}{k}^{-1} \binom{N+1}{k+1}.
\]
\begin{align*}
(N+1)(k-1) - k(k-1) \frac{\frac{(N+1)!}{(N-k)!\,(k+1)!}}{\frac{N!}{(N-k)!\,k!}}
&= (N+1)(k-1) - k(k-1) \frac{(N+1)!\,(N-k)!\,k!}{N!\,(N-k)!\,(k+1)!} \\
&= (N+1)(k-1) - k(k-1) \frac{N+1}{k+1} \\
&= (N+1)(k-1) - \frac{k(k-1)(N+1)}{k+1} \\
&= (N+1)(k-1)\left(1 - \frac{k}{k+1}\right) \\
&= (N+1)\frac{k-1}{k+1}.
\end{align*}
Proof (cont)
The analysis is completed as before, where we pass from our observation of $s$ for $S$ to a prediction $\widehat{N}$ for $N$:
\[
\widehat{N} = \frac{k+1}{k-1}\, s - 1 = s\left(1 + \frac{2}{k-1}\right) - 1,
\]
where the final equality is due to rewriting the algebra to mirror more closely the formula from the case where the first tank is numbered 1. Note that this formula passes the same sanity checks the other did; for example $s\,\frac{2}{k-1} - 1$ is always at least 1 (remember $k$ is at least 2), and thus the lowest estimate we can get for the number of tanks is $s + 1$.
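To see the spread-based estimate in action, the sketch below evaluates it on one observation and then averages it exactly over the distribution of $S$ from the lemma. The function names are hypothetical, and the check assumes $k \ge 2$ tanks drawn uniformly without replacement from $N$ consecutively numbered tanks.

```python
from fractions import Fraction
from math import comb


def estimate_from_spread(s: int, k: int) -> Fraction:
    """Estimate s (1 + 2/(k-1)) - 1, written as s (k+1)/(k-1) - 1."""
    return Fraction(s * (k + 1), k - 1) - 1


def prob_spread(s: int, N: int, k: int) -> Fraction:
    """Prob(S = s) = (N - s) C(s-1, k-2) / C(N, k) for k-1 <= s <= N-1."""
    if s < k - 1 or s > N - 1:
        return Fraction(0)
    return Fraction((N - s) * comb(s - 1, k - 2), comb(N, k))


def expected_estimate(N: int, k: int) -> Fraction:
    """Exact average of the estimate over every possible spread."""
    return sum(
        (prob_spread(s, N, k) * estimate_from_spread(s, k) for s in range(k - 1, N)),
        Fraction(0),
    )


if __name__ == "__main__":
    print(estimate_from_spread(60, 5))  # spread of 60 among k = 5 tanks
    print(expected_estimate(50, 4))     # averages back to N = 50
```

Since $E[S] = (N+1)(k-1)/(k+1)$, the exact average of the estimate equals $N$, mirroring the original problem.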