University of Toronto Scarborough Department of Computer and Mathematical Sciences Final Exam, Winter - 2015

UNIVERSITY OF TORONTO SCARBOROUGH
Department of Computer and Mathematical Sciences
Final Exam, Winter - 2015
STAB57H3: An Introduction to Statistics
Duration: Three hours (180 minutes)
LAST NAME: FIRST NAME:
STUDENT NUMBER: SIGNATURE:
TUTORIAL:
Aids Allowed:
• A handwritten cheat-sheet covering both sides of TWO A4/letter sized papers. You
need to submit your cheat-sheet with your answer-sheet after the exam.
• A calculator (No phone calculators are allowed)
No other aids are allowed.
All your work must be presented clearly in order to get credit. Answer alone (even though
correct) will only qualify for ZERO credit. Please show your work in the space provided;
you may use the back of the pages, if necessary, but you MUST remain organized. Show
your work and answer in the space provided.
There are 13 pages including this page. Please check to see you have all the
pages.
Good Luck!
Question 1 2 3 4 5 6 7 8 9 Total
Points 10 10 10 10 14 10 10 10 16 100
Score
1 (a): The survival times X (in months) of a group of patients with lung cancer follow an
exponential distribution with probability density function
\[ f_X(x) = \lambda e^{-\lambda x} \quad \text{for } x \ge 0, \]
where λ = 0.07. You want to predict a future value of X using its median. Hence, find
the median survival time of the patients with lung cancer.
[5 Points]
Solution: Let M denote the median survival time. Then
\[ \int_0^M f_X(x)\,dx = \int_0^M \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda M} = 0.5. \]
[3 Points]
Solving for M we get
\[ M = -\frac{\log(0.5)}{\lambda} = -\frac{\log(0.5)}{0.07} = 9.902103. \]
[2 Points]
Hence, the median survival time is approximately 10 months.
1 (b): You are also interested in predicting a future value of X in the form of an interval.
Find the shortest interval that contains 95% of the survival times of the patients with
lung cancer.
[5 Points]
Solution: The density is decreasing in x, so its mass is concentrated on the left side of
the curve. Hence, the shortest interval of X with mass 0.95 is (0, b), where
\[ \int_0^b f_X(x)\,dx = \int_0^b \lambda e^{-\lambda x}\,dx = 1 - e^{-\lambda b} = 0.95. \]
[3 Points]
Solving for b we get (note that $e^{-\lambda b} = 1 - 0.95 = 0.05$)
\[ b = -\frac{\log(0.05)}{\lambda} = -\frac{\log(0.05)}{0.07} = 42.79618. \]
[2 Points]
This tells us that 95% of the lung cancer patients are expected to survive at most
42.79618 months.
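Both parts of Question 1 invert the exponential distribution function $1 - e^{-\lambda x}$. The following is a minimal Python sketch of that inversion; the function name `exponential_quantile` is introduced here for illustration and is not part of the exam.

```python
import math

def exponential_quantile(p, rate):
    """Return x with P(X <= x) = p for X ~ Exponential(rate)."""
    if not 0 <= p < 1:
        raise ValueError("p must lie in [0, 1)")
    # Solve 1 - exp(-rate * x) = p for x.
    return -math.log(1.0 - p) / rate

rate = 0.07                                      # lambda from Question 1
median = exponential_quantile(0.5, rate)         # part (a)
upper_95 = exponential_quantile(0.95, rate)      # part (b): interval (0, b)
print(round(median, 4), round(upper_95, 4))
```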
2 (a): A random variable Y is defined as
\[ Y = \begin{cases} 1 & \text{if the grade of a student is A+} \\ 0 & \text{otherwise,} \end{cases} \]
where P (Y = 1) = θ ∈ [0, 1] is unknown. You randomly sample 7 students from the
STAB57H3 class and record $(Y_1, Y_2, Y_3, Y_4, Y_5, Y_6, Y_7)$. Describe the statistical
model for the observed data
\[ T = \sum_{i=1}^{7} Y_i. \]
[5 Points]
Solution: The $Y_i$ are independent, and each follows a Bernoulli(θ) distribution. Hence,
\[ T = \sum_{i=1}^{7} Y_i \sim \text{Binomial}(7, \theta), \]
where θ ∈ [0, 1] is unknown. [2 Points]
The probability function for T is given by
\[ f_\theta(t) = \binom{7}{t} \theta^t (1 - \theta)^{7-t}, \quad t = 0, 1, 2, \dots, 7, \]
where the parameter is θ and the parameter space is Ω = [0, 1]. [3 Points]
2 (b): Suppose that a statistical model is given by the family of Bernoulli(θ) distributions
where θ ∈ Ω = [0, 1]. If your interest is in making inferences about the probability
that two independent observations from this model are NOT the same, then determine
ψ(θ).
[5 Points]
Solution: Let $Y_1$ and $Y_2$ be independent Bernoulli(θ) random variables. Then
\[ \psi(\theta) = P(Y_1 \ne Y_2). \]
[2 Points]
That is
ψ(θ) = P (Y1 = 1, Y2 = 0) + P (Y1 = 0, Y2 = 1).
[1 Points]
Finally,
P (Y1 = 1, Y2 = 0) + P (Y1 = 0, Y2 = 1) = θ(1 − θ) + (1 − θ)θ = 2θ(1 − θ) = ψ(θ).
[2 Points]
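As a quick check of the result 2θ(1 − θ), the probability can also be obtained by enumerating the four outcomes of two independent Bernoulli(θ) draws. This is a minimal sketch; the function name `prob_not_same` and the θ values tried are illustrative assumptions.

```python
from itertools import product

def prob_not_same(theta):
    """P(Y1 != Y2) for two independent Bernoulli(theta) draws, by enumeration."""
    pmf = {1: theta, 0: 1.0 - theta}
    return sum(pmf[a] * pmf[b]
               for a, b in product((0, 1), repeat=2)
               if a != b)

# Compare the enumeration with the closed form 2 * theta * (1 - theta).
for theta in (0.0, 0.25, 0.5, 1.0):
    assert abs(prob_not_same(theta) - 2 * theta * (1 - theta)) < 1e-12
```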
3 (a): Suppose the following sample of waiting times (in minutes) was obtained for customers
in a queue at an automatic banking machine.
15 10 2 3 1 0 4 5
5 3 3 4 2 1 4 55
The median, first and third quartiles are computed as Q2 = 3.5, Q1 = 2.0 and Q3 = 5.0,
respectively. Use the 1.5 × IQR rule to find out any outliers in the data. Show your
work clearly.
[5 Points]
Solution: The inter-quartile range is
IQR = Q3 − Q1 = 5.0 − 2.0 = 3.0.
[1 Points]
Here, Q1 − 1.5 × IQR = 2.0 − 4.5 = −2.5 and Q3 + 1.5 × IQR = 5.0 + 4.5 = 9.5.
[2 Points]
In this problem, no sample observation is less than −2.5. On the other hand, the first
(15), second (10), and the last (55) sample observations are greater than 9.5. According
to the 1.5 × IQR rule, these three observations are potential outliers.
[2 Points]
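The 1.5 × IQR rule can be sketched in a few lines of Python. The quartiles are taken as given in the question rather than recomputed, since quartile conventions differ between software packages; the function name `iqr_outliers` is an illustrative assumption.

```python
waiting_times = [15, 10, 2, 3, 1, 0, 4, 5,
                 5, 3, 3, 4, 2, 1, 4, 55]
q1, q3 = 2.0, 5.0  # quartiles as stated in the question

def iqr_outliers(data, q1, q3, k=1.5):
    """Return the lower fence, the upper fence, and the points outside them."""
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    flagged = [x for x in data if x < lower or x > upper]
    return lower, upper, flagged

lower, upper, outliers = iqr_outliers(waiting_times, q1, q3)
print(lower, upper, outliers)
```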
3 (b): Suppose that a statistical model is given by the family of $N(\mu, \sigma_0^2)$
distributions where $\theta = \mu \in R^1$ is unknown, while $\sigma_0^2$ is known. If your
interest is in making inferences about the third quartile of the true distribution, then
determine ψ(θ).
[5 Points]
Solution: Let X be a random variable that follows the $N(\mu, \sigma_0^2)$ distribution, and
let $x_{0.75}$ be its third quartile. Then
\[ P(X \le x_{0.75}) = P\left(\frac{X - \mu}{\sigma_0} \le \frac{x_{0.75} - \mu}{\sigma_0}\right) = P\left(Z \le \frac{x_{0.75} - \mu}{\sigma_0}\right) = 0.75, \]
where $Z \sim N(0, 1)$. [2 Points]
We write
\[ \frac{x_{0.75} - \mu}{\sigma_0} = z_{0.75}, \]
where $\Phi(z_{0.75}) = 0.75$, and Φ is the cumulative distribution function of the
N(0, 1) distribution.
[2 Points]
Finally,
\[ \psi(\theta) = x_{0.75} = \mu + \sigma_0 z_{0.75}, \]
where $z_{0.75}$ is the third quartile of the N(0, 1) distribution. [1 Points]
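The formula $\mu + \sigma_0 z_{0.75}$ can be evaluated with Python's standard library. This is a sketch under assumed inputs: the values µ = 50 and σ0 = 10 below are hypothetical and do not come from the exam.

```python
from statistics import NormalDist

def third_quartile(mu, sigma0):
    """psi(theta) = mu + sigma0 * z_0.75 for the N(mu, sigma0^2) model."""
    z_075 = NormalDist().inv_cdf(0.75)  # third quartile of N(0, 1)
    return mu + sigma0 * z_075

mu, sigma0 = 50.0, 10.0  # hypothetical values, for illustration only
x_075 = third_quartile(mu, sigma0)

# The N(mu, sigma0^2) distribution function should return 0.75 at x_075.
check = NormalDist(mu, sigma0).cdf(x_075)
print(round(x_075, 4), round(check, 4))
```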
4 (a): Let $(x_1, x_2, \dots, x_n)$ be an observed sample from a distribution with
probability density function
\[ f_X(x) = \begin{cases} \frac{1}{\theta}\, x^{\frac{1}{\theta} - 1} & 0 < x \le 1 \\ 0 & \text{otherwise,} \end{cases} \]
where θ > 0 is an unknown parameter. Find a sufficient statistic and the MLE of θ. Is
the sufficient statistic minimal sufficient?
[7 Points]
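Solution: Let s denote the observed sample. Then
\[ f(s) = \prod_{i=1}^{n} f_X(x_i) = \frac{1}{\theta^n} \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{\theta}} \times \left(\prod_{i=1}^{n} x_i\right)^{-1} = g_\theta(T(s)) \times h(s), \]
where $T(s) = \prod_{i=1}^{n} x_i$. By the factorization theorem, $T = \prod_{i=1}^{n} X_i$ is a
sufficient statistic for θ.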
[2 Points]
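The likelihood function is defined as
\[ L(\theta|s) = \prod_{i=1}^{n} f_X(x_i) = \frac{1}{\theta^n} \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{\theta} - 1}. \]
The log-likelihood function is
\[ \log L(\theta|s) = -n \log\theta + \left(\frac{1}{\theta} - 1\right) \log\left(\prod_{i=1}^{n} x_i\right). \]
Setting the first derivative of the log-likelihood function with respect to θ equal to
zero gives
\[ \frac{d \log L(\theta|s)}{d\theta} = -\frac{n}{\theta} - \frac{1}{\theta^2} \log\left(\prod_{i=1}^{n} x_i\right) \overset{\text{set}}{=} 0 \;\Rightarrow\; \hat\theta = -\frac{\log\left(\prod_{i=1}^{n} x_i\right)}{n}. \]
The second derivative of the log-likelihood function with respect to θ, evaluated at
$\hat\theta$, is
\[ \left.\frac{d^2 \log L(\theta|s)}{d\theta^2}\right|_{\theta = \hat\theta} = -\frac{n}{\hat\theta^2} < 0. \]
Hence, $\hat\theta = -\log\left(\prod_{i=1}^{n} x_i\right)/n$ is the MLE of θ.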
[4 Points]
Here, $\hat\theta$ is a one-to-one function of $T = \prod_{i=1}^{n} x_i$. Since the MLE is
a function of every sufficient statistic, so is T. Hence, T is minimal sufficient for θ.
[1 Points]
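The closed-form MLE can be sanity-checked numerically by comparing the log-likelihood at $\hat\theta$ with its value at nearby points. This is a minimal sketch; the sample `xs` is hypothetical and the function names are introduced here for illustration.

```python
import math

def log_likelihood(theta, xs):
    """log L(theta | s) = -n log(theta) + (1/theta - 1) * sum(log x_i)."""
    n = len(xs)
    return -n * math.log(theta) + (1.0 / theta - 1.0) * sum(math.log(x) for x in xs)

def mle_theta(xs):
    """Closed-form MLE: theta_hat = -(1/n) * sum(log x_i)."""
    return -sum(math.log(x) for x in xs) / len(xs)

xs = [0.9, 0.5, 0.7, 0.2, 0.6]  # hypothetical sample with 0 < x <= 1
theta_hat = mle_theta(xs)

# The log-likelihood at theta_hat should not be beaten by other values of theta.
for factor in (0.5, 0.9, 1.1, 2.0):
    assert log_likelihood(theta_hat, xs) >= log_likelihood(theta_hat * factor, xs)
print(round(theta_hat, 4))
```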
4 (b): Determine the MLE of ψ(θ) = exp (θ).
[3 Points]
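Solution: Here, ψ(θ) = exp (θ) is a one-to-one function of θ on Ω, and therefore, by the
invariance property of the MLE,
\[ \psi(\hat\theta) = \exp(\hat\theta) = \left(\prod_{i=1}^{n} x_i\right)^{-1/n} \]
is the MLE of ψ(θ).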
5 (a): A cancer laboratory is estimating the rate of tumorigenesis in a group of mice. They
have tumor count data for 10 mice given in the following table.
12 9 12 14 13
13 15 8 15 6
Information from other laboratories suggests that the mice have tumor counts that are
approximately Poisson distributed with a mean of λ. Given the prior
λ ∼ Gamma(120, 10), find the posterior distribution of λ.
[8 Points]
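Solution: Given that the tumor count X ∼ Poisson(λ), we write the joint probability
function of the sample as
\[ P_\lambda(x_1, \dots, x_n) = \prod_{i=1}^{n} P_\lambda(x_i) = \frac{e^{-n\lambda} \lambda^{n\bar{x}}}{x_1! \cdots x_n!}. \]
[2 Points]
The prior distribution is
\[ \pi(\lambda) = \frac{\beta^\alpha}{\Gamma(\alpha)} \lambda^{\alpha - 1} e^{-\beta\lambda}, \]
where α = 120, β = 10, and $\lambda \in R^+$. [1 Points]
The posterior distribution of λ given the data is
\[ \pi(\lambda | x_1, \dots, x_n) \propto P_\lambda(x_1, \dots, x_n) \times \pi(\lambda) \propto \lambda^{n\bar{x} + \alpha - 1} e^{-(n + \beta)\lambda}, \]
which is in the form of a gamma distribution with parameters $n\bar{x} + \alpha$ and $n + \beta$.
[3 Points]
We write
\[ \lambda | x_1, \dots, x_n \sim \text{Gamma}(n\bar{x} + \alpha,\; n + \beta), \]
where n = 10 and $n\bar{x} = 117$, so that $n\bar{x} + \alpha = 237$ and $n + \beta = 20$.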
[2 Points]
5 (b): Compute posterior mean and variance of λ. [6 Points]
Solution: The posterior mean and variance of λ are
\[ E(\lambda | x_1, \dots, x_n) = \frac{n\bar{x} + \alpha}{n + \beta} = \frac{237}{20} = 11.85, \]
[3 Points]
and
\[ \text{Var}(\lambda | x_1, \dots, x_n) = \frac{n\bar{x} + \alpha}{(n + \beta)^2} = \frac{237}{20^2} = 0.5925, \]
respectively. [3 Points]
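The conjugate update in Question 5 amounts to adding the total count to the shape parameter and the sample size to the rate parameter. A minimal Python sketch follows; the function name `gamma_poisson_posterior` is an illustrative assumption.

```python
counts = [12, 9, 12, 14, 13,
          13, 15, 8, 15, 6]  # tumor counts from the table

def gamma_poisson_posterior(counts, alpha, beta):
    """Gamma(alpha, beta) prior with Poisson data gives
    Gamma(alpha + sum of counts, beta + n), with beta a rate parameter."""
    return alpha + sum(counts), beta + len(counts)

post_alpha, post_beta = gamma_poisson_posterior(counts, alpha=120, beta=10)
post_mean = post_alpha / post_beta       # mean of Gamma(shape, rate)
post_var = post_alpha / post_beta ** 2   # variance of Gamma(shape, rate)
print(post_alpha, post_beta, post_mean, post_var)
```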
6 (a): Let $(x_1, x_2, \dots, x_{n_1})$ be a random sample from a location normal model
$N(\mu, \sigma_0^2)$, where $\mu \in R^1$ is unknown and $\sigma_0^2 > 0$ is known. You are
interested in computing a γ-confidence interval for µ. Let the length of your confidence
interval be $|CI(n_1)|$. Now you draw another random sample of size $n_2$ ($> n_1$), and
obtain a γ-confidence interval with length $|CI(n_2)|$. Hence, the following relationship
holds
\[ |CI(n_1)| \le |CI(n_2)|. \quad \text{TRUE/FALSE} \]
Explain your reasons.
[6 Points]
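Solution: For sample size $n_1$, the γ-confidence interval for µ is
\[ \left(\bar{x} - z_{\frac{1+\gamma}{2}} \frac{\sigma_0}{\sqrt{n_1}},\; \bar{x} + z_{\frac{1+\gamma}{2}} \frac{\sigma_0}{\sqrt{n_1}}\right). \]
[2 Points]
Hence,
\[ |CI(n_1)| = 2 z_{\frac{1+\gamma}{2}} \frac{\sigma_0}{\sqrt{n_1}}. \]
[1 Points]
Similarly,
\[ |CI(n_2)| = 2 z_{\frac{1+\gamma}{2}} \frac{\sigma_0}{\sqrt{n_2}}. \]
[1 Points]
Since $n_2 > n_1$,
\[ |CI(n_1)| > |CI(n_2)|. \]
Hence, the stated relationship does not hold. The correct answer is FALSE.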
[2 Points]
6 (b): You want the maximum length of the confidence interval to be |CI|. Find the required
sample size n.
[4 Points]
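Solution: The length of the γ-confidence interval for µ must be at most |CI|, i.e.,
\[ 2 z_{\frac{1+\gamma}{2}} \frac{\sigma_0}{\sqrt{n}} \le |CI|. \]
[2 Points]
Solving for n, we get
\[ n \ge \left(2 z_{\frac{1+\gamma}{2}} \frac{\sigma_0}{|CI|}\right)^2. \]
The required sample size is the smallest integer n that satisfies this inequality.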
[2 Points]
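The sample-size bound can be turned into a small calculation. This is a sketch under assumed inputs: σ0 = 20, a maximum length of 10 and γ = 0.95 are hypothetical values chosen only to illustrate the formula.

```python
import math
from statistics import NormalDist

def ci_length(sigma0, n, gamma=0.95):
    """Length of the gamma-confidence interval for mu when sigma0 is known."""
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    return 2 * z * sigma0 / math.sqrt(n)

def required_sample_size(sigma0, max_length, gamma=0.95):
    """Smallest integer n with n >= (2 * z * sigma0 / max_length) ** 2."""
    z = NormalDist().inv_cdf((1 + gamma) / 2)
    return math.ceil((2 * z * sigma0 / max_length) ** 2)

# Hypothetical inputs, for illustration only.
n_required = required_sample_size(sigma0=20, max_length=10)
print(n_required, round(ci_length(20, n_required), 4))
```

Consistent with part (a), `ci_length` decreases as `n` grows, so any sample size at or above `n_required` also meets the length requirement.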
7 (a): Suppose in the population of students of the course STAB57H3, the mark, out of 100,
in the midterm test is approximately distributed as $N(\mu, 20^2)$. A random sample of
marks of 10 students is given in the following table. Test whether the distributional
assumption is reasonable or not.
14 55 86 35 59
78 41 28 32 24
[5 Points]
Solution: Here, the null hypothesis is
H0 : The data are drawn from the $N(\mu, 20^2)$ distribution.
The sample size and mean are n = 10 and $\bar{x} = 45.2$. The residual vector, with
$r_i = x_i - \bar{x}$, is computed as
\[ r = (r_1 = -31.2,\; r_2 = 9.8,\; \dots,\; r_{10} = -21.2). \]
Here, $\sum_{i=1}^{10} r_i^2 = 5041.6$, and the discrepancy statistic is
\[ \chi^2(r) = \sum_{i=1}^{10} r_i^2 / \sigma_0^2 = 5041.6 / 20^2 = 12.604. \]
[3 Points]
The P-value is computed as
\[ P(\chi^2(9) > 12.604) = 1 - P(\chi^2(9) \le 12.604) = 1 - 0.8186414 = 0.1813586. \]
Hence, the null hypothesis can't be rejected at the 0.05 level of significance.
[2 Points]
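The discrepancy statistic can be reproduced directly from the data. In this minimal sketch the chi-squared probability is not recomputed; it is taken from item 3 of the Appendix, and the variable names are illustrative.

```python
marks = [14, 55, 86, 35, 59,
         78, 41, 28, 32, 24]  # sample from the table
sigma0 = 20                   # known standard deviation under H0

n = len(marks)
xbar = sum(marks) / n
residuals = [x - xbar for x in marks]

# Discrepancy statistic: sum of squared residuals divided by sigma0^2.
chi2_stat = sum(r * r for r in residuals) / sigma0 ** 2

# P(chi2(9) <= 12.604), as tabulated in the Appendix of the exam.
cdf_from_appendix = 0.8186414
p_value = 1 - cdf_from_appendix
print(xbar, round(chi2_stat, 3), round(p_value, 7))
```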
7 (b): Suppose you place a prior $\mu \sim N(70, 4^2)$ on the unknown parameter. Construct a
0.95-credible interval for µ.
[5 Points]
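Solution: Given $X \sim N(\mu, \sigma_0^2 = 20^2)$, $\mu \sim N(\mu_0 = 70, \tau_0^2 = 4^2)$,
γ = 0.95, $z_{\frac{1+\gamma}{2}} = 1.959964$, n = 10 and $\bar{x} = 45.2$, the posterior
mean of µ is
\[ \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \left(\frac{\mu_0}{\tau_0^2} + \frac{n\bar{x}}{\sigma_0^2}\right) = 62.91429, \]
and the posterior variance of µ is
\[ \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} = 11.42857. \]
Hence, a 0.95-credible interval for µ is
\[ \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \left(\frac{\mu_0}{\tau_0^2} + \frac{n\bar{x}}{\sigma_0^2}\right) \pm \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-\frac{1}{2}} z_{\frac{1+\gamma}{2}} = (56.2884,\; 69.54017). \]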
[1.5 + 1.5 + 2 = 5 Points]
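The posterior mean, posterior variance and credible interval can be reproduced with a short Python sketch. The function name `normal_posterior` is introduced here for illustration; the inputs are the values stated in the question.

```python
import math
from statistics import NormalDist

def normal_posterior(xbar, n, sigma0, mu0, tau0):
    """Posterior mean and variance of mu for N(mu, sigma0^2) data
    with a N(mu0, tau0^2) prior."""
    precision = 1 / tau0 ** 2 + n / sigma0 ** 2
    variance = 1 / precision
    mean = variance * (mu0 / tau0 ** 2 + n * xbar / sigma0 ** 2)
    return mean, variance

post_mean, post_var = normal_posterior(xbar=45.2, n=10, sigma0=20, mu0=70, tau0=4)

gamma = 0.95
z = NormalDist().inv_cdf((1 + gamma) / 2)
half_width = z * math.sqrt(post_var)
credible_interval = (post_mean - half_width, post_mean + half_width)
print(round(post_mean, 5), round(post_var, 5), credible_interval)
```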
8 (a): Let (X1 , X2 , · · · , X10 ) be a random sample of size 10 from the Bernoulli(θ) distribution.
Find the value of k such that k X̄(1 − X̄) is an unbiased estimator of ψ(θ) = θ(1 − θ).
[5 Points]
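Solution: Here,
\[ E(\bar{X}(1 - \bar{X})) = E(\bar{X} - \bar{X}^2) = E(\bar{X}) - E(\bar{X}^2) = E(\bar{X}) - \left(\text{Var}(\bar{X}) + E(\bar{X})^2\right). \]
[2 Points]
Since $E(\bar{X}) = \theta$ and $\text{Var}(\bar{X}) = \theta(1 - \theta)/n$, we get
\[ E(\bar{X}(1 - \bar{X})) = \theta - \frac{\theta(1 - \theta)}{n} - \theta^2 = \frac{n - 1}{n}\, \theta(1 - \theta). \]
[2 Points]
Hence,
\[ E\left(\frac{n}{n - 1}\, \bar{X}(1 - \bar{X})\right) = \theta(1 - \theta) = \psi(\theta). \]
So, with n = 10,
\[ k = \frac{n}{n - 1} = \frac{10}{9}. \]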
[1 Points]
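The unbiasedness of $k\bar{X}(1 - \bar{X})$ with k = 10/9 can be verified exactly, because $n\bar{X}$ follows a Binomial(n, θ) distribution. The following sketch sums over that distribution; the function name and the θ values tried are illustrative assumptions.

```python
from math import comb

def expected_xbar_times_one_minus_xbar(theta, n):
    """Exact E[Xbar * (1 - Xbar)], using n * Xbar ~ Binomial(n, theta)."""
    total = 0.0
    for t in range(n + 1):
        prob = comb(n, t) * theta ** t * (1 - theta) ** (n - t)
        xbar = t / n
        total += prob * xbar * (1 - xbar)
    return total

n = 10
k = n / (n - 1)
for theta in (0.1, 0.5, 0.9):
    estimate_mean = k * expected_xbar_times_one_minus_xbar(theta, n)
    assert abs(estimate_mean - theta * (1 - theta)) < 1e-12
```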
8 (b): Suppose that X, Y, Z are independent N (0, 1) random variables and that U = X + Z,
V = Y + Z. Determine whether or not the variables U and V are related to each other.
[5 Points]
Solution: The covariance between two random variables U and V is
Cov(U, V ) = Cov(X+Z, Y +Z) = Cov(X, Y )+Cov(X, Z)+Cov(Z, Y )+Cov(Z, Z) = 0+0+0+1 = 1.
[3 Points]
The variance of U is
Var(U ) = Var(X + Z) = Var(X) + Var(Z) + 2Cov(X, Z) = 1 + 1 + 0 = 2.
The variance of V is
Var(V ) = Var(Y + Z) = Var(Y ) + Var(Z) + 2Cov(Y, Z) = 1 + 1 + 0 = 2.
The correlation between two random variables U and V is
\[ \text{Cor}(U, V) = \frac{\text{Cov}(U, V)}{\sqrt{\text{Var}(U)\,\text{Var}(V)}} = \frac{1}{\sqrt{2 \times 2}} = 0.5. \]
[1 Points]
Hence, since the correlation is nonzero, the two variables U and V are positively
correlated and therefore not independent. [1 Points]
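Because X, Y and Z are independent with unit variance, the covariance of two linear combinations equals the dot product of their coefficient vectors. The sketch below uses that fact; the helper name `cov_linear` is introduced here for illustration.

```python
import math

# Coefficients of U and V on the independent N(0, 1) variables (X, Y, Z).
u_coef = (1, 0, 1)  # U = X + Z
v_coef = (0, 1, 1)  # V = Y + Z

def cov_linear(a, b):
    """Covariance of two linear combinations of independent unit-variance variables."""
    return sum(ai * bi for ai, bi in zip(a, b))

cov_uv = cov_linear(u_coef, v_coef)
var_u = cov_linear(u_coef, u_coef)
var_v = cov_linear(v_coef, v_coef)
cor_uv = cov_uv / math.sqrt(var_u * var_v)
print(cov_uv, var_u, var_v, cor_uv)
```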
9 (a): Shown below are the number of galleys for a manuscript (X) and the dollar cost
of correcting typographical errors (Y ) in a random sample of recent orders handled
by a firm specializing in technical manuscripts. Assume that the regression model
$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$ is appropriate, with normally distributed
independent error terms with variance $\sigma^2$.
i: 1 2 3 4 5 6
Xi : 7 12 4 14 25 30
Yi : 128 213 75 250 446 540
The following numbers are provided: $\sum_{i=1}^{6} X_i = 92$, $\sum_{i=1}^{6} Y_i = 1652$,
$\sum_{i=1}^{6} X_i^2 = 1930$, $\sum_{i=1}^{6} Y_i^2 = 620394$, $\sum_{i=1}^{6} X_i Y_i = 34602$.
Obtain the least squares estimate of β0. Interpret the result.
[5 Points]
Solution: The least squares estimate of β1 is
\[ b_1 = \frac{\sum X_i Y_i - n\bar{X}\bar{Y}}{\sum X_i^2 - n\bar{X}^2} = \frac{34602 - 6(92/6)(1652/6)}{1930 - 6(92/6)^2} = 17.8524. \]
[2 Points]
The least squares estimate of β0 is
b0 = Ȳ − b1 X̄ = (1652/6) − 17.8524(92/6) = 1.5969.
[2 Points]
Interpretation: When the number of galleys for a manuscript is 0, the expected
dollar cost of correcting typographical errors is estimated to be 1.5969.
[1 Points]
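The two estimates can be computed from the summary sums given in the question. A minimal Python sketch, with variable names chosen here for illustration:

```python
# Summary statistics provided in the question.
n = 6
sum_x, sum_y = 92, 1652
sum_x2, sum_xy = 1930, 34602

xbar = sum_x / n
ybar = sum_y / n

s_xx = sum_x2 - n * xbar ** 2      # sum of (X_i - Xbar)^2
s_xy = sum_xy - n * xbar * ybar    # sum of (X_i - Xbar)(Y_i - Ybar)

b1 = s_xy / s_xx                   # slope estimate
b0 = ybar - b1 * xbar              # intercept estimate
print(round(b1, 4), round(b0, 4))
```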
9 (b): Given an unbiased estimate of the error variance $\sigma^2$,
MSE = 7.001,
construct a 0.95 confidence interval for β0. Test the null hypothesis H0 : β0 = 0
against the alternative HA : β0 ≠ 0.
[5 Points]
Solution: An unbiased estimate of Var{b0} is
\[ s^2\{b_0\} = MSE \left(\frac{1}{n} + \frac{\bar{X}^2}{\sum (X_i - \bar{X})^2}\right) = 7.001 \left(\frac{1}{6} + \frac{(92/6)^2}{1930 - 6(92/6)^2}\right) = 2.0828^2. \]
[2 Points]
The 0.95-confidence interval of β0 is
b0 ± t(0.975;n−2) s{b0 } = 1.5969 ± 2.776445 × 2.0828 = (−4.185872, 7.37971).
[2 Points]
Since the 0.95-confidence interval of β0 contains 0, the null hypothesis H0 : β0 = 0
can't be rejected at the 0.05 level of significance.
[1 Points]
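The interval can be reproduced from the summary sums in part (a) and the t quantile in the Appendix. One assumption in this sketch: MSE is recomputed from the sums (giving roughly 7.004) instead of using the rounded value quoted in the question, which is why the standard error comes out near 2.0828. Variable names are illustrative.

```python
import math

n = 6
sum_x, sum_y = 92, 1652
sum_x2, sum_y2, sum_xy = 1930, 620394, 34602
t_quantile = 2.776445  # t(0.975; n - 2), from the Appendix

xbar, ybar = sum_x / n, sum_y / n
s_xx = sum_x2 - n * xbar ** 2
s_yy = sum_y2 - n * ybar ** 2
s_xy = sum_xy - n * xbar * ybar

b1 = s_xy / s_xx
b0 = ybar - b1 * xbar

# Assumption: MSE recomputed as SSE / (n - 2) with SSE = S_yy - b1 * S_xy.
mse = (s_yy - b1 * s_xy) / (n - 2)
se_b0 = math.sqrt(mse * (1 / n + xbar ** 2 / s_xx))

ci_b0 = (b0 - t_quantile * se_b0, b0 + t_quantile * se_b0)
contains_zero = ci_b0[0] <= 0 <= ci_b0[1]
print(round(mse, 4), round(se_b0, 4), ci_b0, contains_zero)
```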
9 (c): The analysis of variance (ANOVA) table for the fitted regression model is given below.
Test the null hypothesis H0 : β1 = 0 against the alternative HA : β1 ≠ 0.
Sources      Df   Sum Sq      Mean Sq     F value    Pr(>F)
Regression    1   165515.32   165515.32   23632.04   0.0000
Error         4       28.02        7.00
Total         5   165543.34
[3 Points]
Solution: Here, the test statistic is F = 23632.04 with a P-value reported as 0.0000.
Hence, the null hypothesis H0 : β1 = 0 is rejected at both the 0.05 and 0.01 levels of
significance.
9 (d): Estimate the coefficient of determination $R^2$. Interpret the result.
[3 Points]
Solution: The coefficient of determination is
\[ R^2 = \frac{SSR}{SST} = \frac{165515.32}{165543.34} = 0.9998. \]
[2 Points]
Interpretation: The fitted regression model explains 99.98% of the total variation
in the response variable Y .
[1 Points]
Appendix
1. If X ∼ Poisson(λ), then its probability function is defined as
\[ P(X = x | \lambda) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, 3, \dots \]
2. The probability density function of X ∼ Gamma(α, β) is
\[ f(x | \alpha, \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x}, \]
where α > 0, β > 0, and $x \in R^+$.
3. The following probability is for the chi-squared distribution with 10 − 1 degrees of freedom:
\[ P(\chi^2(10 - 1) \le 12.604) = 0.8186414. \]
4. The 0.975th quantile of a normal distribution with mean 0 and variance 1 is
\[ z_{0.975} = 1.959964. \]
5. Consider an interval
INT = (a, b),
where $a \in R^1$, $b \in R^1$, and b ≥ a. Then the length of this interval is defined as
|INT| = b − a.
6. When γ = 0.95 and n = 6,
\[ t((1 + \gamma)/2;\; n - 2) = 2.776445. \]