Mock Merged

Statistics for Data Science - II

Final Mock
May 2022 term

1 Section I

1. Let X_1, X_2, ..., X_50 ∼ i.i.d. Poisson(5) and Y = X_1 + X_2 + ... + X_50.

(a) What is E[Y]? [1 mark]

(b) What is Var(Y)? [1 mark]

(c) What is P(|Y| > 100)? Use Normal approximation and express your answer in
terms of F_Z, where F_Z is the CDF of standard normal distribution. [2 marks]

(a) 1 − F_Z(−150/√250)
(b) 2F_Z(−150/√250)
(c) 1 − F_Z(−150/√250) + F_Z(−350/√250)
(d) 1 − F_Z(150/250)
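
The quantities in this question can be checked numerically. The following is a minimal sketch, assuming a plain Normal approximation without continuity correction; the variable names are illustrative and not part of the exam.

```python
from math import sqrt
from statistics import NormalDist

n, lam = 50, 5                 # sample size and Poisson rate from the question
mean_y = n * lam               # E[Y]: sum of the n individual means
var_y = n * lam                # Var(Y): Poisson variance equals its mean

F_Z = NormalDist().cdf         # standard normal CDF

# P(|Y| > 100) = P(Y > 100) + P(Y < -100), with Y approximated by Normal(mean_y, var_y)
upper_tail = 1 - F_Z((100 - mean_y) / sqrt(var_y))
lower_tail = F_Z((-100 - mean_y) / sqrt(var_y))
prob = upper_tail + lower_tail

print(mean_y, var_y, prob)
```

The two standardised arguments are (100 − 250)/√250 and (−100 − 250)/√250, so the probability is numerically indistinguishable from 1.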
2. Consider 100 samples X_1, ..., X_100 ∼ i.i.d. Normal(µ, 1). Let the null and alternative
hypotheses be H_0: µ = 0, H_A: µ > 0. Suppose the test statistic is the sample mean, T = X̄.
Consider a test that rejects H_0 if T > c, for some constant c.

(i) What is P(Type I error)? [2 marks]

(a) F_Z(10c)
(b) 1 − F_Z(10c)
(c) 1 − F_Z(c/10)
(d) F_Z(c/10)

(ii) What is P(Type II error) if H_A: µ = 1? [2 marks]

(a) 1 − F_Z((c − 1)/10)
(b) F_Z((c − 1)/10)
(c) 1 − F_Z(10(c − 1))
(d) F_Z(10(c − 1))
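
The two error probabilities can be explored numerically. This is a minimal sketch, assuming T is the sample mean, so that T ∼ Normal(µ, 1/100); the function names and the threshold c = 0.2 are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

F_Z = NormalDist().cdf  # standard normal CDF


def type_i_error(c, n=100, sigma=1.0, mu0=0.0):
    """P(reject H0 | H0 true) = P(sample mean > c) when mu = mu0."""
    se = sigma / sqrt(n)            # standard deviation of the sample mean
    return 1 - F_Z((c - mu0) / se)


def type_ii_error(c, mu_alt=1.0, n=100, sigma=1.0):
    """P(fail to reject H0 | mu = mu_alt) = P(sample mean <= c)."""
    se = sigma / sqrt(n)
    return F_Z((c - mu_alt) / se)


c = 0.2  # hypothetical threshold, not given in the question
print(type_i_error(c), type_ii_error(c))
```

With n = 100 and unit variance the standard error is 1/10, which is why the standardised arguments take the form 10c and 10(c − 1).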

3. Let X be uniformly distributed over the interval [−a, a], a > 0. For what value of
a is P(X < 1/2) = 0.7? [2 marks]
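
For a Uniform[−a, a] variable, P(X < t) = (t + a)/(2a) whenever −a ≤ t ≤ a, so the condition P(X < t) = q rearranges to a = t/(2q − 1). A minimal sketch of that rearrangement follows; the helper name `solve_a` is hypothetical, and the threshold t is left as a parameter.

```python
def solve_a(t, q):
    """Solve (t + a) / (2a) = q for a, where X ~ Uniform[-a, a]."""
    if q == 0.5:
        raise ValueError("q = 0.5 forces t = 0 and leaves a undetermined")
    a = t / (2 * q - 1)
    # The formula is only valid if a > 0 and t lies inside [-a, a].
    if a <= 0 or abs(t) > a:
        raise ValueError("no valid a for this (t, q) pair")
    return a


def uniform_cdf(t, a):
    """CDF of Uniform[-a, a] at t, for -a <= t <= a."""
    return (t + a) / (2 * a)


a = solve_a(0.5, 0.7)
print(a, uniform_cdf(0.5, a))
```

Plugging the returned a back into `uniform_cdf` recovers the target probability, which is a quick way to check a hand-derived answer.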

4. Let the random variables X and Y each have range {1, 2}. The following formula gives
the joint PMF:

P(X = i, Y = j) = (i + j)/k,

where k is an unknown value. Find P(1 ≤ X ≤ 2, 1 < Y ≤ 2). [2 marks]

a) 1/12
b) 5/12
c) 1/4
d) 7/12
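
Since the joint PMF must sum to 1, k is fixed by normalisation, after which the requested probability is a sum over the cells with Y = 2. A minimal sketch using exact fractions (the variable names are illustrative):

```python
from fractions import Fraction

support = [1, 2]

# Unnormalised weights i + j for every (i, j) pair in the joint range.
weights = {(i, j): i + j for i in support for j in support}

# Normalisation: the PMF sums to 1, so k is the total weight.
k = sum(weights.values())
pmf = {pair: Fraction(w, k) for pair, w in weights.items()}

# P(1 <= X <= 2, 1 < Y <= 2): every X value qualifies, only Y = 2 does.
prob = sum(p for (i, j), p in pmf.items() if 1 <= i <= 2 and 1 < j <= 2)

print(k, prob)
```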

2 Section III

1. The standard method of screening for a disease fails to detect the presence of the
disease in 15% of the patients. To determine if the new method of screening is superior,
suppose that 75 people with the disease are chosen at random and screened using the
new method. The new method failed to detect presence of the disease in 6 of the
patients.

(i) Identify the null and the alternative hypothesis. [1 mark]


(a) H_0: p = 0.15, H_A: p ≠ 0.15
(b) H_0: p = 0.15, H_A: p > 0.15
(c) H_0: p = 0.15, H_A: p < 0.15
(d) H_0: p ≠ 0.15, H_A: p = 0.15
(ii) Find the P-value of the test. Enter the answer correct to two decimal places. [2 marks]
Use F_Z(1.94) = 0.97381

(iii) Will you accept or reject the null hypothesis at a significance level of 5%? [2
marks]
(a) Accept the null hypothesis.
(b) Reject the null hypothesis.
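
A one-proportion z-test on these counts can be sketched as follows. This assumes a left-tailed alternative (a superior method misses the disease less often than 15% of the time) and a Normal approximation with the null standard error. It computes z directly from the counts in the question and does not use the tabulated F_Z(1.94) hint, so the value it prints may differ from one obtained through that hint.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.15          # failure rate of the standard method (null value)
n = 75             # patients screened with the new method
failures = 6       # cases the new method failed to detect
alpha = 0.05       # significance level

p_hat = failures / n                    # observed failure proportion
se = sqrt(p0 * (1 - p0) / n)            # standard error under H0
z = (p_hat - p0) / se                   # test statistic

# Left-tailed P-value: probability of a proportion this small or smaller under H0.
p_value = NormalDist().cdf(z)
reject = p_value < alpha

print(round(z, 3), round(p_value, 4), reject)
```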

2. Let X be a discrete random variable with the following PMF:

x          −2     −1     0            1
f_X(x)     p/2    p/2    (1 − p)/2    (1 − p)/2

where 0 ≤ p ≤ 1. Consider the estimation of p from the samples: −1, 0, 1, 1, −2.

(i) Find the method of moments estimate of p for the given sample. Enter the answer
correct to two decimal places. [2 marks]

(ii) Find the maximum likelihood estimate of p for the given sample. Enter the answer
correct to one decimal place. [2 marks]

(iii) Using a Uniform[0, 1] prior, find the posterior mean of p for the given sample. Enter
the answer correct to two decimal places. [2 marks]
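
The three estimates can be sketched with exact fractions. The sketch relies on two facts derived from the PMF: E[X] = (1 − 4p)/2, and the likelihood is proportional to p^k (1 − p)^(n − k), where k counts the samples equal to −2 or −1. The variable names are illustrative.

```python
from fractions import Fraction

sample = [-1, 0, 1, 1, -2]
n = len(sample)

# Method of moments: set the sample mean equal to E[X] = (1 - 4p) / 2.
sample_mean = Fraction(sum(sample), n)
p_mom = (1 - 2 * sample_mean) / 4

# Likelihood is proportional to p^k (1 - p)^(n - k),
# where k is the number of samples that fall in {-2, -1}.
k = sum(1 for x in sample if x in (-2, -1))
p_mle = Fraction(k, n)

# Uniform[0, 1] prior gives a Beta(k + 1, n - k + 1) posterior.
p_post_mean = Fraction(k + 1, n + 2)

print(float(p_mom), float(p_mle), round(float(p_post_mean), 2))
```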

3. A company has installed two machines, A and B, to fill bottles with oil. The standard
deviations of the amount of oil from machines A and B are 0.6 and 0.5 litres,
respectively. The company manager suspects that the average amounts filled by the
two machines in a bottle are different. He took n samples from machine A and found
the average amount to be 1.2 litres. A sample of 50 bottles from machine B results in
an average amount of 1.6 litres. If the P-value of the test is 0.01, find the value of n.
[4 marks]
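
One way to back out n is sketched below. It assumes a two-tailed z-test with known standard deviations, so that the difference of sample means has variance 0.36/n + 0.25/50 under the null hypothesis; the variable names are illustrative.

```python
from statistics import NormalDist

sigma_a, sigma_b = 0.6, 0.5      # known standard deviations (litres)
mean_a, mean_b = 1.2, 1.6        # observed sample means (litres)
n_b = 50                         # sample size for machine B
p_value = 0.01                   # reported two-tailed P-value

# Two-tailed test: P-value = 2 * (1 - F_Z(|z|)), so |z| is the (1 - p/2) quantile.
z = NormalDist().inv_cdf(1 - p_value / 2)

# |mean_a - mean_b| / sqrt(var_diff) = z, hence var_diff = (difference / z)^2.
var_diff = ((mean_a - mean_b) / z) ** 2

# var_diff = sigma_a^2 / n + sigma_b^2 / n_b, solved for n.
n_a = sigma_a ** 2 / (var_diff - sigma_b ** 2 / n_b)

print(round(z, 3), round(n_a, 2), round(n_a))
```

The solution is not an exact integer because the quantile is irrational; rounding to the nearest whole number of bottles is the natural final step.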

4. Select the steps from the following options, that you will use for finding the value of n
in the previous question.
Note: This question is optional. We will check your answer to this question if you
make a mistake in the previous one. [0 Marks]

(a) Null hypothesis, H_0: µ_1 = µ_2.
(b) Null hypothesis, H_0: µ_1 ≠ µ_2.
(c) Right tailed test is used.
(d) Left tailed test is used.
(e) Two tailed test is used.
(f) t-test is used.
(g) z-test is used.
(h) X̄ − Ȳ ∼ Normal(µ_1 − µ_2, 0.36/n + 0.25/50)
(i) X̄ − Ȳ ∼ Normal(µ_1 − µ_2, 0.61)
(j) X̄ − Ȳ ∼ Normal(0, 0.36/n + 0.25/50)
(k) Test: Reject H_0 if |Ȳ − X̄| < c.
(l) Test: Reject H_0 if |Ȳ − X̄| > c.

5. The waiting time X (in minutes) of customers at a restaurant is exponentially distributed
with unknown mean 1/λ. The waiting times of the last five customers are 15, 10, 5, 12, 8.

(a) Find the maximum likelihood estimate of λ. Write your answer correct to two
decimal places. [2 marks]
(b) Find the posterior mean of λ using Exponential(1/10) prior. Write your answer
correct to two decimal places. [3 marks]

Answer to (a): λ̂ = 5/50 = 0.10.
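
Both parts can be sketched numerically. The sketch assumes Exponential(1/10) denotes a prior on λ with rate parameter 1/10 (density proportional to exp(−λ/10)); under that reading the posterior is Gamma(n + 1, Σx + 1/10). The prior rate is left as a parameter because a different parametrisation would change the result.

```python
waits = [15, 10, 5, 12, 8]       # observed waiting times (minutes)
n = len(waits)
total = sum(waits)

# MLE for an exponential rate: number of samples divided by their sum.
lam_mle = n / total


def posterior_mean(prior_rate):
    """Posterior mean of lambda under an Exponential(prior_rate) prior.

    Likelihood: lambda^n * exp(-lambda * total)
    Prior:      proportional to exp(-prior_rate * lambda)
    Posterior:  Gamma(shape = n + 1, rate = total + prior_rate)
    """
    return (n + 1) / (total + prior_rate)


print(round(lam_mle, 2), round(posterior_mean(1 / 10), 2))
```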

6. Let X be a continuous random variable with pdf f (x) = 6x(1 − x), 0 ≤ x ≤ 1.
Determine the CDF of X. [4 marks]

(i)   F_X(x) = 0             for x < 0
               x(3 − 2x)     for 0 ≤ x < 1
               1             for x ≥ 1

(ii)  F_X(x) = 0             for x < 0
               6(1 − 2x)     for 0 ≤ x < 1
               1             for x ≥ 1

(iii) F_X(x) = 0             for x < 0
               3 − 2x        for 0 ≤ x < 1
               1             for x ≥ 1

(iv)  F_X(x) = 0             for x < 0
               x²(3 − 2x)    for 0 ≤ x < 1
               1             for x ≥ 1
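
A candidate CDF can be checked by integrating the pdf numerically and comparing on a few points in (0, 1). The following is a minimal sketch using a midpoint rule; the grid, the tolerance, and the function names are arbitrary illustrative choices.

```python
def pdf(x):
    """Density f(x) = 6x(1 - x) on [0, 1]."""
    return 6 * x * (1 - x)


def cdf_numeric(x, steps=10_000):
    """Midpoint-rule approximation of the integral of pdf from 0 to x."""
    if x <= 0:
        return 0.0
    x = min(x, 1.0)
    h = x / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h


# Middle branch (0 <= x < 1) of each option in the question.
candidates = {
    "(i)": lambda x: x * (3 - 2 * x),
    "(ii)": lambda x: 6 * (1 - 2 * x),
    "(iii)": lambda x: 3 - 2 * x,
    "(iv)": lambda x: x ** 2 * (3 - 2 * x),
}

grid = [0.1, 0.25, 0.5, 0.75, 0.9]
matches = [
    name
    for name, f in candidates.items()
    if all(abs(f(x) - cdf_numeric(x)) < 1e-6 for x in grid)
]
print(matches)
```

The analytic route gives the same conclusion: integrating 6t(1 − t) from 0 to x yields 3x² − 2x³.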

7. The density function of a continuous random variable X is given by

f_X(x) = (1/(2σ)) exp(−|x|/σ),   −∞ < x < ∞.

Let ten samples 4, −2, 2, 1, −4, 5, −1, 2, 3, 6 be taken from X. Find the maximum
likelihood estimate of σ. [4 marks]
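
For this density the log-likelihood is −n log(2σ) − Σ|x_i|/σ, and setting its derivative to zero gives σ̂ = (1/n) Σ|x_i|. A minimal sketch that computes the estimate and checks that nearby values of σ give a lower log-likelihood (the names are illustrative):

```python
from math import log

sample = [4, -2, 2, 1, -4, 5, -1, 2, 3, 6]
n = len(sample)
abs_sum = sum(abs(x) for x in sample)


def log_likelihood(sigma):
    """Log-likelihood of the sample under f(x) = exp(-|x| / sigma) / (2 * sigma)."""
    return -n * log(2 * sigma) - abs_sum / sigma


# Closed-form maximiser: the mean absolute value of the sample.
sigma_hat = abs_sum / n

print(sigma_hat, log_likelihood(sigma_hat))
```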
