08 Hypo and Regression 5 JCS
From records, a shot putter has an average throwing distance of 8.92 metres. His
coach taught him a new technique and, after three months of training, his
distances, in metres, on 10 different occasions are as follows.
9.11 8.89 9.02 9.08 9.10 8.90 9.05 8.95 8.87 8.86
The coach is interested to find out if the shot putter has improved with the
new technique.
(i) State, giving your reasons, whether a z-test or t-test should be used. [2]
(ii) Carry out the test in (i) at the 5% significance level, stating any
assumption necessary for validity. [5]
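The arithmetic for part (ii) can be sketched as below. This is only an illustration, assuming a one-tailed t-test of H0: μ = 8.92 against H1: μ > 8.92 with normally distributed distances. The critical value 1.833 is the usual table value for the upper 5% point of t(9) and is hard-coded here as an assumption rather than computed.

```python
import math
import statistics

# Distances (metres) on the 10 occasions given in the question
distances = [9.11, 8.89, 9.02, 9.08, 9.10, 8.90, 9.05, 8.95, 8.87, 8.86]
mu0 = 8.92  # mean distance under H0 (from past records)

n = len(distances)
xbar = statistics.mean(distances)
s = statistics.stdev(distances)  # unbiased estimate, divisor n - 1

# Test statistic for a one-sample t-test
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Assumption: table value for the upper 5% point of t with 9 degrees of freedom
t_crit = 1.833
reject_h0 = t_stat > t_crit

print(f"xbar = {xbar:.3f}, s = {s:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Because the population variance is unknown and the sample is small, the statistic is compared with a t critical value rather than a z value.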
(i) Calculate the value of the product moment correlation coefficient. [1]
(ii) Give a sketch of the scatter diagram for the data and hence comment
on the value of the product moment correlation coefficient found in (i). [3]
A: w = x²,
B: w = 1/x,
C: w = ln x. [2]
The past records of a supermarket show that its customers spend an average of
$55 per visit at this store. To encourage customers to spend more money at the
store, the management of the store initiated a promotional campaign where each
customer will receive a bonus stamp for every $30 spent and these bonus
stamps can be used to redeem products at the store.
A week after the start of the campaign, the manager of the store took a sample of
15 customers who visited the store. The amounts of money (in dollars) spent by
these customers, x, at this supermarket during their visits are summarized by
Σx = 1014, Σx² = 83568.
Assume that the money spent by all customers at this supermarket follows a
normal distribution. Test, at the 10% level of significance, whether the
promotional campaign was successful in encouraging customers to spend more
money at the store. [6]
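A possible sketch of the calculation from the summary statistics, assuming a one-tailed t-test of H0: μ = 55 against H1: μ > 55. The critical value 1.345 is the usual table value for the upper 10% point of t(14) and is entered by hand as an assumption.

```python
import math

# Summary statistics from the question
n = 15
sum_x = 1014
sum_x2 = 83568
mu0 = 55  # mean spend per visit under H0

# Sample mean and unbiased estimate of the population variance
xbar = sum_x / n
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)

# One-sample t statistic
t_stat = (xbar - mu0) / math.sqrt(s2 / n)

# Assumption: table value for the upper 10% point of t with 14 degrees of freedom
t_crit = 1.345
reject_h0 = t_stat > t_crit

print(f"xbar = {xbar:.1f}, s^2 = {s2:.2f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```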
(ii) Calculate the product moment correlation coefficient for the data. [1]
(iii) The Physics score of a sixth student was mislaid but his Mathematics
score was known to be 26.
Use the appropriate line of regression to estimate this student’s Physics
score, and give a reason for the use of the chosen equation. [2]
Comment on the reliability of your estimate. [1]
NJC08
Yau Yan slimming centre offers a particular slimming package for ladies. Based
on past records, the mean weight of a lady after treatment was 50 kg.
In January 2008, a sample of 20 customers was randomly chosen and the weight,
x kg, of each customer after treatment was recorded. The data were summarized
by Σ(x − 50) = 54, Σ(x − 50)² = 1226.
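Whatever test is asked for, the coded sums first have to be turned into estimates of the mean and variance. A short sketch of that step follows; the variable names are illustrative only.

```python
# Coded summary statistics from the question, with x coded as (x - 50)
n = 20
shift = 50
sum_d = 54      # sum of (x - 50)
sum_d2 = 1226   # sum of (x - 50)^2

# Subtracting a constant shifts the mean but leaves the variance unchanged
xbar = shift + sum_d / n
s2 = (sum_d2 - sum_d**2 / n) / (n - 1)  # unbiased estimate of the variance

print(f"xbar = {xbar:.1f} kg, s^2 = {s2:.3f} kg^2")
```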
After a major renovation at the computer-mart Sim Lum Square, its manager
Keith carried out a study to investigate the relationship between the renovation
cost of the shops and their sales figures in the first six months after re-opening.
The data below were collected from eight randomly chosen shops, in thousands
of dollars.
Shop No.                                       1     2     3     4     5     6     7     8
Renovation cost, X (in thousands of dollars)   19.5  23    28.1  31.6  35    38.1  46.5  49.2
Sales figure, Y (in thousands of dollars)      55.5  a     77.2  78.7  79    88    73.4  80
(i) Keith found that the least squares regression line for the above data is
y = 52.244 + bx. It was known that this line and the line of x on y both pass
through the point (33.9, 74.3). Find the value of b to 5 significant figures
and show that a is 62.6. [3]
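The arithmetic behind part (i) can be checked with a few lines. The sketch assumes only what the question states: the line y = 52.244 + bx passes through (33.9, 74.3), and 74.3 is the mean of the eight sales figures.

```python
# Point through which both regression lines pass, taken as (x-bar, y-bar)
x_mean, y_mean = 33.9, 74.3
intercept = 52.244

# The y-on-x line passes through (x-bar, y-bar), so y_mean = intercept + b * x_mean
b = (y_mean - intercept) / x_mean

# The eight sales figures must sum to 8 * y_mean; a is the missing one
known_sales = [55.5, 77.2, 78.7, 79, 88, 73.4, 80]
a = 8 * y_mean - sum(known_sales)

print(f"b = {b:.5g}, a = {a:.1f}")
```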
(ii) Explain why it is not appropriate to use the given regression line in (i) to
predict the renovation cost when given the sales figures. Hence, use an
appropriate regression line to find the renovation cost when the sales
figure is $80000. [3]
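One way to approach part (ii) is sketched below, on the assumption that neither variable was controlled (the shops were randomly chosen), so the line of x on y is the appropriate one for estimating x from y. The helper `fit_line` is a hypothetical name introduced for illustration, and a = 62.6 is taken from part (i).

```python
# Data from the table, in thousands of dollars, with a = 62.6 from part (i)
x = [19.5, 23, 28.1, 31.6, 35, 38.1, 46.5, 49.2]   # renovation cost
y = [55.5, 62.6, 77.2, 78.7, 79, 88, 73.4, 80]     # sales figure


def fit_line(u, v):
    """Least squares line v = p + q*u; returns (p, q)."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    q = s_uv / s_uu
    return v_bar - q * u_bar, q


# Regression of x on y: y is the explanatory variable here
p, q = fit_line(y, x)

# A sales figure of $80000 corresponds to y = 80 (thousands of dollars)
cost_estimate = p + q * 80
print(f"x = {p:.3f} + {q:.4f} y, estimated cost at y = 80: {cost_estimate:.1f} thousand dollars")
```

Rearranging the line of y on x to solve for x generally gives a different answer, because that line minimises squared errors in y rather than in x.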
(iii) A statistician suggested to Keith that the data could be better modeled by
y = c + d ln x. Give a reason to support the statistician’s claim and find the
values of c and d. [3]
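For part (iii), one common way to support such a claim is to compare the product moment correlation coefficient of (x, y) with that of (ln x, y). A rough sketch under that assumption, with `fit_line` and `pearson_r` as hypothetical helper names and a = 62.6 taken from part (i):

```python
import math

# Data from the table, in thousands of dollars, with a = 62.6 from part (i)
x = [19.5, 23, 28.1, 31.6, 35, 38.1, 46.5, 49.2]
y = [55.5, 62.6, 77.2, 78.7, 79, 88, 73.4, 80]


def sums(u, v):
    """Return (S_uu, S_vv, S_uv, u_bar, v_bar) for paired data."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uu = sum((ui - u_bar) ** 2 for ui in u)
    s_vv = sum((vi - v_bar) ** 2 for vi in v)
    s_uv = sum((ui - u_bar) * (vi - v_bar) for ui, vi in zip(u, v))
    return s_uu, s_vv, s_uv, u_bar, v_bar


def pearson_r(u, v):
    """Product moment correlation coefficient of u and v."""
    s_uu, s_vv, s_uv, _, _ = sums(u, v)
    return s_uv / math.sqrt(s_uu * s_vv)


def fit_line(u, v):
    """Least squares line v = p + q*u; returns (p, q)."""
    s_uu, _, s_uv, u_bar, v_bar = sums(u, v)
    q = s_uv / s_uu
    return v_bar - q * u_bar, q


ln_x = [math.log(xi) for xi in x]

r_linear = pearson_r(x, y)     # model y = p + q x
r_log = pearson_r(ln_x, y)     # model y = c + d ln x
c, d = fit_line(ln_x, y)       # regression of y on ln x

print(f"r (x, y) = {r_linear:.3f}, r (ln x, y) = {r_log:.3f}")
print(f"c = {c:.2f}, d = {d:.2f}")
```

A value of r closer to 1 for (ln x, y) would be consistent with the statistician's claim; a scatter diagram that levels off as x increases would be another reason.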
TJC08
(a) The burning time of 15 randomly chosen candles has a mean of 29.4 minutes
and a standard deviation of 2.38 minutes. Find the smallest integer α such that there
is sufficient evidence at α% level of significance to say that the mean burning time
has increased. State any assumptions you have to make in order to carry out the test.
(b) Given that σ is now known to be 3.2 minutes, find the range of values of the
sample mean obtained from a sample of size 100 that would lead to the
rejection of the null hypothesis in favour of the alternative hypothesis that the
mean burning time has changed at the 2% level of significance. [5,4]
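For part (b), the critical region of a two-tailed z-test is symmetric about the null mean. Since the null mean does not appear in the text above, the sketch below computes only the half-width of the acceptance region; the sample mean leads to rejection when it lies further than this from the null mean.

```python
import math
from statistics import NormalDist

sigma = 3.2    # known population standard deviation (minutes)
n = 100        # sample size
alpha = 0.02   # two-tailed significance level

# Each tail has probability alpha / 2, so use the 99th percentile of N(0, 1)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Reject H0 when |sample mean - null mean| exceeds this half-width
half_width = z_crit * sigma / math.sqrt(n)

print(f"z = {z_crit:.4f}, reject H0 if the sample mean is more than "
      f"{half_width:.3f} minutes from the null mean")
```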
With the aid of a suitable diagram, explain what is meant by the “least squares
regression line of y on x”, where x and y are two observed variable quantities of a
given population. [2]
(i) Given that the estimated regression line of v on t is v = 19.19 + 0.625t,
find the value of k, giving your answer to the nearest integer. [3]
Calculate the value of the product-moment correlation coefficient. [1]
(ii) If V is the value obtained by substituting a sample value of t into the equation
of the regression line of v on t, find Σ(v − V)². [2]
(iii) Find the expected value of v when t = 90 and comment on the result obtained.
(v) Using the model chosen in (iv), estimate the value of v when t = 90. Compare
and comment on the result obtained here, and in (iii). [2]
AJC08
(ii) Find an inequality satisfied by the significance level of the test where the
claim is not justified. [3]
From the collected data, where 1 < x < 9 and 1 < y < 5, the equation of the line of
regression of y on x is found to be 20y + 9x = 104, and the equation of the line of
regression of x on y is x + 2y = 11.
(i) Find (x̄, ȳ). [1]
(ii) Find the value of the product moment correlation coefficient between x
and y. [1]
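Parts (i) and (ii) rest on two standard facts: both regression lines pass through the point of means, and the product of their slopes equals r², with r taking the sign of the slopes. A short sketch of the arithmetic:

```python
import math

# y on x: 20y + 9x = 104  ->  y = 5.2 - 0.45x
# x on y: x + 2y = 11     ->  x = 11 - 2y
slope_y_on_x = -9 / 20
slope_x_on_y = -2

# Point of means: substitute x = 11 - 2y into 20y + 9x = 104,
# giving (20 - 18)y = 104 - 99
y_bar = (104 - 9 * 11) / (20 - 9 * 2)
x_bar = 11 - 2 * y_bar

# r^2 is the product of the two slopes; r shares their (negative) sign
r = math.copysign(math.sqrt(slope_y_on_x * slope_x_on_y), slope_y_on_x)

print(f"(x_bar, y_bar) = ({x_bar}, {y_bar}), r = {r:.4f}")
```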
(iii) Estimate the amount of time that a student sleeps if he spends 4 hours
gaming on a certain day. Comment on the reliability of this value. [2]
(c) The Consumer Price Index (CPI) and the price of gold are thought to be
related by the equation y = Ax^B, where A and B are constants. Six pairs of
observations are given as follows:
CPI (x)      1.5   2.1   3.5   4.7   5.1   7.2
Gold ($ y)   208   284   410   514   561   950
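A power law of this form becomes linear after taking logarithms: ln y = ln A + B ln x. The sketch below estimates A and B by a least squares fit of ln y on ln x. This is one reasonable approach, not necessarily the method the question intends, and the variable names are illustrative.

```python
import math

# Data from the table
cpi = [1.5, 2.1, 3.5, 4.7, 5.1, 7.2]     # x
gold = [208, 284, 410, 514, 561, 950]    # y, in dollars

# Linearise: ln y = ln A + B ln x
ln_x = [math.log(v) for v in cpi]
ln_y = [math.log(v) for v in gold]

n = len(cpi)
mean_lx = sum(ln_x) / n
mean_ly = sum(ln_y) / n

# Least squares slope and intercept of ln y on ln x
s_xy = sum((u - mean_lx) * (v - mean_ly) for u, v in zip(ln_x, ln_y))
s_xx = sum((u - mean_lx) ** 2 for u in ln_x)
B = s_xy / s_xx
A = math.exp(mean_ly - B * mean_lx)

print(f"A = {A:.1f}, B = {B:.3f}")
```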
(ii) A gold trader claims that the recent increase in the price of gold is due to
the high CPI. Explain whether you agree with his statement. [1]