
F29IJ2

HERIOT-WATT UNIVERSITY
_______________________

SCHOOL OF MATHEMATICAL
AND COMPUTER SCIENCES

COMPUTER
SCIENCE
_______________________

DATA ANALYSIS & SIMULATION

Friday 29th April 2005 – 9.30am to 11.30am

Answer THREE questions

Data Sheets & Graph Paper Provided

Candidates may only use a University approved calculator
JP – Q1, Q2, Q3 & QA 2 F29IJ2

1. (a) A media production company converts an analogue recording to a digital file using
one of two machines, A or B. Generally, machine A is used 55% of the time and
machine B the rest. Unfortunately the converted signal is susceptible to errors that
on occasion make the digital file unreadable. This happens with a probability of
0.06 on machine A and a probability of 0.14 on machine B.

A digital file is selected at random from all files produced one day and is found to
be faulty. By drawing a tree diagram, or otherwise, calculate the probability that
this file has been produced by machine B.
(7)
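
A minimal sketch of the calculation in part (a), assuming the tree diagram's branches are combined with the law of total probability and Bayes' theorem. The variable names are illustrative only.

```python
# Probability that each machine produced the file (from the question).
p_machine = {"A": 0.55, "B": 0.45}
# Probability that a file is unreadable, given the machine that produced it.
p_faulty_given = {"A": 0.06, "B": 0.14}

# Law of total probability: add the "faulty" branches of the tree.
p_faulty = sum(p_machine[m] * p_faulty_given[m] for m in p_machine)

# Bayes' theorem: P(machine B | file is faulty).
p_b_given_faulty = p_machine["B"] * p_faulty_given["B"] / p_faulty

print(f"P(faulty) = {p_faulty:.4f}")
print(f"P(B | faulty) = {p_b_given_faulty:.4f}")
```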

(b) The probability of any customer spending more than £50 on a visit to a particular
computer superstore has been estimated as 0.36. A random sample of 5 customers
who had made purchases is taken in one hour of business.

(i) Calculate p(0), p(1), p(2), …, p(5), where p(x) is the probability that x
customers from the sample spent more than £50 on their purchases.
(6)

(ii) Draw a bar chart to display the information in (i).
(3)

(iii) What is the probability that fewer than 3 customers from the sample spent
more than £50?
(2)
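
A short sketch of parts (i) and (iii), assuming the five customers spend independently so that the count follows a Binomial(5, 0.36) distribution. The helper name binom_pmf is hypothetical.

```python
from math import comb

n, p = 5, 0.36  # sample size and P(spend > £50), both from the question


def binom_pmf(x: int) -> float:
    """P(X = x) for X ~ Binomial(n, p), assuming independent customers."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


pmf = [binom_pmf(x) for x in range(n + 1)]
for x, prob in enumerate(pmf):
    print(f"p({x}) = {prob:.4f}")

# Part (iii): "fewer than 3" means X = 0, 1 or 2.
p_fewer_than_3 = sum(pmf[:3])
print(f"P(X < 3) = {p_fewer_than_3:.4f}")
```

The values in pmf are also the bar heights needed for part (ii).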

(c) (i) What are some of the main uses of the Weibull distribution?
(3)

(ii) The time to failure (in hours) of the roller ball in a particular type of mouse
is modelled as a Weibull random variable with shape parameter 0.25 and scale
parameter 10 000. Determine the probability that a roller ball lasts fewer than
300 000 hours and give the mean time to failure of the roller ball.
(4)
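
A hedged sketch of part (ii), assuming that 0.25 is the Weibull shape parameter and 10 000 is the scale parameter, and using the standard forms F(t) = 1 - exp(-(t/scale)^shape) and mean = scale × Γ(1 + 1/shape).

```python
from math import exp, gamma

# Assumption: 0.25 is the shape and 10 000 the scale of the Weibull model.
shape, scale = 0.25, 10_000.0
t = 300_000.0  # hours

# Cumulative distribution function evaluated at t.
p_fail_before_t = 1 - exp(-((t / scale) ** shape))

# Mean time to failure: scale * Gamma(1 + 1/shape).
mean_ttf = scale * gamma(1 + 1 / shape)

print(f"P(T < {t:.0f}) = {p_fail_before_t:.4f}")
print(f"Mean time to failure = {mean_ttf:.0f} hours")
```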

2. (a) In non-technical language, state the Central Limit Theorem and explain how it can
be used to obtain a confidence interval for a population mean when only a random
sample of values is available.
(5)

(b) The number of hits for two competing websites is recorded on 6 random days
during a month. The results are given in Table 2.1 below:

Website A 22 18 31 26 29 19
Website B 23 17 28 420 20 19

Table 2.1

(i) Obtain the mean, median and standard deviation for the number of hits on
Website A.
(4)

(ii) It is thought that the fourth result recorded for Website B might be a
mistake. Give the preferred measures of central tendency and spread if this
were indeed the case, explaining the reasons for your choice.
(4)

(iii) Calculate 95% confidence intervals for the mean number of hits per day for
each website. For Website B, exclude the fourth result.
(5)
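
A minimal sketch of parts (i) and (iii), assuming that a Student's t interval is appropriate for such small samples. The critical values are copied from standard two-sided 95% t-tables, and the helper name t_interval is hypothetical.

```python
from math import sqrt
from statistics import mean, median, stdev

site_a = [22, 18, 31, 26, 29, 19]
site_b = [23, 17, 28, 20, 19]  # fourth result (420) excluded, as instructed

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT = {4: 2.776, 5: 2.571}


def t_interval(data):
    """95% confidence interval for the mean, using the sample standard deviation."""
    n = len(data)
    half_width = T_CRIT[n - 1] * stdev(data) / sqrt(n)
    return mean(data) - half_width, mean(data) + half_width


print(f"A: mean={mean(site_a):.2f} median={median(site_a)} sd={stdev(site_a):.2f}")
ci_a = t_interval(site_a)
ci_b = t_interval(site_b)
print(f"95% CI for A: ({ci_a[0]:.2f}, {ci_a[1]:.2f})")
print(f"95% CI for B: ({ci_b[0]:.2f}, {ci_b[1]:.2f})")
```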

(c) Use an appropriate hypothesis test to check whether there is a significant difference
between the number of hours worked per week by employees in the two companies
listed in Table 2.2 below:

            Mean number of    Standard deviation    Sample size taken
            hours worked
Company P   39.6              7.8                   40
Company Q   42.3              8.1                   50

Table 2.2
(7)
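
One possible approach to part (c) is a two-sample z-test. This sketch assumes both samples are large enough to treat the sample standard deviations as known and the test statistic as approximately normal.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from Table 2.2.
mean_p, sd_p, n_p = 39.6, 7.8, 40
mean_q, sd_q, n_q = 42.3, 8.1, 50

# Standard error of the difference between the two sample means.
se = sqrt(sd_p**2 / n_p + sd_q**2 / n_q)

# Test statistic for H0: the population means are equal.
z = (mean_p - mean_q) / se

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, p-value = {p_value:.3f}")
print("Reject H0 at 5%" if p_value < 0.05 else "Do not reject H0 at 5%")
```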

3. A fast food company recorded the costs it incurred whilst serving various numbers of
customers per day. The results are given in Table 3.1 below.

Customers (thousands)   1.1 2.0 1.8 3.1 2.5 0.6 2.1 1.4 1.4 0.9
Costs (£hundreds)       5.1 5.6 5.2 5.9 5.3 4.8 5.7 4.9 5.1 4.8

Table 3.1

Let x represent the number of customers (in thousands) and y represent the costs (in
hundreds of pounds).

(a) Draw a scatter diagram of these data.
(3)

(b) Calculate the Product-Moment Correlation Coefficient and explain how this value
relates to the pattern of the points on your scatter diagram.
(7)

(c) Find the least squares linear regression equation for y in terms of x.
(6)

(d) How would you use the equation to estimate costs if the number of customers
became greater than 5000 one day? Comment on the reliability of the estimate.
(4)

(e) Perform a hypothesis test on the slope coefficient to check whether or not it is
significantly different from zero. Show all your workings.
(5)

N.B. Some calculations have already been carried out to make your work simpler
in this question.

Σx = 16.9   Σx² = 33.81   Σy = 52.4   Σy² = 275.9   Σxy = 90.88
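
A hedged sketch of how the pre-computed sums feed into parts (b), (c) and (e), assuming the usual corrected sums of squares (Sxx, Syy, Sxy) and a t-test on the slope with n - 2 degrees of freedom. The variable names are illustrative.

```python
from math import sqrt

n = 10  # number of (x, y) pairs in Table 3.1
sum_x, sum_x2 = 16.9, 33.81
sum_y, sum_y2 = 52.4, 275.9
sum_xy = 90.88

# Corrected sums of squares and products.
sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n
sxy = sum_xy - sum_x * sum_y / n

# (b) Product-moment correlation coefficient.
r = sxy / sqrt(sxx * syy)

# (c) Least squares line y = intercept + slope * x.
slope = sxy / sxx
intercept = sum_y / n - slope * sum_x / n

# (e) t statistic for H0: slope = 0.
residual_ss = syy - slope * sxy
se_slope = sqrt(residual_ss / (n - 2) / sxx)
t_stat = slope / se_slope

print(f"r = {r:.4f}")
print(f"y = {intercept:.4f} + {slope:.4f} x")
print(f"t = {t_stat:.3f} on {n - 2} degrees of freedom")
```

The t statistic would be compared with the tabulated two-sided 5% value for 8 degrees of freedom (2.306). For part (d), note that x = 5 lies well outside the observed range of 0.6 to 3.1, so any prediction there is an extrapolation.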



4. (a) A computer package was used to create 20 random numbers and the results are
shown in Table 4.1 below:

3.173307 3.038650 3.404472 3.678099 3.364624
3.603936 3.805580 3.550784 3.286465 3.200794
3.182092 3.544498 3.524648 3.832155 3.172050
3.721423 3.283679 3.630457 3.169508 3.402586

Table 4.1

(i) Use a Chi-squared Frequency test to test whether these numbers are, in fact,
random.
(6)

(ii) List some advantages and disadvantages of using simulation in the analysis
of a practical problem.
(6)

(b) (i) Outline the conditions under which the standard queueing formulae are
applicable.
(5)

(ii) A firm operates a 10 tonne crane on a job contracting basis. It has been
shown over a long period of time that contract arrivals follow the Poisson
distribution with a mean number of 1.4 per week. The service rate is 1.8 per
week. Calculate the length of the queue and the average time spent in it.
(5)

(iii) A competitor starts up in the area causing demands to fall to 1.2 per week.
What effect does this have on the percentage of idle time?
(3)
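
A minimal sketch of parts (ii) and (iii), assuming a single-server M/M/1 queue (Poisson arrivals, exponential service times) and reading "length of the queue" as the mean number waiting, excluding the contract in service. The function name mm1_summary is hypothetical.

```python
def mm1_summary(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state M/M/1 measures; requires arrival_rate < service_rate."""
    rho = arrival_rate / service_rate      # traffic intensity (utilisation)
    queue_length = rho**2 / (1 - rho)      # mean number waiting, Lq
    queue_time = queue_length / arrival_rate  # mean wait in queue, Wq (Little's law)
    return {
        "utilisation": rho,
        "queue_length": queue_length,
        "queue_time_weeks": queue_time,
        "idle_fraction": 1 - rho,
    }


before = mm1_summary(arrival_rate=1.4, service_rate=1.8)
after = mm1_summary(arrival_rate=1.2, service_rate=1.8)

print(f"Lq = {before['queue_length']:.3f} contracts")
print(f"Wq = {before['queue_time_weeks']:.3f} weeks")
print(f"Idle time: {before['idle_fraction']:.1%} -> {after['idle_fraction']:.1%}")
```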

END OF PAPER
