LOG708 Applied Statistics 24.nov.2020
Front page
Keep your mobile phone close by. Important messages concerning all candidates may be sent by SMS.
The best of luck!
Assignment
You can find the exam question set in the panel on the left. If you wish to download the question set to your
machine, follow this link: LOG708_H2020_23.11.2020 (002)
If you are not able to see the exam question set in Inspera, you can also find the question set in Canvas.
Write your answer in Word and save the document as one single PDF file on your own machine. Upload your
PDF file below.
Your file is saved in Inspera until the deadline for handing in your assignment. After the deadline has passed,
the last version of any uploaded PDF files is submitted automatically.
Question 1
Attached
Problem 1 [25%]
Mekanic AS operates five production lines for their best-selling component called MDX100.
A production supervisor at Mekanic has randomly selected 150 units of MDX100 from the
five production lines (Lines A to E) and created a chart (Figure 1) to summarize her data.
(a) Among the perfect units, what proportion is approximately contributed by Line B?
[2.5%]
(b) Suggest an alternative method that the supervisor could use to summarize the data.
[2.5%]
(c) Suppose the supervisor reclassified the variable “Condition” into two categories:
Defective (components with major or minor fault) and Perfect. Sketch a simple bar chart
to summarize the new variable. [4%]
The supervisor decided to examine historical data on the daily number of defective units
produced by each of the production lines for the last 81 days. The summary statistics are as
follows:
Table 1: Daily number of defective components
                     Line A  Line B  Line C  Line D  Line E
Mean                     40      35      20      30      25
Standard deviation       16      10       7       9       8
(d) Which of the production lines has the most predictable number of defective
components? Why? [8%]
(e) Using the information provided, create a 99% confidence interval for the average
number of defective components produced by Line A. [8%]
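One possible way to approach tasks (d) and (e) is sketched below, using only the figures in Table 1. It rests on two assumptions that are not stated in the problem: that "most predictable" is judged by the coefficient of variation (standard deviation divided by mean) rather than by the raw standard deviation, and that with 81 days of data a normal (z-based) interval is an acceptable approximation to the t-based interval with 80 degrees of freedom. The function names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics copied from Table 1 (81 days of historical data).
N_DAYS = 81
MEANS = {"A": 40, "B": 35, "C": 20, "D": 30, "E": 25}
STD_DEVS = {"A": 16, "B": 10, "C": 7, "D": 9, "E": 8}


def coefficient_of_variation(mean, std_dev):
    """Relative spread: standard deviation as a fraction of the mean."""
    return std_dev / mean


def normal_confidence_interval(mean, std_dev, n, level):
    """Large-sample (z-based) confidence interval for a population mean.

    Assumption: n is large enough that z can stand in for the t quantile.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * std_dev / sqrt(n)
    return mean - margin, mean + margin


# (d) Compare relative variability across the five lines.
cv = {line: coefficient_of_variation(MEANS[line], STD_DEVS[line]) for line in MEANS}
most_predictable = min(cv, key=cv.get)

# (e) 99% interval for the mean daily number of defectives on Line A.
ci_low, ci_high = normal_confidence_interval(MEANS["A"], STD_DEVS["A"], N_DAYS, 0.99)

print({line: round(value, 3) for line, value in cv.items()})
print("Lowest coefficient of variation: Line", most_predictable)
print(f"99% CI for Line A: ({ci_low:.2f}, {ci_high:.2f})")
```

If the raw standard deviation were used as the criterion instead, Line C would have the smallest value, so the answer to (d) depends on which measure of predictability is argued for. A t-based interval would be slightly wider than the z-based one computed here.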
Problem 2 [25%]
A dataset about energy use of appliances in a low-energy house consists of 95 observations.
One of the variables in the dataset is T1, which represents energy use of light fixtures
measured in Watt-hours. Figure 2 shows the distribution of T1.
AT = 2T1 + 3T2
If the values of T2 are also normally distributed and the analyst has observed a mean of
19.5 and a standard deviation of 4, find the probability that AT is less than 95. [10%]
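The calculation asked for here can be sketched as a small function. It assumes that T1 and T2 are independent, which the text above does not state, so that AT = 2T1 + 3T2 is normal with mean 2·mean(T1) + 3·mean(T2) and variance 4·var(T1) + 9·var(T2). The mean and standard deviation of T1 are not given in this part of the text, so the T1 values below are clearly marked placeholders and not exam data; `prob_at_below` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def prob_at_below(threshold, mu1, sd1, mu2, sd2, a=2.0, b=3.0):
    """P(a*T1 + b*T2 < threshold) for independent normal T1 and T2."""
    mean_at = a * mu1 + b * mu2
    sd_at = sqrt((a * sd1) ** 2 + (b * sd2) ** 2)
    return NormalDist(mean_at, sd_at).cdf(threshold)


# The T2 values come from the problem text.
# The T1 values are PLACEHOLDERS for illustration only, not exam data.
HYPOTHETICAL_T1_MEAN = 20.0
HYPOTHETICAL_T1_SD = 2.0

p = prob_at_below(95, HYPOTHETICAL_T1_MEAN, HYPOTHETICAL_T1_SD, mu2=19.5, sd2=4)
print(f"P(AT < 95) with placeholder T1 values: {p:.4f}")
```

Replacing the two placeholder constants with the mean and standard deviation of T1 from the earlier part of the problem gives the requested probability under the independence assumption.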
Problem 3 [25%]
This problem consists of tasks (a) and (b). For task (a), assume that the values of the given
variables are normally distributed.
(a) Terje has surveyed all physical stores in Oslo and found that the average price for product
X is 171 NOK. He then randomly visited five online stores and observed the following
prices.
Task: Test the hypothesis that, on average, it is cheaper to buy product X online. [10%]
(b) Assume that the Ministry of Trade in Norway conducted an extensive study in June and
concluded that 45% of companies in Norway had lost between 5% and 25% of sales since
the onset of COVID-19. In September, a trade organization called Norsk Industri
conducted a similar study and found that, among the 360 companies that responded to the
survey, 51% had lost between 5% and 25% of sales since the onset of the pandemic.
Tasks:
(i) Using Norsk Industri’s report, calculate the margin of error for the estimate of the
proportion of companies that lost between 5% and 25% of sales since the onset of the
pandemic. [5%]
(ii) Test the following hypothesis: “The proportion reported by Norsk Industri is
significantly different from the figure reported by the Ministry of Trade”. [10%]
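A minimal sketch of tasks (i) and (ii) follows. The problem does not state a confidence level for the margin of error or a significance level for the test, so the 95% level used below is an assumption. The test is a two-sided one-sample z-test that treats the Ministry's 45% as a fixed reference value, with the standard error computed under the null hypothesis.

```python
from math import sqrt
from statistics import NormalDist

N = 360          # companies responding to the Norsk Industri survey
P_HAT = 0.51     # sample proportion reported by Norsk Industri
P_0 = 0.45       # proportion reported by the Ministry of Trade
CONFIDENCE = 0.95  # assumption: the problem does not specify a level

standard_normal = NormalDist()

# (i) Margin of error for the sample proportion.
z_crit = standard_normal.inv_cdf(1 - (1 - CONFIDENCE) / 2)
margin_of_error = z_crit * sqrt(P_HAT * (1 - P_HAT) / N)

# (ii) Two-sided z-test of H0: p = 0.45 against H1: p != 0.45.
z_stat = (P_HAT - P_0) / sqrt(P_0 * (1 - P_0) / N)
p_value = 2 * (1 - standard_normal.cdf(abs(z_stat)))

print(f"Margin of error: {margin_of_error:.4f}")
print(f"z statistic: {z_stat:.3f}, two-sided p-value: {p_value:.4f}")
```

Under these assumptions the conclusion depends on the chosen significance level, since the p-value falls between 0.01 and 0.05.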
Problem 4 [25%]
In this problem, we will continue to use the energy use dataset (95 observations). After some
preliminary analyses, the analyst conducted a regression analysis to explain the variation in the
energy use of light fixtures. Table 3 presents an extract of her dataset, consisting of six variables
whose names and descriptions are as follows:
The analyst conducted the analysis and Table 4 presents a portion of the results.
            Estimate   t value
Intercept      12.35      3.29
RH_1           -0.06     -2.57
RH_2           -0.78     -9.79
T3              0.45      2.25
RH_3            0.81     12.81
HMD1           -0.27     -3.39
HMD2            0.03      0.44
R² = 0.85
Note:
HMD1 and HMD2 are dummy variables defined as follows:
HMD1 = 1 if the humidity outside is low, otherwise, HMD1 = 0
HMD2 = 1 if the humidity outside is moderate, otherwise, HMD2 = 0
(c) Using a 99% confidence interval, would you conclude that RH_1 is a significant predictor
of T1? [5%]
(d) If the total sum of squares for the estimated model is 37.27, compute the adjusted R² and an
estimate of the standard deviation of the random term. [5%]
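The arithmetic behind tasks (c) and (d) might look like the sketch below. It assumes the model was fitted on all 95 observations with the six predictors listed in Table 4, giving 95 - 6 - 1 = 88 residual degrees of freedom. For (c), the standard error is recovered as estimate / t value from the rounded table entries, and a normal quantile is used in place of the t quantile with 88 degrees of freedom, so the interval is slightly narrower than the exact t-based one.

```python
from math import sqrt
from statistics import NormalDist

N_OBS = 95         # assumption: every observation was used in the fit
N_PREDICTORS = 6   # RH_1, RH_2, T3, RH_3, HMD1, HMD2
R_SQUARED = 0.85   # from Table 4
SST = 37.27        # total sum of squares given in task (d)

# (d) Adjusted R^2 and the residual standard deviation.
df_resid = N_OBS - N_PREDICTORS - 1
adjusted_r2 = 1 - (1 - R_SQUARED) * (N_OBS - 1) / df_resid
sse = (1 - R_SQUARED) * SST            # unexplained sum of squares
residual_std = sqrt(sse / df_resid)    # estimate of the random term's std dev

# (c) Approximate 99% confidence interval for the RH_1 coefficient.
estimate, t_value = -0.06, -2.57       # rounded values from Table 4
std_error = estimate / t_value
z = NormalDist().inv_cdf(0.995)        # normal approximation to t with 88 df
ci_low, ci_high = estimate - z * std_error, estimate + z * std_error

print(f"Adjusted R^2: {adjusted_r2:.4f}")
print(f"Residual standard deviation: {residual_std:.4f}")
print(f"Approximate 99% CI for RH_1: ({ci_low:.4f}, {ci_high:.4f})")
```

Because the approximate interval only just includes zero and the table values are rounded, the conclusion for (c) is best confirmed by comparing |t| = 2.57 with the tabulated t critical value for 88 degrees of freedom.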
Step 3: She estimated a new regression model. Table 5 presents a portion of the results.
Appendix 1
Appendix 2