LOG708 Applied Statistics 24.nov.2020

Front page
LOG708 Applied Statistics

Date: 24.11.20
Time: 09.00-13.30, including time for any practical and technical actions needed to hand in your exam
paper.
Supporting materials: All supporting materials permitted. It is not allowed to cooperate with or receive
help from others.
Number of papers in exam question set: 8
Technical, administrative or academic questions: studentweb@himolde.no or +47 71 19 59 90
Keep your mobile phone close by. Important messages concerning all candidates might come by sms.
The best of luck!
Assignment
You can find the exam question set in the panel on the left. If you wish to download the question set to your
machine, follow this link: LOG708_H2020_23.11.2020 (002)
If you are not able to see the exam question set in Inspera, you can also find the question set in Canvas.
Write your answer in Word and save the document as one single PDF file on your own machine. Upload your
PDF file below.
Your file is saved in Inspera until the deadline for handing in your assignment. After the deadline has passed,
the last version of any uploaded PDF files is submitted automatically.
More information on how to submit in Inspera.
If you have any questions or need assistance, contact studentweb@himolde.no
Problem 1 [25%]
Mekanic AS operates five production lines for their best-selling component called MDX100.
A production supervisor at Mekanic has randomly selected 150 units of MDX100 from the
five production lines (Line A – E) and created a chart (Figure 1) to summarize her data.
Figure 1. Condition of the units selected from the production lines
(a) Among the perfect units, approximately what proportion is contributed by Line B? [2.5%]
(b) Suggest an alternative method that the supervisor could use to summarize the data. [2.5%]
(c) Suppose the supervisor reclassified the variable “Condition” into two categories:
Defective (components with major or minor fault) and Perfect. Sketch a simple bar chart
to summarize the new variable. [4%]
The supervisor decided to examine historical data on the daily number of defective units
produced by each of the production lines for the last 81 days. The summary statistics are as
follows:
Table 1: Daily number of defective components
                     Line A  Line B  Line C  Line D  Line E
Mean                     40      35      20      30      25
Standard deviation       16      10       7       9       8
(d) Which of the production lines has the most predictable number of defective
components? Why? [8%]
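One common way to compare predictability across lines with different means is the coefficient of variation (standard deviation divided by mean). The sketch below applies it to Table 1. Treating the lowest coefficient of variation as "most predictable" is an assumption about the intended criterion, and all names in the code are illustrative.

```python
# Summary statistics from Table 1: (mean, standard deviation) per line.
table1 = {
    "Line A": (40, 16),
    "Line B": (35, 10),
    "Line C": (20, 7),
    "Line D": (30, 9),
    "Line E": (25, 8),
}


def coefficient_of_variation(mean, sd):
    """Relative spread: standard deviation as a fraction of the mean."""
    return sd / mean


cv = {line: coefficient_of_variation(m, s) for line, (m, s) in table1.items()}
most_predictable = min(cv, key=cv.get)

# Print the lines from lowest to highest relative variability.
for line, value in sorted(cv.items(), key=lambda kv: kv[1]):
    print(f"{line}: CV = {value:.3f}")
print("Lowest relative variability:", most_predictable)
```

Judged by absolute standard deviation alone, a different line would rank lowest, so the answer should state which measure it uses and why.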
(e) Using the information provided, create a 99% confidence interval for the average
number of defective components produced by Line A. [8%]
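A minimal sketch of the interval calculation, assuming the normal approximation is acceptable for n = 81 days. The function name is hypothetical. A t-based interval with 80 degrees of freedom would use a slightly larger critical value and come out marginally wider.

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, confidence=0.99):
    """Normal-approximation confidence interval for a population mean."""
    # Two-sided critical value, about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


# Line A figures from Table 1, observed over 81 days.
lower, upper = mean_confidence_interval(mean=40, sd=16, n=81)
print(f"99% CI for Line A: ({lower:.2f}, {upper:.2f})")
```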
Problem 2 [25%]
A dataset about energy use of appliances in a low-energy house consists of 95 observations.
One of the variables in the dataset is T1, which represents energy use of light fixtures
measured in Watt-hours. Figure 2 shows the distribution of T1.
Figure 2. Boxplot summarizing values of T1

Tasks:
a) Estimate the interquartile range for the values of T1 [3%]
b) The mean and standard deviation for the values of T1 are 20.6 and 0.6, respectively.
Assuming that T1 has a normal distribution, respond to the following questions:
i. Find the proportion of values that fall between 19 and 20.5. [6%]
ii. Determine a value that separates the highest 10% of T1 values from the rest. [6%]
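Both parts of task (b) reduce to evaluating a normal distribution with mean 20.6 and standard deviation 0.6. The following sketch uses Python's standard-library `NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# T1 modelled as normal with the stated mean and standard deviation.
t1 = NormalDist(mu=20.6, sigma=0.6)

# (i) Proportion of values between 19 and 20.5.
p_between = t1.cdf(20.5) - t1.cdf(19)

# (ii) Value with 10% of the distribution above it (the 90th percentile).
cutoff_top10 = t1.inv_cdf(0.90)

print(f"P(19 < T1 < 20.5) = {p_between:.4f}")
print(f"Cutoff for the highest 10% = {cutoff_top10:.3f}")
```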
c) In the dataset, there is another variable called T2. An analyst has used the following
formula to create a new variable called AT:
AT = 2T1 + 3T2
If the values of T2 are also normally distributed and the analyst has observed a mean of
19.5 and a standard deviation of 4, find the probability that AT is less than 95. [10%]
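A linear combination of normal variables is itself normal, so AT can be handled with the same tools once its mean and variance are known. The sketch below assumes T1 and T2 are independent, which the question does not state; with a non-zero covariance the variance would need an extra term.

```python
from math import sqrt
from statistics import NormalDist

mean_t1, sd_t1 = 20.6, 0.6
mean_t2, sd_t2 = 19.5, 4.0

# AT = 2*T1 + 3*T2, so the mean is the same linear combination of the means.
mean_at = 2 * mean_t1 + 3 * mean_t2

# Assumption: T1 and T2 are independent, so the covariance term is zero
# and the variances add with squared coefficients.
var_at = (2 ** 2) * sd_t1 ** 2 + (3 ** 2) * sd_t2 ** 2
sd_at = sqrt(var_at)

p_below_95 = NormalDist(mean_at, sd_at).cdf(95)
print(f"mean = {mean_at:.1f}, sd = {sd_at:.2f}, P(AT < 95) = {p_below_95:.4f}")
```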
Problem 3 [25%]
This problem consists of tasks (a) and (b). For task (a), assume that the values of the given
variables are normally distributed.
(a) Terje has surveyed all physical stores in Oslo and found that the average price for product
X is 171 NOK. He then randomly visited five online stores and observed the following
prices.
Table 2: Price of product X
Online store 1   Online store 2   Online store 3   Online store 4   Online store 5
165 NOK          173 NOK          168 NOK          172 NOK          166 NOK
Task: Test a hypothesis that on average it is cheaper to buy product X online [10%]
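One way to set this up is a one-sample, left-tailed t test of H0: mean online price = 171 against H1: mean online price < 171. The sketch below assumes a 5% significance level, which the question does not specify, and takes the critical value 2.132 (4 degrees of freedom, one-sided) from a standard t table.

```python
from math import sqrt
from statistics import mean, stdev

online_prices = [165, 173, 168, 172, 166]  # Table 2, in NOK
mu0 = 171  # average price in physical stores

n = len(online_prices)
sample_mean = mean(online_prices)
sample_sd = stdev(online_prices)  # sample standard deviation (n - 1 divisor)
t_stat = (sample_mean - mu0) / (sample_sd / sqrt(n))

# Assumption: 5% significance level. 2.132 is the one-sided critical
# value for n - 1 = 4 degrees of freedom from a standard t table.
t_critical = 2.132
reject_h0 = t_stat < -t_critical

print(f"mean = {sample_mean:.1f}, sd = {sample_sd:.3f}, t = {t_stat:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```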
(b) Assume that the Ministry of Trade in Norway conducted an extensive study in June and
concluded that 45% of companies in Norway had lost between 5% and 25% of sales since
the onset of COVID-19. In September, a trade organization called Norsk Industri
conducted a similar study and found that among 360 companies that responded to the
survey 51% had lost between 5% and 25% of sales since the onset of the pandemic.
Tasks:
(i) Using Norsk Industri’s report, calculate the margin of error for the estimate of the
proportion of companies that lost between 5% and 25% of sales since the onset of the
pandemic. [5%]
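The margin of error for a sample proportion is the critical value times the standard error. The sketch below assumes a 95% confidence level, since the task does not name one; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_margin_of_error(p_hat, n, confidence=0.95):
    """Margin of error for a sample proportion (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


# Norsk Industri survey: 51% of 360 responding companies.
moe = proportion_margin_of_error(p_hat=0.51, n=360)
print(f"Margin of error: {moe:.4f}")
```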
(ii) Test the following hypothesis: “The proportion reported by Norsk Industri is
significantly different from the figure reported by the Ministry of Trade”. [10%]
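A possible approach is a two-sided one-sample z test for a proportion, with H0: p = 0.45 and H1: p ≠ 0.45. This sketch treats the Ministry's 45% as a known population value rather than an estimate, which is an assumption, and leaves the choice of significance level to the reader.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.45     # proportion reported by the Ministry of Trade
p_hat = 0.51  # proportion reported by Norsk Industri
n = 360       # companies responding to the Norsk Industri survey

# Standard error computed under the null hypothesis.
se0 = sqrt(p0 * (1 - p0) / n)
z_stat = (p_hat - p0) / se0

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"z = {z_stat:.3f}, two-sided p-value = {p_value:.4f}")
```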
Problem 4 [25%]
In this problem, we will continue to use the energy use dataset (95 observations). After some
preliminary analyses, the analyst conducted a regression analysis to explain the variation in the
energy use of light fixtures. Table 3 presents an extract of her dataset, consisting of six variables
whose names and descriptions are as follows:
• T1: Energy use of light fixtures in the house in Watt-hours
• RH_1: Temperature in kitchen area, in Celsius
• RH_2: Temperature in living room area, in Celsius
• T3: Humidity in living room area, in %
• RH_3: Temperature in laundry room area
• HMD: Humidity outside; classified as Low (1), Moderate (2) and High (3)
Table 3: An extract of energy use dataset
ID  T1     RH_1   RH_2   T3     RH_3   HMD
1   19.89  47.60  44.79  19.79  44.73  3
2   19.89  46.69  44.72  19.79  44.79  3
3   19.89  46.30  44.63  19.79  44.93  1
4   19.89  46.07  44.59  19.79  45.00  2
5   19.89  46.33  44.53  19.79  45.00  1
6   19.89  46.03  44.50  19.79  44.93  2
The analyst conducted the analysis and Table 4 presents a portion of the results.
Table 4: A portion of results from the analysis
           Estimate  t value
Intercept     12.35     3.29
RH_1          -0.06    -2.57
RH_2          -0.78    -9.79
T3             0.45     2.25
RH_3           0.81    12.81
HMD1          -0.27    -3.39
HMD2           0.03     0.44
R² = 0.85
Note:
HMD1 and HMD2 are dummy variables defined as follows:
HMD1 = 1 if the humidity outside is low, otherwise, HMD1 = 0
HMD2 = 1 if the humidity outside is moderate, otherwise, HMD2 = 0
(a) Estimate the value of T1 for the following observation [5%]

ID RH_1 RH_2 T3 RH_3 HMD
42 46 44 19 43 3
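The prediction follows from plugging the observation into the fitted equation in Table 4, with HMD translated into the two dummies (High humidity is the reference category, so both dummies are zero). A minimal sketch, with a hypothetical function name:

```python
# Estimated coefficients from Table 4.
coefficients = {
    "Intercept": 12.35,
    "RH_1": -0.06,
    "RH_2": -0.78,
    "T3": 0.45,
    "RH_3": 0.81,
    "HMD1": -0.27,
    "HMD2": 0.03,
}


def predict_t1(rh_1, rh_2, t3, rh_3, hmd):
    """Predicted T1 for one observation; hmd is coded 1, 2 or 3."""
    hmd1 = 1 if hmd == 1 else 0  # low humidity outside
    hmd2 = 1 if hmd == 2 else 0  # moderate humidity outside
    return (
        coefficients["Intercept"]
        + coefficients["RH_1"] * rh_1
        + coefficients["RH_2"] * rh_2
        + coefficients["T3"] * t3
        + coefficients["RH_3"] * rh_3
        + coefficients["HMD1"] * hmd1
        + coefficients["HMD2"] * hmd2
    )


predicted = predict_t1(rh_1=46, rh_2=44, t3=19, rh_3=43, hmd=3)
print(f"Predicted T1 for observation 42: {predicted:.2f}")
```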
(b) Interpret the estimated coefficient of HMD1 [5%]
(c) Using a 99% confidence interval, would you conclude that RH_1 is a significant predictor
of T1? [5%]
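Table 4 gives no standard errors, but each one can be recovered as the estimate divided by its t value. The sketch below builds the interval for RH_1 using the standard normal critical value as an approximation to the t distribution with 95 - 7 = 88 degrees of freedom; that substitution is an assumption, and the exact t critical value (about 2.63) would widen the interval slightly.

```python
from statistics import NormalDist

estimate = -0.06  # RH_1 coefficient from Table 4
t_value = -2.57   # RH_1 t value from Table 4

# Standard error recovered from the reported (rounded) figures.
std_error = estimate / t_value

# Assumption: normal critical value used in place of t with 88 df.
critical = NormalDist().inv_cdf(0.995)

ci_lower = estimate - critical * std_error
ci_upper = estimate + critical * std_error
contains_zero = ci_lower <= 0 <= ci_upper

print(f"99% CI for RH_1: ({ci_lower:.4f}, {ci_upper:.4f})")
print("Interval contains zero:", contains_zero)
```

Because the tabled values are rounded and the upper limit sits very close to zero, the conclusion is sensitive to rounding and should be stated with that caveat.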
(d) If the total sum of squares for the estimated model is 37.27, compute the adjusted R² and an
estimate of the standard deviation of the random term. [5%]
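Both quantities follow from R², the sample size and the number of predictors. The sketch below takes n = 95 observations and k = 6 predictors (the seven rows of Table 4 minus the intercept); the variable names are illustrative.

```python
from math import sqrt

n = 95        # observations in the dataset
k = 6         # predictors in Table 4, excluding the intercept
r2 = 0.85     # reported R-squared
sst = 37.27   # total sum of squares

df_error = n - k - 1

# Adjusted R-squared penalises R-squared for the number of predictors.
adj_r2 = 1 - (1 - r2) * (n - 1) / df_error

# Error sum of squares and the standard error of the regression.
sse = (1 - r2) * sst
resid_se = sqrt(sse / df_error)

print(f"Adjusted R-squared = {adj_r2:.4f}")
print(f"SSE = {sse:.4f}, estimated sd of the random term = {resid_se:.4f}")
```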
(e) The analyst conducted another analysis as described below:

Step 1: She created a new dummy variable called NHMD and defined it as follows:
NHMD = 1 if the humidity outside is low, otherwise, NHMD = 0
Step 2: She multiplied T3 by NHMD and labeled the result as T3_NHMD
Step 3: She estimated a new regression model. Table 5 presents a portion of the results
Table 5: A portion of results from the second analysis
           Estimate  p value
Intercept     10.96   ------
RH_1          -0.06   ------
RH_2          -0.71   ------
T3             0.53   ------
RH_3           0.72   ------
T3_NHMD       -0.01    0.003
Task: Interpret the effect of T3_NHMD on T1. [5%]
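Since T3_NHMD is the product of T3 and the dummy NHMD, its coefficient shifts the slope of T3 when outside humidity is low. The sketch below only illustrates that arithmetic using the Table 5 estimates; the function name is hypothetical.

```python
# Estimates from Table 5.
slope_t3 = 0.53       # coefficient of T3
interaction = -0.01   # coefficient of T3_NHMD = T3 * NHMD


def t3_slope(nhmd):
    """Change in predicted T1 per unit of T3, given NHMD (0 or 1)."""
    return slope_t3 + interaction * nhmd


print(f"Slope of T3 when humidity is not low (NHMD = 0): {t3_slope(0):.2f}")
print(f"Slope of T3 when humidity is low (NHMD = 1): {t3_slope(1):.2f}")
```

Table 5 lists no separate NHMD term, so in this model only the slope of T3 differs between the two groups, not the intercept.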

Appendix 1
Appendix 2
