
Descriptive Statistics

i. For each of the three variables listed, indicate whether the variable is categorical or numerical. If it is
numerical, is it discrete or continuous? Give Reasons.

a. The primary place of purchase for dog food is a categorical variable, because it can be divided into
distinct groups or categories, such as supermarket, pet store, online, etc.

b. The number of dogs living in the household is a numerical variable, because it can be measured or
counted using numbers. It is also a discrete variable, because it can only take on whole numbers, such as
0, 1, 2, etc.

c. Whether the dog is pedigreed is a categorical variable, because it can be classified into two groups:
yes or no.

ii. In which level of measurement can each of these variables be expressed? Give reasons.

a. The bank account number can be expressed in the nominal level of measurement, because it is used to
identify or label an account, but it does not have any inherent order or magnitude.

b. The position of a ship at sea, in longitude, can be expressed in the interval level of measurement.
Differences in longitude are meaningful (a ship at 40 degrees east is 20 degrees further east than one at
20 degrees east), but the zero point (the prime meridian) is an arbitrary convention rather than a true
absence of the quantity, so ratios such as "twice the longitude" have no meaning.

c. The color of a karate belt can be expressed in the ordinal level of measurement, because it indicates
the rank or level of achievement of a karate student, and it has a clear order from lowest to highest
(white, yellow, orange, green, blue, purple, brown, black), but the gaps between ranks are not equal
measurable units.

d. Sales revenue of a company can be expressed in the ratio level of measurement, because it has a
consistent unit of measurement (such as dollars) and a true zero point (no revenue). Both differences and
ratios are meaningful: one company can have $10,000 more revenue than another, or twice the revenue of
another.

iii. The following is the information about the settlement of an industrial dispute in a factory. Comment on
the losses & gains from the point of view of workers & that of management.

                           Before   After
No. of workers             3000     2900
Mean wages (Rs)            2200     2300
Median wages (Rs)          2500     2400
Standard deviation (Rs)    300      260

From the point of view of workers:

• The number of workers decreased by 100 (from 3000 to 2900), which means that some workers lost their
jobs or quit voluntarily. This is a loss for those workers and their families.
• The mean wages increased by Rs 100, so the total wage bill rose from 3000 * 2200 = Rs 6,600,000 to
2900 * 2300 = Rs 6,670,000. Workers as a group receive more, even though fewer of them are employed. This
is a gain for workers overall.
• The median wages decreased by Rs 100, which means that the middle worker earned less after the
settlement: half of the workers now earn Rs 2400 or less, compared with Rs 2500 or less before. This is a
loss for workers around the middle of the wage distribution.
• The standard deviation decreased by Rs 40, which means that the wages became more consistent and less
spread out after the settlement. Relative to the mean, the spread fell from 300 / 2200 = 13.6% to
260 / 2300 = 11.3%. This is a gain for workers in terms of reduced wage disparity, although workers who
previously earned well above the average may have gained less.

From the point of view of management:

• The number of workers decreased by 100, which reduces headcount. However, this did not reduce the
total wage bill, which rose by Rs 70,000 (from Rs 6,600,000 to Rs 6,670,000). The smaller workforce is a
gain for management only if output can be maintained with fewer workers.
• The mean wages increased by Rs 100, which means that they had to pay more money to their employees
on average. This is a loss for management in terms of cost per worker.
• The median wages decreased by Rs 100, which means that the worker in the middle of the wage
distribution is paid less than before. This does not lower total cost, because the wage bill depends on
the mean, not the median.
• The standard deviation decreased by Rs 40, which means that wages are more equal. This can be a gain
for management (fewer grievances over pay differences) or a loss (weaker pay differentials as an
incentive), depending on how the wage structure is used to motivate workers.
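
The table does not state the total wage bill, but it follows from the number of workers multiplied by the mean wage. The sketch below works it out, along with the spread of wages relative to the mean. The variable names are illustrative and not from the source; only the four figures per column come from the table.

```python
# Figures copied from the Before/After table in the question
before = {"workers": 3000, "mean": 2200, "sd": 300}
after = {"workers": 2900, "mean": 2300, "sd": 260}

def wage_bill(row):
    """Total wages paid = number of workers * mean wage."""
    return row["workers"] * row["mean"]

def coeff_of_variation(row):
    """Standard deviation as a percentage of the mean."""
    return 100 * row["sd"] / row["mean"]

bill_before = wage_bill(before)        # 6,600,000
bill_after = wage_bill(after)          # 6,670,000
bill_change = bill_after - bill_before # 70,000

cv_before = coeff_of_variation(before)
cv_after = coeff_of_variation(after)

print(f"Wage bill: Rs {bill_before:,} -> Rs {bill_after:,} (change Rs {bill_change:,})")
print(f"Relative spread: {cv_before:.1f}% -> {cv_after:.1f}%")
```

Under these figures the total paid out in wages rises even though 100 fewer workers are employed, and the relative spread of wages falls.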
iv. Samples of light bulbs were bought from two suppliers and were subjected to destruction test in the lab.
The following data are collected on the life.

Life in hours   700-800   800-900   900-1000   1000-1100   Total
Supplier A      14        74        29         13          130
Supplier B      12        58        32         18          120

a. Which supplier provides greater average life?

To find the average life of each supplier’s light bulbs, we need to calculate the mean of each distribution
using the formula:

mean = sum(x * f) / n

where x is the midpoint of each class interval, f is the frequency of each class interval, and n is the total
number of observations.

For supplier A, the mean is:

mean = (750 * 14 + 850 * 74 + 950 * 29 + 1050 * 13) / 130
mean = 114600 / 130
mean = 881.54

For supplier B, the mean is:

mean = (750 * 12 + 850 * 58 + 950 * 32 + 1050 * 18) / 120
mean = 107600 / 120
mean = 896.67

Therefore, supplier B provides greater average life for their light bulbs, by about 15 hours.
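
The grouped-mean arithmetic can be checked with a short Python sketch. The function and variable names are illustrative, not from the source; the midpoints and frequencies are taken from the table in the question.

```python
# Class midpoints for the intervals 700-800, 800-900, 900-1000, 1000-1100
midpoints = [750, 850, 950, 1050]

# Frequencies copied from the table in the question
freq = {
    "A": [14, 74, 29, 13],
    "B": [12, 58, 32, 18],
}

def grouped_mean(xs, fs):
    """Mean of grouped data: sum(x * f) / n, with x the class midpoint."""
    n = sum(fs)
    return sum(x * f for x, f in zip(xs, fs)) / n

means = {name: grouped_mean(midpoints, fs) for name, fs in freq.items()}

for name, m in means.items():
    print(f"Supplier {name}: mean life = {m:.2f} hours")
# Supplier A: mean life = 881.54 hours
# Supplier B: mean life = 896.67 hours
```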

b. Which supplier provides uniform quality?

To measure the uniformity or variability of each supplier’s light bulbs, we need to calculate the standard
deviation of each distribution using the formula:

standard deviation = sqrt(sum((x - mean)^2 * f) / n)

where x is the midpoint of each class interval, mean is the mean of the distribution, f is the frequency of
each class interval, and n is the total number of observations.

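For supplier A (mean = 881.54), the standard deviation is:

standard deviation = sqrt(((750 - 881.54)^2 * 14 + (850 - 881.54)^2 * 74 + (950 - 881.54)^2 * 29
+ (1050 - 881.54)^2 * 13) / 130)
standard deviation = 79.45

For supplier B (mean = 896.67), the standard deviation is:

standard deviation = sqrt(((750 - 896.67)^2 * 12 + (850 - 896.67)^2 * 58 + (950 - 896.67)^2 * 32
+ (1050 - 896.67)^2 * 18) / 120)
standard deviation = 86.54

Because the two means differ, the spread is best compared relative to the mean, using the coefficient of
variation (standard deviation / mean):

Supplier A: 79.45 / 881.54 = 9.01%
Supplier B: 86.54 / 896.67 = 9.65%

Supplier A has both the smaller standard deviation and the smaller coefficient of variation. Therefore,
supplier A provides more uniform quality for their light bulbs.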
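
As a check on the spread calculation, here is a minimal Python sketch of the grouped standard deviation formula above, together with the coefficient of variation (standard deviation divided by mean) as a relative measure. The names are illustrative and not from the source, and the population form (dividing by n) is used to match the formula in the text.

```python
import math

midpoints = [750, 850, 950, 1050]
freq = {"A": [14, 74, 29, 13], "B": [12, 58, 32, 18]}

def grouped_mean_sd(xs, fs):
    """Return (mean, standard deviation) for grouped data, dividing by n."""
    n = sum(fs)
    mean = sum(x * f for x, f in zip(xs, fs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n
    return mean, math.sqrt(variance)

stats = {}
for name, fs in freq.items():
    mean, sd = grouped_mean_sd(midpoints, fs)
    cv = 100 * sd / mean  # coefficient of variation, in percent
    stats[name] = (mean, sd, cv)
    print(f"Supplier {name}: mean = {mean:.2f}, sd = {sd:.2f}, cv = {cv:.2f}%")
```

The supplier with the smaller standard deviation and coefficient of variation is the one with the more uniform bulb life.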

c. Which supplier would you prefer?

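The answer to this question may depend on other factors, such as the price, availability, and reliability of
each supplier. Based on the data given, neither supplier is better on both counts: supplier B has the higher
average life (896.67 hours against 881.54 hours), while supplier A has the lower variability (standard
deviation 79.45 hours against 86.54 hours). The advantage of B in average life is small, about 15 hours or
under 2%, so I would prefer supplier A when consistent quality matters. A buyer who cares only about
average life would choose supplier B.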

Random Variable

i. The daily world price of refined sugar in cents per pound in April 2016 can be inferred to have the
following distribution:

X      7      8      9      10     11     12
P(x)   0.05   0.10   0.25   0.40   0.15   0.05

a. Show that P(x) is a valid probability distribution.


To show that P(x) is a valid probability distribution, we need to check two conditions:

• The probability of each outcome must be between 0 and 1, inclusive.
• The sum of all probabilities must be equal to 1.

We can see that both conditions are satisfied by P(x), as shown below:

X                 7      8      9      10     11     12
P(x)              0.05   0.10   0.25   0.40   0.15   0.05
0 <= P(x) <= 1?   Yes    Yes    Yes    Yes    Yes    Yes

Sum of P(x) = 0.05 + 0.10 + 0.25 + 0.40 + 0.15 + 0.05 = 1

Therefore, P(x) is a valid probability distribution.
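
The two conditions can also be checked programmatically. The sketch below is illustrative: the dictionary name `pmf` is not from the source, and `math.isclose` is used for the sum because decimal fractions are not stored exactly in floating point.

```python
import math

# Price in cents per pound -> probability, copied from the table
pmf = {7: 0.05, 8: 0.10, 9: 0.25, 10: 0.40, 11: 0.15, 12: 0.05}

# Condition 1: every probability lies in [0, 1]
each_in_range = all(0 <= p <= 1 for p in pmf.values())

# Condition 2: the probabilities sum to 1 (within floating-point tolerance)
total = sum(pmf.values())
sums_to_one = math.isclose(total, 1.0)

is_valid = each_in_range and sums_to_one
print(f"each in [0, 1]: {each_in_range}, sum = {total:.2f}, valid: {is_valid}")
```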

b. What is the probability that the price on a given day during this period will be at least 9 cents per pound?

To find the probability that the price on a given day during this period will be at least 9 cents per pound,
we need to add up the probabilities of all outcomes that are equal to or greater than 9, as shown below:

X         7      8      9      10     11     12
P(x)      0.05   0.10   0.25   0.40   0.15   0.05
X >= 9?   No     No     Yes    Yes    Yes    Yes

P(X >= 9) = P(9) + P(10) + P(11) + P(12)
P(X >= 9) = 0.25 + 0.40 + 0.15 + 0.05
P(X >= 9) = 0.85

Therefore, the probability that the price on a given day during this period will be at least 9 cents per pound
is 0.85.
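
Adding up the probabilities of the outcomes in an event is the same operation for any event on this distribution, so it can be written once as a small helper. This is a sketch: `prob` and `pmf` are hypothetical names, not from the source. It also cross-checks the answer using the complement rule, P(X >= 9) = 1 - P(X <= 8).

```python
# Price in cents per pound -> probability, copied from the table
pmf = {7: 0.05, 8: 0.10, 9: 0.25, 10: 0.40, 11: 0.15, 12: 0.05}

def prob(event):
    """Sum P(x) over all outcomes x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

# Direct sum over the outcomes 9, 10, 11, 12
p_at_least_9 = prob(lambda x: x >= 9)

# Same answer via the complement: 1 - (P(7) + P(8))
p_via_complement = 1 - prob(lambda x: x <= 8)

print(f"P(X >= 9) = {p_at_least_9:.2f}")
print(f"1 - P(X <= 8) = {p_via_complement:.2f}")
```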

c. What is the probability that the price on a given day during this period will be less than 11 cents per
pound?

To find the probability that the price on a given day during this period will be less than 11 cents per pound,
we need to add up the probabilities of all outcomes that are less than 11, as shown below:

X         7      8      9      10     11     12
P(x)      0.05   0.10   0.25   0.40   0.15   0.05
X < 11?   Yes    Yes    Yes    Yes    No     No

P(X < 11) = P(7) + P(8) + P(9) + P(10)
P(X < 11) = 0.05 + 0.10 + 0.25 + 0.40
P(X < 11) = 0.80

Probability Distribution

i. A restaurant has three sources of revenue: eat in orders, takeout orders & the bar. The daily revenue from
each source is normally distributed with mean & standard deviation shown in the table below.

Source     Mean     Standard Deviation
Eat in     $5,780   $142
Take out   $641     $78
Bar        $712     $72

a. What will be the distribution of total revenue on a day? Give reason.

The total revenue on a day will also be normally distributed, provided the three sources are independent of
one another, because the sum of independent normal random variables is itself a normal random variable.
The question does not state that the sources are independent, so that is assumed here. This property of the
normal distribution allows us to combine the different sources of revenue into one distribution.

b. What are the mean & standard deviation of the total revenue on a particular day?

To find the mean and standard deviation of the total revenue on a particular day, we need to use the
following formulas:

mean = sum(mean_i)

standard deviation = sqrt(sum(variance_i))

where mean_i and variance_i are the mean and variance of each source of revenue, respectively.

Using the data from the table, we can calculate:

mean = 5780 + 641 + 712
mean = $7,133

standard deviation = sqrt(142^2 + 78^2 + 72^2)
standard deviation = sqrt(20164 + 6084 + 5184)
standard deviation = sqrt(31432)
standard deviation = $177.29

Therefore, the mean and standard deviation of the total revenue on a particular day
are $7,133 and $177.29, respectively.
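
The two formulas above (means add, variances add) can be checked with a short Python sketch. The dictionary and variable names are illustrative, and the calculation assumes the three sources are independent, which the question does not state explicitly.

```python
import math

# (mean, standard deviation) in dollars for each source, from the table
sources = {
    "eat_in": (5780, 142),
    "take_out": (641, 78),
    "bar": (712, 72),
}

# Means add directly
total_mean = sum(mean for mean, _ in sources.values())

# Variances (sd squared) add, assuming independence; then take the square root
total_variance = sum(sd ** 2 for _, sd in sources.values())
total_sd = math.sqrt(total_variance)

print(f"total mean = ${total_mean:,}")
print(f"total variance = {total_variance}")
print(f"total sd = ${total_sd:.2f}")
```

Note that the standard deviations themselves are not added: 142 + 78 + 72 = 292 would overstate the spread of the total.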

c. What is the probability that the revenue will exceed $7000 on a particular day?

To find the probability that the revenue will exceed $7000 on a particular day, we need to use the standard
normal distribution table or a calculator. We first need to convert the revenue into a standard normal
variable using the formula:

z = (x - mean) / standard deviation

where x is the revenue, mean is the mean of the total revenue, and standard deviation is the standard
deviation of the total revenue.

Plugging in the values, we get:

z = (7000 - 7133) / 177.29, where 177.29 = sqrt(142^2 + 78^2 + 72^2)
z = -0.75

Then, we look up the probability corresponding to z = -0.75 in the standard normal table or calculator. We
get:

P(z < -0.75) = 0.2266

This means that the probability that the revenue will be less than $7000 on a particular day is 0.2266.
Therefore, the probability that the revenue will exceed $7000 on a particular day is:

P(x > 7000) = 1 - P(x < 7000)
P(x > 7000) = 1 - P(z < -0.75)
P(x > 7000) = 1 - 0.2266
P(x > 7000) = 0.7734

Therefore, the probability that the revenue will exceed $7000 on a particular day is 0.7734.
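
For readers without a standard normal table, the same probability can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch under the same independence assumption as above; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Total daily revenue: mean is the sum of means, sd from the sum of variances
total_mean = 5780 + 641 + 712
total_sd = math.sqrt(142 ** 2 + 78 ** 2 + 72 ** 2)
total_revenue = NormalDist(mu=total_mean, sigma=total_sd)

# z-score of $7000, for comparison with the table lookup
z = (7000 - total_mean) / total_sd

# P(X > 7000) = 1 - P(X <= 7000)
p_exceed = 1 - total_revenue.cdf(7000)

print(f"z = {z:.2f}")
print(f"P(revenue > 7000) = {p_exceed:.4f}")
```

The computed value agrees with the table-based answer to about four decimal places; any small difference comes from rounding z to two decimals before the lookup.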

ii. The annual return of a well-known mutual fund has historically had a mean of about 10% and a standard
deviation of 21%. Suppose the return for the following year follows a normal distribution, with the
historical mean and standard deviation. What is the probability that you will lose money in the next year by
investing in this fund?

To find the probability that you will lose money in the next year by investing in this fund, we need to find
the probability that the return will be less than zero. We can use the same method as before, by converting
the return into a standard normal variable and using the table or calculator.

z = (x - mean) / standard deviation
z = (0 - 10) / 21
z = -0.48

P(z < -0.48) = 0.3156

Therefore, the probability that you will lose money in the next year by investing in this fund is 0.3156.
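
The table lookup rounds z to two decimal places before finding the probability. The sketch below, using `statistics.NormalDist` from the Python standard library, compares that table-style answer with the value obtained without rounding z. The variable names are illustrative.

```python
from statistics import NormalDist

# Annual return in percent, assumed normal with the historical mean and sd
fund = NormalDist(mu=10, sigma=21)

# Without rounding: P(return < 0) straight from the distribution
p_loss_exact = fund.cdf(0)

# Table-style: round z to two decimals, then use the standard normal CDF
z_rounded = round((0 - 10) / 21, 2)
p_loss_table = NormalDist().cdf(z_rounded)

print(f"z (rounded) = {z_rounded}")
print(f"table-style P(loss) = {p_loss_table:.4f}")
print(f"unrounded P(loss)   = {p_loss_exact:.4f}")
```

The two values differ only in the third decimal place, so either way the chance of a loss is a little under one in three.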

Random Variable

i. A machine produces steel rods. The lengths of the rods are normally distributed with a mean of 26 cm
and an SD of 1 cm. Rods over 27 cm or shorter than 24 cm must be discarded. The machine produces 500
rods per shift. How many rods per shift have to be discarded?

To find how many rods per shift have to be discarded, we need to find the probability that a rod is either
over 27 cm or shorter than 24 cm, and then multiply it by 500.

We can use the same method as before, by converting the length into a standard normal variable and using
the table or calculator.

For rods over 27 cm:

z = (x - mean) / standard deviation
z = (27 - 26) / 1
z = 1

P(z > 1) = 1 - P(z < 1)
P(z > 1) = 1 - 0.8413
P(z > 1) = 0.1587

For rods shorter than 24 cm:

z = (x - mean) / standard deviation
z = (24 - 26) / 1
z = -2

P(z < -2) = 0.0228

A rod cannot be both too long and too short, so the two events are mutually exclusive and their
probabilities add. Therefore, the probability that a rod is either over 27 cm or shorter than 24 cm is:

P(x > 27 or x < 24) = P(x > 27) + P(x < 24)
P(x > 27 or x < 24) = 0.1587 + 0.0228
P(x > 27 or x < 24) = 0.1815

Hence, the expected number of rods per shift that have to be discarded is:

expected discards = P(x > 27 or x < 24) * n
expected discards = 0.1815 * 500
expected discards = 90.75

Therefore, on average, about 91 rods per shift have to be discarded.
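
The discard calculation can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch with illustrative variable names; it uses unrounded probabilities, so the expected count may differ from the table-based figure in the second decimal place.

```python
from statistics import NormalDist

# Rod length in cm: normal with mean 26 and standard deviation 1
length = NormalDist(mu=26, sigma=1)
rods_per_shift = 500

# Upper tail: rods longer than 27 cm
p_too_long = 1 - length.cdf(27)

# Lower tail: rods shorter than 24 cm
p_too_short = length.cdf(24)

# The two tails cannot overlap, so the probabilities add
p_discard = p_too_long + p_too_short
expected_discards = rods_per_shift * p_discard

print(f"P(too long)  = {p_too_long:.4f}")
print(f"P(too short) = {p_too_short:.4f}")
print(f"P(discard)   = {p_discard:.4f}")
print(f"expected discards per shift = {expected_discards:.1f}")
```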
