Assignment 1


Northern Caribbean University

College of Business and Management

Department of Business Administration

Assignment #1 – 100 marks (Equivalent to 15%)

ECON272: Business & Economic Statistics

Due on or before March 4, 2024, at 9:00 a.m.

Note: Assignment may be handwritten if the submission is face to face; otherwise,
type it double-spaced in font size 12.

Student’s Name: Sherian Wedderburn Student’s ID: 20181600

General Instruction: Answer ALL questions in Sections A-D.

Section A: Problem Solving (40 marks)

Question 1

Body Weight    Number of College    Class           Xf      X^2f      Cumulative
in Pounds      Students (f)         Midpoint (X)                      Frequency

30-60          90                   45              4050    182250    90
60-90          75                   75              5625    421875    165
90-120         60                   105             6300    661500    225
120-150        45                   135             6075    820125    270
150-180        30                   165             4950    816750    300
180-210        15                   195             2925    570375    315
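The derived columns of the table can be rebuilt from the class limits and frequencies alone. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the assignment.

```python
# Class limits and frequencies copied from the table above.
classes = [(30, 60), (60, 90), (90, 120), (120, 150), (150, 180), (180, 210)]
freqs = [90, 75, 60, 45, 30, 15]

rows = []
running_total = 0
for (lower, upper), f in zip(classes, freqs):
    x = (lower + upper) / 2          # class midpoint X
    running_total += f               # cumulative frequency so far
    rows.append((lower, upper, f, x, x * f, x * x * f, running_total))

for lower, upper, f, x, xf, x2f, cum in rows:
    print(f"{lower}-{upper}: f={f}, X={x:g}, Xf={xf:g}, X^2f={x2f:g}, cum={cum}")

sum_xf = sum(r[4] for r in rows)     # total of the Xf column
sum_x2f = sum(r[5] for r in rows)    # total of the X^2f column
print(sum_xf, sum_x2f)
```

The two column totals are the quantities used later for the mean and the standard deviation.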

Required:
i. Complete the table (4 marks)
ii. Represent the table using:

a. Histogram (2 marks)

A histogram is a graphical representation of the distribution of a dataset. It is an estimate of
the probability distribution of a continuous variable. To construct a histogram, the first step is
to "bin" the range of values, that is, to divide the entire range of values into a series of
intervals and then count how many values fall into each interval. The bins are usually specified
as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent
and are often (but not necessarily) of equal size. In this case, the bins would be the body
weight ranges (30-60, 60-90, etc.), and the frequency would be the number of students in
each bin.
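As a rough illustration of the description above, the bars can be drawn in plain text, one row per bin. This is only a sketch: the scale of one mark per 5 students is an arbitrary choice, and a drawn histogram would place the bins on the horizontal axis.

```python
classes = ["30-60", "60-90", "90-120", "120-150", "150-180", "180-210"]
freqs = [90, 75, 60, 45, 30, 15]

def bar(frequency, students_per_mark=5):
    # One '#' per `students_per_mark` students (the scale is an assumption).
    return "#" * (frequency // students_per_mark)

lines = [f"{label:>7} | {bar(f)} {f}" for label, f in zip(classes, freqs)]
print("\n".join(lines))
```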

b. Less than ogive (2 marks)

An ogive is a graph that represents the cumulative frequencies for the classes in a
frequency distribution. It is a line graph. On the x-axis, you plot the upper class boundary,
and, on the y-axis, you plot the cumulative frequency. In this case, the upper class boundaries
would be 60, 90, 120, 150, 180, and 210, and the cumulative frequencies would be 90, 165,
225, 270, 300, and 315.
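The points to plot for the less than ogive can be generated as below. This is a sketch only; starting the curve at the lowest boundary (30) with a count of zero is a common convention that the answer above does not state.

```python
from itertools import accumulate

upper_boundaries = [60, 90, 120, 150, 180, 210]
freqs = [90, 75, 60, 45, 30, 15]

cumulative = list(accumulate(freqs))

# Assumed convention: the curve starts at the lowest boundary with zero students.
ogive_points = [(30, 0)] + list(zip(upper_boundaries, cumulative))
for boundary, count in ogive_points:
    print(f"fewer than {boundary} lb: {count}")
```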

iii. Calculate the mean value of the students' body weight. (2 marks)

The mean is calculated by summing all the values and dividing by the number of
values. In this case, we need to sum the column Xf (which is the product of the class
midpoint and the frequency) and divide by the total frequency. The sum of Xf is 29925 and
the total frequency is 315. So, the mean is 29925/315 = 95 pounds.
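The grouped mean can be checked with a few lines of code. This sketch uses the class midpoints and frequencies from the table in Question 1.

```python
midpoints = [45, 75, 105, 135, 165, 195]
freqs = [90, 75, 60, 45, 30, 15]

sum_xf = sum(x * f for x, f in zip(midpoints, freqs))  # total of the Xf column
n = sum(freqs)                                         # total frequency
mean = sum_xf / n
print(sum_xf, n, mean)
```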

iv. Calculate the median value using the median formula or graphical
estimate (2 marks)

The median is the middle value when the data is arranged in ascending order. In this
case, since we have grouped data, we can use the formula for the median of grouped data: L +
((N/2 - F)/f) * c, where L is the lower boundary of the median group, N is the total number of
data, F is the cumulative frequency of the group before the median group, f is the frequency
of the median group, and c is the class width. The median group is the one where the
cumulative frequency surpasses N/2 (315/2 = 157.5), which is the 60-90 group. So, the
median is 60 + ((157.5 - 90)/75) * 30 = 60 + 27 = 87 pounds.
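The grouped median formula L + ((N/2 - F)/f) * c can be sketched as a small function. The function name and argument layout are illustrative assumptions.

```python
def grouped_median(lower_bounds, freqs, width):
    """Median of grouped data: L + ((N/2 - F) / f) * c."""
    half = sum(freqs) / 2
    cum_before = 0                       # F: cumulative frequency before the class
    for lower, f in zip(lower_bounds, freqs):
        if cum_before + f >= half:       # first class whose cumulative count reaches N/2
            return lower + (half - cum_before) / f * width
        cum_before += f
    raise ValueError("frequencies must not be empty")

median = grouped_median([30, 60, 90, 120, 150, 180], [90, 75, 60, 45, 30, 15], 30)
print(median)
```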

v. Find the mode by formula or graphical model. (2 marks)

The mode is the value that appears most frequently. In this case, since we have
grouped data, we can use the formula for the mode of grouped data: L + (d1/(d1+d2)) * c,
where L is the lower boundary of the modal group, d1 is the difference between the
frequency of the modal group and the frequency of the previous group, d2 is the difference
between the frequency of the modal group and the frequency of the next group, and c is the
class width. The modal group is the one with the highest frequency, which is the 30-60 group.
Because it is the first group, the frequency of the previous group is taken as 0, so
d1 = 90 - 0 = 90 and d2 = 90 - 75 = 15. So, the mode is 30 + (90/(90+15)) * 30 = 55.71 pounds.
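The grouped mode formula L + (d1/(d1+d2)) * c can be sketched as follows. One assumption is made explicit: when the modal class is the first or last class, the missing neighbouring frequency is treated as 0, which is a common textbook convention.

```python
def grouped_mode(lower_bounds, freqs, width):
    """Mode of grouped data: L + (d1 / (d1 + d2)) * c."""
    i = freqs.index(max(freqs))                          # modal class
    prev_f = freqs[i - 1] if i > 0 else 0                # assumed 0 before the first class
    next_f = freqs[i + 1] if i + 1 < len(freqs) else 0   # assumed 0 after the last class
    d1 = freqs[i] - prev_f
    d2 = freqs[i] - next_f
    return lower_bounds[i] + d1 / (d1 + d2) * width

mode = grouped_mode([30, 60, 90, 120, 150, 180], [90, 75, 60, 45, 30, 15], 30)
print(round(mode, 2))
```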

vi. Determine standard deviation (3 marks)

The standard deviation is a measure of the amount of variation or dispersion of a set
of values. It is calculated by taking the square root of the variance. The variance is calculated
by summing the squared deviations from the mean, divided by the number of observations. In
this case, we need to sum the column X^2f (which is the product of the square of the class
midpoint and the frequency), divide by the total frequency, and subtract the square of the
mean (29925/315 = 95). The sum of X^2f is 3472875. So, the variance is
(3472875/315) - (95)^2 = 11025 - 9025 = 2000.
The standard deviation is the square root of this, which is approximately 44.72 pounds.
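The variance shortcut described above (sum of X^2f over N, minus the squared mean) can be verified with a short sketch. It divides by N, as the answer does, rather than by N - 1.

```python
import math

midpoints = [45, 75, 105, 135, 165, 195]
freqs = [90, 75, 60, 45, 30, 15]

n = sum(freqs)
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
sum_x2f = sum(x * x * f for x, f in zip(midpoints, freqs))

variance = sum_x2f / n - mean ** 2   # divides by N, not N - 1
std_dev = math.sqrt(variance)
print(variance, round(std_dev, 2))
```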

vii. Compute Psk and briefly describe with reasons the skewness of the data
value (3 marks)

Pearson's coefficient of skewness (Psk) is a measure of the asymmetry of the probability
distribution of a real-valued random variable about its mean. It is calculated by 3*(mean -
median)/standard deviation. With a mean of 95, a median of 87 and a standard deviation of
44.72, Psk = 3*(95 - 87)/44.72 = 0.54. Since Psk > 0, the distribution is moderately
positively skewed, meaning that it has a longer tail in the positive direction.
This means that the relatively few students with high body weights are pulling the
mean higher than the median.
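The coefficient can be recomputed directly from the frequency table. This sketch recalculates the mean, median and standard deviation so that it stands on its own.

```python
import math

lower_bounds = [30, 60, 90, 120, 150, 180]
freqs = [90, 75, 60, 45, 30, 15]
width = 30

n = sum(freqs)
midpoints = [lb + width / 2 for lb in lower_bounds]
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
variance = sum(x * x * f for x, f in zip(midpoints, freqs)) / n - mean ** 2
std_dev = math.sqrt(variance)

# Grouped median: locate the class in which the cumulative count reaches N/2.
cum_before = 0
for lb, f in zip(lower_bounds, freqs):
    if cum_before + f >= n / 2:
        median = lb + (n / 2 - cum_before) / f * width
        break
    cum_before += f

psk = 3 * (mean - median) / std_dev   # Pearson's coefficient of skewness
print(round(psk, 2))
```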

viii. Determine (estimate) the number of students with:


a. Less than 90 pounds (2 marks)

Ans= 75+90=165

b. Less than 156 pounds. (2 marks)

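Ans= 90+75+60+45+6 = 276 (the 270 students below 150 pounds, plus
(156-150)/30 x 30 = 6 students from the 150-180 class)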

c. 180 or more pounds. (2 marks)

Ans= 15

d. More than 66 pounds. (2 marks)

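Ans= Students per pound in the 60-90 class: 75/30 = 2.5. Pounds from 66 to 90: 90-66 = 24.
Students in that class above 66 pounds: 2.5 x 24 = 60.

Total = 60+60+45+30+15 = 210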
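These estimates rest on linear interpolation inside a class, which assumes students are spread evenly across each 30-pound interval. A minimal sketch of that idea follows; the helper name count_below is hypothetical.

```python
boundaries = [30, 60, 90, 120, 150, 180, 210]
freqs = [90, 75, 60, 45, 30, 15]

def count_below(weight):
    """Estimated number of students lighter than `weight` (linear interpolation)."""
    total = 0.0
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        if weight >= upper:
            total += f                                        # whole class lies below
        elif weight > lower:
            total += f * (weight - lower) / (upper - lower)   # part of the class
    return total

n = sum(freqs)
below_90 = count_below(90)
below_156 = count_below(156)
at_least_180 = n - count_below(180)
above_66 = n - count_below(66)
print(below_90, below_156, at_least_180, above_66)
```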

Total = 28 marks

Question 2

A study was recently done in which 500 people were asked to indicate their
preferences for one of three products A, B and C. The following table shows the
breakdown of the responses by gender of the respondent.

                  Product preferences
Gender          A      B      C      Total

Male (M)        80     20     10     110
Female (F)      200    70     120    390
Total           280    90     130    500

Determine:

i. P (F) (2 marks)

Ans= 390/500= 0.78

ii. P (A or C) (2 marks)
Ans= (280+130)/500 = 410/500 = 0.82

iii. P (B/M) (2 marks)

Ans= 20/110=0.1818

iv. P (C or M) (2 marks)
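Ans= (130+110-10)/500 = 230/500 = 0.46 (the 10 males who prefer C are counted in
both totals, so they are subtracted once)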

v. P (A or B or C) comp (2 marks)

Ans= 1-(280+90+130)/500=0

vi. Probability of selecting a female and that she prefers product A. (2 marks)
Ans=200/500=0.4
Total = 12 marks
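The six probabilities in Question 2 can be checked against the contingency table. The sketch below uses exact fractions; the helper names row_total and col_total are illustrative.

```python
from fractions import Fraction

# Counts from the product preference table.
counts = {
    "M": {"A": 80, "B": 20, "C": 10},
    "F": {"A": 200, "B": 70, "C": 120},
}
total = sum(sum(row.values()) for row in counts.values())

def row_total(gender):
    return sum(counts[gender].values())

def col_total(product):
    return sum(row[product] for row in counts.values())

p_f = Fraction(row_total("F"), total)
p_a_or_c = Fraction(col_total("A") + col_total("C"), total)   # mutually exclusive
p_b_given_m = Fraction(counts["M"]["B"], row_total("M"))      # conditional on male
# Addition rule: subtract the overlap (males who prefer C) once.
p_c_or_m = Fraction(col_total("C") + row_total("M") - counts["M"]["C"], total)
p_none = 1 - Fraction(sum(col_total(p) for p in "ABC"), total)
p_f_and_a = Fraction(counts["F"]["A"], total)

for name, p in [("P(F)", p_f), ("P(A or C)", p_a_or_c), ("P(B|M)", p_b_given_m),
                ("P(C or M)", p_c_or_m), ("P(not A, B or C)", p_none),
                ("P(F and A)", p_f_and_a)]:
    print(name, round(float(p), 4))
```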

Section B: Problem Solving (20 marks)

Instruction: Answer questions 1 and 2. Each question is worth 10 marks.

1. A company has two product samples A and B in each product line. Each
sample has a mean and standard deviation as indicated in the following table.

Sample A                   Sample B

Mean = 12                  Mean = 6

Standard deviation = 3     Standard deviation = 2

a. Which of the samples, A or B, is statistically reliable? (2 marks)

Ans= Sample B
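One common yardstick for comparing variability between samples with different means is the coefficient of variation (CV = standard deviation / mean x 100%). The question does not say which criterion is intended, so using the CV here is an assumption; on that yardstick, the sample with the smaller CV is the more consistent one.

```python
# Figures from the table above; the CV criterion itself is an assumption.
samples = {
    "A": {"mean": 12, "sd": 3},
    "B": {"mean": 6, "sd": 2},
}

cv = {name: s["sd"] / s["mean"] * 100 for name, s in samples.items()}
for name, value in cv.items():
    print(f"Sample {name}: CV = {value:.1f}%")

lowest_cv = min(cv, key=cv.get)   # sample with the smallest relative variability
print("Lowest CV:", lowest_cv)
```

With these figures the CV is 25% for Sample A and about 33.3% for Sample B, whereas comparing the standard deviations alone favours Sample B.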

b. Which of the two will you recommend to the management? Explain why.
(8 marks)

I would recommend Sample B to the management because it will allow a more precise
and accurate conclusion that Product B will

2. Given the information below:

12, 9, 7, 14, 5, 8, 0, 4

Compute sample:
a. Mean (1 ½ marks)

Ans= 7.375

b. Median (1 ½ marks)

Ans= 7.5

c. Mode (1 ½ marks)

Ans= No mode (each of the values 12, 9, 7, 14, 5, 8, 0, 4 occurs exactly once)

d. Variance (1 ½ marks)

Ans= s^2 = 19.982143

e. Standard deviation (4 marks)


Ans= s = 4.470139
Total = 20 marks
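The sample statistics for question 2 can be checked with Python's standard statistics module, whose variance and stdev functions use the sample (n - 1) divisor. This is a sketch for verification only.

```python
import statistics
from collections import Counter

data = [12, 9, 7, 14, 5, 8, 0, 4]

mean = statistics.mean(data)
median = statistics.median(data)

# A value counts as a mode only if it repeats; an empty list means "no mode".
counts = Counter(data)
highest = max(counts.values())
modes = [value for value, c in counts.items() if c == highest and c > 1]

variance = statistics.variance(data)   # sample variance, divides by n - 1
std_dev = statistics.stdev(data)       # sample standard deviation

print(mean, median, modes, round(variance, 6), round(std_dev, 6))
```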

Section C: Problem Solving (25 marks)

A random sample of 32 customer records for a physician’s office showed the
following times (in days) to collect insurance payments.

34 70 38 36

32 38 45 30

58 40 40 32

24 35 60 35

20 36 30 40

38 30 5 56

55 46 28 50

35 30 32 52

a. Construct an ordered stem-and-leaf diagram. (10 marks)

Stem    Leaf

0       5

1

2       0,4,8

3       0,0,0,0,2,2,2,4,5,5,5,6,6,8,8,8

4       0,0,0,5,6

5       0,2,5,6,8

6       0

7       0
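An ordered stem-and-leaf display can be built by sorting the values and splitting each into its tens digit (stem) and units digit (leaf). The following sketch uses the 32 collection times from the table; stems with no values are printed with no leaves.

```python
from collections import defaultdict

days = [34, 70, 38, 36, 32, 38, 45, 30, 58, 40, 40, 32, 24, 35, 60, 35,
        20, 36, 30, 40, 38, 30, 5, 56, 55, 46, 28, 50, 35, 30, 32, 52]

stems = defaultdict(list)
for d in sorted(days):               # sorting first keeps the leaves ordered
    stems[d // 10].append(d % 10)    # stem = tens digit, leaf = units digit

for stem in range(min(stems), max(stems) + 1):
    leaves = ",".join(str(leaf) for leaf in stems.get(stem, []))
    print(f"{stem} | {leaves}")
```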

b. Within what range of days are most payments collected? (3 marks)

Most of the payments (16 of the 32) are collected in the 30s, so the range is between 30 and 38 days.

c. Distribute these values into seven classes of equal intervals and identify the
frequency for each class. Follow class boundaries or class limits (3 marks)

d. What is your class interval (width) (3 marks)
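Parts c and d are left unanswered above. One possible way to form seven equal classes is sketched below; it is an illustration under stated assumptions, not the required answer: classes are defined by class limits, the first class starts at the smallest value (5), and the width is the range of values divided by 7, rounded up.

```python
import math

days = [34, 70, 38, 36, 32, 38, 45, 30, 58, 40, 40, 32, 24, 35, 60, 35,
        20, 36, 30, 40, 38, 30, 5, 56, 55, 46, 28, 50, 35, 30, 32, 52]

k = 7                                                # number of classes
start = min(days)                                    # assumed lower limit of class 1
width = math.ceil((max(days) - start + 1) / k)       # rounded-up class width

freqs = [0] * k
for d in days:
    freqs[(d - start) // width] += 1                 # index of the class holding d

class_limits = [(start + i * width, start + (i + 1) * width - 1) for i in range(k)]
for (low, high), f in zip(class_limits, freqs):
    print(f"{low}-{high}: {f}")
```

Other starting points (for example 0 or 1) give equally valid tables with different frequencies.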

e. How skewed is this data? How do you know? (3 marks)

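Positively skewed: the mean (1230/32 = 38.44 days) is greater than the median (36 days),
because the few long collection times (56 to 70 days) pull the mean upward.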
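The direction of skew can be checked by comparing the mean with the median, or by computing Pearson's coefficient 3*(mean - median)/standard deviation on the raw data. This sketch uses the sample standard deviation, which is an assumption about the intended divisor.

```python
import statistics

days = [34, 70, 38, 36, 32, 38, 45, 30, 58, 40, 40, 32, 24, 35, 60, 35,
        20, 36, 30, 40, 38, 30, 5, 56, 55, 46, 28, 50, 35, 30, 32, 52]

mean = statistics.mean(days)
median = statistics.median(days)
std_dev = statistics.stdev(days)          # sample standard deviation (n - 1)

psk = 3 * (mean - median) / std_dev       # positive value indicates right skew
print(mean, median, round(psk, 2))
```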

f. Represent your data using any suitable model of your choice other than stem
and leaf. (3 marks)
Total = 25 marks

Section D: Multiple Choice (15 marks)

1. Any device, such as a graph, or table, that displays all possible values in a
sample or population along their frequency of occurrence is called:
a. a parameter
b. a sample
c. a population
d. a frequency distribution

2. A descriptive measure computed from or used to describe a sample of data
is called a ____________.
a. statistic
b. random variable
c. discrete variable
d. parameter

3. A characteristic of the ___________, as a measure of central tendency, is the
fact that its magnitude is affected by a single extremely small value.
a. coefficient of variation
b. mean
c. median
d. range

4. A ____________ reveals the range of a data set, shows where the highest
concentration of values occurs, provides information about the presence or
absence of symmetry, and can indicate the degree to which the data are
homogeneous.
a. stem-and-leaf display
b. bar graph
c. line graph
d. pie chart

Answer questions 5-7 on the basis of the following frequency distribution.

Class Interval Frequency

20-29 2

30-39 4

40-49 6

50-59 8

60-69 10

70-79 12

5. The distribution suggests a ____________ skew.
a. positive
b. negative
c. zero
d. positive and negative

6. The median value is _____________.
a. 60
b. 60.5
c. 49.5
d. None of the above

7. The modal class is:
a. 40-49 and the mode is 50.
b. 50-59 and the mode is 50.
c. 40-49 or 50-59 and the mode is 49.5.
d. 70-79 and the mode is 70.93.
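Questions 6 and 7 can be worked with the grouped median and mode formulas. The sketch below makes two assumptions that the question does not state: class boundaries sit half a unit outside the class limits (19.5, 29.5, and so on), and the frequency beyond the last class is taken as 0.

```python
lower_limits = [20, 30, 40, 50, 60, 70]
freqs = [2, 4, 6, 8, 10, 12]
width = 10

lower_boundaries = [limit - 0.5 for limit in lower_limits]   # assumed boundaries
n = sum(freqs)

# Grouped median: L + ((N/2 - F) / f) * c
cum_before = 0
for lb, f in zip(lower_boundaries, freqs):
    if cum_before + f >= n / 2:
        median = lb + (n / 2 - cum_before) / f * width
        break
    cum_before += f

# Grouped mode: L + (d1 / (d1 + d2)) * c
i = freqs.index(max(freqs))
prev_f = freqs[i - 1] if i > 0 else 0
next_f = freqs[i + 1] if i + 1 < len(freqs) else 0            # assumed 0 past the end
d1, d2 = freqs[i] - prev_f, freqs[i] - next_f
mode = lower_boundaries[i] + d1 / (d1 + d2) * width

print(median, round(mode, 2))
```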

8. Which of the following statements is not true of the arithmetic mean?
a. For a given set of data there is one and only one mean.
b. Its meaning is easily understood.
c. It is not affected by extreme values.
d. In a symmetric data set it will be equal to the median and the mode.

9. The following are the times in seconds required for a sample of assembly-
line employees to complete a certain task: 8, 14, 16, 18, 10, 18, 19. The
sample median is:
a. 17
b. 18
c. 15.88
d. 16

10. Which of the following steps in the construction of a frequency distribution is
first?
a. Determine the mid-point of each class interval.
b. Determine the number of measures falling into each class interval.
c. Determine the number of class intervals and their widths.
d. Construct a histogram.

11. Two events are said to be ____________ if they cannot occur
simultaneously.
a. equally likely
b. independent
c. mutually exclusive
d. discrete

12. P(A ∪ B) = P(A) + P(B) - P(A ∩ B) is a statement of the _____________
rule in probability.
a. multiplication
b. addition
c. complementary
d. joint

The following data were obtained from a survey of college students. The
variable X is the number of non-assigned books read during the past six
months.

X 0 1 2 3 4 5 6

P(X=x): 0.55 0.15 0.10 0.10 0.04 0.03 0.03

13. Find P(1 ≤ X ≤ 5)
a. 1.00
b. 0.42
c. 0.58
d. 0.40
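For question 13, the probability of a range of values is the sum of the individual probabilities in that range. A sketch using exact fractions, reading the question as P(1 ≤ X ≤ 5), which is an interpretation of the printed notation:

```python
from fractions import Fraction

# Probability distribution of X from the table above.
pmf = {0: "0.55", 1: "0.15", 2: "0.10", 3: "0.10", 4: "0.04", 5: "0.03", 6: "0.03"}
pmf = {x: Fraction(p) for x, p in pmf.items()}

total_probability = sum(pmf.values())              # should equal 1 for a valid table
p_1_to_5 = sum(pmf[x] for x in range(1, 6))        # X = 1, 2, 3, 4, 5
print(float(total_probability), float(p_1_to_5))
```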

Answer question 14 through question 15 on the basis of the following information for
the number (X) of previous jobs held by 100 applicants for a security guard.

Previously held jobs (X)     0   1   2   3   4   5   6   7   8   9

Number of applicants (f)     6   18  20  30  8   7   5   3   2   1

14. What is the probability that a randomly selected applicant will have held
fewer than 3 previous jobs?
a. 0.76
b. 0.56
c. 0.44
d. 0.24

15. What is the probability that a randomly selected applicant will have at least 2
previous jobs?
a. 0.65
b. 0.56
c. 0.44
d. 0.76
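For questions 14 and 15, each probability is a relative frequency: the number of applicants in the stated range divided by the total of 100. A minimal sketch:

```python
# freqs[x] = number of applicants who held x previous jobs (x = 0..9).
freqs = [6, 18, 20, 30, 8, 7, 5, 3, 2, 1]
n = sum(freqs)

p_fewer_than_3 = sum(freqs[:3]) / n    # X = 0, 1 or 2
p_at_least_2 = sum(freqs[2:]) / n      # X = 2, 3, ..., 9
print(n, p_fewer_than_3, p_at_least_2)
```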
