Assignment in Research and Statistics


Assignment in Research and Statistics – Twenty Sample Problems in Statistics

Helen Grace V. Samontañez

School of Graduate Studies

Philippine Christian University

Master in Business Administration

Dr. Visitacion B. Crisostomo

June 3, 2023

Descriptive Statistics

1. Percentage of rejects of batteries per size.

The formula is: % Rejection Rate = (No. of Rejects / Order Quantity) x 100.
G31 has the highest rejection rate, while 4DT has the lowest.

Size Order Qty No. of Rejects % Rejection Rate
G31 10,553 1,617 15.33%
D31 2,065 136 6.57%
G78 3,351 213 6.35%
LN5 292 15 5.04%
D26 9,467 450 4.75%
B24 1,719 57 3.32%
G51 320 11 3.32%
GC2 3,662 112 3.07%
GU1 5,675 155 2.74%
G75 709 18 2.48%
D23 2,210 52 2.37%
LN3 1,900 31 1.65%
G65 4,224 70 1.65%
F51 7 0 1.47%
LB3 7 0 1.33%
LN4 1,931 26 1.33%
B20 27 0 1.22%
A46 160 2 1.18%
H52 241 3 1.08%
G58 372 3 0.74%
LN2 1,326 10 0.72%
G34 830 5 0.57%
LB2 451 2 0.43%
GC8 948 2 0.26%
4DT 325 1 0.17%
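
The rejection-rate formula can be sketched in a few lines of Python. This is only an illustration on three rows copied from the table; the function name `rejection_rate` is an assumption, and the computed values can differ slightly in the last decimal from the printed percentages.

```python
def rejection_rate(order_qty: float, rejects: float) -> float:
    """Return the rejects as a percentage of the order quantity."""
    return rejects / order_qty * 100


# (size, order quantity, number of rejects) copied from the table
rows = [("G31", 10553, 1617), ("D26", 9467, 450), ("GC2", 3662, 112)]

rates = {size: rejection_rate(qty, rej) for size, qty, rej in rows}
for size, rate in rates.items():
    print(f"{size}: {rate:.2f}%")
```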

2. Ranking
Rank the sizes by order quantity in descending order, assigning rank 1 to the highest.
G31 has the largest order quantity, while F51 is ranked last.

Rank Size Order Qty No. of Rejects % Rejection Rate
1 G31 10,553 1,617 15.33%
2 D26 9,467 450 4.75%
3 GU1 5,675 155 2.74%
4 G65 4,224 70 1.65%
5 GC2 3,662 112 3.07%
6 G78 3,351 213 6.35%
7 D23 2,210 52 2.37%
8 D31 2,065 136 6.57%
9 LN4 1,931 26 1.33%
10 LN3 1,900 31 1.65%
11 B24 1,719 57 3.32%
12 LN2 1,326 10 0.72%
13 GC8 948 2 0.26%
14 G34 830 5 0.57%
15 G75 709 18 2.48%
16 LB2 451 2 0.43%
17 G58 372 3 0.74%
18 4DT 325 1 0.17%
19 G51 320 11 3.32%
20 LN5 292 15 5.04%
21 H52 241 3 1.08%
22 A46 160 2 1.18%
23 B20 27 0 1.22%
24 LB3 7 0 1.33%
25 F51 7 0 1.47%
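
The ranking step can be sketched as a descending sort. The example below uses only five of the sizes from the table, so the ranks it prints are ranks within that subset, not the full ranking.

```python
# (size, order quantity) for a small subset of the table
orders = [("GU1", 5675), ("G31", 10553), ("F51", 7), ("D26", 9467), ("G65", 4224)]

# Sort by order quantity, largest first, then number the rows from 1
ranked = sorted(orders, key=lambda row: row[1], reverse=True)
ranking = {size: rank for rank, (size, _) in enumerate(ranked, start=1)}

for size, rank in ranking.items():
    print(rank, size)
```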

3. Frequency Distribution
Identify the frequency distribution of transit time data below.

Transit Time
35 35 35 35 35 35 37
37 38 38 39 39 39 40
40 40 40 40 43 43 43
44 44 44 44 44 46 46
48 48 48 48 48 53 56
56 58 58 62 67

Step 1: Calculate the range of the data set

Range = 67-35
= 32
Step 2: Divide the range by the number of groups you want and then round up
Let’s say we want 5 groups. Then:

Class Width = 32/5
= 6.4, rounded up to 7
Step 3: Use the class width to create your groups

Classes (Transit time, in days) Frequency
32 - 39
39 - 46
46 - 53
53 - 60
60 - 67
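Step 4: Find the frequency for each group. Because adjacent classes share a limit, each class here includes its lower limit and excludes its upper limit, except the last class, which also includes 67.

Classes (Transit time, in days) Frequency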
32 - 39 10
39 - 46 16
46 - 53 7
53 - 60 5
60 - 67 2
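
The counting in Step 4 can be checked with a short script. This is a sketch that assumes each class includes its lower limit and excludes its upper limit, with the last class also including 67; that convention is inferred from the frequencies shown, and the function name is hypothetical.

```python
transit_times = [
    35, 35, 35, 35, 35, 35, 37, 37, 38, 38, 39, 39, 39, 40,
    40, 40, 40, 40, 43, 43, 43, 44, 44, 44, 44, 44, 46, 46,
    48, 48, 48, 48, 48, 53, 56, 56, 58, 58, 62, 67,
]
edges = [32, 39, 46, 53, 60, 67]  # class limits with a width of 7


def class_frequencies(values, edges):
    """Count values per class; lower limit inclusive, upper exclusive,
    except the last class, which also includes its upper limit."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            lower, upper = edges[i], edges[i + 1]
            if lower <= v < upper or (i == last and v == upper):
                counts[i] += 1
                break
    return counts


frequencies = class_frequencies(transit_times, edges)
print(frequencies)
```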

Figure: Frequency Distribution (Transit Time, days). X-axis: Transit Time, days; y-axis: Frequency.

4. Mean
Find the mean of the first 10 odd integers.

Solution:

First 10 odd integers: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19

Mean = Sum of the first 10 odd integers/Number of such integers

= (1 + 3 + 5 + 7 + 9 + 11 + 13 + 15 + 17 + 19)/10

= 100/10

= 10

Therefore, the mean of the first 10 odd integers is 10.
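
The same calculation as a minimal Python sketch, generating the first 10 odd integers rather than typing them:

```python
# The k-th odd integer (counting from k = 0) is 2k + 1
odd_integers = [2 * k + 1 for k in range(10)]
mean = sum(odd_integers) / len(odd_integers)
print(odd_integers, mean)
```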

5. Median
What is the median of the following data set?
32, 6, 21, 10, 8, 11, 12, 36, 17, 16, 15, 18, 40, 24, 21, 23, 24, 24, 29, 16, 32, 31, 10, 30, 35,
32, 18, 39, 12, 20
Solution:
The ascending order of the given data set is:
6, 8, 10, 10, 11, 12, 12, 15, 16, 16, 17, 18, 18, 20, 21, 21, 23, 24, 24, 24, 29, 30, 31, 32, 32,
32, 35, 36, 39, 40
Number of values in the data set = n = 30
n/2 = 30/2 = 15
15th data value = 21
(n/2) +1 = 16
16th data value = 21

Median = [(n/2)th observation + {(n/2)+1}th observation]/2


= (15th data value + 16th data value)/2
= (21 + 21)/2
= 21
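
As a cross-check, the median of an even-sized data set can be computed by sorting and averaging the two middle values. The sketch below does this by hand and compares the result with `statistics.median` from the Python standard library.

```python
from statistics import median

data = [32, 6, 21, 10, 8, 11, 12, 36, 17, 16, 15, 18, 40, 24, 21,
        23, 24, 24, 29, 16, 32, 31, 10, 30, 35, 32, 18, 39, 12, 20]

ordered = sorted(data)
n = len(ordered)
# For even n, average the (n/2)th and (n/2 + 1)th values (1-based positions)
manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2
library_median = median(data)
print(manual_median, library_median)
```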

6. Mode
Identify the mode for the following data set:

21, 19, 62, 21, 66, 28, 66, 48, 79, 59, 28, 62, 63, 63, 48, 66, 59, 66, 94, 79, 19, 94

Solution:

Let us write the given data set in ascending order as follows:

19, 19, 21, 21, 28, 28, 48, 48, 59, 59, 62, 62, 63, 63, 66, 66, 66, 66, 79, 79, 94, 94

Here, we can observe that the number 66 occurred the maximum number of times.

Thus, the mode of the given data set is 66.
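
Counting occurrences is easy to automate. A minimal sketch using `collections.Counter` from the Python standard library:

```python
from collections import Counter

data = [21, 19, 62, 21, 66, 28, 66, 48, 79, 59, 28,
        62, 63, 63, 48, 66, 59, 66, 94, 79, 19, 94]

counts = Counter(data)
# most_common(1) returns a list holding the (value, count) pair with the top count
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)
```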

7. Measures of Dispersion
The table below shows the results for two companies, A and B.

Which company has the larger wage bill?
Calculate the coefficients of variation for both companies.
Calculate the average daily wage and the variance of the distribution of wages of all the
employees in firms A and B taken together.
Solution:

For Company A
No. of employees = n1 = 900, and average daily wages = ȳ1 = Rs. 250

We know, average daily wage = Total wages ⁄ Total number of employees

or, Total wages = Total employees × average daily wage = 900 × 250 = Rs. 225000 … (i)

For Company B
No. of employees = n2 = 1000, and average daily wages = ȳ2 = Rs. 220

So, Total wages = Total employees × average daily wage = 1000 × 220 = Rs. 220000 … (ii)

Comparing (i), and (ii), we see that Company A has a larger wage bill.

For Company A
Variance of distribution of wages = σ1² = 100

C.V. of distribution of wages = 100 × (standard deviation of distribution of wages) ⁄ (average daily wages)

or, C.V. A = 100 × √100 ⁄ 250 = 100 × 10 ⁄ 250 = 4 … (i)

For Company B
Variance of distribution of wages = σ2² = 144

C.V. B = 100 × √144 ⁄ 220 = 100 × 12 ⁄ 220 = 5.45 … (ii)

Comparing (i) and (ii), we see that Company B has greater variability.

For Companies A and B taken together

The average daily wage for both companies taken together is

ȳ = (n1 ȳ1 + n2 ȳ2) ⁄ (n1 + n2) = (900 × 250 + 1000 × 220) ⁄ (900 + 1000) = 445000 ⁄ 1900 = Rs. 234.21

The combined variance is σ² = [n1 (σ1² + d1²) + n2 (σ2² + d2²)] ⁄ (n1 + n2)

Here, d1 = ȳ1 − ȳ = 250 − 234.21 = 15.79 and d2 = ȳ2 − ȳ = 220 − 234.21 = −14.21.

Hence, σ² = [900 × (100 + 15.79²) + 1000 × (144 + (−14.21)²)] ⁄ (900 + 1000)

or, σ² = (314391.69 + 345924.10) ⁄ 1900 = 347.53.
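
The coefficients of variation and the combined mean and variance can be reproduced with a short script. This is a sketch using the figures quoted in the solution; the variable names are illustrative only.

```python
from math import sqrt

# Figures quoted in the solution: size, mean daily wage (Rs.), variance
n1, mean1, var1 = 900, 250.0, 100.0     # Company A
n2, mean2, var2 = 1000, 220.0, 144.0    # Company B

# Coefficient of variation = 100 * standard deviation / mean
cv_a = 100 * sqrt(var1) / mean1
cv_b = 100 * sqrt(var2) / mean2

# Combined mean is the size-weighted average of the two means
combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)

# Combined variance adds each group's squared offset from the combined mean
d1 = mean1 - combined_mean
d2 = mean2 - combined_mean
combined_var = (n1 * (var1 + d1 ** 2) + n2 * (var2 + d2 ** 2)) / (n1 + n2)

print(round(cv_a, 2), round(cv_b, 2), round(combined_mean, 2), round(combined_var, 2))
```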

8. Given the data in the table below:

Calculate the mean absolute deviation, range, semi-interquartile range, variance and standard deviation.

9. Kendall's Coefficient of Concordance (W)

The table below shows the distribution of marks scored by 10 students in Mathematics (X),
Statistics (Y) and Computer (Z) tests. Calculate the correlation coefficient from the given table.

10. Regression Analysis

Find the line of regression of Y on X from the table below:

11. Measures of Dispersion


Find the Variance and Standard Deviation of the Following Numbers: 1, 3, 5, 5, 6, 7, 9, 10.
Solution:
The mean = (1+ 3+ 5+ 5+ 6+ 7+ 9+ 10)/8 = 46/ 8 = 5.75
Step 1: Subtract the mean value from individual value
(1 – 5.75), (3 – 5.75), (5 – 5.75), (5 – 5.75), (6 – 5.75), (7 – 5.75), (9 – 5.75), (10 – 5.75)
= -4.75, -2.75, -0.75, -0.75, 0.25, 1.25, 3.25, 4.25
Step 2: Squaring the above values, we get 22.5625, 7.5625, 0.5625, 0.5625, 0.0625, 1.5625,
10.5625, 18.0625
Step 3: 22.5625 + 7.5625 + 0.5625 + 0.5625 + 0.0625 + 1.5625 + 10.5625 + 18.0625
= 61.5
Step 4: n = 8, therefore variance (σ²) = 61.5 / 8 = 7.6875 ≈ 7.69
Now, standard deviation (σ) = √7.6875 ≈ 2.77
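
The four steps above map directly onto a few lines of Python. This sketch computes the population variance (dividing by n, as the solution does), not the sample variance.

```python
from math import sqrt

numbers = [1, 3, 5, 5, 6, 7, 9, 10]

mean = sum(numbers) / len(numbers)
# Steps 1 and 2: deviation of each value from the mean, then squared
squared_deviations = [(x - mean) ** 2 for x in numbers]
# Steps 3 and 4: sum the squares and divide by n (population variance)
variance = sum(squared_deviations) / len(numbers)
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 2))
```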
Example 2: Calculate the range and coefficient of range for the following data values.
45, 55, 63, 76, 67, 84, 75, 48, 62, 65
Solution:
Let Xi values be: 45, 55, 63, 76, 67, 84, 75, 48, 62, 65
Here,
Maximum value (Xmax) = 84
Minimum or Least value (Xmin) = 45
Range = Maximum value − Minimum value
= 84 – 45
= 39
Coefficient of range = (Xmax – Xmin)/(Xmax + Xmin)
= (84 – 45)/(84 + 45)
= 39/129
= 0.302 (approx)
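
A minimal sketch of the same range and coefficient-of-range calculation in Python:

```python
values = [45, 55, 63, 76, 67, 84, 75, 48, 62, 65]

x_max, x_min = max(values), min(values)
data_range = x_max - x_min
# Coefficient of range = (Xmax - Xmin) / (Xmax + Xmin)
coefficient_of_range = (x_max - x_min) / (x_max + x_min)

print(data_range, round(coefficient_of_range, 3))
```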

12. Suppose X1, . . . , X100 are i.i.d. random variables which have a uniform distribution on
[a − 2, a + 2], where a is unknown. Suppose the random sample produces a sample mean equal to 3.
Compute a 95% confidence interval for a.

SOLUTION
A random variable with uniform distribution on [a − 2, a + 2] has mean
µ = a. So, a confidence interval for µ is a confidence interval for a. Because n = 100 is large, the
confidence interval provided by the Central Limit Theorem applies: x̄ ± 1.96 σ/√n.

A random variable with uniform distribution on [a − 2, a + 2] has standard
deviation σ = 4/√12. Our sample mean is 3. Substituting, we get
3 ± 1.96 × (4/√12)/√100 ≈ 3 ± 0.226, that is, approximately (2.774, 3.226).
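
The interval can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library for the normal quantile (about 1.96) and relies on the fact that a uniform distribution on an interval of width 4 has standard deviation 4/√12.

```python
from math import sqrt
from statistics import NormalDist

n = 100
sample_mean = 3.0
sigma = 4 / sqrt(12)               # SD of a uniform distribution of width 4
z = NormalDist().inv_cdf(0.975)    # two-sided 95% quantile, about 1.96

margin = z * sigma / sqrt(n)
ci_a = (sample_mean - margin, sample_mean + margin)
print(tuple(round(x, 3) for x in ci_a))
```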

13. In a mythical national survey, 225 students are randomly selected from those taking calculus,
and asked if calculus is their favorite subject. 100 students reply that calculus is their favorite
subject. Give a 95% confidence interval for the proportion of all students taking calculus who
consider it their favorite subject.
SOLUTION We will plug into the 95% confidence interval formula for a population proportion,
p̂ ± 1.96 √(p̂(1 − p̂)/n). Here p̂ = 100/225 ≈ 0.444 and n = 225, so the interval is
0.444 ± 1.96 × 0.0331 ≈ 0.444 ± 0.065, or approximately (0.380, 0.509).
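
A numeric sketch of the large-sample interval for a proportion, p̂ ± z √(p̂(1 − p̂)/n), with the normal quantile taken from the Python standard library:

```python
from math import sqrt
from statistics import NormalDist

n = 225
favorable = 100
p_hat = favorable / n
z = NormalDist().inv_cdf(0.975)    # about 1.96 for a 95% interval

margin = z * sqrt(p_hat * (1 - p_hat) / n)
ci_p = (p_hat - margin, p_hat + margin)
print(tuple(round(x, 3) for x in ci_p))
```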

14. Suppose in a random sample of 225 undergraduate men at UMD that the average best (highest
weight) bench press is 150 pounds, with sample standard deviation of 20 pounds. Compute a
95% confidence interval for the average best bench press for UMD undergraduate men.
SOLUTION We use for the interval the formula x̄ ± 1.96 s/√n.

Here the sample mean is 150 and s = 20. So the desired 95% confidence interval, in pounds, for
the average best bench press of UMD undergraduate men is
150 ± 1.96 × 20/√225 ≈ 150 ± 2.61, or approximately (147.39, 152.61).
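
The interval for the mean can be sketched the same way. This assumes the large-sample normal approximation, with the sample standard deviation s standing in for σ.

```python
from math import sqrt
from statistics import NormalDist

n = 225
sample_mean = 150.0
s = 20.0
z = NormalDist().inv_cdf(0.975)    # about 1.96 for a 95% interval

margin = z * s / sqrt(n)
ci_mean = (sample_mean - margin, sample_mean + margin)
print(tuple(round(x, 2) for x in ci_mean))
```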

15. T Test
After a new sales training is given to employees the average sale goes up to $150 (a sample of
25 employees was examined) with a standard deviation of $12. Before the training, the average
sale was $100. Check if the training helped at α = 0.05.

Solution: The t test in inferential statistics is used to solve this problem.

H0: μ = 100, H1: μ > 100
t = (x̄ − μ)/(s/√n) = (150 − 100)/(12/√25) = 50/2.4 = 20.83

The degrees of freedom are given by 25 − 1 = 24.

Using the t table at α = 0.05, the critical value is T(0.05, 24) = 1.71.
As 20.83 > 1.71, the null hypothesis is rejected and it is concluded that the training helped
in increasing the average sales.
Answer: Reject the null hypothesis.
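
The test statistic can be reproduced with the one-sample formula t = (x̄ − μ)/(s/√n). In this sketch the critical value 1.71 is simply copied from the t table quoted in the text rather than computed.

```python
from math import sqrt

sample_mean, mu0, s, n = 150.0, 100.0, 12.0, 25

t_stat = (sample_mean - mu0) / (s / sqrt(n))
degrees_of_freedom = n - 1
t_critical = 1.71                  # T(0.05, 24), taken from the text's t table

reject_null_t = t_stat > t_critical
print(round(t_stat, 2), degrees_of_freedom, reject_null_t)
```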
16. Test of Hypothesis
A test was conducted with the variance = 108 and n = 8. Certain changes were made in the test
and it was again conducted with variance = 72 and n = 6. At a 0.05 significance level, was there
any improvement in the test results?
Solution: The F test in inferential statistics will be used.

The F test formula is given as follows:

F = s1² / s2² = 108 / 72 = 1.5

With degrees of freedom 8 − 1 = 7 and 6 − 1 = 5, the critical value from the F table at α = 0.05 is 4.88.
As 1.5 < 4.88, we fail to reject the null hypothesis and conclude that there is not enough evidence
to suggest that the test results improved.
Answer: Fail to reject the null hypothesis.
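
A minimal sketch of the comparison: the F statistic is the ratio of the two variances, and the null hypothesis is rejected only if it exceeds the critical value. The critical value 4.88 is copied from the text, not computed here.

```python
var_before, n_before = 108.0, 8
var_after, n_after = 72.0, 6

f_stat = var_before / var_after
df_numerator = n_before - 1
df_denominator = n_after - 1
f_critical = 4.88                  # value quoted in the text for alpha = 0.05

reject_null_f = f_stat > f_critical
print(f_stat, df_numerator, df_denominator, reject_null_f)
```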
17. Testing Hypothesis
After a new sales training is given to employees the average sale goes up to $150 (a sample of
49 employees was examined). Before the training, the average sale was $100 with a standard
deviation of $12. Check if the training helped at α = 0.05.

Solution: This is similar to problem 15. However, as the sample size is 49 and the population
standard deviation is known, the z test in inferential statistics is used.

z = (x̄ − μ)/(σ/√n) = (150 − 100)/(12/√49) = 50/1.714 ≈ 29.17

From the z table at α = 0.05, the critical value is 1.645.

As 29.17 > 1.645, the null hypothesis is rejected and it is concluded that the training was
useful in increasing the average sales.

Answer: Reject the null hypothesis.
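
The z statistic and the one-sided critical value can be checked with the Python standard library. This sketch assumes a one-tailed test, which matches the critical value of 1.645 used in the solution.

```python
from math import sqrt
from statistics import NormalDist

sample_mean, mu0, sigma, n = 150.0, 100.0, 12.0, 49
alpha = 0.05

z_stat = (sample_mean - mu0) / (sigma / sqrt(n))
z_critical = NormalDist().inv_cdf(1 - alpha)   # one-tailed, about 1.645

reject_null_z = z_stat > z_critical
print(round(z_stat, 2), round(z_critical, 3), reject_null_z)
```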



18. After new sales training is given to employees, the mean sale goes up to £150 (a sample of 25
employees examined) with a standard deviation of £12. Before the training, the average sale
was £100. Check if the training helped at α = 0.05.
Solution: The t-test in inferential statistics solves this problem with the formula:

t = (x̄ − μ)/(s/√n)

x̄ = 150, μ = 100, s = 12, n = 25

H0: μ = 100
H1: μ > 100
t = (150 − 100)/(12/√25) = 20.83
The degrees of freedom are given by 25 − 1 = 24. Using the t table at α = 0.05, the critical value is
T(0.05, 24) = 1.71. As 20.83 > 1.71, H0 is rejected. The conclusion is that the training
helped in increasing the average sales.

19. Spearman's Rank Order Correlation Coefficient (rho):

The following table gives marks obtained by 10 students in
Mathematics (X) and Computer (Y) examinations. Find out the Spearman's
Rank Correlation Coefficient.

20. Chi-Square Test:


The table below shows the number of applicants who applied for a Diploma programme in
various Departments in Waziri Umaru Federal Polytechnic Birnin Kebbi. Test whether
there is equal preference of Departments among the applicants.

It is expected that there is equal preference among the applicants, therefore the expected
frequency will be