
Project 1: Statistical Analysis
Quantitative Methods
Answer 1a
Bar Chart of data set:

To plot the data as a bar chart, I grouped the observations into class intervals of width 10, covering the minimum
value of 95 and the maximum value of 143.

Measure of Glucose (mg/dL)   No. of Days
95-105                       13
105-115                      7
115-125                      5
125-135                      4
135-145                      1

[Bar chart: number of days (y-axis) against measure of glucose (x-axis), with bars of 13, 7, 5, 4 and 1 for the five intervals above.]
Answer 1b
Mean: For grouped data, the formula for the mean is: Mean = Σ(f * x) / Σf
where:
• f = Frequency of a class
• x = Mid-interval value of a class
• Σf = Sum of all frequencies
• Σ(f * x) = Sum of the product of frequencies and mid-interval values

Step 1: Calculate the midpoint value of each class interval (x): (Lower limit + Upper limit) / 2
Step 2: Calculate the product of the frequency and midpoint of each class (f * x), as shown in Table 1.

Table: 1
Measure of Glucose   Frequency (f)   Mid-Interval Value (x)   (f*x)
95-105               13              100                      1300
105-115              7               110                      770
115-125              5               120                      600
125-135              4               130                      520
135-145              1               140                      140
Total sum            30                                       3330

Step 3: Sum the product of frequency and mid-value (Σ(f * x)): 3330
Step 4: Sum the frequencies of all classes (Σf): 30
Step 5: Calculate the mean (µ):
µ = Σ(f * x) / Σf = 3330 / 30 = 111
The mean (µ) of this data set is 111.

Median: For a grouped frequency distribution, the median is computed from the median class with the formula: Median = l + (h / f) * (n/2 - c)
where:
• l = lower limit of the median class interval
• h = width of the class interval
• f = frequency of the class interval to which the median belongs
• n = total frequency
• c = cumulative frequency of the class preceding the median class

Step 1: Find the median class:
The median position is n/2 = 30/2 = 15, where n is the total frequency. The cumulative frequencies are calculated in Table 2. The class 105-115 is the first whose cumulative frequency (20) reaches 15, so the median class is 105-115.

Table: 2
Measure of Glucose   Frequency (f)   Cumulative frequency
95-105               13              13 (= c)
105-115              7 (= f)         20
115-125              5               25
125-135              4               29
135-145              1               30

Step 2: Use the median formula given above:
• l = lower limit of the median class interval = 105
• h = width of the class interval = 10
• f = frequency of the class interval to which the median belongs = 7
• n/2 = median position = 15
• c = cumulative frequency of the class preceding the median class = 13
Median = 105 + (10/7) * (15 - 13) = 107.86
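The grouped mean and median steps above can be checked with a short script. This is a minimal sketch, not part of the original answer: the function names `grouped_mean` and `grouped_median` are illustrative, and the class intervals and frequencies are copied from Table 1.

```python
# Class intervals and frequencies copied from Table 1 (names are illustrative).
classes = [(95, 105), (105, 115), (115, 125), (125, 135), (135, 145)]
freqs = [13, 7, 5, 4, 1]


def grouped_mean(classes, freqs):
    """Mean = sum(f * x) / sum(f), where x is the class midpoint."""
    mids = [(lo + hi) / 2 for lo, hi in classes]
    return sum(f * x for f, x in zip(freqs, mids)) / sum(freqs)


def grouped_median(classes, freqs):
    """Median = l + (h / f) * (n/2 - c), evaluated on the median class."""
    n = sum(freqs)
    cum = 0  # cumulative frequency of the classes before the current one (c)
    for (lo, hi), f in zip(classes, freqs):
        if cum + f >= n / 2:  # first class whose cumulative frequency reaches n/2
            return lo + (hi - lo) / f * (n / 2 - cum)
        cum += f
    raise ValueError("frequency table is empty")


mean_g = grouped_mean(classes, freqs)
median_g = grouped_median(classes, freqs)
print(round(mean_g, 2), round(median_g, 2))
```

With the frequencies from Table 1, the script reproduces the mean of 111 and the median of 107.86.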
Answer 1b
Mode: To estimate the mode, use the empirical formula: Mode = 3*median - 2*mean

• Take the median value and mean value as calculated previously:

Median = 107.86 (107.857 before rounding)
Mean = 111.00

• Substitute the values into the formula:

Mode = 3*median - 2*mean
Mode = (3*107.857) - (2*111.00)
Mode = 323.57 - 222.00
Mode = 101.57

Therefore, the mode of this data set is approximately 101.57.
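The substitution above can be reproduced in a few lines. This is only a sketch: `empirical_mode` is a hypothetical helper name, and the inputs are the grouped mean and median computed earlier in Answer 1b.

```python
def empirical_mode(mean, median):
    """Empirical relation used in the slides: Mode = 3 * Median - 2 * Mean."""
    return 3 * median - 2 * mean


# Grouped mean and median from Answer 1b, kept unrounded here.
mean_g = 3330 / 30                     # 111.0
median_g = 105 + (10 / 7) * (15 - 13)  # about 107.857

mode_g = empirical_mode(mean_g, median_g)
print(round(mode_g, 2))
```

Keeping the median unrounded until the last step is why the intermediate product is 323.57 rather than 3 * 107.86 = 323.58.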


Answer 1c
Measure of Dispersion:

Data: 30 daily glucose readings, listed in ascending order
Day   mg/dL
1     95
2     96
3     96
4     98
5     99
6     99
7     101
8     101
9     102
10    102
11    103
12    104
13    104
14    107
15    107
16    111
17    112
18    113
19    113
20    114
21    115
22    117
23    121
24    123
25    124
26    127
27    129
28    131
29    135
30    143

1: Calculating Range:
The range is the difference between the two extreme observations of a distribution. The formula is: Range = XMax - XMin,
where XMax and XMin are the maximum and minimum observations, respectively.
• The maximum mg/dL value = 143
• The minimum mg/dL value = 95
• Range = 143 - 95 = 48 mg/dL

2: Calculating Quartile Deviation:
The quartile deviation formula is $QD = \frac{Q_3 - Q_1}{2}$,
where Q1 and Q3 are the first and third quartiles of the distribution, respectively, and (Q3 - Q1) is called the interquartile range.
• The total number of data values is n = 30.
• n is even, so the median position (n+1)/2 = (30+1)/2 = 15.5 falls midway between the 15th and 16th observations: Q2 = (107 + 111)/2 = 109
• Lower half of the data (observations 1 to 15): 95, 96, 96, 98, 99, 99, 101, 101, 102, 102, 103, 104, 104, 107, 107
• First quartile Q1 = median of the lower half = observation at position (15+1)/2 = 8, which is 101
• Upper half of the data (observations 16 to 30): 111, 112, 113, 113, 114, 115, 117, 121, 123, 124, 127, 129, 131, 135, 143
• Third quartile Q3 = median of the upper half = observation at position (15+1)/2 = 8, which is 121
• Quartile deviation QD = (121 - 101)/2 = 10

Therefore, the quartile deviation of the given data set is 10.
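The range and quartile deviation can be verified with the sketch below. It assumes the median-of-halves convention for quartiles (split the sorted data into two halves and take the median of each); other quartile conventions, including the default of `statistics.quantiles`, may give slightly different values. The helper name `quartiles_by_halves` is illustrative.

```python
from statistics import median

# The 30 glucose readings (mg/dL) from the data table.
glucose = [95, 96, 96, 98, 99, 99, 101, 101, 102, 102, 103, 104, 104, 107, 107,
           111, 112, 113, 113, 114, 115, 117, 121, 123, 124, 127, 129, 131, 135, 143]


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    lower = data[: n // 2]        # first half (the median is excluded when n is odd)
    upper = data[(n + 1) // 2:]   # second half
    return median(lower), median(data), median(upper)


q1, q2, q3 = quartiles_by_halves(glucose)
data_range = max(glucose) - min(glucose)
quartile_deviation = (q3 - q1) / 2
print(data_range, q1, q3, quartile_deviation)
```

For the 30 readings this gives a range of 48, Q1 = 101, Q3 = 121 and a quartile deviation of 10.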
Answer 1c
Measure of Dispersion:
3: Calculating Mean Deviation:

Formula of Mean Deviation: $MD = \frac{\sum |x_i - \bar{x}|}{N}$
Where:
• xi represents each individual observation
• x̄ is the mean (average) of the observations
• N is the number of observations in the data set
• |xi - x̄| is the absolute deviation of each observation from the mean

Step 1: Calculate the mean: $\bar{x} = \frac{\sum x_i}{N}$
As per the table, ∑xi = 3342 and N = 30.
Therefore, mean x̄ = 3342 / 30 = 111.40

Step 2: Calculate the mean deviation:
As per the table, ∑|xi - x̄| = 314.8
Apply the mean deviation formula: 314.8 / 30 = 10.49
Hence, the mean deviation of the given data set ≈ 10.49

Day     mg/dL (xi)   |xi - x̄|, with x̄ = 111.4
1       95           16.4
2       96           15.4
3       96           15.4
4       98           13.4
5       99           12.4
6       99           12.4
7       101          10.4
8       101          10.4
9       102          9.4
10      102          9.4
11      103          8.4
12      104          7.4
13      104          7.4
14      107          4.4
15      107          4.4
16      111          0.4
17      112          0.6
18      113          1.6
19      113          1.6
20      114          2.6
21      115          3.6
22      117          5.6
23      121          9.6
24      123          11.6
25      124          12.6
26      127          15.6
27      129          17.6
28      131          19.6
29      135          23.6
30      143          31.6
Total   3342         314.8
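The mean deviation can be recomputed directly from the 30 readings. This is a minimal sketch; `mean_deviation` is an illustrative name, not a standard-library function.

```python
# The 30 glucose readings (mg/dL) from the table.
glucose = [95, 96, 96, 98, 99, 99, 101, 101, 102, 102, 103, 104, 104, 107, 107,
           111, 112, 113, 113, 114, 115, 117, 121, 123, 124, 127, 129, 131, 135, 143]


def mean_deviation(values):
    """Average absolute deviation from the arithmetic mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


glucose_mean = sum(glucose) / len(glucose)
md = mean_deviation(glucose)
print(round(glucose_mean, 2), round(md, 2))
```

The script confirms the mean of 111.4 and the mean deviation of about 10.49.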
Answer 1c
Measure of Dispersion:
4: Calculating Standard Deviation:

Formula of Standard Deviation: $\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$
Where:
• xi represents each individual observation
• x̄ is the mean (average) of the observations
• N is the number of observations in the data set
• (xi - x̄)² is the squared deviation of each observation from the mean

Step 1: Calculate the squared deviations (xi - x̄)²:
As per the table, the sum of squared differences is ∑(xi - x̄)² = 4777.2, with N = 30.

Step 2: Calculate the variance = ∑(xi - x̄)² / N
Therefore, variance = 4777.2 / 30 = 159.24

Step 3: Calculate the standard deviation: $\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$
As the variance was already calculated in step 2, the standard deviation is simply the square root of the variance:
Standard deviation σ = √159.24 = 12.62

The standard deviation of the given data set ≈ 12.62

Day     mg/dL (xi)   |xi - x̄|, with x̄ = 111.4   |xi - x̄|²
1       95           16.4                        269.0
2       96           15.4                        237.2
3       96           15.4                        237.2
4       98           13.4                        179.6
5       99           12.4                        153.8
6       99           12.4                        153.8
7       101          10.4                        108.2
8       101          10.4                        108.2
9       102          9.4                         88.4
10      102          9.4                         88.4
11      103          8.4                         70.6
12      104          7.4                         54.8
13      104          7.4                         54.8
14      107          4.4                         19.4
15      107          4.4                         19.4
16      111          0.4                         0.2
17      112          0.6                         0.4
18      113          1.6                         2.6
19      113          1.6                         2.6
20      114          2.6                         6.8
21      115          3.6                         13.0
22      117          5.6                         31.4
23      121          9.6                         92.2
24      123          11.6                        134.6
25      124          12.6                        158.8
26      127          15.6                        243.4
27      129          17.6                        309.8
28      131          19.6                        384.2
29      135          23.6                        557.0
30      143          31.6                        998.6
Total                314.8                       4777.2
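The three steps above (squared deviations, variance, square root) can be sketched as follows. The divisor is N, as in the slides, so this is the population form of the standard deviation; `population_variance` is an illustrative helper name.

```python
from math import sqrt

# The 30 glucose readings (mg/dL) from the table.
glucose = [95, 96, 96, 98, 99, 99, 101, 101, 102, 102, 103, 104, 104, 107, 107,
           111, 112, 113, 113, 114, 115, 117, 121, 123, 124, 127, 129, 131, 135, 143]


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values) / len(values)


variance = population_variance(glucose)
sd = sqrt(variance)
print(round(variance, 2), round(sd, 2))
```

The standard library's `statistics.pstdev` uses the same N divisor and can serve as a cross-check, whereas `statistics.stdev` divides by N - 1 and would give a slightly larger value.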
Answer 1d
Histogram:
Measure of Glucose (mg/dL)   No. of Days
95-105                       13
105-115                      7
115-125                      5
125-135                      4
135-145                      1

Mean: 111.00
Median: 107.86
Mode: 101.57

[Histogram: number of days (y-axis) against measurement of glucose (x-axis), with bars of 13, 7, 5, 4 and 1 for the five intervals above.]

As per the histogram, this is a positively skewed distribution: the mode (most frequently occurring value) is
less than the median (middle value), and the median is less than the mean.
• Mean > Median > Mode (111.00 > 107.86 > 101.57)
Answer 2a
1: Calculating Mean Deviation:

Formula of Mean Deviation: $MD = \frac{\sum |x_i - \bar{x}|}{N}$
Where:
• xi represents each individual observation
• x̄ is the mean (average) of the observations
• N is the number of observations in the data set
• |xi - x̄| is the absolute deviation of each observation from the mean

Step 1: Calculate the mean: $\bar{x} = \frac{\sum x_i}{N}$
As per the table, ∑xi = 5756424 and N = 20.
Therefore, mean x̄ = 5756424 / 20 = 287821.20

Step 2: Calculate the mean deviation:
As per the table, ∑|xi - x̄| = 3431976
Apply the mean deviation formula: 3431976 / 20 = 171598.80
Hence, the mean deviation of the given data set ≈ 171598.80

Type of College   Fees (xi)   |xi - x̄|, with x̄ = 287821
Government        16280       271541
Government        25000       262821
Government        41000       246821
Government        60744       227077
Government        69000       218821
Private           73200       214621
Government        80000       207821
Private           258000      29821
Private           264000      23821
Private           275000      12821
Private           312000      24179
Private           312000      24179
Private           312500      24679
Private           375000      87179
Private           413000      125179
Private           476200      188379
Private           495000      207179
Private           523000      235179
Private           597000      309179
Private           778500      490679
Total             5756424     3431976
Answer 2a
2: Calculating Standard Deviation:

Formula of Standard Deviation: $\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$
Where:
• xi represents each individual observation
• x̄ is the mean (average) of the observations
• N is the number of observations in the data set
• (xi - x̄)² is the squared deviation of each observation from the mean

Step 1: Calculate the squared deviations (xi - x̄)²:
As per the table, the sum of squared differences is ∑(xi - x̄)² = 889174188547, with N = 20.

Step 2: Calculate the variance = ∑(xi - x̄)² / N
Therefore, variance = 889174188547 / 20 = 44458709427.36

Step 3: Calculate the standard deviation: $\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{N}}$
As the variance was already calculated in step 2, the standard deviation is simply the square root of the variance:
Standard deviation σ = √44458709427.36 = 210852.34
The standard deviation of the given data set ≈ 210852.34

Type of College   Fees (xi)   |xi - x̄|, with x̄ = 287821   |xi - x̄|²
Government        16280       271541                       73734623297
Government        25000       262821                       69074983169
Government        41000       246821                       60920704769
Government        60744       227077                       51564054760
Government        69000       218821                       47882717569
Private           73200       214621                       46062259489
Government        80000       207821                       43189651169
Private           258000      29821                        889303969
Private           264000      23821                        567449569
Private           275000      12821                        164383169
Private           312000      24179                        584614369
Private           312000      24179                        584614369
Private           312500      24679                        609043169
Private           375000      87179                        7600143169
Private           413000      125179                       15669731969
Private           476200      188379                       35486572289
Private           495000      207179                       42923055169
Private           523000      235179                       55309067969
Private           597000      309179                       95591530369
Private           778500      490679                       240765684769
Total                         3431976                      889174188547
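Both dispersion measures for the fee data can be recomputed with the standard library. This is a sketch under the same convention as the slides (divisor N, hence `pvariance` and `pstdev`); the variable names are illustrative.

```python
from statistics import mean, pstdev, pvariance

# Fees of the 20 colleges, in the order listed in the table.
fees = [16280, 25000, 41000, 60744, 69000, 73200, 80000, 258000, 264000, 275000,
        312000, 312000, 312500, 375000, 413000, 476200, 495000, 523000, 597000, 778500]

fee_mean = mean(fees)
# Mean deviation: average absolute distance from the mean.
fee_md = sum(abs(x - fee_mean) for x in fees) / len(fees)
# Population variance and standard deviation (divisor N, as in the slides).
fee_var = pvariance(fees)
fee_sd = pstdev(fees)

print(round(fee_mean, 2), round(fee_md, 2), round(fee_var, 2), round(fee_sd, 2))
```

The squared deviations in the table are based on the unrounded mean of 287821.2, which is why the script's variance matches 44458709427.36 even though the table shows deviations rounded to whole numbers.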
Answer 2b
Based on the data set of twenty universities of India, we can state the null and alternative hypothesis as mentioned below:

Null Hypothesis: H0: µ ≥ 4,10,000


Alternative Hypothesis: H1: µ < 4,10,000

Step 1: Put the values of mean and standard deviation as calculated in previous slides:
• Sample mean (X̄) = 287821.20
• Given mean (μ) = 4,10,000
• Sample Standard Deviation (σ) = 210852.34
• Sample Size (n) = 20

Step 2: Calculate the t-statistic using the formula $t = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$
Where:
• X̄ is the sample mean.
• μ is the given mean.
• σ is the standard deviation.
• n is the sample size.

Apply formula: t = (287821.20 - 410000) / (210852.34 / √20) = -122178.80 / 47148.02 = -2.59

Degrees of freedom (df) = n - 1 = 20 - 1 = 19; level of significance (α) = 5% or 0.05.
As per the t-table, at df = 19 and a significance level of 0.05 for a one-tailed test, the critical t-value = -1.729.
Hence, the t-statistic (-2.59) is less than the critical t-value (-1.729).

Since the calculated t-statistic (-2.59) is less than the critical t-value (-1.729), we reject the null hypothesis (H₀: µ ≥ 4,10,000) and accept the alternative hypothesis (H₁: µ <
4,10,000). This means that there is enough evidence to conclude that the average (µ) is less than 4,10,000.
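The test can be reproduced with the sketch below. It mirrors the slides by using the standard deviation with divisor N (`pstdev`) and takes the critical value -1.729 from the t-table as quoted in the text instead of computing it. A textbook one-sample t-test usually uses the n - 1 divisor (`stdev`), so that variant is computed alongside as an assumption-labelled cross-check; all variable names are illustrative.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# Fees of the 20 colleges, in the order listed in the table.
fees = [16280, 25000, 41000, 60744, 69000, 73200, 80000, 258000, 264000, 275000,
        312000, 312000, 312500, 375000, 413000, 476200, 495000, 523000, 597000, 778500]

mu0 = 410_000      # hypothesised mean under H0
t_crit = -1.729    # one-tailed critical value at df = 19, alpha = 0.05 (from the text)

n = len(fees)
xbar = mean(fees)

# As in the slides: standard deviation with divisor N.
t_stat = (xbar - mu0) / (pstdev(fees) / sqrt(n))

# Cross-check (not in the slides): standard deviation with divisor n - 1.
t_stat_sample_sd = (xbar - mu0) / (stdev(fees) / sqrt(n))

reject_h0 = t_stat < t_crit  # left-tailed test: reject when t falls below the critical value
print(round(t_stat, 2), reject_h0, t_stat_sample_sd < t_crit)
```

Both versions of the statistic fall below the critical value, so the decision to reject H₀ does not depend on which divisor is used.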
