Mod. 4 Measures of Dispersion BSA


MODULE 4 - MEASURES OF VARIABILITY or DISPERSION

Measures of central tendency describe a given set of data by indicating the point where the items are centrally located. In
some instances, these measures may not be sufficient to describe the data, especially when the intention is to look at how the
elements of the data vary from each other.
Let us take 5 sets of observations from the results of a 30-item quiz.

Table 5.1 Raw scores of 5 students in 5 sets of quizzes.
Student   Set 1   Set 2   Set 3   Set 4   Set 5   X̄
A         15      15      17      18      20      17
B         15      16      16      18      20      17
C         14      15      18      19      19      17
D         14      15      16      19      21      17
F         11      13      18      18      25      17

Solving for X̄, we find that all 5 sets have a mean of 17, which is a measure of central tendency. But the number 17 does not
fully describe each of the 5 students.

To know which of the sets have values that are more spread out or variable, we use the measures of variability, which
define how the individual items vary about X̄. The measures we shall discuss are the range, standard deviation, and
variance.
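
The point of Table 5.1 can be checked with a short Python sketch. The dictionary name `scores` and the use of "largest minus smallest" as a quick spread check are illustrative choices, not part of the module.

```python
from statistics import mean

# Scores from Table 5.1: one row per student, five quiz scores each.
scores = {
    "A": [15, 15, 17, 18, 20],
    "B": [15, 16, 16, 18, 20],
    "C": [14, 15, 18, 19, 19],
    "D": [14, 15, 16, 19, 21],
    "F": [11, 13, 18, 18, 25],
}

# Every row has the same mean ...
means = {student: mean(row) for student, row in scores.items()}

# ... but the gap between the largest and smallest score differs by row.
spreads = {student: max(row) - min(row) for student, row in scores.items()}

print(means)
print(spreads)
```

All five means come out as 17, while the spreads differ from row to row, which is why the mean alone does not describe the data.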

1. Range (R): A simple yet valuable indicator of the variability of a set of data.

A. R for Ungrouped Data

R = HV - LV, where HV is the highest value and LV is the lowest value in a distribution.

Student   Set 1   Set 2   Set 3   Set 4   Set 5   R
A         15      16      17      18      20      20 - 15 = 5
B         15      20      20      20      20      20 - 15 = 5
C         1       20      20      20      20      20 - 1 = 19
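
As a minimal sketch, the range of the three rows above can be computed in Python. The function name `value_range` is a hypothetical helper introduced only for illustration.

```python
def value_range(values):
    """Return R = HV - LV for a list of raw scores."""
    return max(values) - min(values)

student_a = [15, 16, 17, 18, 20]
student_b = [15, 20, 20, 20, 20]
student_c = [1, 20, 20, 20, 20]

print(value_range(student_a))  # 5
print(value_range(student_b))  # 5
print(value_range(student_c))  # 19
```

Students A and B have the same range even though their scores are distributed very differently, and a single low score gives Student C a range of 19.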

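B. R for Grouped Data

R = UL of the highest CI - LL of the lowest CI, where UL is the upper limit, LL is the lower limit, and CI is a class interval.

For example, if the lowest class interval is 118-126 and the highest is 172-180, then R = 180 - 118 = 62.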
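
The grouped-data range can be sketched the same way. The list `class_intervals` below uses the class limits of the module's grouped frequency table (118-126 through 172-180); the function name `grouped_range` is hypothetical.

```python
# Class intervals as (lower limit, upper limit) pairs.
class_intervals = [
    (118, 126), (127, 135), (136, 144), (145, 153),
    (154, 162), (163, 171), (172, 180),
]

def grouped_range(intervals):
    """Return UL of the highest CI minus LL of the lowest CI."""
    lowest_ll = min(ll for ll, _ in intervals)
    highest_ul = max(ul for _, ul in intervals)
    return highest_ul - lowest_ll

print(grouped_range(class_intervals))  # 62
```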


Although R is not a stable measure, because the presence of an unusually large or small measurement affects it, it has wide application.

Examples include the teacher’s interest in the range of his students’ grades, the businessman’s interest in the “highs” and “lows”
of his stock, and the agriculturist’s interest in the climatic extremes (temperature) of a given geographic region.

R is not a stable measure of variability because its value can fluctuate greatly with a change in just a single score, either the
highest or the lowest. Although R is the easiest to compute and to understand, it is also the least satisfactory because its
value depends only on the 2 extremes and does not consider the scatter of the values in between.

2. Standard Deviation (s): An important statistic that measures the variation, or average deviation, of the scores (data) from
the mean (X̄).
Because it is computed from the deviation of every score, it is affected by all the individual observations in the set.

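A. SD for Ungrouped Data (raw data)

Formula for the population SD (computational form):

$$\sigma = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N}}$$

X        X²
15       225
15       225
17       289
18       324
20       400
Σ = 85   Σ = 1463

$$\sigma = \sqrt{\frac{1463 - \frac{(85)^2}{5}}{5}} = \sqrt{3.6} = 1.9$$

The same result follows from the deviation form:

$$\sigma = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}, \qquad \bar{X} = \frac{\sum X}{N} = \frac{85}{5} = 17$$

X        X - X̄    (X - X̄)²
15       -2       4
15       -2       4
17       0        0
18       1        1
20       3        9
Σ = 85            Σ = 18

$$\sigma = \sqrt{\frac{18}{5}} = \sqrt{3.6} = 1.9$$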
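
Both forms of the population SD can be checked with a small Python sketch. The function names are hypothetical; the data are the five scores used in the worked example.

```python
import math

def pop_sd_computational(xs):
    """Population SD from the sums of X and X squared."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / n)

def pop_sd_deviation(xs):
    """Population SD from the squared deviations about the mean."""
    n = len(xs)
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

scores = [15, 15, 17, 18, 20]
print(round(pop_sd_computational(scores), 1))  # 1.9
print(round(pop_sd_deviation(scores), 1))      # 1.9
```

The two functions agree because the two formulas are algebraically equivalent; they differ only in how rounding of the mean can affect a hand computation.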
Modifying the formula for the sample standard deviation

If X̄ is a decimal number that has been rounded off, we accumulate a large error. To avoid this, the first formula must be used.

When the data are taken from a sample, use this formula for the SD:

$$s = \sqrt{\frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}}$$

$$s = \sqrt{\frac{1463 - \frac{(85)^2}{5}}{5 - 1}} = \sqrt{4.5} = 2.12$$

With this formula, we are actually correcting the sample SD for bias.
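
A minimal Python sketch of the sample formula, compared against the standard library's `statistics.stdev`, which also divides by n - 1. The function name `sample_sd` is hypothetical.

```python
import math
from statistics import stdev

def sample_sd(xs):
    """Sample SD: same numerator as the population form, divided by n - 1."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

scores = [15, 15, 17, 18, 20]
print(round(sample_sd(scores), 2))  # 2.12
print(round(stdev(scores), 2))      # 2.12
```

Dividing by n - 1 instead of n makes the sample value (2.12) slightly larger than the population value (1.9) for the same five scores.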

B. SD for Grouped Data. The computation of s is essentially the same as that for ungrouped data; however, we use M (the class
mark or midpoint) of each class interval (CI) instead of X. Another point of difference is the use of the frequency as a factor
in the formula.

When the data are taken from the population:

$$\sigma = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{N}}{N}}$$

f = the frequency for every CI
M = the midpoint for every CI
N = the total number of cases or scores

When the data are taken from a sample:

$$s = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1}}$$
Class      f     M     fM     M²      fM²
118-126    3     122   366    14884   44652
127-135    5     131   655    17161   85805
136-144    9     140   1260   19600   176400
145-153    12    149   1788   22201   266412
154-162    5     158   790    24964   124820
163-171    4     167   668    27889   111556
172-180    2     176   352    30976   61952
Total      40          5879           871597
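Assuming the data are taken from a population:

$$\sigma = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{N}}{N}} = \sqrt{\frac{871597 - \frac{(5879)^2}{40}}{40}} = \sqrt{188.27} = 13.7$$

Assuming the data are taken from a sample:

$$s = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1}} = \sqrt{\frac{871597 - \frac{(5879)^2}{40}}{40 - 1}} = \sqrt{193.1} = 13.9$$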

3. Variance (s²): A mathematical expression of the degree of variation of the scores (data) from the mean.
• A large variance means that the individual scores (data) of the sample deviate a lot from X̄, while a small
variance indicates that the scores (data) deviate little from X̄.
The variance is simply the square of s. It serves the same purpose as s, except that it is on a different
scale of values (squared units). The variance is used when the data are in fractions.

The variance for the data in Table 5.3 is 3.6 and the variance for the data in Table 5.5 is 193.1, both calculated by taking the
square of the standard deviation. Because the variance is expressed in squared units, the typical spread about the mean is read
from the standard deviation itself: the measurements vary by about +/- 1.9 and +/- 13.9 from the mean, respectively.
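
As a quick check of the relationship between variance and standard deviation, here is a sketch using Python's standard `statistics` module on the five ungrouped scores. The variable names are illustrative.

```python
from statistics import pstdev, pvariance, stdev, variance

scores = [15, 15, 17, 18, 20]

pop_var = pvariance(scores)   # population variance (divide by N)
samp_var = variance(scores)   # sample variance (divide by n - 1)

print(round(pop_var, 1))   # 3.6
print(round(samp_var, 1))  # 4.5

# In both cases the variance is the square of the matching SD.
print(round(pstdev(scores) ** 2, 1))  # 3.6
print(round(stdev(scores) ** 2, 1))   # 4.5
```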

4. COEFFICIENT OF VARIATION (CV)


If we wish to compare the variability between different sets of scores or data, the CV is a very useful measure for interval scale
data. The formula is

$$CV = \frac{S}{\bar{X}}$$

where S = the standard deviation and X̄ = the mean.

Example: At ASCOT, a faculty member wishes to compare the variation in the college entrance test scores of urban students with
that of ASCOT students. It is known that the urban students’ mean score is 384, with a standard deviation of 101, while among
ASCOT students the mean is 174, with a standard deviation of 53. Which group shows more variation in scores?

$$CV = \frac{S}{\bar{X}} = \frac{101}{384} = 0.26 \text{ for urban students}$$

$$CV = \frac{S}{\bar{X}} = \frac{53}{174} = 0.30 \text{ for ASCOT students}$$
If we use the SD alone to measure variability, the scores of the urban students appear more variable than those of ASCOT
students, because the urban students’ SD of 101 is considerably bigger than the ASCOT students’ SD of 53.

However, after computing the CV, ASCOT students’ scores are found to be more variable than the urban students’
scores, with CVs of 0.30 and 0.26 respectively.
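
The comparison above can be reproduced with a short Python sketch. The function name `coefficient_of_variation` is hypothetical, and the guard against a zero mean is an added assumption, since the ratio is undefined there.

```python
def coefficient_of_variation(sd, mean):
    """Return CV = S / X-bar; undefined when the mean is zero."""
    if mean == 0:
        raise ValueError("CV is undefined for a mean of zero")
    return sd / mean

urban_cv = coefficient_of_variation(101, 384)
ascot_cv = coefficient_of_variation(53, 174)

print(f"{urban_cv:.2f}")  # 0.26
print(f"{ascot_cv:.2f}")  # 0.30
print(ascot_cv > urban_cv)  # True
```

Relative to their own means, the ASCOT scores are the more variable of the two groups, even though their SD is smaller in absolute terms.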

PRE TEST AND POST TEST

1. Which of the following statistics is a measure of dispersion?

a. Mean b. Median c. Mode d. Std deviation

2. If the range of a set of scores is 14 and the lowest score is 7, what is the highest score?
a. 21 b. 24 c. 14 d. 7

3. Which of the following is not a valid dispersion amount?


a) 0 b) 10 c) 1 d) 100 e) -1
4. When an attribute is extremely varied in the group under investigation, the group is said to be
heterogeneous.
a. True b. false

5. The most important measure of variability is ___________


a. range b. inter quartile range c. mean deviation d. standard deviation

6. The standard deviation is


a. Based on squared deviations from the mean b. In the same units as the mean
c. Uses all the observations in its calculation d. AOTA.

7. The variance is
a. Found by dividing N by the mean b. In the same units as the original data
c. Found by squaring the std deviation. d. The square root of the std deviation

8. A disadvantage of the range is


a. Only two values are used in its calculation b. It is in different units than the mean
c. It does not exist for some data sets d. All values are used in calculation

9. When the variances of the population distribution and the sampling distribution are compared, the
a. Variances are equal.
b. Population variance is smaller than the sampling distribution variance.
c. Population variance is larger than the sampling distribution variance.
d. Variances have the same degrees of freedom.

10. Why is variance not too useful?


a. It is difficult to calculate b. Squared units are difficult to interpret.
c. It is a large number d. It gives us very little idea about the deviation

11. _____ is the square root of the sum of square deviations of various values from their arithmetic mean
divided by the sample size minus one
a. Sample SD b. Mean absolute deviation c. Coefficient of variation d. Population SD

12. Examples of applications of the range in the real world include


a. weather forecasts b. quality control c. fluctuation in share prices d. AOTA

13. In a distribution of 10, 20, 30, 40, 50, the mean (X̄) is 30. The sum of deviations from the mean will be
a) 0 b) 60 c) 30 d)15

14. If the mean (X̄) is 4 and the distribution is 2, 3, 4, 5, 6, the sum of squared deviations from the mean will be
a) 0 b) 8 c) 6 d) 12 e) 10

15. A factory makes a component to a certain dimension as per the specification. However, the
specifications allow the dimensions to lie between +2% and -2% of the actual dimension. What measure
of dispersion is this allowance?
a. variance b. Range c. Standard deviation d. Difference between mode and median

16. A bank has a list of branches and the total outstanding credit at each branch in Pesos. Which of the following
measures would be in the same unit (Pesos)?
a. Variance b. Standard median c. Standard deviation d. Coefficient of deviation e. AOTA

17. The extent or the degree to which data tend to spread around ________ is called the dispersion or
variation of data.
a. average b. quartiles c. harmonic mean d. geometric mean e. standard deviation

18. The standard deviation (SD) is most commonly used to get a sense of how far the typical score of a
distribution differs from the mean. In computing the SD, why is it necessary to square the deviations
from the mean for each score?
a. The deviations are too small to have a variance without being squared.
b. There is no variability in the deviations of the scores prior to squaring.
c. The mean of the deviations balances out to zero due to negative and positive values.
d. Squaring numbers is fun.

19. ____________ is used to compare the variation or dispersion in two or more sets of data even
though they are measured in different units.
a. Range b. Standard Deviation c. Coefficient of Variation d. Mean Deviation

20. If the mean is 25 and the standard deviation is 5 then C.V (Coefficient of variation) is
a) 100% b) 25% c) 20% d) None of these

21. What is the significance of the dispersion of a data set in the decision making process?
a. Low dispersion means that all data is tightly centered on the mean, thus decisions can be made
reliably with little risk.
b. High dispersion means that all data is tightly centered on the mean, thus decisions can be made
reliably with little risk.
c. The dispersion of a data set is a statistical means of contemplating how each datum relates to each
other datum in a set.
d. Dispersion is an unreliable source for analyzing data and should not be used to make decisions.

22. Outliers have the greatest effect on the:


a. Mean. b. Median. c. Percentile. d. Mode.

23 – 26. To assure a uniform product, a company measures each extension wire as it comes off the
product line. The lengths in centimeters of the first batch of ten wires were: 10, 15, 14, 11, 13, 10, 10,
11, 12 and 13. Find the range, variance, standard deviation and coefficient of variation

27-30. Find the range, mean, variance, standard deviation and coefficient of variation of the given data.

x F
100-110 2
111 – 121 1
122 – 132 5
133 – 143 12
144 – 154 9
155 – 165 1
