COMMUNICATION RESEARCH (COMM 18)

LESSON 4: FINALS
PART IV: DATA ANALYSIS

Introduction to Statistics

DESCRIPTIVE STATISTICS

 Intended to reduce data sets to allow for easier interpretation.


 Data analysis would be easier if the data were organized in some fashion.
 Two primary methods: data distribution and summary statistics

DATA DISTRIBUTION

 Distributing data in tables or graphs. A distribution is simply a collection of numbers.


 A preliminary step in making the data more manageable is to arrange them in a frequency
distribution, that is, a table of each score, ordered according to magnitude, together with
its actual frequency of occurrence.

TABLE 1 Distribution of Responses to “How many hours did you spend last week
listening to the radio and watching TV?”

Respondent   Hours     Respondent   Hours

A            12        K            14
B            9         L            16
C            18        M            23
D            8         N            25
E            19        O            11
F            21        P            14
G            15        Q            12
H            8         R            19
I            11        S            21
J            6         T            11

TABLE 2 Frequency Distribution of Responses to “How many hours did you spend
last week listening to the radio and watching TV?”

Hours Frequency (N=20)

6 1
8 2
9 1
11 3
12 2
14 2
15 1
16 1
18 1
19 2
21 2
23 1
25 1

 The column on the left contains all the values of the variable under study; the column on
the right shows the number of occurrences of each value. The sum of the frequency
column is the number (N) of persons or items that make up the distribution.
 A frequency distribution can also be constructed using grouped intervals, each of which
contains several score levels.
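The step from Table 1 to Table 2 can be sketched in a few lines of Python. This is only an illustration: the list `hours` is copied from Table 1, and the function name `frequency_distribution` is an assumption made for this example, not part of the lesson.

```python
from collections import Counter

# Hours reported by respondents A through T, copied from Table 1.
hours = [12, 9, 18, 8, 19, 21, 15, 8, 11, 6,
         14, 16, 23, 25, 11, 14, 12, 19, 21, 11]


def frequency_distribution(scores):
    """Return (score, frequency) pairs ordered by magnitude."""
    counts = Counter(scores)
    return sorted(counts.items())


table = frequency_distribution(hours)
for score, freq in table:
    print(f"{score:>5} {freq:>9}")

# The sum of the frequency column is N, the number of respondents.
n = sum(freq for _, freq in table)
print("N =", n)
```

The left value of each pair corresponds to the Hours column and the right value to the Frequency column.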

TABLE 3 Frequency Distribution of Radio and TV Listening and Viewing Hours Grouped
in Intervals

Hours Frequency
0-10 4
11-15 8
16-20 4
21-25 4
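Grouping the same scores into intervals might look like the sketch below. The interval bounds are taken from Table 3 and treated as inclusive; `grouped_frequencies` is a hypothetical helper name used only for this example.

```python
# Hours reported by respondents A through T, copied from Table 1.
hours = [12, 9, 18, 8, 19, 21, 15, 8, 11, 6,
         14, 16, 23, 25, 11, 14, 12, 19, 21, 11]

# Inclusive (low, high) bounds of each interval, as in Table 3.
intervals = [(0, 10), (11, 15), (16, 20), (21, 25)]


def grouped_frequencies(scores, bounds):
    """Count how many scores fall inside each inclusive interval."""
    return [(f"{low}-{high}", sum(low <= s <= high for s in scores))
            for low, high in bounds]


grouped = grouped_frequencies(hours, intervals)
for label, freq in grouped:
    print(f"{label:>6} {freq:>5}")
```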

 Data can be transformed into proportions or percentages. To obtain the proportion of a
response, divide the frequency of that response by N, the total number of responses in the
distribution; multiply the proportion by 100 to obtain the percentage. Percentages allow
comparisons to be made between different frequency distributions that are based on
different values of N.
 Some frequency distributions include a cumulative frequency (cf) column. This column is
constructed by adding the number of scores in one interval to the number of scores in the
intervals above it in the table.

TABLE 4 Frequency Distribution with Added Columns for Percentage, Cumulative
Frequency (cf), and Cumulative Frequency as Percentage of N

Hours   Frequency (N=20)   Percentage   cf    cf as percentage of N

6       1                  5            1     5
8       2                  10           3     15
9       1                  5            4     20
11      3                  15           7     35
12      2                  10           9     45
14      2                  10           11    55
15      1                  5            12    60
16      1                  5            13    65
18      1                  5            14    70
19      2                  10           16    80
21      2                  10           18    90
23      1                  5            19    95
25      1                  5            20    100
        N = 20             100%
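The added columns can be computed as in the following sketch. The dictionary `frequencies` is copied from Table 2, and `with_percentages` is a hypothetical helper name; the cumulative frequency is built as a running total down the table.

```python
from itertools import accumulate

# Score -> frequency, copied from Table 2.
frequencies = {6: 1, 8: 2, 9: 1, 11: 3, 12: 2, 14: 2, 15: 1,
               16: 1, 18: 1, 19: 2, 21: 2, 23: 1, 25: 1}


def with_percentages(freqs):
    """Return rows of (score, f, percentage, cf, cf as percentage of N)."""
    n = sum(freqs.values())
    scores = sorted(freqs)
    # Running total of the frequencies, from the lowest score upward.
    cumulative = accumulate(freqs[s] for s in scores)
    rows = []
    for score, cf in zip(scores, cumulative):
        f = freqs[score]
        rows.append((score, f, 100 * f / n, cf, 100 * cf / n))
    return rows


rows = with_percentages(frequencies)
for score, f, pct, cf, cf_pct in rows:
    print(f"{score:>5} {f:>4} {pct:>6.0f} {cf:>4} {cf_pct:>6.0f}")
```

The last row always reaches a cumulative frequency of N and a cumulative percentage of 100.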

 Sometimes it is desirable to present data in graph form. Graphs usually consist of two
perpendicular lines, the x-axis, or abscissa (horizontal), and the y-axis, or ordinate
(vertical).
 One common convention is to list the scores along the x-axis and the frequency or
relative frequency along the y-axis. Thus, the height of a line or bar indicates the
frequency of a score.
 A histogram, or bar chart, is a graph in which frequencies are represented by vertical bars.
 If a line is drawn from the midpoint of each interval at its peak along the y-axis to each
adjacent midpoint/peak, the resulting graph is called a frequency polygon.
 A frequency curve is similar to a frequency polygon except that points are connected by a
continuous, unbroken curve instead of by lines. Such a curve assumes that any
irregularities shown in a frequency polygon are simply due to chance and that the
variable being studied is distributed continuously over the population.
 The normal curve is a symmetrical, bell-shaped curve.
 Skewness refers to the concentration of scores around a particular point on the x-axis. If
the concentration lies toward the low end of the scale, with the tail of the curve trailing
off to the right, the curve is right-skewed. Conversely, if the tail of the curve trails off
to the left, it is left-skewed. If the halves of the curve are identical, it is symmetrical, as
the normal curve is.
 A normal distribution of data is free from skewness.
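As a rough illustration of a histogram, the grouped frequencies from Table 3 can be drawn as text bars. This is a sketch only: `text_histogram` is a hypothetical helper, and the bars here run horizontally (one row per interval) rather than vertically, simply because that is easier to print.

```python
# Interval label -> frequency, copied from Table 3.
grouped = [("0-10", 4), ("11-15", 8), ("16-20", 4), ("21-25", 4)]


def text_histogram(rows, symbol="#"):
    """Return one line per interval; the bar length equals the frequency."""
    width = max(len(label) for label, _ in rows)
    return [f"{label:>{width}} | {symbol * freq} ({freq})"
            for label, freq in rows]


lines = text_histogram(grouped)
for line in lines:
    print(line)
```

The longest bar marks the interval where the scores are most concentrated.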

SUMMARY STATISTICS

 Two basic tendencies of distributions: central tendency and dispersion, or variability.


 Central tendency statistics answer the question, what is a typical score? They provide
information about the grouping of the numbers in a distribution by calculating a single
number that is characteristic of the entire distribution.

Mode (Mo)

 The score or scores occurring most frequently. Calculation is not necessary to determine
the mode.
 Disadvantage: the mode focuses attention on only one possible score and can thus
camouflage important facts about the data when considered in isolation. A distribution of
scores can also have more than one mode.
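A short sketch of finding the mode with Python's standard library follows. The list `hours` is copied from Table 1; the second list, `two_modes_sample`, is made up for this example to show a distribution with more than one mode.

```python
from statistics import multimode

# Hours reported by respondents A through T, copied from Table 1.
hours = [12, 9, 18, 8, 19, 21, 15, 8, 11, 6,
         14, 16, 23, 25, 11, 14, 12, 19, 21, 11]

# multimode returns every score that shares the highest frequency.
modes = multimode(hours)
print("Mode(s) of Table 1:", modes)

# Hypothetical scores, invented here to illustrate two modes.
two_modes_sample = [2, 2, 5, 7, 7]
bimodal = multimode(two_modes_sample)
print("Mode(s) of the made-up sample:", bimodal)
```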

Median (Mdn)

 Midpoint of a distribution: half the scores lie above it and half lie below it.
 It is the middle score; if there is an even number of scores, the median is a hypothetical
score halfway between the two middle scores.
 Arrange the scores from smallest to largest and locate the midpoint by inspection.

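With an odd number of scores (N = 9), the median is the middle (fifth) score:

0 2 2 5 6 17 18 19 67

Mdn = 6

With an even number of scores (N = 10), the median lies halfway between the two middle
scores, 6 and 17:

0 2 2 5 6 17 18 19 67 75

$$\text{Mdn} = \frac{6 + 17}{2} = 11.5$$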
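The two median examples can be checked with a small sketch. The function name `median_by_inspection` is an assumption for this example; it follows the procedure described in these notes (sort, then take the middle score or the halfway point between the two middle scores).

```python
from statistics import median

# The two sets of scores used in the median examples.
odd_scores = [0, 2, 2, 5, 6, 17, 18, 19, 67]
even_scores = [0, 2, 2, 5, 6, 17, 18, 19, 67, 75]


def median_by_inspection(scores):
    """Sort the scores and return the midpoint of the distribution."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the middle score itself.
        return ordered[mid]
    # Even N: halfway between the two middle scores.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_inspection(odd_scores))
print(median_by_inspection(even_scores))

# The standard library gives the same values.
print(median(odd_scores), median(even_scores))
```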

Mean ($\bar{X}$)

 The most familiar summary statistic; it represents the average of a set of scores.
 Sum of all scores divided by N or the total number of scores.

$X$ = any score in a series of scores
$\bar{X}$ = the mean (read “X-bar”; M is also commonly used to denote the mean)
$\sum$ = the sum (symbol is the Greek capital letter sigma)
$N$ = the total number of scores in a distribution

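Formula:

$$\bar{X} = \frac{\sum X}{N}$$

If the data are given as a frequency distribution:

$$\bar{X} = \frac{\sum fX}{N}$$

 The first equation indicates that the mean is the sum of all scores ($\sum X$) divided by the
number of scores ($N$). In the second, each score $X$ is first multiplied by its frequency $f$.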

TABLE 5 Calculation of Mean from Frequency Distribution

Hours Frequency fX
6 1 6
8 2 16
9 1 9
11 3 33
12 2 24
14 2 28
15 1 15
16 1 16
18 1 18
19 2 38
21 2 42
23 1 23
25 1 25
N = 20   $\sum fX$ = 293

$$\bar{X} = \frac{\sum fX}{N} = \frac{293}{20} = 14.65$$
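The calculation in Table 5 can be reproduced with the sketch below. The dictionary `frequencies` is copied from the table, and `mean_from_frequencies` is a hypothetical helper name used only here.

```python
# Score -> frequency, copied from Table 5.
frequencies = {6: 1, 8: 2, 9: 1, 11: 3, 12: 2, 14: 2, 15: 1,
               16: 1, 18: 1, 19: 2, 21: 2, 23: 1, 25: 1}


def mean_from_frequencies(freqs):
    """Return (sum of f*X, N, mean) for a score -> frequency mapping."""
    n = sum(freqs.values())
    total = sum(score * f for score, f in freqs.items())
    return total, n, total / n


sum_fx, n, mean_hours = mean_from_frequencies(frequencies)
print(f"sum fX = {sum_fx}, N = {n}, mean = {mean_hours:.2f}")
```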

TABLE 6 Calculation of Variance


X = score

$X$    $\bar{X}$    $X - \bar{X}$    $(X - \bar{X})^2$
6      14.65        -8.65            74.8
8      14.65        -6.65            44.2
8      14.65        -6.65            44.2
9      14.65        -5.65            31.9
11     14.65        -3.65            13.3
11     14.65        -3.65            13.3
11     14.65        -3.65            13.3
12     14.65        -2.65            7.0
12     14.65        -2.65            7.0
14     14.65        -0.65            0.4
14     14.65        -0.65            0.4
15     14.65        0.35             0.1
16     14.65        1.35             1.8
18     14.65        3.35             11.2
19     14.65        4.35             18.9
19     14.65        4.35             18.9
21     14.65        6.35             40.3
21     14.65        6.35             40.3
23     14.65        8.35             69.7
25     14.65        10.35            107.1

$\sum (X - \bar{X})^2$ = 558

$$S^2 = \frac{\sum (X - \bar{X})^2}{N - 1} = \frac{558}{19} = 29.4$$

Variance formula:

$$S^2 = \frac{\sum (X - \bar{X})^2}{N - 1}$$

Standard Deviation:

$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{N - 1}}$$
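The variance and standard deviation can be computed step by step as in this sketch. The list `hours` is copied from Table 1, and `sample_variance` is a hypothetical helper name; it divides by N - 1, which is also what `statistics.variance` in the standard library does.

```python
import math
from statistics import variance

# Hours reported by respondents A through T, copied from Table 1.
hours = [12, 9, 18, 8, 19, 21, 15, 8, 11, 6,
         14, 16, 23, 25, 11, 14, 12, 19, 21, 11]


def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(scores)
    mean = sum(scores) / n
    squared_deviations = [(x - mean) ** 2 for x in scores]
    return sum(squared_deviations) / (n - 1)


s2 = sample_variance(hours)
s = math.sqrt(s2)  # the standard deviation is the square root of the variance
print(f"S^2 = {s2:.1f}, S = {s:.2f}")

# Cross-check against the standard library.
print(f"statistics.variance = {variance(hours):.1f}")
```

Without rounding each squared deviation first, the sum of squared deviations is 558.55 rather than 558; both give a variance of 29.4 when rounded to one decimal place.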
