Professional Documents
Culture Documents
Pos 311
Pos 311
Pos 311
POS 311
Statistical Methods in Political Science
By
Published by
Distance Learning Centre
University of Ibadan
ii
© Distance Learning Centre
University of Ibadan
Ibadan.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, electronic, mechanical,
photocopying, recording, or otherwise, without permission of the copyright owner.
ISBN 978-021-190-X
Printed by
iii
Table of Content
Page
Vice-Chancellor’s Message……………………………………… vi
Foreword……………………………………………………………..vii
General Introduction…………………………………………………viii
Lecture One: Computation Skill Reviewed…………………………1
Lecture Two: Definition and Scope of Statistics……………………10
Lecture Three: Statistical Enquiries of Political Phenomenon……….19
Lecture Four: Data Representation………………………………….26
Lecture Five: Descriptive Statistics I……………………………….38
Lecture Six: Descriptive Statistics II………………………………51
Lecture Seven: Introduction to Set Theory…………………………..59
Lecture Eight: Probability and Random Variables………………..…65
Lecture Nine: Special Probability Distributions..………………...…74
Lecture Ten: Statistical Estimation…………………………………86
Lecture Eleven: Hypothesis Testing in Political Science Research….95
Lecture Twelve: Special Test Statistic(s)……………………………100
Lecture Thirteen: Correlation Analysis……………………………...110
Lecture Fourteen: Regression Analysis……………………………..118
Lecture Fifteen: Use of Computer in Data Analysis………………....125
iv
General Introduction
The importance of statistical analysis in social science research cannot be
over-emphasized. That tool called “STATISTICS” is fast becoming the
language of communication in the behavioural sciences. Perhaps this
argument is a direct offshoot of a thought expressed by a scholar over a
century ago when he wrote:
Overall Objectives
After working (not reading) through this course material, you will be able
to:
*
Quoted from Kitchens, L.J. 1998 (Emphasis added).
v
1. Understand the basic concepts of statistics as a tool in Social Science
research.
2. Compute various statistical measures.
3. Use these measures in comparative studies.
4. Prepare and execute a survey research in Political Science.
5. Use the computer to analyse any large volume of data.
6. Interpret statistical reports more competently.
vi
LECTURE ONE
Introduction
This lecture is a general review of some basic Mathematical skills that you
may require in order to work successfully through this course.
Mathematics is the language of science hence you need to brush up some
computation skills. It is not intended to be a rigorous exercise, rather it is
meant to remind you of arithmetic operations you might have learnt in
elementary school. This is a good appetizer for the real statistical meal.
Objectives
Our core objective is that after studying this lecture, you will be able to
perform simple arithmetic operations such as:
1. Addition of numbers.
2. Subtraction of numbers.
3. Multiplication and Division of numbers.
4. Recognize mathematical/statistical notations.
1
4. What is the product of 468 and 25?
(a) 1170 (b) 11700 (c) 117000 (d) 117
CONTENT
2
5. Multiplication is denoted by x
When placed between two numbers, one of the numbers must be
multiplied by the other.
e.g. 3 x 4 = 12
Another word for multiplication is product.
or 12
=3
4
3
12. Bracket sign denoted by ( )
It may be used to separate single arithmetic operations to be
performed.
e.g. (2 x 3) + (10 ÷ 5)
Or when a figure is placed behind it and there is another figure inside
it, then it means multiplication e.g. 2 (4) = 2 x 4 = 8
.
13. Therefore sign denoted by . .
Used to show logical continuation of a step from the previous one.
e.g. If 2x = 10
then
10
2x =
2
∴x = 5
14. For example sign is denoted by e.g.
It means for instance or as an illustration.
e.g.
25 = 5 x5 = 5
4
Section 2: Some Arithmetic Operations
In this section, we intend to put into practical use, some of this lecture.
This is a practical exercise hence you should provide yourself with a pen,
a calculator and several working paper. We shall adopt the question and
answer approach here. Read the question first, attempt to solve it before
you look at the solution provided.
1. 4325
+ 378
467
2. 2 8 5
x 7 9
4. Calculate 196
5. Appropriate the following figures to the nearest whole number:
(a) 17.239
(b) 12.549
(c) 78, 270, 659
Having solved the problems, look at the solutions provided below and
grade yourself.
5
Solutions
1. To add these three figures, you must start from the extreme right and
work through to the left. You must add across column. If after adding
a column you get a 2 digit figure such as (20), write the last digit (0)
below that column and add the first (2) to the next column until the
operation is completed.
4 3 2 2
+ 3 7 8
4 6 7
5 1 6 7
2. You should also start this multiplication operation from the extreme
right. Use 9 to multiply 285, write the value down, then use 7 to
multiply 285, write the value below it and then add the two new values
together. In fact what you are doing can be written in this form (9 x
285) + (7 x 285).
However, we shall adopt the first principle method.
Thus:
2 8 5
x 7 9
2 5 6 5
+ 1 9 9 5
2 2 5 1 5
6
That order must be followed faithfully in any arithmetic operation that
combines two or more operations.
Thus: 60
x720
100
Cancelling out the zeros. Two of them below strikes out another two
above.
6 72
x
1 1
= 7 2
x 6
432
i.e. 196
You are permitted to use your calculator here. Put the power on, press
196, then look for the button which has the inscription X .
Press the button and the answer you get is the solution.
196 = 14
7
5. In approximating a figure to the nearest whole number, all digits to the
right of the decimal point should be ignored. The only exemption is
where the digit just after the decimal point is 5 or bigger than 5. In
that case, we approximate it to 1 and add it to the digit just before the
decimal point.
(a) 17.239 = 17 to the nearest whole number.
(b) 12.549 = 13 to the nearest whole number.
(c) 78,270,659 = 78,000,000 to the nearest whole number.
8
Summary
References
Any recommended Mathematics text at the Senior Secondary
Level.
9
LECTURE TWO
Introduction
This lecture will examine the definitions of Statistics, its scope and
limitations, and its importance to political enquiry. We will also explore
the branches of Statistics and their inter-relationship, and then go ahead to
define and explain some basic concepts in Statistics.
Objectives
After this lecture, you should be able to:
1. Define Statistics as a tool in Social Science studies.
2. Identify the uses and limitations of Statistics.
3. Define some basic concepts that are relevant to this course.
Pre-Test
Answer the following questions before you proceed to study the lecture
1. Statistics is concerned with:
(a) calculation of mean, median and mode.
(b) computation of regression and standard deviation.
(c) collection, organization, presentation and analysis of data.
(d) none of the above.
10
3. The two main branches of statistics are:
(a) Correlation and Regression analysis.
(b) Description and analysis of Variance.
(c) Inferential and Inductive statistics.
(d) Descriptive and inferential statistics.
CONTENT
Section 1: Definition
Statistics is concerned with scientific methods for collecting, organising,
summarizing, presenting and analysing data, as well as drawing valid
conclusions and making reasonable decisions on the basis of such analysis
(Spiegel 1999). It can also be defined as the scientific study of handling
quantitative information. It embodies a methodology of collection,
classification, description and interpretation of data obtained through the
conduct of surveys and experiments (Aggarwal, Y.P. 1998, P1).
It is an analytical tool used in solving problems and in decision making
in areas such as Natural and Biological Sciences, Engineering, Economics,
Management, Marketing, Political Analysis and other problem-oriented
disciplines.
11
Branches of Statistics
There are two main branches of Statistics, they are:
1. Descriptive Statistics
2. Inferential Statistics
Descriptive Statistics utilises numerical and graphical methods (such
as Bar chart, Pie chart, Histogram etc.) to look for patterns, summarize and
present the information in a set of data. Descriptive or Deductive
Statistics can be described as the phase of Statistics which seeks only to
describe and analyse a given group without drawing any conclusion or
inferences about a larger group. Inferential Statistics on the other hand, is
concerned with studying a small portion of a group called a sample and
making valid conclusion about the larger population based on the sample
characteristics. The phase of Statistics which deals with conditions under
which inferences is valid is called Inductive or Inferential Statistics.
Scope of Statistics
Gupta C.B. (1982) in his book “An Introduction to Statistical Methods”
argued that statistical methods can be usefully employed in understanding
and interpreting any phenomenon which satisfies the following two
conditions:
1. It is capable of being quantified (i.e. expressed in the form of
figures) either through a process of counting or measurement.
12
The list is exhaustive. The student should think of more examples of
population and add them to the three above. A population can be infinite.
A finite population is that whose figure is ascertainable (i.e. a population
that is numerically countable). While an infinite population is that whose
figure cannot be ascertained (not countable). Examples of a finite
population are the three listed above, while example of an infinite
population is the population of all animals in the forest.
Variables
In studying a population, we focus on one or more characteristics or
properties of the units in the population. For example we may be
interested in the age, sex and occupation of a population of registered
voters in the 1999 Presidential Election in Nigeria. We call such
characteristics variables. The opposite of a variable is the term constant
which signifies that members of the unit or group have same magnitude of
the characteristic of interest e.g. the number of hours in a day is constant,
it is always 24 hours.
Sample
A sample is a representative subset of the population. It is a small part of
the whole population selected for the purpose of a study. Whatever
deductions made out of the sample can be used to generalize for the entire
population provided the sampling technique used in not biased. For
instance, in order to study the voting pattern of POS 311 Students
(consisting of about 3,000 students), I may carefully select a sample of
500 students, study their voting pattern and then use my findings to
generalize for the entire population of POS 311 Students.
One important question may cross your mind: why do we need to select
only 500 students, why not study the entire 3,000 students?
Well, it is a good question but read the next section to get the answer.
13
4. Where members of the population can easily be destroyed while
studying them, samples are more appropriate.
Sampling Techniques
The method of selecting a sample is called sampling procedure or
sampling technique.
14
Systematic Sampling: This is concerned with the selection of every nth
item from the list of the population. For instance, if we wish to select 40
people out of a population of 100 people, we first number members of the
population serially, we may start from the first person and thereafter select
every 5th number progressively until the desired sample size is achieved.
Cluster Sampling: This method can be employed where the population is
heterogeneous. The entire population is subdivided into clusters and a
suitable technique is used to select samples from each of the clusters.
Statistical Data
Statistical data refers to numerical descriptions of quantitative aspects of
things. This description may take the form of counts or measurements.
For instance, statistical data on students may include:
• Number of students taking a particular course.
• Number of male or female students.
• Number of single or married students.
• Other variables like the height, weight, income, age or political
affiliation of students etc. Think of more examples.
15
Secondary Data: This is the name given to a data that is being used for
some purposes other than that which it was originally collected. e.g. data
collected from Federal Office of Statistics (F.O.S.), World Bank,
UNESCO, etc, and used by a researcher in the University of Ibadan.
Time Series Data: Are sets of observations taken sequentially in time,
usually at equal intervals e.g. Bank deposits are often arranged in
accordance with the time the deposits were made.
Cross Sectional Data: These are data recorded or collected at a specific
time, usually for a day, a week or a year.
16
Note that a questionnaire may be self administered or administered
by post.
3. Direct Observation: In this method of data collection, the
researcher observes the characteristic of interest directly from the
population or sample. The population may consist of a set of
inanimate objects, or a laboratory experiment or when the desired
variable can be measured without soliciting response from the
respondent.
Importance of Statistics
“The fundamental gospel of Statistics is to push back the domain of
ignorance, prejudice, rule of thumb, arbitrary or premature decisions,
traditions and dogmatism, and to increase the domain in which decisions
are made and principles are formulated on the basis of analysed
quantitative facts”.
Robert W. Bugess*
Adapted from Gupta C.B (1982).
Summary
Statistics as a method is a tool used in collecting, presenting and analysing
data as well as the descriptive and inferential Statistics. Samples are
drawn in order to minimize cost and to save time. Data can be obtained
through direct interview, administering questionnaire and direct
observation. The use of Statistics makes Social Science research more
appealing and research outcomes are more clearly presented.
17
Post-Test
Refer back to the 5 objectives pre-test questions and try to provide
answers to them. By now you have acquired the requisite knowledge to
answer those questions and supplementary ones that are provided below:
Additional Questions on Lecture One:
1. Enumerate the uses of Statistics to the social scientists.
2. Write a short note on the following concepts:
(i) Data
(ii) Sample
(iii) Population
3. Write a short note on the data collection techniques you have learnt
and give advantages and disadvantages of each technique.
References
Gupta, C.B. (1982). An Introduction to Statistical Methods. Delhi:
Vikas Publishing House Pvt Limited.
18
LECTURE THREE
Introduction
This lecture explores the main steps and requirements needed in making
statistical enquiry in Political Science. It begins by identifying five main
steps and then discusses and analyses each of the steps.
Objectives
At the end of this lecture, students should be able to:
1. Prepare and execute a research plan.
2. Appreciate the various levels of measurement and when to use
them.
3. Apply some of the concepts learnt in the first lecture.
Pre-Test
1. Statistical enquiries involve:
(a) Collecting data and keeping them in a bank.
(b) Analysing data and reporting to the director.
(c) Formulating a statement of inquiry, collecting data, analysing it
and interpreting the result.
(d) All of the above.
19
3. The three levels of measurement in order of sophistication are:
(a) Ordinal, Normal and interval level.
(b) Interval, Normal and ordinal level.
(c) Ordinal, interval and Normal level.
(d) Normal, ordinal and interval level.
CONTENT
20
1. Statement of Inquiry
The first and the most essential thing in a political inquiry is the
preparation of a statement of purpose, clearly and carefully stated. The
purpose of the inquiry may be:
(a) To examine the validity of an existing theory or hypothesis.
(b) To discover a new theory or hypothesis.
(c) To investigate the existing state of affairs.
(d) To solve a problem involving the inter-relations of several groups
of facts.
21
(i) Characteristics of Statistical Units
• The unit of measurement must be definite and specific.
• It must be of such a nature that may be correctly ascertained.
• Ensure homogeneity and uniformity.
• It should be stable.
• It should be appropriate for the purpose.
(see Gupta C.B., 1982)
(ii) Classification of Units
• Natural units e.g. a person, a cow, a tree, etc.
• Produced units – objects e.g. a house, a car, a table, a ship, etc.
• Measurational units e.g. ton, Kilogram, meter, the year, etc.
• Pecuniary value units e.g. Naira, Dollar, Pound, Cedi, etc.
Normal Level (scale): This is the simplest and most basic level of
measurement. It was derived from the Latin word for ‘name’. The
numbers assigned to the categories and used as the measure of each case
in that category serve simply as names for those categories and cases e.g.
Operational definitions for gender. Categories are:
1 = Male 2 = Female
22
1 = Low participation 2 = Medium participation 3 = High participation
We know that Medium is better than low but we do not know by
how much. Likewise the distance between High and Medium participation
is not ascertainable.
Ratio Scale: This is similar to the interval scale in all respect but in
addition it has an absolute zero point e.g. height, weight, number of books
on a shelf, etc.
23
Summary
Post-Test
Same as the Pre-Test. Go back to the objective question and provide
answers to them. Having completed that task, answer the following essay
questions:
24
References
Gupta, C.B. (1982). An Introduction to Statistical Methods. Delhi:
Vikas Publishing House Pvt Limited.
25
LECTURE FOUR
Data Representation
Introduction
In the previous lessons, we have discussed the definition of basic
concepts. We have also discussed how to plan our research work, and
how to collect our data. In this lecture, we shall examine how best to
organise and present the data. The first section of the lecture deals with
the use of tables to present data while the second section examines the use
of diagrams for the same purpose.
Objectives
At the end of this lecture, students should be able to:
1. Manage and organise raw data by tabulating and grouping them as
the case may be.
2. Present the major ideas of the data on a chart for easy
understanding.
3. Link and practicalize the knowledge acquired in the previous
lectures.
Pre-Test
To be taken before you read the lecture.
1. Numerical data are better organised by
(a) Tabulating them.
(b) Leaving them in raw form.
(c) Throwing part of them away.
(d) None of the above.
26
2. All the following are ways of presenting data diagrammatically except:
(a) Pictogram.
(b) Histogram.
(c) Barograms.
(d) Pie-chart.
CONTENT
27
Age (class) No of Voters (freq)
1-20 2,500
21-40 12,000
41-60 9,800
61-80 4,600
81 above 1,200
The first column is called the class limits because you have grouped the
voters into different age brackets. The second column is called the
frequency column which gives the actual number of voters in each age
bracket.
You can now see why we call table 4.2 a complex tabulation. It contains
information on several variables.
28
Section 2: Diagrammatic Representation
The various forms of diagrammatic representation are discussed in this
section.
29
A 30 million
B 25 million
C 15 million
D 20 million
E 10 million
To draw the Bar chart of data on Table 4.3, scale the frequency column on
the vertical axis and the cities or classes on the horizontal axis. Ensure
equal interval between the bars and label the chart accordingly.
40
30
20
10
A B C D E cities
30
To draw the pie-chart, we first convert the number of persons in each city
to degree i.e. unit of measuring an angle in a circle. The conversion is
simple, it is just:
Thus,
Column 1 Column 2 Column 3
A 30 = 108°
x360 0
100
B 30 = 90°
x360 0
100
15
C x360 0 = 54°
100
20
D x360 0 = 72°
100
10
x360 0
E 100 = 36°
Total 360°
Note that adding column 3 together you must arrive at 360°. Otherwise go
back to the computation and check for the mistakes.
Next, draw a circle and divide it into sectors each sector representing each
value in column 3.
31
Pie Chart of Table 4.3
E, 360
0
D, 72
0
A, 108
0
C, 54 B, 90
0
32
particular class e.g. for the class 21-40, the class interval is 10.
This is derived by:
Class interval = Upper class – Lower class boundary
= 40.5 – 20.5
=10
You should now calculate the class interval for all other classes. All
must give you 10 except the open class which you don’t need to
calculate for.
61
=
2
= 35.5
You should also calculate the class mark of all the other classes except
the last class.
I will now illustrate how to draw these charts with a good example:
33
Example 1: The age of 80 registered voters are given below:
34
Age group 10-19 20-29 30-39 40-49 50-59
No of voters 9 17 25 19 10
Calculate:
1. The lower and upper class boundaries.
2. The class width.
3. Construct the cumulative frequency table.
4. Plot the Ogive and frequency polygon.
5. Draw the histogram.
Solution
1. See column 4 in (iii) below. The procedure is to subtract 0.5 from the
lower class limit and add 0.5 to the upper class limit (of column 1).
2. Class width = Upper – Lower class boundaries
= 19.5 – 9.5
= 10 units
It is the same result for all other classes.
Column 1 Column 2 Column 3 Column 4 Column 5
Age group No of Cum. Freq Class Class mark
(class) voters (f) boundary
10-19 9 9 9.5-19.5 14.5
20-29 17 26 19.5-29.5 24.5
30-39 25 51 29.5-39.5 34.5
40-49 19 70 39.5-49.5 44.5
50-59 10 80 49.5-59.5 54.5
3. To plot the frequency polygon, we need to compute the class mark
(see column 5).
Class mark = Lower + Upper class limit
2
10 + 19
=
2
29
=
2
= 14.5
Do the same thing for all other classes. The result is in column 5.
35
Column 3: Ogive or cumulative frequency curve of table (iii)
Cum. Freq
80
60
40
20
Freq The frequency Polygon of the table (iii) Class boundary column 4
30
20
10
36
Freq Histogram of table (iii)
25
20
15
10
Summary
We have learnt how to represent our data on tables and charts. The charts
make information being presented more appealing. Information is better
managed by tabulating them. All tables and charts must be properly
labelled. Note that we can easily use the computer to draw these charts.
This will be discussed in a later lecture.
Post-Test
The post-test is same as the pre-test question. Go back and answer those
objective questions and compare your answers with the pre-test attempt.
In addition, try to solve this essay question:
37
1. The annual budgetary allocation of a country in Africa is as follows:
Sector Allocation (US Dollar)
Education 10m
Health 15m
Defence 40m
Agric 20m
Power & Steel 8m
(a) Use the data to draw Pie chart.
(b) What is the highest priority of the government for that year?
(c) Also draw a bar chart for the data.
2. The table below presents the times taken by 50 male students to finish
POS 311 Exam (in minutes).
164 140 135 145 155 157 147 125 154 136
142 172 142 144 132 155 158 138 165 139
162 123 136 170 169 115 156 162 133 145
140 153 146 146 136 144 150 145 143 165
135 161 152 130 148 129 142 153 131 145
(i) Using a tally column and a class size of 10 units, construct a
frequency distribution table for the data.
(ii) Draw the bar chart and the histogram.
(iii)Draw the Ogive.
References
Gupta, C.B. (1982). An Introduction to Statistical Methods. Delhi:
Vikas Publishing House Pvt Limited.
38
LECTURE FIVE
Descriptive Statistics I
Introduction
In this lecture, we want to examine how numbers can be employed to
described and explain human social condition. Political Scientists and
Political Practitioners frequently employ simple concepts such as
percentages, rates, averages and deviations to measure politically
significant phenomenon. This lecture will attempt to equip the student
with the skill of supporting their argument with numerical facts and
figures in reaching out to their audience.
Objectives
At the end of this lecture, students should be able to:
1. Compute simple averages, percentage, rates, etc.
2. Use and interpret these numerical measures accurately.
3. Do better comparative studies using the descriptive statistics as a
tool.
Pre-Test
To be taken before you read the lecture.
1. The total annual budget for 2001 was N500m. Education was
allocated only N25m. What percentage of total budget goes to
education?
(a) 50% (b) 0.5% (c) 500% (d) 5%
39
2. 16 million people voted in the 1993 Presidential Election while 24
million people voted in the 1999 Presidential Election. What is the
percentage change?
(a) 5% (b) 50% (c) 0.5% (d) 500%
CONTENT
Section 1: Percentage, Rates and Ratios
Percent literally means out of a hundred. Let us illustrate this concept by
the first question in the pre-test exercise.
Education
= 25x100%
500
= 5%
40
Solution to pre-test 2
Percentage change = (Final value – Initial value) x 100%
Initial value
( 24 − 16) x100%
=
16
8x100%
=
16
= 0.5 x 100 %
= 50%
Note that the percentage change in the number of voters is positive here.
In some other cases, it could be negative (i.e. if the final value is lower
than the initial value).
Rates
Let us also illustrate this concept with a clear example. Answer this
question before you proceed.
Question: From the table below which country has the highest crime rate?
Solution
If your answer is Nigeria, then you are wrong. It is wrong to say Nigeria
had the highest crime rate simply because the number of crimes recorded
in Nigeria is greater than the two other countries. That is, an operational
definition relying simply on the number of crimes committed in a year is
invalid. Large countries have more crimes simply because they have more
people. The correct thing to do is to express the number of crime cases
per 100, 000 population.
41
i.e.
10,000 x100,000
Nigeria =
120,000,000
10x100
=
120
= 8.3
Approximately 8 crime rate per every 100,000 population.
3,500 x100,000
Ghana =
52,000,000
350
=
52
= 6.73
Approximately 7 crime rate per 100,000 population
Therefore South Africa has the highest crime rate per 100,000 population
because 20 is greater than 8.
Ratios
Unlike in percents and rates, ratios are computed by dividing non
equivalent units
For example:
Per capital Gross Nation Product = GNP (in naira)
Population (No of people)
42
Thus nations with high per capital GNP can be said to be more
economically developed that nations with low per capital GNP.
The Mean
The mean, sometimes called the arithmetic mean of a set of numbers is the
sum of the numbers divided by their total frequency. This is usually
computed at the interval level by adding together each score and dividing
the sum by the total number of cases. It is sometime represented by X or
simply M.
For an ungrouped data,
= ∑ xi ……………………(4.1)
N
where N = total frequency (or ∑f)
∑ = is just a symbol for “sum of.”
Mean X
∑ fx ………………………………………….. (4.2)
∑f
Here the numerator says multiply each value of X by its corresponding
frequency and sum all. The denominator says sum together all frequencies.
43
Example: From the given data below, calculate the average (mean)
Defence expenditure of Nigeria over the five years.
Table 5.1
Year 1995 1996 1997 1998 1999
Defence
Expenditure ($b) 5 8 10 9 11
Solution
Since this is an ungrouped data, we use equation (4.1)
Mean ( X ) = ∑ Xi
N
5 + 8 + 10 + 9 + 11
=
5
43
=
5
= 8.6
Thus, the mean Defence expenditure of Nigeria for the period is US $8.6
billion.
5, 4, 5, 4, 7, 8, 4, 5, 8, 7, 8, 5, 7, 8, 6, 8, 6, 5, 8
What is the average number of bills passed for that year in the North?
Have you solved it? Now proceed to the solution provided below and
compare your steps and the answer.
Solution: Since each of the numbers appear more than once; we must
multiply the numbers by their respective frequencies (i.e. number of times
they appear).
44
Thus, using equation 4.2
Mean = ∑ fx
∑f
5(4) + 4(3) + 7(3) + 8(6) + 6(2)
Average bills passed =
4+3+3+ 6+ 2
20 + 12 + 21 + 48 + 12
=
19
113
=
19
= 5.947
The Median
The median is the value that divides a set of observations into two equal
halves. That is, 50% observation is above and below the median. The
first task is to arrange the numbers in ascending or descending order. The
N +1
next task is to locate the th position of the middle term
2
N +1
Step 2: The Median occupies th position
2
i.e. 5 +1 6
th = = 3 position
rd
2 2
45
Note that n is 5 because we have 5 Defence expenditure figures
Step 3: Locate the value that occupies the 3rd position in Step 1. The value
is 9.
∴ Median = 9 + 10 = 19 = 9.5
2 2
Median (Md) = Li + N - ∑f bm
2 C
fm …………………….…..4.3
N +1
Note that in order to locate the th position in this case, we
2
46
shall use the cumulative frequency column (see example 4.3 below for the
calculation procedure).
The Mode
The mode is the category, which occurs most frequently. At the norminal
data below, the mode (or modal category) is Christianity.
Religion No of respondent
Islam 17
Mysticism 4
Christianity 25
Others 3
It will be wrong for you to say the mode is 25 (which is just the frequency
of the modal category).
Another example: The mode of the set of figures 7, 3, 2, 1, 7, 5, 7 is what?
Answer: Mode is 7 because it occurs more frequently than all other
figures. Note that if another figure (say 2) occurs as frequent as 7 in the
example above, we say the set is bimodal (i.e. mode = 2 and 7).
Mode (mo) = Li + ∆i
C
∆i + ∆2
47
{Nigeria or Cameroon}. The number of participants in all the wards are a
given below:
43 49 39 25 48 54 43 42 45
47 51 60 68 34 38 59 31 27
Solution
1 2 3 4 5 6 7
Class Tally F Class Fx Class Cumm
mark (x) boundary freq.
21-30 II 2 25.5 51 20.5-30.5 2
31-40 IIII 5 35.5 177.5 30.5-40.5 7
41-50 IIII III 8 45.5 364 40.5-50.5 15
51-60 IIII 4 55.5 222 50.5-60.5 19
61-70 I 1 65.5 65.5 60.5-70.5 20
20 880
∑f sum of column 3
= 880
20
= 44
48
Mode = Li + ∆i
C
∆I + ∆i
The modal class is the class 41-50 because it has the highest frequency.
3
∴ Mode = 40.5 + x10
3 + 4
30
= 40.5 +
7
= 40.5 + 4.285
= 44.785
20 + 1 21
i.e. = th = 10.5th position.
2 2
. N - ∑ f bm
MedianMedian
. . 2
=Li ___________ C
fm
49
20 - 7
= 40.5 + __________
2 10
8
3_
= 40.5 + 8 X 10
30
= 40.5 +
8
= 40.5 + 3.75
3. To answer this question, we should bear in mind that the Mean is the
only measure of central tendency, which takes all the observations into
account in its computation. We are only fortunate in this example that
the three measures happen to fall in the same class 41-50. This may
not always be the case, the Mean 44 participants is the figure that best
describe the distribution.
Summary
Descriptive tools such as percentages, ratios, rates, average etc are useful
tools in comparative studies. The crime level of two countries can be
compared when they are expressed as a rate per thousand of the
population. The Mean is a measure of central tendency, which shows the
typicality of its group. The Mode is the most frequent occurring value
while the Median divides a set of values into 2 equal halves.
50
Post-Test
You shall now go back to the Pre-Test, attempt it again and compare your
answer with your first attempt.
References
Aggarwal, Y.P. (1998). Statistical Methods: Concepts, Application
and Computation. New Delhi: Sterling Publishers Pvt Limited.
51
LECTURE SIX
Descriptive Statistics II
Introduction
This lecture is the concluding part of our discussion on Descriptive
Statistics that we started in the last lecture. We shall examine the degree
to which a set of figures deviate or spread about their average. The Mean
Deviation, the Range and the Variance are some of the measure of
dispersion that will be discussed in this lecture.
Objectives
At the end of this lecture you should be able to:
1. Define and calculate the mean deviation.
2. Define and calculate the range.
3. Define and calculate the variance.
4. Deduce the standard deviation from the variance.
5. Interpret the values of each of the above measures.
Pre-Test
Attempt the following questions before you proceed to study the lecture.
1. The difference between the highest and the lowest score in a
distribution is called
(a) Mid deviation (b) Range
(c) Standard deviation (d) None of the above
52
2. When a group of scores have a large variance, then the scores are said
to be
(a) Far apart from their median.
(b) Far apart from their mode.
(c) Far apart from their mean.
(d) All of the above.
CONTENT
1. The Range: The range is simply the difference between the highest
score and the lowest score in a distribution. When all scores in the
distribution have the same value; the range is zero, i.e. no variability. The
higher the value of the range, the farther apart are the extreme scores of
the distribution.
53
A weakness of the range is that it considers only the two extreme cases in
its computation.
Example: Find the range of the following income distribution in a private
establishment.
N50,000, N20,000, N150,000, N32,000 and N5,000.
Solution:
Income Range = Highest – Lowest income
= N150,000 – N5,000
= N145,000
This shows that there is a high degree of income inequality in the
establishment.
N.B: Now that you are becoming familiar with some of the symbols in
this course, we shall introduce some new formulas, which are quite
similar to those encountered in the previous. Before then let me remind
you that symbol X stands for numerical value of a variable, symbol XX
stands for mean or average value of a variable X, N stands for the total
number of cases in the distribution. ∑ implies summing up.
For computation purposes, use the format below to arrange your work:
1 2 3
X
X X −X
54
1. Put the values of the variables.
2. Insert the mean (which is a constant).
3. Subtract the mean from each value (ignoring negative signs). The two
vertical lines in column 3 means absolute value i.e. ignore negative
sign.
For a grouped data, use:
X−X
Mean Deviation {MD} = ∑ f ………….6.2
N
The symbols are as explained above in the ungrouped case. Note that the
MD retains the original unit used in measuring the data. We shall take an
example to illustrate these concepts soon. Before then, let us discuss the
variance and standard deviation.
3. The Variance
The variance is also an index that reflects the degree of variability or the
average distance from the mean of all cases in an interval level
distribution. However, it relies on the squared distance between X and X.
Its derivative, the standard deviation, converts the squared unit to a linear
one. The variance is computed by:
2
Variance { ð2 } (X=-∑X) for ungrouped data
……….6.3 N
55
56
∑ f {X − X }
2
=
2
Variance { ð } for a grouped data ……………6.4
N
N ………………6.5
2
= ∑ƒ(X - X) grouped data
N ………………………6.6
Like the mean deviation, the standard deviation is relatively large if the
cases are widely dispersed about mean. If all the cases take on the same
value, the standard deviation is zero.
σ
5. Coefficient of Variation CV = X
57
Solution
1. Since this is an ungrouped data, for
(i) Mean deviation, use equation (6.1)
Thus MD =
∑ X −X
N
5 + 8 + 10 + 9 + 11
But X =
5
43
=
5
= 8 .6
8.4
=
5
= 1.68
i.e. US $ 1.68 billion
∑ (X − X )
2
σ 2
=
N
= (5-8.6) 2 + (8-8.6) 2 + (10-8.6) 2 + (9-8.6) 2 + (11-8.62)
5
58
= 12.96 + 0.36 + 1.96 + 0.16 + 5.76
5
21.2
=
5
= 4.24
= 4.24
= 2.059
= US $ 2.06 billion
CV =
= 2.06
= 8.6
= 0.239
= 0.24
The social information here is that since the coefficient variation is very
small in magnitude, we may conclude that there exists only a little
variation in the Defence expenditure within the five years. The values are
indeed close to the average.
59
Summary
Post –Test
Same as the pre-test, in addition example 2 above should be attempted.
References
Aggarwal, Y.P. (1998). Statistical Methods: Concepts,
Application and Computation. New Delhi: Sterling Publishers Pvt
Limited.
60
LECTURE SEVEN
Introduction
This lecture attempts to introduce students to the use and application of set
theory in political phenomenon. It will introduce several concepts that
will be useful in the next lecture on probability. It will also address the use
of Venn diagrams.
Objectives
At the end of this lecture, students should be able to:
1. Apply and interpret set theoretic notations.
2. Draw and interpret information using the Venn diagram.
3. Appreciate some probability notations that would be used in the
next lecture.
Pre-Test
Attempt these questions before you proceed to study lecture note.
1. A set having only one element is called
(a) One set (b) Unity set
(c) Singleton (d) None of the above
3. If set A contains the elements (1,2,5) and set B contains the elements
(2, 6, 8), then the set A union B, AUB will contain
61
(a) {1, 2, 2, 5, 6, 8}
(b) {1, 2, 5, 6, 8}
(c) {2}
(d) {1, 5, 6, 8}
A B U
(a) A for B.
(b) A intersection B.
(c) A inside B.
(d) A divided by B.
CONTENT
62
The elements of a set can be described in words, for example
Let B = {the set of all State Governors who are ex-military men}.
R = {the set of all Senators who are ex-governors}.
I = {the set of all states in Nigeria that enforces the Sharia Code into
its legal system}.
Definitions
1. If a is an element of set B we write a Є B {i.e. a belongs to B).
2. If a is not an element of set B, we write a ∉ B
(i.e. a does not belong to B).
3. A Set having only one element is called singleton
e.g. A = {2}.
4. A Set which has finite number of elements is called a finite set;
otherwise it is called an infinite set.
5. Let A and B be two separate sets, if every element of A is an element
of B, then A is called a subset of B. It is written as AСB or B ⊃ A. If
the two sets have the same elements then, A = B and it is written as
A ⊆ B.
6. A Set that has no element is called a Null set denoted by {ǿ}.
7. The Universal Set denoted by letter U is the set that contains all sets
under consideration.
8. Union of Sets: The union of two sets A and B are the totality of all the
elements in both sets.
e.g. Let A = {1, 2, 6}
B = {6, 9, 10}
63
10. Complement of a Set: The elements that are in the universal set U but
not in a given set D are called the complements of the set D denoted
by D1
U = {1, 2, 3, 6, 7, 9}
1. Union of sets
2. Intersection of Sets
A B
U A∩B
64
3. Complement of a set
B
B1
U
4. Complement of intersection
5. Complement of Union
The shaded portion implies {AUB}1
A B
65
Summary
Post-Test
Same as the pre-test questions, but in addition, answer the following
questions and compare your answers with the solution provided below:
Solution
(i) A U B = {a, c, d, f, I, k, m, p, s}
(ii) A U B U C = {a, c, d, f, I, k, m, p, s n, o, r}
(iii) A ∩ B ∩ C = {f}
Reference
Spiegel, M.R. and L.J. Stephens (1999). STATISTICS: Theory and
Problems (3rd Ed), Schaum’s Outline Series. New York: McGraw-Hill.
66
LECTURE EIGHT
Introduction
Due to the peculiar nature of man, Social Science phenomena are basically
characterized by uncertainty. A Social Science researcher does not have
the absolute degree of freedom in predicting social phenomenon.
Although the uncertainty principles create a kind of complexity in making
inferences in social science, there is a need to develop a tool that will
enable us make inferences. Probability is the tool that enables inference-
making even at the face of uncertainty.
In this lecture, we shall examine the basic principles of probability and
random variables as it relates to social events.
Objectives
At the end of this lecture, you should be able to:
1. Understand the use of probability in making predictions.
2. Link set theory, events and outcomes in interpreting probability
statements.
3. Compute the probability of a given event.
Pre-Test
Answer the following questions before you read the lecture note further.
1. If the probability that an incumbent governor will win an up coming
election is 3/5, what is the probability that he will loose the election?
(a) 3/5 (b) 4/5 (c) 1/5 (d) 2/5
67
2. The collection of all possible outcomes in an experiment is called
(a) Sample size (b) Sample mean
(c) Sample space (d) Sample half
5. The probability of any given event can only have values that lie
between:
(a) -1 and +1
(b) 0 and +10
(c) 0 and +2
(d) 0 and +1
CONTENT
68
Events and Outcomes
In statistical terms, an event is an outcome of an experiment. Thus success
is an event in an election so also failure is an event. In fact the expected
outcome of any election is the set:
A = {win, lose}
Solution
Total sample space = (5 + 7 + 8) Senators
= 20 Senators
Number of outcomes favourable to PDP Senators = 8
Let E = event of selecting a PDP Senator, then using equation 6.1
P (E) = 8/20
= 0.4 This is a fairly high probability.
69
2. Complementary Events
Two events are said to be complementary if they both exhaust all possible
outcomes.
For instance if the event of winning a war is ½ and the event of losing the
war is also ½, then, the event
We also assume
“winning the war” + “Not winning the war” a fair game here.
= ½ +½
4. Independent Events
Two events are said to be independent if both can occur together at the
same time. That is the occurrence of the one does not prevent the
occurrence of the other.
e.g. Let us define two events as follows: A = Nigeria wins the Bakassi
war. B = Nigeria wins the next African Nations Cup.
Then A and B are said to be independent events since both events can
occur at the same time. Therefore:
70
P (A ∩ B) = P ( A ) P ( B ) ………………………………………….. 8.4
We call equation (8.3) the addition rule of probability and we call equation
(8.4) the multiplication rule of probability.
If W and L are not mutually exclusive, then equation 8.3 becomes
Solution: Since the two events are independent, we shall use equation (8.4)
Then:
P (A ∩ B) = P (A) P(B)
= (0.58) (0.61)
= 0.35
which is a weak probability.
71
Sample space values
S 1
F X 0
Fig 8.1
The linkage between the sample space and the values (figure 8.1) is
provided by the random variable X.
Thus the function that a random variable (X) performs is to associate a
real number 0,1, 2,….., to every element of a sample space.
Illustration: Since the game of tossing a coin is fairly similar to the game
of war (at least with respect to the dichotomy head or tail and win or lose),
all other variables held constant. We shall use a coin tossing example to
illustrate our discussion in this section.
72
Solution
Let us define a random variable X as the number of heads observed in the
experiment.
Diagrammatically
Figure 8.3
Outcome X Values
TT P ( X)
TH 0
HT 1 1/4
HH 2 1/2
If the first coin and the second coin results in tail (TT), we record value
zero which means failure. If one coin results in H and the other in T we
record a value of 1 to show that only one success occurred. Finally if both
coins result in head (HH), we record 2 to show two successes. However,
the probability associated with each value of X, denoted by P(X) is
calculated as follows:
73
= P (A11 n A12)
= P (A11 ) P(A12 ) independent events
= ½ .1/2
= ¼
That was how we arrived at the last column of figure 8.3. The distribution
in figure 8.3 can also be represented graphically as:
0 1 2 X
74
We can as well express the probability distribution in a table as below:
TT 0 0.25
TH, HT 1 0.50
HH 2 0.25
Summary
Post-Test
Go back to the Pre-Test, attempt it again and compare your answers with
your first attempt.
References
Gupta, C.B. (1982). An Introduction to Statistical Methods. Delhi:
Vikas Publishing House Pvt Limited.
Ott, L. et al. (1983). Statistics: A Tool for the Social Sciences (3rd
Ed). Boston, Massachusetts: Duxbury Press.
75
LECTURE NINE
Introduction
In the last lecture, we introduced the concept of probability and random
variables. In this lecture, we shall examine some major probability
distributions that are widely used in Social Sciences. This lecture
concentrates on discussing the Binomial and the Normal probability
distributions.
Objectives
After this lecture, you should be able to:
1. Use the Binomial and Normal distribution in a more detailed
manner.
2. Understand the characteristics of and why some social phenomena
are classified as Binomial and others as Normal distribution.
Pre-Test
To be taken before you read this lecture
1. The Binomial probability distribution can be said to be
(a) A one chance experiment.
(b) A Normal chance experiment.
(c) A good chance experiment.
(d) A two chance experiment.
76
3. If the standard deviation of a Binomial probability is NPQ, what is
the variance of the distribution?
(a) NPQ (b)N2P2q2 (c) NPq2 (d) NPq
CONTENT
77
3. If our success is the event “Vote for”, then the random variable
involved is the number of success in 500 trials. Thus the third
characteristic is satisfied.
4. This characteristic is also satisfied because the probability of success
from one voter to the other is constant i.e. if we assume that the
probability of voting for, P = ½ and the probability of voting against, q
= ½ then p and q are constant from trial to trial.
5. The repeated trials are independent because all the voters can vote at
the same time.
The question now is how do we compute this probability? This leads us to
the next units.
Where x = 0, 1, 2, …………………., n
n!
n
and Cx =
X! (n – X)!
n
Note that: Cx reads n combination x
n! reads n factorial
n! = n (n -1) (n – 2) ………………………………… 1
3! = 3 x 2 x 1 = 6
5! = 5 x 4 x 3 x 2 x 1 = 120
0! = 1
78
Equation 9.1 is called the Binomial Probability Function
Example (9.1): Suppose it is known from past experience that
approximately 90% of human beings contacting HIV virus die within 10
years of infection. Given that 5 people in a certain town contact the
disease, use the Binomial probability to find the probability that exactly 4
of them will die within 10 years.
Solution
In this case, dying of the virus is the success p. Surviving beyond 10 years
is failure
q = 1 – p.
n = 5 5 people involved.
=
5!
(0.9 )4 (0.1)1
4! (5 − 4 )!
=
5 x 4 x 3 x 2 x1
(0.6561 )(0.1)
4 x 3 x 2 x1x1
79
Now you should try and solve the above problem on your own but use the
figures supplied below:
Use p = 80%
n = 4
x = 3
Just follow the same procedure as in the solved example.
Mean X = NP
Variance σ2 = Npq
Mean is sometimes referred to as the expected number. Also note that the
Mean, Variance and Standard deviation discussed in this section have the
same meaning with the ones discussed in lecture four, but different
computation formula.
80
Properties of the Normal Distribution:
1. The curve of a normal distribution is bell-shaped.
F(x)
µ X
Figure 9.1
81
F (z)
84
3 -2 -1 0 1 2 3 Z
68.27%
95.45%
99.73%
Figure 9.2
Z = X – µ ……………………..9.2
This means that X has mean µ and standard deviation but the
variate Z has Mean 0 and Standard Deviation 1. Z is the standard
normal variate or score.
82
The standard normal curve for different values of Z is tabulated for the use
of researchers. This is called the table of normal distribution function (see
the Appendix).
Solution
(i) Using equation 9.2, where
µ = 230, = 20, X = 280
= 50
20
= 2.5
Step 2: Shade the portion on a normal curve. The portion can be shown on
the normal curve as below:
83
X > 280 shaded.
230 280
Fig. 9.2a
Z = X- µ
220 − 230
=
20
− 10
=
20
= −0 .5
84
Second step: Shade the portion on the normal curve
220 230
Figure 9.2b
Third step: Check the area up in the standard normal table (Appendix).
The negative value of Z indicates that the point 220 is to the left of the
mean (fig 9.2b). Thus the area A corresponding to Z = 0.5 (neglecting the
negative sign) is 0.1915. But the entire area to the left of 230 is 0.5,
therefore, the area of the shaded portion is 0.5 – 0.1915 = 0.3085.
Therefore, the probability that the selected state has a per capita income
greater than $220m is 0.3085. This is quite higher than the probability in
(i) above.
(iii) We are required to compute:
P (220 < X < 280)
Step 1: compute the Z score for both sides
85
Step 2: Shade the portion on a normal curve:
A1 A2
(220 < x < 280 shaded)
Figure 9.2c
Which is the required probability that the selected state has a per capita
income between $220 and $280.
Summary
86
Post-Test
Same as the Pre-Test. Provide answers to those questions again and
compare your answer with the pre-test attempt before looking at the
answers provided by the tutor.
References
Gupta, C.B. (1982). An Introduction to Statistical Methods. Delhi:
Vikas Publishing House Pvt Limited.
Ott, L. et. al (1983). Statistics: A Tool for the Social Sciences (3rd
Ed). Boston, Massachusetts: Duxbury Press.
87
LECTURE TEN
Statistical Estimation
Introduction
The population parameters are often unknown and we may need to
estimate them. In this lecture, we will consider the procedure by which we
can estimate the population parameter from the sample selected.
Specifically we shall discuss interval estimate of population mean,
interval estimate of population proportion and interval estimate of the
standard deviation.
Objectives
After this lecture, you should be able to:
1. Discuss the difference between parameter and statistic.
2. Distinguish between point estimate and interval estimate.
3. Construct interval estimate for the population mean, population
proportion and population standard deviation.
Pre-Test
To be taken before you proceed to study the lecture.
88
2. Point estimate and interval estimate are the same. In fact, one can be
used in place of the other. Yes or No?
CONTENT
Table 10.1
Measure Sample Symbol Population Symbol
Size n N
Mean µ (mu)
X
Variance S² σ² (sigma squared)
Sampling Error
In a survey where parameters are unknown, they are estimated from the
statistic. The difference between actual value of a parameter and its
estimate (statistic) is what we call the sampling error. Sampling error is
function of the sample size. The larger the sample size in a survey, the less
the sampling error. By implication, the lower the sampling error, the more
the sample is representative of the population. Let us now discuss how to
compute this estimate from the sample information.
89
Estimation
As we have discussed earlier, we often wish to get information about a
population by studying just a randomly selected sample of it. We say we
are estimating the population parameter based on the sample statistic. We
can do this in two ways:
1. We can compute a single value from the sample (say the mean) and
call it an estimate of the population mean. This is called a point
estimate.
2. We may compute an interval of values (say the mean) from the sample
dates and say the population parameter falls within that interval. This
is called an interval estimate.
The second procedure in more preferable because it gives some air of
freedom from making a wrong prediction about the population parameter
i.e. it reduces sampling error.
Consider the two statements below:
1. The average income of some selected employed graduates in N22,000.
2. The average income of some employed graduate in the private sector
is N20,000 + N3,000 i.e. lies between (N17,000 and N23,000).
Observe that the first statement is more liable to error than the second
statement. Thus we shall examine the interval estimate in more details
in this lecture.
90
σ
X = ±1.96 …………. 10.1 a
n
σ
X ± 2.58 ……………. 10.1 b
n
Where:
x is the sample mean
δ is the population standard deviation
n is the sample size.
σ
The quantity n is often called the standard error of the mean. If S is
given instead of σ, then use S in place of σ in equation 10.1a.
Example 10.11
It was generally believed that the average income of a graduate employee
in the private sector has a standard deviation of N5,000. A sample of 80
graduate employees from that sector however shows that the average
income is N18,000. Estimate the population average income using a 95%
confidence interval. Assume that incomes of employees are normally
distributed.
Solution:
σ
Step 1: Write down the formula: x ± 1.96
n
Step 2: Fix in the given values:
You have been told that x = N18,000
σ = N5,000
n = 80
91
Thus:
σ
X + 1.96
80
18,000 ± 9800
8.94
18,000 ± 1096
#18,000 ± #1,096
Thus the population average income will lie within the interval (#16,904
and #19,096).
P ± 1.96
P q ………………………..10.2
N
Where the P and q are as previously defined and n is the sample size.
92
Example 10.2
Assume that PDP is a capitalist party and AD is a socialist party. In an
attempt to determine the party affiliation of university teachers, a sample
of 800 university teachers were selected and interviewed in Nigeria. Out
of them, 510 identified with PDP. Estimate the true proportion of
university teachers in Nigeria that identify with PDP. Use a 95%
confidence level.
Solution
Step 1: Write down the formula:
: . q = 1- P = 1- 0.63 = 0.37
n = 800
93
= 0.63 ± 0.0334
= 0.63 ± 0.03
σ ……..……………… 10.3
S ± 1.96
2n
Let us also illustrate this with an example.
Example 10.3
The lifespan of an HIV carrier was measured as the time between infection
and death. A sample of 50 HIV patients reveals that the standard deviation
of their lifespan is 6 years. Construct a 95% confidence interval for the
standard deviation of lifespan of all such patients.
Solution
Here you are given S = 6 years, n = 50, no information is given above σ.
But σ appears in the formula you are required to use. (See equation 10.3).
Since the sample size is large (i.e. greater than 30) we assure that S will be
a good estimate of σ in equation 10.3.
94
σ
S ± 1.96
2n
6 ± 1.96
(6 )
2 × 50
= 6 ± 1.96
(6)
100
11.76
= 6±
100
= 6 ± 1.176
Confidence 99.73 99% 98% 96% 95.45 95% 90% 80% 68.27%
level % %
95
Summary
There are 2 types of estimates. The point estimate and the interval
estimate. The interval estimate is more reliable in Social Science
research. It gives an interval of values within which the population
parameter is expected to fall. The information gathered from the
sample statistic is used to estimate the population parameter.
Post-Test
(same as Pre-Test)
1. ……………….. is to population as ………………….. is to sample.
(a) Parameter, Statistic
(b) Statistic, Parameter
(c) Statistic, Variance
(d) All of the above
2. Point estimate and interval estimate are the same. In fact, one can be
used in place of the other. Yes or No?
References
Gupta, C.B. (1982). An Introduction to Statistical Methods. Delhi:
Vikas Publishing House.
96
LECTURE ELEVEN
Introduction
A research work is meaningless if there is no pre-stated research
hypothesis. It is the hypothesis that will serve as the searchlight while the
theory serves as the footpath to a successful research outcome. In the light
of this, this lecture will examine what a good hypothesis entails, how to
formulate it, and how to test it.
Objectives
At the end of this lecture, students should be able to:
1. Formulate a good hypothesis and discuss the procedure to be taken
in testing it.
2. Improve your research competence in the area of hypothesis
testing.
Pre-Test
(To be taken before reading the lecture note)
1. Identify the dependent variable in this hypothesis: Women are likely to
vote than men.
(a) Women (b) Likelihood of Voting
(c) Men (d) Sex
97
3. A proposition formulated for the sole purpose of rejecting or nullifying
based on the result of a statistical test is called (a) a first hypothesis
(b) a good hypothesis (c) a rejected hypothesis (d) a null hypothesis
CONTENT
98
the result expected and thus help us decide whether to accept or reject
hypothesis, are called test of hypotheses, test of significance or decision
rules.
99
Accept Reject
Ho True Right Decision Type I Error
Ho False Type II Error Right Decision
Note that Type I error is more serious than Type II error.
The student should now perform the following tasks: In the few
hypotheses given below, you should identify the dependent, independent
variables, and the case involved. The solution to the first one has been
provided. You can follow the same format.
1. “States with higher per capita income spend a larger proportion of
their budgets on education than states with lower per capita income”.
Independent variable = per capita income
Dependent variable = proportion of budget devoted to education
Cases = States
Level of Significance
In testing a given hypothesis, the maximum probability with which we
would be willing to risk a Type I error is called the level of significance.
It is either 0.05 (5%) or 0.01 (1%) in practice.
If 0.05 significance level is chosen in designing a decision rule, then
there are about 5 chances in 100 that we would reject the hypothesis when
it should be accepted i.e. we are about 95% confident that we have made
the right decision.
100
It could be a T – test
Z – test
Chi – Square test
Correlation
F – test etc.
4. State the level of significance denoted by α e.g. 5% or 1%.
5. Determine the degree of freedom, depending on the test-statistics you
are using.
6. Evaluate your result by comparing computed value with tabulated
value.
7. Make decisions based on 3, 4 and 5 above.
If calculated value > tabulated value, REJECT Ho
If calculated value < tabulated value, ACCEPT Ho
8. Interpret your result by giving the relevant social implications.
Summary
Post-Test
Same as Pre-test.
References
Aggarwal, Y.P. (1998). Statistical Methods: Concepts, Application
and Computation. New Delhi: Sterling Publishers, Pvt Limited.
101
LECTURE TWELVE
Introduction
In this lecture we shall be discussing some test-statistics in details. These
test-statistics are useful tools in hypothesis testing. They are the T-test and
the Chi-square test. Some other test-statistics will be examined in later
chapters.
Objectives
At the end of this lecture, you should be able to:
1. Define and explain what a test of significance means.
2. Use the t-test and X2 – test as a tool for test of significance.
Pre-Test
1. The t-test is a useful test for difference of means. It is usually
employed when the sample size is (a) < 30 (b)> 30 (c) <100 (d) > 100.
2. The Z-test is also a test of significance employed when the sample size
is (a) Large (b) Small (c) Double (d) Halved.
3. The test-statistics for the goodness of fit is (a) The Correlation test
(b)The Chi-square test (c) The Binomial test (d) The Poisson test.
4. The statistic, calculated from the sample data which is used to validate
the Null hypothesis is called the
(a) Good hypothesis (b) Research hypothesis
(c) Test statistic (d) Research statistic.
5. The test-statistic that relies on frequency count for its categories than
actual interval level measurement is
(a) T-test (b) Z-test (c) P-test (d) X2 – test.
102
CONTENT
(b) Likewise, we may have 2 different samples drawn from the same
population, and our interest is to test if there is any significant difference
between the means of the two samples. When we are confronted with
problems like the above, the T-test is to be employed in carrying out the
test. It operates by computing a critical value T (or t calculated) and
comparing it with a table (or t tabulated) value at a specified degree of
freedom and level of significance. T calculated is computed as follows:
X −µ ………………………..12.1
T=
S
n
X −µ
or Z= ………………………...12.2
σ
n
for a large sample and when the population variance is known
Where X = Sample
µ = Population mean
S = Sample standard deviation
σ = Population standard deviation
n = Sample size
103
For scenario (b) above, we use the formula:
X1 − X 2
T=
1 1 ……………………………12.3
σ +
N1 N 2
+ N 2
Where σ = N1 S1 2 2 S2
N1 + N2 -2
T = X1 - X2
SED
σ 2
Where SED (σD ) = σ1 2 2
+
N1 N
2 …………………….12.4
Symbols:
X 1 = Means of sample 1
X = Mean of sample 2
2
104
N2 = Number of Observation in sample 2
σ = pooled Variance
S1 2 = Sample 1 Variance
S2 2 = Sample 2 Variance
Decision Rule
If t calculated > t tabulated, Reject Ho
If t calculated < t tabulated, Accept Ho
Let us now use an example to illustrate what we have discussed so far.
Example 12.1: A Times survey reveals that the ages of JAMB candidates
follow normal distribution with mean 23years and standard deviation
7years. However, a randomly selected sample of 50 JAMB candidate
shows that their average age is 25. Do we have enough evidence to
conclude that the Times survey was low?
Solution
Let µ = true population means of JAMB candidates
Step 1: Set up your hypothesis
Ho : µ = 23 years
H1 : µ ≠ 23years
105
Step 2: Compute your z statistics
ƕ = X - µ
σ
n
= 25 – 23
7
50
= 2
x 7.07
7
= 2.02
Since Z calculated falls outside this range, we reject Ho and conclude that
the true population mean µ is something greater than 23 years. Thus, the
value given by the Times Survey is too low.
106
Is the difference in their average score due to chance?
Solution
Step 1: Set up your hypothesis.
SE D (σ D) = σ1 2 + σ2 2
N1 N2
= 42 + 52
30 40
= 16 + 25
30 40
= 640 + 750
120
= 1390
120
= 11.58
= 3.4
107
Step 3: Compute T
T = X1 - X2
SED
20.5 − 16.2
=
3. 4
4. 3
=
3. 4
= 1.26
df = N1 + N2 - 2
df = 30 + 40 - 2
= 68
108
asked what profession they would like to choose in their adulthood. We
may get a data like the one below.
Table 12.1
Farmer Engineer Pilot Musician
No of 100 500 400 200
Children
In table 12.1, the categories are the profession while the data are the
2
number of children recorded. The task of the χ - test therefore is to test
whether the children are evenly distributed across the professions. Since
we have 1,200 children and four professions ordinarily we expect each
profession to have ¼ (1200) or 300 children and the observed frequencies
2
are the ones in table 12.1. What χ does is to compare the observe
frequency with the expected frequency. If a large difference exists
2
between observed and expected, the χ will have a large value and vice
versa. It is computed by the formula
X 2
=
∑ (o − e ) 2
Df = (r-1) (c-1)
Where r = no of rows
C = no of columns
Example 12.3: Sixty respondents were asked what their opinion is about
convocation of Sovereign National Conference. The data was obtained:
Hausa Igbo Yoruba Total
Support 17 8 5 30
Against 3 12 15 30
Total 20 20 20 60
109
Test the hypothesis that:
Ho : Ethnic affiliation and opinion on SNC are independent
H1: Ethnic affiliation and opinion on SNC are dependent.
Solution: The above given figures are called the observed frequencies.
Step 1: Calculate the expected frequencies thus:
Step 3: Compute χ2
x =∑
2 (O − E ) 2
E
= 15.6 see the procedure in step 2 above.
Step 4: Compare calculated χ2 with tabulated χ2 df = (2-1 ) (3-1) = 2,
since we have 2 rows and 3 columns.
χ2 tabulated (5%, df 2) = 5.99
Step 5: Decision-making
Since χ2 calculates is greater than χ2 tabulated i.e. 15.6 > 5.99, we reject
the Ho and conclude that ethnic affiliation and opinion on the convocation
of Sovereign National Conference are dependent i.e. Whether a
110
respondent is in support or against SNC depends on his/her ethnic
affiliation.
Summary
Post-Test
Same as Pre-Test, attempt the Pre-Test again and compare your answer
with your first attempt.
References
Gupta, C.B. (1982). An Introduction to Statistical Methods. Delhi:
Vikas Publishing House Limited.
111
LECTURE THIRTEEN
Correlation Analysis
Introduction
This lecture examines the measure of association between two variables at
the interval level of measurement. It begins by explaining the use of
scatter diagram and the various forms of association between two
variables. The product moment correlation coefficient and the spearman
rank correlation are also fully examined.
Objectives
After studying this lecture, you should be able to:
1. Illustrate the use of scatter diagram.
2. Calculate the coefficient of correlation ґ.
3. Interpret the value of the correlation coefficient ґ.
Pre-Test
1. The main property of the correlation coefficient (ґ) is that it lies
between
(a) O and 1 (b) O and –1 (c) –1 and +1 (d) -1 and O
112
4. The basic assumption of Pearson’s ґ is based on linearity between the
variables.
Yes or No
5. The straight line that passes through most of the scatter point in known
as
(a) Line of best fit
(b) Line of curvature
(c) Line of correlation
(d) None of the above
CONTENT
Dependent i.e. ґ = + ve
Independent
113
2. Negative correlation or inverse association
Independent
Dependent i.e. ґ = O
Independent
114
Let us illustrate the use of this table with an example.
Example 10:1: Given the data below, calculate the Pearson’s correlation
coefficient ґ between X and Y and interpret your result.
X 1 3 4 6 8 9 11 14
Y 1 2 4 4 5 7 8 9
Solution: Although the question does not ask you to plot scatter diagram,
but you may plot it to have a rough idea of what the relationship is.
X
2 4 6 8 10 12 14
From the Scatter plot, it is clear that a positive relationship exists between
X and Y.
Step 2: Arrange your data in the table format given above.
X Y X2 Y2 XY
1 1 1 1 1
3 2 9 4 6
4 4 16 16 16
6 4 36 16 24
8 5 64 25 40
9 7 81 49 63
11 8 121 64 88
14 9 196 81 126
56 40 524 256 364
115
Note that: adding together values in column X gives ΣX = 56
, , , , , Y gives ΣY = 40
, , , , , X2 gives ΣX2 = 524
, , , , , Y2 gives ΣY2 = 256
, , , , , XY gives ΣXY = 364
= 2912 - 2240
= 672
(1056) (448)
= 672
473088
116
672
=
687.8
= 0.977
= 0.97
Comment: The value of the correlation coefficient is very large. This
shows that there is a very high linear correlation between X and Y. An
increase in one of the variables is accompanied by a proportional increase
in the other.
6∑ D 2
rs = ρ = 1 − ……… 13.2
(
N n −1
2
)
Where d = difference between ranks of corresponding values of X and Y.
Example 10.2: Two different lecturers graded the exam scripts of eight
students. The marks recorded are as follows.
Candidate A B C D E F G H
Scores awarded by 46 37 71 64 48 55 35 36
lecture 1
Scores awarded by 48 42 68 63 60 58 41 40
lecture 2
117
Solution
Step 1: Rank the 2 scores using R1 for rank of scores awarded by lecturer 1
and R2 for scores awarded by lecturer 2.
A B C D E F G H
Lecture 1 46 37 71 64 48 55 35 36
Lecture 2 48 42 68 63 50 58 41 40
R1 4 3 8 7 5 6 1 2
R2 4 3 8 7 5 6 2 1
D= R1 - R2 0 0 0 0 0 0 -1 1
D2 = (R1 - R2) 2 0 0 0 0 0 0 1 1
6∑ D 2
= 1−
ґs
(
N n2 −1 )
6 x2
= 1−
(
8 82 − 1 )
12
= 1−
8(63)
12
= 1−
504
118
504 − 12
=
504
492
= = 0.9902
504
The rank correlation coefficient ґs is very high and positive. This shows
that there is a very close tie between the marks awarded by the two
lecturers.
Summary
Post-Test
Same as the Pre-Test.
References
Aggarwal, Y.P. (1998). Statistical Methods: Concepts, Application
and Computation. New Delhi: Sterling Publishers PVT Limited.
119
LECTURE FOURTEEN
Regression Analysis
Introduction
In the last lecture, we considered the measure of relationship between two
variables. In this lecture, we shall discuss a measure of the impact one
variable (called the independent) has on another variable (the dependent).
We shall also discuss how to make predictions based on the past evidence
of the two variables.
Objectives
After studying this lecture, you should be able to:
1. Calculate simple regression coefficient and thus fit a regression line.
2. Predict future values of the variables using the regression line.
Pre-Test
1. In the straight-line equation Y = a + bx, we say.
(a) Y depends on a
(b) Y depends on b
(c) Y depends on X
(d) None of the above
120
4. In the regression line Y = a + bx, if the value of a = 5, b = 3 and X =
25, predict the value of Y.
(a) 40 (b) 100 (c) 75 (d) 80
CONTENT
Y = a + bx + e ……. 14.1
Equation 14.1 is called the regression line. In that equation, Y is called the
dependent variable, X is called the independent variable, a is called the
intercept (constant) and b is called the regression coefficient. e, called the
error term is often ignored. Equation 11.1 also signifies that we are
predicting Y from X. For instance if we are interested in knowing the
relationship between income and level of savings, we may use equation
14.1 as follows:
Y = a + bx +e
Y = Amount of savings
X = Income level
121
How do we compute a and b?
The procedure is to compute b first, and later on compute a.
The computation formula is as given below:
N ∑ XY − (∑ X )(∑ Y )
b= ……..….(14.2)
N ∑ X 2 − (∑ X )
2
a = Y − b x ………………....(14.3)
Where Y =
∑Y i.e. mean of Y
N
X =
∑X i.e. mean of X
N
Solution
(a) Scatter plot of X and Y
Y
6
4 ▪▪
2 ▪ ▪▪
▪ ▪▪
2 4 6 8 10 12 14 X
122
We can draw a straight line to pass through some of the point, although
there are outliers.
(b) the required regression line is:
Y = a + bx
N ∑ XY − ∑ X ∑ Y
b =
N ∑ X 2 − (∑ X )
2
=
(8 x315) − (83x 27 )
(8 x949) − (83)2
2520 − 2241
=
7592 − 6889
279
=
703
= 0.3968
123
Step 4: Compute the value of a:
a = Y − bX
=
∑Y − b ∑ X
N N
27 83
= − (0.3963
8 8
= 3.375 − (0.3968)(10.037)
= 3.375 − 4.1168
= 0.7418
Interpretation
The regression line Y = - 0.7418 + 0.3968X signifies that, when an
employee receives a salary of zero, that is when X = 0, he / she will be
indebted by a sum of #741.80k. Remember a is the intercept of the
regression line. Also remember that the figures are given in thousand naira
in the question. Furthermore, the regression coefficient b = 0.3968 shows
that an employee’s savings will increase by a factor of #396.80k whenever
the salary increases by a factor of #1,000.
124
The student should note that the foregoing regression line, equation (14.1)
Measures the impact of X on Y (or predicts Y from X). If we are
interested in the impact of Y on X or (predicting X from Y), we shall use
the equation:
X = a + bY ……… 11.4
All the terms have the same meaning, as is equation 14.1 except that here,
X is the dependent variable while Y is the independent variable.
Summary
125
Post-Test
Same as pre-test.
References
Aggarwal, Y.P. (1998). Statistical Methods: Concepts, Application
and Computation. New Delhi: Sterling Publishers PVT Limited.
126
LECTURE FIFTEEN
Introduction
Our discussion up to date has been centred on using manual calculators,
pen and sheets of paper in estimating parameter values. However, if we
have a large volume of data before us, which is usually the case with
survey research, it may be tedious to use manual calculators. Thus, in this
chapter we introduce the use of computer in Data analysis.
Objectives
After studying this lecture, you should be able to:
1. Prepare a good questionnaire for survey research.
2. Code information collected by the questionnaire method.
3. Prepare data for analysis and use a computer package to analyse
your data.
Pre-Test
1. List out 3 benefits of the use of computer in data analysis.
2. Outline the various steps involved in a qualitative research.
3. Describe how you would gain access into an SPSS worksheet.
CONTENT
127
meaningful. Many scholars have suggested the following basic steps in a
qualitative research. (See Alan Bryman & Duncan Grammar 1997 p. 3).
Theory
Hypothesis
Findings
Operationalization of
Analysis concept
Data collection
Survey
The starting point is the theory. The next stage is the setting up of relevant
hypotheses. The hypotheses are propositional statements hence they are
made up of concepts. These concepts must be measured. You must
develop measures for these concepts i.e. operationalize them. A faulty
operational definition will produce a wrong measure. You may do a pre
survey test to try your measure. Then you go out to the field to conduct
your survey by carefully selecting your sample. Collect your data through
the various devices we have discussed in this course. Before the data can
be analysed it must be prepared in a computer ready format (i.e. coding).
128
This stage will be discussed further in this lecture. The result of your
analysis will validate or disprove your hypothesis. Thus your findings can
then be fed back into the theory (see Alan Bryman).
Computer Appreciation
We are making a very big assumption here:
The assumption is that you are computer literate. You do not have to be a
computer “guru” in order to use statistical packages. Nonetheless,
knowledge of computer operating systems is desirable.
The skills you need are within the neighbourhood of what we listed below:
As a beginner, you should be able to:
For items 2 and 6 you need to use the mouse to click. No special skill is
required.
What you will discover about the computer is that the windows
environment is interactive, so even when it appears you cannot go further,
the Help Window will come to your rescue.
129
The rows and columns are numbered serially and there are several rows
and several columns on a worksheet. The boxes are called cells. By
coding we mean specifying what figure goes into what cell for which
variable.
Let us use a typical questionnaire to illustrate coding technique.
QUESTIONNAIRE
Designed to measure the relationship between ethnicity and voting
behaviour (not in details)
Que 1: 16-25 26-35 36-45 46-55 > 56
Age:
Que: 2
Christianity Islam Traditional Others
Religion
Que 3:
Yoruba Hausa Igbo Others
Etnic group
Que 4:
Occupation Employed Unemployed Student
Que 5:
What party do you belong to?
AD APP PDP No Party
Que 6:
What Party would you vote for in the next election?
AD APP PDP No Idea
You should note that in question (1) age is measured on interval scale but
by grouping the respondents into age brackets, you have reduced it to an
ordinal level. Question 2-6 are all measured on Normal scale.
Your coding guide will appear as follows:
Que 1: Age: 1 = 16-25
2 = 26-35
3 = 36-45
130
4 = 46-55
5 = Above 56
Que. 3 1 = Yoruba
2 = Hausa
3 = Igbo
4 = Others
Que 4: 1 = Employed
2 = Unemployed
3 = Student
Que: 5 1 = AD
2 = APP
3 = PDP
4 = No Party
Que. 6 1 = AD
2 = APP
3 = PDP
4 = No Idea
Now assuming you pick the first questionnaire already filled and returned
from the respondent, and the person is 37 years old, a Muslim, a Yoruba
man, employed, a member of AD, would prefer to vote for AD in the next
election. Then you will feed in the data into your data file on the
computer. Thus figure 15.1 will become:
V1 V2 V3 V4 V5 V6
1 3 2 1 1 1 1
2
3
Figure 15.2
131
Notice that the title of the column have been changed to V1, V2, …….V6.
What we have done is define our variables (i.e. give them names that the
computer will use to recognize them. Instead of using V1 for Ques 1, you
may decide to use Q1. It is flexible. Just click on DEFINE VARIABLE
and the dialogue box that appears will be interactive. You must also
specify the value labels and the column size of each of your variables. All
these are contained in the dialogue box, just click as desired.
Now we want to assume you have treated all your returned
questionnaire and that you have key in the data into your data file (fig
15.2). A very important information at this stage is that you should save
your working files as you go along. Give it a suitable file name that you
can easily remember.
132
Package for the Social Sciences (SPSS). The latest version of the package
is what you need to install on your system.
Summary
Post-Test
1. List out 3 benefits of the use of computer in data analysis.
2. Outline the various steps involved in a quantitative research.
3. Describe how you would gain access into an SPSS worksheet.
References
Bryman, A. and D. Cramer (1997). Quantitative Data Analysis
with SPSS for Windows. London: Routledge.
133
Answer to Pre-Test Questions
Lecture One
1. c
2. b
3. d
4. b
5. a
Lecture Two
1. c
2. a
3. d
4. b
5. a
Lecture Three
1. c
2. a
3. d
4. b
5. b
(ii) Normal
Reason: we can only name the different ideological positions of students,
we cannot really say which ideology is better than others.
(iii) Normal
134
Reason: we can only name the difference ideological positions of students,
we cannot really say which ideology is better than others.
Lecture Four
1. a
2. c
3. a
4. b
5. c
Lecture Five
1. d
2. b
3. b
4. a
5. c
Lecture Six
1. b
2. c
3. d
4. b
5. a
Lecture Seven
1. c
2. a
3. b
4. c
5. b
Lecture Eight
1. d
2. c
3. b
4. a
5. d
135
Lecture Nine
1. d
2. b
3. a
4. c
5. a
Lecture Ten
1. a
2. No
Lecture Eleven
1. b
2. a
3. d
4. a
5. b
Lecture Twelve
1. a 2. a 3. b 4. c 5. d
Lecture Thirteen
1. c 2. a 3. No 4. Yes 5. a
Lecture Fourteen
1. c 2. a 3. c 4. d 5. b
Lecture Fifteen
1. Accuracy, reliability and Efficiency
2. Statement of inquiry or problem formulation
- Collection of data
- Organisation of data
- Analysis of data
- Interpretation of result
136
- The Mouse
- Switch on the computer system.
- Allow the window operating system to run.
- Click on the SPSS icon.
- Click on input new data.
(a)
P&S
9%
Education
14%
Defence
43%
Agric
21%
Health
54%
137
(b) Defence was the highest priority because it took the highest proportion
of the total budget
40
Allocation
US’m dollar
30
20
10
E H D A P&S Sector
Sector
2 (i)
Class Tally f Class Boundary Class Mark Cum.
Freq
111-120 I 1 110.5-120.5 115.5 1
121-130 IIII 4 120.5-130.5 125.5 5
131-140 IIII IIII II 12 130.5-140.5 135.5 17
141-160 IIII IIII IIII 15 140.5-150.5 145.5 32
151-160 IIII IIII 9 150.5-160.5 155.5 41
161-170 IIII III 8 160.5-170.5 165.5 49
171-180 I 1 170.5-180.5 175.5 50
138
2 (ii) Bar chart
Fre
15
10
Freq Histogram
15
10
139
2 (iii)
Cum. freq
50 * *
40
Ogive *
30 *
20
10 *
* *
110.5 120.5 130.5 140.5 150.5 160.5 170.5 180.5 Class boundary
Reference
140
Appendix I
Table I: Normal Curve Areas
Z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2517 .2549
0.7 .2580 .2611 .2642 .2673 .2704 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4761 .4767
2.0 .4772 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4864 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4979 .4980 .4981
2.9 .4981 .4982 .4982 .4983 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4987 .4987 .4987 .4988 .4988 .4989 .4989 .4989 .4990 .4990
* Adapted from OTT, Lyman, R.F. Larson and W. Mendenhall (1983) STATISTICS: A
tool for the Social Sciences. (3rd ed) BOSTON: Duxbury press.
141
Appendix II
Table II: Percentage Points of the t Distribution
142
Appendix III
Table II: Percentage Points of the Chi-square Distribution
143
30 13.7867 14.9535 16.7908 18.4926 20.5992
40 20.7065 22.1643 24.4331 26.5093 29.0505
50 27.9907 29.7067 32.3574 34.7642 37.6886
60 35.5346 37.4848 40.4817 43.1879 46.4589
144
40.2560 43.7729 46.9792 50.8922 53.6720 30
51.8050 55.7585 59.3417 63.6907 66.7659 40
63.1671 67.5048 71.4202 76.1539 79.4900 50
74.3970 79.0819 83.2976 88.3794 91.9517 60
145