Statistics

DEBRE MARKOS UNIVERSITY
COLLEGE OF BUSINESS AND ECONOMICS

DEPARTMENT OF ECONOMICS
Learning Material for Distance Students
Course Name: Introduction to Statistics

Course Code: Econ1041
Credit Hours: 3
PREPARED BY:
MULUGETA DEREJE (M.Sc.)
YECHALE GETU (M.Sc.)
HAYMANOT BETSEHA (M.Sc.)
FEBRUARY, 2017
DEBRE MARKOS, ETHIOPIA
Table of Contents
CHAPTER ONE: INTRODUCTION
1.1. Definition of Statistics
1.2. Types of statistics
1.3. Why we study statistics?
1.4. Uses of statistics
1.5. Users of statistics
1.6. Application of statistics
1.7. Limitations of statistics
1.8. Steps of statistical investigation
CHAPTER TWO: SAMPLING THEORY
2.1. Basic concepts of sampling theory
2.2. Reasons for sampling
2.3. Sampling methods
2.3.1. Probability sampling
2.3.2. Non-probability sampling
2.4. Errors in sampling
CHAPTER THREE: DATA COLLECTION AND PRESENTATION
3.1. Definition of data
3.2. Classification of data
3.3. Method of data collection
3.4. Method of data presentation
3.4.1. Tabular method of data presentation
3.4.2. Graphic method of data presentation
3.4.3. Diagrammatical method of data presentation
CHAPTER FOUR: MEASURES OF CENTRAL TENDENCY
4.1. Types of measures of central tendency
4.1.1. Mean: classification and properties
4.1.2. Median
4.1.3. Mode
4.2. Distribution, Shape and Measures of Central Tendency
4.3. Positional Measures
CHAPTER FIVE: MEASURES OF DISPERSION
5.1. Types of Measures of Dispersion/Variation
5.1.1. Range
5.1.2. Quartile Deviation
5.1.3. Mean Deviation
5.1.4. Variance and Standard Deviations
5.1.5. Coefficient of variation
5.2. Moments, Skewness and Kurtosis
CHAPTER SIX: SIMPLE LINEAR REGRESSION AND CORRELATION
6.1. Simple Linear Regression
6.1.1. The Scatter Diagram
6.1.2. The regression equation
6.1.3. The regression of X on Y
6.2. Correlation
CHAPTER SEVEN: ELEMENTARY PROBABILITY
7.1. Introduction
7.2. Definition of basic terms
7.3. Basic rules of probability
7.4. Conditional probability
7.5. Counting procedures
7.6. Probability distribution and random variables
References
COURSE DESCRIPTION
We use statistical concepts intuitively in our daily lives and, believe it or not, we all think
statistically. Think how many times you have decided to take a jacket with you because
you predicted it would be cold later in the day, or how many times you have given blood for
a medical test in a laboratory. In fact, modern society is driven by statistics. This is an
introductory course that helps students gain preliminary knowledge of statistical tools,
methods and their applications. Data and probability-related issues will be addressed. As the
course progresses, emphasis will be given to sampling theory, data collection and
presentation, measures of central tendency and variation, linear regression and elementary
probability theory. The rationale for providing Introduction to Statistics is to equip you with an
arsenal of techniques for understanding Statistics for Economists, which focuses on probability
theory, parameter estimation and hypothesis testing.
COURSE OBJECTIVES:
Upon completing this course, you will be able to:

 Explain the basic concepts of Statistics;
 Collect and organize statistical data;
 Identify the different types of sampling techniques;
 Analyze and conclude based on the collected data; and
 Understand the basics of introductory probability theory.
There are a number of symbols and their representation in the course material:
This tells you there is an introduction to the module, unit or section.
This tells you there is a question to answer or think about in the text.
This tells you there is an activity to do.
This tells you to note and remember important points.
This tells you there is a checklist of the main points or terms.
This tells you there is a self-test for you to do
This tells you there is a written assignment.
This tells you that these are the answers to the activities and self-test questions.
CHAPTER ONE: INTRODUCTION
Chapter objectives:
By the end of this chapter, you will know:
 The definition of statistics

 Types of statistics
 Types of variables
 Limitations of statistics
1.1.Definition of Statistics
Dear learners, would you please define Statistics?
____________________________________________________________________________
Good!
Statistics has two meanings. Let’s start with its layman definition. In the more common usage,
statistics refers to a collection of numerically expressed facts or data.
Examples:
The number of hotels in a city;

The number of students in a university;
Per capita income statistics;
Statistics of imports, exports, consumption, etc;
However, the subject of statistics has a much broader meaning than just collecting and publishing numerical
information.
Therefore, we define statistics as the science of collecting, organizing, presenting, analyzing, and
interpreting numerical data to assist in making more effective decisions.
According to Dominick Salvatore and Derrick Reagle “statistics refers to collection, presentation, analysis
and utilization of numerical data to make inferences and reach decisions in the face of uncertainty in
economics, business and other social and physical sciences.”
As the definition suggests:
 The first step in investigating a problem is to collect data.

 The data must be organized in some way and perhaps presented in a chart.
 Only after the data have been organized and presented can we analyze and interpret them.
Example: If economics students at a university would like to know the monthly household income of
200 residents in Debre Markos town, then they
I. Have to collect the data, that is, the income of the households under study,
II. Should organize the data (say, by arranging them in ascending or descending order),
III. Should present the data using charts, tables, etc.,
IV. And finally, should do some analysis (say, find the mean, median, mode, variance,
standard deviation, etc.) and interpret the results.
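The four steps above can be sketched in Python. This is only a minimal illustration: the income figures are made up for the sketch (they are not survey results from Debre Markos), and the variable names are our own.

```python
import statistics

# Step I: collect the data (hypothetical monthly household incomes in birr)
incomes = [3000, 1500, 2500, 1500, 4000, 2000, 1500, 3500]

# Step II: organize the data in ascending order
ordered = sorted(incomes)

# Step III: present the data as a simple frequency table
frequency = {value: ordered.count(value) for value in sorted(set(ordered))}
for value, count in frequency.items():
    print(f"{value:>6} birr: {count} household(s)")

# Step IV: analyze the data with summary measures
summary = {
    "mean": statistics.mean(ordered),
    "median": statistics.median(ordered),
    "mode": statistics.mode(ordered),
    "variance": statistics.variance(ordered),  # sample variance
    "std_dev": statistics.stdev(ordered),      # sample standard deviation
}
print(summary)
```

Interpreting the output (for instance, noting that the mean is pulled above the median by the few high incomes) is the final part of step IV.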
1.2.Types of Statistics
Dear distance Learner! Can you guess the types of statistics?
_______________________________________________________________________________________
______________________________________________________________________________________
Excellent! The study of statistics is usually divided into two categories:
I. Descriptive Statistics
 It is a statistical method that deals with describing (summarizing) a given set of data
without making inferences about the larger population.
 It involves collection, organization and presentation of data in an informative way.
 Tables, graphs and numerical summary measures may be used to describe data.
 In descriptive statistics, the statistician tries to describe a situation.
Examples of descriptive statistics:
 Consider the national census conducted by the Ethiopian government in 1999 E.C.
Results of this census give the average age, average household income, and other
characteristics of the Ethiopian population; these are descriptive statistics.
 A survey found that 51% of the population in Ethiopia is female. The statistic
51 describes the number out of every 100 persons who are female.
 According to Consumer Reports, Sony TV owners reported 2 defective TVs per 100 TVs
(2%) in 2001. The statistic 2 (2%) describes the number of defective TVs out of every 100
TVs.
 According to the bureau of labor statistics, the average daily wage of workers in a
town was birr 15 in August 2007.
 The GDP of country X was 100 million in 2010 and 140 million in 2016. If we calculate
the percentage growth of GDP from 2010 to 2016, that is still a descriptive statistic.
What is the percentage growth of GDP from 2010 to 2016?
[Answer: $\frac{140 - 100}{100} \times 100\% = 40\%$]
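The GDP growth calculation can be checked with a short Python sketch. The function name `percentage_growth` is our own choice for illustration; the two GDP figures are the ones given in the example.

```python
def percentage_growth(initial, final):
    """Return the percentage change from initial to final."""
    return (final - initial) * 100 / initial

gdp_2010 = 100  # million, from the example
gdp_2016 = 140  # million, from the example

growth = percentage_growth(gdp_2010, gdp_2016)
print(f"GDP growth from 2010 to 2016: {growth:.0f}%")
```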
Question: Would it be descriptive statistics if we used this GDP growth rate (40%) to estimate
the GDP of country X in the year 2017? Why? What type of statistics is it?
II. Inferential Statistics
 It is also called statistical inference or inductive statistics.
 It is a statistical method that involves taking a sample from a population, computing the
statistic based on the sample, and inferring from the statistic about the value of the
corresponding parameter.
 It is a branch of statistics that is used to determine something about the population on the
basis of a sample taken from that specific population
 It is a decision, estimate, prediction, or generalization about a population, based on a
sample.
Examples:
 The accounting department of a large firm will select a sample of the invoices to check for
accuracy for all the invoices of the company.
 Wine tasters sip a few drops of wine to make a decision with respect to all the wine waiting to be
released for sale.
Dear distance Learner! Can you guess the difference between population and sample? What about
parameter and statistic? ____________________________________________________________________
________________________________________________________________________
Good!
Note the words “population” and “sample” in the definition of inferential statistics.
A population is a collection of all possible individuals, objects or measurements of interest. When a
researcher gathers data from the whole population for a given measure of interest, it is called a census
(complete enumeration).
A sample is a portion or part of the population of interest.
When we discuss inferential statistics, we have to differentiate between a parameter and a statistic.
A parameter is a value calculated from a population (say, population mean, population standard deviation,
etc.) and a statistic is a value calculated from a sample (say, sample mean, sample standard deviation, etc.).
The difference between a sample statistic and its corresponding parameter is called sampling error.
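The relationship between a parameter, a statistic and the sampling error can be illustrated with a small Python sketch. The population below is hypothetical (ten made-up household incomes), and the random seed is fixed only so that the sketch gives the same sample each time it is run.

```python
import random
import statistics

# Hypothetical population: monthly incomes (birr) of all 10 households in a small village
population = [1200, 1500, 1800, 2000, 2200, 2500, 2700, 3000, 3300, 3800]

# Parameter: a value calculated from the whole population (a census)
population_mean = statistics.mean(population)

# Sample: a part of the population, drawn at random with a fixed seed
rng = random.Random(1)
sample = rng.sample(population, 4)

# Statistic: the same measure calculated from the sample only
sample_mean = statistics.mean(sample)

# Sampling error: the statistic minus the corresponding parameter
sampling_error = sample_mean - population_mean
print(population_mean, sample_mean, sampling_error)
```

In practice the parameter is usually unknown, which is exactly why a sample is taken; the population is fully listed here only so that the sampling error can be seen.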
Example on sample vs. population:
i. If we want to do research on the impact of high school GPA (transcript result) on the college GPA of
economics students at a university, the population is all economics students at that university.
ii. A researcher may select all economics students at Debre Markos University as a sample to study
the impact of high school GPA on college GPA and infer (conclude) something about the impact of
high school GPA on the college GPA of economics students at all Ethiopian colleges/universities.
Exercise
The marketing department of a bank asked a sample of 1960 customers to try a newly developed banking
system. Of the 1960 customers sampled, 1176 said they would use the new system if it were marketed. What would the
marketing department report to the bank officials regarding the acceptance of the new system in the
population? Is this an example of descriptive or inferential statistics?
Solution:
Based on the sample of 1960 customers, we estimate that, if it is marketed, sixty percent
(1176/1960 × 100%) of all customers will use the new system. This is inferential statistics, because a sample
was used to draw a conclusion about how all customers in the population would react if the new system
were marketed.
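The arithmetic in the solution can be reproduced with a few lines of Python. The variable names are our own; the two counts come from the exercise.

```python
sample_size = 1960  # customers asked to try the new system
would_use = 1176    # customers who said they would use it

# Sample proportion, used as the estimate for the whole customer population
proportion_percent = would_use * 100 / sample_size
print(f"Estimated acceptance among all customers: {proportion_percent:.0f}%")
```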
1.3.Why we study Statistics?

Statistics is required for many college programs such as business, economics, engineering, psychology,
medicine, etc. The course content is basically the same; the biggest differences are the examples used and the
level of mathematics required. Colleges of business and economics usually teach the
course at a more applied level.
Thus, in business and economics, we are interested in such things as:
 Profits of firms (revenue minus cost),

 Gross Domestic Product (GDP) of a nation,
 Demand, supply, consumption, cost, and wages, etc.
Dear distance learners, why is statistics required in so many fields of study?
____________________________________________________________________________
Well!
We are studying statistics for the following reasons:
A. The first reason is that numerical information is everywhere.
 If you look at the magazines in our country, Ethiopia, you will find a lot
of numerical information such as exchange rates (say, $1 = 22 birr), unemployment rates (say,
8% in Debre Markos $= \frac{\text{unemployed}}{\text{labor force}}$), per capita income of Ethiopia
($600 $= \frac{\text{Gross National Income}}{\text{Population}}$), consumption rate of beer, export of flowers,
import of cloth, rate of inflation, demand for cars, enrollment rates of high schools, etc.
 Therefore, to be an educated consumer of this information, an understanding of the
concepts of basic statistics will be useful.
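The two ratios mentioned above (unemployment rate and per capita income) can be written as small Python functions. The function names and the input figures are hypothetical; the inputs were chosen only so that the results match the 8% and $600 quoted in the text.

```python
def unemployment_rate(unemployed, labor_force):
    """Unemployed persons as a percentage of the labor force."""
    return unemployed * 100 / labor_force

def per_capita_income(gross_national_income, population):
    """Gross National Income divided by the population."""
    return gross_national_income / population

# Hypothetical inputs, for illustration only
rate = unemployment_rate(unemployed=4_000, labor_force=50_000)
income = per_capita_income(gross_national_income=60_000_000_000, population=100_000_000)
print(f"Unemployment rate: {rate:.0f}%")
print(f"Per capita income: ${income:.0f}")
```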
B. Researchers and/or students may be called on to conduct research in their fields, since
statistical procedures are basic to research.
 To accomplish this, they must be able to design experiments; collect, organize, analyze and
summarize data; and possibly make reliable predictions or forecasts for future use.
 They must also be able to communicate the results of the study in their own words.
C. Students, like professionals, must be able to read and understand the various statistical
studies performed in their field. To have such understanding, they must be knowledgeable
about the vocabulary, concepts and statistical procedures used in these studies.
D. Data are everywhere and, no matter what your future line of work, you will make decisions that
involve data. An understanding of statistical methods will help you make these decisions more
effectively.
1.4.Uses of Statistics
Dear distance learners, could you explain the use of Statistics?

_______________________________________________________________________________________
__________________________________________________________________________________
Good!!
The importance of statistics is clearly stated in the following words of Carol D. Wright of the USA: “To a very striking
degree, our culture has become a statistical culture. Even a person who may never have heard of an index
number is affected by those index numbers which describe the cost of living. It is impossible to
understand psychology, sociology, economics, business, finance, or physical science without some general
idea of the meanings of an average, of variations, of sampling, of how to interpret charts and tables.”
According to H.G. Wells, “statistical thinking will one day be as necessary for effective citizenship as the
ability to read and write.”
The main function of statistics is to enlarge our knowledge of complex phenomena. That is:
i. It presents facts in a definite and precise form. Example: Instead of saying that the per capita income
of Ethiopia is low, it is better and clearer to say that it is 110.
ii. It reduces data: i.e. it simplifies a complex mass of data and presents it in a few, clear, and useful
summaries. The bulky data may be summarized in totals, averages, percentages, etc.
iii. It measures the magnitude of variation in data.
iv. It furnishes a technique of comparison.
v. It helps to estimate an unknown population parameter from a sample.
vi. It helps to formulate and test hypotheses.
vii. It helps to study the relationship between two or more variables.
viii. It helps to forecast future events.
1.5.Users of Statistics
Dear distance learners, who are the users of statistics?
____________________________________________________________________________
Well!
 Most people become familiar with statistics through radio, television, newspaper, and
magazines and statistical methods are used in almost all fields of human endeavor.
 Statistical methods help people identify and solve many problems concerning the environment,
the economy, transportation, public health and other matters of public concern.
 Economists use statistical techniques to predict future economic conditions, to understand
economic problems, to formulate economic policies, to do research in the areas of economics,
to do market analysis, etc.
 Doctors use such methods to determine whether certain drugs help in the treatment of medical
problems.
 Weather forecasters use statistics to help them predict the weather more accurately.
 Engineers use it to set standards for product safety and quality.
 Statistical ideas help scientists design effective experiments.
 Lawyers are increasingly turning to statisticians to help weigh evidence and determine
reasonable doubt.
 In education, the researchers might want to know if new methods of teaching are better than the
old ones.
1.6.Application of Statistics in Business and Economics
Dear distance learners, could you explain the applications of statistics to Business and Economics?
_______________________________________________________________________________________
__________________________________________________________________________________
Good!!
 Now-a-days the success of a particular business or industry very much depends on the accuracy
and precision of statistical analysis.
 Before undertaking a new venture or improving an existing one, the
business executives must have a large number of quantitative facts. Examples:
   - cost of raw materials,
   - demand for products in the market,
   - price of products in the market,
   - various taxes to be paid,
   - labor conditions,
   - sales forecast.
 All these facts are to be analyzed statistically before stepping in for a new enterprise or before
fixing the price of a commodity.
 Statistical methods are now used
   - for exploring possibilities for advertising campaigns,
   - for adjustment of production methods, and
   - as an aid in establishing standards.
 Statistical techniques help in forecasting future markets.
 Market research and market surveys by statistical sampling methods are now extremely useful for
any business person.
 In industry, statistics is widely used in quality control.
 In production engineering, statistical tools such as inspection plans, control charts, etc. are of
great use in finding out whether a product conforms to specifications.
 Wide application of statistics can be found in insurance companies where the premium rates are
fixed on the basis of mortality, average length of life, possibilities of investment, etc.
1.7 Limitations of statistics
Dear distance learners, can you explain the limitations of statistics?
____________________________________________________________________________
Well!
 Statistics deals only with quantitative information, i.e. the information should be capable of
being expressed numerically, either directly or indirectly.
 Statistics deals only with aggregates of facts and not with individual data items.
 Statistical results are only approximately correct, not mathematically exact.
 Statistics can be easily misused and, therefore, should be used only by experts.
Misuse of statistics
Knowingly:
 Advertising media
 Government for political cause
 Inappropriate comparison
 Incomplete information
Unknowingly:
 Lack of knowledge in
   - Statistics
   - The subject matter to which it is applied
1.8.Steps of Statistical investigation

A statistical study involves the following stages:
i. Determining the objective of the study;
ii. Collecting the data;
iii. Organizing the collected data;
iv. Presenting the data;
v. Analyzing the data; and
vi. Interpreting the results of the study and making recommendations.
Types of Variables
Dear distance learners, what is a variable?
____________________________________________________________________________
Well!
 A variable is a measurable characteristic of a given phenomenon (object, process, event, etc.) that
can take different values across the elements of a population or sample. In other words, it is a
characteristic of each element of a population or a sample.
Examples:
 annual income (it can be Birr 200, Birr 300, Birr 400, or any other value),
 quantity demanded (it can be 200 units, 300 units, 400 units, or any other value),
 price (it can be Birr 2 per unit, Birr 4 per unit, Birr 10 per unit or any other value),
 gender (female or male), etc.
 Data (singular datum):
 are the set of values collected for the variable from each of the elements of the sample
 are the actual measurements or observations that result from an investigation or survey
 are the values (response) of the variable associated with an element of a population or a
sample.
Example:
 The variable monthly household income of a family in a town can assume different values
(say, Birr 1000, Birr 3000, etc). But if we collect the monthly household income of 100
households then the values are called data.
 Data set: a collection of data values. Example: the monthly household incomes of 100
residents in a town are called a data set.
 Raw data: data collected in their original form (not yet organized).
 Information: a set of data corresponding to a specific aspect of knowledge, combined in an
organized way. Information is processed data that can be used directly. It can transfer knowledge and
meaning.
[Diagram: (1) Input (raw data) → (2) Process (organize the data, enter them into the computer, find the mean) → (3) Information (says something to the user and is meaningful to the user)]
From the point of view of statistical methods, variables can be broadly classified into qualitative (or
categorical) and quantitative (or numerical) variables.
Qualitative Variable:
 When the characteristic being studied is non-numeric, the variable is called a qualitative variable or
an attribute.
 It is a variable or characteristic which cannot be measured in quantitative form but can only be
identified by name or category.
 Examples include gender, religious affiliation, type of automobile owned, place of birth, eye color,
etc.
 When the data are qualitative, we are usually interested in how many or what proportion fall in each
category. For example, what percent of the population is male? What percent of the population
owns a Nokia mobile phone?
 Note that although numerical codes can be assigned to the different categories of a qualitative
variable, arithmetic operations (addition, subtraction, multiplication and division) on those codes are
not meaningful.
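Counting how many observations fall in each category, and what percentage that is, can be sketched in Python as follows. The ten responses are hypothetical and serve only to illustrate the idea.

```python
from collections import Counter

# Hypothetical responses for a qualitative variable (gender)
responses = ["female", "male", "female", "female", "male",
             "female", "male", "female", "male", "female"]

# How many observations fall in each category
counts = Counter(responses)

# What percentage of all observations falls in each category
total = len(responses)
percentages = {category: count * 100 / total for category, count in counts.items()}

for category in counts:
    print(f"{category}: {counts[category]} ({percentages[category]:.0f}%)")
```

Note that the sketch only counts categories; it never adds or averages the category labels, in line with the point made above about arithmetic on qualitative data.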
Quantitative Variable:
 It is a variable that can be measured and expressed numerically.
 Examples: balance in your checking account, minutes remaining in class, number of children in a
family, time taken to finish an exam, etc. Quantitative variables can be classified as either discrete or
continuous.
1) A discrete variable can only assume certain values, and there are usually “gaps” between the values.
Discrete variables take values such as 0, 1, 2, 3, 4, etc. They are said to be
countable and typically result from counting. Examples: the number of bedrooms
in a house, the number of cars sold at a car market, etc.
2) A continuous variable can assume any value within a specified range. Examples: the pressure in a
tire, the weight of a stone, the height of students in a class, the distance from Debre Markos to
Bahir Dar, age, temperature, etc. Typically, continuous variables result from measuring something,
and therefore their recorded values are rounded to the precision of the measuring device.
Review exercises
1) In each of these statements, tell whether descriptive or inferential statistics have been used.
a) In the year 2015, the enrolment rate of elementary schools in Ethiopia will be 100%.
b) The average household income for people aged 25-34 is birr 2000/month.
c) Drinking coffee may raise cholesterol levels by 7%.
d) Some economists say that National Bank of Ethiopia (NBE) may increase the interest rate on
deposits to lower the money supply of the economy.
2) Classify each of the following variables as qualitative or quantitative.
a) Color of the automobile
b) Number of desks in classrooms
c) Gender (1=female, 0=male)
d) Number of pages in a book
3) Classify each of the following variables as discrete or continuous.
a) Water temperature of the Sauna at a given health spa
b) Income of a household
c) Life time of batteries in a tape recorder
d) Weights of newborn infants at a certain hospital
4) Consider the following:
Selling price of a house depends on the following factors:
a. Number of bedrooms
b. Size of the house in square feet
c. Swimming pool (1=yes, 0=no)
d. Distance from the center of the city
e. Township
f. Garage attached (1=yes, 0=no)
g. Number of bathrooms
Which of the variables given above are qualitative and which are quantitative? Why?
5) Briefly explain the difference between the following concepts and give examples, if necessary.
a) Qualitative variable vs. quantitative variable
b) Quantitative data vs. qualitative data
c) Descriptive statistics vs. inferential statistics
d) Sampling vs. census
e) Parameter vs. statistic
f) Sample vs. population
6) Describe the importance of Statistics for an Economist.
7) Select a newspaper article (say, from the Ethiopian Herald) that involves a statistical study and write a paper
answering the following questions.
a. Is the study descriptive or inferential in nature? Explain your answer.
b. What are the variables used in the study? Classify the variables as qualitative or
quantitative.
8) Which one of the following is not true?
a. Population is sometimes referred to as the universe
b. The height of Ras Dashen mountain, 4440 m, can be considered a continuous variable
c. The ages of students at Debre Markos University is a variable
d. None
9) The difference between the sample mean and the population mean is called
a) Population mean
b) Population standard deviation
c) Standard error of the mean
d) Sampling error
10) The number of TVs sold by a certain shop during the months of November, December, January and
February, respectively are 25, 40, 35, and 32. Indicate whether the following conclusions belong to
the domain of descriptive statistics or inferential statistics.
a) During the four months, the average number of TVs sold per month was 33
b) Since the average number of TVs sold per month was small, the shop should invest more on
advertisement.
c) Out of the four months, the sale in November was the least.
d) The number of TVs sold in December was the highest because of Christmas.
CHAPTER TWO: SAMPLING THEORY
Chapter Objectives
After completing this chapter, students will be able to:
 Comprehend the basic concepts of sampling theory.

 Understand the reasons for sampling.
 Identify the basic sampling techniques.
 Demonstrate knowledge of basic sampling methods.
 Apply sampling theory in business and economics.
2.1 Basic Concepts of Sampling Theory
Dear distance learners, can you explain what sampling theory is?
____________________________________________________________________________
Well!
Students are expected to know the following concepts in sampling theory:
(i) Population or universe is a group of all elements /observations (persons, animals, objects,
measurements, etc) under consideration in a certain problem. The word population is a technical
term in statistics, not necessarily referring to people.
Examples:
 All students in this university;

 All households in Debre Markos town;
 All light bulbs produced by a firm in a single day;
 All fish in a lake, etc.
(ii) Census is a collection of data from the whole population (that is, complete enumeration). It is the
actual measurement or observation of all possible elements from the population or it is a survey
of everyone in the population.
(iii) Reference population (source or target population) is the population of interest, to which the
researcher would like to generalize the results of the study. Example: If a researcher would like to
study the effect of a new fertilizer on crop yield in Ethiopia, then the reference population is all
farmers in Ethiopia who are using the new fertilizer.
(iv) Sampling theory is a study of relationships existing between a population and samples drawn
from the population. Attaining a specified precision at minimum cost is the main intention of
sampling theory. In sampling theory population is often required as an assumption.
(v) Sample is the small group that is chosen for the study. It is a part, portion or subset of a
population taken so that some generalizations about the population can be made. The main
concern in sampling is to ensure that the sample accurately represents the population we are
interested in studying. That is, samples are taken in such a way that they are representative of the
population.
(vi) Sampling is the process of selecting a finite number of elements from a given
population of interest for the purpose of an inquiry. Example: In industry, the quality of a product is
assessed through sampling; public opinion on social, economic and political problems is
ascertained through sampling.
(vii) Sample size is the number of individuals or observations in a sample (usually denoted by n).
(viii) Parameter is any measurable characteristic of a population. Examples: population mean,
population standard deviation, population median, etc.
(ix) Statistic is a number resulting from manipulation of sample data. That is, it is any measurable
characteristic of a sample. Examples: sample mean, sample standard deviation, sample median,
etc. A statistic is used to estimate a population parameter such as the population mean (μ),
the population standard deviation (σ), etc.
(x) The sampling error is the difference between a sample statistic and its corresponding population
parameter. It is the error that occurs because a sample has been taken instead of a census. For
example: the sample mean may differ from the true population mean.
(xi) Sampling Unit is the ultimate unit to be sampled (an element of the population to be sampled). It is
the unit of selection in the sampling process. Examples:
 In a sample of households, the sampling unit is a household;
 In a sample of students, a student is the sampling unit.
 In a sample of districts, the sampling unit is a district, etc.
(xii) Sampling Frame is the list of all possible units in the reference population, from which a sample is
to be drawn. Example: If a researcher would like to do research on the poverty levels of residents in
a town and decides that the sampling unit for the study is an individual, then the sampling
frame would be the list of all individuals living in that town. A student roster is a sampling frame
for a sample of students.
(xiii) Sample design is a set of procedures for selecting the units from the population that are to be in
the sample.
(xiv) Sampling fraction: the ratio of the number of units in the sample (n) to the number of units in
the sampling frame or in the reference population (N), that is, n/N. Its reciprocal, N/n, is the
sampling interval. For example, a sampling fraction of 1:3 corresponds to a sampling interval of 3,
that is, 1 in every 3 units is selected. This means that the sample constitutes 33.3% of the total
units in the sampling frame or in the reference population.
An application of the terminologies

• Population: All students in Debre Markos University in 2009 E.C.
• Sampling Frame: All students appearing in the list of students prepared by the registrar on Hidar
30, 2009 E.C.
• Sample design: Probability sampling
• Sample size: 2000 students selected from the sampling frame.
• Sampling unit (unit of analysis): a student
• Statistic: Students in the sample have spent an average of 300 birr per month.
• Parameter: Students in the university are probably spending, on average, between 250 birr and
350 birr per month (estimate derived from the sample statistic).
2.2.Reasons for Sampling
Dear distance learners, why do we use sampling instead of a census? Explain the advantages of
sampling.
____________________________________________________________________________
____________________________________________________________________________
Good!!
When studying characteristics of a population, there are many practical reasons why we prefer to select
samples of a population. Some of the reasons for sampling are:
(i) A census can be extremely expensive and time-consuming. Contacting every member of
a large population would require great expenditures of time and money, and sampling from
the list can provide satisfactory results more quickly and at much lower cost. Efficiency is
the commonly known advantage of sampling. For example: a researcher may wish to
determine the average annual income for households in Ethiopia. A sample of households
would take fewer days and lower cost than interviewing all the households in Ethiopia.
Therefore, a sample has to be taken.
(ii) The physical impossibility of checking all items in the population (sometimes a census is
impossible). Example: populations of fish, birds, mosquitoes and the like are large and
constantly moving, being born and dying. Therefore, we take samples for research, as it is
impractical to conduct a census of such populations.
(iii) A census can be destructive: The Awash wine factory, like every other winery, employs
wine tasters to ensure the consistency of product quality. Naturally, it would be
counterproductive if the tasters consumed all of the wine, since none would be left to sell
to the thirsty customers. Likewise, a firm wishing to ensure that its steel cable meets
tensile-strength requirements could not test the breaking strength of its entire output. As in the
Awash factory situation, the product "sampled" would be lost during the sampling process,
so a complete census is out of the question.
(iv) The sample results are usually adequate: In practice, a sample can even be more accurate than a
census, because a smaller number of observations can be collected and checked more carefully.
(v) Speed: The collection and analysis of data can be done more quickly if the data are not
excessive. Time and energy are saved. That is, the data can be collected and summarized more
quickly with a sample than with a census. This is a valid consideration when the information is
urgently needed.
(vi) It enables the researcher to get more detailed information about a particular subject under
investigation. If only a few people are surveyed, the researcher can conduct an in-depth
interview by spending more time with each person, thus getting more information about the
subject. That is not to say the smaller the sample, the better; in fact, the opposite is true. In
general, larger samples, if correct sampling techniques are used, give more reliable information
about the population.
Disadvantages of sampling:
i. Reliability: If the sample is not a true representative of the population, then we may sacrifice
reliability in favor of less time and money.
ii. If complete information is required on each and every element of the population, census should
be applied.
2.3.Sampling Methods
Dear distance learners, list and explain the different sampling methods.

____________________________________________________________________________
____________________________________________________________________________
Good!!
A population is often too large for information to be collected from all of its members. Usually, a
representative sub-group of the population (a sample) is included in the investigation. Sampling involves
the selection of a number of study units from a defined population. The main concern in sampling is,
therefore, to ensure that the sample accurately represents the population we are interested in studying.
Sampling methods can be categorized as probability and non-probability methods.
Dear distance learners, explain the difference between probability sampling and non-probability
sampling.
____________________________________________________________________________
____________________________________________________________________________
Good!!
2.3.1. Probability Sampling
A probability sample is a sample selected such that each item in the population being studied has a known
chance (greater than zero) of being included in the sample. Because selection is governed by chance
rather than by human judgment, these methods tend to produce a more representative sample.
Methods of Probability Sampling: The four basic types of probability sampling methods are:
• Simple random sampling,
• Systematic sampling,
• Stratified sampling, and
• Cluster sampling.
Dear distance learner, describe the differences between the probability sampling methods.
_______________________________________________________________________________________
_________________________________________________________________________________
Good!!
The choice of which to use in any given situation will depend on the types of a problem being investigated,
aim of the research and the available resources.
a) Simple Random Sampling (SRS): In SRS, each item in the population has a known, equal, non-zero
chance of being included in the sample.
Random samples are selected by using methods such as random numbers (which can be generated
by computers) or the lottery method. To select a simple random sample, follow these
steps:
 Make a numbered list of all units in the population (sampling frame),

 Each unit on the list should be numbered in sequence from 1 to N (where N is the size of the
population),
 Select the required number of study units, using a "lottery" or a table of random numbers.
Lottery Method in SRS
I. Numbered or named pieces of paper, each representing one unit in the population, are placed in a hat.
II. The papers are thoroughly mixed and a number of papers equal to the sample size is drawn
from the hat. For a sample of 200 students, the researcher would draw 200 papers.
III. The sample then consists of all units of the population corresponding to the selected papers.
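The lottery method can be imitated on a computer. The following is a minimal sketch, assuming a
hypothetical population of 1000 students numbered 1 to 1000 (the population size and the seed are
illustrative and do not come from the text). Python's random.sample draws without replacement, just
as a paper drawn from the hat is not put back.

```python
import random

# Hypothetical sampling frame: students numbered 1..N (N is an assumed value).
N = 1000
frame = list(range(1, N + 1))

# Seeded generator so that the sketch is repeatable; any seed would do.
rng = random.Random(1)

# "Draw 200 papers from the hat": sampling without replacement.
n = 200
sample = rng.sample(frame, n)

# Report the sample size and the number of distinct units drawn.
print(len(sample), len(set(sample)))
```

Because the draw is without replacement, the two printed numbers are always equal.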
Random Number Table Method in SRS
I. The researcher assigns a number to each unit of the population and obtains a table of random numbers.
II. Then s/he randomly selects a starting place (point), goes through the table across the rows or
down the columns and lists the numbers as they appear in the table.
III. Members of the population with the selected numbers constitute the sample.
IV. A random number table is a list of numbers generated by a computer that has been programmed
to yield a set of random numbers.
V. It is possible for a unit's number to come up more than once; when sampling without replacement,
the repeated number is skipped.
Advantage of SRS
I. Ensures that the sample is unbiased in that every individual and every possible sample of a given
size has an equal chance of being chosen.
II. SRS is the basic sampling method assumed in statistical computations for surveys. It can therefore
be used with confidence.
Disadvantages of SRS
I. SRS requires a sampling frame and this is sometimes impossible (the case of fish population),
II. It is difficult to take samples if the reference population is scattered,
III. If the population is extremely large, it is tedious and time consuming to number and select the
sample,
IV. Minority subgroups of interest in the population may not be represented in the sample.
Note that: In SRS, when we apply the table of random numbers, we have to ignore repeated numbers and
those lying above the range of the population size. The following table shows random numbers
generated by a computer.
731 065 777 796 870 963 130 610

759 454 704 173 030 130 611 005
796 465 951 662 591 414 219 145
343 330 606 637 765 155 590 333
873 496 739 665 456 265 126 687
034 005 258 910 055 349 929 365
984 496 905 172 400 609 844 408
846 838 362 542 485 489 230 221
293 378 496 696 911 898 308 662
250 825 716 795 080 180 487 769
074 750 467 029 647 057 017 108
798 719 839 769 780 814 610 744
629 042 308 361 067 619 658 839
744 159 596 527 650 205 151 875
325 634 664 409 052 842 734 503
675 794 821 221 194 412 879 012
804 975 965 539 105 841 188 430
132 407 945 213 351 859 816 246
321 714 049 895 120 705 025 756
235 042 620 205 048 563 859 040
Example: Suppose a researcher wants to know the impact of microfinance on the clients' household
income. S/he wishes to select 10 clients out of 250 clients and a research assistant is required to select
a random sample. Assuming that you are the research assistant, select a simple random sample of 10
clients.
Solution:
1. Number each client from 1 to 250 (based on the alphabetical order of their names or on
identity numbers),
2. Using the random numbers shown above, find the starting point. To find the starting
point, one generally closes one's eyes and places one's finger anywhere on the table. In
this case, let us select number 005 in the 6th row and 2nd column,
3. Going down the column and continuing to the next columns, select the first 10
numbers that lie between 001 and 250, ignoring repeats.
4. The numbers are 005, 042, 159, 049, 173, 172, 029, 221, 213 and 205. Therefore, clients
with these numbers will be included in the sample for further analysis.
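The table walk in this solution can be traced mechanically. The sketch below is only an illustration:
the function name select_from_table and its arguments are assumptions introduced here, and the rows
are the random number table printed above. It starts at row 6, column 2, moves down each column,
continues at the top of the next column, and keeps numbers from 1 to 250 while skipping repeats.

```python
# The random number table printed above, one string per row.
ROWS = [
    "731 065 777 796 870 963 130 610",
    "759 454 704 173 030 130 611 005",
    "796 465 951 662 591 414 219 145",
    "343 330 606 637 765 155 590 333",
    "873 496 739 665 456 265 126 687",
    "034 005 258 910 055 349 929 365",
    "984 496 905 172 400 609 844 408",
    "846 838 362 542 485 489 230 221",
    "293 378 496 696 911 898 308 662",
    "250 825 716 795 080 180 487 769",
    "074 750 467 029 647 057 017 108",
    "798 719 839 769 780 814 610 744",
    "629 042 308 361 067 619 658 839",
    "744 159 596 527 650 205 151 875",
    "325 634 664 409 052 842 734 503",
    "675 794 821 221 194 412 879 012",
    "804 975 965 539 105 841 188 430",
    "132 407 945 213 351 859 816 246",
    "321 714 049 895 120 705 025 756",
    "235 042 620 205 048 563 859 040",
]
table = [row.split() for row in ROWS]


def select_from_table(table, start_row, start_col, population_size, sample_size):
    """Walk down each column from the starting cell, then continue from the top
    of the next column, keeping numbers in 1..population_size and skipping repeats."""
    chosen = []
    row = start_row
    for col in range(start_col, len(table[0])):
        while row < len(table):
            value = int(table[row][col])
            if 1 <= value <= population_size and value not in chosen:
                chosen.append(value)
                if len(chosen) == sample_size:
                    return chosen
            row += 1
        row = 0  # next column starts from the top
    return chosen  # table exhausted before the sample was complete


# Row 6, column 2 in the text corresponds to indices 5 and 1 (counting from 0).
sample = select_from_table(table, start_row=5, start_col=1,
                           population_size=250, sample_size=10)
print([f"{value:03d}" for value in sample])
# ['005', '042', '159', '049', '173', '172', '029', '221', '213', '205']
```

The second occurrence of 042 at the bottom of column 2 is skipped as a repeat, which is why the walk
has to continue into columns 3 and 4.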
b) Systematic Sampling (Quasi-random sampling): In systematic sampling, the elements to be
included in a sample are picked at a constant interval. That is, the items or individuals of the
population are arranged in some order, a random starting point is selected from 1 through k
(where k = N/n, that is, population size N divided by sample size n), and then every kth member
of the population is selected for the sample.
In systematic sampling:
• A complete list of all the elements within the population (sampling frame) is required.
• The procedure is to take every kth item from the sampling frame.
• Let N = population size, n = sample size and k = sampling interval, where k = N/n.
• Choose any number between 1 and k. Suppose it is j (1 ≤ j ≤ k).
• The jth unit is selected first, then the (j+k)th, then the (j+2k)th, etc., until the
required sample size is reached.
Example 1: Suppose there are 2000 subjects in the population and a sample of 50 subjects is
needed. Select a systematic sample of these 50 subjects.
Solution: The sampling interval (k) is 40 (2000/50). The number of the first subject to be included in the
sample is chosen randomly, for example, by blindly picking one out of 40 pieces of paper numbered 1
to 40. Suppose subject 12 was the first subject selected; then the sample would consist of the subjects
numbered 12, 52, 92, etc., until 50 subjects are obtained (the last being subject 1972).
Note that a sample chosen this way is not a simple random sample: although each member of the
population has the same 1-in-k chance of being selected, only k different samples are possible, so not
every combination of n members has a chance of being chosen.
Example 2: Suppose a researcher wants to know the impact of microfinance on the clients' household
income. S/he wishes to select 10 clients out of 250 clients and a research assistant is required to select
a systematic sample. Assuming that you are the research assistant, select a systematic sample of 10 clients.
Solution:
1. Number each client from 1 to 250 (based on the alphabetical order of their names or on identity numbers),
2. Since there are 250 clients and 10 are to be selected, the rule is to select every 25th client. This rule
is determined by dividing 250 by 10, which gives 25,
3. The number of the first subject to be included in the sample is chosen randomly from the numbers 1
to 25. In this case let us select number 5.
4. Then select every 25th number on the list starting from 5. The numbers are the following: 5, 30,
55, 80, 105, 130, 155, 180, 205 and 230. Therefore, clients with these numbers will be included in
the sample for further analysis.
Note: The answer is not unique, as it depends on which number is picked for the first subject to be
included.
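Both systematic sampling examples follow the same rule, which can be sketched in a few lines. The
function name systematic_sample is an assumption made for this illustration, and the sketch assumes
that N is an exact multiple of n, as it is in both examples.

```python
import random


def systematic_sample(N, n, start=None, rng=random):
    """Return the unit numbers j, j+k, j+2k, ... with sampling interval k = N // n.

    Assumes N is an exact multiple of n. If no start is given, a random start
    between 1 and k (inclusive) is chosen.
    """
    k = N // n
    if start is None:
        start = rng.randint(1, k)
    return [start + i * k for i in range(n)]


# Example 1: N = 2000, n = 50, first subject 12.
example_1 = systematic_sample(2000, 50, start=12)
print(example_1[:3], example_1[-1])  # [12, 52, 92] 1972

# Example 2: N = 250, n = 10, first client 5.
example_2 = systematic_sample(250, 10, start=5)
print(example_2)  # [5, 30, 55, 80, 105, 130, 155, 180, 205, 230]
```

Calling systematic_sample(250, 10) without a start picks the first client at random, which is why
the answer to Example 2 is not unique.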
Advantages of Systematic Sampling:
• Less time consuming and easier to perform than SRS,
• It is more convenient to use as compared to SRS,
• It provides a good approximation to SRS.
Disadvantages of Systematic Sampling:
• If there is any sort of cyclic ordering of the subjects, the sample may not be representative of the
population. Example: If subjects in the population are arranged in a manner such as:
1) Defective item
2) Non-defective item
3) Defective item
4) Non-defective item
then, when the sampling interval k is even, every selected item is of the same type as the starting
item, so the sample consists entirely of defective items or entirely of non-defective items:
starting point = defective item and even k: all items in the sample are defective;
starting point = non-defective item and even k: all items in the sample are non-defective.
When k is odd, the selected items alternate between the two types.
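The effect of a cyclic ordering can be checked on a small artificial list. This is a sketch under
assumed values: a hypothetical list of 100 items in which odd positions are defective ("D") and even
positions are non-defective ("G"), sampled with an even interval (k = 4) and an odd interval (k = 5).

```python
# Hypothetical alternating population: odd positions defective, even positions good.
population = {i: ("D" if i % 2 == 1 else "G") for i in range(1, 101)}


def systematic_labels(start, k, n):
    """Labels of the n units at positions start, start+k, start+2k, ..."""
    return [population[start + i * k] for i in range(n)]


even_k = systematic_labels(start=1, k=4, n=10)  # positions 1, 5, 9, ...
odd_k = systematic_labels(start=1, k=5, n=10)   # positions 1, 6, 11, ...

print(even_k)  # every item is 'D'
print(odd_k)   # 'D' and 'G' alternate
```

With the even interval every selected position is odd, so the sample contains only defective items;
with the odd interval the two types alternate and each makes up half of the sample.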
Example: Moha Company stores boxes containing Pepsi and Mirinda in the following order, numbered
from 1) to 200):
1) Box containing Pepsi
2) Box containing Mirinda
3) Box containing Pepsi
4) Box containing Mirinda
.
.
.
200)
The quality department of the company would like to check the expiry date of the products by taking a
systematic sample of 40 boxes containing either Pepsi or Mirinda. Assume that you are working in
the quality department of the company and select the systematic sample required. Is the sample you
selected representative?
Stratified Sampling: In stratified sampling, a population is first divided into subgroups, called strata
(singular stratum), and a sample is selected from each stratum based on simple random or systematic
sampling method. The strata are made according to various homogeneous characteristics such as sex,
race, region or institutional affiliation such as faculty. This sampling method is appropriate when the
distribution of the characteristic to be studied is strongly affected by certain variables. Note: Stratified
sampling is applied if the population is heterogeneous.
Stratified sampling can also be proportionate or non-proportionate. In the latter case, an equal number
of elements are drawn from each stratum while in the former case a proportionate number is obtained.
a) Proportionate Stratified Sampling: The number of units selected from each stratum is directly
proportional to the size of the stratum. If Pi represents the proportion of the population included in
stratum i, and n represents the total sample size, the number of elements selected from stratum i is
n × Pi.
Examples:
1) Let us suppose that we want a sample of size 30 to be drawn from a population of size 8000,
which is divided into three strata of sizes 4000, 2400 and 1600. Adopting proportional allocation,
find the sample size for each stratum.
Solution: The sample sizes for the different strata are:
a. N1 = 4000, so P1 = 4000/8000 = 0.5 and hence n1 = n × P1 = 30 × 0.5 = 15
b. N2 = 2400, so P2 = 2400/8000 = 0.3 and hence n2 = n × P2 = 30 × 0.3 = 9
c. N3 = 1600, so P3 = 1600/8000 = 0.2 and hence n3 = n × P3 = 30 × 0.2 = 6
Check: N = N1 + N2 + N3 = 8000, P1 + P2 + P3 = 1 and n1 + n2 + n3 = 15 + 9 + 6 = 30.
Thus, using proportional allocation, the sample sizes for the different strata are 15, 9 and 6
respectively, which is in proportion to the sizes of the strata, namely 4000:2400:1600.
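The proportional allocation above is a one-line computation. The sketch below uses a function name,
proportional_allocation, that is an assumption made for illustration; it applies n × Ni/N to each
stratum and rounds to the nearest whole unit.

```python
def proportional_allocation(stratum_sizes, n):
    """Return n_i = n * N_i / N for each stratum, rounded to whole units."""
    N = sum(stratum_sizes)
    return [round(n * N_i / N) for N_i in stratum_sizes]


allocation = proportional_allocation([4000, 2400, 1600], 30)
print(allocation, sum(allocation))  # [15, 9, 6] 30
```

In this example the products are whole numbers. When they are not, rounding each stratum separately
may give a total slightly different from n, and the allocation then has to be adjusted by hand.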
2) In a class of students, you can stratify the whole class on the basis of gender (F or M) and
draw either an equal number of students from each group (disproportionate) or an unequal
number of students from each group depending on the proportion of males to females in the
original class list (proportionate). Let us take a numerical example: If there are 50 students in a
class, of whom 10 are female, and if 10 students are needed for some study,
a) select a proportionate stratified sample of 10 students (10 × 40/50 = 8 M, 10 × 10/50 = 2 F)
b) select a disproportionate stratified sample of 10 students (5 M, 5 F)
Advantage: The representativeness of the sample is improved.
Disadvantages:
• If there are many variables of interest, dividing a large population into representative
subgroups requires a great deal of effort,
• If variables are somewhat complex or ambiguous (such as beliefs, attitudes, etc.), it is
difficult to separate individuals into the subgroups according to these variables.
Example (class work): Using the population of 20 students given below, select a sample of 8 students on
the basis of gender (female/male) and grade level (freshman/sophomore).
S.No Name Gender Grade level S.No Name Gender Grade level
1 Abebe M Fr 11 Melat F Fr
2 Bekele M So 12 Nigusie M Fr
3 Birtukan F Fr 13 Petros M So
4 Chaltu F Fr 14 Rosa F So
5 Dagmawit F Fr 15 Regassa M Fr
6 Dagne M Fr 16 Selam F Fr
7 Huluka M Fr 17 Solomon M So
8 Lulit F So 18 Tigist F So
9 Melaku M So 19 Tibeyin F So
10 Mohammed M So 20 Tirhas F So
Solution: Steps:
1) Divide the population into two groups based on gender.
2) Divide each subgroup further into two groups of freshman and sophomore.
3) Determine how many students need to be selected from each subgroup to have a
proportional representation of each subgroup in the sample. There are four subgroups of equal
size (five students each), and since a total of eight students are needed for the sample, two
students must be selected from each subgroup.
4) Select two students from each group by using SRS or systematic sampling.
Solution: 1) Divide the population in to two groups based on gender as shown below:
Males Females
S.No Name Gender Grade Level S.No Name Gender Grade Level
1 Abebe K. M Fr 11 Melat A. F Fr
2 Bekele M. M So 12 Lulit L. F So
3 Dagne K. M Fr 13 Birtukan L. F Fr
4 Huluka G. M Fr 14 Rosa M. F So
5 Melaku J. M So 15 Chaltu C. F Fr
6 Mohammed A. M So 16 Selam A. F Fr
7 Nigussie K. M Fr 17 Dagmawit B. F Fr
8 Petros L. M So 18 Tigist M. F So
9 Regassa K. M Fr 19 Tibeyin Y. F So
10 Solomon K. M So 20 Tirhas W. F So
2) Divide each subgroup further in to two groups of freshman and sophomore as shown below:
Group 1                                    Group 2
S.No  Name          Gender  Grade Level    S.No  Name          Gender  Grade Level
1     Abebe K.      M       Fr             1     Melat A.      F       Fr
2     Dagne K.      M       Fr             2     Birtukan L.   F       Fr
3     Huluka G.     M       Fr             3     Chaltu C.     F       Fr
4     Nigussie K.   M       Fr             4     Selam A.      F       Fr
5     Regassa K.    M       Fr             5     Dagmawit B.   F       Fr
Group 3                                    Group 4
S.No  Name          Gender  Grade Level    S.No  Name          Gender  Grade Level
1     Mohammed A.   M       So             1     Lulit L.      F       So
2     Melaku J.     M       So             2     Rosa M.       F       So
3     Petros L.     M       So             3     Tigist M.     F       So
4     Solomon K.    M       So             4     Tibeyin Y.    F       So
5     Bekele M.     M       So             5     Tirhas W.     F       So
3) Determine how many students need to be selected from each subgroup to have a proportional
representation of each subgroup in the sample. There are four groups of five students each, and
since a total of eight students are needed for the sample, two students must be selected from
each subgroup.
4) Select two students from each group by using random numbers. In this case we can select the
following students: Group 1: Students 5 & 4, Group 2: Students 5 & 2, Group 3: Students 1 & 3,
Group 4: Students 3 & 4.
5) The stratified sample then consists of the following students:
S.No  Name          Gender  Grade Level
1     Nigussie K.   M       Fr
2     Regassa K.    M       Fr
3     Mohammed A.   M       So
4     Petros L.     M       So
5     Birtukan L.   F       Fr
6     Dagmawit B.   F       Fr
7     Tigist M.     F       So
8     Tibeyin Y.    F       So
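The steps of this class-work example can also be carried out programmatically. The sketch below is
one possible way to do it, not the procedure used in the text: the student list is taken from the
table above, while the seed and variable names are assumptions. Because the draw is random, the
selected students will generally differ from those in the solution, but the structure (two students
from each of the four strata) is the same.

```python
import random
from collections import defaultdict

# (name, gender, grade level) for the 20 students in the class-work example.
students = [
    ("Abebe", "M", "Fr"), ("Bekele", "M", "So"), ("Birtukan", "F", "Fr"),
    ("Chaltu", "F", "Fr"), ("Dagmawit", "F", "Fr"), ("Dagne", "M", "Fr"),
    ("Huluka", "M", "Fr"), ("Lulit", "F", "So"), ("Melaku", "M", "So"),
    ("Mohammed", "M", "So"), ("Melat", "F", "Fr"), ("Nigussie", "M", "Fr"),
    ("Petros", "M", "So"), ("Rosa", "F", "So"), ("Regassa", "M", "Fr"),
    ("Selam", "F", "Fr"), ("Solomon", "M", "So"), ("Tigist", "F", "So"),
    ("Tibeyin", "F", "So"), ("Tirhas", "F", "So"),
]

# Steps 1 and 2: form the strata by gender and grade level.
strata = defaultdict(list)
for name, gender, grade in students:
    strata[(gender, grade)].append(name)

# Steps 3 and 4: proportional allocation, then SRS within each stratum.
n = 8
rng = random.Random(0)  # assumed seed, only to make the sketch repeatable
sample = {}
for key, members in strata.items():
    share = round(n * len(members) / len(students))  # 8 * 5/20 = 2
    sample[key] = rng.sample(members, share)

for key, chosen in sample.items():
    print(key, chosen)
```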
c) Cluster Sampling: If the population is homogeneous and very large or is spread over a large area, it
is costly and time consuming to take samples by using the three methods described above. In this
case, we divide the population into groups called clusters and then randomly select a sample of
clusters. Finally, the sample is taken from the selected clusters: we can either take all
members of the selected clusters or select samples from within them by using other sampling
techniques.
Procedures:
1) The reference population is divided in to clusters or subgroups, preferably similar in size,

2) A sample of the clusters is taken by random or systematic sampling,
3) All the units in the selected clusters are then studied or we may select samples from each
cluster. If part of the elements in each cluster is included in the sample, then the procedure is
called two stage sampling. The first stage is selecting a sample of clusters and the second
stage is selecting a sample of elements from each cluster.
Advantage:
• A list of all individual study units in the reference population is not required.
• Reduces cost.
• Simplifies field work and is convenient.
Disadvantage:
• The members of a cluster are often more homogeneous than the members of the whole
population and, therefore, the sample may not be representative.
• The elements in a cluster may not have the same variation in characteristics as elements
selected individually from the population.
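The cluster sampling procedure can be sketched as follows. All of the data are hypothetical: 20
clusters (say, villages) of 30 households each, 4 clusters selected, and 5 households per selected
cluster in the two-stage variant. None of these numbers come from the text.

```python
import random

rng = random.Random(7)  # assumed seed, only to make the sketch repeatable

# Hypothetical population: 20 clusters (villages) of 30 households each.
clusters = {c: [f"village{c}-hh{h}" for h in range(1, 31)] for c in range(1, 21)}

# Stage 1: a simple random sample of clusters. Only a list of clusters is needed,
# not a list of every household in the population.
selected_clusters = rng.sample(sorted(clusters), 4)

# One-stage cluster sample: study every unit in the selected clusters.
one_stage = [hh for c in selected_clusters for hh in clusters[c]]

# Two-stage sample: a simple random sample of 5 units within each selected cluster.
two_stage = [hh for c in selected_clusters for hh in rng.sample(clusters[c], 5)]

print(len(one_stage), len(two_stage))  # 120 20
```

Household lists are only needed for the four selected villages, which is the source of the cost
saving mentioned under the advantages.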
d) Multi-Stage Sampling: a sampling technique that is used when the reference population is
large and widely scattered. Selection of samples is done in stages until the final sampling unit is
obtained. The number of stages of sampling is the number of times a sampling procedure is carried out.
The primary sampling unit (PSU) is the sampling unit in the first sampling stage, the secondary
sampling unit (SSU) is the sampling unit in the second sampling stage, etc. For example: the PSUs can be
the weredas, the SSUs can be the kebeles, etc. We select a sample of PSUs based on a suitable
method; each selected PSU is then sub-divided into second stage units (say kebeles), and
from these SSUs a sample is again taken by some suitable method. Further stages may be added if
required.
Example:
A multistage sampling procedure was used to conduct a study entitled “Health Service Utilization in
Amhara Region of Ethiopia.”
Procedures followed:
• The former provinces of Gondar, Gojjam and Wollo were each divided into two zones.
• One of the two Gondar zones, one of the two Gojjam zones and one of the two Wollo zones
were randomly selected. Later one more zone, North Shoa, was included (four zones in total).
• Two districts were selected from each zone except North Shoa, from which only one district was
selected (seven districts in total).
• Two rural kebeles and one urban kebele were chosen from each selected district (14
rural kebeles and 7 urban kebeles).
Advantages: Cuts the cost of preparing a sampling frame.
Disadvantages: Gives less precise estimates than SRS for the same sample size.
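The staged selection in this example can be mimicked in code to check the counts. This is only a
sketch: the zone, district and kebele names, the number of districts per zone and of kebeles per
district, and the seed are all invented for illustration; only the number selected at each stage
follows the example.

```python
import random

rng = random.Random(2009)  # assumed seed, only to make the sketch repeatable

# Stage 1: one of the two zones in each former province, plus North Shoa.
provinces = {p: [f"{p} zone {i}" for i in (1, 2)] for p in ("Gondar", "Gojjam", "Wollo")}
zones = [rng.choice(zone_pair) for zone_pair in provinces.values()]
zones.append("North Shoa")


def districts_in(zone):
    """Hypothetical frame: six districts per zone."""
    return [f"{zone} / district {i}" for i in range(1, 7)]


def kebeles_in(district, kind):
    """Hypothetical frame: five kebeles of each kind per district."""
    return [f"{district} / {kind} kebele {i}" for i in range(1, 6)]


# Stage 2: two districts per zone, except North Shoa (one district).
districts = []
for zone in zones:
    how_many = 1 if zone == "North Shoa" else 2
    districts += rng.sample(districts_in(zone), how_many)

# Stage 3: two rural kebeles and one urban kebele per selected district.
rural, urban = [], []
for district in districts:
    rural += rng.sample(kebeles_in(district, "rural"), 2)
    urban += rng.sample(kebeles_in(district, "urban"), 1)

print(len(zones), len(districts), len(rural), len(urban))  # 4 7 14 7
```

A sampling frame is needed only for the units inside the areas selected at the previous stage,
which is why multistage sampling cuts the cost of preparing the frame.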
2.3.2. Non-Probability Sampling
Non-Probability Sampling: In non-probability sampling, not every unit in the population has a known
chance of being included in the sample, and the process involves at least some degree of personal
subjectivity instead of following predetermined, probabilistic rules for selection. This sampling
technique is:
• Used when a sampling frame doesn't exist,
• A non-random selection (and may therefore be unrepresentative),
• Inappropriate if the aim is to measure variables and generalize findings,
• Easier, quicker and cheaper to carry out than probability designs.
Dear distance learner, describe the types of non-probability sampling methods.
_____________________________________________________________________________________
_____________________________________________________________________________________
Good!!
There are three non-probability sampling methods. These are:
a) Convenience Sampling: a method in which a sample is chosen with ease of access being the
primary concern. Example: Interviews conducted in convenient locations such as a student
lounge.
b) Purposive (Judgmental) Sampling: the researcher exercises deliberate subjective choice in
drawing the units that s/he regards as most informative for the study being undertaken.
c) Quota Sampling: a method that ensures that a certain number of sample units from different
categories with specific characteristics are represented. Here, judgmental and convenience
sampling methods are combined. Quota sampling can be applied for affirmative action.
Example: Suppose we know that 54% of the adults in a community are females, and the study
requires 100 respondents as a sample. In quota sampling, we might interview the first 54
females and the first 46 males.
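The quota rule in this example ("the first 54 females and the first 46 males") can be written as a
simple loop. The sketch below assumes a hypothetical stream of 1000 adults met in arbitrary order,
half of them female; only the quotas of 54 and 46 come from the example.

```python
import random

rng = random.Random(3)  # assumed seed, only to make the sketch repeatable

# Hypothetical stream of adults in the order the interviewer happens to meet them.
passers_by = ["F"] * 500 + ["M"] * 500
rng.shuffle(passers_by)

quota = {"F": 54, "M": 46}
counts = {"F": 0, "M": 0}
interviewed = []
for position, sex in enumerate(passers_by):
    if counts[sex] < quota[sex]:  # interview only while the quota for this group is open
        counts[sex] += 1
        interviewed.append((position, sex))
    if counts == quota:  # both quotas filled: stop interviewing
        break

print(counts)  # {'F': 54, 'M': 46}
```

Who ends up in the sample depends entirely on the order in which people are met, not on a known
chance of selection, which is why quota sampling is a non-probability method.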
2.4. Errors in Sampling

There are two types of errors:
1. Sampling error: the discrepancy between the population value (parameter) and the sample value
(statistic). It arises because only a part of the population is observed, and it may be made worse
by an inappropriate sampling technique. It can be reduced by increasing the size of the sample.
When n = N, sampling error = 0.
2. Non-sampling errors (bias): errors due to procedural bias, such as:
• Subjects' non-response
• Incorrect responses
• Problems with the sampling frame
• Measurement error
• Errors at different stages in processing the data.
Ways to reduce data error
• Ensure that survey instruments are well prepared, simple to read, and easy to
understand.
• Properly select and train interviewers to control data gathering bias or error.
• Use sound editing, coding, and tabulating procedures to reduce the possibility of data
processing error.
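To illustrate the sampling error described in this section, the sketch below compares sample means
with the population mean for increasing sample sizes. The population is hypothetical (the monthly
spending, in birr, of N = 5000 students, generated at random), as are the seed and the sample sizes;
the only point taken from the text is that the sampling error is zero when n = N.

```python
import random
import statistics

rng = random.Random(42)  # assumed seed, only to make the sketch repeatable

# Hypothetical population: monthly spending (birr) of N = 5000 students.
N = 5000
population = [rng.randint(100, 500) for _ in range(N)]
mu = statistics.mean(population)  # parameter: population mean

errors = {}
for n in (50, 500, N):
    sample = rng.sample(population, n)   # simple random sample without replacement
    x_bar = statistics.mean(sample)      # statistic: sample mean
    errors[n] = x_bar - mu               # sampling error of the mean
    print(n, round(errors[n], 2))
```

The errors for n = 50 and n = 500 vary from one random draw to another, although larger samples tend
to give smaller errors; for n = N the sample is the whole population and the error is exactly zero.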
Review Exercises
1) What are the reasons for sampling? Discuss and give an example for each reason.
2) Differentiate between a parameter and a statistic. Which one is the result of taking a sample?
3) Define systematic sampling and explain how it is carried out. Describe how you would obtain a
systematic sample of 80 students from a population of 1600 students.
4) Briefly explain the difference between the following concepts and give examples, if necessary.
a) Sampling vs. Census
b) Cluster sampling vs. Stratified sampling
c) Sampling frame vs. Sampling unit
5) Assume that you are going to undertake research on Ethiopian cultures. Before taking a
sample, you observed that the cultures are highly diverse and large in number. Which type of
sampling method are you going to use so that your sample will represent all the cultures?
Why?
6) Briefly explain cluster sampling. For which type of population is it the preferred way of
selecting samples?
7) Assume that there are 500 students in FBE, DMU, in five departments with 150,
100, 50, 150 and 50 students. Assume that 20 students are to be selected from these five
departments for a scholarship based on probability sampling. Further assume that students from all
departments have an equal chance of being selected, i.e., departments with a larger number of
students will send more students than others. If you are assigned to select 20 students from
FBE, then
a) Which type of sampling method are you going to use?
b) Determine the sample size to be selected from each department.
8) To study the reaction of students to a policy issued by a college, a sample of 100 students is
required. The number of male students in the college is 1000 and the number of female students
is 1500. If you want to select your sample of 100 students using proportional
allocation, how many students of each sex should you include in your sample?
9) Suppose you are a Woreda administrator with five kebeles having respective population sizes of
10000, 5000, 15000, 20000 and 50000. If you are supposed to select 1000 representatives of the
Woreda, determine the number of individuals to be selected from each kebele so that your
selection is fair.
10) Classify each of the following samples as simple random, systematic, stratified or cluster.
a. In a large school district, all teachers from two buildings are interviewed to determine
whether they believe the students have less homework to do now than in previous
years.
b. Every 7th customer entering a shopping mall is asked to select his or her favorite shoes.
c. Nursing supervisors are selected using random numbers to determine annual salaries.
CHAPTER THREE: DATA COLLECTION AND PRESENTATION
Introduction
Dear distance learners! In chapter one, we defined statistics as the science of
collecting, organizing, presenting, analyzing, and interpreting numerical data in order to make
more effective and rational decisions. Data are any collection of raw facts, figures or numerical
results of any count or measurement collected from a population or sample that will be used to
draw a conclusion or make a decision. Thus the data collected should have a source, a
classification and a method of collection, and they should be organized and presented in a clear,
precise and understandable way. This chapter is concerned with these issues: classifications and
sources of data, and methods of data collection and presentation.
Chapter Objectives
• Define the concept of data,
• Identify classifications of data,
• Identify the sources of data,
• Become familiar with methods of data collection,
• Organize data using frequency distributions,
• Visually represent data using tabular and graphic methods.
3.1 Definition of data
Dear distance learners! In this section we will try to define data. People often use the terms
data and information interchangeably; however, there is a distinct difference
between the two. Data are raw, unorganized facts that need to be processed. Data can be
something simple and seemingly random and useless until they are organized and presented in a
meaningful way. Information, on the other hand, is data that have been processed, organized and
structured and that are presented in a given context so as to make them useful. In short, data
(the plural form of datum), as we defined above, are any collection of raw facts, figures,
numerical results or values (responses) of a variable from any count or
measurement collected from a population or sample, which will be used to draw a conclusion,
make an inference or make a decision.
In research, statisticians/researchers use data in many different ways. Data can be used to
describe situations or events or to make an inference.
3.2 Classification of data
Dear distance learner, can you mention some classifications/types of data? Try to
answer below.
_______________________________________________________________________
Great! Data are classified based on different criteria. They can be classified as
quantitative or qualitative data based on their nature; as primary or secondary data based on their
source; and as time series, cross sectional or panel data based on the role of time.
i) Quantitative or qualitative data

We classify data as quantitative or qualitative based on their nature, that is, on whether or not
they are measured and expressed numerically. Quantitative data are data that are measured and
expressed numerically; they are numerical observations of variables. Example: age, Grade Point Average
(GPA), sales, prices of goods and services, number of colleges in a given town, income of
households, etc. Valid computations such as the mean, variance, etc. are possible in the case of
quantitative data.
Qualitative data: data that are non-numeric, or expressed in words, letters or symbols. Example:
marital status (married, single, widowed, divorced), race (Asian, African, etc.), gender
(male/female), blood type (A, B, O, AB), educational level. Valid computation: proportions in
each category can be computed. Example: What percent of students in this class are female?
ii) Primary or Secondary data

Primary data: data that are collected from a primary source by the investigator himself/herself
for the purpose of a specific inquiry or study. In character they are first-hand, new, original and
not yet used in another study. Thus the source of primary data is the reference/target population
under study.
Secondary data, on the other hand, are those which have already been collected by someone
else and which have already passed through the statistical process and been used for some
purpose. In character they are not first-hand, new or original; rather, they have already been
collected and used by someone else for the same or another purpose. Secondary data can be
obtained from published and unpublished materials. Sources include: various publications of the
central, state or local governments; various publications of foreign governments or of
international bodies and their subsidiary organizations; technical and trade journals; books,
magazines and newspapers; reports and publications of various associations connected with
business and industry, banks, stock exchanges, etc.; reports and research results prepared by
research scholars, universities, economists, etc. in different fields; and public records and
statistics, historical documents, and other sources of published data. The Internet is also a source
of secondary data.
iii) Cross sectional, time series data and panel data

Cross-sectional data: data collected from a population or a sample at one point in time.
Example: the data collected by a researcher on the income level of 1000 households in a given
town for the year 2009 can be taken as cross-sectional data.
Time series data: data collected over time (a sequence of periods) on one or more variables;
that is, data collected at several successive periods of time.
Example: the data collected by a researcher on one or more variables for 20 successive
periods/years can be taken as time series data.
Panel data: panel data are collected from repeated surveys of a single (cross-section) sample
in different periods of time. They combine elements of both time series and cross-sectional data,
because data are collected on the same elements (cross-section) for more than one period/year.
Taking our example for cross-sectional data, if the researcher collected data on the income level
of those 1000 households for three or more consecutive years, we call it panel data.
3.3 Methods of data collection
Dear distance learner, can you mention methods of data collection? Try to answer below.
_______________________________________________________________________
In the previous section we categorized data by source as primary and secondary.
The methods of collecting primary and secondary data differ, since primary data have to be
collected originally from primary sources, while for secondary data the collection work is merely
that of compiling already existing data. So we will discuss methods of data collection with
primary data in mind.
Methods of collecting primary data
Dear distance learner, have you ever collected primary data? If your answer is yes, what
types of methods did you use? Try to answer below.
________________________________________________________________________
There are several methods of collecting primary data, but the most important and most
widely used ones are listed below.
a. Interview method
b. Questionnaire Method
c. Observation Method
We briefly take up each method separately.
a. Interview method
The interview method of collecting data involves presentation of oral-verbal stimuli and reply in
terms of oral-verbal responses. According to Eckhard and Ermann," Interviewing is a data
collection, procedure involving verbal communication between the researcher and respondent
either by telephone or in a face to face situation".
Data collection through interviews is usually carried out with either structured or unstructured
interviews. Structured interviews involve the use of a set of predetermined questions and of
highly standardized techniques of recording. Thus, the interviewer in a structured interview
follows a rigid procedure laid down, asking questions in a form and order prescribed. In contrast,
unstructured interviews are characterized by flexibility of approach to questioning. They do not
follow a system of predetermined questions and standardized techniques of recording the data.
In an unstructured interview, the interviewer is allowed much greater freedom to ask
supplementary questions when needed. The interview can be conducted in person
(face-to-face), by telephone or by mail.
Personal (face - to - face) interview
In this case the respondent and the interviewer have face-to-face contact and oral/verbal
communication. That is, the interviewer asks certain questions of the interviewee (respondent).
Usually the interviewer is expected to initiate the interview and collect the information. The
personal (face-to-face) interview has its own pros and cons.
Dear distance learner, can you list some pros and cons of the personal (face-to-face)
interview? Try to answer below.
Let us start with the pros.
• More accurate and reliable.
• Offers a lot of flexibility in allowing the interviewer to explain questions.
• Maximizes trust and cooperation between the interviewer and the interviewee, so that
reluctance on the part of the interviewee is minimized.
• Has a higher rate of response.
• Is more suitable when an in-depth study is required and when respondents are illiterate.
Cons
• More expensive and time consuming.
• Not suitable for a large group of informants.
Telephone Interview
This method of collecting information consists of contacting respondents by telephone. The
medium of communication is the telephone. It is not a very widely used method, but it plays an
important part in industrial surveys, particularly in developed regions.
The features of the telephone interview are listed below:
• Requires a relatively short span of time.
• Has a high response rate.
• No field staff is required.
• Less costly than the personal interview.
• Less effective in a community with few telephone lines.
• Not all people have a chance of being surveyed, because some people may not have phones
or may not pick up.
• It is faster than other methods.
• Extensive geographical coverage may be restricted by cost considerations.
Mail Interview
The medium of communication is mail, which can include electronic mail (e-mail).
Characteristics of mail interview
• If one drafts a detailed questionnaire, it can be mailed to the respondents for filling in, or
it can be put in the charge of enumerators who go around and fill it in after obtaining the
desired observations.
• It is relatively less costly compared to telephone and personal interviews.
• The individual should be literate to give an appropriate response.
• Non-response error may be high if mailing is costly.
• This survey can be used to cover a wider geographic area than telephone surveys or
personal interviews, since mailed questionnaire surveys are less expensive to conduct.
• It tends to yield a low number of responses and some inappropriate answers to questions.
• It has a low return rate.
• Some people may have difficulty in reading or understanding the questions.
b. Questionnaire Method
The questionnaire method is a method in which data are obtained with the help of a questionnaire
prepared exclusively for the purpose. In other words, all the required data are collected with the
help of a set of questions. The questionnaire can be developed or adapted by the researcher. A
questionnaire can be either structured or unstructured. Structured questionnaires are those in
which there are definite, concrete and predetermined questions. The questions are presented with
exactly the same wording and in the same order to all respondents. When these characteristics
are not present, the questionnaire is termed unstructured or non-structured. The types of
questions in a given questionnaire can be multiple choice (closed questions), dichotomous
(having only two choices, e.g. yes/no, female/male) and open-ended (where the respondents
are free to give any response).
c. Observation Method
The observation method is the most commonly used method especially in studies relating to
behavioral sciences. In a way we all observe things around us, but this sort of observation is not
scientific observation. Observation becomes a scientific tool and the method of data collection
for the researcher, when it serves a formulated research purpose, is systematically planned and
recorded and is subjected to checks and controls on validity and reliability.
The observation can be controlled / uncontrolled. If the observation takes place in the natural
setting, it may be termed as uncontrolled observation, but when observation takes place
according to definite pre-arranged plans, involving experimental procedure, the same is then
termed controlled observation. In non-controlled observation, no attempt is made to use precision
instruments.
Characteristics of Observation Method
• We see what is happening and record it, e.g. a traffic accident.
• Observation relies on watching or listening, then counting or measuring.
• There are no respondents.
• It is time consuming and expensive.
3.4 Data Presentation
Dear distance learners! Once the data have been collected from the subjects under study, they
have to be organized and presented in a precise and understandable way. Data organization is
simply the process of editing, classifying and arranging the given data set to make it
understandable and to eliminate unnecessary details. Data presentation is the process of
presenting or expressing data using tabular or graphical methods.
3.4.1. Tabular Methods of Data Presentation
Let's start our discussion of the tabular method of data presentation by defining tabulation.
Tabulation is the arrangement of a given data set in tables. There are various techniques of
tabulation. The most widely used are the data array and the frequency distribution.
a) Data Array
Dear distance learner, what is a data array? Try to answer below.
A data array is a table showing data arranged in descending or ascending order, for both
qualitative and quantitative data. Descending order is the arrangement of data from the highest to
the lowest, and ascending order is the arrangement of data from the lowest to the highest.
Examples:
• Descending (100, 99, 98, 97, ...)
• Ascending (1, 2, 3, 4, 5, 6, 7, 8, 9, ...)
• An alphabetical list of post office renters can be considered a data array of qualitative
information. The first two examples show arrangements of quantitative data.
Dear distance learner, can you list some advantages of a data array? Try to answer below.
A data array offers the following advantages:
a) We can determine at a glance the highest and lowest values contained in the data.
b) We can identify groups of similar data values.
c) We can easily see differences between values in the data.
Now we will try to see how we present the given data set (raw data) in data array. The following
data set (raw or ungrouped data) displays the Income level of 50 households in a certain town.
112 100 127 120 134 105 110 118 109 112
110 118 117 116 118 114 114 122 105 109
107 112 114 115 118 118 122 117 106 110
116 108 110 121 113 119 111 120 104 110
120 113 120 117 105 118 112 110 114 114
The data can be arranged in the data array either in descending or ascending order. Let us
arrange it in ascending order (lowest to the highest).
Table 3.1: Data Array
Ascending order
100 110 112 116 119
104 110 113 117 120
105 110 113 117 120
105 110 114 117 120
105 110 114 118 120
106 110 114 118 121
107 111 114 118 122
108 112 114 118 122
109 112 115 118 127
109 112 116 118 134
Maximum data value = 134, Minimum data value = 100, Range = 134 – 100 = 34
Dear distance learners! Please arrange the data in descending order.
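As a minimal sketch, the data array and the range can also be obtained with Python's built-in `sorted` function. The variable names are illustrative; the values are the 50 household incomes listed above.

```python
# Household income data from the example above (50 observations).
incomes = [
    112, 100, 127, 120, 134, 105, 110, 118, 109, 112,
    110, 118, 117, 116, 118, 114, 114, 122, 105, 109,
    107, 112, 114, 115, 118, 118, 122, 117, 106, 110,
    116, 108, 110, 121, 113, 119, 111, 120, 104, 110,
    120, 113, 120, 117, 105, 118, 112, 110, 114, 114,
]

ascending = sorted(incomes)                  # lowest to highest
descending = sorted(incomes, reverse=True)   # highest to lowest

minimum = ascending[0]
maximum = ascending[-1]
data_range = maximum - minimum

print(minimum, maximum, data_range)
```

The printed minimum, maximum and range can be compared with the values reported under Table 3.1.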
b) Frequency Distribution
A frequency distribution is a table that groups data into non-overlapping intervals called classes
and records the number of observations in each class. The frequency distribution summarizes
data in a condensed form that can be readily understood and easily interpreted. The reasons for
constructing a frequency distribution are:
• To organize the data in a meaningful way.
• To enable researchers to draw charts and graphs for the presentation of data.
• To enable a reader to make comparisons among different data sets.
Dear distance learners! There are some key terms in frequency distribution. Some of them are
listed below.
• Class: each category of the frequency distribution is called a class.
• Frequency: the number of data values/observations falling within each class.
• Total frequency: the sum of all class frequencies.
For a distribution with k classes, let $x_1, x_2, \ldots, x_k$ denote the classes and
$f_1, f_2, \ldots, f_k$ their frequencies. Then
$f_1 + f_2 + \cdots + f_k = \sum_{i=1}^{k} f_i = n$,
where n is the total frequency, i.e. the number of observations (sample size).
Class limits are the values that define each class. They determine which data values are
assigned to that class. Class limits can be lower or upper class limits, and they are stated to the
same number of decimal places as the data values. The lower and upper class limits are the
lowest and highest values of the class respectively.
Class boundaries are also called true class limits. They are the lowest and highest values of each
class when there is no gap between successive classes. To compute class boundaries we first
need the correction factor, which is denoted by d:
d = lower class limit of a class – upper class limit of the previous class. Then
1. We add d/2 to each upper class limit to get the upper class boundary of each class, and
2. We subtract d/2 from each lower class limit to get the lower class boundary of each class.
Class interval is the width of each class. It is the difference between the lower (or upper)
limit of a class and the lower (or upper) limit of the next higher class. Equivalently, it is the
difference between the upper and lower class boundaries of any class. It is expected to be a
rounded number.
$\text{Approximate class width} = \dfrac{\text{Range}}{\text{Number of classes desired}}$
$\text{Range} = \text{Maximum value} - \text{Minimum value}$
Class mark is the midpoint of each class, midway between the upper and lower class limits.
To become familiar with these concepts, work through the distribution table below.
Class Frequency
200 – 299 12
300 – 399 19
400 – 499 6
500 – 599 2
600 – 699 11
700 – 799 7
800 – 899 3
Total Frequency 60
As we said, a class is each category in the frequency distribution table, so there are 7 classes
in this table. The total frequency is the sum of the frequencies of all classes, i.e. the total
number of observations, which is 60. Now let us find the class limits, class boundaries, class
interval and class mark of the first class.
A. Class limits of the first class: lower (LCL) and upper (UCL) class limits.
LCL1 = 200 and UCL1 = 299
B. Class boundaries of the first class: lower (LCB) and upper (UCB) class boundaries.
The correction factor d is 1 (for example, 300 – 299), so the lower class boundary is the midpoint
between 199 and 200.
LCB1 = 200 – 1/2 = 199.5
UCB1 = 299 + 1/2 = 299.5
C. Class interval (width) of the first class
299.5 – 199.5 = 100
D. Class mark of the first class
(LCL1 + UCL1)/2 = (200 + 299)/2 = 249.5
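The same calculation can be repeated for every class. The following is a small sketch in Python using the class limits of the table above; the helper name `describe` is an illustrative choice, not part of any library.

```python
# Class limits and frequencies from the distribution table above.
classes = [(200, 299), (300, 399), (400, 499), (500, 599),
           (600, 699), (700, 799), (800, 899)]
frequencies = [12, 19, 6, 2, 11, 7, 3]

# Correction factor d: gap between a lower limit and the previous upper limit.
d = classes[1][0] - classes[0][1]

def describe(lower, upper, d):
    """Return (lower boundary, upper boundary, width, class mark) of one class."""
    lcb = lower - d / 2          # lower class boundary
    ucb = upper + d / 2          # upper class boundary
    width = ucb - lcb            # class interval (width)
    mark = (lower + upper) / 2   # class mark (midpoint of the limits)
    return lcb, ucb, width, mark

first = describe(*classes[0], d)
total_frequency = sum(frequencies)

print(first, total_frequency)
```

Calling `describe` on the other entries of `classes` gives their boundaries, widths and class marks in the same way.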
Guidelines for the frequency distribution
In constructing a frequency distribution for a given data set, the following guidelines should be
followed.
a) The set of classes must be mutually exclusive. That is, a given data value should fall into
only one class/category. There should be no overlap between classes or their limits.
b) The classes must be exhaustive. That is, we have to include all possible data values. No data
value should fall outside the range covered by the frequency distribution.
c) If possible, the classes should have equal widths. Unequal class widths make it difficult to
interpret both the frequency distribution and its graphical presentation. One exception occurs
when there is an open-ended distribution, i.e. one that has no specific beginning value or no
specific ending value.
Example: class
< 10 (meaning that any value below 10 will be tallied in this class)
10 – 20
21 – 31
32 – 42
43 – 53
54 – 64
≥ 65 (meaning that any value of 65 or above will be tallied in the last class)
Generally, open-ended classes are classes with either no lower limit or no upper limit: the lowest
class lacks a lower limit or the highest class lacks an upper limit.
Construction of frequency distribution table
There are some steps we need to follow in constructing a frequency distribution table. These
are:
1. Arrange the data in some order.
2. Find the range.
3. Find the desired number of classes. There is no hard and fast rule for determining the
number of classes of a data set; it is a subjective process. In general, 5 to 20 classes are
recommended. To determine the number of classes we can use Sturges' formula:
k = 1 + 3.322 log(n), where n is the number of observations, log is the base-10 logarithm
and k is the desired number of classes, which should be rounded to the nearest whole number.
4. Find the class interval or width:
Class width = Range / Number of classes, which is again recommended to be rounded to the
nearest whole number.
5. Select a starting point for the lowest class limit. This can be the smallest data value or any
convenient number less than the smallest data value. Add the width to the starting point to
get the lower limit of the next class. Keep adding until there are k classes. Subtract one
unit from the lower limit of the second class to get the upper limit of the first class. Then
add the width to each upper limit to get all the upper limits.
6. Tally the data.
7. Find the frequency from the tallies.
Let us use our previous data set, which shows the income level of 50 households.
Data set
112 100 127 120 134 105 110 118 109 112
110 118 117 116 118 114 114 122 105 109
107 112 114 115 118 118 122 117 106 110
116 108 110 121 113 119 111 120 104 110
120 113 120 117 105 118 112 110 114 114
To construct the frequency distribution table (to group the data):
1. Array the data.
2. Find the range: 134 – 100 = 34.
3. Determine the number of classes (k) using Sturges' formula, $k = 1 + 3.322 \log n$:
$k = 1 + 3.322 \log 50 = 6.64 \approx 7$, where $3.322 \log 50 = 5.64$.
4. Then the class width is
Class width = Range / Number of classes = 34/7 ≈ 4.9, which we round to 5.
5. Select a starting point for the lowest class limit. Let us use 100 (the smallest value) as the
starting point. Add the width to the starting point to get the lower limit of the next class.
Keep adding until there are 7 classes. Subtract one unit from the lower limit of the second
class to get the upper limit of the first class. Then add the width to each upper limit to get
all the upper limits.
100 + 5 = 105 and 105 – 1 = 104
1st class = 100 – 104
2nd class = 105 – 109, etc.
6. Then tally the data, and
7. Find the frequency from the tallies.
The completed frequency distribution is given as:
Table 3.2: Constructing frequency distribution table
Class Frequency Class boundaries
100-104 2 99.5-104.5
105-109 8 104.5-109.5
110-114 18 109.5-114.5
115-119 13 114.5-119.5
120-124 7 119.5-124.5
125-129 1 124.5-129.5
130-134 1 129.5-134.5
Total frequency 50
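The construction steps can be sketched in Python as follows. This is a minimal illustration, assuming whole-number data (so that one unit is 1 and boundaries lie 0.5 beyond the limits) and rounding both k and the class width to the nearest whole number, as done in the worked example.

```python
import math

# Household income data from the example above (50 observations).
incomes = [
    112, 100, 127, 120, 134, 105, 110, 118, 109, 112,
    110, 118, 117, 116, 118, 114, 114, 122, 105, 109,
    107, 112, 114, 115, 118, 118, 122, 117, 106, 110,
    116, 108, 110, 121, 113, 119, 111, 120, 104, 110,
    120, 113, 120, 117, 105, 118, 112, 110, 114, 114,
]

n = len(incomes)
k = round(1 + 3.322 * math.log10(n))        # Sturges' formula, rounded
data_range = max(incomes) - min(incomes)    # maximum minus minimum
width = round(data_range / k)               # class width, rounded

# Lower limits start at the smallest value; each upper limit is one unit
# below the next lower limit (the data are whole numbers).
start = min(incomes)
classes = [(start + i * width, start + (i + 1) * width - 1) for i in range(k)]

# Count how many observations fall within the limits of each class.
frequencies = [sum(low <= x <= high for x in incomes) for low, high in classes]

for (low, high), f in zip(classes, frequencies):
    print(f"{low}-{high}  {f}  {low - 0.5}-{high + 0.5}")
```

The printed rows can be checked against Table 3.2. A different starting point or width would give a different, but equally acceptable, table.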
Types of frequency distributions
There are three types of frequency distribution.
These are:
a) Absolute frequency;
b) Relative frequency;
c) Cumulative frequency.
Dear distance learner, can you define these three types of frequency distribution? Try to
answer below.
a) Absolute frequency: an absolute frequency distribution table shows the absolute/actual
number of occurrences of an entry or group of entries in a data set. To construct an
absolute frequency distribution table, list all the scores in the first column and count the
number of times each score occurs in the original data set. Record this against each item
in the second column.
b) Relative frequency: the relative frequency distribution table shows the number of
occurrences of each item or class of items in the data set as a proportion of the total
number of observations. Thus, the relative frequency shows what fractional part or
proportion of the total frequency belongs to the corresponding category. It can be
expressed in decimal, fraction or percentage form:
$RF = \dfrac{AF}{TF} = \dfrac{f_i}{n}$
where RF = relative frequency, AF = absolute frequency and TF = total frequency (number of
observations, n).
* Note that the sum of the relative frequencies is always 1 or 100%. That is, $\sum_i \frac{f_i}{n} = 1$.
c) Cumulative frequency: The cumulative frequency distribution table shows the absolute
frequency of occurrence added at each successive class in the data set. Alternatively one
can use the relative cumulative frequency table based on relative frequencies.
Dear distance learner, given the following frequency distribution we will see how we can
compute the above three frequencies.
Table 3.3 Absolute, relative and cumulative frequencies
Class    Class        Absolute    Cumulative   Relative    Cumulative relative
limits   boundaries   frequency   frequency    frequency   frequency
24-30    23.5-30.5    3           3            3/25        3/25
31-37    30.5-37.5    1           4            1/25        4/25
38-44    37.5-44.5    5           9            5/25        9/25
45-51    44.5-51.5    9           18           9/25        18/25
52-58    51.5-58.5    6           24           6/25        24/25
59-65    58.5-65.5    1           25           1/25        25/25
Total                 25                       1
Here $\sum_{i=1}^{k} f_i = n = 25$ and $\sum_{i=1}^{k} \frac{f_i}{n} = 1$, where k = 6 is the number of classes.
Furthermore, cumulative frequency distributions can be classified as “less than” and/or “more
than” cumulative frequency distributions. The “less than” cumulative frequency of a class is the
total frequency of all values less than the upper boundary of the class and the “more than”
cumulative frequency of a class is the total frequency of all values which are greater than the
lower boundary of the class.
By using the previously constructed frequency distribution table we can see the above types of
frequencies.
Table 3.4 Less than and more than cumulative frequency distributions
Class     Absolute    Upper class   Less than cumulative   Lower class   More than cumulative
          frequency   boundary      frequency              boundary      frequency
100-104   2           104.5         2                      99.5          50
105-109   8           109.5         10                     104.5         48
110-114   18          114.5         28                     109.5         40
115-119   13          119.5         41                     114.5         22
120-124   7           124.5         48                     119.5         9
125-129   1           129.5         49                     124.5         2
130-134   1           134.5         50                     129.5         1
Total     50
Based on the above frequency distribution table we can interpret the results in different ways.
Examples:
• 31 (18 + 13) of the households earn a monthly income of birr 110 – 119.
• 62% of the households earn a monthly income of birr 110 – 119 (31/50 × 100%).
• 28 of the households earn a monthly income of less than birr 114.5.
• 40 of the households earn a monthly income of at least birr 109.5.
Many other interpretations can be drawn from the table in the same way.
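As a rough sketch, the "less than" and "more than" columns can be computed from the absolute frequencies of the household income distribution as follows. The variable names are illustrative.

```python
from itertools import accumulate

# Absolute frequencies of the classes 100-104, 105-109, ..., 130-134.
frequencies = [2, 8, 18, 13, 7, 1, 1]
n = sum(frequencies)

# "Less than": observations below the upper boundary of each class.
less_than = list(accumulate(frequencies))

# "More than": observations above the lower boundary of each class,
# i.e. the total minus everything accumulated before the class.
more_than = [n - c + f for c, f in zip(less_than, frequencies)]

# Share of households in the classes 110-114 and 115-119 combined.
share_110_119 = 100 * (frequencies[2] + frequencies[3]) / n

print(less_than)
print(more_than)
print(share_110_119)
```

For every class, the "less than" value of the previous class plus the "more than" value of the current class adds up to the total frequency, which is a useful consistency check.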
Dear distance learners! One can construct several different but correct and acceptable
frequency distributions for the same data by using:
 a different class width
 a different number of classes or
 a different starting point
Dear distance learners!

o What are the differences between relative, absolute and cumulative frequencies?
o What is the difference between less than and more than cumulative frequencies?
3.4.2 Graphical Method of Data Presentation

After the data have been organized into a frequency distribution, they can be presented in
graphical form. To represent data in economics and business, as in other sciences, we use
graphical and diagrammatical methods, because they:
• Convey the data to viewers in pictorial/graphic form, which makes them attractive
and easily understandable,
• Get the audience's attention in a publication or a spoken presentation,
• Help to discuss an issue, reinforce a critical point, or summarize a data set,
• Help to see a trend or pattern in a situation over a period of time.
Dear distance learners! The three most commonly used graphs in research are:
a. The histogram
b. The frequency polygon
c. The cumulative frequency graph or ogive (pronounced "o-jive")
The histogram: a graph that displays the data by using adjacent vertical rectangles (unless
the frequency of a class is zero) of various heights to represent the frequencies of the classes.
The tallest rectangle in a histogram is associated with the class having the greatest number of
observations (frequency), and vice versa.
In a histogram the class boundaries are marked on the horizontal axis and the class frequencies
on the vertical axis. The height of the rectangles (along the y-axis) can represent either the
absolute or the relative frequencies of the classes. We would reach the same conclusions, and the
shape of the histogram would be the same, if we used a relative frequency distribution instead of
the absolute (actual) frequencies. The only difference is that the vertical axis would be reported
in percents (proportions) of households instead of the number of households. The following
frequency distribution will help us to construct a histogram.
Class boundaries Absolute frequency
99.5-104.5 2
104.5-109.5 8
109.5-114.5 18
114.5-119.5 13
119.5-124.5 7
124.5-129.5 1
129.5-134.5 1
Total 50
To construct a histogram we mark the class boundaries on the horizontal axis and we mark the
frequencies on the vertical axis, as we said above the frequencies can be absolute or relative.
Then using the frequencies as the heights, we draw vertical bars for each class
Figure 3.1 Histogram
[Histogram: class boundaries from 99.5 to 134.5 on the horizontal axis and absolute frequency
(0 to 18) on the vertical axis; bar heights 2, 8, 18, 13, 7, 1 and 1.]
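Where no plotting tool is at hand, the shape of the histogram can be previewed as text. The sketch below is only an approximation of the figure: it draws the bars horizontally, one row per class, with one `#` per observation.

```python
# Class boundaries and absolute frequencies of the household income data.
boundaries = [(99.5, 104.5), (104.5, 109.5), (109.5, 114.5), (114.5, 119.5),
              (119.5, 124.5), (124.5, 129.5), (129.5, 134.5)]
frequencies = [2, 8, 18, 13, 7, 1, 1]

lines = []
for (low, high), f in zip(boundaries, frequencies):
    label = f"{low:>5}-{high:<5}"             # class boundaries as the row label
    lines.append(label + " | " + "#" * f + " " + str(f))

print("\n".join(lines))
```

Because the rows follow one another without gaps, the text version keeps the key property of a histogram: adjacent bars whose lengths are proportional to the class frequencies.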
Dear distance learner, from the above histogram, which class contains the greatest number
of data values (frequencies)? Try to answer below.
The frequency polygon: the frequency polygon consists of line segments connecting the points
formed by the intersection of the class marks with the class frequencies. Relative frequencies
or percentages may also be used in constructing the figure. Empty classes are included at each
end so that the curve intersects the x-axis and the frequency polygon is closed.
To construct a frequency polygon we mark the class marks on the horizontal axis and the
frequencies on the vertical axis; as with the histogram, the frequencies can be absolute or
relative.
Using the frequency distribution given above (in constructing the histogram), we can construct a
frequency polygon. There are some steps we should follow.
1. Find the class marks.
Class boundaries Class mark Frequency
99.5 - 104.5 102 2
104.5 - 109.5 107 8
109.5 - 114.5 112 18
114.5 - 119.5 117 13
119.5 - 124.5 122 7
124.5 - 129.5 127 1
129.5 - 134.5 132 1
2. Draw the x and y axes. Label the x-axis with the class marks and use a suitable scale on
the y-axis for the frequencies (absolute or relative).
3. Plot the points (class mark, frequency) and connect these coordinates (x, y) with line
segments.
Figure 3.2 Frequency polygon
[Frequency polygon: class marks from 97 to 137 on the horizontal axis and frequency (0 to 20) on
the vertical axis; plotted points (97, 0), (102, 2), (107, 8), (112, 18), (117, 13), (122, 7),
(127, 1), (132, 1) and (137, 0).]
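The coordinates of the frequency polygon, including the two empty classes that close it on the x-axis, can be generated as in the following sketch. It assumes equal class widths, so the empty classes lie one width below the first class mark and one width above the last.

```python
# Class marks and frequencies of the household income distribution.
class_marks = [102, 107, 112, 117, 122, 127, 132]
frequencies = [2, 8, 18, 13, 7, 1, 1]

width = class_marks[1] - class_marks[0]   # equal class width assumed

# Add an empty class (frequency 0) at each end so the polygon is closed.
xs = [class_marks[0] - width] + class_marks + [class_marks[-1] + width]
ys = [0] + frequencies + [0]

points = list(zip(xs, ys))   # (class mark, frequency) pairs to be connected
print(points)
```

Joining consecutive points with straight line segments gives the polygon; any plotting tool, or graph paper, can be used for the drawing itself.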
Dear distance learner, we now discuss the cumulative frequency graph (ogive).
The cumulative frequency graph (ogive): the ogive is a graph that displays the cumulative
frequencies of the classes. There are two types of cumulative frequency graph: the "less than"
and the "more than" cumulative frequency graphs.
Example: construct an ogive for the frequency distribution used above for the histogram and the
frequency polygon. As with the histogram, there are some steps we should follow. These are:
1. Find the cumulative frequency for each class.
Class boundaries Less than cumulative frequency Found by
99.5 - 104.5 2 2+0
104.5 - 109.5 10 2+8
109.5 - 114.5 28 2+8+18
114.5 - 119.5 41 2+8+18+13
119.5 - 124.5 48 2+8+18+13+7
124.5 - 129.5 49 2+8+18+13+7+1
129.5 - 134.5 50 2+8+18+13+7+1+1
2. Draw the x and y axes, and label the x-axis with the class boundaries and the y-axis with
the cumulative frequencies.
3. Plot the cumulative frequency at each upper class boundary. Upper class boundaries are
used since the cumulative frequencies represent the number of data values accumulated
up to the upper boundary of each class.
Figure 3.3 Less than cumulative frequency graph
[Ogive: class boundaries from 99.5 to 134.5 on the horizontal axis and cumulative frequency
(0 to 60) on the vertical axis.]
Cumulative frequency graphs (less than cumulative frequency) are used to visually represent
how many values are below a certain upper class boundary. For example, to find how many
households earn less than 114.50 birr, we can locate 114.5 birr on the x – axis, draw a vertical
line up until it intersects the graph, and then draw a horizontal line at the point to the y – axis.
The value is 28 households.
Dear distance learner, please try to construct the more than cumulative frequency graph (ogive)
by yourself using the above example.
Note: the abscissa (x-value) of the point of intersection of the two ogive curves (less than and
more than) gives the median of the given data. We will discuss the median briefly in the next
chapter.
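Reading a value off the less than ogive amounts to linear interpolation between the plotted points. The sketch below assumes the ogive starts at zero at the lowest class boundary and that the two ogives cross where the cumulative frequency equals half of the total; the helper name `value_at_cumulative` is hypothetical. The result is an estimate based on the grouped data, not the exact median of the raw values.

```python
# Class boundaries and the "less than" cumulative frequency reached at each one.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
less_than = [0, 2, 10, 28, 41, 48, 49, 50]

def value_at_cumulative(target, xs, cums):
    """Return x where the less than ogive reaches `target` (linear interpolation)."""
    for i in range(len(xs) - 1):
        c0, c1 = cums[i], cums[i + 1]
        if c0 <= target <= c1 and c1 > c0:
            return xs[i] + (target - c0) / (c1 - c0) * (xs[i + 1] - xs[i])
    raise ValueError("target lies outside the range of the ogive")

n = less_than[-1]
# The two ogives intersect at a cumulative frequency of n / 2.
median_estimate = value_at_cumulative(n / 2, boundaries, less_than)

print(round(median_estimate, 2))
```

The same helper can be used in the other direction of the worked example: a cumulative frequency of 28 corresponds to the boundary 114.5.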
3.4.3 Diagrammatical data presentation
Diagrams make a given data set more attractive to the eye. As such, they are well suited for
publicity and propaganda. The most commonly used diagrams in economics and business are the
following:
a) Line graphs
b) Bar charts
c) Pie – charts
a. Line graphs (charts)

Dear distance learner, for which type of data are line charts ideal? Try to answer below.
Line graphs (charts) are particularly effective for business and economic data to show the
changes or trends in a variable over time. They are best suited to time series data. The variable
of interest, such as the number of units sold or the total value of sales, is scaled along the
y-axis and time along the x-axis. Line graphs are widely used by investors to support decisions to
buy and sell stocks and bonds in the financial market. The idea is to show a trend that is likely
to continue into the future, and to use that pattern to make predictions for the immediate future.
Two or more series of data can be plotted on the same line chart. Thus a chart can show the trend
of several different variables, which allows for a comparison of several series over the same
period of time.
The line graph (chart) below shows the unemployment rate of a country from 1992 to 2000.
Figure 3.4 Line graph (chart) presentation of data
[Line chart: unemployment rate by year, with a linear trend line; years from 1990 to 2002 on the
horizontal axis and the rate from 0% to 18% on the vertical axis.]
Dear distance learner! From the above graph we can see that, the unemployment rate decreases
from around 1992 and reaches its minimum in 1995( approximately 10%) and then starts to
increase.
b. Bar charts: bar charts are more applicable when the horizontal axis deals with data that are
qualitative or non-continuous in nature, e.g. gender, marital status, etc. When we represent
data using bar charts, the bars are not joined together. All the bars must have equal width and
the distance between bars must be equal.
Example
Education level Earnings/year
High school Diploma 22,895.00
Bachelor Degree 40,478.00
Master’s Degree 73,165.00
Figure 3.5 Bar chart presentation of data
[Bar chart: education level on the horizontal axis and earnings/year (0 to 80,000) on the vertical
axis; High school Diploma 22,895.00, Bachelor Degree 40,478.00, Master's Degree 73,165.00.]
Pie Chart: A pie chart is most commonly used to display percentages, although it can also be
used to display frequencies or relative frequencies. The whole pie (or circle) represents the
total sample or population. We then divide the pie into portions that represent the
different categories/classes. The angle of each portion is its relative frequency multiplied by 360°.
Example: A sample of 200 students were asked to select the department in which they could
be most effective. The following data show the number of students in each department. Draw
a pie chart based on the data.
Department    Number of students   Relative frequency   Percent   Angle
Economics     92                   0.46                 46%       0.46 x 360° = 165.6°
Management    49                   0.245                24.50%    0.245 x 360° = 88.2°
Accounting    37                   0.185                18.50%    0.185 x 360° = 66.6°
Banking       13                   0.065                6.50%     0.065 x 360° = 23.4°
Marketing     9                    0.045                4.50%     0.045 x 360° = 16.2°
Total         200                  1                    100%      360°
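The angle column of the table can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the function name `pie_angles` is our own choice and is not part of the course material.

```python
# Student counts per department, taken from the table above.
counts = {
    "Economics": 92,
    "Management": 49,
    "Accounting": 37,
    "Banking": 13,
    "Marketing": 9,
}

def pie_angles(freqs):
    """Return the pie-chart angle (in degrees) of each category.

    angle = relative frequency x 360
    """
    total = sum(freqs.values())
    return {name: f / total * 360 for name, f in freqs.items()}

angles = pie_angles(counts)
for name, angle in angles.items():
    print(f"{name:<12}{angle:6.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick check that no category was left out.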
Figure 3.6 Pie chart presentation of data
[Pie chart of percent by department: Economics, 46%; Management, 24.50%; Accounting, 18.50%; Banking, 6.50%; Marketing, 4.50%.]
Dear distance learner! Using the same procedure, please draw a pie chart based on the
following data.
Assume a typical person's average monthly expenditure on food, clothing, transportation and
others is 1500, 2000, 300 and 800 birr respectively.
Construct a pie chart that represents this data set.
Review Questions:
1. What are the differences between time series and panel data?
2. Briefly explain the concept of a cumulative frequency distribution. How are the "more than"
and "less than" cumulative frequencies calculated?
3. When do we use line graphs/charts?
4. Briefly explain the three types of data collection methods.
5. Suppose you are asked to group the final exam marks (out of 80) of 50 students
using a uniform class interval. The marks of the students are listed below.
21 18 30 40 41 33 73 25 23 25
19 33 65 17 20 76 47 69 20 31
18 24 35 24 17 36 65 70 53 25
65 16 24 29 42 37 26 46 27 63
22 22 23 26 71 37 75 25 27 23
A. Find the number of classes.
B. Find the class width.
C. Construct a frequency distribution table with class boundaries (take 16 as the lower class
limit of the 1st class).
6. Why do we use graphs and diagrams to present a given data set?
7. In a certain frequency distribution table, the upper class boundary and the lower class
limit of the 3rd class are 20.5 and 16 respectively. Determine the class mark of the 3rd class.
8. The following frequency distribution table displays the money spent by 100 foreign
visitors on visiting the Haile Selassie palace in Bahir Dar.
Amount spent (in $)   Number of visitors
3-7                   10
8-12                  30
13-17                 35
18-22                 20
23-27                 5
Total                 100
Find the class mark of each class, the relative frequencies and the cumulative frequencies.
CHAPTER FOUR: MEASURES OF CENTRAL TENDENCY
Introduction
Dear distance learners! In the previous chapter, you studied the classification of
data and learnt how to present data using tabular and
graphical/diagrammatical methods, in the form of graphs such as bar graphs, histograms,
pie charts, ogives and frequency polygons. In addition to these methods, a data set can
be described using a range of numerical measures. Perhaps the best place to start is with some
measure or measures of central location/tendency. Measures of central tendency are used to
define, in some sense, the centre of a set of measurements. The most commonly used measures
of central location are the mean, median and mode. The relative values of these measures
depend very much on the shape and position of the distribution of the data they
describe. Thus, in this chapter, we will discuss how to compute these measures of central
tendency for both ungrouped and grouped data. We shall also discuss the positional
measures: quartiles, deciles and percentiles.
Chapter Objectives
At the end of this chapter, students will be able to:
• Calculate the measures of central tendency (mean, median and mode) for ungrouped
and grouped data;
• Explain the characteristics/properties and uses of each measure of central tendency;
• Identify the position of the mean, median and mode for symmetric and skewed
distributions;
• Compute other measures of location (quartiles, deciles and percentiles).
4.1 Types of measures of central tendency
Dear distance learners! Before we discuss the types of measures of central
tendency, let us define central tendency. Central tendency refers to the location in a distribution
around which most of the values are concentrated. Measures of central tendency
indicate the middle, most likely, or most frequent values. In other words, they
tell us where the center of the distribution of the data is located.
The three most commonly used measures of central tendency are the mean, median, and
mode.
4.1.1 Mean: classification and Properties

The mean/average of a given data set can be an arithmetic, weighted, geometric, or harmonic
mean. Let us briefly discuss each of these means.
a. Arithmetic Mean: The arithmetic mean is the sum of the data set values divided by the number
of observations. In computing the mean, summation (sigma) notation is a convenient
shorthand that gives a concise expression for the sum of the values of a
variable. The general formula is
$$\sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n$$
which is read as "the sum of the values of X from 1 to n", where n can be any positive whole
number (1, 2, 3, 4, ..., n).
Dear distance learners, the arithmetic mean or average value of a variable is the most important
numerical measure of central tendency. For ungrouped data, the population mean (usually
denoted by "μ") is the sum of all the population values divided by the total number of population
values:
$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$
where: N = number of elements in the population
μ = population mean
The population mean applies when the data represent all of the items within the population.
For ungrouped data, the sample mean is the sum of all the sample values divided by the number
of sample values:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
where: $\bar{X}$ = sample mean
n = number of elements in the sample (sample size)
Example: A sample of five executives received the following salaries (Birr in thousands): 14.0, 15.0, 17.0,
16.0, and 15.0. Find the mean salary.
$$\bar{X} = \frac{\sum X_i}{n} = \frac{14.0 + 15.0 + 17.0 + 16.0 + 15.0}{5} = \frac{77}{5} = 15.4$$
Therefore, the mean salary of the executives is Birr 15,400.00.
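The salary example can be checked with a short Python sketch. The helper name `arithmetic_mean` is our own and is used only for illustration.

```python
# Salaries of the five executives (Birr in thousands), from the example above.
salaries = [14.0, 15.0, 17.0, 16.0, 15.0]

def arithmetic_mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

mean_salary = arithmetic_mean(salaries)
print(f"Mean salary: Birr {mean_salary * 1000:,.2f}")
```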
Properties of Arithmetic mean
• All the values in the data set are included in computing the mean.
• A set of data has a unique mean.
• Every set of quantitative data has a mean.
• The mean is affected by unusually large or small data values, called outliers, and may not be the
appropriate average to use in such situations.
• We cannot determine a mean for data with open-ended classes.
• The arithmetic mean is the only measure of central tendency for which the sum of the
deviations of each value from the mean is zero, i.e.
$$\sum (x - \bar{x}) = 0$$
where x is a value in the data set and $\bar{x}$ is the sample mean. Mathematically, since $\bar{x}$ is a constant,
$$\sum (x - \bar{x}) = \sum x - n\bar{x} = \sum x - n\left(\frac{\sum x}{n}\right) = \sum x - \sum x = 0$$
• If two data sets with different sample sizes, $n_1$ and $n_2$, and different sample
arithmetic means, $\bar{x}_1$ and $\bar{x}_2$ respectively, are combined for some purpose, then the
combined (grand) mean, which is the same as the weighted mean, is
$$\bar{x}_c = \frac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$$
Example:
1) The mean ages of 12 men and 10 women are 45 and 42 respectively. What is the combined
mean age?
Solution:
$$\bar{x}_c = \frac{12 \times 45 + 10 \times 42}{12 + 10} = \frac{960}{22} \approx 43.6$$
• The arithmetic mean is affected by both change of origin and change of scale. That is,
   • Given a mean for data values, if we add or subtract a constant number c from all
   data values, the new mean will be the old mean plus or minus c (change of
   origin).
   • Given a mean for data values, if we multiply all data values by a constant
   number c, then the new mean will be c times the old one (change of scale).
Let us see these changes with a numerical example.
Example: The mean life of a certain brand of bulbs is 1030 hours.
a) If a new process adds 50 hours to the life of each bulb, what will be the mean life of the bulbs?
(ans. 1080 hours, due to change of origin)
b) If a recently developed method of production doubles the life of each bulb,
what will happen to the mean life of the bulbs? (ans. 2060 hours, due to change of scale)
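The properties above can be verified numerically. The following Python sketch reuses the five salary values from the earlier example as test data; the function names are our own, and the checks hold only up to floating-point rounding.

```python
def mean(values):
    return sum(values) / len(values)

def combined_mean(n1, mean1, n2, mean2):
    """Grand mean of two groups with sizes n1, n2 and means mean1, mean2."""
    return (n1 * mean1 + n2 * mean2) / (n1 + n2)

data = [14.0, 15.0, 17.0, 16.0, 15.0]
m = mean(data)

# Property: the deviations from the mean sum to zero.
deviation_sum = sum(x - m for x in data)

# Property: combined mean of 12 men (mean age 45) and 10 women (mean age 42).
combined = combined_mean(12, 45, 10, 42)

# Property: change of origin (add a constant) and change of scale (multiply).
shifted = mean([x + 50 for x in data])   # expected: m + 50
scaled = mean([2 * x for x in data])     # expected: 2 * m

print(round(deviation_sum, 10), round(combined, 1), shifted, scaled)
```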
Arithmetic mean for grouped data

Dear distance learners! The arithmetic mean of a sample of grouped data is computed
by the following formula:
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$$
where: $f_i$ = frequency of the ith class
$X_i$ = class mark of the ith class
k = number of classes
Example: Compute the arithmetic mean for the following grouped data:
Table 4.1 Computing the arithmetic mean of grouped data
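As a sketch of how the grouped-data formula is applied, the Python code below uses the travel-time distribution that appears later in this chapter (Table 4.3) as illustrative input. The function names `class_mark` and `grouped_mean` are our own.

```python
def class_mark(lower_limit, upper_limit):
    """Midpoint of a class: (lower limit + upper limit) / 2."""
    return (lower_limit + upper_limit) / 2

def grouped_mean(classes, freqs):
    """Arithmetic mean of grouped data: sum(f_i * X_i) / sum(f_i)."""
    marks = [class_mark(lo, hi) for lo, hi in classes]
    return sum(f * x for f, x in zip(freqs, marks)) / sum(freqs)

# Class limits and frequencies borrowed from Table 4.3 (time to travel to work).
classes = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
freqs = [8, 12, 14, 9, 7]

result = grouped_mean(classes, freqs)
print(result)  # sum(f*X) = 1225 and sum(f) = 50
```

Because every observation in a class is represented by the class mark, the grouped mean is an approximation of the mean of the raw data.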
b. Weighted mean:
It is a special case of the arithmetic mean. It is the mean of data values that have been
weighted according to their relative importance. The weighted mean of a set of values $X_1, X_2, ..., X_n$
with corresponding weights $w_1, w_2, ..., w_n$ is computed from the following formula:
$$\mu_w \text{ or } \bar{X}_w = \frac{\sum w_i x_i}{\sum w_i}$$
Where: $\mu_w$ = population weighted mean
$\bar{X}_w$ = sample weighted mean
$w_i$ = weight assigned to the ith data value
$x_i$ = the ith data value
Often each weight represents the number of items in the data set having a particular value. A
good example of a weighted mean is a student's grade point average (GPA), where the
courses carry different credit hours.
Examples:
1. A student scored an A in Sophomore English (3 credit hours), a C in Psychology (3 credit
hours), a B in Microeconomics-I (4 credit hours) and a D in Civics (2 credit hours).
Assuming A has 4 grade points, B has 3 grade points, C has 2 grade points and D has 1
grade point, calculate the grade point average (GPA).
$$\bar{X}_w = \frac{4 \times 3 + 2 \times 3 + 3 \times 4 + 1 \times 2}{3 + 3 + 4 + 2} = \frac{32}{12} = 2.67$$
Dear distance learners! Using the above formula, please compute the weighted mean of the
following data.
The Satcon Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per
hour. There are 26 hourly employees, 14 of whom are paid at the $16.50 rate, 10 at the $19.00
rate, and 2 at the $25.00 rate. What is the mean hourly rate paid to the 26 employees?
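The GPA example above can be reproduced with a small Python function. This is a sketch only; `weighted_mean` is a name we chose, and the same function can be used to check your own answer to the exercise by passing the hourly rates as values and the numbers of employees as weights.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# GPA example: grade points (A=4, C=2, B=3, D=1) weighted by credit hours.
grade_points = [4, 2, 3, 1]
credit_hours = [3, 3, 4, 2]

gpa = weighted_mean(grade_points, credit_hours)
print(round(gpa, 2))
```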
c. Geometric mean
The geometric mean is widely used in business and economics to find average percentage
changes in sales, revenues/returns, profits, GDP, growth rates, etc. As with the other types
of mean, we can find the geometric mean for both ungrouped and grouped data sets.
Geometric mean for ungrouped data
The geometric mean (GM) of n positive numbers is defined as the nth root of their product. The
formula is:
$$GM = \sqrt[n]{X_1 \times X_2 \times X_3 \times \dots \times X_n} = \sqrt[n]{\prod x_i}$$
where $\prod$ denotes multiplication. We now apply the geometric mean to finding percentage
changes in sales, revenues/returns, profits, GDP, growth rates, etc. with the help of some
numerical examples.
Examples
• The interest rates on three bonds were 5, 21, and 4 percent. The average interest rate is:
$$GM = \sqrt[3]{5 \times 21 \times 4} = 7.49$$
• The returns on investment earned by a company for four successive years were 30%,
20%, -40% and 200%. What is the geometric mean rate of return on investment?
Solution: A 30% return means a gain on top of what we had (i.e. on top of 100%), so it is
expressed as the growth factor 1.3. Likewise 20% becomes 1.2, -40% is a reduction (1 - 0.4 = 0.6),
and 200% becomes 3.0.
$$GM = \sqrt[4]{1.3 \times 1.2 \times 0.6 \times 3.0} = 1.294$$
The geometric mean rate of return is therefore 1.294 - 1 = 0.294 = 29.4%.
Dear distance learners! As discussed above, the other use of the geometric mean is to
determine the average percent increase in sales, production, population or other business or economic
series from one time period to another, which can be computed using the formula below.
$$GM = \sqrt[n]{\frac{\text{value at end of period}}{\text{value at beginning of period}}} - 1$$
where n = the number of time periods between the two values.
Example:
1. The sales of soap by a soap factory increased from 755,000 tons in 1992 to 835,000 tons in
2000. What was the average annual rate of increase?
$$GM = \sqrt[8]{\frac{835,000}{755,000}} - 1 = 0.0127 = 1.27\%$$
2. The population of Ethiopia increased from 53,000,000 in 1980 to 73,000,000 in 2000.
What is the average annual rate of increase?
$$GM = \sqrt[20]{\frac{73,000,000}{53,000,000}} - 1 = 0.016 = 1.6\%$$
3. The price of a certain commodity in 1970 was 1.06 times that of 1969, and in 1971 it was
1.04 times that of 1970. In the next two years it was 1.10 and 1.23 times that of the
respective preceding years. What is the average annual percentage increase over the given
period?
$$GM = \sqrt[4]{1.06 \times 1.04 \times 1.10 \times 1.23} = 1.105, \qquad (1.105 - 1) \times 100\% = 10.5\%$$
(the average annual increase is 10.5%)
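The geometric mean examples above can be checked with the following Python sketch. The function names are our own; `math.prod` requires Python 3.8 or later.

```python
import math

def geometric_mean(values):
    """nth root of the product of n positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("the geometric mean needs positive values")
    return math.prod(values) ** (1 / len(values))

def average_growth_rate(start_value, end_value, periods):
    """Average rate of change per period: (end / start)^(1/n) - 1."""
    return (end_value / start_value) ** (1 / periods) - 1

bond_rate = geometric_mean([5, 21, 4])                    # bonds example
return_rate = geometric_mean([1.3, 1.2, 0.6, 3.0]) - 1    # returns example
soap_growth = average_growth_rate(755_000, 835_000, 8)    # 1992 to 2000
population_growth = average_growth_rate(53_000_000, 73_000_000, 20)

print(f"{bond_rate:.2f}  {return_rate:.1%}  {soap_growth:.2%}  {population_growth:.1%}")
```

Note that percentage returns must first be converted to growth factors (30% becomes 1.3), because the geometric mean is only defined for positive numbers.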
Geometric mean for grouped data
For grouped data the geometric mean is calculated as:
$$GM = \sqrt[n]{x_1^{f_1} \times x_2^{f_2} \times \dots \times x_m^{f_m}}$$
where $f_i$ is the frequency of the ith class, $x_i$ is the class mark of the ith class, m is the number of
classes and $n = \sum f_i$ is the total number of observations.
The table below shows the percentage increase in salary of 16 employees of a company. Given
the grouped data, find the geometric mean.
Table 4.2 Computing the geometric mean for grouped data
% increase in salary   Number of employees   Class mark
0-4                    5                     2
5-9                    6                     7
10-14                  3                     12
15-19                  2                     17
Solution:
$$GM = \sqrt[16]{2^5 \times 7^6 \times 12^3 \times 17^2} = 5.85$$
The geometric mean percentage increase in salary is 5.85%.
If n is a large number, computing the nth root of the product is tedious work. To simplify
the computation of GM, we make use of logarithms. Taking the logarithm of
$GM = \sqrt[n]{X_1 \times X_2 \times X_3 \times \dots \times X_n}$ gives
$$\log GM = \frac{1}{n}\log(X_1 \times X_2 \times \dots \times X_n) = \frac{\log x_1 + \log x_2 + \dots + \log x_n}{n} = \frac{\sum \log x_i}{n}$$
$$GM = \text{antilog}\left[\frac{\sum \log x_i}{n}\right]$$
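The logarithm method can be sketched in Python as follows, using the class marks and frequencies of Table 4.2. For grouped data each logarithm is weighted by its class frequency, which is our reading of the grouped formula combined with the log identity above; the function name is hypothetical.

```python
import math

def grouped_geometric_mean(class_marks, freqs):
    """GM of grouped data via logarithms: antilog(sum(f_i * log x_i) / n)."""
    n = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(class_marks, freqs))
    return 10 ** (log_sum / n)   # the antilog of a base-10 logarithm

# Class marks and frequencies from Table 4.2 (% increase in salary).
gm = grouped_geometric_mean([2, 7, 12, 17], [5, 6, 3, 2])
print(round(gm, 2))
```

Working with sums of logarithms instead of a direct product also avoids very large intermediate numbers when n is large.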
d. Harmonic Mean
Harmonic mean is mostly used to average rates, such as prices per unit, speeds and
others. It is appropriate when the rates are measured over equal amounts of the numerator
quantity (for example, speeds over equal distances).
Harmonic mean for ungrouped data: The harmonic mean of n positive observations is defined as
the number of values divided by the sum of the reciprocals of the values. That is,
$$HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
Example: Suppose a person drove 100 km at 40 km/hr and returned driving at 50 km/hr. What is
the average speed?
Solution: Since Speed = Distance / Time,
$$t_1 = \frac{\text{Distance}}{\text{Speed}} = \frac{S}{V} = \frac{100\ km}{40\ km/hr} = 2.5 \text{ hours to make the first trip}$$
$$t_2 = \frac{\text{Distance}}{\text{Speed}} = \frac{S}{V} = \frac{100\ km}{50\ km/hr} = 2 \text{ hours to return}$$
Total time = 2.5 hours + 2 hours = 4.5 hours
Total distance = 100 km + 100 km = 200 km
$$V = \frac{S}{t} = \frac{200\ km}{4.5\ hr} = 44.44\ km/hr$$
The same value is obtained from the arithmetic mean of the two speeds weighted by the time spent at each speed:
$$\text{Weighted mean} = \frac{2.5 \times 40 + 2 \times 50}{4.5} = 44.44\ km/hr$$
This value can also be found by using the harmonic mean formula:
$$HM = \frac{2}{\frac{1}{40} + \frac{1}{50}} = 44.44\ km/hr$$
NB. Here the simple arithmetic mean, (40 + 50)/2 = 45 km/hr, does not give the average speed, because
the person traveled equal distances at different speeds. If, however, he had traveled for
equal times at the two speeds, the arithmetic mean would have given the correct average. If we want
to use the arithmetic mean, we have to take the weights (the time spent at each speed) into account.
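The speed example can be checked in Python. The sketch below compares the harmonic mean of the two speeds with the direct calculation of total distance divided by total time; `harmonic_mean` is our own helper name.

```python
def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("the harmonic mean needs positive values")
    return len(values) / sum(1 / v for v in values)

speeds = [40, 50]           # km/hr on the outward and return trips
distance_each_way = 100     # km

hm_speed = harmonic_mean(speeds)

# Direct calculation: total distance / total time.
total_time = sum(distance_each_way / v for v in speeds)
direct_speed = 2 * distance_each_way / total_time

simple_average = sum(speeds) / len(speeds)   # 45 km/hr, which overstates the speed

print(round(hm_speed, 2), round(direct_speed, 2), simple_average)
```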
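Harmonic mean for grouped data
$$HM = \frac{n}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \dots + \frac{f_k}{x_k}} = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}}$$
where $x_i$ = class mark of the ith class, $f_i$ = frequency of the ith class, k = number of classes
and $n = \sum f_i$ = total number of observations.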
Relationship between Arithmetic mean, Geometric Mean and Harmonic Mean
For a set of data containing n positive observations, the following relationship always
holds: HM ≤ GM ≤ AM
The three means are equal only when all values in the set of data are equal; otherwise HM < GM < AM.
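The relationship can be illustrated with Python's standard `statistics` module (the `geometric_mean` function needs Python 3.8 or later). The first data set reuses the bond interest rates from the geometric mean example; the second, with three equal values, is made up for illustration.

```python
import math
import statistics

data = [5, 21, 4]                       # bond interest rates used earlier
am = statistics.mean(data)
gm = statistics.geometric_mean(data)
hm = statistics.harmonic_mean(data)
print(f"HM = {hm:.2f}, GM = {gm:.2f}, AM = {am:.2f}")

equal_data = [7, 7, 7]                  # hypothetical data with identical values
am_eq = statistics.mean(equal_data)
gm_eq = statistics.geometric_mean(equal_data)
hm_eq = statistics.harmonic_mean(equal_data)
all_equal = math.isclose(am_eq, gm_eq) and math.isclose(gm_eq, hm_eq)
print("All three means coincide:", all_equal)
```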
4.1.2 Median (MD)
Dear distance learner, have you heard about the median? What do you understand by the median?
Try to answer before reading on.
The other measure of central tendency is the median. The median is the value of a variable that
divides an ordered array of items into two equal parts, so that the number of items below
it is equal to the number of items above it. As we did for the mean, we can compute the median for
both ungrouped and grouped data. Let us start the discussion by computing the median for
ungrouped data.
Median for Ungrouped Data

In the case of ungrouped data, to find the median we first arrange the values in order of
their magnitude, i.e., in an array. The number of observations can be odd or even.
If the number of observations n is odd, the median is the middle value, i.e.
$$MD = \left(\frac{n+1}{2}\right)^{th} \text{ observation}$$
When the number of observations is even, the median is the arithmetic mean of the two middle values, i.e.
$$MD = \frac{\left(\frac{n}{2}\right)^{th} \text{ observation} + \left(\frac{n}{2} + 1\right)^{th} \text{ observation}}{2}$$
Example: 1. The ages for a sample of five college students are given below. Find the
median.
21, 25, 19, 20, 22
Arranging the data in ascending order gives:
19, 20, 21, 22, 25. The number of observations is odd (n = 5), so the median is the middle value, i.e. the
$\left(\frac{n+1}{2}\right)^{th} = \left(\frac{5+1}{2}\right)^{th}$ = 3rd observation. Thus the median is 21.
Dear distance learners! Based on the data set given below, compute the median.
Data set: 1, 5, 3, 9, 10, 12, 6
2. The following data set shows the average life expectancy of four countries: 76, 73, 80,
75. Compute the median.
Arranging the data in ascending order gives:
73, 75, 76, 80. The number of observations is even (n = 4), so the median is the arithmetic mean of
the two middle values, the $\left(\frac{4}{2}\right)^{th}$ and $\left(\frac{4}{2} + 1\right)^{th}$ observations:
$$MD = \frac{2^{nd} \text{ observation} + 3^{rd} \text{ observation}}{2} = \frac{75 + 76}{2} = 75.5$$
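Both cases (odd and even n) can be handled by one short Python function. This is a sketch with a function name of our own choosing; Python's built-in `statistics.median` gives the same results.

```python
def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:                      # odd n: the ((n + 1) / 2)th observation
        return ordered[middle]
    # even n: mean of the (n/2)th and (n/2 + 1)th observations
    return (ordered[middle - 1] + ordered[middle]) / 2

ages = [21, 25, 19, 20, 22]             # Example 1
life_expectancy = [76, 73, 80, 75]      # Example 2

median_age = median(ages)
median_life = median(life_expectancy)
print(median_age, median_life)
```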
Median for Grouped data
For grouped data, the median is calculated by using the following formula:
$$MD = L_{md} + \left(\frac{\frac{n}{2} - cf}{f}\right) \times i$$
Where
$L_{md}$ is the lower class boundary of the median class
n is the total number of observations
cf is the cumulative frequency before the median class
i is the class interval/width
f is the frequency of the median class
Example: The following grouped data set shows the time taken to travel to work.
Based on the grouped data, find the median.
Table 4.3: Calculation of median for grouped data
Time to travel to work   Frequency   Cumulative frequency
1 – 10                   8           8
11 – 20                  12          20
21 – 30                  14          34
31 – 40                  9           43
41 – 50                  7           50
Total frequency          50
Solution:
Steps:
a. Find the cumulative frequencies.
b. Find the total frequency: $n = \sum f_i = 50$, which is even.
c. Find the median position: the average of the $\left(\frac{n}{2}\right)^{th}$ and $\left(\frac{n}{2} + 1\right)^{th}$ positions, (25 + 26)/2 = 25.5.
d. In which class does the 25.5th item fall? In the 3rd class (21 – 30), so the 3rd class is the median
class and its lower class boundary is 20.5.
e. Find the cumulative frequency preceding the median class: 20 in this case.
f. Find the class width: 10 in this case.
g. Find the frequency of the median class: 14 in this case.
$$MD = 20.5 + \left(\frac{\frac{50}{2} - 20}{14}\right) \times 10 = 24.07 \approx 24$$
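The steps above can be sketched in Python as follows. The function locates the median class from the running cumulative frequency and then applies the formula; its name and its argument layout (lower class boundaries, frequencies, class width) are our own assumptions.

```python
def grouped_median(lower_boundaries, freqs, width):
    """Median of grouped data: L + ((n/2 - cf) / f) * i."""
    n = sum(freqs)
    if n == 0:
        raise ValueError("the distribution has no observations")
    half = n / 2
    cumulative = 0                      # cumulative frequency before the current class
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cumulative + f >= half:
            # This is the median class.
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("median class not found")

# Table 4.3: lower class boundaries of 1-10, 11-20, ..., 41-50 and their frequencies.
boundaries = [0.5, 10.5, 20.5, 30.5, 40.5]
frequencies = [8, 12, 14, 9, 7]

md = grouped_median(boundaries, frequencies, 10)
print(round(md, 2))
```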
Dear distance learners! Based on the grouped data set given below, compute the median
Class Limit Frequency Cumulative Frequency
30-40 2 2
40-50 18 20
50-60 24 44
60-70 20 64
70-80 8 72
80-90 3 75
Properties of Median
• There is a unique median for each data set.
• Geometrically, the median divides the histogram or cumulative frequency graph into two parts
with equal area.
• The median remains unaffected by the magnitude of the extreme values (outliers).
• It can be calculated for an open-ended frequency distribution if the median class doesn't lie
in an open-ended class.
4.1.3. Mode (MO)
Mode is the value of the observation that appears most frequently, i.e. the value that has the
highest frequency in a data set in the case of ungrouped data. For grouped data, the modal
class is the class with the highest frequency.
Mode (MO) for ungrouped data: A given ungrouped data set may have no mode at all, or it can
be unimodal, bimodal or multimodal. A data set has no mode when all of its values appear
equally often. A data set with one mode is called unimodal, one with two modes is called
bimodal, and one with more than two modes is called multimodal. Let us see each case with examples.
A. No mode at all, e.g. 1, 3, 9, 0, 7, 8

B. One mode (unimodal), e.g. 1, 3, 1, 7, 1, 9; the mode is 1
C. Two modes (bimodal), e.g. 7, 2, 4, 4, 7; the modes are 7 and 4
D. Many modes (multimodal), e.g. 1, 0, 0, 1, 3, 2, 2, 3, 7, 7, 4, 9; the modes are 1, 0, 3, 2 and 7
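The four cases above can be reproduced with a short Python sketch built on `collections.Counter`. Following the definition in this section, the function returns an empty list when every distinct value occurs equally often; the function name `modes` is our own.

```python
from collections import Counter

def modes(values):
    """Return the list of modes (empty if all distinct values are equally frequent)."""
    if not values:
        raise ValueError("the mode of an empty data set is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []                       # no value stands out: no mode
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 3, 9, 0, 7, 8]))                      # case A: no mode
print(modes([1, 3, 1, 7, 1, 9]))                      # case B: unimodal
print(modes([7, 2, 4, 4, 7]))                         # case C: bimodal
print(modes([1, 0, 0, 1, 3, 2, 2, 3, 7, 7, 4, 9]))    # case D: multimodal
```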
Mode (MO) for grouped data
The approximate modal value of grouped data is calculated by the following formula:
$$Mode = L_0 + \frac{f - f_1}{(f - f_1) + (f - f_2)} \times i = L_0 + \frac{f - f_1}{2f - f_1 - f_2} \times i$$
Where:
$L_0$ = lower class boundary of the modal class (i.e., the class with the highest frequency)
f = frequency of the modal class
$f_1$ = frequency of the class immediately preceding the modal class
$f_2$ = frequency of the class immediately following the modal class
i = class interval/width
Suppose the following frequency distribution table shows the daily savings of 100 members of
Habru saving and credit institution. Based on the data, find the mode.
Class       10-14   15-19   20-24   25-29   30-34
Frequency   18      23      40      17      2
Solution:
Steps:
a. Find the modal class: the 3rd class (20-24) has the highest frequency, so its lower class boundary is 19.5.
b. Find the frequency of the modal class (40) and the frequencies of the preceding (23) and following (17)
classes.
c. Find the class width: 5 in this case.
Then
$$Mode = 19.5 + \frac{40 - 23}{(40 - 23) + (40 - 17)} \times 5 = 19.5 + \frac{40 - 23}{80 - 23 - 17} \times 5 = 21.625$$
Dear distance learners! Based on the grouped data set given below, compute the mode by
yourself.
Class 91-100 101-110 111-120 121-130 131-140 141-150 151-160 161-170

Frequency 10 37 65 80 51 35 18 4
Properties of mode
• It is the easiest average to compute.
• It can be obtained for both qualitative and quantitative data.
• It is not affected by extreme values.
• The mode may not exist for a data set.
• It is not unique. A data set can have more than one mode.
• The mode of a set of data is not based on all observations.
4.2. Distribution, Shape and Measures of Central Tendency
Dear distance learners! The relative values of the mean, median and mode depend very much
on the shape of the distribution of the data they describe. A distribution
can be either symmetric or skewed, depending on how the data are distributed around
the center.
Symmetric (normal, bell-shaped) distribution: occurs when the data values are evenly
distributed around the center. In a symmetric distribution, the left and right sides of the
distribution are mirror images of each other, and when the distribution has a single peak the
mean, median and mode are equal.
Skewed distribution: occurs when the data values are not evenly distributed around the
center. Skewness refers to the tendency of the distribution to “tail off” to the right or left. It is
simply a lack of symmetry in a distribution.
Right (positively) skewed distribution: Most of the data values cluster at the lower end of the
distribution and the tail extends to the right. The arithmetic mean is the largest of the three
measures, because the mean is influenced by a few extremely high values more than the median
or mode: Mode < Median < Mean. In such distributions, the median tends to be a better measure of
central tendency than the mean.
Left (negatively) skewed distribution: Most of the observations lie at the upper end of the
distribution and the tail extends to the left. The mean is less than the median, which in turn is
less than the mode: Mean < Median < Mode. As with the positively skewed distribution, the
median is less influenced by extreme values and tends to be a better measure of central
tendency than the mean.
In both positively and negatively skewed distributions, therefore, the median is a better
measure of central tendency than the mean.
The figures below show the relative values of the mean, median and mode according to
the distribution of the data they describe.
Fig 4.1 The Relative Positions of the Mean, Median and the Mode
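The ordering of the three measures can be seen on a small data set. The values below are hypothetical and were chosen only to produce a long right tail; the second data set is their mirror image. The sketch uses Python's standard `statistics` module.

```python
import statistics

# Hypothetical right-skewed data: most values are small, a few are very large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
# Mirror image of the same data (20 - x), giving a long left tail.
left_skewed = [20 - x for x in right_skewed]

def summarize(data):
    """Return (mean, median, mode) of the data."""
    return statistics.mean(data), statistics.median(data), statistics.mode(data)

r_mean, r_median, r_mode = summarize(right_skewed)
l_mean, l_median, l_mode = summarize(left_skewed)

print("Right skewed: mode", r_mode, "< median", r_median, "< mean", r_mean)
print("Left skewed:  mean", l_mean, "< median", l_median, "< mode", l_mode)
```

In both data sets the mean is pulled furthest toward the tail, which is why the median is usually preferred for skewed data.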
4.3. Positional measures
Dear distance learners! Previously we discussed measures of central tendency,
which show the location around which most values of a distribution are concentrated.
Now we discuss another kind of measure, one that determines the position of a single
value in relation to the other values in a sample or population data set. We call it a measure of
position. Positional measures that divide the data into equal parts are called quantiles (fractiles).
There are many measures of position; however, only quartiles, deciles and percentiles are discussed
in this section. To obtain such measures, the data must first be ordered by
magnitude.
Quartiles
Quartiles are the summary measures that divide a ranked data set into four equal parts. Three
measures divide any data set into four equal parts: the first
quartile (denoted by Q1), the second quartile (denoted by Q2), and the third quartile (denoted by
Q3). The data should be ranked in increasing order before the quartiles are determined. The
upper quartile, Q3, is the value below which 75% of the observations fall, with the remaining
25% above it. The lower quartile, Q1, is the reverse: 25% of the observations fall below it and
75% above it. The second quartile is the same as the median of the data set: 50% of the
observations fall below it and the remaining 50% above it. Let us see how to compute quartiles for
ungrouped and grouped data.
Quartiles for ungrouped data
The jth quartile, denoted by $Q_j$ where j = 1, 2, 3, is defined for ungrouped data as
$$Q_j = \left[\frac{j(n+1)}{4}\right]^{th} \text{ observation}$$
Example: Find the quartiles (Q1, Q2 and Q3) of the following data:
47 28 39 51 33 37 59 24 33
Solution: First arrange the data in ascending order: 24, 28, 33, 33, 37, 39, 47, 51, 59
$$Q_1 = \left[\frac{1(9+1)}{4}\right]^{th} \text{item} = (2.5)^{th} \text{ item} = 2^{nd} \text{ item} + 0.5(3^{rd} \text{ item} - 2^{nd} \text{ item}) = 28 + 0.5(33 - 28) = 30.5$$
$$Q_2 = \left[\frac{2(9+1)}{4}\right]^{th} \text{item} = (5)^{th} \text{ item} = 37$$
$$Q_3 = \left[\frac{3(9+1)}{4}\right]^{th} \text{item} = (7.5)^{th} \text{ item} = 7^{th} \text{ item} + 0.5(8^{th} \text{ item} - 7^{th} \text{ item}) = 47 + 0.5(51 - 47) = 49$$
Dear distance learners! Using the same procedure, try to find the three quartiles of the
following data.
10, 25, 15, 30, 35, 40, 50, 45, 55, 60 (answers: Q1 = 22.5, Q2 = 37.5, Q3 = 51.25)
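The position rule j(n + 1)/4, together with the interpolation used in the worked example, can be sketched in Python. The function below is written for any number of equal parts, so the same code also covers deciles (parts = 10) and percentiles (parts = 100). Its name, and the choice to return the smallest or largest value when the position falls outside the data, are our own assumptions.

```python
def quantile_by_position(values, j, parts):
    """jth quantile using the [j(n + 1) / parts]th ordered observation."""
    ordered = sorted(values)
    n = len(ordered)
    position = j * (n + 1) / parts       # 1-based position, may be fractional
    whole = int(position)                # e.g. 2 for the 2.5th item
    fraction = position - whole          # e.g. 0.5 for the 2.5th item
    if whole < 1:                        # assumption: clamp to the smallest value
        return ordered[0]
    if whole >= n:                       # assumption: clamp to the largest value
        return ordered[-1]
    lower = ordered[whole - 1]
    upper = ordered[whole]
    return lower + fraction * (upper - lower)

example = [47, 28, 39, 51, 33, 37, 59, 24, 33]
exercise = [10, 25, 15, 30, 35, 40, 50, 45, 55, 60]

example_quartiles = [quantile_by_position(example, j, 4) for j in (1, 2, 3)]
exercise_quartiles = [quantile_by_position(exercise, j, 4) for j in (1, 2, 3)]
print(example_quartiles, exercise_quartiles)
```

Statistical software often uses slightly different position rules, so library functions may return values that differ a little from this hand method.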
Quartiles for grouped data
Dear distance learners! The jth quartile, denoted by $Q_j$ where j = 1, 2, 3, is
computed for grouped data by the following formula:
$$Q_j = L_j + \left(\frac{\frac{j \times n}{4} - cf}{f_j}\right) \times w$$
Where j = 1, 2, 3
$L_j$ = lower class boundary of the jth quartile class (the class which contains the $\left(\frac{j \times n}{4}\right)^{th}$ item)
w = class width
$f_j$ = frequency of the jth quartile class
n = total number of observations
cf = the cumulative frequency of the class preceding the jth quartile class
Let us compute the three quartiles for the following distribution table.
Table 4.6 Calculation of quartiles for grouped data
Class     fi    cf
1 – 10    8     8
11 – 20   14    22
21 – 30   12    34
31 – 40   9     43
41 – 50   7     50
Q1: The $\left(\frac{n}{4}\right)^{th}$ item = $\left(\frac{50}{4}\right)^{th}$ item = 12.5th item is Q1, and it falls in the 2nd class, so 11 - 20 is the first quartile class.
$$Q_1 = 10.5 + \left(\frac{\frac{1 \times 50}{4} - 8}{14}\right) \times 10 = 10.5 + 3.214 = 13.714$$
Q2: The $\left(\frac{2n}{4}\right)^{th}$ item = $\left(\frac{100}{4}\right)^{th}$ item = 25th item is Q2, and it falls in the 3rd class, so 21 - 30 is the second quartile class.
$$Q_2 = 20.5 + \left(\frac{\frac{2 \times 50}{4} - 22}{12}\right) \times 10 = 20.5 + 2.5 = 23 = \text{median}$$
Q3: The $\left(\frac{3n}{4}\right)^{th}$ item = $\left(\frac{150}{4}\right)^{th}$ item = 37.5th item is Q3, and it falls in the 4th class, so 31 - 40 is the third quartile class.
$$Q_3 = 30.5 + \left(\frac{\frac{3 \times 50}{4} - 34}{9}\right) \times 10 = 30.5 + 3.89 = 34.39$$
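The three quartiles of Table 4.6 can be reproduced with the Python sketch below. The function finds the quartile class from the running cumulative frequency and then applies the formula; the function name and argument layout are our own.

```python
def grouped_quartile(lower_boundaries, freqs, width, j):
    """jth quartile of grouped data: L + ((j*n/4 - cf) / f) * w, for j = 1, 2, 3."""
    if j not in (1, 2, 3):
        raise ValueError("j must be 1, 2 or 3")
    n = sum(freqs)
    target = j * n / 4                   # position of the jth quartile
    cumulative = 0                       # cf of the classes before the current one
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("quartile class not found")

# Table 4.6: lower class boundaries of 1-10, ..., 41-50 and their frequencies.
boundaries = [0.5, 10.5, 20.5, 30.5, 40.5]
frequencies = [8, 14, 12, 9, 7]

q1, q2, q3 = (grouped_quartile(boundaries, frequencies, 10, j) for j in (1, 2, 3))
print(round(q1, 3), round(q2, 3), round(q3, 2))
```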
Dear distance learners! Using the same procedure, try to find the three quartiles of the
following data.
Class Frequency
6-10 1
11-15 2
16-20 3
21-25 5
26-30 4
31-35 3
36-40 2
Deciles
Dear distance learners! Judging from the name, what do you think deciles are? Try
to answer below.
________________________________________________________________________
Have you tried it? Great.
Deciles are measures that divide a distribution/data set into ten equal parts. We can compute
deciles for ungrouped and grouped data. Let us first see how to compute deciles for
ungrouped data.
The jth decile for a simple frequency distribution (ungrouped data), denoted by $D_j$ where j = 1, 2,
3, ..., 9, is computed using this formula:
$$D_j = \left[\frac{j(n+1)}{10}\right]^{th} \text{ observation}$$
$D_1$ gives the value where 10% of the observations lie below it and 90% above it.
For grouped data,
$$D_j = L_j + \left(\frac{\frac{j \times n}{10} - cf}{f_j}\right) \times w$$
Where j = 1, 2, 3, 4, ..., 9
$L_j$ = lower class boundary of the jth decile class (the class which contains the $\left(\frac{j \times n}{10}\right)^{th}$ item)
w = class width
$f_j$ = frequency of the jth decile class
cf = the cumulative frequency of the class preceding the jth decile class
Percentiles
Dear distance learners! As we can guess from the name, percentiles divide a distribution/data
set into 100 equal parts.
Percentiles for ungrouped data
The jth percentile for a simple frequency distribution (ungrouped data), denoted by $P_j$ where
j = 1, 2, 3, ..., 99, is defined as
$$P_j = \left[\frac{j(n+1)}{100}\right]^{th} \text{ observation}$$
$P_1$ gives the value where 1% of the observations lie below it and 99% above it.
Percentiles for grouped data
$$P_j = L_j + \left(\frac{\frac{j \times n}{100} - cf}{f_j}\right) \times w$$
Where j = 1, 2, 3, 4, ..., 99
$L_j$ = lower class boundary of the jth percentile class (the class which contains the $\left(\frac{j \times n}{100}\right)^{th}$ item)
w = class width
$f_j$ = frequency of the jth percentile class
cf = the cumulative frequency of the class preceding the jth percentile class
Dear distance learners! As we can see from their formulas, percentiles and
deciles are computed in the same way as quartiles. The only difference is the number of equal
parts into which they divide the given data set.
Observe that:
1. Q2 = D5 = P50 = Median
2. Dj = P(10j), j = 1, 2, 3, 4, 5, 6, 7, 8, 9 (for example, D3 = P30)
3. Qj = P(25j), j = 1, 2, 3 (for example, Q3 = P75)
Let us see the first case, Q2 = D5 = P50 = Median, with the help of an example.
The following frequency distribution table shows the time taken by 20 workers to go from their
home to work. Using the frequency distribution table, find Q2, D5, P50 and the median.
Table 4.7 Calculation of quartiles, deciles and percentiles for grouped data
Time taken 8-10 11-13 14-16 17-19 20-22 23-25

Frequency 2 4 6 4 3 1
Cumulative frequency 2 6 12 16 19 20
Let us start with Q2.
The $\left(\frac{2n}{4}\right)^{th}$ item = $\left(\frac{40}{4}\right)^{th}$ item = 10th item is Q2, and it falls in the 3rd class, so 14 - 16 is the second quartile class.
$$Q_2 = 13.5 + \left(\frac{\frac{2 \times 20}{4} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5 = \text{median}$$
D5 = ?
The decile class is the class which contains the $\left(\frac{j \times n}{10}\right)^{th}$ item.
The $\left(\frac{5 \times 20}{10}\right)^{th}$ item = $\left(\frac{100}{10}\right)^{th}$ item = 10th item is D5, and it falls in the 3rd class, so 14 - 16 is the D5 class.
$$D_5 = 13.5 + \left(\frac{\frac{5 \times 20}{10} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5 = \text{median}$$
P50 = ?
The percentile class is the class which contains the $\left(\frac{j \times n}{100}\right)^{th}$ item.
The $\left(\frac{50 \times 20}{100}\right)^{th}$ item = $\left(\frac{1000}{100}\right)^{th}$ item = 10th item is P50, and it falls in the 3rd class, so 14 - 16 is the P50 class.
$$P_{50} = 13.5 + \left(\frac{\frac{50 \times 20}{100} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5 = \text{median}$$
Median = ?
$$MD = L_{md} + \left(\frac{\frac{n}{2} - cf}{f}\right) \times i$$
$$MD = 13.5 + \left(\frac{\frac{20}{2} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5$$
Therefore Q2 = D5 = P50 = Median.
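Because the quartile, decile and percentile formulas differ only in the divisor (4, 10 or 100), one general Python function covers all three. The sketch below applies it to Table 4.7 to confirm that Q2, D5 and P50 coincide with the median; the function name and its `parts` parameter are our own choices.

```python
import math

def grouped_fractile(lower_boundaries, freqs, width, j, parts):
    """jth fractile of grouped data: L + ((j*n/parts - cf) / f) * w.

    parts = 4 gives quartiles, 10 gives deciles, 100 gives percentiles.
    """
    n = sum(freqs)
    target = j * n / parts
    cumulative = 0                       # cf of the classes before the current one
    for lower, f in zip(lower_boundaries, freqs):
        if f > 0 and cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("fractile class not found")

# Table 4.7: classes 8-10, 11-13, ..., 23-25 (lower boundaries) and frequencies.
boundaries = [7.5, 10.5, 13.5, 16.5, 19.5, 22.5]
frequencies = [2, 4, 6, 4, 3, 1]

q2 = grouped_fractile(boundaries, frequencies, 3, 2, 4)
d5 = grouped_fractile(boundaries, frequencies, 3, 5, 10)
p50 = grouped_fractile(boundaries, frequencies, 3, 50, 100)
median = grouped_fractile(boundaries, frequencies, 3, 1, 2)

print(q2, d5, p50, median)
print(math.isclose(q2, d5) and math.isclose(d5, p50) and math.isclose(p50, median))
```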
Review Questions:
1. Why is the median a better measure of central tendency when the distribution of data is
skewed?
2. What are the differences between the measures of central tendency (median, mode) and the
positional measures (quartiles, deciles and percentiles)?
3. What are the differences between the weighted mean and the arithmetic mean?
4. Do you think that the mean of raw data and the mean of the same raw data grouped into a
frequency distribution are the same?
5. Prove that the sum of the deviations of each value from the mean is always zero.
6. In which distribution of data is the mode higher than the mean and median?
A. positively skewed distribution
B. negatively skewed distribution
C. symmetric distribution
D. none
7. In a set of observations, which measure reports the middle value?
A. Quartile (Q2)   B. Median   C. Percentile 50 (P50)   D. All   E. A & B
8. The following frequency distribution table shows the monthly salary (in thousands) of 20
professors in Debre Markos University.
Class Frequency
6-10 1
11-15 2
16-20 3
21-25 5
26-30 4
31-35 3
36-40 2
Using the information in the frequency distribution table above:
A. Compute the arithmetic mean, mode and median.
B. Find the second quartile (Q2), the fifth decile (D5) and the 50th percentile (P50).
C. Is the distribution positively skewed, negatively skewed or symmetric?
9. A nation faced an unemployment rate of 2% in 1990, 5% in 1992, and 12.5% in 1993.
Find the geometric mean of the unemployment rates.
10. A household purchased Birr 600 worth teff for consumption in three equal purchases of
Birr 200 each over a three months period. The first pack of teff was Birr 2.95/kg, the
second Birr 3.10/kg and the third Birr 3.25/kg. What was the average price per kg paid
for all the teff?
11. The mean age of all students in a class of 50 students is 17 years. If the mean age of 30 of
them is 18 years, find the mean age of the remaining 20 students.
12. For a sample of 50 stocks traded yesterday on the Ethiopian Stock Exchange, 10 showed
a decline of $1.00, 15 showed no change, and 25 increased by $2.00. Find the weighted
mean.
13. Suppose you receive a 5 percent increase in salary this year and a 15 percent increase
next year. The average annual percent increase is 9.886, not 10.0. Why is this so?
CHAPTER FIVE: MEASURES OF DISPERSION
Chapter Introduction
Dear distance learners! A measure of central tendency, as discussed in the previous chapter,
summarizes a series or data distribution into a single representative value. This is useful in many
respects, but it fails to account for the general pattern of the distribution. Thus any conclusion
based only on central tendency may be misleading.
Measures of dispersion (variation or spread) describe the amount of spread or scatter in a
distribution. They measure the variability in the values of the observations in a set. If all values are
the same, the dispersion is zero. If the values are homogeneous and close to each other, the
dispersion is small. If the values differ widely, the dispersion is large.
Chapter Objectives
 Compute and interpret the quartile deviation, the mean deviation, the variance and the
standard deviation of ungrouped and grouped data.
 Explain the characteristics, uses, advantages and disadvantages of each measure of
dispersion.
 Compute and interpret the inter quartile range and its relative measure.
 Compute and interpret the relative measures of dispersion
 Compute and interpret the Z-score
 Understand and measure Moments, Skewness and Kurtosis.
5.1 Types of Measures of Dispersion /Variation

Dispersion is the scatter or variation of items around a measure of central tendency. It measures the
extent to which the values vary among themselves. It is often difficult to say which set of data
is better represented by its mean unless we refer to dispersion, because two or more sets of
sample data with the same mean may differ considerably in their degree of dispersion. For
instance, the average income in a community is not an adequate indicator of the well-being of
the community, since it does not show the inequality among the residents. A measure of
dispersion can show this inequality. Therefore, it is useful to have a measure of dispersion to
observe the variability of data.
Measures of dispersion fall into two categories:
a. Measures of absolute dispersion: these show the actual amount of variation of the items
from a measure of central tendency. They include: Range, Quartile deviation, Mean
deviation, Standard deviation and Variance.
b. Measures of relative dispersion: each is a quotient obtained by dividing an absolute
measure by the quantity in respect to which the absolute deviation has been computed. They
are unitless and are used to compare variability between different sets of data. Examples:
Coefficient of quartile deviation, Coefficient of mean deviation and Coefficient of
variation.
Dear distance learners! Do you know the qualities of good measure of dispersion?
The following are some of the qualities of a good measure of dispersion.
 It should be based on all observations

 It should be easily calculated.
 It should be easily understandable
 It should be affected as little as possible by sampling fluctuations.
 It should be capable of further statistical treatment.
As stated so far, when these measures express the magnitude of dispersion in the same unit
of measurement in which the data are recorded, they are known as measures of absolute
dispersion. However, when dispersion is expressed in percentages or ratios, these measures are
called measures of relative dispersion.
5.1.1. Range
Range is defined as the difference between the largest and the smallest observations in a given
set of raw data. Obtaining the range from raw data thus requires identifying only these two
extreme values and taking the difference between them.
Properties of range
• Only two values are used in its calculation.
• It is influenced by an extreme value.
• It is easy to compute and understand.
• It is the crudest measure of dispersion.
• It cannot be determined for open-ended data.
• The greater the range, the higher the variability of the data, and vice versa.
Example 5.1: Consider the following data on the expenditures (in Birr) of two groups of workers:
Group A: 6200 2200 1700 1700 1200
Group B: 1600 1700 1300 4200 3200
Solution:
For Group A:
The highest expenditure = 6200 Birr
The lowest expenditure = 1200 Birr
Range = highest value – lowest value = 6200 – 1200 = 5000 Birr
For Group B:
The highest expenditure = 4200 Birr
The lowest expenditure = 1300 Birr
Range = highest value – lowest value = 4200 – 1300 = 2900 Birr
Therefore, in terms of expenditure more variation is observed in group A.

Note that: for discrete grouped data we use the same formula as given above, i.e, highest value
minus lowest value.
Example 5.2: Compute the range of the following data.
Table 5.1: Results (out of 35%) of 20 students in Econometrics test.
Xi   6   24   18   22   30   15
Fi   3    2    5    1    4    5
Solution:
Maximum value = 30 marks
Minimum value = 6 marks
Range = highest value – lowest value = 30 – 6 = 24 marks
In the case of continuous grouped data, the range can be obtained in the following three ways:
i) Range is found by taking the difference between the upper class limit of the last class and
the lower class limit of the first class. This is because the lowest and the highest observations
are not identifiable in the case of continuous grouped data. That is,
Range = UCLL – LCLF
Where UCLL = Upper class limit of the last class
LCLF = Lower class limit of the first class
ii) Range is found by taking the difference between the upper class boundary of the last class
and the lower class boundary of the first class. That is,
Range = UCBL – LCBF
Where UCBL = Upper class boundary of the last class
LCBF = Lower class boundary of the first class.
iii) Range is found by taking the difference between the mid-points of the last and the first class.
This yields a result closer to the actual range, as it reduces the margin of error incurred
when the range is computed by the first and second methods.
Example 5.3: Compute the range of the data given below in table 5.2.
Table 5.2 Results (out of 35%) of 40 students in Econometrics test
Score (35%)   Class Boundary   Number of Students (Fi)
6 – 10        5.5 – 10.5       5
11 – 15       10.5 – 15.5      10
16 – 20       15.5 – 20.5      15
21 – 25       20.5 – 25.5      7
26 – 30       25.5 – 30.5      3
Solution
Range = UCBL – LCBF = 30.5 – 5.5 = 25
or
Range = UCLL – LCLF = 30 – 6 = 24
It can also be computed as the difference between the midpoint of the last class and the
midpoint of the first class. That is, Range = 28 – 8 = 20
In the above discussion, the range is measured in absolute form. Such a measure cannot be
used for comparing variabilities expressed in different units. Therefore, there is a need for a
measure of relative dispersion. The relative range or coefficient of range is defined as:
$$\text{Coefficient of range} = \frac{\text{Range}}{\text{Sum of extreme values}} \times 100\% = \frac{\text{Highest value} - \text{Lowest value}}{\text{Highest value} + \text{Lowest value}} \times 100\%$$
for raw data and discrete grouped data, and
$$\text{Coefficient of range} = \frac{UCB_L - LCB_F}{UCB_L + LCB_F} \times 100\%$$
for continuous grouped data.
Example 5.4: Compute the coefficient of range for the following raw data.
2, 4, 6, 8, 16, 18, 20
Solution:- Coefficient of range = $\frac{20 - 2}{20 + 2} \times 100\% = \frac{18}{22} \times 100\%$ = 81.8%
Example 5.5: Find the coefficient of range (relative range) for the data given in table 5.2.
Solution: UCBL = 30.5, LCBF = 5.5.
Coefficient of range = $\frac{30.5 - 5.5}{30.5 + 5.5} \times 100\% = \frac{25}{36} \times 100\%$ = 69.4%
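As a quick check of the two coefficient-of-range examples, here is a small Python sketch. It is an illustration only; the helper name `coefficient_of_range` is an assumption and does not come from the module.

```python
def coefficient_of_range(high, low):
    """Relative range: (high - low) / (high + low), expressed as a percentage."""
    return (high - low) / (high + low) * 100

# Raw data of Example 5.4
raw = [2, 4, 6, 8, 16, 18, 20]
raw_range = max(raw) - min(raw)
raw_coef = coefficient_of_range(max(raw), min(raw))
print(raw_range, round(raw_coef, 1))

# Table 5.2: upper boundary of the last class and lower boundary of the first class
grouped_coef = coefficient_of_range(30.5, 5.5)
print(round(grouped_coef, 1))
```

Because the coefficient is a ratio, it stays the same if every value is converted to another unit, which is what makes it usable for comparing different data sets.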
Besides being simple to compute and understand, range is as good a measure of dispersion as
any other where the data consist of a few observations, and it is advantageous when one wants to
know only the extent of the extreme dispersion under “ordinary” conditions. However, its major
drawbacks are that (i) it tells us nothing about the dispersion of the values which fall between the
two extremes, (ii) it is highly sensitive to sample size, and (iii) it is highly affected if the values of
the two extremes change.
5.1.2. Quartile Deviations
Quartiles are the values which divide the array into four equal parts. Q1 gives the value of the
item which is 1/4th of the way up the distribution, Q2 gives the value of the item which is half of
the way up, and Q3 is the value of the item 3/4th of the way up the distribution.
Inter-quartile range is the difference between Q3 and Q1, i.e., inter-quartile range = Q3 – Q1
Quartile deviation, denoted as $Q_D$, is defined as $Q_D = \frac{Q_3 - Q_1}{2}$. Quartile deviation is also called
semi-interquartile range.
Example 5.6: Find the Quartile deviation of the following data.
Table 5.3: Results (out of 35%) of 40 students in Econometrics test.
Scores (35%)   Class Boundary   Frequencies (fi)   Less than cumulative frequencies
6 – 10         5.5 – 10.5       5                  5
11 – 15        10.5 – 15.5      10                 15 (Q1 class, as n/4 = 10th value)
16 – 20        15.5 – 20.5      15                 30 (Q3 class, as 3n/4 = 30th value)
21 – 25        20.5 – 25.5      7                  37
26 – 30        25.5 – 30.5      3                  40
Total                           40
Solution:
The ith quartile is computed as
$$Q_i = L_{Q_i} + \left(\frac{\frac{in}{4} - CF_{PQ_i}}{F_{Q_i}}\right) \times CW_{Q_i}$$
Where: n = sample size
$L_{Q_i}$ = lower class boundary of the quartile class
$CF_{PQ_i}$ = cumulative frequency of the class preceding the quartile class
$CW_{Q_i}$ = class width of the quartile class
$F_{Q_i}$ = frequency of the quartile class
$$Q_1 = 10.5 + \left(\frac{\frac{1 \times 40}{4} - 5}{10}\right) \times 5 = 10.5 + 2.5 = 13$$
$$Q_3 = 15.5 + \left(\frac{\frac{3 \times 40}{4} - 15}{15}\right) \times 5 = 15.5 + 5 = 20.5$$
Quartile deviation (semi-interquartile range) = $\frac{Q_3 - Q_1}{2} = \frac{20.5 - 13}{2}$ = 3.75
The coefficient of quartile deviation, which provides us a relative measure, is defined as:
$$\text{Coefficient of } Q_D = \frac{\frac{Q_3 - Q_1}{2}}{\frac{Q_3 + Q_1}{2}} \times 100\% = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100\%$$
Example 5.7: Compute the coefficient of quartile deviation for the data given in table 5.3.
Solution
Given: Q3 = 20.5 and Q1 = 13
Coefficient of $Q_D = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100\% = \frac{20.5 - 13}{20.5 + 13} \times 100\% = \frac{7.5}{33.5} \times 100\%$ = 22.4%
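The quartile deviation and its coefficient for Table 5.3 can be reproduced with a few lines of Python. This is a sketch under the assumption that the classes are given as (lower boundary, upper boundary) pairs; the function name `grouped_quartile` is hypothetical.

```python
def grouped_quartile(boundaries, freqs, i):
    """i-th quartile of grouped data; `boundaries` holds (lower, upper) pairs."""
    n = sum(freqs)
    target = i * n / 4          # position of the required item
    cf = 0                      # cumulative frequency of the preceding classes
    for (lower, upper), f in zip(boundaries, freqs):
        if cf + f >= target:
            return lower + (target - cf) / f * (upper - lower)
        cf += f
    raise ValueError("position lies beyond the last class")

# Table 5.3
boundaries = [(5.5, 10.5), (10.5, 15.5), (15.5, 20.5), (20.5, 25.5), (25.5, 30.5)]
freqs = [5, 10, 15, 7, 3]

q1 = grouped_quartile(boundaries, freqs, 1)
q3 = grouped_quartile(boundaries, freqs, 3)
qd = (q3 - q1) / 2                      # quartile deviation
coef_qd = (q3 - q1) / (q3 + q1) * 100   # coefficient of quartile deviation, in percent
print(q1, q3, qd, round(coef_qd, 1))
```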
Dear distance learners! Can you differentiate the advantages and disadvantages of
quartile deviation?
Did you try? …………. Good!
Advantages of Quartile deviation include

 It is easy to compute and understand
 It can be computed for open-ended classes given that Q3& Q1 can be found.
 It is not affected by extreme values
Disadvantages of Quartile deviation include

 It ignores the first 25% and the last 25% items
 It is not capable of mathematical manipulations.
 Its value is very much affected by sampling fluctuations.
 It doesn’t show the scatter around the average, but only a distance on scale.
5.1.3. Mean Deviation
The mean deviation, also called the average deviation, measures the average scatter of a set
of observations about a central value, usually the mean or the median of the distribution. It is
computed by subtracting the mean (or median) from each individual observation, summing the
deviations while ignoring their signs, and dividing the sum by the total number of observations.
The signs are ignored because otherwise the sum of the deviations from the mean,
$\sum (X_i - \bar{X})$, would be zero.
The mean absolute deviation from the mean for a set of sample data consisting of n observations
is computed as
$$MD \text{ from the mean} = \frac{\sum |X_i - \bar{X}|}{n}$$
Similarly, MD from the median is obtained as
$$MD \text{ from the median} = \frac{\sum |X_i - Md|}{n}$$
in the case of ungrouped data.
In the case of grouped data, it is obtained as
$$MD \text{ from the mean} = \frac{\sum f_i |X_i - \bar{X}|}{\sum f_i}$$
$$MD \text{ from the median} = \frac{\sum f_i |X_i - Md|}{\sum f_i}$$
where the $X_i$'s are the class mid-points and $\sum f_i = n$.
Example 5.8: The ages of a sample of 10 students from a class are given below.
18, 19, 19, 19, 20, 21, 21, 22, 23, 24
Find the mean deviation (i) from the mean (ii) from the median
Solution: Arithmetic mean = $\frac{\sum X_i}{n} = \frac{206}{10} = 20.6$
Median = $\frac{\left(\frac{n}{2}\right)^{th} \text{value} + \left(\frac{n}{2} + 1\right)^{th} \text{value}}{2} = \frac{20 + 21}{2}$ = 20.5
Age     Absolute deviation from the mean   Absolute deviation from the median
18      |18 – 20.6| = 2.6                  |18 – 20.5| = 2.5
19      |19 – 20.6| = 1.6                  |19 – 20.5| = 1.5
19      |19 – 20.6| = 1.6                  |19 – 20.5| = 1.5
19      |19 – 20.6| = 1.6                  |19 – 20.5| = 1.5
20      |20 – 20.6| = 0.6                  |20 – 20.5| = 0.5
21      |21 – 20.6| = 0.4                  |21 – 20.5| = 0.5
21      |21 – 20.6| = 0.4                  |21 – 20.5| = 0.5
22      |22 – 20.6| = 1.4                  |22 – 20.5| = 1.5
23      |23 – 20.6| = 2.4                  |23 – 20.5| = 2.5
24      |24 – 20.6| = 3.4                  |24 – 20.5| = 3.5
Total   16                                 16
Therefore,
MD from the mean = $\frac{\sum |X_i - \bar{X}|}{n} = \frac{16}{10}$ = 1.6 and MD from the median = $\frac{\sum |X_i - Md|}{n} = \frac{16}{10}$ = 1.6
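The table can be reproduced with a short Python sketch using only the standard library. The helper name `mean_deviation` is an illustrative assumption.

```python
from statistics import mean, median

ages = [18, 19, 19, 19, 20, 21, 21, 22, 23, 24]

def mean_deviation(values, center):
    """Average absolute deviation of `values` about `center`."""
    return sum(abs(x - center) for x in values) / len(values)

md_mean = mean_deviation(ages, mean(ages))       # about the mean (20.6)
md_median = mean_deviation(ages, median(ages))   # about the median (20.5)
print(round(md_mean, 2), round(md_median, 2))
```

That both results coincide here is a feature of this particular data set, not a general rule.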
Example 5.9: Find the mean absolute deviation from the mean and from the median for the data
given in table 5.2.
Solution: First arrange the data as follows:
Score (35%)   Fi   Class mark (Xi)   |Xi – X̄|   Fi|Xi – X̄|   |Xi – Md|   Fi|Xi – Md|
6 – 10        5    8                 9.125      45.625       9.167       45.835
11 – 15       10   13                4.125      41.250       4.167       41.67
16 – 20       15   18                0.875      13.125       0.833       12.495
21 – 25       7    23                5.875      41.125       5.833       40.831
26 – 30       3    28                10.875     32.625       10.833      32.499
Total         40                                173.75                   173.33
$\sum f_i X_i$ = (5x8) + (10x13) + (15x18) + (7x23) + (3x28)
= 40 + 130 + 270 + 161 + 84 = 685
Mean = $\frac{\sum f_i X_i}{n} = \frac{685}{40}$ = 17.125
Median = $L_{md} + \left(\frac{\frac{n}{2} - CF_{PMd}}{F_{Md}}\right) \times CW_{md} = 15.5 + \left(\frac{20 - 15}{15}\right) \times 5$ = 17.167
Therefore, MD from the mean = $\frac{\sum f_i |X_i - \bar{X}|}{\sum f_i} = \frac{173.75}{40}$ = 4.344
MD from the median = $\frac{\sum f_i |X_i - Md|}{n} = \frac{173.33}{40}$ = 4.333
Note: The coefficients of mean deviation (relative measures) from the mean and from the median
are given as follows:
(i) Coefficient of MD from the mean = $\frac{MD \text{ from the mean}}{\text{mean}} \times 100\%$
(ii) Coefficient of MD from the median = $\frac{MD \text{ from the median}}{\text{median}} \times 100\%$
Example 5.10: Compute the coefficient of mean deviation from the mean and from the median
for the data given in example 5.9.
Solution:-
MD from the mean = 4.344, MD from the median = 4.333
Mean = 17.125, Median = 17.167
Thus, coefficient of MD from the mean = $\frac{4.344}{17.125} \times 100\%$ = 25.37%
Coefficient of MD from the median = $\frac{4.333}{17.167} \times 100\%$ = 25.24%
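For grouped data the same quantities can be computed from the class marks and frequencies. The sketch below is illustrative only; it takes the median-class values (lower boundary 15.5, preceding cumulative frequency 15, frequency 15, width 5) from the worked solution, and the name `grouped_md` is an assumption.

```python
mids = [8, 13, 18, 23, 28]      # class marks of Table 5.2
freqs = [5, 10, 15, 7, 3]
n = sum(freqs)

mean = sum(f * x for x, f in zip(mids, freqs)) / n
median = 15.5 + (n / 2 - 15) / 15 * 5   # grouped median, inputs taken from the text

def grouped_md(center):
    """Frequency-weighted mean absolute deviation about `center`."""
    return sum(f * abs(x - center) for x, f in zip(mids, freqs)) / n

md_mean = grouped_md(mean)
md_median = grouped_md(median)
coef_mean = md_mean / mean * 100        # coefficient of MD from the mean, in percent
coef_median = md_median / median * 100  # coefficient of MD from the median, in percent
print(round(md_mean, 3), round(md_median, 3))
print(round(coef_mean, 1), round(coef_median, 1))
```

The code works with unrounded intermediate values, so its second decimal place can differ slightly from a hand calculation that rounds at each step.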
Advantages of Mean Deviation
• It is easier to understand and compute than the standard deviation
• It is not unduly influenced by large or small values
• All values are used in its calculation
Disadvantages of Mean Deviation
• It ignores the algebraic sign of the deviations
• It is not suitable for further mathematical processing.
5.1.4. Variance and Standard Deviation
Like other measures, the variance and the standard deviation quantify the dispersion of the
observations around the mean value.
The population variance is defined as the arithmetic mean of the squared deviations from the
population mean.
Properties of Population variance
• All values are used in calculation.
• The units are awkward: they are the square of the original units.
The formula for the population variance for raw data is:
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$
Where: $\mu$ = population mean, N = total number of observations
Sample variance:
$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$
Where: n = sample size, $\bar{X}$ = sample mean
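Alternatively, we can simplify it as follows:
$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{\sum \left(X_i^2 + \bar{X}^2 - 2\bar{X}X_i\right)}{n - 1}$$
$$= \frac{\sum X_i^2}{n - 1} + \frac{n\bar{X}^2}{n - 1} - \frac{2\bar{X}\sum X_i}{n - 1}$$
Since $\sum X_i = n\bar{X}$,
$$= \frac{\sum X_i^2}{n - 1} + \frac{n\bar{X}^2}{n - 1} - \frac{2n\bar{X}^2}{n - 1} = \frac{\sum X_i^2 - n\bar{X}^2}{n - 1}$$
$$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)} \quad \text{for small sample size}$$
$$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n^2} \quad \text{for large sample}$$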
Why n-1?
The reason is that, in a small sample, it provides a better estimate of the variance of the
population from which the sample is drawn. However, as n increases above about 30, we can use
n instead of n-1, as the two versions give approximately the same result for practical purposes.
Example 5.11: The ages of the members of a family (in years) are: 2, 18, 34, 42.
What is the population variance?
Solution:
$$\mu = \frac{\sum X_i}{N} = \frac{96}{4} = 24$$
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N} = \frac{(2 - 24)^2 + (18 - 24)^2 + (34 - 24)^2 + (42 - 24)^2}{4} = \frac{944}{4} = 236$$
The population standard deviation is the square root of the population variance:
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$
and the sample standard deviation is the square root of the sample variance:
$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \text{ for small sample size, and } S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}} \text{ for large sample size.}$$
Alternatively, for a small sample (less than about 30):
$$S = \sqrt{\frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)}}$$
Example 5.12: From the sample data given below compute the variance and standard deviation.
10, 15, 30, 22, 41, 32
Solution:-
Xi      10    15    30    22    41     32
Xi^2    100   225   900   484   1681   1024
n = 6, $\sum X_i = 150$ and $\sum X_i^2 = 4414$
So, $S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)} = \frac{6(4414) - (150)^2}{6(5)}$ = 132.8
$S = \sqrt{S^2} = \sqrt{132.8}$ = 11.52
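The definitional and the shortcut formulas for the sample variance should agree. The following Python sketch checks this for the data of Example 5.12 and compares both with the standard library's `statistics.variance`, which also divides by n - 1. The variable names are illustrative.

```python
from math import sqrt
from statistics import variance

data = [10, 15, 30, 22, 41, 32]
n = len(data)
mean = sum(data) / n

# Definitional form: sum of squared deviations divided by n - 1
var_definition = sum((x - mean) ** 2 for x in data) / (n - 1)

# Shortcut form: (n * sum of squares - square of the sum) / (n * (n - 1))
var_shortcut = (n * sum(x * x for x in data) - sum(data) ** 2) / (n * (n - 1))

sd = sqrt(var_shortcut)
print(round(var_definition, 1), round(var_shortcut, 1), round(variance(data), 1))
print(round(sd, 2))
```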
 Variance and Standard deviations for grouped data

For grouped data the population and sample variance, denoted by $\sigma^2$ and $S^2$ respectively, are
given by:
$$\sigma^2 = \frac{\sum f_i (X_i - \mu)^2}{\sum f_i} = \frac{N\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{N^2}$$
$$S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{\sum f_i} = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n^2}$$
in which the $X_i$'s are the class mid-points, $\sum f_i = N$ for the population and $\sum f_i = n$ for the
sample.
Alternatively, for small sample size we can use:
$$S^2 = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n(n - 1)}$$
By definition, standard deviations in each case are the square roots of the respective variances.
Example 5.13: From the continuous frequency distribution given in table 5.2, compute the
sample variance and standard deviation.
Solution:
Class limits (scores)   Class mark (Xi)   fi   fiXi   Xi – X̄    (Xi – X̄)^2   fi(Xi – X̄)^2   Xi^2   fiXi^2
6 – 10                  8                 5    40     -9.125    83.26        416.328        64     320
11 – 15                 13                10   130    -4.125    17.016       170.16         169    1690
16 – 20                 18                15   270    0.875     0.7656       11.48          324    4860
21 – 25                 23                7    161    5.875     34.516       241.609        529    3703
26 – 30                 28                3    84     10.875    118.26       354.80         784    2352
Total                                     40   685              253.82       1194.375              12925
Therefore, for small sample size,
$$S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{n - 1} = \frac{1194.375}{40 - 1} = 30.625, \quad S = \sqrt{S^2} = \sqrt{30.625} = 5.534$$
Alternatively,
$$S^2 = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n(n - 1)} = \frac{40(12925) - (685)^2}{40(39)} = 30.625$$
$S = \sqrt{30.625}$ = 5.534
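The grouped calculation can be verified in the same way. This is a minimal Python sketch using the class marks and frequencies of Table 5.2; the variable names are assumptions made for the illustration.

```python
from math import sqrt

mids = [8, 13, 18, 23, 28]      # class marks
freqs = [5, 10, 15, 7, 3]
n = sum(freqs)

sum_fx = sum(f * x for x, f in zip(mids, freqs))        # sum of f_i * X_i
sum_fx2 = sum(f * x * x for x, f in zip(mids, freqs))   # sum of f_i * X_i^2
mean = sum_fx / n

# Definitional form with n - 1 in the denominator
var_definition = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / (n - 1)

# Shortcut form
var_shortcut = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

sd = sqrt(var_shortcut)
print(sum_fx, sum_fx2, var_definition, var_shortcut, round(sd, 3))
```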
Important properties of Variance /Standard Deviation

The following are some useful mathematical properties of variance and standard deviation:
1. The variance/standard deviation of any constant is always zero.
A standard deviation of zero implies that there is no variation at all in the data set. In other
words, the data values are all the same.
2. A variance/standard deviation can never be a negative number.
3. If a constant is added to or subtracted from each observation, the variance/standard deviation
of the resulting observations will not be affected.
4. If every observation is multiplied by a constant K, then the new variance will be $K^2$ times
the original variance and the new standard deviation will be |K| times the original standard
deviation.
5. If there are two sets of data consisting of n1 and n2 observations with $S_1^2$ and $S_2^2$ as their
respective variances, the combined variance $S_C^2$ of the (n1 + n2) observations is
$$S_C^2 = \frac{n_1\left(S_1^2 + d_1^2\right) + n_2\left(S_2^2 + d_2^2\right)}{n_1 + n_2}$$
Where $d_1^2 = \left(\bar{X}_1 - \bar{X}_C\right)^2$ and $d_2^2 = \left(\bar{X}_2 - \bar{X}_C\right)^2$.
Herein, the combined mean is $\bar{X}_C = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$.
If $\bar{X}_1 = \bar{X}_2$, then $S_C^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2}$. Further, when n1 = n2, $S_C^2 = \frac{S_1^2 + S_2^2}{2}$.
6. If Y represents a linear transformation of X as Y = a + bX, with a as the additive constant and b
as the multiplicative constant, then the variance of Y is $S_Y^2 = b^2 S_X^2$, where $S_X^2$ is the variance of
X. It follows that the standard deviation of Y is $|b| S_X$, where $S_X$ is the standard deviation of X.
Example 5.14: Calculate the standard deviation of the combined group of 400 items from the
following data.
                        Group A   Group B   Group C
Number of items (ni)    50        150       200
Mean (X̄i)               40        50        60
Variance (Si^2)         81        100       121
Solution:-
$$\bar{X}_C = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3}{n_1 + n_2 + n_3} = \frac{50(40) + 150(50) + 200(60)}{50 + 150 + 200} = 53.75$$
$d_i = \bar{X}_i - \bar{X}_C$:
d1 = 40 – 53.75 = -13.75, d2 = 50 – 53.75 = -3.75, d3 = 60 – 53.75 = 6.25
Consequently, the combined variance is given as
$$S_C^2 = \frac{n_1\left(S_1^2 + d_1^2\right) + n_2\left(S_2^2 + d_2^2\right) + n_3\left(S_3^2 + d_3^2\right)}{n_1 + n_2 + n_3}$$
$$= \frac{50\left(81 + (-13.75)^2\right) + 150\left(100 + (-3.75)^2\right) + 200\left(121 + (6.25)^2\right)}{400}$$
$$= \frac{13503 + 17109 + 32012}{400} = 156.56$$
$S_C = \sqrt{156.56}$ = 12.512
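The combined mean and variance formula generalizes to any number of groups. A small Python sketch follows, assuming each group is described by its size, mean and variance; the function name `combined_mean_variance` is hypothetical.

```python
from math import sqrt

def combined_mean_variance(sizes, means, variances):
    """Pool group sizes, means and variances into one combined mean and variance."""
    n = sum(sizes)
    mean_c = sum(ni * mi for ni, mi in zip(sizes, means)) / n
    # Each group contributes its own variance plus its squared distance from the combined mean
    var_c = sum(ni * (vi + (mi - mean_c) ** 2)
                for ni, mi, vi in zip(sizes, means, variances)) / n
    return mean_c, var_c

# Groups A, B and C of Example 5.14
mean_c, var_c = combined_mean_variance([50, 150, 200], [40, 50, 60], [81, 100, 121])
sd_c = sqrt(var_c)
print(mean_c, var_c, round(sd_c, 3))
```

The squared-distance term is what makes the combined variance larger than a plain weighted average of the group variances whenever the group means differ.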
5.1.5. Coefficient of Variation
Coefficient of variation, developed by Karl Pearson (1857 – 1936), is a relative measure of
dispersion which is very useful when either the data are in different units or the data are in
the same units but the means are far apart. It is defined as the ratio of the standard deviation
to the arithmetic mean (where the mean is different from zero), expressed as a percentage:
$$CV = \frac{\text{Population standard deviation}}{\text{Population mean}} \times 100\% = \frac{\sigma}{\mu} \times 100\% \text{ for population}$$
$$CV = \frac{\text{Sample standard deviation}}{\text{Sample mean}} \times 100\% = \frac{S}{\bar{X}} \times 100\% \text{ for sample}$$
Coefficient of variation (CV) helps us to compare the variability, heterogeneity/homogeneity,
uniformity and consistency of two or more distributions.
A series /distribution with smaller coefficient of variation is said to be more homogenous
/uniform/ consistent than the other distribution. And a series /distribution with larger CV is said
to be more variable or more heterogeneous than the other distribution.
Example 5.15: The number of employees, the average wages and the variance of the wages for
two factories are given below.
Table 5.5: Summary of wage & employees of two factories
                        Factory A   Factory B
Number of employees     50          100
Average wages           120         85
Variance of the wages   9           16
Which factory is more consistent with respect to the wages of employees?
Solution:
Factory A: Given nA = 50, $\bar{X}_A$ = 120 and $S_A^2$ = 9
$CV_A = \frac{S_A}{\bar{X}_A} \times 100\% = \frac{3}{120} \times 100\%$ = 2.5%
Factory B: Given nB = 100, $\bar{X}_B$ = 85 and $S_B^2$ = 16
$CV_B = \frac{S_B}{\bar{X}_B} \times 100\% = \frac{4}{85} \times 100\%$ = 4.7%
Conclusion: CVA < CVB => The wages of the employees of factory A are more consistent than
those of factory B.
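The comparison can be sketched in Python as follows. The function name `coefficient_of_variation` is an illustrative assumption; it takes the mean and the variance, since the table reports variances rather than standard deviations.

```python
from math import sqrt

def coefficient_of_variation(mean, variance):
    """Standard deviation as a percentage of the mean (mean must be non-zero)."""
    return sqrt(variance) / mean * 100

cv_a = coefficient_of_variation(120, 9)    # Factory A
cv_b = coefficient_of_variation(85, 16)    # Factory B
more_consistent = "A" if cv_a < cv_b else "B"
print(round(cv_a, 1), round(cv_b, 1), more_consistent)
```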
Interpretation of Standard Deviation

Theorem (Gaussian rule): If the data in a sample are approximately normally distributed, then
a. $\bar{X} \pm S$ includes approximately 68% of the data.
b. $\bar{X} \pm 2S$ includes approximately 95% of the data.
c. $\bar{X} \pm 3S$ includes approximately 100% (about 99.7%) of the data.

Standard Scores (Z-Scores)
The Z-score indicates the number of standard deviations that an observation lies below
or above the mean, depending on whether the Z-score is negative or positive.
Z is called the standard value and is given by $Z = \frac{X_i - \bar{X}}{S}$, where S is the standard deviation.
Example 5.16: Helen scored 65 in Auditing and Samuel scored 70 in Auditing. If the average
score of all the students in Auditing is 67 and the standard deviation is 3, which student
performed better?
Solution: $Z_{Helen} = \frac{X_{Helen} - \bar{X}}{S} = \frac{65 - 67}{3} = -0.67$ and $Z_{Samuel} = \frac{X_{Samuel} - \bar{X}}{S} = \frac{70 - 67}{3} = 1$
Therefore, Samuel performed better in Auditing than Helen and better than the average result of
all the students.
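A short Python sketch of the same comparison, with an illustrative helper named `z_score`:

```python
def z_score(x, mean, sd):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_helen = z_score(65, 67, 3)
z_samuel = z_score(70, 67, 3)
print(round(z_helen, 2), round(z_samuel, 2))
```

A negative value places Helen below the class average, while a value of 1 places Samuel one standard deviation above it.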
Dear distance learner! In a sample, 100 students in a master's program in management
were tested on a general knowledge paper carrying 100 marks. At the end of the exercise, they
were found to be distributed according to marks obtained as follows:
Marks obtained       30-34   35-39   40-44   45-49   50-54   55-59   60-64
Number of students   5       8       12      20      27      20      8
Then find
a) The range of the distribution,
b) Quartile deviation,
c) Mean absolute deviation from the mean,
d) Variance and standard deviation, and
e) Coefficient of variation.
Answers: a. 30 b. 5.375 c. 6.46 d. 61.24 and 7.82 e. 15.8%
5.2.Moments, Skewness, and Kurtosis
Dear learners! In this section, we will deal with two other important characteristics of a
frequency distribution. One refers to lack of symmetry in the distribution, or its departure from
being bell-shaped. The other relates to the degree of flatness or peakedness of a distribution at its
top. The former is described as skewness and the latter as kurtosis.
Moments
• Moments give us information about the “shape” of the distribution.
• The rth moment is denoted by Mr, where r = 0, 1, 2, …
• We can have moments about any constant: about the mean, about zero or about any other
desired value.
In general, the rth moment about an arbitrary constant, say A, is given by
$$M_r = \frac{\sum (X_i - A)^r}{n}$$
Example 5.18: Consider the following data and compute the first four moments about five (5).
2, 2, 3, 4, 4, 5, 6, 7, 8
Solution:-
A = 5, n = 9
Xi      Xi – 5   (Xi – 5)^2   (Xi – 5)^3   (Xi – 5)^4
2       -3       9            -27          81
2       -3       9            -27          81
3       -2       4            -8           16
4       -1       1            -1           1
4       -1       1            -1           1
5       0        0            0            0
6       1        1            1            1
7       2        4            8            16
8       3        9            27           81
Total   -4       38           -28          278
$$M_r = \frac{\sum (X_i - 5)^r}{n}, \quad M_0 = \frac{\sum (X_i - 5)^0}{9} = \frac{9}{9} = 1$$
$$M_1 = \frac{\sum (X_i - 5)^1}{9} = \frac{-4}{9}, \quad M_2 = \frac{\sum (X_i - 5)^2}{9} = \frac{38}{9}$$
$$M_3 = \frac{\sum (X_i - 5)^3}{9} = \frac{-28}{9}, \quad M_4 = \frac{\sum (X_i - 5)^4}{9} = \frac{278}{9}$$
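The moments about an arbitrary constant can be computed exactly with Python's `fractions` module, which keeps the results as ratios instead of rounded decimals. The function name `moment_about` is an illustrative assumption.

```python
from fractions import Fraction

def moment_about(values, a, r):
    """r-th moment of `values` about the constant `a`, as an exact fraction."""
    return Fraction(sum((x - a) ** r for x in values), len(values))

data = [2, 2, 3, 4, 4, 5, 6, 7, 8]
moments = [moment_about(data, 5, r) for r in range(5)]   # M0 .. M4 about A = 5
for r, m in enumerate(moments):
    print(f"M{r} =", m)
```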
Note: For grouped data the rth moment about any constant number, say A, is given as:
$$M_r = \frac{\sum f_i (X_i - A)^r}{\sum f_i}$$
Where:
fi => Frequency of Xi in the case of discrete grouped data
fi => Frequency of the ith class in the case of continuous grouped data, and
Xi => Class mark of the ith class.
Note: M0 is always equal to 1.
Example 5.19: Find the first three moments about 4 for the data given in table 5.6
Table 5.6 Number of children in ten families
Xi   2   3   4   5
fi   3   2   3   2
Solution:-
Xi      fi   Xi – 4   fi(Xi – 4)   (Xi – 4)^2   fi(Xi – 4)^2   (Xi – 4)^3   fi(Xi – 4)^3
2       3    -2       -6           4            12             -8           -24
3       2    -1       -2           1            2              -1           -2
4       3    0        0            0            0              0            0
5       2    1        2            1            2              1            2
Total   10            -6                        16                          -24
$$M_0 = \frac{\sum f_i (X_i - 4)^0}{\sum f_i} = \frac{10}{10} = 1, \quad M_1 = \frac{-6}{10} = -0.6, \quad M_2 = \frac{16}{10} = 1.6, \quad M_3 = \frac{-24}{10} = -2.4$$
Central Moments (Moments about the mean)
The rth central moment for ungrouped data is given by the formula
$$M_r = \frac{\sum (X_i - \mu)^r}{N}$$
for the population with N observations and mean $\mu$, and
$$M_r = \frac{\sum (X_i - \bar{X})^r}{n}$$
for sample data with sample size n and mean $\bar{X}$.
Similarly, for grouped data the central moment is defined as:
$$M_r = \frac{\sum f_i (X_i - \mu)^r}{\sum f_i} \text{ for the population, and } M_r = \frac{\sum f_i (X_i - \bar{X})^r}{\sum f_i} \text{ for sample data.}$$
Where: $\sum f_i = N$ for the population and $\sum f_i = n$ for the sample
Xi = class mark of the ith class in the case of continuous grouped data.
fi = frequency of Xi in the case of discrete grouped data, and frequency of the ith class in the
case of continuous grouped data.
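Example 5.20: Find the first three central moments for the population data given by: X = 2, 3, 7
Solution: $\mu = \frac{\sum X_i}{N} = \frac{2 + 3 + 7}{3} = \frac{12}{3} = 4$
$$M_0 = 1, \quad M_1 = \frac{\sum (X_i - \mu)}{N} = \frac{(2 - 4) + (3 - 4) + (7 - 4)}{3} = 0$$
$$M_2 = \frac{\sum (X_i - \mu)^2}{N} = \frac{(2 - 4)^2 + (3 - 4)^2 + (7 - 4)^2}{3} = \frac{14}{3}$$
$$M_3 = \frac{\sum (X_i - \mu)^3}{N} = \frac{(2 - 4)^3 + (3 - 4)^3 + (7 - 4)^3}{3} = \frac{18}{3} = 6$$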
Note:
• For central moments: M0 = 1, M1 = 0, and $M_2 = \frac{\sum (X_i - \bar{X})^2}{n}$ = variance of X
• M2, M3 and M4 help us to measure skewness and kurtosis
• The rth moment about the origin (i.e., A = 0) is given by: $M_r = \frac{\sum (X_i - 0)^r}{n} = \frac{\sum X_i^r}{n}$
Example 5.21: Compute the first four moments about the mean for the following sample data
(discrete frequency distribution)
Xi   -3   1   2   3   5
Fi    2   1   4   2   3
Solution:-
$$\bar{X} = \frac{\sum f_i X_i}{\sum f_i} = \frac{2(-3) + 1(1) + 4(2) + 2(3) + 3(5)}{12} = \frac{24}{12} = 2$$
Xi      fi   (Xi – X̄)   fi(Xi – X̄)   (Xi – X̄)^2   fi(Xi – X̄)^2   (Xi – X̄)^3   fi(Xi – X̄)^3   (Xi – X̄)^4   fi(Xi – X̄)^4
-3      2    -5         -10          25           50             -125         -250           625          1250
1       1    -1         -1           1            1              -1           -1             1            1
2       4    0          0            0            0              0            0              0            0
3       2    1          2            1            2              1            2              1            2
5       3    3          9            9            27             27           81             81           243
Total   12              0                         80                          -168                        1496
M0 = 1, M1 = 0, M2 = 80/12 = 6.6667
M3 = -168/12 = -14, M4 = 1496/12 = 124.67
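Central moments of a discrete frequency distribution can be recomputed with a few lines of Python, which is a convenient way to check the column totals of a table like the one in Example 5.21. The function name `central_moment` is an assumption made for this sketch.

```python
xs = [-3, 1, 2, 3, 5]    # values X_i
fs = [2, 1, 4, 2, 3]     # frequencies f_i
n = sum(fs)
mean = sum(f * x for x, f in zip(xs, fs)) / n

def central_moment(r):
    """r-th moment about the mean for the frequency distribution above."""
    return sum(f * (x - mean) ** r for x, f in zip(xs, fs)) / n

m1, m2, m3, m4 = (central_moment(r) for r in (1, 2, 3, 4))
print(mean, m1, round(m2, 4), round(m3, 4), round(m4, 2))
```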
Skewness
Skewness refers to a lack of symmetry. We study skewness to have an idea about the shape
of the curve which we can draw with the help of the frequency distribution. Frequency
distributions are often found to be skewed on either side of their central value. As a result, a
distribution has a longer tail either to the left or to the right. When there is a longer tail to the
right of the center, the distribution is said to be positively skewed. If the tail is longer to the left
of the center, the distribution is said to be negatively skewed. A positive skewness means a
greater dispersal of individual observations towards the right of the central value. A negative
skewness, on the other hand, implies that individual observations have greater dispersal towards
the left of the central value.
Skewness, therefore, not only refers to the lack of symmetry in a distribution; it also shows the
direction of dispersion of individual observations on either side of the center of the distribution.
Accordingly, a measure of skewness quantifies the extent of departure from symmetry and also
indicates the direction in which the departure takes place.
Diagrammatically, the shapes of the frequency curves are as follows:
[Figure: frequency curves of a positively skewed, a negatively skewed and a symmetrical distribution]
Among the measures of skewness, two shall be discussed here.

a) Moment coefficient of skewness
b) Pearsonian coefficient of skewness
a) Moment coefficient of Skewness
In terms of moments, the coefficient of skewness, written here as $\alpha_3$, is defined as:
$$\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{M_3}{\left(S^2\right)^{3/2}} = \frac{M_3}{S^3}$$
Where M2 = S2 = variance
Interpretation:
(1) If $\alpha_3$ = 0 => Symmetrical distribution
(2) If $\alpha_3$ < 0 => Negatively skewed distribution
(3) If $\alpha_3$ > 0 => Positively skewed distribution
(4) A greater or smaller absolute value of $\alpha_3$ means a greater or smaller degree of skewness.
Example 5.22: Find the skewness of the distribution given in example 5.18
Solution: M3 = −46/9 and M2 = 39/9
Thus the moment coefficient of skewness = $\frac{M_3}{M_2^{3/2}} = \frac{-46/9}{(39/9)^{3/2}}$ = −0.567 < 0, therefore the distribution is negatively skewed.
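The arithmetic of the moment coefficient can be sketched in Python. The snippet below simply takes the two moment values quoted in the solution as given inputs; the function name `moment_skewness` is an illustrative assumption.

```python
def moment_skewness(m2, m3):
    """Moment coefficient of skewness: third moment divided by the 3/2 power of the second."""
    return m3 / m2 ** 1.5

# Moment values as quoted in the solution above
alpha3 = moment_skewness(39 / 9, -46 / 9)
shape = "negatively skewed" if alpha3 < 0 else "positively skewed" if alpha3 > 0 else "symmetrical"
print(round(alpha3, 3), shape)
```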
b) Pearsonian coefficient of Skewness
The Pearsonian coefficient of skewness was developed by Karl Pearson. This measure is based on
the fact that when a distribution drifts away from symmetry, its mean, median, and mode tend to
deviate from each other. This results from the presence of exceptionally high or low
observations, which affect the value of the mean the most and that of the mode the least.
The value of the mean tends to be the highest and that of the mode the lowest when some
observations in a given set of data are exceptionally high. Consequently, a distribution having
exceptionally high observations has a longer tail towards the right. Conversely, the mean tends to
be the lowest, and the mode the highest, when a set of data contains some exceptionally low
observations. As a result, the distribution will have a longer tail towards the left.
Thus, it is the direction in which the mode drifts from the mean that determines whether a
distribution will have positive or negative skewness.
Using this conclusion, the Pearsonian coefficient of skewness, written here as $S_k$, is defined as
$$S_k = \frac{\text{mean} - \text{mode}}{S}$$
in which S is the standard deviation. Using the empirical relationship among mean, mode and
median in a moderately skewed distribution, i.e., mode = mean – 3(mean – median), the above
equation can be modified as:
$$S_k = \frac{3(\text{mean} - \text{median})}{S}$$
Note:
1. $-3 \le S_k \le 3$
2. If $S_k$ = 0, the distribution is symmetrical
3. If $S_k$ > 0, the distribution is positively skewed
4. If $S_k$ < 0, the distribution is negatively skewed
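Example 5.23: Find the skewness of the following data using the Pearsonian coefficient of
skewness.
Solution:-
Arrange the data in increasing order:
1, 2, 4, 5, 6, 7, 8, 10, 30, 32
Median = $\frac{6 + 7}{2}$ = 6.5, $\bar{X} = \frac{\sum X_i}{n} = \frac{105}{10}$ = 10.5, $S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{1116.5}{9}$ = 124.06
$S = \sqrt{S^2} = \sqrt{124.06}$ = 11.14
Therefore, the Pearsonian coefficient of skewness = $\frac{3(\text{mean} - \text{median})}{S} = \frac{3(10.5 - 6.5)}{11.14}$ = 1.077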
Interpretation: The distribution is positively skewed.
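The same result can be obtained with Python's standard `statistics` module, whose `stdev` function uses the n - 1 denominator, as in the worked solution. The variable names are illustrative.

```python
from statistics import mean, median, stdev

data = [1, 2, 4, 5, 6, 7, 8, 10, 30, 32]

# Median-based form of Pearson's coefficient: 3 * (mean - median) / S
pearson_sk = 3 * (mean(data) - median(data)) / stdev(data)
print(mean(data), median(data), round(stdev(data), 2), round(pearson_sk, 3))
```

The two large observations (30 and 32) pull the mean well above the median, which is what produces the positive coefficient.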
Kurtosis
Another attribute of a frequency distribution is its peakedness, or flatness, at its top. A
distribution may have a smaller or greater degree of flatness at its top. Thus, it is the
characteristic of flatness or peakedness at the top of the distribution that kurtosis describes and
measures.
Taking the symmetrical (normal) distribution as a frame of reference, a distribution which is more
peaked than the normal, as in (a) below, is known as a leptokurtic distribution. The one whose
polygon is flat at its top, as in (c) below, is called a platykurtic distribution. A distribution with
a polygon which is neither too high in peak nor too flat at the top, as in (b), is termed a
mesokurtic distribution.
[Figure: a. Leptokurtic b. Mesokurtic c. Platykurtic]
We have two measures of Kurtosis: The coefficient of Kurtosis and Moment coefficient of
Kurtosis
(i) The coefficient of Kurtosis

The coefficient of kurtosis, denoted by K, is defined as the ratio of the inter-quartile range to the inter-decile range: $K = \frac{Q_3 - Q_1}{D_9 - D_1}$
Interpretation:
 If K = 0.5, approximately the distribution is Mesokurtic
 If K > 0.5, approximately the distribution is leptokurtic
 If K<0.5, approximately the distribution is platykurtic.
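(ii) Moment coefficient of Kurtosis
The moment coefficient of kurtosis expresses kurtosis in terms of the fourth moment about the mean. It is denoted by $\beta_2$ and is defined as
$$\beta_2 = \frac{M_4}{M_2^2} = \frac{M_4}{S^4}, \qquad \gamma_2 = \beta_2 - 3$$
where $M_4$ and $M_2$ are the fourth and second moments about the mean and $S$ is the standard deviation.

Interpretation:
• If $\beta_2 = 3$ (i.e., $\gamma_2 = 0$) => Mesokurtic distribution
• If $\beta_2 > 3$ (i.e., $\gamma_2 > 0$) => Leptokurtic distribution
• If $\beta_2 < 3$ (i.e., $\gamma_2 < 0$) => Platykurtic distribution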
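As an illustration, the moment coefficient of kurtosis can be computed directly from its definition. The sketch below is not part of the original module; it reuses the data of Example 5.23 and assumes that the moments about the mean are taken with divisor n, so that the second moment equals the squared standard deviation.

```python
data = [1, 2, 4, 5, 6, 7, 8, 10, 30, 32]   # data of Example 5.23
n = len(data)
mean = sum(data) / n

# Second and fourth moments about the mean (divisor n is an assumption)
m2 = sum((x - mean) ** 2 for x in data) / n
m4 = sum((x - mean) ** 4 for x in data) / n

beta2 = m4 / m2 ** 2     # moment coefficient of kurtosis
gamma2 = beta2 - 3       # excess over the mesokurtic value of 3
print(round(beta2, 3), round(gamma2, 3))
```

For these data the coefficient comes out only slightly above 3, so by the interpretation rule the distribution is very mildly leptokurtic.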
Summary Exercises
1. The mean and standard deviation of 25 observations were found to be 30 and 3 respectively. After the calculations were made, it was found that two of the observations had been recorded incorrectly as 29 and 31. Find the mean and standard deviation if the incorrect observations are excluded.
2. A person invested his money in two areas, A and B. His net profits (in Birr) for the first three months are:
Area A   72   76   74
Area B   45   92   85
a. Find the mean net profit for each area of investment.
b. Find the range of net profit in both areas.
c. Which area is riskier to invest in? In which area is the net profit more consistent?
3. The yearly salaries of all employees working for a company have a mean of Birr 42350 and a
standard deviation of Birr 3820. The years of schooling for the sample of employees have a
mean of 15 years and a standard deviation of 2 years. Is the relative variation in the salaries
higher or lower than that in years of schooling for these employees? Why?
4. The coefficient of variation of a distribution is 60% and its standard deviation is 12. Find out
its mean.
5. Based on the following frequency distribution table, find
Class Intervals   50 - 51   53 - 55   56 - 58   59 - 61   62 - 64
Frequencies       5         10        21        8         6
a. The range,
b. Quartile deviation
c. Mean absolute deviation from mean
d. Variance and standard deviation
e. Pearsonian coefficient of skewness using two different formulas.
CHAPTER SIX: SIMPLE LINEAR REGRESSION AND CORRELATION
Chapter Objectives
• Define regression analysis
• Define and fit simple linear regression
• Predict the population average value of the dependent variable on the basis of known (fixed) values of the independent variable.
• Understand correlation
• Compute the Pearsonian and rank correlation coefficients.
6.1. Simple Linear Regression
Dear Learners! In the preceding chapters we have been dealing with data on a single variable. Here we shall focus on methods of dealing with paired data, which may be related in some way.
Regression analysis is concerned with describing and evaluating the relationship between a dependent variable and one or more independent variables. Regression is therefore used to bring out the nature of the relationship and to use it to find the best approximate value of one variable from the other. In what follows, we will deal with the problem of estimating and/or predicting the population mean (average) value of the dependent variable on the basis of known values of the independent variable(s).
The variable whose value is to be estimated or predicted is known as the dependent variable, while the variables which help us in determining the value of the dependent variable are known as independent variables.
A regression equation which involves only two variables, one dependent and one independent, is referred to as simple regression. This model assumes that the dependent variable is influenced by only one systematic variable and the error term. However, when more than two variables (one dependent and two or more independent) are included in the model, it is called multiple (multivariate) regression.
The relationship between any two variables may be linear or non-linear. The former implies a constant absolute change in the dependent variable in response to a unit change in the independent variable, while the latter implies a varying marginal change in the dependent variable in response to changes in the independent variable.
In this chapter we will confine ourselves to regression involving only two variables whose relationship is linear. Such a regression is called simple linear regression.
6.1.1. The Scatter Diagram

Consider the following data collected from a sample of five industries in a given industrial sector on their input (number of workers) and output (thousands of Birr).
Table 6.1:
Industry   Output, Yi (thousands of Birr)   Input, Xi (no. of workers)   (Xi, Yi)
1          4                                2                            (2, 4)
2          7                                3                            (3, 7)
3          3                                1                            (1, 3)
4          9                                5                            (5, 9)
5          17                               9                            (9, 17)
Output level (Yi) is believed to depend on the number of workers (Xi). Accordingly, Yi is the dependent variable and Xi is the independent variable.
In order to visualize the form of the regression we plot these points on a graph, as shown in fig. 6.1. What we get is a scatter diagram.
[Figure 6.1: Scatter diagram of output Y (vertical axis) against number of workers X (horizontal axis)]
When carefully observed, the scatter diagram at least shows the nature of relationship; whether
positive or negative and whether the curve is linear or non-linear. When the general course of
movement of the paired points is best described by a straight line, the next task is to fit a
regression line which lies as close as possible to every point on the scatter diagram. This can be
done by means of either free hand drawing or the method of least squares. However, the latter is
the most widely used method.
6.1.2. The regression Equation
Regression equation is a statement of equality that defines the relationship between two
variables. The equation of the line which is to be used in predicting the value of the dependent
variable takes the form Ye = a + bX. The most universally used and statistically accepted method
of fitting such an equation is the method of least squares.
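The Method of Least Squares:-
This method requires that the straight line be fitted so that the sum of the squares of the vertical deviations of the observed Y values from the line (the predicted Y values) is a minimum.
As shown in figure 6.1, if $e_1, e_2, \dots, e_5$ are the vertical deviations of the observed Y values from the straight line (predicted Y values, $Y_e$), fitting a straight line in keeping with the above condition requires that (for a sample of size n)
$$e_1^2 + e_2^2 + \dots + e_n^2 = \sum_{i=1}^{n} e_i^2$$
is a minimum. This can be done by partially differentiating $\sum e_i^2$ with respect to a and b and equating the derivatives to zero.
$e_i$ is the error made when taking $Y_e$ instead of $Y_i$. Therefore, $e_i = Y_i - Y_e$, so that
$$e_i^2 = (Y_i - Y_e)^2 = (Y_i - a - bX_i)^2$$
Differentiating with respect to a:
$$\frac{\partial \sum e_i^2}{\partial a} = \frac{\partial \sum (Y_i - a - bX_i)^2}{\partial a} = 0$$
$$\Rightarrow -2\sum (Y_i - a - bX_i) = 0$$
$$\Rightarrow \sum Y_i - na - b\sum X_i = 0$$
$$\Rightarrow na = \sum Y_i - b\sum X_i$$
$$\Rightarrow a = \frac{\sum Y_i}{n} - b\frac{\sum X_i}{n} = \bar{Y} - b\bar{X}$$
Differentiating with respect to b:
$$\frac{\partial \sum e_i^2}{\partial b} = \frac{\partial \sum (Y_i - a - bX_i)^2}{\partial b} = 0$$
$$\Rightarrow -2\sum (Y_i - a - bX_i)X_i = 0$$
$$\Rightarrow \sum X_iY_i - a\sum X_i - b\sum X_i^2 = 0$$
$$\Rightarrow \sum X_iY_i - (\bar{Y} - b\bar{X})\sum X_i - b\sum X_i^2 = 0$$
$$\Rightarrow \sum X_iY_i - \bar{Y}\sum X_i - b\left[\sum X_i^2 - \bar{X}\sum X_i\right] = 0$$
$$\Rightarrow b = \frac{\sum X_iY_i - \bar{Y}\sum X_i}{\sum X_i^2 - \bar{X}\sum X_i} = \frac{\sum X_iY_i - \frac{\sum X_i \sum Y_i}{n}}{\sum X_i^2 - \frac{(\sum X_i)^2}{n}} = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{n\sum X_i^2 - (\sum X_i)^2}$$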
Example 6.1: Suppose we want to study the relationship between input (number of workers) and output (thousands of Birr) of the five factories given in table 6.1 above. To fit the regression line of Yi (thousands of Birr) on Xi (number of workers), we can employ the method of least squares as follows:
Solution
Arrange the data in tabular form.
Table 6.2
        Yi    Xi    XiYi   Xi²
        4     2     8      4
        7     3     21     9
        3     1     3      1
        9     5     45     25
        17    9     153    81
Total   40    20    230    120
Mean    8     4
Where $\sum$ = summation (total), mean of Y: $\bar{Y} = \frac{\sum Y_i}{n}$, mean of X: $\bar{X} = \frac{\sum X_i}{n}$, and n = sample size (here n = 5).
$$b = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{n\sum X_i^2 - (\sum X_i)^2} = \frac{5(230) - 40(20)}{5(120) - (20)^2} = \frac{1150 - 800}{600 - 400} = \frac{350}{200} = \frac{7}{4}$$
Substituting these values in the equation for a, we get
$$a = \bar{Y} - b\bar{X} = 8 - \frac{7}{4}(4) = 1$$
Therefore, the least squares regression equation is: $Y_e = 1 + \frac{7}{4}X_i$

• Estimate the output (in thousands of Birr) that a factory will have if it has 8 workers.
$X_i = 8$; $Y_e = 1 + \frac{7}{4}(8) = 15$
Therefore, if a factory has 8 workers, its estimated level of output will be 15 thousand ETB.
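The least squares formulas for a and b can be checked with a short Python sketch. It is not part of the original module; the function name `least_squares` is hypothetical, and exact fractions are used only so that the result matches the hand calculation of Example 6.1 without rounding.

```python
from fractions import Fraction

def least_squares(xs, ys):
    """Return (a, b) of the line Ye = a + b*X fitted by least squares."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # b = (n*SumXY - SumX*SumY) / (n*SumX^2 - (SumX)^2)
    b = Fraction(n * sum_xy - sum_x * sum_y, n * sum_x2 - sum_x ** 2)
    # a = mean(Y) - b * mean(X)
    a = Fraction(sum_y, n) - b * Fraction(sum_x, n)
    return a, b

workers = [2, 3, 1, 5, 9]    # Xi from Table 6.1
output = [4, 7, 3, 9, 17]    # Yi from Table 6.1

a, b = least_squares(workers, output)
print(a, b)         # 1 7/4
print(a + b * 8)    # 15
```

The same function can be applied to any other set of paired data by changing the two lists.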
Example 6.2: The following sample observations give the price and the quantity supplied of a commodity X by a competitive firm.
a) Construct the scatter diagram.
b) Find the linear regression of Yi (quantity supplied) on Xi (price of commodity X).
c) Suppose the price of commodity X is 32. What will be the quantity supplied by the firm?
Table 6.3. Data on price and quantity supplied.
        Quantity (Yi)   Price (Xi)   XiYi     Xi²
        40              15           600      225
        45              20           900      400
        40              25           1000     625
        50              30           1500     900
        55              35           1925     1225
        60              40           2400     1600
        60              45           2700     2025
        65              50           3250     2500
        70              55           3850     3025
        75              60           4500     3600
        55              40           2200     1600
        60              45           2700     2025
Total   675             460          27,525   19,750
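Solution
a) [Figure: Scatter diagram of quantity supplied Yi (vertical axis) against price Xi (horizontal axis)]
b)
$$b = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{n\sum X_i^2 - (\sum X_i)^2} = \frac{12(27525) - 460(675)}{12(19750) - (460)^2} = \frac{19800}{25400} = 0.7795$$
$$a = \bar{Y} - b\bar{X}, \qquad \bar{Y} = \frac{\sum Y_i}{n} = \frac{675}{12} = 56.25, \qquad \bar{X} = \frac{\sum X_i}{n} = \frac{460}{12} = 38.33$$
$$a = 56.25 - 38.33(0.7795) = 26.3718$$
Therefore, the estimated supply function is: Ye = 26.3718 + 0.7795 Xi
c) Xi = 32
Ye = 26.3718 + 0.7795 Xi
= 26.3718 + 0.7795 (32)
= 26.3718 + 24.944
= 51.3158
If the price of X is 32, the estimated quantity supplied will be approximately equal to 51 units.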
6.1.3. Regression of X on Y
In sub-topic 6.1.2 above, we explored the regression of Y on X. Sometimes it is possible, and of interest, to fit the regression of X on Y, i.e., treating Y as the independent variable and X as the dependent variable.
In such cases, the general form of the equation is given by: $X_e = a_0 + b_0Y$
Where $X_e$ = expected value of X
$a_0$ = X-intercept
$b_0$ = slope of the regression
Applying the principle of least squares as before, the constants $a_0$ and $b_0$ are given as follows:
$$a_0 = \bar{X} - b_0\bar{Y}$$
$$b_0 = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{n\sum Y_i^2 - (\sum Y_i)^2}$$
N.B. The regression lines of Y on X and of X on Y intersect at $(\bar{X}, \bar{Y})$.
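The remark that the two regression lines meet at the point of means can be illustrated with the data of Table 6.1. The following Python sketch is not part of the original module; the helper `fit_line` is a hypothetical name, and the regression of X on Y is obtained simply by swapping the roles of the two variables.

```python
from fractions import Fraction

def fit_line(u, v):
    """Least-squares (intercept, slope) for predicting v from u."""
    n = len(u)
    slope = Fraction(n * sum(p * q for p, q in zip(u, v)) - sum(u) * sum(v),
                     n * sum(p * p for p in u) - sum(u) ** 2)
    intercept = Fraction(sum(v), n) - slope * Fraction(sum(u), n)
    return intercept, slope

x = [2, 3, 1, 5, 9]     # Xi from Table 6.1
y = [4, 7, 3, 9, 17]    # Yi from Table 6.1

a, b = fit_line(x, y)      # Y on X: Ye = a + b*X
a0, b0 = fit_line(y, x)    # X on Y: Xe = a0 + b0*Y

x_bar = Fraction(sum(x), len(x))
y_bar = Fraction(sum(y), len(y))
print(a0, b0)                                             # -16/31 35/62
print(a + b * x_bar == y_bar, a0 + b0 * y_bar == x_bar)   # True True
```

Both checks are true: substituting the mean of X into the first line returns the mean of Y, and substituting the mean of Y into the second line returns the mean of X.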
6.2. Correlation
The correlation coefficient measures the degree to which two variables are related (associated). For two variables it is called simple correlation and is denoted by r. For more than two variables we have multiple correlation.
Two variables may be positively correlated, negatively correlated or uncorrelated. Furthermore, depending on the form of the relationship, the correlation between two variables may be linear or non-linear. In this section, we shall be concerned with quantifying the degree of association between two variables with a linear relationship.
Unlike the regression analysis explained in the previous section (6.1), the computation of the coefficient of correlation does not require one variable to be designated as dependent and the other as independent.
The measure of the degree of relationship between any two variables, known as the Pearsonian coefficient of correlation and usually denoted by r, is defined as
$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}$$
and is termed the product-moment formula. It can be further simplified as
$$r = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{\sqrt{\left[n\sum X_i^2 - (\sum X_i)^2\right]\left[n\sum Y_i^2 - (\sum Y_i)^2\right]}}$$
NB. The building blocks of this formula are, therefore, $\sum X_i$, $\sum Y_i$, $\sum X_iY_i$, $\sum X_i^2$, $\sum Y_i^2$ and n (sample size).
Properties of the Pearsonian coefficient of correlation
1. $-1 \le r \le 1$
2. When r = 0, there is no linear correlation between the two variables.
3. When r = 1 or r = -1, there is perfect positive or perfect negative correlation, respectively.
4. Adding a constant number to each value of X and Y, as well as multiplying each value by a positive constant, does not affect the value of r.
5. The closeness of the relationship is not proportional to the value of r.
6. When r is positive and close to 1 there is high positive correlation, while when it is positive and close to zero there is low positive correlation. Similarly, when r is negative and close to -1 there is high negative correlation, while when it is negative and close to zero there is low negative correlation.
7. It is free of the units of measurement used.
Example 6.3. Find the Pearsonian coefficient of correlation for the two variables in the data of table 6.1.
Solution
Table 6.4.
        Yi    Xi    Xi²   Yi²   XiYi
        4     2     4     16    8
        7     3     9     49    21
        3     1     1     9     3
        9     5     25    81    45
        17    9     81    289   153
Total   40    20    120   444   230
$$r = \frac{5(230) - 20(40)}{\sqrt{[5(120) - (20)^2][5(444) - (40)^2]}} = \frac{350}{\sqrt{(200)(620)}} = \frac{350}{352.14} = 0.99$$
Interpretation: this implies a strong positive relationship between X and Y.

Example 6.4: Find the Pearsonian coefficient of correlation for the two variables in the data of table 6.3.
Solution: Table 6.5.
        Yi    Xi    Xi²      Yi²      XiYi
        40    15    225      1600     600
        45    20    400      2025     900
        40    25    625      1600     1000
        50    30    900      2500     1500
        55    35    1225     3025     1925
        60    40    1600     3600     2400
        60    45    2025     3600     2700
        65    50    2500     4225     3250
        70    55    3025     4900     3850
        75    60    3600     5625     4500
        55    40    1600     3025     2200
        60    45    2025     3600     2700
Total   675   460   19,750   39,325   27,525
Therefore,
$$r = \frac{12(27525) - 460(675)}{\sqrt{[12(19750) - (460)^2][12(39325) - (675)^2]}} = \frac{19800}{\sqrt{(25400)(16275)}} = \frac{19800}{20331.87} = 0.974$$
This implies a strong positive relationship between X and Y.
Example 6.5: Adding a constant number, say 1, to each value of X and Y given in table 6.1, show that property 4 holds true.
Solution: Table 6.6.
        Yi    Xi    Xi²   Yi²   XiYi
        5     3     9     25    15
        8     4     16    64    32
        4     2     4     16    8
        10    6     36    100   60
        18    10    100   324   180
Total   45    25    165   529   295
$$r = \frac{5(295) - 45(25)}{\sqrt{[5(165) - (25)^2][5(529) - (45)^2]}} = \frac{350}{352.14} = 0.99$$
This is the same value of r as for the original data of table 6.1. Therefore, we have shown that property 4 is true for the addition of a constant.
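Property 4 can also be checked numerically. The sketch below is not part of the original module; the function name `pearson_r` is hypothetical, and the scaling factors 10 and 3 are arbitrary positive constants chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearsonian coefficient of correlation from the simplified formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))

x = [2, 3, 1, 5, 9]     # Xi from Table 6.1
y = [4, 7, 3, 9, 17]    # Yi from Table 6.1

r = pearson_r(x, y)
r_shifted = pearson_r([v + 1 for v in x], [v + 1 for v in y])    # add 1 to every value
r_scaled = pearson_r([10 * v for v in x], [3 * v for v in y])    # multiply by positive constants
print(round(r, 2), round(r_shifted, 2), round(r_scaled, 2))      # 0.99 0.99 0.99
```

All three values agree, which is what property 4 states.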
Spearman’s Rank Correlation Coefficient
The Pearsonian coefficient of correlation cannot be used when direct quantitative measurement of the phenomenon under study is not possible. In such cases, we make use of the rank correlation coefficient.
Steps involved in calculating Spearman's coefficient of rank correlation:
1. Rank the X values among themselves, giving rank 1 to the largest (or smallest) value, rank 2 to the next largest (or smallest) value, and so on.
2. Rank the Y values among themselves in the same way as the X values.
3. When there are ties in rank, i.e., when there are values sharing the same rank, assign to each of the tied observations the mean of the ranks they jointly occupy, and skip those ranks when ranking the next value.
4. Find the sum of the squares of the differences between the ranks of the two variables, $\sum d_i^2$.
5. Apply the formula $r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$
where n = number of pairs of observations
$d_i$ = the i-th difference between the ranks of X and Y
As the steps above indicate, $r_s$ may also be calculated for numerical data after ranking the values according to numerical size.
Example 6.6: Consider the ranks given by two judges, A and B, to five ladies in a beauty contest:
Table 6.7
Ladies    RA   RB   di = RB - RA   di²
AZEB      1    2    1              1
TIZITA    3    4    1              1
FATUMA    4    3    -1             1
LEMLEM    2    1    -1             1
CHALTU    5    5    0              0
Total                              4
Solution
$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(4)}{5(5^2 - 1)} = 1 - \frac{24}{120} = 0.80$$
This implies that there is a high degree of similarity between the ranks given by Judge A and Judge B.
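The ranking steps and the formula can be sketched in Python. This is not part of the original module; the function names `average_ranks` and `spearman` are hypothetical, ranks are assigned from the smallest value upward, and the simple formula is applied as given in the steps above (with tied ranks it is only an approximation). For the judges' ranks in Table 6.7 the formula gives 1 – 24/120.

```python
def average_ranks(values):
    """Rank values (1 = smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1      # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(rank_x, rank_y):
    """Spearman's rank correlation from two lists of ranks."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

judge_a = [1, 3, 4, 2, 5]    # RA from Table 6.7
judge_b = [2, 4, 3, 1, 5]    # RB from Table 6.7
print(spearman(judge_a, judge_b))          # 0.8
print(average_ranks([10, 20, 20, 30]))     # [1.0, 2.5, 2.5, 4.0]
```

For raw marks rather than ranks, each list would first be passed through `average_ranks` and the two results given to `spearman`.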
Review Exercises
1. Define and distinguish between;
a) Regression and correlation
b) Simple and multiple regression
c) Linear and non-linear relationship
2. Bring out the relevance of a scatter diagram in regression analysis.
3. Explain the meaning and status of the two constants a and b in the regression equation
Ye = a + bXi.
4. The marks obtained by 10 students in their graduation with B.A. degree in management
and the MBA entrance test were found as given below.
Graduation (Xi) 50 52 55 60 62 65 65 66 70 75
Entrance test (Yi) 52 50 57 65 65 62 65 65 71 75
Therefore, find
a) The two regression equations
b) The correlation coefficient between two sets of marks
5. Obtain the regression equation of X on Y and Y on X for the paired data given below.
Also compute the coefficient of correlation.
Market price of X 26 28 30 31 35
Market price of Y 20 27 28 30 25
6. Ten students got the following marks in Maths and Statistics
Student A B C D E F G H I J
Maths (X) 78 36 98 25 75 82 90 62 65 39
Statistics (Y) 84 51 91 60 68 62 86 58 58 47
Compute the coefficient of Rank correlation and interpret the result.
7. For a certain set of paired data on X and Y, 3Xi + 2Yi – 26 = 0 and 6Xi + Yi – 31 = 0
are the two regression equations.
a) Find the mean values
b) Find the coefficient of correlation : = .
8. A leading company engaged in the production of detergents has 10 vacancies for salesmen, for which 15 persons (n = 15) were called for personal interviews. The interview board consisted of the sales manager and a psychologist. The ranks given by the two to all 15 candidates who attended the interview are given below.
Sr. No. in the interview list         1   2   4   5   8   9   10  11  13  14  15  17  18  19  20
Ranking by the sales manager (Xi)     2   3   1   5   4   6   8   7   9   10  12  11  13  14  15
Ranking by the psychologist (Yi)      1   3   2   4   6   5   7   9   8   11  10  12  14  13  15
Compute the rank correlation coefficient.
CHAPTER SEVEN: ELEMENTARY PROBABILITY
Chapter Objectives
Dear learner, at the end of this chapter, you are expected to:
• Understand the basic terms such as probability, experiment, outcome and event.
• Calculate probabilities applying the rules of addition and multiplication.
• Define the terms conditional probability and joint probability.
• Understand permutation and combination.
• Define the terms random variable and probability distribution.
• Distinguish between a discrete and a continuous probability distribution.
• Calculate the mean, variance and standard deviation of discrete probability distributions.
• Understand binomial and normal probability distributions.
• Define and calculate the Z-value.
• Compute probabilities using the standard normal distribution.
7.1.Introduction
Dear distance learners, Define probability?

_____________________________________________________________________________________
________________________________________________________________Good!!
Probability as a general concept can be defined as the chance of an event occurring.

Probability theory gives us methods of dealing with uncertainty. As nothing is perfectly predictable, uncertainty is a common feature of every decision-making process. In such situations probability theory comes to our aid by providing the methods needed to take appropriate decisions even under conditions of risk and uncertainty.
7.2. Definition and Basic Concepts
Dear distance learners, can you explain the basic probability concepts, i.e., experiment, sample space, sample point, event…?
______________________________________________________________________________
______________________________________________________________________________
Good!
An Experiment – is a process that leads to the occurrence of one or more possible observations.
Examples:
• Tossing a coin
• Rolling two dice once
• Drawing a card from a deck
Sample Space – is a complete listing of all elementary events (outcomes) of an experiment.
Examples:
• The sample space for the experiment of tossing a coin is (H, T). If two coins are tossed once, the sample space is (H1, H2), (H1, T2), (T1, H2), (T1, T2).
• The sample space for the roll of a single die is (1, 2, 3, 4, 5, 6). If two dice are rolled once, the possible outcomes (sample space) are:
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
Sample points:- are the elements of the sample space.
Example: 2 is one sample point of rolling a die.
To find the number of sample points in a sample space, apply the formula $K^n$, where n is the number of experiments (trials) and K is the number of possible outcomes of a single experiment. For two dice, for instance, $6^2 = 36$.
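The sample space of two dice can also be listed by a program. The following Python sketch is not part of the original module and its variable names are illustrative; it uses the standard library function `itertools.product` to form every ordered pair of faces and confirms that there are 36 sample points.

```python
from itertools import product

faces = [1, 2, 3, 4, 5, 6]    # K = 6 possible outcomes of one die
n_rolls = 2                   # n = 2 dice

# Every ordered pair (first die, second die)
sample_space = list(product(faces, repeat=n_rolls))
print(len(sample_space))      # 36
print(sample_space[:3])       # [(1, 1), (1, 2), (1, 3)]
```

Changing `n_rolls` to 3 would list the 216 sample points for three dice.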
An Event – is a collection of one or more outcomes of an experiment. Events are mutually exclusive if the occurrence of any one of them means that none of the others can occur at the same time. That is, if two events cannot occur at the same time, they are mutually exclusive. Events are independent if the occurrence of one event does not affect the probability of occurrence of another. Events are collectively exhaustive if at least one of them must occur when an experiment is conducted.
Example: A fair die is rolled once. The experiment is rolling a die. The possible outcomes are the numbers 1, 2, 3, 4, 5 and 6. If the event is the occurrence of an even number, it consists of the outcomes 2, 4 and 6.
Probability is a measure of the chance or likelihood that a particular event will happen in the future. It can only assume values between 0 and 1, inclusive. The probability of an event E, written P(E), is a number with the following properties:
• $0 \le P(E) \le 1$
• P(E) = 0 means the event will not happen; such an event is called an impossible event.
• P(E) = 1 means we are 100% sure that the event will occur (a sure event).
Probability can be defined using three different approaches:
(i) Classical probability
(ii) Relative frequency (empirical) probability
(iii) Subjective probability
i) Classical Probability: - It is based on the assumption that the outcomes of an experiment are equally likely. It applies rules and laws and involves an experiment.
$$P(E) = \frac{n}{N}$$
Where: N = total number of possible outcomes of an experiment
n = the number of outcomes in which the event occurs out of the N outcomes of the experiment.
Examples:
• In a coin tossing experiment, what is the probability of getting a head on one toss of a coin? As there are only two equally likely outcomes, the probability is 50% or 0.5 or ½.
• An unbiased die is thrown. What is the probability that the digit 2 appears? Ans. 1/6.
ii) Relative Frequency (Empirical) Probability: - This method is based on accumulated historical data.
$$P(E) = \frac{\text{number of times the event occurred in the past}}{\text{total number of observations}}$$
Examples:
a) Suppose that, of the last 70 days with conditions like those forecast for today, it rained on 12 days. What is the probability of rain today based on those historical days? 12/70 = 0.17 or 17%
b) Throughout her teaching career a professor has awarded 186 A's out of 1200 students. What is the probability that a student in her section this semester will receive an A grade?
P(A) = 186/1200 = 0.155
iii) Subjective Probability: - It uses a probability value based on an educated guess or estimate, employing opinions and inexact information. For example, a seismologist might say that there is a 45% probability that an earthquake will occur in Afar after thirty years.
7.3. Basic Rules of Probability
Dear distance learners, what are the basic rules of probability?
______________________________________________________________________________
______________________________________________________________________________
Good!!
If two events A and B are mutually exclusive, the special rule of addition states that the
probability of A or B occurring equals the sum of their respective probabilities: P (A or B) = P(A)
+ P(B)
Definition: Two events of a single experiment are said to be mutually exclusive if they
cannot occur simultaneously as a result of the experiment. This is equivalent to saying that
mutually exclusive events must have disjoint event sets.
Example: Abay Zuria transport association has recently supplied the following information on their trips from Bahir Dar to Debre Markos:
Arrival     Early   On time   Late   Cancelled   Total
Frequency   100     800       75     25          1000
• If A is the event that a bus arrives early, then P(A) = 100/1000 = 0.10.
• If B is the event that a bus arrives late, then P(B) = 75/1000 = 0.075.
• The probability that a bus is either early or late is:
P(A or B) = P(A) + P(B) = 0.10 + 0.075 = 0.175.
The complement rule
The complement rule is used to determine the probability of an event occurring by subtracting the probability of the event not occurring from 1.
If P(A) is the probability of event A and P(~A) is the probability of the complement of A (that is, of A not occurring), then P(A) + P(~A) = 1 or P(A) = 1 – P(~A).
Examples:
(i) Two events X and Y are mutually exclusive. Suppose P(X) = 0.04 and P(Y) = 0.03. What is the probability that either X or Y will occur? (0.07) What is the probability that neither X nor Y will happen? (0.93)
(ii) Suppose the probability that you will score an A in this class is 0.25 and the probability that you will get a B is 0.50. What is the probability that your grade will be above C? (0.75)
(iii) The probabilities of events A and B are 0.20 and 0.30 respectively. The probability that both A and B occur is 0.15. What is the probability that either A or B will occur? (0.35)
(iv) A student is taking two courses, microeconomics and statistics. The probability that the student will pass the microeconomics course is 0.60 and the probability of passing the statistics course is 0.70. The probability of passing both is 0.50. What is the probability of passing at least one course? (0.80)
The general rule of addition
If A and B are two events that are not mutually exclusive, then P(A or B) is given by the
following formula: P(A or B) = P(A) + P(B) - P(A and B)
Example: In a sample of 500 students, 320 said they had a radio, 175 said they had a TV, and 100 said they had both:
• If a student is selected at random, what is the probability that the student has a radio (S), a TV (T), and both a radio and a TV? Solution: P(S) = 320/500 = 0.64; P(T) = 175/500 = 0.35; P(S and T) = 100/500 = 0.20.
• If a student is selected at random, what is the probability that the student has either a radio or a TV in his or her room? Solution: P(S or T) = P(S) + P(T) – P(S and T) = 0.64 + 0.35 – 0.20 = 0.79.
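The general rule of addition for the radio and TV example can be checked with a few lines of Python. This sketch is not part of the original module; the variable names are illustrative, and the last step applies the complement rule to find the share of students with neither item, which the example itself does not ask for.

```python
from fractions import Fraction

total = 500
radio, tv, both = 320, 175, 100

p_radio = Fraction(radio, total)
p_tv = Fraction(tv, total)
p_both = Fraction(both, total)

p_either = p_radio + p_tv - p_both         # general rule of addition
p_neither = 1 - p_either                   # complement rule
print(float(p_either), float(p_neither))   # 0.79 0.21
```

Subtracting P(S and T) is what prevents the 100 students who own both items from being counted twice.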
Joint Probability
A joint probability measures the likelihood that two or more events will happen at the same time. An example would be the event that a student has both a radio and a TV in his or her dorm room.
Special rule of multiplication
The special rule of multiplication requires that two events A and B are independent. Two events A and B are independent if the occurrence of one has no effect on the probability of the occurrence of the other. Two events originating from independent experiments will be independent, while two events originating from the same experiment will not, in general, be independent.
Example: Suppose two coins are tossed. The outcome of one coin (head or tail) is unaffected by the outcome of the other coin. That is, the outcome of the second event does not depend on the outcome of the first event.
This rule is written: P(A and B) = P(A)P(B)
For the two coins, for instance, P(head on the first and head on the second) = ½ × ½ = ¼.
7.4. Conditional Probability
Dear distance learners, Define conditional probability?
___________________________________________________________________________
___________________________________________________________________________
Good!!
A conditional probability is the probability of a particular event occurring, given that another
event has occurred. The probability of the event A given that the event B has occurred is written
P(A|B).
General rule of multiplication
The general rule of multiplication is used to find the joint probability that two events will occur. It states that for two events A and B, the joint probability that both events will happen is found by multiplying the probability that event A will happen by the conditional probability of B given that A has occurred.
The joint probability, P(A and B), is given by the following formula:

P(A and B) = P(A)P(B|A) or P(A and B) = P(B)P(A|B)
Where P(B|A) = probability of B given that event A has occurred. It follows that the conditional probability is
$$P(A|B) = \frac{P(A \text{ and } B)}{P(B)}, \qquad P(B) \ne 0$$
Example: The Dean of the School of Business at a University collected the following information about undergraduate students in her college:
Major        Male   Female   Total
Accounting   170    110      280
Finance      120    100      220
Marketing    160    70       230
Management   150    120      270
Total        600    400      1000
a) If a student is selected at random, what is the probability that the student is a female (F) and an Accounting major (A)?
P(A and F) = 110/1000 = 0.11
b) Given that the student is a female, what is the probability that she is an Accounting major?
P(A|F) = P(A and F)/P(F) = [110/1000]/[400/1000] = 0.275
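The same joint and conditional probabilities can be read off the table programmatically. The Python sketch below is not part of the original module; the dictionary layout and variable names are assumptions made for illustration.

```python
from fractions import Fraction

# counts[major] = (male, female), taken from the Dean's table
counts = {
    "Accounting": (170, 110),
    "Finance": (120, 100),
    "Marketing": (160, 70),
    "Management": (150, 120),
}

total = sum(m + f for m, f in counts.values())     # 1000 students
females = sum(f for _, f in counts.values())       # 400 female students

p_a_and_f = Fraction(counts["Accounting"][1], total)   # joint probability
p_f = Fraction(females, total)
p_a_given_f = p_a_and_f / p_f                           # conditional probability
print(float(p_a_and_f), float(p_a_given_f))             # 0.11 0.275
```

Dividing by P(F) restricts attention to the 400 female students, of whom 110 are Accounting majors.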
Let an experiment have a sample space S with E as any event. We define the probability of E occurring, written P(E), as a number satisfying the following conditions:
$P(S) = 1$, i.e., $\sum p_i = 1$, where $p_i$ is the probability of the i-th sample point.
Additional examples:
1. An experiment is performed by tossing a fair coin and observing which side (H or T) is shown uppermost.
a. Write down the sample space: S = (H, T)
b. Calculate P(H): P(H) = ½
c. Show that P(S) = 1: P(S) = P(H) + P(T) = ½ + ½ = 1
d. Show that E1 (H) and E2 (T) are mutually exclusive.
2. A fair die is rolled once as an experiment with S = (1, 2, 3, 4, 5, 6).
a. P(1 or 2) = P(1) + P(2) = 1/6 + 1/6 = 1/3
b. P(X < 4) = ½
c. P(even number) = ½
d. P(even or less than 4) = P(even number) + P(X < 4) – P(even number and X < 4) = 1/2 + 1/2 – 1/6 = 5/6
7.5. Counting Procedures
Permutation is any arrangement of r objects selected from n possible objects. The formula to count the total number of different permutations is
$$_nP_r = \frac{n!}{(n - r)!}$$
where $n! = n(n - 1)(n - 2) \cdots 2 \cdot 1$. By definition, 0! (read as zero factorial) = 1.
NB. The arrangements abc and bac are different permutations.
Example: If you have three guests (Addisu, Tilahun, Chala) invited to come to your house,
a. In how many ways can they sit on the chairs available in your house?
Sitting arrangements:
Addisu, Tilahun, Chala
Addisu, Chala, Tilahun
Tilahun, Addisu, Chala
Tilahun, Chala, Addisu
Chala, Addisu, Tilahun
Chala, Tilahun, Addisu
$$_3P_3 = \frac{3!}{(3 - 3)!} = 6$$
Therefore, there are 6 different arrangements for the three guests.
b. If you want to arrange seats for two guests out of three, in how many ways can you arrange them?
Addisu, Tilahun
Addisu, Chala
Tilahun, Addisu
Tilahun, Chala
Chala, Addisu
Chala, Tilahun
$$_3P_2 = \frac{3!}{(3 - 2)!} = 6$$
Therefore, there are 6 different sitting arrangements for the two guests.
c. What if you are trying to give a seat to one guest out of the three guests?
Addisu, Tilahun, Chala
$$_3P_1 = \frac{3!}{(3 - 1)!} = 3$$
Therefore, there are 3 different sitting arrangements for a guest.
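The listings above can be reproduced with Python's standard library. The sketch below is not part of the original module; it assumes Python 3.8 or later, where `math.perm(n, r)` evaluates n!/(n – r)!, and uses `itertools.permutations` to generate the arrangements themselves.

```python
from itertools import permutations
from math import perm

guests = ["Addisu", "Tilahun", "Chala"]

for r in (3, 2, 1):
    arrangements = list(permutations(guests, r))   # every ordered selection of r guests
    # the number listed should equal n! / (n - r)!
    print(r, len(arrangements), perm(len(guests), r))
```

The loop prints 6, 6 and 3 arrangements for r = 3, 2 and 1, matching parts a, b and c of the example.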
Combination: is a selection of r objects from a group of n objects in which the order of selection does not matter. The number of combinations is given by the formula
$$_nC_r = \frac{n!}{r!(n - r)!}$$
Example: If executives Abebe, Bekele and Chala are to be chosen as a committee to negotiate on the price of a car,
a. How many combinations of these three executives are possible?
Solution: $_3C_3 = \frac{3!}{3!(3 - 3)!} = 1$.
There is only one combination of these three. The committee of Abebe, Bekele and Chala is the same as the committee of:
Bekele, Chala and Abebe or
Chala, Abebe and Bekele
Bekele, Abebe and Chala
Chala, Bekele and Abebe
Abebe, Chala and Bekele
b. How many combinations are possible if two executives are supposed to negotiate to buy a car?
Abebe, Bekele
Abebe, Chala
Bekele, Chala
$_3C_2 = \frac{3!}{2!(3 - 2)!} = \frac{3!}{2! \cdot 1!} = 3$. Three combinations are possible.
c. How many combinations are possible if one executive is supposed to negotiate to buy a new car? Abebe, Bekele, Chala
$_3C_1 = \frac{3!}{1!(3 - 1)!} = \frac{3!}{1! \cdot 2!} = 3$. Three combinations are possible.
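The committee counts can be verified in the same way as the permutations. This Python sketch is not part of the original module; it assumes Python 3.8 or later for `math.comb`, and uses the executive names from the statement of the example.

```python
from itertools import combinations
from math import comb

executives = ["Abebe", "Bekele", "Chala"]

for r in (3, 2, 1):
    committees = list(combinations(executives, r))   # order within a committee is ignored
    print(r, comb(len(executives), r), committees)
```

Unlike permutations, each group of people appears only once, so there are 1, 3 and 3 committees for r = 3, 2 and 1.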
7.6. Probability Distributions and Random Variables
• Probability Distribution: It is a listing of all the outcomes of an experiment and the probability of each of these outcomes, presented either in tabular form or graphically.
• Random Variable: A random variable is a numerical value determined by the outcome of an experiment.
• Types of Probability Distributions
  - A discrete probability distribution can assume only certain outcomes.
  - A continuous probability distribution can assume an infinite number of values within a given range.
Examples of a discrete distribution are:
• The number of students in a class.
• The number of children in a family.
• The number of cars entering a carwash in an hour.
Examples of a continuous distribution include:
• The distance students travel to class.
• The time it takes an executive to drive to work.
• Features of a Discrete Distribution
The main features of a discrete probability distribution are:
• The sum of the probabilities of the various outcomes is 1.00.
• The probability of a particular outcome is between 0 and 1.00.
• The outcomes are mutually exclusive.
Example: Consider a random experiment in which a coin is tossed three times. Let x be the number of heads. Let H represent the outcome of a head and T the outcome of a tail.
• The possible outcomes for such an experiment will be: TTT, TTH, THT, THH, HTT, HTH, HHT, HHH.
• Thus the possible values of x (number of heads) are 0, 1, 2, 3.
• The outcome of zero heads occurs once.
• The outcome of one head occurs three times.
• The outcome of two heads occurs three times.
• The outcome of three heads occurs once.
• From the definition of a random variable, x as defined in this experiment is a random variable.
The probability distribution is given as
X P(X)
0 1/8
1 3/8
2 3/8
3 1/8
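The table can be reproduced by listing every equally likely sequence of tosses and counting heads. The sketch below is illustrative only: it assumes a fair coin, and the function name `heads_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(tosses):
    """Map each possible number of heads to its probability for a fair coin."""
    outcomes = list(product("HT", repeat=tosses))      # all equally likely sequences
    counts = Counter(seq.count("H") for seq in outcomes)
    return {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

distribution = heads_distribution(3)
for x, p in distribution.items():
    print(x, p)
```

Exact fractions are used so that the probabilities can be compared directly with the eighths in the table and seen to sum to 1.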
The Mean of a Discrete Probability Distribution
▪ The mean:
• reports the central location of the data.
• is the long-run average value of the random variable.
• is also referred to as its expected value, E(X), in a probability distribution.
• is a weighted average.
The mean is computed by the formula: $\mu = \sum [x\,P(x)]$, where $\mu$ represents the mean and
P(x) is the probability of the various outcomes x.
The Variance of a Discrete Probability Distribution
• The variance measures the amount of spread (variation) of a distribution.
• The variance of a discrete distribution is denoted by the Greek letter $\sigma^2$ (sigma squared).
• The standard deviation is the square root of $\sigma^2$.
The variance of a discrete probability distribution is computed from the formula:
$\sigma^2 = \sum [(x - \mu)^2\,P(x)]$
Examples:
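1. The tables below list values of a random variable X and their probabilities. However, only
one of these is actually a probability distribution:
  First table       Second table      Third table
X      P(X)       X      P(X)       X      P(X)
5      0.30       5      0.10       5      0.50
10     0.30       10     0.30       10     0.30
15     0.20       15     0.20       15    -0.20
20     0.40       20     0.40       20     0.40
a) Which one is a probability distribution? (The second table: its probabilities all lie
between 0 and 1 and sum to 1.00. The first table sums to 1.20 and the third contains a
negative probability.)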
b) Using the correct probability distribution, find the probability that X is
1) Exactly 15 (0.20)
2) Not more than 10 (0.40)
3) More than 5 (0.90)
c) Calculate the mean, variance and standard deviation of the correct probability
distribution.
$\mu = 5(0.10) + 10(0.30) + 15(0.20) + 20(0.40) = 0.5 + 3 + 3 + 8 = 14.5$
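Parts a) and c) of this example can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the original solution: the three tables are keyed by their position (`"first"`, `"second"`, `"third"`), and the helper names `is_distribution`, `mean` and `variance` are hypothetical.

```python
from math import isclose, sqrt

# The three candidate tables from the example, as {x: P(x)}.
candidates = {
    "first":  {5: 0.30, 10: 0.30, 15: 0.20, 20: 0.40},
    "second": {5: 0.10, 10: 0.30, 15: 0.20, 20: 0.40},
    "third":  {5: 0.50, 10: 0.30, 15: -0.20, 20: 0.40},
}

def is_distribution(table):
    """True if every probability lies in [0, 1] and the probabilities sum to 1."""
    in_range = all(0 <= p <= 1 for p in table.values())
    return in_range and isclose(sum(table.values()), 1.0)

def mean(table):
    """mu = sum of x * P(x)."""
    return sum(x * p for x, p in table.items())

def variance(table):
    """sigma^2 = sum of (x - mu)^2 * P(x)."""
    mu = mean(table)
    return sum((x - mu) ** 2 * p for x, p in table.items())

valid = [name for name, table in candidates.items() if is_distribution(table)]
correct = candidates[valid[0]]
print(valid, mean(correct), variance(correct), sqrt(variance(correct)))
```

For the valid table the variance works out to 27.25 and the standard deviation to about 5.22, alongside the mean of 14.5 computed above.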
2. According to recent information published in the Capital magazine, 36 percent of the
households in Ethiopia have one TV set, 47 percent have 2 sets, 15 percent have 3
sets, and 2 percent have 4 sets.
a) Depict the probability distribution
X P(X)
1 0.36
2 0.47
3 0.15
4 0.02
b) What is the mean number of sets per household?
$\mu = 1(0.36) + 2(0.47) + 3(0.15) + 4(0.02) = 1.83$
c) What is the variance of the number of sets per household?
$\sigma^2 = (1 - 1.83)^2(0.36) + (2 - 1.83)^2(0.47) + (3 - 1.83)^2(0.15) + (4 - 1.83)^2(0.02) = 0.5611$
3. The head of a department estimated the distribution of student admission to his
department for the next semester based on past experience as follows:
Admission Probability
1000 0.60
1200 0.30
1500 0.10
a) What is the expected number of students who will be admitted to the department next
semester? (Ans. 1110)
b) Compute the variance and standard deviation.
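A worked sketch for this example, applying the mean and variance formulas for a discrete distribution to the admission table. The variance and standard deviation below are computed here for illustration; they are not given in the original text.

```latex
\begin{align*}
\mu      &= 1000(0.60) + 1200(0.30) + 1500(0.10) = 600 + 360 + 150 = 1110 \\
\sigma^2 &= (1000 - 1110)^2(0.60) + (1200 - 1110)^2(0.30) + (1500 - 1110)^2(0.10) \\
         &= 7260 + 2430 + 15210 = 24900 \\
\sigma   &= \sqrt{24900} \approx 157.8
\end{align*}
```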
➢ The Binomial Distribution
▪ The binomial distribution has the following characteristics:
• An outcome of an experiment is classified into one of two mutually exclusive
categories, such as a success or a failure.
• The data collected are the results of counts.
• The probability of success stays the same for each trial.
• The trials are independent.
Mean & Variance of the Binomial Distribution
▪ The mean is found by: $\mu = n\pi$
▪ The variance is found by: $\sigma^2 = n\pi(1 - \pi)$
▪ To construct a binomial distribution, let
• n be the number of trials
• x be the number of observed successes
• $\pi$ be the probability of success on each trial
▪ The formula for the binomial probability distribution is:
$P(x) = {}_nC_x\,\pi^x (1 - \pi)^{n - x}$
Example: The Department of Labor reports that 20% of the workforce is unemployed.
From a sample of 14 workers, calculate the following probabilities:
• Exactly three are unemployed.
• At least three are unemployed.
• At least one is unemployed.
Solution
• The probability of exactly 3:
$P(x) = {}_nC_x\,\pi^x (1 - \pi)^{n - x}$
$P(3) = {}_{14}C_3\,(0.2)^3 (1 - 0.2)^{11} = 364 \times 0.008 \times 0.0859 = 0.2501$
• The probability of at least 3 is:
$P(x \ge 3) = {}_{14}C_3 (0.2)^3 (1 - 0.2)^{11} + {}_{14}C_4 (0.2)^4 (1 - 0.2)^{10} + \dots + {}_{14}C_{14} (0.2)^{14} (1 - 0.2)^0 \approx 0.552$
• The probability of at least one being unemployed:
$P(x \ge 1) = 1 - P(0) = 1 - {}_{14}C_0 (0.2)^0 (1 - 0.2)^{14} = 0.956$
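The binomial calculations in this example can be sketched in Python using only the standard library. The function name `binomial_pmf` is hypothetical; `math.comb` supplies the number of combinations.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P(x) = nCx * pi**x * (1 - pi)**(n - x)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

n, pi = 14, 0.20   # sample size and unemployment rate from the example

exactly_three = binomial_pmf(3, n, pi)
at_least_three = sum(binomial_pmf(x, n, pi) for x in range(3, n + 1))
at_least_one = 1 - binomial_pmf(0, n, pi)   # complement rule

mean = n * pi                  # mu = n * pi
variance = n * pi * (1 - pi)   # sigma^2 = n * pi * (1 - pi)

print(round(exactly_three, 4), round(at_least_three, 3), round(at_least_one, 3))
print(mean, variance)
```

Summing twelve terms by hand is tedious, so in practice "at least three" is often found through the complement, 1 - P(0) - P(1) - P(2), which gives the same value of about 0.552. For this sample the mean is about 2.8 unemployed workers, with a variance of about 2.24.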
➢ The Normal Probability Distribution
Characteristics of a Normal Probability Distribution
• The normal curve is bell-shaped and has a single peak at the exact center of the
distribution.
• The arithmetic mean, median, and mode of the distribution are equal and located at the
peak. Thus half the area under the curve is above the mean and half is below it.
• The normal probability distribution is symmetrical about its mean.
• The normal probability distribution is asymptotic. That is, the curve gets closer and
closer to the X-axis but never actually touches it.
• It is a continuous probability distribution.
• Theoretically, the curve extends to infinity in both directions.
The Standard Normal Probability Distribution
The standard normal distribution is a normal distribution with a mean of 0 and a standard
deviation of 1. It is also called the z distribution. A z-value is the distance between a selected
value, designated X, and the population mean, divided by the population standard deviation.
The formula is:
$z = \frac{X - \mu}{\sigma}$
Example: The bi-monthly starting salaries of recent MBA graduates follow the normal
distribution with a mean of Birr 2,000 and a standard deviation of Birr 200. What is the z-value
for a salary of Birr 2,200?
$z = \frac{X - \mu}{\sigma} = \frac{2{,}200 - 2{,}000}{200} = 1.00$
What is the z-value of Birr 1,700?
$z = \frac{X - \mu}{\sigma} = \frac{1{,}700 - 2{,}000}{200} = -1.50$
A z-value of 1 indicates that the value of Birr 2,200 is one standard deviation above the mean of
Birr 2,000. A z-value of -1.50 indicates that Birr 1,700 is 1.5 standard deviations below the mean of
Birr 2,000.
Example: The daily water usage per person in New Providence, New Jersey is normally
distributed with a mean of 20 gallons and a standard deviation of 5 gallons. About 68 percent of
those living in New Providence will use how many gallons of water? Since about 68% of the area
under a normal curve lies within one standard deviation of the mean (20 ± 5), about 68% of the
daily water usage will lie between 15 and 25 gallons.
➢ What is the probability that a person from New Providence selected at random will use between
20 and 24 gallons per day?
$z = \frac{X - \mu}{\sigma} = \frac{20 - 20}{5} = 0.00$ and $z = \frac{X - \mu}{\sigma} = \frac{24 - 20}{5} = 0.80$
• The area under a normal curve between a z-value of 0 and a z-value of 0.80 is 0.2881.
• We conclude that 28.81 percent of the residents use between 20 and 24 gallons of water per day.
➢ What percent of the population use between 18 and 26 gallons per day?
$z = \frac{X - \mu}{\sigma} = \frac{18 - 20}{5} = -0.40$ and $z = \frac{X - \mu}{\sigma} = \frac{26 - 20}{5} = 1.20$
• The area between the mean and a z-value of -0.40 is 0.1554.
• The area between the mean and a z-value of 1.20 is 0.3849.
• Adding these areas, the result is 0.5403.
• We conclude that 54.03 percent of the residents use between 18 and 26 gallons of water
per day.
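The z-values and areas in these examples can be checked without a printed table. The sketch below is an illustration that assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper names `z_value` and `probability_between` are hypothetical.

```python
from statistics import NormalDist

def z_value(x, mu, sigma):
    """Standardise x: its distance from the mean in standard deviations."""
    return (x - mu) / sigma

def probability_between(low, high, mu, sigma):
    """Area under the normal curve (mean mu, s.d. sigma) between low and high."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(high) - dist.cdf(low)

# Salary example: mean Birr 2,000, standard deviation Birr 200.
print(z_value(2200, 2000, 200), z_value(1700, 2000, 200))

# Water-usage example: mean 20 gallons, standard deviation 5 gallons.
print(probability_between(15, 25, 20, 5))
print(probability_between(20, 24, 20, 5))
print(probability_between(18, 26, 20, 5))
```

The computed areas agree with the table-based answers to about three decimal places. Small differences in the fourth decimal arise because table entries are rounded before they are added.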
Review Exercises
1. Sixty percent of the students at Scandia Tech drive to class and 30 percent have GPAs of
at least 3.00. Ten percent of the students have a GPA of at least 3.00 and drive to class. If we
select a student at random, what is the likelihood that the student has a GPA of at least 3.00 or
drives to class?
2. An insurance sales representative has an appointment with four clients today. From long
experience she knows that the probability of selling a policy to a client is .80.
a. What is the probability of selling a policy to all 4 clients?
b. What is the probability of selling a policy to three or more clients?
3. There are 600 employees at the Tuesday Morning’s Department Store corporate
headquarters in Columbia.
See the following breakdown.
Gender No College College Total
Male 25 225 250
Female 75 275 350
Total 100 500 600
An employee is selected at random.
a. What is the probability the employee is female?
b. What is the probability the employee is either female or attended college?
c. What is the probability the employee attended college, given that the employee is
female?
4. For a particular group of taxpayers, 25 percent of the returns are audited. Six taxpayers
are randomly selected from the group.
a. What is the probability two are audited?
b. What is the probability two or more are audited?
5. Suppose P(A) = 0.75 and P(B|A) = 0.40. What is the joint probability of A and B?
Debre Markos University, College of Business and Economics
Department of Economics
Introduction to Statistics Assignment for Distance Learners (30%)
Name: ____________________________________ID ___________Section ______
Part I: Write True if the statement is correct or False if it is incorrect (5 points)
1. Time series data is a data collection from a population at a given point in time.
2. A distribution with higher coefficient of variation is said to be more consistent than the
other distribution.
3. Class mark is the mid-way between the upper and lower class limits.
4. Quartile deviation can be taken as among the measures of variation/dispersion.
5. Measures of dispersion can assure which set of certain data is better represented by its
measure of central tendency value.
6. Among the methods of data collection, observational research requires conducting an
experiment and recording the results.
7. The sum of a cumulative frequency is always 1.
8. The intersection of less than and more than cumulative frequency curves gives the
median of a given data.
9. The difference between a histogram and a bar chart is that the former uses adjacent bars
while the latter uses non-adjacent bars.
10. Range remains unaffected by the magnitude of the extreme values.
Part II: Multiple choice Questions (10 pts)

1. When we find the probability of an event happening by subtracting the probability of the
event not happening from one (1), we are using
a. Subjective probability
b. The complement rule
c. The general rule of addition
d. The special rule of multiplication
e. Joint probability
2. The arithmetic mean of 100 items is 25. However, it is discovered that two items were
wrongly taken as 73 and 164 instead of 37 and 100. Find the correct arithmetic mean.
a. 22 b. 26 c. 24 d. none
3. The difference between permutation and combination is that:
a. In permutation order is important and in combination it is not
b. In permutation order is not important but in combination it is
c. A combination is based on the classical definition of probability
d. A permutation is based on the classical definition of probability
e. All of the above
4. Given the data values of a variable X = 2, 7, 5, 4, find $\sum X_i$, $\sum X_i^2$ and $\sum (X_i - \bar{X})$.
a. 16, 74, 6 b. 18, 94, 0 c. 18, 74, 0 d. 18, 94, 6 e. none
5. From a certain frequency distribution table, if the 1st class has an upper class boundary of
17.5 and the 2nd class has an upper class limit of 22, determine the class mark of the 2nd
class.
a. 19 b. 19.75 c. 20 d. none
6. Certain farmers around Debre Markos received aid in kind, which was 2% in 1999,
4% in 2000 and 5% in 2001. Find the geometric mean of the aid.
a. 3 b. 3.41 c. 4 d. 3.28
Results (out of 25%) of 20 students in a microeconomics test
Scores (25%)    No. of Students
6-10            5
11-15           7
16-20           8
Based on the above table, answer the following:
7. What is the arithmetic mean of the data?
a. 13.75 b. 18.75 c. 13.25 d. 15.25
8. What are the geometric mean and range of the data?
a. 10 & 20.87 respectively
b. 13.75 & 10 respectively
c. 12.87 & 10 respectively
d. 10 & 13.51 respectively
e. none
9. Indicate the median class of the table
a. 1st b. 2nd c. 3rd d. none
10. The production of clothes for a textile factory increased from 500000 in 1990 to 920000
in 2001. What would be the rate of production increase each year?
a. 2.056 b. 1.056 c. 0.056 d. none
Part III: Matching (5 Points)
Group A
1. Frequency distribution
2. Line graph
3. Bi-Modal
4. Variance
5. Coefficient of variation
Group B
a) 2, 7, 1, 1
b) Class must be non-exhaustive
c) Particularly effective to show changes in a variable over time
d) 2, 7, 1, 2
e) Formed by intersection of the class mark with the class frequencies
f) Effective in presenting cross-sectional data
g) Class limit should be mutually exclusive
h) 1, 7, 1, 2, 7
i) Absolute measures of dispersion
j) Relative measures of dispersion
Part IV: Work out (10 points)
1. The following data are given on the ages of 10 individuals:
18 21 24 37 27
27 34 29 26 20
a. Determine the number of classes (using the $2^k$ rule) (1 pt)
b. Determine the class interval (1 pt)
c. Construct a frequency distribution table with class limits and class boundaries (1.5 pts)
d. Calculate the class mark, absolute frequency, relative frequency and cumulative
frequency. (1.5 pts)
2. The following table shows the number of male and female students enrolled in the economics
department at DMU in the 2006 academic year.
Sex       Economics major    Non-economics major    Total
Males     100                900                    1000
Females   500                1500                   2000
Total     600                2400                   3000
From the above table, if a student is randomly selected:
a. What is the probability that the student is an economics major? (1 pt)
b. What is the probability that the student is a non-economics major and is female? (1 pt)
c. Given that a student is male, what is the probability that he is an economics major? (1 pt)
3. The following frequency distribution table reports the number of patients and their
respective ages in a given hospital.
Age of patient    Frequency
1-10              2
11-20             8
21-30             16
a. Find the coefficient of range using the class marks (1 pt)
b. Calculate the mean deviation from the mean (1 pt)

