Statistics
PREPARED BY:
MULUGETA DEREJE (M.Sc.)
YECHALE GETU (M.Sc.)
HAYMANOT BETSEHA (M.Sc.)
FEBRUARY, 2017
DEBRE MARKOS, ETHIOPIA
Table of Contents
CHAPTER ONE: INTRODUCTION
1.1. Definition of Statistics
1.2. Types of statistics
1.3. Why we study statistics?
1.4. Uses of statistics
1.5. Users of statistics
1.6. Application of statistics
1.7. Limitations of statistics
1.8. Steps of statistical investigation
4.2 Distribution, Shape and Measures of Central Tendency
4.3 Positional Measures
COURSE DESCRIPTION
We use statistical concepts intuitively in our daily lives and, believe it or not, we all think
statistically. If you doubt this, think how many times you have decided to take a jacket with you because
you predicted it would be cold later in the day, or how many times you have given a blood sample for a
medical test in a laboratory. In fact, modern society is driven by statistics. This is an
introductory course which helps students gain a preliminary knowledge of statistical tools,
methods and their applications. Data and probability related issues will be addressed. As the
course progresses, emphasis will be given to sampling theory, data collection and
presentation, measures of central tendency and variation, linear regression and elementary
probability theory. The rationale for providing Introduction to Statistics is to equip you with an
arsenal of techniques for understanding Statistics for Economists, which focuses on probability
theory, parameter estimation and hypothesis testing.
COURSE OBJECTIVES:
There are a number of symbols and their representation in the course material:
This tells you there is a question to answer or think about in the text.
This tells you that these are the answers to the activities and self-test questions.
CHAPTER ONE: INTRODUCTION
Chapter objectives:
1.1. Definition of Statistics
Statistics has two meanings. Let’s start with its layman definition. In the more common usage,
statistics refers to a collection of numerically expressed facts or data.
Examples:
Therefore, we define statistics as the science of collecting, organizing, presenting, analyzing, and
interpreting numerical data to assist in making more effective decisions.
According to Dominick Salvatore and Derrick Reagle “statistics refers to collection, presentation, analysis
and utilization of numerical data to make inferences and reach decisions in the face of uncertainty in
economics, business and other social and physical sciences.”
As the definition suggests:
Example: If students of economics at a university would like to know the monthly household income of
200 residents in Debre Markos town, then they
I. Have to collect the data, that is, the income of the households under study,
II. Should organize the data (say by arranging the data in ascending or descending order),
III. Should present the data by using charts, tables, etc.,
IV. And finally, they should do some analysis (say find the average, median, mode, variance,
standard deviation, etc.) and interpret the data.
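The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the income values are invented for the example (they are not survey data from Debre Markos), and the variable names `incomes`, `ordered` and `summary` are assumptions of this sketch.

```python
import statistics

# Step I (collect): hypothetical monthly household incomes in Birr.
# These eight values are made up purely for illustration.
incomes = [1200, 3000, 1500, 3000, 2200, 800, 4100, 3000]

# Step II (organize): arrange the data in ascending order.
ordered = sorted(incomes)

# Step IV (analyze): compute the usual summary measures.
summary = {
    "mean": statistics.mean(ordered),
    "median": statistics.median(ordered),
    "mode": statistics.mode(ordered),
    "variance": statistics.variance(ordered),  # sample variance (n - 1 divisor)
    "std_dev": statistics.stdev(ordered),      # sample standard deviation
}

# Step III (present): print the results as a small table.
for name, value in summary.items():
    print(f"{name:<10}{value:>14.2f}")
```

Interpreting the printed figures (for example, noting that the mean lies below the median for these values) is the final part of step IV and is left to the analyst.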
1.2.Types of Statistics
Dear distance Learner! Can you guess the types of statistics?
_______________________________________________________________________________________
______________________________________________________________________________________
I. Descriptive Statistics
It is a statistical method that deals with describing (summarizing) a given set of data
without making inferences about a larger population.
It involves collection, organization and presentation of data in an informative way.
Tables, graphs and numerical summary measures may be used to describe data.
In descriptive statistics, the statistician tries to describe a situation.
Examples on descriptive statistics:
Consider the national census conducted by the Ethiopian government in 1999 E.C.
Results of this census give the average age, average household income, and other
characteristics of the Ethiopian population and these are descriptive statistics.
A survey found that 51% of the population of Ethiopia is female. The statistic
51 describes the number out of every 100 persons who are female.
According to Consumer Reports, Sony TV owners reported 2 defective TVs per 100 TVs
(2%) in 2001. The statistic 2 (2%) describes the number of problems out of every 100
TVs.
According to the bureau of labor statistics, the average daily wage of workers in a
town was Birr 15 in August 2007.
The GDP of country X was 100 million in 2010 and 140 million in 2016. If we calculate
the percentage growth of GDP from 2010 to 2016, that is still a descriptive statistic.
What is the percentage growth of GDP from 2010 to 2016?
[Answer: $\frac{140 - 100}{100} \times 100\% = 40\%$]
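The same growth calculation can be written as a small Python function. This is a minimal sketch; the function name `percentage_growth` is an assumption of this example, not something defined in the module.

```python
def percentage_growth(initial: float, final: float) -> float:
    """Return the percentage change from `initial` to `final`."""
    return (final - initial) * 100 / initial


# GDP of country X: 100 million in 2010 and 140 million in 2016.
growth = percentage_growth(100, 140)
print(f"GDP growth from 2010 to 2016: {growth:.0f}%")
```

Because the figure only summarizes the two observed values, it remains a descriptive statistic.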
Question: Would it be descriptive statistics if we used this GDP growth rate (40%) to estimate
the GDP of country X in the year 2017? Why? What type of statistics is it?
II. Inferential Statistics
It is also called statistical inference or inductive statistics.
It is a statistical method that involves taking a sample from a population, computing the
statistic based on the sample, and inferring from the statistic about the value of the
corresponding parameter.
It is a branch of statistics that is used to determine something about the population on the
basis of a sample taken from that specific population.
It is a decision, estimate, prediction, or generalization about a population, based on a
sample.
Examples:
The accounting department of a large firm will select a sample of the invoices to check the
accuracy of all the invoices of the company.
Wine tasters sip a few drops of wine to make a decision with respect to all the wine waiting to be
released for sale.
Dear distance Learner! Can you guess the difference between population and sample? What about
parameter and statistic? ____________________________________________________________________
________________________________________________________________________
Good!
Note the words “population” and “sample” in the definition of inferential statistics.
When we discuss inferential statistics, we have to differentiate between a parameter and a statistic.
A parameter is a value calculated from a population (say population mean, population standard deviation,
etc.) and a statistic is a value calculated from a sample (say sample mean, sample standard deviation, etc.).
The difference between a sample statistic and its corresponding parameter is called sampling error.
i. If we want to do a research on the impact of high school GPA (transcript result) on college GPA of
economics students at a university, the population is all economics students at that university.
ii. A researcher may select all students of economics at Debre Markos University as a sample to know
the impact of high school GPA on college GPA and infer (conclude) something about the impacts of
high school GPA on college GPA of economics students at all Ethiopian colleges/universities.
Exercise
The marketing department of a bank asked a sample of 1960 customers to try a newly developed banking
system. Of the 1960 customers sampled, 1176 said they would use the new system if it were marketed. What would the
marketing department report to the bank officials regarding the acceptance of the new system in the
population? Is this an example of descriptive or inferential statistics?
Solution:
Based on the sample of 1960 customers, we estimate that, if it is marketed, sixty percent
(1176/1960 × 100%) of all customers will use the new system. This is inferential statistics, because a sample
was used to draw a conclusion about how all customers in the population would react if the new system
were marketed.
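The estimate in the solution can be reproduced with a short Python sketch. The variable names are assumptions made for this example; only the two counts come from the exercise.

```python
# Counts taken from the exercise.
sample_size = 1960   # customers who tried the new system
would_use = 1176     # customers who said they would use it

# Sample proportion: the statistic used to estimate the population proportion.
p_hat = would_use / sample_size

print(f"Estimated share of all customers who would use the system: {p_hat:.0%}")
```

The sample proportion describes only the 1960 customers; reporting it as an estimate for all customers is the inferential step.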
B. Researchers and/or students may be called on to conduct research in their fields, since
statistical procedures are basic to research.
To accomplish this, they must be able to design experiments; collect, organize, analyze and
summarize data; and possibly make reliable predictions or forecasts for future use.
They must also be able to communicate the results of the study in their own words.
C. Students, like professionals, must be able to read and understand the various statistical
studies performed in their field. To have such understanding, they must be knowledgeable
about the vocabulary, concepts and statistical procedures used in these studies.
D. Data is everywhere and no matter what your future line of work, you will make decisions that
involve data and understanding of statistical methods will help you make these decisions more
effectively.
1.4. Uses of Statistics
The importance of statistics is clearly stated in the following words of Carol D. Wright of USA: “to a very striking
degree, our culture has become a statistical culture. Even a person who may never have heard of an index
number is affected by those index numbers which describe the cost of living. It is impossible to
understand psychology, sociology, economics, business, finance, or physical science without some general
idea of the meanings of an average, of variations, of sampling, of how to interpret charts and tables.”
According to H.G. Wells, “statistical thinking will one day be as necessary for effective citizenship as the
ability to read and write.”
The main function of statistics is to enlarge our knowledge of complex phenomena. That is:
i. It presents facts in a definite and precise form. Example: Instead of saying that the per capita income
of Ethiopia is low, it is better and clearer to say that it is 110.
ii. It reduces data: i.e. it simplifies a complex mass of data and presents it in a few, clear, and useful
summaries. The bulky data may be summarized in totals, averages, percentages, etc.
iii. It measures the magnitude of variation in data.
iv. It furnishes a technique of comparison.
v. It helps to estimate the unknown population parameter from a sample.
vi. It helps to formulate and test hypotheses.
vii. It helps to study the relationship between two or more variables.
viii. It helps to forecast future events.
1.5. Users of Statistics
_______________________________________________________________________________________
_________________________________________________________________
Well!
Most people become familiar with statistics through radio, television, newspaper, and
magazines and statistical methods are used in almost all fields of human endeavor.
Statistical methods help people identify and solve many problems concerning the environment,
the economy, transportation, public health and other matters of public concern.
Economists use statistical techniques to predict future economic conditions, to understand
economic problems, to formulate economic policies, to do research in the areas of economics,
to do market analysis, etc.
Doctors use such methods to determine whether certain drugs help in the treatment of medical
problems.
Weather forecasters use statistics to help them predict the weather more accurately.
Engineers use it to set standards for product safety and quality.
Statistical ideas help scientists design effective experiments.
Lawyers are increasingly turning to statisticians to help weigh evidence and determine
reasonable doubt.
In education, the researchers might want to know if new methods of teaching are better than the
old ones.
1.6. Application of Statistics in Business and Economics
Dear distance learners, could you explain the applications of statistics to Business and Economics?
_______________________________________________________________________________________
__________________________________________________________________________________
Good!!
Nowadays the success of a particular business or industry very much depends on the accuracy
and precision of statistical analysis.
Before undertaking a new venture or improving an existing venture, the
business executives must have a large number of quantitative facts. Examples:
cost of raw materials,
demand of products in the market,
price of products in the market,
various taxes to be paid,
labor conditions,
sales forecast.
All these facts are to be analyzed statistically before stepping in for a new enterprise or before
fixing the price of a commodity.
Statistical methods are now used for exploring possibilities in
advertising campaigns,
for adjustment of production methods, and
as an aid to establishing standards.
Statistical techniques help in forecasting future markets.
Market research and market surveys by statistical sampling methods are now extremely useful for
any business person.
In industry, statistics is widely used in quality control.
In production engineering, to find whether the product conforms to specifications, statistical tools
like inspection plans, control charts, etc. are of great use.
Wide application of statistics can be found in insurance companies where the premium rates are
fixed on the basis of mortality, average length of life, possibilities of investment, etc.
1.7 Limitations of statistics
Statistics deals only with quantitative information, i.e. information should be capable of
being expressed numerically, either directly or indirectly.
Statistics deals only with aggregates of facts and not with individual data items.
Statistical results are only approximately correct, not mathematically exact.
Statistics can be easily misused and, therefore, should be used only by experts.
Misuse of statistics
Knowingly: Unknowingly:
Types of Variables
A variable is a measurable characteristic of a given phenomenon (object, process, event, etc.) which
can take different values in a given population or sample of elements; that is, it is a characteristic of
each element of a population or a sample.
Examples:
annual income (it can be Birr 200, Birr 300, Birr 400, or any other value),
quantity demanded (it can be 200 units, 300 units, 400 units, or any other value),
price (it can be Birr 2 per unit, Birr 4 per unit, Birr 10 per unit or any other value),
gender (female or male), etc.
Data (singular datum):
are the set of values collected for the variable from each of the elements of the sample
are the actual measurements or observations that result from an investigation or survey
are the values (response) of the variable associated with an element of a population or a
sample.
Example:
The variable monthly household income of a family in a town can assume different values
(say, Birr 1000, Birr 3000, etc). But if we collect the monthly household income of 100
households then the values are called data.
Data set: is a collection of data values (data). Example: the monthly household incomes of 100
residents in a town are called a data set.
Raw data: are data collected in their original form (not yet organized).
Information: is a set of data corresponding to a specific aspect of knowledge combined in an
organized way. Information is processed data that can be used directly. It can transfer knowledge and
meaning.
(1) Input (Raw data)
(2) Process:
-Organize the data
-Enter to the computer
-Find its mean
(3) Information
-says something to the user
-meaningful to the user
From the point of view of statistical methods, variables can be broadly classified into qualitative (or
categorical) and quantitative (or numerical) variables.
Qualitative Variable:
When the characteristic being studied is non-numeric, the variable is called qualitative variable or
attribute.
It is a variable or characteristic which cannot be measured in quantitative form but can only be
identified by name or categories.
Examples include; gender, religious affiliation, type of automobile owned, place of birth, eye color,
etc.
When the data are qualitative, we are usually interested in how many or what proportion fall in each
category. For example, what percent of the population are males? What percent of the population
owns a Nokia mobile phone?
Note that: Generally, although numerical codes can be assigned to the different categories of
variables, arithmetic operations (addition, subtraction, multiplication and division) are not
applicable to qualitative data.
Quantitative Variable:
Review exercises
1) In each of these statements, tell whether descriptive or inferential statistics have been used.
a) In the year 2015, the enrolment rate of elementary schools in Ethiopia will be 100%.
b) The average household income for people aged 25-34 is birr 2000/month.
c) Drinking coffee may raise cholesterol levels by 7%.
d) Some economists say that National Bank of Ethiopia (NBE) may increase the interest rate on
deposits to lower the money supply of the economy.
2) Classify each of the following variables as qualitative or quantitative.
a) Color of the automobile
b) Number of desks in classrooms
c) Gender (1=female, 0=male)
d) Number of pages in a book
3) Classify each of the following variables as discrete or continuous.
a) Water temperature of the Sauna at a given health spa
b) Income of a household
c) Life time of batteries in a tape recorder
d) Weights of newborn infants at a certain hospital
4) Consider the following :
Selling price of a house depends on the following factors:
5) Briefly explain the difference between the following concepts and give examples, if necessary.
a) Qualitative variable vs. quantitative variable
b) Quantitative data vs. qualitative data
c) Descriptive statistics vs. inferential statistics
d) Sampling vs. census
e) Parameter vs. statistic
f) Sample vs. population
6) Describe the importance of Statistics for an Economist.
7) Select a newspaper article (say, from the Ethiopian Herald) that involves a statistical study and write a paper
answering the following questions.
a. Is the study descriptive or inferential in nature? Explain your answer.
b. What are the variables used in the study? Classify the variables as qualitative or
quantitative
8) Which one of the following is not true?
a. Population is sometimes referred to as the universe
b. "The height of Ras Dashen mountain is 4440 m" can be considered a continuous variable
c. The ages of students at Debre Markos University is a variable
d. None
9) The difference between the sample mean and the population mean is called
a) Population mean
b) Population standard deviation
c) Standard error of the mean
d) Sampling error
10) The number of TVs sold by a certain shop during the months of November, December, January and
February, respectively are 25, 40, 35, and 32. Indicate whether the following conclusions belong to
the domain of descriptive statistics or inferential statistics.
a) During the four months, the average number of TVs sold per month was 33
b) Since the average number of TVs sold per month was small, the shop should invest more on
advertisement.
c) Out of the four months, the sale in November was the least.
d) The number of TVs sold in December was the highest because of Christmas.
CHAPTER TWO: SAMPLING THEORY
Chapter Objectives
_______________________________________________________________________________________
__________________________________________________________________________________
Well!
(i) Population or universe is a group of all elements /observations (persons, animals, objects,
measurements, etc) under consideration in a certain problem. The word population is a technical
term in statistics, not necessarily referring to people.
Examples:
(v) Sample is the small group that is chosen for the study. It is a part or portion or sub set of a
population taken so that some generalizations about the population can be made. The main
concern in sampling is to ensure that the sample accurately represents the population we are
interested to study. That is, samples are taken in a way that they will be representative of the
population.
(vi) Sampling is the process involving the selection of a finite number of elements from a given
population of interest for purposes of an inquiry. It is a process of taking samples from a
population of interest for the purpose of an inquiry. Example: In industry, the quality of a product is
assessed through sampling; public opinion on social, economic and political problems is
ascertained through sampling.
(vii) Sample size is the number of individuals or observations in a sample (usually denoted by n).
(viii) Parameter is any measurable characteristic of a population. Example: population means,
population standard deviations, population medians, etc.
(ix) Statistic is a number resulting from manipulation of sample data. That is, it is any measurable
characteristic of a sample. Example: sample means, sample standard deviations, sample medians,
etc. A statistic is used to estimate a population parameter such as the population mean (μ),
population standard deviation (σ), etc.
(x) The sampling error is the difference between a sample statistic and its corresponding population
parameter. It is the error that occurs because a sample has been taken instead of a census. For
example: the sample mean may differ from the true population mean.
(xi) Sampling Unit is the ultimate unit to be sampled (elements of the population to be sampled). It is
the unit of selection in the sampling process. Examples:
In a sample of households, the sampling unit is a household;
In a sample of students, a student is the sampling unit.
In a sample of districts, the sampling unit is a district, etc.
(xii) Sampling Frame is the list of all possible units in the reference population, from which a sample is
to be drawn. Example: If a researcher would like to do a research on poverty levels of residents in
a town and if s/he decided that the sampling unit for the study is an individual, then the sampling
frame would be the list of all individuals living in that town. A student roster is a sampling frame
for a sample of students.
(xiii) Sample design is a set of procedures for selecting the units from the population that are to be in
the sample.
(xiv) Sampling fraction (and sampling interval): the sampling fraction is the ratio of the number of units in the sample to the
number of units in the sampling frame or in the reference population. For example, a sampling
fraction or ratio of 1:3 corresponds to a sampling interval of 3, that is, 1 unit selected in every 3 units. This means that the
sample constitutes 33.3% of the total units in the sampling frame or in the reference population.
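Several of the terms defined above (sample size, parameter, statistic, sampling error and sampling fraction) can be seen together in a short Python sketch. Everything here is an assumption made for illustration: the population of 9,000 spending values is simulated rather than real, and the names `population`, `sample`, `mu` and `x_bar` are choices of this sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the sketch gives repeatable numbers

# Hypothetical population: monthly spending (Birr) of 9,000 students.
population = [random.randint(100, 500) for _ in range(9000)]

# Draw a sample of n = 3,000 units without replacement.
n = 3000
sample = random.sample(population, n)

sampling_fraction = n / len(population)   # 3000 / 9000, i.e. 1:3
mu = statistics.mean(population)          # parameter: population mean
x_bar = statistics.mean(sample)           # statistic: sample mean
sampling_error = x_bar - mu               # statistic minus parameter

print(f"Sampling fraction: {sampling_fraction:.1%}")
print(f"Population mean:   {mu:.2f}")
print(f"Sample mean:       {x_bar:.2f}")
print(f"Sampling error:    {sampling_error:.2f}")
```

In practice the population mean is unknown, so the sampling error cannot be computed directly; the sketch can do so only because the whole simulated population is available.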
Sample design: Probability sampling
Sample size: 2000 students selected from the sampling frame.
Sampling unit (unit of analysis): a student
Statistic: Students in the sample have spent an average of 300 birr per month.
Parameter: Students in the university are probably spending, on average, between 250 birr and
350 birr per month (estimate derived from sample statistic).
Dear distance learners, why do we use sampling instead of a census? Explain the advantages of
sampling.
____________________________________________________________________________
____________________________________________________________________________
Good!!
When studying characteristics of a population, there are many practical reasons why we prefer to select
samples of a population. Some of the reasons for sampling are:
(i) A census can be extremely expensive and time-consuming. Contacting every member of
a large population would require great expenditures of time and money, and sampling from
the list can provide satisfactory results more quickly and at much lower cost. Efficiency is
the commonly known advantage of sampling. For example: a researcher may wish to
determine the average annual income for households in Ethiopia. A sample of households
would take fewer days and lower cost than interviewing all the households in Ethiopia.
Therefore, a sample has to be taken.
(ii) The physical impossibility of checking all items in the population (sometimes a census is
impossible): Example: the populations of fish, birds, mosquitoes and the like are large and
constantly moving, being born and dying. Therefore, we just take some samples to do
research, as it is impractical to conduct a census of such populations.
(iii) A census can be destructive: The Awash wine factory, like every other winery, employs
wine tasters to ensure the consistency of product quality. Naturally, it would be
counterproductive if the tasters consumed all of the wine, since none would be left to sell
to the thirsty customers. Likewise, a firm wishing to ensure that its steel cable meets tensile-strength
requirements couldn't test the breaking strength of its entire output. As in the
Awash factory situation, the product "sampled" would be lost during the sampling process,
so a complete census is out of the question.
a) The sample results are usually adequate: In practice, a sample can be more accurate than a
census.
b) Speed: The collection and analysis of data can be done more quickly if the data are not
excessive. Time and energy are saved. That is, the data can be collected and summarized more
quickly with a sample than with a census. This is a valid consideration when the information is
urgently needed.
c) It enables the researcher to get more detailed information about a particular subject under
investigation. If only a few people are surveyed, the researcher can conduct an in-depth
interview by spending more time with each person, thus getting more information about the
subject. That is not to say the smaller the sample, the better; in fact, the opposite is true. In
general, larger samples, if correct sampling techniques are used, give more reliable information
about the population.
Disadvantages of sampling:
i. Reliability: If the sample is not a true representative of the population, then we may sacrifice
reliability in favor of less time and money.
ii. If complete information is required on each and every element of the population, census should
be applied.
2.3. Sampling Methods
Dear distance learners, explain the difference between probability sampling and non-probability
sampling.
____________________________________________________________________________
____________________________________________________________________________
Good!!
A probability sample is a sample selected such that each item in the population being studied has a known
chance (greater than zero) of being included in the sample. These methods remove human judgment from
the sampling process and so tend to give a more representative sample.
Methods of Probability Sampling: The four basic types of sampling methods are:
Simple random sampling, Systematic sampling,
Stratified sampling, and Cluster sampling.
Dear distance learner, describe the differences between the probability sampling methods.
_______________________________________________________________________________________
_________________________________________________________________________________
Good!!
The choice of which to use in any given situation will depend on the types of a problem being investigated,
aim of the research and the available resources.
a) Simple Random Sample (SRS): In SRS, each item in the population has a known, equal, non-zero
chance of being included in the sample.
Random samples are selected by using methods such as random numbers (which can be generated
by computers) or the lottery method. To select a simple random sample by the lottery method, follow
these steps:
I. Numbered or named papers representing a unit in the population are placed in a hat.
II. The papers are thoroughly mixed and the number of papers equal to the sample size is selected
from the hat. For a sample of 200 students, the researcher would select 200 papers.
III. The sample then consists of all units of the population corresponding to the selected papers.
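The lottery procedure above corresponds closely to drawing without replacement using Python's standard `random` module. This is a minimal sketch under stated assumptions: the helper name `lottery_sample`, the student labels and the population size of 1,000 are all hypothetical.

```python
import random


def lottery_sample(units, sample_size, seed=None):
    """Draw `sample_size` distinct units, like pulling papers from a hat.

    `seed` is optional and only makes the draw repeatable for checking.
    """
    rng = random.Random(seed)
    # random.Random.sample draws without replacement, so no unit repeats.
    return rng.sample(units, sample_size)


# Step I: one "paper" per unit in a hypothetical population of 1,000 students.
students = [f"student_{i}" for i in range(1, 1001)]

# Steps II and III: mix and draw 200 papers; the drawn units form the sample.
chosen = lottery_sample(students, 200, seed=42)
print(chosen[:5])
```

Every subset of 200 students is equally likely to be drawn, which is the defining property of a simple random sample.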
Random Number Table Method in SRS
I. The researcher assigns a number to each unit of the population and obtains a random number table.
II. Then s/he randomly selects a starting place (point), goes through the table across the rows or
down the columns and lists the numbers as they appear on the table.
III. Members of the population with the selected numbers constitute the sample.
IV. A random number table is a list of numbers generated by a computer that has been programmed
to yield a set of random numbers.
V. It is possible for a unit’s number to be selected more than once.
Advantage of SRS
I. Ensures that the sample is unbiased in that every individual and every sample has an equal chance of
being chosen.
II. SRS is the basic sampling method assumed in survey statistical computations. This can be used
with confidence.
Disadvantages of SRS
I. SRS requires a sampling frame and this is sometimes impossible (the case of fish population),
II. It is difficult to take samples if the reference population is scattered,
III. If the population is extremely large, it is tedious and time consuming to number and select the
sample,
IV. Minority subgroups of interest in the population may not be represented in the sample.
Note that: In SRS, when we apply the table of random numbers, we have to ignore repeated numbers and
those lying above the range of the population size. The following table shows random numbers
generated by a computer.
a random sample. Assuming that you are a research assistant, select a simple random sample of 10
clients.
Solution:
1. Number each client from 1 to 250 (based on alphabet of their names or identity
numbers),
2. Using the random numbers shown above, find the starting point. To find the starting
point, one generally closes one's eyes and places one's finger anywhere on the table. In
this case, let us select number 005 in the 6th row and 2nd column,
3. Going down the column and continuing to the next columns, select the first 10
numbers that are not repeated and do not exceed 250.
4. The numbers are 005, 042, 159, 049, 173, 172, 029, 221, 213 and 205. Therefore, clients
with these numbers will be included in the sample for further analysis.
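The rule of skipping repeated numbers and numbers above the population size can be sketched as a small filter in Python. The random number table itself is not reproduced here, so the list `stream` below is a hypothetical reading of such a table: it contains the ten numbers from the solution plus a few invented out-of-range and repeated entries to show what gets skipped. The function name `select_from_table` is also an assumption of this sketch.

```python
def select_from_table(table_numbers, population_size, sample_size):
    """Read numbers in order, keeping only valid, not-yet-selected ones."""
    selected = []
    for number in table_numbers:
        in_range = 1 <= number <= population_size
        if in_range and number not in selected:
            selected.append(number)
        if len(selected) == sample_size:
            break
    return selected


# Hypothetical sequence of three-digit numbers read down the table columns.
# 611, 980 and 305 exceed 250, and the second 42 is a repeat; all are skipped.
stream = [5, 42, 611, 159, 49, 42, 173, 980, 172, 29, 221, 213, 305, 205, 118]

sample = select_from_table(stream, population_size=250, sample_size=10)
print(sample)
```

With this assumed sequence the selected client numbers agree with those listed in step 4 of the solution.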
b) Systematic Sampling (Quasi-random sampling): In systematic sampling, the elements to be
included in a sample are picked at a constant interval. That is, the items or individuals of the
population are arranged in some order and a random starting point is selected from 1 through k
(where $k = \frac{\text{population size } N}{\text{sample size } n}$) and then every kth member of the population is selected for the
sample.
In systematic sampling:
A complete list of all the elements within the population (sampling frame) is required.
The procedure is to take every kth item from the sampling frame.
Let N= population size; n=sample size; k=sampling interval, k=N/n
Choose any number between 1 and k. Suppose it is j ($1 \le j \le k$).
The jth unit is selected first, and then the (j+k)th, (j+2k)th, etc. units are selected until the
required sample size is reached.
Example 1: Suppose there are 2000 subjects in the population and a sample of 50 subjects is
needed. Select a systematic sample of these 50 subjects.
Solution: The sampling interval (k) is 40 (2000/50). The number of the first subject to be included in the
sample is chosen randomly, for example, by blindly picking one out of 40 pieces of paper numbered 1
to 40. Suppose subject 12 was the first subject selected; then the sample would consist of the subjects whose
numbers are 12, 52, 92, etc. until 50 subjects are obtained.
A sample chosen this way is not strictly random: once the starting point is picked, all the other members
are determined, so not every possible group of n subjects has an equal chance of being selected.
Example 2: Suppose a researcher wants to know the impact of microfinance on the clients' household
income. S/he wishes to select 10 clients out of 250 clients and a research assistant is required to select
systematic samples. Assuming that you are a research assistant, select a systematic sample of 10 clients.
Solution:
1. Number each client from 1 to 250 (based on alphabet of their names or identity numbers),
2. Since there are 250 clients and 10 are to be selected, the rule is to select every 25th client. This rule
is determined by dividing 250 by 10, which gives 25,
3. The number of the first subject to be included in the sample is chosen randomly from numbers 1
to 25. In this case let us select number 5.
4. Then select every 25th number on the list starting from 5. The numbers include the following: 5, 30,
55, 80, 105, 130, 155, 180, 205 and 230. Therefore, clients with these numbers will be included in
the sample for further analysis.
Note: The answer is not unique, as it depends on which number is picked for the first subject to be
included.
If there is any sort of cyclic ordering of the subjects, the samples will not be representative of the
population. Example: If subjects in the population are arranged in a manner such as:
1) Defective item
2) Non-defective item
3) Defective item
4) Non-defective item
When the number to be added (k) is even, the selection of the starting point produces a sample of all
defective items or all non-defective items; when k is odd, the sample alternates between the two types.
Example: starting point = defective item and even k gives all defective items in the sample; starting point
= non-defective item and even k gives all non-defective items in the sample.
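The effect of cyclic ordering can be checked with a small sketch. It assumes, as in the list above, that odd positions hold defective items and even positions hold non-defective items; the helper names `systematic_positions` and `item_type` are hypothetical and used only for illustration.

```python
def systematic_positions(n_items, k, start):
    """Positions start, start + k, start + 2k, ... that do not exceed n_items."""
    return list(range(start, n_items + 1, k))

def item_type(position):
    # Assumed ordering: odd positions are defective, even positions are non-defective
    return "defective" if position % 2 == 1 else "non-defective"

# Even interval (k = 4), starting from a defective item at position 1
even_k_types = {item_type(p) for p in systematic_positions(20, 4, start=1)}
# Odd interval (k = 5), same starting point
odd_k_types = {item_type(p) for p in systematic_positions(20, 5, start=1)}

print(even_k_types)      # {'defective'}
print(len(odd_k_types))  # 2
```

With an even interval every selected position has the same parity as the starting point, so only one type of item ever enters the sample.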
Example: Moha Company stores boxes containing Pepsi and Mirinda in the following order.
1) Box containing Pepsi
2) Box containing Mirinda
3) Box containing Pepsi
4) Box containing Mirinda
.
.
.
200) .
The quality department of the company would like to check the expiry date of the products by taking a
systematic sample of 40 boxes containing either Pepsi or Mirinda. Assume that you are working in
the quality department of the company and select the required systematic sample. Is the sample you
selected representative?
Stratified Sampling: In stratified sampling, a population is first divided into subgroups, called strata
(singular stratum), and a sample is selected from each stratum based on simple random or systematic
sampling method. The strata are made according to various homogeneous characteristics such as sex,
race, region or institutional affiliation such as faculty. This sampling method is appropriate when the
distribution of the characteristic to be studied is strongly affected by certain variables. Note: Stratified
sampling is applied if the population is heterogeneous.
Stratified sampling can also be proportionate or non-proportionate. In the latter case, an equal number
of elements are drawn from each stratum while in the former case a proportionate number is obtained.
a) Proportionate Stratified Sampling: The number of units selected from each stratum is directly
proportional to the size of the stratum. If Pi represents the proportion of the population included in
stratum i, and n represents the total sample size, the number of elements selected from stratum i is
n × Pi.
Examples:
1) Let us suppose that we want a sample size of 30 to be drawn from a population size of 8000
which is divided in to three strata of size 4000, 2400 and 1600. Adopting proportional allocation:
i. Find the sample sizes under each stratum.
Solution: We shall get the sample size for the different strata:
Stratum 1: n1 = 30 × (4000/8000) = 15
Stratum 2: n2 = 30 × (2400/8000) = 9
Stratum 3: n3 = 30 × (1600/8000) = 6
Thus, using proportional allocation, the sample sizes for the different strata are 15, 9 and 6
respectively, which is in proportion to the sizes of the strata, namely 4000:2400:1600.
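The allocation rule n × Pi can be written as a short Python sketch. The function name `proportional_allocation` is an assumption made for illustration; the sketch does not handle the rounding that is needed when n × Pi is not a whole number.

```python
def proportional_allocation(stratum_sizes, sample_size):
    """Units to draw from each stratum: n * P_i, where P_i = N_i / N."""
    total = sum(stratum_sizes)
    return [sample_size * size / total for size in stratum_sizes]

# Strata of size 4000, 2400 and 1600; total sample size 30
print(proportional_allocation([4000, 2400, 1600], 30))  # [15.0, 9.0, 6.0]
```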
2) In a class of students, you can stratify the whole class on the basis of gender (F or M) and
draw either an equal number of students from each group (disproportionate) or an unequal
number of students from each group depending on the proportion of males to females in the
original class list (proportionate). Let us take a numerical example: If there are 50 students in a
class of which 10 are female and if 10 students are needed for some study,
a) select a proportionate stratified sample of 10 students (8M, 2F)
b) select a disproportionate stratified sample of 10 students (5M, 5F)
Advantage: The representation of the sample is improved
Disadvantages:
S.No Name Gender Grade level S.No Name Gender Grade level
1 Abebe M Fr 11 Melat F Fr
2 Bekele M So 12 Nigusie M Fr
3 Birtukan F Fr 13 Petros M So
4 Chaltu F Fr 14 Rosa F So
5 Dagmawit F Fr 15 Regassa M Fr
6 Dagne M Fr 16 Selam F Fr
7 Huluka M Fr 17 Solomon M So
8 Lulit F So 18 Tigist F So
9 Melaku M So 19 Tibeyin F So
10 Mohammed M So 20 Tirhas F So
Solution: Steps:
1) Divide the population into two groups based on gender as shown below:
Males Females
S.No Name Gender Grade Level S.No Name Gender Grade Level
1 Abebe K. M Fr 11 Melat A. F Fr
2 Bekele M. M So 12 Lulit L. F So
3 Dagne K. M Fr 13 Birtukan L. F Fr
4 Huluka G. M Fr 14 Rosa M. F So
5 Melaku J. M So 15 Chaltu C. F Fr
6 Mohammed A. M So 16 Selam A. F Fr
7 Nigussie K. M Fr 17 Dagmawit B. F Fr
8 Petros L. M So 18 Tigist M. F So
9 Regassa K. M Fr 19 Tibeyin Y. F So
10 Solomon K. M So 20 Tirhas W. F So
2) Divide each subgroup further in to two groups of freshman and sophomore as shown below:
Group 1 Group 2
S.No Name Gender Grade Level S.No Name Gender Grade Level
1 Abebe K. M Fr 1 Melat A. F Fr
2 Dagne K. M Fr 2 Birtukan L. F Fr
3 Huluka G. M Fr 3 Chaltu C. F Fr
4 Nigussie K. M Fr 4 Selam A. F Fr
5 Regassa K. M Fr 5 Dagmawit B. F Fr
Group 3 Group 4
S.No Name Gender Grade Level S.No Name Gender Grade Level
1 Mohammed A. M So 1 Lulit L. F So
2 Melaku J. M So 2 Rosa M. F So
3 Petros L. M So 3 Tigist M. F So
4 Solomon K. M So 4 Tibeyin Y. F So
5 Bekele M. M So 5 Tirhas W. F So
3) Determine how many students need to be selected from each subgroup to have a proportional
representation of each subgroup in the sample. There are four groups of equal size and, since a total of
eight students are needed for the sample, two students must be selected from each subgroup.
4) Select two students from each group by using random numbers. In this case we can select the
following students: Group 1: Students 5 & 4, Group 2: Students 5 & 2, Group 3: Students 1 & 3,
Group 4: Students 3 & 4.
5) The stratified sample then consists of the following students:
Group 1: Regassa K. and Nigussie K.
Group 2: Dagmawit B. and Birtukan L.
Group 3: Mohammed A. and Petros L.
Group 4: Tigist M. and Tibeyin Y.
c) Cluster Sampling: If the population is homogeneous and very large or resides in a large area, it is
costly and time consuming to take samples by using the three methods mentioned above. In this
case, we divide the population into groups called clusters and then select representative clusters
randomly. Finally, the sample is taken from the selected clusters. We can take either all
members of the selected clusters or select samples from within the clusters by using other sampling
techniques.
Procedures:
Advantages:
A list of all individual study units in the reference population is not required.
It reduces cost.
It simplifies field work and is convenient.
Disadvantage:
The members of the clusters are often more homogeneous than the members of the whole
population and therefore, it may not be representative.
The elements in a cluster may not have the same variation in characteristics as elements
selected individually from the population
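As a rough illustration of cluster sampling, the sketch below selects whole clusters at random and keeps every member of the selected clusters. The cluster and household names are hypothetical, and the function `cluster_sample` is an assumption made for this example rather than a procedure given in the module.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """One-stage cluster sample: pick clusters at random, keep all their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [unit for name in chosen for unit in clusters[name]]
    return chosen, members

# Hypothetical population: 5 clusters of 4 households each
clusters = {
    f"cluster_{c}": [f"household_{c}_{h}" for h in range(1, 5)]
    for c in range(1, 6)
}
chosen, members = cluster_sample(clusters, n_clusters=2, seed=1)
print(chosen)
print(len(members))  # 8
```

Only a list of clusters is needed to draw the sample, not a list of every individual unit in the population.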
d) Multi-Stage sampling: is a sampling technique that is used when the reference population is
large and widely scattered. Selection of samples is done in stages until the final sampling unit is
obtained. The number of stages of sampling is the number of times a sampling procedure is carried out.
The primary sampling unit (PSU) is the sampling unit in the first sampling stage and the secondary
sampling unit (SSU) is the sampling unit in the second sampling stage, etc. For example: the PSU can be
the weredas, the SSU can be the kebeles, etc. From the PSUs, we select a sample based on a suitable
method; each of the selected PSUs is further sub-divided into second stage units (say kebeles), and
from these SSUs a sample is again taken by some suitable method. Further stages may be added if
required.
Example:
Multistage sampling procedure was used to conduct a research entitled “Health Service Utilization in
Amhara Region of Ethiopia.”
Procedures followed:
The previous provinces of Gondar, Gojjam, and Wollo were each divided into two zones.
One of the two Gondar zones, one of the two Gojjam zones and one of the two Wollo zones
were randomly selected. Later one more zone, North Shoa, was included (total four zones).
Two districts were selected from each zone except North Shoa (one district only), for a total of
seven districts.
Two rural kebeles and one urban kebele were chosen from each selected district (14
rural kebeles and 7 urban kebeles).
Advantages: Cuts the costs of preparing sampling frame.
Disadvantages: Gives less precise estimate than SRS for the same sample size
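A two-stage version of multi-stage sampling can be sketched as follows. The sampling frame, the wereda and kebele names and the function `two_stage_sample` are all hypothetical, and simple random selection is assumed at both stages.

```python
import random

def two_stage_sample(frame, n_psu, n_ssu_per_psu, seed=None):
    """Stage 1: select PSUs at random. Stage 2: select SSUs within each selected PSU."""
    rng = random.Random(seed)
    selected_psus = rng.sample(sorted(frame), n_psu)
    return {psu: rng.sample(frame[psu], n_ssu_per_psu) for psu in selected_psus}

# Hypothetical frame: 6 weredas (PSUs), each with 10 kebeles (SSUs)
frame = {
    f"wereda_{w}": [f"kebele_{w}_{k}" for k in range(1, 11)]
    for w in range(1, 7)
}
sample = two_stage_sample(frame, n_psu=2, n_ssu_per_psu=3, seed=42)
for psu, kebeles in sample.items():
    print(psu, kebeles)
```

A list of kebeles is needed only for the weredas selected in the first stage, which is where the saving in preparing the sampling frame comes from.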
Non-Probability Sampling: In non-probability sampling, not every unit in the population has a chance of
being included in the sample and the process involves at least some degree of personal subjectivity
instead of following predetermined, probabilistic rules for selection. This sampling technique is:
Dear distance learner, describe the types of non-probability sampling methods.
_____________________________________________________________________________________
_____________________________________________________________ Good!!
a) Convenience Sampling: is a method in which a sample is chosen with ease of access being the
primary concern. Example: Interviews conducted in convenient locations such as student
lounge.
b) Purposive (Judgmental) Sampling: the researcher exercises deliberate subjective choice in
drawing the sample, selecting the units that s/he regards as most informative for the study being undertaken.
c) Quota Sampling: is a method that ensures that a certain number of sample units from different
categories with specific characteristics are represented. Here, judgmental and convenience
sampling methods are combined. Quota sampling can be applied for affirmative action.
Example: Suppose we know that 54% of the adults in a community are females, and the study
requires 100 respondents as a sample. In quota sampling, we might interview the first 54
females and the first 46 males.
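The quota example can be sketched in Python as follows. The stream of respondents is made up for illustration, and `quota_sample` is a hypothetical helper: it simply takes people in the order they are met until each category quota is filled.

```python
def quota_sample(respondents, quotas):
    """Take respondents in arrival order until every category quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person_id, category in respondents:
        if remaining.get(category, 0) > 0:
            selected.append((person_id, category))
            remaining[category] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled
    return selected

# Hypothetical stream of 300 adults met in order (alternating F and M for illustration)
stream = [(i, "F" if i % 2 == 0 else "M") for i in range(1, 301)]
sample = quota_sample(stream, {"F": 54, "M": 46})
print(len(sample))  # 100
```

No random selection is involved, which is why quota sampling is classified as a non-probability method.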
1. Sampling error: is the discrepancy between the population value (parameter) and sample value
(statistic). It may arise due to inappropriate sampling technique applied. It can be minimized by
increasing the size of the sample. When n = N, sampling error = 0
2. Non-sampling error (bias): is due to procedural bias such as:
Subjects’ non-response
Incorrect response
Problems with the sampling frame
Measurement error
Errors at different stages in processing the data.
Ensure that survey instruments are well prepared, simple to read, and easy to
understand.
Properly select and train interviewer to control data gathering bias or error.
Use sound editing, coding, and tabulating procedures to reduce the possibility of data
processing error.
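The statement that the sampling error is zero when n = N can be checked with a small simulation. The population below is hypothetical, and the sampling error is measured here as the absolute difference between the sample mean (statistic) and the population mean (parameter), which is one possible choice.

```python
import random
import statistics

def sampling_error(population, n, seed=0):
    """Absolute difference between the sample mean and the population mean."""
    sample = random.Random(seed).sample(population, n)
    return abs(statistics.mean(sample) - statistics.mean(population))

# Hypothetical population of N = 1000 values
population = list(range(1, 1001))
for n in (10, 100, 1000):
    print(n, sampling_error(population, n))
# With n = N = 1000 the sample is the whole population, so the error is 0
```

For a single random draw the error need not shrink at every step as n grows, but it tends to be smaller for larger samples and is exactly zero for a census.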
Review Exercises
1) What are the reasons of sampling? Discuss and give example for each reason.
2) Differentiate between parameter and statistic. Which one is the result of taking a sample?
3) Define systematic sampling and explain how it is carried out. Describe how you would obtain a
systematic sample of 80 students from a population of 1600 students.
4) Briefly explain the difference between the following concepts and give examples, if necessary.
Sampling vs. Census
Cluster sampling vs. Stratified sampling
Sampling frame vs. Sampling unit
5) Assume that you are going to undertake research on Ethiopian culture. Before taking a
sample, you observed that the cultures are highly diverse and large in number. Which type of
sampling method are you going to use so that your sample will represent all the cultures?
Why?
6) Briefly explain cluster sampling. For which type of population is it the preferred method of
selecting samples?
7) Assume that there are 500 students in FBE, DMU in five departments with students' size of 150,
100, 50, 150 and 50. Assume that 20 students are to be selected from these five department
students for scholarship based on probability sampling. Further assume that students from all
departments have equal chance of being selected, i.e., departments with large number of
students will send more students than others. If you are assigned to select 20 students from
FBE, then
a) Which type of sampling method are you going to use?
b) Determine the sample size to be selected from each department.
8) To study the reaction of students to a policy issued by a college, a sample of 100 students is
required. The number of male students is 1000 and the number of female students in the
college is 1500. If you want to select your sample of 100 students using a proportional
allocation, how many students of each sex should you include in your sample?
9) Suppose you are a Woreda administrator having five kebeles with respective population sizes of
10000, 5000, 15000, 20000, and 50000. If you are supposed to select 1000 representatives of the
Woreda, determine the number of individuals to be selected from each kebele so that your
selection is fair.
10) Classify each of the following samples as simple random, systematic, stratified or cluster
a. In a large school district, all teachers from two buildings are interviewed to determine
whether they believe the students have less homework to do now than in the previous
years.
b. Every 7th customer entering a shopping mall is asked to select his or her favorite shoes.
c. Nursing supervisors are selected using random numbers to determine annual salaries.
CHAPTER THREE: DATA COLLECTION AND PRESENTATION
Introduction
Dear distance learners! In chapter one, we defined statistics as the science of
collecting, organizing, presenting, analyzing, and interpreting numerical data in order to make
more effective and rational decisions. Data are any collection of raw facts, figures or numerical
results of any count or measurement collected from a population or sample that will be used to
draw a conclusion or make a decision. Thus the data collected should have a source, a
classification and a method of collection, and they should be organized and presented in a clear, precise
and understandable way. This chapter is concerned with these issues: classifications and
sources of data, and methods of data collection and presentation.
Chapter Objectives
Dear distance learners! In this section we will try to define data. Many people
use data and information interchangeably; however, there is a distinct difference
between the two terms. Information is processed, organized and structured data
that is presented in a given context so as to make it useful, while data are raw, unorganized facts
that need to be processed. Data can be something simple and seemingly random and useless
until it is organized and presented in a meaningful way. So, information is data that has been processed
and made meaningful. In short, data (the plural form), as we defined above, are any collection of raw
facts, figures or numerical results, or values (responses) of a variable, from any count or
measurement collected from a population or sample, which will be used to draw a conclusion or
inference, or to make a decision.
In research, statisticians/researchers use data in many different ways. Data can be used to
describe situations or events or to make an inference.
Dear distance learner, can you mention some classifications/types of data? Try to
answer it below.
_______________________________________________________________________
Great, the classifications of data are based on different criteria. They can be classified as
quantitative or qualitative data based on their nature; primary or secondary data based on their
source and as time series or cross sectional data, or panel data based on the role of time.
Secondary data, on the other hand, are those which have already been collected by someone
else and which have already been passed through the statistical process and used for some
purpose. They are not first-hand, new and original; rather, they have
already been collected and used by someone else for the same or another purpose. Secondary data
can be obtained from published and unpublished materials. Sources include various publications of central,
state or local governments; various publications of foreign governments or of international
bodies and their subsidiary organizations; technical and trade journals; books, magazines and
newspapers; reports and publications of various associations connected with business and
industry, banks, stock exchanges, etc.; reports and research results prepared by research
scholars, universities, economists, etc. in different fields; and public records and statistics,
historical documents, and other sources of published data. The Internet is also a source of
secondary data.
Time series data: Data collected over time (a sequence of periods) on one or more
variables; that is, data collected at several successive periods of time.
Example: The data collected by the researcher on one or more variables for 20 successive
periods/years can be taken as time series data.
Panel data: Panel data are collected from repeated surveys of a single (cross-section) sample
in different periods of time. They have elements of both time series and cross-sectional data, because
data are collected on the same elements (cross-section) for more than one period/year. Taking
our example for cross-sectional data, if the researcher collected data on the income level of those
1000 households for three or more consecutive years, we call it panel data.
3.3 Methods of data collection
Dear distance learner, can you mention methods of data collection? Try to answer
it below._______________________________________________________________________
In the previous section we categorized data based on their source as primary and secondary.
The methods of collecting primary and secondary data differ, since primary data are to be
originally collected from primary sources, while in the case of secondary data the data
collection work is merely the compilation of already existing data. So we will discuss
methods of data collection by considering primary data.
Dear distance learner, have you ever collected primary data? If your answer is yes, what
types of methods have you used? Try to answer it below
________________________________________________________________________
There are several methods of collecting primary data but the most important ones and most
widely used are listed below.
a. Interview method
b. Questionnaire Method
c. Observation Method
We briefly take up each method separately.
a. Interview method
The interview method of collecting data involves presentation of oral-verbal stimuli and reply in
terms of oral-verbal responses. According to Eckhard and Ermann, "Interviewing is a data
collection procedure involving verbal communication between the researcher and respondent
either by telephone or in a face to face situation".
The method of collecting data through interviews is usually carried out with
structured or unstructured interviews. Structured interviews
involve the use of a set of predetermined questions and of highly standardized
techniques of recording. Thus, the interviewer in a structured interview follows a rigid procedure
laid down, asking questions in a form and order prescribed. In contrast, unstructured
interviews are characterized by flexibility of approach to questioning. Unstructured interviews do
not follow a system of pre-determined questions and standardized techniques of recording the
data. In an unstructured interview, the interviewer is allowed much greater freedom to ask, in
case of need, supplementary questions. The interview can be conducted in person (face-to-face),
by telephone or by mail.
Personal (Face-to-Face) Interview
In this case the respondent and the interviewer have face-to-face contact and oral/verbal
communication; that is, the interviewer asks certain questions of the interviewee
(respondent). Usually the interviewer is expected to initiate the interview and collect the
information. The personal (face-to-face) interview has its own pros and cons.
Dear distance learner, can you list some pros and cons of the personal (face-to-face)
interview? Try to answer it below
Telephone Interview
This method of collecting information consists of contacting respondents by telephone. The
medium of communication is the telephone. It is not a very widely used method, but it plays an important
part in industrial surveys, particularly in developed regions.
The features of telephone interview are listed below:
Requires a relatively short span of time.
Has high response rate.
No field staff is required.
Less costly than personal interview.
Less effective in a community with few telephone lines.
Not all people have a chance of being surveyed, because some people may not have phones or
may not answer them.
It is faster than other methods.
Extensive geographical coverage may be restricted by cost considerations.
Mail Interview
The medium of communication is mail which can be electronic mail (e-mail).
Characteristics of mail interview
If one drafts a detailed questionnaire, it can be mailed to the respondent for filling or
can be put in charge of enumerators who go around and fill them after obtaining the
desired observation.
It is relatively less costly as compared to telephone and personal interview
The individual should be literate to give an appropriate response
Non-response error may be high if mailing is costly.
This survey can be used to cover a wider geographic area than telephone surveys or
personal interviews since mailed questionnaire surveys are less expensive to conduct.
It has low number of responses and inappropriate answers to questions.
It has low return rate.
Some people may have difficulty in reading or understanding the question.
b. Questionnaire Method
Questionnaire method is a method in which data are obtained with the help of a questionnaire,
which is prepared exclusively for the purpose. In other words, all the required data are collected
with the help of a set of questions. The questionnaire can be developed or adapted by the
researcher. A questionnaire can be either structured or unstructured.
Structured questionnaires are those in which there are definite,
concrete and pre-determined questions. The questions are presented with exactly the same
wording and in the same order to all respondents. When these characteristics are not present in a
questionnaire, it can be termed an unstructured or non-structured questionnaire. The types of
questions in a given questionnaire can be multiple choice (closed questions), dichotomous
(having only two choices, e.g. yes/no, female/male) and open-ended (where the respondents
are free to give any response).
c. Observation Method
The observation method is the most commonly used method especially in studies relating to
behavioral sciences. In a way we all observe things around us, but this sort of observation is not
scientific observation. Observation becomes a scientific tool and the method of data collection
for the researcher, when it serves a formulated research purpose, is systematically planned and
recorded and is subjected to checks and controls on validity and reliability.
The observation can be controlled / uncontrolled. If the observation takes place in the natural
setting, it may be termed as uncontrolled observation, but when observation takes place
according to definite pre-arranged plans, involving experimental procedure, the same is then
termed controlled observation. In non-controlled observation, no attempt is made to use precision
instruments.
Characteristics of Observation Method
We see what is happening and record it. E.g. traffic accident, etc
Observation relies on watching or listening, then, counting or measuring.
There are no respondents.
It is time consuming/expensive.
3.4 Data Presentation
Dear distance learners! Once the data have been collected from the subjects under study, they
have to be organized and presented in a precise and understandable way. Data organization is simply
the process of editing, classifying and arranging the given data set to make it understandable and
to eliminate unnecessary details. Data presentation is the process of presenting or expressing
data using tabular or graphical methods.
3.4.1. Tabular Methods of Data Presentation
Let’s start our discussion of tabular method of data Presentation by defining tabulation.
Tabulation is the arrangement of a given data set in tables. There are various techniques of
tabulation. The most widely used are data array and frequency distribution.
Data Array
Dear distance learner, what is a data array? Try to answer it below
a) Data Array
Data array is a table showing data arranged in descending or ascending order both for qualitative
and quantitative data. Descending order is the arrangement of data from the highest to the lowest
and ascending order is the arrangement of data from the lowest to the highest. Examples
Descending (100, 99, 98, 97 ……..)
Ascending (1, 2, 3,4,5,6,7,8,9 …………)
An alphabetical list of post office renters can be considered as a data array of qualitative
information. The first two examples show arrangements of quantitative data.
Dear distance learner, can you list some advantages of a data array? Try to answer
it below
Now we will try to see how we present the given data set (raw data) in data array. The following
data set (raw or ungrouped data) displays the Income level of 50 households in a certain town.
112 100 127 120 134 105 110 118 109 112
110 118 117 116 118 114 114 122 105 109
107 112 114 115 118 118 122 117 106 110
116 108 110 121 113 119 111 120 104 110
120 113 120 117 105 118 112 110 114 114
The data can be arranged in the data array either in descending or ascending order. Let us
arrange it in ascending order (lowest to the highest).
Table 3.1: Data Array
Ascending order
100 110 112 116 119
104 110 113 117 120
105 110 113 117 120
105 110 114 117 120
105 110 114 118 120
106 110 114 118 121
107 111 114 118 122
108 112 114 118 122
109 112 115 118 127
109 112 116 118 134
Maximum data value = 134, Minimum data value = 100, Range = 134 – 100 = 34
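The data array and the range can be reproduced with a short Python sketch. The variable names are chosen for illustration only; the values are the 50 household incomes listed above.

```python
incomes = [
    112, 100, 127, 120, 134, 105, 110, 118, 109, 112,
    110, 118, 117, 116, 118, 114, 114, 122, 105, 109,
    107, 112, 114, 115, 118, 118, 122, 117, 106, 110,
    116, 108, 110, 121, 113, 119, 111, 120, 104, 110,
    120, 113, 120, 117, 105, 118, 112, 110, 114, 114,
]

data_array = sorted(incomes)                 # ascending order (lowest to highest)
data_range = data_array[-1] - data_array[0]  # maximum value - minimum value

print(data_array[:5])                             # [100, 104, 105, 105, 105]
print(data_array[0], data_array[-1], data_range)  # 100 134 34
```

Passing `reverse=True` to `sorted` would give the descending arrangement instead.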
b) Frequency Distribution
A frequency distribution is a table that groups data into non-overlapping intervals called classes
and records the number of observations in each class. The frequency distribution summarizes
data in a condensed form that can be readily understood and easily interpreted. The reasons for
constructing a frequency distribution are:
To organize the data in a meaningful way
To enable researchers to draw charts and graphs for the presentation of data.
To enable a reader to make comparisons among different data sets.
Dear distance learners! There are some key Terms in frequency distribution. Some of them are
listed below.
Class: each category of the frequency distribution is called a class.
Frequency: the number of data values/observations falling within each class.
Total frequency: the sum of all class frequencies.
If the classes are x_1, x_2, ..., x_k with frequencies f_1, f_2, ..., f_k, then
$f_1 + f_2 + \dots + f_k = \sum_{i=1}^{k} f_i = n$, where n is the total frequency, that is, the number of observations.
Class Limits are the boundaries for each class. These determine which data values are
assigned to that class. Class limits can be lower or upper class limits, and they have the same
number of decimal places as the data values. The lower and upper class limits are the lowest and
highest values of the class respectively.
Class Boundaries are also called true class limits. They are the highest and the lowest values of a class
when there is no gap between successive classes. To compute class boundaries we first need the correction
factor, which is denoted by d.
d = Lower class limit of a class – upper class limit of the previous class. Then
1. We add d/2 to each upper class limit to get the upper class boundary of each class and
2. We subtract d/2 from each lower class limit to get the lower class boundary of each class.
Class interval is the width of each class. This is the difference between the lower
limit (or upper limit) of the class and the lower limit (or upper limit) of the next higher class. Or it is the
difference between the upper and lower class boundaries of any class. It is expected to be
a rounded number.
Approximate class width = range / number of classes desired
Range = maximum value - minimum value
Class Mark is the midpoint of each class. This is mid- way between the upper and lower
class limits. To be familiarized with these concepts try to work out on the distribution table
below.
Class Frequency
200 – 299 12
300 – 399 19
400 – 499 6
500 – 599 2
600 – 699 11
700 – 799 7
800 – 899 3
Total Frequency 60
As we said, a class is each category in the frequency distribution table, so there are 7 classes
in this frequency distribution table. Total frequency is the sum of the frequencies in each class, or
the total number of observations, i.e. 60. Now we will look at the class limits, class
boundaries, class interval and class mark of the first class.
A. Class Limits of the first class, lower (LCL) and upper (UCL) class limits.
LCL1=200 and UCL1 =299
B. Class boundaries of the first class, lower (LCB) and upper (UCB) class boundaries.
The lower class boundary is the midpoint between 199 and 200, the d is 1
LCB1 =200-1/2 =199.5
UCB1 =299+1/2 =299.5
C. Class interval/ width of the first class
299.5 -199.5 =100
D. class mark of the first class
(LCL1 + UCL1)/2
(200 + 299)/2 = 249.5
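The calculations for the first class can be checked with a few lines of Python. The function name `class_summary` and its dictionary keys are assumptions made for this illustration; d is the correction factor, which is 1 in this table.

```python
def class_summary(lower_limit, upper_limit, d=1):
    """Class boundaries, width and mark for one class, given correction factor d."""
    lower_boundary = lower_limit - d / 2
    upper_boundary = upper_limit + d / 2
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "width": upper_boundary - lower_boundary,
        "mark": (lower_limit + upper_limit) / 2,
    }

first_class = class_summary(200, 299)
print(first_class["boundaries"])  # (199.5, 299.5)
print(first_class["width"])       # 100.0
print(first_class["mark"])        # 249.5
```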
Guidelines for the frequency distribution
In constructing a frequency distribution for a given data set, the following guidelines should be
followed.
a) The set of classes must be mutually exclusive. That is, a given data value should fall into
only one class/category. There should be no-overlapping between classes and limits.
b) The class must be exhaustive. That is, we have to include all possible data values. No data
value should fall outside the range covered by the frequency distribution.
c) If possible, the classes should have equal widths. Unequal class widths make it difficult to
interpret both frequency distribution and their graphical presentation. One exception occurs
when there is an open-ended distribution i.e., it has no specific beginning value or no specific
ending value.
Example: class
< 10 (meaning that any value below 10 will be tallied in this class)
10 - 20
21 – 31
32 – 42
43 – 53
54 – 64
≥ 65 (meaning that any value of 65 or above will be tallied in the last class)
Generally, in open – ended classes, the lowest class lacks a lower limit or the highest class lacks
an upper limit. Open – ended classes are classes with either no lower limit or no upper limit.
1. Arrange the data in some order
2. Find the range
3. Find the desired number of classes. There is no hard and fast rule to determine the number
of classes of a data set; it is a subjective process. In general, 5 to 20 classes will be
suitable or recommended. In determining the number of classes of a data set we can use
Sturges' formula:
k = 1 + 3.322 log(n), where log is the base-10 logarithm, n is the number of observations and k is the
desired number of classes, which should be rounded to the nearest whole number.
4. Find the class interval or width
Class width = Range / Number of classes; it is again recommended that it be rounded to the nearest
whole number.
5. Select a starting point for the lowest class limit. This can be the smallest data value or any
convenient number less than the smallest data value. Add the width to the lowest score
taken as the starting point to get the lower limit of the next class. Keep adding until there
are k classes. Subtract one unit from the lower limit of the second class to get the upper
limit of the first class. Then add the width to each upper limit to get all the upper limits.
6. Tally the data
7. Find the frequency from the tallies
Let us use our previous data set, which shows the household income level of 50 households (hh).
Data set
112 100 127 120 134 105 110 118 109 112
110 118 117 116 118 114 114 122 105 109
107 112 114 115 118 118 122 117 106 110
116 108 110 121 113 119 111 120 104 110
120 113 120 117 105 118 112 110 114 114
To construct the frequency distribution table (to group the data):
1. Array the data.
2. Find the range: 134 − 100 = 34.
3. Determine the number of classes (k) using Sturges' formula:
$k = 1 + 3.322 \log_{10} n = 1 + 3.322 \log_{10} 50 = 6.64 \approx 7$, where $3.322 \log_{10} 50 = 5.64$.
4. Then the class width is
Class width = Range / Number of classes = 34 / 7 ≈ 4.9, which is rounded to 5.
5. Select a starting point for the lowest class limit. Let us use 100 (the smallest value) as the
starting point. Add the width to the starting point to get the
lower limit of the next class. Keep adding until there are 7 classes. Subtract one unit from
the lower limit of the second class to get the upper limit of the first class. Then add the
width to each upper limit to get all the upper limits.
105– 1 = 104
1st class = 100 – 104
2nd class = 105 – 109, etc.
6. Then Tally the data and
7. Find the frequency from the tallies
The completed frequency distribution is given as:
Table 3.2: Constructing frequency distribution table
Class Frequency Class boundaries
100-104 2 99.5-104.5
105-109 8 104.5-109.5
110-114 18 109.5-114.5
115-119 13 114.5-119.5
120-124 7 119.5-124.5
125-129 1 124.5-129.5
130-134 1 129.5-134.5
Total frequency 50
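The seven steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `frequency_distribution` is hypothetical, the smallest value is taken as the starting point, and both k and the class width are rounded to the nearest whole number as in the worked example.

```python
import math

incomes = [
    112, 100, 127, 120, 134, 105, 110, 118, 109, 112,
    110, 118, 117, 116, 118, 114, 114, 122, 105, 109,
    107, 112, 114, 115, 118, 118, 122, 117, 106, 110,
    116, 108, 110, 121, 113, 119, 111, 120, 104, 110,
    120, 113, 120, 117, 105, 118, 112, 110, 114, 114,
]

def frequency_distribution(data):
    """Return a list of (lower limit, upper limit, frequency), one per class."""
    n = len(data)
    data_range = max(data) - min(data)        # step 2: range
    k = round(1 + 3.322 * math.log10(n))      # step 3: Sturges' formula
    width = round(data_range / k)             # step 4: class width
    start = min(data)                         # step 5: starting point (assumed: smallest value)
    table = []
    for c in range(k):
        lower = start + c * width
        upper = lower + width - 1             # one unit below the next lower limit
        freq = sum(lower <= x <= upper for x in data)   # steps 6-7: tally and count
        table.append((lower, upper, freq))
    return table

for lower, upper, freq in frequency_distribution(incomes):
    # class limits, frequency, class boundaries
    print(f"{lower}-{upper}  {freq:>2}  {lower - 0.5}-{upper + 0.5}")
```

Running the sketch on the 50 household incomes reproduces the classes and frequencies of Table 3.2.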
Dear distance learner, can you define these three types of frequency distributions? Try
to answer below.
Dear distance learner, given the following frequency distribution, we will see how to
compute the absolute, relative and cumulative frequencies.
Table 3.3 Absolute, relative and cumulative frequencies
Class     Class         Absolute     Cumulative    Relative     Cumulative
limits    boundaries    frequency    frequency     frequency    relative frequency
24-30     23.5-30.5     3            3             3/25         3/25
31-37     30.5-37.5     1            4             1/25         4/25
38-44     37.5-44.5     5            9             5/25         9/25
45-51     44.5-51.5     9            18            9/25         18/25
52-58     51.5-58.5     6            24            6/25         24/25
59-65     58.5-65.5     1            25            1/25         25/25
Total                   25                         1
Here $\sum_i f_i = n = 25$ and $\sum_i \frac{f_i}{n} = 1$, where the sums run over all the classes.
Furthermore, cumulative frequency distributions can be classified as “less than” and/or “more
than” cumulative frequency distributions. The “less than” cumulative frequency of a class is the
total frequency of all values less than the upper boundary of the class and the “more than”
cumulative frequency of a class is the total frequency of all values which are greater than the
lower boundary of the class.
By using the previously constructed frequency distribution table we can see the above types of
frequencies.
Table 3.4 Less than and more than cumulative frequency distributions
Class      Absolute     Upper class    Less than cumulative    Lower class    More than cumulative
           frequency    boundary       frequency               boundary       frequency
100-104    2            104.5          2                       99.5           50
105-109    8            109.5          10                      104.5          48
110-114    18           114.5          28                      109.5          40
115-119    13           119.5          41                      114.5          22
120-124    7            124.5          48                      119.5          9
125-129    1            129.5          49                      124.5          2
130-134    1            134.5          50                      129.5          1
Total      50
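As a rough check of the table, the relative, "less than" and "more than" cumulative frequencies can be computed from the absolute frequencies with running sums. The variable names below are illustrative only.

```python
from itertools import accumulate

frequencies = [2, 8, 18, 13, 7, 1, 1]   # absolute frequencies of the seven classes
n = sum(frequencies)                    # total number of observations

# Relative frequency: each class frequency divided by the total.
relative = [f / n for f in frequencies]

# "Less than" cumulative frequency: running total from the first class downwards.
less_than = list(accumulate(frequencies))

# "More than" cumulative frequency: running total from the last class upwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]

print(less_than)   # [2, 10, 28, 41, 48, 49, 50]
print(more_than)   # [50, 48, 40, 22, 9, 2, 1]
```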
Based on the above frequency distribution table we can interpret the results in different ways.
Example
31 (18 + 13) of the households earn a monthly income of birr 110 – 119.
62% of the households earn a monthly income of birr 110 – 119 (31/50 × 100%).
28 of the households earn a monthly income of less than birr 114.5.
40 of the households earn a monthly income of more than birr 109.5.
Many other interpretations are possible.
Dear distance learners! One can construct several different but correct and acceptable
frequency distributions for the same data by using:
a different class width
a different number of classes or
a different starting point
The Histogram: A histogram is a graph that displays the data using adjacent vertical rectangles (adjacent unless
the frequency of a class is zero) of various heights to represent the frequencies of the classes. The
tallest rectangle in a histogram is associated with the class having the greatest number of
observations (frequency), and the shortest with the class having the fewest.
In a histogram the class boundaries are marked on the horizontal axis and the class frequencies
on the vertical axis. The heights of the rectangles (along the y-axis) can represent
either the absolute or the relative frequencies of the classes. We would reach the
same conclusions, and the shape of the histogram would be the same, if we used a
relative frequency distribution instead of the absolute (actual) frequencies. The only difference is
that the vertical axis would be reported in percentages (proportions) of households instead of
the number of households. The following frequency distribution will help us to construct a
histogram.
Class boundaries Absolute frequency
99.5-104.5 2
104.5-109.5 8
109.5-114.5 18
114.5-119.5 13
119.5-124.5 7
124.5-129.5 1
129.5-134.5 1
Total 50
To construct a histogram we mark the class boundaries on the horizontal axis and we mark the
frequencies on the vertical axis, as we said above the frequencies can be absolute or relative.
Then using the frequencies as the heights, we draw vertical bars for each class
Figure 3.1 Histogram
[Histogram: class boundaries from 99.5 to 134.5 on the horizontal axis, frequency (0 to 18) on the vertical axis; bar heights 2, 8, 18, 13, 7, 1 and 1]
Dear distance learner, from the above histogram, which class contains the greatest number
of data values (frequencies)? Try to answer below.
The frequency polygon: The frequency polygon consists of line segments connecting the points
formed by the intersection of the class marks with the class frequencies. Relative frequencies
or percentages may also be used in constructing the figure. Empty classes are included at each
end so that the curve intersects the X – axis and the frequency polygon is closed.
To construct a frequency polygon we mark the class marks on the horizontal axis and the
frequencies on the vertical axis. As in the case of the histogram, the frequencies can be absolute or
relative.
Using the frequency distribution given above (in constructing the histogram), we can construct a
frequency polygon. There are some steps we should follow.
Find the class marks
Class boundaries Class mark Frequency
99.5 - 104.5 102 2
104.5 - 109.5 107 8
109.5 - 114.5 112 18
114.5 - 119.5 117 13
119.5 - 124.5 122 7
124.5 - 129.5 127 1
129.5 - 134.5 132 1
1. Draw the x – y axes. Label the x – axis with the class marks and use a suitable scale on
the y – axis for the frequencies (absolute or relative).
2. Connect the coordinates (x, y) with line segments.
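The points to be joined can be listed with a short sketch. It assumes, as in the description above, that one empty class of the same width is added at each end so that the polygon closes on the x-axis; the variable names are illustrative.

```python
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5, 124.5, 129.5, 134.5]
frequencies = [2, 8, 18, 13, 7, 1, 1]

# Class mark = midpoint of the lower and upper boundary of each class.
class_marks = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
width = boundaries[1] - boundaries[0]

# Add an empty class (frequency 0) before the first and after the last class.
xs = [class_marks[0] - width] + class_marks + [class_marks[-1] + width]
ys = [0] + frequencies + [0]

points = list(zip(xs, ys))
print(points)   # the (class mark, frequency) pairs to connect with line segments
```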
Figure 3.2 Frequency polygon
[Frequency polygon: class marks 97 to 137 on the horizontal axis, frequency (0 to 20) on the vertical axis; plotted frequencies 0, 2, 8, 18, 13, 7, 1, 1 and 0]
Dear distance learner, now we are going to discuss the cumulative frequency graph (ogive).
The cumulative frequency graph (ogive): The ogive is a graph that displays cumulative
frequencies. There are two types of cumulative frequency graphs (ogives): “more than” and
“less than” cumulative frequency graphs.
Example: Construct an ogive for the frequency distribution given in the example above (used in
constructing the histogram and frequency polygon). As in the case of the histogram, there are some steps we
should follow:
1. Find the cumulative frequency for each class.
2. Draw the x – y axes and label the x – axis with the class boundaries and the y – axis with the
cumulative frequencies.
3. Plot the cumulative frequency at each upper class boundary. Upper class boundaries are
used since the cumulative frequencies represent the number of data values accumulated
up to the upper boundary of each class.
[Figure: less than cumulative frequency graph (ogive); class boundaries 99.5 to 134.5 on the horizontal axis, cumulative frequency (0 to 60) on the vertical axis]
Cumulative frequency graphs (less than cumulative frequency) are used to visually represent
how many values are below a certain upper class boundary. For example, to find how many
households earn less than 114.50 birr, we can locate 114.5 birr on the x – axis, draw a vertical
line up until it intersects the graph, and then draw a horizontal line from that point to the y – axis.
The value is 28 households.
Dear distance learner, please try to construct the more than cumulative frequency graph
(ogive) by yourself using the above example.
Note: The abscissa (x-value) of the point of intersection of the two ogive curves (less than and
more than) gives the median of the given data. We will discuss the median in the next
chapter.
3.4.3 Diagrammatical data presentation
Diagrams make a data set more attractive to the eye. As such, they are better
suited for publicity and propaganda. The most commonly used diagrams in economics and
business are the following:
a) Line graphs
b) Bar charts
c) Pie – charts
Line graphs (charts) are particularly effective for business and economic data to show the
changes or trends in a variable over time. Line graphs (charts) are ideal for time series
data. The variable of interest, such as the number of units sold or the total value of sales, is
scaled along the y – axis and time along the x – axis. Line graphs are widely used by investors to
support decisions to buy and sell stocks and bonds in the financial market. The idea is to
show a trend that is likely to continue into the future, and to use that pattern to make
predictions for the immediate future. Two or more series of data can be plotted on the same line
chart. Thus a chart can show the trend of several different variables, and this allows for a
comparison of several series over the same period of time.
The line graph (chart) below shows the unemployment rate of a country from 1992 to 2000.
Figure 3.4 Line graph (chart) presentation of data
[Line graph: unemployment rate (vertical axis, 0.00% to 18.00%) by year (horizontal axis, 1990 to 2002), with a linear trend line; data labels shown: 15.70%, 14.80%, 14.60%, 13.70%, 13.50%, 12.40%, 11%, 11.30%, 10.20%]
Dear distance learner! From the above graph we can see that the unemployment rate decreases
from around 1992, reaches its minimum in 1995 (approximately 10%) and then starts to
increase.
Bar charts: Bar charts are more applicable when the horizontal axis deals with data that are
qualitative or non-continuous in nature, e.g. gender, marital status, etc. When we represent
data using bar charts, the bars are not joined together. All the bars must have equal width and
the distance between bars must be equal.
Example
[Bar chart: earnings per year (vertical axis, 0.00 to 80,000.00) by education level (High school Diploma, Bachelor Degree, Master’s Degree); the Master’s Degree bar is labelled 73,165.00]
Pie chart: A pie chart is most commonly used to display percentages, although it can be
used to display frequencies or relative frequencies. The whole pie (or circle) represents the
total sample or population. We then divide the pie into portions that represent the
different categories/classes.
Example: A sample of 200 students were asked to select the department in which they can
be more effective. The following data show the number of students in each department. Draw
a pie chart based on the following data.
Figure 3.6 Pie chart presentation of data
[Pie chart of percentages: Economics 46%, Management 24.50%, Accounting 18.50%, Banking 6.50%, Marketing 4.50%]
Dear distance learner! Using the same procedure, please draw a pie chart based on the
following data.
Assume a typical person has an average monthly expenditure of 1500, 2000, 300 and 800 birr
on food, clothing, transportation and other goods and services respectively.
Construct a pie chart that represents this data set.
Review Questions:
1. What are the differences between time series and panel data?
2. Briefly explain the concept of a cumulative frequency distribution. How are the more than
and less than cumulative frequencies calculated?
3. When do we use line graphs/charts?
4. Briefly explain the three types of data collection methods.
5. Suppose you are asked to group the final exam marks (out of 80) of 50 students
using a uniform class interval. The marks of the students are listed below.
21 18 30 40 41 33 73 25 23 25
19 33 65 17 20 76 47 69 20 31
18 24 35 24 17 36 65 70 53 25
65 16 24 29 42 37 26 46 27 63
22 22 23 26 71 37 75 25 27 23
A. Find the number of classes.
B. Find the class width.
C. Construct a frequency distribution table with class boundaries (take 16 as the lower class
limit of the 1st class).
6. Why do we use graphs and diagrams to present a given data set?
7. From a certain frequency distribution table, if the upper class boundary and
lower class limit of the 3rd class are 20.5 and 16 respectively, determine the class mark of the 3rd class.
8. The following frequency distribution table displays the money spent by 100 foreign
visitors on visiting Hailessilase palace in Bahir Dar.
Amount Spent (in $) Number of customers
3-7 10
8-12 30
13-17 35
18-22 20
23-27 5
100
Then find the class mark of each class, the relative frequencies and the cumulative frequencies.
CHAPTER FOUR: MEASURES OF CENTRAL TENDENCY
Introduction
Dear distance learners! In the previous chapter, you studied the classification of
data and learnt how to represent data using tabular or
graphical/diagrammatic methods such as bar graphs, histograms,
pie charts, ogives and frequency polygons. In addition to these methods, a data set can
be described using a range of numerical measures. Perhaps the best place to start is with some
measure or measures of central location/tendency. Measures of central tendency are used to
define, in some sense, the centre of a set of measurements. The most commonly used measures
of central location are the mean, median and mode. The relative values of these measures
depend very much on the shape and position of the distribution of the data they are
describing. Thus, in this chapter, we will discuss how to compute these measures of central
tendency for both ungrouped and grouped data. We shall also discuss the positional
measures: quartiles, deciles and percentiles.
Chapter Objectives
Dear distance learners! Before we discuss the types of measures of central
tendency, let us define central tendency. Central tendency refers to the location in a distribution
around which most of its values are concentrated. Measures of central tendency
indicate the middle, most likely or most frequent values. In other words, they
tell us where the center of the distribution of the data is located.
The three most commonly used measures of central tendency are the mean, median and
mode.
a. Arithmetic Mean: The arithmetic mean is the sum of the data set values divided by the number
of observations. In computing the mean, summation (sigma) notation is a convenient and
simple shorthand that gives a concise expression for the sum of the values of a
variable. The general formula is given by
$\sum_{i=1}^{n} X_i = X_1 + X_2 + \dots + X_n$
which can be read as: sum the values of X from i = 1 to n, where n can be any positive whole number
(1, 2, 3, 4, ...).
Dear distance learners, the arithmetic mean or average value of a variable is the most important
numerical measure of central tendency. For ungrouped data, the population mean (usually
denoted by $\mu$) is the sum of all the population values divided by the total number of population
values:
$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$
where: N = number of elements in the population
$\mu$ = population mean
The population mean applies when the data represent all of the items within the population.
For ungrouped data, the sample mean is the sum of all the sample values divided by the number
of sample values:
$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
where: $\bar{X}$ = sample mean
n = number of elements in the sample (sample size)
A sample of five executives received the following salaries (Birr in thousands): 14.0, 15.0, 17.0,
16.0, and 15.0, find the mean salary.
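A quick way to check the answer to this example is to apply the sample mean formula directly. The short sketch below also sums the deviations from the mean as a sanity check; the variable names are illustrative.

```python
salaries = [14.0, 15.0, 17.0, 16.0, 15.0]   # Birr in thousands

# Sample mean: sum of the values divided by the number of values.
sample_mean = sum(salaries) / len(salaries)

# The deviations of the values from their mean should sum to zero
# (up to floating-point rounding).
deviation_sum = sum(x - sample_mean for x in salaries)

print(sample_mean)   # 15.4, i.e. a mean salary of 15,400 Birr
```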
All the values in the data set should be included in computing the mean.
A set of data has a unique mean.
Every set of quantitative data has a mean.
The mean is affected by very large or very small data values, called outliers, and may not be the
appropriate average to use in these situations.
We cannot determine a mean for open-ended data.
The arithmetic mean is the only measure of central tendency where the sum of the
deviations of each value from the mean is zero, i.e.
$\sum (x - \bar{x}) = 0$, where x is a value in the data set and $\bar{x}$ is the sample mean. This follows from
$\sum (x - \bar{x}) = \sum x - n\bar{x} = \sum x - n\left(\frac{\sum x}{n}\right) = \sum x - \sum x = 0$
If two data sets with different sample sizes, $n_1$ and $n_2$, and different sample
arithmetic means, $\bar{x}_1$ and $\bar{x}_2$ respectively, are combined for some purpose, then the
combined (grand) mean, which is the same as the weighted mean, will be:
$\bar{x}_c = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$
Example:
1) The mean ages of 12 men and 10 women are 45 and 42 respectively. What is the combined
mean age?
Solution: $\bar{x}_c = \frac{12 \times 45 + 10 \times 42}{12 + 10} = \frac{960}{22} \approx 43.6$
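The combined mean formula extends naturally to any number of groups. The function below is a minimal sketch; the name `combined_mean` is hypothetical.

```python
def combined_mean(sizes, means):
    """Combined (grand) mean of several groups, given their sizes and means."""
    total = sum(n * m for n, m in zip(sizes, means))   # n1*x1 + n2*x2 + ...
    return total / sum(sizes)                          # divided by n1 + n2 + ...

# 12 men with mean age 45 and 10 women with mean age 42
print(round(combined_mean([12, 10], [45, 42]), 1))   # 43.6
```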
The arithmetic mean is affected by both change of origin and change of scale. That is,
Given a mean for data values, if we add a constant number c to, or subtract it from, all
data values, the new mean will be the old mean plus or minus c (change of
origin).
Given a mean for data values, if we multiply all data values by a constant
number c, then the new mean will be c times the old one (change of scale). We
shall see these changes with a numerical example.
Example: The mean life of a certain brand of bulbs is 1030 hours.
a) If a new process adds 50 hours to the life of each bulb, what will be the new mean life?
(Ans. 1080 hours, due to change of origin.)
b) If a recently developed method of production doubles the life of each bulb,
what will happen to the mean life? (Ans. 2060 hours, due to change of scale.)
For grouped data, the arithmetic mean is computed from the class marks and the class frequencies:
$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$
where: $f_i$ = frequency of the i-th class
$X_i$ = class mark of the i-th class
k = number of classes
Example: Compute the arithmetic mean for the following grouped data:
b. Weighted mean:
It is a special case of the arithmetic mean. It is the mean of data values that have been
weighted according to their relative importance. The weighted mean of a set of values $X_1, X_2, \dots, X_n$
with corresponding weights $w_1, w_2, \dots, w_n$ is computed from the following formula:
$\mu_w = \frac{\sum w_i X_i}{\sum w_i}$ or $\bar{X}_w = \frac{\sum w_i x_i}{\sum w_i}$
where $\mu_w$ is the population weighted mean and $\bar{X}_w$ is the sample weighted mean.
Often each weight represents the number of items in the data set having a particular value. The
best example of a weighted mean is the grade point average (GPA) of a student who
takes different courses with different credit hours.
Examples:
$\bar{X}_w = \frac{4 \times 3 + 3 \times 2 + 3 \times 4 + 2 \times 1}{4 + 3 + 3 + 2} = \frac{32}{12} \approx 2.67$
Dear distance learners! Using the above formula please compute weighted mean of the
following data.
The Satcon Construction Company pays its hourly employees $16.50, $19.00, or $25.00 per
hour. There are 26 hourly employees, 14 of which are paid at the $16.50 rate, 10 at the $19.00
rate, and 2 at the $25.00 rate. What is the mean hourly rate paid the 26 employees?
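One way to check your answer is to code the weighted mean formula directly, using the number of employees at each rate as the weights. The function name `weighted_mean` is illustrative, not part of any library.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

hourly_rates = [16.50, 19.00, 25.00]   # dollars per hour
employees = [14, 10, 2]                # weights: employees paid at each rate

mean_rate = weighted_mean(hourly_rates, employees)
print(round(mean_rate, 2))   # mean hourly rate in dollars
```

Note that the simple average of the three rates would ignore how many employees are paid at each rate, which is why the weights matter here.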
c. Geometric mean
The geometric mean is widely applied in business and economics for finding average percentage
changes in sales, revenues/returns, profits, GDP, growth rates, etc. As with the other types
of mean, we can find the geometric mean for both ungrouped and grouped data sets.
The geometric mean (GM) of n positive numbers is defined as the nth root of their product. The
formula is: $GM = \sqrt[n]{X_1 \times X_2 \times X_3 \times \dots \times X_n} = \sqrt[n]{\prod x_i}$, where $\prod$ denotes multiplication. Now we will see
the application of the geometric mean in finding percentage changes in sales, revenues/returns,
profits, GDP, growth rates, etc. with the help of some numerical examples.
Examples
The interest rates on three bonds were 5, 21, and 4 percent. The average interest rate is:
$GM = \sqrt[3]{5 \times 21 \times 4} \approx 7.49$
The returns on investment earned by a company for four successive years were 30%,
20%, -40% and 200%. What is the geometric rate of return on investment?
Solution: A 30% return means an additional gain on what we have (i.e. on 100%),
so a 30% return is expressed as 1.3; -40% implies a reduction (1 - 0.4 = 0.6).
$GM = \sqrt[4]{1.3 \times 1.2 \times 0.6 \times 3.0} \approx 1.294$. The geometric mean rate of return is therefore 1.294 - 1 = 0.294 = 29.4%.
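The calculation above can be sketched as a small function that converts each return into a growth factor, takes the nth root of the product and subtracts 1. The name `geometric_mean_return` is hypothetical; `math.prod` requires Python 3.8 or later.

```python
import math

def geometric_mean_return(returns):
    """Average rate of return per period, with returns given as decimals (0.30 = 30%)."""
    factors = [1 + r for r in returns]                  # 30% -> 1.3, -40% -> 0.6
    return math.prod(factors) ** (1 / len(factors)) - 1

rate = geometric_mean_return([0.30, 0.20, -0.40, 2.00])
print(f"{rate:.1%}")   # 29.4%
```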
Dear distance learners! As we discussed above, the other use of the geometric mean is to
determine the percentage increase in sales, production, population or other business or economic
series from one time period to another, which can be computed using the formula below.
$GM = \sqrt[n]{\frac{\text{value at end of period}}{\text{value at beginning of period}}} - 1$, where n = number of time periods (time gap)
Example:
1. The sales of soap for a soap factory increased from 755,000 tons in 1992 to 835,000 tons in
2000. What is the average annual rate of increase?
$GM = \sqrt[8]{\frac{835,000}{755,000}} - 1 \approx 0.0127 = 1.27\%$
2. The population of Ethiopia increased from 53,000,000 in 1980 to 73,000,000 in 2000.
What is the average annual increase? $GM = \sqrt[20]{\frac{73,000,000}{53,000,000}} - 1 \approx 0.016 = 1.6\%$
3. The price of a certain commodity in 1970 was 1.06 times that of 1969, in 1971 it was
1.04 times that of 1970. In the next two years it was 1.10 and 1.23 times that of the
respective preceding years. What is the average annual percentage increase in the given
period?
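The growth-rate formula used in examples 1 and 2 can be written as a one-line function. This is a sketch only; the name `average_growth_rate` is an assumption made for illustration.

```python
def average_growth_rate(begin_value, end_value, periods):
    """Average rate of increase per period between a beginning and an ending value."""
    return (end_value / begin_value) ** (1 / periods) - 1

soap = average_growth_rate(755_000, 835_000, 2000 - 1992)
population = average_growth_rate(53_000_000, 73_000_000, 2000 - 1980)

print(f"{soap:.2%}")        # 1.27%
print(f"{population:.1%}")  # 1.6%
```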
For grouped data, the geometric mean is:
$GM = \sqrt[n]{x_1^{f_1} \times x_2^{f_2} \times \dots \times x_m^{f_m}}$
where $f_i$ is the frequency of the i-th class, $x_i$ is its class mark, m is the number of classes and
$n = \sum f_i$ is the total number of observations.
The table below shows the percentage increase in salary of 16 employees of a company. Given
the grouped data, find the geometric mean.
Percentage increase    Frequency (f)    Class mark (X)
0-4                    5                2
5-9                    6                7
10-14                  3                12
15-19                  2                17
If n is a large number, computing the nth root of the product is tedious work. To simplify
the computation of the GM, we make use of logarithms:
$\log GM = \frac{1}{n} \sum_{i=1}^{n} \log x_i$, so that $GM = \text{antilog}\left[\frac{\sum_{i=1}^{n} \log x_i}{n}\right]$
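As an illustration, the logarithm method can be applied to the grouped salary data above. For grouped data each logarithm is weighted by its class frequency, i.e. $\log GM = \frac{1}{n}\sum f_i \log x_i$; the sketch below assumes base-10 logarithms, and the variable names are illustrative.

```python
import math

class_marks = [2, 7, 12, 17]   # class marks of the percentage-increase classes
frequencies = [5, 6, 3, 2]     # number of employees in each class

n = sum(frequencies)           # total number of observations (16)

# log GM = (sum of f_i * log x_i) / n
log_gm = sum(f * math.log10(x) for x, f in zip(class_marks, frequencies)) / n

# GM = antilog of log GM
gm = 10 ** log_gm
print(round(gm, 2))   # geometric mean percentage increase, about 5.85
```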
d. Harmonic Mean
The harmonic mean is mostly used to compute averages of rates, such as prices, speeds and
others. It is used in such cases when units are in harmony.
Harmonic mean for ungrouped data: The harmonic mean of n positive observations is defined as
the number of values divided by the sum of the reciprocals of the values. That is,
$HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$
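Example: Suppose a person drove 100 km at 40 km/hr and returned driving at 50 km/hr. What is
the average speed?
Solution: Since Speed = Distance / Time, the outward trip takes 100/40 = 2.5 hours and the
return trip takes 100/50 = 2 hours, 4.5 hours in total.
Arithmetic mean (weighted by time) = $\frac{2.5 \times 40 + 2 \times 50}{4.5} \approx 44.44$ km/hr
$HM = \frac{2}{\frac{1}{40} + \frac{1}{50}} \approx 44.44$ km/hr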
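NB. Here, we don't use the simple arithmetic mean to find the average speed because the person
travelled equal distances at different speeds on the two trips. If, however, he had travelled for
equal times, the simple arithmetic mean would have given the correct average. If we want to use the
arithmetic mean, we have to take the weights (the travel times) into account.
Harmonic mean for grouped data:
$HM = \frac{n}{\frac{f_1}{x_1} + \frac{f_2}{x_2} + \dots + \frac{f_k}{x_k}} = \frac{n}{\sum_{i=1}^{k} \frac{f_i}{x_i}}$
where $x_i$ = class mark and $f_i$ = frequency of the i-th class, k = number of classes and $n = \sum f_i$.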
For a set of data containing n positive observations, the following relationship always
holds: HM ≤ GM ≤ AM
The three means are equal only when all values in the set of data are equal.
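The ordering of the three means can be checked numerically with Python's standard `statistics` module (`geometric_mean` is available from Python 3.8). The sketch below reuses the two speeds from the driving example and compares the harmonic mean with total distance divided by total time.

```python
from statistics import mean, geometric_mean, harmonic_mean

speeds = [40, 50]                      # km/hr on the outward and return trips

am = mean(speeds)                      # arithmetic mean: 45
gm = geometric_mean(speeds)            # geometric mean: about 44.72
hm = harmonic_mean(speeds)             # harmonic mean: about 44.44

# Average speed from first principles: total distance / total time.
total_distance = 100 + 100             # km
total_time = 100 / 40 + 100 / 50       # hours
average_speed = total_distance / total_time

print(round(hm, 2), round(gm, 2), am)  # 44.44 44.72 45
```

Because the two trips cover equal distances, the harmonic mean coincides with the true average speed, while the simple arithmetic mean overstates it.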
Dear distance learner, have you heard about the median? What do you understand by the median? Try
to answer below.
The other measure of central tendency is the median. The median is the value of a variable which
divides an array of items into two equal parts, in such a manner that the number of items below
it is equal to the number of items above it. As we did for the mean, we can compute the median for
both ungrouped and grouped data. Let us start the discussion by computing the median for
ungrouped data.
If the number of observations is odd, the median is the middle, or $\left(\frac{n+1}{2}\right)^{th}$, observation, and
when the number of observations is even, the median is the arithmetic mean of the two middle values, i.e.
$\frac{\left(\frac{n}{2}\right)^{th} \text{observation} + \left(\frac{n}{2} + 1\right)^{th} \text{observation}}{2}$.
Example: 1. The ages for a sample of five college students are given below. Find the
median.
19, 20, 21, 22, 25. The number of observations is odd, so the median is the middle value, i.e. the
$\left(\frac{n+1}{2}\right)^{th} = \left(\frac{5+1}{2}\right)^{th}$ = 3rd observation. Thus the median is 21.
Dear distance learners! Based on the data set given below, compute the median.
2. The following data set shows the average life expectancy of four countries: 76, 73, 80,
75. Compute the median.
Arranged in order: 73, 75, 76, 80. The number of observations is even, so the median is the arithmetic mean of
the two middle values:
$\frac{\left(\frac{4}{2}\right)^{th} \text{observation} + \left(\frac{4}{2} + 1\right)^{th} \text{observation}}{2} = \frac{2^{nd} \text{observation} + 3^{rd} \text{observation}}{2} = \frac{75 + 76}{2} = 75.5$
Thus the median is 75.5.
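The odd and even cases above can be combined in one small function. This is a minimal sketch (the standard library's `statistics.median` does the same job); the function name is illustrative.

```python
def median(values):
    """Median of ungrouped data: middle value, or mean of the two middle values."""
    ordered = sorted(values)            # the data must be arranged in order first
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]          # odd n: the ((n + 1) / 2)th observation
    # even n: mean of the (n / 2)th and (n / 2 + 1)th observations
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([19, 20, 21, 22, 25]))   # 21
print(median([76, 73, 80, 75]))       # 75.5
```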
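For grouped data, the median is computed as:
$\text{Median} = L_{md} + \left(\frac{\frac{n}{2} - cf}{f}\right) \times i$
where $L_{md}$ is the lower class boundary of the median class, n is the total number of observations, cf is the
cumulative frequency of the class preceding the median class, f is the frequency of the median class
and i is the class width.
Example: The following grouped data set shows the average time taken to travel to work.
Based on the grouped data, find the median.
Solution:
Steps: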
Dear distance learners! Based on the grouped data set given below, compute the median
Class Limit Frequency Cumulative Frequency
30-40 2 2
40-50 18 20
50-60 24 44
60-70 20 64
70-80 8 72
80-90 3 75
Properties of Median
It can be calculated for an open ended frequency distribution if the median class doesn't lie
in an open ended class.
The mode is the value of the observation that appears most frequently; in the case of ungrouped
data, it is the value that has the highest frequency in the data set. For grouped data, the modal class
is the class with the highest frequency.
Mode (MO) for ungrouped data: An ungrouped data set may have no
mode at all, or it can be uni-modal, bimodal or multimodal. A data set has no
mode when all values occur equally often. A data set with one
mode is called uni-modal; one with two modes is called bimodal, and one with more than two
modes is called multimodal. Let us see all of these with examples.
The approximate modal value of grouped data is calculated by the following formula:
$\text{Mode} = L_0 + \left(\frac{f - f_1}{(f - f_1) + (f - f_2)}\right) \times i = L_0 + \left(\frac{f - f_1}{2f - f_1 - f_2}\right) \times i$
Where:
$L_0$ = lower class boundary of the modal class (i.e., the class with the highest frequency)
f = frequency of the modal class
$f_1$ = frequency of the class immediately preceding the modal class
$f_2$ = frequency of the class immediately following the modal class
i = class interval/width
Suppose the following frequency distribution table shows the daily savings of 100 members of
Habru saving and credit institution. Based on the data, find the mode.
Steps:
a. Find the modal class: the 3rd class has the highest frequency.
b. Find the frequency of the modal class and the frequencies of the classes preceding and
following the modal class.
c. Find the class width, 5 in this case.
Then $\text{Mode} = 19.5 + \left(\frac{40 - 23}{(40 - 23) + (40 - 17)}\right) \times 5 = 19.5 + \left(\frac{40 - 23}{80 - 23 - 17}\right) \times 5 = 21.625$
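The grouped mode formula is easy to code once the modal class has been identified. The sketch below uses the figures from the worked example; the function and parameter names are illustrative.

```python
def grouped_mode(lower_boundary, f, f_prev, f_next, width):
    """Approximate mode of grouped data.

    lower_boundary: lower class boundary of the modal class
    f: frequency of the modal class
    f_prev, f_next: frequencies of the classes just before and just after it
    width: class width
    """
    d1 = f - f_prev
    d2 = f - f_next
    return lower_boundary + d1 / (d1 + d2) * width

print(grouped_mode(19.5, 40, 23, 17, 5))   # approximately 21.625
```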
Dear distance learners! Based on the grouped data set given below, compute the mode by
yourself.
Properties of mode
4.2 Distribution, Shape and Measures of Central Tendency
Dear distance learners! The relative values of the mean, median and mode depend very much
on the shape of the distribution of the data they are describing. The distribution
of data can be either symmetric or skewed, depending on how the data are distributed around
the center.
Symmetric (normal, bell-shaped) distribution: occurs when the data values are evenly
distributed around the center. In a symmetrical distribution, the left and right sides of the
distribution are mirror images of each other, and the values of the mean, median and mode
are equal.
Skewed distribution: occurs when the data values are not evenly distributed around the
center. Skewness refers to the tendency of the distribution to “tail off” to the right or left. It is
simply a lack of symmetry in a distribution.
Right (positively) skewed distribution: The mean is greater than the median, which in turn
is greater than the mode. In such distributions, the median tends to be a better measure of
central tendency than the mean. In a positively skewed distribution (when the majority of the
data values fall to the left of the mean and cluster at the lower end of the distribution, with the
tail extending to the right) the arithmetic mean is the largest of the three measures, as the mean is
influenced by a few extremely high values more than the median or mode. Mode < Median < Mean
Left (negatively) skewed distribution: The mean is less than the median, which in turn is
less than the mode (Mean < Median < Mode). As with the positively skewed distribution, the
median is less influenced by extreme values and tends to be a better measure of central
tendency than the mean. If most of the observations lie to the right of the mean and the tail
extends to the left, then the distribution is negatively skewed, or skewed to the left. Thus,
for both positively and negatively skewed distributions, the median is a better
measure of central tendency.
The figures below show the relative values of the mean, median and mode according to
the distribution of the data they describe.
Fig 4.1 The Relative Positions of the Mean, Median and the Mode
4.3 Positional Measures
Dear distance learners! Previously we discussed measures of central tendency,
which show the location in a distribution around which most of its values are concentrated.
Now we will discuss another kind of measure, one that determines the position of a single
value in relation to the other values in a sample or population data set. We call it a measure of
position. Positional measures that divide the data into a number of equal parts are called quantiles (fractiles).
There are many measures of position; however, only quartiles, deciles and percentiles are discussed
in this section. To obtain such measures, the data should first be ordered by
magnitude.
Quartiles
Quartiles are the summary measures that divide a ranked data set into four equal parts. Three
measures divide any data set into four equal parts: the first
quartile (denoted by Q1), the second quartile (denoted by Q2) and the third quartile (denoted by
Q3). The data should be ranked in increasing order before the quartiles are determined. The
upper quartile, Q3, is the value below which 75% of the observations fall, with the remaining
25% above it. The lower quartile, Q1, is the reverse: 25% of the observations fall below it and
75% above it. The second quartile is the same as the median of the data set: 50% of the
observations fall below it and the remaining 50% above it. Let us see how to compute quartiles for
ungrouped and grouped data.
Quartiles for ungrouped data
The jth quartile, denoted by $Q_j$ where j = 1, 2, 3, for ungrouped data is defined as
$Q_j = \left(\frac{j(n+1)}{4}\right)^{th}$ observation
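Example: Find the quartiles (Q1, Q2 and Q3) of the following data:
47 28 39 51 33 37 59 24 33
Solution: Arrange the data first: 24, 28, 33, 33, 37, 39, 47, 51, 59
$Q_1 = \left(\frac{1(9+1)}{4}\right)^{th}$ item = 2.5th item = 2nd item + 0.5 (3rd item − 2nd item) = 28 + 0.5 × (33 − 28) = 30.5
$Q_2 = \left(\frac{2(9+1)}{4}\right)^{th}$ item = 5th item = 37
$Q_3 = \left(\frac{3(9+1)}{4}\right)^{th}$ item = 7.5th item = 7th item + 0.5 (8th item − 7th item) = 47 + 0.5 × (51 − 47) = 49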
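The position rule used in this example, including the interpolation between neighbouring items, can be sketched as a general function. The name `positional_measure` is hypothetical, and clamping positions that fall outside the data to the end values is an assumption made here for safety, not something stated in the text.

```python
def positional_measure(values, j, parts):
    """Value at the (j * (n + 1) / parts)-th position of the ordered data."""
    ordered = sorted(values)              # rank the data in increasing order
    n = len(ordered)
    position = j * (n + 1) / parts        # 1-based position, may be fractional
    whole = int(position)                 # the item just below the position
    fraction = position - whole
    # Assumption: positions outside 1..n are clamped to the end values.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    below, above = ordered[whole - 1], ordered[whole]
    return below + fraction * (above - below)

data = [47, 28, 39, 51, 33, 37, 59, 24, 33]
quartiles = [positional_measure(data, j, 4) for j in (1, 2, 3)]
print(quartiles)   # [30.5, 37.0, 49.0]
```

Calling the same function with `parts=10` or `parts=100` applies the same position rule to deciles and percentiles.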
Dear distance learners! Using the same procedure, try to find the three quartiles of the
following data.
Dear distance learners! The jth quartile, denoted by $Q_j$ where j = 1, 2, 3, for grouped data is
computed by the following formula:
$Q_j = L_j + \left(\frac{\frac{j \times n}{4} - cf}{f_j}\right) \times w$
where
$L_j$ = lower class boundary of the jth quartile class (the class which contains the $\left(\frac{j \times n}{4}\right)^{th}$ item)
w = class width
$f_j$ = frequency of the jth quartile class
cf = the cumulative frequency of the class preceding the jth quartile class
Let us compute the three quartiles for the following distribution table.
Class Fi Cf
1 – 10 8 8
11 – 20 14 22
21 – 30 12 34
31 – 40 9 43
41 – 50 7 50
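$$Q_1 = L_1 + \left(\frac{\frac{1 \times n}{4} - cf}{f_1}\right) w$$
The $\left(\frac{n}{4}\right)^{\text{th}}$ item = $\left(\frac{50}{4}\right)^{\text{th}}$ item = 12.5th item is Q1. It falls in the 2nd class, so 11 – 20 is the first quartile class.
$$Q_1 = 10.5 + \left(\frac{\frac{1 \times 50}{4} - 8}{14}\right) \times 10 = 10.5 + 3.214 = 13.714$$
Q2 = ?
The $\left(\frac{2n}{4}\right)^{\text{th}}$ item = $\left(\frac{100}{4}\right)^{\text{th}}$ item = 25th item is Q2. It falls in the 3rd class, so 21 – 30 is the second quartile class.
$$Q_2 = 20.5 + \left(\frac{\frac{2 \times 50}{4} - 22}{12}\right) \times 10 = 20.5 + 2.5 = 23 = \text{median}$$
The $\left(\frac{3n}{4}\right)^{\text{th}}$ item = $\left(\frac{150}{4}\right)^{\text{th}}$ item = 37.5th item is Q3. It falls in the 4th class, so 31 – 40 is the third quartile class.
$$Q_3 = 30.5 + \left(\frac{\frac{3 \times 50}{4} - 34}{9}\right) \times 10 = 30.5 + 3.89 = 34.39$$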
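The grouped-data calculation can be checked with a short Python sketch. The function name `grouped_quantile` and the list of lower class boundaries are assumptions introduced for illustration; the frequencies are those of the table above.

```python
def grouped_quantile(lower_boundaries, freqs, width, j, parts=4):
    """Interpolated quantile L + ((j*n/parts - cf) / f) * w for grouped data."""
    n = sum(freqs)
    target = j * n / parts            # position of the required item
    cf = 0                            # cumulative frequency of preceding classes
    for lb, f in zip(lower_boundaries, freqs):
        if cf + f >= target:          # this class contains the target item
            return lb + (target - cf) / f * width
        cf += f
    raise ValueError("target position exceeds total frequency")


# Classes 1-10, 11-20, 21-30, 31-40, 41-50 (lower boundaries assumed at x.5)
lower_boundaries = [0.5, 10.5, 20.5, 30.5, 40.5]
freqs = [8, 14, 12, 9, 7]
q1, q2, q3 = (grouped_quantile(lower_boundaries, freqs, 10, j) for j in (1, 2, 3))
print(round(q1, 3), round(q2, 3), round(q3, 2))  # 13.714 23.0 34.39
```

The same function gives deciles and percentiles when `parts` is set to 10 or 100.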
Dear distance learners! Using the same procedure, try to find the three quartiles of the
following data.
Class Frequency
6-10 1
11-15 2
16-20 3
21-25 5
26-30 4
31-35 3
36-40 2
Deciles
Dear distance learners! Judging from the name, what do you think deciles are? Try to
answer below.
________________________________________________________________________
Deciles are measures that divide a distribution/data set into ten equal parts. We can compute
deciles for ungrouped and grouped data. Let us first see how to compute deciles for
ungrouped data.
The jth decile for a simple frequency distribution (ungrouped data), denoted as Dj, where j = 1, 2,
3, ..., 9, is computed using this formula:
$$D_j = \left(\frac{j(n+1)}{10}\right)^{\text{th}} \text{ observation}$$
D1 gives the value where 10% of the observations lie below and 90% above it
D2 gives the value where 20% of the observations lie below and 80% above it
D3 gives the value where 30% of the observations lie below and 70% above it
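D9 gives the value where 90% of the observations lie below and 10% above it
For grouped data, the jth decile is computed as
$$D_j = L_j + \left(\frac{\frac{jn}{10} - cf}{f_j}\right) w$$
where
$L_j$ = lower class boundary of the jth decile class (the class which contains the $\left(\frac{jn}{10}\right)^{\text{th}}$ item),
$w$ = class width,
$f_j$ = frequency of the jth decile class,
$cf$ = the cumulative frequency of the class preceding the jth decile class.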
Percentiles
Dear distance learners! As we can guess from the name, percentiles divide a distribution/data
set into 100 equal parts.
The jth percentile for a simple frequency distribution (ungrouped data), denoted as Pj, where
j = 1, 2, 3, ..., 99, is defined as
$$P_j = \left(\frac{j(n+1)}{100}\right)^{\text{th}} \text{ observation}$$
P1 gives the value where 1% of the observations lie below and 99% above it
P2 gives the value where 2% of the observations lie below and 98% above it
P3 gives the value where 3% of the observations lie below and 97% above it
P99 gives the value where 99% of the observations lie below and 1% above it
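For grouped data, the jth percentile is computed as
$$P_j = L_j + \left(\frac{\frac{jn}{100} - cf}{f_j}\right) w$$
where
$L_j$ = lower class boundary of the jth percentile class (the class which contains the $\left(\frac{jn}{100}\right)^{\text{th}}$ item),
$w$ = class width,
$f_j$ = frequency of the jth percentile class,
$cf$ = the cumulative frequency of the class preceding the jth percentile class.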
Dear distance learners! As we can see from their formulas, percentiles and deciles are computed
in the same way as quartiles. The only difference is the number of equal parts into which they
divide the given data set.
Observe that:
The following frequency distribution table shows the time taken by 20 workers to go from their
home to work. Using the frequency distribution table find Q2, D5, P50 and Median.
Table 4.7 calculation of quartiles, deciles and percentile for grouped data
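$$Q_2 = 13.5 + \left(\frac{\frac{2 \times 20}{4} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5 = \text{median}$$
D5 = ?
The decile class is the class which contains the $\left(\frac{jn}{10}\right)^{\text{th}}$ item.
The $\left(\frac{5 \times 20}{10}\right)^{\text{th}}$ item = $\left(\frac{100}{10}\right)^{\text{th}}$ item = 10th item is D5. It falls in the 3rd class, so 14 – 16 is the D5 class.
$$D_5 = 13.5 + \left(\frac{\frac{5 \times 20}{10} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5 = \text{median}$$
P50 = ?
The percentile class is the class which contains the $\left(\frac{jn}{100}\right)^{\text{th}}$ item.
The $\left(\frac{50 \times 20}{100}\right)^{\text{th}}$ item = $\left(\frac{1000}{100}\right)^{\text{th}}$ item = 10th item is P50. It falls in the 3rd class, so 14 – 16 is the P50 class.
$$P_{50} = 13.5 + \left(\frac{\frac{50 \times 20}{100} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5 = \text{median}$$
Median = ?
$$MD = L_{md} + \left(\frac{\frac{n}{2} - cf}{f}\right) w$$
$$MD = 13.5 + \left(\frac{\frac{20}{2} - 6}{6}\right) \times 3 = 13.5 + 2 = 15.5$$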
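The reason Q2, D5, P50 and the median coincide is that all four target the same item position. A minimal Python sketch, using only the values that appear in the worked solution (n = 20, lower boundary 13.5, preceding cumulative frequency 6, class frequency 6, class width 3); the helper name `interpolate` is an assumption:

```python
n, lower, cf, f, w = 20, 13.5, 6, 6, 3   # values taken from the worked solution


def interpolate(position):
    """Grouped-data interpolation inside the class that holds `position`."""
    return lower + (position - cf) / f * w


results = {
    "Q2": interpolate(2 * n / 4),      # 2nd of 4 parts
    "D5": interpolate(5 * n / 10),     # 5th of 10 parts
    "P50": interpolate(50 * n / 100),  # 50th of 100 parts
    "Median": interpolate(n / 2),
}
print(results)  # every entry is 15.5
```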
Review Questions:
1. Why is the median a better measure of central tendency when the distribution of data is
skewed?
2. What are the differences between the measures of central tendency (median, mode) and the
positional measures (quartiles, deciles and percentiles)?
3. What are the differences between the weighted mean and the arithmetic mean?
4. Do you think that the mean of raw data and the mean of the same raw data grouped into a
frequency distribution are the same?
5. Prove that the sum of the deviations of each value from the mean is always zero.
6. In which distribution of data is the mode higher than the mean and the median?
A. positively skewed distribution
B. negatively skewed distribution
C. symmetric distribution
D. none
7. In a set of observations, which measure reports the middle value?
A. Quartile (Q2) B. Median C. Percentile 50 (P50) D. All E. A & B
8. The following frequency distribution table shows the monthly salary of 20 professors in
Debre Markos University, in thousand.
Class Frequency
6-10 1
11-15 2
16-20 3
21-25 5
26-30 4
31-35 3
36-40 2
9. A nation faces a rate of unemployment of 2% in 1990, 5% in 1992, and 12.5% in 1993.
Find the geometric mean of the unemployment rates?
10. A household purchased Birr 600 worth teff for consumption in three equal purchases of
Birr 200 each over a three months period. The first pack of teff was Birr 2.95/kg, the
second Birr 3.10/kg and the third Birr 3.25/kg. What was the average price per kg paid
for all the teff?
11. The mean age of all students in a class of 50 students is 17 years. If the mean age of 30 of
them is 18 years, find the mean age of the remaining 20 students.
12. For a sample of 50 stocks traded yesterday on the Ethiopian Stock Exchange, 10 showed
a decline of $1.00, 15 showed no change, and 25 increased by $2.00. Find the weighted
mean.
13. Suppose you receive a 5 percent increase in salary this year and a 15 percent increase
next year. The average annual percent increase is 9.886, not 10.0. Why is this so?
CHAPTER FIVE: MEASURES OF DISPERSION
Chapter Introduction
Dear distance learners! The measures of central tendency discussed in the previous chapter
summarize a series or data distribution in a single representative value. This is useful in many
respects, but it fails to account for the general distribution pattern of the data. Thus any
conclusion based only on central tendency may be misleading.
Measures of dispersion (variation or spread) describe the amount of spread or scatter in a
distribution. They measure the variability in the values of the observations in the set. If all
values are the same, the dispersion is zero. If the values are homogeneous and close to each other,
the dispersion is small. If the values are very different, the dispersion is large.
Chapter Objectives
Compute and interpret the quartile deviation, the mean deviation, the variance and the
standard deviation of ungrouped and grouped data.
Explain the characteristics, uses, advantages and disadvantages of each measure of
dispersion.
Compute and interpret the inter quartile range and its relative measure.
Compute and interpret the relative measures of dispersion
Compute and interpret the Z-score
Understand and measure Moments, Skewness and Kurtosis.
terms of the degree of dispersion. For instance, the average income in a community is not an
adequate indicator of the well- being of the community since it doesn’t show us the inequality
among the residents. But, the measure of dispersion can show us this inequality. Therefore, it is
useful to have a measure of dispersion to observe variability of data.
Measures of dispersion fall into two categories:
a. Measures of absolute dispersion: these show the actual amount of variation of the items
from a measure of central tendency, in the units of the data. They include: range, quartile
deviation, mean deviation, standard deviation and variance.
b. Measures of relative dispersion: each is a quotient obtained by dividing an absolute
measure by the quantity with respect to which the absolute deviation has been computed. They
are unitless and are used to compare variability between different sets of data. Examples:
coefficient of quartile deviation, coefficient of mean deviation and coefficient of
variation.
Dear distance learners! Do you know the qualities of a good measure of dispersion?
As stated so far, when these measures express the magnitude of dispersion in the same unit
of measurement in which the data are recorded, they are known as measures of absolute
dispersion. However, when dispersion is expressed in percentages or ratios, these measures are
called measures of relative dispersion.
5.1.1. Range
Range is defined as the difference between the smallest and the largest observations in a given
set of raw data. Obtaining range from raw data thus requires identifying only these two extreme
values, and taking the difference between them
Properties of range
Only two values are used in its calculation
It is influenced by an extreme value.
It is easy to compute and understand.
It is the crudest measure of dispersion.
It cannot be determined for an open ended data.
The greater the range, the higher the variability of the data and vice versa.
Example 5.1: Consider the following data on the expenditures of two groups of workers:
Solution:
For Group A: For Group B:
Range = highest value – lowest value Range = highest value – lowest value
Minimum value = 6 marks and Range = Highest value – lowest value = 30 – 6 = 24
In case of continuous grouped data, range can be obtained in the following three ways:
i) Range is found by taking the difference between the upper class limit of the last class and
the lower limit of the first class. This is because the lowest and the highest observations are
not identifiable in the case of continuous grouped data. That is,
Range = UCLL – LCLF
Where UCLL = Upper class limit of the last class
LCLF = Lower class limit of the first class
ii) Range is found by taking the difference between the upper class boundary of the last class
and the lower class boundary of the first class. That is,
Range = UCBL – LCBF
Where UCBL = Upper class boundary of the last class
LCBF = Lower class boundary of the first class.
iii) Range is found by taking the difference between the mid-points of the first and the last class.
This yields a result closer to the actual range, as it reduces the margin of error incurred
when the range is computed by the first and the second methods.
Example 5.3:– Compute the range of the data given below in table 5.2.
Table 5.2 Results (out of 35%) of 40 students in Econometrics test
Score (35%) Class Boundary Number of Students (Fi)
6 – 10 5.5 – 10.5 5
11 – 15 10.5 – 15.5 10
16 – 20 15.5 – 20.5 15
21 – 25 20.5 – 25.5 7
26 – 30 25.5 – 30.5 3
Solution
Range = UCBL – LCBF = 30.5 – 5.5 = 25, or
Range = UCLL – LCLF = 30 – 6 = 24.
It can also be computed as the difference between the mid-point of the last class and the
mid-point of the first class. That is, Range = 28 – 8 = 20.
It may have been noted that range is measured in absolute form in the above discussion.
This implies that such a measure cannot be used for comparing variabilities expressed in different
units. Therefore, there is a need for a measure of relative dispersion/variation. The relative
range or coefficient of range is defined as:
$$\text{Coefficient of range} = \frac{\text{Range}}{\text{Sum of extreme values}} \times 100\% = \frac{\text{Highest value} - \text{Lowest value}}{\text{Highest value} + \text{Lowest value}} \times 100\%$$
for raw data and discrete grouped data, and
$$\text{Coefficient of range} = \frac{UCB_L - LCB_F}{UCB_L + LCB_F} \times 100\%$$
for continuous grouped data.
Example 5.4: Compute the coefficient of range for the following raw data.
2, 4, 6, 8, 16, 18, 20
Solution:-
$$\text{Coefficient of range} = \frac{20 - 2}{20 + 2} \times 100\% = \frac{18}{22} \times 100\% = 81.8\%$$
Example 5.6: Find the coefficient of range (relative range) for the data given in table 5.2.
$$\text{Coefficient of range} = \frac{30.5 - 5.5}{30.5 + 5.5} \times 100\% = \frac{25}{36} \times 100\% = 69.4\%$$
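Both examples follow the same arithmetic, which the following Python sketch reproduces. The function name `coefficient_of_range` is introduced here for illustration only.

```python
def coefficient_of_range(high, low):
    """Relative range in percent: (high - low) / (high + low) * 100."""
    return (high - low) / (high + low) * 100


raw = [2, 4, 6, 8, 16, 18, 20]
cr_raw = coefficient_of_range(max(raw), min(raw))   # Example 5.4, raw data
cr_grouped = coefficient_of_range(30.5, 5.5)        # Example 5.6, class boundaries of table 5.2
print(round(cr_raw, 1), round(cr_grouped, 1))       # 81.8 69.4
```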
Besides being simple to compute and understand, range is as good a measure of dispersion as
any other where the data consist of a few observations, and it is advantageous when one wants to
know only the extent of the extreme dispersion under “ordinary” conditions. However, its major
drawbacks are that (i) it tells us nothing about the dispersion of the values which fall between the
two extremes, (ii) it is highly sensitive to sample size, and (iii) it is highly affected if the values
of the two extremes change.
5.1.2. Quartile Deviations
Quartiles are the values which divide the array into four equal parts. Q1 gives the value of the
item which is 1/4th of the way up the distribution, Q2 gives the value of the item which is half of
the way and Q3 is the value of the item 3/4th of the way up the distribution.
Inter-quartile range is the difference between Q3 and Q1, i.e., inter-quartile range = Q3 – Q1.
Quartile deviation, denoted as $Q_D$, is defined as $Q_D = \frac{Q_3 - Q_1}{2}$. Quartile deviation is also called
semi-quartile range.
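Scores (35%)   Class boundary   Frequency (fi)   Less-than cumulative frequency
6 – 10         5.5 – 10.5       5                5
11 – 15        10.5 – 15.5      10               15 (Q1 class, as n/4 = 10th value)
16 – 20        15.5 – 20.5      15               30 (Q3 class, as 3n/4 = 30th value)
21 – 25        20.5 – 25.5      7                37
26 – 30        25.5 – 30.5      3                40
Total                           40
Solution:
The ith quartile is computed as
$$Q_i = L_{Q_i} + \left(\frac{\frac{in}{4} - CF_P}{F_{Q_i}}\right) \times CW_{Q_i}$$
where $L_{Q_i}$ is the lower class boundary of the ith quartile class, $CF_P$ is the cumulative frequency of the preceding class, $F_{Q_i}$ is the frequency of the quartile class and $CW_{Q_i}$ is its class width.
$$Q_1 = 10.5 + \left(\frac{\frac{1 \times 40}{4} - 5}{10}\right) \times 5 = 10.5 + 2.5 = 13$$
$$Q_3 = 15.5 + \left(\frac{\frac{3 \times 40}{4} - 15}{15}\right) \times 5 = 20.5$$
$$\text{Quartile deviation (semi-quartile range)} = \frac{Q_3 - Q_1}{2} = \frac{20.5 - 13}{2} = 3.75$$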
Solution
Given: Q3 = 20.5 and Q1 = 13
$$\text{Coefficient of } Q_D = \frac{Q_3 - Q_1}{Q_3 + Q_1} \times 100\% = \frac{20.5 - 13}{20.5 + 13} \times 100\% = \frac{7.5}{33.5} \times 100\% = 22.4\%$$
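The quartile deviation and its coefficient for table 5.2 can be verified with the following Python sketch. The helper `grouped_quartile` is a hypothetical name; it simply applies the grouped-quartile interpolation to the class boundaries and frequencies of the table.

```python
def grouped_quartile(lower_boundaries, freqs, width, i):
    """Interpolated ith quartile of a grouped frequency distribution."""
    n = sum(freqs)
    target = i * n / 4                # position of the required item
    cf = 0                            # cumulative frequency of preceding classes
    for lb, f in zip(lower_boundaries, freqs):
        if cf + f >= target:
            return lb + (target - cf) / f * width
        cf += f
    raise ValueError("target position exceeds total frequency")


# Table 5.2: class boundaries 5.5-10.5, ..., 25.5-30.5 with their frequencies
lower_boundaries = [5.5, 10.5, 15.5, 20.5, 25.5]
freqs = [5, 10, 15, 7, 3]

q1 = grouped_quartile(lower_boundaries, freqs, 5, 1)
q3 = grouped_quartile(lower_boundaries, freqs, 5, 3)
quartile_deviation = (q3 - q1) / 2
coefficient_qd = (q3 - q1) / (q3 + q1) * 100
print(q1, q3, quartile_deviation, round(coefficient_qd, 1))  # 13.0 20.5 3.75 22.4
```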
Dear distance learners! Can you differentiate the advantages and disadvantages of
quartile deviation?
Did you try? …………. Good!
Its value is very much affected by sampling fluctuations.
It doesn’t show the scatter around the average, but only a distance on scale.
The mean deviation, also called the average deviation, measures the average deviation/scatter
of a set of observations about a central value, usually the mean or the median of the
distribution. It is computed by subtracting the mean/median from each individual observation,
summing all the deviations while ignoring the negative signs, and dividing the sum by the total
number of observations. The negative signs are ignored because otherwise the sum of the deviations
from the mean, i.e. $\sum (X_i - \bar{X})$, will be zero.
The mean absolute deviation from the mean for a set of sample data consisting of n observations is
$$MD = \frac{\sum |X_i - \bar{X}|}{n}$$
for raw data, and
$$MD = \frac{\sum f_i |X_i - \bar{X}|}{\sum f_i}$$
for grouped data.
Example 5.8: The age of a sample of 10 students from a class is given below.
Find mean deviation (i) from the mean (ii) from the median
Solution: Arithmetic mean = $\frac{\sum X_i}{n} = \frac{206}{10} = 20.6$
2 2
Age Mean Absolute deviation from Mean absolute deviation
the mean from the median
16 16
Therefore,
Example 5.9: Find the mean absolute deviation from the mean and from the median for the data
given in table 5.2.
40 173.75 173.33
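$$\text{Mean} = \frac{\sum f_i X_i}{\sum f_i} = \frac{(5 \times 8) + (10 \times 13) + (15 \times 18) + (7 \times 23) + (3 \times 28)}{40} = \frac{685}{40} = 17.125$$
$$\text{Median} = L_{md} + \left(\frac{\frac{n}{2} - CF_P}{F_{md}}\right) \times CW_{md} = 15.5 + \left(\frac{\frac{40}{2} - 15}{15}\right) \times 5 = 17.167$$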
Note: The coefficients of mean deviation (relative measures) from the mean and from the median
are given as follows:
Solution:-
Thus, coefficient of $M_D$ from the mean = $\frac{4.344}{17.125} \times 100\%$ = 25.37%
Coefficient of $M_D$ from the median = $\frac{4.333}{17.167} \times 100\%$ = 25.24%
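The totals 173.75 and 173.33 and the resulting coefficients can be reproduced with a short Python sketch. The class mid-points and frequencies are those of table 5.2; the helper name `mean_deviation` is an assumption made for illustration.

```python
mids = [8, 13, 18, 23, 28]            # class mid-points of table 5.2
freqs = [5, 10, 15, 7, 3]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, mids)) / n        # 685 / 40 = 17.125
median = 15.5 + (n / 2 - 15) / 15 * 5                      # grouped median, about 17.167


def mean_deviation(center):
    """Mean absolute deviation of the grouped data about `center`."""
    return sum(f * abs(x - center) for f, x in zip(freqs, mids)) / n


md_mean = mean_deviation(mean)        # 173.75 / 40
md_median = mean_deviation(median)    # about 173.33 / 40
coef_mean = md_mean / mean * 100
coef_median = md_median / median * 100
print(round(md_mean, 3), round(md_median, 3))
print(round(coef_mean, 1), round(coef_median, 1))
```

The small difference from 25.37% arises only because the hand calculation rounds the mean deviation to 4.344 before dividing.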
Advantages of MeanDeviation
It is easier to understand and compute than the standard deviation
It is not unduly influenced by large or small values
All values are used in its calculation
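Like other measures, the variance and the standard deviation also quantify the dispersion of the
observations around the mean value.
The population variance is defined as the arithmetic mean of the squared deviations from the
population mean.
The formula for the population variance for raw data is:
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N}$$
Sample variance:
$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1}$$
where n = sample size and $\bar{X}$ = sample mean.
Expanding the square gives
$$S^2 = \frac{\sum (X_i^2 - 2\bar{X}X_i + \bar{X}^2)}{n - 1} = \frac{\sum X_i^2}{n - 1} - \frac{2\bar{X}\sum X_i}{n - 1} + \frac{n\bar{X}^2}{n - 1}$$
$$= \frac{\sum X_i^2}{n - 1} - \frac{2n\bar{X}^2}{n - 1} + \frac{n\bar{X}^2}{n - 1} = \frac{\sum X_i^2 - n\bar{X}^2}{n - 1}$$
$$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)}$$
for small sample size, and
$$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n^2}$$
for large sample size.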
Why n-1?
The reason is that, in small samples, it provides a better estimate of the variance of the
population from which the sample is drawn. However, as n increases above about 30, we can use
n instead of n-1, as the two versions give approximately the same result for practical purposes.
Example 5.11: The ages of a family (in years) are: 2, 18, 34, 42.
Solution:
Mean = $\frac{\sum X_i}{4} = \frac{96}{4} = 24$
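The population standard deviation is the square root of the population variance,
$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}}$$
and the sample standard deviation is the square root of the sample variance,
$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$
for small sample size and also
$$S = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n}}$$
for large sample size. Equivalently, for small sample size,
$$S = \sqrt{\frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)}}$$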
Example 5.12: From the sample data given below, compute the variance and standard deviation.
Solution:-
Xi     10    15    30    22    41     32      ΣXi = 150
Xi²    100   225   900   484   1681   1024    ΣXi² = 4414
n = 6
So, $$S^2 = \frac{n\sum X_i^2 - \left(\sum X_i\right)^2}{n(n - 1)} = \frac{6(4414) - (150)^2}{6(5)} = \frac{3984}{30} = 132.8$$
$$S = \sqrt{S^2} = \sqrt{132.8} = 11.52$$
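As a check on the shortcut formula, the following Python sketch computes the sample variance both by the shortcut and with the standard library's `statistics.variance`, which uses the n - 1 divisor. The variable names are chosen here for illustration.

```python
import math
import statistics

x = [10, 15, 30, 22, 41, 32]
n = len(x)

# Shortcut formula: (n * sum of squares - square of the sum) / (n * (n - 1))
s2 = (n * sum(v * v for v in x) - sum(x) ** 2) / (n * (n - 1))
s = math.sqrt(s2)

# The library function applies the definition with divisor n - 1
s2_library = statistics.variance(x)

print(round(s2, 1), round(s, 2))  # 132.8 11.52
```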
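For grouped data, the variances are:
$$\sigma^2 = \frac{\sum f_i (X_i - \mu)^2}{\sum f_i} = \frac{N\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{N^2}$$
$$S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{\sum f_i} = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n^2}$$
in which the Xi's are the class mid-points, $\sum f_i = N$ for the population and $\sum f_i = n$ for the
sample. For small sample size,
$$S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{n - 1} = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n(n - 1)}$$
By definition, standard deviations in each case are the square roots of the respective variances.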
Example 5.13: From the continuous frequency distribution given in table 5.2, compute the
sample variance and standard deviation.
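Solution:
$$S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{n - 1} = \frac{1194.375}{40 - 1} = 30.625, \qquad S = \sqrt{S^2} = \sqrt{30.625} = 5.534$$
Alternatively,
$$S^2 = \frac{n\sum f_i X_i^2 - \left(\sum f_i X_i\right)^2}{n(n - 1)} = \frac{40(12925) - (685)^2}{40(39)} = 30.625$$
$$S = \sqrt{30.625} = 5.534$$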
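Both routes to the grouped sample variance can be confirmed with a brief Python sketch, using the class mid-points and frequencies of table 5.2. The variable names are illustrative only.

```python
import math

mids = [8, 13, 18, 23, 28]            # class mid-points of table 5.2
freqs = [5, 10, 15, 7, 3]
n = sum(freqs)                         # 40

sum_fx = sum(f * x for f, x in zip(freqs, mids))        # 685
sum_fx2 = sum(f * x * x for f, x in zip(freqs, mids))   # 12925
mean = sum_fx / n

# Definition: sum of f * (X - mean)^2 divided by n - 1
s2_definition = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / (n - 1)

# Shortcut: (n * sum f X^2 - (sum f X)^2) / (n * (n - 1))
s2_shortcut = (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))

s = math.sqrt(s2_shortcut)
print(round(s2_definition, 3), round(s2_shortcut, 3), round(s, 3))  # 30.625 30.625 5.534
```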
2. A variance/standard deviation can never be a negative number.
3. If a constant is added or subtracted from each observation, the variance/standard deviation
of the resulting observations will not be affected.
4. If every observation is multiplied by a constant K, then the new variance will be K 2 times
the original variance and the new standard deviation will be K times the original standard
deviation.
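5. If there are two sets of data consisting of n1 and n2 observations with $S_1^2$ and $S_2^2$ as their
respective variances, then the variance of the combined set is
$$S_C^2 = \frac{n_1\left(S_1^2 + d_1^2\right) + n_2\left(S_2^2 + d_2^2\right)}{n_1 + n_2}$$
where $d_1^2 = (\bar{X}_1 - \bar{X}_C)^2$ and $d_2^2 = (\bar{X}_2 - \bar{X}_C)^2$.
Herein, the combined mean is $\bar{X}_C = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$.
If $\bar{X}_1 = \bar{X}_2$, then $S_C^2 = \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2}$. Further, when n1 = n2, $S_C^2 = \frac{S_1^2 + S_2^2}{2}$.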
Example 5.14: Calculate the standard deviation of the combined group of 400 items from the
following data.
                        Group A   Group B   Group C
Number of items (ni)    50        150       200
Mean ($\bar{X}_i$)      40        50        60
Variance ($S_i^2$)      81        100       121
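Solution:-
$$\bar{X}_C = \frac{n_1\bar{X}_1 + n_2\bar{X}_2 + n_3\bar{X}_3}{n_1 + n_2 + n_3} = \frac{50(40) + 150(50) + 200(60)}{50 + 150 + 200} = 53.75$$
$d_i = \bar{X}_i - \bar{X}_C$: d1 = 40 – 53.75 = –13.75, d2 = 50 – 53.75 = –3.75, d3 = 60 – 53.75 = 6.25
$$S_C^2 = \frac{n_1\left(S_1^2 + d_1^2\right) + n_2\left(S_2^2 + d_2^2\right) + n_3\left(S_3^2 + d_3^2\right)}{n_1 + n_2 + n_3}$$
$$= \frac{50\left(81 + (-13.75)^2\right) + 150\left(100 + (-3.75)^2\right) + 200\left(121 + (6.25)^2\right)}{400} = 156.56$$
$$S_C = \sqrt{156.56} = 12.512$$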
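The combined-group calculation extends naturally to any number of groups. A minimal Python sketch of Example 5.14, with variable names chosen here for illustration:

```python
import math

sizes = [50, 150, 200]       # n_i
means = [40, 50, 60]         # group means
variances = [81, 100, 121]   # group variances S_i^2

total = sum(sizes)
combined_mean = sum(n * m for n, m in zip(sizes, means)) / total

# Each group contributes n_i * (S_i^2 + d_i^2), where d_i = group mean - combined mean
combined_variance = sum(
    n * (v + (m - combined_mean) ** 2)
    for n, m, v in zip(sizes, means, variances)
) / total
combined_sd = math.sqrt(combined_variance)

print(combined_mean, combined_variance, round(combined_sd, 3))  # 53.75 156.5625 12.512
```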
A series /distribution with smaller coefficient of variation is said to be more homogenous
/uniform/ consistent than the other distribution. And a series /distribution with larger CV is said
to be more variable or more heterogeneous than the other distribution.
Example 5.15: The number of employees, the average wages and the variance of the wages for
two factories are given below.
Factory A Factory B
Number of employees 50 100
Average wages 120 85
Variance of the wages 9 16
Which factory is consistent in respect to the wages of employees?
Solution:
Factory A Factory B
Conclusion: CVA < CVB => The wages of the employees of factory A are more consistent than those
of factory B.
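The comparison in Example 5.15 can be reproduced with the sketch below. It assumes the usual definition of the coefficient of variation, CV = (standard deviation / mean) × 100%, since the defining formula is not shown in the surrounding text; the function name is also an assumption.

```python
import math


def coefficient_of_variation(mean, variance):
    """CV in percent, assuming CV = standard deviation / mean * 100."""
    return math.sqrt(variance) / mean * 100


cv_a = coefficient_of_variation(mean=120, variance=9)    # Factory A
cv_b = coefficient_of_variation(mean=85, variance=16)    # Factory B
print(round(cv_a, 2), round(cv_b, 2))  # 2.5 4.71
print("A is more consistent" if cv_a < cv_b else "B is more consistent")
```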
The Z-score is defined to indicate the number of standard deviations that an observation is below
or above the mean depending on whether the Z-score is negative or positive.
Z is called the standard value (standard score), which is given by $Z = \frac{X_i - \bar{X}}{S}$, where S is the
standard deviation.
Example 5.16: Helen scored 65 in Auditing and Samuel scored 70 in Auditing. If the average
score of the whole students in Auditing is 67 and standard deviation equal to 3, which student
performs better?
Solution: $Z_{Helen} = \frac{X_{Helen} - \bar{X}}{S} = \frac{65 - 67}{3} = -0.67$ and $Z_{Samuel} = \frac{X_{Samuel} - \bar{X}}{S} = \frac{70 - 67}{3} = 1$
Therefore, Samuel performs better in Auditing than Helen, and better than the average result of
all the students.
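A short Python sketch of the same comparison; the function name `z_score` is an illustrative choice.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_helen = z_score(65, mean=67, sd=3)
z_samuel = z_score(70, mean=67, sd=3)
print(round(z_helen, 2), round(z_samuel, 2))  # -0.67 1.0
```

A negative score places Helen below the class average, while Samuel's positive score places him one standard deviation above it.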
Dear distance learner! In a sample, 100 students doing a master program in management
were tested in a general knowledge paper carrying 100 marks. At the end of the exercise, they
were found distributed according to marks obtained as follows:
Marks obtained 30 -34 35-39 40-44 45-49 50-54 55-59 60-64
Number of students 5 8 12 20 27 20 8
Then find
a) The range of the distribution,
b) Quartile deviation,
c) Mean absolute deviation from the mean,
d) Variance and standard deviation, and
e) Coefficient of variation.
Dear learners! In this section, we will deal with two other important characteristics of a
frequency distribution. One refers to lack of symmetry in the distribution, or its departure from
being bell-shaped. The other relates to the degree of flatness or peakedness of a distribution at its
top. The former is described as skewness and the latter as kurtosis.
Moments
Moments give us information about the “shape” of the distribution.
The rth moment is represented by Mr, r = 0, 1, 2, ....
We can have moments about any constant number: about the mean, zero or any desired
value.
In general, the rth moment about any arbitrary constant number, say A, is given by
$$M_r = \frac{\sum (X_i - A)^r}{n}$$
Example 5.18: Consider the following data and compute the first four moments about five (5).
2, 2, 3, 4, 4, 5, 6, 7, 8
Solution:-
A = 5, n = 9
Xi      Xi - 5   (Xi - 5)^2   (Xi - 5)^3   (Xi - 5)^4
2       -3       9            -27          81
2       -3       9            -27          81
3       -2       4            -8           16
4       -1       1            -1           1
4       -1       1            -1           1
5       0        0            0            0
6       1        1            1            1
7       2        4            8            16
8       3        9            27           81
Total   -4       38           -28          278
$$M_r = \frac{\sum (X_i - 5)^r}{n}$$
$$M_0 = \frac{\sum (X_i - 5)^0}{9} = \frac{9}{9} = 1$$
$$M_1 = \frac{\sum (X_i - 5)^1}{9} = \frac{-4}{9}, \qquad M_2 = \frac{\sum (X_i - 5)^2}{9} = \frac{38}{9}$$
$$M_3 = \frac{\sum (X_i - 5)^3}{9} = \frac{-28}{9}, \qquad M_4 = \frac{\sum (X_i - 5)^4}{9} = \frac{278}{9}$$
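The table of powers above can be generated directly. The sketch below uses exact fractions so that the results match the hand calculation; the function name `moment_about` is an assumption made for illustration.

```python
from fractions import Fraction


def moment_about(data, a, r):
    """rth moment about the constant a: sum of (x - a)^r divided by n."""
    return Fraction(sum((x - a) ** r for x in data), len(data))


data = [2, 2, 3, 4, 4, 5, 6, 7, 8]
moments = [moment_about(data, 5, r) for r in range(5)]
print([str(m) for m in moments])  # ['1', '-4/9', '38/9', '-28/9', '278/9']
```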
Note: For grouped data the rth moment about any constant number, say A, is given as:
$$M_r = \frac{\sum f_i (X_i - A)^r}{\sum f_i}$$
where
fi => frequency of Xi in the case of discrete grouped data,
fi => frequency of the ith class in the case of continuous grouped data, and
Xi => class mark of the ith class.
Note: M0 is always equal to 1.
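Example 5.19: Find the first three moments about 4 for the data given in table 5.6
Table 5.6 Number of children in ten families
Xi   2   3   4   5
fi   3   2   3   2
Solution:-
Xi      fi   Xi - 4   fi(Xi - 4)   (Xi - 4)^2   fi(Xi - 4)^2   (Xi - 4)^3   fi(Xi - 4)^3
2       3    -2       -6           4            12             -8           -24
3       2    -1       -2           1            2              -1           -2
4       3    0        0            0            0              0            0
5       2    1        2            1            2              1            2
Total   10            -6                        16                          -24
$$M_0 = \frac{\sum f_i (X_i - 4)^0}{\sum f_i} = \frac{10}{10} = 1, \qquad M_1 = \frac{\sum f_i (X_i - 4)}{\sum f_i} = \frac{-6}{10} = -0.6$$
$$M_2 = \frac{\sum f_i (X_i - 4)^2}{\sum f_i} = \frac{16}{10} = 1.6, \qquad M_3 = \frac{\sum f_i (X_i - 4)^3}{\sum f_i} = \frac{-24}{10} = -2.4$$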
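$$M_r = \frac{\sum (X_i - \mu)^r}{N}$$
for the population with N observations and mean $\mu$.
$$M_r = \frac{\sum (X_i - \bar{X})^r}{n}$$
for sample data with sample size n and mean $\bar{X}$.
Similarly, for grouped data the central moment is defined as:
$$M_r = \frac{\sum f_i (X_i - \mu)^r}{\sum f_i} \quad \text{or} \quad M_r = \frac{\sum f_i (X_i - \bar{X})^r}{\sum f_i}$$
where fi = frequency of Xi in the case of discrete grouped data, and frequency of the ith class in the
case of continuous grouped data.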
Example 5.20: Find the first three central moments for the population data given by:- X = 2, 3, 7
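Solution
$$\mu = \frac{\sum X_i}{N} = \frac{2 + 3 + 7}{3} = \frac{12}{3} = 4$$
$$M_0 = 1, \qquad M_1 = \frac{\sum (X_i - \mu)}{N} = \frac{(2 - 4) + (3 - 4) + (7 - 4)}{3} = 0$$
$$M_2 = \frac{\sum (X_i - \mu)^2}{N} = \frac{(2 - 4)^2 + (3 - 4)^2 + (7 - 4)^2}{3} = \frac{14}{3}$$
$$M_3 = \frac{\sum (X_i - \mu)^3}{N} = \frac{(2 - 4)^3 + (3 - 4)^3 + (7 - 4)^3}{3} = \frac{18}{3} = 6$$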
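Central moments are simply moments about the mean, so they can be checked in the same way. The sketch below uses exact fractions; the function name `central_moment` is an illustrative choice.

```python
from fractions import Fraction


def central_moment(data, r):
    """rth moment about the mean, treating the data as a whole population."""
    mean = Fraction(sum(data), len(data))
    return sum((x - mean) ** r for x in data) / len(data)


x = [2, 3, 7]
central = [central_moment(x, r) for r in range(4)]
print([str(m) for m in central])  # ['1', '0', '14/3', '6']
```

The first central moment is always zero, and the second is the population variance.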
Note:
For central moments: M0 = 1, M1 = 0, and $M_2 = \frac{\sum (X_i - \mu)^2}{N} = \sigma^2$ (variance of X)
Example 5.21: Compute the first four moments about the mean for the following sample data
(discrete frequency distribution)
Xi -3 1 2 3 5
Fi 2 1 4 2 3
Solution:-
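$$\bar{X} = \frac{\sum f_i X_i}{\sum f_i} = \frac{2(-3) + 1(1) + 4(2) + 2(3) + 3(5)}{12} = \frac{24}{12} = 2$$
Xi   fi   (Xi - X̄)   fi(Xi - X̄)   (Xi - X̄)^2   fi(Xi - X̄)^2   (Xi - X̄)^3   fi(Xi - X̄)^3   (Xi - X̄)^4   fi(Xi - X̄)^4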
Skewness
Skewness refers to lack of symmetry. We study skewness to have an idea about the shape
of the curve which we can draw with the help of the frequency distribution. Frequency
distributions are often found to be skewed on either side of the central value. As a result, such a distribution has a longer
tail either to the left or to the right. When there is a longer tail to the right of the center, the
distribution is said to be positively skewed. If the tail is longer to the left of the center, the
distribution is said to be negatively skewed. A positive skewness means a greater dispersal of
individual observations towards the right of the central value. A negative skewness, on the other
hand, implies that individual observations have greater dispersal towards the left of the central
value.
Skewness, therefore, not only refers to the lack of symmetry in distribution, it also shows the
direction of dispersion of individual observations on either side of the center of the distribution.
Accordingly, a measure of skewness quantifies the extent of departure from symmetry and also
indicates the direction in which the departure takes place.
Diagrammatically, the shapes of the frequency curves are:
[Figure: three frequency curves showing a symmetrical distribution, a positively skewed distribution and a negatively skewed distribution]
a) Moment coefficient of Skewness
In terms of the moment coefficient, skewness is defined as:
$$\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{M_3}{S^3}$$
where M2 and M3 are the second and third moments about the mean, and M2 = S² = variance.
Interpretation:
(1) If $\alpha_3$ = 0 => symmetrical distribution
(2) If $\alpha_3$ < 0 => negatively skewed distribution
(3) If $\alpha_3$ > 0 => positively skewed distribution
(4) A greater or smaller absolute value of $\alpha_3$ means a greater or smaller degree of skewness.
Example 5.22: Find the skewness of the distribution given in example 5.18
Solution: = −46 9 = 39 9
In which S is the standard deviation. Using the empirical relationship among mean, mode and
median in a moderately skewed distribution, i.e., mode = mean – 3(mean – median), the above
equation can be modified as:
$$S_k = \frac{3(\text{Mean} - \text{Median})}{S}$$
Note:
1. $-3 \le S_k \le 3$
2. If $S_k$ = 0, the distribution is symmetrical
3. If $S_k$ > 0, the distribution is positively skewed
4. If $S_k$ < 0, the distribution is negatively skewed
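Example 5.23: Find the skewness of the following data using Pearson's coefficient of
skewness.
Solution:-
Arrange the data in increasing order:
1, 2, 4, 5, 6, 7, 8, 10, 30, 32
$$\text{Median} = \frac{6 + 7}{2} = 6.5, \qquad \bar{X} = \frac{\sum X_i}{n} = \frac{105}{10} = 10.5, \qquad S^2 = \frac{\sum (X_i - \bar{X})^2}{n - 1} = \frac{1116.5}{9} = 124.06$$
$$S = \sqrt{S^2} = \sqrt{124.06} = 11.14$$
$$\text{Therefore, } S_k = \frac{3(\text{Mean} - \text{Median})}{S} = \frac{3(10.5 - 6.5)}{11.14} = 1.077$$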
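The steps of Example 5.23 map directly onto Python's standard `statistics` module, whose `stdev` function uses the n - 1 divisor applied in the solution. The variable names are illustrative.

```python
import statistics

data = [1, 2, 4, 5, 6, 7, 8, 10, 30, 32]

mean = statistics.mean(data)        # 10.5
median = statistics.median(data)    # 6.5
s = statistics.stdev(data)          # sample standard deviation (divisor n - 1)

# Pearson's coefficient of skewness in its median form
skewness = 3 * (mean - median) / s
print(mean, median, round(s, 2), round(skewness, 3))  # 10.5 6.5 11.14 1.077
```

The positive value indicates a distribution with a longer tail to the right, driven here by the two large values 30 and 32.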
Kurtosis
which is neither too high in peak nor too flat at the top, as in (b), is termed a mesokurtic
distribution.
We have two measures of Kurtosis: The coefficient of Kurtosis and Moment coefficient of
Kurtosis
Interpretation:
If K = 0.5, approximately the distribution is Mesokurtic
If K > 0.5, approximately the distribution is leptokurtic
If K<0.5, approximately the distribution is platykurtic.
(ii) Moment coefficient of Kurtosis
Moment coefficient of Kurtosis is Kurtosis in terms of the fourth moment about the mean,
denoted by B2, and is defined as = = = −3
Summary Exercises
1. The mean and standard deviation of 25 observations were found to be 30 and 3 respectively.
After the calculations were made, it was found that two of the observations were recorded as
29 and 31 incorrectly. Find the mean and standard deviation if the incorrect observations are
excluded
2. A person invested his money in two areas, A and B. His net profits (in Birr) for the first three
months are:
Area A 72 76 74
Area B 45 92 85
Class Intervals 50 - 51 53 - 55 56 – 58 59 - 61 62 - 64
Frequencies 5 10 21 8 6
CHAPTER SIX: SIMPLE LINEAR REGRESSION AND CORRELATION
Chapter Objectives
After completing this chapter, students would be able to:
Dear Learners! In the preceding chapters we have been dealing with data on a single
variable. Here we shall focus on methods of dealing with paired data, which may be related in
some way.
Regression Analysis is concerned with describing and evaluating the relationship between a
dependent variable and one or more independent variables. Therefore, regression is used for
bringing out the nature of relationship and using it to know the best approximate value of the
other variable. In what follows, therefore, we will deal with the problem of estimating and/or
predicting the population mean/average values of the dependent variable on the basis of known
values of the independent variable (s).
The variable whose value is to be estimated/predicted is known as dependent variable while the
variables which help us in determining the value of the dependent variable are known as
independent variables.
A regression equation which involves only two variables, a dependent and an independent, is
referred to as simple regression. This model assumes that the dependent variable is influenced
by only one systematic variable and the error term. However, when several variables (necessarily
more than two) are included in the model, it is called multiple/multivariate regression.
The relationship between any two variables may be linear or non-linear. The former implies a
constant absolute change in the dependent variable in response to a unit changes in the
independent variable while the latter implies varying marginal change in the dependent variable
in response to changes in the independent variable.
Consequently, in this chapter we will confine ourselves to the type of regression involving only
two variables and the type of relationship between our variables which is linear. If this turns out
to be the case, it is called simple linear regression.
[Figure 6.1: scatter diagram of Y against X]
When carefully observed, the scatter diagram at least shows the nature of relationship; whether
positive or negative and whether the curve is linear or non-linear. When the general course of
movement of the paired points is best described by a straight line, the next task is to fit a
regression line which lies as close as possible to every point on the scatter diagram. This can be
done by means of either free hand drawing or the method of least squares. However, the latter is
the most widely used method.
Regression equation is a statement of equality that defines the relationship between two
variables. The equation of the line which is to be used in predicting the value of the dependent
variable takes the form Ye = a + bX. The most universally used and statistically accepted method
of fitting such an equation is the method of least squares.
The Method of Least Squares:-
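This method requires that a straight line be fitted such that the sum of the squares of the vertical
deviations of the observed Y values from the straight line (predicted Y values) is the minimum.
As shown in figure 6.1, if e1, e2, … e5 are the vertical deviations of observed Y values from the
straight line (predicted Y values, Ye), fitting a straight line in keeping with the above condition
requires that (for sample size n)
$$e_1^2 + e_2^2 + \dots + e_n^2 = \sum_{i=1}^{n} e_i^2$$
is minimum. This can be done by partially differentiating $\sum e_i^2$ with respect to a and b and
equating the results to zero.
$$\sum e_i^2 = \sum (Y_i - Y_e)^2 = \sum (Y_i - a - bX_i)^2$$
Differentiating with respect to a:
$$\frac{\partial \sum e_i^2}{\partial a} = -2\sum (Y_i - a - bX_i) = 0$$
$$na = \sum Y_i - b\sum X_i$$
$$a = \frac{\sum Y_i}{n} - b\,\frac{\sum X_i}{n} = \bar{Y} - b\bar{X}$$
Differentiating with respect to b:
$$\frac{\partial \sum e_i^2}{\partial b} = \frac{\partial \sum (Y_i - a - bX_i)^2}{\partial b} = -2\sum (Y_i - a - bX_i)X_i = 0$$
$$\sum X_iY_i - a\sum X_i - b\sum X_i^2 = 0$$
$$\sum X_iY_i - (\bar{Y} - b\bar{X})\sum X_i - b\sum X_i^2 = 0$$
$$\sum X_iY_i - \bar{Y}\sum X_i - b\left[\sum X_i^2 - \bar{X}\sum X_i\right] = 0$$
$$b = \frac{\sum X_iY_i - \bar{Y}\sum X_i}{\sum X_i^2 - \bar{X}\sum X_i} = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{n\sum X_i^2 - \left(\sum X_i\right)^2}$$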
Example 6.1: Suppose we want to study the relationship between input (number of workers) and
output (thousands of Birr) of five factories given in table 6.1 above. To fit the regression line of
Yi (thousands of Birr) on Xi (number of workers), we can employ the method of least squares as
follows:
Solution
Arrange the data in tabular form (Table 6.2):
Yi    Xi    YiXi    Xi²
4     2     8       4
7     3     21      9
3     1     3       1
9     5     45      25
17    9     153     81
where Σ = summation/total, mean of Y = $\bar{Y} = \frac{\sum Y_i}{n}$, mean of X = $\bar{X} = \frac{\sum X_i}{n}$, and n = sample size.
For X = 8: $Y_e = 1 + \left(\frac{7}{4}\right)(8) = 15$
Therefore, if a factory has 8 workers, its level of output will be 15 thousand ETB.
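The fitted line used in this prediction can be recovered from the data with the least-squares formulas for a and b. A minimal Python sketch, in which the function name `fit_line` and the variable names are illustrative:

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b of the line Ye = a + b*X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = sum_y / n - b * sum_x / n      # a = mean of Y - b * mean of X
    return a, b


workers = [2, 3, 1, 5, 9]              # Xi
output = [4, 7, 3, 9, 17]              # Yi, in thousands of Birr

a, b = fit_line(workers, output)
predicted = a + b * 8                  # predicted output for 8 workers
print(a, b, predicted)                 # 1.0 1.75 15.0
```

The slope b = 1.75 = 7/4 means that each additional worker is associated with an increase of 1.75 thousand Birr in output.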
Example 6.2: In what follows you are provided with sample observations on price and
quantity supplied of a commodity X by a competitive firm.
a) Construct the scatter diagram
b) What is the linear regression of Yi (quantity supplies) on Xi(price of the commodity X).
c) Suppose price of the commodity X be 32, what will be the quantity supplied by the firm?
Tab.6.3. Data on price and quantity supplied.
a. [Figure: scatter diagram of quantity supplied (Y) against price (X)]
∑ ∑ ∑ ( , ) ( )
b) = ∑ ( )
= ( , ) ( )
= 0.7795
= −
∑ ∑
= = 675 12 = = 460 12
6.2. Correlation
The correlation coefficient measures the degree to which two variables are related
/associated – simple correlation denoted by r. For more than two variables we have multiple
correlations.
Two variables may have either positive correlation, negative correlation or may not be
correlated. Furthermore, depending on the form of relationship the correlation between two
variables may be linear or non-linear. Therefore, in this section, we shall be concerned with
quantifying the degree of association between two variables with linear relationship.
Contrary to regression analysis explained in the previous section (6.1), the computation of
coefficient of correlation does not require one variable to be designated as dependent and the
other as independent.
The measure of the degree of relationship between any two variables, known as the Pearsonian coefficient of correlation and usually denoted by r, is defined as
$$r = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2}}$$
and is termed the product-moment formula.
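Using the totals of Table 6.2 together with $\sum Y_i^2 = 444$:
$$r = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{\sqrt{[n\sum X_i^2 - (\sum X_i)^2][n\sum Y_i^2 - (\sum Y_i)^2]}} = \frac{5(230) - (20)(40)}{\sqrt{[5(120) - (20)^2][5(444) - (40)^2]}} = \frac{350}{\sqrt{(200)(620)}} = \frac{350}{352.14} = 0.99$$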
Therefore, $r = \frac{n\sum X_iY_i - \sum X_i \sum Y_i}{\sqrt{[n\sum X_i^2 - (\sum X_i)^2][n\sum Y_i^2 - (\sum Y_i)^2]}} = 0.974$, which implies a strong positive relation between X and Y.
Example 6.5: Adding to each value of X and Y given in table 6.1 a constant number, say 1, show
that property 4 holds true.
Solution: Table 6.6.
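The claim of Example 6.5 can also be checked numerically. The sketch below is in Python and assumes that property 4 is the statement that r does not change when a constant is added to every value (the list of properties is not reproduced here). It uses the data of Table 6.1 as they appear in Table 6.2; `pearson_r` is a name of our own.

```python
from math import sqrt

xs = [2, 3, 1, 5, 9]    # Xi from Table 6.1
ys = [4, 7, 3, 9, 17]   # Yi from Table 6.1


def pearson_r(x, y):
    """Product-moment correlation computed from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)


r_original = pearson_r(xs, ys)
# Add the constant 1 to every X and every Y, as Example 6.5 asks.
r_shifted = pearson_r([x + 1 for x in xs], [y + 1 for y in ys])
print(round(r_original, 4), round(r_shifted, 4))
```

Both calls give the same value of r (about 0.99), because adding a constant shifts the means by the same amount and leaves every deviation unchanged.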
The Pearsonian coefficient of correlation cannot be used when direct quantitative measurement of the phenomenon under study is not possible. In such cases, we make use of the rank correlation coefficient.
Steps involved in calculating Spearman's coefficient of rank correlation:
1. Rank the X values among themselves, giving rank (1) to the largest (or smallest) value, (2) to the next largest (or smallest) value, and so on.
2. Rank the Y values among themselves in the same way as the X values.
3. When there are ties in rank, i.e., when there are values sharing the same rank, assign to each of the tied observations the mean of the ranks they jointly occupy, and skip those ranks when ranking the next value.
4. Find the sum of the squares of the differences between the ranks of the two variables, $\sum D_i^2$.
5. Apply the formula $r_s = 1 - \frac{6\sum D_i^2}{n(n^2 - 1)}$, where n is the number of paired observations.
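The five steps above can be sketched in Python as follows. The function names and the two score lists are hypothetical illustrations, not data from this module; rank 1 is given to the largest value and tied values share the mean of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values (1 = largest); tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = ((pos + 1) + (end + 1)) / 2   # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rank_correlation(x, y):
    """Apply r_s = 1 - 6 * sum(D^2) / (n * (n^2 - 1)) to the ranks of x and y."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical scores given by two judges to five candidates.
judge_1 = [90, 80, 70, 60, 50]
judge_2 = [80, 90, 60, 70, 50]
r_s = spearman_rank_correlation(judge_1, judge_2)
print(r_s)
```

For these illustrative scores the squared rank differences sum to 4, so the formula gives 1 - 24/120 = 0.8.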
Total 4
$r_s = 1 - \frac{6\sum D_i^2}{n(n^2 - 1)} = 0.75$ ==> It implies that there is similarity between the ranks of
Review Exercises
1. Define and distinguish between;
a) Regression and correlation
b) Simple and multiple regression
c) Linear and non-linear relationship
2. Bring out the relevance of a scatter diagram in regression analysis.
3. Explain the meaning and status of the two constants a and b in the regression equation
Ye = a + bXi.
4. The marks obtained by 10 students in their graduation with B.A. degree in management
and the MBA entrance test were found as given below.
Graduation (Xi) 50 52 55 60 62 65 65 66 70 75
Entrance test (Yi) 52 50 57 65 65 62 65 65 71 75
Therefore, find
a) The two regression equations
b) The correlation coefficient between two sets of marks
5. Obtain the regression equation of X on Y and Y on X for the paired data given below.
Also compute the coefficient of correlation.
Market price of X 26 28 30 31 35
Market price of Y 20 27 28 30 25
6. Ten students got the following marks in Maths and Statistics
Student A B C D E F G H I J
Maths (X) 78 36 98 25 75 82 90 62 65 39
Statistics (Y) 84 51 91 60 68 62 86 58 58 47
Compute the coefficient of Rank correlation and interpret the result.
7. For a certain set of paired data on X and Y, 3Xi + 2Yi – 26 = 0 and 6Xi + Yi – 31 = 0
are the two regression equations.
a) Find the mean values
b) Find the coefficient of correlation (hint: $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$).
8. A leading company engaged in the production of detergents has 10 vacancies of salesman
for which 15 (n) persons were called for personal interviews. The interview board
consisted of the sales manager and a psychologist. The ranks given by the two to all 15
candidates who attend the interview is given below.
CHAPTER SEVEN: ELEMENTARY PROBABILITY
Chapter Objectives
Dear learner, at the end of this chapter, you are expected to:
Understand the basic terms such as probability, experiment, outcome and event.
Calculate probabilities applying the rules of addition and multiplication.
Define the terms conditional probability and joint probability.
Understand permutation and combination.
Define the terms random variable and probability distribution.
Distinguish between a discrete and continuous probability distribution
Calculate the mean, variance and standard deviation of discrete probability distributions
Understand binomial and normal probability distributions.
Define and calculate the Z-value
Compute probabilities using the standard normal distribution.
7.1.Introduction
Dear distance learner, can you explain the basic probability concepts, i.e. experiment, sample space, sample point and event?
______________________________________________________________________________
______________________________________________________________________________
Good!
An Experiment – is the process that leads to the occurrence of one or more possible
observations.
Example:- Tossing a coin
Rolling two dice once
Drawing a card from a deck
Sample Space – is a complete listing of all elementary events of an experiment.
Example:
The sample space for the experiment of tossing a coin is (H, T). If two coins are tossed once, the sample space is (H1, H2), (H1, T2), (T1, H2), (T1, T2).
The sample space for the roll of a single die is (1,2,3,4,5,6). If two dice are rolled once, the
possible outcomes (sample space) are:-
(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
Sample points: are elements of the sample space.
Example: 2 is one sample point of rolling a die.
To find the number of sample points in a sample space, apply the formula $K^n$, where n is the number of experiments (trials) and K is the number of possible outcomes of a single experiment. For example, two dice rolled once give $6^2 = 36$ sample points.
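As a quick check of the counting rule, the sketch below lists the sample spaces for two coins and two dice in Python. It assumes the rule is K raised to the power n; `sample_space` is a helper name of our own.

```python
from itertools import product


def sample_space(outcomes, n):
    """All ordered results of repeating an experiment with the given outcomes n times."""
    return list(product(outcomes, repeat=n))


two_coins = sample_space("HT", 2)          # K = 2 outcomes, n = 2 tosses
two_dice = sample_space(range(1, 7), 2)    # K = 6 outcomes, n = 2 dice

print(two_coins)
print(len(two_coins), len(two_dice))       # compare with 2**2 and 6**2
```

The lengths of the two lists agree with 2² = 4 and 6² = 36.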
An Event – is the collection of one or more outcomes of an experiment. Events are mutually
exclusive if the occurrence of any one event means that none of the others can occur at the same
time. That is if two events cannot occur at the same time, they are mutually exclusive. Events are
independent if the occurrence of one event does not affect the occurrence of another. Events are
collectively exhaustive if at least one of the events must occur when an experiment is conducted.
Example: A fair die is rolled once. The experiment is rolling a die. The possible outcomes are the numbers 1, 2, 3, 4, 5 and 6. If the event is the occurrence of an even number, we collect the outcomes 2, 4 and 6.
Probability is a measure of the chance or likelihood that a particular event will happen in the future. It can only assume values between 0 and 1, inclusive. That is, the probability of an event E, written P(E), is a number with the following properties:
$0 \le P(E) \le 1$
P(E) = 0 means the event will not happen and is called an impossible event.
P(E) = 1 means we are 100% sure that the event will occur (sure event).
There are three approaches to assigning probability:
(i) Classical probability
(ii) Relative frequency (empirical) probability
(iii) Subjective probability
i) Classical Probabilities: - It is based on the assumption that the outcomes of an
experiment are equally likely. It applies rules and laws and involves an experiment.
$P(E) = \frac{n}{N}$, where: N = total possible outcomes of an experiment
n = the number of outcomes in which the event occurs out of N outcomes in an experiment.
Examples. In a coin tossing experiment, what is the probability of getting a head on one
toss of a coin? As there are only two possible outcomes, the probability is 50% or 0.5 or
½.
An unbiased die is thrown. What is the probability that digit 2 appears? Ans. 1/6.
ii) Relative frequency (Empirical) Probabilities: This method is based on cumulative past historical data.
$P(E) = \frac{\text{number of times the event occurred in the past}}{\text{total number of observations}}$
Examples:
a) Suppose that, of the last 70 days with conditions like those forecast for today, it rained on 12 days. What is the probability of rain today based on those historical days? 12/70 = 0.17 or 17%
b) Throughout her teaching career a professor has awarded 186 A's out of 1200 students. What is the probability that a student in her section this semester will receive an A grade?
P(A) = 186/1200 = 0.155
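The two relative-frequency examples can be reproduced with exact fractions. This is a small Python sketch; the function name `relative_frequency` is our own.

```python
from fractions import Fraction


def relative_frequency(times_occurred, total_observations):
    """Empirical probability: past occurrences divided by total observations."""
    return Fraction(times_occurred, total_observations)


p_rain = relative_frequency(12, 70)        # rainy days among 70 similar days
p_grade_a = relative_frequency(186, 1200)  # A grades among 1200 students

print(p_rain, round(float(p_rain), 2))
print(p_grade_a, float(p_grade_a))
```

Working with `Fraction` keeps the ratios exact (6/35 and 31/200) until they are converted to decimals for reporting.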
iii) Subjective Probability: -It uses probability value based on an educated guess or
estimate, employing opinions and inexact information. For example, a seismologist might
say that there is a 45% probability that an earthquake will occur in Afar after thirty years.
______________________________________________________________________________
______________________________________________________________________________
Good!!
If two events A and B are mutually exclusive, the special rule of addition states that the
probability of A or B occurring equals the sum of their respective probabilities: P (A or B) = P(A)
+ P(B)
Definition: Two events of a single experiment are said to be mutually exclusive if they
cannot occur simultaneously as a result of the experiment. This is equivalent to saying that
mutually exclusive events must have disjoint event sets.
Example: Abay Zuria transport association has recently supplied the following information on
their trip from Bahir Dar to Debre Markos:
If A is the event that a bus arrives early, then P(A) = 100/1000 = .10.
If B is the event that a bus arrives late, then P(B) = 75/1000 = .075.
The complement rule is used to determine the probability of an event occurring by subtracting
the probability of the event not occurring from 1.
If P(A) is the probability of event A and P(~A) is the probability of the complement of A (A not occurring), then P(A) + P(~A) = 1, or P(A) = 1 - P(~A).
Examples:
(i) Two events X and Y are mutually exclusive. Suppose P(X) = 0.04 and P(Y) = 0.03. What is the probability that either X or Y will occur? (0.07) What is the probability that neither X nor Y will happen? (0.93)
(ii) Suppose the probability that you will score an A in this class is 0.25 and the probability that you will get a B is 0.50. What is the probability that your grade will be above C? (0.75)
(iii) The probabilities of events A and B are 0.20 and 0.30 respectively. The probability that both A and B occur is 0.15. What is the probability that either A or B will occur? (0.35)
(iv) A student is taking two courses, microeconomics and statistics. The probability that the student will pass the microeconomics course is 0.60 and the probability of passing the statistics course is 0.70. The probability of passing both is 0.50. What is the probability of passing at least one course? (0.80)
The general rule of addition
If A and B are two events that are not mutually exclusive, then P(A or B) is given by the
following formula: P(A or B) = P(A) + P(B) - P(A and B)
Example: In a sample of 500 students, 320 said they had a radio, 175 said they had a TV, and 100 said they had both. Let S be the event that a student has a radio and T the event that a student has a TV.
If a student is selected at random, what is the probability that the student has a radio, has a TV, and has both a radio and a TV? Solution: P(S) = 320/500 = .64, P(T) = 175/500 = .35, P(S and T) = 100/500 = .20.
If a student is selected at random, what is the probability that the student has either a radio or a TV in his or her room? Solution: P(S or T) = P(S) + P(T) - P(S and T) = .64 + .35 - .20 = .79.
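The general rule of addition can be checked on the radio and TV example with a few lines of Python. This is a sketch only; `prob_union` is a name of our own, and its third argument is left at 0 when the events are mutually exclusive (the special rule).

```python
from fractions import Fraction


def prob_union(p_a, p_b, p_a_and_b=0):
    """P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


students = 500
p_radio = Fraction(320, students)
p_tv = Fraction(175, students)
p_both = Fraction(100, students)

p_radio_or_tv = prob_union(p_radio, p_tv, p_both)
# Example (iv) above: passing at least one of the two courses.
p_pass_at_least_one = prob_union(Fraction(60, 100), Fraction(70, 100), Fraction(50, 100))
print(float(p_radio_or_tv), float(p_pass_at_least_one))
```

Subtracting P(A and B) removes the 100 students who would otherwise be counted twice, once as radio owners and once as TV owners.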
Joint Probability
A joint probability measures the likelihood that two or more events will happen at the same time.
An example would be the event that a student has both a radio and TV in his or
her dorm room.
The special rule of multiplication requires that two events A and B are independent.
Two events A and B are independent, if the occurrence of one has no effect on the probability of
the occurrence of the other.
If the occurrence of one event has no effect on the probability of the occurrence of any other
event, then the events are called independent events. Two events originating from independent
experiments will be independent, while two events originating from the same experiment will
not, in general, be independent.
Example: Suppose two coins are tossed. The outcome of one coin (head or tail) is unaffected by the outcome of the other coin (head or tail). That is, the outcome of the second toss does not depend on the outcome of the first.
This rule is written: P(A and B) = P(A)P(B)
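The special rule of multiplication can be illustrated by listing the four equally likely results of tossing two coins. The Python sketch below is illustrative; the helper `prob` and the event definitions are our own.

```python
from fractions import Fraction
from itertools import product

# Sample space for two coins: each result is (first coin, second coin).
outcomes = list(product("HT", repeat=2))


def prob(event):
    """Classical probability of an event given as a true/false test on an outcome."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


p_first_head = prob(lambda o: o[0] == "H")
p_second_head = prob(lambda o: o[1] == "H")
p_both_heads = prob(lambda o: o == ("H", "H"))

print(p_both_heads, p_first_head * p_second_head)
```

Counting outcomes directly gives P(both heads) = 1/4, the same as P(A)P(B) = (1/2)(1/2).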
___________________________________________________________________________
___________________________________________________________________________
Good!!
A conditional probability is the probability of a particular event occurring, given that another
event has occurred. The probability of the event A given that the event B has occurred is written
P(A|B).
The general rule of multiplication is used to find the joint probability that two events will occur.
It states that for two events A and B, the joint probability that both events will happen is found by
multiplying the probability that event A will happen by the conditional probability of B given
that A has occurred.
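P(A and B) = P(A) P(B/A)
where P(B/A), also written P(B|A), is the probability of B given that event A has occurred. Likewise, the conditional probability of A given B is
$P(A/B) = \frac{P(A \text{ and } B)}{P(B)}, \quad P(B) \neq 0$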
Example: The Dean of the School of Business at a University collected the following
information about undergraduate students in her college:
P (A and F) = 110/1000.
Given that the student is a female, what is the probability that she is an Accounting
major? P (A|F) = P (A and F)/P (F) = [110/1000]/[400/1000] = .275
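The conditional probability in this example can be recomputed as follows. This is a Python sketch using only the two figures quoted above (110 female accounting majors and 400 female students out of 1000); the function name `conditional` is our own.

```python
from fractions import Fraction


def conditional(p_a_and_b, p_b):
    """P(A|B) = P(A and B) / P(B), defined only when P(B) is not zero."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return p_a_and_b / p_b


p_a_and_f = Fraction(110, 1000)   # accounting major and female
p_f = Fraction(400, 1000)         # female

p_a_given_f = conditional(p_a_and_f, p_f)
print(p_a_given_f, float(p_a_given_f))

# The general rule of multiplication recovers the joint probability.
joint_again = p_f * p_a_given_f
```

Multiplying P(F) by P(A|F) returns the joint probability 110/1000, which is the general rule of multiplication read in reverse.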
Let an experiment have a sample space S with E as any event. We define the probability of E occurring, written P(E), as a number satisfying the following conditions:
$0 \le P(E) \le 1$
$P(S) = 1$, i.e. $\sum p_i = 1$
Additional examples:
1. An experiment is performed by tossing a normal coin and observing which side (H or T)
is shown uppermost.
a. Write down the sample space S = (H, T)
b. Calculate P(H) = ½
c. Show that P(S) = 1: P(H) + P(T) = 1/2 + 1/2 = 1
d. Show that E1 (H) and E2 (T) are mutually exclusive.
2. A fair die is rolled once as an experiment with S = (1,2,3,4,5,6)
a. P(1 or 2) = P(1) + P(2) = 1/6 + 1/6 = 1/3
b. P(X<4) = ½
c. P(even number)= ½
d. P(even or less than 4) = P(even number) + P(<4) – P(even number and <4) = 1/2 + 1/2 - 1/6 = 5/6
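The die calculations in example 2 can be verified by treating events as sets of outcomes. The Python sketch below is illustrative; the variable and function names are our own.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}            # sample space of one roll of a fair die


def prob(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return Fraction(len(event & S), len(S))


even = {2, 4, 6}
less_than_4 = {1, 2, 3}

direct = prob(even | less_than_4)                                      # count the union
by_rule = prob(even) + prob(less_than_4) - prob(even & less_than_4)    # addition rule
print(direct, by_rule)
```

The union {1, 2, 3, 4, 6} has five outcomes, so both routes give 5/6; the intersection {2} is the outcome that the addition rule subtracts once.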
Tilahun, Addisu
Tilahun , Chala
Chala, Addisu
Chala, Tilahun
$_3P_2 = \frac{3!}{(3-2)!} = 6$
Therefore, there are 6 different sitting arrangements for the two guests.
c. What if you are trying to give a seat for a guest out of three guests?
Addisu, Tilahun, Chala
$_3P_1 = \frac{3!}{(3-1)!} = 3$
Therefore, there are 3 different sitting arrangements for a guest.
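The seating arrangements can be listed and counted in Python. This is a sketch; `count_permutations` is our own name for the formula n!/(n - r)!.

```python
from itertools import permutations
from math import factorial

guests = ["Addisu", "Tilahun", "Chala"]


def count_permutations(n, r):
    """Number of ordered arrangements of r items chosen from n: n! / (n - r)!."""
    return factorial(n) // factorial(n - r)


two_seat_arrangements = list(permutations(guests, 2))
for arrangement in two_seat_arrangements:
    print(arrangement)

print(count_permutations(3, 2), count_permutations(3, 1))
```

`permutations` treats ("Addisu", "Tilahun") and ("Tilahun", "Addisu") as different arrangements, which is why six ordered pairs are listed.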
7.6. Probability Distributions and Random Variables
The possible outcomes for such an experiment will be: TTT, TTH, THT, THH, HTT,
HTH, HHT, HHH.
Thus the possible values of x (number of heads) are 0,1,2,3.
The outcome of zero heads occurred once.
The outcome of one head occurred three times.
The outcome of two heads occurred three times.
The outcome of three heads occurred once.
From the definition of a random variable, x as defined in this experiment is a random
variable.
The probability distribution is given as
X P(X)
0 1/8
1 3/8
2 3/8
3 1/8
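The table above can be rebuilt by counting heads in each of the eight outcomes. The Python sketch below assumes a fair coin tossed three times, which matches the outcomes listed above; the variable names are our own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The eight equally likely outcomes of three tosses, from TTT to HHH.
outcomes = list(product("TH", repeat=3))

# x = number of heads in each outcome; count how often each x occurs.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

distribution = {x: Fraction(count, len(outcomes))
                for x, count in sorted(head_counts.items())}

for x, p in distribution.items():
    print(x, p)
```

The probabilities sum to 1, as they must for a probability distribution.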
The Mean of a Discrete Probability Distribution
The mean:
reports the central location of the data.
is the long-run average value of the random variable.
is also referred to as its expected value, E(X), in a probability distribution.
is a weighted average.
The mean is computed by the formula: $\mu = \sum [x P(x)]$, where μ represents the mean and P(x) is the probability of the various outcomes x.
The variance is computed by the formula: $\sigma^2 = \sum [(x - \mu)^2 P(x)]$
Examples:
1. The tables below list values of a random variable X with probabilities P(X). However, only one of these is actually a probability distribution:
X P (X) X P (X) X P (X)
5 0.30 5 0.10 5 0.50
10 0.30 10 0.30 10 0.30
15 0.20 15 0.20 15 -0.20
20 0.40 20 0.40 20 0.40
a) Which one is a probability distribution?
b) Using the correct probability distribution, find the probability that X is
1) Exactly 15 (0.20)
2) Not more than 10 (0.40)
3) More than 5 (0.90)
c) Calculate the mean, variance and standard deviation of the correct probability
distribution.
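Mean = 5(0.10) + 10(0.30) + 15(0.20) + 20(0.40) = 0.5 + 3 + 3 + 8 = 14.5
Variance = (5 - 14.5)²(0.10) + (10 - 14.5)²(0.30) + (15 - 14.5)²(0.20) + (20 - 14.5)²(0.40) = 9.025 + 6.075 + 0.05 + 12.1 = 27.25
Standard deviation = √27.25 ≈ 5.22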
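Parts a) and c) of this example can be checked with a short Python sketch. The function names are our own; the three probability columns are copied from the table above.

```python
from math import sqrt

x_values = [5, 10, 15, 20]
table_1 = [0.30, 0.30, 0.20, 0.40]
table_2 = [0.10, 0.30, 0.20, 0.40]
table_3 = [0.50, 0.30, -0.20, 0.40]


def is_probability_distribution(probs, tol=1e-9):
    """Every probability lies in [0, 1] and the probabilities sum to 1."""
    return all(0 <= p <= 1 for p in probs) and abs(sum(probs) - 1) < tol


def mean_variance_sd(xs, probs):
    """Mean = sum of x*P(x); variance = sum of (x - mean)^2 * P(x)."""
    mu = sum(x * p for x, p in zip(xs, probs))
    variance = sum((x - mu) ** 2 * p for x, p in zip(xs, probs))
    return mu, variance, sqrt(variance)


valid = [is_probability_distribution(t) for t in (table_1, table_2, table_3)]
mu, variance, sd = mean_variance_sd(x_values, table_2)
print(valid)
print(round(mu, 2), round(variance, 2), round(sd, 2))
```

The first table fails because its probabilities sum to 1.20, and the third fails because it contains a negative probability, leaving the second table as the only valid distribution.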
The variance is found by: $\sigma^2 = n\pi(1 - \pi)$
To construct a binomial distribution, let
n be the number of trials
x be the number of observed successes
π be the probability of success on each trial
The formula for the binomial probability distribution is:
$P(x) = {}_nC_x \, \pi^x (1 - \pi)^{n - x}$
Example: The Department of Labor reports that 20% of the workforce is unemployed.
From a sample of 14 workers, calculate the following probabilities:
Solution
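The binomial formula can be evaluated directly for this example (n = 14, π = 0.20). The particular probabilities requested are not listed above, so the two events computed in this Python sketch (exactly three unemployed, and at least one unemployed) are illustrative choices of our own, as is the function name `binomial_pmf`.

```python
from math import comb


def binomial_pmf(x, n, pi):
    """P(x) = nCx * pi^x * (1 - pi)^(n - x)."""
    return comb(n, x) * pi ** x * (1 - pi) ** (n - x)


n, pi = 14, 0.20

p_exactly_3 = binomial_pmf(3, n, pi)        # illustrative: exactly 3 unemployed
p_at_least_1 = 1 - binomial_pmf(0, n, pi)   # illustrative: at least 1 unemployed

mean = n * pi                               # n * pi
variance = n * pi * (1 - pi)                # n * pi * (1 - pi)
print(round(p_exactly_3, 4), round(p_at_least_1, 4), mean, variance)
```

The "at least one" probability uses the complement rule: it is one minus the probability that none of the 14 workers is unemployed.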
The normal curve is bell-shaped and has a single peak at the exact center of the
distribution.
The arithmetic mean, median, and mode of the distribution are equal and located at the
peak. Thus half the area under the curve is above the mean and half is below it.
The normal probability distribution is symmetrical about its mean.
The normal probability distribution is asymptotic. That is the curve gets closer and
closer to the X-axis but never actually touches it.
It is a continuous probability distribution.
Theoretically, the curve extends to infinity in both directions.
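The standard normal distribution is a normal distribution with a mean of 0 and a standard deviation of 1. It is also called the z distribution. A z-value is the distance between a selected value, designated X, and the population mean μ, divided by the population standard deviation σ:
$$z = \frac{X - \mu}{\sigma}$$
For example, with a mean of 2,000 dollars and a standard deviation of 200 dollars, the z-value of 2,200 dollars is
$$z = \frac{X - \mu}{\sigma} = \frac{2{,}200 - 2{,}000}{200} = 1.00$$
What is the z-value of 1,700 dollars?
$$z = \frac{X - \mu}{\sigma} = \frac{1{,}700 - 2{,}000}{200} = -1.50$$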
A z-value of 1 indicates that the value of $2,200 is one standard deviation above the mean of
$2,000. A z-value of –1.50 indicates that $1,700 is 1.5 standard deviations below the mean of
$2000.
Example: The daily water usage per person in New Providence, New Jersey is normally
distributed with a mean of 20 gallons and a standard deviation of 5 gallons. About 68 percent of
those living in New Providence will use how many gallons of water? About 68% of the daily
water usage will lie between 15 and 25 gallons.
What is the probability that a person from New Providence selected at random will use between
20 and 24 gallons per day?
$z = \frac{X - \mu}{\sigma} = \frac{20 - 20}{5} = 0.00$ and $z = \frac{X - \mu}{\sigma} = \frac{24 - 20}{5} = 0.80$
The area under a normal curve between a z-value of 0 and a z-value of 0.80 is 0.2881.
We conclude that 28.81 percent of the residents use between 20 and 24 gallons of water per day.
What percent of the population use between 18 and 26 gallons per day?
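$z = \frac{X - \mu}{\sigma} = \frac{18 - 20}{5} = -0.40$ and $z = \frac{X - \mu}{\sigma} = \frac{26 - 20}{5} = 1.20$
The area under the normal curve between z = -0.40 and 0 is 0.1554, and between 0 and z = 1.20 it is 0.3849. Hence about 0.1554 + 0.3849 = 0.5403, or 54.03 percent of the residents, use between 18 and 26 gallons per day.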
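The areas read from the standard normal table can be reproduced with the error function in Python's `math` module. This is a sketch; the function names are our own, and the mean of 20 gallons and standard deviation of 5 gallons come from the example.

```python
from math import erf, sqrt


def z_value(x, mu, sigma):
    """Distance of x from the mean, in standard deviations."""
    return (x - mu) / sigma


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def prob_between(low, high, mu, sigma):
    """P(low < X < high) for a normal variable with mean mu and sd sigma."""
    return (standard_normal_cdf(z_value(high, mu, sigma))
            - standard_normal_cdf(z_value(low, mu, sigma)))


p_20_to_24 = prob_between(20, 24, mu=20, sigma=5)
p_18_to_26 = prob_between(18, 26, mu=20, sigma=5)
print(round(p_20_to_24, 4), round(p_18_to_26, 4))
```

The first result matches the table area of 0.2881; the second agrees with the sum of the two table areas for z = -0.40 and z = 1.20.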
Review Exercises
1. Sixty percent of the students at Scandia Tech drive to class and 30 percent have GPAs of
at least 3.00. Ten percent of the students have a 3.00 GPA and drive to class. If we select
a student at random, what is the likelihood that the student had a GPA of 3.00 or drives to
class?
2. An insurance sales representative has an appointment with four clients today. From long
experience she knows that the probability of selling a policy to a client is .80.
a. What is the probability of selling a policy to all 4 clients?
b. What is the probability of selling a policy to three or more clients?
3. There are 600 employees at the Tuesday Morning’s Department Store corporate
headquarters in Columbia.
See the following breakdown.
5. Suppose P (A) =0.75, P (B/A) =0.40, what is the joint probability of A and B?
Debre Markos University, College of Business and Economics
Department of Economics
Part I: Say True if the following statements are correct or false if incorrect (5 points)
1. Time series data is a data collection from a population at a given point in time.
2. A distribution with higher coefficient of variation is said to be more consistent than the
other distribution.
3. Class mark is the mid-way between the upper and lower class limits.
4. Quartile deviation can be taken as among the measures of variation/dispersion.
5. Measures of dispersion can assure which set of certain data is better represented by its
measure of central tendency value.
6. Among the methods of data collection, observational research requires to conduct an
experiment and record the results.
7. The sum of a cumulative frequency is always 1.
8. The intersection of less than and more than cumulative frequency curves gives the
median of a given data.
9. The difference between a histogram and a bar chart is that the former uses adjacent bars while the latter uses non-adjacent bars.
10. Range remains unaffected by the magnitude of the extreme values.
6. Certain farmers around Debre Markos have received aid in kind from 1999 which was
2%, 4% in 2000 and 5% in 2001. Find the Geometric mean of the aid.
a. 3 b. 3.41 c. 4 d. 3.28
Group A
1. Frequency distribution
2. Line graph
3. Bi-Modal
4. Variance
5. Coefficient of variation
Group B
a) 2, 7, 1, 1
b) Class must be non-exhaustive
c) Particularly effective to show changes in a variable over time
d) 2, 7, 1, 2
e) Formed by intersection of the class mark with the class frequencies
f) Effective in presenting cross sectional data
g) Class limit should be mutually exclusive
h) 1, 7, 1, 2, 7
i) Absolute measures of dispersion
j) Relative measures of dispersion
Part IV: Work out (10 points)
18 21 24 37 27
27 34 29 26 20
3. The following frequency distribution table reports the number of patients and their respective ages in a given hospital.
Patient Frequency
1-10 2
11-20 8
21-30 16
a. Find the coefficient of range using class mark (1 pts)
b. Calculate the Mean deviation from the Mean (1 pts)