Principles of Statistics


Statistics and Probability for Business and Financial Sciences

Dr. Abood Mohammed Jameel


Chapter 1
Concepts and definitions

1.1 Introduction
Statistical techniques are the techniques used in conducting a
statistical enquiry into a given phenomenon. They include all the
statistical methods, from the collection of data to the interpretation
of the collected data. One of the important statistical methods is the
collection of data. There are different methods for collecting primary
and secondary data.
Although the tradition of collecting data and using it for various
purposes is very old, the development of modern statistics as a
subject is of recent origin. The subject developed mainly after the
sixteenth century. Notable early contributors to the development of
statistics include Cardano, Galileo, Pascal, De Méré, and Fermat in
the 16th and 17th centuries. In later years the subject was developed
by Abraham De Moivre (1667 - 1754), Marquis De Laplace (1749 - 1827),
Carl Friedrich Gauss (1777 - 1855), Adolphe Quetelet (1796 - 1874),
Francis Galton (1822 - 1911), etc. Karl Pearson (1857 - 1937), who is
regarded as the father of modern statistics, was greatly motivated by
the research of Galton and was the first person to be appointed
Galton Professor in the University of London. William S. Gosset
(1876 - 1937), a student of Karl Pearson, propounded a number of
statistical formulae under the pen-name of 'Student'.
R.A. Fisher is yet another notable contributor to the field of statistics.
His book 'Statistical Methods for Research Workers', published in
1925, marks the beginning of the theory of modern statistics.
The science of statistics also received contributions from notable

economists such as Augustin Cournot (1801 - 1877), Léon Walras
(1834 - 1910), Vilfredo Pareto (1848 - 1923), Alfred Marshall (1842 -
1924), Edgeworth, A.L. Bowley, etc. They gave an applied form to
the subject.
Among the noteworthy Indian scholars who contributed to statistics
are P.C. Mahalanobis, V.K.R.V. Rao, R.C. Desai, P.V. Sukhatme, etc.
Statistics has been defined in different ways by different authors.
These definitions can be broadly classified into two categories. In the
first category are those definitions which lay emphasis on statistics as
data whereas the definitions in second category emphasize statistics
as a scientific method.
1.2 Definitions:
1.2.1 Statistics.
It is a tool for translating complex facts into simple and
understandable statements. Statistics may be defined as the science
of collecting, organizing, summarizing, presenting, analyzing, and
interpreting numerical and non-numerical data for the purpose of
assisting in making more effective decisions.

1.2.2 Statistical Population. It is any large group or collection of all
possible items or objects with specified characteristics or aspects of
interest. Example: All the students at Cihan University form the
student population. We can divide this population into two
populations, a girls' population and a boys' population. We denote the
size of any statistical population by "N". For example, the size of
the student population at Cihan University is N = 6000 students.

1.2.3 Sample. It is a part or subset of the population. We denote
the size of any sample by "n". For example, the first-year students in
class "A" / Accounting Department are a sample drawn or selected
from the student population of the Accounting Department / Cihan
University. The size of this sample is n = 45 students.
1.2.4 Data and information. "Figures and facts" in their original form,
before any statistical techniques are used to refine, process, or
summarize them.
This information may be collected through censuses, surveys,
experiments, the Internet, questionnaires, and other sources.

1.2.5 Element. The element is the essential unit or individual in the
population or in the sample on which data are collected. For example,
each student in class "A" / first year in the Accounting Department is
an element.

1.2.6 Variables (Xi). A variable is any characteristic or aspect which
can vary from one individual "element" to another. For example:
age, income, height, color, blood group, temperature, high school
average, and so on.

1.2.6.1. Discrete Variables: Variables which assume a finite or
countable number of possible values, usually obtained by
counting.
1.2.6.2. Continuous Variables: Variables which can assume any of
the infinitely many values within an interval, usually obtained by
measurement.
1.2.7 Observations. Data are obtained by collecting measurements on
each variable for every element in the study. The set of measurements
collected for a particular element is called an observation.
1.3. Data Sources:
1.3.1 Experiments;
1.3.2 Sports;

1.3.3 Internet;
1.3.4 Universities Database;
1.3.5 Banks, companies, and other private-sector organizations;
1.3.6 Governmental documents;
1.3.7 Libraries.
1.4. Functions of Statistics.
The following are the main functions of Statistics:

1.4.1 One of the most important functions of statistics is to collect
data. Statistics also simplifies a complex mass of data.

1.4.2 Statistics presents facts in a definite form.

1.4.3 Statistics furnishes techniques of comparison. It facilitates
comparison between two or more variables relating to different
times and locations.

1.4.4 Statistics helps in formulating and testing hypotheses.
Statistical methods are helpful for developing new theories and for
formulating and testing statistical hypotheses.

1.4.5 Statistics helps in forecasting future events. Statistical
techniques such as regression analysis may be used to forecast
future requirements.

1.4.6 Statistics studies relationships. It helps in establishing
relationships between two or more variables relating to different
fields.

1.4.7 Statistics helps to make rational decisions.

1.4.8 Statistics provides techniques for drawing inferences.


1.5. Types of data: There are two types of data:

1.5.1 Qualitative Data:
When data are classified according to characteristics (aspects) or
attributes, we call them qualitative data. For example: rich and poor
persons; married and unmarried; honest and dishonest; good, very good
and excellent; blood group; all types of colors; yes and no; and so on.

1.5.2 Quantitative Data:


When the classification of data is based on figures, we call this data
Quantitative data. For example, Weight, Income, Temperature,
Height, High school average, distance and so on.

1.6. Sampling Methods:


There are many ways to collect a sample. The most commonly used
methods are:
1.6.1 Statistical Sampling Methods:
1.6.1.1. Simple Random Sampling: It is a method of selecting items from
a population such that every possible sample of a specified size has an
equal chance of being selected. In this case, sampling may be with or
without replacement.
1.6.1.2. Stratified Random Sampling: It is obtained by selecting simple
random samples from strata (mutually exclusive sets). Some of the
criteria for dividing a population into strata are: gender (male,
female); age (under 18, 18 - 28, 29 - 39).

1.6.1.3. Cluster Sampling: It is a simple random sample of groups or
clusters of elements. Cluster sampling is useful when it is difficult or
costly to generate a simple random sample. For example, to estimate
the average annual household income in a large city we use cluster
sampling, because to use simple random sampling we need a

complete list of households in the city from which to sample. To use
stratified random sampling, we would again need the list of
households.
A less expensive way is to let each block within the city represent a
cluster.
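
The three statistical sampling methods above can be sketched in Python as follows. This is only an illustration under stated assumptions: the population of 20 student IDs, the gender tags used as strata, the blocks of 5 used as clusters, and the fixed seed are all hypothetical and do not come from the text.

```python
import random

random.seed(0)  # fixed seed (assumption) so the sketch is repeatable

# Hypothetical population: 20 student IDs, each tagged with a stratum.
population = [(i, "female" if i % 2 == 0 else "male") for i in range(1, 21)]

# Simple random sampling without replacement:
# every possible sample of size 5 has the same chance of being chosen.
simple_sample = random.sample(population, k=5)

# Stratified random sampling: a simple random sample from each stratum.
strata = {}
for student_id, gender in population:
    strata.setdefault(gender, []).append(student_id)
stratified_sample = {g: random.sample(ids, k=2) for g, ids in strata.items()}

# Cluster sampling: choose whole groups at random, keep every element in them.
clusters = [population[i:i + 5] for i in range(0, len(population), 5)]
chosen_clusters = random.sample(clusters, k=2)
cluster_sample = [element for block in chosen_clusters for element in block]

print(simple_sample)
print(stratified_sample)
print(cluster_sample)
```

Note that only the cluster version avoids needing a list of every element before sampling starts; it needs only the list of clusters.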
1.6.2 Non-statistical Sampling Methods:
1.6.2.1. Judgment Sampling: In this case, the person taking the
sample has direct or indirect control over which items are selected
for the sample.
1.6.2.2. Quota Sampling: The decision maker requires the sample to
contain a certain number of items with a given characteristic. Many
political polls are, in part, quota sampling.

1.7. Levels of Measurement:


Measurements are classified according to the highest level that they fit.
Each additional level adds something that the previous level did not have.
There are four levels of measurement:

1.7.1. Nominal is the lowest level. Only names are meaningful here.
1.7.2. Ordinal adds an order to the names.
1.7.3. Interval adds meaningful differences.
1.7.4. Ratio adds a zero so that ratios are meaningful.

1.8. Exercises
Exercise 1

Determine whether each of the following is qualitative or
quantitative data: Temperature, Colors, Distances, Age, Weight,
Grades, Brands, Names, Blood-group, 55, Yes, No, Excellent,
Very-good, 47, Black, Red, 20 Kilograms, Erbil, and University.

Exercise 2
Describe types of data.

Exercise 3
Explain the Levels of Measurement.

Exercise 4
Explain the Simple Random Sampling.

Exercise 5
Where do you get data and information (data sources)?

Exercise 6
List all functions of Statistics.

Exercise 7
Explain what is meant by the term population.

Exercise 8
Explain what is meant by the term sample.

Exercise 9
Explain how a sample differs from a population.

Exercise 10
Explain what is meant by the term sample data.

Chapter Two

Classification and Tabulation of Data and Information

2.1 Introduction

The data collected in any statistical investigation, known as raw data,
are a complex and unorganized mass of figures. Therefore, it is
necessary to organize them and arrange them in a definite form before
applying the statistical tools of analysis and interpretation. This task
is accomplished by the process of "Classification and Tabulation".
The most important step in statistics is the collection of data,
because it is the foundation of statistical investigation. How the data
are collected depends on the purpose for which they are required.
Data are divided, as we mentioned in chapter one, into two types:
Type1: Quantitative data.
Type2: Qualitative data.

2.2. Summarizing Data.


We can summarize data, either quantitative or qualitative, by
two tools or methods:
2.2.1 Frequency Distribution Table Method:-
The easiest method of summarizing and organizing data is a frequency
distribution table, which converts raw data into a meaningful pattern
for statistical analysis. Such a table consists of at least two columns:
a column of classes and a column of frequencies. This table is called a
frequency distribution table. The frequency may be defined as the
number of times a certain value, item, or object occurs.

2.2.1.1. Types of Frequency:
There are four types of frequency:
1. Ordinary Frequency (f_i): The number of times a certain value
occurs.
2. Relative Frequency distribution (R.f_i): It is the frequency of each
class divided by the total frequency, or simply the proportion of the
total number of items belonging to a class, i.e. R.f_i = frequency of
the class / n = f_i / n.
Note: The sum of all the relative frequencies must always be equal
to 1.00.
3. Percent Frequency distribution (P.f_i): It is the relative frequency
multiplied by 100%. P.f_i = R.f_i * 100%
Note: The sum of all the percent frequencies must always be equal
to 100. Relative frequency may be determined for both quantitative
and qualitative data and is a convenient basis for the comparison of
similar groups of different sizes.
4. Cumulative Frequency distribution.
The cumulative frequency distribution uses the number of classes,
class width, and class limits that are developed for the frequency
distribution. However, rather than showing the frequency of each
class, the cumulative frequency distribution shows the number of
items less than or equal to the upper class limit of each class.
Cumulative frequency may be divided into two types:
Type (1): Ascending Cumulative Frequency.
Type (2): Descending Cumulative Frequency.
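
The four types of frequency can be sketched in a few lines of Python. The class frequencies used here are hypothetical, chosen only to keep the arithmetic easy to follow, and the variable names are illustrative. The descending cumulative frequency is computed so that the first class shows the total n, which is the convention used in the tables of this chapter.

```python
from itertools import accumulate

frequencies = [5, 3, 2]          # hypothetical class frequencies f_i
n = sum(frequencies)             # total frequency

relative = [f / n for f in frequencies]      # R.f_i = f_i / n
percent = [100 * r for r in relative]        # P.f_i = R.f_i * 100

# Ascending: running total from the first class onward.
ascending = list(accumulate(frequencies))
# Descending: total of each class and all classes after it.
descending = list(accumulate(frequencies[::-1]))[::-1]

print(relative, percent, ascending, descending)
```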
2.2.1.2. What Does a Frequency Distribution Tell Us?
1. It shows how the observations cluster around a central value.
2. It shows the degree of difference between observations.

3. A frequency distribution shows the number of data items in
each of several non-overlapping groups or classes. As we will
see in next chapters, frequency distribution is the basis for
probability theory.

2.2.2. Graph Methods:-

A graph is a statistical tool that can be used to describe data and
information.
2.3. Summarizing Quantitative Data.
Quantitative data can be summarized by two tools or methods.
2.3.1 Frequency distribution table method.
Constructing a Frequency Distribution Table:
To construct a frequency distribution table, we must find the
column of classes and the column of frequencies.
The following steps help us find the column of classes:
Step (1): Determine the number of classes.
A class is a group (category) of interest; each class has two limits,
a lower limit and an upper limit, for example 4-8 or 6-12.
We need to specify the number of classes (K).
There is no generally accepted rule for how many classes to use;
between 5 and 20 classes is generally recommended.
We suggest the following simple formula to find an approximate
number of classes (K):
$K = 2.5 \times \sqrt[4]{n}$ … (2.1)
where n = sample size (number of values or items in the sample).
Step (2)
Determine the Width of Class (W).
The class width (W) is
$W = R / K$ … (2.2)
where
W = the number of values in each class,
R = Range = Largest data value – Smallest data value,
K = the number of classes.

Step (3)
Finding Class Limits:
Each class has two limits, lower limit and upper limit.
The limits could actually appear in the data and have gaps between
the upper limit of one class and the lower limit of the next.
Lower class limit = The smallest data value in the sample or less.
Upper class limit = Lower class limit + (W - 1).
Mid-Class (Mid-point): The number in the middle of the class.
It is found by adding the upper and lower limits for each class and
dividing by two.
Mid-Class = (Upper class Limit + Lower class Limit) / 2.
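
Steps (1) to (3) can be sketched as one small Python function. This is a sketch under stated assumptions, not a definitive procedure: the function name `build_classes` and the sample data are hypothetical, the data are assumed to be whole numbers, and the width is rounded to the nearest whole number with Python's built-in `round` (which rounds exact halves to the nearest even number).

```python
def build_classes(data):
    """Return (K, W, classes) where classes holds (lower, upper, mid-class)."""
    n = len(data)
    k = 2.5 * n ** 0.25              # formula 2.1: approximate number of classes
    r = max(data) - min(data)        # range R
    w = max(1, round(r / k))         # formula 2.2, rounded to a whole width
    classes = []
    lower = min(data)                # first lower limit = smallest data value
    while lower <= max(data):
        upper = lower + (w - 1)      # upper limit = lower limit + (W - 1)
        classes.append((lower, upper, (lower + upper) / 2))
        lower = upper + 1            # next class starts after a gap of 1
    return k, w, classes

# Hypothetical sample of 16 whole-number observations.
sample = [1, 2, 4, 4, 5, 7, 8, 9, 10, 12, 13, 15, 16, 17, 20, 21]
k, w, classes = build_classes(sample)
for lower, upper, mid in classes:
    print(f"{lower}-{upper}  mid-class = {mid}")
```

Because the width is rounded, the number of classes actually produced can differ from the approximate K given by formula 2.1.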

Example 1
The number of items rejected daily by a manufacturer because of
defects was recorded for 30 days. The results are shown below:
4 9 13 7 12 15 5 8 5 7 15 17 19 8 6
6 4 10 8 22 16 9 5 3 9 21 14 13 18 7

Construct:
1. Frequency distribution table.
2. Relative frequency,
3. Percent frequency,

4. Cumulative frequency,
5. Mid-class.
The Solution:
1- To find the frequency distribution table, we find,
Firstly: the classes, by applying the following three steps;
Step (1): Approximate number of classes (K):

n = 30.
$K = 2.5 \times \sqrt[4]{30}$
K = 5.8508.
Step (2):
Width of the Class (W): From the data in Example 1, the largest
data value = 22 and the smallest data value = 3. Hence, the range (R)
is
R = 22 – 3 = 19, and we already have an approximate number of
classes K = 5.8508.

Therefore,
W = R / K
W = 19 / 5.8508 = 3.2474 ≈ 3, rounded to the nearest whole number.

Step (3) Class Limits: We can find the class limits as follows:
Lower class limit = the smallest data value in the sample or less = 3.
Upper class limit = Lower class limit + (W - 1) = 3 + (3 – 1) = 5.
We define the first class limits as (3 – 5), second class limits (6 – 8),
third class limits (9 – 11), fourth class limits (12 – 14), fifth class
limits (15 – 17), sixth class limits (18 – 20), and seventh class limits
(21 – 23). The smallest data value, 3, is included in the (3 – 5) class.

Secondly: We find the frequency for each class.

Once the number of classes, class width, and class limits have been
determined, the frequency of each class is obtained by counting the
number of data items belonging to it. For example, the data in
Example 1 show that six values (4, 5, 5, 4, 5, and 3) belong to the
class (3 – 5). Thus, the frequency of the class (3 – 5) is 6.
Continuing this counting process for the other classes provides the
following frequency distribution in Table 2.1:
Table 2.1
Frequency Distribution for items rejected daily
Class             3-5   6-8   9-11   12-14   15-17   18-20   21-23   Total
Frequency (f_i)    6     8     4      4       4       2       2      30
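
The counting step for Example 1 can be checked with a short Python sketch. The data and class limits are taken from the example; the variable names are illustrative.

```python
# Items rejected daily over 30 days (Example 1).
data = [4, 9, 13, 7, 12, 15, 5, 8, 5, 7, 15, 17, 19, 8, 6,
        6, 4, 10, 8, 22, 16, 9, 5, 3, 9, 21, 14, 13, 18, 7]

# Class limits found in Step (3).
classes = [(3, 5), (6, 8), (9, 11), (12, 14), (15, 17), (18, 20), (21, 23)]

# Frequency of a class = number of values between its limits, inclusive.
frequencies = [sum(lo <= x <= hi for x in data) for lo, hi in classes]

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo}-{hi}: {f}")
print("Total:", sum(frequencies))
```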

2- Recall that the relative frequency (R.f_i) is simply the proportion
of the total number of items belonging to a class. For the data set
having n = 30 items, the relative frequency of the first class (3 – 5) is:
R.f_1 = 6/30 = 0.200.
3- Recall that the percent frequency (P.f_i) is the relative frequency
multiplied by 100%.
P.f_i = R.f_i * 100%
Then, the percent frequency of the first class (3 – 5) is:
P.f_1 = 0.200000 * 100 = 20.0000
Based on the class frequencies in Table 2.1 and with n = 30,
Table 2.2 shows the relative frequency distribution and percent
frequency distribution for the items rejected daily data.

Table 2.2
Relative and Percent Frequency Distribution for items rejected daily
Class (items      Frequency   Relative frequency   Percent frequency
rejected daily)   (f_i)       (R.f_i)              (P.f_i) %
3-5                6          0.200000              20.0000
6-8                8          0.266667              26.6667
9-11               4          0.133333              13.3333
12-14              4          0.133333              13.3333
15-17              4          0.133333              13.3333
18-20              2          0.066667               6.6667
21-23              2          0.066667               6.6667
Total             30          1.000000             100.0000
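
The relative and percent frequencies for Example 1 follow directly from the class frequencies. A minimal Python sketch, with values rounded to the number of decimals shown in the table (the variable names are illustrative):

```python
frequencies = [6, 8, 4, 4, 4, 2, 2]   # class frequencies from Table 2.1
n = sum(frequencies)                   # n = 30

relative = [round(f / n, 6) for f in frequencies]        # R.f_i = f_i / n
percent = [round(100 * f / n, 4) for f in frequencies]   # P.f_i = R.f_i * 100

print(relative)
print(percent)
```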

Note that the sum of all the relative frequencies must always be
equal to 1.00 and the sum of all percent frequencies must always be
equal to 100. In the above example, we see that on 20% of the days
between 3 and 5 items were rejected, and on 6.6667% of the days
between 18 and 20 items were rejected.
4- Recall the cumulative frequency: the ascending cumulative
frequency distribution shows the number of items with values less
than or equal to the upper class limit of each class. The descending
cumulative frequency shows the number of items with values greater
than or equal to the lower class limit of each class.
Table 2.3 provides the ascending and descending cumulative
frequency distributions for the items rejected daily data.

Table 2.3
Ascending and Descending Cumulative Frequency Distributions
for Items Rejected Daily
Items rejected daily        Frequency   Ascending    Descending
                            (f_i)       Cumulative   Cumulative
                                        Frequency    Frequency
Less than or equal to 5      6           6           30
Less than or equal to 8      8          14           24
Less than or equal to 11     4          18           16
Less than or equal to 14     4          22           12
Less than or equal to 17     4          26            8
Less than or equal to 20     2          28            4
Less than or equal to 23     2          30            2
Total                       30           -            -
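
Both cumulative columns are running totals of the frequency column, taken from opposite ends. A short Python sketch for the Example 1 frequencies (variable names are illustrative):

```python
from itertools import accumulate

frequencies = [6, 8, 4, 4, 4, 2, 2]   # class frequencies from Table 2.1

# Ascending: add the frequencies from the first class downward.
ascending = list(accumulate(frequencies))

# Descending: add the frequencies from the last class upward,
# so the first class shows the total n.
descending = list(accumulate(frequencies[::-1]))[::-1]

print(ascending)
print(descending)
```

The last ascending value and the first descending value both equal n = 30, which is a quick check on the table.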

5- Mid-Class (Mid-point): The number in the middle of the class.
It is found by adding the upper and lower limits of each class and
dividing by two.
Mid-Class = (Upper class limit + Lower class limit) / 2.
Calculating the mid-class for the data in Example 1 for items rejected:
Let X_i = Mid-Class
X_1 = (3 + 5) / 2 = 4
X_2 = (6 + 8) / 2 = 7
X_3 = (9 + 11) / 2 = 10

And so on. The full results are listed in Table 2.4:
Table 2.4
Mid-Class - Items rejected
Class             3-5   6-8   9-11   12-14   15-17   18-20   21-23
Freq. (f_i)        6     8     4      4       4       2       2
Mid-Class (X_i)    4     7    10     13      16      19      22

Example 2
Suppose the age of 10 students is:
21, 18, 19, 21, 20, 25, 22, 23, 24, and 26.
Find: 1) Frequency distribution table,
2) Relative frequency,
3) Percent frequency,
4) Cumulative frequency and
5) Mid-class.
The Solution:

1- To find the frequency distribution table, we first find the classes
by applying the following three steps:

Step (1): Approximate number of classes (K):

n = 10.
$K = 2.5 \times \sqrt[4]{10}$
K = 4.4456

Step (2): Width of the Class (W).
From the data in this example, the largest data value = 26 and the
smallest data value = 18. Hence, the range (R) is:
R = 26 – 18 = 8
and we already have an approximate number of classes K = 4.4456,
therefore,
W = R / K
W = 8 / 4.4456 = 1.7995 ≈ 2,
which is rounded up to 2.

Step (3): Class Limits:

We can find the class limits as follows:
Lower class limit = the smallest data value in the sample or less = 18.
Upper class limit = Lower class limit + (W - 1) = 18 + (2 – 1) = 19.
So the first class is (18-19), the second class is (20-21), and so on.
Counting the data items that belong to each class provides the
following frequency distribution in Table 2.5:
Table 2.5
Frequency distribution table for age of 10 students
Class         18-19   20-21   22-23   24-25   26-27   Total
Freq. (f_i)     2       3       2       2       1      10

2- Recall that the relative frequency is simply the proportion of the
total number of students belonging to a class.
For the data set having n = 10 items, the relative frequency of the first
class (18 – 19) is: R.f_1 = 2/10 = 0.2
3- Recall that the percent frequency (P.f_i) is the relative frequency
multiplied by 100%.
The percent frequency of the first class (18 – 19) is:

P.f_1 = 0.2 * 100% = 20

Based on the class frequencies in Table 2.5 and with n = 10,
Table 2.6 shows the relative frequency distribution and percent
frequency distribution for the student's age data.
Table 2.6
The relative and percent frequency distributions
for age of 10 students.
Class    Frequency   Relative frequency   Percent frequency
         (f_i)       (R.f_i)              (P.f_i) %
18-19     2          2/10 = 0.2           0.2*100 = 20
20-21     3          3/10 = 0.3           0.3*100 = 30
22-23     2          2/10 = 0.2           0.2*100 = 20
24-25     2          2/10 = 0.2           0.2*100 = 20
26-27     1          1/10 = 0.1           0.1*100 = 10
Total    10          1.00                 100.00

Note that the sum of the relative frequency column must always be
equal to 1.00 and the sum of the percent frequency column must
always be equal to 100, as in the above example.
4- Calculate the ascending and descending cumulative frequencies.
Table 2.7 shows the frequency distribution and the ascending and
descending cumulative frequency distributions for the student's age
data:

Table 2.7
Ascending and Descending Cumulative Frequency Distribution
Student's Age               Frequency   Ascending    Descending
                            (f_i)       Cumulative   Cumulative
                                        Frequency    Frequency
Less than or equal to 19     2           2           10
Less than or equal to 21     3           5            8
Less than or equal to 23     2           7            5
Less than or equal to 25     2           9            3
Less than or equal to 27     1          10            1
Total                       10

5- Calculate the mid-class for the data in Example 2.
The Solution:
Mid-class (1) = (18 + 19) / 2 = 18.5,
Mid-class (2) = (20 + 21) / 2 = 20.5, and so on.

The results are in the following Table 2.8:

Table 2.8
Mid-Class for age of 10 students
Class         18-19   20-21   22-23   24-25   26-27   Total
Freq. (f_i)     2       3       2       2       1      10
Mid-Class     18.5    20.5    22.5    24.5    26.5

We can summarize the results of this example in the following
Table 2.9:
Table 2.9
Summarizing the results of Example 2
Student's  Class   Frequency  Relative     Percent       Ascending   Descending  Mid-
Age                (f_i)      Frequency    Frequency     Cumulative  Cumulative  class
                              (R.f_i)      (P.f_i) %     Frequency   Frequency
≤ 19       18-19    2         2/10 = 0.2   0.2*100 = 20   2          10          18.5
≤ 21       20-21    3         3/10 = 0.3   0.3*100 = 30   5           8          20.5
≤ 23       22-23    2         2/10 = 0.2   0.2*100 = 20   7           5          22.5
≤ 25       24-25    2         2/10 = 0.2   0.2*100 = 20   9           3          24.5
≤ 27       26-27    1         1/10 = 0.1   0.1*100 = 10  10           1          26.5
Total      Total   10         1.00         100.00
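
The whole of Example 2 can be reproduced in one short Python sketch that builds a summary row per class. The ages and class limits come from the example; the dictionary keys and variable names are illustrative choices.

```python
from itertools import accumulate

ages = [21, 18, 19, 21, 20, 25, 22, 23, 24, 26]          # Example 2 data
classes = [(18, 19), (20, 21), (22, 23), (24, 25), (26, 27)]
n = len(ages)

freq = [sum(lo <= a <= hi for a in ages) for lo, hi in classes]
asc = list(accumulate(freq))                  # ascending cumulative frequency
desc = list(accumulate(freq[::-1]))[::-1]     # descending cumulative frequency

rows = []
for (lo, hi), f, a, d in zip(classes, freq, asc, desc):
    rows.append({
        "class": f"{lo}-{hi}",
        "f": f,
        "relative": f / n,          # R.f_i = f_i / n
        "percent": 100 * f / n,     # P.f_i = R.f_i * 100
        "ascending": a,
        "descending": d,
        "mid": (lo + hi) / 2,       # mid-class
    })

for row in rows:
    print(row)
```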

2.3.2. Summarizing Quantitative Data by Graphs:-


A graph is a statistical tool that can be used to describe data and
information. There are many types of graphs; for this type of data we
choose the histogram, the frequency polygon (line graph), and the
ogive.

2.3.2.1. Histogram: It is a graph which displays data by using vertical
bars of various heights to represent frequencies on the vertical Y-axis
and classes on the horizontal X-axis.
Example 4
Draw a histogram for the following data:

Class             18-19   20-21   22-23   24-25
Frequency (f_i)     2       3       2       2

The Solution:
We put the frequency on the Y-axis and the class limits on the X-axis.
Figure 2.1 Histogram for data in example 4.
[Figure: histogram with frequency (0 to 3.5) on the Y-axis and the classes 18-19, 20-21, 22-23, 24-25 on the X-axis.]
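
As a rough stand-in for the plotted figure, the histogram of Example 4 can be drawn as text. This is only a sketch: the function name `text_histogram`, the bar width, and the use of `#` characters are illustrative choices, and a plotting library would normally be used instead.

```python
def text_histogram(labels, counts, width=7):
    """Draw vertical bars, one column per class, as plain text."""
    lines = []
    for level in range(max(counts), 0, -1):
        # A class gets a bar segment at this level if its frequency reaches it.
        cells = [("###" if c >= level else "").center(width) for c in counts]
        lines.append(f"{level:>2} |" + "".join(cells))
    lines.append("   +" + "-" * (width * len(counts)))
    lines.append("    " + "".join(label.center(width) for label in labels))
    return "\n".join(lines)

classes = ["18-19", "20-21", "22-23", "24-25"]
frequencies = [2, 3, 2, 2]
chart = text_histogram(classes, frequencies)
print(chart)
```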

2.3.2.2. Line graph.
The frequency is placed along the vertical axis (Y-axis) and the
mid-class (mid-point) is placed along the horizontal axis (X-axis).
These points are connected with lines.

Example 5
Draw a line graph from the following data.
Class             3-5   6-8   9-11   12-14   15-17   18-20   21-23
Freq. (f_i)        6     8     4      4       4       2       2
Mid-Class (X_i)    4     7    10     13      16      19      22

The Solution:
We put the mid-class on the X-axis and the frequency on the Y-axis to
get the following graph.

Figure 2.2 Line graph.
[Figure: line graph with frequency (0 to 9) on the Y-axis and the mid-classes 4, 7, 10, 13, 16, 19, 22 on the X-axis.]

2.3.2.3. Ogive: It is a curve drawn for the cumulative frequency
distribution or the relative cumulative frequency distribution. We put
the cumulative frequency or relative cumulative frequency on the
vertical Y-axis and the mid-class on the horizontal X-axis.

Example 6
Draw an ogive (curve) for the ascending and descending
cumulative frequencies of the following data.
Ascending and Descending Cumulative Frequencies.
Items rejected daily     ≤ 5   ≤ 8   ≤ 11   ≤ 14   ≤ 17   ≤ 20   ≤ 23
Ascending Cumulative      6    14     18     22     26     28     30
Descending Cumulative    30    24     16     12      8      4      2

The Solution
Put cumulative frequency on the vertical y-axis.
Put the class boundaries on the horizontal X-axis
Figure 2.3 Ogive for the data in example 6.
[Figure: ascending and descending cumulative frequency curves, with cumulative frequency (0 to 35) on the Y-axis and items rejected daily (less than or equal to 5, 8, 11, 14, 17, 20, 23) on the X-axis.]

2.4. Summarizing Qualitative Data


2.4.1. By Frequency Distribution Table:-
Another definition of frequency distribution is a tabular summary of
a set of data showing the frequency (or number) of items in each of
several non-overlapping classes or groups. The objective of
developing a frequency distribution Table is to provide insights
about the data that cannot be quickly obtained by looking only at the
original data. To see how frequency distribution can be used with
qualitative data, consider the following example:


Example 7
We asked first-year students in the Accounting Dept. / Cihan University
about their blood group. Their responses are listed below:
Data from a Sample of 50 Students
(AB)+  A+     B+     O-     (AB)+  O-     (AB)+  B+     O+     (AB)+
O+     B-     (AB)+  A-     O-     A-     A+     (AB)+  (AB)+  (AB)-
A+     B+     A+     (AB)+  B+     (AB)-  B-     B+     B+     O+
B+     B-     A-     (AB)-  B-     B-     (AB)+  O+     A+     B+
(AB)-  (AB)+  O+     B-     A+     A+     B+     O+     A+     A+

Find:
1. Frequency distribution table,
2. Relative frequency,
3. Percent frequency,
4. Ascending cumulative frequency.
The solution:
1. To develop a frequency distribution table for these data, we
count the number of times each of the Blood groups appears in the
data set. (AB)+ appears 10 times, O+ appears 6 times, B+ appears 9
times, and so on. These counts are summarized in the frequency
distribution Table 2.11 below:
Table 2.11
Frequency Distribution of Blood group
Blood group       (AB)+   O+   O-   B+   B-   A+   A-   (AB)-   Total
Frequency (f_i)    10      6    3    9    6    9    3    4       50

We are often interested in knowing the proportion, or percentage, of
the data items in each group or class.
2. The relative frequency distribution is a tabular summary of a set
of data showing the relative frequency for each group or class.
3. The percent frequency distribution is a tabular summary of a set of
data showing the percent frequency for each group or class.
We can use Table 2.11 to develop the relative and percent frequency
distributions for the students' blood group data, as summarized in
Table 2.12.
We can see in Table 2.12 that the relative frequency for O+ is
6/50 = 0.12, the relative frequency for (AB)+ is 10/50 = 0.2, and so on.
Multiplying each of the relative frequencies by 100 provides the
percent frequency distribution.
We find from these distributions that, on the basis of the sample
data, 20% of the students have the (AB)+ blood group, 18% have B+,
18% have A+, 12% have O+, 12% have B-, 8% have (AB)-,
6% have O-, and 6% have A-.
Computing the relative frequency, percent frequency, and ascending
cumulative frequency for all the students' blood groups provides the
frequency distributions in Table 2.12.


Table 2.12
Relative, Percent and Ascending Cumulative
Frequency Distribution of Student's Blood Group
Blood    Frequency   Relative     Percent       Ascending
group    (f_i)       Frequency    Frequency     Cumulative
                     (R.f_i)      (P.f_i) %     Frequency
(AB)+    10          0.20          20           10
O+        6          0.12          12           16
O-        3          0.06           6           19
B+        9          0.18          18           28
B-        6          0.12          12           34
A+        9          0.18          18           43
A-        3          0.06           6           46
(AB)-     4          0.08           8           50
Total    50          1.00         100            -
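
For qualitative data the counting is by label rather than by class limits. A minimal Python sketch for Example 7, using `collections.Counter`; the responses are written as AB+ rather than (AB)+ for convenience, and the variable names are illustrative.

```python
from collections import Counter
from itertools import accumulate

# The 50 responses of Example 7, row by row.
responses = """
AB+ A+ B+ O- AB+ O- AB+ B+ O+ AB+
O+ B- AB+ A- O- A- A+ AB+ AB+ AB-
A+ B+ A+ AB+ B+ AB- B- B+ B+ O+
B+ B- A- AB- B- B- AB+ O+ A+ B+
AB- AB+ O+ B- A+ A+ B+ O+ A+ A+
""".split()

order = ["AB+", "O+", "O-", "B+", "B-", "A+", "A-", "AB-"]  # table row order
counts = Counter(responses)
n = len(responses)

freq = [counts[g] for g in order]
relative = [f / n for f in freq]          # R.f_i = f_i / n
percent = [100 * f / n for f in freq]     # P.f_i = R.f_i * 100
ascending = list(accumulate(freq))        # ascending cumulative frequency

for g, f, r, p, a in zip(order, freq, relative, percent, ascending):
    print(f"{g:<4} {f:>3} {r:.2f} {p:>5.0f} {a:>3}")
```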

2.4.2. Summarizing Qualitative Data by Graphs:


2.4.2.1. Bar Graphs:
Bar graphs are used to describe qualitative data.
A bar graph displays the data by using vertical bars of various
heights to represent the frequency, relative frequency, or percent
frequency distribution. On the horizontal axis of the graph, we
specify the labels that are used for each of the classes or groups. A
frequency, relative frequency, or percent frequency scale can be used
for the vertical axis of the graph. Then, using a bar of fixed width
drawn above each class or group label, we extend the height of the
bar until we reach the frequency, relative frequency, or percent
frequency of the class or group as indicated by the vertical axis.
Figure 2.4 is a bar graph of the frequency distribution for the 50
students' blood groups in Table 2.11.
Note how the graphical presentation shows that the blood groups
(AB)+, B+, and A+ have the highest frequencies.
Figure 2.4
Bar graph of Students' Blood Groups
[Bar graph: vertical axis Frequency (f i), scale 0 to 12; horizontal axis blood groups (AB)+, O+, O-, B+, B-, A+, A-, (AB)-]

2.4.2.2. Pie Chart (Circle graph): A pie chart is a graphical device
that describes groups of qualitative data as slices of a pie or circle.
Equivalently, it is a graphical device for presenting a relative
frequency distribution for qualitative data. To construct a pie chart,
we first draw a circle to represent all the data; then we use the
relative frequencies to subdivide the circle into sectors, or parts,
that correspond to the relative frequency of each class.
For example, since there are 360 degrees in a circle and since (AB)+
has a relative frequency of 0.20, the sector of the pie chart labeled
(AB)+ should consist of 0.20 * 360 = 72 degrees. That is, the
relative frequency determines the size of the slice: the number of
degrees in any slice is the relative frequency times 360 degrees,
i.e., number of degrees in any slice = R.f i * 360. Similar
calculations for the other classes yield Table 2.13 and the pie chart
in Figure 2.5:
Table 2.13
The number of degrees in any slice
(Part) of Students' Blood Groups

Blood     Relative     Number of degrees
group     Frequency    = R.f i * 360
          (R.f i)

(AB)+     0.20         0.20 x 360 = 72.0
O+        0.12         0.12 x 360 = 43.2
O-        0.06         0.06 x 360 = 21.6
B+        0.18         0.18 x 360 = 64.8
B-        0.12         0.12 x 360 = 43.2
A+        0.18         0.18 x 360 = 64.8
A-        0.06         0.06 x 360 = 21.6
(AB)-     0.08         0.08 x 360 = 28.8

Total     1.00         360.0
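
The slice angles in Table 2.13 follow directly from the rule "degrees = R.f i * 360". A minimal Python sketch, in which the function name slice_degrees is an assumption made for this illustration:

```python
# Relative frequencies R.f_i copied from Table 2.13.
rel_freq = {"(AB)+": 0.20, "O+": 0.12, "O-": 0.06, "B+": 0.18,
            "B-": 0.12, "A+": 0.18, "A-": 0.06, "(AB)-": 0.08}

def slice_degrees(rel_freq):
    """Angle of each pie slice: relative frequency times 360 degrees."""
    # Rounding to one decimal hides tiny floating-point errors.
    return {group: round(rf * 360, 1) for group, rf in rel_freq.items()}

degrees = slice_degrees(rel_freq)
for group, angle in degrees.items():
    print(group, angle)
```

Because the relative frequencies sum to 1, the angles sum to 360 degrees, which is a quick check on the table.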

Figure (2.5) Pie Chart of Students' Blood Groups
[Pie chart titled "Student's Blood Group", slices labeled by relative frequency: AB+ 0.20, O+ 0.12, O- 0.06, B+ 0.18, B- 0.12, A+ 0.18, A- 0.06, AB- 0.08]

2.5 Exercises
Exercise 1
Define the following: element, variable, sample, statistical
population, and types of data.
Exercise 2
Consider the following data
Good Very Good Excellent Good Very Good

Good Very Good Good Excellent Very Good

Excellent Very Good Good Excellent Good

Very Good Excellent Good Good Excellent

Excellent Very Good Good Excellent Good

Very Good Excellent Good Good Excellent

Find:
1-Frequency distribution Table
2-Relative frequency and Percent frequency distribution
3-Ascending and Descending Cumulative frequency

Exercise 3
Consider the following data:
Apple L.G H.P IBM Sony Dell L.G H.P IBM Sony L.G
H.P IBM Sony Apple L.G H.P IBM Sony Apple L.G
H.P Apple IBM L.G H.P IBM Sony Apple Dell L.G H.P
Dell L.G H.P Dell L.G H.P Dell Sony
Find:
1.Frequency distribution Table,
2. Relative frequency and Percent frequency distribution,
3. Ascending and Descending Cumulative frequency.
Exercise 4
The data below show the blood groups of 40 people,
where: 1 = Type A+ blood group. 2 = Type O+ blood group.
3 = Type B+ blood group. 4 = Type (AB)+ blood group.

3 4 4 3 2 3 4 3 1 4 4 2 4 3 1 4 4 2 4 4

2 3 2 3 3 2 3 2 1 3 2 3 3 4 1 4 2 3 4 1

1. Construct a frequency distribution table.
2. Display the results in a bar chart.
Exercise 5
The number of items rejected daily by a manufacturer because of defects
was recorded for 30 days. The results are as follows:

4 9 13 7 12 15 5 8 5 7 15 17 19 8 3

4 10 8 22 16 9 5 3 9 21 14 13 18 7 5

1. Find a frequency distribution table.
2. Compute the relative and percent frequency distributions.
3. Compute the ascending and descending cumulative frequency
distributions.
Exercise 6
Suppose you have the following frequency distribution table:

Group A B C D E Total

Freq. ( f i ) 10 25 40 30 15 120

Draw:
1. Bar –Graph.
2. Pie –Chart (Circle- graph)
Exercise 7
Suppose you have the following data:

Class 3-5 6-8 9-11 12-14 15-17 18-20 21-23 Total

Frequency 7 7 4 4 4 2 2 30

Find:
1. The relative, percent, and cumulative frequency distributions.
2. Draw the ogive (cumulative frequency curve).

Exercise 8
The data below show the causes of death for 45 people,
where 1 = heart disease, 2 = cancer, 3 = accidents, 4 = other.

2 3 3 4 1 4 2 3 4 1 3 2 4 3 4

3 4 4 3 2 3 4 3 1 4 4 2 3 1 4

2 4 4 2 3 2 2 3 2 2 3 3 4 1 4

1. Construct a frequency distribution table.


2. Find the Relative and the Percent frequency distribution.
Exercise 9
Suppose you have the following table:

Group A B C D E Total

R.f i 0.15 0.25 0.30 --- 0.12

1. Calculate (R.f i ) of group(D) .


2. If the sample size n=400, find (f i ) and (P.f i ).
3. Draw Pie–Chart.
Exercise 10
Suppose you have the following data:

Group A B C D E Total

R.f i 0.25 0.20 0.30 0.15 ------- -------


1. What is the (R.f i ) of group (E) .
2. The Sample size n=500, find (f i ) and (P.f i ).
Exercise 11
Suppose you have the following data:

Group A B C D E Total

Freq. (f i ) 25 20 45 ---- 30 130

Draw Pie- Chart to summarize these groups.

Exercise 12
Consider the following table:

Class 6-10 11-15 16-20 21-25 26-30 Total

Freq. f i 20 25 ------- 15 10 120

Find:
1.The (f 3 ) of class (16-20).
2. The Width of the class (W).
3. The Range (R).

Exercise 13
Consider the following table:

Class 5-10 11-16 17-22 23-28 29-34 Total

Freq. f i 20 25 50 15 10 120

Find:
1. The ascending and descending cumulative frequency distributions.
2. Draw the ogive graph.

Exercise 14
Consider the following table:

Class 3-6 7-10 11-14 15-18 19-22 Total

fi 25 35 55 --- 10 150

Find:
1. The (f i ) of class (15-18).
2. The Width of the class (W).
3. The Range (R).

Exercise 15:
Consider the following table

Class 3-6 7-10 11-14 15-18 19-22 Total

fi 25 35 55 25 10 150

Find:
1. The ascending and descending cumulative frequency distributions.
2. Draw the ogive graph.

Exercise16:
Consider the following data.

14 21 23 21 16

19 22 25 16 16

24 24 25 19 16

20 23 16 20 19

24 26 15 22 24

20 22 24 22 20

1. Develop a frequency distribution using classes of
12-14, 15-17, 18-20, 21-23, and 24-26.
2. Develop a relative frequency distribution and a percent frequency
distribution using the classes in part 1.
Exercise 17
A researcher distributes questionnaires asking customers to rate
the server, food quality, cocktails, prices, and atmosphere at a
restaurant. Each characteristic is rated on a scale of outstanding (O),
very good (V), good (G), average (A), and poor (P).
G O V G A O V O V G O V
V O P V O G A O O O G O
V A G O V P V O O G O O
O G A O V O O G V A G G
1. Construct a frequency distribution table.
2. Display the results in Bar chart, and Pie Chart.

Exercise18:
Consider the following table:

Class Freq. fi R.fi P.fi Cumulative fi

10-20 60 20 60

21-31 30


---- 0.22 ----

43-53 18

54-64 10

Total 300

1. Complete the table.
2. Find K = number of classes and W = width of class.
3. Find the class midpoints.
4. Draw a dot plot.

Exercise19:
Suppose you have the following Data:

Class Freq. fi Rfi Pfi Cumulative fi

5-10 22

11-16 140 28

---- 0.20 ----

----- 17

29-34 13

Total 500 100

1. Complete the table.
2. Find K = number of classes and W = width of class.
3. Find the class midpoints.
4. Draw a dot plot.

Exercise 20
Consider the following data.
8.9 10.2 11.5 7.8 10.0 12.2 13.5 14.1 10.0 12.2
9.5 11.5 11.2 14.9 7.5 10.0 6.0 15.8 11.5
1- Construct a dot plot.
2- Construct a frequency distribution.
3- Construct a percent frequency distribution.

Exercise 21
A doctor's office staff studied the waiting times for patients who arrive
at the office with a request for emergency service. The following data,
with waiting times in minutes, were collected over 20 days.
2 5 10 12 4 4 5 17 11 8
9 8 12 21 6 8 7 13 18 4 3
Use classes of 0-4, 5-9, and so on in the following:
1. Show the frequency distribution.
2. Show the relative frequency distribution.

Chapter Three

Measures of Central Tendency

(Measures of Location)
3.1 Introduction
Summarizing data is a necessary part of any statistical
analysis. As a first step, the mass of raw data is
summarized in the form of tables and frequency distributions. To
bring the characteristics of the data into sharp focus, these
tables and frequency distributions need to be summarized further.
A measure of central tendency, or average, is an essential
summary measure in any statistical analysis. An average
is a single value that can be taken as representative of the whole
distribution.

3.2 Mean
Before discussing the mean, we introduce some notation.
Suppose there are n observations whose values are denoted by
X1, X2, ..., Xn respectively. The sum of these observations,
X1 + X2 + ... + Xn, is written in abbreviated form as ∑ Xi, where
∑ (called sigma) denotes the summation sign.
The subscript i of X is a positive integer that indicates the
serial number of the observation. Since there are n observations,
i ranges from 1 to n. When there is no ambiguity in the
range of summation, this indication can be omitted and we
simply write X1 + X2 + ... + Xn = ∑ Xi.

The mean is the most important numerical measure of location. It is
obtained by adding all the data values and dividing by the total
number of values.
The mean can be either a population mean (denoted by µ) or a
sample mean (denoted by X̄).
The formula for the population mean:
µ = ∑ Xi / N … 3.1
or, written out, µ = (X1 + X2 + ... + XN) / N

The formula for the sample mean:
X̄ = ∑ Xi / n … 3.2
or, written out, X̄ = (X1 + X2 + ... + Xn) / n

Properties of the mean

1. Uniqueness: For a given set of data there is one and only one
mean.
2. Simplicity: The mean is easy to calculate.
3. Affected by extreme values: Every value enters the sum, so the
mean is influenced by each value, and an extreme value can shift it
considerably.
Calculating the sample mean
Assume there are n observations: X1, X2, ..., Xn.
The mean is calculated with formula 3.2, which shows how the mean is
computed for a sample of size n, where n = sample size (number of
items in the sample).
In this formula, the numerator is the sum of all data values.
That is,
X̄ = ∑ Xi / n = (X1 + X2 + ... + Xn) / n

To illustrate the computation of the sample mean, let us consider the
following examples:
Example 1
Consider the following data:

10 20 25 30 35 28 22 15 18 12

Compute the mean.
The Solution:
Using the formula for the sample mean with n = 10, we get:

X̄ = ∑ Xi / n = (X1 + X2 + ... + X10) / 10
X̄ = (10 + 20 + 25 + 30 + 35 + 28 + 22 + 15 + 18 + 12) / 10 = 215 / 10 = 21.5

The sample mean is 21.5.
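
The same computation can be checked with a few lines of Python. This is a minimal sketch; the function name sample_mean is an assumption made for this illustration.

```python
def sample_mean(values):
    """Sample mean: sum of the observations divided by how many there are."""
    return sum(values) / len(values)

# Data from Example 1.
data = [10, 20, 25, 30, 35, 28, 22, 15, 18, 12]
print(sum(data), len(data), sample_mean(data))  # 215 10 21.5
```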

Example 2

We asked 30 first-year students in the Accounting Department at Cihan
University about their age. Their responses are as follows:
19 21 20 21 22 18 23 24 20 19 24 23 25 19 21
21 23 22 18 24 19 18 20 22 20 22 19 19 21 23

Compute the mean.

The Solution: The mean age of the students is computed as follows:

X̄ = ∑ Xi / n = (X1 + X2 + ... + X30) / 30

= (19 + 21 + 20 + ... + 21 + 23) / 30 = 630 / 30 = 21 years.


3.3 Median
The median is another measure of central location for data.
The median of a distribution is the value of the variate that divides it
into two equal parts. The median is a positional average because its
value depends on the position of an item and not on its magnitude.
Determination of Median:
The following steps are involved in the determination of the median:
1. Arrange the given observations in either ascending or descending
order of magnitude.
2. Given that there are n observations, the median is given by:
a. The size of the {(n+1)/2}th observation, when n is odd.
b. The mean of the sizes of the (n/2)th and {(n/2)+1}th
observations, when n is even.

Example 3
Find median of the following observations:
20, 15, 25, 28, 18, 16, 30.
The Solution: Writing the observations in ascending order, we get 15,
16, 18, 20, 25, 28, 30. Since n = 7 is odd, the median is the size of the
(7+1)/2 = 4th observation. Hence the median, denoted by Md, is 20.
Note: The same value of Md will be obtained by arranging the
observations in descending order of magnitude.

Example 4
Find the median of the following data:
245, 230, 265, 236, 220, 250.
The Solution:
Arranging these observations in ascending order of magnitude, we
get 220, 230, 236, 245, 250, 265.
Here n = 6, i.e., even.
The median is the mean of the sizes of the 6/2 = 3rd and [(6/2) + 1] =
4th observations. Hence Md = (236 + 245) / 2 = 240.5

To summarize: first arrange the data in ascending order, so that there are
as many values below the median as above it.
If there is an odd number of data values, the median is the value in the
middle.
If there is an even number of data values, the median is the average of
the two middle values.
Let us apply this definition to calculate the median of the following data
set:
4 3 6 8 10 2 5 11 13 16 15
We arrange these 11 data values in ascending order to get
2 3 4 5 6 8 10 11 13 15 16
Since n = 11 is odd, the median is the middle (6th) value, 8.
Now let us compute the median of the students' ages in Example 2.
We arrange the 30 data values in ascending order:
18 18 18 19 19 19 19 19 19 20 20 20 20 21 ( 21
21 ) 21 21 22 22 22 22 23 23 23 23 24 24 24 25
Since n = 30 is even, we identify the two middle values (the 15th and
16th, shown in parentheses).
The average of these two values is the median = (21 + 21) / 2 = 21.
Although the mean is the more commonly used measure of central
location, there are situations in which the median is preferred.
Whenever there are extreme data values, the median is often the
preferred measure of central tendency.
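
The odd and even cases of the median rule can be sketched in Python as follows. The function name median is an assumption made for this illustration; positions are shifted by one because Python lists are indexed from 0.

```python
def median(values):
    """Median by position: the middle value (n odd) or the mean of the two middle values (n even)."""
    ordered = sorted(values)          # step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # the {(n+1)/2}th observation
    # the (n/2)th and {(n/2)+1}th observations
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([20, 15, 25, 28, 18, 16, 30]))      # Example 3
print(median([245, 230, 265, 236, 220, 250]))    # Example 4
```

Sorting in descending order would give the same result, since the middle position does not change.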

3.4 The Mode


The mode, as a measure of central tendency, is preferable to the mean
and median when we want to know the most typical value, e.g., the most
common size of a ready-made garment, the most common income, the most
common pocket expenditure of a college student, the most common family
size in a locality, the most common duration of cure of viral fever, the
most popular candidate in an election, etc.
How to Find the Mode: The mode is the most frequent data value, i.e.,
the data value that occurs with the greatest frequency. The
mode is an important measure of location for qualitative data. There
may be no mode if no value appears more often than any other. There
may also be two modes, three modes, or more.

Example 5
Compute mode of the following data:
3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11
The Solution:
Writing this in the form of a frequency distribution, we get

Values: 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Frequency: 2 1 1 1 1 1 3 4 2 1 1 1 1 1 1 1 1 1
Mode = 10
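
Counting frequencies and picking the most frequent value can be sketched in Python with collections.Counter. The function name modes is an assumption made for this illustration, and returning an empty list when every value is equally frequent is one possible convention for "no mode".

```python
from collections import Counter

def modes(values):
    """Return all values with the highest frequency, or [] if no value stands out."""
    counts = Counter(values)                 # value -> frequency
    highest = max(counts.values())
    # If every distinct value occurs equally often, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)

# Data from Example 5.
data = [3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
        20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11]
print(modes(data))  # [10]
```

Returning a list lets the same function report two or more modes when several values tie for the highest frequency.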

Example 6
Consider the following data set:

1 2 3 4 5 6 7 8 8 9 8 10 11 12 13 14 15

Find: 1. The mean 2. The median 3. The mode

The Solution:

1. The mean: X̄ = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 8 + 9 + 8 + ... + 15) / 17
= 136 / 17 = 8.
2. The median: we arrange the values in ascending order.
1 2 3 4 5 6 7 8 8 8 9 10 11 12 13 14 15
Since n = 17 is odd, the median is the middle (9th) value, which is 8.
3. The mode is 8, because it occurs three times, more often than any
other value.

3.5 Summary:
The mean is used in computing other statistics (such as the variance). It
cannot be computed for open-ended grouped frequency distributions, and
it is often not appropriate for skewed distributions such as salary
data.
The median is the middle value and is well suited to skewed distributions
because it is resistant to extreme values. The mode is used to describe
the most typical case. The mode can be used with nominal data, whereas
the mean and median cannot. The mode may or may not exist, and there
may be more than one mode.

3.6 Exercises

Exercise 1
The following are behavioral ratings as measured for 10 cases.
3 , 6 , 4 , 3 , 4 , 4 , 5 , 2 , 5, 4
Compute:
A) The mean ,
B) The median and
C) The mode.

Exercise 2
The number of sick days due to colds and flu last year was recorded by
a sample of 15 adults.
The data are: 5 7 1 3 15 6 5 8 3 8 10 5 2 1 11
Compute: A- The Mean B- The Median C- The Mode.

Exercise 3
Suppose the following data represent 25 cell phone prices sold in the
Erbil area. Data are in dollars ($).

119 121 120 118 119 118 119 120 121 117 125 128
121 118 129 128 121 119 124 121 123 127 130 129 121

Compute: a) The Mean b) The Median c) The Mode.

Exercise 4
A sample of 20 college Lecturers showed the following hours taken
during the first semester, 2011-2012.

15 14 16 22 24 16 18 20 22 24

36 34 36 22 20 18 22 36 18 20

What are the Mean, Median and Mode?

Chapter Four

Measures of Variation (Dispersion)

4.1 Range: The range is the simplest measure of variation to find. It is
simply the highest (largest) value minus the lowest (smallest) value.
Range = Largest value - Smallest value

R = XL - XS … 4.1

Since the range uses only the largest and smallest values, it is greatly
affected by extreme values; that is, it is not resistant to change.

Example 1

Find the range for the following data.

18 16 17 12 23 21 18 22 26 19 25

The largest value is 26 and the smallest value is 12, so
Range (R) = 26 - 12 = 14

4.2 Population Variance (σ²): It is the average of the squares of the
distances from the population mean. Equivalently, it is the sum of the
squares of the deviations from the mean divided by the population size
N (the number of values in the population).
The formula for the population variance is

σ² = ∑ (Xi - µ)² / N … 4.2

σ² = population variance
N = population size
µ = population mean
4.3 Population Standard Deviation (σ): It is the square root of the
population variance.

σ = √σ² = √[ ∑ (Xi - µ)² / N ] … 4.3

4.4 Sample Variance (S²): It is an unbiased estimator of the population
variance. Instead of dividing by the population size, the sum of the
squares of the deviations from the sample mean is divided by (n - 1),
where n is the sample size.
Calculation of the sample variance:

S² = ∑ (Xi - X̄)² / (n - 1) … 4.4

S² = sample variance
Xi = individual value
X̄ = sample mean
n = number of values
Degrees of freedom:
There are (n - 1) degrees of freedom in computing the variance, because
if (n - 1) of the deviations are known, the nth one is determined
automatically. This is because the deviations (Xi - X̄) must add to zero.

4.5 Sample Standard Deviation

The standard deviation is the square root of the variance.
The standard deviation expresses the dispersion in terms of the original
units. Since the variance of a sample is S², we take its square root:

S = √[ ∑ (Xi - X̄)² / (n - 1) ] … 4.5
or
S = √S²

Example 2
Consider the following data set:

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Calculate: 1. The mean 2. The variance (S²) 3. The standard deviation (S)

The Solution:
To solve a problem like this, we prepare the following table:

Table 4.1

Xi            1   2   3   4   5   6   7   8   9  10  11  12  13  14  15   Sum = 120
(Xi - X̄)     -7  -6  -5  -4  -3  -2  -1   0   1   2   3   4   5   6   7   Sum = 0
(Xi - X̄)²    49  36  25  16   9   4   1   0   1   4   9  16  25  36  49   Sum = 280

Hence,
1. The mean: X̄ = (1 + 2 + 3 + ... + 14 + 15) / 15 = 120 / 15 = 8
2. The variance: S² = ∑ (Xi - X̄)² / (n - 1) = 280 / (15 - 1) = 20
3. The standard deviation: S = √20 = 4.47213
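
The three steps of Table 4.1 (deviations, squared deviations, division by n - 1) can be sketched in Python. The function names sample_variance and sample_std are assumptions made for this illustration.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Sample standard deviation: square root of the sample variance."""
    return math.sqrt(sample_variance(values))

data = list(range(1, 16))          # 1, 2, ..., 15 as in Example 2
print(sample_variance(data))       # 20.0
print(round(sample_std(data), 4))  # 4.4721
```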

4.6 Coefficient of Variation (C.V)

The coefficient of variation is the standard deviation divided by the
mean, expressed as a percentage.

C.V = (S / X̄) * 100% … 4.6
Example 3
Find the coefficient of variation (C.V) for the data in Example 2.
The Solution:
Mean = 8 and S = 4.47213, so the C.V is:
C.V = (4.47213 / 8) * 100% = 55.9016%.
Example 4
The following data represent daily salaries (ID) paid to 15 employees
working in a construction company.
50 55 60 65 80 68 78 75
60 72 65 77 50 65 70
Calculate:
1. The variance (S²).
2. The standard deviation (S).
3. The coefficient of variation (C.V).
The Solution:
We can calculate the variance (S²), the standard deviation (S), and the
coefficient of variation (C.V) from the data in the following Table 4.2:

Table 4.2

Xi            50   55   60   65   80   68   78   75   60   72   65   77   50   65   70   Sum = 990
(Xi - X̄)     -16  -11   -6   -1   14    2   12    9   -6    6   -1   11  -16   -1    4   Sum = 0
(Xi - X̄)²    256  121   36    1  196    4  144   81   36   36    1  121  256    1   16   Sum = 1306

Hence,
1. The mean: 990 / 15 = 66
2. The variance: S² = 1306 / 14 = 93.2857
3. The standard deviation: S = √93.2857 = 9.6585
4. C.V = (9.6585 / 66) * 100% = 14.6341%
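
As a cross-check, the coefficient of variation can be computed with Python's standard statistics module, whose stdev function uses the same n - 1 divisor as formula 4.4. The function name coefficient_of_variation is an assumption made for this illustration.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

# Daily salaries from Example 4.
salaries = [50, 55, 60, 65, 80, 68, 78, 75, 60, 72, 65, 77, 50, 65, 70]
print(statistics.mean(salaries))                      # 66
print(round(statistics.variance(salaries), 4))        # 93.2857
print(round(coefficient_of_variation(salaries), 2))   # 14.63
```

Because the C.V has no units, it allows the spread of data sets measured on different scales to be compared.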

Example 5
Suppose you have the following 16 observations:
14 16 17 18 21 17 25 13
15 17 19 22 23 20 22 25

Calculate: 1. The range, 2. The standard deviation (S),
3. The coefficient of variation (C.V).

The Solution:
We prepare the following Table 4.3:

Table 4.3

Xi            14  16  17  18  21  17  25  13  15  17  19  22  23  20  22  25   Sum = 304
(Xi - X̄)     -5  -3  -2  -1   2  -2   6  -6  -4  -2   0   3   4   1   3   6   Sum = 0
(Xi - X̄)²    25   9   4   1   4   4  36  36  16   4   0   9  16   1   9  36   Sum = 210

From the data in Table 4.3, we get:

1. The mean: 304 / 16 = 19
2. The variance: S² = 210 / 15 = 14
3. The standard deviation: S = √14 = 3.74
4. C.V = (3.74 / 19) * 100% = 19.68%
5. Range = max - min = 25 - 13 = 12
Example 6
The number of absence hours was recorded for a sample of 20 students
as follows:
3 4 4 5 6 6 6 6 7 7
7 7 8 9 10 10 11 12 15 17
Calculate: 1. The variance (S²), 2. The standard deviation (S),
3. The coefficient of variation (C.V).
The Solution:
From the data in Table 4.4, the sum of the values is 160 and the sum of
the squared deviations from the mean is 250, so we get:
1. The mean: 160 / 20 = 8
2. The variance: S² = 250 / (20 - 1) = 13.1579
3. The standard deviation: S = √13.1579 = 3.6274
4. C.V = (3.6274 / 8) * 100% = 45.34%
4.7 Measure of Position
Standard Scores (Z-scores): The standard score is obtained by
subtracting the mean from a value and dividing the difference by the
standard deviation. Its symbol is Z, and it is also called a Z-score.

Z = (X - µ) / σ or Z = (X - X̄) / S … 4.7

The mean of the standard scores is zero and their standard deviation is
one. This is the useful feature of the standard score.
Example 7
Find the standard score (Z) for the data in Example 2.
The Solution:
With S = 4.47213 and mean = 8, the standard score is
Z = (Xi - X̄) / 4.47213.
The standard score for each value is given in the following Table 4.5.
Table 4.5 The standard scores
Xi            (Xi - X̄)       Z
1             1-8 = -7       -1.56525
2             2-8 = -6       -1.34164
3             3-8 = -5       -1.11803
4             4-8 = -4       -0.89443
5             5-8 = -3       -0.67082
6             6-8 = -2       -0.44721
7             7-8 = -1       -0.22361
8             8-8 = 0         0.00000
9             9-8 = 1         0.22361
10            10-8 = 2        0.44721
11            11-8 = 3        0.67082
12            12-8 = 4        0.89443
13            13-8 = 5        1.11803
14            14-8 = 6        1.34164
15            15-8 = 7        1.56525
Total = 120   Zero
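
The table of standard scores, and the claim that they have mean zero and standard deviation one, can be verified with a short Python sketch. The function name z_scores is an assumption made for this illustration.

```python
import statistics

def z_scores(values):
    """Standard score of each value: (X - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # uses the n - 1 divisor
    return [(x - mean) / sd for x in values]

data = list(range(1, 16))              # 1, 2, ..., 15 as in Example 2
z = z_scores(data)
for x, score in zip(data, z):
    print(x, round(score, 5))
```

Adding up the scores gives zero (up to floating-point error), and applying statistics.stdev to them gives one, as stated in the text.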


Chapter Five
Mean, Variance, and Standard deviation
For Grouped (Classified) data.

5.1 The Mean for grouped (Classified) data (Weighted Mean)

The weighted mean X̄w, in this case, can be calculated as follows:
1. Find the sum of the products of each value (Xi) and its frequency or
weight (wi), i.e., ∑ Xi wi.
2. Find the sum of the frequencies, ∑ wi.
3. Divide the sum of the products (∑ Xi wi) by the sum of the frequencies
(∑ wi):

X̄w = ∑ wi Xi / ∑ wi … 5.1

where both sums run over i = 1, ..., n.

Example 1
A student in the Accounting Department at Cihan University-Erbil has passed
his first-semester final exams and got the following marks:

Table 5.1

Subject                     Mark   Unit

Principles of Accounting    65     5
Principles of Statistics    70     3
Principles of Management    73     3
Microeconomics              77     3
Financial Mathematics       60     3
Computer Skills             85     2
English Language            83     3

Calculate the weighted mean (average).


The Solution:
We prepare the following table:
Table 5.2
Weighted Mean (Average)

Subject                     Mark (Xi)   Unit (wi)   Xi * wi

Principles of Accounting    65          5           325
Principles of Statistics    70          3           210
Principles of Management    73          3           219
Microeconomics              77          3           231
Financial Mathematics       60          3           180
Computer Skills             85          2           170
English Language            83          3           249

Total                       513         22          1584

Then, from Table 5.2, we find:

∑ wi = 22
∑ Xi wi = 1584
The weighted mean is: X̄ = ∑ wi * Xi / ∑ wi = 1584 / 22 = 72.
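
Formula 5.1 translates almost directly into Python. In this minimal sketch the function name weighted_mean is an assumption made for this illustration; the marks and units are those of Table 5.2.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * X_i divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Marks (X_i) and units (w_i) from Table 5.2.
marks = [65, 70, 73, 77, 60, 85, 83]
units = [5, 3, 3, 3, 3, 2, 3]
print(weighted_mean(marks, units))  # 72.0
```

A subject with more units pulls the average more strongly, which is why the result differs from the plain mean of the seven marks (513 / 7).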

Example 2

The following data represent a first-year student's grades:

Grades        50   60   74   80   54   70

Units (wi)     2    3    4    3    2    3

Calculate the weighted mean.

The Solution:
We prepare the following table:
Table 5.3
Weighted Mean (Average)

Grades (Xi)   Units (wi)   Xi * wi

50            2            100
60            3            180
74            4            296
80            3            240
54            2            108
70            3            210

Total         17           1134

X̄w = ∑ wi Xi / ∑ wi = 1134 / 17 = 66.71

5.2 The Variance (S²) for grouped data

S² = {∑ (Xi - X̄)² * wi} / (∑ wi - 1) … 5.2

Example 3
Use the data in Table 5.1 to calculate the variance (S²).
The Solution:
Create the following Table 5.4:

Table 5.4

Subject                     Mark (Xi)   Unit (wi)   (Xi - X̄)   (Xi - X̄)²   (Xi - X̄)² * wi

Principles of Accounting    65          5           -7          49           245
Principles of Statistics    70          3           -2          4            12
Principles of Management    73          3           1           1            3
Microeconomics              77          3           5           25           75
Financial Mathematics       60          3           -12         144          432
Computer Skills             85          2           13          169          338
English Language            83          3           11          121          363

Total                       513         22                                   1468

X̄ = ∑ wi * Xi / ∑ wi = 1584 / 22 = 72.

S² = {∑ (Xi - X̄)² * wi} / (∑ wi - 1) = 1468 / (22 - 1) = 69.9047
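
The weighted variance of formula 5.2 can be sketched in Python as follows. The function name grouped_variance is an assumption made for this illustration; the divisor is the sum of the weights minus one, as in the worked solution.

```python
def grouped_variance(values, weights):
    """Weighted sum of squared deviations divided by (sum of weights - 1)."""
    total_w = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total_w
    squared = sum(w * (x - mean) ** 2 for x, w in zip(values, weights))
    return squared / (total_w - 1)

# Marks (X_i) and units (w_i) from Table 5.4.
marks = [65, 70, 73, 77, 60, 85, 83]
units = [5, 3, 3, 3, 3, 2, 3]
print(round(grouped_variance(marks, units), 4))
```

The printed value agrees with 1468 / 21 up to rounding in the last decimal place.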

5.3 The Standard deviation ( S ).

S = √ S2 … 5.3

Example 4
Use the variance computed from Table 5.4 to calculate the standard deviation (S).

The Solution:
S = √S² = √69.9047 = 8.3609

Example 5
Consider the following Table 5.5

Table 5.5

Class 3-6 7-10 11-14 15-18 19-22 Total

Wi 25 35 55 25 10 150

Calculate:
1. The weighted mean,
2. The variance, and
3. The standard deviation.
The Solution: We compute the class midpoints (Xi) as follows:
X1 = (3+6)/2 = 4.5, X2 = (7+10)/2 = 8.5, and so on up to X5 = (19+22)/2 =
20.5. Then we create the following table:

Table 5.6
Weighted Mean, Variance, and Standard deviation

Class    Wi    Mid-class (Xi)   Xi * Wi   (Xi - X̄)   (Xi - X̄)²   (Xi - X̄)² * Wi

3-6      25    4.5              112.5     -6.9333     48.0706      1201.7662
7-10     35    8.5              297.5     -2.9333     8.6042       301.1487
11-14    55    12.5             687.5     1.0667      1.1378       62.5816
15-18    25    16.5             412.5     5.0667      25.6714      641.7862
19-22    10    20.5             205.0     9.0667      82.2050      822.0504

Total    150                    1715                               3029.3331

From Table 5.6 we find:

∑ Wi = 150, ∑ Xi * Wi = 1715
1. The weighted mean is: X̄ = ∑ Wi * Xi / ∑ Wi = 1715 / 150 = 11.4333.

2. The variance is:
S² = {∑ (Xi - X̄)² * Wi} / (∑ Wi - 1) = 3029.3331 / (150 - 1) = 20.3311

3. The standard deviation is: S = √S² = √20.3311 = 4.5090.
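
The whole of Example 5, from class limits to standard deviation, can be sketched in Python. The function name grouped_summary and the representation of each class as a (lower, upper) pair are assumptions made for this illustration.

```python
import math

# Class limits and frequencies from Table 5.5.
classes = [(3, 6), (7, 10), (11, 14), (15, 18), (19, 22)]
weights = [25, 35, 55, 25, 10]

def grouped_summary(classes, weights):
    """Weighted mean, variance and standard deviation using class midpoints."""
    mids = [(lo + hi) / 2 for lo, hi in classes]   # midpoint of each class
    total_w = sum(weights)
    mean = sum(w * x for x, w in zip(mids, weights)) / total_w
    variance = sum(w * (x - mean) ** 2 for x, w in zip(mids, weights)) / (total_w - 1)
    return mean, variance, math.sqrt(variance)

mean, variance, sd = grouped_summary(classes, weights)
print(round(mean, 4), round(variance, 4), round(sd, 4))
```

Using midpoints treats every observation in a class as if it sat at the class centre, so the results are approximations to the statistics of the raw data.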
