Principles of Statistics

Statistics and Probability for Business and Financial Sciences
Dr. Abood Mohammed Jameel

Chapter 1
Concepts and definitions
1.1 Introduction
Statistical techniques are the techniques used in conducting a
statistical enquiry concerning a certain phenomenon. They include
all statistical methods, from the collection of data to the
interpretation of the collected data.
One of the important statistical methods is the collection of data.
There are different methods for collecting primary and secondary data.
Although the tradition of collection of data and its use for various
purposes is very old, the development of modern statistics as a
subject is of recent origin. The development of the subject took place
mainly after the sixteenth century. The notable mathematicians who
contributed to the development of statistics are Cardano, Galileo,
Pascal, De Méré and Fermat in the 16th and 17th centuries. Then in
later years the subject was developed by Abraham De Moivre
(1667 - 1754), Marquis De Laplace (1749 - 1827), Karl Friedrich Gauss
(1777 - 1855), Adolphe Quetelet (1796 - 1874), Francis Galton
(1822 - 1911), etc. Karl Pearson (1857 - 1936), who is regarded as the
father of modern statistics, was greatly motivated by the research of
Galton and was the first person to be appointed Galton Professor at
the University of London. William S. Gosset (1876 - 1937), a student
of Karl Pearson, propounded a number of statistical formulae under
the pen-name 'Student'.
R.A. Fisher is yet another notable contributor to the field of statistics.
His book 'Statistical Methods for Research Workers', published in
1925 marks the beginning of the theory of modern statistics.
The science of statistics also received contributions from notable
economists such as Augustin Cournot (1801 - 1877), Leon Walras
(1834 - 1910), Vilfredo Pareto (1848 - 1923), Alfred Marshall
(1842 - 1924), Edgeworth, A.L. Bowley, etc. They gave an applied form
to the subject.
Among the noteworthy Indian scholars who contributed to statistics
are P.C. Mahalanobis, V.K.R.V. Rao, R.C. Desai, P.V. Sukhatme, etc.
Statistics has been defined in different ways by different authors.
These definitions can be broadly classified into two categories. In the
first category are those definitions which lay emphasis on statistics as
data whereas the definitions in second category emphasize statistics
as a scientific method.
1.2 Definitions:
1.2.1 Statistics.
It is a tool in our hands to translate complex facts into simple and
understandable statements of fact. Statistics may be defined as the science
of collecting, organizing, summarizing, presenting, analyzing, and interpreting
numerical and non-numerical data for the purpose of assisting in making
more effective decisions.
1.2.2 Statistical Population. It is any large group or collection of all
possible items or objects with specified characteristics or aspects of
interest. Example: all the students in Cihan University represent the
student population. We can divide this population into two
populations, a girls' population and a boys' population. We denote the
size of any statistical population by "N". For example, the size of the
student population in Cihan University is N = 6000 students.
1.2.3 Sample. It is a part or subset of the population. We denote
the size of any sample by "n". For example, the first-year students in
class "A" / Accounting Department are a sample drawn or selected
from the student population in the Accounting Department / Cihan
University. The size of this sample is n = 45 students.
1.2.4. Data and information. "Figures and facts" in their original form,
before any statistical techniques are used to refine, process, or
summarize them.
This information may be collected through censuses, surveys,
experiments, the Internet, questionnaires and other sources.
1.2.5 Element. The element is the essential unit or individual in the

Population or in the Sample, on which data are collected. Example,
each student in class "A" / first year in Accounting Department is an
element.
1.2.6 Variables (Xi). A variable is any characteristic or aspect which
can vary from one individual "element" to another. For example:
age, income, height, color, blood group, temperature, high school
average and so on.
1.2.6.1. Discrete Variables: Variables which assume a finite or
countable number of possible values, usually obtained by
counting.
1.2.6.2. Continuous Variables: Variables (data) which can assume any
value within an interval, and hence an infinite number of possible
values, usually obtained by measurement.
1.2.7 Observations. Data are obtained by collecting measurements on
each variable for every element in the study. The set of measurements
collected for a particular element is called an observation.
1.3. Data Sources:
1.3.1 Experiments;
1.3.2 Sports;

1.3.3 Internet;
1.3.4 Universities Database;
1.3.5 Banks, companies, and other private-sector sources;
1.3.6 Governmental documents;
1.3.7 Libraries.
1.4. Functions of Statistics.
The following are the main functions of Statistics:
1.4.1 The most important function of statistics is to collect data.
Statistics simplifies a complex mass of data.
1.4.2 Statistics presents facts in definite form.
1.4.3 Statistics furnishes techniques of comparison. It facilitates

comparison between two or more variables relating to different
time and location.
1.4.4 Statistics helps in the formulation and testing of hypotheses.
Statistical methods are helpful in developing new theories and in
formulating and testing statistical hypotheses.
1.4.5 Statistics helps in forecasting of future events. The statistical

techniques such as Regression analysis may be used for forecasting
of future requirement.
1.4.6 Statistics studies the relationships. It helps in establishing

relationship between two or more variables relating to different
fields.
1.4.7 Statistics helps to make rational decisions.
1.4.8 Statistics provides techniques for drawing inferences.

1.5. Types of data: There are two types of data:

1.5.1 Qualitative Data:
When data are classified according to characteristics (aspects) or
attributes, we call them qualitative data. For example: rich and poor
persons; married and unmarried; honest and dishonest; good, very good
and excellent; blood group; all types of colors; yes and no; and so on.
1.5.2 Quantitative Data:

When the classification of data is based on figures, we call this data
Quantitative data. For example, Weight, Income, Temperature,
Height, High school average, distance and so on.
1.6. Sampling Methods:

There are many ways to collect a sample. The most commonly used
methods are:
1.6.1 Statistical Sampling Methods:
1.6.1.1. Simple Random Sampling: It is a method of selecting items from
a population such that every possible sample of specific size has equal
chance of being selected. In this case, sampling may be with or without
replacement.
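The difference between sampling with and without replacement can be sketched in a few lines of Python. The population of ID numbers 1 to 6000 and the sample size 45 simply reuse the sizes from the Cihan University examples; the ID numbers themselves and the seed are assumptions made only for illustration.

```python
import random

# Hypothetical population: ID numbers 1..6000 (N = 6000).
population = list(range(1, 6001))
n = 45  # sample size

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# Without replacement: no element can be selected twice.
sample_without = rng.sample(population, n)

# With replacement: the same element may be selected more than once.
sample_with = rng.choices(population, k=n)

print(len(sample_without), len(set(sample_without)))
print(len(sample_with))
```

In the first case all 45 selected IDs are necessarily distinct; in the second case repeats are possible.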
1.6.1.2. Stratified Random Sampling: It is obtained by selecting simple
random samples from strata (mutually exclusive sets). Some of the
criteria for dividing a population into strata are: gender (male,
female); age (under 18, 18 - 28, 29 - 39).
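As a minimal sketch of stratified random sampling, the Python code below splits a hypothetical population into strata and draws a simple random sample from each one. The records, the stratum sizes, the function name stratified_sample and the choice of taking the same fraction from every stratum (proportional allocation) are all assumptions for illustration, not part of the text.

```python
import random

# Hypothetical population of (id, gender) records; stratum sizes are assumed.
population = [(i, "female") for i in range(1, 41)] + \
             [(i, "male") for i in range(41, 101)]

def stratified_sample(records, key, fraction, rng):
    """Draw a simple random sample of the given fraction from each stratum."""
    strata = {}
    for rec in records:
        strata.setdefault(key(rec), []).append(rec)
    sample = []
    for members in strata.values():
        k = round(fraction * len(members))   # assumed proportional allocation
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(7)  # fixed seed only so the sketch is repeatable
sample = stratified_sample(population, key=lambda rec: rec[1],
                           fraction=0.10, rng=rng)
print(len(sample))
```

With 40 and 60 records in the two strata and a fraction of 0.10, the sample contains 4 and 6 records respectively.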
1.6.1.3. Cluster Sampling: It is a simple random sample of groups or
clusters of elements. Cluster sampling is useful when it is difficult or
costly to generate a simple random sample. For example, to estimate
the average annual household income in a large city we use cluster
sampling, because to use simple random sampling we need a
complete list of households in the city from which to sample. To use
stratified random sampling, we would again need the list of
households.
A less expensive way is to let each block within the city represent a
cluster.
1.6.2 Non-statistical Sampling Methods:
1.6.2.1. Judgment Sampling: In this case, the person taking the
sample has direct or indirect control over which items are selected
for the sample.
1.6.2.2. Quota Sampling: The decision maker requires the sample to
contain a certain number of items with a given characteristic. Many
political polls are, in part, quota sampling.
1.7. Levels of Measurement:

Measurements are classified according to the highest level that they fit.
Each additional level adds something that the previous level did not have.
There are four levels of measurement:
1.7.1. Nominal is the lowest level. Only names are meaningful here.
1.7.2. Ordinal adds an order to the names.
1.7.3. Interval adds meaningful differences.
1.7.4. Ratio adds a zero so that ratios are meaningful.
1.8. Exercises
Exercise 1

Determine whether each of the following is Qualitative or
Quantitative Data: Temperature, Colors, Distances, Age, Weight,
Grades, Brands, Names, Blood-group, 55, Yes, No, Excellent,
Very-good, 47, Black, Red, 20 Kilograms, Erbil, and University.
Exercise 2
Describe types of data.
Exercise 3
Explain the Levels of Measurement.
Exercise 4
Explain the Simple Random Sampling.
Exercise 5
Where do you get data and information (Data Sources).
Exercise 6
List all functions of Statistics.
Exercise 7
Explain what is meant by the term population.
Exercise 8
Explain what is meant by the term sample.
Exercise 9
Explain how a sample differs from a population.
Exercise 10
Explain what is meant by the term sample data.

Chapter Two
Classification and Tabulation

of Data and Information
2.1 Introduction
The data collected in any statistical investigation, known as raw data,
are a complex and unorganized mass of figures. Therefore, it becomes
necessary to organize them in order to apply the statistical tools of
analysis and interpretation, and it is necessary that the data are
arranged in a definite form. This task is accomplished by the process
of "Classification and Tabulation". The important step in statistics is
the collection of data. This depends on the purpose for which the data
are required. It is the most important step because it is the foundation
of statistical investigation. Data are divided, as we mentioned before in
chapter one, into two types:
Type1: Quantitative data.
Type2: Qualitative data.
2.2. Summarizing Data.

We can summarize data, either Quantitative or Qualitative data, by
two tools or methods;
2.2.1 Frequency Distribution Table Method:-
The easiest method of Summarizing and organizing data is a frequency
distribution table, which converts raw data into a meaningful pattern
for statistical analysis. So, what is the Table? Any table consists of at
least two columns; column of frequency, and Column of Classes. This
Table is called a Frequency Distribution Table. What is the Frequency?
The frequency may be defined as the number of times a certain value,
item or object occurs.

2.2.1.1. Types of Frequency:
There are four types of frequency:
1. Ordinary Frequency (f i ): The number of times a certain value
occurs.
2. Relative Frequency distribution (R. f i ): It is the frequency in each
class divided by the total frequency, or simply the proportion of the
total number of items belonging to a class, i.e.
R. f i = frequency of the class / n = f i / n.
Note: The sum of all the relative frequencies must always be equal
to 1.00.
3. Percent Frequency distribution (P.f i ): It is the relative frequency
multiplied by 100%. P.f i = R.f i * 100%
Note: The sum of all the percent frequencies must always be equal
to 100. Relative frequency may be determined for both quantitative
and qualitative data and is a convenient basis for the comparison of
similar groups of different sizes.
4. Cumulative Frequency distribution.
The cumulative frequency distribution uses the number of classes,
class width, and class limits that are developed for the frequency
distribution. However, rather than showing the frequency of each
class, the cumulative frequency distribution shows the number of
items less than or equal to the upper class limit of each class.
Cumulative frequency may be divided into two types:
Type (1): Ascending Cumulative Frequency.
Type (2): Descending Cumulative Frequency.
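The four types of frequency can be computed from one list of ordinary frequencies. The following Python sketch uses a hypothetical list of frequencies (5, 10, 5) and a hypothetical function name, frequency_columns. The descending cumulative frequency is taken here as the number of items in the class and in all later classes, which is an assumption about the intended definition.

```python
from itertools import accumulate

def frequency_columns(freqs):
    """Return relative, percent, ascending and descending cumulative frequencies."""
    n = sum(freqs)                                  # total frequency
    relative = [f / n for f in freqs]               # R.f i = f i / n
    percent = [100 * f / n for f in freqs]          # P.f i = R.f i * 100
    ascending = list(accumulate(freqs))             # running total from the first class
    # Items in this class and all later classes (assumed definition).
    descending = [n - a + f for a, f in zip(ascending, freqs)]
    return relative, percent, ascending, descending

# Hypothetical ordinary frequencies for three classes.
relative, percent, ascending, descending = frequency_columns([5, 10, 5])
print(relative)    # proportions that sum to 1
print(percent)     # percentages that sum to 100
print(ascending)
print(descending)
```

The last ascending value and the first descending value both equal the total frequency n.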
2.2.1.2. What Frequency Distribution Tells Us?
1. It shows how the observations cluster around a central value.
2. It shows the degree of difference between observations.

3. A frequency distribution shows the number of data items in
each of several non-overlapping groups or classes. As we will
see in next chapters, frequency distribution is the basis for
probability theory.
2.2.2. Graphs Methods:-

Graph is a Statistical tool that can be used to describe data and
information.
2.3. Summarizing Quantitative Data.
Quantitative data can be summarized by two tools or methods.
2.3.1 Frequency distribution table method.
Constructing a Frequency Distribution Table:
To construct a Frequency Distribution Table, we must find the
column of classes and the column of frequencies.
The following steps help us to find the column of classes:
Step (1): Determine the number of classes.
A class is a group (category) of interest; each class consists of
two limits, a lower limit and an upper limit. For Example: 4-8, 6-12,
We need to specify the number of classes (K).
There is no accepted rule that tells us how many classes are to be used.
The number of classes is generally recommended to be between 5 and 20.
We suggest the following simple formula to find an approximate
number of classes (K):
$K = 2.5 \sqrt[4]{n}$ … 2.1
Where, n = sample size (number of values or items in the sample).
Step (2)
Determine the Width of Class (W).
Class Width (W) is
W=R/K … 2.2
Where
W = the number of values in each class
R = Range = Largest data value – Smallest data value
K = the number of classes.
Step (3)
Finding Class Limits:
Each class has two limits, lower limit and upper limit.
The limits could actually appear in the data and have gaps between
the upper limit of one class and the lower limit of the next.
Lower class limit = The smallest data value in the sample or less.
Upper class limit = Lower class limit + (W - 1).
Mid-Class (Mid-point): The number in the middle of the class.
It is found by adding the upper and lower limits for each class and
dividing by two.
Mid-Class = (Upper class Limit + Lower class Limit) / 2.
Example1
The number of items rejected daily by a manufacturer because of
defects was recorded for the 30 days. The results are shown below:
4 9 13 7 12 15 5 8 5 7 15 17 19 8 6
6 4 10 8 22 16 9 5 3 9 21 14 13 18 7
Construct:
1. Frequency distribution table.
2. Relative frequency,
3. Percent frequency,

4. Cumulative frequency,
5. Mid-class.
The Solution:
1- To find the frequency distribution table, we find,
Firstly: the number of Classes, by applying the following three steps;
Step (1): Approximate number of classes (K):
n = 30.
$K = 2.5 \sqrt[4]{30}$
K = 5.8508.
Step (2):
Width of the Class (W): From the data in Example1, the Largest
data value = 22 and the Smallest data value = 3. Hence, the range (R)
is,
R = 22 – 3 = 19, and we already have an approximate number of
classes K = 5.8508.
Therefore,
W=R/K
W = 19 / 5.8508 = 3.2474 ≈ 3, i.e. the width is rounded to the nearest whole number, 3.
Step (3) Class Limits: We can find the class limits as follows:
Lower class limit =the smallest data value in the sample or less = 3.
Upper class limit = Lower class limit +(W - 1 ) = 3+ (3 – 1) = 5.
We define the first class limits as (3 – 5), second class limits (6 – 8),
third class limits (9 – 11), fourth class limits (12 – 14), fifth class limits
(15 – 17), sixth class limits (18 – 20), and seventh class limits (21 –
23). The smallest data value, 3, is included in the (3 – 5) class.
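The three steps can be checked with a short Python sketch that applies formulas 2.1 and 2.2 to the data of Example 1. The function name suggest_classes is hypothetical, and rounding W to the nearest whole number is an assumption that matches the worked figures above.

```python
def suggest_classes(data):
    """Apply formulas 2.1 and 2.2 and build integer class limits."""
    n = len(data)
    k = 2.5 * n ** 0.25            # formula 2.1: K = 2.5 * fourth root of n
    r = max(data) - min(data)      # range R
    w = max(1, round(r / k))       # formula 2.2, rounded to a whole number (assumption)
    classes = []
    lower = min(data)              # first lower limit = smallest data value
    while lower <= max(data):
        upper = lower + (w - 1)    # upper limit = lower limit + (W - 1)
        classes.append((lower, upper))
        lower = upper + 1
    return k, w, classes

# Data of Example 1 (items rejected daily over 30 days).
data = [4, 9, 13, 7, 12, 15, 5, 8, 5, 7, 15, 17, 19, 8, 6,
        6, 4, 10, 8, 22, 16, 9, 5, 3, 9, 21, 14, 13, 18, 7]

k, w, classes = suggest_classes(data)
print(k, w)
print(classes)
```

For these data the sketch gives a width of 3 and the seven classes (3 – 5) through (21 – 23).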
Secondly: We find the frequency for each class.

Once the number of classes, class width, and class limits have been
determined, a frequency is obtained by counting the number of
data items belonging to each class. For example, the data in Example1
show that six values (4, 5, 5, 4, 5, and 3) belong to the class (3 –
5). Thus, the frequency of the class (3 – 5) is 6.
Continuing this counting process for the other classes provides the
following frequency distribution in Table 2.1:
Table 2.1
Frequency Distribution for items rejected daily
Class 3-5 6-8 9-11 12-14 15-17 18-20 21-23 Total
Frequency (f i ) 6 8 4 4 4 2 2 30
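The counting process behind Table 2.1 can be reproduced with a few lines of Python. This is only a sketch: the class limits are typed in by hand from Step (3), and the variable names are illustrative.

```python
# Data of Example 1 (items rejected daily over 30 days).
data = [4, 9, 13, 7, 12, 15, 5, 8, 5, 7, 15, 17, 19, 8, 6,
        6, 4, 10, 8, 22, 16, 9, 5, 3, 9, 21, 14, 13, 18, 7]

# Class limits (lower, upper) from Step (3).
classes = [(3, 5), (6, 8), (9, 11), (12, 14), (15, 17), (18, 20), (21, 23)]

# Count how many data values fall inside each class, limits included.
freqs = [sum(lo <= x <= hi for x in data) for lo, hi in classes]

for (lo, hi), f in zip(classes, freqs):
    print(f"{lo}-{hi}: {f}")
print("Total:", sum(freqs))
```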
2- Recall that the relative frequency (R. f i ) is simply the proportion

of the total number of items belonging to a class. For the data set
having n =30 items; the relative frequency of the first class (3 – 5) is:
R. f i = 6/30=0.200.
3- Recall that the Percent Frequency (P.f i ), is the relative frequency
multiplied by 100%.
P.f i = R.f i *100%
Then, the Percent frequency of the first class (3 – 5) is:
P.f i = 0.200000 * 100 = 20.0000
Based on the class frequencies in Table 2.1 and with n = 30,
Table 2.2 shows the relative frequency distribution and percent
frequency distribution for the items rejected daily data.

Table 2.2
Relative and Percent frequency Distribution for items rejected
Class (items rejected daily)   Frequency (f i )   Relative frequency (R. f i )   Percent frequency (P. f i ) %
3-5 6 0.200000 20.0000
6-8 8 0.266667 26.6667
9-11 4 0.133333 13.3333
12-14 4 0.133333 13.3333
15-17 4 0.133333 13.3333
18-20 2 0.066667 6.6667
21-23 2 0.066667 6.6667
Total 30 1.000000 100.0000
Note that the sum of all the relative frequencies must always be
equal to 1.00 and the sum of all percent frequencies must always be
equal to 100. In the above example, we see that on 20% of the days
between 3 and 5 items were rejected, and on 6.6667% of the days
between 18 and 20 items were rejected.
4- Recall the Cumulative Frequency; the cumulative frequency
distribution shows the number of items with values less than or equal
to the upper class limit of each class.
Table 2.3 provides the Ascending and Descending Cumulative
frequency distribution for items rejected daily data.

Table 2.3
Ascending and Descending Cumulative Frequency Distributions
for Items Rejected Daily
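Items rejected daily         Frequency (f i )   Ascending Cumulative Frequency   Descending Cumulative Frequency
Less than or equal to 5      6                  6                                30
Less than or equal to 8      8                  14                               24
Less than or equal to 11     4                  18                               16
Less than or equal to 14     4                  22                               12
Less than or equal to 17     4                  26                               8
Less than or equal to 20     2                  28                               4
Less than or equal to 23     2                  30                               2
Total                        30                 -                                -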
5- Mid-Class (Mid-point): The number in the middle of the class.

It is found by adding the upper and lower limits for each class and
dividing by two.
Mid-Class = (Upper class Limit + Lower class Limit) / 2.
Calculating Mid-class for the data in Example 1 for items rejected.
The Solution:
Let X i = Mid-Class
X 1 = (3 + 5) /2 = 4
X 2 = (6 + 8) /2 = 7
X 3 = (9 +11) /2 =10.

And so on, the total results are listed in Table 2.4:
Table 2.4
Mid- Class - Items rejected
Class 3-5 6-8 9-11 12-14 15-17 18-20 21-23
Freq.(f i ) 6 8 4 4 4 2 2
Mid-Class 4 7 10 13 16 19 22
Xi
Example 2
Suppose the age of 10 students is:
21, 18, 19, 21, 20, 25, 22, 23, 24, and 26.
Find: 1) Frequency distribution table,
2) Relative frequency,
3) Percent frequency,
4) Cumulative frequency and
5) Mid-class.
The Solution:
1- To find frequency distribution table, we find firstly the Classes, by

applying the following three steps:
Step (1): Approximate number of classes (K):
n =10.
𝟒
K =2.5* √𝟏𝟎
K = 4.4456
Step (2): Width of the Class (W).

From the data in this Example, the Largest data value= 26, and the
Smallest data value= 18. Hence, the range (R) is:
R = 26 – 18 = 8
and we already have an approximate number of classes K = 4.4456,
therefore,
W=R/K
W = 8 / 4.4456 = 1.7995 ≈ 2,
which is rounded up to 2.
Step (3): Class Limits:
We can find the class limits as follows:
Lower class limit = The smallest data value in the sample or less = 18.
Upper class limit = Lower class limit + (W - 1) = 18 + (2 – 1) = 19.
So the first class is (18-19), the second class is (20-21), and so on.
Continuing this counting process for the other classes provides the
following frequency distribution Table 2.5:
Table 2.5
Frequency distribution table for age of 10 students
Class 18-19 20-21 22-23 24-25 26-27 Total
Freq. f i 2 3 2 2 1 10
2- Recall that the relative frequency is simply the proportion of the

total number of students belonging to a class.
For the data set having n=10 items; the relative frequency of the first
class (18 – 19) is: R. f i = 2/10=0.2
3- Recall that the Percent Frequency (P.f i ).
It is the relative frequency multiplied by 100%.
The Percent frequency of the first class (18 – 19) is:
P.f 1 = 0.2 *100% =20

Based on the class frequency in Table 2.5 and with n =10.
Table 2.6 shows the relative frequency distribution and percent

frequency distribution for the student's age data.
Table 2.6
The relative and percent frequency distributions
for age of 10 students.
Class    Frequency (f i )   Relative frequency (R.f i )   Percent frequency (P.f i ) %
18-19    2                  2/10=0.2                      0.2*100=20
20-21    3                  3/10=0.3                      0.3*100=30
22-23    2                  2/10=0.2                      0.2*100=20
24-25    2                  2/10=0.2                      0.2*100=20
26-27    1                  1/10=0.1                      0.1*100=10
Total    10                 1.00                          100.00
Note that the sum of the relative frequency column must always be
equal to 1.00 and the sum of the percent frequency column must
always be equal to 100, as in the above example.
4- Calculate Ascending and Descending Cumulative Frequencies.
Table 2.7 shows the frequency distribution and Ascending and
Descending frequency distribution for the student's age data:

Table 2.7
Ascending and Descending
Frequency Distribution
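Student's Age                Frequency (f i )   Ascending Cumulative Frequency   Descending Cumulative Frequency
Less than or equal to 19     2                  2                                10
Less than or equal to 21     3                  5                                8
Less than or equal to 23     2                  7                                5
Less than or equal to 25     2                  9                                3
Less than or equal to 27     1                  10                               1
Total                        10                 -                                -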
5- Calculate Mid-Class for the data in Example 2.

The Solution:
Mid-class (1) = (18+19)/2 = 18.5,
Mid-class (2) = (20+21)/2 = 20.5, And so on…
The result in the following Table 2.8:

Table 2.8
Shows Mid-Class for age of 10 students
Class 18-19 20-21 22-23 24-25 26-27 Total
Freq. f i 2 3 2 2 1 10
Mid-Class 18.5 20.5 22.5 24.5 26.5
We can summarize the results of this Example in the following Table

2.9:
Table 2.9
Summarizing the results of Example 2
Student’s Class Frequency Relative Percent Ascending Descending Mid-
Age (f i ) Frequency Frequency Cumulative Cumulative class
(R.f i ) (P.f i ) % Frequency Frequency
≤ 19 18-19 2 2/10=0.2 0.2*100=20 2 10 18.5
≤ 21 20-21 3 3/10=0.3 0.3*100=30 5 8 20.5
≤ 23 22-23 2 2/10=0.2 0.2*100=20 7 5 22.5
≤ 25 24-25 2 2/10=0.2 0.2*100=20 9 3 24.5
≤ 27 26-27 1 1/10=0.1 0.1*100=10 10 1 26.5
Total Total 10 1.00 100.00
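The whole of Example 2 can be reproduced in one short Python sketch. The class limits are entered by hand from Step (3), and the descending cumulative frequency is computed as the number of students in the class and in all later classes, which agrees with the values listed in Table 2.9; the variable names are illustrative only.

```python
from itertools import accumulate

# Ages of the 10 students in Example 2.
ages = [21, 18, 19, 21, 20, 25, 22, 23, 24, 26]
classes = [(18, 19), (20, 21), (22, 23), (24, 25), (26, 27)]

n = len(ages)
freqs = [sum(lo <= x <= hi for x in ages) for lo, hi in classes]
relative = [f / n for f in freqs]                   # R.f i = f i / n
percent = [100 * f / n for f in freqs]              # P.f i = R.f i * 100
ascending = list(accumulate(freqs))                 # running total
descending = [n - a + f for a, f in zip(ascending, freqs)]
mid_class = [(lo + hi) / 2 for lo, hi in classes]   # (upper + lower) / 2

# One printed row per class: class, f, R.f, P.f, ascending, descending, mid-class.
for row in zip(classes, freqs, relative, percent, ascending, descending, mid_class):
    print(*row)
```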
2.3.2. Summarizing Quantitative Data by Graphs:-

A graph is a statistical tool that can be used to describe data and
information. There are many types of graphs; we choose the
Histogram, Frequency Polygon (line graph) and Ogive graphs for
this type of data.
2.3.2.1. Histogram: It is graph which displays data by using vertical

bars of various heights to represent frequencies on the Y-axis and
classes on horizontal axis.
Example 4
Draw Histogram For the following data:
Class 18-19 20-21 22-23 24-25
Frequency(fi) 2 3 2 2
The Solution:
Frequency on the y-axis and the class limits on the X-axis.
Figure 2.1 Histogram for data in example 4.
[Figure 2.1: histogram; X-axis: classes 18-19, 20-21, 22-23, 24-25; Y-axis: frequency]

2.3.2.2. Line graph.
The frequency is placed along the vertical axis or (y- axis) and the
mid-class (mid-points) is placed along the horizontal axis (X-axis).
These points are connected with lines.
Example 5
Draw a line graph from the following data.
Class 3-5 6-8 9-11 12-14 15-17 18-20 21-23
Freq.(f i ) 6 8 4 4 4 2 2
Mid-Class 4 7 10 13 16 19 22
Xi
The Solution:
We put mid-class on the X-Axis and frequency on the Y-Axis to get
the following graph.
Figure 2.2 line graphs.

[Figure 2.2: line graph; X-axis: mid-class 4, 7, 10, 13, 16, 19, 22; Y-axis: frequency]
2.3.2.3. Ogive: It is a curve for the cumulative frequency
distribution or the relative cumulative frequency distribution. We put
the cumulative frequency or relative cumulative frequency on the
vertical Y-axis and the mid-class on the horizontal X-axis.
Example 6
Draw an Ogive (curve line) for the Ascending and Descending
Cumulative Frequency for the following data.
Ascending and Descending Cumulative Frequencies.
Items rejected daily     ≤ 5   ≤ 8   ≤ 11   ≤ 14   ≤ 17   ≤ 20   ≤ 23
Ascending Cumulative     6     14    18     22     26     28     30
Descending Cumulative    30    24    16     12     8      4      2
The Solution
Put cumulative frequency on the vertical y-axis.
Put the class boundaries on the horizontal X-axis
Figure 2.3 Ogive for the data in Example 6.
[Figure 2.3: ascending and descending cumulative frequency curves; X-axis: items rejected daily (less than or equal to 5, 8, 11, 14, 17, 20, 23); Y-axis: cumulative frequency, 0 to 35]
2.4. Summarizing Qualitative Data

2.4.1. By Frequency Distribution Table:-
Another definition of frequency distribution is a tabular summary of
a set of data showing the frequency (or number) of items in each of
several non-overlapping classes or groups. The objective of
developing a frequency distribution Table is to provide insights
about the data that cannot be quickly obtained by looking only at the
original data. To see how frequency distribution can be used with
qualitative data, consider the following example:

Example7
We asked first year students in Accounting Dept./ Cihan University
about their Blood group. Their responses are listed below:
Data from a Sample of 50 Students
(AB)+ A+ B+ O- (AB)+ O- (AB)+ B+ O+ (AB)+
O+ B- (AB)+ A- O- A- A+ (AB)+ (AB)+ (AB)-
A+ B+ A+ (AB)+ B+ (AB)- B- B+ B+ O+
B+ B- A- (AB)- B- B- (AB)+ O+ A+ B+
(AB)- (AB)+ O+ B- A+ A+ B+ O+ A+ A+
Find:
1.Frequency distribution table.
2. Relative frequency,
3. Percent frequency.
4. Ascending Cumulative frequency
The solution:
1. To develop a frequency distribution table for these data, we
count the number of times each of the Blood groups appears in the
data set. (AB)+ appears 10 times, O+ appears 6 times, B+ appears 9
times, and so on. These counts are summarized in the frequency
distribution Table 2.11 below:
Table 2.11
Frequency Distribution of Blood group
Blood group (AB)+ O+ O- B+ B- A+ A- (AB)- Total
Frequency ( f i ) 10 6 3 9 6 9 3 4 50
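The counts in Table 2.11 can be verified by letting Python tally the 50 responses. In this sketch the group (AB)+ is written as AB+ and (AB)- as AB- purely for convenience of typing.

```python
from collections import Counter

# The 50 responses of Example 7, one string per row of the data listing.
rows = [
    "AB+ A+ B+ O- AB+ O- AB+ B+ O+ AB+",
    "O+ B- AB+ A- O- A- A+ AB+ AB+ AB-",
    "A+ B+ A+ AB+ B+ AB- B- B+ B+ O+",
    "B+ B- A- AB- B- B- AB+ O+ A+ B+",
    "AB- AB+ O+ B- A+ A+ B+ O+ A+ A+",
]
responses = " ".join(rows).split()

# Counter tallies how many times each blood group appears.
counts = Counter(responses)
for group in ["AB+", "O+", "O-", "B+", "B-", "A+", "A-", "AB-"]:
    print(group, counts[group])
print("Total", sum(counts.values()))
```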
We are often interested in knowing the proportion, or percentage, of

the data items in each group or class.
2. The relative frequency distribution is a tabular summary of a set
of data showing the relative frequency for each group or class.
3. The percent frequency is a tabular summary of a set of data
showing the percent frequency for each group or class.
We can use Table 2.11, to develop the relative and percent frequency
distribution for the student's Blood group data, as summarized in
Table 2.12.
We can see, In Table 2.12 that the relative frequency for O+ is:
6/ 50= 0.12, the relative frequency for (AB)+ is 10/50 = 0.2, and so on.
Multiplying each of the relative frequencies by 100 provides the
percent frequency distribution.
We find from these distributions, that on the basis of the sample
data, 20% of the students have (AB)+ Blood group, 18% have ( B+ )
Blood group, 18% have (A+ ) Blood group, 12% have ( O+ ) Blood
group, 12% have ( B-) Blood group, 8% have (AB)- Blood group,
6% have (O-) Blood group, and 6% have ( A- ) Blood group.
Computing the relative frequency, percent frequency and Ascending
Cumulative Frequency for the all Student's Blood group provides the
frequency distributions in Table 2.12.

Table 2.12
Relative, Percent and Ascending Cumulative
Frequency Distribution of Student's Blood Group
Blood group   Frequency (f i )   Relative Frequency (R.f i )   Percent Frequency (P.f i ) %   Ascending Cumulative Frequency
(AB)+ 10 0.20 20 10
O+ 6 0.12 12 16
O- 3 0.06 06 19
B+ 9 0.18 18 28
B- 6 0.12 12 34
A+ 9 0.18 18 43
A- 3 0.06 06 46
(AB)- 4 0.08 08 50
Total 50 1.00 100 -
2.4.2. Summarizing Qualitative Data by Graphs:

2.4.2.1. Bar Graphs:
Bar Graphs are used to describe the qualitative data.
It is a graph which displays the data by using vertical bars of various
heights to represent frequencies, relative frequency, or Percent

frequency distribution. On the horizontal axis of the graph, we
specify the labels that are used for each of the classes or groups. A
frequency, relative frequency, or percent frequency scale can be used
for the vertical axis of the graph. Then, using a bar of fixed width
drawn above each class or group label, we extend the height of the
bar until we reach the frequency, relative frequency, or percent
frequency of the class or group as indicated by the vertical axis.
Figure 2.4 is a bar graph of the frequency distribution for the 50
Student's Blood group Table 2.11.
Note: how the graphical presentation shows the Blood groups (AB)+,
B+, and A+ have the highest frequency.
Figure 2.4
Bar graph of Student's Blood group
[Figure 2.4: bar graph; X-axis: blood groups (AB)+, O+, O-, B+, B-, A+, A-, (AB)-; Y-axis: frequency (f i ), 0 to 12]
2.4.2.2. Pie Chart (Circle graph): It is a graphical device that describes
groups of qualitative data as slices of a pie or a circle. In other
words, it is a graphical device for presenting the relative frequency
distribution of qualitative data. To construct a pie chart, we
first draw a circle to represent all the data; then we use the relative
frequencies to sub-divide the circle into sectors, or parts, that
correspond to the relative frequency for each class.
For example, since there are 360 degrees in a circle and since (AB)+
has a relative frequency of 0.20, the sector of the Pie chart labeled
(AB)+ should consist of 0.20* 360= 72 degrees. It means that the
relative frequency determines the size of the slice. In other words the
number of degrees in any slice is the relative frequency times 360
degrees, i.e the number of degrees in any slice = R.f i * 360. Similar
calculations for the other classes yield the following frequency
distribution Table 2.13 and the Figure (2.5) Pie chart:
Table 2.13
The number of degrees in any slice
(Part) of Student's Blood group
Blood Relative Number of
group Frequency degrees
(R.f i ) = R.f i *360
(AB)+ 0.20 0.20 x360=72.0
O+ 0.12 0.12 x360=43.2
O- 0.06 0.06 x360=21.6
B+ 0.18 0.18 x360=64.8
B- 0.12 0.12 x360=43.2
A+ 0.18 0.18 x360=64.8
A- 0.06 0.06 x360=21.6
(AB)- 0.08 0.08 x360=28.8

Total 1.00 360.0
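The slice sizes in Table 2.13 follow directly from the frequencies in Table 2.11. The Python sketch below computes the number of degrees as 360 * f i / n, which is the same quantity as R.f i * 360 written so that the division is done last; the variable names are illustrative.

```python
# Frequencies from Table 2.11.
freqs = {"(AB)+": 10, "O+": 6, "O-": 3, "B+": 9,
         "B-": 6, "A+": 9, "A-": 3, "(AB)-": 4}
n = sum(freqs.values())   # total frequency, 50

# Degrees of each slice = relative frequency * 360 = 360 * f i / n.
degrees = {group: 360 * f / n for group, f in freqs.items()}

for group, d in degrees.items():
    print(f"{group:6s} {d:6.1f}")
print(f"Total  {sum(degrees.values()):6.1f}")
```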
Figure (2.5) Pie Chart of Student's Blood group
[Figure 2.5: pie chart titled "Student's Blood Group"; slices: AB+ 0.20, O+ 0.12, O- 0.06, B+ 0.18, B- 0.12, A+ 0.18, A- 0.06, AB- 0.08]
2.5 Exercises
Exercise 1
Define the following; Element, Variable, Sample, Statistical
Population, types of data.
Exercise 2
Consider the following data
Good Very Good Excellent Good Very Good
Good Very Good Good Excellent Very Good
Excellent Very Good Good Excellent Good
Very Good Excellent Good Good Excellent

Excellent Very Good Good Excellent Good
Very Good Excellent Good Good Excellent
Find:
1-Frequency distribution Table
2-Relative frequency and Percent frequency distribution
3-Ascending and Descending Cumulative frequency
Exercise 3
Consider the following data:
Apple L.G H.P IBM Sony Dell L.G H.P IBM Sony L.G
H.P IBM Sony Apple L.G H.P IBM Sony Apple L.G
H.P Apple IBM L.G H.P IBM Sony Apple Dell L.G H.P
Dell L.G H.P Dell L.G H.P Dell Sony
Find:
1.Frequency distribution Table,
2. Relative frequency and Percent frequency distribution,
3. Ascending and Descending Cumulative frequency.
Exercise 4
the data below shows the different types of blood group for 40 people,
where: 1 =Type A+ blood group. 2 = Type O+ blood group.
3 = Type B+ blood group. 4 = Type (AB)+ blood group.
3 4 4 3 2 3 4 3 1 4 4 2 4 3 1 4 4 2 4 4
2 3 2 3 3 2 3 2 1 3 2 3 3 4 1 4 2 3 4 1
1. Construct a frequency distribution table.

2. Display the results in Bar chart.
Exercise 5
the number of items rejected daily by a manufacturer because of defects
was recorded for the 30 days. The results are as follows:
4 9 13 7 12 15 5 8 5 7 15 17 19 8 3
4 10 8 22 16 9 5 3 9 21 14 13 18 7 5
1. Find a frequency distribution table,

2. Compute Relative and Percent, frequency distribution.
3. Compute Ascending and descending cumulative frequency
distribution.
Exercise 6
suppose you have the following frequency distribution table:
Group A B C D E Total
Freq. ( f i ) 10 25 40 30 15 120
Draw:
1. Bar –Graph.
2. Pie –Chart (Circle- graph)
Exercise 7
suppose you have the following data
Class 3-5 6-8 9-11 12-14 15-17 18-20 21-23 Total
Frequency 7 7 4 4 4 2 2 30
Find:

1. the relative, Percent, and Cumulative frequency distribution.
2. Draw O-give (Curve line).
Exercise 8
The data below shows the death due to variety of causes for 45 people,
Where 1 = heart disease 2 = cancer 3 = accidents 4 = other.
2 3 3 4 1 4 2 3 4 1 3 2 4 3 4
3 4 4 3 2 3 4 3 1 4 4 2 3 1 4
2 4 4 2 3 2 2 3 2 2 3 3 4 1 4

2. Find the Relative and the Percent frequency distribution.
Exercise 9
suppose you have the following table:
R.f i 0.15 0.25 0.30 --- 0.12
1. Calculate (R.f i ) of group(D) .

2. If the sample size n=400, find (f i ) and (P.f i ).
3. Draw Pie–Chart.
Exercise 10
suppose you have the following data:
R.f i 0.25 0.20 0.30 0.15 ------- -------

10

1. What is the (R.f i ) of group (E) .
2. The Sample size n=500, find (f i ) and (P.f i ).
Exercise 11
suppose you have the following data:
Freq. (f i ) 25 20 45 ---- 30 130
Draw Pie- Chart to summarize these groups.
Exercise 12
Consider the following table:
Class 6-10 11-15 16-20 21-25 26-30 Total
Freq. f i 20 25 ------- 15 10 120

2 5
Find:
1.The (f 3 ) of class (16-20).
2. The Width of the class (W).
3. The Range (R).
Exercise 13
Class 5-10 11-16 17-22 23-28 29-34 Total
Freq. f i 20 25 50 15 10 120

Find:
1.The Ascending and Descending Cumulative frequency distribution
2. Draw O-give graph.
Exercise14: consider the following table:
Class 3-6 7-10 11-14 15-18 19-22 Total
fi 25 35 55 --- 10 150
Find:
1. The (f i ) of class (15-18).
2. The Width of the class (W).
3. The Range (R).
Exercise 15:
Consider the following table
Class 3-6 7-10 11-14 15-18 19-22 Total
fi 25 35 55 25 10 150
Find:
1.The Ascending and Descending Cumulative frequency distribution.
2. Draw O-give graph.
Exercise16:
Consider the following data.
14 21 23 21 16
19 22 25 16 16

24 24 25 19 16
20 23 16 20 19
24 26 15 22 24
20 22 24 22 20
1. Develop a frequency distribution using classes of

12-14, 15-17, 18-20, 21-23, and 24-26.
2. Develop a relative frequency distribution and a percent frequency
distribution using the classes in part 1.
Exercise17
Researcher distributes questionnaires to ask customers how they rate
the server, food quality, cocktails, prices, and atmosphere at the
restaurant. Each characteristic is rated on a scale of outstanding (O),
very good (V), good (G), average (A), and poor (P).
G O V G A O V O V G O V
V O P V O G A O O O G O
V A G O V P V O O G O O
O G A O V O O G V A G G
2. Display the results in Bar chart, and Pie Chart.
Exercise18:
Class Freq. fi R.fi P.fi Cumulative fi
10-20 60 20 60
21-31 30

---- 0.22 ----
43-53 18
54-64 10
Total 300
1. Complete the Table.

2. Find: K= Number of class and W =Width of class.
3. Find Mid class.
4. Draw Dot-Plot.
Exercise19:
Suppose you have the following Data:
Class Freq. fi Rfi Pfi Cumulative fi
5-10 22
11-16 140 28
---- 0.20 ----
----- 17
29-34 13
Total 500 100
1. Complete the Table. 2. Find: K= Number of class and

W =Width of class. 3. Find Mid class. 4. Draw Dot-Plot.
Exercise20
consider the following data.

8.9 10.2 11.5 7.8 10.0 12.2 13.5 14.1 10.0 12.2
9.5 11.5 11.2 14.9 7.5 10.0 6.0 15.8 11.5
1- Construct a dot plot.
2- Construct a frequency distribution.
3- Construct a percent frequency distribution.
Exercise21
A doctor’s office staff studied the waiting times for patients who arrive
at the office with a request for emergency service. The following data
with waiting times in minutes were collected over 20 days.
2 5 10 12 4 4 5 17 11 8
9 8 12 21 6 8 7 13 18 4 3
Use classes of 0-4, 5-9, and so on in the following:
Show the frequency distribution. Show the relative frequency
distribution.

Chapter Three
Measures of Central Tendency
(Measures of Location)
3.1 Introduction
Summarization of the data is a necessary function of any statistical
analysis. As a first step in this direction, the huge mass of data is
summarized in the form of tables and frequency distributions. In
order to bring the characteristics of the data into sharp focus, these
tables and frequency distributions need to be summarized further.
A measure of central tendency, or average, is an essential summary
measure in any statistical analysis. An average is a single value that
can be taken as representative of the whole distribution.
3.2 Mean
Before the discussion of the mean, we shall introduce certain notations.
Consider n observations whose values are denoted by X1, X2, ..., Xn
respectively. The sum of these observations, X1 + X2 + ... + Xn, will
be denoted in abbreviated form as ∑ Xi, where ∑ (called sigma) denotes
the summation sign.
The subscript of X, i.e., 'i', is a positive integer which indicates the
serial number of the observation. Since there are n observations, i
varies from 1 to n. When there is no ambiguity in the range of
summation, this indication can be skipped and we may simply write
X1 + X2 + ... + Xn = ∑ Xi.
The Mean is the most important numerical measure of location. It is
obtained by adding all the data values and dividing by the total
number of values.
The Mean can either be a population mean (denoted by µ) or a
sample mean (denoted by X̄).
The formula for the population mean:
µ = ∑ Xi / N … 3.1
The formula for the sample mean:
X̄ = ∑ Xi / n … 3.2
Properties of the mean

1. Uniqueness: For a given set of data there is one and only one
mean.
2. Simplicity: The mean is easy to calculate.
3. Affected by extreme values: The mean is influenced by each
value.
Calculate sample Mean.
Assume there are n observations: X1, X2 , ... , Xn.
The mean can be calculated by the previous formula which shows
how the Mean is computed for a sample with size n, where n =
Sample Size (Number of items in the Sample).
In this formula, the numerator is the sum of all data values.
That is,
X̄ = ∑ Xi / n = (X1 + X2 + ... + Xn) / n
To illustrate the computation of the sample Mean, let us consider the
following examples:
Example 1
Consider the following data:
10 20 25 30 35 28 22 15 18 12

Compute: The Mean.
The Solution:
We use the formula of the Sample Mean to get:
X̄ = ∑ Xi / n = (X1 + X2 + ... + X10) / 10
X̄ = (10 + 20 + 25 + 30 + 35 + 28 + 22 + 15 + 18 + 12) / 10 = 215 / 10 = 21.5
The Sample Mean is 21.5.
Example 2
We asked 30 first year students, in Accounting Department at Cihan

University, about their age. The following are their responses:
19 21 20 21 22 18 23 24 20 19 24 23 25 19 21
21 23 22 18 24 19 18 20 22 20 22 19 19 21 23
Compute the Mean.

The Solution: The Mean age of the students is computed as follows:
X̄ = ∑ Xi / n = (X1 + X2 + ... + X30) / 30
= (19 + 21 + 20 + ... + 21 + 23) / 30 = 630 / 30 = 21 years.
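The two sample means above can be cross-checked with a few lines of Python. This is only a minimal sketch of formula 3.2. The helper name sample_mean is an illustrative assumption and is not part of the text.

```python
# Sketch of formula 3.2: sample mean = (sum of the values) / (number of values).
def sample_mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# Data of Example 1
example_1 = [10, 20, 25, 30, 35, 28, 22, 15, 18, 12]

# Ages of the 30 students in Example 2
ages = [19, 21, 20, 21, 22, 18, 23, 24, 20, 19, 24, 23, 25, 19, 21,
        21, 23, 22, 18, 24, 19, 18, 20, 22, 20, 22, 19, 19, 21, 23]

print(sample_mean(example_1))  # 21.5
print(sample_mean(ages))       # 21.0
```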

3.3 Median
The Median is another measure of central location for data.
The median of a distribution is the value of the variable which divides
it into two equal parts. The median is a positional average because its
value depends upon the position of an item and not on its magnitude.
Determination of Median:
The following steps are involved in the determination of the median:
1. The given observations are arranged in either ascending or
descending order of magnitude.
2. Given that there are n observations, the median is given by:
a. The size of the {(n+1)/2}th observation, when n is odd.
b. The mean of the sizes of the (n/2)th and {(n/2)+1}th
observations, when n is even.
Example 3
Find median of the following observations:
20, 15, 25, 28, 18, 16, 30.
The Solution: Writing the observations in ascending order, we get 15,
16, 18, 20, 25, 28, 30. Since n = 7, i.e., odd, the median is the size of the
(7+1)/2 = 4th observation. Hence the median, denoted by Md, is 20.
Note: The same value of Md will be obtained by arranging the
observations in descending order of magnitude.
Example 4
Find the median of the following data:
245, 230, 265, 236, 220, 250.
The Solution:
Arranging these observations in ascending order of magnitude, we
get 220, 230, 236, 245, 250, 265.
Here n = 6, i.e., even.
The median will be the mean of the sizes of the 6/2 = 3rd and [(6/2) + 1] =
4th observations. Hence Md = (236 + 245) / 2 = 240.5

To summarize: first we arrange the data in ascending order, so that there
are as many values below the median as above it.
If there is an odd number of data values, the Median is the value in the
middle.
If there is an even number of data values, the Median is the average of
the two middle values.
Let us apply this definition to calculate the Median of the following data
set:
4 3 6 8 10 2 5 11 13 16 15
We arrange these 11 data values in ascending order to get
2 3 4 5 6 8 10 11 13 15 16
Since n = 11 is an odd number, the Median is the middle (6th) value, 8.
Suppose we also compute the median of the students' ages in Example 2.
We arrange the 30 data values in ascending order.
18 18 18 19 19 19 19 19 19 20 20 20 20 21 (21
21) 21 21 22 22 22 22 23 23 23 23 24 24 24 25
Since n = 30 is an even number, we identify the two middle data values
(the 15th and 16th). The average of these two values is the
Median = (21 + 21) / 2 = 21.
Although the mean is the more commonly used measure of central
location, there are some situations in which the Median is preferred.
Whenever there are extreme data values, the Median is often the
preferred measure of central tendency.
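The rule for odd and even n can be sketched in Python as follows. The function name median and the data set with_outlier are illustrative assumptions. The outlier data are hypothetical and were made by replacing the value 30 of Example 3 with 300, to show why the median is preferred when extreme values are present.

```python
# Sketch of the median rule: sort, then take the middle value (n odd)
# or the average of the two middle values (n even).
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # the {(n+1)/2}th observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # mean of (n/2)th and {(n/2)+1}th

print(median([20, 15, 25, 28, 18, 16, 30]))     # Example 3 -> 20
print(median([245, 230, 265, 236, 220, 250]))   # Example 4 -> 240.5

# Hypothetical data: Example 3 with 30 replaced by the extreme value 300.
with_outlier = [20, 15, 25, 28, 18, 16, 300]
print(median(with_outlier))                     # still 20
print(sum(with_outlier) / len(with_outlier))    # the mean is pulled far above 20
```

The median is unchanged by the extreme value, while the mean moves well away from the bulk of the data.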
3.4 The Mode

The concept of mode, as a measure of central tendency, is
preferable to mean and median when it is desired to know the most

typical value, e.g., the most common size of a ready-made garment, the
most common size of income, the most common size of pocket
expenditure of a college student, the most common size of a family in
a locality, the most common duration of cure of viral-fever, the
most popular candidate in an election, etc.
How to Find the Mode: The mode is the most frequent data value. Or
the Mode is the data value that occurs with greatest frequency. The
mode is an important measure of location for qualitative data. There
may be no mode if no one value appears more than any other. There
may also be two modes, three modes, or more than three modes.
Example 5
Compute mode of the following data:
3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11
The Solution:
Writing this in the form of a frequency distribution, we get
Values: 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Frequency: 2 1 1 1 1 1 3 4 2 1 1 1 1 1 1 1 1 1
Mode = 10
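The frequency count used in this solution can be sketched in Python with collections.Counter. The function name modes is an illustrative assumption. It returns a list so that the cases of no mode and of several modes described above can both be represented.

```python
from collections import Counter

# Sketch: tally how often each value occurs and keep the value(s)
# with the greatest frequency.
def modes(values):
    counts = Counter(values)
    # If several distinct values all occur equally often, no value is
    # "most frequent", so we report that there is no mode.
    if len(counts) > 1 and len(set(counts.values())) == 1:
        return []
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

example_5 = [3, 4, 5, 10, 15, 3, 6, 7, 9, 12, 10, 16, 18,
             20, 10, 9, 8, 19, 11, 14, 10, 13, 17, 9, 11]

print(modes(example_5))        # [10]
print(modes([1, 2, 3]))        # []  (no mode)
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (two modes)
```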
Example 6
Consider the following data set:
1 2 3 4 5 6 7 8 8 9 8 10 11 12 13 14 15
Find: 1- The Mean 2- The median 3- The Mode

The Solution:
1. The Mean: X̄ = (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 8 + 9 + 8 + … + 15) / 17
= 136 / 17 = 8.
2. The Median: We arrange the values in ascending order.
1 2 3 4 5 6 7 8 (8) 8 9 10 11 12 13 14 15
Since n = 17 is odd, the median is the middle (9th) value, shown in
parentheses: Md = 8.
3. The Mode is equal to 8, the only value that occurs more than once.
3.5 Summary:
The Mean is used in computing other statistics (such as the variance). It
does not exist for open ended grouped frequency distributions. It is
often not appropriate for skewed distributions such as salary
information.
The Median is the middle value and is good for skewed distributions
because it is resistant to extreme values. The Mode is used to describe
the most typical case. The mode can be used with nominal data whereas
the others cannot. The mode may or may not exist, and there may be more
than one value for the mode.
3.6 Exercises
Exercise 1
The following are behavioral ratings as measured for 10 cases.
3 , 6 , 4 , 3 , 4 , 4 , 5 , 2 , 5, 4
Compute:
A) The mean ,
B) The median and
C) The mode.

Exercise 2
The number of sick days due to colds and flu last year was recorded by
a sample of 15 adults.
The data are: 5 7 1 3 15 6 5 8 3 8 10 5 2 1 11
Compute: A- The Mean B- The Median C- The Mode.
Exercise 3
Suppose the following data represent 25 cell phone prices sold in the
Erbil area. Data are in dollars ($).
119 121 120 118 119 118 119 120 121 117 125 128
121 118 129 128 121 119 124 121 123 127 130 129 121
Compute: a) The Mean b) The Median c) The Mode.
Exercise 4
A sample of 20 college Lecturers showed the following hours taken
during the first semester, 2011-2012.
15 14 16 22 24 16 18 20 22 24
36 34 36 22 20 18 22 36 18 20
What are the Mean, Median and Mode?

Chapter Four
Measures of Variation (Dispersion)
4.1 Range: The range is the simplest measure of variation to find. It is
simply the highest (largest) value minus the lowest (smallest) value.
Range = Largest value – Smallest value
R = XL – XS … 4.1
Since the range only uses the largest and smallest values, it is greatly
affected by extreme values; that is, it is not a resistant measure.
Example1
Find the Range for the following data.
18 16 17 12 23 21 18 22 26 19 25
The largest value is 26 and the smallest value is 12, so
Range (R) = 26 – 12 = 14
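As a quick check of formula 4.1, the largest and smallest values of this data set can be found in Python. This is a minimal sketch and the variable names are illustrative. The smallest value in the list is 12.

```python
# Sketch of formula 4.1: R = XL - XS (largest value minus smallest value).
data = [18, 16, 17, 12, 23, 21, 18, 22, 26, 19, 25]

largest = max(data)
smallest = min(data)
data_range = largest - smallest

print(largest, smallest, data_range)  # 26 12 14
```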
4.2 Population Variance (σ2): It is the average of the squares of the
distances from the population mean. Or it is the sum of the squares of
the deviations from the mean divided by the population size (N) (the
number of values in the population).
The formula of the population Variance is
σ2 = ∑ (Xi - µ)2 / N … 4.2
where
σ2 = population variance
N = population size
µ = population mean
4.3 Population Standard Deviation (σ): It is the square root of the
population variance.
σ = √σ2 = √[ ∑ (Xi - µ)2 / N ] … 4.3
4.4 Sample Variance (S2): It is an unbiased estimator of the population
variance. Instead of dividing by the population size, the sum of the
squares of the deviations from the sample mean is divided by (n-1),
where n is the sample size.
Calculation of the sample variance:
S2 = ∑ (Xi - X̄)2 / (n-1) … 4.4
where
S2 = sample variance
Xi = individual value
X̄ = sample mean
n = number of values
Degrees of freedom.
There are (n – 1) degrees of freedom in computing the variance, because
if (n-1) values are known, the nth one is determined automatically.
This is because the deviations (Xi - X̄) must add to zero.
4.5 Sample Standard Deviation
The standard deviation is the square root of the variance.
The standard deviation expresses the dispersion in terms of the original
units. Since the variance of a sample is (S2), we take the square root:
S = √[ ∑ (Xi - X̄)2 / (n-1) ] … 4.5
or
S = √S2
Example 2
Consider the following data set:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Calculate: 1.The mean 2.Variance (S2) 3.The Standard deviation (S)
The Solution:
To solve a problem like this, we prepare the following table:
Table 4.1
Xi 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Total = 120
(Xi - X̄) -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 Total = 0
(Xi - X̄)2 49 36 25 16 9 4 1 0 1 4 9 16 25 36 49 Total = 280
Hence,
1. The Mean, X̄ = (1 + 2 + 3 + … + 14 + 15) / 15 = 120 / 15 = 8
2. The Variance, S2 = ∑ (Xi - X̄)2 / (n-1) = 280 / (15-1) = 20
3. The Standard deviation, S = √20 = 4.47213
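The table-based calculation can be mirrored step by step in Python. This is a sketch under the definitions of formulas 4.4 and 4.5. The variable names are illustrative, and the standard-library function statistics.variance (which also divides by n-1) is used only as a cross-check.

```python
import math
import statistics

# Data of this example: the integers 1 to 15.
data = list(range(1, 16))
n = len(data)

mean = sum(data) / n                            # 120 / 15
deviations = [x - mean for x in data]           # second row of the table
sum_sq = sum(d ** 2 for d in deviations)        # total of the third row
sample_var = sum_sq / (n - 1)                   # formula 4.4 (divisor n-1)
sample_sd = math.sqrt(sample_var)               # formula 4.5

print(mean, sum_sq, sample_var)   # 8.0 280.0 20.0
print(round(sample_sd, 4))        # 4.4721

# The deviations add to zero, which is why only n-1 of them are free.
print(sum(deviations))            # 0.0

# Cross-check against the standard library's sample variance.
assert math.isclose(sample_var, statistics.variance(data))
```

Dividing by n instead of n-1 would give the population variance of formula 4.2 (statistics.pvariance in the standard library).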
4.6. Coefficient of Variation (C.V)

It is the standard deviation divided by the mean, expressed as a percentage.
C.V = (S / X̄) * 100% … 4.6
Example 3
Find the coefficient of variation (C.V) for the data in Example 2.
The Solution:
Mean = 8, S= 4.47213, then the C.V can be calculated by:
C.V = (4.47213 / 8)*100% = 55.9016%.
Example 4
The following data represent the daily salaries (ID) paid to 15 employees
working in a construction company.
50 55 60 65 80 68 78 75
60 72 65 77 50 65 70
Calculate:
1. The Variance (S2).
2. The Standard deviation(S).
3. The Coefficient of Variation (C.V).
The Solution:
We can calculate the Variance (S2) , the Standard deviation(S) and The
Coefficient of Variation (C.V) from the data in the following Table 4.2:
Table 4.2

Xi 50 55 60 65 80 68 78 75 60 72 65 77 50 65 70 Total = 990
(Xi − X̄) -16 -11 -6 -1 14 2 12 9 -6 6 -1 11 -16 -1 4 Total = 0
(Xi − X̄)2 256 121 36 1 196 4 144 81 36 36 1 121 256 1 16 Total = 1306
Hence,
1.The Mean: 990/15 = 66
2.The variance: S2 = 1306 /14 = 93.2857
3.The standard deviation S = √ 93.2857 = 9.6585
4. C.V = ( 9.6585 / 66 ) * 100% = 14.6341 %
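The same four results can be reproduced with Python's standard statistics module. This is a minimal sketch and the variable names are illustrative. Note that statistics.stdev uses the divisor n-1, matching formula 4.4, and that the code keeps full precision for S, so the last digit of the C.V can differ slightly from a hand calculation that uses the rounded value 9.6585.

```python
import statistics

# Daily salaries of the 15 employees in this example.
salaries = [50, 55, 60, 65, 80, 68, 78, 75, 60, 72, 65, 77, 50, 65, 70]

mean = statistics.mean(salaries)           # 990 / 15
variance = statistics.variance(salaries)   # sample variance, divisor n-1
sd = statistics.stdev(salaries)            # square root of the sample variance
cv = sd / mean * 100                       # formula 4.6, in percent

print(mean)                 # 66
print(round(variance, 4))   # 93.2857
print(round(sd, 4))         # 9.6585
print(round(cv, 2))         # 14.63
```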
Example 5
Suppose you have the following 16 observations:
14 16 17 18 21 17 25 13
15 17 19 22 23 20 22 25
Calculate: 1.The Range, 2.The Standard deviation(S),

3.The Coefficient of Variation (C.V).
The Solution:
From the data in the Table 4.3, we get;
1. The Mean: 304/16 = 19

2. The variance: S2 = 210 /15 = 14
3. The standard deviation S = √ 14 = 3.74
4. C.V = ( 3.74 / 19 ) * 100% = 19.68 %
5. Range = max – min = 25 -13 = 12

Table 4.3
Xi 14 16 17 18 21 17 25 13 15 17 19 22 23 20 22 25 Total = 304
(Xi − X̄) -5 -3 -2 -1 2 -2 6 -6 -4 -2 0 3 4 1 3 6 Total = 0
(Xi − X̄)2 25 9 4 1 4 4 36 36 16 4 0 9 16 1 9 36 Total = 210
Example 6
The number of absence hours was recorded by a sample of 20 Students
as follows:
3 4 4 5 6 6 6 6 7 7
7 7 8 9 10 10 11 12 15 17
Calculate: 1.The Variance (S2), 2.The Standard deviation (S ).
3. Coefficient of variation (C.V).
The Solution:
From the data in Table 4.4, we get:
1. The Mean: 160/20 = 8
2. The variance: S2 = 250 / (20-1) = 250/19 = 13.1579
3. The standard deviation S = √13.1579 = 3.6274
4. C.V = (3.6274 / 8) * 100% = 45.34 %
4.7 Measure of Position
Standard Scores (z-scores): The standard score is obtained by
subtracting the mean from the value and dividing the difference by the
standard deviation. The symbol is Z, also called a Z-score.
Z = (X - µ) / σ or Z = (X - X̄) / S … 4.7
The mean of the standard scores is zero and their standard deviation is
one. This is the convenient feature of the standard score.
Example 7
Find the standard score (Z) for the data in Example 2.
The solution:
S = 4.47213 and the Mean = 8, so the standard score is
Z = (Xi - X̄) / 4.47213.
The standard score for each value is given in the following Table 4.5.
Table 4.5 The standard scores
Xi (Xi - X̄) Z
1 1-8=-7 -1.56525
2 2-8=-6 -1.34164
3 3-8=-5 -1.11803
4 4-8=-4 -0.89443
5 5-8=-3 -0.67082
6 6-8=-2 -0.44721
7 7-8=-1 -0.22361
8 8-8=0 0.00000
9 9-8=1 0.22361
10 10-8=2 0.44721
11 11-8=3 0.67082
12 12-8=4 0.89443
13 13-8=5 1.11803
14 14-8=6 1.34164
15 15-8=7 1.56525
Total = 120 0
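The property stated in section 4.7, that standard scores have mean zero and standard deviation one, can be checked numerically. The sketch below applies formula 4.7 with the sample mean and sample standard deviation to the integers 1 to 15. The variable names are illustrative, and the results are only exact up to floating-point rounding.

```python
import statistics

# Data of Example 2: the integers 1 to 15.
data = list(range(1, 16))

mean = statistics.mean(data)    # sample mean
s = statistics.stdev(data)      # sample standard deviation (divisor n-1)

# Formula 4.7: Z = (X - mean) / S for every observation.
z_scores = [(x - mean) / s for x in data]

z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)

# Up to rounding error, the mean of Z is 0 and its standard deviation is 1.
print(abs(z_mean) < 1e-12, abs(z_sd - 1) < 1e-12)
```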

Chapter Five
Mean, Variance, and Standard deviation
For Grouped (Classified) data.
5.1 The Mean for grouped (Classified) data (Weighted Mean)
The weighted mean X̄w, in this case, can be calculated as follows:
1. Find the sum of the products of each value (Xi) and its frequency or
weight (wi), i.e. (∑ Xi wi).
2. Find the sum of the frequencies (∑ wi).
3. Divide the sum of the products (∑ Xi wi) by the sum of the frequencies
(∑ wi):
X̄w = ∑ wi Xi / ∑ wi … 5.1
where both sums run over i = 1, ..., n.
Example 1
A student in Accounting Department/ Cihan University-Erbil has passed
his final exam/first semester and got the following marks:
Table 5.1
Subject Mark Unit
Principles of Accounting 65 5
Principles of Statistics 70 3
Principles of Management 73 3
Microeconomics 77 3

Financial Mathematics 60 3
Computer Skills 85 2
English Language 83 3
Calculate the Weighted Mean (Average).

The Solution:
We have to find the following table:
Table 5.2
Weighted Mean (Average)
Subject Mark (X i ) Unit(w i ) Xi * w i
Principles of Accounting 65 5 325
Principles of Statistics 70 3 210
Principles of Management 73 3 219
Microeconomics 77 3 231
Financial Mathematics 60 3 180
Computer Skills 85 2 170
English Language 83 3 249
Total 513 22 1584
Then, from Table 5.2, we find:
∑ wi = 22
∑ Xi wi = 1584
The Weighted Mean is: X̄w = ∑ wi Xi / ∑ wi = 1584/22 = 72.
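The three steps of formula 5.1 can be sketched in Python as follows. The function name weighted_mean is an illustrative assumption. The marks and units are those listed in Table 5.2.

```python
# Sketch of formula 5.1: weighted mean = sum(w * x) / sum(w).
def weighted_mean(values, weights):
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)                                  # step 2
    weighted_sum = sum(w * x for x, w in zip(values, weights))   # step 1
    return weighted_sum / total_weight                           # step 3

# Marks (Xi) and units (wi) as listed in Table 5.2.
marks = [65, 70, 73, 77, 60, 85, 83]
units = [5, 3, 3, 3, 3, 2, 3]

print(sum(units))                   # 22
print(weighted_mean(marks, units))  # 72.0
```

With equal weights the weighted mean reduces to the ordinary mean of section 3.2.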
Example 2
The following data represent first year Student’s Grade
Grades 50 60 74 80 54 70
Units (W i ) 2 3 4 3 2 3
Calculate The Weighted Mean.
The Solution:
We have to find the following table:
Table 5.3
Weighted Mean (Average)
Grades X i Units (w i ) X i *w i
50 2 100
60 3 180
74 4 296
80 3 240
54 2 108
70 3 210
Total 17 1134
X̄w = ∑ wi Xi / ∑ wi = 1134 / 17 = 66.71
5.2 The Variance (S2) for grouped data
S2 = { ∑ (Xi – X̄)2 * wi } / (∑ wi - 1) … 5.2
Example 3
Using data in Table 5.1 to calculate the Variance (S2 ).
The Solution:
Create the following Table 5.4
Table 5.4
Subject Mark (Xi) Unit (wi) (Xi – X̄) (Xi – X̄)2 (Xi – X̄)2 * wi
Principles of Accounting 65 5 -7 49 245
Principles of Statistics 70 3 -2 4 12
Principles of Management 73 3 1 1 3
Microeconomics 77 3 5 25 75
Financial Mathematics 60 3 -12 144 432
Computer Skills 85 2 13 169 338
English Language 83 3 11 121 363
Total 513 22 1468
X̄w = ∑ wi Xi / ∑ wi = 1584/22 = 72.
S2 = { ∑ (Xi – X̄)2 * wi } / (∑ wi - 1) = 1468 / (22 - 1) = 69.9047
5.3 The Standard deviation ( S ).
S = √ S2 … 5.3
Example 4
Use the data in Table 5.4 to calculate the Standard deviation (S).
The Solution:
S = √ S2 = √ 69.9047 = 8.3609
Example 5
Consider the following Table 5.5
Table 5.5
Class 3-6 7-10 11-14 15-18 19-22 Total
Wi 25 35 55 25 10 150
Calculate:
1.The weighted Mean,

2. The Variance, and
3- the Standard deviation.
The Solution: We compute the Mid-class (Xi) as follows:
X1 = (3+6)/2 = 4.5, X2 = (7+10)/2 = 8.5, and so on up to X5 = (19+22)/2 =
20.5, then create the following table:
Table 5.6
Weighted Mean, Variance, and Standard deviation
Class Wi Mid-Class (Xi) Xi * Wi (Xi – X̄) (Xi – X̄)2 (Xi – X̄)2 * Wi
3-6 25 4.5 112.5 -6.9333 48.0706 1201.7662
7-10 35 8.5 297.5 -2.9333 8.6042 301.1487
11-14 55 12.5 687.5 1.0667 1.1378 62.5816
15-18 25 16.5 412.5 5.0667 25.6714 641.7862
19-22 10 20.5 205.0 9.0667 82.2050 822.0504
Total 150 1715 3029.3331
From Table 5.6 we find:
∑ Wi = 150, ∑ Xi * Wi = 1715
1. The weighted Mean is: X̄ = ∑ Wi * Xi / ∑ Wi = 1715/150 = 11.43333.
2. The Variance is:
S2 = { ∑ (Xi – X̄)2 * Wi } / (∑ Wi - 1) = 3029.3331 / (150 - 1) = 20.3311
3. The Standard deviation is: S = √S2 = √20.3311 = 4.5090.
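The whole calculation for this grouped table (mid-classes, weighted mean, variance by formula 5.2, and standard deviation by formula 5.3) can be sketched in Python. The variable names are illustrative assumptions. Because the code keeps full precision, its intermediate values can differ from the hand-rounded table entries in the last decimal place.

```python
import math

# Class limits and frequencies (Wi) from Table 5.5.
classes = [(3, 6), (7, 10), (11, 14), (15, 18), (19, 22)]
freqs = [25, 35, 55, 25, 10]

# Mid-class Xi = (lower limit + upper limit) / 2
mids = [(low + high) / 2 for low, high in classes]

total_freq = sum(freqs)                                # sum of Wi
weighted_sum = sum(f * x for x, f in zip(mids, freqs)) # sum of Xi * Wi
mean = weighted_sum / total_freq                       # formula 5.1

# Formula 5.2: weighted squared deviations divided by (sum of Wi - 1).
sum_sq = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs))
variance = sum_sq / (total_freq - 1)
sd = math.sqrt(variance)                               # formula 5.3

print(mids)                       # [4.5, 8.5, 12.5, 16.5, 20.5]
print(total_freq, weighted_sum)   # 150 1715.0
print(f"{mean:.4f} {variance:.4f} {sd:.4f}")  # 11.4333 20.3311 4.5090
```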
