
MINISTRY OF EDUCATION

DIPLOMA IN
INFORMATION COMMUNICATION
TECHNOLOGY
KENYA INSTITUTE OF CURRICULUM DEVELOPMENT
STUDY NOTES

Quantitative Methods

MODULE 2: SUBJECT NO 5

Contents
CHAPTER 1: DATA COLLECTION AND PRESENTATION.......................................................9
Introduction to Data Collection.........................................................................................................9
Basis for data collection.....................................................................................................................9
Statistical units.................................................................................................................................11
Data sources and types....................................................................................................................11
Collection methods and limitations................................................................................................13
Data classification...........................................................................................................................16
Reasons for Data Classification...................................................................................................16
Bases of Classification:................................................................................................................16
Types of Classification:................................................................................................................17
The Data Classification Process...................................................................................................17
Classification rule........................................................................................................................17
Testing classification rules by function........................................................................................18
Data tabulation...............................................................................................................................18
Definitions and parts of table.....................................................................................................18
Types of Tabulation:....................................................................................................................20
Application/uses of Tabulation...................................................................................................20
Difference between Classification and Tabulation..........................................................................21
Diagrammatic and graphic Data presentation.................................................................................21
Introduction to data presentation in Diagrams and Graphs........................................................21
Types of construction diagrams..................................................................................................22
Types of construction graphs......................................................................................................23
Interpretation of diagrams and graphs.......................................................................................23
CHAPTER 2: MEASURES OF CENTRAL TENDENCY............................................................29
Definition of measures of central tendency....................................................................................29
Properties /Characteristics of a Good Average...............................................................................29
Calculation and interpretation of Central Tendency.......................................................................30
CHAPTER 3: MEASURE OF DISPERSION................................................................................37
Introduction & Terminology............................................................................................................37
Characteristics & Objectives of Dispersion......................................................................................37
Absolute and relative measures of dispersion.................................................................................37
Types of measures of dispersion......................................................................................................38
Interpreting skewness and kurtosis.................................................................................................41
CHAPTER 4: CORRELATION AND REGRESSION...................................................................43
Introduction.....................................................................................................................................43
Computation of parameters related to correlation..........................................................................43

Correlation coefficient.................................................................................................................43
Looking at data: dependent, independent Variable & the scatter diagrams................................44
Calculation of the correlation coefficient.....................................................................................45
Significance test..........................................................................................................................47
Spearman rank correlation.........................................................................................................48
The regression equation.................................................................................................................49
Uses of Correlation and Regression................................................................................................54
Mathematical model and regression model...................................................................................55
Mathematical model...................................................................................................................55
Regression model.......................................................................................................................57
Principles of least square method...................................................................................................58
Normal equations...........................................................................................................................60
Solve normal equations to obtain the regression equation............................................................61
Using regression equation to forecast................................................................................61
Assumptions made in linear regression..........................................................................................62
CHAPTER 5: TIME SERIES ANALYSIS......................................................................................63
Introduction to time series.............................................................................................................63
Basic Objectives of the Analysis..................................................................................................63
Types of Models..........................................................................................................................63
Important Characteristics to Consider First.................................................................................63
The Components of Time Series.....................................................................................................64
Types of Time Series Data...............................................................................................................65
Time Series Models.........................................................................................................................65
Trend & Measurement Methods....................................................................................................67
Seasonal Trend............................................................................................................................67
Cyclic Trends...............................................................................................................................68
Random Trends...........................................................................................................................68
Fitting Trend Lines to Time Series Plots.......................................................................................69
Smoothing Time Series....................................................................................................................71

Moving Average Smoothing.........................................................................................................71
Median Smoothing (or Moving Medians)....................................................................................75
Seasonal Adjustments or Deseasonalisation...................................................................................78
More about Seasonal Indexes..........................................................................................................81
Extrapolation of past and future values...........................................................................................82
Interpolation of values.....................................................................................................................83
Application of time series................................................................................................................83
CHAPTER 6: INDEX NUMBERS...................................................................................................84
Introduction to Index Numbers.......................................................................................................84
Definitions of Terms........................................................................................................................84
Characteristics of index numbers....................................................................................................86
Application/Uses of index numbers................................................................................................87
Classification of index numbers......................................................................................................87
Problems in constructing index numbers........................................................................................88
Methods of constructing index numbers........................................................................................89
Test of consistency or adequacy.....................................................................................................96
The Chain Index Numbers.............................................................................................................100
Base shifting..................................................................................................................................104
Deflating.......................................................................................................................................106
Construction of cost of living index numbers................................................................................108
Uses of cost of living index numbers.........................................................................................108
Methods for construction of cost of living index numbers.......................................................109
Possible errors in construction of cost of living index numbers:...............................................111
Problems or steps in construction of wholesale price index numbers (WPI):...........................111
Wholesale price index numbers (Vs) consumer price index numbers:......................................113
Importance and methods of assigning weights............................................................................113
Limitations or demerits of index numbers....................................................................................114
CHAPTER 7: PROBABILITY DISTRIBUTION........................................................................115
Introduction to Probability Distribution........................................................................................115

Discrete and continuous variables................................................................................................115
Discrete Probability Distributions.................................................................................................116
Continuous Probability Distributions............................................................................................117
Discrete & Continuous probability distribution In problems Solutions..........................................118
Binomial Probability Distribution...............................................................................................118
Poisson Distribution...................................................................................................................121
Normal Distribution...................................................................................................................123
CHAPTER 8: NETWORK PLANNING.......................................................................................126
Introduction to Project Planning....................................................................................................126
The Importance/Uses of Planning in an Organization................................................................126
Benefits of Planning in Project Management............................................................................127
The Basic Steps in the Management Planning Process..............................................................128
Advantages & Disadvantages of Using a Project Scheduling Tool..............................................129
Project scheduling and Network planning....................................................................................130
Project Scheduling:.......................................................................................................................130
Network Planning.........................................................................................................................132
Project Scheduling and Network Planning................................................................................132
Terms frequently used in Network Diagram:............................................................................133
Network Diagram Analysis/ Network Construction......................................................................137
Introduction to PERT and CPM.................................................................................................137
The PERT/CPM Procedure.........................................................................................................137
Critical Path Analysis/ Critical Path Construction......................................................................138
PERT and Activity Time Estimation...........................................................................................139
Probability Analysis...................................................................................................................140
Worked Examples on Networks................................................................................................140
CHAPTER 9: LINEAR PROGRAMMING (LP)........................................................................143
Introduction..................................................................................................................................143
Some characteristic LP applications..............................................................................................143
Requirements of an LP problem...................................................................................................143

The importance of Linear Programming.......................................................................................144
Constraints which limit the achievement of objectives................................................................144
Basic Structure of a Linear Program Problem...............................................................................144
Linear Programming Assumptions & General Limitations.............................................................145
Assumptions:............................................................................................................................145
Advantages and limitations:......................................................................................................145
Limitations................................................................................................................................146
Managerial uses and applications of Linear Programming............................................................146
Linear Programming Model...........................................................................................................147
Linear Programming Model Types.................................................................................................148
The Red Gadget-Blue Gadget Problem..........................................................................................149
Problem Solutions in Linear Programming.....................................................................................152
Choice between graphical method and Simplex Method..........................................................152
LINEAR PROGRAMMING: GRAPHICAL SOLUTION......................................................................152
THE SIMPLEX METHOD..............................................................................................................155
Setting up the Simplex Tableau..................................................................................................158
CHAPTER 10: ESTIMATION AND TEST OF HYPOTHESIS..................................................168
Introduction to Estimation and Measure of Hypothesis................................................................168
Estimation.....................................................................................................................................168
Uses of estimation....................................................................................................................168
Estimators.................................................................................................................................168
Types of Estimator....................................................................................................................168
Sampling and Distribution............................................................................................................169
Introduction to Sampling..........................................................................................................169
Sampling distribution................................................................................................................170
Variability of a Sampling Distribution........................................................................................170
Sampling Distribution of the Mean...........................................................................................170
Sampling Distribution of the Proportion...................................................................................171
Central Limit Theorem..............................................................................................................171

T-Distribution vs. Normal Distribution......................................................................................172
Survey Sampling Methods........................................................................................................172
Population Parameter vs. Sample Statistic................................................................................172
Probability vs. Non-Probability Samples...................................................................................173
Non-Probability Sampling Methods..........................................................................................173
Probability Sampling Methods..................................................................................................173
Interpreting odds ratios, confidence intervals and p-values.....................................................174
Hypothesis....................................................................................................................................177
What is Hypothesis Testing?.....................................................................................................177
Statistical Hypotheses...............................................................................................................177
Hypothesis Tests...........................................................................................................................177
Decision Errors..........................................................................................................................178
Decision Rules...........................................................................................................................178
One-Tailed and Two-Tailed Tests................................................................................................178
Power of a Hypothesis Test........................................................................................................179
How to Conduct Hypothesis Tests.................................................................................................179
Applications of the General Hypothesis Testing Procedure...........................................................180
Test statistics and the test interpretation......................................................................................181
CHAPTER 11: DECISION THEORY...........................................................................190
Introduction to Decision theory.....................................................................................................190
Mathematical Expectation.............................................................................................................192
Decision Theory computations......................................................................................................193
Bayes’ rule: Expected Value (Realist)........................................................................193
Maximax (Optimist)..................................................................................................................193
Maximin (Pessimist)..................................................................................................................194
Minimax (Opportunist).............................................................................................................194
States of Nature........................................................................................................................195
Payoff Table...............................................................................................................................195
Opportunistic Loss Table...........................................................................................................195

Expected Value Criterion..........................................................................................................196
Maximax Criterion....................................................................................................................196
Maximin Criterion.....................................................................................................................196
Minimax Criterion.....................................................................................................................197
Putting it all together................................................................................................................197
Practice Problem...........................................................................................................................197
Decision Trees...............................................................................................................................197
Characteristic and Format of Decision Trees.............................................................................197
Analysis of Decision Trees.........................................................................................................198
CHAPTER 12: SIMULATION......................................................................................................202
An Overview of Simulation and Modeling....................................................................................202
Fundamental Concepts in Simulation...........................................................................................203
Designing Instructional/Learning Components of Simulation...................................................203
VV&A - Verification, Validation and Accreditation....................................................................203
Three types of simulations............................................................................................................203
Computer Modeling & Classification............................................................................................205
Simulation classified:................................................................................................................205
Advantages & Disadvantages of simulation models..................................................................206
Methods of Studying a System......................................................................................................207
Model Classifications.....................................................................................................................208
Simulation Modeling, Input, Output, and Experiments.................................................................213
CHAPTER 13: SAMPLING...........................................................................................................215
Introduction to sampling...............................................................................................................215
Terms in Sampling..........................................................................................................................215
Types of Samples & Sampling techniques/methods......................................................................215
Sampling theory and Concept........................................................................................................218
Standard error...............................................................................................................................219
Sampling Distribution of Difference between Means....................................................................220
CHAPTER 14: FINANCIAL MATHEMATICS..........................................................................224

Simple vs. Compound Interest Calculation...................................................................................224
Simple Interest..........................................................................................................................224
Compound Interest...................................................................................................................224
Concepts of sinking fund...............................................................................................................226
The Future Value and Present Value of an Annuity.......................................................................228
The Future Value of an Annuity................................................................................................228
The Present Value of an Annuity...............................................................................................230
Calculating the Interest rate......................................................................................................232
Net Present Value and Internal Rate of Return.............................................................................233
Discount Factor Calculation..........................................................................................................234
Cash flow......................................................................................................................................234
Discount Factor Table for Discrete Compounding.........................................................................236
Discount Factors for Continuous Compounding............................................................................237
Inventory control system..............................................................................................................238
Inventory/Stock control systems...............................................................................................238
Advantages and disadvantages.................................................................................................238
Types of Inventory Control systems :........................................................................................239

CHAPTER 1: DATA COLLECTION AND PRESENTATION
Introduction to Data Collection
Data collection is the process of gathering and measuring information on targeted variables in an
established systematic fashion, which then enables one to answer relevant questions and evaluate outcomes. Data collection is a component of research in all fields of study including physical and
social sciences, humanities, and business. While methods vary by discipline, the emphasis on
ensuring accurate and honest collection remains the same. The goal for all data collection is to
capture quality evidence that allows analysis to lead to the formulation of convincing and credible
answers to the questions that have been posed.

What is data?
• The terms 'data' and 'information' are often used interchangeably.
• However, the terms have distinct meanings:
– Data are facts, events, transactions and so on which have been recorded. They are the raw input materials from which information is produced.
– Information is data that have been processed in such a way as to be useful to the recipient.
• In general terms, basic data are processed in some way to form information, but the mere act of processing data does not in itself produce information.

Data Characteristics
• Data are facts obtained by reading, observation, counting, measuring, weighing, etc., which are then recorded.
• Called raw or basic data; they are often records of the day-to-day transactions of an organization.
• Data are derived from both external and internal sources.
• Data may be produced as an automatic by-product of some routine but essential operation.
• The pool of data available is effectively limitless.
• This abundance means that organizations have to be selective in the data they collect.
• They must continually monitor their data gathering procedures to ensure that they continue to meet the organisation's specific needs.
• The data gathered and the means employed naturally vary from business to business, depending on the organization's requirements.

Basis for data collection


Importance/Purpose or Scope of data collection
Regardless of the field of study or preference for defining data (quantitative or qualitative),
accurate data collection is essential to maintaining the integrity of research. Both the
selection of appropriate data collection instruments (existing, modified, or newly developed)
and clearly delineated instructions for their correct use reduce the likelihood of errors
occurring.

A formal data collection process is necessary as it ensures that the data gathered are both
defined and accurate and that subsequent decisions based on arguments embodied in the
findings are valid. The process provides both a baseline from which to measure and in certain
cases an indication of what to improve.

Impact of faulty data
Consequences from improperly collected data include:
• Inability to answer research questions accurately;
• Inability to repeat and validate the study.

Distorted findings result in wasted resources and can mislead other researchers into pursuing
fruitless avenues of investigation; it may also compromise decisions, for example for public
policy, which may cause disproportionate harm.

Objectives of data collection


• Understand the relationship between data and analysis objectives
• Understand the data collection planning process
• Appreciate human factors of data collection

Why collect data?


• Measure reliability
• Document spares consumption
• Provide statistics
These are reactive; it is better to be pro-active:
• Maintenance planning
• Maintenance improvement
• Identify & justify need for modification
• Calculate future resource & spares requirements
• Assess likelihood of mission success
• Confirm contractual requirements
• To assist achievement of worthwhile objectives
• Data collection is time-consuming & costly.
– We should only collect data where there is an identified and worthwhile benefit from doing so.

Put planning into data collection


• Worthwhile objectives require decisions:
– To change: how much, what, when, how
– To not change
• Decisions need clear supporting evidence:
– Analyzed results: not all analysis is equal
• Analysis needs data
– Good results need good analysis: but good analysis may need expensive data
– Options: consider alternatives and identify the most cost-effective one that enables the objectives

Put planning into data collection

• Data collection does not need to satisfy all objectives all the time. For example:
– Objective 1: Identify quickly that there is a reliability problem
• Routine data collection sufficient to allow analysis of occurrences of the subject in question
– Objective 2: Identify accurately what the problem is
• Special data collection once a problem has been identified: possibly using sampling techniques and engineering analysis rather than data analysis

Data collection must have a purpose!


• Data should be collected for a purpose:
– to enable analysis,
– to focus on increasing understanding of item operation and failure,
– to apply this knowledge to a goal or objective.
• Without a definition of the objective for the future data analysis and the application of its findings, collection of data is likely to be aimless and will omit important data, allow corruption of data, or may waste time and resources by including data that offer little benefit.

Statistical units

A statistical unit is the unit of observation or measurement for which data are collected or derived.
A unit in a statistical analysis refers to one member of a set of entities being studied. It is the
material source for the mathematical abstraction of a "random variable". Common examples
of a unit would be a single person, animal, plant, manufactured item, or country that belongs
to a larger collection of such entities being studied.

Units are often referred to as being either experimental units, sampling units or, more
generally, units of observation:

• An "experimental unit" is typically thought of as one member of a set of objects that


are initially equivalent, with each object then subjected to one of several experimental
treatments.
• A "sampling unit" is typically thought of as an object that has been sampled from a
statistical population. This term is commonly used in opinion polling and survey
sampling.

Data sources and types


Data sources: the areas where data are mined or generated.
Data resources: the information dimension of the source.
Data type: What data needs collecting?
• Inventory
– Information proving that a particular item exists in the field

– How that item is configured
– What other items that item contains
• Usage
– Information about when an item was placed into the field,
– How that item is operated in the field
– When that item was removed from the field
• Environment
– Information about the operating conditions of the item
• Events
– Information about anything that has happened to the item during its life

Data sources
• Servicing records,
• warranty records,
• repaired product records
• Spares used records

• Disposal records
• Customer complaints
• Customer reports and comments can also be used to help complete a data set.
• Insurance claims and coverage records

Resources
• The infrastructure:
– Diagnosis and service utilities as necessary for maintenance;
– Computerized tools for data storage, aggregation, analysis and reporting;
– Facilities for raw data recording;
– Computerized facilities for remote condition monitoring and data collection.
• Economical and financial aspects to be considered are:
– Cost of implementing and maintaining regular data collection;
– Benefits gained by improvement of processes caused by measures based on the information feedback from field data.

Data Validation
This is a cross-check on data at the point of its entry or acquisition.
• Why validate
– Avoid garbage-in, garbage-out
– Avoid wrong decisions with costly consequences
– Reliability analysis often requires large amounts of data, collected over a long
period of time: it is too late to find that data is corrupt when analysis is attempted

• How to validate
– Input masks, cross-checks (e.g. serial # fitted previously is serial # removed, serial
# fitted is serial # removed from stores, item fitted matches host equipment, etc.),
usage matches expectation, gaps in data …
– Use electronic aids such as smart-chips, bar-coding
– Validate incrementally: validate at point of data entry
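As an illustration of validating at the point of entry, here is a minimal sketch in Python. The field names, the serial-number mask and the rules are assumptions made for the example, not part of these notes; they simply mirror the cross-checks listed above.

```python
import re

def validate_record(record, serial_fitted_previously):
    """Cross-check a maintenance record at the point of data entry.

    `record` is a dict of hypothetical fields; `serial_fitted_previously`
    is the serial number logged as fitted at the last event.
    Returns a list of validation errors (an empty list means the record passes).
    """
    errors = []

    # Input mask: serial numbers are assumed to look like 'SN-' followed by 6 digits.
    if not re.fullmatch(r"SN-\d{6}", record.get("serial_fitted", "")):
        errors.append("serial_fitted does not match the mask SN-nnnnnn")

    # Cross-check: the item fitted should be the item issued from stores.
    if record.get("serial_fitted") != record.get("serial_issued_from_stores"):
        errors.append("serial fitted does not match serial issued from stores")

    # Cross-check: the serial now removed should be the serial fitted previously.
    if record.get("serial_removed") != serial_fitted_previously:
        errors.append("serial removed is not the serial fitted previously")

    # Usage matches expectation: recorded operating hours should never decrease.
    if record.get("hours_at_event", 0) < record.get("hours_at_last_event", 0):
        errors.append("recorded usage is lower than at the previous event")

    return errors

# Validate incrementally: check each record as it is entered, before it is stored.
entry = {
    "serial_fitted": "SN-004711",
    "serial_issued_from_stores": "SN-004711",
    "serial_removed": "SN-001234",
    "hours_at_event": 5200,
    "hours_at_last_event": 5100,
}
print(validate_record(entry, serial_fitted_previously="SN-001234"))  # [] -> accepted
```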

Human factors in data collection


These are the elements of human contribution and participation in the data collection process.
• Make it simple to get data collection right
• Make it difficult to get data collection wrong
• Consider complexity, layout, input masks and computer assistance
• Involve those who collect the data in the planning process: buy-in to the objectives
Kinds of data
There are basically three kinds of data:
1. Interval data: These are data measured on a scale with units. Examples
include height, weight and temperature.
2. Ordinal data: These are data collected from ranking variables on a given scale. For
example, you may ask respondents to rank some variable based on their perceived level
of importance of the variables using Likert type scale such as 1, 2, 3, 4 and 5.
3. Nominal data: Merely statements of qualitative category of membership. Examples
include gender (male or female), race (black or white), nationality (British, American,
African, etc.).
It should be appreciated that both Interval and Ordinal data relate to quantitative variables
while Nominal data refers to qualitative variables.
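A small sketch (assuming the pandas library is available; the variable names and values are made up) of how the three kinds of data might be represented, with ordered categories for ordinal data and unordered categories for nominal data:

```python
import pandas as pd

df = pd.DataFrame({
    # Interval data: measured on a scale with units (here, centimetres).
    "height_cm": [158.0, 171.5, 162.3],
    # Ordinal data: a ranked Likert-type response on a 1-5 scale.
    "importance": pd.Categorical([1, 3, 5], categories=[1, 2, 3, 4, 5], ordered=True),
    # Nominal data: category membership only, with no ordering.
    "gender": pd.Categorical(["female", "male", "female"]),
})
print(df.dtypes)   # shows which columns are quantitative and which are categorical
```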

Types of data analysis


It is common to differentiate between three different types of data analysis, and we will go
through all the three in the next chapters:

a) Exploratory Data Analysis: used to quickly produce and visualize simple summaries of
data sets. We use exploratory data analysis mostly for arranging the data for further
analysis.
b) Descriptive Data Analysis: tells us how the data look, and what the relationships are between the different variables in the data set. We perform descriptive data analysis to present quantitative descriptions in a manageable form. It produces summaries such as: dispersion, distribution, central tendency and percentiles.
It should be noted that every time we try to describe a large set of observations with a single
indicator, we run the risk of distorting the original data or losing important detail. However,
given these limitations, descriptive statistics provide a powerful summary that may enable
comparisons across groups of people or other units.

c) Inferential Statistics
Inferential statistics test hypotheses about the data, making it possible to generalize beyond our data set and to compare groups.
It is also common to differentiate between the three following types of statistical analyses:
1. Univariate - when one variable is analyzed (e.g. t-test)
2. Bivariate - analysis of two variables (e.g. Paired-Samples test, ANOVA, Pearson's Chi-square, Simple Linear Regression, Pearson's & Spearman's Correlation)
3. Multivariate - analysis of three or more variables (e.g. Multiple Regression)
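As a short illustration of descriptive (univariate) analysis, the sketch below computes the summaries named above for an assumed set of ten marks, using only the Python standard library:

```python
import statistics

# Assumed sample data: marks of ten students (illustrative only).
marks = [45, 52, 58, 60, 61, 61, 67, 70, 74, 90]

# Central tendency
print("mean   :", statistics.mean(marks))
print("median :", statistics.median(marks))
print("mode   :", statistics.mode(marks))

# Dispersion
print("range  :", max(marks) - min(marks))
print("st.dev.:", statistics.stdev(marks))        # sample standard deviation

# Distribution / percentiles: quartiles (requires Python 3.8+)
q1, q2, q3 = statistics.quantiles(marks, n=4)
print("quartiles:", q1, q2, q3)
```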

Collection methods and limitations


Information Gathering methods: The main aim of fact-finding methods is to determine the information requirements of an organization, which are used by analysts to prepare a precise System Requirements Specification (SRS) that is understood by the user. The methods are techniques for acquiring information.
Data Collection Methods: Pros and Cons

Method: Archival
Description: Data that have already been collected by an agency or organization and are in their records or archives.
Pros:
• Low cost
• Relatively rapid
• Unobtrusive
• Can be highly accurate
• Often good to moderate validity
• Usually allows for historical comparisons or trend analysis
• Often allows for comparisons with larger populations
Cons:
• May be difficult to access local data
• Often out of date
• When rules for recordkeeping are changed, makes trend analysis difficult or invalid
• Need to learn how records were compiled to assess validity
• May not be data on knowledge, attitudes, and opinions
• May not provide a complete picture of the situation

Method: Key Informant Interviews
Description: Structured or unstructured one-on-one directed conversations with key individuals or leaders in a community.
Pros:
• Low cost (assuming relatively few)
• Respondents define what is important
• Rapid data collection
• Possible to explore issues in depth
• Opportunity to clarify responses through probes
• Sources of leads to other data sources and other key informants
Cons:
• Can be time consuming to set up interviews with busy informants
• Requires skilled and/or trained interviewers
• Accuracy (generalizability) limited and difficult to specify
• Produces limited quantitative data
• May be difficult to analyze and summarize findings

Method: Focus Groups
Description: Structured interviews with small groups of like individuals using standardized questions, follow-up questions, and exploration of other topics that arise, to better understand participants.
Pros:
• Low cost
• Rapid data collection
• Participants define what is important
• Some opportunity to explore issues in depth
• Opportunity to clarify responses through probes
Cons:
• Can be time consuming to assemble groups
• Produces limited quantitative data
• Requires trained facilitators
• Less control over process than key informant interviews
• Difficult to collect sensitive information
• Accuracy (generalizability) limited and difficult to specify
• May be difficult to analyze and summarize findings

Method: Surveys
Description: Standardized paper-and-pencil or phone questionnaires that ask predetermined questions.
Pros:
• Can be highly accurate
• Can be highly reliable and valid
• Allows for comparisons with other/larger populations when items come from existing instruments
• Easily generates quantitative data
Cons:
• Relatively high cost
• Relatively slow to design, implement, and analyze
• Accuracy depends on who and how many people are sampled
• Accuracy limited to willing and reachable respondents
• May have low response rates
• Little opportunity to explore issues in depth

Data classification
Data classification is the process of organizing data into categories for its most effective and efficient
use.

The process of arranging data into homogeneous groups or classes according to some common characteristics present in the data is called classification.

For Example: In sorting letters in a post office, the letters are classified according to the cities and further arranged according to streets.

Reasons for Data Classification


Data classification is carried out for a variety of purposes, one of the most common being a
process that supports data security initiatives. But data may be classified for a number of
reasons, including ease of access, to comply with regulatory requirements, and to meet
various other business or personal objectives. In some cases, data classification is a
regulatory requirement, as data must be searchable and retrievable within specified
timeframes. For the purposes of data security, data classification is a useful tactic that
facilitates proper security responses based on the type of data being retrieved, transmitted, or
copied.

Bases of Classification:
There are four important bases of classification:

(1) Qualitative Base (2) Quantitative Base (3) Geographical Base (4) Chronological
or Temporal Base

(1) Qualitative Base: When the data are classified according to some quality or attributes
such as sex, religion, literacy, intelligence etc…

(2) Quantitative Base: When the data are classified by quantitative characteristics like
heights, weights, ages, income etc…

(3) Geographical Base: When the data are classified by geographical regions or location,
like states, provinces, cities, countries etc…

(4) Chronological or Temporal Base: When the data are classified or arranged by their
time of occurrence, such as years, months, weeks, days etc… For Example: Time series
data.

Types of Classification:

(1) One-way Classification: If we classify observed data keeping in view a single characteristic, this type of classification is known as one-way classification.

For Example: The population of world may be classified by religion as Muslim, Christians
etc…

(2) Two-way Classification: If we consider two characteristics at a time in order to classify the observed data, then we are doing two-way classification.

For Example: The population of world may be classified by Religion and Sex.

(3) Multi-way Classification: We may consider more than two characteristics at a time to classify given or observed data; this is known as multi-way classification.

For Example: The population of world may be classified by Religion, Sex and Literacy.

The Data Classification Process


Data classification can be a complex and cumbersome process, unless automated systems are
used to streamline the process. Still, an enterprise must determine the categories and criteria
that will be used to classify data, understand and define its objectives, outline the roles and
responsibilities of employees in maintaining proper data classification protocols, and
implement security standards that correspond with data categories and tags. When done
correctly, this process will provide employees and third parties involved in the storage,
transmission, or retrieval of data with a framework within which to operate.

Policies and procedures should be well-defined, considerate of the security requirements (or
confidentiality) of data types, and straightforward enough that policies are easily interpreted
by employees to promote compliance. For instance, each category should include information
about the types of data classified as such, security considerations with rules for retrieving,
transmitting, and storing data, clear examples, and potential risks associated with a breach of
security policies.

The data classification process goes far beyond making information easy to find. Data
classification is necessary to enable modern enterprises to make sense of the vast amounts of
data available at any given moment. Data classification provides a clear picture of the data

19
within the organization‘s control and an understanding of where data is stored, how it‘s most
easily accessed, and how data is best protected from potential security risks. Data
classification, once implemented, provides an organized information framework that
facilitates more adequate data protection measures and promotes employee compliance with
security policies.

Classification rule
Given a population whose members each belong to one of a number of different sets or
classes, a classification rule or classifier is a procedure by which the elements of the
population set are each predicted to belong to one of the classes. A perfect classification is
one for which every element in the population is assigned to the class it really belongs to. An
imperfect classification is one in which some errors appear, and then statistical analysis must
be applied to analyse the classification.

A special kind of classification rule is binary classification, for problems in which there are
only two classes.

Testing classification rules by function


Given a dataset consisting of pairs x and y, where x denotes an element of the population and y the class it belongs to, a classification rule h(x) is a function that assigns each element x to a predicted class ŷ = h(x). A binary classification is one in which the label y can take only one of two values.

The true labels yi can be known but will not necessarily match their approximations ŷi = h(xi). In a binary classification, the elements that are not correctly classified are named false positives and false negatives.

Some classification rules are static functions. Others can be computer programs: a computer classifier can learn, or can implement static classification rules. For data outside the training set, the true labels yj are unknown, but it is a prime target of the classification procedure that the approximation ŷj = h(xj) ≈ yj holds as well as possible, where the quality of this approximation is judged on the basis of the statistical or probabilistic properties of the overall population from which future observations will be drawn.

Given a classification rule, a classification test is the result of applying the rule to a finite
sample of the initial data set.
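A minimal sketch of these ideas in Python: a static threshold rule h(x) plays the part of the classifier, the (x, y) pairs are made up for illustration, and a classification test counts the false positives and false negatives on that finite sample.

```python
# Assumed data set of (x, y) pairs: x is a measurement, y the true class (0 or 1).
data = [(2.1, 0), (3.5, 0), (4.2, 0), (5.2, 1), (3.7, 1), (4.8, 1)]

def h(x, threshold=4.0):
    """A static classification rule: predict class 1 when x is at or above the threshold."""
    return 1 if x >= threshold else 0

# Classification test: apply the rule to the sample and compare with the true labels.
correct         = sum(1 for x, y in data if h(x) == y)
false_positives = sum(1 for x, y in data if h(x) == 1 and y == 0)
false_negatives = sum(1 for x, y in data if h(x) == 0 and y == 1)

print("correctly classified:", correct)          # elements assigned to their true class
print("false positives     :", false_positives)  # predicted 1, actually 0
print("false negatives     :", false_negatives)  # predicted 0, actually 1
```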

Data tabulation
Tabulating is a way of processing information or data by putting it in a table. This doesn't mean the
kind of table you eat off of, though. It refers to a table, or chart, with rows and columns. When
tabulating, you might have to make calculations.

Definitions and parts of table
The process of placing classified data into tabular form is known as tabulation. A table is a
symmetric arrangement of statistical data in rows and columns. Rows are horizontal
arrangements whereas columns are vertical arrangements. It may be simple, double or
complex depending upon the type of classification.

A statistical table has at least four major parts and some other minor parts.
(1) The Title
(2) The Box Head (column captions)
(3) The Stub (row captions)
(4) The Body
(5) Prefatory Notes
(6) Foot Notes
(7) Source Notes
The general sketch of a table indicating its necessary parts is shown below:

-----THE TITLE-----
----Prefatory Notes----
----Box Head----
----Row Captions----        ----Column Captions----
---Stub Entries---          -----The Body-----
Foot Notes…
Source Notes…

(1) The Title: A title is the main heading, written in capitals, shown at the top of the table. It must explain the contents of the table and throw light on the table as a whole. Different parts of the heading can be separated by commas; no full stop is used in the title.

(2) The Box Head (column captions): The vertical headings and subheadings of the columns are called column captions. The space where these column headings are written is called the box head. Only the first letter of the box head is in capital letters and the remaining words must be written in small letters.

(3) The Stub (row captions): The horizontal headings and subheadings of the rows are called row captions, and the space where these row headings are written is called the stub.

(4) The Body: It is the main part of the table which contains the numerical information
classified with respect to row and column captions.

(5) Prefatory Notes: A statement given below the title and enclosed in brackets, usually describing the units of measurement, is called a prefatory note.

(6) Foot Notes: These appear immediately below the body of the table, providing further additional explanation.

(7) Source Notes: The source note is given at the end of the table, indicating the source from which the information has been taken. It includes information about the compiling agency, publication, etc.

General Rules of Tabulation:

• A table should be simple and attractive. There should be no need of further explanations (details).
• Proper and clear headings for columns and rows should be used.
• Suitable approximation may be adopted and figures may be rounded off.
• The unit of measurement should be well defined.
• If the observations are large in number, they can be broken into two or three tables.
• Thick lines should be used to separate the data under big classes and thin lines to separate the sub-classes of data.

Types of Tabulation:

(1) Simple Tabulation or One-way Tabulation:

When the data are tabulated according to one characteristic, it is said to be simple tabulation or one-way tabulation.

For Example: Tabulation of data on the population of the world classified by one characteristic like Religion is an example of simple tabulation.

(2) Double Tabulation or Two-way Tabulation:

When the data are tabulated according to two characteristics at a time, it is said to be double tabulation or two-way tabulation.

For Example: Tabulation of data on the population of the world classified by two characteristics like Religion and Sex is an example of double tabulation.

(3) Complex Tabulation:

When the data are tabulated according to many characteristics, it is said to be complex
tabulation.

For Example: Tabulation of data on the population of the world classified by several characteristics like Religion, Sex and Literacy is an example of complex tabulation.
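As an illustration, the sketch below produces a simple (one-way) and a double (two-way) tabulation from a small made-up set of records, using only the Python standard library:

```python
from collections import Counter

# Assumed classified data: (religion, sex) recorded for ten respondents (illustrative only).
records = [
    ("Muslim", "Male"), ("Christian", "Female"), ("Christian", "Male"),
    ("Muslim", "Female"), ("Christian", "Female"), ("Muslim", "Male"),
    ("Christian", "Male"), ("Muslim", "Female"), ("Christian", "Female"),
    ("Muslim", "Male"),
]

# Simple (one-way) tabulation: counts according to a single characteristic.
one_way = Counter(religion for religion, _ in records)
print("One-way tabulation by religion:", dict(one_way))

# Double (two-way) tabulation: counts according to two characteristics at a time.
two_way = Counter(records)
print("Two-way tabulation by religion and sex:")
for (religion, sex), count in sorted(two_way.items()):
    print(f"  {religion:<10} {sex:<7} {count}")
```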

Application/uses of Tabulation
There are several specific situations in which tables are routinely used as a matter of custom
or formal convention.
1) Publishing
2) Mathematics
• Arithmetic (Multiplication table)
• Logic (Truth table)
3) Natural sciences
• Chemistry (Periodic table)
• Oceanography (Tide table)

4) Information technology
- Software applications: Modern software applications give users the ability to generate,
format, and edit tables and tabular data for a wide variety of uses, for example:
• word processing applications;
• spreadsheet applications;
• presentation software;
• tables specified in HTML or another markup language

- Software development: Tables have uses in software development for both high-level
specification and low-level implementation. Usage in software specification can
encompass ad hoc inclusion of simple decision tables in textual documents through to the
use of tabular specification methodologies, examples of which include SCR and Statestep. Proponents of tabular techniques, among whom David Parnas is prominent,
emphasize their understandability, as well as the quality and cost advantages of a format
allowing systematic inspection, while corresponding shortcomings experienced with a
graphical notation were cited in motivating the development of at least two tabular
approaches.

At a programming level, software may be implemented using constructs generally represented or understood as tabular, whether to store data (perhaps to memoize earlier results), for example, in arrays or hash tables, or control tables determining the flow of program execution in response to various events or inputs.

5) Databases
Database systems often store data in structures called tables, in which columns are data fields and rows represent data records.
6) Historical relationship to furniture
In medieval counting houses, the tables were covered with a piece of checkered cloth, to count money. Exchequer is an archaic term for the English institution which accounted for money owed to the monarch. Thus the checkerboard tables of stacks of coins are a concrete realization of this information.

Difference between Classification and Tabulation


(1) First the data are classified and then they are presented in tables; classification and tabulation in fact go together. So classification is the basis for tabulation.

(2) Tabulation is a mechanical function of classification, because in tabulation classified data are placed in rows and columns.

(3) Classification is a process of statistical analysis, whereas tabulation is a process of presenting the data in a suitable form.

Diagrammatic and graphic Data presentation

Introduction to data presentation in Diagrams and Graphs

A graph is a diagram that shows the relation between variable quantities, usually two variables,
each measured along one of a pair of axes at right angles. A frequency distribution is often better represented in
a graph or diagram than in a table, so we use different types of graphs and diagrams to represent it.

Advantages of Graphic Presentation:


1. Graphs represent complex data in a simple form.
2. Values of median, mode can be found through graphs.
3. Graphs create long lasting effect on people’s mind.
Disadvantages of Graphic Presentation:
1. Graphs do not show precise values.
2. Only experts can interpret graphs.
3. Graphs may suggest wrong conclusions.

Rules of Constructing graph:
1. The heading of the graph should be simple, clear and self explanatory.
2. Graphs should always be drawn with reference to some scale.
3. False baselines should be drawn if the difference between zero and the smallest value is
high.
4. Index should be made if different lines are drawn as in time series graphs.
Diagrams are schematic representations of something (which can be any object or subject). They
show the appearance, structure and working of that thing.
Data may be presented in a simple and attractive manner in the form of diagrams. Diagrammatic
presentation provides the quickest understanding of the actual situation to be explained by the data
in comparison to the tabular and textual presentations.

While making the diagrams, emphasis should be on:


 Heading and number
 Size and attraction
 Width and Height
 Scale
 Scale measurement and index
 Footnotes and source notes and
 Simplicity

Utility or uses of diagrammatic presentation:

1. Makes complex data simple.


2. Diagrams are attractive.
3. Diagrams save time when compared to other methods.
4. Diagrams create a lasting impression on the minds of observers.
Limitations of diagrammatic presentation:
1. They do not provide detailed information.
2. Diagrams can be easily misinterpreted.
3. Diagrams can take much time and labour.
4. Exact measurement is not possible in diagrams.

Types of construction diagrams


Among the various kinds of diagrams some important ones are:

I. Line diagrams – Lines are drawn vertically to show a large number of items.
II. Bar diagrams
1. Simple Bar diagrams – These diagrams represent only one particular type of data.
2. Multiple Bar diagrams – These diagrams represent more than one type of data at a time.
3. Subdivided Bar diagrams or Component Bar diagrams – These diagrams present total values and their parts in a set of data.
III. Pie diagrams – A circle may be divided into various sectors representing various components.
Types of construction graphs
Among the various kinds of graphs some important ones are:
1. Line frequency graphs – Such graphs are used to represent discrete series.
2. Histogram – A two-dimensional diagram in which the height of each rectangle shows the frequency and its width shows the size of the class interval.
Frequency Polygon: A histogram becomes a frequency polygon when a line is drawn joining the midpoints
of the tops of all the rectangles in the histogram.

Frequency Curve: A smooth curve joining the points corresponding to the frequencies provides the
frequency curve of the data.

Ogive: A curve obtained by plotting cumulative frequency data on graph paper.
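
As a brief illustration of how such graphs can be produced, the following Python sketch (using matplotlib, with the retirement age data that appears later in these notes borrowed purely as sample values) draws a histogram and an ogive from grouped data:

import matplotlib.pyplot as plt

ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]  # sample data
bins = [53, 55, 57, 59, 61]                           # class intervals

# Histogram: the height of each bar shows the class frequency.
plt.hist(ages, bins=bins, edgecolor="black")
plt.xlabel("Retirement age (years)")
plt.ylabel("Frequency")
plt.title("Histogram")
plt.show()

# Ogive: cumulative frequencies plotted against the class boundaries.
plt.hist(ages, bins=bins, cumulative=True, histtype="step")
plt.xlabel("Retirement age (years)")
plt.ylabel("Cumulative frequency")
plt.title("Ogive")
plt.show()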

Interpretation of diagrams and graphs


Interpreting pie charts and frequency diagrams

Example
The pie chart below shows the heights (in cm) of 30 pupils in a class.

The biggest slice of the pie chart contains the most people - 151-160cm.

Question
How many pupils are between 121-130 cm tall?
Answer
The angle of this section is 36 degrees. The question says there are 30 pupils in the class. So
the number of pupils of height 121 - 130 cm is:
36/360 x 30 = 3

Example
A survey was conducted to determine the number of people in cars during rush hour. The
results are shown in the frequency diagram below.

Question
What is the total number of cars in the survey?
Answer
6 + 3 + 5 + 1 = 15

There are 6 cars with one person in, 3 cars with two people, 5 cars with three people, and 1
car with four people.

Question
What is the most likely number of people in a car?
Answer

1.

Cars in the survey are most likely to have 1 person in them as this is the tallest bar - 6 of the
cars in the survey had one occupant.

Interpreting graphs
It is important that when we have a graph we actually find it useful and can take information
from it. First of all let's look at some misleading or badly-drawn graphs.

Use of scales on graph axes

In the diagram above the scale on the left-hand-side graph is inappropriate. The numbers go
up unnecessarily high on both axes which means the points are squashed into just a small part
of the graph area. The scale on the right-hand-side graph is much more suitable. Because the
scale suits the information the points fill the whole of the sheet making it clearer to read.

In the graph above the numbers on the bottom axis are unequal. This is wrong and makes the
graph difficult to read.

Scales should either start at zero, or be concertinaed (squashed) as shown in the y-axis in the
above graph.

Trends
This is a common type of question where you have to look at the results displayed in a graph
and decide what the overall result is.

An upward trend

In the graph above we do not have a nice straight line increase in figures, but overall there is
an increase in sales. If asked what the trend was we would say that sales were increasing.

A downward trend

In this graph we would say that sales were decreasing or dropping.

Interpreting points
Example
Four friends are represented on a graph. Look at the following statements about them.
• Sophie is the tallest.
• George and Waseem are the same height.
• Freda is the shortest.
• George is the oldest.
• Sophie and Waseem are the same age.
• Freda is the youngest.

Match the names of the people to the letters on the graph.

We first of all look at the two variables along the axes. In this case they show height and age.

Start by putting the four people in order of height from shortest to tallest. In this case it will
be Freda first (A), George and Waseem next equal and Sophie last (D).

Now put the ages in the correct order according to the statements in the question.

Although George and Waseem are the same height, George is older so he is C and Waseem is
B.
• A is Freda
• B is Waseem
• C is George
• D is Sophie

Question
A company uses the following charges for photocopying:
copies or less - each Extra copies - each
Which of the following graphs A, B or C could show how the cost changes with the number
of copies?

Answer
The answer is C. Prices start off at per copy, so there is a steady straight line. Charges
then drop to only , so a less-steep straight line.

CHAPTER 2: MEASURES OF CENTRAL
TENDENCY

Definition of measures of central tendency


A measure of central tendency is a single value that attempts to describe a set of data by
identifying the central position within that set of data. As such, measures of central tendency
are sometimes called measures of central location. They are also classed as summary
statistics or averages.

A measure of central tendency is a single value that describes the way in which a group of
data clusters around a central value. To put in other words, it is a way to describe the center of
a data set. There are three measures of central tendency: the mean, the median, and the mode.

Why Is Central Tendency Important?


Central tendency is very useful in psychology. It lets us know what is normal or 'average'
for a set of data. It also condenses the data set down to one representative value, which is
useful when you are working with large amounts of data. Could you imagine how difficult it
would be to describe the central location of a 1000-item data set if you had to consider every
number individually?

Central tendency also allows you to compare one data set to another. For example, let's say
you have a sample of girls and a sample of boys, and you are interested in comparing their
heights. By calculating the average height for each sample, you could easily draw
comparisons between the girls and boys.

Central tendency is also useful when you want to compare one piece of data to the entire data
set. Let's say you received a 60% on your last psychology quiz, which is usually in the D
range. You go around and talk to your classmates and find out that the average score on the
quiz was 43%. In this instance, your score was significantly higher than those of your
classmates. Since your teacher grades on a curve, your 60% becomes an A. Had you not
known about the measures of central tendency, you probably would have been really upset by
your grade and assumed that you bombed the test.

Properties /Characteristics of a Good Average


(i) It should be rigidly defined. If an average is left to the estimation of an observer and if it
is not a definite and fixed value it cannot be representative of a series. The bias of the
investigator in such cases would considerably affect the value of the average. If the average is
rigidly defined; this instability in its value would be no more, and it would always be a
definite figure,

(ii) It should be based on all the observations of the series. If some of the items of the series
are not taken into account in its Calculation the average cannot be said to be a representative
one. As we shall see later on there are some averages which do not take into account all the
values of a group and to this extent they are not satisfactory averages.
(iii) It should be capable of further algebraic treatment. If an average does not possess this quality,
its use is bound to be very limited. It will not be possible to calculate, say, the combined average of
two or more series from their individual averages; further it will not be possible to study the average
relationship of various parts of a variable if it is expressed as the sum of two or more variables. Many
other similar studies would not be possible if the average is not capable of further algebraic
treatment.
(iv) It should be easy to calculate and simple to follow. If the calculation of the average involves
tedious mathematical processes it will not be readily understood and its use will be confined only to
a limited number of persons. It can never be a popular average. As such, one of the qualities of a
good average is that it should not be too abstract or mathematical and there should be no difficulty
in its calculation. Further, the properties of the average should be such that they can be easily
understood by persons of ordinary intelligence.
(v) It should not be affected by fluctuations of sampling. If two independent sample studies are made
in any particular field, the averages thus obtained should not materially differ from each other. No
doubt, when two separate enquiries are made, there is bound to be a difference in the average values
calculated, but in some cases this difference will be great while in others comparatively less. Those
averages in which this difference, which is technically called "fluctuation of sampling", is less are
considered better than those in which the difference is more.
One more thing to be remembered about averages is that the items whose average is being calculated
should form a homogeneous group. It is absurd to talk about the average of a man's height and his
weight. If the data from which an average is being calculated are not homogeneous, misleading
conclusions are likely to be drawn. To find out the average production of cotton cloth per mill, if big
and small mills are not separated, the average will be unrepresentative. Similarly, to study the wage
level in the cotton mill industry of India, separate averages should be calculated for the male and
female workers. Again, adult workers should be studied separately from the juvenile group. Thus we
see that, as far as possible, the data from which an average is calculated should be a homogeneous
lot. Homogeneity can be achieved either by selecting only like items or by dividing the heterogeneous
data into a number of homogeneous groups.

Calculation and interpretation of Central Tendency


There are three main measures of central tendency: the mode, the median and the mean. Each of these
measures describes a different indication of the typical or central value in the distribution.

Mode
The mode is the most commonly occurring value in a distribution.

Consider this dataset showing the retirement age of 11 people, in whole years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

This table shows a simple frequency distribution of the retirement age data.

Age Frequency
54 3
55 1
56 1
57 2
58 2
60 2
The most commonly occurring value is 54, therefore the mode of this distribution is 54 years.

Advantage of the mode:


The mode has an advantage over the median and the mean as it can be found for both
numerical and categorical (non-numerical) data.

Limitations of the mode:


There are some limitations to using the mode. In some distributions, the mode may not reflect
the centre of the distribution very well. When the distribution of retirement age is ordered
from lowest to highest value, it is easy to see that the centre of the distribution is 57 years,
but the mode is lower, at 54 years.

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

It is also possible for there to be more than one mode for the same distribution of data,
(bimodal, or multi-modal). The presence of more than one mode can limit the ability of the
mode in describing the centre or typical value of the distribution because a single value to
describe the centre cannot be identified.

In some cases, particularly where the data are continuous, the distribution may have no
mode at all (i.e. if all values are different).

In cases such as these, it may be better to consider using the median or mean, or group the
data in to appropriate intervals, and find the modal class.

Median
The median is the middle value in distribution when the values are arranged in ascending
or descending order.

The median divides the distribution in half (there are 50% of observations on either side of
the median value). In a distribution with an odd number of observations, the median value is
the middle value.

Looking at the retirement age distribution (which has 11 observations), the median is the
middle value, which is 57 years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

When the distribution has an even number of observations, the median value is the mean of
the two middle values. In the following distribution, the two middle values are 56 and 57,
therefore the median equals 56.5 years:

52, 54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

Advantage of the median:


The median is less affected by outliers and skewed data than the mean, and is usually the
preferred measure of central tendency when the distribution is not symmetrical.

Limitation of the median:


The median cannot be identified for categorical nominal data, as it cannot be logically
ordered.

Mean
The mean is the sum of the value of each observation in a dataset divided by the number
of observations. This is also known as the arithmetic average.

Looking at the retirement age distribution again:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

The mean is calculated by adding together all the values
(54+54+54+55+56+57+57+58+58+60+60 = 623) and dividing by the number of observations
(11), which equals 56.6 years.
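
As a quick check of the three measures on the retirement age data above, here is a short Python sketch using the standard library's statistics module:

from statistics import mean, median, mode

ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60]

print(mode(ages))            # 54   - most frequently occurring value
print(median(ages))          # 57   - middle value of the ordered data
print(round(mean(ages), 1))  # 56.6 - arithmetic average (623 / 11)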
The mean (or average) is the most popular and well known measure of central tendency. It can
be used with both discrete and continuous data, although its use is most often with continuous
data (see our Types of Variable guide for data types). The mean is equal to the sum of all the
values in the data set divided by the number of values in the data set. So, if we have n values
in a data set and they have values x1, x2, ..., xn, the sample mean, usually denoted by x̄
(pronounced "x bar"), is:

x̄ = (x1 + x2 + ... + xn) / n

This formula is usually written in a slightly different manner using the Greek capital letter Σ,
pronounced "sigma", which means "sum of...":

x̄ = Σx / n

You may have noticed that the above formula refers to the sample mean. So, why have we
called it a sample mean? This is because, in statistics, samples and populations have very
different meanings and these differences are very important, even if, in the case of the mean,
they are calculated in the same way. To acknowledge that we are calculating the population
mean and not the sample mean, we use the Greek lower case letter "mu", denoted as µ:

µ = Σx / N    (where N is the total number of values in the population)

The mean is essentially a model of your data set. It is the value that is most common. You will
notice, however, that the mean is not often one of the actual values that you have observed in
your data set. However, one of its important properties is that it minimises error in the
prediction of any one value in your data set. That is, it is the value that produces the lowest
amount of error from all other values in the data set.

An important property of the mean is that it includes every value in your data set as part of the
calculation. In addition, the mean is the only measure of central tendency where the sum of
the deviations of each value from the mean is always zero.

Advantage of the mean:


The mean can be used for both continuous and discrete numeric data.

Limitations of the mean:


The mean cannot be calculated for categorical data, as the values cannot be summed.

As the mean includes every value in the distribution the mean is influenced by outliers and
skewed distributions.

NOTE
What you need to know about the mean:
The population mean is indicated by the Greek symbol µ (pronounced 'mu'). When the mean is
calculated on a distribution from a sample it is indicated by the symbol x̄ (pronounced 'x-bar').

Influence of the shape of a distribution on measures of central tendency

Symmetrical distributions:
When a distribution is symmetrical, the mode, median and mean are all in the middle of the
distribution. The following graph shows a larger retirement age dataset with a distribution which is
symmetrical. The mode, median and mean all equal 58 years.

Skewed distributions:
When a distribution is skewed the mode remains the most commonly occurring value, the median
remains the middle value in the distribution, but the mean is generally ‘pulled’ in the direction of the
tails. In a skewed distribution, the median is often a preferred measure of central tendency, as the
mean is not usually in the middle of the distribution.

A distribution is said to be positively or right skewed when the tail on the right side of the
distribution is longer than the left side. In a positively skewed distribution it is common for the mean
to be ‘pulled’ toward the right tail of the distribution. Although there are exceptions to this rule,
generally, most of the values, including the median value, tend to be less than the mean value.

The following graph shows a larger retirement age data set with a distribution which is right skewed.
The data has been grouped into classes, as the variable being measured (retirement age) is
continuous. The mode is 54 years, the modal class is 54-56 years, the median is 56 years and the
mean is 57.2 years.

A distribution is said to be negatively or left skewed when the tail on the left side of the distribution
is longer than the right side. In a negatively skewed distribution, it is common for the mean to be
‘pulled’ toward the left tail of the distribution. Although there are exceptions to this rule, generally,
most of the values, including the median value, tend to be greater than the mean value.

The following graph shows a larger retirement age dataset with a distribution which is left skewed. The
mode is 65 years, the modal class is 63-65 years, the median is 63 years and the mean is 61.8 years.

Outliers influence on measures of central tendency
Outliers are extreme, or atypical data value(s) that are notably different from the rest of the data.
It is important to detect outliers within a distribution, because they can alter the results of the data
analysis. The mean is more sensitive to the existence of outliers than the median or mode.

Consider the initial retirement age dataset again, with one difference; the last observation of 60
years has been replaced with a retirement age of 81 years. This value is much higher than the other
values, and could be considered an outlier. However, it has not changed the middle of the
distribution, and therefore the median value is still 57 years.

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 81

As all values are included in the calculation of the mean, the outlier will influence the mean
value.

(54+54+54+55+56+57+57+58+58+60+81 = 644), divided by 11 = 58.5 years

In this distribution the outlier value has increased the mean value.

Despite the existence of outliers in a distribution, the mean can still be an appropriate measure of
central tendency, especially if the rest of the data is normally distributed. If the outlier is confirmed
as a valid extreme value, it should not be removed from the dataset. Several common regression
techniques can help reduce the influence of outliers on the mean value.
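
A short Python check of this effect on the altered dataset used above:

from statistics import mean, median

ages = [54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 81]  # 60 replaced by the outlier 81

print(median(ages))          # 57   - unchanged by the outlier
print(round(mean(ages), 1))  # 58.5 - pulled upwards by the outlier (644 / 11)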

CHAPTER 3: MEASURE OF DISPERSION
Introduction & Terminology
The word dispersion means deviation or difference. In statistics it refers to the deviation of the
values of a variable from their central value. Measures of dispersion indicate the extent to
which individual observations vary from their average, i.e. the mean, median or mode. They show
the spread of the items of a series from their central value. This is otherwise known as variation
or dispersion.

Definitions of terms:

"Dispersion is the measure of variation of the variables about a central value."

"Dispersion is a measure of the extent to which the individual items vary."

Characteristics & Objectives of Dispersion


Requisites of a good measure of dispersion (or characteristics of ideal measures of dispersion):

i) It should be rigidly defined.
ii) It should be easy to understand and simple to calculate.
iii) It should be based on all the observations of the series.
iv) It should be capable of further algebraic treatment.
v) It should not be affected much by sampling fluctuations.
vi) It should not be affected by the extreme items in the series.
Objectives or uses of dispersion:

i) Measures of dispersion tell us whether an average is a true representative of the series or not.
ii) The extent of variability between two or more series can be compared with the help of the measures of dispersion.
iii) They are used to determine the degree of uniformity, reliability and consistency amongst two or more sets of data.
iv) Measures of dispersion are used in statistical measures like correlation, regression etc. for further analysis.

Absolute and relative measures of dispersion


The dispersion of a series may be measured either absolutely or relatively. If the dispersion is
expressed in terms of the original units of the series, it is called an absolute measure of
dispersion. The disadvantage of an absolute measure of dispersion is that it is not suitable for
a comparative study of the characteristics of two or more series.

For example the income of workers may be in rupees, while their heights may be in inches.
Thus a comparison to measure their variations cannot be made as both are in different units.

So for comparison point of view it is necessary to calculate the relative measures of
dispersion which are expressed as percentage form (i.e. unit less number). These types
of expressions are called coefficients of dispersion or coefficient of variation.

Types of measures of dispersion


The following are the important measures of dispersion.

1) Range
2) Quartile deviation or Semi inter quartile range.
3) Mean absolute deviation or Mean deviation
4) Standard deviation
5) Lorenz curve
The first two measures are called 'methods of limits', the 3rd and 4th are called 'methods of moments'
and the 5th is a 'graphic method'.

1) Range
The range is calculated as the difference between the smallest and the largest values in a set of data.
The range of a data set is easy to calculate, but it is an insensitive measure of variation (does not
change with a change in the distribution of data), and is not very informative.

The range is calculated by subtracting the smallest value in the data set from the largest value
in the data set:

Range = Largest value - Smallest value

2) Quartile deviation or Semi inter quartile range.


In a distribution, half the difference between the upper quartile and the lower quartile is known as
the 'quartile deviation'. The quartile deviation is often regarded as the semi-interquartile range.

Formula:
QD = (Upper quartile - Lower quartile) / 2

Example:
Upper quartile = 400, lower quartile = 200, then Quartile deviation (QD) = (400-200)/2 = 200/2 = 100.

Example 2:
Calculate the QD for the group of data 241, 521, 421, 250, 300, 365, 840, 958.

Solution:
Given data = {241, 521, 421, 250, 300, 365, 840, 958}

Step 1: Arrange the given values in ascending order: 241, 250, 300, 365, 421, 521, 840, 958. The total number of values (n) = 8.

Step 2: Calculate the centre position n/2 for the given data: n/2 = 8/2 = 4. From the ordered data, the fourth value is 365.

Step 3: Now find the (n/2 + 1)th value: n/2 + 1 = 4 + 1 = 5. From the ordered data, the fifth value is 421.

Step 4: From the ordered data, take the first four values as the lower half, Q1 group = {241, 250, 300, 365}, and the last four values as the upper half, Q3 group = {421, 521, 840, 958}.

Step 5: Find the median of the lower half {241, 250, 300, 365}. The count is 4, so the second value (250) and the third value (300) form the middle pair, and Q1 = (250 + 300)/2 = 550/2 = 275.

Step 6: Find the median of the upper half {421, 521, 840, 958}. The second value is 521 and the third value is 840, so Q3 = (521 + 840)/2 = 1361/2 = 680.5.

Step 7: Finally, compute the quartile deviation: QD = (Q3 - Q1)/2 = (680.5 - 275)/2 = 202.75
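
A minimal Python sketch of the same split-halves method (the function name is my own; the split assumes an even number of values, as in the example above):

from statistics import median

def quartile_deviation(values):
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])   # median of the lower half
    q3 = median(data[half:])   # median of the upper half
    return (q3 - q1) / 2

print(quartile_deviation([241, 521, 421, 250, 300, 365, 840, 958]))  # 202.75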

3) Mean absolute deviation or Mean deviation

The formula is:

Mean Deviation = Σ|x - μ| / N

Note:

• μ is the mean (in our example μ = 9)


• x is each value (such as 3 or 16)
• N is the number of values (in our example N = 8)

Absolute Deviation = |x - μ|

Example: the Mean Deviation of 3, 6, 6, 7, 8, 11, 15, 16

Step 1: Find the mean:

μ = (3 + 6 + 6 + 7 + 8 + 11 + 15 + 16) / 8 = 72/8 = 9

Step 2: Find the Absolute Deviations:

x |x - μ|

3 6

6 3
6 3

7 2

8 1

11 2

15 6

16 7

Σ|x - μ| = 30

Step 3. Find the Mean Deviation:

Mean Deviation = Σ|x - μ| / N = 30/8 = 3.75
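
The same calculation can be written as a short Python sketch (the function name is my own):

def mean_deviation(values):
    mu = sum(values) / len(values)                        # arithmetic mean
    return sum(abs(x - mu) for x in values) / len(values)

print(mean_deviation([3, 6, 6, 7, 8, 11, 15, 16]))  # 3.75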

4) Standard deviation
Standard deviation (SD) (represented by the Greek letter sigma, σ) is a measure that is used to
quantify the amount of variation or dispersion of a set of data values.

The sample standard deviation formula is:

s = √( Σ(x - x̄)² / (n - 1) )

where,

s = sample standard deviation
Σ = sum of...
x̄ = sample mean
n = number of scores in sample.
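
A quick Python check using the standard library (the data are the mean deviation example values, reused purely for illustration):

from statistics import stdev

data = [3, 6, 6, 7, 8, 11, 15, 16]
# stdev() uses the sample formula above, i.e. division by (n - 1).
print(round(stdev(data), 2))  # 4.6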

5) Lorenz curve
A graphical representation of wealth distribution developed by American economist Max Lorenz in
1905. On the graph, a straight diagonal line represents perfect equality of wealth distribution; the
Lorenz curve lies beneath it, showing the reality of wealth distribution. The difference between the
straight line and the curved line is the amount of inequality of wealth distribution, a figure described
by the Gini coefficient.

The Lorenz curve can be used to show what percentage of a nation's residents possess what
percentage of that nation's wealth. For example, it might show that the country's poorest 10%
possess 2% of the country's wealth.

Lorenz curve is a graphical representation of the cumulative distribution function of the empirical
probability distribution of wealth or income, and was developed by Max O. Lorenz in 1905 for
representing inequality of the wealth distribution.

Interpreting skewness and kurtosis

Skewness quantifies how symmetrical the distribution is.

•A symmetrical distribution has a skewness of zero.


•An asymmetrical distribution with a long tail to the right (higher values) has a positive
skew.
•An asymmetrical distribution with a long tail to the left (lower values) has a negative skew.
•The skewness is unit-less.

•Any threshold or rule of thumb is arbitrary, but here is one: If the skewness is greater than
1.0 (or less than -1.0), the skewness is substantial and the distribution is far from
symmetrical.

Kurtosis quantifies whether the shape of the data distribution matches the Gaussian
distribution.

•A Gaussian distribution has a kurtosis of 0.


•A flatter distribution has a negative kurtosis,
•A distribution more peaked than a Gaussian distribution has a positive kurtosis.
•Kurtosis has no units.
•The value that Prism reports is sometimes called the excess kurtosis since the
expected kurtosis for a Gaussian distribution is 0.0.
•An alternative definition of kurtosis is computed by adding 3 to the value reported by
Prism. With this definition, a Gaussian distribution is expected to have a kurtosis of 3.0.
How skewness is computed
Skewness has been defined in multiple ways. The steps below explain the method used by
Prism, called g1 (the most common method). It is identical to the skew() function in Excel.
1. We want to know about symmetry around the sample mean. So the first step is to
subtract the sample mean from each value. The result will be positive for values greater than
the mean, negative for values that are smaller than the mean, and zero for values that exactly
equal the mean.
2. To compute a unitless measure of skewness, divide each of the differences computed
in step 1 by the standard deviation of the values. These ratios (the difference between each
value and the mean divided by the standard deviation) are called z ratios. By definition, the
average of these values is zero and their standard deviation is 1.
3. For each value, compute z³. Note that cubing values preserves the sign. The cube of a
positive value is still positive, and the cube of a negative value is still negative.
4. Average the list of z³ values by dividing the sum of those values by n-1, where n is the number
of values in the sample. If the distribution is symmetrical, the positive and negative values
will balance each other, and the average will be close to zero. If the distribution is not
symmetrical, the average will be positive if the distribution is skewed to the right, and
negative if skewed to the left. Why n-1 rather than n? For the same reason that n-1 is used
when computing the standard deviation.
5. Correct for bias. The average computed in step 4 is biased with small samples -- its
absolute value is smaller than it should be. Correct for the bias by multiplying the mean of
z³ by the ratio n/(n-2). This correction increases the

value if the skewness is positive, and makes the value more negative if the skewness is
negative. With large samples, this correction is trivial. But with small samples, the correction
is substantial.
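
A Python sketch of the steps listed above (it follows the procedure exactly as described, using the sample standard deviation; the function name and the example data are my own):

from statistics import mean, stdev

def skewness(values):
    n = len(values)
    mu = mean(values)
    sd = stdev(values)                                   # sample standard deviation
    z_cubed = [((x - mu) / sd) ** 3 for x in values]     # steps 1-3
    g = sum(z_cubed) / (n - 1)                           # step 4: average using n - 1
    return g * n / (n - 2)                               # step 5: small-sample bias correction

print(skewness([3, 6, 6, 7, 8, 11, 15, 16]))  # positive: these data have a long right tail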

CHAPTER 4: CORRELATION AND


REGRESSION
Introduction
Correlations and regressions: both use similar mathematical procedures to provide a
measure of relation; the degree to which two continuous variables vary together ... or covary

The term correlation is used when 1) both variables are random variables, and 2) the end
goal is simply to find a number that expresses the relation between the variables.

The term regression is used when 1) one of the variables is a fixed variable, and 2) the end
goal is to use the measure of relation to predict values of the random variable based on values of
the fixed variable.

The word correlation is used in everyday life to denote some form of association. We might
say that we have noticed a correlation between foggy days and attacks of wheeziness.
However, in statistical terms we use correlation to denote association between two
quantitative variables. We also assume that the association is linear, that one variable
increases or decreases a fixed amount for a unit increase or decrease in the other. The other
technique that is often used in these circumstances is regression, which involves estimating
the best straight line to summarise the association.

Computation of parameters related to correlation


Correlation coefficient
The degree of association is measured by a correlation coefficient, denoted by r. It is
sometimes called Pearson's correlation coefficient after its originator and is a measure of
linear association. If a curved line is needed to express the relationship, other and more
complicated measures of the correlation must be used.

The correlation coefficient is measured on a scale that varies from + 1 through 0 to - 1.


Complete correlation between two variables is expressed by either + 1 or -1. When one
variable increases as the other increases the correlation is positive; when one decreases as the
other increases it is negative. Complete absence of correlation is represented by 0. The figure
below, Correlation illustrated, gives some graphical representations of correlation.

Figure: Correlation illustrated.

Looking at data: dependent, independent Variable & the scatter diagrams


When an investigator has collected two series of observations and wishes to see whether there
is a relationship between them, he or she should first construct a scatter diagram. The vertical
scale represents one set of measurements and the horizontal scale the other. If one set of
observations consists of experimental results and the other consists of a time scale or
observed classification of some kind, it is usual to put the experimental results on the vertical
axis. These represent what is called the "dependent variable". The "independent variable",
such as time or height or some other observed classification is measured along the horizontal
axis, or baseline.

The words "independent" and "dependent" could puzzle the beginner because it is sometimes
not clear what is dependent on what. This confusion is a triumph of common sense over
misleading terminology, because often each variable is dependent on some third variable,
which may or may not be mentioned. It is reasonable, for instance, to think of the height of
children as dependent on age rather than the converse but consider a positive correlation
between mean tar yield and nicotine yield of certain brands of cigarette.' The nicotine
liberated is unlikely to have its origin in the tar: both vary in parallel with some other factor
or factors in the composition of the cigarettes. The yield of the one does not seem to be
"dependent" on the other in the sense that, on average, the height of a child depends on his
age. In such cases it often does not matter which scale is put on which axis of the scatter
diagram. However, if the intention is to make inferences about one variable from the other,

the observations from which the inferences are to be made are usually put on the baseline. As
a further example, a plot of monthly deaths from heart disease against monthly sales of ice
cream would show a negative association. However, it is hardly likely that eating ice cream
protects from heart disease! It is simply that the mortality rate from heart disease is inversely
related - and ice cream consumption positively related - to a third factor, namely
environmental temperature.

Calculation of the correlation coefficient


A paediatric registrar has measured the pulmonary anatomical dead space (in ml) and height
(in cm) of 15 children. The data are given in table 11.1 below and the scatter diagram shown
in figure 11.2 Each dot represents one child, and it is placed at the point corresponding to the
measurement of the height (horizontal axis) and the dead space (vertical axis). The registrar
now inspects the pattern to see whether it seems likely that the area covered by the dots
centres on a straight line or whether a curved line is needed. In this case the pediatrician
decides that a straight line can adequately describe the general trend of the dots. His next step
will therefore be to calculate the correlation coefficient.

When making the scatter diagram (figure 11.2) to show the heights and pulmonary
anatomical dead spaces in the 15 children, the pediatrician set out figures as in columns (1),
(2), and (3) of table 11.1 . It is helpful to arrange the observations in serial order of the
independent variable when one of the two variables is clearly identifiable as independent. The
corresponding figures for the dependent variable can then be examined in relation to the
increasing series for the independent variable. In this way we get the same picture, but in
numerical form, as appears in the scatter diagram.

Figure 11.2 Scatter diagram of relation in 15 children between height and pulmonary
anatomical dead space.

The calculation of the correlation coefficient is as follows, with x representing the values of
the independent variable (in this case height) and y representing the values of the dependent
variable (in this case anatomical dead space). The formula to be used is:

r = Σ(x - x̄)(y - ȳ) / √( Σ(x - x̄)² Σ(y - ȳ)² )

which can be shown to be equal to:

r = ( Σxy - n x̄ ȳ ) / ( (n - 1) SD(x) SD(y) )
Calculator procedure
Find the mean and standard deviation of x, as described earlier.

Find the mean and standard deviation of y:

Subtract 1 from n and multiply by SD(x) and SD(y), (n - 1)SD(x)SD(y)

This gives us the denominator of the formula. (Remember to exit from "Stat" mode.)

For the numerator multiply each value of x by the corresponding value of y, add these values
together and store them.

110 x 44 = Min

116 x 31 = M+

etc.

This stores Σxy in memory. Now subtract n × x̄ × ȳ:

MR - 15 x 144.6 x 66.93 (= 5426.6)

Finally divide the numerator by the denominator:

r = 5426.6/6412.0609 = 0.846.
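
The same calculation can be checked in Python; a minimal sketch (the helper function is my own, and the 15 height/dead space pairs of table 11.1 are not reproduced here):

from math import sqrt

def pearson_r(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    den = sqrt(sum((xi - mean_x) ** 2 for xi in x) *
               sum((yi - mean_y) ** 2 for yi in y))
    return num / den

# Applied to the height and dead space columns of table 11.1,
# this returns approximately 0.846, as calculated above.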

The correlation coefficient of 0.846 indicates a strong positive correlation between size of
pulmonary anatomical dead space and height of child. But in interpreting correlation it is
important to remember that correlation is not causation. There may or may not be a causative
connection between the two correlated variables. Moreover, if there is a connection it may be
indirect.

A part of the variation in one of the variables (as measured by its variance) can be thought of
as being due to its relationship with the other variable and another part as due to
undetermined (often "random") causes. The part due to the dependence of one variable on the
other is measured by r². For these data r² = 0.716, so we can say that 72% of the variation
between children in size of the anatomical dead space is accounted for by the height of the
child. If we wish to label the strength of the association, for absolute values of r, 0-0.19 is
regarded as very weak, 0.2-0.39 as weak, 0.40-0.59 as moderate, 0.6-0.79 as strong and 0.8-1
as very strong correlation, but these are rather arbitrary limits, and the context of the results
should be considered.

Significance test
To test whether the association is merely apparent, and might have arisen by chance, use the t
test in the following calculation:

t = r √(n - 2) / √(1 - r²)     (11.1)

The t is entered at n - 2 degrees of freedom.

For example, the correlation coefficient for these data was 0.846.

The number of pairs of observations was 15. Applying equation 11.1, we have:

t = 0.846 × √(15 - 2) / √(1 - 0.846²) = 5.72

Entering table B at 15 - 2 = 13 degrees of freedom we find that at t = 5.72, P<0.001 so the


correlation coefficient may be regarded as highly significant. Thus (as could be seen
immediately from the scatter plot) we have a very strong correlation between dead space and
height which is most unlikely to have arisen by chance.
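
A short Python check of this t value:

from math import sqrt

r, n = 0.846, 15
t = r * sqrt(n - 2) / sqrt(1 - r ** 2)
print(round(t, 2))  # 5.72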

The assumptions governing this test are:

1. That both variables are plausibly Normally distributed.
2. That there is a linear relationship between them.
3. The null hypothesis is that there is no association between them.

The test should not be used for comparing two methods of measuring the same quantity, such
as two methods of measuring peak expiratory flow rate. Its use in this way appears to be a
common mistake, with a significant result being interpreted as meaning that one method is
equivalent to the other. The reasons have been extensively discussed(2) but it is worth
recalling that a significant result tells us little about the strength of a relationship. From the
formula it should be clear that even with a very weak relationship (say r = 0.1) we would
get a significant result with a large enough sample (say n over 1000).

Spearman rank correlation


A plot of the data may reveal outlying points well away from the main body of the data,
which could unduly influence the calculation of the correlation coefficient. Alternatively the
variables may be quantitative discrete such as a mole count, or ordered categorical such as a
pain score. A non-parametric procedure, due to Spearman, is to replace the observations by
their ranks in the calculation of the correlation coefficient.

This results in a simple formula for Spearman's rank correlation, Rho (ρ):

ρ = 1 - 6 Σd² / ( n(n² - 1) )

where d is the difference in the ranks of the two variables for a given individual. Thus we can
derive table 11.2 from the data in table 11.1 .

From this we get that

In this case the value is very close to that of the Pearson correlation coefficient. For n> 10,
the Spearman rank correlation coefficient can be tested for significance using the t test given
earlier.
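
A Python sketch of the rank-based formula above (the function and the example data are my own, and for simplicity it assumes there are no tied values):

def spearman_rho(x, y):
    def ranks(values):
        # Rank each value, 1 = smallest (assumes no ties).
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical example: five individuals measured on two variables.
print(spearman_rho([86, 97, 99, 100, 101], [2, 20, 28, 27, 50]))  # 0.9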

The regression equation


Correlation describes the strength of an association between two variables, and is completely
symmetrical, the correlation between A and B is the same as the correlation between B and A.
However, if the two variables are related it means that when one changes by a certain amount
the other changes on an average by a certain amount. For instance, in the children described
earlier greater height is associated, on average, with greater anatomical dead Space. If y
represents the dependent variable and x the independent variable, this relationship is
described as the regression of y on x.

The relationship can be represented by a simple equation called the regression equation. In
this context "regression" (the term is a historical anomaly) simply means that the average
value of y is a "function" of x, that is, it changes with x.

The regression equation representing how much y changes with any given change of x can be
used to construct a regression line on a scatter diagram, and in the simplest case this is
assumed to be a straight line. The direction in which the line slopes depends on whether the
correlation is positive or negative. When the two sets of observations increase or decrease
together (positive) the line slopes upwards from left to right; when one set decreases as the
other increases the line slopes downwards from left to right. As the line must be straight, it
will probably pass through few, if any, of the dots. Given that the association is well
described by a straight line we have to define two features of the line if we are to place it
correctly on the diagram. The first of these is its distance above the baseline; the second is its
slope. They are expressed in the following regression equation:

y = α + βx

With this equation we can find a series of values of y, the dependent variable, that correspond to each of
a series of values of x, the independent variable. The parameters α and β have to be estimated
from the data. The parameter α signifies the distance above the baseline at which the regression
line cuts the vertical (y) axis; that is, the value of y when x = 0. The parameter β (the regression coefficient)
signifies the amount by which change in x must be multiplied to give the corresponding
average change in y, or the amount y changes for a unit increase in x. In this way it represents
the degree to which the line slopes upwards or downwards.

The regression equation is often more useful than the correlation coefficient. It enables us to
predict y from x and gives us a better summary of the relationship between the two variables.
If, for a particular value of x, xi, the regression equation predicts a value yfit, the
prediction error is yi - yfit. It can easily be shown that any straight line passing through the
mean values x̄ and ȳ will give a total prediction error Σ(yi - yfit) of zero, because the positive
and negative terms exactly cancel. To remove the negative signs we square the differences,
and the regression equation is chosen to minimise the sum of squares of the prediction errors,
S² = Σ(yi - yfit)². We denote the sample estimates of alpha and beta by a and b. It can be
shown that the one straight line that minimises S², the least squares estimate, is given by

b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)²

and

a = ȳ - b x̄

It can also be shown that

b = ( Σxy - n x̄ ȳ ) / ( (n - 1) SD(x)² )     (11.2)

which is of use because we have calculated all the components of equation (11.2) in the
calculation of the correlation coefficient.

The calculation of the correlation coefficient on the data in table 11.2 gave the
following:

Applying these figures to the formulae for the regression coefficients, we have:

b = 1.033 and a = ȳ - b x̄ = 66.93 - (1.033 x 144.6) = -82.4

Therefore, in this case, the equation for the regression of y on x becomes

y = 1.033x - 82.4

This means that, on average, for every increase in height of 1 cm the increase in anatomical
dead space is 1.033 ml over the range of measurements made.

The line representing the equation is shown superimposed on the scatter diagram of the data
in figure 11.2. The way to draw the line is to take three values of x, one on the left side of the
scatter diagram, one in the middle and one on the right, and substitute these in the equation,
as follows:

If x = 110, y = (1.033 x 110) - 82.4 = 31.2

If x = 140, y = (1.033 x 140) - 82.4 = 62.2

If x = 170, y = (1.033 x 170) - 82.4 = 93.2

Although two points are enough to define the line, three are better as a check. Having
put them on a scatter diagram, we simply draw the line through them.
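
A minimal Python sketch of the least squares fit described above (the helper function is my own; applied to the height and dead space pairs of table 11.1 it reproduces b = 1.033 and a = -82.4 as given in the text):

def least_squares(x, y):
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of (x - mean_x)(y - mean_y) divided by sum of (x - mean_x) squared.
    b = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    a = mean_y - b * mean_x   # intercept
    return a, b

# Predicting dead space (ml) from height (cm) with the fitted equation:
a, b = -82.4, 1.033           # values obtained in the text
for height in (110, 140, 170):
    print(height, round(a + b * height, 1))   # 31.2, 62.2, 93.2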

Figure 11.3 Regression line drawn on scatter diagram relating height and pulmonary
anatomical dead space in 15 children

The standard error of the slope SE(b) is given by:

SE(b) = Sres / √( Σ(x - x̄)² )     (11.3)

where Sres is the residual standard deviation, given by:

Sres = √( Σ(y - yfit)² / (n - 2) )

This can be shown to be algebraically equal to

Sres = √( (n - 1)(SD(y)² - b² SD(x)²) / (n - 2) )

We already have to hand all of the terms in this expression. Thus Sres is the square root of
171.2, which is 13.08445. The denominator of (11.3) is 72.4680. Thus
SE(b) = 13.08445/72.4680 = 0.18055.

We can test whether the slope is significantly different from zero by:

t = b/SE(b) = 1.033/0.18055 = 5.72.

Again, this has n - 2 = 15 - 2 = 13 degrees of freedom. The assumptions governing this test
are:

1. That the prediction errors are approximately Normally distributed. Note this does not mean
that the x or y variables have to be Normally distributed.
2. That the relationship between the two variables is linear.
3. That the scatter of points about the line is approximately constant - we would not wish the
variability of the dependent variable to be growing as the independent variable increases. If
this is the case try taking logarithms of both the x and y variables.

Note that the test of significance for the slope gives exactly the same value of P as the test of
significance for the correlation coefficient. Although the two tests are derived differently,
they are algebraically equivalent, which makes intuitive sense.

We can obtain a 95% confidence interval for b from

b - t × SE(b)  to  b + t × SE(b)

where the t statistic has 13 degrees of freedom, and is equal to 2.160.

Thus the 95% confidence interval is

1.033 - 2.160 x 0.18055 to 1.033 + 2.160 x 0.18055 = 0.643 to 1.422.

Regression lines give us useful information about the data they are collected from. They show
how one variable changes on average with another, and they can be used to find out what one
variable is likely to be when we know the other - provided that we ask this question within
the limits of the scatter diagram. To project the line at either end - to extrapolate - is always
risky because the relationship between x and y may change or some kind of cut off point may
exist. For instance, a regression line might be drawn relating the chronological age of some
children to their bone age, and it might be a straight line between, say, the ages of 5 and 10
years, but to project it up to the age of 30 would clearly lead to error. Computer packages will
often produce the intercept from a regression equation, with no warning that it may be totally
meaningless. Consider a regression of blood pressure against age in middle aged men. The
regression coefficient is often positive, indicating that blood pressure increases with age. The
intercept is often close to zero, but it would be wrong to conclude that this is a reliable
estimate of the blood pressure in newly born male infants!

Summary
The goal of a correlation analysis is to see whether two measurement variables covary, and to
quantify the strength of the relationship between the variables, whereas regression expresses
the relationship in the form of an equation.

For example, in students taking a Maths and English test, we could use correlation to
determine whether students who are good at Maths tend to be good at English as well, and
regression to determine whether the marks in English can be predicted for given marks in
Maths.

What a Scatter Diagram Tells Us


The starting point is to draw a scatter of points on a graph, with one variable on the X-axis
and the other variable on the Y-axis, to get a feel of the relationship (if any) between the
variables as suggested by the data. The closer the points are to a straight line, the stronger the
linear relationship between two variables.

Why Use Correlation?


We can use the correlation coefficient, such as the Pearson Product Moment Correlation
Coefficient, to test if there is a linear relationship between the variables. To quantify the
strength of the relationship, we can calculate the correlation coefficient (r). Its numerical
value ranges from +1.0 to -1.0. r > 0 indicates positive linear relationship, r < 0 indicates
negative linear relationship while r = 0 indicates no linear relationship.

A Caveat
It must, however, be considered that there may be a third variable related to both of the
variables being investigated, which is responsible for the apparent correlation. Correlation
does not imply causation. Also, a nonlinear relationship may exist between two variables that
would be inadequately described, or possibly even undetected, by the correlation coefficient.

Why Use Regression
In regression analysis, the problem of interest is the nature of the relationship itself between
the dependent variable (response) and the (explanatory) independent variable.

The analysis consists of choosing and fitting an appropriate model, done by the method of
least squares, with a view to exploiting the relationship between the variables to help estimate
the expected response for a given value of the independent variable. For example, if we are
interested in the effect of age on height, then by fitting a regression line, we can predict the
height for a given age.

Assumptions
Some underlying assumptions governing the uses of correlation and regression are as follows.

The observations are assumed to be independent. For correlation, both variables should be
random variables, but for regression only the dependent variable Y must be random. In
carrying out hypothesis tests, the response variable should follow Normal distribution and
the variability of Y should be the same for each value of the predictor variable. A scatter
diagram of the data provides an initial check of the assumptions for regression.

Uses of Correlation and Regression


There are three main uses for correlation and regression.

• One is to test hypotheses about cause-and-effect relationships. In this case, the experimenter determines the values of the X-variable and sees whether variation in X causes variation in Y. For example, giving people different amounts of a drug and measuring their blood pressure.
• The second main use for correlation and regression is to see whether two variables are associated, without necessarily inferring a cause-and-effect relationship. In this case, neither variable is determined by the experimenter; both are naturally variable. If an association is found, the inference is that variation in X may cause variation in Y, or variation in Y may cause variation in X, or variation in some other factor may affect both X and Y.
• The third common use of linear regression is estimating the value of one variable corresponding to a particular value of the other variable.

Mathematical model and regression model


Mathematical model
A mathematical model is an abstract model that uses mathematical language to describe the
behavior of a system.

Mathematical models are used particularly in the natural sciences and engineering disciplines
(such as physics, biology, and electrical engineering) but also in the social sciences (such as
economics, sociology and political science); physicists, engineers, computer scientists, and
economists use mathematical models most extensively.

A mathematical model is a description of a system using mathematical concepts and language.


Mathematical modelling is a method of simulating real-life situations with mathematical equations to forecast their future
behavior. It uses tools such as decision theory, queuing theory, and linear
programming, and requires large amounts of number crunching.

Elements of a mathematical model


Mathematical models can take many forms, including dynamical systems, statistical models,
differential equations, or game theoretic models. These and other types of models can
overlap, with a given model involving a variety of abstract structures. In general,
mathematical models may include logical models. In many cases, the quality of a scientific
field depends on how well the mathematical models developed on the theoretical side agree
with results of repeatable experiments. Lack of agreement between theoretical mathematical
models and experimental measurements often leads to important advances as better theories
are developed.
In the physical sciences, the traditional mathematical model contains four major elements.
These are
1. Governing equations
2. Defining equations
3. Constitutive equations
4. Constraints

Classifications
Mathematical models are usually composed of relationships and variables. Relationships can
be described by operators, such as algebraic operators, functions, differential operators, etc.
Variables are abstractions of system parameters of interest, that can be quantified. Several
classification criteria can be used for mathematical models according to their structure:

• Linear vs. nonlinear: If all the operators in a mathematical model exhibit linearity, the resulting
mathematical model is defined as linear. A model is considered to be nonlinear otherwise. The
definition of linearity and nonlinearity is dependent on context, and linear models may have
nonlinear expressions in them. For example, in a statistical linear model, it is assumed that a
relationship is linear in the parameters, but it may be nonlinear in the predictor variables.
Similarly, a differential equation is said to be linear if it can be written with linear differential
operators, but it can still have nonlinear expressions in it. In a mathematical programming model,
if the objective functions and constraints are represented entirely by linear equations,

then the model is regarded as a linear model. If one or more of the objective functions or
constraints are represented with a nonlinear equation, then the model is known as a nonlinear
model.
Nonlinearity, even in fairly simple systems, is often associated with phenomena such as chaos
and irreversibility. Although there are exceptions, nonlinear systems and models tend to be more
difficult to study than linear ones. A common approach to nonlinear problems is linearization,
but this can be problematic if one is trying to study aspects such as irreversibility, which are
strongly tied to nonlinearity.
• Static vs. dynamic: A dynamic model accounts for time-dependent changes in the state of the
system, while a static (or steady-state) model calculates the system in equilibrium, and thus is
time-invariant. Dynamic models typically are represented by differential equations or difference
equations.
• Explicit vs. implicit: If all of the input parameters of the overall model are known, and the output
parameters can be calculated by a finite series of computations, the model is said to be explicit.
But sometimes it is the output parameters which are known, and the corresponding inputs must
be solved for by an iterative procedure, such as Newton's method or Broyden's method. In such a case the model is said to be implicit. For example, a
jet engine's physical properties such as turbine and nozzle throat areas can be explicitly
calculated given a design thermodynamic cycle (air and fuel flow rates, pressures, and
temperatures) at a specific flight condition and power setting, but the engine's operating cycles
at other flight conditions and power settings cannot be explicitly calculated from the constant
physical properties.
• Discrete vs. continuous: A discrete model treats objects as discrete, such as the particles in a
molecular model or the states in a statistical model; while a continuous model represents the
objects in a continuous manner, such as the velocity field of fluid in pipe flows, temperatures and
stresses in a solid, and electric field that applies continuously over the entire model due to a
point charge.
• Deterministic vs. probabilistic (stochastic): A deterministic model is one in which every set of
variable states is uniquely determined by parameters in the model and by sets of previous states
of these variables; therefore, a deterministic model always performs the same way for a given
set of initial conditions. Conversely, in a stochastic model—usually called a "statistical model"—
randomness is present, and variable states are not described by unique values, but rather by
probability distributions.
• Deductive, inductive, or floating: A deductive model is a logical structure based on a theory. An
inductive model arises from empirical findings and generalization from them. The floating model
rests on neither theory nor observation, but is merely the invocation of expected structure.
Application of mathematics in social sciences outside of economics has been criticized for
unfounded models. Application of catastrophe theory in science has been characterized as a
floating model.

Significance in the natural sciences
Mathematical models are of great importance in the natural sciences, particularly in physics.
Physical theories are almost invariably expressed using mathematical models.

Throughout history, more and more accurate mathematical models have been developed.
Newton's laws accurately describe many everyday phenomena, but at certain limits relativity
theory and quantum mechanics must be used; even these do not apply to all situations and
need further refinement. It is possible to obtain the less accurate models in appropriate limits,
for example relativistic mechanics reduces to Newtonian mechanics at speeds much less than
the speed of light. Quantum mechanics reduces to classical physics when the quantum
numbers are high. For example, the de Broglie wavelength of a tennis ball is insignificantly
small, so classical physics is a good approximation to use in this case.

It is common to use idealized models in physics to simplify things. Massless ropes, point
particles, ideal gases and the particle in a box are among the many simplified models used in
physics. The laws of physics are represented with simple equations such as Newton's laws,
Maxwell's equations and the Schrödinger equation. These laws serve as a basis for making
mathematical models of real situations. Many real situations are very complex and are therefore
modeled approximately on a computer; a model that is computationally feasible to compute is
made from the basic laws or from approximate models derived from the basic laws. For
example, molecules can be modeled by molecular orbital models that are approximate
solutions to the Schrödinger equation. In engineering, physics models are often made by
mathematical methods such as finite element analysis.

Different mathematical models use different geometries that are not necessarily accurate
descriptions of the geometry of the universe. Euclidean geometry is much used in classical
physics, while special relativity and general relativity are examples of theories that use
geometries which are not Euclidean.

Some applications
Since prehistoric times, simple models such as maps and diagrams have been used.

Often when engineers analyze a system to be controlled or optimized, they use a
mathematical model. In analysis, engineers can build a descriptive model of the system as a
hypothesis of how the system could work, or try to estimate how an unforeseeable event
could affect the system. Similarly, in control of a system, engineers can try out different
control approaches in simulations.

A mathematical model usually describes a system by a set of variables and a set of equations
that establish relationships between the variables. Variables may be of many types; real or
integer numbers, Boolean values or strings, for example. The variables represent some
properties of the system, for example, measured system outputs often in the form of signals,

timing data, counters, and event occurrence (yes/no). The actual model is the set of functions
that describe the relations between the different variables.

Regression model
A frequently applied statistical technique that serves as a basis for studying and
characterizing a system of interest, by formulating a mathematical model of the relation
between a response variable, y and a set of q explanatory variables x1, x2, … xq. The
choice of the explicit form of the model may be based on previous knowledge of the system
or on considerations such as smoothness and continuity of y as a function of the x
variables. In very general terms all such models can be considered to be of the form

y = f(x1, ..., xq) + e

where the function f reflects the true but unknown relationship between y and the explanatory
variables. The random additive error e, which is assumed to have mean 0 and variance σe²,
reflects the dependence of y on quantities other than x1, ..., xq. The goal is to formulate an
estimated function f̂(x1, x2, ..., xq) that is a reasonable approximation of f. If the correct
parametric form of f is known, then methods such as least squares estimation or maximum
likelihood estimation can be used to estimate the set of unknown coefficients. If f is linear in
the parameters, for example, then the model is that of multiple regression. If the experimenter
is unwilling to assume a particular parametric form for f then nonparametric regression
modeling can be used, for example kernel regression smoothing, recursive partitioning
regression or multivariate adaptive regression splines.

Principles of least square method


When you need to estimate a sample regression function (SRF), the most common
econometric method is the ordinary least squares (OLS) technique, which uses the least
squares principle to fit a prespecified regression function through your sample data.

The least squares principle states that the SRF should be constructed (with the constant and
slope values) so that the sum of the squared distance between the observed values of your
dependent variable and the values estimated from your SRF is minimized (the smallest
possible value).

Least Squares Estimation of B0 and B1

The simple, linear regression model is given by:

Yi = B0 + B1Xi + ei i = 1…N

where Yi = value of response variable for the ith person

B0,B1 = population parameters intercept and slope, respectively

Xi = value of fixed variable X for the ith person

ei = random error term with mean = 0

We want to calculate values from a sample which will estimate Bo and B1 in the model, such that the
sum of the squared residuals, or errors of prediction, is minimized.

Let S = ΣEi² = Σ(Yi - Ŷi)² = Σ(Yi - Bo - B1Xi)²        (1)
Then the estimates bo and b1 are called the least squares estimates of Bo and B1. To find these
estimates:

Step 1: Find the partial derivative of (1) with respect to B0 and the partial derivative of (1) with
respect to B1.

First, expand the right side of (1) and distribute the summation sign:

Y B B X  Y


i o 1 i
2
i
2
2Yi Bo B1Xi Bo B1Xi 2 
Y i2 2BoYi 2B1XiYi  Bo2  2BoB1Xi  Bi2Xi2 
Yi2 2BoYi 2B1XiYi NBo2  2BoB1Xi Bi2Xi2

From calculus the partial derivatives are:

∂S/∂Bo = -2ΣYi + 2NBo + 2B1ΣXi

∂S/∂B1 = -2ΣXiYi + 2BoΣXi + 2B1ΣXi²

Step 2: Rearrange terms, set the two partial derivatives equal to 0

2NBo + 2B1ΣXi - 2ΣYi = 0

2BoΣXi + 2B1ΣXi² - 2ΣXiYi = 0

Since we are now going to solve for the values of our sample estimates bo and b1, replace Bo, B1
and N by bo, b1 and n in the two simultaneous equations above; dividing by 2 yields the two normal equations:

nbo + b1ΣXi - ΣYi = 0

boΣXi + b1ΣXi² - ΣXiYi = 0

NOTE: ΣYi = nbo + b1ΣXi = Σ(bo + b1Xi) = ΣŶi, so ΣYi = ΣŶi; in words, the sum of the
observed Y values equals the sum of the fitted values, which is one of the properties of a fitted linear
regression line.

Step 3: Solve the simultaneous normal equations to give the estimates that will minimize S.

First multiply both sides of the first equation by ∑Xi and the second equation by n.

nboXi b1(Xi )2 XiYi  0 nboXi nb1X12

nXiYi  0

Subtract the first equation from the second yielding,

nb1Xi2 nXiYi b1(Xi )2 XiYi  0

Factor b1 from the two terms involving it,

b1(nXi2 (Xi )2 ) nXiYi XiYi  0

Solve for b1,

b1 (nnXXiY2i(XXii)Y2i)

By dividing both numerator and denominator by n, this can be expressed as:

b1 = (ΣXiYi - ΣXiΣYi/n) / (ΣXi² - (ΣXi)²/n), which is equivalent to

b1 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)²

Either one, then, can be used to estimate the slope b1

Having found b1 either of the normal equations found in Step 2 can be used to find bo. For example
using the first one,

Yi X i
nbo b1Xi Yi 0 leads to bo  b1
n n

Thus bo Y b1 X

This result illustrates another property of the fitted regression line: the line passes through the point
(X̄, Ȳ).
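
As a quick numerical check of these formulas, the following is a minimal Python sketch; the X and Y values are made up purely for illustration.

    import numpy as np

    # Hypothetical sample data
    X = np.array([1, 2, 3, 4, 5], dtype=float)
    Y = np.array([2.1, 2.9, 3.6, 4.4, 5.2], dtype=float)
    n = len(X)

    # b1 = (n*sum(XiYi) - sum(Xi)*sum(Yi)) / (n*sum(Xi^2) - (sum(Xi))^2)
    b1 = (n * np.sum(X * Y) - np.sum(X) * np.sum(Y)) / (n * np.sum(X**2) - np.sum(X)**2)
    # b0 = Ybar - b1 * Xbar
    b0 = Y.mean() - b1 * X.mean()

    print("b1 =", round(b1, 4), "b0 =", round(b0, 4))
    # Property check: the fitted line passes through the point of means (Xbar, Ybar)
    print(np.isclose(b0 + b1 * X.mean(), Y.mean()))   # True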

Normal equations
Normal equations are equations obtained by setting equal to zero the partial derivatives of the sum
of squared errors (least squares); normal equations allow one to estimate the parameters of a
multiple linear regression.

MATHEMATICAL ASPECTS
Consider a general model of multiple linear regression:

Yi = β0 + Σ(j=1 to p-1) βj Xji + εi,   i = 1, ..., n,

where Yi is the dependent variable, Xji (j = 1, ..., p-1) are the independent variables, εi is the
nonobservable random error term, βj (j = 0, ..., p-1) are the parameters to be estimated, and n is the
number of observations.

To apply the method of least squares, we should minimize the sum of squared errors: ...

Given a matrix equation Ax = b, the normal equation is

A^T A x = A^T b

Its solution minimizes the sum of the squared differences between the left and right sides of Ax = b.

It is called a normal equation because the residual Ax - b is then normal (orthogonal) to the range of A.

Here, A^T A is a normal matrix.

Solve normal equations to obtain the regression equation


It can be demonstrated, using calculus that the ordinary least-squares estimates of the partial
regression coefficients for a multiple regression equation are given by a series of equations
known as the normal equations. A derivation of the normal equations is presented in
Appendix D. The simplest form for these equations is in terms of the correlations among the
variables. In short, we can assume, without any loss of generality, that all of the variables are
in standard form. Specifically, the normal equations can be written such that the correlation
between the dependent variable and each independent variable can be expressed as a linear
function of the standardized partial regression coefficients and the correlations among the
independent variables. For example, the normal equations for a multiple regression equation
with three independent variables can be written as follows:

ry1 = by1.23 + by2.13 r12 + by3.12 r13
ry2 = by1.23 r21 + by2.13 + by3.12 r23
ry3 = by1.23 r31 + by2.13 r32 + by3.12

The correlations between the dependent variable and the independent variable, ryi, and the
correlations between the independent variables, rij, can be readily calculated from the
observations on these variables.
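
A minimal Python sketch of solving the normal equations for a multiple regression in matrix form, (X'X)b = X'y; the data are made up for illustration and np.linalg.solve is used rather than inverting X'X explicitly.

    import numpy as np

    # Hypothetical observations: two explanatory variables and a response
    X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
    y  = np.array([3.1, 3.9, 6.2, 6.8, 9.1, 9.7])

    # Design matrix with a leading column of ones for the intercept
    X = np.column_stack([np.ones_like(X1), X1, X2])

    # Normal equations: (X'X) b = X'y
    b = np.linalg.solve(X.T @ X, X.T @ y)
    print("estimated coefficients (b0, b1, b2):", np.round(b, 4))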

Using regression equation of forest


There are two approaches for estimating the biomass density of woody formations based on
existing forest inventory data.
The first approach is based on the use of existing measured volume estimates (VOB per ha)
converted to biomass density (t/ha) using a variety of "tools".
The second approach directly estimates biomass density using biomass regression equations.

These regression equations are mathematical functions that relate oven-dry biomass per tree
as a function of a single or a combination of tree dimensions. They are applied to stand tables
or measurements of individual trees in stands or in lines (e.g., windbreaks, live fence posts,
home gardens). The advantage of this second method is that it produces biomass estimates

without having to make volume estimates, followed by application of expansion factors to
account for non-inventoried tree components.
The disadvantage is that relatively few inventories report stand tables down to small diameter
classes for all species. Thus, not all countries in the tropics are covered by these estimates. To
use either of these methods, the inventory must include all tree species. There is no way
to extrapolate from inventories that do not measure all species.

Use of forest inventory data overcomes many of the problems present in ecological studies.
Data from forest inventories are generally more abundant and are collected from large sample
areas (subnational to national level) using a planned sampling method designed to represent
the population of interest. However, inventories are not without their problems. Typical
problems include:

• Inventories tend to be conducted in forests that are viewed as having commercial value,
i.e., closed forests, with little regard to the open, drier forests or woodlands upon which so
many people depend for non-industrial timber.
• The minimum diameter of trees included in inventories is often greater than 10 cm and
sometimes as large as 50 cm; this excludes smaller trees which can account for more than
30% of the biomass.

• The maximum diameter class in stand tables is generally open-ended with trees greater
than 80 cm in diameter often lumped into one class. The actual diameter distribution of
these large trees significantly affects aboveground biomass density.
• Not all tree species are included, only those perceived to have commercial value at the
time of the inventory.
• Inventory reports often leave out critical data, and in most cases, field measurements are
not archived and are therefore lost.
• The definition of inventoried volume is not always consistent.
• Very little descriptive information is given about the actual condition of the forests; they
are often described as primary, but diameter distributions and volumes suggest otherwise
(e.g., Brown et al. 1991, 1994).
• Many of the inventories are old, 1970s or earlier, and the forests may have disappeared or
changed.

Despite the above problems, many inventories are very useful for estimating biomass density
of forests.

Assumptions made in linear regression


Linear regression analysis makes several key assumptions:
• A Linear Relationship between the outcome variable and the independent variables. A
plot of the standardized residuals versus the predicted Y' values shows whether there is a
linear or curvilinear relationship.

• Multivariate Normality--Multiple regression assumes that the variables are normally
distributed.
• No Multicollinearity--This assumption requires that the independent variables are not
highly correlated with each other. It is commonly tested with the Variance Inflation
Factor (VIF) statistic.
• Homoscedasticity--This assumption requires that the variance of the error terms is similar
across values of the independent variables. As with the linear relationship assumption, a
plot of the standardized residuals versus the predicted Y' values can show whether points
are equally distributed across all values of the independent variables or not (see the
sketch after this list).
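
A rough Python sketch of two of these checks on hypothetical data: residuals against fitted values (for linearity and homoscedasticity), and a variance inflation factor computed as 1/(1 - R²) from regressing one predictor on the others. It is only an illustration of the idea, not a complete diagnostic suite.

    import numpy as np

    rng = np.random.default_rng(0)
    X1 = rng.normal(size=100)
    X2 = 0.5 * X1 + rng.normal(size=100)          # mildly correlated with X1
    y = 1.0 + 2.0 * X1 - 1.0 * X2 + rng.normal(size=100)

    # Fit the regression by least squares
    X = np.column_stack([np.ones(100), X1, X2])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ b
    residuals = y - fitted   # plot residuals against fitted values to inspect linearity/homoscedasticity

    def vif(target, others):
        # Variance Inflation Factor: 1 / (1 - R^2) of one predictor regressed on the other predictors
        Z = np.column_stack([np.ones(len(target)), others])
        coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
        r2 = 1 - np.sum((target - Z @ coef) ** 2) / np.sum((target - target.mean()) ** 2)
        return 1 / (1 - r2)

    print("VIF for X1:", round(vif(X1, X2.reshape(-1, 1)), 2))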

CHAPTER 5: TIME SERIES ANALYSIS


Introduction to time series
A time series is a series of data points indexed (or listed or graphed) in time order. Most
commonly, a time series is a sequence taken at successive equally spaced points in time.

A time series is a sequence of measurements of the same variable collected over time. Most
often, the measurements are made at regular time intervals.

One difference from standard linear regression is that the data are not necessarily independent
and not necessarily identically distributed. One defining characteristic of time series is that
this is a list of observations where the ordering matters. Ordering is very important because
there is dependency and changing the order could change the meaning of the data.

Basic Objectives of the Analysis


The basic objective usually is to determine a model that describes the pattern of the time
series. Uses for such a model are:

1. To describe the important features of the time series pattern.


2. To explain how the past affects the future or how two time series can "interact".
3. To forecast future values of the series.

4. To possibly serve as a control standard for a variable that measures the quality of
product in some manufacturing situations.

Types of Models
There are two basic types of "time domain" models.

1. Models that relate the present value of a series to past values and past prediction
errors - these are called ARIMA models (for Autoregressive Integrated Moving
Average). We'll spend substantial time on these.
2. Ordinary regression models that use time indices as x-variables. These can be
helpful for an initial description of the data and form the basis of several simple
forecasting methods.

Important Characteristics to Consider First


Some important questions to first consider when first looking at a time series are:

• Is there a trend, meaning that, on average, the measurements tend to increase (or
decrease) over time?
• Is there seasonality, meaning that there is a regularly repeating pattern of highs
and lows related to calendar time such as seasons, quarters, months, days of the
week, and so on?
• Are there outliers? In regression, outliers are far away from your line. With time
series data, your outliers are far away from your other data.
• Is there a long-run cycle or period unrelated to seasonality factors?
• Is there constant variance over time, or is the variance non-constant?
• Are there any abrupt changes to either the level of the series or the variance?

• Time period: the time elapsed between two values of the same magnitude is defined as
the period of a cycle.

The Components of Time Series


The factors that are responsible for bringing about changes in a time series, also called the
components of time series, are as follows:

1. Secular Trends (or General Trends)


2. Seasonal Movements
3. Cyclical Movements
4. Irregular/Random Fluctuations

Secular Trends
The secular trend is the main component of a time series which results from long term effects
of socio-economic and political factors. This trend may show the growth or decline in a time
series over a long period. This is the type of tendency which continues to persist for a very
long period. Prices and export and import data, for example, reflect obviously increasing
tendencies over time.

Seasonal Trends
These are short term movements occurring in data due to seasonal factors. The short term is
generally considered as a period in which changes occur in a time series with variations in
weather or festivities. For example, it is commonly observed that the consumption of
ice cream during summer is generally high and hence an ice-cream dealer's sales would be
higher in some months of the year and relatively lower during the winter months. Employment,
output, exports, etc., are subject to change due to variations in weather. Similarly, the sales of
garments, umbrellas, greeting cards and fireworks are subject to large variations during
festivals like Valentine's Day, Eid, Christmas, New Year's, etc. These types of variations in a
time series are isolated only when the series is provided biannually, quarterly or monthly.

Cyclic Movements
These are long term oscillations occurring in a time series. These oscillations are mostly
observed in economic data and the periods of such oscillations generally extend from
five to twelve years or more. These oscillations are associated with the well known business
cycles. These cyclic movements can be studied provided a long series of measurements, free
from irregular fluctuations, is available.

Irregular Fluctuations
These are sudden changes occurring in a time series which are unlikely to be repeated. They
are components of a time series which cannot be explained by trends, seasonal or cyclic
movements. These variations are sometimes called residual or random components. These
variations, though accidental in nature, can cause a continual change in the trends, seasonal
and cyclical oscillations during the forthcoming period. Floods, fires, earthquakes,
revolutions, epidemics, strikes etc., are the root causes of such irregularities.

Types of Time Series Data


Continuous vs. Discrete
Continuous - observations made continuously in time

Examples:
1. Seawater level as measured by an automated sensor.

2. Carbon dioxide output from an engine.

Discrete - observations made only at certain times.

Examples:
1. Animal species composition measured every month.
2. Bacteria culture size measured every six hours.

Stationary vs. Non-stationary


Stationary - Data that fluctuate around a constant value

Non-stationary - A series in which the parameters of the cycle (i.e., length, amplitude or
phase) change over time

Deterministic vs. Stochastic


Deterministic time series - This data can be predicted exactly.

Stochastic time series - Data are only partly determined by past values and future values have
to be described with a probability distribution. This is the case for most, if not all, natural
time series. So many factors are involved in a natural system that we cannot possibly
account for all of them correctly.

Transformations of the Data

We can transform data for the following purposes (a short example follows the list):

1. Stabilize the variance - use the logarithmic transformation


2. Make the seasonal effect additive - this makes the effect constant from year
to year - use the logarithmic transformation.
3. Make data normally distributed - this reduces the skewness in the data so
that we may apply appropriate statistics - use the Box-Cox (logarithmic and
square root) transformation
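
A brief Python sketch of these transformations on a hypothetical positive-valued series; the logarithm stabilises the variance and makes a seasonal effect additive, while scipy's Box-Cox transform chooses a power parameter that reduces skewness.

    import numpy as np
    from scipy import stats

    # Hypothetical series whose spread grows with its level
    y = np.array([12.0, 15.0, 14.0, 20.0, 26.0, 24.0, 35.0, 45.0, 41.0, 60.0])

    log_y = np.log(y)                 # variance-stabilising / additive-seasonal transformation
    boxcox_y, lam = stats.boxcox(y)   # skewness-reducing transformation; lam is the fitted power

    print("estimated Box-Cox lambda:", round(lam, 3))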

Time Series Models


Time series analysis comprises methods for analyzing time series data in order to extract meaningful
statistics and other characteristics of the data. Time series forecasting is the use of a model to
predict future values based on previously observed values.

Time Series Analysis can be divided into two main categories depending on the type of the model
that can be fitted. The two categories are:

• Kinetic Model: The data here is fitted as xt= f(t). The measurements or observations are seen

as a function of time.

• Dynamic Model: The data here is fitted as xt= f(xt-1 , xt-2 , xt-3 … ).

The classical time series analysis procedure decomposes the time series function xt = f(t) into up to
four components:

1. Trend: a long-term monotonic change of the average level of the time series.
2. The Trade Cycle: a long wave in the time series.
3. The Seasonal Component: fluctuations in time series that recur during specific time periods.
4. The Residual component that represents all the influences on the time series that are not
explained by the other three components.
The Trend and Trade Cycle correspond to the smoothing factor and the Seasonal and Residual
component contribute to the cyclic factor. Often before time series models are applied, the data
needs to be examined and if necessary, it has to be transformed to be able to interpret the series
better. This is done to stabilize the variance. For example, if there is a trend in the series and the
standard deviation is directly proportional to the mean, then a logarithmic transformation is
suggested. And in order to make the seasonal affect addictive, if there is a trend in the series and the
size of the seasonal effect tends to increase with the mean then it may be advisable it transform the
data so as to make the seasonal effect constant from year to year. Transformation is also applied
sometimes to make the data normally distributed

The fitting of time series models can be an ambitious undertaking. There are many methods of
model fitting, and these models have been well discussed in the literature. The user's application and
preference will decide the selection of the appropriate technique. We will now discuss some of the
existing methods in time series analysis.
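
As an illustration of the classical decomposition described above, the statsmodels function seasonal_decompose splits a series into trend, seasonal and residual components. The monthly series below is simulated, and a seasonal period of 12 is assumed.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.seasonal import seasonal_decompose

    # Simulated monthly series: linear trend + seasonal cycle + noise
    np.random.seed(0)
    t = np.arange(48)
    values = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + np.random.normal(0, 2, 48)
    series = pd.Series(values, index=pd.date_range("2020-01-01", periods=48, freq="MS"))

    result = seasonal_decompose(series, model="additive", period=12)
    print(result.seasonal.head(12))     # estimated seasonal component
    print(result.trend.dropna().head()) # estimated trend component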

A time series can have one or more of the following components:


• Trend (positive or negative secular trend)

• Seasonal Pattern
• Cyclic Pattern
• Random variation

Trend & Measurement Methods

A trend exists if there is a long term increase (positive) or decrease (negative) in the dependent
variable as time passes.

Positive Upward Trend

Negative Downward Trend


A time series plot can show an overall negative movement. This means that the time series has a
negative secular trend, or downward trend. For example the number of births in a remote country
hospital has decreased steadily over the years from 1996 to 2005.

Seasonal Trend

When the seasons of the year affect sales or production, peaks and troughs will appear at regular
intervals during the year. For example, seasonal rainfall during summer, autumn, winter and spring in
a year. The name seasonal is not specific to seasons of the year. It could be related to weekly sales in
which sales on Saturday are consistently higher than the other week days. A key feature of seasonal
trends is that the seasons occur at the same time each cycle.

Notice that the graph peaks at times corresponding to t = 4, 8, 12, etc., which are the summer
quarters of each year over 10 years.

Cyclic Trends

Like seasonal trends, cyclic trends show fluctuations upwards and downwards but not according to
season. The peaks and troughs occur on an irregular basis.

For example the number of large earthquakes recorded each year show significant peaks and
troughs, but at unpredictable intervals.

Random Trends

Random variation or random pattern will not show predictable peaks and troughs nor will there be
any significant peaks or troughs at unpredictable times. Instead there is a random movement about a
relatively stable mean.

Sometimes it is difficult to decide whether a trend is cyclic or random. Choose random as a last
resort if the trend is not seasonal or cyclic.

Note that although random variation is always a component, it is only mentioned when none of the
other 3 components is present.

Fitting Trend Lines to Time Series Plots

There are 3 possible ways to fit a trend line to a time series plot:

• By eye
• Three median regression method
• Least squares regression method

You can think of a time series plot as similar to a scatter plot with the independent variable, time,
along the horizontal axis. Use these techniques on the original data when the trend is clearly linear.
The methods cannot be applied effectively to cyclical or seasonal trends.

By Eye
Fitting a trend line by eye will only give approximate results when used to make predictions.

The equation of the line can be found using the 2 points to be


y = 0.11t + 5.44
Note that making predictions using this equation would be very unreliable.

3-Median Regression Method

This method has been met before and CAS can be used to determine the equation of the line using
the Median-Median option. Example:

The equation of the 3-median regression line is

y = 5t + 21.7

The true value of y at t = 4 is 40, compared to the predicted y = 41.7.

This is a fairly accurate prediction.

Least Squares Regression Method


This method has been met before and CAS can be used to determine the equation of the least squares regression line.
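
Outside a CAS, the same least squares trend line can be obtained with numpy's polyfit. The sketch below uses the hospital birth counts that appear in the moving average example later in this chapter.

    import numpy as np

    t = np.arange(1, 11)                                       # years 1996-2005 coded 1..10
    births = np.array([25, 18, 23, 21, 19, 20, 18, 16, 17, 15])

    slope, intercept = np.polyfit(t, births, 1)                # degree-1 (linear) least squares fit
    print(f"trend line: y = {slope:.3f} t + {intercept:.3f}")
    print("forecast at t = 12:", round(slope * 12 + intercept, 1))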

Smoothing Time Series


Time series data can be prone to large fluctuations from point to point. This means that at times a
trend line cannot accurately predict the future if there is a large variation in how data moves. We can
smooth out the fluctuations to show a clearer picture of the overall trend. We can use the following
3 techniques:

Moving Average Smoothing
This technique relies on the principle that averages of data can be used to represent the original
data. When applied to time series a number of data points are averaged, then we move on to
another group of data points in a systematic fashion and average them, and so on. Note that when
finding the moving average we are finding the mean of the data points. There are two cases to
consider.

Moving Average Smoothing with an Odd Number of Points.


The following example shows how we would use a 3-point moving average to smooth out the data
points.

By drawing a time series plot with the original data and the 3-point moving average data the
general trend becomes more obvious.

The least squares equation for the 3-point moving average data can be calculated using
CAS and predictions made from it. This will give more accurate forecasts than if we used
the original time series data.
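
A small Python sketch of 3-point moving average smoothing, using the birth counts from the example that follows; each smoothed value is the mean of a point and its two neighbours.

    import numpy as np

    births = np.array([25, 18, 23, 21, 19, 20, 18, 16, 17, 15], dtype=float)

    # 3-point moving average: mean of each value and its two neighbours
    ma3 = np.convolve(births, np.ones(3) / 3, mode="valid")
    print(np.round(ma3, 2))   # 8 smoothed values, centred on the 2nd to 9th observations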

Moving Average Smoothing with an Even Number of Points

Suppose a 4-point moving average is used in the previous example. We have to find the
moving average in 2 steps to assist in locating the data point. The second step is called
centering. Centering allows us to line up the moving average with a specific year. The number
of babies born in a remote hospital over the period 1996 to 2005 is given by:

To calculate the 4 point moving average we form the table below:

Year    t    No. of Births    4-point Moving Average    Centred Moving Average

1996    1        25
1997    2        18
                                      21.75
1998    3        23                                            21.000
                                      20.25
1999    4        21                                            20.500
                                      20.75
2000    5        19                                            20.125
                                      19.50
2001    6        20                                            18.875
                                      18.25
2002    7        18                                            18.000
                                      17.75
2003    8        16                                            17.125
                                      16.50
2004    9        17
2005   10        15

In Summary:

Year                     1996   1997   1998    1999    2000    2001    2002    2003    2004   2005
Year t                     1      2      3       4       5       6       7       8       9     10
Number of Births          25     18     23      21      19      20      18      16      17     15
Centred moving average                21.000  20.500  20.125  18.875  18.000  17.125

Plotting the Number of Births and the Centred Moving Average on the same grid gives the graph
below:

The least squares regression equation of the centered moving average points can be calculated using
CAS to make predictions.
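
The table above can be reproduced with a short Python sketch: compute the 4-point moving averages, then average adjacent pairs of them (centring) so that each value lines up with a calendar year.

    import numpy as np

    births = np.array([25, 18, 23, 21, 19, 20, 18, 16, 17, 15], dtype=float)

    ma4 = np.convolve(births, np.ones(4) / 4, mode="valid")   # 7 values, falling between years
    centred = (ma4[:-1] + ma4[1:]) / 2                        # 6 values, aligned with 1998-2003
    print(np.round(ma4, 2))       # [21.75 20.25 20.75 19.5  18.25 17.75 16.5 ]
    print(np.round(centred, 3))   # [21.    20.5   20.125 18.875 18.    17.125]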

Median Smoothing (or Moving Medians)
Median smoothing uses a similar technique to moving averages; it uses the median values instead of
the average values.

Median Smoothing with an Odd Number of Points

When we smooth with an odd number of points we can often do it by eye.


For example the time series for the data describing the number of births in a country hospital are
shown in the graph below:

Using a 3-point median method we can form the plot below:

Median Smoothing with an Even Number of Points
When we smooth with an even number of points an extra step called centering is needed to line up
the moving median values with a specific year.

Example: The sales figures for a kiosk between 1993 and 2004 are given below.

The data can be plotted on a graph.

Using the smoothed data points CAS can be used to determine the equation of the least squares
regression line. Note that we need to start from year 3 and finish at year 10.

From the graph the equation of best fit is: Sales = 0.337024 × t + 10.6481

We can predict the sales for year 2010 by substituting t = 18


Sales = 0.337024 × 18 + 10.6481 = 16.7145
This is approximately $16,700.

Seasonal Adjustments or Deseasonalisation
Seasonal factors are variations due to weather (seasons), the day of the week, the month of the year,
the quarter of the year.
For example:
• Ice cream sales are greater in the summer than in the winter
• Sales of winter clothing are greater in the winter than the summer
• Sales of soft drinks vary with temperature

• Retail sales increase during Xmas period


Smoothing data of this type is more complex and smoothing the data using moving averages or
moving medians may not be as effective.
To improve the smoothing process by taking out the seasonal effects we use a process called
deseasonalization or seasonal adjustment so that a trend line can be fitted and long term trends can
be predicted. This process involves finding the seasonal indices for each quarter, month or week.
The steps to calculate the seasonal indices and the seasonally adjusted data are best shown in an
example.

Example :
The quarterly sales figures (number of houses sold) were recorded by an estate agent for each of the
years from 2003 to 2005.

Determine the Seasonal Index for each Quarter.

The seasonal indices should sum to 4. This is a useful check!
(Note: In many problems you are given the seasonal indices so you do not have to work them out from
first principles.)

Deseasonalise the Original Time Series Data

Using the deseasonalized sales data, we can create a least squares regression line using CAS and predict
the deseasonalized sales for the first quarter of 2006.

Using the equation

Deseasonalised Sales = 0.159301 × qtr + 5.47121

we can predict the deseasonalised sales in the first quarter of 2006. Substituting qtr = 13 gives:

Deseasonalised Sales = 0.159301 × 13 + 5.47121 = 7.54212

To calculate the actual sales you must remember to seasonalise the data! To do this, remember that:

Actual sales = seasonal index × deseasonalised prediction.

In this case the seasonal index will be for Qtr 1.

Actual Sales = 0.7210 × 7.54212 = 5.43787, which is approximately 5.
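
A rough Python sketch of the whole procedure with made-up quarterly sales figures (not the estate agent's data). The seasonal indices here are obtained by a simple quarterly-average method, but the overall flow is the same: compute indices, deseasonalise, fit a trend, then reseasonalise the forecast.

    import numpy as np

    # Hypothetical quarterly sales for three years (rows = years, columns = quarters)
    sales = np.array([[ 5.0,  9.0, 11.0,  7.0],
                      [ 6.0, 10.0, 13.0,  8.0],
                      [ 7.0, 11.0, 14.0,  9.0]])

    # Seasonal index of each quarter: quarter mean divided by the overall quarterly mean
    seasonal_index = sales.mean(axis=0) / sales.mean()
    print("seasonal indices:", np.round(seasonal_index, 4), "sum =", seasonal_index.sum())  # sum is 4

    # Deseasonalise, fit a linear trend, forecast quarter 13, then reseasonalise
    deseasonalised = (sales / seasonal_index).flatten()
    t = np.arange(1, 13)
    slope, intercept = np.polyfit(t, deseasonalised, 1)
    forecast_q13 = slope * 13 + intercept              # deseasonalised forecast
    actual_q13 = forecast_q13 * seasonal_index[0]      # re-apply the quarter 1 index
    print("predicted quarter 13 sales:", round(actual_q13, 2))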

More about Seasonal Indexes

Seasonal indices can be expressed as percentages as shown.

The larger the seasonal index, the higher the performance of the corresponding quarter compared to
the average quarterly value. Thus in the real estate example the seasonal index of 143.98% for the
third quarter indicates a performance of 43.98% above average.


Extrapolation of past and future values


Extrapolation is the process of estimating, beyond the original observation range, the value of a
variable on the basis of its relationship with another variable. It is similar to interpolation, which
produces estimates between known observations, but extrapolation is subject to greater uncertainty
and a higher risk of producing meaningless results. Extrapolation may also mean extension of a
method, assuming similar methods will be applicable. Extrapolation may also apply to human
experience to project, extend, or expand known experience into an area not known or previously
experienced so as to arrive at a (usually conjectural) knowledge of the unknown [1] (e.g. a driver
extrapolates road conditions beyond his sight while driving). The extrapolation method can be applied
in the interior reconstruction problem.

Example: an illustration of the extrapolation problem is assigning a meaningful value at x = 7, given only the known data points.
Interpolation of values
Linear interpolation is a method of curve fitting using linear polynomials to construct new data points
within the range of a discrete set of known data points.

Given two known points, the linear interpolant is the straight line joining them, and the
value y at an intermediate x may be found by linear interpolation.
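
A minimal Python sketch contrasting the two: numpy's interp gives linear interpolation inside the range of known points, while extrapolation beyond the range is done here by extending a fitted straight line (with the reliability caveats noted above). The data points are made up.

    import numpy as np

    x_known = np.array([1.0, 2.0, 4.0, 5.0, 6.0])
    y_known = np.array([2.0, 4.1, 7.9, 10.2, 12.1])

    # Interpolation: estimate y at x = 3, which lies inside the observed range
    y_at_3 = np.interp(3.0, x_known, y_known)

    # Extrapolation: fit a straight line and extend it to x = 7, outside the range
    slope, intercept = np.polyfit(x_known, y_known, 1)
    y_at_7 = slope * 7 + intercept

    print("interpolated y(3):", round(float(y_at_3), 2), " extrapolated y(7):", round(y_at_7, 2))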

Application of time series


The usage of time series models is twofold: to obtain an understanding of the underlying forces
and structure that produced the observed data, and to fit a model and use it to forecast future values of the series.

Time series models have been the basis for any study of a behavior of process or metrics over a period
of time. The applications of time series models are manifold, including sales forecasting, weather
forecasting, inventory studies etc. In decisions that involve a factor of uncertainty about the future, time
series models have been found to be one of the most effective methods of forecasting. Most often, future
course of actions and decisions for such processes will depend on what would be an anticipated
result. The need for these anticipated results has encouraged organizations to develop forecasting
techniques to be better prepared to face the seemingly uncertain future. Also, these models can be
combined with other data mining techniques to help understand the behavior of the data and to be
able to predict future trends and patterns in the data behavior.

CHAPTER 6: INDEX NUMBERS


Introduction to Index Numbers
Historically, the first index was constructed in 1764 to compare the Italian price index
in 1750 with the price level in 1500. Though originally developed for measuring the effect of
change in prices, index numbers have today become one of the most widely used statistical
devices and there is hardly any field where they are not used. Newspapers headline the fact
that prices are going up or down, that industrial production is rising or falling, that imports
are increasing or decreasing, that crimes are rising in a particular period compared to the
previous period as disclosed by index numbers. They are used to feel the pulse of the
economy and they have come to be used as indicators of inflationary or deflationary
tendencies. In fact, they are described as 'barometers of economic activity', i.e., if one wants
to get an idea as to what is happening to an economy, he should look to important indices like
the index number of industrial production, agricultural production, business activity, etc.
Some prominent definitions of index numbers are given below:
1. 'Index numbers are devices for measuring differences in the magnitude of a group of
related variables.' —Croxton & Cowden
2. 'An index number is a statistical measure designed to show changes in a variable or a
group of related variables with respect to time, geographic location or other characteristics
such as income, profession, etc.' —Spiegel
3. 'In its simplest form an index number is the ratio of two index numbers expressed as a
per cent. An index number is a statistical measure, a measure designed to show changes
in one variable or in a group of related variables over time, or with respect to geographic
location, or in terms of some other characteristics.' —Patterson

Definitions of Terms
Index numbers are statistical devices designed to measure the relative change in the level of
variable or group of variables with respect to time, geographical location etc.

In other words, these are the numbers which express the value of a variable at any given
period, called the 'current period', as a percentage of the value of that variable at some
standard period, called the 'base period'.

Here the variables may be


1. The price of a particular commodity like silver, iron or group of commodities like
consumer goods, food, stuffs etc.
2. The volume of trade, exports, imports, agricultural and industrial production, sales in
departmental store.
3. Cost of living of persons belonging to a particular income group or profession, etc.
Example: suppose rice sells at Rs. 9/kg at Bhubaneswar (BBSR) in 1995 as compared to
Rs. 4.50/kg in 1985. The index number of the price in 1995 compared to 1985 is calculated as

(Rs. 9.00 / Rs. 4.50) × 100 = 200

This means there is a net increase of 100% in the price of rice in 1995 as compared to 1985 [the
base year's index number is always treated as 100].

Suppose, during the same period 1995, rice sells at Rs. 12.00/kg in Delhi. Therefore, the
index number of the price at Bhubaneswar compared to the price at Delhi is

(Rs. 9.00 / Rs. 12.00) × 100 = 75

This means the price of rice at Bhubaneswar in 1995 is 25% lower than the price at Delhi.

The above index numbers are called 'price index numbers'.

To take another example, the production of rice in Orissa in 1978 was 44,01,780 metric
tons compared to 36,19,500 metric tons in 1971. So the index number of the quantity
produced in 1978 compared to 1971 is

(44,01,780 / 36,19,500) × 100 = 121.61

That means there is a net increase of 21.61% in the production of rice in 1978 as compared to 1971.

The above index number is called a 'quantity index number'.

Univariate index: An index which is calculated from a single variable is called univariate index.

Composite index: An index which is calculated from group of variables is called Composite
index

Price (quantity) index


A price (quantity) index is a measure reflecting the average of the proportionate changes in the
prices (quantities) of the specified set of goods and services between two periods of time.
Usually the index is assigned a value of 100 in some selected base period, and the values of
the index for other periods are intended to indicate the average percentage change in prices
compared with the base period.

This index number measures the changes in the level of quantities of items consumed, or
produced, or distributed during a year under study with reference to another year
known as the base year. Like the price index number, the simplest formula of this index
number is as follows:

Q01 = (q1/q0) x 100

Where, Q01 = quantity index number of the current year on the basis of the base year’s
quantity.

Base period of a price index


The base period generally is understood to be the period with which other periods are
compared and whose values provide the weights for a price index. However, the concept of
the 'base period' is not a precise one and may be used to mean rather different things.
Three types of base periods may be distinguished:

(i) the price reference period, that is, the period whose prices appear in the denominators
of the price relatives used to calculate the index, or

(ii) the weight reference period, that is, the period, usually a year, whose values serve as
weights for the index. However, when hybrid expenditure weights are used in which the
quantities of one period are valued at the prices of some other period, there is no unique
weight reference period, or

(iii) the index reference period, that is, the period for which the index is set equal to 100.

Current period should refer to the most recent period for which an index has been computed
or is being computed. However, the term is widely used to refer to any period that is
compared with the price reference or index reference period.

Weight refers to the relative importance of the different items in the construction of index
numbers.

Characteristics of index numbers


1. Index numbers are specialized averages:
As we know, an average is a single figure representing a group of figures. However, to obtain an
average the items must be comparable. For example, the average weight of a man, a woman and a child
of a certain locality has no meaning at all. Furthermore, the unit of measurement must be the same for all
the items. However, this is not so with index numbers. Index numbers are also a type of average which
shows in a single figure the change in two or more series of different items which can be expressed in
different units. For example, while constructing a consumer price index number, the various items
used in its construction are divided into broad heads, namely food, clothing, fuel and lighting, house
rent, and miscellaneous, which are expressed in different units.
2. Index numbers measures the net change in a group of related variables:
Since index numbers are essentially averages, they describe in one single figure the increase or
decrease in a group of related variables under study. The group of variables may be prices of set
of commodities, the volume of production in different sectors etc.
3. Index numbers measure the effect of changes over period of time:
Index numbers are most widely used for measuring changes over a period of time. For example
we can compare the agricultural production, industrial production, imports, exports, wages etc
in two different periods.

Application/Uses of index numbers


Index numbers are indispensable tools of economics and business analysis. Following are the
main uses of index numbers.
1) Index numbers are used as economic barometers:
An index number is a special type of average which helps to measure economic
fluctuations in the price level, the money market, and the economic cycle, like inflation, deflation etc.
G.Simpson and F.Kafka say that index numbers are today one of the most widely used
statistical devices. They are used to take the pulse of economy and they are used as
indicators of inflation or deflation tendencies. So index numbers are called economic
barometers.
2) Index numbers help in formulating suitable economic policies and planning etc. Many of the
economic and business policies are guided by index numbers. For example, while
deciding an increase in the DA of employees, the employers have to depend primarily
on the cost of living index. If salaries or wages are not increased according to the cost of
living it leads to strikes, lock outs etc. The index numbers provide some guidelines that
one can use in making decisions.
3) They are used in studying trends and tendencies.
Since index numbers are most widely used for measuring changes over a period of time, the
time series so formed enable us to study the general trend of the phenomenon under study.
For example for last 8 to 10 years we can say that imports are showing upward tendency.
4) They are useful in forecasting future economic activity.
Index numbers are used not only in studying the past and present workings of our economy but
also important in forecasting future economic activity.
5) Index numbers measure the purchasing power of money.
The cost of living index numbers determine whether the real wages are rising, falling or
remaining constant. The real wages can be obtained by dividing the money wages by the
corresponding price index and multiplying by 100. Real wages help us in determining the
purchasing power of money.
6) Index numbers are used in deflating.
Index numbers are highly useful in deflating i.e. they are used to adjust the wages for cost
of living changes and thus transform nominal wages into real wages, nominal income to
real income, nominal sales to real sales etc. through appropriate index numbers.

Classification of index numbers


According to the purpose for which they are used, index numbers are classified as below:
i) Price index numbers
ii) Quantity index numbers
iii) Value index numbers
iv) Special purpose index numbers
Only price and quantity index numbers are discussed in detail. The others will be mentioned but
without detail.

Price index number:


Price index number measures the changes in the price level of one commodity or group of
commodities between two points of time or two areas.
Ex: Wholesale price index numbers
Retail price index numbers
Consumer price index numbers.
Quantity index number:



Quantity index numbers measures the changes in the volume of production, sales, etc in different
sectors of economy with respect to time period or space.

Note: Price and Quantity index numbers are called market index numbers.

Problems in constructing index numbers


Before constructing index numbers, careful thought must be given to the following problems.
i. Purpose of index numbers.
An index number which is properly designed for a purpose can be a most useful and powerful
tool. Thus the first and foremost problem is to determine the purpose of the index numbers. If
we know the purpose of the index numbers we can settle some related problems. For example,
if the purpose of an index number is to measure the changes in the production of steel, the
problem of selection of items is automatically settled.

ii. Selection of commodities


After defining the purpose of index numbers, select only those commodities which are related
to that index. For example if the purpose of an index is to measure the cost of living of low
income group we should select only those commodities or items which are consumed by
persons belonging to this group and due care should be taken not to include the goods which
are utilized by the middle income group or high income group i.e. the goods like cosmetics,
other luxury goods like scooters, cars, refrigerators, television sets etc.

iii. Selection of base period


The period with which the comparisons of relative changes in the level of phenomenon are
made is termed as base period. The index for this period is always taken as 100. The following
are the basic criteria for the choice of the base period.
i) The base period must be a normal period, i.e. a period free from all sorts of
abnormalities or random fluctuations such as labor strikes, wars, floods, earthquakes,
etc.
ii) The base period should not be too distant from the given period. Since index numbers
are essential tools in business planning and economic policies the base period should
not be too far from the current period. For example for deciding increase in dearness
allowance at present there is no advantage in taking 1950 or 1960 as the base, the
comparison should be with the preceding year after which the DA has not been
increased.
iii) Fixed base or chain base. While selecting the base a decision has to be made as to
whether the base shall remain fixing or not i.e. whether we have fixed base or chain
base. In the fixed base method the year to which the other years are compared is
constant. On the other hand, in chain base method the prices of a year are linked with
those of the preceding year. The chain base method gives a better picture than what is



obtained by the fixed base method.
How is a base selected if a normal period is not available?
Ans: Sometimes it is difficult to distinguish a year which can be taken as a normal year, and
hence the average of a few years may be regarded as the value corresponding to the base year.
iv. Data for index numbers
The data, usually the set of prices and of quantities consumed of the selected commodities for
different periods, places etc. constitute the raw material for the construction of index numbers.
The data should be collected from reliable sources such as standard trade journals, official
publications etc. for example for the construction of retail price index numbers, the price
quotations for the commodities should be obtained from super bazaars, departmental stores etc.
and not from wholesale dealers.

v. Selection of appropriate weights


A decision as to the choice of weights is an important aspect of the construction of index
numbers. The problem arises because all items included in the construction are not of equal
importance. So proper weights should be attached to them to take into account their relative
importance. Thus there are two type of indices.
i) Unweighted indices, in which no specific weights are attached.
ii) Weighted indices, in which appropriate weights are assigned to the various items.
vi. Choice of average.
Since index numbers are specialized averages, a choice of average to be used in their construction
is of great importance. Usually the following averages are used:
i) A.M.
ii) G.M.
iii) Median
Among these averages the G.M. is the appropriate average to be used. But in practice the G.M. is not
used as often as the A.M. because of its computational difficulties.
vii. Choice of formula.
A large variety of formulae are available to construct an index number. The problem very often
is that of selecting the appropriate formula. The choice of the formula would depend not only
on the purpose of the index but also on the data available.

Methods of constructing index numbers


A large number of formulae have been derived for constructing index numbers. They can be classified as:
1) Unweighted indices

a) Simple aggregative method


b) Simple average of relatives.
2) Weighted indices
a) Weighted aggregative method



i) Laspeyres' method
ii) Paasche's method
iii) Fisher's ideal method
iv) Dorbish and Bowley's method
v) Marshall-Edgeworth method
vi) Kelly's method
b) Weighted average of relatives

Unweighted indices:
i) Simple aggregative method:
This is the simplest method of constructing index numbers. When this method is used to
construct a price index number the total of current year prices for the various commodities in
question is divided by the total of the base year prices and the quotient is multiplied by 100.

Symbolically, P01 = (ΣP1 / ΣP0) × 100

where P0 are the base year prices, P1 are the current year prices, and P01 is the price index number
for the current year with reference to the base year.
Problem:
Calculate the index number for 1995 taking 1991 as the base for the following data
Commodity Unit Prices 1991 (P0) Prices 1995 (P1)
A Kilogram 2.50 4.00
B Dozen 5.40 7.20
C Meter 6.00 7.00
D Quintal 150.00 200.00
E Liter 2.50 3.00
Total 166.40 221.20

Price index number P01 = (ΣP1 / ΣP0) × 100 = (221.20 / 166.40) × 100 = 132.93

There is a net increase of 32.93% in 1995 as compared to 1991.
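
The same calculation in a short Python sketch, using the prices from the table above.

    # Prices from the worked example (base year 1991 and current year 1995)
    p0 = [2.50, 5.40, 6.00, 150.00, 2.50]
    p1 = [4.00, 7.20, 7.00, 200.00, 3.00]

    P01 = sum(p1) / sum(p0) * 100     # simple aggregative price index
    print(round(P01, 2))              # 132.93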

Limitations:
There are two main limitations of this method
1. The units used in the prices or quantity quotations have a great influence on the
value of index.

2. No considerations are given to the relative importance of the commodities.


ii) Simple average of relatives
When this method is used to construct a price index number, first of all price relatives
are obtained for the various items included in the index and then the average of these
relatives is obtained using any one of the averages i.e. mean or median etc.

When the A.M. is used for averaging the relatives, the formula for computing the index is

P01 = (1/n) Σ (P1/P0 × 100)

When the G.M. is used for averaging the relatives, the formula for computing the index is

P01 = Antilog[ (1/n) Σ log(P1/P0 × 100) ]

where n is the number of commodities and the price relative is (P1/P0) × 100.
Problem:
Calculate the index number for 1995 taking 1991 as the base for the following data

Commodity   Unit        Prices 1991 (P0)   Prices 1995 (P1)   (P1/P0) × 100
A           Kilogram          50                 70               140.0
B           Dozen             40                 60               150.0
C           Meter             80                 90               112.5
D           Quintal          110                120               109.1
E           Liter             20                 20               100.0
Total                                                             611.6

Price index number P01 = (1/n) Σ (P1/P0 × 100) = 611.6 / 5 = 122.3

There is a net increase of about 22.3% in 1995 as compared to 1991.
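
The same index computed in a short Python sketch from the price relatives, with both the arithmetic mean and (for comparison) the geometric mean of the relatives.

    import math

    p0 = [50, 40, 80, 110, 20]     # 1991 prices
    p1 = [70, 60, 90, 120, 20]     # 1995 prices

    relatives = [current / base * 100 for base, current in zip(p0, p1)]

    index_am = sum(relatives) / len(relatives)                                   # about 122.3
    index_gm = math.exp(sum(math.log(r) for r in relatives) / len(relatives))    # about 120.8

    print(round(index_am, 1), round(index_gm, 1))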

Merits:
1. It is not affected by the units in which prices are quoted.
2. It gives equal importance to all the items, and extreme items do not unduly affect the index
number.
3. The index number calculated by this method satisfies the unit test.

Demerits:
1. Since it is an unweighted average, the importance of all items is assumed to be the same.
2. The index constructed by this method does not satisfy all the criteria of an ideal index
number.
3. In this method one can face difficulties in choosing the average to be used.

Weighted indices:
i) Weighted aggregative method:
These indices are the same as the simple aggregative method; the only difference is that in this method
weights are assigned to the various items included in the index.
There are various methods of assigning weights and consequently a large number of
formulae for constructing weighted index number have been designed. Some important
methods are

i. Laspeyres' method: This method was devised by Ernst Louis Étienne Laspeyres
(1834-1913, Germany) in 1871. It is the most important of all the types of index numbers.
In this method the base year quantities are taken as weights. The formula for constructing
Laspeyres' price index number is

P01(La) = (Σp1q0 / Σp0q0) × 100

ii. Paasche's method: This method was devised by Hermann Paasche (1851-1925, Germany).
In this method the current year quantities are taken as weights and the formula is given by

P01(Pa) = (Σp1q1 / Σp0q1) × 100

iii. Fisher's ideal method: Devised by Irving Fisher (1867-1947, USA), Fisher's price index number
is given by the G.M. of the Laspeyres and Paasche index numbers. Symbolically,

P01(F) = √( P01(La) × P01(Pa) )

       = √[ (Σp1q0 / Σp0q0) × (Σp1q1 / Σp0q1) ] × 100

iv. Dorbish and Bowley's method

Dorbish and Bowley's price index number is given by the A.M. of the Laspeyres and Paasche
index numbers. Symbolically,

P01(DB) = ( P01(La) + P01(Pa) ) / 2

Quantity index numbers:

i. Laspeyres' quantity index number: Base year prices are taken as weights.

Q01(La) = (Σq1p0 / Σq0p0) × 100

ii. Paasche's quantity index number: Current year prices are taken as weights.

Q01(Pa) = (Σq1p1 / Σq0p1) × 100

iii. Fisher's ideal method:

Q01(F) = √( Q01(La) × Q01(Pa) ) = √[ (Σq1p0 / Σq0p0) × (Σq1p1 / Σq0p1) ] × 100

Fisher’s index number is called ideal index number. Why?


Page 106 of 259
Fisher's index number is called the ideal index number due to the following characteristics.
1) It is based on the G.M, which is theoretically considered the best average for
constructing index numbers.
2) It takes into account both current and base year prices as well as quantities.
3) It satisfies both the time reversal and factor reversal tests suggested by Fisher.
4) The upward bias of Laspeyres' index number and the downward bias of Paasche's index
number are balanced to a great extent.
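The weighted aggregative indices can be computed in a few lines of Python. The prices and quantities below are hypothetical and are used only to illustrate the formulas; this is a sketch, not part of the original notes.

def laspeyres(p0, p1, q0, q1):
    # Base year quantities as weights
    return sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0)) * 100

def paasche(p0, p1, q0, q1):
    # Current year quantities as weights
    return sum(a * b for a, b in zip(p1, q1)) / sum(a * b for a, b in zip(p0, q1)) * 100

def fisher(p0, p1, q0, q1):
    # Geometric mean of Laspeyres' and Paasche's indices
    return (laspeyres(p0, p1, q0, q1) * paasche(p0, p1, q0, q1)) ** 0.5

# Hypothetical base year and current year prices and quantities
p0, q0 = [10, 8, 5], [30, 15, 20]
p1, q1 = [12, 10, 6], [25, 16, 18]

print(round(laspeyres(p0, p1, q0, q1), 2))   # about 121.15
print(round(paasche(p0, p1, q0, q1), 2))     # about 121.37
print(round(fisher(p0, p1, q0, q1), 2))      # about 121.26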

Example: Compute price index numbers for the following data by the above methods. (The data table for this example is not reproduced here.)

Comparison of Laspeyres' and Paasche's index numbers:

In Laspeyres' index number the base year quantities are taken as the weights, whereas in Paasche's
index the current year quantities are taken as weights.

From the practical point of view Laspeyres' index is often preferred to Paasche's for the
simple reason that Laspeyres' index weights are the base year quantities and do not
change from one year to the next. On the other hand Paasche's index weights are the
current year quantities, and in most cases these weights are difficult and expensive to obtain.

Laspeyres' index number is said to have an upward bias because it tends to overestimate
the price rise, whereas Paasche's index number is said to have a downward bias, because
it tends to underestimate the price rise.
When prices increase, there is usually a reduction in the consumption of those items
whose prices have increased. Hence, using base year weights in Laspeyres' index, we
will be giving too much weight to the prices that have increased the most, and the
numerator will be too large. Due to similar considerations, Paasche's index number, using
given year weights, underestimates the rise in prices and hence has a downward bias.

If changes in prices and quantities between the reference period and the base period are
moderate, both Laspeyres' and Paasche's indices give nearly the same values.

Demerit of Paasche's index number:

Paasche's index number, because of its dependence on the given year's weights, has the distinct
disadvantage that the weights are required to be revised and computed for each period,
adding extra cost towards the collection of data.

What are the desiderata of good index numbers?
Irving Fisher considered two important properties which an index number should satisfy.
These are the tests of reversibility:
1. Time reversal test
2. Factor reversal test
If an index number satisfies these two tests it is said to be an ideal index number.

Weighted average of relatives:


Weighted average of relatives can be calculated by taking the values of the base year (p0q0) as
the weights. The formula is given by

When A.M is used: P01 = Σ PV / Σ V

When G.M is used: P01 = Antilog( Σ V log P / Σ V )

where P = (p1/p0) × 100 and V = p0 q0, i.e. the base year value.
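A short Python sketch of the weighted average of relatives, again on hypothetical figures (not part of the original notes). Note that with the arithmetic mean and weights V = p0q0 the result reduces algebraically to Laspeyres' index.

import math

# Hypothetical base and current year prices, with base year quantities
p0 = [10, 8, 5]
p1 = [12, 10, 6]
q0 = [30, 15, 20]

P = [(c / b) * 100 for b, c in zip(p0, p1)]   # price relatives
V = [b * q for b, q in zip(p0, q0)]           # base year values p0*q0 (the weights)

index_am = sum(p * v for p, v in zip(P, V)) / sum(V)                     # A.M version
index_gm = math.exp(sum(v * math.log(p) for p, v in zip(P, V)) / sum(V)) # G.M version

print(round(index_am, 2), round(index_gm, 2))   # both close to 121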

Test of consistency or adequacy


Several formulae have been suggested for constructing index numbers, and the problem is
that of selecting the most appropriate one in a given situation. The following tests are
suggested for choosing an appropriate index.
1) Unit test
2) Time reversal test
3) Factor reversal test
4) Circular test
1) Unit test:
This test requires that the formula for constructing index numbers should not be affected
by the units in which the prices or quantities have been quoted.
Note: This test is satisfied by all the index numbers except the simple aggregative method.
2) Time reversal test
This test was suggested by Irving Fisher. The time reversal test determines whether a given
method will work both ways in time i.e. forward and backward. In other words, when the
data for any two years are treated by the same method, but with the bases reversed, the
two index numbers secured should be reciprocals to each other, so that their product is
unity.
Symbolically, the following relation should be satisfied (with the indices expressed as ratios,
i.e. omitting the factor 100):

P01 × P10 = 1

Where P01 is the index for time period 1 with reference period 0, and
P10 is the index for time period 0 with reference period 1.
Note: This test is not satisfied by Laspeyres' method and Paasche's method. It is satisfied
by Fisher's method.
When Laspeyres' method is used

P01(La) = Σ p1 q0 / Σ p0 q0

P10(La) = Σ p0 q1 / Σ p1 q1

Now,

P01(La) × P10(La) = (Σ p1 q0 / Σ p0 q0) × (Σ p0 q1 / Σ p1 q1) ≠ 1

Therefore this test is not satisfied by Laspeyres' method

When Paasche's method is used

P01(Pa) = Σ p1 q1 / Σ p0 q1

P10(Pa) = Σ p0 q0 / Σ p1 q0

Now,

P01(Pa) × P10(Pa) = (Σ p1 q1 / Σ p0 q1) × (Σ p0 q0 / Σ p1 q0) ≠ 1

Therefore this test is not satisfied by Paasche's method

When Fisher's method is used

P01(F) = √[ (Σ p1 q0 / Σ p0 q0) × (Σ p1 q1 / Σ p0 q1) ]

P10(F) = √[ (Σ p0 q1 / Σ p1 q1) × (Σ p0 q0 / Σ p1 q0) ]

Now,

P01(F) × P10(F) = √[ (Σ p1 q0 / Σ p0 q0) × (Σ p1 q1 / Σ p0 q1) × (Σ p0 q1 / Σ p1 q1) × (Σ p0 q0 / Σ p1 q0) ] = √1 = 1

Therefore the time reversal test is satisfied by Fisher's method.

Value index:
The value of a single commodity is the product of its price and quantity. Thus a value
index V is the sum of the values of the commodities of the given year divided by the sum
of the values of the base year, multiplied by 100.

V = (Σ p1 q1 / Σ p0 q0) × 100

3) Factor reversal test:

This test was also suggested by Irving Fisher. It holds that the product of a price index number
and the quantity index number should be equal to the corresponding value index. In other
words, the change in price multiplied by the change in quantity should be equal to the
change in value.
If p1 and p0 represent the prices and q1 and q0 the quantities in the current year and base year
respectively, and if P01 represents the change in price and Q01 the change in quantity in the
current year 1 with reference to the year 0, then symbolically

P01 × Q01 = V01 = Σ p1 q1 / Σ p0 q0

Note: This test is not satisfied by Laspeyres' method and Paasche's method. It is satisfied
by Fisher's method.
When Laspeyres' method is used

P01(La) = Σ p1 q0 / Σ p0 q0

Q01(La) = Σ q1 p0 / Σ q0 p0

Now,

P01(La) × Q01(La) = (Σ p1 q0 / Σ p0 q0) × (Σ q1 p0 / Σ q0 p0) ≠ Σ p1 q1 / Σ p0 q0

Therefore this test is not satisfied by Laspeyres' method


When Paasche's method is used

P01(Pa) = Σ p1 q1 / Σ p0 q1

Q01(Pa) = Σ q1 p1 / Σ q0 p1

Now,

P01(Pa) × Q01(Pa) = (Σ p1 q1 / Σ p0 q1) × (Σ q1 p1 / Σ q0 p1) ≠ Σ p1 q1 / Σ p0 q0

Therefore this test is not satisfied by Paasche's method


When Fisher's method is used

P01(F) = √[ (Σ p1 q0 / Σ p0 q0) × (Σ p1 q1 / Σ p0 q1) ]

Q01(F) = √[ (Σ q1 p0 / Σ q0 p0) × (Σ q1 p1 / Σ q0 p1) ]

Now,

P01(F) × Q01(F) = √[ (Σ p1 q0 / Σ p0 q0) × (Σ p1 q1 / Σ p0 q1) × (Σ q1 p0 / Σ q0 p0) × (Σ q1 p1 / Σ q0 p1) ]

               = √[ (Σ p1 q1)² / (Σ p0 q0)² ]

               = Σ p1 q1 / Σ p0 q0 = V01

Therefore this test is satisfied by Fisher's method
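The two tests can also be verified numerically. The Python sketch below (not part of the original notes) uses hypothetical prices and quantities and expresses the indices as ratios, i.e. with the factor 100 omitted, as in the derivations above.

def agg(p, q):
    # Sum of price-times-quantity products
    return sum(a * b for a, b in zip(p, q))

def laspeyres(p0, p1, q0, q1): return agg(p1, q0) / agg(p0, q0)
def paasche(p0, p1, q0, q1):   return agg(p1, q1) / agg(p0, q1)
def fisher(p0, p1, q0, q1):    return (laspeyres(p0, p1, q0, q1) * paasche(p0, p1, q0, q1)) ** 0.5

def fisher_qty(p0, p1, q0, q1):
    # Fisher's quantity index: roles of prices and quantities interchanged
    return ((agg(q1, p0) / agg(q0, p0)) * (agg(q1, p1) / agg(q0, p1))) ** 0.5

# Hypothetical data
p0, q0 = [10, 8, 5], [30, 15, 20]
p1, q1 = [12, 10, 6], [25, 16, 18]

# Time reversal test: P01 * P10 should equal 1
print(fisher(p0, p1, q0, q1) * fisher(p1, p0, q1, q0))        # 1.0
print(laspeyres(p0, p1, q0, q1) * laspeyres(p1, p0, q1, q0))  # not exactly 1

# Factor reversal test: P01 * Q01 should equal the value ratio V01
v01 = agg(p1, q1) / agg(p0, q0)
print(fisher(p0, p1, q0, q1) * fisher_qty(p0, p1, q0, q1), v01)  # the two values agree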

4) Circular test:
This is another test of consistency of an index number. It is an extension of the time reversal
test. According to this test, the index should work in a circular fashion.
Symbolically

P01 × P12 × P20 = 1

Note:
This test is not satisfied by Laspeyres' method, Paasche's method or Fisher's method.
This test is satisfied by the simple average of relatives based on G.M and by Kelly's fixed base
method.



Prove that the A.M of Laspeyres' and Paasche's index numbers is greater than or equal
to Fisher's index number.
Let
Laspeyres' index number = P01(La)
Paasche's index number = P01(Pa)
Fisher's index number = P01(F)

and we have P01(F) = √( P01(La) × P01(Pa) )

Now we have to show that

( P01(La) + P01(Pa) ) / 2 ≥ P01(F)

⇔ P01(La) + P01(Pa) ≥ 2 √( P01(La) × P01(Pa) )

⇔ ( P01(La) + P01(Pa) )² ≥ 4 P01(La) P01(Pa)

⇔ ( P01(La) − P01(Pa) )² ≥ 0

which is always true. Hence the result.



The Chain Index Numbers
In the fixed base method the base remains constant throughout, i.e. the relatives for all the
years are based on the price of that single year. On the other hand, in the chain base method,
the relative for each year is found from the prices of the immediately preceding year.
Thus the base changes from year to year. Such index numbers are useful in comparing
current year figures with the preceding year figures. The relatives found by this
method are called link relatives.

Link relative for current year = (Current year's figure / Previous year's figure) × 100

Using these link relatives we can find the chain index for each year by the formula below:

Chain index for current year = (Link relative of current year × Chain index of previous year) / 100

Note: The fixed base index number computed from the original data and the chain index
number computed from link relatives give the same value of the index provided that there
is only one commodity whose indices are being constructed.
Example: From the following data on wholesale prices of wheat for ten years, construct
index numbers taking a) 1998 as base and b) by the chain base method.

Note: The chain indices obtained in (b) are the same as the fixed base indices obtained in (a). In
fact, chain index figures will always be equal to fixed base index figures if there is only one series.
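A short Python sketch (not part of the original notes) with hypothetical prices of a single commodity illustrates the link-relative calculation and confirms the note above: for a single series the chain indices equal the fixed base indices.

# Hypothetical yearly prices of a single commodity
prices = [20, 22, 25, 24, 30]

# Link relatives: each year's price as a percentage of the previous year's price
link = [100.0] + [(prices[i] / prices[i - 1]) * 100 for i in range(1, len(prices))]

# Chain index = link relative of current year * chain index of previous year / 100
chain = [link[0]]
for lr in link[1:]:
    chain.append(lr * chain[-1] / 100)

# Fixed base indices for comparison (first year as base)
fixed = [(p / prices[0]) * 100 for p in prices]

print([round(c, 1) for c in chain])   # [100, 110.0, 125.0, 120.0, 150.0]
print([round(f, 1) for f in fixed])   # the same values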

Example-2: Compute the chain index number with 2003 prices as base from the following table
giving the average wholesale prices of the commodities A, B and C for the year 2003 to 2007



Conversion of fixed base index to chain base index

Current year's C.B.I = (Current year's F.B.I / Previous year's F.B.I) × 100

Conversion of chain base index to fixed base index

Current year's F.B.I = (Current year's C.B.I × Previous year's F.B.I) / 100
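The two conversion formulas can be applied mechanically, as in the following Python sketch (the fixed base series is hypothetical; for the first year the chain base index is taken to equal the fixed base index).

def fixed_to_chain(fbi):
    # Current year's C.B.I = current year's F.B.I / previous year's F.B.I * 100
    cbi = [fbi[0]]
    for i in range(1, len(fbi)):
        cbi.append(fbi[i] / fbi[i - 1] * 100)
    return cbi

def chain_to_fixed(cbi):
    # Current year's F.B.I = current year's C.B.I * previous year's F.B.I / 100
    fbi = [cbi[0]]
    for i in range(1, len(cbi)):
        fbi.append(cbi[i] * fbi[i - 1] / 100)
    return fbi

fbi = [100, 110, 125, 120, 150]
cbi = fixed_to_chain(fbi)
print([round(x, 1) for x in cbi])                  # [100, 110.0, 113.6, 96.0, 125.0]
print([round(x, 1) for x in chain_to_fixed(cbi)])  # back to the original F.B.I series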

Example: Compute the chain base index numbers



Example: Calculate fixed base index numbers from the following chain base index numbers

Note: It may be remembered that the fixed base index for the first year is the same as the
chain base index for that year.

Merits of chain index numbers:


1. The chain base method has a great significance in practice, because in economic and
business data we are often concerned with making comparison with the previous
period.
2. Chain base method doesn‘t require the recalculation if some more items are
introduced or deleted from the old data.
3. Index numbers calculated from the chain base method are free from seasonal and
cyclical variations.

Demerits of chain index numbers:


1. This method is not useful for long term comparisons.
2. If there is any abnormal year in the series it will affect the subsequent years also.

Differences between fixed base and chain base methods:


Chain base                                        Fixed base
1. The base year changes.                         1. The base year does not change.
2. The link relative method is used.              2. No such link relative method is used.
3. Calculations are tedious.                      3. Calculations are simple.
4. It cannot be computed if any one year          4. It can be computed even if any year
   is missing.                                       is missing.
5. It is suitable for short periods.              5. It is suitable for long periods.
6. Index numbers will be wrong if an error        6. The error is confined to the index of
   is committed in the calculation of                that year only.
   link relatives.

Base shifting
One of the most frequent operations necessary in the use of index numbers is changing
the base of an index from one period to another without recompiling the entire series.
Such a change is referred to as 'base shifting'. The reasons for shifting the base are
1. The previous base has become too old and is almost useless for purposes of
comparison.
2. The comparison is to be made with another series of index numbers having a
different base.
The following formula is used for base shifting:

Index number based on new base year = (Current year's old index number / New base year's old index number) × 100

Example:
The following are the index numbers of prices with 1998 as base year

Year    Index        Year    Index
1998    100          2003    410
1999    110          2004    400
2000    120          2005    380
2001    200          2006    370
2002    400          2007    340

Shift the base from 1998 to 2004 and recast the index numbers.
Solution:



Index number based on new base year = (Current year's old index number / New base year's old index number) × 100

Index number for 1998 = (100/400) × 100 = 25
………………………………………….
Index number for 2007 = (340/400) × 100 = 85

Year    Index number      Index number
        (1998 as base)    (2004 as base)
1998    100               (100/400) × 100 = 25
1999    110               (110/400) × 100 = 27.5
2000    120               (120/400) × 100 = 30
2001    200               (200/400) × 100 = 50
2002    400               (400/400) × 100 = 100
2003    410               (410/400) × 100 = 102.5
2004    400               (400/400) × 100 = 100
2005    380               (380/400) × 100 = 95
2006    370               (370/400) × 100 = 92.5
2007    340               (340/400) × 100 = 85

Splicing of two series of index numbers:


The problem of combining two or more overlapping series of index numbers into one
continuous series is called splicing. In other words, if we have a series of index numbers
with some base year which is discontinued at some year and we have another series of
index numbers with the year of discontinuation as the base, and connecting these two
series to make a continuous series is called splicing.

The following formula is used in this method of splicing:

Index number after splicing = (Index number to be spliced × Old index number of the existing base) / 100


Example: The index A given was started in 1993 and continued up to 2003 in which year
another index B was started. Splice the index B to index A so that a continuous series of
index is made
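A Python sketch of splicing (not part of the original notes) with two hypothetical overlapping series: index B starts at 100 in the year index A is discontinued.

# Index A (old series) runs up to the overlap year; index B (new series) starts
# there with 100. Hypothetical figures for illustration.
index_a = {2000: 100, 2001: 120, 2002: 150, 2003: 180}   # old series
index_b = {2003: 100, 2004: 110, 2005: 125}              # new series, 2003 = 100

overlap = index_a[2003]   # old index number of the existing base year

# Index after splicing = index number to be spliced * old index of existing base / 100
spliced = dict(index_a)
for year, value in index_b.items():
    spliced[year] = value * overlap / 100

print(spliced)   # {..., 2003: 180, 2004: 198.0, 2005: 225.0}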

Deflating
Deflating means correcting or adjusting a value which has been inflated. It makes allowance
for the effect of price changes. When prices rise, the purchasing power of money
declines. If the money incomes of people remain constant between two periods and the prices
of commodities are doubled, the purchasing power of money is reduced to half. For
example, if there is an increase in the price of rice from Rs 10/kg in the year 1980 to
Rs 20/kg in the year 1982, then a person can buy only half a kilo of rice with Rs 10. So the
purchasing power of a rupee is only 50 paise in 1982 as compared to 1980.

Thus the purchasing power of money = 1 / Price index



In times of rising prices the money wages should be deflated by the price index to get the
figure of real wages. The real wages alone tell whether a wage earner is in a better
or a worse position.
For calculating the real wage, the money wage or income is divided by the corresponding price
index and multiplied by 100.

i.e. Real wages = (Money wages / Price index) × 100

Thus Real Wage Index = (Real wage of current year / Real wage of base year) × 100
Example: The following table gives the annual income of a worker and the general Index
Numbers of price during 1999-2007. Prepare Index Number to show the changes in the
real income of the teacher and comment on price increase

The method discussed above is frequently used to deflate individual values, value series or
value indices. Its special use is in problems dealing with such diversified things as rupee
sales, rupee inventories of manufacturers, wholesalers and retailers, incomes, wages and
the like.
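A Python sketch of deflating money wages (not part of the original notes; the wages and price indices below are hypothetical, with the base year first).

# Hypothetical money wages and price index numbers (base year first)
money_wages = [2000, 2200, 2500, 2800]
price_index = [100, 110, 125, 150]

# Real wage = money wage / price index * 100
real_wages = [w / p * 100 for w, p in zip(money_wages, price_index)]

# Real wage index = real wage of current year / real wage of base year * 100
real_wage_index = [r / real_wages[0] * 100 for r in real_wages]

print([round(r, 1) for r in real_wages])       # [2000.0, 2000.0, 2000.0, 1866.7]
print([round(i, 1) for i in real_wage_index])  # [100.0, 100.0, 100.0, 93.3]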

Cost of living index numbers (or) Consumer price index numbers:



Cost of living index numbers measure the changes in the level of prices of
commodities which directly affect the cost of living of a specified group of persons at a
specified place. The general index numbers fail to give an idea of the cost of living of
different classes of people at different places.
Different classes of people consume different types of commodities, and people's
consumption habits also vary from person to person, place to place and class to class, i.e.
richer class, middle class and poor class. For example, the cost of living of rickshaw
pullers at BBSR is different from that of rickshaw pullers at Kolkata. The consumer price
index helps us in determining the effect of a rise or fall in prices on different classes of
consumers living in different areas.

Construction of cost of living index numbers


The following are the main steps in constructing a cost of living index number.
1. Decision about the class of people for whom the index is meant
It is absolutely essential to decide clearly the class of people for whom the index is
meant i.e. whether it relates to industrial workers, teachers, officers, labors, etc. Along
with the class of people it is also necessary to decide the geographical area covered
by the index, such as a city, or an industrial area or a particular locality in a city.
2. Conducting family budget enquiry
Once the scope of the index is clearly defined the next step is to conduct a sample
family budget enquiry i.e. we select a sample of families from the class of people for
whom the index is intended and scrutinize their budgets in detail. The enquiry should
be conducted during a normal period, i.e. a period free from economic booms or
depressions. The purpose of the enquiry is to determine the amount an average
family spends on different items. The family budget enquiry gives information about
the nature and quality of the commodities consumed by the people. The commodities
are classified under the following heads:
i) Food ii) Clothing iii) Fuel and Lighting iv) House rent v) Miscellaneous

3. Collecting retail prices of different commodities


The collection of retail prices is a very important and at the same time very difficult
task, because such prices may vary from place to place, shop to shop and person to
person. Price quotations should be obtained from the local markets where the class of
people reside, or from super bazaars or departmental stores from which they usually
make their purchases.

Uses of cost of living index numbers


1. Cost of living index numbers indicate whether the real wages are rising or falling. In
other words they are used for calculating the real wages and to determine the change
in the purchasing power of money.
Purchasing power of money = 1 / Cost of living index number

Real wages = (Money wages / Cost of living index number) × 100
2. Cost of living indices are used for the regulation of D.A or the grant of bonus to the
workers so as to enable them to meet the increased cost of living.
3. Cost of living index numbers are used widely in wage negotiations.
4. These index numbers also used for analyzing markets for particular kinds of goods.

Methods for construction of cost of living index numbers


Cost of living index number can be constructed by the following formulae.
1) Aggregate expenditure method or weighted aggregative method
2) Family budget method or the method of weighted relatives

1) Aggregate expenditure method or weighted aggregative method


In this method the quantities of commodities consumed by the particular group in the
base
year are taken as weights. The formula is given by

Consumer price index = (Σ p1 q0 / Σ p0 q0) × 100

Steps:
i) The prices of commodities for the various groups for the current year are multiplied by the
quantities of the base year to obtain the aggregate expenditure of the current year,
i.e. Σ p1 q0.

ii) Similarly obtain Σ p0 q0.

iii) The aggregate expenditure of the current year is divided by the aggregate expenditure of
the base year and the quotient is multiplied by 100.

Symbolically: (Σ p1 q0 / Σ p0 q0) × 100

2) Family budget method or the method of weighted relatives


In this method the cost of living index is obtained by taking the weighted average of price
relatives, where the weights are the values of the quantities consumed in the base year,
i.e. V = p0 q0. Thus the consumer price index number is given by

Consumer price index = Σ PV / Σ V

where P = (p1/p0) × 100 for each item, and V = p0 q0, the value in the base year.
Note: The answer obtained by applying the aggregate expenditure method
and the family budget method will be the same.
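The equivalence of the two methods can be checked with a short Python sketch (not part of the original notes); the prices and quantities below are hypothetical.

# Hypothetical base year prices (p0), current year prices (p1) and base year quantities (q0)
p0 = [20, 30, 10, 25, 40]
p1 = [25, 33, 12, 30, 44]
q0 = [10, 5, 8, 4, 2]

# 1) Aggregate expenditure method: sum(p1*q0) / sum(p0*q0) * 100
cpi_aggregate = sum(a * b for a, b in zip(p1, q0)) / sum(a * b for a, b in zip(p0, q0)) * 100

# 2) Family budget method: weighted average of price relatives with weights V = p0*q0
P = [(c / b) * 100 for b, c in zip(p0, p1)]
V = [b * q for b, q in zip(p0, q0)]
cpi_family_budget = sum(p * v for p, v in zip(P, V)) / sum(V)

print(round(cpi_aggregate, 2), round(cpi_family_budget, 2))   # both about 117.9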

Example: Construct the consumer price index number for 2007 on the basis of 2006 from the
following data using (i) the aggregate expenditure method, and (ii) the family budget method.



Thus, the answer is the same by both the methods. However, the reader should prefer the
aggregate expenditure method because it is far easier to apply compared to the family
budget method.
Possible errors in construction of cost of living index numbers:
Cost of living index numbers or its recently popular name consumer price index numbers
are not accurate due to various reasons.
1. Errors may occur in the construction because of inaccurate specification of groups
for whom the index is meant.
2. Faulty selection of representative commodities resulting out of unscientific family
budget enquiries.
3. Inadequate and unrepresentative nature of price quotations and use of inaccurate
weights
4. Frequent changes in demand and prices of the commodity
5. The average family might not be always a representative one.



Problems or steps in construction of wholesale price index numbers (WPI):
Index numbers are the best indicators of the economic progress of a community, a nation
and the world as a whole. Wholesale price index numbers can also be constructed for
different economic activities such as Indices of Agricultural production, Indices of
Industrial production, Indices of Foreign Trade etc. Besides some International
organizations like the United Nations Organization, the F.A.O. of the U.N., the World
Bank and International Labour Organization, there are a number of organizations in the
country who publish index numbers on different aspects. These are (a) Ministry of Food
and Agriculture, (b) Reserve Bank of India, (c) Central Statistical Organization, (d)
Department of Commercial
Intelligence and Statistics, (e) Labour Bureau, (f) Eastern Economist. The Central
Statistical Organization of the Government of India publishes a Monthly Abstract of
Statistics which contains All India index numbers of Wholesale Prices (Revised series :
Base year 1981-82) both commodity-wise and also for the aggregate.

i. Purpose or object of index numbers.


A wholesale price index number which is properly designed for a purpose can be a most
useful and powerful tool. Thus the first and foremost problem is to determine the
purpose of the index numbers. If we know the purpose of the index numbers we can settle
some related problems.

ii. Selection of commodities


Representative items should be taken into consideration. The items may be grouped under
relatively homogeneous heads to simplify the calculation. For the construction of the WPI of a
region or country we may group the commodities as: (i) Primary Articles — (a) Food
Articles (b) Non-food Articles (c) Minerals; (ii) Fuel, Power, Light and Lubricants; (iii)
Manufactured Products; (iv) Chemicals and Chemical Products; (v) Machinery and
Machine Equipment; (vi) Other Miscellaneous Manufacturing Industries.

iii. Selection of base period


1. The base period must be a normal period, i.e. a period free from all sorts of
abnormalities or random fluctuations such as labor strikes, wars, floods and
earthquakes.
2. The base period should not be too distant from the given period. Since index
numbers are essential tools in business planning and economic policies, the base
period should not be too far from the current period. For example, for deciding an
increase in dearness allowance at present there is no advantage in taking 1950 or
1960 as the base; the comparison should be with the preceding year after which the
DA has not been increased.
3. Fixed base or chain base. While selecting the base a decision has to be made as to
whether the base shall remain fixed or not, i.e. whether we have a fixed base or a chain
base. In the fixed base method the year to which the other years are compared is
constant. On the other hand, in the chain base method the prices of a year are linked
with those of the preceding year. The chain base method gives a better picture than
what is obtained by the fixed base method.

iv. Data for index numbers


The data, usually the set of prices and quantities consumed of the selected commodities
for different periods, places etc., constitute the raw material for the construction of
wholesale price index numbers. The data should be collected from reliable sources such as
standard trade journals, official publications etc.
v. Selection of appropriate weights
A decision as to the choice of weights is an important aspect of the construction of index
numbers. The problem arises because all items included in the construction are not of equal
importance. So proper weights should be attached to them to take into account their relative
importance. Thus there are two type of indices.
1. Unweighted indices - in which no specific weights are attached
2. Weighted indices - in which appropriate weights are assigned to various items.
vi. Choice of average.
Since index numbers are specialized averages, the choice of the average to be used in their
construction is of great importance. Usually the following averages are used:
i) A.M ii) G.M iii) Median
Among these averages G.M is the most appropriate average to be used. But in practice G.M is
not used as often as A.M because of its computational difficulties.

vii. Choice of formula
The selection of a formula, along with a method of averaging, depends on the data at hand and
the purpose for which it is used. Different formulae developed for the purpose have already
been discussed in earlier sections.

Wholesale price index numbers (Vs) consumer price index numbers:


1. The wholesale price index number measures the change in the price level in a country as
a whole, for example the Economic Adviser's index numbers of wholesale prices,
whereas cost of living index numbers measure the change in the cost of living of a
particular class of people stationed at a particular place. In this index number we take
the retail prices of the commodities.
2. The wholesale price index number and the consumer price index numbers are
generally different because there is a lag between the movement of wholesale prices
and retail prices.
3. The retail prices required for the construction of the consumer price index number may
increase much faster than the wholesale prices, i.e. there might be erratic changes in
the consumer price index number unlike the wholesale price index numbers.
4. The method of constructing index numbers is in general the same for wholesale prices
and cost of living. However, the wholesale price index number is based on different weighting
systems and the selection of commodities is also different as compared to the cost of
living index number.

Importance and methods of assigning weights


The problem of selecting suitable weights is quite important and at the same time quite
difficult to decide. The term weight refers to the relative importance of the different items
in the construction of the index. Generally the various items, say wheat, rice, kerosene,
clothing etc., included in the index are not of equal importance, so proper weights should be
attached to them to take into account their relative importance. Thus there are two types of
indices.
1) Unweighted indices – in which no specific weights are attached to various
commodities.
2) Weighted indices – in which appropriate weights are assigned to various
commodities. The Unweighted indices can be interpreted as weighted indices by
assuming the corresponding weight for each commodity being unity. But actually the
commodities included in the index are all not of equal importance. Therefore it is
necessary to adopt some suitable method of weighting, so that arbitrary and
haphazard weights may not affect the results.
There are two methods of assigning weights.
i) Implicit weighting
ii) Explicit weighting
In implicit weighting, a commodity or its variety is included in the index a number of
times. For example, if wheat is to be given twice as much importance in an index as rice, then
the weight of wheat is two. In explicit weighting, two types of weights can be
assigned, i.e. quantity weights or value weights.
A quantity weight, symbolized by q, means the amount of a commodity produced,
distributed or consumed in some time period. A value weight, on the other hand, combines
price with quantity produced, distributed or consumed and is denoted by v = pq.
For example, quantity weights are used in the weighted aggregative methods such as
Laspeyres' and Paasche's index numbers, and value weights are used in the method of
weighted average of price relatives.
Limitations or demerits of index numbers
Although index numbers are indispensable tools in economics, business, management etc,
they have their limitations and proper care should be taken while interpreting them. Some
of the limitations of index numbers are
1. Since index numbers are generally based on a sample, it is not possible to take into
account each and every item in the construction of index.
2. At each stage of the construction of index numbers, starting from the selection of
commodities to the choice of formulae, there is a chance of error being
introduced.
3. Index numbers are also a special type of average; since the various averages like
the mean, median and G.M have their relative limitations, their use may also introduce
some error.
4. None of the formulae for the construction of index numbers is exact, and each contains the
so called formula error. For example Laspeyres' index number has an upward bias
while Paasche's index has a downward bias.
5. An index number is used to measure the change for a particular purpose only. Its
misuse for other purpose would lead to unreliable conclusions.
6. In the construction of price or quantity index numbers it may not be possible to
retain the uniform quality of commodities during the period of investigation.



CHAPTER 7: PROBABILITY DISTRIBUTION
Introduction to Probability Distribution
Definition (probability distribution): a function that describes how probabilities are distributed
over the values of the random variable.

Discrete and continuous variables


A variable is a quantity whose value changes.

A discrete variable is a variable whose value is obtained by counting.

Examples:
 number of students present
 number of red marbles in a jar
 number of heads when flipping three coins
 students‘ grade level

A continuous variable is a variable whose value is obtained by measuring.

Examples:
 height of students in class
 weight of students in class
 time it takes to get to school
 distance traveled between classes

A random variable is a variable whose value is a numerical outcome of a random


phenomenon.

▪ A random variable is denoted with a capital letter


▪ The probability distribution of a random variable X tells what the possible
values of X are and how probabilities are assigned to those values

▪ A random variable can be discrete or continuous

A discrete random variable X has a countable number of possible values.

Example: Let X represent the sum of two dice.

Then the probability distribution of X is as follows:

X      2     3     4     5     6     7     8     9     10    11    12
P(X)   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

To graph the probability distribution of a discrete random variable, construct a


probability histogram.
A continuous random variable X takes all values in a given interval of numbers.
▪ The probability distribution of a continuous random variable is shown by a density curve.
▪ The probability that X is between an interval of numbers is the area under the
density curve between the interval endpoints.
▪ The probability that a continuous random variable X is exactly equal to a
number is zero.

Discrete Probability Distributions

If a random variable is a discrete variable, its probability distribution is called a discrete


probability distribution.

An example will make this clear. Suppose you flip a coin two times. This simple statistical
experiment can have four possible outcomes: HH, HT, TH, and TT. Now, let the random
variable X represent the number of Heads that result from this experiment. The random
variable X can only take on the values 0, 1, or 2, so it is a discrete random variable.

The probability distribution for this statistical experiment appears below.

Number of heads Probability


0 0.25
1 0.50
2 0.25

The above table represents a discrete probability distribution because it relates each value
of a discrete random variable with its probability of occurrence.

Examples of discrete probability distributions.

• Binomial probability distribution


• Hypergeometric probability distribution
• Multinomial probability distribution
• Negative binomial distribution
• Poisson probability distribution

Note: With a discrete probability distribution, each possible value of the discrete random
variable can be associated with a non-zero probability. Thus, a discrete probability
distribution can always be presented in tabular form.

Continuous Probability Distributions

If a random variable is a continuous variable, its probability distribution is called a


continuous probability distribution.

A continuous probability distribution differs from a discrete probability distribution in


several ways.

• The probability that a continuous random variable will assume a particular value is
zero.
• As a result, a continuous probability distribution cannot be expressed in tabular
form.
• Instead, an equation or formula is used to describe a continuous probability
distribution.

Most often, the equation used to describe a continuous probability distribution is called a
probability density function. Sometimes, it is referred to as a density function, a PDF, or
a pdf. For a continuous probability distribution, the density function has the following
properties:

• Since the continuous random variable is defined over a continuous range of values
(called the domain of the variable), the graph of the density function will also be
continuous over that range.
• The area bounded by the curve of the density function and the x-axis is equal to 1,
when computed over the domain of the variable.
• The probability that a random variable assumes a value between a and b is equal to
the area under the density function bounded by a and b.

For example, consider the probability density function shown in the graph below.
Suppose we wanted to know the probability that the random variable X was less than or
equal to a. The probability that X is less than or equal to a is equal to the area under the
curve bounded by a and minus infinity - as indicated by the shaded area.



Note: The shaded area in the graph represents the probability that the random variable X
is less than or equal to a. This is a cumulative probability. However, the probability that X
is exactly equal to a would be zero. A continuous random variable can take on an infinite
number of values. The probability that it will equal a specific value (such as a) is always
zero.

Examples of continuous probability distributions.


• Normal probability distribution
• Student's t distribution
• Chi-square distribution
• F distribution

Discrete & Continuous Probability Distributions: Problems and Solutions


Binomial Probability Distribution
To understand binomial distributions and binomial probability, it helps to understand
binomial experiments and some associated notation; so we cover those topics first.

Binomial Experiment
A binomial experiment is a statistical experiment that has the following properties:

• The experiment consists of n repeated trials.
• Each trial can result in just two possible outcomes. We call one of these outcomes a
success and the other, a failure.
• The probability of success, denoted by P, is the same on every trial.
• The trials are independent; that is, the outcome on one trial does not affect the
outcome on other trials.

Consider the following statistical experiment. You flip a coin 2 times and count the
number of times the coin lands on heads. This is a binomial experiment because:

• The experiment consists of repeated trials. We flip a coin 2 times.


• Each trial can result in just two possible outcomes - heads or tails.
• The probability of success is constant - 0.5 on every trial.
• The trials are independent; that is, getting heads on one trial does not affect whether
we get heads on other trials.

Notation
The following notation is helpful, when we talk about binomial probability.
• x: The number of successes that result from the binomial experiment.
• n: The number of trials in the binomial experiment.
• P: The probability of success on an individual trial.
• Q: The probability of failure on an individual trial. (This is equal to 1 - P.)
• n!: The factorial of n (also known as n factorial).
• b(x; n, P): Binomial probability - the probability that an n-trial binomial
experiment results in exactly x successes, when the probability of success on an
individual trial is P.
• nCr: The number of combinations of n things, taken r at a time.

Binomial Distribution
A binomial random variable is the number of successes x in n repeated trials of a binomial
experiment. The probability distribution of a binomial random variable is called a binomial
distribution.

Suppose we flip a coin two times and count the number of heads (successes). The binomial
random variable is the number of heads, which can take on values of 0, 1, or 2. The
binomial distribution is presented below.
Number of heads Probability
0 0.25
1 0.50
2 0.25
The binomial distribution has the following properties:

• The mean of the distribution (μx) is equal to n * P.
• The variance (σ²x) is n * P * (1 - P).
• The standard deviation (σx) is sqrt[ n * P * (1 - P) ].

Binomial Formula and Binomial Probability


The binomial probability refers to the probability that a binomial experiment results in
exactly x successes. For example, in the above table, we see that the binomial probability
of getting exactly one head in two coin flips is 0.50.

Given x, n, and P, we can compute the binomial probability based on the binomial formula:
Binomial Formula. Suppose a binomial experiment consists of n trials and results in
x successes. If the probability of success on an individual trial is P, then the binomial
probability is:

b(x; n, P) = nCx * P^x * (1 - P)^(n - x)   or

b(x; n, P) = { n! / [ x! (n - x)! ] } * P^x * (1 - P)^(n - x)

Example 1
Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?



Solution: This is a binomial experiment in which the number of trials is equal to 5, the
number of successes is equal to 2, and the probability of success on a single trial is 1/6 or
about 0.167. Therefore, the binomial probability is:

b(2; 5, 0.167) = 5C2 * (0.167)^2 * (0.833)^3

b(2; 5, 0.167) = 0.161
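The binomial formula is easy to evaluate directly. The Python sketch below (not part of the original notes) reproduces Example 1; math.comb gives the number of combinations nCx.

from math import comb

def binomial(x, n, p):
    # b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Example 1 above: probability of exactly 2 fours in 5 tosses of a die
print(round(binomial(2, 5, 1 / 6), 3))   # about 0.161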

Cumulative Binomial Probability


A cumulative binomial probability refers to the probability that the binomial random
variable falls within a specified range (e.g., is greater than or equal to a stated lower limit
and less than or equal to a stated upper limit).

For example, we might be interested in the cumulative binomial probability of obtaining


45 or fewer heads in 100 tosses of a coin (see Example 1 below). This would be the sum
of all these individual binomial probabilities.

b(x ≤ 45; 100, 0.5) =

b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + ... + b(x = 44; 100, 0.5) + b(x = 45; 100, 0.5)

Example 1
What is the probability of obtaining 45 or fewer heads in 100 tosses of a coin?

Solution: To solve this problem, we compute 46 individual probabilities, using the binomial
formula. The sum of all these probabilities is the answer we seek. Thus,

b(x ≤ 45; 100, 0.5) = b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + . . . + b(x = 45; 100, 0.5)
b(x ≤ 45; 100, 0.5) = 0.184

Example 2
The probability that a student is accepted to a prestigious college is 0.3. If 5 students from
the same school apply, what is the probability that at most 2 are accepted?

Solution: To solve this problem, we compute 3 individual probabilities, using the binomial
formula. The sum of all these probabilities is the answer we seek. Thus,

b(x ≤ 2; 5, 0.3) = b(x = 0; 5, 0.3) + b(x = 1; 5, 0.3) + b(x = 2; 5, 0.3)
b(x ≤ 2; 5, 0.3) = 0.1681 + 0.3601 + 0.3087
b(x ≤ 2; 5, 0.3) = 0.8369
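Cumulative binomial probabilities are simply sums of the individual terms. The sketch below (not part of the original notes) reproduces the two examples above.

from math import comb

def binomial(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def binomial_cdf(x, n, p):
    # Probability of x or fewer successes: sum of the individual probabilities
    return sum(binomial(k, n, p) for k in range(x + 1))

print(round(binomial_cdf(2, 5, 0.3), 4))     # 0.8369 (Example 2 above)
print(round(binomial_cdf(45, 100, 0.5), 3))  # 0.184  (Example 1 above)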

Example 3
What is the probability that the world series will last 4 games? 5 games? 6 games? 7 games?
Assume that the teams are evenly matched.

Solution: This is a very tricky application of the binomial distribution. If you can follow
the logic of this solution, you have a good understanding of the material covered in the
tutorial, to this point.



In the world series, there are two baseball teams. The series ends when the winning team
wins 4 games. Therefore, we define a success as a win by the team that ultimately
becomes the world series champion.

For the purpose of this analysis, we assume that the teams are evenly matched. Therefore,
the probability that a particular team wins a particular game is 0.5.

Let's look first at the simplest case. What is the probability that the series lasts only 4
games? This can occur if one team wins the first 4 games. The probability of the National
League team winning 4 games in a row is:

b(4; 4, 0.5) = 4C4 * (0.5)^4 * (0.5)^0 = 0.0625

Similarly, when we compute the probability of the American League team winning 4
games in a row, we find that it is also 0.0625. Therefore, probability that the series ends in
four games would be 0.0625 + 0.0625 = 0.125; since the series would end if either the
American or National League team won 4 games in a row.

Now let's tackle the question of finding probability that the world series ends in 5 games.
The trick in finding this solution is to recognize that the series can only end in 5 games, if
one team has won 3 out of the first 4 games. So let's first find the probability that the
American League team wins exactly 3 of the first 4 games.

b(3; 4, 0.5) = 4C3 * (0.5)^3 * (0.5)^1 = 0.25

Okay, here comes some more tricky stuff, so listen up. Given that the American League
team has won 3 of the first 4 games, the American League team has a 50/50 chance of
winning the fifth game to end the series. Therefore, the probability of the American
League team winning the series in 5 games is 0.25 * 0.50 = 0.125. Since the National
League team could also win the series in 5 games, the probability that the series ends in 5
games would be 0.125 + 0.125 = 0.25.

The rest of the problem would be solved in the same way. You should find that the
probability of the series ending in 6 games is 0.3125; and the probability of the series
ending in 7 games is also 0.3125.

Poisson Distribution
A Poisson distribution is the probability distribution that results from a Poisson experiment.

Attributes of a Poisson Experiment


A Poisson experiment is a statistical experiment that has the following properties:
• The experiment results in outcomes that can be classified as successes or failures.
• The average number of successes (μ) that occurs in a specified region is known.
• The probability that a success will occur is proportional to the size of the region.
• The probability that a success will occur in an extremely small region is virtually zero.



Note that the specified region could take many forms. For instance, it could be a length, an
area, a volume, a period of time, etc.

Notation
The following notation is helpful, when we talk about the Poisson distribution.

• e: A constant equal to approximately 2.71828. (Actually, e is the base of the natural
logarithm system.)
• μ: The mean number of successes that occur in a specified region.
• x: The actual number of successes that occur in a specified region.
• P(x; μ): The Poisson probability that exactly x successes occur in a Poisson
experiment, when the mean number of successes is μ.

Poisson Distribution
A Poisson random variable is the number of successes that result from a Poisson
experiment. The probability distribution of a Poisson random variable is called a Poisson
distribution.

Given the mean number of successes (μ) that occur in a specified region, we can compute
the Poisson probability based on the following formula:

Poisson Formula. Suppose we conduct a Poisson experiment, in which the average


number of successes within a given region is μ. Then, the Poisson probability is:

P(x; μ) = (e^-μ) (μ^x) / x!

where x is the actual number of successes that result from the experiment, and e is
approximately equal to 2.71828.

The Poisson distribution has the following properties:

• The mean of the distribution is equal to μ.
• The variance is also equal to μ.

Example 1
The average number of homes sold by the Acme Realty company is 2 homes per day. What
is the probability that exactly 3 homes will be sold tomorrow?

Solution: This is a Poisson experiment in which we know the following:

• μ = 2; since 2 homes are sold per day, on average.
• x = 3; since we want to find the likelihood that 3 homes will be sold tomorrow.
• e = 2.71828; since e is a constant equal to approximately 2.71828.

We plug these values into the Poisson formula as follows:



P(x; μ) = (e^-μ) (μ^x) / x!
P(3; 2) = (2.71828^-2) (2^3) / 3!
P(3; 2) = (0.13534) (8) / 6
P(3; 2) = 0.180

Thus, the probability of selling 3 homes tomorrow is 0.180.
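The Poisson formula can also be evaluated directly in Python. The sketch below (not part of the original notes) reproduces Example 1.

from math import exp, factorial

def poisson(x, mu):
    # P(x; μ) = e^-μ * μ^x / x!
    return exp(-mu) * mu ** x / factorial(x)

# Example 1 above: 2 homes sold per day on average, probability of exactly 3 tomorrow
print(round(poisson(3, 2), 3))   # about 0.180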

Cumulative Poisson Probability


A cumulative Poisson probability refers to the probability that the Poisson random
variable is greater than some specified lower limit and less than some specified upper
limit.

Example 1
Suppose the average number of lions seen on a 1-day safari is 5. What is the probability that
tourists will see fewer than four lions on the next 1-day safari?

Solution: This is a Poisson experiment in which we know the following:

• μ = 5; since 5 lions are seen per safari, on average.
• x = 0, 1, 2, or 3; since we want to find the likelihood that tourists will see fewer
than 4 lions; that is, we want the probability that they will see 0, 1, 2, or 3 lions.
• e = 2.71828; since e is a constant equal to approximately 2.71828.

To solve this problem, we need to find the probability that tourists will see 0, 1, 2, or 3 lions.
Thus, we need to calculate the sum of four probabilities: P(0; 5) + P(1; 5) + P(2; 5) + P(3;
5). To compute this sum, we use the Poisson formula:

P(x ≤ 3; 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)

P(x ≤ 3; 5) = [ (e^-5)(5^0) / 0! ] + [ (e^-5)(5^1) / 1! ] + [ (e^-5)(5^2) / 2! ] + [ (e^-5)(5^3) / 3! ]
P(x ≤ 3; 5) = [ (0.006738)(1) / 1 ] + [ (0.006738)(5) / 1 ] + [ (0.006738)(25) / 2 ] +
[ (0.006738)(125) / 6 ]
P(x ≤ 3; 5) = [ 0.0067 ] + [ 0.03369 ] + [ 0.084224 ] + [ 0.140375 ]
P(x ≤ 3; 5) = 0.2650

Thus, the probability of seeing no more than 3 lions is 0.2650.
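The cumulative Poisson probability is again a sum of individual terms. The sketch below (not part of the original notes) reproduces the safari example.

from math import exp, factorial

def poisson(x, mu):
    return exp(-mu) * mu ** x / factorial(x)

def poisson_cdf(x, mu):
    # Probability of x or fewer successes
    return sum(poisson(k, mu) for k in range(x + 1))

# Lions example above: mean of 5 lions per safari, probability of 3 or fewer
print(round(poisson_cdf(3, 5), 4))   # about 0.2650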

Normal Distribution
The normal distribution refers to a family of continuous probability distributions
described by the normal equation.

The Normal Equation


The normal distribution is defined by the following equation:



Normal equation. The value of the random variable Y is:

Y = { 1/[ σ * sqrt(2π) ] } * e^( -(x - μ)² / (2σ²) )

where X is a normal random variable, μ is the mean, σ is the standard deviation, π is
approximately 3.14159, and e is approximately 2.71828.
The random variable X in the normal equation is called the normal random variable. The
normal equation is the probability density function for the normal distribution.

The Normal Curve


The graph of the normal distribution depends on two factors - the mean and the standard
deviation. The mean of the distribution determines the location of the center of the graph,
and the standard deviation determines the height and width of the graph. When the
standard deviation is large, the curve is short and wide; when the standard deviation is
small, the curve is tall and narrow. All normal distributions look like a symmetric, bell-
shaped curve, as shown below.

The curve on the left is shorter and wider than the curve on the right, because the curve on
the left has a bigger standard deviation.

Probability and the Normal Curve


The normal distribution is a continuous probability distribution. This has several
implications for probability.

• The total area under the normal curve is equal to 1.


• The probability that a normal random variable X equals any particular value is 0.
• The probability that X is greater than a equals the area under the normal curve bounded by a
and plus infinity (as indicated by the non-shaded area in the figure below).
• The probability that X is less than a equals the area under the normal curve bounded by a and
minus infinity (as indicated by the shaded area in the figure below).



Additionally, every normal curve (regardless of its mean or standard deviation) conforms
to the following "rule".

• About 68% of the area under the curve falls within 1 standard deviation of the mean.
• About 95% of the area under the curve falls within 2 standard deviations of the mean.
• About 99.7% of the area under the curve falls within 3 standard deviations of the mean.

Collectively, these points are known as the empirical rule or the 68-95-99.7 rule. Clearly,
given a normal distribution, most outcomes will be within 3 standard deviations of the
mean.

Example 1
An average light bulb manufactured by the Acme Corporation lasts 300 days with a standard
deviation of 50 days. Assuming that bulb life is normally distributed, what is the probability
that an Acme light bulb will last at most 365 days?

Solution: Given a mean score of 300 days and a standard deviation of 50 days, we want to
find the cumulative probability that bulb life is less than or equal to 365 days. Thus, we
know the following:

• The value of the normal random variable is 365 days.


• The mean is equal to 300 days.
• The standard deviation is equal to 50 days.

We enter these values into the Normal Distribution Calculator and compute the cumulative
probability. The answer is: P( X < 365) = 0.90. Hence, there is a 90% chance that a light
bulb will burn out within 365 days.

Example 2
Suppose scores on an IQ test are normally distributed. If the test has a mean of 100 and a
standard deviation of 10, what is the probability that a person who takes the test will score
between 90 and 110?

Solution: Here, we want to know the probability that the test score falls between 90 and 110.
The "trick" to solving this problem is to realize the following:

P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )

We use the Normal Distribution Calculator to compute both probabilities on the right side of
the above equation.

• To compute P( X < 110 ), we enter the following inputs into the calculator: The value of
the normal random variable is 110, the mean is 100, and the standard deviation is 10. We
find that P( X < 110 ) is 0.84.



• To compute P( X < 90 ), we enter the following inputs into the calculator: The value of the
normal random variable is 90, the mean is 100, and the standard deviation is 10. We find
that P( X < 90 ) is 0.16.

We use these findings to compute our final answer as follows:

P( 90 < X < 110 ) = P( X < 110 ) - P( X < 90 )


P( 90 < X < 110 ) = 0.84 - 0.16
P( 90 < X < 110 ) = 0.68

Thus, about 68% of the test scores will fall between 90 and 110.
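The cumulative normal probabilities used in Examples 1 and 2 can be computed without a calculator table, using the error function from Python's standard library. This is only an illustrative sketch, not part of the original notes.

from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    # P(X <= x) for a normal random variable with mean mu and standard deviation sigma
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Example 1 above: bulb life, mean 300 days, standard deviation 50 days
print(round(normal_cdf(365, 300, 50), 2))   # about 0.90

# Example 2 above: IQ scores, mean 100, standard deviation 10
print(round(normal_cdf(110, 100, 10) - normal_cdf(90, 100, 10), 2))   # about 0.68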

CHAPTER 8: NETWORK PLANNING


Introduction to Project Planning
The Importance/Uses of Planning in an Organization
Planning helps an organization chart a course for the achievement of its goals. The process
begins with reviewing the current operations of the organization and identifying what needs to
be improved operationally in the upcoming year. From there, planning involves envisioning the
results the organization wants to achieve, and determining the steps necessary to arrive at the
intended destination--success, whether that is measured in financial terms, or goals that include
being the highest-rated organization in customer satisfaction.

Efficient Use of Resources: All organizations, large and small, have limited resources. The
planning process provides the information top management needs to make effective decisions
about how to allocate the resources in a way that will enable the organization to reach its
objectives. Productivity is maximized and resources are not wasted on projects with little chance
of success.

Establishing Goals: Setting goals that challenge everyone in the organization to strive for better
performance is one of the key aspects of the planning process. Goals must be aggressive, but
realistic. Organizations cannot allow themselves to become too satisfied with how they are
currently doing--or they are likely to lose ground to competitors. The goal setting process can be
a wake-up call for managers that have become complacent. The other benefit of goal setting
comes when forecast results are compared to actual results. Organizations analyze significant
variances from forecast and take action to remedy situations where revenues were lower than
plan or expenses higher.

Managing Risk And Uncertainty: Managing risk is essential to an organization’s success. Even the
largest corporations cannot control the economic and competitive environment around them.
Unforeseen events occur that must be dealt with quickly, before negative financial consequences
from these events become severe. Planning encourages the development of “what-if” scenarios,
where managers attempt to envision possible risk factors and develop contingency plans to deal



with them. The pace of change in business is rapid, and organizations must be able to rapidly
adjust their strategies to these changing conditions.

Team Building: Planning promotes team building and a spirit of cooperation. When the plan is
completed and communicated to members of the organization, everyone knows what their
responsibilities are, and how other areas of the organization need their assistance and expertise in
order to complete assigned tasks. They see how their work contributes to the success of the
organization as a whole and can take pride in their contributions. Potential conflict can be reduced
when top management solicits department or division managers’ input during the goal setting
process. Individuals are less likely to resent budgetary targets when they had a say in their
creation.

Creating Competitive Advantages: Planning helps organizations get a realistic view of their current
strengths and weaknesses relative to major competitors. The management team sees areas where
competitors may be vulnerable and then crafts marketing strategies to take advantage of these
weaknesses. Observing competitors’ actions can also help organizations identify opportunities they may
have overlooked, such as emerging international markets or opportunities to market products to
completely different customer groups.

Benefits of Planning in Project Management


Project management refers to planning and overseeing the tasks necessary to achieve a goal.
These goals can include implementing a new software system, merging two departments or
analyzing the purchase of a subsidiary. The project manager gathers employees to create a
project team. Together, the team creates a plan for completing the project.

Direction: One of the challenges faced by project team members is the lack of knowing how to
proceed. During the planning process, the project team determines what tasks need to be
completed. The planning process provides direction for the team and its members.

Accountability: During the planning process, the project manager and the project team assign
the responsibility for completing each task to specific employees. The team benefits because
one employee holds responsibility for each task and can be held accountable. When an
employee realizes he reaps the rewards and the consequences of not completing his task, he
places a higher priority on fulfilling his requirement.

Adequate Resources: Many projects run out of resources before completion. Resources include
both labor and finances. Planning requires the team to consider what resources it needs to finish
the project and eliminate the potential of discontinuing the project for lack of resources.

Problem Anticipation: Many projects experience problems at different times before the project
completes. These include losing employees, missing deadlines or running out of funds. By
planning the project, the team can proactively address problems, reducing their impact on the
project.

Shared Resources: Many employees work on multiple projects simultaneously. These employees
divide their time between the two projects and run the risk of having too much or not enough



work. Planning allows the project leader to work out a schedule which maximizes the employee’s
available time.

Employee Expertise: After employees plan their assignments, they can invest time developing
the skills to complete their assignments. Some employees have the skills needed and increase
those skills during the project. Other employees learn new skills. The company benefits from the
growing knowledge base of its employees.

Reliability: Companies base decisions on the assumption that a specific project will be completed
on time or within its financial budget. Project teams who spend time planning can reliably predict
what it will cost in time or money to complete.

Skill Discovery: When project team employees plan together, they learn which employees have
skills necessary to complete various tasks. These skills may not appear on the employee’s work
history but still contain value for the company. Without planning each task, the company may
never realize these skills.
Project Completion: Some projects get started and never finish. Without planning, project team
members pursue their own ideas and forget about the project. Planning ensures that the team
members know their role and that the project will be completed.

Lessons Learned: While planning, the project team reviews the results of past projects. The team
evaluates its successes and failures from past projects. This allows the team to keep the successful
processes and eliminate the failures.

The Basic Steps in the Management Planning Process


Management planning is the process of looking at a company's goals and creating a plan. The basic
step in the process is creating a road map to meet its goals.

Management planning is the process of assessing an organization's goals and creating a realistic,
detailed plan of action for meeting those goals. The basic steps in the management planning
process involve creating a road map that outlines each task the company must accomplish to
meet its overall objectives. Much like writing a business plan, a management plan takes into
consideration short- and long-term corporate strategies.

Establish Goals: The first step of the management planning process is to identify specific company
goals. This portion of the planning process should include a detailed overview of each goal,
including the reason for its selection and the anticipated outcomes of goal-related projects. Where
possible, objectives should be described in quantitative or qualitative terms. An example of a goal is
to raise profits by 25 percent over a 12-month period.

Identify Resources: Each goal should have financial and human resources projections associated
with its completion. For example, a management plan may identify how many sales people it will
require and how much it will cost to meet the goal of increasing sales by 25 percent.

Establish Goal-Related Tasks: Each goal should have tasks or projects associated with its
achievement. For example, if a goal is to raise profits by 25 percent, a manager will need to



outline the tasks required to meet that objective. Examples of tasks might include increasing the
sales staff or developing advanced sales training techniques.

Prioritize Goals and Tasks: Prioritizing goals and tasks is about ordering objectives in terms of
their importance. The tasks deemed most important will theoretically be approached and
completed first. The prioritizing process may also reflect steps necessary in completing a task or
achieving a goal. For example, if a goal is to increase sales by 25 percent and an associated task is
to increase sales staff, the company will need to complete the steps toward achieving that
objective in chronological order.

Create Assignments and Timelines: As the company prioritizes projects, it must establish
timelines for completing associated tasks and assign individuals to complete them. This portion of
the management planning process should consider the abilities of staff members and the time
necessary to realistically complete assignments. For example, the sales manager in this scenario
may be given monthly earning quotas to stay on track for the goal of increasing sales by 25
percent.
Establish Evaluation Methods: A management planning process should include a strategy for
evaluating the progress toward goal completion throughout an established time period. One way
to do this is through requesting a monthly progress report from department heads.

Identify Alternative Courses of Action: Even the best-laid plans can sometimes be thrown off
track by unanticipated events. A management plan should include a contingency plan if certain
aspects of the master plan prove to be unattainable. Alternative courses of action can be
incorporated into each segment of the planning process, or for the plan in its entirety.

Advantages & Disadvantages of Using a Project Scheduling Tool


Project scheduling tools, also known as project management software, are designed to help you
organize and manage projects more efficiently. There are many different types of project
management tools -- some are basic organizers, while others help to plan and track all aspects of a
project. There are many advantages and disadvantages of project scheduling tools that you should
think about before you decide if one would be a wise investment for your small business needs.

Allows For Interchangeability: One of the major benefits of a project scheduling tool is that once
all the project information, including deadlines and project phases, are inputted into the
software, the program manages the notifications and organizes the tasks for you. This can be a
great advantage if your main project manager leaves the company, as his replacement can be
brought up to speed very quickly. Another benefit is that the project management tool will
remind you about small details and items that can be easily overlooked or forgotten.

Provides Tracking: When you have several employees working on one project using a collaboration
tool, you will actually be able to see who is doing what and when they are doing it. This allows you
to see if someone is constantly missing deadlines, as well as can help you identify your top
performers. While you want to promote a team atmosphere, project management tools can help
you figure out your weak links.



Cost: The cost of project management software can be an advantage or a disadvantage
depending on the type of tool you purchase. Project management software is available in two
main ways: web-based and desktop software. Web-based project scheduling tools, also known as
Software as a Service or SaaS, do not require an upfront investment, such as purchasing a
software license. You can simply pay a monthly subscription fee based on the number of users.
Desktop software is installed on your network server or on a single user's hard drive. It requires a
licensing fee and may cost up to hundreds of thousands of dollars depending on the necessary
features and scope of the software.

Complications: Some of the disadvantages of project scheduling tools are that they tend to make even
simple projects very complicated. The software may recommend 10 steps, but you really only
need three to get the job done properly. They also do not allow much room for flexibility, which is
necessary in the real world. Projects will inevitably have delays that are out of your control, and
you need to be able to make changes and tweaks as necessary.

Project scheduling and Network planning.

Project Scheduling:
Project schedule is prepared listing down step by step in sequential order the jobs involved
in the implementation of the project. The steps should be well-defined along with the
required time to complete each step.

This project schedule becomes a "tool" to ensure timely implementation of the project.
When a final decision has been taken to launch, the Project Manager is to entrust the jobs
involved to personnel within the Project Team with assigned responsibility to ensure that
the steps are completed within the time-frame allotted and within the budgeted cost.

Any deviation should be brought to the notice of the relevant functionary within the project
team and the matter should be discussed with a decision on the corrective course of action.

Any delay in completion of the project means avoidable extra costs to the project and, as
such, the project team regularly meets to monitor the actual progress as against the
budgeted progress as shown in the schematic diagram on “Execution”.

In view of the importance of timely implementation of the project the Project Team
maintains a chart in the office showing the schedule of the project itself—broken down to
smaller steps. The smaller steps can be represented by individual work-packages
highlighting the milestones.

Along with the passage of time, this chart is updated with the actual progress made and,
thus, the status of the project implementation becomes visually apparent.

Gantt Chart:

The project schedule presented by a bar chart, known as Gantt chart (named after Henry
Gantt, an industrial engineer) displays graphically the time relationship of the steps in a
project.



Each step is represented by a horizontal line placed on the chart showing the time—to start,
perform, and then complete. It shows the steps in sequence as well as those which can be
undertaken simultaneously.

The Gantt chart for a project for construction work is illustrated next with the
schedule of work with time plan and the relevant project schedule chart:

Note:

1. A Gantt chart is simple to prepare and easy to understand. It also displays the actual
progress per activity just below the relevant planned progress line, distinguished from the
planning line, perhaps by using a different colour. Here the activity float is easier to
comprehend and, as such, the chart is an excellent management tool. The problem with a Gantt
chart is that it does not indicate the interrelationships between the activities.

2. The descriptions of the steps and the time allowed for completing each step use imaginary
figures. It must be noted that the steps will also need much preparatory work, e.g. the
architect's design and the quantity surveyor's specifications of the building materials,
followed by tenders from different possible contractors, selection of the contractors and
agreements with such contractors, etc.

These necessary details have been avoided in the illustration to have our discussion
simplified.

3. Specimen of Gantt chart is shown in project schedule chart given below:
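As an aside, a Gantt-style bar chart of this kind can also be sketched in a few lines of Python. The following is a minimal sketch only: the activity names, start weeks and durations are purely hypothetical, and the matplotlib library is assumed to be available.

import matplotlib.pyplot as plt

# Hypothetical construction activities with assumed start weeks and durations
activities = ["Foundation", "Walls", "Roofing", "Wiring", "Finishing"]
start      = [0, 4, 10, 10, 14]    # week in which each step starts
duration   = [4, 6,  4,  3,  4]    # weeks needed to complete each step

fig, ax = plt.subplots()
ax.barh(activities, duration, left=start)   # one horizontal bar per activity
ax.set_xlabel("Week")
ax.invert_yaxis()                            # first activity at the top, as on a Gantt chart
plt.show()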



Network Planning
Project Scheduling and Network Planning
The network planning is the categorisation of the activities involved in project
implementation in a sequential order followed by a schematic presentation of the
activities necessary for the entire project.

The steps are to:


A. Identify and list the category of activities involved from the start to the completion of
the project. The activities are grouped in categories which are different from each other.

B. Arrange the list of activities, as in A above, in sequential order of their performance.


There may be activity which can be started only after the completion of some other
activity, whereas there may also be some other independent activity which can be started
simultaneously.

In network planning, such independent and inter-dependent activities are laid down along
with their estimated time schedule, i.e. the duration estimated from the start to the
completion of the activity.

C. With the details of A and B above, draw the diagram of the network of the activities
so that the operational planning of the execution of the entire project can be visualised.

This whole procedure is the network planning of the project schedule, which makes
monitoring and controlling the project easier than looking through the list of activities to
locate lapses, if any.



The network planning as detailed here is a tool available to the project management for a
systematic project scheduling. Under such method the inter-relationship of the various
activities involved in the project schedule will be visible in the total plan and, as such, steps
can be taken to economize the consumption of resources, wherever possible.

Terms frequently used in Network Diagram:


We start with the description of the following most common terms used in a network. For
better understanding, we have defined and described other terms as and when we have
progressed with the network diagrams.

In order to have a better grasp of all these terms, the descriptions and diagrams relevant
to them should, at the initial stage, be read through repeatedly:

a. Event and Activity.
b. Dummy Activity.
c. Slack.
d. Arrow.

Events and Activities (Head Event, Tail Event, Burst Event and Merge Event).
An 'event' is an occurrence, representing the happening of an incident; in network
analysis it represents a static point in time denoting completion of all preceding activities.
The earliest event time (EET) is the longest of the early finish times of all activities
merging into the event.

An 'activity', on the other hand, indicates an operation carrying out defined work, and, as
such, it continues until the work is completed; the time required to complete the work is
called the 'duration' of the activity.

In network analysis the event is said to happen, or, in other words, be realised, when all
activities leading to the event are completed and, as activities are carried out from one
event to another, the preceding event is called the 'tail event' and the succeeding one the
'head event'.

The activity starts from an event, that is, the 'tail event', and on completion of the
defined work the activity lands at another event, the head event. Therefore, no activity can start
unless the tail event is realised, with the exception of the very first activity starting from the
number one event, which, naturally, does not have any tail event.

The event is also called a 'NODE'. In order to avoid confusion we will use only one term, i.e.
'event', and not 'node'.

When more than one activity emanates from one event, such an event is called a 'burst event'.
When a number of activities terminate in one event, such an event is called a 'merge event'.

The burst event and the merge event can be explained by the diagrams shown below:



The figure shown above indicates a dummy activity (4) to (2) which, in reality, is not an
activity in itself. It is shown to establish the logic in the diagram when the activity (2) to
(5) is dependent upon the completion of the activities (1) to (2) and also (3) to (4); the
activity (4) to (6) is, however, independent of activity (1) to (2), but, of course, can only
be started when the activity (3) to (4) is completed and event (4) has been realized.

The event is shown in the network diagram by a circle and the drawing pattern of the
event along with other information is standardized as:

The event is bisected horizontally with the top having the event identifying number. The
bottom part is further bisected vertically with the left side showing the earliest event time
(EET) and the right one showing the latest event time (LET).

In the Western countries, the circle is bisected vertically first, with the left half showing the
event identification number and the right semicircle is further bisected horizontally to
show the EET and the LET. EET is the early starting time of all activities emerging from
the event and the LET is the latest finishing time of activities entering the event.

The activity is shown by an arrow representing the flow of work from left to right, with 'i'
as the start (tail event) and 'j' as the completion (head event) of the activity; the duration
of the activity is expressed as tij.
The diagram below shows the events with activity:

Slack:
Slack is associated with an event and represents the difference between the EET and LET of
that particular event. This is the breathing time of an event: even the earliest start of
any activity emanating from the event can wait to the extent of the 'slack' of that event.

Arrow:
Arrow indicates continuous flow of the projected activity. Every activity is represented by
one arrow with its tail as the start and the 'head' as the completion of the activity. Hence,
for each activity, there is one arrow. Conventionally the arrow is from left to right
showing the direction towards the completion of the activity.

These arrows connect all the activities through the events, thereby terminating at the
completion of the project at the extreme right hand side. The connection of the activities is,
in general, by arrows with continuous lines.

Dummy Activity:
There may be occasions when some activities are connected by a dotted-line arrow, which is
called a dummy activity. Such a dummy activity does not consume any resources (though it
may, on occasion, consume time), and is shown in the network to indicate the logic.

Activities, Arrows and Events:


The activity is shown by an arrow representing the flow of work. The arrow-heads are at
the completion of work landing at the head event. The arrows are always from left to right;
it is not a necessity that an arrow be a horizontal line. It should preferably always be
represented by straight line but can emerge at any angle from the tail event (maintaining
the direction left to right).

The length of the arrow does not have any relation with the duration of time for the relevant
work. The events and activities follow the dependency rule whereby the succeeding activity
being dependent on the preceding activity should emerge from the head event where the
preceding activity has already converged. This is the rule of dependent activities.

We would like to deal further with the rule of dependent activities by network diagrams
drawn against an imaginary sequence of activities, as followed hereafter. But as the activity D
(in the figure) for serial 4 is independent of C, the activity D does not emerge from C's head
event, so as to indicate the logic. The 'dummy activity' is better explained by diagram.

Overlapping Activities:
In network construction we assume that a succeeding activity can start only after the
completion of the preceding activity.

This, in reality, may not be necessary as such in some cases, particularly when a series of
items are to pass through a sequence of activities like process 1, process 2, process 3 (this
is invariably seen in batch production) represented in our discussion as activities P, Q and
R.

In such cases, instead of completing the entire series through activity P and then to take
them to activity Q and so on, the work can be economically carried out when the activity
Q can start sometime after the start of P, by when P has already processed a part of the
series. Similarly, activity R can start sometime after the start of Q.

Thus, the network in such cases can be shown by breaking each of these activities into start
and progress segments: the part completed is passed on to the second process, which can then
start, while the first continues its progress, and so on until we reach the end.

This can be shown in the network diagram as follows:

The diagram shows a number of dummies.

Calling back the overlapping activity P, Q and R, once the activity P has processed part of
the series from event (1) to (3) it goes for the second process represented by activity Q
from event (2) to (3) and then to the third process, activity R, from event (3) to (7). By
that time P continues from event (2) to (4) when the next part of the series is taken over
by Q and processed from event (5) to (6), and so on.

This type of network planning enables control to be instituted over the employment of the
resources. These are also called Ladder Activities.

When the overlapping activity is simple it can be shown as "negative transit time" by
a dummy activity as produced below:



It suggests an overlap: activity Q can start 5 units of time after P starts, i.e. at P's
duration of 10 minus 5.

Network Diagram Analysis/ Network Construction


Introduction to PERT and CPM
The two most common and widely used project management techniques that can be classified
under the title of Network Analysis are Programme Evaluation and review Technique (PERT) and
Critical Path Method (CPM). Both were developed in the 1950's to help managers schedule,
monitor and control large and complex projects. CPM was first used in 1957 to assist in the
development and building of chemical plants within the DuPont Corporation. Independently
developed, PERT was introduced in 1958 following research within the Special Projects Office of
the US Navy. It was initially used to plan and control the Polaris missile programme which
involved the coordination of thousands of contractors. The use of PERT in this case was reported
to have cut eighteen months off the overall time to completion.

The PERT/CPM Procedure


There are six stages common to both PERT and CPM:

1. Define the project and specify all activities or tasks.


2. Develop the relationships amongst activities. Decide upon precedences.
3. Draw network to connect all activities.
4. Assign time and/or costs to each activity.
5. Calculate the longest time path through the network: this is the "critical path".
6. Use network to plan, monitor and control the project.

Finding the critical path (step 5) is a major part of controlling a project. Activities on the critical path
represent tasks which, if performed behind schedule, will delay the whole project. Managers can
derive flexibility by identifying the non-critical activities and replanning, rescheduling and
reallocating resources such as manpower and finances within identified boundaries.

PERT and CPM differ slightly in their terminology and in network construction. However their
objectives are the same and, furthermore, their project analysis techniques are very similar. The
major difference is that PERT employs three time estimates for each activity. Probabilities are
attached to each of these times which, in turn, is used for computing expected values and
potential variations for activity times. CPM, on the other hand, assumes activity times are known
and fixed, so only one time estimate is given and used for each activity. Given the similarities
between PERT and CPM, their methods will be discussed together. The student will then be able
to use either, deciding whether to employ variable (PERT) or fixed (CPM) time estimates within the
network.

PERT and CPM can help to answer the following questions for projects with thousands of activities and
events, both at the beginning of the project and once it is underway:



• When will the project be completed?
• What are the critical activities (i.e. the tasks which, if delayed, will affect the overall
completion time)?
• Which activities are non-critical and can run late without delaying project completion time?
• What is the probability of the project being completed by a specific date?
• At any particular time, is the project on schedule?
• At any particular time, is the money spent equal to, less than or greater than the budgeted
amount?
• Are there enough resources left to complete the project on time?
• If the project is to be completed in a shorter time, what is the least cost means to accomplish this
and what are the cost consequences?

Critical Path Analysis/ Critical Path Construction


The objective of critical path analysis is to determine times for the following:

• ES = Earliest Start Time. This is the earliest time an activity can be started, allowing for the fact
that all preceding activities have been completed.
• LS = Latest Start Time. This is the latest time an activity can be started without delaying the start
of following activities which would put the entire project behind schedule.
• EF = Earliest Finish Time. The earliest time an activity can be finished.
• LF = Latest Finish Time. The latest time that an activity can finish for the project to remain on
schedule.
• S = Activity Slack Time. The amount of slippage in activity start or duration time which can be
tolerated without delaying the project as a whole.
• Forward Pass - The term refers specifically to the essential and critical project management
component in which the project team leader (along with the project team in consultation)
attempts to determine the early start and early finish dates for all of the uncompleted segments of
work for all network activities.
• Backward Pass - The term refers specifically to the essential and critical project management
component in which the project team attempts to determine the latest start and latest finish dates
for all of the uncompleted segments of work for all network activities.
If ES and LS for any activity are known, then one can calculate values for the other three times as follows:

EF = ES + t

LF = LS + t

S = LS - ES or S = LF - EF

Analysis of the project normally involves:

1. Determining the Critical Path. The critical path is the group of activities in the project
that have a slack time of zero. This path of activities is critical because a delay in any
activity along it would delay the project as a whole.
2. Calculating the total project completion time, T. This is done by adding the activity times of
those activities on the critical path.

The steps in critical path analysis are as follows:


a) Determine ES and EF values for all activities in the project: the Forward Pass through
the network.
b) Calculate LS and LF values for all activities by conducting a Backward Pass through the
network.
c) Identify the critical path which will be those activities with zero slack (i.e.: ES=LS and
EF=LF).
d) Calculate total project completion time.
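The forward and backward passes just described can be expressed compactly in code. The following is a minimal Python sketch using a small hypothetical activity-on-node network (the activities, durations and precedence relationships are assumed purely for illustration):

activities = {              # activity: (duration, [immediate predecessors]), predecessors listed first
    "A": (3, []),
    "B": (4, ["A"]),
    "C": (2, ["A"]),
    "D": (5, ["B", "C"]),
}

# Forward pass: earliest start (ES) and earliest finish (EF) for every activity
ES, EF = {}, {}
for a, (dur, preds) in activities.items():
    ES[a] = max((EF[p] for p in preds), default=0)
    EF[a] = ES[a] + dur

T = max(EF.values())        # total project completion time

# Backward pass: latest finish (LF) and latest start (LS) for every activity
LS, LF = {}, {}
for a in reversed(list(activities)):
    dur, _ = activities[a]
    successors = [s for s, (_, p) in activities.items() if a in p]
    LF[a] = min((LS[s] for s in successors), default=T)
    LS[a] = LF[a] - dur

slack = {a: LS[a] - ES[a] for a in activities}
critical_path = [a for a in activities if slack[a] == 0]

print(T, critical_path)     # 12 and ['A', 'B', 'D'] for the hypothetical data above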

PERT and Activity Time Estimation


The major distinguishing difference between PERT and CPM is the use of three time estimates for each
activity in the PERT technique, whereas CPM uses only one time estimate for each activity.

The three time estimates specified for each activity in PERT are:

i) the optimistic time;
ii) the most probable time; and
iii) the pessimistic time.
The optimistic, most likely and pessimistic time estimates are used to calculate an expected
activity completion time which, because of the skewed nature of the beta distribution, is
marginally greater than the most likely time estimate. In addition, the three time estimates can be
used to calculate the variance for each activity. The formulae used are as follows:

t = (o + 4m + p) / 6

v = ((p - o) / 6)^2

Where:



o, m, p - optimistic, most likely, and pessimistic times
t - expected completion time for the task
v - variance of the task completion time
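As a minimal illustration of these two formulae in Python (the three time estimates below are hypothetical):

o, m, p = 4, 6, 10            # optimistic, most likely and pessimistic estimates (weeks)

t = (o + 4 * m + p) / 6       # expected completion time for the task
v = ((p - o) / 6) ** 2        # variance of the task completion time

print(t, v)                   # about 6.33 weeks, with a variance of 1.0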

Knowing the details of a project, its network and values for its activity times (t) and their variances
(v) a complete PERT analysis can be carried out. This includes the determination of the ES, EF, LS,
LF and S for each activity as well as identifying the critical path, the project completion time (T)
and the variance (V) for the entire project.



Normally when using PERT, the expected times (t) are calculated first from the three activity
time estimates, and it is these values of t that are then used exactly as before in CPM. The
variance values are calculated for the various activity times, and the variance of the total
project completion time (the total being the sum of the expected times of the activities on the
critical path) is the sum of the variances of the activities lying on that critical path.

Probability Analysis
Once the expected completion time and variance (T and V) have been determined, the probability that a project will be completed by
a specific date can be assessed. The assumption is usually made that the distribution of completion dates follows that of a normal
distribution curve.

Consider the example where the expected completion time for a project (T) is 20 weeks and the
project variance (V) is 100. What is the probability that the project will be finished on or before
week 25?

Answer: 0.69
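A minimal Python sketch of this calculation, under the normal-distribution assumption stated above (only the standard library is used):

from math import erf, sqrt

T, V, due = 20, 100, 25                     # expected time, variance, due date (weeks)
z = (due - T) / sqrt(V)                     # z = 0.5
probability = 0.5 * (1 + erf(z / sqrt(2)))  # standard normal cumulative probability
print(round(probability, 2))                # 0.69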

Worked Examples on Networks


1. A project has the following activities, precedence relationships, and activity durations:

Activity    Immediate Predecessors    Activity Duration (weeks)
A           -                         3
B           -                         4
C           -                         3
D           C                         12
E           B                         5
F           A                         7
G           E, F                      3

a) Draw a Gantt chart for the project.


b) Construct a CPM network for the project.
c) Identify those activities comprising the critical path.
d) What is the project's estimated duration?
e) Construct a table showing for each activity, its activity duration, earliest start time, latest start
time, earliest finish time, latest finish time, and the activity slack.
Answers:

c) C, D
d) 15 weeks

2. A project designed to refurbish a hospital operating theatre consists of the following activities,
with estimated times and precedence relationships shown. Using this information draw a
network diagram, determine the expected time and variance for each activity, and estimate the
probability of completing the project within sixty days.

Activity    Immediate Predecessors    Optimistic Time    Most Likely Time    Pessimistic Time
A           -                         5                  6                   7
B           -                         10                 13                  28
C           A                         1                  2                   15
D           B                         8                  9                   16
E           B, C                      25                 36                  41
F           D                         6                  9                   18

3. An activity has these time estimates: optimistic time o = 15 weeks, most likely time m = 20 weeks,
and pessimistic time p = 22 weeks.
a) calculate the activity's expected time or duration t.
b) calculate the activity's variance v.
c) calculate the activity's standard deviation.

4. A project has the following activities, precedence relationships, and time estimates in
weeks:

Activity    Immediate Predecessors    Optimistic Time    Most Likely Time    Pessimistic Time
A           -                         15                 20                  25
B           -                         8                  10                  12
C           A                         25                 30                  40
D           B                         15                 15                  15
E           B                         22                 25                  27
F           E                         15                 20                  22
G           D                         20                 20                  22

a) Calculate the expected time or duration and the variance for each activity.
b) Construct the network diagram
c) Tabulate the values of ES,EF,LS,LF and slack for each activity
d) Identify the critical path, and the project duration.
e) What is the probability that the project will take longer than 57 weeks to complete?

5. The project detailed below has the both normal costs and "crash" costs shown. The crash time is
the shortest possible activity time given that extra resources are allocated to that activity.

Activity    Immediate Predecessors    Normal Time    Normal Cost (£)    Crash Time    Crash Cost (£)
A           -                         5              2 000              4             6 000
B           A                         8              3 000              6             6 000
C           B                         2              1 000              2             1 000
D           B                         3              4 000              2             6 000
E           C                         9              5 000              6             8 000
F           C, D                      7              4 500              5             6 000
G           E, F                      4              2 000              2             5 000

Assuming that the cost per day for shortening each activity is the difference between crash costs
and normal costs, divided by the time saved, determine by how much each activity should be
shortened so as to complete the project within twenty-six days and at the minimum extra cost.

CHAPTER 9: LINEAR PROGRAMMING (LP)


Introduction
In a business organization, management has to make decisions on how to allocate its
resources to achieve the organization's goals. A bank would like to allocate its funds to
achieve the highest possible return. It must operate within liquidity limits set by
regulatory agencies and it must maintain sufficient flexibility to meet the loan demands of
its customers.
Each organization wants to achieve some objective (maximize rate of return, maximize
profits, minimize costs) with constrained resources (deposits, available machine time). To
be able to find the best uses of an organization's resources, a mathematical technique
called Linear Programming can be used. The adjective linear is used to describe a
relationship between two or more variables, a relationship which is directly and precisely
proportional. In a linear relationship between work hours and output, for example, a 10
percent change in the number of productive hours used in the operation will cause a 10
percent change in output.

Some characteristic LP applications


1) Scheduling school buses to minimize total distance traveled.
2) Allocating police patrol units to high crime areas in order to minimize response time to 911
calls.
3) Scheduling tellers at banks so that needs are met during each hour of the day while
minimizing the total cost of labour.
4) Selecting the product mix in a factory to make the best use of machine and labor hours available
while maximizing the firm's profit.
5) Picking blends of raw materials in feed mills to produce finished feed combinations at
minimum costs.
6) Determining the distribution system that will minimize total shipping cost.
7) Developing a production schedule that will satisfy future demands for a firm's product and at
the same time minimize total production and inventory costs.
8) Allocating space for a tenant mix in a new shopping mall so as to maximize revenues to the
leasing company.

Requirements of an LP problem
1) LP problems seek to maximize or minimize some quantity (usually profit or cost) expressed
as an objective function.
2) The presence of restrictions, or constraints, limits the degree to which we can pursue our
objective.
3) There must be alternative courses of action to choose from.
4) The objective and constraints in linear programming problems must be expressed in terms of
linear equations or inequalities.

A Linear Programming model seeks to maximize or minimize a linear function, subject to a set of
linear constraints.

The linear model consists of the following components:

- A set of decision variables


- An objective function

- A set of constraints
The importance of Linear Programming
Many real world problems lend themselves to linear programming modeling.

Many real world problems can be approximated by linear models.

There are well-known successful applications in:

– Manufacturing
– Marketing
– Finance (investment)
– Advertising
– Agriculture
There are efficient solution techniques that solve linear programming models.

The output generated from linear programming packages provides useful "what if" analysis
(sensitivity analysis).

Constraints which limit the achievement of objectives


1) policy
2) finance
3) market
4) Availability of resources

Introduction to Linear Programming Problem


Linear Programming (LP) is a mathematical optimization technique. By optimization technique,
it refers to a method which attempts to maximize or minimize some objective, for example,
maximize profits or minimize costs.

Linear programming is a subset of a larger area of mathematical optimization procedures called
mathematical programming, which is concerned with making an optimal set of decisions. In any
LP problem, certain decisions need to be made. These decisions are represented by decision
variables which are used in the formulation of the LP model.

Basic Structure of a Linear Program Problem


The basic structure of an LP problem is either to maximize or minimize an objective function,
while satisfying a set of constraining conditions called constraints.

Objective function. The objective function is a mathematical representation of the overall goal
stated in terms of the decision variables. The firm‘s objective and its limitations must be
expressed as mathematical equations or inequalities, and these must be linear equations and
inequalities.



Constraints. Constraints are also stated in terms of the decision variables, and represent
conditions which must be satisfied in determining the values of the decision variables. Most
constraints in a linear programming problem are expressed as inequalities. They set upper or
lower limits rather than exact equalities, and thus permit many possibilities.

Resources must be in limited supply. For example, a furniture plant has a limited number of
machine-hours available; consequently, the more hours it schedules for one piece of furniture,
the fewer pieces it can make.

There must be alternative courses of action, one of which will achieve the objective.

Linear Programming Assumptions & General Limitations


When using Linear Programming to solve a real business problem, five assumptions (also
referred to as limitations) have to be made.

Linearity. The objective function and constraints are all linear functions; that is, every term must
be of the first degree. Linearity implies the next two assumptions.

Proportionality. For the entire range of the feasible output, the rate of substitution between the
variables is constant.

Additivity. All operations of the problem must be additive with respect to resource usage, returns,
and cost. This implies independence among the variables.

Divisibility. Non-integer solutions are permissible.

Certainty. All coefficients of the LP model are assumed to be known with certainty. Remember,
LP is a deterministic model.

Assumptions:

(i) There are a number of constraints or restrictions, expressible in quantitative terms.

(ii) The prices of input and output both are constant.

(iii) The relationship between objective function and constraints are linear.

(iv) The objective function is to be optimized i.e., profit maximization or cost minimization.

Advantages and limitations:

LP has been considered an important tool due to following reasons:

1. LP encourages logical thinking and provides better insight into business problems.

2. Manager can select the best solution with the help of LP by evaluating the cost and profit
of various alternatives (maximization of profit & minimization of costs).

3. LP provides an information base for optimum allocation of scarce resources.


4. LP assists in making adjustments according to changing conditions.

5. LP helps in solving multi-dimensional problems.

Limitations
1. This technique could not solve the problems in which variables cannot be stated
quantitatively.

2. In some cases, the results of LP give a confusing and misleading picture. For example, the
result of this technique may call for the purchase of 1.6 machines.

It is then very difficult to decide whether to purchase one machine or two, because machines
can only be purchased in whole units.

3. LP technique cannot solve the business problems of non-linear nature.

4. The factor of uncertainty is not considered in this technique.

5. This technique is highly mathematical and complicated.

6. If the number of variables or constraints involved in an LP problem is quite large, then
using costly electronic computers becomes essential, and these can be operated only by
trained personnel.

7. Under this technique it is difficult to state the objective function clearly.

Managerial uses and applications of Linear Programming


LP technique is applied to a wide variety of problems listed below:

(a) Optimizing the product mix when the production line works under certain specification;

(b) Securing least cost combination of inputs;

(c) Selecting the location of Plant;

(d) Deciding the transportation route;

(e) Utilizing the storage and distribution centres;

(f) Proper production scheduling and inventory control;

(g) Solving the blending problems;

(h) Minimizing the raw materials waste;

(i) Assigning job to specialized personnel.

The fundamental characteristic in all such cases is to find the optimum combination of factors
after evaluating the known constraints. LP provides solutions to business managers by
presenting complex problems in a clear and sound way.
The basic problem before any manager is to decide the manner in which limited resources
can be used for profit maximization and cost minimization. This needs the best allocation of
limited resources, and for this purpose linear programming can be used advantageously.

Linear Programming Model


The linear programming model is an algebraic description of the objective to be minimized and
the constraints to be satisfied by the variables. The variables are the flows in each arc designated
by x1 through xn

Linear programming (LP, also called linear optimization) is a method to achieve the best outcome
(such as maximum profit or lowest cost) in a mathematical model whose requirements are
represented by linear relationships. Linear programming is a special case of mathematical
programming (mathematical optimization).

Modeling Process

We begin by modeling this problem. Modeling a problem using linear programming involves
writing it in the language of linear programming. There are rules about what you can and cannot
do within linear programming. These rules are in place to make certain that the remaining steps of
the process (solving and interpreting) can be successful.

Key to a linear program is the decision variables, objective and constraints.

Decision variables: the decision variables represent (unknown) decisions to be made. This is in
contrast to problem data, which are values that are either given or can be simply calculated from
what is given. For this problem, the decision variables are the number of notebooks to produce
and the number of desktops to produce. We will represent these unknown values by X1 and X2
respectively. To make the numbers more manageable, we will let X1 be the number of 1000
notebooks produced (so X1=5 means a decision to produce 5000 notebooks) and X2 be the
number of 1000 desktops. Note that a value like the quarterly profit is not (in this model) a
decision variable: it is an outcome of decisions X1 and X2.

Objective: Every linear program has an objective. This objective is to be either minimized or
maximized. This objective has to be linear in the decision variables, which means it must be a
sum of constants times decision variables. 3X1 - 10X2 is a linear function; X1X2 is not a linear
function. In this case, our objective is to maximize the function 750X1 + 1000X2 (what units is
this in?)

Constraints: Every linear program also has constraints limiting feasible decisions. Here we
have four types of constraints: processing chips, memory chip sets, assembly and non-negativity.

In order to satisfy the limit on the number of chips available, it is necessary that X1+X2 ≤ 10.
If this were not the case (say X1=X2=6), the decision would not be implementable (12,000 chips
would be required, though we only have 10,000). Linear programming cannot handle arbitrary
restrictions: once again, the restrictions have to be linear. This means that a linear function of the



decision variables must be related to a constant, where related can mean less than or equal to,
greater than or equal to. So 3X1-2X2 ≥ 10 is a linear constraint, as is –X1+X3=6.

X1X2 ≤ 10 is not a linear constraint, nor is X1+X2 < 3. Our constraint for processing chips, X1+X2
≤ 10, is a linear constraint.

The constraint for memory chip sets is X1+X2 ≤ 15, a linear constraint.

Finally, we do not want to consider decisions like X1=-5, where production is negative. We add
the linear constraints X1≥0, X2≥0 to enforce non-negativity of production.

Final Model: this gives us the complete model of this problem

Maximize 750X1 + 1000X2

Subject to

X1 + X2 ≤ 15

X1 + 2X2 ≤ 25

X1 ≥ 0

X2 ≥ 0

Formulating a problem as a linear program means going through the above process to clearly
define the decision variables, objective and constraints.
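As a minimal illustration, the final model exactly as written above can be solved with a linear programming routine such as scipy.optimize.linprog (an assumed tool, not part of the formulation itself):

from scipy.optimize import linprog

c = [-750, -1000]                 # negated because linprog minimises by default
A_ub = [[1, 1],                   # X1 +   X2 <= 15
        [1, 2]]                   # X1 + 2*X2 <= 25
b_ub = [15, 25]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)            # [5. 10.] and 13750.0 for the model as written above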

Linear Programming Model Types


These are computational techniques to overcome the limitations of LP.
There are at least three other mathematical programming techniques that may be used when the
assumptions (limitations) above do not apply to the problem. These are:

 Integer programming
 Dynamic programming
 Quadratic programming
An integer programming problem is a mathematical optimization or feasibility problem in which some
or all of the variables are restricted to be integers.

Dynamic programming (also known as dynamic optimization) is a method for solving a complex
problem by breaking it down into a collection of simpler subproblems, solving each of those
subproblems just once, and storing their solutions.

Quadratic programming (QP) involves minimizing or maximizing an objective function subject to bounds,
linear equality, and inequality constraints.



The Red Gadget-Blue Gadget Problem
We shall be using one particular example to illustrate most of the LP approaches and concepts.
We shall call this the red gadget-blue gadget problem.

A company produces gadgets which come in two colors: red and blue. The red gadgets
are made of steel and sell for 30 pesos each. The blue gadgets are made of wood and sell
for 50 pesos each. A unit of the red gadget requires 1 kilogram of steel, and 3 hours of
labor to process. A unit of the blue gadget, on the other hand, requires 2 board meters of
wood and 2 hours of labor to manufacture. There are 180 hours of labor, 120 board
meters of wood, and 50 kilograms of steel available. How many units of the red and blue
gadgets must the company produce (and sell) if it wants to maximize revenue?

The Graphical Approach


Step 1. Define all decision variables.
Let: x1 = number of red gadgets to produce (and sell)
     x2 = number of blue gadgets to produce (and sell)

Step 2. Define the objective function.
Maximize R = 30 x1 + 50 x2 (total revenue in pesos)

Step 3. Define all constraints.

(1) x1 ≤ 50 (steel supply constraint in kilograms)

(2) 2 x2 ≤ 120 (wood supply constraint in board meters)

(3) 3 x1 + 2 x2 ≤ 180 (labor supply constraint in man hours)

x1, x2 ≥ 0 (non-negativity requirement)

Step 4. Graph all constraints.




Step 5. Determine the area of feasible solutions.


Step 6. Determine the optimal solution.



The shot-gun approach
List all corners (identify the corresponding coordinates), and pick the best in terms of the
resulting value of the objective function.
(1) x1 = 0 x2 = 0 R = 30 (0) + 50 (0) = 0
(2) x1 = 50 x2 = 0 R = 30 (50) + 50 (0) = 1500
(3) x1 = 0 x2 = 60 R = 30 (0) + 50 (60) = 3000
(4) x1 = 20 x2 = 60 R = 30 (20) + 50 (60) = 3600 (the optimal solution)
(5) x1 = 50 x2 = 15 R = 30 (50) + 50 (15) = 2250

The contour-line approach


Assume an arbitrary value for the objective function, then graph the resulting contour line.
Let R = 3000:
30 x1 + 50 x2 = 3000
When x1 = 0, x2 = 60; when x2 = 0, x1 = 100.
If you are maximizing, draw contour lines parallel to the first contour line drawn,
such that each is to the right of (and above) the latter, and touches the last point (or points) on
the area of the feasible solutions.
If minimizing, go in the opposite direction.

Step 7. Determine the binding and non-binding constraints.



Observing graphically, we find that the steel supply constraint is non-binding, implying
that the steel supply will not be completely exhausted. On the other hand, both the wood
supply and labor supply constraints are binding. By producing 20 units of the red gadget
and 60 units of the blue gadget, wood and labor will be completely used.
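The same optimum can be checked numerically. Below is a minimal sketch using scipy.optimize.linprog (an assumed library; the notes themselves rely on the graphical method):

from scipy.optimize import linprog

c = [-30, -50]                    # negate the revenue coefficients because linprog minimises
A_ub = [[1, 0],                   # steel:   x1          <= 50
        [0, 2],                   # wood:          2*x2  <= 120
        [3, 2]]                   # labour:  3*x1 + 2*x2 <= 180
b_ub = [50, 120, 180]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)            # [20. 60.] and 3600.0, matching the graphical answer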

Problem Solutions in Linear Programming


Linear programming (LP) is an application of matrix algebra used to solve a broad class of
problems that can be represented by a system of linear equations. A linear equation is an algebraic
equation whose variable quantity or quantities are in the first power only and whose graph is a
straight line. LP problems are characterized by an objective function that is to be maximized or
minimized, subject to a number of constraints. Both the objective function and the constraints
must be formulated in terms of a linear equality or inequality. Typically, the objective function
will be to maximize profits (e.g., contribution margin) or to minimize costs (e.g., variable costs).
The following assumptions must be satisfied to justify the use of linear programming:

• Linearity. All functions, such as costs, prices, and technological requirements, must be
linear in nature.
• Certainty. All parameters are assumed to be known with certainty.
• Nonnegativity. Negative values of decision variables are unacceptable.

Two approaches were commonly used to solve LP problems:

• Graphical method
• Simplex method

Now, however, MS Excel is much easier to use.

Choice between graphical method and Simplex Method


The graphical method is limited to LP problems involving two decision variables and a limited
number of constraints due to the difficulty of graphing and evaluating more than two decision
variables. This restriction severely limits the use of the graphical method for real-world problems.
The graphical method is presented first here, however, because it is simple and easy to understand
and it is a very good learning tool.

The computer-based simplex method is much more powerful than the graphical method and
provides the optimal solution to LP problems containing thousands of decision variables and
constraints. It uses an iterative algorithm to solve for the optimal solution. Moreover, the simplex
method provides information on slack variables (unused resources) and shadow prices
(opportunity costs) that is useful in performing sensitivity analysis. Excel uses a special version
of the simplex method, as will be discussed later.

LINEAR PROGRAMMING: GRAPHICAL SOLUTION


(Minimization Problem)
Example: Mindoro Mines



To illustrate the use of the graphical approach to solving linear programming
minimization problems, we shall use the Mindoro Mines example. If you already have
basic understanding of the graphical method from the example in Session 03, you should
have little difficulty appreciating the following application.

Mindoro Mines operates 2 mines: one in Katibo and the other on Itim Na Uwak Island.
The ore from the mines is crushed at the site and then graded into high-sulfur ore
(ligmite), low-sulfur ore (pyrrite) and mixed ore. The graded ore is then sold to a cement
factory which requires, every year, at least 12,000 tons of ligmite, at least 8,000 tons of
pyrrite, and at least 24,000 tons of the mixed ore.

Each day, at a cost of P22,000 per day, the Katibo mine yields 60 tons of ligmite, 20 tons
of pyrrite, and 30 tons of the mixed ore. In contrast, at the Itim Na Uwak Island mine, at
a cost of P25,000 per day, the mine yields 20 tons of ligmite, 20 tons of pyrrite, and 120
tons of the mixed ore.

The management of Mindoro Mines would like to determine how many days a year it
should operate the two mines to fill the demand from the cement plant at minimum cost.
What are the binding constraints?

The Graphical Approach

Step 1. Define all decision variables.

Let x1 = number of days (in a year) to operate the Katibo mine


x2 = number of days (in a year) to operate the Itim Na Uwak Island mine

Step 2. Define the objective function.

Minimize C = 22000 x1 + 25000 x2 (total cost of operating the mines in pesos)

Step 3. Define all constraints.

(1) 60 x1 + 20 x2 ≥ 12000 (ligmite demand in tons)

(2) 20 x1 + 20 x2 ≥ 8000 (pyrrite demand in tons)
(3) 30 x1 + 120 x2 ≥ 24000 (mixed ore demand in tons)
(4) x1 ≤ 365 (maximum number of days in a year)
(5) x2 ≤ 365 (maximum number of days in a year)
x1, x2 ≥ 0 (non-negativity requirement)
Step 4. Graph all constraints.

Step 5. Determine the area of feasible solutions.



For Steps 4 and 5, please refer to the following graph.

Step 6. Determine the optimal solution.

Using the shot-gun approach, we list down the following corners or extreme points (with
their respective coordinates):

(1) x1 = 365 x2 = 365 C = 22000 (365) + 25000 (365) = 17,155,000

(2) x1 = 365 x2 = 108.75 C = 22000 (365) + 25000 (108.75) = 10,748,750

(3) x1 = 78.33333 x2 = 365 C = 22000 (78.33333) + 25000 (365) = 10,848,333

(4) x1 = 100 x2 = 300 C = 22000 (100) + 25000 (300) = 9,700,000

(5) x1 = 266.66667 x2 = 133.3333 C = 22000 (266.667)+25000 (133.333) = 9,200,000


The optimal solution is therefore corner (5): operate the Katibo mine for about 266.67 days and
the Itim Na Uwak Island mine for about 133.33 days in the year, at a minimum total cost of P9,200,000.
Step 7. Determine the binding and non-binding constraints.

Observing graphically, we find that binding constraints are those associated with pyrrite
and mixed ore production. Total production of these two items will exactly match the
requirements of the cement factory. Meanwhile, ligmite production will be in excess of the
minimum requirement of 12000 tons by about 6,667 tons. Likewise, total available
operating days will not be completely utilized.
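For comparison, a minimal scipy.optimize.linprog sketch of the same minimisation (an assumed tool; the ≥ demand constraints are rewritten as ≤ constraints on their negatives):

from scipy.optimize import linprog

c = [22000, 25000]                       # daily operating cost of each mine
A_ub = [[-60, -20],                      # ligmite:   60*x1 +  20*x2 >= 12000
        [-20, -20],                      # pyrrite:   20*x1 +  20*x2 >=  8000
        [-30, -120]]                     # mixed ore: 30*x1 + 120*x2 >= 24000
b_ub = [-12000, -8000, -24000]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 365), (0, 365)])
print(res.x, res.fun)                    # roughly [266.67 133.33] and a cost of 9,200,000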

YOUR ACTIVITY

1. Luzon Timber Corporation


The Luzon Timber Corporation cuts raw timber, lauan, and tanguile, into standard size planks.
Two steps are required to produce these planks from raw timber: debarking and cutting. Each



hundred meters of lauan takes 1.0 hour to debark and 1.2 hours to cut. Each hundred meters of
tanguile takes 1.5 hours to debark and 0.6 hour to cut. The bark removing machines can operate
up to 600 hours per week but the cutting machines are limited to 480 hours per week.

Luzon Timber can buy a maximum of 36,000 meters of raw lauan and 32,000 meters of raw
tanguile. If the profit per hundred meters of processed logs is P1,800 for lauan and P2,000 for
tanguile, how much lauan and tanguile should be bought and processed by the corporation in
order to maximize total profits?

Suppose next that the finished planks must go through kiln-drying as well and the kiln can only
process a combined total of 45,000 meters of planks. What product combinations are now
feasible? Which combination will maximize combined profits under the new conditions?

2. Small Refinery
A small refinery produces only two products: lubricants and sealants. These are produced by
processing crude oil through 3 processors: a cracker, a splitter, and a separator. These processors
have limited capacities. For the cracker, at most 1000 hours; for the splitter, at most 4200 hours;
and for the separator, at most 2400 hours per week. Similarly, there is a limit on the supply of
crude oil: at most 700 barrels per week.

To produce one barrel of lubricant, we need one hour at the cracker, 6 hours at the splitter, and 4
hours at the separator. To produce a barrel of sealant, we need 2 hours at the cracker, 7 hours at
the splitter, and 3 hours at the separator.

Under these conditions, what product combinations of lubricants and sealants are feasible? If a
barrel of lubricant nets P2000 and a barrel of sealant nets P2500, which product combination will
maximize combined profits?

Next, suppose it were now possible to expand the splitter capacity from 4200 hours up to 4374
hours per week. If this expansion will mean added costs (along with more production), what is
the most money the small refinery should pay to finance the expansion? Assume all other
conditions of the problem remain the same except for the added capacity on the splitter.

THE SIMPLEX METHOD


So far we have solved Linear Programming problems graphically. All of the problems have
involved only 2 variables, and this has allowed us to draw graphs to show the feasible region.
From this we have then been able to determine the extreme points of the region to solve
the problem. However, if there are 3 or more variables then this method is of no use.
Instead we use a numerical method based on Gaussian Elimination. This is called the
SIMPLEX METHOD.

The first thing we need to do is introduce SLACK VARIABLES.

Let us consider the following problem



Maximise z = x + y

Subject to 3x + 4y ≤ 12

3x + 2y ≤ 9

x ≥ 0, y ≥ 0
We know that the optimal solution occurs at one of the vertices of the feasible region
shown below:

Since the x axis is the line where y = 0 and the y axis is the line where x = 0, let us define

AB as u = 0 and BC as v = 0.

The vertices are now the points where exactly two of x, y, u and v are zero.

i.e. At O, x = y = 0

At A, x = u = 0

At B, u = v = 0

At C, v = y = 0



In effect u and v represent the slack between the maximum available for each constraint
and the amount being used. We can therefore replace the inequalities by the equations:

3x + 4y + u = 12

3x + 2y + v = 9

with u ≥ 0 and v ≥ 0

The problem can now be restated as:

Maximise z = x + y

when 3x + 4y + u = 12

3x + 2y + v = 9

with x ≥ 0, y ≥ 0, u ≥ 0 and v ≥ 0

The Standard Form of Linear programming Problems


A problem is in standard form if:

1. The objective function is to be maximised.

2. All of the constraints are of the '≤' form. This also means that each variable (actual and
slack) must be ≥ 0.

We can easily obtain the constraints in this form using the fact that if

a ≥ b then -a ≤ -b

Example

Write the following Linear Programming problem in standard form and introduce slack
variables.
Maximise P = 2x + y

Subject to the constraints x + y ≤ 10

4x + 2y ≤ 15

3x + y ≥ 5

x ≥ 0, y ≥ 0

The 3rd constraint (3x + y ≥ 5) is not in the correct form but can be changed to

-3x - y ≤ -5

The problem now becomes

Maximise P = 2x + y

Subject to the constraints x + y ≤ 10

4x + 2y ≤ 15

-3x - y ≤ -5

x ≥ 0, y ≥ 0

We can now introduce slack variables to obtain:

Maximise P = 2x + y

when x + y + u = 10

4x + 2y + v = 15

-3x - y + w = -5

with x ≥ 0, y ≥ 0, u ≥ 0, v ≥ 0 and w ≥ 0

The Simplex Method

The Simplex Method effectively does a tour of the boundary of the feasible region
stopping at the vertices to examine the value of the objective function. We will see that
there is a recognisable situation when we have reached the optimal solution. This means
that we do not necessarily need to examine each vertex as we did for the graphical method.

To solve, we first need to introduce the slack variables and then set up the Simplex
Tableau.

Setting up the Simplex Tableau

Let us use the example

Maximise z = x + y

when 3x + 4y + u = 12

3x + 2y + v = 9

with x ≥ 0, y ≥ 0, u ≥ 0 and v ≥ 0

Firstly we need to rewrite the objective function as an equation equal to 0 by rearranging it:

z - x - y = 0

NOTE: We could have rearranged this as x + y - z = 0, but for the Simplex Method to
work we need the coefficients of x and y in the objective row to be negative at the start.

We then write the equations in a tableau as shown below:

x     y     u     v     z
3     4     1     0     0     12
3     2     0     1     0      9
-1    -1     0     0     1      0

The first two rows represent the constraint equations with the slack variables and the bottom
row represents the objective function.

Iterations of the Simplex Algorithm


An iteration of the simplex method moves us along a line of the boundary of the feasible
region to another vertex.

We can either increase x or y, which will move us along the x or y axes respectively, and
exam questions will often tell you which one to change. However, if you are not told, then the
variable which is ALWAYS changed first is the one with the LARGEST NEGATIVE ENTRY in the
OBJECTIVE ROW.

The column in which this variable lies is called the PIVOT COLUMN (an exam question might
tell you which is to be the pivot column).

In our example x and y both have values of –1 in the objective row, so we have a free choice.
We will choose x, so the x column is the pivot column.

x    y    u    v    z
3    4    1    0    0    12
3    2    0    1    0     9
-1   -1    0    0    1     0
(pivot column: x)

As we increase x it is clear from the graph that we should stop at vertex C which is on the
line 3x + 2y = 9.

We can find this information from the tableau by dividing the right-hand column entries (12
and 9) by the x column entries. The SMALLEST POSITIVE VALUE will occur in the
row corresponding to the correct line. This row is called the PIVOT ROW.

The x value in the first row is 3 and 12 ÷ 3 = 4; the x value in the second row is also 3
and 9 ÷ 3 = 3. As 3 is smaller than 4, the second row becomes our pivot row.

x    y    u    v    z
3    4    1    0    0    12        12 ÷ 3 = 4
3    2    0    1    0     9         9 ÷ 3 = 3   (pivot row)
-1   -1    0    0    1     0
(pivot column: x)

The value that is in BOTH the pivot column and the pivot row is called the PIVOT
ELEMENT (the 3 in the second row of the x column in this case).


We now need to make the PIVOT ELEMENT HAVE A VALUE OF 1. We use division and
have to perform the same operation to all numbers in that row.

In the example we need to change 3 to 1, so obviously we need to divide by 3, and thus need
to divide by three throughout this row (R2 ÷ 3):

x    y    u    v    z
3    4    1    0    0    12
1   2/3   0   1/3   0     3
-1   -1    0    0    1     0

We now want to make all other elements in the pivot column zero by carrying out ROW
OPERATIONS (similar to solving simultaneous equations).

To change the ‘3’ in row 1 to ‘0’ we need to subtract a multiple of row 2 from it, i.e. 3 times row 2
(R1 – 3R2):

x    y    u    v    z
0    2    1   -1    0     3
1   2/3   0   1/3   0     3
-1   -1    0    0    1     0

We now add row 2 to row 3 (R3 + R2) to clear the x entry there and obtain a new value of ‘z’:

x    y    u    v    z
0    2    1   -1    0     3
1   2/3   0   1/3   0     3
0  -1/3   0   1/3   1     3        (new value of z)

From the tableau we can set y and v to 0 and find that x = 3 and z = 3 (i.e. read along from
where the value is 1 in the x column).

We have therefore arrived at the vertex where x = 3 and y = 0, giving the

objective z = 3.

This is ONE ITERATION of the SIMPLEX ALGORITHM.
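The same row operations can also be expressed as a short program. The sketch below is plain Python with illustrative names (not part of the notes); it performs one pivot on the tableau used above, using exact fractions to avoid rounding:

from fractions import Fraction

def pivot(tableau, pivot_row, pivot_col):
    # One simplex iteration: scale the pivot row so the pivot element becomes 1,
    # then clear the pivot column in every other row.
    T = [[Fraction(v) for v in row] for row in tableau]
    p = T[pivot_row][pivot_col]
    T[pivot_row] = [v / p for v in T[pivot_row]]            # e.g. R2 ÷ 3
    for r in range(len(T)):
        if r != pivot_row and T[r][pivot_col] != 0:
            factor = T[r][pivot_col]
            T[r] = [a - factor * b for a, b in zip(T[r], T[pivot_row])]   # e.g. R1 - 3*R2
    return T

# Columns: x, y, u, v, z, RHS (the tableau set up above)
tableau = [[ 3,  4, 1, 0, 0, 12],
           [ 3,  2, 0, 1, 0,  9],
           [-1, -1, 0, 0, 1,  0]]

# First iteration: pivot column x (index 0), pivot row R2 (index 1), pivot element 3
for row in pivot(tableau, 1, 0):
    print([str(v) for v in row])
# The objective row becomes [0, -1/3, 0, 1/3, 1, 3], i.e. x = 3, y = 0, z = 3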


However, we still have a negative value in the objective row so we must repeat the process
again.

We now have the tableau

x    y    u    v    z
0    2    1   -1    0     3
1   2/3   0   1/3   0     3
0  -1/3   0   1/3   1     3

This time y is the pivot column (the only negative entry in the objective row is –1/3, in the
y column). Dividing the right-hand column by the y column entries:

x    y    u    v    z
0    2    1   -1    0     3        3 ÷ 2 = 1.5   (pivot row)
1   2/3   0   1/3   0     3        3 ÷ 2/3 = 4.5
0  -1/3   0   1/3   1     3
(pivot column: y)

From this the pivot element is 2, so we need to divide row 1 by 2 (R1 ÷ 2), giving

x    y    u    v    z
0    1   1/2  -1/2   0    3/2
1   2/3   0   1/3   0     3
0  -1/3   0   1/3   1     3
We now want to make the y value zero in the other rows.

So we need to do the following: R2 – 2/3 R1 and R3 + 1/3 R1, giving

x    y    u    v    z
0    1   1/2  -1/2   0    3/2
1    0  -1/3   2/3   0     2        (R2 – 2/3 R1)
0    0   1/6   1/6   1    3.5       (R3 + 1/3 R1)

Setting u and v to zero gives us x = 2, y = 3/2 and z = 3.5.

This corresponds to point B on the graph.



From our knowledge of solving using a graphical method we know that B will be the optimal
solution. We can tell when we have reached the optimal solution, as the objective row will
contain no negative values.

OPTIMAL CONDITION If the objective row of the tableau contains no

negative entries then the solution is optimal.
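For checking answers, a linear programming solver can be used. The sketch below uses SciPy's linprog (assumed to be available; it is not part of the notes); since linprog minimises, the objective z = x + y is negated:

from scipy.optimize import linprog

c = [-1, -1]                     # maximise z = x + y  ->  minimise -x - y
A_ub = [[3, 4], [3, 2]]          # 3x + 4y <= 12 and 3x + 2y <= 9
b_ub = [12, 9]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)           # approximately [2.0, 1.5] and z = 3.5

This agrees with the tableau result: the optimum is at vertex B with x = 2, y = 3/2 and z = 3.5.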

Example 2

Use a simplex tableau to solve the Linear Programming problem. Begin by


pivoting on an element chosen from the x column. Write down the values of
x, y and f at the end of each iteration.

Maximise f = 9x + 4y

Subject to 3x + 4y ≤ 48

2x + y ≤ 17

3x + y ≤ 24

x ≥ 0, y ≥ 0

Solution

Introduce the slack variables

3x + 4y + u = 48

2x + y + v = 17

3x + y + w = 24

-9x - 4y + f = 0

The Simplex Tableau is

x    y    u    v    w    f
3    4    1    0    0    0    48
2    1    0    1    0    0    17
3    1    0    0    1    0    24
-9   -4    0    0    0    1     0

x is the pivot column (we were told to use it, and it also contains the largest negative
entry in the objective row).

48 ÷ 3 = 16,  17 ÷ 2 = 8.5,  24 ÷ 3 = 8

8 is the lowest, so R3 is the pivot row.

This means that 3 is the pivot element.

R3 ÷ 3:

x    y    u    v    w    f
R1   3    4    1    0    0    0    48
R2   2    1    0    1    0    0    17
R3   1   1/3   0    0   1/3   0     8
R4  -9   -4    0    0    0    1     0

Pivot about the x values, i.e. make all the x values in R1, R2 and R4 equal to 0.

R1 – 3R3,  R2 – 2R3,  R4 + 9R3

x    y    u    v    w    f
R1   0    3    1    0   -1    0    24
R2   0   1/3   0    1  -2/3   0     1
R3   1   1/3   0    0   1/3   0     8
R4   0   -1    0    0    3    1    72

At this point set y and w to 0, giving

x = 8, y = 0 and f = 72
As we still have a negative value in our objective row, we have to repeat the
process and do a second iteration, this time with y as the pivot column as it
is the largest (and only) negative entry.

24 ÷ 3 = 8,  1 ÷ 1/3 = 3,  8 ÷ 1/3 = 24

3 is the lowest, so R2 is the pivot row.

This means that 1/3 is the pivot element.

R2 ÷ 1/3 (i.e. R2 × 3):

x    y    u    v    w    f
R1   0    3    1    0   -1    0    24
R2   0    1    0    3   -2    0     3
R3   1   1/3   0    0   1/3   0     8
R4   0   -1    0    0    3    1    72

R1 – 3R2,  R3 – 1/3R2,  R4 + R2

x    y    u    v    w    f
R1   0    0    1   -9    5    0    15
R2   0    1    0    3   -2    0     3
R3   1    0    0   -1    1    0     7
R4   0    0    0    3    1    1    75

As all the entries in the objective row are non-negative this is the optimal

solution

If we set v and w to 0 we get

x = 7, y = 3 and f = 75

Example 3

Maximise the objective function f = -x + 8y + z

where x + 2y + 9z ≤ 10

y + 4z ≤ 12

x ≥ 0, y ≥ 0, z ≥ 0

Solution to example 3

The objective function f = -x + 8y + z is rewritten as f + x - 8y – z = 0

Slack variables

x + 2y + 9z + u = 10

y + 4z + v = 12

Simplex tableau

x    y    z    u    v    f
1    2    9    1    0    0    10
0    1    4    0    1    0    12
1   -8   -1    0    0    1     0

-8 is the largest negative value, so we pivot about the y column. 10 ÷ 2 = 5, 12 ÷ 1 = 12.

5 is the lowest entry, so R1 is the pivot row.

This means that 2 is the pivot element.

R1 ÷ 2:

x    y    z    u    v    f
R1  1/2   1   9/2  1/2   0    0     5
R2   0    1    4    0    1    0    12
R3   1   -8   -1    0    0    1     0

R2 – R1,  R3 + 8R1:

x    y    z    u    v    f
R1  1/2   1   9/2  1/2   0    0     5
R2 -1/2   0  -1/2 -1/2   1    0     7
R3   5    0   35    4    0    1    40

Setting x, z and u to 0 gives y = 5 and f = 40.

As all values in the objective row are non-negative, this is the optimal
solution.

The maximum value of f = 40 occurs when x = 0, y = 5 and z = 0.

CHAPTER 10: ESTIMATION AND TEST OF HYPOTHESIS
Introduction to Estimation and Tests of Hypothesis
A statistical hypothesis is a hypothesis that is testable on the basis of observing a process that
is modeled via a set of random variables.[1] A statistical hypothesis test, sometimes called
confirmatory data analysis, is a method of statistical inference. Commonly, two statistical data
sets are compared, or a data set obtained by sampling is compared against a synthetic data set
from an idealized model. A hypothesis is proposed for the statistical relationship between the
two data sets, and this is compared as an alternative to an idealized null hypothesis that
proposes no relationship between two data sets. The comparison is deemed statistically
significant if the relationship between the data sets would be an unlikely realization of the null
hypothesis according to a threshold probability—the significance level. Hypothesis tests are
used in determining what outcomes of a study would lead to a rejection of the null hypothesis
for a pre-specified level of significance. The process of distinguishing between the null
hypothesis and the alternative hypothesis is aided by identifying two conceptual types of errors
(type 1 & type 2), and by specifying parametric limits on e.g. how much type 1 error will be
permitted.

Estimation
Estimation (or estimating) is the process of finding an estimate, or approximation, which
is a value that is usable for some purpose even if input data may be incomplete,
uncertain, or unstable. The value is nonetheless usable because it is derived from the
best information available.

Uses of estimation
Estimation is valuable when it helps you make a significant decision.
 Estimation can save you money – always do a quick estimate of how much you should pay before you commit to a purchase
 Estimation can save you from making mistakes with your calculator
 Estimation can save you time (when the calculation does not have to be exact)

Estimators
In statistics, an estimator is a rule for calculating an estimate of a given quantity based on
observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its
result (the estimate) are distinguished.

Types of Estimator
There are point and interval estimators. The point estimators yield single-valued
results, although this includes the possibility of single vector-valued results and results
that can be expressed as a single function. This is in contrast to an interval estimator,
where the result would be a range of plausible values (or vectors or functions). point
estimation involves the use of sample data to calculate a single value (known as a statistic)
which is to serve as a "best guess" or "best estimate" of an unknown (fixed or random)
population parameter.
Interval estimation is the use of sample data to calculate an interval of possible (or probable)
values of an unknown population parameter, in contrast to point estimation, which is a single
number.

In making an estimate, the goal is often to generate a range of possible
outcomes that is precise enough to be useful, but not so precise that it is likely to be
inaccurate. For example, in trying to guess the number of candies in the jar, if fifty were
visible, and the total volume of the jar seemed to be about twenty times as large as the
volume containing the visible candies, then one might simply project that there were a
thousand candies in the jar. Such a projection, intended to pick the single value that is
believed to be closest to the actual value, is called a point estimate. However, a point
estimation is likely to be incorrect, because the sample size - in this case, the number of
candies that are visible - is too small a number to be sure that it does not contain
anomalies that differ from the population as a whole. A corresponding concept is an
interval estimate, which captures a much larger range of possibilities, but is too broad to



be useful. For example, if one were asked to estimate the percentage of people who like
candy, it would clearly be correct that the number falls between zero and one hundred
percent. Such an estimate would provide no guidance, however, to somebody who is
trying to determine how many candies to buy for a party to be attended by a hundred
people.

Sampling and Distribution


Introduction to Sampling
Estimation is often done by sampling, which is counting a small number of examples of
something and projecting that number onto a larger population. An example of
estimation would be determining how many candies of a given size are in a glass jar.
Because the distribution of candies inside the jar may vary, the observer can count the
number of candies visible through the glass, consider the size of the jar, and presume
that a similar distribution can be found in the parts that cannot be seen, thereby making
an estimate of the total number of candies that could be in the jar if that presumption
were true. Estimates can similarly be generated by projecting results from polls or
surveys onto the entire population.

Sampling is concerned with the selection of a subset of individuals from within a


statistical population to estimate characteristics of the whole population. Two advantages
of sampling are that the cost is lower and data collection is faster than measuring the entire
population.

Each observation measures one or more properties (such as weight, location, color) of
observable bodies distinguished as independent objects or individuals. In survey sampling,
weights can be applied to the data to adjust for the sample design, particularly stratified
sampling. Results from probability theory and statistical theory are employed to guide the
practice. In business and medical research, sampling is widely used for gathering
information about a population. Acceptance sampling is used to determine if a production
lot of material meets the governing specifications.

The sampling process comprises several stages:

• Defining the population of concern


• Specifying a sampling frame, a set of items or events possible to measure
• Specifying a sampling method for selecting items or events from the frame
• Determining the sample size
• Implementing the sampling plan
• Sampling and data collecting



Sampling distribution
Suppose that we draw all possible samples of size n from a given population. Suppose further
that we compute a statistic (e.g., a mean, proportion, standard deviation) for each sample. The
probability distribution of this statistic is called a sampling distribution. And the standard
deviation of this statistic is called the standard error.

Variability of a Sampling Distribution

The variability of a sampling distribution is measured by its variance or its standard


deviation. The variability of a sampling distribution depends on three factors:

• N: The number of observations in the population.

• n: The number of observations in the sample.

• The way that the random sample is chosen.

If the population size is much larger than the sample size, then the sampling distribution
has roughly the same standard error, whether we sample with or without replacement. On
the other hand, if the sample represents a significant fraction (say, 1/20) of the population
size, the standard error will be meaningfully smaller, when we sample without
replacement.

Sampling Distribution of the Mean


Suppose we draw all possible samples of size n from a population of size N. Suppose
further that we compute a mean score for each sample. In this way, we create a sampling
distribution of the mean.

We know the following about the sampling distribution of the mean. The mean of the
sampling distribution (μx) is equal to the mean of the population (μ). And the standard
error of the sampling distribution (σx) is determined by the standard deviation of the
population (σ), the population size (N), and the sample size (n). These relationships are
shown in the equations below:

μx = μ and σx = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ]


In the standard error formula, the factor sqrt[ (N - n ) / (N - 1) ] is called the finite
population correction or fpc. When the population size is very large relative to the sample
size, the fpc is approximately equal to one; and the standard error formula can be
approximated by:

σx = σ / sqrt(n).

You often see this "approximate" formula in introductory statistics texts. As a general rule,
it is safe to use the approximate formula when the sample size is no bigger than 1/20 of the
population size.
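As a quick illustration, the sketch below (plain Python; the numbers are made up for the example and are not from the notes) computes the standard error of the mean with and without the finite population correction:

import math

def standard_error_mean(sigma, n, N=None):
    # Standard error of the sample mean; applies the finite population
    # correction sqrt((N - n) / (N - 1)) when the population size N is given.
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Example: sigma = 12, a sample of n = 36 drawn from a population of N = 400
print(standard_error_mean(12, 36))        # approximate formula: 2.0
print(standard_error_mean(12, 36, 400))   # with the fpc: about 1.91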



Sampling Distribution of the Proportion
In a population of size N, suppose that the probability of the occurrence of an event (dubbed
a "success") is P; and the probability of the event's non-occurrence (dubbed a "failure") is Q.
From this population, suppose that we draw all possible samples of size n. And finally,
within each sample, suppose that we determine the proportion of successes p and failures q.
In this way, we create a sampling distribution of the proportion.

We find that the mean of the sampling distribution of the proportion (μp) is equal to the
probability of success in the population (P). And the standard error of the sampling
distribution (σp) is determined by the standard deviation of the population (σ), the
population size, and the sample size. These relationships are shown in the equations
below:

μp = P

σp = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ]

σp = sqrt[ PQ/n ] * sqrt[ (N - n ) / (N - 1) ]

where σ = sqrt[ PQ ].

Like the formula for the standard error of the mean, the formula for the standard error of the
proportion uses the finite population correction, sqrt[ (N - n ) / (N - 1) ]. When the
population size is very large relative to the sample size, the fpc is approximately equal to
one; and the standard error formula can be approximated by:

σp = sqrt[ PQ/n ]

You often see this "approximate" formula in introductory statistics texts. As a general rule,
it is safe to use the approximate formula when the sample size is no bigger than 1/20 of the
population size.
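The same idea applies to the proportion; a short sketch with made-up numbers:

import math

def standard_error_proportion(P, n, N=None):
    # sqrt(PQ/n), with the optional finite population correction
    se = math.sqrt(P * (1 - P) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# Example: P = 0.4, n = 50 drawn from N = 1000
print(standard_error_proportion(0.4, 50))        # about 0.069
print(standard_error_proportion(0.4, 50, 1000))  # about 0.068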

Central Limit Theorem


The central limit theorem states that the sampling distribution of the mean of any
independent, random variable will be normal or nearly normal, if the sample size is large
enough.

How large is "large enough"? The answer depends on two factors.

• Requirements for accuracy. The more closely the sampling distribution needs to
resemble a normal distribution, the more sample points will be required.
• The shape of the underlying population. The more closely the original population
resembles a normal distribution, the fewer sample points will be required.

In practice, some statisticians say that a sample size of 30 is large enough when the
population distribution is roughly bell-shaped. Others recommend a sample size of at
least 40. But if the original population is distinctly not normal (e.g., is badly skewed, has
multiple peaks, and/or has outliers), researchers like the sample size to be even larger.
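A small simulation can make the theorem concrete. The sketch below (the population and sample size are chosen purely for illustration) draws repeated samples of size 30 from a skewed exponential population and shows that the sample means cluster around the population mean with a spread close to σ/√n:

import random
import statistics

n, trials = 30, 5000
sample_means = [statistics.mean(random.expovariate(1.0) for _ in range(n))
                for _ in range(trials)]

# Exponential(1) has mean 1 and standard deviation 1, so sigma / sqrt(n) is about 0.183
print(round(statistics.mean(sample_means), 3))   # close to 1.0
print(round(statistics.stdev(sample_means), 3))  # close to 0.183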

T-Distribution vs. Normal Distribution


The t distribution and the normal distribution can both be used with statistics that have a
bellshaped distribution. This suggests that we might use either the t-distribution or the
normal distribution to analyze sampling distributions. Which should we choose?

Guidelines exist to help you make that choice. Some focus on the population standard
deviation.

• If the population standard deviation is known, use the normal distribution.

• If the population standard deviation is unknown, use the t-distribution.

Other guidelines focus on sample size.

• If the sample size is large, use the normal distribution. (See the discussion above
in the section on the Central Limit Theorem to understand what is meant by a
"large" sample.)
• If the sample size is small, use the t-distribution.

In practice, researchers employ a mix of the above guidelines. On this site, we use the
normal distribution when the population standard deviation is known and the sample size
is large. We might use either distribution when standard deviation is unknown and the
sample size is very large. We use the t-distribution when the sample size is small, unless
the underlying distribution is not normal. The t distribution should not be used with small
samples from populations that are not approximately normal.

Survey Sampling Methods


Sampling method refers to the way that observations are selected from a population to be
in the sample for a sample survey.
Population Parameter vs. Sample Statistic
The reason for conducting a sample survey is to estimate the value of some attribute of a
population.

• Population parameter. A population parameter is the true value of a population attribute.

• Sample statistic. A sample statistic is an estimate, based on sample data, of a population


parameter.
Consider this example. A public opinion pollster wants to know the percentage of voters that
favor a flat-rate income tax. The actual percentage of all the voters is a population
parameter. The estimate of that percentage, based on sample data, is a sample statistic.

The quality of a sample statistic (i.e., accuracy, precision, representativeness) is strongly


affected by the way that sample observations are chosen; that is, by the sampling method.



Probability vs. Non-Probability Samples
As a group, sampling methods fall into one of two categories.

• Probability samples. With probability sampling methods, each population element has a
known (non-zero) chance of being chosen for the sample.

• Non-probability samples. With non-probability sampling methods, we do not know the


probability that each population element will be chosen, and/or we cannot be sure that each
population element has a non-zero chance of being chosen.

Non-probability sampling methods offer two potential advantages - convenience and cost.
The main disadvantage is that non-probability sampling methods do not allow you to
estimate the extent to which sample statistics are likely to differ from population
parameters. Only probability sampling methods permit that kind of analysis.

Non-Probability Sampling Methods


Two of the main types of non-probability sampling methods are voluntary samples and
convenience samples.

• Voluntary sample. A voluntary sample is made up of people who self-select into the survey.
Often, these folks have a strong interest in the main topic of the survey.

Suppose, for example, that a news show asks viewers to participate in an on-line poll. This
would be a volunteer sample. The sample is chosen by the viewers, not by the survey
administrator.

• Convenience sample. A convenience sample is made up of people who are easy to reach.

Consider the following example. A pollster interviews shoppers at a local mall. If the mall was
chosen because it was a convenient site from which to solicit survey participants and/or
because it was close to the pollster's home or business, this would be a convenience sample.

Probability Sampling Methods


The main types of probability sampling methods are simple random sampling, stratified
sampling, cluster sampling, multistage sampling, and systematic random sampling. The key
benefit of probability sampling methods is that the sample is chosen by a known chance
mechanism, so sampling error can be quantified and the statistical conclusions can be
generalised to the population.

• Simple random sampling. Simple random sampling refers to any sampling method that
has the following properties.

o The population consists of N objects.
o The sample consists of n objects.
o If all possible samples of n objects are equally likely to occur,
  the sampling method is called simple random sampling.



There are many ways to obtain a simple random sample. One way would be the lottery
method. Each of the N population members is assigned a unique number. The numbers are
placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n numbers.
Population members having the selected numbers are included in the sample.

• Stratified sampling. With stratified sampling, the population is divided into groups, based
on some characteristic. Then, within each group, a probability sample (often a simple
random sample) is selected. In stratified sampling, the groups are called strata.

As an example, suppose we conduct a national survey. We might divide the population into
groups or strata, based on geography - north, east, south, and west. Then, within each
stratum, we might randomly select survey respondents.

• Cluster sampling. With cluster sampling, every member of the population is assigned to
one, and only one, group. Each group is called a cluster. A sample of clusters is chosen,
using a probability method (often simple random sampling). Only individuals within
sampled clusters are surveyed.

Note the difference between cluster sampling and stratified sampling. With stratified
sampling, the sample includes elements from each stratum. With cluster sampling, in
contrast, the sample includes elements only from sampled clusters.

• Multistage sampling. With multistage sampling, we select a sample by using


combinations of different sampling methods.

For example, in Stage 1, we might use cluster sampling to choose clusters from a population.
Then, in Stage 2, we might use simple random sampling to select a subset of elements from
each chosen cluster for the final sample.

• Systematic random sampling. With systematic random sampling, we create a list of every
member of the population. From the list, we randomly select the first sample element
from the first k elements on the population list. Thereafter, we select every kth element
on the list.

This method is different from simple random sampling since every possible sample of n
elements is not equally likely.

Interpreting odds ratios, confidence intervals and p-values


Odds ratio (OR)
An odds ratio is a relative measure of effect, which allows the comparison of the
intervention group of a study relative to the comparison or placebo group.

So when researchers calculate an odds ratio they do it like this:

Odds Ratio (OR) = (odds of the outcome in the intervention arm) ÷ (odds of the outcome in
the control or placebo arm)

So if the outcome is the same in both groups the ratio will be 1, which implies there is no
difference between the two arms of the study.

However, when the outcome being counted is an undesirable event (such as death or a complication):

If the OR is > 1, the control is better than the intervention.

If the OR is < 1, the intervention is better than the control.

Confidence interval (CI)


A confidence interval (CI) is a type of interval estimate (of a population parameter) that
is computed from the observed data. The confidence level is the frequency (i.e., the
proportion) of possible confidence intervals that contain the true value of their
corresponding parameter. In other words, if confidence intervals are constructed using a
given confidence level in an infinite number of independent experiments, the proportion of
those intervals that contain the true value of the parameter will match the confidence level.

The confidence interval indicates the level of uncertainty around the measure of effect
(precision of the effect estimate) which in this case is expressed as an OR. Confidence
intervals are used because a study recruits only a small sample of the overall population so
by having an upper and lower confidence limit we can infer that the true population effect
lies between these two points. Most studies report the 95% confidence interval (95%CI).

If the confidence interval crosses 1 e.g. 95%CI 0.9-1.1 this implies there is no difference
between arms of the study.
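To see where an odds ratio and its confidence interval come from, the sketch below computes both from a hypothetical 2x2 table (the counts are invented for illustration and are not the trial data discussed below); the interval uses the usual standard error of the log odds ratio:

import math

events_treat, no_events_treat = 40, 160     # intervention arm
events_ctrl,  no_events_ctrl  = 60, 140     # control arm

or_ = (events_treat / no_events_treat) / (events_ctrl / no_events_ctrl)

# Standard error of log(OR) via the Woolf formula
se_log_or = math.sqrt(1/events_treat + 1/no_events_treat +
                      1/events_ctrl + 1/no_events_ctrl)
low  = math.exp(math.log(or_) - 1.96 * se_log_or)
high = math.exp(math.log(or_) + 1.96 * se_log_or)

print(round(or_, 2), round(low, 2), round(high, 2))   # 0.58 (0.37 to 0.92)

Here the OR is below 1 and the 95% CI does not cross 1, so in this invented example the intervention arm had significantly lower odds of the outcome.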

P values
P < 0.05 indicates a statistically significant difference between groups. P>0.05 indicates
there is not a statistically significant difference between groups.

Interpretation (Real world example)


A drug company-funded double blind randomized controlled trial evaluated the efficacy of
an adenosine diphosphate (ADP) receptor antagonist, Cangrelor, vs Clopidogrel in patients undergoing urgent or
elective Percutaneous Coronary Intervention (PCI) who were followed up for specific
complications for 48 hrs as outlined in the diagram below (Bhatt et al. 2009).

The results section reported "The rate of the primary efficacy end point was…….
(adjusted odds ratio with Cangrelor, 0.78; 95% confidence interval [CI], 0.66 to 0.93;
P=0.005)". What does this mean?

A. The odds of death, myocardial infarction, ischemia-driven revascularization, or stent


thrombosis at 48 hours after randomization in the Cangrelor arm were 22% less than
in the Clopidogrel arm with the true population effect between 34% and 7%. This
result was not statistically significant.
B. The odds of death, myocardial infarction, ischemia-driven revascularization, or stent
thrombosis at 48 hours after randomization in the Cangrelor arm were 34% less than
in the Clopidogrel arm with the true population effect between 7% and 22%. This
result was statistically significant.
C. The odds of death, myocardial infarction, ischemia-driven revascularization, or stent
thrombosis at 48 hours after randomization in the Cangrelor arm were 22% less than
in the Clopidogrel arm with the true population effect between 34% and 7%. This
result was statistically significant.

Summary
This is a very basic introduction to interpreting odds ratios, confidence intervals and p
values only and should help healthcare students begin to make sense of published
research, which can initially be a daunting prospect. However it should be stressed that
any results are only valid if the study was well designed and conducted, which highlights
the importance of critical appraisal as a key feature of evidence based medicine.


Hypothesis
What is Hypothesis Testing?
A statistical hypothesis is an assumption about a population parameter. This assumption
may or may not be true. Hypothesis testing refers to the formal procedures used by
statisticians to accept or reject statistical hypotheses.

Statistical Hypotheses
The best way to determine whether a statistical hypothesis is true would be to examine the
entire population. Since that is often impractical, researchers typically examine a random
sample from the population. If sample data are not consistent with the statistical
hypothesis, the hypothesis is rejected.

There are two types of statistical hypotheses.

• Null hypothesis. The null hypothesis, denoted by H0, is usually the hypothesis that sample
observations result purely from chance.

• Alternative hypothesis. The alternative hypothesis, denoted by H1 or Ha, is the hypothesis


that sample observations are influenced by some non-random cause.

For example, suppose we wanted to determine whether a coin was fair and balanced. A
null hypothesis might be that half the flips would result in Heads and half, in Tails. The
alternative hypothesis might be that the number of Heads and Tails would be very
different. Symbolically, these hypotheses would be expressed as

H0: P = 0.5
Ha: P ≠ 0.5

Suppose we flipped the coin 50 times, resulting in 40 Heads and 10 Tails. Given this
result, we would be inclined to reject the null hypothesis. We would conclude, based on
the evidence, that the coin was probably not fair and balanced.

Can We Accept the Null Hypothesis?


Some researchers say that a hypothesis test can have one of two outcomes: you accept the null
hypothesis or you reject the null hypothesis. Many statisticians, however, take issue with the
notion of "accepting the null hypothesis." Instead, they say: you reject the null hypothesis or you
fail to reject the null hypothesis.
Why the distinction between "acceptance" and "failure to reject?" Acceptance implies that the
null hypothesis is true. Failure to reject implies that the data are not sufficiently persuasive for us
to prefer the alternative hypothesis over the null hypothesis.

Hypothesis Tests
Statisticians follow a formal process to determine whether to reject a null hypothesis,
based on sample data. This process, called hypothesis testing, consists of four steps.

• State the hypotheses. This involves stating the null and alternative hypotheses. The
hypotheses are stated in such a way that they are mutually exclusive. That is, if one is true,
the other must be false.
• Formulate an analysis plan. The analysis plan describes how to use sample data to
evaluate the null hypothesis. The evaluation often focuses around a single test statistic.

• Analyze sample data. Find the value of the test statistic (mean score, proportion, t statistic,
z-score, etc.) described in the analysis plan.

• Interpret results. Apply the decision rule described in the analysis plan. If the value of the
test statistic is unlikely, based on the null hypothesis, reject the null hypothesis.

Decision Errors
Two types of errors can result from a hypothesis test.

• Type I error. A Type I error occurs when the researcher rejects a null hypothesis when it is
true. The probability of committing a Type I error is called the significance level. This
probability is also called alpha, and is often denoted by α.

• Type II error. A Type II error occurs when the researcher fails to reject a null hypothesis
that is false. The probability of committing a Type II error is called Beta, and is often
denoted by β. The probability of not committing a Type II error is called the Power of the
test.

Decision Rules
The analysis plan includes decision rules for rejecting the null hypothesis. In practice,
statisticians describe these decision rules in two ways - with reference to a P-value or with
reference to a region of acceptance.

• P-value. The strength of evidence in support of a null hypothesis is measured by the
P-value. Suppose the test statistic is equal to S. The P-value is the probability of
observing a test statistic as extreme as S, assuming the null hypothesis is true. If the P-
value is less than the significance level, we reject the null hypothesis.
• Region of acceptance. The region of acceptance is a range of values. If the test statistic
falls within the region of acceptance, the null hypothesis is not rejected. The region of
acceptance is defined so that the chance of making a Type I error is equal to the
significance level.

The set of values outside the region of acceptance is called the region of rejection. If
the test statistic falls within the region of rejection, the null hypothesis is rejected. In
such cases, we say that the hypothesis has been rejected at the α level of significance.

These approaches are equivalent. Some statistics texts use the P-value approach; others
use the region of acceptance approach. In subsequent lessons, this tutorial will present
examples that illustrate each approach.

One-Tailed and Two-Tailed Tests


A test of a statistical hypothesis, where the region of rejection is on only one side of the
sampling distribution, is called a one-tailed test. For example, suppose the null hypothesis
states that the mean is less than or equal to 10. The alternative hypothesis would be that the
mean is greater than 10. The region of rejection would consist of a range of numbers located
on the right side of sampling distribution; that is, a set of numbers greater than 10.

A test of a statistical hypothesis, where the region of rejection is on both sides of the
sampling distribution, is called a two-tailed test. For example, suppose the null
hypothesis states that the mean is equal to 10. The alternative hypothesis would be that
the mean is less than 10 or greater than 10. The region of rejection would consist of a
range of numbers located on both sides of sampling distribution; that is, the region of
rejection would consist partly of numbers that were less than 10 and partly of numbers
that were greater than 10.

Power of a Hypothesis Test


The probability of not committing a Type II error is called the power of a hypothesis test.
Effect Size

To compute the power of the test, one offers an alternative view about the "true" value of the
population parameter, assuming that the null hypothesis is false. The effect size is the
difference between the true value and the value specified in the null hypothesis.

Effect size = True value - Hypothesized value

For example, suppose the null hypothesis states that a population mean is equal to 100. A
researcher might ask: What is the probability of rejecting the null hypothesis if the true
population mean is equal to 90? In this example, the effect size would be 90 - 100, which
equals -10.

Factors That Affect Power


The power of a hypothesis test is affected by three factors.

• Sample size (n). Other things being equal, the greater the sample size, the greater the
power of the test.

• Significance level (α). The higher the significance level, the higher the power of the test. If
you increase the significance level, you reduce the region of acceptance. As a result, you
are more likely to reject the null hypothesis. This means you are less likely to accept the
null hypothesis when it is false; i.e., less likely to make a Type II error. Hence, the power of
the test is increased.

• The "true" value of the parameter being tested. The greater the difference between the
"true" value of a parameter and the value specified in the null hypothesis, the greater the
power of the test. That is, the greater the effect size, the greater the power of the test.

How to Conduct Hypothesis Tests


All hypothesis tests are conducted the same way. The researcher states a hypothesis to be
tested, formulates an analysis plan, analyzes sample data according to the plan, and accepts
or rejects the null hypothesis, based on results of the analysis.
• State the hypotheses. Every hypothesis test requires the analyst to state a null hypothesis
and an alternative hypothesis. The hypotheses are stated in such a way that they are mutually
exclusive. That is, if one is true, the other must be false; and vice versa.

• Formulate an analysis plan. The analysis plan describes how to use sample data to accept or
reject the null hypothesis. It should specify the following elements.

o Significance level. Often, researchers choose significance levels equal to 0.01, 0.05, or
0.10; but any value between 0 and 1 can be used.

o Test method. Typically, the test method involves a test statistic and a sampling
distribution. Computed from sample data, the test statistic might be a mean score,
proportion, difference between means, difference between proportions, z-score, t
statistic, chi-square, etc. Given a test statistic and its sampling distribution, a researcher
can assess probabilities associated with the test statistic. If the test statistic probability
is less than the significance level, the null hypothesis is rejected.
• Analyze sample data. Using sample data, perform computations called for in the analysis
plan.

o Test statistic. When the null hypothesis involves a mean or proportion, use either of the
following equations to compute the test statistic.

Test statistic = (Statistic - Parameter) / (Standard deviation of statistic)


Test statistic = (Statistic - Parameter) / (Standard error of statistic)

where Parameter is the value appearing in the null hypothesis, and Statistic is the point
estimate of Parameter. As part of the analysis, you may need to compute the standard
deviation or standard error of the statistic. Previously, we presented common formulas for
the standard deviation and standard error.

When the parameter in the null hypothesis involves categorical data, you may use a
chi-square statistic as the test statistic. Instructions for computing a chi-square test
statistic are presented in the lesson on the chi-square goodness of fit test.

o P-value. The P-value is the probability of observing a sample statistic as extreme as the
test statistic, assuming the null hypothesis is true.

• Interpret the results. If the sample findings are unlikely, given the null hypothesis, the
researcher rejects the null hypothesis. Typically, this involves comparing the P-value to the
significance level, and rejecting the null hypothesis when the P-value is less than the
significance level.

Applications of the General Hypothesis Testing Procedure


The next few lessons show how to apply the general hypothesis testing procedure to
different kinds of statistical problems.



• Proportions
• Difference between proportions
• Regression slope
• Means
• Difference between means
• Difference between matched pairs
• Goodness of fit
• Homogeneity
• Independence

Test statistics and the test interpretation


Example #1
Study Question: Do extraverts have more sexual partners than the average person?
H0: Extraverts do not differ from the population in terms of number of sexual partners
H1: Extraverts do differ from the population in terms of number of sexual partners
• We want to draw conclusions about the population (about how most
people behave)
• Obviously, not everyone in the world will participate in the study
• How do we test our hypotheses?
  o Collect data from a sample of extraverts
  o Compare them to the population average
  o If the sample of extraverts differs only marginally from the population, accept H0
    (any small differences are due to sampling error – "chance")
  o If the sample of extraverts differs a lot from the population, the results probably are
    not just due to chance, so accept H1
Example #2
Study Question: Does SSRI medication impact symptoms of suicide?
H0: SSRI medication does not impact symptoms of suicide.
H1: SSRI medication does impact symptoms of suicide.



(Diagram: SSRI users shown as a sample drawn from the wider population of depressed people.)

 Draw conclusions about how populations differ, based on statistics from sample.
 If sample differs a little on suicidality, probably accept H0
 If sample differs a lot on suicidality, probably accept H1

To determine whether the sample differs "a little" or "a lot" we use statistics


Steps of Hypothesis Testing
1. State hypotheses about population H0 and H1
2. Set criteria (rules) for rejecting the null hypothesis (H0)
3. Calculate a statistic
4. Make a decision and report results

Step #1: State Hypotheses


• State H0: No difference, no effect at the population level
• State H1: Some difference, some effect at the population level

Step #2: Set Criteria for Rejecting H0


• Small or no mean difference  Accept H0
• Large mean difference  Accept H1
• Use Z statistic to determine if the sample differs from the larger population more
than what would be expected by chance
• Just by random luck (sampling error), we should only get a Z score more extreme
than ±1.96 on 5% of occasions



• If Z is more extreme than ±1.96, results are probably not just due to sampling error.
We have probably found a real effect.

(Diagram: standard normal curve – the typical outcome when H0 is true lies in the centre;
values of Z far out in either tail have a low probability of occurring if H0 is true, so there we
accept H1.)

Still confused?

Assume there is no effect (treatment has no impact at all). Samples are never
perfect (sampling error), so we always expect some small differences, even if
there is no effect. In fact, if there is no effect, we would only get a Z value more
extreme than ±1.96 about 5% of the time. Since it's so rare to get an extreme Z
by chance, we conclude that the differences are real – treatment had some
effect.

Rules
Accept H0 that there is no effect:
-1.96 < Z < +1.96
Accept H1 that there is some effect:
Z ≥ +1.96 (big positive) or Z ≤ -1.96 (big negative)
Step #3: Calculate the Relevant Statistic

Review: Z = (M – μ) / SE, where SE = σ / √n

M = sample mean of interest
μ = general population mean
σ = general population standard deviation
Most depressed people have a mild elevation on the Suicidality Questionnaire (μ = 5.0, σ =
1.3). You administer the survey to a random group of depressed people (n = 25) who are
taking SSRIs and find that the average score is 4.6.



To make any statistical decision, we must calculate the Z score for the sample:

Z = (M – μ) / SE, where SE = σ / √n

SE = 1.3 / √25 = 0.26

Z = (4.6 – 5.0) / 0.26 = -0.4 / 0.26 = -1.54
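The same calculation as a minimal Python check (the numbers come from the example above):

import math

mu, sigma = 5.0, 1.3        # population mean and SD on the questionnaire
n, sample_mean = 25, 4.6    # depressed SSRI users in the sample

se = sigma / math.sqrt(n)           # 1.3 / 5 = 0.26
z = (sample_mean - mu) / se         # (4.6 - 5.0) / 0.26, about -1.54

print(round(z, 2), abs(z) >= 1.96)  # -1.54 False, so we fail to reject H0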

Step #4: Make a Decision and Report Results


 Compare the obtained statistic to the decision rules
 Is the obtained Z more extreme than ±1.96?
   o If Yes
      Less than 5% probability of getting this result by chance, so accept H1 that
       there is some real effect
   o If No
      Observed differences are assumed to be due to chance, so accept H0 that there is no
       effect
      Sometimes we get weak results due to a poor study design. We could design
       a better study and try again.

Accept H0 or H1?
Z = +0.26 Z = +3.28
Z = -2.14 Z = -1.96
Z = +1.95 Z = +3.50

Side note: The number for the Z score tells you how much your sample mean differed relative to the
amount of expected error or discrepancy (SE). If Z = 3.5, it means that your sample differed from
the population by three and a half times what we'd expect due to chance alone.

Writing up the results in APA style:


First, provide a sentence with the statistical information, and then explain it in simple terms.

• Our result (Z = -1.54, no effect):
  SSRI use was not related to differences in suicidality, Z = -1.54, ns.
  SSRI use did not impact suicide.
  (ns = not significant; any difference is probably just a chance difference.)

• If Z = -1.97 (lower suicide scores):
  SSRI use was related to differences in suicidality, Z = -1.97, p < .05.
  SSRI users had lower suicide scores.

• If Z = 3.26 (higher suicide scores):
  SSRI use was related to differences in suicidality, Z = 3.26, p < .05.
  Surprisingly, SSRI users were more suicidal, so treatment may be iatrogenic.
  (p < .05 means the result is not likely due to chance – less than 5% probability – so we
  assume the effect is real.)

Alpha Level and Critical Region

(Diagram: standard normal curve – the typical outcome when H0 is true lies in the centre;
values of Z far out in either tail have a low probability of occurring if H0 is true, so there we
accept H1.)

• When Z is small, we assume the result is due to sampling error (chance)

• As Z becomes more extreme, the result is less and less likely to be due to chance

• When Z is more extreme than ±1.96, there is only a 5% probability that we would
get that result by chance, so we infer that there is some real effect (not just a chance
difference)



So if Z is 1.95, we say the result is due to chance, and if Z is 1.96, we say there is a real
effect? This seems a bit arbitrary!
• Yes, this is a bit odd, but we need to draw a line somewhere if we are to have rules
for making decisions (kind of like speed limits).
• Alpha level or significance level:
Percent of the time (or probability) we will incorrectly reject the null hypothesis
o Generally set at .05, so psychologists are willing to incorrectly reject the
null hypothesis about 5% of the time
o Used to determine the cutoff point for determining "significance"
• Critical Region:
  Z values where the result is considered significant, reliable, and unlikely to be due to chance
  o Use Z's more extreme than ±1.96, which corresponds to the alpha level
  o Rarely, other standards are used:

Alpha level α    Probability of incorrectly rejecting H0    Z value for cutoff
.10              10.0%                                      1.65
.05               5.0%                                      1.96
.01               1.0%                                      2.58
.001              0.1%                                      3.30
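The cutoff column of this table can be reproduced from the alpha level. A small sketch, assuming SciPy is available, computes the two-tailed critical Z values:

from scipy.stats import norm

for alpha in (0.10, 0.05, 0.01, 0.001):
    z_crit = norm.ppf(1 - alpha / 2)     # two-tailed cutoff
    print(f"alpha = {alpha}: critical Z = {z_crit:.2f}")
# Prints 1.64 (often rounded to 1.65), 1.96, 2.58 and 3.29 (often quoted as 3.30)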



• Tiny alphas (e.g. .001) used in areas of research where there are more likely to be
erroneous findings, such as neuroscience
• Sometimes researchers try to cheat and use big alphas (e.g. .10) to say their results are
"significant" when really they are not too reliable or impressive
Probability and Errors
• Decisions based on probability will always be wrong a proportion of the time
  o Hamilton is a 90% free throw shooter but he still misses sometimes.
  o A forecast says 80% chance of snow but this will be wrong at times too.
• Using the alpha = .05 (Z more extreme than ±1.96) rule, we will by definition be
wrong about 5% of the time when we reject the null hypothesis
• Two main types of errors

Type I Errors:
• Z more extreme than ±1.96 by accident; really there is no effect
• H0 is correct, but mistakenly accept H1
• "Saying something is true when it really isn't"
• Why?
1. Unlucky sample
2. Looked at too many variables at once, and some were related by
chance
3. Illusory correlations – instead of using science, drawing conclusions
based on anecdotal observations (e.g. superstitions)

Type II Errors:


• Weak Z, even though there is some real-world effect
• H1 is correct, but mistakenly accept H0
• "Oops, we missed it"
• Why?
1. Unlucky sample
2. Sample too small
3. Study designed poorly

Skepticism
“It is wrong always, everywhere, and for anyone, to believe anything upon
insufficient evidence.”
“The danger to society is not merely that it should believe wrong things, though
that is great enough; but that it should become credulous, and lose the habit of
testing things and inquiring into them; for then it must sink back into savagery.” -
William Kingdon Clifford

Different Types of Hypothesis Tests

Two-tailed hypothesis test
• Used when you want to test whether a treatment or manipulation has any effect,
  positive or negative
  o Does this pill impact appetite?
  o Do extra credit quizzes impact performance evaluations?
  o Does reading aloud impact retention of material when studying?
• Both positive and negative results are seen as interesting and important

One-tailed hypothesis test


• Used when you want to test whether a treatment or manipulation has an effect in
  a specific direction
  o Does reading make people smarter?
  o Does human growth hormone make baseball players stronger?
  o Does the tire store make more money when there are more potholes?
• Sometimes findings are only interesting and important when they occur in one
direction. For example, if we found that the tire store's business went down when
there were a lot of potholes, we'd likely dismiss this finding.
Comparison
• Because one-tail tests are a bit more specific, a less strict Z value is used

(Diagrams: a two-tail test with rejection regions in both tails; a one-tail test with the
rejection region in one tail only.)

            Two-tail    One-tail
Critical Z   ±1.96       +1.65

• Probably 99% of the time, people use two tail hypothesis tests. Any extreme result
is an interesting finding in most cases.

CHAPTER 11: DECISION THEORY


Introduction to Decision theory
Decision theory (or the theory of choice) is the study of the reasoning underlying an
agent's (an agent is an actor and more specifically a decision maker in a model of some aspect of
the economy) choices.



Decision theory is a framework of logical and mathematical concepts, aimed at helping
managers in formulating rules that may lead to a most advantageous course of action under
the given circumstances.

Decision theory can be broken into two branches:


 normative decision theory, which gives advice on how to make the best decisions,
given a set of uncertain beliefs and a set of values; and
 descriptive decision theory, which analyzes how existing, possibly irrational agents
actually make decisions.

Norms are concepts (sentences) of practical import, oriented to effecting an action, rather
than conceptual abstractions that describe, explain, and express. Normative sentences
imply "ought-to" types of statements and assertions, in distinction to sentences that provide
"is" types of statements and assertions.
Descriptive statements, by contrast, objectively analyze and describe how decisions are
actually made (or were made in the past) by real decision makers.

Decision theory divides decisions into three classes


(1) Decisions under certainty: where a manager has enough information to know the outcome
of each alternative, so the best alternative can be chosen directly.
(2) Decisions under conflict: where a manager has to anticipate moves and counter-moves
of one or more competitors.
(3) Decisions under uncertainty: where a manager has to dig-up a lot of data to make sense
of what is going on and what it is leading to. See also game theory.

Decision theory problems are characterized by the following:

• A decision criterion - A decision criterion represents the decision maker's point of


view for selecting the best alternative. Usually, it is derived from the decision goal.
It may be maximization (for example gains) or minimization (for example costs)
• A list of alternatives - The alternative represents an acceptable solution to the
decision problem. Success of developing suitable alternatives depends on the
experience and creativity of the decision maker
• A list of possible future events (states of nature) - Events represent possible future
situations that will be the primary determinants of the eventual consequence of the
decision. The situations must be mutually exclusive (no two or more events can
occur simultaneously) and collectively exhaustive (the events must cover all the
possibilities).
• Payoffs associated with each combination of alternatives and events - These
payoffs may be profits, revenues, costs, or other measure of value. Usually the
measures are financial. They may be weekly, monthly or annual amounts, or they
might represent values of future cash flows. Usually, payoffs are estimated values.
The more accurate these estimates, the more likely it is that the decision maker will
choose an appropriate alternative. If the number of alternatives is m and the number
of states of nature is n, m × n possible payoffs must be determined.
• The degree of certainty of possible future events - This degree can range from
complete knowledge about which state will occur to partial knowledge (the
probabilities of states of nature are known) and to no knowledge (complete
uncertainty).

There is a wide range of management decision problems. Among them there are capacity
and order planning, product and service design, equipment selection, location planning,
and so on.

Some examples of alternatives and possible events for these alternatives are shown in Table
below.
Alternatives                                Events
To order 10, 11, … units of a product       Demand for the product may be 0, 1, … units
To make or to buy a product                 The cost of making may be 20, 22, … $thousands
To buy or not to buy accident insurance     An accident may occur, or may not occur

Table 2.1 “Examples of Alternatives and Events”

Various properties of decision problems enable a classification of decision problems.


Point of View                                      Types of Decision Problems
possibility and complexity of algorithmization     ill-structured, well-structured, semi-structured
number of criteria                                 one criterion, multiple criteria
character of the decision maker                    individual, group
number of decision making periods                  one stage, multiple stage
relationships among decision makers                conflict, cooperative, noncooperative
degree of certainty for future events              complete certainty, risk, uncertainty
Solution to any decision problem consists of these steps:

1. Identify the problem


2. Specify objectives and the decision criteria for choosing a solution
3. Develop alternatives
4. Analyze and compare alternatives
5. Select the best alternative
6. Implement the chosen alternative
7. Verify that desired results are achieved



Decision problems that involve a single decision are usually best handled through payoff
tables, whereas problems that involve a sequence of decisions are usually best handled
using decision trees.

Mathematical Expectation
Mathematical expectation, also known as the expected value, is the probability-weighted
summation (or integration) of the possible values of a random variable. For a single event it
is the product of the probability of the event occurring, denoted P(x), and the value
corresponding with the actual observed occurrence of the event. The expected value is a
useful property of any random variable. Usually notated as E(X), the expected value can be
computed by summing over all the distinct values that the random variable can take. The
mathematical expectation is given by the formula E(X) = x1p1 + x2p2 + … + xnpn,
where x is a random variable with probability function f(x), p is the probability of
occurrence, and n is the number of all possible values.

The mathematical expectation of an indicator variable can be zero if there is no occurrence


of an event A, and the mathematical expectation of an indicator variable can be one if there
is an occurrence of an event A. Thus, it is a useful tool to find the probability of event A.

Questions answered (a simulation sketch follows below):
 What is the expected number of coin flips for getting tails?
 What is the expected number of coin flips for getting two tails in a row?
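A minimal simulation sketch of the two questions (a fair coin is assumed; the known answers are 2 flips and 6 flips respectively):

import random

def flips_until_tail():
    count = 0
    while True:
        count += 1
        if random.random() < 0.5:       # tail
            return count

def flips_until_two_tails_in_a_row():
    count, run = 0, 0
    while run < 2:
        count += 1
        run = run + 1 if random.random() < 0.5 else 0
    return count

trials = 100_000
print(sum(flips_until_tail() for _ in range(trials)) / trials)                # close to 2
print(sum(flips_until_two_tails_in_a_row() for _ in range(trials)) / trials)  # close to 6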

Properties and Assumptions:


The first property is that if X and Y are the two random variables, then the mathematical
expectation of the sum of the two variables is equal to the sum of the mathematical
expectation of X and the mathematical expectation of Y, provided that the mathematical
expectation exists. In other words, E(X+Y)=E(X)+E(Y).

The second property is that the mathematical expectation of the product of the two
random variables will be the product of the mathematical expectation of those two
variables, provided that the two variables are independent in nature. In other words,
E(XY)=E(X)E(Y).

The generalization of this property states that the mathematical expectation of the product
of the n number of independent random variables is equal to the product of the
mathematical expectation of the n independent random variables.

The third property states that the mathematical expectation of the product of a constant
and the function of a random variable is equal to the product of the constant and the
mathematical expectation of the function of that random variable provided that their
mathematical expectation exists. The third also states that the mathematical expectation of
the sum of a constant and the function of a random variable is equal to the sum of the
constant and the mathematical expectation of the function of that random variable provided
that their mathematical expectation exists. In other words, E(a *f(X))=a E(f(X)) and
E(a+f(X))=a+E(f(X)), where a is a constant and f(X) is the function.



The fourth property states that the mathematical expectation of the sum of the product
between a constant and the function of a random variable and the other constant is equal to
the sum of the product between the constant and the mathematical expectation of the
function of that random variable and the other constant provided that their mathematical
expectation exists. In other words, E(aX+b)=aE(X)+b, where a and b are constants.

The fifth property states that the mathematical expectation of a linear combination of
random variables is equal to the sum of the products of the n constants and the
mathematical expectations of the n variables. In other words, E(Σ aiXi) = Σ ai E(Xi),
where ai (i = 1, …, n) are constants.

Decision Theory computations

There are four types of criteria that we will look at.

Bayes’ rule: Expected Value (Realist)


Compute the expected value under each action and then pick the action with the largest expected
value. This is the only method of the four that incorporates the probabilities of the states of nature.
The expected value criterion is also called the Bayesian principle.
Bayes’ theorem (alternatively Bayes’ law or Bayes' rule) describes the probability of an event, based
on prior knowledge of conditions that might be related to the event. For example, if cancer is related
to age, then, using Bayes’ theorem, a person’s age can be used to more accurately assess the
probability that they have cancer, compared to the assessment of the probability of cancer made
without knowledge of the person's age.
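The arithmetic behind Bayes’ theorem, P(A|B) = P(B|A) * P(A) / P(B), can be sketched in a few lines of Python. The cancer/age numbers below are purely hypothetical and are only there to show the calculation.

# Hypothetical sketch of Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_cancer = 0.01               # assumed prior P(cancer)
p_age_given_cancer = 0.30     # assumed P(age group | cancer)
p_age = 0.10                  # assumed P(age group) overall

p_cancer_given_age = p_age_given_cancer * p_cancer / p_age
print(p_cancer_given_age)     # 0.03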

Maximax (Optimist)
The maximax looks at the best that could happen under each action and then chooses the action
with the largest value. They assume that they will get the most possible and then they take the
action with the best best case scenario. The maximum of the maximums or the "best of the best".
This is the lotto player; they see large payoffs and ignore the probabilities.

The Hurwicz α-criterion represents a compromise between the optimistic and the
pessimistic approach to decision making under uncertainty. The measure of optimism and
pessimism is expressed by an optimism-pessimism index α in the interval <0,1>. The nearer
this index is to 1, the more optimistic the decision maker is. By means of the index α, a
weighted average of the best payoff (its weight = α) and the worst payoff (its weight = 1 - α)
is computed for each alternative, and the alternative with the largest weighted average
should be chosen.
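A small Python sketch of the Hurwicz computation is shown below. The two alternatives, their payoffs and the choice of the index α = 0.6 are assumptions made for illustration only.

# Sketch of the Hurwicz criterion with an assumed optimism index alpha in [0, 1].
alpha = 0.6
payoffs = {
    "A1": [50, 200, 350],     # illustrative payoffs of alternative A1 under three states
    "A2": [-100, 300, 500],   # illustrative payoffs of alternative A2 under three states
}

def hurwicz(values, alpha):
    # weighted average of the best payoff (weight alpha) and the worst payoff (weight 1 - alpha)
    return alpha * max(values) + (1 - alpha) * min(values)

scores = {a: hurwicz(v, alpha) for a, v in payoffs.items()}
print(scores, max(scores, key=scores.get))   # {'A1': 230.0, 'A2': 260.0} A2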

Maximin (Pessimist)
The maximin person looks at the worst that could happen under each action and then chooses the
action with the largest payoff. They assume that the worst that can happen will, and then they take
the action with the best worst-case scenario. The maximum of the minimums or the "best of the



worst". This is the person who puts their money into a savings account because they could lose
money at the stock market.

The maximin rule (Wald criterion) represents a pessimistic approach when the worst
decision results are expected. The decision maker determines the smallest payoff for each
alternative and then chooses the alternative that has the best (maximum) of the worst
(minimum) payoffs (therefore "maximin").

Minimax (Opportunist)
Minimax decision making is based on opportunistic loss. They are the kind that look back after the
state of nature has occurred and say "Now that I know what happened, if I had only picked this
other action instead of the one I actually did, I could have done better". So, to make their decision
(before the event occurs), they create an opportunistic loss (or regret) table. Then they take the
minimum of the maximum. That sounds backwards, but remember, this is a loss table. This is
similar to the maximin principle in theory; they want the best of the worst losses.

Example: A bicycle shop


Zed and Adrian run a small bicycle shop called "Z to A Bicycles". They must order
bicycles for the coming season. Orders for the bicycles must be placed in quantities of
twenty (20). The cost per bicycle is $70 if they order 20, $67 if they order 40, $65 if they
order 60, and $64 if they order 80. The bicycles will be sold for $100 each. Any bicycles
left over at the end of the season can be sold (for certain) at $45 each. If Zed and Adrian
run out of bicycles during the season, then they will suffer a loss of "goodwill" among their
customers. They estimate this goodwill loss to be $5 per customer who was unable to buy a
bicycle. Zed and Adrian estimate that the demand for bicycles this season will be 10, 30,
50, or 70 bicycles with probabilities of 0.2, 0.4, 0.3, and 0.1 respectively.

Actions
There are four actions available to Zed and Adrian. They have to decide which of the
actions is the best one under each criterion.

1. Buy 20 bicycles
2. Buy 40 bicycles
3. Buy 60 bicycles
4. Buy 80 bicycles

Zed and Adrian have control over which action they choose. That is the whole point of
decision theory - deciding which action to take.

States of Nature
There are four possible states of nature. A state of nature is an outcome.

1. The demand is 10 bicycles


2. The demand is 30 bicycles
3. The demand is 50 bicycles
4. The demand is 70 bicycles
Zed and Adrian have no control over which state of nature will occur. They can only plan
and make the best decision based on the appropriate decision criteria.

Payoff Table
After deciding on each action and state of nature, create a payoff table. The numbers in
parentheses for each state of nature represent the probability of that state occurring.
                            Action
State of Nature      Buy 20   Buy 40   Buy 60   Buy 80
Demand 10 (0.2)          50     -330     -650     -970
Demand 30 (0.4)         550      770      450      130
Demand 50 (0.3)         450     1270     1550     1230
Demand 70 (0.1)         350     1170     2050     2330

Ok, the question on your mind is probably "How the [expletive deleted] did you come up
with those numbers?". Let's take a look at a couple of examples.

Demand is 50, buy 60:

They bought 60 at $65 each for $3900. That is -$3900 since that is money they spent. Now,
they sell 50 bicycles at $100 each for $5000. They had 10 bicycles left over at the end of the
season, and they sold those at $45 each for $450. That makes $5000 + 450 - 3900 = $1550.

Demand is 70, buy 40:

They bought 40 at $67 each for $2680. That is a negative $2680 since that is money they
spent. Now, they sell 40 bicycles (that's all they had) at $100 each for $4000. The other 30
customers that wanted a bicycle, but couldn't get one, left mad and Zed and Adrian lost $5
in goodwill for each of them. That's 30 customers at -$5 each or -$150. That makes $4000 -
2680 - 150 = $1170.
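The whole payoff table can be generated from one small function. The Python sketch below encodes the unit costs, selling price, salvage value and goodwill loss given in the example and reproduces the two payoffs worked out above.

# Sketch of the payoff calculation used in the bicycle-shop payoff table.
UNIT_COST = {20: 70, 40: 67, 60: 65, 80: 64}   # cost per bicycle for each order size
SELL, SALVAGE, GOODWILL = 100, 45, 5           # selling price, end-of-season price, goodwill loss

def payoff(order, demand):
    sold = min(order, demand)
    leftover = max(order - demand, 0)
    unmet = max(demand - order, 0)
    return sold * SELL + leftover * SALVAGE - order * UNIT_COST[order] - unmet * GOODWILL

print(payoff(60, 50))   # 1550, matching the first worked example
print(payoff(40, 70))   # 1170, matching the second worked example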

Opportunistic Loss Table


The opportunistic loss (regret) table is calculated from the payoff table. It is only needed
for the minimax criterion, but let's go ahead and calculate it now while we're thinking about
it.

The maximum payoffs under each state of nature in the payoff table above are 50 (demand 10),
770 (demand 30), 1550 (demand 50), and 2330 (demand 70). For example, the best that Zed and
Adrian could do if the demand was 30 bicycles is to make $770.

Each element in the opportunistic loss table is found taking each state of nature, one at a
time, and subtracting each payoff from the largest payoff for that state of nature. In the way
we have the table written above, we would subtract each number in the row from the
largest number in the row.
                            Action
State of Nature      Buy 20   Buy 40   Buy 60   Buy 80
Demand 10                 0      380      700     1020
Demand 30               220        0      320      640
Demand 50              1100      280        0      320
Demand 70              1980     1160      280        0
Remember that the numbers in this table are losses and so the smaller the number, the
better.
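If you prefer to let software do the subtraction, the short NumPy sketch below rebuilds the regret table from the payoff table; the array layout (rows = demand states, columns = actions) is simply the arrangement used in the tables above.

# Sketch: build the opportunistic loss (regret) table from the payoff table.
import numpy as np

# rows: demand 10, 30, 50, 70; columns: buy 20, 40, 60, 80
payoffs = np.array([
    [  50, -330, -650, -970],
    [ 550,  770,  450,  130],
    [ 450, 1270, 1550, 1230],
    [ 350, 1170, 2050, 2330],
])

# subtract every payoff in a row from that row's largest payoff
regret = payoffs.max(axis=1, keepdims=True) - payoffs
print(regret)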

Expected Value Criterion


Compute the expected value for each action.

For each action, do the following: Multiply each payoff by the probability of that payoff
occurring, then add those values together. Matrix multiplication works really well for this,
as it multiplies pairs of numbers together and adds them. If you place the probabilities into
a 1x4 matrix and use the 4x4 payoff matrix shown above, then you can multiply the matrices to
get a 1x4 matrix with the expected value for each action.

Here is an example of the "Buy 60" action if you wish to do it by hand.

0.2(-650) + 0.4(450) + 0.3(1550) + 0.1(2050) = 720

The expected values for buying 20, 40, 60, and 80 bicycles are $400, 740, 720, and 460
respectively. Since the best that you could expect to do is $740, you would buy 40 bicycles.
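A short NumPy sketch of that matrix multiplication is given below; it reproduces the four expected values and therefore the "Buy 40" decision.

# Sketch of the expected value criterion as a matrix (dot) product.
import numpy as np

probs = np.array([0.2, 0.4, 0.3, 0.1])   # P(demand = 10, 30, 50, 70)
payoffs = np.array([                      # rows: demand states; columns: buy 20, 40, 60, 80
    [  50, -330, -650, -970],
    [ 550,  770,  450,  130],
    [ 450, 1270, 1550, 1230],
    [ 350, 1170, 2050, 2330],
])

expected = probs @ payoffs                # one expected value per action
print(expected)                           # [400. 740. 720. 460.] -> buy 40 bicycles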

Maximax Criterion
The maximax criterion is much easier to do than the expected value. You simply look at the
best you could do under each action (the largest number in each column). You then take the
best (largest) of these.

The largest payoffs if you buy 20, 40, 60, and 80 bicycles are $550, 1270, 2050, and 2330
respectively. Since the largest of those is $2330, you would buy 80 bicycles.

Maximin Criterion
The maximin criterion is as easy to do as the maximax. Except instead of taking the largest
number under each action, you take the smallest payoff under each action (smallest number
in each column). You then take the best (largest) of these.

The smallest payoffs if you buy 20, 40, 60, and 80 bicycles are $50, -330, -650, and -970
respectively. Since the largest of those is $50, you would buy 20 bicycles.

Minimax Criterion
Be sure to use the opportunistic loss (regret) table for the minimax criterion. You take the
largest loss under each action (largest number in each column). You then take the smallest
of these (it is loss, after all).



The largest losses if you buy 20, 40, 60, and 80 bicycles are $1980, 1160, 700, and 1020
respectively. Since the smallest of those is $700, you would buy 60 bicycles.

Putting it all together.


Here is a table that summarizes each criterion and the best decision.

                          Action
Criterion          Buy 20   Buy 40   Buy 60   Buy 80   Best Action
Expected Value        400      740      720      460   Buy 40
Maximax               550     1270     2050     2330   Buy 80
Maximin                50     -330     -650     -970   Buy 20
Minimax              1980     1160      700     1020   Buy 60

Practice Problem
Since there aren't any problems of this kind in the text, work this problem out, and then you
can check your answer with the instructor.

Finicky's Jewelers sells watches for $50 each. During the next month, they estimate that
they will sell 15, 25, 35, or 45 watches with respective probabilities of 0.35, 0.25, 0.20, and
... (figure it out). They can only buy watches in lots of ten from their dealer. 10, 20, 30, 40,
and 50 watches cost $40, 39, 37, 36, and 34 per watch respectively. Every month, Finicky's
has a clearance sale and will get rid of any unsold watches for $24 (watches are only in
style for a month and so they have to buy the latest model each month). Any customer that
comes in during the month to buy a watch, but is unable to, costs Finicky's $6 in lost
goodwill.

Find the best action under each of the four decision criteria.

Decision Trees
Characteristic and Format of Decision Trees
A decision tree is a graphic tool for describing the actions available to the decision maker,
the events that can occur, and the relationship between these actions and events. Decision
trees are particularly useful for analyzing situations that involve sequential decisions.

The term "decision tree" gets its name from the treelike appearance of the diagram. A
decision tree can be deterministic or stochastic. As a prototype of decision tree, a tree with
deterministic and stochastic elements is considered.

A decision tree is composed of nodes and branches (arcs). The terminology of nodes and
arcs comes from network models, which have a similar pictorial representation.

A decision tree has three types of nodes: decision nodes, chance event nodes, and
terminating nodes.



Decision nodes are denoted by squares. Each decision node has one or more arcs
beginning at the node and extending to the right. Each of those arcs represents a possible
decision alternative at that decision point.

Chance event nodes are denoted by circles. Each chance event node has one or more arcs
beginning at the node and extending to the right. Each of those arcs represents a possible
event at that chance event point. The decision maker has no control over these chance
events. The events associated with branches from any chance event node must be mutually
exclusive, and all possible events must be included. The probability of each event is conditional
on all of the decision alternatives and chance events that precede it on the decision tree. The
probabilities for all of the arcs beginning at a chance event node must sum to 1.

A terminating node represents the end of the sequence of decisions and chance events. No
arcs extend to the right from a terminating node. No geometric picture is used to denote
terminating nodes. Terminating nodes are the starting points for the computations needed to
analyze the decision tree. To construct a decision tree, we must list the sequence of decision
alternatives and events that can occur and that can affect the consequences of decisions.

Format of a decision tree is evident from Picture below.

Picture. ”Decision Tree Format”

Analysis of Decision Trees


After the tree has been drawn, it is analyzed from right to left. The aim of this analysis is to
determine the best strategy of the decision maker, that is, an optimal sequence of
decisions.

To analyze a decision tree, we must know a decision criterion, probabilities that are
assigned to each event, and revenues and costs for the decision alternatives and the chance
events that occur.

There are two possibilities for including revenues and costs in a decision tree. One
possibility is to assign them only to terminating nodes, where they are included in the
conditional value of the decision criterion associated with the decisions and events along
the path from the first part of the tree to the end. However, it can be more convenient to
assign revenues and costs to branches. This reduces the required arithmetic for calculating



the values of the decision criterion for terminating nodes and focuses attention on the
parameters for sensitivity analysis.

Analyzing a decision tree, we begin at the end of the tree and work backwards. We carry
out two kinds of calculations.

For chance event nodes, we calculate certainty equivalents related to the events emanating
from these nodes. Under the assumption that the decision maker has a neutral attitude
toward risk, the certainty equivalent of uncertain outcomes can be replaced by their
expected value.

At decision nodes, the alternative with the best expected value of the decision criterion is
selected.

The analysis of a decision tree is illustrated by the following example.

A firm is deciding between two alternatives: to introduce a new product or to keep the
existing product. Introducing a new product has uncertain outcomes depending on the
demand. If the demand is high, the resulting profit of the firm will be 140. Low demand
will result in a profit of 80. The firm estimates the probabilities of high and low
demand to be 0.7 and 0.3, respectively. If the firm keeps the existing product, its profit will be
110.

The decision tree for this decision problem is drawn in Picture below.

Picture. ”Analysis of a Decision Tree”

The estimated profit is written at the end of the chance branches. The probabilities of a
high and a low demand for the new product are written below the branches leaving the
chance node. The nodes are numbered.

For the chance node 2, we calculate the expected value of the profit (0.7*140 + 0.3*80 =
122) and write this value over the node 2.

At the decision node 1, we select the decision alternative with the higher expected profit.
Because max (122;110) = 122, introducing the new product is profitable. We write the



maximum expected profit over the node 1 and draw double lines through the branch
representing the inferior (worse) decision alternative.
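The rollback just performed (expected value at the chance node, maximum at the decision node) can be sketched in a few lines of Python. The two helper functions are illustrative only and not part of any standard library.

# Sketch of the right-to-left rollback for the product-introduction tree above.
def chance(outcomes):
    # outcomes: list of (probability, value) pairs; return the expected value
    return sum(p * v for p, v in outcomes)

def decision(alternatives):
    # alternatives: dict of name -> value; return the best alternative and its value
    best = max(alternatives, key=alternatives.get)
    return best, alternatives[best]

node2 = chance([(0.7, 140), (0.3, 80)])                 # 122.0
print(decision({"introduce new product": node2,
                "keep existing product": 110}))          # ('introduce new product', 122.0)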

Every decision problem that is modelled by means of a decision table can be structured and
pictured as a decision tree (in general, the reverse statement is not true). This is shown in
Picture 2.6, which represents a decision tree for the order planning problem given in Table
2.2. We recall this table.

Picture 2.6 ”Decision Tree for Order Planning”

For the above order planning problem, the use of a decision table in comparison with the
use of a decision tree may seem easier and simpler. However, as the decision problem
becomes more complex, the decision tree becomes more valuable in organizing the
information needed to make the decision. This is especially true if the decision maker must
make a sequence of decisions, rather than a single decision, as the next example illustrates.

Suppose the marketing manager of a firm is trying to decide whether or not to market a
new product and at what price to sell it. The profit to be made depends on whether or not a
competitor will introduce a similar product and on what price the competitor charges.

Note that there are two decisions: (1) introduce the product or not, and (2) the price to
charge. Likewise, there are two events: (1) competition introduces a competitive product
(or not), and



(2) the competitor's price. The timing or sequence of these decisions and events is very
important in this decision. If the marketing manager must act before knowing whether or
not the competitor has a similar product, the price may be different than with such
knowledge. A decision tree is particularly useful in this type of situation, since it displays
the order in which decisions are made and events occur.

Suppose that the firm must decide whether or not to market its new product shortly.
However, the price decision can be made later. If the competitor is going to act, it will
introduce its product within a month. In three months, the firm will establish and announce
its price. After that, the competitor will announce its price.

Note that the given problem is a sequential decision problem. The firm must make a
decision now about introduction and subsequently set price, after learning about the
competitor's action. This structure of the problem is diagrammed in the decision tree in
Picture 2.7. Estimated profits for every path through the tree are shown at the ends of the
tree. The probabilities for each event are shown under the event branches. Note that the
probabilities for the competitor's price behavior are different when the firm's price is high
than when the firm's price is medium or low.

After the analysis of the drawn decision tree the following strategy can be recommended
to the firm: Introduce the new product and charge a high price if there is no competitive
entry; but charge a medium price if there is competition. For this strategy, the expected
profit is $156,000.

Picture 2.7 ”Decision Tree for Product Introduction Example”



CHAPTER 12: SIMULATION
An Overview of Simulation and Modeling
The terms "modeling" and "simulation" are sometimes used interchangeably. In reality,
they are distinct, though related, terms.

MODELING is the representation of an object or phenomenon, which is used by


simulation. Models may be mathematical, physical, or logical representations of a system,
entity, phenomenon, or process. Models are, in turn, used by simulation to predict a future
state.

Examples of models:

• Mathematical model of sensor response


• Computer aided design model of an armoured vehicle or a helicopter or a human
being

Modeling refers to the process of creating models.

SIMULATION is a representation of the functioning of a system or process. Through


simulation, a model may be implemented with unlimited variations, producing complex
scenarios. These capabilities allow analysis and understanding of how individual elements
interact and affect the simulated environment.

Example of a simulation:

• three-dimensional model of an armoured vehicle which moves across a model of


terrain over time

The tool that executes the simulation is a "simulator".

Live Simulations - Real people operate in the real world.


Virtual Simulations - Real people operate in synthetic worlds.
Constructive Simulations - Simulated entities operate in synthetic worlds.
Undefined Simulations - Simulated entities are subjected to real world environments.

Applications of Simulation: Experimentation, Operational Planning, Training, Mission
Rehearsal, Support to the Conduct of Operations, Life Cycle Management.

Generally, people most readily associate Modelling and Simulation (M&S) with training.
M&S tools are used to train astronauts, commercial and military aircrews, nuclear power
specialists, healthcare workers, and maintenance specialists, just to name a few
professionals. M&S provides rehearsal environments for civilian first responder and
military personnel. Repeated rehearsal of procedures improves performance, saving
countless lives as well as aircraft, ships, and other vehicles. Also, training individuals



before allowing them to use actual equipment improves the safety of the individuals
undergoing training, the participants around them and the safety of the actual equipment.

While training is perhaps the most visible of M&S applications, M&S can be used to study
any system or process. This ranges from human bodily systems and transportation
networks, to vehicle systems, communities, and product design or manufacturing. M&S
tools and processes help solve pressing issues across government, industry, and academic
domains. M&S can answer "what if" questions or provide robust experimentation or
training environments that may not be otherwise realised.

Fundamental Concepts in Simulation


Learning the language is a key task facing everyone who is entering any new field of work,
especially one such as simulation, which has both technical and educational aspects. When
we are engaged in learning through the use of a simulation, we might find ourselves in any one
of a number of quite different contexts. In one setting we may be taking part in a face-to-
face role-based activity, drawing on individual interpretations of some aspect of real world
conditions. In another setting we might have a role as a member of a team that has the task
of using a technical simulation to create learning environments, using data obtained
through analysis of the real world. Both of these are using 'simulation' to create a learning
context and each uses the same core essentials to do so. However, the visible setting may be
quite different and reliance on different technologies may even obscure the similarities
between them.

Designing Instructional/Learning Components of Simulation


The initial design basis for all training/learning involving any use of a simulation is
instructional design. The process does not begin with the technology, although sometimes
it is hard to convince clients that the technology to be used is not the beginning point. In
the past sixty years there have been tremendous advances in instructional design and
simulation development and both of these must be taken advantage of as much as possible
to create truly engaging learning environments. All sectors of the simulation design and
construction process are becoming more aware of this as divisions separating engineering,
learning and support are dissolving in the face of the need to address ever more complex
learning outcomes through integrated use of face to face and technical simulation.

VV&A - Verification, Validation and Accreditation


Verification, Validation and Accreditation is a trio of concepts vital to assuring that any
simulation meets relevant quality control criteria. They are used together as a means of
putting a simulation 'through its paces' prior to committing it to use.

Standards for simulation and modelling are essential to tasks such as the conduct of VV&A
and simulator interoperability. But they can impose costs and limit the capability of
simulation, so should only be used where the benefits outweigh the costs.

Three types of simulations


Simulations generally come in three styles: live, virtual and constructive. A simulation also
may be a combination of two or more styles. Within these styles, simulations can be



science-based (where, for example, interactions of things are observed or measured), or
involve interactions with humans. Our primary focus at IST is on the latter, human-in-
the-loop, simulations.

Live simulations typically involve humans and/or equipment and activity in a setting
where they would operate for real. Think war games with soldiers out in the field or
manning command posts. Time is continuous, as in the real world. Another example of live
simulation is testing a car battery using an electrical tester.

Virtual simulations typically involve humans and/or equipment in a computer-controlled


setting. Time is in discrete steps, allowing users to concentrate on the important stuff, so to
speak. A flight simulator falls into this category.

Constructive simulations typically do not involve humans or equipment as participants.


Rather than by time, they are driven more by the proper sequencing of events. The
anticipated path of a hurricane might be "constructed" through application of temperatures,
pressures, wind currents and other weather factors. Science-based simulations are typically
constructive in nature.

A simulator is a device that may use any combination of sound, sight, motion and smell to
make you feel that you are experiencing an actual situation. Some video games are good
examples of low-end simulators. For example, you have probably seen or played race car
arcade games.

The booths containing these games have a steering wheel, stick shift, gas and brake pedals
and a display monitor. You use these devices to "drive" your "race car" along the track and
through changing scenery displayed on the monitor. As you drive, you hear the engine
rumble, the brakes squeal and the metal crunch if you crash. Some booths use movement to
create sensations of acceleration, deceleration and turning. The sights, sounds and feel of
the game booth combine to create, or simulate, the experience of driving a car in a race.

Most people first think of "flight simulators" or "driving simulators" when they hear the
term "simulation." But simulation is much more.

Because they can recreate experiences, simulations hold great potential for training people
for almost any situation. Education researchers have, in fact, determined that people,
especially adults, learn better by experience than through reading or lectures. Simulated
experiences can be just as valuable a training tool as the real thing.

Simulations are complex, computer-driven re-creations of the real thing. When used for
training, they must recreate "reality" accurately, otherwise you may not learn the right way
to do a task.

For example, if you try to practice how to fly in a flight simulator game that does not
accurately model the flight characteristics of an airplane, you will not learn how a real
aircraft responds to your control.



Building simulator games is not easy, but creating simulations that accurately answer such
questions as "If I do this, what happens then?" is even more demanding.

Computer Modeling & Classification


A computer model, as used in modeling and simulation science, is a mathematical
representation of something—a person, a building, a vehicle, a tree—any object. A model
also can be a representation of a process—a weather pattern, traffic flow, air flowing over a
wing.

Models are created from a mass of data, equations and computations that mimic the actions
of things represented. Models usually include a graphical display that translates all this
number crunching into an animation that you can see on a computer screen or by means of
some other visual device.

Models can be simple images of things—the outer shell, so to speak—or they can be
complex, carrying all the characteristics of the object or process they represent. A complex
model will simulate the actions and reactions of the real thing; making such models behave
the way they would in real life requires accurate, real-time simulation.

Simulation models are classified as:
1. Deterministic models: In these models, input and output variables are not
permitted to be random variables, and models are described by exact functional
relationship.
2. Stochastic models: In these models, at least one of the variables or functional
relationship is given by probability functions.
3. Static models: These models do not take variable time into consideration.
4. Dynamic models: These models deal with time varying interaction.

Some examples:

Deterministic models (no randomness; inputs are exact, with no uncertainty; one model needs only one run):
• Static (no time element): use a fitted regression model for unobserved independent-variable combinations; financial scenarios.
• Dynamic (the passage of time is an important part of the model): differential-equation models of population growth and decay; deterministic forecasting over time; dynamic macroeconomic models.
• Output: the desired output quantities can be computed exactly.

Stochastic models (random, uncertain inputs drawn from known distributions; one model needs more than one run):
• Static (no time element): "Monte Carlo" simulation, e.g. estimating an intractable integral or obtaining the empirical distribution of a new test statistic for some null hypothesis.
• Dynamic (the passage of time is an important part of the model): queueing models representing manufacturing, computer, or communications systems; inventory models.
• Output: the desired output quantities can only be estimated.

Advantages & Disadvantages of simulation models


Advantages of simulation models:
• Simulation models are comparatively flexible, and can be modified to adjust
according to the variation in the environments of real life situations.
• Simulation is easier to use than mathematical models and is hence considered
superior to mathematical analysis.
• Simulation techniques have the advantage of being relatively free from complicated
mathematics and hence, can be easily understood by the operating staff and also by
non-technical managers.
• Simulation offers up a solution by performing virtual experimentation with a model
of the system without interfering with the real system. It thus bypasses complex
mathematical analysis.
• Simulation compresses the performance of a system over several years, and hence
performs large calculations in a few minutes of computer running time.
• By using simulation, management can foresee the difficulties and bottlenecks that
may arise due to addition of new machines or equipment, or by modifying a
process. It eliminates the need for costly trial and error methods of trying out a new
concept on real processes and equipment.
• It is better to train people on a simulated model, rather than putting them to work
straightaway on the real system. Simulation gives the trainee experience and expertise,
so that the trainee gains sufficient confidence in handling the real system.

Disadvantages of simulation models:


• Optimum results cannot be produced by simulation. Since the models deal with
uncertainties, the results of simulation are only reliable approximations subject to
statistical error.
• In many situations, it isn't possible to quantify all the variables which play a role in
the system.
• In large, complex problems involving many variables and their inter-relationships,
the capacity of the computer may not be enough to process the entire system.
• Since computers are involved in simulation, it makes simulation a comparatively
costlier technique to use.

• Simulation is sometimes applied to simple problems, due to over reliance on


simulation, when in fact the problems could be solved in an easier manner by some
other technique like mathematical analysis.

Solution derived from analytical models:


• These solutions are precise, i.e. accurate, and reflect the true state of the system.
• Moreover, such derived solutions are most often the optimal solution, depicting the
best state of the system.
• However, these solutions involve complex calculations which require a fair amount
of time.
• Also, it is not easy to derive a solution for systems other than small scale systems.

Solutions derived from simulation models:


• These solutions are not accurate, and certainly not optimal.
• This is because there exists a lot of uncertainties that lead to various statistical
errors.
• However, these solutions are quick to generate, with no complex calculations
involved whatsoever.
• Also, though not accurate, they still represent a fair overview of the behavior of the
system.

Methods of Studying a System



A system can be studied in two ways: by experimenting with the actual system, or by
experimenting with a model of the system. A model may be a physical model or a
mathematical model, and a mathematical model can be analysed either by deriving an
analytical solution or by simulation.

Physical model (most commonly referred to simply as a model but in this context
distinguished from a conceptual model) is a smaller or larger physical copy of an object.
The object being modelled may be small (for example, an atom) or large (for example, the
Solar System).

Physical models allow visualization, from examining the model, of information about the
thing the model represents. A model can be a physical object such as an architectural model
of a building. Uses of an architectural model include visualization of internal relationships
within the structure or external relationships of the structure to the environment. Other uses
of models in this sense are as toys.

Mathematical model is a description of a system using mathematical concepts and


language.
The process of developing a mathematical model is termed mathematical modeling.
Mathematical models are usually composed of relationships and variables. Relationships
can be described by operators, such as algebraic operators, functions, differential
operators, etc. Variables are abstractions of system parameters of interest, that can be
quantified.

Analytical models are mathematical models that have a closed form solution, i.e. the
solution to the equations used to describe changes in a system can be expressed as a
mathematical analytic function.

For example, a model of personal savings that assumes a fixed yearly growth rate, r, in
savings, S, implies that the time rate of change in savings is given by dS/dt = rS (eqn. 1).
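To see the difference between an analytical solution and a simulation of the same model, the Python sketch below compares the closed-form solution S(t) = S0 * e^(r*t) of eqn. 1 with a simple step-by-step (Euler) simulation. The initial savings, growth rate and time horizon are assumed values chosen only for illustration.

# Analytical vs. simulated solution of dS/dt = r*S (eqn. 1).
import math

S0, r, years = 1000.0, 0.05, 10

analytic = S0 * math.exp(r * years)   # closed-form solution S(t) = S0 * e^(r*t)

S, dt = S0, 0.01                      # simulate in small time steps (Euler's method)
for _ in range(int(years / dt)):
    S += r * S * dt

print(round(analytic, 2), round(S, 2))   # about 1648.72 vs about 1648.52 (close, but approximate)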
A simulation model is a mathematical model that calculates the impact of uncertain inputs
and decisions we make on outcomes that we care about, such as profit and loss, investment
returns, environmental consequences, and the like.

Model Classifications
Several classification categories for models exist. A system we are modeling exhibits
probabilistic or stochastic behavior if an element of chance exists. For example, the path
of a hurricane is probabilistic. In contrast, a behavior can be deterministic, such as the
position of a falling object in a vacuum. Similarly, models can be deterministic or
probabilistic. A probabilistic or stochastic model exhibits random effects, while a
deterministic model does not. The results of a deterministic model depend on the initial
conditions; and in the case of computer implementation with particular input, the output is
the same for each program execution. As we studied this and other modules, we can have a
probabilistic model for a deterministic situation, such as a model that uses random numbers
to estimate the area under a curve .Figure 6 is depicted the classification of different kinds
of models.

1. Discrete-Event Simulation Model


Sufficient modeling concepts have been defined so that a discrete event simulation model
can be defined as one in which the state variables change only at those discrete points in
time at which events occur. Events occur as a consequence of activity times and delays.
Entities may compete for system resources, possibly joining queues while waiting for an
available resource. Activity and delay times may "hold" entities for durations of time. A
discrete-event simulation model is conducted over time ("run") by a mechanism that moves



simulated time forward. The system state is updated at each event along with capturing and
freeing of resources that may occur at that time.

2. Stochastic and Deterministic Systems


Definitions A system exhibits probabilistic or stochastic behavior if an element of
chance exists. Otherwise, it exhibits deterministic behavior. A probabilistic or stochastic
model exhibits random effects, while a deterministic model does not.

Deterministic: Randomness does not affect the behavior of the system. The output of the
system is not a random variable.
Stochastic: Randomness affects the behavior of the system. The output of the system is a
random variable.

3. Static and Dynamic Simulations


We can also classify models as static or dynamic. In a static model, we do not consider
time, so that the model is comparable to a snapshot or a map. For example, a model of the
weight of a salamander as being proportional to the cube of its length has variables for
weight and length, but not for time. By contrast, in a dynamic model, time changes, so that
such a model is comparable to an animated cartoon or a movie. For example, the number of
salamanders in an area undergoing development changes with time; and, hence, a model of
such a population is dynamic. Many of the models we consider in this text are dynamic and
employ a static component as part of the dynamic model.
Definitions A static model does not consider time, while a dynamic model changes with
time.
Static: A simulation of a system at one specific time, or a simulation in which time is not a
relevant parameter for example, Monte Carlo & steady-state simulations.
Dynamic: A simulation representing a system evolving over time for examples, the
majority of simulation problems.

4. Discrete vs. Continuous Systems


When time changes continuously and smoothly, the model is continuous. If time changes
in incremental steps, the model is discrete. A discrete model is analogous to a movie. A
sequence of frames moves so quickly that the viewer perceives motion. However, in a live
play, the action is continuous. Just as a discrete sequence of movie frames represents the
continuous motion of actors, we often develop discrete computer models of continuous
situations.
Definitions In a continuous model, time changes continuously, while in a discrete model
time changes in incremental steps.
Continuous: State variables change continuously as a function of time (Figure 7), and
generally an analytical method such as deductive mathematical reasoning is used to define
and solve the system:
State Variable (S.V.) = f(t)



Figure: Continuous model behavior

Discrete: State variables change at discrete points in time (Figure below), and generally a
numerical method (computational procedure) is used to solve the mathematical model:
State Variable (S.V.) = f(nΔt)

Figure: Discrete model behavior


Examples of Different Systems
• Queue length at a cash machine: Stochastic, Discrete Time, Discrete System
• The motion of the planets: Deterministic, Continuous Time, Discrete System
• Logic circuit in a computer: Deterministic, Discrete Time, Discrete System
• Flow of air around a car: Deterministic, Continuous Time, Continuous System
• Closing prices of the 30 DAX shares: Stochastic, Discrete Time, Discrete System



Steps in a Simulation Model
1. Formulate the problem and plan the study.
2. Collect data and define a model.
3. Check whether the model is valid; if not, return to step 2.
4. Construct a computer program and verify it.
5. Make pilot runs.
6. Check whether the programmed model is valid; if not, return to step 2.
7. Design experiments.
8. Make production runs.
9. Analyze the output data.
10. Document, present, and implement the results.

Simulation Modeling, Input, Output, and Experiments


The modeling process involves two distinct but related activities.

Structural modeling (physical/logical relationships among components):
• Topology/layout of machines
• Possible routings for part flows
• Feedback/failure loops
• Closed vs. open structure in a model of a computer system

Quantitative modeling (specific numerical/distributional assumptions composing the model):
• How many machines at each workcenter?
• Probabilities for branch points on routing decisions?
• Cycle times of part type 3 on a machine in group 5 are random variates drawn from
what distribution? With what parameters?
• Run the model for one hour? One year? Until 5000 parts have been produced?

Building "good" simulation models:

• Verification: the code (in whatever language or product) is correct.
• Validation: the model (as expressed in the verified code) faithfully mimics the system under
study; the model/code can then be used as a surrogate for the system to make decisions.
• Credibility: the valid model is accepted by decision makers; critical for
implementation success.

Elements of both structural and quantitative components can become variables (or factors)
in the design of simulation experiments

Structural factors:
• Try a different layout of machines
• What if part-flow routings changed due to technology?
• What if rework were just scrapped instead (no feedback loops)?
• What if the computer system went from open (batch jobs) to closed (interactive)?

Quantitative factors:
• What if we added a machine somewhere?
• What if quality improvement changed pass/fail branching probabilities?
• How effective would it be to reduce cycle times on the bottleneck work center?
• How long will the model operate before becoming unduly congested?

“Machine” View of What a Simulation Does:

Inputs (structural and quantitative) flow into the model (the code), which produces outputs
(performance measures).
CHAPTER 13: SAMPLING
Introduction to sampling
Definition of sampling: the act, process, or technique of selecting a suitable sample; specifically, the
act, process, or technique of selecting a representative part of a population for the purpose of
determining parameters or characteristics of the whole population.

Sampling is a process used in statistical analysis in which a predetermined number of observations


are taken from a larger population. The methodology used to sample from a larger population
depends on the type of analysis being performed, but may include simple random sampling or
systematic sampling.
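Both of these selection methods are easy to sketch in Python; the population of 100 numbered units and the sample size of 10 below are assumptions made purely for illustration.

# Simple random sampling and systematic sampling from a small population list.
import random

population = list(range(1, 101))        # an assumed population of 100 numbered units

# Simple random sampling: every unit has an equal chance of selection
simple_random = random.sample(population, 10)

# Systematic sampling: pick a random starting point, then take every k-th unit
k = len(population) // 10
start = random.randrange(k)
systematic = population[start::k]

print(simple_random)
print(systematic)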

Terms in Sampling
Population: The entire collection of individuals about which we desire information.
[Examples: Michigan residents who will vote in the upcoming election; all robins born in
Michigan this year.] Note that there may be practical difficulties in identifying exactly
which individuals are in the population.
Parameter: A numerical summary (usually unknown but desired) based on the entire
population. [Examples: percent of Michigan residents who will vote for candidate A in the
upcoming election; mean birth weight of all robins in Michigan.]
Sample: A subset of the population from which data is actually collected. [Examples: 1000
Michigan residents who claim they will vote and answered a telephone survey; 250 robins
that are found and weighed within 2 hours of birth.]

Note: It is actually possible that the sample is not a subset of the population.
Although we want the sample to be a subset of the population, because it may be difficult to
determine the population (think about the voting example), we may sample some
individuals that aren't actually in the population (because they say they will vote but then
don't, for example).

Statistic: numerical summary based on data collected from the sample. [Examples:
percent of Michigan residents in a telephone survey who say they favor candidate A; mean
weight of a sample of new born robins.]
Population Distribution (of a variable): The value of a variable over a population can be
thought of as a random variable because the value of the variable depends on which
individual is selected. The probability distribution of this random variable is called the
population distribution.
Sampling Distribution (of a statistic): A statistic computed from a random sample (or in
a randomized experiment) is a random variable because the outcome depends on which
individuals are included in the sample. The probability distribution of the sample statistic is
called the sampling distribution.

Types of Samples & Sampling techniques/methods


a) Population



The collection of all units of a specified type in a given region at a particular point or period
of time is termed as a population or universe. Thus, we may consider a population of persons,
families, farms, cattle in a region or a population of trees or birds in a forest or a population of
fish in a tank etc. depending on the nature of data required.

b) Sampling Unit
Elementary units or group of such units which besides being clearly defined, identifiable
and observable, are convenient for purpose of sampling are called sampling units. For
instance, in a family budget enquiry, usually a family is considered as the sampling unit
since it is found to be convenient for sampling and for ascertaining the required
information. In a crop survey, a farm or a group of farms owned or operated by a
household may be considered as the sampling unit.

c) Sampling Frame
A list of all the sampling units belonging to the population to be studied with their
identification particulars or a map showing the boundaries of the sampling units is known
as sampling frame. Examples of a frame are a list of farms and a list of suitable area
segments like villages in India or counties in the United States. The frame should be up to
date and free from errors of omission and duplication of sampling units.

d) Random Sample
One or more sampling units selected from a population according to some specified
procedures are said to constitute a sample. The sample will be considered as random or
probability sample, if its selection is governed by ascertainable laws of chance. In other
words, a random or probability sample is a sample drawn in such a manner that each unit in
the population has a predetermined probability of selection. For example, if a population
consists of the N sampling units U1, U2,…,Ui,…,UN then we may select a sample of n
units by selecting them unit by unit with equal probability for every unit at each draw with
or without replacing the sampling units selected in the previous draws.

e) Non-random sample
A sample selected by a non-random process is termed as non-random sample. A Non-
random sample, which is drawn using certain amount of judgment with a view to getting a
representative sample is termed as judgment or purposive sample. In purposive sampling
units are selected by considering the available auxiliary information more or less
subjectively with a view to ensuring a reflection of the population in the sample. This type
of sampling is seldom used in large-scale surveys mainly because it is not generally
possible to get strictly valid estimates of the population parameters under consideration and
of their sampling errors due to the risk of bias in subjective selection and the lack of
information on the probabilities of selection of the units.

f) Sampling and Non-sampling error



The error arising due to drawing inferences about the population on the basis of
observations on a part (sample) of it is termed sampling error. The sampling error is non-
existent in a complete enumeration survey since the whole population is surveyed.
The errors other than sampling errors, such as those arising through non-response,
incompleteness and inaccuracy of response, are termed non-sampling errors and are likely to
be more wide-spread and important in a complete enumeration survey than in a sample
survey. Non-sampling errors arise due to various causes right from the beginning stage
when the survey is planned and designed to the final stage when the data are processed and
analyzed.
The sampling error usually decreases with increase in sample size (number of units
selected in the sample) while the non-sampling error is likely to increase with increase in
sample size. As regards the non-sampling error, it is likely to be more in the case of a
complete enumeration survey than in the case of a sample survey since it is possible to
reduce the nonsampling error to a great extent by using better organization and suitably
trained personnel at the field and tabulation stages in the latter than in the former.

Sampling techniques: Advantages and disadvantages


Technique: Simple random
Description: Random sample from the whole population.
Advantages: Highly representative if all subjects participate; the ideal.
Disadvantages: Not possible without a complete list of population members; potentially uneconomical to achieve; can be disruptive to isolate members from a group; the time-scale may be too long, so the data/sample could change.

Technique: Stratified random
Description: Random sample from identifiable groups (strata), subgroups, etc.
Advantages: Can ensure that specific groups are represented, even proportionally, in the sample(s) (e.g., by gender), by selecting individuals from strata lists.
Disadvantages: More complex, requires greater effort than simple random; strata must be carefully defined.

Technique: Cluster
Description: Random samples of successive clusters of subjects (e.g., by institution) until small groups are chosen as units.
Advantages: Possible to select randomly when no single list of population members exists, but local lists do; data collected on groups may avoid introduction of confounding by isolating members.
Disadvantages: Clusters in a level must be equivalent, and some natural ones are not for essential characteristics (e.g., geographic: numbers equal, but unemployment rates differ).

Technique: Stage
Description: Combination of cluster (randomly selecting clusters) and random or stratified random sampling of individuals.
Advantages: Can make up a probability sample by random selection at stages and within groups; possible to select a random sample of individuals when population lists are very localized.
Disadvantages: Complex; combines the limitations of cluster and stratified random sampling.

Technique: Purposive
Description: Hand-pick subjects on the basis of specific characteristics.
Advantages: Ensures balance of group sizes when multiple groups are to be selected.
Disadvantages: Samples are not easily defensible as being representative of populations due to potential subjectivity of the researcher.

Technique: Quota
Description: Select individuals as they come to fill a quota by characteristics proportional to populations.
Advantages: Ensures selection of adequate numbers of subjects with appropriate characteristics.
Disadvantages: Not possible to prove that the sample is representative of the designated population.

Technique: Snowball
Description: Subjects with desired traits or characteristics give names of further appropriate subjects.
Advantages: Possible to include members of groups where no lists or identifiable clusters even exist (e.g., drug abusers, criminals).
Disadvantages: No way of knowing whether the sample is representative of the population.

Technique: Volunteer, accidental, convenience
Description: Either asking for volunteers, or the consequence of not all those selected finally participating, or a set of subjects who just happen to be available.
Advantages: Inexpensive way of ensuring sufficient numbers for a study.
Disadvantages: Can be highly unrepresentative.

Sampling theory and Concept


Often we are interested in drawing some valid conclusions (inferences) about a large group
of individuals or objects (called population in statistics). Instead of examining (studying)
the entire group (population, which may be difficult or even impossible to examine), we
may examine (study) only a small part (portion) of the population (entire group of objects
or people). Our objective is to draw valid inferences about certain facts for the population
from results found in the sample, a process known as statistical inference. The process of
obtaining samples is called sampling, and the theory concerning sampling is called
sampling theory.

Example: We may wish to draw conclusions about the percentage of defective bolts
produced in a factory during a given 6-day week by examining 20 bolts each day produced



at various times during the day. Note that all the bolts produced during the week in this case
comprise the population, while the 120 bolts selected over the 6 days constitute a sample.

The sampling process comprises several stages:

• Defining the population of concern


• Specifying the sampling frame (set of items or events possible to measure)
• Specifying a sampling method for selecting the items or events from the sampling
frame
• Determining the appropriate sample size
• Implementing the sampling plan
• Sampling and data collecting
• Data which can be selected

When studying the characteristics of a population, there are many reasons to study a sample
(drawn from the population under study) instead of the entire population, such as:

1. Time: as it is difficult to contact each and every individual of the whole population
2. Cost: The cost or expenses of studying all the items (objects or individual) in a
population may be prohibitive
3. Physically Impossible: Some populations are infinite, so it will be physically
impossible to check all the items in the population, such as populations of fish,
birds, snakes, or mosquitoes. Similarly, it is difficult to study populations that are
constantly moving, being born, or dying.
4. Destructive Nature of items: Some items or objects are difficult to study because
they are destroyed during testing (or checking); for example, a steel wire is stretched
until it breaks and the breaking point is recorded to determine its minimum tensile
strength. Similarly, different electric and electronic components are checked and
destroyed during testing, so time, cost and the destructive nature of such items make
it impossible to study the entire population.
5. Qualified and expert staff: For enumeration purposes, highly qualified and expert
staff is required, which is sometimes impossible to obtain. National and international
research organizations and agencies hire staff for enumeration purposes, which is
costly, needs more time (as rehearsal of the activity is required), and it is sometimes
not easy to recruit or hire highly qualified staff.
6. Reliability: Using a scientific sampling technique, the sampling error can be
minimized, and the non-sampling error committed in the case of a sample survey is
also minimal, because qualified investigators are included.

Every sampling system is used to obtain some estimates having certain properties of the
population under study. The sampling system should be judged by how good the estimates
obtained are. Individual estimates, by chance, may be very close or may differ greatly from
the true value (population parameter) and may give a poor measure of the merits of the
system.



A sampling system is better judged by frequency distribution of many estimates obtained
by repeated sampling, giving a frequency distribution having small variance and mean
estimate equal to the true value.

Standard error
Standard Error is a method of measurement or estimation of the standard deviation of the sampling
distribution associated with an estimation method. The formula to calculate the Standard Error of
the mean is:

Standard Error Formula: SEx = s / √n

where
SEx = Standard Error of the Mean
s = Standard Deviation of the sample
n = Number of Observations in the Sample

Standard Error Example:

X = 10, 20, 30, 40, 50
Total Inputs (N) = 5

To find the Mean:
Mean (xm) = (x1 + x2 + x3 + ... + xn) / N = 150 / 5 = 30
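Carrying the example through to the standard error, a short Python sketch is shown below. It assumes the sample (n - 1) form of the standard deviation, which is what the statistics module computes by default.

# Completing the example: SE = s / sqrt(n) for X = 10, 20, 30, 40, 50.
import math
import statistics

x = [10, 20, 30, 40, 50]
s = statistics.stdev(x)                  # sample standard deviation, about 15.81
se = s / math.sqrt(len(x))               # standard error of the mean, about 7.07
print(statistics.mean(x), round(s, 2), round(se, 2))   # 30 15.81 7.07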

Sampling Distribution of Difference between Means


Statistical analyses are very often concerned with the difference between means. A typical
example is an experiment designed to compare the mean of a control group with the mean
of an experimental group. Inferential statistics (The branch of statistics concerned with
drawing conclusions about a population from a sample. This is generally done through
random sampling, followed by inferences made about central tendency, or any of a number



of other aspects of a distribution.) used in the analysis of this type of experiment depend on
the sampling distribution of the difference between means.

The sampling distribution of the difference between means can be thought of as the
distribution that would result if we repeated the following three steps over and over again:
(1) sample n1 scores from Population 1 and n2 scores from Population 2, (2) compute the
means of the two samples (M1 and M2), and (3) compute the difference between means, M1
- M2. The distribution of the differences between means is the sampling distribution of the
difference between means.

As you might expect, the mean of the sampling distribution of the difference between
means is:

μ(M1 - M2) = μ1 - μ2

which says that the mean of the distribution of differences between sample means is equal
to the difference between population means. For example, say that the mean test score of
all 12-year-olds in a population is 34 and the mean of 10-year-olds is 25. If numerous
samples were taken from each age group and the mean difference computed each time, the
mean of these numerous differences between sample means would be 34 - 25 = 9.

From the variance sum law, we know that:

σ²M1-M2 = σ²M1 + σ²M2

which says that the variance of the sampling distribution of the difference between means
is equal to the variance of the sampling distribution of the mean for Population 1 plus the
variance of the sampling distribution of the mean for Population 2. Recall the formula for
the variance of the sampling distribution of the mean:

σ²M = σ²/N

Since we have two populations and two samples sizes, we need to distinguish between the
two variances and sample sizes. We do this by using the subscripts 1 and 2. Using this
convention, we can write the formula for the variance of the sampling distribution of the
difference between means as:

σ²M1-M2 = σ1²/n1 + σ2²/n2

Since the standard error of a sampling distribution is the standard deviation of the sampling
distribution, the standard error of the difference between means is:

σM1-M2 = √(σ1²/n1 + σ2²/n2)



Just to review the notation, the symbol on the left contains a sigma (σ), which means it is a
standard deviation. The subscripts M1 - M2 indicate that it is the standard deviation of the
sampling distribution of M1 - M2.

Now let's look at an application of this formula. Assume there are two species of green
beings on Mars. The mean height of Species 1 is 32 while the mean height of Species 2 is
22. The variances of the two species are 60 and 70, respectively and the heights of both
species are normally distributed. You randomly sample 10 members of Species 1 and 14
members of Species 2. What is the probability that the mean of the 10 members of Species
1 will exceed the mean of the 14 members of Species 2 by 5 or more? Without doing any
calculations, you probably know that the probability is pretty high since the difference in
population means is 10. But what exactly is the probability?

First, let's determine the sampling distribution of the difference between means. Using the
formulas above, the mean is:

μM1-M2 = 32 - 22 = 10

The standard error is:

σM1-M2 = √(60/10 + 70/14) = √11 = 3.317

The sampling distribution is shown in Figure 1. Notice that it is normally distributed with a
mean of 10 and a standard deviation of 3.317. The area above 5 is shaded blue.

Figure. The sampling distribution of the difference between means.

The last step is to determine the area that is shaded blue. Using either a Z table or the
normal calculator, the area can be determined to be 0.934. Thus the probability that the
mean of the sample from Species 1 will exceed the mean of the sample from Species 2 by 5
or more is 0.934.
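
This result can be reproduced with a few lines of Python (a sketch; statistics.NormalDist from the standard library is used here only as a normal-probability calculator, and the figures are those of the Mars example above):

import math
from statistics import NormalDist

mu = 32 - 22                       # mean of the sampling distribution = 10
se = math.sqrt(60 / 10 + 70 / 14)  # standard error = sqrt(6 + 5) = 3.317
p = 1 - NormalDist(mu, se).cdf(5)  # P(M1 - M2 >= 5)
print(round(se, 3), round(p, 3))   # 3.317 0.934

The same sketch, with the girls/boys figures (mean -10, standard deviation 4, cutoff 0), reproduces the 0.0062 probability in the next example.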



As shown below, the formula for the standard error of the difference between means is
much simpler if the sample sizes and the population variances are equal. When the
variances and sample sizes are the same, there is no need to use the subscripts 1 and 2 to
differentiate these terms:

σM1-M2 = √(2σ²/n)

This simplified version of the formula can be used for the following problem: The mean
height of 15-year-old boys (in cm) is 175 and the variance is 64. For girls, the mean is 165
and the variance is 64. If eight boys and eight girls were sampled, what is the probability
that the mean height of the sample of girls would be higher than the mean height of the
sample of boys? In other words, what is the probability that the mean height of girls minus
the mean height of boys is greater than 0?

As before, the problem can be solved in terms of the sampling distribution of the difference
between means (girls - boys). The mean of the distribution is 165 - 175 = -10. The standard
deviation of the distribution is:

σM1-M2 = √(2 × 64/8) = √16 = 4

A graph of the distribution is shown in Figure 2. It is clear that it is unlikely that the mean
height for girls would be higher than the mean height for boys since in the population boys
are quite a bit taller. Nonetheless it is not inconceivable that the girls' mean could be higher
than the boys' mean.

Figure. Sampling distribution of the difference between mean heights.

A difference between means of 0 or higher is a difference of 10/4 = 2.5 standard deviations
above the mean of -10. The probability of a score 2.5 or more standard deviations above
the mean is 0.0062.



CHAPTER 14: FINANCIAL MATHEMATICS
Simple vs. Compound Interest Calculation
Interest is the charge paid by a borrower for the use of money; the same amount is the income
earned by the lender. The amount invested (for example, deposited in a bank) in order to earn
interest is called the principal. The interest rate is normally expressed as a percentage and
represents the interest earned per $100 of principal in a specified time, usually a year. Simple
interest and compound interest are the two types of interest, based on the way they are
calculated.

Simple Interest
Simple interest is charged only on the principal amount. The following formula can be
used to calculate simple interest: Simple Interest (Is) = P × i × t

Where,
P is the principal amount; i is the interest rate per period; t is the time for which the money is
borrowed or lent.

Example 1
Suppose $1,000 were invested on January 1, 2010 at 10% simple interest rate for 5 years.
Calculate the total simple interest on the amount.

Solution
We have,

Principal P = $1,000

Interest Rate i = 10% per year

Time t = 5 years

Simple Interest Is = $1,000 × 0.1 × 5 = $500

Compound Interest
Compound interest is charged on the principal plus any interest accrued till the point of
time at which interest is being calculated. In other words, compound interest system works
as follows:

1. Interest for the first period is charged on the principal amount.

2. For the second period, it is charged on the sum of the principal amount and the interest
charged during the first period.
3. For the third period, it is charged on the sum of the principal amount and the interest
charged during the first and second periods, and so on ...

It can be proved mathematically, that the interest calculated as per above procedure is given
by the following formula:

Compound Interest (Ic) = P × (1 + i)^n – P

Where,
P is the principal amount; i is the compound interest rate per period; n is the number of periods.

When the interest is compounded n times per year, the future value S is given by:

S = P(1 + r/n)^(nt)

where
S = future value
P = original principal
r = annual interest rate (in decimal form)
n = number of times per year the interest is compounded
t = number of years

Example 1
Consider the same information as given in Example 1 for simple interest above. Now calculate
the total compound interest on the amount invested.

Solution
We have,

Principal P = $1,000

Interest Rate i = 10% per year

No. of Periods n = 5

Compound Interest Ic = $1,000 × ( 1 + 0.1 )^5 − $1,000

= $1,000 × 1.1^5 − $1,000

= $1,000 × 1.61051 − $1,000

= $1,610.51 − $1,000 = $610.51

Example 2
Jack started his account with $1000.00 at a rate of 8%. It was compounded once a year, and
we watched his account over a five year period.



S=($1000.00)(1+0.08/1)^(1)(5)

S=($1000.00)(1+0.08)^(5)

S=$1469.33
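
The interest examples above can be verified with a short Python sketch (the helper names are illustrative only):

def simple_interest(P, i, t):
    # Is = P * i * t
    return P * i * t

def compound_interest(P, i, n):
    # Ic = P * (1 + i)**n - P
    return P * (1 + i) ** n - P

def future_value(P, r, n, t):
    # S = P * (1 + r/n)**(n*t)
    return P * (1 + r / n) ** (n * t)

print(simple_interest(1000, 0.10, 5))              # 500.0
print(round(compound_interest(1000, 0.10, 5), 2))  # 610.51
print(round(future_value(1000, 0.08, 1, 5), 2))    # 1469.33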

Concepts of sinking fund


A sinking fund is a type of fund that is set up by a business in order to retire debt. A
company will put money into the sinking fund and then periodically use it to pay off
certain debts of the company. Here are the basics of the sinking fund and how it works.

Sinking Fund
This type of fund is most commonly used when companies issue corporate bonds. When a
company issues corporate bonds, it has to make regular interest payments to the
bondholders. Most of the time, companies can afford to make these small regular payments
to the bondholders. However, when it comes time to pay back the principal of the bond,
they may not have enough cash on hand. The purpose of the sinking fund is to accumulate
enough cash so that a company will be able to repay the debt at the end of the bond term.
Many times, as companies accumulate extra cash in the sinking fund, they will go ahead
and purchase some of the bonds in advance of maturation.

Impact on investors
If you are a bond holder in a company that has a sinking fund, it could potentially affect
your investment. With this type of fund, there is the chance that your investment could end
at any point. The company could decide to purchase your bond back from you anytime
without notice. If you were counting on the interest from this type of investment, you could
potentially lose it at any point.

Types of Sinking Fund


There are four different types of sinking funds that a company could choose to have. Even
though all of the different types of sinking fund are similar, there are a few key differences
between them.

1) The first type of sinking fund sets out to purchase a specific amount of bonds back over the
course of a calendar year. Every year, they will try to buy the same amount of bonds as
long as they can accumulate enough cash to do so.
2) Another type of sinking fund utilizes callable bonds. With this type of arrangement, the
company has a specific call price that it can purchase the bonds back at.
3) The third type of sinking fund utilizes an option of how the bond is purchased back. The
company can buy it back from the bond holder at one of two prices. They could decide to
purchase it at the market price. They could also decide to purchase it back at the sinking
fund price. The company will be able to purchase the bonds back at the lower of the two
prices.
4) The fourth type of sinking fund makes regular payments to a trustee that hangs onto the
money on behalf of the company. The value of the asset continues to increase until it
matches the amount of the outstanding bonds. This strategy is not as common as the others,
but it does happen in some cases.

Key Points
• A sinking fund provision in a corporate bond indenture requires a certain portion of the
issue to be retired periodically.
• A sinking fund reduces credit risk but presents reinvestment risk to bondholders.
• For the creditors, the fund reduces the risk the organization will default when the
principal is due: it reduces credit risk. However, if the bonds are callable, this comes at
a cost to creditors, because the organization has an option on the bonds.

Related Terms
• Debentures - A debenture is a document that either creates a debt or acknowledges it, and it
is a debt without collateral.
• Call provision - the right of the issuer to buy back the bond at a predetermined price at a
certain time in the future.
• Preferred Stock - Stock with a dividend, usually fixed, that is paid out of profits before any
dividend can be paid on common stock. It also has priority to common stock in liquidation.

NOTE: Sinking funds are commonly used by companies in order to set aside enough
money to pay off the bonds that they have issued. This type of fund carries with it some
advantages and disadvantages for investors. Here are a few of the pros and cons of sinking
funds.

Pros
The basic idea behind a sinking fund is that companies are trying to address their debt in
advance. Instead of waiting for all of the bonds that have been issued to mature, they are
going to set aside a certain amount of money into the sinking fund each year. They will use
some of the money in the sinking fund to purchase a few of the bonds early.

As an investor, it is good to know that the company is going to be able to address its
debt. You do not want to be a bond holder in a company that cannot afford to pay back
the bonds that it issued. If this is the case, you may not be able to get your initial
investment in the company back.

Investors like to see sinking funds in the companies that they are planning on investing in.
This provides some peace of mind to the investors because they know that the company is
not going to go under anytime soon. When a company lets its debt get out of control, it
starts to become a much less attractive investment. No one wants to put money into a
company that looks like it stands the risk of becoming insolvent in the near future.

Cons
The sinking fund also provides a few disadvantages for investors as well. When a company
utilizes a sinking fund, they are going to periodically use the money to purchase some of
the bonds early. If you are an investor that owns one of the bonds that is being purchased, it
means that you are going to be giving up your interest payments.

Another problem with this type of fund is that the company has the right to purchase the
bonds at a discount. Most of the time, they can purchase the bonds at the par value. Many
times, companies will wait until interest rates go down so that the values of the bonds will
increase.

At that point, they will purchase several bonds at the par value which would actually be
less than what they would have to pay for them in the market.

If you are a bond holder in this situation, you potentially stand to lose money. If
you would have sold your bond in the secondary market shortly before it was purchased by
the company, you could have made a greater profit.

This situation creates a lot of uncertainty for investors. You never really know when your
bonds are going to be purchased back from the company and your investment will end.
Because of this, it can sometimes be difficult to justify investing in these bonds.

The Future Value and Present Value of an Annuity


Understanding annuities is crucial for understanding loans, and investments that require or
yield periodic payments. For instance, how much of a mortgage can I afford if I can only
pay $1,000 monthly? How much money will I have in my IRA account if I deposit $2,000
at the beginning of each year for 30 years, and the account earns an annual interest rate of 5%,
compounded daily?

An annuity is a series of equal payments in equal time periods. Usually, the time period is
1 year, which is why it is called an annuity, but the time period can be shorter, or even
longer. These equal payments are called the periodic rent. The amount of the annuity is
the sum of all payments.

An annuity due is an annuity where the payments are made at the beginning of each time
period; for an ordinary annuity, payments are made at the end of the time period. Most
annuities are ordinary annuities.

Analogous to the future value and present value of a dollar, which is the future value and
present value of a lump-sum payment, the future value of an annuity is the value of
equally spaced payments at some point in the future. The present value of an annuity is
the present value of equally spaced payments in the future.

The Future Value of an Annuity


The future value of an annuity is simply the sum of the future value of each payment. The
equation for the future value of an annuity due is the sum of the geometric sequence:
FVAD = A(1 + r)^1 + A(1 + r)^2 + ... + A(1 + r)^n.

The equation for the future value of an ordinary annuity is the sum of the geometric
sequence:
FVOA = A(1 + r)^0 + A(1 + r)^1 + ... + A(1 + r)^(n-1).

Without going through an extensive derivation, just note that the future value of an annuity
is the sum of the geometric sequences shown above, and these sums can be simplified to
the following formulas, where A = the annuity payment or periodic rent, r = the interest
rate per time period, and n = the number of time periods.

The future value of an ordinary annuity (FVOA) is:

FVOA = A × [(1 + r)^n - 1] / r

And the future value of an annuity due (FVAD) is:

FVAD = A × [(1 + r)^n - 1] / r × (1 + r)

Note that the difference between FVAD and FVOA is:

FVAD - FVOA = A × [(1 + r)^n - 1]
In other words, the difference is merely the interest earned in the last compounding period.
Because payments of an ordinary annuity are made at the end of the period, the last
payment earns no interest, while the last payment of an annuity due earns interest during
the last compounding period.



Examples
Example — Calculating the Amount of an Ordinary Annuity

If at the end of each month, a saver deposited $100 into a savings account that paid 6%
compounded monthly, how much would he have at the end of 10 years?

A = $100
r = 6% per year compounded monthly, which = .5% interest per month = .005
n = the number of compounding time periods = 120 in 10 years

Substituting these values into the equation for the future value of an ordinary annuity:

100 × ((1 + .005)^120 - 1)/.005 = $16,387.93

Example — Calculating the Amount of an Annuity Due


If the saver deposited the money at the beginning of the month instead of the end, then there
will be an additional amount of money = A(1 + r)^n - A = 100(1.005)^120 - 100 = $81.94, which
is the difference in this example between an annuity due and an ordinary annuity.
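
Both savings examples can be checked with a small sketch (the helper names below are ours):

def fv_ordinary_annuity(A, r, n):
    # FVOA = A * ((1 + r)**n - 1) / r
    return A * ((1 + r) ** n - 1) / r

def fv_annuity_due(A, r, n):
    # FVAD = FVOA * (1 + r): every payment earns one extra period of interest
    return fv_ordinary_annuity(A, r, n) * (1 + r)

fvoa = fv_ordinary_annuity(100, 0.005, 120)
print(round(fvoa, 2))                                    # 16387.93
print(round(fv_annuity_due(100, 0.005, 120) - fvoa, 2))  # 81.94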

Example — Calculating the Annuity Payment, or the Periodic Rent

A 20 year old wants to retire as a millionaire by the time she turns 70. (With life spans
increasing, and the social security fund being depleted by baby boomers, the retirement age
will have invariably risen by the time she reaches 65 years of age, probably to something
even higher than 70, actually.) How much will she have to save at the end of each month if
she can earn 5% compounded annually, tax-free, to have $1,000,000 by the time she is 70?

Solution: Note that the equation for the future value of an annuity consists of 3 independent
variables, and 1 dependent variable. In other words, if we know the value of 3 of the
variables, then we can determine the remaining variable.

Since r = 5% = .05, and n = 50, the interest factor ((1 + r)^n - 1)/r = (1.05^50 - 1)/.05 = 209.35,
rounded to 2 decimal places. To find A, we divide both sides of the equation for the future
value of an annuity by this interest factor, which yields A = 1,000,000/209.35 = $4,776.69. So
she would have to save $4,776.69 per year, or about $398.06 per month, to have
$1,000,000 in 50 years—assuming, of course, that she could save it tax-free!

Of course, using the formula for the present value of a dollar, we find that in 50 years,
assuming 3% inflation, $1,000,000 will be worth about 1,000,000/1.03^50 = $228,107.08!
Ouch!

Since the current limit for IRA contributions is $2,000 per year for a young person, how
much will this earn after 50 years, assuming that the $2,000 is deposited at the end of the
year? FVOA = 2,000 × (1.05^50 - 1)/.05 = $418,695.99.

What's that in today's dollars, assuming 3% inflation? 418,695.99/1.03^50 = $95,507.52!


Clearly, the IRA contribution limits must be raised substantially. Of course, you can save all
of the money at the beginning of each year instead of at the end, and this annuity due will
yield an extra (using the Annuity Difference Formula above) 2,000 × 1.05^50 - 2,000 =
$20,934.80 which, in today's dollars, again assuming a 3% inflation rate, =
$20,934.80/1.03^50 = $4,775.38 more money in today's dollars over the ordinary annuity, but
clearly, you'll still be eating dog food when you retire with this amount of cash, unless you
are planning to die early! With the limitations on IRAs, stocks are the only viable choice for
investments that could possibly yield anything decent to retire on!

The Present Value of an Annuity


The present value of an annuity (PVA) is the sum of the present value of each annuity
payment. Since the present value of a lump sum payment is simply the future value of that
payment divided by the interest factor (1 + r)^n, the present value of an annuity is the sum of
the present value of each of those payments:

PVA = A/(1 + r)^1 + A/(1 + r)^2 + ... + A/(1 + r)^n

The sum of this geometric progression can be simplified to:

PVA = A × (1 - (1 + r)^-n) / r



Examples

Example — Calculating the Present Value of an Annuity

You win a $1,000,000 lottery, which is paid in annual installments of $50,000 over 20 years.
How much did you really win, assuming that you could earn 5% interest, compounded
annually?

Solution: Since you are not receiving the full $1,000,000 payment right away, but in the form
of an annuity, its actual worth is much less.
Present Value of Annuity = 50,000 × (1 - (1 + .05)^-20)/.05 = $623,110.52
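
A quick check of this result (a sketch, with pv_annuity as an illustrative helper name):

def pv_annuity(A, r, n):
    # PVA = A * (1 - (1 + r)**(-n)) / r
    return A * (1 - (1 + r) ** (-n)) / r

print(round(pv_annuity(50_000, 0.05, 20), 2))   # 623110.52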

Example — How Much of a Loan Can you afford?

You want to get a mortgage, but can only afford to pay $1,000 per month. How much can you
borrow, if the interest rate is 5% annually for a 30 year mortgage?

Solution: The monthly payments constitute an annuity, whose present value is the amount of
the loan.
Loan Amount = 1,000 × (1 - (1 + .004166667)^-360)/.004166667 = $186,281.62

r = the monthly interest rate = .05/12 = .004166667
n = the number of months in 30 years = 12 × 30 = 360

Math Reminder: y^-x = 1/y^x.

Example — Calculating Monthly Mortgage payments

You want to borrow $200,000 to buy a house. What are the monthly mortgage payments if
the interest rate is 6% for 30 years?

Solution: In the above example, we asked how much one would have to save per month or
per year to have $1,000,000 in 50 years. In other words, what periodic payments would we
have to make to have a future value of $1,000,000? Here, we take out a loan, and thus, we
already have the money, whose present value, or discounted value, is equal to the amount
of the loan. The monthly payment would be the annuity payment, A. Thus, we use the
equation for the present value, because the present value is already known, and what we
need to know is how much are the payments going to be if the length of the loan is 30
years, and the interest rate is 6% annually.

Because we know 3 of the 4 variables, but not A, the monthly payment, we solve for A by
dividing both sides of the present value of annuity equation by the factor (1 - (1 + r)^-n)/r,
but note that to divide by a fraction is the same as multiplying the numerator by the inverse
of the fraction, and so, we can simplify further:

A = PV × r / (1 - (1 + r)^-n)

With PV = $200,000, r = .06/12 = .005 and n = 360, the monthly payment is
A = 200,000 × .005 / (1 - 1.005^-360) ≈ $1,199.10.
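
A short sketch of the rearranged formula (assuming the loan details above; the function name is ours):

def annuity_payment(PV, r, n):
    # A = PV * r / (1 - (1 + r)**(-n))
    return PV * r / (1 - (1 + r) ** (-n))

r = 0.06 / 12   # monthly rate
n = 12 * 30     # 360 monthly payments
print(round(annuity_payment(200_000, r, n), 2))   # about 1199.10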



Calculating the Interest rate
We end our discussion on annuities by noting that r cannot be solved algebraically in the
formula for the present value of annuities, so, even if we know the annuity payment, the
number of time periods, and the present value, we can only estimate r. It is possible to
estimate r either by plugging in values with guesses, by looking it up in special tables that
plot r against the annuity payment A, or by using a graphing calculator, and graphing the
value of the annuity payment as a function of interest for a given present value. In the latter
case, the interest rate is where the line representing the rate of interest intersects the line for
the annuity payment.
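
The guess-and-check approach described above can be automated with a simple bisection search on r; a sketch (the bracketing interval used below is an assumption that covers ordinary per-period loan rates):

def pv_annuity(A, r, n):
    return A * (1 - (1 + r) ** (-n)) / r

def solve_rate(A, n, target_pv, lo=1e-9, hi=1.0, tol=1e-10):
    # The present value of an annuity falls as r rises, so bisection works
    # as long as the true rate lies inside [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pv_annuity(A, mid, n) > target_pv:
            lo = mid   # PV still too high, so the rate must be higher
        else:
            hi = mid
    return (lo + hi) / 2

# e.g. what monthly rate makes 360 payments of $1,000 worth $186,281.62 today?
print(round(solve_rate(1000, 360, 186_281.62) * 12, 4))   # about 0.05, i.e. 5% per year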
Net Present Value and Internal Rate of Return
The present value of an annuity can be easily calculated because it consists of periodic
payments of equal amounts. However, many times the payments are not equal in amount,
and time intervals between payments may differ, in which case the present value of an
annuity must be calculated by summing the present value of each payment. These unequal
payments are sometimes referred to as a mixed stream:

Present Value of a Mixed Stream = Sum of the Present Value of Each Payment

Additionally, many business investments consist of both cash inflows and cash outflows.
When a business wants to make an investment, one of the main factors in determining
whether the investment should be made is to consider its return on investment. In most
cases, not only will cash flows be uneven, but some of the cash flows will be received and
some will be paid out. Additionally, some of the cash flows will be uncertain, and the
taxation of some of the transactions could also have an effect on the present value of the
inflows and outflows of the investment, especially over an extended period.

To decide whether to make a business investment, the business calculates what is called the
net present value (NPV) of the investment, which is the sum of the present value of all cash
inflows minus the sum of the present value of the cash outflows, including the cost of the
investment, using a discount rate (DR) that is judged to be a required rate of return. If
the NPV is positive, then the investment is considered worthwhile. The NPV can also be
calculated for a number of investments to see which investment yields the greatest return.

Net Present Value = Sum of Present Value of Cash Inflows – Sum of Present Value of
Cash Outflows

In the capital budgeting of long-term investments in business, the required rate of return is
called the hurdle rate or the discount rate, and should be equal to or greater than the



incremental cost of capital (aka marginal cost of capital), which is the weighted average
of costs to issue debt or equity to finance the investment.

Closely related to the net present value is the internal rate of return (IRR), calculated by
setting the net present value to 0, then calculating the discount rate that would return that
result. If the IRR ≥ required rate of return, then the project is worth investing in.

• If IRR ≥ DR, then invest.


• If IRR < DR, then forget about it.
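
A minimal sketch of both calculations (the cash-flow figures are made up purely for illustration, and the IRR search assumes the NPV changes sign only once on the interval):

def npv(rate, cash_flows):
    # cash_flows[0] is the time-0 flow (usually the negative investment cost)
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

def irr(cash_flows, lo=-0.99, hi=10.0, tol=1e-9):
    # Bisection on the discount rate that makes NPV = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if npv(mid, cash_flows) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

flows = [-10_000, 3_000, 4_000, 5_000, 2_000]   # hypothetical project
print(round(npv(0.10, flows), 2))   # NPV at a 10% discount rate
print(round(irr(flows), 4))         # discount rate at which NPV = 0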

Discount Factor
The factor used to convert a future cash flow into its present value, based on the percentage rate (the discount rate) required by the investor.

Discount Factor Calculation


The discount factor is calculated in the following way, where P(T) is the discount factor, r
the discount rate, and T the number of discretely compounded periods:

P(T) = 1 / (1 + r)^T

Example: If the discount rate is 10%, how do we calculate the discount factors for the next 5
years?

Discount factor:
year 1: 1.1^-1 = .90909
year 2: 1.1^-2 = .82645
year 5: 1.1^-5 = .62092

Just change the power according to the year.

Cash flow
Cash flow is the money that comes in and goes out of a company. It is the generation of
income and the payment of expenses. Cash inflows result from either the generation of
revenue through the selling of goods and services, money borrowed, or money earned
through investments.

If more cash is coming into the company than leaving the company, you are experiencing
positive cash flow. But if more cash is leaving the company than coming into the
company, then you are experiencing negative cash flow. Keep in mind that just because
you are experiencing negative cash flow for the moment doesn't mean you are going to
suffer a loss, because cash flow is dynamic. Cash flow is reported on the company's cash
flow statement, which is also called a statement of cash receipts and disbursements.

Formulas
Free cash flow (FCF) measures how much cash you generate after taking into account
capital expenditures for such things as buildings, equipment and machinery. The formula is:

FCF = Operating Cash Flow - Capital Expenditures

Example: Let's say that your company generated $12,000,000 in operating cash flow last year.
When you add up all the capital expenses paid for your factory, equipment and machinery, it
totals $4,000,000. Now, let's figure out the FCF:

FCF = Operating Cash Flow - Capital Expenditures

FCF = $12,000,000 - $4,000,000

FCF = $8,000,000

Operating Cash Flow (OCF) is the measure of your company's ability to generate positive
cash flow from its core business activities. Here's the formula:

OCF = Earnings Before Interest And Taxes + Depreciation + Amortization - Taxes

Let's take a closer look at the equation. Earnings before interest and taxes (EBIT) is the
revenue left over after subtracting the cost of production, selling, general expenses and
administrative expenses. It's a measure of your operating profit before interest and taxes
are deducted. Depreciation is an accounting practice where you deduct the cost of a
tangible capital asset, such as machinery or real estate, over a period of time, while
amortization is where you deduct the cost of an intangible capital asset, such as a patent,
over a period of time.
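
Both formulas translate directly into code; a sketch using the FCF figures above plus hypothetical EBIT, depreciation, amortization and tax figures for OCF:

def free_cash_flow(operating_cash_flow, capital_expenditures):
    # FCF = Operating Cash Flow - Capital Expenditures
    return operating_cash_flow - capital_expenditures

def operating_cash_flow(ebit, depreciation, amortization, taxes):
    # OCF = EBIT + Depreciation + Amortization - Taxes
    return ebit + depreciation + amortization - taxes

print(free_cash_flow(12_000_000, 4_000_000))                        # 8000000
print(operating_cash_flow(5_000_000, 800_000, 200_000, 1_500_000))  # 4500000 (hypothetical inputs)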

Discounted cash flow


A method of assessing investments taking into account the expected accumulation of
interest

Discounted cash flow (DCF) analysis is the process of calculating the present value of an
investment's future cash flows in order to arrive at a current fair value estimate for the
investment.

How it works/Example:
The formula for discounted cash flow analysis is:

DCF = CF1/(1+r)^1 + CF2/(1+r)^2 + CF3/(1+r)^3 + ... + CFn/(1+r)^n



Where:
CF1 = cash flow in period 1
CF2 = cash flow in period 2
CF3 = cash flow in period 3
CFn = cash flow in period n
r = discount rate (also referred to as the required rate of return)

The discount factor is calculated using the formula (1 + r)^-n.


A discount factor can be thought of as a conversion factor for time value of money
calculations. The discount factor table below provides both the mathematical formulas and
the Excel functions used to convert between present value (P), future worth (F), uniform
gradient amount (G), and uniform series or annuity amount (A).

The discounting principle states that if we want to have $F in n years, we need to invest $P
right now. So, discounting is basically just the inverse of compounding: $P = $F*(1+i)^-n.

The discount formula can be written as P=F*(P/F,i%,n), where (P/F,i%,n) is the symbol
used to define the discount factor. To convert the future value to the equivalent present
value, you simply multiply the future value by the discount factor.

The Discount Rate, i%, used in the discount factor formulas is the effective rate per
period. It uses the same basis for the period (annual, monthly, etc.) as used for the number
of periods, n. If only a nominal interest rate (rate per annum or rate per year) is known,
you can calculate the discount rate using the following formula:

i = (1 + r/k)^(k/p) - 1

Where
• r = nominal annual interest rate
• k = number of compounding periods per year
• p = number of periods per year corresponding to the basis for n

Discount Factor Table for Discrete Compounding


The following table lists discount factors used for conversions between common discrete
cash flow series, present value, future worth, etc. The { } braces around the Excel formula
indicate that the formula must be entered as an array function using Ctrl+Shift+Enter.

Example: To convert F to P, multiply F by the discount factor (P/F,i%,n). It might help to


think of "P/F" as "P given F". This representation comes from the algebraic equivalence
P=F*(P/F).

Nomenclature
i   Discount Rate (effective rate per period)
n   Number of Periods
P   Present Worth
F   Future Worth
A   Uniform Series Amount (or "Annuity")
G   Uniform Gradient Amount

Convert    Symbol         Discount Factor Formula                      Formula in Excel
P to F     (F/P,i%,n)     (1+i)^n                                      =FV(i,n,0,-1)
F to P     (P/F,i%,n)     (1+i)^-n                                     =PV(i,n,0,-1)
F to A     (A/F,i%,n)     i/((1+i)^n-1)                                =PMT(i,n,0,-1)
P to A     (A/P,i%,n)     i*(1+i)^n/((1+i)^n-1)                        =PMT(i,n,-1)
A to F     (F/A,i%,n)     ((1+i)^n-1)/i                                =FV(i,n,-1)
A to P     (P/A,i%,n)     ((1+i)^n-1)/(i*(1+i)^n)                      =PV(i,n,-1)
G to P     (P/G,i%,n)     ((1+i)^n-1)/(i^2*(1+i)^n) - n/(i*(1+i)^n)    {=NPV(i,(ROW(INDIRECT("1:"&n))-1))}
G to F     (F/G,i%,n)     ((1+i)^n-1)/i^2 - n/i                        {=(P/G,i%,n) * (F/P,i%,n)}
G to A     (A/G,i%,n)     (1/i) - n/((1+i)^n-1)                        {=(P/G,i%,n) * (A/P,i%,n)}
EG to P    (P/EG,z-1,n)   (z^n-1)/(z^n*(z-1)), z=(1+i)/(1+g)           =PV(z-1,n,-1)

The Excel formulas for (F/G,i%,n) and (A/G,i%,n) are based on the algebraic equivalence
of F/G=(P/G)*(F/P) and A/G=(P/G)*(A/P). Replace the discount factor symbols (P/G,i
%,n), (F/P,i%,n) and (A/P,i%,n) with the appropriate discount factor formula listed in the
table.
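
The single-payment and uniform-series factors in the table can also be written as small Python functions (a sketch showing only the most commonly used factors; the function names mirror the (X/Y,i%,n) symbols):

def F_over_P(i, n):   # P to F: (1+i)^n
    return (1 + i) ** n

def P_over_F(i, n):   # F to P: (1+i)^-n
    return (1 + i) ** (-n)

def A_over_P(i, n):   # P to A: i*(1+i)^n/((1+i)^n-1)
    return i * (1 + i) ** n / ((1 + i) ** n - 1)

def P_over_A(i, n):   # A to P: ((1+i)^n-1)/(i*(1+i)^n)
    return ((1 + i) ** n - 1) / (i * (1 + i) ** n)

# e.g. the present worth of $1 per period for 20 periods at 5%:
print(round(P_over_A(0.05, 20), 4))   # about 12.4622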

Discount Factors for Continuous Compounding


Continuous compounding is not exactly the same as daily compounding. The exact
discount factor formulas for continuous compounding are given in the table below (where



n is the number of years and r is the nominal annual rate). Note that the discount factor for
F to P is just the inverse (1/x) of the factor for P to F.
Convert    Symbol         Discount Factor Formula            ... in Excel
F to P     (P/F,r%,n)     e^(-r*n) = 1/e^(r*n)               =1/EXP(r*n)
P to F     (F/P,r%,n)     e^(r*n)                            =EXP(r*n)
F to A     (A/F,r%,n)     (e^r - 1)/(e^(r*n) - 1)            =(EXP(r)-1)/(EXP(r*n)-1)
A to F     (F/A,r%,n)     (e^(r*n) - 1)/(e^r - 1)            =(EXP(r*n)-1)/(EXP(r)-1)
P to A     (A/P,r%,n)     (e^r - 1)/(1 - e^(-r*n))           =(EXP(r)-1)/(1-1/EXP(r*n))
A to P     (P/A,r%,n)     (1 - e^(-r*n))/(e^r - 1)           =(1-1/EXP(r*n))/(EXP(r)-1)

Inventory control system


An inventory control system is a process for managing and locating objects or materials.

Inventory Control is the supervision of supply, storage and accessibility of items in order
to ensure an adequate supply without excessive oversupply. Stock control is defined as
"the activity of checking a shop‘s stock".

It can also be referred to as internal control - an accounting procedure or system designed to
promote efficiency, assure the implementation of a policy, safeguard assets, or avoid fraud
and error.

Inventory control may refer to:

• In economics, the inventory control problem, which aims to reduce overhead cost without
hurting sales
• In the field of loss prevention, systems designed to introduce technical barriers to shoplifting

It answers the 3 basic questions of any supply chain:


1. When?
2. Where?
3. How much?

Inventory/Stock control systems


Many shops now use stock control systems. The term "stock control system" can be used to
include various aspects of controlling the amount of stock on the shelves and in the
stockroom and how reordering happens. Typical features of stock control software include:

• Ensuring that the products are on the shelf in shops in just the right quantity.



• Recognizing when a customer has bought a product.
• Automatically signalling when more products need to be put on the shelf from the stockroom.
• Automatically reordering stock at the appropriate time from the main warehouse.
• Automatically producing management information reports that could be used both by local
managers and at head office.

These might detail what has sold, how quickly and at what price, for example. Reports
could be used to predict when to stock up on extra products, for example at Christmas, or
to make decisions about special offers, discontinuing products and so on. Such systems can
also send reordering information not only to the warehouse but also directly to the factory
producing the products, enabling it to optimize production.

Advantages and disadvantages


Stock control systems ensure that shelves are appropriately stocked. If there is too much
stock, it ties up a company's money, money that might be better spent on reducing their
overdraft, on advertising the business or on paying for better facilities for customers, for
example. Too much stock also means that some perishable products might not sell and would
have to be thrown away, reducing profit. In most cases, however, the advantages of a stock
control system outweigh the disadvantages.

Types of Inventory Control systems :

 ABC
 Two Bin Method
 Three Bin Method
 Order point system / Fixed Order quantity system (Fixed Order Quantity, Fixed
Period Ordering - Maximum & Minimum levels)
 Just In Time
 Vendor Managed Inventory
 Material requirements planning (MRP)

ABC Method:
This is one of the common methods used across the retail industry and it is at times coupled
with other methods for better control of inventory. It is essentially an inventory
classification technique in which products are classified based on their sales contribution and
importance in the assortment plan.

A-Category products are the highest grossing products in sales and the flagship products with
higher margins. Usually the top 20% of the products in the assortment, contributing to 80% of
the total sales, are classified under the A category, where tight control on inventory is required
to ensure no loss in sales. 20% of products contributing to 80% of sales is known as the 80-20
Rule or Pareto principle.

C-Category products are bottom of the line, contributing the least to sales. These items are
marginally important for the business and are kept solely to meet customer requirements.



B-Category products are important to the retailer but are less important compared to A
Category products.

TWO BIN Method:


This is a simple method usually used in warehousing, wherein an item is stored in two
locations or bins in a warehouse. Stock is replenished in the first bin from the second bin
once the first bin is completely consumed, and the quantity required to refill the second bin
is placed on order.

The availability of stock in each bin is calculated based on reorder lead time to ensure
enough stock is made available till the new stock arrives.

THREE BIN Method:


This is a common method followed in manufacturing where a Kanban system is used. It is
similar to the two bin system, with a third bin at the supplier's location. The supplier will not
manufacture spare parts for the manufacturer until the reserve bin is emptied. Three bins, each
with a Kanban card tracking the movement of inventory, are available: one at the
manufacturing/shop floor, one at the back store, and one with the supplier. Once the inventory
in the manufacturing/shop floor bin is consumed or sold, it is replenished with the complete
bin from the back store. The back store bin is then sent to the supplier and replaced with a
complete bin from the supplier. The supplier then manufactures enough to refill the third bin
held with him. This acts as a complete loop until manufacturing of the product ceases.

FIXED ORDER QUANTITY:


This method is used to avoid ordering mistakes and ensure regular replenishment of
existing products. Only a fixed quantity can be ordered at one time for an item. This type
of ordering is usually used in automatic replenishment of goods, wherein an automatic
reordering point is set in the system; when the product's inventory level hits the reordering
point or minimum stock level, an order is placed up to the maximum stocking capacity of the
product. To use this method the retailer should know the minimum and maximum stocking
capacity of the product, based on the space allocated and the sales trend.

Order Point System / Fixed Order System:


In this system there is a fixed time interval between every order placed for the item. For
example, a vendor will visit the store in person, check the inventory of the respective
products, and resupply the products based on the sales over that time period. This kind of
ordering is done in small format stores like pharmacies and grocery stores.

The order point system / fixed order quantity system of inventory control is based on
the (re)order point and order quantity factors rather than on the time factor. The
inventory policy, in this system, is drawn up by defining the following:

• Fixed Order point / Re-Order Level (ROL) for each item


• Fixed Maximum , Minimum levels for each item
• Fixed Quantity to be ordered



Often called Min-Max systems, these involve both a maximum inventory level and a
minimum level at which reorders are generated. Basically, units of an item are issued until the
level of that inventory reaches the predefined reorder point. An order is then triggered for a
predetermined quantity (usually a calculated economic order quantity). In this system, the
order quantity is constant and the time between orders is variable.

The different inventory points (levels) of stock for an item are:

• Maximum level (Max.), predetermined
• Minimum level (Safety stock, SS), predetermined
• Lead time (LT), predetermined
• Monthly demand, D (often based on the moving average method)

Max. level = (Review period + LT + SS) × D

Reorder level (ROL) = (LT + SS) × D

Order Quantity (OQ) = Max. – (Present stock + Pipeline dues)
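
These three formulas can be applied directly; a small Python sketch with made-up figures (the review period, lead time, safety stock and demand values are purely illustrative, with times in months):

def max_level(review_period, lead_time, safety_stock, demand):
    # Max. level = (Review period + LT + SS) x D
    return (review_period + lead_time + safety_stock) * demand

def reorder_level(lead_time, safety_stock, demand):
    # ROL = (LT + SS) x D
    return (lead_time + safety_stock) * demand

def order_quantity(maximum, present_stock, pipeline_dues):
    # OQ = Max. - (Present stock + Pipeline dues)
    return maximum - (present_stock + pipeline_dues)

D = 200                                   # monthly demand in units (hypothetical)
maximum = max_level(1, 2, 0.5, D)         # 700.0
rol = reorder_level(2, 0.5, D)            # 500.0
print(order_quantity(maximum, 320, 100))  # 280.0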

Process: In the course of consumption of an inventory item, say in the form of issues from
stores to the users, the stock level of the item starts depleting at its usage rate D.

As per the above definition, the stock goes up to the maximum level on the first
replenishment and then, because of steady consumption, gradually comes down. In that
process, again as per the definition, it touches the ROL.
As soon as the stock level touches the ROL, fresh replenishment action is initiated.

It is presumed that the next lot shall arrive by the time the present depleting stock touches
the safety stock, assuming a stable lead time and a stable usage rate D.

In some places the order quantity is decided by the above formula, whereas in other places
it is determined by the Economic Order Quantity (EOQ) concept; that is, whenever an order
is to be placed, the quantity ordered is the EOQ.

Advantages:
1. Each item is procured in the most economical quantity
2. An item is attended to only when it needs attention i.e. when its stock has reached the
ROL
3. Control can be exercised on Inventory w.r.t. Max & Min levels

Applicability of Order Point system :


1. The item must have a reasonably stable usage
2. Lead time should not have radical variation
3. Supplier should be able to accept irregularly timed and unscheduled orders

Limitations of the system :


1. Needs continuous monitoring of stock level of each item
2. Cumbersome to operate for items with unstable usage and lead time
3. Perpetual inventory records are required

JUST IN TIME:
The objective of the JUST IN TIME method is to increase inventory turnover and at the
same time reduce inventory holding cost. A JIT inventory system also exposes the
unwanted or dead inventory held by the retailer/manufacturer. This method is ideal for
manufacturing organizations and is not generally used in the retail industry. It also
involves the use of Kanban cards to track inventory movement.

VENDOR MANAGED INVENTORY:


As the name explains, it involves SKUs managed directly by the supplier. Inventory is
replenished by the vendor at regular intervals based on sales. The retailer provides
shop floor space and the vendor is charged a consignment rate on every product sold at the
location. The ownership of the items, from receiving to sale, and any inventory loss rests
with the supplier.

MATERIAL REQUIREMENTS PLANNING (MRP):


Material requirements planning (MRP) is a production planning, scheduling, and
inventory control system used to manage manufacturing processes. Most MRP systems
are software-based, while it is possible to conduct MRP by hand as well. An MRP
system is intended to simultaneously meet three objectives:

• Ensure materials are available for production and products are available for delivery to
customers.
• Maintain the lowest possible material and product levels in store
• Plan manufacturing activities, delivery schedules and purchasing activities.

