
2021 Business Statistics

Regression
Linear Regression
• After establishing that 2 variables are closely related, we may want to
estimate/predict the value of one variable given the value of the other.
• E.g. if advertising and sales are correlated, we can find out:
• The expected amount of sales for a given advertisement expenditure or
• The required amount of expenditure for achieving a fixed sales target
• The statistical tool with which we can estimate or predict the unknown values
of one variable from known values of another variable is called regression
• It helps to find the average probable change in one variable given a certain amount of change
in another
• Regression is a technique for measuring the linear association between a dependent (Y) and
an independent variable (X)
• Regression analysis attempts to predict the values of a continuous DV from specific values of
the IV
3 advantages of regression analysis
• It provides estimates of values of the DV from the values of the IV, with
the help of the regression line, which describes the average relationship
existing between the X and Y variables
• It calculates the standard error, i.e. the error involved in using the
regression line as a basis for estimation
• If the scatter is lesser and the line fits the data closely, it means that there is
relatively little scatter of observations around the regression line. This means,
we can make a good estimate of Y, the DV
• But, if the observations are scattered around the fitted regression line, it will
not produce accurate estimates of the DV
Difference between correlation and
regression
Correlation:
• Precedes regression
• Tool for ascertaining the degree of relationship
Regression:
• Succeeds correlation
• Tool for studying the nature of the relation, i.e. the cause and effect
relationship
Linear Bivariate Regression Model
• In this, we proceed by observing the sample data,
• and use the results obtained as estimates of the corresponding population relationship
• For a bivariate population, the model chosen is simple linear regression model
• Assumptions:
• The value of Y is dependent upon the value of X
• The average relationship between X & Y can be described as a linear equation
Y=a+bX, which gives a straight line graph
• Y: DV,
• X: IV,
• a: Y-intercept and
• b : slope, i.e. the average amount of change of Y per unit of change in the value of
X. The sign of b indicates the type of relationship between X & Y (direct/ inverse)
Regression lines
• Considering 2 variables X & Y, we have 2 regression lines
• Regression line of X on Y (is the line which gives the best estimate for the value of X for any specified value of
Y)
• Regression line of Y on X (is the line which gives the best estimate for the value of Y for any specified value of
X)

• The farther these lines are from each other, the lesser the degree of
correlation between them and vice versa
• The 2 regression lines show the average relationship between the 2 variables
based on the 2 equations known as Regression equations
• Regression equations: are algebraic expressions of the regression lines. Since
there are 2 regression lines, there are 2 regression equations (RE)
• RE of X on Y: describes the variations in the values of X for given change(s) in Y
and
• RE of Y on X: describes the variations in the values of Y for given change(s) in X
Regression Equation of Y on X
• Y= a + b X
• Y is the DV or criterion variable to be estimated and X is the IV or the predictor variable
• a & b are 2 unknown constants (fixed numerical values) which determine the position of the line
completely
• The constants are called parameters of the line
• If the value of either or both of them is changed, another line is determined
• Parameter ‘a’ determines the level of the fitted line (i.e. the distance of the line directly above or
below the origin)
• Parameter ‘b’ determines the slope of the line (i.e. change in Y for unit change in X)
• Once the values of a and b are obtained, we can determine the line. This is done by the METHOD OF
LEAST SQUARES
• It states that the line should be drawn through the plotted points in such a manner that the sum of the
squares of the vertical deviations of the actual Y values from the estimated Y values IS THE LEAST
• i.e. Σ(Actual Y – Estimated Y)^2 is minimum
• Such a line is called the LINE OF BEST FIT
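The least-squares calculation of a and b can be sketched in code; the x and y values below are illustrative, not taken from the slides.

```python
# A minimal sketch of the method of least squares for the line Y = a + bX.
def least_squares(x, y):
    """Return intercept a and slope b minimising sum((y - (a + b*x))**2)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Normal equations of the least-squares line, solved for b and then a:
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

x = [1, 2, 3, 4, 5]    # e.g. advertising expenditure (illustrative data)
y = [3, 5, 7, 9, 11]   # e.g. sales; here exactly y = 1 + 2x
a, b = least_squares(x, y)
print(a, b)            # 1.0 2.0
```

Once a and b are known, an estimate of Y for any X is simply a + b*X.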
The line of best fit

For the points which lie above the line, the error would be positive, and for
the points which lie below the line, the error would be negative.

(Figure: scatter plot of five points around the fitted line.)
Unit 6
• Introduction to Probability
• Basic Concepts of Probability
Uncertainties

Managers often base their decisions on an analysis of uncertainties such as the following:
• What are the chances that sales will decrease if we increase prices?
• What is the likelihood a new assembly method will increase productivity?
• What are the odds that a new investment will be profitable?
Probability

• Probability is a numerical measure of the likelihood that an event will occur.
• Probability values are always assigned on a scale from 0 to 1.
• A probability near zero indicates an event is quite unlikely to occur.
• A probability near one indicates an event is almost certain to occur.
• An experiment is any process that generates well-defined outcomes.
• The sample space for an experiment is the set of all experimental outcomes.
• An experimental outcome is also called a sample point.
Probability as a Numerical Measure
of the Likelihood of Occurrence

Increasing likelihood of occurrence, on a scale from 0 to 1:
• near 0: the event is very unlikely to occur
• .5: the occurrence of the event is just as likely as it is unlikely
• near 1: the event is almost certain to occur
Statistical Experiments

• In statistics, the notion of an experiment differs somewhat from that of an
experiment in the physical sciences.
• In statistical experiments, probability determines outcomes.
• Even though the experiment is repeated in exactly the same way, an entirely
different outcome may occur.
• For this reason, statistical experiments are sometimes called random experiments.
An Experiment and Its Sample Space

Experiment              Experimental Outcomes
Toss a coin             Head, tail
Inspect a part          Defective, non-defective
Conduct a sales call    Purchase, no purchase
Roll a die              1, 2, 3, 4, 5, 6
Play a football game    Win, lose, tie
Assigning Probabilities

 Basic Requirements for Assigning Probabilities

1. The probability assigned to each experimental outcome must be between
0 and 1, inclusively:
0 ≤ P(Ei) ≤ 1 for all i
where Ei is the ith experimental outcome and P(Ei) is its probability

2. The sum of the probabilities for all experimental outcomes must equal 1:
P(E1) + P(E2) + . . . + P(En) = 1
where n is the number of experimental outcomes
Assigning Probabilities
 Classical Method: assigning probabilities based on the assumption of
equally likely outcomes
 Relative Frequency Method: assigning probabilities based on
experimentation or historical data
 Subjective Method: assigning probabilities based on judgment
Classical Method
 Example: Rolling a Die
If an experiment has n possible outcomes, the classical method would assign
a probability of 1/n to each outcome.
• Experiment: Rolling a die
• Sample Space: S = {1, 2, 3, 4, 5, 6}
• Probabilities: each sample point has a 1/6 chance of occurring
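As a quick sketch of the classical method in code, using exact fractions so the probabilities sum to exactly 1:

```python
# Classical method: with n equally likely outcomes, each gets probability 1/n.
from fractions import Fraction

sample_space = [1, 2, 3, 4, 5, 6]                    # rolling a die
n = len(sample_space)
p = {outcome: Fraction(1, n) for outcome in sample_space}

print(p[3])             # 1/6
print(sum(p.values()))  # 1  (probabilities must sum to 1)
```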
Relative Frequency Method
 Example: Lucas Tool Rental
Lucas Tool Rental would like to assign probabilities
to the number of car polishers it rents each day.
Office records show the following frequencies of daily
rentals for the last 40 days.
Number of Number
Polishers Rented of Days
0 4
1 6
2 18
3 10
4 2
Relative Frequency Method
 Example: Lucas Tool Rental
Each probability assignment is given by dividing
the frequency (number of days) by the total frequency
(total number of days).
Number of         Number
Polishers Rented  of Days   Probability (= days/40)
0                  4          .10
1                  6          .15
2                 18          .45
3                 10          .25
4                  2          .05
Total             40         1.00
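The probability column can be reproduced in code from the frequency table:

```python
# Lucas Tool Rental: relative frequency = (days with k rentals) / (total days).
days = {0: 4, 1: 6, 2: 18, 3: 10, 4: 2}   # frequency table from the slide (40 days)
total = sum(days.values())
prob = {k: n / total for k, n in days.items()}

print(prob[2])   # 0.45
```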
Subjective Method
 When economic conditions and a company’s
circumstances change rapidly it might be
inappropriate to assign probabilities based solely on
historical data.
 We can use any data available as well as our
experience and intuition, but ultimately a probability
value should express our degree of belief that the
experimental outcome will occur.
 The best probability estimates often are obtained by
combining the estimates from the classical or relative
frequency approach with the subjective estimate.
Subjective Method
 Example: Bradley Investments
An analyst made the following probability estimates.
Exper. Outcome Net Gain or Loss Probability
(10, 8) $18,000 Gain .20
(10, -2) $8,000 Gain .08
(5, 8) $13,000 Gain .16
(5, -2) $3,000 Gain .26
(0, 8) $8,000 Gain .10
(0, -2) $2,000 Loss .12
(-20, 8) $12,000 Loss .02
(-20, -2) $22,000 Loss .06
Events and Their Probabilities

• An event is a collection of sample points.
• The probability of any event is equal to the sum of the probabilities of the
sample points in the event.
• If we can identify all the sample points of an experiment and assign a
probability to each, we can compute the probability of an event.
Some Basic Relationships of Probability
There are some basic probability relationships that
can be used to compute the probability of an event
without knowledge of all the sample point probabilities:
• Complement of an Event
• Union of Two Events
• Intersection of Two Events
• Mutually Exclusive Events
Complement of an Event

• The complement of event A is defined to be the event consisting of all
sample points that are not in A.
• The complement of A is denoted by Ac.

(Venn diagram: event A and its complement Ac within sample space S.)
Union of Two Events

• The union of events A and B is the event containing all sample points
that are in A or B or both.
• The union of events A and B is denoted by A ∪ B.

(Venn diagram: overlapping events A and B within sample space S.)
Intersection of Two Events

• The intersection of events A and B is the set of all sample points that
are in both A and B.
• The intersection of events A and B is denoted by A ∩ B.

(Venn diagram: overlapping events A and B within sample space S, with the
overlap labelled "Intersection of A and B".)
Addition Law

• The addition law provides a way to compute the probability of event A,
or B, or both A and B occurring.
• The law is written as:
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Mutually Exclusive Events

• Two events are said to be mutually exclusive if the events have no
sample points in common.
• Two events are mutually exclusive if, when one event occurs, the other
cannot occur.

(Venn diagram: non-overlapping events A and B within sample space S.)
Mutually Exclusive Events

• If events A and B are mutually exclusive, P(A ∩ B) = 0.
• The addition law for mutually exclusive events is:
P(A ∪ B) = P(A) + P(B)
There is no need to include “− P(A ∩ B)”.
Independent Events

• If the probability of event A is not changed by the existence of event B,
we would say that events A and B are independent.
Multiplication Law
for Independent Events
• The multiplication law also can be used as a test to see if two events are
independent.
• The law is written as:
P(A ∩ B) = P(A)P(B)
Mutual Exclusiveness and Independence

• Do not confuse the notion of mutually exclusive events with that of
independent events.
• Two events with nonzero probabilities cannot be both mutually exclusive
and independent.
• If one mutually exclusive event is known to occur, the other cannot occur;
thus, the probability of the other event occurring is reduced to zero (and
they are therefore dependent).
• Two events that are not mutually exclusive might or might not be
independent.
Question
• What is the probability that a randomly chosen card from a deck of cards will be
either a king or a heart?
• Solution:
• 52 cards
• A: Kings= 4, P(A)= 4/52
• B: Hearts= 13, P(B)= 13/52
• King & heart= 1
• P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = 4/52 + 13/52 − 1/52 = 16/52 = 4/13
• P(A ∩ B) = P(A & B) = 1/52
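The solution can be checked by enumerating the 52-card deck:

```python
# Verify the king-or-heart probability by counting favourable cards directly.
from fractions import Fraction
from itertools import product

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))                  # 52 (rank, suit) pairs

# A card is favourable if it is a king OR a heart; the king of hearts is
# counted only once, which is exactly what the "- P(A and B)" term corrects for.
favourable = [c for c in deck if c[0] == 'K' or c[1] == 'hearts']
p = Fraction(len(favourable), len(deck))
print(len(favourable), p)                           # 16 4/13
```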
Binomial, Poisson (theory only)
Normal distributions (theory & sums)
Business Forecasting and Time
Series Analysis

Importance
Components
Trend
Free hand method
Methods of semi-averages, moving averages and least squares
Problems based on these
Forecasting, methods of forecasting
Introduction
• When estimates of future conditions are made on a systematic
basis, the process is referred to as forecasting
• The figure or statement obtained is called forecast
• Forecasting is a service whose purpose is to offer the best available
basis for ‘management expectations of the future’
• Forecasting aims at reducing the area of uncertainty that surrounds
management decision-making, with respect to costs, profit, sales,
production, pricing, capital investment etc.
• Forecasting is the process of making predictions of the future,
based on the past & present data by the analysis of trends
• The knowledge of forecasting methods is essential for decision
makers to make reliable and accurate estimates and assess or
evaluate the future consequences of decisions in the face of
uncertainty
What Is Forecasting?
• An essential tool in any decision-making process
• Process of predicting a future event
• Underlying basis of all business decisions
• Planning production in expectation of certain levels of
sales
• Building warehouses in expectation of certain levels of
stocks and sales
• Setting prices in expectation of certain levels of raw
material costs, financial constraints, wages and sales
• Recruiting labour, buying materials, arranging finance or planning
factories in expectation of certain levels of sales and
other activity
Objectives of forecasting
• Creating plans of action
• Monitoring its progress
• Developing a warning-system of the critical factors
Types of forecasts
• Demand forecasts: prediction of demands for products/services based
on the sales and marketing information
• Environmental forecasts: concerned with the social, political and
economic environments of the state/ country
• Technological forecasts: concerned with the new developments in
existing technologies
Timing of forecasts
• Short-range: 0-3 months, may go up to 1 year- for job scheduling, work
force levels, job assignments etc.
• Uses mathematical techniques like moving averages, exponential
smoothing, trend extrapolation
• Medium-range: 1-3 yrs time span- used for sales planning, production
planning, budgeting etc.
• Long-range: >=3 yrs: for designing/ installing new plants, facility
location, R&D etc.
Forecasting Approaches

Qualitative Methods:
• Used when the situation is vague & little data exist
• New products
• New technology
• Involve intuition, experience

Quantitative Methods:
• Used when the situation is ‘stable’ & historical data exist
• Existing products
• Current technology
• Involve mathematical techniques
Quantitative Forecasting- process

• Select several forecasting methods
• Evaluate forecasts
• Select the best method
• Forecast the future
• Monitor the forecasting accuracy continuously

These methods are used when:
• Past data are available
• The information can be quantified
• The past pattern is assumed to continue into the future
Quantitative Forecasting Methods

Quantitative Forecasting
• Time Series Models: Free-hand method, Moving Average, Exponential
Smoothing, Trend Models
• Causal Models: Regression Models
A. Time series forecasting methods
• Time series is a set of measurements of a variable that changes through
time
• The data is gathered on a variable over a period of time at regular intervals
• The future periods’ outcome is predicted by analysing patterns over a
period of time
• Considers an observed historical pattern for any variable and projects the
same into the future using a mathematical formula
B. Causal forecasting methods
• Based on the assumption that the variable value to be forecasted has a
cause-effect relationship with one or more other variables
• Eg. Correlation, Linear regression method
• Identifies the factors that cause variations in the value of any
variable in some predictable manner
Free-hand method
• A trend line fitted by the free-hand method should conform to the following conditions:
• It should be smooth
• The sum of the vertical deviations of the observations above the trend line should be
equal to the sum of the vertical deviations of the observations below the trend line
• The sum of the squares of the vertical deviations of the observations from the trend
line should be as small as possible

• Limitations:
• Highly subjective method as the trend line depends on personal judgement
• Time-consuming
Smoothing methods
• This provides the pattern of movements in the data over time, by eliminating random
variations due to the irregular component of the time series
• 3 smoothing methods are:
Moving averages: a subjective method which depends on the length of the period
for calculating moving averages
• It is a technique to get an overall idea of the trends in a data set
• It is described as moving because old data points get replaced by new figures in its calculation
• Focuses on long-term trend in a time series
Weighted moving averages: A moving average where some time periods are
weighted differently than others
• The most recent observations are assigned larger weightage and it decreases for older data
values
• WMA = ∑ (weight for period n × data value in period n) / ∑ weights
Semi-average method: used to estimate the slope and intercept of the trend line, if
time series is represented by a linear function
Steps in semi-average method
1. Data are divided into 2 parts
2. Their respective arithmetic means are computed
3. These 2 means are plotted corresponding to the midpoint of the data/
class interval covered by the respective part
4. These points are joined by a straight line to get the required trend line
5. The AM of the 1st part is the intercept value ‘a’
6. Slope: ratio of the difference in the AMs to the number of years
between them, b = Δy/Δx = (AM1 − AM2)/(year 1 − year 2)
7. The trend equation is of the form y’ = a + bx, where y’ is the predicted y
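The steps above can be sketched as follows; the years and sales figures are illustrative, not from the slides.

```python
# Semi-average method for an even number of yearly values (illustrative data).
years = [2015, 2016, 2017, 2018, 2019, 2020]
sales = [10, 12, 14, 16, 18, 20]

# Step 1-2: split the data into two halves and compute each half's mean.
half = len(sales) // 2
am1 = sum(sales[:half]) / half     # AM of the first part
am2 = sum(sales[half:]) / half     # AM of the second part

# Step 3: each mean is plotted at the midpoint (mean year) of its part.
mid1 = sum(years[:half]) / half
mid2 = sum(years[half:]) / half

# Step 6: slope = change in the means per year between the two midpoints.
b = (am2 - am1) / (mid2 - mid1)
a = am1 - b * mid1                 # intercept so the line passes through (mid1, am1)

# Step 7: trend estimate for any year x is y' = a + b*x.
print(b, a + b * 2021)             # 2.0 22.0
```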
Trend projection method
(linear, exponential or quadratic)
• It fits a trend line to a time series data and then projects medium-to-
long range forecasts
• Helps to describe the long-term general direction of any business
activity over a long period of time
• The study of trend facilitates making intermediate and long-term
forecasting projections
Linear projection method
• The method of least squares from regression analysis is used to find
the trend line of best fit to a time series data.
• This line is defined as y’ = a + bx where
• y’ is the predicted value of the DV
• a is the y intercept
• b is the slope of regression line Δy/ Δx
• x is the IV represented as time in year/ month etc.
Characteristics of the trend line of best fit
• The sum of all vertical deviations about the line of best fit is zero
• ∑(y-y’)=0
• The sum of all vertical deviations squared is minimum
• ∑(y-y’)^2 is the least
• The line of best fit passes through the mean values of variables x & y
• The value of 2 constants a & b can be found by the simultaneous
solution of normal equations
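The first and third characteristics can be verified numerically for any least-squares fit; the data below are illustrative.

```python
# Check two properties of the line of best fit: the residuals sum to zero
# and the line passes through the point (mean of x, mean of y).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]          # illustrative time series values

n = len(x)
mx, my = sum(x) / n, sum(y) / n
# Least-squares slope and intercept in deviation form:
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print(abs(sum(residuals)) < 1e-9)       # True: sum(y - y') = 0
print(abs((a + b * mx) - my) < 1e-9)    # True: line passes through the means
```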
What is a Time Series?
• It is used to detect patterns of change in statistical
information over regular intervals of time
• We project these patterns to arrive at an estimate for the
future
• Thus it helps to cope with uncertainty about the future

• Set of evenly spaced numerical data


• Obtained by observing variables at regular time periods
• Forecast based only on past values
• Assumes that factors influencing past, present & future will
continue
• Example
Year:  2016  2017  2018  2019  2020  2022
Sales: 78.7  63.5  89.7  93.2  92.1    ??

Time Series data
Time series data is a sequence of observations
• collected from a process
• with equally spaced periods of time
• The major purpose of forecasting with a time series is to extrapolate
• A time series is dynamic: it changes over time
• Data should be plotted in a table or graph, so that the researcher can
view it and understand the behavior of the variable
Objective: to identify the pattern and isolate the influencing factors for
production, planning and control
Time Series Components
• Secular trend
• Cyclical variations
• Seasonal variations
• Irregular variations
Time Series Patterns
(Figure: typical time series patterns. © Wiley 2010)
Secular Trend Component
The value of the variable tends to increase or decrease over a long
period of time  overall upward or downward movement in the
average value of the forecast variable
• Due to long-term factors like population increase, changing
demographic characteristics, technology, consumer preferences
etc.
• Data taken over a period of years

(Figure: sales plotted against time (month/quarter/year), showing an upward trend.)

Cyclical Component (non-seasonal)
• There are years when the business cycle hits a peak above the trend line
• Variations are periodic in nature and repeat like the 4 phases of a business cycle
• At times, business activity slumps and hits a low point below the trend line 
upward or downward swings
• 4 well-defined periods/phases in the business cycle: prosperity, decline, depression &
improvement
• Due to interactions of factors influencing the economy
• May vary in length & usually lasts 2 - 10 years

(Figure: sales plotted against time, with one complete cycle marked.)
Seasonal Component
• Fluctuations are repeated within a year- daily, weekly, monthly, quarterly etc.
• Regular patterns of upward or downward swings- high degree of regularity
• Due to climate, weather, customs, traditions etc.- tend to be repeated from year to
year (Observed Within One Year mostly)
• daily traffic volume shows within-the-day “seasonal” behavior, with peak levels
occurring during rush hours, moderate flow during the rest of the day and early
evening, and light flow from midnight to early morning

(Figure: monthly or quarterly sales peaking in summer and dipping in winter,
repeating each year.)
Irregular Component
• Rapid changes caused by short-term unanticipated and non-recurring
factors
• Due to random variation or unforeseen events
• Nature (flood, earthquake), accidents, Union strike, War
• Erratic, unpredictable, unsystematic, random, ‘residual’ fluctuations
• Short duration & non-repeating
Time Series Forecasting

Time series → Is there a trend?
• No → Smoothing Methods: Moving Average, Exponential Smoothing
• Yes → Trend Models: Linear, Quadratic, Exponential, Auto-Regressive
3 forecasting methods
• Three forecasting methods that are appropriate for a time series with
a horizontal pattern:
• moving averages, weighted moving averages, and exponential smoothing
• Since the objective of each of these methods is to “smooth out” the random
fluctuations in the time series, they are referred to as smoothing
methods
Moving Averages
• The moving averages method uses the average of the most
recent data values in the time series as the forecast for the
next period
• The term moving is used because every time a new
observation becomes available for the time series, it replaces
the oldest observation in the equation and a new average is
computed
• Eg: sales of women’s blouses in the first three weeks (in
thousands of Rs.) are 17, 21 and 19
• forecast of sales in week 4 using the average of the time
series values in weeks 1–3, F4= average of weeks 1-
3=(17+21+19)/3=19
• Thus, the moving average forecast of sales in week 4 is 19 or
Rs. 19,000
• But, the actual value observed in week 4 is 23, the forecast
error in week 4 is 23-19= 4 (Rs.4000)
• Next, we compute the forecast of sales in week 5 by averaging the time
series values in weeks 2–4.
• F5 average of weeks 2-4= (21+19+23)/3 =21
• Hence, the forecast of sales in week 5 is 21, and if the actual value observed in
week 5 is 18, the error associated with this forecast is 18 − 21 = −3
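The blouse-sales example can be reproduced in code (week 5's actual value of 18 is taken from the error calculation above):

```python
# 3-week moving average forecasts for the blouse-sales example.
sales = [17, 21, 19, 23, 18]    # weeks 1-5, in thousands of Rs.

def moving_average_forecast(data, k=3):
    """Forecast for the next period = mean of the k most recent values."""
    return sum(data[-k:]) / k

f4 = moving_average_forecast(sales[:3])   # forecast for week 4 from weeks 1-3
f5 = moving_average_forecast(sales[:4])   # forecast for week 5 from weeks 2-4
print(f4, sales[3] - f4)                  # 19.0 4.0   (forecast, error)
print(f5, sales[4] - f5)                  # 21.0 -3.0  (forecast, error)
```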
WEIGHTED MOVING AVERAGES
• This method selects a different weight for each data value and then computes a
weighted average of the most recent values as the forecast.
• In most cases, the most recent observation receives the most weight, and
the weight decreases for older data values.
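A minimal sketch, where the weights 1, 2, 3 (most recent observation weighted most heavily) and the reuse of the blouse-sales figures are illustrative choices, not from the slides:

```python
# Weighted moving average: WMA = sum(weight_n * value_n) / sum(weights).
def weighted_moving_average(data, weights):
    """weights[-1] applies to the most recent observation."""
    recent = data[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent)) / sum(weights)

sales = [17, 21, 19]                                   # weeks 1-3
wma = weighted_moving_average(sales, weights=[1, 2, 3])
print(wma)    # (1*17 + 2*21 + 3*19) / 6
```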
Exponential smoothing
• Exponential smoothing also uses a weighted average of past time
series values as a forecast; it is a special case of the weighted moving
averages method in which we select only one weight—the weight for
the most recent observation.
• The weights for the other data values are computed automatically
and become smaller as the observations move farther into the past
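This can be sketched with the usual recursion F(t+1) = α·Y(t) + (1−α)·F(t); the smoothing constant α = 0.2, the starting convention F2 = Y1, and the sales data are illustrative assumptions, not from the slides.

```python
# Simple exponential smoothing: each forecast is a weighted average of the
# latest observation and the previous forecast.
def exponential_smoothing(data, alpha=0.2):
    forecasts = [data[0]]                 # convention: F2 = Y1
    for y in data[1:]:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts                      # [F2, F3, ..., F(n+1)]

sales = [17, 21, 19, 23, 18]
forecasts = exponential_smoothing(sales)
print(forecasts)
```

The weight on an observation k periods old is α(1−α)^k, which is why older values fade out automatically.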
Steps in forecasting
• Define organizational objective of forecasting
• Select the variables to be forecasted- eg. Capital investment,
employment level, etc.
• Determine the time horizon- short/ medium or long-term of the
forecast, to predict the future
• Select appropriate forecasting method
• Collect relevant data for forecasting
• Make the forecast and implement the results
Unit – VII: Introduction to Inferential
Statistics
• Meaning & Purpose of inferential statistics, Introduction to testing of Hypothesis:
Procedure for testing hypothesis - Setting of Hypothesis -Null and alternative
hypotheses,
• Computation of Test statistics ( simple problems)-
• Types of errors in hypothesis testing - Level of significance, Critical region and
value - Decision making.
• Test of significance for Large and small sample tests, Z and t tests for mean and
proportion,
• One way ANOVA, Chi-square test for goodness of fit and independence of
attributes.
• (Simple problems)
Hypothesis
• An unproven statement or supposition that tentatively explains
certain facts or phenomena; a proposition that is empirically testable
• Null hypothesis (H0): a statement in which no difference or no effect
is expected. It is the hypothesis that is always tested.
• Alternative hypothesis (H1): A statement that some difference or
effect is expected. It is a statement indicating the opposite of the null
hypothesis

The role of hypothesis in a research study

 Guides the direction of the study
 Specifies what data are relevant and irrelevant for the purpose of research
 Provides the framework for data analyses and for organising the conclusions
that result
Types of hypothesis
1. Hypotheses based on Empirical Uniformities.
Univariate and descriptive in nature: states that something is the case, that
some group of individuals, objects, events or incidents has a certain property
or characteristic.
Eg: All software professionals lead a sedentary life.
2. Hypotheses based on association between Variables.
Bivariate and explanatory in nature: states that
 something is associated with some other thing;
 something is greater than some other thing;
 some group of individuals, objects, events or incidents has more of a
certain property as compared to some other group; or
 something occurs more or less frequently as compared to some other thing.
Eg: Men make better managers than women.
3. Hypotheses based on cause and effect relationship.
Bivariate and deterministic in nature: states that
 something is causally related to some other thing;
 something is a determinant of some other thing.
Eg: Cigarette smoking causes cancer.
Stating hypotheses
 Stated in declarative form.
 States generally a relationship between variables.
 Ideally reflects the theoretical framework of the study,
based on a theory/body of literature.
 Is brief and to the point.

Example of Hypothesis
Research Idea: Drug abuse and child abuse
Objective: To ascertain the linkage between drug abuse of adults and abuse
of children.
Hypothesis: There is a positive relationship between drug abuse among
adults and their physical and psychological abuse of children in their contact.
Hypothesis testing procedures
1. Formulate H0 (null hypothesis) and H1 (alternative hypothesis)
2. Select an appropriate statistical technique
3. Choose the level of significance (α)- usually 5% is taken for management decisions.
• This means there are about 5 chances out of 100 that we would reject the null hypothesis when it should be
accepted, i.e. we are 95% confident about our decision being right
• At the 5% significance level, when we reject H0, the test is said to be significant
4. Compute the degrees of freedom (df)- applicable for t-test, chi square test and
ANOVA
5. Calculate the value of the test statistic- z, t, F, chi square
• The test statistic is a value that is computed from the sample data that is used in making the decision about the rejection of
the null hypothesis.
6. Determine the critical region/ rejection region: those values of the test statistic
leading to the rejection of H0 (figure in the next 3 slides)
7. Find whether the test statistic calculated in step 5 falls in the critical region or
acceptance region
8. Make the statistical decision to reject or not reject the H0

Right-tailed test
(Figure: normal curve with the critical region shaded blue in the right tail,
beyond the critical value zα.)
Left-tailed test
(Figure: normal curve with the critical region in the left tail, beyond −zα.)
2-tailed test
(Figure: normal curve with an area of acceptance of 95% in the middle and
critical regions/areas of rejection of .025 in each tail, beyond the critical
values −Zα/2 = −1.96 and +Zα/2 = +1.96.)
What does the critical region mean?
• The critical region of the sampling distribution of a statistic is also
known as the alpha region.
• The critical region of a hypothesis test is the set of all outcomes which,
if they occur, cause H0 to be rejected and H1 accepted.
• The values within the acceptance region are called acceptable at the
95% Confidence Level; if we find that our sample mean lies within this
region, we conclude that Ho is true and we accept it.
• The critical region CR, or rejection region RR, is a set of values of the test
statistic for which the null hypothesis is rejected in a hypothesis test.
• That is, the sample space for the test statistic is partitioned into two regions;
one region (the critical region) will lead us to reject the null hypothesis Ho, the
other will not.
• So, if the observed value of the test statistic is a member of the critical region,
we conclude "Reject Ho"; if it is not a member of the critical region then we
conclude "Do not reject Ho".
• For instance, if we have calculated that the acceptance region at the 95%
confidence level is between 10 and 20, then values of the test statistic
between 10 and 20 lead us not to reject Ho.
Hypotheses pertaining to Left, right and 2-tailed tests
• There are three ways to set up the null and alternate hypothesis,
mathematically.

• Equal hypothesis versus not equal hypothesis (two-tailed test):
H0 : μ = some value
H1 : μ ≠ some value

• Equal hypothesis versus less than hypothesis (left-tailed test):
H0 : μ = some value
H1 : μ < some value

• Equal hypothesis versus greater than hypothesis (right-tailed test):
H0 : μ = some value
H1 : μ > some value
Type I and Type II errors
• Whenever we draw inferences about a population, there is a risk that
an incorrect conclusion will be reached
Two types of errors can occur
• Type I error and
• Type II error

• H0: Patient is alive (because null hypothesis represents no
change)
• H1: Patient is not alive (dead)
• Possible states of nature: (based on H0)
• Patient is alive (H0 true & H1 false)
• Patient is dead (H0 false & H1 true)
• Decisions are something that researcher has control over, we
make correct or incorrect decision
• Possible decisions (based on Ho) / conclusions (based on claim)
• Reject Ho: sufficient evidence to say patient is dead
• Fail to reject or accept Ho: insufficient evidence to say patient is
dead
• Four possibilities that can occur based on 2 possible states of
nature and the 2 decisions which we can make:

Testing of Hypotheses
Errors in hypothesis testing

                       Decision
Condition         Accept H0          Reject H0
H0 is True        Correct Decision   Type I Error
H0 is False       Type II Error      Correct Decision

α = P(Type I Error) ; β = P(Type II Error)

Goal: Keep α, β reasonably small
Example
• A major dept store is considering the introduction of an Internet shopping service
• The new internet shopping service will be introduced if more than 40% of the Internet
users shop via the Internet, say p
• H0: p =0.40
• H1: p >0.40
• If H0 is rejected, then H1 will be accepted and the new Internet shopping service will be
introduced.
• If H0 is not rejected, then the new Internet shopping service should not be introduced
unless additional evidence is obtained
• This is a one-tailed test

102
Type I error
• Occurs when the sample results lead to the rejection of H0 when it is in fact true
• Type I error in this eg:
Would occur if we concluded, based on the sample data, that the proportion of
customers preferring the new service plan was greater than 0.40, when in fact it was
less than or equal to 0.40 (i.e. when H0 is true). Hence we make the mistake of
introducing the service, incurring a huge loss.
The probability of Type I error (α) is also called the level of significance

103
Type II error (β)
• Occurs when, based on the sample results, H0 is not rejected when it is
in fact false.
• In our eg., type II error would occur
If we concluded, based on sample data, that the proportion of customers
preferring the new service plan was less than or equal to 0.40, when, in
fact, it was greater than 0.40. Hence we make the mistake of not
introducing the service.

104
1. Test of hypothesis concerning
population mean
• Test concerning mean of one population
To test Ho: μ= μo against
a) H1: μ> μo
b) H1: μ< μo
c) H1: μ ≠ μo

105
• A sample of size n (n>30) is taken from the population with unknown
mean μ and known SD σ
• Let x be the sample mean
• Test statistic: z = (x − μo) / (σ/√n), to be compared against the
critical (tabled) value zα

106
Case a) H1: μ> μo
• This is right tailed test.
• The rule is: “If z > zα (the tabled value), the test is significant. There is a
significant difference between the sample mean and the hypothetical
mean, and hence we reject Ho at (1− α)100% confidence level”
• i.e. we reject Ho at the α significance level
• “If z ≤ zα (the tabled value), the test is not significant. There is no
significant difference between the sample mean and the hypothetical
mean, and hence we fail to reject Ho at (1− α)100% confidence
level”
• i.e. we fail to reject Ho at the α significance level

107
Case b) H1: μ< μo
• This is left tailed test.
• The rule is: “If z≤ -zα (the tabled value), the test is significant. There is
significant difference between the sample mean and the hypothetical
mean and hence we reject Ho at (1- α)100% confidence level”
• “If z > −zα (the tabled value), the test is not significant. There is no
significant difference between the sample mean and the hypothetical
mean, and hence we fail to reject Ho at (1− α)100% confidence
level”

108
Case c) H1: μ ≠ μo
• This is two tailed test.
• The rule is: “If absolute value of z, ie. IzI > zα/2 (the tabled value), the test is
significant. There is significant difference between the sample mean and the
hypothetical mean and hence we reject Ho at (1- α)100% confidence level”
• “If IzI < zα/2 (the tabled value), the test is not significant. There is no significant
difference between the sample mean and the hypothetical mean, and hence
we fail to reject Ho at (1− α)100% confidence level”

109
• To save time and effort, the table below relates critical z values to alpha
levels and type of test (whether one-tailed or two-tailed).
• Alpha       Tails        Critical Z
•  0.05        two           plus or minus 1.96
 0.05        right          1.645
 0.05        left            -1.645
 0.01        two          plus or minus 2.58
 0.01        right          2.33
 0.01        left            -2.33

110
Practice 1
• A sample of 100 students is taken from the students of a college whose
heights have standard deviation 10 cm. The mean height of the
sample of students was found to be 168.8 cm. Can we accept the
assumption that the mean height of the students of the college is 170
cm? Significance level = 0.05

111
Solution 1
• σ=10
• x = 168.8
• n=100
• To test Ho: μ = 170 against H1: μ ≠ 170
• This is a 2-tailed test
• α= 0.05, then zα/2 = 1.96
• Applying the formula, z= -1.2
• Here IzI < zα/2 and hence we accept the assumption that the mean
height of the students of the college is 170 cm

112
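The arithmetic in Solution 1 can be checked with a few lines of Python (a minimal sketch using only the standard library; the numbers are those of the practice problem):

```python
import math

# Practice 1: n = 100, sigma = 10, sample mean 168.8, hypothesized mean 170
n, sigma, x_bar, mu0 = 100, 10, 168.8, 170

# Test statistic: z = (x_bar - mu0) / (sigma / sqrt(n))
z = (x_bar - mu0) / (sigma / math.sqrt(n))
print(round(z, 2))    # -1.2

# Two-tailed decision at alpha = 0.05 (critical value z_alpha/2 = 1.96)
print(abs(z) < 1.96)  # True -> fail to reject H0
```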
Practice 2
• A sample of 400 observations was taken from a population with a
standard deviation of 15. If the mean of the sample is 27, test
the hypothesis that the mean of the population is less than
24.
α = 0.05

113
Solution 2
• To test Ho: μ= 24 against H1: μ<24
• σ=15 α= 0.05
• x = 27 so, −zα = −1.645
• n=400
• Applying the formula, z= (27-24) /(15/20)=4
• This is a left-tailed test. Since z > −zα, the test is not
significant. We accept Ho at 95% CL.
• Hence, the population mean can reasonably be accepted to be 24.

114
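Solution 2 can be verified the same way (a minimal sketch; note the left-tailed rejection rule z ≤ −zα):

```python
import math

# Practice 2: n = 400, sigma = 15, sample mean 27, H1: mu < 24
n, sigma, x_bar, mu0 = 400, 15, 27, 24

z = (x_bar - mu0) / (sigma / math.sqrt(n))  # (27 - 24) / (15/20) = 4.0
print(z)

# Left-tailed decision at alpha = 0.05: reject H0 only if z <= -1.645
print(z <= -1.645)   # False -> fail to reject H0
```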
2. Test of hypothesis concerning population
proportions
• Test concerning one population proportion
To test Ho: p= po against (read po as p not)
a) H1: p> po
b) H1: p< po
c) H1: p ≠ po

115
• Given a sample of size n from the population.
• x is the number of items having a particular characteristic
• Sample proportion p=x/n
• Formula to calculate z: z = (p − po) / √[ po(1 − po)/n ]

116
Practice 1
• In a survey of 70 business firms, it was found that 45 are planning to
expand their capacities next year. Does the sample information
contradict the hypothesis that 70% of the firms are planning to
expand next year?

117
Solution
• To test H0: p = 0.7 against H1: p ≠ 0.7
• This is a 2-tailed test. At α= 0.05, zα/2= 1.96
• n=70, x=45
• p = x/n = 45/70 = 0.643
• z = (p − po)/√[ po(1 − po)/n ]
= (0.643 − 0.70)/√[(0.7 × 0.3)/70]
= −0.057/0.055 ≈ −1.04
IzI = 1.04 is < zα/2 (i.e. 1.96)
Test is not significant and we accept Ho at 95% CL. There is no reason to doubt the
hypothesis that 70% of the companies are going to expand their capacities.

118
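The proportion test can be recomputed without intermediate rounding (a sketch in Python; unrounded values give z ≈ −1.04, and the conclusion is the same):

```python
import math

# n = 70 firms, x = 45 planning to expand; H0: p = 0.70 (two-tailed)
n, x, p0 = 70, 45, 0.70
p_hat = x / n                                    # about 0.643

# z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
print(round(z, 2))    # -1.04

print(abs(z) < 1.96)  # True -> fail to reject H0 at the 5% level
```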
• Practice 2
• An e-commerce research company claims that 60%
or more graduate students have bought
merchandise on-line. A consumer group is
suspicious of the claim and thinks that the
proportion is lower than 60%. A random sample of
80 graduate students show that only 22 students
have ever done so. Is there enough evidence to
show that the true proportion is lower than 60%? 
Conduct the test at 5% Type I error rate, and use
the rejection region approaches.

119
• Left tailed test
• To test H0: p= 0.6 against H1: p< 0.6
• n=80, x=22; p= x/n =22/80=0.275;
• po = 0.6; 1- po = 0.4
• Z= (p- po) / √ po(1- po)/n
• =(0.275−0.6)/√[0.6×0.4]/80= −5.93
• Z< - zα; Test is significant and we reject Ho at
5% SL. There is enough evidence to show
that the true proportion is lower than 60%

120
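The same calculation in code form (a sketch; the numbers come from the problem above):

```python
import math

# n = 80 students, x = 22 bought online; H0: p = 0.60 vs H1: p < 0.60
n, x, p0 = 80, 22, 0.60
p_hat = x / n                                    # 0.275

z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
print(round(z, 2))    # -5.93

# Left-tailed test at alpha = 0.05: reject H0 if z <= -1.645
print(z <= -1.645)    # True -> reject H0; proportion is below 60%
```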
3. Test of hypothesis concerning Chi Square
Statistic
• When the assumption that ‘the samples are drawn from a normal population’ cannot be
justified, we use statistical procedures generally referred to as non-parametric tests.
• Chi square is one such test belonging to this category, first used by Karl Pearson

121
Properties
• Chi square distribution is a continuous probability distribution which has a
value zero as its lower limit and extends to infinity in the positive direction
• It can never have a negative value, as the difference between observed
and expected frequencies is squared
• The exact shape of the distribution depends upon the degrees of freedom
• For a small df, the shape of the curve is positively skewed. As the df
becomes larger, it becomes symmetrical and approximates to the shape of
a normal distribution
• It makes no assumptions about the population being sampled
• The greater the chi square value, the greater is the discrepancy between
observed and expected frequencies
3. Test of hypothesis concerning Chi Square
Statistic
Calculate the chi square statistic by completing the following steps:
• For each observed number in the table, subtract the corresponding
expected number (O − E).
• Square the difference [ (O − E)² ].
• Divide the square obtained for each cell in the table by the expected
number for that cell [ (O − E)² / E ].
• Sum all the values of (O − E)² / E. This is the chi square statistic.

123
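The steps above can be sketched as a small helper function (the observed and expected counts below are illustrative numbers only, not taken from a slide):

```python
def chi_square_statistic(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Illustrative (hypothetical) counts only
observed = [50, 30, 20]
expected = [40, 40, 20]
chi2 = chi_square_statistic(observed, expected)
print(chi2)   # 5.0
```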
Expected value
• Eij = (Ri * Cj)/n
where
Ri= total observed frequency in the ith row
Cj= total observed frequency in the jth column
and n is the sample size

124
Conditions
• Minimum 50 observations in the sample (n>50)
• Each cell frequency should not be less than 5 observations, otherwise
increase the sample size per cell
• The data should be expressed in original units (frequencies/counts),
i.e. frequencies and not in percentage or ratio form
• Sample data to be drawn at random from the target population

125
• Formulate the null hypothesis and determine the expected
frequency of each answer
• Ho: the two attributes are independent/ there is no association
between the attributes
• H1: X is dependent on Y/ there is a significant association
between the 2 attributes
• Determine the appropriate significance level
• Calculate the chi-square value, using the observed (from
sample) and expected frequencies
• Make the statistical decision by comparing the calculated chi
square with the critical (tabled) value

126
Decision rule- X2 test
• The tabled X2 value is X2 k-1,α where k is the number of classes and k−1
degrees of freedom are used to find the tabled value.
• If calculated X2 > tabled X2 k-1,α, the test is significant and we reject H0 at
(1-α)100% CL.
• Otherwise we accept H0

127
Problem
• A company has to choose among three proposed pension plans. The
company wishes to test the hypothesis ‘preference for plans is
independent of job classification’. It asks the opinion of a sample of
employees and obtains the information presented in the table. Test the
hypothesis which the company wishes to do.

No. of employees favoring proposed pension plans
Job classification      Plan A   Plan B   Plan C
Factory employees        160       30       10
Clerical employees       148       40       20
Supervisors               72       10       10
Executives                70       20       10

128
Observed (with totals)                  Expected
160   30   10  | 200                    150.00   33.33   16.67
148   40   20  | 208                    156.00   34.67   17.33
 72   10   10  |  92                     69.00   15.33    7.67
 70   20   10  | 100                     75.00   16.67    8.33
450  100   50  | 600

Ho: preference for plans is independent of job classification
H1: preference for plans is dependent on job classification

  O        E       (O−E)²    (O−E)²/E
160    150.00     100.00      0.67
148    156.00      64.00      0.41
 72     69.00       9.00      0.13
 70     75.00      25.00      0.33
 30     33.33      11.09      0.33
 40     34.67      28.41      0.82
 10     15.33      28.41      1.85
 20     16.67      11.09      0.67
 10     16.67      44.49      2.67
 20     17.33       7.13      0.41
 10      7.67       5.43      0.71
 10      8.33       2.79      0.33
                   X² calc ≈ 9.34

dof = (4−1) × (3−1) = 6
X² calc < X² 6,0.05 = 12.59

Interpretation: We fail to reject Ho at the 5% significance level. We have no
evidence to state that preference for plans is dependent on job classification.

129
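The whole pension-plan computation can be reproduced in a few lines (a sketch; observed counts from the table, expected counts built from Eij = Ri × Cj / n):

```python
# Rows: factory, clerical, supervisors, executives; columns: Plan A, B, C
observed = [
    [160, 30, 10],
    [148, 40, 20],
    [ 72, 10, 10],
    [ 70, 20, 10],
]

n = sum(sum(row) for row in observed)            # 600
row_tot = [sum(row) for row in observed]         # [200, 208, 92, 100]
col_tot = [sum(col) for col in zip(*observed)]   # [450, 100, 50]

# E_ij = (R_i * C_j) / n, then chi-square = sum of (O - E)^2 / E
chi2 = sum((observed[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(4) for j in range(3))
print(round(chi2, 2))   # about 9.34 < 12.59 at 6 dof -> fail to reject H0
```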
• This has a X2 distribution with (r-1)*(c-1) degrees of freedom
• If calculated X2 > tabled X2 (r-1)*(c-1) ,α, the test is significant and we reject H0
at (1-α)100% CL. There is evidence to believe that the two attributes
(variable 1 and variable 2) are dependent or related
• Otherwise we accept H0
• Cell: section of a table representing a specific combination of two
variables or a specific value of a variable

130
Chi square problem
• Of the 1000 workers in a factory exposed to Covid-19, 700 in all were
attacked; 400 had been inoculated, and of these 200 were attacked
• On the basis of this information, can it be said that inoculation and
attack are independent?
Table
Inoculated Not Inoculated Total

Attacked by Covid-19 200 500 700

Not attacked 200 100 300

Total 400 600 1000

132
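A sketch of the solution, using the same expected-frequency rule Eij = Ri × Cj / n as before:

```python
# Rows: attacked, not attacked; columns: inoculated, not inoculated
observed = [[200, 500],
            [200, 100]]

n = 1000
row_tot = [700, 300]
col_tot = [400, 600]

chi2 = sum((observed[i][j] - row_tot[i] * col_tot[j] / n) ** 2
           / (row_tot[i] * col_tot[j] / n)
           for i in range(2) for j in range(2))
print(round(chi2, 2))   # about 126.98

# dof = (2-1)*(2-1) = 1; tabled chi-square at 5% is 3.84
print(chi2 > 3.84)      # True -> reject H0: attack and inoculation are not independent
```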
Tests based on statistics following Student’s t distribution:
the study of statistical inference with small samples
• If the original population is normally distributed and the SD of the
population is unknown, the sampling distribution of the mean
derived from small samples (n<30) will follow a t-distribution
• The shape of the t-distribution is influenced by its degrees of
freedom (d.o.f or d.f)
• The number of d.o.f is equal to the number of useful items of
information generated by a sample of given size with respect to the
estimation of a given population parameter
• It is calculated as df= n-1

133
• In statistics, Student's t-distribution (or simply the t-distribution) is a
probability distribution that arises in the problem of estimating the
mean of a normally distributed population when the sample size is
small
• Assumptions:
1. Population is normal
2. SD of population is unknown
• Properties:
1. It ranges from −∞ to +∞
2. It is bell shaped and symmetrical around the mean
3. Its shape changes with the change in df
4. It is more platykurtic than the normal distribution
5. As n approaches 30, the t-distribution approaches the normal form
• The t-table:
1. Its value is called tα or tα/2
2. Determined from the table given a particular df and level of significance

134
• A sample of size n (n<30) is taken from a normal population with
unknown population standard deviation
• Let x̄ be the sample mean and s be the sample SD
• Then t = (x̄ − μ) / (s/√n)
• μ is the hypothesized population mean
• s = √[ ∑(x − x̄)² / (n−1) ]

135
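The t statistic can be computed directly with the standard library (a minimal sketch; the five data points are hypothetical, chosen only to illustrate the formula):

```python
import math
import statistics

# Hypothetical small sample (n < 30, population SD unknown)
data = [12, 13, 11, 14, 12]
mu0 = 12                          # hypothesized population mean

x_bar = statistics.mean(data)     # 12.4
s = statistics.stdev(data)        # sample SD, divisor n - 1

t = (x_bar - mu0) / (s / math.sqrt(len(data)))
print(round(t, 2))   # 0.78, to be compared with t(alpha, df = n - 1 = 4)
```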
Questions you may ask to arrive at a decision
between t and Z
• Is the population standard deviation (σ) known?
• If the answer is yes, the Z-distribution is appropriate
• When σ is unknown, a second question is asked: “Is the sample size
greater than 30?”
• If the answer is no, the t-distribution should be used, if it is yes, the Z-
distribution should be used (because as the sample size increases, the t-
distribution becomes increasingly similar to the Z-distribution)

136
Test concerning mean of one population

To test Ho: μ= μo against


a) H1: μ> μo
b) H1: μ< μo
c) H1: μ ≠ μo

137
Case a) H1: μ> μo
• This is right tailed test.
• The golden rule is: “If calculated t> tα,n-1 (the tabled value with n-1 degrees
of freedom), the test is significant. We reject Ho at (1- α)*100% confidence
level”
• “Otherwise, we accept (fail to reject) Ho at (1- α)100% confidence level”

138
Case b) H1: μ< μo
• This is left tailed test.
• The golden rule is: “If calculated t< -tα,n-1 (the tabled value with n-1
degrees of freedom), the test is significant. We reject Ho at (1- α)100%
confidence level”
• “Otherwise, we accept (fail to reject) Ho at (1- α)100% confidence level”

139
Case c) H1: μ ≠ μo
• This is two tailed test.
• The golden rule is: “If absolute value of the calculated ItI> tα/2,n-1 (the
tabled value with n-1 degrees of freedom), the test is significant. We
reject Ho at (1- α)100% confidence level”
• “Otherwise, we accept (fail to reject) Ho at (1- α)100% confidence
level”

140
Analysis Of Variance (ANOVA)
• Analysis of variance (ANOVA) is a collection of statistical models and their associated
estimation procedures (such as the "variation" among and between groups) used to
analyze the differences among means. ANOVA was developed by the statistician Ronald
Fisher.

• When the means of more than two groups or populations are to be compared, one-way
analysis of variance, a bivariate statistical technique, is the appropriate statistical tool
• One way because there is only one independent variable (though several levels of that
variable may be present)
• It is the analysis of the effects of one treatment variable on an interval-scaled or ratio-
scaled dependent variable; a technique to determine if statistically significant
differences in means occur between two or more groups
• Eg. Students from different colleges take the same exam. You want to see if one college
outperforms the others.
Example of an ANOVA problem
• To compare women who are working full-time outside the
home, working part-time outside the home, and not working
outside the home on their willingness to purchase a personal
computer
• This eg: has only one IV- working status with 3 levels:
• Full time employment
• Part-time employment and
• No employment outside the home
• Because there are 3 levels (groups), a t-test cannot be used to
test for statistical significance.
Contd…
• The null hypothesis, here, can be stated as “All the means are equal”
or
• Ho: μ1= μ2 = μ3

• As the name suggests, ANOVA requires comparing variances to make
inferences about the means
The logic of this technique:
• The variance of the means of the three groups will be large if these
women differ from each other in terms of purchasing intentions
• If we calculate this variance within groups and compare it to the
variance of the group means about a grand mean, we can determine
whether the means are significantly different
• F-test makes this easier
F-test
• A procedure for comparing one sample variance to another sample
variance
• It determines whether there is more variability in the scores of one
sample than in the scores of another sample
To obtain F-statistic or F-ratio
• F-statistic: a test statistic that measures the ratio of one sample
variance to another sample variance, such as the variance between
groups to the variance within groups
• F = (variance between groups) / (variance within groups)
= MSbetween / MSwithin, where MS is the mean square
• If the F-value is large, it is likely that the results are statistically
significant
Some terminologies for calculation of F-ratio

• Within-group variance: variation of scores within a group due
to random error or individual difference
• Between-group variance: variation of scores between groups
due to either the manipulation of an IV or characteristics of the
IV
• Total variance: the sum of within-group variance and between-
group variance
Calculating the F-ratio
Sales in units (thousands)
• The data in the table is from a hypothetical packaged-goods company's
test-market experiment on pricing. Three pricing treatments are
administered in four separate test markets (ie. 12 test areas, A–L, were
required). Do all the 3 price treatments produce the same sales volume?

                         Regular     Reduced     Off-coupon
                         price U     price V     price W
Test market A, B or C      130         145          153
Test market D, E or F      118         143          129
Test market G, H or I       87         120           96
Test market J, K or L       84         131           99

Mean                  X1 = 104.75  X2 = 134.75  X3 = 119.25

Grand mean is the mean of all 3 group means: X = 119.58
• Total sum of squares =
within group sum of squares + between group sum of squares
ie. SStotal = SSwithin + SSbetween

SStotal is computed by squaring the deviation of each
score from the grand mean and summing these squares:

SStotal = Σ (i=1..n) Σ (j=1..c) (Xij − X)² where

• Xij = individual score, i.e., the ith observation or test unit in the jth group
• X = grand mean
• n = number of all observations or test units in a group (in this eg: 4)
• c = number of groups (or columns) (in this eg: 3)
Applying the formula …
SStotal = (130 − 119.58)² + (118 − 119.58)²
+ (87 − 119.58)² + (84 − 119.58)²
+ (145 − 119.58)² + (143 − 119.58)²
+ (120 − 119.58)² + (131 − 119.58)²
+ (153 − 119.58)² + (129 − 119.58)²
+ (96 − 119.58)² + (99 − 119.58)²
= 5948.93
To calculate SSwithin
• SSwithin, the variability that we observe within each group, is calculated by
squaring the deviation of each score from its group mean and summing these
scores:
• SSwithin = Σ (i=1..n) Σ (j=1..c) (Xij − Xj)² where
• Xij = individual score
• Xj = group mean for the jth group
• n = number of observations in a group
• c = number of groups
Applying the formula…
• SSwithin = (130 − 104.75)² + (118 − 104.75)²
+ (87 − 104.75)² + (84 − 104.75)²
+ (145 − 134.75)² + (143 − 134.75)²
+ (120 − 134.75)² + (131 − 134.75)²
+ (153 − 119.25)² + (129 − 119.25)²
+ (96 − 119.25)² + (99 − 119.25)²
= 4148.25
To calculate SSbetween
• SSbetween , the variability of the group means about a grand mean, is calculated by squaring the
deviation of each group mean from the grand mean, multiplying by the number of items in the
group, and summing these scores:
• SSbetween = Σ (j=1..c) nj (Xj − X)² where
• Xj = group mean for the jth group
• X = grand mean
• nj = number of items in the jth group, same as ‘n’
Applying the formula…
• SSbetween = 4(104.75 − 119.58)²
+ 4(134.75 − 119.58)²
+ 4(119.25 − 119.58)²
= 1800.68
• The next calculation requires dividing the various sums of squares by
their appropriate degrees of freedom.
• These divisions produce the variances, or mean squares
MSbetween
• To obtain mean square between groups, SSbetween is divided by c-1
degrees of freedom:
• MSbetween = SSbetween / (c − 1)
= 1800.68 / (3 − 1) = 900.34
MSwithin
• To obtain the mean square within groups, SSwithin is divided by cn-c
degrees of freedom
• MSwithin = SSwithin / (cn − c)
= 4148.25 / (12 − 3) = 460.91
F-ratio
• F-ratio is calculated by taking the ratio of the mean square between
groups to the mean square within groups
• The between-groups mean square is used as the numerator and the
within-groups mean square is used as the denominator:
F = MSbetween / MSwithin = 900.34 / 460.91 = 1.95
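The whole ANOVA computation for the pricing example can be reproduced in plain Python (a sketch; tiny differences in the last decimal versus the slides come from the slides rounding the grand mean to 119.58):

```python
# Test-market sales data (thousands of units), one list per price treatment
groups = [
    [130, 118, 87, 84],    # regular price U
    [145, 143, 120, 131],  # reduced price V
    [153, 129, 96, 99],    # price W
]

all_scores = [x for g in groups for x in g]
grand_mean = sum(all_scores) / len(all_scores)           # 119.58...

ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

c, n = len(groups), len(groups[0])                       # 3 groups of 4
ms_between = ss_between / (c - 1)                        # about 900.33
ms_within = ss_within / (c * n - c)                      # about 460.92
f_ratio = ms_between / ms_within
print(round(f_ratio, 2))   # 1.95 < tabled F(2, 9) = 4.26 -> fail to reject H0
```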
Summary for Analysis Of Variance

Source of         Sum of squares                   Degrees of   Mean square                    F-ratio
variation                                          freedom
Between groups    SSbetween = Σj nj (Xj − X)²      c − 1        MSbetween = SSbetween/(c−1)    --
Within groups     SSwithin = Σi Σj (Xij − Xj)²     cn − c       MSwithin = SSwithin/(cn−c)     F = MSbetween/MSwithin
Total             SStotal = Σi Σj (Xij − X)²       --           --                             --

(c is the number of groups; n is the number of observations in a group)
• There will be (c−1) degrees of freedom in the numerator and (cn−c)
degrees of freedom in the denominator
c − 1 = 3 − 1 = 2
cn − c = 3(4) − 3 = 9

The F-distribution table, at the 0.05 level for 2 (n1) and 9 (n2) dof, indicates an F of
4.26
Pricing experiment: ANOVA table

Source of variation   Sum of squares   Degrees of freedom   Mean square   F-ratio
Between groups        1800.68          2                    900.34        --
Within groups         4148.25          9                    460.91        1.953
Total                 5948.93          11                   --            --

• As calculated F = 1.95 < 4.26 (tabled), we fail to reject H0 at 95% CL
• We conclude that all the price treatments produce approximately the
same sales volume
All the best!
