Analytics Task-1
Outlook Group
TASK 1
⚫ Refer to the dataset of “Incinerator”
⚫ Find 5 problem statements from the entire
dataset.
⚫ Model the relationships.
⚫ Find the solutions to the problem
statements.
⚫ Justify the solution with concrete details.
TASK 2
⚫ Refer to the dataset “Tree Making”
⚫ Analyze the dataset and the nature
of the variables.
⚫ Find out the course of action using a
suitable decision tree.
TASK 3
⚫ Refer to the dataset “MRA Analytics”
⚫ 2 problem statements should be analysed.
⚫ Choice of software is yours.
⚫ Excel Miner may be used in trial version.
⚫ Analyze all the parameters of the model
and prepare a descriptive report.
TASK 4
⚫ Refer to the dataset “Forecasting”
⚫ Fill in the excel sheet with valid
calculations.
⚫ Explain the logic behind each calculation.
Submission Format
⚫ Task 4 should be done in the same excel
sheet.
⚫ Other tasks may be done in word or ppt or
excel.
HINTS
MULTIVARIATE ANALYSIS
⚫ UNIVARIATE
⚫ MEASURES OF CENTRAL TENDENCY
⚫ MEASURES OF DISPERSION
⚫ BIVARIATE
⚫ CROSS-TABULATION
⚫ CHI-SQUARE
⚫ ONE-WAY ANOVA
⚫ CORRELATION ANALYSIS
⚫ SIMPLE REGRESSION ANALYSIS
⚫ MULTIVARIATE
⚫ MULTIPLE VARIABLES
⚫ MULTIPLE VARIATES
WHAT TYPE OF RELATIONSHIP IS
BEING EXAMINED?
⚫ DEPENDENCE: NUMBER OF DEPENDENT VARIABLES
⚫ SEVERAL DVs IN A SINGLE RELATIONSHIP: MANOVA
⚫ ONE DV IN A SINGLE RELATIONSHIP, BY MEASUREMENT SCALE OF DV: METRIC (MRA, CJA); NON-METRIC (MDA, LRA)
⚫ MULTIPLE RELATIONSHIPS OF DVs & IVs: SEM
⚫ INTERDEPENDENCE: STRUCTURE OF RELATIONSHIP IS AMONG
⚫ VARIABLES: EFA
⚫ CASES/RESPONDENTS: CLA
⚫ OBJECTS: MDS
APPLICATIONS
⚫ Correlation
⚫ How strongly are sales related to advertising expenditure?
⚫ Is there an association between market share and size of
the sales force?
⚫ Partial correlation
⚫ How strongly are sales related to advertising expenditure
when the effect of price is controlled?
⚫ Is there an association between market share and size of
the sales force after adjusting for the effect of sales
promotion?
⚫ Regression
⚫ Can variation in sales be explained in terms of variation in
advertising expenditures? What is the structure and form
of this relationship, and can it be modeled?
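The correlation and partial correlation questions above can be sketched in a few lines of Python. This is a minimal illustration, not the course's prescribed method: the function names `pearson_r` and `partial_r` and the sales, advertising, and price figures are all hypothetical. The partial correlation uses the standard first-order formula built from the three pairwise correlations.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical figures for illustration only (not from the course datasets).
sales = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
advertising = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
price = [9.0, 9.5, 8.0, 8.5, 7.0, 7.5]

print("r(sales, advertising):", round(pearson_r(sales, advertising), 3))
print("r(sales, advertising | price):", round(partial_r(sales, advertising, price), 3))
```

Comparing the two printed values shows how much of the sales and advertising association remains once price is held constant.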
REGRESSION ANALYSIS
Regression analysis examines associative relationships
between a metric dependent variable and one or more
independent variables in the following ways:
⚫ Determine whether the independent variables explain a
significant variation in the dependent variable: whether
a relationship exists.
⚫ Determine how much of the variation in the dependent
variable can be explained by the independent variables:
strength of the relationship.
⚫ Determine the structure or form of the relationship: the
mathematical equation relating the independent and
dependent variables.
⚫ Predict the values of the dependent variable.
⚫ Control for other independent variables when
evaluating the contributions of a specific variable or set
of variables.
STATISTICS ASSOCIATED WITH BIVARIATE
REGRESSION ANALYSIS
⚫ Bivariate regression model. The basic regression equation
is $Y_i = \beta_0 + \beta_1 X_i + e_i$, where $Y$ = dependent
or criterion variable, $X$ = independent or predictor variable,
$\beta_0$ = intercept of the line, $\beta_1$ = slope of the line,
and $e_i$ is the error term associated with the $i$th observation.
⚫ Sum of squared errors. The distances of all
the points from the regression line are
squared and added together to arrive at the
sum of squared errors, which is a measure of
total error, $\sum e_j^2$.
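A minimal sketch of the bivariate model, assuming ordinary least squares is used to estimate the intercept and slope. The helper names `fit_line` and `sum_squared_errors` and the duration and attitude values are hypothetical, chosen only to illustrate the equation and the sum of squared errors.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for Y = b0 + b1 * X + e."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1

def sum_squared_errors(x, y, b0, b1):
    """Sum of squared vertical distances of the points from the line."""
    return sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))

# Hypothetical data for illustration only.
duration = [2, 4, 6, 8, 10]
attitude = [3, 4, 6, 7, 9]

b0, b1 = fit_line(duration, attitude)
sse = sum_squared_errors(duration, attitude, b0, b1)
print("intercept:", round(b0, 3), "slope:", round(b1, 3), "SSE:", round(sse, 3))
```

Any other straight line through the same points gives a larger sum of squared errors, which is the sense in which the least-squares line is "best".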
ANALYSIS
PLOT THE SCATTER DIAGRAM
[Figure: scatter diagram of Attitude plotted against Duration of Residence]
WHICH STRAIGHT LINE IS BEST?
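[Figure: scatter diagram with four candidate straight lines, Line 1 to Line 4]
[Figure: regression line $\beta_0 + \beta_1 X$ with observed values $Y_j$ and errors $e_j$ at $X_1, \ldots, X_5$]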
VARIATION IN BIVARIATE
REGRESSION
[Figure: total variation SSy split into explained variation SSreg and residual variation SSres, plotted over $X_1, \ldots, X_5$]
MULTIPLE REGRESSION
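The total variation decomposes as $SS_y = SS_{reg} + SS_{res}$, where
$$SS_y = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$
$$SS_{reg} = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$$
$$SS_{res} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$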
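The three sums of squares can be checked numerically. The sketch below assumes a least-squares line fitted to hypothetical data. The function name `variation_decomposition` is illustrative. For a least-squares fit the total variation should equal the explained plus the residual variation, and the ratio of explained to total variation gives R².

```python
def variation_decomposition(x, y):
    """Return (SS_y, SS_reg, SS_res) for a least-squares line fitted to x, y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b1 = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
          / sum((a - mean_x) ** 2 for a in x))
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * a for a in x]
    ss_y = sum((b - mean_y) ** 2 for b in y)                 # total variation
    ss_reg = sum((f - mean_y) ** 2 for f in fitted)          # explained variation
    ss_res = sum((b - f) ** 2 for b, f in zip(y, fitted))    # residual variation
    return ss_y, ss_reg, ss_res

# Hypothetical data for illustration only.
x = [2, 4, 6, 8, 10]
y = [3, 4, 6, 7, 9]

ss_y, ss_reg, ss_res = variation_decomposition(x, y)
r_squared = ss_reg / ss_y
print("SS_y:", round(ss_y, 3), "SS_reg:", round(ss_reg, 3), "SS_res:", round(ss_res, 3))
print("R squared:", round(r_squared, 4))
```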
ANALYSIS
STRENGTH OF ASSOCIATION
$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}$$
which has an F distribution with $k$ and $(n - k - 1)$
degrees of freedom.
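As a quick arithmetic check, the F ratio can be computed directly from R², the number of predictors k, and the sample size n. The function name `f_statistic` and the input values are hypothetical. Obtaining a p-value would additionally need an F distribution routine from a statistics package, which is left out of this sketch.

```python
def f_statistic(r_squared, k, n):
    """Overall F ratio with k and (n - k - 1) degrees of freedom."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))

# Hypothetical values: R squared of 0.5 with 2 predictors and 13 observations.
f_value = f_statistic(0.5, 2, 13)
print("F:", round(f_value, 3), "with df =", (2, 13 - 2 - 1))
```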
ANALYSIS
SIGNIFICANCE TESTING
Testing for the significance of the $\beta_i$'s can be done in a
manner similar to that in the bivariate case by using t tests.
The significance of the partial coefficient for importance
attached to weather may be tested by the following
equation:
$$t = \frac{b}{SE_b}$$
which has a t distribution with $n - k - 1$ degrees
of freedom.
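The t ratio can be illustrated in the simpler bivariate case (k = 1), where the standard error of the slope has a closed form. This is a sketch under that assumption; a multiple regression would need the full coefficient covariance matrix from a statistics package. The function name `slope_t_statistic` and the data are hypothetical.

```python
import math

def slope_t_statistic(x, y):
    """Return (b, SE_b, t, df) for the slope of a bivariate least-squares line."""
    n, k = len(x), 1
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    b = sum((a - mean_x) * (c - mean_y) for a, c in zip(x, y)) / sxx
    intercept = mean_y - b * mean_x
    ss_res = sum((c - (intercept + b * a)) ** 2 for a, c in zip(x, y))
    df = n - k - 1
    se_b = math.sqrt((ss_res / df) / sxx)   # standard error of the slope
    return b, se_b, b / se_b, df

# Hypothetical data for illustration only.
x = [2, 4, 6, 8, 10]
y = [3, 4, 6, 7, 9]

b, se_b, t, df = slope_t_statistic(x, y)
print("b:", round(b, 3), "SE_b:", round(se_b, 3), "t:", round(t, 2), "df:", df)
```

The resulting t value is compared with the critical value of the t distribution for the stated degrees of freedom.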
ANALYSIS
EXAMINATION OF RESIDUALS
859 8
682 5
471 3
708 9
1094 11
224 2
320 1
651 8
1049 12
[Table: Year, Operating Revenue]
METHODS OF FORECASTING
MEASURING FORECAST ERROR
⚫ Mean absolute deviation (MAD)
⚫ Average size of the “miss” regardless of direction
⚫ Mean squared error (MSE)
⚫ Penalizes large forecasting errors
⚫ A technique that produces moderate errors is preferable
to one that usually has small errors but occasionally
yields extremely large ones
⚫ Root mean squared error (RMSE)
⚫ Can be more easily interpreted since it has the same unit
as the series
⚫ Mean absolute percentage error (MAPE)
⚫ Useful when the size of the error relative to the series
values is important in evaluating accuracy
⚫ Especially useful when Y(t) values are large
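The four error measures can be computed side by side. The sketch below uses the usual textbook definitions. The function name `forecast_error_measures` and the actual and forecast values are hypothetical, and MAPE assumes no actual value is zero.

```python
import math

def forecast_error_measures(actual, forecast):
    """Return MAD, MSE, RMSE and MAPE (in percent) for paired actuals and forecasts."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e ** 2 for e in errors) / n
    rmse = math.sqrt(mse)
    mape = 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    return {"MAD": mad, "MSE": mse, "RMSE": rmse, "MAPE": mape}

# Hypothetical values for illustration only.
actual = [100, 200]
forecast = [90, 220]

measures = forecast_error_measures(actual, forecast)
for name, value in measures.items():
    print(name, round(value, 3))
```

Because MSE squares each error, a single large miss raises MSE and RMSE far more than it raises MAD.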
FORECASTING METHODS
⚫ Naïve: Used to develop simple models
assuming that very recent data provide the
best predictors of the future
⚫ Moving averages: Generates forecast based
on an average of past observations
⚫ Smoothing: Produce forecasts by averaging
past values of a series with a decreasing
(exponential) series of weights
[Figure: timeline with past data ($\ldots, Y_{t-3}, Y_{t-2}, Y_{t-1}$), "You are here" at $Y_t$, and periods to be forecast ($\hat{Y}_{t+1}, \hat{Y}_{t+2}, \hat{Y}_{t+3}, \ldots$)]
$Y_t$ is the most recent observation of a variable. $\hat{Y}_{t+1}$ is the forecast for one period in the future.
$e_t$ is the forecast error: the difference between the observed and the forecasted value.
FORECASTING METHODS
⚫ Steps involved in evaluating forecasting
methods:
⚫ Forecasting method is selected based on the
forecaster’s analysis of and intuition about the
nature of the data
⚫ Data set is divided into two sections: training or
fitting section and test or forecasting section
⚫ Selected forecasting technique is used to
develop fitted values for the training section
⚫ The technique is used to forecast for the test
section
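The evaluation steps above can be sketched with the simplest possible technique, a naive one-step-ahead forecast. The function names, the split point, and the series are hypothetical. MSE is used here as the accuracy measure, though any of the error measures could be substituted.

```python
def mean_squared_error(actual, forecast):
    """Average of squared differences between paired actuals and forecasts."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def evaluate_naive(series, split):
    """Return (training MSE, test MSE) for one-step-ahead naive forecasts."""
    train, test = series[:split], series[split:]
    # Training section: each value is "fitted" by the value just before it.
    train_mse = mean_squared_error(train[1:], train[:-1])
    # Test section: each period is forecast by the most recent observed value.
    test_forecasts = series[split - 1:-1]
    test_mse = mean_squared_error(test, test_forecasts)
    return train_mse, test_mse

# Hypothetical series for illustration only.
series = [12, 15, 14, 18, 20, 19, 23, 25, 24, 28]
train_mse, test_mse = evaluate_naive(series, split=7)
print("training MSE:", round(train_mse, 3), "test MSE:", round(test_mse, 3))
```

A technique that fits the training section well but forecasts the test section poorly is a sign that it does not suit the pattern in the data.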
FORECASTING METHODS
⚫ Naïve models
⚫ Based on the most recent information available
⚫ Assume that the recent periods are the best
predictors of the future
⚫ Technique can be adjusted to take trend into
consideration
⚫ Simple averages
⚫ Uses the mean of all relevant historical
observations as the forecast for the next period
⚫ The objective is to use past data to develop a
forecasting model for future periods
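A minimal sketch of the naive model, one common trend adjustment, and the simple average. The function names and the series are hypothetical, and the trend adjustment shown (adding the most recent change) is only one of several possible variants.

```python
def naive(series):
    """Forecast the next period as the most recent observation."""
    return series[-1]

def naive_with_trend(series):
    """Naive forecast adjusted by the most recent period-to-period change."""
    return series[-1] + (series[-1] - series[-2])

def simple_average(series):
    """Forecast the next period as the mean of all historical observations."""
    return sum(series) / len(series)

# Hypothetical series for illustration only.
demand = [10, 12, 15]
print("naive:", naive(demand))
print("naive with trend:", naive_with_trend(demand))
print("simple average:", round(simple_average(demand), 3))
```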
FORECASTING METHODS
⚫ Moving averages
⚫ MA of order k is the mean value of k
consecutive observations
⚫ Equal weights are assigned to each observation
⚫ Deals only with the latest k periods of known
data
⚫ Does not handle trend or seasonality very well
⚫ The smaller the order k, the larger the weight given to
recent periods
⚫ The larger the order k, the greater the smoothing
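A moving average of order k can be sketched as follows. The function names `moving_average_forecast` and `rolling_forecasts` and the series are hypothetical.

```python
def moving_average_forecast(series, k):
    """Forecast the next period as the mean of the latest k observations."""
    if k > len(series):
        raise ValueError("k cannot exceed the number of observations")
    return sum(series[-k:]) / k

def rolling_forecasts(series, k):
    """One-step-ahead forecasts for periods k+1, k+2, ... up to the next unseen period."""
    return [moving_average_forecast(series[:t], k) for t in range(k, len(series) + 1)]

# Hypothetical series for illustration only.
demand = [1, 2, 3, 4, 5]
print("next-period forecast (k=3):", moving_average_forecast(demand, 3))
print("rolling forecasts (k=3):", rolling_forecasts(demand, 3))
```

On this steadily rising series every forecast falls below the value that follows, which illustrates why a moving average does not handle trend well.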
BUSINESS FORECASTING
FORECASTING METHODS –
EXPONENTIAL SMOOTHING
⚫ Smoothing: Produce forecasts by
averaging past values of a series with a
decreasing (exponential) series of weights
⚫ Exponential smoothing method:
Procedure for continually revising a
forecast in the light of more recent
observations
⚫ The smoothing constant α serves as the
weighting factor
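Simple exponential smoothing revises the forecast with each new observation: the new forecast is α times the latest observation plus (1 − α) times the previous forecast. The sketch below assumes the first forecast is set equal to the first observation, which is one common initialization among several. The function name and the series are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Return forecasts for every period plus the next unseen period.

    forecasts[t] is the forecast for period t; the last element is the
    forecast for the period after the end of the series.
    """
    forecasts = [series[0]]  # assumed initialization: first forecast = first observation
    for y in series:
        forecasts.append(alpha * y + (1 - alpha) * forecasts[-1])
    return forecasts

# Hypothetical series for illustration only.
demand = [10, 20]
forecasts = exponential_smoothing(demand, alpha=0.5)
print("forecasts:", forecasts)
```

A larger α makes the forecast react faster to recent observations, while a smaller α gives a smoother series of forecasts.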
DOUBLE EXPONENTIAL
SMOOTHING – HOLT’S METHOD
⚫ Adjusted for trend variation in the data
⚫ Allows evolving local linear trends to
generate forecasts
⚫ Estimate of current level and current trend
(slope) is determined
⚫ Two smoothing constants: α (Smoothing
constant for level) and β (Smoothing
constant for trend) are estimated
⚫ α and β generated by minimizing MSE
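Holt's method can be sketched with a level update, a trend update, and a coarse grid search that picks α and β by minimizing the MSE of the one-step-ahead forecasts. The initial level and trend (first observation and first difference) are assumptions of this sketch, as are the function names, the grid, and the series. Statistical packages typically use a numerical optimizer rather than a grid.

```python
def holt(series, alpha, beta, horizon=1):
    """Holt's linear trend method; returns (forecast, MSE of one-step-ahead errors)."""
    level, trend = series[0], series[1] - series[0]  # assumed initial values
    squared_errors = []
    for y in series[1:]:
        one_step = level + trend                      # forecast made before seeing y
        squared_errors.append((y - one_step) ** 2)
        new_level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    mse = sum(squared_errors) / len(squared_errors)
    return level + horizon * trend, mse

def best_constants(series, grid=None):
    """Grid search for the (alpha, beta) pair with the smallest MSE."""
    grid = grid or [i / 10 for i in range(1, 10)]
    pairs = ((a, b) for a in grid for b in grid)
    return min(pairs, key=lambda ab: holt(series, ab[0], ab[1])[1])

# Hypothetical series for illustration only.
demand = [12, 15, 14, 18, 20, 19, 23, 25, 24, 28]
alpha, beta = best_constants(demand)
forecast, mse = holt(demand, alpha, beta)
print("alpha:", alpha, "beta:", beta, "forecast:", round(forecast, 2), "MSE:", round(mse, 3))
```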
EXPONENTIAL SMOOTHING –
WINTERS' METHOD
⚫ Adjusted for trend and seasonal variation in
the data
⚫ Three parameters α, β, and γ are determined
for estimating level, trend and seasonality.
⚫ α, β, and γ generated by minimizing MSE
⚫ Popular technique for short-term forecasting
⚫ Low cost and simple
⚫ Based on past values which include both
random fluctuations and information
⚫ Assumes that extreme fluctuations represent
randomness in a series
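A minimal sketch of a seasonal smoothing scheme with level, trend, and seasonal components. It uses the additive form for simplicity, which is an assumption of this sketch: the method is also commonly presented in a multiplicative form. The initialization from the first two seasons, the function name `winters_additive`, the season length, and the data are all hypothetical. In practice the three constants would be chosen by minimizing MSE rather than fixed by hand.

```python
def winters_additive(series, m, alpha, beta, gamma, horizon=1):
    """Additive seasonal smoothing; returns (forecast, MSE of one-step-ahead errors).

    m is the season length (for example 4 for quarterly data).
    """
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")
    first, second = series[:m], series[m:2 * m]
    level = sum(first) / m                            # assumed initial level
    trend = (sum(second) - sum(first)) / (m * m)      # assumed initial trend per period
    seasonal = [y - level for y in first]             # assumed initial seasonal effects
    squared_errors = []
    for t in range(m, len(series)):
        y, s = series[t], seasonal[t % m]
        squared_errors.append((y - (level + trend + s)) ** 2)
        new_level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y - new_level) + (1 - gamma) * s
        level = new_level
    mse = sum(squared_errors) / len(squared_errors)
    t_future = len(series) + horizon - 1
    return level + horizon * trend + seasonal[t_future % m], mse

# Hypothetical quarterly series for illustration only.
sales = [30, 45, 60, 40, 34, 50, 66, 45, 38, 55, 72, 49]
forecast, mse = winters_additive(sales, m=4, alpha=0.3, beta=0.1, gamma=0.2)
print("next-quarter forecast:", round(forecast, 2), "MSE:", round(mse, 3))
```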