Assumptions of Various Statistical Tests


Tackling Violations of Assumptions in Various Statistical Tests, and a Summary of Statistical Tests


by
Dr Lalit Prasad
Outline
• No outliers
• Linearity
• Normality
• Multicollinearity
• Mahalanobis Distance (D²) and Cook's Distance
• Homogeneity of Variance or Homogeneity of Variance-Covariance (Box's M Test Sig. Value > 0.05)
• Standardization of Variables
• Homoscedasticity
• Nonparametric Options of Parametric Tests
• Questionnaire Design and Reversing of Data
• Validity and Reliability: Importance of Literature Review
• When to Use Which Statistical Test?
Outliers
• How to identify univariate outliers in SPSS (box plot): Analyze – Descriptive Statistics – Explore – Statistics – Outliers
• If there are no outliers: go ahead.
• If outliers are present: remove them from the data set, or increase the sample size.
• Also report the significance value of the statistical test with and without the outliers.
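The box-plot rule behind the steps above can be sketched in Python. This is a minimal illustration rather than a reproduction of the SPSS output: the data are hypothetical, the 1.5 × IQR fence is the usual box-plot convention, and the quartile method ("inclusive") is an assumption, so borderline cases may differ slightly from SPSS.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Quartile method is an assumption; SPSS may compute hinges differently.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical scores with one extreme value
scores = [12, 14, 15, 15, 16, 17, 18, 19, 20, 58]

outliers = iqr_outliers(scores)
cleaned = [v for v in scores if v not in outliers]

# Compare a summary statistic with and without the flagged cases
print("outliers:", outliers)
print("mean with outliers:   ", statistics.fmean(scores))
print("mean without outliers:", statistics.fmean(cleaned))
```

Running the same comparison on the test of interest (instead of the mean) shows how much the conclusion depends on the flagged cases.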
Linearity
• Linearity: there must be a linear relationship between the variables.
• In SPSS: Graphs – Legacy Dialogs – Scatter/Dot – Matrix Scatter (for multiple variables).
• If the relationship is not linear: use polynomial regression in place of linear regression.
Trends of Data
• Linear
• Seasonality
• Cyclical elements
• Autocorrelation
• Random variation

Assumption of Independence: residuals should be uncorrelated. This assumption is tested with the Durbin-Watson statistic in the Model Summary table of the regression output. The statistic ranges from 0 to 4. A value equal or close to 2 indicates no autocorrelation; values between 0 and 2 indicate positive autocorrelation, and values between 2 and 4 indicate negative autocorrelation.
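For readers who want to see where the 0 to 4 range comes from, the Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The sketch below computes it directly; both residual series are hypothetical and were chosen only to show the two extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that drift slowly: neighbours are alike (positive autocorrelation)
trending = [1.0, 0.8, 0.6, 0.3, -0.2, -0.5, -0.9, -1.1]

# Hypothetical residuals that flip sign every step (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

print("trending:   ", round(durbin_watson(trending), 3))     # well below 2
print("alternating:", round(durbin_watson(alternating), 3))  # well above 2
```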
Normality
• Univariate normality: Shapiro-Wilk test in SPSS (Sig. value > 0.05).
• Multivariate normality: Mardia, Henze-Zirkler and Royston tests in R/RStudio.
• Why residual normality is more important than normality of the observed values: the assumption in regression and ANOVA concerns the error term, not the raw variables.
• If normality is violated: apply a square-root, log (base 10) or natural-log transformation to the variables, or increase the sample size.
• How to transform an observed variable with the square-root, log and natural-log functions in SPSS:
• Transform – Compute Variable – define the Target Variable – Function group (All) – Functions and Special Variables [SQRT(), LG10(), LN()]
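The effect of these transformations can be illustrated with a short Python sketch. It uses skewness as a rough indicator only; it does not replace the Shapiro-Wilk test, and the income figures are hypothetical.

```python
import math
import statistics

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

# Hypothetical right-skewed variable (all values must be positive for log/ln)
income = [12, 15, 18, 20, 22, 25, 30, 45, 80, 150]

transforms = {
    "raw": lambda v: v,
    "sqrt": math.sqrt,
    "log10": math.log10,
    "ln": math.log,
}

for name, fn in transforms.items():
    transformed = [fn(v) for v in income]
    print(f"{name:>5}: skewness = {skewness(transformed):.3f}")
```

Base-10 and natural logs differ only by a constant factor, so they give the same skewness; the choice between them affects interpretation of the units, not the shape of the distribution.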
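Mahalanobis and Cook's Distance for Multivariate Distributions
• Mahalanobis distance (D²) = (xi − m)² / (sd)², where m is the mean and sd the standard deviation
• Threshold (cut-off) value of the Mahalanobis distance = 18.47 (the chi-square critical value for 4 degrees of freedom at p = 0.001)
• Threshold value of Cook's distance: 3 times the mean Cook's distance, or 4/n; in any case values should be less than 1.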
• Data matrix:
        x11  x12  x13  ...  x1m
  Y  =  x21  x22  x23  ...  x2m
        x31  x32  x33  ...  x3m
        ...
        xm1  xm2  xm3  ...  xmm
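With more than one variable, the squared Mahalanobis distance generalizes to D² = (x − m)ᵀ S⁻¹ (x − m), where m is the vector of means and S the covariance matrix. The sketch below works this out by hand for the two-variable case so that it needs no external libraries; the data and the function name are hypothetical. Each D² would then be compared with the chi-square critical value for the number of variables in the analysis.

```python
import statistics

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each (x, y) row from the sample centroid."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    n = len(points)

    # Sample covariance matrix S = [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2

    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^-1 (dx, dy)^T with the 2x2 inverse written out
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances

# Hypothetical data: the last case breaks the overall x-y pattern
points = [(2, 3), (3, 5), (4, 4), (5, 7), (6, 6),
          (7, 9), (8, 8), (9, 10), (10, 11), (3, 12)]

d2 = mahalanobis_sq_2d(points)
for case, value in zip(points, d2):
    print(case, round(value, 2))
print("most unusual case:", points[d2.index(max(d2))])
```

Note that the last case is unremarkable on each variable taken separately; it stands out only because its combination of values is unusual, which is what the multivariate distance is designed to detect.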
Multicollinearity
• Multicollinearity is indicated by r > 0.7, VIF > 10, or Tolerance < 0.2.
• What to do if multicollinearity exists?
• Drop one or more of the highly correlated (or singular) variables, combine two variables into one, or use ridge regression.
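The three indicators are related: tolerance is 1 − R², where R² comes from regressing one predictor on the others, and VIF is 1 / tolerance. With only two predictors, R² is simply the squared correlation between them, which allows a short illustration. The predictor names and values below are hypothetical.

```python
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def collinearity_two_predictors(x1, x2):
    """Return (r, tolerance, VIF) for a model with exactly two predictors."""
    r = pearson_r(x1, x2)
    tolerance = 1 - r ** 2   # share of x1's variance not explained by x2
    vif = 1 / tolerance
    return r, tolerance, vif

# Hypothetical predictors that move almost in lockstep
experience_years = [1, 2, 3, 4, 5, 6, 7, 8]
age_years = [22, 23, 25, 26, 27, 29, 30, 31]

r, tolerance, vif = collinearity_two_predictors(experience_years, age_years)
print(f"r = {r:.3f}, tolerance = {tolerance:.4f}, VIF = {vif:.1f}")
```

With three or more predictors, each tolerance requires a separate regression of that predictor on all the others, which is what SPSS reports in the collinearity statistics of the coefficients table.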
Homogeneity of Variance or Homogeneity of Variance-Covariance
• The variances and covariances of the different independent groups should be the same.
• Levene's test of homogeneity (ANOVA) and Box's M test (MANOVA): a Sig. value > 0.05 means the assumption holds.
• If the Levene's test Sig. value is < 0.05, use the Kruskal-Wallis H test in place of one-way ANOVA.
• If the Box's M Sig. value is < 0.05, use the Pillai's trace value in MANOVA.
• Note: aim for equal sample sizes across the independent groups and for random selection.
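To make the logic of Levene's test concrete: it is a one-way ANOVA carried out on the absolute deviations of each score from its own group mean. The sketch below computes only the W statistic, under the assumption of mean-centred deviations; the Sig. value additionally requires the F distribution with (k − 1, N − k) degrees of freedom, which the Python standard library does not provide. The groups are hypothetical.

```python
import statistics

def levene_w(*groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    # Step 1: replace every score by its absolute deviation from its group mean
    z = [[abs(v - statistics.fmean(g)) for v in g] for g in groups]

    k = len(z)
    n_total = sum(len(zi) for zi in z)
    grand_mean = statistics.fmean([v for zi in z for v in zi])

    # Step 2: one-way ANOVA F ratio on the deviations
    between = sum(len(zi) * (statistics.fmean(zi) - grand_mean) ** 2 for zi in z)
    within = sum((v - statistics.fmean(zi)) ** 2 for zi in z for v in zi)
    return ((n_total - k) / (k - 1)) * between / within

# Hypothetical groups: different means, identical spread
same_spread = levene_w([1, 2, 3], [11, 12, 13], [21, 22, 23])

# Hypothetical groups: same mean, very different spread
different_spread = levene_w([9, 10, 11], [0, 10, 20])

print("W, equal spreads:  ", round(same_spread, 3))
print("W, unequal spreads:", round(different_spread, 3))
```

A W near zero means the groups vary by about the same amount; larger values point toward unequal variances.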
Standardization of Variables
• If the data are not on the same scale or unit of measurement, convert the variables to standardized form.
• How to convert variables into standardized variables:
• Analyze – Descriptive Statistics – Descriptives – Save standardized values as variables
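Standardization here means converting each value to a z-score, z = (x − mean) / standard deviation. A minimal sketch follows, assuming the sample standard deviation (n − 1 in the denominator); the two variables are hypothetical.

```python
import statistics

def standardize(values):
    """Convert values to z-scores using the sample standard deviation (n - 1)."""
    m = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - m) / sd for v in values]

# Hypothetical variables measured in very different units
salary_rupees = [25000, 32000, 41000, 38000, 56000]
experience_years = [1, 3, 6, 5, 10]

z_salary = standardize(salary_rupees)
z_experience = standardize(experience_years)

# After standardization both variables have mean 0 and standard deviation 1
for label, z in (("salary", z_salary), ("experience", z_experience)):
    print(label, [round(v, 2) for v in z])
```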
Homoscedasticity
• Homoscedasticity describes a situation in which the variance of the error term is the same across all values of the independent variables.
• That is, it concerns whether the scores cluster uniformly around the regression line.
Nonparametric Options of Parametric Tests
Sr No. | Parametric Test | Nonparametric Test
1 | Independent t-test | Mann-Whitney U test
2 | Paired t-test | Wilcoxon signed-rank test
3 | One-way ANOVA | Kruskal-Wallis test
4 | MANOVA | Multivariate Kruskal-Wallis test
5 | One-way ANOVA with repeated measures | Friedman test
6 | Correlation | Spearman's rank correlation and chi-square test
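What these alternatives have in common is that they work on ranks rather than raw values. As one illustration, Spearman's rank correlation from the table above is the Pearson correlation computed on the ranks, with tied values sharing their average rank. The sketch below uses hypothetical data and helper names.

```python
import statistics

def average_ranks(values):
    """Rank values from 1..n; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = statistics.fmean(rx), statistics.fmean(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: the relationship is monotonic but not linear
hours = [1, 2, 3, 4, 5]
output = [1, 4, 9, 16, 25]

print("ranks with a tie:", average_ranks([10, 20, 20, 30]))
print("Spearman rho:", spearman_rho(hours, output))
```

Because only the ordering matters, a curved but steadily increasing relationship still gives a perfect rank correlation.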
Questionnaire Design and Reversing of Data
Response scale: SD (1), D (2), SWD (3), N (4), SWA (5), A (6), SA (7)
Sr No | Question(s) | Response
1 | I achieved my targets in the 1st quarter of 19-20 | √
2 | I achieved my targets in the 2nd quarter of 19-20 | √
3 | I achieved my targets in the 3rd quarter of 19-20 | √
4 | I achieved my targets in the 4th quarter of 19-20 | √
5 | I didn't achieve my targets in FY 2019-20 | √
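Item 5 is worded in the opposite direction to items 1 to 4, so its score has to be reversed before the items are combined. On a 7-point scale the reversed score is 8 minus the original score (in general, minimum + maximum − score). The sketch below shows this; the respondent's answers and the item keys are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 7  # 7-point scale from SD (1) to SA (7)

def reverse_code(score, low=SCALE_MIN, high=SCALE_MAX):
    """Reverse a Likert score: 1 <-> 7, 2 <-> 6, 3 <-> 5, 4 stays 4."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside the {low}-{high} scale")
    return low + high - score

# Hypothetical answers from one respondent (q5 is the negatively worded item)
responses = {"q1": 6, "q2": 7, "q3": 5, "q4": 6, "q5": 2}
NEGATIVE_ITEMS = {"q5"}

scored = {
    item: reverse_code(value) if item in NEGATIVE_ITEMS else value
    for item, value in responses.items()
}

print("raw:   ", responses)
print("scored:", scored)
print("total: ", sum(scored.values()))
```

After reversal, a high score on every item means the same thing (targets achieved), so the items can be summed or averaged.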
Validity and Reliability: Importance of Literature Review
1. Validity: the ability to measure what we are supposed to measure.
a. Salary
b. Incentives
c. Working environment
d. Promotion policy
e. Behavior of boss
2. Reliability
When to Use Which Statistical Test?
Tests | Factor Variable / Independent Variable / Predictor | Response Variable / Dependent Variable / Criterion | Sample(s) | Application
One-sample t-test | Metric (test value) | Metric (interval and ratio) | One sample | Compare means
Independent-samples t-test | Non-metric | Metric | Two independent samples | Compare means
Paired t-test | Metric | Metric | Related samples (before and after) | Compare means
Chi-square | Non-metric | Non-metric | One sample, two or more samples | Test of independence & goodness of fit
Summary: ANOVA Family
IV = Independent Variable, DV = Dependent Variable
Number of IVs | Type of IVs | DV = 1 (Metric) | DV >= 2 (Metric)
IV = 1 | Nonmetric | One-way ANOVA | One-way MANOVA
IV = 2 | Both nonmetric | Two-way ANOVA | Two-way MANOVA
IV = 2 | Mixed: one metric (covariate) & one nonmetric | One-way ANCOVA | One-way MANCOVA
IV = 3 | All nonmetric | N-way ANOVA | N-way MANOVA
IV = 3 | Mixed: two nonmetric & one metric | Two-way ANCOVA | Two-way MANCOVA
Continued…
Tests | Factor Variable(s) / Independent Variable(s) / Predictors | Response Variable(s) / Dependent Variable(s) / Criterion | Application
Correlation | Metric | Metric | Strength and direction of relationship
Regression: simple and multiple | Metric | Metric | Impact of independent variable on dependent variable
Multiple regression using dummy variables | Mixed or categorical | Metric | Same as above
Hierarchical regression | Metric or mixed | Metric | Impact of independent variable on dependent variable; best model
Logistic regression | Non-metric or metric or mixed | Non-metric | Impact of independent variable on dependent variable
Continued…
Tests | Factor Variable / Independent Variable / Predictor | Response Variable / Dependent Variable / Criterion | Sample(s) | Application
Discriminant analysis | Metric | Non-metric | ---- | Impact of independent variable on dependent variable
Correspondence analysis | Non-metric (nominal with two categories) | Non-metric (nominal with two categories) | ---- | Association
Multidimensional scaling | Ordinal, interval and ratio | Ordinal, interval and ratio | ---- | Association
Canonical correlation | Metric | Metric | ---- | Relationship between two variates
Conjoint analysis | Metric or mixed or non-metric | Metric | ---- | Find values of different stimuli
Continued…
Tests | Factor Variable / Independent Variable / Predictor | Response Variable / Dependent Variable / Criterion | Sample(s) | Application | Comment
Hierarchical regression | Metric or mixed | Metric | ---- | Impact of independent variable on dependent variable | Best model with the IVs
Correspondence analysis | Non-metric | Non-metric | ---- | Strength of relationship | Association between products and attributes
Canonical correlation | Metric | Metric | ---- | Relationship between two variates | -
Conjoint analysis | Metric or mixed or non-metric | Metric | ---- | Find values of different stimuli | -
Continued…
Tests | Scale | Comments
Factor analysis | Metric | Data reduction or dimension reduction: reduction of variables into factors
Cluster analysis | Metric or non-metric | Segmentation
Confirmatory factor analysis (measurement model) | Metric | Validity
Structural equation modeling (SEM) using SPSS and AMOS | Metric | SEM is somewhat like performing many linear regressions at the same time, with each of the variables in the regression also being tested as in a factor analysis. Multiple tests done simultaneously add their own value to understanding the larger picture.
Thank You
