Analysis of Data - Unit III (New)
Type of measurement and type of descriptive analysis
• Two categories: frequency table, proportion (percentage)
• Ratio: index numbers, geometric mean, harmonic mean
[Table: type of scale, measure of central tendency, measure of dispersion]
[Bar chart: East, West and North by quarter (1st Qtr to 4th Qtr); vertical axis from 0 to 90]
Web Surveyor Bar Chart
How did you find your last job?
• Networking: 643
• Print ad: 213 (18.3 %)
• Online recruitment site: 179
• Placement firm: 112 (9.6 %)
• Temporary agency: 18 (1.5 %)
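The percentages on this chart are each count divided by the total number of responses. A minimal sketch of that frequency table, using the counts listed on the slide (the function name `frequency_table` is just an illustrative choice):

```python
# Counts as listed on the slide for "How did you find your last job?"
job_source_counts = {
    "Networking": 643,
    "Print ad": 213,
    "Online recruitment site": 179,
    "Placement firm": 112,
    "Temporary agency": 18,
}


def frequency_table(counts):
    """Return {category: (count, percentage)}, percentages rounded to 1 decimal."""
    total = sum(counts.values())
    return {
        category: (count, round(100 * count / total, 1))
        for category, count in counts.items()
    }


table = frequency_table(job_source_counts)
for category, (count, percent) in table.items():
    print(f"{category:<25}{count:>5}{percent:>7.1f} %")
```

The three percentages that survive on the slide (18.3 %, 9.6 % and 1.5 %) agree with this calculation.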
• Univariate Statistics
• Bivariate Statistics
• Multivariate Statistics
Univariate Statistics
• Test of statistical significance
• Hypothesis testing one variable at a time
Significance Level
• Critical Probability
• Confidence Level
• Alpha
• Probability level selected is typically .05 or .01
Type I and Type II Errors
                                                Accept null          Reject null
Null is true (medicine can cure disease)        Correct, no error    Type I error
Null is false (medicine cannot cure disease)    Type II error        Correct, no error
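The four cells of the table can be written as a small lookup. This is only a sketch of the table's logic; the function name `classify_decision` is hypothetical.

```python
def classify_decision(null_is_true: bool, reject_null: bool) -> str:
    """Name the outcome of a test decision, following the table above."""
    if null_is_true and reject_null:
        return "Type I error"    # a true null hypothesis was rejected
    if not null_is_true and not reject_null:
        return "Type II error"   # a false null hypothesis was accepted
    return "Correct, no error"


print(classify_decision(null_is_true=True, reject_null=True))
print(classify_decision(null_is_true=False, reject_null=False))
```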
Type I and Type II Errors in Hypothesis Testing
• Central tendency, i.e. finding a ‘typical’ value from the middle of the data.
Arithmetic Mean
The arithmetic mean is a mathematical average and is the most popular measure of central tendency.
It is frequently referred to simply as the ‘mean’. It is obtained by dividing the sum of the values of all observations in a series (ƩX) by the number of items (N) constituting the series.
Thus, the mean of a set of numbers X1, X2, X3, …, XN is denoted by x̅ and is defined as
$$\bar{x} = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \dots + X_N}{N}$$
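As a small illustration of the definition, the sketch below divides ƩX by N for a made-up series. The values are hypothetical and are not taken from the slides.

```python
def arithmetic_mean(values):
    """Sum of all observations divided by the number of observations."""
    if not values:
        raise ValueError("the series must contain at least one observation")
    return sum(values) / len(values)


# Hypothetical series of five observations
series = [40, 50, 55, 60, 45]
mean_value = arithmetic_mean(series)
print(mean_value)
```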
Methods of Calculating the Arithmetic Mean:
• Direct Method:
x̅ = 40 + …
= 40 + 0.52 × 20
= 40 + 10.37
= 50.37
Advantages of Median
in the distribution.
MODE
The mode is defined as the value of the item in a series that occurs most frequently.
It is denoted by the capital letter Z.
Croxton and Cowden defined it as “the mode of a distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series of values.”
The exact value of the mode can be obtained by the following formula.
Z = L1 + …
Example: Calculate the mode for the distribution of monthly rent paid by libraries in Karnataka.
Z = 2000 + …
Z = 2000 + 0.8 × 500 = 2000 + 400
Z = 2400
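The fraction in the mode formula did not survive on the slide. A common textbook form for grouped data, assumed here rather than taken from the source, is Z = L1 + (f1 − f0) / (2·f1 − f0 − f2) × i, where L1 is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the classes before and after it, and i the class width. The sketch below uses hypothetical frequencies chosen only so that the fraction equals the 0.8 seen in the example.

```python
def grouped_mode(lower_limit, f1, f0, f2, class_width):
    """Mode of grouped data using the assumed textbook formula.

    lower_limit: lower limit of the modal class (L1)
    f1: frequency of the modal class
    f0, f2: frequencies of the preceding and following classes
    class_width: width of the modal class (i)
    """
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 equals 0")
    return lower_limit + (f1 - f0) / denominator * class_width


# Hypothetical frequencies: (30 - 14) / (60 - 14 - 26) = 16 / 20 = 0.8
z = grouped_mode(lower_limit=2000, f1=30, f0=14, f2=26, class_width=500)
print(z)
```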
Advantages of Mode
• The mode is readily comprehensible and easily calculated.
• It is the best representative of the data.
• It is not affected at all by extreme values.
• The value of the mode can also be determined graphically.
• It is usually an actual value of an important part of the series.
Disadvantages of Mode
• It is not based on all observations.
• It is not capable of further
mathematical manipulation.
• Mode is affected to a great extent by
sampling fluctuations.
• Choice of grouping has great
influence on the value of mode.
Advantages and Disadvantages
Mean
• Advantage: More sensitive than the median, because it makes use of all the values of the data.
• Disadvantage: It can be misrepresentative if there is an extreme value.
Median
• Advantage: It is not affected by extreme scores, so can give a representative value.
• Disadvantage: It is less sensitive than the mean, as it does not take into account all of the values.
Mode
• Advantage: It is useful when the data are in categories, such as the number of babies who are securely attached.
• Disadvantage: It is not a useful way of describing data when there are several modes.
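These trade-offs can be seen by adding one extreme value to a small data set. The scores below are hypothetical, and the sketch relies only on Python's standard `statistics` module.

```python
from statistics import mean, median, mode, multimode

# Hypothetical scores, then the same scores with one extreme value added
scores = [2, 3, 3, 4, 5]
scores_with_extreme = scores + [40]

summary = {
    "original": (mean(scores), median(scores), mode(scores)),
    "with extreme": (
        mean(scores_with_extreme),
        median(scores_with_extreme),
        mode(scores_with_extreme),
    ),
}
for label, (m, med, mo) in summary.items():
    print(f"{label:<13} mean={m:.2f} median={med} mode={mo}")

# With several modes, no single value describes the data well
several_modes = multimode([1, 1, 2, 2, 3])
print(several_modes)
```

The mean moves a long way when the extreme value is added, while the median shifts only slightly and the mode does not change.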
Measures of Dispersion
• Measures of ‘spread’.
• These look at how ‘spread out’ the data are.
• Are the scores similar to each other (closely clustered), or quite spread out?
Range and Standard Deviation
• The range is the difference between the highest
and lowest numbers. What is the range of …
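A short sketch of both measures on a hypothetical set of scores. The range follows the definition above. The standard deviation is computed with the standard `statistics` module in both its population form (`pstdev`) and its sample form (`stdev`), since the slide does not say which one is intended.

```python
from statistics import pstdev, stdev


def value_range(values):
    """Difference between the highest and lowest numbers."""
    return max(values) - min(values)


# Hypothetical scores
scores = [4, 8, 6, 5, 7]

score_range = value_range(scores)
population_sd = pstdev(scores)  # divides the squared deviations by N
sample_sd = stdev(scores)       # divides the squared deviations by N - 1

print(score_range, round(population_sd, 3), round(sample_sd, 3))
```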
I used Cara Flanagan’s (2005) Research Methods for AQA A Psychology Nelson Thornes in preparing these slides.
Choosing the Appropriate
Statistical Technique
Type of question to be answered
• Number of variables
– Univariate
– Bivariate
– Multivariate
• Scale of measurement
• Data Distribution
Inferential Statistical Tools
Univariate Analysis
Univariate Tools
• Z-Test
• t-Test
• Chi-Square Test (Distribution Test)
• Mann Whitney U Test
• Univariate ANOVA
Calculating Zobs
$$Z_{obs} = \frac{\bar{X} - \mu}{S_{\bar{X}}}$$
Alternate Way of Testing the Hypothesis
$$Z_{obs} = \frac{\bar{X} - \mu}{S_{\bar{X}}}$$
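A sketch of the Z test for one mean: the observed Z is the sample mean minus the hypothesized mean, divided by the standard error of the mean. It assumes the standard error is estimated as the sample standard deviation divided by the square root of n, and all numbers are hypothetical. The critical value comes from `statistics.NormalDist` for a two-tailed test at the .05 level.

```python
import math
from statistics import NormalDist


def z_observed(sample_mean, hypothesized_mean, sample_sd, n):
    """Observed Z: distance of the sample mean from the hypothesized mean,
    measured in standard errors (assumed to be sample_sd / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Hypothetical figures: sample mean 3.78, hypothesized mean 3.0, S = 1.5, n = 225
z_obs = z_observed(3.78, 3.0, 1.5, 225)

alpha = 0.05
critical_z = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed critical value
reject_null = abs(z_obs) > critical_z

print(round(z_obs, 2), round(critical_z, 2), reject_null)
```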
t-Distribution
• Symmetrical, bell-shaped distribution
• Mean of zero and a standard deviation close to one (slightly larger for small samples)
• Shape influenced by degrees of freedom
Degrees of Freedom
• Abbreviated d.f.
• d.f. = number of observations minus number of constraints
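For a one-sample t-test there is one constraint (the sample mean), so d.f. = n − 1. The sketch below computes the t statistic and its degrees of freedom for a hypothetical sample; looking up the critical value in a t table is left out.

```python
import math
from statistics import mean, stdev


def one_sample_t(values, hypothesized_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(values)
    standard_error = stdev(values) / math.sqrt(n)  # stdev divides by n - 1
    t_stat = (mean(values) - hypothesized_mean) / standard_error
    return t_stat, n - 1


# Hypothetical sample of five observations, tested against a mean of 12
t_stat, degrees_of_freedom = one_sample_t([12, 15, 11, 14, 13], 12)
print(round(t_stat, 3), degrees_of_freedom)
```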
Testing a Hypothesis about a
Distribution
• Chi-Square test
• Test for significance in the analysis of
frequency distributions
• Compare observed frequencies with
expected frequencies
• “Goodness of Fit”
Chi-Square Test
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$
Chi-Square Test
χ² = chi-square statistic
Oi = observed frequency in the ith cell
Ei = expected frequency in the ith cell
Chi-Square Test
Estimation for Expected Number
for Each Cell
$$\chi^2 = \frac{(O_1 - E_1)^2}{E_1} + \frac{(O_2 - E_2)^2}{E_2}$$
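The two-cell sum above can be checked with a few lines of code. The observed and expected frequencies are hypothetical, and the helper name `chi_square_statistic` is illustrative. The value 3.841 is the usual tabled critical value for 1 degree of freedom at the .05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical two-cell example: 100 respondents, 50 expected in each cell
observed = [60, 40]
expected = [50, 50]

chi_square = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1
CRITICAL_VALUE_05_DF1 = 3.841  # tabled value, alpha = .05, d.f. = 1

print(chi_square, degrees_of_freedom, chi_square > CRITICAL_VALUE_05_DF1)
```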
Inferential Statistical Tools
Bivariate Analysis
Measures of Association
Type of measurement: measure of association
• Ordinal scales: chi-square, Spearman R, Kendall Tau, Coefficient Gamma
• Nominal: chi-square, phi coefficient, Fisher exact test
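As one example from the nominal row, the phi coefficient for a 2 × 2 table with cells a, b (first row) and c, d (second row) is (ad − bc) divided by the square root of the product of the four marginal totals. The counts below are hypothetical.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for a 2 x 2 table [[a, b], [c, d]]."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("every row and column total must be greater than 0")
    return (a * d - b * c) / denominator


# Hypothetical 2 x 2 table of counts
phi = phi_coefficient(30, 10, 10, 30)
print(phi)
```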
Bivariate Analysis -
Tests of Differences
Common Bivariate Tests
[Table columns: type of measurement; differences between two independent groups; differences among three or more independent groups]
• Yes: dependence methods
• No: interdependence methods
Dependence Methods
• A category of multivariate statistical
techniques; dependence methods explain or
predict a dependent variable(s) on the basis
of two or more independent variables
Dependence Methods: how many variables are dependent?
• One dependent variable
– Metric: multiple regression analysis
– Non-metric: multiple discriminant analysis
• Several dependent variables
– Metric: multivariate analysis of variance
– Non-metric: conjoint analysis
• Multiple independent and dependent variables
– Metric or non-metric: canonical correlation analysis
Interdependence Methods
• Metric: factor analysis, cluster analysis, metric multidimensional scaling
• Nonmetric: none
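The classification of multivariate techniques can be followed as a short decision function. This is only a sketch of the tree as it appears on these slides; the function and argument names are hypothetical.

```python
def choose_multivariate_technique(some_variables_dependent,
                                  dependent_variables=None,
                                  metric=True):
    """Return the techniques listed on the slides for a given situation.

    dependent_variables: "one", "several", or
    "multiple independent and dependent" (used only for dependence methods).
    """
    if not some_variables_dependent:
        # Interdependence methods
        if metric:
            return ["factor analysis", "cluster analysis",
                    "metric multidimensional scaling"]
        return []  # the slides list none for nonmetric data

    # Dependence methods
    if dependent_variables == "one":
        return (["multiple regression analysis"] if metric
                else ["multiple discriminant analysis"])
    if dependent_variables == "several":
        return (["multivariate analysis of variance"] if metric
                else ["conjoint analysis"])
    if dependent_variables == "multiple independent and dependent":
        return ["canonical correlation analysis"]  # metric or non-metric
    raise ValueError("unknown value for dependent_variables")


print(choose_multivariate_technique(True, "one", metric=False))
```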
Summary Table of Statistical Tests
Level of measurement by sample characteristics (1 sample; 2 samples; K samples, i.e., >2) and correlation
Categorical or nominal
– 1 sample: Χ2 or binomial
– 2 samples, independent: Χ2
– 2 samples, dependent/paired/related/repeated: McNemar’s Χ2
– K samples, independent: Χ2
– K samples, dependent: Cochran’s Q
Parametric (interval & ratio)
– 1 sample: z test or t test
– 2 samples, independent: t test between groups (independent sample t-test)
– 2 samples, dependent/paired/related: t test within groups (paired t-test)
– K samples, independent: 1 way ANOVA between groups
– K samples, dependent: 1 way ANOVA (within or repeated measure)
– K samples: factorial (2 way) ANOVA
– Correlation: Pearson’s r
(Plonskey, 2001)
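One way to use the summary table is as a lookup keyed by level of measurement and sample design. The sketch below encodes only the entries that are legible in the table above; the key names are hypothetical labels chosen for illustration.

```python
# (level of measurement, sample design) -> test, as read from the summary table
SUMMARY_TABLE = {
    ("nominal", "1 sample"): "chi-square or binomial",
    ("nominal", "2 independent"): "chi-square",
    ("nominal", "2 dependent"): "McNemar's chi-square",
    ("nominal", "k independent"): "chi-square",
    ("nominal", "k dependent"): "Cochran's Q",
    ("interval/ratio", "1 sample"): "z test or t test",
    ("interval/ratio", "2 independent"): "independent sample t-test",
    ("interval/ratio", "2 dependent"): "paired t-test",
    ("interval/ratio", "k independent"): "1 way ANOVA between groups",
    ("interval/ratio", "k dependent"): "1 way ANOVA (repeated measure)",
    ("interval/ratio", "correlation"): "Pearson's r",
}


def suggest_test(level, design):
    """Look up the test for a measurement level and sample design."""
    try:
        return SUMMARY_TABLE[(level, design)]
    except KeyError:
        raise ValueError(f"no entry for {level!r} with {design!r}") from None


# Example: an interval-scaled measure taken repeatedly on the same group
print(suggest_test("interval/ratio", "k dependent"))
```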
Suppose we want to compare attitudes towards a brand among buyers in different cities. Which test can we apply, and why?
In a yoga class, BP is measured three times over a span of three weeks. Which test would be suitable in this case?
If we want to measure the impact of brand image on purchase intention, which test should be applied, and why?
If preference towards shopping malls is measured for male and female respondents, which test should be applied?
If individuals in three sections are compared on their attitude towards online classes, which test should be applied?
“I want to purchase branded clothes but am restricted by their price.” Which kind of study is this, and which test should be applied?