
Statistical Tools and Techniques in Research
SHERWIN E. BALBUENA
Statistics
• a branch of mathematics
dealing with the collection,
analysis, interpretation, and
presentation of masses of
numerical data (Merriam-
Webster, n.d.)

• Merriam-Webster. (n.d.). Statistics. In Merriam-Webster.com dictionary. Retrieved May 27, 2021, from https://www.merriam-webster.com/dictionary/statistics

9/3/20XX Presentation Title 2


Statistics
the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.

6/2/2021 Statistical Tools and Techniques in Research 3


Statistical Tools
Statistical tools involved in carrying out a study include planning, designing, collecting data, analyzing, drawing meaningful interpretation, and reporting of the research findings.


Statistical Techniques
• statistical methods
• quantitative methods


Statistical
Techniques
Statistical Techniques
• either graphical, numerical, or tabular


Statistical Techniques
• either parametric or nonparametric methods


Research and
Statistics
How are they related?
The Scientific
Method



Research
Designs



Levels of
Measurement of
Data Gathered
from Research



Level of Measurement:
Example (Grade)
• Nominal: passed/failed
• Ordinal: 1.0, 1.25, 2.0, 3.0, 5.0
• Interval: Z-score = 1.96
• Ratio: 95%


Are Likert Scales Ordinal?

• Individual Likert item responses are ordinal.
• The sum of ordinal responses can be treated as interval if the test or questionnaire is unidimensional or reliable.
• Reliable Likert-type instruments have a Cronbach’s alpha coefficient of at least 0.70.
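
To make the 0.70 threshold concrete, here is a minimal sketch of how Cronbach's alpha could be computed from item responses. It assumes the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), which the slides do not show; the function name and the response data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of items, each a list of respondent scores.

    Assumes the standard formula (not shown on the slides) and that every
    item was answered by the same respondents in the same order.
    """
    k = len(item_scores)
    # Total score of each respondent across all items
    totals = [sum(responses) for responses in zip(*item_scores)]
    sum_item_variances = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical 5-point Likert responses: 3 items, 6 respondents
items = [
    [4, 5, 3, 4, 2, 5],
    [4, 4, 3, 5, 2, 5],
    [3, 5, 2, 4, 1, 4],
]
alpha = cronbach_alpha(items)
print(round(alpha, 2), "reliable" if alpha >= 0.70 else "not reliable")
```

In practice, statistical software would normally report alpha directly; the sketch only illustrates what the coefficient measures.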



Variables Used
in Research



Probability
Sampling
Techniques
Random Sampling in Excel
1. Add a new column within the spreadsheet and name it Random_number
2. In the first cell underneath your heading row, type “=RAND()”
3. Press “Enter,” and a random number will appear in the cell
4. Copy and paste the first cell into the other cells in this column
5. Once each row contains a random number, sort the records by the Random_number column
6. Choose the top n entries. Those will be the random n out of N entries
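
The same procedure can be sketched outside Excel. The Python snippet below mirrors the steps above (attach a random number to every record, sort, keep the top n); the function name, the seed, and the list of respondents are hypothetical and only serve the illustration.

```python
import random

def simple_random_sample(records, n, seed=None):
    """Draw n records by the 'random column, then sort' procedure."""
    rng = random.Random(seed)
    # Steps 1-4: give every record its own random number, like =RAND()
    keyed = [(rng.random(), record) for record in records]
    # Step 5: sort the records by their random number
    keyed.sort(key=lambda pair: pair[0])
    # Step 6: the top n entries are the random n out of N
    return [record for _, record in keyed[:n]]

# Hypothetical sampling frame of N = 50 respondents
population = [f"Respondent {i}" for i in range(1, 51)]
sample = simple_random_sample(population, n=10, seed=42)
print(sample)
```

Python's built-in `random.sample(population, 10)` gives an equivalent draw in one call; the longer version is shown only because it follows the spreadsheet steps one by one.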
Nonprobability
Sampling
Techniques
Sample Size
Determination

Cochran’s
Formula
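
The formula itself did not survive on this slide, so the sketch below assumes the commonly cited form of Cochran's formula, n0 = z²·p·(1 − p)/e², with the usual finite population correction n = n0 / (1 + (n0 − 1)/N). The function name and the default values (95% confidence, p = 0.5, 5% margin of error) are illustrative assumptions.

```python
import math

def cochran_sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """Sample size by Cochran's formula (assumed standard form).

    z: z-value for the confidence level (1.96 for 95%)
    p: estimated proportion with the attribute (0.5 is the most conservative)
    e: desired margin of error
    population: population size N; if given, apply the finite correction
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)  # round up to a whole respondent

print(cochran_sample_size())                 # large or unknown population
print(cochran_sample_size(population=1000))  # hypothetical N = 1000
```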
Distributions
of Data



Statistical
Significance
A result is statistically significant:
• If the test statistic exceeds the critical value at a certain level of significance (5%, 1%) and degrees of freedom.
• If the p-value is less than the set level of significance (* p < 0.05, ** p < 0.01)


Hypothesis Testing
Five Steps in Hypothesis Testing:
• Specify the Null Hypothesis.
• Specify the Alternative Hypothesis.
• Set the Significance Level (α).
• Calculate the Test Statistic and Corresponding P-Value.
• Draw a Conclusion.
Hypothesis Testing: Example
Two-Sample t-Test
• H0: μ1 = μ2
• H1: μ1 ≠ μ2
• α = 0.05
• t = -2.5, p = 0.0254
• Reject H0 in favor of H1.


Descriptive Statistics
• Measures of Central Tendency (or Location, Position)
• Measures of Dispersion (or Variation, Spread)
• Measures of Shape


Measures of Central Tendency
• Mean
• Median
• Mode
The Mean
• Commonly referred to as the average
• Add all entries; divide the sum by the number of entries
• Excel formula: =AVERAGE
When to Use the Mean?
• When data distribution is continuous and symmetric
• When data are normally distributed
• No outliers


Outliers



The Median
• Arrange the data from lowest to highest; find the middle of the data
• Excel formula: =MEDIAN


When to Use the Median?
• When your data set is skewed
• When you are dealing with ordinal data


The Mode
• The most frequently occurring categories or values
• Excel formula (for unimodal): =MODE.SNGL
• Excel formula (for multimodal): =MODE.MULT


When to Use the Mode?
• When dealing with nominal data
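
The three measures can be compared side by side with Python's standard `statistics` module. This is only an illustrative sketch with made-up scores; it shows why the median is preferred when an outlier is present and why the mode suits nominal data.

```python
from statistics import mean, median, mode, multimode

# Hypothetical exam scores, then the same scores plus one outlier
scores = [78, 80, 82, 85, 85, 88, 90]
with_outlier = scores + [10]

print(mean(scores), median(scores))              # similar when no outlier
print(mean(with_outlier), median(with_outlier))  # mean drops, median barely moves

# Nominal data: only the mode is meaningful
remarks = ["passed", "failed", "passed", "passed", "failed"]
print(mode(remarks))

# More than one mode (comparable in purpose to =MODE.MULT)
print(multimode([1, 1, 2, 2, 3]))
```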



Other Stats
• Frequency = ∑xi
• Percentage = 100*∑xi/n
• Proportion = ∑xi/n
where xi = 1 if present, 0 if absent


Graphical Techniques
• Bar graphs
• Pie charts
• Histograms
• Line graphs
• Boxplots
• Stem-and-leaf plots
• Scatterplots


Bar Charts
• Used when data are nominal or ordinal
• The heights or lengths of the bars represent frequencies/percentages


Pie Charts
• Used when you are trying to compare parts of a whole
• When the data are nominal or ordinal


Histograms
• Used when you want to see the shape of the data's distribution
• When the data are interval/ratio


Line Graphs
• Used to track changes over short and long periods of time
• Also used to compare changes over the same period of time for more than one group


Boxplots (Box-and-Whisker Plots)
• Used to show the shape of the distribution, its central value, and its variability
• Used when the data are at least at the interval level


Stem-and-Leaf Plot
• Used to classify either discrete or continuous variables.
• Looks something like a bar graph


Scatterplots
• When you have two variables (X and Y) that pair well together
• X and Y variables are at least in interval/ratio level


Measures of Dispersion
• Range
• Interquartile Range
• Variance
• Standard Deviation
Dispersion,
Variability,
Spread



The Range
• Range(X) = Max(X) – Min(X)
• Difference between the highest score and the lowest score


The
Interquartile
Range



Variance and Standard Deviation
Excel formulas:
=STDEV.P (population standard deviation)
=STDEV.S (sample standard deviation)
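
As a rough illustration of the four measures of dispersion, the sketch below uses Python's standard `statistics` module on a small made-up data set. `pstdev` and `stdev` play the roles of =STDEV.P and =STDEV.S; the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from the quartile convention used by a given software package.

```python
from statistics import pstdev, pvariance, quantiles, stdev, variance

# Hypothetical scores
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)   # Range(X) = Max(X) - Min(X)
q1, q2, q3 = quantiles(data, n=4)    # quartiles (default method)
iqr = q3 - q1                        # interquartile range

print("Range:", data_range)
print("IQR:", iqr)
print("Population variance and SD:", pvariance(data), pstdev(data))
print("Sample variance and SD:", variance(data), stdev(data))
```

The sample versions divide by n − 1 instead of n, so they are always a little larger than the population versions for the same data.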
When to use SD?
• In conjunction with a mean
• To summarize continuous data
• When data are normally distributed
• When data have no outliers


Measures of Shape
• Skewness
• Kurtosis


Skewness
(Symmetry)

Excel function:
=SKEW
Kurtosis
(Peakedness)

Excel function:
=KURT
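
For readers who want to see what these two functions compute, here is a sketch of the bias-adjusted sample skewness and excess kurtosis formulas. To my understanding these are the formulas behind =SKEW and =KURT, but treat that correspondence as an assumption; the function names and data are hypothetical.

```python
from statistics import mean, stdev

def sample_skewness(data):
    """Adjusted sample skewness; needs at least 3 values."""
    n = len(data)
    m, s = mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)

def sample_excess_kurtosis(data):
    """Adjusted sample excess kurtosis; needs at least 4 values."""
    n = len(data)
    m, s = mean(data), stdev(data)
    fourth = sum(((x - m) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 10]
print(sample_skewness(symmetric))       # about 0 for symmetric data
print(sample_skewness(right_skewed))    # positive: long tail to the right
print(sample_excess_kurtosis(symmetric))
```

Values near zero for both measures are consistent with a normal shape; large positive or negative values suggest a skewed, peaked, or flat distribution.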
Normal Distribution
Properties:
1. Mean = Median = Mode
2. Symmetric about the center
3. Area below the mean = Area above the mean
4. Total area under the curve = 1


Tests of Normality
• Kolmogorov-Smirnov
• Shapiro-Wilk
• Normal QQ-plots
• Boxplots


Kolmogorov-
Smirnov Test
H0: There is no difference between
the observed and theoretical
distribution.



Kolmogorov-
Smirnov Test
Examples of outputs

Higher p-values (>0.05) give no evidence against normality, so the data may be treated as normally distributed.


Shapiro-Wilk
Test
H0: The data are normally
distributed.



Shapiro-Wilk
Test
Examples of outputs

Higher p-values (>0.05) give no evidence against normality, so the data may be treated as normally distributed.


GRAPHICAL TECHNIQUE

Normal
QQ-plots
If the points fall approximately along a straight line, the distribution is approximately normal.


GRAPHICAL TECHNIQUE

Boxplot
When the median is in the
middle of the box, and the
whiskers are about the
same on both sides of the
box, then the distribution is
symmetric.



Tests of
Homogeneity
of Variances

Levene’s Test

Bartlett’s Test
Levene’s Test
Null hypothesis: The variances are equal across all samples. In more formal terms, that's written as:
H0: σ₁² = σ₂² = … = σₖ²


Levene’s Test
Sample
Outputs



Bartlett’s
Test



Bartlett’s Test
Sample
Outputs



Some Parametric Statistics
• Independent samples t-test
• Paired samples t-test
• Analysis of variance
• Pearson product-moment correlation
• Linear regression


Independent Samples t-Test
• Compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different.


Independent Samples t-Test: Assumptions
• One independent, categorical variable that has two levels/groups.
• One continuous dependent variable.
• Normality of data within each group
• No significant outliers in the two groups
• Random sampling from the population
• Homogeneity of group variances


Independent Samples t-Test: Applications
• Comparing the math abilities of male and female students
• Comparing the reading performances of the control group and the experimental group


Independent Samples t-Test: Sample Outputs
• H0: μ1 = μ2
• t = -1.99, p-value = 0.055
• Since the p-value is greater than 0.05, H0 is not rejected at the 5% level.
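
To show where a t value like the one above comes from, here is a minimal sketch of the equal-variance (pooled) two-sample t statistic. The data and function name are hypothetical, and the sketch stops at t and the degrees of freedom; the p-value would still be read from a t table or statistical software.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(group1, group2):
    """t statistic and degrees of freedom, assuming equal group variances."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, df

# Hypothetical reading scores
control = [71, 72, 73, 74, 75]
experimental = [72, 73, 74, 75, 76]
t_stat, df = pooled_t_statistic(control, experimental)
print(f"t({df}) = {t_stat:.2f}")
```

A negative t simply means the first group's mean is lower than the second's; the sign depends on the order in which the groups are entered.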
How to Activate Analysis ToolPak in Excel
Please visit this link:
https://support.microsoft.com/en-us/office/use-the-analysis-toolpak-to-perform-complex-data-analysis-6c67ccf0-f4a9-487c-8dec-bdb5a2cefab6
Or scan the QR code below:


Paired t-Test
Compares the means of two
measurements taken from the
same individual, object, or
related units



Paired t-Test: Assumptions
• Your dependent variable should be measured on a continuous scale.
• Your independent variable should consist of two categorical, "related groups" or "matched pairs".
• There should be no significant outliers in the differences between the two related groups.
• The distribution of the differences in the dependent variable between the two related groups should be approximately normally distributed.
Paired t-Test: Applications
• Comparing the pretest and posttest scores after using an intervention
• Comparing the math anxiety scores before and after implementing a non-traditional teaching strategy


Paired t-Test: Sample Outputs
• H0: µd = 0
• t = -6.53, p-value < 0.001 (displayed by the software as 0.000)
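
The paired t statistic is the mean of the pairwise differences divided by its standard error. The sketch below works this out on hypothetical pretest and posttest scores; the function name and data are made up, and the p-value is left to a t table or statistical software.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic and degrees of freedom for H0: mean difference = 0."""
    differences = [b - a for b, a in zip(before, after)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

# Hypothetical pretest and posttest scores of the same four students
pretest = [10, 12, 9, 11]
posttest = [12, 15, 10, 14]
t_stat, df = paired_t_statistic(pretest, posttest)
print(f"t({df}) = {t_stat:.2f}")
```

Because the differences are taken as pretest minus posttest, a negative t indicates that posttest scores are higher on average.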
Analysis of Variance (ANOVA)
• Compares the means of three or more groups to establish whether there is a difference between them.
• H0: μ1 = μ2 = μ3 = . . . = μk
Assumptions of ANOVA
• Interval/ratio level dependent variable
• Independent variable should consist of two or more categorical, independent groups
• Independence of observations
• No significant outliers
• Normally distributed data in each group
• Homogeneity of variances


What if normality is violated?
• Data may be transformed using any of the following techniques:
  • Logarithmic (=LN or =LOG)
  • Square root (=SQRT)
  • Cube root (=[cell]^(1/3); the parentheses are needed because Excel evaluates ^ before /)


Applications of ANOVA
• Determining the effect of a nominal/ordinal-level variable on an interval/ratio-level variable
• In experiments involving two or more experimental groups and one or more factors
• Comparing the average NAT scores of 5 schools in the division


One-Way ANOVA: Sample Output
• H0: μA = μB = μC
• F(2,41) = 1.11, p = 0.34

Anova: Single Factor

SUMMARY
Groups     Count   Sum   Average    Variance
Method A   16      172   10.75      10.6
Method B   13      143   11         4.666667
Method C   15      181   12.06667   4.066667

ANOVA
Source of Variation   SS         df   MS        F          P-value    F crit
Between Groups        14.79394   2    7.39697   1.115258   0.337571   3.225684
Within Groups         271.9333   41   6.63252

Total                 286.7273   43
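
The F value in this output can be reproduced from the SUMMARY table alone. The sketch below is an illustration rather than the Analysis ToolPak's own procedure: it rebuilds the between-groups and within-groups sums of squares from each group's count, sum, and variance as printed above (the variances are rounded, so results match to about four decimal places).

```python
# Count, sum, and variance of each group, copied from the SUMMARY table
groups = {
    "Method A": {"n": 16, "sum": 172, "var": 10.6},
    "Method B": {"n": 13, "sum": 143, "var": 4.666667},
    "Method C": {"n": 15, "sum": 181, "var": 4.066667},
}

n_total = sum(g["n"] for g in groups.values())
grand_mean = sum(g["sum"] for g in groups.values()) / n_total

# Between groups: how far each group mean is from the grand mean
ss_between = sum(g["n"] * (g["sum"] / g["n"] - grand_mean) ** 2
                 for g in groups.values())
# Within groups: spread of the scores inside each group
ss_within = sum((g["n"] - 1) * g["var"] for g in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"SS between = {ss_between:.4f}, SS within = {ss_within:.4f}")
print(f"F({df_between},{df_within}) = {f_stat:.4f}")
```

Since the computed F is below the F crit of 3.225684 shown in the table, the decision agrees with the reported p-value of about 0.34: H0 is not rejected.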



What to do if ANOVA is significant?
• If the p-value is significant at a specified alpha, proceed to post-hoc tests.
• Examples of post-hoc tests:
  • Fisher’s Least Significant Difference (LSD)
  • Tukey’s Honestly Significant Difference (HSD)
  • Scheffe’s Test
  • Duncan’s new multiple range test (DMRT)
Pearson Product-Moment Correlation (r)
is a measure of the strength of a linear association between two variables


Assumptions of Pearson’s r
• The two variables, X and Y, are of interval/ratio level
• X and Y are paired
• Linear relationship between X and Y
• Bivariate normal distribution
• Homoscedasticity
• No univariate or multivariate outliers


LINEAR
BIVARIATE
NORMAL



Strength of
Linear
Correlation



What does a correlation coefficient mean?
• Negative: As X increases, Y decreases. Or as X decreases, Y increases.
• Positive: As X increases, so does Y. Or as X decreases, Y also decreases.
• But a correlation does not mean that X affects Y, nor that Y affects X.


Significance
of Pearson’s r



Pearson’s r in
Excel

Excel function:
=CORREL
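
As a cross-check on what =CORREL returns, here is a short sketch of the Pearson product-moment formula: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The function name and the paired data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation of two paired lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data: hours of study and quiz score
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 60, 68, 71]
print(round(pearson_r(hours, score), 3))
```

A value close to +1 or -1 indicates a strong linear association; whether it is statistically significant still depends on the sample size.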
Nonparametric Alternatives
• Mann-Whitney U Test
• Wilcoxon Signed-Rank Test
• Kruskal-Wallis H Test
• Spearman’s Rank Correlation


Mann-Whitney U Test
• Used when the assumptions of independent samples t-test are not met


Mann-
Whitney U
Test Sample
Outputs



Wilcoxon
Signed-Rank
Test
Used when the
assumptions of paired t-
test are not met



Wilcoxon
Signed-Rank
Test Sample
Outputs



Kruskal-Wallis H Test
• Used when the assumptions of ANOVA are not met


Kruskal-
Wallis H Test
Sample
Outputs



Spearman’s Rank Correlation (Rho)
• Used when assumptions of Pearson’s r are not met by the data
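
One common way to obtain Spearman's rho is to replace each value by its rank (tied values share the average rank) and then apply the Pearson formula to the ranks. The sketch below follows that approach; the helper names and data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a curved but strictly increasing relationship
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
print(spearman_rho(x, y))
```

Because rho depends only on the ordering of the values, it captures monotonic relationships that are not linear and is less affected by outliers than Pearson's r.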



Spearman’s
Rho Sample
Outputs



Statistical
Tests
Appropriate
to Your
Research
Design



Online Calculators
• Wilcoxon signed-rank test
https://www.socscistatistics.com/tests/signedranks/default2.aspx
• Mann-Whitney U test
https://www.socscistatistics.com/tests/mannwhitney/default2.aspx
• Kruskal-Wallis test
https://www.socscistatistics.com/tests/kruskal/default.aspx
• Levene’s test (Homogeneity of variance test)
https://www.socscistatistics.com/tests/levene/default.aspx
• Shapiro-Wilk (Normality test)
http://www.statskingdom.com/320ShapiroWilk.html


Open-source Software
• https://www.rstudio.com/products/rstudio/download/
• https://www.jamovi.org/download.html
• https://www.blueskystatistics.com/Articles.asp?ID=301


“Facts are stubborn
things, but statistics
are pliable.”
Mark Twain



Thank you
Sherwin E. Balbuena
balbuenasherwine@debesmscat.edu.ph
Phone: +63 909 522 6069
