
RESEARCH METHODOLOGY & BIOSTATISTICS

DR WAQAR AHMED AWAN


PhD In Rehabilitation Sciences
Professor
NON-PARAMETRIC TEST
INTRODUCTION
Parametric tests require that the assumptions of normality and homogeneity of variance are met to a reasonable degree for the analysis to be valid.
A set of statistical procedures is classified as nonparametric; these procedures test hypotheses for group comparisons without normality or variance assumptions.

For this reason, these methods are sometimes referred to as distribution-free tests.
Nonparametric methods are similar to parametric methods in that testing hypotheses involves the use of a statistical ratio or test statistic to determine an associated probability.
Similarly, the outcomes of these tests are evaluated according to a

predetermined alpha level of significance.


These tests are included in most statistical packages for computer

analysis.
Criteria For Choosing Nonparametric Tests
Two major criteria are generally adopted for choosing a nonparametric test

over a parametric procedure.

1. Assumptions of population normality and homogeneity of variance

cannot be satisfied.
Many clinical investigations involve skewed distributions rather than

symmetrical ones.
In addition, small clinical samples and samples of convenience cannot

automatically be considered representative of larger normal distributions


2. Data are measured on the nominal or ordinal scale.
Many assessment tools have been developed around these scales.

Nonparametric tests provide an objective mechanism for supporting statistical

hypotheses when these levels of measurement are used

Nonparametric tests apply to data that are at least at the ordinal level; that is, the variable of interest has an underlying continuous distribution that can be ranked.
For instance, strength can be measured using discrete manual muscle test grades

on an ordinal scale, even though strength truly exists along a continuum.


Although nonparametric tests require fewer statistical assumptions than parametric

procedures, they still put some restrictions on data.

Some type of randomization procedure should be used in forming groups to make

assumptions about the equality of groups before the independent variable is


administered.

Ordinal scales are used often to measure relative changes in clinical variables such as

balance or function.

The major disadvantage of nonparametric tests is that they do not accommodate

complex clinical designs.


Power-Efficiency in Nonparametric Tests
Many researchers prefer to use parametric tests because they are generally more

powerful.
Nonparametric tests are less sensitive than parametric tests because most of them

involve ranking scores rather than comparing precise metric changes.


Nonparametric and parametric methods have been compared on the basis of their

power-efficiency, which is a test's relative ability to identify significant differences


for a given sample size.
Generally, an increase in sample size is needed to make a nonparametric test as

powerful as a parametric test.


For instance, a nonparametric test may require a sample size of 50 to

achieve the same degree of power as a parametric test with 30 subjects.


This relationship can be expressed as a percentage that indicates the

relative power efficiency of the nonparametric test.


For example, if power-efficiency is 60%, then with equal sample sizes, the

nonparametric test is 60% as powerful as the parametric test.


In other words, to achieve equal power with the nonparametric test, we would need 10 subjects for every 6 used with the parametric procedure.
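The sample-size arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming power-efficiency is read as the ratio of the two sample sizes; the function name `required_nonparametric_n` is hypothetical and not part of any statistical package.

```python
def required_nonparametric_n(n_parametric: int, efficiency_pct: int) -> int:
    """Smallest nonparametric sample size that matches the parametric test.

    Assumption: efficiency_pct = 100 * n_parametric / n_nonparametric.
    """
    if not 0 < efficiency_pct <= 100:
        raise ValueError("efficiency_pct must be in (0, 100]")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (n_parametric * 100 + efficiency_pct - 1) // efficiency_pct


print(required_nonparametric_n(30, 60))  # 30 parametric subjects at 60%
print(required_nonparametric_n(6, 60))   # 6 parametric subjects at 60%
```

With 60% power-efficiency this reproduces both figures in the text: 50 subjects for 30, and 10 for every 6.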
With equal sample sizes, nonparametric tests will generally be less powerful; however, with larger samples this discrepancy is minimized.


Most of the nonparametric tests can achieve approximately 65% to 95%

power-efficiency in comparison to their most powerful parametric analogs.


With very small samples, such as six subjects or fewer, many nonparametric tests will be as powerful as their parametric counterparts.


With larger non-normal populations, the nonparametric statistics may

actually be more powerful.


Procedure For Ranking Scores
• Most nonparametric tests are based on rank
ordering of scores.
• Scores are always ranked from smallest to
largest, with the rank of 1 assigned to the
smallest score.
• The lowest ranks are assigned to the largest
negative values, if any.
• The highest rank will equal n.
• As shown in Sample A, the rank of 1 is assigned
to the smallest score ( -3), the rank of 2 goes to
the next smallest (0), and so on, until the rank of
8 is assigned to the highest score (16).
• When two or more scores in a distribution are
tied, they are each given the same rank, which is
the average of the ranks they occupy.
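The ranking rules above, including the averaging of tied ranks, can be sketched as follows. The scores are hypothetical illustration data (not the Sample A values from the slide), and `average_ranks` is a name chosen for this sketch.

```python
def average_ranks(scores):
    """Rank scores from smallest (rank 1) to largest, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted score is tied with this one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        # Sorted positions start..end occupy ranks start+1..end+1; share their mean.
        shared = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


hypothetical_scores = [5, -3, 0, 5, 16]
print(average_ranks(hypothetical_scores))  # [3.5, 1.0, 2.0, 3.5, 5.0]
```

The two scores of 5 occupy ranks 3 and 4, so each receives 3.5, and the highest rank equals n = 5.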
Mann-Whitney U Test
The Mann-Whitney U test is the nonparametric equivalent of the independent t-test.

The Mann-Whitney U test is used to compare differences between two

independent groups when the dependent variable is either ordinal or continuous,

but not normally distributed.

The Mann-Whitney U test compares the number of times a score from one sample

is ranked higher than a score from another sample


Assumptions
1. Dependent variable should be measured at the ordinal or continuous
level.

2.  Your independent variable should consist of two categorical,


independent groups

3. Independence of observations, which means that there is no relationship


between the observations in each group or between the groups
themselves

4. The dependent variable is not normally distributed in the two groups.


Example
A researcher decided to investigate whether Progressive PT or Maintenance PT was more effective in reducing constipation severity, measured on a constipation severity scale, in spastic CP children.


Output

IQR = 75th percentile - 25th percentile
8 - 2 = 6

Sum of the ranks = MR x N

Formula: U = sum of the ranks - N x (N+1)/2
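The formula above can be traced by hand on a small data set. The sketch below uses hypothetical scores and hypothetical function names; it computes U only and does not produce a p-value, which a statistical package would report.

```python
def average_ranks(scores):
    """Rank scores from smallest (rank 1) to largest, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, mean rank of A, mean rank of B) from pooled ranks."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    rank_sum_b = sum(ranks[n_a:])
    # U for group A: sum of its ranks minus N x (N + 1) / 2.
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b), rank_sum_a / n_a, rank_sum_b / n_b


# Hypothetical severity scores for two independent groups.
u, mean_rank_a, mean_rank_b = mann_whitney_u([2, 4, 4], [4, 6, 8])
print(u, mean_rank_a, mean_rank_b)
```

Here group A has a rank sum of 7, so U = 7 - 3 x 4 / 2 = 1, matching a direct count of the pairs in which a group A score exceeds or ties a group B score.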


Interpretation
The results showed that constipation severity in the Progressive PT group was significantly lower than in the Maintenance PT group (MR = 10.44 vs 25.44, U = 17, p < 0.001) after 6 weeks in spastic CP children.


Wilcoxon Signed-Rank Test
The Wilcoxon signed-rank test is the nonparametric test equivalent to

the paired t-test.

It is used to compare two sets of scores that come from the same

participants.
Assumptions
The dependent variable should be measured at the ordinal or continuous level.

The independent variable should consist of two categorical, "related groups" or "matched pairs".

The data are not normally distributed.


Example
A researcher decided to investigate whether physical therapy is effective in reducing constipation severity, measured on a constipation severity scale, in spastic CP children after a 6-week intervention.


Test Procedure
IQR = 75th percentile - 25th percentile
Interpretation
The results showed that constipation severity was significantly reduced {8(3) vs 6(6), Z=-4.50, p≤0.001} after 6 weeks of physical therapy intervention in spastic CP children.
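How the signed ranks behind such a result are formed can be sketched on hypothetical before/after scores. The function names are illustrative, and the sketch stops at the rank sums; the Z value and p-value reported above would come from the statistical package.

```python
def average_ranks(scores):
    """Rank scores from smallest (rank 1) to largest, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def wilcoxon_rank_sums(before, after):
    """Return (sum of positive ranks, sum of negative ranks) of after - before."""
    # Pairs with no change carry no direction and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return positive, negative


# Hypothetical severity scores for five children, before and after therapy.
print(wilcoxon_rank_sums([8, 7, 9, 6, 8], [6, 7, 5, 7, 5]))  # (1.0, 9.0)
```

A negative rank sum much larger than the positive one indicates that most children's scores decreased.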


Kruskal-Wallis H Test
Introduction
The Kruskal-Wallis H test (sometimes also called the "one-way ANOVA on ranks") is a rank-based nonparametric test.

It can be used to determine whether there are statistically significant differences between more than two groups of an independent variable on an ordinal dependent variable, or on a continuous dependent variable that is not normally distributed.
It is considered the nonparametric alternative to the one-way ANOVA, and an

extension of the Mann-Whitney U test to allow the comparison of more than two
independent groups.
Assumptions
1. The dependent variable should be measured at the ordinal level, or at the continuous level (i.e., interval or ratio) with normality violated.

2. The independent variable should consist of more than two categorical, independent groups.

3. There should be independence of observations, which means that there is no relationship between the observations in each group or between the groups themselves.
Question
Is there any difference in constipation severity among CP children with different levels of functional independence after six weeks of PT intervention?
Level of functional independence (GMFCS): independent variable
Group 1 (Level 3)
Group 2 (Level 4)
Group 3 (Level 5)
Constipation severity (CAS score): dependent variable (continuous)
Test Procedure in SPSS Statistics
Interpretation
A Kruskal-Wallis H test showed that there was no statistically significant difference in constipation severity score (χ2(2) = 3.5, p = 0.17) between the different levels of functional independence (GMFCS) in CP children after 6 weeks of PT intervention, with a mean rank constipation severity score of 24.14 for Level III, 32.83 for Level IV and 33.77 for Level V.
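The H statistic and its p-value can be sketched as below. The group scores are hypothetical, the function names are illustrative, and no tie correction is applied (statistical packages usually apply one). For 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2), which is used here to check the reported p-value.

```python
import math


def average_ranks(scores):
    """Rank scores from smallest (rank 1) to largest, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1), without tie correction."""
    pooled = [score for group in groups for score in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


def chi2_upper_tail_df2(x):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2)


print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # hypothetical groups
print(round(chi2_upper_tail_df2(3.5), 2))  # 0.17
```

The second line agrees with the reported χ2(2) = 3.5, p = 0.17.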
Friedman Test
Introduction
The Friedman test is the non-parametric alternative to the one-way

ANOVA with repeated measures.


It is used to test for differences between groups when the dependent

variable being measured is ordinal.


It can also be used for continuous data that has violated the assumptions

necessary to run the one-way ANOVA with repeated measures


Assumption
1. One group that is measured on three or more different occasions.

2. Group is a random sample from the population.

3. The dependent variable should be measured at the ordinal or continuous level.

4. Samples do NOT need to be normally distributed


Question
Is there any difference in constipation severity among CP children from week 0 to week 6 of PT intervention?
PT: independent variable
Constipation severity (CAS score): dependent variable (continuous, not normally distributed)
Test Procedure in SPSS
Friedman Test (with post hoc tests)
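Outside SPSS, the Friedman statistic can be sketched by ranking each child's scores across occasions and summing the ranks per occasion. The data and function names below are hypothetical, and no tie correction is applied. The statistic is referred to a chi-square distribution with k - 1 degrees of freedom, so four occasions (weeks 0, 2, 4 and 6) give 3 degrees of freedom.

```python
def average_ranks(scores):
    """Rank scores from smallest (rank 1) to largest, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def friedman_statistic(rows):
    """rows[i][j] is subject i on occasion j; returns (statistic, df)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        # Rank within each subject, then accumulate per occasion.
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    statistic = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)
    return statistic, k - 1


# Hypothetical scores: 4 subjects measured on 3 occasions.
print(friedman_statistic([[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]))  # (8.0, 2)
```

When every subject orders the occasions identically, the statistic reaches its maximum of n(k - 1), here 8.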
Interpretation
The constipation severity score improved statistically significantly after 6 weeks of PT intervention in CP children (χ2(3) = 41.87, p < 0.001).

Pairwise comparisons with Wilcoxon signed-rank tests using the Bonferroni correction revealed that PT caused a statistically significant reduction in constipation severity from week 0 to the end of week 2 {8(2) vs 8(3), Z=-3.36, p<0.001}.

However, from week 2 to week 4 no statistically significant reduction in constipation severity was observed {8(3) vs 8(5.5), Z=-1.92, p=0.054}.

From week 4 to the end of week 6 there was a statistically significant reduction in constipation severity {8(5.5) vs 6(6), Z=-2.17, p=0.03}.
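The slides do not state whether the reported p-values are already Bonferroni-adjusted. As a sketch under the assumption that they are unadjusted, each p-value can be compared against alpha divided by the number of comparisons. The function name is hypothetical, and 0.001 stands in for the reported "p<0.001".

```python
def bonferroni_decisions(p_values, alpha=0.05):
    """Return the per-comparison threshold and a significance flag per p-value."""
    threshold = alpha / len(p_values)
    return threshold, [p < threshold for p in p_values]


# Reported pairwise p-values; 0.001 is an upper bound for "p<0.001".
comparisons = ["week 0 vs 2", "week 2 vs 4", "week 4 vs 6"]
threshold, flags = bonferroni_decisions([0.001, 0.054, 0.03])
print(round(threshold, 4))
for label, flag in zip(comparisons, flags):
    print(label, "significant" if flag else "not significant")
```

With three comparisons the threshold is about 0.0167. Under the unadjusted-p assumption, p = 0.03 would not clear it; if the reported values are already adjusted, the 0.05 level applies and the interpretation above stands as written.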


Association Between Categorical Variables
CHI-SQUARE TEST
The chi-square test for independence,

also called Pearson's chi-square test or the chi-square test of association,

is used to discover if there is a relationship between two categorical

variables.

It cannot make comparisons between continuous variables or between

categorical and continuous variables.


Assumptions
Your data must meet the following requirements:
Two categorical variables.
Two or more categories (groups) for each variable.
Independence of observations.
There is no relationship between the subjects in each group.
The categorical variables are not "paired" in any way (e.g. pre-test/post-test
observations).
Relatively large sample size.
Expected frequencies for each cell are at least 1.
Expected frequencies should be at least 5 for the majority (80%) of the cells
Example
A biostatistician would like to know whether gender (male/female) is associated with the preferred type of learning medium (online vs. on campus).

Therefore, we have two nominal variables: Gender (male/female) and Preferred Learning Medium (online/on campus).


Test Procedure in SPSS Statistics
The results are in the "Pearson Chi-Square" row. We can see here that χ2(1) = 0.487, p = .485.

This tells us that there is no statistically significant association between Gender and Preferred Learning Medium; that is, males and females do not differ significantly in their preference for on-campus versus online learning.

Phi and Cramer's V are both measures of the strength of association. We can see that the strength of association between the variables is very weak.
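The chi-square statistic, the expected-frequency check and phi can be sketched from a contingency table. The 2 x 2 counts below are hypothetical (the study's counts are not shown in these slides), and the function names are illustrative. For 1 degree of freedom the upper-tail probability equals erfc(sqrt(x/2)), which is used to check the reported p-value.

```python
import math


def chi_square_independence(table):
    """Return (statistic, df, smallest expected frequency, N) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    min_expected = float("inf")
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            min_expected = min(min_expected, expected)
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, min_expected, n


def chi2_upper_tail_df1(x):
    """Upper-tail probability of a chi-square variable with exactly 1 df."""
    return math.erfc(math.sqrt(x / 2))


def phi_coefficient(statistic, n):
    """Phi for a 2 x 2 table: sqrt(chi-square / N)."""
    return math.sqrt(statistic / n)


hypothetical_counts = [[10, 20], [20, 10]]  # rows: gender, columns: medium
stat, df, min_expected, n = chi_square_independence(hypothetical_counts)
print(round(stat, 3), df, min_expected, round(phi_coefficient(stat, n), 3))
print(round(chi2_upper_tail_df1(0.487), 3))  # 0.485
```

The last line agrees with the reported χ2(1) = 0.487, p = .485.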
Association Between Continuous Variables
PEARSON PRODUCT-MOMENT CORRELATION
The Pearson product-moment correlation coefficient is a measure of the strength and direction of a linear association between two variables and is denoted by r.

It attempts to draw a line of best fit through the data of two variables, and r indicates how far away the data points are from this line of best fit (i.e., how well the data points fit this new model/line of best fit).
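The coefficient r can be computed directly from its definition, the covariation of the two variables divided by the product of their spreads. The sketch below uses hypothetical data and a hypothetical function name.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data: points lying exactly on a rising or falling line.
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
print(pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))
```

Points that lie exactly on a rising line give r = +1, and points on a falling line give r = -1.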
Interpreting Pearson's correlation coefficient
The Pearson correlation coefficient, r, can take a range of values from +1
to -1.
A value of 0 indicates that there is no association between the two
variables.
A value greater than 0 indicates a positive association; that is, as the value
of one variable increases, so does the value of the other variable.
 A value less than 0 indicates a negative association; that is, as the value of
one variable increases, the value of the other variable decreases. This is
shown in the diagram below:
Assumptions
Two or more continuous variables (i.e., interval or ratio level)

Cases that have values on both variables

Independent cases (i.e., independence of observations)

Bivariate normality

Random sample of data from the population

No outliers
Example
A researcher wants to explore whether there is any relationship between age and Hb concentration among female students.
The Pearson's correlation showed a medium negative correlation (r=-0.414, p=0.003) between age and level of hemoglobin among female students.


Association Between Ordinal Variables
SPEARMAN'S RANK-ORDER CORRELATION
Spearman's rank-order correlation (Spearman's correlation, for short) is a nonparametric test.

It measures the strength and direction of the association that exists between two variables measured on at least an ordinal scale.

It is denoted by the symbol rs 

The test is used for either ordinal variables or for continuous data that has failed the assumptions necessary for conducting the Pearson's product-moment correlation.
Assumptions
The two variables should be measured on an ordinal, interval or ratio scale.

The two variables represent paired observations.

There is a monotonic relationship between the two variables.


Example
Is there any relationship between level of physical activity and level of stress among female students?
There is a strong negative correlation (rs= -0.733, p<0.001) between level of physical activity and level of stress among female students.
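Spearman's rs can be sketched as Pearson's r computed on the ranks of each variable. The ordinal scores below are hypothetical (not the study data), and the function names are illustrative.

```python
import math


def average_ranks(scores):
    """Rank scores from smallest (rank 1) to largest, averaging tied ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank-order correlation: Pearson's r applied to the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    s_xx = sum((a - mean_x) ** 2 for a in rx)
    s_yy = sum((b - mean_y) ** 2 for b in ry)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical ordinal levels: activity rises as stress falls.
print(spearman_rs([1, 2, 2, 3, 4, 5], [5, 4, 4, 3, 2, 1]))
# A monotonic but non-linear relationship.
print(spearman_rs([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

Because only the ordering matters, any strictly monotonic relationship gives rs of +1 or -1 even when it is not linear.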
Linear Regression
SIMPLE LINEAR REGRESSION

Linear regression is the next step up after correlation.

It is used when we want to predict the value of a variable based on the

value of another variable.


Assumptions
The two variables should be measured at the continuous level (i.e., they are either interval or ratio variables).
There needs to be a linear relationship between the two variables.
There should be no significant outliers.
You should have independence of observations (checked with the Durbin-Watson statistic).
The data need to show homoscedasticity, which is where the variances along the line of best fit remain similar as you move along the line.
The residuals (errors) of the regression line are approximately normally distributed.
Question

Is there any relationship between sodium intake and constipation severity in CP children?

Sodium intake (IV)

Constipation severity (DV)


Test Procedure in SPSS Statistics
Interpretation
Sodium has a significant negative correlation with constipation severity {r=-0.26, F(1,58)=4.51, p=0.038} and explains 7% of the variance in constipation severity.

The result suggests that a 1-unit increase in sodium is associated with a 0.004-unit decrease in constipation severity.
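The slope, intercept and proportion of variance explained can be sketched with ordinary least squares. The data points and the function name are hypothetical; only the final line uses a value from the slides, squaring the reported r to recover the variance explained.

```python
def simple_linear_regression(x, y):
    """Least-squares fit y = intercept + slope * x; returns (slope, intercept, R^2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - residual sum of squares / total sum of squares.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical predictor and outcome values lying on the line y = 1 + 2x.
print(simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9]))  # (2.0, 1.0, 1.0)

# Reported r = -0.26: squaring gives the share of variance explained.
print(round((-0.26) ** 2, 2))  # 0.07
```

The slope is the predicted change in the outcome for a 1-unit increase in the predictor, which is how the 0.004-unit figure above is read.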


Thank You
