
UNIT IV

Tests Of Significance

Basics of Testing of Hypothesis - Null & Alternative Hypothesis

● Hypothesis testing is a fundamental statistical method that allows researchers to
make informed decisions and inferences about population parameters based on
sample data.
● At its core, hypothesis testing involves making a claim about a population parameter and
then using sample data to evaluate the validity of that claim.
● The process typically involves several key steps.
● It involves the formulation of two competing statements, the null hypothesis (H0) and
the alternative hypothesis (Ha), and the use of sample data to determine which
statement is supported by the evidence.
1. Null Hypothesis (H0): The null hypothesis represents the default assumption
or the claim that there is no effect or no difference in the population. It typically
states that the population parameter is equal to a specific value or follows a
certain distribution. In hypothesis testing, the null hypothesis serves as the
baseline against which the alternative hypothesis is compared. For example, if
testing the effectiveness of a new drug, the null hypothesis might state that there
is no difference in outcomes between the group receiving the drug and the group
receiving a placebo.

2. Alternative Hypothesis (Ha): The alternative hypothesis is the claim that the
researcher wants to test. It represents the assertion that there is indeed an effect
or a difference in the population. The alternative hypothesis can take different
forms depending on the research question, such as stating that the population
parameter is not equal to a specific value, is greater than a specific value, or is
less than a specific value. Using the example of testing a new drug, the
alternative hypothesis might state that the group receiving the drug shows
significantly better outcomes compared to the group receiving a placebo.

● The conclusion of the hypothesis test depends on whether the observed data provide
sufficient evidence to reject the null hypothesis in favor of the alternative hypothesis, or
whether there is not enough evidence to do so.
● First, the researcher formulates two competing hypotheses: the null hypothesis (H0)
and the alternative hypothesis (Ha).
● The null hypothesis represents the status quo or the default assumption, often stating
that there is no effect or no difference in the population.
● The alternative hypothesis, on the other hand, represents the claim the researcher wants
to test, suggesting that there is indeed an effect or a difference.
● Next, the researcher selects an appropriate statistical test based on the type of data and
the research question. Common statistical tests include t-tests, chi-square tests,
ANOVA, and correlation tests, among others.
● After selecting the test, the researcher collects sample data and calculates a test statistic
based on the data. This test statistic provides a measure of how much the sample data
deviate from what would be expected under the null hypothesis.
● Once the test statistic is calculated, the researcher determines the probability of
observing such extreme results if the null hypothesis were true.
● This probability is known as the p-value.
● A low p-value indicates strong evidence against the null hypothesis, while a high p-value
suggests that the null hypothesis cannot be rejected.
● Finally, the researcher compares the p-value to a predetermined significance level (often
denoted as α), typically set at 0.05. If the p-value is less than or equal to the
significance level, the researcher rejects the null hypothesis in favor of the alternative
hypothesis. If the p-value is greater than the significance level, the null hypothesis is not
rejected.
● Steps:

1. State the Hypotheses:
- Null Hypothesis (H0): Start by stating the null hypothesis, which represents the
default assumption or the claim that there is no effect or no difference in the population.
This hypothesis is typically denoted as H0 and often states that a population
parameter is equal to a specific value or follows a certain distribution.
- Alternative Hypothesis (Ha): Next, state the alternative hypothesis, which is the
claim that you are testing. This hypothesis represents the assertion that there is indeed
an effect or a difference in the population. The alternative hypothesis can take different
forms depending on the research question and is denoted as Ha.

2. Choose a Significance Level (α): Select a significance level, denoted as α, which is
the probability of rejecting the null hypothesis when it is actually true. Commonly
used significance levels include 0.05 and 0.01.

3. Select the Appropriate Test: Choose a statistical test based on the type of data and
the research question being investigated. Common statistical tests include t-tests,
chi-square tests, ANOVA, and correlation tests, among others.

4. Collect Data and Calculate Test Statistic: Collect sample data and calculate a test
statistic based on the data. The test statistic provides a measure of how much the
sample data deviate from what would be expected under the null hypothesis.
5. Determine the Critical Region: Determine the critical region, which is the range of
values of the test statistic that would lead to rejection of the null hypothesis at the
chosen significance level. This critical region is determined based on the selected
significance level and the distribution of the test statistic.

6. Calculate the p-value: Calculate the p-value, which is the probability of observing
results as extreme as or more extreme than the observed data, assuming the null
hypothesis is true. The p-value provides a measure of the strength of evidence against
the null hypothesis.

7. Make a Decision: Compare the p-value to the significance level (α). If the
p-value is less than or equal to α, reject the null hypothesis in favor of the
alternative hypothesis. If the p-value is greater than α, fail to reject the null
hypothesis.

8. Interpretation: Finally, interpret the results of the hypothesis test in the context of the
research question. If the null hypothesis is rejected, conclude that there is evidence to
support the alternative hypothesis. If the null hypothesis is not rejected, conclude that
there is not enough evidence to support the alternative hypothesis.
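
● As a minimal illustration of this workflow, the Python sketch below runs a one-sample
test with scipy; the sample values and the hypothesized mean are invented purely for
demonstration.

```python
from scipy import stats

# Hypothetical sample data (invented for illustration)
sample = [120, 115, 130, 119, 121, 125, 118, 122, 128, 117]

mu0 = 120     # H0: the population mean equals 120
alpha = 0.05  # chosen significance level

# One-sample t-test: returns the test statistic and the two-sided p-value
t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)

print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
if p_value <= alpha:
    print("Reject H0 in favor of Ha.")
else:
    print("Fail to reject H0.")
```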

Type I and Type II Error

● Type I and Type II errors are fundamental concepts in hypothesis testing, representing
the two potential mistakes that researchers can make when interpreting the results of
their analyses.
● Type I errors, often referred to as false positives, occur when the null hypothesis is
incorrectly rejected, indicating an effect or difference in the population when, in reality,
there is none.
● This error is akin to seeing an effect that doesn't actually exist.
● The probability of committing a Type I error is determined by the significance level
(α) chosen for the test. For instance, if the significance level is set at 0.05, there is
a 5% chance of making a Type I error.
● An example of a Type I error in a medical context could be mistakenly concluding that a
new drug is effective in treating a condition when, in fact, it has no therapeutic benefit.
● On the other hand, Type II errors, also known as false negatives, occur when the null
hypothesis is incorrectly not rejected, failing to detect an effect or difference in the
population when one actually exists.
● This error represents a missed opportunity to identify a true effect.
● Unlike Type I errors, the probability of committing a Type II error is influenced by factors
such as sample size, effect size, and the chosen significance level.
● For example, in a medical study, a Type II error might occur if researchers fail to
conclude that a new drug is effective in treating a condition when, in reality, it does have
a beneficial effect.
● This error could have serious implications, as it may result in the rejection of a potentially
beneficial treatment.

Level of Significance (parametric) & Power of the tests


● In biostatistics, the level of significance and power of statistical tests are critical concepts
that influence the interpretation of research findings and the validity of study conclusions.
● The level of significance, often denoted by α, represents the probability of
committing a Type I error: rejecting the null hypothesis when it is actually true.
● In medical and biological research, where decisions can have significant consequences
for patient care or public health policies, maintaining a low level of significance is crucial.
Commonly used significance levels include 0.05 and 0.01, meaning that researchers
accept a 5% or 1% chance of making a Type I error, respectively.
● By setting a predetermined level of significance, researchers establish a threshold for
considering the evidence against the null hypothesis strong enough to warrant rejection.
● Conversely, the power of a statistical test represents the probability of correctly rejecting
the null hypothesis when it is false, thus avoiding a Type II error.
● In biostatistics, where detecting meaningful effects or differences is paramount,
achieving high power is essential.
● A high-powered study is more likely to detect true effects or differences if they exist,
leading to more reliable conclusions.
● Power is influenced by several factors, including the sample size, effect size, and the
chosen significance level.
● Increasing the sample size and effect size generally enhances the power of a study,
while lowering the significance level decreases it.
● Researchers often perform power analyses before conducting studies to determine the
necessary sample size to achieve adequate power for detecting the effects of interest.
● In biostatistics, balancing the level of significance and power of statistical tests is crucial
for ensuring the validity and reliability of research findings.
● While a low level of significance helps control the risk of false positive conclusions, high
power increases the likelihood of detecting true effects or differences.
● By carefully considering these factors and conducting appropriate power analyses,
researchers can design studies that yield robust and meaningful results, ultimately
contributing to advancements in medical and biological sciences and informing
evidence-based decision-making in healthcare.
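
● As a sketch of such a power analysis, the statsmodels package offers solvers for
common designs; the effect size and target power below are assumed values chosen
only for illustration.

```python
from statsmodels.stats.power import TTestIndPower

# Solve for the per-group sample size needed to detect an assumed
# standardized effect size (Cohen's d = 0.5) with 80% power at alpha = 0.05.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required sample size per group: {n_per_group:.1f}")
```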

p-value and Confidence Interval


● In biostatistics, the p-value and confidence interval are two key statistical measures used
to assess the results of hypothesis tests and estimate the precision of population
parameters, respectively.
● The p-value is a measure of the strength of evidence against the null hypothesis (H0)
in hypothesis testing.
● It represents the probability of observing results as extreme as or more extreme than the
observed data, assuming that the null hypothesis is true.
● A low p-value indicates strong evidence against the null hypothesis, suggesting that the
observed results are unlikely to have occurred by chance alone.
● Commonly used significance levels, such as 0.05 or 0.01, provide thresholds for
determining whether the evidence against the null hypothesis is strong enough to reject
it.
● If the p-value is less than or equal to the chosen significance level, the null hypothesis is
typically rejected in favor of the alternative hypothesis (Ha), indicating that there is
sufficient evidence to support the claim being tested.

● A confidence interval is a range of values that is likely to contain the true value of a
population parameter, such as a mean or proportion, with a specified level of confidence.
● For example, a 95% confidence interval for a mean represents the range of values within
which we are 95% confident that the true population mean lies.
● The width of the confidence interval depends on factors such as the sample size and
variability of the data.
● A narrower confidence interval indicates greater precision in estimating the population
parameter.
● While the p-value provides information about the statistical significance of an effect or
difference, the confidence interval provides information about the precision of the
estimate of a population parameter.
● Together, these measures help researchers interpret the results of statistical analyses,
assess the reliability of study findings, and make informed decisions in medical and
biological research.

● The p-value quantifies the strength of evidence against the null hypothesis.
● It measures the probability of obtaining results as extreme as the observed data,
assuming that the null hypothesis is true.
● A small p-value (typically less than the chosen significance level, e.g., 0.05) indicates
strong evidence against the null hypothesis, suggesting that the observed results are
unlikely to have occurred by random chance alone.
● However, it's important to note that a small p-value does not necessarily imply the
practical significance or importance of the observed effect.
● It only indicates the statistical significance.
● The p-value is influenced by factors such as the sample size, effect size, and variability
of the data.
● Larger sample sizes tend to result in smaller p-values for the same effect size.
● Interpretation of the p-value should consider the context of the study and the research
question being investigated.
● It is a measure of statistical significance, not necessarily of clinical or practical
significance.
● A confidence interval provides a range of values within which we are confident that the
true value of a population parameter lies, based on the sample data.
● Commonly used confidence levels include 90%, 95%, and 99%. For example, a 95%
confidence interval for a mean indicates that if the sampling were repeated many times,
about 95% of the intervals so constructed would contain the true population mean.
● A narrower confidence interval indicates greater precision in estimating the population
parameter, while a wider interval indicates greater uncertainty.
● The width of the confidence interval depends on factors such as the sample size and
variability of the data. Larger sample sizes generally result in narrower confidence
intervals.
● Confidence intervals provide valuable information about the precision of parameter
estimates and help researchers assess the reliability of study findings. They also allow
for comparisons between different groups or populations.
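
● A minimal sketch of computing a 95% confidence interval for a mean in Python,
assuming a small set of hypothetical measurements; the t distribution is used because
the population standard deviation is unknown.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements (invented for illustration)
data = np.array([4.2, 5.1, 4.8, 5.5, 4.9, 5.0, 4.6, 5.3])

mean = data.mean()
sem = stats.sem(data)  # standard error of the mean: s / sqrt(n)
df = len(data) - 1

# 95% CI: mean +/- t_critical * SEM
low, high = stats.t.interval(0.95, df, loc=mean, scale=sem)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```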

Test of Significance - T test (paired and unpaired)


● A t-test is a statistical tool used to compare the means of two groups and determine if
they are significantly different from each other.
● It helps you understand if the difference between the averages of two sets of data is
likely due to chance or if it's a real difference.
● The one-sample t-test and the two-sample t-test are both statistical methods used to
compare means, but they are applied in slightly different situations:
1. One-Sample t-test:
● This test is used when you want to compare the mean of a single sample to a known or
hypothesized population mean.
● Suppose you have a sample of students' test scores and you want to determine if the
average score differs significantly from the national average score (a known population
mean).
2. Two-Sample t-test:
● This test is used when you want to compare the means of two independent samples to
determine if they are significantly different from each other.
● Suppose you have two groups of students, one that received a traditional teaching
method and another that received an experimental teaching method, and you want to
know if there is a significant difference in their average test scores.
Steps:

1. Formulate Hypotheses: Start by stating your null hypothesis (H0) and alternative
hypothesis (Ha).
2. Choose the Type of t-test: Decide which type of t-test is appropriate for your data. The
most common types are:
● Independent samples t-test: Compares the means of two independent groups.
● Paired samples t-test: Compares the means of two related groups.
3. Collect Data: Collect data from your samples or groups.
4. Calculate the Test Statistic: Calculate the t-value using the appropriate formula based
on the type of t-test you're conducting.

● For the one-sample t-test:
[ t = (x̄ - μ0) / (s / √n) ]
where x̄ is the sample mean, μ0 is the hypothesized population mean, s is the sample
standard deviation, and n is the sample size.
● For the two-sample (Welch) t-test:
[ t = (x̄1 - x̄2) / √(s1²/n1 + s2²/n2) ]

5. Determine the Degrees of Freedom: Calculate the degrees of freedom (df), which
depend on the sample sizes and the type of t-test.
● For the one-sample t-test: df = n - 1
● For the two-sample t-test, use the Welch-Satterthwaite formula:
[ df = (s1²/n1 + s2²/n2)² / ( (s1²/n1)²/(n1 - 1) + (s2²/n2)²/(n2 - 1) ) ]
6. Find the Critical Value or P-value: Use a t-table or statistical software to find the critical
value corresponding to your chosen significance level (alpha) or directly obtain the
p-value.
7. Make a Decision: Compare the calculated t-value with the critical value or compare
the p-value with the chosen significance level. If the calculated t-value is greater than the
critical value or if the p-value is less than alpha, reject the null hypothesis. Otherwise, fail
to reject the null hypothesis.
8. Interpret Results: Interpret the findings in the context of your research question or
hypothesis. If you reject the null hypothesis, it suggests that there is a significant
difference between the groups you're comparing.
9. Report Results: Report the results of your t-test, including the calculated t-value,
degrees of freedom, critical value or p-value, and your decision regarding the null
hypothesis.
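
● The sketch below applies these steps with scipy, running an independent-samples
(Welch) t-test on two hypothetical groups of test scores; the scores are invented for
illustration, and scipy computes the t-value, degrees of freedom, and p-value internally.

```python
from scipy import stats

# Hypothetical test scores for two independent groups (invented)
traditional = [72, 68, 75, 70, 74, 69, 71, 73]
experimental = [78, 74, 80, 77, 79, 76, 81, 75]

# equal_var=False requests Welch's t-test, which does not assume equal variances
t_stat, p_value = stats.ttest_ind(traditional, experimental, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```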

Paired t-test
● A paired t-test, also known as a dependent t-test or paired samples t-test, is a
statistical test used to compare the means of two related groups.
● This test is appropriate when the same subjects have been measured under two
different conditions or at two different points in time, resulting in pairs of observations.

Steps:

1. Hypotheses: The null hypothesis (H0) typically states that there is no difference
between the means of the two groups, while the alternative hypothesis (H1) states that
there is a difference.
2. Assumptions: The data should be continuous, and the differences between paired
observations should be approximately normally distributed.
3. Procedure:
● Calculate the differences between paired observations (e.g., post-test score minus
pre-test score).
● Calculate the mean and standard deviation of the differences.
● Compute the t statistic:
[ t = d̄ / (s_d / √n) ]
where d̄ is the mean of the differences, s_d is their standard deviation, and n is the
number of pairs.
● Determine the degrees of freedom, typically n - 1, where n is the number of pairs.
● Finally, compare the obtained t statistic to a critical value from the t-distribution or use
software to determine the p-value.
4. Interpretation: If the calculated p-value is less than the chosen significance level (e.g.,
0.05), then the null hypothesis is rejected, and it is concluded that there is a significant
difference between the means of the two groups.
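
● A minimal paired t-test sketch in Python, assuming hypothetical pre- and post-treatment
scores for the same subjects; scipy works on the per-pair differences internally.

```python
from scipy import stats

# Hypothetical pre- and post-treatment scores for the same 8 subjects
pre  = [85, 90, 78, 92, 88, 76, 95, 89]
post = [88, 94, 80, 95, 91, 79, 97, 90]

# Paired t-test on the differences (post - pre)
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```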

Chi square Test

● The chi-square test is a statistical method used to determine whether there is a
significant association between two categorical variables.
1. Formulate hypotheses:
● Null Hypothesis (H0): There is no association between the categorical variables.
● Alternative Hypothesis (H1): There is an association between the categorical variables.

2. Set the significance level (α): This is typically set at 0.05, but it can vary depending on the
context and requirements of your analysis.

3. Collect data: Obtain data in the form of a contingency table. A contingency table is a tabular
representation of the frequencies of different categories for two or more variables.

4. Calculate expected frequencies: Compute the expected frequencies under the assumption
that there is no association between the variables. For each cell of the contingency table:
[ E = (row total × column total) / grand total ]

5. Calculate the chi-square statistic: Compute the chi-square statistic using the formula:
[ χ² = Σ (O - E)² / E ]
where O is the observed frequency and E is the expected frequency in each cell.

6. Determine degrees of freedom: Degrees of freedom (df) is calculated as:
[ df = (r - 1)(c - 1) ]
where r is the number of rows and c is the number of columns in the contingency table.

7. Find the critical value: Look up the critical value of chi-square in the chi-square distribution
table with (df) degrees of freedom and at the chosen significance level (α).

8. Compare calculated chi-square statistic with critical value: If the calculated chi-square statistic
is greater than the critical value, then reject the null hypothesis. If it is less than the critical value,
fail to reject the null hypothesis.

9. Interpret results: Based on the comparison, make a decision regarding the null hypothesis. If
you reject the null hypothesis, it suggests that there is a significant association between the
categorical variables.

10. Conclusion: Provide a conclusion based on the results of the test, indicating whether there is
evidence to support the alternative hypothesis.
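
● The sketch below runs a chi-square test of association on a hypothetical 2x2
contingency table with scipy; the counts are invented for illustration, and the function
returns the statistic, p-value, degrees of freedom, and expected frequencies.

```python
from scipy.stats import chi2_contingency

# Hypothetical contingency table: rows = treatment/control, columns = improved/not
observed = [[30, 10],
            [18, 22]]

chi2, p_value, df, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.4f}")
print("Expected frequencies:", expected)
```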

TEST OF PROPORTION
● A test of proportion is a statistical test used to determine whether the proportion of
successes in a sample is significantly different from a hypothesized population
proportion. It's commonly used when dealing with categorical data with two outcomes
(success or failure).

1. Formulate hypotheses:
● Null Hypothesis (H0): The population proportion is equal to a specified value.
● Alternative Hypothesis (H1): The population proportion is not equal to the specified
value.

2. Set the significance level (α): Typically set at 0.05 or another predetermined value.

3. Collect data: Obtain a sample that includes the number of successes and the total
sample size.
4. Calculate the sample proportion: Compute the proportion of successes in the sample
by dividing the number of successes by the total sample size.

5. Calculate the test statistic: For large sample sizes (typically n > 30), you can use the z-test for
proportions. The test statistic (z-score) is calculated as:
[ z = (p̂ - p0) / √( p0(1 - p0) / n ) ]
where p̂ is the sample proportion, p0 is the hypothesized population proportion, and n is the
sample size.

6. Determine critical value or p-value: Based on the chosen significance level (α), determine the
critical value from the standard normal distribution table for a two-tailed test or use statistical
software to find the p-value associated with the test statistic.

7. Compare the test statistic with the critical value or p-value:


● If using critical values, reject the null hypothesis if the absolute value of the test
statistic is greater than the critical value.
● If using p-values, reject the null hypothesis if the p-value is less than the chosen
significance level.

8. Interpret results: If the null hypothesis is rejected, conclude that there is evidence to suggest
that the sample proportion differs significantly from the hypothesized population proportion.

9. Conclusion: Provide a conclusion based on the results of the test, indicating whether there is
evidence to support the alternative hypothesis.
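
● A minimal sketch of the z-test for a proportion, directly mirroring the formula in step 5;
the counts and hypothesized proportion are invented for illustration.

```python
import math
from scipy.stats import norm

# Hypothetical data: 58 successes out of 100 trials; H0: p = 0.5
successes, n, p0 = 58, 100, 0.5

p_hat = successes / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - norm.cdf(abs(z)))
print(f"z = {z:.3f}, p = {p_value:.4f}")
```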
ANOVA TEST (One Way Analysis of Variance)
● ANOVA (Analysis of Variance) is a statistical test used to determine whether there are
statistically significant differences between the means of three or more independent
groups.
● It is an extension of the t-test for comparing the means of two groups to situations where
there are more than two groups.
● It is called "one-way" because there is only one independent variable or factor being
considered.
● ANOVA helps researchers determine whether there are statistically significant
differences between the means of the groups, beyond what would be expected by
random chance.
● In a one-way ANOVA, the data are organized into groups based on the levels of the
independent variable.
● The ANOVA test compares the variation between the group means (the "between-group"
variation) to the variation within each group (the "within-group" variation).
● If the between-group variation is larger than the within-group variation, it suggests that
there are systematic differences between the groups.
● One-way ANOVA has several advantages. It allows researchers to simultaneously
compare multiple groups, which can be more efficient than conducting pairwise
comparisons.
● Additionally, ANOVA provides a way to control for Type I error inflation that can occur
with multiple t-tests.
● However, there are some assumptions that must be met for ANOVA to produce valid
results.
● These include the assumption of normality (the data within each group should be
approximately normally distributed), homogeneity of variances (the variances within each
group should be approximately equal), and independence of observations.
● If the assumptions are met and the ANOVA yields a significant result, post-hoc tests can
be conducted to determine which specific group means differ from each other. Common
post-hoc tests include Tukey's HSD (Honestly Significant Difference), Bonferroni
correction, and Scheffé's method.
● In conclusion, one-way ANOVA is a powerful statistical tool for comparing means across
multiple groups, providing researchers with valuable insights into differences between
experimental conditions or groups.
● Steps:

1. Formulate hypotheses:
- Null Hypothesis (H0): The means of all groups are equal.
- Alternative Hypothesis (H1): At least one group mean is different from the others.

2. Set the significance level (α): Typically set at 0.05 or another predetermined value.
3. Collect data: Obtain data from three or more independent groups. Each group should have
data that represent the variable of interest (e.g., test scores, reaction times, etc.).

4. Check assumptions:
- Independence: Observations within each group should be independent.
- Normality: The data within each group should be approximately normally distributed.
- Homogeneity of variances: The variances of the groups should be approximately equal.

5. Calculate group means: Compute the mean for each group.

6. Calculate the overall mean: Compute the grand mean, which is the mean of all observations
across all groups.

7. Calculate the sum of squares (SS):

- Total sum of squares (SST): This measures the total variability in the data.
- Between-group sum of squares (SSB): This measures the variability between group means.
- Within-group sum of squares (SSW): This measures the variability within each group.
Note that SST = SSB + SSW.

8. Calculate the degrees of freedom (df):


- Total degrees of freedom (DFT): ( DFT = N - 1 ), where ( N ) is the total number of
observations.
- Between-group degrees of freedom (DFB): ( DFB = k - 1 ), where ( k ) is the number of
groups.
- Within-group degrees of freedom (DFW): ( DFW = N - k ).

9. Calculate the mean squares (MS):


- Between-group mean square (MSB): ( MSB = SSB / DFB )
- Within-group mean square (MSW): ( MSW = SSW / DFW )

10. Calculate the F-statistic: Compute the ratio of between-group mean square to within-group
mean square:

[ F = MSB / MSW ]

11. Determine the critical value or p-value: Based on the chosen significance level (α) and the
degrees of freedom, determine the critical value from the F-distribution table or use statistical
software to find the p-value associated with the F-statistic.

12. Compare the F-statistic with the critical value or p-value:


- If using critical values, reject the null hypothesis if the F-statistic is greater than the critical
value.
- If using p-values, reject the null hypothesis if the p-value is less than the chosen significance
level.

13. Interpret results: If the null hypothesis is rejected, conclude that there is evidence to suggest
that at least one group mean is different from the others.

14. Post-hoc tests (optional): If the ANOVA indicates significant differences between groups,
post-hoc tests (e.g., Tukey's HSD, Bonferroni correction, Scheffé's method) can be performed to
identify which specific group means differ from each other.

15. Conclusion: Provide a conclusion based on the results of the test, indicating whether there is
evidence to support the alternative hypothesis.
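
● The sketch below runs a one-way ANOVA on three hypothetical groups with scipy,
which carries out steps 5 through 11 internally and returns the F-statistic and p-value;
the group data are invented for illustration.

```python
from scipy import stats

# Hypothetical scores for three independent groups (invented)
group_a = [85, 88, 90, 86, 87]
group_b = [78, 82, 80, 79, 81]
group_c = [92, 94, 91, 93, 95]

# One-way ANOVA: F = MSB / MSW
f_stat, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```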

Repeated measures analysis of variance

● Repeated Measures Analysis of Variance (RM-ANOVA) is a statistical test used to
analyze the mean differences between two or more related groups or conditions where
the same subjects are measured multiple times.
● It is often used in experimental designs where participants are measured at different time
points or under different conditions.
● This method is particularly useful when subjects undergo different treatments or
conditions over time, allowing researchers to assess changes within subjects across
different conditions.
● RM-ANOVA differs from traditional ANOVA in that it accounts for the repeated measures
within subjects, making it more powerful and efficient in detecting differences between
conditions.
● By analyzing the variability within subjects across different conditions, RM-ANOVA can
help researchers determine whether there are significant differences in means across
conditions, while controlling for individual differences between subjects.
● One of the key advantages of RM-ANOVA is its ability to reduce error variance by
accounting for individual differences between subjects, thereby increasing the sensitivity
of the analysis.
● Additionally, RM-ANOVA allows researchers to assess the effects of multiple factors or
treatments within the same subjects, making it well-suited for experimental designs with
complex factorial structures.
● However, RM-ANOVA also has certain assumptions that must be met for valid
interpretation of results.
● These include the assumption of sphericity, which refers to the equality of the variances
of the differences between all possible pairs of conditions, as well as the assumption of
normality for the differences between conditions.
● RM-ANOVA is a powerful statistical tool for analyzing repeated measures data, providing
researchers with valuable insights into the effects of different treatments or conditions
over time while controlling for individual differences between subjects.
● It is commonly used in various fields such as psychology, medicine, and experimental
sciences to examine changes within subjects in response to different interventions or
experimental manipulations.
● Steps:

1. Formulate hypotheses:
- Null Hypothesis (H0): There is no significant difference in means across the conditions
or time points.
- Alternative Hypothesis (H1): At least one condition or time point has a different mean
from the others.

2. Set the significance level (α): Typically set at 0.05 or another predetermined value.

3. Collect data: Obtain data from subjects who are measured under different conditions
or at different time points. Each subject should provide multiple measurements.

4. Check assumptions:
- Sphericity: This assumption implies that the variances of the differences between all
possible pairs of conditions are equal.
- Normality: The differences between conditions should be approximately normally
distributed.

5. Calculate the between-conditions sum of squares (SSC): This measures the variability
between the condition (or time point) means.

6. Calculate the between-subjects sum of squares (SSB): This measures the variability
between subjects, which RM-ANOVA removes from the error term.

7. Calculate the error sum of squares (SSE): This measures the variability that remains
after accounting for conditions and subjects. The total sum of squares (SST) measures
the total variability in the data, with SST = SSC + SSB + SSE.

8. Calculate the degrees of freedom (df):

- Between-conditions degrees of freedom (DFC): ( DFC = k - 1 ), where ( k ) is the
number of conditions.
- Between-subjects degrees of freedom (DFB): ( DFB = n - 1 ), where ( n ) is the
number of subjects.
- Error degrees of freedom (DFE): ( DFE = (k - 1)(n - 1) ).

9. Calculate the mean squares (MS):

- Between-conditions mean square (MSC): ( MSC = SSC / DFC )
- Error mean square (MSE): ( MSE = SSE / DFE )

10. Calculate the F-statistic: Compute the ratio of the between-conditions mean square
to the error mean square:

[ F = MSC / MSE ]

11. Determine the critical value or p-value: Based on the chosen significance level (α)
and the degrees of freedom, determine the critical value from the F-distribution table or
use statistical software to find the p-value associated with the F-statistic.

12. Compare the F-statistic with the critical value or p-value:


- If using critical values, reject the null hypothesis if the F-statistic is greater than the
critical value.
- If using p-values, reject the null hypothesis if the p-value is less than the chosen significance
level.

13. Interpret results: If the null hypothesis is rejected, conclude that there is evidence to
suggest that at least one condition or time point has a different mean from the others.

14. Conclusion: Provide a conclusion based on the results of the test, indicating whether
there is evidence to support the alternative hypothesis.

● RM-ANOVA assumes that the data meet the assumptions of sphericity and normality.
● Violations of these assumptions can affect the validity of the results.
● Additionally, RM-ANOVA is best suited for balanced designs, meaning equal numbers of
measurements for each subject in each condition or time point.
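
● One option for running an RM-ANOVA in Python is statsmodels' AnovaRM, shown in
the sketch below; it assumes a balanced design in long format, and the subjects, time
points, and scores here are invented for illustration.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format data: 4 subjects, each measured at 3 time points
data = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "time":    ["t1", "t2", "t3"] * 4,
    "score":   [5, 7, 8, 4, 6, 7, 6, 8, 9, 5, 6, 8],
})

# Repeated measures ANOVA with 'time' as the within-subjects factor
result = AnovaRM(data, depvar="score", subject="subject", within=["time"]).fit()
print(result)
```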

Test of Significance - Non Parametric

● A test of significance using non-parametric methods is a statistical analysis technique
that does not rely on assumptions about the distribution of the data being analyzed.
● Unlike parametric tests, which assume specific distributions (such as normal
distribution), non-parametric tests offer robustness against violations of distributional
assumptions and are particularly useful when dealing with ordinal or non-normally
distributed data.
● These tests are often employed in situations where the data may not meet the
requirements of parametric tests, such as small sample sizes, skewed distributions, or
data that cannot be transformed to meet parametric assumptions.
● Non-parametric tests provide a valuable alternative for analyzing data in a wide range of
research fields, including social sciences, medicine, and environmental studies.
● Common non-parametric tests include the Wilcoxon Signed-Rank test, Mann-Whitney U
test, Kruskal-Wallis test, and Spearman's rank correlation coefficient. These tests offer
robustness and flexibility, allowing researchers to make inferences about population
parameters without relying on stringent distributional assumptions.
● Non-parametric tests provide researchers with powerful tools for analyzing data when
parametric assumptions cannot be met.
● By offering robustness against violations of distributional assumptions, non-parametric
tests ensure reliable and valid statistical inference in various research contexts.

Mann-Whitney U test
● The Mann-Whitney U test, also known as the Mann-Whitney-Wilcoxon test or Wilcoxon
rank-sum test, is a non-parametric test used to compare two independent groups to
determine whether there is a significant difference between their distributions.
● It's commonly used when the assumptions of the t-test (a parametric test for comparing
means of two groups) are violated, such as when the data are not normally distributed or
when the variances are not equal.

1. Formulate hypotheses:
- Null Hypothesis (H0): There is no difference between the distributions of the two
groups.
- Alternative Hypothesis (H1): There is a difference between the distributions of the two
groups.

2. Set the significance level (α): Typically set at 0.05 or another predetermined value.

3. Collect data: Obtain data from two independent groups. Each group should have data
that represent the variable of interest (e.g., test scores, reaction times, etc.).

4. Rank the data: Pool the data from both groups and rank them from smallest to largest,
ignoring group membership. Tied values are assigned the average of the ranks they
would occupy if untied.

5. Calculate the test statistic (U): The Mann-Whitney U test statistic (U) is calculated
from the rank sums of the groups:
[ U1 = n1 n2 + n1(n1 + 1)/2 - R1 ]
where R1 is the sum of the ranks of the first group; U2 is computed analogously, and
the smaller of U1 and U2 serves as the test statistic.

6. Determine the critical value or p-value: Based on the chosen significance level (α) and
the sample sizes of the groups, determine the critical value from the Mann-Whitney U
distribution table or use statistical software to find the p-value associated with the test
statistic.

7. Compare the test statistic with the critical value or p-value:


- If using critical values, reject the null hypothesis if the test statistic is less than or
equal to the critical value.
- If using p-values, reject the null hypothesis if the p-value is less than the chosen
significance level.
8. Interpret results: If the null hypothesis is rejected, conclude that there is evidence to
suggest that there is a significant difference between the distributions of the two groups.

9. Conclusion: Provide a conclusion based on the results of the test, indicating whether
there is evidence to support the alternative hypothesis.

● The Mann-Whitney U test is widely used in various fields, including psychology, biology,
and social sciences, as it provides a robust method for comparing distributions when
parametric assumptions cannot be met.
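
● A minimal Mann-Whitney U sketch in Python, assuming two small hypothetical samples
of reaction times; scipy ranks the pooled data and returns U and the p-value.

```python
from scipy.stats import mannwhitneyu

# Hypothetical reaction times (ms) for two independent groups
group1 = [320, 350, 310, 400, 370, 330]
group2 = [390, 420, 410, 380, 450, 430]

# Two-sided Mann-Whitney U test
u_stat, p_value = mannwhitneyu(group1, group2, alternative="two-sided")
print(f"U = {u_stat}, p = {p_value:.4f}")
```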

Wilcoxon Signed-Rank test
● The Wilcoxon signed-rank test, often referred to simply as the Wilcoxon test, is a
non-parametric statistical test used to determine whether there is a significant difference
between two related groups or conditions.
● It's specifically designed for paired data, where each observation in one group is paired
with a corresponding observation in the other group.

1. Formulate hypotheses:
- Null Hypothesis (H0): There is no difference between the two related groups or
conditions.
- Alternative Hypothesis (H1): There is a difference between the two related groups or
conditions.

2. Set the significance level (α): Typically set at 0.05 or another predetermined value.

3. Collect data: Obtain paired data from two related groups or conditions. Each pair
should represent measurements or observations taken under different conditions or at
different time points.

4. Calculate the differences: For each pair of observations, calculate the difference
between the two measurements.

5. Rank the absolute differences: Rank the absolute values of the differences from
smallest to largest, ignoring the signs.

6. Assign ranks: Assign ranks to the absolute differences based on their magnitude,
starting from 1 for the smallest absolute difference.

7. Calculate the test statistic (W): The Wilcoxon signed-rank test statistic (W) is the sum
of the ranks of the positive or negative differences, whichever is smaller.
8. Determine the critical value or p-value: Based on the chosen significance level (α) and
the sample size, determine the critical value from the Wilcoxon signed-rank distribution
table or use statistical software to find the p-value associated with the test statistic.

9. Compare the test statistic with the critical value or p-value:


- If using critical values, reject the null hypothesis if the test statistic falls within the
critical region.
- If using p-values, reject the null hypothesis if the p-value is less than the chosen
significance level.

10. Interpret results: If the null hypothesis is rejected, conclude that there is evidence to
suggest that there is a significant difference between the two related groups or
conditions.

11. Conclusion: Provide a conclusion based on the results of the test, indicating whether
there is evidence to support the alternative hypothesis.

● The Wilcoxon signed-rank test is widely used in various fields, such as psychology,
medicine, and environmental science, when the data do not meet the assumptions of
parametric tests like the paired t-test.
● It provides a robust method for comparing paired data while allowing for flexibility and
sensitivity to deviations from normality.
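
● A minimal Wilcoxon signed-rank sketch in Python, assuming hypothetical paired
measurements taken before and after an intervention; scipy ranks the paired
differences and returns W and the p-value.

```python
from scipy.stats import wilcoxon

# Hypothetical paired measurements for the same 8 subjects
before = [12.1, 14.3, 11.8, 13.5, 15.0, 12.9, 14.1, 13.2]
after  = [11.4, 13.9, 11.2, 12.8, 14.2, 12.5, 13.6, 12.7]

# Wilcoxon signed-rank test on the paired differences
w_stat, p_value = wilcoxon(before, after)
print(f"W = {w_stat}, p = {p_value:.4f}")
```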

Kruskal-Wallis test
● The Kruskal-Wallis test is a non-parametric test used to determine whether there are
statistically significant differences between the medians of three or more independent
groups.
● It's an extension of the Mann-Whitney U test for comparing two groups to situations with
more than two groups.

1. Formulate hypotheses:
- Null Hypothesis (H0): There is no difference between the medians of the groups.
- Alternative Hypothesis (H1): At least one group median is different from the others.

2. Set the significance level (α): Typically set at 0.05 or another predetermined value.

3. Collect data: Obtain data from three or more independent groups. Each group should have
data that represent the variable of interest (e.g., test scores, reaction times, etc.).

4. Rank the data: Pool the data from all groups and rank them from smallest to largest, ignoring
group membership. Tied values are assigned the average of the ranks they would occupy if
untied.
5. Calculate the test statistic (H): The Kruskal-Wallis test statistic (H) is calculated based on the
ranks of the data. It measures the degree of difference between the groups.

6. Determine the critical value or p-value: Based on the chosen significance level (α) and the
sample sizes of the groups, determine the critical value from the Kruskal-Wallis distribution table
or use statistical software to find the p-value associated with the test statistic.

7. Compare the test statistic with the critical value or p-value:


- If using critical values, reject the null hypothesis if the test statistic is greater than the critical
value.
- If using p-values, reject the null hypothesis if the p-value is less than the chosen significance
level.

8. Interpret results: If the null hypothesis is rejected, conclude that there is evidence to suggest
that at least one group median is different from the others.

9. Conclusion: Provide a conclusion based on the results of the test, indicating whether there is
evidence to support the alternative hypothesis.

● The Kruskal-Wallis test is particularly useful when the assumptions of parametric tests,
such as ANOVA, cannot be met, such as when the data are not normally distributed or
when the variances are not equal across groups.
● It's widely used in various fields, including psychology, biology, and social sciences, to
compare distributions between multiple groups.
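
● A minimal Kruskal-Wallis sketch in Python on three hypothetical independent groups;
scipy ranks the pooled data and returns H and the p-value.

```python
from scipy.stats import kruskal

# Hypothetical measurements for three independent groups (invented)
group_a = [7.1, 8.2, 6.9, 7.5, 8.0]
group_b = [5.9, 6.4, 6.1, 5.7, 6.3]
group_c = [8.8, 9.1, 8.5, 9.4, 8.9]

# Kruskal-Wallis H test across the three groups
h_stat, p_value = kruskal(group_a, group_b, group_c)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```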

Friedman's Analysis of Variance


● Friedman's Analysis of Variance (ANOVA) is a non-parametric statistical test used to
determine whether there are statistically significant differences between the medians of
three or more related groups or conditions.
● It's an extension of the Wilcoxon signed-rank test for comparing two related groups to
situations with more than two related groups.

1. Formulate hypotheses:
- Null Hypothesis (H0): There is no difference between the medians of the related groups.
- Alternative Hypothesis (H1): At least one group median is different from the others.

2. Set the significance level (α): Typically set at 0.05 or another predetermined value.

3. Collect data: Obtain data from three or more related groups or conditions. Each group should
have data that represent the variable of interest (e.g., test scores, reaction times, etc.).

4. Rank the data: For each subject or experimental unit, rank its observations across the
conditions from smallest to largest; tied values receive the average of the ranks involved.
5. Calculate the test statistic (Friedman's chi-squared statistic): Friedman's test statistic (χ²) is
calculated based on the ranks of the data. It measures the degree of difference between the
related groups.

6. Determine the critical value or p-value: Based on the chosen significance level (α) and the
sample size, determine the critical value from the chi-squared distribution table with (k - 1)
degrees of freedom, where (k) is the number of related groups. Alternatively, use statistical
software to find the p-value associated with the test statistic.

7. Compare the test statistic with the critical value or p-value:


- If using critical values, reject the null hypothesis if the test statistic is greater than the critical
value.
- If using p-values, reject the null hypothesis if the p-value is less than the chosen significance
level.

8. Interpret results: If the null hypothesis is rejected, conclude that there is evidence to suggest
that at least one group median is different from the others.

9. Conclusion: Provide a conclusion based on the results of the test, indicating whether there is
evidence to support the alternative hypothesis.

● Friedman's ANOVA is useful when the assumptions of parametric tests such as
repeated measures ANOVA cannot be met, for example when the data are not normally
distributed or when the variances are not equal across groups.
● It's commonly used in various fields, including psychology, medicine, and environmental
science, to compare distributions between multiple related groups or conditions.
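
● A minimal Friedman test sketch in Python, assuming scores for the same six subjects
under three hypothetical conditions; scipy ranks within each subject and returns the
chi-squared statistic and p-value.

```python
from scipy.stats import friedmanchisquare

# Hypothetical scores for the same 6 subjects under three conditions
cond1 = [8, 7, 9, 6, 8, 7]
cond2 = [6, 5, 7, 5, 6, 6]
cond3 = [9, 8, 9, 7, 9, 8]

# Friedman chi-squared test across the related conditions
chi2_stat, p_value = friedmanchisquare(cond1, cond2, cond3)
print(f"chi2 = {chi2_stat:.3f}, p = {p_value:.4f}")
```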
