Statistical Analysis Report Using IBM SPSS: Jaganath Kaliyamoorthy
IBM SPSS
RESEARCH METHODS – CA 1
MSc in Cybersecurity
Jaganath Kaliyamoorthy
Student ID: 19198868
School of Computing
Contents
1. Introduction:
1.1 Tool Used:
1.2 Independent Sample t test:
1.3 Dataset:
1.3.1 Assumptions:
1.3.2 Hypothesis:
1.3.3 Group Statistics:
1.3.4 Independent sample test:
1.3.5 Conclusion:
1.4 Mann-Whitney U Test
1.4.1 Assumptions:
1.4.2 Hypothesis:
1.4.3 Mann-Whitney U Test:
1.4.4 Conclusion:
1.5 Chi-Square test:
1.5.1 Assumptions:
1.5.2 Hypothesis:
1.5.3 Chi-Square Test:
1.5.4 Conclusion:
2.1 Dataset:
2.2 Assumptions:
2.3 Hypothesis:
2.4 Data Cleaning:
2.5 Descriptive statistics:
2.6 Homogeneity of variances and ANOVA table:
2.7 Post Hoc test:
2.8 Conclusion:
REFERENCE:
1. Introduction:
This report provides an overview of several statistical tests conducted on two datasets. The
tests are the independent-samples t-test, the Mann-Whitney U test, the chi-square test, and the
one-way ANOVA. One of the datasets was downloaded from Kaggle; the link is given in Section 1.3.
1.3 Dataset:
Dataset link: https://www.kaggle.com/muraleetharan/college-student-dataset
The dataset describes college students and includes variables such as height, age, marital
status, hours of study per week, current GPA, and whether the student has children. Using these
variables, we run a number of tests, compare group means, and check whether the differences
are statistically significant.
1.3.1 Assumptions:
• The dependent variable must be measured on a continuous scale.
• The independent variable must consist of two categorical, independent groups.
• Observations must be independent, with no relationship between the observations in
each group or between the groups.
• There should be no significant outliers.
• The variances of the two groups should be homogeneous.
• The dependent variable should be approximately normally distributed for each group of
the independent variable.
1.3.2 Hypothesis:
• Null hypothesis (H0): The mean hours of study per week are equal for single and
married students.
• Alternate hypothesis (H1): The mean hours of study per week are not equal for single
and married students.
18 samples in the married group. The mean hours of study per week is 11.50 for single
students and 18.06 for married students.
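The comparison of two group means described in this section can be sketched in plain Python. This is a minimal illustration, not the SPSS procedure itself: the function name `independent_t` and the sample values are assumptions made for the example, and the report's raw data is not reproduced here. The sketch returns only the t statistic and its degrees of freedom; the p-value would still have to be read from a t distribution table or a statistics package.

```python
import math
from statistics import mean, variance


def independent_t(a, b, equal_var=True):
    """Return (t, df) for two independent samples.

    equal_var=True uses the pooled-variance (Student) form;
    equal_var=False uses the Welch form with Satterthwaite degrees of freedom.
    """
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)  # sample variances (n - 1 denominator)
    diff = mean(a) - mean(b)
    if equal_var:
        pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se = math.sqrt(pooled * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        se = math.sqrt(v1 / n1 + v2 / n2)
        df = (v1 / n1 + v2 / n2) ** 2 / (
            (v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1)
        )
    return diff / se, df


# Hypothetical hours of study per week (NOT the data used in this report).
single = [8, 10, 12, 11, 9, 14]
married = [15, 18, 20, 17, 19, 16]

for equal_var in (True, False):
    t, df = independent_t(single, married, equal_var=equal_var)
    print(f"equal_var={equal_var}: t = {t:.3f}, df = {df:.2f}")
```

The two forms correspond to the "equal variances assumed" and "equal variances not assumed" rows of the SPSS output; the result of Levene's test is what usually decides which row to read.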
1.3.5 Conclusion:
Levene’s test was performed to check the equality of variances, which determines which row
of the independent-samples test output is read. The reported p-value is 0.016, which is less
than 0.05, so the null hypothesis is rejected: the hours of study per week differ significantly
between married and single students.
1.4 Mann-Whitney U Test
The Mann-Whitney U test compares two independent groups for a significant difference. It is
usually performed when the dependent variable is ordinal or continuous but not normally
distributed. [2]
1.4.1 Assumptions:
• The dependent variable must be measured on an ordinal or continuous scale.
• The independent variable must consist of two categorical, independent groups.
• Observations must be independent, with no relationship between the observations in
each group or between the groups.
• The test can be used when the dependent variable is not normally distributed in the two
groups.
1.4.2 Hypothesis:
• Null hypothesis (H0): The hours of study per week are equal for students who have
children and students who do not have children.
• Alternate hypothesis (H1): The hours of study per week are not equal for students who
have children and students who do not have children.
The table below shows that the hours of study per week are higher for students who have
children than for students who do not have children.
The table below shows the test statistics, including the U statistic and the Asymp. Sig.
(2-tailed) p-value. From this table we can conclude that the hours of study per week are
significantly higher for students who have children than for students who do not
(U = 142, p = .001).
Fig 4 Test statistics
1.4.4 Conclusion:
The p-value (reported as .001 in the test statistics) is less than 0.05, so the null
hypothesis is rejected: the hours of study per week differ significantly between students who
have children and students who do not.
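As an illustration of how the U statistic in this section is obtained, the following is a minimal Python sketch based on rank sums. The helper names `average_ranks` and `mann_whitney_u` and the sample values are assumptions made for the example; they are not part of SPSS and do not reproduce the report's data. The sketch computes U only, not the asymptotic p-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U1, U2); the smaller of the two is the value usually reported."""
    ranks = average_ranks(list(x) + list(y))
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    return u1, u2


# Hypothetical hours of study per week (NOT the data used in this report).
no_children = [8, 10, 12, 9, 11, 7]
has_children = [14, 16, 12, 18, 15]

u1, u2 = mann_whitney_u(no_children, has_children)
print(f"U1 = {u1}, U2 = {u2}, reported U = {min(u1, u2)}")
```

Because the test works on ranks rather than raw values, it does not require the dependent variable to be normally distributed.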
1.5 Chi-Square test:
This test is used to determine whether two categorical variables are associated. [3]
1.5.1 Assumptions:
• Both variables used in the chi-square test must be categorical (nominal or ordinal).
• Each of the two variables must consist of two or more categorical, independent
groups.
1.5.2 Hypothesis:
• Null hypothesis (H0): The student’s current GPA is not associated with the age group
(less than 22, 22-28, 30 or more).
• Alternate hypothesis (H1): The student’s current GPA is associated with the age group
(less than 22, 22-28, 30 or more).
In the table above, the number of missing values is reported as 0 and the number of valid
samples is 50.
……
The table shows a Pearson chi-square value of 57.37 with a significance value of 0.002, which
is well below 0.05. Fig 7 shows the phi and Cramer’s V statistics with their significance.
These measures are used to check how strong the association between the variables is.
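The Pearson chi-square statistic and Cramer's V mentioned above can be sketched in plain Python from a table of observed counts. This is a minimal illustration under stated assumptions: the function name `chi_square_test` and the counts are hypothetical (chosen only so that they sum to 50), and they do not reproduce the report's crosstabulation or its result.

```python
import math


def chi_square_test(table):
    """Return (chi2, df, cramers_v) for a contingency table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    n_rows, n_cols = len(row_totals), len(col_totals)
    df = (n_rows - 1) * (n_cols - 1)
    cramers_v = math.sqrt(chi2 / (n * min(n_rows - 1, n_cols - 1)))
    return chi2, df, cramers_v


# Hypothetical counts: rows are age groups, columns are GPA bands
# (NOT the crosstabulation used in this report).
observed = [
    [8, 6, 4],
    [5, 9, 6],
    [3, 4, 5],
]

chi2, df, v = chi_square_test(observed)
print(f"chi-square = {chi2:.3f}, df = {df}, Cramer's V = {v:.3f}")
```

Cramer's V ranges from 0 (no association) to 1 (perfect association), which is why it is read alongside the chi-square significance value.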
1.5.4 Conclusion:
Since the p-value of .002 is less than 0.05, we reject the null hypothesis and conclude that
the student’s current GPA is associated with the age group.
2. ANOVA TEST:
A one-way analysis of variance (ANOVA) tests whether the means of independent groups
differ. [4]
2.1 Dataset:
Group      Bone density (mg/cm³)
Control    611 621 614 593 593 653 600 554 603 569
Low jump   635 605 638 594 599 632 631 588 607 596
High jump  650 622 626 626 631 622 643 674 643 650
The dataset comes from a study of rats assigned to one of three treatments: a control group
with no jumping, a low-jump group with jumps of up to 30 centimeters, and a high-jump group
with jumps of up to 60 centimeters. The bone density of the rats after 8 weeks is provided.
2.2 Assumptions:
• The dependent variable must be measured on a continuous scale.
• The independent variable must consist of more than two categorical, independent
groups.
• Observations must be independent, with no relationship between the observations in
each group or between the groups.
• There should be no significant outliers.
• The dependent variable should be approximately normally distributed for each group of
the independent variable.
• The variances of the groups should be homogeneous.
2.3 Hypothesis:
• Null hypothesis (H0): The mean bone density of the rats is equal across the groups.
• Alternate hypothesis (H1): The mean bone density of the rats is not equal across the
groups.
2.6 Homogeneity of variances and ANOVA table:
The homogeneity of variances is checked with Levene's test (Tukey's method is a post hoc
comparison rather than a test of homogeneity), and we can assume that the variances of the
groups are equal. From the table, the control and low-jump groups have mean values of 601.10
and 612.50, which are close to each other, whereas the high-jump group has a mean of 638.70,
well above the other two. These differences concern the group means; the homogeneity
assumption concerns only the spread within each group.
The ANOVA table below shows that the difference between the group means is statistically
significant, with F = 7.978 and a significance value of 0.002. Together with the group means,
this suggests that bone density is higher for the rats in the high-jump group; which groups
actually differ is established by the post hoc test.
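The F statistic can be reproduced from the bone density values listed in Section 2.1 with a short Python sketch. The function name `one_way_anova` is an assumption made for the example. The same function is reused to sketch a mean-centred Levene statistic, which is an ANOVA on the absolute deviations from each group mean; the report does not quote that value, so it is printed without comparison.

```python
from statistics import mean

# Bone density (mg/cm^3) from the dataset in Section 2.1.
groups = {
    "Control": [611, 621, 614, 593, 593, 653, 600, 554, 603, 569],
    "Low jump": [635, 605, 638, 594, 599, 632, 631, 588, 607, 596],
    "High jump": [650, 622, 626, 626, 631, 622, 643, 674, 643, 650],
}


def one_way_anova(samples):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for s in samples for v in s]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


samples = list(groups.values())
for name, values in groups.items():
    print(f"{name}: mean = {mean(values):.2f}")

f_stat, df_b, df_w = one_way_anova(samples)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")  # prints F(2, 27) = 7.978

# Mean-centred Levene statistic: ANOVA on absolute deviations from group means.
abs_dev = [[abs(v - mean(s)) for v in s] for s in samples]
levene_w, _, _ = one_way_anova(abs_dev)
print(f"Levene W = {levene_w:.3f}")
```

The printed group means and F value match the figures reported in this section (601.10, 612.50, 638.70 and F = 7.978). The p-value is not computed here, since it requires the F distribution.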
is less than 0.05. The value is the same for both tests. Hence, the high-jump result supports
the alternate hypothesis.
2.8 Conclusion:
From the analysis, the one-way ANOVA shows a statistically significant difference between
the groups, with F(2, 27) = 7.978 and p = 0.002, which is less than 0.05. We conclude that the
bone density of the rats that perform high jumps is higher than that of the other two groups,
and the null hypothesis is therefore rejected.
REFERENCE:
[1] Laerd Statistics, n.d. Independent t-test in SPSS Statistics - Procedure, output and
interpretation of the output using a relevant example [WWW Document]. URL
https://statistics.laerd.com/spss-tutorials/independent-t-test-using-spss-statistics.php
(accessed 11.1.20).
[2] Laerd Statistics, n.d. Mann-Whitney U Test in SPSS Statistics | Setup, Procedure &
Interpretation [WWW Document]. URL
https://statistics.laerd.com/spss-tutorials/mann-whitney-u-test-using-spss-statistics.php
(accessed 11.1.20).
[3] Yeager, K., n.d. LibGuides: SPSS Tutorials: Chi-Square Test of Independence [WWW
Document]. URL https://libguides.library.kent.edu/SPSS/ChiSquare (accessed 11.1.20).
[4] Laerd Statistics, n.d. One-way ANOVA in SPSS Statistics - Step-by-step procedure
including testing of assumptions [WWW Document]. URL
https://statistics.laerd.com/spss-tutorials/one-way-anova-using-spss-statistics.php
(accessed 11.1.20).