SPSS Advance Statistics Session 1 RCD DR Muhammad Khan Asif
Workshop on
Introduction to
Statistics in Dentistry
using SPSS
13/3/2023
What is Statistics?
Sampling Methods
Types:
1. Probability Sampling: Every member of the population has a known, non-zero
chance of being selected. Probability sampling techniques are generally the
most valid choice.
Variable
A variable can be defined as a characteristic of things or objects that
takes different values in the different items that are tested.
There are two types of variables:
Quantitative variable: This is a phrase used to describe
measurable characteristics like height, weight, age and exam
marks, and counts like number of passes, number of students and
number of accidents.
Qualitative variable: This is a phrase used to describe
characteristics that cannot be measured or counted, but merely
categorized, like race, sex, colour, exam grades and blood group.
Data
Raw material of statistics
Types of data
Quantitative data can be classified into
discrete data and continuous data.
1. Discrete data are numerical characteristics
that are countable (whole numbers).
Examples: Number of males and number of
females, Number of patients waiting for surgery,
Number of students sitting for an exam
2. Continuous data are numerical characteristics
that are measurable.
Examples: Marks obtained by students in an exam,
Body mass index (BMI) of patients, Time taken by
athletes to complete a road race
Data
Qualitative data can be classified further into nominal data and
ordinal data.
1. Nominal data are categorical characteristics that can be named.
Examples: Gender: Male or female – based on physical traits. Blood
group: A, B, AB or O – based on allele types. Of course, it is not true that
group A is better than group B. They are just names given based on
particular characteristics.
2. Ordinal data are categorical characteristics that can be named
and ranked as well.
Examples: Socio-economic status: Low, middle or high. Exam grades: A,
B, C, D or E – based on level of achievement. Of course, grade A is
better than grade B and so on.
DESCRIPTIVE STATISTICS
DESCRIPTIVE STATISTICS
Distribution
Skewness: Measures the asymmetry of a distribution.
Generally, if the skewness value is within ±1,
symmetry can be assumed.
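The ±1 rule of thumb can be tried outside SPSS with a small sketch. It uses the bias-corrected (adjusted Fisher-Pearson) skewness formula, which is assumed here to be the form SPSS reports. The function names and the data are hypothetical and for illustration only.

```python
from math import sqrt

def sample_skewness(values):
    """Bias-corrected (adjusted Fisher-Pearson) sample skewness; needs n >= 3."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if sd == 0:
        raise ValueError("skewness is undefined for constant data")
    cubed = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

def roughly_symmetric(values, limit=1.0):
    """Rule of thumb from the slide: skewness within plus or minus the limit."""
    return abs(sample_skewness(values)) <= limit

# Hypothetical data, not taken from the workshop data set.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 2, 3, 20]

print(sample_skewness(symmetric), roughly_symmetric(symmetric))
print(sample_skewness(right_skewed), roughly_symmetric(right_skewed))
```

The single large value in the second list pulls the skewness well above 1, so the rule of thumb flags it as asymmetric.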
DESCRIPTIVE STATISTICS
Test of normality
The Shapiro-Wilk test is usually used when the sample size is small, generally
less than 50. The Kolmogorov-Smirnov test can be used when the sample size is
large. In both tests, if the p-value is more than 0.05, normality can be assumed.
In this example, since the sample size is 36, the Shapiro-Wilk test will be
used.
The p-value of the test is more than 0.05. Hence, the data can be assumed to
be normally distributed. The normality assumption is the foundation of many
statistical tests.
A number of tests require the raw data to be normally distributed. Other
tests require a derived variable to be normally distributed. When the
normality assumption is not met, the researcher will have to turn to the
next alternative, which is to use nonparametric methods.
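The decision rule described above can be written down as a short sketch. The function names are hypothetical, the cutoff of 50 and the 0.05 level are the ones quoted on the slide, and the p-value in the usage lines is made up for illustration.

```python
def choose_normality_test(n, small_sample_cutoff=50):
    """Pick the test by sample size, following the slide's rule of thumb."""
    return "Shapiro-Wilk" if n < small_sample_cutoff else "Kolmogorov-Smirnov"

def normality_can_be_assumed(p_value, alpha=0.05):
    """Normality is assumed when the test's p-value exceeds alpha."""
    return p_value > alpha

def recommended_approach(p_value, alpha=0.05):
    """Fall back to nonparametric methods when normality is not met."""
    if normality_can_be_assumed(p_value, alpha):
        return "parametric"
    return "nonparametric"

# Sample size from the slide's example; the p-value is hypothetical.
print(choose_normality_test(36))
print(recommended_approach(0.20))
```

The sketch only encodes the decision. The p-value itself still has to come from the normality test output in SPSS.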
Interpretation:
The mean BMI for the 36 subjects is 28.4 with a standard
deviation of 5.3 (usually written as 28.4±5.3). The maximum and
minimum BMI values are 37.8 and 17.2, so the range is 20.6. The
median BMI is 28.0, which indicates that about half of the
respondents have a BMI of 28.0 or more. The skewness
value is -0.131, which is within ±1. Hence, the data can be
assumed to be symmetrical. The p-value for the Shapiro-Wilk test is
greater than 0.05, so normality can be assumed and the
assumption is met.
Assumptions
The main assumption is that the test variable is normally distributed
in the population and the cases in the sample represent a random
sample from the population. In most circumstances, with a large
sample size of more than 30, the test yields relatively valid results even
if the population is substantially non-normal.
Findings
1. The mean difference is 2.4.
2. The p-value of the test is 0.010, which is less than 0.05.
3. The 95% CI for mean difference is [0.6, 4.2], which does not include zero.
4. The 95% CI for population mean = [(0.6+26), (4.2+26)] = [26.6, 30.2].
Conclusion: The p-value of the test is less than 0.05. Thus, the mean BMI in
the population differs significantly from 26. We are 95% confident that the
mean BMI in the population is between 26.6 and 30.2 kg/m².
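The arithmetic behind these findings can be sketched from summary statistics alone. The sketch assumes the test uses the sample reported earlier (mean BMI 28.4, standard deviation 5.3, n = 36) against a test value of 26. The function name is hypothetical, and the critical value 2.030 is the two-sided 95% value of the t distribution with 35 degrees of freedom, taken from a standard t table.

```python
from math import sqrt

def one_sample_t_summary(mean, sd, n, test_value, t_critical):
    """One-sample t statistic and 95% CIs from summary statistics."""
    se = sd / sqrt(n)                      # standard error of the mean
    diff = mean - test_value               # mean difference
    t_stat = diff / se
    margin = t_critical * se
    ci_diff = (diff - margin, diff + margin)
    # Adding the test value back shifts the CI onto the population mean.
    ci_mean = (test_value + ci_diff[0], test_value + ci_diff[1])
    return diff, t_stat, ci_diff, ci_mean

T_CRIT_DF35 = 2.030  # assumption: value read from a t table, df = 35

diff, t_stat, ci_diff, ci_mean = one_sample_t_summary(
    mean=28.4, sd=5.3, n=36, test_value=26, t_critical=T_CRIT_DF35
)
print(round(diff, 1), round(t_stat, 2))
print([round(x, 1) for x in ci_diff])
print([round(x, 1) for x in ci_mean])
```

Rounded to one decimal place, the intervals agree with findings 3 and 4. The p-value is not computed here because the standard library has no t distribution; SPSS reports it directly.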
Findings
1. The mean difference is -1.3.
2. The p-value of the test is 0.621, which is more than 0.05.
3. The 95% CI for mean difference is [-6.8, 4.1], which includes zero.
4. The 95% CI for population mean = [(-6.8+150), (4.1+150)] = [143.2, 154.1].
Conclusion: The p-value of the test is more than 0.05. Thus, there is no
significant evidence that the mean SBP in the population differs from 150.
We are 95% confident that the mean SBP in the population is between 143 and
154 mmHg.
Exercise: Test whether the mean diastolic blood pressure (DBP) is equal to 100.
Run the test and interpret the result.
Assumptions
The difference scores must be normally distributed. If this
assumption is not met, the nonparametric test must be used.
The p-value of the test is 0.235, which is greater than 0.05. Hence, the
assumption of equality of variances is met, and the parametric
test can be used.
Findings
1. The mean difference is 5.1±2.7.
2. The p-value of the test is less than 0.001.
3. The 95% CI for mean difference is [4.0, 6.2], which does not include zero.
Conclusion: The p-value of the test is less than 0.05. Thus, there is a significant change in
mean SBP. Since the mean after is less than the mean before, the drug appears to be effective.
We are 95% confident that the reduction in SBP is between 4.0 mmHg and 6.2 mmHg.
Conclusion: There is a
significant difference in DBP.
Hence, the drug is effective.
Before the training program data: 12, 15, 16, 15, 13, 14,
15, 12, 18, 19
After the training program data: 11, 14, 12, 14, 10, 12,
13, 11, 16, 17
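A paired t-test on these before and after values can be sketched with the standard library. The sketch assumes the two lists are paired in the order given (first "before" value with first "after" value, and so on). The critical value 2.262 is the two-sided 95% value of the t distribution with 9 degrees of freedom, taken from a standard t table. Normality of the difference scores should still be checked before relying on the result.

```python
from math import sqrt
from statistics import mean, stdev

before = [12, 15, 16, 15, 13, 14, 15, 12, 18, 19]
after = [11, 14, 12, 14, 10, 12, 13, 11, 16, 17]

# Difference score for each subject (before minus after).
differences = [b - a for b, a in zip(before, after)]
n = len(differences)

mean_diff = mean(differences)
sd_diff = stdev(differences)        # sample SD of the differences
se_diff = sd_diff / sqrt(n)         # standard error of the mean difference
t_stat = mean_diff / se_diff        # paired t statistic, df = n - 1

T_CRIT_DF9 = 2.262                  # assumption: value read from a t table, df = 9
margin = T_CRIT_DF9 * se_diff
ci = (mean_diff - margin, mean_diff + margin)

print("mean difference:", round(mean_diff, 2))
print("t statistic:", round(t_stat, 2), "with df =", n - 1)
print("95% CI:", [round(x, 1) for x in ci])
print("significant at 5%:", abs(t_stat) > T_CRIT_DF9)
```

The t statistic exceeds the critical value and the interval does not include zero, so the change is significant at the 5% level. SPSS reports the exact p-value for the same test.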
Findings
1. The p-value for Levene's test for equality of variances is 0.883. Since the p-value is
more than 0.05, equality of variances is assumed.
2. The mean difference is 7.9.
3. The p-value of the test is less than 0.001.
4. The 95% CI for mean difference is [5.6, 10.3], which does not include zero.
Conclusion: The p-value of the test is less than 0.05. Thus, there is a significant difference
in mean BMI between the males and females. The mean BMI among the females is higher
than among the males. We are 95% confident that the difference is between 5.6 and 10.3.
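When equality of variances is assumed, the independent-samples t statistic uses a pooled variance. A minimal sketch of that calculation follows. The function name and the BMI values are hypothetical and are not the workshop data.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_test(group_a, group_b):
    """Independent-samples t statistic assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    df = n_a + n_b - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return diff, diff / se, df

# Hypothetical BMI values, for illustration only.
females = [30.1, 32.4, 29.8, 31.5, 33.0]
males = [24.0, 25.2, 23.1, 26.3, 24.9]

diff, t_stat, df = pooled_t_test(females, males)
print(round(diff, 2), round(t_stat, 2), df)
```

If Levene's test were significant (p-value of 0.05 or less), the row labelled "Equal variances not assumed" in the SPSS output would be read instead, and this pooled formula would not apply.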
ONE-WAY ANOVA
The one-way ANOVA can be used to test if there is a difference in a
measured characteristic between more than two groups of cases.
The objective is to test if there is a difference in means between more
than two population groups.
Example:
Findings
1. The p-value for Levene's test for equality of variances is 0.384, which is more
than 0.05. Thus, the equality of variances assumption is met.
2. The p-value of the test is 0.028, which is less than 0.05. Hence, at least one pair
of means differs significantly.
Conclusion: At least one pair of means differs significantly.
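The F statistic that the one-way ANOVA p-value is based on can be sketched as the ratio of between-group to within-group variability. The function name and the three groups are hypothetical and are not the workshop data.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F statistic with its between and within degrees of freedom."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three groups, for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f_stat, df_between, df_within)
```

A significant F only says that at least one pair of means differs. Identifying which pairs differ requires a post hoc comparison.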
• The p-values for the mean difference are given in the column ‘Sig.’
Thank you