SPSS Advance Statistics Session 1 RCD DR Muhammad Khan Asif

13/3/2023
Workshop on
Introduction to
Statistics in Dentistry
using SPSS
Dr. Muhammad Khan Asif

BDS (PAK), CHPE (PAK), MDSc
(Malaysia), PhD (Malaysia).
Chairman Shifa College of Dentistry
Research Board, Assistant Professor,
Head of Research & Development and
Forensic Odontology Department,
Shifa College of Dentistry, Islamabad.
What is Statistics?
• Statistics is the science of learning from data.
• As Dr. Diego Kuonen (from Statoo Consulting, Switzerland) once
remarked, “Statistics is concerned with one of the most basic
human needs: the need to know about the world and how it
operates in the face of variation and uncertainty.” We can say the
whole field of statistics revolves around this concept – variation.
Population and Sample
• Population: A set of things or objects in which we have an
interest at a particular time.
Examples: Workers at a factory, students in a college, in-patients at
a hospital.
• Sample: A subset of the population.
Examples: A group of workers at the factory, a selection of students
from the college.
Sampling Methods
• Types:
• 1. Probability Sampling: Every member of the population has a known,
non-zero chance of being selected. Because selection is random, probability
sampling gives the most valid basis for generalizing to the population.
• 2. Non-Probability Sampling: Individuals are selected based on non-random
criteria, and not every individual has a chance of being included in the study.
This type of sample is easier and cheaper to obtain, but it carries a higher
risk of sampling bias.
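The contrast between the two approaches can be illustrated with a short Python sketch (standard library only). The sampling frame of 200 patient IDs, the sample size and the function names are hypothetical and chosen only for illustration; they are not part of the workshop datasets.

```python
import random

# Hypothetical sampling frame: 200 patient IDs (an assumption for illustration).
population = list(range(1, 201))

def simple_random_sample(frame, n, seed=None):
    """Probability sampling: every member has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # sampling without replacement

def convenience_sample(frame, n):
    """Non-probability sampling: take whoever is easiest to reach,
    represented here by the first n members of the frame."""
    return frame[:n]

random_ids = simple_random_sample(population, 20, seed=1)
convenient_ids = convenience_sample(population, 20)

print(sorted(random_ids))  # IDs can come from anywhere in the frame
print(convenient_ids)      # only IDs 1 to 20; later patients can never be chosen
```

In the convenience sample, patients with IDs above 20 have no chance of inclusion, which is the source of the sampling bias described above.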
Variable
A variable is a characteristic of things or objects that takes different
values in the different items that are tested.
There are two types of variables:
• Quantitative variable: A measurable characteristic such as height,
weight, age or exam marks, or a count such as the number of passes,
number of students or number of accidents.
• Qualitative variable: A characteristic that cannot be measured or
counted but only categorized, such as race, sex, colour, exam grades
or blood group.
Data
• Raw material of statistics
Types of data
• Quantitative data can be classified into
discrete data and continuous data.
1. Discrete data are numerical characteristics
that are countable (whole numbers).
Examples: Number of males and number of
females, number of patients waiting for surgery,
number of students sitting for an exam.
2. Continuous data are numerical characteristics
that are measurable.
Examples: Marks obtained by students in an exam,
body mass index (BMI) of patients, time taken by
athletes to complete a road race.
Data
• Qualitative data can be classified further into nominal data and
ordinal data.
1. Nominal data are categorical characteristics that can be named.
Examples: Gender: Male or female – based on physical traits. Blood
group: A, B, AB or O – based on allele types. Of course, it is not true that
group A is better than group B. They are just names given based on
particular characteristics.
2. Ordinal data are categorical characteristics that can be named
and ranked as well.
Examples: Socio-economic status: Low, middle or high. Exam grades: A,
B, C, D or E – based on level of achievement. Of course, grade A is
better than grade B and so on.
DESCRIPTIVE STATISTICS
Distribution
• Skewness: Measures the asymmetry of a distribution.
• As a rule of thumb, if the skewness value lies within ±1,
the distribution can be treated as approximately symmetric.
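As a rough cross-check outside SPSS, the skewness value can be computed by hand. The sketch below uses the adjusted sample skewness formula that statistical packages commonly report; it is assumed here, not confirmed by the slides, that this matches the SPSS Explore output. The two small datasets and the function names are hypothetical.

```python
import statistics

def sample_skewness(values):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / sd)^3)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least 3 values are needed")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    total = sum(((x - mean) / sd) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * total

def roughly_symmetric(values, limit=1.0):
    """Apply the rule of thumb: skewness within plus or minus 1."""
    return abs(sample_skewness(values)) <= limit

# Hypothetical data, for illustration only.
symmetric_data = [1, 2, 3, 4, 5]
skewed_data = [1, 1, 1, 2, 10]  # one large value pulls the tail to the right

print(roughly_symmetric(symmetric_data))
print(roughly_symmetric(skewed_data))
```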
Test of normality
• The Shapiro-Wilk test is usually used when the sample size is small, generally
less than 50. The Kolmogorov-Smirnov test can be used when the sample size is large.
• In both tests, if the p-value is more than 0.05, normality can be assumed.
In this example, since the sample size is 36, the Shapiro-Wilk test is used.
• The p-value of the test is more than 0.05. Hence, the data can be assumed to
be normally distributed. The normality assumption is the foundation of many
statistical tests.
• Some tests require the raw data to be normally distributed; others require
a derived variable (for example, the difference scores in a paired test) to be
normally distributed. When the normality assumption is not met, the researcher
turns to the alternative, which is to use nonparametric methods.
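The decision rule on this slide can be written down as a small Python sketch. The function names are hypothetical; the cut-off of 50 and the 0.05 level are the ones quoted above, and the p-values themselves still come from the SPSS output.

```python
def choose_normality_test(sample_size, small_sample_limit=50):
    """Pick the normality test by sample size, following the rule on the slide."""
    if sample_size < small_sample_limit:
        return "Shapiro-Wilk"
    return "Kolmogorov-Smirnov"

def normality_can_be_assumed(p_value, alpha=0.05):
    """Normality is assumed when the p-value is more than alpha."""
    return p_value > alpha

# The BMI example has 36 subjects, so Shapiro-Wilk is selected.
print(choose_normality_test(36))
# p-values quoted later in the workshop for the two paired examples.
print(normality_can_be_assumed(0.235))
print(normality_can_be_assumed(0.007))
```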
Example 1: Body Mass Index (Dataset 1)
DESCRIPTIVE STATISTICS: Example 1 BMI
• Interpretation:
The mean BMI for the 36 subjects is 28.4 with a standard
deviation of 5.3 (usually written as 28.4 ± 5.3). The maximum and
minimum BMI values are 37.8 and 17.2, so the range is 20.6. The
median BMI is 28.0, which indicates that half of the respondents
have a BMI of 28.0 or more. The skewness value is -0.131, which is
within ±1. Hence, the data can be assumed to be symmetrical. The
p-value for the Shapiro-Wilk test is greater than 0.05, which
indicates that the data are consistent with a normal distribution,
so the normality assumption is met.
Example 2: To examine Triglyceride (TG)
• Interpretation
The mean TG is 2.78 ± 1.77. The maximum and minimum values
are 9.00 and 1.00, so the range is 8.00. The median value is 2.19,
indicating that half of the respondents have a TG of 2.19 or more.
The skewness value is 1.971, which is more than 1. Hence,
the data are not symmetrical (they are positively skewed). The p-value
for the Shapiro-Wilk test is less than 0.05, which shows that the data
are not normally distributed, so the normality assumption is not met.
In the boxplot, there is one extreme value (*) in case number 34, with a
value of approximately 9. There are two outliers (o), in cases 9 and 2.
To Obtain Descriptive Statistics for
Qualitative Data (Physical Activity, PA)
Additional Exercise (Dataset 1): Investigate and interpret the
descriptive statistics (Explore) of diastolic blood pressure
• 94, 102, 106, 97, 101, 98, 95, 105, 112, 99, 97, 104, 109, 100, 111, 93, 100,
93, 116, 83, 91, 84, 105, 111, 110, 90, 122, 86, 111, 93, 112, 88, 88, 95, 124,
95
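For readers who want to cross-check the SPSS Explore output, the basic descriptive statistics for these 36 readings can also be computed with Python's standard library. This is only a sketch: the variable names are hypothetical, and the interpretation (skewness, normality test, boxplot) is still meant to be done from the SPSS output.

```python
import statistics

# Diastolic blood pressure readings from the exercise above.
dbp = [94, 102, 106, 97, 101, 98, 95, 105, 112, 99, 97, 104, 109, 100, 111, 93, 100,
       93, 116, 83, 91, 84, 105, 111, 110, 90, 122, 86, 111, 93, 112, 88, 88, 95, 124,
       95]

summary = {
    "n": len(dbp),
    "mean": statistics.fmean(dbp),
    "sd": statistics.stdev(dbp),      # sample standard deviation
    "median": statistics.median(dbp),
    "min": min(dbp),
    "max": max(dbp),
    "range": max(dbp) - min(dbp),
}

for name, value in summary.items():
    # Floats are rounded to two decimals for display; counts are shown as-is.
    text = f"{value:.2f}" if isinstance(value, float) else str(value)
    print(f"{name}: {text}")
```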
ONE SAMPLE T-TEST

The objective is to test if the population mean is equal to a hypothesized
value.
• Example 1: To test if mean BMI in the population is 26 kg/m².
Assumptions
• The main assumptions are that the test variable is normally distributed
in the population and that the cases in the sample represent a random
sample from the population. In most circumstances, with a large
sample size of more than 30, the test yields reasonably valid results even
if the population is substantially non-normal.
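To make the mechanics of the test concrete, here is a minimal Python sketch of the one-sample t statistic, t = (sample mean - hypothesized value) / (sd / √n), with n - 1 degrees of freedom. The BMI readings below are hypothetical and are not Dataset 1; the function name is also an assumption. The p-value is not computed here, since it requires the t distribution and is reported by SPSS.

```python
import math
import statistics

def one_sample_t(values, hypothesized_mean):
    """Return the mean difference, standard error, t statistic and df."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    mean_difference = mean - hypothesized_mean
    return {
        "mean_difference": mean_difference,
        "se": se,
        "t": mean_difference / se,
        "df": n - 1,
    }

# Hypothetical BMI readings, for illustration only.
bmi = [24.1, 27.3, 30.2, 26.8, 29.5, 31.0, 25.4, 28.9]
result = one_sample_t(bmi, 26)
print(result)
```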
ONE SAMPLE T-TEST Example 1 (dataset 1): To test if
mean BMI in the population is 26 kg/m²
Findings
1. The mean difference is 2.4.
2. The p-value of the test is 0.010, which is less than 0.05.
3. The 95% CI for the mean difference is [0.6, 4.2], which does not include zero.
4. The 95% CI for the population mean = [(0.6 + 26), (4.2 + 26)] = [26.6, 30.2].
Conclusion: The p-value of the test is less than 0.05. Thus, the mean BMI in
the population is significantly different from 26 kg/m². We are 95% confident
that the mean BMI in the population is between 26.6 and 30.2 kg/m².
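The arithmetic in finding 4 is simply a shift: SPSS reports the CI for the mean difference, and adding the test value to both limits gives the CI for the population mean. A minimal Python sketch (the function names are hypothetical):

```python
def ci_for_population_mean(ci_mean_difference, test_value):
    """Shift the CI for the mean difference by the hypothesized test value."""
    lower, upper = ci_mean_difference
    return (lower + test_value, upper + test_value)

def includes_zero(ci):
    """A CI for the mean difference that includes zero means no significant difference."""
    lower, upper = ci
    return lower <= 0 <= upper

bmi_ci = ci_for_population_mean((0.6, 4.2), 26)
print(tuple(round(limit, 1) for limit in bmi_ci))  # (26.6, 30.2)
print(includes_zero((0.6, 4.2)))                   # False
```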
ONE SAMPLE T-TEST Example 2 (dataset 1): To test if
mean SBP in the population is 150 mmHg
Findings
1. The mean difference is -1.3.
2. The p-value of the test is 0.621, which is more than 0.05.
3. The 95% CI for the mean difference is [-6.8, 4.1], which includes zero.
4. The 95% CI for the population mean = [(-6.8 + 150), (4.1 + 150)] = [143.2, 154.1],
or about [143, 154].
Conclusion: The p-value of the test is more than 0.05. Thus, there is no evidence
that the mean SBP in the population differs from 150 mmHg. We are 95% confident
that the mean SBP in the population is between 143 and 154 mmHg.
Additional Exercise: Enter the following data set in SPSS and
perform the test.
• 94, 102, 106, 97, 101, 98, 95, 105, 112, 99, 97, 104, 109,
100, 111, 93, 100, 93, 116, 83, 91, 84, 105, 111, 110, 90,
122, 86, 111, 93, 112, 88, 88, 95, 124, 95
• Test if the mean diastolic blood pressure is equal to 100. Run the test and
interpret the result.
PAIRED SAMPLE TESTS

The Paired Samples T-Test can be used to test if there is a
difference in a measured characteristic between two time points.
Assumptions
The difference scores must be normally distributed. If this
assumption is not met, a nonparametric test must be used.
Assumption test (Data set 2): The difference scores must be
normally distributed.
The p-value of the normality test is 0.235, which is greater than 0.05. Hence, the
assumption that the difference scores are normally distributed is met, and the
parametric procedure (the Paired Samples T-Test) can be used.
Paired Sample Tests Data Set 2 (Example 1): To test if the drug is
effective for the management of hypertension.
Findings
1. The mean difference is 5.1 ± 2.7.
2. The p-value of the test is less than 0.001.
3. The 95% CI for the mean difference is [4.0, 6.2], which does not include zero.
Conclusion: The p-value of the test is less than 0.05. Thus, there is a significant change in
mean SBP. Since the mean after is less than the mean before, the drug is effective. We are 95%
confident that the reduction in SBP is between 4.0 mmHg and 6.2 mmHg.
Paired Sample Tests Data Set 2 (Example 2): To test if the drug is
effective for the management of hypertension (diastolic BP).
• Test of Normality:
The p-value of the test is 0.007, which is less than 0.05. Hence, the
assumption that the difference scores are normally distributed is not met,
and the parametric procedure is not valid. A nonparametric test based on
ranks must be used instead to test if there is a difference.

Nonparametric Wilcoxon Signed-Rank Test: To Obtain
a Nonparametric Paired-Sample Test
Out of the 24 subjects, 19 recorded lower DBP and 5
recorded higher DBP.
The p-value of the test is 0.004, which is less than 0.05.
Overall, there is a change in DBP.
Conclusion: There is a significant difference in DBP. Since most
subjects recorded lower DBP after treatment, the drug is effective.
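The counts of lower and higher readings and the rank sums behind the Wilcoxon Signed-Rank test can be sketched in a few lines of Python. The before and after readings below are hypothetical (Data set 2 is not reproduced in these slides), and the function name is an assumption. Zero differences are dropped and tied absolute differences share an average rank, which is the usual textbook convention; the p-value is left to SPSS.

```python
def signed_rank_summary(before, after):
    """Return (n_lower, n_higher, rank_sum_negative, rank_sum_positive)
    for the differences after - before, ignoring zero differences."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ordered = sorted(abs(d) for d in diffs)

    # Assign average ranks to tied absolute differences.
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n_lower = sum(1 for d in diffs if d < 0)
    n_higher = sum(1 for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    return n_lower, n_higher, w_minus, w_plus

# Hypothetical DBP readings for six subjects, for illustration only.
dbp_before = [90, 95, 88, 100, 92, 97]
dbp_after = [85, 96, 80, 94, 92, 90]
print(signed_rank_summary(dbp_before, dbp_after))  # (4, 1, 14.0, 1.0)
```

A large imbalance between the two rank sums, as in this made-up example, is what leads the test to report a small p-value.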
Paired samples test
Additional Exercise 1: The following are the times (in min) taken to
complete a task before and after a new training program.
• Before the training program: 12, 15, 16, 15, 13, 14,
15, 12, 18, 19
• After the training program: 11, 14, 12, 14, 10, 12,
13, 11, 16, 17
• Test if the training program is effective.
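As a way to cross-check the SPSS output for this exercise, the paired t statistic can be computed from the difference scores with Python's standard library. This is a sketch under stated assumptions: the variable names are hypothetical, the critical value 2.262 (two-sided 5% level, 9 degrees of freedom) is hard-coded from a standard t table, and the normality of the difference scores still needs to be checked in SPSS before relying on the parametric result.

```python
import math
import statistics

# Times (in min) from the exercise above.
before = [12, 15, 16, 15, 13, 14, 15, 12, 18, 19]
after = [11, 14, 12, 14, 10, 12, 13, 11, 16, 17]

# Difference scores: a positive value means the task became faster.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
mean_diff = statistics.fmean(diffs)
sd_diff = statistics.stdev(diffs)
se_diff = sd_diff / math.sqrt(n)
t_stat = mean_diff / se_diff

# Assumption: two-sided 5% critical value of t with n - 1 = 9 df, from a t table.
T_CRIT_DF9 = 2.262
ci_95 = (mean_diff - T_CRIT_DF9 * se_diff, mean_diff + T_CRIT_DF9 * se_diff)

print(f"mean difference = {mean_diff:.2f}, sd = {sd_diff:.2f}")
print(f"t = {t_stat:.2f} with {n - 1} df")
print(f"95% CI for mean difference = [{ci_95[0]:.2f}, {ci_95[1]:.2f}]")
```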
Independent Samples T-Test

The Independent Samples T-Test can be used to test if there is a
difference in a measured characteristic between two groups of
cases.
The objective is to test if there is a difference in means between the
2 populations.
Assumptions: The variances in the two groups must be similar, a
condition known as homogeneity of variance. In SPSS, Levene’s test is used
to test if this assumption is met.
Independent Samples T-Test Data set 1 (Example 1): To test if there is
a difference in mean BMI between males and females
Independent Samples T-Test
Findings
1. The p-value for Levene’s test for equality of variances is 0.883. Since the p-value is
more than 0.05, equality of variances is assumed.
2. The mean difference is 7.9.
3. The p-value of the test is less than 0.001.
4. The 95% CI for the mean difference is [5.6, 10.3], which does not include zero.
Conclusion: The p-value of the test is less than 0.05. Thus, there is a significant difference
in mean BMI between males and females. The mean BMI among the females is higher
than among the males. We are 95% confident that the difference is between 5.6 and 10.3 kg/m².
Independent Samples T-Test (Nonparametric Test Example)
Independent Samples T-Test (Task 1)
Age of Facebook users:
33, 31, 44, 35, 45, 42, 40, 50, 46, 42, 38, 33, 43, 49, 46,
50, 40, 41, 37, 52
Age of non-Facebook users:
46, 49, 39, 42, 43, 42, 43, 37, 50, 36, 40, 32, 40, 42, 36, 42
Test if there is a difference in mean age between the Facebook users
and non-users.
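For a cross-check of the SPSS output for this task, the equal-variances (pooled) t statistic can be sketched with Python's standard library. The function and variable names are hypothetical, and the sketch assumes that equality of variances holds; in SPSS that assumption is checked with Levene’s test, which is not reproduced here, and the p-value is also left to SPSS.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Independent-samples t statistic with a pooled variance estimate."""
    n1, n2 = len(group_a), len(group_b)
    mean_difference = statistics.fmean(group_a) - statistics.fmean(group_b)
    var1 = statistics.variance(group_a)  # sample variances (n - 1 divisor)
    var2 = statistics.variance(group_b)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return {
        "mean_difference": mean_difference,
        "t": mean_difference / se,
        "df": n1 + n2 - 2,
    }

# Ages from Task 1.
facebook = [33, 31, 44, 35, 45, 42, 40, 50, 46, 42, 38, 33, 43, 49, 46,
            50, 40, 41, 37, 52]
non_facebook = [46, 49, 39, 42, 43, 42, 43, 37, 50, 36, 40, 32, 40, 42, 36, 42]

task_result = pooled_t(facebook, non_facebook)
print(task_result)
```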
ONE-WAY ANOVA
The one-way ANOVA can be used to test if there is a difference in a
measured characteristic between more than two groups of cases.
The objective is to test if there is a difference in means between more
than 2 population groups.
Example:
Doctors    Attendants   Nurses
Group 1    Group 2      Group 3

Assumptions: The variances within the levels must be similar, a condition
known as homogeneity of variance; in SPSS, Levene’s test is used to test if
this assumption is met. Also, the distribution of the data must be at
least fairly normal.
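The F statistic behind the one-way ANOVA compares the variation between group means with the variation within groups. The following Python sketch shows the calculation on hypothetical BMI values for three job categories; the data and the function name are assumptions for illustration, and Levene’s test and the p-value are left to SPSS.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_values)

    # Between-groups sum of squares: group size times squared distance
    # of each group mean from the grand mean.
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2 for group in groups
    )
    # Within-groups sum of squares: squared distance of each value
    # from its own group mean.
    ss_within = sum(
        (x - statistics.fmean(group)) ** 2 for group in groups for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical BMI values, for illustration only.
doctors = [24.5, 26.1, 25.3, 27.0]
attendants = [25.8, 27.2, 26.4, 28.1]
nurses = [30.2, 31.5, 29.8, 32.0]

f_stat, df_between, df_within = one_way_anova_f([doctors, attendants, nurses])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```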
ONE-WAY ANOVA (Dataset 1) Example 1: To test if mean
BMI differs between three job categories.
Findings
1. The p-value for Levene’s test for equality of variances is 0.384, which is more
than 0.05. Thus, the equality of variances assumption is met.
2. The p-value of the ANOVA test is 0.028, which is less than 0.05.
Conclusion: At least one pair of means differs significantly.
ONE-WAY ANOVA Example 1: To test if mean BMI differs
between three job categories.
When there is a difference, the pair(s) that differ significantly must be
identified. This is done using post hoc tests.
In this example, since equality of variances can be assumed, the Tukey procedure is
chosen. When the variances are not similar, Tamhane’s T2, Dunnett’s T3,
Dunnett’s C or the Games-Howell procedure is used.
ONE-WAY ANOVA Example 1: To test if mean BMI differs
between three job categories.
• The p-values for the mean differences are given in the column ‘Sig.’
Conclusion: The mean BMI among Nurses is significantly higher
than that of Attendants and Doctors. However, there is no significant
difference in mean BMI between Doctors and Attendants.
ONE-WAY ANOVA (Dataset 1): To test if mean
DBP differs between three job categories
The p-value for Levene’s test for equality of variances is 0.035,
which is less than 0.05. Thus, the equality of variances assumption is
not met, and a nonparametric test should be used.
ONE-WAY ANOVA Example 2: Nonparametric test:
The p-value of this test is 0.020, which is less than 0.05. Thus,
DBP differs between at least one pair of categories. Based on the
mean ranks, DBP among Nurses is higher than among Attendants and
Doctors. In this example, further investigation can be made by
performing a post hoc test under the condition of equality of
variances not assumed (for example, the Dunnett T3 procedure).
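The slide does not name the nonparametric test; assuming it is the Kruskal-Wallis test (the usual rank-based counterpart of the one-way ANOVA, which reports mean ranks per group), the H statistic can be sketched as follows. The DBP values and the function name are hypothetical, and this simplified version applies no correction for ties, so it may differ slightly from SPSS when tied values are present.

```python
def kruskal_wallis_h(groups):
    """Return (H, mean_ranks) using pooled ranks, without tie correction."""
    pooled = sorted(x for group in groups for x in group)

    # Rank every distinct value; tied values share the average rank.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    n_total = len(pooled)
    rank_sums = [sum(rank_of[x] for x in group) for group in groups]
    mean_ranks = [rs / len(group) for rs, group in zip(rank_sums, groups)]

    h_stat = (
        12 / (n_total * (n_total + 1))
        * sum(rs ** 2 / len(group) for rs, group in zip(rank_sums, groups))
        - 3 * (n_total + 1)
    )
    return h_stat, mean_ranks

# Hypothetical DBP readings, for illustration only.
dbp_doctors = [82, 85, 88, 90]
dbp_attendants = [80, 84, 86, 89]
dbp_nurses = [92, 95, 97, 101]

h_stat, mean_ranks = kruskal_wallis_h([dbp_doctors, dbp_attendants, dbp_nurses])
print(f"H = {h_stat:.2f}, mean ranks = {mean_ranks}")
```

The group with the largest mean rank is the one whose values tend to be highest, which is how the mean ranks are read in the interpretation above.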

ONE-WAY ANOVA Example 2: Nonparametric test (Post hoc analysis):
Conclusion: There is a significant difference in DBP between
Attendants and Nurses.
ONE-WAY ANOVA (Additional Exercise 1)
Test if mean BP differs between doctors, nurses and teachers.
One-Way ANOVA: Flow Diagram
ONE-WAY ANOVA (Additional Exercise 2)
Test if the marks obtained differ between the 3 faculties?
Exercise: Choosing the Right Statistical Test
1. Test if the mean fasting glucose level is 5.6 mmol/l.
2. Test if there is a significant difference in the mean weight
change between dentists and students.
3. Test if there is a difference in pH levels between morning and
afternoon among 3rd year BDS students of the SCD.
4. Test if the distance travelled (in km) on 50 liters of petrol, over the
same course, differs between four different makes of “gasoline-saving” cars.
Thank you
Contact details: WhatsApp 03079314642

Email: muhammad.khan.scd@stmu.edu.pk /
dr.muhammad.khan@hotmail.com