SPSS EXERCISES

ZAHRA JANELLE G. HAMBRE

MASTER IN MANAGEMENT MAJOR IN MANAGEMENT ENGINEERING

DR. NENITA I. PRADO

PROFESSOR

MM 120B – STATISTICAL RESEARCH DESIGN

LICEO DE CAGAYAN UNIVERSITY

APRIL 11, 2021


EXERCISE 1

MEASURE FREQUENCIES AND PERCENTAGE

Exercise 1 is about measuring frequencies and percentages. The frequency of an observation is the number of times that observation occurs in the data. A percentage expresses that frequency as a share of the total number of observations, while a percentile is the value below which a given percentage of the observations fall (Daniel & Cross, 2018). Using SPSS software, we can easily calculate the frequencies, percentages, and percentiles of the given data.

There are various products that people use to smoke. Some of these are cigars, cigarettes, tobacco, and e-cigarettes (vapes). Health risks from these products range from asthma to chronic obstructive pulmonary disease and cancer. Cigarette smoking is a major cause of cardiovascular disease (CVD), and past reports of the Surgeon General have extensively reviewed the relevant evidence (U.S. Department of Health, Education, and Welfare [USDHEW] 1971, 1979; USDHHS 1983, 2001, 2004). Cigarette smoking has been responsible for approximately 140,000 premature deaths annually from CVD (USDHHS 2004).

The purpose of this study is to compare the total cholesterol levels of smokers and non-smokers. For this exercise, we determine the ages of the participants, smokers and non-smokers alike, and the percentage of the sample that each age represents.

Table 1.1 Statistics


                      Age
N            Valid    60
             Missing  0
Percentiles  25       25.50
             50       37.50
             75       51.00

Table 1.1 shows the percentiles of the respondents' ages in this study. N is 60 for Valid and 0 for Missing; therefore, all the input values for the variable age were valid. Three percentiles are reported: the 25th, 50th, and 75th. The 25th percentile is 25.50, the 50th percentile is 37.50, and the 75th percentile is 51.00. Hence, 25% of the respondents were younger than 25.50, 50% were younger than 37.50, and 75% were younger than 51.00.
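
The quartiles in Table 1.1 can be checked against the ages listed in Table 1.2. The following is a minimal Python sketch, not the SPSS procedure itself. The names TWICE, ONCE, and ages are made up for this illustration, and the sketch assumes that the default "exclusive" method of Python's statistics.quantiles matches the percentile definition that SPSS applied here, which appears to hold for these data.

```python
import statistics

# Ages from Table 1.2: each age in TWICE occurs twice, each age in ONCE occurs once.
TWICE = [17, 18, 19, 21, 22, 23, 27, 30, 32, 35, 37, 42, 44, 48, 49, 51, 55]
ONCE = [20, 24, 25, 28, 29, 31, 33, 34, 38, 40, 41, 43, 47, 50, 52, 53, 54,
        56, 57, 58, 59, 60, 61, 62, 66, 68]
ages = sorted(TWICE * 2 + ONCE)

# n=4 splits the data into quartiles; the three cut points are the
# 25th, 50th, and 75th percentiles.
q25, q50, q75 = statistics.quantiles(ages, n=4)
print(len(ages), q25, q50, q75)
```

Under that assumption, the three cut points agree with the 25.50, 37.50, and 51.00 reported in Table 1.1.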

Table 1.2 Age Frequency Table


Age Frequency Percent Valid Percent Cumulative Percent
Valid 17 2 3.3 3.3 3.3
18 2 3.3 3.3 6.7
19 2 3.3 3.3 10.0
20 1 1.7 1.7 11.7
21 2 3.3 3.3 15.0
22 2 3.3 3.3 18.3
23 2 3.3 3.3 21.7
24 1 1.7 1.7 23.3
25 1 1.7 1.7 25.0
27 2 3.3 3.3 28.3
28 1 1.7 1.7 30.0
29 1 1.7 1.7 31.7
30 2 3.3 3.3 35.0
31 1 1.7 1.7 36.7
32 2 3.3 3.3 40.0
33 1 1.7 1.7 41.7
34 1 1.7 1.7 43.3
35 2 3.3 3.3 46.7
37 2 3.3 3.3 50.0
38 1 1.7 1.7 51.7
40 1 1.7 1.7 53.3
41 1 1.7 1.7 55.0
42 2 3.3 3.3 58.3
43 1 1.7 1.7 60.0
44 2 3.3 3.3 63.3
47 1 1.7 1.7 65.0
48 2 3.3 3.3 68.3
49 2 3.3 3.3 71.7
50 1 1.7 1.7 73.3
51 2 3.3 3.3 76.7
52 1 1.7 1.7 78.3
53 1 1.7 1.7 80.0
54 1 1.7 1.7 81.7
55 2 3.3 3.3 85.0
56 1 1.7 1.7 86.7
57 1 1.7 1.7 88.3
58 1 1.7 1.7 90.0
59 1 1.7 1.7 91.7
60 1 1.7 1.7 93.3
61 1 1.7 1.7 95.0
62 1 1.7 1.7 96.7
66 1 1.7 1.7 98.3
68 1 1.7 1.7 100.0
Total 60 100.0 100.0

Table 1.2 shows the age frequency table of the respondents. Each age occurs either once or twice, so the highest frequency is 2. The Percent column converts the frequencies to percentages, which total 100%. Because there are no missing values, the Valid Percent column is the same as the Percent column. The Cumulative Percent column gives a running total of the percentages across the ages, reaching 100% at the highest age.
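
The Percent and Cumulative Percent columns of Table 1.2 can be rebuilt from the frequencies alone. The sketch below is an illustration in Python rather than SPSS output; TWICE, ONCE, ages, and rows are hypothetical names. It accumulates the counts before rounding, which is why the second row shows 6.7 rather than 3.3 + 3.3 = 6.6.

```python
from collections import Counter

# Ages from Table 1.2: each age in TWICE occurs twice, each age in ONCE occurs once.
TWICE = [17, 18, 19, 21, 22, 23, 27, 30, 32, 35, 37, 42, 44, 48, 49, 51, 55]
ONCE = [20, 24, 25, 28, 29, 31, 33, 34, 38, 40, 41, 43, 47, 50, 52, 53, 54,
        56, 57, 58, 59, 60, 61, 62, 66, 68]
ages = sorted(TWICE * 2 + ONCE)

counts = Counter(ages)
n = len(ages)

rows = []      # (age, frequency, percent, cumulative percent)
running = 0    # running count of respondents up to and including this age
for age in sorted(counts):
    running += counts[age]
    rows.append((age, counts[age],
                 round(100 * counts[age] / n, 1),
                 round(100 * running / n, 1)))

for row in rows[:3]:
    print(row)
```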

EXERCISE 2

MEASURES OF CENTRAL TENDENCY AND VARIABILITY

The purpose of this exercise is to determine the measures of central tendency and the measures of variability, also known as descriptive statistics. A measure of central tendency is a single value that describes a set of data by identifying the central position within that set. The mean, median, and mode are all valid measures of central tendency. Measures of variability, on the other hand, are used to determine how far the data points tend to fall from the center. The most common measures of variability are the range, standard deviation, and variance. Using SPSS software, we can easily obtain the descriptive statistics of the given data.

The purpose of this study is to compare the total cholesterol levels of smokers and non-smokers. For this exercise, we determine the descriptive statistics for the ages of the participants, smokers and non-smokers alike.

Table 2.1 Descriptive Statistics


                    N   Minimum  Maximum  Mean   Std. Deviation
Age                 60  17       68       38.82  14.571
Valid N (listwise)  60

Table 2.1 presents the descriptive statistics of the participants in the study. The total number of participants, N = 60, is equal to the value for Valid N (listwise); hence, all the input values for the variable age were valid. The minimum, maximum, and mean are 17, 68, and 38.82, respectively. Therefore, the youngest participant is 17 years old, the oldest is 68 years old, and the average age is 38.82 years. The standard deviation of 14.571 indicates that the participants' ages typically differ from the mean of 38.82 by about 14.571 years.
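
The values in Table 2.1 can be checked against the ages listed in Table 1.2. This is a small Python sketch with hypothetical variable names (TWICE, ONCE, ages), not the SPSS computation; it assumes that the Std. Deviation reported by SPSS is the sample standard deviation, which divides by n - 1.

```python
import statistics

# Ages from Table 1.2: each age in TWICE occurs twice, each age in ONCE occurs once.
TWICE = [17, 18, 19, 21, 22, 23, 27, 30, 32, 35, 37, 42, 44, 48, 49, 51, 55]
ONCE = [20, 24, 25, 28, 29, 31, 33, 34, 38, 40, 41, 43, 47, 50, 52, 53, 54,
        56, 57, 58, 59, 60, 61, 62, 66, 68]
ages = sorted(TWICE * 2 + ONCE)

n = len(ages)
mean = statistics.mean(ages)
sd = statistics.stdev(ages)   # sample standard deviation (divides by n - 1)

print(n, min(ages), max(ages), round(mean, 2), round(sd, 3))
```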

EXERCISE 3

PEARSON PRODUCT MOMENT CORRELATION

The purpose of this exercise is to determine the correlation between two variables. Correlation is defined as the degree of relationship between two or more variables. The Pearson product-moment correlation evaluates the strength and direction of the linear relationship between two variables. We used SPSS software to calculate the correlation between the variables.

The purpose of this study is to examine the correlation between smoker status and total cholesterol level. For this exercise, we determine the correlation between smoker status and cholesterol using a two-tailed test. Smoker status indicates whether a participant is a smoker or a non-smoker. Cholesterol is a waxy type of fat, or lipid, that moves throughout the body in the blood and can increase the risk of cardiovascular disease. Our null hypothesis, H0: ρ = 0, states that there is no correlation between the variables, and our alternative hypothesis, H1: ρ ≠ 0, states that there is a correlation between the variables. We reject the null hypothesis when the p-value is below 0.01.

Table 3.1 Correlation between smoker status and cholesterol


                                    Smoker status  Cholesterol
Smoker status  Pearson Correlation  1              -.519**
               Sig. (2-tailed)                     .000
               N                    60             60
Cholesterol    Pearson Correlation  -.519**        1
               Sig. (2-tailed)      .000
               N                    60             60
**. Correlation is significant at the 0.01 level (2-tailed).

Table 3.1 shows the correlation between the variables smoker status and cholesterol level, with N = 60. The Pearson correlation between smoker status and cholesterol is -.519, and its Sig. (2-tailed) value is .000. Note that the correlation is significant at the 0.01 level for a two-tailed test.

Based on the results, we conclude that the correlation between smoker status and cholesterol is significant because the p-value of .000 is less than the 0.01 level of significance (2-tailed); we therefore reject the null hypothesis.
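
The significance of a Pearson correlation can be illustrated by converting r into a t statistic with n - 2 degrees of freedom. The sketch below is a hedged illustration in Python, not the SPSS routine. The constant T_CRITICAL is an assumption: an approximate two-tailed critical value for df = 58 at the 0.01 level taken from a standard t table, not from the SPSS output.

```python
import math

r = -0.519   # Pearson correlation from Table 3.1
n = 60       # number of participants

df = n - 2
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Assumption: approximate two-tailed critical value of t for df = 58 at the
# 0.01 level, read from a standard t table (not part of the SPSS output).
T_CRITICAL = 2.663

reject_h0 = abs(t) > T_CRITICAL
print(round(t, 3), df, reject_h0)
```

The t statistic is about -4.62, well beyond the critical value, which agrees with the decision to reject the null hypothesis. Its magnitude is also close to the t-value in Table 4.2, as expected when one of the two correlated variables has only two categories.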


EXERCISE 4

INDEPENDENT SAMPLE T-TEST

The purpose of this exercise is to determine whether there is a significant difference between the means of two groups. The independent samples t-test compares the means of two independent groups to determine whether there is statistical evidence that the associated population means are significantly different. It compares two independent groups of observations or measurements on a single characteristic.

The purpose of this study is to examine whether the total cholesterol levels of smokers and non-smokers differ significantly. Smoker status indicates whether a participant is a smoker or a non-smoker. Cholesterol is a waxy type of fat, or lipid, that moves throughout the body in the blood and can increase the risk of cardiovascular disease. Our null hypothesis, H0: μ1 - μ2 = 0, states that there is no significant difference between the mean cholesterol levels of the two groups, and our alternative hypothesis, H1: μ1 - μ2 ≠ 0, states that there is a significant difference between them. We reject the null hypothesis when the p-value is below 0.05.

Table 4.1 Group Statistics on smoker status and cholesterol


             Smoker status  N   Mean    Std. Deviation  Std. Error Mean
Cholesterol  Smoker         29  6.5966  1.65902         .30807
             Non-smoker     31  4.7213  1.48095         .26599

Table 4.1 presents the group statistics for cholesterol level by the participants' smoker status. Of the 60 participants, 29 are smokers and 31 are non-smokers. For the smokers, the mean, standard deviation, and standard error of the mean are 6.5966, 1.65902, and .30807, respectively. For the non-smokers, they are 4.7213, 1.48095, and .26599, respectively. Based on these results, the mean cholesterol level of smokers is higher than that of non-smokers.

Table 4.2 Independent Samples Test


                                          Levene's Test
                                          for Equality
                                          of Variances  t-test for Equality of Means
                                          F      Sig.   t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper
Cholesterol  Equal variances assumed      .173   .679   4.625  58      .000             1.87526          .40545                 1.06366       2.68686
             Equal variances not assumed                4.607  56.171  .000             1.87526          .40701                 1.05998       2.69055

Table 4.2 shows the results of the independent samples t-test obtained from SPSS. Levene's Test for Equality of Variances gives F = .173 with a significance value of p = .679. Because this p-value is higher than 0.05, Levene's test is not significant, and the variances of the two groups are not significantly different. Therefore, we interpret the row in which equal variances are assumed.

The t-test for Equality of Means shows that the t-value is 4.625, the degrees of freedom are 58, and the significance value is .000, which is less than 0.05. The 95% confidence interval for the difference between the sample means (mean 1 minus mean 2) has a lower bound of 1.06366 and an upper bound of 2.68686. Since the interval does not contain zero, it is inconsistent with the null hypothesis value of 0 difference. The row in which equal variances are not assumed (t = 4.607, df = 56.171) leads to the same conclusion.

Therefore, the mean cholesterol levels of the two groups, smokers and non-smokers, are significantly different, and we reject the null hypothesis. On average, the cholesterol level of smokers is higher than that of non-smokers.
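
Both rows of Table 4.2 can be recomputed from the group statistics in Table 4.1. The following Python sketch is an illustration under the usual textbook formulas, not the SPSS implementation; all variable names are made up for this example. It computes the pooled-variance t-test (equal variances assumed) and the Welch t-test (equal variances not assumed).

```python
import math

# Group statistics from Table 4.1
n1, mean1, sd1 = 29, 6.5966, 1.65902   # smokers
n2, mean2, sd2 = 31, 4.7213, 1.48095   # non-smokers

diff = mean1 - mean2

# Equal variances assumed: pool the two sample variances.
df_pooled = n1 + n2 - 2
var_pooled = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df_pooled
se_pooled = math.sqrt(var_pooled * (1 / n1 + 1 / n2))
t_pooled = diff / se_pooled

# Equal variances not assumed (Welch): separate variances and
# Welch-Satterthwaite degrees of freedom.
v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
se_welch = math.sqrt(v1 + v2)
t_welch = diff / se_welch
df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(round(t_pooled, 3), df_pooled, round(se_pooled, 5))
print(round(t_welch, 3), round(df_welch, 3), round(se_welch, 5))
```

The two rows differ only slightly because the group standard deviations and sample sizes are similar.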

EXERCISE 5

PAIRED SAMPLE T-TEST

The purpose of this exercise is to determine whether there is a significant difference between two paired measurements. The paired samples t-test compares the means of two measurements taken from the same individual, object, or related units. It is used when we are interested in the difference between two variables for the same subject. Using SPSS software, we can easily run the paired samples t-test on the given data.

The purpose of this study is to examine whether there is a significant difference between the smoker status of the participants and their total cholesterol levels. Smoker status indicates whether a participant is a smoker or a non-smoker. Cholesterol is a waxy type of fat, or lipid, that moves throughout the body in the blood and can increase the risk of cardiovascular disease. Our null hypothesis, H0: μd = 0, states that there is no significant difference between the two variables, that is, the mean paired difference is zero. Our alternative hypothesis, H1: μd ≠ 0, states that there is a significant difference between them. We reject the null hypothesis when the p-value is below 0.05.

Table 5.1 Paired Samples Statistics


                       Mean    N   Std. Deviation  Std. Error Mean
Pair 1  Smoker status  1.52    60  .504            .065
        Cholesterol    5.6277  60  1.82056         .23503

Table 5.1 presents the paired samples statistics for the participants' smoker status and cholesterol level. For smoker status, the mean is 1.52, the standard deviation is .504, and the standard error of the mean is .065. For cholesterol, the mean is 5.6277, the standard deviation is 1.82056, and the standard error of the mean is .23503. We observe that the mean of cholesterol is higher than the mean of smoker status.

Table 5.2 Paired Samples Correlations


                                     N   Correlation  Sig.
Pair 1  Smoker status & Cholesterol  60  -.519        .000

Table 5.2 presents the paired samples correlation between smoker status and cholesterol. The p-value for the correlation is .000, which is less than 0.05. This means that smoker status and cholesterol are correlated with each other. However, this table does not report the result of the paired samples t-test itself, which is given in Table 5.3.

Table 5.3 Paired Samples Test


                              Paired Differences
                              Mean      Std. Deviation  Std. Error Mean  95% CI Lower  95% CI Upper  t        df  Sig. (2-tailed)
Pair 1  Smoker - Cholesterol  -4.11100  2.12623         .27450           -4.66026      -3.56174      -14.977  59  .000

Table 5.3 shows the results of the paired samples test using SPSS. The mean of the paired differences, -4.11100, is the mean of smoker status minus the mean of cholesterol from Table 5.1. The standard deviation of the paired differences is 2.12623, and the standard error of the mean of the paired differences is .27450. The t-value is -14.977 with 59 degrees of freedom, and the significance value, or p-value, is .000, which is less than .05. We also observe that the 95% confidence interval of the difference has a lower bound of -4.66026 and an upper bound of -3.56174, so it does not contain 0.

Based on the results, we conclude that there is a significant difference between smoker status and cholesterol level, and we reject our null hypothesis. The cholesterol level of smokers is different from that of non-smokers.
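
The t-value in Table 5.3 follows directly from the mean and standard deviation of the paired differences. The Python sketch below illustrates the arithmetic with made-up variable names; it is not the SPSS routine. As a cross-check, it also derives the standard deviation of the differences from the two standard deviations in Table 5.1 and the correlation in Table 5.2, which only agrees approximately because those inputs are rounded.

```python
import math

# Values from Tables 5.1, 5.2, and 5.3
n = 60
mean_diff = -4.11100            # mean of (smoker status - cholesterol)
sd_diff = 2.12623               # standard deviation of the paired differences
sd_status, sd_chol = 0.504, 1.82056
r = -0.519                      # correlation from Table 5.2

se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t = mean_diff / se_diff
df = n - 1

# Cross-check: the SD of the differences follows from the two SDs and r.
sd_check = math.sqrt(sd_status ** 2 + sd_chol ** 2
                     - 2 * r * sd_status * sd_chol)

print(round(se_diff, 5), round(t, 3), df, round(sd_check, 3))
```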


EXERCISE 6

CHI-SQUARE TEST

The purpose of this exercise is to determine whether there is a significant association between two categorical variables. The chi-square test compares the observed data with the data that would be expected under a model, here the model that the two variables are independent. The data used in calculating a chi-square statistic must be random, raw, mutually exclusive, drawn from independent variables, and drawn from a large enough sample. It is used when we are interested in whether two categorical variables measured on the same subjects are related. Using SPSS software, we can easily run the chi-square test on the given data.

The purpose of this study is to examine whether there is a significant difference between the smoker status of the participants and their gender, that is, whether the two variables are associated. Smoker status indicates whether a participant is a smoker or a non-smoker, and gender indicates whether the respondent is male or female. Our null hypothesis, H0, states that there is no significant difference between the variables (gender and smoker status are independent), and our alternative hypothesis, H1, states that there is a significant difference between them (gender and smoker status are associated). We reject the null hypothesis when the p-value is below 0.05.

Table 6.1 Case Processing Summary


                        Cases
                        Valid        Missing      Total
                        N   Percent  N   Percent  N   Percent
Gender * Smoker status  60  100.0%   0   0.0%     60  100.0%

Table 6.1 presents the case processing summary for the participants' gender and smoker status from SPSS. The table shows that there are no missing values in the data and that all 60 cases are valid.

Table 6.2 Gender * Smoker Crosstabulation


Count
                Smoker
                Smoker  Non-smoker  Total
Gender  Male    15      16          31
        Female  14      15          29
Total           29      31          60

Table 6.2 presents the crosstabulation of the participants' gender (male and female) and smoker status from SPSS. The total sample for this study is 60 respondents. Among the males, 15 are smokers and 16 are non-smokers, which gives a total of 31 male respondents. Among the females, 14 are smokers and 15 are non-smokers, which gives a total of 29 female respondents.

Table 6.3 Chi-Square Tests


                              Value  df  Asymptotic Significance (2-sided)  Exact Sig. (2-sided)  Exact Sig. (1-sided)
Pearson Chi-Square            .000a  1   .993
Continuity Correctionb        .000   1   1.000
Likelihood Ratio              .000   1   .993
Fisher's Exact Test                                                         1.000                 .599
Linear-by-Linear Association  .000   1   .993
N of Valid Cases              60
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 14.02.
b. Computed only for a 2x2 table

Table 6.3 shows the results of the chi-square test for the data in Table 6.2 using SPSS. There are 0 cells with an expected count of less than 5, which means that the expected-count assumption of the test is not violated. Furthermore, the p-value for the Pearson Chi-Square is .993, which is greater than .05.

Table 6.4 Symmetric Measures


                                            Value  Asymptotic Standard Errora  Approximate Tb  Approximate Significance
Nominal by Nominal    Phi                   .001                                               .993
                      Cramer's V            .001                                               .993
Interval by Interval  Pearson's R           .001   .129                        .008            .993c
Ordinal by Ordinal    Spearman Correlation  .001   .129                        .008            .993c
N of Valid Cases                            60
a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.
c. Based on normal approximation.

Table 6.4 shows the symmetric measures from the chi-square test. The approximate significance, or p-value, for Phi, Cramer's V, Pearson's R, and the Spearman correlation is .993, which is greater than .05.

Based on these results, there is no significant difference, and gender is independent of smoker status. We therefore fail to reject our null hypothesis.
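
The Pearson chi-square in Table 6.3 and the Phi value in Table 6.4 can be recomputed from the counts in Table 6.2. The Python sketch below is an illustration with made-up variable names, not the SPSS procedure. It relies on the fact that a 2x2 table has 1 degree of freedom, for which the upper-tail chi-square probability equals erfc(sqrt(chi2 / 2)).

```python
import math

# Observed counts from Table 6.2: rows are Male, Female;
# columns are Smoker, Non-smoker.
observed = [[15, 16],
            [14, 15]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# For a 2x2 table df = 1, so the p-value has a closed form.
p_value = math.erfc(math.sqrt(chi2 / 2))
phi = math.sqrt(chi2 / n)
min_expected = min(min(row) for row in expected)

print(round(min_expected, 2), round(chi2, 3), round(p_value, 3), round(phi, 3))
```

The observed counts are almost identical to the expected counts, so the chi-square statistic is close to zero and the p-value is close to 1.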
