Statistics Summary
Descriptive Statistics
Methods involving the collection, presentation, and characterization of a set of data in
order to properly describe the various features of that set of data.
Concerned strictly with the data on hand; can only measure what already exists.
Inferential Statistics
Defined as those methods that make possible the estimation of a characteristic or
property of a population, or the making of a decision about a population, based only on
sample results (to infer).
Demands a higher order of critical judgement; any treatment of the data may lead either
to predictions or to inferences concerning the true parameters of the larger population
from which the sample was drawn.
Types of Data
Categorical Data
Classes
Gender
Civil Status
Political Party
Tribal Affiliation
Ranked Data
Ranked Order/Position but must not include numerical values at all
Numerical Data
Discrete – counting process
Numbers
Continuous – measurement process
Height
Weight
Length
Temperature
Levels of Measurement
Nominal scale – categorical data
Ordinal scale – ranked data
Interval Scale – measurement data
Ratio scale
TEST OF DIFFERENCE
1. T-test for Independent Samples
When to Use?
used for testing the difference between the means of two independent groups.
particularly useful when the research question requires the comparison of variables obtained from
two independent samples.
Checklist
Only one independent (grouping) variable (IV) (e.g., subject’s gender)
Only two levels for that IV (e.g., male and female)
Only one dependent variable (DV)
Sample Research:
Research Background
A researcher wants to investigate whether first-year male and female students at a university
differ in their memory abilities. Ten students of each group were randomly selected from the first-
year enrolment roll to serve as subjects.
All 20 subjects were asked to read 30 unrelated words and then asked to recall as many of the
words as possible. The numbers of words correctly recalled by each subject were recorded.
Statement of the Problem
1. What is the level of memory abilities of male respondents?
2. What is the level of memory abilities of female respondents?
3. Is there a significant difference in the memory abilities of male and female respondents?
Null Hypothesis
There is no significant difference in the memory abilities of male and female respondents.
Conceptual Framework
Independent Variable    Dependent Variable
Gender                  Memory Ability
  Male
  Female
Data
Male    Female
16      24
14      23
18      26
25      17
17      18
14      20
19      23
21      26
16      24
17      20
How to Run the Data in SPSS
1 Analyze
2 Compare Means
3 Independent Samples T-test
4 IV to Grouping Variable
5 DV to Test Variable
6 Define Groups (Group 1 = 1, Group 2 = 2)
Note:
Add Values in Variable View
Results
Interpretation
#1 The result from the analysis indicates that there is a significant difference between the male and
female samples in the number of words correctly recalled, t(18) = –3.02, p = .007. The mean
values indicate that females correctly recalled significantly more words (M = 22.10) than males (M =
17.70).
#2 The significance (probability) value of about .007 is lower than the .05
level of significance set for this study. This implies that the memory ability of the female
students is significantly higher than that of the male students. Thus, the null hypothesis is rejected.
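The t value reported above can be checked by hand. The following is a minimal sketch, assuming the pooled-variance (equal variances assumed) form of the test; the function name `independent_t` is illustrative and is not part of SPSS. It uses only the Python standard library, so it computes the t statistic and degrees of freedom but not the p value, which still comes from the SPSS output.

```python
from math import sqrt
from statistics import mean, variance

# Words correctly recalled, copied from the Data table of the sample research.
male = [16, 14, 18, 25, 17, 14, 19, 21, 16, 17]
female = [24, 23, 26, 17, 18, 20, 23, 26, 24, 20]


def independent_t(a, b):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled estimate of the common variance (assumes equal variances).
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_err, df


t_stat, df = independent_t(male, female)
print(f"M(male) = {mean(male):.2f}, M(female) = {mean(female):.2f}")
print(f"t({df}) = {t_stat:.2f}")
```

The negative sign only reflects the order of subtraction (male minus female); swapping the two groups flips the sign without changing the conclusion.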
Exercise 1:
Research Background
A study was conducted to determine whether urban and rural teachers differ in their attitudes towards using
the K-12 Learning Package. The data gathered are presented below:
Data
Urban Teacher Rural Teacher
3.98 4.03
3.12 3.03
2.98 2.78
2.77 2.87
2.12 1.99
4.10 4.23
3.15 3.10
2.65 2.41
2.43 2.59
3.01 2.99
Results
Interpretation
The aim of the study is to determine whether there is a significant difference between the attitudes of urban
and rural teachers towards the use of the K-12 Learning Package in teaching. To answer the research
problem, the T-test for Independent Samples was employed. Results reveal that urban teachers show a
slightly higher mean attitude score (M = 3.03) towards the use of the K-12 Learning Package than
teachers in rural areas (M = 3.00).
However, the significance value of about .922 is higher than the .05 level of significance set for
this study, which implies that the difference between the mean scores of the two groups is not
statistically significant. This means that while urban teachers show a slightly higher level of attitude
towards the use of the K-12 Learning Package than those in rural areas, the difference is not
significant. Thus, the null hypothesis is not rejected.
2. Paired Sample T-test
When to Use?
used in repeated measures or correlated groups design, in which each subject is tested twice on
the same variable.
Involves the before and after design. (Pre-Test & Post-Test)
Checklist
There must be only two sets of data.
Must be obtained from either:
(1) the same subjects/respondents, or
(2) two matched groups of subjects/respondents.
Sample Research:
Research Background
A researcher designed an experiment to test the effect of drug X on eating behavior. The amount
of food eaten by a group of rats in a one-week period prior to ingesting drug X was recorded.
The rats were then given drug X, and the amount of food eaten in a one-week period was again
recorded. The following amounts of food in grams were eaten during the “before” and “after”
conditions.
Statement of the Problem
1. What is the amount of food consumed by rats before drug X?
2. What is the amount of food consumed by rats after drug X?
3. Is there a significant difference in the level of food consumption before and after drug X?
Null Hypothesis
There is no significant difference in the level of food consumption before and after drug X.
Conceptual Framework
Variable X Variable Y
Data
Before Drug X    After Drug X
100              60
80               80
16               110
220              140
140              100
250              200
170              100
220              180
120              140
210              130
How to Run the Data in SPSS
1 Analyze
2 Compare Means
3 Paired Samples T-test
4 Transfer ‘Before’ and ‘After’ to Paired Variables
Note:
Add Values in Variable View
Results
Interpretation
#1 The result from the analysis indicates that there is no significant difference in the amount of food
eaten before and after drug X was ingested, p = .127. The mean values indicate that less food was
consumed after ingestion of drug X (M = 124.00) than before (M = 152.60), but the difference is
not significant.
#2 The study was conducted to determine whether there is a significant difference in the amount of food
eaten before and after drug X was ingested. To analyze the data, the paired sample t-test was employed.
The result from the analysis indicates that there is no significant difference in the amount of food
eaten before and after drug X was ingested, as manifested by the sig. value of .127, which is greater
than the .05 level of significance set for this study. The mean values indicate that less food was
consumed after ingestion of drug X (M = 124.00) than before (M = 152.60), but the difference is
not significant. Thus, the null hypothesis is not rejected.
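The paired test works on the per-subject differences rather than on the two columns separately. The sketch below illustrates that idea with the rat data from the sample research; `paired_t` is a hypothetical helper name, and only the t statistic and degrees of freedom are computed (the p value of .127 is taken from the SPSS output, not from this code).

```python
from math import sqrt
from statistics import mean, stdev

# Grams of food eaten, copied from the Data table of the sample research.
before = [100, 80, 16, 220, 140, 250, 170, 220, 120, 210]
after = [60, 80, 110, 140, 100, 200, 100, 180, 140, 130]


def paired_t(x, y):
    """Return (t, df) for a paired-samples t-test on matched lists x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Standard error of the mean difference.
    std_err = stdev(diffs) / sqrt(n)
    return mean(diffs) / std_err, n - 1


t_stat, df = paired_t(before, after)
print(f"M(before) = {mean(before):.2f}, M(after) = {mean(after):.2f}")
print(f"t({df}) = {t_stat:.2f}")
```

Because each rat serves as its own control, the degrees of freedom are the number of pairs minus one, not the total number of observations minus two.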
Exercise 2:
Research Background
A researcher wants to know whether the Grade 10 Math students have really learned something from
the lessons discussed. A pre-test was conducted before the class discussion, and the same
test (post-test) was conducted afterwards. The test results are given below:
Data
Pre-Test in Math 10 Post-Test in Math 10
78 79
67 90
89 79
90 89
65 67
57 89
89 67
68 89
77 99
88 87
Statement of the Problem
1. What is the mean score of the Pre-Test of Math 10 Students?
2. What is the mean score of the Post-Test of Math 10 Students?
3. Is there a significant difference between the pre-test and post-test scores of Math 10 students?
Null Hypothesis
There is no significant difference between the pre-test and post-test scores of Math 10 students.
Conceptual Framework
Variable X Variable Y
Results
Interpretation
The result from the analysis indicates that there is no significant difference between the pre-test and
post-test scores of Math 10 students, since the p value (.246) is greater than .05. The mean values indicate that the
post-test scores (M = 83.5) are higher than the pre-test scores (M = 76.8), but the difference is
not significant. Thus, the null hypothesis is not rejected.
3. One-way Analysis of Variance
When to Use?
an extension of the independent t-test.
used when the researcher is interested in whether the means from several (> 2) independent
groups differ.
Checklist
Only one independent variable (e.g., ethnicity).
More than two levels for that independent variable (e.g., Australian, American, Chinese, and
African).
Only one dependent variable.
Sample Research:
Research Background
A researcher is interested in finding out whether the intensity of electric shock will affect the time
required to solve a set of difficult problems. Eighteen subjects are randomly assigned to the three
experimental conditions of low shock, medium shock, and high shock. The total time (in minutes)
required to solve all the problems is the measure recorded for each subject.
Statement of the Problem
1. What is the average time consumed in answering difficult problems by those subjected to low
intensity shock?
2. What is the average time consumed in answering difficult problems by those subjected to
medium intensity shock?
3. What is the average time consumed in answering difficult problems by those subjected to high
intensity shock?
4. Is there a significant difference in the time consumed in answering difficult problems when
subjects are grouped according to shock intensity?
Null Hypothesis
There is no significant difference in the time consumed in answering difficult problems when subjects are grouped
according to shock intensity.
Conceptual Framework
Independent Variable    Dependent Variable
Electric Shock          Time to answer
  Low                   difficult problems
  Medium
  High
Results
Interpretation
#1 The results from the analysis indicate that the intensity of the electric shock has a significant
effect on the time taken to solve the problems, F(2,15) = 40.13, p < .001. The mean values for the
three shock levels indicate that as the shock level increased (from low to medium to high), so did the
time taken to solve the problems (low: M = 17.17; medium: M = 22.17; high: M = 42.17).
In the Multiple Comparisons table, in the column labeled Mean Difference (I – J), the mean
difference values accompanied by asterisks indicate which shock levels differ significantly from each
other at the 0.05 level of significance. The results indicate that the high shock level is significantly
different from both the low shock and medium shock levels. The low shock level and the medium
shock level do not differ significantly. These results show that the overall difference in the time taken
to solve complex problems between the three shock-intensity levels is because of the significantly
greater amount of time taken by the subjects in the high shock condition.
#2 The aim of this study is to determine whether there is a significant difference in the average time
consumed by the subjects in solving problems when subjected to low-intensity (LI), medium-intensity (MI),
and high-intensity (HI) shock. To answer the research problem, One-Way Analysis of Variance was employed.
Results reveal that the average time consumed by the subjects differs significantly when they are
grouped according to LI, MI, and HI. This is manifested by a sig. value of .000 (p < .001), which is
less than the .05 level of significance set for this study.
To test for individual differences in the average time consumed in solving problems when grouped
according to shock intensity, a post hoc comparison using the Scheffe technique was employed.
In the Multiple Comparisons table, in the column labeled Mean Difference (I – J), the mean
difference values accompanied by asterisks indicate which shock levels differ significantly from
each other at the 0.05 level of significance.
The results indicate that HI is significantly different from both LI and MI. However, no
significant difference is found between those subjected to LI and MI. These results show that the
overall difference in the average time consumed is due to the significantly longer time consumed
by those subjected to HI. Thus, the null hypothesis is rejected.
Exercise 1:
Research Background
The researcher would like to determine whether there exists a significant difference in the
mathematical ability of Grade 10 students when compared according to countries in Asia. He
gathered the following data:
Data
Singapore Malaysia Vietnam Philippines
85 81 79 89
87 83 78 90
89 85 90 88
84 87 82 85
80 88 83 87
79 80 81 79
83 77 76 85
82 75 81 84
86 81 79 85
84 82 82 87
88 83 80 92
82 82 82 82
Null Hypothesis
There is no significant difference in the level of Mathematical Ability of Grade 10 Students when
compared according to various countries in Asia namely Singapore, Malaysia, Vietnam, and
Philippines.
Conceptual Framework
Independent Variable          Dependent Variable
Grade 10 Students in          Level of Mathematical
various countries in Asia     Ability of Grade 10
  Singapore                   Students in various
  Malaysia                    countries in Asia
  Vietnam
  Philippines
Results
Interpretation
The result from the analysis indicates that the mathematical ability of Grade 10 students differs
significantly across the countries compared, F(3, 44) = 5.000, p = .005, which is below the .05 level
of significance. The group means range from about 81 to 86, with a grand mean of M = 83.31. To
identify which countries differ from each other, the Multiple Comparisons table was examined.
In the Multiple Comparisons table, in the column labeled Mean Difference (I – J), the mean
difference values accompanied by asterisks indicate which countries differ significantly from each
other at the 0.05 level of significance.
The results indicate that the levels of mathematical ability of Grade 10 students in Vietnam and the
Philippines differ significantly at the 0.05 level. The overall significant difference in the level of
mathematical ability among the Asian countries is therefore attributable to the difference found
between Vietnam and the Philippines. Thus, the null hypothesis is
rejected.
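The F ratio in the interpretation above compares the variability between country means with the variability within each country. The following minimal sketch recomputes it from the exercise data; `one_way_anova` is an illustrative name rather than an SPSS function, and the post hoc comparisons and the p value are not reproduced here.

```python
from statistics import mean

# Scores copied column by column from the Data table of the exercise.
scores = {
    "Singapore": [85, 87, 89, 84, 80, 79, 83, 82, 86, 84, 88, 82],
    "Malaysia": [81, 83, 85, 87, 88, 80, 77, 75, 81, 82, 83, 82],
    "Vietnam": [79, 78, 90, 82, 83, 81, 76, 81, 79, 82, 80, 82],
    "Philippines": [89, 90, 88, 85, 87, 79, 85, 84, 85, 87, 92, 82],
}


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova(list(scores.values()))
for country, values in scores.items():
    print(f"{country}: M = {mean(values):.2f}")
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A significant F only says that at least one mean differs; identifying which pairs differ still requires the post hoc table.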
TEST OF RELATIONSHIP
1. Pearson Product Moment Correlation (Simple Correlation)
When to Use?
finding out whether a relationship exists and determining its magnitude and direction
attempts to find the extent to which two or more variables are related
No variables are manipulated as in an experiment
measures naturally occurring events, behaviors, or personality characteristics and then determines
if the measured scores covary
obtaining a pair of observations or measures on two different variables from a number of
individuals.
Two Types of Correlation Coefficient
1 Pearson product moment correlation coefficient ( r ), employed with interval or ratio scaled
variables
2 Spearman rank order correlation coefficient ( rho ), employed with ordered or ranked data.
Sample Research:
Research Background
Assume that a researcher wishes to determine whether there is a relationship between grade point
average (GPA) and the scores on a reading-comprehension test of 12 first-year students. The
researcher recorded the pair of scores given in the following, together with their rankings:
Data
Student # GPA Reading Comprehension Score
1 91 43
2 87 37
3 76 26
4 82 29
5 85 30
6 90 41
7 77 25
8 88 38
9 82 30
10 80 28
11 93 46
12 83 32
Null Hypothesis
There is no significant relationship between the respondents’ GPA and reading comprehension.
Conceptual Framework
Variable X    Variable Y
GPA           Reading Comprehension
Note:
No need to add Values in Variable View
Results
Interpretation
The purpose of the study is to determine whether there is a significant relationship between GPA and
reading comprehension. To answer the problem, the Pearson Product Moment Correlation test was
employed. Results indicate that the sig. value of .000 (p < .001) is lower than the .05 level of
significance set for the study. This indicates that there is a significant relationship between GPA and
reading comprehension, and that the relationship is a very high positive
correlation, as manifested by the r value of .967. The results further imply that as GPA increases, so
does the reading comprehension. Thus, the null hypothesis is rejected.
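The r value quoted above can be recomputed directly from the definition of the Pearson coefficient: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. This is a minimal sketch with an illustrative function name, `pearson_r`; it does not compute the significance value.

```python
from math import sqrt
from statistics import mean

# Paired scores copied from the Data table of the sample research.
gpa = [91, 87, 76, 82, 85, 90, 77, 88, 82, 80, 93, 83]
reading = [43, 37, 26, 29, 30, 41, 25, 38, 30, 28, 46, 32]


def pearson_r(x, y):
    """Return the Pearson product moment correlation of two paired lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(gpa, reading)
print(f"r = {r:.3f}")
```

The sign of r gives the direction of the relationship and its absolute value gives the magnitude, which is how the "very high positive" description is read off.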
Exercise:
Research Background
The researcher is interested in knowing whether there is a correlation/relationship between weight and
self-confidence. She asked 12 people to join the study and gathered the following data:
Data
Student # Weight Self-Confidence
1 120 2.10
2 115 2.35
3 135 1.87
4 95 2.76
5 87 2.98
6 79 3.35
7 109 2.25
8 101 2.68
9 168 4.05
10 127 1.99
11 92 3.08
12 90 2.85
Null Hypothesis
There is no significant relationship between a person’s weight and self-confidence.
Conceptual Framework
Variable X Variable Y
Weight Self-Confidence
Note:
No need to add Values in Variable View
Results
Interpretation
The purpose of the study is to determine whether there is a significant relationship between weight and
self-confidence. To answer the problem, the Pearson Product Moment Correlation test was employed.
Results indicate that the sig. value of .000 (p < .001) is lower than the .05 level of significance set
for the study. This indicates that there is a significant relationship between weight and self-confidence,
and that the relationship is a very high negative correlation, as
manifested by the r value of -.965. The results further imply that as weight increases,
self-confidence decreases. Conversely, as weight decreases, self-confidence increases. Thus, the
null hypothesis is rejected.
2. Simple Linear Regression Analysis
Null Hypothesis
Self-esteem cannot significantly predict job performance.
Conceptual Framework
Independent Variable Dependent Variable
Note:
No need to add Values in Variable View
How to Run the Data in SPSS
1 Analyze
2 Regression
3 Linear
4 Transfer the DV to the Dependent box
5 Transfer the IV to the Independent(s) box
6 Statistics
7 Model Fit & Descriptives
Population Regression Model:
JP = β0 + β1X1 + ε
Where:
JP = Endogenous Variable (DV)
β0 = Constant / Intercept
β1 = Regression Coefficient
X1 = Exogenous Variable (IV)
ε = Residual or Error Term
Results
Adjusted R square = .860
This implies that the predictor variable (Self-Esteem) explains 86.0% of the variance
in the dependent variable (Job Performance).
Or
86.0% of the variation in Job Performance can be explained by the variation in
Self-Esteem.
Interpretation
When the regression equation JP = β0 + β1SE + ε was tested using simple linear regression analysis,
results from the ANOVA table show that the sig. value is .005, which is below the .05
level of significance set for this study. This implies that, overall, the model is
significant and fits the data. Based on the regression coefficient of SE, the
estimated regression model can be mathematically presented as: JP = -.4326 (constant) + .097 (SE) +
ε
The value of the coefficient for SE indicates that every 1-unit increase in the level of SE
corresponds to a .097-unit increase in the level of Job Performance. This implies that the higher the
level of SE, the higher the level of job performance. The positive
coefficient with a p-value of .005 for SE is consistent with the empirical findings which claimed that SE can
significantly predict JP (cite authors here).
In its entirety, the explanatory and predictive power of SE is considered to be high because it
accounts for 86.0% of the variation in Job Performance. This is manifested in the model summary
table, which shows that the value of the Adjusted R2 is .86, implying that about 86.0 percent of
the variation in JP can be explained by the variation in SE. The remaining 14.0 percent of
unexplained variation could be accounted for by other variables not included in the model.
Exercise 1:
3. Multiple Linear Regression Analysis
When to Use?
An extension of Simple Linear Regression Analysis
The aims, assumptions, and requirements, are just the same as with Simple Linear Regression
Analysis
Assumptions
For each subject in the study, there must be related pairs of scores. That is, if a subject has a score
on variable X, then the same subject must also have a score on variable Y.
The variables should be measured at least at the ordinal level.
Sample Research:
Research Background
The researcher is interested in knowing whether Income and Study Habit can significantly predict
Academic Performance (AP). To arrive at an answer, he gathered the following data:
Data
Academic Performance    Income (monthly income    Study Habits (hours per
(GPA)                   of parents)               week of study)
79                      10,500                    21
84                      14,000                    13
83                      13,500                    16
89                      20,000                    24
87                      18,000                    16
91                      20,500                    21
Null Hypothesis
Income and study habit cannot significantly predict academic performance of the respondents. Or
Income and study habit do not have significant influence towards academic performance of students.
Conceptual Framework
Independent Variable    Dependent Variable
Income                  Academic Performance
Study Habits
Note:
No need to add Values in Variable View
How to Run the Data in SPSS
1 Analyze
2 Regression
3 Linear
4 Transfer the DV to the Dependent box
5 Transfer the IVs to the Independent(s) box
6 Statistics
7 Model Fit & Descriptives
Population Regression Model:
Y = β0 + β1X1 + β2X2 + ε
Where:
Y = Endogenous Variable (DV)
β0 = Constant / Intercept
β1, β2 = Regression Coefficients
X1, X2 = Exogenous Variables (IVs)
ε = Residual or Error Term
Applied to this study:
AP = β0 + β1(In) + β2(SH) + ε
Where:
AP = Academic Performance
β0 = Constant / Intercept
β1, β2 = Regression Coefficients
In = Income
SH = Study Habit
ε = Residual or Error Term
Results
Adjusted R square = .971
Sig = .002
Constant = 69.318
In = .001
Adjusted R square = .971
This implies that the predictor variables (Income and Study Habit) explain 97.1% of the variance in the
dependent variable (Academic Performance).
Or
97.1% of the variation in the dependent variable (Academic Performance) can be explained by the variation
in the independent variables (Income and Study Habit).
Interpretation
When the regression equation AP = β0 + β1In + β2SH + ε was tested using multiple linear regression
analysis, results from the ANOVA table show that the sig. value is .002, which is below
the .05 level of significance set for this study. This implies that, overall, the model is
significant and fits the data.
However, when looking at the regression coefficient of each of the independent variables, only
Income was found to be significant (p = .002). This result implies that, of the factors included in
the model, only Income significantly explains the variation in the dependent variable, Academic
Performance. Thus, the estimated regression model can be mathematically presented as: AP = 69.318
(constant) + .001 (In) + ε
The value of the coefficient for Income indicates that, holding all other variables in the regression
constant, every 1-unit increase in Income corresponds to
a .001-unit increase in the level of Academic Performance. This implies that the higher
the income of the parents, the higher the level of AP of the students. The
positive coefficient with a p-value of .002 for Income is consistent with the empirical findings which
claimed that Income can significantly predict AP (cite authors here).
The remaining factor, Study Habit, was found not to be a significant predictor of AP, as manifested
by its significance value (p = .373), which is greater than the .05 level of significance set for this study.
In its entirety, the explanatory and predictive power of the model is considered to be high because it
accounts for 97.1% of the variation in AP. This is manifested in the model summary table,
which shows that the value of the Adjusted R2 is .971, implying that about 97.1% of the
variation in AP can be explained by the predictors in the model. The remaining 2.9% of
unexplained variation could be accounted for by other variables not included in the model.
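The constant, the Income coefficient, and the Adjusted R square reported above can be reproduced from the six data rows. The sketch below solves the least-squares normal equations for the two-predictor case in closed form; the names `centered_sum` and `fit_two_predictors` are illustrative, and the p-values are not computed (they come from the SPSS output).

```python
from statistics import mean

# Data copied from the table of the sample research.
gpa = [79, 84, 83, 89, 87, 91]
income = [10500, 14000, 13500, 20000, 18000, 20500]
study = [21, 13, 16, 24, 16, 21]


def centered_sum(u, v):
    """Sum of cross-products of deviations from the means of u and v."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def fit_two_predictors(y, x1, x2):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2, adj_r2)."""
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y, syy = centered_sum(x1, y), centered_sum(x2, y), centered_sum(y, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = mean(y) - b1 * mean(x1) - b2 * mean(x2)
    r2 = (b1 * s1y + b2 * s2y) / syy
    n, k = len(y), 2  # k = number of predictors
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return b0, b1, b2, adj_r2


b0, b1, b2, adj_r2 = fit_two_predictors(gpa, income, study)
print(f"constant = {b0:.3f}, Income = {b1:.3f}, Study Habit = {b2:.3f}")
print(f"Adjusted R square = {adj_r2:.3f}")
```

The Income coefficient looks small only because income is measured in single currency units; it is the change in GPA per one-unit change in monthly income.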