Statistics Summary

General Fields of Statistics
• Descriptive Statistics
  o Methods involving the collection, presentation, and characterization of a set of data in order to properly describe the various features of that set of data.
  o Concerned strictly with the data on hand; it can only measure what already exists.
• Inferential Statistics
  o Methods that make possible the estimation of a characteristic or property of a population, or the making of a decision about a population, based only on sample results (to infer).
  o Demands a higher order of critical judgment; the treatment of sample data leads to predictions or inferences about a larger set of data, the population, and its true parameters.
Types of Data
• Categorical Data
  o Classes
  o Gender
  o Civil Status
  o Political Party
  o Tribal Affiliation
• Ranked Data
  o Rank order or position only; the ranks must not carry numerical values
• Numerical Data
  o Discrete – counting process (e.g., numbers)
  o Continuous – measurement process (e.g., height, weight, length, temperature)
Levels of Measurement
• Nominal scale – categorical data
• Ordinal scale – ranked data
• Interval scale – measurement data
• Ratio scale – measurement data with a true zero point
TEST OF DIFFERENCE
1. T-test for Independent Samples
When to Use?
• Used for testing the difference between the means of two independent groups.
• Particularly useful when the research question requires the comparison of variables obtained from two independent samples.
Checklist
• Only one independent (grouping) variable (IV) (e.g., subject’s gender)
• Only two levels for that IV (e.g., male and female)
• Only one dependent variable (DV)
Sample Research:
Research Background
A researcher wants to investigate whether first-year male and female students at a university differ in their memory abilities. Ten students from each group were randomly selected from the first-year enrolment roll to serve as subjects.
All 20 subjects were asked to read 30 unrelated words and then to recall as many of the words as possible. The number of words correctly recalled by each subject was recorded.
Statement of the Problem
1. What is the level of memory abilities of male respondents?
2. What is the level of memory abilities of female respondents?
3. Is there a significant difference in the memory abilities of male and female respondents?
Null Hypothesis
There is no significant difference in the memory abilities of male and female respondents.
Conceptual Framework
Independent Variable        Dependent Variable
Gender (Male, Female)       Memory Ability
Data
Male    Female
16      24
14      23
18      26
25      17
17      18
14      20
19      23
21      26
16      24
17      20
How to Run the Data in SPSS
1 Analyze
2 Compare Means
3 Independent Samples T-test
4 IV to Grouping Variable
5 DV to Test Variable
6 Define Groups (Group 1 = 1, Group 2 = 2)
Note: Add Values in Variable View
Results
Interpretation
#1 The result from the analysis indicates that there is a significant difference between the male and female samples in the number of words correctly recalled, t(18) = –3.02, p = .007. The mean values indicate that females correctly recalled significantly more words (M = 22.10) than males (M = 17.70).
#2 The significance (probability) value of about .007 is lower than the .05 level of significance set for this study. This implies that the memory ability of the female students is significantly higher than that of the male students. Thus, the null hypothesis is rejected.
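The arithmetic behind the t value can be sketched outside SPSS. The following is a minimal illustration, assuming the pooled-variance (equal variances assumed) form of the independent-samples t-test. The function name independent_t and the variable names are illustrative and not part of the handout. The p-value is not computed, because the Python standard library has no t distribution.

```python
import math
import statistics

# Words correctly recalled, copied from the Data table of the sample research.
male = [16, 14, 18, 25, 17, 14, 19, 21, 16, 17]
female = [24, 23, 26, 17, 18, 20, 23, 26, 24, 20]


def independent_t(group1, group2):
    """Return (t, df) for a pooled-variance independent-samples t-test."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = statistics.mean(group1), statistics.mean(group2)
    # statistics.variance() is the sample variance (n - 1 in the denominator).
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


t_value, df = independent_t(male, female)
print(f"Male M = {statistics.mean(male):.2f}, Female M = {statistics.mean(female):.2f}")
print(f"t({df}) = {t_value:.2f}")
```

With the sample data this gives t(18) = -3.02, in line with the interpretation. The absolute value exceeds 2.101, the tabled two-tailed critical value for df = 18 at the .05 level, which leads to the same decision to reject the null hypothesis.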
Exercise 1:
Research Background
A study was conducted to determine if urban and rural teachers differ in their attitude towards using the K-12 Learning Package. The data gathered are presented below:
Data
Urban Teacher Rural Teacher
3.98 4.03
3.12 3.03
2.98 2.78
2.77 2.87
2.12 1.99
4.10 4.23
3.15 3.10
2.65 2.41
2.43 2.59
3.01 2.99

Statement of the Problem
1. What is the average attitude of urban teachers towards the K-12 Learning Package?
2. What is the average attitude of rural teachers towards the K-12 Learning Package?
3. Is there a significant difference between the attitude of urban teachers and rural teachers towards the K-12 Learning Package?
Null Hypothesis
There is no significant difference between the attitude of urban teachers and rural teachers towards
the K-12 Learning Package.
Conceptual Framework
Independent Variable           Dependent Variable
Teacher Type (Urban, Rural)    Attitude towards the K-12 Learning Package
Results
Interpretation
The aim of the study is to determine if there is a significant difference between the attitude of urban and rural teachers towards the use of the K-12 Learning Package in teaching. To answer the research problem, the T-test for Independent Samples was employed. Results reveal that urban teachers show a higher level of attitude (M = 3.03) towards the use of the K-12 Learning Package than those in the rural areas (M = 3.00).
However, the significance value of about .922, which is higher than the .05 level of significance set for this study, implies that the mean scores of the two groups are not significantly different. This means that while urban teachers show a higher level of attitude towards the use of the K-12 Learning Package than those in the rural areas, the difference is not significant. Thus, the null hypothesis is not rejected.
2. Paired Sample T-test
When to Use?
• Used in repeated-measures or correlated-groups designs, in which each subject is tested twice on the same variable.
• Involves the before-and-after design (Pre-Test & Post-Test).
Checklist
• There must be only two sets of data.
• The two sets must be obtained from either:
  (1) the same subjects/respondents, or
  (2) two matched groups of subjects/respondents.
Sample Research:
Research Background
A researcher designed an experiment to test the effect of drug X on eating behavior. The amount
of food eaten by a group of rats in a one-week period prior to ingesting drug X was recorded.
The rats were then given drug X, and the amount of food eaten in a one-week period was again
recorded. The following amounts of food in grams were eaten during the “before” and “after”
conditions.
Statement of the Problem
1. What is the amount of food consumed by the rats before drug X?
2. What is the amount of food consumed by the rats after drug X?
3. Is there a significant difference in the level of food consumption before and after drug X?
Null Hypothesis
There is no significant difference in the level of food consumption before and after drug X.
Conceptual Framework
Variable X                          Variable Y
Food Consumption before Drug X      Food Consumption after Drug X
Data
Before Drug X    After Drug X
100              60
80               80
16               110
220              140
140              100
250              200
170              100
220              180
120              140
210              130
How to Run the Data in SPSS
1 Analyze
2 Compare Means
3 Paired Samples T-test
4 Transfer ‘Before’ and ‘After’ to Paired Variables
Note: Add Values in Variable View
Results
Interpretation
#1 The result from the analysis indicates that there is no significant difference in the amount of food eaten before and after drug X was ingested, p = .127. The mean values indicate that less food was consumed after ingestion of drug X (M = 124.00) than before (M = 152.60), but the difference is not significant.
#2 The study was conducted to determine if there is a significant difference in the amount of food eaten before and after drug X was ingested. To analyze the data, a paired-sample t-test was employed. The result indicates that there is no significant difference in the amount of food eaten before and after drug X was ingested, as manifested by the sig value of .127, which is greater than the .05 level of significance set for this study. The mean values indicate that less food was consumed after ingestion of drug X (M = 124.00) than before (M = 152.60), but the difference is not significant. Thus, the null hypothesis is not rejected.
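The paired-sample t statistic is computed from the differences within each pair, which can be sketched as follows. This is a minimal illustration using only the Python standard library. The function name paired_t is illustrative and not part of the handout, and the p-value is left to SPSS or a t table.

```python
import math
import statistics

# Grams of food eaten, copied from the Data table of the sample research.
before = [100, 80, 16, 220, 140, 250, 170, 220, 120, 210]
after = [60, 80, 110, 140, 100, 200, 100, 180, 140, 130]


def paired_t(first, second):
    """Return (t, df) for a paired-sample t-test on two matched lists."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD of the differences).
    std_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_error, n - 1


t_value, df = paired_t(before, after)
print(f"Before M = {statistics.mean(before):.2f}, After M = {statistics.mean(after):.2f}")
print(f"t({df}) = {t_value:.2f}")
```

For the sample data this gives t(9) = 1.68, which is below 2.262, the tabled two-tailed critical value for df = 9 at the .05 level. This agrees with the reported sig value of .127 being greater than .05.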
Exercise 2:
Research Background
A researcher wants to know whether Grade 10 Math students really learned something from the lessons discussed. To find out, a pre-test was given before the class discussion and the same test (post-test) was given after it. The test results are shown below:
Data
Pre-Test in Math 10 Post-Test in Math 10
78 79
67 90
89 79
90 89
65 67
57 89
89 67
68 89
77 99
88 87
Statement of the Problem
1. What is the mean score of the Pre-Test of Math 10 Students?
2. What is the mean score of the Post-Test of Math 10 Students?
3. Is there a significant difference between the pre-test and post-test scores of Math 10 students?
Null Hypothesis
There is no significant difference between the pre-test and post-test scores of Math 10 students.
Conceptual Framework
Variable X          Variable Y
Math 10 Pre-Test    Math 10 Post-Test
Results
Interpretation
The result from the analysis indicates that there is no significant difference between the pre-test and post-test scores of Math 10 students, with a p value (.246) > .05. The mean values indicate that the post-test mean (M = 83.5) is greater than the pre-test mean (M = 76.8), but the difference is not significant. Thus, the null hypothesis is not rejected.
3. One-way Analysis of Variance
When to Use?
• An extension of the independent t-test.
• Used when the researcher is interested in whether the means from several (> 2) independent groups differ.
Checklist
• Only one independent variable (e.g., ethnicity).
• More than two levels for that independent variable (e.g., Australian, American, Chinese, and African).
• Only one dependent variable.
Sample Research:
Research Background
A researcher is interested in finding out whether the intensity of electric shock will affect the time required to solve a set of difficult problems. Eighteen subjects are randomly assigned to the three experimental conditions of low shock, medium shock, and high shock. The total time (in minutes) required to solve all the problems is the measure recorded for each subject.
Statement of the Problem
1. What is the average time consumed in answering difficult problems by those subjected to low-intensity shock?
2. What is the average time consumed in answering difficult problems by those subjected to medium-intensity shock?
3. What is the average time consumed in answering difficult problems by those subjected to high-intensity shock?
4. Is there a significant difference in the time consumed in answering difficult problems when the subjects are grouped according to shock intensity?
Null Hypothesis
There is no significant difference in the time consumed in answering difficult problems when the subjects are grouped according to shock intensity.
Conceptual Framework
Independent Variable                  Dependent Variable
Electric Shock (Low, Medium, High)    Time to Answer Difficult Problems
How to Run the Data in SPSS

1 Analyze
2 Compare Means
3 One-Way ANOVA
4 IV to Factor Field
5 Post Hoc
6 Scheffe
7 Options
8 Descriptive Cell
Note: Add Values in Variable View
Data
Low Medium High
15 30 40
10 15 35
25 20 50
15 25 43
20 23 45
18 20 40
Results
Interpretation
#1 The results from the analysis indicate that the intensity of the electric shock has a significant
effect on the time taken to solve the problems, F(2,15) = 40.13, p < .001. The mean values for the
three shock levels indicate that as the shock level increased (from low to medium to high), so did the
time taken to solve the problems (low: M = 17.17; medium: M = 22.17; high: M = 42.17).
In the Multiple Comparisons table, in the column labeled Mean Difference (I – J), the mean
difference values accompanied by asterisks indicate which shock levels differ significantly from each
other at the 0.05 level of significance. The results indicate that the high shock level is significantly
different from both the low shock and medium shock levels. The low shock level and the medium
shock level do not differ significantly. These results show that the overall difference in the time taken
to solve complex problems between the three shock-intensity levels is because of the significantly
greater amount of time taken by the subjects in the high shock condition.
#2 The aim of this study is to determine if there is a significant difference in the average time consumed by the subjects in solving problems when subjected to low-intensity (LI), medium-intensity (MI), and high-intensity (HI) shock. To answer the research problem, One-Way Analysis of Variance was employed.
Results reveal that the average time consumed differs significantly when the subjects are grouped according to LI, MI, and HI. This is manifested by a sig value of .000 (p < .001), which is less than the .05 level of significance set for this study.
To test for individual differences in the average time consumed in solving problems when grouped according to shock intensity, a post hoc comparison using the Scheffe test was employed. In the Multiple Comparisons table, in the column labeled Mean Difference (I – J), the mean difference values accompanied by asterisks indicate which shock levels differ significantly from each other at the 0.05 level of significance.
The results indicate that HI differs significantly from both LI and MI. However, no significant difference is found between LI and MI. These results show that the overall difference in the average time consumed is due to the significantly longer time consumed by those subjected to HI. Thus, the null hypothesis is rejected.
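The F ratio reported by SPSS can be reproduced from the between-groups and within-groups sums of squares. The following is a minimal sketch using only the Python standard library. The function name one_way_anova is illustrative, and the Scheffe post hoc comparisons are not reproduced here.

```python
import statistics

# Minutes needed to solve the problems, copied from the Data table.
groups = {
    "Low": [15, 10, 25, 15, 20, 18],
    "Medium": [30, 15, 20, 25, 23, 20],
    "High": [40, 35, 50, 43, 45, 40],
}


def one_way_anova(samples):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [x for s in samples for x in s]
    grand_mean = statistics.mean(all_scores)
    # Between-groups SS: how far each group mean lies from the grand mean.
    ss_between = sum(len(s) * (statistics.mean(s) - grand_mean) ** 2 for s in samples)
    # Within-groups SS: spread of the scores around their own group mean.
    ss_within = sum((x - statistics.mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_scores) - len(samples)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


f_value, df_b, df_w = one_way_anova(list(groups.values()))
for name, scores in groups.items():
    print(f"{name}: M = {statistics.mean(scores):.2f}")
print(f"F({df_b},{df_w}) = {f_value:.2f}")
```

For the sample data this reproduces the group means (17.17, 22.17, 42.17) and F(2,15) = 40.13 quoted in the interpretation.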
Exercise 1:
Research Background
The researcher would like to determine whether there exists a significant difference in the mathematical ability of Grade 10 students when compared according to countries in Asia. He gathered the following data:
Data
Singapore Malaysia Vietnam Philippines
85 81 79 89
87 83 78 90
89 85 90 88
84 87 82 85
80 88 83 87
79 80 81 79
83 77 76 85
82 75 81 84
86 81 79 85
84 82 82 87
88 83 80 92
82 82 82 82

Statement of the Problem
1. What is the level of Mathematical Ability of Grade 10 Students in Singapore, Malaysia, Vietnam, and the Philippines?
2. Is there a significant difference in the level of Mathematical Ability of Grade 10 Students when compared according to various countries in Asia, namely Singapore, Malaysia, Vietnam, and the Philippines?
Null Hypothesis
There is no significant difference in the level of Mathematical Ability of Grade 10 Students when compared according to various countries in Asia, namely Singapore, Malaysia, Vietnam, and the Philippines.
Conceptual Framework
Independent Variable: Grade 10 Students in various countries in Asia (Singapore, Malaysia, Vietnam, Philippines)
Dependent Variable: Level of Mathematical Ability of Grade 10 Students in various countries in Asia

Results
Interpretation
The result from the analysis indicates that the Mathematical Ability of Grade 10 students differs significantly across the countries compared, F(3, 44) = 5.000, p < .05 (Sig. = .005). The mean Mathematical Ability of the Grade 10 students in the four countries ranges from about 81 to 86, with an overall mean of M = 83.31. To identify which countries differ from one another, the post hoc results in the Multiple Comparisons table were examined.
In the Multiple Comparisons table, in the column labeled Mean Difference (I – J), the mean difference values accompanied by asterisks indicate which countries differ significantly from each other at the 0.05 level of significance.
The results indicate that the levels of mathematical ability of Grade 10 students in Vietnam and the Philippines differ significantly at the 0.05 level. Further, the overall significant difference in the level of mathematical ability among the four Asian countries is due to the difference found between Vietnam and the Philippines. Thus, the null hypothesis is rejected.
TEST OF RELATIONSHIP
1. Pearson Product Moment Correlation (Simple Correlation)
When to Use?
• To find out whether a relationship exists between variables and to determine its magnitude and direction.
• Attempts to find the extent to which two or more variables are related.
• No variables are manipulated, unlike in an experiment.
• Measures naturally occurring events, behaviors, or personality characteristics and then determines whether the measured scores covary.
• Requires a pair of observations or measures on two different variables from each of a number of individuals.
Two Types of Correlation Coefficient
1 Pearson product moment correlation coefficient (r), employed with interval- or ratio-scaled variables
2 Spearman rank order correlation coefficient (rho), employed with ordered or ranked data
Characteristics of Correlation Coefficient
• Two sets of measurements are obtained on the same individuals or on pairs of individuals who are matched on some basis.
• The values of the correlation coefficient vary between +1.00 and –1.00 (the extremes).
  o Perfect relationship: +1.00 or –1.00
  o Absence of a relationship: 0.00
• Positive relationship – individuals obtaining high scores on one variable tend to obtain high scores on the second variable, and those obtaining low scores on one tend to obtain low scores on the other.
  o Ex. + + or – –
• Negative relationship – individuals scoring low on one variable tend to score high on the second variable, and vice versa.
  o Ex. + – or – +
Sample Research:
Research Background
Assume that a researcher wishes to determine whether there is a relationship between grade point
average (GPA) and the scores on a reading-comprehension test of 12 first-year students. The
researcher recorded the pair of scores given in the following, together with their rankings:
Data
Student # GPA Reading Comprehension Score
1 91 43
2 87 37
3 76 26
4 82 29
5 85 30
6 90 41
7 77 25
8 88 38
9 82 30
10 80 28
11 93 46
12 83 32

Statement of the Problem
1. What is the GPA of the respondents?
2. What is the level of reading comprehension of the respondents?
3. Is there a significant relationship between the respondents’ GPA and reading comprehension?
Null Hypothesis
There is no significant relationship between the respondents’ GPA and reading comprehension.
Conceptual Framework
Variable X    Variable Y
GPA           Reading Comprehension
How to Run the Data in SPSS
1 Analyze
2 Correlate
3 Bivariate (Bivariate Correlations)
4 Transfer variables to Variables Field
5 Pearson Correlation Analysis and Two-tailed test of Significance (both fields should be checked)
6 Options (Bivariate Correlation), then Continue
Note: No need to add Values in Variable View
Results
Interpretation
The purpose of the study is to determine if there is a significant relationship between GPA and reading comprehension. To answer the problem, the Pearson Product Moment Correlation test was employed. Results indicate that the sig. value of .000 (p < .001) is lower than the .05 level of significance set for the study. This indicates that there is a significant relationship between GPA and reading comprehension, and that the relationship is a very high positive correlation, as manifested by the r value of .967. The results further imply that as GPA increases, so does reading comprehension. Thus, the null hypothesis is rejected.
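The r value can be checked directly from the paired scores. The following is a minimal sketch of the Pearson product moment formula using only the Python standard library. The function name pearson_r is illustrative, and the significance test is left to SPSS.

```python
import math
import statistics

# Paired scores of the 12 students, copied from the Data table.
gpa = [91, 87, 76, 82, 85, 90, 77, 88, 82, 80, 93, 83]
reading = [43, 37, 26, 29, 30, 41, 25, 38, 30, 28, 46, 32]


def pearson_r(x, y):
    """Return the Pearson product moment correlation of two paired lists."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Sum of cross-products and sums of squares of the deviations.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_value = pearson_r(gpa, reading)
print(f"r = {r_value:.3f}")
```

For the sample data this gives r = 0.967, the value cited in the interpretation. The positive sign reflects that higher GPAs go with higher reading-comprehension scores.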
Exercise:
Research Background
The researcher is interested in knowing if there is a correlation/relationship between weight and self-confidence. She asked 12 people to join the study and gathered the following data:
Data
Student # Weight Self-Confidence
1 120 2.10
2 115 2.35
3 135 1.87
4 95 2.76
5 87 2.98
6 79 3.35
7 109 2.25
8 101 2.68
9 168 4.05
10 127 1.99
11 92 3.08
12 90 2.85

Statement of the Problem
1. What is the average weight of the respondents?
2. What is the level of self-confidence of the respondents?
3. Is there a significant relationship between a person’s weight and self-confidence?
Null Hypothesis
There is no significant relationship between a person’s weight and self-confidence.
Conceptual Framework
Variable X    Variable Y
Weight        Self-Confidence
How to Run the Data in SPSS
1 Analyze
2 Correlate
3 Bivariate (Bivariate Correlations)
4 Transfer variables to Variables Field
5 Pearson Correlation Analysis and Two-tailed test of Significance (both fields should be checked)
6 Options (Bivariate Correlation), then Continue
Note: No need to add Values in Variable View
Results
Interpretation
The purpose of the study is to determine if there is a significant relationship between weight and self-confidence. To answer the problem, the Pearson Product Moment Correlation test was employed. Results indicate that the sig. value of .000 (p < .001) is lower than the .05 level of significance set for the study. This indicates that there is a significant relationship between weight and self-confidence, and that the relationship is a very high negative correlation, as manifested by the r value of -.965. The results further imply that as weight increases, self-confidence decreases. Similarly, as weight decreases, self-confidence increases. Thus, the null hypothesis is rejected.
2. Simple Linear Regression Analysis

When to Use?
• Regression and correlation are closely related. However, regression focuses on using the relationship for PREDICTION.
• In terms of prediction, if two variables were correlated perfectly, then knowing the score on one variable would permit a perfect prediction of the score on the second variable. Generally, whenever two variables are significantly correlated, the researcher may use the score on one variable to predict the score on the second.
• Research questions of this kind involve predictions from one variable to another, and psychologists, educators, biologists, sociologists, and economists are constantly being called upon to make such predictions.
Assumptions
• For each subject in the study, there must be related pairs of scores. That is, if a subject has a score on variable X, then the same subject must also have a score on variable Y.
• The variables should be measured at least at the ordinal level.
Sample Research:
Research Background
The researcher is interested in knowing if self-esteem can significantly predict job performance. To arrive at an answer, he gathered the following data:
Data
Level of Self-esteem Level of Job Performance
79 3.3
84 3.9
83 3.7
89 4.5
87 4.3
91 4.3

Statement of the Problem
1. What is the level of self-esteem of the respondents?
2. What is the level of job performance of the respondents?
3. Can self-esteem predict the job performance of the respondents? Or: Is self-esteem a good predictor/determinant of job performance?
Null Hypothesis
Self-esteem cannot predict job performance.
Conceptual Framework
Independent Variable    Dependent Variable
Self-Esteem             Job Performance
How to Run the Data in SPSS
1 Analyze
2 Regression
3 Linear
4 Transfer the DV to the Dependent box
5 Transfer the IV to the Independent(s) box
6 Statistics
7 Model Fit & Descriptives
Population Regression Model:

JP = β0 + β1X1 + ε
Where:
JP = Endogenous variable (DV)
β0 = Constant / Intercept
β1 = Regression coefficient
X1 = Exogenous variable (IV)
ε = Residual or error term

JP = β0 + β1(SE) + ε
Where:
JP = Job Performance
β0 = Constant / Intercept
β1 = Regression coefficient
SE = Self-esteem
ε = Residual or error term
Results
Adjusted R square = .860
This implies that the predictor variable (Self-esteem) explains 86.0% of the variance in the dependent variable (Job Performance).
Or
86.0% of the variations in the variable Job Performance can be explained by the variations in the variable Self-Esteem.
Interpretation
When the regression equation JP = β0 + β1(SE) + ε was tested using simple linear regression analysis, results from the ANOVA table show a sig-value of .005, which is below the .05 level of significance set for this study. This implies that, overall, the model is significant and fits the data. Using the regression coefficient of SE, the estimated regression model can be presented mathematically as: JP = -4.326 (constant) + .097 (SE) + ε
The value of the beta coefficient for SE indicates that every 1-unit increase in the level of SE gives a corresponding .097-unit increase in the level of Job Performance. This implies that the higher the level of SE, the higher the level of job performance. The positive beta coefficient for SE, with a p-value of .005, confirms the empirical findings which claimed that SE can significantly predict JP (cite authors here).
In its entirety, the explanatory and predictive power of SE is considered to be high because it could account for 86.0% of the variation in Job Performance. This is manifested in the model summary table, which shows an Adjusted R2 of .86, implying that about 86.0 percent of the variations in JP can be explained by the variations in SE. The remaining 14.0 percent unexplained variation could be accounted for by other variables not included in the model.
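The estimated coefficients and the Adjusted R2 can be recomputed from the six pairs of scores. The following is a minimal least-squares sketch using only the Python standard library. The function name simple_regression is illustrative, and the significance tests from the ANOVA and coefficient tables are not reproduced.

```python
import statistics

# Scores of the six respondents, copied from the Data table.
self_esteem = [79, 84, 83, 89, 87, 91]
job_perf = [3.3, 3.9, 3.7, 4.5, 4.3, 4.3]


def simple_regression(x, y):
    """Return (intercept, slope, adjusted R^2) of the least-squares line."""
    n = len(x)
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    # Adjustment for one predictor: denominator is n - 1 - 1 = n - 2.
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)
    return intercept, slope, adj_r_squared


b0, b1, adj_r2 = simple_regression(self_esteem, job_perf)
print(f"JP = {b0:.3f} + {b1:.3f} (SE)")
print(f"Adjusted R square = {adj_r2:.3f}")
```

For the sample data this yields an intercept of about -4.326, a slope of about .097, and an Adjusted R square of about .860. As a usage example, a respondent with a self-esteem score of 85 would have a predicted job performance of b0 + b1 * 85, roughly 3.95.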
Exercise 1:
3. Multiple Linear Regression Analysis
When to Use?
• An extension of Simple Linear Regression Analysis.
• The aims, assumptions, and requirements are the same as with Simple Linear Regression Analysis.
Assumptions
• For each subject in the study, there must be related pairs of scores. That is, if a subject has a score on variable X, then the same subject must also have a score on variable Y.
• The variables should be measured at least at the ordinal level.
Sample Research:
Research Background
The researcher is interested in knowing if Income and Study Habit can significantly PREDICT Academic Performance (AP). To arrive at an answer, he gathered the following data:
Data
Academic Performance (GPA)    Income (monthly income of parents)    Study Habits (hours of study per week)
79                            10,500                                21
84                            14,000                                13
83                            13,500                                16
89                            20,000                                24
87                            18,000                                16
91                            20,500                                21

Statement of the Problem
1. What is the level of AP of the respondents?
2. What is the average monthly income of the parents of the respondents?
3. What is the average number of hours of study of the respondents?
4. Can income and study habit significantly predict the academic performance of the respondents? Or: Do income and study habit have a significant influence on the academic performance of students?
Null Hypothesis
Income and study habit cannot significantly predict the academic performance of the respondents. Or: Income and study habit do not have a significant influence on the academic performance of students.
Conceptual Framework
Independent Variables    Dependent Variable
Income                   Academic Performance
Study Habits
How to Run the Data in SPSS
1 Analyze
2 Regression
3 Linear
4 Transfer the DV to the Dependent box
5 Transfer the IVs to the Independent(s) box
6 Statistics
7 Model Fit & Descriptives
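Population Regression Model:

AP = β0 + β1X1 + β2X2 + ε
Where:
AP = Endogenous variable (DV)
β0 = Constant / Intercept
β1, β2 = Regression coefficients
X1, X2 = Exogenous variables (IVs)
ε = Residual or error term

AP = β0 + β1(In) + β2(SH) + ε
Where:
AP = Academic Performance
β0 = Constant / Intercept
β1, β2 = Regression coefficients
In = Income
SH = Study Habit
ε = Residual or error term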
Results
Sig = .002
Constant = 69.318
In = .001
Adjusted R square = .971
This implies that the predictor variables (Income and Study Habit) explain 97.1% of the variance in the dependent variable (Academic Performance).
Or
97.1% of the variations in the dependent variable (Academic Performance) can be explained by the variations in the independent variables (Income and Study Habit).
Interpretation
When the regression equation AP = β0 + β1(In) + β2(SH) + ε was tested using multiple linear regression analysis, results from the ANOVA table show a sig-value of .002, which is below the .05 level of significance set for this study. This implies that, overall, the model is significant and fits the data.
However, when looking at the regression coefficient of each independent variable, only Income was found to be significant (p = .002). This result implies that, of the factors included in the model, only income can explain the variations in the dependent variable, Academic Performance. Thus, the estimated regression model can be presented mathematically as: AP = 69.318 (constant) + .001 (In) + ε
The value of the beta coefficient for Income indicates that, holding all other variables in the regression constant, every 1-unit increase in Income gives a corresponding .001-unit increase in Academic Performance. This implies that the higher the income of the parents, the higher the AP of the students. The positive beta coefficient for Income, with a p-value of .002, confirms the empirical findings which claimed that Income can significantly predict AP (cite authors here).
The remaining factor, study habit, was found not to be a good predictor of AP, as manifested by its significance value (p = .373), which is greater than the .05 level of significance set for this study.
In its entirety, the explanatory and predictive power of the model is considered to be high because it could account for 97.1% of the variation in AP. This is manifested in the model summary table, which shows an Adjusted R2 of .971, implying that about 97.1% of the variations in AP can be explained by the variations in Income and Study Habit. The remaining 2.9% unexplained variation could be accounted for by other variables not included in the model.
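The constant, the coefficients, and the Adjusted R2 of the two-predictor model can be recomputed by solving the least-squares normal equations. The following is a minimal sketch that works only for exactly two predictors and uses only the Python standard library. The function names centered_cross_sum and fit_two_predictors are illustrative, and the p-values of the individual coefficients are not reproduced.

```python
import statistics

# Data of the six respondents, copied from the Data table.
ap = [79, 84, 83, 89, 87, 91]
income = [10500, 14000, 13500, 20000, 18000, 20500]
study = [21, 13, 16, 24, 16, 21]


def centered_cross_sum(u, v):
    """Sum of cross-products of the deviations of u and v from their means."""
    mean_u, mean_v = statistics.mean(u), statistics.mean(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def fit_two_predictors(y, x1, x2):
    """Return (b0, b1, b2, adjusted R^2) for y = b0 + b1*x1 + b2*x2."""
    s11, s22 = centered_cross_sum(x1, x1), centered_cross_sum(x2, x2)
    s12 = centered_cross_sum(x1, x2)
    s1y, s2y = centered_cross_sum(x1, y), centered_cross_sum(x2, y)
    syy = centered_cross_sum(y, y)
    # Solve the 2 x 2 normal equations by Cramer's rule.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    b0 = statistics.mean(y) - b1 * statistics.mean(x1) - b2 * statistics.mean(x2)
    r_squared = (b1 * s1y + b2 * s2y) / syy
    n, k = len(y), 2  # k is the number of predictors
    adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
    return b0, b1, b2, adj_r_squared


b0, b_income, b_study, adj_r2 = fit_two_predictors(ap, income, study)
print(f"Constant = {b0:.3f}, In = {b_income:.3f}, SH = {b_study:.3f}")
print(f"Adjusted R square = {adj_r2:.3f}")
```

For the sample data the constant (about 69.318), the Income coefficient (about .001), and the Adjusted R square (about .971) agree with the Results. Note that the constant belongs to the model containing both Income and Study Habit.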
