Chapter 3 Stat

Chapter 3

INFERENTIAL STATISTICS
Descriptive statistics, discussed in the previous chapter,
only describe the results of a study. For instance, they
can report that facilities were rated very adequate or
inadequate, but they do not tell whether the differences
in the adequacy of facilities are significant or
insignificant.
Inferential statistics tell whether the differences in the
research results are significant or insignificant.
Inferential statistical tools are:
- correlation,
- chi-square,
- z-test,
- t-test,
- F-test or analysis of variance (ANOVA),
- Kruskal-Wallis one-way ANOVA by ranks, and
- Friedman two-way ANOVA by ranks.
The results of these statistical tools tell whether the
findings are significant or insignificant.
CORRELATION

Correlation is used when a researcher wishes to determine
the relationship between two variables, X and Y, that is,
bivariate data. This is also called linear
correlation. Suppose the researcher wishes to
determine the relationship between the Mathematics (X)
scores and Statistics (Y) scores of civil
engineering students in a certain university. If the
results show a correlation value of 0.85, a high
correlation, this means that students who got high scores
in Mathematics also tended to get high scores in
Statistics, and vice versa.
In other words, correlation is defined as the
relationship between two sets of variables, X and Y,
or any symbols representing the two variables.
Correlation analysis is concerned with the
relationship in the changes and movements of two
variables. This relationship has a computed value
and may be visually illustrated through a scatter
diagram.
There are three degrees of relationship or correlation between
two variables:
1. perfect correlation (positive or negative)
2. some degree of correlation (positive or negative)
3. no correlation
The measure of the degree of relationship or association
between two variables may be further classified as linear or
non-linear.
When paired values such as height and weight are plotted in a
graph called a scatter diagram and the points follow a straight
line, the relationship is a simple or linear correlation. If, on
the other hand, the points follow a curve, the relationship is
said to be non-linear.
In statistics, correlation is usually reported as a correlation
coefficient. Like the mean and the standard deviation, the value
of the correlation coefficient (r) is a single number that
characterizes the whole set of observations and summarizes the
results.
Correlation is applicable to both descriptive and experimental
research. An example of experimental research in which
correlation is used is the weight-length relationship of
milkfish cultured in fish cages using bread meal as
supplemental feed. If the correlation value obtained is 0.90,
a high correlation, this means that as weight
increases, length also increases: the heavier the
milkfish, the longer it is.
Values of the correlation coefficient range from -1.0
to +1.0. The maximum value, +1.0, denotes perfect
positive correlation, and the minimum value, -1.0,
denotes perfect negative correlation. These values
seldom occur. Zero correlation also rarely happens.
Note that when all individuals have the same score in
Test X and Test Y, for example, everybody gets 88 in
Test X and 88 in Test Y, there is no variation in
either variable, so no relationship can be measured
(the denominator of the r formula is zero).
INTERPRETATION OF CORRELATION (r) VALUE

The classifications used to interpret the correlation (r) value
are as follows:
An r from 0.00 to ±0.20 denotes negligible correlation.
An r from ±0.21 to ±0.40 indicates low or slight relationship.
An r from ±0.41 to ±0.70 signifies moderate correlation.
An r from ±0.71 to ±0.90 denotes high relationship.
An r from ±0.91 to ±0.99 shows very high correlation.
An r of ±1.0 gives perfect correlation.
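The classification can be written as a small lookup. This is only a sketch: the function name `interpret_r` is hypothetical, and applying the scale to the absolute value of r is an assumption based on the way the text reads a negative coefficient such as -0.22 as a low or slight relationship.

```python
def interpret_r(r):
    """Return the verbal label for a correlation value (assumed: scale applies to |r|)."""
    size = round(abs(r), 2)  # the scale is stated to two decimal places
    if size > 1.0:
        raise ValueError("r must lie between -1.0 and +1.0")
    if size == 1.0:
        return "perfect correlation"
    if size >= 0.91:
        return "very high correlation"
    if size >= 0.71:
        return "high relationship"
    if size >= 0.41:
        return "moderate correlation"
    if size >= 0.21:
        return "low or slight relationship"
    return "negligible correlation"


print(interpret_r(0.85))   # high relationship
print(interpret_r(-0.22))  # low or slight relationship
```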
CORRELATION METHOD

There are several ways of measuring the relationship between two
variables, depending upon the nature of the data. The most
frequently encountered measure is the Pearson Product-Moment
Correlation Coefficient (r).

A. Computation of the Pearson r from the Deviations from the
Means. This is computed using the formula:
$$ r = \frac{\sum xy}{\sqrt{(\sum x^2)(\sum y^2)}} $$
Where:
x = deviation of an X value from the mean of X
y = deviation of a Y value from the mean of Y
Given (mean of X = 75/5 = 15; mean of Y = 170/5 = 34):
X (math)   Y (science)   x = X - 15   y = Y - 34   x²      y²        xy
13         19            -2           -15          4       225       30
14         73            -1           39           1       1521      -39
15         25            0            -9           0       81        0
16         26            1            -8           1       64        -8
17         27            2            -7           4       49        -14
∑X=75      ∑Y=170        0            0            ∑x²=10  ∑y²=1940  ∑xy=-31
After substituting the obtained values, we have
$$ r = \frac{\sum xy}{\sqrt{(\sum x^2)(\sum y^2)}} = \frac{-31}{\sqrt{(10)(1940)}} = \frac{-31}{\sqrt{19400}} = \frac{-31}{139.28} = -0.22 $$

A correlation coefficient of -0.22 denotes a negative low or
slight relationship.
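The deviation method can be checked with a few lines of Python. This is a minimal sketch using the five score pairs from the table; the function name `pearson_from_deviations` is an illustrative choice, not part of the original material.

```python
from math import sqrt


def pearson_from_deviations(xs, ys):
    """Pearson r computed from deviations of each value from its mean."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dev_x = [x - mean_x for x in xs]   # the x column of the table
    dev_y = [y - mean_y for y in ys]   # the y column of the table
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / sqrt(sum_x2 * sum_y2)


math_scores = [13, 14, 15, 16, 17]
science_scores = [19, 73, 25, 26, 27]
r = pearson_from_deviations(math_scores, science_scores)
print(round(r, 2))  # -0.22
```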
B. Computation of Pearson r from raw scores

The Pearson Product-Moment correlation coefficient (rxy)
is a linear correlation used to determine the
relationship between two sets of variables, X and Y.
This is the most common quantitative measure of the
association between two sets of variables. Use the
formula:
$$ r_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}} $$
Where:
rxy = correlation between X and Y
∑XY = sum of the products of X and Y
∑X = sum of X
∑Y = sum of Y
∑X² = sum of the squared X values
∑Y² = sum of the squared Y values
N = number of cases
Steps in computing the Pearson Product-Moment Correlation
Coefficient:
1. Find the sum of X and the sum of Y to get ∑X and ∑Y.
2. Square each X and Y value to get X² and Y².
3. Find the sum of X² and Y² to get ∑X² and ∑Y².
4. Multiply each X by its paired Y and add the products to get ∑XY.
5. Apply the formula.
Given:
X        Y        X²         Y²         XY
13       19       169        361        247
14       23       196        529        322
15       25       225        625        375
16       26       256        676        416
17       27       289        729        459
∑X=75    ∑Y=120   ∑X²=1135   ∑Y²=2920   ∑XY=1819
Substituting the values, we have:
$$ r_{xy} = \frac{5(1819) - (75)(120)}{\sqrt{[5(1135) - (75)^2][5(2920) - (120)^2]}} = \frac{9095 - 9000}{\sqrt{(5675 - 5625)(14600 - 14400)}} = \frac{95}{\sqrt{(50)(200)}} = \frac{95}{\sqrt{10000}} = 0.95 $$
A correlation coefficient of 0.95 indicates a very high positive
correlation between the values presented.
Example: Compute rxy on the Achievement Test between
Biology (X) and Chemistry (Y) taken by 20 BS Biology students
in a certain university.
Biology: 85, 88, 79, 90, 78, 83, 95, 82, 78, 89, 77, 85, 86, 79,
91, 76, 84, 80, 93, 88.
Chemistry: 80, 83, 75, 88, 75, 79, 91, 78, 75, 84, 75, 81, 81, 76,
89, 74, 79, 77, 90, 82.
a. Solve by following the steps, then compare with the computer
result.
b. Pearson correlation with computer (same steps; select
Correlation).
Given:
∑XY = 136458
∑X = 1686
∑Y = 1612
∑X² = 142734
∑Y² = 130484
N = 20

rxy = 0.977 or 0.98, a very high relationship


Interpretation
The rxy value obtained is 0.98, which denotes a very high
relationship between the Biology and Chemistry
achievement test scores of the BS Biology students in a
certain university. This means that students who got
very high scores in Biology also got very high
scores in Chemistry, and those who got very low scores in
Biology also got very low scores in Chemistry.
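For part (a), the five sums and the raw-score formula can be verified with a short script. This is a sketch rather than the computer procedure the text refers to; the variable names are illustrative.

```python
from math import sqrt

biology = [85, 88, 79, 90, 78, 83, 95, 82, 78, 89,
           77, 85, 86, 79, 91, 76, 84, 80, 93, 88]
chemistry = [80, 83, 75, 88, 75, 79, 91, 78, 75, 84,
             75, 81, 81, 76, 89, 74, 79, 77, 90, 82]

n = len(biology)
sum_x = sum(biology)                                   # step 1
sum_y = sum(chemistry)
sum_x2 = sum(x * x for x in biology)                   # steps 2 and 3
sum_y2 = sum(y * y for y in chemistry)
sum_xy = sum(x * y for x, y in zip(biology, chemistry))  # step 4

# Step 5: raw-score Pearson formula.
r_xy = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 1686 1612 142734 130484 136458
print(round(r_xy, 3))                        # 0.977
```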
Example 2. What is the relationship between the
Weight (X) and Length (Y) of Milkfish Cultured in
Fish Cages Using Shrimp Meal as Supplemental Feed?
Use the Pearson Product-Moment Correlation
Coefficient.
X (kg) = 0.8, 1.0, 0.5, 0.7, 0.6, 1.1, 0.9, 0.4, 0.5, 0.9,
0.8, 0.7, 0.5, 0.9, 1.2
Y (m) = 0.42, 0.55, 0.34, 0.43, 0.41, 0.58, 0.43, 0.33,
0.33, 0.44, 0.41, 0.42, 0.35, 0.45, 0.56

2.1. Solve using the computer.


Spearman Rank Correlation Coefficient or Spearman rho

The Spearman rank correlation coefficient, or
Spearman rho, is a correlation used to find the
relationship between the paired ranks assigned to
individuals in two sets of variables, X and Y. It is
also applicable to both descriptive and experimental
research.

$$ r_s = 1 - \frac{6\sum D^2}{N^3 - N} $$
Where:
rs = Spearman rho
D = Rx - Ry, the difference between the paired ranks
∑D² = sum of the squared differences between ranks
N = number of cases
Rx and Ry = ranks of X and Y
Example:
Compute the Spearman rho on the relationship between capital (X) and
profit (Y) of small entrepreneurs in Iloilo.
Capital = 10,000; 8,000; 15,000; 9,000; 12,000; 18,000; 14,000; 9,000;
7,000; 11,000; 20,000; 19,000; 10,000; 21,000; 12,000.
Profit = 5,500; 4,800; 13,700; 5,800; 10,550; 17,850; 11,200; 5,500;
4,500; 10,550; 20,000; 17,500; 6,800; 20,000; 10,350.
Steps:
1. Rank the values of Capital (X) and Profit (Y) from
highest to lowest. Tied values share the average of
their ranks.
2. Get the difference (D) between Rx and Ry.
3. Square the difference to get D².
4. Sum the squared differences to get ∑D².
5. Apply the formula.
Rank X and Y from Highest to Lowest

X       Rank   Rx      Y       Rank   Ry
21000   1      1       20000   1      1.5
20000   2      2       20000   2      1.5
19000   3      3       17850   3      3
18000   4      4       17500   4      4
15000   5      5       13700   5      5
14000   6      6       11200   6      6
12000   7      7.5     10550   7      7.5
12000   8      7.5     10550   8      7.5
11500   9      9       10350   9      9
10000   10     10.5    6800    10     10
10000   11     10.5    5800    11     11
9000    12     12.5    5500    12     12.5
9000    13     12.5    5500    13     12.5
8000    14     14      4800    14     14
7000    15     15      4500    15     15
No.   X       Y       Rx     Ry     D      D²
1     10000   5500    10.5   12.5   -2.0   4
2     8000    4800    14.0   14.0   0.0    0
3     15000   13700   5.0    5.0    0.0    0
4     9000    5800    12.5   11.0   1.5    2.25
5     12000   10550   7.5    7.5    0.0    0
6     18000   17850   4.0    3.0    1.0    1
7     14000   11200   6.0    6.0    0.0    0
8     9000    5500    12.5   12.5   0.0    0
9     7000    4500    15.0   15.0   0.0    0
10    11500   10550   9.0    7.5    1.5    2.25
11    20000   20000   2.0    1.5    0.5    0.25
12    19000   17500   3.0    4.0    -1.0   1
13    10000   6800    10.5   10.0   0.5    0.25
14    21000   20000   1.0    1.5    -0.5   0.25
15    12000   10350   7.5    9.0    -1.5   2.25
TOTAL                                      ∑D² = 13.50
Apply the formula:
$$ r_s = 1 - \frac{6\sum D^2}{N^3 - N} $$

Given:
∑D² = 13.50
N = 15
Tabular value (TV) at N = 15 and the 0.01 level = 0.645
Solution:
$$ r_s = 1 - \frac{6(13.50)}{15^3 - 15} = 1 - \frac{81}{3375 - 15} = 1 - \frac{81}{3360} = 1 - 0.024 = 0.976 \approx 0.98 $$

0.98** is significant at the 0.01 level and denotes a very high
relationship.


Interpretation

The Spearman rho (rs) value obtained between capital
and profit is 0.976 or 0.98, which is highly
significant at the 0.01 level of significance and
denotes a very high relationship. The tabular value
required for significance at the 0.01 level with N
equal to 15 is 0.645. This means that capital and
profit are strongly related: the higher the capital,
the higher the profit, and the lower the capital, the
lower the profit.
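The ranking and the rho formula can be reproduced in Python. This is a hedged sketch: the helper names `average_ranks` and `spearman_rho` are illustrative, and it uses the same shortcut formula as the text even though ties are present (an exact treatment of ties would compute Pearson r on the ranks instead).

```python
def average_ranks(values):
    """Rank from highest (rank 1) to lowest; tied values share the average rank."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1      # first position held by this value
        count = ordered.count(v)          # number of tied values
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


def spearman_rho(xs, ys):
    """Return (rho, sum of D squared) using rs = 1 - 6*sum(D^2) / (N^3 - N)."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    n = len(xs)
    return 1 - 6 * sum_d2 / (n ** 3 - n), sum_d2


# The 10th capital value appears as 11,000 in the data list and 11,500 in the
# tables; its rank (9) is the same either way.
capital = [10000, 8000, 15000, 9000, 12000, 18000, 14000, 9000,
           7000, 11000, 20000, 19000, 10000, 21000, 12000]
profit = [5500, 4800, 13700, 5800, 10550, 17850, 11200, 5500,
          4500, 10550, 20000, 17500, 6800, 20000, 10350]

rho, sum_d2 = spearman_rho(capital, profit)
print(sum_d2)          # 13.5
print(round(rho, 3))   # 0.976
```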
Quiz No. 5:
BA1. Compute the Spearman rho between the Weight (X) in grams and
Length (Y) in centimeters of Ampalaya Planted along the Rice Field
Using Chicken Dung as Organic Fertilizer.
Weight (X): 250, 450, 375, 255, 180, 300, 250, 380, 200, 250,
480, 500.
Length (Y): 49, 63, 55, 53, 48, 54, 50, 55, 48, 48, 65, 65.
PA 2. The following table shows the final grades of ten students
in algebra and statistics.
Algebra (X): 77, 84, 68, 98, 71, 87, 65, 93, 80, 75
Statistics (Y): 74, 89, 72, 95, 80, 91, 72, 86, 78, 82
What is the correlation coefficient of Algebra and Statistics?
Interpret your result.
Chi-square (X²)

Chi-square is another inferential statistical tool. It
compares the observed and expected frequencies of
independent variables. The observed frequencies are obtained
by direct observation of the subjects of the
study. The expected frequencies are derived on the basis of
a hypothesis. The null hypothesis (Ho) states the proportion
of objects falling in each category in the presumed
population. The chi-square test determines whether the
observed frequencies are adequately close to
the expected frequencies. This considers the practical and
theoretical importance in a set of observations. The
chi-square test is applicable only in descriptive research.
Meaning of chi-square (X²)

Chi-square is defined as the sum of the squared
differences between the observed and expected frequencies,
each divided by the expected frequency.
Chi-square is a descriptive measure of the discrepancy
between observed and expected frequencies. The larger
the discrepancies, the larger the chi-square value
obtained. If there are no discrepancies between the observed
and expected frequencies, the chi-square value is zero.
Because every term is a square divided by a positive
expected frequency, the chi-square value is never negative.
Formula:
$$ X^2 = \sum \frac{(O - E)^2}{E} $$
Where:
X² = chi-square
O = observed frequency
E = expected frequency
Uses of Chi-square

1. Chi-square is used in descriptive research when the researcher
wishes to determine if observed and expected frequencies
differ significantly or insignificantly for the independent
variable.
2. It is used to compare correlated and uncorrelated proportions.
3. It is used to test the hypothesis that the variance of a normal
population is equal to a given value.
4. It is used to test the goodness of fit when a theoretical
distribution is fitted to some data, for example, the fitting of a
normal curve.
5. It is also used for the construction of a confidence interval for
the variance.
One-Way Classification

Chi-square in one-way classification is applied when the
researcher is interested in the number of subjects,
objects, or responses which fall in various categories. For
instance, the research problem is "Do you agree that
Charter change (Cha-cha) be implemented in the
Philippines?"
The subjects are 300 male politicians and 200 female
politicians. Of the 300 male politicians, 100 said yes, 180
said no, and 20 were undecided. Of the 200 female
politicians, 80 said yes, 110 said no, and 10 were undecided.
To determine if the responses of the male and female
politicians differ significantly or insignificantly,
consider the following:
1. Null hypothesis: There is no significant difference
between the responses of male and female politicians on
whether Cha-cha should be implemented in the Philippines.
2. Statistical tool: chi-square, X² = ∑(O - E)² / E
3. Significance level: let alpha = 0.01
4. Sampling distribution: N = 500 with degrees of
freedom df = (R - 1)(C - 1) = 2
5. Rejection region: the null hypothesis is rejected if the
computed X² value is equal to or greater than the tabular
value with df = 2 at the 0.01 level of significance.
Ho is rejected if CV ≥ TV.
Ho is accepted if CV < TV.
Steps:
1. Solve for the expected frequency (E) of each cell.
2. Find the difference between the observed and expected
frequencies to get (O - E).
3. Square the difference to get (O - E)².
4. Divide (O - E)² by E to get (O - E)² / E.
5. Get the sum of (O - E)² / E to obtain the chi-square value.
6. Solve for the degrees of freedom, df = (R - 1)(C - 1),
then compare the computed X² value with the
tabular value. If the computed value (CV) is equal to or
greater than the tabular value (TV), it is significant.
If CV is less than TV, it is insignificant.
Observed frequencies:
Responses    Male   Female   Total
YES          100    80       180
NO           180    110      290
UNDECIDED    20     10       30
Total        300    200      500


Expected Frequency Computation
E = (row total x column total) / grand total
O      Computation           E
100    (180 x 300) / 500     108
80     (180 x 200) / 500     72
180    (290 x 300) / 500     174
110    (290 x 200) / 500     116
20     (30 x 300) / 500      18
10     (30 x 200) / 500      12
O       E       O - E   (O - E)²   (O - E)² / E
100     108     -8      64         0.592593
180     174     6       36         0.206897
20      18      2       4          0.222222
80      72      8       64         0.888889
110     116     -6      36         0.310345
10      12      -2      4          0.333333
TOTAL   500     0                  X² = 2.554279


Degrees of freedom computation
df = (R - 1)(C - 1)
   = (3 - 1)(2 - 1) = 2
Tabular value at 0.05 is 5.99
Tabular value at 0.01 is 9.21
Computed value is 2.55 (insignificant), so the null hypothesis
is accepted.

If CV ≥ TV, the result is significant.
If CV < TV, the result is insignificant.
INTERPRETATION:
The computed chi-square value is 2.55, which
is less than the tabular value of 9.21 at the 0.01 level of
significance with df = 2; thus, it is insignificant. This
means that the responses of male and female
politicians on whether Charter change (Cha-cha) should be
implemented in the Philippines are almost the same.
Hence the null hypothesis is accepted.
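The whole computation, from expected frequencies to the decision, can be sketched in Python. The function name `chi_square` and the table layout (rows are responses, columns are male and female) are illustrative assumptions; the tabular value 9.21 is the one quoted in the text for df = 2 at the 0.01 level.

```python
def chi_square(observed):
    """Return (chi-square, df) for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (o - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Rows: YES, NO, UNDECIDED; columns: male, female.
responses = [[100, 80], [180, 110], [20, 10]]
chi2, df = chi_square(responses)

TABULAR_VALUE_01 = 9.21  # df = 2, alpha = 0.01, as given in the text
print(round(chi2, 2), df)  # 2.55 2
print("significant" if chi2 >= TABULAR_VALUE_01 else "insignificant")  # insignificant
```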
Examples
1. Compute the chi-square (X²) in a 2 x 2 table
(2 rows and 2 columns) on the Scholastic
Achievement and Performance in the
Licensure Examination for Teachers of
Teacher-Education Graduates in a Certain
University.

Scholastic achievement    Licensure Examination for Teachers performance
                          Pass    Fail    Total
Very Good (85 & above)    150     80      230
Good (84 & below)         100     120     220
Total                     250     200     450

Expected frequencies, E = (row total x column total) / grand total:
O      Computation           E
150    (230 x 250) / 450     127.78
100    (220 x 250) / 450     122.22
80     (230 x 200) / 450     102.22
120    (220 x 200) / 450     97.78

O      E        O - E     (O - E)²   (O - E)² / E
150    127.78   22.22     493.73     3.86
100    122.22   -22.22    493.73     4.04
80     102.22   -22.22    493.73     4.83
120    97.78    22.22     493.73     5.05
X² = 17.78
2. Compute the chi-square in a 3 x 3 table on the Job Performance
and Economic Status of Teachers at the DepEd in
Cagayan.

Job Performance      Economic Status (Class)    Total
Outstanding          20    30    40             90
Very Satisfactory    30    35    50             115
Satisfactory         10    15    20             45
Total                60    80    110            250


Z-test Between Means

The z-test is another inferential statistical tool,
applicable in descriptive research with two variables
(bivariate). For a two-tailed test, tabular values of
2.58 and 1.96 are used to determine the significance
of the z-test value at the 0.01 and 0.05 levels of
significance, respectively.
For illustration purposes, consider the specific
research problem: "Is there a significant difference
in job performance between private and public
school teachers in Tuguegarao City?"
To answer this specific research
problem, consider the following:
1. Null hypothesis: There is no significant difference
in job performance between private and public
school teachers in Tuguegarao City.
Ho: X̄1 - X̄2 = 0
2. Statistical tool: z-test between means
3. Significance level: let alpha (α) = 0.01 (tabular value 2.58)
4. Sampling distribution: N1 = 25, N2 = 25
5. Rejection region: the null hypothesis is rejected if
the computed value (CV) is equal to or greater
than the tabular value (TV) of 2.58 (CV ≥ 2.58)
at the 0.01 level of significance.
6. Compute using the computer.
Steps:
1. Switch on the computer.
2. Wait until the Start menu appears.
3. Click the Start menu, click Programs, then
click Microsoft Excel.
4. Wait until the computer displays the Microsoft
Excel program.
5. Type the data.
6. Highlight the data. Click the Tools menu. Click Data
Analysis.
7. The computer displays the analysis tools. Click z-Test:
Two Sample for Means. Click OK.
8. The computer displays z-Test: Two Sample for
Means. In the input fields, type the following:
   Variable 1 Range: $A1:$A25
   Variable 2 Range: $B1:$B25
   Hypothesized Mean Difference: 0
   Variable 1 Variance (known): 1
   Variable 2 Variance (known): 1.69333
   Alpha: 0.01
9. Click OK.
10. The computer displays the answer.
Data for step 5:
Cell A - 8, 10, 10, 10, 8, 8, 8, 10, 10, 8, 8, 10, 10, 8, 8, 8, 10,
8, 8, 10, 8, 8, 10, 8, 8.
Cell B - 6, 6, 8, 8, 6, 8, 6, 6, 10, 8, 6, 6, 6, 8, 6, 6, 6, 10, 8,
6, 6, 6, 6, 8, 6.
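The same z value can be obtained without a spreadsheet. The sketch below assumes the two known variances typed in step 8 are the sample variances of the two cells (they do come out as 1 and 1.69333), and the names `cell_a`, `cell_b`, and `z_between_means` are illustrative; the text does not say which cell holds the private and which the public school teachers.

```python
from math import sqrt
from statistics import mean, variance

cell_a = [8, 10, 10, 10, 8, 8, 8, 10, 10, 8, 8, 10, 10,
          8, 8, 8, 10, 8, 8, 10, 8, 8, 10, 8, 8]
cell_b = [6, 6, 8, 8, 6, 8, 6, 6, 10, 8, 6, 6, 6,
          8, 6, 6, 6, 10, 8, 6, 6, 6, 6, 8, 6]


def z_between_means(a, b):
    """z = (mean1 - mean2) / sqrt(var1/n1 + var2/n2)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


z = z_between_means(cell_a, cell_b)
TABULAR_VALUE_01 = 2.58  # two-tailed, alpha = 0.01, as given in the text

print(variance(cell_a), round(variance(cell_b), 5))
print(round(z, 2))
print("reject Ho" if abs(z) >= TABULAR_VALUE_01 else "accept Ho")
```

With these data the computed z is well above 2.58, so under the stated rejection rule the null hypothesis would be rejected.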
t-test

The majority of research papers, theses, and
dissertations use the t-test in descriptive research.
The author has read several such studies that used the
t-test for two variables (bivariate), but the t-test is
inappropriate in descriptive research. Perhaps
researchers and advisers are not yet aware of the z-test
between means and the z-test between percentages,
which are more appropriate for bivariate descriptive
research to determine the significant difference
between means and between percentages.
The t-test is another inferential statistical tool,
applicable only to experimental research, for
determining the significant difference between
means. Two statistics are involved in computing
the t-test, namely, the mean and the variance.
Example:
Suppose the researcher wishes to determine
the weight increment of grouper (lapu-lapu)
cultured in fish cages using bread meal and fish
meal as supplemental feeds.
The specific research question is "Is there a
significant difference in the weight increment of
grouper (lapu-lapu) cultured in fish cages using
bread meal and fish meal as supplemental feeds?"
Illustration: To answer the foregoing specific
research question, consider the following:
1. Null hypothesis: There is no significant difference
in the weight increment of grouper (lapu-lapu)
cultured in fish cages using bread meal and fish meal as
supplemental feeds.
2. Statistical tool: t-test
3. Significance level: alpha = 0.01
4. Sampling distribution: n1 = 10, n2 = 10
5. Rejection region: the null hypothesis is rejected if the
computed t-value is equal to or greater than the tabular
value.
6. Computation with the computer:
Steps 1-4. Same as steps 1-4 of the z-test.
5. Type the data.
   Cell A - 0.7, 0.9, 1.0, 0.8, 1.1, 1.0, 1.2, 0.9, 1.2, 1.3
   Cell B - 0.5, 0.7, 0.4, 0.3, 0.6, 0.5, 0.8, 0.4, 0.6, 0.7
6. Highlight the data. Click the Data menu. Click Data Analysis.
7. Click t-Test: Two-Sample Assuming Unequal Variances.
Click OK.
8. The computer displays t-Test: Two-Sample Assuming
Unequal Variances.
   In Variable 1 Range, type $A1:$A10
   In Variable 2 Range, type $B1:$B10
   Alpha: 0.01
9. Click OK.
10. The computer displays the answer under the heading
t-Test: Two-Sample Assuming Unequal Variances.
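For readers without the spreadsheet tool, the unequal-variances t statistic can be sketched in Python. The function name `welch_t` is illustrative, the degrees of freedom use the Welch-Satterthwaite approximation (the usual companion of this test, assumed here rather than stated in the text), and the text does not say which cell is bread meal and which is fish meal.

```python
from math import sqrt
from statistics import mean, variance

cell_a = [0.7, 0.9, 1.0, 0.8, 1.1, 1.0, 1.2, 0.9, 1.2, 1.3]
cell_b = [0.5, 0.7, 0.4, 0.3, 0.6, 0.5, 0.8, 0.4, 0.6, 0.7]


def welch_t(a, b):
    """Return (t, df) for a two-sample t-test assuming unequal variances."""
    va = variance(a) / len(a)   # squared standard error of mean a
    vb = variance(b) / len(b)   # squared standard error of mean b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_value, df = welch_t(cell_a, cell_b)
print(round(t_value, 2), round(df, 1))
```

The computed t is then compared with the tabular t value at the chosen alpha and the resulting degrees of freedom, following the rejection rule in item 5.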
t-test with computer: Paired Two Sample for Means

Same steps as above, with the following changes.
Step 5. Type the data:
Cell A - 90, 85, 78, 95, 88, 90, 93, 78, 86, 79, 87, 89, 88, 78, 94,
77, 90, 82, 81, 91, 83, 90, 88, 87, 79
Cell B - 80, 81, 75, 85, 85, 77, 85, 76, 81, 76, 84, 80, 84, 75, 84,
75, 81, 79, 78, 81, 79, 80, 83, 83, 77
Step 7. Click t-Test: Paired Two Sample for Means.
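A paired t statistic is the mean of the pairwise differences divided by its standard error. The sketch below applies that definition to the two cells; the function name `paired_t` is illustrative, and treating each position in Cell A and Cell B as one matched pair is an assumption, since the text does not describe what the scores measure.

```python
from math import sqrt
from statistics import mean, stdev

cell_a = [90, 85, 78, 95, 88, 90, 93, 78, 86, 79, 87, 89, 88,
          78, 94, 77, 90, 82, 81, 91, 83, 90, 88, 87, 79]
cell_b = [80, 81, 75, 85, 85, 77, 85, 76, 81, 76, 84, 80, 84,
          75, 84, 75, 81, 79, 78, 81, 79, 80, 83, 83, 77]


def paired_t(a, b):
    """Return (t, df) for a paired two-sample t-test on matched values."""
    diffs = [x - y for x, y in zip(a, b)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


t_value, df = paired_t(cell_a, cell_b)
print(round(t_value, 2), df)
```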
F-test or Analysis of Variance (ANOVA)

The F-test or analysis of variance (ANOVA) is an inferential
statistical tool used to determine the significant difference
among the means of three or more variables (multivariate)
collected from experimental research.

The experimental design may be:
- single-group design with different levels;
- parallel-group design, with one control group and two or more
experimental groups;
- two-pair group design, with two control groups and two
experimental groups;
- complete randomized design (CRD), in which one-factor
analysis of variance (ANOVA) is used;
- Latin-square design, in which two-factor ANOVA is used; and
- randomized complete block design (RCBD), in which two-factor
ANOVA is used.
F-test or ANOVA Single Factor

F-test single-factor analysis of variance (ANOVA)
involves one independent variable as the basis for
classification. This is usually applied in single-group
design and complete randomized design (CRD). The
significance of the difference between means can be
tested using F-test single-factor ANOVA with the computer.
Example: Computation of F-test or ANOVA Single
Factor on the Effect of Chicken Dung as Organic
Fertilizer upon the Yield of Tomatoes Planted in
Plots
Steps 1-4. Same as steps 1-4 of the z-test.
Step 5. Type the data:
Cell A - 5, 4, 3, 2, 1
Cell B - 8, 7, 6, 5, 4
Cell C - 10, 9, 8, 7, 6
Step 6. Highlight the data and open Data Analysis, as before.
Step 7. Click ANOVA: Single Factor.
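The single-factor F value for the tomato-yield data can also be computed directly. This is a minimal sketch: the function name `one_way_anova` is illustrative, and treating each cell as one treatment is an assumption in line with the example title.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for single-factor ANOVA."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-groups (treatment) sum of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups (error) sum of squares.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


cell_a = [5, 4, 3, 2, 1]
cell_b = [8, 7, 6, 5, 4]
cell_c = [10, 9, 8, 7, 6]

f_value, df_between, df_within = one_way_anova([cell_a, cell_b, cell_c])
print(round(f_value, 2), df_between, df_within)  # 12.67 2 12
```

The F value is then compared with the tabular F (F crit) at the chosen alpha with 2 and 12 degrees of freedom.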

In the summary output displayed by the computer:
- Groups refers to the treatments;
- Column 1 is treatment 1;
- Column 2 is treatment 2;
- Column 3 is treatment 3;
- Count is the number of cases;
- Sum is the total;
- Average is the mean.
In the ANOVA table:
- Between Groups stands for treatment;
- Within Groups stands for error;
- F crit is the tabular F value.
F-test Two-Factor or ANOVA Two-Factor

F-test two-factor or ANOVA two-factor involves two
bases of classification, the rows and the columns of the
data table, with three or more groups being compared.
F-test two-factor or ANOVA two-factor
is appropriate for the parallel-group design.
In this design, three or more groups are used at the
same time, with only one variable manipulated or
changed; one of the groups serves as the control group.
Example:
Suppose the researcher wishes to conduct a study on
the flavor acceptability of luncheon meat from
commercial, milkfish bone meal, and goatfish bone
meal.
The specific research problem is "Is there a significant
difference in the flavor acceptability of luncheon meat
from commercial, milkfish bone meal, and goatfish
bone meal?"
Commercial - Cell A:
8, 8, 7, 7, 8, 8, 8, 7, 9, 8, 9, 8, 8, 7, 8, 8, 7, 7, 7, 8
Milkfish bone meal - Cell B:
8, 9, 8, 8, 9, 8, 9, 9, 9, 8, 9, 9, 9, 8, 9, 8, 8, 8, 8, 8
Goatfish bone meal - Cell C:
7, 7, 6, 6, 8, 7, 7, 6, 8, 7, 7, 7, 7, 6, 7, 7, 6, 7, 6, 7
Scale:
9 - like extremely
8 - like very much
7 - like moderately
6 - like slightly
Same steps 1-7 as before.
Step 8. Click ANOVA: Two-Factor Without Replication.

The values displayed by the computer are the same as the values
computed with the use of a calculator. Each row stands for a panelist;
Column 1, commercial luncheon meat;
Column 2, milkfish bone meal luncheon meat;
Column 3, goatfish bone meal luncheon meat.

In the ANOVA table, Rows means panelists and Columns means
samples.
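The two-factor computation without replication can be sketched in Python, with panelists as rows and the three luncheon meat samples as columns. The function name `two_factor_anova` and the dictionary keys are illustrative choices; the data are the three cells listed for this example.

```python
def two_factor_anova(table):
    """Two-factor ANOVA without replication; table is a list of rows."""
    r, c = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(col) / r for col in zip(*table)]
    ss_rows = c * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    df_rows, df_cols = r - 1, c - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "F_rows": (ss_rows / df_rows) / ms_error,      # panelists
        "F_columns": (ss_cols / df_cols) / ms_error,   # samples
        "df": (df_rows, df_cols, df_error),
    }


commercial = [8, 8, 7, 7, 8, 8, 8, 7, 9, 8, 9, 8, 8, 7, 8, 8, 7, 7, 7, 8]
milkfish = [8, 9, 8, 8, 9, 8, 9, 9, 9, 8, 9, 9, 9, 8, 9, 8, 8, 8, 8, 8]
goatfish = [7, 7, 6, 6, 8, 7, 7, 6, 8, 7, 7, 7, 7, 6, 7, 7, 6, 7, 6, 7]

# One row per panelist, one column per sample.
scores = [list(row) for row in zip(commercial, milkfish, goatfish)]
result = two_factor_anova(scores)
print(round(result["F_rows"], 2), round(result["F_columns"], 2), result["df"])
```

Each F value is then compared with its tabular F (F crit) at the chosen alpha.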
