
Measures of Correlation Defined

Correlation statistics, specifically the Pearson product-moment correlation
coefficient (\( r_{xy} \)), are employed when analyzing scatter plots to identify
non-random trends in data points. This coefficient measures the strength and direction
of the linear relationship between two sets of variables, X and Y.

Bivariate data, representing paired measurements of X and Y, are examined for
correlation, which indicates how changes in one variable are associated with changes
in the other. Positive correlation occurs when both variables increase or decrease
together, while negative correlation occurs when one variable increases while the
other decreases. A zero correlation implies no linear relationship between the variables.

In essence, correlation statistics provide a quantitative measure of the degree of
association between two sets of variables, aiding in the interpretation of trends
observed in scatter plots.

Measures of correlation are used in both descriptive and experimental research.
Some examples of descriptive research on correlation are as follows:

1. Correlation Between Achievement and Economic Status of Grade 12 Senior
High School Students
2. I.Q. and Personality Relationship of Grade 10 Junior High School Students
3. Capital and Profit Relationship of Income Generating Projects of Grade 11
Senior High School Students
4. Correlation Between Mathematics and English Achievements of Grade 9
Junior High School Students
5. Correlation Between High and Low Achievers of Grade 10 Junior High School
Students

Some examples of correlation in experimental research are the following:

1. Weight-Length Relationship of Mudcrab (Alimango) Cultured in the Backyard
Fishpond Using Bread Meal as Supplemental Feed
2. The Height-Weight Relationship of Bottle-Fed Infants Using the Same Milk
Brand
3. Weight-Length Relationship of Tilapia Cultured in Backyard Fishpond Using
Trash Fish as Supplemental Feed
4. Weight-Length Relationship of Eggplant Planted in Pots Using Chicken Dung
as Fertilizer
5. Weight-Length Relationship of Cucumber Planted in Pots Using Night Soil as
Fertilizer

Correlation measures range from -1 to +1 and indicate the degree of association
between two sets of variables. A value of -1 signifies a perfect negative
correlation, +1 indicates a perfect positive correlation, and 0 denotes no correlation.
A perfect positive correlation, which rarely occurs, means that all individual
performances in variables X and Y align in the same direction. For instance, a student
who scores highest in Test X (Mathematics) also scores highest in Test Y (English),
and vice versa. Conversely, a negative correlation implies that as one variable
increases, the other decreases, reflecting an inverse relationship. A correlation of 0
suggests no linear relationship between the variables.

A perfect negative correlation, represented by a value of -1, is rare
and indicates that the scores of one variable (e.g., Test X) are exact
opposites of the scores of the other variable (e.g., Test Y). As one
variable increases, the other decreases consistently. For instance,
a student who scores highest in Test X (Mathematics) will score
lowest in Test Y (English), and vice versa.
While correlation calculations typically assume a linear relationship, it’s important to note that
relationships may be curvilinear or nonlinear. Therefore, examining scatter plots is crucial to
determine if the relationship between two variables follows a linear pattern or exhibits another form
of association.
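
A small Python sketch may make this point concrete. It uses made-up data in which Y is completely determined by X along a curve (Y equals X squared), and computes r with the standard raw-score formula. The function name pearson_r and the data are assumptions for illustration only.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed from raw sums of X, Y, XY, X squared and Y squared."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: a perfect curvilinear relationship, Y = X squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r_curve = pearson_r(xs, ys)
print(r_curve)  # 0.0, even though Y is fully determined by X
```

Here r is zero because the rising and falling halves of the curve cancel out in the linear measure, which is why the scatter plot should be inspected before r is interpreted.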

Interpretation of Correlation Value


To interpret the correlation value, the classifications are as follows:

An r from 0.00 to ± 0.20 denotes negligible correlation or relationship

An r from ± 0.21 to ± 0.40 means low or slight relationship or correlation

An r from ± 0.41 to ± 0.70 represents marked or moderate relationship

An r from ± 0.71 to ± 0.90 signifies high relationship or correlation

An r from ± 0.91 to ± 0.99 denotes very high relationship or correlation

An r of ± 1.00 means perfect correlation or relationship
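
The classification above can be written as a small lookup. The sketch below is only one possible reading of the list: the function name interpret_r is hypothetical, and it assumes that a value falling between two listed bands (for example 0.205) belongs to the higher band.

```python
def interpret_r(r):
    """Return the verbal label for a correlation value, using the bands listed above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # the sign gives direction; the label depends on magnitude only
    if size == 1.0:
        return "perfect"
    if size <= 0.20:
        return "negligible"
    if size <= 0.40:
        return "low or slight"
    if size <= 0.70:
        return "marked or moderate"
    if size <= 0.90:
        return "high"
    return "very high"

print(interpret_r(0.95))   # very high
print(interpret_r(-0.55))  # marked or moderate
```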

Pearson Product-Moment Correlation Coefficient

The Pearson product-moment correlation coefficient (\( r_{xy} \)) is a measure of linear
correlation used to find the degree of association of two sets of variables, X and Y.
This is the most commonly used measure of linear correlation to determine the
relationship between two sets of variables quantitatively. To obtain the value of r
from ungrouped data, the formula is as follows (Guilford, 1973).

\[
r_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}
\]

where:

r xy = Correlation between X and Y

ΣX = Sum of Test X

ΣY = Sum of Test Y

ΣXY = Sum of the product of X and Y

N = Number of Cases

ΣX² = Sum of squared X scores

ΣY² = Sum of squared Y scores

The steps in computing the Pearson Product-Moment Correlation Coefficient (\( r_{xy} \))
are as follows:

Step 1. Find the sum of X and Y
Step 2. Square all X and Y values
Step 3. Sum X² and Y² to get ΣX² and ΣY²
Step 4. Multiply X and Y
Step 5. Get the sum of the product XY to get ΣXY
Step 6. Apply the formula
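
The six steps can be followed one by one in a short Python sketch. The function name pearson_steps and the five pairs of scores are made up for illustration; the function returns the intermediate sums so that each step can be checked by hand.

```python
import math

def pearson_steps(xs, ys):
    """Follow Steps 1 to 6 and return the intermediate sums together with r."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)                    # Step 1: sum of X and of Y
    x_squared = [x * x for x in xs]                    # Step 2: square every X
    y_squared = [y * y for y in ys]                    #         and every Y
    sum_x2, sum_y2 = sum(x_squared), sum(y_squared)    # Step 3: sums of the squares
    products = [x * y for x, y in zip(xs, ys)]         # Step 4: multiply X and Y
    sum_xy = sum(products)                             # Step 5: sum of the products
    numerator = n * sum_xy - sum_x * sum_y             # Step 6: apply the formula
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return {
        "N": n, "sum_x": sum_x, "sum_y": sum_y,
        "sum_x2": sum_x2, "sum_y2": sum_y2, "sum_xy": sum_xy,
        "r": numerator / denominator,
    }

# Hypothetical scores of five students, made up for this sketch.
result = pearson_steps([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result["sum_xy"], round(result["r"], 4))  # 66 0.7746
```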

Illustration 1 (Descriptive Research)
For illustration purposes, suppose the researcher wishes to determine the
relationship between Mathematics (X) and English (Y) scores taken by Grade 10
junior high school students in a certain high school. Below are fictitious data of
Mathematics and English scores to illustrate Pearson product-moment correlation
in descriptive research. Table 1 shows the computation of the said data.

Table 1. Computation of Pearson Product-Moment Correlation Coefficient between Mathematics
(X) and English (Y) Scores Taken by Grade 10 Junior High School Students in a Certain High
School (Fictitious Data)

Students X Y X² Y² XY
1 25 30 625 900 750
2 42 53 1764 2809 2226
3 18 21 324 441 378
4 30 30 900 900 900
5 54 53 2916 2809 2862
6 27 29 729 841 783
7 55 60 3025 3600 3300
8 48 42 2304 1764 2016
9 27 31 729 961 837
10 58 60 3364 3600 3480
11 57 72 3249 5184 4104
12 32 31 1024 961 992
13 44 50 1936 2500 2200
14 28 29 784 841 812
15 60 73 3600 5329 4380
TOTAL 605 664 27273 33440 30020

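Given:
N = 15
ΣX = 605
ΣY = 664
ΣXY = 30020
ΣX² = 27273
ΣY² = 33440

\[
\begin{aligned}
r_{xy} &= \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \\
&= \frac{15(30020) - (605)(664)}{\sqrt{\left[15(27273) - (605)^2\right]\left[15(33440) - (664)^2\right]}} \\
&= \frac{450300 - 401720}{\sqrt{\left[409095 - 366025\right]\left[501600 - 440896\right]}} \\
&= \frac{48580}{\sqrt{(43070)(60704)}} \\
&= \frac{48580}{\sqrt{2614521280}} \\
&= \frac{48580}{51132.38973} \\
&= 0.95 \text{ (very high relationship)}
\end{aligned}
\]
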
The correlation value obtained is 0.95, a very high relationship. This means students
who got very high scores in Mathematics also got very high scores in English, and those
who got very low scores in Mathematics also got very low scores in English.
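
The arithmetic for Illustration 1 can be checked directly from the TOTAL row of Table 1. The Python lines below are only a verification sketch; the variable names are chosen here for readability and are not part of the original computation.

```python
import math

# Totals taken from the TOTAL row of Table 1 (15 students).
n = 15
sum_x, sum_y = 605, 664
sum_x2, sum_y2 = 27273, 33440
sum_xy = 30020

numerator = n * sum_xy - sum_x * sum_y   # 450300 - 401720
x_term = n * sum_x2 - sum_x ** 2         # 409095 - 366025
y_term = n * sum_y2 - sum_y ** 2         # 501600 - 440896
r_xy = numerator / math.sqrt(x_term * y_term)

print(numerator, x_term, y_term)  # 48580 43070 60704
print(round(r_xy, 2))             # 0.95
```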

Illustration 2 (Experimental Research)


Suppose the researcher wishes to conduct a study on the weight (X) and length (Y)
relationship of mudcrab (alimango) cultured at the backyard fishpond using bread
meal as supplemental feed. Table 2 presents the computation of the said data.

Table 2. Computation of Pearson Product-Moment Correlation Coefficient on the Weight (X) and
Length (Y) Relationship of Mudcrab (Alimango) Cultured at the Backyard Fishpond Using Bread
Meal as Supplemental Feed

Mudcrab X (kg) Y (m) X² Y² XY
1 0.12 0.23 0.0144 0.0529 0.0276
2 0.25 0.31 0.0625 0.0961 0.0775
3 0.53 0.42 0.2809 0.1764 0.2226
4 0.57 0.51 0.3249 0.2601 0.2907
5 1.00 0.61 1.0000 0.3721 0.6100
6 0.75 0.53 0.5625 0.2809 0.3975
7 0.32 0.35 0.1024 0.1225 0.1120
8 0.44 0.42 0.1936 0.1764 0.1848
9 0.81 0.55 0.6561 0.3025 0.4455
10 0.98 0.61 0.9604 0.3721 0.5978
TOTAL 5.77 4.54 4.1577 2.2120 2.9660
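
Given:
N = 10
ΣX = 5.77
ΣY = 4.54
ΣXY = 2.9660
ΣX² = 4.1577
ΣY² = 2.2120

\[
\begin{aligned}
r_{xy} &= \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \\
&= \frac{10(2.9660) - (5.77)(4.54)}{\sqrt{\left[10(4.1577) - (5.77)^2\right]\left[10(2.2120) - (4.54)^2\right]}} \\
&= \frac{29.66 - 26.1958}{\sqrt{\left[41.577 - 33.2929\right]\left[22.12 - 20.6116\right]}} \\
&= \frac{3.4642}{\sqrt{(8.2841)(1.5084)}} \\
&= \frac{3.4642}{\sqrt{12.49573644}} \\
&= \frac{3.4642}{3.534931} \\
&= 0.98 \text{ (very high relationship)}
\end{aligned}
\]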


The correlation value obtained is 0.98, a very high relationship. This means that the
heavier the mudcrab, the longer it is, and the lighter the mudcrab, the shorter it is.
In other words, as the weight increases, the length also increases.

Spearman Rank Correlation Coefficient or Spearman rho
Spearman’s rho (rs) correlation coefficient measures the relationship between paired
ranks assigned to individual scores of sets of variables. It’s used when variables are
ranked on an ordinal scale (like first, second, third) rather than measured on a
continuous scale. This correlation coefficient indicates how well the ranks of one
variable correspond to the ranks of another.
It ranges from -1 to +1, where a value of +1 indicates a perfect positive relationship
(ranks increase together), -1 indicates a perfect negative relationship (ranks move in
opposite directions), and 0 means no relationship. If the value exceeds 1 or falls below
-1, there’s an error in the calculation.

To get the value of Spearman rho (rs), consider the formula below:

\[
r_s = 1 - \frac{6\sum D^2}{N^3 - N}
\]

where:
rs = Spearman rho
ΣD² = Sum of the squared differences between ranks
N = Number of cases
To apply the above formula, the steps are as follows:
Step 1. Rank the values in the first set (X) from highest to lowest
and mark them Rx. The highest value is given the rank of 1; the second, 2;
the third, 3; and so on.
Step 2. Rank the second set of values (Y) in the same way as in Step 1 and
mark them Ry.
Step 3. Get the rank difference (D) of Rx and Ry.
Step 4. Square each rank difference to get D².
Step 5. Sum the squared differences to get ΣD².
Step 6. Compute Spearman rho (rs) by applying the formula.
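
The six steps can be sketched in Python as follows. The function names average_ranks and spearman_rho and the five pairs of values are assumptions made for this example; tied values, if any, are given the average of the ranks they would otherwise occupy.

```python
def average_ranks(values):
    """Rank from highest (rank 1) to lowest; tied values share the mean of their ranks."""
    ordered = sorted(values, reverse=True)
    rank_of = {}
    for value in set(values):
        first = ordered.index(value) + 1           # first position held by this value
        count = ordered.count(value)               # how many times the value occurs
        rank_of[value] = first + (count - 1) / 2   # mean of first .. first + count - 1
    return [rank_of[v] for v in values]

def spearman_rho(xs, ys):
    """Spearman rho from the rank-difference formula, following Steps 1 to 6."""
    n = len(xs)
    rx = average_ranks(xs)                         # Step 1
    ry = average_ranks(ys)                         # Step 2
    differences = [a - b for a, b in zip(rx, ry)]  # Step 3
    squared = [d * d for d in differences]         # Step 4
    sum_d2 = sum(squared)                          # Step 5
    return 1 - 6 * sum_d2 / (n ** 3 - n)           # Step 6

# Hypothetical data for five cases, made up for this sketch.
rho = spearman_rho([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(round(rho, 2))  # 0.8
```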

Spearman rho (rs) is also applicable in both descriptive and experimental research.

Illustration 1 (Descriptive Research)
Suppose the Grade 10 junior high school students want to find out the rank
relationship between capital and profit of boneless bangus as their income-generating
project. For illustration purposes, consider the computation of the following data
using the Spearman rho.

Table 3. Computation of Spearman rho Between Capital (X) and Profit (Y) of Boneless Bangus
(fictitious data)

Operation X Y Rx Ry D D²
1 P 1,000 P 500 6.5 5.5 1 1
2 1,500 550 3.5 4 0.5 0.25
3 900 340 9 9 0 0
4 1,000 400 6.5 7.5 1 1
5 800 270 10 10 0 0
6 1,350 500 5 5.5 0.5 0.25
7 2,000 930 1 1 0 0
8 1,900 900 2 2 0 0
9 1,500 600 3.5 3 0.5 0.25
10 950 400 8 7.5 0.5 0.25
TOTAL 3.00

Arrange from highest to lowest

X Temporary Rank Rx Y Temporary Rank Ry
2,000 1 1 930 1 1
1,900 2 2 900 2 2
1,500 3 3.5 600 3 3
1,500 4 3.5 550 4 4
1,350 5 5 500 5 5.5
1,000 6 6.5 500 6 5.5
1,000 7 6.5 400 7 7.5
950 8 8 400 8 7.5
900 9 9 340 9 9
800 10 10 270 10 10

If there are tied values, their final rank is the average of their temporary ranks.
For instance, 1,500 occurs twice, with temporary ranks of 3 and 4. To get the final
rank, add the temporary ranks and divide by 2, because there are only two values of
1,500. Thus, (3 + 4)/2 = 7/2 = 3.5. As another example, there are two values of 1,000
with temporary ranks of 6 and 7; hence, (6 + 7)/2 = 13/2 = 6.5, and so on.
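
The tie rule can be illustrated with a short Python sketch that first assigns temporary ranks and then averages them. The function name final_ranks is hypothetical; the capital values are those of column X in Table 3.

```python
from collections import defaultdict

def final_ranks(values):
    """Give temporary ranks 1..N from highest to lowest, then average them over ties."""
    ordered = sorted(values, reverse=True)
    temporary = defaultdict(list)
    for position, value in enumerate(ordered, start=1):
        temporary[value].append(position)   # for example, 1500 collects [3, 4]
    return {value: sum(ranks) / len(ranks) for value, ranks in temporary.items()}

# Capital (X) values from Table 3.
capital = [1000, 1500, 900, 1000, 800, 1350, 2000, 1900, 1500, 950]
ranks = final_ranks(capital)

print(ranks[1500])  # 3.5, the average of temporary ranks 3 and 4
print(ranks[1000])  # 6.5, the average of temporary ranks 6 and 7
```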
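
Given:
ΣD² = 3
N = 10

\[
\begin{aligned}
r_s &= 1 - \frac{6\sum D^2}{N^3 - N} \\
&= 1 - \frac{6(3)}{10^3 - 10} \\
&= 1 - \frac{18}{1000 - 10} \\
&= 1 - \frac{18}{990} \\
&= 1 - 0.0181818 \\
&= 0.98 \text{ (very high relationship)}
\end{aligned}
\]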

The Spearman rho (rs) value obtained is 0.98, a very high relationship. This means if
capital is high, the profit is also high, and if capital is low, the profit is also low. In
other words, profit increases if capital also increases.

Illustration 2 (Experimental Research)


Suppose the researcher wishes to conduct a study on the weight (X) and length (Y)
relationship of radish planted in pots using chicken dung as fertilizer. For illustration
purposes, consider the computation of Spearman rho (rs) on the weight (X) and length
(Y) relationship of radish planted in pots using chicken dung as fertilizer as shown
in Table 4.

Table 4. Computation of Spearman rho on the Weight (X) and Length (Y)
Relationship of Radish Planted in Pots Using Chicken Dung as Fertilizer

Radish X (g) Y (cm) Rx Ry D D²
1 150 32 5 4.5 0.5 0.25
2 100 28 9 8 1 1
3 200 35 3 3 0 0
4 100 25 9 10 1 1
5 200 32 3 4.5 1.5 2.25
6 130 29 7 6.5 0.5 0.25
7 100 27 9 9 0 0
8 250 40 1 1.5 0.5 0.25
9 140 29 6 6.5 0.5 0.25
10 200 40 3 1.5 1.5 2.25
TOTAL 7.50

Arrange from highest to lowest

X Temporary Rank Rx Y Temporary Rank Ry
250 1 1 40 1 1.5
200 2 3 40 2 1.5
200 3 3 35 3 3
200 4 3 32 4 4.5
150 5 5 32 5 4.5
140 6 6 29 6 6.5
130 7 7 29 7 6.5
100 8 9 28 8 8
100 9 9 27 9 9
100 10 9 25 10 10

To get the rank of three tied values, for instance 200, add the temporary ranks
and divide by 3. The temporary ranks of 200 are 2, 3, and 4. So, (2 + 3 + 4)/3 = 9/3 = 3.
Another example is 100, which also occurs three times. To get its rank, add the
temporary ranks of 8, 9, and 10, then divide by 3. Hence, (8 + 9 + 10)/3 = 27/3 = 9.

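Given:
ΣD² = 7.50
N = 10

\[
\begin{aligned}
r_s &= 1 - \frac{6\sum D^2}{N^3 - N} \\
&= 1 - \frac{6(7.50)}{10^3 - 10} \\
&= 1 - \frac{45}{1000 - 10} \\
&= 1 - \frac{45}{990} \\
&= 1 - 0.045454545 \\
&= 0.95 \text{ (very high relationship)}
\end{aligned}
\]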
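
As a check on Illustration 2, the Rx and Ry columns of Table 4 can be fed straight into the rank-difference formula. The Python lines below are a verification sketch only; the variable names are chosen here for readability.

```python
# Final ranks copied from the Rx and Ry columns of Table 4 (radish 1 to 10).
rx = [5, 9, 3, 9, 3, 7, 9, 1, 6, 3]
ry = [4.5, 8, 3, 10, 4.5, 6.5, 9, 1.5, 6.5, 1.5]

n = len(rx)
squared_differences = [(a - b) ** 2 for a, b in zip(rx, ry)]
sum_d2 = sum(squared_differences)
r_s = 1 - 6 * sum_d2 / (n ** 3 - n)

print(sum_d2)         # 7.5
print(round(r_s, 2))  # 0.95
```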

Hypothesis Testing

To illustrate the use of the Pearson product-moment correlation in hypothesis testing,
let us consider the data below (see Table 5): age (X) and self-efficacy (Y). The
hypothesis testing should be performed with the following steps:
Step 1. Ho: There is no significant correlation between the age and self-efficacy
of the Grade 10 Junior High School Students at Maydolong National High
School.
H1: The age and self-efficacy of the Grade 10 Junior High School
Students have a direct significant correlation.
Step 2. Level of measurement: Interval
Step 3. Level of significance: α = 0.05 or 5%
Step 4. Use the two-tailed Pearson correlation.
Step 5. Calculate the test statistic.
The data obtained on the age and self-efficacy are as follows:

Table 5. Computation of Pearson Product-Moment Correlation Coefficient between Age (X) and
Self-efficacy (Y) of Grade 10 Junior High School Students at Maydolong National High School

Students X Y X² Y² XY
1 25 29 625 841 725
2 38 27 1444 729 1026
3 21 33 441 1089 693
4 22 28 484 784 616
5 23 27 529 729 621
6 24 31 576 961 744
7 36 34 1296 1156 1224
8 29 37 841 1369 1073
9 30 35 900 1225 1050
10 28 33 784 1089 924
11 38 32 1444 1024 1216
12 37 33 1369 1089 1221
13 45 28 2025 784 1260
14 30 36 900 1296 1080
TOTAL 426 443 13658 14165 13473

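Substituting these values into the formula:

Given:
N = 14
ΣX = 426
ΣY = 443
ΣXY = 13473
ΣX² = 13658
ΣY² = 14165

\[
\begin{aligned}
r_{xy} &= \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}} \\
&= \frac{14(13473) - (426)(443)}{\sqrt{\left[14(13658) - (426)^2\right]\left[14(14165) - (443)^2\right]}} \\
&= \frac{188622 - 188718}{\sqrt{\left[191212 - 181476\right]\left[198310 - 196249\right]}} \\
&= \frac{-96}{\sqrt{(9736)(2061)}} \\
&= \frac{-96}{\sqrt{20065896}} \\
&= \frac{-96}{4479.4973} \\
&= -0.021430976
\end{aligned}
\]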
Step 6. Obtain the critical value of r. First, the degrees of freedom (df) are
computed, and then the critical value is obtained from a table of the
r distribution at the specified level of significance and df. The df is
computed by the formula df = N - 2, so df = 14 - 2 = 12.
From an appropriate table, the critical value of r at df = 12 and
α = 0.05 (two-tailed) is 0.532.
Step 7. Since the absolute value of the computed r (-0.021) is less than its
corresponding critical r value (0.532), the null hypothesis is accepted
(not rejected).
Step 8. Interpretation: There is no significant correlation between the age
and self-efficacy of the Grade 10 Junior High School Students at
Maydolong National High School. In other words, the self-efficacy score
of the female Grade 10 junior high school students does not depend on
the age of the student.
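
Steps 5 to 7 can be retraced with a short Python sketch using the X and Y columns of Table 5. The variable names are chosen for this example, and the critical value 0.532 is simply the table value quoted in Step 6, not something the code derives.

```python
import math

# Age (X) and self-efficacy (Y) of the 14 students in Table 5.
ages = [25, 38, 21, 22, 23, 24, 36, 29, 30, 28, 38, 37, 45, 30]
self_efficacy = [29, 27, 33, 28, 27, 31, 34, 37, 35, 33, 32, 33, 28, 36]

n = len(ages)
sum_x, sum_y = sum(ages), sum(self_efficacy)
sum_xy = sum(x * y for x, y in zip(ages, self_efficacy))
sum_x2 = sum(x * x for x in ages)
sum_y2 = sum(y * y for y in self_efficacy)

# Step 5: compute r with the raw-score formula.
numerator = n * sum_xy - sum_x * sum_y
r = numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

# Step 6: degrees of freedom and the tabled critical value (assumed from the text).
df = n - 2
critical_r = 0.532

# Step 7: a two-tailed test compares the absolute value of r with the critical value.
reject_null = abs(r) > critical_r

print(round(r, 3), df, reject_null)  # -0.021 12 False
```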
