Correlational Analysis Pearson R and Spearman's Rank
Correlational analysis involves assessing the relationship between two variables. The prefix co- means together, hence
correlation is a term used to describe the strength and direction of the relationship between two quantitative variables. Direction is
indicated by the sign of the r value, − or +. Strength is indicated by the numeric value. There are two types of
Positive correlation means that the two variables move in the same direction: as one
variable increases, the other increases, and vice versa. Negative correlation, on the other hand, occurs when
one variable increases as the other decreases, and vice versa. Aside from these two, there is also the term “no
correlation”, in which the other variable tends neither to increase nor to decrease as the first variable changes.
[Figure: three scatterplots of y against x, labeled Negative correlation, No correlation, and Positive correlation.]
Pearson r correlation measures the linear relationship between two paired variables. A relationship is said to be linear when
-1 ≤ r ≤ 1
Furthermore:
The closer the value is to 1 or -1, the stronger the linear correlation.
For example, Pearson r correlation might be used to evaluate whether increases in temperature at your production
facility are associated with decreasing thickness of your chocolate coating. Pearson r correlation is used to
measure the degree of relationship between the two. The following formula is used to calculate the Pearson r
correlation:
$$r_{xy} = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n\sum x_i^2 - \left(\sum x_i\right)^2}\,\sqrt{n\sum y_i^2 - \left(\sum y_i\right)^2}}$$
n = number of observations
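The Pearson r formula can be checked by hand or with a few lines of code. The following is a minimal Python sketch, assuming the paired observations are held in two equal-length lists. The function name `pearson_r` and the sample values are hypothetical, invented only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson r from the sum-based formula: n*Sxy - Sx*Sy over the two square roots."""
    if len(x) != len(y):
        raise ValueError("x and y must be paired (same length)")
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return numerator / denominator

# Hypothetical paired observations (not taken from the handout).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4))  # prints 0.7746
```

Here the numerator is 5(66) − (15)(20) = 30 and the denominator is √50 · √30 ≈ 38.73, so r ≈ 0.77, a positive correlation.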
Correlation is an effect size and so we can verbally describe the strength of the correlation using the guide that
Is there a statistically significant relationship between age, as measured in years, and height, measured in
inches?
Is there a relationship between temperature, measured in degrees Fahrenheit, and ice cream sales,
measured by income?
Is there a relationship between job satisfaction, as measured by the JSS, and income, measured in
dollars?
Assumptions
The calculation of Pearson’s correlation coefficient and subsequent testing of it requires the following data
assumption to hold:
Linearly related;
REMEMBER: If your data do not meet the above assumptions, then use Spearman’s rank correlation.
Spearman’s correlation
Monotonic Function
Before understanding Spearman’s correlation, it is important to know first about the “monotonic function”. A
monotonic function is one that either never increases or never decreases as its independent variable increases.
[Figure: three plots of y against x.]
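The definition of a monotonic function can be turned into a simple check. The sketch below assumes the y values are listed in order of increasing x. The helper name `is_monotonic` and the sample sequences are hypothetical.

```python
def is_monotonic(values):
    """Return True if the sequence never decreases or never increases."""
    pairs = list(zip(values, values[1:]))  # consecutive (previous, next) pairs
    never_decreases = all(a <= b for a, b in pairs)
    never_increases = all(a >= b for a, b in pairs)
    return never_decreases or never_increases

# y values listed in order of increasing x (invented for illustration).
print(is_monotonic([1, 2, 2, 5, 9]))  # True: never decreases
print(is_monotonic([9, 7, 7, 3]))     # True: never increases
print(is_monotonic([1, 4, 2, 6]))     # False: goes up, then down, then up
```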
Spearman rank correlation coefficient evaluates the monotonic relationship between two continuous or ordinal
-1 ≤ rs ≤ 1
The Spearman rank correlation focuses on the ranked values for each variable rather than the raw data. Its
interpretation is the same as Pearson’s, i.e., the closer rs is to ±1, the stronger the monotonic relationship.
For example, Spearman’s correlation might be used to evaluate whether the order in which employees complete a
test exercise is related to the number of months they have been employed. The following formula is used to
calculate the Spearman rank correlation coefficient:
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
n= number of observations
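In the Spearman formula, d_i is the difference between the two ranks of observation i. The following is a minimal Python sketch of the calculation. The function names and the sample data are hypothetical, and giving tied values the average of their ranks is a common convention assumed here. When ties are present, this formula is only an approximation.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman coefficient from rank differences: 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical data: months employed and order of finishing a test exercise.
months = [10, 20, 30, 40, 50]
finish_order = [1, 3, 2, 5, 4]
rho = spearman_rho(months, finish_order)
print(rho)  # prints 0.8
```

The rank differences are 0, −1, 1, −1, 1, so the sum of d_i² is 4 and ρ = 1 − 24/120 = 0.8.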
Correlation is an effect size and so we can verbally describe the strength of the correlation using the guide that
.20-.39 “weak”
.40-.59 “moderate”
.60-.79 “strong”
Is there a statistically significant relationship between participants’ level of education (high school,
Is there a statistically significant relationship between a horse’s finishing position in a race and the horse’s age?
Assumptions
The calculation of Spearman’s correlation coefficient and subsequent testing of it requires the following data
assumption to hold:
Monotonically related.
NOTE: Unlike Pearson r correlation, there is no requirement of normality and hence it is a nonparametric
statistic.
REMEMBER: It is always necessary to study the relationship between variables with a scatterplot.
The Pearson and Spearman correlation coefficients can range in value from −1 to +1. For the Pearson
correlation coefficient to be +1, when one variable increases, the other variable must increase by a consistent
amount. This relationship forms a perfect line. The Spearman correlation coefficient is also +1 in this case.
Pearson = +1, Spearman = +1
If the relationship is that one variable increases when the other increases, but the amount is not consistent, the
Pearson correlation coefficient is positive but less than +1. The Spearman coefficient still equals +1 in this case.
When a relationship is random or non-existent, then both correlation coefficients are nearly zero.
If the relationship is that one variable decreases when the other increases, but the amount is not consistent, then
the Pearson correlation coefficient is negative but greater than −1. The Spearman coefficient still equals −1 in
this case.
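The contrast between the two coefficients can be illustrated with a small sketch. It assumes a hypothetical curved but monotonic relationship, y = x³ and its negative, with no tied values. All names and data are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson r from the sum-based formula."""
    n = len(x)
    num = n * sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y)
    den = (math.sqrt(n * sum(a * a for a in x) - sum(x) ** 2)
           * math.sqrt(n * sum(b * b for b in y) - sum(y) ** 2))
    return num / den

def ranks(values):
    """Ranks from 1 (smallest); this simple version assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman_rho(x, y):
    """Spearman coefficient from rank differences (no ties assumed)."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

x = [1, 2, 3, 4, 5, 6]
y_up = [v ** 3 for v in x]       # increases with x, but not by a consistent amount
y_down = [-(v ** 3) for v in x]  # decreases with x, but not by a consistent amount

print(pearson_r(x, y_up), spearman_rho(x, y_up))      # Pearson below +1, Spearman 1.0
print(pearson_r(x, y_down), spearman_rho(x, y_down))  # Pearson above -1, Spearman -1.0
```

Because ranking discards the size of each step and keeps only the order, Spearman reaches ±1 for any strictly monotonic relationship, while Pearson reaches ±1 only when the points fall on a straight line.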
Correlation values of −1 or 1 imply an exact linear relationship, like that between a circle's radius and
circumference. However, the real value of correlation values is in quantifying less than perfect relationships.
Finding that two variables are correlated often informs a regression analysis, which tries to describe this type of
relationship in more detail.
Courtesy: Minitab Express Support
1. The correlation between temperature and number of ice cream cones bought is the same whether the temperature is
a. True
b. False
2. The correlation between two sets of numbers is the same as the correlation between the log of those two sets of
numbers.
a. True
b. False
a. -1.5
b. -1
c. 0.99
4. Which is higher, the correlation between height and weight or the correlation between weight and height?
[Figure: plot of Y against X.]
a. Positive correlation
b. Negative correlation
c. No correlation
6. The data shows the relation between a company’s production and its employees’ salaries in 5 years.
QUESTION: Find the Spearman’s correlation coefficient between production and salaries.
b. Discrete data
c. Continuous data
8. Find the Spearman’s correlation coefficient between sales and advertising from the given data.
a. $\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$
b. $\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 2)}$
c. $\rho = 1 - \frac{6\sum d_i^3}{n(n^2 - 1)}$
10. The following table represents the relation between sale and profit for six models of televisions.
11. Find Spearman’s correlation coefficient between x and y. Round your answer to three decimal places.
X 4 7 8 5 8 12
Y 7 6 6 4 6 10
12. In a study of the relation between students’ grades in mathematics and science, the following results
MATHEMATICS D B A B D D
SCIENCE C C B A C F
QUESTION: Find the Spearman’s correlation coefficient. Round your answer to three decimal places.
13. The following table represents the relation between the results of employees’ appraisal this year and last
year.
14. In a study to discover the relationship between the age of a mother and the number of her children, the
MOTHER’S AGE 19 22 24 28 29 32 34 35
NUMBER OF CHILDREN 2 1 1 2 3 4 3 5
QUESTION: Find Spearman’s correlation coefficient. Round your answer to three decimal places.
15. The table shows the relation between two variables, L AND M. Find Spearman’s correlation coefficient
L 50 90 75 150 63 35 75 90 50
M 130 130 64 80 100 55 80 100 80
16. Using the information from the table, find the Spearman’s rank correlation coefficient and determine the
type of correlation between the age of a mother and number of children. Give the numerical part of your
AGE OF MOTHERS (YEARS) 24 27 35 24 35 21 35 33
NO. OF CHILDREN (n) 1 4 3 5 3 1 2 4
17. Using the information given in the table, find the Spearman’s rank correlation between the variables x
18. Using the information from the table, find the Spearman’s rank correlation coefficient and determine the
correlation between the variables x and y. Give the numerical part of your answer to four decimal places.
X 14 9 10 13 6 9
Y 21 20 23 16 20 16
a. 0.2714, inverse correlation
a. r= 1.08
b. r= 0.95
c. r= 0
d. r= 1.0
20. Compute the value of Pearson r correlation coefficient for the data below:
X -2 -5 3
Y 7 -1 2
a. r= 0.002
b. r=0.235
c. r= -0.002
d. r= -0.235