
QUANTITATIVE METHODS

PANALIGAN HANZEL
Correlation Analysis

➢ Correlation analysis in research is a statistical method used to measure the strength of the linear relationship between two variables and to quantify their association. Simply put, correlation analysis calculates how much one variable changes as the other changes. A high correlation points to a strong relationship between the two variables, while a low correlation means that the variables are weakly related.
➢ There is a positive correlation between two variables when an increase in one variable leads to an increase in the other. Conversely, a negative correlation means that when one variable increases, the other decreases, and vice versa.

Example of correlation analysis

The correlation between two variables can be a positive correlation, a negative correlation, or no correlation. Let's look at examples of each of these three types; a short simulation sketch follows the list:

• Positive correlation: A positive correlation between two variables means both variables move in the same direction. An increase in one variable leads to an increase in the other variable, and vice versa.
For example, spending more time on a treadmill burns more calories.
• Negative correlation: A negative correlation between two variables means that
the variables move in opposite directions. An increase in one variable leads to a
decrease in the other variable and vice versa.
For example, increasing the speed of a vehicle decreases the time you take to
reach your destination.
• Weak/Zero correlation: No correlation exists when one variable does not affect
the other.
For example, there is no correlation between the number of years of school a person has attended and the number of letters in his/her name.
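
As a rough illustration of these three patterns, the following sketch (hypothetical, simulated data using numpy) generates a positively related pair, a negatively related pair, and an unrelated pair of variables, then prints the Pearson correlation coefficient for each:

    import numpy as np

    rng = np.random.default_rng(0)

    # Positive correlation: calories burned rise with minutes spent on a treadmill
    minutes_on_treadmill = rng.uniform(10, 60, size=200)
    calories_burned = 8 * minutes_on_treadmill + rng.normal(0, 20, size=200)

    # Negative correlation: travel time falls as vehicle speed rises (fixed 100 km trip)
    speed_kmh = rng.uniform(40, 120, size=200)
    travel_time_h = 100 / speed_kmh + rng.normal(0, 0.05, size=200)

    # No correlation: years of schooling vs. number of letters in a person's name
    years_of_school = rng.integers(6, 20, size=200)
    letters_in_name = rng.integers(3, 12, size=200)

    print(np.corrcoef(minutes_on_treadmill, calories_burned)[0, 1])  # close to +1
    print(np.corrcoef(speed_kmh, travel_time_h)[0, 1])               # strongly negative
    print(np.corrcoef(years_of_school, letters_in_name)[0, 1])       # near zero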

➢ Correlation analysis involves examining the relationship between two variables. The presence of correlation should not be interpreted as meaning causation. Two variables can be highly correlated and yet not be related in a cause-and-effect manner. For example, one could take the number of ice cream cones sold in a city and correlate it with the number of drowning deaths. If a direct correlation is discovered, does this mean that drowning deaths could be reduced by cutting down the number of ice cream cones sold? Of course not; there must be some other variable, such as hot summer weather, that drives both ice cream sales and drowning deaths.
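
A small simulation (hypothetical numbers, not real data) makes the point concrete: a lurking third variable such as daily temperature can drive both ice cream sales and drowning deaths, producing a strong correlation between two quantities that have no causal link to each other.

    import numpy as np

    rng = np.random.default_rng(1)
    temperature = rng.uniform(10, 35, size=365)  # the lurking variable

    # Both quantities depend on temperature, not on each other
    ice_cream_sales = 50 * temperature + rng.normal(0, 100, size=365)
    drowning_deaths = 0.3 * temperature + rng.normal(0, 1.5, size=365)

    # A strong correlation appears even though neither variable causes the other
    print(np.corrcoef(ice_cream_sales, drowning_deaths)[0, 1])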
➢ How to Determine the Form of Correlation

The form of a correlation is determined by examining the correlation value. When the variables are metric, correlation is examined through the value for Pearson's r. Values for Pearson's r range from –1.00 to +1.00, with a value of +1.00 indicating a perfect positive correlation. For example, if studying and test scores were correlated at +1.00, then increases in studying hours would always result in increased test scores. It is extremely rare to find a perfect correlation. Conversely, a value of –1.00 would indicate that increases in studying hours would always result in decreases in test scores. A value of zero means that there is no correlation.
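
One way to see where the value comes from is to compute Pearson's r directly from its definitional formula: the sum of the products of deviations divided by the square root of the product of the sums of squared deviations. The sketch below uses made-up study-hours and test-score data:

    import numpy as np

    def pearson_r(x, y):
        # Pearson product-moment correlation from the definitional formula
        x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
        dx, dy = x - x.mean(), y - y.mean()
        return (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

    # Hypothetical studying hours and test scores
    hours = [1, 2, 3, 4, 5, 6, 7, 8]
    scores = [52, 55, 61, 60, 68, 72, 75, 80]

    print(pearson_r(hours, scores))  # strong positive value, close to +1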

The strength of a correlation increases as the value for Pearson's r approaches 1.00, regardless of whether the sign is positive or negative. The strength of correlation is interpreted as follows (a small helper sketch follows this scale):

Less than .20, the correlation is slight: an almost negligible relationship

.20-.40, the correlation is low: a definite but small relationship

.40-.70, the correlation is moderate: a substantial relationship

.70-.90, the correlation is high: a marked relationship

.90-1.00, the correlation is very high: a very dependable relationship
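
A small helper function (a sketch that simply follows the cut-offs listed above) can translate a Pearson's r value into these descriptive labels:

    def describe_strength(r):
        # Verbal label for the strength of a correlation coefficient
        magnitude = abs(r)
        if magnitude < 0.20:
            return "slight, almost negligible relationship"
        elif magnitude < 0.40:
            return "low, definite but small relationship"
        elif magnitude < 0.70:
            return "moderate, substantial relationship"
        elif magnitude < 0.90:
            return "high, marked relationship"
        else:
            return "very high, very dependable relationship"

    print(describe_strength(-0.55))  # moderate, substantial relationship
    print(describe_strength(0.93))   # very high, very dependable relationship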

If the correlation is positive, the relationship is said to be direct. In this case, as one variable increases, the other variable will increase as well.

If the correlation is negative, the relationship is said to be inverse. In this case, as one variable increases, the other variable will decrease.

Once a value for Pearson's r is calculated, the next step is to determine the significance of the Pearson's r value. This testing is done to determine whether the correlation can be inferred back to the general population. When testing for the significance of the Pearson's r value, we are testing the null hypothesis that there is no statistically significant relationship between the two variables and that any perceived relationship is due to chance or sampling error. The null hypothesis is written as follows:

H0: ρ = 0 (there is no linear relationship between the two variables in the population)
H1: ρ ≠ 0

The calculated value for Pearson's r is compared with a table of known critical values, and if the calculated value is greater than the table value, then the null hypothesis is rejected. If the calculated value is less than the table value, then the null hypothesis is not rejected (or, as some prefer to indicate, we fail to reject the null hypothesis).
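
As a sketch of this testing step (assuming a two-tailed test at the .05 level and hypothetical paired data), the code below computes the t statistic that corresponds to a Pearson's r value, t = r * sqrt(n - 2) / sqrt(1 - r^2), compares it with the critical value from the t distribution with n - 2 degrees of freedom, and also shows the equivalent p-value reported by scipy.stats.pearsonr:

    import numpy as np
    from scipy import stats

    # Hypothetical paired measurements
    x = np.array([2, 4, 5, 7, 8, 10, 11, 13])
    y = np.array([10, 14, 13, 18, 19, 24, 23, 28])

    r, p_value = stats.pearsonr(x, y)  # r and its two-tailed p-value
    n = len(x)

    # Equivalent hand calculation: t statistic with n - 2 degrees of freedom
    t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
    t_crit = stats.t.ppf(1 - 0.05 / 2, df=n - 2)  # critical value at alpha = .05

    if abs(t_stat) > t_crit:
        print(f"r = {r:.3f}, t = {t_stat:.2f} > {t_crit:.2f}: reject the null hypothesis")
    else:
        print(f"r = {r:.3f}, t = {t_stat:.2f} <= {t_crit:.2f}: fail to reject the null hypothesis")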

The following requirements must be met when using Pearson's r:

• subjects have been randomly selected

• the data are normally distributed

• the association between the variables is linear

• the measurements are at least interval data

If the measurements are not at least interval data, and are instead ordinal data (or dichotomous dummy-coded variables), then the statistical technique of choice is Spearman's rho. Spearman's rho is a nonparametric statistical technique, meaning that it does not require normal distributions or data measured on an interval or ratio scale.

Spearman's rho is interpreted in much the same manner as Pearson's r, and the significance of the correlation is expressed in the same manner. The null hypothesis is that there is no relationship between the variables and that any perceived relationship is due to chance or sampling error. The hypotheses are written as follows:

H0: ρs = 0 (there is no relationship between the two variables)
H1: ρs ≠ 0
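
A minimal sketch of computing and testing Spearman's rho (hypothetical ordinal data, using scipy.stats.spearmanr):

    from scipy import stats

    # Hypothetical ordinal data: two raters ranking the same eight essays
    rater_a = [1, 2, 3, 4, 5, 6, 7, 8]
    rater_b = [2, 1, 4, 3, 6, 5, 8, 7]

    rho, p_value = stats.spearmanr(rater_a, rater_b)
    print(f"Spearman's rho = {rho:.3f}, p = {p_value:.4f}")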
➢ In correlation analysis, we estimate a sample correlation coefficient, more specifically the Pearson product-moment correlation coefficient. The sample correlation coefficient, denoted r, ranges between –1 and +1 and quantifies the direction and strength of the linear association between the two variables. The correlation between two variables can be positive (i.e., higher levels of one variable are associated with higher levels of the other) or negative (i.e., higher levels of one variable are associated with lower levels of the other).

The sign of the correlation coefficient indicates the direction of the association. The magnitude
of the correlation coefficient indicates the strength of the association.

Example - Correlation of Gestational Age and Birth Weight


A small study is conducted involving 17 infants to investigate the association between gestational age at birth, measured in weeks, and birth
weight, measured in grams.

Infant ID #   Gestational Age (weeks)   Birth Weight (grams)
1             34.7                      1895
2             36.0                      2030
3             29.3                      1440
4             40.1                      2835
5             35.7                      3090
6             42.4                      3827
7             40.3                      3260
8             37.3                      2690
9             40.9                      3285
10            38.3                      2920
11            38.5                      3430
12            41.4                      3657
13            39.7                      3685
14            39.7                      3345
15            41.1                      3260
16            38.0                      2680
17            38.7                      2005

We wish to estimate the association between gestational age and infant birth weight. In this example, birth weight is the dependent variable and gestational age is the independent variable. Thus y = birth weight and x = gestational age. The data are displayed in a scatter diagram in the figure below.
Each point represents an (x, y) pair (in this case the gestational age, measured in weeks, and the birth weight, measured in grams). Note that the independent variable (gestational age) is on the horizontal axis (or X-axis), and the dependent variable (birth weight) is on the vertical axis (or Y-axis). The scatter plot shows a positive, or direct, association between gestational age and birth weight: infants with shorter gestational ages are more likely to be born with lower weights, and infants with longer gestational ages are more likely to be born with higher weights.
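
To reproduce this example, the sketch below (using numpy and matplotlib, with the data from the table above) computes the sample correlation coefficient for the 17 infants and draws the scatter diagram just described; the coefficient comes out strongly positive:

    import numpy as np
    import matplotlib.pyplot as plt

    gestational_age = np.array([34.7, 36.0, 29.3, 40.1, 35.7, 42.4, 40.3, 37.3, 40.9,
                                38.3, 38.5, 41.4, 39.7, 39.7, 41.1, 38.0, 38.7])
    birth_weight = np.array([1895, 2030, 1440, 2835, 3090, 3827, 3260, 2690, 3285,
                             2920, 3430, 3657, 3685, 3345, 3260, 2680, 2005])

    r = np.corrcoef(gestational_age, birth_weight)[0, 1]
    print(f"Sample correlation coefficient r = {r:.2f}")  # strongly positive

    plt.scatter(gestational_age, birth_weight)
    plt.xlabel("Gestational age (weeks)")   # independent variable on the X-axis
    plt.ylabel("Birth weight (grams)")      # dependent variable on the Y-axis
    plt.title("Gestational Age vs. Birth Weight (n = 17)")
    plt.show()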
