Assumptions
• Assumption #1: Your two variables should be measured at the interval or ratio level (i.e., they are continuous). Examples of variables that meet this criterion include revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight (measured in kg), and so forth. You can learn more about interval and ratio variables in our Types of Variable guide.
• Assumption #2: There is a linear relationship between your two variables. Whilst there are a number of ways to check whether a linear relationship exists between your two variables, we suggest creating a scatterplot using SPSS Statistics, where you can plot one variable against the other, and then visually inspect the scatterplot to check for linearity.
• Assumption #3: There should be no significant outliers. Outliers are simply single data points within your data that do not follow the usual pattern (e.g., in a study of 100 students’ IQ scores, where the mean score was 108 with only a small variation between students, one student had a score of 156, which is very unusual, and may even put her in the top 1% of IQ scores globally).
• Assumption #4: Your variables should be approximately normally distributed. To test for normality you can use the Shapiro-Wilk test of normality. If the significance (p) value of the Shapiro-Wilk test is greater than 0.05, the data can be considered approximately normal.

One-tailed vs two-tailed test
• A one-tailed test should be selected when you have a directional hypothesis (e.g. ‘the more anxious someone is about an exam, the worse their mark will be’).
• A two-tailed test (the default) should be used when you cannot predict the nature of the relationship (i.e. ‘I’m not sure whether exam anxiety will improve or reduce exam marks’).
• Therefore, if you have a directional hypothesis click on 1-tailed, whereas if you have a non-directional hypothesis click on 2-tailed.
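The notes above describe these checks in SPSS Statistics; as a rough cross-check, the same Shapiro-Wilk test and a one-tailed Pearson correlation can be run in Python with scipy. This is a sketch only: the revision-time and exam-mark figures below are invented for illustration, not taken from the datasets mentioned in these notes, and the `alternative` argument requires SciPy 1.9 or later.

```python
# Sketch: Shapiro-Wilk normality check (Assumption #4) and a one-tailed
# Pearson correlation for a directional hypothesis. Data are invented.
from scipy import stats

revision_hours = [2, 5, 8, 11, 14, 17, 20, 23, 26, 29]
exam_marks = [35, 41, 50, 48, 60, 65, 70, 68, 80, 85]

# Shapiro-Wilk test for each variable: if p > 0.05 we do not reject
# the assumption of normality.
for name, data in [("revision", revision_hours), ("marks", exam_marks)]:
    w, p = stats.shapiro(data)
    print(f"{name}: W = {w:.3f}, p = {p:.3f}")

# Directional hypothesis ("more revision -> higher marks"), so a
# one-tailed test with alternative='greater' (SciPy >= 1.9).
r, p_one_tailed = stats.pearsonr(revision_hours, exam_marks,
                                 alternative="greater")
print(f"r = {r:.2f}, one-tailed p = {p_one_tailed:.3f}")
```

If the hypothesis were in the other direction (e.g. anxiety lowering marks), `alternative="less"` would be the one-tailed choice instead.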
Example
• Our researcher predicted that (1) as anxiety increases, exam performance will decrease, and (2) as the time spent revising increases, exam performance will increase. Both of these are directional hypotheses, so both tests are one-tailed.

How to Interpret a Correlation Coefficient r using temprate.sav (these data relate people's body temperatures and heart rates)
• Exactly −1. A perfect downhill (negative) linear relationship
• −0.70. A strong downhill (negative) linear relationship
• −0.50. A moderate downhill (negative) linear relationship
• −0.30. A weak downhill (negative) linear relationship
• 0. No linear relationship
• +0.30. A weak uphill (positive) linear relationship
• +0.50. A moderate uphill (positive) linear relationship
• +0.70. A strong uphill (positive) linear relationship
• Exactly +1. A perfect uphill (positive) linear relationship

Coefficient of determination, R²
• A measure of the amount of variability in one variable that is shared by the other.
• For example, we may look at the relationship between exam anxiety and exam performance. Exam performances vary from person to person because of any number of factors (different ability, different levels of preparation and so on). If we add up all of this variability (rather like when we calculated the sum of squares in section 2.4.1) then we would have an estimate of how much variability exists in exam performances. We can then use R² to tell us how much of this variability is shared by exam anxiety. These two variables had a correlation of −0.4410, and so the value of R² will be (−0.4410)² = 0.194. This value tells us how much of the variability in exam performance is shared by exam anxiety.
• If we convert this value into a percentage (multiply by 100) we can say that exam anxiety shares 19.4% of the variability in exam performance. So, although exam anxiety was highly correlated with exam performance, it can account for only 19.4% of the variation in exam scores.
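The R² arithmetic above is simply the square of the correlation coefficient, which can be checked in a couple of lines (the value −0.4410 is the exam anxiety correlation quoted above):

```python
# Coefficient of determination from the correlation quoted in the notes:
# exam anxiety and exam performance correlated at r = -0.4410.
r = -0.4410
r_squared = r ** 2
print(round(r_squared, 3))        # proportion of shared variability -> 0.194
print(round(r_squared * 100, 1))  # as a percentage -> 19.4
```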
To put this value into perspective, this leaves 80.6% of the variability still to be accounted for by other variables.

Partial Correlation
• Partial correlation is a measure of the strength and direction of a linear relationship between two continuous variables whilst controlling for the effect of one or more other continuous variables (also known as 'covariates' or 'control' variables).

Assumptions
• Assumption #1: You have one (dependent) variable and one (independent) variable and these are both measured on a continuous scale (i.e., they are measured on an interval or ratio scale). Examples of continuous variables include revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight (measured in kg), temperature (measured in °C), sales (measured in US dollars), and so forth.
• Assumption #2: You have one or more control variables, also known as covariates (i.e., control variables are just variables that you are using to adjust the relationship between the other two variables; that is, your dependent and independent variables). These control variables are also measured on a continuous scale (i.e., they are continuous variables). Examples of continuous variables are provided above.
• Assumption #3: There needs to be a linear relationship between all three variables. That is, all possible pairs of variables must show a linear relationship. This is often checked by visually inspecting a scatterplot.
• Assumption #4: There should be no significant outliers. Outliers are simply single data points within your data that do not follow the usual pattern. Partial correlation is sensitive to outliers, which can have a very large effect on the line of best fit and the correlation coefficient, leading to incorrect conclusions regarding your data. Therefore, it is best if there are no outliers or they are kept to a minimum.
• Assumption #5: Your variables should be approximately normally distributed.
This can be tested using the Shapiro-Wilk test of normality, which is easily run in SPSS Statistics.

How to report correlation coefficients
• Five things to note are that:
(1) there should be no zero before the decimal point for the correlation coefficient or the probability value (because neither can exceed 1);
(2) coefficients are reported to 2 decimal places;
(3) if you are quoting a one-tailed probability, you should say so;
(4) each correlation coefficient is represented by a different letter (and some of them are Greek!);
(5) there are standard criteria of probabilities that we use (.05, .01 and .001).

Example
• There was a significant relationship between the number of adverts watched and the number of packets of sweets purchased, r = .87, p (one-tailed) < .05.
• Exam performance was significantly correlated with exam anxiety, r = −.44, and time spent revising, r = .40; the time spent revising was also correlated with exam anxiety, r = −.71 (all ps < .001).
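In SPSS Statistics a partial correlation is obtained directly from the Partial Correlations procedure; as a sketch of what that computation involves, a first-order partial correlation (one control variable) can be built from the three pairwise Pearson correlations using the standard formula. The toy data below are invented for illustration and are not the book's exam example:

```python
# Sketch: first-order partial correlation between exam performance (x)
# and anxiety (y), controlling for revision time (z). Data are invented.
from math import sqrt
from scipy import stats

exam_perf = [40, 55, 45, 70, 65, 80, 60, 75, 50, 85]   # x
anxiety = [85, 70, 80, 50, 60, 40, 65, 45, 75, 30]     # y
revision = [5, 10, 8, 15, 12, 20, 11, 18, 7, 22]       # z (covariate)

# Zero-order (pairwise) Pearson correlations.
r_xy = stats.pearsonr(exam_perf, anxiety)[0]
r_xz = stats.pearsonr(exam_perf, revision)[0]
r_yz = stats.pearsonr(anxiety, revision)[0]

# Standard first-order partial correlation formula:
# r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2))
r_xy_z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))
print(f"zero-order r = {r_xy:.2f}, partial r = {r_xy_z:.2f}")
```

Comparing the zero-order and partial coefficients shows how much of the x–y relationship remains once the covariate's influence is removed, which is exactly what the SPSS output table reports.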