Linear Regression and Correlation
Example 1
Consider the following data on body weight and plasma volume of eight healthy men. The objective of the analysis is to see whether a change in plasma volume is associated with a change in body weight.
Example
Subject   Body weight (kg)   Plasma volume (l)
1         58.0               2.75
2         70.0               2.86
3         74.0               3.37
4         63.5               2.76
5         62.0               2.62
6         70.5               3.49
7         71.0               3.05
8         66.0               3.12
SCATTER DIAGRAM
Two related variables are plotted on a graph as points or dots.
Each point on the diagram represents a pair of values, one on the X-scale and the other on the Y-scale.
The scatter diagram is the first step in investigating the relationship between two variables.
The diagram shows visually the shape and degree of closeness of the relationship.
Values on the X-scale refer to the explanatory (independent) variable, and values on the Y-scale refer to the response (dependent) variable. In situations where it is not clear which is the dependent variable, the choice of axes is arbitrary.
Is there a trend?
[Figure: scatter diagram; y-axis: plasma volume]
LINEAR REGRESSION
The relationship seen in the scatter diagram can be summarized by a line drawn through the scatter of points. Any straight line drawn on a graph can be represented by the equation y = a + bx, where y refers to the values of the dependent variable and x to the values of the explanatory (independent) variable.
The constant 'a' is the intercept, the point at which the line crosses the y-axis, i.e. the value of y when x = 0.
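The constant 'b' is the slope of the line, calculated as

$$ b = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2} $$

Numerator = $\sum xy - (\sum x)(\sum y)/n$
Denominator = $\sum x^2 - (\sum x)^2/n$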
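The two ways of writing the numerator and denominator above can be checked against each other numerically. The following is a minimal sketch using the Example 1 data; the function names `slope_from_deviations` and `slope_from_sums` are illustrative and not part of the original slides.

```python
# Example 1 data: body weight (kg) and plasma volume (l) for eight men.
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]


def slope_from_deviations(xs, ys):
    """Slope b from deviations about the means (definition form)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def slope_from_sums(xs, ys):
    """Slope b from raw sums (computational form)."""
    n = len(xs)
    numerator = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    denominator = sum(x * x for x in xs) - sum(xs) ** 2 / n
    return numerator / denominator


b_dev = slope_from_deviations(weight, plasma)
b_sum = slope_from_sums(weight, plasma)
print(f"deviation form: {b_dev:.6f}")
print(f"sum form:       {b_sum:.6f}")
```

Both forms should agree up to floating-point rounding; the sum form is the one that is convenient for hand calculation.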
$$ a = \bar{y} - b\bar{x} $$

where $\bar{y} = \sum y / n$ and $\bar{x} = \sum x / n$.
The resultant line is called the regression line, which estimates the average value of y for a given value of x.
Example
$$ b = \frac{1615.295 - (535)(24.02)/8}{35983.5 - (535)^2/8} = \frac{8.96}{205.38} = 0.043615 $$

and

$$ a = 3.0025 - 0.043615 \times 66.875 = 0.0857 $$
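The arithmetic in this worked example can be reproduced step by step. The sketch below recomputes the intermediate sums from the Example 1 data and then the slope and intercept; the variable names are illustrative only.

```python
# Example 1 data: body weight (kg) and plasma volume (l).
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]

n = len(weight)
sum_x = sum(weight)                                   # sum of x
sum_y = sum(plasma)                                   # sum of y
sum_xy = sum(x * y for x, y in zip(weight, plasma))   # sum of x*y
sum_xx = sum(x * x for x in weight)                   # sum of x squared

# Slope: numerator and denominator in the computational (raw-sum) form.
numerator = sum_xy - sum_x * sum_y / n
denominator = sum_xx - sum_x ** 2 / n
b = numerator / denominator

# Intercept: mean of y minus slope times mean of x.
a = sum_y / n - b * sum_x / n

print(f"numerator = {numerator:.2f}, denominator = {denominator:.2f}")
print(f"b = {b:.6f}, a = {a:.4f}")
```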
Example 1
Regression line is given by:
Plasma volume = 0.09 + 0.04 x body weight
Interpretation of slope: for every 1 kg increase in body weight, there is on average a corresponding increase of 0.04 l in plasma volume.
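To illustrate how the fitted line is used, here is a small sketch that estimates the average plasma volume for a given body weight. It assumes the unrounded coefficients from the worked example (a = 0.0857, b = 0.043615); the function name `predict_plasma_volume` and the weights 70 kg and 71 kg are illustrative choices, not from the slides.

```python
# Coefficients taken from the worked example (unrounded values).
A_INTERCEPT = 0.0857    # litres
B_SLOPE = 0.043615      # litres per kg


def predict_plasma_volume(body_weight_kg):
    """Estimated average plasma volume (l) for a given body weight (kg)."""
    return A_INTERCEPT + B_SLOPE * body_weight_kg


# A 1 kg difference in body weight changes the estimate by exactly the slope.
at_70 = predict_plasma_volume(70.0)
at_71 = predict_plasma_volume(71.0)
print(f"estimate at 70 kg: {at_70:.2f} l")
print(f"estimate at 71 kg: {at_71:.2f} l")
print(f"difference:        {at_71 - at_70:.3f} l")
```

The estimate is an average for men of that weight, and it is only meaningful within the range of body weights observed in the data (58 to 74 kg).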
[Figure: Example 1 plot; y-axis: plasma volume]
CORRELATION
Linear regression gives a straight line with which to summarize the relationship between two variables, but it does not tell how closely the data lie on that line. The (Pearson's) correlation coefficient, r, measures the closeness (strength) of the linear association.
Formula for r
The correlation coefficient is calculated as
$$ r = \frac{\sum xy - (\sum x)(\sum y)/n}{\sqrt{\left[\sum x^2 - (\sum x)^2/n\right]\left[\sum y^2 - (\sum y)^2/n\right]}} $$
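As a check on the formula, the following sketch computes Pearson's r for the Example 1 data using the raw-sum form. The function name `pearson_r` is illustrative, and the value it prints is computed here rather than taken from the slides.

```python
import math

# Example 1 data: body weight (kg) and plasma volume (l).
weight = [58.0, 70.0, 74.0, 63.5, 62.0, 70.5, 71.0, 66.0]
plasma = [2.75, 2.86, 3.37, 2.76, 2.62, 3.49, 3.05, 3.12]


def pearson_r(xs, ys):
    """Pearson's correlation coefficient from raw sums."""
    n = len(xs)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(weight, plasma)
print(f"r = {r:.2f}")
```

The numerator is the same quantity as the numerator of the slope b, so r and b always share the same sign.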
Correlation Coefficient of 0
r = 0: no correlation
There is no linear relationship between X and Y.
Alternatively, there may be a relationship, but it is nonlinear.
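A small constructed example may help with the nonlinear case. The data below are hypothetical (not from the slides): y is exactly x squared, so y depends completely on x, yet Pearson's r comes out as zero because the relationship is symmetric rather than linear.

```python
import math

# Hypothetical data: y is fully determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]


def pearson_r(xs, ys):
    """Pearson's correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(xs, ys)
print(f"r = {r:.2f}")
```

This is why a scatter diagram should be examined before r = 0 is read as "no relationship".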
[Figure: scatter diagram showing no correlation]