Assignment 2 - Advanced Statistics
1.
Covariance: It measures the linear dependency between two variables, and its value depends
on the scale (units) in which the variables are measured.
There is no fixed range for the value of covariance; it can be any real number.
Correlation: It also measures the dependency between two variables, but here
there is a fixed range for the value, i.e. -1 to +1, regardless of the units.
If the correlation tends to -1, the variables are negatively correlated. If it tends to +1,
they are positively correlated.
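The difference in scale dependence can be illustrated with a small sketch. The height and weight values below are hypothetical, chosen only for illustration, and the helper functions `covariance` and `correlation` are written here by hand using the sample (n - 1) definition.

```python
import math

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical data: heights in metres and weights in kilograms.
height_m = [1.50, 1.60, 1.70, 1.80, 1.90]
weight_kg = [50.0, 58.0, 65.0, 77.0, 85.0]

# The same heights expressed in centimetres (a change of units only).
height_cm = [h * 100 for h in height_m]

cov_m = covariance(height_m, weight_kg)
cov_cm = covariance(height_cm, weight_kg)
corr_m = correlation(height_m, weight_kg)
corr_cm = correlation(height_cm, weight_kg)

print("covariance  (m, cm):", cov_m, cov_cm)    # differs by a factor of 100
print("correlation (m, cm):", corr_m, corr_cm)  # unchanged by the rescaling
```

Changing the units multiplies the covariance by the conversion factor, while the correlation stays the same and remains inside -1 to +1.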
Sum(B) = 52 + 10 + 5 + 98 + 52 + 36 + 69 = 322
Mean(B) = 322 / 7 = 46
Covariance(A,B) = 550.5
σ(A) = 28.82376
σ(B) = √1063.67 = 32.6139 (sample standard deviation, dividing by n - 1 = 6)
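As a check on the arithmetic, the figures for B can be recomputed from the listed values, and the correlation follows from the stated covariance and standard deviations. This is a minimal sketch: the values of A are not listed in this section, so Covariance(A,B) and σ(A) are taken as given rather than recomputed.

```python
import statistics

# Values of B as listed in the sum above.
B = [52, 10, 5, 98, 52, 36, 69]

mean_b = statistics.mean(B)      # 322 / 7
var_b = statistics.variance(B)   # sample variance (divides by n - 1)
std_b = statistics.stdev(B)      # square root of the sample variance

# Taken from the text as given; A's raw values are not available here.
cov_ab = 550.5
std_a = 28.82376

# Pearson correlation = covariance / (sigma_A * sigma_B)
r = cov_ab / (std_a * std_b)

print("Mean(B)    =", mean_b)
print("Var(B)     =", round(var_b, 2))
print("sigma(B)   =", round(std_b, 4))
print("Corr(A, B) =", round(r, 4))
```

With these inputs the correlation comes out at roughly 0.586, a moderate positive relationship.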
The standard and simplest way to avoid multicollinearity is to remove the variables identified
as collinear.
But there are scenarios when we need to retain some of these variables (which are linearly
dependent) in our final training set for building the model. In such cases, we may consider a
linear combination of these variables as a new variable and drop all the identified variables.
Suppose we identify a strong correlation between the variables X1 and X2. Normally we
would discard one of them, but the business considers both variables important and wants
them to be used. In such a case, we may drop X1 and X2 and introduce a new
variable Z = X1 + X2 (or any other linear combination).
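The replacement step can be sketched as follows. The feature values, the extra column X3, the 0.9 threshold, and the `pearson` helper are all assumptions made for illustration; they do not come from the assignment.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical feature columns; X2 is close to twice X1.
x1 = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
x2 = [21.0, 25.0, 29.0, 40.0, 47.0, 61.0]
x3 = [5.0, 3.0, 8.0, 1.0, 7.0, 2.0]

THRESHOLD = 0.9  # assumed cutoff for "strongly correlated"

r12 = pearson(x1, x2)
if abs(r12) > THRESHOLD:
    # Drop X1 and X2, keep their linear combination Z = X1 + X2.
    z = [a + b for a, b in zip(x1, x2)]
    features = {"Z": z, "X3": x3}
else:
    features = {"X1": x1, "X2": x2, "X3": x3}

print("corr(X1, X2) =", round(r12, 3))
print("columns kept:", list(features))
```

A plain sum works when X1 and X2 are on comparable scales. If they are not, standardizing them first or using a weighted combination may be a better choice.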