
Assignment -2 (Statistics Advance)

Sol 1. Mean(A) = (25 + 35 + 21 + 67 + 98 + 27 + 64) ÷ 7 = 337 ÷ 7 = 48.143 (approx.)

Mean(B) = (52 + 10 + 5 + 98 + 52 + 36 + 69) ÷ 7 = 322 ÷ 7 = 46

Covariance(A, B) = Σ(Aᵢ − Ā)(Bᵢ − B̄) ÷ (n − 1)

= [(25 − 48.143)(52 − 46) + (35 − 48.143)(10 − 46) + (21 − 48.143)(5 − 46) + (67 − 48.143)(98 − 46) + (98 − 48.143)(52 − 46) + (27 − 48.143)(36 − 46) + (64 − 48.143)(69 − 46)] ÷ 6

Covariance(A, B) = 550.5
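This hand calculation can be double-checked with a short NumPy sketch (assuming NumPy is available); np.cov uses the same sample (n − 1) denominator as the formula above, and the arrays are simply the A and B values from the question.

```python
import numpy as np

A = np.array([25, 35, 21, 67, 98, 27, 64], dtype=float)
B = np.array([52, 10, 5, 98, 52, 36, 69], dtype=float)

print(A.mean(), B.mean())   # ~48.143 and 46.0, matching the means above

# np.cov defaults to the sample (n - 1) denominator, so [0, 1] is Cov(A, B).
print(np.cov(A, B)[0, 1])   # ~550.5
```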

Now, for the correlation we also need the sample standard deviations of A and B:

σ(A) = √( Σ(Aᵢ − Ā)² ÷ (N − 1) )

= √( [(25 − 48.143)² + (35 − 48.143)² + (21 − 48.143)² + (67 − 48.143)² + (98 − 48.143)² + (27 − 48.143)² + (64 − 48.143)²] ÷ (7 − 1) )

= √830.809 (approx.)

σ(A) = 28.82376

σ(B) = √( Σ(Bᵢ − B̄)² ÷ (N − 1) )

= √( [(52 − 46)² + (10 − 46)² + (5 − 46)² + (98 − 46)² + (52 − 46)² + (36 − 46)² + (69 − 46)²] ÷ (7 − 1) )

σ(B) = √1063.667 = 32.6139

Correlation(A, B) = Cov(A, B) ÷ (σ(A) × σ(B))

= 550.5 ÷ (28.82376 × 32.6139) = 0.5856

Correlation(A, B) = 0.5856
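As a check on the remaining steps, the same sample standard deviations and correlation can be reproduced with NumPy (again only a verification sketch, not part of the required working):

```python
import numpy as np

A = np.array([25, 35, 21, 67, 98, 27, 64], dtype=float)
B = np.array([52, 10, 5, 98, 52, 36, 69], dtype=float)

std_A = A.std(ddof=1)            # sample standard deviation, ~28.824
std_B = B.std(ddof=1)            # ~32.614

corr = np.cov(A, B)[0, 1] / (std_A * std_B)
print(std_A, std_B, corr)        # correlation ~0.5856

print(np.corrcoef(A, B)[0, 1])   # same result from the built-in correlation
```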

Sol 2. Ways of dealing with multicollinearity:

1. Get rid of the redundant variables using a variable selection technique (see the VIF-based sketch after this list).


2. Multicollinearity affects only the specific independent variables that are correlated.
Therefore, if multicollinearity is not present for the independent variables that you are
particularly interested in, you may not need to resolve it. Suppose your model contains
the experimental variables of interest and some control variables. If high
multicollinearity exists for the control variables but not the experimental variables, then
you can interpret the experimental variables without problems.
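A minimal sketch of point 1, assuming pandas and statsmodels are installed and using hypothetical predictor columns x1, x2, x3: predictors whose variance inflation factor (VIF) is very high are candidates for removal.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictors; x2 is nearly a multiple of x1, so it is collinear.
X = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5, 6, 7],
    "x2": [2, 4, 6, 8, 10, 12, 13],
    "x3": [5, 3, 6, 2, 7, 1, 4],
})

Xc = sm.add_constant(X)  # compute VIF on a design matrix with an intercept
vif = pd.Series(
    [variance_inflation_factor(Xc.values, i) for i in range(1, Xc.shape[1])],
    index=X.columns,
)
print(vif)
# Rule of thumb: predictors with VIF well above ~5-10 are candidates to drop
# or combine before fitting the regression.
```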

Sol 3. A common correlation threshold for flagging highly collinear variables is ±0.5: any value between −0.5 and 0.5 indicates a weak correlation, while values above 0.5 or below −0.5 indicate a strong relationship that may signal collinearity.
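As an illustration (not part of the original answer), the sketch below builds a Pearson correlation matrix with pandas and flags every pair of columns whose absolute correlation exceeds the 0.5 threshold; columns A and B reuse the Sol 1 data and column C is made up.

```python
import pandas as pd

# Hypothetical feature table.
df = pd.DataFrame({
    "A": [25, 35, 21, 67, 98, 27, 64],
    "B": [52, 10, 5, 98, 52, 36, 69],
    "C": [3, 8, 1, 9, 12, 4, 7],
})

corr = df.corr()   # Pearson correlation matrix
threshold = 0.5

# Report every pair of columns whose |r| crosses the threshold.
flagged = [
    (c1, c2, round(corr.loc[c1, c2], 3))
    for i, c1 in enumerate(corr.columns)
    for c2 in corr.columns[i + 1:]
    if abs(corr.loc[c1, c2]) > threshold
]
print(flagged)
```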
Sol 4. Categorical Variable and Numerical Variable.

Sol 5. The Chi-Square test of independence is applied to two categorical variables.

Null Hypothesis: The two categorical variables are independent.

Alternate Hypothesis: The two categorical variables are not independent.
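A minimal sketch of the test itself, assuming SciPy and a hypothetical 2×2 contingency table of observed counts for the two categorical variables:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical observed counts: rows = levels of variable 1,
# columns = levels of variable 2.
observed = np.array([[30, 10],
                     [20, 40]])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(chi2, p_value, dof)

# If p_value < 0.05 (at a 5% significance level), reject the null hypothesis
# and conclude that the two categorical variables are not independent.
```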
