Anova 5


Bivariate Analysis:

The bivariate analysis shows a strong correlation between the number of
applications received and the number of students accepted (0.94), and between
the number of applications received and the number of students enrolled (0.85).
Inferences from the data:
• There is a considerable number of highly correlated variables.
• “Apps” is highly correlated with “Accept” and “Enroll”.
• The correlation between applications received and acceptances is very high
(0.94). We can therefore infer that as the number of applications increases,
so does the number of acceptances.
• The correlation between graduation rate and PhDs is low (0.32), so the
graduation rate does not appear to depend strongly on whether faculty members
hold PhDs.
• The correlation between Accept and Enroll is very high (0.91), so colleges
that accept more students also tend to enroll more students.
• There is a negative correlation between personal expenditure and graduation
rate, which may suggest that students' personal spending is not going toward
educational purposes.
• The correlation between the percentage of alumni who donate and the cost of
room and board is low (0.27). This suggests that alumni donations are not being
used to lower the cost of room and board for students.
• There is a negative correlation between total expenditure and enrolment
(-0.26), which suggests that as total expenditure increases, students are less
likely to enroll in that college.
• There is a negative correlation between Personal and Expend, so we can infer
that as the instructional expenditure per student increases, personal
expenditure comes down.
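The pairwise correlations discussed above come from a correlation matrix. The following is a minimal sketch using NumPy on a few made-up rows. The variable names mirror the report, but the values are illustrative assumptions and not the actual dataset.

```python
import numpy as np

# Hypothetical values for three of the report's variables (not the real data).
apps = np.array([1000, 2000, 3000, 4000, 5000], dtype=float)
accept = np.array([700, 1500, 2000, 2900, 3400], dtype=float)
enroll = np.array([300, 500, 800, 900, 1300], dtype=float)

names = ["Apps", "Accept", "Enroll"]
data = np.vstack([apps, accept, enroll])  # one row per variable

# np.corrcoef treats each row as a variable and returns the Pearson
# correlation matrix (symmetric, with ones on the diagonal).
corr = np.corrcoef(data)

# Print each distinct pair once.
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"{names[i]} vs {names[j]}: {corr[i, j]:.2f}")
```

A coefficient close to 1 only says that two variables rise together across colleges. It does not by itself say why, which is worth keeping in mind when reading the inferences above.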
2.2 Is scaling necessary for PCA in this case? Give justification and perform scaling.
Solution:
Yes, scaling is necessary for PCA here. The variables are measured in different
units and ranges (counts of applications, percentages, costs), and, as we
observed in the dataset, many of them are not normally distributed and show
skewness.
PCA calculates a new projection of the data set. Because it is a
variance-maximizing exercise, variables with a larger variance would otherwise
dominate the principal components. Once the data is normalized, all variables
have the same standard deviation and therefore the same weight, so PCA can
find the relevant axes.
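To illustrate why scale matters for a variance-maximizing method, here is a small sketch that computes the first principal component from the covariance matrix before and after z-scoring. The two columns and their values are hypothetical, and the helper name first_component is introduced only for this example.

```python
import numpy as np

# Hypothetical columns on very different scales (not the real data).
apps = np.array([1000, 2000, 3000, 4000, 5000], dtype=float)   # counts
grad_rate = np.array([60, 55, 70, 65, 80], dtype=float)        # percent
X = np.column_stack([apps, grad_rate])

def first_component(data):
    """Return the unit-length direction of maximum variance."""
    cov = np.cov(data, rowvar=False)          # columns are variables
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    return eigvecs[:, -1]                     # eigenvector of the largest one

# Unscaled: the large-variance column (apps) dominates the first component.
pc_raw = first_component(X)

# Z-score each column, then repeat. The covariance matrix of z-scored data
# is the correlation matrix, so in this two-column example both variables
# end up with loadings of equal magnitude.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
pc_scaled = first_component(Z)

print("raw loadings:   ", np.abs(pc_raw).round(3))
print("scaled loadings:", np.abs(pc_scaled).round(3))
```

The sign of an eigenvector is arbitrary, so the sketch compares absolute loadings only.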
There are different methods to perform scaling, such as:
• Z-score scaling (standardisation)
• StandardScaler (scikit-learn) for standardisation
• Min-max method
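The scaling methods listed above can be sketched in a few lines of NumPy. This is an illustration under assumptions: the function names zscore and min_max and the sample rows are made up for the example, and neither function guards against a constant column (which would cause a division by zero).

```python
import numpy as np

def zscore(X):
    """Standardise each column to mean 0 and standard deviation 1."""
    return (X - X.mean(axis=0)) / X.std(axis=0)   # population std (ddof=0)

def min_max(X):
    """Rescale each column linearly to the range [0, 1]."""
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return (X - col_min) / (col_max - col_min)

# Hypothetical rows: [applications, graduation rate in percent].
X = np.array(
    [[1000, 60], [2000, 55], [3000, 70], [4000, 65], [5000, 80]],
    dtype=float,
)

X_z = zscore(X)      # each column now has mean 0 and std 1
X_mm = min_max(X)    # each column now spans exactly 0 to 1
print(X_z.round(2))
print(X_mm.round(2))
```

StandardScaler in scikit-learn applies the same column-wise z-score, so the first two items in the list describe the same transformation. Min-max scaling keeps the shape of each distribution but does not equalise the variances, which is one reason the z-score is the usual choice before PCA.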
After scaling with the z-score, we get the following output.
Box plot:
2.3 Comment on the comparison between the covariance and the correlation matrices from this data [on scaled data].
Solution:
We get Kmo_model -> 0.84946246682314
Generally, if the MSA (measure of sampling adequacy) is less than 0.5, PCA is
not recommended, since no reduction is expected. On the other hand, an MSA
above 0.7 is expected to provide a considerable reduction in the dimension and
extraction of meaningful components. At about 0.85, the KMO value obtained
here is well above that threshold.
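The report does not show how Kmo_model was computed. The name suggests a library routine (possibly calculate_kmo from the factor_analyzer package, which is an assumption). As a hedged sketch, the overall KMO can be computed from its definition: the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations. The function name kmo_overall and the synthetic data below are made up for illustration.

```python
import numpy as np

def kmo_overall(X):
    """Overall Kaiser-Meyer-Olkin measure for data with variables in columns."""
    corr = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(corr)                 # requires a non-singular matrix

    # Partial correlations from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale

    # Use off-diagonal entries only.
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + p2)

# Synthetic data (an assumption for illustration): three noisy copies of one
# shared factor, so the variables are strongly inter-correlated.
rng = np.random.default_rng(0)
factor = rng.normal(size=200)
X = np.column_stack([factor + 0.5 * rng.normal(size=200) for _ in range(3)])

kmo = kmo_overall(X)
print(f"KMO: {kmo:.3f}")
```

The measure approaches 1 when the partial correlations are small relative to the plain correlations, meaning the variables share common variance that a few components can capture.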
